# v0.2.0

## Model Support
- Mistral – support for GGUF-format Mistral models with optimized GPU execution.
- Qwen2.5 – support for GGUF-format Qwen2.5 models, including attention-layer performance improvements.
- Qwen3 – support for GGUF-format Qwen3 models with updated integration.
- DeepSeek-R1-Distill-Qwen-1.5B – support for GGUF-format DeepSeek distilled models for efficient inference.
- Phi-3 – full support for GGUF-format Microsoft Phi-3 models for high-performance workloads.
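All of the models above are loaded from GGUF files, which share a common binary header. As a minimal sketch of what that entails (field layout per the public GGUF specification; this is illustrative and not the actual loader in GGUF.java), the fixed header can be parsed like this:

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Sketch: parse the fixed GGUF header (magic, version, tensor count,
// metadata key/value count). All multi-byte fields are little-endian.
public class GgufHeaderSketch {
    // The bytes 'G' 'G' 'U' 'F' read as a little-endian 32-bit int.
    static final int GGUF_MAGIC = 0x46554747;

    /** Returns {version, tensorCount, metadataKvCount} from a GGUF header. */
    public static long[] readHeader(ByteBuffer buf) {
        buf.order(ByteOrder.LITTLE_ENDIAN);
        int magic = buf.getInt();
        if (magic != GGUF_MAGIC) {
            throw new IllegalArgumentException("Not a GGUF file");
        }
        int version = buf.getInt();       // GGUF format version (3 at time of writing)
        long tensorCount = buf.getLong(); // number of tensors in the file
        long kvCount = buf.getLong();     // number of metadata key/value pairs
        return new long[] { version, tensorCount, kvCount };
    }

    public static void main(String[] args) {
        // Build a synthetic 24-byte header: magic, version=3, 2 tensors, 5 kv pairs.
        ByteBuffer buf = ByteBuffer.allocate(24).order(ByteOrder.LITTLE_ENDIAN);
        buf.putInt(GGUF_MAGIC).putInt(3).putLong(2L).putLong(5L);
        buf.flip();
        long[] h = readHeader(buf);
        System.out.println("version=" + h[0] + " tensors=" + h[1] + " kv=" + h[2]);
    }
}
```

The metadata key/value pairs that follow the header carry the architecture name (e.g. `qwen2`, `phi3`), which is how a loader dispatches to the right model implementation.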
## What's Changed
- [refactor] Renamed aux package to resolve Windows issue by @stratika in #11
- Windows support for GPULlama3.java by @stratika in #12
- [API] Update TornadoVM API to use latest warmup features by @mikepapadim in #13
- [model] Add support for Mistral models by @orionpapadakis in #17
- Cleanups post Mistral Integration by @mikepapadim in #27
- Add a Docker section to README with available images and usage examples by @mikepapadim in #28
- Refactor TornadoVMMasterPlan to simplify scheduling decision for non-Nvidia HW and Mistral Models by @mikepapadim in #32
- Add file-not-found error handling to the loadModel method in GGUF.java by @dhruvarayasam in #34
- Update README for clarity by @mikepapadim in #36
- [models] Support for Qwen3 models by @orionpapadakis in #37
- [models][phi-3] Support for Microsoft's Phi-3 models by @mikepapadim in #38
- Reorganize package structure and update imports to use `org.beehive.g…` by @mikepapadim in #42
- Update README.md by @kotselidis in #44
- [models][deepseek][qwen2.5] Add support for Qwen2.5 and Deepseek-Distilled-Qwen models by @orionpapadakis in #40
- Improve attention performance for qwen2.5 & deepseek by @orionpapadakis in #46
## New Contributors
- @orionpapadakis made their first contribution in #17
- @dhruvarayasam made their first contribution in #34
**Full Changelog**: v0.1.0-beta...v0.2.0