[Feature] how to fine-tune Qwen3-30B-A3B on a single RTX 3090 #3163
Replies: 2 comments
-
I will move this to discussions as it's not related to an actual bug or issue.
-
Hi there @jackhovran01, currently it is not possible with any framework unless you use the framework linked below, which builds on top of Unsloth's existing MoE kernels (training support we haven't implemented ourselves yet). Keep in mind that the kernels are under an AGPL-3.0 open-source license, though I don't think that should affect your workflow at all: https://github.com/woct0rdho/transformers-qwen3-moe-fused
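For anyone landing here, a minimal sketch of what a quantized LoRA setup could look like, assuming you install the fused kernels from the repo above per its README (its exact entry points are not reproduced here) and otherwise fall back to the standard transformers + peft + bitsandbytes path. The model id is real; the LoRA rank and target modules are illustrative choices, not a tested recipe:

```python
# Hypothetical sketch: QLoRA-style setup for Qwen3-30B-A3B on a 24 GB card.
# Uses only standard transformers + peft + bitsandbytes APIs; the fused MoE
# kernels from the linked repo must be applied separately per its README.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_id = "Qwen/Qwen3-30B-A3B"

# 4-bit NF4 quantization shrinks the frozen base weights to roughly 15 GB,
# leaving headroom for LoRA adapters, activations, and optimizer state.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",  # lets accelerate spill overflow layers to CPU RAM
)
model = prepare_model_for_kbit_training(model)

# Train LoRA adapters on the attention projections only; the expert MLPs
# stay frozen, which sidesteps the MoE kernel limitation mentioned above.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```

Restricting LoRA to the attention projections keeps trainable-parameter count and optimizer memory low; adapting the expert MLPs would need the fused MoE training kernels discussed above.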
-
Hello,
I would like to request detailed instructions or an example workflow for fine-tuning the Qwen3-30B-A3B model on a single NVIDIA RTX 3090 GPU.
Since the model is very large (30B parameters, roughly 60 GB of weights in FP16 alone), it cannot fit into 24 GB of VRAM directly, so I am looking for guidance on efficient fine-tuning approaches that work under this hardware limitation.
Specifically, I would like to know:
- Memory Optimization Strategies (4-bit, 8-bit, NF4, FP16 mix)
- Training Framework
- Batch Size & Sequence Length (see the sketch after this list)
- Offloading & Distributed Training (see the loading sketch at the end)
- Evaluation & Inference
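To make the batch-size and sequence-length point concrete, here is a hedged sketch of training arguments sized for a 24 GB card, using the standard transformers `TrainingArguments`; every number below is an illustrative assumption to tune, not a tested value:

```python
# Hypothetical 24 GB-friendly training configuration; all values are
# illustrative assumptions, not a verified recipe for this model.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="qwen3-30b-a3b-lora",
    per_device_train_batch_size=1,   # micro-batch of 1 caps activation memory
    gradient_accumulation_steps=16,  # effective batch size of 16
    gradient_checkpointing=True,     # recompute activations to save VRAM
    learning_rate=2e-4,
    num_train_epochs=1,
    bf16=True,                       # the RTX 3090 (Ampere) supports bfloat16
    logging_steps=10,
    optim="paged_adamw_8bit",        # 8-bit paged optimizer state (bitsandbytes)
)
# Keep sequences short (e.g. 1024-2048 tokens) in your tokenizer/collator;
# activation memory grows with sequence length.
```

Gradient accumulation trades wall-clock time for memory: sixteen micro-batches of one are accumulated before each optimizer step, giving the gradient statistics of a batch of sixteen without ever holding more than one sample's activations.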
If there are any existing scripts, configs, or Colab examples for fine-tuning this model on limited-VRAM GPUs like the 3090, please share them.
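And for the offloading item above, a small sketch of capping GPU usage with accelerate's `device_map` offloading via the standard `max_memory` argument to `from_pretrained`; the memory limits below are assumptions to adjust for your machine:

```python
# Hypothetical CPU-offload loading: max_memory caps what accelerate may place
# on GPU 0 and routes the remaining layers to system RAM (slower, but fits).
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen3-30B-A3B",
    device_map="auto",
    max_memory={0: "22GiB", "cpu": "64GiB"},  # illustrative limits
)
```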