[Feature] how to fine-tuning Qwen3-30B-A3B on a single RTX 3090 #3163

jackhovran01 · 2025-08-15T03:30:25Z

jackhovran01
Aug 15, 2025

Hello,

I would like to request detailed instructions or an example workflow for fine-tuning the Qwen3-30B-A3B model using a single NVIDIA RTX 3090 GPU.
Since this model is very large, it cannot fit into 24GB VRAM directly, so I am looking for guidance on efficient fine-tuning approaches that work under this hardware limitation.

Specifically, I would like to know:

Memory Optimization Strategies
- How to configure LoRA / QLoRA for this model on a 3090.
- Recommended settings for quantization (e.g., 4-bit, 8-bit, NF4, FP16 mix).
- Whether gradient checkpointing and paged optimizers are supported and recommended.
Training Framework
- Examples using Hugging Face Transformers, Unsloth, LLaMA Factory, or any other supported frameworks.
- Example training script or configuration file compatible with a 3090.
Batch Size & Sequence Length
- Suggested micro-batch size and gradient accumulation settings to avoid OOM errors.
- Recommended max sequence length for training under 24GB VRAM.
Offloading & Distributed Training
- Whether CPU / NVMe offloading is possible and how to enable it.
- If partial model offload is supported for Qwen3-30B-A3B to make it fit on a 3090.
Evaluation & Inference
- How to run inference efficiently on 3090 after fine-tuning.
- If quantized models can still be used for inference without major performance loss.

If there are any existing scripts, configs, or Colab examples for fine-tuning this model on limited VRAM GPUs like 3090, please share them.

rolandtannous · 2025-08-15T05:13:49Z

rolandtannous
Aug 15, 2025
Collaborator

I will move this to discussions as it's not related to an actual bug or issue.
Please read the docs at : https://docs.unsloth.ai/
Checkout the pre-made notebooks here: https://docs.unsloth.ai/get-started/unsloth-notebooks
Feel free to also join our discord https://discord.com/invite/unsloth where you can get hints and tips from community members.

0 replies

shimmyshimmer · 2025-08-19T02:29:25Z

shimmyshimmer
Aug 19, 2025
Maintainer

Hi there @jackhovran01 currently it is not possible with any framework unless you use this framework which builds ontop of Unsloth's exisiting MOE kernels whcih we haven't implemented yet. Keep in mind though that the kernels are under an AGPL3 open-source license though I don't think it should affect your workflow at all: https://github.com/woct0rdho/transformers-qwen3-moe-fused

0 replies

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

[Feature] how to fine-tuning Qwen3-30B-A3B on a single RTX 3090 #3163

Uh oh!

{{title}}

Uh oh!

Replies: 2 comments

Uh oh!

{{title}}

Uh oh!

Uh oh!

{{title}}

Uh oh!

Select a reply

Uh oh!

Uh oh!

[Feature] how to fine-tuning Qwen3-30B-A3B on a single RTX 3090 #3163

Uh oh!

jackhovran01 Aug 15, 2025

Replies: 2 comments

Uh oh!

rolandtannous Aug 15, 2025 Collaborator

Uh oh!

shimmyshimmer Aug 19, 2025 Maintainer

jackhovran01
Aug 15, 2025

rolandtannous
Aug 15, 2025
Collaborator

shimmyshimmer
Aug 19, 2025
Maintainer