🧩 Model Card: Qwen3-0.6B
- Type: Text-to-Text
- Think: Toggleable
- Tool Calling Support: No
- Base Model: Qwen/Qwen3-0.6B
- Quantization: Q4_1
- Max Context Length: 32k tokens
- Default Context Length: 32k tokens (change default)
- Set Context Length at Launch
▶️ Run with FastFlowLM in PowerShell:
flm run qwen3:0.6b
📝 Note:
- CLI: Type
/thinkto toggle on/off interactively. - Server Mode: Set the
"think"flag in the request payload.
🧩 Model Card: Qwen3-1.7B
- Type: Text-to-Text
- Think: Toggleable
- Tool Calling Support: No
- Base Model: Qwen/Qwen3-1.7B
- Quantization: Q4_1
- Max Context Length: 32k tokens
- Default Context Length: 32k tokens (change default)
- Set Context Length at Launch
▶️ Run with FastFlowLM in PowerShell:
flm run qwen3:0.6b
📝 Note:
- CLI: Type
/thinkto toggle on/off interactively. - Server Mode: Set the
"think"flag in the request payload.
🧩 Model Card: Qwen3-4B
- Type: Text-to-Text
- Think: Toggleable
- Tool Calling Support: Yes
- Base Model: Qwen/Qwen3-4B
- Quantization: Q4_1
- Max Context Length: 32k tokens
- Default Context Length: 32k tokens (change default)
- Set Context Length at Launch
▶️ Run with FastFlowLM in PowerShell:
flm run qwen3:4b
📝 Note:
- CLI: Type
/thinkto toggle on/off interactively. - Server Mode: Set the
"think"flag in the request payload.
🧩 Model Card: Qwen3-8B
- Type: Text-to-Text
- Think: Toggleable
- Tool Calling Support: Yes
- Base Model: Qwen/Qwen3-8B
- Quantization: Q4_1
- Max Context Length: 32k tokens
- Default Context Length: 16k tokens (change default)
- Set Context Length at Launch
▶️ Run with FastFlowLM in PowerShell:
flm run qwen3:8b
📝 Note:
- CLI: Type
/thinkto toggle on/off interactively. - Server Mode: Set the
"think"flag in the request payload.
🧩 Model Card: Qwen3-4B-Thinking-2507
- Type: Text-to-Text
- Think: Yes
- Tool Calling Support: Yes
- Base Model: Qwen/Qwen3-4B-Thinking-2507
- Quantization: Q4_1
- Max Context Length: 256k tokens
- Default Context Length: 32k tokens (change default)
- Set Context Length at Launch
▶️ Run with FastFlowLM in PowerShell:
flm run qwen3-tk:4b
🧩 Model Card: Qwen3-4B-Instruct-2507
- Type: Text-to-Text
- Think: No
- Tool Calling Support: Yes
- Base Model: Qwen/Qwen3-4B-Instruct-2507
- Quantization: Q4_1
- Max Context Length: 256k tokens
- Default Context Length: 32k tokens (change default)
- Set Context Length at Launch
▶️ Run with FastFlowLM in PowerShell:
flm run qwen3-it:4b
🧩 Model Card: Qwen3-VL-4B-Instruct
- Type: Image-Text-to-Text
- Think: No
- Tool Calling Support: Yes
- Base Model: Qwen/Qwen3-VL-4B-Instruct
- Quantization: Q4_1
- Max Context Length: 256k tokens
- Default Context Length: 32k tokens (change default)
- Set Context Length at Launch
▶️ Run with FastFlowLM in PowerShell:
flm run qwen3vl-it:4b
▶️ Image Resize Options
You can control image resizing when running or serving the model using the --img-pre-resize flag or simply -r:
flm run qwen3vl-it:3b -r 1
flm serve qwen3vl-it:3b -r 1
The -r option determines image’s height:
- 0: original size
- 1: height = 480 px
- 2: height = 720 px
- 3: height = 1080 px (default)
- 4: height = 1440 px
Don’t worry—if your image is already smaller than the setup, it keeps its original resolution! ✨
📝 Note
- Image understanding adapts to image size. Image TTFT can range from under 1 second to ~200 seconds depending on resolution. Use lower-resolution images (720p or below) unless high resolution is required (e.g. OCR on small text).
- Video understanding is not supported yet.
🧩 Model Card: Qwen2.5-3B-Instruct
- Type: Text-to-Text
- Think: No
- Tool Calling Support: No
- Base Model: Qwen/Qwen2.5-3B-Instruct
- Quantization: Q4_1
- Max Context Length: 32k tokens
- Default Context Length: 32k tokens (change default)
- Set Context Length at Launch
▶️ Run with FastFlowLM in PowerShell:
flm run qwen2.5-it:3b
🧩 Model Card: Qwen2.5-VL-3B-Instruct
- Type: Image-Text-to-Text
- Think: No
- Tool Calling Support: No
- Base Model: Qwen/Qwen2.5-VL-3B-Instruct
- Quantization: Q4_1
- Max Context Length: 256k tokens
- Default Context Length: 32k tokens (change default)
- Set Context Length at Launch
▶️ Run with FastFlowLM in PowerShell:
flm run qwen2.5vl-it:3b
▶️ Image Resize Options
You can control image resizing when running or serving the model using the --img-pre-resize flag or simply -r:
flm run qwen3vl-it:3b -r 1
flm serve qwen3vl-it:3b -r 1
The -r option determines image’s height:
- 0: original size
- 1: height = 480 px
- 2: height = 720 px
- 3: height = 1080 px (default)
- 4: height = 1440 px
Don’t worry—if your image is already smaller than the setup, it keeps its original resolution! ✨
📝 Note
- Image understanding adapts to image size. Image TTFT can range from under 1 second to ~200 seconds depending on resolution. Use lower-resolution images (720p or below) unless high resolution is required (e.g. OCR on small text).
- Video understanding is not supported yet.
🧩 Model Card: Qwen3.5-4B
- Type: Image-Text-to-Text
- Think: Toggleable
- Tool Calling Support: Yes
- Base Model: Qwen/Qwen3.5-4B
- Quantization: Q4_1
- Max Context Length: 256k tokens
- Default Context Length: 32k tokens (change default)
- Set Context Length at Launch
▶️ Run with FastFlowLM in PowerShell:
flm run qwen3.5:4b
▶️ Image Resize Options
You can control image resizing when running or serving the model using the --img-pre-resize flag or simply -r:
flm run qwen3.5:4b -r 1
flm serve qwen3.5:4b -r 1
The -r option determines image’s height:
- 0: original size
- 1: height = 480 px
- 2: height = 720 px
- 3: height = 1080 px (default)
- 4: height = 1440 px
Don’t worry—if your image is already smaller than the setup, it keeps its original resolution! ✨
📝 Note
- Image understanding adapts to image size. Image TTFT can range from under 1 second to ~200 seconds depending on resolution. Use lower-resolution images (720p or below) unless high resolution is required (e.g. OCR on small text).
- Video understanding is not supported yet.