Interfaze

logo

Beta

pricing

help

docs

blog

sign in

Qwopus3.6 27B Coder MTP GGUF

Qwopus3.6 27B Coder MTP GGUF by Jackrong, a image-text-to-text model with multimodal capabilities. Understand and compare multimodal features, benchmarks, and capabilities.

Comparison

FeatureQwopus3.6 27B Coder MTP GGUFInterfaze
Input Modalities

text, image

image, text, audio, video, document

Native OCRNoYes
Long Document ProcessingNoYes
Language Support

5 partial

162+

Native Speech-to-TextNoYes
Native Object DetectionNoYes
Guardrail ControlsNoYes
Context Input Size

32.8K

1M

Tool CallingYes

Tool calling supported + built in browser, code execution and web search

Scaling

FeatureQwopus3.6 27B Coder MTP GGUFInterfaze
Scaling

Self-hosted/Provider-hosted with quantization

Unlimited

View model card on Hugging Face

[!WARNING] Community Release Notice: Qwopus-3.6-27B-Coder is an experimental community release intended for research, evaluation, and agent workflow exploration. It has not undergone full safety evaluation or broad general-domain benchmarking.

[!IMPORTANT] Benchmark Status: The first completed benchmark is SWE-bench Verified full 500 in thinking-off / no-thinking mode, where the Q5_K_M 27B GGUF run resolved 335/500 = 67.0%. Other benchmark suites remain pending and will be updated as testing completes.


πŸ’‘ 1. Base Model, Training Stack & Collaboration


πŸ“– 2. Background & Motivation

This model integrates:

Agent Traces (lambda/hermes-agent-reasoning-traces): Each sample contains real multi-turn tool execution results (not fabricated outputs), with step-by-step reasoning inside <think> tags. Coverage includes:


πŸ“Š 3. Performance Benchmarks


πŸ—ΊοΈ 4. Training & Data Pipeline Overview

The training process fuses Trace Inversion data augmentation with a Three-Stage Curriculum Learning pipeline. The core engineering focuses on expanding context length gradually while training on reconstructed reasoning traces and real agent trajectories to keep the output format stable.

[ πŸ—ΊοΈ Trace Inversion: Reconstructing Distillation Workflow ]

  A. Surrogate Model Training (Trace Inverter)
     Open-source Model (GLM-5.1 / DS-V4) ──► Complete Reasoning Chain ──► [ Qwen3-235B Compression ] ──► Reasoning Bubbles
                                              β”‚                                   β”‚
                                              └──────────► [ Training ] β—„β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                                                   (Base: Qwen3-4B-Instruct)
                                                   (Result: Trace-Inverter-4B)

  B. Inversion Phase: Reconstructing Claude-4.7-Max
     _______________________________________________________
    |                                                       |
    |  Claude-4.7-Max API ──► Compressed Bubbles + Answer   |
    |_______________________________________________________|
                      β”‚
                      β–Ό
    [ 🧠 Trace-Inverter-4B (Logic Reconstructor) ] ──► Synthetic Deep Reasoning Trace (Learnable CoT)
                      β”‚
                      β–Ό
    [ 🧩 Data Splicing ] ◄────────── (Original Prompt + Response)
    (Embed reconstructed CoT in <think> tags, splicing with original prompt/response)
                      β”‚
                      β–Ό
             (Result: claude-opus-4.6/4.7 inverted sets)

  C. Final Coder SFT Curriculum Pipeline
     ___________________________________________
    |                                           |
    |       Base Model (Qwopus3.6-27B-v2)       |
    |___________________________________________|
                      β”‚
                      β–Ό
    [ πŸ“¦ Phase 1: Format Inception ] ──► [ πŸ› οΈ Phase 2: Agent/Coding Expansion ] ──► [ πŸš€ Phase 3: Long-Context SFT ]
      ( < 4096 tokens )                     ( 4096 - 8192 tokens )                     ( 8192 - 32K tokens )
      (Stable <think> format)               (Tool traces + coding tasks)               (Long / multi-turn / replay)
                      β”‚                                                                            β”‚
                      β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                                                    β–Ό
                                   _______________________________________________
                                  |                                               |
                                  |   🌟 Final Model: Qwopus-3.6-27B-Coder        |
                                  |_______________________________________________|

[!NOTE] Due to the complex and diverse format of agent trajectory datasets, rigorous cleaning and format standardization were applied to ensure data quality.


πŸ“š 5. Three-Stage Curriculum Learning

To steadily scale reasoning quality under long-context inference, Qwopus-3.6-27B-Coder uses a curriculum-style data mixture building on the approach proven in the Qwopus coder line. The model is first stabilized on short, clean reasoning samples, then exposed to complex coding and agent traces, and finally reinforced with longer contexts plus replay data.


[!CAUTION] Deployment note: The model may emit reasoning inside <think> and </think> tags. Front-end applications and agent frameworks should parse or hide these sections where appropriate. For tool calling, ensure the prompt format and system prompt match the training data configuration to activate agent capabilities.


⚠️ 7. Training & Deployment Notes

[!CAUTION] Compatibility Notes

  • Tool Calling Format: To activate the model's agent capabilities, ensure the prompt format and system prompt include appropriate tool definitions and match the training data format.
  • Reasoning Output Extraction: The model's thinking process is wrapped in <think> and </think> tags. Front-end applications may need to parse and hide these tags.
  • Long-Context Usage: For contexts beyond 32K, consider enabling RoPE/YaRN scaling (e.g., --rope-scaling yarn --rope-scale 4 --yarn-orig-ctx 32768 in llama.cpp).

πŸ“‹ 8. Benchmark Progress

The first completed evaluation is the no-thinking SWE-bench Verified run reported above. Additional local agentic benchmarks remain pending and will be added after testing.

BenchmarkStatusResult / Reference
SWE-bench Verifiedβœ… Completed335/500 = 67.0% (thinking-off, Q5_K_M, RTX 5090 + MTP)
BugFind-15πŸ“‹ Pending9B reference: 79
HermesAgent-20πŸ“‹ Pending9B reference: 85
ToolCall-15πŸ“‹ Pending9B reference: 100
InstructFollow-15πŸ“‹ Pending9B reference: 93

πŸ“š 9. Resources & Guides

πŸ‘‰ GitHub Repository: Jackrong-llm-finetuning-guide Access the repository to dive into the codebase and reproduce our results.

πŸ‘‰ Qwen MTP GGUF Processing Workflow A custom splitting and merging methodology designed specifically for Qwen series Multi-Token Prediction (MTP) heads.

πŸ‘‰ benchlocal Evaluation Framework The evaluation framework used to run the local agentic and coding benchmarks.

πŸ‘‰ Qwopus3.6-27B-v2 Model Card Base model card with full MMLU-Pro, SWE-bench, and throughput benchmarks.


πŸ™ 10. Acknowledgements

Special thanks to:

  • The Qwen team for providing the powerful Qwen3.6-27B base model.
  • Unsloth for providing the highly efficient fine-tuning framework.
  • Kyle Hessling for the close collaboration on hardware, training infrastructure, and evaluation support.
  • Open-source datasets and community contributors, particularly lambda/hermes-agent-reasoning-traces for the high-quality agent trajectory data.

πŸ“– 11. Citation

@misc{jackrong_qwopus36_27b_coder,
  title        = {Qwopus-3.6-27B-Coder},
  author       = {Jackrong},
  year         = {2026},
  publisher    = {Hugging Face},
  howpublished = {\url{https://huggingface.co/Jackrong/Qwopus-3.6-27B-Coder}}
}

Want more deterministic results?