Qwen3.5 35B A3B Uncensored HauhauCS Aggressive
Qwen3.5 35B A3B Uncensored HauhauCS Aggressive by HauhauCS is an image-text-to-text model with multimodal capabilities. The comparison below covers modalities, features, and capabilities.
Comparison
| Feature | Qwen3.5 35B A3B Uncensored HauhauCS Aggressive | Interfaze |
|---|---|---|
| Input Modalities | text, image, video | image, text, audio, video, document |
| Native OCR | No | Yes |
| Long Document Processing | No | Yes |
| Language Support | 201 (partial) | 162+ |
| Native Speech-to-Text | No | Yes |
| Native Object Detection | No | Yes |
| Guardrail Controls | No | Yes |
| Context Input Size | 262K | 1M |
| Tool Calling | No | Tool calling supported, plus built-in browser, code execution, and web search |
Scaling
| Feature | Qwen3.5 35B A3B Uncensored HauhauCS Aggressive | Interfaze |
|---|---|---|
| Scaling | Self-hosted/Provider-hosted with quantization | Unlimited |
Join the Discord for updates, roadmaps, projects, or just to chat.
Qwen3.5-35B-A3B uncensored by HauhauCS. 0/465 refusals.
About
No changes to datasets or capabilities. Fully functional, 100% of what the original authors intended - just without the refusals.
These are meant to be the best lossless uncensored models out there.
Aggressive Variant
Stronger uncensoring — model is fully unlocked and won't refuse prompts. May occasionally append short disclaimers (baked into base model training, not refusals) but full content is always generated.
For a more conservative uncensor that keeps some safety guardrails, check the Balanced variant when it's available.
Downloads
All quants generated with importance matrix (imatrix) for optimal quality preservation on abliterated weights.
Specs
- 35B total parameters, ~3B active per forward pass (MoE)
- 256 experts, 8 routed + 1 shared per token
- Hybrid architecture: Gated DeltaNet linear attention + full softmax attention (3:1 ratio)
- 40 layers, pattern: 10 x (3 x DeltaNet-MoE + 1 x Attention-MoE)
- 262K native context (extendable to 1M with YaRN)
- Natively multimodal (text, image, video)
- Multi-token prediction (MTP) support
- 248K vocabulary, 201 languages
- Based on Qwen/Qwen3.5-35B-A3B
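Two of the figures above can be sanity-checked with quick arithmetic. A minimal sketch (the layer-type strings are just labels; per-expert parameter sizes are not derived here, which is why ~3B, rather than 35B times the expert fraction, is active once attention, embeddings, and the always-on shared expert are counted):

```python
# Sanity-check the spec list: the 40-layer 3:1 DeltaNet/attention pattern
# and the fraction of experts active per token.

# Layer pattern: 10 repeats of (3 x DeltaNet-MoE + 1 x Attention-MoE)
pattern = (["DeltaNet-MoE"] * 3 + ["Attention-MoE"]) * 10
print(len(pattern))  # 40 layers
print(pattern.count("DeltaNet-MoE") / pattern.count("Attention-MoE"))  # 3.0

# Expert routing: 8 routed + 1 shared out of 256 experts per token
active_fraction = (8 + 1) / 256
print(f"{active_fraction:.1%} of experts active per token")  # 3.5%
```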
Recommended Settings
From the official Qwen authors:
Thinking mode (default):
- General: `temperature=1.0, top_p=0.95, top_k=20, min_p=0, presence_penalty=1.5`
- Coding/precise tasks: `temperature=0.6, top_p=0.95, top_k=20, min_p=0, presence_penalty=0`

Non-thinking mode:
- General: `temperature=0.7, top_p=0.8, top_k=20, min_p=0, presence_penalty=1.5`
- Reasoning tasks: `temperature=1.0, top_p=1.0, top_k=40, min_p=0, presence_penalty=2.0`
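These settings map directly onto request fields for llama.cpp's OpenAI-compatible server (llama-server). A minimal sketch of the thinking-mode general profile; the helper name and prompt are illustrative, and the endpoint shown in the comment is llama-server's default:

```python
# Sketch: the "thinking mode, general" sampler settings as a
# chat-completion request body for llama-server.
import json

def thinking_general_payload(prompt: str) -> dict:
    """Build a chat-completion body with the recommended general settings."""
    return {
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 1.0,
        "top_p": 0.95,
        "top_k": 20,           # top_k and min_p are non-OpenAI sampler
        "min_p": 0,            # fields that llama-server accepts
        "presence_penalty": 1.5,
    }

body = thinking_general_payload("Explain MoE routing briefly.")
print(json.dumps(body, indent=2))
# POST this to http://localhost:8080/v1/chat/completions
```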
Important:
- Keep at least 128K context to preserve thinking capabilities
- Use the `--jinja` flag with llama.cpp for proper chat template handling
- Vision support requires the `mmproj` file alongside the main GGUF
Usage
Works with llama.cpp, LM Studio, Jan, koboldcpp, and other GGUF-compatible runtimes.

Text-only:

```
llama-cli -m Qwen3.5-35B-A3B-Uncensored-HauhauCS-Aggressive-Q4_K_M.gguf \
  --jinja -c 131072 -ngl 99
```

With vision (loads the mmproj alongside the main GGUF):

```
llama-cli -m Qwen3.5-35B-A3B-Uncensored-HauhauCS-Aggressive-Q4_K_M.gguf \
  --mmproj mmproj-Qwen3.5-35B-A3B-Uncensored-HauhauCS-Aggressive-f16.gguf \
  --jinja -c 131072 -ngl 99
```
Note: LM Studio may show 256x2.6B in the params column instead of 35B-A3B. This is a cosmetic metadata quirk; the model runs correctly.
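The 1M extended context mentioned in the specs is enabled at load time. A sketch using llama.cpp's YaRN flags, assuming a scale factor of 4 over the 262144-token native window (verify the flag names against your llama.cpp build, and expect very high memory use at this context size):

```shell
# Sketch: extending the native 262K context toward 1M with YaRN.
# 262144 * 4 = 1048576 tokens.
llama-cli -m Qwen3.5-35B-A3B-Uncensored-HauhauCS-Aggressive-Q4_K_M.gguf \
  --jinja -ngl 99 \
  --rope-scaling yarn --rope-scale 4 --yarn-orig-ctx 262144 \
  -c 1048576
```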
Other Formats
- GGUF (this repo)
- GPTQ — coming soon