Project ideas from Hacker News discussions.

Qwen3.6-35B-A3B: Agentic coding power, now open to all

📝 Discussion Summary

1. Model release & design
Quote: “Qwen/Qwen3.6‑35B‑A3B is intended as a superior replacement of Qwen/Qwen3.5‑27B.” — adrian_b

2. Open‑weight & censorship
Quote: “A relief to see the Qwen team still publishing open weights, after the kneecapping and departures of Junyang Lin and others!” — bertili

3. Running locally (hardware & quantisation)
Quote: “The 35B model can be run at home with the weights stored on an SSD (or on 2 SSDs, for double throughput).” — adrian_b

4. Performance & benchmark comparisons
Quote: “Despite its efficiency, Qwen3.6‑35B‑A3B delivers outstanding agentic coding performance, surpassing its predecessor Qwen3.5‑35B‑A3B by a wide margin and rivaling much larger dense models such as Qwen3.5‑27B.” — adrian_b

5. Community outlook & future expectations
Quote: “This is just one model in the Qwen 3.6 series. They will most likely release the other small sizes … the flagship 397 B size seems to have been excluded.” — zozbot234

These five themes capture the main points discussed: the new model’s positioning, the value of open‑weight releases amid recent setbacks, practical local‑inference considerations, head‑to‑head performance claims, and the community’s optimism (and speculation) about upcoming sizes.


🚀 Project Ideas

LocalLLM Orchestration Platform

Summary

  • Unified management of multiple open-weight LLMs with auto‑quantization and context caching.
  • Seamless API bridge for coding assistants (e.g., Continue, Pi) to switch models without code changes.

Details

  • Target Audience: Developers and power users running local LLMs on consumer hardware
  • Core Feature: Multi‑model selector, automatic quantization recommendation, KV‑cache offload, OpenAI‑compatible server
  • Tech Stack: Backend: FastAPI + Python; Frontend: React; Quantization: llama.cpp/GGUF; Containerization: Docker
  • Difficulty: Medium
  • Monetization: Revenue-ready: SaaS subscription (tiered)

Notes

  • HN users repeatedly complain about fragmented model handling and slow context management; this tool centralizes those tasks.
  • Provides “plug‑and‑play” for agentic coding workflows, directly addressing the desire for local agents without manual tuning.

Model Provenance & Integrity Service

Summary

  • Automatic SHA‑256 verification and version tracking for GGUF model releases.
  • Real‑time update notifications and signed manifests for trusted downloads.

Details

  • Target Audience: Researchers, hobbyists, and compliance‑focused teams
  • Core Feature: Hash verification API, signed release metadata, diff viewer for quant changes
  • Tech Stack: Backend: Node.js; Database: PostgreSQL; UI: Vue; Integration: Hugging Face API
  • Difficulty: Low
  • Monetization: Hobby (free, optional paid premium for enterprise SLA)

Notes

  • Addresses repeated HN requests for checksums and concerns about corrupted quant downloads.

  • Community values trustworthy model provenance, making this a sought‑after utility.
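The core verification step is straightforward. This is a sketch assuming a hypothetical manifest format (a mapping of file names to expected SHA‑256 digests); the file is hashed in chunks so multi-gigabyte GGUF downloads never need to fit in memory.

```python
# Sketch of manifest-based integrity checking. The manifest format
# (name -> hex digest) is a hypothetical convention for this example.
import hashlib
from pathlib import Path

def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream the file in 1 MiB chunks and return its SHA-256 hex digest."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def verify_file(path: Path, manifest: dict[str, str]) -> bool:
    """True only if the manifest lists this file and the digest matches."""
    expected = manifest.get(path.name)
    return expected is not None and sha256_of(path) == expected
```

A signed manifest would add an Ed25519 signature over the manifest itself, so clients can also verify that the digest list came from the publisher.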

Privacy‑First Chat Archiver

Summary

  • Export, encrypt, and audit local LLM conversation logs with bulk export and redaction controls.
  • Optional compliance tagging for GDPR/CCPA.

Details

  • Target Audience: Privacy‑concerned professionals (legal, healthcare, finance) using local LLMs
  • Core Feature: Bulk export to encrypted JSON/CSV, metadata stripping, redaction wizard
  • Tech Stack: Python (Flask) backend; Electron UI; libsodium for encryption; OpenAPI integration
  • Difficulty: Medium
  • Monetization: Revenue-ready: One‑time license $49 per user

Notes

  • HN discussions highlight the need to keep data off‑cloud; this tool meets that need.
  • Could integrate with OpenRouter to sync exported sessions for personal analytics.
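The redaction-and-export pipeline could be sketched as below. This illustrative example covers only the redaction and metadata-stripping steps; the patterns and the session/message field names are assumptions, and a real build would encrypt the resulting JSON with libsodium before writing it to disk.

```python
# Sketch of the redaction wizard's core: regex-based PII scrubbing plus
# metadata stripping before export. Patterns here are illustrative, not
# a complete PII taxonomy.
import json
import re

PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def redact(text: str) -> str:
    """Replace each matched pattern with a labeled placeholder."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[REDACTED:{label}]", text)
    return text

def export_sessions(sessions: list[dict]) -> str:
    """Keep only role and redacted content; drop timestamps, IPs, etc."""
    cleaned = [
        {"role": m["role"], "content": redact(m["content"])}
        for s in sessions
        for m in s["messages"]
    ]
    return json.dumps(cleaned, indent=2)
```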

Hardware Sizing Assistant

Summary

  • AI‑driven calculator that recommends minimal hardware (CPU, GPU, RAM) for a given model and context length.
  • Generates cost estimate and links to purchase options.

Details

  • Target Audience: Users planning local LLM deployments on a budget
  • Core Feature: Interactive sizing wizard, price‑comparison engine, offline PDF report
  • Tech Stack: Python (FastAPI) backend; React frontend; Stripe API for pricing data; SQLite for recommendation rules
  • Difficulty: Low
  • Monetization: Revenue-ready: Affiliate revenue share (e.g., 5% per hardware sale)

Notes

  • Community frequently asks “what’s the cheapest Mac/PC that can run Qwen‑3.6‑35B?”; this solves that pain point.
  • Provides up‑to‑date pricing scrapes, adding freshness to recommendations.
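The sizing math itself is a back-of-the-envelope calculation. This sketch uses rough assumed constants (bytes per weight for the quant, a linear KV-cache cost per context token, fixed runtime overhead); a real wizard would calibrate these per architecture.

```python
# Back-of-the-envelope RAM estimate: quantized weights plus a KV cache that
# grows linearly with context length. All constants are rough assumptions.

def estimate_ram_gb(params_billions: float,
                    bytes_per_weight: float,
                    context_tokens: int,
                    kv_bytes_per_token: int = 100_000,
                    overhead_gb: float = 1.5) -> float:
    """Estimated total RAM (GB) to serve the model at the given context."""
    weight_bytes = params_billions * 1e9 * bytes_per_weight
    kv_cache_bytes = context_tokens * kv_bytes_per_token
    return (weight_bytes + kv_cache_bytes) / 1e9 + overhead_gb
```

Under these assumptions, a 35B model at ~0.57 bytes/weight (a 4-bit quant) with an 8K context lands around 22 GB, which matches the community intuition that 32 GB machines are comfortable and 16 GB machines are tight.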

Lightweight Agentic Coding Toolkit

Summary

  • Library of optimized prompts, thinking‑effort controls, and KV‑caching utilities for small MoE models.
  • Built‑in benchmark harness for automatic performance tuning.

Details

  • Target Audience: Developers building local coding assistants on limited hardware
  • Core Feature: Prompt templates, auto‑retry loop manager, benchmark suite, integration with Continue/Olly
  • Tech Stack: TypeScript (Node.js) backend; React component library; CLI (Node); OpenAI‑compatible API layer
  • Difficulty: Medium
  • Monetization: Hobby (open source, optional paid support)

Notes

  • Directly addresses the “how to get decent coding quality on 16‑32 GB RAM” problem discussed on HN.
  • Promises higher token‑per‑second rates and better context handling for MoE models, increasing usability for local agents.
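The auto-retry loop manager could look like the sketch below (function names are hypothetical): generate, validate, and on failure fold the error message back into the prompt so the model can self-correct within a bounded number of attempts.

```python
# Sketch of an auto-retry loop for agentic code generation. `generate` and
# `validate` are caller-supplied hooks; validate() returns None on success
# or an error string to feed back into the next prompt.
from typing import Callable, Optional

def retry_generate(generate: Callable[[str], str],
                   validate: Callable[[str], Optional[str]],
                   prompt: str,
                   max_attempts: int = 3) -> str:
    for _ in range(max_attempts):
        output = generate(prompt)
        error = validate(output)
        if error is None:
            return output
        # Append the failure so the next attempt sees what went wrong.
        prompt = f"{prompt}\n\nPrevious attempt failed: {error}\nTry again."
    raise RuntimeError("all attempts failed")
```

In practice `validate` would run a linter, test suite, or compile step, which is where the benchmark harness plugs in.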
