1️⃣ Ultra‑compact, high‑speed models
"1-bit g128 with a shared 16-bit scale for every group. So, effectively 1.125 bit." — woadwarrior01
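The arithmetic behind the "effectively 1.125 bit" figure: each weight costs 1 bit, and every group of 128 weights shares one 16-bit scale, which amortizes to 16/128 = 0.125 extra bits per weight. A minimal sketch (the function name and parameters are illustrative, not from the source):

```python
def effective_bits(weight_bits: float, group_size: int, scale_bits: int) -> float:
    """Average storage cost per weight once the shared per-group
    scale is amortized across the group."""
    return weight_bits + scale_bits / group_size

# 1-bit weights, group size 128, one fp16 scale per group:
print(effective_bits(1, 128, 16))  # 1.125
```

The same formula explains why larger group sizes shrink the overhead (g256 would give 1.0625 bits) at the cost of a coarser scale per group.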
2️⃣ Edge‑device viability & community testing
"I have older M1 air with 8GB, but still getting over 23 t/s on 4B model.. and the quality of outputs is on par with top models of similar size." — freakynit
3️⃣ Trade‑offs, benchmarking & scaling concerns
"Their own (presumably cherry‑picked) benchmarks put their models near the ‘middle of the market’ models (llama3 3B, qwen3 1.7B), not competing with Claude, GPT‑4, or Gemini." — kvdveer