Project ideas from Hacker News discussions.

GLM-4.7: Advancing the Coding Capability

📝 Discussion Summary

1. Competitive Coding Performance

GLM-4.7 matches or approaches frontier models like Claude Sonnet 4.5/Opus 4.5 and GPT-5.2 in coding/agentic tasks. "GLM-4.7 is more than capable for what I need. Opus 4.5 is nice but not worth the quota cost for most tasks." -bigyabai; "This model is much stronger than 3.5 sonnet... about 4 points ahead of sonnet4, but behind sonnet 4.5 by 4 points." -lumost.

2. Superior Cost-Effectiveness

Z.ai's inexpensive plans (a lite tier around $30/year) make GLM a compelling Claude alternative. "z.ai models are crazy cheap. The one year lite plan is like 30€ (on sale though). Complete no-brainer." -theshrike79; "less than 30 bucks for entire year, insanely cheap." -tonyhart7.

3. Local Inference Challenges

The 358B-parameter MoE is too large for most local setups, demanding expensive hardware (e.g., a $10k+ Mac Studio); slow local speeds push users toward cloud APIs. "In practice, it'll be incredible slow and you'll quickly regret spending that much money on it instead of just using paid APIs." -embedding-shape; "consumer grade hardware is still too slow for these things to work." -g947o.


🚀 Project Ideas

LocalMoE Runner

Summary

  • A lightweight inference engine optimized for running large MoE open-weight models (e.g., GLM-4.7) on Apple Silicon Macs with 128-512GB RAM, emphasizing fast prompt processing and quantization-aware scheduling to achieve interactive speeds (20+ t/s decode).
  • Solves slow local inference frustration, enabling privacy-focused coding without cloud dependency.

Details

  • Target Audience: Indie devs and privacy-conscious coders with M-series Macs
  • Core Feature: Auto-quantize and load MoE experts on the fly (sketched below), unified MLX backend with lookahead expert prediction for 2x faster prefill
  • Tech Stack: MLX, Rust for the scheduler, GGUF support
  • Difficulty: Medium
  • Monetization: Hobby
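
As a rough illustration of the "load experts on the fly" idea, the sketch below keeps a small LRU cache of resident experts and loads the rest on demand; evictions model the memory ceiling on a Mac. This is plain Go, not MLX code, and loadExpert, the byte-slice "weights", and the cache size are hypothetical stand-ins for the real GGUF read/dequantize path.

```go
package main

import (
	"container/list"
	"fmt"
)

// expertCache keeps at most capacity dequantized experts resident,
// evicting the least recently used one when a new expert is needed.
// In a real MoE runner the values would be weight tensors; here they
// are stand-in byte slices.
type expertCache struct {
	capacity int
	order    *list.List            // front = most recently used
	items    map[int]*list.Element // expert ID -> list node
}

type entry struct {
	id      int
	weights []byte
}

func newExpertCache(capacity int) *expertCache {
	return &expertCache{
		capacity: capacity,
		order:    list.New(),
		items:    make(map[int]*list.Element),
	}
}

// loadExpert is a hypothetical stand-in for reading and dequantizing
// one expert's weights from a GGUF file on disk.
func loadExpert(id int) []byte {
	return []byte(fmt.Sprintf("weights-%d", id))
}

// get returns the weights for an expert, loading and caching them on
// first use (a cache miss models the expensive disk + dequant path).
func (c *expertCache) get(id int) []byte {
	if el, ok := c.items[id]; ok {
		c.order.MoveToFront(el)
		return el.Value.(*entry).weights
	}
	if c.order.Len() >= c.capacity {
		oldest := c.order.Back()
		c.order.Remove(oldest)
		delete(c.items, oldest.Value.(*entry).id)
	}
	e := &entry{id: id, weights: loadExpert(id)}
	c.items[id] = c.order.PushFront(e)
	return e.weights
}

func main() {
	cache := newExpertCache(2)
	for _, id := range []int{0, 1, 0, 2, 1} { // router picks experts per token
		_ = cache.get(id)
	}
	fmt.Println("resident experts:", cache.order.Len())
}
```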

Notes

  • "it's just way too slow... input processing, tokenization, and prompt loading; it takes so much time" (hasperdi); HN loves local sovereignty: "You should be able to own and run your own computation, permissionlessly" (pixelpoet).
  • High utility for async/batch coding; sparks discussions on consumer AI hardware limits.

QuotaGuard CLI Router

Summary

  • A CLI tool that proxies multiple LLM providers (Anthropic, Z.ai, OpenRouter, Cerebras) with automatic failover on rate limits, normalizes tool-calling/CoT formats, and compacts context for seamless model switching mid-session.
  • Addresses quota exhaustion and vendor lock-in, allowing "Claude for planning, GLM for implementation" without workflow breakage.

Details

  • Target Audience: Power users of Claude Code/OpenCode/Crush hitting daily limits
  • Core Feature: Configurable priority queue of models/providers (failover sketched below), auto context compaction, XML/JSON tool-format translation
  • Tech Stack: Go, OpenAI-compatible API spec, env var configs
  • Difficulty: Low
  • Monetization: Revenue-ready, freemium ($5/mo pro routes)
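
The failover core is small enough to sketch directly: walk an ordered provider list and fall through on HTTP 429. This is a minimal sketch against an OpenAI-compatible chat endpoint; the base URLs and env var names are placeholders for this sketch (check each vendor's docs for real endpoints), and a real router would layer tool-format translation and context compaction on top.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
	"os"
)

// provider is one OpenAI-compatible backend in the priority queue.
// The base URLs and env var names below are placeholders for this
// sketch; substitute each vendor's real endpoint and key.
type provider struct {
	name    string
	baseURL string
	keyEnv  string
}

var priority = []provider{
	{"primary", "https://api.primary.example/v1", "PRIMARY_KEY"},
	{"z.ai", "https://api.zai.example/v1", "ZAI_KEY"},
	{"openrouter", "https://openrouter.ai/api/v1", "OPENROUTER_KEY"},
}

// complete posts the same chat request to each provider in priority
// order, failing over whenever one answers 429 (quota exhausted) or
// is unreachable, and returns the first successful response body.
func complete(payload []byte) ([]byte, error) {
	for _, p := range priority {
		req, err := http.NewRequest("POST", p.baseURL+"/chat/completions", bytes.NewReader(payload))
		if err != nil {
			return nil, err
		}
		req.Header.Set("Authorization", "Bearer "+os.Getenv(p.keyEnv))
		req.Header.Set("Content-Type", "application/json")

		resp, err := http.DefaultClient.Do(req)
		if err != nil {
			continue // provider unreachable: try the next one
		}
		body, readErr := io.ReadAll(resp.Body)
		resp.Body.Close()
		if resp.StatusCode == http.StatusTooManyRequests {
			fmt.Fprintf(os.Stderr, "%s rate-limited, failing over\n", p.name)
			continue
		}
		if readErr != nil || resp.StatusCode != http.StatusOK {
			continue // malformed or error response: fail over too
		}
		return body, nil
	}
	return nil, fmt.Errorf("all providers exhausted")
}

func main() {
	payload := []byte(`{"model":"glm-4.7","messages":[{"role":"user","content":"hello"}]}`)
	out, err := complete(payload)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(string(out))
}
```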

Notes

  • "hit the Claude daily limit... type 'continue', hit enter" (theshrike79); "spilling over to a less preferred model when you run out of quota" (mlyle).
  • Instant HN appeal for multi-vendor hacks; practical for nightly coding sessions.

PlanMode Coding Agent

Summary

  • A TUI coding agent that enforces a "plan-review-implement" cycle before any execution, prioritizes local/open models with cloud fallback, and integrates a small DSL for fine-tuning prompts.
  • Fixes CLI tools "rushing into implementation headlong without planning or code reviews" (theshrike79), making it well suited to agentic workflows on slower hardware.

Details

  • Target Audience: CLI enthusiasts using Crush/OpenCode/Gemini CLI
  • Core Feature: Mandatory plan mode with user approval gates (sketched below), sub-agent delegation, one-shot prompting from local files
  • Tech Stack: Bubble Tea (TUI), LM Studio/Ollama integration, YAML for workflows
  • Difficulty: Medium
  • Monetization: Hobby
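
The approval gate itself can be sketched independently of the Bubble Tea UI layer: the agent may not execute until the user accepts a plan, and a rejection loops back to replanning. generatePlan and implement below are hypothetical stubs standing in for model calls.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

// generatePlan and implement are hypothetical stubs standing in for
// model calls; a real agent would hit a local or cloud LLM here.
func generatePlan(task string) string {
	return "1. Reproduce the bug\n2. Write a failing test\n3. Patch and re-run"
}

func implement(task, plan string) {
	fmt.Println("executing approved plan for:", task)
}

func main() {
	task := "fix flaky date parsing"
	reader := bufio.NewReader(os.Stdin)

	// The gate: the agent may not touch code until the user approves
	// a plan. Rejecting loops back to replanning instead of executing.
	for {
		plan := generatePlan(task)
		fmt.Printf("Proposed plan:\n%s\nApprove? [y/n/q] ", plan)
		answer, _ := reader.ReadString('\n')
		switch strings.TrimSpace(strings.ToLower(answer)) {
		case "y":
			implement(task, plan)
			return
		case "q":
			fmt.Println("aborted, no changes made")
			return
		default:
			fmt.Println("plan rejected, regenerating...")
		}
	}
}
```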

Notes

  • "I can't understand why every CLI tool doesn't have Plan mode already, it should be table stakes" (theshrike79); "not good enough for agentic coding" (hasperdi).
  • Fuels debates on agent UIs; high utility for bugfixing/exploration with mixed local/cloud models.
