Project ideas from Hacker News discussions.

Gemini 3.5 Flash

📝 Discussion Summary

1. Pricing shock & cost increase
The price jump is a major pain point.

“$9 vs $12 for output.” – swe_dima
“3× price increase of the last Flash model ($3 → $9 per 1M output).” – bakugo
“Gemini 3.5 Flash: $0.75 input / $4.50 output per 1 M tokens, with output price explicitly ‘including thinking tokens’.” – GodelNumbering

2. Confusing model naming & tier structure
Users are frustrated by the new naming conventions (Flash‑Lite, Flash, Pro, etc.).

“Flash‑Lite is a different product from Flash, which is more expensive. They couldn’t be more confusing with their naming.” – naman
“Gemini 3.5 Flash is priced like a Pro model while still being called ‘Flash’.” – Alfon

3. Performance vs. benchmark claims
There is debate over how the new model stacks up against rivals.

“Arena.ai: Gemini 3.5 Flash shifts the Pareto frontier in Text. 8 models from Google dominate the price‑performance curve.” – Arena.ai (tweet)
“Gemini 3.5 Flash uses ~7,500 tokens for a complex SVG task while 3.1 Pro uses ~28 k tokens, yet only the latter animates correctly.” – sxx

4. UI / access reliability problems
The Gemini web UI, Antigravity quota, and CLI are frequently cited as flaky.

“In our experience, caching is not very reliable with Google. We always get random cache misses that don’t happen with other providers.” – henryah
“Google’s AI Studio is shockingly bad – sessions refresh, disappear, and error out indefinitely.” – veselin

5. Hallucinations & search reliability
Many note that the models still fabricate links and citations.

“People complain about them incessantly, but I can almost never get people to actually post receipts.” – WarmWash
“Gemini will confidently give me wrong or outdated information unless forced to search, and even then it often hallucinates the source.” – krupan

These five themes capture the dominant concerns across the discussion.


🚀 Project Ideas

Gemini Cache‑Optimized Relay

Summary

  • Problem: Google’s Gemini pricing and flaky cache reliability make it costly and unpredictable for developers building agentic workflows.
  • Value Prop: A lightweight proxy that caches Gemini responses, tracks hit‑rates, and automatically falls back to cheaper models when cache misses occur, reducing token spend by 30‑50%.

Details

Key Value
Target Audience Mid‑size AI teams, indie hackers, and SaaS founders building multi‑model agents.
Core Feature Automatic Gemini caching with hit‑rate analytics, fallback to DeepSeek/Qwen on miss, real‑time cost estimator.
Tech Stack Backend: Node.js + Redis (TTL‑based cache); API: OpenAPI‑compatible wrapper; Front‑end: minimal CLI & dashboard.
Difficulty Medium
Monetization Revenue-ready: $0.01 per 1 k input tokens + $0.02 per 1 k output tokens (flexible tier).

Notes

  • HN commenters repeatedly lamented “price hikes” and “cache flakiness”; a solution that guarantees cheaper reuse of cached tokens would be immediately attractive.
  • Could integrate Google’s explicit caching API while hiding its quirks, turning Gemini into a reliable, cost‑effective building block for agentic coding.
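
The caching and hit‑rate tracking described above could be sketched roughly as follows. This is a minimal illustration, not a definitive design: an in‑memory dictionary stands in for the Redis TTL cache, and `TTLCache`, `relay`, and the `call_model` callable are hypothetical names introduced here. Which provider `call_model` talks to on a miss (Gemini or a cheaper fallback) is left to the caller.

```python
import hashlib
import time
from typing import Callable, Dict, Optional, Tuple


class TTLCache:
    """In-memory stand-in (assumption) for the Redis TTL cache in the tech stack."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl = ttl_seconds
        self.clock = clock
        self.store: Dict[str, Tuple[float, str]] = {}  # key -> (expiry, response)
        self.hits = 0
        self.misses = 0

    @staticmethod
    def key_for(prompt: str) -> str:
        # Hash the prompt so keys have a fixed size regardless of prompt length.
        return hashlib.sha256(prompt.encode("utf-8")).hexdigest()

    def get(self, prompt: str) -> Optional[str]:
        key = self.key_for(prompt)
        entry = self.store.get(key)
        if entry is not None and entry[0] > self.clock():
            self.hits += 1
            return entry[1]
        self.store.pop(key, None)  # drop the expired entry, if there was one
        self.misses += 1
        return None

    def put(self, prompt: str, response: str) -> None:
        self.store[self.key_for(prompt)] = (self.clock() + self.ttl, response)

    def hit_rate(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


def relay(prompt: str, cache: TTLCache, call_model: Callable[[str], str]) -> str:
    """Serve from cache when possible; otherwise call the model and store the result."""
    cached = cache.get(prompt)
    if cached is not None:
        return cached
    response = call_model(prompt)
    cache.put(prompt, response)
    return response
```

Passing the clock in as a parameter keeps expiry testable without sleeping; a real deployment would more likely delegate expiry to Redis itself.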

Cross‑Provider Token Router

Summary

  • Problem: Users must manually switch between Gemini, DeepSeek, Qwen, etc., to balance performance and cost; no automated tool exists.
  • Value Prop: An autonomous router that selects the cheapest model meeting a predefined quality threshold for each request, continuously learning from token usage patterns.

Details

Key Value
Target Audience DevOps engineers, API platform builders, and AI product managers.
Core Feature Dynamic model selection + cost‑aware request routing; fallback logic for flaky services.
Tech Stack Backend: Python FastAPI; Decision engine: lightweight RL bandit; Storage: SQLite; Deploy: Docker/K8s.
Difficulty Medium
Monetization Revenue-ready: usage‑based fee of 0.5% of total token spend saved.

Notes

  • Discussions on HN about “price is just a tax” and “pricing noose tightening” indicate appetite for a tool that reduces spend without manual overhead.
  • The router could surface real‑time price comparisons and alert users to emerging cheaper models (e.g., upcoming DeepSeek Flash).
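
One way to read "cheapest model meeting a predefined quality threshold" with a lightweight bandit is an epsilon‑greedy selector, sketched below. Everything here is an assumption for illustration: the class and method names are hypothetical, quality is approximated as an observed success rate reported by the caller, and prices are whatever the caller supplies.

```python
import random
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class ModelStats:
    price_per_1m_output: float  # USD, supplied by the caller
    successes: int = 0
    trials: int = 0

    def quality(self) -> float:
        # Optimistic prior: an untried model counts as good so it gets explored.
        return self.successes / self.trials if self.trials else 1.0


class CostAwareRouter:
    """Epsilon-greedy router: usually the cheapest model above the quality bar."""

    def __init__(self, models: Dict[str, float], min_quality: float,
                 epsilon: float = 0.1,
                 rng: Optional[random.Random] = None) -> None:
        self.stats = {name: ModelStats(price) for name, price in models.items()}
        self.min_quality = min_quality
        self.epsilon = epsilon
        self.rng = rng if rng is not None else random.Random()

    def choose(self) -> str:
        names = list(self.stats)
        if self.rng.random() < self.epsilon:
            return self.rng.choice(names)  # explore a random model
        eligible = [n for n in names
                    if self.stats[n].quality() >= self.min_quality]
        if eligible:
            # Exploit: cheapest model that currently meets the threshold.
            return min(eligible, key=lambda n: self.stats[n].price_per_1m_output)
        # Nothing qualifies: fall back to the best observed quality.
        return max(names, key=lambda n: self.stats[n].quality())

    def record(self, name: str, success: bool) -> None:
        """Feed back whether the chosen model's answer passed the quality check."""
        stats = self.stats[name]
        stats.trials += 1
        stats.successes += int(success)
```

A caller would invoke `choose()` per request, send the request to that model, and then call `record()` with the outcome of its own quality check. How that check is defined is the hard part and is not addressed by this sketch.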

Local‑First LLM Proxy (Open‑Source Alternative)

Summary

  • Problem: Cloud API costs are prohibitive; developers want a self‑hosted endpoint that mimics Gemini/DeepSeek behavior for low‑latency, low‑cost usage.
  • Value Prop: A Docker‑compose‑ready proxy that serves Qwen 3.6‑35B, DeepSeek‑V4‑Flash, and other open models with an OpenAI‑compatible API, auto‑updating weights.

Details

Key Value
Target Audience Hobbyists, small startups, privacy‑focused enterprises.
Core Feature Unified API endpoint, automatic model swap based on load, built‑in token‑count monitoring.
Tech Stack Backend: FastAPI + vLLM; Model loader: Hugging Face Transformers; Auth: JWT; UI: Streamlit dashboard.
Difficulty High
Monetization Hobby

Notes

  • HN users regularly cite “price hikes” and “China’s cheaper models” as reasons to self‑host; a ready‑to‑run solution would capture that market.
  • Could be positioned as a “Gemini‑Free” drop‑in replacement for any Gemini‑compatible codebase.

Cost‑Aware Benchmark Dashboard for LLM APIs

Summary

  • Problem: Teams lack visibility into per‑token costs, cache hits, and latency across multiple LLMs, leading to unexpected bills.
  • Value Prop: A SaaS dashboard that ingests API logs, calculates effective cost per task, visualizes cache efficiency, and sends alerts for anomalous price spikes.

Details

Key Value
Target Audience Engineering managers, AI product owners, finance ops for AI‑heavy firms.
Core Feature Multi‑provider cost breakdown, cache‑hit ratio reporting, predictive spend modeling.
Tech Stack Frontend: React + D3; Backend: Go (log parser); DB: TimescaleDB; Hosting: Vercel/Render.
Difficulty Medium
Monetization Revenue-ready: tiered subscription ($19/mo basic, $99/mo pro).

Notes

  • Frequent HN complaints about “unexpected price jumps” and “price war” signal demand for proactive cost monitoring.
  • Could integrate with the Cross‑Provider Token Router to provide actionable insights directly to users.
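
The "effective cost per task" and cache‑hit ratio mentioned above might be computed from parsed logs along these lines. This is a sketch under stated assumptions: the `LogRecord` fields, the `(input, cached input, output)` price tuple, and the function names are all hypothetical, and cache efficiency is measured as the share of input tokens served from cache.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple


@dataclass(frozen=True)
class LogRecord:
    task_id: str
    model: str
    input_tokens: int          # total input tokens, cached ones included
    cached_input_tokens: int   # subset of input_tokens served from cache
    output_tokens: int


# USD per 1M tokens for each model: (input, cached input, output).
Prices = Dict[str, Tuple[float, float, float]]


def record_cost(rec: LogRecord, prices: Prices) -> float:
    """Cost of one API call, billing cached input at its own rate."""
    p_in, p_cached, p_out = prices[rec.model]
    fresh_input = rec.input_tokens - rec.cached_input_tokens
    total = (fresh_input * p_in
             + rec.cached_input_tokens * p_cached
             + rec.output_tokens * p_out)
    return total / 1_000_000


def cost_per_task(records: Iterable[LogRecord], prices: Prices) -> Dict[str, float]:
    """Sum the cost of every call that belongs to the same task."""
    totals: Dict[str, float] = defaultdict(float)
    for rec in records:
        totals[rec.task_id] += record_cost(rec, prices)
    return dict(totals)


def cache_hit_ratio(records: Iterable[LogRecord]) -> float:
    """Fraction of all input tokens that were served from cache."""
    recs = list(records)
    total_input = sum(r.input_tokens for r in recs)
    cached = sum(r.cached_input_tokens for r in recs)
    return cached / total_input if total_input else 0.0
```

Grouping by task rather than by request is what surfaces the cost of multi‑call agent runs, which per‑request pricing pages do not show directly.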

Agentic Workflow Scheduler with Adaptive Model Switching

Summary

  • Problem: Agentic coding pipelines waste tokens by using premium models for every step; users need a strategy that uses cheap models for most operations and only upgrades when necessary.
  • Value Prop: A workflow engine that parses an AGENTS.md‑style plan, executes low‑cost model steps, validates output with a cheaper verifier, and escalates to a premium model only on failure, cutting token costs by up to 70%.

Details

Key Value
Target Audience AI‑centric dev teams, automation engineers, SaaS founders building AI‑augmented products.
Core Feature Plan parsing, multi‑stage execution, on‑the‑fly cost‑budgeting, fail‑over to high‑quality model.
Tech Stack Backend: Python + Celery; Scheduler: Airflow‑lite; Model API wrapper; Pricing engine.
Difficulty High
Monetization Revenue-ready: $0.005 per 1 k input tokens processed, $0.01 per 1 k output tokens generated (pay‑as‑you‑go).

Notes

  • Discussions about “Flash is now premium” and “need for caching” highlight the need for intelligent token budgeting.
  • Providing a concrete scheduler that integrates with existing CLI tools (e.g., Gemini CLI, Claude Code) would resonate strongly with HN developers.
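
The escalate‑on‑failure loop described above can be sketched as below. This is an illustration only: plan parsing is omitted (steps arrive as a plain list of strings), and `run_plan`, `StepResult`, and the `cheap`, `premium`, and `verify` callables are hypothetical placeholders for real model calls and a real verifier.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class StepResult:
    step: str
    output: str
    model_used: str   # "cheap" or "premium"
    escalated: bool


def run_plan(steps: List[str],
             cheap: Callable[[str], str],
             premium: Callable[[str], str],
             verify: Callable[[str, str], bool]) -> List[StepResult]:
    """Run every step on the cheap model; retry on the premium one if verification fails."""
    results: List[StepResult] = []
    for step in steps:
        output = cheap(step)
        if verify(step, output):
            results.append(StepResult(step, output, "cheap", False))
        else:
            # Escalate only this step, so the premium model is paid for
            # just where the cheap output was rejected.
            results.append(StepResult(step, premium(step), "premium", True))
    return results


def escalation_rate(results: List[StepResult]) -> float:
    """Share of steps that needed the premium model."""
    return sum(r.escalated for r in results) / len(results) if results else 0.0
```

The savings depend on the escalation rate and on how much the verifier itself costs, so a figure such as "up to 70%" would need to be measured per workload.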
