Project ideas from Hacker News discussions.

Production engineering when trading billions of dollars a day [video]

📝 Discussion Summary

Three dominant themes from the discussion

  1. Zero‑downtime vs. scheduled downtime tension
    Consumer‑facing SaaS teams aim to never drop a request, whereas exchange operators can take the whole system down for maintenance while the market is closed.

    “One advantage that they have is that the market closes, so they can do maintenance that takes the whole system down, but when you're running a global consumer product, it's a lot harder to do that without pushback.” — jedberg

  2. Technical difficulty of achieving true zero‑downtime upgrades
    Implementing seamless roll‑outs requires massive infrastructure (shadow execution, duplicated hardware, data‑sync, extra testing). The cost is prohibitive for most organizations.

    “I image you'd have to use shadow execution, where you roll out a full second copy, run every transaction through both, and compare the results… you would need a ton of extra hardware (more than double).” — jedberg
    “It cost a lot of money… a canary is still production traffic, so some transactions would fail, which isn't allowed for this kind of workload.” — jedberg

  3. Regulatory and operational constraints that enforce maintenance windows
    Trading platforms must coordinate versioning, protocol upgrades, and testing against a tightly regulated ecosystem, making continuous 24/7 upgrades impractical.

    “Only US. Other markets barely have liquidity during daytime… maintenance periods are actually a complication… the only value is for upgrades, which would still be scheduled with the market down.” — cgio
    “The 23/7 is not so much for maintenance as to have a defined window for changes to the market to happen.” — dmurray

These themes capture the core of the conversation: the clash between relentless consumer‑grade availability, the engineering overhead of truly uninterrupted upgrades, and the hard‑won operational windows that regulated trading systems must respect.
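The shadow execution described under theme 2 can be sketched roughly as follows. This is a minimal illustration, not how any exchange actually does it: the `Order` type and both engine functions are hypothetical stand-ins, and the candidate engine contains a deliberate bug so the comparison has something to catch.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Order:
    order_id: int
    side: str          # "buy" or "sell"
    quantity: int
    price_cents: int


def legacy_engine(order: Order) -> int:
    """Hypothetical current version: returns the notional value in cents."""
    return order.quantity * order.price_cents


def candidate_engine(order: Order) -> int:
    """Hypothetical new version with a deliberate bug for large orders."""
    notional = order.quantity * order.price_cents
    return notional + 1 if order.quantity > 1000 else notional


def shadow_compare(
    orders: Iterable[Order],
    primary: Callable[[Order], int],
    shadow: Callable[[Order], int],
) -> List[int]:
    """Run every order through both versions and return mismatching order IDs.

    Only the primary result would be served to clients; the shadow result
    is used purely for comparison.
    """
    mismatches = []
    for order in orders:
        served = primary(order)
        try:
            candidate = shadow(order)
        except Exception:
            # A crash in the shadow copy counts as a mismatch, not an outage.
            mismatches.append(order.order_id)
            continue
        if candidate != served:
            mismatches.append(order.order_id)
    return mismatches


orders = [Order(1, "buy", 100, 2500), Order(2, "sell", 5000, 2499)]
mismatches = shadow_compare(orders, legacy_engine, candidate_engine)
print(mismatches)  # prints [2]
```

Because both versions process every transaction, this approach needs at least a full second copy of the system, which is the hardware cost jedberg points out.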


🚀 Project Ideas

Zero‑Downtime Exchange Upgrade Platform

Summary

  • Provides automated shadow‑execution, canary rollout, and hot‑swap orchestration for financial exchange services.
  • Eliminates trade‑loss risk by keeping the old and new infra in lockstep until correctness is proven.

Details

Key | Value
Target Audience | Tier‑1 exchange operators, HFT firms, crypto‑derivatives platforms
Core Feature | Seamless parallel execution of legacy and new versions with automatic regression‑test gating and instant rollback on error
Tech Stack | Kubernetes + Istio service mesh, gRPC‑based shadow traffic mirroring, Redis Streams for state sync, Prometheus + Grafana for monitoring
Difficulty | High
Monetization | Revenue-ready: $39/mo per compute node (pay‑as‑you‑grow)
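One way to picture the "regression‑test gating and instant rollback on error" feature is a small state machine fed by comparison results. The sketch below is an assumption about how such a gate might behave; `RolloutGate`, `Phase`, and the `required_clean` threshold are hypothetical names, not part of any named product.

```python
from enum import Enum


class Phase(Enum):
    SHADOW = "shadow"            # new version runs alongside the old one
    PROMOTED = "promoted"        # new version has taken over
    ROLLED_BACK = "rolled_back"  # new version was abandoned


class RolloutGate:
    """Promote after N consecutive matching results; roll back on the first mismatch."""

    def __init__(self, required_clean: int) -> None:
        if required_clean < 1:
            raise ValueError("required_clean must be at least 1")
        self.required_clean = required_clean
        self.clean_seen = 0
        self.phase = Phase.SHADOW

    def record(self, matched: bool) -> Phase:
        # PROMOTED and ROLLED_BACK are terminal: later results are ignored.
        if self.phase is not Phase.SHADOW:
            return self.phase
        if not matched:
            self.phase = Phase.ROLLED_BACK
        else:
            self.clean_seen += 1
            if self.clean_seen >= self.required_clean:
                self.phase = Phase.PROMOTED
        return self.phase


gate = RolloutGate(required_clean=3)
for matched in (True, True, True):
    gate.record(matched)
print(gate.phase.value)  # prints promoted
```

A real gate would likely use a much larger sample and a time window; the single-mismatch rollback here reflects the thread's point that failed transactions are not acceptable for this workload.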

Notes

  • HN commenters repeatedly mentioned the cost of “double‑hardware” and the need for “shadow execution” for non‑deterministic components like LLMs. This platform abstracts that complexity and charges per node to offset hardware waste.
  • The concept directly addresses jedberg’s concern about “zero downtime maintenance” while satisfying market participants’ expectation that “the trade must be executed.”
  • Could be demoed at a virtual conference of SREs and quants to generate discussion and trial sign‑ups.

Exchange Upgrade Sandbox Simulator

Summary

  • A cloud‑based simulation environment where engineers can model multi‑layer exchange architectures and test upgrade paths with jitter and versioning constraints.
  • Generates deployment playbooks and compliance reports for regulators.

Details

Key | Value
Target Audience | Exchange architects, risk‑engine teams, regulatory compliance officers
Core Feature | Drag‑and‑drop topology builder, automated “upgrade jitter” scheduler, synthetic trade generator, built‑in data‑integrity checker
Tech Stack | Docker Compose sandbox, Node.js backend, React UI, PostGIS for network latency simulation, OpenAPI spec validation
Difficulty | Medium
Monetization | Revenue-ready: $29/mo per sandbox tenant
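The synthetic trade generator and data‑integrity checker could look something like the sketch below. It assumes a very simple model in which every trade has one buyer and one seller; the `Trade` type and all function names are hypothetical, and a real checker would cover far more than net positions.

```python
import random
from collections import defaultdict
from typing import Dict, List, NamedTuple


class Trade(NamedTuple):
    trade_id: int
    buyer: str
    seller: str
    quantity: int


def generate_trades(count: int, accounts: List[str], seed: int) -> List[Trade]:
    """Generate a reproducible stream of trades between distinct accounts."""
    rng = random.Random(seed)  # seeded so the same scenario can be replayed
    trades = []
    for trade_id in range(count):
        buyer, seller = rng.sample(accounts, 2)
        trades.append(Trade(trade_id, buyer, seller, rng.randint(1, 500)))
    return trades


def net_positions(trades: List[Trade]) -> Dict[str, int]:
    """Reference calculation: buyers gain quantity, sellers lose it."""
    positions: Dict[str, int] = defaultdict(int)
    for trade in trades:
        positions[trade.buyer] += trade.quantity
        positions[trade.seller] -= trade.quantity
    return dict(positions)


def check_integrity(trades: List[Trade], reported: Dict[str, int]) -> List[str]:
    """Return accounts whose reported position differs from the reference."""
    expected = net_positions(trades)
    accounts = sorted(set(expected) | set(reported))
    return [a for a in accounts if expected.get(a, 0) != reported.get(a, 0)]


trades = generate_trades(200, ["A", "B", "C", "D"], seed=7)
reported = net_positions(trades)   # stand-in for positions from the upgraded system
print(check_integrity(trades, reported))  # prints []
```

Seeding the generator means the same trade stream can be replayed against the pre‑upgrade and post‑upgrade systems, so any difference in reported positions points at the upgrade rather than at the input.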

Notes

  • The discussion highlighted cgio’s point about “upgrade jitter” and the difficulty of silent upgrades on external protocols – this tool lets teams experiment with that safely.
  • Skippyboxedhero’s claim that “16‑person crypto exchanges out‑perform legacy ones” suggests a market for lightweight, flexible tooling, which this simulator delivers.
  • SREs in the HN thread praised “extensive testing”; the simulator is a natural extension of that practice.

Dynamic Shard Switcher for 24/7 Trading

Summary

  • Manages scheduled, time‑based sharding of exchange workloads (e.g., weekday vs. weekend systems) with automatic database reconciliation and minimal latency impact.
  • Guarantees continuous order flow while swapping underlying compute clusters.
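The time‑based routing decision itself is small; the hard part is the state migration around it. A minimal sketch of the routing step, assuming a weekday/weekend split evaluated in UTC (the cluster names and the choice of UTC are assumptions for illustration):

```python
from datetime import datetime, timedelta, timezone

WEEKDAY_CLUSTER = "cluster-weekday"  # hypothetical cluster names
WEEKEND_CLUSTER = "cluster-weekend"


def select_cluster(now: datetime) -> str:
    """Pick the serving cluster from the request timestamp.

    Naive timestamps are rejected so that the switchover instant is
    unambiguous across regions.
    """
    if now.tzinfo is None:
        raise ValueError("timestamp must be timezone-aware")
    utc_now = now.astimezone(timezone.utc)
    # datetime.weekday(): Monday is 0, Sunday is 6.
    return WEEKDAY_CLUSTER if utc_now.weekday() < 5 else WEEKEND_CLUSTER


friday_night = datetime(2024, 1, 5, 23, 59, tzinfo=timezone.utc)
saturday = datetime(2024, 1, 6, 0, 0, tzinfo=timezone.utc)
# Friday 20:00 at UTC-5 is already Saturday 01:00 in UTC.
friday_local = datetime(2024, 1, 5, 20, 0, tzinfo=timezone(timedelta(hours=-5)))

print(select_cluster(friday_night), select_cluster(saturday), select_cluster(friday_local))
```

Normalizing to a single time zone before deciding keeps all routers in agreement about which cluster owns a given instant.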

Details

Key | Value
Target Audience | Market infrastructure teams, crypto exchanges, multi‑asset brokers
Core Feature | Dual‑stack routing, atomic state migration with lag‑compensated commit, alerting on sync drift, API‑level switchover without client reconfiguration
Tech Stack | NGINX + Envoy for routing, AWS Aurora global database, Lambda functions for trigger orchestration, Terraform for infra provisioning
Difficulty | Medium
Monetization | Revenue-ready: $15/mo per active shard pair

Notes

  • Inspired by nippoo’s “time‑based request sharding” idea and gricardo99’s “move to 24/7 trading,” the product offers a production‑grade implementation of that concept.
  • The HN thread emphasized the “devil is in the details” of reconciling trades across a changeover – this service automates that reconciliation, reducing human error.
  • SREs like woah and cyberpunk will appreciate a managed solution that abstracts away “manual interventions” while still meeting regulatory “pause‑and‑assess” requirements.
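Reconciling trades across a changeover can be illustrated with a simple ledger comparison. The sketch assumes each cluster exposes a snapshot mapping trade ID to quantity, taken at the cutover before the new cluster accepts orders; the `Reconciliation` type and `reconcile` function are hypothetical.

```python
from typing import Dict, List, NamedTuple


class Reconciliation(NamedTuple):
    missing_in_new: List[str]  # trades the outgoing cluster has, the incoming one lacks
    only_in_new: List[str]     # trades that appear only on the incoming cluster
    conflicting: List[str]     # same trade ID, different quantity


def reconcile(old_ledger: Dict[str, int], new_ledger: Dict[str, int]) -> Reconciliation:
    """Compare two trade-ID -> quantity snapshots taken at the changeover."""
    old_ids, new_ids = set(old_ledger), set(new_ledger)
    return Reconciliation(
        missing_in_new=sorted(old_ids - new_ids),
        only_in_new=sorted(new_ids - old_ids),
        conflicting=sorted(
            trade_id
            for trade_id in old_ids & new_ids
            if old_ledger[trade_id] != new_ledger[trade_id]
        ),
    )


old_ledger = {"t1": 100, "t2": 250, "t3": 75}
new_ledger = {"t1": 100, "t2": 260, "t4": 10}
result = reconcile(old_ledger, new_ledger)
print(result)
```

Any non‑empty field would block the switchover or raise an alert, which is where a "pause‑and‑assess" step would fit.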
