Project ideas from Hacker News discussions.

My iPhone 16 Pro Max produces garbage output when running MLX LLMs

📝 Discussion Summary

1. Phone calculators are not real calculators
Users keep replacing the stock app with emulators of TI‑, HP‑, or NumWorks calculators because they need a full history, CAS, or a familiar interface.

“I use the NumWorks emulator app whenever I need something more advanced.” – varun_ch
“I was pretty delighted to realize I could now delete the lame Calculator.app … I settled on NumWorks.” – xp84

2. Built‑in calculator apps feel under‑baked
Commenters find the default iOS/Android calculators short on history, symbolic evaluation, and a good UI for long expressions; history has reportedly reached Calculator.app only recently.

“Honestly, the main beef I have with Calculator.app is that on a screen this big, I ought to be able to see several previous calculations and scroll up if needed.” – xp84
“Calculator.app does have history now … it goes back to 2025 on my device.” – vscode‑rest

3. Apple’s MLX/LLM bug may point to a hardware‑level defect
A specific iPhone 16 Pro Max fails to run Apple’s own LLM correctly, suggesting, but not proving, a defect in the Neural Engine or its driver.

“Apple’s own LLM silently failed on this device … it seems Bad (TM) that Apple would ship devices where their own LLM didn’t work.” – bri3d
“The author’s conclusion was still completely reasonable given the evidence they had.” – TimByte

4. Floating‑point/NaN behaviour is a source of confusion
The discussion turns to IEEE‑754 guarantees, NaN propagation, and the limits of reproducibility across platforms.

“Anything that relies on bit patterns of NaNs behaving in a certain way … is in dangerous territory.” – ekelsen
“Binary operations combining two NaN inputs must result in one of the input NaNs.” – addaon
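
The caution about NaN bit patterns can be made concrete with a short Python sketch. It is an illustration only, not code from the thread, and the helper names `bits` and `same_result` are hypothetical. Ordinary values are compared bit for bit, while any two NaNs count as equivalent, so the check never depends on NaN payload or sign bits.

```python
import math
import struct


def bits(x: float) -> int:
    """Return the raw 64-bit IEEE-754 pattern of a Python float."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def same_result(a: float, b: float) -> bool:
    """Bit-exact comparison that deliberately ignores NaN payloads and signs."""
    if math.isnan(a) or math.isnan(b):
        return math.isnan(a) and math.isnan(b)
    return bits(a) == bits(b)


nan = float("nan")
print(nan == nan)               # False: NaN never compares equal to itself
print(same_result(nan, -nan))   # True: both are NaN, whatever their bits
print(same_result(0.0, -0.0))   # False: equal as numbers, different bit patterns
```

Whether signed zeros should count as a mismatch is a design choice; this sketch treats them as different because it compares raw bits.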

These four threads capture the bulk of the conversation’s concerns and preferences.


🚀 Project Ideas

MobileCalc REPL

Summary

  • A mobile calculator app that behaves like a REPL: you can edit previous expressions, assign variables, and re-run dependent calculations.
  • Provides full history, syntax highlighting, CAS support, and graphing (2D/3D) in a single, lightweight UI.

Details

Key              Value
Target Audience  Students, engineers, and hobbyists who need a powerful calculator on their phone.
Core Feature     Interactive expression editor with variable assignment, history navigation, and graphing.
Tech Stack       SwiftUI + Combine (iOS), Kotlin Multiplatform for Android, MathJax for rendering, libqalculate for CAS.
Difficulty       Medium
Monetization     Revenue‑ready: $4.99 one‑time purchase or $0.99/month subscription for advanced features.

Notes

  • HN commenters lament the lack of variable support in built‑in calculators: “I want to be able to return to an earlier expression, modify it, assign it to a variable…” (varun_ch).
  • The app would let users “tap to select previous expressions” and “modify the variable and rerun” (josephg).
  • Previewing the whole expression as you type would address the complaint that “built‑in calculator apps are surprisingly underbaked” (varun_ch).
  • Discussion potential: how it compares with existing emulators (NumWorks, HP Prime), and whether a native mobile REPL serves these users better.
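
The REPL behaviour described above (assign a variable, edit an earlier expression, re-run what depends on it) can be sketched in a few lines of Python. This is a minimal illustration, not the app's design: the `Session` class and its methods are hypothetical, and it evaluates a restricted arithmetic grammar through Python's `ast` module instead of a real CAS such as libqalculate.

```python
import ast
import operator

# Arithmetic operators the sketch accepts; anything else is rejected.
_BIN_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
}


def _evaluate(node, env):
    """Evaluate a restricted arithmetic AST against the variables in env."""
    if isinstance(node, ast.Constant) and type(node.value) in (int, float):
        return node.value
    if isinstance(node, ast.Name):
        return env[node.id]
    if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
        left = _evaluate(node.left, env)
        right = _evaluate(node.right, env)
        return _BIN_OPS[type(node.op)](left, right)
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        return -_evaluate(node.operand, env)
    raise ValueError("unsupported expression")


class Session:
    """Hypothetical REPL session: an ordered history of `name = expression` lines."""

    def __init__(self):
        self.history = []  # (name, expression source) pairs in entry order
        self.values = {}   # latest value of every variable

    def enter(self, name, source):
        """Append a new line, evaluate it, and return its value."""
        self.history.append((name, source))
        self._rerun()
        return self.values[name]

    def edit(self, index, source):
        """Replace the expression on an earlier line, then re-run everything."""
        name, _ = self.history[index]
        self.history[index] = (name, source)
        self._rerun()

    def _rerun(self):
        # Replaying the whole history in order keeps dependent lines up to date.
        self.values = {}
        for name, source in self.history:
            tree = ast.parse(source, mode="eval")
            self.values[name] = _evaluate(tree.body, self.values)


session = Session()
session.enter("r", "2")
session.enter("area", "3 * r ** 2")
session.edit(0, "5")              # change r; area is recomputed
print(session.values["area"])     # 75
```

Replaying the full history on every edit is the simplest way to keep dependent lines consistent; a real app would more likely track dependencies and re-evaluate only the affected lines.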

FPRepro Checker

Summary

  • A cross‑platform CLI/web service that runs a suite of floating‑point expressions on multiple devices/architectures and reports discrepancies.
  • Helps developers detect non‑reproducible results caused by hardware, compiler, or runtime differences.

Details

Key              Value
Target Audience  Mobile and embedded developers, QA engineers, scientific computing teams.
Core Feature     Automated reproducibility tests across iOS, Android, macOS, Linux, and various CPU/GPU backends.
Tech Stack       Rust (performance), Docker for isolated environments, WebAssembly for browser runs, REST API.
Difficulty       High
Monetization     Revenue‑ready: $99/month for enterprise API access, free tier with limited runs.

Notes

  • Addresses frustration about “floating point accumulation doesn’t commute” and inconsistent results across devices (bri3d, ekelsen).
  • Provides a practical utility for debugging the “Apple Intelligence” LLM issue where math operations diverge on a specific iPhone 16 Pro Max.
  • Could spark discussion on IEEE 754 compliance and platform‑specific quirks.
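
The kind of discrepancy such a tool would look for can be reproduced on a single machine. The Python sketch below sums the same three numbers in two groupings and records each result as an exact hex string, which is one plausible way to compare outputs between devices. The helpers `fingerprint`, `run_suite`, and `diff_reports` are hypothetical names, and the two "devices" are simulated. Strictly speaking, IEEE 754 addition is commutative; the property that fails is associativity, which is what the quoted complaint describes in practice.

```python
def fingerprint(value: float) -> str:
    """Exact text form of a float, suitable for comparing across machines."""
    return value.hex()


def run_suite(cases):
    """Evaluate each named zero-argument test case and fingerprint the result."""
    return {name: fingerprint(fn()) for name, fn in cases.items()}


def diff_reports(report_a, report_b):
    """Return the names of cases whose fingerprints differ between two reports."""
    return sorted(name for name in report_a if report_a[name] != report_b.get(name))


left_to_right = (0.1 + 0.2) + 0.3
right_to_left = 0.1 + (0.2 + 0.3)
print(left_to_right == right_to_left)   # False: grouping changes the rounding

# Two simulated devices that accumulate the same sum in a different order.
device_a = run_suite({
    "sum3": lambda: (0.1 + 0.2) + 0.3,
    "product": lambda: 1.5 * 2.0,
})
device_b = run_suite({
    "sum3": lambda: 0.1 + (0.2 + 0.3),
    "product": lambda: 1.5 * 2.0,
})
print(diff_reports(device_a, device_b))  # ['sum3']
```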

SmartKeyboard

Summary

  • A lightweight, privacy‑first iOS keyboard that replaces the default predictive text with a modern, ML‑based next‑word model.
  • Offers customizable language models, offline mode, and real‑time correction.

Details

Key              Value
Target Audience  iOS users frustrated with broken predictive text, developers of custom keyboards.
Core Feature     On‑device language model (e.g., a distilled GPT‑2 variant) with fast inference, context‑aware suggestions, and user‑controlled privacy settings.
Tech Stack       Swift + Core ML, TensorFlow Lite, optional server fallback for heavy models.
Difficulty       Medium
Monetization     Hobby (open source) with optional in‑app purchases for premium language packs.

Notes

  • Directly responds to “Typing on my iPhone… just gives up and stops correcting anything at all” (sen) and the broader “iOS Keyboard is Broken” discussion (macintux, taneq).
  • Users would appreciate a keyboard that “doesn’t randomly break” and can be tuned to their language habits.
  • Potential for community contributions and model fine‑tuning.
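
As a rough illustration of what next-word suggestion involves, the Python sketch below uses a bigram frequency table. This is a deliberately simple stand-in, not the neural model the idea proposes; the class name `BigramSuggester` and the training text are made up for the example.

```python
from collections import Counter, defaultdict


class BigramSuggester:
    """Toy next-word model: counts which word follows which."""

    def __init__(self):
        self.following = defaultdict(Counter)

    def train(self, text: str) -> None:
        """Count every adjacent word pair in the text."""
        words = text.lower().split()
        for current, nxt in zip(words, words[1:]):
            self.following[current][nxt] += 1

    def suggest(self, previous_word: str, k: int = 3):
        """Return up to k of the most frequent followers of previous_word."""
        counts = self.following.get(previous_word.lower())
        if not counts:
            return []
        return [word for word, _ in counts.most_common(k)]


model = BigramSuggester()
model.train("see you soon see you later see you soon")
print(model.suggest("you"))       # ['soon', 'later']
print(model.suggest("unknown"))   # []
```

Training on the user's own text, entirely on the device, is what would make even a small model like this feel tuned to their language habits.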

NeuralEngine Debugger

Summary

  • A diagnostic app for iOS that runs a battery of tests on the Apple Neural Engine (ANE) and the GPU, reporting performance, correctness, and compatibility.
  • Includes a UI to run sample LLM inference and compare results across device models.

Details

Key              Value
Target Audience  iOS developers, ML engineers, QA teams testing on Apple hardware.
Core Feature     Automated ANE health checks, benchmark suite, reproducibility checker for tensor operations, LLM inference demo.
Tech Stack       Swift, Metal, MLX, XCTest for automated tests, Core ML for model loading.
Difficulty       High
Monetization     Revenue‑ready: $29.99 one‑time purchase or $4.99/month for cloud‑based test results and analytics.

Notes

  • Addresses the “Apple Intelligence” bug where a specific iPhone 16 Pro Max produced wrong math results (bri3d, zczb).
  • Provides a practical tool for developers to verify that their models run correctly on the target device.
  • Could generate discussion on hardware‑level ML debugging and the need for better diagnostics in the Apple ecosystem.
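
A correctness check of the kind described above usually compares a device's output with a trusted reference within a tolerance. The following Python sketch is hypothetical: `check_tensor`, its tolerances, and the sample values are assumptions for illustration, and a real app would read the outputs from MLX or Core ML instead of hard-coded lists.

```python
import math
from dataclasses import dataclass


@dataclass
class CheckResult:
    passed: bool
    max_abs_error: float
    bad_indices: list


def check_tensor(reference, actual, rel_tol=1e-3, abs_tol=1e-5):
    """Compare a flattened device output with a reference, element by element."""
    if len(reference) != len(actual):
        raise ValueError("shape mismatch")
    bad, max_err = [], 0.0
    for i, (ref, got) in enumerate(zip(reference, actual)):
        if not math.isfinite(got):
            bad.append(i)  # a NaN or infinite output is always a failure
            continue
        max_err = max(max_err, abs(ref - got))
        if not math.isclose(ref, got, rel_tol=rel_tol, abs_tol=abs_tol):
            bad.append(i)
    return CheckResult(passed=not bad, max_abs_error=max_err, bad_indices=bad)


reference = [0.25, -1.0, 3.5, 0.0]
healthy = [0.2500001, -1.0, 3.5, 0.0]     # tiny rounding difference only
faulty = [0.25, float("nan"), 7.0, 0.0]   # one NaN and one wrong value

print(check_tensor(reference, healthy).passed)       # True
print(check_tensor(reference, faulty).bad_indices)   # [1, 2]
```

Reporting the failing indices, not just pass or fail, makes it easier to tell a single corrupted operation from output that is garbage throughout.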
