Project ideas from Hacker News discussions.

Futhark by example (2020)

📝 Discussion Summary

3 Prevalent Themes

  1. Compile‑time length tracking for vectors – many users point out that having the size of an array baked into its type (e.g., Vec<T, n>) eliminates a whole class of bugs.

    "It's definitely worth it to include the length as part of the type information for dynamic arrays" – ethanlipson
    "The point of the dependent types is that you can have the type system track that concat creates an M+N length vector, sort preserves length ... etc." – alpinisme
    "Here they are in Futhark: ... val concat [n] [m] 't : (xs: [n]t) -> (ys: [m]t) -> [n + m]t" – itishappy

  2. Shape‑analysis tooling and its absence in mainstream ML libraries – while some projects (Pyrefly, MLIR) are building shape‑hinting infrastructure, deep‑learning ecosystems still rely on manual annotations.

    "Shape functions and shape analysis are basically mundane infra in almost every ML compiler/language/DSL." – mathisfun123
    "The Pyrefly type checker is starting to work on this kind of shape hinting … I believe the plan is for it to work with other array packages (eg. JAX, NumPy)" – ainch

  3. Confusion over the name “Futhark” and community perception – the language’s name (taken from the runic alphabet) sparks curiosity, jokes, and occasional frustration about naming expectations.

    "It would be nice to not name your language after another language … I came here expecting something else." – CapricornNoble
    "More accurately it would be like calling it Alphabet, since that takes its name from Alpha Beta … just like the Futhark takes its name from the first letters in it." – hnarn


🚀 Project Ideas

Length‑Indexed Tensor Library for Python (LiteTensor)

Summary

  • Provide first‑class length‑indexed types for NumPy arrays and PyTorch tensors via a Python wrapper and a mypy plugin.
  • Eliminate shape‑related bugs in data pipelines and GPU kernels without leaving Python.
  • Core value: dependent‑type length tracking accessible to everyday Python developers.

Details

Key | Value
Target Audience | Python data‑science / ML practitioners
Core Feature | Array[T, n] type where n is a statically checked integer length; functions declare how they transform lengths (for example, concat returns Array[T, n + m] and sort returns Array[T, n]).
Tech Stack | Python 3.12, mypy plugin API, NumPy C‑API, PyTorch C++ bindings, optional Cython for speed.
Difficulty | Medium
Monetization | Hobby

Notes

  • HN commenters repeatedly mention wishing for compile‑time length guarantees in dynamic arrays (e.g., argv: Vec<T, argc>). This library would give them that guarantee in Python.
  • Aligns with Pyrefly’s shape‑hinting ambitions and would reduce the need for manual shape comments across notebooks.
  • Potential for community adoption in Jupyter, scientific Python, and GPU‑accelerated workflows.
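
As a rough illustration of the length tracking described above, the following is a minimal runtime sketch, not the proposed library. The names Vec, concat, and zip_with are hypothetical, and the checks run at call time, whereas the idea here is for a mypy plugin to prove the same facts statically. It mirrors the Futhark concat signature quoted in the discussion summary.

```python
from dataclasses import dataclass
from typing import Generic, TypeVar

T = TypeVar("T")


@dataclass(frozen=True)
class Vec(Generic[T]):
    """Hypothetical length-carrying wrapper (illustrative, not a real library type)."""

    items: tuple
    n: int  # the length a static checker would track as part of the type

    @staticmethod
    def of(values):
        items = tuple(values)
        return Vec(items, len(items))


def concat(xs, ys):
    # Length rule from the Futhark signature: [n]t -> [m]t -> [n + m]t
    return Vec(xs.items + ys.items, xs.n + ys.n)


def zip_with(f, xs, ys):
    # Both inputs must share one length n; the result also has length n.
    # Here the mismatch is caught at run time; a plugin would flag the call site.
    if xs.n != ys.n:
        raise ValueError(f"length mismatch: {xs.n} != {ys.n}")
    return Vec(tuple(f(a, b) for a, b in zip(xs.items, ys.items)), xs.n)
```

Calling concat on vectors of length 3 and 2 yields a vector whose recorded length is 5, while zip_with on the same pair raises ValueError. A static version would surface that second case as a type error before the code runs.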

CudaLengthRust: Dependent‑Type Wrappers for Safe CUDA Kernels

Summary

  • Introduce const‑parameterized array types that encode array lengths in the type, so mismatched dimensions are rejected at compile time rather than discovered at kernel launch.
  • Offer ergonomic APIs for matrix multiplication, reductions, and element‑wise ops that the compiler can verify.
  • Core value: zero‑overhead safety for CUDA kernels written in Rust.

Details

Key | Value
Target Audience | Rust developers building high‑performance GPU code, ML researchers using CUDA
Core Feature | Kernel signatures like fn matmul<T, const N: usize, const M: usize, const K: usize>(a: &[[T; M]; N], b: &[[T; K]; M]) -> [[T; K]; N] that the compiler checks.
Tech Stack | Rust 1.78+, cust crate for CUDA bindings, bindgen for header generation, const generics (computed lengths such as N + M would additionally need the nightly‑only generic_const_exprs feature).
Difficulty | High
Monetization | Revenue-ready: $50/mo enterprise support

Notes

  • Echoes ethanlipson’s call for putting the length in the type, in the style of “concat(Vec<T, n>, Vec<T, m>) -> Vec<T, n+m>”, applied to GPU code.
  • Would let Rust users achieve the same safety guarantees as Futhark or C++ template tricks, but with simpler syntax and better error messages.
  • Opens a pathway for commercial consulting around safe GPU kernels in sectors like finance and scientific computing.

ShapeGuard: Shape‑Aware Static Analysis for Jupyter Notebooks

Summary

  • Detect and report shape mismatches across notebook cells automatically, using incremental shape tracking.
  • Offer actionable warnings (e.g., “matrix multiplication expects (n, m) but got (p, q)”).
  • Core value: prevent common ML bugs without leaving the notebook environment.

Details

Key | Value
Target Audience | Data scientists and ML engineers using Jupyter notebooks
Core Feature | Cell‑level shape propagation: variables carry inferred shapes; execution aborts on invalid operations.
Tech Stack | Python kernel gateway, TypeScript front‑end for UI, integration with Pyrefly’s shape‑analysis library, optional MLIR backend for advanced inference.
Difficulty | Medium
Monetization | Revenue-ready: Freemium SaaS – team plan $15/user/mo

Notes

  • Directly addresses ainch’s observation that “ML people all have their own schemes for tracking shape information” and the pain of manual shape comments.

  • Would be a natural extension of Pyrefly’s roadmap, giving notebook users the same safety guarantees.
  • Sparks discussion on open‑source vs. commercial models for notebook tooling and could become a staple in collaborative ML workflows.
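
The cell‑level shape propagation idea can be sketched in a few lines. This is only an assumption about how such a checker might look: matmul_shape, check_cells, ShapeError, and the (target, left, right) cell encoding are hypothetical names invented for illustration, and the sketch handles 2‑D matrix multiplication only. It does not use Pyrefly or MLIR.

```python
from typing import Dict, Iterable, Optional, Tuple

Shape = Tuple[int, ...]


class ShapeError(Exception):
    """Raised when operand shapes are incompatible (hypothetical error type)."""


def matmul_shape(left: Shape, right: Shape) -> Shape:
    # Rule for 2-D operands only: (n, m) @ (m, k) -> (n, k).
    if len(left) != 2 or len(right) != 2:
        raise ShapeError(f"sketch handles 2-D shapes only, got {left} and {right}")
    if left[1] != right[0]:
        raise ShapeError(
            f"matrix multiplication expects matching inner dimensions, "
            f"got {left} @ {right}"
        )
    return (left[0], right[1])


def check_cells(
    cells: Iterable[Tuple[str, str, str]],
    env: Optional[Dict[str, Shape]] = None,
) -> Dict[str, Shape]:
    """Propagate shapes through cells of the form (target, left, right),
    each standing for `target = left @ right`. Stops at the first mismatch."""
    shapes = dict(env or {})
    for target, left, right in cells:
        shapes[target] = matmul_shape(shapes[left], shapes[right])
    return shapes
```

Given starting shapes for x, w1, and w2, check_cells walks the cells in order and records the inferred shape of each new variable, raising ShapeError at the first cell whose operands do not line up. A real tool would need to extract these operations from notebook code rather than take them as tuples.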
