Project ideas from Hacker News discussions.

Show HN: CLI tool for detecting non-exact code duplication with embedding models

📝 Discussion Summary

3 Prevalent Themes

  1. Detecting subtle, semantic code similarity

    "I built Slopo to solve one specific problem: finding similar code that is hardest to detect by other tools, coding AI agents, and humans." – rkochanowski
    "Similar code is often not a clone to refactor, and this is a trade‑off." – rkochanowski

  2. Interest in extending language support

    "If it did PHP I would love to run it over WordPress. What would it take to add that?" – realxrobau
    "PHP support can be easily added, I will release a new version soon." – rkochanowski

  3. Potential workflow integration and concerns

    "I can imagine putting this into a pre push hook to keep things clean after an initial sweep." – hdz
    "Nice idea. I can see this being useful before refactors, especially when the duplication is semantic rather than copy paste." – murats
    "I think that this is pretty cool, but is there any reason why we would want to remove similar/possible duplicate code?" – SpyCoder77


🚀 Project Ideas

Semantic Clone Detector for CI

Summary

  • Detects semantic code clones across many languages, surfacing hidden duplication before refactors.
  • Integrates with CI pipelines to flag risky duplication, saving developer time.

Details

Key              Value
Target Audience  Developers and teams maintaining large codebases, especially those using CI/CD
Core Feature     Semantic similarity detection using embeddings, with configurable thresholds
Tech Stack       Rust backend, Python inference, Docker, GitHub Actions
Difficulty       Medium
Monetization     Revenue-ready: Subscription (tiered by pipeline runs)

Notes

  • Targets the problem rkochanowski describes (similar code that other tools, AI agents, and humans miss) and fits hdz's idea of a pre‑push hook.
  • Community can request language support, e.g., PHP, expanding adoption.
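
As a rough illustration of the core feature (embedding similarity with a configurable threshold), here is a minimal Python sketch. It is not how Slopo works; the function names, the `ci_exit_code` helper, and the toy vectors are all assumptions made for this example, and a real tool would get its vectors from an embedding model rather than a hard-coded dictionary.

```python
import math
from itertools import combinations


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def find_similar_pairs(embeddings, threshold=0.9):
    """Return (name_a, name_b, score) for every pair scoring at or above threshold."""
    pairs = []
    # Sort by name so the pair order is deterministic.
    for (name_a, vec_a), (name_b, vec_b) in combinations(sorted(embeddings.items()), 2):
        score = cosine_similarity(vec_a, vec_b)
        if score >= threshold:
            pairs.append((name_a, name_b, score))
    # Highest-scoring pairs first.
    return sorted(pairs, key=lambda p: p[2], reverse=True)


def ci_exit_code(pairs):
    """Hypothetical CI gate: non-zero exit code when any pair was flagged."""
    return 1 if pairs else 0


# Toy vectors standing in for real model output (hypothetical values).
EXAMPLE_EMBEDDINGS = {
    "billing.py::total_price": [0.9, 0.1, 0.3],
    "cart.py::sum_items": [0.88, 0.12, 0.31],
    "auth.py::hash_password": [0.1, 0.95, 0.2],
}

matches = find_similar_pairs(EXAMPLE_EMBEDDINGS, threshold=0.95)
for name_a, name_b, score in matches:
    print(f"{score:.3f}  {name_a}  <->  {name_b}")
```

A pre‑push hook or CI step could pass `ci_exit_code(matches)` to `sys.exit` to block a push when new near-duplicates appear. Comparing every pair is quadratic in the number of snippets, so a large codebase would likely need an approximate nearest-neighbor index instead.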

AI‑Powered Refactor Service

Summary

  • Detects refactor‑worthy duplicates with AI and auto‑generates pull requests for them.
  • Offers context‑aware suggestions that prioritize semantic over literal clones.

Details

Key              Value
Target Audience  Individual developers and small teams seeking automated code quality improvements
Core Feature     Web UI + API that ingests a repository, runs clone detection, suggests refactors, and opens PRs
Tech Stack       Node.js backend, GPT‑4 API for suggestions, GitHub API for PR creation, PostgreSQL
Difficulty       High
Monetization     Revenue-ready: Pay‑per‑repo monthly ($5–$15)

Notes

  • Answers SpyCoder77's question of why similar code should be removed at all by showing the ROI of each refactor.
  • Would be a natural extension of Slopo’s embedding approach, appealing to HN readers.
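
One possible reading of "prioritize semantic over literal clones" is to rank candidate pairs higher when they are semantically close but textually different, since exact copies are already easy to find with conventional tools. The sketch below is only an assumption about how such a ranking could work: the scoring formula and function names are invented for illustration, and the semantic scores are hypothetical inputs that would come from an embedding model.

```python
from difflib import SequenceMatcher


def literal_similarity(code_a, code_b):
    """Character-level similarity in [0, 1], using difflib's ratio()."""
    return SequenceMatcher(None, code_a, code_b).ratio()


def refactor_priority(semantic_score, code_a, code_b):
    """Higher when two snippets mean the same thing but look different."""
    return semantic_score * (1.0 - literal_similarity(code_a, code_b))


def rank_candidates(candidates):
    """Sort (semantic_score, code_a, code_b) tuples by descending priority."""
    return sorted(candidates, key=lambda c: refactor_priority(*c), reverse=True)


# Hypothetical candidates: a verbatim copy and a reworded equivalent.
copy_paste = (0.99, "return a + b", "return a + b")
reworded = (0.90, "return sum(items)", "total = 0\nfor i in items:\n    total += i")

ranked = rank_candidates([copy_paste, reworded])
```

With this formula a verbatim copy scores zero, so the reworded pair is ranked first. A real service would probably want a softer weighting so that exact clones are still reported, just lower in the list.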

VS Code Duplicate Explorer Extension

Summary

  • Shows real‑time similarity scores for selected code blocks, highlighting hidden duplicates.
  • Provides quick actions to view, compare, or auto‑refactor duplicates within the editor.

Details

Key              Value
Target Audience  IDE‑centric developers who want on‑the‑fly duplicate insight
Core Feature     Extension that computes embeddings locally on the fly, surfaces matches in the margin, and supports configurable ignore patterns
Tech Stack       TypeScript, Rust compiled to WASM for embedding, VS Code Extension API
Difficulty       Low
Monetization     Hobby

Notes

  • Complements the pre‑push hook idea raised by hdz by giving the same feedback earlier, inside the editor.
  • Open‑source community can extend language support, fostering discussion.
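
The configurable ignore patterns could be as simple as glob matching on file paths before any embedding is computed. The following is a minimal sketch of that filtering step, written in Python for brevity even though the extension itself would be TypeScript; the function names and example patterns are assumptions, not part of any existing tool.

```python
from fnmatch import fnmatchcase


def is_ignored(path, ignore_patterns):
    """True if the path matches any glob pattern (case-sensitive)."""
    return any(fnmatchcase(path, pattern) for pattern in ignore_patterns)


def filter_paths(paths, ignore_patterns):
    """Keep only the paths that should be scanned for duplicates."""
    return [p for p in paths if not is_ignored(p, ignore_patterns)]


# Hypothetical user configuration: skip vendored and minified code.
IGNORE_PATTERNS = ["vendor/*", "*.min.js"]

to_scan = filter_paths(
    ["src/app.py", "vendor/lib/util.py", "static/app.min.js"],
    IGNORE_PATTERNS,
)
```

Note that `fnmatchcase` lets `*` match path separators, so `vendor/*` also excludes nested files such as `vendor/lib/util.py`. That differs from gitignore-style rules, which a real extension might prefer to follow.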
