Project ideas from Hacker News discussions.

Updates to our web search products and Programmable Search Engine capabilities

📝 Discussion Summary

Based on the Hacker News discussion of Google’s discontinuation of full-web search for its Programmable Search Engine, four themes stand out: the erosion of the indie search ecosystem, Google’s monopoly and aggressive legal tactics, the technical feasibility of alternatives, and the general decline in search quality.

1. The End of the Indie Search Ecosystem

The most immediate reaction was that Google is shutting down a vital resource for small developers and niche search engines. By capping new engines at 50 domains and forcing existing ones to transition to enterprise solutions, Google is effectively pricing out independent builders who relied on free or low-cost access to Google's index.

  • "This seems like it effectively ends the era of indie / niche search engines being able to build on Google’s index. Anything that looks like general web search is getting pushed behind enterprise gates." — 01jonny01
  • "I've seen similar patterns with Twitter's API restrictions and other platforms gradually closing down their ecosystems... RIP, another one to the Google Graveyard." — jpalepu33
  • "The beauty about Google Programmable Search across the entire web is that it's free and users can make money by linking it their Adsense account. Bing charge per query for the average user." — 01jonny01

2. Google's Monopoly and Anti-Competitive Practices

There is a strong consensus that Google is leveraging its monopoly to stifle competition and extract maximum value from the web. Users argue that Google’s dominance in search (and browsers) allows it to tax businesses unfairly via ads and to use legal threats against scraping services (like SerpAPI) that enable alternatives.

  • "Google is a monopoly across several broad categories. They're also a taxation enterprise. Google Search took over as the URL bar for 91% of all web users across all devices." — echelon
  • "Searching for ChatGPT -> Ads in first place... This is inexcusable." — echelon
  • "This is a clear example of why building on proprietary APIs is risky... Google is essentially saying: indie search is dead, pay enterprise prices or leave." — jpalepu33
  • "Google is now transitioning into a private web. Others have to replace Google. We need access to public information. States can not allow corporations to hold us here hostage." — shevy-java

3. Technical Feasibility of Building New Indexes (and the Difficulty)

A technical debate emerged regarding the viability of building independent search indexes (like Marginalia or the Qwant/Ecosia joint venture). While some users are experimenting with their own indexes, the consensus is that ranking (relevance) is significantly harder than indexing, and overcoming Google's infrastructure advantage is a massive hurdle.

  • "Unfortunately the index is the easy part. Transforming user input into a series of tokens which get used to rank possible matches... is the hard part." — jfindley
  • "The hard part is doing it at any sort of scale and producing useful results. It's easy to build something that indexes a few million documents. Pushing into billions is a bigger challenge." — marginalia_nu
  • "I found [YaCy] didn’t really work as a real search engine but it was interesting." — Gigachad
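
To make the indexing-versus-ranking distinction above concrete, here is a minimal sketch in Python. It is not how any engine named in the thread works: the toy corpus, the function names, and the choice of smoothed TF-IDF scoring are all assumptions made for illustration. Building the inverted index takes a few lines; deciding which matches come first is where the real effort goes.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each token to {doc_id: term frequency} (the 'easy part')."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for token, count in Counter(tokenize(text)).items():
            index[token][doc_id] = count
    return index


def search(query, index, doc_count):
    """Score documents by summed TF-IDF over the query tokens, best first."""
    scores = defaultdict(float)
    for token in tokenize(query):
        postings = index.get(token, {})
        if not postings:
            continue
        # Smoothed inverse document frequency: rarer tokens weigh more.
        idf = math.log(1 + doc_count / len(postings))
        for doc_id, tf in postings.items():
            scores[doc_id] += tf * idf
    # Ties are broken by document id so the order is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical toy corpus, for illustration only.
DOCS = {
    "a": "Rust crawler for small search engines",
    "b": "Search ranking is harder than indexing",
    "c": "Gardening tips for small balconies",
}
INDEX = build_index(DOCS)
```

Calling `search("search ranking", INDEX, len(DOCS))` ranks document "b" ahead of "a" because "ranking" is the rarer token. A production ranker would also need stemming, phrase handling, link signals, and spam resistance, which is the part commenters describe as hard.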

4. Search Quality and the "Spam/SEO" Problem

Underlying the discussion is a shared frustration with the current state of search results. Many users feel Google has become unusable due to ads and SEO spam, which ironically makes the idea of independent, curated search engines more attractive, even if they are technically inferior in scale.

  • "I tested it using a local keyword, as I normally do, and it took me to a Wikipedia page I didn’t know existed. So thanks for that." — johnofthesea (commenting on an indie search engine)
  • "Five years ago Google already became unusable without 'site:reddit.com'... Nowadays reddit is also shit, which means that the only use case for me to use Google or any search engine is to find products that for some reason I don't want to buy on Amazon." — anal_reactor
  • "Is this perhaps to prevent ChatGPT, Claude and Grok to use Google Search? It would make sense for Google to keep that ability for Gemini." — cubefox

🚀 Project Ideas

Custom Search Engine Kit

Summary

  • A toolkit to help indie developers build and maintain their own focused web crawlers and search indexes.
  • Solves the immediate need for developers who lose access to Google's Programmable Search by providing a curated, open-source stack for indexing, ranking, and hosting a niche search engine.

Details

Key              Value
Target Audience  Indie developers, niche community builders, and privacy-focused startups.
Core Feature     Modular toolkit: a crawler, an indexer (like Typesense/Meilisearch), a ranking module (simple PageRank or content heuristic), and a self-hostable UI.
Tech Stack       Rust (for crawler efficiency), Go (for backend/indexing), TypeScript (for UI), PostgreSQL/Typesense (for index).
Difficulty       High
Monetization     Revenue-ready: Enterprise support, managed hosting, or a premium "Pro" tier with advanced crawling features.

Notes

  • Directly addresses the "curated" approach mentioned by @saltysalt (greppr.org) and the need to "own your core infrastructure." It provides the tools others need to replicate this success.
  • High potential for HN discussion as it empowers the community to decentralize search, a frequent topic in antitrust and web infrastructure debates.
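
The "simple PageRank" ranking module mentioned in the details could start from something like the sketch below. It is written in Python for brevity, not in the Rust/Go stack proposed above. The link graph, the function name, and the damping factor of 0.85 (the commonly quoted default) are assumptions for illustration, not part of any existing toolkit.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over {page: [outbound pages]}."""
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page receives the "random jump" share first.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling page: spread its rank evenly over all pages.
                for other in pages:
                    new_rank[other] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical link graph, for illustration only.
GRAPH = {
    "home": ["docs", "blog"],
    "docs": ["home"],
    "blog": ["home", "docs"],
    "orphan": ["home"],
}
RANKS = pagerank(GRAPH)
```

In this toy graph "home" ends up with the highest score and "orphan", which nothing links to, with the lowest. For a niche engine covering a few thousand pages, an in-memory pass like this is likely enough; at larger scale the graph would need to live on disk or in a sparse matrix.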

Ad-Free Search API Aggregator

Summary

  • A centralized proxy API service that aggregates results from multiple adversarial SERP providers (like SerpAPI) and offers a unified, ad-free endpoint.
  • Solves the fragmentation and legal risk issue for search engines like Kagi, providing a stable, paid API layer that abstracts away the underlying scraping infrastructure and potential shutdowns.

Details

Key              Value
Target Audience  SaaS companies, LLM developers, and alternative search engines needing reliable search results without Google's ad syndication.
Core Feature     Single API key for web search, redundant backend providers, rate limiting, and clean result formatting (stripping ads/SEO spam).
Tech Stack       Node.js/TypeScript (API layer), Redis (caching/rate limits), Docker/Kubernetes (deployment).
Difficulty       Medium
Monetization     Revenue-ready: Per-query pricing (e.g., $0.001 per request) with volume discounts.

Notes

  • Addresses the frustration that Kagi and others are forced to rely on "third-party vendors" who scrape Google. As @tpetry noted, SerpAPI is a major player here, and Google is already suing them.
  • Fills the gap left by Google's API restriction and Bing's discontinuation of Custom Search. HN users would appreciate a tool that helps small players navigate this hostile landscape.

"Local-First" Domain Searcher

Summary

  • A lightweight desktop tool that lets users index and search specific sets of websites (up to 50) locally.
  • Solves the problem for users and small businesses who used Google Programmable Search to power internal documentation, intranet search, or personal research libraries, which are now capped at 50 domains.

Details

Key              Value
Target Audience  Researchers, documentation maintainers, power users, and small teams needing private search.
Core Feature     GUI to input URLs, a headless browser crawler, local full-text indexing (SQLite), and a simple search interface.
Tech Stack       Electron (Desktop app), SQLite (Local DB), Node.js (Crawler logic).
Difficulty       Low/Medium
Monetization     Hobby: Free open-source tool. Revenue-ready: Paid "Pro" version for unlimited sites or advanced filtering.

Notes

  • Directly counters the new 50-domain cap by moving the search capability to the user's local machine.
  • Appeals to HN's "build your own tools" ethos and privacy-conscious users who want to search their own data without relying on cloud services.
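
The "local full-text indexing (SQLite)" piece could be as small as the sketch below, which uses SQLite's FTS5 extension through Python's standard `sqlite3` module. It is not in the Electron/Node.js stack proposed above. It assumes the SQLite build bundled with your Python includes FTS5 (common, but not guaranteed), and the table layout and function names are made up for illustration.

```python
import sqlite3


def open_index(path=":memory:"):
    """Create (or open) a full-text index backed by an FTS5 virtual table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE VIRTUAL TABLE IF NOT EXISTS pages "
        "USING fts5(url UNINDEXED, title, body)"
    )
    return conn


def add_page(conn, url, title, body):
    """Store one crawled page; the crawler itself is out of scope here."""
    conn.execute(
        "INSERT INTO pages (url, title, body) VALUES (?, ?, ?)",
        (url, title, body),
    )


def search(conn, query, limit=10):
    """Return (url, title) pairs, best match first (FTS5 rank is BM25-based)."""
    rows = conn.execute(
        "SELECT url, title FROM pages WHERE pages MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    )
    return rows.fetchall()


# Hypothetical pages, for illustration only.
conn = open_index()
add_page(conn, "https://example.org/a", "SQLite notes", "Full-text search with SQLite")
add_page(conn, "https://example.org/b", "Gardening", "Tomatoes on a balcony")
conn.commit()
```

With this data, `search(conn, "sqlite")` returns only the first page. The query string is passed straight to FTS5, so raw user input containing quotes or operators would need escaping before it reaches `MATCH`.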

Trademark-Safe Search Proxy

Summary

  • A search proxy that automatically filters out ads and results targeting specific trademarks, providing a "clean" organic result set.
  • Solves the "taxation" problem identified by @echelon, where searching for trademarks like "iPhone" or "ChatGPT" triggers paid ads, allowing businesses to avoid paying Google for their own brand traffic.

Details

Key              Value
Target Audience  E-commerce sites, brand managers, and businesses reliant on organic search traffic.
Core Feature     API that accepts a query, executes it, parses the SERP, removes paid entries containing specific keywords (trademarks), and returns clean data.
Tech Stack       Python (Scraping/Processing), FastAPI (Proxy), NLP libraries (for trademark matching).
Difficulty       Medium (due to constant maintenance against Google's DOM changes)
Monetization     Revenue-ready: SaaS subscription based on query volume.

Notes

  • Tackles the "Google is a tax" argument by offering a technical workaround to the ad-heavy SERP, a major pain point for businesses.
  • Would spark debate on HN regarding "adversarial interoperability" and the ethics of scraping/ad filtering. It offers a pragmatic tool for a philosophical problem discussed in the thread.
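
The filtering step of the core feature (removing paid entries that mention specific trademarks) might look like the following sketch. It assumes the SERP has already been fetched and parsed into dictionaries. The `paid`, `title`, and `url` fields and the function name are hypothetical, and plain case-insensitive word matching stands in for the NLP-based matching mentioned in the tech stack.

```python
import re


def strip_trademark_ads(entries, trademarks):
    """Drop paid entries whose title or URL mentions a protected term."""
    if not trademarks:
        return list(entries)
    # One case-insensitive pattern that matches any trademark as a whole word.
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(t) for t in trademarks) + r")\b",
        re.IGNORECASE,
    )
    kept = []
    for entry in entries:
        text = f"{entry.get('title', '')} {entry.get('url', '')}"
        if entry.get("paid", False) and pattern.search(text):
            continue  # paid placement targeting a trademark: filter it out
        kept.append(entry)
    return kept


# Hypothetical parsed SERP, for illustration only.
SERP = [
    {"title": "ChatGPT alternative, try now", "url": "https://ads.example/a", "paid": True},
    {"title": "ChatGPT", "url": "https://chatgpt.com", "paid": False},
    {"title": "Cheap flights", "url": "https://ads.example/b", "paid": True},
]
CLEAN = strip_trademark_ads(SERP, ["ChatGPT", "iPhone"])
```

`CLEAN` keeps the organic ChatGPT result and the unrelated paid entry, and drops only the paid entry that mentions the trademark. Reliably detecting which entries are paid is the fragile part, since it depends on the DOM changes noted under Difficulty.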
