TL;DR: CatchAll is a web search API that generates unique datasets that don’t exist anywhere else on the web. Built on NewsCatcher’s proprietary real-world event index, it delivers state-of-the-art recall – finding all relevant events, not just the top results.
Why CatchAll Exists
Web search is optimized for ranking and speed. It's designed to find the best answer quickly. But when the correct answer is "all matching records," ranking doesn't help. CatchAll solves this. It’s designed from the ground up for enumeration tasks, returning all relevant events.
We benchmarked 26 complex enterprise queries against two leading tools, Exa Websets and OpenAI Deep Research (with access to the web). CatchAll found 64.5% more true positives than Exa and 3940.8% more true positives than OpenAI Deep Research. This means CatchAll picks up on critical events that other leading web search APIs miss.
Sample queries from the benchmark:
Query: List all security incidents (data breaches, ransomware, hacks) disclosed between November 3-5.
CatchAll: 298 validated security incidents
Other web search tools: 3-78 incidents identified
Query: Catch all large energy and tech corporations releasing new net-zero targets or sustainability progress updates between November 3-5.
CatchAll: 90 identified
Other web search tools: 1-6 identified
Query: Find all companies opening new offices or facilities in the US, including smaller satellite offices, between November 3-5.
CatchAll: 111 companies identified
Other web search tools: 1-42 companies identified
A full comparison with OpenAI Deep Research, Exa, Parallel, and others is coming in December 2025.
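To make the "X% more" framing concrete, here is a minimal sketch of the arithmetic applied to the three sample queries above. It assumes the comparison point is the upper end of each reported range for the other tools; the helper name `percent_more` is ours, and the post does not state how the headline figures were aggregated across all 26 queries, so this only illustrates the per-query calculation.

```python
def percent_more(ours: int, theirs: int) -> float:
    """Relative lift of `ours` over `theirs`, expressed as a percentage."""
    if theirs <= 0:
        raise ValueError("baseline count must be positive")
    return (ours - theirs) / theirs * 100


# (CatchAll count, best count among the other tools) taken from the
# sample queries above. Using the top of each range is an assumption.
samples = {
    "security incidents": (298, 78),
    "net-zero updates": (90, 6),
    "new US offices": (111, 42),
}

for label, (catchall, best_other) in samples.items():
    print(f"{label}: {percent_more(catchall, best_other):.1f}% more")
```

Even against the best competing result on each query, the gap ranges from roughly 1.6x to 14x more records found.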
How CatchAll Works: Maximize Coverage, Then Validate
We flip the traditional search stack:
- Query to Retrieval Plan
An LLM interprets your natural-language query and generates optimized search queries for maximum recall.
- High-Coverage Retrieval
We search 1.5 billion articles, typically returning 40,000-50,000 candidates for a two-week window.
- Intelligent Clustering
Graph-based clustering (Leiden algorithm) groups near-duplicate stories, reducing the dataset to distinct events.
- Validation
A Gemini-class model evaluates web pages to determine whether they meet your criteria—keeping only relevant, credible items.
- Structured Extraction
We extract entities, amounts, dates, and contextual fields. Output isn't links—it's structured, citation-backed data.
- Final Deduplication
You get unique, validated records ready for dashboards, LLMs, or automated workflows.
CatchAll processes 10,000+ pages per minute.
We're the slowest, yet most comprehensive, approach available—because completeness is what gives confidence.
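The clustering step is the easiest one to picture in code. The sketch below is not CatchAll's implementation: it swaps the Leiden algorithm for plain connected components over a word-overlap (Jaccard) graph, which is enough to show how near-duplicate stories collapse into distinct events. The function names, the 0.5 threshold, and the sample headlines are all hypothetical.

```python
from itertools import combinations


def tokens(text: str) -> frozenset[str]:
    """Lowercase a headline and split it into a set of alphanumeric words."""
    cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in text)
    return frozenset(cleaned.split())


def jaccard(a: frozenset[str], b: frozenset[str]) -> float:
    """Share of words two headlines have in common (0.0 to 1.0)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster_headlines(headlines: list[str], threshold: float = 0.5) -> list[list[str]]:
    """Group headlines that are linked in the similarity graph.

    Connected components stand in for Leiden community detection here.
    """
    sets = [tokens(h) for h in headlines]
    parent = list(range(len(headlines)))  # union-find forest

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Treat every pair at or above the threshold as an edge and merge them.
    for i, j in combinations(range(len(headlines)), 2):
        if jaccard(sets[i], sets[j]) >= threshold:
            parent[find(j)] = find(i)

    groups: dict[int, list[str]] = {}
    for idx, headline in enumerate(headlines):
        groups.setdefault(find(idx), []).append(headline)
    return list(groups.values())


# Hypothetical headlines, for illustration only.
sample = [
    "Acme Corp opens new office in Austin",
    "Acme Corp opens a new office in Austin, Texas",
    "Globex discloses ransomware attack",
]
clusters = cluster_headlines(sample)
print(len(clusters))  # 2 distinct events
```

The pairwise comparison is quadratic, so this toy version only suits small batches; a production system working through tens of thousands of candidates would need a more scalable similarity search before clustering.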
The Data Behind CatchAll
CatchAll is built on fast-moving real-world data that updates constantly, rather than static sources.
What CatchAll searches:
- ~1.5 billion news articles
- Press releases and business announcements
- Trade publications and industry blogs
- Local news sources (where most edge events appear)
What CatchAll does not search:
- General web content (Wikipedia, how-to guides, forums)
- Social media posts
- Academic papers
- Private databases or paywalled content
When to Use CatchAll
Unlike tools focused on entity search, CatchAll works for any real-world event that appears in public reporting, such as:
- Physical events - openings, closures, accidents, strikes
- Financial events - transactions, funding, bankruptcies
- Legal events - lawsuits, regulatory actions, enforcement
- Personnel events - appointments, departures, promotions
- Product events - launches, recalls, discontinuations
Industry-Specific Examples
Supply Chain & Operations
- "All factory closures in semiconductor manufacturing this month"
- "All port strikes and labor disputes in Southeast Asia"
- "All warehouse openings by logistics companies in Texas"
M&A & Corporate Activity
- "All pharmaceutical company acquisitions in the US this quarter"
- "All Series A funding rounds in climate tech"
- "All executive departures from Fortune 500 companies this week"
Risk & Compliance
- "All data breaches affecting healthcare organizations"
- "All product recalls in the food industry"
- "All regulatory enforcement actions by the SEC"
Commercial Real Estate
- "All multifamily transactions above $50M in California"
- "All new hotel construction projects announced in Florida"
- "All retail store closures by national chains"
Market Intelligence
- "All new restaurant openings in Miami"
- "All manufacturing facility expansion announcements"
- "All new data center construction projects"
When Not to Use CatchAll
- When the answer exists on one authoritative page
- When you need just one example, not a complete list
- When you're looking for opinions, not events
Live Monitoring Included
CatchAll isn't just for one-time research. You can set up continuous monitors that run daily, weekly, or in real time:
{
  "monitor_name": "Florida Restaurant Openings",
  "query": "new restaurant opening Florida",
  "frequency": "daily"
}
Get fresh, complete datasets delivered automatically. Perfect for:
- Daily market intelligence briefings
- Real-time risk monitoring
- Continuous lead generation
- Dataset refreshes for ML pipelines
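If you generate monitor definitions from code, building them as a dictionary and serializing with a JSON library avoids hand-written syntax slips such as trailing commas. The sketch below uses only the three field names shown in the monitor example; the helper `build_monitor_config` is hypothetical, and how the payload is submitted (endpoint, authentication, accepted frequency values beyond "daily") should be taken from the documentation rather than from this snippet.

```python
import json


def build_monitor_config(monitor_name: str, query: str, frequency: str) -> str:
    """Serialize a monitor definition to a JSON string."""
    fields = {
        "monitor_name": monitor_name,
        "query": query,
        "frequency": frequency,
    }
    # Reject blank values early instead of sending an incomplete definition.
    for name, value in fields.items():
        if not value.strip():
            raise ValueError(f"{name} must not be empty")
    return json.dumps(fields, indent=2)


payload = build_monitor_config(
    "Florida Restaurant Openings",
    "new restaurant opening Florida",
    "daily",
)
print(payload)
```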
Getting Started with CatchAll
Try CatchAll here
Documentation: https://www.newscatcherapi.com/docs/v3/catch-all/overview/introduction
Questions? Email our CEO at artem@newscatcherapi.com
About NewsCatcher
NewsCatcher has spent four years developing a proprietary web index used by the U.S. Department of State, Samsung, the Armed Forces of Ukraine, and Fortune 1000 teams who rely on real-world intelligence at scale.