CatchAll is a web search API that generates unique datasets that don’t exist anywhere else on the web. Built on NewsCatcher’s proprietary real-world event index, it delivers state-of-the-art recall—finding all relevant events, not just top results.

How it works

CatchAll processes queries through a multi-stage pipeline that analyzes over 50,000 web pages per job:
  1. Analyze: Generates targeted search queries for NewsCatcher’s proprietary web index and creates validation rules and extraction patterns based on your input.
  2. Fetch: Retrieves and processes 50,000+ web pages to ensure comprehensive coverage.
  3. Cluster: Groups related web pages into distinct real-world events using the Leiden algorithm for community detection.
  4. Validate: Applies generated validators to filter clusters, ensuring only relevant events that match your criteria proceed to extraction.
  5. Extract: Transforms validated events into structured JSON records with dynamic schemas tailored to your query.
Processing typically takes 10-15 minutes per job. Poll the status endpoint every 30-60 seconds to track progress through each stage.
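The polling loop described above could be sketched as follows. This is a minimal illustration, not the official client: `fetch_status` is a hypothetical callable that you would implement as a GET to the status endpoint, and the terminal status names `completed` and `failed` are assumptions to check against the Job status reference.

```python
import time
from typing import Any, Callable, Dict

# Assumed terminal status names; verify them against the Job status reference.
TERMINAL_STATUSES = {"completed", "failed"}


def wait_for_job(
    fetch_status: Callable[[], Dict[str, Any]],
    interval_seconds: float = 30.0,
    timeout_seconds: float = 30 * 60,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll fetch_status() until the job reaches a terminal status or times out."""
    waited = 0.0
    while True:
        payload = fetch_status()
        if payload.get("status") in TERMINAL_STATUSES:
            return payload
        if waited >= timeout_seconds:
            raise TimeoutError(f"job still running after {waited:.0f}s")
        # Wait 30-60 seconds between checks, as recommended above.
        sleep(interval_seconds)
        waited += interval_seconds
```

Passing `sleep` as a parameter keeps the loop testable without real waiting. The 30-minute default timeout is an arbitrary margin over the typical 10-15 minute processing time.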
Each job returns structured JSON records with dynamic schemas. Fields like company_name, deal_value, and acquisition_date are automatically generated based on your query. See Quickstart > Review response for a complete response example.

Key characteristics

CatchAll searches NewsCatcher’s continuously updated web index of 2+ billion web pages, optimized for finding real-world events (acquisitions, approvals, incidents). The index is time-series by nature — each web page is recorded by its discovery date, not the date of the event it describes. This means a web page discovered today may report on an event from months ago, yet CatchAll still evaluates it against your query.
To learn how the index works and what search depth controls, see Index and search depth.
Control what data gets extracted by providing custom validators and enrichments, or let the system generate them automatically based on your query.
Custom validators filter which events are relevant:
{
  "name": "is_acquisition",
  "description": "true if web page describes an acquisition",
  "type": "boolean"
}
Custom enrichments define what data to extract:
{
  "name": "acquiring_company",
  "description": "Extract the acquiring company name",
  "type": "company"
}
The company enrichment type extracts structured data including name, alternative names, website candidates, people, and address.
Use the POST /catchAll/initialize endpoint to get suggested validators and enrichments before submitting your job.
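As a rough sketch, the validator and enrichment objects shown above could be assembled into a request body like this. The object shape (`name`, `description`, `type`) comes from the examples; the top-level keys `query`, `validators`, and `enrichments`, the helper names, and the query text are assumptions for illustration only.

```python
import json
from typing import Any, Dict, List


def make_field(name: str, description: str, field_type: str) -> Dict[str, str]:
    """Build one validator or enrichment definition (name, description, type)."""
    if not name.strip():
        raise ValueError("name must not be empty")
    return {"name": name, "description": description, "type": field_type}


def build_job_body(
    query: str,
    validators: List[Dict[str, str]],
    enrichments: List[Dict[str, str]],
) -> Dict[str, Any]:
    """Assemble a request body; the top-level key names are assumptions."""
    return {"query": query, "validators": validators, "enrichments": enrichments}


body = build_job_body(
    "AI company acquisitions",  # hypothetical query text
    validators=[
        make_field("is_acquisition", "true if web page describes an acquisition", "boolean")
    ],
    enrichments=[
        make_field("acquiring_company", "Extract the acquiring company name", "company")
    ],
)
print(json.dumps(body, indent=2))
```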
Each job generates a unique response schema. Field names and structure in the enrichment object vary between jobs, even with identical inputs.
Guaranteed fields in every record:
  • record_id
  • record_title
  • enrichment object
  • citations array
Variable fields:
  • All fields inside enrichment (names, types, structure)
See Understanding dynamic schemas for integration patterns.
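One way to integrate against a schema that changes per job is to read only the guaranteed fields by name and treat everything inside the enrichment object as opaque. The sketch below assumes the JSON keys are literally `enrichment` and `citations`; `split_record` is a hypothetical helper, not part of the API.

```python
from typing import Any, Dict, Tuple

# Fields listed above as present in every record (key names assumed).
GUARANTEED_FIELDS = ("record_id", "record_title", "enrichment", "citations")


def split_record(record: Dict[str, Any]) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Return (stable, variable) parts of one result record."""
    missing = [field for field in GUARANTEED_FIELDS if field not in record]
    if missing:
        raise KeyError(f"record is missing guaranteed fields: {missing}")
    stable = {
        "record_id": record["record_id"],
        "record_title": record["record_title"],
        "citations": list(record["citations"]),
    }
    # Field names and types in here differ between jobs, so copy it as-is.
    variable = dict(record["enrichment"])
    return stable, variable
```

Downstream code can then store `stable` in fixed columns and `variable` as a JSON blob, which avoids breaking when a later job names its fields differently.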
The start_date and end_date parameters define your search window within the web index. They control which web pages are searched, not which events are returned. A web page within your date range may describe events from outside that range.
Date ranges are validated against your plan’s allowed search depth. To preview date adjustments before submitting, use POST /catchAll/initialize.
To learn how search depth and date ranges work, see Index and search depth.
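A search window for the start_date and end_date parameters could be computed as in the sketch below. The ISO 8601 date format (YYYY-MM-DD) is an assumption, as is the `search_window` helper; check the API reference for the accepted format.

```python
from datetime import date, timedelta
from typing import Dict, Optional


def search_window(days_back: int, today: Optional[date] = None) -> Dict[str, str]:
    """Return start_date/end_date covering the last `days_back` days."""
    if days_back <= 0:
        raise ValueError("days_back must be positive")
    end = today or date.today()
    start = end - timedelta(days=days_back)
    # Assumed format: ISO 8601 calendar dates.
    return {"start_date": start.isoformat(), "end_date": end.isoformat()}
```

Because the window selects web pages by discovery date, records returned for this range may still describe events that happened before `start_date`.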
Identical queries can produce different results:
  • LLMs may generate different keywords, validators, and extractors
  • Different content sources may be retrieved
  • Field names and structure vary between runs
  • Record counts differ
Each query creates a job that processes asynchronously. Use the returned job_id to poll the job status and retrieve results when completed. Processing typically takes 10-15 minutes. Track detailed progress through the steps array in the status endpoint response.
Results become available progressively during the enriching stage as validation completes in batches. Check for status: "enriching" to retrieve partial results before job completion. The progress_validated field tracks how many candidate clusters have been processed. This allows you to access early results while the job continues processing remaining batches.
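Retrieving partial results might look like the following sketch. It assumes that repeated pulls can return records already seen, so it merges by `record_id`; the `completed` status name and the `pull_records` callable (which you would implement as a GET to the pull endpoint) are assumptions.

```python
from typing import Any, Callable, Dict, List

# "enriching" comes from the text above; "completed" is an assumed status name.
PULLABLE_STATUSES = {"enriching", "completed"}


def collect_available(
    status_payload: Dict[str, Any],
    pull_records: Callable[[], List[Dict[str, Any]]],
    seen: Dict[str, Dict[str, Any]],
) -> int:
    """Pull records if the job is far enough along; return how many were new."""
    if status_payload.get("status") not in PULLABLE_STATUSES:
        return 0
    added = 0
    for record in pull_records():
        record_id = record["record_id"]
        if record_id not in seen:
            added += 1
        seen[record_id] = record  # keep the latest copy of each record
    return added
```

Calling this on each polling cycle lets downstream code start on early records while later batches are still being validated.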
Start with fewer records using the limit parameter for quick testing, then use POST /catchAll/continue to process more records without re-submitting the query. Continue requests preserve all analysis, validation, and extraction logic from the original job.
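The test-then-continue workflow could be modeled with two small payload builders. Only `limit` and `job_id` are named in this document; the `query` and `new_limit` keys and both helper functions are hypothetical, so confirm the real request fields in the endpoint reference before use.

```python
from typing import Any, Dict


def submit_body(query: str, limit: int) -> Dict[str, Any]:
    """Body for an initial small test run. `query` is an assumed key name."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    return {"query": query, "limit": limit}


def continue_body(job_id: str, previous_limit: int, new_limit: int) -> Dict[str, Any]:
    """Body for a continue request. `new_limit` is a hypothetical key name."""
    if new_limit <= previous_limit:
        raise ValueError("a continue request needs a higher limit than the original job")
    return {"job_id": job_id, "new_limit": new_limit}
```

The guard in `continue_body` reflects that a continue request raises the limit on an existing job rather than starting a new one.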

Endpoints

Base URL: https://catchall.newscatcherapi.com
Endpoint                     Method  Description
/catchAll/initialize         POST    Get validator, enrichment, and date suggestions
/catchAll/submit             POST    Create a new job
/catchAll/continue           POST    Continue job with higher limit
/catchAll/jobs/user          GET     List all jobs for your API key
/catchAll/status/{job_id}    GET     Check job processing status
/catchAll/pull/{job_id}      GET     Retrieve job results
Track detailed progress using the steps array in the status endpoint response. See Job status > steps for details.
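For client code, the endpoints listed above can be kept in one lookup so that methods and URLs are built consistently. The base URL, paths, and methods below are taken from this page; the short names used as dictionary keys and the `endpoint` helper are local conventions invented for this sketch.

```python
from typing import Dict, Tuple
from urllib.parse import quote

BASE_URL = "https://catchall.newscatcherapi.com"

# Short name -> (HTTP method, path template).
ENDPOINTS: Dict[str, Tuple[str, str]] = {
    "initialize": ("POST", "/catchAll/initialize"),
    "submit": ("POST", "/catchAll/submit"),
    "continue": ("POST", "/catchAll/continue"),
    "list_jobs": ("GET", "/catchAll/jobs/user"),
    "status": ("GET", "/catchAll/status/{job_id}"),
    "pull": ("GET", "/catchAll/pull/{job_id}"),
}


def endpoint(name: str, **path_params: str) -> Tuple[str, str]:
    """Return (method, full URL) for a named endpoint, escaping path values."""
    method, path = ENDPOINTS[name]
    escaped = {key: quote(str(value), safe="") for key, value in path_params.items()}
    return method, BASE_URL + path.format(**escaped)
```

For example, `endpoint("status", job_id="abc123")` yields the GET method and the status URL for that job. Authentication headers are not covered here because this page does not specify them.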

Example queries

Market intelligence:
  • M&A activity in the AI sector over $100M in the last month
  • Enterprise software company earnings reports this quarter
  • Product launches by Fortune 500 technology companies this week
Regulatory monitoring:
  • FDA drug approvals for oncology treatments in the last 30 days
  • Financial regulatory actions against banks in the EU this month
  • Government policy changes affecting semiconductor exports this week
Business development:
  • Series B funding rounds for SaaS startups over $20M this month
  • Strategic partnerships between automotive and technology companies this week
  • Market entry announcements by US companies in Southeast Asia this month
Competitive analysis:
  • Executive appointments at major cloud infrastructure companies this month
  • Product launches by Salesforce, HubSpot, and Zendesk this quarter
  • Layoffs and restructuring announcements at fintech companies this week
Research automation:
  • Clinical trial results for diabetes treatments published this month
  • Cybersecurity incidents at financial institutions in the last 30 days
  • Bankruptcy filings by retail companies in the US this quarter

What’s next

For technical support, contact us at support@newscatcherapi.com.