CatchAll is a web search API that generates unique datasets that don’t exist anywhere else on the web. Built on NewsCatcher’s proprietary real-world event index, it delivers state-of-the-art recall—finding all relevant events, not just top results.

How it works

CatchAll processes queries through a multi-stage pipeline that analyzes over 50,000 web pages per job:
1. Analyze: Generates targeted search queries for NewsCatcher’s proprietary news index and creates validation rules and extraction patterns based on your input.
2. Fetch: Retrieves and processes 50,000+ articles from web sources to ensure comprehensive coverage.
3. Cluster: Groups related articles into distinct real-world events using the Leiden algorithm for community detection.
4. Validate: Applies generated validators to filter clusters, ensuring only relevant events that match your criteria proceed to extraction.
5. Extract: Transforms validated events into structured JSON records with dynamic schemas tailored to your query.

Processing typically takes 10-15 minutes per job. Poll the status endpoint every 30-60 seconds to track progress through each stage.
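The polling guidance above can be sketched as a small helper. This is a minimal sketch, not the official client: the function name `wait_for_job`, the `status` key, and the terminal values `"completed"` and `"failed"` are assumptions made for illustration, so check them against the actual status response. The status call is passed in as a function, which keeps the loop independent of any particular HTTP library.

```python
import time
from typing import Any, Callable, Dict


def wait_for_job(
    fetch_status: Callable[[], Dict[str, Any]],
    interval_s: float = 30.0,
    timeout_s: float = 30 * 60,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call fetch_status() repeatedly until the job reaches a terminal state.

    Assumption: the payload carries a "status" key whose terminal values are
    "completed" or "failed". Adjust both to match the real response.
    """
    terminal = {"completed", "failed"}
    waited = 0.0
    while True:
        payload = fetch_status()
        if payload.get("status") in terminal:
            return payload
        if waited >= timeout_s:
            raise TimeoutError(f"job still running after {waited:.0f}s")
        # Wait between checks; 30-60 seconds matches the guidance above.
        sleep(interval_s)
        waited += interval_s
```

In practice, `fetch_status` would wrap a GET request to the status endpoint for one job and return the decoded JSON. The default 30-minute timeout is an arbitrary safety margin over the typical 10-15 minute run time.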
Each job returns structured JSON records with dynamic schemas. Fields like company_name, deal_value, and acquisition_date are generated automatically based on your query. See Quickstart > Review response for a complete response example.

Key characteristics

Each job generates a unique response schema. Field names and structure in the enrichment object vary between jobs, even with identical inputs.
Guaranteed fields in every record:
  • record_id
  • record_title
  • citations array
Variable fields:
  • All other fields in enrichment (names, types, structure)
  • Number of records returned
  • Specific content extracted
See Understanding dynamic schemas for integration patterns.
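One way to consume records whose schema changes between jobs is to rely only on the guaranteed fields and treat everything in `enrichment` as optional. The sketch below assumes each record is a dictionary with `record_id`, `record_title`, `citations`, and an `enrichment` object, as described above. The helper names `split_record` and `get_first`, and the sample values, are hypothetical.

```python
from typing import Any, Dict, Iterable, Tuple

# Fields the documentation lists as present in every record.
GUARANTEED = ("record_id", "record_title", "citations")


def split_record(record: Dict[str, Any]) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Return (guaranteed_fields, enrichment) for one record."""
    missing = [key for key in GUARANTEED if key not in record]
    if missing:
        raise ValueError(f"record is missing guaranteed fields: {missing}")
    stable = {key: record[key] for key in GUARANTEED}
    enrichment = record.get("enrichment") or {}
    if not isinstance(enrichment, dict):
        # Defensive: keep an unexpected shape instead of dropping it.
        enrichment = {"value": enrichment}
    return stable, dict(enrichment)


def get_first(enrichment: Dict[str, Any], candidates: Iterable[str], default: Any = None) -> Any:
    """Return the first non-null value among candidate field names.

    Useful because the same concept may be named differently between jobs.
    """
    for name in candidates:
        if enrichment.get(name) is not None:
            return enrichment[name]
    return default


if __name__ == "__main__":
    # Made-up record for illustration only.
    sample = {
        "record_id": "rec-1",
        "record_title": "Example acquisition",
        "citations": [],
        "enrichment": {"company_name": "Example Corp"},
    }
    stable, extra = split_record(sample)
    print(stable["record_title"], get_first(extra, ["acquirer", "company_name"]))
```

Listing several candidate names per concept is a simple hedge against field names changing between runs. It does not replace inspecting the schema each job actually returns.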
Identical queries can produce different results:
  • LLMs may generate different keywords, validators, and extractors.
  • Different content sources may be retrieved.
  • Field names and structure vary between runs.
  • Record counts differ.
Each query creates a job that processes asynchronously. Use the returned job_id to poll the job status and retrieve the results once the job is completed.

Endpoints

Base URL: https://catchall.newscatcherapi.com
Endpoint | Method | Description
/catchAll/submit | POST | Create a new job
/catchAll/jobs/user | GET | List all jobs for your API key
/catchAll/status/{job_id} | GET | Check job processing status
/catchAll/pull/{job_id} | GET | Retrieve job results
Track detailed progress using the steps array in the status endpoint response. See Job status > steps for details.
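The endpoints above can be wrapped in a few request builders using only the Python standard library. This is a sketch under stated assumptions: the `x-api-key` header name and the `{"query": ...}` request body are not confirmed by this page, and the function names are hypothetical. Only the base URL, paths, and HTTP methods come from the endpoint list.

```python
import json
from typing import Dict
from urllib.parse import quote
from urllib.request import Request

BASE_URL = "https://catchall.newscatcherapi.com"


def _headers(api_key: str) -> Dict[str, str]:
    # Assumption: the API key is sent in an "x-api-key" header.
    return {"x-api-key": api_key, "Content-Type": "application/json"}


def submit_request(api_key: str, query: str) -> Request:
    """Build the POST request that creates a new job."""
    # Assumption: the body is a JSON object with a "query" field.
    body = json.dumps({"query": query}).encode("utf-8")
    return Request(f"{BASE_URL}/catchAll/submit", data=body,
                   headers=_headers(api_key), method="POST")


def list_jobs_request(api_key: str) -> Request:
    """Build the GET request that lists all jobs for the API key."""
    return Request(f"{BASE_URL}/catchAll/jobs/user",
                   headers=_headers(api_key), method="GET")


def status_request(api_key: str, job_id: str) -> Request:
    """Build the GET request that checks job processing status."""
    return Request(f"{BASE_URL}/catchAll/status/{quote(job_id, safe='')}",
                   headers=_headers(api_key), method="GET")


def pull_request(api_key: str, job_id: str) -> Request:
    """Build the GET request that retrieves job results."""
    return Request(f"{BASE_URL}/catchAll/pull/{quote(job_id, safe='')}",
                   headers=_headers(api_key), method="GET")
```

Each builder returns a `urllib.request.Request` that can be sent with `urllib.request.urlopen(req)` and decoded with `json.load`. Building requests separately from sending them keeps the URL logic easy to inspect without network access.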

Use cases

  • Market intelligence: Company earnings, M&A activity, product launches
  • Regulatory monitoring: Policy changes, government actions, compliance updates
  • Business development: Partnerships, funding rounds, market entries
  • Competitive analysis: Competitor activities and announcements
  • Research automation: Structured data extraction for analysis
  • News aggregation: Topic-specific news with structured output

What’s next

For technical support, contact us at support@newscatcherapi.com.