What is CatchAll API?
CatchAll provides an end-to-end pipeline for converting natural language queries into structured event data. The system analyzes your question, retrieves relevant content from diverse sources (news sites, government databases, public records, corporate sites), clusters similar information, validates relevance, and extracts structured data tailored to your specific query.
How it works
When you submit a query, CatchAll follows this multi-stage pipeline:
- Analyze: Understands your query and generates search queries, validators, and extractors.
- Fetch: Retrieves relevant content from web sources.
- Cluster: Groups similar content into distinct events.
- Validate: Filters clusters to keep only relevant ones.
- Extract: Pulls structured data from validated clusters.
- Return: Delivers records with citations.
What to expect
Dynamic schemas
Response schemas are generated uniquely for each query. Field names and structure in the enrichment object vary between jobs, even with identical inputs.
What’s guaranteed in every record:
- record_id
- enrichment.record_title
- citations array
What varies:
- All other fields in enrichment (names, types, structure)
- Number of records returned
- Specific content extracted
See Understanding dynamic schemas to learn how to build integrations that handle variable response structures.
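A minimal sketch of a consumer that relies only on the guaranteed fields and treats everything else in enrichment as optional. The function name summarize_record is hypothetical, and the sketch assumes record_id and citations sit at the top level of each record, which the list above suggests but does not show.

```python
from typing import Any, Dict


def summarize_record(record: Dict[str, Any]) -> Dict[str, Any]:
    """Split a record into its guaranteed fields and query-specific extras."""
    enrichment = record["enrichment"]
    return {
        # Guaranteed in every record, so direct indexing is safe.
        "record_id": record["record_id"],
        "title": enrichment["record_title"],
        # Assumption: citations is a top-level array on the record.
        "citations": list(record["citations"]),
        # Everything else is generated per query: never hard-code these names.
        "extra": {k: v for k, v in enrichment.items() if k != "record_title"},
    }
```

Downstream code can then iterate over "extra" generically (for example, to render a key/value table) instead of expecting a fixed set of columns.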
Non-deterministic results
Identical queries can produce different results because:
- LLMs may generate different keywords, validators, and extractors.
- Different content sources may be retrieved.
- Field names and structure vary between runs.
- The number of records extracted can differ significantly.
Asynchronous processing
Each query creates a job that processes asynchronously. You receive a job_id, which you use to poll for status and to retrieve results once the job completes.
Base URL
For API requests, use the following base URL:
Endpoints
Endpoint | Method | Description | Use Case |
---|---|---|---|
/catchAll/submit | POST | Create a new job | Submit a natural language query to start processing |
/catchAll/status/{job_id} | GET | Check job status | Monitor processing progress through 12 status stages |
/catchAll/pull/{job_id} | GET | Get job results | Retrieve structured records when job completes |
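The three paths in the table can be kept in one place, as in the sketch below. The helper name endpoint_url is hypothetical, and because the base URL is not reproduced in this text it is taken as a parameter rather than hard-coded.

```python
from typing import Optional
from urllib.parse import quote

# Paths copied from the endpoints table.
ENDPOINT_PATHS = {
    "submit": "/catchAll/submit",
    "status": "/catchAll/status/{job_id}",
    "pull": "/catchAll/pull/{job_id}",
}


def endpoint_url(base_url: str, name: str, job_id: Optional[str] = None) -> str:
    """Join the base URL with one of the documented endpoint paths."""
    path = ENDPOINT_PATHS[name]
    if "{job_id}" in path:
        if not job_id:
            raise ValueError(f"{name!r} requires a job_id")
        # Percent-encode the ID so it cannot alter the path structure.
        path = path.format(job_id=quote(job_id, safe=""))
    return base_url.rstrip("/") + path
```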
Request format
Include your API key in the x-api-key header for each request. All requests must use HTTPS.
Basic request
Request parameters
- query (string, required): Natural language question describing what to find
- context (string, optional): Additional context to focus search and extraction
- summary_template (string, optional): Template to guide record summary formatting. When provided, adds a template_based_summary field to each record
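As a rough illustration of the parameters above, the sketch below builds a submit request with Python's standard library without sending it. The function name build_submit_request is hypothetical, and sending the parameters as a JSON body with a Content-Type of application/json is an assumption, not something this page states.

```python
import json
import urllib.request
from typing import Optional


def build_submit_request(
    submit_url: str,
    api_key: str,
    query: str,
    context: Optional[str] = None,
    summary_template: Optional[str] = None,
) -> urllib.request.Request:
    """Build (but do not send) a POST request for the submit endpoint."""
    if not submit_url.startswith("https://"):
        raise ValueError("All requests must use HTTPS")
    if not query.strip():
        raise ValueError("query is required")

    payload = {"query": query}
    # Optional parameters are included only when the caller provides them.
    if context is not None:
        payload["context"] = context
    if summary_template is not None:
        payload["summary_template"] = summary_template

    return urllib.request.Request(
        submit_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "x-api-key": api_key,
            # Assumption: the API accepts a JSON request body.
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen would send it; any other HTTP client works equally well with the same header and payload.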
Response format
Job creation response
Job status response
Results response
The field names in the enrichment object are LLM-generated and may vary even for the same inputs. The example above shows one possible structure for earnings queries.
Job statuses
To monitor the progress of your job, use the /status/{job_id} endpoint. We recommend polling this endpoint every 30-60 seconds.
Jobs move through the following statuses:
Status | Description | Typical Duration |
---|---|---|
pending | Job queued, waiting to start | Seconds |
analysis_started | Beginning query analysis | Seconds |
analysis_keywords_extracted | Keywords identified from query | 30-60 seconds |
analysis_enrichments_extracted | Validators and extractors generated | 30-60 seconds |
analysis_queries_extracted | Search queries created (typically 10 queries) | 30-60 seconds |
retrieval_dispatched | Queries sent to fetching service | Seconds |
data_fetched | Articles retrieved from news database | 3-5 minutes |
clustering_dispatched | Clustering process initiated | Seconds |
data_grouped | Similar articles clustered | 2-4 minutes |
enrichment_dispatched | Validation and extraction started | Seconds |
data_enriched | Structured data extracted from valid clusters | 4-6 minutes |
job_completed | Job finished, results ready | - |
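The polling recommendation and the status sequence above can be sketched as follows. The function names, the client-side timeout, and its 30-minute default are assumptions for illustration. Because the shape of the status response is not shown here, the caller supplies a fetch_status callable that returns the current status string.

```python
import time
from typing import Callable

# Statuses in the order listed in the table above.
STATUS_ORDER = [
    "pending", "analysis_started", "analysis_keywords_extracted",
    "analysis_enrichments_extracted", "analysis_queries_extracted",
    "retrieval_dispatched", "data_fetched", "clustering_dispatched",
    "data_grouped", "enrichment_dispatched", "data_enriched", "job_completed",
]


def progress(status: str) -> float:
    """Fraction of the 12 stages reached so far (raises on unknown status)."""
    return (STATUS_ORDER.index(status) + 1) / len(STATUS_ORDER)


def wait_for_completion(
    fetch_status: Callable[[], str],
    interval: float = 30.0,      # recommended polling interval: 30-60 seconds
    timeout: float = 1800.0,     # assumption: give up after 30 minutes
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll until the job reports job_completed or the timeout elapses."""
    deadline = clock() + timeout
    while True:
        status = fetch_status()
        if status == "job_completed":
            return status
        if clock() >= deadline:
            raise TimeoutError(f"job still at {status!r} after {timeout} s")
        sleep(interval)
```

The sleep and clock parameters exist only so the loop can be exercised in tests without waiting; in normal use the defaults apply.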
Use cases
CatchAll is designed for applications requiring structured data from unstructured web content:
- Market intelligence: Track company earnings, M&A activity, product launches.
- Regulatory monitoring: Follow policy changes, government actions, compliance updates.
- Business development: Discover partnerships, funding rounds, market entries.
- Competitive analysis: Monitor competitor activities and announcements.
- Research automation: Extract structured data for academic or business research.
- News aggregation: Build topic-specific news applications with structured output.
Beta limitations
These features are planned for implementation after the beta period:
- Formal error handling and failed status
- Error response objects with detailed failure information
- Maximum job duration enforcement
- Result expiration and cleanup
- Query deduplication (submitting the same query creates separate jobs)
- Pagination for large result sets
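Until query deduplication exists server-side, a client can avoid creating duplicate jobs itself. The sketch below is one possible approach, not part of the API: the JobCache class is hypothetical, and submit stands for any function you provide that submits a query and returns its job_id.

```python
import hashlib
from typing import Callable, Dict, Optional


class JobCache:
    """Remember job IDs so an identical query is not submitted twice."""

    def __init__(self, submit: Callable[[str, Optional[str]], str]) -> None:
        self._submit = submit
        self._jobs: Dict[str, str] = {}

    @staticmethod
    def _key(query: str, context: Optional[str]) -> str:
        # Collapse whitespace and case so trivially different spellings match.
        parts = [" ".join((text or "").split()).casefold() for text in (query, context)]
        return hashlib.sha256("\x1f".join(parts).encode("utf-8")).hexdigest()

    def job_id_for(self, query: str, context: Optional[str] = None) -> str:
        """Return the cached job_id, submitting only on first sight."""
        key = self._key(query, context)
        if key not in self._jobs:
            self._jobs[key] = self._submit(query, context)
        return self._jobs[key]
```

Because identical queries can legitimately return different results, reuse a cached job only when a fresh run is not needed.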
Get started
- Book a demo and get your API key.
- Follow the Quickstart guide to make your first request.
- Review Understanding dynamic schemas to learn how to handle variable response structures.
- Explore the API Reference for detailed endpoint documentation.