# Meet CatchAll: Our recall-first web search API

> Find all results, not just top candidates

CatchAll is a web search API that builds structured datasets of real-world
events that are not available ready-made anywhere else on the web. Built on
NewsCatcher's proprietary real-world event index, it prioritizes recall: the
goal is to find all relevant events, not just the top results.

## How it works

CatchAll processes queries through a multi-stage pipeline that analyzes over
50,000 web pages per job:

<Steps>
  <Step title="Analyze">
    Generates targeted search queries for NewsCatcher's proprietary web index
    and creates validation rules and extraction patterns based on your input.
  </Step>

  <Step title="Fetch">
    Retrieves and processes 50,000+ web pages to ensure comprehensive coverage.
  </Step>

  <Step title="Cluster">
    Groups related web pages into distinct real-world events using the Leiden
    algorithm for community detection.
  </Step>

  <Step title="Validate">
    Applies generated validators to filter clusters, ensuring only relevant
    events that match your criteria proceed to extraction.
  </Step>

  <Step title="Extract">
    Transforms validated events into structured JSON records with dynamic
    schemas tailored to your query.
  </Step>
</Steps>

<Note>
  Processing typically takes 10-15 minutes per job. Poll the status endpoint
  every 30-60 seconds to track progress through each stage.
</Note>
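
The polling advice above can be sketched in Python as follows. This is a minimal sketch, not an official client: `wait_for_job` and its parameters are hypothetical names, `fetch_status` stands in for a call to `GET /catchAll/status/{job_id}`, and the terminal status values `completed` and `failed` are assumptions to check against the API reference.

```python
import time
from typing import Callable

# Assumed terminal status values; confirm them in the API reference.
TERMINAL_STATUSES = {"completed", "failed"}


def wait_for_job(
    fetch_status: Callable[[], dict],
    interval_seconds: float = 30.0,
    timeout_seconds: float = 30 * 60,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the job reaches a terminal status or the timeout elapses."""
    waited = 0.0
    while True:
        status = fetch_status()
        if status.get("status") in TERMINAL_STATUSES:
            return status
        if waited >= timeout_seconds:
            raise TimeoutError(
                f"job still {status.get('status')!r} after {waited:.0f}s"
            )
        # Wait between checks, in line with the 30-60 second guidance.
        sleep(interval_seconds)
        waited += interval_seconds
```

Passing `fetch_status` and `sleep` as callables keeps the loop independent of any HTTP library and lets you exercise it without waiting in real time. The 30-minute default timeout is an arbitrary margin above the typical 10-15 minute processing time.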

<Tip>
  Each job returns structured JSON records with dynamic schemas. Fields like
  `company_name`, `deal_value`, and `acquisition_date` are automatically
  generated based on your query.

  See the [Quickstart > Review response](/web-search-api/get-started/quickstart#steps)
  for a complete response example.
</Tip>
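
Because the fields inside `enrichment` are generated per job, client code usually benefits from reading them defensively. The sketch below is one possible approach: it relies only on the four fields this page lists as guaranteed (`record_id`, `record_title`, `enrichment`, `citations`) and treats everything else as optional. The helper name `summarize_record` and the sample values are invented for illustration.

```python
from typing import Any


def summarize_record(record: dict[str, Any], wanted: list[str]) -> dict[str, Any]:
    """Read guaranteed fields directly and enrichment fields defensively."""
    enrichment = record.get("enrichment") or {}
    summary: dict[str, Any] = {
        "record_id": record["record_id"],
        "record_title": record["record_title"],
        "citation_count": len(record.get("citations") or []),
    }
    for name in wanted:
        # Enrichment field names vary between jobs, so a missing key becomes None.
        summary[name] = enrichment.get(name)
    return summary


# Invented sample record, shaped like the guaranteed fields described on this page.
sample = {
    "record_id": "rec-1",
    "record_title": "Example acquisition",
    "enrichment": {"company_name": "Example Corp"},
    "citations": [{"title": "Example article"}],
}
print(summarize_record(sample, ["company_name", "deal_value"]))
```

Here `deal_value` is absent from the sample record, so the summary carries `None` for it instead of raising a `KeyError`.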

## Key characteristics

<AccordionGroup>
  <Accordion title="Event-centric index" icon="database">
    CatchAll searches NewsCatcher's continuously updated web index of 2+ billion
    web pages, optimized for finding real-world events (acquisitions, approvals,
    incidents). The index is organized as a time series: each web page is
    recorded by its discovery date, not the date of the event it describes. A
    web page discovered today may therefore report on an event from months ago,
    and CatchAll still evaluates it against your query.

    <Tip>
      To learn how the index works and what search depth controls, see
      [Index and search depth](/web-search-api/concepts/index-and-search-depth).
    </Tip>
  </Accordion>

  <Accordion title="Customizable extraction" icon="wand-magic-sparkles">
    Control what data gets extracted by providing custom validators and enrichments,
    or let the system generate them automatically based on your query.

    **Custom validators** filter which events are relevant:

    ```json
    {
      "name": "is_acquisition",
      "description": "true if web page describes an acquisition",
      "type": "boolean"
    }
    ```

    **Custom enrichments** define what data to extract:

    ```json
    {
      "name": "acquiring_company",
      "description": "Extract the acquiring company name",
      "type": "company"
    }
    ```

    The `company` enrichment type extracts structured data including name,
    alternative names, website candidates, people, and address.

    <Tip>
      Use the [`POST /catchAll/initialize`](/web-search-api/api-reference/jobs/initialize-job) endpoint to get suggested validators and
      enrichments before submitting your job.
    </Tip>
  </Accordion>

  <Accordion title="Dynamic schemas" icon="flask">
    Each job generates a unique response schema. Field names and structure in
    the `enrichment` object vary between jobs—even with identical inputs.

    **Guaranteed fields in every record:**

    * `record_id`
    * `record_title`
    * `enrichment` object
    * `citations` array

    **Variable fields:**

    * All fields inside `enrichment` (names, types, structure)

    <Tip>
      See [Understanding dynamic schemas](/web-search-api/concepts/dynamic-schemas)
      for integration patterns.
    </Tip>
  </Accordion>

  <Accordion title="Date range controls" icon="calendar">
    The `start_date` and `end_date` parameters define your search window within
    the web index. They control which web pages are searched, not which events
    are returned. A web page within your date range may describe events from
    outside that range.

    Date ranges are validated against your plan's allowed search depth.
    To preview date adjustments before submitting, you can use
    [`POST /catchAll/initialize`](/web-search-api/api-reference/jobs/initialize-job).

    <Tip>
      To learn how search depth and date ranges work, see
      [Index and search depth](/web-search-api/concepts/index-and-search-depth).
    </Tip>
  </Accordion>

  <Accordion title="Non-deterministic processing" icon="shuffle">
    Identical queries can produce different results:

    * LLMs may generate different keywords, validators, and extractors
    * Different content sources may be retrieved
    * Field names and structure vary between runs
    * Record counts differ
  </Accordion>

  <Accordion title="Asynchronous operation" icon="clock">
    Each query creates a job that processes asynchronously. Use the returned
    `job_id` to poll the job status and retrieve results when completed.
    Processing typically takes 10-15 minutes.

    Track detailed progress through the [`steps`](/web-search-api/api-reference/jobs/get-job-status#response-steps)
    array in the status endpoint response.
  </Accordion>

  <Accordion title="Batch processing" icon="layer-group">
    Results become available progressively during the `enriching` stage as
    validation completes in batches. When the status endpoint reports
    `status: "enriching"`, you can retrieve partial results before the job
    completes.

    The `progress_validated` field tracks how many candidate clusters have been
    processed. This allows you to access early results while the job continues
    processing remaining batches.
  </Accordion>

  <Accordion title="Job continuation" icon="rotate">
    Start with fewer records using the `limit` parameter for quick testing, then use
    [`POST /catchAll/continue`](/web-search-api/api-reference/jobs/continue-job) to process more
    records without re-submitting the query.

    Continue requests preserve all analysis, validation, and extraction logic
    from the original job.
  </Accordion>

  <Accordion title="Company search" icon="building">
    Filter job results to a predefined list of companies and receive
    per-company relevance scores on every record. Activate by passing
    `connected_dataset_ids` when submitting a job.

    Build a dataset of company entities, then connect it to any job:

    * Results are filtered to events relevant to your companies
    * Each record includes a `connected_entities` array with `entity_id`,
      `name`, `ed_score` (1–10), and `relation` for each matched company

    <Tip>
      See [Company search](/web-search-api/concepts/company-search)
      for a full walkthrough.
    </Tip>
  </Accordion>
</AccordionGroup>
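
Several of the options described above (`start_date`, `end_date`, `limit`, `connected_dataset_ids`, custom validators and enrichments) end up in the same job request. The following sketch shows one way to assemble such a request body in Python. It rests on assumptions: the top-level keys `query`, `validators`, and `enrichments`, the ISO 8601 date format, and the helper name `build_job_payload` are illustrative and should be checked against the API reference.

```python
from datetime import date
from typing import Any, Optional


def build_job_payload(
    query: str,
    *,
    start_date: Optional[date] = None,
    end_date: Optional[date] = None,
    limit: Optional[int] = None,
    validators: Optional[list[dict[str, str]]] = None,
    enrichments: Optional[list[dict[str, str]]] = None,
    connected_dataset_ids: Optional[list[str]] = None,
) -> dict[str, Any]:
    """Assemble a job request body, omitting any option that was not set."""
    if start_date and end_date and start_date > end_date:
        raise ValueError("start_date must not be later than end_date")
    if limit is not None and limit < 1:
        raise ValueError("limit must be a positive integer")

    optional: dict[str, Any] = {
        # ISO 8601 date strings are an assumed wire format.
        "start_date": start_date.isoformat() if start_date else None,
        "end_date": end_date.isoformat() if end_date else None,
        "limit": limit,
        "validators": validators,
        "enrichments": enrichments,
        "connected_dataset_ids": connected_dataset_ids,
    }
    payload: dict[str, Any] = {"query": query}
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload


payload = build_job_payload(
    "M&A activity in the AI sector over $100M in the last month",
    limit=10,
    validators=[{
        "name": "is_acquisition",
        "description": "true if web page describes an acquisition",
        "type": "boolean",
    }],
    enrichments=[{
        "name": "acquiring_company",
        "description": "Extract the acquiring company name",
        "type": "company",
    }],
)
```

Leaving unset options out of the payload lets the service fall back to its own defaults, such as automatically generated validators and enrichments.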

## Endpoints

**Base URL:** `https://catchall.newscatcherapi.com`

<Tabs>
  <Tab title="Jobs">
    | Endpoint                    | Method   | Description                                     |
    | --------------------------- | -------- | ----------------------------------------------- |
    | `/catchAll/validate`        | `POST`   | Validate a query before submission              |
    | `/catchAll/initialize`      | `POST`   | Get validator, enrichment, and date suggestions |
    | `/catchAll/submit`          | `POST`   | Create a new job                                |
    | `/catchAll/continue`        | `POST`   | Continue job with higher limit                  |
    | `/catchAll/jobs/user`       | `GET`    | List all jobs for your API key                  |
    | `/catchAll/jobs/{job_id}`   | `DELETE` | Delete a job                                    |
    | `/catchAll/status/{job_id}` | `GET`    | Check job processing status                     |
    | `/catchAll/pull/{job_id}`   | `GET`    | Retrieve job results                            |

    <Tip>
      Track detailed progress using the `steps` array in the status endpoint response.
      See [Job status > steps](/web-search-api/api-reference/jobs/get-job-status#response-steps) for details.
    </Tip>
  </Tab>

  <Tab title="Monitors">
    | Endpoint                                  | Method   | Description                    |
    | ----------------------------------------- | -------- | ------------------------------ |
    | `/catchAll/monitors/create`               | `POST`   | Create scheduled monitor       |
    | `/catchAll/monitors/{monitor_id}`         | `PATCH`  | Update monitor                 |
    | `/catchAll/monitors/{monitor_id}`         | `DELETE` | Delete a monitor               |
    | `/catchAll/monitors`                      | `GET`    | List all monitors              |
    | `/catchAll/monitors/{monitor_id}/jobs`    | `GET`    | List jobs for a monitor        |
    | `/catchAll/monitors/{monitor_id}/status`  | `GET`    | Get monitor execution history  |
    | `/catchAll/monitors/pull/{monitor_id}`    | `GET`    | Get aggregated monitor results |
    | `/catchAll/monitors/{monitor_id}/enable`  | `POST`   | Enable a monitor               |
    | `/catchAll/monitors/{monitor_id}/disable` | `POST`   | Disable a monitor              |
  </Tab>

  <Tab title="Webhooks">
    | Endpoint                                                                  | Method   | Description                          |
    | ------------------------------------------------------------------------- | -------- | ------------------------------------ |
    | `/catchAll/webhooks`                                                      | `POST`   | Create a webhook                     |
    | `/catchAll/webhooks`                                                      | `GET`    | List webhooks                        |
    | `/catchAll/webhooks/{webhook_id}`                                         | `GET`    | Get a webhook                        |
    | `/catchAll/webhooks/{webhook_id}`                                         | `PATCH`  | Update a webhook                     |
    | `/catchAll/webhooks/{webhook_id}`                                         | `DELETE` | Delete a webhook                     |
    | `/catchAll/webhooks/{webhook_id}/test`                                    | `POST`   | Test webhook delivery                |
    | `/catchAll/webhooks/{webhook_id}/resources`                               | `POST`   | Assign a resource to a webhook       |
    | `/catchAll/webhooks/{webhook_id}/resources`                               | `GET`    | List resources assigned to a webhook |
    | `/catchAll/webhooks/{webhook_id}/resources/{resource_type}/{resource_id}` | `DELETE` | Remove a resource from a webhook     |
    | `/catchAll/resources/{resource_type}/{resource_id}/webhooks`              | `GET`    | List webhooks assigned to a resource |
    | `/catchAll/webhook-history`                                               | `GET`    | Get webhook delivery history         |
  </Tab>

  <Tab title="Entities">
    | Endpoint                         | Method   | Description              |
    | -------------------------------- | -------- | ------------------------ |
    | `/catchAll/entities`             | `POST`   | Create a company entity  |
    | `/catchAll/entities/batch`       | `POST`   | Create multiple entities |
    | `/catchAll/entities`             | `GET`    | List entities            |
    | `/catchAll/entities/{entity_id}` | `GET`    | Get entity               |
    | `/catchAll/entities/{entity_id}` | `PATCH`  | Update entity            |
    | `/catchAll/entities/{entity_id}` | `DELETE` | Delete entity            |
  </Tab>

  <Tab title="Datasets">
    | Endpoint                                        | Method   | Description                        |
    | ----------------------------------------------- | -------- | ---------------------------------- |
    | `/catchAll/datasets`                            | `POST`   | Create dataset from entity IDs     |
    | `/catchAll/datasets/upload`                     | `POST`   | Create dataset from CSV            |
    | `/catchAll/datasets`                            | `GET`    | List datasets                      |
    | `/catchAll/datasets/{dataset_id}`               | `GET`    | Get dataset                        |
    | `/catchAll/datasets/{dataset_id}`               | `PATCH`  | Update dataset name or description |
    | `/catchAll/datasets/{dataset_id}`               | `DELETE` | Delete dataset                     |
    | `/catchAll/datasets/{dataset_id}/entities`      | `POST`   | Add entities to dataset            |
    | `/catchAll/datasets/{dataset_id}/entities`      | `DELETE` | Remove entities from dataset       |
    | `/catchAll/datasets/{dataset_id}/entities/list` | `POST`   | List entities in dataset           |
    | `/catchAll/datasets/{dataset_id}/status`        | `GET`    | Get dataset status history         |
    | `/catchAll/datasets/{dataset_id}/upload`        | `POST`   | Add companies to dataset via CSV   |
  </Tab>

  <Tab title="Projects">
    | Endpoint                                                                  | Method   | Description                            |
    | ------------------------------------------------------------------------- | -------- | -------------------------------------- |
    | `/catchAll/projects`                                                      | `POST`   | Create a project                       |
    | `/catchAll/projects`                                                      | `GET`    | List projects                          |
    | `/catchAll/projects/{project_id}`                                         | `GET`    | Get a project                          |
    | `/catchAll/projects/{project_id}`                                         | `PATCH`  | Update a project                       |
    | `/catchAll/projects/{project_id}`                                         | `DELETE` | Delete a project                       |
    | `/catchAll/projects/{project_id}/overview`                                | `GET`    | Get resource counts by type and status |
    | `/catchAll/projects/{project_id}/resources`                               | `POST`   | Add resources to a project             |
    | `/catchAll/projects/{project_id}/resources`                               | `GET`    | List resources in a project            |
    | `/catchAll/projects/{project_id}/resources/{resource_type}/{resource_id}` | `DELETE` | Remove a resource from a project       |
  </Tab>

  <Tab title="Meta">
    | Endpoint                | Method | Description               |
    | ----------------------- | ------ | ------------------------- |
    | `/health`               | `GET`  | Check API health status   |
    | `/version`              | `GET`  | Get API version info      |
    | `/catchAll/user/limits` | `POST` | Get plan limits and usage |
  </Tab>
</Tabs>
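
As an illustration of how the Jobs endpoints combine with the base URL, the sketch below builds (but does not send) requests for the submit, status, and pull steps using only the Python standard library. The `x-api-key` header name and the helper names are assumptions made for this example; see the API reference for the actual authentication scheme.

```python
import json
from typing import Optional
from urllib.parse import quote
from urllib.request import Request

BASE_URL = "https://catchall.newscatcherapi.com"


def build_request(
    method: str, path: str, api_key: str, body: Optional[dict] = None
) -> Request:
    """Create an HTTP request object for an endpoint path under BASE_URL."""
    headers = {"x-api-key": api_key}  # assumed header name
    data = None
    if body is not None:
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return Request(BASE_URL + path, data=data, headers=headers, method=method)


def submit_request(api_key: str, payload: dict) -> Request:
    return build_request("POST", "/catchAll/submit", api_key, payload)


def status_request(api_key: str, job_id: str) -> Request:
    # Percent-encode the ID so it is safe to place in the URL path.
    return build_request("GET", f"/catchAll/status/{quote(job_id, safe='')}", api_key)


def pull_request(api_key: str, job_id: str) -> Request:
    return build_request("GET", f"/catchAll/pull/{quote(job_id, safe='')}", api_key)
```

Each returned object can be passed to `urllib.request.urlopen` to perform the call. Keeping request construction separate from sending makes the URL and method for each endpoint easy to inspect.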

## Example queries

**Market intelligence**

```text
M&A activity in the AI sector over $100M in the last month
Enterprise software company earnings reports this quarter
Product launches by Fortune 500 technology companies this week
```

**Regulatory monitoring**

```text
FDA drug approvals for oncology treatments in the last 30 days
Financial regulatory actions against banks in the EU this month
Government policy changes affecting semiconductor exports this week
```

**Business development**

```text
Series B funding rounds for SaaS startups over $20M this month
Strategic partnerships between automotive and technology companies this week
Market entry announcements by US companies in Southeast Asia this month
```

**Competitive analysis**

```text
Executive appointments at major cloud infrastructure companies this month
Product launches by Salesforce, HubSpot, and Zendesk this quarter
Layoffs and restructuring announcements at fintech companies this week
```

**Research automation**

```text
Clinical trial results for diabetes treatments published this month
Cybersecurity incidents at financial institutions in the last 30 days
Bankruptcy filings by retail companies in the US this quarter
```

## What's next

<CardGroup cols={2}>
  <Card title="Quickstart" icon="rocket" href="/web-search-api/get-started/quickstart">
    Make your first request and get results in minutes
  </Card>

  <Card title="Monitors" icon="clock" href="/web-search-api/concepts/monitors">
    Automate recurring queries with scheduled execution
  </Card>

  <Card title="API Reference" icon="book" href="/web-search-api/api-reference/jobs/create-job">
    Detailed endpoint documentation and parameters
  </Card>

  <Card title="Postman Collection" icon="share" href="https://www.postman.com/newscatcherapi/newscatcher-public-workspace/collection/38930966-6f0e594c-52f9-4c97-8d41-abe44a396533">
    Explore and test all endpoints interactively
  </Card>
</CardGroup>

<Note>For technical support, contact us at [support@newscatcherapi.com](mailto:support@newscatcherapi.com).</Note>
