CatchAll generates response schemas dynamically—field names change between jobs, even with identical inputs. This guide shows how to build integrations that handle this variability.
Only 3 fields are guaranteed:
  • record_id
  • record_title
  • citations array
All other fields, returned inside the enrichment object, vary between jobs.

Why schemas vary

Submit the same query twice and you can get different field names. For example, one job might return:
{
  "record_id": "5262823697790152939",
  "record_title": "Oracle Q1 2026 Earnings Exceed Expectations",
  "enrichment": {
    "company_name": "Oracle",
    "revenue": "$14.9 billion",
    "profit_margin": "42%"
  }
}
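Another job for the same query might return the same facts under different names (an illustrative example; the actual names depend on the extractors generated for that job):
{
  "record_id": "...",
  "record_title": "Oracle Q1 2026 Earnings Exceed Expectations",
  "enrichment": {
    "company": "Oracle",
    "total_revenue": "$14.9B",
    "net_margin": "42%"
  }
}
Same data, different field names.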
Why this happens:
  • LLMs generate extractors dynamically for each job
  • Different keywords, validators, and extractors are created
  • Field names are chosen semantically to match content
This is expected behavior, not a bug.

Integration strategies

Store raw data

Preserve the entire enrichment object as JSON:
import json

# db is a placeholder for your datastore client; adapt the insert call to your database
db.insert({
    "record_id": record["record_id"],
    "title": record["record_title"],
    "raw_data": json.dumps(record["enrichment"]),
    "citations": json.dumps(record["citations"])
})
Use when: You need to preserve all data without loss. Query dynamically later.
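A minimal sketch of this strategy using Python's built-in sqlite3 module (the table and column names are illustrative); json_extract, available when SQLite is built with JSON support as in recent Python releases, lets you query fields later without knowing their names up front:
import json
import sqlite3

conn = sqlite3.connect("catchall.db")
conn.execute(
    "CREATE TABLE IF NOT EXISTS records ("
    "record_id TEXT PRIMARY KEY, title TEXT, raw_data TEXT, citations TEXT)"
)

# Store the enrichment object untouched
conn.execute(
    "INSERT OR REPLACE INTO records VALUES (?, ?, ?, ?)",
    (
        record["record_id"],
        record["record_title"],
        json.dumps(record["enrichment"]),
        json.dumps(record["citations"]),
    ),
)
conn.commit()

# Query dynamically later; missing fields simply return NULL
rows = conn.execute(
    "SELECT title, json_extract(raw_data, '$.revenue') FROM records"
).fetchall()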

Map to canonical fields

Translate variable field names to your fixed schema using pattern matching:
# Map canonical names to substring patterns seen in generated field names
FIELD_PATTERNS = {
    "revenue": ["revenue", "sales", "total_revenue"],
    "profit": ["profit", "margin", "income", "earnings"],
    "quarter": ["quarter", "q", "period"],
}

def normalize_record(record):
    normalized = {"title": record["record_title"]}

    for canonical, patterns in FIELD_PATTERNS.items():
        for key, value in record["enrichment"].items():
            if any(p in key.lower() for p in patterns):
                normalized[canonical] = value
                break

    return normalized
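For example, applying normalize_record to the sample record shown earlier:
record = {
    "record_title": "Oracle Q1 2026 Earnings Exceed Expectations",
    "enrichment": {
        "company_name": "Oracle",
        "revenue": "$14.9 billion",
        "profit_margin": "42%",
    },
}

print(normalize_record(record))
# {'title': 'Oracle Q1 2026 Earnings Exceed Expectations',
#  'revenue': '$14.9 billion', 'profit': '42%'}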
Use when: You have a fixed database schema and need consistent field names.

Process dynamically

Handle all fields without assumptions:
print(f"Title: {record['record_title']}\n")

for key, value in record["enrichment"].items():
    display_name = key.replace("_", " ").title()
    print(f"{display_name}: {value}")
Use when: Your application can display fields without fixed structure (dashboards, search results).

Common patterns

Find fields by pattern

def find_field(enrichment, patterns):
    """Find first field matching any pattern."""
    for key, value in enrichment.items():
        if any(pattern in key.lower() for pattern in patterns):
            return value
    return None

# Usage
revenue = find_field(record["enrichment"], ["revenue", "sales"])
profit = find_field(record["enrichment"], ["profit", "margin", "income"])

Handle multiple possible names

# Check variations with fallback
revenue = (
    record["enrichment"].get("revenue") or
    record["enrichment"].get("total_revenue") or
    record["enrichment"].get("sales") or
    "N/A"
)
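Note that the or chain treats falsy values such as 0 or an empty string as missing. If that matters, a small helper that checks key presence avoids the issue (first_present is an illustrative name, not part of the API):
def first_present(enrichment, candidates, default="N/A"):
    """Return the value of the first candidate key that is present."""
    for key in candidates:
        if key in enrichment:
            return enrichment[key]
    return default

revenue = first_present(record["enrichment"], ["revenue", "total_revenue", "sales"])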

Parse different formats

Revenue can appear as "$14.9 billion", "$14.9B", "14900000000", or 14.9:
import re

# The value may already be numeric (e.g. 14.9), so coerce to string first
revenue_str = str(record["enrichment"].get("revenue", "0"))
numbers = re.findall(r'[\d.]+', revenue_str)
revenue_value = float(numbers[0]) if numbers else 0.0
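The snippet above extracts only the leading number, so "$14.9 billion" and "14900000000" end up on different scales. If you need a single numeric scale, a helper along these lines can expand common magnitude suffixes (a sketch; adjust the multiplier table to the formats you actually see):
import re

# Magnitude words and suffixes to expand (illustrative, extend as needed)
MULTIPLIERS = {"billion": 1e9, "b": 1e9, "million": 1e6, "m": 1e6}

def parse_amount(value):
    """Parse '$14.9 billion', '$14.9B', '14900000000', or 14.9 into a float."""
    text = str(value).lower()
    numbers = re.findall(r'[\d.]+', text)
    if not numbers:
        return 0.0
    amount = float(numbers[0])
    for suffix, factor in MULTIPLIERS.items():
        # Match the suffix right after the number ("14.9b") or as a word ("14.9 billion")
        if re.search(rf'[\d.]\s*{suffix}\b', text):
            return amount * factor
    return amount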

What not to do

Don’t hardcode field names

# ❌ Breaks when field name differs
revenue = record["enrichment"]["revenue"]  # KeyError
# ✅ Handle variations
revenue = record["enrichment"].get("revenue", "N/A")

Don’t validate specific fields

# ❌ Breaks when schema changes
required_fields = ["company_name", "revenue", "quarter"]
for field in required_fields:
    assert field in record["enrichment"]
# ✅ Only check guaranteed fields
assert "record_id" in record
assert "record_title" in record
assert "citations" in record

Don’t assume consistent fields across records

import pandas as pd

# ❌ Assumes all records have same fields
df = pd.DataFrame([
    {"company": r["enrichment"]["company_name"]}
    for r in results["all_records"]
])
# ✅ Handle variable fields
data = []
for record in results["all_records"]:
    row = {"title": record["record_title"]}
    row.update(record["enrichment"])  # Add all fields dynamically
    data.append(row)

df = pd.DataFrame(data)

Test your integration

Submit the same query multiple times to see schema variations:
import requests
import time

API_KEY = "YOUR_API_KEY"
BASE_URL = "https://catchall.newscatcherapi.com"
HEADERS = {"x-api-key": API_KEY}

# Submit same query 3 times
job_ids = []
for i in range(3):
    response = requests.post(
        f"{BASE_URL}/catchAll/submit",
        headers=HEADERS,
        json={"query": "Tech earnings Q3"}
    )
    job_ids.append(response.json()["job_id"])
    print(f"Created job {i+1}: {job_ids[-1]}")

# Wait for jobs to complete
print("\nWaiting for jobs to complete...")
time.sleep(900)  # Wait 15 minutes

# Compare schemas
print("\nComparing schemas across jobs:\n")
for idx, job_id in enumerate(job_ids, 1):
    response = requests.get(
        f"{BASE_URL}/catchAll/pull/{job_id}",
        headers=HEADERS
    )
    results = response.json()

    if results.get("all_records"):
        first_record = results["all_records"][0]["enrichment"]
        field_names = list(first_record.keys())
        print(f"Job {idx} fields: {field_names}")
    else:
        print(f"Job {idx}: No records yet")
Verify your code:
  • Doesn’t crash when field names differ
  • Extracts data from all variations
  • Handles missing fields gracefully
  • Works with both empty and populated enrichment objects
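One way to exercise these checks is a small test that feeds your handling code synthetic records with differing field names (illustrative data; substitute normalize_record with whatever function your integration uses):
def test_handles_schema_variations():
    samples = [
        {"record_title": "A", "enrichment": {"revenue": "$1.2 billion"}},
        {"record_title": "B", "enrichment": {"total_revenue": "$1.2B"}},
        {"record_title": "C", "enrichment": {}},  # empty enrichment
    ]
    for record in samples:
        normalized = normalize_record(record)  # your handling code here
        assert "title" in normalized  # must not crash; title is always present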

Using schema parameter

The schema parameter can influence field naming but doesn’t guarantee specific names:
{
  "query": "Tech earnings",
  "schema": "[COMPANY] earned [REVENUE] in [QUARTER]"
}
What you get:
  • schema_based_summary field is added to enrichment
  • Field names may align with placeholders (company, revenue, quarter)
  • Specific field names are not guaranteed
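For example, reading the summary defensively after submitting with a schema (a sketch that reuses BASE_URL and HEADERS from the test script above; field names beyond schema_based_summary remain unguaranteed):
import requests

response = requests.post(
    f"{BASE_URL}/catchAll/submit",
    headers=HEADERS,
    json={
        "query": "Tech earnings",
        "schema": "[COMPANY] earned [REVENUE] in [QUARTER]"
    }
)
job_id = response.json()["job_id"]

# Later, once the job has completed:
results = requests.get(f"{BASE_URL}/catchAll/pull/{job_id}", headers=HEADERS).json()
for record in results.get("all_records", []):
    summary = record["enrichment"].get("schema_based_summary", "N/A")
    print(f"{record['record_title']}: {summary}")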

See also