Validate queries with Python SDK

The Newscatcher Python SDK includes client-side query validation that mirrors the API’s server-side validation. This feature helps you catch invalid query syntax before making API calls.

Overview

Query validation offers these benefits:

Immediate feedback: Catch errors without making API calls.
Reduced API usage: Avoid requests for invalid queries.
Cost savings: Prevent billable API calls for invalid queries.
Essential for LLM workflows: Critical when you process thousands of generated queries.
Consistent behavior: Match server-side validation exactly.
Better developer experience: Get detailed error messages immediately.

Basic usage

Use the validate_query() method to check query syntax before you make API calls:

Basic query validation

from newscatcher import NewscatcherApi

client = NewscatcherApi(api_key="YOUR_API_KEY")

# Validate a query before using it
is_valid, error_message = client.validate_query("machine learning")
if is_valid:
    print("Query is valid!")
else:
    print(f"Invalid query: {error_message}")

The method returns a tuple:

is_valid (bool): Whether the query passes validation
error_message (str): Detailed error description if validation fails, or an empty string if the query is valid

Automatic validation in SDK methods

Query validation is enabled by default in methods like get_all_articles() and get_all_headlines(). You can control this behavior with the validate_query parameter:

SDK method validation

# Enable validation (default behavior)
articles = client.get_all_articles(
    q="AI OR \"artificial intelligence\"",  # Valid query
    validate_query=True,  # Optional, True by default
    from_="7d"
)

# Disable validation (not recommended)
articles = client.get_all_articles(
    q="some query",
    validate_query=False,  # Skip client-side validation
    from_="7d"
)

When validation is enabled and a query fails validation, the method raises a ValueError with the specific error message.
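Because a failed validation surfaces as a ValueError, calling code can catch it and handle the rejected query without stopping the whole program. The following is a minimal sketch, not an SDK feature: fetch_articles_safely is a hypothetical helper name, and it relies only on the behavior described above (get_all_articles() raising ValueError when client-side validation fails).

```python
def fetch_articles_safely(client, query, **params):
    """Hypothetical helper (not part of the SDK).

    Returns whatever get_all_articles() returns, or None when the
    query is rejected with a ValueError.
    """
    try:
        # validate_query=True is the documented default; it is passed
        # explicitly here only to make the intent visible.
        return client.get_all_articles(q=query, validate_query=True, **params)
    except ValueError as exc:
        # Per the documentation above, a ValueError carries the
        # specific validation error message.
        print(f"Query rejected by client-side validation: {exc}")
        return None
```

Note that this sketch treats every ValueError as a validation failure. If your code needs to distinguish other causes, inspect the message or call validate_query() first.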

Validation rules

The SDK validates queries using the same rules as the Newscatcher API.

Valid patterns

Single words
Multi-word
Exact phrases
Boolean operators
Wildcards
Grouping

Single words and terms

"technology"   # Single word
"AI"           # Acronym

Invalid patterns

Forbidden chars
Invalid wildcards
Operator errors
Double operators
Mixed levels
Unbalanced

Forbidden characters

"machine[learning]"       # Square brackets not allowed
"AI/ML"                   # Forward slashes not allowed
"machine:learning"        # Colons not allowed
"data^science"            # Caret symbols not allowed
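To illustrate what a forbidden-character check looks like, here is a small standalone sketch. It is not the SDK's implementation: FORBIDDEN_CHARS and find_forbidden_chars are hypothetical names, and the character set covers only the four cases shown above (square brackets, forward slash, colon, caret). The real validator may reject more characters, so treat validate_query() as the source of truth.

```python
# Assumption: only the characters from the examples above are listed.
# The SDK's actual forbidden set may be larger.
FORBIDDEN_CHARS = set("[]/:^")


def find_forbidden_chars(query):
    """Return the forbidden characters found in query, in order of first appearance."""
    found = []
    for ch in query:
        if ch in FORBIDDEN_CHARS and ch not in found:
            found.append(ch)
    return found


print(find_forbidden_chars("machine[learning]"))  # ['[', ']']
print(find_forbidden_chars("AI/ML"))              # ['/']
print(find_forbidden_chars("machine learning"))   # []
```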

Understand automatic AND insertion

The API automatically inserts AND operators between standalone terms. When such a query also contains OR, the inserted AND mixes operator levels in the same expression, and validation fails.

Problem

Common AND insertion conflicts

# ❌ This fails because of automatic AND insertion:
"AI OR artificial intelligence"
# Becomes: "AI OR artificial AND intelligence" (mixed operator levels)

# ❌ Another example:
"startup OR venture capital"
# Becomes: "startup OR venture AND capital" (mixed operator levels)

Solutions

Fix AND insertion conflicts

# ✅ Fix by using exact phrase matching:
"AI OR \"artificial intelligence\""
# Stays as: "AI OR \"artificial intelligence\"" (same level)

# ✅ Or use proper grouping:
"startup OR (venture AND capital)"
# Stays as: "startup OR (venture AND capital)" (properly grouped)

Always use double quotes for multi-word terms when combining with OR operators to prevent automatic AND insertion conflicts.
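When OR queries are assembled from a list of terms in code, the quoting rule above can be applied automatically. The sketch below is one possible approach under simple assumptions: quote_if_multiword and build_or_query are hypothetical helpers, not SDK functions, and they do not escape quotes embedded inside a term. It is still worth passing the result through validate_query().

```python
def quote_if_multiword(term):
    """Wrap a term in double quotes if it contains whitespace and is not already quoted."""
    term = term.strip()
    already_quoted = len(term) >= 2 and term.startswith('"') and term.endswith('"')
    if len(term.split()) > 1 and not already_quoted:
        return f'"{term}"'
    return term


def build_or_query(terms):
    """Join terms with OR, quoting multi-word terms so no AND is inserted inside them."""
    return " OR ".join(quote_if_multiword(t) for t in terms)


print(build_or_query(["AI", "artificial intelligence"]))
# AI OR "artificial intelligence"
print(build_or_query(["startup", "venture capital"]))
# startup OR "venture capital"
```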

Validate multiple queries

For applications that process multiple queries (like LLM-generated queries), you can validate them in bulk:

Bulk query validation

from newscatcher import NewscatcherApi

def validate_queries_bulk(client, queries):
    """Validate multiple queries and return results with details."""
    results = []

    for i, query in enumerate(queries):
        is_valid, error_message = client.validate_query(query)
        # Record the outcome for each query; 'error' is None when the query is valid
        results.append({
            'index': i,
            'query': query,
            'is_valid': is_valid,
            'error': error_message if not is_valid else None
        })

    return results

# Example usage with multiple queries
queries = [
    "machine learning",                    # Valid
    "AI OR artificial intelligence",       # Invalid (mixed levels)
    "\"artificial intelligence\"",         # Valid (exact phrase)
    "technology[invalid]",                 # Invalid (forbidden chars)
    "(python AND ML) OR (data AND science)"  # Valid (proper grouping)
]

client = NewscatcherApi(api_key="YOUR_API_KEY")
validation_results = validate_queries_bulk(client, queries)

# Process results
valid_queries = [r['query'] for r in validation_results if r['is_valid']]
invalid_queries = [r for r in validation_results if not r['is_valid']]

print(f"Valid queries: {len(valid_queries)}")
print(f"Invalid queries: {len(invalid_queries)}")

# Show invalid queries with errors
for result in invalid_queries:
    print(f"Query: {result['query']}")
    print(f"Error: {result['error']}\n")

This approach is useful for:

LLM-generated queries that may have syntax issues.
User input that needs validation before processing.
Batch processing scenarios where you want to filter valid queries first.

Bulk validation is especially valuable for production applications processing thousands of queries, as it prevents costly API calls for invalid queries.
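One way to act on bulk validation results is to send only the valid queries to the API and keep the rejected ones for review. The following is a sketch under stated assumptions: fetch_valid_only is a hypothetical helper, and it uses only the two SDK calls described on this page, validate_query() and get_all_articles().

```python
def fetch_valid_only(client, queries, **params):
    """Hypothetical helper: validate each query and fetch only the valid ones.

    Returns two dicts: fetched maps each valid query to its results,
    rejected maps each invalid query to its error message.
    """
    fetched = {}
    rejected = {}
    for query in queries:
        is_valid, error_message = client.validate_query(query)
        if is_valid:
            # Only valid queries reach the API.
            fetched[query] = client.get_all_articles(q=query, **params)
        else:
            # Keep the error so the query can be repaired or logged later.
            rejected[query] = error_message
    return fetched, rejected
```

The rejected dict can be fed back to whatever produced the queries, for example as feedback to an LLM prompt or as an error shown to the user.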

Best practices

Use exact phrases for multi-word terms: When you search for specific phrases, always use double quotes to prevent automatic AND insertion conflicts.
Validate LLM-generated queries: Essential for applications that process thousands of AI-generated queries to save time and money.
Group complex queries: Use parentheses to make query logic clear and avoid operator-level conflicts.
