
# Validate queries with Python SDK

> Use client-side query validation to catch syntax errors before making API calls

The Newscatcher Python SDK includes client-side query validation that mirrors
the API's server-side validation. This feature helps you catch invalid query
syntax before making API calls.

## Overview

Query validation offers these benefits:

* **Immediate feedback**: Catch errors without making API calls.
* **Reduced API usage**: Avoid requests for invalid queries.
* **Cost savings**: Prevent billable API calls for invalid queries.
* **Essential for LLM workflows**: Critical when you process thousands of
  generated queries.
* **Consistent behavior**: Match server-side validation exactly.
* **Better developer experience**: Get detailed error messages immediately.

## Basic usage

Use the `validate_query()` method to check query syntax before you make API
calls:

```python Basic query validation icon="python" theme={null}
from newscatcher import NewscatcherApi

client = NewscatcherApi(api_key="YOUR_API_KEY")

# Validate a query before using it
is_valid, error_message = client.validate_query("machine learning")
if is_valid:
    print("Query is valid!")
else:
    print(f"Invalid query: {error_message}")
```

The method returns a tuple:

* `is_valid` (bool): Whether the query passes validation
* `error_message` (str): Detailed error description if validation fails, or
  empty string if valid

## Automatic validation in SDK methods

Methods such as `get_all_articles()` and `get_all_headlines()` validate the
query by default. Use the `validate_query` parameter to control this behavior:

```python SDK method validation icon="python" theme={null}
# Enable validation (default behavior)
articles = client.get_all_articles(
    q="AI OR \"artificial intelligence\"",  # Valid query
    validate_query=True,  # Optional, True by default
    from_="7d"
)

# Disable validation (not recommended)
articles = client.get_all_articles(
    q="some query",
    validate_query=False,  # Skip client-side validation
    from_="7d"
)
```

<Warning>
  When validation is enabled and a query fails validation, the method raises a
  `ValueError` with the specific error message.
</Warning>
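
Because a failed check surfaces as a `ValueError`, you can handle it like any
other exception. The following is a minimal sketch, assuming `client` is an
already-constructed `NewscatcherApi` instance; `fetch_articles_safely` is a
hypothetical helper name and not part of the SDK.

```python
def fetch_articles_safely(client, query, **params):
    """Return articles for `query`, or None if validation rejects it.

    Illustrative helper, not part of the SDK. `client` is assumed to be
    a NewscatcherApi instance.
    """
    try:
        # validate_query defaults to True, so an invalid query raises
        # ValueError instead of being sent to the API.
        return client.get_all_articles(q=query, **params)
    except ValueError as exc:
        print(f"Skipping invalid query {query!r}: {exc}")
        return None
```

Note that this catches every `ValueError` raised by the call, so in real code
you may want to log the message rather than assume it always comes from query
validation.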

## Validation rules

The SDK validates queries using the same rules as the Newscatcher API.

### Valid patterns

<Tabs>
  <Tab title="Single words">
    ```python Single words and terms theme={null}
    "technology"   # Single word
    "AI"           # Acronym
    ```
  </Tab>

  <Tab title="Multi-word">
    ```python Multi-word searches (automatic AND insertion) theme={null}
    "machine learning"         # → "machine AND learning"
    "artificial intelligence"  # → "artificial AND intelligence"
    "data science"             # → "data AND science"
    ```
  </Tab>

  <Tab title="Exact phrases">
    ```python Exact phrase matching (prevents AND insertion) theme={null}
    "\"machine learning\""         # Exact phrase: "machine learning"
    "\"artificial intelligence\""  # Exact phrase: "artificial intelligence"
    "\"New York Times\""           # Exact phrase: "New York Times"
    ```
  </Tab>

  <Tab title="Boolean operators">
    ```python Boolean operators (same level) theme={null}
    "AI OR ML"                        # Same level OR operators
    "python AND machine"              # Same level AND operators
    "AI OR ML OR NLP"                 # Multiple same-level operators
    "machine AND learning AND python" # Multiple same-level AND operators
    ```
  </Tab>

  <Tab title="Wildcards">
    ```python Wildcards theme={null}
    "technolog*"      # Suffix wildcard
    "learn*"          # Valid wildcard usage
    "*"               # Special case: matches all
    ```
  </Tab>

  <Tab title="Grouping">
    ```python Proper grouping with parentheses theme={null}
    "(AI AND research) OR (ML AND development)"
    "\"machine learning\" AND (python OR R)"
    "(startup AND innovation) OR (\"venture capital\")"
    ```
  </Tab>
</Tabs>

### Invalid patterns

<Tabs>
  <Tab title="Forbidden chars">
    ```python Forbidden characters theme={null}
    "machine[learning]"       # Square brackets not allowed
    "AI/ML"                   # Forward slashes not allowed
    "machine:learning"        # Colons not allowed
    "data^science"            # Caret symbols not allowed
    ```
  </Tab>

  <Tab title="Invalid wildcards">
    ```python Invalid wildcards theme={null}
    "*machine"                # Cannot start with wildcard
    "term *"                  # Space before wildcard
    "**"                      # Multiple wildcards
    "***technology"           # Multiple wildcards
    ```
  </Tab>

  <Tab title="Operator errors">
    ```python Operator boundary errors theme={null}
    "AND machine learning"        # Cannot start with operator
    "machine learning OR"         # Cannot end with operator
    "OR artificial intelligence"  # Cannot start with operator
    "python AND"                  # Cannot end with operator
    ```
  </Tab>

  <Tab title="Double operators">
    ```python Double operators theme={null}
    "machine OR OR learning"  # Doubled operators
    "AI AND AND research"     # Invalid combinations
    "data NOT NOT science"    # Multiple NOT operators
    ```
  </Tab>

  <Tab title="Mixed levels">
    ```python Mixed operator levels (automatic AND insertion creates conflicts) theme={null}
    "AI OR artificial intelligence"         # → "AI OR artificial AND intelligence" ❌
    "startup OR venture capital"            # → "startup OR venture AND capital" ❌
    "python AND (machine learning OR AI)"   # → "python AND (machine AND learning OR AI)" ❌
    "blockchain OR artificial intelligence" # → "blockchain OR artificial AND intelligence" ❌
    ```
  </Tab>

  <Tab title="Unbalanced">
    ```python Unbalanced quotes and parentheses theme={null}
    "unbalanced \"quote"                # Unclosed quote
    "unbalanced (parenthesis"           # Unclosed parenthesis
    "multiple \"unbalanced \"quotes\""  # Odd number of quotes
    ```
  </Tab>
</Tabs>
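
To make a few of these rules concrete, here is a rough sketch of how some of
them could be checked in plain Python. It is not the SDK's validator and covers
only a subset of the rules (forbidden characters, unbalanced quotes and
parentheses, and leading or trailing `AND`/`OR`). The name `rough_precheck` and
its simplifications, such as ignoring parentheses inside quotes, are assumptions
made for illustration. Use `validate_query()` for real checks.

```python
# Only the characters listed in the "Forbidden chars" tab above.
FORBIDDEN_CHARS = set("[]/:^")
BOUNDARY_OPERATORS = {"AND", "OR"}


def rough_precheck(query):
    """Return a list of problems found by a few simplified checks.

    Illustrative only: NOT the SDK's validation logic.
    """
    problems = []

    found = sorted(FORBIDDEN_CHARS.intersection(query))
    if found:
        problems.append(f"forbidden characters: {' '.join(found)}")

    # An odd number of double quotes means at least one is unclosed.
    if query.count('"') % 2 != 0:
        problems.append("unbalanced quotes")

    # Track nesting depth; it must never go negative and must end at zero.
    depth = 0
    for ch in query:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                break
    if depth != 0:
        problems.append("unbalanced parentheses")

    tokens = query.split()
    if tokens and tokens[0] in BOUNDARY_OPERATORS:
        problems.append("starts with an operator")
    if tokens and tokens[-1] in BOUNDARY_OPERATORS:
        problems.append("ends with an operator")

    return problems
```

An empty list means none of these simplified checks fired. It does not
guarantee that the query passes the full validation.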

## Understand automatic AND insertion

The API automatically inserts an AND operator between adjacent unquoted terms.
When such terms sit next to an OR at the same nesting level, the expanded query
mixes AND and OR without grouping, and validation fails.

**Problem**

```python Common AND insertion conflicts icon="exclamation-triangle" theme={null}
# ❌ This fails because of automatic AND insertion:
"AI OR artificial intelligence"
# Becomes: "AI OR artificial AND intelligence" (mixed operator levels)

# ❌ Another example:
"startup OR venture capital"
# Becomes: "startup OR venture AND capital" (mixed operator levels)
```

**Solutions**

```python Fix AND insertion conflicts icon="check" theme={null}
# ✅ Fix by using exact phrase matching:
"AI OR \"artificial intelligence\""
# Stays as: "AI OR \"artificial intelligence\"" (same level)

# ✅ Or use proper grouping:
"startup OR (venture AND capital)"
# Stays as: "startup OR (venture AND capital)" (properly grouped)
```

<Tip>
  Always use double quotes for multi-word terms when combining with OR operators
  to prevent automatic AND insertion conflicts.
</Tip>
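
When you assemble OR queries programmatically, the quoting rule above can be
applied automatically. The sketch below uses a hypothetical helper, `or_query`,
which is not part of the SDK. It assumes the input terms contain no embedded
double quotes other than an optional surrounding pair.

```python
def or_query(terms):
    """Join search terms with OR, quoting any multi-word term."""
    parts = []
    for term in terms:
        term = term.strip()
        if not term:
            continue  # Skip empty entries.
        already_quoted = term.startswith('"') and term.endswith('"')
        # A multi-word term would otherwise pick up an implicit AND,
        # so wrap it in double quotes to keep it an exact phrase.
        if len(term.split()) > 1 and not already_quoted:
            term = f'"{term}"'
        parts.append(term)
    return " OR ".join(parts)


print(or_query(["AI", "artificial intelligence"]))
# AI OR "artificial intelligence"
```

You can still pass the result to `validate_query()` before you use it.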

## Validate multiple queries

For applications that process multiple queries (like LLM-generated queries), you
can validate them in bulk:

<Expandable title="Bulk validation code example" defaultOpen="false">
  ```python Bulk query validation icon="python" expandable theme={null}
  from newscatcher import NewscatcherApi


  def validate_queries_bulk(client, queries):
      """Validate multiple queries and return results with details."""
      results = []

      for i, query in enumerate(queries):
          is_valid, error_message = client.validate_query(query)
          results.append({
              'index': i,
              'query': query,
              'is_valid': is_valid,
              'error': error_message if not is_valid else None
          })

      return results

  # Example usage with multiple queries
  queries = [
      "machine learning",                    # Valid
      "AI OR artificial intelligence",       # Invalid (mixed levels)
      "\"artificial intelligence\"",         # Valid (exact phrase)
      "technology[invalid]",                 # Invalid (forbidden chars)
      "(python AND ML) OR (data AND science)"  # Valid (proper grouping)
  ]

  client = NewscatcherApi(api_key="YOUR_API_KEY")
  validation_results = validate_queries_bulk(client, queries)

  # Process results
  valid_queries = [r['query'] for r in validation_results if r['is_valid']]
  invalid_queries = [r for r in validation_results if not r['is_valid']]

  print(f"Valid queries: {len(valid_queries)}")
  print(f"Invalid queries: {len(invalid_queries)}")

  # Show invalid queries with errors
  for result in invalid_queries:
      print(f"Query: {result['query']}")
      print(f"Error: {result['error']}\n")
  ```
</Expandable>

This approach works well when you work with:

* LLM-generated queries that may have syntax issues.
* User input that needs validation before processing.
* Batch processing scenarios where you want to filter valid queries first.

<Note>
  Bulk validation is especially valuable for production applications processing
  thousands of queries, as it prevents costly API calls for invalid queries.
</Note>
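
For LLM-generated queries, the error message can also serve as feedback for
another attempt. The following is a minimal sketch of that idea, not an SDK
feature. `validate_with_retry` and the `fix_query` callable are hypothetical;
`fix_query(query, error_message)` stands in for whatever produces a revised
query in your application, such as a wrapper around an LLM call.

```python
def validate_with_retry(client, query, fix_query, max_attempts=3):
    """Validate `query`, requesting a rewrite after each failed check.

    `fix_query` is a hypothetical callable supplied by your application.
    Returns the first query that passes, or None if all attempts fail.
    """
    for attempt in range(max_attempts):
        is_valid, error_message = client.validate_query(query)
        if is_valid:
            return query
        # Only request a rewrite if another validation attempt remains.
        if attempt < max_attempts - 1:
            query = fix_query(query, error_message)
    return None
```

Capping the number of attempts keeps a query that never converges from looping
indefinitely.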

## Best practices

1. **Use exact phrases for multi-word terms**: When you search for specific
   phrases, always use double quotes to prevent automatic AND insertion
   conflicts.

2. **Validate LLM-generated queries**: Essential for applications that process
   thousands of AI-generated queries to save time and money.

3. **Group complex queries**: Use parentheses to make query logic clear and
   avoid operator-level conflicts.

## See also

* [Query syntax](/news-api/guides-and-concepts/advanced-querying)
* [How to build search queries](/news-api/how-to/build-search-queries)
* [Python SDK](/news-api/libraries/python)
