How to retrieve more than 10,000 articles

The Newscatcher API limits results to 10,000 articles per search query. The Python SDK provides special methods that automatically split your search across multiple time periods to bypass the limit and retrieve all articles relevant to your query.

These advanced retrieval methods are available only in the Python SDK.

Understanding the article limit

When your query matches more than 10,000 articles, the API returns "total_hits": 10000 as a hard limit, and you cannot retrieve more through standard pagination.

from newscatcher import NewscatcherApi

client = NewscatcherApi(api_key="YOUR_API_KEY")

response = client.search.post(
    q="technology",
    from_="7d",
    to="now"
)

print(f"Total hits: {response.total_hits}")
print(f"Is result capped: {response.total_hits == 10000}")  # True if limit reached

Using time-chunking methods

The SDK provides two special methods to retrieve large volumes of articles:

get_all_articles
get_all_headlines

Both methods are available for synchronous and asynchronous clients. Both accept all existing v3 API parameters, giving you full control over filtering, sorting, and content selection while bypassing the 10,000-article limit.

Get all articles

from newscatcher import NewscatcherApi

client = NewscatcherApi(api_key="YOUR_API_KEY")

articles = client.get_all_articles(
    q="renewable energy",
    from_="30d",
    to="now",
    time_chunk_size="1d",
    max_articles=50000,
    show_progress=True,
)

print(f"Retrieved {len(articles)} articles")

Get all headlines

headlines = client.get_all_headlines(
    when="30d",
    time_chunk_size="1d",
    max_articles=20000,
    show_progress=True
)

print(f"Retrieved {len(headlines)} headlines")

How time-chunking works

Time-chunking divides your date range into smaller intervals, makes a separate API call for each interval, and combines the results. Each interval can return up to 10,000 articles. For example, with time_chunk_size="1d" over 5 days, the method makes 5 API calls, one per day, and paginates each automatically, so it can retrieve up to 50,000 articles.
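The splitting step can be sketched in plain Python. This is an illustration of the idea only, not the SDK's internal implementation: the helper name split_into_chunks and the constant API_CAP_PER_QUERY are hypothetical, and no API calls are made.

```python
from datetime import datetime, timedelta

# Per-query limit described in this guide.
API_CAP_PER_QUERY = 10_000


def split_into_chunks(start, end, chunk):
    """Return consecutive (chunk_start, chunk_end) pairs covering [start, end).

    Hypothetical helper for illustration; the SDK does this for you.
    """
    if chunk <= timedelta(0):
        raise ValueError("chunk must be a positive timedelta")
    chunks = []
    cursor = start
    while cursor < end:
        # The last chunk is clipped so it never runs past the end date.
        chunk_end = min(cursor + chunk, end)
        chunks.append((cursor, chunk_end))
        cursor = chunk_end
    return chunks


# Five days split into one-day chunks, as in the example above.
start = datetime(2024, 1, 1)
end = datetime(2024, 1, 6)
chunks = split_into_chunks(start, end, timedelta(days=1))

# Each chunk is a separate query, so each has its own 10,000-article ceiling.
max_retrievable = len(chunks) * API_CAP_PER_QUERY
print(len(chunks), max_retrievable)  # prints: 5 50000
```

The ceiling is an upper bound: a chunk that matches fewer than 10,000 articles returns only what it matches, and a chunk that matches more is still capped.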

[Figure: time-chunking diagram showing how multiple requests are combined]

Choosing the right chunk size

The optimal chunk size depends on how many articles your query returns:

Query type	Article volume	Recommended chunk size
Extremely broad	10,000+ per hour	`"1h"`
Very broad	10,000+ per day	`"6h"`
Broad	3,000-10,000 per day	`"1d"`
Moderate	1,000-3,000 per day	`"3d"`
Specific	100-1,000 per day	`"7d"`
Very specific	< 100 per day	`"30d"`
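The table can be turned into a small lookup. This is a rough sketch, assuming you already have an estimate of your query's volume (for example from total_hits on a short test query). The function recommend_chunk_size and its parameters are hypothetical and not part of the SDK, and values that fall exactly on a boundary are resolved toward the smaller chunk.

```python
def recommend_chunk_size(articles_per_day, peak_articles_per_hour=0):
    """Map an estimated article volume to a chunk size from the table above.

    Hypothetical helper: thresholds mirror the table, and boundary values
    are resolved toward the smaller (safer) chunk size.
    """
    if articles_per_day < 0 or peak_articles_per_hour < 0:
        raise ValueError("volumes must be non-negative")
    if peak_articles_per_hour >= 10_000:
        return "1h"   # Extremely broad
    if articles_per_day >= 10_000:
        return "6h"   # Very broad
    if articles_per_day >= 3_000:
        return "1d"   # Broad
    if articles_per_day >= 1_000:
        return "3d"   # Moderate
    if articles_per_day >= 100:
        return "7d"   # Specific
    return "30d"      # Very specific


print(recommend_chunk_size(5_000))   # prints: 1d
print(recommend_chunk_size(50))      # prints: 30d
```

When in doubt, a smaller chunk is the safer choice: it costs more API calls but keeps each chunk below the 10,000-article cap.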

Method parameters

q

string

required

Your search query. Supports AND, OR, NOT operators and advanced syntax.

from_

string

default:"30d"

Starting date for get_all_articles (e.g., "10d" or "2023-03-15").

to

string

default:"now"

Ending date for get_all_articles. Defaults to the current time.

when

string

default:"7d"

Time range for get_all_headlines (e.g., "1d" or "2023-03-15").

time_chunk_size

string

default:"1h"

Chunk size: "1h", "6h", "1d", "7d", "1m".

max_articles

integer

default:"100000"

Maximum number of articles to retrieve.

show_progress

boolean

default:"false"

Whether to display a progress bar.

deduplicate

boolean

default:"true"

Whether to remove duplicate articles.

concurrency

integer

default:"3"

For async methods only: number of concurrent requests.
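To make the deduplicate and concurrency parameters more concrete, here is a conceptual sketch of what bounded concurrency and deduplication mean. It does not call the Newscatcher API and is not the SDK's implementation: fetch_chunk is a hypothetical stand-in that returns fake articles, and deduplicating on an "id" field is an assumption made for illustration.

```python
import asyncio


async def fetch_chunk(chunk_id):
    """Hypothetical stand-in for one chunk request; returns fake articles."""
    await asyncio.sleep(0.01)  # simulate network latency
    # Adjacent chunks share one article to simulate overlapping results.
    return [{"id": f"a{chunk_id}"}, {"id": f"a{chunk_id + 1}"}]


async def fetch_all(chunk_ids, concurrency=3):
    """Fetch all chunks with at most `concurrency` requests in flight."""
    semaphore = asyncio.Semaphore(concurrency)
    in_flight = 0
    peak = 0

    async def bounded(chunk_id):
        nonlocal in_flight, peak
        async with semaphore:  # blocks when `concurrency` requests are active
            in_flight += 1
            peak = max(peak, in_flight)
            try:
                return await fetch_chunk(chunk_id)
            finally:
                in_flight -= 1

    batches = await asyncio.gather(*(bounded(c) for c in chunk_ids))

    # Deduplicate on the (assumed) "id" field, keeping the first occurrence.
    seen = set()
    unique = []
    for batch in batches:
        for article in batch:
            if article["id"] not in seen:
                seen.add(article["id"])
                unique.append(article)
    return unique, peak


articles, peak = asyncio.run(fetch_all(range(6), concurrency=3))
print(len(articles), peak)
```

Six chunks return twelve fake articles in total, of which seven are unique, and the number of simultaneous requests never exceeds the concurrency value. A higher value finishes sooner but is more likely to run into rate limits.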

Common issues and solutions

Rate limiting errors

Memory errors

Missing results
