Searches for articles similar to the specified query. You can filter results by language, country, source, and more.

Search similar articles

Additional information about the domain of the article.

Indicates whether the domain is a news domain.

Is News Domain

News Domain Type

The type of news content provided by the domain.

News Type

Additional Domain Info

The domain(s) mentioned in the article. For multiple domains, use a comma-separated string or an array of strings.

Examples: 
- `"who.int, nih.gov"`
- `["who.int", "nih.gov"]`

For more details, see [Search by URL](/docs/v3/documentation/how-to/search-by-url).


The complete URL(s) mentioned in the article. For multiple URLs, use a comma-separated string or an array of strings.

Examples: 
- `"https://aiindex.stanford.edu/report/, https://www.stateof.ai/"`
- `["https://aiindex.stanford.edu/report/", "https://www.stateof.ai/"]`

For more details, see [Search by URL](/docs/v3/documentation/how-to/search-by-url).
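
Several parameters on this page accept either a comma-separated string or an array of strings. The sketch below shows one way to normalize both forms on the client side before building a request. The helper names `to_list` and `to_csv` are hypothetical and are not part of the API or any SDK.

```python
from typing import Iterable, List, Union


def to_list(value: Union[str, Iterable[str]]) -> List[str]:
    """Normalize a comma-separated string or an iterable of strings to a clean list."""
    # A plain string is split on commas; any other iterable is used item by item.
    items = value.split(",") if isinstance(value, str) else list(value)
    # Strip surrounding whitespace and drop empty entries such as a trailing comma.
    return [item.strip() for item in items if item.strip()]


def to_csv(value: Union[str, Iterable[str]]) -> str:
    """Join the normalized values into the comma-separated string form."""
    return ",".join(to_list(value))


print(to_list("who.int, nih.gov"))
print(to_csv(["who.int", "nih.gov"]))
```

Splitting on commas is only safe for values that contain no commas themselves. For URLs that do, the array form is the safer choice.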


The data model representing a single article in the search results.

A list of all domain URLs mentioned in the article.

All Domain Links

A list of all URLs mentioned in the article.

All Links

Authors

Content

English translation of the article content. Available when using the `search_in` parameter with the `content_translated` option or by setting the `include_translation_fields` parameter to `true`.


The country where the article was published.

Country

An object that contains custom tags associated with an article, where each key is a taxonomy name, and the value is an array of tags.

Custom Tags

Description

Domain URL

Full Domain URL

Is Headline

Indicates if the article is an opinion piece.

Is Opinion

A list of journalists associated with the article.

Journalists

The language in which the article is written.

Language

Link

Media

The name of the source where the article was published.

Name Source

Indicates if the article is paid content.

Paid Content

Parent URL

Parse Date

Published Date

Published Date Precision

Rank

Rights

True if the article content can be safely accessed according to the publisher's robots.txt rules; false otherwise.


Robots Compliant

Score

Title

English translation of the article title. Available when using the `search_in` parameter with the `title_translated` option or by setting the `include_translation_fields` parameter to `true`.


The Twitter account associated with the article.

Twitter Account

Updated Date

Updated Date Precision

Word Count

Article Object

If true, the `from_` and `to_` parameters use article parse dates instead of published dates. Additionally, the `parse_date` variable is added to the output for each article object.


Filters articles based on the maximum sentiment score of their content.

Range is `-1.0` to `1.0`, where:
- Negative values indicate negative sentiment.
- Positive values indicate positive sentiment.
- Values close to 0 indicate neutral sentiment.

**Note**: The `content_sentiment_max` parameter is only available if NLP is included in your subscription plan.

To learn more, see [NLP features](/docs/v3/documentation/guides-and-concepts/nlp-features).


Filters articles based on the minimum sentiment score of their content.

Range is `-1.0` to `1.0`, where:
- Negative values indicate negative sentiment.
- Positive values indicate positive sentiment.
- Values close to 0 indicate neutral sentiment.

**Note**: The `content_sentiment_min` parameter is only available if NLP is included in your subscription plan.

To learn more, see [NLP features](/docs/v3/documentation/guides-and-concepts/nlp-features).
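
As an illustration of the two content sentiment filters, the following sketch builds the filter parameters and checks them against the documented `-1.0` to `1.0` range before they are sent. The function name `sentiment_filter` is hypothetical, and the check that the minimum does not exceed the maximum is an added client-side assumption rather than documented API behavior.

```python
from typing import Dict, Optional


def sentiment_filter(minimum: Optional[float] = None,
                     maximum: Optional[float] = None) -> Dict[str, float]:
    """Build content sentiment filters, checking the documented -1.0 to 1.0 range."""
    for name, value in (("minimum", minimum), ("maximum", maximum)):
        if value is not None and not -1.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between -1.0 and 1.0, got {value}")
    # Assumption: a minimum above the maximum is treated as a caller mistake.
    if minimum is not None and maximum is not None and minimum > maximum:
        raise ValueError("minimum must not exceed maximum")
    params: Dict[str, float] = {}
    if minimum is not None:
        params["content_sentiment_min"] = minimum
    if maximum is not None:
        params["content_sentiment_max"] = maximum
    return params


# Keep only articles whose content sentiment is clearly positive.
print(sentiment_filter(minimum=0.3))
```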


The countries where the news publisher is located. The accepted format is the two-letter [ISO 3166-1 alpha-2](https://en.wikipedia.org/wiki/ISO_3166-1_alpha-2) code. To select multiple countries, use a comma-separated string or an array of strings.

Examples:
- `"US,CA"`
- `["US", "CA"]`

To learn more, see [Enumerated parameters > Country](/docs/v3/api-reference/overview/enumerated-parameters#country-country-and-not-country).


Filters articles based on a custom taxonomy that is tailored to your specific needs and is accessible only with your API key. To specify tags, use the following pattern:

- `custom_tags.taxonomy=Tag1,Tag2,Tag3`, where `taxonomy` is the taxonomy name and `Tag1,Tag2,Tag3` are comma-separated tags. For POST requests, you can also specify tags as an array of strings.

Examples:
- `custom_tags.industry="Manufacturing, Supply Chain, Logistics"`
- `"custom_tags.industry": ["Manufacturing", "Supply Chain", "Logistics"]`

To learn more, see [Custom tags](/docs/v3/documentation/guides-and-concepts/custom-tags).
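
The `custom_tags.taxonomy` pattern can be sketched as a small helper that produces either the comma-separated form or the array form accepted in POST bodies. The helper name `custom_tags_param` is hypothetical, and the `industry` taxonomy is taken from the examples above.

```python
from typing import Dict, List, Union


def custom_tags_param(taxonomy: str, tags: List[str],
                      as_array: bool = False) -> Dict[str, Union[str, List[str]]]:
    """Return a one-item dict keyed `custom_tags.<taxonomy>`."""
    key = f"custom_tags.{taxonomy}"
    # Array form for POST bodies; comma-separated string otherwise.
    value: Union[str, List[str]] = list(tags) if as_array else ",".join(tags)
    return {key: value}


tags = ["Manufacturing", "Supply Chain", "Logistics"]
print(custom_tags_param("industry", tags))
print(custom_tags_param("industry", tags, as_array=True))
```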


The starting point in time to search from. Accepts date-time strings in ISO 8601 format and plain text strings. The default time zone is UTC. 

Formats with examples:
- YYYY-mm-ddTHH:MM:SS: `2024-07-01T00:00:00`
- YYYY-mm-dd: `2024-07-01`
- YYYY/mm/dd HH:MM:SS: `2024/07/01 00:00:00`
- YYYY/mm/dd: `2024/07/01`
- English phrases: `1 day ago`, `today`

**Note**: By default, applied to the publication date of the article. To use the article's parse date instead, set the `by_parse_date` parameter to `true`.
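
To make the accepted date formats concrete, here is a minimal sketch that maps the four documented patterns to Python `strptime` format strings and interprets the result as UTC, the documented default. The function name `parse_date_param` is hypothetical. English phrases are not parsed locally, on the assumption that they are passed to the API unchanged.

```python
from datetime import datetime, timezone
from typing import Optional

# strptime equivalents of the documented date-time formats.
DATE_FORMATS = (
    "%Y-%m-%dT%H:%M:%S",
    "%Y-%m-%d",
    "%Y/%m/%d %H:%M:%S",
    "%Y/%m/%d",
)


def parse_date_param(value: str) -> Optional[datetime]:
    """Return a UTC datetime for a documented format, or None for other text."""
    for fmt in DATE_FORMATS:
        try:
            parsed = datetime.strptime(value.strip(), fmt)
        except ValueError:
            continue
        # The documented default time zone is UTC.
        return parsed.replace(tzinfo=timezone.utc)
    # Phrases such as "1 day ago" are not parsed here.
    return None


print(parse_date_param("2024/07/01 00:00:00"))
print(parse_date_param("1 day ago"))
```

A local check like this can catch malformed dates before a request is made, but the API remains the authority on what it accepts.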


The lower bound of the news website rank to filter by. A lower rank indicates a more popular source.


If true, filters results to include only articles that have NLP data.

**Note**: NLP coverage and analysis completeness may vary by language, with full data available for articles in English and Arabic. The `has_nlp` parameter is available only in NLP subscription plans.

To learn more, see [NLP features](/docs/v3/documentation/guides-and-concepts/nlp-features).


If true, includes an NLP object for each article in the response. This object provides results of NLP analysis, including article theme, summary, sentiment, tags, and named entity recognition if available.

**Note**: NLP coverage and analysis completeness may vary by language, with full data available for articles in English and Arabic. The `include_nlp_data` parameter is available only in NLP subscription plans.

To learn more, see [NLP features](/docs/v3/documentation/guides-and-concepts/nlp-features).


If true, the response includes translation fields `title_translated_en` and `content_translated_en`.

**Note**: Article translations are available only in the NLP plan.


Filters articles based on International Press Telecommunications Council (IPTC) media topic tags. To specify multiple IPTC tags, use a comma-separated string or an array of strings. 

Examples: 
- `"20000199, 20000209"`
- `["20000199", "20000209"]`

**Note**: The `iptc_tags` parameter is only available in the `v3_nlp_iptc_tags` subscription plan.

To learn more, see [IPTC Media Topic NewsCodes](https://www.iptc.org/std/NewsCodes/treeview/mediatopic/mediatopic-en-GB.html).


If true, only returns articles that were posted on the home page of a given news domain.


If true, returns only opinion pieces. If false, excludes opinion-based articles and returns news only.


If false, returns only articles that have publicly available complete content. Some publishers partially block content, so this setting ensures that only full articles are retrieved.


The language(s) of the search. The only accepted format is the two-letter [ISO 639-1](https://en.wikipedia.org/wiki/ISO_639-1) code. To select multiple languages, use a comma-separated string or an array of strings.

Examples:
- `"en,es"`
- `["en", "es"]`

To learn more, see [Enumerated parameters > Language](/docs/v3/api-reference/overview/enumerated-parameters#language-lang-and-not-lang).


A list of named entities identified in the article.

The number of times this entity appears in the article.

The name of the entity identified in the article.

The name of a person, organization, location, product, or other named entity to search for. To specify multiple names, use a comma-separated string.

Example: `"Tesla, Amazon"`


Natural Language Processing data for the article.

IAB content taxonomy paths identified in the article content. Each path represents a hierarchical category following the IAB content standard.

**Note**: The `iab_tags_name` field is only available in the `v3_nlp_iptc_tags` subscription plan.


IPTC media topic numeric codes identified in the article content. These codes correspond to the standardized IPTC media topic taxonomy.

**Note**: The `iptc_tags_id` field is only available in the `v3_nlp_iptc_tags` subscription plan.


IPTC media topic taxonomy paths identified in the article content. Each path represents a hierarchical category following the IPTC standard.

**Note**: The `iptc_tags_name` field is only available in the `v3_nlp_iptc_tags` subscription plan.


Named Entity Recognition for location entities (cities, countries, geographic features).

Named Entity Recognition for miscellaneous entities (events, nationalities, products).

Named Entity Recognition for organization entities (company names, institutions).

Named Entity Recognition for person entities (individuals' names).

A dense 1024-dimensional vector representation of the article content, generated using the [multilingual-e5-large](https://huggingface.co/intfloat/multilingual-e5-large) model.

**Note**: The `new_embedding` field is only available in the `v3_local_news_nlp_embeddings` subscription plan.


A brief AI-generated summary of the article content.

A brief AI-generated summary of the article's English translation.


The publisher location countries to exclude from the search. The accepted format is the two-letter [ISO 3166-1 alpha-2](https://en.wikipedia.org/wiki/ISO_3166-1_alpha-2) code. To exclude multiple countries, use a comma-separated string or an array of strings.

Examples: 
- `"GB,FR"`
- `["GB", "FR"]`

To learn more, see [Enumerated parameters > Country](/docs/v3/api-reference/overview/enumerated-parameters#country-country-and-not-country).


Inverse of the `iptc_tags` parameter. Excludes articles based on International Press Telecommunications Council (IPTC) media topic tags. To specify multiple IPTC tags to exclude, use a comma-separated string or an array of strings. 

Examples: 
- `"20000205, 20000209"`
- `["20000205", "20000209"]`

**Note**: The `not_iptc_tags` parameter is only available in the `v3_nlp_iptc_tags` subscription plan.

To learn more, see [IPTC Media Topic NewsCodes](https://www.iptc.org/std/NewsCodes/treeview/mediatopic/mediatopic-en-GB.html).


The language(s) to exclude from the search. The accepted format is the two-letter [ISO 639-1](https://en.wikipedia.org/wiki/ISO_639-1) code. To exclude multiple languages, use a comma-separated string or an array of strings.

Examples:
- `"fr,de"`
- `["fr", "de"]`

To learn more, see [Enumerated parameters > Language](/docs/v3/api-reference/overview/enumerated-parameters#language-lang-and-not-lang).


The news sources to exclude from the search. To exclude multiple sources, use a comma-separated string or an array of strings.

Examples: 
- `"cnn.com, wsj.com"`
- `["cnn.com", "wsj.com"]`


Inverse of the `theme` parameter. Excludes articles based on their general topic, as determined by NLP analysis. To exclude multiple themes, use a comma-separated string or an array of strings. 

Examples: 
- `"Crime, Tech"`
- `["Crime", "Tech"]`

**Note**: The `not_theme` parameter is only available if NLP is included in your subscription plan.

To learn more, see [NLP features](/docs/v3/documentation/guides-and-concepts/nlp-features).


The page number to scroll through the results. Use for pagination, as a single API response can return up to 1,000 articles. 

For details, see [How to paginate large datasets](https://www.newscatcherapi.com/docs/v3/documentation/how-to/paginate-large-datasets).


The number of articles to return per page.
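
The page number and page size parameters can be combined into a simple pagination loop. The sketch below is one possible approach, not the method from the pagination guide: it assumes pages are numbered from 1 and stops when a page returns fewer articles than requested. `fetch_page` stands in for a real API call, and `fake_fetch` is a hypothetical in-memory substitute so the example runs without network access.

```python
from typing import Any, Callable, Dict, Iterator, List

Article = Dict[str, Any]


def iter_articles(fetch_page: Callable[[int, int], List[Article]],
                  page_size: int = 100) -> Iterator[Article]:
    """Yield articles page by page until a page comes back short or empty."""
    page = 1  # assumption: page numbering starts at 1
    while True:
        articles = fetch_page(page, page_size)
        yield from articles
        # A page with fewer items than requested is taken to be the last one.
        if len(articles) < page_size:
            break
        page += 1


# Stand-in for a real API call: serves 250 fake articles from memory.
FAKE_RESULTS = [{"title": f"Article {i}"} for i in range(250)]


def fake_fetch(page: int, page_size: int) -> List[Article]:
    start = (page - 1) * page_size
    return FAKE_RESULTS[start:start + page_size]


print(sum(1 for _ in iter_articles(fake_fetch, page_size=100)))  # 250
```

When the total is an exact multiple of the page size, this loop makes one extra request that returns an empty page. If the response reports a total page count, using it avoids that request.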


The categorical URL(s) to filter your search. To filter your search by multiple categorical URLs, use a comma-separated string or an array of strings.

Examples: 
- `"wsj.com/politics,wsj.com/tech"`
- `["wsj.com/politics", "wsj.com/tech"]`


Predefined top news sources per country. 

Format: start with the word `top`, followed by the number of desired sources, and then the two-letter country code [ISO 3166-1 alpha-2](https://en.wikipedia.org/wiki/ISO_3166-1_alpha-2). Multiple countries with the number of top sources can be specified as a comma-separated string or an array of strings. 

Examples: 
- `"top 100 US"`
- `"top 33 AT"`
- `"top 50 US, top 20 GB"`
- `["top 50 US", "top 20 GB"]`
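
The `top <number> <country>` format can be generated and checked with a short regular expression. This is a sketch under the format described above. The names `top_sources` and `parse_top_sources` are hypothetical, and the pattern verifies only the shape of the country code, not whether the code is actually assigned.

```python
import re
from typing import List, Tuple

# "top", a count, and a two-letter uppercase country code.
TOP_PATTERN = re.compile(r"^top (\d+) ([A-Z]{2})$")


def top_sources(count: int, country: str) -> str:
    """Format one predefined-sources entry such as "top 50 US"."""
    entry = f"top {count} {country.upper()}"
    if count < 1 or not TOP_PATTERN.match(entry):
        raise ValueError(f"invalid predefined sources entry: {entry!r}")
    return entry


def parse_top_sources(value: str) -> List[Tuple[int, str]]:
    """Split a comma-separated value into (count, country) pairs."""
    pairs = []
    for part in value.split(","):
        match = TOP_PATTERN.match(part.strip())
        if match is None:
            raise ValueError(f"invalid predefined sources entry: {part.strip()!r}")
        pairs.append((int(match.group(1)), match.group(2)))
    return pairs


print(top_sources(50, "us"))
print(parse_top_sources("top 50 US, top 20 GB"))
```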


The precision of the published date. There are three types:
- `full`: The day and time of an article are correctly identified with the appropriate timezone.
- `timezone unknown`: The day and time of an article are correctly identified without timezone.
- `date`: Only the day is identified without an exact time.


The keyword(s) to search for in articles. Query syntax supports logical operators (`AND`, `OR`, `NOT`) and wildcards:

- For an exact match, use double quotes. For example, `"technology news"`.
- Use `*` to search for any keyword.
- Use `+` to include and `-` to exclude specific words or phrases. For example, `+Apple`, `-Google`.
- Use `AND`, `OR`, and `NOT` to refine search results. For example, `technology AND (Apple OR Microsoft) NOT Google`.

For more details, see [Advanced querying](/docs/v3/documentation/guides-and-concepts/advanced-querying).
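
The query syntax above lends itself to small string helpers. The following sketch composes the documented example query from its parts. The helper names `phrase`, `any_of`, and `build_query` are hypothetical, and the helpers do not escape special characters inside terms.

```python
from typing import Iterable


def phrase(text: str) -> str:
    """Wrap text in double quotes for an exact match."""
    return f'"{text}"'


def any_of(terms: Iterable[str]) -> str:
    """Join terms with OR inside parentheses."""
    return "(" + " OR ".join(terms) + ")"


def build_query(required: str, alternatives: Iterable[str], excluded: str) -> str:
    """Combine a required term, a group of alternatives, and an excluded term."""
    return f"{required} AND {any_of(alternatives)} NOT {excluded}"


print(build_query("technology", ["Apple", "Microsoft"], "Google"))
# technology AND (Apple OR Microsoft) NOT Google
print(phrase("technology news"))
# "technology news"
```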


If true, limits the search to sources ranked in the top 1 million websites. If false, also includes unranked sources, which are assigned a rank of 999999.


If true, returns only articles/sources that comply with the publisher's robots.txt rules. If false, returns only articles/sources that do not comply with robots.txt rules. If omitted, returns all articles/sources regardless of compliance status.


The article fields to search in. Use a comma-separated string for multiple options, with a maximum of 2 in a single request.

Available options: 
- Standard fields: `title`, `content`, `summary`, `title_content`
- Translation fields: `title_translated`, `content_translated`, `summary_translated`, `title_content_translated`

**Note**: The summary and translation options are available only in NLP subscription plans.
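
A client can check a `search_in` value locally before sending it. The sketch below validates options against the lists above and enforces the documented maximum of 2 per request. The function name `search_in_value` is hypothetical, and plan availability of the summary and translation options is not checked.

```python
from typing import Sequence

STANDARD_FIELDS = ("title", "content", "summary", "title_content")
TRANSLATION_FIELDS = (
    "title_translated",
    "content_translated",
    "summary_translated",
    "title_content_translated",
)
MAX_OPTIONS = 2  # documented limit per request


def search_in_value(options: Sequence[str]) -> str:
    """Validate the options and join them into a comma-separated string."""
    allowed = set(STANDARD_FIELDS) | set(TRANSLATION_FIELDS)
    unknown = [opt for opt in options if opt not in allowed]
    if unknown:
        raise ValueError(f"unsupported search_in options: {unknown}")
    if not 1 <= len(options) <= MAX_OPTIONS:
        raise ValueError(f"expected 1 to {MAX_OPTIONS} options, got {len(options)}")
    return ",".join(options)


print(search_in_value(["title", "content_translated"]))  # title,content_translated
```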


Sentiment scores for the article's title and content.

The sentiment score for the article content (-1.0 to 1.0).

The sentiment score for the article title (-1.0 to 1.0).

The sorting order of the results. Possible values are:
- `relevancy`: The most relevant results first.
- `date`: The most recently published results first.
- `rank`: The results from the highest-ranked sources first.


One or more news sources to narrow down the search. The format must be a domain URL. Subdomains, such as `finance.yahoo.com`, are also acceptable. To specify multiple sources, use a comma-separated string or an array of strings.

Examples: 
- `"nytimes.com, theguardian.com"`
- `["nytimes.com", "theguardian.com"]`


Filters articles based on the maximum sentiment score of their titles.

Range is `-1.0` to `1.0`, where:
- Negative values indicate negative sentiment.
- Positive values indicate positive sentiment.
- Values close to 0 indicate neutral sentiment.

**Note**: The `title_sentiment_max` parameter is only available if NLP is included in your subscription plan.

To learn more, see [NLP features](/docs/v3/documentation/guides-and-concepts/nlp-features).


Filters articles based on the minimum sentiment score of their titles.

Range is `-1.0` to `1.0`, where:
- Negative values indicate negative sentiment.
- Positive values indicate positive sentiment.
- Values close to 0 indicate neutral sentiment.

**Note**: The `title_sentiment_min` parameter is only available if NLP is included in your subscription plan.

To learn more, see [NLP features](/docs/v3/documentation/guides-and-concepts/nlp-features).


The ending point in time to search up to. Accepts date-time strings in ISO 8601 format and plain text strings. The default time zone is UTC. 

Formats with examples:
- YYYY-mm-ddTHH:MM:SS: `2024-07-01T00:00:00`
- YYYY-mm-dd: `2024-07-01`
- YYYY/mm/dd HH:MM:SS: `2024/07/01 00:00:00`
- YYYY/mm/dd: `2024/07/01`
- English phrases: `1 day ago`, `today`

**Note**: By default, applied to the publication date of the article. To use the article's parse date instead, set the `by_parse_date` parameter to `true`.


The upper bound of the news website rank to filter by. A lower rank indicates a more popular source.


The user input parameters for the request.

The maximum number of words an article can contain. Use it to exclude very long articles.


The minimum number of words an article must contain. Use it to exclude very short articles.


Request body for searching similar articles based on specified criteria such as query, language, country, source, and more.
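
As a rough illustration of assembling a request body, the sketch below prepares a JSON POST request with the Python standard library without sending it. Several names are assumptions to verify against the API reference: `API_URL` is a placeholder rather than the real endpoint, the `x-api-token` header name and the `q` query key are not confirmed by this page, and `build_request` is a hypothetical helper. The remaining body keys are parameters named on this page.

```python
import json
import urllib.request

# Placeholders: take the real endpoint URL and auth header name from the API reference.
API_URL = "https://api.example.invalid/search_similar"
AUTH_HEADER = "x-api-token"  # assumption, not confirmed by this page


def build_request(api_key: str, body: dict) -> urllib.request.Request:
    """Prepare a JSON POST request without sending it."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", AUTH_HEADER: api_key},
        method="POST",
    )


body = {
    "q": "technology AND (Apple OR Microsoft) NOT Google",  # "q" is an assumed key
    "search_in": "title,content",
    "from_": "1 day ago",
    "by_parse_date": False,
    "include_nlp_data": True,
}
request = build_request("YOUR_API_KEY", body)
# Sending is left out on purpose; urllib.request.urlopen(request) would perform the call.
print(request.get_method(), sorted(json.loads(request.data)))
```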

ApiKeyAuth

A successful response containing articles similar to the specified query. If there are no matches, returns a failed search response according to the defined schema.

The response model for a successful `Search similar` request. Response field behavior:
- Required fields are guaranteed to be present and non-null. 
- Optional fields may be `null` or `undefined` if the data point is not present or couldn't be extracted during processing.
- To access article properties in the `articles` response array, use array index notation. For example, `articles[n].title`, where `n` is the zero-based index of the article object (0, 1, 2, etc.).
- The `nlp` property within the article object `articles[n].nlp` is only available with NLP-enabled subscription plans.
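
The field behavior described above suggests reading optional fields defensively. The following sketch walks a hand-written sample shaped like the `articles` array and tolerates a missing `nlp` object. The sample data is invented for illustration, the `summary` key inside `nlp` is an assumption based on the NLP field descriptions, and `summarize` is a hypothetical helper.

```python
from typing import Any, Dict, List, Optional

# Hand-written sample, not real API output.
sample_response: Dict[str, Any] = {
    "articles": [
        {"title": "First article", "nlp": {"summary": "Short summary."}},
        {"title": "Second article"},  # no NLP data for this article
    ]
}


def summarize(response: Dict[str, Any]) -> List[Dict[str, Optional[str]]]:
    """Collect each title with its NLP summary, or None when NLP data is absent."""
    rows = []
    for article in response.get("articles", []):
        # Optional objects may be missing or null, so fall back to an empty dict.
        nlp = article.get("nlp") or {}
        rows.append({"title": article.get("title"), "summary": nlp.get("summary")})
    return rows


for row in summarize(sample_response):
    print(row)
```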

The base response model containing common fields for search operations.

Search Similar Response

The response model for a failed `Search Similar` request.
The base response model containing common fields for search operations.

Failed Search Similar Response
