This guide outlines the key differences between NewsCatcher News API v2 and v3. It provides a technical comparison to help you understand the changes and prepare for migration.

This guide only covers the changes for v2/v3 shared endpoints. To learn about other v3 endpoints, their parameters, and response fields, see the API Reference.

Base API changes

Infrastructure updates

Feature | v2 | v3
Base URL | api.newscatcherapi.com/v2 | v3-api.newscatcherapi.com/api
Authentication header | x-api-key | x-api-token
Maximum articles/request | 100 | 1,000
Historical data | Since 2019 | Since July 2023*

* All v2 historical data will be available in v3 starting Q1 2025 and will initially include core functionality only. Advanced features like NLP analysis, clustering, and deduplication can be implemented for historical data upon request.
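
As a rough illustration of the infrastructure differences, the sketch below builds the request URL and authentication header for either version. The https:// scheme, the build_request helper, and its argument names are assumptions made for this example; only the hosts, header names, and per-request limits come from the comparison above.

```python
# Connection settings copied from the v2/v3 comparison.
# The https:// scheme is an assumption; the guide lists hosts without one.
API_SETTINGS = {
    "v2": {
        "base_url": "https://api.newscatcherapi.com/v2",
        "auth_header": "x-api-key",
        "max_articles": 100,
    },
    "v3": {
        "base_url": "https://v3-api.newscatcherapi.com/api",
        "auth_header": "x-api-token",
        "max_articles": 1000,
    },
}


def build_request(version, endpoint, api_key, wanted_articles):
    """Hypothetical helper: return (url, headers, articles_per_request)."""
    settings = API_SETTINGS[version]
    url = f"{settings['base_url']}/{endpoint.lstrip('/')}"
    headers = {settings["auth_header"]: api_key}
    # Never ask for more articles than the version allows per request.
    capped = min(wanted_articles, settings["max_articles"])
    return url, headers, capped
```

Keeping the version-specific values in one table like this makes it easier to run both versions side by side while testing.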

Available endpoints

v2 | v3 | Change
/search | /search | Enhanced with additional filtering capabilities, NLP features, clustering, and deduplication
/latest_headlines | /latest_headlines | Enhanced with additional filtering capabilities, NLP features, clustering, and deduplication
/sources | /sources | Enhanced with additional filtering capabilities
- | /authors | Search by author name
- | /search_by_link | Search by URL or article ID
- | /search_similar | Find similar articles
- | /aggregation_count | Get aggregation count by interval
- | /subscription | View subscription info

Method support

  • Both v2 and v3 support GET and POST methods for all endpoints
  • Multiple value parameter formats:
    • v2: Regardless of the method, most parameters use comma-separated strings, with some exceptions (e.g., search_in uses underscore-separated strings)
    • v3: More consistent formatting:
      • GET: Supports a comma-separated string
      • POST: Supports both a comma-separated string and an array of strings
  • Single-value parameters maintain their respective formats in both versions
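
The formatting rules above could be captured in a small serializer along these lines. This is a sketch only: format_multi_value and its arguments are hypothetical names, and search_in is the only v2 exception it knows about because it is the only one named here.

```python
def format_multi_value(values, version, method, name=""):
    """Hypothetical helper: serialize a multi-value parameter for a request."""
    values = list(values)
    if version == "v2":
        # v2 uses comma-separated strings for GET and POST alike,
        # except search_in, which is underscore-separated.
        separator = "_" if name == "search_in" else ","
        return separator.join(values)
    if method.upper() == "POST":
        # v3 POST accepts an array of strings
        # (a comma-separated string is also accepted).
        return values
    # v3 GET takes a comma-separated string.
    return ",".join(values)
```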

Parameter changes

Renaming

  • from -> from_
  • to -> to_
  • topic -> theme

The topic parameter is removed from the /sources endpoint. Instead, you have new filtering capabilities. To learn more, see Retrieve sources.
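
One possible way to apply these renames to an existing v2 parameter dictionary is sketched below. The rename_params helper is hypothetical, and it only renames keys: values such as topic categories are left unchanged and may still need converting.

```python
# v2 -> v3 parameter renames listed in this section.
PARAM_RENAMES = {"from": "from_", "to": "to_", "topic": "theme"}


def rename_params(v2_params, endpoint="/search"):
    """Hypothetical helper: return a copy of v2_params with v3 parameter names."""
    v3_params = {}
    for name, value in v2_params.items():
        if name == "topic" and endpoint == "/sources":
            # topic has no v3 counterpart on /sources, so it is dropped.
            continue
        v3_params[PARAM_RENAMES.get(name, name)] = value
    return v3_params
```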

Query parameter (q)

Aspect | v2 | v3 | Change
Required | Yes | Yes | No change
Query operators | Exact match with quotes ("keyword"); Boolean: AND, OR, NOT; Wildcards: * and ?; Must/Must not: + and -; Grouping with () | Same as v2, plus the NEAR and COUNT operators | Enhanced operators
Default behavior | Space-separated tokens treated as AND | Same as v2 | No change

Search fields (search_in)

Aspect | v2 | v3 | Change
Field name for article title | title | title | No change
Field name for article content | summary | content | Renamed to reflect actual content
Default value | "title_summary" | "title,content" | Functionally equivalent
LLM-generated summary | Not available | summary (requires NLP) | New feature
Multiple values | Underscore-separated string, e.g. "title_summary" | GET: comma-separated string, e.g. "title,content"; POST: comma-separated string or array | Format standardization
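
A v2 search_in value could be translated to its v3 equivalent roughly as follows. The convert_search_in helper is a hypothetical name; the field mapping is taken from the table above, where v2 summary corresponds to v3 content.

```python
# v2 field name -> v3 field name, per the search_in comparison.
# In v3, "summary" refers to the LLM-generated summary instead,
# so a v2 "summary" must become "content".
V2_TO_V3_SEARCH_FIELDS = {"title": "title", "summary": "content"}


def convert_search_in(v2_value, method="GET"):
    """Hypothetical helper: convert e.g. "title_summary" to the v3 format."""
    fields = [V2_TO_V3_SEARCH_FIELDS[field] for field in v2_value.split("_")]
    if method.upper() == "POST":
        return fields  # v3 POST also accepts an array of strings
    return ",".join(fields)
```

An unknown v2 field raises KeyError, which is deliberate here: it surfaces values the mapping does not cover instead of passing them through silently.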

Content classification (topic -> theme)

Aspect | v2 | v3 | Change
Parameter name | topic | theme | Renamed
Case format | Lowercase, e.g. "tech" | Capitalized, e.g. "Tech" | Updated format
Available categories | 15 lowercase categories | 17 capitalized categories | Expanded
New categories in v3 | - | "Health", "Crime", "Financial Crime", "Lifestyle", "Automotive", "Weather", "General" | Added
Removed v2 categories | "beauty", "music", "food", "gaming" | Consolidated into new categories | Category restructuring
Multiple values | Comma-separated string | GET: comma-separated string; POST: comma-separated string or array | Enhanced POST format
Exclusion option | Not available | not_theme parameter | New feature
NLP dependency | No | Yes | New requirement
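
Converting topic values to theme values might look like the sketch below. It assumes that a v2 topic maps to the theme with the same name capitalized, which this guide confirms only for "tech" -> "Tech"; anything else is checked against the theme names this guide lists for nlp.theme, and the removed v2 categories need an explicit mapping that you supply. The topic_to_theme helper and its overrides argument are hypothetical.

```python
# Theme names as listed for the nlp.theme response field in this guide.
V3_THEMES = {
    "Business", "Economics", "Entertainment", "Finance", "Health", "Politics",
    "Science", "Sports", "Tech", "Crime", "Financial Crime", "Lifestyle",
    "Automotive", "Travel", "Weather", "General",
}
# v2 categories with no direct v3 counterpart.
REMOVED_V2_TOPICS = {"beauty", "music", "food", "gaming"}


def topic_to_theme(topic, overrides=None):
    """Hypothetical helper: map a v2 topic to a v3 theme, or raise ValueError."""
    overrides = overrides or {}
    if topic in overrides:
        return overrides[topic]
    if topic in REMOVED_V2_TOPICS:
        raise ValueError(f"'{topic}' was removed in v3; supply an override")
    candidate = topic.capitalize()  # assumption: same name, capitalized
    if candidate not in V3_THEMES:
        raise ValueError(f"no known v3 theme for '{topic}'; supply an override")
    return candidate
```

Raising on anything unrecognized keeps the guesswork out of the mapping: each unclear category has to be decided explicitly.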

New v3 parameters

Parameters are grouped by their availability in different subscription plans. For detailed plan information, see Subscription plans.

Core features

Available in all v3 plans, including v3_basic.

Content classification

Parameter | Type | Description
is_headline | boolean | Filters for articles that were posted on the home page of a given news domain
is_opinion | boolean | Filters for opinion pieces when true, or excludes opinion-based articles when false
is_paid_content | boolean | Filters out articles with paywalled content when false
word_count_min | integer | Filters articles based on minimum word count
word_count_max | integer | Filters articles based on maximum word count

URLs

Parameter | Type | Description
parent_url | string | Filters articles by categorical URLs (e.g., “wsj.com/politics”)
all_links | string | Filters articles by mentioned URLs within their content
all_domain_links | string | Filters articles by mentioned domain names within their content

Author

Parameter | Type | Description
not_author_name | string | Excludes articles by specified authors

Date

Parameter | Type | Description
by_parse_date | boolean | Uses parse dates instead of published dates for date filtering

Source

Parameter | Type | Description
predefined_sources | string | Filters by predefined top sources per country (e.g., “top 100 US”)
additional_domain_info | boolean | Includes extra metadata about the source domain
is_news_domain | boolean | Filters for news domain sources only
news_domain_type | string | Filters by domain type (Original Content, Aggregator, etc.)
news_type | string | Filters by news type categories

Advanced features

Available in specific subscription plans.

Natural language processing

Requires v3_nlp plan or higher.

Parameter | Type | Description
include_nlp_data | boolean | Includes NLP analysis layer with enhanced information
has_nlp | boolean | Filters for articles that have NLP analysis available
theme | string | Replaces topic parameter with expanded categories and NLP integration
not_theme | string | Excludes articles with specified themes
ORG_entity_name | string | Filters articles mentioning specific organization names
PER_entity_name | string | Filters articles mentioning specific person names
LOC_entity_name | string | Filters articles mentioning specific location names
MISC_entity_name | string | Filters articles mentioning other named entities
title_sentiment_min | float | Filters articles by minimum title sentiment score (-1 to 1)
title_sentiment_max | float | Filters articles by maximum title sentiment score (-1 to 1)
content_sentiment_min | float | Filters articles by minimum content sentiment score (-1 to 1)
content_sentiment_max | float | Filters articles by maximum content sentiment score (-1 to 1)
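
Because the sentiment filters share one naming pattern and one range, they can be assembled and validated before a request is sent. The sketch below is one way to do that; sentiment_filter is a hypothetical helper, and the client-side range check is a convenience, not something the API is stated to require.

```python
def sentiment_filter(field, minimum=None, maximum=None):
    """Hypothetical helper: build {field}_sentiment_min/max request parameters."""
    if field not in ("title", "content"):
        raise ValueError("field must be 'title' or 'content'")
    params = {}
    for suffix, bound in (("min", minimum), ("max", maximum)):
        if bound is None:
            continue
        # Sentiment scores range from -1 to 1.
        if not -1.0 <= bound <= 1.0:
            raise ValueError(f"{suffix} bound {bound} is outside [-1, 1]")
        params[f"{field}_sentiment_{suffix}"] = bound
    if minimum is not None and maximum is not None and minimum > maximum:
        raise ValueError("minimum cannot be greater than maximum")
    return params
```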

Clustering and deduplication

Requires v3_nlp plan or higher.

Parameter | Type | Description
clustering_enabled | boolean | Enables grouping of similar articles into clusters
clustering_variable | string | Specifies which part of the article to use for clustering (“content”, “title”, or “summary”)
clustering_threshold | float | Sets similarity threshold for clustering (range: 0-1)
exclude_duplicates | boolean | Removes duplicate and highly similar articles from results

Tagging

Requires the v3_nlp_iptc_tags subscription plan.

Parameter | Type | Description
iptc_tags | string | Filters articles by IPTC media topic tags
not_iptc_tags | string | Excludes articles with specific IPTC media topic tags
iab_tags | string | Filters articles by IAB content categories
not_iab_tags | string | Excludes articles with specific IAB content categories

Custom tags

Custom tags are available in all the NLP plans as a custom solution that provides tailored content classification using your organization’s taxonomy. For implementation details and examples, see Custom tags.

Response changes v2 vs v3

Field renaming

The following fields have been renamed in v3 for better clarity and consistency:

v2 | v3 | Type | Description
clean_url | domain_url | string | Base domain of the source
excerpt | description | string | Brief article description
summary | content | string | Full article content
_score | score | number | Relevancy score
_id | id | string | Unique article identifier
topic | theme | string | Available in v3 with NLP enabled
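
If existing code expects v2 field names, a thin adapter can present v3 articles in the old shape during the transition. The sketch below shows the idea; to_v2_shape is a hypothetical helper, and it deliberately skips topic/theme because the value formats differ between versions.

```python
# v2 -> v3 response field renames from the table above (topic/theme omitted).
V2_TO_V3_FIELDS = {
    "clean_url": "domain_url",
    "excerpt": "description",
    "summary": "content",
    "_score": "score",
    "_id": "id",
}
V3_TO_V2_FIELDS = {v3: v2 for v2, v3 in V2_TO_V3_FIELDS.items()}


def to_v2_shape(v3_article):
    """Hypothetical helper: copy a v3 article dict using v2 names where renamed."""
    return {V3_TO_V2_FIELDS.get(key, key): value for key, value in v3_article.items()}
```

Fields without a v2 counterpart pass through unchanged, so v3-only data stays available to code that wants it.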

New fields in v3

Article object

The following fields are available in all v3 plans:

Field | Type | Description
full_domain_url | string | Complete domain with subdomain
name_source | string | Publisher name
is_headline | boolean | Homepage article indicator
paid_content | boolean | Paywall indicator
parent_url | string | Category/section URL
journalists | array | Array of journalist names
word_count | integer | Article length
updated_date | string | Last update timestamp
updated_date_precision | string | Update time precision
all_links | array | URLs mentioned in article
all_domain_links | array | Domains mentioned in article

Natural language processing (NLP) object

The NLP object is part of the article object and is available for all NLP plans (v3_nlp plan or higher) when include_nlp_data=true.

Field | Type | Description
nlp | object | Natural Language Processing analysis results for the article content.

Article understanding

Field | Type | Description
nlp.summary | string | AI-generated concise summary of article content
nlp.theme | array[string] | High-level thematic categories from fixed set: Business, Economics, Entertainment, Finance, Health, Politics, Science, Sports, Tech, Crime, Financial Crime, Lifestyle, Automotive, Travel, Weather, General

Sentiment analysis

Field | Type | Description
nlp.sentiment.title | number | Sentiment score for article title (range: -1 to 1, negative values indicate negative sentiment)
nlp.sentiment.content | number | Sentiment score for article content (range: -1 to 1, negative values indicate negative sentiment)

Named entity recognition (NER)

Field | Type | Description
nlp.ner_PER | array[object] | Named entities recognized as persons
nlp.ner_ORG | array[object] | Named entities recognized as organizations
nlp.ner_LOC | array[object] | Named entities recognized as locations
nlp.ner_MISC | array[object] | Named entities recognized as other types (events, products, etc.)

Each NER object contains:

{
  "entity_name": "string", // Recognized entity name
  "count": "integer" // Number of mentions in the article
}
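
As an example of consuming these NER objects, the sketch below ranks the entities of one type by mention count. The top_entities helper is hypothetical, and it assumes the nlp object has already been extracted from an article as a plain dictionary.

```python
def top_entities(nlp, entity_type, limit=3):
    """Hypothetical helper: most-mentioned entities of one type (PER, ORG, LOC, MISC)."""
    entities = nlp.get(f"ner_{entity_type}") or []
    # Sort by the per-article mention count, highest first.
    ranked = sorted(entities, key=lambda entity: entity["count"], reverse=True)
    return [(entity["entity_name"], entity["count"]) for entity in ranked[:limit]]
```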

Tags

Available for the v3_nlp_iptc_tags subscription plan.

Field | Type | Description
nlp.iab_tags_name | array[string] | Interactive Advertising Bureau content categorization
nlp.iptc_tags_name | array[string] | International Press Telecommunications Council subject names
nlp.iptc_tags_id | array[string] | International Press Telecommunications Council subject IDs

Vector representation

Available for the v3_nlp_embeddings plan.

Field | Type | Description
nlp.new_embedding | array[number] | 1024-dimensional vector embedding for semantic similarity comparison (v3_nlp_embeddings plan only)

Clustering data

Available for all NLP plans when clustering_enabled=true:

Field | Type | Description
clusters_count | integer | Total number of clusters in the response
clusters | array | Array of cluster objects
cluster_id | string | Unique identifier for each cluster
cluster_size | integer | Number of articles in the cluster
articles | array | Array of article objects in the cluster
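
Code that previously read a flat list of articles needs to walk the clusters instead when clustering is enabled. The sketch below assumes that cluster_id, cluster_size, and articles sit inside each item of clusters, which is inferred from the field descriptions and not stated explicitly; summarize_clusters is a hypothetical helper.

```python
def summarize_clusters(response):
    """Hypothetical helper: (cluster_id, cluster_size, first article title) per cluster."""
    summaries = []
    for cluster in response.get("clusters", []):
        articles = cluster.get("articles", [])
        # Use the first article's title as a label for the cluster, if present.
        first_title = articles[0].get("title") if articles else None
        summaries.append((cluster["cluster_id"], cluster["cluster_size"], first_title))
    return summaries
```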

Deduplication data

Available for all NLP plans when exclude_duplicates=true:

Field | Type | Description
duplicate_count | integer | Number of duplicate articles found
duplicate_articles_group_id | string | Unique identifier for the duplicate group

Source object

Enhanced source information available in all v3 plans:

Field | Type | Description
name_source | string | Publisher name
domain_url | string | Base domain URL
logo | string | Source logo URL
additional_info | object | Extended source data

Additional info object fields:

Field | Type | Description
nb_articles_for_7d | integer | Articles published in last week
country | string | Source country code
rank | integer | SEO rank
is_news_domain | boolean | Indicates if domain is a news source
news_domain_type | string | Type of news domain
news_type | string | Category of news content

Removed response fields in v3

  • topic: Replaced by theme in NLP features for the /search and /latest_headlines endpoints. The field is unavailable for the /sources endpoint as the corresponding parameter has been removed.
  • is_republisher: Replaced by more detailed domain classification.

Error response changes

Format

Status codes

Code | v2 description | v3 description
400 | API not in headers | Bad request - Invalid JSON
401 | API Key not found | Unauthorized
403 | Not present | Plan limits exceeded
406 | Wrong parameter | Not present
408 | Request Timeout | Request Timeout
422 | Not present | Validation Error
429 | Concurrency violated | Rate limit exceeded
500 | Not present | Internal server error
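
Error handling written for v2 may rely on codes that v3 no longer returns, and the reverse. One way to keep track of this is a lookup like the sketch below; describe_status is a hypothetical helper, and None stands for "Not present" in the table above.

```python
# code -> (v2 description, v3 description); None means the code is not present.
STATUS_DESCRIPTIONS = {
    400: ("API not in headers", "Bad request - Invalid JSON"),
    401: ("API Key not found", "Unauthorized"),
    403: (None, "Plan limits exceeded"),
    406: ("Wrong parameter", None),
    408: ("Request Timeout", "Request Timeout"),
    422: (None, "Validation Error"),
    429: ("Concurrency violated", "Rate limit exceeded"),
    500: (None, "Internal server error"),
}


def describe_status(code, version):
    """Hypothetical helper: documented meaning of a status code, or None."""
    v2_text, v3_text = STATUS_DESCRIPTIONS.get(code, (None, None))
    return v2_text if version == "v2" else v3_text
```

For instance, a v2 handler that treats 406 as a parameter error would need to look at 422 in v3 instead.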

SDKs

  • v2: Python SDK only
  • v3: SDKs for:
    • Python
    • TypeScript
    • Go
    • Java
    • C#

All v3 SDKs provide complete support for both core and advanced features. For implementation details, see the Libraries documentation.

Timeline and support

Migration timeline

  • v2 supported until Q1 2025.
  • Historical data:
    • v3 data available since July 2023.
    • v2 historical data migration to v3 planned for Q1 2025.
    • Initial historical data will include core features only.
    • Advanced features (NLP, clustering, etc.) available for historical data upon request.

Support during migration

  • Both versions are available for parallel testing.
  • Automatic migration to v3_basic plan for existing v2 customers.
  • All new v3 endpoints accessible in v3_basic plan.
  • Advanced features require specific plans:
    • NLP features: v3_nlp plan.
    • IPTC and IAB tags: v3_nlp_iptc_tags plan.
    • Embeddings: v3_nlp_embeddings plan.
    • Custom tags: Available as a custom solution in all v3 NLP plans.

Next steps

  1. Review the Migration guide for implementation details.
  2. Explore plan features and requirements in Subscription plans.
  3. Check the version-specific API Reference for detailed parameter and response field documentation.
  4. Test v3 endpoints alongside your v2 implementation.
  5. For advanced features:

For implementation support or custom solutions, contact our support team.