As part of the NewsCatcher processing pipeline, each article is enriched with NLP data before it is indexed: theme classification, sentiment scores, named entities, content tags, and vector embeddings. News API exposes these fields in the response when you set include_nlp_data to true.
NLP enrichment is available only for articles indexed from July 2023 onward. For earlier articles, the API returns "nlp": {}. To request NLP enrichment for historical articles, contact support@newscatcherapi.com.
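As a rough illustration of the two points above, the sketch below builds a query string that turns on NLP data and checks whether a returned article actually carries enrichment. Only include_nlp_data, include_translation_fields, and the empty "nlp": {} case come from this page; the q parameter name, the serialization of booleans as the string "true", and the helper names are assumptions made for the example.

```python
from urllib.parse import urlencode


def build_search_query(keyword, include_translation=False):
    """Return a URL-encoded query string that requests NLP enrichment.

    Assumption: the search keyword is passed as `q` and boolean flags
    are sent as the lowercase string "true".
    """
    params = {"q": keyword, "include_nlp_data": "true"}
    if include_translation:
        params["include_translation_fields"] = "true"
    return urlencode(params)


def has_nlp(article):
    """True when the article dict carries a non-empty `nlp` object.

    Articles indexed before July 2023 come back with "nlp": {}.
    """
    return bool(article.get("nlp"))


query = build_search_query("semiconductors", include_translation=True)
old_article = {"title": "Archived story", "nlp": {}}  # hypothetical article
```

Checking has_nlp before reading individual fields keeps pre-2023 articles from being mistaken for articles with missing values.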

How NLP processing works

Processing mode depends on the article's language and determines which response fields are populated and which are null.

Native processing applies to English and Arabic articles. NLP runs on the original text, and results appear in the standard nlp.* fields.

Translation-based processing applies to all other languages. The article is first translated to English, then NLP runs on that translation. Results appear in the nlp.translation_* fields; the corresponding standard fields are explicitly null, not absent. To receive translation fields in the response, set include_translation_fields to true.

This distinction matters when consuming NER or summary fields: a null value in nlp.ner_PER means the article was processed via translation, not that no entities exist. Check nlp.translation_ner_PER instead.
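The fallback described above can be sketched as a small helper that reads the standard field first and falls back to its translation_ counterpart when the standard value is null. The helper name and the sample article are hypothetical, and the shape of the entity values is illustrative only; the page does not specify it.

```python
def nlp_field(article, name):
    """Return (value, mode) for an NLP field such as "ner_PER" or "summary".

    mode is "native" when the standard field is populated, "translation"
    when only the translation_* field is populated, and None when neither
    is available (for example, when "nlp" is empty).
    """
    nlp = article.get("nlp") or {}
    value = nlp.get(name)
    if value is not None:
        return value, "native"
    translated = nlp.get(f"translation_{name}")
    if translated is not None:
        return translated, "translation"
    return None, None


# Hypothetical article in a translation-processed language:
# the standard field is explicitly null, the translation field is set.
german_article = {
    "nlp": {
        "ner_PER": None,
        "translation_ner_PER": ["Jane Doe"],  # illustrative value shape
    }
}

entities, mode = nlp_field(german_article, "ner_PER")
```

Returning the mode alongside the value lets downstream code record whether a result came from the original text or from the English translation.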

Available features

| Feature | What it produces |
| --- | --- |
| Theme | One or more topic labels per article, for example Tech or Finance. Filterable with theme and not_theme. |
| Summary | AI-generated article summary, returned in nlp.summary for native processing and nlp.translation_summary for translation-based processing. |
| Sentiment | Tone scores from -1.0 to 1.0, computed independently for the title and the content. |
| Named entity recognition | Persons, organizations, locations, and miscellaneous entities with mention counts. |
| IPTC tags | Hierarchical news category tags using the IPTC media topic standard. |
| IAB tags | Content category tags using the IAB content taxonomy, used for audience segmentation. |
| Custom tags | Organization-specific taxonomy, private to your API key. |
| Vector embeddings | 1024-dimensional semantic vectors for similarity search and clustering. |
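To show how the embedding vectors might be used for similarity search, here is a minimal cosine-similarity sketch in plain Python. It assumes you have already extracted two embedding vectors from the response as lists of floats; the page does not name the response field that holds them, so the function takes the vectors directly. The example uses 3-dimensional vectors for readability, while real vectors have 1024 dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors.

    Returns a value in [-1.0, 1.0]; higher means more similar.
    Raises ValueError on a length mismatch or a zero-length vector.
    """
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


# Toy vectors standing in for article embeddings.
same_direction = cosine_similarity([1.0, 0.0, 0.0], [2.0, 0.0, 0.0])
orthogonal = cosine_similarity([1.0, 0.0, 0.0], [0.0, 1.0, 0.0])
```

For larger collections, a vector library or index is usually a better fit than a pairwise loop; this sketch is only meant to show the comparison itself.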
