# Search articles by identifiers

> Search for local news using article links, IDs, or RSS GUIDs.
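
As a minimal sketch of how such a request could be assembled, the Python below builds a `POST /api/search_by` call from the server URL, the `x-api-token` header, and the request fields listed in the spec that follows. The helper name `build_search_by_request` and the placeholder `YOUR_API_KEY` are illustrative assumptions, not part of the API. The checks mirror the documented constraints: at least one identifier, `page` of 1 or more, and `page_size` between 1 and 1000.

```python
import json
import urllib.request

# Server URL taken from the `servers` entry of the spec.
BASE_URL = "https://local-news.newscatcherapi.com"


def build_search_by_request(api_key, links=None, ids=None, rss_guids=None,
                            from_=None, to_=None, page=1, page_size=100):
    """Build (but do not send) a POST /api/search_by request.

    Hypothetical helper: the name and signature are assumptions made
    for illustration only.
    """
    if not (links or ids or rss_guids):
        raise ValueError("provide at least one of links, ids, or rss_guids")
    if page < 1:
        raise ValueError("page must be 1 or greater")
    if not 1 <= page_size <= 1000:
        raise ValueError("page_size must be between 1 and 1000")

    candidates = {
        "links": links,
        "ids": ids,
        "rss_guids": rss_guids,
        "from_": from_,
        "to_": to_,
    }
    # Send only the fields the caller actually set.
    body = {key: value for key, value in candidates.items() if value is not None}
    body["page"] = page
    body["page_size"] = page_size

    return urllib.request.Request(
        BASE_URL + "/api/search_by",
        data=json.dumps(body).encode("utf-8"),
        headers={"x-api-token": api_key, "Content-Type": "application/json"},
        method="POST",
    )


request = build_search_by_request(
    "YOUR_API_KEY",  # placeholder, replace with a real key
    ids=["5f8d0d55b6e45e00179c6e7e", "5f8d0d55b6e45e00179c6e7f"],
)
# To send it (requires a valid key and network access):
# with urllib.request.urlopen(request, timeout=30) as response:
#     result = json.load(response)
```

Building the request separately from sending it keeps the validation testable without network access; any HTTP client could be substituted for `urllib`.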



## OpenAPI

````yaml local-news-api post /api/search_by
openapi: 3.0.3
info:
  title: Local News API
  description: >
    The Local News API provides access to local news articles with
    location-specific filtering capabilities.


    ## Standard endpoints


    - `/search`: Search articles by keywords with simple location filtering
    ("City, State" format).

    - `/latest_headlines`: Retrieve recent articles for specified locations and
    time periods.

    - `/search_by`: Retrieve articles by URL, ID, or RSS GUID.

    - `/sources`: List available news sources.


    ## Advanced endpoints


    - `/search/advanced`: Search with structured GeoNames filtering.

    - `/latest_headlines/advanced`: Latest headlines with structured GeoNames
    filtering.


    ## Features


    - Multiple location detection methods including dedicated sources, proximity
    analysis, and AI extraction

    - Natural language processing for sentiment analysis and entity recognition
    on original content and English translations

    - Article clustering for topic analysis

    - English translations for non-English content
  termsOfService: https://newscatcherapi.com/terms-of-service
  contact:
    name: Maksym Sugonyaka
    email: maksym@newscatcherapi.com
    url: https://www.newscatcherapi.com/book-a-demo
  version: 1.2.0
servers:
  - url: https://local-news.newscatcherapi.com
    description: Local News API production server
security:
  - ApiKeyAuth: []
tags:
  - name: Search
    description: >-
      Operations to search for local news articles. Includes both standard
      location filtering and advanced GeoNames filtering.
  - name: LatestHeadlines
    description: >-
      Operations to retrieve local news latest headlines. Includes both standard
      location filtering and advanced GeoNames filtering.
  - name: SearchBy
    description: Operations to search local news by link, ID or RSS GUID.
  - name: Sources
    description: Operations to retrieve local news sources.
externalDocs:
  description: Find out more about Local News API
  url: https://www.newscatcherapi.com/docs/local-news-api/get-started/introduction
paths:
  /api/search_by:
    post:
      tags:
        - SearchBy
      summary: Search articles by identifiers
      description: Search for local news using article links, IDs, or RSS GUIDs.
      operationId: SearchBy_post
      requestBody:
        $ref: '#/components/requestBodies/SearchByRequestBody'
      responses:
        '200':
          $ref: '#/components/responses/SearchByResponse'
        '400':
          $ref: '#/components/responses/BadRequestError'
        '401':
          $ref: '#/components/responses/UnauthorizedError'
        '403':
          $ref: '#/components/responses/ForbiddenError'
        '408':
          $ref: '#/components/responses/RequestTimeoutError'
        '422':
          $ref: '#/components/responses/ValidationError'
        '429':
          $ref: '#/components/responses/RateLimitError'
        '500':
          $ref: '#/components/responses/InternalServerError'
components:
  requestBodies:
    SearchByRequestBody:
      required: true
      content:
        application/json:
          schema:
            $ref: '#/components/schemas/SearchByRequestDto'
  responses:
    SearchByResponse:
      description: |
        A successful response containing articles that match the
        specified search criteria.
      content:
        application/json:
          schema:
            allOf:
              - $ref: '#/components/schemas/ArticleSearchAdvancedResponseDto'
            title: Search By Response
    BadRequestError:
      description: Bad request
      content:
        application/json:
          schema:
            $ref: '#/components/schemas/Error'
          example:
            message: Invalid JSON in request body
            status_code: 400
            status: Bad request
    UnauthorizedError:
      description: Unauthorized - Authentication failed
      content:
        application/json:
          schema:
            $ref: '#/components/schemas/Error'
          example:
            message: 'Invalid api key: INVALID_API_KEY'
            status_code: 401
            status: Unauthorized
    ForbiddenError:
      description: Forbidden - Server refuses action
      content:
        application/json:
          schema:
            $ref: '#/components/schemas/Error'
          example:
            message: Your plan request date range cannot be greater than 400 days
            status_code: 403
            status: Forbidden
    RequestTimeoutError:
      description: Request timeout
      content:
        application/json:
          schema:
            $ref: '#/components/schemas/Error'
          example:
            message: Request timed out after 30 seconds
            status_code: 408
            status: Request timeout
    ValidationError:
      description: Validation error
      content:
        application/json:
          schema:
            $ref: '#/components/schemas/Error'
          example:
            message: Invalid date format
            status_code: 422
            status: Validation error
    RateLimitError:
      description: Too many requests - Rate limit exceeded
      content:
        application/json:
          schema:
            $ref: '#/components/schemas/Error'
          example:
            message: Max API requests concurrency reached
            status_code: 429
            status: Too many requests
    InternalServerError:
      description: Internal server error
      content:
        text/plain:
          schema:
            type: string
          example: Internal Server Error
  schemas:
    SearchByRequestDto:
      type: object
      properties:
        links:
          $ref: '#/components/schemas/Links'
        ids:
          $ref: '#/components/schemas/Ids'
        rss_guids:
          $ref: '#/components/schemas/RssGuids'
        from_:
          $ref: '#/components/schemas/From'
        to_:
          $ref: '#/components/schemas/To'
        page:
          $ref: '#/components/schemas/Page'
        page_size:
          $ref: '#/components/schemas/PageSize'
    ArticleSearchAdvancedResponseDto:
      title: Advanced Search Response
      description: >
        The response model for the `Search advanced`, `Latest headlines
        advanced`, and `Search by` requests.


        Response field behavior:

        - Required fields are guaranteed to be present and non-null. 

        - Optional fields may be `null`/`undefined` if the data couldn't be
        extracted during processing.

        - To access article properties in the `articles` response array, use
        array index notation. For example, `articles[n].title`, where `n` is the
        zero-based index of the article object (0, 1, 2, etc.).
      allOf:
        - $ref: '#/components/schemas/SearchResponseDto'
        - type: object
          properties:
            articles:
              type: array
              items:
                $ref: '#/components/schemas/ArticleAdvancedResultEntity'
              default: []
            user_input:
              type: object
    Error:
      type: object
      properties:
        message:
          type: string
          description: A detailed description of the error.
        status_code:
          type: integer
          description: The HTTP status code of the error.
        status:
          type: string
          description: A short description of the status code.
      required:
        - message
        - status_code
        - status
    Links:
      oneOf:
        - type: string
        - type: array
          items:
            type: string
      description: >
        The article link or list of article links to search for. To specify
        multiple links, use a comma-separated string or an array of strings.


        **Note**: You can use the `links` parameter in combination with `ids` or
        `rss_guids`, but at least one of these parameters must be provided.
      example: https://nytimes.com/article1
    Ids:
      oneOf:
        - type: string
        - type: array
          items:
            type: string
      description: >
        The Newscatcher article ID (see the `id` field in API response) or a
        list of article IDs to search for. To specify multiple IDs, use a
        comma-separated string or an array of strings.


        **Note**: You can use the `ids` parameter in combination with `links` or
        `rss_guids`, but at least one of these parameters must be provided.
      example:
        - 5f8d0d55b6e45e00179c6e7e
        - 5f8d0d55b6e45e00179c6e7f
    RssGuids:
      oneOf:
        - type: string
        - type: array
          items:
            type: string
      description: >
        The RSS GUID (Globally Unique Identifier) or list of GUIDs to search
        for. To specify multiple GUIDs, use a comma-separated string or an array
        of strings. 


        GUIDs are unique identifiers assigned to RSS feed items. They are often
        URLs or other unique strings.


        **Note**: You can use the `rss_guids` parameter in combination with
        `links` or `ids`, but at least one of these parameters must be provided.
      example:
        - https://example.com/article1
        - https://example.com/article2
    From:
      oneOf:
        - type: string
          format: date-time
          example: '2024-09-24T00:00:00.000Z'
        - type: string
          example: 1 day ago
      default: 7 days ago
      description: >
        The starting point in time to search from. Accepts date-time strings in
        ISO 8601 format and plain text strings. The default time zone is UTC. 


        Formats with examples:

        - YYYY-MM-DDTHH:MM:SS: `2024-09-24T00:00:00`

        - YYYY-MM-DD: `2024-09-24`

        - YYYY/MM/DD HH:MM:SS: `2024/09/24 00:00:00`

        - YYYY/MM/DD: `2024/09/24`

        - English phrases: `1 day ago`, `today`


        **Note**: By default, this filter applies to the article's publication
        date. To filter by the article's parse date instead, set the
        `by_parse_date` parameter to `true`.
      example: 2024/09/24
    To:
      oneOf:
        - type: string
          format: date-time
          example: '2024-09-25T00:00:00.000Z'
        - type: string
          example: 1 day ago
      default: now
      description: >
        The ending point in time to search up to. Accepts date-time strings in
        ISO 8601 format and plain text strings. The default time zone is UTC. 


        Formats with examples:

        - YYYY-MM-DDTHH:MM:SS: `2024-09-25T00:00:00`

        - YYYY-MM-DD: `2024-09-25`

        - YYYY/MM/DD HH:MM:SS: `2024/09/25 00:00:00`

        - YYYY/MM/DD: `2024/09/25`

        - English phrases: `1 day ago`, `today`, `now`


        **Note**: By default, this filter applies to the article's publication
        date. To filter by the article's parse date instead, set the
        `by_parse_date` parameter to `true`.
      example: 2024/09/25
    Page:
      type: integer
      minimum: 1
      default: 1
      description: >
        The page number to retrieve. Use this parameter to paginate through
        results, because a single API response cannot return more than 1000
        articles.
      example: 2
    PageSize:
      type: integer
      minimum: 1
      maximum: 1000
      default: 100
      description: |
        The number of articles to return per page. Range: `1` to `1000`.
      example: 100
    SearchResponseDto:
      title: Base Search Response
      required:
        - status
        - total_hits
        - page
        - total_pages
        - page_size
      type: object
      properties:
        status:
          title: Status
          description: The status of the response.
          type: string
          default: ok
        total_hits:
          title: Total Hits
          description: The total number of articles matching the search criteria.
          type: integer
        page:
          title: Page
          description: The current page number of the results.
          type: integer
        total_pages:
          title: Total Pages
          description: The total number of pages available for the given search criteria.
          type: integer
        page_size:
          title: Page Size
          description: The number of articles per page.
          type: integer
    ArticleAdvancedResultEntity:
      title: Article Result (Advanced)
      allOf:
        - $ref: '#/components/schemas/BaseArticleEntity'
        - type: object
          properties:
            geonames:
              type: array
              items:
                $ref: '#/components/schemas/GeoNamesResponseEntity'
              description: >
                A list of locations identified in the article, including
                detection methods, confidence, and localization scores. The
                location data adheres to the GeoNames format.
    BaseArticleEntity:
      title: Article Result
      description: >-
        The data model representing the common properties of the article object
        in the search results. Required fields are always non-null. Optional
        fields may be `null`/`undefined` if data extraction is unsuccessful.
      type: object
      required:
        - id
        - title
        - link
        - content
        - domain_url
        - published_date_precision
        - published_date
        - is_opinion
        - rank
      properties:
        id:
          type: string
          description: The unique identifier for the article.
        score:
          type: number
          format: float
          description: The relevance score of the article.
        title:
          type: string
          description: The title of the article.
        author:
          type: string
          description: The primary author of the article.
        link:
          type: string
          description: The URL link to the article.
        description:
          type: string
          description: A brief description of the article.
        media:
          type: string
          description: >-
            The URL of the media associated with the article, typically an
            image.
        content:
          type: string
          description: The full content of the article.
        authors:
          type: array
          items:
            type: string
          description: A list of authors of the article.
        published_date:
          type: string
          format: date-time
          description: The date and time the article was published.
        published_date_precision:
          type: string
          description: The precision of the published date.
        updated_date:
          type: string
          format: date-time
          description: The date and time the article was last updated.
        updated_date_precision:
          type: string
          description: The precision of the updated date.
        is_opinion:
          type: boolean
          description: Indicates if the article is an opinion piece.
        twitter_account:
          type: string
          nullable: true
          description: The Twitter account associated with the article. Can be `null`.
        domain_url:
          type: string
          description: The domain URL of the article's source.
        parent_url:
          type: string
          description: >-
            The parent URL of the article, typically representing the homepage
            of the source.
        word_count:
          type: integer
          description: The word count of the article.
        rank:
          type: integer
          description: The rank of the article's source.
        country:
          type: string
          description: The country code where the article was published.
        rights:
          type: string
          description: The rights information for the article, typically the domain name.
        language:
          type: string
          description: The language code in which the article is written.
        nlp:
          $ref: '#/components/schemas/NlpDataEntity'
        paid_content:
          type: boolean
          description: >-
            Indicates whether the source labels the article as paywalled or
            requiring a subscription for full access.
        title_translated_en:
          type: string
          description: >
            English translation of the article title. Available when using the
            `search_in` parameter with the `title_translated` option or by
            setting the `include_translation_fields` parameter to `true`.
          nullable: true
        content_translated_en:
          type: string
          description: >
            English translation of the article content. Available when using the
            `search_in` parameter with the `content_translated` option or by
            setting the `include_translation_fields` parameter to `true`.
          nullable: true
    GeoNamesResponseEntity:
      type: object
      description: >
        Represents a geographic location identified in an article using GeoNames
        structured data, including detection confidence and localization scores.
      required:
        - name
        - detection_methods
      properties:
        geonames_id:
          type: string
          description: |
            The unique GeoNames identifier for the location.
          example: '5128581'
        name:
          type: string
          description: |
            The canonical name of the location from GeoNames database.
          example: New York City
        country:
          type: string
          description: |
            Two-letter ISO 3166-1 alpha-2 country code.
          example: US
        admin1:
          allOf:
            - $ref: '#/components/schemas/GeoNamesLocationAdminEntity'
          description: |
            First-order administrative division (e.g., state, province, region).
        admin2:
          allOf:
            - $ref: '#/components/schemas/GeoNamesLocationAdminEntity'
          description: |
            Second-order administrative division (e.g., county, department).
        admin3:
          allOf:
            - $ref: '#/components/schemas/GeoNamesLocationAdminEntity'
          description: |
            Third-order administrative division (e.g., township, borough).
        admin4:
          allOf:
            - $ref: '#/components/schemas/GeoNamesLocationAdminEntity'
          description: >
            Fourth-order administrative division (smallest administrative
            units).
        coordinates:
          allOf:
            - $ref: '#/components/schemas/Coordinates'
        feature_class:
          type: string
          description: >
            GeoNames feature class (A: Administrative, H: Hydrographic, L: Area,
            P: Populated places, etc.).
          example: P
        feature_code:
          type: string
          description: |
            Specific GeoNames feature code (e.g., PPL for populated place).
          example: PPL
        detection_methods:
          $ref: '#/components/schemas/DetectionMethods'
        reason:
          type: string
          description: >
            Explanation of why this location was identified in the article
            context.
          example: >-
            New York City is mentioned as the location of the Icahn School of
            Medicine.
        localization_score:
          type: number
          format: float
          minimum: 0
          maximum: 10
          description: >
            Geographic focus score (0-10) indicating how locally relevant the
            article is to this location.

            - 10: Hyper-local with clear local impact

            - 7-9: Regional relevance

            - 4-6: Subnational relevance  

            - 1-3: National relevance only

            - 0: No local relevance
          example: 10
        confidence_score:
          type: number
          format: float
          minimum: 0
          maximum: 10
          description: |
            Model confidence score (0-10) in location identification accuracy.
            - 10: Certain match
            - 7-9: High confidence
            - 4-6: Medium confidence
            - 1-3: Low confidence
            - 0: Uncertain/not relevant
          example: 10
    NlpDataEntity:
      type: object
      description: Natural Language Processing data for the article.
      properties:
        theme:
          type: array
          items:
            type: string
          description: The themes or categories identified in the article.
        summary:
          type: string
          description: A brief AI-generated summary of the article content.
        sentiment:
          $ref: '#/components/schemas/SentimentScores'
        new_embedding:
          type: array
          items:
            type: number
            format: float
          description: >
            A dense 1024-dimensional vector representation of the article
            content, generated using the
            [multilingual-e5-large](https://huggingface.co/intfloat/multilingual-e5-large)
            model.


            **Note**: The `new_embedding` field is only available in the
            `v3_local_news_nlp_embeddings` subscription plan.
        ner_PER:
          allOf:
            - $ref: '#/components/schemas/NamedEntityList'
          description: Named Entity Recognition for person entities (individuals' names).
        ner_ORG:
          allOf:
            - $ref: '#/components/schemas/NamedEntityList'
          description: >-
            Named Entity Recognition for organization entities (company names,
            institutions).
        ner_MISC:
          allOf:
            - $ref: '#/components/schemas/NamedEntityList'
          description: >-
            Named Entity Recognition for miscellaneous entities (events,
            nationalities, products).
        ner_LOC:
          allOf:
            - $ref: '#/components/schemas/NamedEntityList'
          description: >-
            Named Entity Recognition for location entities (cities, countries,
            geographic features).
        translation_ner_PER:
          allOf:
            - $ref: '#/components/schemas/NamedEntityList'
          description: >
            Named Entity Recognition for person entities (individuals' names)
            extracted from the English translation of the article.
        translation_ner_ORG:
          allOf:
            - $ref: '#/components/schemas/NamedEntityList'
          description: >
            Named Entity Recognition for organization entities (company names,
            institutions) extracted from the English translation of the article.
        translation_ner_MISC:
          allOf:
            - $ref: '#/components/schemas/NamedEntityList'
          description: >
            Named Entity Recognition for miscellaneous entities (events,
            nationalities, products) extracted from the English translation of
            the article.
        translation_ner_LOC:
          allOf:
            - $ref: '#/components/schemas/NamedEntityList'
          description: >
            Named Entity Recognition for location entities (cities, countries,
            geographic features) extracted from the English translation of the
            article.
    GeoNamesLocationAdminEntity:
      type: object
      properties:
        geonames_id:
          type: string
          description: GeoNames ID for the administrative division.
        name:
          type: string
          description: >
            Administrative division name. Use leading minus `-` to exclude the
            name.
        code:
          type: string
          description: Administrative division code.
    Coordinates:
      type: object
      description: Geographic coordinates for a location.
      properties:
        lat:
          type: number
          format: float
          nullable: true
          minimum: -90
          maximum: 90
          description: The latitude coordinate.
        lon:
          type: number
          format: float
          nullable: true
          minimum: -180
          maximum: 180
          description: The longitude coordinate.
      example:
        lat: 40.71427
        lon: -74.00597
    DetectionMethods:
      type: array
      items:
        type: string
        enum:
          - dedicated_source
          - local_section
          - regional_source
          - standard_format
          - proximity_mention
          - ai_extracted
      description: >
        The location detection methods to filter results by:

        - `dedicated_source`: Identifies locations based on sources exclusively
        covering a specific location.

        - `local_section`: Identifies locations through location-specific
        sections within larger publications.

        - `regional_source`: Identifies locations using regional context from
        state-level publications.

        - `standard_format`: Identifies locations written in standard formats
        like "City, State" or "City, County".

        - `proximity_mention`: Identifies cities and states mentioned within 15
        words of each other.

        - `ai_extracted`: Identifies locations through AI-based content
        analysis. Requires AI Extraction plan.


        For detailed information, see [Location detection
        methods](/local-news-api/guides-and-concepts/location-detection-methods).
      example:
        - dedicated_source
        - proximity_mention
        - ai_extracted
    SentimentScores:
      type: object
      description: Sentiment scores for the article's title and content.
      properties:
        title:
          type: number
          format: float
          minimum: -1
          maximum: 1
          description: The sentiment score for the article title (-1.0 to 1.0).
        content:
          type: number
          format: float
          minimum: -1
          maximum: 1
          description: The sentiment score for the article content (-1.0 to 1.0).
    NamedEntityList:
      type: array
      description: A list of named entities identified in the article.
      items:
        type: object
        properties:
          entity_name:
            type: string
            description: The name of the entity identified in the article.
          count:
            type: integer
            description: The number of times this entity appears in the article.
  securitySchemes:
    ApiKeyAuth:
      type: apiKey
      in: header
      name: x-api-token
      description: >
        API Key to authenticate requests.


        To access the API, include your API key in the `x-api-token` header. 

        To obtain your API key, complete the
        [form](https://www.newscatcherapi.com/book-a-demo) or contact us
        directly.

````
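
Because one response returns at most `page_size` articles, collecting every match means walking the pages until `total_pages` is reached. The sketch below shows one way to do that using the `page`, `total_pages`, and `articles` response fields from the spec above. The names `iter_articles`, `fetch_page`, and `fake_fetch` are hypothetical: `fetch_page` stands for any callable that sends the request for a given page and returns the parsed JSON, and `fake_fetch` is an in-memory stand-in so the example runs without an API key.

```python
def iter_articles(fetch_page, start_page=1):
    """Yield every article across all pages of a search response.

    `fetch_page` is an assumed callable: it takes a page number and
    returns a dict shaped like the documented search response.
    """
    page = start_page
    while True:
        response = fetch_page(page)
        # `articles` defaults to an empty list in the spec.
        yield from response.get("articles", [])
        if page >= response["total_pages"]:
            break
        page += 1


def fake_fetch(page, page_size=2):
    """In-memory stand-in for a real API call, for illustration only."""
    data = [{"id": "a1"}, {"id": "a2"}, {"id": "a3"}]
    start = (page - 1) * page_size
    return {
        "status": "ok",
        "total_hits": len(data),
        "page": page,
        "total_pages": 2,
        "page_size": page_size,
        "articles": data[start:start + page_size],
    }


for article in iter_articles(fake_fetch):
    # Optional fields may be missing or null, so read them defensively.
    print(article["id"], article.get("author"))
```

Stopping on `page >= total_pages` also covers an empty result set, where `total_pages` would be lower than the first page requested.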