PRODUCT UPDATES
Welcome to the Instapage Product Updates page. Here you can find the latest news about changes to the Instapage platform.
August 02, 2024
- Expanded Sources List:
- We’ve broadened our source coverage by increasing the number of sources from 80,000 to 91,000.
- This expansion was driven by feedback from our clients, ensuring a more comprehensive and diverse news feed.
- Enhanced Proxy Logic:
- We’ve optimized our proxy mechanisms, cutting the number of instances where our data extractors are blocked by 80%.
- This improvement ensures more consistent and reliable data extraction across various sources.
July 19, 2024
- Preventing Historic Data Breakdowns:
- To safeguard against data loss and service disruptions, we’ve implemented regular snapshots of our data, stored on AWS Glacier. This allows for quick recovery during downtimes.
- Additionally, each year of historical data is now duplicated across two servers, ensuring data remains accessible and secure even in the event of a server failure.
- Improved System Performance:
- We’ve added four new servers to our v3 historic clusters, enhancing data management and overall system performance.
June 14, 2024
- New Clustering Algorithm on v3 API:
- We’ve introduced a more efficient clustering method for our v3 API and benchmarked it against our existing approach.
- The new method is approximately 1.75x faster, offering significant performance improvements without requiring any changes to your existing code or API calls.
June 07, 2024
- is_opinion Flag Now Available:
- The is_opinion attribute, already present in v3, is now available as a filter parameter in the API. This allows for more precise filtering of opinion articles in your data queries.
- Improved Source Country Identification:
- We’ve enhanced our logic for determining the country of origin for news sources.
- This update has reduced the number of sources marked as ‘unknown’ by over 5,500, improving the accuracy of geographical data.
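As a rough illustration of the is_opinion filter described above, the sketch below builds a search URL that includes the parameter. Only the is_opinion name comes from this update; the endpoint URL, the q parameter, and the "true"/"false" value format are assumptions made for the example, so check the API reference for the exact request shape.

```python
from typing import Optional
from urllib.parse import urlencode

# Hypothetical endpoint, used only for illustration.
BASE_URL = "https://api.example.com/v3/search"


def build_search_url(query: str, is_opinion: Optional[bool] = None) -> str:
    """Build a search URL, optionally filtering on the is_opinion attribute."""
    params = {"q": query}  # "q" is an assumed parameter name
    if is_opinion is not None:
        # Assumption: the filter accepts lowercase "true"/"false" strings.
        params["is_opinion"] = "true" if is_opinion else "false"
    return f"{BASE_URL}?{urlencode(params)}"


# Exclude opinion pieces from the results.
print(build_search_url("climate policy", is_opinion=False))
```

Leaving is_opinion unset omits the parameter entirely, so existing queries would keep their current behavior.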
May 31, 2024
- Translated Articles on v3 API:
- The v3 API now includes English translations for non-English articles.
- We’ve achieved a 90% translation rate for non-English content, providing broader access to global news in English.
May 24, 2024
- New English Sentiment Model:
- We’ve fine-tuned our sentiment analysis model using a synthetic dataset of over a million articles labeled with ChatGPT.
- The new model runs 10x faster and delivers improved accuracy, with F1 scores of 0.89 for non-finance articles and 0.87 for finance-related content.
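For readers unfamiliar with the metric, the F1 score is the harmonic mean of precision and recall. The short sketch below shows the calculation; the precision and recall values are illustrative only and are not the model's published figures.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Illustrative inputs only; not measurements from the sentiment model.
print(round(f1_score(0.90, 0.88), 2))
```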
May 17, 2024
- Improved Language Detection:
- We’ve fixed a bug that caused incorrect language identification due to certain text transformations. This fix enhances the accuracy of our language detection across articles.
- Article Deduplication:
- We’ve implemented a deduplication feature to identify and filter out republished or syndicated articles, ensuring that your data stream focuses on original content.
- Comprehensive documentation is available for this feature.
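To give a feel for how republished or syndicated articles can be detected, here is a minimal sketch that compares articles by the overlap of their word shingles (Jaccard similarity). This is a generic near-duplicate technique chosen for illustration, not a description of the platform's deduplication logic; the function names and the 0.8 threshold are assumptions.

```python
import re


def shingles(text: str, size: int = 3) -> set:
    """Return the set of overlapping word n-grams in a text."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a: set, b: set) -> float:
    """Return the share of shingles that two sets have in common."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def deduplicate(articles: list, threshold: float = 0.8) -> list:
    """Keep the first occurrence of each article and drop near-duplicates."""
    kept = []  # (text, shingle set) pairs for articles kept so far
    for text in articles:
        signature = shingles(text)
        if all(jaccard(signature, other) < threshold for _, other in kept):
            kept.append((text, signature))
    return [text for text, _ in kept]
```

The pairwise comparison is quadratic in the number of articles, which is fine for a sketch but would need an index such as MinHash at scale.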
May 10, 2024
- New English Sentiment Model:
- We’ve fine-tuned our sentiment analysis model using a synthetic dataset of over a million articles labeled with ChatGPT.
- The new model runs 10x faster and delivers improved accuracy, with F1 scores of 0.89 for non-finance articles and 0.87 for finance-related content.
May 03, 2024
- Article Update Monitoring:
- We’ve introduced a feature that checks whether an article has been updated after its initial publication.
- If changes are detected, we ensure the extracted version reflects the latest content, keeping your data current.
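One simple way to detect that an article changed after publication is to compare a fingerprint of the stored text with a fingerprint of a freshly fetched copy. The sketch below assumes that approach; the function names are hypothetical and the platform's actual check may differ.

```python
import hashlib
from typing import Tuple


def content_fingerprint(text: str) -> str:
    """Hash the text after collapsing whitespace, so layout-only changes are ignored."""
    normalized = " ".join(text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def refresh_if_updated(stored_text: str, fetched_text: str) -> Tuple[str, bool]:
    """Return the text to keep and whether an update was detected."""
    if content_fingerprint(stored_text) != content_fingerprint(fetched_text):
        return fetched_text, True
    return stored_text, False
```

For example, refresh_if_updated(old, new) returns the new text and True when the wording changed, and the stored text and False when only the spacing differs.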
April 19, 2024
- Enhanced Parent URL Logic:
- We’ve refined the logic for the parent_url attribute, which previously defaulted to the homepage of the news source where the article was first found.
- The new logic prioritizes section-specific URLs over homepage links, improving the contextual relevance of the parent URL data.
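The prioritization described above can be pictured with a small sketch: given the pages where an article was found, prefer one with a section path over a bare homepage. The helper names and the "empty or root path means homepage" rule are assumptions for illustration, not the platform's implementation.

```python
from typing import List, Optional
from urllib.parse import urlparse


def is_homepage(url: str) -> bool:
    """Treat a URL with an empty or root path as a homepage."""
    return urlparse(url).path in ("", "/")


def pick_parent_url(candidates: List[str]) -> Optional[str]:
    """Prefer the first section-specific URL; fall back to the first candidate."""
    for url in candidates:
        if not is_homepage(url):
            return url
    return candidates[0] if candidates else None


print(pick_parent_url([
    "https://news.example.com/",
    "https://news.example.com/business/",
]))
```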
April 12, 2024
- Text Formatting Preservation:
- We’ve improved our text extraction process to preserve formatting, ensuring that more than 90% of articles maintain clear paragraph splits.
- This enhancement provides cleaner and more readable data.
- V3 API SDKs Launched:
- We’ve launched SDKs for the v3 API in multiple programming languages, including Python, C#, Java, Go, and TypeScript, making it easier to integrate our API into various development environments.
March 29, 2024
- Additional Historical Data:
- Our v3 API now includes NLP-enriched articles dating back to the beginning of July 2023.
- Improved Latency:
- We’ve deployed a dedicated processing pipeline for priority sources, ensuring that these articles are indexed in under 5 minutes, down from the usual 15-60 minute delay.
March 25, 2024
- Author Extraction Enhancement:
- We’ve improved our extraction methods to better identify author names within the article content, including in-text endings like “…written by John Smith.”
- This ensures more accurate attribution of articles to their authors.
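As an illustration of the in-text ending case mentioned above, the sketch below uses a regular expression to pull a name from a trailing "written by ..." phrase. The pattern and function name are assumptions for the example; real bylines vary widely, and this is not the extraction logic used in production.

```python
import re
from typing import Optional

# Matches "written by" followed by two or more capitalized words at the end of the text.
BYLINE = re.compile(r"\b[Ww]ritten by ((?:[A-Z][\w'-]+)(?: [A-Z][\w'-]+)+)\.?\s*$")


def extract_trailing_author(text: str) -> Optional[str]:
    """Return the author named in a trailing byline, or None if there is none."""
    match = BYLINE.search(text.strip())
    return match.group(1) if match else None


print(extract_trailing_author(
    "Markets closed higher on Friday. This article was written by John Smith."
))
```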