Site Comments: 1m (3k more today)
Stories: 47k (129 more today)
Reddit Posts: 1m (5k more today)

NewsTalk changes

  • 15/6/2023: Removed Mamamia harvester. Fixed internal navigation on story view.
  • 16/5/2023: Fixed story search highlight text for phrase searches.
  • 2/5/2023: Story similarity now uses Instructor embeddings and a vector database, greatly improving similarity results. Auto-ranging similarity scores. NewsTalk data API and Python client launched.
  • 5/4/2023: Story similarity sorts by date. Layout improvements to story grid pages.
  • 3/4/2023: Named Entity search now queries the database rather than RAM (no startup delay).
  • 27/3/2023: Placeholder for broken images. Entities don't overflow. Performance improvements for top entities on comments.
  • 17/3/2023: Added Perthnow harvester. Tweaked front page design.
  • 14/3/2023: Added Nine News and Saturday Paper harvesters.
  • 13/3/2023: Added Sky News harvester.
  • 10/3/2023: Back end improvements: Resilience to individual component failures. Service status panel to follow.
  • 9/3/2023: Quick fix to allow concurrent users (for beta testing!)
  • 8/3/2023: Data output comments have ISO string dates. Bug fix: Story linked comment counts set incorrectly. Retrospective DB repair.
  • 6/3/2023: News.com.au Harvester. Feedback form.
  • 4/3/2023: Data export includes linked comments with thread fields. SBS Harvester.
  • 28/2/2023: Entity recognition on linked comments too. Front-end tweaks incl. news ticker shows popular stories.
  • 27/2/2023: Linked comment harvesting. Reddit linked harvester. ABC News harvester (linked comments only).
  • 20/2/2023: Kotaku Harvester.
  • 19/2/2023: ITNews Harvester. Migrated IA to API.
  • 17/2/2023: Independent Australia Selenium Harvester (Disqus).
  • 15/2/2023: Adjustable story similarity slider on story page. Lower similarity cut-off in NLP processor.
  • 14/2/2023: Reduction in stories added with zero comments. Split harvester and NLP modules and improved component reliability.
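
For readers curious how embedding-based story similarity with a cut-off and auto-ranged scores might work, here is a minimal sketch. It is not NewsTalk's actual code: the vectors are toy stand-ins for Instructor embeddings, a plain dictionary stands in for the vector database, and "auto-ranging" is assumed to mean rescaling the surviving scores to the 0 to 1 range. The names `cosine_similarity`, `rank_similar`, and `cutoff` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_similar(query_vec, story_vecs, cutoff=0.0):
    """Return (story_id, raw_score, ranged_score) tuples, best match first.

    Stories scoring below `cutoff` are dropped. The ranged score is an
    assumed min-max rescaling of the remaining raw scores to [0, 1].
    """
    scored = [(sid, cosine_similarity(query_vec, vec))
              for sid, vec in story_vecs.items()]
    scored = [(sid, s) for sid, s in scored if s >= cutoff]
    if not scored:
        return []
    lo = min(s for _, s in scored)
    hi = max(s for _, s in scored)
    span = hi - lo
    ranked = []
    for sid, s in sorted(scored, key=lambda pair: pair[1], reverse=True):
        # If every score is identical, treat them all as top of range.
        ranged = 1.0 if span == 0 else (s - lo) / span
        ranked.append((sid, s, ranged))
    return ranked


# Toy embeddings (hypothetical data, not real story vectors).
stories = {
    "a": [1.0, 0.0, 0.0],
    "b": [0.8, 0.6, 0.0],
    "c": [0.0, 1.0, 0.0],
}
results = rank_similar([1.0, 0.0, 0.0], stories, cutoff=0.5)
for story_id, raw, ranged in results:
    print(story_id, round(raw, 3), round(ranged, 3))
```

With a cut-off of 0.5, story "c" is filtered out, and the remaining scores are stretched so the best match sits at 1 and the weakest surviving match at 0. A production setup would typically delegate the nearest-neighbour search to the vector database rather than scoring every story in Python.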