Data feeds

Webz.io delivers broad and diversified coverage of the internet to enterprise-level customers in a wide range of web data verticals. Using our APIs, you can filter the data to get relevant, comprehensive, up-to-date feeds that you can consume on demand.

On the Open Web, Webz.io supports multiple verticals which you can access via an API:

  • News API - Articles published in news outlets, plus articles from top global news and blog outlets that are enriched with NLP.
  • Blogs API - Self-published blogs by individual authors.
  • Online Discussions API - Discussions across message boards, Q&A pages, and forums.
  • Reviews API - Posts and conversations on review sites.
  • eCommerce Reviews API - Product details and product-based review data from hundreds of marketplaces and online retailer sites.
  • Archived Web Data - Historical data from archived news, blogs, discussions, and reviews going back more than 10 years.

In addition, we offer an Open Web Firehose solution, which delivers data at scale from News, Blogs, Discussions, and Reviews. Unlike the other APIs, where the retrieved data is pre-filtered, the Firehose solution provides all the data we crawl. Posts are formatted as XML, compressed into 10-megabyte ZIP files, and made available for retrieval every other minute on a dedicated FTP site.
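As a rough illustration of consuming a Firehose file once it has been downloaded, the sketch below unpacks one ZIP archive in memory and reads the XML posts inside it. It is a minimal sketch under stated assumptions, not the documented schema: the element name "post", the child fields, and the file name "batch-0001.xml" are all hypothetical placeholders, and the sample archive is built locally so the example runs without FTP access.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def iter_posts(zip_bytes, post_tag="post"):
    """Yield one dict per <post_tag> element found in the XML files of a ZIP archive.

    The tag name "post" is an assumed placeholder, not the real Firehose schema.
    """
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for name in archive.namelist():
            if not name.lower().endswith(".xml"):
                continue  # skip anything that is not an XML file
            with archive.open(name) as xml_file:
                root = ET.parse(xml_file).getroot()
            # iter() walks the whole tree, including the root element itself
            for post in root.iter(post_tag):
                yield {child.tag: (child.text or "").strip() for child in post}


def build_sample_archive():
    """Build a tiny in-memory ZIP that stands in for one downloaded file."""
    xml = (
        "<posts><post><title>Hello</title>"
        "<url>https://example.com/a</url></post></posts>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("batch-0001.xml", xml)  # hypothetical file name
    return buffer.getvalue()


if __name__ == "__main__":
    for post in iter_posts(build_sample_archive()):
        print(post)
```

In practice, the bytes passed to iter_posts would come from the dedicated FTP site, for example by collecting the chunks that Python's ftplib.FTP.retrbinary passes to its callback. The host, credentials, and directory layout are specific to each account and are not assumed here.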

On the Dark Web, we provide access to data that is crawled and extracted from dark and deep networks and various messaging applications.

  • Data Breach Detection API - Leaked records from different dark web networks, sites, and applications from the last 5 years.
  • Cyber API - Content from millions of illicit sites across multiple networks, including TOR, I2P, Zeronet, Telegram, IRC, and many others.