Data Segmentation
The segmentation feature is an API that gives quick insight into a dataset. With it, you can find the top 100 sites, languages, authors, or entities for a topic or for the dataset produced by a query.
It gives you a 360° view of the dataset you’ve created. For example, you can segment the top forums for your query. For each forum, you can then find the top authors, and for each author, the top users who comment on their posts. Other examples include listing the top journalists who write about certain topics, or the top people or locations mentioned in a text.
URL Structure:
https://api.webz.io/cyberSeg?token=XXXXXXXX&q=[QUERY]&field=[SEGMENT]
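As a minimal sketch of how the URL above can be assembled, the following Python snippet fills in the three placeholders and percent-encodes the values with the standard library. The helper name `build_segment_url` is hypothetical and not part of the API, and the assumption that the service accepts standard percent-encoded query strings is ours; the token shown is the placeholder from the URL structure.

```python
from urllib.parse import urlencode

# Endpoint taken from the URL structure above.
BASE_URL = "https://api.webz.io/cyberSeg"


def build_segment_url(token: str, query: str, field: str) -> str:
    """Return a cyberSeg request URL with percent-encoded parameters.

    Hypothetical helper for illustration; it only builds the string
    and does not send any request.
    """
    params = {"token": token, "q": query, "field": field}
    return f"{BASE_URL}?{urlencode(params)}"


# Placeholder token; replace it with your own private access token.
url = build_segment_url("XXXXXXXX", "enriched.cyber_risk.value:>6", "site.domain")
print(url)
```

Encoding matters here because Boolean queries often contain characters such as `:` and `>` that are not safe to place in a URL unescaped.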
HTTP GET Parameters
Parameter | Description | Example |
---|---|---|
q | A Boolean query containing the filters that define which posts will be returned. | Top domains with high-risk cyber threats: enriched.cyber_risk.value:>6 |
token | Your private access token that you received when you signed up. | |
field | The field by which to segment the data. The following segments are available: author, author_extended.display_name, author_extended.user_id, author_extended.user_link, language, published, site.domain, site.type, site.country, site.category, thread.url, thread.published, thread.site_section, thread.replies_count, thread.participants_count, extended.file_type, extended.network, enriched.category, enriched.persons, enriched.organizations, enriched.cyber_risk.value, enriched.cyber_risk.content_risk, enriched.cyber_risk.site_risk, enriched.cyber_risk.actor_risk, enriched.ip.count, enriched.icq_id.value, enriched.icq_id.count, enriched.cve.value, enriched.cve.count, enriched.ssn.value, enriched.ssn.count, enriched.location.value, enriched.location.count, enriched.email.value, enriched.email.count, enriched.domain.value, enriched.domain.count, enriched.phone.value, enriched.phone.count, enriched.credit_card.value, enriched.credit_card.count | Segment by top domains: &field=site.domain |
ts | The ts (timestamp) parameter returns only results that were crawled after the given time, expressed as a Unix timestamp in milliseconds. When not specified, the default is the past 3 days. | &ts=1459835503426 |
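Because ts is expressed in milliseconds rather than seconds, it is easy to be off by a factor of 1000. The sketch below shows one way to convert between timezone-aware datetimes and millisecond timestamps in Python. The helper names `to_ts_millis` and `from_ts_millis` are hypothetical, and treating the timestamp as UTC-based is the usual Unix convention rather than something stated in the table.

```python
from datetime import datetime, timedelta, timezone

# Unix epoch as a timezone-aware datetime (UTC).
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_ts_millis(moment: datetime) -> int:
    """Convert a timezone-aware datetime to a Unix timestamp in milliseconds."""
    if moment.tzinfo is None:
        raise ValueError("pass a timezone-aware datetime")
    # Integer division of timedeltas avoids floating-point rounding.
    return (moment - EPOCH) // timedelta(milliseconds=1)


def from_ts_millis(ts: int) -> datetime:
    """Convert a Unix timestamp in milliseconds back to a UTC datetime."""
    return EPOCH + timedelta(milliseconds=ts)


# Example: ask for results crawled during the past 7 days.
week_ago = to_ts_millis(datetime.now(timezone.utc) - timedelta(days=7))
print(f"&ts={week_ago}")

# Decode the example value from the parameter table.
print(from_ts_millis(1459835503426).isoformat())
```

The timedelta division keeps the arithmetic in integers, so a value decoded with `from_ts_millis` converts back to the same number.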