API Usage Flow

Step 1: Add New Search Method

Overview

Collection of product documents begins with the Search Methods endpoint. Collecting product documents from a search method can take up to 48 hours to complete.

To begin, decide which eCommerce domain to crawl and identify the best method for getting your product data. We have 3 types of parsers (web scrapers), categorized according to the type of result they return.

  1. Search - these scrapers extract search results data from the category pages a source contains.
    1. The output is up to 100 product documents per page from the url added.
    2. Examples of categories: Headphones, Wearable Technology, etc.
  2. Keyword - these scrapers extract the results of a keyword search run through the main search bar of a domain.
    1. The output is up to 100 product documents per page from the url added.
  3. Product - these scrapers extract the details of a single product by using its url.
    1. The output is the product document corresponding to the product url added.

URL Structure

Example

Input:

Input | Method Types | Definition | Example
token | - | A unique identifier used to authenticate API access | ?token=123456789
method (only one can be used per Search Method) | PRODUCT_URL | Specific product url | &method=PRODUCT_URL
method | SEARCH_URL | Category Page url | &method=SEARCH_URL
method | KEYWORD | General search term | &method=KEYWORD
domain | - | Website name (without www.) | &domain=amazon.com
value | - | Related to the method added | PRODUCT_URL: &value=amazon.com/iPhone-Pro-256GB-Sierra-Blue/dp/B0BGYFDQJX/
value | - | Related to the method added | SEARCH_URL: &value=amazon.com/s?i=specialty-aps&bbn=16225007011
value | - | Related to the method added | KEYWORD: &value=Iphone%206s

Output:

Output | Definition | Example
status code and messages | Whether the search method was successfully added or there was an error. | {status: 200, messages: ["Topic already exists"]}
method | The method ID assigned to the search method added. | method: 123

If you’re interested in learning more about our current HTTP status codes, see the corresponding section.
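The four inputs above can be assembled into a request URL along the following lines. This is a minimal sketch, not the official client: the endpoint address is not given in this section, so BASE_URL is a hypothetical placeholder, and the helper name build_add_method_url is an assumption made for illustration. The sketch percent-encodes the value, which is also an assumption for URL-type values (the examples above show them unencoded).

```python
from urllib.parse import urlencode, quote

# Hypothetical placeholder: the real endpoint URL is not given in this section.
BASE_URL = "https://api.example.com/addSearchMethod"

# The three method types listed in the input table.
ALLOWED_METHODS = {"PRODUCT_URL", "SEARCH_URL", "KEYWORD"}


def build_add_method_url(token: str, method: str, domain: str, value: str) -> str:
    """Assemble the query string from the four documented inputs."""
    if method not in ALLOWED_METHODS:
        raise ValueError(f"unknown method type: {method}")
    if domain.startswith("www."):
        # The domain is documented as the website name without "www."
        domain = domain[len("www."):]
    params = {"token": token, "method": method, "domain": domain, "value": value}
    # quote_via=quote encodes spaces as %20, matching the KEYWORD example.
    return f"{BASE_URL}?{urlencode(params, quote_via=quote)}"


url = build_add_method_url("123456789", "KEYWORD", "www.amazon.com", "Iphone 6s")
print(url)
```

Only one method type is accepted per call, mirroring the "only one can be used per Search Method" rule in the table.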


Step 2: Get Products Data

Overview

After a search method has been successfully added, use the “Get Product Data” endpoint to extract the products collected for that method. The endpoint returns product documents (structured, sorted data for each product crawled). Keep in mind that data may take up to 48 hours to appear in the endpoint after the method was added. Updated product information is only available after product activation.

URL Structure:

Example

Input & Output

Input | Input Definition | Output | Example
token | A unique identifier used to authenticate API access | - | ?token=123456789
q | Query for what is desired to search - all product fields can be used for the search | Product Documents corresponding to query | &q=domain:amazon.com
When no query is listed | - | All Product Documents that have been crawled | -
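The q parameter in the example follows a field:value pattern. The sketch below builds the query string both with and without a filter. The function name products_query is hypothetical, and only the single field:value form shown in the table is assumed; any richer query syntax is not covered here.

```python
from typing import Optional
from urllib.parse import urlencode, quote


def products_query(token: str, field: Optional[str] = None,
                   value: Optional[str] = None) -> str:
    """Return the query string for a product lookup.

    With no field/value pair, q is omitted, which per the table above
    returns all product documents that have been crawled.
    """
    params = {"token": token}
    if field is not None and value is not None:
        params["q"] = f"{field}:{value}"
    # safe=":" keeps the field:value separator readable, as in the example.
    return "?" + urlencode(params, safe=":", quote_via=quote)


print(products_query("123456789", "domain", "amazon.com"))
print(products_query("123456789"))
```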

Product Object:

Field Name | Description | Searchable | Type | Example
uuid | A unique ID representing the item | Yes | String | 18e5b30e542cf3a1fbb983e4572551490f07dc15
url | The link to the item | Yes | String | https://amazon.com/dp/B07Q45VKVF
parent_url | The link corresponding to the search method | Yes | String | https://www.amazon.com/s?k=Sky Organics
image_url | A link to the main or first image of the item | Yes | String | https://images-na.ssl-images-amazon.com/images/I/61Xv-uMM%2BXL._SL1024_.jpg
name | The title of the item | No | String | Sky Organics USDA Organic Moroccan Argan Oil: Unrefined, 100% Pure, Cold-Pressed, Moisturizing & Healing, for Dry Skin, Sensitive Skin, Hair Conditioning, Cruelty Free, Vegan, 4 oz (Pack of 2)
description | Product details and description | No | String | Contains: 2 x 4 oz. bottle of Organic Argan Oil by Sky Organics. 100% Pure and Cold-pressed Oil: Our Moroccan Argan Oil is free of synthetic ingredients, fragrance, alcohol, silicones, or fillers…..
price | Original price of product | Yes | String | $28.35
sku | Manufacturer stock keeping unit id | Yes | String | B07Q45VKVF
product_id | Product unique identifier | Yes | String | B07Q45VKVF
review_count | The number of reviews submitted on a product | Yes | Integer | 138
rating_count (source specific field) | The total number of customer ratings that a product has received | Yes | Integer | 138
aggregated_rating | Average rating of product as listed | Yes | Float | 4.2
brand_name | Brand of product listing | Yes | String | Sky Organics
domain | Source name | Yes | String | amazon.com
method | Search method id | Yes | String | 9189
methods | Search method ids, if the product is in multiple methods | Yes | String | 9189, 9187
reviews_retrieval | Activation for reviews (are reviews being retrieved) | Yes | Boolean | false
historical_collection | Whether the product was classified to collect historical reviews, new reviews, or both. | No | Boolean | false
crawled | Date data collected | Yes | Date (Format: yyyy-MM-dd'T'HH:mm:ss.SSSXXX) | 2020-05-21T17:21:27.226+03:00
updated | Date data was last updated | Yes | Date (Format: yyyy-MM-dd'T'HH:mm:ss.SSSXXX) | 2020-05-21T17:21:27.226+03:00
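Two fields in the product object usually need conversion on the client side: the timestamps, which follow what appears to be a Java-style pattern (yyyy-MM-dd'T'HH:mm:ss.SSSXXX), and price, which is delivered as a string. A minimal sketch follows. The helper names are hypothetical, and the price parser assumes a leading "$" as in the example value; other currencies or formats are not handled.

```python
from datetime import datetime, timedelta
from decimal import Decimal

# Python strptime equivalent of the documented yyyy-MM-dd'T'HH:mm:ss.SSSXXX pattern.
DATE_FORMAT = "%Y-%m-%dT%H:%M:%S.%f%z"


def parse_timestamp(text: str) -> datetime:
    """Parse a crawled/updated value into a timezone-aware datetime."""
    return datetime.strptime(text, DATE_FORMAT)


def parse_price(text: str) -> Decimal:
    """Convert a price string such as "$28.35" to a Decimal.

    Assumption: a leading "$" and optional thousands separators only.
    """
    return Decimal(text.lstrip("$").replace(",", ""))


crawled = parse_timestamp("2020-05-21T17:21:27.226+03:00")
price = parse_price("$28.35")
print(crawled.isoformat(), price)
```

Decimal is used instead of float so that prices compare and sum exactly.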

Step 3: Update Products (Review Activation)

Overview

This endpoint lets you choose which product documents (each identified by a uuid) should be continuously refreshed and have their reviews extracted. Setting the reviews_retrieval field to "true" activates review collection: we first collect all available historical reviews for the corresponding product or entity, and then keep collecting new reviews (the delta) on an ongoing basis until the field is set otherwise. Activated products are also re-crawled in parallel to update the product information fields found in the getProducts endpoint. If historical reviews are out of scope for you, set the historical_collection field to "false" so that only new reviews are collected for the activated product. Keep in mind that review data may take up to 48 hours to be collected.

URL Structure:

Update/Activation URL Example:

Usage Example:

Input:

Input | Definition | Example
token | A unique identifier used to authenticate API access | ?token=123456789
uuid | A unique ID representing the item | &uuid=033cd2dc1eadfff83fcc242dcd8b3031496cac7c
reviews_retrieval | "null" when product has not been activated yet; "true" where the objective is to receive reviews for corresponding product; "false" where the objective is to stop receiving reviews for corresponding product | &reviews_retrieval=true
historical_collection | (Optional input parameter) "null" when parameter was not used; "true" where the objective is to collect historical reviews and new reviews; "false" where the objective is to only collect new reviews | &historical_collection=true
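The activation inputs can be sketched as follows. The function name activation_query is hypothetical, and the endpoint address is left out because it is not given in this section. The optional historical_collection parameter is only sent when the caller sets it, which corresponds to the "null when parameter was not used" case in the table.

```python
from typing import Optional
from urllib.parse import urlencode


def activation_query(token: str, uuid: str, reviews_retrieval: bool,
                     historical_collection: Optional[bool] = None) -> str:
    """Build the query string for activating or deactivating review retrieval."""
    params = {
        "token": token,
        "uuid": uuid,
        # The documented values are lowercase "true" / "false".
        "reviews_retrieval": str(reviews_retrieval).lower(),
    }
    if historical_collection is not None:
        params["historical_collection"] = str(historical_collection).lower()
    return "?" + urlencode(params)


# Activate reviews, new reviews only (no historical backfill).
print(activation_query("123456789", "033cd2dc1eadfff83fcc242dcd8b3031496cac7c",
                       True, historical_collection=False))
```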

Output:

Output | Definition | Example
status code and messages | Whether review retrieval was successfully activated or there was an error. | {status: 500, messages: ["Failed to update product : xyz123"]}

If you’re interested in learning more about our current HTTP status codes, see the corresponding section.
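A response of the shape shown in the output table can be checked with a small helper like the one below. This is a sketch under stated assumptions: the body is assumed to parse into a dictionary with status and messages keys, 200 is assumed to be the only success status, and ApiError and check_response are hypothetical names.

```python
class ApiError(Exception):
    """Raised when a response carries a non-success status (hypothetical type)."""


def check_response(body: dict) -> list:
    """Return the messages on success; raise ApiError otherwise.

    Assumption: status 200 is the only success code.
    """
    status = body.get("status")
    messages = body.get("messages", [])
    if status != 200:
        joined = "; ".join(messages)
        raise ApiError(f"status {status}: {joined}")
    return messages


# A success can still carry informational messages.
print(check_response({"status": 200, "messages": ["Topic already exists"]}))
```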


Step 4: Get Product Review Data

Overview

This endpoint lets you extract reviews for activated products (reviews_retrieval field set to “true”). Once the initial historical data has been collected, we continue to crawl activated products and deliver new reviews (the delta from what was already collected, if any) at intervals of up to 48 hours. Data is returned in descending order.

URL Structure:

Example:

Input & Output

Input | Input Definition | Output | Query Example
token | A unique identifier used to authenticate API access | - | ?token=123456789
q | Query for what is desired to search - all product fields can be used for the search | Review Documents corresponding to query | &q=product_uuid:e8fb0a6f89a013e9b385aa22e294724b2e6da361
No Query | - | All Review Documents | -

Review Object, Core Fields:

Field Name | Description | Searchable | Type | Example
uuid | A unique ID representing a post in a thread | Yes | String | 596787e5146389e00e88acb55a8b452b433afc64
review_id | Consumer review unique identifier | Yes | String | 220866944
text | The text body of the review | No | String | We’ve had this TV and sound bar for about a year now and it’s been great. We’re not that tech savvy so it took a little while for us to set up; the sound bar, Wifi, but after we got it hooked up, nice…..
published | The date/time when the review was published. | Yes | Date (Format: yyyy-MM-dd'T'HH:mm:ss.SSSXXX) | 2022-11-06T05:28:00.000+02:00
author | The name of the review author | Yes | String | JakeR
rating | The rating parameter provides the star rating for the review. rating is a floating number between 0.0 to 5.0. | Yes | Float | 5.0
title | The title of the review | No | String | Picture quality is great
domain | The source crawled | Yes | String | https://www.samsung.com/es
url | A link to the review of the item | Yes | String | https://www.samsung.com/es/tvs/qled-tv/q50a-32-inch-qled-smart-tv-qe32q50aauxxc/#220866944
product_id | Corresponding product id | Yes | String | QE32Q50AAUXXC
product_uuid | Corresponding product that was reviewed | Yes | String | cde042856d476a8b9dfde7d7111efe1a8a421ece
image_urls (source specific field) | Images included in customer reviews displayed as direct URLs in a list. | Yes | List | https://m.media-amazon.com/images/I/61TKWWlevAL._SL1600_.jpg
num_of_images (source specific field) | Total number of images posted for the review | Yes | Integer | 6
crawled | The date/time when the review was crawled. | Yes | Date (Format: yyyy-MM-dd'T'HH:mm:ss.SSSXXX) | 2022-11-13T14:30:50.686+02:00

Review Object, Optional Fields:

Field Name | Description | Searchable | Type | Example
is_verified (source specific field) | Whether the reviewer is a verified purchaser | Yes | Boolean | true
variant (source specific field) | Product ID of a specific variant | Yes | String | B01M1NAMVM
syndicated_text (source specific field) | A domain where the review was originally written | Yes | String | amazon.com
is_vine (source specific field) | See: https://www.amazon.com/vine/about | Yes | Boolean | true
is_recommended (source specific field) | Recommended product | Yes | Boolean | true
is_incentivized (source specific field) | The contributor received a free product or service to review | Yes | Boolean | true
is_sweepstakes_entry (source specific field) | The reviewer was offered an entry to win a prize, discount, or any other type of reward for writing a review | Yes | Boolean | true
is_seed_member (source specific field) | Only for homedepot.com | Yes | Boolean | true
is_reviewer_program (source specific field) | Only for homedepot.com | Yes | Boolean | true
is_neighbors_program (source specific field) | Only for wayfair.com | Yes | Boolean | true
had_tried_product (source specific field) | This reviewer was invited to try the product in exchange for their honest opinion. Only for currys.co.uk | Yes | Boolean | true
is_prize_draw_participant (source specific field) | Only for bosch.co.uk | Yes | Boolean | true
is_sponsored_rating (source specific field) | Only for aeg.de | Yes | Boolean | true
is_part_of_competition (source specific field) | Only for aeg.de | No | Boolean | true
purchased (source specific field) | Only for bestbuy.com | Yes | Date (Format: yyyy-MM-dd'T'HH:mm:ss.SSSXXX) | 2023-04-26T03:00:00.000+03:00
is_partner_review (source specific field) | Only for johnlewis.com | No | Boolean | true
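Because the optional fields are source specific, client code should not expect them on every review. The sketch below reads a few of them defensively. It assumes that a field which does not apply to a source is simply absent from the review document (this section does not state that explicitly), and the function name read_optional_flags is hypothetical.

```python
# A few of the optional Boolean fields from the table above.
OPTIONAL_FLAGS = ["is_verified", "is_vine", "is_recommended", "is_incentivized"]


def read_optional_flags(review: dict) -> dict:
    """Return each optional flag, using None when the source does not supply it.

    None ("not provided") is kept distinct from False ("provided and false").
    """
    return {name: review.get(name) for name in OPTIONAL_FLAGS}


review = {"review_id": "220866944", "rating": 5.0, "is_verified": True}
flags = read_optional_flags(review)
print(flags)
```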

Get Search Methods

External endpoint - not part of main flow

Overview

This endpoint lets you manage, access, and sort through the search methods you have added. Reviewing your active search methods shows what type of data you extracted in the past and can suggest what you may want to extract in the future.

URL Structure:

Example:

Input & Output:

Input | Input Definition | Output | Query Example
token | A unique identifier used to authenticate API access | - | ?token=123456789
q | Query for what is desired to search | Methods and details corresponding to the query | &domain=amazon.com
No Query | - | All methods and corresponding details | -

Restriction

  • Currently, this endpoint can only be queried by the “domain” field (however, you can still use all the general GET parameters).

Get Status

External endpoint - not part of main flow

Overview

This endpoint gives API subscription customers the ability to check and understand how many credits are available to them for products and reviews collection.

To learn more about our pricing plans and subscription models please contact [email protected]

URL Structure:

Use the following url to access the “Get Status” endpoint:

Usage Example:

  • As a user I want to check how many products and reviews credits I have available in order to plan out my next requests.
  • User inputs the following query into Get Status endpoint:

Input

Input | Definition | Example
token | A unique identifier used to authenticate API access | ?token=123456789

Output

Output | Definition | Type | Example
productRequestsLeft | How many more product doc requests/credits are available in your current subscription plan. | integer | "productRequestsLeft": 886
reviewRequestsLeft | How many more review doc requests/credits are available in your current subscription plan. | integer | "reviewRequestsLeft": 886
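For the usage example above (planning the next requests against remaining credits), a check along these lines may help. It assumes the response parses into a dictionary with the two documented keys and that the caller estimates how many product and review credits the next batch needs; has_enough_credits is a hypothetical name.

```python
def has_enough_credits(status: dict, products_needed: int, reviews_needed: int) -> bool:
    """Compare planned usage against the two documented counters."""
    return (status["productRequestsLeft"] >= products_needed
            and status["reviewRequestsLeft"] >= reviews_needed)


status = {"productRequestsLeft": 886, "reviewRequestsLeft": 886}
print(has_enough_credits(status, products_needed=500, reviews_needed=900))
```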

Delete Products

External endpoint - not part of main flow

Overview

This endpoint gives API subscription customers the ability to delete products collected - helping the user manage their products' credit capacity in accordance with the package purchased. Please note that when a product is deleted, the corresponding reviews associated with the product will be deleted as well, but the credits used for those reviews will not be recycled.

To learn more about our pricing plans and subscription models please contact [email protected]

URL Structure

Use the following url to access the “Delete Products” endpoint:

Usage Example:

Input

Input | Definition | Example
token | A unique identifier used to authenticate API access | ?token=123456789
uuid | A unique ID representing the item | &uuid=xyz123

Output

Output | Definition | Example
status code and message | Whether the product was successfully deleted or there was an error. | {status: 500, messages: ["Failed to delete product : xyz123"]}