Educational Resources from ScrapeHero

Customer Education
Details about our process, terminology and techniques

pre-sales resources

Legal Information

Disclaimer: ScrapeHero does NOT provide legal advice. Please consult your own legal counsel for advice. This page will provide updated information about the legal issues related to web scraping, with links to experts in this field and recent court cases. If you have recent information or see any

Read More »

Our Delivery Process

Our Delivery Process and Typical Timeline. 1. Requirements: Once you reach out to us, we will collect your requirements by asking a few simple questions about the data sources, the frequency of data gathering, and any custom processing that you might need. We may have a call to discuss the

Read More »

Page Count

A webpage, or page, quantifies the volume of consumption of our data services. As a result, one of the main variables in our pricing is the number of pages (specifically, webpages) that we scrape. The more you consume, the more it costs (with increasing volume discounts lowering the per-page cost).

Read More »

Data handling and formats


Web Scraping and ETL – Extract, Transform and Load

The data gathered from the internet through web scraping is usually unstructured and needs to be formatted before it can be used for analysis. This page goes into detail about a couple of common needs based on the data that we provide – “Formatting of the extracted Data in various

Read More »

The best data and file formats for scraped data

The data we provide comes in various forms from the source and is largely text (barring rich media such as images and videos or proprietary file formats such as PDFs). Our customers need this data in various formats and the key to a successful and scalable solution that fits the

Read More »

product and price monitoring


Product Monitoring – Access Method

Price Impact of Access Method. Type of Access: Direct vs Browse vs Search based. The method of product identification has a direct and substantial impact on pricing, so we would like to explain the difference clearly so that you can optimize your spend. Direct Access: If we get direct

Read More »

Product Matching Challenges

An “apples to apples” comparison of products across various websites is not as easy as it sounds. Let’s say that Samsung would like to monitor its smartphones across various popular eCommerce websites such as Amazon.com, Walmart.com, Bestbuy.com, etc., and for simplicity’s sake, let’s focus on just one particular

Read More »

web data


Real Estate Data – Quality and Challenges

Real estate data available online is rife with quality issues and a lack of consistency. In this post we will demonstrate some of the issues with examples and describe some of the challenges associated with real estate data management – collecting, cleaning, and standardizing real estate data. We will be focusing

Read More »

Location based Data

Online data is becoming increasingly localized. The days of a website showing the same data or information to all users are fast ending. Websites are dynamically showing different content and data based on the location of the user. In this article we will highlight the

Read More »

Samples

Thank you for expressing a need for sample data. This page describes our policy and the reasons behind it. Why Samples? Potential customers usually ask for samples upfront for 2 reasons: 1. Does ScrapeHero have the ability to get this data? We are one of the top companies

Read More »

technical articles

How To Open a Unicode CSV in Excel (the right way)

When we scrape data in non-English languages and give you a CSV file, the data may appear corrupted or unreadable when you double-click the file to open it in Excel. This issue occurs because we scrape the data as

Read More »
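As a rough illustration of the encoding issue described in the excerpt above, here is a minimal Python sketch. It assumes the CSV is UTF-8 encoded (the excerpt is cut off before it names the encoding) and that the spreadsheet application uses a byte order mark (BOM) as a hint to detect UTF-8, which is how Excel commonly behaves. The function name, file name, and sample rows are hypothetical and are not taken from the article.

```python
import codecs
import csv
import os
import tempfile


def write_csv_with_bom(path, rows):
    """Write rows as a UTF-8 CSV that starts with a byte order mark."""
    # The "utf-8-sig" codec prepends the three BOM bytes (EF BB BF)
    # before the first character written to the file.
    with open(path, "w", encoding="utf-8-sig", newline="") as f:
        csv.writer(f).writerows(rows)


# Hypothetical sample data containing non-ASCII text.
rows = [["name", "city"], ["José", "München"], ["東京", "日本"]]
demo_path = os.path.join(tempfile.mkdtemp(), "unicode_demo.csv")
write_csv_with_bom(demo_path, rows)

# Confirm the file really begins with the UTF-8 BOM.
with open(demo_path, "rb") as f:
    starts_with_bom = f.read(3) == codecs.BOM_UTF8

# Reading with the same codec strips the BOM again.
with open(demo_path, encoding="utf-8-sig", newline="") as f:
    round_trip = list(csv.reader(f))
```

This is only one possible workaround; the linked article may describe a different procedure, such as importing the file through Excel's data import dialog.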

Opening numbers related data in Excel the right way

CSV files commonly contain numbers, and when you open those files in Microsoft Excel you may encounter issues with how the data is displayed. Some common problems are: leading zeros may get dropped – very commonly seen

Read More »
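The dropped leading zeros mentioned in the excerpt above can be reproduced outside Excel. The following minimal Python sketch uses hypothetical column names and values (not taken from the article) to show that the zeros survive as long as a field is kept as text, and disappear once the field is converted to a number.

```python
import csv
import io

# Hypothetical CSV content with values that start with zeros.
sample = "zip_code,phone\n02134,0044123456\n00501,0123456789\n"

# csv.DictReader returns every field as a string, so nothing is lost here.
rows = list(csv.DictReader(io.StringIO(sample)))
as_text = [row["zip_code"] for row in rows]

# Converting to int mimics what happens when a tool guesses "number":
# the leading zeros cannot be represented and are dropped.
as_number = [int(row["zip_code"]) for row in rows]

print(as_text)
print(as_number)
```

The same idea applies when importing into a spreadsheet: marking such columns as text during import, rather than letting the application infer a numeric type, keeps the original values intact.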