Scrape Google Trends in Python

Google Trends is a tool that analyzes the popularity of top search queries in Google Search across various regions and languages using real-time data.

By scraping Google Trends, you can track the latest search patterns and trends, which can inform your marketing, product, and content decisions.

This article is a step-by-step guide on scraping Google Trends data by building a custom scraper using Playwright. 

We also discuss how you can scrape with Pytrends, an unofficial API to scrape Google Trends.

Initial Requirements

Before you begin scraping Google Trends with Playwright, make sure that you install the required dependencies.

  1. Install the Playwright library along with the Pytest plugin for testing automation
pip install pytest-playwright
  2. Install the browsers that Playwright automates (Chromium, Firefox, and WebKit)
playwright install
  3. Install the lxml library for parsing and manipulating HTML/XML in Python
pip install lxml
  4. Import the libraries needed to handle CSV files, asynchronous operations, HTML parsing, and browser automation with Playwright
import csv
import asyncio
from lxml import html
from playwright.async_api import Playwright, async_playwright
  5. Define an asynchronous function, run, that uses Playwright to control the browser
async def run(playwright: Playwright) -> None:
  6. Launch a Chromium browser in non-headless mode, create a new browser context, and open a new page (tab) for web navigation
    browser = await playwright.chromium.launch(headless=False)
    context = await browser.new_context()
    page = await context.new_page()
  7. Navigate to the Google Trends page and wait up to 30 seconds for it to load
    await page.goto("https://trends.google.com/trending?geo=US")
    await page.wait_for_load_state(timeout=30000)
  8. Fetch the HTML content of the page and parse it with lxml so that you can query its elements
    response = await page.content()
    print(len(response))
    tree = html.fromstring(response)
  9. Use XPath to locate all the rows that contain trending topics and print the number of rows found
    rows = tree.xpath('//tr[@role="row"]')
    print(len(rows))
  10. Extract the data from each row and append it to the data list for further processing
    data = []
    for element in rows:
        row_data = element.xpath('./td/div/text()')
        volume = element.xpath('./td/div/div/text()')
        # Skip rows, such as a header row, that have no data cells
        if not row_data or not volume:
            continue
        row_data.append(volume[0])
        data.append(row_data)
  11. Define the CSV column titles and write the extracted data to a CSV file (trending_topics.csv)
    titles = ['trends', 'started', 'volume']
    # Write as UTF-8 so that non-ASCII topic names are saved correctly
    with open('trending_topics.csv', 'w', newline='', encoding='utf-8') as file:
        writer = csv.writer(file)
        writer.writerow(titles)
        writer.writerows(data)
  12. Define and run the main function, which initializes Playwright and executes the run function asynchronously
async def main() -> None:
    async with async_playwright() as playwright:
        await run(playwright)

asyncio.run(main())

The complete Playwright script:

import csv
import asyncio
from lxml import html
from playwright.async_api import Playwright, async_playwright


async def run(playwright: Playwright) -> None:
    """
    Launches a Chromium browser and extracts trending topics from Google Trends.
    """
    browser = await playwright.chromium.launch(headless=False)
    # Create a new browser context
    context = await browser.new_context()
    # Open a new page in the browser context
    page = await context.new_page()
    
    # Navigate to the Google Trends page for the US
    await page.goto("https://trends.google.com/trending?geo=US")
    await page.wait_for_load_state(timeout=30000)
    
    # Get the page content as a string
    response = await page.content()
    print(len(response))
    
    # Parse the HTML content using lxml
    tree = html.fromstring(response)
    
    # Extract all row elements containing trending topics
    rows = tree.xpath('//tr[@role="row"]')
    print(len(rows))
    
    # Initialize a list to store the extracted data
    data = []
    
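    # Iterate over each row and extract relevant data
    for element in rows:
        row_data = element.xpath('./td/div/text()')
        volume = element.xpath('./td/div/div/text()')
        # Skip rows, such as a header row, that have no data cells
        if not row_data or not volume:
            continue
        row_data.append(volume[0])
        data.append(row_data)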
    
    # Define the column titles for the CSV
    titles = ['trends', 'started', 'volume']
    
    # Write the extracted data to a CSV file (UTF-8 keeps non-ASCII topic names intact)
    with open('trending_topics.csv', 'w', newline='', encoding='utf-8') as file:
        writer = csv.writer(file)
        writer.writerow(titles)
        writer.writerows(data)
    
    # Close the browser context and browser
    await context.close()
    await browser.close()


async def main() -> None:
    """
    Runs the Playwright script.
    """
    async with async_playwright() as playwright:
        await run(playwright)


asyncio.run(main())
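
The XPath queries in the script can return a different number of text nodes per row, for example when a cell is empty, which would misalign the CSV columns. The sketch below shows one way to clean and pad each row before writing it. It is an illustration only: normalize_row and rows_to_csv are hypothetical helper names, and the sample rows are made up rather than taken from Google Trends.

```python
import csv
import io

COLUMNS = ["trends", "started", "volume"]


def normalize_row(cells, width=len(COLUMNS)):
    """Strip whitespace, drop empty cells, then pad or trim to `width` columns."""
    cleaned = [c.strip() for c in cells if c and c.strip()]
    return (cleaned + [""] * width)[:width]


def rows_to_csv(rows):
    """Return a CSV string with a header and one normalized line per non-empty row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(COLUMNS)
    writer.writerows(normalize_row(r) for r in rows if any(c.strip() for c in r))
    return buffer.getvalue()


# Made-up rows standing in for the lists that the XPath queries return
sample_rows = [
    ["  example topic ", "2 hours ago", "50K+"],
    ["another topic", "5 hours ago"],  # volume missing
    [],                                # header or empty row
]
print(rows_to_csv(sample_rows))
```

In the scraper, the same normalization could be applied to data before calling writer.writerows. Padding on the right assumes that only trailing cells go missing; a value missing from the middle of a row would still shift the columns.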

You can also scrape Google Trends data with Pytrends, an unofficial API wrapper for Google Trends. 

Pytrends allows you to access the trend data without needing browser automation, simplifying querying and retrieving data directly in a structured format.

  1. Install Pytrends by running the following command in your terminal
pip install pytrends
  2. Import Pytrends and pandas
from pytrends.request import TrendReq
import pandas as pd

TrendReq is the Pytrends class that connects to Google Trends. Pytrends returns its results as pandas DataFrames.

  3. Set up a Pytrends connection to start interacting with Google Trends
pytrends = TrendReq(hl='en-US', tz=360)

Note that hl='en-US' sets the language of the results to US English, and tz=360 sets the timezone offset in minutes (360 corresponds to US Central Standard Time).

  4. Define the keywords you want to search for and build the request
keywords = ['Python', 'Java', 'JavaScript']
pytrends.build_payload(keywords, cat=0, timeframe='today 12-m', geo='', gprop='')

The build_payload method sets up the search. Its parameters are:

  • cat=0 (category of interest; 0 means all categories),
  • timeframe='today 12-m' (data for the past 12 months),
  • geo='' (empty means global data),
  • gprop='' (empty means Google web search rather than a specific Google property, such as 'news' or 'images').
  5. Retrieve the interest over time data for the specified keywords
interest_over_time_df = pytrends.interest_over_time()
  6. Print the data
print(interest_over_time_df.head())

The head() method displays the first five rows of the data, which gives you a quick look at the results.

  7. Save the data to a CSV file for later use
interest_over_time_df.to_csv('google_trends_data.csv')

The complete Pytrends script:

from pytrends.request import TrendReq
import pandas as pd

# Set up Pytrends connection
pytrends = TrendReq(hl='en-US', tz=360)

# Define keywords and fetch data
keywords = ['Python', 'Java', 'JavaScript']
pytrends.build_payload(keywords, cat=0, timeframe='today 12-m', geo='', gprop='')

# Retrieve interest over time
interest_over_time_df = pytrends.interest_over_time()

# Display data
print(interest_over_time_df.head())

# Save data to CSV (optional)
interest_over_time_df.to_csv('google_trends_data.csv')
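
Once the data is saved, you can summarize it without calling Google Trends again. The sketch below reads a CSV and computes the average interest per keyword using only the standard library. It assumes the file has a date column, one column per keyword, and an isPartial column, which is the layout expected from interest_over_time(); average_interest is a hypothetical helper name, and the sample values are made up.

```python
import csv
import io
from statistics import mean


def average_interest(csv_text):
    """Return the mean interest per keyword, ignoring the date and isPartial columns."""
    reader = csv.DictReader(io.StringIO(csv_text))
    keywords = [name for name in reader.fieldnames if name not in ("date", "isPartial")]
    values = {name: [] for name in keywords}
    for row in reader:
        for name in keywords:
            values[name].append(int(row[name]))
    return {name: mean(scores) for name, scores in values.items()}


# Made-up values in the layout assumed for google_trends_data.csv
sample_csv = """date,Python,Java,JavaScript,isPartial
2024-01-07,80,40,30,False
2024-01-14,90,50,20,False
2024-01-21,100,45,25,True
"""
print(average_interest(sample_csv))
```

To run it on the real output, read google_trends_data.csv into a string and pass it to the function. Keep in mind that Google Trends values are relative scores from 0 to 100, not absolute search counts.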

Why scrape Google Trends?

Google Trends is a valuable data source for businesses, marketers, and researchers.

Some of the key reasons to scrape Google Trends are:

  • Market Research
  • Product Development
  • Geographic Insights
  • Performance Tracking
  • Data-Driven Decisions

1. Market Research

Google Trends data can help you understand emerging trends in consumer behavior and compare search interest in your brand with that of your competitors.

2. Product Development

By scraping Google Trends, you can identify consumer preferences related to products or services and gather insights into what users are looking for.

3. Geographic Insights

With Google Trends data, you can discover geographic regions with higher search interest for specific keywords and adjust your marketing strategies based on this information.

4. Performance Tracking

You can track the changing search interest in your keywords over time and compare the performance of your products against industry benchmarks with Google Trends data.

5. Data-Driven Decisions

Google Trends data helps you make data-driven decisions based on real-time and historical search trends and plan long-term business strategies.

Wrapping Up

Google Trends is a valuable tool for understanding customer behavior and expectations. 

Enterprises can use the data extracted from Google Trends to gain a competitive edge in the market. The code explained in this article is ideal for small-scale data extraction. 

If you are an enterprise with more extensive data requirements, consider a full web scraping service. At a larger scale, you are more likely to encounter web scraping challenges, including anti-scraping technologies and legal issues.

For enterprises with significant data needs, a dependable data partner like ScrapeHero is essential. Our custom web scrapers are designed to overcome the challenges of web scraping and ensure the job is done to perfection. 

We are an enterprise-grade web scraping service with a 98% customer retention rate. We provide high-quality data services and fulfill all our customers’ data needs. 

Frequently Asked Questions

1. Is it legal to scrape Google Trends?

The legality of web scraping depends on several factors: the use of data, adherence to privacy laws, respect for website terms of service, and the impact on website performance.

2. How can you create a Google Trends scraper in Python?

You can create a Google Trends scraper in Python with the Pytrends library, which lets you fetch trend data programmatically, or with a browser automation tool such as Playwright.

3. What is the Google Trends API?

Google does not officially provide a Google Trends API. Pytrends is an unofficial API that allows you to query Google Trends data through Python.

4. What is the rate limit for Google Trends?

Google doesn’t explicitly state the rate limits for Trends. However, sending many queries in a short period can lead to temporary blocks.
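
One common way to cope with temporary blocks is to retry a failed request after an increasing delay. The sketch below is a generic illustration, not something Pytrends or Google prescribes: fetch_with_backoff and flaky_request are hypothetical names, and the delay values are arbitrary assumptions.

```python
import time


def fetch_with_backoff(fetch, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call fetch(); on failure wait base_delay * 2**attempt seconds, then retry."""
    for attempt in range(max_attempts):
        try:
            return fetch()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)


# Hypothetical stand-in for a request that is blocked twice, then succeeds
calls = {"count": 0}


def flaky_request():
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("temporarily blocked")
    return "trend data"


delays = []
# Record the delays instead of sleeping so the example runs instantly
result = fetch_with_backoff(flaky_request, sleep=delays.append)
print(result, delays)
```

With Pytrends, you could pass a callable such as lambda: pytrends.interest_over_time() as fetch. Catching every Exception is deliberately broad here; in real code, narrow it to the errors you actually see when a request is blocked.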
