Need Points of Interest? Here’s How to Scrape Google Maps POI Data

Google Maps is a highly dynamic website, which makes scraping POI (Points of Interest) data challenging. It is still possible: you can use Python and Selenium to navigate Google Maps, render JavaScript, and extract the necessary data.

This tutorial shows you how to scrape POI data from Google Maps.

Data Scraped From Google Maps

The tutorial scrapes POI data from Google Maps across six categories:

  • Banks
  • Car Washes
  • Clinics
  • Stores
  • Hotels
  • Pharmacies

For each point of interest, the code extracts six data points:

  • Name
  • Rating
  • Review Count
  • Address
  • Phone Number
  • Website

You need to analyze the HTML code of the Google Maps search results page to find selectors that uniquely locate these data points. Once you do that, you can begin setting up the environment.

As mentioned above, this tutorial uses Selenium for data scraping; if you want to use Playwright, read this article on how to scrape Google Maps.

Scrape Google Maps POI Data: The Environment

The tutorial requires three external libraries:

  1. Selenium: Enables interaction with web pages, execution of JavaScript, and data extraction.
  2. BeautifulSoup: Offers intuitive methods for extracting data from HTML code.
  3. Geopy: Provides the latitude and longitude for a given location.

Install them with pip:

pip install selenium beautifulsoup4 geopy

Want to learn more about scraping with Selenium? Read our article on how to scrape a dynamic website.

Scrape Google Maps POI Data: The Code

1. Import Packages

Start by importing necessary modules or classes from the aforementioned packages.

from selenium import webdriver
from selenium.webdriver.common.by import By
from geopy.geocoders import Nominatim
from bs4 import BeautifulSoup

import json, time

In this code snippet: 

  • webdriver: Controls the Selenium browser.
  • By: Specifies the selector type for data extraction.
  • Nominatim: Retrieves latitude and longitude for a location.
  • BeautifulSoup: Parses the HTML code of each listing.
  • json: Saves the extracted data as a JSON file.
  • time: Provides the sleep() function that pauses the script execution for a specified duration.

2. Define functions

Define three functions:

  • getElements(): Returns the HTML code of the elements containing POIs.
  • extractDetails(): Extracts required data points from the HTML elements.
  • getData(): Calls the above two functions and saves the extracted data as a JSON file.

Let’s look at the functions in detail.

getElements()

The function takes a category, latitude, and longitude as inputs and returns an array containing the HTML code of POI listings for that specific location.

Begin by launching the Selenium browser with the options (headless mode and a custom user agent) that are defined at the top of the complete script:

browser = webdriver.Chrome(options=options)

Construct the URL of the page containing POI listings, which includes the category, latitude, and longitude:

url = f"https://www.google.com/maps/search/{category}/@{lat},{long}"
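Category names such as 'car washes' contain spaces, so it can be safer to percent-encode the category before placing it in the URL path. The sketch below is one way to do that with the standard library; build_search_url is a hypothetical helper name, not part of the original script.

```python
from urllib.parse import quote


def build_search_url(category, lat, long):
    """Build a Google Maps search URL with a percent-encoded category."""
    # quote() turns spaces into %20 so the path segment is URL-safe
    return f"https://www.google.com/maps/search/{quote(category)}/@{lat},{long}"


print(build_search_url("car washes", 40.8568, -74.1285))
```

Browsers usually encode spaces on their own, so this is a precaution rather than a requirement.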

Navigate to the URL using the get() method of the Selenium webdriver. Pause execution for 3 seconds to ensure all required elements are loaded:

browser.get(url)
time.sleep(3)
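A fixed three-second pause can be too short on a slow connection and wasteful on a fast one. A common alternative is to poll for a condition until a timeout expires; Selenium ships WebDriverWait for this purpose. The sketch below illustrates the same idea with a library-agnostic helper; wait_until and its parameters are hypothetical names introduced here, not part of the original script.

```python
import time


def wait_until(condition, timeout=10.0, interval=0.5):
    """Poll condition() until it returns a truthy value or timeout seconds pass.

    Returns the truthy value, or None if the timeout expires first.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            return None
        time.sleep(interval)


# Example with a stand-in condition that succeeds on the third check
attempts = {"count": 0}


def ready():
    attempts["count"] += 1
    return attempts["count"] >= 3


print(wait_until(ready, timeout=2.0, interval=0.01))
```

With Selenium, the condition could be a lambda that calls browser.find_elements() and returns the resulting list, which is empty (and therefore falsy) until the elements appear.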

Locate the div element containing the listings to find the elements holding the POI data:

results = browser.find_element(By.XPATH,f'//div[@aria-label="Results for {category}"]')

Since the page uses lazy-loading to load the POI elements, you need to scroll. Set an upper limit on scrolls while ensuring at least ten elements are loaded:

# Collect the HTML of the listings loaded so far
listings = [el.get_attribute('outerHTML') for el in results.find_elements(By.CLASS_NAME,"lI9IFe")]
linkCount = len(listings)
i = 1

while linkCount <= 10 and i < 20:

    try:
        browser.execute_script("arguments[0].scrollTop = arguments[0].scrollHeight", results)
        time.sleep(1)  # give lazy-loaded listings time to appear

        listings = [el.get_attribute('outerHTML') for el in results.find_elements(By.CLASS_NAME,"lI9IFe")]
        linkCount = len(listings)
        i+=1

    except Exception as e:
        print(e)
        break

The loop stops once more than ten listings are loaded or 19 scrolls have been performed, since i starts at 1 and the loop runs only while i is below 20.

Each iteration stores the listing’s HTML code in an array, which is returned after the loop completes:

return listings
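If a location has ten or fewer results, the loop above keeps scrolling until the scroll limit is reached. A possible refinement is to stop early when a scroll adds no new listings. The sketch below shows that termination logic in isolation; scroll_until, scroll_once, and count_items are hypothetical names, and in the real script the two callables would wrap the execute_script() and find_elements() calls.

```python
def scroll_until(scroll_once, count_items, min_items=10, max_scrolls=19):
    """Scroll until more than min_items are loaded, the scroll budget is
    spent, or a scroll adds nothing new. Returns (item_count, scrolls_used)."""
    count = count_items()
    scrolls = 0
    while count <= min_items and scrolls < max_scrolls:
        scroll_once()
        scrolls += 1
        new_count = count_items()
        if new_count == count:
            # Nothing new appeared, so assume the end of the list was reached
            break
        count = new_count
    return count, scrolls


# Stand-in for a results panel that loads 4 listings per scroll, 9 in total
state = {"loaded": 4}


def fake_scroll():
    state["loaded"] = min(state["loaded"] + 4, 9)


print(scroll_until(fake_scroll, lambda: state["loaded"]))
```

On a live page a scroll may add nothing simply because the content has not loaded yet, so this check works best when each scroll is followed by a short wait.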

extractDetails()

This function extracts the necessary data from the HTML elements obtained via getElements(). It accepts a dict that maps each category to its list of HTML elements, loops through them, retrieves the data points, and returns another dict containing the extracted POI data.

Here is how it looks:

def extractDetails(data):

    places_of_interest = {}

    for info in data:
        category_data = []

        for d in data[info]:  
            soup = BeautifulSoup(d, 'html.parser')  # name the parser explicitly to avoid a warning

            try:
                url = soup.find('div',{'class':'Rwjeuc'}).a['href']
            except:
                url = "Not Available"
            name = soup.find('div',{'class':'qBF1Pd'}).text
            try:
                rating = soup.find('span',{'class':'MW4etd'}).text
            except:
                rating = 'Not Available'
            try:
                review_count = soup.find('span',{'class':'UY7F9'}).text.replace('(','').replace(')','')
            except:
                review_count = 'Not Available'

            details = soup.find_all('div',{'class':'W4Efsd'})

            try:
                address = details[2].text.split('·')[2]
            except:
                address = 'Not Available'
            try:
                phone = details[3].text.split('·')[1]
            except:
                phone = 'Not Available'

            all_details = {
                'Name':name,
                'Rating':rating,
                'Review Count':review_count,
                'Address':address,
                'Phone':phone,
                'Website':url
            }        

            category_data.append(all_details)
        places_of_interest[info] = category_data

    return places_of_interest

This code initializes an empty dict that will hold the POI data for every category.

It then iterates through each key in the input dict, and for each one it:

  1. Defines an empty array for that category’s POI data.
  2. Loops through the HTML elements in that category, where each iteration:
       • Parses the element with BeautifulSoup
       • Extracts the required details
       • Saves them in a dict
       • Appends the dict to the array defined earlier
  3. Updates the main dict with the category name as the key and the extracted data as the value.

Finally, the function returns the dict containing the extracted POI data. 
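The extracted values are plain strings, and fields split on '·' often keep a leading space (for example ' 211 Main Ave'). A small post-processing step can tidy them up. The sketch below is one possible approach; clean_record is a hypothetical helper, and the assumption that review counts may contain thousands separators comes from typical listings rather than from the original code.

```python
def clean_record(record):
    """Return a copy of one POI dict with trimmed text and numeric fields."""
    cleaned = {key: value.strip() for key, value in record.items()}

    # Convert the rating to a float when it looks like a number
    try:
        cleaned["Rating"] = float(cleaned["Rating"])
    except ValueError:
        cleaned["Rating"] = None

    # Convert the review count to an int, ignoring thousands separators
    try:
        cleaned["Review Count"] = int(cleaned["Review Count"].replace(",", ""))
    except ValueError:
        cleaned["Review Count"] = None

    return cleaned


sample = {
    "Name": "Valley Bank ATM",
    "Rating": "4.1",
    "Review Count": "54",
    "Address": " 211 Main Ave",
    "Phone": " (973) 777-6441",
    "Website": "https://locations.valley.com/nj/passaic/valley-bank-2a.html",
}
print(clean_record(sample))
```

Values such as 'Not Available' fail the numeric conversion and become None, which keeps missing data distinguishable from a real zero.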

getData()

This function integrates getElements() and extractDetails().

Start by prompting the user for a location using input().

search = input('enter a place')

Next, use Geopy to get the latitude and longitude.

geolocator = Nominatim(user_agent='poi')
location = geolocator.geocode(search)
lat = location.latitude
long = location.longitude
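geocode() returns None when Nominatim cannot find the place, so reading location.latitude would then raise an AttributeError. The sketch below shows one way to guard against that; resolve_coordinates is a hypothetical helper that accepts any geocoding callable (such as geolocator.geocode), which also makes it possible to try it without network access.

```python
from types import SimpleNamespace


def resolve_coordinates(geocode, query):
    """Return (latitude, longitude) for query, or raise ValueError if not found."""
    location = geocode(query)
    if location is None:
        raise ValueError(f"Could not find coordinates for {query!r}")
    return location.latitude, location.longitude


# Stand-in geocoder for demonstration; the coordinates are made-up values
def fake_geocode(query):
    known = {"Passaic": SimpleNamespace(latitude=40.85, longitude=-74.12)}
    return known.get(query)


print(resolve_coordinates(fake_geocode, "Passaic"))
```

In the script, the call would look like resolve_coordinates(geolocator.geocode, search).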

Create an array of categories that will be used to construct the URLs:

categories = ['banks', 'car washes', 'clinics', 'stores', 'hotels', 'Pharmacies']

Iterate through these categories and call getElements() in each iteration to collect HTML elements into a dictionary.

poi_data = {}
for category in categories:
    poi_data[category] = getElements(category,lat,long)
    print(f'{category} data extracted')

Pass the dict to extractDetails(), which returns another dict containing extracted POI across all categories.

details = extractDetails(poi_data)

Finally, save this extracted data into a JSON file. 

with open(f'{search}_poi.json','w',encoding='utf-8') as f:
    json.dump(details, f, indent=4, ensure_ascii=False)

You can now run the complete script by calling getData():

if __name__ == "__main__":
    getData()

The results from extracting Google Maps POI data will resemble this format.

{
    "Name": "Valley Bank ATM",
    "Rating": "4.1",
    "Review Count": "54",
    "Address": " 211 Main Ave",
    "Phone": " (973) 777-6441",
    "Website": "https://locations.valley.com/nj/passaic/valley-bank-2a.html"
}
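Because the saved file maps each category to a list of POI dicts, it is easy to load it back and check how many places were collected. The sketch below assumes that structure; count_by_category is a hypothetical helper, and the sample file is written to a temporary directory only for demonstration.

```python
import json
import os
import tempfile


def count_by_category(path):
    """Load a saved POI file and return {category: number of POIs}."""
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    return {category: len(pois) for category, pois in data.items()}


# Write a tiny sample file in the same shape as the script's output
sample = {
    "banks": [{"Name": "Valley Bank ATM", "Rating": "4.1"}],
    "hotels": [],
}
with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, "sample_poi.json")
    with open(sample_path, "w", encoding="utf-8") as f:
        json.dump(sample, f, indent=4, ensure_ascii=False)
    print(count_by_category(sample_path))
```

A category with a count of zero is a quick hint that the results panel was not found or that the class names on the page have changed.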

Complete Code Example

Here’s the entire code to extract Google Maps POI data.

from selenium import webdriver
from selenium.webdriver.common.by import By
from geopy.geocoders import Nominatim
from bs4 import BeautifulSoup

import json, time

options = webdriver.ChromeOptions()
options.add_argument("--headless=new")
options.add_argument("user-agent=Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36")

def getElements(category, lat, long):

    browser = webdriver.Chrome(options=options)
    url = f"https://www.google.com/maps/search/{category}/@{lat},{long}"


    browser.get(url)
    time.sleep(3)


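    try:
        results = browser.find_element(By.XPATH,f'//div[@aria-label="Results for {category}"]')
    except Exception:
        # Without the results panel there is nothing to scrape for this category
        print(url)
        browser.quit()
        return []

    # Collect the HTML of the listings loaded so far
    listings = [el.get_attribute('outerHTML') for el in results.find_elements(By.CLASS_NAME,"lI9IFe")]
    linkCount = len(listings)
    i = 1

    while linkCount <= 10 and i < 20:

        try:
            browser.execute_script("arguments[0].scrollTop = arguments[0].scrollHeight", results)
            time.sleep(1)  # give lazy-loaded listings time to appear

            listings = [el.get_attribute('outerHTML') for el in results.find_elements(By.CLASS_NAME,"lI9IFe")]
            linkCount = len(listings)
            i+=1
        except Exception as e:
            print(e)
            break

    browser.quit()  # close the browser so each category does not leave one running
    return listings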


def extractDetails(data):

    places_of_interest = {}


    for info in data:
        category_data = []

        for d in data[info]:  
            soup = BeautifulSoup(d, 'html.parser')  # name the parser explicitly to avoid a warning

            try:
                url = soup.find('div',{'class':'Rwjeuc'}).a['href']
            except:
                url = "Not Available"

            name = soup.find('div',{'class':'qBF1Pd'}).text


            try:
                rating = soup.find('span',{'class':'MW4etd'}).text
            except:
                rating = 'Not Available'
            try:
                review_count = soup.find('span',{'class':'UY7F9'}).text.replace('(','').replace(')','')
            except:
                review_count = 'Not Available'

            details = soup.find_all('div',{'class':'W4Efsd'})

            try:
                address = details[2].text.split('·')[2]
            except:
                address = 'Not Available'
            try:
                phone = details[3].text.split('·')[1]
            except:
                phone = 'Not Available'


            all_details = {
                'Name':name,
                'Rating':rating,
                'Review Count':review_count,
                'Address':address,
                'Phone':phone,
                'Website':url
            }        


            category_data.append(all_details)

        places_of_interest[info] = category_data

    return places_of_interest

def getData():


    search = input('enter a place')

    print('Decoding latitude and longitude')

    geolocator = Nominatim(user_agent='poi')
    location = geolocator.geocode(search)

    lat = location.latitude
    long = location.longitude

    categories = ['banks', 'car washes', 'clinics', 'stores', 'hotels', 'Pharmacies']

    print('Commencing extraction')

    poi_data = {}
    for category in categories:
        poi_data[category] = getElements(category,lat,long)
        print(f'{category} data extracted')

    #data = dict(zip(categories,poi_data))

    details = extractDetails(poi_data)

    print('Extraction completed')

    with open(f'{search}_poi.json','w',encoding='utf-8') as f:
        json.dump(details, f, indent=4, ensure_ascii=False)

if __name__ == "__main__":
    getData()

Code Limitations

While this tutorial demonstrates how to scrape Google Maps POI data effectively, there are limitations:

  • It is not suitable for large-scale web scraping since it lacks techniques to bypass anti-scraping measures.
  • You must monitor changes in Google Maps’ HTML structure; any alterations will require updates to your code to avoid breaking functionality. 
  • The code only extracts six data points; if you want more, you’ll need to modify the code further.

Alternative POI Sources: ScrapeHero Cloud and Datastore

If you prefer not to code yourself, consider using ScrapeHero’s alternative sources for POI data through its Cloud and Datastore. 

ScrapeHero Cloud

ScrapeHero Cloud is a web scraping platform that offers no-code web scrapers. Its Google Maps Search Results Scraper allows you to quickly gather POI data with just a few clicks.

To use this scraper for Google Maps POI data, follow these steps:

  1. Sign up for ScrapeHero Cloud
  2. Create a new project 
  3. Name the Project
  4. Enter the search queries
  5. Click ‘Gather Data’
  6. Download the data when finished

ScrapeHero Datastore

ScrapeHero Datastore simplifies the process even further by directly providing POI data. You can easily obtain high-quality data by:

  1. Visiting ScrapeHero Datastore
  2. Adding the desired data to your cart
  3. Navigating to your cart
  4. Completing the payment process

Why Use ScrapeHero’s Web Scraping Service?

If you write the code yourself, scraping a few dozen POIs might be manageable, but large-scale scraping of thousands of POIs across multiple locations becomes more complex. This is where ScrapeHero’s fully managed web scraping service comes into play.

ScrapeHero offers a comprehensive service that handles the entire scraping process for you.

We provide custom solutions to handle dynamic websites like Google Maps for large-scale projects. You can forget about managing proxies, CAPTCHAs, or any other complexities associated with scraping protected sites.
