Just Dial Data extractor tool

Just Dial Data Extractor is an easy-to-use Python script that pulls business listings from a Justdial results page and saves them in .csv format.

This Python script extracts the following data:

  1. Phone number
  2. Name
  3. Address

Let’s now get our hands dirty!!

  • Python version: I’m using Python 3 (current Selenium releases no longer support Python 2). I’m using Jupyter Notebook, so you don’t need any command-line knowledge.
  • Packages: Install Selenium, along with webdriver-manager and pandas, which the script also imports, using the following command.
!pip install selenium webdriver-manager pandas

That’s it! You are all set.
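If you want to double-check the setup from inside the notebook, here is a minimal sketch. The helper name missing_packages is my own, not part of the script; it only reports which of the modules the script imports cannot be found.

```python
import importlib.util


def missing_packages(names):
    """Return the names from `names` that cannot be imported in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Import names used by the script (webdriver_manager is installed as webdriver-manager).
print(missing_packages(['selenium', 'webdriver_manager', 'pandas']))
```

An empty list means everything the script imports is available.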

Approach:

  1. Import the following modules: webdriver from selenium, ChromeDriverManager from webdriver_manager.chrome, pandas, time and os.
  2. Use the driver.get() method and pass the link you want to get information from.
  3. Find the listing containers by their class name, ‘store-details’ (driver.find_elements_by_class_name() in Selenium 3; driver.find_elements(By.CLASS_NAME, ...) in Selenium 4).
  4. Instantiate empty lists to store the values.
  5. Iterate over storeDetails and fetch the individual details that are required.
  6. Create a user-defined function strings_to_num() to convert the codes extracted from the class names into digits and symbols.
  7. Display the details and save them as a CSV file according to the requirements.
# code
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from webdriver_manager.chrome import ChromeDriverManager
import pandas as pd
import time
import os

# Selenium 4 takes the driver path through a Service object;
# ChromeDriverManager downloads a matching ChromeDriver and returns its path
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))

# driver.get() navigates to the page at the given URL
driver.get("https://www.justdial.com/Delhi/Ceiling-Tile-Dealers-Armstrong/nct-11271379")


def strings_to_num(argument):
    # Map a code taken from an element's class name to the character it stands for
    switcher = {
        'dc': '+',
        'fe': '(',
        'hg': ')',
        'ba': '-',
        'acb': '0',
        'yz': '1',
        'wx': '2',
        'vu': '3',
        'ts': '4',
        'rq': '5',
        'po': '6',
        'nm': '7',
        'lk': '8',
        'ji': '9'
    }
    return switcher.get(argument, "nothing")

# Selenium 4 removed find_elements_by_class_name(); use find_elements(By.CLASS_NAME, ...)
storeDetails = driver.find_elements(By.CLASS_NAME, 'store-details')

nameList = []
addressList = []
numbersList = []

for store in storeDetails:

    name = store.find_element(By.CLASS_NAME, 'lng_cont_name').text
    address = store.find_element(By.CLASS_NAME, 'cont_sw_addr').text
    contactList = store.find_elements(By.CLASS_NAME, 'mobilesv')

    myList = []

    for contact in contactList:
        # The part of the class attribute after the first hyphen is the code for one character
        myString = contact.get_attribute('class').split("-")[1]
        myList.append(strings_to_num(myString))

    nameList.append(name)
    addressList.append(address)
    numbersList.append("".join(myList))

# Initialise data from the lists
data = {'Company Name': nameList,
        'Address': addressList,
        'Phone': numbersList}

# Create DataFrame
df = pd.DataFrame(data)
print(df)

# Save data as .csv (appends to demo1.csv without a header row)
df.to_csv('demo1.csv', mode='a', header=False)
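
To see the phone-number decoding on its own, without opening a browser, here is a small sketch. It reuses the code table from strings_to_num() above; the function name decode_phone and the class strings in sample (including the 'icon-' prefix) are made up for illustration, so the real attribute values on the page may look different. Unknown codes are marked with '?' in this sketch.

```python
# Code table copied from strings_to_num() in the script above.
CODE_TO_CHAR = {
    'dc': '+', 'fe': '(', 'hg': ')', 'ba': '-',
    'acb': '0', 'yz': '1', 'wx': '2', 'vu': '3', 'ts': '4',
    'rq': '5', 'po': '6', 'nm': '7', 'lk': '8', 'ji': '9',
}


def decode_phone(class_values):
    """Turn a sequence of class attribute strings into a phone number string."""
    chars = []
    for value in class_values:
        # Same slicing as the script: keep what sits between the first
        # hyphen and the next one (or the end of the string).
        parts = value.split('-')
        code = parts[1] if len(parts) > 1 else ''
        chars.append(CODE_TO_CHAR.get(code, '?'))  # '?' marks an unknown code
    return ''.join(chars)


# Invented class strings, for illustration only.
sample = ['mobilesv icon-dc', 'mobilesv icon-ji', 'mobilesv icon-yz',
          'mobilesv icon-acb', 'mobilesv icon-lk']
print(decode_phone(sample))  # +9108
```

Each element contributes one character, which is why the script joins the decoded pieces into a single string per listing.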
  • No need to download ChromeDriver manually; webdriver_manager fetches it for you.
  • Feel free to modify the code. For example, try to:
    • Extract the rating and other details.
  • Code link: https://github.com/alokm014/WhatsApp-Automation-Selenium
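
As one example of a modification: the script appends rows with header=False, so the CSV never gets column names. Below is a rough sketch of one way to write the header only once. It uses the standard-library csv module instead of pandas so it runs without extra packages, and the helper name append_rows is my own.

```python
import csv
import os


def append_rows(path, header, rows):
    """Append rows to a CSV file, writing the header only if the file is new or empty."""
    needs_header = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, 'a', newline='', encoding='utf-8') as handle:
        writer = csv.writer(handle)
        if needs_header:
            writer.writerow(header)
        writer.writerows(rows)
```

After the scraping loop you could call append_rows('listings.csv', ['Company Name', 'Address', 'Phone'], zip(nameList, addressList, numbersList)), where listings.csv is just a placeholder file name. Use a fresh file for this, since demo1.csv written by pandas also contains an index column.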

Comment below about your experience! There is no limit to scraping and the above is just an example to get you guys started. So happy coding!
