Updated: Scraping LinkedIn Profiles Using Python (tested April 2021)
This method was last tested in April 2021. This article updates an earlier one (https://medium.com/@gerrysabar/scraping-linkedin-profile-using-python-selenium-88cb64888cf), whose approach no longer works today.
The workflow for the scraping job is as follows:
Alright, now let’s create a directory for our working space:
$ mkdir linkedin
Go to the newly created directory, then create a Python virtual environment:
$ python3 -m venv venv
Activate the virtual environment:
$ source venv/bin/activate
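To double-check that the activation worked, you can ask Python itself. This is a small sketch, not part of the scraper; the function name `in_virtualenv` is made up for this illustration, and it assumes the environment was created with `python3 -m venv` as above.

```python
import sys


def in_virtualenv():
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


print("virtual environment active:", in_virtualenv())
```

If this prints `False`, the `pip install` commands that follow would install into the system Python instead of `venv`.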
We need to install the required libraries in our Python virtual environment:
$ pip install selenium
$ pip install requests
$ pip install beautifulsoup4
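A quick way to confirm the three installs succeeded is to check that each package can be found by the interpreter. The sketch below is only a convenience check; `REQUIRED` and `missing_packages` are names invented here. Note that `beautifulsoup4` is imported under the module name `bs4`.

```python
from importlib.util import find_spec

# pip package name -> module name used in `import` statements
REQUIRED = {
    "selenium": "selenium",
    "requests": "requests",
    "beautifulsoup4": "bs4",
}


def missing_packages(required=REQUIRED):
    """Return the pip names of packages whose module cannot be located."""
    return [
        pip_name
        for pip_name, module_name in required.items()
        if find_spec(module_name) is None
    ]


missing = missing_packages()
if missing:
    print("Still missing, install with pip:", ", ".join(missing))
else:
    print("All required libraries are installed.")
```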
You also need to install chromedriver, which Selenium uses to drive the Chrome web browser and mimic a real user browsing the web.
Installing Chromedriver for Ubuntu:
1. Install prerequisites:
$ sudo apt-get update
$ sudo apt-get install -y unzip xvfb libxi6 libgconf-2-4
$ sudo apt-get install default-jdk
2. Install Google Chrome
$ curl -sS -o - https://dl-ssl.google.com/linux/linux_signing_key.pub | sudo apt-key add -
$ echo "deb [arch=amd64] http://dl.google.com/linux/chrome/deb/ stable main" | sudo tee -a /etc/apt/sources.list.d/google-chrome.list
$ sudo apt-get -y update
$ sudo apt-get -y install google-chrome-stable
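3. Install Chromedriver (use a release that is compatible with your installed Chrome version; the command below fetches 2.41)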
$ wget https://chromedriver.storage.googleapis.com/2.41/chromedriver_linux64.zip
$ unzip chromedriver_linux64.zip
4. Move chromedriver to a standard location to keep things organized:
$ sudo mv chromedriver /usr/bin/chromedriver
$ sudo chmod +x /usr/bin/chromedriver
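Selenium fails to start the browser when chromedriver and Chrome are incompatible, so it can help to compare the two versions after installing. The sketch below assumes both `chromedriver` and `google-chrome` are on your PATH and accept `--version`; the helper names are made up for this illustration. Comparing major versions is only a rough check: it is meaningful for newer chromedriver releases that share Chrome's major version number, not for the older 2.x numbering.

```python
import re
import subprocess


def major_version(version_output):
    """Extract the first major version number from `--version` output."""
    match = re.search(r"(\d+)\.\d+", version_output)
    return int(match.group(1)) if match else None


def tool_version(command):
    """Run `<command> --version`; return its major version or None if unavailable."""
    try:
        result = subprocess.run(
            [command, "--version"], capture_output=True, text=True, check=True
        )
    except (OSError, subprocess.CalledProcessError):
        return None
    return major_version(result.stdout)


driver_major = tool_version("chromedriver")
chrome_major = tool_version("google-chrome")
print("chromedriver major version:", driver_major)
print("Chrome major version:", chrome_major)
if None not in (driver_major, chrome_major) and driver_major != chrome_major:
    print("Versions differ; check that this chromedriver supports your Chrome.")
```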
Note: if you’re using Windows or Mac, you can search online for how to install chromedriver on your system. Take note of where your chromedriver is located (in this article it’s /usr/bin/chromedriver), because we’ll need the path later in our Python app.
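If you are not sure where chromedriver ended up, Python can look it up for you. This is a minimal sketch under the assumption that the binary is either on your PATH or at the location used in this article; `find_chromedriver` is a hypothetical helper name.

```python
import os
import shutil


def find_chromedriver(default="/usr/bin/chromedriver"):
    """Return a usable chromedriver path, or None if none is found."""
    # First look on PATH, the same way the shell would.
    found = shutil.which("chromedriver")
    if found:
        return found
    # Fall back to the location used in this article, if it is executable.
    if os.path.isfile(default) and os.access(default, os.X_OK):
        return default
    return None


print("chromedriver path:", find_chromedriver())
```

On Windows the binary is named `chromedriver.exe`; `shutil.which` normally resolves that through `PATHEXT`, but you may prefer to pass the full path as `default`.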
Now for the interesting part: create a Python file named linkedin.py with the following code (don’t forget to replace the credentials with your own):
The script above lists about 10 LinkedIn profiles of Python developers based in San Francisco.
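For orientation, a script of this kind could be sketched roughly as follows. Treat it as an illustration under assumptions rather than the exact listing: it assumes Selenium 4, a LinkedIn login form whose inputs have the ids `username` and `password`, and that profile links are collected from a Google `site:linkedin.com/in/` search. The function names `build_search_url`, `filter_profile_urls`, and `list_profiles` are made up here, and any of the page details may have changed.

```python
from urllib.parse import quote_plus, urlsplit

CHROMEDRIVER_PATH = "/usr/bin/chromedriver"  # location used in this article


def build_search_url(role, city):
    """Build a Google search URL restricted to public LinkedIn profiles."""
    query = f'site:linkedin.com/in/ AND "{role}" AND "{city}"'
    return "https://www.google.com/search?q=" + quote_plus(query)


def filter_profile_urls(hrefs, limit=10):
    """Keep unique linkedin.com/in/... links in order, at most `limit` of them."""
    seen, profiles = set(), []
    for href in hrefs:
        if not href:
            continue
        parts = urlsplit(href)
        host = parts.netloc.lower()
        if not (host == "linkedin.com" or host.endswith(".linkedin.com")):
            continue
        if not parts.path.startswith("/in/"):
            continue
        # Drop query strings and trailing slashes so duplicates collapse.
        url = f"https://{host}{parts.path.rstrip('/')}"
        if url not in seen:
            seen.add(url)
            profiles.append(url)
    return profiles[:limit]


def list_profiles(email, password, role="Python developer", city="San Francisco"):
    """Log in to LinkedIn, search Google, and return matching profile URLs."""
    # Imported here so the helpers above can be used without Selenium.
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service
    from selenium.webdriver.common.by import By

    driver = webdriver.Chrome(service=Service(executable_path=CHROMEDRIVER_PATH))
    try:
        # Assumption: the login form uses these element ids and a submit button.
        driver.get("https://www.linkedin.com/login")
        driver.find_element(By.ID, "username").send_keys(email)
        driver.find_element(By.ID, "password").send_keys(password)
        driver.find_element(By.CSS_SELECTOR, "button[type='submit']").click()

        # Collect every link on the results page, then keep profile links only.
        driver.get(build_search_url(role, city))
        anchors = driver.find_elements(By.TAG_NAME, "a")
        hrefs = [anchor.get_attribute("href") for anchor in anchors]
        return filter_profile_urls(hrefs)
    finally:
        driver.quit()
```

Calling `list_profiles("you@example.com", "your-password")` would open Chrome, log in, and return up to 10 profile URLs. Google may show a consent page or CAPTCHA instead of results, in which case the returned list can be empty.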
Conclusion
This approach worked when it was last tested on April 11th, 2021. As you may know, the tricky part of scraping is that a website’s elements can change from day to day.