- Scrape a chosen number of profiles from the results of a LinkedIn searchUrl.
- Export the content of the profiles to Excel and JSON files.
- Use the package manager pip to install Scrapy.
(Anaconda recommended)
cd LinkedinScraperProject
pip install -r requirements.txt
- clone the project
git clone https://github.com/khaleddallah/GoogleImageScrapyDownloader.git
- get into the directory of the project:
cd LinkedinScraperProject
- to get help:
python LinkedinScraper -h
usage:
python LinkedinScraper [-h] [-n NUM] [-o OUTPUT] [-p] [-f format] [-m excelMode] (searchUrl or profilesUrl)
positional arguments:
searchUrl LinkedIn search URL, or one or more profile URLs
optional arguments:
-h, --help show this help message and exit
-n NUM number of profiles to parse
** the number must be less than or equal to the number of search results
'page' parses only the profiles on the given results page (10 profiles) (default)
-o OUTPUT Output file
-p Enable Parse Profiles
-f FORMAT json: JSON output file
excel: Excel output file
all: both JSON and Excel output files
-m EXCELMODE 1: each profile takes a single row in the Excel file
m: each profile is spread over multiple rows in the Excel file
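The options above can be mirrored with `argparse`. This is a minimal sketch for illustration, not the project's actual parser: the `dest` names and the `build_parser` helper are hypothetical, the `all` default for `-f` is inferred from the first example below (which produces both ABC.xlsx and ABC.json), and the default for `-m` is not documented here, so it is left unset.

```python
import argparse


def build_parser():
    """Build a parser that mirrors the documented options (illustrative only)."""
    parser = argparse.ArgumentParser(prog="LinkedinScraper")
    # -n accepts a number or the literal 'page' (documented default).
    parser.add_argument("-n", dest="num", default="page",
                        help="number of profiles, or 'page' for one results page")
    parser.add_argument("-o", dest="output", default=None,
                        help="output file name")
    parser.add_argument("-p", dest="parse_profiles", action="store_true",
                        help="treat the URLs as individual profile URLs")
    # Default 'all' is an assumption inferred from the README examples.
    parser.add_argument("-f", dest="format", choices=["json", "excel", "all"],
                        default="all", help="output format")
    # The default Excel mode is not documented, so None is used here.
    parser.add_argument("-m", dest="excel_mode", choices=["1", "m"],
                        default=None, help="Excel layout: one row or multi row")
    parser.add_argument("urls", nargs="+", metavar="searchUrl",
                        help="search URL or profile URLs")
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```

Keeping `-n` as a string lets the same option carry either a count or the keyword `page`; the caller would convert it to an integer only when it is numeric.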
- Parse the profiles https://www.linkedin.com/in/khaled-dallah/ and https://www.linkedin.com/in/linustorvalds/ and export their content to ABC.xlsx and ABC.json
(-p is needed because the URLs point to individual profiles)
python LinkedinScraper -p -o 'ABC' 'https://www.linkedin.com/in/khaled-dallah/' 'https://www.linkedin.com/in/linustorvalds/'
- Parse 23 profiles from the searchUrl https://www.linkedin.com/.../?keywords=Robotic&...&
If you don't set an output name with (-o), the result files are named after the value of keywords (Robotic)
python LinkedinScraper -n 23 'https://www.linkedin.com/search/results/all/?keywords=Robotic&origin=GLOBAL_SEARCH_HEADER'
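The fallback naming described above can be sketched with the standard library. This is only an illustration of the idea, not the project's code: the `default_output_name` helper and its `fallback` value are hypothetical.

```python
from urllib.parse import urlparse, parse_qs


def default_output_name(search_url, fallback="output"):
    """Return the value of the 'keywords' query parameter, or a fallback name."""
    query = parse_qs(urlparse(search_url).query)
    values = query.get("keywords")
    return values[0] if values else fallback


url = ("https://www.linkedin.com/search/results/all/"
       "?keywords=Robotic&origin=GLOBAL_SEARCH_HEADER")
name = default_output_name(url)
print(name)  # prints: Robotic
```

A URL without a `keywords` parameter, such as a profile URL, falls through to the hypothetical `fallback` name.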
- Parse 17 profiles from the searchUrl https://www.linkedin.com/.../?keywords=Robotic&...&
and write the output as an Excel file, with the information of each profile in one row
python LinkedinScraper -n 17 -f excel -m 1 'https://www.linkedin.com/search/results/all/?keywords=Robotic&origin=GLOBAL_SEARCH_HEADER'
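The difference between the two Excel modes can be pictured with plain Python lists standing in for spreadsheet rows. Everything in this sketch is an assumption made for illustration: the profile fields (`name`, `skills`), the `; ` separator, and both helper functions are hypothetical, and the layout the scraper actually writes may differ.

```python
def rows_single(profile):
    """Mode '1': one row per profile; list fields are joined into one cell."""
    return [[
        "; ".join(value) if isinstance(value, list) else value
        for value in profile.values()
    ]]


def rows_multi(profile):
    """Mode 'm': one row per list entry; scalar fields fill only the first row."""
    list_lengths = [len(v) for v in profile.values() if isinstance(v, list)]
    depth = max(list_lengths + [1])
    rows = []
    for i in range(depth):
        row = []
        for value in profile.values():
            if isinstance(value, list):
                row.append(value[i] if i < len(value) else "")
            else:
                row.append(value if i == 0 else "")
        rows.append(row)
    return rows


# Hypothetical profile data, not real scraper output.
profile = {"name": "Jane Doe", "skills": ["Python", "C++"]}
print(rows_single(profile))  # [['Jane Doe', 'Python; C++']]
print(rows_multi(profile))   # [['Jane Doe', 'Python'], ['', 'C++']]
```

Rows built this way could be written to a worksheet with openpyxl's `Worksheet.append`, which accepts one list per row.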
- Python 3.7
- Scrapy
- openpyxl
- Khaled Dallah - Software Engineer | Python/C++ Developer
khaled.dallah0@gmail.com
Report bugs and feature requests here.
Contributions are always welcome!
This project is licensed under the LGPL-3.0 License. See the LICENSE.md file for details.
