Dynamic Puppeteer Web Scraping for AliExpress Without API Key
Scrape product data from AliExpress based on a given keyword using Puppeteer.
- Node.js (version 22.13.1)
- npm (version 10.9.2)
Install both from the official Node.js website (npm ships with Node.js).
- Clone the repository:
    git clone https://github.com/p3nnatr4tion/aliexpress-puppeteer.git
    cd aliexpress-puppeteer
- Install dependencies:
    npm install
- Install Puppeteer version 24.8.0:
    npm install puppeteer@24.8.0
- Configure the scraper in scraper-starter.js. Modify keyword and maxPage to your needs:
    const keyword = "laptop"; // Search keyword
    const maxPage = 1; // Max number of pages to scrape
- Run the scraper:
    node scraper-starter.js
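To illustrate how the two settings might interact, here is a minimal sketch that turns a keyword and a page limit into one search URL per page. The function name `buildSearchUrls` and the URL pattern (`/wholesale` with `SearchText` and `page` query parameters) are assumptions made for illustration; the repository may build its URLs differently.

```javascript
// Hypothetical sketch: the URL pattern below is an assumption,
// not taken from scraper-starter.js.
const keyword = "laptop"; // Search keyword
const maxPage = 2;        // Max number of pages to scrape

// Build one search URL per page, from page 1 up to and including maxPage.
function buildSearchUrls(keyword, maxPage) {
  const urls = [];
  for (let page = 1; page <= maxPage; page++) {
    const url = new URL("https://www.aliexpress.com/wholesale");
    // searchParams handles encoding of spaces and special characters.
    url.searchParams.set("SearchText", keyword);
    url.searchParams.set("page", String(page));
    urls.push(url.toString());
  }
  return urls;
}

console.log(buildSearchUrls(keyword, maxPage));
```

A larger maxPage means more requests per run, so it is worth keeping it small while testing.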
- Stealth Plugin: Uses a stealth plugin for Puppeteer to reduce the chance of bot detection.
- Tab Pooling: Limits concurrent tabs to avoid overload, improving stability.
- Retry Mechanism: Retries failed operations up to 3 times.
- CAPTCHA Handling: Automatically solves CAPTCHA challenges.
- Random Delays: Introduces random delays between requests to mimic human behavior.
- Comprehensive Data: Collects detailed product info such as title, price, specifications, images, reviews, shipping, and more.
- Efficient Page Navigation: Handles scrolling and pagination to collect data from multiple pages.
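The tab pooling, retry, and random delay features can be sketched with three small helpers. This is a minimal illustration, not the repository's implementation: the names `randomDelay`, `withRetry`, and `runPool`, the delay bounds, and the reading of "up to 3 times" as three attempts in total are all assumptions. In a real scraper the worker would open a Puppeteer tab; here it is any async function.

```javascript
// Hypothetical helpers; names and defaults are assumptions,
// not the repository's code.

// Resolve after a random delay between minMs and maxMs (inclusive).
function randomDelay(minMs, maxMs) {
  const ms = minMs + Math.floor(Math.random() * (maxMs - minMs + 1));
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Run an async operation, making at most maxAttempts attempts in total.
// Waits a random delay between attempts and rethrows the last error.
async function withRetry(operation, maxAttempts = 3, minMs = 500, maxMs = 1500) {
  let lastError;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await operation(attempt);
    } catch (err) {
      lastError = err;
      if (attempt < maxAttempts) await randomDelay(minMs, maxMs);
    }
  }
  throw lastError;
}

// Process items with at most `limit` workers in flight at once,
// similar to a fixed pool of browser tabs. Results keep input order.
async function runPool(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0;
  async function lane() {
    // Each lane pulls the next unclaimed index until none remain.
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index], index);
    }
  }
  const lanes = Array.from({ length: Math.min(limit, items.length) }, lane);
  await Promise.all(lanes);
  return results;
}
```

For example, `runPool(urls, 3, (url) => withRetry(() => scrapeProduct(url)))` would scrape with at most three concurrent tabs and three attempts per URL, where `scrapeProduct` stands in for whatever function loads and parses a product page.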
- Puppeteer: Web scraping library (version 24.8.0)
- Node.js and npm
- Make sure you comply with the AliExpress Terms of Service.
- Be mindful of rate limiting to avoid IP blocks.
MIT License