Web Unlocker API

Web Unlocker is a scraping API that gives you access to any website while bypassing sophisticated bot-protection mechanisms. You can retrieve clean HTML/JSON responses with a single API call, without having to manage a complex anti-bot infrastructure.

API endpoint: https://api.brightdata.com/request
Authorization header: your API token from the Web Unlocker API zone
Payload:
- zone: the name of your Web Unlocker API zone
- url: the target URL to access
- format: the response format (use raw to get the website's response directly)

Example: Python Script

import requests

API_URL = "https://api.brightdata.com/request"
API_TOKEN = "INSERT_YOUR_API_TOKEN"
ZONE_NAME = "INSERT_YOUR_WEB_UNLOCKER_ZONE_NAME"
TARGET_URL = "http://lumtest.com/myip.json"

headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {API_TOKEN}"
}

payload = {
    "zone": ZONE_NAME,
    "url": TARGET_URL,
    "format": "raw"
}

response = requests.post(API_URL, headers=headers, json=payload)

if response.status_code == 200:
    print("Success:", response.text)
else:
    print(f"Error {response.status_code}: {response.text}")

Native Proxy-based Access

An alternative method that uses proxy-based routing.

Example: cURL Command

curl "http://lumtest.com/myip.json" \
--proxy "brd.superproxy.io:33335" \
--proxy-user "brd-customer-<CUSTOMER_ID>-zone-<ZONE_NAME>:<ZONE_PASSWORD>"

Required credentials:

Customer ID: found in Account settings
Web Unlocker API zone name: found in the Overview tab
Web Unlocker API password: found in the Overview tab

Example: Python Script

import requests

customer_id = "<customer_id>"
zone_name = "<zone_name>"
zone_password = "<zone_password>"

host = "brd.superproxy.io"
port = 33335
proxy_url = f"http://brd-customer-{customer_id}-zone-{zone_name}:{zone_password}@{host}:{port}"

proxies = {"http": proxy_url, "https": proxy_url}

response = requests.get("http://lumtest.com/myip.json", proxies=proxies)

if response.status_code == 200:
    print(response.json())
else:
    print(f"Error: {response.status_code}")
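The proxy URL above embeds the credentials directly, which can break if the zone password contains characters such as `@`, `:` or `/`. A minimal sketch of a helper that percent-encodes the credentials first is shown below. The function names `build_proxy_url` and `build_proxies` are hypothetical and not part of any Bright Data SDK; the host, port, and username pattern are taken from the example above.

```python
from urllib.parse import quote


def build_proxy_url(customer_id, zone_name, zone_password,
                    host="brd.superproxy.io", port=33335):
    """Return a proxy URL with percent-encoded credentials (hypothetical helper)."""
    username = f"brd-customer-{customer_id}-zone-{zone_name}"
    # safe="" also encodes "/" so that no reserved character leaks into the URL
    user = quote(username, safe="")
    password = quote(zone_password, safe="")
    return f"http://{user}:{password}@{host}:{port}"


def build_proxies(customer_id, zone_name, zone_password):
    """Return the proxies mapping that requests expects (hypothetical helper)."""
    proxy_url = build_proxy_url(customer_id, zone_name, zone_password)
    return {"http": proxy_url, "https": proxy_url}
```

The returned mapping can be passed as `proxies=build_proxies(...)` to `requests.get`, in place of the hand-built `proxies` dictionary.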

Practical Example: Scraping G2 Reviews

Let's look at how to scrape reviews from G2.com, a website that is heavily protected by Cloudflare.

Basic Request (Without Web Unlocker)

A simple Python script for scraping G2 reviews:

import requests
from bs4 import BeautifulSoup

url = 'https://www.g2.com/products/mongodb/reviews'
response = requests.get(url)

if response.status_code == 200:
    soup = BeautifulSoup(response.text, "lxml")
    headings = soup.find_all('h2')
    
    if headings:
        print("\nHeadings Found:")
        for heading in headings:
            print(f"- {heading.get_text(strip=True)}")
    else:
        print("No headings found")
else:
    print(f"Request blocked (status {response.status_code})")

Result: The script fails with a 403 error because of Cloudflare's anti-bot measures.

Enhanced Request (With Web Unlocker)

To get past such restrictions, use Web Unlocker. A Python implementation follows:

Direct API Access

import requests
from bs4 import BeautifulSoup

API_URL = "https://api.brightdata.com/request"
API_TOKEN = "INSERT_YOUR_API_TOKEN"
ZONE_NAME = "INSERT_YOUR_ZONE"
TARGET_URL = "https://www.g2.com/products/mongodb/reviews"

headers = {
    "Content-Type": "application/json",
    "Authorization": f"Bearer {API_TOKEN}"
}
payload = {"zone": ZONE_NAME, "url": TARGET_URL, "format": "raw"}

response = requests.post(API_URL, headers=headers, json=payload)

if response.status_code == 200:
    soup = BeautifulSoup(response.text, "lxml")
    headings = [h.get_text(strip=True) for h in soup.find_all('h2')]
    print("\nExtracted Headings:", headings)
else:
    print(f"Error {response.status_code}: {response.text}")

Result: The request bypasses the protection and retrieves the content with status 200.

Proxy-Based Access

Alternatively, you can use the proxy-based method:

import requests
from bs4 import BeautifulSoup

proxy_url = "http://brd-customer-<customer_id>-zone-<zone_name>:<zone_password>@brd.superproxy.io:33335"
proxies = {"http": proxy_url, "https": proxy_url}

url = "https://www.g2.com/products/mongodb/reviews"
response = requests.get(url, proxies=proxies, verify=False)

if response.status_code == 200:
    soup = BeautifulSoup(response.text, "lxml")
    headings = [h.get_text(strip=True) for h in soup.find_all('h2')]
    print("\nExtracted Headings:", headings)
else:
    print(f"Error {response.status_code}: {response.text}")

Note: Because the request uses verify=False, requests emits SSL certificate warnings. Suppress them by adding the following:

import urllib3

# Silence only the warning triggered by verify=False
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)

Waiting for Specific Elements

Use the x-unblock-expect header to wait for specific elements or text:

headers["x-unblock-expect"] = '{"element": ".star-wrapper__desc"}'
# or
headers["x-unblock-expect"] = '{"text": "reviews"}'

👉 You can find the full code in g2_wait.py
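Writing the header value as a JSON string by hand is easy to get wrong once selectors contain quotes. A minimal sketch that builds it with `json.dumps` follows. The helper name `expect_header` is hypothetical, and the rule that exactly one of `element` or `text` is given is an assumption based on the two variants shown above.

```python
import json


def expect_header(element=None, text=None):
    """Build the x-unblock-expect header from a CSS selector or a text snippet.

    Hypothetical helper; assumes exactly one of the two conditions is used.
    """
    if (element is None) == (text is None):
        raise ValueError("pass exactly one of 'element' or 'text'")
    key, value = ("element", element) if element is not None else ("text", text)
    # json.dumps takes care of quoting and escaping inside the value
    return {"x-unblock-expect": json.dumps({key: value})}
```

The result can be merged into an existing headers dictionary, for example `headers.update(expect_header(element=".star-wrapper__desc"))`.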

Mobile User-Agent Targeting

To use mobile user agents instead of desktop user agents, append -ua-mobile to your username:

username = f"brd-customer-{customer_id}-zone-{zone_name}-ua-mobile"

👉 You can find the full code in g2_mobile.py

Geolocation Targeting

Web Unlocker selects optimal IP locations automatically, but you can also specify target locations:

# Country-level targeting
username = f"brd-customer-{customer_id}-zone-{zone_name}-country-us"
# City-level targeting (country plus city)
username = f"brd-customer-{customer_id}-zone-{zone_name}-country-us-city-sanfrancisco"

👉 You can learn more here.

Debugging Requests

Enable detailed debugging information by adding the -debug-full flag:

username = f"brd-customer-{customer_id}-zone-{zone_name}-debug-full"

👉 You can find the full code in g2_debug.py
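The username suffixes shown in the last three sections (-ua-mobile, -country-…, -city-…, -debug-full) all follow the same pattern, so they can be assembled by one small function. The sketch below is illustrative only: `build_username` is a hypothetical helper, and the assumptions that several flags may be combined in one username, and in this order, are not confirmed by the examples above, which show each flag on its own.

```python
def build_username(customer_id, zone_name, *, country=None, city=None,
                   mobile=False, debug=False):
    """Assemble a proxy username with optional targeting flags (hypothetical helper)."""
    parts = [f"brd-customer-{customer_id}-zone-{zone_name}"]
    if city and not country:
        # The city example above always appears together with a country
        raise ValueError("city targeting requires a country")
    if country:
        parts.append(f"country-{country}")
    if city:
        parts.append(f"city-{city}")
    if mobile:
        parts.append("ua-mobile")
    if debug:
        parts.append("debug-full")
    # Assumption: flags are simply joined with hyphens in this order
    return "-".join(parts)
```

For example, `build_username(customer_id, zone_name, country="us", city="sanfrancisco")` reproduces the city-level username from the Geolocation Targeting section.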

Success Rate Statistics

Monitor API success rates for specific domains:

import requests

API_TOKEN = "INSERT_YOUR_API_TOKEN"

def get_success_rate(domain):
    url = f"https://api.brightdata.com/unblocker/success_rate/{domain}"
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {API_TOKEN}"
    }
    response = requests.get(url, headers=headers)
    print(response.json() if response.status_code == 200 else response.text)

get_success_rate("g2.com")  # Statistics for one specific domain
get_success_rate("g2.*")    # Statistics across all top-level domains of g2

Final Notes

Web Unlocker lets you scrape even the most heavily protected websites. Key points to keep in mind:

Not Compatible With:
- Browsers (Chrome, Firefox, Edge)
- Anti-detect browsers (Adspower, Multilogin)
- Automation tools (Puppeteer, Playwright, Selenium)
Use Scraping Browser:
For browser-based automation, use Bright Data's Scraping Browser.
Premium Domains:
Access challenging websites with the premium domain features.
CAPTCHA Solving:
CAPTCHAs are solved automatically, but this can be disabled. Learn more about Bright Data's CAPTCHA Solver.
Custom Headers & Cookies:
Send your own headers and cookies to target specific versions of a website. Learn more.

Visit the official documentation for more details.
