Watch the scraper in action:
[Demo animation: automated scraping process]
An automated Python script that extracts delinquent tax information from Lancaster County, PA's public parcel viewer system.
The scraper reads delinquent tax data from the Lancaster County Property Tax portal, using URLs of the form:
https://lancasterpa.devnetwedge.com/parcel/view/{parcel_number}/{tax_year}
Example URL:
https://lancasterpa.devnetwedge.com/parcel/view/5408465600000/2025
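The URL pattern above can be filled in programmatically. The following is a minimal sketch; `BASE_URL` and `build_parcel_url` are illustrative names and are not necessarily what `property_scraper.py` uses.

```python
# Hypothetical helper: fills the {parcel_number}/{tax_year} URL template.
BASE_URL = "https://lancasterpa.devnetwedge.com/parcel/view"


def build_parcel_url(parcel_number: str, tax_year: int) -> str:
    """Return the parcel viewer URL for one parcel and one tax year."""
    return f"{BASE_URL}/{parcel_number}/{tax_year}"


print(build_parcel_url("5408465600000", 2025))
```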
[Screenshot: the Lancaster County Parcel Viewer interface from which the data is extracted]
The script generates a structured CSV file containing delinquent tax information:
parcel_number,address,owner,scrape_date,tax_year,amount_due,amount_paid,total_due
5408465600000,123 MAIN ST LANCASTER PA,JOHN DOE,2024-03-20,2023,1500.00,0.00,1500.00
1200794700000,456 ELM ST LANCASTER PA,JANE SMITH,2024-03-20,2022,2000.00,500.00,1500.00
lancaster-property-tax-scraper/
├── src/
│ └── property_scraper.py # Main scraper implementation
├── output/
│ └── delinquent_taxes.csv # Generated output file
├── img/ # Documentation images
├── requirements.txt # Python dependencies
└── README.md # Documentation
graph LR
A["Input Parcel List"] --> B["Initialize Scraper"]
B --> C["Process Each Parcel"]
C --> D["Check for<br/>Delinquent Taxes"]
D --> E{"Has Delinquent<br/>Taxes?"}
E -->|"Yes"| F["Extract Data"]
E -->|"No"| G["Skip Parcel"]
F --> H["Add to Results"]
G --> C
H --> C
C --> I["Export to CSV"]
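The flow in the diagram above (process each parcel, skip those without delinquent taxes, collect the rest) could be sketched roughly as follows. `process_parcels` and `fake_fetch` are hypothetical names, and the fetch function is a stand-in with made-up data rather than the real page scrape.

```python
from typing import Callable, Dict, List, Optional

Record = Dict[str, str]


def process_parcels(
    parcel_numbers: List[str],
    fetch_delinquent: Callable[[str], Optional[List[Record]]],
) -> List[Record]:
    """Collect records only for parcels that report delinquent taxes."""
    results: List[Record] = []
    for parcel in parcel_numbers:
        records = fetch_delinquent(parcel)  # hypothetical page lookup
        if not records:
            continue  # no delinquent taxes: skip this parcel
        results.extend(records)
    return results


def fake_fetch(parcel: str) -> Optional[List[Record]]:
    """Stand-in for the real scrape, backed by made-up sample data."""
    sample = {
        "5408465600000": [{"parcel_number": "5408465600000", "tax_year": "2023"}],
    }
    return sample.get(parcel)


rows = process_parcels(["5408465600000", "1200794700000"], fake_fetch)
print(rows)
```

Passing the fetch function as a parameter keeps the loop testable without a browser or network access.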
graph TD
A["Parcel Page"] --> B["Basic Info"]
A --> C["Tax Info"]
B --> D["Parcel Number"]
B --> E["Property Address"]
B --> F["Owner Details"]
C --> G["Tax Year<br/>2022-2024"]
C --> H["Amount Due"]
C --> I["Amount Paid"]
C --> J["Total Due"]
G & H & I & J --> K["CSV Record"]
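One way to represent the CSV record that the diagram above converges on is a small dataclass. This is only a sketch: `TaxRecord` is an assumed name, the field names follow the CSV header shown earlier, and the amounts are kept as strings because the README does not say how the script parses them.

```python
from dataclasses import asdict, dataclass


@dataclass
class TaxRecord:
    """One delinquent-tax row; field order mirrors the CSV header."""
    parcel_number: str
    address: str
    owner: str
    scrape_date: str
    tax_year: int
    amount_due: str
    amount_paid: str
    total_due: str


record = TaxRecord(
    parcel_number="5408465600000",
    address="123 MAIN ST LANCASTER PA",
    owner="JOHN DOE",
    scrape_date="2024-03-20",
    tax_year=2023,
    amount_due="1500.00",
    amount_paid="0.00",
    total_due="1500.00",
)
row = asdict(record)  # plain dict, keys in field order
print(row)
```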
sequenceDiagram
participant S as Scraper
participant W as Web Server
participant D as Database
S->>W: Request Parcel Page
Note over S,W: 2-5 second delay
W->>S: Return Page
S->>S: Extract Data
alt Success
S->>D: Store Results
else Network Timeout
S->>S: Retry Request
else No Data Found
S->>S: Log & Skip
end
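The three branches of the sequence diagram above (success, network timeout, no data found) might be handled along these lines. `fetch_with_retry` and its parameters are assumptions for illustration. The sketch catches Python's built-in `TimeoutError`; Playwright raises its own timeout exception type, so a real implementation would likely need to catch that instead.

```python
import time
from typing import Callable, Optional


def fetch_with_retry(
    fetch: Callable[[], Optional[dict]],
    max_attempts: int = 3,
    wait_seconds: float = 2.0,
) -> Optional[dict]:
    """Retry on timeout, skip when no data is found, return data on success."""
    for attempt in range(1, max_attempts + 1):
        try:
            data = fetch()
        except TimeoutError:
            if attempt == max_attempts:
                return None  # give up after the last attempt
            time.sleep(wait_seconds)  # pause before retrying
            continue
        if data is None:
            print("No data found, skipping")  # the "Log & Skip" branch
            return None
        return data
    return None


calls = {"count": 0}


def flaky_fetch() -> Optional[dict]:
    """Simulated fetch: times out once, then succeeds."""
    calls["count"] += 1
    if calls["count"] == 1:
        raise TimeoutError("simulated network timeout")
    return {"parcel_number": "5408465600000"}


result = fetch_with_retry(flaky_fetch, wait_seconds=0)
print(result)
```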
- Automated scraping of delinquent tax data from Lancaster County's parcel viewer
- Handles multiple parcel numbers in batch
- Extracts data for tax years 2022-2024
- Collects property address and owner information
- Outputs results to CSV format
- Built-in rate limiting to prevent server overload
- Only captures parcels with actual delinquent taxes
For each parcel with delinquent taxes, the script collects:
- Parcel number
- Property address
- Owner information
- Tax year (2022-2024)
- Amount due
- Amount paid
- Total due
- Scrape date
- Python 3.7+
- Playwright
- Pandas
- Clone this repository:
git clone https://github.com/caesarw0/lancaster-property-tax-scraper.git
cd lancaster-property-tax-scraper
- Install required packages:
pip install -r requirements.txt
- Install Playwright browsers:
playwright install
- Prepare a list of parcel numbers in the script or import them from a file.
- Run the script:
python src/property_scraper.py
The script will:
- Process each parcel number
- Extract delinquent tax information if available
- Save results to output/delinquent_taxes.csv
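The usage steps mention importing parcel numbers from a file. The README does not specify a file format, so the sketch below assumes a plain text file with one parcel number per line; `load_parcel_numbers` and `parcels.txt` are hypothetical names.

```python
import tempfile
from pathlib import Path
from typing import List


def load_parcel_numbers(path: str) -> List[str]:
    """Read one parcel number per line, ignoring blank lines."""
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    return [line.strip() for line in lines if line.strip()]


# Demo with a temporary file so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    sample_file = Path(tmp) / "parcels.txt"
    sample_file.write_text("5408465600000\n\n1200794700000\n", encoding="utf-8")
    loaded = load_parcel_numbers(str(sample_file))

print(loaded)
```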
from property_scraper import scrape_multiple_parcels
parcel_numbers = [
"5408465600000",
"1200794700000",
]
df = scrape_multiple_parcels(parcel_numbers)
The script includes built-in delays between requests (2-5 seconds) to avoid overwhelming the server. This helps ensure:
- Ethical scraping practices
- Reduced likelihood of IP blocking
- Server resource conservation
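A randomized 2-5 second pause between requests can be sketched as below. `polite_delay` is an illustrative name, and the injectable `sleep` argument is an assumption added so that the example can run without actually waiting.

```python
import random
import time
from typing import Callable


def polite_delay(
    min_seconds: float = 2.0,
    max_seconds: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> float:
    """Sleep for a random duration in [min_seconds, max_seconds] and return it."""
    delay = random.uniform(min_seconds, max_seconds)
    sleep(delay)
    return delay


# Replace the real sleep with a no-op for this demo.
waited = polite_delay(sleep=lambda seconds: None)
print(f"Would have waited {waited:.2f} seconds")
```

Randomizing the interval, rather than using a fixed pause, spreads requests out less predictably while keeping the average rate low.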
The script generates a CSV file with the following columns:
- parcel_number
- address
- owner
- scrape_date
- tax_year
- amount_due
- amount_paid
- total_due
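To illustrate the column layout above, here is a minimal sketch that writes records in that order using only the standard library. The project lists pandas as a dependency and may well write the file with it instead; `COLUMNS` and `records_to_csv` are hypothetical names, and the sample row is the illustrative one from the output example.

```python
import csv
import io
from typing import Dict, List

COLUMNS = [
    "parcel_number", "address", "owner", "scrape_date",
    "tax_year", "amount_due", "amount_paid", "total_due",
]


def records_to_csv(records: List[Dict[str, str]]) -> str:
    """Serialize records to CSV text with a fixed column order."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


sample = [{
    "parcel_number": "5408465600000",
    "address": "123 MAIN ST LANCASTER PA",
    "owner": "JOHN DOE",
    "scrape_date": "2024-03-20",
    "tax_year": "2023",
    "amount_due": "1500.00",
    "amount_paid": "0.00",
    "total_due": "1500.00",
}]
csv_text = records_to_csv(sample)
print(csv_text)
```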
The script includes robust error handling for:
- Network timeouts
- Missing data
- Invalid parcel numbers
- Server errors
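As one example of catching invalid parcel numbers before any request is made, the sketch below checks the input format. It assumes parcel numbers are 13-digit strings, which is inferred only from the two example numbers in this README and may not hold for every parcel; `is_plausible_parcel_number` is a hypothetical helper.

```python
import re

# Assumption: 13 ASCII digits, based on the README's two example parcel numbers.
PARCEL_PATTERN = re.compile(r"[0-9]{13}")


def is_plausible_parcel_number(value: str) -> bool:
    """Return True if the value looks like a 13-digit parcel number."""
    return PARCEL_PATTERN.fullmatch(value) is not None


for candidate in ["5408465600000", "54084-656", ""]:
    print(repr(candidate), is_plausible_parcel_number(candidate))
```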
This tool is designed for legitimate data collection from publicly available information. Users should:
- Review and comply with Lancaster County's terms of service
- Use reasonable request rates
- Respect the public resource
Contributions are welcome! Please feel free to submit a Pull Request.

