update script using uv and removing python 2 dependencies #27

Open
randoneering wants to merge 5 commits into Human-Centric-Machine-Learning:master from randoneering:main

@randoneering

Update to 2025 Data Dump, Removal of Python 2 Compatibility, and Introducing UV

This PR updates the project to work with the September 2025 StackExchange data dump and removes
legacy Python 2 compatibility code. It also introduces uv to manage Python dependencies and the project itself.

Changes

Archive URL Update:

Database Auto-Creation:

  • Added ensureDatabaseExists() function to automatically create the target database if it doesn't exist
  • Connects to the postgres maintenance database first to check for, and if needed create, the target database
  • Uses safe SQL identifier composition to prevent SQL injection
  • Eliminates manual createdb step for basic usage
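
For reviewers who want a picture of the check-then-create flow described above, here is a minimal sketch. It is not the code from this PR: the snake_case name `ensure_database_exists`, the `admin_conn` parameter, and the hand-rolled `quote_identifier` helper are all assumptions made for illustration. The helper stands in for a library facility such as psycopg2's `sql.Identifier`, which is the safer choice in real code. The sketch assumes a DB-API connection to the postgres database that is already in autocommit mode, since CREATE DATABASE cannot run inside a transaction block.

```python
def quote_identifier(name: str) -> str:
    """Quote a PostgreSQL identifier (illustrative stand-in for a library helper).

    Wraps the name in double quotes and doubles any embedded double quotes.
    """
    if not name or "\x00" in name:
        raise ValueError("invalid database name")
    return '"' + name.replace('"', '""') + '"'


def ensure_database_exists(admin_conn, dbname: str) -> bool:
    """Create dbname if it is missing; return True if it was created.

    admin_conn is assumed to be a DB-API connection to the 'postgres'
    maintenance database, already switched to autocommit.
    """
    cur = admin_conn.cursor()
    try:
        # The name is passed as a bound parameter here, so no quoting is needed.
        cur.execute("SELECT 1 FROM pg_database WHERE datname = %s", (dbname,))
        if cur.fetchone() is not None:
            return False
        # Identifiers cannot be bound as parameters, so quote the name explicitly.
        cur.execute("CREATE DATABASE " + quote_identifier(dbname))
        return True
    finally:
        cur.close()
```

The existence check uses a bound parameter, while the CREATE DATABASE statement needs the name composed as an identifier. That second step is where injection would otherwise be possible.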

Documentation Updates:

  • Updated README to reference September 2025 data dump
  • Added uv as recommended dependency manager
  • Updated all command examples to use uv run python instead of plain python
  • Clarified that manual database creation is now optional (script will create database if it does not exist)
  • Updated StackExchange

Python 2 Compatibility Removal:

  • Removed six library dependency from pyproject.toml
  • Replaced six.print_() with standard print() throughout codebase
  • Replaced six.next() with built-in next() in row_processor.py
  • Replaced six.moves.urllib.request.urlretrieve() with urllib.request.urlretrieve() in
    load_into_pg.py
  • Removed raw_input compatibility shim
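
The replacements listed above are mechanical. The sketch below shows the Python 3 form of each one, with the old six call noted in the docstring. The wrapper functions (`first_row`, `fetch`, `report`) are hypothetical names used only to keep the example self-contained; they are not taken from row_processor.py or load_into_pg.py.

```python
import urllib.request


def first_row(rows):
    """Python 3 form of six.next(rows): the built-in next()."""
    return next(rows)


def fetch(url, dest):
    """Python 3 form of six.moves.urllib.request.urlretrieve(url, dest)."""
    path, _headers = urllib.request.urlretrieve(url, dest)
    return path


def report(message):
    """Python 3 form of six.print_(message): the built-in print()."""
    print(message)


# A raw_input shim is unnecessary on Python 3, where input() already
# returns the typed line as a string.
```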

Testing

All Python files pass validation:

Compatibility

Requires Python 3.8+, which matches the existing requires-python = ">=3.8" in pyproject.toml.
Python 2 reached end-of-life in January 2020.


This project has not had a contribution in some time, but I am hoping someone out there still wants to maintain this repo. I was able to use the repo to easily import various StackExchange topics into my test server for pgFirstAid. Having some test data will help me validate health checks added to pgFirstAid as it grows. Thanks to all who contributed to this project before me!
