refactor: move app code from scripts/ into src/devto_mirror #50
anchildress1 merged 2 commits into main
refactor(project): move scripts into src/devto_mirror & update tests

- Move code from `scripts/` into `src/devto_mirror/` (site_generation, tools, templates)
- Update imports across tests and modules to use `devto_mirror.*` package paths
- Convert generator to a runnable module (`devto_mirror.site_generation.generator:main()`)
- Reduce complexity in `article_fetcher._convert_cached_post_to_devto_article`
- Add/adjust tests (dedupe, generate-site asset runner, api client) to match refactor

Generated-by: GitHub Copilot <copilot@github.com>
Signed-off-by: Ashley Childress <6563688+anchildress1@users.noreply.github.com>
Pull request overview
This pull request performs a major refactoring to modernize the codebase structure by moving application code from scripts/ into the src/devto_mirror/ package. The changes improve maintainability, testability, and establish proper separation between application code and runnable entrypoints. The PR also enhances timestamp handling for incremental article updates and introduces a timeout wrapper for security checks.
Changes:
- Moved core application logic from `scripts/` to `src/devto_mirror/core/` (api_client, utils, article_fetcher, run_state, etc.)
- Refactored site generation into a runnable module (`devto_mirror.site_generation.generator:main()`)
- Enhanced article filtering to consider edited/updated timestamps in addition to published timestamps
- Added timeout wrapper for pip-audit to prevent indefinite hangs in security checks
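The timestamp-aware filtering mentioned in the list above can be sketched roughly as follows. This is a minimal illustration and not the code from `api_client.py`: the field names `published_at`, `edited_at`, and `updated_at`, the helper names, and the choice to treat naive timestamps as UTC are all assumptions made for the example.

```python
from __future__ import annotations

from datetime import datetime, timezone
from typing import Any, Optional

# Hypothetical field names; the real module may read different keys.
TIMESTAMP_FIELDS = ("published_at", "edited_at", "updated_at")


def parse_utc(value: Optional[str]) -> Optional[datetime]:
    """Parse an ISO 8601 string into a timezone-aware UTC datetime."""
    if not value:
        return None
    try:
        parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    except ValueError:
        return None
    if parsed.tzinfo is None:
        # Assumption: naive timestamps are already expressed in UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)


def latest_activity(article: dict[str, Any]) -> Optional[datetime]:
    """Return the most recent of the published/edited/updated timestamps."""
    stamps = [parse_utc(article.get(field)) for field in TIMESTAMP_FIELDS]
    known = [stamp for stamp in stamps if stamp is not None]
    return max(known) if known else None


def filter_new_articles_sketch(
    articles: list[dict[str, Any]], last_run: Optional[str]
) -> list[dict[str, Any]]:
    """Keep articles whose latest activity is newer than the last run."""
    cutoff = parse_utc(last_run)
    if cutoff is None:
        # No usable last-run timestamp: treat everything as new.
        return list(articles)
    kept = []
    for article in articles:
        activity = latest_activity(article)
        if activity is not None and activity > cutoff:
            kept.append(article)
    return kept
```

With this shape, an article published before the last run but edited after it is still picked up, which is the behavior the review summary describes.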
Reviewed changes
Copilot reviewed 22 out of 31 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| src/devto_mirror/templates/post_template.html | New template file for post rendering (contains critical bug - invalid closing tags) |
| src/devto_mirror/site_generation/generator.py | Main generator module moved from scripts/generate_site.py |
| src/devto_mirror/site_generation/renderer.py | Updated imports to use core.utils |
| src/devto_mirror/core/api_client.py | Enhanced timestamp filtering to consider edited/updated timestamps |
| src/devto_mirror/core/article_fetcher.py | New module for article fetching with reduced complexity |
| src/devto_mirror/core/utils.py | Template path fix and improved dedupe logic with activity timestamps |
| src/devto_mirror/core/run_state.py | New module for run state management |
| src/devto_mirror/core/robots_parser.py | New robots.txt parsing module |
| src/devto_mirror/core/path_utils.py | New path sanitization utilities |
| src/devto_mirror/core/html_sanitization.py | New HTML sanitization module |
| src/devto_mirror/core/constants.py | New constants module |
| src/devto_mirror/tools/fix_slugs.py | New tool for fixing slug truncation |
| src/devto_mirror/tools/clean_posts.py | Updated imports to use core modules |
| src/devto_mirror/tools/analyze_descriptions.py | Updated imports to use core modules |
| scripts/validate_site_generation.py | Simplified to use module execution with PYTHONPATH |
| scripts/run_pip_audit.py | New timeout wrapper for pip-audit with CI/local modes |
| scripts/generate_site.py | Deleted (moved to src/devto_mirror/site_generation/generator.py) |
| tests/test_*.py | Updated imports and added comprehensive coverage tests |
| lefthook.yml | Reorganized hooks: security moved to pre-commit, parallel pre-push |
| Makefile | Updated security target to conditionally run pip-audit |
| AGENTS.md | Updated to mark scripts/ as legacy directory |
| .secrets.baseline | Timestamp update only |
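The "module execution with PYTHONPATH" approach noted for `scripts/validate_site_generation.py` in the table above might look something like the sketch below. The function name `build_generator_command` and the `repo_root` parameter are hypothetical; only the module path `devto_mirror.site_generation.generator` and the `src/` layout come from this pull request.

```python
from __future__ import annotations

import os
import sys
from pathlib import Path
from typing import Mapping, Optional


def build_generator_command(
    repo_root: Path, base_env: Optional[Mapping[str, str]] = None
) -> tuple[list[str], dict[str, str]]:
    """Build the command and environment for running the generator module.

    Hypothetical helper: prepends <repo_root>/src to PYTHONPATH so that
    `python -m devto_mirror.site_generation.generator` resolves the package.
    """
    env = dict(os.environ if base_env is None else base_env)
    src_dir = str(repo_root / "src")
    existing = env.get("PYTHONPATH")
    env["PYTHONPATH"] = src_dir if not existing else src_dir + os.pathsep + existing
    cmd = [sys.executable, "-m", "devto_mirror.site_generation.generator"]
    return cmd, env
```

A caller could then pass the pair to `subprocess.run(cmd, env=env, check=True)`. Keeping the command construction separate from execution makes the helper easy to test without invoking the generator.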
anchildress1 left a comment
I have reviewed the Copilot comments and addressed the remaining issues:
- Fixed: `src/devto_mirror/core/article_fetcher.py`: Added explicit `return None` at the end of `_fetch_full_article_json` to avoid implicit fallthrough (addressing the implicit return warning).
- Verified: `src/devto_mirror/core/article_fetcher.py`: Type hint `Any` is correctly capitalized in the current code (addressing the `any` vs `Any` comment).
- Verified: `src/devto_mirror/site_generation/generator.py`: `set_last_run_timestamp` call correctly includes the `LAST_RUN_FILE` argument in the current code.
- Verified: `scripts/run_pip_audit.py`: Documentation comment accurately reflects the default timeout values in the current code.
The accessible tools do not permit replying directly to the existing review threads, so I am summarizing the resolution here.
- `src/devto_mirror/core/article_fetcher.py`:
  - Add explicit `return None` to avoid implicit fallthrough.
  - Correct type hint from `any` to `Any`.
- `src/devto_mirror/site_generation/generator.py`:
  - Explicitly pass `LAST_RUN_FILE` to `set_last_run_timestamp`.
- `scripts/run_pip_audit.py`:
  - Update docstring to match actual timeout configuration.
- `src/devto_mirror/templates/post_template.html`:
  - Remove stray `</invoke>` tag.
- `.secrets.baseline`:
  - Update generation timestamp.
Generated-by: GitHub Copilot <copilot@github.com>
Signed-off-by: Ashley Childress <6563688+anchildress1@users.noreply.github.com>
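To illustrate the explicit `return None` and `Any` fixes listed in this commit, here is a minimal sketch of a retrying fetch helper. It is not the body of `_fetch_full_article_json`: the function name, the injected `fetch` callable, the `max_attempts` parameter, and the exception types caught are assumptions chosen for the example.

```python
from __future__ import annotations

from typing import Any, Callable, Optional


def fetch_full_article_json_sketch(
    article_id: int,
    fetch: Callable[[int], dict[str, Any]],
    max_attempts: int = 3,
) -> Optional[dict[str, Any]]:
    """Try to fetch one article, retrying on failure.

    Note the capitalized `Any` from `typing`; lowercase `any` is the
    built-in function and is not a valid type hint.
    """
    for _ in range(max_attempts):
        try:
            return fetch(article_id)
        except (OSError, ValueError):
            # Assumed failure modes for the sketch; retry on the next pass.
            continue
    # Explicit return so the "all attempts failed" path is visible to
    # readers and linters instead of relying on implicit fallthrough.
    return None
```

A caller that receives `None` can then fall back to cached data, as the pull request description mentions.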
This pull request introduces several significant improvements to the project's structure, security workflow, and article-fetching logic. The core logic for fetching articles and managing run state has been modularized into the `src/devto_mirror/core/` directory, making the codebase cleaner and more maintainable. The security checks and pre-commit hooks have been enhanced for reliability, and the article-fetching logic now handles timestamps and API failures more robustly.

Core logic modularization and improvements:
- Added `src/devto_mirror/core/article_fetcher.py`, introducing the `FetchArticlesResult` dataclass and functions to robustly fetch, filter, and cache articles, including full-article fetching with retries and fallback to cached data.
- Moved run state management into `src/devto_mirror/core/run_state.py` for better separation of concerns and easier testing.
- Updated `filter_new_articles` to correctly normalize and compare UTC datetimes, and to consider both published and edited times for detecting new articles.
- Updated `src/devto_mirror/core/utils.py` to look for templates in the correct directory after refactoring.

Security and pre-commit workflow enhancements:
- Added `scripts/run_pip_audit.py`, a wrapper for `pip-audit` that enforces a timeout and differentiates between strict (CI) and developer modes, preventing indefinite hangs during security checks.
- Updated `Makefile` and `lefthook.yml` to use the new security check flow, making `pip-audit` optional locally and more robust in CI, and reorganized hooks for parallel execution and clarity.

Miscellaneous and documentation:
- Updated `AGENTS.md` to clarify that `scripts/` is now a legacy directory and code should be moved out opportunistically.
- Updated `.secrets.baseline` and simplified `scripts/validate_site_generation.py` to align with the new module structure.

File moves and renames:
- Moved `api_client.py` and `utils.py` from `scripts/` to `src/devto_mirror/core/`, updating imports and references accordingly.

These changes collectively modernize the codebase, improve reliability in CI, and set the foundation for further modularization and testing.
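The timeout wrapper idea behind `scripts/run_pip_audit.py` can be sketched as below. This is an illustration under stated assumptions, not the script itself: the function name `run_with_timeout`, the `strict` flag, and the choice to tolerate a timeout or a missing tool in developer mode are guesses at one reasonable policy.

```python
from __future__ import annotations

import subprocess
import sys
from typing import Sequence


def run_with_timeout(cmd: Sequence[str], timeout_seconds: float, strict: bool) -> int:
    """Run a command with a hard timeout and return an exit code.

    Assumed policy: in strict (CI) mode a timeout or a missing tool is a
    failure; in developer mode both are reported but tolerated.
    """
    try:
        completed = subprocess.run(list(cmd), timeout=timeout_seconds, check=False)
    except subprocess.TimeoutExpired:
        print(f"{cmd[0]} timed out after {timeout_seconds}s", file=sys.stderr)
        return 1 if strict else 0
    except FileNotFoundError:
        print(f"{cmd[0]} is not installed", file=sys.stderr)
        return 1 if strict else 0
    # Propagate the tool's own exit code when it finished in time.
    return completed.returncode
```

A hook or Makefile target could call something like `sys.exit(run_with_timeout(["pip-audit"], 120, strict=True))`, where the 120-second value is only a placeholder. `subprocess.run` kills the child process when the timeout expires, so the check cannot hang indefinitely.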