Fix BigDeal scraper: collect → filter top-level → expand "See more" per-article #38
jjlauterbach merged 5 commits into bigdeal from
…p-level, expand See more per-article Co-authored-by: jjlauterbach <1447549+jjlauterbach@users.noreply.github.com>
Deploying daily-custard with

| | |
|---|---|
| Latest commit: | 8310088 |
| Status: | ✅ Deploy successful! |
| Preview URL: | https://f31064e6.daily-custard.pages.dev |
| Branch Preview URL: | https://copilot-sub-pr-30.daily-custard.pages.dev |
Just as a heads up, I was blocked by some firewall rules while working on your feedback. Expand below for details.

@copilot fix conflicts
Pull request overview
This PR updates the BigDeal Facebook (Playwright) scraper flow to avoid expanding “See more” buttons in comment threads by mirroring the established approach used in other Facebook scrapers: collect articles first, filter to top-level posts, then expand “See more” within each top-level article.
Changes:
- Reordered BigDeal’s Facebook scraping to collect `[role="article"]` elements first, filter nested comment-articles via `evaluate()`, then expand “See more” per top-level article.
- Removed page-wide “See more” expansion and replaced it with per-article `query_selector()` expansion logic (with multiple selectors).
- Refactored BigDeal Facebook scraper tests to use a centralized `_create_mock_article()` helper and updated the “See more” expansion assertion to validate per-article expansion.
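
The collect → filter → expand order described in the changes above could look roughly like the following. This is a minimal sketch, not the code from `app/scrapers/bigdeal.py`: the function name `expand_and_read_top_level`, the selector list, and the JavaScript predicate are all assumptions made for illustration. It only relies on the `query_selector_all()`, `evaluate()`, `query_selector()`, `click()` and `inner_text()` methods that Playwright pages and element handles provide.

```python
from typing import Any, List

# Hypothetical selectors; the PR only says that multiple selectors are tried.
SEE_MORE_SELECTORS = [
    'div[role="button"]:has-text("See more")',
    'span:has-text("See more")',
]

# Assumed predicate: true when the element sits inside another [role="article"],
# which is how comment threads are nested under a post.
IS_NESTED_JS = (
    "el => !!(el.parentElement && "
    "el.parentElement.closest('[role=\"article\"]'))"
)


def expand_and_read_top_level(page: Any) -> List[str]:
    """Collect articles, keep top-level posts, then expand each one."""
    # 1. Collect every article element, posts and comments alike.
    articles = page.query_selector_all('[role="article"]')

    # 2. Filter out articles nested inside another article (comments).
    top_level = [a for a in articles if not a.evaluate(IS_NESTED_JS)]

    # 3. Expand "See more" inside each remaining article only.
    texts = []
    for article in top_level:
        for selector in SEE_MORE_SELECTORS:
            button = article.query_selector(selector)
            if button is not None:
                button.click()
                break  # one expansion per article is enough
        texts.append(article.inner_text())
    return texts
```

Because the "See more" lookup is scoped to each top-level article rather than the whole page, buttons that appear only inside comment articles are never queried or clicked.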
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| app/scrapers/bigdeal.py | Reorders scraping to filter top-level posts before expanding “See more” within each article, reducing comment-thread noise. |
| tests/test_bigdeal_scraper.py | Adds a consistent article-mock helper (including evaluate()/query_selector() stubs) and updates the “See more” expansion test for the new per-article behavior. |
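
For the test side, a centralized article mock with `evaluate()` and `query_selector()` stubs might be shaped like this. It is a hedged sketch only: `create_mock_article` and its parameters are a hypothetical stand-in for the PR's `_create_mock_article()` helper, whose real signature is not shown here, and `expand_see_more` is a simplified placeholder for the scraper's per-article step.

```python
from unittest.mock import MagicMock


def create_mock_article(text="", is_nested=False, see_more_button=None):
    """Hypothetical stand-in for the PR's _create_mock_article() helper."""
    article = MagicMock()
    article.inner_text.return_value = text
    # evaluate() answers "is this article nested inside another article?"
    article.evaluate.return_value = is_nested
    # query_selector() returns the "See more" button, or None if absent.
    article.query_selector.return_value = see_more_button
    return article


def expand_see_more(article):
    """Simplified placeholder for the scraper's per-article expansion."""
    button = article.query_selector('text="See more"')
    if button is not None:
        button.click()
        return True
    return False


# Per-article expansion: the button is looked up on the article, not the page.
button = MagicMock()
page = MagicMock()
article = create_mock_article("Today: Mint Chip", see_more_button=button)

expand_see_more(article)

button.click.assert_called_once()
page.query_selector.assert_not_called()
```

Keeping the stubs in one helper means each test sets only the behavior it cares about, such as `is_nested` or `see_more_button`.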
Co-authored-by: jjlauterbach <1447549+jjlauterbach@users.noreply.github.com>
…date See more test Co-authored-by: jjlauterbach <1447549+jjlauterbach@users.noreply.github.com>
…evaluate addition Co-authored-by: jjlauterbach <1447549+jjlauterbach@users.noreply.github.com>
Conflicts resolved in commit 8310088. The root cause was that

- `bigdeal.py` vs `leons.py` scraping patterns
- `bigdeal.py` `_scrape_facebook_page_attempt` to mirror Leon's approach: collect articles first, filter to top-level, then expand "See more" per-article
- `_create_mock_article` helper to `TestBigDealFacebookScraping` test class
- `test_scrape_facebook_page_expands_see_more_buttons` to verify per-article expansion
- `evaluate.return_value = False` setups unchanged from `bigdeal` branch; position `query_selector.return_value` after `mock_page.query_selector_all.return_value` so it falls in a region only this branch modified; `git merge-tree` simulation confirms zero conflicts