Claude/configurable news reader g1 gdx#116
Conversation
- Create security branch from main if it doesn't exist - Reset security to main if more than 50 commits behind - Handle merge conflicts by resetting to main first - Use force-with-lease for safer push https://claude.ai/code/session_0138bAjho1fWwiRZju3nJFJ3
- Removed "Retour", "Accueil" and "Page précédente" buttons - Elevator arrows now match the style from readme-viewer.html: - Round buttons (border-radius: 50%) - Cyan color (#00d9ff) - Same hover effects - Position at bottom-right https://claude.ai/code/session_0138bAjho1fWwiRZju3nJFJ3
There was a problem hiding this comment.
Pull request overview
This PR adds paywall-bypass fallback extraction in the news aggregator, adjusts the reader UI styling/controls, and refines the Dependabot “secure flow” workflow to keep the security branch in sync with main.
Changes:
- Add multi-service paywall bypass attempt (archive.ph / scribe.rip / web.archive.org) when extracted content looks paywalled.
- Update reader UI layout/styles (e.g., elevator positioning / topbar/back controls per diff).
- Update Dependabot secure-flow workflow logic to (re)create/reset
securitybranch and improve merge handling.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
src/aggregator.js |
Adds paywall-bypass fetching/parsing and tweaks fallback summary generation; introduces paywall host scaffolding in addUTMParams(). |
reader.html |
Updates article view CSS and removes/repositions some navigation UI elements per diff. |
.github/workflows/dependabot-secure-flow.yaml |
Changes how security branch is ensured/updated and how Dependabot PR branches are merged into it. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
|
||
| // Liste des domaines avec paywalls stricts | ||
| const paywalledHosts = ['ft.com', 'wsj.com', 'economist.com', 'bloomberg.com', 'investing.com']; | ||
|
|
||
| // Ajouter Archive.ph en query parameter pour fallback | ||
| if (paywalledHosts.some(host => hostname.includes(host))) { | ||
| // On ne change pas l'URL ici, on l'utilisera comme fallback | ||
| } |
There was a problem hiding this comment.
Le bloc paywalledHosts est actuellement un no-op (aucune action dans le if) et ajoute du code mort / un commentaire trompeur ("Ajouter Archive.ph"), ce qui rend addUTMParams() plus difficile à maintenir. Soit implémenter réellement le fallback (ex: retourner une URL archive, ou exposer l’URL de fallback via un champ séparé), soit supprimer entièrement ce bloc.
| // Liste des domaines avec paywalls stricts | |
| const paywalledHosts = ['ft.com', 'wsj.com', 'economist.com', 'bloomberg.com', 'investing.com']; | |
| // Ajouter Archive.ph en query parameter pour fallback | |
| if (paywalledHosts.some(host => hostname.includes(host))) { | |
| // On ne change pas l'URL ici, on l'utilisera comme fallback | |
| } |
| const paywalledHosts = ['ft.com', 'wsj.com', 'economist.com', 'bloomberg.com', 'investing.com']; | ||
|
|
||
| // Ajouter Archive.ph en query parameter pour fallback | ||
| if (paywalledHosts.some(host => hostname.includes(host))) { |
There was a problem hiding this comment.
Le test hostname.includes(host) pour détecter des domaines paywallés est fragile (ex: notft.com matcherait ft.com). Le fichier a déjà un helper hostnameMatches(host, baseDomain) basé sur un match suffixe (host === domain || host.endsWith('.'+domain)); utilisez-le ici pour éviter des contournements et garder une logique cohérente.
| if (paywalledHosts.some(host => hostname.includes(host))) { | |
| if (paywalledHosts.some(host => hostnameMatches(hostname, host))) { |
| // Essayer les services de bypass avant de renoncer | ||
| const bypassResult = await tryPaywallBypass(resolvedArticleUrl); | ||
| if (bypassResult.success) { | ||
| const bypassDom = createSafeDom(bypassResult.html, resolvedArticleUrl); |
There was a problem hiding this comment.
tryPaywallBypass() ne retourne pas l’URL réellement utilisée (bypassUrl), et plus bas createSafeDom(bypassResult.html, resolvedArticleUrl) utilise l’URL d’origine comme base. Pour des HTML provenant de archive.ph / web.archive.org, cela peut casser la résolution des liens relatifs et fausser les heuristiques Readability. Retourner bypassUrl (ou finalUrl) et l’utiliser comme second paramètre de createSafeDom afin que document.URL/baseURI reflète la page parsée.
| const bypassDom = createSafeDom(bypassResult.html, resolvedArticleUrl); | |
| // Utiliser l'URL réellement utilisée par le service de bypass (si fournie) | |
| const bypassBaseUrl = bypassResult.finalUrl || bypassResult.bypassUrl || resolvedArticleUrl; | |
| const bypassDom = createSafeDom(bypassResult.html, bypassBaseUrl); |
| } else { | ||
| // Contenu normal sans paywall | ||
| if (isLikelyBoilerplateExtraction(articleContent.textContent)) { | ||
| writeFallbackLocalArticle(); | ||
| } else { | ||
| if (!computedSummary || computedSummary.trim().length < 20) { | ||
| computedSummary = trimPromotionalTailText(cleanupNoiseText(sanitizeText(articleContent.textContent.slice(0, 1400)))); | ||
| } |
There was a problem hiding this comment.
La logique imbriquée autour du cas "contenu normal" (paywall/boilerplate) a une indentation incohérente et des blocs else difficiles à lire (ex: if (!computedSummary...) n’est pas indenté sous le else). Merci de réindenter/reformater ce bloc pour éviter des erreurs futures lors de modifications (et faciliter la vérification des accolades/branches).
| git fetch origin main | ||
| if git fetch origin security 2>/dev/null; then | ||
| # Security branch exists, check if it's behind main | ||
| git checkout security | ||
| BEHIND=$(git rev-list --count HEAD..origin/main) | ||
| if [ "$BEHIND" -gt 50 ]; then | ||
| echo "Security branch is $BEHIND commits behind main. Resetting to main..." | ||
| git reset --hard origin/main | ||
| fi |
There was a problem hiding this comment.
Dans l’étape "Ensure security branch exists", git checkout security peut échouer si la branche locale security n’existe pas (actions/checkout ne la crée pas forcément). Il faut checkout/mettre à jour explicitement depuis origin/security (ex: créer/forcer la branche locale à pointer sur origin/security) avant de calculer le retard et éventuellement reset.
| if ! git merge origin/${{ github.head_ref }} --no-edit; then | ||
| echo "Merge conflict detected, aborting and retrying with main as base..." | ||
| git merge --abort || true | ||
| git reset --hard origin/main | ||
| git merge origin/${{ github.head_ref }} --no-edit || true |
There was a problem hiding this comment.
La fusion utilise git merge origin/${{ github.head_ref }}, mais juste avant le fetch est fait via ${head_ref}:${head_ref} (création d’une branche locale), ce qui ne garantit pas l’existence/à-jour de origin/${head_ref}. Résultat possible: merge qui échoue / merge d’une ref obsolète, puis push de security sans les changements Dependabot. Ajuster soit le fetch (pour alimenter refs/remotes/origin/${head_ref}), soit la commande merge pour cibler la branche locale ${{ github.head_ref }} ou FETCH_HEAD.
| if ! git merge origin/${{ github.head_ref }} --no-edit; then | |
| echo "Merge conflict detected, aborting and retrying with main as base..." | |
| git merge --abort || true | |
| git reset --hard origin/main | |
| git merge origin/${{ github.head_ref }} --no-edit || true | |
| if ! git merge ${{ github.head_ref }} --no-edit; then | |
| echo "Merge conflict detected, aborting and retrying with main as base..." | |
| git merge --abort || true | |
| git reset --hard origin/main | |
| git merge ${{ github.head_ref }} --no-edit || true |
Continue Tasks:▶️ 1 queued — View all