fix: Secure Medium URL parsing to prevent SSRF vulnerability (CodeQL …#127
…alert #27)
Replace unsafe string matching with proper URL parsing:
- Parse URL hostname before checking for medium.com
- Prevents bypasses like http://evil-medium.com or http://evil.net/medium.com
- Add freedium.app as fallback mirror for Medium articles
Fixes: CWE-20 (Incomplete URL substring sanitization)
https://claude.ai/code/session_0138bAjho1fWwiRZju3nJFJ3
Pull request overview
This PR fixes a security vulnerability (SSRF/CWE-20) in Medium URL parsing within the paywall bypass logic by replacing unsafe string matching (u.includes('medium.com')) with proper URL parsing and hostname validation. The fix prevents potential bypasses through malicious URLs like http://evil-medium.com or http://evil.net/medium.com.
Changes:
- Replaced the string-based .includes() check with URL parsing and hostname validation for the scribe.rip Medium bypass
- Added a new freedium.app bypass service for Medium articles with the same secure hostname validation approach
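To make the bypass concrete, the sketch below contrasts the substring check with a parsed-hostname check on the example URLs from the description. The names isMediumUnsafe and isMediumSafe are illustrative and do not appear in the PR.

```javascript
// Illustrative only: these function names are not from the PR.

// Substring check: matches "medium.com" anywhere in the URL string,
// including other hostnames and paths.
function isMediumUnsafe(u) {
  return u.includes('medium.com');
}

// Hostname check: parse first, then accept only medium.com or its subdomains.
function isMediumSafe(u) {
  try {
    const { hostname } = new URL(u);
    return hostname === 'medium.com' || hostname.endsWith('.medium.com');
  } catch (_) {
    return false; // not a parseable absolute URL
  }
}

const samples = [
  'https://medium.com/some-article',
  'https://blog.medium.com/some-article',
  'http://evil-medium.com',
  'http://evil.net/medium.com',
];
for (const u of samples) {
  console.log(u, 'substring:', isMediumUnsafe(u), 'hostname:', isMediumSafe(u));
}
```

The last two samples pass the substring check but fail the hostname check, which is the gap the PR closes.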
transform: (u) => {
  try {
    const parsedUrl = new URL(u);
    // Safely check if this is a medium.com URL
    if (parsedUrl.hostname === 'medium.com' || parsedUrl.hostname.endsWith('.medium.com')) {
      return u.replace(parsedUrl.hostname, 'freedium.app');
    }
  } catch (_) {}
  return null;
}
This hostname validation should use the existing hostnameMatches() helper function (defined at line 584) for consistency with the rest of the codebase. The helper is already used throughout the file (lines 597-608) for the same purpose.
The current implementation duplicates the logic inline. Extract the hostname once and use the helper:
try {
  const parsedUrl = new URL(u);
  const hostname = parsedUrl.hostname;
  if (hostnameMatches(hostname, 'medium.com')) {
    return u.replace(parsedUrl.hostname, 'freedium.app');
  }
} catch (_) {}
This ensures consistent security checks across the codebase and improves maintainability.
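The definition of hostnameMatches() is not shown in this PR. Assuming it takes a hostname and a base domain and accepts the domain itself plus any subdomain, it might look roughly like the following; the real helper in the codebase may differ.

```javascript
// Assumed shape of the helper; not copied from the codebase.
function hostnameMatches(hostname, domain) {
  const host = hostname.toLowerCase();
  const base = domain.toLowerCase();
  // Exact match, or a subdomain separated by a literal dot.
  return host === base || host.endsWith('.' + base);
}

console.log(hostnameMatches('medium.com', 'medium.com'));
console.log(hostnameMatches('blog.medium.com', 'medium.com'));
console.log(hostnameMatches('evil-medium.com', 'medium.com'));
```

Requiring the leading dot in the suffix comparison is what rejects look-alike hosts such as evil-medium.com.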
name: 'freedium',
transform: (u) => {
  try {
    const parsedUrl = new URL(u);
    // Safely check if this is a medium.com URL
    if (parsedUrl.hostname === 'medium.com' || parsedUrl.hostname.endsWith('.medium.com')) {
      return u.replace(parsedUrl.hostname, 'freedium.app');
    }
  } catch (_) {}
  return null;
}
There's an inconsistency between the proactive Medium bypass and this reactive bypass. The codebase already has a proactive Medium paywall bypass at line 260 that uses freedium.cloud, but this PR adds a reactive bypass using freedium.app.
Consider:
- Using the same domain (freedium.cloud or freedium.app) in both places for consistency
- Evaluating whether both proactive (line 260) and reactive (this code) approaches are needed, or if the proactive approach should be the primary method
The proactive approach is generally more reliable as it bypasses the paywall before any content extraction issues occur, as noted in the stored memory about paywall handling.
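One way to keep both paths on the same mirror is to name the host once and route both through a single function. This is a minimal sketch: MEDIUM_MIRROR_HOST and toMediumMirror are hypothetical names, the proactive code at line 260 is not shown here, and which of the two domains to standardize on is left to the maintainers.

```javascript
// Hypothetical shared constant; the PR does not define one.
const MEDIUM_MIRROR_HOST = 'freedium.app'; // or 'freedium.cloud'

// Hypothetical shared transform that both the proactive and reactive paths could call.
function toMediumMirror(u) {
  try {
    const parsed = new URL(u);
    if (parsed.hostname === 'medium.com' || parsed.hostname.endsWith('.medium.com')) {
      parsed.hostname = MEDIUM_MIRROR_HOST; // swap only the host component
      return parsed.toString();
    }
  } catch (_) {}
  return null; // not a Medium URL, or not parseable
}

console.log(toMediumMirror('https://medium.com/some-article'));
console.log(toMediumMirror('http://evil.net/medium.com'));
```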
transform: (u) => {
  try {
    const parsedUrl = new URL(u);
    // Safely check if this is a medium.com URL by parsing the hostname
    if (parsedUrl.hostname === 'medium.com' || parsedUrl.hostname.endsWith('.medium.com')) {
      return u.replace(parsedUrl.hostname, 'scribe.rip');
    }
  } catch (_) {}
  return null;
}
This hostname validation should use the existing hostnameMatches() helper function (defined at line 584) for consistency with the rest of the codebase. The helper is already used throughout the file (lines 597-608) for the same purpose of checking if a hostname matches a domain including its subdomains.
The current implementation duplicates the logic of hostnameMatches() inline. Instead, extract the hostname once and use the helper:
try {
  const parsedUrl = new URL(u);
  const hostname = parsedUrl.hostname;
  if (hostnameMatches(hostname, 'medium.com')) {
    return u.replace(parsedUrl.hostname, 'scribe.rip');
  }
} catch (_) {}
This improves maintainability and ensures consistent security checks across the codebase.
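A related point that may be worth considering: u.replace(parsedUrl.hostname, …) rewrites the first textual match in the raw string, and the parsed hostname is normalized to lowercase, so the two can disagree. The sketch below assigns to the URL object's hostname instead. rewriteMediumHost is an illustrative name, and hostnameMatches is declared here only as an assumed stand-in for the codebase's helper.

```javascript
// Stand-in for the codebase's helper, whose real definition is not shown in this PR.
function hostnameMatches(hostname, domain) {
  return hostname === domain || hostname.endsWith('.' + domain);
}

// Illustrative name: rewrite via the parsed URL rather than the raw string.
function rewriteMediumHost(u, mirrorHost) {
  try {
    const parsedUrl = new URL(u);
    if (hostnameMatches(parsedUrl.hostname, 'medium.com')) {
      parsedUrl.hostname = mirrorHost; // replaces exactly the host component
      return parsedUrl.toString();
    }
  } catch (_) {}
  return null;
}

const u = 'https://MEDIUM.com/some-article';
// String replace finds nothing here: the parsed hostname is 'medium.com',
// but the raw string contains 'MEDIUM.com', so the URL comes back unchanged.
console.log(u.replace(new URL(u).hostname, 'scribe.rip'));
// Assigning to the parsed URL's hostname rewrites the host regardless of case.
console.log(rewriteMediumHost(u, 'scribe.rip'));
```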