Skip to content

Potential fix for code scanning alert no. 16: Incomplete URL substring sanitization#28

Merged
EthanThePhoenix38 merged 1 commit intomainfrom
alert-autofix-16
Jan 15, 2026
Merged

Potential fix for code scanning alert no. 16: Incomplete URL substring sanitization#28
EthanThePhoenix38 merged 1 commit intomainfrom
alert-autofix-16

Conversation

@EthanThePhoenix38
Copy link
Member

Potential fix for https://github.com/ThePhoenixAgency/AI-Pulse/security/code-scanning/16

In general, substring matching on the full URL string should be replaced by checks on the parsed URL’s hostname, combined with an explicit allowlist and precise matching (e.g., exact host or specific known mirrors). For this particular function, we should parse the URL, inspect the hostname, and only apply the Freedium rewrite if the hostname is exactly medium.com or towardsdatascience.com (or known subdomains like www.medium.com if desired). This prevents URLs where medium.com only appears in the path, query, or as part of another domain from being rewritten.

The best minimal fix inside src/aggregator.js is:

  • Import Node’s built-in url parsing (or use the WHATWG URL class) without changing existing imports.
  • In addUTMParams, parse the incoming url into a URL object, safely extract hostname, and use strict equality against a small allowlist of Medium-related domains.
  • Only if the hostname matches, construct the Freedium URL using just the origin+path+search of the original URL to avoid accidentally embedding schemes or credentials. Otherwise, leave the URL unchanged.
  • Keep the rest of the function behavior (adding UTM parameters) intact.

Concretely:

  • Add const { URL } = require('url'); at the top of src/aggregator.js after existing requires (anywhere among them).
  • Replace the body of addUTMParams so that:
    • It tries to construct new URL(url), catching errors; on parse failure, skips the Freedium logic and just appends UTM params.
    • Defines an ALLOWED_MEDIUM_HOSTS array like ['medium.com', 'www.medium.com', 'towardsdatascience.com', 'www.towardsdatascience.com'].
    • If parsed.hostname is in that list, rewrites url to https://freedium.cloud/ + the original URL string (or a normalized version).
    • Then appends the UTM parameters as before.

This keeps all existing functionality (Freedium rewrite + UTM parameters) but removes the incomplete substring sanitization.

Suggested fixes powered by Copilot Autofix. Review carefully before merging.

…g sanitization

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
@EthanThePhoenix38 EthanThePhoenix38 marked this pull request as ready for review January 15, 2026 14:32
Copilot AI review requested due to automatic review settings January 15, 2026 14:32
@EthanThePhoenix38 EthanThePhoenix38 merged commit bd9ad16 into main Jan 15, 2026
4 checks passed
@EthanThePhoenix38 EthanThePhoenix38 deleted the alert-autofix-16 branch January 15, 2026 14:32
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This pull request addresses a security code scanning alert (#16) related to incomplete URL substring sanitization in the addUTMParams function. The fix replaces unsafe string matching with proper URL parsing to prevent malicious URLs from being incorrectly rewritten to the Freedium mirror.

Changes:

  • Added import for Node.js built-in URL class for proper URL parsing
  • Replaced substring-based URL checking with hostname-based validation using an explicit allowlist
  • Wrapped the Freedium rewrite logic in a try-catch block to handle malformed URLs gracefully

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +40 to +45
const mediumHosts = [
'medium.com',
'www.medium.com',
'towardsdatascience.com',
'www.towardsdatascience.com'
];
Copy link

Copilot AI Jan 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The mediumHosts array is defined inside the function on every call. Consider moving this to a constant at the module level (similar to FEED_CATEGORIES) to avoid unnecessary array creation on each invocation, improving performance and maintainability.

Copilot uses AI. Check for mistakes.
'www.towardsdatascience.com'
];

if (mediumHosts.includes(hostname)) {
Copy link

Copilot AI Jan 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The allowlist only includes exact domains but doesn't handle Medium's custom subdomain pattern. Medium allows users to publish on custom subdomains (e.g., username.medium.com or publication.medium.com). Consider checking if the hostname ends with .medium.com to catch all legitimate Medium subdomains, while still preventing path-based attacks.

Suggested change
if (mediumHosts.includes(hostname)) {
if (mediumHosts.includes(hostname) || hostname.endsWith('.medium.com')) {

Copilot uses AI. Check for mistakes.
// Rewrite only genuine Medium/TDS URLs to Freedium
url = `https://freedium.cloud/${url}`;
}
} catch (e) {
Copy link

Copilot AI Jan 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error handling silently swallows all exceptions without logging. This makes debugging difficult if URL parsing fails. Consider logging the error (at minimum during development) or documenting why silent failure is acceptable here. A malformed URL error could indicate a data quality issue that should be investigated.

Suggested change
} catch (e) {
} catch (e) {
if (process.env.NODE_ENV !== 'production') {
console.error('addUTMParams: Failed to parse URL for UTM handling:', url, e);
}

Copilot uses AI. Check for mistakes.
@@ -3,6 +3,7 @@ const Parser = require('rss-parser');
const axios = require('axios');
Copy link

Copilot AI Jan 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unused variable axios.

Suggested change
const axios = require('axios');

Copilot uses AI. Check for mistakes.
@@ -3,6 +3,7 @@ const Parser = require('rss-parser');
const axios = require('axios');
const { Octokit } = require('@octokit/rest');
Copy link

Copilot AI Jan 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unused variable Octokit.

Suggested change
const { Octokit } = require('@octokit/rest');

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants