Potential fix for code scanning alert no. 11: Incomplete multi-character sanitization#12

Merged
EthanThePhoenix38 merged 1 commit into main from alert-autofix-11
Jan 8, 2026

Conversation

@EthanThePhoenix38
Member

Potential fix for https://github.com/ThePhoenixAgency/AI-Pulse/security/code-scanning/11

In general, the fix is to stop using an ad‑hoc regex to “remove HTML tags” and instead use a well-tested, HTML‑aware sanitization/parsing approach. For this code, the best low-impact change is to replace the regex-based .replace(/<[^>]*>/g, '') with a call to a dedicated HTML-sanitizing/stripping library that reliably removes or escapes unsafe content, including <script> and other dangerous constructs.
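As a small illustration of why the scanner flags the regex, the sketch below runs it on two made-up strings (the inputs are assumptions for the example, not data from the repository). The pattern only removes complete `<...>` runs, so a tag that is never closed passes through untouched and can become a real element once the text is concatenated into HTML.

```javascript
// The regex the alert flags: it only removes complete "<...>" runs.
const stripTags = (input) => input.replace(/<[^>]*>/g, '');

// A well-formed tag is removed as expected.
const wellFormed = stripTags('<b>Breaking</b> news');

// An unterminated tag contains no ">" for the regex to match,
// so the "<img ..." fragment survives unchanged.
const unterminated = stripTags('Breaking news <img src=x onerror=alert(1) ');

console.log(wellFormed);
console.log(unterminated);
```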

Concretely, in src/aggregator.js:

  • Add an import for a popular, well-tested sanitizer such as sanitize-html.
  • Define a small helper, e.g. sanitizeText, that:
    • Uses sanitizeHtml with a configuration that strips all tags and attributes, returning only safe text.
    • Ensures that multi-character sequences that could re-form tags (e.g. in malformed HTML) are fully neutralized.
  • Update line 68 to use this helper on article.contentSnippet.
  • Update line 71 similarly for article.title, replacing the inlined regex with the helper before slicing to 200 chars.

This keeps the existing functionality (getting a plain-text-ish title/summary) but implements it in a robust way that avoids incomplete multi-character sanitization.
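The helper described above could be sketched roughly as follows. This is not the merged code: `createSanitizeText` is a hypothetical factory introduced here so the sketch can be exercised without the dependency installed, and the call sites in the trailing comment are assumptions based on the description of lines 68 and 71. The `allowedTags: []` and `allowedAttributes: {}` options are the sanitize-html settings for dropping all markup.

```javascript
// Hypothetical factory: the sanitizer is passed in rather than required here.
// In src/aggregator.js this would presumably be
//   const sanitizeText = createSanitizeText(require('sanitize-html'));
function createSanitizeText(sanitizeHtml) {
  return function sanitizeText(input) {
    // Treat missing fields (null/undefined) as empty text.
    if (input == null) return '';
    // Empty allowedTags / allowedAttributes asks the sanitizer to drop all markup.
    return sanitizeHtml(String(input), {
      allowedTags: [],
      allowedAttributes: {},
    }).trim();
  };
}

// Assumed call sites, mirroring the two replacements described above:
//   const summary = sanitizeText(article.contentSnippet);
//   const title = sanitizeText(article.title).slice(0, 200);
```

Sanitizing before slicing matters: truncating first could cut a tag in half and leave exactly the kind of unterminated fragment the alert is about.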

You will need:

  • An import at the top of src/aggregator.js for sanitize-html.
  • A sanitizeText function defined near the other helpers.
  • Replacement of the two regex uses on lines 68 and 71 with calls to sanitizeText.

Suggested fixes powered by Copilot Autofix. Review carefully before merging.

…ter sanitization

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
@EthanThePhoenix38 EthanThePhoenix38 marked this pull request as ready for review January 8, 2026 20:01
Copilot AI review requested due to automatic review settings January 8, 2026 20:01
@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f29c13b85f

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +10 to +11
"rss-parser": "^3.13.0",
"sanitize-html": "^2.17.0"

P1 Badge Update lockfile for new sanitize-html dependency

This commit adds sanitize-html to package.json but does not update package-lock.json. Environments that install from the lockfile are affected: npm ci exits with an error when package.json and package-lock.json are out of sync, and any setup that restores node_modules from the stale lockfile will not contain sanitize-html, so the new require('sanitize-html') in src/aggregator.js throws MODULE_NOT_FOUND at runtime. Please regenerate the lockfile so it includes the new dependency.
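A quick way to catch this kind of drift is to compare the two files directly. The sketch below is only an illustration: `missingFromLockfile` is a hypothetical helper, it assumes a lockfileVersion 2 or 3 package-lock.json (where direct dependencies appear under `packages["node_modules/<name>"]`), and the in-memory objects stand in for the parsed JSON files.

```javascript
// Returns the names of dependencies declared in package.json that have no
// "node_modules/<name>" entry in a lockfileVersion 2/3 package-lock.json.
function missingFromLockfile(pkg, lock) {
  const declared = Object.keys(pkg.dependencies || {});
  const locked = lock.packages || {};
  return declared.filter((name) => !(`node_modules/${name}` in locked));
}

// In-memory stand-ins for the two parsed JSON files in this PR's state.
const pkg = {
  dependencies: { 'rss-parser': '^3.13.0', 'sanitize-html': '^2.17.0' },
};
const lock = {
  lockfileVersion: 3,
  packages: { '': {}, 'node_modules/rss-parser': { version: '3.13.0' } },
};

console.log(missingFromLockfile(pkg, lock)); // [ 'sanitize-html' ]
```

Running npm install locally and committing the updated package-lock.json is the usual way to resolve the mismatch.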

@EthanThePhoenix38 EthanThePhoenix38 merged commit 576ed70 into main Jan 8, 2026
8 of 10 checks passed
@EthanThePhoenix38 EthanThePhoenix38 deleted the alert-autofix-11 branch January 8, 2026 20:03
Contributor

Copilot AI left a comment

Pull request overview

This PR addresses a code scanning alert regarding incomplete multi-character sanitization by replacing regex-based HTML tag removal with a dedicated HTML sanitization library. The fix targets a security vulnerability where the regex pattern /<[^>]*>/g could be bypassed by malformed HTML.

Key changes:

  • Added sanitize-html library as a new dependency
  • Implemented a sanitizeText helper function that uses sanitize-html with strict configuration
  • Replaced regex-based HTML stripping in article.contentSnippet and article.title processing

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| package.json | Added sanitize-html (^2.17.0) as a new dependency |
| src/aggregator.js | Added import for sanitize-html, implemented sanitizeText helper function, and replaced regex-based tag removal with sanitizeText calls |

Comment on lines +10 to +11
"rss-parser": "^3.13.0",
"sanitize-html": "^2.17.0"
Copilot AI Jan 8, 2026

The project already includes dompurify and isomorphic-dompurify as dependencies. Adding sanitize-html creates redundancy with three different HTML sanitization libraries. Consider using one of the existing libraries (particularly isomorphic-dompurify which works in both browser and Node.js environments) instead of introducing a third sanitization library. This would reduce bundle size, dependency complexity, and potential security surface.

Suggested change
- "rss-parser": "^3.13.0",
- "sanitize-html": "^2.17.0"
+ "rss-parser": "^3.13.0"

const Parser = require('rss-parser');
const axios = require('axios');
const { Octokit } = require('@octokit/rest');
const sanitizeHtml = require('sanitize-html');
Copilot AI Jan 8, 2026

Consider using the existing isomorphic-dompurify library instead of adding sanitize-html. The project already has DOMPurify available, which can strip HTML tags effectively using DOMPurify.sanitize(input, {ALLOWED_TAGS: []}). This would maintain consistency with the sanitization approach used in the HTML files (portfolio.html, reader.html, stats.html) and reduce the number of dependencies.
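The alternative suggested in this comment might look like the sketch below. `stripHtmlWithDomPurify` is a hypothetical helper, and the DOMPurify instance is passed in as `purify` because how the project would load isomorphic-dompurify in src/aggregator.js is an assumption. `ALLOWED_TAGS` and `ALLOWED_ATTR` are DOMPurify configuration options; with an empty tag list DOMPurify removes elements while keeping their text content by default.

```javascript
// Hypothetical helper: `purify` is expected to be a DOMPurify instance,
// e.g. the object exported by isomorphic-dompurify.
function stripHtmlWithDomPurify(purify, input) {
  // Empty allow-lists remove every element and attribute from the input.
  return purify.sanitize(String(input ?? ''), {
    ALLOWED_TAGS: [],
    ALLOWED_ATTR: [],
  });
}

// Assumed usage in src/aggregator.js:
//   const DOMPurify = require('isomorphic-dompurify');
//   const title = stripHtmlWithDomPurify(DOMPurify, article.title).slice(0, 200);
```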
