Skip to content

🔒 Security Fix: SQL Injection Vulnerabilities in Database Clients#52

Open
ivosetyadi wants to merge 1 commit intobagofwords1:mainfrom
ivosetyadi:security/fix-sql-injection-vulnerabilities
Open

🔒 Security Fix: SQL Injection Vulnerabilities in Database Clients#52
ivosetyadi wants to merge 1 commit intobagofwords1:mainfrom
ivosetyadi:security/fix-sql-injection-vulnerabilities

Conversation

@ivosetyadi
Copy link
Contributor

🔒 Security Fix: SQL Injection Vulnerabilities in Database Clients

Summary

Fixed critical SQL injection vulnerabilities in three database client implementations where user-controlled values were being interpolated directly into SQL statements without proper escaping or parameterization.

Vulnerabilities Fixed

1. AWS Redshift Client (aws_redshift_client.py)

  • Location: Line 298
  • Issue: Schema names in SET search_path statement were concatenated without escaping
  • Fix: Now uses psycopg2.sql.Identifier for safe identifier quoting
  • Impact: Prevents SQL injection through schema name manipulation

2. DuckDB Client (duckdb_client.py)

  • Locations: Lines 121, 124, 180, 189, 201, 208, 235
  • Issue: Table/view names directly interpolated in:
    • CREATE VIEW statements
    • DESCRIBE statements
    • SELECT statements
  • Fix: Added _quote_identifier() helper method that properly escapes identifiers using DuckDB double-quote syntax
  • Impact: Prevents SQL injection through table/view name manipulation

3. PostgreSQL Client (postgresql_client.py)

  • Location: Line 54
  • Issue: Schema names in SET search_path concatenated without escaping
  • Fix: Now uses psycopg2.sql.Identifier with proper list handling for multiple schemas
  • Impact: Prevents SQL injection through schema name manipulation

Security Impact

Severity: HIGH

These vulnerabilities could have allowed attackers to:

  • Execute arbitrary SQL commands
  • Access unauthorized data
  • Modify or delete database contents
  • Escalate privileges

Attack vector required control over database/schema/table names, which could occur through:

  • User input in configuration
  • Data source connection parameters
  • Malicious data source metadata

Testing

  • ✅ Python syntax validation passed
  • ✅ All modified files compile successfully
  • ✅ Changes follow database-specific identifier quoting standards

Changes

  • Files modified: 3
  • Lines added: 41
  • Lines removed: 10

Diff Summary

backend/app/data_sources/clients/aws_redshift_client.py    | 7 ++++++-
backend/app/data_sources/clients/duckdb_client.py          | 29 ++++++++++++++++++++++-------
backend/app/data_sources/clients/postgresql_client.py      | 15 +++++++++++++--
3 files changed, 41 insertions(+), 10 deletions(-)

Checklist

  • Fixed SQL injection in AWS Redshift client
  • Fixed SQL injection in DuckDB client
  • Fixed SQL injection in PostgreSQL client
  • Added proper identifier quoting throughout
  • Syntax validation completed
  • Commit message follows conventions

Fixed SQL injection vulnerabilities in three database client implementations
where user-controlled values were being interpolated directly into SQL
statements without proper escaping or parameterization:

1. aws_redshift_client.py: Schema names in SET search_path statement
   - Now uses psycopg2.sql.Identifier for safe identifier quoting

2. duckdb_client.py: Multiple injection points including:
   - View/table names in CREATE VIEW, DESCRIBE, and SELECT statements
   - Added _quote_identifier() helper method for proper DuckDB identifier quoting

3. postgresql_client.py: Schema names in SET search_path statement
   - Now uses psycopg2.sql.Identifier with proper list handling

These vulnerabilities could have allowed attackers to execute arbitrary SQL
commands if they could control database/schema/table names. All identifiers
are now properly quoted to prevent injection attacks.

Security Impact: HIGH - Prevents potential data breach and unauthorized
database access

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@yochze
Copy link
Contributor

yochze commented Jan 6, 2026

@claude review

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This pull request addresses critical SQL injection vulnerabilities in three database client implementations by implementing proper identifier quoting and escaping mechanisms to prevent malicious SQL injection through schema, table, and view names.

Key Changes:

  • Added psycopg2.sql.Identifier-based quoting for PostgreSQL and AWS Redshift schema names in SET search_path statements
  • Implemented custom _quote_identifier() method in DuckDB client to safely escape table and view names
  • Applied identifier quoting across 7 locations in DuckDB client where table/view names are used in SQL statements

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 4 comments.

File Description
backend/app/data_sources/clients/postgresql_client.py Replaced string concatenation with psycopg2.sql.Identifier for search_path to prevent schema name SQL injection
backend/app/data_sources/clients/duckdb_client.py Added _quote_identifier() helper method and applied it to all table/view name usages in CREATE VIEW, DESCRIBE, and SELECT statements
backend/app/data_sources/clients/aws_redshift_client.py Replaced string concatenation with psycopg2.sql.Identifier for search_path with proper transaction commit

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +298 to +302
# Use parameterized query with identifier quoting to prevent SQL injection
from psycopg2 import sql
cursor.execute(
sql.SQL("SET search_path TO {}").format(sql.Identifier(self.schema))
)
Copy link

Copilot AI Jan 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The SQL injection fix for search_path should be covered by tests. Consider adding test cases that verify schema names containing special characters or SQL injection attempts are properly escaped using psycopg2.sql.Identifier and cannot be exploited.

Copilot uses AI. Check for mistakes.
search_path_sql = psql.SQL("SET search_path TO {}").format(
psql.SQL(", ").join(identifiers)
)
cursor.execute(search_path_sql)
Copy link

Copilot AI Jan 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The SET search_path command is executed on the raw psycopg2 connection without an explicit commit. In psycopg2, when not in autocommit mode, SET commands need to be committed to take effect. Consider adding raw_conn.commit() after line 65 to ensure the search_path setting persists for subsequent queries, similar to the approach used in the AWS Redshift client.

Suggested change
cursor.execute(search_path_sql)
cursor.execute(search_path_sql)
# Commit so that the SET search_path takes effect when autocommit is off
raw_conn.commit()

Copilot uses AI. Check for mistakes.
Comment on lines +47 to +57
def _quote_identifier(self, identifier: str) -> str:
"""Quote an identifier (table/view/column name) to prevent SQL injection.

In DuckDB, identifiers are quoted with double quotes, and any double quotes
within the identifier are escaped by doubling them.
"""
if identifier is None:
raise ValueError("Identifier cannot be None")
# Escape double quotes by doubling them
escaped = str(identifier).replace('"', '""')
return f'"{escaped}"'
Copy link

Copilot AI Jan 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new _quote_identifier method and its security implications should be covered by tests. Consider adding test cases that verify identifiers containing special characters (like double quotes, semicolons, or SQL keywords) are properly escaped and cannot be used for SQL injection. For example, test with table names like 'test"; DROP TABLE users; --' to ensure the quoting mechanism prevents injection attacks.

Copilot uses AI. Check for mistakes.
Comment on lines +52 to +65
# Use proper identifier quoting to prevent SQL injection
# PostgreSQL requires identifiers to be quoted individually
from psycopg2 import sql as psql
try:
conn.execute(text(f"SET search_path TO {search_path}"))
# Get the raw psycopg2 connection from SQLAlchemy
raw_conn = conn.connection.driver_connection
with raw_conn.cursor() as cursor:
# Build a list of quoted identifiers
identifiers = [psql.Identifier(schema) for schema in self._schemas]
# Join them with commas
search_path_sql = psql.SQL("SET search_path TO {}").format(
psql.SQL(", ").join(identifiers)
)
cursor.execute(search_path_sql)
Copy link

Copilot AI Jan 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The SQL injection fix for search_path should be covered by tests. Consider adding test cases that verify schema names containing special characters or SQL injection attempts (e.g., schemas like 'public; DROP TABLE users; --') are properly escaped using psycopg2.sql.Identifier and cannot be exploited.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants