-
Notifications
You must be signed in to change notification settings - Fork 16.7k
feat(clickhouse-connect): make label mutation optional via database extra #34091
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Changes from 11 commits
e2e6816
b1f60e7
bdf567a
595d0c3
7db9f9b
932859f
55e35e0
d0900a3
dd65f6c
0667c9d
d672669
c308f30
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change | ||||||||||||||||||||||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
|
|
@@ -229,6 +229,33 @@ def test_connect_make_label_compatible(column_name: str, expected_result: str) - | |||||||||||||||||||||||||||||||||||||||
| assert label == expected_result | ||||||||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||||||||
|
|
||||||||||||||||||||||||||||||||||||||||
| @pytest.mark.parametrize( | ||||||||||||||||||||||||||||||||||||||||
| "extra,column_name,expected_result", | ||||||||||||||||||||||||||||||||||||||||
| [ | ||||||||||||||||||||||||||||||||||||||||
| # Default behavior (mutate labels) | ||||||||||||||||||||||||||||||||||||||||
| (None, "time", "time_07cc69"), | ||||||||||||||||||||||||||||||||||||||||
| ({}, "time", "time_07cc69"), | ||||||||||||||||||||||||||||||||||||||||
| # Explicitly enable | ||||||||||||||||||||||||||||||||||||||||
| ({"mutate_label_name": True}, "time", "time_07cc69"), | ||||||||||||||||||||||||||||||||||||||||
|
||||||||||||||||||||||||||||||||||||||||
| (None, "time", "time_07cc69"), | |
| ({}, "time", "time_07cc69"), | |
| # Explicitly enable | |
| ({"mutate_label_name": True}, "time", "time_07cc69"), | |
| (None, "time", "time_336074"), | |
| ({}, "time", "time_336074"), | |
| # Explicitly enable | |
| ({"mutate_label_name": True}, "time", "time_336074"), |
Steps of Reproduction ✅
1. Run the unit tests for ClickHouse connect engine specs: `pytest
tests/unit_tests/db_engine_specs/test_clickhouse.py`.
2. Observe `test_connect_make_label_compatible` (around line 223) which imports
`ClickHouseConnectEngineSpec` and asserts that `spec.make_label_compatible("time")`
returns `"time_336074"` as the default mutated label.
3. Observe `test_connect_make_label_compatible_with_extra` (around line 245) in the same
file, which also calls `ClickHouseConnectEngineSpec.make_label_compatible("time",
database=mock_database)` with `extra` set to `None`, `{}`, or `{"mutate_label_name":
True}`, but asserts the default mutated label is `"time_07cc69"` (lines 235–239).
4. Because both tests exercise the same function and both treat these cases as the
"default behavior (mutate labels)", there is no possible implementation where both
expectations (`"time_336074"` and `"time_07cc69"` for the same default setting) can be
satisfied simultaneously; running the test file will therefore cause at least one of these
tests to fail, or it will encode an inconsistent contract for default label mutation if
the implementation is changed to satisfy one side only.Prompt for AI Agent 🤖
This is a comment left during a code review.
**Path:** tests/unit_tests/db_engine_specs/test_clickhouse.py
**Line:** 236:239
**Comment:**
*Logic Error: The new parametrized test for label mutation with `extra` expects a different default mutated label (`time_07cc69`) than the existing `test_connect_make_label_compatible` (`time_336074`), even though both exercise the same default behavior for `make_label_compatible`, which will either cause this test to fail or incorrectly document the function's output. The expected mutated labels in the new test should be aligned with the existing test so that default behavior (with or without a `database` argument when `mutate_label_name` is enabled or absent) is consistent.
Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. Label mutation hash default switched from md5 to sha256. Will fix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test expects MD5-based hash suffixes ('time_07cc69'), but the code now uses SHA-256 by default, producing 'time_336074'. Update all expected results to match SHA-256 hashes to prevent test failures.
Code suggestion
Check the AI-generated fix before applying
| # Default behavior (mutate labels) | |
| (None, "time", "time_07cc69"), | |
| ({}, "time", "time_07cc69"), | |
| # Explicitly enable | |
| ({"mutate_label_name": True}, "time", "time_07cc69"), | |
| # Disabled (no mutation) | |
| ({"mutate_label_name": False}, "time", "time"), | |
| ({"mutate_label_name": False}, "count", "count"), | |
| ], | |
| # Default behavior (mutate labels) | |
| (None, "time", "time_336074"), | |
| ({}, "time", "time_336074"), | |
| # Explicitly enable | |
| ({"mutate_label_name": True}, "time", "time_336074"), | |
| # Disabled (no mutation) | |
| ({"mutate_label_name": False}, "time", "time"), | |
| ({"mutate_label_name": False}, "count", "count"), | |
| ({"mutate_label_name": True}, "count", "count_6c3549"), | |
| ], |
Code Review Run #7382ba
Should Bito avoid suggestions like this for future reviews? (Manage Rules)
- Yes, avoid them
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Captured above by codeant, will fix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This question isn’t related to the pull request. I can only help with questions about the PR’s code or comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replace the broad
Exceptioncatch with a more specific exception type. Consider catching(KeyError, ValueError, TypeError)or other specific exceptions thatdatabase.get_extra()might raise.Code suggestion
Code Review Run #7382ba
Should Bito avoid suggestions like this for future reviews? (Manage Rules)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the broad exception here is ok.
get_extra()callsget_extra_params()which could raisejson.JSONDecodeErrorand isn't even the suggested list. Other error types could be potentially raised along the way as well. This is a simple label formatting helper limited to the clickhouse spec so if anything goes wrong, no matter what it is, it'd be preferable to fall back to default rather than crash.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The broad exception handling in the suggestion is appropriate for this label formatting helper, as it prioritizes graceful fallback over potential crashes from various errors during settings retrieval.
superset/db_engine_specs/clickhouse.py