Conversation

@ventusfortis
Contributor

@ventusfortis ventusfortis commented Dec 24, 2025

This commit adds a runOnPart method for Python imports, which enriches import nodes with missing location data (#5651) and also fixes the code property so that it holds the exact source text.
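For illustration only, here is a minimal Python sketch of what "location metadata" and "exact code" mean for an import statement. It uses the standard library `ast` module rather than pysrc2cpg's own parser, and the helper name `describe_imports` and the dictionary keys are hypothetical, not joern property names.

```python
import ast


def describe_imports(source: str) -> list[dict]:
    """Collect location data and the exact source text of every import statement."""
    tree = ast.parse(source)
    result = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            result.append({
                # Exact text of the statement as written in the source.
                "code": ast.get_source_segment(source, node),
                # Start and end position (1-based lines, 0-based columns).
                "line": node.lineno,
                "column": node.col_offset,
                "line_end": node.end_lineno,
                "column_end": node.end_col_offset,
            })
    return result


example = "import os\nfrom foo import sanitizerFoo as s\n"
for entry in describe_imports(example):
    print(entry)
```

The point of the sketch is only that the code value is taken verbatim from the source span instead of being re-rendered from the lowered representation.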

@ventusfortis ventusfortis changed the title from "Enrich Python import nodes with location metadata and exact code in the code property (#5651)" to "[pysrc2cpg] Enrich Python import nodes with location metadata and exact code in the code property (#5651)" Dec 24, 2025
@ventusfortis ventusfortis requested a review from maltek January 15, 2026 15:13
Contributor

@maltek maltek left a comment

Looks good! We're almost there.

To get past the code style checker, run sbt scalafmtAll and commit the changed code.

@ventusfortis ventusfortis requested a review from maltek January 15, 2026 16:28
@ventusfortis
Contributor Author

ventusfortis commented Jan 19, 2026

Tests are failing, but the failing tests don't seem to be affected by the changes in my commits =/ Correct me if I'm wrong.

@maltek
Contributor

maltek commented Jan 19, 2026

The failing tests are:

[error] 	io.joern.pysrc2cpg.dataflow.RegexDefinedFlowsDataFlowTests
[error] 	io.joern.pysrc2cpg.passes.TypeRecoveryPassTests

Given that this PR touches the representation of imports in Python, it seems likely to me that your change is responsible. Unfortunately, I'm not familiar with how the Python type recovery works either. I suspect there is some code there that's looking at the code field that you've changed.

ventusfortis and others added 2 commits January 26, 2026 13:34
Avoid treating import assignments as module globals after code formatting changes,
and seed call types for called ResolvedMember imports to restore methodFullName.
@ventusfortis
Contributor Author

There are a lot of changes in this commit because I screwed up and didn't track them; I should've made multiple commits... But the changes are as follows:

After syncing with master and changing Python import .code to the exact source text, two groups of tests broke: regex dataflow semantics (sanitizerFoo/create_sanitizer) and module variable resolution (appV2 / referencing members). The root cause was not the .code itself, but how import lowering interacts with type recovery and module‑level variable detection.

First, calls created from the import statement from foo import sanitizerFoo were still tagged as ResolvedMember, so methodFullName stayed <unknownFullName> and the regex semantics did not apply. To fix this, I added a small piece of logic in XTypeRecovery for ResolvedMember to register an alias to the base module. Then, in PythonTypeRecovery, I added Python‑specific handling: if an imported member is actually called in the module, that call is seeded directly with basePath.memberName, so the call gets a real methodFullName without changing global behavior for other frontends.
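As a rough illustration of the seeding idea described above, here is a toy Python model. It is not the Scala code in XTypeRecovery or PythonTypeRecovery: the classes `ImportedMember` and `Call` and the function `seed_called_imports` are hypothetical, and the full-name format simply follows the basePath.memberName wording, which may differ from what joern actually emits.

```python
from dataclasses import dataclass

UNKNOWN_FULL_NAME = "<unknownFullName>"


@dataclass
class ImportedMember:
    base_path: str    # module the member is imported from, e.g. "foo"
    member_name: str  # name inside that module, e.g. "sanitizerFoo"
    alias: str        # name bound in the importing module


@dataclass
class Call:
    name: str
    method_full_name: str = UNKNOWN_FULL_NAME


def seed_called_imports(imports, calls):
    """Give calls to imported members a concrete full name; leave other calls alone."""
    by_alias = {imp.alias: imp for imp in imports}
    for call in calls:
        imp = by_alias.get(call.name)
        if imp is not None:
            # Assumed format: base path and member name joined by a dot.
            call.method_full_name = f"{imp.base_path}.{imp.member_name}"
    return calls


imports = [ImportedMember("foo", "sanitizerFoo", "sanitizerFoo")]
calls = seed_called_imports(imports, [Call("sanitizerFoo"), Call("print")])
for call in calls:
    print(call.name, "->", call.method_full_name)
```

Only calls whose name matches an imported alias are touched, which mirrors the intent of keeping the behavior unchanged for everything else.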

Second, module variable tests failed because module‑level import assignments were now incorrectly treated as globals: the old check skipped imports by matching import( in the RHS code, but after the .code change that string is no longer present. I replaced that string heuristic with an AST‑based check (the RHS is a call named import), so import assignments are no longer turned into module members. This restores correct module variable origins for appV2 and fixes referencingMembers.
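The difference between the two checks can be sketched with a toy model in Python. The node classes and both function names are hypothetical, and the "old" code string is only an assumed example of a lowered form containing import(; the real check is Scala code in the frontend.

```python
from dataclasses import dataclass


@dataclass
class Call:
    name: str  # name of the called function
    code: str  # text stored in the node's code field


@dataclass
class Assignment:
    target: str
    rhs: object  # a Call, or any other expression node


def is_import_by_string(assignment) -> bool:
    """Old heuristic: look for the text 'import(' in the RHS code."""
    return "import(" in getattr(assignment.rhs, "code", "")


def is_import_by_structure(assignment) -> bool:
    """Structural check: the RHS is a call node whose name is 'import'."""
    return isinstance(assignment.rhs, Call) and assignment.rhs.name == "import"


# Before the change: the code field held a lowered form (assumed example).
old = Assignment("foo", Call("import", "import(, foo)"))
# After the change: the code field holds the exact source text.
new = Assignment("foo", Call("import", "import foo"))

for label, node in (("old", old), ("new", new)):
    print(label, is_import_by_string(node), is_import_by_structure(node))
```

The structural check does not depend on how the code field is rendered, so it keeps working when the displayed text changes.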

With these changes, sbt "project pysrc2cpg" test passes again.

@ventusfortis ventusfortis requested a review from maltek January 26, 2026 12:34
@ventusfortis
Contributor Author

Alright, I forgot to test the rest of the frontends and broke something... Back to work


2 participants