fix collect freesurfer #3588

bpinsard · 2025-12-16T16:33:20Z

Changes proposed in this pull request

Currently when preprocessing fMRI only dataset using precomputed sMRIPrep+freesurfer (--fs-no-resume) and passing --session-label yy, the fs_subject_id ends up being sub-xx_ses-yy using the fMRI selected session, and thus it picks files from that folder. I noticed it because my jobs was only pre-fetching with datalad freesurfer/sub-xx folder from a precomputed longitudinal freesurfer, but the pipeline tried using freesurfer files from sub-xx_ses-yy instead, which were broken symlinks.

Not sure what is the best approach, but it seems that it should only try to get session-specific freesurfer when using --subject-anatomical-reference sessionwise, but it might require another case for first-lex.

I am not 100% sure what are the outputs when you using this option:

session-wise -> freesurfer/sub-xx_ses-yy/ ?
first-lex -> freesurfer/sub-xx or same as sessionwise
unbiased -> freesurfer/sub-xx I assume

Documentation that should be reviewed

codecov · 2025-12-16T16:47:04Z

Codecov Report

❌ Patch coverage is 0% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 72.85%. Comparing base (55087d5) to head (6952f35).

Files with missing lines	Patch %	Lines
fmriprep/workflows/base.py	0.00%	1 Missing and 1 partial ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #3588      +/-   ##
==========================================
- Coverage   72.88%   72.85%   -0.04%     
==========================================
  Files          60       60              
  Lines        4821     4823       +2     
  Branches      626      627       +1     
==========================================
  Hits         3514     3514              
- Misses       1164     1165       +1     
- Partials      143      144       +1

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

tsalo · 2025-12-16T17:04:51Z

Given that, even if you use first-lex or unbiased, you can still select specific sessions with --session-label, I think the Freesurfer subject ID needs to include the session label regardless. When running the newer versions of fMRIPrep with older Freesurfer derivatives, I have created symlinks from sub-<label> to sub-<label>_ses-<label> to make it work.

bpinsard · 2025-12-16T17:23:38Z

Do you mean that this change would break the pipeline when not using precomputed freesurfer?

If I understand correctly the code, if using unbiased or first-lex BIDSSourceFile will generate a source_file that removes the ses- entity if multiple T1w (T2w) are used (even if all from the same session), so the sMRIPrep and freesurfer outputs will not contain that entity (even for first-lex !).
So reusing these outputs for both cases for a new fMRI only session would fail without this fix, as it will try to get freesurfer output with that session entity.
So the way I see it is that --session-label is for fMRI only, while --subject-anatomical-reference (and bids_filters) control what is used to build the anatomical reference.

In my case I cannot simply use symlinks, as these session-level folder already exists as it is a true longitudinal freesurfer (including sub-xx_ses-yy.long too) that was run separately (though using sMRIPrep brain extraction as it is more robust).

Covering all these cases with tests would be tricky, but important.

tsalo · 2025-12-16T17:30:38Z

I think the current changes would break things when using --subject-anatomical reference unbiased or first-lex with --session-label. I don't think --session-label is supposed to only filter fMRI sessions, but I could be wrong. I can imagine people wanting to run fMRIPrep on a subset of sessions, using the anatomical data from just those sessions.

tsalo · 2025-12-16T17:31:30Z

fmriprep/workflows/base.py

        (about, ds_report_about, [('out_report', 'in_file')]),
    ])  # fmt:skip

+    if config.workflow.subject_anatomical_reference == 'sessionwise':


What about something like this?

Suggested change

if config.workflow.subject_anatomical_reference == 'sessionwise':

if (config.workflow.subject_anatomical_reference == 'sessionwise') or (config.workflow.session_label is not None):

bpinsard · 2025-12-16T19:37:41Z

I think the current changes would break things when using --subject-anatomical reference unbiased or first-lex with --session-label. I don't think --session-label is supposed to only filter fMRI sessions, but I could be wrong. I can imagine people wanting to run fMRIPrep on a subset of sessions, using the anatomical data from just those sessions.

I think it would not break that usecase.

Trying to branch all the possibility, this is complex.

if no precomputed sMRIPrep+freesurfer (haven't tested it):
- "unbiased" or "first-lex" : if multiple T1w from one or multiple sessions, with or without --session-label -> fs_id has no ses- entity currently (related to comment here [FIX] ravel multi-echo data for BIDSSourceFile with precomputed smriprep #3583 (comment) ).
- "sessionwise": with one or multiple T[12]w per session _create_multi_source_file keeps the ses- entity, freesurfer output gets name accordingly.
if precomputed + sessionwise
if precomputed given with a freesurfer subjects-dir/sub-xx (output from first-lex or unbiased above or custom):
- "unbiased" or "first-lex":
  - with or without --session-label
  - if no anat in BIDS and a single bold file _create_multi_source_file keeps the -ses entity in the filename, if multiple bold runs drops it (including for multi-echo with the latest fix) generating an invalid bids path (ses- folder, no ses- in file), so as @mgxd mentioned this function is problematic. Either way niworkflows.interfaces.bids.BIDSInfo extract the session entity from the path, from the first BOLD (or anat) file that is then used to generated the fs_subject_id, which points to non-existent folder. Unless I am missing something that logic is flawed.

> niworkflows.interfaces.bids.BIDSInfo(in_file='/path/to/bids/sub-01/ses-001/func/sub-01_bold.nii.gz',bids_validate=True).run().outputs

acquisition = <undefined>
reconstruction = <undefined>
run = <undefined>
session = 001
subject = 01
suffix = bold
task = <undefined>

(BTW: bids_validate is not used anywhere in that interface)

Anyway, sorry I am getting lost in the multiverse, just to say that the logic from command-line options -> select files -> select single representative file / generic source (with the broken _create_multi_source_file ) -> extract BIDS entities -> generate fs_subject_id is very convoluted, brittle and might result in questionable output naming that are not reusable for "apply".

AFAIK the freesurfer outputs naming is not covered by tests in either fMRIPrep or sMRIPrep for circleCI or pytests, but I know running full freesurfer in tests would be a pain, unless we use --fs-no-resume and check that the workflow finds what it needs.

tsalo · 2025-12-19T17:37:50Z

The scenario I was imagining was a longitudinal dataset (let's say 4 sessions), and running fMRIPrep using either --subject-anatomical reference unbiased or first-lex: once with --session-label 01 02 and once with --session-label 03 04. In that scenario, I think using just the subject ID as the Freesurfer subject ID would mean that the --session-label 03 04 run would use the Freesurfer derivatives from the --session-label 01 02 run.

Either way niworkflows.interfaces.bids.BIDSInfo extract the session entity from the path, from the first BOLD (or anat) file that is then used to generated the fs_subject_id, which points to non-existent folder. Unless I am missing something that logic is flawed.

That is definitely an issue when you update your fMRIPrep version. It shouldn't be a problem if you're rerunning with the same version though, right?

AFAIK the freesurfer outputs naming is not covered by tests in either fMRIPrep or sMRIPrep for circleCI or pytests, but I know running full freesurfer in tests would be a pain, unless we use --fs-no-resume and check that the workflow finds what it needs.

I think that's a great idea! I had a PR that raises an error if --fs-no-resume is called and the Freesurfer subject folder doesn't exist (nipreps/smriprep#504), which should available in fMRIPrep tests since fMRIPrep uses sMRIPrep at main.

bpinsard · 2025-12-19T21:11:26Z

The scenario I was imagining was a longitudinal dataset (let's say 4 sessions), and running fMRIPrep using either --subject-anatomical reference unbiased or first-lex: once with --session-label 01 02 and once with --session-label 03 04. In that scenario, I think using just the subject ID as the Freesurfer subject ID would mean that the --session-label 03 04 run would use the Freesurfer derivatives from the --session-label 01 02 run.

What output do you envision in smriprep and freesurfer for that usecase?
My understanding is that smriprep will not include ses- entity so same outputs for both pipelines.
For freesurfer, currently:

first-lex or unbiased
single T1w -> freesurfer contains the ses- entity ** with this PR it is removed **
multiple T1w (even if all in a single session) no ses- entity
So the PR makes it more consistent, in particular if some subject have 1 T1w, others have multiple.

Maybe we should have sub-xx_ses-01+02 and sub-xx_ses-03+04 as freesurfer outputs for unbiased and sub-xx_ses-01 and sub-xx_ses-03 for first-lex ?
The main issue is when outputs are reused in an apply pipeline (or a --derivatives pipeline on new fMRI), grabbing these will be trickier with these convoluted naming.
For example, if you then want to run --session-label 05 which does not have a T1w, it will try to grab freesurfer sub-xx if there are multiple T1w or sub-xx_ses-01 if there was only a single T1w in the first session for that subject across all sessions, while grabbing sub-xx makes more sense (and can be symlinked). Or we need a logic to first check for a sub-xx then with the session-label, then glob any.

For the usecase you describe, the simpler would be to have separate outputs derivatives folders, as sMRIPrep outputs will collide, and freesurfer will too with this fix.

Either way niworkflows.interfaces.bids.BIDSInfo extract the session entity from the path, from the first BOLD (or anat) file that is then used to generated the fs_subject_id, which points to non-existent folder. Unless I am missing something that logic is flawed.

That is definitely an issue when you update your fMRIPrep version. It shouldn't be a problem if you're rerunning with the same version though, right?

No, with the same version, for apply and derivatives reuse, as described above.

AFAIK the freesurfer outputs naming is not covered by tests in either fMRIPrep or sMRIPrep for circleCI or pytests, but I know running full freesurfer in tests would be a pain, unless we use --fs-no-resume and check that the workflow finds what it needs.

I think that's a great idea! I had a PR that raises an error if --fs-no-resume is called and the Freesurfer subject folder doesn't exist (nipreps/smriprep#504), which should available in fMRIPrep tests since fMRIPrep uses sMRIPrep at main.

The main issue is that a lot of the tests require workflow runtime, to check that the files it tries to grab makes sense. So for it to be lightweight, we should have a separated workflow that only grab files. Or maybe monkeypatch the heavy compute subworfklows.

fix collect freesurfer longitudinal

6952f35

tsalo reviewed Dec 16, 2025

View reviewed changes

bpinsard mentioned this pull request Feb 11, 2026

--derivatives and --session-label and bids filters conflicts #3581

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix collect freesurfer #3588

fix collect freesurfer #3588

Uh oh!

bpinsard commented Dec 16, 2025

Uh oh!

codecov bot commented Dec 16, 2025

Uh oh!

tsalo commented Dec 16, 2025

Uh oh!

bpinsard commented Dec 16, 2025

Uh oh!

tsalo commented Dec 16, 2025

Uh oh!

tsalo Dec 16, 2025

Uh oh!

bpinsard commented Dec 16, 2025

Uh oh!

tsalo commented Dec 19, 2025

Uh oh!

bpinsard commented Dec 19, 2025 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

	if config.workflow.subject_anatomical_reference == 'sessionwise':
	if (config.workflow.subject_anatomical_reference == 'sessionwise') or (config.workflow.session_label is not None):

fix collect freesurfer #3588

Are you sure you want to change the base?

fix collect freesurfer #3588

Uh oh!

Conversation

bpinsard commented Dec 16, 2025

Changes proposed in this pull request

Documentation that should be reviewed

Uh oh!

codecov bot commented Dec 16, 2025

Codecov Report

Uh oh!

tsalo commented Dec 16, 2025

Uh oh!

bpinsard commented Dec 16, 2025

Uh oh!

tsalo commented Dec 16, 2025

Uh oh!

tsalo Dec 16, 2025

Choose a reason for hiding this comment

Uh oh!

bpinsard commented Dec 16, 2025

Uh oh!

tsalo commented Dec 19, 2025

Uh oh!

bpinsard commented Dec 19, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

bpinsard commented Dec 19, 2025 •

edited

Loading