
Conversation

@LOCEANlloydizard (Collaborator):

This PR introduces the public API and output schema for single target detection! Ref #1532

It focuses on:

  • defining the function signature and dispatcher integration
  • validating required inputs and parameters
  • returning a stable xr.Dataset with a target-based schema
  • issuing warnings when optional metadata are missing

This provides a foundation for upcoming algorithm development and allows early review of the API design, data model, and test coverage!
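
For context, here is a minimal sketch of the kind of target-based schema described above: an xr.Dataset indexed by a target dimension, shown with zero targets as a stub would return it. The variable names below are illustrative placeholders, not the exact schema introduced in this PR.

import numpy as np
import xarray as xr

# Sketch only: a zero-target Dataset with a stable, target-based schema.
# Variable names are placeholders, not the PR's actual output schema.
ds_targets = xr.Dataset(
    data_vars={
        "TS_comp": ("target", np.empty(0, dtype=float)),
        "target_range": ("target", np.empty(0, dtype=float)),
    },
    coords={"target": np.empty(0, dtype=int)},
)
print(ds_targets.sizes)  # Frozen({'target': 0})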

@codecov-commenter commented Dec 19, 2025

Codecov Report

❌ Patch coverage is 53.92405% with 182 lines in your changes missing coverage. Please review.
✅ Project coverage is 83.29%. Comparing base (d371686) to head (a6a8282).
⚠️ Report is 24 commits behind head on main.

Files with missing lines                                  Patch %    Lines
...ype/mask/single_target_detection/detect_matecho.py     52.61%     181 Missing ⚠️
echopype/mask/api.py                                      90.90%     1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1588      +/-   ##
==========================================
- Coverage   84.95%   83.29%   -1.67%     
==========================================
  Files          79       80       +1     
  Lines        6998     7391     +393     
==========================================
+ Hits         5945     6156     +211     
- Misses       1053     1235     +182     
Flag         Coverage Δ
unittests    83.29% <53.92%> (-1.67%) ⬇️

Flags with carried forward coverage won't be shown.


@leewujung self-requested a review on December 20, 2025 18:04
@LOCEANlloydizard changed the title from "Single target detection API and output schema" to "Single target detection" on Dec 23, 2025
Comment on lines +2139 to +2162
expected_vars = [
"TS_comp",
"TS_uncomp",
"Target_range",
"Target_range_disp",
"Target_range_min",
"Target_range_max",
"idx_r",
"idx_target_lin",
"pulse_env_before_lin",
"pulse_env_after_lin",
"PulseLength_Normalized_PLDL",
"Transmitted_pulse_length",
"Angle_minor_axis",
"Angle_major_axis",
"StandDev_Angles_Minor_Axis",
"StandDev_Angles_Major_Axis",
"Heave",
"Roll",
"Pitch",
"Heading",
"Dist",
"TSfreq_matrix",
]
Member:
changing these to use snake_case?

@leewujung (Member) commented Dec 31, 2025:
and for consistency, how about following the pattern identifier_modifier (see the sketch after this list), so:

  • "target_range_{min, max}"
  • "angles_{minor, major}_axis"
  • "angles_{minor, major}_axis_std_dev"
  • ...
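
As an illustration of that convention, here is a hypothetical rename mapping (a sketch only; the final names are up for discussion) applied to a toy stand-in for the output dataset:

import numpy as np
import xarray as xr

# Toy stand-in for the detection output; only the variable names matter here.
ds_targets = xr.Dataset(
    {
        name: ("target", np.zeros(3))
        for name in [
            "Target_range_min",
            "Target_range_max",
            "Angle_minor_axis",
            "Angle_major_axis",
            "StandDev_Angles_Minor_Axis",
            "StandDev_Angles_Major_Axis",
        ]
    }
)

# Hypothetical identifier_modifier snake_case mapping
rename_map = {
    "Target_range_min": "target_range_min",
    "Target_range_max": "target_range_max",
    "Angle_minor_axis": "angles_minor_axis",
    "Angle_major_axis": "angles_major_axis",
    "StandDev_Angles_Minor_Axis": "angles_minor_axis_std_dev",
    "StandDev_Angles_Major_Axis": "angles_major_axis_std_dev",
}
ds_targets = ds_targets.rename(rename_map)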

Member:
I'll put my comments for the variables themselves in detect_matecho.py

REQ = {
"channel": "chan1",
"pulse_length": 1e-3, # 1 ms dummy
"timeSampleInterval_usec": 40.0, # dummy
Member:
Not sure where this would be used, but maybe also change this to use snake_case?

Comment on lines +101 to +102
Ping_number=("target", _arr("Ping_number", int)),
Time=("target", _arr("Time", np.dtype("datetime64[ns]"))),
Member:
I don't see any variables above aligned with Ping_number or Time, so I think these are redundant and can be removed?

Also, ping number and time carry the same information, unless the timestamps are not derived from ping time (though for a ping-by-ping method like this, I think they are?)

@leewujung (Member) left a comment:
Hey @LOCEANlloydizard: Thanks for putting together this PR for discussion! I have a few overarching suggestions:

  • use an xarray data container (DataArray or Dataset) instead of a dict when possible -- for example, the output of _compute_core_matrices would work well as an xr.Dataset rather than a Dict (see the sketch after this list). BUT see my comment below about that function [1].
  • add more inline code comments to enhance code readability
  • using xarray operations would help with code readability too, compared to vanilla numpy operations, since xarray is label-aware and the operations are easier to follow.
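
On the first point, here is a minimal sketch of what that could look like, assuming the core matrices share ping_time/range_sample dimensions (the function and dimension names are illustrative, not the PR's actual internals):

import xarray as xr

# Sketch only: return a labeled Dataset instead of a plain dict of arrays.
def _core_matrices_as_dataset(ts_comp, ts_uncomp, ping_time, range_sample):
    return xr.Dataset(
        data_vars={
            "TS_comp": (("ping_time", "range_sample"), ts_comp),
            "TS_uncomp": (("ping_time", "range_sample"), ts_uncomp),
        },
        coords={"ping_time": ping_time, "range_sample": range_sample},
    )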

I have some thoughts on the relationship between _cond_8_build_outputs and detect_matecho that would affect the implementation:

If I read the code right, the double loop in _cond_8_build_outputs goes through all pings (ip) that contain detected single targets, and then through all ranges (pidx) of the detected targets within each ping. The out assembly and refinement does 3 things:

  1. "flatten" the ping-range structure to just a 1-D array indexed by target
  2. refine range in meters, though it's still index-based and not on a continuous scale
  3. compute some derived variables, such as positionBeamCoord

So, at the core the single target detection output has 1 dimension target (which is also a coordinate), and 2 variables: the index number along ping_time, and the index number along range_sample (or range_meter, depending on what the detection is operating on [2]). All other information can be extracted or computed from the original dataarray the detection runs on, using these 2 index numbers for each target. isel or sel allow passing multiple indices at once, so this seems more efficient than looping (i.e., vectorizing the code).
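
A sketch of that vectorized extraction (all names are illustrative): wrapping the two per-target index arrays as DataArrays that share a "target" dimension makes a single isel return one value per target instead of an outer product, which replaces the double loop.

import numpy as np
import xarray as xr

# Sketch only: toy stand-ins for the dataset the detection runs on
# and the two per-target index arrays from the flattening step.
ds = xr.Dataset({"TS": (("ping_time", "range_sample"), np.random.rand(20, 500))})
ping_idx = xr.DataArray([3, 3, 10], dims="target")       # index along ping_time
range_idx = xr.DataArray([120, 355, 42], dims="target")  # index along range_sample

# Pointwise (vectorized) selection: both indexers share the "target" dim,
# so the result is 1-D along "target".
ts_at_targets = ds["TS"].isel(ping_time=ping_idx, range_sample=range_idx)
# ts_at_targets.dims == ("target",), length 3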

If the above is true, how about breaking things up into 2 steps? The first function focuses on getting this "core" single target detection output that's flattened (outputA), and the second function focuses on extracting and computing the required values in the full output (outputB)? Then users can decide if they want to store outputA and stop there, or proceed to compute outputB.
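
A rough sketch of that split (the function names and signatures here are hypothetical, just to show the shape of the workflow):

# Sketch only: hypothetical two-step structure.
def detect_single_targets_core(ds, params):
    """Step 1 (outputA): flattened result with a single "target" dimension,
    holding the ping_time index and range_sample index of each detected target."""
    ...


def expand_single_targets(ds, output_a):
    """Step 2 (outputB): use the two index variables in outputA to extract or
    compute TS, angles, attitude, range in meters, etc. (e.g. via vectorized isel)."""
    ...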

The reasons why I think it may be useful to separate outputA and outputB are:

  • it gives a good progression in what the code is trying to do in a workflow
  • derived values in outputB likely change more dramatically if calibration or something else changes in the source dataset, but outputA is less variable
  • I am not exactly sure about this one: this may give us a way to scale the single target detection operation in the future, along with the idea that storing an intermediate data product to disk and then reading it back for lazy computation is more scalable than trying to do everything in memory at once (a small sketch of that pattern follows below).
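
On that last point, a small sketch of the store-then-lazily-reload pattern for the assumed outputA (the path and variable names are placeholders):

import numpy as np
import xarray as xr

# Sketch only: a toy stand-in for the compact outputA
output_a = xr.Dataset(
    {
        "ping_time_idx": ("target", np.array([3, 3, 10])),
        "range_sample_idx": ("target", np.array([120, 355, 42])),
    }
)
output_a.to_zarr("single_targets_core.zarr", mode="w")    # persist the compact product
output_a_lazy = xr.open_zarr("single_targets_core.zarr")  # lazy, dask-backed reload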

I am also fine with keeping outputA just as an intermediate product within detect_matecho and not exposing it to the user, though I think it's good to have a structure that's easy to follow.


[1] _compute_core_matrices computes quantities that largely overlap with current echopype compute_Sv and compute_TS functions, and I think it's better to use those directly (and fix any errors there if any!)
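
For reference, a sketch of leaning on the existing calibration functions instead (the file path and sonar model below are placeholders; exact arguments depend on the instrument and data):

import echopype as ep

# Sketch only: reuse existing echopype calibration outputs rather than
# recomputing calibrated quantities inside _compute_core_matrices.
ed = ep.open_raw("survey.raw", sonar_model="EK60")
ds_Sv = ep.calibrate.compute_Sv(ed)   # volume backscattering strength (+ sample range)
ds_TS = ep.calibrate.compute_TS(ed)   # target strength
# The detection could then operate on these datasets directly, including the
# sample-range variable stored in the Sv dataset (see [2]).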

[2] I didn't read in detail to see if the current code includes the range decimation you mentioned earlier, so this comment may have no bearing: as part of compute_Sv/TS there's a function that computes sample range and stores it in the range_meter variable in the Sv dataset. Could we compute the refined range from there? This is so that we don't have redundant code doing the same thing scattered in different parts of the codebase, which would make maintenance challenging.

- Remove preliminary Matecho algorithm logic
- Introduce a stub implementation for single-target detection
- Define and validate the public API (inputs, params, outputs)
- Return a zero-target xr.Dataset with a stable schema
- Add parameter and dataset validation checks
- Prepare ground for future algorithmic implementation
fix tests to match updated single-target detection code
