perf: fix N+1 query issue in stops-for-agency handler#290
Conversation
Replace loop-based queries with batch operations in buildStopsListForAgency. Previously executed 2N+1 queries, now executes only 3 queries regardless of stop count. - Use GetStopsByIDs() for batch stop fetching - Use GetRouteIDsForStops() for batch route associations - Build route mapping in memory instead of querying per stop Fixes performance issue where agencies with 5000+ stops would timeout due to excessive database queries.
|
@aaronbrethorst |
aaronbrethorst
left a comment
There was a problem hiding this comment.
Hey Aditya! Excellent work on tackling the N+1 query problem. Reducing from 10,000+ queries to 3 queries is a massive performance win.
Summary
This PR refactors buildStopsListForAgency to use batch queries (GetStopsByIDs and GetRouteIDsForStops) instead of iterating and querying each stop individually. This follows the same pattern established in stops_for_location_handler.go.
Critical Issues
1. Double-prefixed route IDs (pre-existing bug, reproduced in new code)
Location: internal/restapi/stops_for_agency_handler.go:102-105
for _, row := range routeIDsRows {
if rid, ok := row.RouteID.(string); ok {
routesByStop[row.StopID] = append(routesByStop[row.StopID], utils.FormCombinedID(agencyID, rid))
}
}The GetRouteIDsForStops query already returns combined IDs (see gtfsdb/query.sql:534):
routes.agency_id || '_' || routes.id AS route_idSo rid is already "agencyID_routeID". Calling FormCombinedID(agencyID, rid) produces "agencyID_agencyID_routeID".
Note: This bug existed in the old code too (FormCombinedID(agencyID, id.(string))), so the PR preserves existing behavior. However, since you're refactoring this code anyway, now is an ideal time to fix it.
Fix: Use rid directly, like stops_for_location_handler.go does:
for _, row := range routeIDsRows {
if rid, ok := row.RouteID.(string); ok {
routesByStop[row.StopID] = append(routesByStop[row.StopID], rid) // Already combined
}
}Important Issues
1. Inconsistent indentation in early return
Location: internal/restapi/stops_for_agency_handler.go:83-86
// If no stops, return empty list
if len(stopIDs) == 0 {
return []models.Stop{}, nil
}The first line uses tabs, but lines 84-86 use spaces. This should be consistent (use tabs per Go convention).
2. Behavior change: errors now propagate instead of being silently skipped
Location: internal/restapi/stops_for_agency_handler.go:89-98
The old code silently skipped stops that failed to fetch:
stop, err := api.GtfsManager.GtfsDB.Queries.GetStop(ctx, stopID)
if err != nil {
continue // Silent skip
}The new code returns errors to the caller:
stops, err := api.GtfsManager.GtfsDB.Queries.GetStopsByIDs(ctx, stopIDs)
if err != nil {
return nil, err // Propagates error
}This is actually a better behavior—failing fast is preferable to silently degraded responses. Just noting it as a behavioral change that's intentional and correct.
Fit and Finish
1. Trailing space in comment
Location: internal/restapi/stops_for_agency_handler.go:83
// If no stops, return empty listRemove the trailing space after "list".
Strengths
- Significant performance improvement: 2N+1 queries → 3 queries is a major win
- Follows established patterns: Matches the approach in
stops_for_location_handler.go - Clean logic flow: Fetch all data upfront, then process in memory
- Good error handling: Propagates errors instead of silently degrading
Recommendation
Please fix the double-prefixed route IDs (use rid directly instead of wrapping in FormCombinedID) and the indentation issue. The route ID fix actually improves correctness beyond what the old code did.
After those fixes, this is ready to merge.
Great performance optimization work!
|
@aaronbrethorst |
aaronbrethorst
left a comment
There was a problem hiding this comment.
Hey Aditya, you've done a great job addressing all the feedback from the first round. The double-prefixed route ID fix is correct, the indentation is clean, and the code now follows the exact same batch query pattern used in stops_for_location_handler.go and stops_for_route_handler.go.
There's nothing left to change—this PR is ready to merge. Reducing 10,000+ individual queries down to 3 batch queries is a big performance win, and the fail-fast error handling is a nice improvement over the old silent continue behavior. Well done!

Description:
This PR optimizes the
stops-for-agencyendpoint by replacing iterative database queries with batch operations, significantly improving performance for agencies with large numbers of stops.Closes #289
Motivation:
Previously,
buildStopsListForAgencysuffered from an N+1 query problem. It iterated through every stop ID and executed two separate SQL queries (GetStopandGetRouteIDsForStop) for each one. For an agency with 5,000 stops, this resulted in over 10,000 database queries for a single API request.Changes:
Refactored
buildStopsListForAgencyininternal/restapi/stops_for_agency_handler.goto:GetStopsByIDsandGetRouteIDsForStopsto fetch all data in just 2 queries, regardless of the number of stops.if len(stopIDs) == 0to prevent unnecessary DB calls for empty lists.Verification:
stops_for_location_handler.go).make testpasses.