Added recycle pickup from separate source URLs now that recycling is handled by a different company
Pull request overview
This PR updates the Toronto (ON) waste collection source to fetch recycling pickup dates from separate Recycle Coach endpoints, reflecting the recycling service moving to a different provider.
Changes:
- Split the existing single fetch flow into separate recycling and waste fetch paths.
- Added Recycle Coach address/zone schedule lookups for recycling entries.
- Refactored waste CSV parsing into a helper and standardized `Collection` creation.
```python
resp = session.get(
    RECYCLE_PROPERTY_LOOKUP_URL,
    params={"term": self._street_address, "projects": RECYCLE_PROPERTY_LOOKUP_CIRCMATONT}
)
```
The recycling property lookup request does not set a timeout. Other network calls in this source use timeout=30; without a timeout this can hang indefinitely and stall updates/tests. Add an explicit timeout here as well.
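A minimal sketch of the suggested fix, assuming the call shape shown in the snippet above (the URL and project constant values here are placeholders, not taken from the actual source):

```python
# Placeholder values for illustration only; the real constants live in the source module.
RECYCLE_PROPERTY_LOOKUP_URL = "https://example.com/api/v1/property-lookup"
RECYCLE_PROPERTY_LOOKUP_CIRCMATONT = "example-project-id"


def lookup_property(session, street_address):
    # Pass an explicit timeout so a stalled endpoint cannot hang the
    # update indefinitely; 30s matches the other network calls in this source.
    resp = session.get(
        RECYCLE_PROPERTY_LOOKUP_URL,
        params={"term": street_address, "projects": RECYCLE_PROPERTY_LOOKUP_CIRCMATONT},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()
```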
```python
return []

prop = results[0]
zone_id = f"zone-{next(iter(prop['zones'].values()))}"
```
zone_id is built from only the first value in prop['zones']. Recycle Coach zone identifiers can require multiple zone parts (e.g., zone-z123-z456); dropping the additional entries can produce an invalid/incorrect schedule. Build the full zone_id from all entries in prop['zones'] (similar to recyclecoach_com.Source._build_zone_string).
Suggested change:

```diff
-zone_id = f"zone-{next(iter(prop['zones'].values()))}"
+zones = prop.get("zones", {})
+zone_parts = [str(v) for v in zones.values() if v]
+if not zone_parts:
+    return []
+zone_id = "zone-" + "-".join(zone_parts)
```
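As a standalone helper, the suggested logic above could be sketched like this (the function name is hypothetical; the zone dict shape is inferred from the review comment's `zone-z123-z456` example):

```python
def build_zone_id(prop):
    """Join every zone part into a single Recycle Coach zone id.

    Assumes `prop` is the property dict returned by the lookup, with a
    'zones' mapping whose values are the individual zone parts.
    """
    zone_parts = [str(v) for v in prop.get("zones", {}).values() if v]
    if not zone_parts:
        return None  # no usable zone information
    return "zone-" + "-".join(zone_parts)
```

For example, `{'zones': {'collection': 'z123', 'district': 'z456'}}` yields `zone-z123-z456`, whereas the original `next(iter(...))` expression would silently drop `z456`.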
```python
base_date = datetime.strptime(row[week_key], "%Y-%m-%d")
for w_type in VALID_WASTE_TYPES:
    day_code = row.get(w_type)
    if not day_code is None and day_code in days_map:
```
The CSV day-code check treats an empty string as “in” days_map (because '' in 'MTWRFSX' is true), which can incorrectly schedule a pickup (and index('') resolves to 0). Ensure the value is a single-letter day code (e.g., isinstance(day_code, str) and len(day_code)==1 and day_code in days_map) before calculating the offset.
Suggested change:

```diff
-if not day_code is None and day_code in days_map:
+if isinstance(day_code, str) and len(day_code) == 1 and day_code in days_map:
```
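The pitfall is easy to demonstrate: Python's substring containment treats the empty string as a member of any string, and `index('')` returns 0, so an empty CSV cell would map to the first day. A small sketch of the guarded check from the suggestion above (`days_map` is assumed to be the `'MTWRFSX'` string mentioned in the comment):

```python
days_map = "MTWRFSX"  # day codes from the review comment

# The original check passes for an empty cell:
assert "" in days_map            # substring test is True for ''
assert days_map.index("") == 0   # '' resolves to the first day's offset


def is_valid_day_code(day_code):
    # Only accept a single-letter day code that is actually in the map.
    return isinstance(day_code, str) and len(day_code) == 1 and day_code in days_map


assert is_valid_day_code("W")
assert not is_valid_day_code("")
assert not is_valid_day_code(None)
```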
```python
    dates = self.extract_dates(sched_resp.json())
    return [self._create_collection(d, 'Recycling') for d in dates]
except (requests.RequestException, KeyError, StopIteration):
```
extract_dates() currently returns every event date from the schedule payload and _fetch_recycling_entries() labels all of them as Recycling. The Recycle Coach schedule schema typically includes per-event collections with statuses (and possibly multiple collection types); without filtering you can end up emitting recycling entries for non-recycling or is_none events. Consider parsing event['collections'] and only returning dates where the recycling collection is present and active (or at least skip events where all collections are is_none).
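One possible shape for that filtering, under the payload structure the comment describes (the field names `events`, `collections`, `name`, `status`, and `day`, and the date format, are assumptions about the Recycle Coach schema, not verified against the API):

```python
from datetime import datetime


def extract_recycling_dates(schedule_json):
    """Sketch: keep only dates where a recycling collection is present and active.

    Assumes each event carries a 'day' string and a list of 'collections',
    each with a 'name' and a 'status' where 'is_none' means no pickup.
    """
    dates = []
    for event in schedule_json.get("events", []):
        for collection in event.get("collections", []):
            name = str(collection.get("name", "")).lower()
            if "recycl" in name and collection.get("status") != "is_none":
                dates.append(datetime.strptime(event["day"], "%Y-%m-%d").date())
                break  # one match per event is enough
    return dates
```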
The recently updated toronto_ca.py works for me.