Skip to content

Refactor get_start_datetime function #485

@braddf

Description

@braddf

Issue

The get_start_datetime function in utils found here as it currently works is slightly hard to follow the logic and has multiple responsibilities depending on how it is called.

Suggested Implementation

  • Split out the logic in the function into constituent parts with single responsibilities, that can either be called by this original (parent) function, or as standalone functions called in route handlers for better separation of concerns.
  • Refactor places where this function is used to reflect the new separated concerns, i.e. only get the default start_datetime if we haven't got one already set, instead of proxying this value oddly through the function as a pseudo-validator.

Something like this might work for the first point:

def _parse_n_history_days(n_history_days: Optional[Union[str, int]]) -> tuple[int, bool]:
    """Return (int days, is_yesterday) from n_history_days input."""
    if n_history_days is None:
        n_history_days = os.getenv("N_HISTORY_DAYS", "yesterday")
    if n_history_days == "yesterday":
        return 1, True
    return int(n_history_days), False

def _get_yesterday_midnight_utc() -> datetime:
    """Return yesterday at midnight Europe/London, in UTC."""
    dt = datetime.now(tz=europe_london_tz).date() - timedelta(days=1)
    dt = datetime.combine(dt, datetime.min.time())
    dt = europe_london_tz.localize(dt)
    return dt.astimezone(utc)

def get_start_datetime(
    n_history_days: Optional[Union[str, int]] = None,
    days: Optional[int] = 3,
) -> datetime:
    """
    Always compute and return the start datetime for the query.

    - If 'yesterday', returns yesterday at midnight (Europe/London) in UTC.
    - Otherwise, returns floored 6-hour interval N days ago.
    """
    n_days, is_yesterday = _parse_n_history_days(n_history_days)
    if is_yesterday:
        return _get_yesterday_midnight_utc()
    dt = datetime.now(tz=europe_london_tz) - timedelta(days=n_days)
    dt = floor_6_hours_dt(dt)
    return dt.astimezone(utc)

def validate_start_datetime(
    start_datetime: Optional[datetime],
    n_history_days: Optional[Union[str, int]] = None,
    days: int = 3,
) -> datetime:
    now = datetime.now(tz=utc)
    if start_datetime is None or now - start_datetime > timedelta(days=days):
        return get_start_datetime(n_history_days)
    return start_datetime

So if when we have start_datetime = None you can get_start_datetime for the default value, and when you just want to check if the given start_datetime is allowed in that circumstance (depending on the params given for different places/ways this function is used) then you could use validate_start_datetime.

For the second point, each place this function is called would need updating to use the new functions above. I haven't checked this code, so would need testing thoroughly, but in essence I reckon it would be better in terms of engineering principles for this default behaviour.

Metadata

Metadata

Type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions