Fix utilsforecast evaluation compatibility #282
Conversation
Handle missing cutoff/id_col in evaluation output
Make cross-validation robust to utilsforecast schema changes
Add cutoff_col to mase and forward it to mae for correct evaluation
AzulGarza left a comment
thanks @Kushagra7777! since the pr introduced the new cutoff_col, we need to update the tests as well.
AzulGarza left a comment
thanks, @Kushagra7777! it seems the tests are working now, one remaining thing is fixing lint and styling checks.

i'm also adding copilot as reviewer to have its comments.
Pull request overview
This pull request makes the experiment handler robust to schema changes in the utilsforecast library, so that evaluation and cross-validation keep working when the utilsforecast output schema varies (e.g., missing cutoff or id_col columns).
Key changes:
- Introduced a `_zero_to_nan_pd` helper function to replace reliance on utilsforecast's `_zero_to_nan`
- Updated the `mase` function to defensively handle optional cutoff columns in evaluation output
- Added defensive column filtering in `evaluate_forecast_df` to handle missing cutoff/id_cutoff columns
```python
if "cutoff" not in eval_df.columns:
    if "id_cutoff" in eval_df.columns:
        eval_df = eval_df.merge(cutoffs, on="id_cutoff", how="left")
    else:
        pass

cols = ["unique_id", "cutoff", "metric"] + models
cols = [c for c in cols if c in eval_df.columns]
eval_df = eval_df[cols]
```
Copilot AI (Jan 8, 2026)
The new defensive logic for handling missing cutoff/id_cutoff columns (lines 271-275) and the column filtering (lines 277-279) lack test coverage. Consider adding test cases that verify the behavior when utilsforecast's evaluate() function returns DataFrames with different schema variations (e.g., missing cutoff column, missing id_cutoff column, or both).
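A hedged sketch of such a test, exercising the defensive filtering inline on synthetic DataFrames rather than importing the real evaluate_forecast_df (the column names and model list below are assumptions based on the diff above):

```python
import pandas as pd
import pytest


@pytest.mark.parametrize(
    "drop_cols",
    [["cutoff"], ["id_cutoff"], ["cutoff", "id_cutoff"]],
)
def test_defensive_column_filtering(drop_cols):
    # Simulate evaluate() outputs whose schema varies across
    # utilsforecast versions by dropping columns up front.
    eval_df = pd.DataFrame(
        {
            "unique_id": ["a", "a"],
            "cutoff": pd.to_datetime(["2024-01-01", "2024-01-02"]),
            "id_cutoff": ["a-0", "a-1"],
            "metric": ["mae", "mase"],
            "model": [1.0, 2.0],
        }
    ).drop(columns=drop_cols)
    models = ["model"]
    # The defensive filter from the diff: keep only columns that exist.
    cols = ["unique_id", "cutoff", "metric"] + models
    cols = [c for c in cols if c in eval_df.columns]
    out = eval_df[cols]
    # Columns that survived the schema change are kept, and the
    # missing ones never trigger a KeyError.
    assert {"unique_id", "metric", "model"} <= set(out.columns)
```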
```python
def _zero_to_nan_pd(s: pd.Series) -> pd.Series:
    s = s.astype(float).copy()
    s[s == 0] = np.nan
    return s
```
Copilot AI (Jan 8, 2026)
The new _zero_to_nan_pd helper function lacks direct test coverage. While it may be indirectly tested through the mase function, consider adding a direct unit test to verify its behavior, especially for edge cases like empty series, all-zero series, and series with mixed zero and non-zero values.
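A direct test along those lines could look like the following; the helper is restated from the diff above so the snippet runs standalone (in the repo it would import the real _zero_to_nan_pd instead):

```python
import numpy as np
import pandas as pd


def _zero_to_nan_pd(s: pd.Series) -> pd.Series:
    s = s.astype(float).copy()
    s[s == 0] = np.nan
    return s


def test_zero_to_nan_pd_edge_cases():
    # Empty series: no error, result stays empty.
    assert _zero_to_nan_pd(pd.Series([], dtype=float)).empty
    # All-zero series: every value becomes NaN.
    assert _zero_to_nan_pd(pd.Series([0, 0, 0])).isna().all()
    # Mixed series: only zeros are replaced; other values survive.
    out = _zero_to_nan_pd(pd.Series([0, 1.5, 0, -2]))
    assert out.isna().tolist() == [True, False, True, False]
    assert out.dropna().tolist() == [1.5, -2.0]
```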
```python
seasonality: int,
train_df: pd.DataFrame,
id_col: str = "unique_id",
time_col: str = "ds",
```
Copilot AI (Jan 8, 2026)
The time_col parameter is added to the function signature but is never used in the function body. Consider removing this parameter if it's not needed, or if it was added for future use, document its intended purpose.
We should work on adding the option to set a time column to timecopilot in general, but as suggested here, I'm not sure it should be added as an argument until the functionality behind the argument is also added.
The laziest way of doing this would be to add a time_col arg to calls like forecast() and then just rename the specified column in the dataframe to "ds".
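A minimal sketch of that rename-based idea, using a hypothetical _normalize_time_col helper (the name and placement are assumptions, not the actual timecopilot API):

```python
import pandas as pd


def _normalize_time_col(df: pd.DataFrame, time_col: str = "ds") -> pd.DataFrame:
    # Hypothetical helper: rename the user-specified time column to the
    # "ds" name the rest of the pipeline assumes; "ds" inputs pass through.
    if time_col != "ds":
        df = df.rename(columns={time_col: "ds"})
    return df


# Usage: a public forecast(df, h, time_col=...) entry point could call this
# first, so internals can keep assuming "ds" unchanged.
df = pd.DataFrame(
    {"timestamp": pd.date_range("2024-01-01", periods=3), "y": [1.0, 2.0, 3.0]}
)
assert "ds" in _normalize_time_col(df, time_col="timestamp").columns
```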
Update live tests to query using the actual unique_id from generated data instead of a hardcoded value. This avoids empty series selection and prevents flaky failures when checking queryability.
as suggested by copilot Co-authored-by: Copilot <[email protected]>
as suggested by copilot Co-authored-by: Copilot <[email protected]>
Remove series-specific query that caused invalid dataframe Keep test focused on queryable state only
Disable anomaly detection in test_is_queryable
This reverts commit d0829f7.
Why switch to _zero_to_nan_pd? It appears to achieve the same purpose as the imported function it's replacing. (See timecopilot/timecopilot/utils/experiment_handler.py, lines 28 to 31 in 1bad91c.)
The .copy() after the astype() call shouldn't be necessary; by default, astype() makes a copy.
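For illustration, a sketch of the helper with the redundant call dropped; astype(float) returns a new object by default (copy=True), so mutating the result cannot affect the caller's series:

```python
import numpy as np
import pandas as pd


def _zero_to_nan_pd(s: pd.Series) -> pd.Series:
    s = s.astype(float)  # astype copies by default, so no explicit .copy() needed
    s[s == 0] = np.nan
    return s
```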
Replaced the imported _zero_to_nan with a pandas-based implementation to avoid division-by-zero and index-alignment issues. Fixed the unnecessary .copy() usage. Removed the unused time_col: str = "ds" argument. Reverted the earlier notebook changes, which were unrelated and only attempted to address lint and styling errors, and also did the final lint fix.
Applied ruff-format changes to satisfy pre-commit checks. No functional changes.
@spolisar and about copy(), i am fixing it. removing this line as well --> time_col: str = "ds".
That makes sense, seems good overall.
```python
seasonality: int,
train_df: pd.DataFrame,
id_col: str = "unique_id",
time_col: str = "ds",
```
@Kushagra7777 should we use time_col inside the function? in mae?
spolisar left a comment
Two things I noticed when taking a look at the recent changes. First, replacing an internal function with an import means we might want to change an import in a test file that used the internal function. Second, my memory might be off, but from what I remember some of the lines removed in 726da08 were used to resolve the missing cutoff col issue.
Was there a chunk of code that was used to consistently recreate this problem for testing?
thanks for the review @spolisar! i think the problem appeared when we tried to maintain two versions of utilsforecast, but now that we are forcing the latest one, we are only worried about the cutoff col arg.
Sounds good then.
spolisar left a comment
Sounds like the issue is resolved and changes/questions have been addressed. Seems good to me.