Add cached OSM test mode #280
Open
paco-barreras wants to merge 25 commits into main from
Conversation
…sion tests

A new per-user regression test exposed a real stop-table concat bug when some users had empty outputs.

This commit hardens empty stop-table construction by deriving exact output columns and explicit dtypes from shared helpers, then using those typed empties in stop-detection paths. It also applies reset_index(drop=True) after grouped stop summarization and adds passthrough guards to avoid duplicate user_id columns.

Per-user regression tests were cleaned up and made faster:
- compare labels directly by (user_id, timestamp)
- remove offset/parts-style expectation logic
- run on a 4-user sample
- parameterize n_jobs with 1 and 2

For now, this is prototyped in dbstop.py and sequential.py via the focused per-user regression path that originally surfaced the bug, with shared helper changes ready for wider consolidation.
Replace split empty-stop schema helpers (column names + dtype map) with one shared helper that directly returns a typed empty stop DataFrame. Update all active stop-detection summarization callsites (dbstop, dbscan, density_based, hdbscan, lachesis, sequential, grid_based) to use the unified helper, removing duplicated empty-frame construction logic.
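A minimal sketch of the unified-helper idea (the function name, column set, and dtypes below are illustrative assumptions, not nomad's actual schema helpers):

```python
import pandas as pd

def empty_stop_frame(schema=None):
    """Return an empty stop table with exact output columns and dtypes.

    The column set and dtypes here are hypothetical; the real helper
    derives them from nomad's shared schema utilities.
    """
    schema = schema or {
        "user_id": "object",
        "start_timestamp": "int64",
        "end_timestamp": "int64",
        "longitude": "float64",
        "latitude": "float64",
        "n_pings": "int64",
    }
    # Building each column as a typed empty Series pins both the column
    # order and the dtype of the empty frame.
    return pd.DataFrame({c: pd.Series(dtype=t) for c, t in schema.items()})
```

Concatenating these typed empties with real per-user outputs avoids the dtype upcasting and column mismatch that the regression test surfaced.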
Refreshed stop-detection utilities and resolved merge conflicts.
This branch cleans up the HDBSCAN validation notebook, with some deeper
refactors to nomad's functions.
## Validation
The first part of the change improves the validation path around
`compute_visitation_errors`. It now lives with the rest of the
stop-detection validation logic in validation.py, and the overlap /
validation code can take a separate traj_cols mapping for the right-hand
table when the predicted stops and the truth table do not use the same
column names. That let me remove a lot of notebook-side transformations
that were only there to work around the fragile code.
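As an illustration of the separate right-hand mapping (the helper signature and column names here are assumed, not the actual validation.py API):

```python
import pandas as pd

def overlap_stops(pred, truth, traj_cols, right_traj_cols=None):
    """Align two stop tables whose columns use different names.

    traj_cols maps canonical names to the left table's columns;
    right_traj_cols does the same for the right-hand (truth) table.
    """
    rc = right_traj_cols or traj_cols
    # Rename each table to canonical names before joining, so the
    # notebook never has to pre-transform either side.
    left = pred.rename(columns={v: k for k, v in traj_cols.items()})
    right = truth.rename(columns={v: k for k, v in rc.items()})
    return left.merge(right, on="user_id", suffixes=("_pred", "_true"))

pred = pd.DataFrame({"uid": ["a"], "start_ts": [0], "end_ts": [60]})
truth = pd.DataFrame({"user": ["a"], "t0": [0], "t1": [55]})
merged = overlap_stops(
    pred, truth,
    traj_cols={"user_id": "uid", "start_timestamp": "start_ts",
               "end_timestamp": "end_ts"},
    right_traj_cols={"user_id": "user", "start_timestamp": "t0",
                     "end_timestamp": "t1"},
)
```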
## Notebook
The notebook `hdbscan_validation_paper` is leaner. It no longer passes
default traj_cols mappings into loaders just to restate the defaults,
and it no longer drops diary rows with missing building IDs before
validation. The general metrics now use the full truth diary, while
category-specific slices happen naturally where the categories are
actually used. I also fixed the stale `start_timestamp` / `timestamp`
mismatch after the summarize-stop output switched to
`keep_col_names=True`, and cleaned up the generation path so regenerated
diaries keep `user_id`.
## Plotting
The plotting code also got reorganized. The notebook was mixing up two
different statistical objects: the per-user distribution of a metric,
and uncertainty in the median metric estimate. Those are now shown
separately. `validation.py` now provides a small bootstrap summary
helper plus two plotting helpers: one for per-user boxplots, and one for
bootstrapped median estimates with interval whiskers. The boxplots are
there to show the spread across users; the point-and-whisker plot is
there to compare the estimated medians. That split makes the
interpretation much clearer for this notebook.
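For concreteness, the bootstrap summary helper amounts to something like this (the name and defaults are illustrative, not the exact validation.py implementation):

```python
import numpy as np

def bootstrap_median(values, n_boot=2000, ci=0.95, seed=0):
    """Median of a per-user metric plus a bootstrap interval.

    This captures uncertainty in the median estimate, which is a
    different object from the per-user spread shown in the boxplots.
    """
    rng = np.random.default_rng(seed)
    vals = np.asarray(values, dtype=float)
    medians = np.array([
        np.median(rng.choice(vals, size=vals.size, replace=True))
        for _ in range(n_boot)
    ])
    lo, hi = np.quantile(medians, [(1 - ci) / 2, 1 - (1 - ci) / 2])
    return np.median(vals), lo, hi
```

The point-and-whisker plot then draws one `(median, lo, hi)` triple per algorithm variant, while the boxplot keeps showing the raw per-user distribution.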
For the grouped colors, the x-axis still uses the registry family labels
such as `lachesis_coarse` and `lachesis_fine`, but the colors are
grouped by the underlying base algorithm. That is piped through from the
registry as `{algo['family']: algo['algorithm']}`, so variants of the
same base method share a hue family without hardcoding the palette in
the notebook.
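A sketch of how that mapping can drive the palette (the registry entries and colors here are made up for illustration):

```python
# Hypothetical registry entries; the real ones come from nomad's registry.
registry = [
    {"family": "lachesis_coarse", "algorithm": "lachesis"},
    {"family": "lachesis_fine", "algorithm": "lachesis"},
    {"family": "hdbscan_default", "algorithm": "hdbscan"},
]

# The mapping piped through from the registry, as described above.
family_to_algo = {algo["family"]: algo["algorithm"] for algo in registry}

# One hue per base algorithm (assumed colors); variants inherit it.
base_colors = {"lachesis": "tab:blue", "hdbscan": "tab:orange"}
palette = {fam: base_colors[alg] for fam, alg in family_to_algo.items()}
# {'lachesis_coarse': 'tab:blue', 'lachesis_fine': 'tab:blue',
#  'hdbscan_default': 'tab:orange'}
```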
I ran the validation notebook end to end on the 250-agent dataset after
these changes. It completes successfully and writes figures that make
sense to me.
Closes #272
Adds a small OSM cache generator and makes the existing map tests use cached OSM data by default.

`test_maps.py` still runs through `download_osm_buildings`, `download_osm_streets`, and `get_city_boundary_osm`; only the OSMnx network boundary is monkeypatched to read local parquet/GraphML fixtures. This closes #271 and #272. The cache stays out of git for now. To use the default path, run
`python -m nomad.data.generate_osm_test_cache`, then `pytest nomad/tests/test_maps.py`; missing or stale cache files fail with a clear message to rerun the script. To run against live OSM instead, set `NOMAD_OSM_TEST_CACHE=0`. I also left notes in the live tests where smaller fixtures would provide the same signal, plus a README warning about keeping large geospatial artifacts out of repository history.
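A hedged sketch of the gating logic (the cache path, fixture name, and message text are assumptions, not the PR's exact code):

```python
import os
from pathlib import Path

import pytest

# Assumed default cache location; the generator script would write here.
CACHE_DIR = Path("nomad/data/osm_test_cache")

def use_cached_osm():
    # Cached mode is the default; NOMAD_OSM_TEST_CACHE=0 opts into live OSM.
    return os.environ.get("NOMAD_OSM_TEST_CACHE", "1") != "0"

@pytest.fixture
def osm_buildings_fixture():
    """Load a cached buildings layer, failing loudly if the cache is absent."""
    if not use_cached_osm():
        pytest.skip("live OSM mode requested; skipping cached fixture")
    path = CACHE_DIR / "buildings.parquet"  # hypothetical file name
    if not path.exists():
        pytest.fail(
            "OSM test cache missing; run "
            "`python -m nomad.data.generate_osm_test_cache` first"
        )
    import geopandas as gpd
    return gpd.read_parquet(path)
```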