📝 CodeRabbit Chat: Implement requested code changes by coderabbitai[bot] · Pull Request #1325 · RolnickLab/antenna

coderabbitai · 2026-05-28T00:18:36Z

Code changes was requested by @mihow.

Endpoint for stats about verified occurrences #1307 (comment)

The following files were modified:

ami/base/serializers.py
ami/main/api/views.py

Pure-Python LCA over (taxon_id, rank, parents_json) tuples. Returns the deepest shared TaxonRank or None. Used by the upcoming human-model-agreement stat to bucket agreement at-or-finer-than ORDER. Plan: docs/claude/planning/2026-05-14-human-model-agreement-endpoint.md Side-research: docs/claude/planning/occurrence-filter-driven-exports.md Co-Authored-By: Claude <noreply@anthropic.com>

… queryset Pure aggregation; caller wires apply_default_filters + OccurrenceFilter. Annotates best machine prediction, prefetches non-withdrawn identifications, batches Taxon fetch for parents_json, buckets exact / under-order / above-order. Co-Authored-By: Claude <noreply@anthropic.com>

Adds HumanModelAgreementSerializer and the human_model_agreement action on OccurrenceStatsViewSet. Extracts OccurrenceViewSet's filter backends + filterset_fields into a module-level tuple so OccurrenceStatsViewSet can reuse the same OccurrenceFilter pass-through (deployment, event, taxa lists, verified, score thresholds, apply_defaults=false, etc). The top_identifiers action keeps its current behavior — filter_queryset is only invoked by actions that opt in. Co-Authored-By: Claude <noreply@anthropic.com>

Adds 6 HTTP-level tests: missing project_id 400, draft 404, empty zeros, happy-path exact match, deployment filter pass-through, apply_defaults=false score-threshold bypass. Also adds DjangoFilterBackend to OccurrenceStatsViewSet.filter_backends so filterset_fields (event, deployment, determination__rank, ...) actually take effect. Without DjangoFilterBackend, filterset_fields are silently ignored and ?deployment=N returns the unfiltered set. Co-Authored-By: Claude <noreply@anthropic.com>

Mirrors useTopIdentifiers's useAuthorizedQuery pattern. Accepts an arbitrary filter map so the occurrence list page can thread its filter state through unchanged (deployment, event, taxon, score thresholds, apply_defaults). Co-Authored-By: Claude <noreply@anthropic.com>

… review fixes Captures: review findings from Copilot + CodeRabbit, perf bench evidence (43k rows → 159s timeout on apply_defaults=false), and the planned changes for the next session (rename to model-agreement, push aggregation into SQL/ORM, fix UNKNOWN rank LCA + denominator + verified_by_me anon gap + test gaps). Co-Authored-By: Claude <noreply@anthropic.com>

…ion to SQL Addresses review feedback on PR #1307: Rename (drop "human"): - URL: /occurrences/stats/human-model-agreement/ -> /model-agreement/ - Function: human_model_agreement_for_project -> model_agreement_for_project - Serializer: HumanModelAgreementSerializer -> ModelAgreementSerializer - Viewset action + url_path: human_model_agreement -> model_agreement - FE hook: useHumanModelAgreement -> useModelAgreement (file + symbol) - FE type: Response -> ModelAgreementResponse (fixes DOM Response shadow) - Test class: TestHumanModelAgreementForProject -> TestModelAgreementForProject SQL push-down (Copilot+CodeRabbit perf flag): - Replace list(qs) full-row materialization with annotated aggregate(). - Annotate best_user_taxon_id via Subquery over Identification (BEST_IDENTIFICATION_ORDER). Drop the prefetch + select_related("taxon") on identifications since only taxon_id is read. - aggregate() Count(filter=Q(...)) for total/verified/exact/no-prediction. - For under-order disagreement: group disagreement set by distinct (user_taxon, machine_taxon) pair before LCA. Each pair's LCA runs once. - Bench against project 18 (43,149 occurrences): pre-rework apply_defaults=false curl timed out at 159s; post-rework 1.96s unfiltered / 3.4s with bypass (93,019 occurrences post-filter). Denominator fix (Copilot): - agreed_*_pct now divides by verified_with_prediction_count instead of verified_count. A verified occurrence with no machine prediction can't agree or disagree; including it in the denominator drags the rate down without representing actual model disagreement. - Surface no_prediction_count + verified_with_prediction_count as sibling fields so consumers can see how many such occurrences exist. UNKNOWN rank bug (Copilot): - TaxonRank.UNKNOWN sorts after SPECIES in OrderedEnum definition order, so without explicit exclusion UNKNOWN >= ORDER is True and a shared UNKNOWN ancestor would wrongly count as under-order agreement. Filter UNKNOWN out of lca_rank_between's candidate ranks. Add regression test. Tests: - New: test_unknown_rank_excluded_from_lca (LCA regression) - New: test_agreement_under_order_bucket (HTTP coverage for sister-species case, previously only exact-match shortcut was exercised) - Updated: happy-path asserts verified_with_prediction_count and no_prediction_count. 22/22 backend tests green: docker compose exec django python manage.py test ami.main.tests.TestLcaRankBetween ami.main.tests.TestModelAgreementForProject ami.main.tests.TestOccurrenceStatsViewSet Co-Authored-By: Claude <noreply@anthropic.com>

Co-Authored-By: Claude <noreply@anthropic.com>

Replace the .aggregate() over the full filtered queryset with a two-step approach: 1. SQL Count('pk') for total_occurrences (no joins, no subqueries). 2. Fetch the verified set (occurrences with at least one non-withdrawn ident) with both best_user_taxon_id and best_machine_prediction_taxon_id annotated, then bucket counts + LCA in Python. Why: the previous version evaluated two correlated subqueries (best user identification + best machine prediction) on every row of the filtered queryset. For typical projects, >95% of occurrences have no identification — those rows ran the user-ident subquery only to discover NULL, then ran the (much more expensive) machine-prediction subquery on detections that won't contribute to any agreement bucket. Scoping the subqueries to the verified set avoids that waste. Bench (cold, cache invalidated): Project Total Verified Pre Post P#85 SEC-SEQ 36,253 13,140 — 1.18s P#20 BCI 40,958 1,351 — 0.92s P#84 Pennsylvania 18,407 251 — 0.56s P#24 Atlantic Forestry 2,797 274 — 0.50s P#18 Vermont 43,149 45 ~928ms 0.35s P#23 Insectarium Montreal 20,393 74 — 0.43s Warm via django-cachalot: 122–343ms across all projects. For P#85 (highest absolute identification count in the system), the cost is dominated by apply_default_filters' score-threshold join, not the subqueries. apply_defaults=false actually runs faster (0.69s cold, 179,466 total / 13,140 verified) because the classification join is skipped. Co-Authored-By: Claude <noreply@anthropic.com>

… param Replaces hardcoded `lca >= TaxonRank.ORDER` agreement gate with two layers: - Always returned: `agreed_any_rank_*` — exact matches plus any non-null LCA at a real rank (UNKNOWN excluded). The upstream filter (e.g. a Lepidoptera include list) is what bounds the meaningful scope, not a hardcoded threshold in this function. - Optional `?agreement_coarsest_rank=FAMILY`: when supplied, response also includes `agreed_coarser_rank_*` (exact + LCAs at or below the threshold). The applied rank is echoed in `agreement_coarsest_rank`; null when absent. Also addresses CodeRabbit feedback on the existing branch: - Dedupe base queryset before counting (joins from default-filter chain can inflate Occurrence rows). - Bound `*_pct` FloatFields to [0.0, 1.0] in the serializer. Param validation: invalid rank → 400; UNKNOWN rejected as not meaningful. Tests cover any-rank fallback, threshold filtering, invalid + UNKNOWN rejection, and threshold echo. Co-Authored-By: Claude <noreply@anthropic.com>

…ry params - Rename `agreed_under_order_*` → `agreed_any_rank_*` to match the endpoint's dropped ORDER threshold (0565f06). - Add optional `agreement_coarsest_rank` + `agreed_coarser_rank_*` fields to the response type (not consumed yet — UI follows in #1308). - Widen `filters` to accept arrays and append repeated query params so multi-value filters (e.g. `algorithm`, `not_algorithm` — backend reads via `request.query_params.getlist(...)`) survive. Per CodeRabbit review. Co-Authored-By: Claude <noreply@anthropic.com>

Session-scratchpad doc — belongs in local notes, not the merged branch. Co-Authored-By: Claude <noreply@anthropic.com>

- 2026-05-14-human-model-agreement-endpoint.md — design narrative; superseded by code + PR description. - occurrence-filter-driven-exports.md — side-research stub Copilot flagged as out-of-scope. Promoted to a PR-description follow-up item. Co-Authored-By: Claude <noreply@anthropic.com>

create_detections assigns the classification taxon via .order_by("?"), so the previous test picked a random machine taxon and then required a sister species under the same genus. Random non-species picks (ORDER / FAMILY / GENUS) have no sister, flaking ~50% of runs. Pin both the machine prediction and the human ID to two fixed Vanessa species, so the LCA is always GENUS (any-rank bucket, not exact) and the test is deterministic. Co-Authored-By: Claude <noreply@anthropic.com>

useModelAgreement.ts belongs with the frontend consumer (#1308), not the backend endpoint PR. Keeps #1307 backend-only. Co-Authored-By: Claude <noreply@anthropic.com>

Both derive from the verified_rows already in memory — no extra query. - wilson_interval(): 95% Wilson score CI on agreed_exact_pct and agreed_any_rank_pct (agreed_*_ci_low / _ci_high). Wilson stays inside [0,1] and is honest at the small n typical of verified sets, where the normal approximation breaks down. - cohens_kappa(): exact-taxon agreement beyond chance (cohens_kappa field, range [-1, 1]). Null when no doubly-classified occurrences or expected agreement is 1.0. Discounts the agreement you'd get for free in a project dominated by one common species. Adds 5 nullable response fields. Backwards-compatible (additive only). 9 pure-Python unit tests + 2 HTTP field-presence tests. Co-Authored-By: Claude <noreply@anthropic.com>

Both are generic statistical helpers — they don't depend on Django or any domain model. Lifting them out of ami/main/models_future/occurrence.py so other endpoints/jobs that need binomial CIs or chance-corrected agreement can import them without dragging in the occurrence module. Same implementations, just relocated. Renamed parameter names on cohens_kappa from (human, model) to (rater_a, rater_b) so the helper reads as generic rather than human-vs-model specific. Tests already use isolated `from ami.utils.stats import …` imports (updated all 9 sites in ami/main/tests.py). Co-Authored-By: Claude <noreply@anthropic.com>

netlify · 2026-05-28T00:18:42Z

✅ Deploy Preview for antenna-preview canceled.

Name	Link
🔨 Latest commit	`3d522db`
🔍 Latest deploy log	https://app.netlify.com/projects/antenna-preview/deploys/6a1789df8daeec000838fe1d

coderabbitai · 2026-05-28T00:18:43Z

Important

Review skipped

This PR was authored by the user configured for CodeRabbit reviews. CodeRabbit does not review PRs authored by this user. It's recommended to use a dedicated user account to post CodeRabbit review feedback.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 7610e23d-e83e-473e-afea-57d1b8256f92

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

🔍 Trigger review

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

mihow · 2026-06-04T21:41:05Z

Claude says: Closing this — it's superseded by #1307, which merged on 2026-05-29 (the same day this PR was opened). The endpoint, ModelAgreementSerializer, model_agreement_for_project, and the filter backends this diff would add are all already on main, and the branch is now conflicting against it.

The one genuinely new idea here is the EnumChoiceField helper in ami/base/serializers.py — a reusable enum-backed ChoiceField that accepts member names case-insensitively and supports excluding sentinel values like UNKNOWN. That abstraction didn't land with #1307.

For reference, the rank param is currently validated inline in model_agreement (ami/main/api/views.py, around line 1463):

valid_ranks = [r.name for r in TaxonRank if r != TaxonRank.UNKNOWN]
raw_rank = request.query_params.get("agreement_coarsest_rank")
# Build a plain dict (not the QueryDict) so a blank value is validated as a real "".
# DRF treats blank fields in HTML/QueryDict input as absent, which would let
# ?agreement_coarsest_rank= silently no-op. Uppercase to stay case-insensitive.
rank_data = {} if raw_rank is None else {"agreement_coarsest_rank": raw_rank.upper()}
coarsest_rank_param = SingleParamSerializer[str].clean(
    param_name="agreement_coarsest_rank",
    field=serializers.ChoiceField(choices=valid_ranks, required=False, allow_blank=False),
    data=rank_data,
)
coarsest_rank = TaxonRank[coarsest_rank_param] if coarsest_rank_param else None

This works and gives strict 400s for blank, unknown, and UNKNOWN ranks, plus drf-spectacular reads the choices into the OpenAPI schema. EnumChoiceField would just DRY this up — worth doing once a second enum-backed query param appears.

Potential follow-up (good first issue): extract EnumChoiceField into ami/base/serializers.py as a fresh PR off current main, swap this call site to use it, and add a couple of unit tests covering the case-insensitive name match, the exclude rejection, and the blank-value 400.

mihow and others added 18 commits May 27, 2026 06:25

docs(plan): add text lang to fenced block (markdownlint MD040)

da2a232

Co-Authored-By: Claude <noreply@anthropic.com>

chore(docs): drop NEXT_SESSION_PROMPT.md from PR

7c144b0

Session-scratchpad doc — belongs in local notes, not the merged branch. Co-Authored-By: Claude <noreply@anthropic.com>

chore(occurrence-stats): move FE hook to UI PR #1308

b74b3cd

useModelAgreement.ts belongs with the frontend consumer (#1308), not the backend endpoint PR. Keeps #1307 backend-only. Co-Authored-By: Claude <noreply@anthropic.com>

📝 CodeRabbit Chat: Implement requested code changes

3d522db

coderabbitai Bot requested a review from mihow May 28, 2026 00:18

coderabbitai Bot assigned mihow May 28, 2026

Base automatically changed from feat/human-model-agreement-endpoint to main May 29, 2026 03:54

mihow closed this Jun 4, 2026

mihow mentioned this pull request Jun 4, 2026

Endpoint for stats about verified occurrences #1307

Merged

6 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

📝 CodeRabbit Chat: Implement requested code changes#1325

📝 CodeRabbit Chat: Implement requested code changes#1325
coderabbitai[bot] wants to merge 18 commits into
mainfrom
coderabbitai/chat/336c1fe

coderabbitai Bot commented May 28, 2026

Uh oh!

netlify Bot commented May 28, 2026 •

edited

Loading

Uh oh!

coderabbitai Bot commented May 28, 2026

Review skipped

Uh oh!

mihow commented Jun 4, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

coderabbitai Bot commented May 28, 2026

Uh oh!

netlify Bot commented May 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

✅ Deploy Preview for antenna-preview canceled.

Uh oh!

coderabbitai Bot commented May 28, 2026

Review skipped

Uh oh!

mihow commented Jun 4, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

netlify Bot commented May 28, 2026 •

edited

Loading