feat(generation): add Generate Tab DINO/SAM detailer #9136

Draft
AsuraAce wants to merge 3 commits into invoke-ai:main from AsuraAce:codex/generate-tab-detailer

AsuraAce commented May 8, 2026

Summary

Adds a Generate Tab Detailer for local SD1, SD2, and SDXL generation. The supported V1 path uses GroundingDINO for target detection, SAM/SAM2 for mask generation, a prepared crop/mask path, a second same-checkpoint denoise pass, and alpha compositing back into the generated image before final output handling.
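The final compositing step described above can be illustrated with a minimal sketch. It is not the PR's backend invocation: the function name, the RGBA buffer layout, and the mask polarity (255 = take the detailed pixel, 0 = keep the base pixel) are assumptions chosen for the example.

```typescript
// Illustrative sketch only; names, layout, and mask polarity are assumptions, not the PR's code.
// base and detail are RGBA buffers of equal size; mask holds one 0-255 weight per pixel.
function compositeWithMask(
  base: Uint8ClampedArray,
  detail: Uint8ClampedArray,
  mask: Uint8ClampedArray
): Uint8ClampedArray {
  if (base.length !== detail.length || base.length !== mask.length * 4) {
    throw new Error('base, detail, and mask sizes do not match');
  }
  const out = new Uint8ClampedArray(base.length);
  for (let p = 0; p < mask.length; p++) {
    const a = mask[p] / 255; // blend weight for this pixel
    for (let c = 0; c < 3; c++) {
      const i = p * 4 + c;
      out[i] = Math.round(detail[i] * a + base[i] * (1 - a));
    }
    out[p * 4 + 3] = base[p * 4 + 3]; // keep the base image's alpha channel
  }
  return out;
}
```

A feathered mask edge (intermediate weights) is what lets the refined region blend into the surrounding image without a visible seam.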

This PR also adds the user-facing Detailer accordion, quality starter presets, metadata recall/remix support, dev-only debug collage output, and backend invocations for bbox selection, crop preparation, paste compositing, and debugging.

Supported V1 scope:

  • Generate Tab only
  • Local SD1/SD2/SDXL first
  • One selected target per generation
  • DINO/SAM is the supported detector path
  • MediaPipe remains dev-only legacy/experimental

[Screenshots: 3 images]

Related Issues / Discussions

No linked issue.

QA Instructions

Automated checks run locally after rebasing onto current upstream/main:

  • uv run python ../../../scripts/generate_openapi_schema.py | pnpm typegen
  • uv run pytest tests/app/invocations/test_detailer.py tests/app/invocations/test_grounding_dino.py
  • uv run ruff check invokeai/app/invocations/detailer.py invokeai/app/invocations/grounding_dino.py tests/app/invocations/test_detailer.py tests/app/invocations/test_grounding_dino.py
  • pnpm test:run src/features/nodes/util/graph/generation/addFaceDetailerPass.test.ts src/features/controlLayers/store/paramsSlice.test.ts src/features/metadata/parsing.test.ts
  • pnpm exec eslint src/features/settingsAccordions/components/FaceDetailerSettingsAccordion/FaceDetailerSettingsAccordion.tsx src/features/nodes/util/graph/generation/addFaceDetailerPass.ts src/features/nodes/util/graph/generation/addFaceDetailerPass.test.ts src/features/controlLayers/store/paramsSlice.ts src/features/controlLayers/store/paramsSlice.test.ts src/features/controlLayers/store/detailerQualityPresets.ts src/features/controlLayers/store/detailerRuntimeConfig.ts src/features/metadata/parsing.tsx src/features/metadata/parsing.test.ts
  • pnpm lint:tsc

Suggested reviewer QA:

  • Generate with Detailer disabled and confirm the graph/output path is unchanged.
  • Enable Detailer with an SDXL or SD1 model and test Face Balanced and Face High.
  • Test Head High with SAM large.
  • Test Body/person High as a broad-subject experimental path.
  • Test a no-target prompt and confirm generation completes without a visible change or hard failure.
  • Generate with Detailer enabled, then use Recall/Remix and confirm user-facing Detailer settings restore while debug state stays disabled.
  • In a dev build, enable Debug Output and confirm the extra collage shows bbox, detector prompt, SAM model, crop/process sizes, masks, and final paste.

Checklist

  • The PR has a short but descriptive title, suitable for a changelog
  • Tests added / updated (if applicable)
  • ❗Changes to a redux slice have a corresponding migration
  • Documentation added / updated (if applicable)
  • Updated What's New copy (if doing a release after this PR)

github-actions bot added the python, invocations, frontend, and python-tests labels May 8, 2026
AsuraAce force-pushed the codex/generate-tab-detailer branch from 7094887 to 78cc9d1 May 8, 2026 11:28
AsuraAce changed the title from "Add Generate Tab Detailer with DINO/SAM target refinement" to "feat(generation): add Generate Tab DINO/SAM detailer" May 8, 2026
Pfannkuchensack (Collaborator) commented

Findings

Medium: i18n violation in detailer metadata viewer

Path: invokeai/frontend/web/src/features/metadata/parsing.tsx

The DetailerSettings ValueComponent (the [value.enabled ? 'On' : 'Off', value.targetPrompt, value.quality, value.detector].filter(Boolean).join(' / ') block) is rendered in the user-visible metadata panel but displays:

  • Hardcoded English 'On' / 'Off' rather than t('common.on') / t('common.off').
  • Raw enum values for value.quality (fast / balanced / high) and value.detector (grounding-dino-sam / mediapipe).

Translation keys for these already exist (parameters.faceDetailer.qualities.*, parameters.faceDetailer.detectors.*) but are not used. This is the only place in the new code that breaks the codebase's existing localization contract for visible strings.

To expose this issue, add a test under invokeai/frontend/web/src/features/metadata/parsing.test.ts that asserts the DetailerSettings rendered value emits translation keys, not raw English/enum values. Since DOM rendering is out of scope per the repo's testing conventions, factor the join logic into a helper and unit-test the enum -> i18n key mapping.
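One way the suggested helper could look is sketched below. The type names, the `formatDetailerSettings` function, and the injected `t` parameter are hypothetical. The key prefixes come from the review, but the leaf names are assumed to equal the enum values, which may not match the actual entries in en.json.

```typescript
// Hypothetical types; the real ones live in the PR's params slice.
type DetailerQuality = 'fast' | 'balanced' | 'high';
type DetailerDetector = 'grounding-dino-sam' | 'mediapipe';

interface DetailerSettingsValue {
  enabled: boolean;
  targetPrompt: string;
  quality: DetailerQuality;
  detector: DetailerDetector;
}

// Translator is injected so the helper can be unit-tested without a DOM or i18n runtime.
type Translate = (key: string) => string;

// Assumption: leaf key names equal the enum values.
const qualityKey = (q: DetailerQuality): string => `parameters.faceDetailer.qualities.${q}`;
const detectorKey = (d: DetailerDetector): string => `parameters.faceDetailer.detectors.${d}`;

function formatDetailerSettings(value: DetailerSettingsValue, t: Translate): string {
  return [
    t(value.enabled ? 'common.on' : 'common.off'),
    value.targetPrompt, // user-entered text, shown as-is
    t(qualityKey(value.quality)),
    t(detectorKey(value.detector)),
  ]
    .filter(Boolean) // drops an empty target prompt
    .join(' / ');
}
```

With an identity translator (`(key) => key`), a unit test can assert that the output contains translation keys rather than raw `'On'`, `'high'`, or `'mediapipe'` strings.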


Low: raw technical identifiers shown as Combobox option labels

Path: invokeai/frontend/web/src/features/settingsAccordions/components/FaceDetailerSettingsAccordion/FaceDetailerSettingsAccordion.tsx:808-815

dinoModelOptions and samModelOptions are constructed with { label: model, value: model } where model is the lowercase kebab-case technical identifier:

  • grounding-dino-tiny, grounding-dino-base
  • segment-anything-2-small, segment-anything-2-tiny, segment-anything-2-base, segment-anything-2-large
  • segment-anything-base, segment-anything-large, segment-anything-huge

These appear directly in user-facing dropdowns. The detector dropdown (detectorOptions) at lines 793-799 was localized but the DINO/SAM model dropdowns were not. There are no parameters.faceDetailer.dinoModels.* or parameters.faceDetailer.samModels.* keys in invokeai/frontend/web/public/locales/en.json.

To expose this issue, add a test that builds the option list factory in isolation and asserts each option's label differs from its value.
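A possible shape for that factory is sketched here. `buildModelOptions`, the `parameters.faceDetailer.dinoModels` prefix, and the display strings are all hypothetical; per the finding above, no such keys exist in en.json yet.

```typescript
interface ComboboxOption {
  label: string;
  value: string;
}

type Translate = (key: string) => string;

// Identifiers as listed in the finding.
const DINO_MODELS = ['grounding-dino-tiny', 'grounding-dino-base'] as const;

// Hypothetical factory: value stays the technical identifier, label goes through the translator.
function buildModelOptions(
  models: readonly string[],
  keyPrefix: string,
  t: Translate
): ComboboxOption[] {
  return models.map((model) => ({ label: t(`${keyPrefix}.${model}`), value: model }));
}

// Stand-in translator backed by a small dictionary; the display strings are invented.
const fakeLabels: Record<string, string> = {
  'parameters.faceDetailer.dinoModels.grounding-dino-tiny': 'Grounding DINO (Tiny)',
  'parameters.faceDetailer.dinoModels.grounding-dino-base': 'Grounding DINO (Base)',
};
const fakeT: Translate = (key) => fakeLabels[key] ?? key;

const dinoModelOptions = buildModelOptions(
  DINO_MODELS,
  'parameters.faceDetailer.dinoModels',
  fakeT
);
```

Note that a "label differs from value" assertion also passes when a translator falls back to returning the key, so a stricter test might additionally check that no label starts with the key prefix.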


Low: detailer color-correction wires denoise_mask polarity in a non-obvious way

Path: invokeai/frontend/web/src/features/nodes/util/graph/generation/addFaceDetailerPass.ts:158-168

The graph feeds the detailer's denoise_mask (black = edits, white = preserves) into color_correct.mask. Per invokeai/app/invocations/image.py:750-759, the color-correct mask semantics are "white = original, black = result", so the wiring effectively color-matches the detailed crop only inside the denoise region and keeps the surrounding context uncorrected.

That happens to be the intended behavior, but no explicit unit test asserts that the mask edge corresponds to "color-corrected only inside the detail region". A future change to either the detailer mask polarity or the color_correct semantics will break this silently.

To expose this issue, add a test under invokeai/frontend/web/src/features/nodes/util/graph/generation/addFaceDetailerPass.test.ts that asserts the wiring crop.denoise_mask -> color_correct.mask and document the polarity contract. Alternatively, add an integration-style backend test that runs ColorCorrectInvocation with the actual detailer crop output.
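The wiring assertion could be sketched as follows. The `Edge` shape is a simplified stand-in for the frontend graph's edge records, and the node ids are hypothetical; a real test would build the graph through addFaceDetailerPass and inspect its edges.

```typescript
// Simplified stand-in for a graph edge; the real graph type is richer.
interface Edge {
  source: { node_id: string; field: string };
  destination: { node_id: string; field: string };
}

function hasEdge(
  edges: readonly Edge[],
  fromId: string,
  fromField: string,
  toId: string,
  toField: string
): boolean {
  return edges.some(
    (e) =>
      e.source.node_id === fromId &&
      e.source.field === fromField &&
      e.destination.node_id === toId &&
      e.destination.field === toField
  );
}

// Polarity contract, as stated in the finding:
//   denoise_mask:        black = edits,  white = preserves
//   color_correct.mask:  black = result, white = original
// The two agree, so color correction applies only inside the detail region.
const exampleEdges: Edge[] = [
  {
    source: { node_id: 'detailer_crop', field: 'denoise_mask' }, // hypothetical node id
    destination: { node_id: 'detailer_color_correct', field: 'mask' }, // hypothetical node id
  },
];

const maskIsWired = hasEdge(
  exampleEdges,
  'detailer_crop',
  'denoise_mask',
  'detailer_color_correct',
  'mask'
);
```

Keeping the polarity contract as a comment next to the assertion means a change to either side's semantics has an obvious place to be reconciled.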


Low: detailer enabled flag persists across recall from non-detailer images

Path: invokeai/frontend/web/src/features/metadata/parsing.tsx

DetailerSettings.parse uses zParamsState.shape.detailerEnabled.parse(getProperty(metadata, 'detailer_enabled')), so when detailer_enabled is missing the handler rejects, leaving the existing detailerEnabled redux value untouched. That is consistent with how other handlers behave, but it means recalling a previous "detailer ON" image, then recalling a new image without detailer metadata, leaves the detailer ON in the UI, which can silently insert a second denoise pass into the next generation. Worth confirming this is intended, given the cost difference.

To expose this issue, add a test in invokeai/frontend/web/src/features/metadata/parsing.test.ts that asserts how recall behaves when detailer_enabled is absent, then document the chosen contract.
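The behavior in question can be modeled with a small sketch. It stands in for the zod-based handler with a plain type check; `parseDetailerEnabled` and `recallDetailerEnabled` are hypothetical names, and the real recall flow goes through the metadata handlers and redux rather than a pure function.

```typescript
interface ParamsState {
  detailerEnabled: boolean;
}

// Stand-in for the schema parse: rejects when the field is absent or not a boolean.
function parseDetailerEnabled(metadata: Record<string, unknown>): boolean {
  const raw = metadata['detailer_enabled'];
  if (typeof raw !== 'boolean') {
    throw new Error('detailer_enabled is missing or not a boolean');
  }
  return raw;
}

// Models the "no recall" contract: a rejected parse leaves the existing state untouched.
function recallDetailerEnabled(
  state: ParamsState,
  metadata: Record<string, unknown>
): ParamsState {
  try {
    return { ...state, detailerEnabled: parseDetailerEnabled(metadata) };
  } catch {
    return state;
  }
}
```

Under this contract, recalling an image without detailer metadata while the detailer is on leaves it on. If the opposite contract were chosen, the `catch` branch would instead return `{ ...state, detailerEnabled: false }`.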


  • I did not execute any test suites — this is a static review only.

AsuraAce (Author) commented May 9, 2026

Detailer metadata now localizes On/Off, quality, and detector.
DINO/SAM model labels are localized.
Color-correct mask polarity is documented and covered by a graph test.
Missing detailer_enabled behavior is explicitly tested as “no recall.”

lstein added the 6.14.x label May 9, 2026
lstein moved this to 6.14.x Theme: LIBRARY UPDATES in Invoke - Community Roadmap May 9, 2026