@anshifmonz
Contributor

This PR adds a script that lets users select a region of the screen and extract text from it using OCR.
Once triggered, it grabs a screenshot, preprocesses it for clarity, runs Tesseract OCR, and copies the text to the clipboard.

Added

  • dotfiles/.config/hypr/scripts/text-extractor.sh

Updated

  • Setup scripts (Arch / Fedora / openSUSE) to install Tesseract language packs
  • Global package list: added tesseract and wl-clipboard
  • Docs: added new dependencies
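
To confirm that a language pack actually landed after running a setup script, something like the following could be used. It is only a sketch: `lang_available` is a hypothetical helper (not part of this PR), and it assumes `tesseract --list-langs` prints one language code per line after a header line.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the PR: succeeds if the language code in $1
# appears as a whole line on stdin (e.g. the output of `tesseract --list-langs`).
lang_available() {
    grep -qxF -- "$1"
}

# Example (assumes tesseract is installed; 2>&1 because some versions
# print the list on stderr):
#   tesseract --list-langs 2>&1 | lang_available eng && echo "eng is installed"
```

Matching whole lines avoids false positives where one code is a prefix of another.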

How It Works (Implementation Details)

  1. Launches hyprpicker to freeze the screen
  2. Uses slurp to select a screen region
  3. Captures that region using grim
  4. Pipes the image into ImageMagick for preprocessing:
    • grayscale
    • normalize
    • contrast stretch
    • sharpen
    • upscale 2×
  5. Feeds the processed image into Tesseract (--psm 6)
  6. Copies the result to the clipboard via wl-copy
  7. Also includes safeguards:
    • dependency checking
    • cleanup traps
    • timeout protection for slurp
    • safe kill for background picker
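
The steps above can be sketched roughly as follows. This is not the script from the PR: the function names (`check_deps`, `cleanup`, `extract_text`), the hyprpicker flags, the 30-second timeout, and the ImageMagick tuning values are all assumptions chosen for illustration, and the `magick` command assumes ImageMagick 7 (older installs may only provide `convert`).

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the pipeline described above; NOT the PR's actual script.
# Tool flags and tuning values are assumptions and may need adjusting.
set -o pipefail

picker_pid=""

# Print any commands missing from PATH and return 1; return 0 if all are present.
check_deps() {
    local cmd missing=()
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || missing+=("$cmd")
    done
    if (( ${#missing[@]} > 0 )); then
        printf 'Missing dependencies: %s\n' "${missing[*]}" >&2
        return 1
    fi
    return 0
}

# Stop the background picker if it is still running (safe to call repeatedly).
cleanup() {
    if [[ -n "$picker_pid" ]] && kill -0 "$picker_pid" 2>/dev/null; then
        kill "$picker_pid" 2>/dev/null
    fi
    picker_pid=""
}

extract_text() {
    check_deps hyprpicker slurp grim magick tesseract wl-copy timeout || return 1
    trap cleanup EXIT INT TERM

    # 1. Freeze the screen (assumed flags: render inactive windows, no zoom lens).
    hyprpicker -r -z >/dev/null 2>&1 &
    picker_pid=$!
    sleep 0.2

    # 2. Select a region; give up after 30 s (assumed limit) so slurp cannot hang.
    local region
    region=$(timeout 30 slurp) || { cleanup; return 1; }

    # 3-6. Capture with grim, preprocess with ImageMagick, OCR, copy to clipboard.
    grim -g "$region" - \
        | magick - -colorspace Gray -normalize -contrast-stretch 2%x1% \
                 -sharpen 0x1 -resize 200% png:- \
        | tesseract stdin stdout --psm 6 \
        | wl-copy
    local rc=$?

    cleanup
    return "$rc"
}
```

Saved as a standalone script, the file would end with a call to `extract_text`. Streaming the image through pipes avoids temporary files, and `pipefail` lets a failure in any stage surface in the return code.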

Usage

Bind the script to a key in Hyprland (default binding shown):

bind = $mainMod ALT, A, exec, $HYPRSCRIPTS/text-extractor.sh

Trigger → select area → the extracted text is copied to the clipboard.
