Linux Saloon 63: Application Potluck, Tagger, Tesseract, exiftool, Stacer

In this application “potluck” we talked about some of the applications we enjoy using. The best part of these episodes is that everyone gets to try something new, or at least has a new tool at the ready when they need it.

Thanks so much for your continued support in watching, sharing and subscribing to Linux Saloon.

00:00:00 Introductions
00:01:29 Colin’s looking at Rolling Release Linux Distributions
00:19:23 StrawPolls on Telegram layout and poll anonymity
00:34:54 Tagger Meta tagging Application
00:56:17 Tesseract - Ulfnic
01:00:57 exiftool
https://exiftool.org/
01:05:33 Notepad Next
Notepad Next | Flathub
Notepad++ https://snapcraft.io/notepad-plus-plus
01:10:49 Neovim
https://neovim.io/
01:13:27 KeePassXC and Joplin with a little Syncthing
https://keepassxc.org/
https://joplinapp.org/
Syncthing on openSUSE – CubicleNate's Techpad
01:22:04 Konqueror
01:25:21 Gnome Meta Data Cleaner
https://metadatacleaner.romainvigier.fr/
01:26:17 FreeTube
https://freetubeapp.io/
01:30:42 Stacer
https://github.com/oguzhaninan/Stacer
01:33:48 Next Week - Open Mic Night
01:34:24 Housekeeping
01:36:48 McDonald’s Arch Deluxe most expensive flop
The Arch Deluxe Was a Hell of a Burger. It Was Also McDonald’s Most Expensive Flop.
01:44:14 Last Call
01:54:09 Bloopers

Tesseract

Tesseract is a command line tool that does optical character recognition (or OCR) for extracting text from digital images.

I use it to read the text inside screen-grabs, which is especially useful when ‘YouTubers’ write code or visit links on video. I pipe the result into the clipboard so I can grab the text from a screen-grab using a keyboard shortcut. Examples below:

X11 instructions:

  • sudo apt install maim tesseract-ocr xclip # Install dependencies (the Debian/Ubuntu package for tesseract is tesseract-ocr)
  • maim --select --hidecursor --format=png --quality=10 /dev/fd/1 | tesseract stdin stdout | xclip -in -selection clipboard

Wayland/wlroots instructions:

  • sudo apt install grim slurp tesseract-ocr wl-clipboard # Install dependencies (the Debian/Ubuntu package for tesseract is tesseract-ocr)
  • grim -t png -l 9 -g "$(slurp)" - | tesseract stdin stdout | wl-copy
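
If you are not sure which of the two one-liners applies to your session, something like the following can tell you. This is only a minimal sketch: detect_display_server is a made-up helper name, and it assumes that a set WAYLAND_DISPLAY means a Wayland session and that a set DISPLAY on its own means plain X11.

```bash
#!/usr/bin/env bash
# Sketch only: detect_display_server is a hypothetical helper, not part of any package.
# Assumption: WAYLAND_DISPLAY set => Wayland session, DISPLAY alone => X11.
detect_display_server() {
	if [[ -n ${WAYLAND_DISPLAY:-} ]]; then
		printf '%s\n' 'wayland'
	elif [[ -n ${DISPLAY:-} ]]; then
		printf '%s\n' 'x11'
	else
		printf '%s\n' 'unknown'
	fi
}

# Report which pipeline from the instructions above fits this session
case $(detect_display_server) in
	wayland) printf '%s\n' 'Use the grim/slurp/wl-copy pipeline' ;;
	x11)     printf '%s\n' 'Use the maim/xclip pipeline' ;;
	*)       printf '%s\n' 'No graphical session detected' ;;
esac
```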

Upcoming Linux Saloons

Linux Saloon | Open Mic Night 13
2023-04-16T00:00:00Z

Linux Saloon | Distro Exploration - TBD
2023-04-23T00:00:00Z

Source of truth for dates: Linux Saloon – CubicleNate's Techpad

This was a bit long for the show, but if you want the best accuracy from screengrab | tesseract, you want this script.

tesseract was designed for scanned documents, which tend to be very large at 100% scale, so it sometimes struggles with small screen text. Upscaling screen-grabs to 400% of their original size fixes that.

tesseract also sometimes adds whitespace/newlines to the beginning or end so it’s nice to remove those.
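
To show the trimming on its own, here is a small sketch using plain bash parameter expansion, so no sed or awk is needed. trim_whitespace is just an illustrative name, and the sample string stands in for OCR output.

```bash
#!/usr/bin/env bash
# Sketch only: trim_whitespace is a hypothetical helper name.
trim_whitespace() {
	local str=$1
	# ${str%%[![:space:]]*} is the leading run of whitespace; strip it from the front
	str=${str#"${str%%[![:space:]]*}"}
	# ${str##*[![:space:]]} is the trailing run of whitespace; strip it from the end
	str=${str%"${str##*[![:space:]]}"}
	printf '%s' "$str"
}

# Stand-in for OCR output padded with blank lines and spaces
raw=$'\n\n  def main():\n      pass  \n\n'
printf '[%s]\n' "$(trim_whitespace "$raw")"
```

Whitespace inside the text (like the indentation on the second line) is left alone; only the outer padding goes.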

The following script upscales with ffmpeg or ImageMagick’s mogrify if either is installed; otherwise it OCRs without upscaling. Leading and trailing newlines and whitespace are removed, and it supports Wayland (wlroots only?) and X11-based distros.
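
The "use whatever is installed" idea boils down to checking candidates in order of preference. A rough sketch of that pattern follows; first_available is a hypothetical helper name, and the candidate list is just the two tools mentioned above.

```bash
#!/usr/bin/env bash
# Sketch only: first_available is a hypothetical helper name.
# Prints the first command from its arguments that exists, or returns 1 if none do.
first_available() {
	local candidate
	for candidate in "$@"; do
		if type "$candidate" &> /dev/null; then
			printf '%s\n' "$candidate"
			return 0
		fi
	done
	return 1
}

# Prefer ffmpeg, then mogrify, otherwise fall back to no upscaling
upscaler=$(first_available ffmpeg mogrify) || upscaler='none'
printf 'Upscaler: %s\n' "$upscaler"
```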

It also doesn’t use temp files; all the magic happens over pipes and process substitution.
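
If process substitution is new to you: <( cmd ) gives a program that insists on a file name a /dev/fd path backed by a pipe, so nothing touches the disk. A toy sketch, where fake_screenshot and fake_ocr are made-up stand-ins for the screenshot tool and tesseract:

```bash
#!/usr/bin/env bash
# Sketch only: fake_screenshot and fake_ocr are hypothetical stand-ins
# for the screenshot tool (grim/maim) and for tesseract.
fake_screenshot() { printf 'pixels of some text\n'; }
fake_ocr() { tr '[:lower:]' '[:upper:]'; }

# cat expects a file name argument; <( ... ) supplies a /dev/fd path fed by a pipe,
# and the result is passed on through an ordinary pipe
cat <( fake_screenshot ) | fake_ocr
```

The output form >( cmd ) works the same way in the other direction, for programs that want a file name to write to.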

/usr/local/bin/ocr2clip

#!/usr/bin/env bash
# License: BSD-0 Clause, Ulfnic

:<<-'Comment'
	Dependencies:
		- tesseract

	Wayland dependencies:
		- grim
		- slurp
		- wl-copy

	X11 dependencies:
		- maim
		- xsel or xclip

	Optional dependencies:
		- ffmpeg or ImageMagick (for vastly better accuracy)
Comment

set -o errexit

print_stderr() {
	[[ $2 ]] && printf "$2" "${@:3}" 1>&2
	[[ $1 == '0' ]] || exit $1
}

# Define display server
if [[ $DISPLAY ]]; then
	[[ $WAYLAND_DISPLAY ]] && display_server='xwayland' || display_server='x11'
else
	display_server='wayland'
fi

# Check dependencies
type tesseract &> /dev/null || print_stderr 1 '%s\n' 'Missing dependency: tesseract'

if [[ $display_server == 'wayland' ]] || [[ $display_server == 'xwayland' ]]; then
	type wl-copy &> /dev/null || print_stderr 1 '%s\n' 'Missing dependency: wl-copy'
	type slurp &> /dev/null || print_stderr 1 '%s\n' 'Missing dependency: slurp'
	type grim &> /dev/null || print_stderr 1 '%s\n' 'Missing dependency: grim'
	clipboard_cmd='wl-copy'

elif [[ $display_server == 'x11' ]]; then
	type maim &> /dev/null || print_stderr 1 '%s\n' 'Missing dependency: maim'
	if type xsel &> /dev/null; then
		clipboard_cmd='xsel --input --clipboard'
	elif type xclip &> /dev/null; then
		clipboard_cmd='xclip -in -selection clipboard'
	else
		print_stderr 1 '%s\n' 'Missing dependency: xsel or xclip'
	fi
fi

# Stdout user's screen selection
function screen_select(){
	if [[ $display_server == 'wayland' ]] || [[ $display_server == 'xwayland' ]]; then
		# Get selection and honor escape key
		grim -t png -l 9 -g "$(slurp)" -

	elif [[ $display_server == 'x11' ]]; then
		maim --select --hidecursor --format=png --quality=10 /dev/fd/1

	fi
}

# OCR screen selection and deliver to clipboard
function ocr_selection(){
	str=$( tesseract stdin stdout 2>/dev/null )

	# Remove leading and trailing whitespace
	str=${str#"${str%%[![:space:]]*}"}
	str=${str%"${str##*[![:space:]]}"}

	# Place in clipboard
	printf '%s' "$str" | $clipboard_cmd
}

# Empty clipboard to avoid false positives
printf '' | $clipboard_cmd

# If a suitable program is available, upscale the image by 4x using either ffmpeg or ImageMagick to improve accuracy
if type ffmpeg &> /dev/null; then
	ffmpeg \
		-y \
		-hide_banner \
		-loglevel error \
		-i <( screen_select ) \
		-vf 'scale=iw*4:ih*4' \
		-f image2 \
		>( ocr_selection )

elif type mogrify &> /dev/null; then
	screen_select \
		| mogrify \
			png:- \
			-modulate 100,0 \
			-resize 400% \
		| ocr_selection

else
	screen_select | ocr_selection

fi

I think this is fantastic. I am impressed by the way that this works.
