Channel: GeekTips
The 0xProto font is my current favorite.
sudo apt install python-is-python3 

is an automatic way to get a python command: instead of having to symlink python3 to python yourself, it sets it all up.
substitcher202309.zip (14.4 MB)

Requirements:
pip3 install pysubs2 titlecase

sudo apt install jq kid3-cli rename ffmpeg

flatpak install flathub org.freac.freac

The purpose is to identify hallucinations, repeating subs, stuck timecodes, and repeating timecodes.
The biggest difference I've noticed is that the medium model hallucinates less than medium.en and large (which is large-v2). I've also tried a quantized (q5) model, but its accuracy is as bad as small, so you might as well just use small in that case.
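
A rough sketch of what those checks could look like, not Substitcher's actual code. It assumes the cues were already parsed into (start_ms, end_ms, text) tuples; the name find_suspect_cues and the max_repeats threshold are made up for illustration.

```python
from typing import List, Tuple

Cue = Tuple[int, int, str]  # (start_ms, end_ms, text); hypothetical layout


def find_suspect_cues(cues: List[Cue], max_repeats: int = 3) -> List[Tuple[int, str]]:
    """Return (cue_index, reason) pairs for cues that deserve a second look."""
    problems = []
    run = 1  # length of the current run of identical text
    for i, (start, end, text) in enumerate(cues):
        # Stuck timecode: the cue has no duration or runs backwards.
        if end <= start:
            problems.append((i, "stuck timecode"))
        if i > 0:
            prev_start, prev_end, prev_text = cues[i - 1]
            # Repeating timecode: same start and end as the previous cue.
            if (start, end) == (prev_start, prev_end):
                problems.append((i, "repeating timecode"))
            # Repeating sub: the same text several times in a row.
            run = run + 1 if text.strip() == prev_text.strip() else 1
            if run == max_repeats:
                problems.append((i, "repeating sub"))
    return problems
```

Each run of repeated text is reported once, at the cue where it reaches max_repeats, so a long loop of the same line does not flood the report.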

Substitcher comes with a sample LibriVox audiobook so you can quickly play around with the options. Put all your srt or vtt subs into the root directory along with the opus audio segments, and it stitches them together. With the included audiobook, extract by chapters and rename to 001.opus, 002.opus and so on (that's option h); those names will then correspond to the whisper.cpp-transcribed vtt or srt files.
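
To make the naming idea concrete, here is a small sketch (an assumption about how the matching could be done, not the script itself) that pairs each NNN.opus with the sub file sharing its stem and lists audio segments that have no sub yet. The function name pair_segments is hypothetical.

```python
from pathlib import Path
from typing import Dict, Iterable, List, Tuple


def pair_segments(names: Iterable[str]) -> Tuple[List[Tuple[str, str]], List[str]]:
    """Pair each NNN.opus with the NNN.srt or NNN.vtt that shares its stem."""
    subs: Dict[str, str] = {}
    audio: List[str] = []
    for name in sorted(names):
        suffix = Path(name).suffix.lower()
        if suffix == ".opus":
            audio.append(name)
        elif suffix in (".srt", ".vtt"):
            # If both .srt and .vtt exist, the later one in sort order wins;
            # that tie-break is an arbitrary choice for this sketch.
            subs[Path(name).stem] = name
    pairs = [(a, subs[Path(a).stem]) for a in audio if Path(a).stem in subs]
    missing = [a for a in audio if Path(a).stem not in subs]
    return pairs, missing
```

Calling it with the names in the root directory returns the matched pairs in chapter order plus any opus files still waiting for a transcript.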

Play the audiobooks with subs over a black cover image. Players: SMPlayer on Linux, PotPlayer on Windows, IINA on Mac, mpv-android on Android, and on iOS nPlayer (paid) or Liquid Player.
Substitcher menu. I think 5) is the same as 3)... oh well, will fix when I update the script.
ffmpeg -i a.vtt a.srt

ffmpeg -i a.mkv -c:s srt a.srt

Now reverse it, from srt to vtt:
ffmpeg -i a.srt a.vtt

This might be better than
pysubs2 a.vtt -t srt

because it won't put in an index number. I'm thinking this alone would let Substitcher run on Mac, and most likely Windows, since I wouldn't need those two sed commands.

for f in *.vtt ; do ffmpeg -i "$f" "${f%.*}.srt" ; done

is the equivalent of

pysubs2 *.vtt -t srt
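
The same batch loop can be sketched in Python, which may help on systems without a bash-style shell. This is only an illustration; the helper names vtt_to_srt_command and convert_all are made up, and ffmpeg is assumed to be on the PATH.

```python
import subprocess
from pathlib import Path
from typing import List


def vtt_to_srt_command(vtt_name: str) -> List[str]:
    """Build the ffmpeg argument list that converts one .vtt file to .srt."""
    src = Path(vtt_name)
    # with_suffix mirrors the shell expansion "${f%.*}.srt"
    return ["ffmpeg", "-i", str(src), str(src.with_suffix(".srt"))]


def convert_all(directory: str = ".") -> None:
    """Run ffmpeg once per .vtt file in directory (assumes ffmpeg is installed)."""
    for path in sorted(Path(directory).glob("*.vtt")):
        subprocess.run(vtt_to_srt_command(str(path)), check=True)
```

Passing the arguments as a list avoids shell quoting problems with spaces in filenames, the same reason the shell loop quotes "$f".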
Testing out a few Firefox (LibreWolf) extensions to display and/or download subs. Dualsubs is a no-go because it can only do two for free and pushes you to pay $600 for 5 years, which is insane.

The first one is called Multi Subsitles Youtube, and yes, they spelled Subtitles wrong... oh well.
And the last one, which lets you download the subs (srt), is Youtube Subtitle Downloader https://www.youtube.com/watch?v=GMcDekiRKs8 (this video is the one that has some subs in other languages)
Easy Video Downloader (LibreWolf extension, asks for donations) is pretty good for just downloading stuff from YouTube.
So these are the extensions I'll use for LibreWolf. There's still no easy way for the average user to do yt-dlp --split-chapters or select a quality format for Odysee or BitChute, though.
substitcher202310.zip (14.4 MB)
Substitcher 202310: finally got it to work on Mac. The main thing was switching sub conversion from pysubs2 to ffmpeg, so I didn't need the two sed scripts that were trouble on Mac. I still use pysubs2 for shifting time.
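
For anyone curious what shifting time amounts to, here is a standard-library sketch of the idea. It is not how pysubs2 does it internally and not Substitcher's code; the function names are hypothetical, and the output is always written SRT-style with a comma before the milliseconds.

```python
import re

_TS = re.compile(r"(\d+):(\d{2}):(\d{2})[,.](\d{3})")


def shift_timestamp(ts: str, offset_ms: int) -> str:
    """Shift one 'HH:MM:SS,mmm' timestamp by offset_ms, clamping at zero."""
    m = _TS.fullmatch(ts)
    if m is None:
        raise ValueError(f"not a timestamp: {ts!r}")
    h, mi, s, ms = (int(g) for g in m.groups())
    total = max(0, ((h * 60 + mi) * 60 + s) * 1000 + ms + offset_ms)
    total, ms = divmod(total, 1000)
    total, s = divmod(total, 60)
    h, mi = divmod(total, 60)
    return f"{h:02d}:{mi:02d}:{s:02d},{ms:03d}"


def shift_srt_line(line: str, offset_ms: int) -> str:
    """Shift both timestamps on a timing line; other lines pass through."""
    if "-->" not in line:
        return line
    return _TS.sub(lambda m: shift_timestamp(m.group(0), offset_ms), line)
```

When stitching, the offset for a segment would presumably be the combined duration of the audio segments before it, though that part is my reading rather than something stated above.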

Removing metadata should work much better now, as all chapter data is deleted. The title is added based on the filename. Added timing to each job so it shows how long it took. Cleaned up the script a bit.
mkdir output; parallel --tag -j 2 ocrmypdf -O 3 -s --skip-big .1 '{}' 'output/{}' ::: *.pdf


Batch PDF image optimization to reduce file sizes. I usually use level 2, but sometimes 3 is warranted as long as you don't have text in images, because then it looks bad. In these PDFs only the first two images look kinda bad; the rest throughout the PDFs are just fine. File savings are great: 2.5 GB -> 1.4 GB (44% overall reduction).
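
The 44% figure is just the size difference divided by the original size. A tiny sketch for checking your own batches (the function name percent_reduction is made up):

```python
def percent_reduction(before: float, after: float) -> float:
    """Percentage by which a size shrank, e.g. total GB before and after."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before * 100.0


# Sizes from the batch described above: 2.5 GB before, 1.4 GB after.
print(f"{percent_reduction(2.5, 1.4):.0f}%")
```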
Adding bookmarks with the booky script. Download booky.sh and booky.py, chmod +x both, and put them in /usr/local/bin. Then make a text file of chapters / bookmarks with a comma , before each page number and save it as bookmarks.txt

Now just run booky and a new PDF file with the bookmarks is created

booky.sh some.pdf bookmarks.txt

Creates some_new.pdf
Sometimes you can just highlight the contents in the PDF and it'll copy just fine.
Sometimes it won't select properly, so in that case use OCR capture.
NormCap (free; Mac, Windows, Linux) can OCR-capture the text.
Just put a comma , before the number at the end of each line. Search for all numbers (\d+) at end of line $, that is (\d+)$, then replace with ,$1 so a , goes in before the first capture group.
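
If you would rather script that search and replace than do it in an editor, a minimal Python sketch follows. The function name add_page_commas is hypothetical, and tolerating trailing spaces is my own addition for OCR'd text.

```python
import re


def add_page_commas(toc_text: str) -> str:
    """Insert a comma before the number that ends each line."""
    # With re.MULTILINE, $ matches at the end of every line.
    # [ \t]* drops trailing spaces that would otherwise block the match.
    # \1 is Python's spelling of the $1 capture group used in editors.
    return re.sub(r"(\d+)[ \t]*$", r",\1", toc_text, flags=re.MULTILINE)
```

Lines that do not end in a number are left alone, and numbers in the middle of a title (like the 1 in Chapter 1) are not touched.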
bookmarks.txt (7.8 KB)
bookmarks.txt gives you an example: just put bookmarks between { }. You can do multiple expandable levels by nesting more { }.
Booky bookmarks: how to do multilevel. Just embed collapsible bookmarks within { } below each chapter. You can do 3rd and 4th levels too. Use Rust for syntax highlighting.
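
Since a stray brace is easy to miss in a long file, here is a hedged sketch that reports the nesting level of each bookmark and complains about unbalanced { }. It is not part of booky, the name bookmark_levels is made up, and it assumes each { and } sits on its own line, which may not match every bookmarks.txt.

```python
from typing import List, Tuple


def bookmark_levels(text: str) -> List[Tuple[int, str]]:
    """Return (nesting_level, entry) for each bookmark line; level 1 is top."""
    depth = 1
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line == "{":  # assumed: an opening brace stands alone on its line
            depth += 1
        elif line == "}":  # assumed: a closing brace stands alone on its line
            depth -= 1
            if depth < 1:
                raise ValueError("closing } without a matching {")
        else:
            entries.append((depth, line))
    if depth != 1:
        raise ValueError("unclosed { in bookmarks")
    return entries
```

Running it on a file before handing it to booky.sh gives a quick view of which entries will end up at the 2nd, 3rd, or 4th level.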
2025/07/07 11:14:01