How to apply an offset to page numbers, e.g. when the actual page number is off by +1 (sometimes it's +3 or even +30). In VSCode, first select all the numbers you wish to increase; here we only want the numbers at the end of each line.
First find all the numbers with the regex (\d+)$ and they'll be highlighted, but not yet selected.
Press Cmd+Shift+L on Mac (Ctrl+Shift+L on Linux/Windows) to turn all the regex matches into selections.
Then, with the Text Power Tools extension installed, choose the increase/decrease decimal number option with a custom value and put in 1 to increase. Notice all the page numbers have been incremented by 1.
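If you'd rather apply the offset outside the editor, a one-line perl substitution does the same thing. A minimal sketch, assuming the page number is the last thing on each line (change the + 1 for other offsets; the output filename is made up):

perl -pe 's/(\d+)\s*$/$1 + 1 . "\n"/e' bookmarks.txt > bookmarks_offset.txt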
Final result for the bookmarked PDF: bookmarks.txt (5.7 KB).
booky.sh some.pdf bookmarks.txt

Here are the bookmarks with the 7 sections, page numbers offset by +1, to give an idea.
I had a CHM file and uploaded it to an online converter (Zamzar), and it automatically generated bookmarks. I wanted to mimic that, and finally found pdf.tocgen.
pip3 install pdf.tocgen

Extract some metadata to help build a recipe. This runs at level one (-a 1) on page 78 (-p 78), to figure out which font, font size, and/or font color to search for, so the pattern can be used to generate the bookmarks automatically.
pdfxmeta -a 1 -p 78 input.pdf >> recipe.toml
This is the recipe.toml: the main chapters are at level 1 and the verses at level 2. Playing with the tolerance took a while, but now I kinda know what to do.
Generate the toc (table of contents) using the recipe you created:

pdftocgen doc.pdf < recipe.toml > toc

Clean up toc if necessary.

Just had to do some regex to get rid of unwanted text after the verses and some other unwanted text, then delete any blank lines, and the toc is ready to go.
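Deleting the blank lines can also be done outside the editor with sed. A sketch, assuming GNU sed (on macOS use sed -i '' or gsed):

sed -i '/^[[:space:]]*$/d' toc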
Now add the toc/bookmarks to the PDF:
pdftocio input.pdf < toc
and it'll output a PDF with bookmarks named input_out.pdf

Another PDF I did went up 1.5MB in file size with tons of bookmarks. This one went from 11MB to 242MB. No idea why, so I compressed it and it went down to 5MB.

ocrmypdf -O 3 -s --skip-big .1 --jbig2-lossy input.pdf output.pdf
pdftocgen -v volume_10_surahs_12-15.pdf < recipe.toml > toc
pdftocio volume_10_surahs_12-15.pdf < toc
pdftocgen -v volume_11_surahs_16-20.pdf < recipe.toml > toc
pdftocio volume_11_surahs_16-20.pdf < toc
pdftocgen -v volume_12_surahs_21-25.pdf < recipe.toml > toc
pdftocio volume_12_surahs_21-25.pdf < toc
pdftocgen -v volume_13_surahs_26-32.pdf < recipe.toml > toc
pdftocio volume_13_surahs_26-32.pdf < toc
pdftocgen -v volume_14_surahs_33-39.pdf < recipe.toml > toc
pdftocio volume_14_surahs_33-39.pdf < toc
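The same two steps repeat for every volume, so a loop covers them all. A sketch, assuming all the volumes share the same recipe.toml:

for f in volume_*.pdf ; do pdftocgen -v "$f" < recipe.toml > toc ; pdftocio "$f" < toc ; done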


Used this recipe

[[heading]]
level = 1
greedy = true
font.name = "BookAntiqua-Italic"
font.size = 24.0

[[heading]]
level = 1
greedy = true
font.name = "BookAntiqua-Bold"
font.size = 24.0

[[heading]]
level = 1
greedy = true
font.name = "BookAntiqua"
font.size = 18.0

[[heading]]
level = 2
greedy = true
font.name = "BookAntiqua"
font.size = 23.377168655395508
font.size_tolerance = 0.2

[[heading]]
level = 2
greedy = true
font.name = "BookAntiqua"
font.size = 24.0

[[heading]]
level = 3
greedy = true
font.name = "BookAntiqua-Bold"
font.size = 12.0

And this is the result. Still need to edit the TOC a tad before writing it to a new PDF. Couldn't imagine manually creating the thousands of bookmarks in this 18-volume book.

Regex to truncate any bookmark titles longer than 60 characters:
search:
(^"\d+:\d+.\D{60})(.*?")
replace:
$1"

To match all the lines that are not verses (verses look like 7:23, 8:23, 110:11 for instance):
(^"\D+\.*?\d+ \d+.\d+)

(^"\D+.*?$)

To select just the verse lines, use this regex in the VSCode find (regex mode) and press Ctrl+Shift+L to select all matches. Cut them, delete the rest of the lines, then paste them back:
(^"\d+:\d+.*?$)
Switched Substicher down to a max of 99 for the last month or so, but now it's back up to 200. No matter what I did (removing silence, medium or large model, translate or no translate, or transcribing in another language then translating), lo and behold it would hallucinate (repeat itself). So now I'm combating that by keeping each audio segment under 30 mins. Hopefully when all 200 subtitles are stitched back into one, it'll be in sync.
This particular audiobook I'm transcribing is 84 hours, so 200 chunks is ~25 min each. Found out that the large model with all Core ML compute units works fine, but then I can't really use the M1 for anything else, so the -ng (no GPU) option lets me keep using the laptop while whisper.cpp goes a tad slower. Worth it for me though.
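Splitting the audio into sub-30-minute chunks is easy with ffmpeg's segment muxer. A sketch, assuming 25-minute (1500 s) pieces with stream copy (the filenames are made up):

ffmpeg -i book.opus -f segment -segment_time 1500 -c copy chunk_%03d.opus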
yt-dlp -f wa -o "%(autonumber)03d - %(title)s.%(ext)s" "someyoutubeplaylist"

If the videos in the playlist only have chapter-style names, the downloaded files will sort out of order. Numbering them by modified date might not always work either.

The Cow [HDenOsoOJXQ].mp4
The Family of Amran [jUondpleUD8].mp4
The Opening [KlCyXnSCcyM].mp4
The Women [rY8l3LkcLKw].mp4


So instead of ending up out of order (like above), the playlist order is maintained:

001 - The Opening.mp4
002 - The Cow.mp4
003 - The Family of Amran.mp4
004 - The Women.mp4
005 - The Food.mp4
006 - The Cattle.mp4
007 - The Elevated Places.mp4


Download a playlist and number it in reverse, since the playlist itself is in reverse order:

yt-dlp -f wa --split-chapters --playlist-reverse -o "%(playlist_autonumber)03d - %(title)s.%(ext)s" "someyoutubeplaylist URL"
parallel isn't all that hard; I actually decided to read the man page once. I used to specify -j2 or -j8 for the number of jobs. No need: it defaults to one job per CPU core.
parallel examples. Notice that what comes after the ::: is the input.
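A minimal example of the shape (gzip here is an arbitrary stand-in):

parallel gzip ::: *.log

Every file after the ::: becomes one job, run across all cores.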

I just processed 105 MP3 files to remove silence and hiss. The first method I used took, on a Mac M1:

Removing silence and hiss took 01h:52m:05s

Now, I still might use this method, but obviously with parallel.

Now for an A/B comparison, I used only ffmpeg to remove silence, remove hiss, and apply dynamic audio normalization:

Removing silence and hiss took 00h:40m:23s

Removing silence and hiss using parallel took 00h:10m:42s

So yeah, roughly 4x faster.
Notice it removed almost 4 hours of silent segments in 80 hours of content. That can be adjusted with the dB threshold; I think I'm happy with -30dB though.
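For reference, silence removal in ffmpeg is the silenceremove filter. A sketch with a -30dB threshold; the stop_periods/stop_duration values here are assumptions, tune to taste:

ffmpeg -i in.opus -af "silenceremove=stop_periods=-1:stop_duration=2:stop_threshold=-30dB" out.opus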
Example: converting a for-in loop using ffmpeg to parallel.

dtmove=$(date +%Y_%m_%d_%H_%M_%S)
[ ! -d output ] && mkdir output
mkdir output/"$dtmove"

# $dt is the timestamped directory from the earlier run that holds the input files
parallel --bar ffmpeg -i {} -hide_banner -c:a libopus -b:a 32k -af "highpass=200,lowpass=3000,afftdn=tr=1,volume=8dB,dynaudnorm" output/"$dtmove"/{/} ::: output/"$dt"/*.opus

For comparison, here is the original for loop that the parallel line above replaces:

for i in *.opus ; do ffmpeg -i "$i" -hide_banner -c:a libopus -b:a 32k -af "highpass=200,lowpass=3000,afftdn=tr=1,volume=8dB,dynaudnorm" ../"$dtmove"/"$i" ; done
Attached: mpvconfig02152024mac.zip (3.7 MB), mainly for opus chaptered audiobooks. Includes IPTV lists.