Mirror of https://github.com/jlengrand/tldw.git, synced 2026-03-10 08:51:17 +00:00 (ca89524af53f8d4c9a28e6fc828eb3b09563f68f)
Too Long, Didn't Watch
WIP experiments in summarizing long YouTube videos.
What's going on here?
- Downloaded https://www.youtube.com/watch?v=KQ7Dw-739VY
- Used whisper.cpp to transcribe audio to text (see ufo-clean.txt)
- Trim the relevant section of text (see ufo-clean-parts.txt)
- Break text into chunks
- Summarize each chunk
- Profit?
Step-by-step
Download audio (m4a)
pip install
Transcode audio (wav)
ffmpeg ..
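The exact ffmpeg arguments are elided above. As a rough sketch, assuming the goal is the 16 kHz, mono, 16-bit PCM WAV input that whisper.cpp expects, the transcode step could be wrapped in Python like this. The function names and file names are hypothetical and not taken from the repository.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(src: str, dst: str) -> list:
    """Build an ffmpeg command that converts `src` to 16 kHz mono 16-bit PCM WAV.

    Assumption: these settings match what whisper.cpp expects as input;
    the original README does not show the arguments it used.
    """
    return [
        "ffmpeg",
        "-i", src,            # input file, e.g. the downloaded .m4a
        "-ar", "16000",       # resample to 16 kHz
        "-ac", "1",           # downmix to mono
        "-c:a", "pcm_s16le",  # 16-bit little-endian PCM
        dst,
    ]


def transcode(src: str, dst: str) -> None:
    """Run ffmpeg; raises if the input is missing or ffmpeg exits non-zero."""
    if not Path(src).exists():
        raise FileNotFoundError(src)
    subprocess.run(build_ffmpeg_command(src, dst), check=True)
```

Calling `transcode("audio.m4a", "audio.wav")` requires ffmpeg on the PATH; `build_ffmpeg_command` alone can be used to inspect the command without running it.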
Transcribe with whisper.cpp
main -m models/.. -t <threads>
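If the pipeline is driven from Python, the command above could be assembled as in the sketch below. The model path stays a caller-supplied parameter because it is elided in the original. The `-f` (input file) and `-otxt` (write a text file) flags are an assumption about how the transcript was produced, and `build_whisper_command` is a hypothetical helper.

```python
def build_whisper_command(model_path: str, wav_path: str, threads: int) -> list:
    """Build a whisper.cpp `main` invocation for one WAV file.

    Assumption: the transcript is written with -otxt; the README only
    shows the -m and -t flags.
    """
    if threads < 1:
        raise ValueError("threads must be at least 1")
    return [
        "main",
        "-m", model_path,    # path to the model file (elided in the README)
        "-t", str(threads),  # number of threads
        "-f", wav_path,      # 16 kHz WAV produced by the transcode step
        "-otxt",             # also write the transcript as a .txt file
    ]
```

The resulting list can be passed to `subprocess.run(..., check=True)` from the directory containing the whisper.cpp binary.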
Chunk
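The README does not say how the transcript is split. One possible approach, shown here only as a minimal sketch, is fixed-size word windows with a small overlap so that sentences cut at a boundary still appear whole in one chunk. The function name and the default sizes are assumptions.

```python
def chunk_text(text: str, max_words: int = 500, overlap: int = 50) -> list:
    """Split `text` into chunks of at most `max_words` words.

    Consecutive chunks share `overlap` words. Both defaults are
    illustrative, not values taken from the repository.
    """
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must be in [0, max_words)")

    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        # Stop once this window has reached the end of the text,
        # so no trailing chunk consists only of overlap.
        if start + max_words >= len(words):
            break
    return chunks
```

For example, `chunk_text(open("ufo-clean-parts.txt").read())` would yield chunks of roughly 500 words each. Splitting on words instead of model tokens is a simplification.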
Process
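The summarization model is not specified here, so the sketch below keeps it pluggable: `summarize_chunks` applies any callable to each chunk. The `first_sentence` function is a deliberately trivial placeholder used only to make the example runnable. It is not the summarizer used in this project.

```python
from typing import Callable, Iterable


def first_sentence(chunk: str) -> str:
    """Placeholder summarizer: return the first sentence of the chunk."""
    head, sep, _ = chunk.strip().partition(". ")
    return head + "." if sep else head


def summarize_chunks(
    chunks: Iterable[str],
    summarize: Callable[[str], str] = first_sentence,
) -> list:
    """Summarize each non-empty chunk independently, preserving order."""
    return [summarize(chunk) for chunk in chunks if chunk.strip()]
```

A real run would pass a function that calls the chosen model in place of `first_sentence`. The per-chunk summaries can then be joined, or summarized again, to get a single overview.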