jlengrand/tldw

mirror of https://github.com/jlengrand/tldw.git synced 2026-03-10 08:51:17 +00:00

Mike df95e86878 diarize cli wip

2023-08-04 11:26:55 -04:00

Name	Last commit	Date
compare	more ufo compares	2023-07-30 15:43:54 -04:00
data	run the 4 testcases	2023-08-02 19:07:53 -04:00
params	consitent naming	2023-07-30 13:05:26 -04:00
prompts	switch to json instead of text so we can chunk by time	2023-08-02 18:45:21 -04:00
results	run the 4 testcases	2023-08-02 19:07:53 -04:00
chunker.py	run the 4 testcases	2023-08-02 19:07:53 -04:00
compare-app.py	wip	2023-08-04 10:46:46 -04:00
compare.py	results!	2023-07-30 11:34:29 -04:00
diarize.py	diarize cli wip	2023-08-04 11:26:55 -04:00
merger.py	wip	2023-08-04 10:46:46 -04:00
pyannote.py	wip	2023-08-04 10:46:46 -04:00
README.md	improve readme	2023-07-31 11:41:15 -04:00

Too Long, Didn't Watch

YouTube contains an incredible amount of knowledge, much of which is locked inside multi-hour videos. Let's extract and summarize with AI!

yt-dlp - download audio tracks of YouTube videos
ffmpeg - transcode audio to 16 kHz mono WAV
whisper.cpp - transcribe audio to text
chunker.py - break text into parts and prepare each part for LLM summarization
can-ai-code - leverage the `interview_cuda` or `interview-llamacpp` executor to run LLM inference
compare.py - prepare LLM outputs for webapp
compare-app.py - summary viewer webapp
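
The chunking step above can be illustrated with a minimal sketch. This is not the code in chunker.py (which, per the commit log, works on JSON so it can chunk by time); it assumes a plain-text transcript and splits by word count with a small overlap. The function names, the default sizes, and the prompt template are all hypothetical.

```python
# Hypothetical sketch: split a transcript into overlapping word-count chunks
# and wrap each chunk in a summarization prompt. Not the project's chunker.py.

def chunk_words(text: str, max_words: int = 500, overlap: int = 50) -> list[str]:
    """Return chunks of at most max_words words; consecutive chunks share `overlap` words."""
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must be in [0, max_words)")
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last chunk already reaches the end of the transcript
    return chunks


def build_prompt(chunk: str, template: str = "Summarize this transcript section:\n\n{chunk}") -> str:
    """Fill an (assumed) prompt template with one chunk."""
    return template.format(chunk=chunk)


def prepare_prompts(text: str, max_words: int = 500, overlap: int = 50) -> list[str]:
    """Chunk the transcript and build one prompt per chunk."""
    return [build_prompt(c) for c in chunk_words(text, max_words, overlap)]
```

The overlap keeps a sentence that straddles a boundary visible to both neighbouring chunks, at the cost of summarizing a few words twice.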

This project is under active development and is not ready for production use.

DEMO @ HF Space

Video Transcript Datasets

Filename	Title	Whisper Model	URL
ufo.txt	Subcommittee on National Security, the Border, and Foreign Affairs Hearing	small.en	https://www.youtube.com/watch?v=KQ7Dw-739VY
aoe-grand-finale.txt	GRAND FINAL $10,000 AoE2 Event (The Resurgence)	medium.en	https://www.youtube.com/watch?v=jnoxjLJind4

Creating a Dataset

Download with yt-dlp

Download the audio track:

pip install yt-dlp
yt-dlp -f "bestaudio[ext=m4a]" --extract-audio 'https://www.youtube.com/watch?v=<video>'

Transcode with ffmpeg

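Convert the audio track to 16 kHz mono 16-bit PCM wav, the format whisper.cpp expects (run this in a directory containing a single .m4a file, since the *.m4a glob must expand to exactly one input):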

ffmpeg -i *.m4a -hide_banner -vn -loglevel error -ar 16000 -ac 1 -c:a pcm_s16le -y resampled.wav
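
A quick way to confirm the transcode produced what the flags above ask for (-ar 16000, -ac 1, pcm_s16le) is to inspect the WAV header with Python's standard wave module. This is an optional sanity check, not part of the project; the function name is hypothetical.

```python
import wave


def is_whisper_ready(path: str) -> bool:
    """Return True if the file is 16 kHz, mono, 16-bit PCM WAV.

    wave.open raises wave.Error for files that are not PCM WAV.
    """
    with wave.open(path, "rb") as wav:
        return (
            wav.getframerate() == 16000   # matches -ar 16000
            and wav.getnchannels() == 1   # matches -ac 1
            and wav.getsampwidth() == 2   # 2 bytes per sample, i.e. pcm_s16le
        )
```

For example, `is_whisper_ready("resampled.wav")` should return True after the ffmpeg command above has run.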

Transcribe with whisper.cpp

Transcribe the wav to txt (set -t to the number of CPU threads available on your machine):

main -m ../models/ggml-medium.en.bin -f resampled.wav -t 32 -otxt