Nathan Wailes - Blog - GitHub - LinkedIn - Patreon - Reddit - Stack Overflow - Twitter - YouTube
How to get a fairly high-quality transcription of your speech quickly and for free by using YouTube
- Background
- I've noticed that it can be far faster to "write" something by first speaking out loud and having your speech transcribed, and then using that transcription as a first draft.
- It seems to me that what has prevented most people from using this method is that 1) having a human transcribe your speech is expensive (~$1 per minute of speech), and 2) transcription software has been somewhat expensive (~$120+) and not very good.
- Step-by-step process
- Spend some time thinking up a basic outline of the points you would like to make.
- This will help prevent you from creating an overly-long, rambling, poorly-organized video, which will be much harder to turn into easily-digestible prose.
- Record a video in which you go through the points you've decided you want to make.
- Give examples and explanations where you feel they will help the listener / reader.
- This transcription method is especially good at helping you quickly convey explanations and examples that would normally take a long time to type out.
- Try to record in a quiet environment.
- Try to speak clearly.
- Try to speak at a distance from the microphone where your voice will be clear on the recording.
- Upload the video to YouTube.
- Wait for YouTube to create a transcription for your video.
- This may be immediate; I'm not sure.
- Get the HTML containing the transcript into Notepad++.
- Background
- We are doing this so that we can extract the transcript from its surrounding HTML.
- Sublime Text 2 can work as well, but it takes a long time to manipulate large transcripts from long videos (1.5 hours or more), whereas Notepad++ has no such problem.
- Step-by-step process
- Using Chrome, navigate to the video's URL, click the "More" button, and select the "Transcript" option.
- A div will become visible underneath the "More" button that will say "Transcript" and will have a closed drop-down box that says something like "English (Automatic Captions)".
- Click the drop-down box and click the option that says "English (Automatic Captions)".
- There will usually only be one option in the drop-down box.
- The transcript should then become visible underneath the drop-down box.
- Right click one of the lines in the transcript and select "Inspect".
- When Chrome's DevTools opens, select the parent div of the line of the transcript that you clicked.
- As I'm writing this, the id of the parent div is "transcript-scrollbox".
- Right click that div and select "Edit as HTML".
- The unmodifiable HTML source code should change into a modifiable text-box containing the same source code.
- Click into the text-box and press "Ctrl + a" to select all of the HTML in that div.
- Press "Ctrl + c" to copy the source.
- Open up Notepad++ and, if necessary, create a new document.
- Paste in the source HTML that you just copied.
- Hit "Ctrl + h" to open the "Replace" search box.
- Make sure that the "Search Mode" in the "Replace" box is set to "Regular expression".
- Get rid of the HTML tags and timestamps (e.g. "1:18:29").
- For "Find what", put
<.*?>|(?:[\d:]*\d:[\d:]+)
- For "Replace with", put a single space: " " (without the quotation marks).
- Click "Replace all".
- Get rid of the excess spaces that were created when removing the HTML tags / timestamps.
- In the "Find what" box, put
\s\s+
- In the "Replace with" box, leave it as a single empty space.
- Click "Replace all".
- The document should now be all and only the transcription of your speech.
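If you would rather script the two find-and-replace passes than run them by hand in Notepad++, the same regular expressions can be applied with Python's `re` module. This is only a rough sketch: the function name `html_to_transcript` and the sample HTML are made up for illustration (the real markup copied from DevTools will look different), and the final `.strip()` is an extra step that is not part of the Notepad++ process above.

```python
import re

# The same two patterns used in the Notepad++ "Replace" steps.
TAGS_AND_TIMESTAMPS = re.compile(r"<.*?>|(?:[\d:]*\d:[\d:]+)")
EXCESS_WHITESPACE = re.compile(r"\s\s+")


def html_to_transcript(html: str) -> str:
    """Replace tags and timestamps with a space, then collapse extra whitespace."""
    text = TAGS_AND_TIMESTAMPS.sub(" ", html)
    text = EXCESS_WHITESPACE.sub(" ", text)
    # Extra step (not in the manual process): trim the leading/trailing space.
    return text.strip()


# Hypothetical sample markup; only the id "transcript-scrollbox" comes from
# the steps above, the rest is invented for this example.
sample = (
    '<div id="transcript-scrollbox">'
    '<div class="line"><div>0:01</div><div>hello and welcome</div></div>'
    '<div class="line"><div>1:18:29</div><div>thanks for watching</div></div>'
    '</div>'
)

print(html_to_transcript(sample))
```

Note that, just as in Notepad++, the timestamp pattern will also remove any clock-style numbers that appear in the speech itself (for example "3:30"), so you may need to restore those by hand.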