Install Whisper, which lets you transcribe speech to text. You can use Whisper for free on your own PC (local environment). Whisper is an advanced speech recognition model developed by OpenAI that converts speech into text. Typical uses include:
- Transcription of meetings and lectures: Audio from a meeting or lecture can be converted into text for later reference.
- Subtitle generation: Can be used to add subtitles to video or audio content. This is especially useful for making content accessible to the hearing impaired and speakers of different languages.
- Voice command processing: Can be used to analyze user voice commands and perform corresponding actions using speech recognition.
- Recording interviews: Can be used to convert the audio of interviews conducted by journalists and researchers into text.
- Real-time interpretation: Can be used for real-time translation to aid communication between different languages.
- Converting voice memos to text: Can be used to convert personal voice memos and ideas into text for easier retrieval later.
- Legal and medical transcription: Can be used to record and document conversations in a legal or medical setting.
- Transcribing podcasts and audiobooks: Can be used to provide the content of podcasts and audiobooks in text format.
Whisper is useful in a wide variety of situations due to its accuracy and multilingual capabilities. It is especially suited to situations requiring quick and accurate transcription. The following repository will help you work with it:
https://github.com/jhj0517/Whisper-WebUI
That page lists the requirements for installation on Windows. Install these:
- Git for Windows
- Python (version 3.8 to 3.10)
- FFmpeg
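Before cloning, it can help to confirm that the three prerequisites are actually reachable from the command line. The following is a minimal sketch, not part of Whisper-WebUI: the function names are made up for illustration, and it assumes that Git and FFmpeg are exposed on PATH as `git` and `ffmpeg`.

```python
import shutil
import sys

# Version range taken from the requirements listed above (3.8 to 3.10).
MIN_PYTHON = (3, 8)
MAX_PYTHON = (3, 10)


def python_version_ok(version=None):
    """Return True if (major, minor) lies within the supported range."""
    major_minor = tuple((version or sys.version_info)[:2])
    return MIN_PYTHON <= major_minor <= MAX_PYTHON


def missing_tools(tools=("git", "ffmpeg")):
    """Return the tools from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def report():
    """Print a short summary of what is installed and what is missing."""
    current = ".".join(str(part) for part in sys.version_info[:3])
    status = "OK" if python_version_ok() else "outside 3.8 to 3.10"
    print(f"Python {current}: {status}")
    missing = missing_tools()
    for tool in ("git", "ffmpeg"):
        print(f"{tool}: {'not found on PATH' if tool in missing else 'found'}")
```

Calling `report()` prints one line per prerequisite. If a tool is reported as missing even though it is installed, its folder probably has not been added to PATH.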
First, open a command prompt and navigate to a working directory of your choice. This example uses a folder named youtube at the root of the drive. Clone the repository into this directory.
cd\
cd youtube
git clone https://github.com/jhj0517/Whisper-WebUI.git
After cloning, a directory called Whisper-WebUI should have been created, so move to it.
cd Whisper-WebUI
This directory contains the same files as the GitHub repository, so you can install Whisper by running the batch file.
Install.bat
This batch file creates a Python virtual environment (venv), so the installation does not affect the rest of your system. Here are the details of each step:
- @echo off: Prevents the commands themselves from appearing on the console while the batch file is running.
- if not exist "%~dp0\venv\Scripts": This conditional statement checks whether the venv\Scripts folder exists. %~dp0 refers to the directory where the batch file itself resides.
- python -m venv venv: If the venv\Scripts folder does not exist, this command creates a new Python virtual environment (venv).
- echo checked the venv folder. now installing requirements.: Notifies the user that the virtual environment has been checked.
- cd /d "%~dp0\venv\Scripts" and call activate.bat: These commands activate the created virtual environment.
- cd /d "%~dp0" and pip install -r requirements.txt: Returns to the directory where the batch file resides and installs the dependencies listed in the requirements.txt file.
- if errorlevel 1: This conditional statement checks whether the pip install command terminated with an error. If there is an error, it prompts the user to remove the virtual environment and rerun install.bat. If there are no errors, it informs the user that the dependencies were successfully installed.
- pause: Stops the execution of the script until the user presses any key.
In short, this script is used to facilitate the setup of a Python project and automates the creation of the virtual environment and the installation of necessary dependencies.
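For readers more comfortable with Python than with batch syntax, the same flow can be sketched in Python. This is an illustration of the steps described above, not the contents of Install.bat: the function names and printed messages are assumptions, and the folder names are switched on non-Windows systems so the sketch also runs there.

```python
import subprocess
import sys
import venv
from pathlib import Path


def venv_python(project_dir):
    """Path to the interpreter inside <project_dir>/venv."""
    scripts = "Scripts" if sys.platform == "win32" else "bin"
    exe = "python.exe" if sys.platform == "win32" else "python"
    return Path(project_dir) / "venv" / scripts / exe


def ensure_venv(project_dir, with_pip=True):
    """Create <project_dir>/venv only if its scripts folder is missing."""
    python = venv_python(project_dir)
    if not python.parent.exists():
        venv.create(Path(project_dir) / "venv", with_pip=with_pip)
    return python


def install_requirements(project_dir):
    """Run pip inside the venv; return True when pip exits with code 0."""
    python = ensure_venv(project_dir)
    result = subprocess.run(
        [str(python), "-m", "pip", "install", "-r", "requirements.txt"],
        cwd=project_dir,
    )
    if result.returncode != 0:
        # Mirrors the "if errorlevel 1" branch: suggest starting over.
        print("Install failed. Delete the venv folder and run the installer again.")
        return False
    print("Requirements installed.")
    return True
```

Running pip through the environment's own interpreter has the same effect as activating the environment first, which is why the sketch has no separate activation step.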
This section also describes Python virtual environments.
A Python virtual environment is an independent environment for managing the Python versions and packages required for a particular project. Using this virtual environment avoids dependency conflicts between different projects. In other words, when developing multiple projects simultaneously on a single system, you can independently manage the versions of specific libraries and modules required by each project.
The main advantages of using a virtual environment are:
- Separation of dependencies: Different projects can use different versions of libraries, preventing interference between projects.
- Safe experimentation: New packages and updates can be tested safely without affecting the entire system.
- Easy dependency management: Manage project dependencies with files such as requirements.txt and easily share them with other developers.
- Consistency with production environments: Minimize differences between development and production environments.
To create a virtual environment in Python, use tools such as venv (standard in Python 3.3 and later) or virtualenv (for use with older versions of Python or when additional features are needed). These tools make it easy to set up and manage independent Python runtime environments for each project.
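To check whether a script is really running inside a virtual environment, you can compare two values from the `sys` module. This is a small sketch for environments created by venv; the function names are made up for illustration.

```python
import sys


def in_virtual_environment():
    """True when the running interpreter belongs to a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment, while
    # sys.base_prefix still points at the Python installation it was built from.
    return sys.prefix != sys.base_prefix


def describe_environment():
    """Return a one-line description of where packages will be installed."""
    kind = "virtual environment" if in_virtual_environment() else "system-wide Python"
    return f"{kind}: {sys.prefix}"
```

Printing `describe_environment()` before running pip is a quick way to confirm that packages will land in the project's environment rather than in the system-wide installation.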
Next, run the following batch file as well. Its main purpose is to run the app.py script using the Python interpreter inside the virtual environment.
start-webui.bat
- @echo off: Prevents the commands themselves from appearing on the screen during subsequent command executions.
- goto :activate_venv: Transfers control of the batch file to the :activate_venv label.
- :launch label: Runs the Python script app.py using the Python interpreter stored in the %PYTHON% variable. The %* is used to pass command line arguments to app.py. The pause command keeps the prompt open after script execution.
- :activate_venv label: This is where the PYTHON environment variable is set to the path of the Python interpreter in the virtual environment. "%~dp0\venv\Scripts\Python.exe" is the path to the Python interpreter in the virtual environment in the directory where the batch file resides. Next, the echo command displays the path to the Python interpreter currently in use, and goto :launch transfers control to the :launch label.
- :endofscript label: This label is not actually used, but can be used to pause the program with a message upon script exit or error.
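The launcher logic described above comes down to building one command line. The sketch below shows that idea in Python. It is based only on the description in this article, not on the actual file, and the function names and the working directory choice are assumptions.

```python
import subprocess
from pathlib import PureWindowsPath


def build_launch_command(batch_dir, extra_args=()):
    """Return the command that the :launch step is described as running."""
    base = PureWindowsPath(batch_dir)                    # plays the role of %~dp0
    python = base / "venv" / "Scripts" / "Python.exe"    # the %PYTHON% value
    return [str(python), "app.py", *extra_args]          # %* becomes extra_args


def launch(batch_dir, extra_args=()):
    """Run app.py with the venv interpreter and wait for it to exit."""
    command = build_launch_command(batch_dir, extra_args)
    print(f"Using interpreter: {command[0]}")
    # Assumption: app.py is resolved relative to the batch file's folder.
    return subprocess.run(command, cwd=batch_dir).returncode
```

For example, `build_launch_command(r"C:\youtube\Whisper-WebUI", ["--example-flag"])` returns the interpreter path, `app.py`, and the extra argument as a list. The flag name here is hypothetical and only shows how arguments are forwarded.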
In start-webui.bat, the pause lets you see the results after the process has finished. Once the script is running, the web UI can be accessed in a browser at:
http://127.0.0.1:7860
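If the page does not load, it is worth checking whether anything is listening on that address yet, since the server may still be starting. The following sketch only tests that a TCP connection can be opened. The host and port come from the URL above, and the function name is made up for illustration.

```python
import socket


def is_port_open(host="127.0.0.1", port=7860, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused" as well as timeouts.
        return False
```

A result of `False` right after launching usually just means the server has not finished starting, so try again after a few seconds.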
Now let's upload an audio file and see how it works. Music with vocals is fine, as long as the audio contains a voice. In my test, the lyrics were displayed successfully. Next, transcribe the audio contained in a video file, with the goal of creating subtitles. By setting the file format to SRT, you can easily add subtitles to the video.
The SRT (SubRip Text) file format is a very simple text-based format used to store subtitles for movies and videos. An SRT file defines the subtitle text and the time at which it should appear.
The basic structure of an SRT file is as follows:
- Sequential numbering: each subtitle begins with a sequential number starting from 1.
- Time code: Indicates when the subtitle appears and disappears. The format is usually hours:minutes:seconds,milliseconds --> hours:minutes:seconds,milliseconds. For example, 00:01:20,000 --> 00:01:22,000 means the subtitle starts at 1 minute 20 seconds and ends at 1 minute 22 seconds.
- Subtitle text: Placed immediately after the time code. This is the text that will appear on the screen.
- Blank lines: Each subtitle section is separated by a blank line.
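The four rules above are simple enough to generate by hand. As a minimal sketch, the following Python functions turn a list of (start, end, text) segments, with times in seconds, into SRT text. The function names are made up for illustration and are not part of Whisper or Whisper-WebUI.

```python
def format_timestamp(seconds):
    """Convert seconds to the SRT form hours:minutes:seconds,milliseconds."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def format_entry(index, start, end, text):
    """Build one subtitle block: sequence number, time code, then text."""
    return f"{index}\n{format_timestamp(start)} --> {format_timestamp(end)}\n{text}\n"


def build_srt(segments):
    """Join (start, end, text) tuples into SRT text, numbering from 1."""
    entries = [
        format_entry(i, start, end, text)
        for i, (start, end, text) in enumerate(segments, start=1)
    ]
    # Joining with a newline leaves one blank line between blocks.
    return "\n".join(entries)
```

For instance, `format_timestamp(80)` gives `00:01:20,000`, which matches the time code example in the list above.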
Example of an SRT file:
1
00:00:20,000 --> 00:00:24,400
Hello, welcome to the movie.

2
00:00:24,600 --> 00:00:27,800
This is a subtitle example.

3
00:00:28,000 --> 00:00:31,150
Thank you for watching.
In this example, there are three different subtitles, each appearing at a specified time.
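Going the other way, reading an SRT file back into a program is useful when you want to check or edit the generated subtitles. The parser below is a rough sketch that assumes well-formed input with blocks separated by blank lines. The names are made up for illustration, and malformed blocks are skipped rather than reported.

```python
import re

# Matches a time code line such as "00:00:20,000 --> 00:00:24,400".
TIME_CODE = re.compile(
    r"(\d+):(\d{2}):(\d{2}),(\d{3}) --> (\d+):(\d{2}):(\d{2}),(\d{3})"
)


def to_seconds(hours, minutes, seconds, millis):
    """Convert the four time code fields to seconds as a float."""
    return int(hours) * 3600 + int(minutes) * 60 + int(seconds) + int(millis) / 1000


def parse_srt(content):
    """Return a list of (index, start_seconds, end_seconds, text) tuples."""
    subtitles = []
    # Blocks are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", content.strip()):
        lines = block.strip().splitlines()
        if len(lines) < 3:
            continue  # skip incomplete blocks
        match = TIME_CODE.fullmatch(lines[1].strip())
        if match is None:
            continue  # skip blocks without a valid time code
        parts = match.groups()
        start = to_seconds(*parts[:4])
        end = to_seconds(*parts[4:])
        subtitles.append((int(lines[0].strip()), start, end, "\n".join(lines[2:])))
    return subtitles
```

With the parsed tuples you can, for example, list every subtitle whose text is longer than a chosen number of characters and shorten it by hand.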
SRT files are also supported by PowerDirector, so you can actually insert the subtitles into a video. The most up-to-date PowerDirector also has an AI automatic transcription feature.
There were very few typos, although I had to manually edit subtitles that were too long to be displayed. There are paid services, but this is sufficient, and being able to do this much for free is very satisfying. Furthermore, by entering the URL of a YouTube video, you can transcribe its audio as well. It also has a translation function, so you can even translate the transcription into English.