Plume ASR
Generates text from audio containing speech
Prerequisites
# apt install libsndfile-dev ffmpeg
Features
- ASR using Jasper (from NemoToolkit)
- ASR using Wav2Vec2 (from fairseq)
Installation
To install the package and its dependencies, run:
python setup.py install
or, with pip:
pip install .[all]
The installation should work on Python 3.6 or newer. It is untested on Python 2.7.
Usage
Library
Jasper
from plume.models.jasper.asr import JasperASR
asr_model = JasperASR("/path/to/model_config_yaml", "/path/to/encoder_checkpoint", "/path/to/decoder_checkpoint")  # Loads the models
TEXT = asr_model.transcribe(wav_data) # Returns the text spoken in the wav
Wav2Vec2
from plume.models.wav2vec2.asr import Wav2Vec2ASR
asr_model = Wav2Vec2ASR("/path/to/ctc_checkpoint", "/path/to/w2v_checkpoint", "/path/to/target_dictionary")  # Loads the models
TEXT = asr_model.transcribe(wav_data) # Returns the text spoken in the wav
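Both snippets above pass a `wav_data` value that is never defined. The sketch below shows one possible way to produce it, under the assumption (not confirmed by this README) that `transcribe` accepts the raw bytes of a WAV file. The helper name `read_wav_bytes` and the `expected_rate` default of 16 kHz mono are illustrative choices, not part of plume; check which format your model checkpoint expects.

```python
import io
import wave


def read_wav_bytes(path, expected_rate=16000):
    """Return the raw bytes of a WAV file after a basic sanity check.

    Hypothetical helper, not part of plume. The 16 kHz mono requirement
    is an assumption; adjust it to match your model.
    """
    with open(path, "rb") as f:
        data = f.read()

    # Parse the header from memory so the file is only read once.
    with wave.open(io.BytesIO(data), "rb") as wav:
        if wav.getnchannels() != 1:
            raise ValueError("expected mono audio, got %d channels" % wav.getnchannels())
        if wav.getframerate() != expected_rate:
            raise ValueError(
                "expected %d Hz audio, got %d Hz" % (expected_rate, wav.getframerate())
            )
    return data


# Usage (assumes transcribe() takes WAV bytes):
# wav_data = read_wav_bytes("/path/to/audio.wav")
# TEXT = asr_model.transcribe(wav_data)
```

Validating the channel count and sample rate up front gives a clear error message instead of a silently poor transcription when the audio does not match what the model was trained on.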
Command Line
$ plume