# Plume ASR


Generates text from audio containing speech.


## Table of Contents

- [Prerequisites](#prerequisites)
- [Features](#features)
- [Installation](#installation)
- [Usage](#usage)
- [Pretrained Models](#pretrained-models)

## Prerequisites

On Debian/Ubuntu, install the system libraries Plume depends on (the `#` prompt indicates a root shell):

```sh
# apt install libsndfile-dev ffmpeg
```

## Features

## Installation

To install the package and its dependencies, run:

```sh
python setup.py install
```

or, with pip:

```sh
pip install .[all]
```

Installation should work on Python 3.6 or newer; Python 2.7 is untested.
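For development, an editable install (standard pip behavior, not specific to Plume) keeps the package importable straight from the source tree:

```sh
pip install -e ".[all]"
```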

## Usage

### Library

#### Jasper

```python
from plume.models.jasper_nemo.asr import JasperASR

# Load the model from its config and checkpoints
asr_model = JasperASR(
    "/path/to/model_config_yaml",
    "/path/to/encoder_checkpoint",
    "/path/to/decoder_checkpoint",
)
text = asr_model.transcribe(wav_data)  # Returns the text spoken in the wav
```

#### Wav2Vec2

```python
from plume.models.wav2vec2.asr import Wav2Vec2ASR

# Load the model from its checkpoints and target dictionary
asr_model = Wav2Vec2ASR(
    "/path/to/ctc_checkpoint",
    "/path/to/w2v_checkpoint",
    "/path/to/target_dictionary",
)
text = asr_model.transcribe(wav_data)  # Returns the text spoken in the wav
```
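Both snippets leave `wav_data` undefined. Here is a minimal sketch for producing it with the `soundfile` library (a Python binding for the libsndfile prerequisite above); that `transcribe` accepts the raw sample array and that the models expect 16 kHz mono audio are assumptions, not confirmed by this README:

```python
import soundfile as sf

# Read the audio into a float32 numpy array along with its sample rate.
wav_data, sample_rate = sf.read("utterance.wav", dtype="float32")

# Many ASR checkpoints expect 16 kHz mono input; convert beforehand
# (e.g. with ffmpeg) if the recording differs.
assert sample_rate == 16000, "resample the audio to 16 kHz first"

text = asr_model.transcribe(wav_data)
print(text)
```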

### Command Line

```sh
$ plume
```
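Assuming the CLI follows the usual `--help` convention, the available subcommands and options can be listed with:

```sh
$ plume --help
```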

## Pretrained Models

- Jasper: https://ngc.nvidia.com/catalog/models/nvidia:multidataset_jasper10x5dr/files?version=3
- Wav2Vec2: https://github.com/pytorch/fairseq/blob/master/examples/wav2vec/README.md