# Plume ASR

[![image](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/python/black)

> Generates text from audio containing speech
---

# Table of Contents

* [Prerequisites](#prerequisites)
* [Features](#features)
* [Installation](#installation)
* [Usage](#usage)

# Prerequisites
```bash
sudo apt install libsndfile-dev ffmpeg
```
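
To confirm that the prerequisites are visible before installing, a minimal sketch using only the standard library could look like the following. The function name `check_prerequisites` is hypothetical and not part of plume. Note that `find_library` looks for the libsndfile runtime library rather than the development headers, so treat a missing result as a hint, not proof.

```python
import ctypes.util
import shutil


def check_prerequisites():
    """Report whether ffmpeg and libsndfile can be located on this system.

    Hypothetical helper, not part of plume.
    """
    return {
        # ffmpeg is found if its executable is on PATH
        "ffmpeg": shutil.which("ffmpeg") is not None,
        # libsndfile is found if the shared library can be located
        "libsndfile": ctypes.util.find_library("sndfile") is not None,
    }


if __name__ == "__main__":
    for name, found in check_prerequisites().items():
        print(f"{name}: {'found' if found else 'missing'}")
```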

# Features

* ASR using Jasper (from [NemoToolkit](https://github.com/NVIDIA/NeMo))
* ASR using Wav2Vec2 (from [fairseq](https://github.com/pytorch/fairseq))

# Installation
To install the package and its dependencies, run:
```bash
python setup.py install
```
or, with pip:
```bash
pip install .[all]
```

The installation should work on Python 3.6 or newer. It is untested on Python 2.7.

# Usage
### Library
> Jasper
```python
from plume.models.jasper_nemo.asr import JasperASR

# Load the models from the config and checkpoints
asr_model = JasperASR("/path/to/model_config_yaml", "/path/to/encoder_checkpoint", "/path/to/decoder_checkpoint")
# Returns the text spoken in the wav
text = asr_model.transcribe(wav_data)
```
> Wav2Vec2
```python
from plume.models.wav2vec2.asr import Wav2Vec2ASR

# Load the models from the checkpoints and target dictionary
asr_model = Wav2Vec2ASR("/path/to/ctc_checkpoint", "/path/to/w2v_checkpoint", "/path/to/target_dictionary")
# Returns the text spoken in the wav
text = asr_model.transcribe(wav_data)
```
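
The snippets above take `wav_data` as given. As a minimal sketch, assuming `transcribe` accepts the raw PCM bytes of a WAV file (an assumption; this README does not state the expected input type, sample rate, or channel count), the standard-library `wave` module can read them. The helper name `read_wav_bytes` is hypothetical and not part of plume.

```python
import wave


def read_wav_bytes(path):
    """Return (pcm_bytes, sample_rate, channels, sample_width) for a WAV file.

    Hypothetical helper, not part of plume.
    """
    with wave.open(path, "rb") as wf:
        # Read every frame in the file as raw PCM bytes
        pcm_bytes = wf.readframes(wf.getnframes())
        return pcm_bytes, wf.getframerate(), wf.getnchannels(), wf.getsampwidth()


# Assumed usage (the input format is not confirmed by this README):
# wav_data, rate, channels, width = read_wav_bytes("/path/to/audio.wav")
# text = asr_model.transcribe(wav_data)
```

Returning the sample rate, channel count, and sample width alongside the bytes makes it easy to check that a recording matches whatever format the chosen model expects.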
### Command Line
```
$ plume
```
### Pretrained Models
* **Jasper**: https://ngc.nvidia.com/catalog/models/nvidia:multidataset_jasper10x5dr/files?version=3
* **Wav2Vec2**: https://github.com/pytorch/fairseq/blob/master/examples/wav2vec/README.md