Commit Graph

10 Commits (120302aad33e397beadd9692eac84b1952a8e906)

Author SHA1 Message Date
Malar Kannan 120302aad3 added support for name/dates/cities call data extraction and more logs 2020-06-15 10:24:38 +05:30
Malar Kannan a7a25e9b07 1. using dataname args for update/fill annotations
2. rename to dump_ui
2020-06-10 14:55:59 +05:30
Malar Kannan 8db1be0083 refactor validation process arguments and logging 2020-06-05 16:32:08 +05:30
Malar Kannan bca227a7d7 1. removed the transcriber_pretrained/speller from utils
2. introduced get_mongo_coll to get the collection object directly from mongo uri
3. removed processing of correction entries to remove space/upper casing
2020-06-04 17:49:16 +05:30
Malar Kannan de21952349 1. refactored wav chunk processing method
2. renamed streamlit to validation_ui
2020-05-28 11:18:39 +05:30
Malar Kannan d87369c8fe don't load audio for annotation only ui and keep spoken as text for normal asr validation 2020-05-27 15:57:42 +05:30
Malar Kannan a38789d0c3 added option to disable plots during validation 2020-05-27 15:43:03 +05:30
Malar Kannan 1acf9e403c 1. added support for mono/dual channel rev transcripts
2. handle errors when extracting datapoints from rev meta data
3. added support for annotation-only task when dumping ui data
2020-05-27 15:19:25 +05:30
Malar Kannan 1f2bedc156 1. enabled silence stripping in chunks when recycling audio from asr logs
2. limited asr recycling to the first 1 min of audio to get reliable alignments, ignoring the agent channel
3. added rev recycler for generating asr dataset from rev transcripts and audio
4. updated pydub dependency for silence stripping fn and removed hardcoded threadpool worker count
2020-05-27 14:22:44 +05:30
Malar Kannan fca9c1aeb3 refactored module structure 2020-05-21 19:13:44 +05:30