Malar Kannan
f5c49338d9
1. added deepgram support
2. compute asr sample accuracy
2020-08-07 12:02:01 +05:30
Malar Kannan
fa89775f86
1. add a new streamlit ui to preview manifest
2. implement rpyc transcription client for files
2020-08-07 12:00:33 +05:30
Malar Kannan
ae5586be72
added evaluation command
2020-07-09 14:36:51 +05:30
Malar Kannan
069392d098
1. added a test generator and slu evaluator
2. ui dump now includes gcp results
3. showing default options for more args in validation process commands
2020-06-29 14:24:56 +05:30
Malar Kannan
515e9c1037
1. split extract all data types in one shot with --extraction-type all flag
2. add notes about diffing split extracted and original data
3. add an nlu conv generator to generate conv data based on nlu utterances and entities
4. add task uid support for dumping corrections
5. abstracted generate date fn
2020-06-25 11:03:09 +05:30
Malar Kannan
e76ccda5dd
1. fix update-correction to use ui_dump instead of manifest
2. update training params: no. of checkpoints based on checkpoint frequency
2020-06-19 14:16:04 +05:30
Malar Kannan
000853b600
1. added option to strip silent chunks
2. computing caller quality based on task-id of corrections
2020-06-17 21:42:20 +05:30
Malar Kannan
ac0e04c226
stripping silence on call chunk
2020-06-17 19:43:25 +05:30
Malar Kannan
62eefb9294
fix 11st to 11th in ordinal
2020-06-17 19:30:12 +05:30
Malar Kannan
8e238c254e
1. added start delay arg in call recycler
2. implement ui_dump/manifest writer in call_recycler itself
3. refactored call data point plotter
4. added sample-ui task-ui on the validation process
5. implemented call-quality stats using corrections from mongo
6. support deleting cursors on mongo
7. implement multiple task support on validation ui based on task_id mongo field
2020-06-17 19:11:15 +05:30
Malar Kannan
7dbb04dcbf
1. added conv data generator
2. more utils
2020-06-16 15:38:07 +05:30
Malar Kannan
7472b6457d
handling non-pnr cases without parens in text data
2020-06-16 11:02:53 +05:30
Malar Kannan
120302aad3
added support for name/dates/cities call data extraction and more logs
2020-06-15 10:24:38 +05:30
Malar Kannan
a7a25e9b07
1. using dataname args for update/fill annotations
2. rename to dump_ui
2020-06-10 14:55:59 +05:30
Malar Kannan
6d149d282d
1. added a data extraction type argument
2. cleanup/refactor
2020-06-09 19:16:24 +05:30
Malar Kannan
8db1be0083
refactor validation process arguments and logging
2020-06-05 16:32:08 +05:30
Malar Kannan
bca227a7d7
1. removed the transcriber_pretrained/speller from utils
2. introduced get_mongo_coll to get the collection object directly from mongo uri
3. removed processing of correction entries to remove space/upper casing
2020-06-04 17:49:16 +05:30
Malar Kannan
e3a01169c2
skipping invalid data points
2020-06-02 17:21:30 +05:30
Malar Kannan
9f9cb62b60
show duration on validation of dataset
2020-05-28 11:35:31 +05:30
Malar Kannan
de21952349
1. refactored wav chunk processing method
2. renamed streamlit to validation_ui
2020-05-28 11:18:39 +05:30
Malar Kannan
d87369c8fe
don't load audio for annotation-only ui and keep spoken as text for normal asr validation
2020-05-27 15:57:42 +05:30
Malar Kannan
41af0a87de
respect verbose flag
2020-05-27 15:54:16 +05:30
Malar Kannan
6f395af10d
fix skipping null audio and add more verbose logs
2020-05-27 15:49:58 +05:30
Malar Kannan
a38789d0c3
added option to disable plots during validation
2020-05-27 15:43:03 +05:30
Malar Kannan
7ff2db3e2e
cleanup rev recycle
2020-05-27 15:33:22 +05:30
Malar Kannan
1acf9e403c
1. added support for mono/dual channel rev transcripts
2. handle errors when extracting datapoints from rev meta data
3. added support for annotation-only task when dumping ui data
2020-05-27 15:19:25 +05:30
Malar Kannan
1f2bedc156
1. enabled silence stripping in chunks when recycling audio from asr logs
2. limit asr recycling to the first 1 min of audio to get reliable alignments, and ignore agent channel
3. added rev recycler for generating asr dataset from rev transcripts and audio
4. update pydub dependency for silence stripping fn and remove hardcoded threadpool worker count
2020-05-27 14:22:44 +05:30
Malar Kannan
fca9c1aeb3
refactored module structure
2020-05-21 19:13:44 +05:30