mirror of https://github.com/malarinv/jasper-asr.git synced 2026-03-08 10:32:35 +00:00
Commit Graph

24 Commits

Author SHA1 Message Date
515e9c1037 1. split extract all data types in one shot with --extraction-type all flag
2. add notes about diffing split extracted and original data
3. add a nlu conv generator to generate conv data based on nlu utterances and entities
4. add task uid support for dumping corrections
5. abstracted generate date fn
2020-06-25 11:03:09 +05:30
e76ccda5dd 1. fix update-correction to use ui_dump instead of manifest
2. update training params no of checkpoints on ckpt frequency
2020-06-19 14:16:04 +05:30
000853b600 1. added option to strip silent chunks
2. computing caller quality based on task-id of corrections
2020-06-17 21:42:20 +05:30
ac0e04c226 stripping silence on call chunk 2020-06-17 19:43:25 +05:30
62eefb9294 fix 11st to 11th in ordinal 2020-06-17 19:30:12 +05:30
8e238c254e 1. added start delay arg in call recycler
2. implement ui_dump/manifest writer in call_recycler itself
3. refactored call data point plotter
4. added sample-ui task-ui on the validation process
5. implemented call-quality stats using corrections from mongo
6. support deleting cursors on mongo
7. implement multiple task support on validation ui based on task_id mongo field
2020-06-17 19:11:15 +05:30
7dbb04dcbf 1. added conv data generator
2. more utils
2020-06-16 15:38:07 +05:30
7472b6457d handling non-pnr cases without parens in text data 2020-06-16 11:02:53 +05:30
120302aad3 added support for name/dates/cities call data extraction and more logs 2020-06-15 10:24:38 +05:30
a7a25e9b07 1. using dataname args for update/fill annotations
2. rename to dump_ui
2020-06-10 14:55:59 +05:30
6d149d282d 1. added a data extraction type argument
2. cleanup/refactor
2020-06-09 19:16:24 +05:30
8db1be0083 refactor validation process arguments and logging 2020-06-05 16:32:08 +05:30
bca227a7d7 1. removed the transcriber_pretrained/speller from utils
2. introduced get_mongo_coll to get the collection object directly from mongo uri
3. removed processing of correction entries to remove space/upper casing
2020-06-04 17:49:16 +05:30
e3a01169c2 skipping invalid data points 2020-06-02 17:21:30 +05:30
9f9cb62b60 show duration on validation of dataset 2020-05-28 11:35:31 +05:30
de21952349 1. refactored wav chunk processing method
2. renamed streamlit to validation_ui
2020-05-28 11:18:39 +05:30
d87369c8fe don't load audio for annotation only ui and keep spoken as text for normal asr validation 2020-05-27 15:57:42 +05:30
41af0a87de respect verbose flag 2020-05-27 15:54:16 +05:30
6f395af10d fix skipping null audio and add more verbose logs 2020-05-27 15:49:58 +05:30
a38789d0c3 added option to disable plots during validation 2020-05-27 15:43:03 +05:30
7ff2db3e2e cleanup rev recycle 2020-05-27 15:33:22 +05:30
1acf9e403c 1. added support for mono/dual channel rev transcripts
2. handle errors when extracting datapoints from rev meta data
3. added support for annotation only task when dumping ui data
2020-05-27 15:19:25 +05:30
1f2bedc156 1. enabled silence stripping in chunks when recycling audio from asr logs
2. limit asr recycling to the first 1 min of audio to get reliable alignments, ignoring the agent channel
3. added rev recycler for generating asr dataset from rev transcripts and audio
4. update pydub dependency for silence stripping fn and remove hardcoded threadpool worker count
2020-05-27 14:22:44 +05:30
fca9c1aeb3 refactored module structure 2020-05-21 19:13:44 +05:30