Malar Kannan 225a720f18 updated README to include testing 2017-12-29 16:21:38 +05:30
inputs implemented phoneme/voice/rate variant genration 2017-10-04 23:21:28 +05:30
.gitignore updated model data 2017-11-27 14:08:01 +05:30
CLI.md added voicerss tts support for test data generation 2017-12-26 14:32:56 +05:30
README.md updated README to include testing 2017-12-29 16:21:38 +05:30
TODO.md wip high variant phoneme 2017-10-26 18:06:14 +05:30
generate_similar.py computing phoneme/word variant for each word in a phrase 2017-11-03 14:48:55 +05:30
requirements-linux.txt implemented the model, todo implement ctc and training queueing logic 2017-11-28 19:10:19 +05:30
requirements.txt generated voice files using ios api 2017-10-04 17:51:39 +05:30
segment_data.py 1. fixed softmax output and overfit the model for small sample 2017-12-12 12:18:27 +05:30
segment_model.py implemented phoneme segmented training on samples 2017-12-28 18:53:54 +05:30
similarity.csv similarity wip 2017-10-05 17:37:49 +05:30
speech_data.py implemented phoneme segmented training on samples 2017-12-28 18:53:54 +05:30
speech_model.py saving model on better 'acc' 2017-12-28 20:00:19 +05:30
speech_pitch.py implemented phoneme segmented training on samples 2017-12-28 18:53:54 +05:30
speech_samplegen.py Merge branch 'master' of /home/ilml/Public/Repos/speech_scoring 2017-12-29 13:15:51 +05:30
speech_segmentgen.py generating segmentation for words 2017-12-28 13:37:27 +05:30
speech_similar.py implemented phoneme segmented training on samples 2017-12-28 18:53:54 +05:30
speech_spectrum.py implemented phoneme segmented training on samples 2017-12-28 18:53:54 +05:30
speech_test.py added voicerss tts support for test data generation 2017-12-26 14:32:56 +05:30
speech_testgen.py added voicerss tts support for test data generation 2017-12-26 14:32:56 +05:30
speech_tools.py implemented phoneme segmented training on samples 2017-12-28 18:53:54 +05:30
voicerss_tts.py added voicerss tts support for test data generation 2017-12-26 14:32:56 +05:30
voicerss_tts.py.bak added voicerss tts support for test data generation 2017-12-26 14:32:56 +05:30

README.md

Setup

Run . env/bin/activate to activate the virtualenv.

Data Generation

  • update OUTPUT_NAME in speech_samplegen.py; the dataset folder is created with that name
  • python speech_samplegen.py generates variants of audio samples

Data Preprocessing

  • python speech_data.py creates the training-testing data from the generated samples.
  • run fix_csv(OUTPUT_NAME) once to create the fixed index of the generated dataset
  • run generate_sppas_trans(OUTPUT_NAME) once to create the SPPAS transcription (wav+txt) data
  • run $ (SPPAS_DIR)/bin/annotation.py -l eng -e csv --ipus --tok --phon --align -w ./outputs/OUTPUT_NAME/ once to create the phoneme alignment csv files for all variants
  • create_seg_phonpair_tfrecords(OUTPUT_NAME) creates the tfrecords files with the phoneme-level pairs of right/wrong stresses
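
The SPPAS step in the list above can be scripted from Python. The sketch below only assembles the argument list, with each flag passed once; build_sppas_command, run_sppas and the example install path are hypothetical names that are not part of this repository, and the flags are copied from the command above rather than checked against the SPPAS documentation.

```python
import os
import subprocess


def build_sppas_command(sppas_dir, output_name):
    """Return the annotation.py argument list for one dataset folder.

    Hypothetical helper: the flags mirror the README command and are
    not verified against SPPAS itself.
    """
    script = os.path.join(sppas_dir, "bin", "annotation.py")
    # Trailing slash kept to match the path form used in the README.
    wav_dir = "./outputs/{}/".format(output_name)
    return [script, "-l", "eng", "-e", "csv",
            "--ipus", "--tok", "--phon", "--align",
            "-w", wav_dir]


def run_sppas(sppas_dir, output_name):
    """Run the annotation once; raises CalledProcessError on failure."""
    subprocess.check_call(build_sppas_command(sppas_dir, output_name))


if __name__ == "__main__":
    # Assumed values; replace with the real SPPAS path and OUTPUT_NAME.
    print(" ".join(build_sppas_command("/opt/sppas", "my_dataset")))
```

Building the command as a list instead of a single string avoids shell quoting problems if OUTPUT_NAME contains spaces.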

Training

  • python speech_model.py trains the model with the training data generated.
  • train_siamese(OUTPUT_NAME) trains the siamese model with the generated dataset.

Testing

  • python speech_test.py tests the trained model with the test dataset
  • evaluate_siamese(TEST_RECORD_FILE, audio_group=OUTPUT_NAME, weights=WEIGHTS_FILE_NAME) takes a TEST_RECORD_FILE from the outputs directory and a WEIGHTS_FILE_NAME from the models directory; pick the most recent weights file.
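
Picking the most recent weights file can be automated. This is a minimal sketch, assuming "most recent" means the latest modification time and that weights are saved with an .h5 extension; both are assumptions, and latest_weights_file is a hypothetical helper that is not part of this repository.

```python
from pathlib import Path


def latest_weights_file(models_dir, pattern="*.h5"):
    """Return the most recently modified file matching pattern, or None.

    The "*.h5" default is an assumption about how weights are named.
    """
    models_dir = Path(models_dir)
    if not models_dir.is_dir():
        return None
    candidates = [p for p in models_dir.glob(pattern) if p.is_file()]
    if not candidates:
        return None
    # Newest file by modification time.
    return max(candidates, key=lambda p: p.stat().st_mtime)


if __name__ == "__main__":
    # Prints None when ./models is missing or holds no matching file.
    print(latest_weights_file("./models"))
```

The name of the returned path could then be passed as WEIGHTS_FILE_NAME, provided evaluate_siamese expects a file name relative to the models directory.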