Malar Kannan eb10b577ae Added README.md describing the workflow 2017-12-29 13:14:37 +05:30
inputs implemented phoneme/voice/rate variant genration 2017-10-04 23:21:28 +05:30
.gitignore updated model data 2017-11-27 14:08:01 +05:30
CLI.md added voicerss tts support for test data generation 2017-12-26 14:32:56 +05:30
README.md Added README.md describing the workflow 2017-12-29 13:14:37 +05:30
TODO.md wip high variant phoneme 2017-10-26 18:06:14 +05:30
generate_similar.py computing phoneme/word variant for each word in a phrase 2017-11-03 14:48:55 +05:30
requirements-linux.txt implemented the model, todo implement ctc and training queueing logic 2017-11-28 19:10:19 +05:30
requirements.txt generated voice files using ios api 2017-10-04 17:51:39 +05:30
segment_data.py 1. fixed softmax output and overfit the model for small sample 2017-12-12 12:18:27 +05:30
segment_model.py implemented phoneme segmented training on samples 2017-12-28 18:53:54 +05:30
similarity.csv similarity wip 2017-10-05 17:37:49 +05:30
speech_data.py implemented phoneme segmented training on samples 2017-12-28 18:53:54 +05:30
speech_model.py saving model on better 'acc' 2017-12-28 20:00:19 +05:30
speech_pitch.py implemented phoneme segmented training on samples 2017-12-28 18:53:54 +05:30
speech_samplegen.py Added README.md describing the workflow 2017-12-29 13:14:37 +05:30
speech_segmentgen.py implemented segment-generation for random words for testing 2017-12-06 14:41:25 +05:30
speech_similar.py implemented phoneme segmented training on samples 2017-12-28 18:53:54 +05:30
speech_spectrum.py implemented phoneme segmented training on samples 2017-12-28 18:53:54 +05:30
speech_test.py added voicerss tts support for test data generation 2017-12-26 14:32:56 +05:30
speech_testgen.py added voicerss tts support for test data generation 2017-12-26 14:32:56 +05:30
speech_tools.py implemented phoneme segmented training on samples 2017-12-28 18:53:54 +05:30
voicerss_tts.py added voicerss tts support for test data generation 2017-12-26 14:32:56 +05:30
voicerss_tts.py.bak added voicerss tts support for test data generation 2017-12-26 14:32:56 +05:30

README.md

Setup

Run . env/bin/activate to activate the virtualenv.
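To confirm that the virtualenv is active before running the scripts, a small check like the following may help. It is a generic Python sketch, not part of this repo; the function name in_virtualenv is made up for illustration.

```python
import sys


def in_virtualenv():
    """Return True when the interpreter runs inside a venv/virtualenv."""
    # Old virtualenv releases set sys.real_prefix; venv and newer virtualenv
    # releases make sys.base_prefix differ from sys.prefix instead.
    if hasattr(sys, "real_prefix"):
        return True
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


if __name__ == "__main__":
    print("virtualenv active:", in_virtualenv())
```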

Data Generation

  • set OUTPUT_NAME in speech_samplegen.py; the dataset folder is created with that name
  • python speech_samplegen.py generates variants of audio samples

Data Preprocessing

  • python speech_data.py creates the training-testing data from the generated samples.
  • run fix_csv(OUTPUT_NAME) to create the fixed index of the generated dataset
  • generate_sppas_trans(OUTPUT_NAME) creates the SPPAS transcription (wav+txt) data
  • $ (SPPAS_DIR)/bin/annotation.py -l eng -e csv --ipus --tok --phon --align -w ./outputs/OUTPUT_NAME/ creates the phoneme alignment csv files for all variants.
  • create_seg_phonpair_tfrecords(OUTPUT_NAME) creates the tfrecords files with phoneme-level pairs of right/wrong stresses
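The SPPAS alignment step can also be launched from Python. The sketch below only assembles the annotation.py command from the flags listed in this section, passing each flag once. The names build_sppas_command and run_sppas and the example install path are hypothetical and not part of this repo.

```python
import os
import subprocess


def build_sppas_command(sppas_dir, output_name):
    """Assemble the annotation.py call from the README as an argument list."""
    script = os.path.join(sppas_dir, "bin", "annotation.py")
    # Dataset folder layout taken from the README: ./outputs/OUTPUT_NAME/
    wav_dir = "./outputs/{}/".format(output_name)
    return [script, "-l", "eng", "-e", "csv",
            "--ipus", "--tok", "--phon", "--align",
            "-w", wav_dir]


def run_sppas(sppas_dir, output_name):
    """Run the alignment; raises CalledProcessError on a non-zero exit."""
    subprocess.check_call(build_sppas_command(sppas_dir, output_name))


if __name__ == "__main__":
    # Placeholder values: point these at a real SPPAS install and dataset.
    print(" ".join(build_sppas_command("/opt/sppas", "my_dataset")))
```

Passing the command as a list avoids shell quoting problems if the dataset name contains spaces.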

Training

  • python speech_model.py trains the model with the training data generated.
  • train_siamese(OUTPUT_NAME) trains the siamese model with the generated dataset.
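Taken together, the Python steps run in a fixed order, with the external SPPAS alignment in the middle. A minimal sketch of that ordering follows. It assumes the helper functions can be imported from speech_data and speech_model, which the README does not state explicitly; run_workflow and its resolve and align parameters are hypothetical.

```python
import importlib

# (module, function) pairs in README order. The module names are an
# assumption: the README does not say where each function is defined.
BEFORE_SPPAS = [("speech_data", "fix_csv"),
                ("speech_data", "generate_sppas_trans")]
AFTER_SPPAS = [("speech_data", "create_seg_phonpair_tfrecords"),
               ("speech_model", "train_siamese")]


def resolve(module_name, func_name):
    """Import a module from the repo and look up one of its functions."""
    return getattr(importlib.import_module(module_name), func_name)


def run_workflow(output_name, align, resolve=resolve):
    """Call each step with output_name; align runs the SPPAS step."""
    for module_name, func_name in BEFORE_SPPAS:
        resolve(module_name, func_name)(output_name)
    align(output_name)  # external SPPAS phoneme alignment
    for module_name, func_name in AFTER_SPPAS:
        resolve(module_name, func_name)(output_name)
```

Here align can be any callable that runs the annotation.py command for the given dataset name.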