speech-scoring/README.md

Setup

Run . env/bin/activate (equivalently, source env/bin/activate) to activate the virtualenv.

Data Generation

  • set OUTPUT_NAME in speech_samplegen.py; the dataset folder is created with that name
  • python speech_samplegen.py generates variants of audio samples

Data Preprocessing

  • python speech_data.py creates the training-testing data from the generated samples.
  • run fix_csv(OUTPUT_NAME) once to create the fixed index of the generated dataset
  • run generate_sppas_trans(OUTPUT_NAME) once to create the SPPAS transcription (wav + txt) data
  • run (SPPAS_DIR)/bin/annotation.py -l eng -e csv --ipus --tok --phon --align -w ./outputs/OUTPUT_NAME/ once, with (SPPAS_DIR) replaced by the path to your SPPAS installation, to create the phoneme alignment csv files for all variants.
  • create_seg_phonpair_tfrecords(OUTPUT_NAME) creates the tfrecords files containing phoneme-level pairs of right/wrong stresses
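
The SPPAS alignment step above can be scripted so the command line is assembled the same way for every dataset. The following is a minimal sketch, not part of this repository: build_sppas_command, its parameters, and the example paths /opt/sppas and my_dataset are hypothetical names chosen for illustration. The flags are the ones listed in the step above.

```python
import shlex
from pathlib import Path


def build_sppas_command(sppas_dir, output_name, outputs_root="./outputs"):
    """Return the annotation.py argument list for one dataset folder.

    Hypothetical helper: sppas_dir is the SPPAS installation directory and
    output_name is the OUTPUT_NAME used during data generation.
    """
    script = Path(sppas_dir) / "bin" / "annotation.py"
    workdir = Path(outputs_root) / output_name
    return [
        str(script),
        "-l", "eng",   # language
        "-e", "csv",   # output file extension
        # flags as listed in the README step (--align given once)
        "--ipus", "--tok", "--phon", "--align",
        "-w", str(workdir),
    ]


if __name__ == "__main__":
    # Example values only; substitute your own install path and dataset name.
    cmd = build_sppas_command("/opt/sppas", "my_dataset")
    print(shlex.join(cmd))
    # To actually run the alignment: subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids shell quoting problems if the SPPAS path or dataset name contains spaces.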

Training

  • python speech_model.py trains the model with the training data generated.
  • train_siamese(OUTPUT_NAME) trains the siamese model with the generated dataset.

Testing

  • python speech_test.py tests the trained model with the test dataset
  • evaluate_siamese(TEST_RECORD_FILE, audio_group=OUTPUT_NAME, weights=WEIGHTS_FILE_NAME) evaluates the siamese model. TEST_RECORD_FILE is under the outputs directory and WEIGHTS_FILE_NAME is under the models directory; pick the most recent weights file.
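
Picking the most recent weights file by hand is easy to get wrong once several training runs have accumulated. A small helper like the one below can select it by modification time. This is a sketch under assumptions: latest_weights_file is a hypothetical helper that is not part of this repository, and the default directory name models follows the description above.

```python
from pathlib import Path


def latest_weights_file(models_dir="models", pattern="*"):
    """Return the most recently modified file in models_dir, or None.

    Hypothetical helper: pattern is a glob such as "*.h5" to restrict the
    search to a particular weights file extension.
    """
    candidates = [p for p in Path(models_dir).glob(pattern) if p.is_file()]
    if not candidates:
        return None
    # Newest file by modification time.
    return max(candidates, key=lambda p: p.stat().st_mtime)


if __name__ == "__main__":
    newest = latest_weights_file()
    print(newest.name if newest else "no weights files found")
```

The returned file name can then be supplied as WEIGHTS_FILE_NAME. Whether evaluate_siamese expects a bare file name or a full path is not stated here, so check its definition before passing the value.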