Compare commits

No commits in common. "b267b89a44ba61d99f1e253a7982645c2a7b5ae1" and "ee2eb63f66f5499ab0ed1f1a90fa77428b33dd34" have entirely different histories.

2 changed files with 0 additions and 25 deletions

@@ -1,18 +0,0 @@
### Setup
`. env/bin/activate` to activate the virtualenv.
### Data Generation
* update `OUTPUT_NAME` in *speech_samplegen.py*; the dataset folder is created with that name
* `python speech_samplegen.py` generates variants of audio samples
### Data Preprocessing
* `python speech_data.py` creates the training-testing data from the generated samples.
* run `fix_csv(OUTPUT_NAME)` to create the fixed index of the generated dataset
* `generate_sppas_trans(OUTPUT_NAME)` creates the SPPAS transcription (wav+txt) data
* `$ (SPPAS_DIR)/bin/annotation.py -l eng -e csv --ipus --tok --phon --align -w ./outputs/OUTPUT_NAME/` creates the phoneme alignment csv files for all variants.
* `create_seg_phonpair_tfrecords(OUTPUT_NAME)` creates the tfrecords files with the phoneme-level pairs of right/wrong stresses
### Training
* `python speech_model.py` trains the model with the training data generated.
* `train_siamese(OUTPUT_NAME)` trains the siamese model with the generated dataset.
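
The SPPAS alignment step above could be scripted roughly as follows. This is only a sketch: `build_sppas_command` is a hypothetical helper that does not appear in the repository, and the flags are copied from the command in this README (with `--align` passed once) rather than checked against the SPPAS documentation.

```python
import os


def build_sppas_command(sppas_dir, output_name):
    """Build the argument list for the SPPAS annotation step.

    Hypothetical helper: the flags mirror the README command and are
    not verified against any particular SPPAS release.
    """
    script = os.path.join(sppas_dir, "bin", "annotation.py")
    workdir = "./outputs/{}/".format(output_name)
    return [
        script,
        "-l", "eng",   # language, as given in the README
        "-e", "csv",   # output extension, as given in the README
        "--ipus", "--tok", "--phon", "--align",
        "-w", workdir,
    ]


if __name__ == "__main__":
    # Print the command instead of running it, since SPPAS may not be installed.
    print(" ".join(build_sppas_command("/path/to/sppas", "OUTPUT_NAME")))
```

The returned list can be handed to `subprocess.run(cmd, check=True)` once `sppas_dir` points at a real SPPAS checkout.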

@@ -216,9 +216,6 @@ def generate_audio_for_text_list(text_list):
closer()
def generate_audio_for_stories():
    '''
    Generates the audio sample variants for the list of words in the stories
    '''
    # story_file = './inputs/all_stories_hs.json'
    story_file = './inputs/all_stories.json'
    stories_data = json.load(open(story_file))
@@ -228,10 +225,6 @@ def generate_audio_for_stories():
    generate_audio_for_text_list(text_list)
def generate_test_audio_for_stories(sample_count=0):
    '''
    Picks a list of words from the wordlist that are not in story words
    and generates the variants
    '''
    story_file = './inputs/all_stories_hs.json'
    # story_file = './inputs/all_stories.json'
    stories_data = json.load(open(story_file))
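
The removed docstring describes picking words from a wordlist that do not occur in the story words. The hunk does not show how the function does this, so the following is a minimal sketch of one way to do it. `pick_unseen_words` and its `seed` parameter are hypothetical, and treating `sample_count=0` as "return every candidate" is an assumption rather than the repository's documented behaviour.

```python
import random


def pick_unseen_words(wordlist, story_words, sample_count=0, seed=None):
    """Return words from wordlist that are absent from story_words.

    Hypothetical helper: comparison is case-insensitive, and a
    sample_count of 0 (or less) is assumed to mean "no sampling".
    """
    story_set = {w.lower() for w in story_words}
    # Sort so the result is deterministic for a given input and seed.
    candidates = sorted({w for w in wordlist if w.lower() not in story_set})
    if sample_count <= 0 or sample_count >= len(candidates):
        return candidates
    return random.Random(seed).sample(candidates, sample_count)


if __name__ == "__main__":
    print(pick_unseen_words(["apple", "river", "stone"], ["River"]))
```

Passing a fixed `seed` keeps the sampled test set reproducible between runs.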