speech-scoring/TODO.md

0. Generate samples of phoneme-similarity variants.
1. Create spectrograms for each word using 150 ms windows with 50 ms overlap.
2. Train an RNN to output a vector from the spectrograms.
3. Train a NN to output True/False based on the acceptability of the RNN output -> Siamese network (implementation detail).
4. Validate with real-world samples.
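
The windowing in step 1 could be sketched roughly as below. This is only an illustration in plain Python, not the repo's spectrogram generator: the function names, the Hann window, the naive DFT, and the toy 1 kHz tone are all assumptions made for the example. With a 150 ms window and 50 ms overlap, consecutive frames start 100 ms apart.

```python
import cmath
import math


def frame_signal(samples, sample_rate, window_ms=150, overlap_ms=50):
    """Split samples into fixed-length windows that overlap by overlap_ms."""
    window = int(sample_rate * window_ms / 1000)
    hop = window - int(sample_rate * overlap_ms / 1000)  # 100 ms step by default
    if window <= 0 or hop <= 0:
        raise ValueError("window must be positive and longer than the overlap")
    # Trailing samples that do not fill a whole window are dropped.
    return [samples[i:i + window] for i in range(0, len(samples) - window + 1, hop)]


def magnitude_spectrum(frame):
    """Hann-window one frame and return DFT magnitudes for bins 0..N/2."""
    n = len(frame)
    windowed = [x * (0.5 - 0.5 * math.cos(2 * math.pi * i / n)) for i, x in enumerate(frame)]
    # Naive O(N^2) DFT, kept only so the sketch has no dependencies (assumption).
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(windowed)))
        for k in range(n // 2 + 1)
    ]


def spectrogram(samples, sample_rate, window_ms=150, overlap_ms=50):
    """Return one magnitude spectrum per frame (time along the outer list)."""
    frames = frame_signal(samples, sample_rate, window_ms, overlap_ms)
    return [magnitude_spectrum(frame) for frame in frames]


# Hypothetical input: one second of a 100 Hz tone at a toy 1 kHz sample rate.
SAMPLE_RATE = 1000
tone = [math.sin(2 * math.pi * 100 * t / SAMPLE_RATE) for t in range(SAMPLE_RATE)]
spec = spectrogram(tone, SAMPLE_RATE)
```

For this toy input, `spec` holds 9 frames of 76 frequency bins each. On real recordings the naive DFT would be far too slow, so an FFT-based routine would be the practical choice.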

The same word spoken by multiple people, etc., will have a low distance. Two words that are very different (you can use the similarity measure given in the speech_recognition repo) will have a high distance.

A sample with the wrong pronunciation will be at a medium distance from one with the right pronunciation.
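
One way to picture the low / medium / high distances is with a Euclidean distance between embedding vectors and a triplet-style margin loss. The sketch below is an assumption-heavy illustration: the 2-D embeddings, the margin, and the two thresholds in `classify` are made up, and the fixed thresholds stand in for the trained True/False network from step 3 rather than reproducing it.

```python
import math


def euclidean_distance(a, b):
    """Distance between two embedding vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Zero once the negative is at least `margin` farther from the anchor than the positive."""
    gap = euclidean_distance(anchor, positive) - euclidean_distance(anchor, negative)
    return max(0.0, gap + margin)


def classify(distance, accept_below=0.5, reject_above=1.5):
    """Map a distance to one of three bands (thresholds are hypothetical)."""
    if distance < accept_below:
        return "same word"
    if distance < reject_above:
        return "mispronounced"
    return "different word"


# Hypothetical embeddings, standing in for RNN outputs.
reference = [0.0, 0.0]        # correct pronunciation
other_speaker = [0.3, 0.0]    # same word, different speaker
mispronounced = [0.0, 1.0]    # same word, wrong pronunciation
different_word = [3.0, 4.0]   # unrelated word

labels = [
    classify(euclidean_distance(reference, other_speaker)),
    classify(euclidean_distance(reference, mispronounced)),
    classify(euclidean_distance(reference, different_word)),
]
```

Here the loss for the triplet (`reference`, `other_speaker`, `different_word`) is already zero, while swapping in `mispronounced` as the negative leaves a positive loss, which is what would push mispronounced samples farther out during training.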

I also had good experience getting non-English voices to speak the English words to get "wrong" pronunciations, so those will be subtly different too.

2017-10-13 11:10:57 +00:00
1. implemented spectrogram generator for audio files
2. imported siamese network class (wip)
3. added similarity measure based phoneme neighbor generator
4. fixed samplegen variants code
5. create triplets (wip)
6. updates

2017-10-26 11:21:32 +00:00
discarding phoneme incapable synthesizers

2017-10-26 12:36:14 +00:00
wip high variant phoneme