speech-scoring/TODO.md

  1. Generate sample variants based on phoneme similarity.
  2. Create spectrograms for each word, using 150 ms windows with 50 ms overlap.
  3. Train an RNN to output a vector from the spectrograms.
  4. Train an NN to output True/False based on whether the RNN output is acceptable. -> Siamese network (implementation detail)
  5. Validate with real-world samples.
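
A rough sketch of step 2, to pin down what the window numbers imply: a 150 ms window with 50 ms overlap means a 100 ms hop between window starts. This assumes each 150 ms window becomes one magnitude-spectrum frame, that the audio is a mono 1-D array, and that words shorter than one window are zero-padded; the TODO specifies none of this. The function names, the Hann window, and the 16 kHz rate in the usage note are illustrative choices, not decisions already made for this project.

```python
import numpy as np


def frame_signal(signal, sample_rate, window_ms=150, overlap_ms=50):
    """Split a 1-D signal into overlapping windows.

    Returns an array of shape (n_frames, window_len).
    """
    window_len = int(sample_rate * window_ms / 1000)
    # Hop between window starts: 150 ms - 50 ms overlap = 100 ms.
    hop_len = int(sample_rate * (window_ms - overlap_ms) / 1000)

    signal = np.asarray(signal, dtype=float)
    if len(signal) < window_len:
        # Assumption: zero-pad words shorter than a single window.
        signal = np.pad(signal, (0, window_len - len(signal)))

    # Trailing samples that do not fill a whole window are dropped.
    n_frames = 1 + (len(signal) - window_len) // hop_len
    starts = hop_len * np.arange(n_frames)[:, None]
    offsets = np.arange(window_len)[None, :]
    return signal[starts + offsets]


def magnitude_spectrogram(frames):
    """Return one magnitude spectrum per frame, shape (n_frames, window_len // 2 + 1)."""
    # Assumption: Hann window to reduce spectral leakage.
    window = np.hanning(frames.shape[1])
    return np.abs(np.fft.rfft(frames * window, axis=1))
```

For one second of audio at 16 kHz, `frame_signal` yields 9 frames of 2400 samples each, and `magnitude_spectrogram` turns them into a 9 x 1201 array. The rows form the sequence that the RNN in step 3 would consume.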