First of all, I am impressed by this exciting project and I appreciate all the contributors.
Question 1. Can the Kaldi scripts produce the set of input files that the ctc-stanford training script needs?
As far as I can tell, the input files for a run are key#.txt, feat.bin, and alis#.txt. Examples of these files are at http://deeplearning.stanford.edu/lexfree/timit/
Question 2. If the answer to the previous question is no, how can I create these three kinds of files from my wav files and their transcripts?
I know methods and scripts for extracting MFCC and log mel filterbank features, as described at https://github.com/jameslyons/python_speech_features
I think the set of feature vectors for a wav file could serve as input to dataLoader.py, but I am not sure how to obtain the remaining files.
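For context, the feature-extraction step mentioned above can be done without any external dependency beyond numpy. Below is a minimal sketch of log mel filterbank extraction, mirroring what python_speech_features' `logfbank` computes; the 25 ms window, 10 ms step, 26 filters, and 512-point FFT are assumptions, not parameters confirmed by the training script:

```python
import numpy as np

def hz_to_mel(hz):
    return 2595.0 * np.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def logfbank(signal, samplerate=16000, winlen=0.025, winstep=0.01,
             nfilt=26, nfft=512):
    """Log mel filterbank energies, one row per frame."""
    frame_len = int(round(winlen * samplerate))
    frame_step = int(round(winstep * samplerate))
    # pad so the signal splits into an integer number of frames
    num_frames = 1 + max(0, int(np.ceil((len(signal) - frame_len) / frame_step)))
    pad_len = (num_frames - 1) * frame_step + frame_len
    padded = np.append(signal, np.zeros(pad_len - len(signal)))
    # slice into overlapping, Hamming-windowed frames
    idx = (np.arange(frame_len)[None, :] +
           np.arange(0, num_frames * frame_step, frame_step)[:, None])
    frames = padded[idx] * np.hamming(frame_len)
    # power spectrum of each frame
    pspec = (np.abs(np.fft.rfft(frames, nfft)) ** 2) / nfft
    # triangular mel filterbank, evenly spaced on the mel scale
    mel_points = np.linspace(hz_to_mel(0), hz_to_mel(samplerate / 2), nfilt + 2)
    bins = np.floor((nfft + 1) * mel_to_hz(mel_points) / samplerate).astype(int)
    fbank = np.zeros((nfilt, nfft // 2 + 1))
    for m in range(1, nfilt + 1):
        for k in range(bins[m - 1], bins[m]):          # rising edge
            fbank[m - 1, k] = (k - bins[m - 1]) / (bins[m] - bins[m - 1])
        for k in range(bins[m], bins[m + 1]):          # falling edge
            fbank[m - 1, k] = (bins[m + 1] - k) / (bins[m + 1] - bins[m])
    # floor the energies before taking the log to avoid -inf
    energies = np.maximum(pspec @ fbank.T, 1e-10)
    return np.log(energies)

# demo on a synthetic 1-second 440 Hz tone (no wav file needed)
sr = 16000
t = np.arange(sr) / sr
sig = np.sin(2 * np.pi * 440.0 * t)
feats = logfbank(sig, samplerate=sr)
print(feats.shape)  # → (99, 26)
```

The resulting (num_frames, nfilt) matrix is the per-utterance feature block; how those blocks are concatenated and serialized into feat.bin, and how key#.txt / alis#.txt index them, is exactly the part of the format this question is asking about.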