Back

Explore Courses Blog Tutorials Interview Questions
0 votes
2 views
in AI and Deep Learning by (50.2k points)

I have a mapping of 100,000+ words to their phonemes (CMUdict), like:

ABANDONED => [ 'AH', 'B', 'AE', 'N', 'D', 'AH', 'N', 'D' ]

I want to split the original words' letters into several groups equal to the number of phonemes, e.x.

ABANDONED => [ 'A', 'B', 'A', 'N', 'D', 'O', 'N', 'ED' ]

I don't have a mapping of phonemes to graphemes, but it seems like I should be able to compute a statistical model of phonemes to graphemes, then use that to decide where to split each word. (It would be nice if the model could also be used to convert new words to their probable phonemes)

How can I do this? I was thinking a hidden Markov model sounds like it could be applicable, but beyond that hunch, I don't know.

1 Answer

0 votes
by (108k points)

First, you have to align the word to its phonetic representation by matching the identical letters and phonemes (like N and N). You can receive the best match with dynamic programming. Then you can outline the remaining characters of the words to the remaining phonemes.

If you wish to obtain certifications in programming then you cloud join any of the programming courses.

Browse Categories

...