Search this board for Levenshtein distance, which measures how far one string is from another by counting the minimum number of single-character edits (insertions, deletions, or substitutions) needed to transform the first string into the second.
Your tracking index (1..length of the speech text) is the length of the leading portion of the speech that has the lowest Levenshtein distance from the words spoken so far.
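
To make that concrete, here is a rough Python sketch of the idea. The function names (prefix_distances, tracking_index) are made up for illustration, and it assumes both the speech and the recognised words are plain strings compared character by character. It also assumes that, when several prefixes tie for the lowest distance, you want the shortest one; you may prefer a different tie-break. A single pass of the usual dynamic-programming table gives the distance to every prefix of the speech at once, so there is no need to run a separate comparison per prefix.

```python
def prefix_distances(spoken, speech):
    """Return a list d where d[j] is the Levenshtein distance
    between `spoken` and the first j characters of `speech`."""
    # Row for an empty `spoken`: turning "" into speech[:j] takes j insertions.
    prev = list(range(len(speech) + 1))
    for i, s_ch in enumerate(spoken, start=1):
        # Turning spoken[:i] into "" takes i deletions.
        cur = [i]
        for j, t_ch in enumerate(speech, start=1):
            cost = 0 if s_ch == t_ch else 1
            cur.append(min(prev[j] + 1,          # delete s_ch
                           cur[j - 1] + 1,       # insert t_ch
                           prev[j - 1] + cost))  # match or substitute
        prev = cur
    return prev


def tracking_index(spoken, speech):
    """Length (1..len(speech)) of the speech prefix closest to `spoken`.
    Ties go to the shortest prefix (an assumption, not a requirement)."""
    if not speech:
        raise ValueError("speech text must not be empty")
    dist = prefix_distances(spoken, speech)
    return min(range(1, len(speech) + 1), key=dist.__getitem__)


if __name__ == "__main__":
    speech = "the quick brown fox"
    # The recogniser dropped a letter, but the best match is still "the quick brown".
    print(tracking_index("the quick brwn", speech))
```

The cost is proportional to len(spoken) * len(speech) per call, so for a long speech you would probably want to compare only a window of text around the previous tracking index rather than the whole thing.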