Such a word predictor needs to do only one thing: look at the current word and predict the next. (Other bots and programs, such as SwiftKey and M.V. Shaney, make use of wider context, but that is not essential.) To do so, it needs to examine every word that has previously followed the current word and determine which occurred most frequently. This process can naturally be described as a Markov chain:
Let $T$ be a sequence of words representing the input prose. Let $S$ be a finite state space containing all the unique words in $T$. Let $C$ be a matrix with as many rows and columns as there are states in $S$. If the number of times word $i$ has transitioned immediately to word $j$ in $T$ is written $C_{i,j}$, then the entry in row $i$ and column $j$ of $C$ is $C_{i,j}$. This matrix is the basis for constructing the transition matrix.
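As a minimal sketch of this construction (the names `build_count_matrix`, `S`, and `C` are illustrative, not taken from the original program), the count matrix can be assembled in Python by indexing each unique word and tallying adjacent pairs:

```python
def build_count_matrix(T):
    """Build the state space S and count matrix C from a word sequence T.

    C[i][j] counts how many times the word at index i in S was
    immediately followed by the word at index j.
    """
    S = sorted(set(T))                        # unique words: the state space
    index = {word: k for k, word in enumerate(S)}
    n = len(S)
    C = [[0] * n for _ in range(n)]           # n x n matrix of zeros
    for current, nxt in zip(T, T[1:]):        # every adjacent word pair in T
        C[index[current]][index[nxt]] += 1
    return S, C

T = "the cat sat on the mat and the cat slept".split()
S, C = build_count_matrix(T)
```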
To be able to predict words and accept additional input text, fixed probabilities were infeasible, so I created two methods of calculating transition probabilities. The first was more suitable for large pieces of text in the manner of M.V. Shaney, as it gave greater variation in diction and virtually eliminated cases where the program would simply repeat a phrase over and over. That first method is detailed below.
For a transition matrix $P$ describing state space $S$, let $P_{i,j}$, the transition probability from state $i$ in $S$ to state $j$ in $S$, be

$$P_{i,j} = \frac{C_{i,j}}{\sum_{k=1}^{n} C_{i,k}}$$

where $n$ is the index of the last element of $S$.
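A sketch of this calculation, assuming the row normalization above, follows. To reflect the greater variation in diction this method is described as giving, the next word is sampled in proportion to its transition probability rather than always taking the single most likely successor; `build_transition_matrix` and `next_word` are hypothetical names for illustration.

```python
import random

def build_transition_matrix(C):
    """Normalize the count matrix C row by row into a transition matrix P.

    P[i][j] = C[i][j] / sum_k C[i][k]; each row of P sums to 1, and
    rows with no observed transitions are left all zero.
    """
    n = len(C)
    P = [[0.0] * n for _ in range(n)]
    for i in range(n):
        total = sum(C[i])                     # total outgoing transitions from state i
        if total > 0:
            P[i] = [C[i][j] / total for j in range(n)]
    return P

def next_word(word, S, P):
    """Sample the next word in proportion to its transition probability.

    Assumes `word` has at least one observed successor (a nonzero row).
    """
    i = S.index(word)
    return random.choices(S, weights=P[i], k=1)[0]
```

With the toy sentence from the earlier sketch, `next_word("the", S, build_transition_matrix(C))` returns "cat" about two thirds of the time and "mat" the rest, mirroring the observed frequencies.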