A simple least squares classifier that predicts whether a given word is Spanish or French based on a few bigram features.
I wrote a function that generates every two-letter sequence in the alphabet to use as features; I also manually added some common French and Spanish sequences and prefixes. The model achieved an accuracy of at least 75% on the training data. It also performed well on unseen data, achieving an accuracy of 84.12% on the leaderboard.
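The approach described above could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the names `BIGRAMS`, `featurize`, `fit`, and `predict` are hypothetical, the +1/-1 label encoding and the bias term are assumptions, and the manually added French and Spanish sequences and prefixes are left out.

```python
import itertools
import string

import numpy as np

# Every two-letter sequence over the lowercase ASCII alphabet (26 * 26 = 676).
# Assumption: accented characters are ignored in this sketch.
BIGRAMS = ["".join(pair) for pair in itertools.product(string.ascii_lowercase, repeat=2)]


def featurize(word):
    """Return a 0/1 vector marking which bigrams occur in the word, plus a bias term."""
    word = word.lower()
    features = [1.0 if bigram in word else 0.0 for bigram in BIGRAMS]
    return np.array(features + [1.0])  # trailing 1.0 is the (assumed) bias feature


def fit(words, labels):
    """Find w minimizing ||Xw - y||^2, with labels given as +1 (Spanish) or -1 (French)."""
    X = np.array([featurize(w) for w in words])
    y = np.array(labels, dtype=float)
    # lstsq returns the minimum-norm solution when X is rank-deficient.
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w


def predict(w, word):
    """Classify by the sign of the linear score."""
    return "spanish" if featurize(word) @ w >= 0 else "french"
```

Calling `fit` on a list of labeled training words returns a weight vector that `predict` can then apply to a new word. With far more features than training words, the least squares problem is underdetermined, so training accuracy alone says little about how the model generalizes.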
Acknowledgements: Professor Justin Eldridge, UC San Diego.