Summary: Correcting spelling mistakes like "campagn" (campaign) is comparatively easy.
However, a common mistake is typing 'there' when you intend to type 'three'. Both 'there' and 'three' are spelled correctly, so a plain dictionary lookup cannot catch the error. Yet if we compare the phrases 'three days' and 'there days', it is obvious that 'three days' is the correct phrase.
How can a spelling correction algorithm recognize the difference mentioned above? In this program, I attempt to solve this problem using information from the context.
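To make the idea of "information from the context" concrete, here is a minimal sketch that picks between two correctly spelled candidates by counting how often each one precedes the next word in a small corpus. This is not necessarily how main.py works: the bigram approach, the toy corpus, and the names tokenize, train_bigrams and pick_by_context are all assumptions made for illustration.

```python
from collections import Counter
import re


def tokenize(text):
    # Lowercase the text and keep alphabetic runs only (punctuation is dropped).
    return re.findall(r"[a-z]+", text.lower())


def train_bigrams(corpus):
    # Count adjacent word pairs. Sentence boundaries are ignored in this sketch.
    tokens = tokenize(corpus)
    return Counter(zip(tokens, tokens[1:]))


def pick_by_context(candidates, next_word, bigrams):
    # Choose the candidate seen most often right before next_word.
    # A Counter returns 0 for unseen pairs; ties keep the earlier candidate.
    return max(candidates, key=lambda w: bigrams[(w, next_word)])


# Toy corpus (hypothetical); a real run would need a much larger one.
corpus = (
    "the trip took three days . we waited three days for a reply . "
    "there is a shop nearby . three days later it rained ."
)
bigrams = train_bigrams(corpus)
best = pick_by_context(["there", "three"], "days", bigrams)
print(best)  # prints: three
```

With such a tiny corpus the counts are only illustrative. A practical version would likely need smoothing for unseen word pairs and would look at the preceding word as well as the following one.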
Required Python packages: re, collections, nltk, numpy, operator, csv, sys (nltk and numpy are third-party packages; the rest are part of the Python standard library)
Compatibility: The program has been tested on Python 3.6.5 using the Anaconda distribution.
The program takes a few minutes to run on the given example, so some patience is appreciated.
How to run:
python3 main.py inputFileLocation
For example, python3 main.py /Users/tg/Desktop/517/assignment2/input.txt
Outputs:
The program generates an "output.txt" file in the same directory as main.py.
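The input and output conventions described above can be sketched as follows. This is only an illustration under stated assumptions: resolve_paths is a hypothetical helper, not a function taken from main.py, and the usage message is invented.

```python
import os


def resolve_paths(argv, script_path):
    # argv is expected to look like ["main.py", "inputFileLocation"].
    if len(argv) != 2:
        raise SystemExit("usage: python3 main.py inputFileLocation")
    input_path = argv[1]
    # output.txt goes in the directory that holds the script,
    # not in the directory of the input file.
    script_dir = os.path.dirname(os.path.abspath(script_path))
    return input_path, os.path.join(script_dir, "output.txt")
```

Inside a script this would typically be called as resolve_paths(sys.argv, __file__) after importing sys, so the output location stays the same no matter which directory the command is run from.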