spell-check

Category: Multilingual processing
Development tool: Erlang
File size: 0KB
Downloads: 0
Upload date: 2017-10-01 06:58:12
Uploader: sh-1993
Description: Implementation of a spell-corrector in Erlang

File list:
.idea/ (0, 2017-09-30)
.idea/codeStyleSettings.xml (206, 2017-09-30)
.idea/misc.xml (3114, 2017-09-30)
.idea/modules.xml (274, 2017-09-30)
.idea/spell-check.iml (336, 2017-09-30)
.idea/vcs.xml (167, 2017-09-30)
.idea/workspace.xml (20560, 2017-09-30)
LICENSE (1068, 2017-09-30)
big.txt (6488668, 2017-09-30)
check.erl (5052, 2017-09-30)

# Spell-Checker

- Inspired by the spelling checkers in search engines, office packages and many other tools, this is an attempt to implement a spelling corrector in Erlang.
- In 2007, [Norvig](https://research.google.com/pubs/author205.html) (**Director of Research at Google Inc**) released a toy spelling corrector in Python (only 21 lines), achieving 80 or 90% accuracy at a processing speed of at least 10 words per second in about half a page of code.
- He released it after his two friends [Dean](https://en.wikipedia.org/wiki/Jeff_Dean_(computer_scientist)) and [Bill](https://en.wikipedia.org/wiki/Bill_Maris), though highly accomplished engineers and mathematicians, were amazed at Google's spelling correction and did not have good intuitions about how the process works.

### Implementation

- The reference words come from [big.txt](https://github.com/mishal23/spell-check/blob/master/big.txt), which has about a million words (the same file Norvig used in his implementation of the spell corrector).
- All the words in [big.txt](https://github.com/mishal23/spell-check/blob/master/big.txt) are split and saved as a list.
- A new list of candidate words is formed from the edits produced by the four functions (```deletion_edits```, ```transposition_edits```, ```alteration_edits```, ```insertion_edits```).
- The candidate list is then filtered against the list of words from [big.txt](https://github.com/mishal23/spell-check/blob/master/big.txt), and the words found in both lists are returned.

### Steps to Run

- Fork the repository, clone your fork, and then open the Erlang shell.
- Change the directory to the cloned repository.
- Compile it.
- Input a word in double quotes and check the recommendations given.
- On my system, after opening the Erlang shell, it looks as follows:

```erlang
1> cd("C:/Users/Mishal Shah/Desktop/Erlang").
C:/Users/Mishal Shah/Desktop/Erlang
ok
2> c(check).
{ok,check}
3> check:known("helo").
Did you mean?
["felo","halo","held","hell","hello","helm","help","hero"]
4> check:known("seach").
Did you mean?
["beach","each","reach","search","teach"]
5> check:known("somthing").
Did you mean?
["something","soothing"]
```

### Timer

- The times noted are the average of 6 outputs of the timer function.

| Word | 3rd release time (in seconds) | 2nd release time (in seconds) | 1st release time (in seconds) |
|---|---|---|---|
| somthing | 3.09 | 4.525 | 13.3 |
| seach | 3.05 | 4.46 | 9.5 |
| helo | 2.8 | 4.5 | 8.2 |

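
The README does not say which timer function produced these averages. As a minimal sketch, assuming the standard library's `timer:tc/1` was used, a mean over several runs could be computed as below. The module name `timing_sketch` and the function `average_seconds/2` are hypothetical and are not part of `check.erl`.

```erlang
-module(timing_sketch).
-export([average_seconds/2]).

%% Run the zero-argument Fun Runs times and return the mean elapsed
%% time in seconds. timer:tc/1 reports elapsed time in microseconds.
%% Assumption: this mirrors "the average of 6 outputs of timer function";
%% the measurement code actually used by the author is not shown.
average_seconds(Fun, Runs)
  when is_function(Fun, 0), is_integer(Runs), Runs > 0 ->
    Micros = [element(1, timer:tc(Fun)) || _ <- lists:seq(1, Runs)],
    lists:sum(Micros) / (Runs * 1000000).
```

With `check` compiled as in the steps above, a call such as `timing_sketch:average_seconds(fun() -> check:known("somthing") end, 6).` would return the mean time of six lookups. Whether that matches the figures in the table depends on the machine and on what the original measurement included.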
### Future Scope

- [ ] Work on run-time speed.
- [ ] Work on increasing accuracy.
- [ ] Extend the spell-checker to input of more than one word.

### License

- This repository is under the [MIT License](https://github.com/mishal23/spell-check/blob/master/LICENSE).

**FootNotes**

- Norvig's original post is [here](http://norvig.com/spell-correct.html).
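
The steps listed under Implementation (split the corpus into words, generate edits, keep the edits that are known words) can be sketched roughly as follows. This is not the code in `check.erl`: the four edit function names come from the README, but their bodies and arities here are assumptions modelled on Norvig's post. The module name `edits_sketch` and the helpers `words/1`, `splits/1`, `alphabet/0`, `edits1/1` and `suggest/2` are hypothetical.

```erlang
-module(edits_sketch).
-export([words/1, edits1/1, suggest/2]).

%% Lowercase the text and extract runs of the letters a-z as words.
words(Text) ->
    Opts = [global, {capture, first, list}],
    case re:run(string:lowercase(Text), "[a-z]+", Opts) of
        {match, Matches} -> [W || [W] <- Matches];
        nomatch -> []
    end.

%% Letters tried for alterations and insertions (assumed: a-z only).
alphabet() -> "abcdefghijklmnopqrstuvwxyz".

%% Every way to cut Word into {Left, Right}, including empty halves.
splits(Word) ->
    [lists:split(N, Word) || N <- lists:seq(0, length(Word))].

%% Drop one character.
deletion_edits(Splits) ->
    [L ++ R || {L, [_ | R]} <- Splits].

%% Swap two adjacent characters.
transposition_edits(Splits) ->
    [L ++ [B, A | R] || {L, [A, B | R]} <- Splits].

%% Replace one character with a letter of the alphabet.
alteration_edits(Splits) ->
    [L ++ [C | R] || {L, [_ | R]} <- Splits, C <- alphabet()].

%% Insert one letter of the alphabet at any position.
insertion_edits(Splits) ->
    [L ++ [C | R] || {L, R} <- Splits, C <- alphabet()].

%% All strings one edit away from Word, sorted and de-duplicated.
edits1(Word) ->
    S = splits(Word),
    lists:usort(deletion_edits(S) ++ transposition_edits(S)
                ++ alteration_edits(S) ++ insertion_edits(S)).

%% Keep only the edits that also occur in Dictionary (a list of words).
suggest(Word, Dictionary) ->
    Known = sets:from_list(Dictionary),
    [W || W <- edits1(Word), sets:is_element(W, Known)].
```

For example, `edits_sketch:suggest("helo", ["hello", "help", "held", "world"]).` returns `["held","hello","help"]`. A dictionary for real use could be built with `{ok, Bin} = file:read_file("big.txt"), Dict = edits_sketch:words(Bin).`; the sketch uses a set for the membership test so that each candidate lookup does not scan the whole word list.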
