spell-check
Category: Multilingual processing
Development tool: Erlang
File size: 0KB
Downloads: 0
Upload date: 2017-10-01 06:58:12
Uploader: sh-1993
Description: Implementation of a spell-corrector in Erlang
File list:
.idea/ (0, 2017-09-30)
.idea/codeStyleSettings.xml (206, 2017-09-30)
.idea/misc.xml (3114, 2017-09-30)
.idea/modules.xml (274, 2017-09-30)
.idea/spell-check.iml (336, 2017-09-30)
.idea/vcs.xml (167, 2017-09-30)
.idea/workspace.xml (20560, 2017-09-30)
LICENSE (1068, 2017-09-30)
big.txt (6488668, 2017-09-30)
check.erl (5052, 2017-09-30)
# Spell-Checker
- Inspired by the spelling checkers found in search engines, office suites, and many other tools, this is an attempt to implement a spelling corrector in Erlang.
- [Norvig](https://research.google.com/pubs/author205.html) (**Director of Research at Google Inc.**) released a toy spelling corrector in Python in 2007. In about half a page of code (only 21 lines), it achieves 80 or 90% accuracy at a processing speed of at least 10 words per second.
- He released it after two friends, [Dean](https://en.wikipedia.org/wiki/Jeff_Dean_(computer_scientist)) and [Bill](https://en.wikipedia.org/wiki/Bill_Maris), were amazed at Google's spelling correction. Although they are highly accomplished engineers and mathematicians, they did not have good intuitions about how the process works.
### Implementation
- The reference vocabulary comes from [big.txt](https://github.com/mishal23/spell-check/blob/master/big.txt), which contains about a million words (the same file Norvig used in his spelling corrector).
- All the words in [big.txt](https://github.com/mishal23/spell-check/blob/master/big.txt) are split and saved as a list.
- A second list of candidate words is built by applying four edit functions (`deletion_edits`, `transposition_edits`, `alteration_edits`, `insertion_edits`) to the input word.
- The candidate list is then filtered against the word list from [big.txt](https://github.com/mishal23/spell-check/blob/master/big.txt), and the candidates that appear in both lists are returned as suggestions.
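
The steps above can be sketched roughly as follows. This is a minimal illustration, not the code in `check.erl`: the module name `spell_sketch` and the helpers `words/1`, `splits/1`, `alphabet/0`, `edits1/1` and `suggest/2` are hypothetical, and the bodies of the four edit functions are assumptions that follow Norvig's description. It also swaps the list comparison for a set lookup, assumes ASCII input, and needs OTP 20 or later for `string:lowercase/1`.

```erlang
%% Hypothetical sketch; names and bodies are assumptions, not taken from check.erl.
-module(spell_sketch).
-export([words/1, edits1/1, suggest/2]).

%% Split raw text into lowercase alphabetic words (ASCII only).
words(Text) ->
    Lower = string:lowercase(Text),
    case re:run(Lower, "[a-z]+", [global, {capture, first, list}]) of
        {match, Matches} -> [W || [W] <- Matches];
        nomatch -> []
    end.

%% Every way of cutting Word into a {Left, Right} pair.
splits(Word) ->
    [lists:split(N, Word) || N <- lists:seq(0, length(Word))].

alphabet() -> lists:seq($a, $z).

%% Drop one character.
deletion_edits(Word) ->
    [L ++ R || {L, [_ | R]} <- splits(Word)].

%% Swap two adjacent characters.
transposition_edits(Word) ->
    [L ++ [B, A | R] || {L, [A, B | R]} <- splits(Word)].

%% Replace one character with a letter a..z.
alteration_edits(Word) ->
    [L ++ [C | R] || {L, [_ | R]} <- splits(Word), C <- alphabet()].

%% Insert a letter a..z at every position.
insertion_edits(Word) ->
    [L ++ [C | R] || {L, R} <- splits(Word), C <- alphabet()].

%% All candidates one edit away from Word, sorted and de-duplicated.
edits1(Word) ->
    lists:usort(deletion_edits(Word) ++ transposition_edits(Word) ++
                alteration_edits(Word) ++ insertion_edits(Word)).

%% Keep only the candidates that occur in the dictionary set.
suggest(Word, Dict) ->
    [C || C <- edits1(Word), sets:is_element(C, Dict)].
```

A dictionary can be built once with `sets:from_list(spell_sketch:words(Text))` and then passed to `spell_sketch:suggest/2`. A set makes each membership test cheap, whereas scanning a plain list repeats the full scan for every candidate.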
### Steps to Run
- Fork the repository, clone it, and open the Erlang shell.
- Change the directory to the cloned repository.
- Compile the `check` module.
- Call `check:known/1` with a word in double quotes and check the suggestions it returns.
- On my system, the session in the Erlang shell looks like this:
```erlang
1> cd("C:/Users/Mishal Shah/Desktop/Erlang").
C:/Users/Mishal Shah/Desktop/Erlang
ok
2> c(check).
{ok,check}
3> check:known("helo").
Did you mean?
["felo","halo","held","hell","hello","helm","help","hero"]
4> check:known("seach").
Did you mean?
["beach","each","reach","search","teach"]
5> check:known("somthing").
Did you mean?
["something","soothing"]
```
### Timer
- Each time listed is the average of 6 runs measured with the timer function.

| Word | 3rd release time (in seconds) | 2nd release time (in seconds) | 1st release time (in seconds) |
|---|---|---|---|
| somthing | 3.09 | 4.525 | 13.3 |
| seach | 3.05 | 4.46 | 9.5 |
| helo | 2.8 | 4.5 | 8.2 |

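
An average like the ones above could be collected with a small helper along these lines. This is only a sketch: the module `timing_sketch` and the function `average_seconds/2` are hypothetical names, and the README does not say exactly how the timer function was called. It relies on the standard `timer:tc/1`, which returns the elapsed time in microseconds together with the result of the function.

```erlang
%% Hypothetical helper; not part of check.erl.
-module(timing_sketch).
-export([average_seconds/2]).

%% Run Fun N times and return the mean elapsed time in seconds.
average_seconds(Fun, N) when is_function(Fun, 0), is_integer(N), N > 0 ->
    %% timer:tc/1 returns {Microseconds, Result}; keep only the time.
    Micros = [element(1, timer:tc(Fun)) || _ <- lists:seq(1, N)],
    lists:sum(Micros) / (N * 1000000).
```

For example, `timing_sketch:average_seconds(fun() -> check:known("helo") end, 6).` would time six calls to the function used in the session above and return their mean.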
### Future Scope
- [ ] Work on run-time speed.
- [ ] Work on increasing accuracy.
- [ ] Extend the spell-checker to handle more than one word.
### License
- This repository is released under the [MIT License](https://github.com/mishal23/spell-check/blob/master/LICENSE).
**Footnotes**
- Norvig's original post is [here](http://norvig.com/spell-correct.html).