sylbreak

Category: Pattern recognition (vision/speech, etc.)
Development tool: HTML
File size: 1773KB
Downloads: 0
Upload date: 2023-06-03 11:16:22
Uploader: sh-1993
Description: no intro
(Syllable segmentation tool for Myanmar language (Burmese) by Ye.)

File list:
Java (0, 2023-06-03)
Java\SyllableSegmentation.java (1678, 2023-06-03)
Java\SyllableSegmentationTest.java (553, 2023-06-03)
Javascript (0, 2023-06-03)
Javascript\resegment.html (684, 2023-06-03)
Javascript\resegment.js (1080, 2023-06-03)
LICENSE (11357, 2023-06-03)
cpp (0, 2023-06-03)
cpp\how2run.txt (1477, 2023-06-03)
cpp\sylbreak.cpp (3398, 2023-06-03)
data (0, 2023-06-03)
data\my-input.txt (3381, 2023-06-03)
data\my-input2.txt (12515, 2023-06-03)
jupyter-notebook (0, 2023-06-03)
jupyter-notebook\note.txt (1469, 2023-06-03)
jupyter-notebook\using-sylbreak-in-jupyter-notebook.html (288914, 2023-06-03)
jupyter-notebook\using-sylbreak-in-jupyter-notebook.ipynb (33825, 2023-06-03)
jupyter-notebook\using-sylbreak-in-jupyter-notebook.pdf (187767, 2023-06-03)
perl (0, 2023-06-03)
perl\how2run.txt (21938, 2023-06-03)
perl\note.txt (544, 2023-06-03)
perl\sylbreak.pl (2356, 2023-06-03)
php (0, 2023-06-03)
php\sylbreak.php (949, 2023-06-03)
python (0, 2023-06-03)
python\how2run.txt (26002, 2023-06-03)
python\note.txt (24, 2023-06-03)
python\sylbreak.py (2475, 2023-06-03)
python\sylbreak3.py (2649, 2023-06-03)
reference (0, 2023-06-03)
reference\E12-3004.pdf (500006, 2023-06-03)
reference\I08-3010.pdf (300136, 2023-06-03)
reference\SNLP-3-A Large-scale Study of Statistical Machine Translation Methods for Myanmar Language.pdf (163683, 2023-06-03)
reference\my2Others-CameraReady.pdf (676387, 2023-06-03)
ruby (0, 2023-06-03)
ruby\how2run.txt (1145, 2023-06-03)
... ...

# sylbreak

[Myanmar language (Burmese) README](https://github.com/ye-kyaw-thu/sylbreak/blob/master/README-Myanmar.md)

Syllable segmentation is an important preprocessing step for many natural language processing (NLP) tasks such as romanization, transliteration and grapheme-to-phoneme (g2p) conversion. "sylbreak" is a syllable segmentation tool for Myanmar language (Burmese) text encoded in Unicode (e.g. the Myanmar3 and Padauk fonts). I used only one short regular expression (RE), as follows:

```perl
$line =~ s/((?
```

Fig. Visualization of sylbreak RE
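To make the single-expression idea concrete, here is a minimal Python sketch. It is not the expression used by sylbreak itself: the character classes and the names `MY_CONSONANT`, `ASAT`, `VIRAMA`, `BREAK_PATTERN` and `syllable_break` are assumptions made for illustration, and the sketch deliberately ignores digits, punctuation, independent vowels and non-Myanmar text.

```python
import re

# Assumed, simplified character classes (hypothetical names, not from the README).
MY_CONSONANT = "\u1000-\u1021"  # Myanmar consonants KA (U+1000) through A (U+1021)
ASAT = "\u103A"                 # asat: marks a syllable-final consonant
VIRAMA = "\u1039"               # virama: stacks the next consonant under the previous one

# A syllable is assumed to start at a consonant that is not stacked
# (no virama before it) and is not itself followed by asat or virama.
BREAK_PATTERN = re.compile(
    "(?<!" + VIRAMA + ")[" + MY_CONSONANT + "](?![" + ASAT + VIRAMA + "])"
)


def syllable_break(text, sep="|"):
    """Insert sep before every consonant matched by BREAK_PATTERN."""
    broken = BREAK_PATTERN.sub(lambda m: sep + m.group(0), text)
    # Drop the separator that lands in front of the very first syllable.
    return broken[len(sep):] if broken.startswith(sep) else broken


if __name__ == "__main__":
    word = "\u1019\u103C\u1014\u103A\u1019\u102C"  # "Myanmar" written in Burmese script
    print(syllable_break(word))
```

With these assumptions the sample word is split into two syllables, because the consonant followed by asat is treated as closing the first syllable rather than opening a new one. A real tool would likely need extra classes for the characters this sketch skips.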

If you use the shell (sylbreak.sh), Perl (sylbreak.pl) or Python (sylbreak.py) scripts, no installation is needed.

Enjoy syllable breaking!

Ye@Lab

### Acknowledgement

Thanks to [Swan Htet Aung](https://github.com/swanhtet1992), who informed me of my typo in $otherChar ... --->

The sylbreak RE example programs for Java and JavaScript were written by [Chan Mrate Ko Ko](https://github.com/ye-kyaw-thu/sylbreak/commits?author=chanmratekoko).

### Reference

1. Dr. Thein Tun, Acoustic Phonetics and The Phonology of the Myanmar Language
2. Romanization: https://en.wikipedia.org/wiki/Romanization
3. Myanmar Unicode: http://unicode.org/charts/PDF/U1000.pdf
4. Syllable segmentation algorithm of Myanmar text: http://gii2.nagaokaut.ac.jp/gii/media/share/20080901-ZMM%20Presentation.pdf
5. Zin Maung Maung and Yoshiki Mikami, "A Rule-based Syllable Segmentation of Myanmar Text", in Proceedings of the IJCNLP-08 Workshop on NLP for Less Privileged Languages, January 2008, Hyderabad, India, pp. 51-58. [Paper](https://github.com/ye-kyaw-thu/sylbreak/blob/master/reference/I08-3010.pdf)
6. Tin Htay Hlaing, "Manually constructed context-free grammar for Myanmar syllable structure", in Proceedings of the Student Research Workshop at the 13th Conference of the European Chapter of the Association for Computational Linguistics (EACL '12), Association for Computational Linguistics, Stroudsburg, PA, USA, pp. 32-37. [Paper](https://github.com/ye-kyaw-thu/sylbreak/blob/master/reference/E12-3004.pdf)
7. Ye Kyaw Thu, Andrew Finch, Yoshinori Sagisaka and Eiichiro Sumita, "A Study of Myanmar Word Segmentation Schemes for Statistical Machine Translation", in Proceedings of the 11th International Conference on Computer Applications (ICCA 2013), February 26-27, 2013, Yangon, Myanmar, pp. 167-179. [Paper](https://github.com/ye-kyaw-thu/sylbreak/blob/master/reference/my2Others-CameraReady.pdf)
8. Ye Kyaw Thu, Andrew Finch, Win Pa Pa, and Eiichiro Sumita, "A Large-scale Study of Statistical Machine Translation Methods for Myanmar Language", in Proceedings of SNLP2016, February 10-12, 2016, Phranakhon Si Ayutthaya, Thailand. [Paper](https://github.com/ye-kyaw-thu/sylbreak/blob/master/reference/SNLP-3-A%20Large-scale%20Study%20of%20Statistical%20Machine%20Translation%20Methods%20for%20Myanmar%20Language.pdf)
9. Regular Expression: https://en.wikipedia.org/wiki/Regular_expression
10. Debuggex Beta: https://www.debuggex.com/
