thaigov-v2-corpus

Category: Dataset
Development tool: Jupyter Notebook
File size: 0KB
Downloads: 0
Upload date: 2023-08-04 11:33:43
Uploader: sh-1993
Description: Thai news dataset from the Thai government website.

File list:
LICENSE (11357, 2023-12-14)
data/ (0, 2023-12-14)
data/2020/ (0, 2023-12-14)
data/2020/09/ (0, 2023-12-14)
data/2020/09/17/ (0, 2023-12-14)
data/2020/09/17/喔傕箞喔侧抚喔椸赋喙喔權傅喔⑧笟喔`副喔愢笟喔侧弗_1.txt (14588, 2023-12-14)
data/2020/09/17/喔傕箞喔侧抚喔椸赋喙喔權傅喔⑧笟喔`副喔愢笟喔侧弗_10.txt (4391, 2023-12-14)
data/2020/09/17/喔傕箞喔侧抚喔椸赋喙喔權傅喔⑧笟喔`副喔愢笟喔侧弗_11.txt (3888, 2023-12-14)
data/2020/09/17/喔傕箞喔侧抚喔椸赋喙喔權傅喔⑧笟喔`副喔愢笟喔侧弗_12.txt (11435, 2023-12-14)
data/2020/09/17/喔傕箞喔侧抚喔椸赋喙喔權傅喔⑧笟喔`副喔愢笟喔侧弗_13.txt (11242, 2023-12-14)
data/2020/09/17/喔傕箞喔侧抚喔椸赋喙喔權傅喔⑧笟喔`副喔愢笟喔侧弗_14.txt (3549, 2023-12-14)
data/2020/09/17/喔傕箞喔侧抚喔椸赋喙喔權傅喔⑧笟喔`副喔愢笟喔侧弗_15.txt (5828, 2023-12-14)
data/2020/09/17/喔傕箞喔侧抚喔椸赋喙喔權傅喔⑧笟喔`副喔愢笟喔侧弗_16.txt (4626, 2023-12-14)
data/2020/09/17/喔傕箞喔侧抚喔椸赋喙喔權傅喔⑧笟喔`副喔愢笟喔侧弗_17.txt (9506, 2023-12-14)
data/2020/09/17/喔傕箞喔侧抚喔椸赋喙喔權傅喔⑧笟喔`副喔愢笟喔侧弗_18.txt (12445, 2023-12-14)
data/2020/09/17/喔傕箞喔侧抚喔椸赋喙喔權傅喔⑧笟喔`副喔愢笟喔侧弗_19.txt (4792, 2023-12-14)
data/2020/09/17/喔傕箞喔侧抚喔椸赋喙喔權傅喔⑧笟喔`副喔愢笟喔侧弗_2.txt (5211, 2023-12-14)
data/2020/09/17/喔傕箞喔侧抚喔椸赋喙喔權傅喔⑧笟喔`副喔愢笟喔侧弗_20.txt (4182, 2023-12-14)
data/2020/09/17/喔傕箞喔侧抚喔椸赋喙喔權傅喔⑧笟喔`副喔愢笟喔侧弗_21.txt (9424, 2023-12-14)
data/2020/09/17/喔傕箞喔侧抚喔椸赋喙喔權傅喔⑧笟喔`副喔愢笟喔侧弗_22.txt (9276, 2023-12-14)
data/2020/09/17/喔傕箞喔侧抚喔椸赋喙喔權傅喔⑧笟喔`副喔愢笟喔侧弗_23.txt (7414, 2023-12-14)
data/2020/09/17/喔傕箞喔侧抚喔椸赋喙喔權傅喔⑧笟喔`副喔愢笟喔侧弗_24.txt (11846, 2023-12-14)
data/2020/09/17/喔傕箞喔侧抚喔椸赋喙喔權傅喔⑧笟喔`副喔愢笟喔侧弗_25.txt (8111, 2023-12-14)
data/2020/09/17/喔傕箞喔侧抚喔椸赋喙喔權傅喔⑧笟喔`副喔愢笟喔侧弗_26.txt (8952, 2023-12-14)
data/2020/09/17/喔傕箞喔侧抚喔椸赋喙喔權傅喔⑧笟喔`副喔愢笟喔侧弗_27.txt (8124, 2023-12-14)
data/2020/09/17/喔傕箞喔侧抚喔椸赋喙喔權傅喔⑧笟喔`副喔愢笟喔侧弗_28.txt (6415, 2023-12-14)
data/2020/09/17/喔傕箞喔侧抚喔椸赋喙喔權傅喔⑧笟喔`副喔愢笟喔侧弗_29.txt (5289, 2023-12-14)
data/2020/09/17/喔傕箞喔侧抚喔椸赋喙喔權傅喔⑧笟喔`副喔愢笟喔侧弗_3.txt (6827, 2023-12-14)
data/2020/09/17/喔傕箞喔侧抚喔椸赋喙喔權傅喔⑧笟喔`副喔愢笟喔侧弗_30.txt (12305, 2023-12-14)
data/2020/09/17/喔傕箞喔侧抚喔椸赋喙喔權傅喔⑧笟喔`副喔愢笟喔侧弗_31.txt (9165, 2023-12-14)
data/2020/09/17/喔傕箞喔侧抚喔椸赋喙喔權傅喔⑧笟喔`副喔愢笟喔侧弗_32.txt (7978, 2023-12-14)
data/2020/09/17/喔傕箞喔侧抚喔椸赋喙喔權傅喔⑧笟喔`副喔愢笟喔侧弗_33.txt (8502, 2023-12-14)
data/2020/09/17/喔傕箞喔侧抚喔椸赋喙喔權傅喔⑧笟喔`副喔愢笟喔侧弗_34.txt (5880, 2023-12-14)
data/2020/09/17/喔傕箞喔侧抚喔椸赋喙喔權傅喔⑧笟喔`副喔愢笟喔侧弗_35.txt (5348, 2023-12-14)
data/2020/09/17/喔傕箞喔侧抚喔椸赋喙喔權傅喔⑧笟喔`副喔愢笟喔侧弗_36.txt (7242, 2023-12-14)
... ...

# ThaiGov V2 Corpus

## English

- Data from the Thai government website: https://www.thaigov.go.th
- This is part of the PyThaiNLP Project.
- Compiled by Mr. Wannaphong Phatthiyaphaibun
- License: the dataset is in the public domain.

## Data format

- One file holds one news item, which is extracted from one URL.

```
topic
(Blank line)
content
content
content
content
content
(Blank line)
(URL source) : http://www.thaigov.go.th/news/contents/details/NNN
```

## Thai

- https://www.thaigov.go.th
- [PyThaiNLP](https://github.com/PyThaiNLP/)
-
- (public domain) ... .. 2537 7 ( (1) [...] (3) [...])

** Git**

###

- 17 .. 2563

###

- 1 1 1 url

```
()
()
: http://www.thaigov.go.th/news/contents/details/NNN
```

###

- _.txt

### Script

- run.py url ```http://www.thaigov.go.th/news/contents/details/NNN``` NNN
- i
- clean.py
- ```clean.py ```
- ```clean.py 1 2```
- ```clean.py *.txt```

We build Thai NLP.

PyThaiNLP
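
The data format described in the README (a topic line, a blank line, the content lines, a blank line, and a final line carrying the source URL) can be read with a few lines of Python. The following is a minimal sketch, not part of the repository: the names `NewsItem`, `parse_news`, and `load_news` are hypothetical, and it assumes that files are UTF-8 encoded and that the last line contains the URL starting at `http`.

```python
from pathlib import Path
from typing import NamedTuple, Optional


class NewsItem(NamedTuple):
    """One news item from the corpus (hypothetical container, not from the repo)."""
    topic: str
    content: str
    url: Optional[str]


def parse_news(text: str) -> NewsItem:
    """Split the text of one corpus file into topic, content, and source URL."""
    lines = [line.strip() for line in text.strip().splitlines()]
    topic = lines[0] if lines else ""
    body = lines[1:]
    url = None
    # Assumption: the final line holds the source URL after a label and " : ".
    if body and "http" in body[-1]:
        last = body[-1]
        url = last[last.index("http"):]
        body = body[:-1]
    # Drop the blank separator lines and keep the content lines in order.
    content = "\n".join(line for line in body if line)
    return NewsItem(topic, content, url)


def load_news(path: Path) -> NewsItem:
    """Read one file from disk; UTF-8 encoding is an assumption."""
    return parse_news(path.read_text(encoding="utf-8"))


if __name__ == "__main__":
    sample = (
        "topic\n\ncontent\ncontent\n\n"
        "source : http://www.thaigov.go.th/news/contents/details/NNN\n"
    )
    print(parse_news(sample))
```

To walk the whole corpus under the `data/YYYY/MM/DD/` layout shown in the file list, something like `Path("data").rglob("*.txt")` combined with `load_news` should be enough.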
