Multi-Language-Translation-of-News

Category: Machine translation
Development tool: Jupyter Notebook
File size: 0KB
Downloads: 0
Upload date: 2023-12-19 12:06:47
Uploader: sh-1993
Description: Multi-language translation of news

File list:
parser/
presentation/
test/
train/
LICENSE

# Large Multi-Language Models for News Translation

* This repo contains examples of __how to fine-tune Large Language Models__ (LLMs) and apply them to the real task of __news translation__.
* It also provides a __news parser__, so you can parse any news web page you want (for example CNN or BBC News) and test how a pre-trained LLM __translates real parsed news__.

# __1. Facebook: M2M100__

__Facebook: M2M100 (1.2B parameters)__ is a multilingual encoder-decoder (seq-to-seq) model primarily intended for translation tasks, covering 100 languages.

__All available languages:__ Afrikaans, Amharic, Arabic, Asturian, Azerbaijani, Bashkir, Belarusian, Bulgarian, Bengali, Breton, Bosnian, Catalan; Valencian, Cebuano, Czech, Welsh, Danish, German, Greek, English, Spanish, Estonian, Persian, Fulah, Finnish, French, Western Frisian, Irish, Gaelic; Scottish Gaelic, Galician, Gujarati, Hausa, Hebrew, Hindi, Croatian, Haitian; Haitian Creole, Hungarian, Armenian, Indonesian, Igbo, Iloko, Icelandic, Italian, Japanese, Javanese, Georgian, Kazakh, Central Khmer, Kannada, Korean, Luxembourgish; Letzeburgesch, Ganda, Lingala, Lao, Lithuanian, Latvian, Malagasy, Macedonian, Malayalam, Mongolian, Marathi, Malay, Burmese, Nepali, Dutch; Flemish, Norwegian, Northern Sotho, Occitan (post 1500), Oriya, Panjabi; Punjabi, Polish, Pushto; Pashto, Portuguese, Romanian; Moldavian; Moldovan, Russian, Sindhi, Sinhala; Sinhalese, Slovak, Slovenian, Somali, Albanian, Serbian, Swati, Sundanese, Swedish, Swahili, Tamil, Thai, Tagalog, Tswana, Turkish, Ukrainian, Urdu, Uzbek, Vietnamese, Wolof, Xhosa, Yiddish, Yoruba, Chinese, Zulu

# __2. Google: mT5__

__Google: mT5 (1.2B parameters)__ is pretrained on the mC4 corpus, covering 101 languages.

__All available languages:__ Afrikaans, Albanian, Amharic, Arabic, Armenian, Azerbaijani, Basque, Belarusian, Bengali, Bulgarian, Burmese, Catalan, Cebuano, Chichewa, Chinese, Corsican, Czech, Danish, Dutch, English, Esperanto, Estonian, Filipino, Finnish, French, Galician, Georgian, German, Greek, Gujarati, Haitian Creole, Hausa, Hawaiian, Hebrew, Hindi, Hmong, Hungarian, Icelandic, Igbo, Indonesian, Irish, Italian, Japanese, Javanese, Kannada, Kazakh, Khmer, Korean, Kurdish, Kyrgyz, Lao, Latin, Latvian, Lithuanian, Luxembourgish, Macedonian, Malagasy, Malay, Malayalam, Maltese, Maori, Marathi, Mongolian, Nepali, Norwegian, Pashto, Persian, Polish, Portuguese, Punjabi, Romanian, Russian, Samoan, Scottish Gaelic, Serbian, Shona, Sindhi, Sinhala, Slovak, Slovenian, Somali, Sotho, Spanish, Sundanese, Swahili, Swedish, Tajik, Tamil, Telugu, Thai, Turkish, Ukrainian, Urdu, Uzbek, Vietnamese, Welsh, West Frisian, Xhosa, Yiddish, Yoruba, Zulu.
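
The parse-then-translate workflow described in this README could look roughly like the sketch below. It is not the repo's own code: `ParagraphExtractor`, `extract_paragraphs`, `split_into_chunks`, `load_m2m100_translator`, and `translate_article` are hypothetical helpers invented for illustration. It also assumes the M2M100 checkpoint is loaded through the Hugging Face `transformers` library under the hub name `facebook/m2m100_1.2B`, which the README does not confirm.

```python
from html.parser import HTMLParser
from typing import Callable, List


class ParagraphExtractor(HTMLParser):
    """Collects the text of every <p> element (hypothetical stand-in for the repo's parser)."""

    def __init__(self) -> None:
        super().__init__()
        self.paragraphs: List[str] = []
        self._in_paragraph = False
        self._buffer: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._in_paragraph = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_paragraph:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "p" and self._in_paragraph:
            # Join the pieces and collapse runs of whitespace.
            text = " ".join("".join(self._buffer).split())
            if text:
                self.paragraphs.append(text)
            self._in_paragraph = False


def extract_paragraphs(html: str) -> List[str]:
    """Return the non-empty paragraph texts found in an HTML page."""
    extractor = ParagraphExtractor()
    extractor.feed(html)
    extractor.close()
    return extractor.paragraphs


def split_into_chunks(paragraphs: List[str], max_chars: int = 400) -> List[str]:
    """Greedily merge paragraphs into chunks of at most max_chars characters.

    A single paragraph longer than max_chars is kept as its own chunk.
    """
    chunks: List[str] = []
    current = ""
    for para in paragraphs:
        if current and len(current) + 1 + len(para) > max_chars:
            chunks.append(current)
            current = para
        else:
            current = f"{current} {para}" if current else para
    if current:
        chunks.append(current)
    return chunks


def load_m2m100_translator(src_lang: str, tgt_lang: str,
                           model_name: str = "facebook/m2m100_1.2B") -> Callable[[str], str]:
    """Build a translate(text) function; assumes the Hugging Face checkpoint name above."""
    # Imported lazily so the helpers above work without transformers installed.
    from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

    tokenizer = M2M100Tokenizer.from_pretrained(model_name)
    model = M2M100ForConditionalGeneration.from_pretrained(model_name)
    tokenizer.src_lang = src_lang

    def translate(text: str) -> str:
        encoded = tokenizer(text, return_tensors="pt", truncation=True)
        # M2M100 selects the output language via the first generated token.
        generated = model.generate(
            **encoded, forced_bos_token_id=tokenizer.get_lang_id(tgt_lang)
        )
        return tokenizer.batch_decode(generated, skip_special_tokens=True)[0]

    return translate


def translate_article(paragraphs: List[str], translate: Callable[[str], str],
                      max_chars: int = 400) -> str:
    """Translate an article chunk by chunk and join the results with newlines."""
    return "\n".join(translate(chunk) for chunk in split_into_chunks(paragraphs, max_chars))


# Example usage (downloads the model on first run):
#   translate = load_m2m100_translator("en", "fr")
#   print(translate_article(extract_paragraphs(page_html), translate))
```

Because `translate_article` accepts any callable, the parsing and chunking steps can be checked with a trivial stand-in function before loading a multi-gigabyte model, and a fine-tuned checkpoint can be swapped in later.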
