News_Summary

Category: Feature extraction
Development tool: Jupyter Notebook
File size: 20242 KB
Downloads: 0
Upload date: 2019-11-12 04:56:09
Uploader: sh-1993
Description: Dataset and scripts for scraping news articles from popular sources, along with a summary of each article.

File list:
news_summary.csv (11887088, 2019-11-12)
news_summary_more.csv (41399270, 2019-11-12)
scrape.ipynb (1136045, 2019-11-12)
store.py (2352, 2019-11-12)

### Context

I am currently working on summarizing chat context so that an agent can quickly understand the previous conversation. I am interested in applying deep learning models to existing datasets and seeing how they perform. I believe news articles are rich in grammar and vocabulary, which allows us to gain greater insights.

### Content

The dataset consists of 4515 examples, each containing Author_name, Headlines, Url of Article, Short text, and Complete Article. I gathered the summarized news from Inshorts and scraped only the news articles from Hindu, Indian times and Guardian. The time period ranges from February to August 2017.

### Acknowledgements

I would like to thank the authors of Inshorts for their amazing work.

### Inspiration

* Generating short descriptions (headlines) from text (news articles).
* Summarizing large amounts of information so that it can be represented in a compressed space.

### Purpose

When I was working on the summarization task, I didn't find any open-source datasets to work on. I believe there are people just like me working on these tasks, and I hope this helps them.

### Contributions

It would be really helpful if anyone who finds interesting insights in this data could share their work. Thank you!
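
### Example

As a starting point for the summarization tasks listed under Inspiration, here is a minimal sketch of reading article/summary pairs and measuring how strongly the summaries compress the articles. The column names `text` and `ctext` are assumptions made for illustration, not confirmed by this description, so check the actual header row of `news_summary.csv` before relying on them. The helper functions `load_pairs` and `mean_compression` are hypothetical as well.

```python
import csv
import io

# Hypothetical column names; check the real header of news_summary.csv first.
SUMMARY_COL = "text"    # assumed: the short Inshorts summary
ARTICLE_COL = "ctext"   # assumed: the complete scraped article


def load_pairs(fileobj, summary_col=SUMMARY_COL, article_col=ARTICLE_COL):
    """Return (article, summary) pairs, skipping rows where either is empty."""
    pairs = []
    for row in csv.DictReader(fileobj):
        article = (row.get(article_col) or "").strip()
        summary = (row.get(summary_col) or "").strip()
        if article and summary:
            pairs.append((article, summary))
    return pairs


def mean_compression(pairs):
    """Average ratio of summary word count to article word count."""
    ratios = [len(s.split()) / len(a.split()) for a, s in pairs]
    return sum(ratios) / len(ratios) if ratios else 0.0


# Tiny in-memory stand-in for the real file, so the sketch runs on its own.
sample = io.StringIO(
    "headlines,text,ctext\n"
    "Title A,short summary here,this is a much longer complete article body text\n"
    "Title B,another summary,\n"
)
pairs = load_pairs(sample)
print(len(pairs), round(mean_compression(pairs), 2))  # prints: 1 0.33
```

To run this against the real data, pass an open file instead of the in-memory sample, for example `open("news_summary.csv", newline="")`. The file encoding is not stated here, so an explicit `encoding=` argument may be needed if the default fails.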
