KhabarNama-News-Ranking-Tool

Category: Clustering algorithms
Development tool: JavaScript
File size: 0KB
Downloads: 0
Upload date: 2024-01-03 12:56:55
Uploader: sh-1993
Description: An information retrieval system built on a dataset of Dawn News articles from 2010 to 2021, using multiple NLP techniques such as TF-IDF and cosine similarity.

File list:
backend/
frontend/
package-lock.json

# Khabar Nama Project

## Overview

The Khabar Nama project is a Natural Language Processing (NLP) project designed to extract pertinent data from the DAWN news dataset. It uses the TF-IDF and cosine similarity techniques from the sklearn library to surface relevant articles and support informed decision-making.

## Technologies Used

- **Frontend:** React
- **Backend:** Docker container
- **NLP Techniques:** TF-IDF, Cosine Similarity
- **Data Preprocessing:** Stemming, Lemmatization, Stop Word Removal

## Features

1. **Web Application:** Intuitive React-based interface for user interaction.
2. **Containerized Backend:** Docker deployment keeps the backend portable and makes it easier to scale.

## Data Preprocessing

The project applies several preprocessing techniques to improve data quality:

- **Stemming:** Strips suffixes to reduce words to a common root form, so that inflected variants are treated consistently.
- **Lemmatization:** Maps words to their dictionary base form (lemma).
- **Stop Word Removal:** Removes common, non-informative terms from the dataset.

## How to Use

1. Clone the repository.
2. Set up the React frontend.
3. Deploy the Docker container for the backend.
4. Explore the extracted insights from the DAWN news dataset.

## Contributors

- Saad Dastgir
- Muhammad Umair khan
- Ahsan Abbasi

## Acknowledgments

- Special thanks to the sklearn library for providing robust NLP tools and enhancing the project's capabilities.
- Thanks to Dr. Arif ur rehman (HOD CS, Bahria University) for supervising the project.
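
As a rough illustration of the TF-IDF and cosine similarity ranking named in the Overview, the sketch below scores a few documents against a query. It is a plain-Python approximation, not the project's code: the project uses sklearn, whereas this version computes the weights by hand, using the smoothed IDF formula `ln((1 + n) / (1 + df)) + 1` with L2 normalization, so that it runs without dependencies. The function names, the tiny stop-word list, and the sample headlines are all assumptions made for the example. Stemming and lemmatization are left out to keep it short.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; the project's list is not given).
STOP_WORDS = {"a", "an", "the", "in", "on", "of", "to", "and", "is", "for"}


def tokenize(text):
    """Lowercase, keep alphabetic runs, and drop stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]


def build_idf(tokenized_docs):
    """Smoothed inverse document frequency: ln((1 + n) / (1 + df)) + 1."""
    n = len(tokenized_docs)
    df = Counter(term for doc in tokenized_docs for term in set(doc))
    return {term: math.log((1 + n) / (1 + count)) + 1 for term, count in df.items()}


def tfidf_vector(tokens, idf):
    """L2-normalized TF-IDF weights; terms unseen in the corpus are ignored."""
    tf = Counter(t for t in tokens if t in idf)
    weights = {t: c * idf[t] for t, c in tf.items()}
    norm = math.sqrt(sum(w * w for w in weights.values()))
    return {t: w / norm for t, w in weights.items()} if norm else {}


def cosine(u, v):
    """For unit-length sparse vectors, cosine similarity is the dot product."""
    return sum(w * v.get(t, 0.0) for t, w in u.items())


def rank(query, docs):
    """Return (doc_index, score) pairs sorted by descending similarity."""
    tokenized = [tokenize(d) for d in docs]
    idf = build_idf(tokenized)
    doc_vecs = [tfidf_vector(toks, idf) for toks in tokenized]
    q_vec = tfidf_vector(tokenize(query), idf)
    scores = [(i, cosine(q_vec, dv)) for i, dv in enumerate(doc_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Made-up headlines for illustration only; not taken from the DAWN dataset.
docs = [
    "Cricket team wins the series in Lahore",
    "Stock market closes higher on budget hopes",
    "Heavy rain floods streets in Karachi",
]
results = rank("cricket series", docs)
for index, score in results:
    print(f"{score:.3f}  {docs[index]}")
```

Only the first headline shares terms with the query, so it ranks first and the other two score zero. In a sklearn-based backend, the equivalent steps would presumably be `TfidfVectorizer` for the weighting and `cosine_similarity` for the scoring.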
