Credit_risk_resampling

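Category: Project management
Development tool: Jupyter Notebook
File size: 0KB
Downloads: 0
Upload date: 2023-08-17 16:37:33
Uploader: sh-1993
Description: This project centers on using machine learning techniques to predict credit risk with logistic regression models. p...,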
(This project is centered around employing machine learning techniques for predicting credit risk through logistic regression models. The project encompasses the management of imbalanced data and a performance evaluation that contrasts the model's effectiveness using both the initial dataset and a resampled dataset.)

File list:
Resources (0, 2023-08-22)
credit_risk_resampling.ipynb (31181, 2023-08-22)

# Credit_risk_resampling

# Table of Contents

* [Introduction](https://github.com/Aaronbunting/Credit_risk_resampling/blob/master/#Introduction)
* [Objective](https://github.com/Aaronbunting/Credit_risk_resampling/blob/master/#Objective)
* [Project Overview](https://github.com/Aaronbunting/Credit_risk_resampling/blob/master/#ProjectOverview)
* [Conclusion](https://github.com/Aaronbunting/Credit_risk_resampling/blob/master/#Conclusion)
* [Getting Started](https://github.com/Aaronbunting/Credit_risk_resampling/blob/master/#GettingStarted)
* [Dependencies](https://github.com/Aaronbunting/Credit_risk_resampling/blob/master/#Dependencies)

## Introduction

Welcome to the Credit Risk Analysis repository. It is aimed at financial institutions, particularly banks, engaged in lending. It presents a machine learning solution that predicts credit risk from the historical lending data of a peer-to-peer lending services provider.

## Objective

The goal of this project is to build a model that can accurately assess the creditworthiness of borrowers. This is a classification problem with inherently imbalanced data: healthy loans far outnumber risky ones. The focus is on making sure the model recognizes both loan categories well.

## Project Overview

The project develops and compares two machine learning models:

* A Logistic Regression model trained on the original data.
* A Logistic Regression model trained on data resampled with the RandomOverSampler module from the imbalanced-learn library.

The workflow splits the data into training and testing sets, fits a Logistic Regression model on the original training data, makes predictions with that model, and then repeats the process with a resampled training dataset.
## Outcome

A Credit Risk Analysis Report summarizes the findings. It gives the balanced accuracy, precision, and recall scores for both machine learning models, and with them a view of each model's accuracy, false-positive rate, and ability to identify high-risk loans.

## Conclusion

The project ends with a head-to-head comparison of the two models and a recommendation. The analysis finds that the model trained on oversampled data is better at identifying high-risk loans than the model trained on the original data, so it is the recommended choice for implementation.

## Getting Started

To work with this analysis, clone this repository, install the required Python libraries, and then run the Jupyter Notebook that contains the code.

## Dependencies

This project relies on the following dependencies:

* Python
* Pandas
* NumPy
* Scikit-learn
* Imbalanced-learn
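
As an illustration of the scores named under Outcome, the sketch below computes balanced accuracy, precision, recall, and the false-positive rate with scikit-learn. The ten labels are hypothetical values chosen for the example, not results from this project, and treating label `1` as the high-risk class is an assumption.

```python
from sklearn.metrics import (
    balanced_accuracy_score,
    confusion_matrix,
    precision_score,
    recall_score,
)

# Hypothetical labels for ten loans (assumption): 0 = healthy, 1 = high-risk.
y_true = [0, 0, 0, 0, 0, 0, 0, 1, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 1, 1, 1, 0]

# Balanced accuracy is the mean of the per-class recalls, so the
# majority class cannot dominate the score.
balanced_acc = balanced_accuracy_score(y_true, y_pred)

# Precision and recall for the high-risk class.
precision = precision_score(y_true, y_pred, pos_label=1)
recall = recall_score(y_true, y_pred, pos_label=1)

# False-positive rate: healthy loans wrongly flagged as high-risk.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
false_positive_rate = fp / (fp + tn)

print(f"Balanced accuracy:   {balanced_acc:.3f}")
print(f"Precision (class 1): {precision:.3f}")
print(f"Recall (class 1):    {recall:.3f}")
print(f"False-positive rate: {false_positive_rate:.3f}")
```

Running the same calculations on the predictions of each model against a shared test set gives a like-for-like comparison of the kind the report describes.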
