Soledad Galli - Machine Learning in Financial Credit Risk Assessment

 Published On Jun 13, 2017

Filmed at PyData London 2017

Description
Risk management is paramount to any lending institution, allowing it to make well-informed decisions when originating loans. In this talk, I will describe our research and development approach to building our Credit Risk Prediction Model. I will walk through our target definition, feature optimisation, model building and tuning, and our experience with model stacking.

Abstract
Credit Risk assessment is a general term used among financial institutions to describe the methodology used to determine the likelihood of loss on a particular asset, investment or loan. The objective of assessing credit risk is to determine if an investment is worthwhile, what steps should be taken to mitigate risk, and what the return rate should be to make an investment successful.

Building a Credit Risk Prediction Model that is as accurate as possible is essential, as it allows the institution to offer fair prices to customers while keeping losses predictable and minimal. We build our Credit Risk Model by combining data gathered from the customer’s application on our online platform with their credit history provided by different credit agencies.
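As an illustration of combining application data with credit agency data, here is a minimal pandas sketch. It is not the speaker's pipeline: the tables, the column names (`customer_id`, `loan_amount`, `declared_income`, `n_defaults`, `open_accounts`) and the values are all hypothetical, chosen only to show a left join that keeps applicants with no credit file.

```python
import pandas as pd

# Hypothetical application data captured on the online platform.
applications = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "loan_amount": [5000, 12000, 3000],
    "declared_income": [32000, 54000, 21000],
})

# Hypothetical credit history from a credit agency.
# Customer 3 has no credit file, which is common for new borrowers.
bureau = pd.DataFrame({
    "customer_id": [1, 2],
    "n_defaults": [0, 2],
    "open_accounts": [4, 7],
})

# Left join keeps every applicant; validate guards against duplicated keys.
features = applications.merge(
    bureau, on="customer_id", how="left", validate="one_to_one"
)

# Flag applicants without a credit file instead of silently dropping them.
features["has_bureau_file"] = features["n_defaults"].notna()

print(features)
```

A left join is used here so that missing credit history stays visible as a feature in its own right; whether that matches the approach in the talk is an assumption.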

In this talk, we will cover the research and development behind our recently created Credit Risk Model. We will discuss the definition of the target, the variable selection procedure, the different machine learning models built and how we optimise their hyper-parameters, as well as some of our latest research in model stacking and deep learning.
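The steps named above (variable selection, hyper-parameter optimisation and model stacking) can be sketched with scikit-learn as follows. This is a hedged illustration on synthetic data, not the model presented in the talk: the dataset, the choice of base learners, the selection method and the parameter grid are all assumptions, and scikit-learn's `GradientBoostingClassifier` stands in for XGBoost so that the example has no extra dependency.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline

# Synthetic, imbalanced data standing in for loan outcomes (assumption).
X, y = make_classification(
    n_samples=600, n_features=20, n_informative=5,
    weights=[0.9, 0.1], random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0
)

# Stacking: base learners feed out-of-fold predictions to a final model.
stack = StackingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
        ("gb", GradientBoostingClassifier(n_estimators=50, random_state=0)),
    ],
    final_estimator=LogisticRegression(),
    cv=3,
)

# Variable selection (keep features with above-average importance),
# followed by the stacked model.
pipeline = Pipeline([
    ("select", SelectFromModel(
        RandomForestClassifier(n_estimators=50, random_state=0))),
    ("stack", stack),
])

# Hyper-parameter optimisation over a deliberately tiny grid.
search = GridSearchCV(
    pipeline,
    param_grid={
        "stack__gb__max_depth": [2, 3],
        "stack__gb__learning_rate": [0.05, 0.1],
    },
    scoring="roc_auc",
    cv=3,
)
search.fit(X_train, y_train)

test_auc = roc_auc_score(y_test, search.predict_proba(X_test)[:, 1])
print(search.best_params_, round(test_auc, 3))
```

Putting the selection step inside the pipeline means it is refit on each cross-validation fold, which avoids leaking information from the validation folds into the choice of variables.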

Our development and modelling pipeline is built in Python, using pandas, NumPy, scikit-learn, XGBoost, Keras, Matplotlib and Seaborn. We combine machine learning algorithms with data visualisation to better understand the variables and our customers, and to convey our findings to different stakeholders within and outside the company. Throughout the talk, we will focus both on the intellectual rationale of the research and on the use of the different Python tools to accomplish each task, highlighting both the problems encountered and the solutions devised.

www.pydata.org

PyData is an educational program of NumFOCUS, a 501(c)(3) non-profit organization in the United States. PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R.

We aim to be an accessible, community-driven conference, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.

Want to help add timestamps to our YouTube videos to help with discoverability? Find out more here: https://github.com/numfocus/YouTubeVi...
