ABSTRACT

Peer-to-peer (P2P) online lending is an innovative financial model built on Internet technology. Its transactions are convenient and it reaches a broad population, so it improves the efficiency with which capital circulates and plays a role that traditional banks cannot replace. Compared with traditional lending, however, P2P lending operates under information asymmetry, which exposes platforms and investors to substantial borrower default risk. Compared with developed countries in Europe and North America, China's P2P industry is less developed and its management practices lag behind, and platform failures triggered by credit risk are common. Building a sound and effective credit risk assessment system is therefore one of the core problems that P2P platforms must solve to develop sustainably.
Compared with China, the United States has a more mature credit system, and it is where Lending Club, the world's largest P2P platform, was founded. Lending Club's success owes much to its credit risk assessment system, which has big data analysis at its core. To address the difficulty that P2P platforms face in assessing borrower credit risk, this thesis draws on Lending Club's big data credit risk control system and explores how big data analysis can be applied to P2P credit risk assessment, with the aim of improving credit risk management at P2P companies. The thesis first reviews recent domestic and international research on Internet finance risk management, the identification of credit risk factors, and credit risk assessment models. It then describes the current state of credit risk management on domestic P2P platforms and summarizes the problems found there, analyzes Lending Club's operating model and risk control system, and focuses on Lending Club's big data credit risk control mechanism.

The empirical part uses real loan data from Lending Club to build a credit scoring model with data mining techniques. The data set contains 1,373,252 samples, each with 144 fields. After data analysis and feature preprocessing, 16 fields are selected for modeling. Because model interpretability matters, the thesis concentrates on logistic regression and evaluates the model with the KS and AUC metrics. The final model achieves a KS value of 0.301 and an AUC of 0.707. The resulting credit scores quantify each borrower's credit risk and show in practice that data mining techniques are useful for P2P credit assessment.
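For readers unfamiliar with the two evaluation metrics named above, the following minimal sketch shows one common way to compute KS and AUC from predicted default scores. It is an illustration under stated assumptions, not the thesis's own code: the function name ks_and_auc and the eight labels and scores are made up for the example and are not Lending Club data, so the output has no relation to the reported values of 0.301 and 0.707.

```python
def ks_and_auc(labels, scores):
    """Return (KS, AUC) for binary labels (1 = default) and predicted scores.

    Hypothetical helper for illustration only. It compares every
    defaulter with every non-defaulter, so it is quadratic in the
    sample size and only suitable for small examples.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one sample of each class")

    # AUC: probability that a randomly chosen defaulter receives a higher
    # score than a randomly chosen non-defaulter (ties count one half).
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    auc = wins / (len(pos) * len(neg))

    # KS: largest gap between the true positive rate and the false
    # positive rate, checked at every distinct score used as a threshold.
    ks = 0.0
    for t in sorted(set(scores)):
        tpr = sum(p >= t for p in pos) / len(pos)
        fpr = sum(n >= t for n in neg) / len(neg)
        ks = max(ks, abs(tpr - fpr))
    return ks, auc


# Made-up toy data: 1 marks a defaulted loan, 0 a repaid loan.
labels = [1, 1, 1, 0, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]

ks, auc = ks_and_auc(labels, scores)
print(ks, auc)  # prints: 0.75 0.9375
```

A KS of 0 and an AUC of 0.5 would mean the scores do not separate defaulters from non-defaulters at all, while values near 1 indicate near-perfect separation.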
Finally, based on the empirical analysis, the thesis summarizes the lessons from the Lending Club case that are relevant to domestic P2P platforms and offers recommendations, at the level of the platforms themselves and at the level of national policy, for strengthening credit risk management on China's P2P platforms.

Keywords: P2P lending, credit risk control, Lending Club, data mining, credit scoring

CONTENTS

Chapter 1 Introduction
  1.1 Research background
  1.2 Research significance
  1.3 Review of domestic and international literature
    1.3.1 Literature on credit risk management in Internet finance
    1.3.2 Literature on the assessment of credit risk factors in P2P lending
    1.3.3 Literature on the quantitative assessment of credit risk in P2P lending
    1.3.4 Commentary on the literature
  1.4 Research content
  1.5 Research approach and thesis structure
    1.5.1 Research approach
    1.5.2 Thesis structure
  1.6 Chapter summary
Chapter 2 Overview of Credit Risk Management in P2P Lending
  2.1 Theoretical overview of P2P lending
    2.1.1 Definition of P2P online lending
    2.1.2 Definition of credit risk in P2P lending
  2.2 Overview of credit risk management on domestic P2P platforms
    2.2.1 Overall development of domestic P2P platforms
    2.2.2 Operating models of domestic P2P lending platforms
    2.2.3 Basic transaction process of domestic P2P lending
    2.2.4 Causes of credit risk in domestic P2P lending
    2.2.5 Credit risk response mechanisms of domestic P2P platforms
    2.2.6 Problems in credit risk management in domestic P2P lending
  2.3 Overview of credit risk management at Lending Club
    2.3.1 Lending Club's operating model
    2.3.2 Lending Club's transaction process
    2.3.3 Lending Club's risk management organizational structure
    2.3.4 Lending Club's credit risk control mechanism
    2.3.5 Application of big data technology in Lending Club's credit risk control
  2.4 Why domestic platforms need to learn from Lending Club's big data risk control
    2.4.1 Advantages of big data credit risk control systems
    2.4.2 Differences and shortcomings of domestic platforms compared with Lending Club
  2.5 Chapter summary
Chapter 3 Big-Data-Based Methods for Assessing Credit Risk in P2P Lending
  3.1 Concept and process of data-driven credit risk assessment
  3.2 Overview of Chinese and foreign P2P big data risk control systems
  3.3 Overview of credit risk assessment modeling methods
    3.3.1 Choice of credit risk assessment modeling method
    3.3.2 The logistic regression credit assessment method in detail
  3.4 Feature processing methods for P2P risk control modeling
    3.4.1 Data binning
    3.4.2 WOE encoding of binned data
    3.4.3 IV calculation
  3.5 Evaluation metrics for P2P risk control modeling
    3.5.1 ROC curve and AUC value
    3.5.2 KS curve and KS value
  3.6 Chapter summary
Chapter 4 Empirical Risk Control Modeling with Lending Club Data
  4.1 Description of the P2P loan data source
  4.2 Preprocessing of P2P credit risk data
    4.2.1 Analysis of P2P loan status
    4.2.2 Preprocessing of P2P loan data
  4.3 Logistic regression risk control modeling
    4.3.1 Feature processing of P2P loan data
    4.3.2 Selection of key P2P credit risk features
    4.3.3 Construction and evaluation of the P2P credit scoring model
    4.3.4 Credit scoring of P2P borrowers
    4.3.5 Analysis of P2P credit default factors
  4.4 Analysis of the empirical results of P2P risk control modeling
  4.5 Chapter summary
Chapter 5 Summary of the Lending Club Case
  5.1 Lessons from the Lending Club empirical case for domestic P2P platforms
    5.1.1 Big data credit risk control mechanism
    5.1.2 Big data indicator system
  5.2 Development recommendations for domestic P2P lending, drawing on Lending Club
    5.2.1 Platform level
    5.2.2 国