It’s been three straight months since I started my work at HomeCredit China, and I have to admit that at a company this big, a new employee needs a lot of time at the beginning just to understand the business model. But thankfully, time and effort together got me over the hump, and now that I’m starting to get the hang of it, I’d like to share some of the techniques I picked up at work that might help others.
I work in the decision-making department, which controls the underwriting strategies for loan applications, so quite often we use “scores”. They could be scores calculated by the company itself, scores from other companies (like Alipay’s popular “Zhima Score” in China), or a mix of both. For example, if we were Alipay, we might say: for us to lend you money, your Zhima Score has to be higher than 590. One day you find that this rule is rejecting more customers than you’d like, so you naturally want to adjust the cutoff value, right? You’d then simulate a new cutoff, say 580, to see whether the new strategy does any better. Simulation itself matters a lot too, but it’s not my topic here.
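To make the setup concrete, here is a minimal sketch of that single-cutoff rule. The function name and the sample scores are my own illustration, not anything from a real decision engine:

```python
# Hypothetical sketch of a single score-cutoff strategy.
# decide() and the sample scores are illustrative only.
def decide(zhima_score: float, cutoff: float = 590) -> str:
    """Approve only if the score is strictly higher than the cutoff."""
    return "approve" if zhima_score > cutoff else "reject"

applications = [612, 585, 603, 570, 595]

for cutoff in (590, 580):
    approved = sum(decide(s, cutoff) == "approve" for s in applications)
    print(f"cutoff {cutoff}: approval rate {approved / len(applications):.0%}")
```

Lowering the cutoff from 590 to 580 rescues the 585 applicant, which is exactly the kind of shift you want to quantify before committing to the change.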
However, what I just described is only the simplest case. With a single score cutoff, you can just rerun the simulation again and again and draw a graph with the cutoff value on the x-axis and, say, the approval rate on the y-axis. But what if you have many score-cutoff strategies? You can still analyze them one by one, rerunning the simulation for each cutoff combination, but when the simulation dataset is large this process becomes painfully slow, and you won’t be able to see the interactions between cutoffs at a glance.
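The “all over again” baseline looks roughly like this: one full pass over the data per candidate cutoff. The synthetic scores here stand in for a real simulation run, which would replay the entire decision engine each time:

```python
# Sketch of the brute-force approach: rerun the whole pass per cutoff.
# The gaussian scores are a stand-in for a real simulated dataset.
import random

random.seed(0)
scores = [random.gauss(600, 30) for _ in range(10_000)]

def approval_rate(cutoff: float) -> float:
    """One full 'simulation' pass for a single cutoff value."""
    return sum(s > cutoff for s in scores) / len(scores)

# Fine for one score, but with k cutoffs at n candidate values each,
# the reruns grow like n**k.
curve = {c: round(approval_rate(c), 3) for c in range(570, 611, 10)}
print(curve)
```

With one score this loop gives you the cutoff-versus-approval-rate curve directly; the pain starts once several cutoffs have to be swept jointly.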
What I am proposing here is an “incremental” method of visualization instead of the “all over again” method. Run the simulation once, then apply modifications to the resulting data. For instance, when you want to lower the Zhima Score cutoff, find the customers in the dataset who would newly become eligible for an offer, and assign them a probability of being approved, such as 80% or whatever value you calculate from the data. You can apply the same kind of modification to every other cutoff. Because the strategies in a decision-making system usually run in a fixed order, and are coded as a table with one row per strategy, you can isolate the specific segment of customers whose results might change and account for them probabilistically. It isn’t exact at first glance, but when your data is large, it turns out more precise than you might expect.
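Here is one way the incremental adjustment could be sketched. The field names, the baseline rows, and the 0.8 pass-through probability are all assumptions for illustration; in practice `p_pass` would be estimated from the baseline simulation (e.g., the share of score-eligible customers who clear the remaining downstream strategies):

```python
# Sketch of the incremental method on a baseline simulated at cutoff 590.
# Row layout and p_pass=0.8 are assumed values for illustration.
rows = [
    {"score": 612, "approved": True},
    {"score": 586, "approved": False},
    {"score": 603, "approved": True},
    {"score": 571, "approved": False},
    {"score": 583, "approved": False},
]

def expected_approvals(rows, new_cutoff, old_cutoff=590, p_pass=0.8):
    """Adjust the baseline in place of a full rerun: customers rescued by
    the looser cutoff are counted with probability p_pass."""
    total = 0.0
    for r in rows:
        if r["approved"]:
            total += 1.0            # already approved; loosening can't hurt them
        elif new_cutoff < r["score"] <= old_cutoff:
            total += p_pass         # newly eligible: count in expectation
    return total

# 2 certain approvals + 2 rescued customers at 0.8 each, roughly 3.6 expected.
print(expected_approvals(rows, 580))
```

No rerun is needed: only the marginal segment between the old and new cutoff is touched, which is what keeps this fast on large datasets.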
And the huge upside of this method is that you can visualize the interactions among several score cutoffs AT THE SAME TIME! You can play with the modifications and see their overall impact on whatever variable you’re interested in. In my example you can drag the sliders for the different score cutoffs and see how large an impact they would have on the approval rate and the risk performance. At the very end, once you’ve found the combination you want, you can simply rerun the full simulation to confirm it.
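A rough sketch of the two-cutoff version is below. Everything here is assumed for illustration: the field names, the baseline cutoffs (590 and 70), and the shared `p_pass`. The point is that the whole evaluation is a pure function of the cutoffs, so it can be wired straight to sliders (for example with `ipywidgets.interact`) to explore the interaction live:

```python
# Sketch: evaluating two cutoff moves at once on one baseline dataset.
# Field names, baseline cutoffs (590 / 70) and p_pass are assumptions.
import random

random.seed(1)
baseline = [
    {"zhima": random.gauss(600, 25), "internal": random.gauss(70, 10)}
    for _ in range(10_000)
]

def joint_approval_rate(zhima_cut, internal_cut, p_pass=0.8):
    """Expected approval rate with both cutoffs moved: customers approved
    under the old rules count as 1, rescued customers count as p_pass."""
    total = 0.0
    for r in baseline:
        if r["zhima"] > zhima_cut and r["internal"] > internal_cut:
            already = r["zhima"] > 590 and r["internal"] > 70
            total += 1.0 if already else p_pass
    return total / len(baseline)

# Bind each cutoff to a slider (e.g. ipywidgets.interact) to see the
# interaction between the two scores at the same time.
print(round(joint_approval_rate(580, 65), 3))
```

Because no rerun happens inside the function, dragging a slider recomputes in one pass over the cached baseline, which is what makes the simultaneous view responsive.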
And trust me, the final result won’t be far from the visualization, because the visualization has already shown you what is going to happen.
Project link here: Github Link