I am not sure how one would remove all ageism, sexism, racism, classism, title-ism, and so on from lending. The whole concept is about making a prediction about the future with suboptimal information: guessing who will default on a loan and who won't. The same goes for insurance.
I have been pretty tempted to lie about where I live in order to reduce my insurance costs. It would reduce the insurance cost by half. It seems pretty disproportionately harsh that I should get lumped together with the people who simply happen to live around me.
Is it possible to make predictions illegal if they are based on historical data from anyone other than the individual customer?
My beliefs aside [0], you do bring up a good point. I can't find the HN thread, but there was a good TEDx presentation on it as well:
Given two curves on the same graph, you are not guaranteed to be able to min/max them simultaneously.
For example: say you are a mortgage broker at a bank and you have to give out mortgages to people in your community. Obviously, the people in your community are diverse. There are men, women, white people, black people, homosexuals, heterosexuals, etc. all wanting mortgages. Say you have some historical data that you know to be accurate, namely the credit score of successful applicants, the associated foreclosure rate, and some demographic data like race and sex. Also say that you want the bank to remain solvent and competitive, handing out loans to the best applicants and also garnering a reputation for being a 'fair' community bank that people will actually be able to get loans from. Now, the question is: what credit-score cutoff will cause an applicant to not get a mortgage [1]?
Say, on the y-axis is the foreclosure rate, and on the x-axis is an applicant's credit score. You have some curves for white women, black women, white men, black men, and many other permutations of the human rainbow. These curves are all different from each other.
Now, if you set an arbitrary cut-off score, you may be setting a score that says that white people are less likely to get a mortgage than black people. That's obviously disenfranchising white people in your community, and you should change the cut-off to be more equitable to the people around you.
However, now that you have changed the cut-off for approval of a mortgage to be racially equitable, you have made it such that men are less likely to get a mortgage than women. This is also not good, and you should change the cut-off again to be fair to all sexes and genders. However, now you are back to being unfair to white people again.
Given the data you have, you are in a situation where, no matter what you do, you are discriminatory. In fact, I'd say that this is the most likely situation to be in. That all the curves and graphs would align just so and that you could be non-discriminatory under ALL scenarios is extremely unlikely. Honestly, you do have to pick and choose who you will not discriminate against.
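This impossibility is easy to see in a toy simulation. All the numbers below are invented purely to illustrate the geometry (they are not real lending data): four groups with slightly different score distributions, and no single cutoff in a realistic band that closes both the race gap and the sex gap in approval rates at the same time.

```python
import numpy as np

rng = np.random.default_rng(0)

# Entirely made-up score distributions for four groups, chosen only to
# illustrate the geometry of the problem -- not drawn from real data.
scores = {
    "white_men":   rng.normal(640, 55, 10_000),
    "white_women": rng.normal(655, 50, 10_000),
    "black_men":   rng.normal(615, 60, 10_000),
    "black_women": rng.normal(630, 52, 10_000),
}

def approval_rate(group, cutoff):
    return float((scores[group] >= cutoff).mean())

def race_gap(cutoff):
    white = (approval_rate("white_men", cutoff) + approval_rate("white_women", cutoff)) / 2
    black = (approval_rate("black_men", cutoff) + approval_rate("black_women", cutoff)) / 2
    return abs(white - black)

def sex_gap(cutoff):
    men   = (approval_rate("white_men", cutoff) + approval_rate("black_men", cutoff)) / 2
    women = (approval_rate("white_women", cutoff) + approval_rate("black_women", cutoff)) / 2
    return abs(men - women)

# Every cutoff in a realistic band leaves at least one disparity open.
for cutoff in range(560, 741, 20):
    print(cutoff, round(race_gap(cutoff), 3), round(sex_gap(cutoff), 3))
```

Lowering the cutoff to shrink one gap widens the other, which is exactly the oscillation described above.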
This may seem disheartening, and, yeah, it is [2]. However, that does not mean that we shouldn't try to change things. If anything, understanding that you will very likely be discriminatory no matter what is helpful. You now have a better view of what you can change and how that may affect things. You can choose where to set your parameters with better clarity towards your fellow humans. Maybe you oscillate between gender-parity and height-parity. Maybe you choose to focus on income-inequality for 5 years and then switch to racial-inequality as a focus. Whatever your thesis is on how to gain better equity in your community, maybe you now have a better understanding of the mechanics of the system and can affect it positively.
[0] In talking privately to friends that are also considered 'minorities', discrimination is occurring and is systemic. Though this is personal anecdata.
[1] I'm trying to simplify this as much as possible. Obviously, real applications are VERY nuanced and complicated.
[2] Unless you are a journalist. Then, well, this means you will always have a lot to write about!
That's the point. If the prediction is accurate, that's net beneficial: irresponsible people will get fewer loans, and responsible people will get more loans.
It sounds like you haven't quite decided whether this would be less accurate than FICO (maybe! But there's no reason to be outraged that some random person's prejudice is costing them money) or more accurate than FICO (in which case your argument seems to be "Yes, this kind of prejudice is accurate--or, at least, more accurate than the alternative. But that makes me feel bad, so misallocate capital and make more bad loans.")
The world economy just went through a couple very bad years caused by making wishful-thinking loans instead of unpleasant-truth loans. I hope that doesn't have to happen again.
This is the first I'm hearing of it, but assessing credit risk based on social profiles is almost definitely illegal. Conclusions based on friends or location rather than actual history will have a disparate impact on minorities.
Is it legal to model that type of dynamic into credit decisions? I've never heard of being able to use that type of social dynamic in a heavily regulated financial product.
Frankly, private industry has been lucky up until this point that anything other than your financial history has been allowed to factor into your credit score or credit worthiness __at all__. I wouldn't be surprised if it gets limited to solely financial factors by regulation within the next 10-20 years.
In terms of regulations, I think they shouldn't be allowed to give the algorithms anything but loan amounts, payment schedules, and payment history. Don't give them locations. Don't give them age, name, race, or gender. Don't even give them the names of banks, since that too can be used to infer geographic location and thus race, etc. Yes, people with no credit history will get terrible scores, but that's the point of a credit system. It should be based solely on your merits and what you've actually done, not on what people in your area/situation tend to do statistically. Otherwise it's just statistics creating statistics, and socioeconomic mobility freezes.
Probably less plausible than you think. What incentive would company X have in denying service to someone based on unrelated parameters? That's just going to reduce business and profits.
With credit scores, it's a bit different. A credit score is a risk profile of the consumer, so if a company doesn't deny service or charge higher interest to a consumer with a low credit score, it potentially stands to lose money. Besides, this is regulated; using parameters like gender, age, race, etc. in your underwriting model is still illegal, I think.
Any such score, used to deny service to group x based on parameters unrelated to the product itself would simply be discrimination, no?
Your home address, age and marital status are absolutely used by lenders in a data-driven analysis of whether you are a creditworthy borrower. I don't think that's a necessarily malevolent analysis, but it's also not true that we live in a society where your demographics are irrelevant to your ability to access credit.
Loan companies could model a population's exact historical credit risk by combining extensively detailed loan data, right? But they can't use that calculated credit risk to issue loans because it probably contains off-limits attributes like gender, race, religion, etc.
What about developing their "clean" model by minimizing errors between it and the actual model? The clean model wouldn't contain any regulated inputs but will be as close as you can get to the real thing.
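That distillation idea is easy to sketch. In the toy example below, all data and coefficients are invented: a "full" model (hypothetically) uses a protected attribute directly, and a "clean" least-squares model fit only to the permitted inputs is trained to mimic the full model's output. Because the permitted features act as proxies for the protected one, the clean model recovers most of the full model's behavior.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 5_000

# Synthetic data (all made up): 'income' and 'history' are permitted
# inputs; 'protected' is an off-limits attribute correlated with both.
protected = rng.integers(0, 2, n)
income  = rng.normal(50, 10, n) + 5.0 * protected
history = rng.normal(0, 1, n)   + 0.5 * protected

# Hypothetical "actual" model that uses the protected attribute directly.
full_score = 0.04 * income + 0.6 * history + 0.8 * protected

# "Clean" model: least-squares fit to the full model's output, using
# only the permitted inputs -- i.e. distilling the forbidden model.
X = np.column_stack([income, history, np.ones(n)])
coef, *_ = np.linalg.lstsq(X, full_score, rcond=None)
clean_score = X @ coef

# Correlation between the clean model's scores and the full model's:
# high, because the permitted features proxy the protected attribute.
print(round(float(np.corrcoef(clean_score, full_score)[0, 1]), 3))
```

This is essentially why "just drop the protected columns" is a weak safeguard: the information leaks back in through correlated features.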
Articles critical of the loss of privacy that comes along with social media often come up with this type of scaremongering: 'If you know someone that has bad credit, then you won't be able to get credit either.' Well, if your relationships do actually affect your ability to repay your debts, then banks would be right to be at least a little more wary of giving you credit - and you should not be trying to overextend your credit.
I would have said that the undesirable practices that are mentioned - such as redlining - are the result of using crude, discriminatory heuristics rather than the result of having too much data about customers. This isn't necessarily a bad development - it could enable banks to lend to more marginal borrowers if they could see they had a strong support network. Clearly, if you have a good conventional credit record, then your social credit will hardly matter. Banks that use this new source of data stupidly won't do well.
Is it a bad thing that getting a loan requires you to go to church? Yeah, kind of.
Edit: Now I don't know what I'm responding to. You agree that someone's history in repaying debts is legit to use as an indicator. You agree in augmenting the lending decisions with algorithms. That's what credit reporting is!
And yet somehow that's a defense of your conclusion that "therefore it should be more based on whether your social circles include the people with money" (which no one would scream bloody murder about for being anti-minority)? I'm lost.
Redlining and credit scoring have, and in some cases still do, systematically implemented discriminatory practices via algorithmic means. And arguing with a loan officer isn't going to get you anywhere. They put your details in the black box, and whatever the black box says is implicitly trusted.
Facebook has really, really good people working on machine learning. What Facebook wants to do is use whatever features it can scrape from your profile to accurately price your odds of defaulting on a loan.
It's very unlikely that removing a few 'poor' friends will improve your credit score. There's likely to be many features that will be highly correlated all of which will enter a statistical model.
Keep in mind - actuaries already use all sorts of information about you in order to price a loan. This will just make the pricing more accurate, making loans (on average) cheaper.
The downside of more accurate loan pricing is that it might lead to de-facto discrimination against groups that are classified as high risk. I believe the government has a role there to provide high risk pools, because the alternative will be to push those people towards black market loans.
You’re viewing “fixing” from the wrong lens. There are infinite bits of information about a person that could potentially be used in a model to calculate a credit score. Every model chooses a subset of those facets, and you can build statistically unbiased models on whatever data you make available.
But when you talk about actually fixing models like this, you’re forced to correct the final result, not filter the data. Being blind to ethnicity doesn’t work because one’s ethnicity permeates (to different degrees, sure) every part of one’s life. All the data is bad; everything is a statistically detectable proxy for ethnicity.
It's very hard to argue against this. Either your social network is a good signal, or it's not. If it's not a good signal, the people who use this information are going to lose money--they will quite literally pay for being wrong.
But if it does work, then overall mortgage rates will marginally decline. Better information will lead to better choices, so mortgages will become less risky overall. Some people will suffer, but keep in mind that when those people oppose this, they are saying "I want to hide material information from the people with whom I do business; if they knew the truth about me, I would get a worse deal." That's fraud.
Yes, there will be false positives. And it's easier to visualize being the victim of one of those than it is to visualize the tiny collective improvement in human welfare that tends to result from better decisions.
I would be interested in an argument that applies in this specific case, but doesn't apply to the general class of rough but useful heuristics, like "Kids with bad grades who drive recklessly are probably less responsible than 40-year-old moms who drive minivans, even though there is some C-student with a red car who is safer than some particular 40-year-old minivan-driving mom."
Yes. Information absolutely can be used against you. Imagine if you perhaps decide to smoke... now the bank (and others) know about this behavior and can factor it into your overall credit score / health care score / behavior score.
You could use lots and lots of examples where market forces and banks could influence human behavior through this method. There is freedom and power in using cash.
Having the same credit score, or other data, does not mean they have identical circumstances. For example, your willingness to follow the rules of society, even to your detriment, might be a result of whether society's rules have treated you fairly in the past.
So belief that you'll get different default rates given some financial data does not imply that you are a scientific racist. To create identical circumstances you'd have to do a brain swap (and some other relevant internal organs) on some black and white infants. A scientific racist view is that then the likelihood of paying off the loans would follow the brain, not the skin.
Surrounding these laws is a bunch of regulation and agency interpretation - i.e. if an agency doesn't like your lending/free association choices, they'll hit you with lawsuits/regulatory actions that are themselves very punishing.
As a concrete example, see Figure 7 of this paper. This figure shows that lenders could avoid a lot of defaults if they did per-race isotonic regression on FICO scores. In non-technical terms, this means a black person with a FICO score of 600 is more likely to default than an Asian person with the same score, and this information is easy to use. However, lenders are legally forbidden from using this information; as a result, lenders must associate with people (potential defaulters) that they don't want to and Asian people must pay higher interest rates because of this.
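For intuition, here is a toy version of that per-group calibration, using a small hand-rolled pool-adjacent-violators (PAV) routine; the groups and their score-to-default curves are entirely synthetic, not the paper's actual data. It shows the mechanism: the same nominal score maps to different isotonically calibrated default rates in different groups.

```python
import numpy as np

def pav(y):
    """Pool-adjacent-violators: non-decreasing least-squares fit to y."""
    blocks = []  # each block is [mean, count]
    for v in map(float, y):
        blocks.append([v, 1])
        # Merge adjacent blocks while they violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            v2, w2 = blocks.pop()
            v1, w1 = blocks[-1]
            blocks[-1] = [(v1 * w1 + v2 * w2) / (w1 + w2), w1 + w2]
    return np.concatenate([[v] * w for v, w in blocks])

rng = np.random.default_rng(2)

def simulate(n, midpoint):
    """Synthetic applicants whose true default odds fall as score rises."""
    score = np.sort(rng.uniform(500, 800, n))
    p_default = 1 / (1 + np.exp((score - midpoint) / 40))
    return score, (rng.random(n) < p_default).astype(float)

# Two hypothetical groups with different score->default curves
# (midpoints invented for illustration, not taken from the paper).
groups = {"group_1": simulate(20_000, 620), "group_2": simulate(20_000, 580)}

# Per-group isotonic fit: default rate must be non-increasing in score,
# so run PAV on the defaults ordered by *decreasing* score, then flip back.
fits = {}
for g, (score, default) in groups.items():
    fits[g] = (score, pav(default[::-1])[::-1])

def default_estimate(g, s):
    score, fit = fits[g]
    return float(fit[np.searchsorted(score, s)])

# Same nominal score, different calibrated default risk per group.
print(default_estimate("group_1", 600), default_estimate("group_2", 600))
```

In this synthetic setup, a score of 600 carries a noticeably higher estimated default rate in one group than the other, which is the kind of gap the referenced figure quantifies for real FICO data.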
Financial institutions are using all available data to improve estimations of creditworthiness. The Horror!
I don't support the implicit assumption that it is better to err on the side of leniency when it comes to credit provision. Inadequate information was as responsible as lack of oversight for causing the financial crisis. Racial discrimination is also a flawed analogy -- that was a case of using traits that had no bearing on credit risk to calculate a deliberately false appraisal, whereas using data on online activity is far more akin to the practice of requiring bank statements.
Granted, mistakes could very likely be made, but so far the author presents no evidence that using such information is any worse than other means of estimating credit.
Some of these seem fairly reasonable. Is it not fair to pay a premium if you tend to return a lot of items? It costs the store more money. If you have friends with "bad" backgrounds, there's clearly a direct correlation between that and loan risk. Is this any different from how loan officers operated for all of history? Yes, these create biases, but are these biases justified?
With that said there's definitely a problem with how some of this data is used (ex: deliberately designing apps to be addictive). I'm just not a fan of blanket statements for / against data collection. There's a balance to be had.