Unsupervised Learning
Machine Learning Doesn’t Introduce Unfairness—It Reveals It

Many people are concerned about the use of machine learning in credit rating and the wider FinTech space. The worry is that anything that gates access to key human needs, such as being able to own a home, should be free of bias in its filtering process. And of course I agree.

I define bias here as ‘unreasoned judgements based on personal preferences or experiences’.

The problem is that machine learning improves by having more data, so companies will inevitably search for and incorporate more signals to improve their ability to predict who will pay and who will default. So the question is not whether FinTech will use ML (it will) but rather, how to improve that signal without introducing bias.

It’s quite possible to do ML improperly, but one shouldn’t assume that’s happening intentionally or often, because doing it badly is counter-productive.

My view on this is quite clear: Machine Learning—when done properly—isn’t creating or introducing unfairness against people—it’s uncovering existing unfairness. And the unfairness it’s uncovering is that which is built into nature and society itself.

The more signals AI receives about something, the closer it comes to understanding the je ne sais quoi of that thing. And in the case of credit, those signals will bring its judgements closer to the truth about one’s credit-worthiness.

If it says someone has a higher chance of defaulting on loans, that’s not because they chose to be a bad person—it’s because their personal configuration, consisting of body, mind, and circumstance, indicates they’re less likely to pay back loans.

As someone who doesn’t believe in free will, this is completely logical to me. People don’t pick their parents. They don’t pick who becomes their friends as children. They don’t pick what elementary schools they go to. They don’t pick their peer group. And these variables are what largely determine whether you’ll go to college or not, whether you’ll have other friends who went to college or not, and ultimately how vibrant and stable your financial situation will be. This is what these algorithms attempt to peer into.

Ironically, the loan business is where the blindness of ML could help people who are likely to be good customers but who were excluded before due to human bias.

Let’s look at an example, but in the auto insurance space instead of FinTech. Let’s say we’re trying to assign a premium to a young person who just purchased a motorcycle, which means we need to rate their chances of behaving in a safe vs. a reckless manner. And let’s say their public social media presence is full of images of them free solo climbing (where you climb without ropes), and saying things like, “I’d rather die young doing something crazy than old being boring!”

The difference between unfairness and bias is that bias is disconnected from reality, and is usually based in personal prejudice.

Those are clear signals about risk as it relates to motorcycles, and they raise the question of how to define bias. One could say that insurance people are “biased” against thrill-seekers. And there are surely similar correlations in the financial industry for who pays bills and who doesn’t. Imagine someone posting something like, “I hope all my bill collectors see this message. I’m not paying, stop calling!”, or “I can’t believe someone was stupid enough to give me a credit card again. Bankruptcy number 4 in 6 months!”

Those are extreme examples, but they show that it’s absolutely possible for a signal to correlate with a behavior, and for that behavior to correlate either positively or negatively with the thing you care about, in this case the chances that someone will pay off their loans.
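To make the idea of a signal correlating positively or negatively with repayment a bit more concrete, here is a minimal sketch. Everything in it is an assumption for illustration: the eight borrowers are made-up toy data, and the two signal names (`pays_bills_on_time`, `posts_about_dodging_collectors`) are hypothetical, not something any real lender is known to use. It only shows how the sign of a correlation is read, not how a real credit model works.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up toy outcomes for eight imaginary borrowers: 1 = repaid, 0 = defaulted.
repaid = [1, 1, 1, 0, 0, 1, 0, 1]

# Hypothetical binary signals for the same eight borrowers (also made up).
pays_bills_on_time = [1, 1, 1, 0, 0, 1, 1, 1]
posts_about_dodging_collectors = [0, 0, 0, 1, 1, 0, 1, 0]

positive_signal = pearson(pays_bills_on_time, repaid)
negative_signal = pearson(posts_about_dodging_collectors, repaid)

# A value above zero means the signal tends to appear alongside repayment;
# a value below zero means it tends to appear alongside default.
print(f"pays_bills_on_time:             {positive_signal:+.2f}")
print(f"posts_about_dodging_collectors: {negative_signal:+.2f}")
```

With this toy data the first signal comes out positive and the second negative. Whether a given signal is fair to use is a separate question from whether it correlates, which is the distinction the rest of this piece is about.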

This is a definition of unfairness specific to this discussion, not an overarching one.

And that’s the difference between unfairness and bias: unfairness is where someone is negatively judged due to their characteristics or behavior, when it’s either difficult or impossible for them to change what they were judged on. Examples might include being short, or obese, or introverted, or bad with money, or undependable. Those might be perfectly valid reasons for someone to choose not to interact with you, but it’s also not completely fair that they do so.

And bias is where a judgement is not valid, where it’s based in someone’s personal experience, is not supported by data, and is quite often powered by prejudice or bigotry. Examples would be denying someone’s loan for a house—even though the algorithm told you they’re extremely likely to pay it off—because you don’t want any more of “those people” in your neighborhood. And biased data or algorithms would have those sorts of unfounded connections built into them.

And remember, if companies just wanted to deny people loans they could do that pretty easily. But they aren’t in the business of denying loans; they’re in the business of granting loans. Any time a company denies someone who could have paid, they give money to their competitors or leave it on the table. So they are incentivized to give money not to people they like, but to people who will pay that money back, regardless of background, appearance, etc.
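The incentive described above can be sketched as simple arithmetic. This is a deliberately crude model of my own, not how any particular lender prices loans: it assumes the lender earns the full interest if the borrower repays and loses the full principal if they default, and the function names and dollar amounts are hypothetical.

```python
def expected_profit(p_repay, principal, interest):
    """Expected profit on one loan under the simplified model:
    earn `interest` with probability p_repay, lose `principal` otherwise."""
    return p_repay * interest - (1 - p_repay) * principal


def break_even_probability(principal, interest):
    """Repayment probability at which the expected profit is exactly zero."""
    return principal / (principal + interest)


# Hypothetical loan: 10,000 lent, 2,000 of interest if fully repaid.
principal, interest = 10_000, 2_000

likely_payer = expected_profit(0.95, principal, interest)
unlikely_payer = expected_profit(0.70, principal, interest)
threshold = break_even_probability(principal, interest)

print(f"break-even repayment probability: {threshold:.3f}")
print(f"expected profit at p=0.95: {likely_payer:.0f}")
print(f"expected profit at p=0.70: {unlikely_payer:.0f}")
```

Under these assumptions, every applicant above the break-even probability is money the lender wants to earn, so wrongly rejecting them is a loss too. Nothing in the calculation refers to who the applicant is, only to how likely they are to pay.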

So, in order to refine that signal, and find the people who are most likely (or not) to pay back a loan, the algorithms need more data. The signal from social media can absolutely help lock onto that truth about a person, as it tells you how someone speaks, how they interact with others, and how they spend their time.

The accuracy of the data is what matters.

This doesn’t mean AI is fair, or that it’s nice. It isn’t. But life isn’t nice, either. Machine Learning, like the Amazon rainforest or an Excel spreadsheet, is neither good nor evil. It’s just telling you what is.

Machine Learning locks onto truth by looking at signal. And in this case, as with many other deeply human issues like employment and crime, we might not like the truth that it uncovers.

But when that happens, and we’re shown some people are more likely to get in motorcycle accidents, or pay their bills on time, or default on a loan, or be violent—it’s really telling us about the society we have built for ourselves. It tells us that there are some who have had more advantages and opportunities than others, which lead to different outcomes—and that should not surprise us.

Machine Learning is simply a tool that allows us to look inside this puzzle of chaotic human behavior, and to find patterns that enable predictions.

When that tool shows us something uncomfortable, something we wish we didn’t see, we must resist the temptation to blame the tool we used to see it. It’s what you’re looking at that’s disturbing and needs to be fixed.

It’s the unfairness built into society itself.
