Predictive Models, Secret Scores: How Computers Decide Who You Are & What To Sell You

Savvy consumers all know that their lifetime debt history ends up in their credit score, and that lenders use that score to try to predict whether someone is a good bet for a big loan like a mortgage. But even the most-connected consumer may not realize how many hundreds of other scores we all now trail in our wakes, thanks to the advent of big data. Do you know, to the last decimal, how likely you are to buy jewelry? To sign up for cable? To have a kid in the next year? Someone, somewhere, is tallying all of that information about almost everyone. But good luck finding out what’s out there, who’s scoring it, and whether your numbers are even about you at all.

Yesterday, the FTC held an event looking at the scope and effects of those secret scores. The panel brought together experts in marketing, privacy, consumer protection, and technology for a discussion about “alternative scoring products.”

Predictive modeling expert Claudia Perlich opened the session by explaining how the algorithms that drive online advertising technically work. The computers, she explained, don’t judge anything; they’re “agnostic.” They just look for patterns: given a set of 10 million variables, what commonalities do groups share, and how can you divide them into sets whose behavior you can predict?

The machines themselves may not judge, but the people who use them certainly do. A computer told to analyze data about loan applicants might spit out that dividing into groups based on variables X, Y, and Z works best. A human looking at the data parses it out differently, seeing that X is age, Y is income, and Z is ZIP code.
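
To make that division of labor concrete, here is a minimal sketch in Python. It is not Perlich's method or any vendor's actual system: the records, the `best_split` helper, and the simple "gap in repayment rate" measure are all made up for illustration. The code only ever sees columns named X, Y, and Z; the human-readable labels are attached afterward.

```python
# Toy applicant records. The code sees only anonymous columns X, Y, Z and an
# outcome (1 = repaid, 0 = defaulted). Every value here is invented.
ROWS = [
    {"X": 25, "Y": 20, "Z": 1, "repaid": 0},
    {"X": 40, "Y": 22, "Z": 2, "repaid": 0},
    {"X": 33, "Y": 25, "Z": 1, "repaid": 0},
    {"X": 58, "Y": 60, "Z": 2, "repaid": 1},
    {"X": 47, "Y": 65, "Z": 1, "repaid": 1},
    {"X": 29, "Y": 70, "Z": 2, "repaid": 1},
]


def best_split(rows, columns, target="repaid"):
    """Return (column, threshold, gap) for the single split that best
    separates the two outcomes. "gap" is the absolute difference in the
    repayment rate between the two groups the split creates."""
    best = None
    for col in columns:
        for threshold in sorted({r[col] for r in rows}):
            left = [r[target] for r in rows if r[col] <= threshold]
            right = [r[target] for r in rows if r[col] > threshold]
            if not left or not right:
                continue  # a split must produce two non-empty groups
            gap = abs(sum(left) / len(left) - sum(right) / len(right))
            if best is None or gap > best[2]:
                best = (col, threshold, gap)
    return best


# The judgment enters here, when a person reads the labels (hypothetical
# mapping, following the example in the paragraph above).
LABELS = {"X": "age", "Y": "income", "Z": "ZIP code"}

column, threshold, gap = best_split(ROWS, ["X", "Y", "Z"])
print(f"Best split: {column} <= {threshold} (gap {gap:.2f})")
print(f"A human reads that as a split on: {LABELS[column]}")
```

On this toy data the code picks column Y. Only the person reading the labels knows that means sorting applicants by income.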

THE WILD WEST
That’s where consumer privacy and potential discrimination issues come into play, and advocates get worried. Consumer advocate Ed Mierzwinski from US PIRG stressed that although a network of laws like the Fair Credit Reporting Act and Equal Credit Opportunity Act govern the generation and use of credit scores, the era of alternative scoring products is an unregulated “wild west” that has big effects on consumers.

Pam Dixon from the World Privacy Forum echoed the sentiment and outlined the problems she sees. They all stem from a lack of understanding and transparency, Dixon explained: nobody’s quite sure what scores are out there, what they’re measuring, how accurate the underlying data is, how the factors are put together, and whether the scores themselves are accurate at all.

The entire panel agreed that under the law, only credit scores are used to determine a consumer’s eligibility for products and services — that is, a bank can look at alternative scoring all it wants when it determines who to advertise to, but when someone applies for a credit card or a mortgage, a federally-regulated credit score is the only one that the bank can use to make a yes or no decision. Marketing leaves a big loophole in the law, though.

FUZZY NUDGES
Privacy and tech expert Ashkan Soltani presented his research on how companies use different “fuzzy nudges” to push consumers into one kind of behavior or another. He was one of the researchers who discovered, for example, that Orbitz pushes higher-priced, luxury hotel listings to the top of the list for Mac users. And it isn’t just hotels: in researchers’ studies, stores like Staples presented higher prices on identical goods to consumers visiting the site from lower-income areas, and credit card companies like Capital One presented different card offerings to users based on the ZIP code they were accessing the website from.

While all of these actions are currently legal, the ripple effects they cause may not entirely be. ZIP code, for example, often correlates strongly with race, a legally protected class. As Mierzwinski put it, we have reached an era where “The FCRA is small and these other scores are big.”
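
A small Python sketch shows how a rule that looks only at ZIP code can still sort people along other lines. Everything in it is hypothetical: the visitors, the ZIP codes, the "group" label (a stand-in for a protected attribute the rule never reads), and the `offer_rate_by_group` helper. It illustrates the proxy effect and is not a legal test of discrimination.

```python
from collections import defaultdict

# Hypothetical site visitors as (zip_code, group) pairs. The offer rule below
# never looks at "group"; it is here only so the outcome can be audited.
VISITORS = [
    ("10001", "A"), ("10001", "A"), ("10001", "A"), ("10001", "B"),
    ("20002", "B"), ("20002", "B"), ("20002", "B"), ("20002", "A"),
]

# Assumption for illustration: the better offer is shown only in these ZIPs.
PREMIUM_ZIPS = {"10001"}


def offer_rate_by_group(visitors, premium_zips):
    """Return the share of each group that is shown the better offer."""
    shown = defaultdict(int)
    total = defaultdict(int)
    for zip_code, group in visitors:
        total[group] += 1
        if zip_code in premium_zips:
            shown[group] += 1
    return {group: shown[group] / total[group] for group in total}


for group, rate in sorted(offer_rate_by_group(VISITORS, PREMIUM_ZIPS).items()):
    print(f"group {group}: {rate:.0%} shown the better offer")
```

In this made-up data, three quarters of group A see the better offer and only one quarter of group B do, even though the rule itself mentions nothing but ZIP code.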

MARKETING VS. CREDIT
Stuart Pratt of the Consumer Data Industry Association and Rachel N. Thomas of the Direct Marketing Association countered that marketing is not the same as credit offers, and that the onus is on consumers to choose good products. Shoppers should “be aggressive” and “frustrate the analytics,” Pratt said. Thomas agreed, saying, “That’s why you shop,” when asked about different consumers being presented different options.

However, the marketing and industry experts seemed to not particularly care about the fact that the majority of consumers are never going to call up and ask a company about offers that they don’t know exist. Informed consumers, when opening a new credit card, are probably going to go to a bunch of websites and compare products that way, never knowing that they aren’t seeing plenty of other products that they could apply for.

Consumers also suffer from being lumped together with groups who may or may not be their peers, the consumer advocates argued. Cohort scoring tries to predict the behavior of individuals from the behavior of groups. So if your neighbors are falling behind on their mortgages, your bank might start giving you the side-eye, too. As an example of the potential harm that opaque analytics can cause, the panelists cited a report finding that American Express had lowered some customers’ credit limits due to other customers’ repayment histories.
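
Here is cohort scoring reduced to a few lines of Python. This is a deliberately crude sketch, not any lender's actual model: the customers, the cohort names, and the `cohort_score` helper are invented, and real systems are far more elaborate. The point is only that the score a person receives can ignore that person's own record.

```python
from statistics import mean

# Invented customers. "on_time_rate" is the share of payments made on time.
CUSTOMERS = {
    "ana": {"cohort": "north", "on_time_rate": 1.0},
    "ben": {"cohort": "north", "on_time_rate": 0.5},
    "cal": {"cohort": "north", "on_time_rate": 0.75},
    "dee": {"cohort": "south", "on_time_rate": 0.9},
}


def cohort_score(name, customers):
    """Score a customer by the average on-time rate of everyone in the
    same cohort, rather than by the customer's own history."""
    cohort = customers[name]["cohort"]
    return mean(
        c["on_time_rate"] for c in customers.values() if c["cohort"] == cohort
    )


own = CUSTOMERS["ana"]["on_time_rate"]
print(f"ana's own record: {own:.2f}, cohort score: {cohort_score('ana', CUSTOMERS):.2f}")
```

Ana has never missed a payment, but her cohort score comes out at 0.75 because of how her neighbors have paid.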

ROOM FOR ERROR
And of course, there’s the problem of accuracy: compiling tens of millions of data points about well over 300 million Americans leaves a spectacular amount of room for error. Some first-party observed data is accurate: a company can easily get hard numbers on how many people are hitting its website daily, for example. But lots of data is bought, sold, and traded by third parties to whom accuracy and verification are not exactly a top priority. And analytic systems that extrapolate a person’s demographic info from probabilities are especially prone to mistakes.

However, the industry experts and consumer advocates did all agree that the big data era has potential to be helpful, and not just harmful. “Big data is an opportunity for inclusion, and it’s an opportunity to help people,” said Dixon. But, she added, the way that data is gathered, quantified, and used must be transparent, beneficial, and careful. And that’s not on the systems, but on the people designing and using them.

UNINTENTIONAL DISCRIMINATION
Perlich ended with the crucial point that although the algorithms and machines that sort, tally, and aggregate the data are themselves agnostic, all existing models are a “reflection of the current biases of human nature,” and that no model can control for morals and judgement.

In other words, models that aren’t intentionally designed to discriminate still can, and we need human eyes, human judgement, and possibly human regulation to try to make sure they don’t.

The FTC has a video of the full two-hour session available here.