Pitfalls Of Big Data: Test Prep Company Charges By Geography, Ends Up Charging More By Race


Many teenagers’ parents want to give their kids every possible advantage when it comes to the SATs. They pony up a few thousand dollars and buy Junior a test-prep course. It’s expensive, but at least it’s the same kind of expensive for everyone, right? Well, no, it’s not. And worst of all: there sure is an awfully high correlation between the race of the family doing the buying and the price that they get charged.

ProPublica crunched the numbers and found not only that The Princeton Review, one of the most popular test-prep companies, sells the same product at different prices depending on where customers live, but also that Asian customers are nearly twice as likely to be offered a higher price, regardless of income.

While higher-income areas generally get charged more for the same package, ProPublica found, the correlation isn’t that straightforward. New York and D.C. residents pay the most, but customers in regions with high densities of Asian residents were likely to see the highest price, even if those regions are themselves low-income:

When it came to getting the highest prices, living in a ZIP code with a high median income or a large Asian population seemed to make the greatest difference.

The analysis showed that higher income areas are twice as likely to receive higher prices than the general population. For example, wealthy suburbs of Washington D.C. are charged higher prices. But that isn’t always the case: Residents of affluent neighborhoods in Dallas are charged the lowest price, $6,600.

Customers in areas with a high density of Asian residents were 1.8 times as likely to be offered higher prices, regardless of income. For instance, residents of the gritty industrial city of Westminster, California, which is half Asian with a median income below most, were charged the second-highest price for the Premier tutoring service.

The Princeton Review, of course, almost certainly isn’t deliberately setting out to charge customers different prices by race. In addition to being a scummy thing to do and looking very bad PR-wise, that is also illegal.

Indeed, the company denied it strenuously. In a statement to ProPublica, The Princeton Review said, “To equate the incidental differences in impact that occur from this type of geographic based pricing that pervades all American commerce with discrimination misconstrues both the literal, legal and moral meaning of the word.”

But even if the outcome isn’t intended, that doesn’t mean it’s not there.

Race-based price discrimination may not be the goal, but it is still a real, unfortunate side effect of the algorithms The Princeton Review is using to set different prices. The company bases its pricing on ZIP code. It told ProPublica that pricing is based on the “costs of running our business and the competitive attributes of the given market.” Translate “competitive attributes” into English, and it basically means that if demand for the company’s services is particularly high in an area, it can charge more without losing customers.

The Princeton Review is using algorithms, theoretically neutral pieces of software based on math, to make assumptions and judgments about who wants, and can pay for, what. But those algorithms, like the thousands of others that classify millions of us every day, are designed by humans who bring their own assumptions and biases with them into the code.

It’s not just SAT prep. The reliance on algorithms to decide who customers are and how to treat them is pervasive and growing. Sites will use anything from your IP address to your browsing history to decide anything from what search results you should get to what credit card offers you should see.

But the challenge of relying on all those numbers to do the thinking for you is exactly what The Princeton Review ran into here: disparate impact. When you try to privilege or minimize some non-protected attribute, like ZIP code, you can end up prioritizing or deprioritizing consumers by legally protected statuses, too. If Orbitz thinks Mac users want to pay more for hotels, that’s crappy but not illegal. If The Princeton Review’s pricing means that Asian-American families pay more for its services than white families, well, that’s a whole other problem.
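To see how a pricing rule that never looks at race can still produce disparate impact, here is a minimal sketch in Python. Everything in it is made up for illustration: the ZIP codes, the prices, the customer counts, and the group labels "A" and "B" are hypothetical, and this is not The Princeton Review's pricing logic or ProPublica's methodology. The only input the pricing function gets is a ZIP code, yet the share of each group quoted the high price comes out very different because the groups are not spread evenly across ZIP codes.

```python
from collections import defaultdict

# Hypothetical price table keyed only by ZIP code (no race input anywhere).
ZIP_PRICE = {"00001": 8400, "00002": 6600, "00003": 7200}

# Hypothetical customers as (zip_code, group) pairs. The group label is
# used only to audit outcomes afterward; the pricing function never sees it.
CUSTOMERS = (
    [("00001", "A")] * 50 + [("00001", "B")] * 50
    + [("00002", "A")] * 20 + [("00002", "B")] * 80
    + [("00003", "A")] * 30 + [("00003", "B")] * 70
)


def quote(zip_code):
    """Return the quoted price, based on ZIP code alone."""
    return ZIP_PRICE[zip_code]


def high_price_rate_by_group(customers, threshold):
    """Share of each group quoted a price at or above the threshold."""
    totals = defaultdict(int)
    high = defaultdict(int)
    for zip_code, group in customers:
        totals[group] += 1
        if quote(zip_code) >= threshold:
            high[group] += 1
    return {group: high[group] / totals[group] for group in totals}


rates = high_price_rate_by_group(CUSTOMERS, threshold=8000)
ratio = rates["A"] / rates["B"]

for group, rate in sorted(rates.items()):
    print(f"Group {group}: {rate:.0%} quoted the high price")
print(f"Group A is {ratio:.1f} times as likely as group B to see it")
```

In this toy data, half of group A but only a quarter of group B lands in the expensive ZIP code, so group A is twice as likely to be quoted the high price even though the rule itself is "neutral." An audit like this needs the group label only after the fact, which is why the effect can go unnoticed by whoever wrote the pricing rule.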

But in general, right now, the disparate impacts of algorithms on everyone who uses digital services are still, as consumer advocate Ed Mierzwinski put it last year, the “Wild West.” Regulation has not yet caught up with the ability of vendors to discriminate, intentionally or not, against legally protected populations with a single click.

The Tiger Mom Tax: Asians Are Nearly Twice as Likely to Get a Higher Price from Princeton Review [ProPublica]
When Big Data Becomes Bad Data [ProPublica]