Statisticians Sniffing Out Fake Online Reviews Using Scientific Methods

Online reviews can make or break your decision to try out a new hotel, check out a restaurant or even influence your choice of medical care. Unfortunately, companies desperate to be perceived as awesome sometimes try to sneak in their own positive reviews, posing as fellow consumers. Good for us, then, as science has come up with yet another way to help sniff out the fake reviews.

Researchers from the State University of New York, Stony Brook are using statistical methods to detect whether a company has been posting bogus reviews online, says Technology Review. The method can’t root out individual fraudulent reviews, but it can spot where fake reviews are distorting the statistical distribution of, say, a hotel’s scores. Basically, the method can tell you when something’s fishy.

Here’s how it works, in a nutshell: Review scores for products are plotted on a graph, and in most cases they form a shape that looks sort of like a “J.” On a scale of one to five, a product or service will usually have a pretty high number of one-star reviews, followed by a smattering of twos, threes and fours, and then a bunch of five-star ratings, say the researchers.

This is because people have a tendency to buy things they like, and then in turn, rate those things highly because they already like them. Usually those of us who are just satisfied but not blown away or horribly disappointed don’t feel the inclination to post a review at all, thus, the smaller amount of twos, threes and fours.

Fake reviews mess up this whole “J” shape idea, say researchers. They compared ratings of reviewers they believed to be reliable (having written at least 10 reviews more than a day or two apart) to single-time reviewers, to see if those single-timers gave out a weirdly high number of five-star reviews.

Researchers labeled hotels with large discrepancies between those two sets of reviewers as more suspicious.
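The comparison at the heart of the method can be sketched in a few lines of code. This is a minimal illustration of the idea, not the researchers’ actual algorithm; the ratings, the function names and the notion of a “gap” below are all hypothetical.

```python
from collections import Counter

def five_star_share(ratings):
    """Fraction of the given ratings that are 5 stars."""
    counts = Counter(ratings)
    return counts[5] / len(ratings)

def suspicion_gap(established, one_time):
    """Difference in 5-star share between one-time reviewers and
    established reviewers; a large gap suggests the one-timers are
    distorting the score distribution."""
    return five_star_share(one_time) - five_star_share(established)

# Hypothetical ratings for one hotel
established = [1, 1, 2, 3, 4, 5, 5, 5, 1, 5]   # the usual "J" shape
one_time    = [5, 5, 5, 5, 5, 5, 4, 5, 5, 5]   # suspiciously rosy

print(f"5-star gap: {suspicion_gap(established, one_time):.2f}")
```

A hotel where the gap is large would be flagged as suspicious; a real implementation would also have to account for small sample sizes, since a handful of one-time reviews can swing the share wildly.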

Building on earlier work with an algorithm that spotted textual clues to fake reviews, the researchers had a computer measure the effect that known fakes had on the shape of the distribution, then compared the results, detecting fraudulent activity 72% of the time.

One of the researchers says fake reviewers “might think that it was a perfect crime, but the truth is, they distorted the shape of the review scores of their own hotels, and that leaves a footprint of the deceptive activity, and the more they do it, the stronger it becomes.”

This isn’t the first time researchers have tackled the bogus online review problem: In April we looked at another study that was better at identifying groups of spam reviewers than individual fake reviews, based on how often they posted five-star reviews. Thanks for staying on the case, scientists!

Statistics Unmask Phony Online Reviews [Technology Review]



  1. Coffee says:


  2. Harry Greek says:

    I am more concerned over fake negative reviews. Like the ones on Amazon, where people lose their minds when a business does something with their product, that the users don’t like. For example:
    1) DRM
    2) Change color
    3) Go out of business

    The most rabid and insane negative reviewers are on Amazon. I have never seen so many nasty and uncouth people spew anger and hate because they were somehow ‘wronged’ by Starbursts making tropical flavors in addition to the traditional cherry, strawberry, orange and lemon.

    • Coffee says:

      My favorite is the one-star review because the product didn’t ship in a timely manner. Um…the company that makes the product has no control over that, buddy.

      • Marlin says:

        No, even better are the 1-star reviews where UPS damaged the item.

      • Princess Beech loves a warm cup of treason every morning says:

        Some people don’t have a clue about the difference between the ‘packaging feedback’, ‘product feedback’, and ‘supplier feedback’ sections. Others don’t have a clue on how to rate — they put a stellar review on some product but give it a 1-star (lowest) — those make me go “huh?” and fall into my mental “spam” box. Maybe they are just trolling around, but I’ve seen better trolls than that.

        I typically try to review stuff that I bought that not many people have placed reviews yet (I usually leave alone those that have hundreds of reviews already). If an item is within the extremely low or high point, I have a lot to say about it — those items that fall “so-so”, I don’t have much to say, although I try to give the reason why I give it a middling rate.

    • Judah says:

      Products with DRM that actually prevents use of the product deserve one star reviews. Why would you give a good review to a broken item? I don’t understand it.

    • Awesome McAwesomeness says:

      I was trying to find a good balance ball to buy with an Amazon gift card, and so many of them had scathing reviews because they came in a different color than was pictured or had a line of graphics around the center that wasn’t pictured. I had a very difficult time finding out the actual reliability and size accuracy b/c people were like 7 year-old little girls who were mad because they got the purple instead of the pink.

    • Plasmafox says:

      Letting other customers know they will potentially have their privacy violated/purchase rendered unusable, or won’t be able to get the color they want, or won’t be able to order replacement parts are all valid negatives to leave in a review.

  3. Loias supports harsher punishments against corporations says:

    How do they find product reviews to use as a control group? Those could be fake as well.

    • Tim says:

      As the post said, they used reviews from reviewers who had written at least 10 reviews more than a day or two apart. Not a fool-proof method, but I’d believe it.

  4. SkeptiSys says:

    By design, my reviews group at 3 and 4 out of 5 stars. I think a bell curve trending towards above average is overall the most accurate way to view the places I review. So I have the opposite of the described J curve. But these idiots would see me as fake? We can weed out fraudulent reviews with statistics, but not with poor statisticians.

    • Loias supports harsher punishments against corporations says:

      They do not care about the individual, but the average of many individuals. So one outlier is not a flag. Clearly, the majority does not work the same way you do.

      You /= all people.

    • Fineous K. Douchenstein says:

      You would not be proven as fake. Most fake reviews are either 1 or 5 stars. I tend to not even make note of either of these. 4-star reviews are where people like the product, but something isn’t exactly as they want, and they’re nit-picking. This is usually where I start looking, because then I can see if I would have similar nits to pick. The 2 to 3-star reviews also often indicate where someone wants to like a product, but there’s a serious issue with the product, such as a pattern of failure after it worked great for a few weeks.

    • Firethorn says:

      They wouldn’t see you as fake, but statistically unusual. As Fineous said, you’re probably making a BETTER review of the product than the 1/5 star people, but it describes in the article why ‘most’ reviews are either 1 or 5 stars. 1 stars because of the people who get a broken product, can’t make it work, etc… 5 stars because, well, people tend to buy stuff they’re going to like off the internet. 2-4 stars means somebody took the time to really assess the product that most don’t. I’m another who tends to look for the 2-4 star reviews over 5 star for realistic assessments of the product.

      What tends to irk me are the 1 star ‘This is GREAT!!!’ reviews and 5 star ‘Didn’t contain the office chair I ordered; had a live bobcat instead’.

  5. There's room to move as a fry cook says:

    I’d bet my sweet bippy that 1/2 the glowing Trip Advisor reviews are fake – esp. when the 1 to 5 scale histogram is U shaped.

  6. Fineous K. Douchenstein says:


    Oh, wait. They made the College World Series.

  7. failurate says:

    I hope they apply their methods to the Google Play app market. The MLB At Bat 12 app almost never works. Most of the positive reviews just infuriate me with their marshmallowy fluffy sweetness.


    (it is the only app I have ever paid for, and has left me very disappointed)

  8. homehome says:

    The only time I ever used reviews to decide on a product or service was when I was building my computer and was checking out the parts. Other than that, I rarely trust reviews because I find many times they are biased positively and negatively.

  9. d3vpsaux says:

    Reminds me of a certain home improvement company who, during the warranty validation process, enticed me to earn $20 gift cards for posting (and letting them know about) positive marks on a half-dozen reputable review sites. Interestingly enough, the ones they mentioned specifically did not have review-for-compensation clauses in their TOS.

    I submitted 1 review to a site not on their list on what a nightmare their installation and sales support was, and sent them that. Still haven’t received my $20 gift card for that one.

  10. do-it-myself says:

    Maybe I’m just highly skilled at this, but the fake reviews are easy as pie to detect. The real ones talk about an experience or a situation that the customer faced and the meaning behind their rating. The fake positive ones say that a product is great and will tell all their friends about it and it is reliable, etc BS.

    The few people that will take the time to write a review will write a thorough one, positive or negative.

  11. Nigerian prince looking for business partner says:

    I tend to ignore reviews that:

    1) Are 1 or 5 stars.
    2) Have a disproportionate number of spelling or grammatical errors.
    3) Have consistent spelling/grammatical errors between different reviewers.
    4) Use the long, technical name for a product/service multiple times in the review.

    While it’s not exactly foolproof, I think it does the job well enough to give a fairly good overview on a product.

  12. Snakeophelia says:

    As a statistician, I LOVE THIS. I just sent this to my research mentor. And it’s not too far removed from some methods of assessing human raters of performance (such as essays) – you examine the distributions; you plot expected points, actual points, and residuals; you compare “gold standards” (known reviewers) to the unknowns; etc.

    That much said, not everyone has jumped on the online-reviewing bandwagon, and I would expect that many actual people, when they first write a review for Amazon or Tripadvisor, might do so because they had a very good or very bad experience. So I’d be interested in knowing the algorithm that helped them see the difference between fake distributions with one or few data points vs. real distributions.

  13. evilpete says:

    Wasn’t this in the news 8 or 10 weeks ago….

  14. CubeRat says:

    I encountered a really bad set of fake reviews yesterday. They had all the 5 star “works great” “shipped really fast” “I love it”….however, all were dated June 17, 2012. LOL

  15. MCerberus says:

    I wonder if they found fraudulent activity on the amazon page for “How to avoid huge ships”

  16. Sarek says:

    “OMG this is the best [] I’ve ever been to!!! And the prices are so low!!! The service was unsurpassed!!!”

    Right, you need a computer to sniff out the fake reviews.

  17. ScottCh says:

    Sometimes reviews of local places on google maps read like a ping-pong match. I’ll see an angry, vengeful review full of bad grammar followed by a fluffy, happy review laden with generalities, then another set, then another.

    When we’re done converting all of the online review services to shill hangouts, what’s next?

  18. crazydavythe1st says:

    Congrats, they just figured out the Yelp algorithm.

    Hint: that’s why when you leave a bunch of 1-star reviews and then completely check out of Yelp you end up getting filtered out.

  19. maxamus2 says:

    Except this doesn’t work on Apple products, I mean, all Apple products get a 5 star review. Their graph isn’t “J” shaped, it is just a dot.

  20. rgudi16 says:

    Best option is to find the reviews and recommendations of your friends that you trust. And get more details of the item or service being reviewed – like the date of purchase or service.