On the Film Rating System

Miguel Penabella | 23 November 2011

You’ve heard it before: “What’s the Rotten Tomatoes score?” And of course, you take a quick glance at the Rotten Tomatoes page, find the film, and report back the overall score, which eerily resembles an elementary school test grade with its 100% scale and its below-60%-equals-failing association. RT is undeniably THE most popular source for critical film ratings, an aggregating website that collects both professional movie critics’ and community reviews/ratings to project a blunt consensus: fresh or rotten. And while there are plenty of other aggregating sites – Metacritic and IMDb are the first that come to mind – RT maintains its royal position as the pop culture equalizer. All of its critics’ reviews are more or less reduced to a simplistic quotation blurb with an external link, yet hardly anyone ever follows up with the complete article because at the end of the day, the scores are all that matter in a world of increasingly short-term thinking.

With the rise of mobile devices, tablets, and other on-the-go gadgets, consumers demand quick and easy means of condensing entire written articles to mirror a fast-paced lifestyle. This means, of course, relocating the importance from the actual article (film review, editorial, etc.) to a succinct headline or, in the case of a film rating, a crisp, concise number. What we often see is a willful neglect of a critic’s thoughtful insight and justification for his/her final film rating, and a bizarre singular zeal for finding the numbers. Rotten Tomatoes even has a weekly “competition” of sorts, in which the opening films at the box office are gauged by the community as either fresh or rotten, and the ultimate “fresh” film has the chance to be lauded that week with the “Certified Fresh” badge of honor. All this reverence seems harmless, but this simplified rating system actually reveals the deeper instant gratification mindset that film aggregating sites find themselves in. Films are digested in terms of their ratings, and audiences enter the movie environment with the RT percentage in mind – it’s a fast food mentality we’ve downgraded ourselves into. Critical analysis has been dumbed down into a “Movie of the week” consensus, a cringe-worthy testimonial that completely bypasses insightful perspective and context. Movies are seen in very basic terms, often black and white contrasts, while the gray area slowly fades into nonexistence.

__

“This is mass madness, you maniacs! In God’s name, you people are the real thing!”

- Peter Finch as Howard Beale, in Sidney Lumet’s Network, 1976 

If you’ve been following Free Tea with Purchase of a Family Meal, you may have spotted the lack of a critical rating at the end of my Waltz with Bashir review. Since the very launch of Free Tea back in late 2010, I’ve been juggling in my mind whether to include an overall score with my write-ups at all. The words should do the talking, not the numbers at the end of my essays. Just because a film has flaws doesn’t mean it should be relegated to negative territory – nothing’s ever black and white, and no film should carry a stigma of repugnance or unapproachability simply because of where its score happens to fall. There will always be aspects of a film that I’ll like, especially when mulling over the movie for months afterward, which is why a thing like Secondary Thoughts exists. All critical thought should be considered in the long term, not the increasingly trendy short term – a trend only encouraged by the cultural dominance of mobile apps and Twitter, two forces continuously condensing thought into blurbs tailor-made for phones and on-the-go user-friendliness.

I will concede that RT gives a decent quick glance at how a film might play out, but it obviously should not fully reflect a film’s worth. Yet the growing trend we now see is the public’s dependence on a simple RT percentage as the final decision maker in how one approaches a film. This exercise in single-minded zeal dangerously recalls groupthink of an Orwellian flavor – thinking and making decisions as a collective rather than questioning consensus – ultimately resulting in a lack of discourse, a rise in conformity, and a minimization of critical analysis and individual evaluation. Rotten Tomatoes creates a potentially misleading delineation of a film’s worth, as critics are forced to conform to black and white conclusions governed by a rating system. Critics think in terms of numbers, not words, and thus the middle ground slowly becomes nonexistent while films continue to be thought of as “fresh” or “rotten,” good or bad.

As a critic, I’ve found it difficult to find that middle ground, often catching myself thinking in black and white terms about judgments that certainly shouldn’t be so effortless. So after much deliberation, I’ve decided to drop my rating system entirely for the foreseeable future. Just like Whose Line Is It Anyway?, I’ve concluded that the points don’t really matter, so long as the words are of quality and shrewd insight. Good or bad? Thumbs up or thumbs down? These are mind-numbing exercises that force us to think in simplistically defined dichotomies when, in actuality, nothing is ever so straightforward. My follow-up to the Harry Potter and the Deathly Hallows: Part 1 review, the Secondary Thoughts, attests to the notion. One should think not in the short term but in the long term, because the test of time ultimately strips away all superficiality and leaves a film barren at its core. Numerous RT ratings are products of short-term thinking, usually overinflated scores born of heat-of-the-moment groupthink. Harry Potter and the Deathly Hallows: Part 2’s 96% makes it RT’s highest rated Potter flick when others are just as good, if not better (to this reviewer, as I’m sure many others would admit). Bridesmaids has an absurdly overinflated 90% rating (a good film, but not one within ten points of perfection), as do True Grit’s 96%, Born to Be Wild’s 98% (little more than cute nature footage), and Moneyball’s 95%.

Exacerbating the problem are the reviews from fanboy/fangirl communities on sites like RT and Metacritic, whose single-minded hordes are infamous on the web for esteeming a bottom-line rating over anything else. A lack of individual perspective on community-driven sites also fuels the problems of an already flawed rating system, as many obsessive online fanatics have only been watching films for a few years, and their repertoire may be limited to a handful of Christopher Nolan or Peter Jackson films. There’s also a high chance that RT/Metacritic users have experience only with films released within this millennium, with scarcely a trace of foreign film familiarity (and no, District 9 does not count). This lack of perspective ultimately skews the numbers, as the odd blogger holds up The Dark Knight or Avatar as the zenith of cinema, completely unaware of the films of Fellini, Tarkovsky, Ozu, Kubrick, and so on.

Nevertheless, these arguments only begin to get at the crux of the problem with the current film rating culture. Numerous other factors contribute to my current malaise:

1) The Inception Enigma: Ratings within a Rating

Rotten Tomatoes has a feature that allows users to rate other users’ reviews, yet this system is easily abused. Ultimately, a groupthink mentality arises as online critics downgrade others’ reviews simply because they dissent from the overall consensus – in effect, rating the rating. Oftentimes, reviews critical of popular films such as Pulp Fiction or The Godfather are harshly censured and suffer low ratings in a pathetic form of cyberspace revenge. Opposing the consensus is met with admonishment no matter how intelligent and thought-out the dissenting scrutiny, further solidifying a conformist mentality too afraid to speak out against the status quo. Alternatively, those who give top marks to poorly reviewed films, regardless of the quality of their justification, again draw forth a vengeful community.

Back in my days of actively participating in the RT community (2008-2009), I stumbled across a particularly rebellious user named “TheStunner” who deliberately challenged the system. Hostile to the RT scale, this user decided to give a 60% (just enough for a fresh rating) to films s/he approved of and a 50% (just enough for a rotten) to unsavory ones – a crude equivalent of a thumbs up / thumbs down arrangement. Of course, this system quickly drew controversy from the easily agitated fanboy/fangirl community, who saw their beloved films’ scores dented by a defiant critic. User comments disapproved of the 60% score, despite the gushing review that accompanied it, simply because the low percentage could drag down a film’s much higher average. TheStunner beautifully exposed the numbers-over-justification group mentality, mocking the community’s zealous fixation on the average score rather than on intelligent reviews. And thus, despite carefully worded and shrewd examinations that went hand-in-hand with each score, TheStunner ultimately saw his/her stellar essays receive thumbs down solely because the 50%-60% ratings were too low for the single-minded community to stomach.

2) Differing Scales

Rotten Tomatoes builds its percentages by tabulating major film reviewers – “Top Critics” such as Roger Ebert or Lisa Schwarzbaum – alongside secondary reviewers (lesser known critics, websites, blogs officially registered with RT, etc.) and the community. Obviously, the more prominent critics are weighed separately from the masses of amateur online critics within the RT community, but just how reliable is the final percentage? Rotten Tomatoes has to consolidate information from numerous sources and reduce wildly differing scales – a 5-point scale here, a 4-point scale there, a thumbs up/down system elsewhere – into a single fresh-or-rotten verdict per review before computing its headline number, while Metacritic instead normalizes each score onto a 100-point scale and averages. Either way, the standardization raises questions of trustworthiness: where exactly does “fresh” begin on a 4-star scale, and does 2.5 out of 5 mean the same thing to every critic? It’s difficult to translate entirely different scales into one homogeneous whole, yet these sites apparently do it – so how credible is the final number?
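To make the stakes concrete, here’s a minimal sketch – emphatically not either site’s actual algorithm, and the review scores, mix of scales, and 60% fresh cutoff are all assumptions chosen for illustration – showing how the same handful of reviews can produce very different headline numbers depending on whether each review is binarized first or normalized and averaged:

```python
# A toy illustration, NOT either site's real algorithm. Each review is
# (score, scale_maximum); the mix of scales mirrors a real critic roster.
reviews = [(3, 5), (2.5, 4), (6, 10), (65, 100), (1, 4)]

def normalize(score, scale_max):
    """Map any score onto a common 0-100 scale."""
    return 100.0 * score / scale_max

def binarize_then_count(reviews, fresh_cutoff=60.0):
    """RT-style: call each review fresh or rotten, report percent fresh.
    The 60.0 cutoff is an assumption made for this sketch."""
    fresh = sum(1 for s, m in reviews if normalize(s, m) >= fresh_cutoff)
    return 100.0 * fresh / len(reviews)

def normalize_then_average(reviews):
    """Metacritic-style: convert every score to 0-100, then average
    (the real Metascore also applies undisclosed per-critic weights)."""
    return sum(normalize(s, m) for s, m in reviews) / len(reviews)

print(binarize_then_count(reviews))     # 80.0 -- reads as a hit
print(normalize_then_average(reviews))  # 54.5 -- reads as mediocre
```

A film every critic finds merely decent can wear an 80% as proudly as a masterpiece; the binarization discards exactly the middle ground this essay mourns.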

3) Sample Sizes

As statistics tells us, a healthy sample size is absolutely necessary for a dependable conclusion. Yet as RT shows, a number of independent films, foreign films, and dated works lack a critic sample large enough to generate a balanced average score. Nowadays, major theatrical releases often draw over 200 critical responses, but older (yet still well-known) titles lack such high samples. Perhaps this deficiency explains why such applauded films as Fear and Loathing in Las Vegas (counted reviews: 50) sits at 48%, why The Boondock Saints (counted reviews: 23) sits at 17%, and so on. So while high profile titles that receive hundreds of critical contributions may reflect a general consensus, lesser-known, often independent titles with double digit sample sizes could be statistically unreliable.
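Back-of-the-envelope statistics makes the point vivid. If we indulge the strong (and admittedly false) assumption that counted critics behave like a random sample, the standard normal-approximation margin of error for a proportion shows how shaky a 23-review score is – a quick sketch, with the function name and z-value my own:

```python
import math

def fresh_interval(pct_fresh, n_reviews, z=1.96):
    """Approximate 95% confidence interval for a 'percent fresh' figure,
    via the normal approximation to a binomial proportion. Treating
    critics as a random sample is a strong simplifying assumption."""
    p = pct_fresh / 100.0
    margin = z * math.sqrt(p * (1 - p) / n_reviews)
    return (100 * (p - margin), 100 * (p + margin))

# The Boondock Saints' 17% from only 23 counted reviews:
print(fresh_interval(17, 23))   # ~(1.7, 32.4) -- a thirty-point spread
# The same 17% from a 200-review blockbuster-sized sample:
print(fresh_interval(17, 200))  # ~(11.8, 22.2) -- far tighter
```

A thirty-point spread means a 17% and a 30% are statistically hard to tell apart – yet readers treat the single number as gospel.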

4) Time 

The test of time obviously has great influence on a film’s worth, but ratings oftentimes do not reflect long-term consideration. Usually, spur-of-the-moment hype governs the final rating, and any critic who opposes the current film climate is met with harsh derision. Just look at the RT ratings for three of the Harry Potter films: Sorcerer’s Stone (80%), Deathly Hallows Part 1 (79%), and Deathly Hallows Part 2 (96%). The last of the three appears to have a significantly overinflated score, while its first half – arguably one of the best in the series – sits dead last behind Sorcerer’s Stone as the “worst” Potter film… according to Rotten Tomatoes.

Numbers tend to reflect short-term consensus rather than long-term deliberation. Thinking in the long term separates The Social Network from The King’s Speech, Mulholland Drive from A Beautiful Mind, The Pianist from Chicago (there’s a common thread here – see if you can spot it). The rating system needs to step away from the pull of hype and toward impartial, sensible consideration, to avoid glorifying films with little justification for being glorified, and vice versa.

5) Numbers > Justification

There’s a growing problem of audiences venerating the overarching number over the actual assessment of the film, as people look to the average rating to simplify all thought processes. The problem is two-fold, however: a combination of the aggregating website and the users themselves. Firstly, websites that emphasize a film’s rating over its assessment essentially nullify the justification building up to a score. Concurrently, the users themselves are drawn into this instant gratification construct that idolizes the bottom line score over thought-out, well-constructed critical scrutiny. Just look to the aforementioned experiment by TheStunner to observe how numbers have all but taken over the mentalities of online critics. Again, it’s an example of groupthink at work that has brought the rating system down to its most primitive form. A culture that esteems consensus and averaged scores over actual drawn-out reasoning brings with it a conventionalized, potentially unsound rating system. The numbers don’t lie, but ignoring the worded thoughts only makes it easier for the numbers to start doing so.

6) Must… Get… That… A+ (The Importance Placed Upon a Score) 

Films with a history of built-up hype condition audiences to associate them with high marks regardless of the final product’s quality. Hype is a dangerous thing, as I discussed earlier under “Time,” though there are deeper psychological implications in why audiences place such high importance on a score. The Rotten Tomatoes rating system mirrors an elementary school grading system, equating anything below 60% with rotten, or failing, and thus the RT community frequently finds users squabbling over movies teetering on the boundary line. I’m no psychological expert, but the connection between the RT rating scale and an elementary school grading scale may tap into an ingrained drive to achieve that “perfect A+” score. If the grade is all that really matters, what will happen to the actual written text that justifies it? Will it continue to be overshadowed by a mere number?

7) Oversimplification

“Don’t you see that the whole aim of Newspeak is to narrow the range of thought? In the end we shall make thoughtcrime literally impossible, because there will be no words in which to express it. Every concept that can ever be needed will be expressed by exactly one word, with its meaning rigidly defined and all its subsidiary meanings rubbed out and forgotten.”

- George Orwell, 1984

How useful, how trustworthy is a numerical average for a collective group of film scores? Narrowing all of a film’s worth into a standardized number essentially devalues its context and craft, potentially driving away prospective fans with the negative stigma of “rotten” or “thumbs down.” Furthermore, comparing one film to another based solely on an aggregated score isn’t a particularly sensible method because films differ so much – in context, in year of release, in the number of reviews gathered (sample sizes), and so on. Audiences should exercise caution in concluding that a score is indicative of a film’s value; one must look into why a film was rated good or bad, not merely whether it was. And finally, one must gauge for oneself whether or not a film deserves such a conclusion.

What Orwell says about Newspeak can easily be applied to the film rating system. One should never allow a simplified, dumbed down conclusion to define a body of work; there must be reasoning, support, and thought put into its justification. As review aggregators, RT and Metacritic are indispensable in collecting a diverse stratum of critics from the immense film community. However, an average score should never have the final word on a film’s worth – and this conclusion extends into the realms of videogames, literature, music, etc. One must stay attentive and question unanimity to avoid falling into groupthink, because aggregators are merely starting points for delving deeper into thoughtful analysis. Armond White, a film critic with whom I have my own fair share of qualms for reasons I won’t get into here for the sake of staying on topic, writes a thoughtful essay on the dangers of aggregating sites, specifically Rotten Tomatoes, here.

And so, following the example of commendable online film critics who tackle criticism well – Not Coming to a Theater Near You, Reverse Shot, Senses of Cinema, etc. – I’ve decided to let the words do the talking, not a number. At the end of the day, concluding a thorough, contemplative review with a simplified rating unintentionally instills in a reader’s mind that the number is the takeaway point of the article. Numbers only degrade all the material that precedes them, and in the end, both reviewer and reader come away with nothing. And in the great words of Gene Wilder as Willy Wonka, this is the end result: You lose! Good day, sir!
