Sex! Violence! Profanity! Since movies became a popular form of entertainment, people have been preoccupied with these things in movies. In the United States, many films in the 1920s and 1930s (in)famously had tons of licentiousness, violence, and foul language. But in 1934, Hollywood began strictly enforcing the Hays Code (adopted in 1930), which severely restricted what could be shown on screen. For example:
- Kisses could not last longer than 3 seconds
- The word “God” could only be used in a reverent tone
- If someone performed an immoral act, they had to be punished on-screen
Many films had clever ways of getting around some of these rules and, eventually, in 1968, the Hays Code was abolished. Instead, a film rating system — the one most of us are familiar with today — was introduced.
I want to understand these movie ratings better. What causes a movie to be rated a certain way? Ratings seem to be based on the amount of sex, violence, and profanity in movies, so I’m going to use that as the basis for my analysis. Specifically, I’m interested in two things:
- How much sex, violence, and profanity (SVP) is there in contemporary movies?
- How does the amount of SVP determine the MPAA movie rating?
How do we measure sex, violence, and profanity?
Each country has its own rating system, but I’m going to use the one from the United States as a reference point. In the US, the Motion Picture Association of America (MPAA) rates movies according to their suitability for children:
- G (General Audience): nothing that would offend parents for viewing by children
- PG (Parental Guidance Suggested): may contain some material parents might not like for their young children
- PG-13 (Parents Strongly Cautioned): some material may be inappropriate for pre-teenagers
- R (Restricted): parents are urged to learn more about the film before taking their young children with them
- NC-17 (Adults Only): clearly adult
The MPAA doesn’t separate sex, violence, and profanity in its ratings, so they won’t be that useful for us. Instead, I’ll use ratings from Kids-In-Mind, an unaffiliated organisation that rates movies on a 0-10 scale across three dimensions — sex, violence, and profanity. After some pruning, I had 4,391 movies to analyse (for details on the data, see the note at the bottom of this post).
Let’s put their ratings in perspective. I hope you don’t need someone to tell you that Pulp Fiction has more profanity than The Muppets, but here are some representative examples for the ends and midpoint of each dimension.
How sexy/violent/profane are movies?
To answer this question, let’s look at the shape of each dimension’s histogram. The height of the bars represents the number of movies with a given rating.
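As a rough sketch of how such a histogram and its mean can be computed, here is a small Python example. The scores in `sex_scores` are made up for illustration and are not the Kids-In-Mind data; the helper name `histogram` is also just an assumption.

```python
from collections import Counter
from statistics import mean

# Hypothetical 0-10 scores for ten movies; NOT the real Kids-In-Mind data.
sex_scores = [0, 2, 3, 4, 4, 4, 5, 5, 6, 9]


def histogram(scores):
    """Return a list of 11 bar heights: the number of movies at each rating 0..10."""
    counts = Counter(scores)
    return [counts[rating] for rating in range(11)]


bars = histogram(sex_scores)

# Draw a crude text histogram: one '#' per movie with that rating.
for rating, height in enumerate(bars):
    print(f"{rating:>2} | {'#' * height}")

print(f"mean = {mean(sex_scores):.1f}")
```

Each bar is simply a count of movies, so the bar heights always add up to the number of movies in the sample.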
Let’s start with sex. The mean sex rating is 4.0 and most movies have a score that’s close to the mean; few movies are extremely sexy (score of 9 or 10) or completely lacking any sexual content (score of 0).
The mean violence rating is 4.8 and, like the sex ratings, most movies cluster around this mean. In other words, not many movies are extremely violent or extremely tame.
The profanity rating is the most interesting. The mean rating is 4.9, but the histogram isn’t nearly as smooth as those for sex and violence. Two spikes stand out:
- There are a lot of 5-rated movies, more than twice as many as 4-rated movies, the next-biggest category. That’s a bit odd.
- There are a lot of extremely profane movies, with the maximum rating of 10. This pattern is different from what we saw with sex and violence, where the counts decreased smoothly as we moved further from the mean score.
Why is the profanity rating…weird?
The profanity counts have two interesting spikes, telling us that there are a bunch of movies with a middling amount of profanity and a bunch of movies with an extreme amount of profanity. Let’s break things down by MPAA rating to see if that tells us anything interesting.
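A breakdown like this amounts to building one profanity histogram per MPAA rating. The sketch below shows one way to do that in Python; the `movies` list is invented purely to illustrate the mechanics, and the function name `profanity_by_mpaa` is an assumption, not part of the original analysis.

```python
from collections import defaultdict

# Hypothetical (mpaa_rating, profanity_score) pairs; NOT the real data.
movies = [
    ("PG", 2), ("PG-13", 4), ("PG-13", 5), ("PG-13", 5),
    ("R", 5), ("R", 7), ("R", 10), ("R", 10),
]


def profanity_by_mpaa(rows):
    """Group scores by MPAA rating and count movies at each score 0..10."""
    grouped = defaultdict(lambda: [0] * 11)
    for mpaa, score in rows:
        grouped[mpaa][score] += 1
    return dict(grouped)


by_rating = profanity_by_mpaa(movies)

for mpaa, bars in by_rating.items():
    # The highest score that at least one movie with this MPAA rating received.
    highest = max(score for score, n in enumerate(bars) if n > 0)
    print(f"{mpaa:<6} bars={bars} highest score={highest}")
```

Comparing the highest occupied bar across groups is a quick way to spot whether one rating category stops at a particular score.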
Look at the profanity histograms for PG-13 and R-rated movies. PG-13 movies seem to hit a ceiling at a profanity rating of 5. A little investigation explains why: once a movie reaches a certain level of profanity, it gets an R rating, according to the MPAA’s rating rules (PDF):
“A motion picture’s single use of one of the harsher sexually-derived words, though only as an expletive, initially requires at least a PG-13 rating. More than one such expletive requires an R rating, as must even one of those words used in a sexual context.”
So there’s a line in the sand for profanity, but not for sex or violence. As far as I can tell, profanity is the only one of these categories where there is a hard-and-fast rule for determining a rating. No such rules exist for sex and violence; instead, their impact on a movie’s rating is based more on the rating board’s interpretation of what they believe a majority of American parents would think.
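Because the quoted rule is a hard threshold, it can be written as a small function. The sketch below encodes only the sentence quoted above. The function and parameter names are hypothetical, and it deliberately ignores everything else the rating board may weigh.

```python
from typing import Optional


def minimum_rating_for_language(expletive_uses: int,
                                sexual_context_uses: int) -> Optional[str]:
    """Lowest rating implied by the quoted rule, or None if the rule does not apply.

    expletive_uses: times a harsher sexually-derived word is used as an expletive.
    sexual_context_uses: times such a word is used in a sexual context.
    """
    # Even one use in a sexual context, or more than one expletive use, requires R.
    if sexual_context_uses >= 1 or expletive_uses > 1:
        return "R"
    # A single expletive use requires at least PG-13.
    if expletive_uses == 1:
        return "PG-13"
    # No such words: this rule sets no minimum.
    return None


print(minimum_rating_for_language(0, 0))
print(minimum_rating_for_language(1, 0))
print(minimum_rating_for_language(2, 0))
print(minimum_rating_for_language(0, 1))
```

Nothing comparable can be written for sex or violence, since the text above describes those as matters of the board’s interpretation rather than fixed counts.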
This exploratory data analysis is just that: an exploration. We now have a better understanding of what goes into movie ratings, which prompts some new questions:
- Are other countries’ rating systems similar to the MPAA’s?
- What other factors, if any, go into determining a movie rating?
- How have movie ratings evolved over time? For example, has “damn” had less of an impact on a rating over time?
These are all interesting avenues to investigate, but for now, I hope you’ve found this exploration interesting.
Note on data
I scraped the data for 4,497 movies from Kids-In-Mind, which has details on movie name, year of release, MPAA rating, and the individual SVP ratings given by Kids-In-Mind. For the analysis, I excluded movies released before 1992 (n=34) or after 2016 (n=17) because I didn’t feel there were enough movies to be representative of the years they were released. I excluded movies with an NC-17 rating (n=5) or with no rating at all (n=67) for the same reason — I didn’t feel the sample was large enough to be representative. There ended up being 4,391 movies in the main analysis.
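The exclusion criteria above can be expressed as a single filter. This is a minimal sketch under stated assumptions: the `Movie` record, its field names, and the sample titles are all hypothetical, and the scraped data may well be structured differently.

```python
from dataclasses import dataclass
from typing import Optional

KEPT_MPAA_RATINGS = {"G", "PG", "PG-13", "R"}  # NC-17 and unrated are excluded


@dataclass
class Movie:
    title: str
    year: int
    mpaa: Optional[str]  # None stands for "no rating at all"


def keep(movie: Movie) -> bool:
    """True if the movie passes every inclusion criterion described above."""
    return 1992 <= movie.year <= 2016 and movie.mpaa in KEPT_MPAA_RATINGS


# Hypothetical records used only to exercise the filter.
scraped = [
    Movie("Example A", 1990, "R"),      # released before 1992
    Movie("Example B", 2005, "PG-13"),  # kept
    Movie("Example C", 2010, "NC-17"),  # NC-17
    Movie("Example D", 1989, None),     # too early AND unrated
    Movie("Example E", 2017, "PG"),     # released after 2016
    Movie("Example F", 1999, "G"),      # kept
]

analysed = [m for m in scraped if keep(m)]
print(f"kept {len(analysed)} of {len(scraped)} movies")
```

Applying all criteria in one pass means a movie that fails more than one of them (like "Example D") is only removed once.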