Warning: This article contains 17 uses of the f-word, 9 uses of the s-word and 8 uses of the c-word, among other examples of foul language. Even the graphs are graphic. You have been fucking warned.
Like last week, this is another offshoot of my work for Guinness World Records, although the vast majority of what you’re about to read won’t make it into the 2025 World Records book for reasons that will become extremely obvious.
What began as an investigation into the ‘Sweariest Movie’ grew into a much wider exploration of swearing within movies worldwide.
I will post a longer explanation of my methodology at the end of the article, but for now, know that it is the result of analysing the dialogue of 42,704 movies released between 2000 and 2022. Today, I’m focusing on their country of origin.
Which country produces the sweariest movies?
This kind of research needs a fair amount of subjectivity. Each culture will have its own assessment of what words are and are not swear words. For today’s analysis, I have opted to look at the uses of the words ‘Fuck’, ‘Shit’, and ‘Cunt’.
I couldn’t decide whether it was more important to look at the frequency or density of sweary movies, so I did both. On the chart below we have two metrics of swearyness:
- Frequency. On the horizontal axis we have the percentage of movies which featured at least one use of the no-no words. You’ll see that almost 71% of the 553 Thai movies I studied used at least one of those words once (this is largely down to the sheer number of ‘Shit’s they produced).
- Density. The vertical axis tracks the average number of uses of those words across all films. The average Irish film had 26.6 such swear words.
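For anyone who wants to see the two metrics spelled out, here is a minimal sketch in Python. The film records, the `swear_metrics` function and the `min_films` parameter are all made up for illustration; this is not the code behind the chart. It assumes each film has a list of countries and a single count of the tracked words, and that a co-production counts towards every country involved.

```python
from collections import defaultdict

# Hypothetical per-film records: (countries of origin, count of tracked swear words).
films = [
    (["Ireland"], 40),
    (["Ireland", "UK"], 12),
    (["UK"], 0),
    (["Thailand"], 3),
    (["Thailand"], 0),
]

def swear_metrics(films, min_films=1):
    """Return {country: (frequency_pct, density)} for countries with enough films.

    frequency_pct: percentage of the country's films with at least one tracked word.
    density: average number of tracked words across all of the country's films.
    """
    # Per country: [number of films, films with >= 1 swear, total swears].
    totals = defaultdict(lambda: [0, 0, 0])
    for countries, count in films:
        for country in countries:  # assumption: a co-production counts for each country
            tally = totals[country]
            tally[0] += 1
            tally[1] += 1 if count > 0 else 0
            tally[2] += count
    return {
        country: (100 * with_swears / n, total / n)
        for country, (n, with_swears, total) in totals.items()
        if n >= min_films
    }

metrics = swear_metrics(films)
for country, (frequency, density) in sorted(metrics.items()):
    print(f"{country}: {frequency:.1f}% of films, {density:.1f} swears per film")
```

On a real dataset, `min_films` would be raised so that countries with only a handful of films don't dominate either axis.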
Let’s dig a little deeper in this pool of filth…
The Irish are the biggest ‘Fuck’ers
Films produced (or co-produced) in Ireland had the highest frequency of films containing the word ‘Fuck’ across the dataset, only slightly pipping Argentina and Belgium to the ‘Fuck’ing crown.
Anecdotally, I was told by an Irish friend a while ago that word placement was key to understanding the use of the word ‘Fuck’ in the Irish vernacular: “Pass me the fucking bread” may not be viewed as offensive, while “Fucking pass me the bread” might. Therefore, if we were able to go through all 9,795 uses of ‘Fuck’ in the Irish canon, we might find interesting differences between uses.
There’s a lot of ‘Bollocks’ in the UK and Ireland
As well as the Holy Trinity of swear words we’ve seen so far, I tracked a whole bunch of other ones. This revealed localised swear words, such as ‘Bollocks’ which was largely found in Ireland and the UK.
As an aside, it seems that where there are ‘Bollocks’ there also tend to be ‘Wanker’s. The two words were highly correlated.
Where the two nations differ is in the prevalence of ‘Twat’s. UK films are twice as likely to contain ‘Twat’s as those from Ireland.
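For the curious, here is one way the ‘Bollocks’ and ‘Wanker’ relationship could be checked. This is only a sketch: it assumes a Pearson correlation over per-film counts, which is my choice for illustration rather than a statement of how the figure above was calculated, and the counts are invented.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(spread_x * spread_y)

# Hypothetical counts of each word in five films (same film order in both lists).
bollocks = [0, 2, 5, 1, 0]
wanker = [0, 1, 4, 1, 0]

r = pearson(bollocks, wanker)
print(f"Correlation between 'Bollocks' and 'Wanker' counts: {r:.2f}")
```

A value close to 1 means films with more of one word tend to have more of the other; a value near 0 means no linear relationship.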
Australia is big on ‘Dick’s
Australian films are the most likely to contain multiple ‘Dick’s, followed closely by Canada and the US.
Australia is also the leader in ‘Bugger’y, with the Czech Republic and the UK coming up the rear.
‘Slut’s were more often found in Thailand, Brazil and Belgium
As well as direct expletives, I also tracked pejorative descriptors, such as ‘Slut’. These were spread more widely than the ‘Twat’s and ‘Dick’s we saw above, with the greatest concentration being in films from Thailand, Brazil and Belgium.
North America loves a ‘Douche’
Finally, films coming from America and Canada were the most likely to use the word ‘Douche’.
If there hasn’t already been enough filth, I have looked at related topics in the past, including:
- An analysis of 12,309 feature film script reports. In 2019, I analysed over 12,000 movie screenplays and their script reports. Among the findings was a meaningful positive correlation between the level of swearing and the score the script received from Script Readers.
- Defining the average screenplay. This came from the same analysis and looked at how likely screenwriters were to use certain words. The delightful Venn diagram below comes from this work.
Below is the result of analysing 12,309 feature film scripts, showing the prevalence of certain key swearwords. You can read the full report here.
Non-swearing further reading includes:
- Which countries most commonly team up to create film co-productions?
- How many countries do Hollywood movies shoot in?
- Which languages are most commonly used in movies?
For this analysis, I started with a dataset containing over 400,000 English language subtitle files very kindly provided by the lovely folk at OpenSubtitles.com. I narrowed it down to 42,704 live-action fiction movies, released between 2000 and 2022. Metadata came from OMDb, IMDb, The Numbers, Wikipedia and my own analysis. The final charts show the 40 countries which had at least 200 movies appearing in the dataset.
I conducted a keyword search for a predefined set of swear words within these subtitle files, taking into account considerations such as spelling variations, punctuation and case. This approach, while efficient at identifying the presence of swear words, inherently comes with some limitations, including:
- Subtitle inaccuracies. Subtitle files may contain errors that affect the accuracy of the swearing count, as they can be auto-generated or include manual transcription mistakes.
- Regional variations. The most popular English version of a film’s subtitles was used, potentially overlooking regional variations that might contain different results. This is especially relevant to swear words as each country is likely to have its own rules on what’s ‘too offensive’.
- Missing movies. There is a chance that some key films are missing from the dataset, which could skew the results and leave out key candidates.
- Context. The keyword search doesn’t account for context, meaning that words identified as swears might not always serve as profanity within the dialogue, such as when they’re part of a name or used in a non-swear context. This is most relevant for ‘Dick’ as it’s a common name and thus frequently appears in dialogue where it is not intended as profanity. That said, this is looking at disproportional uses of the studied words, so the relative position in the cohort is more important than the exact values.
- Subtitle contents. Some subtitle files might not be complete or may exclude parts of the dialogue that impact the overall count of swear words. In last week’s work on James Bond films I needed to take account of subtitles for the theme tune. I wasn’t able to do this kind of manual work on this dataset of 70k+ files!
Therefore, I see this as accurate but not precise. This means that it is great for identifying general trends and possible contenders for films with the most swear words. But when it comes to the standards needed to anoint a particular film a Guinness World Record winner, we go through additional stages of checking. These include cross-referencing against multiple subtitle sources and manually checking the context of swear word usage. However, for the purpose of this study, the current methodology offers a robust starting point for exploring the prevalence of swearing in movies.
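To make the keyword search described above a little more concrete, here is a rough sketch in Python. The word list, the `PATTERN` regular expression and the `count_swears` function are my own illustrative assumptions, not the actual code used for this study. It ignores case, tolerates trailing punctuation and counts suffixed forms such as ‘fucking’ under their stem.

```python
import re
from collections import Counter

# Hypothetical stems to track; a real list would be longer.
STEMS = ["fuck", "shit", "cunt"]

# \b anchors the stem at the start of a word; \w* swallows suffixes like "-ing".
PATTERN = re.compile(r"\b(" + "|".join(STEMS) + r")\w*", re.IGNORECASE)

def count_swears(text):
    """Count occurrences of each tracked stem in a block of subtitle text."""
    return Counter(match.group(1).lower() for match in PATTERN.finditer(text))

sample = "Pass me the FUCKING bread. Shit! Oh, shit... He's from Scunthorpe."
counts = count_swears(sample)
print(counts)
```

Anchoring the stem at the start of a word avoids false hits inside unrelated words (the place name in the sample is not counted), at the cost of missing compounds where the stem comes later in the word. A plain search like this also cannot tell a name from an insult, which is the context limitation listed above.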
Well, that was a lot of fun! But the dataset and methodology can produce much more than just filth. I am currently studying movies for things like clichés, cultural references and language shifts. If you have a suggestion of what I should study, please do reach out or leave a comment below.