A reader, Peter, got in touch to ask:

People often complain about a sharp decline in the quality of film writing in the last 10-15 years. So my thinking is, are modern screenplays objectively worse?

I love these types of questions, partly because they’re an entry point for research that will help us better understand movies, and partly because they force us to define terms. In that process of defining something like “worse”, we will learn more about the nature of movies, even before the data reveals the results.

Today I’m going to address one narrow aspect of the question (i.e. writing style).

I studied the dialogue in 64,332 movies released between 1940 and 2022. Each movie’s writing was scored using four metrics, which will allow us to see the change over time.

Vocabulary variety (technical term is MTLD) is a measure of lexical diversity, i.e. how long a script can keep introducing new words before it starts repeating itself.

Sentence structure, which splits into:

Line verbosity, i.e. how many words there are within a line.

Short lines, i.e. how many lines have only one, two, or three words.

Filler words, i.e. how much conversational padding there is, such as “like”, “you know”, “I mean”, etc.

Together, we can use these signals to see whether scripts are becoming simpler. If they are, we would expect to see a narrower vocabulary, shorter speech chunks, more fragments, and possibly more conversational padding.

Now we have our measures, place your bets on what we’ll see…

Signal 1 - Are movies using fewer unique words?

Let’s start with the broadest signal. Vocabulary variety tracks how wide a script’s word range is. Technically, this is measured using the Measure of Textual Lexical Diversity (MTLD), which estimates how long a text can continue introducing new words before it starts recycling earlier ones.

In practice, it answers a simple question: how quickly does a script begin repeating itself?

Imagine two versions of the same emotional scene.

In one version, a character says something is “astonishing”, then “extraordinary”, then “unprecedented”. The writer keeps reaching for fresh terms. In the other version, everything is “crazy”. The surprise is crazy. The plan is crazy. The villain is crazy. The solution is crazy.

Both versions communicate meaning, but only the former maintains lexical range.
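For anyone who wants to see the mechanics, here is a minimal sketch of how an MTLD-style score can be calculated. It is not the exact code behind the charts. The 0.72 cut-off is the value conventionally used for MTLD, and the tokeniser, the function names, and the fallback for texts that never repeat a word are my own assumptions for illustration.

```python
import re

# Conventional MTLD cut-off. Assumed here; the article does not state the value used.
TTR_THRESHOLD = 0.72


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z'’]+", text.lower())


def _mtld_one_direction(tokens, threshold=TTR_THRESHOLD):
    """Count how many 'factors' the text uses up, reading in one direction.

    A factor ends each time the running type-token ratio (unique words
    divided by total words so far) drops to the threshold or below.
    """
    factors = 0.0
    seen = set()
    count = 0
    for token in tokens:
        count += 1
        seen.add(token)
        if len(seen) / count <= threshold:
            factors += 1
            seen = set()
            count = 0
    if count > 0:
        # Give partial credit for the unfinished stretch at the end.
        ttr = len(seen) / count
        factors += (1 - ttr) / (1 - threshold)
    if factors == 0:
        # Sketch-only fallback: nothing repeated, so report the text length.
        return float(len(tokens))
    return len(tokens) / factors


def mtld(text):
    """Average the forward and backward passes, as MTLD is usually defined."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    forward = _mtld_one_direction(tokens)
    backward = _mtld_one_direction(tokens[::-1])
    return (forward + backward) / 2


varied = ("The surprise is astonishing. The plan is extraordinary. "
          "The villain is unprecedented.")
repetitive = "The surprise is crazy. The plan is crazy. The villain is crazy."
print(mtld(varied), mtld(repetitive))
```

The “crazy” version scores lower because it hits the repetition threshold sooner. Texts this short are far too small for a reliable score, so treat the two numbers as a toy comparison only.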

In the chart below, each blue dot is the average for all movies released in that year, indexed to an arbitrary base year of 1965. This means we don’t have to be distracted by the exact numbers, but can quickly see whether things are rising, falling, or staying consistent. The orange line is the trend line, to make the direction even easier to spot.

The data is clear - yes, movies are becoming lexically less diverse. Scripts are recycling their vocabulary sooner than they did decades ago.

Dialogue is increasingly drawing from a tighter core set of terms. Characters repeat key words more often, and emotional emphasis leans on reinforcement rather than variation.

When vocabulary range narrows, dialogue becomes more direct, and (arguably) more predictable. The linguistic palette shrinks, even if the story world does not.

So that’s one signal suggesting an increase in script simplicity. Let’s look at the next.

Signal 2 - Are lines of dialogue in movies getting shorter?

In previous research, I have shown that the number of spoken words in movies was at its highest in the 1940s, after which it fell until the 1960s. It then remained fairly static until another decline in the late 1990s. However, since the mid-2000s, it’s been on the rise again.

Connected to this is the inverse trend around silence. The last few decades have seen a significant decline in moments when no one is speaking.

Whilst you may have your own views on whether the lack of silence is a signal for or against the claim that ‘movies are becoming more simplistic’, I don’t think it alone is enough to score this point yet.

We need to dig into how sentences are being constructed. Consider a scene in a movie where one character is confronting another. An older style might run like this:

“If you think for one moment that I am going to allow this arrangement to proceed without objection, then you have fundamentally misunderstood my position.”

That is a single flowing unit of speech which builds, qualifies, and explains.

Whereas, a more modern structure might look like this:

“You’re wrong.” “This isn’t happening.” “Not today.”

In those two scenes, the story beat is the same, but the delivery is broken into smaller pieces. One would assume that the editing of those two scenes would also differ, with the latter having a faster intercutting rhythm.

Let’s zero in on the length of the lines, to see if characters are expressing complex thoughts or empty short lines.

Line verbosity measures how many words a character speaks before the subtitle breaks. Subtitles divide dialogue into blocks, and each block represents one continuous chunk of speech. Longer blocks usually mean longer thoughts, whereas shorter blocks mean quicker exchanges and more rapid back-and-forth.

And here we see the same pattern as before, i.e., a clear, consistent decline over the past eighty years.

We can go one further on this point and look at the amount of very short lines, i.e. those with only one, two, or three words, such as:

Wait.

Why?

Go.

Now!

And indeed, we do see a clear rise in these micro lines.

If line verbosity tells us that blocks are shrinking, this measure shows how they are shrinking - ideas are being divided into fragments rather than contained within longer, flowing sentences.

This shift changes the rhythm and reduces the cognitive load on viewers, allowing them to process shorter units more quickly.

So I’m calling that two-zero on the argument in favour of increased script simplicity.

Signal 3 - Are movies full of pointless words?

Finally, we look at empty words or phrases, which take up space but do not drive the story forward or convey information.

Filler words are the small pieces of conversational padding that make speech feel natural. Words and phrases like “like”, “you know”, “I mean”. They rarely carry plot information, but they affect the tone and rhythm.

Compare:

“I don’t agree with you.”

With:

“I mean, I just… I don’t agree with you.”

The latter version feels more conversational, in part because it contains more linguistic padding.

This third signal might be the strongest of the trio, as we saw a stark increase over the period I examined. Characters use more verbal cushioning than they did in earlier films, with dialogue sounding looser, more informal, and closer to everyday speech patterns.

When we pull the three signals together, we can be pretty clear that movie scripts have become:

Snappier

More verbose

But less diverse

And with more empty phrases

Does this mean they’re worse? That takes us outside the realm of data science and into interpretation. Thoughts in the comments, please!

Just one more thing…

I suspect some readers are currently thinking:

But Stephen, those changes in speech patterns are found in the real world, and so rather than scripts getting worse, this actually shows that screenwriters are keeping up with society.

And that’s a fair point. Perhaps none of this reflects a decline in writing at all, but instead reflects changes in how people speak in the real world.

It’s fair to say that everyday conversation has become looser, more fragmented, and more filled with verbal padding. So, if screenwriters are mirroring real speech, then what we are measuring is that cultural change, rather than a decline in creative writing.

We need to test if what we’ve tracked so far is down to one or the other of the following:

Writing fashion. This sums up the notion that the craft of screenwriting itself has shifted, meaning that modern writers are structuring scenes differently, breaking dialogue down into shorter units and choosing for their characters to speak in faster, more fragmented exchanges. And if you don’t like that, you might conclude that ‘films are getting more simplistic’.

Real-life reflection. Writers reflect the world as it should be for the settings of their movies, and as society has become more relaxed, cinema has followed suit, with most set in the present day. If this is true, then scripts are not getting simpler, but we are getting simpler, and movies are just a mirror.

How do we test which is true? We look for movies set in the same time period but written at different points in the past eighty years.

I identified two cohorts of movies from the wider database:

Contemporaneous movies are those which were set in the same decade in which they were released.

Movies set in the 1800s, as this was the largest time period outside of the 20th and 21st centuries. (More on that here: What is the most popular time period for movies?)

If these changes are about writing fashion, then we should see the same drift we saw earlier.

But if the writers are accurately reflecting the time period their stories are set in, then there should be no change over the past eighty years.

So what do we see?

For our first two structural signals, the cohorts move together.

Dialogue lines are getting shorter in both groups.

And the number of very short lines is increasing in both groups.

That suggests that at least part of what we’re seeing is not about characters sounding more “modern”. It looks more like a shift in writing style, with scenes broken into smaller conversational beats, exchanges tighter, and thoughts divided into fragments more often.

That structural compression is happening regardless of whether the story is set in 2022 or 1822. So, some of this change is about how films are written and edited, not about what period they depict.

But when we look at language complexity, the picture changes.

Films set in the 1800s do not show the same decline in vocabulary variety that we see in contemporaneous titles. There is still some downward movement, but it is noticeably flatter.

The clearest divergence appears in filler words. Over the period I studied, filler words increased markedly in films set in the present day. In films set in the 1800s, they slightly declined.

Writers of period films appear to resist some aspects of modern speech patterns. Excessive filler words would feel obviously out of place in an 1800s setting. In previous research, I have shown that audiences punish films they regard as using anachronistic language choices, which could explain this heightened awareness (or pressure from producers).

So what have we learned?

Across the past eighty years, movie dialogue has become structurally tighter: lines are shorter, very brief exchanges are more common, and scenes are broken into smaller conversational beats.

This shift appears in both contemporary and historical films, suggesting it reflects changes in writing fashion and pacing.

At the same time, vocabulary diversity has declined, and filler words have increased, but mainly in films set in the present day. Period films do not follow that trend to the same extent. That implies that some of what we are measuring reflects contemporary speech patterns, not a universal drift across all storytelling.

In short, we are seeing two forces at work: a broad structural compression in how dialogue is constructed, and a setting-dependent shift towards more conversational, padded language in modern-set stories.

Movies may well be becoming structurally tighter and more conversational. But whether that counts as “simpler” depends on whether you see those changes as loss, adaptation, or evolution.

What do you think?

Notes

This research examined the English-language subtitle files of 64,332 movies released between 1965 and 2022, inclusive.

Each film was scored on four subtitle-derived language signals.

Vocabulary variety was measured using MTLD (Measure of Textual Lexical Diversity), where lower values indicate faster repetition of the same word set.

Line verbosity was measured as the mean number of words per subtitle block, with lower values indicating shorter speech chunks.

Structural fragmentation was measured as the share of subtitle lines containing one to three words, where higher values indicate more micro-lines.

Conversational padding was measured as filler-word frequency per 1,000 words, with higher values indicating looser, more padded dialogue.
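The three simpler signals above can be sketched in a few lines. This is an illustration under stated assumptions, not the code used for the study: the filler list contains only the three examples named in this article, the tokeniser is a basic regular expression, and all function names are hypothetical. Note that a plain word match counts every “like”, including the verb, so a real filler measure would need something more careful.

```python
import re

# Only the fillers named in the article; the full list used is not given.
FILLERS = ("like", "you know", "i mean")
WORD_RE = re.compile(r"[a-z'’]+")


def words(block):
    """Split one subtitle block into lowercase word tokens."""
    return WORD_RE.findall(block.lower())


def _word_counts(blocks):
    """Word count per block, skipping blocks with no words (an assumption)."""
    counts = [len(words(block)) for block in blocks]
    return [count for count in counts if count > 0]


def line_verbosity(blocks):
    """Mean number of words per subtitle block."""
    counts = _word_counts(blocks)
    return sum(counts) / len(counts) if counts else 0.0


def short_line_share(blocks, max_words=3):
    """Share of subtitle blocks containing one to max_words words."""
    counts = _word_counts(blocks)
    if not counts:
        return 0.0
    return sum(1 for count in counts if count <= max_words) / len(counts)


def fillers_per_1000(blocks):
    """Filler words and phrases per 1,000 words of dialogue."""
    total_words = 0
    hits = 0
    for block in blocks:
        tokens = words(block)
        total_words += len(tokens)
        text = " ".join(tokens)
        for filler in FILLERS:
            hits += len(re.findall(r"\b" + re.escape(filler) + r"\b", text))
    return 1000 * hits / total_words if total_words else 0.0


# Example blocks taken from the lines quoted earlier in this article.
blocks = ["I mean, I just... I don't agree with you.",
          "Wait.", "Why?", "Go.", "Now!"]
print(line_verbosity(blocks), short_line_share(blocks), fillers_per_1000(blocks))
```

Each function takes a list of subtitle blocks for one film, so the same input feeds all three film-level scores.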

After metric extraction at the film level, change over time was estimated from yearly medians rather than yearly means to reduce the influence of outliers. Trend direction was then modelled with weighted linear fits across release years, with yearly film counts used as weights so denser years contribute more than sparse years.
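As a rough sketch of that aggregation step, the snippet below takes film-level scores, reduces them to yearly medians, indexes them to a base year, and fits a weighted straight line. The function names and the sample numbers are made up for illustration, and scaling the base year to 100 is my assumption about how an index might be expressed, not a description of the charts.

```python
from collections import defaultdict
from statistics import median


def yearly_medians(films):
    """films: iterable of (release_year, metric_value) pairs.

    Returns {year: (median_value, number_of_films)}.
    """
    by_year = defaultdict(list)
    for year, value in films:
        by_year[year].append(value)
    return {year: (median(values), len(values))
            for year, values in sorted(by_year.items())}


def index_to_base(medians, base_year, scale=100.0):
    """Express each yearly median relative to the base year's median."""
    base_value = medians[base_year][0]
    return {year: scale * value / base_value
            for year, (value, _) in medians.items()}


def weighted_linear_fit(points):
    """points: iterable of (x, y, weight) triples.

    Returns (slope, intercept) minimising the weighted squared error.
    """
    pts = list(points)
    weight_sum = sum(w for _, _, w in pts)
    x_bar = sum(w * x for x, _, w in pts) / weight_sum
    y_bar = sum(w * y for _, y, w in pts) / weight_sum
    sxx = sum(w * (x - x_bar) ** 2 for x, _, w in pts)
    sxy = sum(w * (x - x_bar) * (y - y_bar) for x, y, w in pts)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


# Made-up film scores. The 50.0 is an outlier that the median ignores.
films = [(1965, 10.0), (1965, 12.0), (1965, 50.0),
         (1966, 9.0), (1967, 8.0), (1967, 8.0)]
medians = yearly_medians(films)
indexed = index_to_base(medians, base_year=1965)
# Each year is weighted by its film count, so denser years count for more.
slope, intercept = weighted_linear_fit(
    (year, value, n_films) for year, (value, n_films) in medians.items())
print(medians, indexed, slope)
```

Using the median means the single 50.0 in 1965 does not drag that year upwards, which is the point of preferring medians to means.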

Time settings were initially determined based on explicit mentions in film synopses, which could reference a specific year, decade, century, or a general time period such as “medieval.” Films with no clear indication of their time period were assumed to be set in the decade of their release.
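To make that rule concrete, here is a simplified sketch of how a time setting might be pulled from a synopsis. Everything in it is an assumption for illustration: the regular expressions, the function names, the choice to read “1800s” as a whole century, and the sample synopses are all hypothetical. It also skips named periods such as “medieval”, which would need a lookup table of their own.

```python
import re

YEAR_RE = re.compile(r"\b(1\d{3}|20\d{2})\b")
DECADE_RE = re.compile(r"\b(1\d{2}0|20\d0)s\b")
CENTURY_RE = re.compile(r"\b(\d{1,2})(?:st|nd|rd|th)[\s-]century\b")


def release_decade(release_year):
    """Return the (first_year, last_year) of the decade of release."""
    start = release_year // 10 * 10
    return (start, start + 9)


def infer_setting(synopsis, release_year):
    """Return an inclusive (start_year, end_year) range for the setting."""
    text = synopsis.lower()
    match = YEAR_RE.search(text)
    if match:
        year = int(match.group(1))
        return (year, year)
    match = DECADE_RE.search(text)
    if match:
        start = int(match.group(1))
        # Assumption: "1800s" means the century, "1920s" means the decade.
        span = 99 if start % 100 == 0 else 9
        return (start, start + span)
    match = CENTURY_RE.search(text)
    if match:
        start = (int(match.group(1)) - 1) * 100
        return (start, start + 99)
    # No clear indication: assume the decade of release.
    return release_decade(release_year)


def is_contemporaneous(synopsis, release_year):
    """True when the setting falls inside the decade of release."""
    start, end = infer_setting(synopsis, release_year)
    decade_start, decade_end = release_decade(release_year)
    return decade_start <= start and end <= decade_end


# Invented synopses, not taken from the dataset.
print(infer_setting("A governess arrives at a remote estate in 1847.", 2011))
print(infer_setting("A detective hunts a killer.", 1995))
```

A film lands in the 1800s cohort when its range falls between 1800 and 1899, and in the contemporaneous cohort when `is_contemporaneous` returns true.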

Subtitles are not perfect proxies for scripts, but they are close enough to capture structure, vocabulary, rhythm, and repetition. One thing to note is that subtitle formatting conventions could have changed over time, which could affect some of what we saw. I’m not too worried about this, as it alone is unlikely to explain the divergence between contemporaneous films and those set in the 1800s. If subtitle fashion were the primary driver, we would expect both cohorts to move in near-identical ways across all four signals, but they do not. So while subtitling norms may account for part of the structural shift, they are unlikely to be the main explanation for the differences we see between story settings.

