
Defining the average screenplay, via data on 12,000+ scripts

4 February 2019

Last week, I published my analysis of 12,309 feature film screenplays and the scores they each received from professional script readers.

A byproduct of that research was that I had a large number of data points on a whole bunch of screenplays.  This allowed me to look at what the average screenplay contains.

Hopefully, this research will prove useful to writers, producers and directors looking to understand what a typical screenplay looks like, and will give them a benchmark against which to assess their own work.

All of these scripts were reviewed by professional script readers, either as a part of a screenplay competition or to create a script report.  The vast majority of these scripts will not have been produced into movies yet and a large number of the screenwriters will still be at entry level, rather than professional writers. That being said, within the dataset are scripts which have won awards, been optioned by established producers and been written by professionals and Hollywood stars.

In this article, I’ll share what the typical feature film screenplay contains with regard to seven topics:

  1. Number of pages
  2. Level of swearing
  3. Gender-skewed genres (and who writes female characters)
  4. Number of speaking characters
  5. Number of scenes
  6. Locations and times of day
  7. Age of primary characters

1. Number of pages

The median length across all of our scripts was 106 pages. However, there was a broad spectrum of lengths, with 68.5% of screenplays running between 90 and 120 pages. As the chart below shows, there are spikes at round numbers, namely 90, 100, 110 and 120 pages.

Horror scripts are the shortest, with an average page count of 98.6 pages, while Faith scripts are the longest at 110.0 pages.

2. Level of Swearing

Warning: The charts contain uncensored uses of bad words.  If this is not your thing, then skip to the next sub-section.

Almost four out of five scripts contained the word ‘s**t’, with two-thirds featuring ‘f**k’ and just under one in ten using the word ‘c**t’.

Although more scripts feature at least one ‘s**t’ than at least one ‘f**k’, when ‘f**k’ does appear it tends to be used more frequently than ‘s**t’. Across all our scripts, ‘s**t’ is used an average of 13.2 times, ‘f**k’ 23.9 times and ‘c**t’ 2.1 times.

Unsurprisingly, the swear words were not spread equally across all scripts. I developed a swearing score, based on the frequency of the three swear words I tracked, awarding a ‘1’ for each use of ‘s**t’, ‘1.17’ for ‘f**k’ and ‘8.51’ for ‘c**t’.
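To make the scoring concrete, here is a minimal sketch of how such a weighted score could be computed for a single script, assuming you already have counts of each tracked word. The weights are the ones given above; the function and variable names are mine, not taken from the original research.

```python
# A weighted swearing score for one script, using the weights described
# above: 1 per use of 's**t', 1.17 per 'f**k' and 8.51 per 'c**t'.
# The per-script word counts are assumed to have been extracted elsewhere.

SWEAR_WEIGHTS = {
    "s**t": 1.0,
    "f**k": 1.17,
    "c**t": 8.51,
}

def swearing_score(word_counts: dict) -> float:
    """Return the weighted swearing score for one script."""
    return sum(weight * word_counts.get(word, 0)
               for word, weight in SWEAR_WEIGHTS.items())

# Example: a script with 10 's**t', 20 'f**k' and 1 'c**t'
print(swearing_score({"s**t": 10, "f**k": 20, "c**t": 1}))  # 10 + 23.4 + 8.51 = 41.91
```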

Comedies are the sweariest, beating Action and Horror scripts by a tiny margin (Comedy scores 42.8, Action scores 42.5 and Horror scores 41.8). The genres featuring the lowest levels of swearing are Family (1.2), Animated (1.3) and Faith-based scripts (2.8).

Only sixteen scripts used ‘c**t’ without also using either ‘s**t’ or ‘f**k’ at least once.

3. Gender-skewed genres (and who writes female characters)

I have written at length in the past about gender inequality in the film industry, and so I won’t discuss the topic in detail here.  However, it is interesting to note how the gender split changes between different genres of scripts in the dataset.

The most male-dominated genres are Action (in which 8.4% of writers were women), Sci-Fi (14.1%) and Horror (14.5%). Women were best represented in Faith (47.2%), Family (41.5%) and Animated scripts (39.1%).

An interesting finding in last week’s research was that when we look at the scores given by readers, there seems to be an advantage to writing in a genre dominated by another gender.

For example, Action is male-dominated but is also a genre in which female writers outperform their male counterparts by the second-largest margin. Likewise, Family films written by men received higher ratings than those by women.

My reading is that when it’s harder to write a certain genre (either due to internal barriers like conventions or external barriers like prejudice) the writers who make it through are, by definition, the most tenacious and dedicated. This means that in a genre where there are few women (such as Action) the writers that are there tend to be better than the average man in the same genre.

As well as tracking the gender of the writers, I also looked at the gender of the major characters of each script (where it was possible to do so).

In all but one genre, female screenwriters were more likely to create female leading characters.  This was particularly pronounced in Historical films, where female characters in male-penned scripts account for only 39% of leading characters whereas the figure was 74% for scripts written by women.

This neatly illustrates one of the many reasons why gender inequality within the film industry can have negative outcomes.  As well as basic fairness and equal opportunities, we also have to consider what characters we are seeing in movies.  Culture can be defined as the stories we tell ourselves about ourselves, and so an overly-male writing community is likely to lead to a culture which overemphasises the plight of male characters, thereby undervaluing female characters, stories and perspectives.

4. Number of speaking characters

The dataset allowed me to look at the number of unique characters who speak in each script, from our principal hero/heroine right through to background characters with single perfunctory lines.
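The report doesn’t detail how speaking characters were extracted, but as a rough illustration, here is a minimal sketch that counts unique character cues in a plain-text screenplay. It assumes conventional formatting (a character cue is a short uppercase line placed before dialogue) and crudely filters out scene headings, transitions and extensions such as (V.O.); the names and heuristics are my own, not the method used to build this dataset.

```python
import re

# Rough estimate of the unique speaking characters in a plain-text screenplay.
# Assumes standard formatting: a character cue is a short uppercase line
# placed before dialogue, possibly with an extension such as (V.O.).
CUE_EXTENSION = re.compile(r"\s*\((V\.O\.|O\.S\.|CONT'D)\)\s*$", re.IGNORECASE)

def speaking_characters(script_text: str) -> set:
    names = set()
    for line in script_text.splitlines():
        stripped = line.strip()
        if not stripped or not stripped.isupper():
            continue
        # Scene headings and transitions are also uppercase, so skip them.
        if stripped.startswith(("INT", "EXT")) or stripped.endswith("TO:"):
            continue
        name = CUE_EXTENSION.sub("", stripped)
        if 0 < len(name) <= 30:  # character cues are short
            names.add(name)
    return names

sample = """INT. KITCHEN - DAY

ALICE
We need to talk.

BOB (V.O.)
Not now.

CUT TO:
"""
print(len(speaking_characters(sample)))  # 2
```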

Historical scripts have the greatest number of speaking characters (an average of 45.7) and Horror scripts have the fewest (25.8). Sadly, I was unable to track how many of those characters were still alive by the final page.

5. Number of Scenes

The average script has 110 scenes – just over one scene per page. Action scripts have the greatest number of scenes (an average of 131.2 scenes) with Comedies having the fewest (just 98.5).

6. Locations and times of day

Each scene heading starts with an indication as to whether the scene takes place inside (“INT” for interior), outside (“EXT” for exterior) or a hybrid (“INT/EXT”).
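As an illustration only (this is not the parser used for the research), a slugline such as “INT. KITCHEN - NIGHT” can be broken down into interior/exterior, location and time of day along these lines; the regular expression assumes tidy, conventional formatting, and real scripts need fuzzier matching.

```python
import re

# Classify a scene heading (slugline) by interior/exterior, location and
# time of day, assuming conventional formatting such as
# "INT. KITCHEN - NIGHT" or "EXT./INT. CAR - DAY".
HEADING = re.compile(
    r"^(?P<place>INT\./EXT\.|EXT\./INT\.|INT/EXT|INT\.|EXT\.)\s*"
    r"(?P<location>.+?)"
    r"(?:\s*[-–]\s*(?P<time>DAY|NIGHT|DAWN|DUSK|CONTINUOUS))?\s*$",
    re.IGNORECASE,
)

def classify_heading(line: str):
    m = HEADING.match(line.strip())
    if not m:
        return None
    place = m.group("place").upper()
    kind = "hybrid" if "/" in place else ("interior" if place.startswith("INT") else "exterior")
    time = (m.group("time") or "UNSPECIFIED").upper()
    return kind, m.group("location").strip(), time

print(classify_heading("INT. KITCHEN - NIGHT"))  # ('interior', 'KITCHEN', 'NIGHT')
print(classify_heading("EXT./INT. CAR - DAY"))   # ('hybrid', 'CAR', 'DAY')
```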

Across all scripts, 60.2% of scenes are interiors, 38.9% are exteriors and 0.9% are hybrid locations.

Westerns are mostly set outside, with 64.4% of their scenes taking place in exterior locations. At the opposite end of the scale, we see 65.2% of Comedy scenes taking place indoors.

Something that will make producers wince is that the average location only appears in 1.5 scenes.

58.3% of scenes take place during the day and 41.7% take place at night. Perhaps unsurprisingly, Horror scripts are much more likely to be set at night (56.5% of scenes) whereas Historical scripts are the most nyctophobic, with only 28.9% taking place at night.

7. Age of primary characters

Where a specific age is given, the average age of the top five characters across all the scripts is 31.8 years old.

The character who speaks most often is typically a little younger (average age: 28.3) and as we move down to characters who speak less frequently the age increases slightly. The average age of the fifth most frequently-speaking character is 35.4.

The median age is 30 years old, with 15.4% of all characters being listed as exactly 30.

Notes

Today’s research is riding on the coat-tails of my ‘Judging Screenplays By Their Coverage’ report and so comes with the same notes, definitions and caveats.

I would suggest either reading last week’s article or the full 67-page report for details.  This is particularly relevant to explain our methodologies on complicated topics such as gender.


Comments

  1. I have written a long historical saga that could be either a miniseries or episodic TV. The miniseries is never addressed in articles such as above. Please, let’s have more.
    Thank you,
    Carol Layman

  2. Lovely, so helpful for big-picture thinking! #4. I think the title of the chart is meant to say “per script”, rather than per scene. But I would be quite interested to know how many characters per scene there are, broken down by genre.

  3. Really nice, Stephen! I loved the casual, ‘out of the box’ way of thinking about story development that a quantitative study like yours reveals. Wonderful to check and make associations.

    Would you do a study of settings by genre? By settings I mean locales (they could be foreign, or close to the main action vs far from the main action). In geography they now speak of ‘origin locales’ vs ‘attraction locales’ (touristic attractions like the Madame Tussauds wax museum) as destination locales. By the way, touristic spots in movies are a huge add-on that helps bring extra finance to a movie in various countries (such as Italy), because each touristic place depicted in a movie correlates directly with an increase in tourism.

    There are also ‘intermediate locales’ or transition locales. In game design there is ‘level design’, i.e. route design for the user/player to move across the game geography from starting point to end point. In the detective genre, for instance, in screenplay school we tend to teach that the detective investigates clues that lead from locale A to locale B.

  4. Jean-Marie MAZALEYRAT

    Hats off to you.
    Simply amazing!
    And of course, some academic myths fall:
    – the use of V.O.
    – the overall importance of such elements as format, hook, originality, structure, theme, pacing, and even conflict!!!
    – etc.
    Having a comparison between non-produced and produced scripts would be great too, and I bet it should break even more myths.

    1. I don’t know that general principles like “format, hook, originality, structure, theme, pacing, and even conflict” are myths, though the Advice Factory has definitely generated a lot of mythology on these issues (especially the use of V.O.).

      It seems to me that shifts like shorter scene length / greater number of scenes / reduced overall page count mean that the way writers handle pacing, for example, has to change as well, but if anything it’s even more important nowadays (and more difficult) to ensure that pacing keeps the reader / viewer enthralled. Ditto for the other principles mentioned.

      1. Thank you. Yes, I found it – dataset, 23.7% female.

        So when you refer to “basic fairness and equal opportunities,” (above) given that Screencraft is one of the biggest entry-level conduits, do you think that Male entrants outnumbering Female entrants by over 3 to 1 could be having a negative impact on gender representation in the film industry? Or do you still feel this is down to ‘unconscious bias’ (from your report “Gender Inequality and Screenwriters”)?

        I wonder if this data has affected your thought process on this issue?

        1. *Great* question.

          I’d say that there are a few things to unpack here. Firstly, yes, it’s a valid data point which shows the level of people of each (self-reported) gender who submit to ScreenCraft script competitions and script reports. It would be nice to have comparable data for other script competitions / report suppliers before we extrapolate this data across the whole industry. I’ve not seen anything to suggest this is a skewed dataset but I’m often surprised by data so try not to presume.

          Secondly, this new research is tracking actions taking place at an early stage of the industry journey, as opposed to, say, writing for big-budget Hollywood movies, and so we can conclude that the hand of the industry (biased or not) is much weaker here. Throughout the gender report research, we saw that as the industry got more involved (i.e. bigger budgets, more prestigious shows/films, etc.) female representation dropped. Therefore, the script data adds further evidence to the belief that men and women do not have the same preferences in all situations. (N.B. I don’t think this is a controversial idea, and it’s not one I have only just subscribed to.)

          It also adds evidence to the idea that if we were instantly and magically to remove all bias, the industry wouldn’t end up at exactly 50:50. Quotas, targets and goals of increasing gender representation are not primarily about getting to some magic, Platonic ideal number. It’s about combating decades of entrenched beliefs and normalising something which was once rare or unlikely. As we showed in the gender report, there is a vicious cycle whereby if one class of people is rare, they are seen as a risky choice, and in a risk-averse industry they are much less likely to get hired. To break this still-perpetuating cycle, we need to increase representation in the short term and change the perceptions of certain classes of people (such as women).

          It’s worth noting that this script data cannot prove that bias is wholly absent. These people will all have been influenced by the perceptions of the industry, and the vast majority will have had guidance and support, either formal or informal. It’s just that we would expect any such bias to be weaker here than further into the heart of the industry.

          Finally, another reason that it’s important we have fair representation among key creatives is that a relatively small number of people have a huge degree of control over our culture. Movies, and the characters shown in them, have a massive influence on how we all see the world. As this new report shows, female writers are much more likely to write about the lives of female characters than male writers are. Therefore, it matters if one group of people has a disproportionate effect on the stories we get to see and hear.

          This extends way beyond just gender. We were not able to measure other aspects of the writers, such as class, race, socio-economic status, etc. We couldn’t fully measure age, but we did get an indication, as ScreenCraft told us that the average age of writers was 32 and our research found that the average age of lead characters was 28. So for the two factors we do have – gender and age – we can see that writers write what they know. This is not in itself a problem (and arguably a good route to factually and emotionally true stories) but it does underline the need for diverse storytellers.

          Thank you for the question. The main thing I am seeking to create on this site is fact-driven debate. There are no ideas or beliefs that are beyond challenge, and new data should be used to update our understanding.

          S

          P.S. I am talking in simple terms about two genders here, but only because the data we had on writers self-reported as either male or female, and the character data could only detect male/female skewed names.

          1. I want to clarify, in case of misunderstanding:

            My point is that in your gender equality report, for the phase 1 entry into the industry stage you used applicants to, and students of, Film and Screenwriting courses. For the latter measure, the female applicants were 43% and successful applicants 39%. This is what you took to be representative of the whole industry entry stage.

            If you are measuring, say, female representation among Doctors (the profession, not the show, ha), then this approach – studying the gender balance of, e.g., medical students – would indeed make sense, because it is mandatory to study to become a doctor. But Screenwriting is different: as your own report noted, only a minority enter Screenwriting through formal education.

            So my challenge to you was: is it not more accurate for the phase 1 entry level to be measured by the gender balance of people actually sending their scripts out to others, of which competitions form a significant part?

            The basic thrust of your gender inequality report was that the percentage of female screenwriters starts off close to 50% and depletes the higher up the industry you go. Your conclusion was that ‘unconscious bias’ is a significant factor.

            However, your theory only holds water if you indeed believe that the 43% represents the whole of stage 1 entry level. If you were to use, say, the 23.7% from your report (which is broadly in line with the available data from the Black List, BBC Writersroom and the Nicholls Fellowship) to represent stage 1, then it looks like career phases 2 and 3 are no longer broadly out of kilter with phase 1.

            Also, you say that “It’s worth noting that this script data cannot prove that bias is wholly absent.” But at least in the Screencraft data it seems to be absent, considering female screenwriters average slightly higher scores (though this does not necessarily have an impact on the gender of winners; do you have data on that?)

          2. I totally follow your argument and I’m not saying that you’re wrong. Your suggestion is a fair claim and one which does follow from the numbers.

            What I would say is that there is no one place we can go to get an objective number which once and for all tells us what the true intent of new entrants is. Every number is a proxy, and open to interpretation. On the one hand, the film school numbers are great because they’re over a long time series and across many schools and courses. On the other, you’re totally right that it’s not a requirement to enter the industry, and isn’t even the most common route in. So, far from perfect.

            I would contend that these ScreenCraft numbers are not obviously a better proxy. They come from one source (for which we cannot know whether or not there is bias) and are no more of a required or common element of career progression than schools. Different, but not automatically better or worse.

            To be clear, I do agree with you that this is an indicator, and one which runs counter to the theory suggested by the film school data. The gender report brought together a large number of data points, and the argument did not hang solely on the film school data.

            In answer to your point about the dataset having no bias, I’m not sure we can be so certain. There may be other differences between how men and women respond to the nature of script competitions which need to be factored in. This could be both the propensity to apply in the first place and the willingness to keep going after repeated rejections. This isn’t an argument I’m making, just an example of how bias can take many forms. This particular example would be more a matter of wider gender perceptions than something the film industry is doing. (Although of course it may be something the industry wants to take into account if it wants to have more diverse new talent reaching the big leagues.)

  5. (Following on from your reply beginning “I totally” – there doesn’t seem to be a reply button on that, so this may pop up out of synch)

    I agree with you that you cannot use a single data point. Screencraft is just one (set of) competition(s). However, it does seem to match available data from other types of entry points:

    The Black List

    https://blog.blcklst.com/gender-gap-in-non-professional-writer-submissions-ca1887a427c4

    The Nicholls Fellowship

    https://wellywoodwoman.blogspot.com/2013/04/under-representation-in-scriptwriting.html#more

    and the BBC Writersroom

    http://www.bbc.co.uk/blogs/writersroom/entries/f31ff216-f67f-4f85-a98c-cc5615032ba2

    I want to challenge you on one point. Whereas the gender equality report did use multiple sources of data, the main thrust of your argument relied on the gender split of screenwriting courses vs the later career stages of screenwriting. In the table on pg 94 this is the only screenwriting-specific data for phase 1, and both you and others have extracted and shared it as a summary.

    So I agree that a breadth of data points are necessary to properly understand phase 1. Specifically I would like the debate to include (still centring around people who send their scripts to others):

    – Other major competitions
    – Producers (quite difficult)
    – Agents/Clients (relatively easy, they normally publish/promote their agents and clients on their website)
    – Major pitchfests

    One thing I liked about your gender equality report is that you showed the gender breakdown of applicants to various funding schemes. Personally, I would like to see this front and centre of our understanding of phase 1, rather than buried in appendices at the back. This is so we know whether the gender splits of phase 1 really are out of kilter with phases 2 to 4.

    This is important, because I can see this is clearly driving a lot of policy-formation at the Writer’s Guild and perception in the industry in general.

    I’m not saying do a new report, but I hope you will consider the above when revisiting the issue of gender balance, as this issue will undoubtedly come up again.

    1. Yes, good points.

      With all reports of that nature, we’re torn between wanting to be concise and wanting to be complete. The way I normally handle it is to put the key info in the main section and anything else interesting or relevant in the Appendix. Were I to write that report today, the ScreenCraft data would certainly be relevant and would also affect other choices, such as the relevance of other stats.

      You suggest that finding the gender of producers would be quite difficult. True, but I feel it’s something I could take on. I can apply the same data and processes I have just used for directors and will seek to do so in the coming months.

      1. That sounds great, Stephen.

        It was more the gender of the people sending their scripts in to, e.g., producers that I was referring to, rather than the gender of the producers themselves, but of course the gender of the producers is important as well, to understand trends.

        Thanks for responding in the way you have.

  6. Great stuff, Stephen. I’m working on scripts for sequential art and trying to map your film stats to graphic novels. Now whenever I watch a movie, I’m going to start counting swear words and speaking characters and scene locations – LOL. Thanks for the insights!

  7. I was looking for how many spoken words are said in an average film per minute, but it was still interesting skimming through your graphs. Thanks.

  8. Hi Stephen
    I was looking at this older and very interesting post about screenplay lengths etc. Is the page count you used based on A4 or US Letter size pages, as it makes a difference of +/- 15% on screenplay page count?
    It would be interesting to have some statistics on the number of lines in individual dialogue elements (i.e. 1 line / 2 lines / 3 lines / more than 4 lines etc.), and also on longline lengths.
    best
    David

  9. I wonder what percentage of studio bosses use the average patterns of past successes as a guide for future success, and what percentage use data as a guide to find the new and make the opposite?
    😬😄

  10. Thank you for your lucid insights here. Some creatives, like the one living rent-free in my head, may balk at “reducing” art to numbers and trying to fit true creativity into some predefined box. For new screenwriters, however, this data is invaluable. Why? It provides a clear framework they can fit their projects into. True, you may have a horror masterpiece coming in at 200 pages, but a producer who is familiar with the genre will immediately be put off by your ignorance of its conventions. So too with the number of scenes, speaking characters, etc. If, for example, you only have 10 people speaking, this metric is a clear indication that you need to expand the speaking parts and, again, avoid producers seeing that you don’t know what you are doing. Conversely, I’m wondering if there is any advantage to hitting these average metrics straight on and appealing to a producer’s subconscious bias towards scripts that fit the mould. Just some thoughts, and thank you so much for this data. As a new screenwriter, it helped answer a few questions that had been hanging over the whole project.

  11. Thanks much, all really useful data. It would be interesting to see all the same breakdowns for Oscar-nominated screenplays. The chart that shows the gender of the primary character by the gender of the screenwriter is missing the faith genre. Wondering if that was intentional. Do you have this data?


Stephen Follows