The Limitations of Big Data in the Film Industry

One of the limitations is the lack of data on low-budget independent films (Simon & Schroeder, 2019). It becomes harder to successfully predict the box-office success of films that are smaller in scale. This is due to the lack of information in regard to the film. The less information that is available in a film, the harder it becomes for a prediction model to be able to use the data to accurately make a prediction.

Another limitation is that it does not consider who the potential audience for the film is and the audience for its actors. For example, the film Ticket to Paradise was not expected to do well in the United States (U.S.). Upon its release in the U.S. It had made $80 million overseas and was predicted to only accumulate $6.4 million in its opening week in the U. S. But in a surprise, Ticket to Paradise managed to secure $16 million in its U.S. debut (McClintock, 2022). It was reported that sixty-four percent of the audience for the film in its first weekend were older than 35 (What the “Ticket to Paradise” Box Office Opening Says about the State of the Rom-Com, 2022). The age of the audience is a key factor in this. In a movie starring two older actors who are less known to the younger audiences, the main audience who is likely to go see the actors and actresses are familiar with those actors and actresses from when the audience themselves were younger and watch their films.

There are no officially known prediction models, except for one exception, which are able to make predictions for which actors should be hired for a particular project. The one exception to this is Netflix. In the past it has been revealed that for the Netflix series House of Cards, Netflix was able to use data gathered from the streaming platform to make a prediction that pairing David Fincher and Kevin Spacey would be a success. This was based on the popularity of films directed by David Fincher and films starring Kevin Spacey (Carr, 2013).

When it comes to predicting success at the box-office, there is one limitation that can’t be accounted for. With predictions based upon social media users’ activity in regard to the film before its release, there is no way to accurately predict the potential flop of the film at the box office until it has already happened. A film can be predicted to be a success with all the talk surrounding it prior to its release and have the estimated box-office receipt and how many theaters it will play in, but nothing can account for the word of mouth spread of negativity towards a film that is not rated well in the eyes of the audience. The audience who has seen it and disliked it overall can lead to the unpredictable possibility of the film becoming a failure at the box-office when it was previously seen as a potential enormous success. Asur and Huberman (2013) have found that after the release of a film, sentiment on social media can affect the predictions of a film’s box-office revenue.

Another limitation for big data within the industry comes down to when the usable models can create accurate predictions for the films. “Big Data Goes to Hollywood: The Emergence of Big Data as a Tool in the American Film Industry” brings this to light. Asur and Huberman’s (2013) Twitter-based model was only able to make a prediction that could be seen as dependable the night before the film was released into theaters in the United States (Simon & Schroeder, 2019). While the model has shown that it can help to accurately predict which films will be a success, there is no real help when it can only supply the info the night before. At that point, it does not become beneficial for the studio to know at that point due to there being nothing for the studio to do about how the film is distributed.

A main point to remember from the Yahoo! and other search engines includes: when solely working off of the data sets from only one or a few sources at a time can lead to success with predictions, those predictions will contain a lot more failed predictions than there would be if you were to utilize many various data sets to create an overarching model in which box-offices predictions can become more and more successful. One major limitation of this analysis is that companies reveal little to no information on their prediction models, but we do know that they create these overarching models to consider the possibilities from all sides (Kapoor, 2021).