Can We Predict Anime Popularity?

If you want a prequel to this saga, check out Understanding Anime Through Tropes, one of the first blog posts I wrote after deciding to switch back to English.

Shikizaki Kiki
Before we start, a prayer to Shikizaki-san

This time I’m going to present a short overview of a modelling project that I’ve been working on, on and off, for the past six months. Heavily influenced by Azuma Hiroki’s theory of database consumption, I decided to test out an approach to predicting anime popularity. I stress the word popularity because we all know it’s impossible to agree on what constitutes a quality title.

Certain people claim that there are objective qualities to entertainment; I happen to disagree. Popularity is a case of collective wisdom, so I came upon an idea to test whether anime tropes could tell us anything about the popularity of the titles they appear in. Fans have attempted to predict anime quality numerically before, with very lacklustre results.

Data Mining

There are two wonderful sites on the internet which provided the data I needed to build my anime popularity classifier:

  • TVTropes is a website that catalogues trope information. If you casually glance over a given anime page, you’ll notice that some of its tropes don’t make a whole lot of sense. At least to me they don’t. Even more of them fall into the “Your Mileage May Vary” category, which I promptly ignored. Still, it is exactly the hidden knowledge which interested me. I was banking on Tropers exposing that hidden knowledge in the form of abstract trope information, and on letting a computer algorithm try to demystify it.
  • Anime News Network, with its wonderful API, provided anime popularity information. I saved each title’s episode count, anime production company, and user-submitted grades. The first two would join the collected tropes as additional attributes, while grades would serve as class information. I was wondering whether I should keep grades in continuous form (from 1.0 to 10.0) or convert them to binary. In the end, I decided on the latter. ANN attaches descriptive labels to its grades, the highest three being Masterpiece (10), Excellent (9) and Very Good (8). These three constitute Class 1. Anime with grades of 7 or below fall into Class 0, meaning they probably aren’t very exceptional.
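The binary conversion described above can be sketched in a few lines. This is only an illustration: the titles and grades are made up, `to_class` is a hypothetical helper, and the cutoff at exactly 8.0 is an assumption about how fractional average grades between 7 and 8 were handled.

```python
def to_class(grade, threshold=8.0):
    """Return 1 (Class 1) for grades at or above the threshold, else 0 (Class 0).

    The 8.0 cutoff is an assumption; how grades between 7 and 8 were
    treated is not spelled out in the post.
    """
    return 1 if grade >= threshold else 0


# Hypothetical titles and average grades, not real ANN data.
grades = {"Title A": 8.6, "Title B": 7.9, "Title C": 8.0, "Title D": 5.2}

classes = {title: to_class(grade) for title, grade in grades.items()}
print(classes)
```

Collapsing a 1.0 to 10.0 scale into two classes throws away detail, but it turns the task into a plain yes/no classification problem.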

Now I know what you’re thinking. I’m dealing with user-submitted data. It’s not perfect, it’s messy – believe me, it is. But there’s enough of it that a lil’ bit of noise probably doesn’t matter a whole lot. Another issue is I’m using ANN user-submitted grades. I discuss their caveats in this post. Maybe you don’t like the site; that’s fine. Just remember that the model can still make sense within its bounds.

The Purpose

What does matter is whether we can find a correlation between tropes and user-submitted grades at all. Here’s the crux: only if the trope information is descriptive enough, which is admittedly hard to achieve, will the model be able to predict with some probability whether an anime is popular (grades 10 to 8) or unpopular (grades 7 to 1). So for example, if you’re an anime producer and your series is in the planning stages, this is how you’d use the model: plan out your trope sets, put them into this model, trust the classification score of the highest-scoring trope set, start your production, then cross your fingers and pray to God everything goes according to plan during production.

In case my sarcasm wasn’t obvious enough, this model is a toy when it comes to real-world applications. However, it does have a specific purpose. With it, I’m trying to demonstrate that tropes carry meaningful information about anime titles, or entertainment in general.

Understanding the Data

I considered anime titles released between 2011 and 2014. Going back too far would probably make the dataset less complete, as fewer anime titles are listed on TVTropes the further back in time you go. This way I produced a list of 419 titles. The original dataset had 10290 attributes to its name, the large majority of them being tropes. In the end I kept only tropes that appeared in more than one show, shaving away roughly 4000 attributes. Yes, most tropes are delusions with no commonality. It’s also worth noting that roughly 11% of all the collected anime fall into Class 1, meaning my class division humorously satisfied the “90% of all entertainment is crap” rule of thumb. Obviously I had to weight the remaining trope data by the number of unique tropes, the number of tropes each title has to its name, and global trope frequencies.
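To make the filtering and weighting steps concrete, here is a minimal sketch. The titles and tropes are invented, and the weighting formula is a TF-IDF-style stand-in that I am assuming for illustration; the exact scheme used in the project is not given here.

```python
import math
from collections import Counter

# Hypothetical titles and tropes, standing in for the scraped data.
title_tropes = {
    "Title A": {"Beach Episode", "Tsundere", "Time Travel"},
    "Title B": {"Beach Episode", "Tsundere"},
    "Title C": {"Tsundere", "Giant Robot"},
}

# Global trope frequency: in how many titles does each trope appear?
doc_freq = Counter(t for tropes in title_tropes.values() for t in tropes)

# Keep only tropes shared by more than one title.
kept = sorted(t for t, n in doc_freq.items() if n > 1)


def weights(tropes):
    """TF-IDF-style weights for one title (an assumed scheme, see above).

    Each kept trope is scaled down by how many tropes the title lists
    and scaled up the rarer the trope is across all titles.
    """
    n_titles = len(title_tropes)
    return {
        t: (1.0 / len(tropes)) * (math.log(n_titles / doc_freq[t]) + 1.0)
        for t in kept
        if t in tropes
    }


print(kept)
print(weights(title_tropes["Title A"]))
```

In this toy data "Time Travel" and "Giant Robot" are dropped because each appears in a single title, and a trope present everywhere ("Tsundere") gets a lower weight than a rarer one ("Beach Episode").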

Building the Model

Perhaps all this trope information makes sense in a higher dimension, a thought which practically begged me to test out Support Vector Machines. This learning method maps data to a higher dimension, which makes the process of forming a boundary between the two classes a bit easier.

Having only 419 learning examples is rough. Calibrating an SVM requires that you separate the data into a training set and a testing set: you train the classifier on the former and test its performance on the latter. I made a rough 3:1 split each time. Furthermore, training the classifier demands an additional partitioning step, which leaves you with even less data to calibrate your classifier on. I could always get more anime data, but I decided this is where I’ll draw the line.
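A rough sketch of this kind of setup, using scikit-learn on synthetic data, might look as follows. Everything specific here is an assumption made for illustration: the random stand-in data, the RBF kernel, the balanced class weights, the parameter grid and the 3-fold cross-validation playing the role of the additional partitioning step.

```python
import numpy as np
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import GridSearchCV, StratifiedKFold, train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-in: 419 titles, 50 binary "trope" columns, about 11% of
# titles in Class 1. This is NOT the real dataset.
n_titles, n_tropes, n_popular = 419, 50, 46
y = np.zeros(n_titles, dtype=int)
y[:n_popular] = 1
X = (rng.random((n_titles, n_tropes)) < 0.10).astype(float)
# Make the first five tropes more common in Class 1 so there is some signal.
X[:n_popular, :5] = (rng.random((n_popular, 5)) < 0.60).astype(float)

# Rough 3:1 split, stratified so both sets keep the Class 1 share.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Cross-validation inside the training set picks hyperparameters without
# touching the test set. Kernel, class weights and grid are assumptions.
search = GridSearchCV(
    SVC(kernel="rbf", class_weight="balanced"),
    param_grid={"C": [0.1, 1.0, 10.0], "gamma": ["scale", 0.01]},
    scoring="roc_auc",
    cv=StratifiedKFold(n_splits=3, shuffle=True, random_state=0),
)
search.fit(X_train, y_train)

# Score the held-out titles with the refitted best model.
test_auc = roc_auc_score(y_test, search.decision_function(X_test))
print(f"cross-validated AUC: {search.best_score_:.2f}, test AUC: {test_auc:.2f}")
```

Stratifying both the split and the folds matters with so few Class 1 titles, since an unlucky random partition could otherwise leave a fold with almost no positive examples.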


The ROC AUC scoring function, which summarises how the true positive rate trades off against the false positive rate across all decision thresholds, spat out the following results:

As you can see, the two scores have a bit of a gap. I shuffled the data randomly each time, so sometimes the classifier got better examples to work with and sometimes the testing data got harder examples to predict, which further shows that a lack of data can screw you over hard. Still, sometimes the score reached 93%, which is very good. On average though, the predictive ability was floating around 85%, which is still relatively good. Production company and episode count data without tropes consistently yielded sub-70% scores, meaning trope information provided a noticeable increase in the classifier’s predictive ability. This means that tropes most likely carry meaningful hidden knowledge.

If you work for an anime company and you really want to know how far this line of reasoning goes, you can always hire me. As for myself, I’ll be publishing these results in a self-published Mechademia-style journal, which I’m in the process of writing with a group of humanities graduates. No ETA yet, but I’m hoping we’ll also be able to dish out an English translation. Right now though, I just want to hear some opinions.
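For readers unfamiliar with ROC AUC, one way to read the score is as the chance that a randomly picked Class 1 title gets a higher classifier score than a randomly picked Class 0 title. The sketch below computes it that way on made-up labels and scores; `roc_auc` is a hypothetical helper written for this example, not the function used in the project.

```python
def roc_auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties count as half a win. This is quadratic in the number of
    examples, which is fine for a few hundred titles.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one example of each class")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Made-up labels (two Class 1 titles out of nine) and classifier scores.
labels = [1, 0, 0, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.3, 0.7, 0.2, 0.1, 0.4, 0.35, 0.05]

print(round(roc_auc(labels, scores), 3))   # 13 of 14 pairs ranked correctly
print(roc_auc(labels, [0.0] * len(labels)))  # constant scores give 0.5
```

This is also why AUC suits a dataset where only about 11% of titles are in Class 1: a model that gives every title the same score would be right about the class most of the time, yet its AUC stays at 0.5.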

