What Anime Genres Do We Like Watching These Days?

This is another one of my data posts. You have been warned.

MDS Plot
MDS visualization of major anime groups today, presented in 2D!

I noticed an interesting question on one of the online anime places that I frequent: What types of anime shows are being made these days? Some folks wanted to do a statistical analysis on the matter, but merely counting genres is easy and that’s boring.

So I came up with a new question. Given that people like watching certain genres more often than others, which genres do we like watching these days on the whole? By that I mean things like: what does the dominant fan group watch, and is my taste still relevant to anime producers?

If I was going to do this, the first thing I needed to do was tap into an anime data source! I was already familiar with AnimeNewsNetwork, AniDB and TV Tropes, but I also wanted to check out a Japanese website called AniKore, where users are encouraged to vote on user-submitted tags. I would have produced a nicely weighted dataset from those votes, but unfortunately AniKore’s top-ranking tags simply suck. Too many titles have dumb qualitative descriptors tacked onto them, such as kami anime (tl: God-tier anime), that tell you absolutely nothing about the kind of anime you’re dealing with. The problem with TV Tropes, on the other hand, is that their tags aren’t general enough, which is the understatement of the century. Then again, ANN’s genres are too general. Luckily AniDB offered a nice middle ground between the two extremes, with a total of 140 different genre tags.

I don’t want to bore you guys too much with all the technicalities and math that went into this. You can just skip down to the results. Let me just state for the record what general steps I took so that this won’t seem like complete voodoo magic.

  1. I extracted title, start date, ranking, vote count and tag information from AniDB for nearly every TV anime show from the start of 2011 up to this season.
  2. I noted which tags belong to which anime, turned each anime into a numerical representation of its tags, and computed a so-called cosine similarity measure for every pair of anime. That way each pair got a score for how similar the two shows were, or weren’t.
  3. I ran a clustering algorithm on all the scores, which spat out the four groups of anime it thought looked similar to one another. Clustering algorithms try to group similar items according to a chosen similarity measure. In our case that was cosine similarity.
  4. I counted which tags were the most common in each of the four groups. I also measured how many users voted on average and what the average user rating was inside each group.
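
For the curious, here is a minimal sketch of step 2. It is not the code from the repository, just an illustration under a few assumptions: the show titles and tags are made-up toy data, and `cosine_similarity` is a hypothetical helper name. For binary tag vectors, cosine similarity reduces to the number of shared tags divided by the square root of the product of the two tag counts.

```python
from itertools import combinations
from math import sqrt

# Hypothetical toy data (not taken from AniDB): show title -> set of tags.
shows = {
    "Show A": {"manga", "comedy", "school life"},
    "Show B": {"manga", "comedy", "4-koma"},
    "Show C": {"novel", "fantasy", "action"},
}


def cosine_similarity(tags_a, tags_b):
    """Cosine similarity of two binary tag vectors, given as sets of tags."""
    if not tags_a or not tags_b:
        return 0.0  # a show with no tags is treated as similar to nothing
    shared = len(tags_a & tags_b)  # dot product of two 0/1 vectors
    return shared / sqrt(len(tags_a) * len(tags_b))


# Score every unordered pair of shows once.
pairwise = {
    (a, b): cosine_similarity(shows[a], shows[b])
    for a, b in combinations(sorted(shows), 2)
}

for pair, score in pairwise.items():
    print(pair, round(score, 3))
```

A score of 1 means two shows carry exactly the same tags and 0 means they share none. The clustering in step 3 would then work on these pairwise scores; it is left out here because the sketch does not try to guess which algorithm was used.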

You can verify my methodology in the code that I posted on GitHub. Did this procedure lead to any fruitful results? You decide.

Results

Anime Group 1
Members:  169
Group average ranking:  4.4
Group average voters:  649
Most common tags:
    1.    new                  (anime with this tag: 169)
    2.    science fiction      (anime with this tag: 50)
    3.    comedy               (anime with this tag: 38)
    4.    action               (anime with this tag: 36)
    5.    mecha                (anime with this tag: 35)
    6.    short episodes       (anime with this tag: 30)
    7.    seinen               (anime with this tag: 26)
    8.    fantasy              (anime with this tag: 22)
    9.    super power          (anime with this tag: 21)
   10.    shounen              (anime with this tag: 19)

Anime Group 2
Members:  129
Group average ranking:  3.38
Group average voters:  336
Most common tags:
    1.    game                 (anime with this tag: 72)
    2.    visual novel         (anime with this tag: 29)
    3.    bishounen            (anime with this tag: 18)
    4.    short episodes       (anime with this tag: 16)
    5.    science fiction      (anime with this tag: 16)
    6.    RPG                  (anime with this tag: 15)
    7.    action               (anime with this tag: 15)
    8.    shoujo               (anime with this tag: 15)
    9.    shounen              (anime with this tag: 14)
   10.    music                (anime with this tag: 14)

Anime Group 3
Members:  355
Group average ranking:  5.27
Group average voters:  792
Most common tags:
    1.    manga                (anime with this tag: 344)
    2.    comedy               (anime with this tag: 189)
    3.    shounen              (anime with this tag: 136)
    4.    seinen               (anime with this tag: 118)
    5.    daily life           (anime with this tag: 82)
    6.    short episodes       (anime with this tag: 77)
    7.    school life          (anime with this tag: 70)
    8.    action               (anime with this tag: 59)
    9.    4-koma               (anime with this tag: 55)
   10.    fantasy              (anime with this tag: 53)

Anime Group 4
Members:  195
Group average ranking:  4.93
Group average voters: 1353
Most common tags:
    1.    novel                (anime with this tag: 142)
    2.    fantasy              (anime with this tag: 107)
    3.    seinen               (anime with this tag: 103)
    4.    action               (anime with this tag: 93)
    5.    comedy               (anime with this tag: 82)
    6.    harem                (anime with this tag: 63)
    7.    school life          (anime with this tag: 56)
    8.    ecchi                (anime with this tag: 55)
    9.    romance              (anime with this tag: 46)
   10.    contemporary fantasy (anime with this tag: 42)
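
Per-group summaries like the ones above can be tallied with a few lines of code. This is only a sketch with assumed names: the record fields (`rating`, `votes`, `tags`), the `summarize` helper and the three toy shows are all hypothetical, not the real dataset.

```python
from collections import Counter
from statistics import mean

# Hypothetical toy group: every record and field name here is an assumption.
group = [
    {"title": "Show A", "rating": 5.0, "votes": 800, "tags": {"manga", "comedy"}},
    {"title": "Show B", "rating": 4.0, "votes": 600, "tags": {"manga", "comedy", "4-koma"}},
    {"title": "Show C", "rating": 6.0, "votes": 1000, "tags": {"manga", "action"}},
]


def summarize(shows, top_n=10):
    """Count members, average the rating and vote count, and rank the tags."""
    tag_counts = Counter(tag for show in shows for tag in show["tags"])
    return {
        "members": len(shows),
        "avg_rating": round(mean(show["rating"] for show in shows), 2),
        "avg_votes": round(mean(show["votes"] for show in shows)),
        "top_tags": tag_counts.most_common(top_n),
    }


summary = summarize(group)
print("Members:", summary["members"])
print("Group average rating:", summary["avg_rating"])
print("Group average voters:", summary["avg_votes"])
for rank, (tag, count) in enumerate(summary["top_tags"], start=1):
    print(f"{rank:>5}.    {tag:<20} (anime with this tag: {count})")
```

Note that tags with equal counts come out in no particular order, so the lower ranks of such a list should not be read too closely.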

A few general observations on my part:

  • It seems like Group 2 crowns itself with the source that traditionally has the worst reputation for adaptations: games.
  • The clustering algorithm was adamant about putting game adaptations, manga adaptations, original anime, and novel adaptations into separate groups. Does the source medium really pull in certain other genres that strongly, or is that more an artifact of the algorithm itself?
  • Group 3 looks the strongest by average ranking, but personally I’d take a closer look at Group 4 if I wanted to know what kind of anime people love watching (and voting on) the most these days.
  • Let it be known that distances between anime were derived only from which genre tags each show has or lacks, and nothing else.

There you have it! If somebody knows a good weighting scheme for binary features, I’d appreciate the tip immensely. As always, comments are appreciated.
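
One common candidate for such a weighting scheme is inverse document frequency (IDF), borrowed from text retrieval: a tag that appears on almost every show gets a weight near zero, while a rare tag counts for more. The sketch below is an assumption about what could be tried, not something used in this analysis; the toy shows and the helper names `idf_weights` and `weighted_cosine` are hypothetical.

```python
from collections import Counter
from math import log, sqrt

# Hypothetical toy data (not from AniDB): "manga" is on every show.
shows = {
    "Show A": {"manga", "comedy"},
    "Show B": {"manga", "4-koma"},
    "Show C": {"manga", "comedy", "mecha"},
}


def idf_weights(tagged_shows):
    """Weight each tag by log(N / number of shows carrying the tag)."""
    total = len(tagged_shows)
    doc_freq = Counter(tag for tags in tagged_shows.values() for tag in tags)
    return {tag: log(total / count) for tag, count in doc_freq.items()}


def weighted_cosine(tags_a, tags_b, weights):
    """Cosine similarity where each present tag contributes its IDF weight."""
    dot = sum(weights[tag] ** 2 for tag in tags_a & tags_b)
    norm_a = sqrt(sum(weights[tag] ** 2 for tag in tags_a))
    norm_b = sqrt(sum(weights[tag] ** 2 for tag in tags_b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # only ubiquitous tags: nothing left to compare
    return dot / (norm_a * norm_b)


weights = idf_weights(shows)
sim_ab = weighted_cosine(shows["Show A"], shows["Show B"], weights)
sim_ac = weighted_cosine(shows["Show A"], shows["Show C"], weights)
print(weights["manga"], sim_ab, round(sim_ac, 3))
```

With plain binary features, Show A and Show B would look half similar just because both are manga adaptations. With IDF weights the ubiquitous "manga" tag drops out, which might also keep source-medium tags from dominating the clusters.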
