If you haven’t read the post in question, I strongly suggest you do. To be honest, I really wanted some feedback on it, so I did the most shameful of acts and posted it on Reddit, which sparked a very lively and productive back-and-forth. It was a beautiful and surprising case of N Heads Are Better Than One.
In the process of defending my hypothesis and trying to learn new things from the original chart, I produced several other graphs to aid our understanding of the data. I should really thank the participants for their thoughtful and thought-provoking questions, which made me bother with these graphs to begin with.
We’ve learned that the most damaging effect across all the graphs is what we’ve dubbed the “recency” effect: newer titles will likely be rated higher on ANN and will only gradually settle on a more balanced score as more people submit their ratings. If you’ve ever followed ANN’s ratings, the effect should be evident without any visualization whatsoever. I personally believe that its influence shouldn’t last for more than a decade. What contributes to the downward trend of the Top-N graphs as we look further into the past is simply that older anime fans are more likely to check out, and only check out, the top contemporary productions, whereas newer anime fans tend not to watch older anime at all.
Another thing to note is the low sample sizes of earlier decades, including 80’s anime. 1988 was the big victim of the original graph because of them: the means for the late 70’s and early 80’s made it look as if 1988 was just barely competing with that period, when in fact the best titles of the year were remarkably well rated, among them Grave of the Fireflies, My Neighbor Totoro, and Legend of the Galactic Heroes. Later on, we wanted to prove that the late 70’s and early 80’s had good means because they had fewer bottom-tier anime than other periods, but the four graphs I’ve produced don’t necessarily support that claim, and the sampling issues persist. The only logical conclusion left to explain the graph is that the period has very few, if any, outstanding shows and very few crappy shows, but many shows slightly above the overall average.
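For anyone who wants to poke at the same questions, here is a minimal sketch of the kind of per-year summary discussed above: sample size, plain mean, Top-N mean, and the share of bottom-tier titles. It is not the code behind my graphs. The function name `yearly_summary`, the `bottom_cutoff` threshold, and the ratings are all made up for illustration and are not ANN data.

```python
from collections import defaultdict
from statistics import mean


def yearly_summary(titles, top_n=3, bottom_cutoff=6.0):
    """Summarize (year, rating) pairs per year.

    top_n and bottom_cutoff are illustrative assumptions, not values
    taken from the original analysis.
    """
    by_year = defaultdict(list)
    for year, rating in titles:
        by_year[year].append(rating)

    summary = {}
    for year, ratings in sorted(by_year.items()):
        ranked = sorted(ratings, reverse=True)  # best-rated titles first
        summary[year] = {
            "count": len(ranked),                # sample size for the year
            "mean": mean(ranked),                # what the original chart plots
            "top_n_mean": mean(ranked[:top_n]),  # average of the N best titles
            # fraction of titles rated below the (assumed) bottom-tier cutoff
            "bottom_share": sum(r < bottom_cutoff for r in ranked) / len(ranked),
        }
    return summary


# Made-up ratings, purely illustrative (NOT real ANN data).
sample = [
    (1980, 7.0), (1980, 7.5), (1980, 7.5),
    (1988, 9.0), (1988, 8.5), (1988, 5.0), (1988, 5.5),
]

for year, stats in yearly_summary(sample, top_n=2).items():
    print(year, stats)
```

In this toy data the 1980 mean comes out higher than the 1988 mean, even though 1988 has the better Top-2 titles. That is the same trap the original graph fell into, and with only three or four titles per year neither number deserves much trust.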
One of the safe conclusions is that the data points produced for 90’s anime, despite not being very remarkable, have the fewest issues. For other periods I’m very skeptical, either because of their sample sizes (earlier decades) or because they don’t follow the same paradigm as the rest (the last two decades). This exercise may have been a bit futile, but at the very least it confirmed that I seriously have to limit the data for the machine learning project to the past few years.
If you want to bounce around more ideas about the data, leave a comment here, reach me on Twitter, or send an e-mail.