The Anime Blog Anime Music Tournament to crown the best anime song of all time has been going on for a while over at http://animusictourney.wordpress.com/ and we’ve reached the Sweet Sixteen. (You should all check it out, and vote for the songs I like.) Over the course of the tournament, there has been grumbling in the comments to the effect of “people only voted for <song I didn’t like> over <song I like> because of nostalgia [or recency bias, or because they liked the show, or whatever]”. Sometimes this made sense to me, such as when “Connect” beat “Grey Wednesday”, which I can only attribute to people voting for Madoka Magica over Penguindrum rather than on the merits of the songs themselves. But it’s easy to fool yourself into thinking the songs you like lost because of bias while the songs you didn’t like lost because they just plain weren’t good songs. So I decided to sit down with the data and get some hard numbers.
There were three possible sources of bias I decided to examine: voting for the more recent song, voting for the song that came from a better show, or voting for a song you were familiar with over one you weren’t familiar with. Out of the 240 matches that have been completed in the Animusic Tourney so far:
*In 119 matches the newer song won, while in 119 matches the older song won. (In two matches there was no age difference.)
*In 135 matches the song whose show was rated higher on myanimelist.net won, while in 101 matches the lower-rated show’s song won. (In four matches they were rated the same.)
*In 132 matches the song whose show had more viewers on MAL won, while in 107 matches the song whose show had fewer viewers won. (In one match they had the same number. This was the “Kimi no Shiranai Monogatari” vs. “Staple Stable” matchup.)
So clearly there’s no question of this data showing any recency or nostalgia bias. Forget no statistically significant difference, there’s no difference at all. The other two criteria are a different story. For show rating, it seems clear that people are to at least some extent voting for shows instead of songs: we would only have a 3% chance of seeing results this extreme if the show’s rating were entirely unconnected to the song’s performance in the tournament. The impact of familiarity is less clear. We would have a 12% chance of seeing results this extreme by pure chance, and to some extent familiarity is probably “piggybacking” off show quality, because people are more likely to watch a show if they hear it’s good.
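For anyone who wants to check percentages like these, here is a minimal sketch. It assumes a two-sided exact binomial test (a sign test) with the tied matches dropped; the post doesn't say which test was actually used, so treat the function name `sign_test_p` and the choice of test as my assumptions rather than the method behind the numbers above.

```python
from math import comb

def sign_test_p(wins_a, wins_b):
    """Two-sided exact binomial p-value for a wins_a/wins_b split,
    assuming each match is a fair coin flip (ties already dropped)."""
    n = wins_a + wins_b
    k = max(wins_a, wins_b)
    # Probability of k or more "heads" in n fair flips.
    upper_tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    # The fair-coin distribution is symmetric, so double the tail (capped at 1).
    return min(1.0, 2 * upper_tail)

# Win/loss splits taken from the three bullet points above.
splits = {
    "newer vs. older song": (119, 119),
    "higher vs. lower MAL rating": (135, 101),
    "more vs. fewer MAL viewers": (132, 107),
}

for label, (a, b) in splits.items():
    print(f"{label}: p = {sign_test_p(a, b):.2f}")
```

A small p-value here only says the split would be unusual if the criterion had nothing to do with who wins. It doesn't say why the split happened.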
Just because there is an effect doesn’t necessarily mean that there’s an evil bias at work. There could legitimately be a connection between the quality of a show and the quality of the show’s music. For example, one of the things that makes a show good is if it has good songs. Would Suzumiya Haruhi no Yuuutsu have been as beloved without its incredible concert scene? (The recent concert scene in White Album was unimpressive in part because its songs weren’t as amazing as “God Knows”.) It’s nice at least to have some hard numbers to think about though. And hopefully this will get people to stop whining about “recency bias” and “nostalgia bias”.
EDIT: Based on the comments I decided to test a few more hypotheses. First, there was some indication that there could be a generation gap. I set the cutoff between the old and the new at 2006, since that was when HD anime started airing – it makes a pretty good break between “old” and “new” anime. There were 53 cases of an “old” anime beating a “new” anime, and 60 cases of the reverse. This suggests a possible bias toward the newer generation of anime, but a result like that could show up 57% of the time due to pure chance, so it’s hard to draw conclusions.
Second, there was a suggestion to look at only songs that were a certain distance apart in time, because a song from fall of 2005 beating a song from spring of 2006 doesn’t tell you much about nostalgia bias. This makes sense to me. Limiting the comparison to songs with at least 3 years difference has the newer song winning 85 matches compared to the older song winning 76 matches. This is a bit in favor of the newer song, but again, the odds of such a result happening by pure chance are 42% so who knows.
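The two splits in this edit (by era, and by minimum age gap) can be sketched roughly as follows. The record layout, the function names `era_split` and `gap_split`, and the sample rows are all made up for illustration and are not the tournament data. I'm also assuming that 2006 itself counts as "new".

```python
# Hypothetical records: (winner_year, loser_year) for each match.
# These rows are invented for illustration only.
matches = [
    (2011, 2011),
    (2006, 2005),
    (1995, 2009),
    (2012, 1998),
    (2003, 2010),
    (2009, 2002),
]

def era_split(matches, cutoff=2006):
    """Among old-vs-new matchups, return (old beat new, new beat old).
    Assumes years >= cutoff are "new" and years < cutoff are "old"."""
    old_wins = sum(1 for w, l in matches if w < cutoff <= l)
    new_wins = sum(1 for w, l in matches if l < cutoff <= w)
    return old_wins, new_wins

def gap_split(matches, min_gap=3):
    """Among matchups at least min_gap years apart,
    return (newer song won, older song won)."""
    newer_wins = sum(1 for w, l in matches if w - l >= min_gap)
    older_wins = sum(1 for w, l in matches if l - w >= min_gap)
    return newer_wins, older_wins

print(era_split(matches))
print(gap_split(matches))
```

Matches inside a single era, or closer together than the minimum gap, drop out of both counts, which is why the totals in this edit are smaller than 240.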
Data is here if you want to play with it yourself. I corrected a couple errors but they don’t materially change the conclusions.
Not convincing to be honest. A simple “newer vs older” test is pretty much worthless when the difference is generational. It’s easy to say newer songs win as many matches as older songs when people consider shows like Bakemonogatari, Macross Frontier and Aquarion as new, while they are probably older than most shows being represented.
So the additional analysis needed: dice the new v. old into “Between matchups where show is X years old, in how many instances did the new show win and how many did the old show win.”
I’m also curious what the median age difference is between the anime in all matchups to date, since within the old v. new debate, results between anime that are only 1 year apart are largely noise when trying to arrive at some definitive conclusion.
Right, and a better model would also take into account the number of months or seasons from the present, because I think “recency bias” also applies to distinguishing between new songs and newer songs, with some kind of tapering point after 1 year or some such? At least that hypothesis would make logical sense and probably have some evidence behind it, or we could see it.
The result data, both what you are saying and what this post has quoted, should also be compared to the overall picture, because the late-night anime landscape changed drastically between 1998 and 2007.
And if you want more context, the anison industry has also changed drastically since 2006-07 (e.g., Lantis being bought by what is now Bamco, anisama being established, etc.). And I wouldn’t be surprised to see data falling in line with that period being the cutoff between two different eras.
I looked at a couple more things and didn’t find any conclusive results. You’re welcome to try more stuff.
So the takeaway is that show bias may very well be the biggest factor. No surprise there, I guess, especially after one particular comment that showed up on the Tsuki no Mayu/Hacking to the Gate matchup that I whined about so much.
Also, any chance that I can take a peek at the spreadsheet to see the analysis and stat-sig differences calculations? :>
Yes absolutely. Edited the post.
Thanks for crunching the numbers. It’s nice to see whether these theories hold up. There is certainly a lot less impact than I would have guessed (though I find it hard to consider songs from 2000-2006 as old).
Have you considered that having good music is one of the factors in a show being good? It seems foolish to assume we’re judging the song based on the show and neglecting any of the reverse effect.
I mention that possibility in paragraph 5. Unfortunately I don’t see any good way of correcting for it.