Effectiveness of interval histogram Euclidean distance for predicting tune similarity

January 11, 2014
FolkTuneFinder
Music

In an attempt to quickly find near-exact melodic duplicates (give or take a note or two) for the folktunefinder.com algorithm, I tried comparing tunes by the Euclidean distance between their interval histograms.

I produced each histogram by bucketing intervals (i.e. the number of semitones between consecutive notes) into 30 buckets (15 semitones up or down) and normalising to 1.0, stored as 32-bit floats.
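As a rough illustration of the idea, here is a minimal Python sketch. It is not the code used on the site. It rests on several assumptions of mine: tunes are given as lists of MIDI-style pitch numbers, intervals are clamped to the range -15 to +14 so that they fit exactly 30 buckets, and "normalising to 1.0" means the buckets sum to 1.0. The names `interval_histogram` and `histogram_distance` are made up for this example, and plain Python floats are 64-bit rather than 32-bit.

```python
import math

NUM_BUCKETS = 30          # bucket count from the post
HALF = NUM_BUCKETS // 2   # 15 semitones in either direction


def interval_histogram(pitches):
    """Return a normalised histogram of intervals between consecutive notes.

    Assumption: pitches are integers in semitones (e.g. MIDI note numbers).
    Assumption: intervals are clamped to [-15, 14] to give 30 bucket indices.
    """
    counts = [0] * NUM_BUCKETS
    for prev, curr in zip(pitches, pitches[1:]):
        interval = curr - prev
        clamped = max(-HALF, min(HALF - 1, interval))
        counts[clamped + HALF] += 1

    total = sum(counts)
    if total == 0:
        # Fewer than two notes: no intervals to count.
        return [0.0] * NUM_BUCKETS
    # Assumption: "normalising to 1.0" means the buckets sum to 1.0.
    return [count / total for count in counts]


def histogram_distance(hist_a, hist_b):
    """Euclidean distance between two equal-length histograms."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(hist_a, hist_b)))


tune = [60, 62, 64, 65, 67, 65, 64, 62, 60]
transposed = [pitch + 5 for pitch in tune]         # same tune in another key
variant = [60, 62, 64, 65, 67, 65, 64, 62, 62]     # last note changed

same_key_free = histogram_distance(interval_histogram(tune),
                                   interval_histogram(transposed))
near_duplicate = histogram_distance(interval_histogram(tune),
                                    interval_histogram(variant))
```

Because only intervals are counted, a transposed copy of a tune gets an identical histogram, while a one-note variant lands a small distance away. Note that the histogram discards note order, so two different tunes can in principle share a histogram, which is why this only works as a first pass.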

This method is acceptably fast and surprisingly accurate: 98.4% over about 40,000 detections. It comes highly recommended as a first-order comparison.

I think that its suitability is largely dependent on the corpus. I’m doubly surprised, therefore, as the corpus in question is folk tunes, which share very common idioms, so I would have expected many unrelated tunes to end up with similar interval histograms.