recommendations.py
critics = {
    'Lisa Rose': {'Lady in the Water': 2.5, 'Snakes on a Plane': 3.5,
                  'Just My Luck': 3.0, 'Superman Returns': 3.5,
                  'You, Me and Dupree': 2.5, 'The Night Listener': 3.0},
    'Gene Seymour': {'Lady in the Water': 3.0, 'Snakes on a Plane': 3.5,
                     'Just My Luck': 1.5, 'Superman Returns': 5.0,
                     'You, Me and Dupree': 3.5, 'The Night Listener': 3.0},
    'Michael Phillips': {'Lady in the Water': 2.5, 'Snakes on a Plane': 3.0,
                         'Superman Returns': 3.5, 'The Night Listener': 4.0},
    'Claudia Puig': {'Snakes on a Plane': 3.5, 'Just My Luck': 3.0,
                     'Superman Returns': 4.0, 'You, Me and Dupree': 2.5,
                     'The Night Listener': 4.5},
    'Mick LaSalle': {'Lady in the Water': 3.0, 'Snakes on a Plane': 4.0,
                     'Just My Luck': 2.0, 'Superman Returns': 3.0,
                     'You, Me and Dupree': 2.0, 'The Night Listener': 3.0},
    'Jack Matthews': {'Lady in the Water': 3.0, 'Snakes on a Plane': 4.0,
                      'Superman Returns': 3.0, 'You, Me and Dupree': 3.5,
                      'The Night Listener': 3.0},
    'Toby': {'Snakes on a Plane': 4.5, 'Superman Returns': 4.0,
             'You, Me and Dupree': 1.0},
}
EDS.py
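from math import sqrt

from recommendations import critics


def sim_distance(prefs, person1, person2):
    # Collect the items that both people have rated.
    si = {}
    for item in prefs[person1]:
        if item in prefs[person2]:
            si[item] = 1

    # Nothing in common: treat the similarity as 0.
    if len(si) == 0:
        return 0

    # Sum the squared rating differences over the shared items.
    sum_of_squares = sum(pow(prefs[person1][item] - prefs[person2][item], 2)
                         for item in si)

    # Map the distance D to 1 / (1 + D); identical ratings give 1.
    return 1 / (1 + sqrt(sum_of_squares))


print(sim_distance(critics, 'Lisa Rose', 'Gene Seymour'))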
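The basic idea is what the name (Euclidean distance) suggests: compute the distance between two points in a coordinate system.
Take points P1(X1, Y1) and P2(X2, Y2), and let SQRT denote the square root and POW denote the square.
The distance between the two points is then D = SQRT( POW(X1-X2) + POW(Y1-Y2) )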
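The formula can be sketched in Python roughly as follows. The function name `distance` and the sample points are made up for illustration and do not come from EDS.py.

```python
from math import sqrt


def distance(p1, p2):
    # p1 and p2 are (X, Y) tuples; pow(..., 2) plays the role of POW above.
    return sqrt(pow(p1[0] - p2[0], 2) + pow(p1[1] - p2[1], 2))


# Hypothetical sample points forming a 3-4-5 right triangle.
d = distance((1, 2), (4, 6))
print(d)
```

With these two points the differences are 3 and 4, so D comes out as 5.0.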
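In this example, X and Y are the ratings given to two movies. The smaller the distance D, the more similar the two people are. sim_distance returns 1 / (1 + D), so a score closer to 1 means the two people are more alike.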
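As a small illustration of this, the sketch below places a few critics on two axes, X = 'Snakes on a Plane' and Y = 'You, Me and Dupree', and measures how far each one is from Toby. The ratings are copied from the critics dictionary; the choice of these two movies and the names `ratings`, `distances` and `closest` are assumptions made only for this example.

```python
from math import sqrt

# (X, Y) = ratings for ('Snakes on a Plane', 'You, Me and Dupree'),
# copied from the critics dictionary in recommendations.py.
ratings = {
    'Toby': (4.5, 1.0),
    'Mick LaSalle': (4.0, 2.0),
    'Lisa Rose': (3.5, 2.5),
    'Gene Seymour': (3.5, 3.5),
}


def distance(p1, p2):
    # Straight-line distance between two (X, Y) points.
    return sqrt(pow(p1[0] - p2[0], 2) + pow(p1[1] - p2[1], 2))


# Distance from Toby to each of the other critics.
distances = {name: distance(ratings['Toby'], point)
             for name, point in ratings.items() if name != 'Toby'}

# The critic with the smallest distance is the one most similar to Toby.
closest = min(distances, key=distances.get)

for name in sorted(distances, key=distances.get):
    d = distances[name]
    print('%-13s D = %.3f  1/(1+D) = %.3f' % (name, d, 1 / (1 + d)))
```

On these two movies Mick LaSalle is the closest to Toby (D is about 1.118, giving a score of about 0.472), while Gene Seymour is the farthest of the three.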