• 分级聚类算法


      分级聚类算法以一组对应于原始数据项的聚类开始。函数的主循环部分会尝试每一组可能的配对并计算他们的相关度,以此来找出最佳配对。最佳配对的两个聚类会被合并成一个新的聚类。新生成的聚类中所包含的数据,等于将两个旧聚类的数据求均值之后得到的结果。循环下去,一直到只剩下一个聚类为止。

    python实现代码:

    def hcluster(rows,distance=pearson):
      distances={}
      currentclustid=-1
    
      # Clusters are initially just the rows
      clust=[bicluster(rows[i],id=i) for i in range(len(rows))]
    
      while len(clust)>1:
        lowestpair=(0,1)
        closest=distance(clust[0].vec,clust[1].vec)
        print "closest",closest
        # loop through every pair looking for the smallest distance
        for i in range(len(clust)):
          for j in range(i+1,len(clust)):
            # distances is the cache of distance calculations
            if (clust[i].id,clust[j].id) not in distances: 
              distances[(clust[i].id,clust[j].id)]=distance(clust[i].vec,clust[j].vec)
    
            d=distances[(clust[i].id,clust[j].id)]
    
            if d<closest:
              closest=d
              lowestpair=(i,j)
    
        # calculate the average of the two clusters
        mergevec=[
        (clust[lowestpair[0]].vec[i]+clust[lowestpair[1]].vec[i])/2.0 
        for i in range(len(clust[0].vec))]
    
        # create the new cluster
        newcluster=bicluster(mergevec,left=clust[lowestpair[0]],
                             right=clust[lowestpair[1]],
                             distance=closest,id=currentclustid)
    
        # cluster ids that weren't in the original set are negative
        currentclustid-=1
        del clust[lowestpair[1]]
        del clust[lowestpair[0]]
        clust.append(newcluster)
    
      return clust[0]
  • 相关阅读:
    Javascript DOM 编程艺术读书笔记16/03/25
    2014 Multi-University Contest 1.1 hdu4861 打表找规律
    汇编小记16/3/23
    汇编小记16/3/22
    hdoj 4940 强连通图
    Head FIRST HTML & CSS 16/03/17
    html&css一些有用的网站整理
    dosbox+debug 模拟dos
    汇编小记16/3/15
    解决windwos另存为,保存文件时无法选择“桌面”文件夹
  • 原文地址:https://www.cnblogs.com/huanhuanang/p/5242784.html
Copyright © 2020-2023  润新知