• 加权中位数


    问题描述为: 一个无序的数列,每个数有其对应的权重,权重为非负整数,代表数列中的数字出现的次数。要求找出这一无序数列中的中位数。

    1. 直接解法,先对该数列和权重排序。然后找出累计权重为中位数的数字。 时间复杂度为排序的 O(nlog(n)+n)

     2 import numpy as np
     3 
     4 def weighted_median(data, weights):
     5     """
     6     Args:
     7       data (list or numpy.array): data
     8       weights (list or numpy.array): weights
     9     """
    10     data, weights = np.array(data).squeeze(), np.array(weights).squeeze()
    11     s_data, s_weights = map(np.array, zip(*sorted(zip(data, weights))))
    12     midpoint = 0.5 * sum(s_weights)
    13     if any(weights > midpoint):
    14         w_median = (data[weights == np.max(weights)])[0]
    15     else:
    16         cs_weights = np.cumsum(s_weights)
    17         idx = np.where(cs_weights <= midpoint)[0][-1]
    18         if cs_weights[idx] == midpoint:
    19             w_median = np.mean(s_data[idx:idx+2])
    20         else:
    21             w_median = s_data[idx+1]
    22     return w_median
    23 
    24 def test_weighted_median():
    25     print("hello, world")
    26     data = [
    27         [7, 1, 2, 4, 10],
    28         [7, 1, 2, 4, 10],
    29         [7, 1, 2, 4, 10, 15],
    30         [1, 2, 4, 7, 10, 15],
    31         [0, 10, 20, 30],
    32         [1, 2, 3, 4, 5],
    33         [30, 40, 50, 60, 35],
    34         [2, 0.6, 1.3, 0.3, 0.3, 1.7, 0.7, 1.7, 0.4],
    35     ]
    36     weights = [
    37         [1, 1/3, 1/3, 1/3, 1],
    38         [1, 1, 1, 1, 1],
    39         [1, 1/3, 1/3, 1/3, 1, 1],
    40         [1/3, 1/3, 1/3, 1, 1, 1],
    41         [30, 191, 9, 0],
    42         [10, 1, 1, 1, 9],
    43         [1, 3, 5, 4, 2],
    44         [2, 2, 0, 1, 2, 2, 1, 6, 0],
    45     ]
    46     answers = [7, 4, 8.5, 8.5, 10, 2.5, 50, 1.7]
    47     for datum, weight, answer in zip(data, weights, answers):
    48         assert(weighted_median(datum, weight) == answer)
    49 
    50 if __name__ == "__main__":
    51     test_weighted_median()

     2. 按照快速排序的思路,先找到一个数字,然后 按照该数字将数列划分成左右两段,根据左右两段的权重之和,递归调用左半侧或者右半侧数列。

  • 相关阅读:
    SVN安装
    MS SQL Server 查询元数据
    MS SQL Server 时间函数
    Codeforces 666E Forensic Examination (后缀自动机 + 线段树合并)
    2020牛客暑期多校训练营(第八场) H Hard String Problem
    2020牛客暑期多校训练营(第四场) B Count New String
    2020牛客暑期多校训练营(第四场) A Ancient Distance
    2020杭电多校第三场 1007 Tokitsukaze and Rescue
    codeforces 1037H
    2020牛客暑期多校训练营(第三场) H Sort the Strings Revision
  • 原文地址:https://www.cnblogs.com/cofludy/p/10655030.html
Copyright © 2020-2023  润新知