Find Median from Data Stream
要点:
- 基本框架:两个heap:large,small把所有数二分。一个新的element。目标:维持heap中的元素个数相同。错误理解:新元素不是任意可以push到large里的,其值可能应该属于small。所以要2步push/pop,之后判断大小。
- large表示大的那边,small表示小的那边。固定:large可以多一个元素
- => 新元素:先push large,然后从large中pop到small。所以small是增加的。如果small和large原来一样,需要重新把中间元素push回large。这样可以不用考虑奇偶在branch,好记。
- 但是仍然要记住even/odd的关系:
- odd/even表示当前heap里的,不包括新element
- 开始是even,所以small里面push再到pop到large里,odd相反。
- 都用minHeap,只需要pop之后取反push到另一个即可。而large的minHeap是自然的。
- -2^31 negate还是-2^31,所以用long
- 注意self.small中的元素是反的,所以要减去
# Median is the middle value in an ordered integer list. If the size of the list is even, there is no middle value. So the median is the mean of the two middle value.
# Examples:
# [2,3,4] , the median is 3
# [2,3], the median is (2 + 3) / 2 = 2.5
# Design a data structure that supports the following two operations:
# void addNum(int num) - Add a integer number from the data stream to the data structure.
# double findMedian() - Return the median of all elements so far.
# For example:
# add(1)
# add(2)
# findMedian() -> 1.5
# add(3)
# findMedian() -> 2
# Credits:
# Special thanks to @Louis1992 for adding this problem and creating all test cases.
# Hide Company Tags Google
# Hide Tags Heap Design
from heapq import heappush, heappop
class MedianFinder:
def __init__(self):
"""
Initialize your data structure here.
"""
self.large = []
self.small = []
def addNum(self, num):
"""
Adds a num into the data structure.
:type num: int
:rtype: void
"""
heappush(self.large, num)
heappush(self.small, -heappop(self.large))
if len(self.large)<len(self.small):
heappush(self.large, -heappop(self.small))
def findMedian(self):
"""
Returns the median of current data stream
:rtype: float
"""
if (len(self.small)+len(self.large)) & 1:
return self.large[0]/1.0
else:
return (self.large[0]-self.small[0])/2.0 # error: should be - not + b/c in small is negative
# Your MedianFinder object will be instantiated and called as such:
# mf = MedianFinder()
# mf.addNum(1)
# mf.findMedian()