Data Stream Median

题目描述

解题方法

基本思想就是维护两个heap

  • min_heap: 存比当前median大的值
  • max_heap: 存比当前median小的值

heap是便于快速求当前的median

max_heap的size 总是与 min_heap的size相等(even)或者大一个(odd),其实反过来 也行,在求median的时候也要根据奇偶来求解

Solution

from heapq import *

class MedianFinder:
    def __init__(self):
        """
        Initialize your data structure here.
        """
        self.min_heap = []
        self.max_heap = []
        self.median = None

    def addNum(self, num):
        """
        Adds a num into the data structure.
        :type num: int
        :rtype: void
        """
        if not self.median:
            self.median = num
            heappush(self.max_heap, -num)
        else:
            # 根据与当前median的大小放到堆中
            if num > self.median:
                heappush(self.min_heap, num)
            else:
                heappush(self.max_heap, -num)

        # min_heap的size应该小于等于max_heap的size
        while len(self.min_heap) > len(self.max_heap):
            val = heappop(self.min_heap)
            heappush(self.max_heap, -val)

        # max_heap的size应该小于等于min_heap + 1
        while len(self.max_heap) > len(self.min_heap) + 1:
            val = heappop(self.max_heap)
            heappush(self.min_heap, -val)

        if len(self.min_heap) == len(self.max_heap):
            # 除法应该注意是否要用double,跟面试官沟通好
            self.median = (self.min_heap[0] - self.max_heap[0]) / 2.0
        else:
            self.median = -self.max_heap[0]

    def findMedian(self):
        """
        Returns the median of current data stream
        :rtype: float
        """
        return self.median

Reference