We use cookies (including Google cookies) to personalize ads and analyze traffic. By continuing to use our site, you accept our Privacy Policy.

Find Median from Data Stream

Difficulty: Hard


Problem Description

Implement the MedianFinder class that supports adding numbers to a data stream and retrieving the median of all added numbers efficiently.


Key Insights

  • The median is the middle value of a sorted list; if the list size is even, it's the average of the two middle values.
  • We can maintain two heaps (a max-heap for the lower half and a min-heap for the upper half) to efficiently retrieve the median.
  • The max-heap will contain the smaller half of the numbers, while the min-heap will contain the larger half.
  • Balancing the heaps allows us to quickly calculate the median based on their sizes.

Space and Time Complexity

Time Complexity: O(log n) for addNum, O(1) for findMedian
Space Complexity: O(n) for storing the numbers in heaps


Solution

To solve this problem, we utilize two heaps:

  1. A max-heap to store the lower half of the numbers.
  2. A min-heap to store the upper half of the numbers.

When adding a new number, we determine which heap it belongs to and then balance the heaps if necessary. The median can then be found based on the sizes of the heaps:

  • If both heaps are of equal size, the median is the average of the tops of both heaps.
  • If the max-heap has one more element, the median is the top of the max-heap.

Code Solutions

import heapq

class MedianFinder:
    def __init__(self):
        # Max heap for the lower half
        self.lower_half = []
        # Min heap for the upper half
        self.upper_half = []

    def addNum(self, num: int) -> None:
        # Add to max heap (lower half)
        heapq.heappush(self.lower_half, -num)
        
        # Ensure max of lower half is <= min of upper half
        if (self.lower_half and self.upper_half and 
            (-self.lower_half[0] > self.upper_half[0])):
            heapq.heappush(self.upper_half, -heapq.heappop(self.lower_half))
        
        # Balance the sizes of the heaps
        if len(self.lower_half) > len(self.upper_half) + 1:
            heapq.heappush(self.upper_half, -heapq.heappop(self.lower_half))
        elif len(self.upper_half) > len(self.lower_half):
            heapq.heappush(self.lower_half, -heapq.heappop(self.upper_half))

    def findMedian(self) -> float:
        if len(self.lower_half) > len(self.upper_half):
            return float(-self.lower_half[0])
        return (-self.lower_half[0] + self.upper_half[0]) / 2.0
← Back to All Questions