We use cookies (including Google cookies) to personalize ads and analyze traffic. By continuing to use our site, you accept our Privacy Policy.

Find Pattern in Infinite Stream I

Number: 3333

Difficulty: Medium

Paid? Yes

Companies: Uber


Problem Description

Given an infinite binary stream (accessed via a next() call) and a binary pattern, return the first starting index where the pattern is found in the stream.


Key Insights

  • Use a sliding window of size equal to the pattern’s length.
  • Calculate a rolling hash for the current window to quickly check for potential matches.
  • Compare the computed rolling hash against the precomputed hash of the pattern.
  • When the hashes match, perform an element-wise verification to avoid false positives due to hash collisions.
  • The algorithm terminates once the pattern is found within the first 10^5 bits of the stream.

Space and Time Complexity

Time Complexity: O(n) in the worst case, where n is the number of bits read until a match is found (bounded by 10^5).
Space Complexity: O(m), where m is the length of the pattern.


Solution

The solution leverages a rolling hash with a sliding window.

  1. Precompute the hash for the pattern using a chosen base (2, since data is binary) and a large prime modulus. Also compute base^(m-1) modulo the modulus to help in removing the leftmost digit’s contribution from the rolling hash.
  2. As bits are read from the stream one-by-one, maintain a rolling window of the last m bits and its corresponding hash.
  3. Once the window has m bits, compare the rolling hash with the pattern’s hash. If they match, compare the bits in the current window with the pattern for confirmation.
  4. Slide the window by removing the oldest bit and adding the new bit; update the rolling hash accordingly.
  5. Return the starting index of the window as soon as the pattern is confirmed.

Code Solutions

# Python solution with detailed comments.
def find_pattern_in_stream(pattern, stream):
    # Length of the pattern
    m = len(pattern)
    base = 2
    mod = 10**9 + 7  # A large prime to reduce hash collision chances

    # Precompute the hash of the pattern and the factor for the leftmost digit.
    pattern_hash = 0
    power = 1  # base^(m-1) modulo mod
    for i in range(m):
        pattern_hash = (pattern_hash * base + pattern[i]) % mod
        if i < m - 1:
            power = (power * base) % mod

    # Variables for rolling hash and window
    window_hash = 0
    window = []  # List to store current window bits
    index = 0  # Overall index in the stream

    while True:
        # Read the next bit from the infinite stream.
        bit = stream.next()
        window.append(bit)
        # Update the rolling hash by incorporating the new bit.
        window_hash = (window_hash * base + bit) % mod

        # If the window exceeds the size of the pattern, remove the leftmost bit.
        if len(window) > m:
            left_bit = window.pop(0)
            window_hash = (window_hash - left_bit * power) % mod
            if window_hash < 0:
                window_hash += mod
        
        # Once the window size equals the pattern length, check for a match.
        if len(window) == m:
            start_index = index - m + 1
            if window_hash == pattern_hash:
                if window == pattern:
                    return start_index
        index += 1

# Example simulation:
# class InfiniteStream:
#     def __init__(self, data):
#         self.data = data
#         self.pos = 0
#     def next(self):
#         bit = self.data[self.pos]
#         self.pos = (self.pos + 1) % len(self.data) # For continuous simulation.
#         return bit
#
# stream = InfiniteStream([1,1,1,0,1,1,1])
# pattern = [0,1]
# print(find_pattern_in_stream(pattern, stream))  # Expected output: 3
← Back to All Questions