We use cookies (including Google cookies) to personalize ads and analyze traffic. By continuing to use our site, you accept our Privacy Policy.

Design Search Autocomplete System

Number: 642

Difficulty: Hard

Paid? Yes

Companies: Pinterest, Microsoft, Amazon, Bloomberg, Google, Uber, Apple, Citadel, TikTok, Remitly, Meta


Problem Description

Design and implement a search autocomplete system for a search engine. The system takes a set of historical sentences with their corresponding occurrence counts. As the user types character by character, the system suggests the top three historical sentences that share the same prefix as the current input. When the user inputs the special character '#' it signifies completion of the sentence; it is then stored in the system, and an empty list is returned.


Key Insights

  • Use a Trie to efficiently store and retrieve sentences based on their prefixes.
  • Maintain a mapping at each Trie node to record sentences (or their frequencies) that pass through that node.
  • Use sorting or a min-heap to select the top three sentences based on their frequency (hot degree) and lexicographical order if frequencies tie.
  • On receiving '#', update the Trie with the current sentence and reset the current search state.

Space and Time Complexity

Time Complexity: O(L * log k) per character input query, where L is the length of the current prefix and k is 3 (constant factor), thus ensuring efficient autocompletion. Space Complexity: O(N * L), where N is the number of unique sentences and L is the average sentence length, for maintaining the Trie.


Solution

The solution revolves around building a Trie where each node contains:

  • A mapping to its child nodes.
  • A hash map that records all sentences passing through the node along with their occurrence counts.

For each character entered (except '#'), the system:

  1. Traverses the Trie to find the current prefix node.
  2. Retrieves stored sentences at that node and sorts them based on the following criteria: descending by frequency and ascending lexicographically if frequencies are equal.
  3. Returns the top three sentences from the sorted result.

When the user inputs '#', the current sentence (tracked as user input so far) is added or its frequency updated in the Trie, and the input state resets.


Code Solutions

Below are implementations in Python, JavaScript, C++, and Java with clear line-by-line comments.

class TrieNode:
    def __init__(self):
        # Mapping from character to TrieNode
        self.children = {}
        # Dictionary mapping sentence to its frequency count
        self.counts = {}

class AutocompleteSystem:
    def __init__(self, sentences, times):
        self.root = TrieNode()
        self.current_node = self.root
        self.current_input = ""
        # Initialize Trie with the given sentences and their times
        for sentence, time in zip(sentences, times):
            self.addSentence(sentence, time)

    def addSentence(self, sentence, freq):
        node = self.root
        for char in sentence:
            if char not in node.children:
                node.children[char] = TrieNode()
            node = node.children[char]
            # Update sentence frequency at each node
            node.counts[sentence] = node.counts.get(sentence, 0) + freq

    def input(self, c):
        results = []
        if c == '#':
            # End of input: add current sentence to Trie
            self.addSentence(self.current_input, 1)
            # Reset the current input and traversal pointer
            self.current_input = ""
            self.current_node = self.root
            return results

        self.current_input += c
        # Move to the next Trie node if it exists
        if self.current_node:
            self.current_node = self.current_node.children.get(c, None)
        if not self.current_node:
            return results

        # Retrieve sentences and sort them by frequency (descending) and lexicographical order (ascending)
        data = list(self.current_node.counts.items())
        data.sort(key=lambda x: (-x[1], x[0]))
        for i in range(min(3, len(data))):
            results.append(data[i][0])
        return results

# Example usage:
# auto = AutocompleteSystem(["i love you", "island", "iroman", "i love leetcode"], [5, 3, 2, 2])
# print(auto.input("i"))
# print(auto.input(" "))
# print(auto.input("a"))
# print(auto.input("#"))
← Back to All Questions