We use cookies (including Google cookies) to personalize ads and analyze traffic. By continuing to use our site, you accept our Privacy Policy.

Most Common Word

Difficulty: Easy


Problem Description

Given a string paragraph and a string array of the banned words banned, return the most frequent word that is not banned. It is guaranteed there is at least one word that is not banned, and that the answer is unique. The words in paragraph are case-insensitive and the answer should be returned in lowercase.


Key Insights

  • Words are separated by spaces and can have punctuation, which should be ignored.
  • The solution should handle case insensitivity by converting all words to lowercase.
  • A hash table (dictionary) can be used to count the occurrences of each word.
  • The banned words should be stored in a set for O(1) look-up time.

Space and Time Complexity

Time Complexity: O(n) where n is the number of words in the paragraph, since each word is processed once. Space Complexity: O(m) where m is the number of unique words in the paragraph excluding banned words.


Solution

To solve the problem, we can follow these steps:

  1. Normalize the paragraph by converting it to lowercase and replacing punctuation with spaces.
  2. Split the cleaned string into words.
  3. Use a hash table (dictionary) to count occurrences of each word that is not in the banned set.
  4. Iterate through the hash table to find the word with the maximum count, which is our result.

Code Solutions

def mostCommonWord(paragraph, banned):
    import re
    from collections import Counter
    
    # Normalize the paragraph: lowercase and replace punctuation with spaces
    paragraph = re.sub(r'[!?\'.,;]', ' ', paragraph).lower()
    words = paragraph.split()
    
    # Create a set for banned words for O(1) look-up
    banned_set = set(banned)
    
    # Count occurrences of each word that is not banned
    word_count = Counter(word for word in words if word not in banned_set)
    
    # Find the most common word
    return word_count.most_common(1)[0][0]
← Back to All Questions