We use cookies (including Google cookies) to personalize ads and analyze traffic. By continuing to use our site, you accept our Privacy Policy.

Short Encoding of Words

Difficulty: Medium


Problem Description

Given an array of words, return the length of the shortest reference string possible for any valid encoding of the words. A valid encoding ends with a '#' character and contains substrings that match each word in the array.


Key Insights

  • The reference string must end with a '#' character.
  • Overlapping words can be encoded more efficiently by taking advantage of suffixes.
  • The key is to find unique suffixes of each word that do not match any other word.

Space and Time Complexity

Time Complexity: O(N * M), where N is the number of words and M is the maximum length of a word. Space Complexity: O(N), for storing the set of words.


Solution

To solve the problem, we can utilize a set to store the words. For each word, we check if it is a suffix of any other word in the set. If it is not a suffix, we add its length plus one (for the '#' character) to the total length of the reference string.


Code Solutions

def minimumLengthEncoding(words):
    # Create a set to store the words
    word_set = set(words)
    total_length = 0
    
    # Iterate through each word in the list
    for word in words:
        # Check if any suffix of the word is in the set
        for i in range(1, len(word)):
            if word[i:] in word_set:
                # If it is, we can skip adding this word
                break
        else:
            # If no suffix was found, add the length of the word plus 1 for '#'
            total_length += len(word) + 1
            
    return total_length
← Back to All Questions