We use cookies (including Google cookies) to personalize ads and analyze traffic. By continuing to use our site, you accept our Privacy Policy.

Extra Characters in a String

Difficulty: Medium


Problem Description

You are given a 0-indexed string s and a dictionary of words dictionary. You have to break s into one or more non-overlapping substrings such that each substring is present in dictionary. There may be some extra characters in s which are not present in any of the substrings. Return the minimum number of extra characters left over if you break up s optimally.

Key Insights

  • The goal is to minimize the number of characters in the string that are not part of any valid substrings from the dictionary.
  • A dynamic programming approach can be utilized to track the minimum extra characters at each index of the string.
  • A set can be used to store the dictionary words for O(1) average-time complexity lookups.
  • Iterate through the string and for each character, check for possible substrings that can be formed with the words in the dictionary.

Space and Time Complexity

Time Complexity: O(n^2 * m), where n is the length of the string and m is the average length of words in the dictionary.
Space Complexity: O(n), for storing the dynamic programming array.

Solution

We will use a dynamic programming approach, where we maintain an array dp such that dp[i] represents the minimum number of extra characters in the substring s[0:i]. We will iterate through the string and update the dp array based on whether we can form a valid substring from the dictionary.

Code Solutions

def minExtraCharacters(s, dictionary):
    # Create a set for quick lookup
    word_set = set(dictionary)
    n = len(s)
    
    # Initialize dp array; dp[i] means min extra characters for s[:i]
    dp = [0] * (n + 1)
    
    # Base case: there are initially n extra characters (all characters)
    for i in range(n + 1):
        dp[i] = i
    
    # Fill the dp array
    for i in range(n):
        for j in range(i + 1, n + 1):
            substring = s[i:j]
            if substring in word_set:
                dp[j] = min(dp[j], dp[i])  # If substring is in dictionary, no extra characters
                
    return dp[n]  # Return min extra characters for the whole string
← Back to All Questions