We use cookies (including Google cookies) to personalize ads and analyze traffic. By continuing to use our site, you accept our Privacy Policy.

Encode String with Shortest Length

Number: 471

Difficulty: Hard

Paid? Yes

Companies: Google


Problem Description

Given a string s, encode it so that its encoded length is the shortest possible. The encoded format is k[encoded_string] where the encoded_string inside the brackets is repeated exactly k times. If encoding does not produce a shorter string, the original string should be returned. There can be multiple valid encodings.


Key Insights

  • Use Dynamic Programming to consider every substring of s.
  • For each substring, try partitioning it into two smaller substrings and combine their best encodings.
  • Detect if a substring can be constructed by repeatedly concatenating a smaller substring and, if so, build an encoding in the format k[encoded_string].
  • Only use the encoding if it is shorter than the original substring.

Space and Time Complexity

Time Complexity: O(n^3), where n is the length of the string. This is due to the triple nested loops for checking all substrings and splitting them. Space Complexity: O(n^2) for the DP table storing encoded results for each substring.


Solution

We use a bottom-up dynamic programming approach. Create a 2D array dp where dp[i][j] holds the shortest encoded string for substring s[i…j]. The base case is when the substring length is one (dp[i][i] = s[i]). For longer substrings, check every possible partition to see if concatenations of smaller encodings yield a shorter result. Also, try to see if the substring can be expressed as repeated patterns. To do so, for each substring s[i...j], consider if it can be divided into one or more repetitions of a pattern substring. If so, represent it in the k[pattern] format. The key challenge is to correctly check periodicity and to choose the optimal encoding that minimizes the length.


Code Solutions

# Python solution using Dynamic Programming
def encode(s: str) -> str:
    n = len(s)
    # dp[i][j] will store the shortest encoded string for s[i:j+1]
    dp = [["" for _ in range(n)] for _ in range(n)]
    
    # Fill dp for substrings from length 1 to n
    for l in range(1, n+1):  # l is the length of the substring
        for i in range(0, n-l+1):
            j = i + l - 1
            substr = s[i:j+1]
            # If substring length is less than 5, encoding it will not help.
            dp[i][j] = substr
            # Try each possible split position k
            for k in range(i, j):
                left = dp[i][k]
                right = dp[k+1][j]
                if len(left) + len(right) < len(dp[i][j]):
                    dp[i][j] = left + right
            # Check if substr can be represented as a repetition of a pattern.
            # Try all possible patterns that could form the substring.
            for k in range(1, l):
                # Only consider patterns that evenly divide substr length.
                if l % k == 0:
                    repeat_count = l // k
                    # If the pattern repeated 'repeat_count' times equals the substring
                    if substr == substr[0:k] * repeat_count:
                        # Use dp[i][i+k-1] because it might be further encoded itself
                        candidate = str(repeat_count) + "[" + dp[i][i+k-1] + "]"
                        if len(candidate) < len(dp[i][j]):
                            dp[i][j] = candidate
    return dp[0][n-1]

# Example usage:
print(encode("aaa"))
print(encode("aaaaa"))
print(encode("aaaaaaaaaa"))
← Back to All Questions