We use cookies (including Google cookies) to personalize ads and analyze traffic. By continuing to use our site, you accept our Privacy Policy.

String Compression II

Difficulty: Hard


Problem Description

Given a string s and an integer k, you need to delete at most k characters from s such that the run-length encoded version of s has minimum length. The run-length encoding replaces consecutive identical characters (repeated 2 or more times) with the concatenation of the character and the number marking the count of the characters.


Key Insights

  • Run-length encoding compresses sequences of identical characters into a format that combines the character and its count.
  • The goal is to minimize the length of the encoded string after deleting up to k characters.
  • The problem can be approached using dynamic programming to explore different deletion scenarios and their impact on the compressed length.

Space and Time Complexity

Time Complexity: O(n^2 * k) where n is the length of the string and k is the number of deletions allowed.
Space Complexity: O(n * k) for storing the state of the dynamic programming table.


Solution

To solve the problem, we will use a dynamic programming approach. We will maintain a DP table where dp[i][j] represents the minimum length of the run-length encoded version of the substring s[0:i] after deleting j characters. The algorithm works as follows:

  1. First, preprocess the string to count consecutive characters and their frequencies.
  2. Use a nested loop to fill out the DP table:
    • For each character segment, decide whether to delete characters from it or keep it intact.
    • Update the DP table based on the current segment's contribution to the run-length encoding.
  3. The final answer will be found in the last cell of the DP table, considering all possible deletions.

Code Solutions

def getLengthOfOptimalCompression(s: str, k: int) -> int:
    n = len(s)
    dp = [[float('inf')] * (k + 1) for _ in range(n + 1)]
    dp[0][0] = 0
    
    # Preprocess the string into groups of characters
    groups = []
    count = 1
    for i in range(1, n):
        if s[i] == s[i - 1]:
            count += 1
        else:
            groups.append((s[i - 1], count))
            count = 1
    groups.append((s[-1], count))
    
    for i in range(1, len(groups) + 1):
        char, freq = groups[i - 1]
        
        for j in range(k + 1):
            # Case 1: Don't delete any characters from this group
            if freq > 1:
                dp[i][j] = min(dp[i][j], dp[i - 1][j] + (len(char) + len(str(freq)) - (1 if freq > 9 else 0)))
            else:
                dp[i][j] = min(dp[i][j], dp[i - 1][j] + len(char))
            
            # Case 2: Delete characters from this group
            for del_count in range(1, min(freq, j) + 1):
                if freq - del_count > 1:
                    dp[i][j] = min(dp[i][j], dp[i - 1][j - del_count] + (len(char) + len(str(freq - del_count)) - (1 if freq - del_count > 9 else 0)))
                else:
                    dp[i][j] = min(dp[i][j], dp[i - 1][j - del_count] + len(char))
    
    return dp[len(groups)][k]
← Back to All Questions