We use cookies (including Google cookies) to personalize ads and analyze traffic. By continuing to use our site, you accept our Privacy Policy.

Shortest Common Supersequence

Difficulty: Hard


Problem Description

Given two strings str1 and str2, return the shortest string that has both str1 and str2 as subsequences. If there are multiple valid strings, return any of them.


Key Insights

  • A subsequence is formed by deleting some characters from a string without rearranging the remaining characters.
  • The shortest common supersequence (SCS) can be derived using the concept of the longest common subsequence (LCS).
  • The length of the SCS can be calculated as: length(SCS) = len(str1) + len(str2) - len(LCS).
  • Dynamic programming can be used to efficiently compute the LCS which will then help in constructing the SCS.

Space and Time Complexity

Time Complexity: O(m * n) where m and n are the lengths of str1 and str2, respectively.
Space Complexity: O(m * n) for the DP table used to store LCS values.


Solution

To solve the problem, we can use dynamic programming to find the longest common subsequence (LCS) between the two strings. Once we have the LCS, we can construct the shortest common supersequence (SCS) by merging the characters from both strings while maintaining the order of characters according to the LCS.

  1. Create a 2D array dp where dp[i][j] will hold the length of the LCS of str1[0...i-1] and str2[0...j-1].
  2. Fill this array using the rules of LCS:
    • If characters match, increment the count from the previous indices.
    • If not, take the maximum count from either the previous character of str1 or str2.
  3. Once we have the length of the LCS, we can backtrack through the dp array to construct the SCS by adding characters from both strings, ensuring the LCS characters are preserved.

Code Solutions

def shortestCommonSupersequence(str1: str, str2: str) -> str:
    m, n = len(str1), len(str2)
    # Create a DP table to store lengths of longest common subsequence
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    
    # Fill the DP table
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if str1[i - 1] == str2[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    
    # Reconstruct the shortest common supersequence
    i, j = m, n
    scs = []
    
    while i > 0 and j > 0:
        if str1[i - 1] == str2[j - 1]:
            scs.append(str1[i - 1])
            i -= 1
            j -= 1
        elif dp[i - 1][j] > dp[i][j - 1]:
            scs.append(str1[i - 1])
            i -= 1
        else:
            scs.append(str2[j - 1])
            j -= 1
    
    # Add remaining characters
    while i > 0:
        scs.append(str1[i - 1])
        i -= 1
    while j > 0:
        scs.append(str2[j - 1])
        j -= 1
    
    # The SCS is built in reverse order
    return ''.join(reversed(scs))
← Back to All Questions