We use cookies (including Google cookies) to personalize ads and analyze traffic. By continuing to use our site, you accept our Privacy Policy.

Find Longest Self-Contained Substring

Number: 3410

Difficulty: Hard

Paid? Yes

Companies: Amazon


Problem Description

Given a string s, find the length of its longest self-contained substring. A substring t (with t ≠ s) is defined as self-contained if for every character in t, none of its other occurrences exist in s outside t. If no such substring exists, return -1.


Key Insights

  • A self-contained substring t must include all occurrences of each character that appears in t; that is, for each letter c in t, the first and last occurrence of c in s are both inside t.
  • You can precompute the first and last occurrence indexes for each character.
  • When exploring a candidate substring starting at index i, you can “grow” the candidate until it covers the entire interval for every character it has seen.
  • After computing a candidate interval [i, j], ensure that for every position k within [i, j] the first occurrence of s[k] is not before i. This guarantees no letter in the candidate occurs before it.
  • The candidate must not equal the entire string s.

Space and Time Complexity

Time Complexity: O(n^2) in the worst case (though in practice, with only 26 letters and early breaks, the average runtime is much lower). Space Complexity: O(1) extra space (aside from the output and the fixed table for the 26 letters).


Solution

We first precompute the first and last occurrence indices for every character in s. Then, for every index i that is the first occurrence of its character, we try to build a candidate substring starting at i. Initialize j as the last occurrence of s[i] and then iterate the candidate from i to j; for each index k in the candidate, update j = max(j, lastOccurrence(s[k])). While iterating, if you find any s[k] whose first occurrence is before i, then the candidate is invalid because that character appears outside the candidate. Finally, if the candidate [i, j] is valid and does not span the entire string s, update your answer with its length. This greedy “expansion” guarantees that for every character in the substring, all its occurrences are captured inside the candidate interval.


Code Solutions

Below are implementations in Python, JavaScript, C++, and Java with line-by-line comments.

# Python solution
class Solution:
    def findLongestSelfContainedSubstring(self, s: str) -> int:
        n = len(s)
        # Precompute the first and last occurrence for every character.
        first_occ = {}
        last_occ = {}
        for i, char in enumerate(s):
            if char not in first_occ:
                first_occ[char] = i
            last_occ[char] = i
        
        max_length = -1
        
        # Explore candidate starting at each index that is the first occurrence of its character.
        for i in range(n):
            # Only consider if this index is the first occurrence for its character.
            if first_occ[s[i]] != i:
                continue
            valid = True
            end = last_occ[s[i]]  # initial candidate end
            j = i
            # Expand the candidate until j reaches the candidate boundary.
            while j <= end:
                # Check if any character in the candidate appears before i.
                if first_occ[s[j]] < i:
                    valid = False
                    break
                # Update the candidate boundary
                end = max(end, last_occ[s[j]])
                j += 1
            # Candidate is valid if it does not include characters outside the interval and is not the entire string.
            if valid and end < n:
                max_length = max(max_length, end - i + 1)
        return max_length

# Example usage:
# sol = Solution()
# print(sol.findLongestSelfContainedSubstring("abba"))  # Output: 2
← Back to All Questions