We use cookies (including Google cookies) to personalize ads and analyze traffic. By continuing to use our site, you accept our Privacy Policy.

Similar String Groups

Difficulty: Hard


Problem Description

Two strings, X and Y, are considered similar if either they are identical or we can make them equivalent by swapping at most two letters (in distinct positions) within the string X. Given a list strs of strings where every string in strs is an anagram of every other string in strs, the task is to find how many groups of similar strings there are.


Key Insights

  • Strings are similar if they can become identical by swapping at most two characters.
  • The problem can be viewed as finding connected components in a graph where nodes are strings and edges represent similarity.
  • We can use a depth-first search (DFS) or union-find to explore and group similar strings.

Space and Time Complexity

Time Complexity: O(N^2 * L), where N is the number of strings and L is the length of each string (for comparing strings). Space Complexity: O(N) for storing visited nodes in the DFS/Union-Find.


Solution

The solution involves treating the strings as nodes in a graph. We can either use Depth-First Search (DFS) or Union-Find to explore the groups of similar strings.

  1. For each string, check against all other strings to see if they are similar.
  2. If they are similar, connect them (either mark them visited in DFS or union them in Union-Find).
  3. Count the number of distinct connected components.

We can check if two strings are similar by tracking the positions of differing characters. If there are exactly two differing characters and swapping them makes the strings identical, they are similar.


Code Solutions

def are_similar(s1, s2):
    diff = []
    for i in range(len(s1)):
        if s1[i] != s2[i]:
            diff.append((s1[i], s2[i]))
    return len(diff) == 2 and diff[0] == (diff[1][1], diff[1][0])

def dfs(index, visited, strs):
    visited[index] = True
    for i in range(len(strs)):
        if not visited[i] and are_similar(strs[index], strs[i]):
            dfs(i, visited, strs)

def num_similar_groups(strs):
    visited = [False] * len(strs)
    count = 0
    for i in range(len(strs)):
        if not visited[i]:
            dfs(i, visited, strs)
            count += 1
    return count
← Back to All Questions