We use cookies (including Google cookies) to personalize ads and analyze traffic. By continuing to use our site, you accept our Privacy Policy.

Minimize Hamming Distance After Swap Operations

Difficulty: Medium


Problem Description

You are given two integer arrays, source and target, both of length n. You are also given an array allowedSwaps where each allowedSwaps[i] = [a_i, b_i] indicates that you are allowed to swap the elements at index a_i and index b_i (0-indexed) of array source. Note that you can swap elements at a specific pair of indices multiple times and in any order.

The Hamming distance of two arrays of the same length, source and target, is the number of positions where the elements are different. Return the minimum Hamming distance of source and target after performing any amount of swap operations on array source.


Key Insights

  • The allowed swaps can be seen as connections between indices in the source array, forming connected components.
  • Within each connected component, we can rearrange the elements freely.
  • To minimize the Hamming distance, we should ensure that as many elements in source as possible match those in target within the same connected components.
  • The problem can be solved using Union-Find (Disjoint Set Union) or Depth-First Search (DFS) to identify connected components.

Space and Time Complexity

Time Complexity: O(n + m) where n is the length of the source and target arrays, and m is the number of allowed swaps.
Space Complexity: O(n) for the parent array in Union-Find or the visited array in DFS.


Solution

To solve this problem, we can use the Union-Find data structure to group indices of the source array that can be swapped. Once we have identified all connected components, we can count the occurrences of each value in both source and target within these components. Finally, we can compute the minimum Hamming distance by determining how many elements can be matched in each component.

  1. Union-Find Initialization: Initialize a parent array to represent each index's root.
  2. Union Operations: For each allowed swap, union the two indices.
  3. Find Components: Use the find operation to get the representative for each index, grouping them into components.
  4. Count Matches: For each component, count the occurrences of elements in both source and target.
  5. Calculate Hamming Distance: For each component, calculate how many elements can be matched and subtract from the total to get the Hamming distance.

Code Solutions

class UnionFind:
    def __init__(self, size):
        self.parent = list(range(size))
    
    def find(self, x):
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])  # Path compression
        return self.parent[x]
    
    def union(self, x, y):
        rootX = self.find(x)
        rootY = self.find(y)
        if rootX != rootY:
            self.parent[rootY] = rootX  # Union

def minimizeHammingDistance(source, target, allowedSwaps):
    n = len(source)
    uf = UnionFind(n)
    
    # Union all allowed swaps
    for a, b in allowedSwaps:
        uf.union(a, b)
    
    # Dictionary to group indices by their root
    components = {}
    for i in range(n):
        root = uf.find(i)
        if root not in components:
            components[root] = []
        components[root].append(i)
    
    hamming_distance = 0
    
    # Calculate the minimum Hamming distance
    for indices in components.values():
        source_count = {}
        target_count = {}
        
        # Count occurrences in source and target for the current component
        for index in indices:
            source_count[source[index]] = source_count.get(source[index], 0) + 1
            target_count[target[index]] = target_count.get(target[index], 0) + 1
        
        # Calculate how many can be matched
        matches = 0
        for value in source_count:
            if value in target_count:
                matches += min(source_count[value], target_count[value])
        
        # Hamming distance is the size of the component minus matches
        hamming_distance += len(indices) - matches
    
    return hamming_distance
← Back to All Questions