We use cookies (including Google cookies) to personalize ads and analyze traffic. By continuing to use our site, you accept our Privacy Policy.

The Most Similar Path in a Graph

Number: 1687

Difficulty: Hard

Paid? Yes

Companies: Google


Problem Description

Given a connected undirected graph with n cities and m roads, each city has a 3-letter name. You are provided a targetPath (an array of 3-letter city names). The task is to find a valid path in this graph with the same length as targetPath that minimizes the edit distance with respect to targetPath. The edit distance here counts the number of positions where the city name in your path differs from the targetPath. There is also a follow-up if each city can be visited at most once.


Key Insights

  • This problem is essentially a path-finding problem on a graph combined with sequence alignment (minimizing edit distance).
  • Dynamic Programming is an ideal approach by treating each position in the targetPath as a state and considering all possible cities for that step.
  • The DP state can be defined as dp[i][j]: minimum edit distance to match targetPath[0...i] when current city is j.
  • Use transitions over all neighbors of a given city when moving from the (i-1)th to the ith step.
  • To reconstruct the final path, keep a parent pointer that tracks which predecessor city provided the optimal cost.
  • The graph structure is stored using an adjacency list for efficient neighbor lookup.
  • The follow-up requiring cities to be visited only once changes the state to include visited nodes (or use DFS/BFS with a visited set), turning the problem into a variant of the Hamiltonian path problem.

Space and Time Complexity

Time Complexity: O(L * n * d), where L is the length of targetPath, n is the number of cities, and d is the average degree (number of neighbors). In the worst-case d can be O(n), so it could be O(L * n^2).
Space Complexity: O(L * n) to store the DP table and rollout table for reconstruction.


Solution

We use dynamic programming to solve the problem. The state dp[i][j] captures the minimum edit distance for aligning the first (i+1) characters of targetPath ending at city j. For the base state (i=0), for each city j, the cost is 0 if the city’s name matches targetPath[0], otherwise 1. For each subsequent position i, we try to reach city j from any neighbor k of j from the previous step and take the minimal cost. A parent pointer is maintained to help reconstruct the path once dp is fully computed. Finally, by selecting the city with minimum dp[targetPath.length-1][j], we can backtrack the optimal path.


Code Solutions

# Python solution with inline comments
def mostSimilar(n, roads, names, targetPath):
    from collections import defaultdict
    L = len(targetPath)
    
    # Build graph using adjacency list
    graph = defaultdict(list)
    for u, v in roads:
        graph[u].append(v)
        graph[v].append(u)
    
    # Initialize dp table, dp[i][j] is minimal edit distance for targetPath[0...i] ending at city j
    dp = [[float('inf')] * n for _ in range(L)]
    parent = [[-1] * n for _ in range(L)]
    
    # Base case for first targetPath character
    for j in range(n):
        dp[0][j] = 0 if names[j] == targetPath[0] else 1
    
    # Fill dp table
    for i in range(1, L):
        for j in range(n):  # current city j at step i
            # For each neighbor 'nei' of current city j. Also consider staying at the same city if self-loop allowed.
            for k in graph[j]:
                # Calculate cost for this move from city k to city j at step i
                cost = 0 if names[j] == targetPath[i] else 1
                if dp[i-1][k] + cost < dp[i][j]:
                    dp[i][j] = dp[i-1][k] + cost
                    parent[i][j] = k

    # Choose the city at final step with minimal edit distance
    min_cost = float('inf')
    end_city = -1
    for j in range(n):
        if dp[L-1][j] < min_cost:
            min_cost = dp[L-1][j]
            end_city = j

    # Reconstruct the path from end_city backwards using the parent table
    path = [0] * L
    curr = end_city
    for i in range(L-1, -1, -1):
        path[i] = curr
        curr = parent[i][curr] if i > 0 else -1
    return path

# Example usage:
n = 5
roads = [[0, 2], [0, 3], [1, 2], [1, 3], [1, 4], [2, 4]]
names = ["ATL", "PEK", "LAX", "DXB", "HND"]
targetPath = ["ATL", "DXB", "HND", "LAX"]
result = mostSimilar(n, roads, names, targetPath)
print("Optimal path:", result)
← Back to All Questions