We use cookies (including Google cookies) to personalize ads and analyze traffic. By continuing to use our site, you accept our Privacy Policy.

Minimum Genetic Mutation

Difficulty: Medium


Problem Description

Given a gene string represented by an 8-character long string, we need to investigate a mutation from a gene string startGene to a gene string endGene where one mutation is defined as one single character change in the gene string. A gene must be in bank to make it a valid gene string. Given startGene, endGene, and bank, return the minimum number of mutations needed to mutate from startGene to endGene. If there is no such mutation, return -1.


Key Insights

  • Each mutation changes exactly one character in the gene string.
  • The gene bank contains valid mutations that can be used to reach the endGene.
  • A breadth-first search (BFS) approach is suitable to find the shortest path (minimum mutations) from startGene to endGene.
  • The BFS will explore all valid mutations level by level, ensuring the minimum number of mutations is found first.

Space and Time Complexity

Time Complexity: O(N * 4^L), where N is the number of genes in the bank and L is the length of the gene (which is 8).
Space Complexity: O(N), due to the storage of the bank and the queue used for BFS.


Solution

To solve the problem, we will use a breadth-first search (BFS) algorithm. The BFS will start from the startGene and explore all possible mutations that exist in the bank. We will maintain a queue to keep track of the current gene and the number of mutations taken to reach it. A set will be used to track visited genes to avoid processing the same gene multiple times. As soon as we reach endGene, we will return the number of mutations taken. If the queue is exhausted without finding endGene, we return -1.


Code Solutions

from collections import deque

def minMutation(startGene, endGene, bank):
    if endGene not in bank:
        return -1
    
    bank_set = set(bank)
    queue = deque([(startGene, 0)])  # (current_gene, mutations_count)
    visited = set([startGene])
    
    while queue:
        current_gene, mutations = queue.popleft()
        
        if current_gene == endGene:
            return mutations
        
        for i in range(len(current_gene)):
            for char in 'ACGT':
                if char != current_gene[i]:
                    mutated_gene = current_gene[:i] + char + current_gene[i+1:]
                    if mutated_gene in bank_set and mutated_gene not in visited:
                        visited.add(mutated_gene)
                        queue.append((mutated_gene, mutations + 1))
    
    return -1
← Back to All Questions