We use cookies (including Google cookies) to personalize ads and analyze traffic. By continuing to use our site, you accept our Privacy Policy.

Repeated DNA Sequences

Difficulty: Medium


Problem Description

Given a string s that represents a DNA sequence, return all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule. You may return the answer in any order.


Key Insights

  • The DNA sequence consists of four nucleotides: 'A', 'C', 'G', and 'T'.
  • We need to identify and return substrings of length 10 that appear more than once.
  • A sliding window approach can efficiently extract all substrings of length 10 from the given string.
  • A hash table (or dictionary) can be used to count occurrences of each substring.

Space and Time Complexity

Time Complexity: O(n), where n is the length of the input string. We scan through the string once and perform constant-time operations for each substring. Space Complexity: O(n), in the worst case, if all substrings are unique and stored in the hash table.


Solution

The solution involves using a sliding window approach to extract all possible 10-letter-long substrings from the given DNA sequence. We will utilize a hash table (or dictionary) to keep track of the count of each substring. If a substring appears more than once, we add it to the result list.

  1. Initialize an empty hash table to count occurrences of each substring.
  2. Use a loop to slide through the string and extract each 10-letter-long substring.
  3. For each substring, update its count in the hash table.
  4. After processing all substrings, filter the hash table to find which substrings have a count greater than 1, and return them as the result.

Code Solutions

def findRepeatedDnaSequences(s: str):
    if len(s) < 10:
        return []
    
    seen = {}
    result = []
    
    for i in range(len(s) - 9):
        substring = s[i:i + 10]
        if substring in seen:
            seen[substring] += 1
        else:
            seen[substring] = 1
    
    for substring, count in seen.items():
        if count > 1:
            result.append(substring)
    
    return result
← Back to All Questions