We use cookies (including Google cookies) to personalize ads and analyze traffic. By continuing to use our site, you accept our Privacy Policy.

Number of Same-End Substrings

Number: 3247

Difficulty: Medium

Paid? Yes

Companies: Sprinklr


Problem Description

Given a 0-indexed string s and an array of queries where each query [l, r] represents a substring s[l..r], return an array of integers where each element is the number of same-end substrings in the corresponding query. A substring is considered same-end if its first and last characters are equal.


Key Insights

  • A substring is defined by its starting and ending indices.
  • A same-end substring has matching first and last characters.
  • Every single character is itself a same-end substring.
  • For a given character c in a substring, if it appears f times, then the number of same-end substrings contributed by c is f * (f + 1) / 2.
  • Using prefix sum arrays for each character allows efficient frequency queries for any substring.

Space and Time Complexity

Time Complexity: O(26 * N + 26 * Q) ≈ O(N + Q), where N is the string length and Q is the number of queries. Space Complexity: O(26 * (N + 1)) due to storing prefix frequency arrays for each of the 26 lowercase letters.


Solution

We precompute prefix sums for all 26 lowercase letters. For each character, we maintain an array that records the cumulative frequency up to each index in s. For each query (l, r), we compute the frequency of each character by subtracting the prefix sums at indices r+1 and l. Then, for each character, we calculate the number of same-end substrings using the formula f * (f + 1) / 2, where f is the frequency of that character in the substring. Summing these values over all characters gives the answer for that query.


Code Solutions

# Python solution

def count_same_end_substrings(s, queries):
    n = len(s)
    # Create prefix sums for each letter, 26 arrays of length n+1
    prefix = [[0] * (n + 1) for _ in range(26)]
    
    # Build the prefix arrays: for each index, update cumulative frequency for each letter
    for i in range(n):
        char_index = ord(s[i]) - ord('a')
        for c in range(26):
            prefix[c][i+1] = prefix[c][i]
        prefix[char_index][i+1] += 1
    
    results = []
    # Process each query to calculate same-end substring count
    for l, r in queries:
        count = 0
        for c in range(26):
            freq = prefix[c][r+1] - prefix[c][l]
            count += freq * (freq + 1) // 2  # f*(f+1)/2
        results.append(count)
    return results

# Example usage:
s = "abcaab"
queries = [[0,0], [1,4], [2,5], [0,5]]
print(count_same_end_substrings(s, queries))  # Output: [1, 5, 5, 10]
← Back to All Questions