We use cookies (including Google cookies) to personalize ads and analyze traffic. By continuing to use our site, you accept our Privacy Policy.

Tag Validator

Difficulty: Hard


Problem Description

Given a string representing a code snippet, implement a tag validator to parse the code and return whether it is valid. A code snippet is valid if all the following rules hold:

  1. The code must be wrapped in a valid closed tag.
  2. A closed tag has the format <TAG_NAME>TAG_CONTENT</TAG_NAME>. The TAG_NAME in start and end tags must be the same. A closed tag is valid if and only if the TAG_NAME and TAG_CONTENT are valid.
  3. A valid TAG_NAME only contains upper-case letters and has a length in range [1,9].
  4. A valid TAG_CONTENT may contain other valid closed tags, cdata, and any characters but cannot have unmatched <, unmatched start and end tags, or unmatched/closed tags with invalid TAG_NAME.
  5. A start tag is unmatched if no end tag exists with the same TAG_NAME, and vice versa, considering nesting.
  6. A < is unmatched if there is no subsequent >. All characters between < and > should be parsed as TAG_NAME.
  7. The cdata has the format . CDATA_CONTENT may contain any characters and is treated as regular characters.

Key Insights

  • The tag structure needs to be validated based on specific rules regarding format and content.
  • A stack data structure is useful for tracking open tags and ensuring proper nesting and matching.
  • Special handling is required for CDATA sections to ignore their content during validation.

Space and Time Complexity

Time Complexity: O(n) - Each character in the string is processed a limited number of times. Space Complexity: O(n) - In the worst case, the stack may need to store all open tags.


Solution

To solve this problem, we can utilize a stack data structure to keep track of the currently open tags. We will iterate through the string character by character, identifying tags and their content. When we encounter a start tag, we will push it onto the stack, and when we encounter an end tag, we will check for a matching start tag at the top of the stack. For CDATA sections, we will skip parsing their contents. If we reach the end of the string and the stack is empty, the code is valid.


Code Solutions

def isValid(code: str) -> bool:
    stack = []
    i = 0
    n = len(code)

    def is_valid_tag_name(tag_name: str) -> bool:
        return tag_name.isupper() and 1 <= len(tag_name) <= 9

    while i < n:
        if code[i] == '<':
            if code[i + 1:i + 9] == '![CDATA[':
                # Handle CDATA section
                i = code.find(']]>', i) + 3
                if i == -1: return False
                continue
            elif code[i + 1] == '/':
                # Handle end tag
                j = code.find('>', i)
                if j == -1: return False
                tag_name = code[i + 2:j]
                if not stack or stack[-1] != tag_name:
                    return False
                stack.pop()
            else:
                # Handle start tag
                j = code.find('>', i)
                if j == -1: return False
                tag_name = code[i + 1:j]
                if not is_valid_tag_name(tag_name):
                    return False
                stack.append(tag_name)
            i = j + 1
        else:
            i += 1

    return len(stack) == 0
← Back to All Questions