Problem Description
Given a string representing a code snippet, implement a tag validator to parse the code and return whether it is valid. A code snippet is valid if all the following rules hold:
- The code must be wrapped in a valid closed tag.
- A closed tag has the format <TAG_NAME>TAG_CONTENT</TAG_NAME>. The TAG_NAME in start and end tags must be the same. A closed tag is valid if and only if the TAG_NAME and TAG_CONTENT are valid.
- A valid TAG_NAME only contains upper-case letters and has a length in range [1,9].
- A valid TAG_CONTENT may contain other valid closed tags, cdata, and any characters but cannot have unmatched <, unmatched start and end tags, or unmatched/closed tags with invalid TAG_NAME.
- A start tag is unmatched if no end tag exists with the same TAG_NAME, and vice versa, considering nesting.
- A < is unmatched if there is no subsequent >. All characters between < and > should be parsed as TAG_NAME.
- The cdata has the format . CDATA_CONTENT may contain any characters and is treated as regular characters.
Key Insights
- The tag structure needs to be validated based on specific rules regarding format and content.
- A stack data structure is useful for tracking open tags and ensuring proper nesting and matching.
- Special handling is required for CDATA sections to ignore their content during validation.
Space and Time Complexity
Time Complexity: O(n) - Each character in the string is processed a limited number of times. Space Complexity: O(n) - In the worst case, the stack may need to store all open tags.
Solution
To solve this problem, we can utilize a stack data structure to keep track of the currently open tags. We will iterate through the string character by character, identifying tags and their content. When we encounter a start tag, we will push it onto the stack, and when we encounter an end tag, we will check for a matching start tag at the top of the stack. For CDATA sections, we will skip parsing their contents. If we reach the end of the string and the stack is empty, the code is valid.