Problem Description
Given a list of words from an alien dictionary, determine the order of characters in the alien language. The words are sorted lexicographically, but the character order is unknown. Compute a valid character order that respects the provided lexicographical order. If no valid order exists, return an empty string.
Key Insights
- Build a graph where each unique character is a node.
- Compare adjacent words to determine the first different character, which gives the ordering (directed edge).
- Ensure all characters, even those without edges, are represented.
- Use a topological sort (using DFS or BFS) to determine a valid character ordering.
- Handle edge cases, such as words where a longer word appears before its prefix which is invalid.
Space and Time Complexity
Time Complexity: O(N * L + V + E) where N is the number of words, L is the average length of each word, V is the number of unique characters, and E is the number of edges derived from adjacent word comparisons. Space Complexity: O(V + E) for storing the graph and additional data structures for topological sorting.
Solution
We use a graph-based approach to solve the problem. First, we initialize a graph (adjacency list) and in-degree dictionary for all unique characters. We then iterate through each pair of consecutive words, comparing them character by character to find the first pair that differs. This difference implies an order: the first character should come before the second character. We add a directed edge accordingly and update the in-degree for the target character.
To get the final ordering, we perform a topological sort using Kahn’s algorithm:
- Enqueue characters with in-degree zero.
- Remove characters from the queue and append them to our result.
- Decrement the in-degree of neighboring nodes.
- If a neighbor's in-degree becomes zero, enqueue it. If the result’s length equals the number of unique nodes, we have a valid ordering; otherwise, a cycle exists, and we return an empty string.