Problem:
Given two words (start and end), and a dictionary, find all shortest transformation sequence(s) from start to end, such that:
- Only one letter can be changed at a time
- Each intermediate word must exist in the dictionary
For example,
Given:
start = "hit"
end = "cog"
dict = ["hot","dot","dog","lot","log"]
Return
[
["hit","hot","dot","dog","cog"],
["hit","hot","lot","log","cog"]
]
Note:
- All words have the same length.
- All words contain only lowercase alphabetic characters.
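The two rules above can be turned into a small checker for a single candidate sequence. This is only a minimal sketch: the class name LadderCheck and both method names are hypothetical, and it assumes (as the example does) that start and end themselves need not be in the dictionary.

```java
import java.util.List;
import java.util.Set;

class LadderCheck {
    // True if a and b have the same length and differ in exactly one position.
    static boolean differByOne(String a, String b) {
        if (a.length() != b.length()) return false;
        int diff = 0;
        for (int i = 0; i < a.length(); i++) {
            if (a.charAt(i) != b.charAt(i)) diff++;
        }
        return diff == 1;
    }

    // True if path runs from start to end, changes one letter per step,
    // and every intermediate word (neither first nor last) is in dict.
    static boolean isValidLadder(List<String> path, String start, String end, Set<String> dict) {
        if (path.isEmpty() || !path.get(0).equals(start)
                || !path.get(path.size() - 1).equals(end)) return false;
        for (int i = 1; i < path.size(); i++) {
            if (!differByOne(path.get(i - 1), path.get(i))) return false;
            boolean intermediate = i < path.size() - 1;
            if (intermediate && !dict.contains(path.get(i))) return false;
        }
        return true;
    }
}
```

A checker like this only validates one sequence; it says nothing about whether the sequence is a shortest one.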
Analysis:
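As analyzed in the simpler Word Ladder problem, each word has at most 26*word_len candidate neighbors. That does not mean the characters can be replaced one index after another, with the search stopping after word_len replacements: a shortest ladder can need more steps than the word has letters (the example above needs four steps for three-letter words). Below is a wrong solution built on that wrong assumption.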
Wrong solution 1:
import java.util.*;

public class Solution {
    public List<List<String>> findLadders(String start, String end, Set<String> dict) {
        List<List<String>> ret = new ArrayList<List<String>> ();
        if (start == null || end == null || dict == null)
            throw new IllegalArgumentException("The passed in arguments are illegal");
        ArrayList<String> path = new ArrayList<String> ();
        path.add(start);
        findPath(start, end, 0, dict, path, ret);
        return ret;
    }

    private void findPath(String cur_str, String end, int index, Set<String> dict, ArrayList<String> path, List<List<String>> ret) {
        if (cur_str.equals(end)) {
            ret.add(new ArrayList<String>(path));
            return;
        }
        // Wrong: the search is cut off after word_len replacements,
        // but a shortest ladder can be longer than that.
        if (index == end.length()) {
            return;
        }
        for (int pos = 0; pos < end.length(); pos++) {
            for (int i = 0; i < 26; i++) {
                char replace = (char)('a' + i);
                String new_str = cur_str.substring(0, pos) + replace + cur_str.substring(pos+1, cur_str.length());
                if (dict.contains(new_str)) {
                    path.add(new_str);
                    findPath(new_str, end, index+1, dict, path, ret);
                    path.remove(path.size()-1);
                }
            }
        }
    }
}
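The candidate generation used inside findPath can be isolated into a small helper, which makes the out-degree bound easy to see. This is a minimal sketch; the class name Neighbors and the method name of are hypothetical and not part of the original solution. It skips the original letter, so it tries 25*word_len candidates rather than 26*word_len.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

class Neighbors {
    // Returns every dictionary word that differs from word in exactly one letter.
    static List<String> of(String word, Set<String> dict) {
        List<String> result = new ArrayList<>();
        char[] chars = word.toCharArray();
        for (int pos = 0; pos < chars.length; pos++) {
            char original = chars[pos];
            for (char c = 'a'; c <= 'z'; c++) {
                if (c == original) continue; // skip the word itself
                chars[pos] = c;
                String candidate = new String(chars);
                if (dict.contains(candidate)) result.add(candidate);
            }
            chars[pos] = original; // restore before moving to the next position
        }
        return result;
    }
}
```

Working on a char array avoids the two substring calls and the string concatenation per candidate.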
Solution 1:
Unlike the Word Ladder problem, which only cares about the length of the shortest path, this problem asks us to print all shortest paths, which are a subset of all paths. If we use DFS, we have to traverse every one of those paths, so the search cost is very high.
Usable version 1:
import java.util.*;

public class Solution {
    public List<List<String>> findLadders(String start, String end, Set<String> dict) {
        List<List<String>> ret = new ArrayList<List<String>> ();
        if (start == null || end == null || dict == null)
            throw new IllegalArgumentException("The passed in arguments are illegal");
        ArrayList<String> path = new ArrayList<String> ();
        HashSet<String> visited = new HashSet<String> ();
        // The end word is allowed even if the dictionary does not list it.
        dict.add(end);
        visited.add(start);
        path.add(start);
        findPath(start, end, dict, path, visited, ret);
        return ret;
    }

    private void findPath(String cur_str, String end, Set<String> dict, ArrayList<String> path, Set<String> visited, List<List<String>> ret) {
        if (cur_str.equals(end)) {
            ret.add(new ArrayList<String>(path));
            return;
        }
        for (int pos = 0; pos < end.length(); pos++) {
            for (int i = 0; i < 26; i++) {
                char replace = (char)('a' + i);
                String new_str = cur_str.substring(0, pos) + replace + cur_str.substring(pos+1, cur_str.length());
                // visited holds the words on the current path only; it is undone on backtrack.
                if (dict.contains(new_str) && !visited.contains(new_str)) {
                    visited.add(new_str);
                    path.add(new_str);
                    findPath(new_str, end, dict, path, visited, ret);
                    path.remove(path.size()-1);
                    visited.remove(new_str);
                }
            }
        }
    }
}
The above code structure is simple, but the time complexity is too high, since the search has to explore every possible route.
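Note that the DFS above collects every path from start to end that repeats no word, not only the shortest ones, so its result still has to be filtered. A minimal sketch of that filtering step follows; the class name ShortestOnly and the method name keepShortest are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

class ShortestOnly {
    // Keeps only the paths whose length equals the minimum length in the input.
    static List<List<String>> keepShortest(List<List<String>> paths) {
        int min = Integer.MAX_VALUE;
        for (List<String> p : paths) min = Math.min(min, p.size());
        List<List<String>> result = new ArrayList<>();
        for (List<String> p : paths) {
            if (p.size() == min) result.add(p);
        }
        return result;
    }
}
```

The filter is cheap; the cost that matters is the exhaustive DFS that produces its input.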
Improvement Analysis:
The natural choice for this problem is BFS.
The advantage we can exploit: BFS reaches the shortest paths before any longer valid path, so once the end word has been reached at some level, every deeper level can be skipped.
int min_level = 0;
while (!queue.isEmpty()) {
    WordNode cur = queue.poll();
    if (min_level != 0 && level > min_level)
        continue;
    ...
    if (min_level == 0)
        min_level = level;
    if (level == min_level && min_level != 0) {
        ...
    }
}
Challenge 1:
Since this problem asks us to print all shortest paths, it is much harder than the previous question: we cannot simply tag every word as visited the moment we encounter it. If we blindly tag all encountered words, we can lose answers.
-------------------------------------------------
Case: every encountered word is tagged as visited before it is enqueued.
start: hot
end: dog
dict: [hot, dot, hog, dog]
Expected: [[hot, dot, dog], [hot, hog, dog]]
Output: [[hot, dot, dog]]
-------------------------------------------------
Step 1: enqueue "hot" (tag "hot" as visited).
Step 2: dequeue "hot", enqueue "dot", "hog" (tag "dot" and "hog" as visited).
Step 3: dequeue "dot", enqueue "dog" (tag "dog" as visited).
Step 4: dequeue "hog", try to enqueue "dog" (fails, because "dog" has already been tagged as visited).
A way to fix this failure is to never tag the "end" word as visited.
if (!new_word.equals(end))
    visited.add(new_word);
But it's a little ugly, don't you think?
Challenge 2:
In a BFS, even after we reach the target word, how can we recover the path that led to it?
This question bothered me a lot until I saw an ingenious method: design a WordNode structure that records a reference to the previous node.
class WordNode {
    String word;
    WordNode pre;
    public WordNode(String word, WordNode pre) {
        this.word = word;
        this.pre = pre;
    }
}
We use the queue to hold WordNode objects.
Queue<WordNode> queue = new LinkedList<WordNode> ();
Whenever we enqueue a word into the queue, we wrap it in a WordNode.
if (!visited.contains(new_word) && dict.contains(new_word)) {
    queue.offer(new WordNode(new_word, cur));
}
Misunderstanding: since Java collects garbage automatically, once we dequeue a WordNode, isn't its information gone forever? How could we recover it? Actually, every WordNode we still need stays reachable through a chain of references, which keeps it from being collected. Note the arguments passed to the constructor: new WordNode(new_word, cur). "cur" is a reference to the current node, so when we reach the target node we can trace back through these references.
Note: the same word reached along different search paths is wrapped in different WordNode objects.
----------------------------------------------------------------------
if (level == min_level && min_level != 0) {
    ArrayList<String> item = new ArrayList<String> ();
    item.add(cur_word);
    while (cur.pre != null) {
        cur = cur.pre;
        item.add(0, cur.word);
    }
    ret.add(item);
}
----------------------------------------------------------------------
Even with these two challenges solved, this strategy still leaves plenty of room for mistakes.
We rely on "cur_num, next_num, level" to maintain the level information, and the loop now has early exits in the middle (before the "cur_num == 0" check is reached). Errors creep in easily in such a routine.
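The back-tracing through "pre" references described in Challenge 2 can be tried on its own, without any search. This is a minimal sketch under assumptions: the names PathTrace, Node and buildPath are hypothetical, and Node only mirrors the WordNode fields shown earlier. It prepends into a LinkedList, which is a swapped-in detail; the snippet in the text uses ArrayList.add(0, ...).

```java
import java.util.LinkedList;
import java.util.List;

class PathTrace {
    // Minimal node type mirroring the WordNode described in the text.
    static class Node {
        final String word;
        final Node pre; // node this word was reached from; null for the start word
        Node(String word, Node pre) {
            this.word = word;
            this.pre = pre;
        }
    }

    // Walks the pre chain from the last node back to the start and
    // returns the words in start-to-end order.
    static List<String> buildPath(Node last) {
        LinkedList<String> path = new LinkedList<>();
        for (Node n = last; n != null; n = n.pre) {
            path.addFirst(n.word); // constant-time prepend
        }
        return path;
    }
}
```

Because each node keeps a reference to its predecessor, dequeued nodes on a live path stay reachable for as long as the last node of that path is.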
Solution 2:
import java.util.*;

class WordNode {
    String word;
    WordNode pre;
    public WordNode(String word, WordNode pre) {
        this.word = word;
        this.pre = pre;
    }
}

public class Solution {
    public List<List<String>> findLadders(String start, String end, Set<String> dict) {
        List<List<String>> ret = new ArrayList<List<String>> ();
        if (start == null || end == null || dict == null)
            throw new IllegalArgumentException("The passed in arguments are illegal");
        Queue<WordNode> queue = new LinkedList<WordNode> ();
        HashSet<String> visited = new HashSet<String> ();
        queue.offer(new WordNode(start, null));
        dict.add(end);
        visited.add(start);
        int cur_num = 1;   // nodes left to process at the current level
        int next_num = 0;  // nodes enqueued for the next level
        int level = 1;
        int min_level = 0; // level at which "end" was first reached; 0 means not yet
        while (!queue.isEmpty()) {
            WordNode cur = queue.poll();
            if (min_level != 0 && level > min_level)
                continue;
            String cur_word = cur.word;
            cur_num--;
            if (end.equals(cur_word)) {
                if (min_level == 0)
                    min_level = level;
                // the first time min_level is set, it is the lowest level
                if (level == min_level && min_level != 0) {
                    ArrayList<String> item = new ArrayList<String> ();
                    item.add(cur_word);
                    while (cur.pre != null) {
                        cur = cur.pre;
                        item.add(0, cur.word);
                    }
                    ret.add(item);
                }
                if (cur_num == 0) {
                    cur_num = next_num;
                    next_num = 0;
                    level++;
                }
                continue;
            }
            // Don't put "end" into the visited set.
            // Intermediate words are still tagged immediately, so a word reachable
            // from two parents at the same level keeps only one of them.
            char[] char_array = cur_word.toCharArray();
            for (int i = 0; i < end.length(); i++) {
                char copy = char_array[i];
                for (char c = 'a'; c <= 'z'; c++) {
                    char_array[i] = c;
                    String new_word = new String(char_array);
                    if (!visited.contains(new_word) && dict.contains(new_word)) {
                        queue.offer(new WordNode(new_word, cur));
                        next_num++;
                        if (!new_word.equals(end))
                            visited.add(new_word);
                    }
                }
                char_array[i] = copy;
            }
            if (cur_num == 0) {
                cur_num = next_num;
                next_num = 0;
                level++;
            }
        }
        return ret;
    }
}
So far we have solved two important challenges in BFS:
1. how to reach the "target" word more than once, so that the other shortest search paths are recorded too.
2. how to trace back a search path.
Improvement 1:
However, the level-traversal code above is still not clear enough for this problem: there are too many counters to maintain properly, which makes the code hard to read and implement. A way out is to leverage our helper data structure "WordNode": besides recording the previous node of the current node, it also records the level of the current node. Once the level information travels with the node, the code becomes clear and simple.
1. How to set the initial node?
queue.offer(new WordNode(null, start, 1));
The initial node is the "start", its level is 1.
2. How to specify the level for nodes at next level?
if (un_visited.contains(new_word)) {
    visited.add(new_word);
    queue.offer(new WordNode(cur_node, new_word, cur_node.level+1));
}
3. How to decide whether we reach a new level?
if (cur_node.level > cur_level) {
    un_visited.removeAll(visited);
    cur_level = cur_node.level;
}
Haha...Quite smart and elegant! Don't you think so!!!
Improvement 2:
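Rather than refusing to tag the "end" word (which is still problematic, because the same loss can happen at any intermediate word reachable from two parents at the same level), we can cover all shortest paths by using two HashSets.
One HashSet is called "un_visited", the other "visited". A word is added to "visited" as soon as it is enqueued, but it is removed from "un_visited" only after every node of the previous level has been expanded (the code does this when it dequeues the first node of the word's own level). Until then, other nodes at the previous level can still reach it. Note: "visited" here is the set of words waiting to be removed from "un_visited"; the test for enqueueing is made against "un_visited". The reason: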
start: hot
end: dog
dict: [hot, dot, hog, dog]
step 1: enqueue "hot", mark it as visited.
visited: ["hot"]
unvisited: ["hot", "dot", "hog", "dog"]
step 2: dequeue "hot" (level 1), enqueue "dot", "hog" (level 2) and add both to visited. Nothing is removed from unvisited yet. (Because this dictionary contains the start word, the code as written also re-enqueues "hot" at level 2. Those nodes add no path, since every neighbor of "hot" is removed in step 3, and they are left out of this trace.)
visited: ["hot", "dot", "hog"]
unvisited: ["hot", "dot", "hog", "dog"]
step 3: dequeue "dot". It is the first node of level 2, so everything in visited is removed from unvisited. Then enqueue "dog" and add "dog" to visited.
visited: ["hot", "dot", "hog", "dog"]
unvisited: ["dog"] //note: "hog" at the same level has not been expanded yet, so "dog" stays in unvisited.<important>
step 4: dequeue "hog", enqueue "dog"<different WordNode>. "dog" is already in visited.
visited: ["hot", "dot", "hog", "dog"]
unvisited: ["dog"]
step 5: dequeue the first "dog". It is the first node of level 3, so "dog" is removed from unvisited. <two dogs have already been enqueued at step 3 and step 4>
visited: ["hot", "dot", "hog", "dog"]
unvisited: []
So nice!!! We tag the nodes as visited level by level, in case level "i" has multiple nodes pointing to the same element at level "i+1".
Implementation:
1. prepare the initial state.
Set<String> visited = new HashSet<String> ();
Set<String> un_visited = new HashSet<String> ();
int cur_level = 1;
dict.add(end);
un_visited.addAll(dict);
queue.offer(new WordNode(null, start, 1));
visited.add(start);
2. on reaching a new level, remove the words visited so far from un_visited.
if (cur_node.level > cur_level) {
    un_visited.removeAll(visited);
    cur_level = cur_node.level;
}
3. add new words into the visited set.
if (un_visited.contains(new_word)) {
    visited.add(new_word);
    queue.offer(new WordNode(cur_node, new_word, cur_node.level+1));
}
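The same deferred-removal idea can also be written as a layered BFS. Note that this is a different formulation from the WordNode queue described above, shown only as a hedged sketch: it expands one whole level at a time, records for every word all of its parents from the previous level, and removes the new level from the unvisited set only after the current level is fully expanded. The names LayeredLadders, parents and collect are hypothetical, and the sketch assumes start differs from end.

```java
import java.util.*;

class LayeredLadders {
    static List<List<String>> findLadders(String start, String end, Set<String> dict) {
        Set<String> unvisited = new HashSet<>(dict);
        unvisited.add(end);
        unvisited.remove(start);
        // parents.get(w) = all words one level earlier that lead to w
        Map<String, List<String>> parents = new HashMap<>();
        Set<String> level = new HashSet<>();
        level.add(start);
        boolean found = false;
        while (!level.isEmpty() && !found) {
            Set<String> next = new HashSet<>();
            for (String word : level) {
                char[] chars = word.toCharArray();
                for (int i = 0; i < chars.length; i++) {
                    char original = chars[i];
                    for (char c = 'a'; c <= 'z'; c++) {
                        if (c == original) continue;
                        chars[i] = c;
                        String candidate = new String(chars);
                        if (unvisited.contains(candidate)) {
                            next.add(candidate);
                            parents.computeIfAbsent(candidate, k -> new ArrayList<>()).add(word);
                            if (candidate.equals(end)) found = true;
                        }
                    }
                    chars[i] = original;
                }
            }
            // Deferred tagging: drop the new level only after the whole
            // current level has been expanded.
            unvisited.removeAll(next);
            level = next;
        }
        List<List<String>> result = new ArrayList<>();
        if (found) collect(end, start, parents, new LinkedList<>(), result);
        return result;
    }

    // Walks the parent links back from word to start, emitting one path per branch.
    private static void collect(String word, String start, Map<String, List<String>> parents,
                                LinkedList<String> path, List<List<String>> result) {
        path.addFirst(word);
        if (word.equals(start)) {
            result.add(new ArrayList<>(path));
        } else {
            for (String p : parents.get(word)) collect(p, start, parents, path, result);
        }
        path.removeFirst();
    }
}
```

Storing parents per word keeps one entry for each distinct word, whereas the WordNode approach creates one node per way of reaching a word; which trade-off is preferable depends on the input.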
Solution:
import java.util.*;

class WordNode {
    WordNode pre;  // node this word was reached from; null for the start word
    String word;
    int level;     // 1 for the start word, parent's level + 1 otherwise
    public WordNode(WordNode pre, String word, int level) {
        this.pre = pre;
        this.word = word;
        this.level = level;
    }
}

public class Solution {
    public List<List<String>> findLadders(String start, String end, Set<String> dict) {
        if (start == null || end == null || dict == null)
            throw new IllegalArgumentException("The passed in arguments are illegal");
        List<List<String>> ret = new ArrayList<List<String>> ();
        Queue<WordNode> queue = new LinkedList<WordNode> ();
        Set<String> visited = new HashSet<String> ();
        Set<String> un_visited = new HashSet<String> ();
        int cur_level = 1;
        dict.add(end);
        un_visited.addAll(dict);
        queue.offer(new WordNode(null, start, 1));
        visited.add(start);
        while (!queue.isEmpty()) {
            WordNode cur_node = queue.poll();
            // First node of a new level: every word enqueued so far is now off limits.
            if (cur_node.level > cur_level) {
                un_visited.removeAll(visited);
                cur_level = cur_node.level;
            }
            String cur_word = cur_node.word;
            if (cur_word.equals(end)) {
                // Rebuild the path by following the pre references back to start.
                ArrayList<String> item = new ArrayList<String> ();
                while (cur_node != null) {
                    item.add(0, cur_node.word);
                    cur_node = cur_node.pre;
                }
                ret.add(item);
                continue;
            }
            char[] char_array = cur_word.toCharArray();
            for (int i = 0; i < end.length(); i++) {
                char temp = char_array[i];
                for (char c = 'a'; c <= 'z'; c++) {
                    char_array[i] = c;
                    String new_word = new String(char_array);
                    // A word stays in un_visited until its own level starts,
                    // so several parents at this level can all enqueue it.
                    if (un_visited.contains(new_word)) {
                        visited.add(new_word);
                        queue.offer(new WordNode(cur_node, new_word, cur_node.level+1));
                    }
                }
                char_array[i] = temp;
            }
        }
        return ret;
    }
}