72. Edit Distance

Date: 2020-12-23 04:54:17

Problem:

Given two words word1 and word2, find the minimum number of steps required to convert word1 to word2. (each operation is counted as 1 step.)

You have the following 3 operations permitted on a word:

a) Insert a character
b) Delete a character
c) Replace a character
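
As a quick illustration of the three operations, "horse" can be turned into "ros" in three steps: replace 'h' with 'r' (rorse), delete the second 'r' (rose), delete 'e' (ros). The sketch below checks that count with a plain top-down recursion plus memoization. It is only meant to make the problem statement concrete and is not the approach used in the solutions that follow; the class name EditDistanceMemo and the helper solve are illustrative names.

```java
class EditDistanceMemo {
    // memo[i][j] caches the answer for the suffixes a[i..] and b[j..]; null means "not computed yet".
    private static Integer[][] memo;

    // Minimum number of edits needed to turn a[i..] into b[j..].
    private static int solve(String a, String b, int i, int j) {
        if (i == a.length()) return b.length() - j; // a exhausted: insert the rest of b
        if (j == b.length()) return a.length() - i; // b exhausted: delete the rest of a
        if (memo[i][j] != null) return memo[i][j];

        int result;
        if (a.charAt(i) == b.charAt(j)) {
            result = solve(a, b, i + 1, j + 1);      // characters match: no edit
        } else {
            int replace = solve(a, b, i + 1, j + 1); // replace a[i] with b[j]
            int delete = solve(a, b, i + 1, j);      // delete a[i]
            int insert = solve(a, b, i, j + 1);      // insert b[j] before a[i]
            result = 1 + Math.min(replace, Math.min(delete, insert));
        }
        memo[i][j] = result;
        return result;
    }

    static int minDistance(String a, String b) {
        memo = new Integer[a.length()][b.length()];
        return solve(a, b, 0, 0);
    }

    public static void main(String[] args) {
        System.out.println(minDistance("horse", "ros")); // prints 3
    }
}
```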

Link: http://leetcode.com/problems/edit-distance/

Solution:

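A classic Dynamic Programming problem, and one I definitely need to keep studying. See the references below for detailed explanations. The space can be further reduced to O(n) or O(m) with a rolling array. I've been busy with home renovation lately and haven't had time to practice problems or prepare for interviews; I'll make up for it on the next pass.

Onsite at BB next Monday, going in unprepared. Hoping for some good luck!

Time Complexity - O(mn), Space Complexity - O(mn).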

public class Solution {
    public int minDistance(String word1, String word2) {
        if (word1 == null || word2 == null)
            return 0;
        int word1Len = word1.length(), word2Len = word2.length();
        int[][] dp = new int[word1Len + 1][word2Len + 1];

        for (int i = 0; i < word1Len + 1; i++) // word1 as row
            dp[i][0] = i;
        for (int j = 1; j < word2Len + 1; j++) // word2 as column
            dp[0][j] = j;

        for (int i = 1; i < word1Len + 1; i++) {
            for (int j = 1; j < word2Len + 1; j++) {
                if (word1.charAt(i - 1) == word2.charAt(j - 1))
                    dp[i][j] = dp[i - 1][j - 1];
                else
                    dp[i][j] = 1 + Math.min(dp[i - 1][j - 1], Math.min(dp[i][j - 1], dp[i - 1][j]));
            }
        }

        return dp[word1Len][word2Len];
    }
}

Update:

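The main tool is DP. Let word1 index the rows and word2 the columns. For initialization, set distance[i][0] and distance[0][j], which are the number of steps needed when the other string is empty. The transition equation is then: when the current characters are equal, distance[i][j] = distance[i - 1][j - 1]; otherwise insert, replace, and delete all have weight 1, and the equation is 1 + the minimum over the three kinds of change, that is, Math.min(distance[i - 1][j - 1], Math.min(distance[i - 1][j], distance[i][j - 1])). Here distance[i - 1][j - 1] is replace, distance[i - 1][j] is deleting a character from word1, and distance[i][j - 1] is deleting a character from word2 (equivalently, inserting one into word1).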

public class Solution {
    public int minDistance(String word1, String word2) {
        if (word1 == null || word2 == null)
            return 0;
        int word1Len = word1.length(), word2Len = word2.length();
        int[][] distance = new int[word1Len + 1][word2Len + 1];

        for (int i = 1; i < word1Len + 1; i++)
            distance[i][0] = i;
        for (int j = 1; j < word2Len + 1; j++)
            distance[0][j] = j;

        for (int i = 1; i < word1Len + 1; i++) {
            for (int j = 1; j < word2Len + 1; j++)
                if (word1.charAt(i - 1) == word2.charAt(j - 1))
                    distance[i][j] = distance[i - 1][j - 1];
                else
                    distance[i][j] = 1 + Math.min(distance[i - 1][j - 1], Math.min(distance[i - 1][j], distance[i][j - 1]));
        }

        return distance[word1Len][word2Len];
    }
}
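
The mapping between table neighbors and operations can be made concrete by walking back through the finished table and recording which neighbor produced each value. The following is a minimal sketch of that idea, not part of the accepted solution: the class EditScript and the method operations are hypothetical names, and when several optimal edit sequences exist it simply returns whichever one the branch order finds first.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class EditScript {
    // Returns one optimal sequence of edits that turns word1 into word2.
    static List<String> operations(String word1, String word2) {
        int m = word1.length(), n = word2.length();
        int[][] distance = new int[m + 1][n + 1];
        for (int i = 1; i <= m; i++) distance[i][0] = i;
        for (int j = 1; j <= n; j++) distance[0][j] = j;
        for (int i = 1; i <= m; i++) {
            for (int j = 1; j <= n; j++) {
                if (word1.charAt(i - 1) == word2.charAt(j - 1)) {
                    distance[i][j] = distance[i - 1][j - 1];
                } else {
                    distance[i][j] = 1 + Math.min(distance[i - 1][j - 1],
                            Math.min(distance[i - 1][j], distance[i][j - 1]));
                }
            }
        }

        // Walk back from the bottom-right cell to the top-left cell.
        // push() adds at the front, so the edits come out in left-to-right order.
        Deque<String> ops = new ArrayDeque<>();
        int i = m, j = n;
        while (i > 0 || j > 0) {
            if (i > 0 && j > 0 && word1.charAt(i - 1) == word2.charAt(j - 1)) {
                i--; j--; // characters match: diagonal move, no edit
            } else if (i > 0 && j > 0 && distance[i][j] == distance[i - 1][j - 1] + 1) {
                ops.push("replace '" + word1.charAt(i - 1) + "' with '" + word2.charAt(j - 1) + "'");
                i--; j--;
            } else if (i > 0 && distance[i][j] == distance[i - 1][j] + 1) {
                ops.push("delete '" + word1.charAt(i - 1) + "'"); // came from the cell above
                i--;
            } else {
                ops.push("insert '" + word2.charAt(j - 1) + "'"); // came from the cell to the left
                j--;
            }
        }
        return new ArrayList<>(ops);
    }
}
```

The length of the returned list always equals distance[m][n], since every recorded edit corresponds to exactly one "+ 1" step in the table.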

Second pass:

The approach still isn't entirely clear to me. Let's try breaking it into the following steps:

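  1. This problem calls for dp.
  2. What we need to work out is how to define the dp state, how to initialize it, and what the transition equation is.
  3. First consider the edge case: when either string is null, we return 0.
  4. Next, create a dp matrix dist. If word1 has length word1Len and word2 has length word2Len, the matrix has dimensions (word1Len + 1) x (word2Len + 1).
  5. Initialize the first row and first column: dist[i][0] = i and dist[0][j] = j. Both handle the case where one of the words is empty.
  6. Next, define dist[i][j] as the min edit distance between the first i characters of word1 and the first j characters of word2. That gives the following formulas:
    1. If word1.charAt(i - 1) == word2.charAt(j - 1), then dist[i][j] = dist[i - 1][j - 1].
    2. Otherwise dist[i][j] = 1 + min(dist[i - 1][j - 1], min(dist[i - 1][j], dist[i][j - 1])).
      1. Taking dist[i - 1][j - 1] means replace.
      2. Taking dist[i - 1][j] means the first i - 1 characters of word1 are converted into the first j characters of word2, and the leftover character of word1 is removed. For word1 this is a delete.
      3. Taking dist[i][j - 1] means the first i characters of word1 are converted into the first j - 1 characters of word2, and the missing last character is added. For word1 this is an insert.
  7. Finally, return dist[word1Len][word2Len].
  8. This can also be reduced to a rolling array to reach Space Complexity - O(n); left for the third pass.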
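
For reference, the steps above amount to the following recurrence. The symbols are shorthand introduced here, not notation from the original notes: D(i, j) stands for dist[i][j], and a_i, b_j are the i-th and j-th characters of word1 and word2, counted from 1.

```latex
\begin{aligned}
D(i, 0) &= i, \qquad D(0, j) = j, \\
D(i, j) &=
\begin{cases}
D(i-1,\, j-1) & \text{if } a_i = b_j, \\
1 + \min\bigl\{ D(i-1,\, j-1),\; D(i-1,\, j),\; D(i,\, j-1) \bigr\} & \text{otherwise.}
\end{cases}
\end{aligned}
```

The answer is D(m, n), where m and n are the lengths of word1 and word2.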

Java:

Time Complexity - O(mn), Space Complexity - O(mn).

public class Solution {
    public int minDistance(String word1, String word2) {
        if (word1 == null || word2 == null) {
            return 0;
        }
        int word1Len = word1.length(), word2Len = word2.length();
        int[][] dist = new int[word1Len + 1][word2Len + 1];
        for (int i = 1; i <= word1Len; i++) {
            dist[i][0] = i;
        }
        for (int j = 1; j <= word2Len; j++) {
            dist[0][j] = j;
        }

        for (int i = 1; i <= word1Len; i++) {
            for (int j = 1; j <= word2Len; j++) {
                if (word1.charAt(i - 1) == word2.charAt(j - 1)) {
                    dist[i][j] = dist[i - 1][j - 1];
                } else {
                    dist[i][j] = Math.min(dist[i - 1][j - 1], Math.min(dist[i - 1][j], dist[i][j - 1])) + 1;
                }
            }
        }

        return dist[word1Len][word2Len];
    }
}

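Third pass:

Still dp. When the two characters are equal, take the top-left value. Otherwise one more edit is needed: take the smallest of the top-left, top, and left values, add 1, and continue.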

Java:

public class Solution {
    public int minDistance(String word1, String word2) {
        if (word1 == null || word2 == null) return Integer.MAX_VALUE;
        int m = word1.length(), n = word2.length();
        int[][] dp = new int[m + 1][n + 1];
        for (int i = 1; i <= m; i++) dp[i][0] = i;
        for (int j = 1; j <= n; j++) dp[0][j] = j;

        for (int i = 1; i <= m; i++) {
            for (int j = 1; j <= n; j++) {
                if (word1.charAt(i - 1) == word2.charAt(j - 1)) dp[i][j] = dp[i - 1][j - 1];
                else dp[i][j] = 1 + Math.min(dp[i - 1][j - 1], Math.min(dp[i - 1][j], dp[i][j - 1]));
            }
        }
        return dp[m][n];
    }
}

1D DP:

As in Maximal Square, we use a variable topLeft to hold the value of the top-left (diagonal) cell.

public class Solution {
    public int minDistance(String word1, String word2) {
        if (word1 == null || word2 == null) return Integer.MAX_VALUE;
        int m = word1.length(), n = word2.length();
        if (m == 0) return n;
        else if (n == 0) return m;

        // dp[j] holds the previous row's value until it is overwritten for the current row.
        int[] dp = new int[n + 1];
        for (int j = 1; j <= n; j++) dp[j] = j;

        for (int i = 1; i <= m; i++) {
            int topLeft = dp[0]; // value of cell (i - 1, 0)
            dp[0] = i;           // value of cell (i, 0)
            for (int j = 1; j <= n; j++) {
                int tmp = dp[j]; // value of cell (i - 1, j), the next topLeft
                if (word1.charAt(i - 1) == word2.charAt(j - 1)) dp[j] = topLeft;
                else dp[j] = 1 + Math.min(topLeft, Math.min(dp[j], dp[j - 1]));
                topLeft = tmp;
            }
        }
        return dp[n];
    }
}
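
The "O(n) or O(m)" remark from the first pass can be taken one step further: with all three operations costing 1, the distance from word1 to word2 equals the distance from word2 to word1, so the array can always be sized by the shorter word, giving O(min(m, n)) space. Below is a minimal sketch under that assumption; the class name EditDistanceMinSpace is made up for illustration, and null handling is left out for brevity.

```java
class EditDistanceMinSpace {
    static int minDistance(String word1, String word2) {
        // Unit-cost edit distance is symmetric, so put the shorter word on the column axis.
        if (word1.length() < word2.length()) {
            String tmp = word1;
            word1 = word2;
            word2 = tmp;
        }
        int m = word1.length(), n = word2.length();

        int[] dp = new int[n + 1];
        for (int j = 1; j <= n; j++) dp[j] = j; // row 0: build word2's prefix by inserts

        for (int i = 1; i <= m; i++) {
            int topLeft = dp[0]; // value of cell (i - 1, 0)
            dp[0] = i;           // value of cell (i, 0): delete i characters
            for (int j = 1; j <= n; j++) {
                int up = dp[j];  // value of cell (i - 1, j) before it is overwritten
                if (word1.charAt(i - 1) == word2.charAt(j - 1)) {
                    dp[j] = topLeft;
                } else {
                    dp[j] = 1 + Math.min(topLeft, Math.min(up, dp[j - 1]));
                }
                topLeft = up;
            }
        }
        return dp[n];
    }
}
```

Time Complexity stays O(mn); only the array length changes, from n + 1 to min(m, n) + 1.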

Reference:

https://leetcode.com/discuss/10426/my-o-mn-time-and-o-n-space-solution-using-dp-with-explanation

http://www.cnblogs.com/springfor/p/3896167.html

https://leetcode.com/discuss/17997/my-accepted-java-solution

https://leetcode.com/discuss/20945/standard-dp-solution

https://leetcode.com/discuss/5138/good-pdf-on-edit-distance-problem-may-be-helpful

https://leetcode.com/discuss/43398/20ms-detailed-explained-c-solutions-o-n-space

http://web.stanford.edu/class/cs124/lec/med.pdf

https://en.wikipedia.org/wiki/Edit_distance

https://leetcode.com/discuss/64063/ac-python-212-ms-dp-solution-o-mn-time-o-n-space
