30. Substring with Concatenation of All Words 算法分析及其优化

You are given astring, s, and a list of words, words, that are all of the same length. Find all starting indices ofsubstring(s) in s that is a concatenation of each word in wordsexactly once and without any intervening characters.

For example,given:

s: "barfoothefoobarman"

words: ["foo","bar"]

You should return theindices: [0,9].

(order does notmatter).

来自 <https://leetcode.com/problems/substring-with-concatenation-of-all-words/>

题目大意：

这道题的要求是给定字符串S，单词列表words（所有单词均等长），在S中找到包含words中所有单词且只包含一次的子字符串，返回符合该要求的所有子串的首位置，word中的字符串可能会有重复。

题目分析：

我们很容易想到通过暴力的方法找出所有可能性，设置双下标i和j，i作为子串开头，i往后遍历过程中，j依次找出与words中字符长度一样的串，查找是否在words中出现（通过hash表实现）。如果找出满足题意的子串，则把开头i记录下来。

代码如下（600ms）：

class Solution {
public:
    vector<int> findSubstring(string s, vector<string>& words) {
        const int sLen=s.length();
        const int N=words.size();
        const int wLen=words[0].size();

        int i,j,n;
        unordered_map<string,int> mapWords;//统计遍历过程中单词数
        unordered_map<string,int> mapNum;//统计words各单词数
        vector<int> ans;

        for(auto str:words) mapNum[str]++; //初始化mapNum

        for(i=0;i<=sLen-N*wLen;i++){//双指针遍历
            for(j=i,n=0;n<N;j=i+wLen*(++n)){
                string str(s,j,wLen);//取与words中字符串等长的str
mapWords[str]++;
if(mapWords[str]>mapNum[str]) break;//map中没有或者超出重复个数，退出循环
            }
            if(n==N) ans.push_back(i);//判断是否达到匹配数目
            mapWords.clear();
        }
        return ans;
    }
};

优化分析：

结果发现，上述代码效率实在是低，这是因为我们按部就班的遍历i，时间复杂度为 O(M*N)，M为words中字符串的长度，N为s长度。下面介绍一种时间复杂度为 O(N)的优化方法，来跳过不必要的i遍历：

我们可以把s中获取的长度为M的字符串str看做一个整体

在匹配的过程，有如下三种情况会出现：

1.当str在words中存在并且没有超出其重复个数时（与之前没有区别），i不变j正常往后偏移wLen，并且与words中的匹配数目n要加1.

2.当str不在words中时，这时抬头下标i应当转移到str的下一位置，j设置成i相等，n清零。

3.当str在words中存在，但数目已经超出重复个数，这时i往后偏移直到找出前面与str相同的子串，并且将i设置这个串的下一个位置，n也要做出相应的改变

代码如下（60ms）：

class Solution {
public:
    vector<int> findSubstring(string s, vector<string>& words) {
        const int sLen=s.length();
        const int N=words.size();
        const int wLen=words[0].size();

        int i,j,n;
        unordered_map<string,int> mapWords;//统计遍历过程中单词数
        unordered_map<string,int> mapNum;//统计words各单词数
        vector<int> ans;//记录结果

        for(auto str:words) mapNum[str]++;  //初始化mapNum
        for(int start=0;start<wLen;start++){
            i=start;n=0;j=i;mapWords.clear();
            while(i<=sLen-N*wLen){//双指针遍历
                string str(s,j,wLen);
                ++mapWords[str];
                if(mapWords[str]<=mapNum[str]){//map中有并且没有超过重复个数
                    ++n;
                    j+=wLen;
                }else if(0==mapNum[str]){//map中没有
                    i=j+wLen;
                    j=i;
                    n=0;
                    mapWords.clear();
                }else{//map中有但超过了重复个数
                    while(s.substr(i,wLen)!=str){
                         --mapWords[s.substr(i,wLen)];
                         --n;
                         i+=wLen;
                    }
                    --mapWords[s.substr(i,wLen)];
                    i+=wLen;
                    j+=wLen;
                }

                if(n==N){//数量达到N时记录结果
                    ans.push_back(i);
                    --mapWords[s.substr(i,wLen)];
                    --n;
                    i+=wLen;
                }
            }
        }
        return ans;
    }
};

秒客网

30. Substring with Concatenation of All Words 算法分析及其优化

相关文章