I want to check whether a string is strictly a subset of another string. To this end I use boost::contains and compare the sizes of the strings, as follows:
#include <boost/algorithm/string.hpp>
#include <iostream>
#include <string>
using namespace std;
using namespace boost::algorithm;
int main()
{
    string str1 = "abc news";
    string str2 = "abc";
    // trim the strings using boost
    trim(str1);
    trim(str2);
    // if str2 is contained in str1 and is shorter than str1, then it is strictly contained in str1
    if (contains(str1, str2) && (str2.size() < str1.size()))
    {
        cout << "contains" << endl;
    }
    return 0;
}
Is there a better way to solve this problem, one that does not also require comparing the sizes of the strings?
Example
- ABC is a proper subset of ABC NEWS
- ABC is not a proper subset of ABC
4 solutions
#1
You can just use == or != to compare the strings:
if(contains(str1, str2) && (str1 != str2))
...
If one string contains the other and the two are not equal, you have a proper subset.
Whether this is better than your method is for you to decide. It is less typing and very clear (IMO), but probably a little slower when both strings are long and either equal or share a long common prefix.
Note: If you really care about performance, you might want to try the Boyer-Moore search and the Boyer-Moore-Horspool search. They can be much faster than a trivial string search (which is apparently what the string search in libstdc++ uses). I do not know whether boost::contains uses them.
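To make the Boyer-Moore suggestion concrete, here is a minimal sketch that assumes a C++17 compiler, where the standard library provides std::boyer_moore_searcher in <functional>. The helper name is_proper_substr_bm is hypothetical and not taken from the answer. Whether this beats a plain find depends on the inputs, so measure before relying on it.

```cpp
#include <algorithm>   // std::search
#include <functional>  // std::boyer_moore_searcher (C++17)
#include <string>

// Hypothetical helper (not from the answer): true if `sub` occurs in `s`
// and is strictly shorter than `s`.
bool is_proper_substr_bm(const std::string& sub, const std::string& s) {
    if (sub.size() >= s.size()) {
        return false;  // same length or longer can never be a proper substring
    }
    // Preprocess the pattern once, then search the text with it.
    std::boyer_moore_searcher searcher(sub.begin(), sub.end());
    return std::search(s.begin(), s.end(), searcher) != s.end();
}
```

The preprocessing step only pays off for longer patterns or repeated searches; for short strings such as "abc" the plain search is likely just as good.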
#2
I would use the following:
bool is_substr_of(const std::string& sub, const std::string& s) {
return sub.size() < s.size() && s.find(sub) != s.npos;
}
This uses the standard library only, and does the size check first, which is cheaper than s.find(sub) != s.npos.
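As a quick check of that function against the examples from the question, the sketch below repeats the definition so it stands on its own. The function itself is from the answer; the edge-case comments are my own reading of std::string::find.

```cpp
#include <string>

// Same function as in the answer: `sub` must be strictly shorter than `s`
// and must occur somewhere inside it.
bool is_substr_of(const std::string& sub, const std::string& s) {
    // The size comparison runs first, so find() is skipped whenever
    // `sub` is at least as long as `s`.
    return sub.size() < s.size() && s.find(sub) != s.npos;
}

// Edge cases worth knowing:
//   is_substr_of("ABC", "ABC NEWS") -> true
//   is_substr_of("ABC", "ABC")      -> false (equal strings are not proper)
//   is_substr_of("", "ABC")         -> true  (find("") matches at position 0)
//   is_substr_of("", "")            -> false (sizes are equal)
```

If an empty string should not count as a proper subset, an extra !sub.empty() check would be needed.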
#3
About comparison operations
TL;DR: Be sure about the format of what you're comparing.
Be wary of how you define "strictly".
For example, you did not point out these issues in your question, but suppose I submit, say:
"ABC " //IE whitespaces
"ABC\n"
What is your take on these? Do you accept them or not? If you don't, you'll have to either trim or otherwise clean your strings before comparing. This is just a general note on comparison operations.
Anyway, as Baum pointed out, you can either check the equality of your strings using ==, or you can compare their lengths (which is more efficient, given that you have already checked for a substring) with either size() or length().
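The whitespace concern can be illustrated with a small sketch that trims both strings before comparing, using only the standard library. The helper names trimmed and is_proper_substr_trimmed are hypothetical, and treating everything std::isspace accepts as whitespace is an assumption about what "clean" should mean here.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: returns a copy of `s` without leading or trailing
// whitespace (as classified by std::isspace).
std::string trimmed(const std::string& s) {
    auto is_space = [](unsigned char c) { return std::isspace(c) != 0; };
    std::size_t first = 0;
    while (first < s.size() && is_space(s[first])) ++first;
    std::size_t last = s.size();
    while (last > first && is_space(s[last - 1])) --last;
    return s.substr(first, last - first);
}

// Hypothetical helper: proper-substring test on the trimmed strings, so that
// "ABC" is not reported as strictly contained in "ABC " or "ABC\n".
bool is_proper_substr_trimmed(const std::string& sub, const std::string& s) {
    const std::string a = trimmed(sub);
    const std::string b = trimmed(s);
    return a.size() < b.size() && b.find(a) != std::string::npos;
}
```

Without the trimming step, a plain size-and-find check would treat "ABC" as a proper subset of "ABC ", which is probably not what is wanted.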
#4
Another approach, using only the standard library:
#include <algorithm>
#include <string>
#include <iostream>
using namespace std;
int main()
{
    string str1 = "abc news";
    string str2 = "abc";
    if (str2 != str1
        && search(begin(str1), end(str1),
                  begin(str2), end(str2)) != end(str1))
    {
        cout << "contains" << endl;
    }
    return 0;
}