How do you do a case-insensitive insertion or search of a string in std::set?
For example:
std::set<std::string> s;
s.insert("Hello");
s.insert("HELLO"); //not allowed, string already exists.
4 solutions
#1
31
You need to define a custom comparator:
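#include <set>
#include <string>
#include <string.h>
// Note: stricmp is not standard C++. MSVC spells it _stricmp; POSIX offers strcasecmp in <strings.h>.
struct InsensitiveCompare {
    bool operator() (const std::string& a, const std::string& b) const {
        return stricmp(a.c_str(), b.c_str()) < 0;
    }
};
std::set<std::string, InsensitiveCompare> s;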
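A minimal sketch of how such a set behaves on insert and lookup, for toolchains where stricmp itself is missing. The wrapper compare_ignore_case and the function insert_and_find_demo are hypothetical names introduced here; the sketch assumes the platform provides either _stricmp (MSVC) or strcasecmp (POSIX), and, like any c_str()-based comparison, it stops at an embedded null character.

```cpp
#include <cassert>
#include <set>
#include <string>
#if defined(_MSC_VER)
#include <string.h>   // _stricmp
#else
#include <strings.h>  // strcasecmp (POSIX)
#endif

// Hypothetical wrapper: picks the spelling the platform provides.
inline int compare_ignore_case(const char* a, const char* b) {
#if defined(_MSC_VER)
    return _stricmp(a, b);
#else
    return strcasecmp(a, b);
#endif
}

struct InsensitiveCompare {
    bool operator()(const std::string& a, const std::string& b) const {
        return compare_ignore_case(a.c_str(), b.c_str()) < 0;
    }
};

// Replays the scenario from the question; call it from main().
inline void insert_and_find_demo() {
    std::set<std::string, InsensitiveCompare> s;
    const bool first = s.insert("Hello").second;
    const bool second = s.insert("HELLO").second;
    assert(first);                        // "Hello" is inserted
    assert(!second);                      // "HELLO" is rejected as a duplicate
    assert(s.find("hello") != s.end());   // lookup ignores case as well
    assert(*s.find("hello") == "Hello");  // the first spelling is kept
}
```

The set uses the comparator for both ordering and equivalence, so insert, find and count all ignore case without any further changes.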
#2
2
std::set lets you supply your own comparator (as do most standard containers), so you can perform any kind of comparison you like. Full example is available here
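A minimal sketch of the same mechanism on a different container, std::map. This is not the linked example; icase_less, icase_map, count_words and count_words_demo are hypothetical names, and the comparison assumes single-byte characters in the current C locale.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <initializer_list>
#include <map>
#include <string>

// Hypothetical comparator: orders strings by their lower-cased characters.
struct icase_less {
    bool operator()(const std::string& a, const std::string& b) const {
        return std::lexicographical_compare(
            a.begin(), a.end(), b.begin(), b.end(),
            // unsigned char parameters keep std::tolower away from negative values
            [](unsigned char x, unsigned char y) {
                return std::tolower(x) < std::tolower(y);
            });
    }
};

// The comparator is the third template argument of std::map.
using icase_map = std::map<std::string, int, icase_less>;

// Counts words while ignoring case; each key keeps its first spelling.
inline icase_map count_words(std::initializer_list<std::string> words) {
    icase_map counts;
    for (const auto& w : words) ++counts[w];
    return counts;
}

// Quick self-check; call it from main().
inline void count_words_demo() {
    const icase_map m = count_words({"Hello", "HELLO", "world"});
    assert(m.size() == 2);       // "Hello" and "HELLO" share one entry
    assert(m.at("hello") == 2);  // lookup ignores case
}
```

For std::set the comparator goes in the second template argument instead, as in std::set<std::string, icase_less>.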
#3
1
This is a generic solution that also works with string types other than std::string (tested with std::wstring, std::string_view, char const*). Basically anything that defines a range of characters should work.
The key point here is to use boost::as_literal, which allows us to treat null-terminated character arrays, character pointers and ranges uniformly in the comparator.
Generic code ("iset.h"):
#pragma once
#include <set>
#include <algorithm>
#include <boost/algorithm/string.hpp>
#include <boost/range/as_literal.hpp>
// Case-insensitive generic string comparator.
struct range_iless
{
template< typename InputRange1, typename InputRange2 >
bool operator()( InputRange1 const& r1, InputRange2 const& r2 ) const
{
// Bring in the standard begin() and end() as well as any custom overloads found via ADL.
using std::begin; using std::end;
// Treat null-terminated character arrays, character pointers and ranges uniformly.
// This just creates cheap iterator ranges (it doesn't copy container arguments)!
auto ir1 = boost::as_literal( r1 );
auto ir2 = boost::as_literal( r2 );
// Compare case-insensitively.
return std::lexicographical_compare(
begin( ir1 ), end( ir1 ),
begin( ir2 ), end( ir2 ),
boost::is_iless{} );
}
};
// Case-insensitive set for any Key that consists of a range of characters.
template< class Key, class Allocator = std::allocator<Key> >
using iset = std::set< Key, range_iless, Allocator >;
Usage example ("main.cpp"):
#include "iset.h" // above header file
#include <iostream>
#include <string>
#include <string_view>
// Output range to stream.
template< typename InputRange, typename Stream, typename CharT >
void write_to( Stream& s, InputRange const& r, CharT const* sep )
{
for( auto const& elem : r )
s << elem << sep;
s << std::endl;
}
int main()
{
iset< std::string > s1{ "Hello", "HELLO", "world" };
iset< std::wstring > s2{ L"Hello", L"HELLO", L"world" };
iset< char const* > s3{ "Hello", "HELLO", "world" };
iset< std::string_view > s4{ "Hello", "HELLO", "world" };
write_to( std::cout, s1, " " );
write_to( std::wcout, s2, L" " );
write_to( std::cout, s3, " " );
write_to( std::cout, s4, " " );
}
Live demo on Coliru
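If Boost is not an option, a rough standard-only approximation (assuming C++17) is to convert both arguments to std::basic_string_view. sv_iless and sv_iless_demo are hypothetical names, not part of the header above. Unlike range_iless, this comparator has to be told the character type, and it only covers key types that convert implicitly to the matching string view.

```cpp
#include <algorithm>
#include <cassert>
#include <locale>
#include <set>
#include <string>
#include <string_view>

// Hypothetical Boost-free comparator for a single character type CharT.
template <class CharT>
struct sv_iless {
    using view = std::basic_string_view<CharT>;

    // std::basic_string, CharT const* and string views all convert to view.
    bool operator()(view a, view b) const {
        const std::locale loc;  // copy of the current global locale
        return std::lexicographical_compare(
            a.begin(), a.end(), b.begin(), b.end(),
            [&loc](CharT x, CharT y) {
                return std::tolower(x, loc) < std::tolower(y, loc);
            });
    }
};

// Quick self-check; call it from main().
inline void sv_iless_demo() {
    const std::set<std::string, sv_iless<char>> s{"Hello", "HELLO", "world"};
    assert(s.size() == 2);  // "HELLO" collapses into "Hello"
}
```

With this, std::set<std::string, sv_iless<char>> and std::set<std::wstring, sv_iless<wchar_t>> should behave like the corresponding iset aliases for those key types.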
#4
0
From what I have read, this is more portable than stricmp(), because stricmp() is not actually part of the standard library; it is only implemented by most compiler vendors. So below is my solution: just roll your own.
#include <string>
#include <cctype>
#include <iostream>
#include <set>
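struct caseInsensitiveLess
{
    // const, so the comparator can also be called through a const std::set
    bool operator()(const std::string& x, const std::string& y) const
    {
        const std::size_t bound = x.size() < y.size() ? x.size() : y.size();
        for (std::size_t i = 0; i < bound; ++i)
        {
            // Cast to unsigned char: passing a negative char to tolower is undefined behaviour.
            const int cx = std::tolower(static_cast<unsigned char>(x[i]));
            const int cy = std::tolower(static_cast<unsigned char>(y[i]));
            if (cx != cy)
                return cx < cy;
        }
        // Equal up to the shorter length: the shorter string sorts first,
        // so "abc" and "abcd" are not treated as duplicates.
        return x.size() < y.size();
    }
};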
int main()
{
std::set<std::string, caseInsensitiveLess> ss1;
std::set<std::string> ss2;
ss1.insert("This is the first string");
ss1.insert("THIS IS THE FIRST STRING");
ss1.insert("THIS IS THE SECOND STRING");
ss1.insert("This IS THE SECOND STRING");
ss1.insert("This IS THE Third");
ss2.insert("this is the first string");
ss2.insert("this is the first string");
ss2.insert("this is the second string");
ss2.insert("this is the second string");
ss2.insert("this is the third");
for ( auto& i: ss1 )
std::cout << i << std::endl;
std::cout << std::endl;
for ( auto& i: ss2 )
std::cout << i << std::endl;
}
Output with the case-insensitive set and the regular set, showing the same ordering:
This is the first string
THIS IS THE SECOND STRING
This IS THE Third
this is the first string
this is the second string
this is the third