How to safely read an unsigned int from a stream? [duplicate]

Time: 2021-12-14 04:14:39

This question already has an answer here:

In the following program

#include <iostream>
#include <sstream>

int main()
{
    std::istringstream iss("-89");
    std::cout << static_cast<bool>(iss) << iss.good() << iss.fail() << iss.bad() << iss.eof() << '\n';
    unsigned int u;
    iss >> u;
    std::cout << static_cast<bool>(iss) << iss.good() << iss.fail() << iss.bad() << iss.eof() << '\n';
    return 0;
}

the streams lib reads a signed value into an unsigned int without even a hiccup, silently producing a wrong result:

11000
10001

We need to be able to catch those runtime type mismatch errors. If we hadn't just caught this in a simulation, this could have blown up very expensive hardware.

How can we safely read an unsigned value from a stream?

4 solutions

#1


8  

You can read into a variable of a signed type that can handle the entire range first and test if it is negative or beyond the maximum of your target type. If your unsigned values may not fit into the largest signed type available, you'll have to do parsing using something other than iostreams.

#2


10  

You could write a manipulator:

#include <iostream>
#include <sstream>
#include <stdexcept>

template <typename T>
struct ExtractUnsigned
{
    T& value;
    explicit ExtractUnsigned(T& value) : value(value) {}
    void read(std::istream& stream) const {
        char c;
        // operator>> skips leading whitespace; stop if no character is left,
        // otherwise c would be inspected uninitialized
        if (!(stream >> c)) return;
        if (c == '-') throw std::runtime_error("Invalid unsigned number");
        stream.putback(c);
        stream >> value;
    }
};

template <typename T>
inline ExtractUnsigned<T> extract_unsigned(T& value) {
    return ExtractUnsigned<T>(value);
}

template <typename T>
inline std::istream& operator >> (std::istream& stream, const ExtractUnsigned<T>& extract) {
    extract.read(stream);
    return stream;
}


int main()
{
    std::istringstream data("   +89   -89");
    unsigned u;
    data >> extract_unsigned(u);
    std::cout << u << '\n';
    try {
        // the next token starts with '-', so this extraction throws
        data >> extract_unsigned(u);
    } catch (const std::runtime_error& e) {
        std::cerr << e.what() << '\n';
    }
    return 0;
}

#3


4  

First off, I think parsing a negative value for an unsigned value is wrong. The value is decoded by std::num_get<char> according to the format of strtoull() (22.4.2.1.2 paragraph 3, stage 3, second bullet). The format of strtoull() is defined in C 7.22.1.4 to be the same as for integer constants in C 6.4.4.1, which requires that the literal value can be represented by an unsigned type. Clearly, a negative value cannot be represented by an unsigned type. Admittedly, I looked at C11, which isn't really the C standard referenced from C++11. Also, quoting standard paragraphs at the compiler won't fix the issue. Hence, below is an approach which neatly changes the decoding of the values.

You could set up a global std::locale with a std::num_get<...> facet that rejects strings starting with a minus sign for the unsigned integer types. The do_get() overrides could simply check the first character and then delegate to the base class version if it isn't a '-'.

Below is the code for a custom facet. Although it is quite a bit of code, the actual use is rather straightforward. Most of the code is just boilerplate overriding the different virtual functions used to parse an unsigned number (i.e., the do_get() members). These are all implemented in terms of the member function template get_impl(), which checks whether there are no more characters or whether the next character is a '-'. In either of these two cases the conversion fails by adding std::ios_base::failbit to the parameter err. Otherwise, the function merely delegates to the base class conversion.

The facet created this way is then used to construct a new std::locale object (custom; note that the allocated positive_num_get object is automatically released when the last std::locale object using it is released). This std::locale is installed as the global locale. The global locale is used by all newly created streams. Existing streams, such as std::cin in the example, need to be imbue()d with the locale if it should affect them. Once the global locale is set up, newly created streams will just pick up the changed decoding rules, i.e., there shouldn't be much need to change code.

#include <iostream>
#include <sstream>
#include <locale>
#include <string>

class positive_num_get
    : public std::num_get<char> {
    typedef std::num_get<char>::iter_type iter_type;
    typedef std::num_get<char>::char_type char_type;

    // actual implementation: if there is no character or it is a '-' fail
    template <typename T>
    iter_type get_impl(iter_type in, iter_type end,
                       std::ios_base& str, std::ios_base::iostate& err,
                       T& val) const {
        if (in == end || *in == '-') {
            err |= std::ios_base::failbit;
            return in;
        }
        else {
            return this->std::num_get<char>::do_get(in, end, str, err, val);
        }
    }
    // overrides of the various virtual functions
    iter_type do_get(iter_type in, iter_type end,
                     std::ios_base& str, std::ios_base::iostate& err,
                     unsigned short& val) const override {
        return this->get_impl(in, end, str, err, val);
    }
    iter_type do_get(iter_type in, iter_type end,
                     std::ios_base& str, std::ios_base::iostate& err,
                     unsigned int& val) const override {
        return this->get_impl(in, end, str, err, val);
    }
    iter_type do_get(iter_type in, iter_type end,
                     std::ios_base& str, std::ios_base::iostate& err,
                     unsigned long& val) const override {
        return this->get_impl(in, end, str, err, val);
    }
    iter_type do_get(iter_type in, iter_type end,
                     std::ios_base& str, std::ios_base::iostate& err,
                     unsigned long long& val) const override {
        return this->get_impl(in, end, str, err, val);
    }
};

void read(std::string const& input)
{
    std::istringstream in(input);
    unsigned long value;
    if (in >> value) {
        std::cout << "read " << value << " from '" << input << "'\n";
    }
    else {
        std::cout << "failed to read value from '" << input << "'\n";
    }
}

int main()
{
    read("\t 17");
    read("\t -18");

    std::locale custom(std::locale(), new positive_num_get);
    std::locale::global(custom);
    std::cin.imbue(custom);

    read("\t 19");
    read("\t -20");
}

#4


0  

You could do it as follows:

#include <iostream>
#include <sstream>

int main()
{
        std::istringstream iss("-89");
        std::cout << static_cast<bool>(iss) << iss.good() << iss.fail() << iss.bad() << iss.eof() << '\n';
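        int u;
        // zero is a valid unsigned value, so test u >= 0;
        // note that int cannot hold unsigned values above INT_MAX
        if ((iss >> u) && (u >= 0)) {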
                unsigned int u1 = static_cast<unsigned int>(u);
                std::cout << "No errors: " << u1 << std::endl;
        } else {
                std::cout << "Error" << std::endl;
        }
        std::cout << static_cast<bool>(iss) << iss.good() << iss.fail() << iss.bad() << iss.eof() << '\n';
        return 0;
}
