ifstream error when reading /proc/pid/stat files

Time: 2021-11-18 04:50:09

Why does the following code throw an exception? Note that the file is a /proc/pid/stat file, so the kernel could interfere with it.

// Checked that file does exist
try {
  std::ifstream file(path.c_str());
  // Shouldn't even be necessary because it's the default but it doesn't 
  // make any difference.
  file.exceptions(std::ifstream::goodbit);
  // Read the stream into many fields
  // !!!! The exception was thrown here.
  file >> _ >> comm >> state >> ppid >> pgrp >> session >> tty_nr
       /* >> ... omitted */;
  file.close();
} catch (const std::ifstream::failure& e) {
  std::cout << "Exception!!!! " << e.what();
}

The exception was "basic_filebuf::underflow error reading the file".

Shouldn't the stream refrain from throwing exceptions when we haven't asked for them (by setting file.exceptions())?

More info:

  • It runs on gcc version 4.1.2 20080704 (Red Hat 4.1.2-54)
  • The full code that causes the problem (without the try/catch part): proc.hpp

1 answer

#1

Update 2

I've even tried to force an error by manually setting tiny or huge buffer sizes:

    std::filebuf fb;
    // set tiny input buffer
    char buf[8]; // or huge: 64*1024
    fb.pubsetbuf(buf, sizeof(buf));
    fb.open(path.c_str(), std::ios::in);

    std::istream file(&fb);

I've verified with strace that the reads were indeed tiny (7 bytes):

sudo strace ./test $(sudo ps h -ae -o pid) |& 
      egrep -w 'read|open' | grep -v '= 7' | less -SR

Interestingly, none of this failed.
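
For anyone who wants to repeat the buffer-size experiment, here is a minimal self-contained sketch of it. The helper name read_tokens_small_buffer is made up for illustration, and whether pubsetbuf is honored at all is implementation-defined, so treat this as an approximation of the test above rather than a guaranteed way to get tiny reads:

```cpp
#include <cstddef>
#include <fstream>
#include <istream>
#include <string>
#include <vector>

// Hypothetical helper: reads whitespace-separated tokens from `path` through
// a std::filebuf that was given an input buffer of `buf_size` bytes.
// Returns the tokens read; the result is empty if the file cannot be opened.
std::vector<std::string> read_tokens_small_buffer(const std::string& path,
                                                  std::size_t buf_size)
{
    std::vector<char> buf(buf_size);   // declared first, so it outlives fb
    std::filebuf fb;
    // Install the buffer before open(); the effect of setbuf is
    // implementation-defined, so this is only a request.
    fb.pubsetbuf(buf.data(), static_cast<std::streamsize>(buf.size()));

    std::vector<std::string> tokens;
    if (!fb.open(path.c_str(), std::ios::in))
        return tokens;

    std::istream in(&fb);
    std::string tok;
    while (in >> tok)                  // stops at EOF or on the first failure
        tokens.push_back(tok);
    return tokens;
}
```

Calling it as read_tokens_small_buffer("/proc/self/stat", 8) under strace should show whether the small buffer was actually used on a given system.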

Updated

In response to the comments, I have written a standalone program that does exactly what the OP describes, but I can't reproduce the problem:

#include <sys/types.h> // For pid_t.
#include <fstream>
#include <iostream>    // For std::cout and std::ostream.
#include <string>

// mock up
#include <boost/variant.hpp>
namespace {
    struct None {};
    struct Error { std::string s; Error(std::string s): s(s){} };

    std::ostream& operator<<(std::ostream& os, None const&) {
        return os << "None";
    }

    std::ostream& operator<<(std::ostream& os, Error const& e) {
        return os << "Error {" << e.s << "}";
    }

    template <typename T>
        using Result = boost::variant<None, Error, T>;
}
// end mockup

namespace proc {

    // Snapshot of a process (modeled after /proc/[pid]/stat).
    // For more information, see:
    // http://www.kernel.org/doc/Documentation/filesystems/proc.txt
    struct ProcessStatus
    {
        pid_t pid;
        std::string comm;
        char state;
        pid_t ppid, pgrp, session;
        int tty_nr;
        pid_t tpgid;
        unsigned int flags;
        unsigned long minflt, cminflt, majflt, cmajflt;
        unsigned long utime, stime;
        long cutime, cstime, priority, nice, num_threads, itrealvalue;
        unsigned long long starttime;
        unsigned long vsize;
        long rss;
        unsigned long rsslim, startcode, endcode, startstack, kstkeip, signal, blocked, sigcatch, wchan, nswap, cnswap;

        friend std::ostream& operator<<(std::ostream& os, proc::ProcessStatus const& ps) {
            return os << 
                "pid: "         << ps.pid         << "\n" << 
                "comm: "        << ps.comm        << "\n" << 
                "state: "       << ps.state       << "\n" << 
                "ppid: "        << ps.ppid        << "\n" << 
                "pgrp: "        << ps.pgrp        << "\n" << 
                "session: "     << ps.session     << "\n" << 
                "tty_nr: "      << ps.tty_nr      << "\n" << 
                "tpgid: "       << ps.tpgid       << "\n" << 
                "flags: "       << ps.flags       << "\n" << 
                "minflt: "      << ps.minflt      << "\n" << 
                "cminflt: "     << ps.cminflt     << "\n" << 
                "majflt: "      << ps.majflt      << "\n" << 
                "cmajflt: "     << ps.cmajflt     << "\n" << 
                "utime: "       << ps.utime       << "\n" << 
                "stime: "       << ps.stime       << "\n" << 
                "cutime: "      << ps.cutime      << "\n" << 
                "cstime: "      << ps.cstime      << "\n" << 
                "priority: "    << ps.priority    << "\n" << 
                "nice: "        << ps.nice        << "\n" << 
                "num_threads: " << ps.num_threads << "\n" << 
                "itrealvalue: " << ps.itrealvalue << "\n" << 
                "starttime: "   << ps.starttime   << "\n" << 
                "vsize: "       << ps.vsize       << "\n" << 
                "rss: "         << ps.rss         << "\n" << 
                "rsslim: "      << ps.rsslim      << "\n" << 
                "startcode: "   << ps.startcode   << "\n" << 
                "endcode: "     << ps.endcode     << "\n" << 
                "startstack: "  << ps.startstack  << "\n" << 
                "kstkeip: "     << ps.kstkeip     << "\n" << 
                "signal: "      << ps.signal      << "\n" << 
                "blocked: "     << ps.blocked     << "\n" << 
                "sigcatch: "    << ps.sigcatch    << "\n" << 
                "wchan: "       << ps.wchan       << "\n" << 
                "nswap: "       << ps.nswap       << "\n" << 
                "cnswap: "      << ps.cnswap      << "\n";
        }
    };

    // Returns the process statistics from /proc/[pid]/stat.
    // The return value is None if the process does not exist.
    inline Result<ProcessStatus> status(pid_t pid)
    {
        std::string path = "/proc/" + std::to_string(pid) + "/stat";

        std::ifstream file(path.c_str());

        if (!file.is_open()) {
#if 1
            return Error("Failed to open '" + path + "'");
#else // FIXME reenable
            // Need to check if file exists AFTER we open it to guarantee
            // process hasn't terminated (or if it has, we at least have a
            // file which the kernel _should_ respect until a close).
            if (!os::exists(path)) {
                return None();
            }
            return Error("Failed to open '" + path + "'");
#endif
        }

        std::string _; // For ignoring fields.

        // Parse all fields from stat.
        ProcessStatus ps;
        if (file >> _ >> ps.comm >> ps.state >> ps.ppid >> ps.pgrp >> ps.session >> ps.tty_nr
            >> ps.tpgid >> ps.flags >> ps.minflt >> ps.cminflt >> ps.majflt >> ps.cmajflt
            >> ps.utime >> ps.stime >> ps.cutime >> ps.cstime >> ps.priority >> ps.nice
            >> ps.num_threads >> ps.itrealvalue >> ps.starttime >> ps.vsize >> ps.rss
            >> ps.rsslim >> ps.startcode >> ps.endcode >> ps.startstack >> ps.kstkeip
            >> ps.signal >> ps.blocked >> ps.sigcatch >> ps.wchan >> ps.nswap >> ps.cnswap)
        {
            return ps;
        } else
        {
            return Error("Failed to read/parse '" + path + "'");
        }
    }


} // namespace proc {

int main(int argc, const char *argv[])
{
    for (auto i=1; i<argc; ++i)
        std::cout << proc::status(std::stoul(argv[i])) << "\n";
}

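It runs happily on my machine, printing output like the following (the pid value is garbage because the program reads the first field into the throwaway string _ and never assigns ps.pid):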

pid: 594590200
comm: (test)
state: R
ppid: 8123
pgrp: 8123
session: 8123
...

Even if/when I torture it with

sudo ./test $(sudo ps h -ae -o pid) | grep -v : | sort -u
./test $(sudo ps h -ae -o pid) | grep -v : | sort -u

It only reports two failures, presumably for the sudo and ps processes from the subshell, which have exited by the time the program runs:

Error {Failed to open '/proc/8652/stat'}
Error {Failed to open '/proc/8653/stat'}
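
A side note on robustness, unrelated to the exception itself: the comm field is wrapped in parentheses and may contain spaces or even ')' characters, so whitespace-delimited extraction can misalign every field after it. A common workaround is to split on the first '(' and the last ')'. The sketch below assumes the line has already been read into a string; StatHead and parse_stat_head are names invented for this example:

```cpp
#include <sstream>
#include <string>

// Hypothetical minimal record: only the first four fields of a stat line.
struct StatHead {
    long pid = 0;
    std::string comm;   // stored without the surrounding parentheses
    char state = '?';
    long ppid = 0;
};

// Parses "pid (comm) state ppid ..." from one stat line.
// Returns false if the line does not have that shape.
bool parse_stat_head(const std::string& line, StatHead& out)
{
    const std::string::size_type open = line.find('(');
    // Use the LAST ')' because comm itself may contain ')'.
    const std::string::size_type close = line.rfind(')');
    if (open == std::string::npos || close == std::string::npos || close < open)
        return false;

    std::istringstream before(line.substr(0, open));
    if (!(before >> out.pid))
        return false;

    out.comm = line.substr(open + 1, close - open - 1);

    // The remaining fields are plain whitespace-separated values.
    std::istringstream after(line.substr(close + 1));
    return static_cast<bool>(after >> out.state >> out.ppid);
}
```

The same pattern extends to the remaining numeric fields by continuing the extraction from the `after` stream.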

I have also tried reading the information twice from the input stream (to force a read-past-the-end situation), but no luck.


Old answer text:

You need to establish under what kind of conditions the exception occurs. The following demo code works as expected on my system:

#include <cstdlib>   // For system().
#include <fstream>
#include <iostream>
#include <string>

int main()
{
    system("ps -o pid,comm,state,ppid,pgrp,session,tty > input.txt");

    try {
        std::ifstream file("input.txt");

        file.exceptions(std::ifstream::goodbit);

        std::string _, comm, state, ppid, pgrp, session, tty_nr;
        while (file >> _ >> comm >> state >> ppid >> pgrp >> session >> tty_nr)
        {
            for (auto&& s : { _, comm, state, ppid, pgrp, session, tty_nr })
                std::cout << s << "\t";
            std::cout << "\n";
        }
        file.close();
    } catch (const std::ifstream::failure& e) {
        std::cout << "Exception!!!! " << e.what();
    }
}

Prints, e.g.:

PID         COMMAND S       PPID    PGRP    SESS    TT      
20950       bash    S       20945   20950   20950   pts/1   
21275       vim     S       20950   21275   20950   pts/1   
21279       bash    S       21275   21275   20950   pts/1   
21280       test    S       21279   21275   20950   pts/1   
21281       sh      S       21280   21275   20950   pts/1   
21282       ps      R       21281   21275   20950   pts/1   
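
For reference, this is how the exception mask is supposed to behave for an ordinary extraction failure: with the default mask (goodbit) the stream just sets failbit, and it throws only when that bit is enabled in the mask. The sketch uses a string stream and a made-up helper name, try_extract_int, so it shows the standard behavior only; it does not reproduce the underflow error from the question:

```cpp
#include <ios>
#include <sstream>
#include <string>

// Hypothetical helper: tries to extract an int from `text` with the given
// exception mask. Returns "ok", "failbit" (failed without throwing) or
// "threw" (std::ios_base::failure was thrown).
std::string try_extract_int(const std::string& text,
                            std::ios_base::iostate mask)
{
    std::istringstream in(text);
    try {
        in.exceptions(mask);   // goodbit means: never throw on state changes
        int value = 0;
        in >> value;
        if (in.fail())
            return "failbit";
        return "ok";
    } catch (const std::ios_base::failure&) {
        return "threw";
    }
}
```

With the text "abc", the goodbit mask yields "failbit", while std::ios_base::failbit as the mask yields "threw". If a program sees an exception despite a goodbit mask, comparing against this baseline on the same compiler may help tell a library quirk from a usage error.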

Update

I thought that on many systems the default buffer size is probably at most 8192 bytes, so let's create some silly long lines instead! Replacing the system call with

system("od /dev/urandom -t x8 -Anone | xargs -n256 | tr -d ' ' | xargs -n7 | head -n100 > input.txt");

results in lines with 7 columns, taking ~29kB per line. The program prints them without hesitation, about 2.8MiB of output in total, measured with

make -B  && ./test | wc 

See it live on coliru:

Note that on Coliru, we can't access /dev/urandom, which is why I read from the binary itself. "Random" enough for this purpose :)

