Reposted from: https://www.linux.com/learn/linux-career-center/37985-the-kernel-newbie-corner-kernel-debugging-using-proc-qsequenceq-files-part-1
Over this column and the next one (and possibly the one after that, depending on how detailed we get), we're going to discuss kernel and module debugging using proc files. Specifically, we're going to discuss the seq_file implementation of proc files, which represents the newest and most powerful variation of the proc files we're interested in.
This first column will introduce the simpler variation of sequence files, while Part 2 (and beyond) will cover the more complete and formal usage of those files. And, not surprisingly, we're once again going to steal shamelessly from, and build on, what you can read in the classic book LDD3, found online here. In particular, we'll be working out of Chapter 4 of that book, so it would be in your best interest to have that portion of the chapter handy as you read what follows, since I plan on referring to parts of it as we go along.
Oh, and fair warning: a tiny part of this discussion is speculation, so feel free to use the comments section to correct any misinformation.
(The archive of all previous "Kernel Newbie Corner" articles can be found here.)
What is the /proc Directory, Anyway?
If you've worked with Linux for even a short time, you probably already know the answer to that question. The /proc directory is an example of what is known as a "pseudo" filesystem in Linux; that is, something that isn't really a filesystem in that it doesn't take up any space on disk.
Rather, files under the /proc directory act as interfaces to internal kernel data structures, such that accessing entries under /proc magically accesses the underlying contents in kernel space. Put another way, the "files" you see under /proc aren't really "files"; instead, their contents are typically generated dynamically whenever you access them. And it's that dynamic content generation that's the theme of this and next week's columns because that's how you're going to debug the kernel and your loadable modules--by writing loadable modules that create simple proc files that allow you to list the contents of specific kernel data whenever you want.
Examples. We Need Examples.
Of course you do, so let's examine some of the proc files that already typically exist as a standard part of the Linux kernel. Consider listing the "version" of the running kernel via /proc/version:
$ cat /proc/version
Linux version 2.6.31-rc5 (rpjday@localhost.localdomain)
(gcc version 4.4.0 20090506 (Red Hat 4.4.0-4) (GCC) ) #3
SMP Mon Aug 3 11:24:19 EDT 2009
$
Where did all that information come from? It's certainly not stored in that file, since asking for the long listing of that file produces:
$ ls -l /proc/version
-r--r--r--. 1 root root 0 2009-08-07 11:22 /proc/version
It's a file with zero size. But that's because, again, it's not a real file; it's a pseudo file, whose "contents" are generated by some underlying code that implements that file whenever it's accessed. Put another way, the "contents" of numerous proc files are whatever the kernel programmer decided to generate as "output" for those proc files, using whatever combination of "print" statements seemed appropriate at the time.
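To see the same thing from a program rather than from the shell, here is a minimal userspace sketch in C. The helper names reported_size() and bytes_read() are made up for this illustration; the calls underneath are the ordinary stat(), open() and read() system calls.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Size that stat() reports for path, or -1 on error. */
long long reported_size(const char *path)
{
    struct stat st;

    assert(path != NULL);
    if (stat(path, &st) != 0)
        return -1;
    return (long long) st.st_size;
}

/* Number of bytes actually delivered when reading path to end-of-file,
 * or -1 on error. */
long long bytes_read(const char *path)
{
    char buf[4096];
    long long total = 0;
    ssize_t n;
    int fd;

    assert(path != NULL);
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    while ((n = read(fd, buf, sizeof(buf))) > 0)
        total += n;
    close(fd);
    return n < 0 ? -1 : total;
}
```

On a Linux system with /proc mounted, reported_size("/proc/version") should come back as 0 while bytes_read("/proc/version") returns the length of the banner text, which is the same mismatch that the ls and cat output above shows.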
Let's list some other proc files, all of which contain potentially useful information, all of which are zero size, and all of which generate their content based on some underlying code we'll examine a bit later:
$ cat /proc/cpuinfo
$ cat /proc/cmdline
$ cat /proc/modules
$ cat /proc/meminfo
$ cat /proc/interrupts
... and so on and so on, check it out ...
And note well that the contents of various proc files are generated new each time, which is why you'll probably see slightly different output every time you list the contents of, say, /proc/meminfo. Our mission in this week's column is to show you how to design and create your own proc files, so you can, whenever you want, list them to examine whatever kernel data you choose to associate with them.
(Note that, while the contents of some proc files should change constantly as the system runs, others should remain static. For instance, you don't expect the contents of /proc/version or /proc/cpuinfo to change no matter how many times you list them. You get the idea, right?)
As an aside, you can also create writable proc files, so that writing data to a proc file can be used to modify kernel data structures. But since this is a column on simple debugging, we'll restrict ourselves to just readable proc files. Anyone wanting to get more ambitious is welcome to read the appropriate docs.
Exercise for the reader: Take some time and examine some of the other files under /proc that look like they might contain useful information. Don't be scared to go into some of those subdirectories. If you have the time, read the kernel documentation file Documentation/filesystems/proc.txt.
So What's a "Sequence" File, Then?
And here's where things get a bit tricky. Technically, a "proc file" is nothing more than the file you can see under the /proc directory--it has a name, an owner and group, a size of (typically) zero and some permissions that dictate who is allowed to perform what operations on it. And that's all.
A proc file by itself does absolutely nothing. What's necessary is to then implement some read and/or write code behind it that defines what it means to read from (or write to) that file. And a sequence file is simply one of the possible implementations you can use to define those operations. So what's so special about a sequence file?
At this point, for the sake of brevity, I'm going to refer you to the proper sections of LDD3 that discuss the rationale for sequence files, but I'll at least summarize it here. The older implementation of proc files (which is still in use in the current kernel) had an awkward limitation: it could not "print" more than a single page of output, where the size of a page is given by the kernel PAGE_SIZE macro. Sequence files solve this problem by generating the "output" of a proc file as a sequence of writes, each of which can be up to a page in size, with no limit on the number of writes, which effectively allows unlimited output from a single proc file.
In fact, it's probably safe to say that, while you'll still find a lot of the old implementation in the current kernel tree, sequence files are easily the preferred way to implement output-only proc files, even when the output is very brief.
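To make the "sequence of page-sized writes" idea concrete, here is a toy userspace model in C. It is only a sketch of the description above, not how the kernel's seq_file code is actually written: TOY_PAGE_SIZE, struct toy_stats, toy_flush() and toy_emit() are all invented for this illustration, and 4096 is just an assumed page size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define TOY_PAGE_SIZE 4096   /* assumed page size for this toy model */

struct toy_stats {
    size_t total_bytes;    /* bytes delivered over all chunks */
    size_t chunks;         /* number of chunks handed to the reader */
    size_t largest_chunk;  /* size of the biggest single chunk */
};

/* Hand one filled chunk to the "reader" and update the statistics. */
static void toy_flush(struct toy_stats *st, size_t used)
{
    if (used == 0)
        return;
    st->total_bytes += used;
    st->chunks++;
    if (used > st->largest_chunk)
        st->largest_chunk = used;
}

/* Emit nrecords lines of text through a single page-sized buffer. */
struct toy_stats toy_emit(unsigned nrecords)
{
    char page[TOY_PAGE_SIZE];
    char line[64];
    size_t used = 0;
    struct toy_stats st = {0, 0, 0};
    unsigned i;

    for (i = 0; i < nrecords; i++) {
        int len = snprintf(line, sizeof(line), "record %u\n", i);

        assert(len > 0 && (size_t) len < sizeof(line));
        if (used + (size_t) len > sizeof(page)) {
            toy_flush(&st, used);   /* page is full: deliver it, start over */
            used = 0;
        }
        memcpy(page + used, line, (size_t) len);
        used += (size_t) len;
    }
    toy_flush(&st, used);           /* deliver the final, partial page */
    return st;
}
```

However many records you ask for, no single chunk is ever larger than one page, yet the total output is unbounded. That is the property the single-page implementation lacked.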
To emphasize the difference between proc files and sequence files, two important points:
- You can create proc files that don't use the underlying seq_file implementation, and
- you can create files based on the seq_file implementation in places other than under /proc.
An Example, Please
At this point, we definitely need a live example, so let's whip up a trivial proc file that will display the current value of jiffies (the tick counter) whenever we list it. Consider the loadable module jif.c (in this case, for a 64-bit system):
#include <linux/module.h>
#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/fs.h>        // for basic filesystem
#include <linux/proc_fs.h>   // for the proc filesystem
#include <linux/seq_file.h>  // for sequence files
#include <linux/jiffies.h>   // for jiffies

// Note: this listing uses the proc_create()/file_operations interface of
// the 2.6.31-era kernel shown earlier. Since Linux 5.6, proc_create()
// takes a struct proc_ops instead of a struct file_operations.

static struct proc_dir_entry* jif_file;

// Called to generate the file's contents each time it is read.
static int
jif_show(struct seq_file *m, void *v)
{
    seq_printf(m, "%llu\n",
        (unsigned long long) get_jiffies_64());
    return 0;
}

static int
jif_open(struct inode *inode, struct file *file)
{
    return single_open(file, jif_show, NULL);
}

static const struct file_operations jif_fops = {
    .owner   = THIS_MODULE,
    .open    = jif_open,
    .read    = seq_read,
    .llseek  = seq_lseek,
    .release = single_release,
};

static int __init
jif_init(void)
{
    jif_file = proc_create("jif", 0, NULL, &jif_fops);

    if (!jif_file) {
        return -ENOMEM;
    }

    return 0;
}

static void __exit
jif_exit(void)
{
    remove_proc_entry("jif", NULL);
}

module_init(jif_init);
module_exit(jif_exit);

MODULE_LICENSE("GPL");
At this point (using what you know from previous columns in this series), create the corresponding Makefile, and compile the module and load it. As soon as you load that module, verify that the following proc file now exists:
# ls -l /proc/jif
-r--r--r--. 1 root root 0 2009-08-10 19:48 /proc/jif
#
at which point you should (as a regular user, given the read permissions) be able to list that file as often as you want to display the current value of the kernel jiffies variable:
$ cat /proc/jif
4329225958
$ cat /proc/jif
4329226854
$ cat /proc/jif
4329227174
$ cat /proc/jif
4329227486
$ cat /proc/jif
4329227798
$ cat /proc/jif
4329228078
$
after which you can (as root) unload the module, at which point the proc file is removed. Yes, it really is that simple. And now, to work.
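If you would rather sample the counter from a program than from cat, a minimal userspace sketch in C might look like the following. The helper name read_counter() is invented for this illustration; it simply opens the file, parses one unsigned decimal number and closes the file again. Because the file is reopened on every call, each call sees freshly generated contents.

```c
#include <assert.h>
#include <stdio.h>

/* Read one unsigned decimal value from path into *value.
 * Returns 0 on success, -1 if the file cannot be opened or parsed. */
int read_counter(const char *path, unsigned long long *value)
{
    FILE *fp;
    int ok;

    assert(path != NULL && value != NULL);
    fp = fopen(path, "r");
    if (!fp)
        return -1;
    ok = (fscanf(fp, "%llu", value) == 1);
    fclose(fp);
    return ok ? 0 : -1;
}
```

With the module loaded, calling read_counter("/proc/jif", &v) twice and subtracting the two results gives the number of ticks that elapsed between the two reads.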
So What Just Happened There?
Let's summarize just the critical features of the above example, so you can start implementing your own (simple) debugging proc files; we'll leave the more complicated features for Part 2. So, about the above:
- The necessary header files should be self-explanatory--just use that list.
- The pointer of type proc_dir_entry (defined in the kernel header file include/linux/proc_fs.h, if you're interested) is used to keep track of the proc file that is created but, in fact, if your proc file code is simple enough, you might not even need to keep track of that. (You'll see why shortly.)
- Your "show" routine (in our case, jif_show()) is what defines what is displayed when someone lists your proc file. That routine always takes the same two argument types but, in simple cases like ours, you can ignore the second argument--it's only relevant when you're actually "sequencing" through the output, as you'll see in Part 2.
In short, you can define whatever output you want to print based on any kernel routines or data structures you can think of, but keep in mind that, if this is a loadable module, you will have access to only that kernel data that's been "exported."
- The next two portions of our example--the jif_open routine and the jif_fops structure--should be self-explanatory as well. Just substitute your own "open" routine name in there, and leave the rest as is. The fact that this is a trivial sequence file that doesn't even need sequencing is represented by the use of the "single_" variation of the operations.
- The entry and exit routines should also be reasonably self-explanatory. The module entry routine creates the proc file with a name of "jif", the "0" argument represents the default file permission of 0444, the "NULL" means that the file should be created directly under the /proc directory, and the final argument identifies the file operations structure to associate with that file. In other words, the creation of the proc file and its association with its open and I/O routines is all done in a single call.
The exit routine is much simpler--it simply deletes the file by name.
Simple, no? But there is one more point worth making.
Can We Make This Any Simpler?
In fact, we can if we want to cut some corners. As you can see, our entry routine does the proper error checking on whether or not we could even create our proc file and, if that failed, we return a negative error code which, as always, causes the module to fail to load:
static int __init
jif_init(void)
{
    jif_file = proc_create("jif", 0, NULL, &jif_fops);

    if (!jif_file) {
        return -ENOMEM;
    }

    return 0;
}
However, if you tighten up the code, you don't even need to save the pointer to the proc_dir_entry structure:
static int __init
jif_init(void)
{
    if (!proc_create("jif", 0, NULL, &jif_fops)) {
        return -ENOMEM;
    }

    return 0;
}
You can do that since, if you look carefully, you don't really need that pointer anywhere else in the module. In an example this trivial, the exit routine simply has to delete the proc file by its name--it has no need for that pointer so, really, there's no need to hang onto it. At least for now.
And if you really wanted to tighten things up, you could bypass the error-checking altogether:
static int __init
jif_init(void)
{
    proc_create("jif", 0, NULL, &jif_fops);
    return 0;
}
The above simply assumes that there's no possible way for that file creation step to fail. That's probably not wise for production-level code, but the chance of that step failing is typically small so it's probably acceptable for informal testing.
What About Some Kernel Examples?
And since you've seen the very basics of creating a short, output-only sequence file, it's worth seeing the code that's responsible for printing a number of those short files you saw under /proc earlier, such as /proc/version. A number of those files can be found in the kernel source tree, in the fs/proc directory, so let's examine version.c:
#include <linux/fs.h>
#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/proc_fs.h>
#include <linux/seq_file.h>
#include <linux/utsname.h>

static int version_proc_show(struct seq_file *m, void *v)
{
    seq_printf(m, linux_proc_banner,
        utsname()->sysname,
        utsname()->release,
        utsname()->version);
    return 0;
}

static int version_proc_open(struct inode *inode, struct file *file)
{
    return single_open(file, version_proc_show, NULL);
}

static const struct file_operations version_proc_fops = {
    .open    = version_proc_open,
    .read    = seq_read,
    .llseek  = seq_lseek,
    .release = single_release,
};

static int __init proc_version_init(void)
{
    proc_create("version", 0, NULL, &version_proc_fops);
    return 0;
}
module_init(proc_version_init);
All of the above should look reasonably similar to your loadable module code, but keep two distinctions in mind:
- Notice that many of the kernel proc files in that directory have no module exit routine. That's because they're built into the kernel proper, and have no option to be built as loadable modules. Hence, they have no need for an exit routine. (For the same reason, they don't need to set a file operations "owner" field because they will never have a module owner.)
- As we've already noted, the proc files that are built into the kernel have access to all kernel symbols with external linkage, while your modules can only access what is exported.
Exercises for the reader: Take a look at some of the other simple sequence files in the kernel fs/proc directory, to see how they match up with their corresponding /proc files that we listed earlier. Some of them should be simple enough that you can see how they work.
In addition, if you have the time, write a loadable module that, when loaded, creates an output-only proc file (say, /proc/hz) that, when read, displays the kernel "HZ" value--that is, the configured kernel tick rate. It's up to you to figure out where to get that value and how to print it out.
Next week: More sequence files, of course.