Ring buffer log file on unix

Date: 2022-05-29 03:57:28

I'm trying to come up with a unix pipeline of commands that will allow me to log only the most recent n lines of a program's output to a text file.

The text file should never be more than n lines long (it may be shorter while the file is first filling up).

It will be run on a device with limited memory/resources, so keeping the filesize small is a priority.

I've tried stuff like this (n=500):

program_spitting_out_text > output.txt
cat output.txt | tail -500 > recent_output.txt
rm output.txt

or

program_spitting_out_text | tee output.txt | tail -500 > recent_output.txt

Obviously neither works for my purposes...

Anyone have a good way to do this in a one-liner? Or will I have to write a script/utility?

Note: I don't want anything to do with dmesg and must use standard BSD unix commands. The "program_spitting_out_text" prints about 60 lines per second, continuously.

Thanks in advance!

1 solution

#1


If program_spitting_out_text runs continuously and keeps its file open, there's not a lot you can do.

Even deleting the file won't help, since the program will continue writing to the now "hidden" file (the data still exists but there is no directory entry for it) until it closes it, at which point the data will really be removed.

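This behaviour is easy to demonstrate from a shell. A sketch (`demo.log` is just a scratch filename):

```shell
# A process can keep writing to a file after it has been unlinked;
# the data is only truly discarded once the descriptor is closed.
exec 3> demo.log           # open demo.log as file descriptor 3
rm demo.log                # remove the directory entry
echo "still writable" >&3  # the write still succeeds
exec 3>&-                  # close the descriptor; the data is now gone
ls demo.log 2>/dev/null || echo "no directory entry"  # prints "no directory entry"
```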

If it closes and reopens the log file periodically (every line or every ten seconds or whatever), then you have a relatively easy option.

Simply monitor the file until it reaches a certain size, then roll the file over, something like:

while true; do
    sleep 5
    lines=$(wc -l <file.log)
    if [[ $lines -ge 5000 ]]; then
        rm -f file2.log        # discard the old backup
        mv file.log file2.log  # roll the current log over
        touch file.log         # start a fresh, empty log
    fi
done

This script will check the file every five seconds and, if it's 5000 lines or more, will move it to a backup file. The program writing to it will continue to write to that backup file (since it has the open handle to it) until it closes it, then it will re-open the new file.

This means you will always have (roughly) between five and ten thousand lines in the log file set, and you can search them with commands that combine the two:

grep ERROR file2.log file.log

Another possibility is if you can restart the program periodically without affecting its function. By way of example, a program which checks once a second for the existence of a file and reports on it can probably be restarted without a problem. One calculating pi to a hundred billion significant digits will probably not be restartable without impact.

If it is restartable, then you can basically do the same trick as above. When the log file reaches a certain size, kill off the current program (which you will have started as a background task from your script), do whatever magic you need to in rolling over the log files, then restart the program.

For example, consider the following (restartable) program prog.sh which just continuously outputs the current date and time:

#!/usr/bin/bash
while true; do
    date
done

Then, the following script will be responsible for starting and stopping the other script as needed, by checking the log file every five seconds to see if it has exceeded its limits:

#!/usr/bin/bash

exe=./prog.sh
log1=prog.log
maxsz=500

pid=-1                      # -1 means the program has not yet been started
touch ${log1}
log2=${log1}-prev

while true; do
    if [[ ${pid} -eq -1 ]]; then
        lines=${maxsz}      # force a start on the first pass through
    else
        lines=$(wc -l <${log1})
    fi
    if [[ ${lines} -ge ${maxsz} ]]; then
        if [[ $pid -ge 0 ]]; then
            kill $pid >/dev/null 2>&1
        fi
        sleep 1             # give the program a moment to stop writing
        rm -f ${log2}       # discard the old backup
        mv ${log1} ${log2}  # roll the current log over
        touch ${log1}
        ${exe} >> ${log1} & # restart, appending to the fresh log
        pid=$!
    fi
    sleep 5
done

And this output (from an every-second wc -l on the two log files) shows what happens at the time of switchover, noting that it's approximate only, due to the delays involved in switching:

474 prog.log       0 prog.log-prev
496 prog.log       0 prog.log-prev
518 prog.log       0 prog.log-prev
539 prog.log       0 prog.log-prev
542 prog.log       0 prog.log-prev
 21 prog.log     542 prog.log-prev

Now keep in mind that's a sample script. It's relatively intelligent but probably needs some error handling so that it doesn't leave the executable running if you shut down the monitor.


And, finally, if none of that suffices, there's nothing stopping you from writing your own filter program which takes standard input and continuously outputs that to a real ring buffer file.

Then you would simply do:

program_spitting_out_text | ringbuffer 4096 last4k.log

That program could be a true ring buffer in that it treats the 4k file as a circular character buffer but, of course, you'll need a special marker in the file to indicate the write-point, along with a program that can turn it back into a real stream.

Or, it could do much the same as the scripts above, rewriting the file so that it's always below the size desired.

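A line-based version of that second approach could be sketched in shell. Everything below — the `ringbuffer` function name and the trimming scheme — is a hypothetical illustration, not an existing utility:

```shell
#!/bin/sh
# ringbuffer: keep roughly the last N lines of stdin in FILE.
# Appends each line, and trims the file back to N lines whenever it
# reaches 2*N, so the file never grows beyond twice the target size.
ringbuffer() {
    n=$1 file=$2 count=0
    : > "$file"
    while IFS= read -r line; do
        printf '%s\n' "$line" >> "$file"
        count=$((count + 1))
        if [ "$count" -ge $((2 * n)) ]; then
            tail -n "$n" "$file" > "$file.tmp" && mv "$file.tmp" "$file"
            count=$n
        fi
    done
    # Final trim so the file ends up at most N lines long.
    tail -n "$n" "$file" > "$file.tmp" && mv "$file.tmp" "$file"
}

# Demo: keep only the last 50 of 1000 numbered lines.
seq 1 1000 | ringbuffer 50 last50.log
```

This rewrites the file with `tail` rather than maintaining a true circular buffer, so a reader always sees plain, ordered lines; the cost is rewriting up to N lines on every trim.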