[python] Log collection and analysis example

Date: 2021-02-27 22:30:54

Environment: CentOS 6.7, Python 3.6

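Requirement: extract the key fields from the log and analyze them, reporting the total count per minute and, within each minute, the count per keyword (name=*****).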


Log excerpt:

<2017-10-29 21:53:43> <WARN> related characters name=tryxxxx111001 count=3

<2017-10-29 21:53:43> <WARN> related characters name=tryxxxx111002 count=1

<2017-10-29 21:53:43> <WARN> related characters name=tryxxxx111003 count=43

<2017-10-29 21:54:53> <WARN> related characters name=tryxxxx111001 count=1

<2017-10-29 21:54:54> <WARN> related characters name=tryxxxx111002 count=3

<2017-10-29 21:54:54> <WARN> related characters name=tryxxxx111003 count=12

<2017-10-29 21:55:03> <WARN> related characters name=tryxxxx111001 count=2

<2017-10-29 21:55:03> <WARN> related characters name=tryxxxx111001 count=3

<2017-10-29 21:55:03> <WARN> related characters name=tryxxxx111000 count=2
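
To make the extraction step concrete, here is a minimal sketch of how one of these lines can be split into its three fields with a regular expression. The names LINE_RE and parse_line are introduced only for illustration. The greedy first group stops at the last colon of the timestamp, which is why it yields the time truncated to the minute.

```python
import re

# Hypothetical helper names (LINE_RE, parse_line), used only for this illustration.
# Group 1: timestamp up to the minute, group 2: the name, group 3: the count.
LINE_RE = re.compile(r'<(.*):.*related characters name=(.*) count=(\d+)')

def parse_line(line):
    """Return (minute, name, count) for a matching line, or None otherwise."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    minute, name, count = m.groups()
    return minute, name, int(count)

sample = '<2017-10-29 21:53:43> <WARN> related characters name=tryxxxx111001 count=3'
parsed = parse_line(sample)
print(parsed)  # ('2017-10-29 21:53', 'tryxxxx111001', 3)
```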


The program:

#!/usr/bin/env python3
# -*- coding:utf-8 -*-
#filename:loganalysis.py
import sys
import re
filename = sys.argv[1]
countdict = {}
realtime = ''
count = 0
with open(filename,'r') as f:
    for a in f.readlines():
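        # Capture the key fields: the first (.*) grabs the timestamp up to the minute,
        # the second (.*) grabs the name (e.g. 'tryxxxx111000'), and (\d+) grabs the number after count=.
        catchlist = re.findall(r'<(.*):.*related characters name=(.*) count=(\d+)', a)
        if len(catchlist) != 0:
            realtime = catchlist[0][0]
            name = catchlist[0][1]
            # Convert once here; the captured value is a string.
            count = int(catchlist[0][2])
            # Store the results in a dict shaped as {time: {name: count}}.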
            if realtime in countdict:
                if name in countdict[realtime]:
                    countdict[realtime][name] += int(count)
                else:
                    countdict[realtime][name] = int(count)
            else:
                countdict[realtime] = {name: int(count)}
for a in countdict:
    finalcount = 0
    for b in countdict[a]:
        print(a,b,' ',countdict[a][b])
        finalcount += countdict[a][b]
    print('time:',a,'total:',finalcount)


Run it with: python3 loganalysis.py logfilename
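
As a possible variant (not the original script), the nested if/else bookkeeping could be replaced with collections.defaultdict, which creates missing keys on first access. The sketch below is self-contained and feeds in the log excerpt as a list instead of reading a file. The names aggregate, LINE_RE and sample_lines are assumptions made for this example.

```python
import re
from collections import defaultdict

# Same pattern as in the script; LINE_RE and aggregate are illustrative names.
LINE_RE = re.compile(r'<(.*):.*related characters name=(.*) count=(\d+)')

def aggregate(lines):
    """Build {minute: {name: count}} from an iterable of log lines."""
    counts = defaultdict(lambda: defaultdict(int))
    for line in lines:
        m = LINE_RE.search(line)
        if m:
            minute, name, count = m.groups()
            counts[minute][name] += int(count)
    return counts

# The log excerpt from above, inlined so the sketch runs without a file.
sample_lines = [
    '<2017-10-29 21:53:43> <WARN> related characters name=tryxxxx111001 count=3',
    '<2017-10-29 21:53:43> <WARN> related characters name=tryxxxx111002 count=1',
    '<2017-10-29 21:53:43> <WARN> related characters name=tryxxxx111003 count=43',
    '<2017-10-29 21:54:53> <WARN> related characters name=tryxxxx111001 count=1',
    '<2017-10-29 21:54:54> <WARN> related characters name=tryxxxx111002 count=3',
    '<2017-10-29 21:54:54> <WARN> related characters name=tryxxxx111003 count=12',
    '<2017-10-29 21:55:03> <WARN> related characters name=tryxxxx111001 count=2',
    '<2017-10-29 21:55:03> <WARN> related characters name=tryxxxx111001 count=3',
    '<2017-10-29 21:55:03> <WARN> related characters name=tryxxxx111000 count=2',
]

result = aggregate(sample_lines)
for minute, names in result.items():
    for name, count in names.items():
        print(minute, name, ' ', count)
    print('time:', minute, 'total:', sum(names.values()))
```

For the excerpt above, the per-minute totals come out as 47 for 21:53, 16 for 21:54 and 7 for 21:55, with name=tryxxxx111001 summed to 5 in the last minute.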