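Preface
This article presents a small tool written in Python that analyzes the Git commit log to produce simple activity data for each member of a Git project.
Warning: lines of code are no measure of a programmer's ability!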
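Startup arguments
There are five in total:
- Repo path (resolved against the current working directory)
- Commit start date
- Commit end date
- Subdirectory within the Git repository
- Destination path of the CSV file that receives the analysis results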
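The five arguments above could also be declared with `argparse`, which gives a usage message for free. This is a minimal sketch, not what the sample code does (the sample reads `sys.argv` directly); the function name `build_parser` and the argument names are assumptions made for illustration.

```python
import argparse


def build_parser():
    """Build a parser for the five positional arguments (names are hypothetical)."""
    parser = argparse.ArgumentParser(description="Per-author Git commit statistics")
    parser.add_argument("repo", help="path to the repository")
    parser.add_argument("since", help="commit start date, e.g. 2017-01-01")
    parser.add_argument("until", help="commit end date, e.g. 2017-12-31")
    parser.add_argument("subdir", help="subdirectory inside the repository")
    parser.add_argument("csvfile", help="destination path of the CSV report")
    return parser


# Parse an explicit list here so the example does not depend on sys.argv.
args = build_parser().parse_args(
    ["myrepo", "2017-01-01", "2017-12-31", "src", "out.csv"]
)
print(args.repo, args.since, args.until, args.subdir, args.csvfile)
```

In a real script, calling `parse_args()` with no arguments would read the command line instead of the hard-coded list.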
exec_git
The Git log command:
git -C {} log --since={} --until={} --pretty=tformat:%ae --shortstat --no-merges -- {} > {}
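The placeholders are filled in with the arguments, the command is run through 'os.system()', and its output is redirected to a local temporary file. That file is then read into memory as a plain list of strings.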
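As an alternative to 'os.system()' plus a temporary file, the same command can be run with `subprocess.run`, passing the arguments as a list so that no shell quoting is involved. This is a sketch of a swapped-in technique, not the article's code; the helper names `build_git_log_args` and `run_git_log` are hypothetical, and the `encoding` parameter assumes Python 3.6 or later.

```python
import subprocess


def build_git_log_args(repo, since, until, subdir):
    """Return the git log command as an argument list (hypothetical helper)."""
    return [
        "git", "-C", repo, "log",
        "--since={}".format(since),
        "--until={}".format(until),
        "--pretty=tformat:%ae",
        "--shortstat",
        "--no-merges",
        "--", subdir,
    ]


def run_git_log(repo, since, until, subdir):
    """Run git log and return its output as a list of lines (newlines kept)."""
    completed = subprocess.run(
        build_git_log_args(repo, since, until, subdir),
        stdout=subprocess.PIPE,   # capture output in memory, no temp file
        encoding="utf-8",         # decode bytes to str
        check=True,               # raise CalledProcessError if git fails
    )
    return completed.stdout.splitlines(keepends=True)


print(build_git_log_args("myrepo", "2017-01-01", "2017-12-31", "src"))
```

Keeping the line endings with `keepends=True` matters here because the regular expressions used for parsing expect each stat line to end in a newline.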
parse
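The stat line in the Git log output comes in three formats (insertions and deletions, insertions only, deletions only), each matched by its own regular expression.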
REPATTERN_FULL = r"\s(\d+)\D+(\d+)\D+(\d+)\D+\n"
REPATTERN_INSERT_ONLY = r"\s(\d+)\D+(\d+)\sinsertion\D+\n"
REPATTERN_DELETE_ONLY = r"\s(\d+)\D+(\d+)\sdeletion\D+\n"
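To see how the three patterns divide the work, the sketch below tries them in order on one sample line of each kind. The sample lines are hand-written to imitate `--shortstat` output, and the helper name `parse_shortstat` is hypothetical.

```python
import re

REPATTERN_FULL = r"\s(\d+)\D+(\d+)\D+(\d+)\D+\n"
REPATTERN_INSERT_ONLY = r"\s(\d+)\D+(\d+)\sinsertion\D+\n"
REPATTERN_DELETE_ONLY = r"\s(\d+)\D+(\d+)\sdeletion\D+\n"


def parse_shortstat(line):
    """Return (insertions, deletions) for one stat line, or None if nothing matches."""
    match = re.search(REPATTERN_FULL, line)
    if match:
        return int(match.group(2)), int(match.group(3))
    match = re.search(REPATTERN_INSERT_ONLY, line)
    if match:
        return int(match.group(2)), 0
    match = re.search(REPATTERN_DELETE_ONLY, line)
    if match:
        return 0, int(match.group(2))
    return None


# Hand-written samples imitating the three stat-line formats.
samples = [
    " 3 files changed, 10 insertions(+), 5 deletions(-)\n",
    " 1 file changed, 2 insertions(+)\n",
    " 1 file changed, 7 deletions(-)\n",
]
for sample in samples:
    print(parse_shortstat(sample))
# prints (10, 5), then (2, 0), then (0, 7)
```

The full pattern has to be tried first: it needs three numbers, so it fails on the shorter lines and lets the two single-sided patterns handle them.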
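Iterate over the lines and build a dictionary keyed by author, with the analysis result as the value.
Each result is a four-element record (a list in the sample code, since it is updated in place) holding:
- Number of commits
- Lines added
- Lines deleted
- Net lines changed (added minus deleted)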
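The accumulation step described above can be sketched in isolation as follows. The function name `accumulate` and the e-mail addresses are made up for illustration; the record layout follows the four fields listed above.

```python
def accumulate(stats, author, insert, delete):
    """Add one commit's numbers to the per-author totals (hypothetical helper)."""
    # Record layout: [commits, lines added, lines deleted, net change]
    entry = stats.setdefault(author, [0, 0, 0, 0])
    entry[0] += 1
    entry[1] += insert
    entry[2] += delete
    entry[3] += insert - delete
    return stats


stats = {}
accumulate(stats, "alice@example.com", 10, 5)
accumulate(stats, "alice@example.com", 2, 0)
accumulate(stats, "bob@example.com", 0, 7)
print(stats)
# {'alice@example.com': [2, 12, 5, 7], 'bob@example.com': [1, 0, 7, -7]}
```

A list is used rather than a tuple because the totals are updated in place for every commit; the net change can be negative when an author deletes more than they add.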
save_csv
This part is straightforward, so the details are omitted here.
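For readers who want to see the step anyway, here is a small sketch that writes the per-author records with `csv.writer`. It targets an in-memory `io.StringIO` buffer so it runs without touching the file system; the function name `write_stats` and the sample record are assumptions, while the header matches the one in the sample code.

```python
import csv
import io

CSV_FILE_HEADER = ["Author", "Commit", "Insert", "Delete", "Loc"]


def write_stats(stats, handle):
    """Write a header row and one row per author to an open text handle."""
    writer = csv.writer(handle)
    writer.writerow(CSV_FILE_HEADER)
    for author, stat in sorted(stats.items()):
        writer.writerow([author] + list(stat))


# Made-up record: [commits, lines added, lines deleted, net change]
buffer = io.StringIO()
write_stats({"alice@example.com": [2, 12, 5, 7]}, buffer)
print(buffer.getvalue())
```

When writing to a real file, the Python `csv` documentation recommends opening it with `newline=''` so that the writer controls the line endings itself.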
Sample code:
#!/usr/local/bin/python3
# -*- coding: utf-8 -*-
'''Analyse git branch commit log, for every version, every person.'''
import os
import sys
import re
import csv

# Placeholders, in order: repo, since, until, subdirectory, output file.
GIT_LOG = r'git -C {} log --since={} --until={} --pretty=tformat:%ae --shortstat --no-merges -- {} > {}'
# Stat line with both insertions and deletions.
REPATTERN_FULL = r"\s(\d+)\D+(\d+)\D+(\d+)\D+\n"
# Stat line with insertions only.
REPATTERN_INSERT_ONLY = r"\s(\d+)\D+(\d+)\sinsertion\D+\n"
# Stat line with deletions only.
REPATTERN_DELETE_ONLY = r"\s(\d+)\D+(\d+)\sdeletion\D+\n"
CSV_FILE_HEADER = ["Author", "Commit", "Insert", "Delete", "Loc"]

def exec_git(repo, since, until, subdir):
    '''Execute git log command, return string array.'''
    logfile = os.path.join(os.getcwd(), 'gitstats.txt')
    git_log_command = GIT_LOG.format(repo, since, until, subdir, logfile)
    os.system(git_log_command)
    lines = None
    with open(logfile, 'r', encoding='utf-8') as logfilehandler:
        lines = logfilehandler.readlines()
    return lines

def save_csv(stats, csvfile):
    '''save stats data to csv file.'''
    # newline='' lets the csv module control line endings (avoids blank rows on Windows).
    with open(csvfile, 'w', encoding='utf-8', newline='') as csvfilehandler:
        writer = csv.writer(csvfilehandler)
        writer.writerow(CSV_FILE_HEADER)
        for author, stat in stats.items():
            writer.writerow([author, stat[0], stat[1], stat[2], stat[3]])

def parse(lines):
    '''Analyse git log lines and return per-author statistics.'''
    prog_full = re.compile(REPATTERN_FULL)
    prog_insert_only = re.compile(REPATTERN_INSERT_ONLY)
    prog_delete_only = re.compile(REPATTERN_DELETE_ONLY)
    stats = {}
    # Each commit takes three lines: author e-mail, empty line, stat line.
    for i in range(0, len(lines) - 2, 3):
        author = lines[i].strip()  # drop the trailing newline from the key
        info = lines[i + 2]
        insert, delete = 0, 0
        result = prog_full.search(info)
        if result:
            insert = int(result.group(2))
            delete = int(result.group(3))
        else:
            result = prog_insert_only.search(info)
            if result:
                insert = int(result.group(2))
            else:
                result = prog_delete_only.search(info)
                if result:
                    delete = int(result.group(2))
                else:
                    print('Regular expression fail!')
                    return None
        loc = insert - delete
        stat = stats.get(author)
        if stat is None:
            stats[author] = [1, insert, delete, loc]
        else:
            stat[0] += 1
            stat[1] += insert
            stat[2] += delete
            stat[3] += loc
    return stats

if __name__ == "__main__":
    print('gitstats begin')
    if len(sys.argv) != 6:
        print('Invalid argv parameters.')
        sys.exit(1)
    REPO = os.path.join(os.getcwd(), sys.argv[1])
    SINCE = sys.argv[2]
    UNTIL = sys.argv[3]
    SUB_DIR = sys.argv[4]
    CSV_FILE = os.path.join(os.getcwd(), sys.argv[5])
    LINES = exec_git(REPO, SINCE, UNTIL, SUB_DIR)
    assert LINES is not None
    STATS = parse(LINES)
    if STATS is None:
        # parse() reports a stat line it could not match; nothing to save.
        sys.exit(1)
    save_csv(STATS, CSV_FILE)
    print('gitstats done')
Original link: http://www.jianshu.com/p/cafc3767fff5