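Some resources available online
Tools: https://pan.baidu.com/share/init?shareid=2715092862&uk=3395615408 (password: difc)
https://pan.baidu.com/share/init?shareid=2577552196&uk=1677924157 This one should be the complete version and is for reference only (password: 9gtf).
I applied for the dataset myself, but I only have aflw-images-0.tar.gz, aflw-images-2.tar.gz, and aflw-images-3.tar.gz. The tools I used come from the first link.
There is also a link shared by 闫大脑门, which is more detailed: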
https://pan.baidu.com/share/link?shareid=2792069998&uk=1897923954#list/path=%2F
No password is needed.
Put the archives in the same folder on a Linux machine and extract them there.
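As an alternative to extracting each archive by hand, here is a minimal Python sketch that unpacks every matching archive in a folder. The function name `extract_archives` and the glob pattern are illustrative assumptions based on the file names listed above. Only run it on archives you trust.

```python
import tarfile
from pathlib import Path


def extract_archives(folder, pattern="aflw-images-*.tar.gz"):
    """Extract every archive matching `pattern` inside `folder`, in place.

    Returns the sorted list of archive file names that were extracted.
    `pattern` is an assumption derived from the archive names above.
    """
    folder = Path(folder)
    extracted = []
    for archive in sorted(folder.glob(pattern)):
        # "r:gz" opens a gzip-compressed tar archive for reading.
        with tarfile.open(archive, "r:gz") as tar:
            tar.extractall(path=folder)
        extracted.append(archive.name)
    return extracted
```

Calling `extract_archives(".")` from the folder that holds the archives would unpack all of them into that same folder.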
Install sqliteman:
sudo apt-get install sqliteman
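sqliteman is a graphical browser for SQLite databases. If a GUI is not available, the following sketch lists the tables of a SQLite file with Python's standard `sqlite3` module. It assumes the downloaded data includes a SQLite database file. These notes do not give its name, so any path passed to the hypothetical helper `list_tables` is yours to supply.

```python
import sqlite3


def list_tables(db_path):
    """Return the names of all tables in the SQLite file at `db_path`."""
    conn = sqlite3.connect(db_path)
    try:
        # sqlite_master is SQLite's built-in schema catalog.
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]
```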
The data-processing scripts are as follows:
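# -*- coding: utf-8 -*-
# Group the lines of b.txt under each distinct line of a.txt and write the result to c.txt.
with open('a.txt') as fid_a, open('b.txt') as fid_b:
    lines_a = fid_a.readlines()
    lines_b = fid_b.readlines()

# Deduplicate a.txt, recording how many times each distinct line occurs.
lines_a_no_repeat = []  # distinct lines, in first-seen order
num = []                # occurrence count of each distinct line
for line_a in lines_a:
    if line_a not in lines_a_no_repeat:
        lines_a_no_repeat.append(line_a)
        num.append(lines_a.count(line_a))

# For each distinct line, write "<line> <count>" once, then every line of
# b.txt whose index matches an occurrence of that line in a.txt.
with open('c.txt', 'w') as fid_c:
    for ii, tmp in enumerate(lines_a_no_repeat):
        fid_c.write(tmp.strip('\n\r') + ' ' + str(num[ii]) + '\n')
        for i in range(len(lines_a)):
            if tmp == lines_a[i]:
                fid_c.write(lines_b[i])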

# Second script: append each line of b.txt to the matching line of a.txt and write the result to cc.txt.
with open('a.txt') as fid_a, open('b.txt') as fid_b:
    lines_a = fid_a.readlines()
    lines_b = fid_b.readlines()

with open('cc.txt', 'w') as fid_c:
    for i in range(len(lines_b)):
        # Strip the line ending from a.txt so both parts land on one line.
        fid_c.write(lines_a[i].strip('\n\r') + ' ' + lines_b[i])
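To make the grouping done by the first script easier to follow, here is a small sketch of the same idea on in-memory lists. It uses a dict instead of rescanning the list for every distinct line. The function name `group_lines` and the sample values are hypothetical, since these notes do not say what a.txt and b.txt contain. The sketch also assumes line endings have already been stripped.

```python
def group_lines(keys, values):
    """Group `values` under each distinct key, keeping first-seen key order.

    Returns the output lines: "<key> <count>" followed by that key's values.
    """
    groups = {}  # dicts preserve insertion order in Python 3.7+
    for key, value in zip(keys, values):
        groups.setdefault(key, []).append(value)

    out = []
    for key, items in groups.items():
        out.append(f"{key} {len(items)}")
        out.extend(items)
    return out


# Hypothetical sample data: three entries, two of which share a key.
keys = ["img1.jpg", "img2.jpg", "img1.jpg"]
values = ["10 20 30 40", "5 5 50 50", "60 70 20 20"]
example = group_lines(keys, values)
# example == ['img1.jpg 2', '10 20 30 40', '60 70 20 20',
#             'img2.jpg 1', '5 5 50 50']
```

The dict-based version makes one pass over the input, whereas the nested loops in the original script rescan a.txt once per distinct line. For small files the difference is unlikely to matter.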