I have two CSVs, openable in Numbers or Excel, structured as:
| word | num1 |
and
| word | num2 |
If the words in the two files are equal (say both are 'hi'), I want the rows combined into:
| word | num1 | num2 |
Here are some pictures:
For row 1, since both words are the same ('TRUE'), I want the result to be something like:
| TRUE | 5.371748 | 4.48957 |
Is there a way to do this, either through a small script or through some feature or function I'm overlooking?
Thanks!
3 Solutions
#1
1
Use a dict:
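import csv

# newline='' lets the csv module handle line endings itself (Python 3)
with open('file1.csv', newline='') as file_a, open('file2.csv', newline='') as file_b:
    data_a = csv.reader(file_a)
    data_b = dict(csv.reader(file_b))  # <-- dict: maps each word to its num2
    with open('out.csv', 'w', newline='') as file_out:
        csv_out = csv.writer(file_out)
        for word, num_a in data_a:
            # words missing from the second file get an empty third column
            csv_out.writerow([word, num_a, data_b.get(word, '')])  # <-- edit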
(untested)
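The dict lookup above can also be wrapped in a small function so that it is easy to try without touching real files. This is only a sketch: `merge_by_word`, its `keep_unmatched` flag, and the sample rows are hypothetical names and data made up for illustration, not part of the original answer. With `keep_unmatched=False` it drops words that appear only in the first file, which may be closer to "only when the two words are equal".

```python
import csv
import io


def merge_by_word(rows_a, rows_b, keep_unmatched=True):
    """Join two iterables of (word, number) rows on the word column."""
    lookup = dict(rows_b)  # word -> num2; a repeated word keeps its last value
    merged = []
    for word, num_a in rows_a:
        if word in lookup:
            merged.append([word, num_a, lookup[word]])
        elif keep_unmatched:
            merged.append([word, num_a, ''])  # no partner in the second file
    return merged


# Hypothetical sample data standing in for file1.csv and file2.csv.
file_a = io.StringIO("TRUE,5.371748\nhi,1.5\nonly_a,2.0\n")
file_b = io.StringIO("hi,9.25\nTRUE,4.48957\nonly_b,7.0\n")

rows = merge_by_word(csv.reader(file_a), csv.reader(file_b), keep_unmatched=False)
for row in rows:
    print(row)
```

Because the lookup is keyed on the word, the two files do not need to list their words in the same order. Words that appear only in the second file are never written in either mode.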
#2
4
For csv, I always reach for the data analysis library pandas. http://pandas.pydata.org/
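import pandas as pd

# names= assumes the files have no header row
df1 = pd.read_csv('file1.csv', names=['word', 'num1'])
df2 = pd.read_csv('file2.csv', names=['word', 'num2'])
df3 = pd.merge(df1, df2, on='word')  # inner join: keeps only words present in both files
df3.to_csv('merged_data.csv', index=False)  # index=False omits the row-number column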
#3
0
I think what you're looking for is zip, to let you iterate the two CSVs in lock-step:
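import csv

# newline='' lets the csv module handle line endings itself (Python 3)
with open('file1.csv', newline='') as f1, open('file2.csv', newline='') as f2:
    r1, r2 = csv.reader(f1), csv.reader(f2)
    with open('out.csv', 'w', newline='') as fout:
        w = csv.writer(fout)
        for row1, row2 in zip(r1, r2):  # pairs row N of one file with row N of the other
            if row1[0] == row2[0]:
                w.writerow([row1[0], row1[1], row2[1]])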
I'm not sure what you wanted to happen if they're not equal. Maybe insert both rows, like this?
            else:
                w.writerow([row1[0], row1[1], ''])
                w.writerow([row2[0], '', row2[1]])
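One thing to keep in mind with zip is that it stops at the end of the shorter input, so trailing rows of a longer file are silently dropped. If that matters, `itertools.zip_longest` pads the shorter side with `None` instead. The sketch below assumes that behaviour is wanted. `merge_lockstep` and the sample rows are hypothetical, chosen only to show the idea.

```python
import csv
import io
from itertools import zip_longest


def merge_lockstep(rows_a, rows_b):
    """Pair rows by position; merge equal words, otherwise keep both rows."""
    merged = []
    for row_a, row_b in zip_longest(rows_a, rows_b):
        if row_a is not None and row_b is not None and row_a[0] == row_b[0]:
            merged.append([row_a[0], row_a[1], row_b[1]])
            continue
        if row_a is not None:
            merged.append([row_a[0], row_a[1], ''])  # only in the first file
        if row_b is not None:
            merged.append([row_b[0], '', row_b[1]])  # only in the second file
    return merged


# Hypothetical sample data; the second file has one extra row.
file_a = io.StringIO("TRUE,5.371748\nhi,1.5\n")
file_b = io.StringIO("TRUE,4.48957\nbye,9.25\nextra,3.0\n")

result = merge_lockstep(csv.reader(file_a), csv.reader(file_b))
for row in result:
    print(row)
```

This still compares rows by position only, so it fits files whose rows are already aligned. If the same word can sit on different rows in the two files, a lookup keyed on the word is the safer choice.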