Good day. I have some data in an Excel spreadsheet, structured like this:
I'd like to combine the unique values from the 3 columns into a format like:
Mike to America for Hotel; Meal
and
Kate to Europe; America for Transport
etc
I can only work it out for 2 columns:
import xlrd
from collections import defaultdict

the_file = xlrd.open_workbook("testing.xlsx")
the_sheet = the_file.sheet_by_name("Sheet1")
products = defaultdict(list)
# Start at row 1 (row 0 is presumably the header); map column 0 to the values in column 1.
for row_index in range(1, the_sheet.nrows):
    products[str(the_sheet.cell(row_index, 0).value)].append(the_sheet.cell(row_index, 1).value)
for product, v in products.items():
    print(product + " to " + ";".join(set(v)))
The output is:
Mike to America
Hulk to America;Asia
Kate to Europe;America
Dave to Europe
Jack to Europe;America;Asia
Luci to Asia
How can I make this work for all 3 columns together?
Thank you.
3 answers
#1
1
There is probably a more Pythonic way to do it, but this is what I came up with:
from collections import defaultdict

l = [
    ['mike', 'america', 'hotel'],
    ['mike', 'america', 'meal'],
    ['jack', 'america', 'meal'],
    ['jack', 'europe', 'hotel'],
    ['jack', 'america', 'bonus'],
    ['jack', 'asia', 'hotel'],
    ['dave', 'europe', 'meal'],
]
people = defaultdict(list)         # name -> places visited
people_places = defaultdict(list)  # "name|place" -> activities at that place
for name, place, what in l:
    people[name].append(place)
    people_places[name + '|' + place].append(what)
for p, k in people.items():
    # Collect every activity recorded for this person across all places.
    activity = []
    for place in k:
        activity += people_places[p + '|' + place]
    print('{} to {} for {}'.format(
        p,
        ';'.join(set(k)),
        ';'.join(set(activity))
    ))
You can adapt the code to read the spreadsheet rows and cells directly, or first extract the list l with something like:
import xlrd

l = []
with xlrd.open_workbook("testing.xlsx") as the_file:
    the_sheet = the_file.sheet_by_name("Sheet1")
    # Start at row 1 to skip the first row of the sheet.
    for row_index in range(1, the_sheet.nrows):
        l.append([
            the_sheet.cell(row_index, 0).value,
            the_sheet.cell(row_index, 1).value,
            the_sheet.cell(row_index, 2).value])
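As a possible tidy-up of the approach above, here is a minimal sketch that keeps the same two lookups but uses sets (so duplicates never accumulate) and tuple keys instead of the '|' string concatenation. The function name summarize is made up for illustration, and the sorting is an added assumption to make the output reproducible; it is not part of the original answer.

```python
from collections import defaultdict


def summarize(rows):
    """Return one 'name to places for activities' line per person.

    `summarize` is a hypothetical helper name, not from the original answer.
    """
    places = defaultdict(set)          # name -> unique places
    place_activities = defaultdict(set)  # (name, place) -> unique activities
    for name, place, activity in rows:
        places[name].add(place)
        place_activities[(name, place)].add(activity)

    lines = []
    for name in sorted(places):
        # Union of the activities recorded at each of this person's places.
        activities = set()
        for place in places[name]:
            activities |= place_activities[(name, place)]
        lines.append('{} to {} for {}'.format(
            name,
            ';'.join(sorted(places[name])),
            ';'.join(sorted(activities)),
        ))
    return lines


if __name__ == '__main__':
    l = [
        ['mike', 'america', 'hotel'],
        ['mike', 'america', 'meal'],
        ['jack', 'america', 'meal'],
        ['jack', 'europe', 'hotel'],
        ['jack', 'america', 'bonus'],
        ['jack', 'asia', 'hotel'],
        ['dave', 'europe', 'meal'],
    ]
    for line in summarize(l):
        print(line)
```

A tuple key such as (name, place) cannot collide the way a joined string can if a name or place ever happened to contain the '|' character.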
#2
1
First, extract the rows you want. Here I build a nested list, i.e. [[col1, col2, col3], [col1, col2, col3]]:
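# the_sheet is the xlrd sheet opened as in the question.
# The offsets match my own test sheet: data starts at row index 3, in columns 1 to 3.
bigbox = []
for i in range(3, the_sheet.nrows):
    # Read the cell values directly rather than parsing str(cell), which keeps the quotes.
    box = [str(the_sheet.cell(i, j).value) for j in range(1, 4)]
    bigbox.append(box)
print(bigbox)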
Then convert the nested list into a dictionary of dictionaries of sets, i.e. {'name': {'travel': set of travel, 'expense': set of expense}, ...}:
dbox = dict()
for name, travel, expense in bigbox:
    if name not in dbox:
        # First row for this name: start both sets.
        dbox[name] = {'travel': {travel}, 'expense': {expense}}
    else:
        dbox[name]['travel'].add(travel)
        dbox[name]['expense'].add(expense)
print(dbox)
Lastly, print it out by joining each set into a single string (read the documentation for str.join for more info):
for name in dbox:
    print("{} to {} for {}".format(
        name, "; ".join(dbox[name]['travel']), "; ".join(dbox[name]['expense'])))
Hope this helps. One complaint: you didn't provide the Excel file, so I had to create it myself. Next time, please include it. Also, this looks similar to a programming course assignment?
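One thing to keep in mind with the set-based approach: a set has no guaranteed order, so the joined values may not appear in the order they occur in the sheet. If first-seen order matters, a minimal sketch along these lines may help. It assumes Python 3.7 or later, where dict preserves insertion order; the helper names unique_in_order and format_person are made up for illustration.

```python
def unique_in_order(values):
    """Drop duplicates while keeping first-seen order.

    Relies on dict preserving insertion order (guaranteed from Python 3.7).
    """
    return list(dict.fromkeys(values))


def format_person(name, travel, expense, sep="; "):
    """Build one output line from the raw (possibly repeated) column values."""
    return "{} to {} for {}".format(
        name,
        sep.join(unique_in_order(travel)),
        sep.join(unique_in_order(expense)),
    )


if __name__ == "__main__":
    # Prints: Kate to Europe; America for Transport
    print(format_person("Kate",
                        ["Europe", "America", "Europe"],
                        ["Transport", "Transport", "Transport"]))
```

With this variant the per-name values would be collected in lists rather than sets, and deduplicated only when the line is formatted.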
#3
1
Here is the solution that comes to mind at the moment:
import xlrd
from collections import defaultdict

the_file = xlrd.open_workbook("4_test.xlsx")
the_sheet = the_file.sheet_by_name("Sheet1")
nested_dict = lambda: defaultdict(nested_dict)
_dict = nested_dict()
for row_index in range(1, the_sheet.nrows):
    expense = []
    travel = []
    name = str(the_sheet.cell(row_index, 0).value)
    # Re-scan the sheet and collect every row that belongs to this name.
    for row_index_1 in range(1, the_sheet.nrows):
        if name == str(the_sheet.cell(row_index_1, 0).value):
            travel.append(str(the_sheet.cell(row_index_1, 1).value))
            expense.append(str(the_sheet.cell(row_index_1, 2).value))
    _dict[name]['travel'] = travel
    _dict[name]['expense'] = expense
for name in _dict:
    print(name + " to " + ",".join(set(_dict[name]['travel'])) + " for " + ",".join(set(_dict[name]['expense'])))
Output:
Pintu to Europe for Bonous
Jack to Europe,America for Hotel,Meal
Mike to America for Bonous,Hotel,Transport
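The nested loop above re-reads the whole sheet once per row. A single pass is probably enough if the inner level defaults to a set. The sketch below works on plain tuples so it runs without a spreadsheet; the function name group_rows and the sample rows are invented for illustration, not taken from the answer's test file.

```python
from collections import defaultdict


def group_rows(rows):
    """Build {name: {'travel': set, 'expense': set}} in one pass over the rows."""
    # Outer level creates a per-name dict on demand; inner level creates empty sets.
    grouped = defaultdict(lambda: defaultdict(set))
    for name, travel, expense in rows:
        grouped[name]['travel'].add(travel)
        grouped[name]['expense'].add(expense)
    return grouped


if __name__ == '__main__':
    # Hypothetical sample data standing in for the spreadsheet rows.
    rows = [
        ('Jack', 'Europe', 'Hotel'),
        ('Jack', 'America', 'Meal'),
        ('Mike', 'America', 'Hotel'),
        ('Mike', 'America', 'Transport'),
    ]
    grouped = group_rows(rows)
    for name in sorted(grouped):
        print(name + " to " + ",".join(sorted(grouped[name]['travel']))
              + " for " + ",".join(sorted(grouped[name]['expense'])))
```

To feed it from the sheet, each row could be passed as a tuple of the three cell values, read the same way as in the loop above.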