Excel (CSV) - Convert header data into rows using repeated row mapping

Date: 2021-11-10 21:45:54

I have an Excel-based data set that needs transformation. I would like a Python-based solution, as I am learning Python and will be able to read and modify the code afterwards. Either Excel or CSV input/output is fine.

This is what my data looks like:

Channel    Condition    Value1  Value2  Value3   (Header)
Channel A  Condition B  Live    Live    Pilot
Channel A  Condition B  Live    Pilot   Live
Channel B  Condition C  Pilot   Pilot   Pilot
Channel C  Condition D  Live    Live    Live

This is the output I want:

Channel    Condition    Value(all)  Status   (Header; it is fine if this does not show up in the output)
Channel A  Condition B  Value1      Live
Channel A  Condition B  Value2      Live
Channel A  Condition B  Value3      Pilot
Channel A  Condition B  Value1      Live
Channel A  Condition B  Value2      Pilot
Channel A  Condition B  Value3      Live
...

Basically, the Channel and Condition are repeated for each of the "Values": the value name should be taken from the column header, and the status (Live/Pilot) from the data set itself.

I would appreciate some assistance, as I have about 1000 rows to transform this way.

Here is an image representing what I want: [image]

Edit 2: There's a typo in the screenshot. The last 3 rows should read Channel B, not Channel A.

2 solutions

#1


0  

Try using the xlrd module. Something like:

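import xlrd  # note: xlrd 2.0 and later reads only legacy .xls files, not .xlsx

path = "data.xls"  # placeholder: path to your workbook
index = 0          # placeholder: index of the sheet to read

wb = xlrd.open_workbook(path)
sheet = wb.sheet_by_index(index)

column_list = range(sheet.ncols)

# Header row: drop the first two names (Channel, Condition);
# what remains are the value column names.
val_name = [sheet.cell_value(rowx=0, colx=i) for i in column_list]
channel = val_name.pop(0)
condition = val_name.pop(0)

print(channel, condition, "Value *", "Status")
for r in range(1, sheet.nrows):
    row = [sheet.cell_value(rowx=r, colx=i) for i in column_list]
    channel, condition = row[0], row[1]
    values = row[2:]

    # One output line per value column, repeating channel and condition.
    for name, status in zip(val_name, values):
        print(channel, condition, name, status)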
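
The reshaping step in the loop above does not depend on how the sheet is read. As a minimal sketch, it can be pulled into a small function that works on plain lists, which makes it easy to try out without a workbook. The function name `unpivot`, its `id_cols` parameter, and the sample rows are illustrative assumptions, not part of xlrd.

```python
def unpivot(header, rows, id_cols=2):
    """Repeat the first `id_cols` cells of each row once per remaining column.

    `header` is the list of column names and `rows` a list of data rows.
    The function name and the `id_cols` parameter are hypothetical.
    """
    value_names = header[id_cols:]
    out = []
    for row in rows:
        ids, values = row[:id_cols], row[id_cols:]
        # Pair each value column name with the cell found under it.
        for name, status in zip(value_names, values):
            out.append([*ids, name, status])
    return out


# Sample data modelled on the question.
header = ["Channel", "Condition", "Value1", "Value2", "Value3"]
rows = [
    ["Channel A", "Condition B", "Live", "Live", "Pilot"],
    ["Channel B", "Condition C", "Pilot", "Pilot", "Pilot"],
]
result = unpivot(header, rows)
for line in result:
    print(*line)
```

Each input row yields one output row per value column, so two input rows with three value columns give six output rows.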

#2


0  

Something like this should do the job.

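import csv

transformed = []
# Assumes a comma-separated file whose first row is the header.
with open('excel.csv', newline='') as csvfile:
    r = csv.reader(csvfile)
    # Header: skip the Channel/Condition names, keep the value column names.
    _, _, *val_names = next(r)
    for row in r:
        channel, condition, *vals = row
        # One output row per value column, repeating channel and condition.
        for name, val in zip(val_names, vals):
            transformed.append([channel, condition, name, val])

with open('transformed.csv', 'w', newline='') as csvfile:
    w = csv.writer(csvfile)
    # writerows expects each row as a sequence of fields, not a joined string.
    w.writerows(transformed)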
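
If the column names, rather than the column positions, should drive the transformation, `csv.DictReader` is one option. The sketch below assumes a comma-separated input with `Channel` and `Condition` header names as in the question, and uses in-memory text in place of real files; `reshape_csv` is a hypothetical helper name.

```python
import csv
import io


def reshape_csv(src, dst, id_cols=("Channel", "Condition")):
    """Read wide CSV text from `src` and write one row per value column to `dst`."""
    reader = csv.DictReader(src)
    # Every column that is not an identifier column is treated as a value column.
    value_cols = [c for c in reader.fieldnames if c not in id_cols]
    writer = csv.writer(dst, lineterminator="\n")
    writer.writerow([*id_cols, "Value", "Status"])
    for record in reader:
        for col in value_cols:
            writer.writerow([*(record[c] for c in id_cols), col, record[col]])


# In-memory stand-ins for the input and output files.
wide = io.StringIO(
    "Channel,Condition,Value1,Value2,Value3\n"
    "Channel A,Condition B,Live,Pilot,Live\n"
)
long_form = io.StringIO()
reshape_csv(wide, long_form)
print(long_form.getvalue())
```

With real files, the same function can be passed file objects opened with `newline=''`, as in the code above.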
