How do I transpose rows and create columns in a pandas DataFrame?

Time: 2022-02-13 21:40:57

I have the following list:

list_1 = [{'28d_click': '2', 'action_type': 'comment', 'value': '2'},
 {'28d_click': '1779',
  '7d_view': '11144',
  'action_type': 'offsite_conversion.custom.xx',
  'value': '9425'},
 {'28d_click': '122', 'action_type': 'landing_page_view', 'value': '122'},
 {'28d_click': '21', 'action_type': 'like', 'value': '21'},
 {'28d_click': '175', 'action_type': 'link_click', 'value': '175'},
 {'28d_click': '1', 'action_type': 'post', 'value': '1'},
 {'28d_click': '23', 'action_type': 'post_reaction', 'value': '23'},
 {'28d_click': '222', 'action_type': 'page_engagement', 'value': '222'},
 {'28d_click': '201', 'action_type': 'post_engagement', 'value': '201'},
 {'28d_click': '1936',
  '7d_view': '11171',
  'action_type': 'offsite_conversion',
  'value': '9607'}]

I have then used this list to create a pandas DataFrame: df = pd.DataFrame(list_1)

What I would like to do is use the action_type as columns and store the values in the rows beneath:

The layout below is only an example. Please note that some action_types (conversions) need two fields, 28d_click and 7d_view, as separate columns, e.g. Column 1: Conversion_1 (28d_click) | Column 2: Conversion_1 (7d_view).

landing_page_view   link_click  offsite_conversion.custom.xx (28d_click)  offsite_conversion.custom.xx (7d_view)
122                 175         7                                         16

What I have tried:

df = pd.DataFrame(list_1).T

This almost gives me the headings I want, but not for the 28d_click and 7d_view values.

print([d['action_type'] for d in list_1 if 'action_type' in d])
print([d['value'] for d in list_1 if 'value' in d])
print([d['28d_click'] for d in list_1 if '28d_click' in d])
print([d['7d_view'] for d in list_1 if '7d_view' in d])

This creates individual lists, but I'm not sure what to do with them.

Is what I'm asking for possible?

I have another DataFrame which I want to join this with but wanted to get this right first.

Any help would be greatly appreciated.

Thanks,

Adrian

1 solution

#1


First call set_index on action_type, then reshape with stack, build a one-column DataFrame with to_frame, and transpose it.

The result has a MultiIndex in the columns, so flatten it by joining the levels with map and the separator _:

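import pandas as pd
# Index by action_type, stack the fields, then make a single-row DataFrame.
df = pd.DataFrame(list_1).set_index('action_type').stack().to_frame(0).T
# Flatten the (action_type, field) column MultiIndex with an underscore.
df.columns = df.columns.map('_'.join)

print(df)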
  comment_28d_click comment_value offsite_conversion.custom.xx_28d_click  \
0                 2             2                                   1779   

  offsite_conversion.custom.xx_7d_view offsite_conversion.custom.xx_value  \
0                                11144                               9425   

  landing_page_view_28d_click landing_page_view_value like_28d_click  \
0                         122                     122             21   

  like_value link_click_28d_click           ...            post_value  \
0         21                  175           ...                     1   

  post_reaction_28d_click post_reaction_value page_engagement_28d_click  \
0                      23                  23                       222   

  page_engagement_value post_engagement_28d_click post_engagement_value  \
0                   222                       201                   201   

  offsite_conversion_28d_click offsite_conversion_7d_view  \
0                         1936                      11171   

  offsite_conversion_value  
0                     9607  

[1 rows x 22 columns]
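
If the "action_type (field)" labels from the question are preferred over underscore-joined names, the following is a minimal sketch of one way to get them. It swaps stack for melt plus dropna, which is a different route than the one above, and it uses a trimmed three-record copy of list_1. The names long_df, wide, field, amount and label are illustrative assumptions, not part of the original question or answer. Note that value_name must not be left at its default here, because the data already has a column called value.

```python
import pandas as pd

# Trimmed copy of the question's list_1 (three of the ten records).
list_1 = [
    {'28d_click': '122', 'action_type': 'landing_page_view', 'value': '122'},
    {'28d_click': '175', 'action_type': 'link_click', 'value': '175'},
    {'28d_click': '1779', '7d_view': '11144',
     'action_type': 'offsite_conversion.custom.xx', 'value': '9425'},
]

# Long format: one row per (action_type, field) pair. dropna removes the
# 7d_view entries that are missing for most action types.
long_df = (
    pd.DataFrame(list_1)
    .melt(id_vars='action_type', var_name='field', value_name='amount')
    .dropna(subset=['amount'])
)

# The source values are strings, so convert them before any arithmetic.
long_df['amount'] = pd.to_numeric(long_df['amount'])

# Build labels in the style the question asks for, e.g. "like (28d_click)".
long_df['label'] = long_df['action_type'] + ' (' + long_df['field'] + ')'

# One row, one column per label.
wide = long_df.set_index('label')[['amount']].T.reset_index(drop=True)
wide.columns.name = None

print(wide)
```

Because the result is a single row with index 0, it can later be combined with another single-row DataFrame, for example with pd.concat along axis=1, assuming the other frame uses the same index.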
