I have two dataframes, called Old and New. Old has 96 rows, and New has 48 rows. I want to take one column of Old, say ['Values'], and split it into two columns in New, say ['First'] and ['Second']. Thus, for a simple example with 6 rows to start, from:
   Values
1      10
2      20
3      30
4      40
5      50
6      60
to
   First  Second
1     10      40
2     20      50
3     30      60
I have a notion that this should be trivially easy, and yet I can't do it because the indices need to be changed. I simply want to copy values, as you see.
How is this best done?
2 solutions
#1
1
You can use reshape:
pd.DataFrame(df.values.reshape(-1, 2, order='F'), columns=['First', 'Second'])
Out[12]:
   First  Second
0     10      40
1     20      50
2     30      60
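To see why order='F' is what makes this work, here is a small self-contained sketch. It assumes a single-column frame named old (a hypothetical stand-in for Old) built from the 6-row example. Column-major order fills the first output column completely before moving on to the second, whereas the default order='C' would place neighbouring rows side by side instead.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the Old dataframe from the question.
old = pd.DataFrame({'Values': [10, 20, 30, 40, 50, 60]}, index=range(1, 7))

values = old['Values'].to_numpy()

# order='F' (column-major): the first half lands in column 0, the second half in column 1.
halves = values.reshape(-1, 2, order='F')

# Default order='C' (row-major): consecutive values end up side by side instead.
pairs = values.reshape(-1, 2)

new = pd.DataFrame(halves, columns=['First', 'Second'])
new.index = new.index + 1  # optional: 1-based labels, as in the question's target table
print(new)
```

The reshape only works when the number of rows is divisible by the number of target columns, which holds for 96 rows split into 2 columns.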
#2
1
Using split from numpy, you can split the column into two (or any other number of) equal parts and combine them with hstack to form a new dataframe:
import numpy as np
import pandas as pd
df = pd.DataFrame({'Values': {1: 10, 2: 20, 3: 30, 4: 40, 5: 50, 6: 60}})
print(df)
Input dataframe:
   Values
1      10
2      20
3      30
4      40
5      50
6      60
Now, using split() and then hstack():
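# Convert to a NumPy array first; calling np.split directly on a DataFrame
# may emit a deprecation warning in recent pandas versions.
splits = np.split(df.to_numpy(), 2)  # two arrays of shape (3, 1)
# Stack the two halves side by side to get shape (3, 2).
result_df = pd.DataFrame(np.hstack(splits), columns=['First', 'Second'])
print(result_df)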
Result:
   First  Second
0     10      40
1     20      50
2     30      60
Without the intermediate splits variable, you can try:
result_df = pd.DataFrame(np.hstack(np.split(df.to_numpy(), 2)), columns=['First', 'Second'])
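If New already exists with its own index, as in the question, one option is to assign raw NumPy arrays rather than pandas objects, since arrays are copied by position and skip index alignment. A minimal sketch, assuming hypothetical frames old and new with 96 and 48 rows and made-up values:

```python
import numpy as np
import pandas as pd

# Hypothetical data: 96 rows in old, 48 rows in new, with differing index labels.
old = pd.DataFrame({'Values': np.arange(96) * 10})
new = pd.DataFrame(index=pd.RangeIndex(start=1, stop=49))

half = len(old) // 2
values = old['Values'].to_numpy()

# Assigning arrays copies by position, so the index labels of old are ignored.
new['First'] = values[:half]
new['Second'] = values[half:]

print(new.head(3))
```

Assigning old['Values'] slices directly (as Series) would instead align on index labels and could leave NaN where the labels of the two frames do not match, which is likely the index problem described in the question.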