I have a large DataFrame containing a column titled "Comment".
From each comment I need to pull out three values (duty cycle, gas, and pressure) and place them into separate columns. A typical comment looks like this:
"Data collection START for Duty Cycle: 0, Gas: Vacuum Pressure: 0.000028 Torr"
Currently I am using .split and .tolist to parse the string:
import pandas as pd

# split each comment on whitespace; '0' names the tokens to discard
df1 = pd.DataFrame(eventsDf.comment.str.split().tolist(),
                   columns="0 0 0 0 0 0 dutyCycle 0 Gas 0 Pressure 0".split())
# join DataFrames
eventsDf = pd.concat([eventsDf, df1], axis=1)
# drop columns not needed
eventsDf.drop(['comment', '0'], axis=1, inplace=True)
I find this method rather "hacky": if the structure of the comment ever changes, my code will break. Can anyone show me a more efficient and robust way to do this? Thank you so much!
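One possible direction, sketched here under assumptions rather than offered as a definitive answer: match on the labels in the text ("Duty Cycle:", "Gas:", "Pressure:") instead of on word positions, using pandas' `Series.str.extract` with named groups. The sample rows below are made up for illustration, the column name `comment` follows the code above, and the pattern assumes the three labels always appear in this order and that the gas name is a single word.

```python
import pandas as pd

# Hypothetical sample data; the second row is invented for illustration.
eventsDf = pd.DataFrame({
    "comment": [
        "Data collection START for Duty Cycle: 0, Gas: Vacuum Pressure: 0.000028 Torr",
        "Data collection START for Duty Cycle: 50, Gas: Argon Pressure: 0.0031 Torr",
    ]
})

# Each named group becomes a column in the result. The pattern keys on the
# labels, so extra words before or between the fields do not shift anything.
pattern = (
    r"Duty Cycle:\s*(?P<dutyCycle>[\d.]+)"
    r".*?Gas:\s*(?P<Gas>\w+)"
    r".*?Pressure:\s*(?P<Pressure>[\d.eE+-]+)"
)

# Rows that do not match the pattern get NaN in all three columns.
extracted = eventsDf["comment"].str.extract(pattern)

# str.extract returns strings; convert the numeric fields explicitly.
extracted["dutyCycle"] = pd.to_numeric(extracted["dutyCycle"])
extracted["Pressure"] = pd.to_numeric(extracted["Pressure"])

# Replace the raw comment column with the parsed columns.
eventsDf = pd.concat([eventsDf.drop(columns="comment"), extracted], axis=1)
print(eventsDf)
```

This still depends on the label text itself staying the same, but it no longer depends on how many words surround each value.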