I am trying to write a paper in an IPython notebook, but I have run into some issues with the display format. Say I have the following dataframe df. Is there any way to format var1 and var2 as 2-digit decimals and var3 as a percentage?
        var1      var2      var3
id
0   1.458315  1.500092 -0.005709
1   1.576704  1.608445 -0.005122
2   1.629253  1.652577 -0.004754
3   1.669331  1.685456 -0.003525
4   1.705139  1.712096 -0.003134
5   1.740447  1.741961 -0.001223
6   1.775980  1.770801 -0.001723
7   1.812037  1.799327 -0.002013
8   1.853130  1.822982 -0.001396
9   1.943985  1.868401  0.005732
The numbers in var3 have not been multiplied by 100, e.g. -0.0057 should be displayed as -0.57%.
6 answers
#1
17
Replace the values using the round function, and format the string representation of the percentage numbers:
df['var2'] = pd.Series([round(val, 2) for val in df['var2']], index = df.index)
df['var3'] = pd.Series(["{0:.2f}%".format(val * 100) for val in df['var3']], index = df.index)
The round function rounds a floating point number to the number of decimal places given as its second argument.
String formatting allows you to represent the numbers as you wish. You can change the number of decimal places shown by changing the number before the f.
P.S. I was not sure whether your 'percentage' numbers had already been multiplied by 100. If they have, you will want to change the number of decimals displayed and remove the multiplication by 100.
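As a minimal sketch of the approach above, assuming a small stand-in frame built from the first two rows of the question (the name df_demo is made up for illustration), the following shows that the rounded column stays numeric while the formatted percentage column ends up holding strings:

```python
import pandas as pd

# Stand-in data: first two rows of the question's frame (an assumption for illustration).
df_demo = pd.DataFrame({"var2": [1.500092, 1.608445],
                        "var3": [-0.005709, -0.005122]})

# Round var2; the values are replaced, but they stay floats.
df_demo["var2"] = pd.Series([round(val, 2) for val in df_demo["var2"]],
                            index=df_demo.index)

# Format var3 as percentage text; the values become Python strings.
df_demo["var3"] = pd.Series(["{0:.2f}%".format(val * 100) for val in df_demo["var3"]],
                            index=df_demo.index)

print(df_demo["var2"].tolist())  # [1.5, 1.61]
print(df_demo["var3"].tolist())  # ['-0.57%', '-0.51%']
```

Because var3 now holds text, it can no longer be used directly in further numeric work.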
#2
56
The accepted answer suggests modifying the raw data for presentation purposes, which is something you generally do not want. Imagine you need to run further analyses on these columns and need the precision that was lost by rounding.
Instead, you can set the formatting of individual columns when the data frame is rendered. In your case:
output = df.to_string(formatters={
    'var1': '{:,.2f}'.format,
    'var2': '{:,.2f}'.format,
    'var3': '{:,.2%}'.format
})
print(output)
For your information, '{:,.2%}'.format(0.214) yields 21.40%, so there is no need to multiply by 100.
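The point about the % format specifier can be checked with a short sketch. The frame df_demo below is a made-up stand-in holding the first two rows of the question's data; the formatters only affect the rendered text, and the stored values are left as they were:

```python
import pandas as pd

# Stand-in data: first two rows of the question's frame (an assumption for illustration).
df_demo = pd.DataFrame({"var1": [1.458315, 1.576704],
                        "var3": [-0.005709, -0.005122]})

# The "%" presentation type multiplies by 100 and appends the percent sign itself.
pct = "{:,.2%}".format(0.214)
print(pct)  # 21.40%

# Formatters are applied per column, only while building the text output.
text = df_demo.to_string(formatters={
    "var1": "{:,.2f}".format,
    "var3": "{:,.2%}".format,
})
print(text)

# The underlying data keeps its full precision.
print(df_demo["var3"].iloc[0])  # -0.005709
```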
With this you no longer get a nice HTML table, only a text representation. If you need to stay with HTML, use the to_html function instead.
# display and HTML are exposed by the public IPython.display module
from IPython.display import display, HTML
output = df.to_html(formatters={
    'var1': '{:,.2f}'.format,
    'var2': '{:,.2f}'.format,
    'var3': '{:,.2%}'.format
})
display(HTML(output))
Update
As of pandas 0.17.1, life got easier and we can get a beautiful HTML table right away:
df.style.format({
    'var1': '{:,.2f}'.format,
    'var2': '{:,.2f}'.format,
    'var3': '{:,.2%}'.format,
})
#3
20
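You could also set the default format for floats. Note that this option is global: it applies to every float column, and the format below only appends a % sign without multiplying by 100: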
pd.options.display.float_format = '{:.2f}%'.format
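A small sketch of what this setting does, assuming a made-up one-row frame df_demo: the option changes how every float is displayed (here var1 gets a % sign too, and var3 is rounded without being scaled), and it stays in effect until it is reset.

```python
import pandas as pd

# Stand-in data: first row of the question's frame (an assumption for illustration).
df_demo = pd.DataFrame({"var1": [1.458315], "var3": [-0.005709]})

# Set the global float format, render, then restore the default.
pd.options.display.float_format = "{:.2f}%".format
with_option = df_demo.to_string()
pd.reset_option("display.float_format")
after_reset = df_demo.to_string()

print(with_option)   # both columns carry a "%" suffix; var3 is not multiplied by 100
print(after_reset)   # default float display again
```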
#4
11
As suggested by @linqu, you should not change your data for presentation. Since pandas 0.17.1, (conditional) formatting has been easier. Quoting the documentation:
"You can apply conditional formatting, the visual styling of a DataFrame depending on the data within, by using the DataFrame.style property. This is a property that returns a pandas.Styler object, which has useful methods for formatting and displaying DataFrames."
For your example, that would be (the usual table will show up in Jupyter):
df.style.format({
    'var1': '{:,.2f}'.format,
    'var2': '{:,.2f}'.format,
    'var3': '{:,.2%}'.format,
})
#5
0
As a similar approach to the accepted answer that might be considered a bit more readable, elegant, and general (YMMV), you can leverage the map method:
# OP example
df['var3'].map(lambda n: '{:,.2%}'.format(n))
# also works on a series
series_example.map(lambda n: '{:,.2%}'.format(n))
Performance-wise, this is pretty close to (marginally slower than) the OP solution.
As an aside, if you do choose to go the pd.options.display.float_format route, consider using a context manager to handle state, per this parallel numpy example.
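One way to do that in pandas itself is pd.option_context, sketched below on a made-up one-column frame df_demo. The option is only active inside the with block and is restored automatically afterwards:

```python
import pandas as pd

# Stand-in data: two var3 values from the question (an assumption for illustration).
df_demo = pd.DataFrame({"var3": [-0.005709, -0.005122]})

# The float format applies only while the context manager is active.
with pd.option_context("display.float_format", "{:.2%}".format):
    inside = df_demo.to_string()

# Outside the block the previous display setting is back in effect.
outside = df_demo.to_string()

print(inside)
print(outside)
```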
#6
0
Here is another way of doing it, in case you need to apply the formatting over a larger range of columns, using applymap:
df[['var1','var2']] = df[['var1','var2']].applymap("{0:.2f}".format)
# df['var3'] is a Series, which has no applymap method, so use map here
df['var3'] = df['var3'].map(lambda x: "{0:.2f}%".format(x*100))
applymap is useful if you need to apply a function over multiple columns; for this specific example it is essentially shorthand for the following:
df[['var1','var2']].apply(lambda col: col.map('{:.2f}'.format))
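The relationship between the two forms can be sketched as follows, assuming a made-up frame df_demo with the first two rows of the question's data. The sketch sticks to apply and Series.map, which should behave the same across pandas versions:

```python
import pandas as pd

# Stand-in data: first two rows of the question's frame (an assumption for illustration).
df_demo = pd.DataFrame({"var1": [1.458315, 1.576704],
                        "var2": [1.500092, 1.608445],
                        "var3": [-0.005709, -0.005122]})

two_dp = "{0:.2f}".format

# apply hands each column (a Series) to the lambda,
# and Series.map formats every element of that column.
formatted = df_demo[["var1", "var2"]].apply(lambda col: col.map(two_dp))

# A single column is already a Series, so Series.map is enough.
percent = df_demo["var3"].map(lambda x: "{0:.2f}%".format(x * 100))

print(formatted.values.tolist())  # [['1.46', '1.50'], ['1.58', '1.61']]
print(percent.tolist())           # ['-0.57%', '-0.51%']
```

As with the accepted answer, the results are strings, so assign them back to df only if the numeric values are no longer needed.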
A great explanation of apply, map, and applymap can be found here:
Difference between map, applymap and apply methods in Pandas