Python Study Notes: DataFrame

Date: 2022-11-17 00:11:42
# -*- coding: utf-8 -*-
import numpy as np
import pandas

# Create a DataFrame
dates = pandas.date_range('20170101', periods=7)
# 7x3 matrix of standard-normal random values, indexed by date
ss = pandas.DataFrame(np.random.randn(7, 3), index=dates, columns=['a', 'b', 'c'])
print('DataFrame ss:\n%s' % ss)

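# Inspecting the data
print('DataFrame structure')
print(ss.index)     # row index
print(ss.columns)   # column index
print(ss.shape)     # dimensions as (rows, columns)
print(ss.size)      # total number of elements
print(ss.describe)  # no parentheses, so this prints the bound method itself; call ss.describe() for summary statistics
print(ss.values)    # all values as a NumPy array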

print('DataFrame column selection')
print(ss.a)      # select one column, method 1: attribute access
print(ss['a'])   # select one column, method 2: bracket indexing


print('DataFrame row selection by slicing')
print(ss.head(1))   # first row, method 1
print(ss[:1])       # first row, method 2
print(ss.tail(2))   # last two rows, method 1
print(ss[-2:])      # last two rows, method 2

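print('DataFrame selection by condition')
print(ss['2017-01-01':'2017-01-02'])   # select rows by date range (both endpoints included)
print(ss[(ss.a < 1) & (ss.b > 1)])     # select rows by value with a boolean mask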
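
The listing above prints `ss.describe` without parentheses, which shows the bound method rather than any statistics. The following is a minimal sketch of the difference, assuming a small frame `df` with made-up fixed values (not the random data used in these notes) so that the result is easy to check by hand.

```python
import pandas as pd

# Hypothetical fixed values, chosen only so the statistics are easy to verify.
dates = pd.date_range('20170101', periods=4)
df = pd.DataFrame({'a': [1.0, 2.0, 3.0, 4.0],
                   'b': [10.0, 20.0, 30.0, 40.0]}, index=dates)

method = df.describe    # attribute access only: a bound method, nothing is computed
stats = df.describe()   # the call returns a DataFrame of summary statistics

print(callable(method))          # the attribute is just a callable
print(list(stats.index))         # count, mean, std, min, quartiles, max
print(stats.loc['mean', 'a'])    # mean of column a
```

Because `stats` is itself a DataFrame, individual statistics can be read with `.loc` as in the last line.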

Output of the script:

DataFrame ss:
                   a         b         c
2017-01-01 -0.713403 -0.199190 -0.808631
2017-01-02 -2.260560  1.959113  1.113825
2017-01-03 -0.490959 -0.687768  2.312052
2017-01-04 -1.537248  0.810706  1.476309
2017-01-05 -1.359221  0.872240  2.292086
2017-01-06  1.024769 -0.428577 -0.558432
2017-01-07 -0.097108 -0.593286 -0.405638
DataFrame structure
DatetimeIndex(['2017-01-01', '2017-01-02', '2017-01-03', '2017-01-04',
               '2017-01-05', '2017-01-06', '2017-01-07'],
              dtype='datetime64[ns]', freq='D')
Index([u'a', u'b', u'c'], dtype='object')
(7, 3)
21
<bound method DataFrame.describe of                    a         b         c
2017-01-01 -0.713403 -0.199190 -0.808631
2017-01-02 -2.260560  1.959113  1.113825
2017-01-03 -0.490959 -0.687768  2.312052
2017-01-04 -1.537248  0.810706  1.476309
2017-01-05 -1.359221  0.872240  2.292086
2017-01-06  1.024769 -0.428577 -0.558432
2017-01-07 -0.097108 -0.593286 -0.405638>
[[-0.71340275 -0.19919033 -0.8086307 ]
 [-2.26056037  1.95911348  1.1138253 ]
 [-0.49095857 -0.68776847  2.31205246]
 [-1.53724807  0.81070592  1.4763093 ]
 [-1.35922072  0.87224042  2.29208569]
 [ 1.02476913 -0.42857728 -0.55843242]
 [-0.09710843 -0.59328586 -0.4056384 ]]
DataFrame column selection
2017-01-01   -0.713403
2017-01-02   -2.260560
2017-01-03   -0.490959
2017-01-04   -1.537248
2017-01-05   -1.359221
2017-01-06    1.024769
2017-01-07   -0.097108
Freq: D, Name: a, dtype: float64
2017-01-01   -0.713403
2017-01-02   -2.260560
2017-01-03   -0.490959
2017-01-04   -1.537248
2017-01-05   -1.359221
2017-01-06    1.024769
2017-01-07   -0.097108
Freq: D, Name: a, dtype: float64
DataFrame row selection by slicing
                   a        b         c
2017-01-01 -0.713403 -0.19919 -0.808631
                   a        b         c
2017-01-01 -0.713403 -0.19919 -0.808631
                   a         b         c
2017-01-06  1.024769 -0.428577 -0.558432
2017-01-07 -0.097108 -0.593286 -0.405638
                   a         b         c
2017-01-06  1.024769 -0.428577 -0.558432
2017-01-07 -0.097108 -0.593286 -0.405638
DataFrame selection by condition
                   a         b         c
2017-01-01 -0.713403 -0.199190 -0.808631
2017-01-02 -2.260560  1.959113  1.113825
                  a         b         c
2017-01-02 -2.26056  1.959113  1.113825
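
The row selections above all go through plain `[]` indexing. As a hedged alternative, the same selections can be written with the explicit `.iloc` (by position) and `.loc` (by label or boolean mask) indexers. This is a sketch that assumes a small frame `df` with made-up fixed values in place of the random data, so each result can be checked by hand.

```python
import pandas as pd

# Hypothetical fixed values standing in for the random data in these notes.
dates = pd.date_range('20170101', periods=4)
df = pd.DataFrame({'a': [0.5, -2.0, 1.5, -1.0],
                   'b': [0.2, 1.9, 3.0, 0.8]}, index=dates)

first_row = df.iloc[:1]                       # same rows as df.head(1) and df[:1]
last_two = df.iloc[-2:]                       # same rows as df.tail(2) and df[-2:]
by_date = df.loc['2017-01-01':'2017-01-02']   # label slice, both endpoints included
by_value = df.loc[(df.a < 1) & (df.b > 1)]    # boolean mask built from two conditions

print(first_row)
print(last_two)
print(by_date)
print(by_value)
```

Writing `.iloc` or `.loc` makes it explicit whether a slice is meant by position or by label, which plain `[]` leaves to pandas to infer.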