Python pandas DataFrame Operations with Example Code

1. Create a DataFrame from a dictionary

    >>> import pandas as pd
    >>> dict1 = {'col1':[1,2,5,7],'col2':['a','b','c','d']}
    >>> df = pd.DataFrame(dict1)
    >>> df
       col1 col2
    0     1    a
    1     2    b
    2     5    c
    3     7    d
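
As a small supplementary sketch (not from the original post): the dictionary keys become the column names and each list becomes one column, so the lists are assumed to have the same length. The names `good` and `bad` are illustrative only.

```python
import pandas as pd

# Keys become column names; each list becomes one column.
good = {'col1': [1, 2, 5, 7], 'col2': ['a', 'b', 'c', 'd']}
df = pd.DataFrame(good)
print(df.shape)            # (number of rows, number of columns)
print(list(df.columns))

# Lists of different lengths cannot be aligned into columns,
# so the constructor raises a ValueError.
bad = {'col1': [1, 2, 5], 'col2': ['a', 'b']}
try:
    pd.DataFrame(bad)
    raised = False
except ValueError:
    raised = True
print(raised)
```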

2. Create a DataFrame from lists (first put the lists into a dictionary, then convert the dictionary to a DataFrame)

    >>> lista = [1,2,5,7]
    >>> listb = ['a','b','c','d']
    >>> df = pd.DataFrame({'col1':lista,'col2':listb})
    >>> df
       col1 col2
    0     1    a
    1     2    b
    2     5    c
    3     7    d
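
Going through a dictionary is not the only route. As a hedged alternative sketch, the two lists can be paired into row tuples with the built-in `zip` and passed together with explicit column names; the variable name `rows` is illustrative.

```python
import pandas as pd

lista = [1, 2, 5, 7]
listb = ['a', 'b', 'c', 'd']

# zip pairs the i-th elements into row tuples: (1, 'a'), (2, 'b'), ...
rows = list(zip(lista, listb))

# Each tuple is one row; columns supplies the column names.
df = pd.DataFrame(rows, columns=['col1', 'col2'])
print(df)
```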

3. Create a DataFrame from lists, specifying data and columns

    >>> a = ['001','zhangsan','m']
    >>> b = ['002','lisi','f']
    >>> c = ['003','wangwu','m']
    >>> df = pd.DataFrame(data=[a,b,c],columns=['id','name','sex'])  # each list is one row
    >>> df
        id      name sex
    0  001  zhangsan   m
    1  002      lisi   f
    2  003    wangwu   m

4. Rename the columns, for example from ['id','name','sex'] to ['Id','Name','Sex']

    >>> df.columns = ['Id','Name','Sex']
    >>> df
        Id      Name Sex
    0  001  zhangsan   m
    1  002      lisi   f
    2  003    wangwu   m
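
Assigning to `df.columns` replaces every column name at once. When only some columns need new names, `DataFrame.rename` with a mapping is a possible alternative. The sketch below is an illustration; the target name `full_name` is an assumption, not something from the original post.

```python
import pandas as pd

df = pd.DataFrame(data=[['001', 'zhangsan', 'm'], ['002', 'lisi', 'f']],
                  columns=['id', 'name', 'sex'])

# rename returns a new DataFrame; columns missing from the mapping keep their names.
renamed = df.rename(columns={'name': 'full_name'})
print(list(renamed.columns))
print(list(df.columns))    # the original DataFrame is not modified
```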

5. Reorder DataFrame columns and make the numbering start from 1

http://www.zzvips.com/article/177058.html
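
The linked article is not reproduced here. What follows is a minimal sketch of one plausible reading of this heading, assuming that "numbering" refers to the automatically generated 0-based integer labels; the names `reordered` and `renumbered` are illustrative.

```python
import pandas as pd

df = pd.DataFrame(data=[['001', 'zhangsan', 'm'],
                        ['002', 'lisi', 'f'],
                        ['003', 'wangwu', 'm']],
                  columns=['id', 'name', 'sex'])

# Selecting with a list of column names returns the columns in that order.
reordered = df[['name', 'id', 'sex']]

# Replace the default 0-based labels with 1-based ones (on a copy).
renumbered = reordered.copy()
renumbered.index = range(1, len(renumbered) + 1)
print(renumbered)
```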

6. Generate a DataFrame of random int data with 10 rows and 4 columns

    >>> import pandas
    >>> import numpy
    >>> # randint(0, 100) draws integers from 0 (inclusive) to 100 (exclusive);
    >>> # size=(10, 4) gives 10 rows and 4 columns; columns sets the column names
    >>> df = pandas.DataFrame(numpy.random.randint(0,100,size=(10, 4)), columns=list('abcd'))
    >>> df
        a   b   c   d
    0  67  28  37  66
    1  21  27  43  37
    2  73  54  98  85
    3  40  78   4  93
    4  99  60  63  16
    5  48  46  24  61
    6  59  52  62  28
    7  20  74  36  64
    8  14  13  46  60
    9  18  44  70  36
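
The numbers differ on every run. If reproducible data is wanted, one option is a seeded NumPy `Generator`. This is a supplementary sketch rather than part of the original post, and the seed value 42 is arbitrary.

```python
import numpy as np
import pandas as pd

# A seeded Generator produces the same sequence on every run.
rng = np.random.default_rng(42)

# integers(low, high) draws from [low, high), i.e. high is excluded.
data = rng.integers(0, 100, size=(10, 4))
df = pd.DataFrame(data, columns=list('abcd'))
print(df.shape)
```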

7. Use a time series as the index

    >>> df # the index was originally the auto-generated 0~9
        a   b   c   d
    0  31  25  45  67
    1  62  12  61  88
    2  79  36  20  97
    3  26  57  50  44
    4  24  12  50   1
    5   4  61  99  62
    6  40  47  52  27
    7  83  66  71   4
    8  58  59  25  62
    9  38  81  60   8
    >>> import pandas
    >>> dates = pandas.date_range('20180121',periods=10)
    >>> dates # 10 days, starting from 2018-01-21
    DatetimeIndex(['2018-01-21', '2018-01-22', '2018-01-23', '2018-01-24',
                   '2018-01-25', '2018-01-26', '2018-01-27', '2018-01-28',
                   '2018-01-29', '2018-01-30'],
                  dtype='datetime64[ns]', freq='D')
    >>> df.index = dates # assign dates to the index
    >>> df
                 a   b   c   d
    2018-01-21  31  25  45  67
    2018-01-22  62  12  61  88
    2018-01-23  79  36  20  97
    2018-01-24  26  57  50  44
    2018-01-25  24  12  50   1
    2018-01-26   4  61  99  62
    2018-01-27  40  47  52  27
    2018-01-28  83  66  71   4
    2018-01-29  58  59  25  62
    2018-01-30  38  81  60   8
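
One practical benefit of a date index is that rows can be selected by date. The sketch below is an added illustration; it uses deterministic values from `numpy.arange` instead of random data so that the result is predictable.

```python
import numpy as np
import pandas as pd

dates = pd.date_range('20180121', periods=10)
df = pd.DataFrame(np.arange(40).reshape(10, 4), index=dates, columns=list('abcd'))

# A Timestamp label selects a single row, returned as a Series.
one_day = df.loc[pd.Timestamp('2018-01-23')]

# Label slices on a DatetimeIndex include both endpoints.
three_days = df.loc['2018-01-23':'2018-01-25']
print(three_days)
```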

8. SQL-like operations on a DataFrame

See "Comparison with SQL" in the official pandas documentation:

https://pandas.pydata.org/pandas-docs/stable/comparison_with_sql.html
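
To give a rough idea of what such operations look like, here is a minimal sketch of three common SQL patterns expressed with pandas. The example data and the variable names `males`, `counts`, and `first_two` are illustrative assumptions; the SQL in the comments is only an approximate equivalent.

```python
import pandas as pd

df = pd.DataFrame(data=[['001', 'zhangsan', 'm'],
                        ['002', 'lisi', 'f'],
                        ['003', 'wangwu', 'm']],
                  columns=['id', 'name', 'sex'])

# SELECT id, name FROM df WHERE sex = 'm'
males = df.loc[df['sex'] == 'm', ['id', 'name']]

# SELECT sex, COUNT(*) FROM df GROUP BY sex
counts = df.groupby('sex').size()

# SELECT * FROM df ORDER BY name LIMIT 2
first_two = df.sort_values('name').head(2)

print(males)
print(counts)
print(first_two)
```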

That is all for this article. Hopefully it is helpful for your studies.

Original article: https://www.cnblogs.com/huahuayu/p/8227494.html
