The pandas.DataFrame data structure explained

Date: 2022-11-15 21:27:56

pandas.DataFrame

class pandas.DataFrame(data=None, index=None, columns=None, dtype=None, copy=False)

Two-dimensional, size-mutable, potentially heterogeneous tabular data structure with labeled axes (rows and columns). Arithmetic operations align on both row and column labels. Can be thought of as a dict-like container for Series objects. The primary pandas data structure.

That is, a DataFrame is a two-dimensional, size-mutable, heterogeneous tabular data structure; it represents one table of data.
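
The "dict-like container for Series" description and the label alignment of arithmetic can be illustrated with a small sketch. The Series, labels, and values below are made up for illustration and do not come from train.csv.

```python
import pandas as pd

# Hypothetical data: two Series sharing the same row labels.
age = pd.Series([22.0, 38.0, 26.0], index=["a", "b", "c"])
fare = pd.Series([7.25, 71.2833, 7.925], index=["a", "b", "c"])

# A DataFrame can be built from a dict of Series; each key becomes a column.
df = pd.DataFrame({"Age": age, "Fare": fare})

# Selecting a column by key returns a Series, much like a dict lookup.
age_column = df["Age"]

# Heterogeneous and size-mutable: a text column can be added next to numeric ones.
df["Name"] = ["Braund", "Cumings", "Heikkinen"]

# Arithmetic aligns on row and column labels.
# Row "a" is missing from `other`, so the result for "a" is NaN.
other = pd.DataFrame({"Age": [1.0, 1.0]}, index=["b", "c"])
total = df[["Age"]] + other

print(type(age_column))
print(df.shape)
print(total)
```

Because alignment works on labels rather than positions, the two operands do not need the same length or row order.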

For example, take the following CSV data, train.csv:

PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
1,0,3,"Braund, Mr. Owen Harris",male,22,1,0,A/5 21171,7.25,,S
2,1,1,"Cumings, Mrs. John Bradley (Florence Briggs Thayer)",female,38,1,0,PC 17599,71.2833,C85,C
3,1,3,"Heikkinen, Miss. Laina",female,26,0,0,STON/O2. 3101282,7.925,,S
4,1,1,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,35,1,0,113803,53.1,C123,S
5,0,3,"Allen, Mr. William Henry",male,35,0,0,373450,8.05,,S
6,0,3,"Moran, Mr. James",male,,0,0,330877,8.4583,,Q
7,0,1,"McCarthy, Mr. Timothy J",male,54,0,0,17463,51.8625,E46,S
...

Read the CSV file above:

import pandas as pd
train_df = pd.read_csv('train.csv', header=0)  # header=0: the first line holds the column names

The resulting DataFrame object has roughly the following structure:

 

Age:<Series, len() = 891>
Cabin:<Series, len() = 891>
Embarked:<Series, len() = 891>
Fare:<Series, len() = 891>
Name:<Series, len() = 891>
Parch:<Series, len() = 891>
PassengerId:<Series, len() = 891>
Pclass:<Series, len() = 891>
Sex:<Series, len() = 891>
SibSp:<Series, len() = 891>
Survived:<Series, len() = 891>
Ticket:<Series, len() = 891>
ndim:2
plot:<pandas.tools.plotting.FramePlotMethods object at 0x7f260277b890>
shape:(891, 12)
size:10692
[0]:'PassengerId'
[1]:'Survived'
[2]:'Pclass'
[3]:'Name'
[4]:'Sex'
[5]:'Age'
[6]:'SibSp'
[7]:'Parch'
[8]:'Ticket'
[9]:'Fare'
[10]:'Cabin'
[11]:'Embarked'
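
The attributes in this listing can be checked directly. The sketch below is self-contained: instead of opening train.csv it parses the first three data rows quoted earlier from an in-memory string via io.StringIO, so its shape is (3, 12) rather than (891, 12). The names sample_csv and sample_df are introduced here for illustration only.

```python
import io
import pandas as pd

# First three data rows of the CSV shown earlier, held in memory.
sample_csv = '''PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
1,0,3,"Braund, Mr. Owen Harris",male,22,1,0,A/5 21171,7.25,,S
2,1,1,"Cumings, Mrs. John Bradley (Florence Briggs Thayer)",female,38,1,0,PC 17599,71.2833,C85,C
3,1,3,"Heikkinen, Miss. Laina",female,26,0,0,STON/O2. 3101282,7.925,,S
'''

sample_df = pd.read_csv(io.StringIO(sample_csv), header=0)

print(sample_df.ndim)    # 2: rows and columns
print(sample_df.shape)   # (3, 12): 3 rows, 12 columns
print(sample_df.size)    # 36: rows * columns
print(list(sample_df.columns))

# Each column is a Series whose length equals the number of rows.
age = sample_df["Age"]
print(type(age), len(age))

# Quoted fields keep their embedded commas; empty fields are read as NaN.
print(sample_df.loc[0, "Name"])
print(sample_df["Cabin"].isna().sum())
```

For the full file, size is 891 * 12 = 10692, which matches the value in the listing.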