pandas.DataFrame
class pandas.DataFrame(data=None, index=None, columns=None, dtype=None, copy=False)
Two-dimensional, size-mutable, potentially heterogeneous tabular data structure with labeled axes (rows and columns). Arithmetic operations align on both row and column labels. Can be thought of as a dict-like container for Series objects. The primary pandas data structure.
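To make the "dict-like container for Series" and "arithmetic aligns on labels" points concrete, here is a minimal sketch. The frames `a` and `b` and their labels are hypothetical toy data, not taken from train.csv.

```python
import pandas as pd

# Hypothetical toy frames; the labels are chosen only to show alignment.
a = pd.DataFrame({"x": [1, 2], "y": [10, 20]}, index=["r1", "r2"])
b = pd.DataFrame({"y": [100, 200], "z": [5, 6]}, index=["r2", "r3"])

# Dict-like access: looking up a column name returns a Series.
col = a["x"]
is_series = isinstance(col, pd.Series)

# Arithmetic aligns on both row and column labels.
# Only the cell (r2, y) exists in both frames; every other cell becomes NaN.
total = a + b
print(total)
```

The result spans the union of both label sets, so a cell gets a value only when both frames supply one for that label pair.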
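In short, a DataFrame is a two-dimensional, size-mutable, heterogeneous structure that represents a table of data.
For example, take the following CSV data, train.csv: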
PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
1,0,3,"Braund, Mr. Owen Harris",male,22,1,0,A/5 21171,7.25,,S
2,1,1,"Cumings, Mrs. John Bradley (Florence Briggs Thayer)",female,38,1,0,PC 17599,71.2833,C85,C
3,1,3,"Heikkinen, Miss. Laina",female,26,0,0,STON/O2. 3101282,7.925,,S
4,1,1,"Futrelle, Mrs. Jacques Heath (Lily May Peel)",female,35,1,0,113803,53.1,C123,S
5,0,3,"Allen, Mr. William Henry",male,35,0,0,373450,8.05,,S
6,0,3,"Moran, Mr. James",male,,0,0,330877,8.4583,,Q
7,0,1,"McCarthy, Mr. Timothy J",male,54,0,0,17463,51.8625,E46,S
...
Read the CSV file above:
import pandas as pd
train_df = pd.read_csv('train.csv', header=0)  # header=0: the first line holds the column names
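If train.csv is not at hand, the same call can be tried on an in-memory copy of a few of the rows shown earlier. This is only a sketch: `csv_text` and `sample_df` are names introduced here for illustration, and `io.StringIO` stands in for the file on disk.

```python
import io
import pandas as pd

# Header plus three of the sample rows from train.csv, kept in a string.
csv_text = '''PassengerId,Survived,Pclass,Name,Sex,Age,SibSp,Parch,Ticket,Fare,Cabin,Embarked
1,0,3,"Braund, Mr. Owen Harris",male,22,1,0,A/5 21171,7.25,,S
2,1,1,"Cumings, Mrs. John Bradley (Florence Briggs Thayer)",female,38,1,0,PC 17599,71.2833,C85,C
6,0,3,"Moran, Mr. James",male,,0,0,330877,8.4583,,Q
'''

# read_csv accepts any file-like object; header=0 takes column names from the first line.
sample_df = pd.read_csv(io.StringIO(csv_text), header=0)

# Quoted fields keep their embedded commas; empty fields are read as NaN.
print(sample_df[["Name", "Age", "Cabin"]])
```

Because the third row has no Age value, that column is stored as floating point rather than integer.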
The resulting DataFrame object has roughly the following structure:
Age:<Series, len() = 891>
Cabin:<Series, len() = 891>
Embarked:<Series, len() = 891>
Fare:<Series, len() = 891>
Name:<Series, len() = 891>
Parch:<Series, len() = 891>
PassengerId:<Series, len() = 891>
Pclass:<Series, len() = 891>
Sex:<Series, len() = 891>
SibSp:<Series, len() = 891>
Survived:<Series, len() = 891>
Ticket:<Series, len() = 891>
ndim:2
plot:<pandas.tools.plotting.FramePlotMethods object at 0x7f260277b890>
shape:(891, 12)
size:10692
[0]:'PassengerId'
[1]:'Survived'
[2]:'Pclass'
[3]:'Name'
[4]:'Sex'
[5]:'Age'
[6]:'SibSp'
[7]:'Parch'
[8]:'Ticket'
[9]:'Fare'
[10]:'Cabin'
[11]:'Embarked'
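The same attributes can be read directly in code. The sketch below uses a small hypothetical frame `df`, built from the first three columns of the first three sample rows, so the numbers are smaller than the 891 x 12 values listed above.

```python
import pandas as pd

# Hypothetical three-row frame; the values mirror the first rows of train.csv.
df = pd.DataFrame({
    "PassengerId": [1, 2, 3],
    "Survived": [0, 1, 1],
    "Pclass": [3, 1, 3],
})

ndim = df.ndim               # number of axes, always 2 for a DataFrame
shape = df.shape             # (rows, columns)
size = df.size               # rows * columns
columns = list(df.columns)   # column labels in order

# Every column is a Series with one entry per row.
lengths = {name: len(df[name]) for name in df.columns}
print(ndim, shape, size, columns, lengths)
```

For the full dataset, `size` is the product of the two `shape` entries, which matches 891 * 12 = 10692 in the listing.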