Series and DataFrame in pandas

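　　1. Introduction to Series and how to create one

　　A Series is a one-dimensional, array-like object made up of two parts:

　　values: the data itself (an ndarray)

　　index: the labels associated with the data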
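　　The two parts described above can be inspected directly. The following is a minimal sketch; the data and the variable names (s, values, labels) are illustrative assumptions and do not come from the original article.

```python
import pandas as pd

# Hypothetical example data; labels and numbers are illustrative only.
s = pd.Series([11, 22, 33], index=['a1', 'b1', 'c1'])

values = s.values        # the underlying data as a NumPy array
labels = list(s.index)   # the index labels as a plain Python list

print(type(values).__name__)   # name of the array type
print(labels)
print(s['b1'])                 # look up a value by its label
```

　　Looking up s['b1'] goes through the index, which is what separates a Series from a bare NumPy array.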

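　　There are two ways to create a Series. The examples below assume the following imports: import numpy as np; import pandas as pd; from pandas import Series, DataFrame.

　　Method 1: from a list or a NumPy array: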

　　s1 = Series([11, 22, 33, 44, 55], index=['a1', 'b1', 'c1', 'd1', 'e1'], name='Hello world')

　　print(s1)

　　Output:

　　a1    11
　　b1    22
　　c1    33
　　d1    44
　　e1    55
　　Name: Hello world, dtype: int64

　　a1 = np.array([11,22,33,44,55])

　　s2 = Series(a1,index=['a1','b1','c1','d1','e1'],name='hello series')

　　print(s2)

　　Output:

　　a1    11
　　b1    22
　　c1    33
　　d1    44
　　e1    55
　　Name: hello series, dtype: int32
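　　The two outputs above report different dtypes (int64 for the list, int32 for the NumPy array). The likely reason is that a Series built from an array keeps the array's dtype, and NumPy's default integer type depends on the platform and NumPy version. The sketch below shows one way to request the dtype explicitly; the names arr, s_default and s_fixed are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

arr = np.array([11, 22, 33, 44, 55])

# Without a dtype argument, the Series keeps whatever dtype the array has.
s_default = pd.Series(arr)

# With a dtype argument, the result is the same on every platform.
s_fixed = pd.Series(arr, dtype='int64')

print(s_default.dtype == arr.dtype)
print(s_fixed.dtype)
```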

　　Method 2: from a dict. No index argument is required, because the dict keys become the index labels (the data must be one-dimensional).

　　data_dict = {'hello': 12, 'series': 30}  # avoid shadowing the built-in name dict

　　s3 = Series(data=data_dict)

　　print(s3)

　　Output:

　　hello     12
　　series    30
　　dtype: int64
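　　An index argument can still be combined with a dict. In that case pandas selects and orders the entries by the given labels, and a label with no matching key becomes NaN. This is a small sketch with assumed names (scores, s) and an assumed extra label 'missing' that is not in the original.

```python
import pandas as pd

scores = {'hello': 12, 'series': 30}

# 'missing' is a hypothetical label with no matching key in the dict.
s = pd.Series(scores, index=['series', 'hello', 'missing'])

print(s)
print(s.dtype)   # the NaN entry forces a floating-point dtype
```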

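　　2. Introduction to DataFrame and how to create one

　　A DataFrame is a two-dimensional, size-mutable, potentially heterogeneous tabular data structure with labeled axes (rows and columns).

　　Arithmetic operations align on both row labels and column labels.

　　It can be thought of as a dict-like container of Series objects.

　　It is the primary data structure in pandas.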
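　　Two of the points above, label alignment in arithmetic and the dict-like container of Series, can be illustrated with a short sketch. The frames df1 and df2 and their values are assumptions chosen so that the labels overlap only partially.

```python
import pandas as pd

# Two hypothetical frames whose row and column labels overlap only partially.
df1 = pd.DataFrame({'tag1': [1, 2], 'tag2': [3, 4]}, index=['a', 'b'])
df2 = pd.DataFrame({'tag2': [10, 20], 'tag3': [30, 40]}, index=['b', 'c'])

# Addition aligns on row AND column labels; cells without a partner become NaN.
total = df1 + df2

# Selecting a column by key returns a Series, as with a dict of Series.
col = df1['tag1']

print(total)
print(type(col).__name__)
```

　　Only the cell at row 'b', column 'tag2' exists in both frames, so it is the only one that receives a numeric sum.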

　　There are four ways to create a DataFrame:

　　1. Create a DataFrame from a dict

　　dicts = {"tag1": [90, 22, 66],'tag2': [12, 33, 66]}

　　d1 = DataFrame(data=dicts, index=['a', 'b', 'c'])

　　print(d1)

　　Output:

　　   tag1  tag2
　　a    90    12
　　b    22    33
　　c    66    66

　　2. Create a DataFrame from an ndarray

　　d2 = DataFrame(data=np.random.randint(0,100,size=(3,6)),index=["one","two","three"],columns=["a","b","c","d","e","f"])

　　print(d2)

　　Output:

　　        a   b   c   d   e   f
　　one    62  74  51  29  98  18
　　two    16  16  44   3  64  72
　　three  42  94  46  60  34  59
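　　Because np.random.randint draws new numbers on every call, the table above will differ from run to run. If a repeatable example is wanted, one option is to seed the generator first. The helper make_frame below is a hypothetical name introduced only for this sketch.

```python
import numpy as np
import pandas as pd

def make_frame(seed):
    """Build the 3x6 example frame from a seeded random generator."""
    np.random.seed(seed)   # fix the global NumPy random state
    return pd.DataFrame(
        data=np.random.randint(0, 100, size=(3, 6)),
        index=["one", "two", "three"],
        columns=["a", "b", "c", "d", "e", "f"],
    )

first = make_frame(0)
second = make_frame(0)

print(first.equals(second))   # same seed, same table
```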

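　　3. Implicit construction

　　The most common approach is to pass two or more arrays to the index or columns argument of the DataFrame constructor (below, two arrays of column labels).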

　　d3 = DataFrame(data=np.random.randint(0, 100, size=(2, 4)), index=['x', 'y'], columns=[['a', 'b', 'c', 'd'], ['q1', 'q2', 'q3', 'q4']])

　　print(d3)

　　Output:

　　    a   b   c  d
　　   q1  q2  q3 q4
　　x  47  26  11  8
　　y  40  76  18  9
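　　Passing a list of arrays as columns makes pandas build a two-level MultiIndex behind the scenes. The sketch below rebuilds the same frame with fixed values (np.arange instead of random numbers, an assumption made so the result is predictable) and shows how columns are then selected.

```python
import numpy as np
import pandas as pd

# Fixed values replace the random data so the result is predictable.
data = np.arange(8).reshape(2, 4)
d3 = pd.DataFrame(data, index=['x', 'y'],
                  columns=[['a', 'b', 'c', 'd'], ['q1', 'q2', 'q3', 'q4']])

print(type(d3.columns).__name__)   # the columns are a MultiIndex
print(d3.columns.nlevels)          # number of levels in the column index

single = d3[('a', 'q1')]   # a full (level 0, level 1) tuple selects one column
sub = d3['b']              # a level-0 label selects every column under it

print(single)
print(sub)
```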

　　4. Explicit construction

　　Use pd.MultiIndex.from_arrays, which takes a list of arrays.

　　This creates an index object with two levels:

　　indexObj = pd.MultiIndex.from_arrays([['q1', 'q2', 'q3', 'q1'], ['a', 'b', 'c', 'd']])

　　d4 = DataFrame(data=np.random.randint(0, 100, size=(2, 4)), index=['x', 'y'], columns=indexObj)

　　print(d4)

　　Output:

　　   q1  q2  q3  q1
　　    a   b   c   d
　　x  85  72  43   4
　　y   8  43  55  68
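　　In the frame above, the label q1 appears twice on the first level, so selecting it returns both of its sub-columns. The sketch below uses fixed values (an assumption, in place of the random data) and also builds the same index with pd.MultiIndex.from_tuples for comparison; the names cols, same and q1 are illustrative.

```python
import numpy as np
import pandas as pd

cols = pd.MultiIndex.from_arrays([['q1', 'q2', 'q3', 'q1'], ['a', 'b', 'c', 'd']])
d4 = pd.DataFrame(np.arange(8).reshape(2, 4), index=['x', 'y'], columns=cols)

# Selecting a first-level label returns every column stored under it.
q1 = d4['q1']
print(q1)

# The same index written as (level 0, level 1) pairs.
same = pd.MultiIndex.from_tuples([('q1', 'a'), ('q2', 'b'), ('q3', 'c'), ('q1', 'd')])
print(cols.equals(same))
```

　　from_arrays takes one array per level, while from_tuples takes one tuple per column; which one reads better depends on how the labels are already stored.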
