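3.3.4 DataFrame operations
(1) Displaying a DataFrame
info: shows the structure of the data (columns, non-null counts, dtypes)
head: shows the first 5 rows by default
tail: shows the last 5 rows by default
# Show the result of every expression in a cell, not only the last one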
from IPython.core.interactiveshell import InteractiveShell
InteractiveShell.ast_node_interactivity = "all"
# Import the pandas package
import pandas as pd
# Read the CSV data
BSdata = pd.read_csv("data/test.csv", encoding="utf-8")  # or encoding="GBK", depending on the file
BSdata.info()  # summary of the DataFrame
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 6944 entries, 0 to 6943
Data columns (total 7 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Region/Country/Area 25 non-null float64
1 Unnamed: 1 25 non-null object
2 Year 25 non-null float64
3 Series 25 non-null object
4 Value 25 non-null object
5 Footnotes 2 non-null object
6 Source 25 non-null object
dtypes: float64(2), object(5)
memory usage: 379.9+ KB
BSdata.head()  # show the first 5 rows
| | Region/Country/Area | Unnamed: 1 | Year | Series | Value | Footnotes | Source |
---|---|---|---|---|---|---|---|
0 | 1.0 | Total, all countries or areas | 2010.0 | Population mid-year estimates (millions) | 6,956.82 | NaN | United Nations Population Division, New York, ... |
1 | 1.0 | Total, all countries or areas | 2010.0 | Population mid-year estimates for males (milli... | 3,507.70 | NaN | United Nations Population Division, New York, ... |
2 | 1.0 | Total, all countries or areas | 2010.0 | Population mid-year estimates for females (mil... | 3,449.12 | NaN | United Nations Population Division, New York, ... |
3 | 1.0 | Total, all countries or areas | 2010.0 | Sex ratio (males per 100 females) | 101.7 | NaN | United Nations Population Division, New York, ... |
4 | 1.0 | Total, all countries or areas | 2010.0 | Population aged 0 to 14 years old (percentage) | 27 | NaN | United Nations Population Division, New York, ... |
BSdata.tail()  # show the last 5 rows
| | Region/Country/Area | Unnamed: 1 | Year | Series | Value | Footnotes | Source |
---|---|---|---|---|---|---|---|
6939 | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
6940 | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
6941 | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
6942 | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
6943 | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
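Both head and tail accept a row count. The tail output above is entirely NaN, which suggests the file ends with empty rows. The following is a minimal sketch of how one might inspect and drop such rows; it uses a small hypothetical frame named `demo` instead of the real file, so its values are illustrative only.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for BSdata: three real rows followed by two empty ones
demo = pd.DataFrame({
    "Year": [2010.0, 2015.0, 2019.0, np.nan, np.nan],
    "Series": ["a", "b", "c", None, None],
})

first_two = demo.head(2)   # first 2 rows instead of the default 5
last_two = demo.tail(2)    # last 2 rows, both empty in this frame

# Drop only the rows in which every column is missing
cleaned = demo.dropna(how="all")
print(cleaned.shape)
```

Using `how="all"` keeps rows that are only partly missing, such as rows with an empty Footnotes field.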
(2) Column names (variable names)
columns: returns the column names
BSdata.columns  # view the column names
Index(['Region/Country/Area', 'Unnamed: 1', 'Year', 'Series', 'Value',
'Footnotes', 'Source'],
dtype='object')
(3) Row labels (sample names)
index
BSdata.index  # row labels of the DataFrame
RangeIndex(start=0, stop=6944, step=1)
(4) Dimensions
shape
BSdata.shape  # (number of rows, number of columns)
BSdata.shape[0]  # number of rows
BSdata.shape[1]  # number of columns
(6944, 7)
6944
7
(5) Values (array)
values
BSdata.values[:5]  # first 5 rows as a NumPy array
array([[1.0, 'Total, all countries or areas', 2010.0,
'Population mid-year estimates (millions)', '6,956.82', nan,
'United Nations Population Division, New York, World Population Prospects: The 2019 Revision, last accessed June 2021.'],
[1.0, 'Total, all countries or areas', 2010.0,
'Population mid-year estimates for males (millions)', '3,507.70',
nan,
'United Nations Population Division, New York, World Population Prospects: The 2019 Revision, last accessed June 2021.'],
[1.0, 'Total, all countries or areas', 2010.0,
'Population mid-year estimates for females (millions)',
'3,449.12', nan,
'United Nations Population Division, New York, World Population Prospects: The 2019 Revision, last accessed June 2021.'],
[1.0, 'Total, all countries or areas', 2010.0,
'Sex ratio (males per 100 females)', '101.7', nan,
'United Nations Population Division, New York, World Population Prospects: The 2019 Revision; supplemented by data from the United Nations Statistics Division, New York, Demographic Yearbook 2019 and Secretariat for the Pacific Community (SPC) for small countries or areas, last accessed June 2021.'],
[1.0, 'Total, all countries or areas', 2010.0,
'Population aged 0 to 14 years old (percentage)', '27', nan,
'United Nations Population Division, New York, World Population Prospects: The 2019 Revision; supplemented by data from the United Nations Statistics Division, New York, Demographic Yearbook 2019 and Secretariat for the Pacific Community (SPC) for small countries or areas, last accessed June 2021.']],
dtype=object)
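The attributes in this subsection can be unpacked into ordinary Python objects. A minimal sketch on a small hypothetical frame (`demo` is not part of the original data):

```python
import pandas as pd

# Hypothetical two-row frame used only for illustration
demo = pd.DataFrame({"Year": [2010.0, 2015.0], "Series": ["a", "b"]})

n_rows, n_cols = demo.shape      # unpack the (rows, columns) tuple
col_names = list(demo.columns)   # column labels as a plain list
row_labels = list(demo.index)    # row labels as a plain list
arr = demo.to_numpy()            # same data that demo.values returns

print(n_rows, n_cols, col_names, row_labels, arr.dtype)
```

Because the frame mixes numbers and strings, the resulting array has dtype object, as in the output above.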
3.3.4.2 Selecting variables
(1) [""] or .
BSdata['Year']  # select one column (method 1)
0 2010.0
1 2010.0
2 2010.0
3 2010.0
4 2010.0
...
6939 NaN
6940 NaN
6941 NaN
6942 NaN
6943 NaN
Name: Year, Length: 6944, dtype: float64
BSdata[['Year','Series']]  # select two columns
| | Year | Series |
---|---|---|
0 | 2010.0 | Population mid-year estimates (millions) |
1 | 2010.0 | Population mid-year estimates for males (milli... |
2 | 2010.0 | Population mid-year estimates for females (mil... |
3 | 2010.0 | Sex ratio (males per 100 females) |
4 | 2010.0 | Population aged 0 to 14 years old (percentage) |
... | ... | ... |
6939 | NaN | NaN |
6940 | NaN | NaN |
6941 | NaN | NaN |
6942 | NaN | NaN |
6943 | NaN | NaN |
6944 rows × 2 columns
BSdata.Year  # select one column (method 2)
0 2010.0
1 2010.0
2 2010.0
3 2010.0
4 2010.0
...
6939 NaN
6940 NaN
6941 NaN
6942 NaN
6943 NaN
Name: Year, Length: 6944, dtype: float64
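The two access styles are interchangeable for a column such as Year, but attribute access only works when the label is a valid Python identifier. A short sketch on a hypothetical frame (`demo` is illustrative):

```python
import pandas as pd

# Hypothetical frame reusing two of the column names from BSdata
demo = pd.DataFrame({
    "Region/Country/Area": [1.0, 1.0],
    "Year": [2010.0, 2015.0],
})

year_a = demo["Year"]   # bracket access works for any column label
year_b = demo.Year      # attribute access needs an identifier-like label

# "Region/Country/Area" contains slashes, so demo.Region/Country/Area
# would not parse as attribute access; brackets are required
region = demo["Region/Country/Area"]
print(year_a.equals(year_b))
```

Bracket access is also the safer choice when a column name collides with a DataFrame method or attribute.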
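(2) Indexing by position and by label
iloc: selects by integer position (the "i" stands for integer), counting from zero; slices are half-open, so the end is excluded
loc: selects by label; slices are closed, so both ends are included
BSdata.iloc[:,2]  # all rows, column at position 2 (the third column, Year)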
0 2010.0
1 2010.0
2 2010.0
3 2010.0
4 2010.0
...
6939 NaN
6940 NaN
6941 NaN
6942 NaN
6943 NaN
Name: Year, Length: 6944, dtype: float64
BSdata.iloc[:,2:4]  # all rows, columns at positions 2 and 3 (zero-based, end excluded)
| | Year | Series |
---|---|---|
0 | 2010.0 | Population mid-year estimates (millions) |
1 | 2010.0 | Population mid-year estimates for males (milli... |
2 | 2010.0 | Population mid-year estimates for females (mil... |
3 | 2010.0 | Sex ratio (males per 100 females) |
4 | 2010.0 | Population aged 0 to 14 years old (percentage) |
... | ... | ... |
6939 | NaN | NaN |
6940 | NaN | NaN |
6941 | NaN | NaN |
6942 | NaN | NaN |
6943 | NaN | NaN |
6944 rows × 2 columns
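With the default 0..n-1 index, positions and labels coincide, which hides the difference between iloc and loc. The sketch below uses a hypothetical frame whose row labels deliberately differ from the positions:

```python
import pandas as pd

# Hypothetical frame; the labels 10, 20, 30, 40 are not the positions 0..3
demo = pd.DataFrame(
    {"Year": [2010, 2015, 2019, 2021], "Value": [1.0, 2.0, 3.0, 4.0]},
    index=[10, 20, 30, 40],
)

by_position = demo.iloc[1:3]   # positions 1 and 2; the end is excluded
by_label = demo.loc[10:30]     # labels 10 through 30; the end is included

year_by_position = demo.iloc[:, 0]    # first column by position
year_by_label = demo.loc[:, "Year"]   # same column by label

print(by_position.index.tolist(), by_label.index.tolist())
```

Here `by_position` holds two rows and `by_label` holds three, even though both slices look similar.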
3.3.4.3 Extracting samples
BSdata.iloc[3,:]  # row at position 3 (the fourth row), all columns (zero-based)
Region/Country/Area 1.0
Unnamed: 1 Total, all countries or areas
Year 2010.0
Series Sex ratio (males per 100 females)
Value 101.7
Footnotes NaN
Source United Nations Population Division, New York, ...
Name: 3, dtype: object
BSdata.loc[3]  # row labeled 3; same result as above because the index is the default 0..n-1
Region/Country/Area 1.0
Unnamed: 1 Total, all countries or areas
Year 2010.0
Series Sex ratio (males per 100 females)
Value 101.7
Footnotes NaN
Source United Nations Population Division, New York, ...
Name: 3, dtype: object
BSdata.loc[3:5]  # rows labeled 3 through 5 (closed interval)
| | Region/Country/Area | Unnamed: 1 | Year | Series | Value | Footnotes | Source |
---|---|---|---|---|---|---|---|
3 | 1.0 | Total, all countries or areas | 2010.0 | Sex ratio (males per 100 females) | 101.7 | NaN | United Nations Population Division, New York, ... |
4 | 1.0 | Total, all countries or areas | 2010.0 | Population aged 0 to 14 years old (percentage) | 27 | NaN | United Nations Population Division, New York, ... |
5 | 1.0 | Total, all countries or areas | 2010.0 | Population aged 60+ years old (percentage) | 11 | NaN | United Nations Population Division, New York, ... |
3.3.4.4 Selecting observations and variables
BSdata.loc[:3,['Year','Series']]  # rows labeled 0 through 3, columns Year and Series (closed interval)
| | Year | Series |
---|---|---|
0 | 2010.0 | Population mid-year estimates (millions) |
1 | 2010.0 | Population mid-year estimates for males (milli... |
2 | 2010.0 | Population mid-year estimates for females (mil... |
3 | 2010.0 | Sex ratio (males per 100 females) |
BSdata.iloc[:3,:5]  # rows at positions 0 to 2, columns at positions 0 to 4 (end excluded)
| | Region/Country/Area | Unnamed: 1 | Year | Series | Value |
---|---|---|---|---|---|
0 | 1.0 | Total, all countries or areas | 2010.0 | Population mid-year estimates (millions) | 6,956.82 |
1 | 1.0 | Total, all countries or areas | 2010.0 | Population mid-year estimates for males (milli... | 3,507.70 |
2 | 1.0 | Total, all countries or areas | 2010.0 | Population mid-year estimates for females (mil... | 3,449.12 |
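loc and iloc cannot be mixed in a single call, so selecting rows by position and columns by name takes one extra step. One possible approach, sketched on a hypothetical frame, converts the column labels to positions with `Index.get_indexer`:

```python
import pandas as pd

# Hypothetical frame used only for illustration
demo = pd.DataFrame({
    "Year": [2010.0, 2010.0, 2015.0],
    "Series": ["a", "b", "c"],
    "Value": ["1", "2", "3"],
})

# Convert the column labels to integer positions, then select with iloc
col_pos = demo.columns.get_indexer(["Year", "Value"])
subset = demo.iloc[:2, col_pos]   # first two rows, two named columns
print(subset)
```

An equivalent alternative is `demo.loc[demo.index[:2], ["Year", "Value"]]`, which converts the row positions to labels instead.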
3.3.4.5 Conditional selection
BSdata[BSdata['Year']>2010]  # rows where Year is greater than 2010
| | Region/Country/Area | Unnamed: 1 | Year | Series | Value | Footnotes | Source |
---|---|---|---|---|---|---|---|
7 | 1.0 | Total, all countries or areas | 2015.0 | Population mid-year estimates (millions) | 7,379.80 | NaN | United Nations Population Division, New York, ... |
8 | 1.0 | Total, all countries or areas | 2015.0 | Population mid-year estimates for males (milli... | 3,720.70 | NaN | United Nations Population Division, New York, ... |
9 | 1.0 | Total, all countries or areas | 2015.0 | Population mid-year estimates for females (mil... | 3,659.10 | NaN | United Nations Population Division, New York, ... |
10 | 1.0 | Total, all countries or areas | 2015.0 | Sex ratio (males per 100 females) | 101.7 | NaN | United Nations Population Division, New York, ... |
11 | 1.0 | Total, all countries or areas | 2015.0 | Population aged 0 to 14 years old (percentage) | 26.2 | NaN | United Nations Population Division, New York, ... |
12 | 1.0 | Total, all countries or areas | 2015.0 | Population aged 60+ years old (percentage) | 12.2 | NaN | United Nations Population Division, New York, ... |
13 | 1.0 | Total, all countries or areas | 2015.0 | Population density | 56.7 | NaN | United Nations Population Division, New York, ... |
14 | 1.0 | Total, all countries or areas | 2015.0 | Surface area (thousand km2) | 136,162 | NaN | United Nations Statistics Division, New York, ... |
15 | 1.0 | Total, all countries or areas | 2019.0 | Population mid-year estimates (millions) | 7,713.47 | NaN | United Nations Population Division, New York, ... |
16 | 1.0 | Total, all countries or areas | 2019.0 | Population mid-year estimates for males (milli... | 3,889.03 | NaN | United Nations Population Division, New York, ... |
17 | 1.0 | Total, all countries or areas | 2019.0 | Population mid-year estimates for females (mil... | 3,824.43 | NaN | United Nations Population Division, New York, ... |
18 | 1.0 | Total, all countries or areas | 2019.0 | Sex ratio (males per 100 females) | 101.7 | NaN | United Nations Population Division, New York, ... |
19 | 1.0 | Total, all countries or areas | 2019.0 | Population aged 0 to 14 years old (percentage) | 25.6 | NaN | United Nations Population Division, New York, ... |
20 | 1.0 | Total, all countries or areas | 2019.0 | Population aged 60+ years old (percentage) | 13.2 | NaN | United Nations Population Division, New York, ... |
21 | 1.0 | Total, all countries or areas | 2019.0 | Population density | 59.3 | NaN | United Nations Population Division, New York, ... |
22 | 1.0 | Total, all countries or areas | 2019.0 | Surface area (thousand km2) | 130,094 | NaN | United Nations Statistics Division, New York, ... |
23 | 1.0 | Total, all countries or areas | 2021.0 | Population mid-year estimates (millions) | 7,874.97 | Projected estimate (medium fertility variant). | United Nations Population Division, New York, ... |
24 | 1.0 | Total, all countries or areas | 2021.0 | Population mid-year estimates for males (milli... | 3,970.24 | Projected estimate (medium fertility variant). | United Nations Population Division, New York, ... |
BSdata[(BSdata['Year']>2010) & (BSdata['Year']<2016)]  # rows where Year is between 2010 and 2016 (both excluded)
| | Region/Country/Area | Unnamed: 1 | Year | Series | Value | Footnotes | Source |
---|---|---|---|---|---|---|---|
7 | 1.0 | Total, all countries or areas | 2015.0 | Population mid-year estimates (millions) | 7,379.80 | NaN | United Nations Population Division, New York, ... |
8 | 1.0 | Total, all countries or areas | 2015.0 | Population mid-year estimates for males (milli... | 3,720.70 | NaN | United Nations Population Division, New York, ... |
9 | 1.0 | Total, all countries or areas | 2015.0 | Population mid-year estimates for females (mil... | 3,659.10 | NaN | United Nations Population Division, New York, ... |
10 | 1.0 | Total, all countries or areas | 2015.0 | Sex ratio (males per 100 females) | 101.7 | NaN | United Nations Population Division, New York, ... |
11 | 1.0 | Total, all countries or areas | 2015.0 | Population aged 0 to 14 years old (percentage) | 26.2 | NaN | United Nations Population Division, New York, ... |
12 | 1.0 | Total, all countries or areas | 2015.0 | Population aged 60+ years old (percentage) | 12.2 | NaN | United Nations Population Division, New York, ... |
13 | 1.0 | Total, all countries or areas | 2015.0 | Population density | 56.7 | NaN | United Nations Population Division, New York, ... |
14 | 1.0 | Total, all countries or areas | 2015.0 | Surface area (thousand km2) | 136,162 | NaN | United Nations Statistics Division, New York, ... |
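The same range filter can be written in several ways. The sketch below, on a hypothetical frame, compares a boolean mask with `Series.between` and `DataFrame.query`:

```python
import pandas as pd

# Hypothetical frame with the Year values that appear in BSdata
demo = pd.DataFrame({"Year": [2010.0, 2015.0, 2019.0, 2021.0]})

# Each comparison needs its own parentheses because & binds tighter than > and <
mask = (demo["Year"] > 2010) & (demo["Year"] < 2016)
by_mask = demo[mask]

# Series.between is inclusive on both ends by default
by_between = demo[demo["Year"].between(2011, 2015)]

# DataFrame.query takes the condition as a string
by_query = demo.query("Year > 2010 and Year < 2016")

print(by_mask["Year"].tolist())
```

Rows whose Year is NaN never satisfy a comparison, which is why the empty rows at the end of BSdata do not appear in the filtered output.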
3.3.4.6 DataFrame computations
(1) Creating a new column
BSdata["年/值"]=BSdata['Region/Country/Area']+1  # new column "年/值" ("year/value") = Region/Country/Area + 1
BSdata.head()
| | Region/Country/Area | Unnamed: 1 | Year | Series | Value | Footnotes | Source | 年/值 |
---|---|---|---|---|---|---|---|---|
0 | 1.0 | Total, all countries or areas | 2010.0 | Population mid-year estimates (millions) | 6,956.82 | NaN | United Nations Population Division, New York, ... | 2.0 |
1 | 1.0 | Total, all countries or areas | 2010.0 | Population mid-year estimates for males (milli... | 3,507.70 | NaN | United Nations Population Division, New York, ... | 2.0 |
2 | 1.0 | Total, all countries or areas | 2010.0 | Population mid-year estimates for females (mil... | 3,449.12 | NaN | United Nations Population Division, New York, ... | 2.0 |
3 | 1.0 | Total, all countries or areas | 2010.0 | Sex ratio (males per 100 females) | 101.7 | NaN | United Nations Population Division, New York, ... | 2.0 |
4 | 1.0 | Total, all countries or areas | 2010.0 | Population aged 0 to 14 years old (percentage) | 27 | NaN | United Nations Population Division, New York, ... | 2.0 |
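Arithmetic like the above works on numeric columns only. The Value column is stored as text (dtype object) with thousands separators such as "6,956.82", so it would need converting first. A minimal sketch, assuming the commas are the only formatting to strip; `demo`, `Value_num`, and `Value_double` are hypothetical names:

```python
import pandas as pd

# Hypothetical frame with strings in the same style as the Value column
demo = pd.DataFrame({"Value": ["6,956.82", "3,507.70", "27", None]})

# Remove thousands separators, then convert; unparseable entries become NaN
cleaned = demo["Value"].str.replace(",", "", regex=False)
demo["Value_num"] = pd.to_numeric(cleaned, errors="coerce")

# Element-wise arithmetic now works on the numeric column
demo["Value_double"] = demo["Value_num"] * 2
print(demo)
```

Passing `thousands=","` to `pd.read_csv` is another option that may handle the separators at load time.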
(2) Combining DataFrames
concat()
pd.concat([BSdata.Year,BSdata.Series],axis=0)  # stack along rows (axis=0); the result has twice as many rows
0 2010.0
1 2010.0
2 2010.0
3 2010.0
4 2010.0
...
6939 NaN
6940 NaN
6941 NaN
6942 NaN
6943 NaN
Length: 13888, dtype: object
pd.concat([BSdata.Year,BSdata.Series],axis=1)  # join side by side as columns (axis=1)
| | Year | Series |
---|---|---|
0 | 2010.0 | Population mid-year estimates (millions) |
1 | 2010.0 | Population mid-year estimates for males (milli... |
2 | 2010.0 | Population mid-year estimates for females (mil... |
3 | 2010.0 | Sex ratio (males per 100 females) |
4 | 2010.0 | Population aged 0 to 14 years old (percentage) |
... | ... | ... |
6939 | NaN | NaN |
6940 | NaN | NaN |
6941 | NaN | NaN |
6942 | NaN | NaN |
6943 | NaN | NaN |
6944 rows × 2 columns
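Stacking along axis=0 keeps each input's original row labels, so the result contains duplicates. The sketch below, on two hypothetical Series, shows the effect and how `ignore_index=True` renumbers the rows:

```python
import pandas as pd

# Hypothetical Series standing in for BSdata.Year and BSdata.Series
year = pd.Series([2010.0, 2015.0], name="Year")
series = pd.Series(["a", "b"], name="Series")

stacked = pd.concat([year, series], axis=0)                         # labels 0, 1, 0, 1
renumbered = pd.concat([year, series], axis=0, ignore_index=True)   # labels 0, 1, 2, 3
side_by_side = pd.concat([year, series], axis=1)                    # 2 rows x 2 columns

print(stacked.index.tolist(), renumbered.index.tolist(), side_by_side.shape)
```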
(3) Transposing a DataFrame
T
BSdata.iloc[:3,:5].T  # rows become columns and columns become rows
| | 0 | 1 | 2 |
---|---|---|---|
Region/Country/Area | 1.0 | 1.0 | 1.0 |
Unnamed: 1 | Total, all countries or areas | Total, all countries or areas | Total, all countries or areas |
Year | 2010.0 | 2010.0 | 2010.0 |
Series | Population mid-year estimates (millions) | Population mid-year estimates for males (milli... | Population mid-year estimates for females (mil... |
Value | 6,956.82 | 3,507.70 | 3,449.12 |
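Transposing swaps the row labels and the column labels. A small sketch on a hypothetical frame makes the swap explicit:

```python
import pandas as pd

# Hypothetical frame used only for illustration
demo = pd.DataFrame({"Year": [2010.0, 2015.0], "Series": ["a", "b"]})

flipped = demo.T   # old columns become the row labels

print(flipped.shape)
print(list(flipped.index), list(flipped.columns))
```

Each column of the transposed frame now mixes numbers and strings, so numeric operations on it are generally less convenient than on the original layout.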