
Correlation Analysis of Air Quality and Traffic Flow Indicators

Data preprocessing

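1. Days on which either the air-quality readings or the traffic counts are entirely missing (or both are missing) make up less than 5% of the data. These rows are dropped, on the assumption that removing them does not affect accuracy.

2. The remaining data are joined on date, keeping only the dates present in both tables (2017/3/23 to 2023/6/25), 1789 days in total (some days are not consecutive).

3. Some air-quality indicators are partially missing. Because the table contains no 0 values, an empty cell is assumed to mean that the corresponding pollutant was not detected, so it is set to 0.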
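The three steps above can be sketched in pandas roughly as follows. This is a minimal illustration, not the author's actual preprocessing script: the two input tables, the `date` key, the shortened column lists, and the helper name `preprocess` are all assumptions made for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical column subsets; the real tables have more columns.
TRAFFIC_COLS = ["stream", "large-car"]
AIR_COLS = ["pm25", "so2"]

def preprocess(traffic: pd.DataFrame, air: pd.DataFrame) -> pd.DataFrame:
    """Drop fully missing days, inner-join on date, zero-fill pollutant gaps."""
    # Step 1: drop days where every traffic value (or every pollutant value) is missing.
    traffic = traffic.dropna(subset=TRAFFIC_COLS, how="all")
    air = air.dropna(subset=AIR_COLS, how="all")
    # Step 2: keep only the dates that appear in both tables.
    merged = traffic.merge(air, on="date", how="inner")
    # Step 3: treat remaining empty pollutant cells as "not detected" and set them to 0.
    merged[AIR_COLS] = merged[AIR_COLS].fillna(0)
    return merged

# Tiny made-up inputs, only to exercise the function.
traffic = pd.DataFrame({
    "date": pd.to_datetime(["2017-03-23", "2017-03-24", "2017-03-25"]),
    "stream": [6245, np.nan, 16541],
    "large-car": [218, np.nan, 1047],
})
air = pd.DataFrame({
    "date": pd.to_datetime(["2017-03-23", "2017-03-24", "2017-03-25", "2017-03-26"]),
    "pm25": [123, 87, 88, 77],
    "so2": [4, 4, np.nan, 6],
})
clean = preprocess(traffic, air)
print(clean)
```

Note that zero-filling is a modelling assumption: it pulls the column mean down and can shift the correlations, so it may be worth comparing against a version that drops the affected rows instead.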

Goal

Obtain the correlation between vehicle counts and air quality, and between large-vehicle counts and air quality.

Implementation

import numpy as np
import pandas as pd
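import matplotlib.pyplot as plt
import seaborn as sns; sns.set(color_codes=True)  # remap shorthand color codes such as "b" and "g" to the seaborn palette
import chardet  # used to detect the text encoding of the CSV file
from sklearn.preprocessing import StandardScaler  # for standardization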

#df = pd.read_csv("python_play.csv")
# Read the CSV file
#df.head()
# Show the first few rows of the DataFrame
#with open('python_play.csv', 'rb') as f:
#    content = f.read()
#    print(content)
with open('last.csv', 'rb') as f:
    content = f.read()
    encoding = chardet.detect(content)['encoding']

print(encoding)

out:
UTF-8-SIG

# Read the CSV file with the detected UTF-8-SIG encoding, skipping the 'date' column
df = pd.read_csv('last.csv', encoding='UTF-8-SIG',usecols=lambda column: column != 'date')

df.head()

out:

	stream	long-car	large-car	middle-car	light-car	little-car	pm25	pm10	o3	no2	so2	co
0	6245	601	218	347	389	4690	123	43	71	23	4	8
1	18504	2401	932	1612	1339	12220	87	45	34	30	4	9
2	16541	2528	1047	1808	1504	9654	88	44	36	21	4	8
3	13164	2194	876	1410	1255	7429	77	57	60	29	6	8
4	8533	1490	559	973	821	4690	104	67	73	28	6	7

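Column meanings: stream = total traffic flow; long-car = long-vehicle flow; large-car = large-vehicle flow; middle-car = medium-vehicle flow; light-car = light-vehicle flow; little-car = mini (small) vehicle flow.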

scaler = StandardScaler()
# Initialize the scaler

df_normalized = pd.DataFrame(scaler.fit_transform(df), columns=df.columns)
# Standardize each column to zero mean and unit variance

df_normalized.head()

out:

	stream	long-car	large-car	middle-car	light-car	little-car	pm25	pm10	o3	no2	so2	co
0	-0.315919	-0.472971	-0.011644	-0.676805	0.263679	-0.254230	1.058263	-0.109127	1.027142	0.474434	0.130986	1.102301
1	1.028897	0.594040	1.689531	0.981990	2.363031	0.885089	0.006061	0.001327	-0.141546	1.312421	0.130986	1.693079
2	0.813555	0.669323	1.963529	1.239005	2.727656	0.496843	0.035289	-0.053900	-0.078374	0.235009	0.130986	1.102301
3	0.443097	0.471333	1.556105	0.717108	2.177405	0.160192	-0.286217	0.664057	0.679694	1.192709	1.132417	1.102301
4	-0.064925	0.054014	0.800822	0.144069	1.218332	-0.254230	0.502934	1.216331	1.090314	1.072996	1.132417	0.511523
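To make explicit what `StandardScaler` does to each column, the sketch below reproduces it by hand: subtract the column mean and divide by the population standard deviation (`ddof=0`). The small `df_demo` frame is an assumption for illustration; it reuses the first five `stream` and `pm25` values shown earlier, so its scaled values will differ from the table above, which was scaled over all 1789 rows.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# First five rows of two columns from the table above (illustrative subset only).
df_demo = pd.DataFrame({
    "stream": [6245, 18504, 16541, 13164, 8533],
    "pm25": [123, 87, 88, 77, 104],
})

# Library version.
scaled = pd.DataFrame(StandardScaler().fit_transform(df_demo), columns=df_demo.columns)

# Manual version: z = (x - mean) / std, with the population std (ddof=0).
manual = (df_demo - df_demo.mean()) / df_demo.std(ddof=0)

print(np.allclose(scaled, manual))
```

pandas defaults to `ddof=1` in `DataFrame.std`, so the explicit `ddof=0` is needed to match scikit-learn.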

df_normalized.corr()

out:

	stream	long-car	large-car	middle-car	light-car	little-car	pm25	pm10	o3	no2	so2	co
stream	1.000000	0.960397	0.334564	0.936301	0.290630	0.984934	-0.291996	-0.259647	-0.249167	-0.195712	-0.076936	-0.182893
long-car	0.960397	1.000000	0.468035	0.893379	0.401189	0.909110	-0.219200	-0.208401	-0.184057	-0.138806	-0.000536	-0.128781
large-car	0.334564	0.468035	1.000000	0.274743	0.953798	0.181474	0.137599	0.086044	0.172481	0.161718	0.342927	0.185937
middle-car	0.936301	0.893379	0.274743	1.000000	0.257476	0.912902	-0.296563	-0.250422	-0.324601	-0.197463	-0.107954	-0.189003
light-car	0.290630	0.401189	0.953798	0.257476	1.000000	0.139703	0.148665	0.091512	0.195345	0.176855	0.386891	0.230381
little-car	0.984934	0.909110	0.181474	0.912902	0.139703	1.000000	-0.331484	-0.287759	-0.283557	-0.234101	-0.141788	-0.225157
pm25	-0.291996	-0.219200	0.137599	-0.296563	0.148665	-0.331484	1.000000	0.722665	0.422200	0.561810	0.473017	0.550983
pm10	-0.259647	-0.208401	0.086044	-0.250422	0.091512	-0.287759	0.722665	1.000000	0.523732	0.736929	0.530307	0.550590
o3	-0.249167	-0.184057	0.172481	-0.324601	0.195345	-0.283557	0.422200	0.523732	1.000000	0.417945	0.424055	0.373651
no2	-0.195712	-0.138806	0.161718	-0.197463	0.176855	-0.234101	0.561810	0.736929	0.417945	1.000000	0.552486	0.585678
so2	-0.076936	-0.000536	0.342927	-0.107954	0.386891	-0.141788	0.473017	0.530307	0.424055	0.552486	1.000000	0.465094
co	-0.182893	-0.128781	0.185937	-0.189003	0.230381	-0.225157	0.550983	0.550590	0.373651	0.585678	0.465094	1.000000
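Since the goal is the relationship between traffic and air quality, only the traffic-by-pollutant block of this matrix is really needed. A possible way to extract and rank it is sketched below. The helper names `cross_correlation` and `rank_pairs` are hypothetical, the demo frame reuses the first five rows shown earlier, and the pollutant names are written without a leading space; the plotting calls later in the post use names such as ' pm25', so the lists may need adjusting to match the real CSV header.

```python
import pandas as pd

TRAFFIC = ["stream", "large-car"]   # assumed traffic columns of interest
POLLUTANTS = ["pm25", "so2"]        # assumed pollutant columns of interest

def cross_correlation(df: pd.DataFrame, rows, cols) -> pd.DataFrame:
    """Return only the rows-by-cols block of the Pearson correlation matrix."""
    return df.corr().loc[rows, cols]

def rank_pairs(cross: pd.DataFrame) -> pd.Series:
    """Flatten the block and sort pairs by absolute correlation, strongest first."""
    pairs = cross.stack()
    return pairs.sort_values(key=lambda v: v.abs(), ascending=False)

# First five rows from the table above; far too few for real conclusions.
demo = pd.DataFrame({
    "stream": [6245, 18504, 16541, 13164, 8533],
    "large-car": [218, 932, 1047, 876, 559],
    "pm25": [123, 87, 88, 77, 104],
    "so2": [4, 4, 4, 6, 6],
})

cross = cross_correlation(demo, TRAFFIC, POLLUTANTS)
ranked = rank_pairs(cross)
print(ranked)
```

In the full matrix above, the strongest traffic-pollutant entries by magnitude are light-car with so2 (0.386891), large-car with so2 (0.342927), and little-car with pm25 (-0.331484), all of which are weak to moderate.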

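Later finding: standardizing the data does not change the correlation coefficients, because the Pearson correlation is unaffected by shifting each variable or rescaling it by a positive factor.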
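This invariance is easy to check numerically. The sketch below uses synthetic data (the numbers are made up and only loosely resemble the real columns) and compares the correlation matrix before and after `StandardScaler`.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Synthetic data: a traffic-like column and a pollutant-like column with a weak negative link.
rng = np.random.default_rng(0)
raw = pd.DataFrame({"stream": rng.normal(12000, 4000, 200)})
raw["pm25"] = 120 - 0.002 * raw["stream"] + rng.normal(0, 15, 200)

standardized = pd.DataFrame(StandardScaler().fit_transform(raw), columns=raw.columns)

# Pearson correlation is invariant under x -> a*x + b with a > 0,
# so both matrices should agree up to floating-point error.
same = np.allclose(raw.corr(), standardized.corr())
print(same)
```

In practice this means `df.corr()` could be called on the raw table directly; standardization mainly helps by putting the plots on comparable axes.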

Pairwise relationship plots

sns.pairplot(df_normalized)

D:\anaconda3\envs\FLpyth38\lib\site-packages\seaborn\axisgrid.py:123: UserWarning: The figure layout has changed to tight
  self._figure.tight_layout(*args, **kwargs)

out:
<seaborn.axisgrid.PairGrid at 0x26837f3c190>

[Figure: pair plot of all standardized columns]

Plots of the more strongly correlated relationships

Traffic flow vs. PM2.5

sns.jointplot(x='stream',y=' pm25',data=df_normalized)

out:
<seaborn.axisgrid.JointGrid at 0x268546bcd00>

<Figure size 400x400 with 0 Axes>

[Figure: joint plot, stream vs. pm25]

Traffic flow vs. PM10

sns.jointplot(x='stream',y=' pm10',data=df_normalized)

out:
<seaborn.axisgrid.JointGrid at 0x26847b7d160>

[Figure: joint plot, stream vs. pm10]

Large-vehicle flow vs. SO2 concentration

sns.jointplot(x='large-car',y=' so2',data=df_normalized)

out:

<seaborn.axisgrid.JointGrid at 0x268458aa040>

[Figure: joint plot, large-car vs. so2]

Large-vehicle flow vs. CO concentration

sns.jointplot(x='large-car',y=' co',data=df_normalized)

out:

<seaborn.axisgrid.JointGrid at 0x268485f2c40>

[Figure: joint plot, large-car vs. co]

Traffic flow vs. small-vehicle (little-car) flow, hexbin

sns.jointplot(x='stream',y='little-car',data=df_normalized,kind='hex')

out:

<seaborn.axisgrid.JointGrid at 0x26848ac0100>

[Figure: hexbin joint plot, stream vs. little-car]

Traffic flow vs. large-vehicle flow, hexbin

sns.jointplot(x='stream',y='large-car',data=df_normalized,kind='hex')

out:

<seaborn.axisgrid.JointGrid at 0x26848ce2250>

[Figure: hexbin joint plot, stream vs. large-car]

Large-vehicle flow vs. SO2 concentration, hexbin

sns.jointplot(x='large-car',y=' so2',data=df_normalized,kind='hex')

out:

<seaborn.axisgrid.JointGrid at 0x268493f6730>

[Figure: hexbin joint plot, large-car vs. so2]

Traffic flow vs. PM10 concentration, hexbin

sns.jointplot(x='stream',y=' pm10',data=df_normalized,kind='hex')

out:

<seaborn.axisgrid.JointGrid at 0x268498d8a30>

[Figure: hexbin joint plot, stream vs. pm10]

Traffic flow vs. PM10 concentration, with regression fit

sns.jointplot(x='stream',y=' pm10',data=df_normalized,kind='reg')

out:

<seaborn.axisgrid.JointGrid at 0x26849b37430>

[Figure: regression joint plot, stream vs. pm10]

Large-vehicle flow vs. traffic flow, with regression fit

sns.jointplot(x='large-car',y='stream',data=df_normalized,kind='reg')

out:
<seaborn.axisgrid.JointGrid at 0x26849f8a2b0>

[Figure: regression joint plot, large-car vs. stream]

Light-vehicle flow vs. traffic flow, with regression fit

sns.jointplot(x='light-car',y='stream',data=df_normalized,kind='reg')

out:

<seaborn.axisgrid.JointGrid at 0x2684ebc6af0>

[Figure: regression joint plot, light-car vs. stream]

Light-vehicle flow vs. SO2 concentration, with regression fit

sns.jointplot(x='light-car',y=' so2',data=df_normalized,kind='reg')

out:

<seaborn.axisgrid.JointGrid at 0x2684df7e040>

[Figure: regression joint plot, light-car vs. so2]

Large-vehicle flow vs. SO2 concentration, with regression fit

sns.jointplot(x='large-car',y=' so2',data=df_normalized,kind='reg')

out:

<seaborn.axisgrid.JointGrid at 0x2684e46f1c0>

[Figure: regression joint plot, large-car vs. so2]
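The `kind='reg'` plots above draw a least-squares line through standardized data. One useful way to read them: when both variables are standardized, the slope of that line equals the Pearson correlation coefficient. The sketch below checks this with NumPy on synthetic data; the data and the helper name `standardize` are assumptions for illustration and are not taken from the real table.

```python
import numpy as np

# Synthetic pair with a moderate positive relationship.
rng = np.random.default_rng(1)
x = rng.normal(size=300)
y = 0.4 * x + rng.normal(size=300)

def standardize(v: np.ndarray) -> np.ndarray:
    """Zero mean, unit population standard deviation (same convention as StandardScaler)."""
    return (v - v.mean()) / v.std()

zx, zy = standardize(x), standardize(y)

# np.polyfit returns coefficients from highest degree down: [slope, intercept].
slope, intercept = np.polyfit(zx, zy, deg=1)
r = np.corrcoef(x, y)[0, 1]

print(slope, r, intercept)
```

So a nearly flat regression line in these plots corresponds directly to a correlation close to 0, and the line's steepness can be compared across plots.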

Source: https://www.cnblogs.com/HYLOVEYOURSELF/p/18213048
