Overview
Data overview
Loading library and data
Segmentation
Pre-processing data for segmentation
Determining the number of clusters
Clustering
Visualization of segments
Analyzing each segment
Forecasting Single time-series
Pre-processing data for forecasting
Forecasting electricity usage of each segment using Fbprophet
Forecasting electricity usage of all customers using Fbprophet
Loading data
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import warnings
warnings.filterwarnings('ignore')
%matplotlib inline
pd.set_option('display.max_columns', 500)
data_ori = pd.read_csv('../input/daily_electricity_usage.csv')
data_ori['date'] = pd.to_datetime(data_ori['date'])
data_ori.head()
 | Meter ID | date | total daily KW |
---|---|---|---|
0 | 1000 | 2009-07-14 | 11.203 |
1 | 1000 | 2009-07-15 | 8.403 |
2 | 1000 | 2009-07-16 | 7.225 |
3 | 1000 | 2009-07-17 | 11.338 |
4 | 1000 | 2009-07-18 | 11.306 |
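The dataset contains 6,445 meter IDs, from ID 1000 to ID 7444, and provides daily electricity usage from 14 July 2009 to 31 December 2010. The customers include businesses as well as households.
1. Data preprocessing
Splitting the data by customer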
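data = pd.DataFrame({'date': pd.date_range('2009-07-14', periods=536, freq='D')})
for i in range(1000, 7445):
    # Take one meter's readings and name the usage column after the meter ID,
    # so that repeated merges never collide on the column name
    S = data_ori[data_ori['Meter ID'] == i][['date', 'total daily KW']]
    S = S.rename(columns={'total daily KW': 'ID' + str(i)})
    data = pd.merge(data, S, how='left', on='date')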
data.head()
 | date | ID1000 | ID1001 | ID1002 | ID1003 | ... | ID7443 | ID7444 |
---|---|---|---|---|---|---|---|---|
0 | 2009-07-14 | 11.203 | 6.744 | 6.355 | 24.183 | ... | 5.112 | 52.940 |
1 | 2009-07-15 | 8.403 | 6.949 | 8.972 | 26.659 | ... | 18.233 | 35.582 |
2 | 2009-07-16 | 7.225 | 7.255 | 8.794 | 32.017 | ... | 6.925 | 29.307 |
3 | 2009-07-17 | 11.338 | 7.190 | 8.306 | 33.032 | ... | 5.370 | 40.986 |
4 | 2009-07-18 | 11.306 | 6.805 | 10.119 | 31.238 | ... | 6.751 | 40.270 |
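The wide table above is built by merging one meter at a time. As a rough alternative sketch, assuming each (Meter ID, date) pair occurs at most once, `DataFrame.pivot` followed by `reindex` produces the same layout in one step. The frame `toy` and its values are made up for illustration and are not part of the original dataset.

```python
import pandas as pd

# Toy long-format data in the same shape as data_ori (values are made up)
toy = pd.DataFrame({
    'Meter ID': [1000, 1000, 1001],
    'date': pd.to_datetime(['2009-07-14', '2009-07-15', '2009-07-15']),
    'total daily KW': [11.2, 8.4, 6.9],
})

# One row per date, one column per meter
wide = toy.pivot(index='date', columns='Meter ID', values='total daily KW')

# Make sure every calendar day and every expected meter ID is present;
# days or meters without readings become NaN
all_days = pd.date_range('2009-07-14', periods=3, freq='D')
wide = wide.reindex(index=all_days, columns=range(1000, 1003))

# Same column naming as in the notebook: 'date', 'ID1000', 'ID1001', ...
wide.columns = ['ID' + str(c) for c in wide.columns]
wide = wide.rename_axis('date').reset_index()
print(wide)
```

On the full dataset the same idea would use 536 days and `range(1000, 7445)`; `pivot` raises an error if a meter has two readings for the same date, which is why the uniqueness assumption matters.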
Imputing or dropping missing data
data.isnull().sum().sum()
163262
data = data.fillna(data.mean())
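The dataset contains 163,262 missing values, because some customers joined during the observation period and some meter IDs between 1000 and 7444 were never observed.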
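The effect of `fillna(data.mean())` can be checked on a small made-up frame: a gap inside a column takes that column's mean, while a column with no observations at all stays NaN, because its mean is undefined. The column names below only mimic the notebook's naming; the values are invented.

```python
import numpy as np
import pandas as pd

toy = pd.DataFrame({
    'ID1000': [10.0, np.nan, 14.0],      # one missing day
    'ID1001': [np.nan, np.nan, np.nan],  # meter never observed
})

# Total number of missing cells, as in data.isnull().sum().sum()
n_missing = int(toy.isnull().sum().sum())

# Column-wise mean imputation
filled = toy.fillna(toy.mean())
print(n_missing)
print(filled)
```

Columns that remain entirely NaN still need separate handling; in this notebook that happens later, when `data_fix.fillna(0)` is applied to the per-meter features.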
2. Segmentation
Pre-processing data for segmentation
data.date = pd.to_datetime(data.date)
data['day'] = data['date'].apply(lambda x: x.weekday())  # Monday=0 ... Sunday=6
x_call = data.columns[1:-1]  # meter columns only (skips 'date' and 'day')
data_fix = pd.DataFrame({'Meter ID': range(1000, 7445), 'total KW': data[x_call].sum().values})
data_fix['average per day'] = data[x_call].mean().values
# Share of each meter's total consumption that falls on each day of the week
day_names = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']
for d, name in enumerate(day_names):
    data_fix['% ' + name] = data[data['day'] == d][x_call].sum().values / data_fix['total KW'] * 100
# Share consumed on weekdays (Mon-Fri) and on weekends (Sat-Sun)
data_fix['% weekday'] = data[data['day'] < 5][x_call].sum().values / data_fix['total KW'] * 100
data_fix['% weekend'] = data[data['day'] >= 5][x_call].sum().values / data_fix['total KW'] * 100
data_fix = data_fix.fillna(0)
data_fix.head()
 | Meter ID | total KW | average per day | % Monday | % Tuesday | % Wednesday | % Thursday | % Friday | % Saturday | % Sunday | % weekday | % weekend |
---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1000 | 5515.675 | 10.290438 | 13.818961 | 14.649395 | 14.792587 | 12.848944 | 13.900039 | 15.455497 | 14.534576 | 70.009926 | 29.990074 |
1 | 1001 | 5090.375 | 9.496968 | 14.126091 | 14.361830 | 14.289969 | 14.611595 | 14.851637 | 13.891904 | 13.866974 | 72.241122 | 27.758878 |
2 | 1002 | 5352.830 | 9.986623 | 15.714585 | 14.486150 | 16.015827 | 15.183408 | 13.964968 | 12.886305 | 11.748757 | 75.364938 | 24.635062 |
3 | 1003 | 16305.581 | 30.420860 | 14.545051 | 14.048454 | 14.216507 | 14.363180 | 14.000151 | 14.630714 | 14.195943 | 71.173342 | 28.826658 |
4 | 1004 | 25326.442 | 47.250825 | 14.630796 | 14.177041 | 14.400305 | 13.674076 | 12.893055 | 14.671563 | 15.553164 | 69.775273 | 30.224727 |
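Eleven variables are built to characterize each customer's consumption behavior:
- total electricity usage over the observation period (total KW);
- average daily electricity usage (average per day);
- percentage of total consumption on Mondays (% Monday);
- percentage of total consumption on Tuesdays (% Tuesday);
- percentage of total consumption on Wednesdays (% Wednesday);
- percentage of total consumption on Thursdays (% Thursday);
- percentage of total consumption on Fridays (% Friday);
- percentage of total consumption on Saturdays (% Saturday);
- percentage of total consumption on Sundays (% Sunday);
- percentage of total consumption on weekdays (% weekday); and
- percentage of total consumption on weekends (% weekend).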
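The day-of-week shares listed above can also be derived directly from a long-format table. The following is a minimal sketch rather than the notebook's own code: the helper `weekday_share` and the toy frame `toy` are hypothetical, and the column names are assumed to mirror `data_ori`.

```python
import pandas as pd

def weekday_share(df):
    """Return, per meter, the percentage of total usage on each weekday (0 = Monday)."""
    df = df.copy()
    df['day'] = df['date'].dt.weekday
    # Sum usage per meter and weekday, with one column per weekday
    per_day = df.pivot_table(index='Meter ID', columns='day',
                             values='total daily KW', aggfunc='sum', fill_value=0)
    # Divide each row by its own total to turn sums into percentages
    return per_day.div(per_day.sum(axis=1), axis=0) * 100

# Hypothetical toy data: one meter with two full weeks of identical readings
toy = pd.DataFrame({
    'Meter ID': 1000,
    'date': pd.date_range('2009-07-14', periods=14, freq='D'),
    'total daily KW': [1.0] * 14,
})
shares = weekday_share(toy)
print(shares.round(2))
```

With identical readings over two full weeks, every weekday receives the same share and each row sums to 100.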
Normalization
from sklearn.preprocessing import StandardScaler
x_calls = data_fix.columns[1:]
scaller = StandardScaler()
matrix = pd.DataFrame(scaller.fit_transform(data_fix[x_calls]),columns=x_calls)
matrix['Meter ID'] = data_fix['Meter ID']
print(matrix.head())
|  | total KW | average per day | % Monday | % Tuesday | % Wednesday | % Thursday | % Friday | % Saturday | % Sunday | % weekday | % weekend | Meter ID |
---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | -0.462901 | -0.462901 | -0.248425 | 0.150228 | 0.333406 | -0.956438 | -0.251452 | 0.500570 | 0.118593 | -0.253682 | 0.324755 | 1000 |
1 | -0.477627 | -0.477627 | -0.026894 | -0.042109 | -0.010012 | 0.279580 | 0.452542 | -0.181690 | -0.111800 | 0.170082 | -0.161274 | 1001 |
2 | -0.468539 | -0.468539 | 1.118883 | 0.041043 | 1.169195 | 0.680550 | -0.203417 | -0.620474 | -0.842809 | 0.763378 | -0.841745 | 1002 |
3 | -0.089300 | -0.089300 | 0.275301 | -0.251709 | -0.060206 | 0.105385 | -0.177389 | 0.140683 | 0.001729 | -0.032718 | 0.071324 | 1003 |
4 | 0.223048 | 0.223048 | 0.337149 | -0.165704 | 0.065376 | -0.377833 | -0.996421 | 0.158507 | 0.470114 | -0.298249 | 0.375870 | 1004 |
Outliers are kept so that customers such as large companies or very small households are not dropped.
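`StandardScaler` replaces each column with its z-score, (x - mean) / std, using the population standard deviation. In the rows printed above, `total KW` and `average per day` have identical standardized values. That is what happens when one column is a positive constant multiple of the other, which would be the case if every meter covers the same number of days (an assumption, not checked against the data here). A minimal sketch with made-up numbers:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical totals for four meters; the average is the total divided by a constant
total = np.array([[5515.0], [5090.0], [16305.0], [25326.0]])
average = total / 536

z_total = StandardScaler().fit_transform(total)
z_average = StandardScaler().fit_transform(average)

# Manual z-score with the population standard deviation (ddof=0)
z_manual = (total - total.mean()) / total.std()

print(np.allclose(z_total, z_average), np.allclose(z_total, z_manual))
```

If the two columns really are proportional, they carry the same information after scaling, so overall consumption is effectively counted twice in the distance computation used for clustering.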
Correlation
corr = matrix[x_calls].corr()
fig, ax = plt.subplots(figsize=(8, 6))
# Draw the matrix once, with the color scale fixed to [-1, 1] so it matches the colorbar
cax=ax.matshow(corr,vmin=-1,vmax=1)
plt.xticks(range(len(corr.columns)), corr.columns)
plt.yticks(range(len(corr.columns)), corr.columns)
plt.xticks(rotation=90)
plt.colorbar(cax)
Number of clusters
def plot_BIC(matrix,x_calls,K):
    from sklearn import mixture
    BIC=[]
    for k in K:
        # Fit a Gaussian mixture with k components, initialized by k-means
        model=mixture.GaussianMixture(n_components=k,init_params='kmeans')
        model.fit(matrix[x_calls])
        BIC.append(model.bic(matrix[x_calls]))
    fig, ax = plt.subplots(figsize=(8, 6))
    plt.plot(K,BIC,'-cx')
    plt.ylabel("BIC score")
    plt.xlabel("k")
    plt.title("BIC scoring for K-means cell's behaviour")
    return(BIC)
K = range(2,31)
BIC = plot_BIC(matrix,x_calls,K)
Based on the Bayesian information criterion (BIC), the customers are divided into 5 segments.
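With scikit-learn's `GaussianMixture.bic`, lower values indicate a better trade-off between fit and model complexity, so a candidate number of components can be read off as the minimum of the curve (or the point where it flattens). The following is a minimal sketch on synthetic data, not the notebook's data; `X`, `bic`, and `best_k` are hypothetical names.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Hypothetical data: three well-separated 2-D blobs of 100 points each
X = np.vstack([rng.normal(loc=c, scale=0.3, size=(100, 2)) for c in (-5.0, 0.0, 5.0)])

K = range(1, 7)
# One BIC value per candidate number of components
bic = [GaussianMixture(n_components=k, random_state=0).fit(X).bic(X) for k in K]
best_k = K[int(np.argmin(bic))]
print(best_k)
```

On real data the curve is often less clear-cut than on such a toy example, so the final choice usually involves some judgment about where the improvement levels off.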
Clustering
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from mpl_toolkits.mplot3d import Axes3D
cluster = KMeans(n_clusters=5,random_state=217)
matrix['cluster'] = cluster.fit_predict(matrix[x_calls])
print(matrix.cluster.value_counts())
1    3747
3    2208
4     385
0      95
2      10
Name: cluster, dtype: int64
d=matrix.cluster.value_counts()
fig, ax = plt.subplots(figsize=(8, 6))
# value_counts() returns a Series indexed by cluster label; plot its values directly
plt.bar(d.index,d.values,align='center',alpha=0.5)
plt.xlabel('Cluster')
plt.ylabel('number of data')
plt.title('Cluster of Data')
from sklearn.metrics.pairwise import euclidean_distances
distance = euclidean_distances(cluster.cluster_centers_, cluster.cluster_centers_)
print(distance)
[[ 0.          9.35063261 30.46869987  9.84791523  9.82435185]
 [ 9.35063261  0.         28.24977455  2.02708658  6.67472145]
 [30.46869987 28.24977455  0.         27.44631089 31.68601779]
 [ 9.84791523  2.02708658 27.44631089  0.          8.67886855]
 [ 9.82435185  6.67472145 31.68601779  8.67886855  0.        ]]
The first segment (Cluster 0) contains 95 customers, the second (Cluster 1) 3747 customers, the third (Cluster 2) 10 customers, the fourth (Cluster 3) 2208 customers, and the fifth (Cluster 4) 385 customers.
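The centroid distance matrix can also be inspected programmatically, for example to find the closest pair of centroids and the centroid that lies farthest from the rest. A minimal sketch using the printed values rounded to two decimals; the variable names are hypothetical.

```python
import numpy as np

# Centroid-to-centroid distances from the printed output, rounded
distance = np.array([
    [0.00, 9.35, 30.47, 9.85, 9.82],
    [9.35, 0.00, 28.25, 2.03, 6.67],
    [30.47, 28.25, 0.00, 27.45, 31.69],
    [9.85, 2.03, 27.45, 0.00, 8.68],
    [9.82, 6.67, 31.69, 8.68, 0.00],
])

masked = distance.copy()
np.fill_diagonal(masked, np.inf)  # ignore each centroid's zero distance to itself
i, j = np.unravel_index(np.argmin(masked), masked.shape)

# Centroid with the largest summed distance to all the others
farthest = int(np.argmax(distance.sum(axis=1)))
print(i, j, farthest)
```

Clusters 1 and 3, the two largest segments, form the nearest pair, while Cluster 2, with only 10 customers, is far from every other centroid.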
Visualization of segments
# Reduce the data to three dimensions using PCA
pca = PCA(n_components=3)
components = pca.fit_transform(matrix[x_calls])  # fit once and reuse the result
matrix['x'] = components[:,0]
matrix['y'] = components[:,1]
matrix['z'] = components[:,2]
# Getting the center of each cluster for plotting
cluster_centers = pca.transform(cluster.cluster_centers_)
cluster_centers = pd.DataFrame(cluster_centers, columns=['x', 'y', 'z'])
cluster_centers['cluster'] = range(0, len(cluster_centers))
print(cluster_centers)
|  | x | y | z | cluster |
---|---|---|---|---|
0 | 3.091673 | 8.622535 | -0.845156 | 0 |
1 | 0.264480 | -0.242741 | -0.019198 | 1 |
2 | -14.255594 | 2.118913 | -9.273445 | 2 |
3 | -1.722686 | 0.087896 | 0.107413 | 3 |
4 | 6.897584 | -0.321715 | 0.021218 | 4 |
# Plotting in 2 dimensions
fig, ax = plt.subplots(figsize=(8, 6))
scatter=ax.scatter(matrix['x'],matrix['y'],c=matrix['cluster'],s=21,cmap=plt.cm.Set1_r)
ax.scatter(cluster_centers['x'],cluster_centers['y'],s=70,c='blue',marker='+')
ax.set_xlabel('x')
ax.set_ylabel('y')
plt.colorbar(scatter)
plt.title('Data Segmentation')
# Plotting in 3 dimensions
# Create the figure without a default 2-D axes so the 3-D axes does not overlap one
fig = plt.figure(figsize=(8, 6))
ax=fig.add_subplot(111, projection='3d')
scatter=ax.scatter(matrix['x'],matrix['y'],matrix['z'],c=matrix['cluster'],s=21,cmap=plt.cm.Set1_r)
ax.scatter(cluster_centers['x'],cluster_centers['y'],cluster_centers['z'],s=70,c='red',marker='+')
ax.set_xlabel('x')
ax.set_ylabel('y')
ax.set_zlabel('z')
plt.colorbar(scatter)
plt.title('Data Segmentation')
The plots above show that the segments are well separated from each other in the PCA projection, which suggests that the number of clusters chosen with BIC suits this data set.
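How faithfully a three-component projection reflects the original 11 features can be gauged from the PCA's `explained_variance_ratio_` attribute. A minimal sketch on random stand-in data (the array `X` is hypothetical and only mimics the shape of `matrix[x_calls]`):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 11))  # hypothetical stand-in for the 11 standardized features

pca = PCA(n_components=3).fit(X)
ratio = pca.explained_variance_ratio_  # fraction of variance captured by each component
covered = ratio.sum()                  # fraction captured by the 3-D projection as a whole
print(ratio, covered)
```

A low total would mean that separation seen in the plot should be interpreted with some caution, since most of the variance lies outside the plotted dimensions.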
From: https://blog.csdn.net/workflower/article/details/143248077