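The training set provides the following data:
The test set provides only a subset of those columns:
The task is to predict the following columns: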
['Tdewpoint', 'Visibility', 'Windspeed', 'RH_out', 'Press_mm_hg', 'RH_9', 'T_out', 'RH_4']
Prediction with a regression tree:
    import pandas as pd
    from sklearn.tree import DecisionTreeRegressor
    from sklearn.multioutput import MultiOutputRegressor

    # Read the training and test sets
    train_data = pd.read_csv('train_dataset.csv')
    test_data = pd.read_csv('test_dataset.csv')

    # Skip the first two columns; they are not used as features or targets
    li = train_data.columns.to_list()[2:]
    goal = ['Tdewpoint', 'Visibility', 'Windspeed', 'RH_out',
            'Press_mm_hg', 'RH_9', 'T_out', 'RH_4']
    # Keep the original column order (a set difference would reorder the columns between runs)
    feature = [col for col in li if col not in goal]
    print(li)
    print(feature)

    # Separate the feature columns and the target columns in the training set
    X_train = train_data[feature]
    y_train = train_data[goal]

    # Create the decision tree regression model and fit it on the training set
    model = MultiOutputRegressor(DecisionTreeRegressor())
    model.fit(X_train, y_train)

    # Predict the target columns for the test set
    X_test = test_data[feature]
    y_pred = model.predict(X_test)

    # Save the predictions to a CSV file
    submission = pd.DataFrame(y_pred, columns=goal)
    submission.to_csv('test_result.csv', index=False)
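The script above fits on the full training set and writes predictions without measuring how accurate they are. The following is a minimal sketch of one way to estimate the error before submitting: hold out part of the training rows and score each target column separately. It runs on synthetic stand-in data, so the names `demo`, `demo_goal`, `demo_feature`, and `val_model`, the 25% hold-out size, and the choice of mean squared error are all assumptions for illustration, not part of the original task.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.multioutput import MultiOutputRegressor
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for train_dataset.csv (hypothetical columns and values).
rng = np.random.default_rng(0)
n_rows = 200
demo = pd.DataFrame({
    "f1": rng.normal(size=n_rows),
    "f2": rng.normal(size=n_rows),
    "f3": rng.normal(size=n_rows),
})
demo["y1"] = 2.0 * demo["f1"] + rng.normal(scale=0.1, size=n_rows)
demo["y2"] = demo["f2"] - demo["f3"] + rng.normal(scale=0.1, size=n_rows)

demo_goal = ["y1", "y2"]
demo_feature = [c for c in demo.columns if c not in demo_goal]

# Hold out 25% of the rows for validation (assumed split size).
X_fit, X_val, y_fit, y_val = train_test_split(
    demo[demo_feature], demo[demo_goal], test_size=0.25, random_state=0
)

# Same model type as above: one decision tree per target column.
val_model = MultiOutputRegressor(DecisionTreeRegressor(random_state=0))
val_model.fit(X_fit, y_fit)
val_pred = val_model.predict(X_val)

# One MSE value per target, in the same order as demo_goal.
per_target_mse = mean_squared_error(y_val, val_pred, multioutput="raw_values")
for name, mse in zip(demo_goal, per_target_mse):
    print(f"{name}: validation MSE = {mse:.4f}")
```

To apply the same check to the real data, replace `demo`, `demo_feature`, and `demo_goal` with `train_data`, `feature`, and `goal`. A target with a much larger error than the others may need a depth limit on the tree or a different model.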
From: https://www.cnblogs.com/datielaoyu/p/17436637.html