Algorithm Practice 1.2: Model Building with Ensemble Models
Task description
Experiment procedure
1. Import the required packages
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
import lightgbm as lgb
from xgboost import XGBClassifier
import matplotlib.pyplot as plt
from sklearn import metrics
2. Load the data and split it into training and test sets, using the same 70/30 split as last time.
data = pd.read_csv('./data_all.csv', engine='python')
y = data['status']
X = data.drop(['status'], axis=1)
print('The shape of X: ', X.shape)
print('proportion of label 1: ', len(y[y == 1])/len(y))
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=2018)
print('For train, proportion of label 1: ', len(y_train[y_train == 1])/len(y_train))
print('For test, proportion of label 1: ', len(y_test[y_test == 1])/len(y_test))
3. Build and evaluate four models: random forest, GBDT, XGBoost, and LightGBM.
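The post does not show the code for this step, so here is a minimal sketch of how the four models could be built and scored. It makes several assumptions: default hyperparameters apart from a fixed random_state, accuracy and AUC as the metrics (the post does not say which metrics were used), and two hypothetical helper names, build_models and evaluate_models. XGBoost and LightGBM are separate packages, so the sketch skips them when they are not installed.

```python
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn import metrics


def build_models(random_state=2018):
    """Return the classifiers keyed by name (default hyperparameters assumed)."""
    models = {
        'RandomForest': RandomForestClassifier(random_state=random_state),
        'GBDT': GradientBoostingClassifier(random_state=random_state),
    }
    # XGBoost and LightGBM are optional dependencies; skip them if missing.
    try:
        from xgboost import XGBClassifier
        models['XGBoost'] = XGBClassifier(random_state=random_state)
    except ImportError:
        pass
    try:
        from lightgbm import LGBMClassifier
        models['LightGBM'] = LGBMClassifier(random_state=random_state)
    except ImportError:
        pass
    return models


def evaluate_models(models, X_train, y_train, X_test, y_test):
    """Fit each model and return {name: {'accuracy': ..., 'auc': ...}} on the test set."""
    results = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        y_pred = model.predict(X_test)
        y_proba = model.predict_proba(X_test)[:, 1]  # probability of label 1
        results[name] = {
            'accuracy': metrics.accuracy_score(y_test, y_pred),
            'auc': metrics.roc_auc_score(y_test, y_proba),
        }
    return results
```

With the split from step 2, calling evaluate_models(build_models(), X_train, y_train, X_test, y_test) returns one accuracy and one AUC per model, which can be printed or put into a pandas DataFrame for comparison.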
4. Model results.
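The result figures from the original post are not available here. As one possible way to present the results, the sketch below draws a ROC curve per model on a shared axis. Choosing ROC curves is an assumption, not something the post confirms, and plot_roc_curves is a hypothetical helper that expects a dict of already fitted classifiers.

```python
import matplotlib.pyplot as plt
from sklearn import metrics


def plot_roc_curves(fitted_models, X_test, y_test):
    """Draw one ROC curve per fitted model on a shared axis and return the figure."""
    fig, ax = plt.subplots(figsize=(6, 6))
    for name, model in fitted_models.items():
        y_proba = model.predict_proba(X_test)[:, 1]  # probability of label 1
        fpr, tpr, _ = metrics.roc_curve(y_test, y_proba)
        auc = metrics.auc(fpr, tpr)
        ax.plot(fpr, tpr, label=f'{name} (AUC = {auc:.3f})')
    # Diagonal reference line: the performance of random guessing.
    ax.plot([0, 1], [0, 1], linestyle='--', color='grey')
    ax.set_xlabel('False positive rate')
    ax.set_ylabel('True positive rate')
    ax.set_title('ROC curves on the test set')
    ax.legend(loc='lower right')
    return fig
```

Calling plt.show() after plot_roc_curves(...) displays the figure; fig.savefig('roc.png') writes it to disk instead.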





5. Explanation of model parameters
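The post leaves this section empty. As a rough reference, the sketch below lists a few commonly tuned constructor parameters for each of the four models, with a short comment on what each one controls. The values are illustrative assumptions, not the settings used in this post, and PARAMS is a hypothetical name. Only the two scikit-learn models are instantiated so that the snippet runs without XGBoost or LightGBM installed.

```python
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier

# Illustrative values only; they are NOT the settings used in this post.
PARAMS = {
    'RandomForest': {
        'n_estimators': 200,       # number of trees in the forest
        'max_depth': 8,            # maximum depth of each tree
        'max_features': 'sqrt',    # features considered at each split
        'min_samples_leaf': 5,     # minimum samples required in a leaf
    },
    'GBDT': {
        'n_estimators': 200,       # number of boosting stages
        'learning_rate': 0.05,     # shrinkage applied to each stage
        'max_depth': 3,            # depth of each individual tree
        'subsample': 0.8,          # fraction of rows sampled per stage
    },
    'XGBoost': {                   # keyword arguments for XGBClassifier
        'n_estimators': 200,       # number of boosting rounds
        'learning_rate': 0.05,     # step size shrinkage
        'max_depth': 4,            # maximum tree depth
        'subsample': 0.8,          # fraction of rows sampled per tree
        'colsample_bytree': 0.8,   # fraction of columns sampled per tree
    },
    'LightGBM': {                  # keyword arguments for LGBMClassifier
        'n_estimators': 200,       # number of boosting rounds
        'learning_rate': 0.05,     # step size shrinkage
        'num_leaves': 31,          # maximum leaves per tree
        'colsample_bytree': 0.8,   # fraction of columns sampled per tree
    },
}

rf = RandomForestClassifier(random_state=2018, **PARAMS['RandomForest'])
gbdt = GradientBoostingClassifier(random_state=2018, **PARAMS['GBDT'])
```

In general, a smaller learning_rate paired with more estimators trades training time for a smoother fit in the boosting models, while the depth and leaf-count limits cap how complex each tree can become.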