1.key: bagging (ensemble learning)
value: 1. Create many submodels and keep them different from one another
2. Combine their predictions by voting
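
The two steps above can be sketched in plain Python. This is a minimal illustration of hard (majority) voting, not scikit-learn's implementation; the function name `majority_vote` and the toy predictions are assumptions made up for the example.

```python
from collections import Counter

def majority_vote(predictions_per_model):
    """Combine per-model label lists into one label per sample by majority vote."""
    combined = []
    # zip(*...) regroups the predictions so each tuple holds all votes for one sample
    for votes in zip(*predictions_per_model):
        label, _count = Counter(votes).most_common(1)[0]
        combined.append(label)
    return combined

# Three hypothetical submodels, each predicting labels for the same four samples
model_predictions = [
    [1, 0, 1, 1],
    [1, 1, 0, 1],
    [0, 0, 1, 1],
]
final_labels = majority_vote(model_predictions)
print(final_labels)
```

Using an odd number of submodels avoids ties in a two-class problem; in this sketch a tie would simply go to the label that appears first among the votes.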

2.key: diversity
value: each submodel sees only part of the data
example: with 500 samples, each submodel sees only 100 of them

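3.problem: if each submodel sees only part of the data, won't it be less accurate?
answer: Bagging
Voting compensates for this. A single submodel may be weaker, but as long as each one is better than random guessing and their errors are not strongly correlated, the majority vote becomes more accurate as the number of submodels grows.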
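
A small calculation shows why adding models helps. The sketch below assumes an idealized case that real ensembles only approximate: every submodel is correct with the same probability `p` and the submodels err independently. The function name `majority_vote_accuracy` and the value `p = 0.6` are illustrative assumptions.

```python
from math import comb

def majority_vote_accuracy(n_models, p):
    """Probability that a strict majority of n independent models is correct."""
    k_min = n_models // 2 + 1  # smallest number of correct votes that wins
    return sum(
        comb(n_models, k) * p**k * (1 - p) ** (n_models - k)
        for k in range(k_min, n_models + 1)
    )

# Odd ensemble sizes avoid ties in the vote
for n_models in (1, 11, 101, 501):
    print(n_models, round(majority_vote_accuracy(n_models, 0.6), 4))
```

Under these assumptions the ensemble accuracy rises with the number of submodels. When submodels make correlated errors the gain is smaller, which is why keeping them diverse matters.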

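4.key: how to create diversity
value: sampling, with or without replacement
example: with 500 samples, each submodel draws 100; drawing with replacement is Bagging, drawing without replacement is Pasting
param: bootstrap = True (with replacement); bootstrap = False (without replacement)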
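
The difference between the two sampling modes can be seen with the standard library alone. This is a minimal sketch of the idea, not what scikit-learn does internally; the seed and variable names are arbitrary choices for the example.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is repeatable
n_samples, subset_size = 500, 100
indices = range(n_samples)

# Bagging: draw with replacement, so the same row can appear more than once
bagging_rows = rng.choices(indices, k=subset_size)

# Pasting: draw without replacement, so every drawn row is distinct
pasting_rows = rng.sample(indices, k=subset_size)

print(len(set(bagging_rows)), "distinct rows with replacement")
print(len(set(pasting_rows)), "distinct rows without replacement")
```

Pasting always yields 100 distinct rows here, while Bagging may yield fewer because repeated draws of the same row are allowed.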

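5.key: parameter explanation
example:
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

# X, y: the training data (500 samples, as in the examples above)
random_subspaces_clf = BaggingClassifier(
    DecisionTreeClassifier(),  # algorithm used for each submodel
    n_estimators=500,          # number of submodels to create
    max_samples=500,           # rows drawn per submodel [row sampling]; 500 is the full dataset, so rows are not subsampled, only resampled with replacement
    bootstrap=True,            # draw rows with replacement (False = without replacement)
    oob_score=True,            # score the ensemble on the rows each submodel never drew (out-of-bag)
    n_jobs=-1,                 # number of parallel jobs; -1 uses all CPU cores
    max_features=1,            # features drawn per submodel [column sampling]
    bootstrap_features=True,   # draw features with replacement
)

random_subspaces_clf.fit(X, y)

random_subspaces_clf.oob_score_  # accuracy estimated on the out-of-bag rows (requires oob_score=True)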
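
To see why out-of-bag rows exist at all, the sketch below simulates how many rows a single bootstrap resample leaves out when it draws as many rows as the dataset contains. It uses only the standard library and is an illustration of the sampling effect, not of how scikit-learn computes `oob_score_`; the helper name `oob_fraction`, the seed, and the number of trials are assumptions for the example.

```python
import random

def oob_fraction(n_samples, rng):
    """Fraction of rows never drawn in one bootstrap resample of size n_samples."""
    drawn = set(rng.choices(range(n_samples), k=n_samples))
    return 1 - len(drawn) / n_samples

rng = random.Random(0)  # fixed seed so the sketch is repeatable
n_samples = 500
n_trials = 200

# Average the left-out fraction over many simulated submodels
simulated = sum(oob_fraction(n_samples, rng) for _ in range(n_trials)) / n_trials

# Chance that one given row is missed by all n draws; approaches 1/e as n grows
theoretical = (1 - 1 / n_samples) ** n_samples

print(f"simulated: {simulated:.3f}, theoretical: {theoretical:.3f}")
```

Roughly a third of the rows are unseen by any given submodel, so they can serve as a validation set for that submodel without a separate train/test split.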

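6.key: random patches (sample both rows and columns)

n_estimators=500, # number of submodels to create
max_samples=500, # rows drawn per submodel [row sampling]; at 500 (the full dataset) rows are not subsampled, so use a smaller value such as 100 to sample rows as well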
