Machine Learning Day 5

XGBoost

import xgboost as xgb
from xgboost import XGBClassifier

## Module that visualizes feature importance
from xgboost import plot_importance
import pandas as pd
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

## Two classes appear, which confirms this is a binary classification problem
df['target'].value_counts()

Result

1    357
0    212
Name: target, dtype: int64
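The notes do not show how `df` was built. A minimal sketch, assuming it wraps the scikit-learn breast cancer data set imported above, with the labels appended as a `target` column. The `test_size=0.2` split is an assumption (it is consistent with the 114 test rows in the confusion matrix later in these notes), and `random_state=156` is an arbitrary value chosen only for reproducibility.

```python
import pandas as pd
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

# Load the Wisconsin breast cancer data set bundled with scikit-learn
dataset = load_breast_cancer()

# Build a DataFrame of the 30 features and append the labels as 'target'
df = pd.DataFrame(dataset.data, columns=dataset.feature_names)
df['target'] = dataset.target

# Class balance: 1 = benign, 0 = malignant
counts = df['target'].value_counts()
print(counts)

# Hold out 20% of the rows for testing (assumed split, not stated in the notes)
x_train, x_test, y_train, y_test = train_test_split(
    df.drop(columns='target'), df['target'],
    test_size=0.2, random_state=156)
print(x_train.shape, x_test.shape)
```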

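# objective: binary logistic, because the target is binary (0 or 1)
# eval_metric (loss used for evaluation): logloss
# early_stopping_rounds: stop when the evaluation metric has not improved for this many rounds
# number of boosting rounds: 400
param = {
    'max_depth':3,
    'eta':0.05,
    'objective':'binary:logistic',
    'eval_metric':'logloss',
     # early stopping is not a params key; pass early_stopping_rounds=100 to xgb.train() instead
    }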
num_rounds = 400

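# Feature importance
# The default importance type is 'weight' (split count), which older plots label "F score"; it is not the F1 score
plot_importance(xgb_model)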

Next, visualize the XGBoost trees: to_graphviz(model)

Check as a chart: plot_importance(xgb_w)

Python wrapper XGBoost

LightGBM

Installing with Anaconda

conda install -c conda-forge lightgbm

Compared with GBM (gradient boosting) and XGBoost, LightGBM uses less memory and has a shorter prediction run time.

Model creation

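## Create the model
from lightgbm import LGBMClassifier

lgbm = LGBMClassifier(n_estimators=400)
# The test set doubles as the evaluation set for early stopping
eval_list = [(x_test, y_test)]
# Note: fit() accepts early_stopping_rounds and verbose only in LightGBM releases before 4.0
lgbm.fit(x_train, 
         y_train,
         early_stopping_rounds=50,
         eval_metric ='logloss',
         eval_set = eval_list,
         verbose = True )

lgbm_pred = lgbm.predict(x_test)
lgbm_pred_proba = lgbm.predict_proba(x_test)[:,1]
# get_clf_eval is a metrics helper that is not shown in these notes
get_clf_eval(y_test, lgbm_pred, lgbm_pred_proba)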

Result

Confusion matrix
[[34  3]
 [ 2 75]]
Accuracy: 0.9561, Precision: 0.9615, Recall: 0.9740, F1: 0.9677, AUC: 0.9933
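The `get_clf_eval` helper is not defined anywhere in these notes. A minimal sketch of what it might look like, assuming it wraps the standard `sklearn.metrics` functions; the return value is an addition for convenience. The usage lines rebuild label vectors from the confusion matrix above to show how accuracy, precision, recall, and F1 follow from those four counts. AUC needs the predicted probabilities, which are not available here, so the AUC in this demo is computed from hard labels and will not match 0.9933.

```python
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             precision_score, recall_score, roc_auc_score)


def get_clf_eval(y_test, pred, pred_proba):
    """Print the confusion matrix and common binary-classification metrics."""
    confusion = confusion_matrix(y_test, pred)
    metrics = {
        'accuracy': accuracy_score(y_test, pred),
        'precision': precision_score(y_test, pred),
        'recall': recall_score(y_test, pred),
        'f1': f1_score(y_test, pred),
        'auc': roc_auc_score(y_test, pred_proba),
    }
    print('Confusion matrix')
    print(confusion)
    print('Accuracy: {accuracy:.4f}, Precision: {precision:.4f}, '
          'Recall: {recall:.4f}, F1: {f1:.4f}, AUC: {auc:.4f}'.format(**metrics))
    return confusion, metrics


# Rebuild labels that reproduce the matrix [[34, 3], [2, 75]]
# (rows are true classes, columns are predicted classes)
y_true = [0] * 37 + [1] * 77
y_pred = [0] * 34 + [1] * 3 + [0] * 2 + [1] * 75

# Hard labels stand in for probabilities here, so only the AUC differs from the notes
confusion, metrics = get_clf_eval(y_true, y_pred, y_pred)
```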

Key parameters

Hyperparameter tuning approach

kinggirl's dev log