Train & Log a Custom XGBoost Model
Train and log a custom-built XGBoost model with the Katonic SDK Log package.
Import necessary packages
import os
import pandas as pd
from xgboost import XGBClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, roc_auc_score, log_loss, recall_score, f1_score, precision_score
from katonic.log.logmodel import LogModel
Define Experiment name
experiment_name = "custom_xgb_model"
Initialize LogModel with the experiment name
lm = LogModel(experiment_name, source_name='xgboost_model_logging.ipynb')
Check Metadata of the created / existing experiment
# experiment id
exp_id = lm.id
print("experiment name: ", lm.name)
print("experiment location: ", lm.location)
print("experiment id: ", lm.id)
print("experiment status: ", lm.stage)
Artifact path where you want to log your model
artifact_path = "xgb-model"
Load Training Data
df = pd.read_csv('https://raw.githubusercontent.com/plotly/datasets/master/diabetes.csv')
df.head()
Get Features and Labels
x = df.drop(columns=['Outcome'], axis=1)
y = df['Outcome']
Split the dataset into Train and Test
X_train, X_test, y_train, y_test = train_test_split(x, y, test_size=0.20, random_state=98)
Define Evaluation Metrics
def metric(actual, pred):
    # Compute classification metrics from actual vs. predicted labels.
    acc_score = accuracy_score(actual, pred)
    recall = recall_score(actual, pred)
    precision_scr = precision_score(actual, pred)
    f1_scr = f1_score(actual, pred)
    auc_roc = roc_auc_score(actual, pred)
    # Note: log_loss is usually computed on predicted probabilities
    # (predict_proba); here it is applied to the hard class labels.
    log_los = log_loss(actual, pred)
    return (
        acc_score,
        auc_roc,
        log_los,
        recall,
        f1_scr,
        precision_scr
    )
Train XGBoost Model
model_clf = XGBClassifier(random_state=0)
model_clf.fit(X_train, y_train)
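The classifier above uses XGBoost's default hyperparameters. If you want to tune the model before logging it, the rest of the workflow stays the same; the sketch below uses a few common XGBClassifier hyperparameters, and the values shown are illustrative assumptions rather than part of this walkthrough.
# Illustrative only: these hyperparameter values are assumptions,
# not taken from the original notebook.
tuned_clf = XGBClassifier(
    n_estimators=200,    # number of boosting rounds
    max_depth=4,         # maximum depth of each tree
    learning_rate=0.1,   # shrinkage applied to each tree's contribution
    random_state=0,
)
tuned_clf.fit(X_train, y_train)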
Calculate metrics for the XGBoost model
y_pred = model_clf.predict(X_test)
(acc_score, auc_roc, log_los, recall, f1_scr, precision_scr) = metric(y_test, y_pred)
model_metrics = {
    "accuracy_score": acc_score,
    "roc_auc_score": auc_roc,
    "log_loss": log_los,
    "recall": recall,
    "f1_score": f1_scr,
    "precision_score": precision_scr
}
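As an optional sanity check before logging (not part of the original notebook), you can print the collected metrics:
# Optional: inspect the metrics that will be attached to the logged model.
for name, value in model_metrics.items():
    print(f"{name}: {value:.4f}")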
Log XGBoost Model
You can log your XGBoost model by setting the model_type parameter to xgboost. The available model types are scikit-learn, xgboost, catboost, lightgbm, prophet, keras, and custom-model.
lm.model_logging(
    model_name="xgboost",
    model_type="xgboost",
    model=model_clf,
    artifact_path=artifact_path,
    current_working_dir=f'{os.getcwd()}/xgboost_model_logging.ipynb',
    metrics=model_metrics
)
Check all the logged Experiments
You can search and retrieve all the logged models for a specific experiment ID.
df_runs = lm.search_runs(exp_id)
print("Number of runs done : ", len(df_runs))
df_runs.head()
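If you want to single out a particular run from the results, you can filter the returned DataFrame. The sketch below assumes search_runs returns MLflow-style columns such as run_id and metrics.accuracy_score; check df_runs.columns for the exact names in your environment.
# Assumption: the runs DataFrame exposes MLflow-style columns
# ("run_id", "metrics.accuracy_score", ...); verify with df_runs.columns.
if "metrics.accuracy_score" in df_runs.columns:
    best_run = df_runs.sort_values("metrics.accuracy_score", ascending=False).iloc[0]
    print("best run id: ", best_run["run_id"])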