Experiment Operations

Create a New Experiment

If the experiment does not exist, create one with the provided name.

from katonic.ml.client import set_exp

set_exp(exp_name='exp-name')
>>> INFO: 'exp-name' does not exist. Creating a new experiment

Continue an Existing Experiment

Set the given experiment as the active experiment.

from katonic.ml.client import set_exp

set_exp(exp_name='previous-exp-name')

Get Info of an Existing Experiment

Retrieve an experiment by name from the backend store.

from katonic.ml.util import get_exp

get_exp(exp_name='exp-name')

Output

field            | parameters
experiment_name  | exp-name
location         | s3://models/18
experiment_id    | 18
experiment_stage | active
tags             |

Train & Log Classification Models

  1. Logistic Regression Classifier
  2. Random Forest Classifier
  3. AdaBoost Classifier
  4. Gradient Boosting Classifier
  5. CatBoost Classifier
  6. LGBM Classifier
  7. XGBoost Classifier
  8. Decision Tree Classifier
  9. Support Vector Classifier
  10. Ridge Classifier
  11. KNeighbors Classifier
  12. GaussianNB Classifier

Train & Log Regression Models

  1. Linear Regression
  2. Ridge Regression
  3. Lasso Regression
  4. ElasticNet Regression
  5. Support Vector Regression
  6. KNN Regression
  7. Random Forest Regressor
  8. XGBoost Regressor
  9. CatBoost Regressor
  10. LGBM Regressor
  11. Gradient Boosting Regressor
  12. AdaBoost Regressor
  13. Decision Tree Regressor
  14. ExtraTree Regressor

Train & Log a LogisticRegression Model

Train and log a LogisticRegression classification model.

from katonic.ml.classification import Classifier

clf = Classifier(
X_train,
X_test,
y_train,
y_test,
exp_name,
source_name, # your current notebook path/name
features, # [Optional] list of features(str) used in this modelling
artifacts # [Optional] artifacts in dict (e.g. `data_path`: dataset path, `images`: image folder saved for this project like Analysis charts etc.)
)
clf.LogisticRegression()

Train & Log a RandomForestClassifier Model

Train and log a RandomForestClassifier classification model.

from katonic.ml.classification import Classifier

clf = Classifier(
X_train,
X_test,
y_train,
y_test,
exp_name,
source_name, # your current notebook path/name
features, # [Optional] list of features(str) used in this modelling
artifacts # [Optional] artifacts in dict (e.g. `data_path`: dataset path, `images`: image folder saved for this project like Analysis charts etc.)
)
clf.RandomForestClassifier()

Train & Log an AdaBoostClassifier Model

Train and log an AdaBoostClassifier classification model.

from katonic.ml.classification import Classifier

clf = Classifier(
X_train,
X_test,
y_train,
y_test,
exp_name,
source_name,
features, # [Optional] list of features(str) used in this modelling
artifacts # [Optional] artifacts in dict (e.g. `data_path`: dataset path, `images`: image folder saved for this project like Analysis charts etc.)
)
clf.AdaBoostClassifier()

Train & Log a GradientBoostingClassifier Model

Train and log a GradientBoostingClassifier classification model.

from katonic.ml.classification import Classifier

clf = Classifier(
X_train,
X_test,
y_train,
y_test,
exp_name,
source_name,
features, # [Optional] list of features(str) used in this modelling
artifacts # [Optional] artifacts in dict (e.g. `data_path`: dataset path, `images`: image folder saved for this project like Analysis charts etc.)
)
clf.GradientBoostingClassifier()

Train & Log a CatBoostClassifier Model

Train and log a CatBoostClassifier classification model.

from katonic.ml.classification import Classifier

clf = Classifier(
X_train,
X_test,
y_train,
y_test,
exp_name,
source_name,
features, # [Optional] list of features(str) used in this modelling
artifacts # [Optional] artifacts in dict (e.g. `data_path`: dataset path, `images`: image folder saved for this project like Analysis charts etc.)
)
clf.CatBoostClassifier()

Train & Log an LGBMClassifier Model

Train and log an LGBMClassifier classification model.

from katonic.ml.classification import Classifier

clf = Classifier(
X_train,
X_test,
y_train,
y_test,
exp_name,
source_name,
features, # [Optional] list of features(str) used in this modelling
artifacts # [Optional] artifacts in dict (e.g. `data_path`: dataset path, `images`: image folder saved for this project like Analysis charts etc.)
)
clf.LGBMClassifier()

Train & Log an XGBClassifier Model

Train and log an XGBClassifier classification model.

from katonic.ml.classification import Classifier

clf = Classifier(
X_train,
X_test,
y_train,
y_test,
exp_name,
source_name,
features, # [Optional] list of features(str) used in this modelling
artifacts # [Optional] artifacts in dict (e.g. `data_path`: dataset path, `images`: image folder saved for this project like Analysis charts etc.)
)
clf.XGBClassifier()

Train & Log a DecisionTreeClassifier Model

Train and log a DecisionTreeClassifier classification model.

from katonic.ml.classification import Classifier

clf = Classifier(
X_train,
X_test,
y_train,
y_test,
exp_name,
source_name,
features, # [Optional] list of features(str) used in this modelling
artifacts # [Optional] artifacts in dict (e.g. `data_path`: dataset path, `images`: image folder saved for this project like Analysis charts etc.)
)
clf.DecisionTreeClassifier()

Train & Log a SupportVectorClassifier Model

Train and log a SupportVectorClassifier classification model.

from katonic.ml.classification import Classifier

clf = Classifier(
X_train,
X_test,
y_train,
y_test,
exp_name,
source_name,
features, # [Optional] list of features(str) used in this modelling
artifacts # [Optional] artifacts in dict (e.g. `data_path`: dataset path, `images`: image folder saved for this project like Analysis charts etc.)
)
clf.SupportVectorClassifier()

Train & Log a RidgeClassifier Model

Train and log a RidgeClassifier classification model.

from katonic.ml.classification import Classifier

clf = Classifier(
X_train,
X_test,
y_train,
y_test,
exp_name,
source_name,
features, # [Optional] list of features(str) used in this modelling
artifacts # [Optional] artifacts in dict (e.g. `data_path`: dataset path, `images`: image folder saved for this project like Analysis charts etc.)
)
clf.RidgeClassifier()

Train & Log a KNeighborsClassifier Model

Train and log a KNeighborsClassifier classification model.

from katonic.ml.classification import Classifier

clf = Classifier(
X_train,
X_test,
y_train,
y_test,
exp_name,
source_name,
features, # [Optional] list of features(str) used in this modelling
artifacts # [Optional] artifacts in dict (e.g. `data_path`: dataset path, `images`: image folder saved for this project like Analysis charts etc.)
)
clf.KNeighborsClassifier()

Train & Log a GaussianNB Model

Train and log a GaussianNB classification model.

from katonic.ml.classification import Classifier

clf = Classifier(
X_train,
X_test,
y_train,
y_test,
exp_name,
source_name,
features, # [Optional] list of features(str) used in this modelling
artifacts # [Optional] artifacts in dict (e.g. `data_path`: dataset path, `images`: image folder saved for this project like Analysis charts etc.)
)
clf.GaussianNB()
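
Since each classifier above is a method on the same Classifier object, several models could be trained and logged in one loop by looking the methods up by name. The sketch below uses a small stand-in class (hypothetical, so the snippet runs without katonic installed) and only three of the method names from this page. With the real library, clf would be the Classifier instance built as in the examples above. Whether the real methods can be called back to back on one instance is an assumption here.

```python
class StubClassifier:
    """Hypothetical stand-in for katonic's Classifier; it only records calls."""

    def __init__(self):
        self.trained = []

    def LogisticRegression(self):
        self.trained.append("LogisticRegression")

    def RandomForestClassifier(self):
        self.trained.append("RandomForestClassifier")

    def GaussianNB(self):
        self.trained.append("GaussianNB")


def train_all(clf, method_names):
    """Call each named training method on clf, in the given order."""
    for name in method_names:
        getattr(clf, name)()  # equivalent to clf.LogisticRegression(), etc.


clf = StubClassifier()
train_all(clf, ["LogisticRegression", "RandomForestClassifier", "GaussianNB"])
print(clf.trained)
```

A misspelled method name raises AttributeError at the getattr call, before any further model is trained.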

Train & Log a LinearRegression Model

Train and log a LinearRegression regression model.

from katonic.ml.regression import Regressor

reg = Regressor(
X_train,
X_test,
y_train,
y_test,
exp_name,
source_name, # your current notebook path/name
features, # [Optional] list of features(str) used in this modelling
artifacts # [Optional] artifacts in dict (e.g. `data_path`: dataset path, `images`: image folder saved for this project like Analysis charts etc.)
)
reg.LinearRegression()

Train & Log a RidgeRegression Model

Train and log a RidgeRegression regression model.

from katonic.ml.regression import Regressor

reg = Regressor(
X_train,
X_test,
y_train,
y_test,
exp_name,
source_name,
features,
)
reg.RidgeRegression()

Train & Log a LassoRegression Model

Train and log a LassoRegression regression model.

from katonic.ml.regression import Regressor

reg = Regressor(
X_train,
X_test,
y_train,
y_test,
exp_name,
source_name,
features,
)
reg.LassoRegression()

Train & Log an ElasticNet Model

Train and log an ElasticNet regression model.

from katonic.ml.regression import Regressor

reg = Regressor(
X_train,
X_test,
y_train,
y_test,
exp_name,
source_name,
features,
)
reg.ElasticNet()

Train & Log a SupportVectorRegressor Model

Train and log a SupportVectorRegressor regression model.

from katonic.ml.regression import Regressor

reg = Regressor(
X_train,
X_test,
y_train,
y_test,
exp_name,
source_name,
features,
)
reg.SupportVectorRegressor()

Train & Log a KNNRegressor Model

Train and log a KNNRegressor regression model.

from katonic.ml.regression import Regressor

reg = Regressor(
X_train,
X_test,
y_train,
y_test,
exp_name,
source_name,
features,
)
reg.KNNRegressor()

Train & Log a RandomForestRegressor Model

Train and log a RandomForestRegressor regression model.

from katonic.ml.regression import Regressor

reg = Regressor(
X_train,
X_test,
y_train,
y_test,
exp_name,
source_name,
features,
)
reg.RandomForestRegressor()

Train & Log an XGBRegressor Model

Train and log an XGBRegressor regression model.

from katonic.ml.regression import Regressor

reg = Regressor(
X_train,
X_test,
y_train,
y_test,
exp_name,
source_name,
features,
)
reg.XGBRegressor()

Train & Log a CatBoostRegressor Model

Train and log a CatBoostRegressor regression model.

from katonic.ml.regression import Regressor

reg = Regressor(
X_train,
X_test,
y_train,
y_test,
exp_name,
source_name,
features,
)
reg.CatBoostRegressor()

Train & Log an LGBMRegressor Model

Train and log an LGBMRegressor regression model.

from katonic.ml.regression import Regressor

reg = Regressor(
X_train,
X_test,
y_train,
y_test,
exp_name,
source_name,
features,
)
reg.LGBMRegressor()

Train & Log a GradientBoostingRegressor Model

Train and log a GradientBoostingRegressor regression model.

from katonic.ml.regression import Regressor

reg = Regressor(
X_train,
X_test,
y_train,
y_test,
exp_name,
source_name,
features,
)
reg.GradientBoostingRegressor()

Train & Log an AdaBoostRegressor Model

Train and log an AdaBoostRegressor regression model.

from katonic.ml.regression import Regressor

reg = Regressor(
X_train,
X_test,
y_train,
y_test,
exp_name,
source_name,
features,
)
reg.AdaBoostRegressor()

Train & Log a DecisionTreeRegressor Model

Train and log a DecisionTreeRegressor regression model.

from katonic.ml.regression import Regressor

reg = Regressor(
X_train,
X_test,
y_train,
y_test,
exp_name,
source_name,
features,
)
reg.DecisionTreeRegressor()

Train & Log an ExtraTreeRegressor Model

Train and log an ExtraTreeRegressor regression model.

from katonic.ml.regression import Regressor

reg = Regressor(
X_train,
X_test,
y_train,
y_test,
exp_name,
source_name,
features,
)
reg.ExtraTreeRegressor()

Log a Classification Model with Hyperparameter Tuning

Log a classification model with hyperparameter tuning, using the provided parameter constraints.

params = {
'n_estimators': {
'low': 80,
'high': 120,
'step': 10,
'type': 'int'
},
'criterion':{
'values': ['gini', 'entropy'],
'type': 'categorical'
},
'min_samples_split': {
'low': 2,
'high': 5,
'type': 'int'
},
'min_samples_leaf':{
'low': 1,
'high': 5,
'type': 'int'
}
}

clf.RandomForestClassifier(is_tune=True, n_trials=5, params=params)
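
To see which configurations a constraint dictionary like the one above admits, it can help to draw a sample from it locally. The sketch below is an illustration only and says nothing about how katonic samples internally: the helper name sample_params is hypothetical, both bounds are assumed inclusive, and a missing 'step' is assumed to mean 1.

```python
import random


def sample_params(space, rng):
    """Draw one configuration from a search space in the dict format shown above."""
    config = {}
    for name, spec in space.items():
        if spec["type"] == "categorical":
            config[name] = rng.choice(spec["values"])
        elif spec["type"] == "int":
            if spec["low"] > spec["high"]:
                raise ValueError(f"{name}: 'low' must not exceed 'high'")
            step = spec.get("step", 1)  # assumption: default step is 1
            # Assumption: both 'low' and 'high' are inclusive.
            config[name] = rng.randrange(spec["low"], spec["high"] + 1, step)
        else:
            raise ValueError(f"{name}: unsupported type {spec['type']!r}")
    return config


params = {
    "n_estimators": {"low": 80, "high": 120, "step": 10, "type": "int"},
    "criterion": {"values": ["gini", "entropy"], "type": "categorical"},
    "min_samples_split": {"low": 2, "high": 5, "type": "int"},
    "min_samples_leaf": {"low": 1, "high": 5, "type": "int"},
}

rng = random.Random(0)  # seeded so repeated runs draw the same sample
trial = sample_params(params, rng)
print(trial)
```

Running the helper before a tuning job is a cheap way to catch a swapped low/high pair or a misspelled 'type' value.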

Log a Regression Model with Hyperparameter Tuning

Log a regression model with hyperparameter tuning, using the provided parameter constraints.

params = {
'n_estimators': {
'low': 80,
'high': 120,
'step': 10,
'type': 'int'
},
'criterion':{
'values': ['mse', 'mae'],
'type': 'categorical'
},
'min_samples_split': {
'low': 2,
'high': 5,
'type': 'int'
},
'min_samples_leaf':{
'low': 1,
'high': 5,
'type': 'int'
}
}

reg.RandomForestRegressor(is_tune=True, params=params)

Log a Custom Model

This function logs a custom user model. The custom model must be a subclass of mlflow.pyfunc.PythonModel.

import os
import mlflow.pyfunc
from katonic.ml.miscellaneous import LogModel

lm = LogModel(experiment_name='custom-model-name', source_name='experiment-docs.ipynb')
>>> INFO: 'custom-model-name' does not exist. Creating a new experiment
working_dir = os.getcwd() + '/experiment-docs.ipynb'

class AddN(mlflow.pyfunc.PythonModel):

    def __init__(self, n):
        self.n = n

    def predict(self, context, model_input):
        return model_input.apply(lambda column: column + self.n)

lm.model_logging(
model_name= "add_n",
model_type="custom-model",
model=AddN(n=5),
artifact_path="custom-model-log",
current_working_dir=working_dir,
)
>>> Model artifact logged to: s3://models/19/c31db94b700a4cb79c15e77d77a7f5d5/artifacts/custom-model-name_19_custom-model-log_add_n
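
The logged artifact URI can be taken apart to recover the experiment and run it belongs to. The sketch below assumes the layout s3://bucket/experiment_id/run_id/artifacts/name, which is inferred from the outputs on this page rather than from a documented contract. The helper name split_artifact_uri is hypothetical.

```python
from urllib.parse import urlparse


def split_artifact_uri(uri):
    """Split an s3://bucket/experiment_id/run_id/artifacts/name URI into parts."""
    parsed = urlparse(uri)
    parts = parsed.path.strip("/").split("/")
    # Assumed layout: <experiment_id>/<run_id>/artifacts/<artifact name>
    if parsed.scheme != "s3" or len(parts) < 4 or parts[2] != "artifacts":
        raise ValueError(f"unexpected artifact URI layout: {uri}")
    return {
        "bucket": parsed.netloc,
        "experiment_id": parts[0],
        "run_id": parts[1],
        "artifact_name": "/".join(parts[3:]),
    }


info = split_artifact_uri(
    "s3://models/19/c31db94b700a4cb79c15e77d77a7f5d5/artifacts/"
    "custom-model-name_19_custom-model-log_add_n"
)
print(info["experiment_id"], info["run_id"])
```

The extracted run_id is the kind of value that the run search and delete functions later on this page work with.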

Log a Keras Model

This function logs a Keras model.

import os
import tensorflow as tf
from katonic.ml.miscellaneous import LogModel

lm = LogModel(experiment_name='keras-name', source_name='keras_model_logging.ipynb')
>>> INFO: 'keras-name' does not exist. Creating a new experiment
working_dir = f'{os.getcwd()}/keras_model_logging.ipynb'

LAYERS = [
tf.keras.layers.Flatten(input_shape=(28, 28), name='inputLayer'),
tf.keras.layers.Dense(300, activation='relu', name='hiddenLayer1'),
tf.keras.layers.Dense(100, activation='relu', name='hiddenLayer2'),
tf.keras.layers.Dense(10, activation='softmax', name='outputLayer')
]
model_clf = tf.keras.models.Sequential(LAYERS)

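# loss_function, optimizers, and metric must be defined beforehand,
# e.g. 'sparse_categorical_crossentropy', 'SGD', and ['accuracy']
model_clf.compile(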
loss=loss_function,
optimizer=optimizers,
metrics=metric
)

lm.model_logging(
model_name= "keras_classifier",
model_type="keras",
model=model_clf,
artifact_path="keras-log",
current_working_dir=working_dir,
)
>>> Model artifact logged to: s3://models/19/c31db94b700a4cb79c15e77d77a7f5d5/artifacts/keras-name_19_keras-log_keras_classifier

Log a Scikit-learn Model

This function logs a Scikit-learn model.

import os
from sklearn.ensemble import RandomForestClassifier
from katonic.ml.miscellaneous import LogModel

lm = LogModel(experiment_name='sklearn-model', source_name='scikit_learn_logging.ipynb')
>>> INFO: 'sklearn-model' does not exist. Creating a new experiment
working_dir = f'{os.getcwd()}/scikit_learn_logging.ipynb'
artifact_path = "scikit-learn-model"

model_clf = RandomForestClassifier(max_depth=2, random_state=0)

lm.model_logging(
model_name="random_forest",
model_type="scikit-learn",
model=model_clf,
artifact_path=artifact_path,
current_working_dir=f'{os.getcwd()}/scikit_learn_logging.ipynb',
metrics=model_metrics # your computed evaluation metrics
)
>>> Model artifact logged to: s3://models/18/9dbe7b1db90d4dc28cc59fe76a073739/artifacts/sklearn_model_18_scikit-learn-model_random_forest

Log an XGBoost Model

This function logs an XGBoost model.

import os
from xgboost import XGBClassifier
from katonic.ml.miscellaneous import LogModel

lm = LogModel(experiment_name='xgb_model', source_name='xgboost_logging.ipynb')
>>> INFO: 'xgb_model' does not exist. Creating a new experiment
working_dir = f'{os.getcwd()}/xgboost_logging.ipynb'
artifact_path = "xgb-model"

model_clf = XGBClassifier(random_state=0)

lm.model_logging(
model_name="xgboost",
model_type="xgboost",
model=model_clf,
artifact_path=artifact_path,
current_working_dir=working_dir,
metrics=model_metrics # your computed evaluation metrics
)
>>> Model artifact logged to: s3://models/19/a58671c5b5e64395919f28db374b00f0/artifacts/xgb_model_19_xgb-model_xgboost

You can explore logging other types of models here.

View Experiment Runs

This function searches the runs of an experiment and returns them as a DataFrame. It takes exp_id as input.

import pandas as pd
from sklearn.model_selection import train_test_split

df = pd.read_csv('https://raw.githubusercontent.com/plotly/datasets/master/diabetes.csv')

x = df.drop(columns=['Outcome'], axis=1)
y = df['Outcome']

X_train, X_test, y_train, y_test = train_test_split(x, y, test_size=0.20, random_state=98)

from katonic.ml.classification import Classifier

clf = Classifier(
X_train,
X_test,
y_train,
y_test,
'my-new-exp',
source_name,
features,
)

df_runs = clf.search_runs(exp_id='21')
df_runs
  | artifact_uri | end_time | experiment_id | metrics.accuracy_score | metrics.f1_score | metrics.log_loss | metrics.precision_score | metrics.recall | metrics.roc_auc_score | run_id | run_name | start_time | status | tags.mlflow.log-model.history
0 | s3://models/21/fef2e0533fec42b586251fbe07294ed... | 2022-03-29 11:44:24.213000+00:00 | 21 | 0.727273 | 0.562500 | 9.419765 | 0.586957 | 0.54 | 0.678654 | fef2e0533fec42b586251fbe07294ed1 | my-new-exp_21_decision_tree_classifier | 2022-03-29 11:44:22.013000+00:00 | FINISHED | ["run_id": "fef2e0533fec42b586251fbe07294ed1"...
1 | s3://models/21/fd7ab366582e4a2b85358bfa24ff62c... | 2022-03-29 11:44:20.341000+00:00 | 21 | 0.792208 | 0.636364 | 7.176941 | 0.736842 | 0.56 | 0.731923 | fd7ab366582e4a2b85358bfa24ff62c5 | my-new-exp_21_logistic_regression | 2022-03-29 11:44:18.234000+00:00 | FINISHED | ["run_id": "fd7ab366582e4a2b85358bfa24ff62c5"...
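
A common next step is to pick the best run by one of the logged metrics. The sketch below copies two metrics of the two runs listed above into plain dictionaries so that it runs without pandas or katonic. The helper name best_run and the shortened metric keys are choices made for this illustration.

```python
# Values copied from the two runs in the listing above.
runs = [
    {
        "run_id": "fef2e0533fec42b586251fbe07294ed1",
        "run_name": "my-new-exp_21_decision_tree_classifier",
        "accuracy_score": 0.727273,
        "roc_auc_score": 0.678654,
    },
    {
        "run_id": "fd7ab366582e4a2b85358bfa24ff62c5",
        "run_name": "my-new-exp_21_logistic_regression",
        "accuracy_score": 0.792208,
        "roc_auc_score": 0.731923,
    },
]


def best_run(runs, metric):
    """Return the run with the highest value of the given metric."""
    return max(runs, key=lambda run: run[metric])


winner = best_run(runs, "accuracy_score")
print(winner["run_name"], winner["accuracy_score"])
```

On the DataFrame itself, df_runs.sort_values('metrics.accuracy_score', ascending=False).iloc[0] selects the same row.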

Delete Experiment Runs

Delete the experiment runs with the specified run_ids.

from katonic.ml.classification import Classifier

clf = Classifier(X_train, X_test, y_train, y_test, 'my-new-exp', source_name="delete-experiment-runs.ipynb")

run_list = clf.search_runs(exp_id='21')['run_id'].tolist()
run_list
>>> ['fef2e0533fec42b586251fbe07294ed1', 'fd7ab366582e4a2b85358bfa24ff62c5']
clf.delete_run_by_id(run_ids=['fef2e0533fec42b586251fbe07294ed1'])
>>> "['fef2e0533fec42b586251fbe07294ed1'] runids successfully deleted"
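
Rather than typing run IDs by hand, the list to delete can be derived from the metrics, for example by keeping the best run and deleting the rest. The sketch below is one way to build that list. The helper name runs_to_delete is hypothetical, and the accuracy values are the ones reported for experiment 21 in the run listing on this page.

```python
def runs_to_delete(runs, metric, keep_top=1):
    """Return the run IDs of every run except the keep_top best by metric."""
    ranked = sorted(runs, key=lambda run: run[metric], reverse=True)
    return [run["run_id"] for run in ranked[keep_top:]]


# Accuracy values as reported for the two runs of experiment 21.
runs = [
    {"run_id": "fef2e0533fec42b586251fbe07294ed1", "accuracy_score": 0.727273},
    {"run_id": "fd7ab366582e4a2b85358bfa24ff62c5", "accuracy_score": 0.792208},
]

stale = runs_to_delete(runs, "accuracy_score")
print(stale)
```

The resulting list has the shape that delete_run_by_id takes as run_ids in the example above.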