
Deploy a Custom GenAI Model.

This document explains how to deploy a custom GenAI model.

Before starting the deployment process, prepare each file in the format described below.

The model repository must contain the following prerequisite files:

  1. requirements.txt - A text file listing all required packages along with their versions.
setuptools==x.x.x
wheel==x.x.x
spacy==x.x.x
  2. schema.py - This file contains the schema of the input that will be accepted by your endpoint API. You can modify the code below to match your input schema. In the code below, the schema is a Dict.
from pydantic import BaseModel
from typing import Any, Dict, List

# Sample predict schema.
# The key must be named data; its value can be of any type.
class PredictSchema(BaseModel):
    data: Dict
  3. launch.py - This is the most important file. It contains the load_model, preprocessing and prediction functions. The template for the file is shown below:

Note: The load_model and prediction functions are compulsory. The preprocessing function is optional, depending on the data you pass to the system. By default, the preprocessing function returns false.

Note: load_model takes the logger object; please do not define your own logging object. preprocessing takes the data and the logger object; prediction takes the preprocessed data, the model and the logger object.

from langchain.llms import OpenAI
from langchain.chains import LLMChain
from pydantic import BaseModel
from typing import List, Any, Dict, Union
import requests
import json
import os
from schema import PredictSchema

def loadmodel(logger):
    """Get the model"""
    # The API key is read from the API_KEY environment variable.
    openai_model = OpenAI(
        model_name="gpt-3.5-turbo-16k",
        openai_api_key=os.environ.get("API_KEY"),
        temperature=0.7,
        max_tokens=500,
        top_p=1.0,
        frequency_penalty=1.0,
    )
    logger.info("Model fetched.")
    return openai_model

def preprocessing(data, logger):
    """Applies preprocessing techniques to the raw data"""
    logger.info("Task fetched.")
    # Build the prompt from the context and the task supplied in the request.
    finalPrompt = f"Context: {data['data']}\nYour task is: {data['task']}\nGenerate Response: "
    logger.info("Created the final Prompt")
    return finalPrompt

def predict(finalPrompt, openai_model, logger):
    """Predicts the results for the given inputs"""
    logger.info(f"finalPrompt:{finalPrompt}")
    logger.info("Model prediction started.")
    try:
        response = openai_model(finalPrompt)
    except Exception as e:
        # Log the failure and re-raise; otherwise response would be undefined below.
        logger.info(e)
        raise
    logger.info("Prediction done.")
    return response
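Before pushing the files, it can help to exercise the three functions locally from a separate script (not from inside launch.py). The sketch below is a minimal illustration, not the platform's actual runner: it assumes the functions are called in the order the notes describe (model loading, then preprocessing, then prediction), and it uses stub versions of the functions plus a hypothetical `EchoModel` so that it runs without an OpenAI key. The `smoke_test` helper and the sample payload are also assumptions made for illustration.

```python
import logging


class EchoModel:
    """Hypothetical stand-in for the OpenAI model so no API key is needed."""

    def __call__(self, prompt: str) -> str:
        return f"[stub response to {len(prompt)} characters of prompt]"


def loadmodel(logger):
    # Stub: the real template builds an OpenAI model here.
    logger.info("Stub model created.")
    return EchoModel()


def preprocessing(data, logger):
    # Same prompt layout as the template above.
    return f"Context: {data['data']}\nYour task is: {data['task']}\nGenerate Response: "


def predict(final_prompt, model, logger):
    return model(final_prompt)


def smoke_test(payload):
    """Call the three functions in the assumed order and return prompt and response."""
    logger = logging.getLogger("smoke-test")
    model = loadmodel(logger)
    prompt = preprocessing(payload, logger)
    return prompt, predict(prompt, model, logger)


# Assumed input shape: a dict with the 'data' and 'task' keys that preprocessing reads.
prompt, response = smoke_test(
    {"data": "Paris is the capital of France.", "task": "Name the capital."}
)
print(prompt)
print(response)
```

To test the real code, the three stubs could be replaced with `from launch import loadmodel, preprocessing, predict`, with the API_KEY environment variable set beforehand.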

Once you have prepared the required files, you can proceed with the deployment.

Note: Do not call any of the three functions inside launch.py.

Note: Before deploying the model, all the required files must be in the GitHub repository.
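The file requirements above can be checked before pushing to GitHub. The following is a minimal sketch, assuming a local clone of the repository; the `check_repo` helper and the pinned-version pattern are illustrative choices, not part of the platform. It reports missing files and any requirements.txt line that is not pinned with `==`. The demo at the end uses a temporary directory with placeholder contents.

```python
import re
import tempfile
from pathlib import Path

REQUIRED_FILES = ("requirements.txt", "schema.py", "launch.py")
# Matches lines such as "spacy==x.x.x": a package name, optional extras, "==", a version.
PINNED = re.compile(r"^[A-Za-z0-9._-]+(\[[A-Za-z0-9,._-]+\])?==\S+$")


def check_repo(repo_dir):
    """Return a list of problems found; an empty list means every check passed."""
    repo = Path(repo_dir)
    problems = [
        f"missing file: {name}"
        for name in REQUIRED_FILES
        if not (repo / name).is_file()
    ]
    requirements = repo / "requirements.txt"
    if requirements.is_file():
        lines = requirements.read_text().splitlines()
        for number, line in enumerate(lines, start=1):
            line = line.split("#", 1)[0].strip()  # drop comments and whitespace
            if line and not PINNED.match(line):
                problems.append(
                    f"requirements.txt line {number}: not pinned with '==': {line}"
                )
    return problems


# Demo on a throwaway directory: launch.py is absent and "spacy" is unpinned.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "requirements.txt").write_text("setuptools==68.0.0\nspacy\n")
    Path(tmp, "schema.py").write_text("")
    demo_problems = check_repo(tmp)

for problem in demo_problems:
    print(problem)
```

Running `check_repo(".")` from the root of the repository would apply the same checks to the real files.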

How to deploy your AI model using Custom Deployment

  1. Navigate to the Deploy section from the sidebar on the platform.

  2. Select the Model Deployment option at the bottom.

  3. Fill in the model details in the dialog box.

  • Give a name to your deployment, for example custom-GenAI-Model-For-API, and proceed to the next field.

  • Select Custom Model under the Model Source option.

  • Select the Model Type; in this example it is NLP.

  • Provide the GitHub token.

  • Your username will appear once the token is provided.

  • Select the Account Type.

  • Select the Organization Name, if the account type is Organization.

  • Select the Repository.

  • Select the Revision Type.

  • Select the Branch Name, if the revision type is Branch.

    • Note: your GitHub repository must contain requirements.txt, schema.py and launch.py files whose templates are discussed above.

  • Select the Hardware Type: CPU or GPU.

  • Select the Python version.

  • Select Resources.

  • Enable or disable Autoscaling.

  • Select the Pods range, if Autoscaling is enabled.

  • Select +Add Environment Variables to add the OpenAI API key. Set the key to "API_KEY" and the value to your OpenAI API key.

    • Note: After entering the variable name and value, click the [+] button beside them to add the variable to your deployment.

  • Click on Deploy.

  4. Once your Custom Model API is created, you can view it in the Deploy section, where it will initially be in the "Processing" state. Click on Refresh to update the status. It will move to the "Running" state after some time.

  5. You can also check the logs to follow the progress of the current deployment using the Logs option.

  6. Once your Model API is in the Running state, you can check the consumption of hardware resources from the Usage option.

  7. You can access the API endpoints by clicking on API.

  • There are two APIs under API URLs:

  • Model Prediction API endpoint: This API generates predictions from the deployed model. Here is the code snippet to use the predict API:

import requests

MODEL_API_ENDPOINT = "Prediction API URL"
SECURE_TOKEN = "Token"
data = {"data": "Define the value format as per the schema file"}
# verify=False disables TLS certificate verification for this request.
result = requests.post(MODEL_API_ENDPOINT, json=data, verify=False, headers={"Authorization": SECURE_TOKEN})
print(result.text)

  • Model Feedback API endpoint: This API is for monitoring model performance once the true labels are available for the data. The predicted labels can be saved at the destination sources, and once the true labels are available, both can be passed to the feedback URL to monitor the model continuously. Here is the code snippet to use the feedback API:

import requests

MODEL_FEEDBACK_ENDPOINT = "Feedback API URL"
SECURE_TOKEN = "Token"
true = "Pass the list of true labels"
pred = "Pass the list of predicted labels"
data = {"true_label": true, "predicted_label": pred}
# Post to the feedback URL, not the prediction URL.
result = requests.post(MODEL_FEEDBACK_ENDPOINT, json=data, verify=False, headers={"Authorization": SECURE_TOKEN})
print(result.text)

  • Click on Create API token to generate a new token for accessing the API.

  • Give a name to the token.

  • Select the Expiration Time.

  • Set the Token Expiry Date.

  • Click on Create Token and generate your API token from the pop-up dialog box.

    • Note: A maximum of 10 tokens can be generated for a model. Copy the API token once it is created; it is shown only once, so be sure to save it.

  • Under the Existing API token section, you can manage the generated tokens and delete those that are no longer needed.

  • The API usage docs explain how to use the APIs and also let you test them.

  • To learn more about using the generated API, follow the steps below:

    • The API usage docs are a guide on how to use the endpoint API. Here you can test the API with different inputs to check that the model works.
    • To test the API, first authorize yourself by adding the token. Click on Authorize and close the pop-up.
    • Once you are authorized, click on the Predict_Endpoint bar and scroll down to Try it out.
    • When you click on the Try it out button, the Request body panel becomes editable. Enter some input values for testing; the number of values/features in a record must equal the number of features used while training the model.
    • When you click on Execute, the prediction results appear at the end. If there are any errors, go back to the model card and check the error logs for further investigation.
  8. You can also modify the resources, version, and minimum and maximum pods of your deployed model by clicking the Edit option and saving the updated configuration.
  9. Click on Monitoring, and a dashboard will open in a new tab. It helps you monitor the effectiveness and efficiency of your deployed model. Refer to the Model Monitoring section in the documentation to learn more about the metrics being monitored.

  10. To delete unused models, use the Delete button.
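Returning to the Prediction and Feedback API snippets shown earlier, the request bodies can be built with small helper functions so that mistakes are caught before a request is sent. This is a sketch under stated assumptions: the helper names are hypothetical, and the nested payload shape assumes that the value under "data" is what reaches the preprocessing function of the launch.py template, which reads the 'data' and 'task' keys. Adjust it to your own schema.py.

```python
from typing import Any, Dict, List


def build_headers(token: str) -> Dict[str, str]:
    """Authorization header in the same form as the snippets earlier in this guide."""
    if not token:
        raise ValueError("API token is empty")
    return {"Authorization": token}


def build_predict_payload(context: str, task: str) -> Dict[str, Any]:
    # Assumption: the dict under "data" is passed on to preprocessing(),
    # which reads data['data'] and data['task'] in the template.
    return {"data": {"data": context, "task": task}}


def build_feedback_payload(
    true_labels: List[Any], predicted_labels: List[Any]
) -> Dict[str, Any]:
    # Each prediction needs a matching true label, so reject unequal lengths.
    if len(true_labels) != len(predicted_labels):
        raise ValueError("true_labels and predicted_labels differ in length")
    return {"true_label": list(true_labels), "predicted_label": list(predicted_labels)}


headers = build_headers("Token")
predict_payload = build_predict_payload(
    "Paris is the capital of France.", "Name the capital."
)
feedback_payload = build_feedback_payload(["positive", "negative"], ["positive", "positive"])
print(headers)
print(predict_payload)
print(feedback_payload)
```

The resulting dictionaries can be passed to `requests.post(url, json=payload, headers=headers, timeout=30)`; the `timeout` argument keeps the call from waiting indefinitely on an unresponsive endpoint.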