
Cost Requests

Request-Specific Cost Analysis:

Note: Administrator privileges are necessary to access the cost management modules on the platform.

The request-specific cost analysis feature lets you examine costs along three dimensions: project, user, and model. Project-based analysis shows the resources each project consumes, user-based analysis shows how much each user contributes to overall expenses, and model-based analysis shows the cost associated with each deployed model.

A request is logged whenever you use one of the platform's functionalities, such as chat, extract, summarize, generate, classify, embeddings, tuning-studio, or multimodal. With these cost breakdowns, stakeholders can make informed decisions about resource allocation, budgeting, and optimization strategies.

To access the request-specific cost analysis dashboard, follow these steps:

  1. Log in to the Katonic Generative AI Platform:

    Log in to your Katonic Generative AI platform account using your credentials.

  2. Navigate to the Admin Section:

    Once logged in, click on the 'Admin' section in the platform's interface.

  3. Select the Requests Board:

    Within the Admin section, locate and select the 'Requests' board.


Request-Specific Dashboard:

The request-specific cost dashboard provides the following details:

  1. Flexible Time Analysis: Analyze requests over different time periods by applying time filters.

  2. Comprehensive Request Logging: Every request made across the platform is logged, giving a detailed record of each interaction.

    Each logged request includes the following details:

    • Created at Timestamp: Records the time when the request was created.

    • Status of the Request: Indicates whether the request was successful or encountered issues.

    • Request: The input provided to the LLM.

    • Response: Displays the output generated by the model.

    • Model: Specifies the name of the LLM used for the request.

    • Total Tokens: The total number of tokens used by the request.

    • Prompt Tokens: Identifies the number of input tokens used for the request.

    • Completion Tokens: Quantifies the tokens employed for the output generated by the model.

    • Latency: The time the model took to generate the output.

    • Type: Specifies the project type in which the request occurred (e.g., Chatbot, Extraction, Summarize, Generate, or Multimodal).

    • Username: Records the name of the user who initiated the request.

    • Cost: Indicates the total cost incurred for processing the request.
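
To make the project, user, and model views concrete, here is a minimal Python sketch of how records with the fields listed above could be filtered by time and totalled per dimension. It is an illustration only, not the platform's API or export format: the `RequestLog` class, its field names, the `filter_by_time` and `cost_by` helpers, and the sample values are all hypothetical. It also assumes that total tokens equal prompt tokens plus completion tokens, which this page does not state explicitly.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass
class RequestLog:
    """One logged request. Field names are hypothetical, not the platform's schema."""
    created_at: datetime
    status: str
    model: str
    prompt_tokens: int
    completion_tokens: int
    latency_s: float
    type: str
    username: str
    cost: float

    @property
    def total_tokens(self) -> int:
        # Assumption: total tokens = prompt tokens + completion tokens.
        return self.prompt_tokens + self.completion_tokens


def filter_by_time(logs, start, end):
    """Keep requests created in the half-open window [start, end)."""
    return [r for r in logs if start <= r.created_at < end]


def cost_by(logs, dimension):
    """Sum cost per value of `dimension` ("type", "username", or "model")."""
    totals = defaultdict(float)
    for r in logs:
        totals[getattr(r, dimension)] += r.cost
    return dict(totals)


# Made-up sample data for illustration only.
logs = [
    RequestLog(datetime(2024, 1, 5, 9, 0), "success", "model-a",
               120, 80, 1.2, "Chatbot", "alice", 0.5),
    RequestLog(datetime(2024, 1, 6, 14, 30), "success", "model-b",
               300, 100, 2.0, "Summarize", "bob", 0.25),
    RequestLog(datetime(2024, 2, 1, 8, 15), "failed", "model-a",
               50, 0, 0.4, "Chatbot", "alice", 0.125),
]

# Restrict to January 2024, then total the cost per user.
january = filter_by_time(logs, datetime(2024, 1, 1), datetime(2024, 2, 1))
print(cost_by(january, "username"))

# Total cost per model over all logged requests.
print(cost_by(logs, "model"))
```

Passing "type" as the dimension gives the project-type view in the same way. Keeping the dimension as a parameter mirrors how the dashboard presents a single request log from several perspectives.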