Feature Store
Initialize Feature Store
Initialize a Feature Store.
from katonic.fs import FeatureStore
feature_store = FeatureStore(
user_name = "your-name",
project_name = "new-project",
description = "project-description",
)
Define Entity Key
An entity key (a unique ID) acts as the primary key used to retrieve features.
from katonic.fs import Entity, ValueType  # assumption: ValueType is importable from katonic.fs
entity = Entity(name="id", value_type=ValueType.INT64)
Define Data Source
File Source - CSV
from katonic.fs import FileSource
data_source = FileSource(
path = "path/to/your/csv/data/source/file",
file_format = "csv",
event_timestamp_column = "event-timestamp-column"
)
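A file source needs a column that records when each row's values were observed. The following is a minimal sketch, using only the Python standard library, of producing CSV text with such a column. The column names `id`, `price`, and `event_timestamp`, and the ISO 8601 timestamp format, are assumptions made for illustration; they are not requirements confirmed by katonic.

```python
import csv
import io
from datetime import datetime, timezone

# Hypothetical rows: an entity key, one feature, and the observation time.
rows = [
    {"id": 1, "price": 250000.0,
     "event_timestamp": datetime(2023, 1, 1, 12, 0, tzinfo=timezone.utc)},
    {"id": 2, "price": 310000.0,
     "event_timestamp": datetime(2023, 1, 2, 9, 30, tzinfo=timezone.utc)},
]

def to_csv(rows):
    """Serialize rows to CSV text, writing timestamps in ISO 8601 format."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["id", "price", "event_timestamp"])
    writer.writeheader()
    for row in rows:
        writer.writerow({**row, "event_timestamp": row["event_timestamp"].isoformat()})
    return buffer.getvalue()

csv_text = to_csv(rows)
print(csv_text)
```

Saving `csv_text` to a file gives a path that could be passed as `path`, with `event_timestamp` as the `event_timestamp_column`.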
File Source - Parquet
from katonic.fs import FileSource
data_source = FileSource(
path = "path/to/your/parquet/data/source/file",
file_format = "parquet",
event_timestamp_column = "event-timestamp-column"
)
DataFrame Source - Pandas
from katonic.fs import DataFrameSource
batch_source = DataFrameSource(
df=pandas_dataframe,
event_timestamp_column="event-timestamp-column",
created_timestamp_column="created-timestamp-column",
)
DataFrame Source - Spark
from katonic.fs import DataFrameSource
batch_source = DataFrameSource(
df=spark_dataframe,
mode="append",
event_timestamp_column="event-timestamp-column",
created_timestamp_column="created-timestamp-column",
)
Feature View
A feature view is a group of features.
from katonic.fs import FeatureView
feature_view = FeatureView(
name="feature-view-name",
entities=["entity-key"],
ttl="2d", # time to live: a number of hours/days/months/years
features=features_list,
batch_source=batch_source,
)
Write Data to Offline Store
Write the entity and feature view data to the offline store.
from katonic.fs import Entity, FeatureView, ValueType  # assumption: ValueType is importable from katonic.fs
entity = Entity(name="id", value_type=ValueType.INT64)
feature_view = FeatureView(
name="feature-view-name",
entities=["entity-key"],
ttl="2d", # time to live: a number of hours/days/months/years
features=features_list,
batch_source=batch_source,
)
feature_store.write_table([entity, feature_view])
Historical Data Retrieval
Retrieve training data from the offline store.
training_df = feature_store.get_historical_features(
entity_df=entity_df,
feature_view=["feature-view-name"],
features=features_list,
).to_df()
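Historical retrieval in feature stores is commonly a point-in-time join: for each row of the entity DataFrame, take the most recent feature values recorded at or before that row's timestamp, so that training data never sees values from the future. The sketch below illustrates that idea in plain Python. It is a conceptual model, not katonic's implementation; the function name, the tuple layout, and the use of a TTL as a lookback limit are all assumptions.

```python
from datetime import datetime, timedelta

# Hypothetical offline feature rows: (entity id, event timestamp, feature value).
feature_rows = [
    (1, datetime(2023, 1, 1), 10.0),
    (1, datetime(2023, 1, 3), 12.0),
    (2, datetime(2023, 1, 2), 20.0),
]

# Hypothetical entity rows: which entity, and as of what time, features are wanted.
entity_rows = [
    (1, datetime(2023, 1, 2)),
    (1, datetime(2023, 1, 4)),
    (2, datetime(2023, 1, 1)),
]

def point_in_time_join(entity_rows, feature_rows, ttl=timedelta(days=2)):
    """For each entity row, return the latest feature value recorded at or
    before its timestamp and no older than ttl; None if there is none."""
    result = []
    for entity_id, as_of in entity_rows:
        candidates = [
            (ts, value)
            for fid, ts, value in feature_rows
            if fid == entity_id and as_of - ttl <= ts <= as_of
        ]
        latest = max(candidates, key=lambda c: c[0])[1] if candidates else None
        result.append((entity_id, as_of, latest))
    return result

training_rows = point_in_time_join(entity_rows, feature_rows)
for row in training_rows:
    print(row)
```

The last entity row gets `None` because the only value for entity 2 was recorded after the requested time.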
Publish Data to Online Store
Materialize the latest feature values from the offline store into the online store.
feature_store.publish_table(
start_ts = start_date_as_datetime_object,
end_ts = end_date_as_datetime_object
)
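Both `start_ts` and `end_ts` are `datetime` objects that bound the window to publish. The sketch below shows one way to build such a window and what "latest features" means within it. The helper names and sample rows are hypothetical, and the selection logic illustrates the concept rather than katonic's internals.

```python
from datetime import datetime, timedelta

def materialization_window(end_ts, days):
    """Return (start_ts, end_ts) covering the `days` days up to end_ts."""
    return end_ts - timedelta(days=days), end_ts

def latest_per_entity(feature_rows, start_ts, end_ts):
    """Keep, for each entity, the newest (timestamp, value) inside the window."""
    latest = {}
    for entity_id, ts, value in feature_rows:
        if start_ts <= ts <= end_ts:
            if entity_id not in latest or ts > latest[entity_id][0]:
                latest[entity_id] = (ts, value)
    return latest

# Hypothetical offline rows: (entity id, event timestamp, feature value).
feature_rows = [
    (1, datetime(2023, 1, 7), 10.0),   # older than the window, ignored
    (1, datetime(2023, 1, 9), 12.0),
    (2, datetime(2023, 1, 8), 20.0),
    (2, datetime(2023, 1, 10), 21.0),
]

start_ts, end_ts = materialization_window(datetime(2023, 1, 10), days=2)
online_rows = latest_per_entity(feature_rows, start_ts, end_ts)
print(start_ts, end_ts, online_rows)
```

The resulting `start_ts` and `end_ts` are the kind of values that would be passed to `publish_table`.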
Online Features Retrieval
Retrieve the latest feature values at low latency, for example for online serving.
feature_store.get_online_features(
entity_rows=[{"entity-key": entity_value}],
feature_view=["feature-view-name"],
features=features_list,
).to_df()
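Conceptually, an online store behaves like a key-value map from entity key to the latest feature values, which is what makes lookups fast. The sketch below models that with a plain dictionary. The store contents, the `lookup_features` helper, and the `id` key are hypothetical and only illustrate the shape of `entity_rows` and the result; they are not katonic's API.

```python
# Hypothetical online store: entity key -> latest feature values.
online_store = {
    1: {"price": 12.0, "rooms": 3},
    2: {"price": 21.0, "rooms": 4},
}

def lookup_features(entity_rows, features, key="id"):
    """Return one dict per entity row containing the requested features.
    Unknown entities or features come back as None."""
    results = []
    for row in entity_rows:
        stored = online_store.get(row[key], {})
        results.append({key: row[key], **{f: stored.get(f) for f in features}})
    return results

served = lookup_features([{"id": 1}, {"id": 3}], ["price"])
print(served)
```

Each element of `entity_rows` is a dictionary mapping the entity key name to one value, matching the shape used in the call above.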
Feature Store Registry
The Feature Store Registry is a tracking engine for feature definitions and their related metadata.
List Entities
List all the entities present in the Feature Registry across all projects.
from katonic.fs import FeatureStore
feature_store = FeatureStore(
user_name = "your-name",
project_name = "new-project",
description = "project-description",
)
feature_store.list_entities()
List Feature Views
List all the feature views present in the Feature Registry across all projects.
feature_store.list_feature_views()
Get Registry Info - Given User Name
Get all the metadata in the Feature Registry associated with the given user name.
feature_store.get_registry_info(user_name='user')
Get Registry Info - Given Project Name
Get all the metadata in the Feature Registry associated with the given project name.
feature_store.get_registry_info(project_name='housing_price')
Get Registry Info - Given User Name, Project Name
Get all the metadata in the Feature Registry associated with both the given user name and project name.
feature_store.get_registry_info(user_name='user', project_name='housing_price')