Runtime environment setup
You can reproduce this deployment guide on localhost. Nevertheless, it is meant as a blueprint for production. All environment variables, package sources, dependencies, and other settings are declared in YAML configuration files. This is the preferred approach to production settings with Ray. See the Ray Production Guide.
As a consequence, much of this guide is about managing YAML files. Kodosumi simplifies the management of these configuration files by splitting the configuration into a base configuration and multiple app configurations.
This deployment is based on the company_news service built in the development workflow.
Let’s start by creating a root directory to host our runtime environment, for example
mkdir ~/kodosumi
cd ~/kodosumi
Create a Python Virtual Environment with
python -m venv .venv # depends on OS setup
source .venv/bin/activate
The location of your system’s Python executable python might vary.
Next, install Kodosumi from the Python package index
pip install kodosumi
If you prefer to install the latest development trunk instead, run
pip install "kodosumi @ git+https://github.com/masumi-network/kodosumi.git@dev"
Start your Ray cluster with
ray start --head
In the previous examples you ran koco start, which launches the Kodosumi spooler daemon plus the Kodosumi admin panel web app and API. In the current example we start the spooler and the panel separately. We start with the spooler
koco spool
This starts the spooler as a background daemon. You can review the daemon status with
koco spool --status
Stop the spooler later with koco spool --stop.
The spooler automatically creates the directory ./data/config to host Ray serve configuration files, as specified with the configuration parameter YAML_BASE. YAML_BASE defaults to ./data/config/config.yaml, and this path is resolved relative to the directory where you start koco spool and koco serve.
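Because the default YAML_BASE is a relative path, it matters where you start the commands. The following is a minimal sketch of that lookup, not Kodosumi's implementation. The function name config_set is hypothetical, and the sketch assumes that every other .yaml file next to the base file is treated as an app configuration.

```python
from pathlib import Path


def config_set(yaml_base="./data/config/config.yaml", cwd=None):
    """Resolve the base config against the start directory and list sibling app configs."""
    start = Path(cwd) if cwd else Path.cwd()
    base = (start / yaml_base).resolve()
    # Assumption: every other *.yaml file in the same directory is an app configuration.
    apps = sorted(p for p in base.parent.glob("*.yaml") if p != base)
    return base, apps
```

Calling config_set() from ~/kodosumi would return the absolute path of ./data/config/config.yaml together with the app configuration files found next to it.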
Create file ./data/config/config.yaml with Ray serve base configuration. The following yaml configuration is a good starting point. For further details read Ray’s documentation about Serve Config Files.
# ./data/config/config.yaml
proxy_location: EveryNode
http_options:
  host: 0.0.0.0
  port: 8001
grpc_options:
  port: 9001
  grpc_servicer_functions: []
logging_config:
  encoding: TEXT
  log_level: DEBUG
  logs_dir: null
  enable_access_log: true
We will deploy the agentic-workflow-example service as a package company_news.
Create the first app configuration named company_news.yaml with the following content:
# ./data/config/company_news.yaml
name: company_news
route_prefix: /company_news
import_path: company_news.query:fast_app
runtime_env:
  py_modules:
  - https://github.com/plan-net/agentic-workflow-example/archive/45aabddf234cf8beb7118b400e7cb567776e458a.zip
  pip:
  - openai
  env_vars:
    OTEL_SDK_DISABLED: "true"
    OPENAI_API_KEY: <-- your-api-key -->
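If you prefer not to paste the API key into the file by hand, you can generate the app configuration from a template. The sketch below uses only the Python standard library. The function name render_app_config is hypothetical, and it assumes that the key is available in an environment variable named OPENAI_API_KEY.

```python
import os
from string import Template

# Template mirroring company_news.yaml; $OPENAI_API_KEY is filled in at render time.
APP_TEMPLATE = Template("""\
name: company_news
route_prefix: /company_news
import_path: company_news.query:fast_app
runtime_env:
  py_modules:
  - https://github.com/plan-net/agentic-workflow-example/archive/45aabddf234cf8beb7118b400e7cb567776e458a.zip
  pip:
  - openai
  env_vars:
    OTEL_SDK_DISABLED: "true"
    OPENAI_API_KEY: "$OPENAI_API_KEY"
""")


def render_app_config(env=None):
    """Return the YAML text with the API key taken from the given environment mapping."""
    env = os.environ if env is None else env
    # A missing key raises KeyError here, before an incomplete file is written.
    return APP_TEMPLATE.substitute(OPENAI_API_KEY=env["OPENAI_API_KEY"])
```

You could then write the result with pathlib.Path("./data/config/company_news.yaml").write_text(render_app_config()) and keep the key out of version control.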
Test and deploy your configuration set with
koco deploy --dry-run --file ./data/config/config.yaml
koco deploy --run --file ./data/config/config.yaml
This applies your base configuration from ./data/config/config.yaml and adds a key applications with the records from ./data/config/company_news.yaml.
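To picture the result of this merge, here is a minimal sketch with plain dictionaries. It is not Kodosumi's implementation. The function name merge_configs is hypothetical, and the sketch assumes that each app configuration becomes one entry in the applications list of the Ray Serve config.

```python
import copy
import json


def merge_configs(base, apps):
    """Return a new Serve config: the base settings plus an `applications` list."""
    merged = copy.deepcopy(base)  # leave the base configuration untouched
    merged["applications"] = [copy.deepcopy(app) for app in apps]
    return merged


# Abbreviated versions of config.yaml and company_news.yaml as dictionaries.
base = {
    "proxy_location": "EveryNode",
    "http_options": {"host": "0.0.0.0", "port": 8001},
}
company_news = {
    "name": "company_news",
    "route_prefix": "/company_news",
    "import_path": "company_news.query:fast_app",
}

merged = merge_configs(base, [company_news])
print(json.dumps(merged, indent=2))
```

Keeping the base and the apps in separate files means that adding a service only adds one file and never touches the shared proxy and logging settings.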
With Ray, the spooler, and the app running, we now start the Kodosumi panel and register the Ray deployments
koco serve --register http://localhost:8001/-/routes
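If registration finds nothing, it can help to look at the route table yourself. The sketch below assumes that the /-/routes endpoint returns a JSON object keyed by route prefix, for example {"/company_news": "company_news"}. The function names fetch_routes and missing_routes are hypothetical.

```python
import json
from urllib.request import urlopen


def fetch_routes(url="http://localhost:8001/-/routes", timeout=5.0):
    """Fetch the route table from the Ray Serve proxy (needs a running cluster)."""
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def missing_routes(routes, expected):
    """Return the expected route prefixes that the route table does not contain."""
    return [prefix for prefix in expected if prefix not in routes]
```

For this guide, missing_routes(fetch_routes(), ["/company_news"]) should return an empty list once the deployment is up.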
See Configure Ray Serve Deployments for additional options on your deployment. You are advised to gain some experience with Ray core components before you roll out your services. Understand remote resource requirements and how to limit concurrency to avoid OOM issues.
Deployment API
The deployment API at /deploy and /serve is experimental.
Use the Kodosumi panel API to change your Ray serve deployments at runtime. The panel API ships with a simple CRUD interface to create, read, update, and delete deployment configurations, including the base configuration config.yaml.
The following Python snippets demonstrate API usage with the example service kodosumi_examples.prime.
import httpx
from pprint import pprint
# login
resp = httpx.get("http://localhost:3370/login?name=admin&password=admin")
cookies = resp.cookies
# retrieve Ray serve deployments status
resp = httpx.get("http://localhost:3370/deploy", cookies=cookies)
pprint(resp.json())
Let us first stop Ray serve and remove all existing deployments except the base configuration config.yaml before we deploy the prime service.
# retrieve active deployments
scope = httpx.get("http://localhost:3370/deploy", cookies=cookies)
for name in scope.json():
    # remove deployment
    print(name)
    resp = httpx.delete(f"http://localhost:3370/deploy/{name}", cookies=cookies)
    assert resp.status_code == 204
# stop Ray serve
resp = httpx.delete("http://localhost:3370/serve", cookies=cookies)
assert resp.status_code == 204
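In a script you may want explicit errors instead of assert statements, which are skipped when Python runs with -O. The following sketch wraps the same requests in a function. The name reset_deployments is hypothetical, and client stands for any object with httpx-style get() and delete() methods that already carries the login cookies, for example an httpx.Client used for the login request.

```python
def reset_deployments(client, base_url="http://localhost:3370"):
    """Delete every app configuration, then stop Ray serve.

    Returns the names of the removed deployments.
    """
    removed = []
    scope = client.get(f"{base_url}/deploy")
    scope.raise_for_status()
    for name in scope.json():
        resp = client.delete(f"{base_url}/deploy/{name}")
        if resp.status_code != 204:
            raise RuntimeError(f"cannot remove {name}: HTTP {resp.status_code}")
        removed.append(name)
    # Stop Ray serve only after all app configurations are gone.
    resp = client.delete(f"{base_url}/serve")
    if resp.status_code != 204:
        raise RuntimeError(f"cannot stop Ray serve: HTTP {resp.status_code}")
    return removed
```

Passing the client in keeps the function independent of how you log in and makes it easy to try against a stub before pointing it at a live panel.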
Verify that no deployments remain with GET /deploy and that the base configuration exists with GET /deploy/config.
# verify no deployments
resp = httpx.get("http://localhost:3370/deploy", cookies=cookies)
assert resp.json() == {}
# verify base configuration
resp = httpx.get("http://localhost:3370/deploy/config", cookies=cookies)
print(resp.content.decode())
This yields the content of the base configuration ./data/config/config.yaml, for example
proxy_location: EveryNode
http_options:
  host: 127.0.0.1
  port: 8001
grpc_options:
  port: 9001
  grpc_servicer_functions: []
logging_config:
  encoding: TEXT
  log_level: DEBUG
  logs_dir: null
  enable_access_log: true
If the base configuration does not exist and GET /deploy/config responds with 404 Not Found, create it, for example with
base = """
proxy_location: EveryNode
http_options:
  host: 127.0.0.1
  port: 8001
grpc_options:
  port: 9001
  grpc_servicer_functions: []
logging_config:
  encoding: TEXT
  log_level: DEBUG
  logs_dir: null
  enable_access_log: true
"""
resp = httpx.post("http://localhost:3370/deploy/config",
                  cookies=cookies,
                  content=base)
assert resp.status_code == 201
Deploy the prime service with the corresponding Ray serve configuration.
prime = """
name: prime
route_prefix: /prime
import_path: kodosumi_examples.prime.app:fast_app
runtime_env:
  py_modules:
  - https://github.com/masumi-network/kodosumi-examples/archive/2db907d955de65bed5dde6513f6359aeb18ebff1.zip
deployments:
  - name: PrimeDistribution
    num_replicas: auto
    ray_actor_options:
      num_cpus: 0.1
"""
resp = httpx.post("http://localhost:3370/deploy/prime",
                  cookies=cookies,
                  content=prime)
assert resp.status_code == 201
Verify the to-be deployment state of the prime service.
resp = httpx.get("http://localhost:3370/deploy", cookies=cookies)
assert resp.status_code == 200
assert resp.json() == {'prime': 'to-deploy'}
To request Ray serve to enter this state, POST /serve with
resp = httpx.post("http://localhost:3370/serve", cookies=cookies, timeout=30)
assert resp.status_code == 201
Mind the timeout, because Ray serve might take a while to respond.
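After the request returns, a script may want to confirm that nothing is left in the to-deploy state. The sketch below polls a status callable until that is the case. The function name wait_until_deployed is hypothetical, and the sketch assumes that the status reported by GET /deploy changes away from to-deploy once Ray serve has applied the configuration. Which status values follow is not covered here.

```python
import time


def wait_until_deployed(fetch_status, timeout=60.0, interval=2.0, pending="to-deploy"):
    """Poll fetch_status() until no deployment reports the pending state.

    fetch_status is a callable returning the dict from GET /deploy,
    for example {'prime': 'to-deploy'}.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status()
        if pending not in status.values():
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"still pending after {timeout}s: {status}")
        time.sleep(interval)
```

With the cookies from the login step, a call could look like wait_until_deployed(lambda: httpx.get("http://localhost:3370/deploy", cookies=cookies).json()).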