GitHub - etalab-ia/opengatellm-helm: Helm chart for albert-api deployment on kubernetes

This repository contains the Helm chart for deploying opengatellm and its components on Kubernetes.

Repository structure

The opengatellm-stack folder contains the Helm chart that deploys opengatellm and its components on Kubernetes.
The manifests folder contains an older version of the Helm chart, used for deployment on LaSuite.

Infrastructure provisioning

Create a Kubernetes cluster with the provider of your choice.
We recommend at least 3 nodes, including one with a GPU sized for the LLM you wish to use.
Verify that you can reach the cluster and that the nodes are available with kubectl get nodes.
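
As a rough check of the node recommendation above, the Ready nodes can be counted from the kubectl get nodes output. This is a minimal sketch: count_ready_nodes is a hypothetical helper name, and it counts only nodes whose STATUS is exactly Ready (a cordoned node shows Ready,SchedulingDisabled and is not counted).

```shell
# Hypothetical helper: count nodes whose STATUS column is exactly "Ready".
# Reads the output of `kubectl get nodes --no-headers` on stdin.
count_ready_nodes() {
  awk '$2 == "Ready" { n++ } END { print n + 0 }'
}

# Usage (requires access to the cluster):
#   kubectl get nodes --no-headers | count_ready_nodes
```

A result below 3 means the cluster is smaller than the size recommended above. The helper does not check whether any node exposes a GPU.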

Deployment

Customize the deployment in opengatellm-stack/values.yaml: for example, the tag of the API version to deploy, rate limiting, API keys for the deployed services (Redis, Elasticsearch, etc.), ports, and the hardware resources requested by each pod.
In opengatellm-stack/values-secret.yaml, replace the secrets and API keys with values of your choice. The install commands below refer to this file as values-secrets.yaml, so check which name your checkout uses.
Create a namespace for the deployment: kubectl create namespace opengatellm (optional, since the install commands below pass --create-namespace).
To deploy from source, install the Helm chart from inside the opengatellm-stack folder: helm install opengatellm-stack . --namespace opengatellm --create-namespace -f values-secrets.yaml -f values.yaml. If a key appears in both files, Helm keeps the value from the last -f file, so here values.yaml overrides values-secrets.yaml.
To deploy the published version, add the repo with helm repo add opengatellm https://etalab-ia.github.io/opengatellm-helm/, refresh it with helm repo update, then install it with helm install opengatellm-stack opengatellm/opengatellm-stack --namespace opengatellm --create-namespace -f values-secrets.yaml -f values.yaml
Monitor the deployment via the Kubernetes dashboard, or with a tool like k9s.
If some components don't start, or are stuck in "Pending", check why with kubectl describe pod <pod_name> -n opengatellm.
If they start but remain in error, check the logs with kubectl logs <pod_name> -n opengatellm.
The entire stack can take 10-15 minutes to deploy. vLLM usually takes the longest, depending on the model you are deploying.
The "opengatellm" deployment needs to be restarted once vLLM is up, so that it loads the vLLM model router.
Once all services are "Running", you can get the public IP of the load balancer with kubectl describe svc opengatellm -n opengatellm.
Use the value of LoadBalancer Ingress as the host when calling the API. The examples below use it as YOUR_LOAD_BALANCER_INGRESS_IP.
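
The troubleshooting and restart steps above can be sketched as two shell helpers. Both names are hypothetical. The sketch assumes the gateway deployment is literally named opengatellm in the opengatellm namespace, which may differ depending on your Helm release name.

```shell
# Hypothetical helpers for the last deployment steps; names are illustrative.

# Print "<pod> <status>" for every pod that is neither Running nor Completed.
# Reads the output of `kubectl get pods --no-headers` on stdin.
list_unhealthy_pods() {
  awk '$3 != "Running" && $3 != "Completed" { print $1, $3 }'
}

# Restart the gateway deployment and wait for the rollout to finish.
# Assumption: the deployment is named "opengatellm" in the "opengatellm"
# namespace; check with `kubectl get deployments -n opengatellm`.
restart_gateway() {
  ns="${1:-opengatellm}"
  deploy="${2:-opengatellm}"
  kubectl rollout restart "deployment/${deploy}" -n "$ns" &&
    kubectl rollout status "deployment/${deploy}" -n "$ns" --timeout=300s
}

# Usage (requires access to the cluster):
#   kubectl get pods -n opengatellm --no-headers | list_unhealthy_pods
#   restart_gateway
```

An empty result from list_unhealthy_pods suggests every pod is up, which is a reasonable moment to call restart_gateway.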

To list the available models, you can use the following command:

curl http://YOUR_LOAD_BALANCER_INGRESS_IP/v1/models \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer changeme"
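
Because the stack can take 10-15 minutes to come up, it can be convenient to poll the models endpoint until it answers. The loop below is a sketch, not part of the chart: wait_for_api and its arguments are hypothetical, and it treats any non-error HTTP response from /v1/models as "ready".

```shell
# Hypothetical helper: poll GET /v1/models until it succeeds or attempts run out.
# Arguments: base URL, API key, max attempts (default 30), delay in seconds (default 10).
wait_for_api() {
  base_url="$1"
  api_key="$2"
  attempts="${3:-30}"
  delay="${4:-10}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl exit non-zero on HTTP error responses (status >= 400).
    if curl -fsS -o /dev/null "${base_url%/}/v1/models" \
         -H "Authorization: Bearer ${api_key}"; then
      return 0
    fi
    # Sleep between attempts, but not after the last one.
    [ "$i" -lt "$attempts" ] && sleep "$delay"
    i=$((i + 1))
  done
  return 1
}

# Usage:
#   wait_for_api "http://YOUR_LOAD_BALANCER_INGRESS_IP" "changeme" && echo "API is up"
```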

To test the API, you can use the following command to send a chat completion request:

curl http://YOUR_LOAD_BALANCER_INGRESS_IP/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer changeme" \
  -d '{
    "model": "mistralai/Mistral-Small-3.1-24B-Instruct-2503",
    "messages": [
      {
        "role": "system",
        "content": "You are a helpful assistant."
      },
      {
        "role": "user",
        "content": "Qui es tu ?"
      }
    ]}'

Or send a request to the embeddings endpoint:

curl -X POST http://YOUR_LOAD_BALANCER_INGRESS_IP/v1/embeddings \
  -H "Accept: application/json" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer changeme" \
  -d '{
    "input": "Qui es tu ?",
    "model": "embeddings-small",
    "encoding_format": "float"
  }'
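
To avoid pasting the address and key into every command, the requests above can be wrapped in small helpers. This is a sketch under assumptions: get_lb_ip, ogl_curl, OGL_BASE_URL and OGL_API_KEY are names invented here, and the service is assumed to be called opengatellm in the opengatellm namespace. Some providers publish a hostname instead of an IP, in which case the ip field read below is empty.

```shell
# Hypothetical helpers; function and variable names are chosen for this sketch.

# Print the external IP of the service (empty if none has been assigned yet).
# Assumption: the service is named "opengatellm" in the "opengatellm" namespace.
get_lb_ip() {
  kubectl get svc "${1:-opengatellm}" -n "${2:-opengatellm}" \
    -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
}

# Call an API path with the bearer token; extra arguments are passed to curl.
# Reads OGL_BASE_URL and OGL_API_KEY from the environment.
ogl_curl() {
  path="$1"
  shift
  curl -sS "${OGL_BASE_URL%/}${path}" \
    -H "Content-Type: application/json" \
    -H "Authorization: Bearer ${OGL_API_KEY}" \
    "$@"
}

# Usage (requires access to the cluster):
#   export OGL_BASE_URL="http://$(get_lb_ip)"
#   export OGL_API_KEY="changeme"
#   ogl_curl /v1/models
```

Keeping the key in an environment variable also keeps it out of copied command lines.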
