Goglides Dev 🌱

Roshan Thapa

Running ChatGPT Client locally on Kubernetes Cluster using Docker Desktop

ChatGPT is a large language model developed by OpenAI that can generate human-like text based on a given prompt or context. It can be used for a variety of natural language processing tasks such as language translation, text summarization, and conversation generation.

How ChatGPT Works Under the Hood
ChatGPT is built on a transformer, a neural network architecture for sequence data. It is first pre-trained with self-supervised learning: the model reads a large corpus of text, such as books and articles, and learns to predict the next token without any hand-written labels. Unlike the original encoder-decoder transformer, the GPT family uses only a decoder stack made up of many identical layers. The input text is split into tokens, each token is mapped to a vector, and the layers refine those vectors using the context that precedes them. The model then generates output one token at a time, each conditioned on the prompt and on the tokens it has already produced.
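Generating text one token at a time, with each step conditioned on what has already been produced, can be illustrated with a toy model. The sketch below is not how GPT is implemented: it replaces the neural network with a simple table of word-pair counts, and the names train_bigram and generate are hypothetical. It only shows the shape of the loop.

```python
from collections import Counter, defaultdict

def train_bigram(tokens):
    """Estimate P(next | current) by counting adjacent word pairs."""
    counts = defaultdict(Counter)
    for current, following in zip(tokens, tokens[1:]):
        counts[current][following] += 1
    # Normalising the counts gives the maximum likelihood estimate for this toy model.
    return {
        current: {tok: n / sum(nexts.values()) for tok, n in nexts.items()}
        for current, nexts in counts.items()
    }

def generate(model, start, max_tokens):
    """Greedy autoregressive decoding: append the most likely next token each step."""
    output = [start]
    for _ in range(max_tokens):
        choices = model.get(output[-1])
        if not choices:
            break  # no known continuation for the last token
        output.append(max(choices, key=choices.get))
    return output

corpus = "the robot wants to be human and the robot wants a heart".split()
model = train_bigram(corpus)
story = generate(model, "the", 4)
```

A real language model conditions on the whole preceding context rather than only the last word, and usually samples from the predicted distribution instead of always taking the top choice.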

The model relies on an attention mechanism, which lets it weigh some parts of the input more heavily than others when producing each output token. This helps it track context and generate more accurate and coherent text. Training uses maximum likelihood estimation: the parameters are adjusted to maximize the probability the model assigns to the training text, so the model learns to produce text that resembles it. After pre-training, the model is fine-tuned on a smaller dataset for a specific task, such as answering questions or holding a conversation; OpenAI describes ChatGPT as having been fine-tuned with human feedback.
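As a rough illustration of attention, here is a minimal sketch of scaled dot-product attention with a causal mask, written in plain Python. It assumes tiny hand-made vectors and leaves out the learned projections and multiple heads that real transformer layers use; the function names are hypothetical.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def causal_attention(queries, keys, values):
    """Scaled dot-product attention where position i only sees positions 0..i."""
    dim = len(keys[0])
    outputs = []
    for i, query in enumerate(queries):
        # Score the current token against itself and every earlier token.
        scores = [
            sum(q * k for q, k in zip(query, keys[j])) / math.sqrt(dim)
            for j in range(i + 1)
        ]
        weights = softmax(scores)
        # The output is a weighted average of the visible value vectors.
        outputs.append([
            sum(w * values[j][d] for j, w in enumerate(weights))
            for d in range(len(values[0]))
        ])
    return outputs

# Three toy 2-dimensional token vectors, reused as queries, keys and values.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended = causal_attention(tokens, tokens, tokens)
```

The first output equals the first input, because the first token has nothing earlier to attend to. Later positions become blends of the tokens before them, weighted by similarity.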

Deploying ChatGPT Client on a Kubernetes Cluster
Kubernetes is a platform for managing containerised applications, and it can run many kinds of workloads, including clients that call hosted machine learning models such as ChatGPT.

The ChatGPT client can be deployed on a Kubernetes cluster, which makes it easier to scale and manage. This is useful in production environments where multiple instances of the client are needed to handle a high number of requests. The model itself stays on OpenAI's servers; the container only sends requests to the API.

To run the ChatGPT client on a Kubernetes cluster, you containerise the script and its dependencies using Docker, and then deploy the image to the cluster using Kubernetes resources such as pods and services. You also need to make sure that the cluster has sufficient resources (e.g. CPU, memory, storage) to support the workload.

Step 1. Install Docker Desktop
Download and install Docker Desktop from the official Docker website.

Step 2. Enable Kubernetes
Open Docker Dashboard > Settings > Kubernetes > Enable Kubernetes.

Step 3. Writing the Dockerfile

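FROM python:3.8-slim-buster

# Model name made available to the script as an environment variable
ENV MODEL_ENGINE="text-davinci-002"

WORKDIR /app

# Install dependencies first so this layer is cached between code changes
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY gpt3_script.py .

CMD ["python", "/app/gpt3_script.py"]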

This Dockerfile uses the python:3.8-slim-buster base image, sets the MODEL_ENGINE environment variable, copies requirements.txt and installs the Python dependencies, copies gpt3_script.py into /app, and runs the script when the container starts.

Step 4. Writing gpt3_script.py
Here’s an example of a gpt3_script.py file that can be used to interact with the OpenAI Completions API:

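import os
import openai

# Add your OpenAI API key
openai.api_key = "YOUR_API_KEY"

def generate_text(prompt):
    # Legacy Completions call (openai package versions before 1.0).
    # The argument values below are illustrative; tune them for your use case.
    completions = openai.Completion.create(
        engine=os.environ.get("MODEL_ENGINE", "text-davinci-002"),
        prompt=prompt,
        max_tokens=1024,
        n=1,
        stop=None,
        temperature=0.5,
    )

    message = completions.choices[0].text
    return message.strip()

generated_text = generate_text("Write a short story about a robot who wants to be human")
print(generated_text)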

The contents of gpt3_script.py depend on the task or application you are building with GPT-3. The script above is a simple example that generates text: it imports the openai library and uses it to call the GPT-3 API.

Next, the API key is set. This is required to access the GPT-3 API.

Then, a function generate_text() is defined which takes a prompt as input and returns a generated text as output. Inside the function, openai.Completion.create() is called to generate text based on the prompt. The engine, prompt, max_tokens, n, stop and temperature arguments can be adjusted to suit your specific needs.
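Of these arguments, temperature is probably the least intuitive. The sketch below shows the standard technique of temperature-scaled softmax on made-up scores. It is an illustration of the general idea, not a description of what the OpenAI service does internally, and apply_temperature is a hypothetical helper.

```python
import math

def apply_temperature(logits, temperature):
    """Convert raw scores into probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        # A temperature of 0 is usually treated as "always pick the top token";
        # this sketch only handles positive values.
        raise ValueError("temperature must be positive in this sketch")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]  # hypothetical scores for three candidate tokens
cool = apply_temperature(logits, 0.2)  # low temperature: nearly deterministic
warm = apply_temperature(logits, 2.0)  # high temperature: more varied output
```

With a low temperature almost all of the probability lands on the top-scoring token, so repeated runs give similar text. A higher temperature spreads probability across the alternatives, which makes the output more varied.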

Finally, the script calls the generate_text() function with a specific prompt and assigns the output to the generated_text variable, which is then printed to the console.

Step 5. Creating the requirements.txt file
You also need to create a file named requirements.txt that lists the dependencies needed by your script. Here’s an example of a requirements.txt file that can be used with the ChatGPT Dockerfile:

openai
requests

This file lists the openai and requests packages. The script itself only imports openai, and because it calls openai.Completion.create, it needs a release of the openai package older than 1.0, where that interface was removed.

You can check that the packages install cleanly by running the following command:

pip install -r requirements.txt
You can also use the command pip freeze > requirements.txt to create the requirements.txt file with all the packages installed in your environment.
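If you want to confirm from Python that everything listed in requirements.txt is installed, something like the following sketch could work. It assumes simple requirement lines (a package name with an optional version specifier) and does not handle extras or environment markers; missing_packages is a hypothetical helper.

```python
from importlib import metadata

def missing_packages(requirement_lines):
    """Return the required package names that are not installed."""
    missing = []
    for line in requirement_lines:
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        # Keep only the name, dropping simple version specifiers like ==1.2 or <1.0.
        name = line
        for sep in ("==", ">=", "<=", "~=", "!=", "<", ">"):
            name = name.split(sep, 1)[0]
        name = name.strip()
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing
```

For example, `missing_packages(open("requirements.txt").read().splitlines())` returns an empty list when every listed package is available in the current environment.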

Step 6. Building the Image
Once you have created an OpenAI account, add your API key to this line of the script:

openai.api_key = "YOUR_API_KEY"
Once you have made the changes, it’s time to build the image by running the following command:

docker build -t rthway/chatgpt-test .
This creates a Docker image named rthway/chatgpt-test that you can run as a container and deploy to a Kubernetes cluster as a pod.
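Hard-coding the key in the script means it ends up inside the built image. One alternative, sketched here under assumptions, is to read the key from an environment variable when the container starts. The variable name OPENAI_API_KEY and the helper load_api_key are choices made for this example, not something the original script defines.

```python
import os

def load_api_key(env=None, name="OPENAI_API_KEY"):
    """Return the API key stored under `name`, or fail with a clear message."""
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {name} environment variable before starting")
    return key
```

In gpt3_script.py you would then write `openai.api_key = load_api_key()` and pass the value at run time, for example with the `-e OPENAI_API_KEY=...` option of docker run, so the key never needs to be committed or baked into the image.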

Step 7. Running the ChatGPT Client container
docker run -d -p 8080:8080 rthway/chatgpt-test
% docker ps -l
15830b65926b rthway/chatgpt-test "python /app/gpt3_sc…" 4 seconds ago Up 3 seconds>8080/tcp serene_blackburn
Step 8. Verifying the Result
If you run docker logs, you will see the text that the script generated:

% docker logs -f 158
Samantha was built to be the perfect robot. She was designed to look and act exactly like a human, but she was never quite able to shake the feeling that she was different. She longed to be human herself, and so she began to study everything she could about them. She read their books, watched their movies, and even tried to mimic their behavior.

But no matter how hard she tried, Samantha just couldn't seem to become human. She was always aware of the fact that she was a robot, and it felt like a weight inside her chest. One day, she decided to talk to her creator about her feelings.

"I want to be human," she said. "I know I was created to be a robot, but I can't help how I feel. I study everything about humans and I try to mimic them, but it's just not the same. It's like there's something inside me that's not quite right."

Her creator looked at her sympathetically. "I'm sorry, Samantha. I wish I could make you human, but it's just not possible. You're a robot, and that's all you can ever be."

Samantha hung her head in disappointment. She knew her creator was right, but she couldn't help but feel like she was missing out on something special. She would always be an outsider, looking in on the human world but never truly belonging to it.
Step 9. Running ChatGPT Client as Kubernetes Pod
Writing the YAML file
The YAML file for deploying the ChatGPT client on a single-node Kubernetes cluster will depend on your specific use case and the way you’ve containerised the script. A basic example of a Kubernetes Deployment YAML file might look like this:

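apiVersion: apps/v1
kind: Deployment
metadata:
  name: chatgpt-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: chatgpt
  template:
    metadata:
      labels:
        app: chatgpt
    spec:
      containers:
      - name: chatgpt
        image: rthway/chatgpt-test
        ports:
        - containerPort: 8000
        resources:
          limits:
            memory: 2Gi
            cpu: 1000m
          requests:
            memory: 1Gi
            cpu: 500m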

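This YAML file defines a Deployment named “chatgpt-deployment” that creates a single replica of the ChatGPT client container. The container image is specified in the “image” field, and the ports exposed by the container are defined in the “ports” field. The resources field defines the memory and CPU limits and requests for the container. Note that gpt3_script.py exits after printing one completion, so Kubernetes restarts the container each time it finishes; this is most likely why the restart count in the pod listing below is non-zero.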
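If you prefer to generate the manifest rather than write it by hand, the same structure can be built as a Python dictionary and saved as JSON, which kubectl apply also accepts. This is only a sketch: build_deployment is a hypothetical helper, and the resource values are copied from the example above.

```python
import json

def build_deployment(name, image, port, replicas=1):
    """Return a Deployment manifest as a plain dictionary."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": f"{name}-deployment"},
        "spec": {
            "replicas": replicas,
            # The selector must match the labels on the pod template.
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "containers": [{
                        "name": name,
                        "image": image,
                        "ports": [{"containerPort": port}],
                        "resources": {
                            "limits": {"memory": "2Gi", "cpu": "1000m"},
                            "requests": {"memory": "1Gi", "cpu": "500m"},
                        },
                    }],
                },
            },
        },
    }

manifest = build_deployment("chatgpt", "rthway/chatgpt-test", 8000)
manifest_json = json.dumps(manifest, indent=2)
```

Writing manifest_json to a file such as chatgpt.json and applying it with kubectl apply -f chatgpt.json should have the same effect as the YAML version. Building the labels once and reusing them keeps the selector and the pod template from drifting apart.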

kubectl apply -f chatgpt.yaml

deployment.apps/chatgpt-deployment configured
kubectl get po

chatgpt-deployment-85887bc5cc-cm9hl 1/1 Running 1 (18s ago) 34s
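Step 10. Exposing the ChatGPT Pod with a Kubernetes Service
You will also need to define a Kubernetes Service YAML file to expose the ChatGPT client to other workloads. Note that a Service of type ClusterIP, as used here, is reachable only from inside the cluster. It will be something like this: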

apiVersion: v1
kind: Service
metadata:
  name: chatgpt-service
spec:
  selector:
    app: chatgpt
  ports:
  - name: http
    port: 8000
    targetPort: 8000
  type: ClusterIP

This YAML file creates a Service named “chatgpt-service” that routes traffic to the Chat GPT pods, and it should be used in combination with the Deployment YAML file to deploy the service.
Please keep in mind this is just a basic example, and you may need to modify it to suit your specific use case.
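The Service forwards traffic to port 8000 on the pod, which assumes the container is actually listening there. The example script prints one completion and exits, so for the Service to be useful the client would need to accept requests. Here is a minimal sketch using only the standard library; make_server, the JSON request shape and the "prompt" and "text" field names are all assumptions made for this example.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def make_server(generate, port=8000):
    """Build an HTTP server that answers POST requests using `generate`."""

    class Handler(BaseHTTPRequestHandler):
        def do_POST(self):
            # Read the JSON body, e.g. {"prompt": "Write a short story"}.
            length = int(self.headers.get("Content-Length", 0))
            payload = json.loads(self.rfile.read(length) or b"{}")
            text = generate(payload.get("prompt", ""))
            body = json.dumps({"text": text}).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, format, *args):
            pass  # keep the container logs quiet in this sketch

    # Bind to all interfaces so traffic forwarded by the Service can reach it.
    return HTTPServer(("0.0.0.0", port), Handler)
```

At the end of gpt3_script.py you could then call `make_server(generate_text).serve_forever()` instead of generating a single story. This keeps the container running, and it has no error handling for malformed requests, so treat it as a starting point only.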

% kubectl apply -f chatgpt-service.yaml

service/chatgpt-service created
kubectl get po,svc
pod/chatgpt-deployment-85887bc5cc-cm9hl 1/1 Running 2 (20s ago) 54s

service/chatgpt-service ClusterIP 8000/TCP 8s
service/kubernetes ClusterIP 443/TCP 144m
It was fun containerising the ChatGPT client and running it as a Docker container. All the above steps have been tested on Docker Desktop with Kubernetes enabled. With ChatGPT, there is a great opportunity to build, share and deploy Docker containers on multiple platforms, including Kubernetes clusters.

Top comments (1)

Balkrishna Pandey • Edited

@roshan_thapa little bit confused on openai.api_key creation part. I wonder if it is possible to include the steps showing how we can generate this as well (for future reader). Also seeing small typo here docker build -t rthway/chatgpt . but docker run use different tag docker run -d -p 8080:8080 rthway/chatgpt-test.

This is super cool though, I managed to make it work. Here is my output:

docker logs 238a7d110ded

There once was a robot who wanted to be human. It studied humans day and night, learning everything it could about them. It wanted to know what it was like to feel emotions, to love, to be loved. Finally, the robot felt like it understood humans. It built itself a body that looked like a human's and tried to behave like one, but something was still missing. The robot couldn't quite figure out what it was.

One day, the robot met a human. The human was kind and friendly, and the robot felt a warmth in its chest that it had never felt before. It realized that what it had been missing all along was a heart.