Goglides Dev 🌱

Balkrishna Pandey

Persisting Python Packages in OpenShift AI Workbench with Custom PIP_TARGET and PYTHONPATH

One common challenge when working with containerized Python environments, such as the OpenShift AI Workbench, is ensuring that your installed packages persist across sessions. Normally, when a container restarts, all the installed packages that were not part of the original image are lost. This can be frustrating, especially in development workflows that rely on a specific set of packages.

Fortunately, there is a solution that allows for persistence of Python packages by leveraging the PIP_TARGET and PYTHONPATH environment variables.

The Challenge

In OpenShift AI Workbench, every launch creates a fresh environment. Any libraries or packages installed via pip are lost when the workbench is stopped or restarted, wasting time and computational resources on re-installing the same packages every session.

The Solution

By setting the PIP_TARGET and PYTHONPATH environment variables, we can direct pip to install packages into a specified directory that persists across restarts and configure Python to recognize this directory when importing packages.
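The mechanism behind this can be sketched in plain Python before wiring it into the workbench: any directory placed on sys.path (which is exactly what PYTHONPATH does) becomes a place Python searches for imports. In the sketch below a throwaway temp directory stands in for the persistent path, and the hello_persist module is created by the example itself:

```python
import os
import sys
import tempfile

# Simulate the effect of PYTHONPATH: any directory on sys.path is
# searched when Python resolves imports. A temp directory stands in
# for the persistent target directory here.
target = tempfile.mkdtemp()
with open(os.path.join(target, "hello_persist.py"), "w") as f:
    f.write("GREETING = 'loaded from a custom target directory'\n")

sys.path.insert(0, target)  # same effect as PYTHONPATH=<target>
import hello_persist

print(hello_persist.GREETING)  # -> loaded from a custom target directory
```

Setting PYTHONPATH before the interpreter starts achieves the same thing as the sys.path.insert call, without touching any code.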



The PIP_TARGET variable tells pip where to install the packages. Rather than installing to the global site-packages directory, pip will install to the specified directory, which, in the case of OpenShift AI Workbench, should be within the persistent folder. For this environment, we're utilizing /opt/app-root/src, a directory that remains intact across workbench restarts. Set this to /opt/app-root/src/.pip or another subdirectory within /opt/app-root/src.


Once PIP_TARGET directs pip to install packages into our persistent directory, PYTHONPATH comes into play. Setting PYTHONPATH to the same directory ensures that Python can find and import these packages at runtime, even after a restart or redeployment. Set this to the same value as PIP_TARGET to ensure Python looks here for your custom-installed packages.
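Concretely, both variables point at the same persistent subdirectory. In OpenShift AI these are typically entered in the workbench's environment-variable settings rather than a shell profile, so treat the exports below as an illustrative sketch:

```shell
# Install target for pip and search path for Python; both point
# at a subdirectory of the persistent /opt/app-root/src volume.
export PIP_TARGET=/opt/app-root/src/.pip
export PYTHONPATH=/opt/app-root/src/.pip
```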

Mount an Extra Volume (Optional)

If you require additional storage or want to separate your package storage from other workbench data, you can mount an extra volume in OpenShift and point PIP_TARGET to a directory on that volume. This provides you with more control over your environment and can be particularly useful when working with large packages or datasets.
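A minimal PersistentVolumeClaim for such a dedicated volume might look like the following; the name and size are placeholders, and in practice the volume is usually attached through the OpenShift AI workbench settings rather than raw YAML:

```yaml
# Hypothetical PVC for a dedicated package volume.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: pip-packages
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
```

Once the volume is mounted into the workbench, point PIP_TARGET (and PYTHONPATH) at a directory on its mount path.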

Now restart the workbench; packages installed into the persistent directory will survive future restarts.

Testing the Setup

To confirm this works, install Faker, a handy library for generating fake data:

pip install faker

Check the target directory to ensure that the Faker library files are present:

ls /opt/app-root/src/.pip/faker/
ls /opt/app-root/src/.pip/Faker-24.11.0.dist-info/

Create a simple Python script to test the importing of both a standard library (json) and your custom-installed package (Faker):

import json
from faker import Faker

def test_standard_and_custom_packages():
    fake = Faker()

    # Generate fake data using Faker
    user_data = {
        "name": fake.name(),
        "email": fake.email(),
        "address": fake.address(),
    }

    # Convert the dictionary to a JSON string
    user_data_json = json.dumps(user_data, indent=4)

    # Print the JSON data
    print("Generated Fake User Data in JSON Format:")
    print(user_data_json)

test_standard_and_custom_packages()

Run the script. If the environment is set up correctly, you should be able to import the Faker library from the custom path and the script should output something similar to the following:

Generated Fake User Data in JSON Format:
{
    "name": "Tony Mejia",
    "email": "[email protected]",
    "address": "2156 Torres Keys\nCarolynburgh, ND 38868"
}
