Goglides Dev 🌱

Balkrishna Pandey


Red Hat OpenShift AI: CERTIFICATE_VERIFY_FAILED during Elyra pipeline run

I encountered a challenging issue in my recent work with Red Hat OpenShift AI projects, specifically while setting up an Elyra pipeline. An error popped up as I attempted to run the pipeline from JupyterLab, halting the process.

The Problem

I was greeted with a rather daunting error message when initiating the pipeline. The core of the issue was SSL certificate verification: as the traceback below shows, verification failed because of a self-signed certificate in the certificate chain.

SSL Error

Traceback (most recent call last):
  File "/opt/app-root/lib64/python3.9/site-packages/urllib3/connectionpool.py", line 714, in urlopen
    httplib_response = self._make_request(
  File "/opt/app-root/lib64/python3.9/site-packages/urllib3/connectionpool.py", line 403, in _make_request
  File "/opt/app-root/lib64/python3.9/site-packages/urllib3/connectionpool.py", line 1053, in _validate_conn
  File "/opt/app-root/lib64/python3.9/site-packages/urllib3/connection.py", line 419, in connect
    self.sock = ssl_wrap_socket(
  File "/opt/app-root/lib64/python3.9/site-packages/urllib3/util/ssl_.py", line 449, in ssl_wrap_socket
    ssl_sock = _ssl_wrap_socket_impl(
  File "/opt/app-root/lib64/python3.9/site-packages/urllib3/util/ssl_.py", line 493, in _ssl_wrap_socket_impl
    return ssl_context.wrap_socket(sock, server_hostname=server_hostname)
  File "/usr/lib64/python3.9/ssl.py", line 501, in wrap_socket
    return self.sslsocket_class._create(
  File "/usr/lib64/python3.9/ssl.py", line 1041, in _create
  File "/usr/lib64/python3.9/ssl.py", line 1310, in do_handshake
ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self-signed certificate in certificate chain (_ssl.c:1129)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/app-root/lib64/python3.9/site-packages/elyra/pipeline/kfp/processor_kfp.py", line 210, in process
    client = TektonClient(
  File "/opt/app-root/lib64/python3.9/site-packages/kfp/_client.py", line 196, in __init__
    if not self._context_setting['namespace'] and self.get_kfp_healthz(
  File "/opt/app-root/lib64/python3.9/site-packages/kfp/_client.py", line 410, in get_kfp_healthz
    response = self._healthz_api.get_healthz()
  File "/opt/app-root/lib64/python3.9/site-packages/kfp_server_api/api/healthz_service_api.py", line 63, in get_healthz
    return self.get_healthz_with_http_info(**kwargs)  # noqa: E501
  File "/opt/app-root/lib64/python3.9/site-packages/kfp_server_api/api/healthz_service_api.py", line 134, in get_healthz_with_http_info
    return self.api_client.call_api(
  File "/opt/app-root/lib64/python3.9/site-packages/kfp_server_api/api_client.py", line 364, in call_api
    return self.__call_api(resource_path, method,
  File "/opt/app-root/lib64/python3.9/site-packages/kfp_server_api/api_client.py", line 181, in __call_api
    response_data = self.request(
  File "/opt/app-root/lib64/python3.9/site-packages/kfp_server_api/api_client.py", line 389, in request
    return self.rest_client.GET(url,
  File "/opt/app-root/lib64/python3.9/site-packages/kfp_server_api/rest.py", line 230, in GET
    return self.request("GET", url,
  File "/opt/app-root/lib64/python3.9/site-packages/kfp_server_api/rest.py", line 208, in request
    r = self.pool_manager.request(method, url,
  File "/opt/app-root/lib64/python3.9/site-packages/urllib3/request.py", line 74, in request
    return self.request_encode_url(
  File "/opt/app-root/lib64/python3.9/site-packages/urllib3/request.py", line 96, in request_encode_url
    return self.urlopen(method, url, **extra_kw)
  File "/opt/app-root/lib64/python3.9/site-packages/urllib3/poolmanager.py", line 376, in urlopen
    response = conn.urlopen(method, u.request_uri, **kw)
  File "/opt/app-root/lib64/python3.9/site-packages/urllib3/connectionpool.py", line 826, in urlopen
    return self.urlopen(
  File "/opt/app-root/lib64/python3.9/site-packages/urllib3/connectionpool.py", line 826, in urlopen
    return self.urlopen(
  File "/opt/app-root/lib64/python3.9/site-packages/urllib3/connectionpool.py", line 826, in urlopen
    return self.urlopen(
  File "/opt/app-root/lib64/python3.9/site-packages/urllib3/connectionpool.py", line 798, in urlopen
    retries = retries.increment(
  File "/opt/app-root/lib64/python3.9/site-packages/urllib3/util/retry.py", line 592, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='ds-pipeline-pipelines-definition-finetune.apps.cloud9c.xtoph156.dfw.ocp.run', port=443): Max retries exceeded with url: /apis/v1beta1/healthz (Caused by SSLError(SSLCertVerificationError(1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: self-signed certificate in certificate chain (_ssl.c:1129)')))

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/opt/app-root/lib64/python3.9/site-packages/tornado/web.py", line 1786, in _execute
    result = await result
  File "/opt/app-root/lib64/python3.9/site-packages/elyra/pipeline/handlers.py", line 160, in post
    response = await PipelineProcessorManager.instance().process(pipeline)
  File "/opt/app-root/lib64/python3.9/site-packages/elyra/pipeline/processor.py", line 105, in process
    res = await asyncio.get_event_loop().run_in_executor(None, processor.process, pipeline)
  File "/usr/lib64/python3.9/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/opt/app-root/lib64/python3.9/site-packages/elyra/pipeline/kfp/processor_kfp.py", line 238, in process
    raise RuntimeError(
RuntimeError: Failed to initialize `kfp.Client()` against: 'https://ds-pipeline-pipelines-definition-finetune.apps.cloud9c.xtoph156.dfw.ocp.run' - Check Kubeflow Pipelines runtime configuration: 'odh_dsp' - [TIP: did you mean to set 'https://ds-pipeline-pipelines-definition-finetune.apps.cloud9c.xtoph156.dfw.ocp.run/pipeline' as the endpoint, take care not to include 's' at end]

The traceback shows three chained exceptions. The TLS handshake in the ssl module raises an SSLCertVerificationError, urllib3 retries the request and gives up with a MaxRetryError, and Elyra finally reports a RuntimeError saying it could not initialize kfp.Client(). This was perplexing, as my configuration appeared to be set up correctly.
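When a traceback is this long, it can help to jump straight to the innermost exception. The following is a minimal sketch, not part of Elyra or urllib3: root_cause and simulate_failure are hypothetical helper names, and the simulated chain only uses standard-library exceptions to imitate the shape of the traceback above.

```python
import ssl


def root_cause(exc: BaseException) -> BaseException:
    """Follow __cause__ / __context__ links down to the innermost exception."""
    seen = set()
    while id(exc) not in seen:
        seen.add(id(exc))
        # Prefer an explicit "raise ... from ..." link, else the implicit context.
        nxt = exc.__cause__ or exc.__context__
        if nxt is None:
            break
        exc = nxt
    return exc


def simulate_failure() -> None:
    """Hypothetical stand-in that imitates the three-level chain in the traceback."""
    try:
        try:
            raise ssl.SSLCertVerificationError(
                1, "certificate verify failed: self-signed certificate in certificate chain"
            )
        except ssl.SSLError:
            # Raised while handling the SSL error, so it becomes the implicit context.
            raise ConnectionError("Max retries exceeded")
    except ConnectionError as err:
        # Explicit chaining, like "The above exception was the direct cause ...".
        raise RuntimeError("Failed to initialize client") from err


if __name__ == "__main__":
    try:
        simulate_failure()
    except RuntimeError as top:
        print(type(root_cause(top)).__name__)
```

Applied to the real failure, the same walk would end at the certificate verification error rather than at the RuntimeError that Elyra surfaces.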


The Cause

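After some investigation, I realized that the root cause was the cluster's SSL certificate. My cluster was unsecured, meaning its certificate chain includes a self-signed certificate, as the error message says. When data science pipelines are set up on such a cluster, the client in the workbench cannot verify the pipeline server's certificate, and the run often fails with this kind of SSL error.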
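One way to check this diagnosis from a notebook is to attempt the TLS handshake yourself, once with the default trust store and once with a specific CA bundle. This is a rough sketch using only the Python standard library. build_context and handshake_error are hypothetical helper names, and the host and CA file you pass in are your own values, not something this snippet can know.

```python
import socket
import ssl
from typing import Optional


def build_context(cafile: Optional[str] = None) -> ssl.SSLContext:
    """Create a verifying TLS context.

    With cafile=None the system trust store is used; otherwise only the
    certificates in the given PEM bundle are trusted.
    """
    return ssl.create_default_context(cafile=cafile)


def handshake_error(
    host: str,
    port: int = 443,
    cafile: Optional[str] = None,
    timeout: float = 5.0,
) -> Optional[str]:
    """Return None if the certificate verifies, else the verification error text.

    Other failures (DNS errors, refused connections, timeouts) are not caught
    and propagate to the caller.
    """
    context = build_context(cafile)
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host):
                return None
    except ssl.SSLCertVerificationError as err:
        return str(err)
```

If handshake_error returns a message for the pipeline endpoint with the default trust store but returns None when given the cluster's CA bundle, that would support the explanation above. It is only a diagnostic and does not change how Elyra connects.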

The Workaround

The solution to this problem was surprisingly straightforward. You must manually set an environment variable before creating the workbench on the cluster. This variable, PIPELINES_SSL_SA_CERTS, should point to the location of your service account's CA certificate, typically /var/run/secrets/kubernetes.io/serviceaccount/ca.crt.
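Before running the pipeline again, it may be worth confirming from a notebook cell that the variable actually reached the workbench and points at a real file. The sketch below assumes nothing beyond the variable name given above; ca_bundle_from_env is a hypothetical helper, not an Elyra or OpenShift AI function.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

# Variable name taken from the article; everything else here is illustrative.
ENV_VAR = "PIPELINES_SSL_SA_CERTS"


def ca_bundle_from_env(env: Mapping[str, str] = os.environ) -> Optional[Path]:
    """Return the CA bundle path named by ENV_VAR, or None if unset or not a file."""
    value = env.get(ENV_VAR)
    if not value:
        return None
    path = Path(value)
    return path if path.is_file() else None


if __name__ == "__main__":
    bundle = ca_bundle_from_env()
    if bundle is None:
        print(f"{ENV_VAR} is unset or does not point to an existing file")
    else:
        print(f"{ENV_VAR} points to {bundle}")
```

A None result suggests the variable was not defined for the workbench or the path is wrong, which would be worth fixing before looking for other causes.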


Setting this environment variable proved to be the key. Once it was in place, the pipeline ran smoothly without SSL certificate errors.

Pipeline Successful Run

