Recently, I encountered a problem where one of the nodes in my OpenShift cluster could not pull container images. In this blog post, I will explain the error messages observed in the pod logs and share my steps to resolve the issue.
On the affected node, the pod logs consistently showed the following error message:
5gc/amf:v1" already present on machine
To troubleshoot and resolve this issue, I followed these steps:
I also tried to pull the image manually with the podman pull command, but I hit the same error. This suggested that the problem was not limited to the automated image-pulling process in the OpenShift cluster; it was related to the image itself or to the container storage on the node where the pulls were happening.
Preventing pod scheduling on the affected node:
I used the command
oc adm drain NODENAME --ignore-daemonsets --delete-local-data --disable-eviction --force to stop scheduling pods on the problematic node.
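The drain step can be wrapped in a small helper so the node name is always supplied explicitly. This is only a sketch; drain_node is a hypothetical function name, and NODENAME is a placeholder for whatever oc get nodes reports for the affected node:

```shell
# Sketch: wrapper around the drain command used above.
# drain_node is a hypothetical helper, not part of oc itself.
drain_node() {
  node="$1"
  if [ -z "$node" ]; then
    echo "usage: drain_node NODENAME" >&2
    return 1
  fi
  # Evict everything except daemonsets and mark the node unschedulable
  oc adm drain "$node" --ignore-daemonsets --delete-local-data --disable-eviction --force
}
```

The guard makes it harder to accidentally run the drain with an empty node name, which would otherwise produce a confusing client-side error.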
Stopping services on the affected node:
I SSHed into the node and issued the following commands to stop the necessary services:
systemctl stop crio to stop the CRI-O service.
systemctl stop kubelet to stop the kubelet service.
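The two service stops above can be sketched as a short loop. The DRY_RUN guard is my own addition, not something the node requires; with DRY_RUN=1 the commands are only printed, so the sequence can be previewed before running it as root over SSH:

```shell
# Sketch of the service stops above; stop_node_services is a hypothetical helper.
# DRY_RUN=1 prints the commands instead of executing them.
stop_node_services() {
  for svc in crio kubelet; do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "would run: systemctl stop $svc"
    else
      systemctl stop "$svc"
    fi
  done
}

DRY_RUN=1 stop_node_services
```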
Resetting the Podman configuration:
I attempted to reset the Podman configuration using the command
podman system reset -f. However, I encountered an error indicating that certain images were still in use by containers.
Manually deleting the container data:
I resolved this issue by manually deleting the
/var/lib/containers/ directory, which contained the container data. In my case, I had to remove the entire directory, including the "overlay" subdirectory.
When attempting to delete the "overlay" folder, I encountered a
"Device or resource busy" error. To address this, I used the command
umount overlay to unmount the overlay filesystem. Then, I executed
crio wipe -f to wipe CRI-O's container storage state.
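The cleanup steps above can be sketched as a single guarded script, ordered so the unmount happens before the removal (which is what avoids the "Device or resource busy" error). The path /var/lib/containers/storage/overlay is an assumption here: it is the default overlay storage location for CRI-O and Podman, so verify your actual mount with mount | grep overlay first. The run/DRY_RUN pattern and the wipe_container_storage name are my own additions for safe previewing:

```shell
# run: execute a command, or just print it when DRY_RUN=1 (hypothetical helper)
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "would run: $*"
  else
    "$@"
  fi
}

# Sketch of the manual cleanup; the overlay path is the default storage
# location and should be verified on the node before running for real.
wipe_container_storage() {
  run umount /var/lib/containers/storage/overlay
  run rm -rf /var/lib/containers
  run crio wipe -f
}

DRY_RUN=1 wipe_container_storage
```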
After restarting the crio and kubelet services and uncordoning the node with oc adm uncordon NODENAME, the node could pull images again. This fixed the issue.