In a previous project, I automated the process of building base Docker images for machine learning projects. In this post, I will explain the approach I took to automate building base images for computer vision projects.
OpenCV is a library commonly used for computer vision tasks. You can run `pip install opencv-contrib-python` to download the latest Python packages, and this is adequate for most learning tasks. However, to utilise some of its advanced features, such as the `dnn` module which lets you import and run pre-trained models from other frameworks, you need to compile OpenCV manually. This also involves compiling it against the required CUDA/cuDNN libraries.
The overall criteria for the build process become:

- The image needs to support both the Python bindings and the C++ libraries.
- The image needs the required CUDA/cuDNN libraries installed in order to use the `dnn` module.
- The image needs to be as compact as possible, including only the required libraries, due to size constraints.
I decided to split the images into 2 categories: the plain CPU build and the GPU build.
The GPU build process is a multi-stage build. It is specified in a separate Dockerfile and makes use of the following NVIDIA CUDA container images:
- `nvidia/cuda:${CUDA}-cudnn8-devel-ubuntu${UBUNTU}`
- `nvidia/cuda:${CUDA}-cudnn8-runtime-ubuntu${UBUNTU}`
According to the documentation, the NVIDIA images are organized into the following categories:

- `base`: the base image that the other image types build on.
- `devel`: for development, as it contains the compilers and libraries needed to build applications. Images in this category are large; for example, `11.8.0-cudnn8-devel-ubuntu22.04` is 8.99 GB uncompressed.
- `runtime`: for deploying compiled applications. This image contains only the required libraries, without the compilation tools of the devel images, making it smaller.
The initial build stage starts from the cudnn8-devel image, which includes the cuDNN libraries. This stage (sketched below):

- installs the OpenCV dependencies
- builds and installs Python
- creates a virtualenv in which to install the Python dependencies and the OpenCV bindings
- downloads and compiles the OpenCV source with CUDA enabled
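As a rough sketch (the script names below are placeholders, not the project's actual files), the first stage might start like this:

```dockerfile
# Build args let the CI matrix pick CUDA and Ubuntu versions.
ARG CUDA=11.8.0
ARG UBUNTU=22.04
FROM nvidia/cuda:${CUDA}-cudnn8-devel-ubuntu${UBUNTU} AS builder

# Placeholder scripts: install the OpenCV build deps and build Python from source.
COPY scripts/install_opencv_deps.sh scripts/build_python.sh /tmp/
RUN /tmp/install_opencv_deps.sh && /tmp/build_python.sh
```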
To build Python, I utilise a custom script which installs the dependencies and builds it from source. I then declare an environment variable for the virtualenv path and install the pip dependencies:
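A minimal sketch of what this could look like in the Dockerfile; the `/opt/venv` path and the requirements file are assumptions:

```dockerfile
# Hypothetical virtualenv location; declared as ENV so later stages can reuse it.
ENV VIRTUAL_ENV=/opt/venv
RUN python3 -m venv $VIRTUAL_ENV
ENV PATH="$VIRTUAL_ENV/bin:$PATH"

# numpy is required for the OpenCV Python bindings to build.
COPY requirements.txt /tmp/requirements.txt
RUN pip install --no-cache-dir -r /tmp/requirements.txt
```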
The virtualenv can be copied during the second stage of the multi-stage build.
To build OpenCV with CUDA support, we need to enable the following CMake flags:
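At a minimum, the CUDA-related flags look like this (the `CUDA_ARCH_BIN` value shown is only an example):

```bash
-D WITH_CUDA=ON \
-D WITH_CUDNN=ON \
-D OPENCV_DNN_CUDA=ON \
-D CUDA_ARCH_BIN=7.5
```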
`CUDA_ARCH_BIN` is set to a fixed value because the default includes older compute capabilities such as 3.5, which are deprecated and also slow down the build. My initial understanding is that setting it to a lower value keeps the binaries compatible with later generations of GPUs.
The complete CMake config becomes:
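A sketch of the full invocation, assuming `/installed` as the install prefix (as used later in the post) and the contrib modules checked out at `/opencv_contrib`:

```bash
cmake \
  -D CMAKE_BUILD_TYPE=RELEASE \
  -D CMAKE_INSTALL_PREFIX=/installed \
  -D OPENCV_EXTRA_MODULES_PATH=/opencv_contrib/modules \
  -D WITH_CUDA=ON \
  -D WITH_CUDNN=ON \
  -D OPENCV_DNN_CUDA=ON \
  -D CUDA_ARCH_BIN=7.5 \
  -D OPENCV_GENERATE_PKGCONFIG=ON \
  -D BUILD_opencv_python3=ON \
  -D PYTHON3_EXECUTABLE="$VIRTUAL_ENV/bin/python" \
  -D BUILD_TESTS=OFF \
  -D BUILD_PERF_TESTS=OFF \
  -D BUILD_EXAMPLES=OFF \
  ../opencv
make -j"$(nproc)" && make install
```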
Once compiled successfully, we symlink the OpenCV libs into the `site-packages` directory of the Python virtualenv:

```bash
ln -s /installed/lib/python${PYVER}/site-packages/cv2/python-${PYVER}/cv2.cpython-${CPYTHON}-x86_64-linux-gnu.so \
  $VIRTUAL_ENV/lib/python${PYVER}/site-packages/cv2.so
```
The second stage of the GPU build involves copying the build artifacts from the previous stage into a new image that has the required dependencies installed. I used the cudnn8-runtime image as it has CUDA and cuDNN preinstalled.

For this stage, I run the custom scripts to install Python and the required OpenCV dependencies, then copy the built OpenCV artifacts and the virtualenv across. Assuming the first stage is called `builder`:
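A sketch under the same assumptions as the first stage (`/installed` install prefix, `/opt/venv` virtualenv):

```dockerfile
FROM nvidia/cuda:${CUDA}-cudnn8-runtime-ubuntu${UBUNTU}

# Reuse the placeholder scripts to install Python and the OpenCV runtime deps.
COPY scripts/install_opencv_deps.sh scripts/build_python.sh /tmp/
RUN /tmp/install_opencv_deps.sh && /tmp/build_python.sh

# Copy the compiled OpenCV artifacts and the virtualenv from the first stage.
COPY --from=builder /installed /installed
COPY --from=builder /opt/venv /opt/venv

# Re-declare the virtualenv path so it also applies in this image.
ENV VIRTUAL_ENV=/opt/venv
ENV PATH="$VIRTUAL_ENV/bin:$PATH"
```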
Note that we still need to declare the virtualenv path and append it to the system `PATH` globally in this stage as well, for the virtualenv to work across images.
To support C++ compilation, we need to symlink the pkg-config file generated by the OpenCV compilation into `/usr/share/pkgconfig`. We also need to create an entry in `/etc/ld.so.conf.d/opencv4.conf` so that compiled executables can locate the shared object libraries at load time:
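Assuming the generated file sits at the conventional `lib/pkgconfig` location under the install prefix, this could look like:

```dockerfile
# Expose the pkg-config file and register the shared library path, then
# refresh the linker cache.
RUN ln -s /installed/lib/pkgconfig/opencv4.pc /usr/share/pkgconfig/opencv4.pc && \
    echo "/installed/lib" > /etc/ld.so.conf.d/opencv4.conf && \
    ldconfig
```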
The `/installed` directory is where the OpenCV build artifacts are located, and the `opencv4.pc` file is generated because we enabled `-D OPENCV_GENERATE_PKGCONFIG=ON` in the CMake config.
The final image is approximately 6.28 GB uncompressed and about 2.78 GB after upload to Docker Hub.
To run a CUDA image locally using the host GPU, you first need to have the NVIDIA container runtime installed and working.
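A quick sanity check is to run `nvidia-smi` inside a CUDA container:

```bash
# If the runtime is set up correctly, this prints the host GPU details.
docker run --rm --gpus all nvidia/cuda:11.8.0-base-ubuntu22.04 nvidia-smi
```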
I used the following OpenCV DNN GPU example to test whether the image works locally, by mounting the directory containing the example code into a running container and executing it:
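The image name and example script below are placeholders; the pattern is to mount the example directory and run it with the container's Python:

```bash
# Hypothetical image tag and script name, for illustration only.
docker run --rm --gpus all \
  -v "$PWD/examples:/examples" \
  my-opencv-cuda:latest \
  python /examples/object_detection.py
```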
If OpenCV is compiled properly, the above should run and generate the following:
Note that we are able to obtain at least 54 FPS for object detection, which is impressive.
Issues during Build
Forward compatibility was attempted on non supported HW
The NVIDIA container runtime depends on the host's NVIDIA driver version. In this instance my driver was set to `470`, but I was trying to run CUDA 11.8, which requires driver version `520` or above. I updated the host driver to `525` and the issue was resolved.
Could not load library libcudnn_cnn_infer.so.8
This error occurred during the inference stage when calling `network.predict` in the Python script. Running `apt-get -y install cuda-nvrtc-11-8 cuda-nvrtc-dev-11-8` solves the issue.
Remaining Tasks / Improvements
Additional tasks / improvements could include:

- Updating the GitHub Actions workflow to publish the images to Docker Hub.
- Automating the image security scan process; currently this is done locally via `docker scan`.
The built images can be found at OpenCV dockerhub images.
H4ppy H4ck1ng !!!