DALI is an NVIDIA library that provides a computational graph for the pre-processing part of your training script. The interesting bit is that operators in the graph can run on the GPU. You can read more here.
Data pre-processing can be heavy on the CPU, and you may end up with the GPU starving while it waits for data. Combined with some offline pre-processing, DALI can take load off the CPU and rebalance the work.
For a test scenario, I wanted to feed a network with brain scans for a segmentation task. Usually, the images are stored in formats like NIfTI, which are not easy to handle in a typical image pipeline.
I decided to convert the brain scans to multi-page grayscale TIFFs. I used nibabel to read the original files. Once you have the multidimensional NumPy arrays containing the images, you can save them as TIFF files using tifffile. Saving a TIFF was a matter of one line of code, like the following:
tifffile.imwrite(outfile, data, compress='ZSTD', photometric='minisblack')
Notice that I used the Zstandard compression algorithm for the TIFF files. It is fast at both compression and decompression and worth using.
Having the images as multi-page TIFFs is also handy because you can inspect them with an ordinary image viewer like geeqie.
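The conversion described above can be sketched roughly as follows. This is an illustration rather than the exact script I used: the helper names, the choice to put the last axis first as the page axis, and the rescaling to 16-bit are all assumptions. Recent tifffile releases call the keyword `compression` (the one-liner above uses the older `compress` spelling), and, as far as I know, Zstandard support also needs the imagecodecs package.

```python
import numpy as np


def volume_to_pages(volume, slice_axis=-1, rescale=True):
    """Turn a 3-D scan into a (pages, height, width) uint16 stack.

    Hypothetical helper: the slice axis and the 16-bit rescaling are
    assumptions, not something the NIfTI format dictates.
    """
    volume = np.asarray(volume, dtype=np.float64)
    if volume.ndim != 3:
        raise ValueError("expected a 3-D volume")
    # Put the slice axis first so every slice becomes one TIFF page.
    pages = np.moveaxis(volume, slice_axis, 0)
    if not rescale:
        # Label masks must keep their values, so only round and cast.
        return np.round(pages).astype(np.uint16)
    # Stretch the intensities over the full uint16 range.
    lo, hi = pages.min(), pages.max()
    scale = 65535.0 / (hi - lo) if hi > lo else 0.0
    return np.round((pages - lo) * scale).astype(np.uint16)


def nifti_to_tiff(nifti_path, tiff_path, is_mask=False):
    # Imported lazily so volume_to_pages works without these packages.
    import nibabel as nib
    import tifffile

    volume = nib.load(nifti_path).get_fdata()
    tifffile.imwrite(
        tiff_path,
        volume_to_pages(volume, rescale=not is_mask),
        photometric="minisblack",
        # Older tifffile releases spelled this argument `compress`.
        compression="zstd",
    )
```

For the ground truth you would call `nifti_to_tiff(path, out, is_mask=True)` so that the label values are written unchanged.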
The next step was to write a DALI pipeline to ingest the data. Along the way, I soon discovered that some features were missing. For example, it was not possible to open compressed TIFF files. I solved that problem with small modifications to the DALI build script. The maintainers merged my pull request into the main branch, and the change was released in version 0.26.
The second problem was that the FileReader operator didn’t support a segmentation scenario explicitly. If you want to read an image and the corresponding ground truth, you need to use two FileReader ops. Finally, the ImageDecoder didn’t support the multi-page TIFF files I wanted to use.
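To make the two-reader workaround more concrete: the standard FileReader accepts a `file_list` text file with one `path label` row per line, so one option is to generate two lists in which images and masks appear in the same order. The sketch below only builds those lists. The `_mask` naming convention and both helper names are hypothetical.

```python
def pair_images_with_masks(filenames, mask_suffix="_mask"):
    """Return (image, mask) name pairs, sorted by image name.

    Assumes a hypothetical naming scheme where the mask of `scan.tif`
    is called `scan_mask.tif`. Images without a mask are dropped.
    """
    names = set(filenames)
    pairs = []
    for name in sorted(names):
        stem, dot, ext = name.rpartition(".")
        if not dot or stem.endswith(mask_suffix):
            continue  # no extension, or the file is itself a mask
        mask = f"{stem}{mask_suffix}.{ext}"
        if mask in names:
            pairs.append((name, mask))
    return pairs


def file_list_text(paths, label=0):
    """Format paths as FileReader rows; the label is a dummy value."""
    return "".join(f"{path} {label}\n" for path in paths)


pairs = pair_images_with_masks(
    ["b.tif", "b_mask.tif", "a.tif", "a_mask.tif", "c.tif"]
)
image_rows = file_list_text(image for image, _ in pairs)
mask_rows = file_list_text(mask for _, mask in pairs)
```

Each list would then go to its own FileReader. The two readers stay aligned only if shuffling is disabled or, as is commonly suggested, both readers use the same seed. That fragility is a good part of why a single reader that returns both files is more convenient.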
Luckily, you can write a custom operator in C++ (well, not sure how many people will consider that ‘luck’) and implement the features you need yourself. You may read more here.
I implemented the custom operators I needed and shared the code here. You can find the instructions to build the shared library in the project’s README.
Disclaimer: the project is just a proof of concept.
The following is a code snippet that shows how to load and use these two custom operators:
import nvidia.dali.plugin_manager as plugin_manager
from nvidia.dali.pipeline import Pipeline
import nvidia.dali.ops as ops
import nvidia.dali.types as types


class TestPipeline(Pipeline):
    def __init__(self, batch_size, num_threads, device_id):
        super(TestPipeline, self).__init__(batch_size, num_threads, device_id)
        self.input = ops.SegFileReader(file_root='data', file_list='data/file_list.txt', random_shuffle=True)
        self.decodeMulti = ops.TiffDecoder()
        self.rotate = ops.Rotate(device='gpu', interp_type=types.INTERP_NN, keep_size=True)
        self.rotate_range = ops.Uniform(range=(-27, 27))
        self.transpose = ops.Transpose(perm=[1, 2, 0])
        self.transposeBack = ops.Transpose(device='gpu', perm=[2, 0, 1])

    def define_graph(self):
        angle_range = self.rotate_range()
        image, mask = self.input()
        image = self.decodeMulti(image)
        mask = self.decodeMulti(mask)

        image = self.transpose(image)
        image = self.rotate(image.gpu(), angle=angle_range)
        image = self.transposeBack(image)
        mask = self.transpose(mask)
        mask = self.rotate(mask.gpu(), angle=angle_range)
        mask = self.transposeBack(mask)
        return image, mask


def main():
    plugin_manager.load_library('./cmake-build-debug/libCustomOp.so')
    pipe = TestPipeline(batch_size=8, num_threads=4, device_id=0)
    pipe.build()
    pipe_out = pipe.run()
    gimage, gmask = pipe_out
    print(gmask.as_tensor())
    print(gmask.layout())


if __name__ == '__main__':
    main()
In main(), the call to plugin_manager.load_library loads the shared library and registers the custom operators so that DALI can use them.
The first custom operator, SegFileReader, is created in __init__ and wired into the graph in define_graph. Unlike the standard FileReader, it reads an image and its corresponding segmentation ground truth in one go.
The second custom operator, TiffDecoder, decodes our multi-page TIFF files. It returns “channel first” tensors, so the layout is CHW. Unfortunately, not all the standard operators can work with channel-first tensors, and sometimes you need to transpose things around, as the pipeline above does before and after the rotation.
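The two permutations used by the Transpose operators are easier to see on a small array. The snippet below reproduces them with plain NumPy instead of DALI, on a made-up 2x3x4 tensor where the "channels" stand for the TIFF pages.

```python
import numpy as np

# A tiny channel-first tensor: 2 channels (pages), height 3, width 4.
chw = np.arange(24).reshape(2, 3, 4)

# perm=[1, 2, 0]: CHW -> HWC, the layout the rotation is applied to.
hwc = np.transpose(chw, (1, 2, 0))

# perm=[2, 0, 1]: HWC -> CHW, restoring the channel-first layout.
back = np.transpose(hwc, (2, 0, 1))

print(chw.shape, hwc.shape, back.shape)
```

The second permutation is the inverse of the first, so the round trip returns the original tensor unchanged.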
You will find the above test code in the repository, together with a Jupyter notebook as an extra starting point. I hope you will find it useful.
DALI proved to be a useful library, with some shortcomings. You can get around some of the rough edges by writing your own operators. The library allows easy integration of C++ code, and for me, that is a useful feature.
Stefano software developer...