SuperPixel & SuperVoxel Extraction

This study presents an efficient superpixel (SP) and supervoxel (SV) extraction method that aims to improve on the state of the art in both accuracy and computational complexity. Segmentation performance is improved by using a convexity constrained distance, whereas computational efficiency is achieved by replacing complete region processing with a boundary adaptation technique. Starting from uniformly distributed, rectangular (cubical) superpixels (supervoxels) of equal size (volume), region boundaries are iteratively adapted towards object edges. Adaptation is performed by assigning the boundary pixels to the most similar neighboring SPs (SVs). At each iteration, SP (SV) regions are updated, so they progressively converge to compact pixel groups. Detailed experimental comparisons against competing state-of-the-art methods validate the performance of the proposed technique in terms of both accuracy and speed.


The algorithmic flow of the proposed method can be described in four main steps: 1) Initialization of the SPs; 2) SP boundary update; 3) SP structure update; 4) Termination.
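The four steps can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the `step` and `spatial_weight` parameters, and the cost function are assumptions made for illustration. In particular, the convexity constrained distance is not specified here, so the sketch substitutes a simple weighted sum of color distance and distance to the SP centroid, and `spatial_weight` is not necessarily the lambda used in the figures. Connectivity of the resulting regions is not enforced.

```python
import numpy as np

def init_grid_labels(height, width, step):
    """Step 1: uniform rectangular superpixels of (roughly) equal size."""
    rows = np.arange(height) // step
    cols = np.arange(width) // step
    n_cols = int(cols.max()) + 1
    return rows[:, None] * n_cols + cols[None, :]

def region_stats(image, labels, n_labels):
    """Step 3: mean color and centroid of every superpixel."""
    h, w = labels.shape
    flat = labels.ravel()
    counts = np.maximum(np.bincount(flat, minlength=n_labels), 1)[:, None]
    yy, xx = np.mgrid[0:h, 0:w]

    def sum_of(values):
        return np.bincount(flat, weights=values.ravel(), minlength=n_labels)

    color = np.stack([sum_of(image[..., c]) for c in range(image.shape[2])], axis=1) / counts
    centroid = np.stack([sum_of(yy), sum_of(xx)], axis=1) / counts
    return color, centroid

def extract_superpixels(image, step=8, spatial_weight=0.2, max_iter=10):
    """Boundary adaptation on an (H, W, C) image; returns an (H, W) label map."""
    image = np.asarray(image, dtype=np.float64)
    h, w = image.shape[:2]
    labels = init_grid_labels(h, w, step)
    n_labels = int(labels.max()) + 1
    offsets = ((-1, 0), (1, 0), (0, -1), (0, 1))

    for _ in range(max_iter):
        color, centroid = region_stats(image, labels, n_labels)

        def cost(y, x, k):
            # Assumed stand-in for the convexity constrained distance.
            d_color = np.linalg.norm(image[y, x] - color[k])
            d_space = np.hypot(y - centroid[k, 0], x - centroid[k, 1]) / step
            return (1.0 - spatial_weight) * d_color + spatial_weight * d_space

        # Step 2: only pixels touching another superpixel are reconsidered.
        new_labels = labels.copy()
        changed = 0
        for y in range(h):
            for x in range(w):
                own = labels[y, x]
                neighbors = {labels[y + dy, x + dx] for dy, dx in offsets
                             if 0 <= y + dy < h and 0 <= x + dx < w}
                neighbors.discard(own)
                if not neighbors:
                    continue  # interior pixel, untouched
                best, best_cost = own, cost(y, x, own)
                for k in sorted(neighbors):
                    c = cost(y, x, k)
                    if c < best_cost:  # ties keep the current label
                        best, best_cost = k, c
                if best != own:
                    new_labels[y, x] = best
                    changed += 1
        labels = new_labels

        # Step 4: terminate once no boundary pixel changes its label.
        if changed == 0:
            break
    return labels
```

Because only boundary pixels are visited in step 2, the work per iteration scales with the total boundary length rather than with the image area, which is the source of the efficiency gain described above. A production version would track the boundary pixels explicitly instead of scanning every pixel as this sketch does.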

Why Convexity Constrained SPs?

The convexity constraint is a major criterion of the proposed SP extraction method. The underlying motivation is to create regular oversegmentation grids over the entire image. This aim is morphologically meaningful, since objects usually tend to have regular boundaries. Moreover, such a constraint could also be useful for graph-based implementations, where individual SPs are assigned as graph nodes. The figure below shows some visual examples of different convexity constraints (lambda is 0.9, 0.5 and 0.1 from left to right).


Some quantitative evaluations show competitive results:



As an extension of the SP framework to video, SV extraction has also been implemented and tested. To the best of our knowledge, only a limited number of SV extraction methods have been presented in the literature.

The figure below shows how SVs evolve over time. A horizontal slice is extracted from the image, and the pixels at the same location are followed through the subsequent frames of the video. The zoomed part at the bottom shows the temporal voxel boundaries. One can observe the coherence of the voxel boundaries across successive frames. The voxel boundaries appear every 7 frames. Voxels are marked with different colors for visualization.
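The slice visualization described above can be reproduced along the following lines. This is a sketch under stated assumptions: `sv_labels` is a hypothetical supervoxel label volume of shape (frames, height, width), filled here with toy cubical supervoxels of temporal depth 7 rather than the output of the proposed method, and the helper names are illustrative.

```python
import numpy as np

def temporal_slice(sv_labels, row):
    """Take one image row from every frame, giving a (frames, width) slice."""
    return sv_labels[:, row, :]

def temporal_boundaries(xt_slice):
    """Mark positions whose label differs from the same position one frame earlier."""
    boundary = np.zeros(xt_slice.shape, dtype=bool)
    boundary[1:] = xt_slice[1:] != xt_slice[:-1]
    return boundary

# Toy label volume: regular supervoxels, 4x4 pixels wide and 7 frames deep.
T, H, W, depth, side = 21, 8, 8, 7, 4
t, y, x = np.meshgrid(np.arange(T), np.arange(H), np.arange(W), indexing="ij")
per_frame = (H // side) * (W // side)
sv_labels = (t // depth) * per_frame + (y // side) * (W // side) + (x // side)

xt = temporal_slice(sv_labels, row=3)
frames_with_boundary = np.flatnonzero(temporal_boundaries(xt).any(axis=1))
```

With real SV output the boundaries would not fall on a perfectly regular grid; the toy volume only shows how the slice and its temporal boundaries are computed.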


The code can be found here.

Related publications:

[2015] H. Emrah Tasli, Cevahir Cigla, A. Aydin Alatan; “Convexity Constrained Efficient Superpixel and Supervoxel Extraction”; Elsevier Signal Processing: Image Communication

[2013] H. Emrah Tasli, Cevahir Cigla, Theo Gevers, A. Aydin Alatan; “Superpixel Extraction via Convexity Induced Boundary Adaptation”; International Conference on Multimedia and Expo (ICME) 2013

[2012] Cevahir Cigla, H. Emrah Tasli, A. Aydin Alatan; “Efficient Super Pixel Extraction for Image Segmentation”; IEEE Conference on Signal Processing and Communications Applications (SIU) 2012