Local tone mapping algorithm and hardware implementation
Abstract
A novel tone mapping algorithm and hardware implementation for displaying wide dynamic range (WDR) images are proposed. The algorithm processes WDR images in a pixel-by-pixel fashion in the logarithmic domain, and it uses the block-interpolated minimum and maximum pixel values. The hardware implementation can achieve real-time processing of WDR images and is resource-efficient. Experimental results show that the proposed algorithm and hardware implementation can produce images with good brightness and high contrast.
Introduction
The dynamic range is defined as the ratio of the intensity of the brightest point to that of the darkest point in a scene or image. For natural scenes, this ratio can reach the order of millions. Wide dynamic range (WDR) images, also called high dynamic range (HDR) images, exhibit a larger dynamic range than common photographs. There are two main ways of obtaining WDR images: they can be captured either across different frames [1] or within a single frame using advanced image sensors [2, 3]. However, the dynamic range of WDR images usually exceeds that of conventional display devices, making a proper direct display of WDR images practically impossible. To display WDR images on conventional low dynamic range (LDR) display devices, tone mapping algorithms have been developed. They aim to compress a WDR image efficiently into an LDR image that fits the display, under the implicit or explicit constraint of preserving the original image quality to some extent (contrast, visible details, brightness or texture, to name a few).
Tone mapping algorithms can be classified into two categories: global and local. Global tone mapping algorithms apply a single function to all pixels and disregard each pixel's neighbourhood statistics. In general, they are relatively easy to implement in hardware [4]. However, they are prone to loss of detail and insufficient contrast. Local tone mapping algorithms take pixel neighbourhood statistics into account and can produce images with more contrast and brightness than global algorithms. However, many local tone mapping algorithms are computationally expensive and require a significant amount of hardware resources [5-8].
In this Letter, we first propose a novel local tone mapping algorithm which processes the WDR image in a pixel-by-pixel manner, and then we present the hardware implementation of the algorithm.
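To make the definition of dynamic range and the idea of a global operator concrete, the following is a minimal Python sketch. It is not the algorithm proposed in this Letter: the function names, the 8-bit output range and the sample intensities are illustrative assumptions, and the mapping is a plain logarithmic curve applied identically to every pixel.

```python
import math

def dynamic_range(pixels):
    """Ratio of the brightest to the darkest non-zero intensity."""
    positive = [p for p in pixels if p > 0]
    return max(positive) / min(positive)

def global_log_tone_map(pixels, out_max=255):
    """Apply one logarithmic curve to every pixel (a global operator).

    Hypothetical helper for illustration; out_max=255 assumes an
    8-bit LDR display.
    """
    positive = [p for p in pixels if p > 0]
    darkest = min(positive)
    lo, hi = math.log(darkest), math.log(max(positive))
    span = (hi - lo) or 1.0  # avoid division by zero for flat images
    out = []
    for p in pixels:
        # Zero-valued pixels are clamped to the darkest non-zero value.
        v = (math.log(max(p, darkest)) - lo) / span
        out.append(round(v * out_max))
    return out

# Made-up intensities spanning six orders of magnitude.
wdr = [1, 100, 10000, 1000000]
ratio = dynamic_range(wdr)
ldr = global_log_tone_map(wdr)
print(ratio, ldr)
```

Because the same curve is used everywhere, a dark region and a bright region with similar local structure are compressed identically, which is the source of the detail loss that local operators try to avoid.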
Algorithm
















Hardware implementation
The hardware architecture of the proposed algorithm is shown in Fig. 2. It consists mainly of six modules: the block div, pixel status, parameter, regfile, interp and compute modules. The pixel status module keeps track of the row and column number of the current input pixel and passes this pair to the block div module, which decides whether to update the stored maximum and minimum pixel values of each block. These values are then stored in the regfile module for use with the next frame: when operating in real time, it is reasonable to assume that there is very little variation between successive image frames [5, 8]. Hence, image statistics (such as the minimum and maximum) acquired from one frame can be used to process the subsequent frame. The interp module fetches the corresponding data from the regfile module based on the row and column number pair and interpolates the corresponding minimum and maximum values. The compute module carries out the calculation of (1) to obtain the final tone-mapped value of the pixel. The parameter module stores user-defined parameters such as the image resolution and the predefined size of each block.
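The statistics path described above (pixel status, block div and regfile modules) can be modelled in software roughly as follows. This is a behavioural sketch in Python, not the Verilog design: the names `block_min_max` and `RegFile`, the raster scan order and the small test frame are all assumptions made for illustration.

```python
def block_min_max(frame, block_h, block_w):
    """Scan a frame in raster order and keep min/max for each block.

    The (r, c) loop counters stand in for the pixel status module and
    the per-block comparisons for the block div module.
    """
    rows, cols = len(frame), len(frame[0])
    n_br = -(-rows // block_h)  # ceiling division: blocks per column
    n_bc = -(-cols // block_w)  # ceiling division: blocks per row
    mins = [[None] * n_bc for _ in range(n_br)]
    maxs = [[None] * n_bc for _ in range(n_br)]
    for r in range(rows):
        for c in range(cols):
            br, bc = r // block_h, c // block_w
            p = frame[r][c]
            if mins[br][bc] is None or p < mins[br][bc]:
                mins[br][bc] = p
            if maxs[br][bc] is None or p > maxs[br][bc]:
                maxs[br][bc] = p
    return mins, maxs

class RegFile:
    """Hypothetical model of the regfile: holds last frame's statistics."""
    def __init__(self):
        self.stats = None

    def swap(self, new_stats):
        """Store the new statistics and return the previous ones."""
        old, self.stats = self.stats, new_stats
        return old

# Made-up 4x4 frame split into 2x2 blocks.
frame = [[1, 2, 9, 8],
         [3, 4, 7, 6],
         [5, 5, 0, 1],
         [5, 5, 2, 3]]
regfile = RegFile()
first = regfile.swap(block_min_max(frame, 2, 2))   # nothing stored yet
second = regfile.swap(block_min_max(frame, 2, 2))  # frame 1 statistics
```

In this model the statistics returned by `swap` always lag the current frame by one, which mirrors the frame-to-frame reuse described in the text; how the first frame is handled in the actual hardware is not stated and is left open here.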

Fig. 1 Overall processing flow of the proposed algorithm

Fig. 2 Implemented hardware architecture
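As a rough software analogue of the interp and compute modules, the sketch below interpolates a per-block grid at a pixel position and then maps the pixel in the logarithmic domain. Two points are assumptions rather than facts from this Letter: block values are treated as samples at block centres and interpolated bilinearly, and `tone_map_pixel` is a simple log-domain stretch used as a placeholder, since equation (1) is not reproduced in this text.

```python
import math

def interp_block_value(grid, r, c, block_h, block_w):
    """Bilinearly interpolate a per-block grid at pixel (r, c).

    Assumes block values sit at block centres; coordinates are clamped
    so that border pixels reuse the nearest block value.
    """
    n_br, n_bc = len(grid), len(grid[0])
    # Pixel position expressed in units of blocks, relative to centres.
    y = (r + 0.5) / block_h - 0.5
    x = (c + 0.5) / block_w - 0.5
    y = min(max(y, 0.0), n_br - 1.0)
    x = min(max(x, 0.0), n_bc - 1.0)
    y0, x0 = int(y), int(x)
    y1, x1 = min(y0 + 1, n_br - 1), min(x0 + 1, n_bc - 1)
    fy, fx = y - y0, x - x0
    top = grid[y0][x0] * (1 - fx) + grid[y0][x1] * fx
    bot = grid[y1][x0] * (1 - fx) + grid[y1][x1] * fx
    return top * (1 - fy) + bot * fy

def tone_map_pixel(p, local_min, local_max, out_max=255):
    """Placeholder log-domain stretch between local min and max.

    NOT equation (1) of the Letter; an illustrative stand-in only.
    Values below 1 are clamped to 1 before taking the logarithm.
    """
    lo = math.log(max(local_min, 1))
    hi = math.log(max(local_max, 1))
    if hi <= lo:
        return 0
    v = (math.log(max(p, 1)) - lo) / (hi - lo)
    return round(min(max(v, 0.0), 1.0) * out_max)

# Made-up 2x2 grid of block values for a 4x4 image with 2x2 blocks.
grid = [[0, 10], [20, 30]]
corner = interp_block_value(grid, 0, 0, 2, 2)  # clamped to first block
inner = interp_block_value(grid, 1, 1, 2, 2)   # blend of all four blocks
```

Interpolating the block statistics, rather than using each block's raw minimum and maximum, gives every pixel its own smoothly varying pair of bounds, which is one common way to avoid visible block boundaries in the output.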
















Experimental results and comparisons
The proposed hardware architecture for the tone mapping algorithm was modelled in Verilog HDL and synthesised using the Altera Quartus II 13.1 toolset, targeting an Altera Cyclone III FPGA (EP3C120F780) development kit. Figs. 3 and 4 show images tone mapped with different algorithms. The image produced by our algorithm is globally brighter and shows more visible detail than those of the other works. To assess image quality, we used the tone mapped image quality index (TMQI) [9], which combines a multi-scale structural fidelity measure with a measure of image naturalness and provides a single quality score for an entire image. The TMQI measurements of the images in Figs. 3 and 4 are shown in Table 1. The high quality scores of our algorithm suggest that it produces images that satisfy good quality criteria.
Our hardware implementation was designed to minimise hardware resources while processing in real time. The synthesised working clock frequency is 100 MHz. Since our algorithm works in a pixel-by-pixel fashion, the implementation can process 100 megapixels in 1 s. The logic utilisation on the Cyclone III FPGA (119 K logic elements) is only 11%. To evaluate hardware efficiency, we compare our work with four similar works; the results are shown in Table 2. Hassan and Carletta [7] reported an FPGA implementation of a local tone mapping algorithm. Their design achieves a processing speed of 60 frames per second (FPS) but requires a large amount of hardware resources. Vytla et al. [6] implemented the gradient domain WDR compression algorithm; their design requires fewer logic resources, but it employs 88 DSP blocks for the computation. Recently, Ambalathankandy et al. [5] implemented a global-local tone mapping method; this work also requires more logic and memory resources than our implementation. The implementation of Shahnovich et al. [4] uses fewer logic resources, but it can only carry out simple logarithmic compression.
Table 2 Comparison with other hardware implementations

Works | Image size | FPS | Logic elements | Memory (bits) |
---|---|---|---|---|
Hassan and Carletta [7] | 1024 × 768 | 60 | 34,806 | 3,153,048 |
Vytla et al. [6] | 1 Megapixel | 100 | 9,019 + 88 DSP | 307,200 |
Ambalathankandy et al. [5] | 1024 × 768 | 126 | 93,989 | 87,176 |
Shahnovich et al. [4] | 1024 × 768 | 126 | 4,020 | 270,336 |
This work | 1024 × 768 | 126 | 13,216 | 77,408 |
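The throughput and utilisation figures quoted in this section can be cross-checked with a few lines of arithmetic. The sketch below assumes one pixel is consumed per clock cycle with no blanking or other overhead, and it uses the rounded "119 K" figure for the device capacity; both are simplifications rather than reported facts.

```python
CLOCK_HZ = 100e6            # synthesised clock frequency from the text
WIDTH, HEIGHT = 1024, 768   # image size used in the comparison table
LOGIC_USED = 13216          # logic elements reported for this work
LOGIC_TOTAL = 119000        # "119 K" logic elements (rounded figure)

pixels_per_frame = WIDTH * HEIGHT
# Ideal upper bound: one pixel per clock, no blanking intervals.
max_fps = CLOCK_HZ / pixels_per_frame
utilisation = LOGIC_USED / LOGIC_TOTAL

print(f"pixels per frame: {pixels_per_frame}")
print(f"upper bound on FPS: {max_fps:.1f}")
print(f"logic utilisation: {utilisation:.1%}")
```

Under these assumptions the bound comes out at roughly 127 FPS for a 1024 × 768 frame, so the reported 126 FPS sits just below the ideal limit, and the utilisation rounds to the 11% stated in the text. The source of the small FPS gap is not given in the text.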
Conclusion
This Letter introduces a novel WDR image tone mapping algorithm together with its hardware implementation. The algorithm compresses WDR images in a pixel-by-pixel fashion, with compression levels that vary according to local pixel statistics. The hardware implementation is resource-efficient and achieves real-time processing speed. Experimental results demonstrate the good performance of the proposed algorithm, and comparisons with other hardware works show that our implementation is hardware-efficient.
Acknowledgment
This work was supported by Alberta Innovates Technology Futures (AITF) and the Natural Sciences and Engineering Research Council of Canada (NSERC).