Frame recovery using multiple images in rolling shutter based systems
Abstract
An algorithm for recovering transmitted static identifiers (IDs) in rolling shutter based Optical Camera Communication (OCC) systems is proposed, considering a system comprised of a camera and a circular light source. The goal is to allow correct decoding when the ID frame is only partially detected in the image. A baseline algorithm is defined as a reference for the frame recovery success rate (FRSR), and a reconstruction algorithm is proposed that captures multiple frame fragments and reassembles them, in order to recover a transmitted ID that is not entirely visible in a single image. Simulations show that the proposed algorithm increases the maximum distance at which ID recovery can be guaranteed by a factor of 2.5, for 6-bit, 8-bit and 10-bit codewords. An experimental validation procedure is also proposed, using image processing techniques to extract the bitstreams and test the ID recovery process. The proposed algorithm improves the FRSR for a given distance, even in the presence of considerable bit errors in the bitstreams extracted from the images.
1 INTRODUCTION
Many applications and settings require devices to locate themselves in their surroundings or in the world. Examples include a mobile robot transporting parts in a manufacturing production line, or an app assisting visitors in a museum by providing information about the items on display. Both the robot and the app require some means of computing their current position in order to provide their service.
Systems that provide information about the location of an agent in indoor settings, such as those just described, are generally known as Indoor Positioning Systems. Several technologies are available to build them, based on radio communications (such as Wi-Fi, Bluetooth, ZigBee and others), sound or ultrasound waves, image processing in both 2D and 3D, and Visible Light Communications (VLCs) [1-3].
VLC is increasingly used due to the widespread adoption of LED-based lighting. Unlike other illumination devices, such as filament-based, fluorescent and gas-based lamps, LED lights allow the emitted light to be modulated at high frequencies. Since these frequencies are above the threshold detectable by the human eye, they are not perceived by humans. Any luminary can thus become a data broadcasting device, without causing any visual perturbation to people in the vicinity [4].
The processes to acquire information transmitted by the lighting devices are usually divided into two main groups: imaging and non-imaging [2, 5]. Non-imaging processes are based on devices such as photodiodes that measure the intensity of light impinging on the diode's sensitive surface. Imaging processes are based on the acquisition of an image by a digital camera, and are designated Optical Camera Communication (OCC) [6]. The process for recovering the information from the received light signal depends on the camera's technology. Charge-Coupled Device (CCD) cameras work with a global shutter, where the data for the whole image is sampled and acquired at the same instant [7]. In contrast, Complementary Metal-Oxide Semiconductor (CMOS) cameras use a rolling shutter, where the image is acquired one row at a time. This means that when the camera captures an object whose light intensity is changing rapidly, successive rows will appear with different light intensities [8], causing a series of stripes in the image that correspond to the changing light intensity while the rows were being scanned. The rolling shutter thus allows the light signal to be demodulated at a rate much higher than the frame rate, as the light intensity variations are translated into lighter or darker stripes [9].
Considering the case of On-Off Keying (OOK), for example, in a given camera working at a certain frame rate, the width of each stripe will depend on the duration of each bit: each bit will be recorded in the rows that were scanned during its duration. For a luminary used as a data emitter, the number of bits detected in an image captured with a rolling shutter device will depend on how many stripes fit in the luminary blob; in other words, on the ratio between the size of the luminary and the width of each stripe.
While OCC systems in rolling shutter mode can reach higher data rates when compared to the global shutter, it is generally difficult to know precisely the time interval between the acquisition of two consecutive photographs, due to uncertainties in the acquisition process, particularly relevant when considering smartphone cameras [10, 11]. This means that, while the symbol rate is higher, the sampling window is effectively reduced and depends on the fixture dimensions and the distance between the camera and the source. For this reason, additional synchronisation mechanisms are necessary to join the multiple images, thus extending the effective sampling window in order to allow the reception of a larger number of symbols. Another common problem with rolling shutter based systems is the fact that the exposure time, that is, the interval during which each row of pixels is exposed to light, often spreads across multiple rows of the image [12]. This creates a low pass filter effect, smoothing out the transitions between the on and off states of the LED, resulting in less defined edges on the acquired signal, leading to possible errors in the bitstream extracted from the image. This effect, however, will not be considered in this work, since it has been previously studied [13, 14] and our focus is on exploring a solution to the problem of having a low number of bits per image.
Visible Light Positioning (VLP) stands for the usage of VLC technologies with the purpose of establishing a device's position. By relating the real-world position of a set of luminaries and the position of these luminaries with respect to the device, the location of the device can be estimated [15]. In order to identify each luminary in the acquired image, VLP systems can use VLC to broadcast an ID unique to each light source [7]. By relating the position of the light sources in the image with their real-world positions, an estimate of the camera position can be computed. Although the principle is simple, its application to concrete examples presents some challenges. As the number of luminaries grows, so does the number of required IDs and, consequently, their length, making it difficult to fit an ID inside the luminary's image. Another difficulty is introduced when the distance from the camera to the luminary increases and the size of the luminary projection is reduced, reducing the number of bits that can be decoded from the image [16].
As a consequence, VLP systems have to deal with the situation that, in many instances, the size of a luminary in the acquired image will be unable to encode the complete ID, and it will contain only a partial and incomplete segment. Our work addresses the problem of reconstructing a complete ID from fragments decoded from individual images acquired by a rolling shutter camera. We study the most adequate forms of encoding the ID, and the processes to recover the full ID bitstream from the fragments acquired from each image.
Multiple works have already proposed implementations of VLP systems which use OCC to identify the transmitters [17-20]. However, since most experimental works are conducted on relatively small setups or with a low number of transmitters, the problem of low sampling windows has not yet been addressed. One work focused on the problem of transmitting large packets of information with limited sampling window lengths [21]. However, this was done considering a frame with an arbitrary payload, which is not what we are exploring in this work. Another work proposed the use of Hamming codes to compensate for the gap between successive images [11]. This, however, increases the frame length, reducing the maximum number of transmitters on the system. We introduced this problem in one of our previous works, where we observed that in a VLP system comprised of a camera and fixtures on a ceiling at a height of 2.7 m, the number of transmitters and the maximum working distance were limited by the small sampling window [22].
The problem at hand is similar to that of shotgun DNA sequencing, in which a large number of short reads from random locations in the DNA sequence are assembled together, rather than attempting to sequentially read the entire genome of the organism [23]. However, solutions for this problem consider large sequences, which is not our case, and, at least in the context of this work, parallel readings of the sequence are not viable, as we would need multiple cameras. Another possible approach is to assume a minimum overlap between successive reads, which is something that cannot be guaranteed in our case [24].
In this paper, we explore detection algorithms for implementing rolling shutter based OCC systems for ID transmission, focusing on low sampling windows and proposing techniques to circumvent the problem of few samples per photograph. This work is an extension of our previous work presented at the 2022 13th International Symposium on Communication Systems, Networks and Digital Signal Processing, extending the number of different ID sizes and experimentally validating the proposed algorithms [25]. The paper is organised as follows. Section 2 presents the system model and the baseline reconstruction algorithm, used as a reference metric. Section 3 presents the proposed algorithms and the corresponding recovery probability model. Section 4 presents the simulation used to assess the proposed algorithms, Section 5 presents the experimental validation, and conclusions are drawn in Section 6.
2 SYSTEM MODELLING
The considered system is comprised of a camera, following the pinhole model, and a single circular light fixture. Both are aligned in such a way that the fixture is parallel to the image plane, thus having a constant orientation and a varying distance between them. This is done for simplification purposes, but the same principle can be applied to different poses. A diagram describing the system is visible in Figure 1.
This value is rounded up to the nearest integer because we assume that as long as a fraction of the pixel receives light, the entire pixel is identified as belonging to the fixture.
For large values of d, R will decrease and will eventually be smaller than the transmitted codeword size. In this situation, only a fraction of the codeword is received and a process is required to reconstruct the full codeword from the received fragments. Our problem is to study the processes to recover the original codeword, and to estimate the probability of success, which we call the frame recovery success rate (FRSR).
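As a concrete illustration of this geometry, the sketch below computes R at a given distance. It is a minimal sketch under stated assumptions, not the paper's exact expressions: the projected fixture diameter in pixels is taken as 2rf/d (pinhole model) and the number of pixels per symbol M is given an illustrative value.

```python
import math

def visible_symbols(d: float, f: float = 2534.0, r: float = 0.075, m: int = 20) -> int:
    """Number of symbols R visible in the fixture projection at distance d (m).

    Assumptions: pinhole projection (diameter in pixels = 2*r*f/d, with f
    in pixel units) and m pixels per symbol; m = 20 is illustrative, not a
    value from the paper. The result is rounded up, since a partially
    covered row is counted as belonging to the fixture.
    """
    projection_px = 2.0 * r * f / d          # projected fixture diameter, pixels
    return math.ceil(projection_px / m)      # symbols that fit in the projection

# Sweep the distance to see where R drops below a 6-bit codeword.
for d in (1.0, 2.0, 3.0, 5.0):
    print(f"d = {d} m -> R = {visible_symbols(d)} symbols")
```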
In order to compare the algorithms that we propose in this paper, we will consider an immediate reconstruction algorithm, which consists in searching the received bitstream for a correspondence among the previously defined list of codewords. Given that we are dealing with asynchronous communication, these codes are chosen so that none of them matches another code when rotated on a circular buffer, thus being uniquely identifiable in a repeating bitstream. This will be the baseline for the FRSR, defined as the probability of successfully reconstructing the frame given a set of N (N ≥ 1) images, in which the number of visible symbols R might be larger or smaller than the codeword length W. A diagram illustrating this process is presented in Figure 2. Three situations can be distinguished:
- If the projection size R is smaller than the codeword length W, the recovery probability is always zero, as not enough symbols are received;
- If the projection size R is larger than 2W, the recovery probability is always one, since the code can always be found regardless of the capturing instant;
- If R lies between W and 2W, the probability is somewhere between zero and one.
By combining Equations (3), (4) and (7), we calculate the FRSR as a function of the distance d. Throughout this work, when necessary, we will consider f = 2534 and r = 7.5 cm, as these are the values corresponding to the camera and fixture used for the experimental validation. We will also consider three different codeword lengths, W = {6, 8, 10}. The list of binary codewords is always obtained by joining a fixed preamble of W/2 bits with the remaining W/2 bits, which go through all the possible binary combinations, resulting in 2^(W/2) possible codes, W always being an even integer. An alternative approach is presented in Ref. [27], where a binary-coded decimal identifier is joined with a fixed preamble, also guaranteeing that the resulting codewords are uniquely decodable. This is applied in the context of a global shutter based VLP system, which differs from our case in that the problem of the sampling window size is eliminated. The preamble for W = 6 is {110}, for W = 8 it is {1110} and for W = 10 it is {11110}. These were chosen because they were verified to guarantee that all resulting words are uniquely identifiable within a repeating bitstream. As a concrete example, for W = 6, the set of binary codewords is {110000, 110001, 110010, 110011, 110100, 110101, 110110, 110111}.
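For concreteness, the sketch below builds such a dictionary and applies the immediate (baseline) search; the rotation check mirrors the requirement that no codeword matches another when rotated on a circular buffer. The helper names are ours and purely illustrative.

```python
from itertools import product

def rotations(word: str) -> set:
    """All cyclic rotations of a codeword."""
    return {word[i:] + word[:i] for i in range(len(word))}

def build_dictionary(preamble: str, w: int) -> list:
    """Codewords = fixed preamble + every combination of the remaining bits."""
    words = [preamble + "".join(bits) for bits in product("01", repeat=w - len(preamble))]
    # Sanity check: no codeword may equal a rotation of another, so that
    # each one is uniquely identifiable in a repeating bitstream.
    for i, a in enumerate(words):
        for b in words[i + 1:]:
            assert not (rotations(a) & rotations(b)), f"{a} and {b} collide"
    return words

def immediate_decode(bitstream, dictionary):
    """Baseline algorithm: look for a codeword directly in the received bits."""
    for word in dictionary:
        if word in bitstream:
            return word
    return None

codes = build_dictionary("110", 6)              # the W = 6 dictionary above
print(immediate_decode("01101001101", codes))   # -> '110100'
```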
The number of bits available for each codeword length versus the distance, as well as the fixture projection size, is plotted in Figure 3. This work proposes two algorithms which aim at improving this FRSR curve, increasing the maximum distance at which the receiver can successfully recover the observed fixture's ID.
3 PROPOSED RECONSTRUCTION ALGORITHMS
3.1 Circular buffer algorithm
As described in the previous section, our goal is to improve, for a given distance d, the FRSR or, reciprocally, to increase the distance d at which a given value of FRSR is attained. In order to do this, the first algorithm we propose requires that all codewords have the same known length. Furthermore, the codes have to be unequivocally identifiable within a repeating pattern. Generating the codewords using the prefixes described in the previous section results in dictionaries that satisfy this condition. Since all codewords in a dictionary have the same length W, we can place the received symbols on a circular buffer with W positions, discarding any repeating symbols. Assuming that there are no errors in the received bitstream and provided that at least W symbols were received, this technique allows us to correctly recover the transmitted code regardless of the sampling instant. An illustration of this algorithm is visible in Figure 4.
Using this technique, the FRSR is effectively improved when compared with Equation (7). However, it remains impossible to recover the ID when R < W.
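A minimal sketch of this circular buffer search, assuming an error-free received bitstream (the function name is ours):

```python
def circular_decode(bitstream: str, dictionary, w: int):
    """Keep the first W received symbols on a circular buffer (further
    symbols only repeat the pattern) and test every rotation against
    the dictionary, so the sampling instant does not matter."""
    if len(bitstream) < w:
        return None                             # R < W: not enough symbols
    buffer = bitstream[:w]
    for i in range(w):
        candidate = buffer[i:] + buffer[:i]     # one rotation of the buffer
        if candidate in dictionary:
            return candidate
    return None

codes = {"110000", "110001", "110010", "110011",
         "110100", "110101", "110110", "110111"}
# A 6-symbol fragment of the repeating pattern ...110100110100...
print(circular_decode("010011", codes, 6))      # -> '110100'
```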
3.2 Frame reconstruction from multiple fragments
In an attempt to circumvent the problem described above and further improve the FRSR, we propose a second improvement to the algorithm, based on frame reconstruction from multiple fragments. The idea is that when we cannot receive all W symbols on a single image, by acquiring successive images we are left with a series of fragments that can, potentially, be joined together in order to recover the transmitted codeword. An illustration of the frame reconstruction algorithm with W = 6 is presented in Figure 5.
A fragment combination is considered valid when:

- the current fragment positions on the buffer are such that every buffer position is filled with at least one symbol of a fragment;
- the buffer positions associated with more than one fragment have the same symbol in all corresponding fragment positions;
- the resulting combination of W symbols contains a valid codeword, found with the search algorithm illustrated in Figure 4.
Using this procedure, for a given set of fragments, multiple different fragment combinations are expected to be valid and result in a recovered codeword. Furthermore, it is possible that the algorithm finds different valid codewords among the set of combinations that satisfy the enumerated rules. To circumvent this, we select the codeword that appears most frequently, that is, the mode of the found codewords, which corresponds to our recovered word. The reconstruction process and the case elimination procedure are summarised in Figure 6.
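A minimal sketch of this reconstruction, implementing the three rules above by brute-force placement and the mode-based selection (the case elimination procedure of Figure 6 is not reproduced; names are illustrative):

```python
from collections import Counter
from itertools import product

def reconstruct(fragments, dictionary, w: int):
    """Try every placement of every fragment on a W-position circular
    buffer; keep placements that cover all positions and agree on the
    overlaps, then search the result for a valid codeword. The mode of
    the codewords found over all valid placements is returned."""
    found = []
    for offsets in product(range(w), repeat=len(fragments)):
        buffer, ok = [None] * w, True
        for frag, off in zip(fragments, offsets):
            for k, sym in enumerate(frag):
                pos = (off + k) % w
                if buffer[pos] is None:
                    buffer[pos] = sym
                elif buffer[pos] != sym:        # overlapping symbols must agree
                    ok = False
                    break
            if not ok:
                break
        if not ok or None in buffer:            # every position must be covered
            continue
        word = "".join(buffer)
        for i in range(w):                      # rotation search (Figure 4)
            rot = word[i:] + word[:i]
            if rot in dictionary:
                found.append(rot)
                break
    return Counter(found).most_common(1)[0][0] if found else None

codes = {"110000", "110001", "110010", "110011",
         "110100", "110101", "110110", "110111"}
# Four 4-bit fragments cut from the repeating pattern ...110100110100...
print(reconstruct(["1101", "0011", "1001", "0100"], codes, 6))  # -> '110100'
```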
3.3 Frame reconstruction model
Finding a model for the FRSR using this version of the reconstruction algorithm must take into consideration that the number of possible fragment combinations increases exponentially with n. As such, in order to find an FRSR curve, we propose a statistical model which, instead of trying to fit the fragments together, assumes an ideal reconstructor. The ideal reconstructor is a process that, given a set of fragments which together contain every bit of the codeword at least once, is capable of correctly assembling the original word at every instance. This removes the uncertainty associated with the reconstruction process and, as such, yields the same result for all codewords regardless of the bit pattern.
A diagram of this model is presented in Figure 7. We start by considering a codeword with W symbols, followed by all W possible fragments of size l that it can originate. From these fragments, we compute the W^n possible combinations of n fragments and apply the ideal reconstruction to each of them. The result is a set of binary values, yielding True if the present fragment combination has all W symbols at least once and False otherwise. The percentage of combinations that result in True corresponds to the FRSR of that (l, n) pair. The results of this model for the three codeword lengths are presented in Tables 1–3.
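Since the ideal reconstructor only depends on symbol coverage, the model reduces to a counting exercise; a minimal sketch:

```python
from itertools import product

def model_frsr(w: int, l: int, n: int) -> float:
    """Ideal-reconstructor model: a combination of n fragments of length l
    succeeds iff the fragments together cover all W symbol positions at
    least once. The FRSR is the fraction of the W**n combinations of
    starting positions that achieve full coverage."""
    if l >= w:
        return 1.0                       # a single fragment already covers all
    success = 0
    for starts in product(range(w), repeat=n):
        covered = set()
        for s in starts:
            covered.update((s + k) % w for k in range(l))
        if len(covered) == w:
            success += 1
    return success / w ** n

print(round(model_frsr(6, 4, 2), 1))     # -> 0.5, matching Table 1
```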
Table 1. Model FRSR for W = 6 (rows: number of fragments n; columns: fragment length l, in bits).

| # of frag. | l = 3 | l = 4 | l = 5 | l = 6 | l = 7 | l = 8 | l = 9 | l = 10 | l = 11 |
|---|---|---|---|---|---|---|---|---|---|
| 4 | 0.6 | 0.9 | 0.9 | 1 | 1 | 1 | 1 | 1 | 1 |
| 3 | 0.4 | 0.8 | 0.9 | 1 | 1 | 1 | 1 | 1 | 1 |
| 2 | 0.2 | 0.5 | 0.8 | 1 | 1 | 1 | 1 | 1 | 1 |
| 1 | 0.0 | 0.0 | 0.0 | 1 | 1 | 1 | 1 | 1 | 1 |
Table 2. Model FRSR for W = 8 (rows: number of fragments n; columns: fragment length l, in bits).

| # of frag. | l = 3 | l = 4 | l = 5 | l = 6 | l = 7 | l = 8 | l = 9 | l = 10 | l = 11 |
|---|---|---|---|---|---|---|---|---|---|
| 4 | 0.2 | 0.6 | 0.8 | 0.9 | 0.9 | 1 | 1 | 1 | 1 |
| 3 | 0.0 | 0.4 | 0.7 | 0.8 | 0.9 | 1 | 1 | 1 | 1 |
| 2 | 0.0 | 0.1 | 0.3 | 0.6 | 0.8 | 1 | 1 | 1 | 1 |
| 1 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1 | 1 | 1 | 1 |
Table 3. Model FRSR for W = 10 (rows: number of fragments n; columns: fragment length l, in bits).

| # of frag. | l = 3 | l = 4 | l = 5 | l = 6 | l = 7 | l = 8 | l = 9 | l = 10 | l = 11 |
|---|---|---|---|---|---|---|---|---|---|
| 4 | 0.0 | 0.3 | 0.6 | 0.8 | 0.9 | 0.9 | 0.9 | 1 | 1 |
| 3 | 0.0 | 0.1 | 0.3 | 0.6 | 0.8 | 0.9 | 0.9 | 1 | 1 |
| 2 | 0.0 | 0.0 | 0.1 | 0.3 | 0.5 | 0.7 | 0.9 | 1 | 1 |
| 1 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1 | 1 |
From Table 1, we can observe that, contrary to the results yielded by the circular buffer algorithm, we can now have a probability of correctly decoding the original codeword larger than zero for the cases where l < W. As expected, the exception to this is when n = 1, since this corresponds to the case where only a single fragment is considered at a time, giving the same results as the previous circular buffer algorithm. Furthermore, we can observe that increasing the number of fragments considered in the frame reconstruction process increases the FRSR, for a given fragment size l.
When analysing Tables 2 and 3, we observe a behaviour similar to that of Table 1, suggesting that the proposed algorithm is scalable to larger codeword lengths. We can also observe that the FRSRs appear to undergo a shift to the right of the tables as we increase the codeword size. Finally, in all three tables the FRSR is always 1.0 when l ≥ W, as was the case with the circular buffer algorithm.
In all three scenarios, we can observe that, while the recovery probabilities are now larger than zero when l < W, we can never have absolute certainty that the correct codeword will be obtained. In an attempt to further improve the proposed algorithm, we introduce an additional change: a constraint on the set of fragments requiring at least n − 1 distinct fragments, symbol-wise, in the combination. In other words, all fragments must be unique except for at most one duplicate. This reduces the number of possible cases, and the intuition is that it prevents the receiver from attempting the reconstruction when all n fragments are the same. Of course, depending on the transmitted codeword, this process can eliminate fragment combinations which indeed correspond to different symbols but happen to have the same pattern, hence the decision to allow up to two equal fragments. No probabilistic model is presented exclusively for this modification, but both versions of the algorithm will be considered in the following simulation section.
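A one-line sketch of this constraint (the helper name is ours):

```python
def satisfies_constraint(fragments) -> bool:
    """Constrained variant: accept a combination only if it has at least
    n - 1 distinct fragments, i.e. at most two of them are equal
    symbol-wise; receiving the same fragment n times is discarded."""
    return len(set(fragments)) >= len(fragments) - 1

print(satisfies_constraint(["1101", "0011", "1001", "0100"]))  # True
print(satisfies_constraint(["1101", "1101", "1101", "0100"]))  # False
```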
4 FRAME RECONSTRUCTION SIMULATION
4.1 Simulation procedure
In order to assess the proposed reconstruction algorithms, we present a process to verify the FRSR by simulation. This consists in testing all possible fragment combinations within a specific set of parameters and, with it, empirically obtaining the successful recovery probability given a number of fragments, n, of a given length, l, for each codeword length W. A diagram illustrating the proposed process is presented in Figure 8.
The first step in the process is to generate, considering each codeword, all possible fragments with length l. This always results in W fragments, which corresponds to the different symbols at which the reception can start. As noted before, the fragments are not necessarily different symbol-wise, as this depends on the codeword in question; however, the W fragments, unique or not, are always considered in this step.
After generating the fragments, the next step is to generate all possible combinations of n fragments, which yields W^n possible cases. Note that, since we are assuming unknown sampling intervals when acquiring the images, the same fragment can be acquired n times, hence these cases are not discarded.
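A minimal sketch of these two generation steps, for one codeword of the W = 6 dictionary:

```python
from itertools import product

W, l, n = 6, 4, 2
codeword = "110100"
# All W fragments of length l that the repeating codeword can originate,
# one per possible starting symbol (duplicates are kept on purpose).
fragments = [(codeword * 2)[s:s + l] for s in range(W)]
# All W**n combinations of n fragments; the same fragment may be acquired
# several times, since the sampling intervals are unknown.
combinations = list(product(fragments, repeat=n))
print(fragments)           # ['1101', '1010', '0100', '1001', '0011', '0110']
print(len(combinations))   # 36 == W**n
```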
The next step depends on the considered version of the reconstruction algorithm. If the unconstrained version is chosen, the combination filtering is ignored and we are ready to apply the reconstruction algorithm to each fragment combination obtained before, as described in Figure 5. If the constrained version is considered, the fragment combinations which do not satisfy the condition described in the previous section are discarded, reducing the number of possible cases. Afterwards, the reconstruction algorithm is applied to the remaining combinations.
Finally, the algorithm verifies if the recovered codeword is the same as the original and calculates the percentage of successfully recovered combinations, which corresponds to the FRSR. This process is repeated, for a given (l, n) pair, for each codeword separately, without the reconstruction algorithm knowing which particular codeword was considered.
4.2 Simulation results
The simulation results for the reconstruction algorithm from multiple frames are presented in Tables 4–7. Table 4 shows the simulated FRSR values for codewords with W = 6, using the version of the algorithm without the fragment constraints, while Table 5 shows the same results with the fragment constraints. For W = 8 and W = 10, only the constrained version of the algorithm is considered, as this is the version that shows better results. Since, for a given W, the different codewords might yield different FRSR values, both the maximum and minimum values are presented in the tables for each (l, n) pair. When only a single value is shown, all codewords yielded the same FRSR under those conditions. All values were rounded down to one decimal place, in order to avoid misinterpreting higher values as 100%.
Table 4. Simulated FRSR for W = 6, without the fragment constraints. Cells with two values show the maximum / minimum FRSR among the codewords; a single value means all codewords yielded the same FRSR.

| # of frag. | l = 3 | l = 4 | l = 5 | l = 6 | l = 7 | l = 8 | l = 9 | l = 10 | l = 11 |
|---|---|---|---|---|---|---|---|---|---|
| 4 | 0.9 / 0.2 | 1 / 0.7 | 1 / 0.9 | 1 | 1 | 1 | 1 | 1 | 1 |
| 3 | 0.7 / 0.2 | 1 / 0.6 | 1 / 0.9 | 1 | 1 | 1 | 1 | 1 | 1 |
| 2 | 0.5 / 0.1 | 0.8 / 0.3 | 1 / 0.7 | 1 | 1 | 1 | 1 | 1 | 1 |
| 1 | 0.0 | 0.0 | 0.0 | 1 | 1 | 1 | 1 | 1 | 1 |
Table 5. Simulated FRSR for W = 6, with the fragment constraints (maximum / minimum among the codewords; a single value means all codewords yielded the same FRSR).

| # of frag. | l = 3 | l = 4 | l = 5 | l = 6 | l = 7 | l = 8 | l = 9 | l = 10 | l = 11 |
|---|---|---|---|---|---|---|---|---|---|
| 4 | 1 / 0.3 | 1 / 0.8 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| 3 | 0.8 / 0.2 | 1 / 0.6 | 1 / 0.9 | 1 | 1 | 1 | 1 | 1 | 1 |
| 2 | 0.5 / 0.1 | 0.8 / 0.3 | 1 / 0.7 | 1 | 1 | 1 | 1 | 1 | 1 |
| 1 | 0.0 | 0.0 | 0.0 | 1 | 1 | 1 | 1 | 1 | 1 |
Table 6. Simulated FRSR for W = 8, with the fragment constraints (maximum / minimum among the codewords; a single value means all codewords yielded the same FRSR).

| # of frag. | l = 3 | l = 4 | l = 5 | l = 6 | l = 7 | l = 8 | l = 9 | l = 10 | l = 11 |
|---|---|---|---|---|---|---|---|---|---|
| 4 | 0.6 / 0.0 | 0.8 / 0.2 | 1 / 0.6 | 1 / 0.9 | 1 | 1 | 1 | 1 | 1 |
| 3 | 0.5 / 0.0 | 0.7 / 0.1 | 0.9 / 0.4 | 1 / 0.8 | 1 / 0.9 | 1 | 1 | 1 | 1 |
| 2 | 0.0 | 0.5 / 0.1 | 0.5 / 0.2 | 0.9 / 0.5 | 1 / 0.8 | 1 | 1 | 1 | 1 |
| 1 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1 | 1 | 1 | 1 |
Table 7. Simulated FRSR for W = 10, with the fragment constraints (maximum / minimum among the codewords; a single value means all codewords yielded the same FRSR).

| # of frag. | l = 3 | l = 4 | l = 5 | l = 6 | l = 7 | l = 8 | l = 9 | l = 10 | l = 11 |
|---|---|---|---|---|---|---|---|---|---|
| 4 | 0.3 / 0.0 | 0.6 / 0.0 | 0.8 / 0.1 | 0.9 / 0.5 | 1 / 0.8 | 1 / 0.9 | 1 | 1 | 1 |
| 3 | 0.0 | 0.5 / 0.0 | 0.7 / 0.1 | 0.8 / 0.3 | 0.9 / 0.6 | 1 / 0.8 | 1 / 0.9 | 1 | 1 |
| 2 | 0.0 | 0.0 | 0.5 / 0.1 | 0.5 / 0.1 | 0.8 / 0.3 | 0.9 / 0.6 | 1 / 0.8 | 1 | 1 |
| 1 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1 | 1 |
From the analysis of Table 4, we can verify that, as predicted by the model shown in Table 1, we always have a recovery probability of one for l ≥ W. We can also observe that the FRSR increases as we increase the number of fragments used in the reconstruction process. For n = 1, we always obtain a probability of zero when l < W which, again, follows the model. When n > 1, we can observe that the FRSR is always improved when compared to the immediate and circular buffer algorithms. Furthermore, the FRSRs produced by the model appear to always lie between the minimum and maximum values obtained in the simulation. Finally, we can conclude that, even though this algorithm presents significant improvements over the two previous ones, we can never guarantee an FRSR of 1.0 when l < W, as predicted by the model. This is because there is always the possibility of capturing the same fragment n times, regardless of how many images are acquired, which makes a 100% recovery probability unattainable.
In the constrained version of the algorithm, as shown in Table 5, we observe that all FRSR values are either the same or slightly higher, which validates the efficacy of this addition to the algorithm. More importantly, we now have a 100% recovery probability when four fragments of five bits each are acquired, provided that the fragments follow the defined constraints.
Regarding Tables 6 and 7, we can observe that both follow the models presented in Tables 2 and 3, respectively. The same patterns present for the results referring to W = 6 are visible on both, again, indicating the scalability of the proposed reconstruction algorithm for larger codeword sizes. As was already noted in the models, the FRSR values appear to undergo a shift to the right on the table as we increase the codeword sizes, while retaining similar patterns among the values.
Finally, it is important to note that for all three values of W, provided that the constrained version of the algorithm is considered, we can always guarantee a 100% recovery probability as long as we acquire four fragments with W − 1 bits each.
In order to better compare the multiple proposed algorithms, the model and the simulation results, Figures 9–11 present the FRSR as a function of the distance between the LED fixture and the camera. While the plots are shown as a function of d, the FRSRs are calculated based on the number of visible symbols, as given by Equation (4). This leads to steps in the plots, since a fraction of a symbol is rounded up and counted as an entire detected symbol.
Since different codewords yield different FRSRs in the reconstruction algorithms from multiple fragments, instead of showing the results for each code or the minimum and maximum values presented in the previous tables, we assume that all codewords have the same probability of occurrence, resulting in a single FRSR value for a given (l, n) pair, for a given codeword length W. Furthermore, these results only consider the cases with n = 4 and with the fragment constraints applied before the reconstruction process, since these were observed to be the ones that show the best performance. Note that the presented results for the immediate and circular buffer algorithms do not require validation by simulation, as the recovery probability does not depend on the chosen codeword set, as long as we guarantee that they are uniquely decodable. The reconstruction algorithms from multiple fragments, on the other hand, may yield different probabilities for different codes, hence both the model and simulation results.
From these plots, we can verify that each proposed algorithm indeed performs better than the previous one, regardless of the codeword size W. Furthermore, the proposed model follows the simulation curves for all three scenarios, indicating that the model provides a good estimate, despite not taking into account the chosen set of codewords.
Finally, with the proposed reconstruction algorithm with fragment constraints, it is now possible to correctly recover the transmitted ID with 100% certainty at larger distances than before, assuming no bit errors at reception: the distance increases from around 2 m to around 5 m for W = 6, from 1.5 to 3.5 m for W = 8, and from 1.25 to 3 m for W = 10. In all three scenarios, this corresponds to a 2.5-fold increase in the maximum distance, in a system with a camera focal length of 2534 and a 15 cm diameter LED light source. Even at larger distances, we now have a probability of correctly decoding the transmitted ID greater than zero.
5 EXPERIMENTAL VALIDATION
5.1 Bitstream extraction algorithm
The models and simulations presented until now considered that the input of the frame reconstruction algorithms was one or a group of fragments which were assumed to be error-free. That is, the possibility of errors in the decoded bitstreams was not considered, as the process of recovering the actual bitstreams from the captured images was assumed to be ideal. In practice, this is neither an error-free process nor one with a single established implementation.
In order to perform an experimental validation of the proposed algorithms, we acquired a set of images of a single modulated light fixture under different conditions, following the configuration described in Figure 1, and applied to each an algorithm to extract the bitstream transmitted with OCC. Only then can the reconstruction algorithms be applied to find the transmitted codeword. The following sub-subsections describe this algorithm, which includes the image filtering and processing, the signal and bit extraction and the Manchester decoding. A complete diagram of the decoding process is presented in Figure 12.
5.1.1 Initial filtering and ROI identification
The first step in the decoding process consists in converting the image to grayscale and normalising it afterwards. Our goal is to identify the region of interest (ROI), that is, the section of the image that contains the circular fixture. Since the fixture has the rolling shutter pattern superimposed, we cannot apply a circular shape detector directly. Instead, we start by resizing the image, dividing it into blocks of H × H pixels; each block is reduced to a single pixel whose value is the maximum found in the original block. This guarantees that the illuminated stripes are retained in the resized version. After that, we binarise the resized image by applying a threshold and perform two morphological operations, a dilation followed by an erosion. The goal is to close any black stripes left in the resized image, transforming the fixture projection into a filled white region. The resizing process and the morphological operations are illustrated in Figure 13.
Finally, a blob detector is applied to the resulting image in order to detect the circular shape. This was done using the SimpleBlobDetector algorithm from OpenCV, with its parameters calibrated experimentally. When multiple blobs are found in a single image, which can happen, for example, as a result of reflections, we keep only the largest one, since all images contain a single light fixture.
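A sketch of this ROI identification step (in Python with OpenCV) is given below; the block size h, the binarisation threshold and the detector parameters are illustrative, not the experimentally calibrated values.

```python
import cv2
import numpy as np

def find_fixture_roi(image_bgr, h: int = 16, thresh: int = 70):
    """Locate the circular fixture despite the rolling shutter stripes.
    The block size h, threshold and detector parameters are illustrative."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    gray = cv2.normalize(gray, None, 0, 255, cv2.NORM_MINMAX)
    # Max-pool h x h blocks so the lit stripes survive the downscaling.
    rows, cols = gray.shape[0] // h, gray.shape[1] // h
    small = gray[:rows * h, :cols * h].reshape(rows, h, cols, h).max(axis=(1, 3))
    # Binarise, then dilate + erode to close the remaining dark stripes.
    _, binary = cv2.threshold(small.astype(np.uint8), thresh, 255, cv2.THRESH_BINARY)
    kernel = np.ones((3, 3), np.uint8)
    closed = cv2.erode(cv2.dilate(binary, kernel), kernel)
    # Blob detection on the filled region; keep only the largest blob.
    params = cv2.SimpleBlobDetector_Params()
    params.filterByColor = True
    params.blobColor = 255
    detector = cv2.SimpleBlobDetector_create(params)
    keypoints = detector.detect(closed)
    if not keypoints:
        return None
    best = max(keypoints, key=lambda kp: kp.size)
    # Map centre and size back to full-resolution coordinates.
    return best.pt[0] * h, best.pt[1] * h, best.size * h

# Usage (hypothetical file name): roi = find_fixture_roi(cv2.imread("frame.png"))
```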
5.1.2 ROI filtering and signal extraction
After isolating the ROI on the resized image, we isolate the same region on the higher resolution image, considering the scaling factor H used in the resizing process. With this, we perform the same steps as before, that is, an image binarisation followed by the two morphological operations. The goal here is also to close the fixture region, eliminating the striped pattern, so that we are left with a binary mask signalling the parts of the image that contain the OCC signal.
Afterwards, we average each row of the ROI extracted from the normalised grayscale image, considering only the pixels which have a corresponding white pixel on the binary mask. Figure 14 shows an example of the signal obtained with this procedure.
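A minimal sketch of this masked row averaging (the helper name is ours):

```python
import numpy as np

def extract_signal(roi_gray, roi_mask):
    """Average each row of the grayscale ROI using only the pixels that are
    white in the binary mask (inside the fixture region). Rows with no
    mask pixels come out as NaN and can be trimmed afterwards."""
    masked = np.where(roi_mask > 0, roi_gray.astype(float), np.nan)
    return np.nanmean(masked, axis=1)   # one sample per image row
```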
5.1.3 Bitstream extraction and Manchester decoding
With the obtained signal, we apply a threshold in order to binarise it, since the light is modulated with OOK. In order to choose this threshold, we computed the histogram of all the row averages found throughout the images for all the test distances, but for a single ID. The results of this analysis are visible in Figure 15.
By analysing this histogram, we can identify a clear peak for the dark values and a broader peak for the bright values. However, no clear valley can be identified in between, making the threshold decision unclear and indicating that, regardless of the chosen value, this process is prone to bit errors due to the presence of noise. That being said, the threshold was fixed at the value of 70.
After binarising the signal, we recover the bitstream by analysing the length of each transition on the binarised signal. Since the data is Manchester coded, we can only have one or two consecutive symbols with the same logic level. As such, we apply a threshold to the number of samples between two consecutive transitions, resulting in either {0}, {1}, {0, 0} or {1, 1}.
Finally, we attempt to decode the Manchester coded sequence. Since only 0→1 or 1→0 transitions are valid, if an invalid Manchester transition is detected, we discard the first bit of the sequence and attempt the decoding again. If we end up discarding all bits and no valid Manchester sequence can be found, we mark the image as undecodable. If, on the other hand, we find a valid sequence, we are ready to apply one of the decoding algorithms described in this paper, in order to recover the transmitted ID.
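A sketch of this bitstream extraction and Manchester decoding is given below; the run-length threshold, the symbol-period estimate and the convention that a 1-0 pair decodes to 1 are illustrative assumptions.

```python
import numpy as np

def signal_to_symbols(signal, amp_thresh: float = 70.0):
    """Binarise the row-average signal (OOK) and convert the run length
    between consecutive transitions into one or two symbols: Manchester
    coding guarantees at most two equal consecutive symbols."""
    levels = (np.asarray(signal) > amp_thresh).astype(int)
    runs, start = [], 0
    for i in range(1, len(levels) + 1):          # collect (level, length) runs
        if i == len(levels) or levels[i] != levels[start]:
            runs.append((int(levels[start]), i - start))
            start = i
    period = min(length for _, length in runs)   # crude symbol-period estimate
    symbols = []
    for level, length in runs:                   # long run -> two equal symbols
        symbols += [level] * (2 if length > 1.5 * period else 1)
    return symbols

def manchester_decode(symbols):
    """Decode pairs, here with the convention 1-0 -> 1 and 0-1 -> 0. On an
    invalid pair, drop the first symbol and retry; give up when nothing
    is left, marking the image as undecodable."""
    while len(symbols) >= 2:
        pairs = [tuple(symbols[i:i + 2]) for i in range(0, len(symbols) - 1, 2)]
        if all(p in ((1, 0), (0, 1)) for p in pairs):
            return [1 if p == (1, 0) else 0 for p in pairs]
        symbols = symbols[1:]                    # discard the first symbol
    return None

bits = manchester_decode(signal_to_symbols([10, 90, 95, 12, 8, 85, 11, 88, 92]))
print(bits)   # -> [0, 1, 0, 0]
```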
5.2 Experimental setup
For the experimental validation, a Pi Camera V2 was used, connected to a Raspberry Pi acquiring a series of consecutive photographs. These were acquired as still images, as opposed to a video stream, in order to promote randomness in the time interval between two consecutive photographs. The camera was placed on a tripod, aligned with the normal of a Tridonic DLA G2 LED fixture, modulated with OOK. The fixture was powered with a custom designed current driver and the modulating signal was generated using an Arduino Nano. The distance between the camera and the fixture was varied, in the interval of 1–10 m, in 1 m steps. Finally, only codes with length W = 6 were considered.
For each distance, a series of 20 images were captured for each codeword, resulting in a total of 20 × 8 × 10 = 1600 photographs. The ISO setting of the camera was kept at 100 and the exposure time was set at 9 µs, the minimum allowed value.
5.3 Experimental results
The FRSR values for the multiple distances obtained experimentally are presented in Figure 16, along with the simulated and ideal curves. Using the same set of photographs, we applied the three main reconstruction algorithms described before: the immediate decoding, the circular buffer and the frame reconstruction from multiple fragments. The latter was applied considering the version with fragment constraints and with n = 4.
From the analysis of these curves, we can verify that, as expected, increasing the complexity of the algorithm improves the FRSR for a given distance. The immediate reconstruction algorithm is the worst performing one and, while the experimental curve follows the shape of the model, the actual FRSR values are somewhat below the expected ones. This might be due to errors in the bitstream extraction process, since we are only using thresholding to binarise the signal and detect the Manchester coded bits. From the analysis of Figure 15, we observed that there is no clear valley in the histogram between the light and dark stripes of the image, suggesting that no ideal threshold can be found, leading to errors in the decoded bitstream.
Regarding the circular buffer algorithm, we can observe that it improves upon the immediate reconstruction one, but still performs considerably below the model. This seems reasonable considering the presence of errors indicated by the immediate reconstruction curves, as we are using the same set of images to validate both.
Regarding the reconstruction from multiple fragments, we can also observe that it performs better than the two previous ones. Furthermore, we can conclude that the experimental FRSR curve somewhat follows the model, despite having a considerably worse performance. This, again, is expected considering the aforementioned bit errors.
Finally, we can conclude that, even though the bitstream extraction process has errors, the circular buffer and reconstruction from multiple fragments still show improvements, which indicates that the proposed algorithms are still able to improve the FRSR in the presence of bit errors.
In order to assess the experimental performance of the proposed reconstruction algorithms abstracting from the specific parameters of the experimental system (the camera focal length f, transmitter radius r, pixels per symbol M and distance between the camera and the transmitter d), Figure 17 shows the FRSR curves as a function of the number of visible symbols l. Since the experimental validation was performed by acquiring a series of photographs across multiple distances, a method is needed to convert them to the corresponding fragment lengths l. However, due to uncertainties in the image processing algorithm, we cannot guarantee that all photographs for a given distance d will have the same number of symbols. The solution is to use the nominal fragment length for each distance, that is, the number of visible symbols on the image for a given d, obtained with Equation (6).
We can observe that, as expected, the experimental curves never perform better than the corresponding simulation upper bounds. Furthermore, we see that each proposed algorithm performs better than the previous ones for a given nominal fragment length. Finally, we can still observe the impact of the errors discussed above on the three experimental curves; nevertheless, the proposed reconstruction algorithms provide an improvement over the baseline immediate method, even in the presence of bit errors.
6 CONCLUSIONS
In this paper, we analysed the process of reconstructing an original frame from a set of fragments smaller than the frame itself, in the context of camera-based OCC systems using a rolling shutter. We proposed multiple strategies to improve the probability of successfully recovering the originally transmitted ID, considering both a single frame at a time and multiple frames simultaneously. We proposed models for the multiple algorithms that give the probability of success in the recovery process, given the fragment length l and the number of fragments considered n. These were validated by simulation, considering 6-bit, 8-bit and 10-bit codewords. Finally, we proposed an image processing algorithm to extract the bitstreams from the OCC images and experimentally validated the reconstruction algorithms considering 6-bit codewords. We concluded that, despite the presence of errors in the bitstream extraction process, our proposed algorithms effectively improved the FRSR for the considered scenarios. This paper improves over previous works, such as Ref. [21], by allowing the frame length to be larger than the sampling window and by tolerating an arbitrary temporal gap between frames. We also managed to improve the FRSR without using a specific coding scheme, such as the Hamming codes proposed in Ref. [11].
In the case of VLP, our application use-case, these results allow the system to operate at larger distances from the fixtures, making VLP systems more robust and scalable. In concrete terms, our simulation results showed a 2.5-fold increase in the maximum distance at which we can guarantee a successful recovery of the transmitted ID, for the three considered codeword sizes. It is important to note that the most relevant system performance parameter is the number of symbols detected in each frame which, in turn, depends on the camera focal length, the fixture radius and the distance between the transmitter and the receiver. This means that the performance of the reconstruction algorithms is a function of the fragment length, thus being only indirectly influenced by the camera parameters and the distance between the transmitter and the receiver.
In future works, the selection of the codeword list should be further explored, as our results indicate that, with a different set of codes, the proposed reconstruction algorithm could perform better, further increasing the correct recovery probability given a certain number of visible symbols and consecutive frames used. Besides the codeword list, the choice of coding scheme itself should be further explored. As suggested by Ref. [27], having the codeword divided into a preamble and an identifier with the same length might not be ideal from a bit efficiency perspective. Furthermore, the experimental validation process should be conducted for codewords with larger sizes.
AUTHOR CONTRIBUTIONS
Miguel Rêgo: Conceptualization; investigation; methodology; writing – original draft. Joaquin Perez: Conceptualization; writing – review & editing. Pedro Fonseca: Conceptualization; supervision; writing – review & editing. Luís Nero Alves: Conceptualization; supervision; writing – review & editing.
ACKNOWLEDGEMENTS
This work is funded by FCT/MCTES through national funds and, when applicable, co-funded by EU funds under the project UIDB/50008/2020-UIDP/50008/2020. This article is also based upon work from COST Action NEWFOCUS (CA19111), supported by COST (European Cooperation in Science and Technology), and Grant PID2020-113785RB-100 funded by MCIN/AEI/10.13039/501100011033.
CONFLICT OF INTEREST STATEMENT
The authors hereby declare that they have no conflicts of interest.
Open Research
DATA AVAILABILITY STATEMENT
The data that support the findings of this study are available from the corresponding author upon reasonable request.