Stream-based ORB Feature Extractor with Dynamic Power Optimization
Tran Phong1, Thinh Hung Pham1,
Siew-Kei Lam1, Meiqing Wu1, and Bhavan A. Jasani2
1Nanyang Technological University, Singapore
2Carnegie Mellon University, United States
Abstract
The Oriented Fast and Rotated BRIEF (ORB)
feature extractor, which consists of key-point detection and
descriptor computation, is a key module in many computer
vision systems. Existing hardware implementations of ORB
feature extractor only focus on increasing performance with
power optimization as a post consideration. In this paper, we
present a stream-based ORB feature extractor that incorporates
mechanisms to lower the dynamic power consumption. These
mechanisms exploit the fact that the number of detected keypoints is typically small. The proposed solution significantly
lowers the switching activity of the key-point detection and
descriptor computation stages by early pruning of non-likely
key-points and gating the descriptor computation stages. Further
power reduction and resource minimization are achieved by
employing a threshold-guided bit-width optimization strategy to
truncate the redundant bits in the key-point detection stage.
Finally, we propose an approximation method to achieve rotation
invariance of the descriptors. FPGA implementation targeting
the Altera Aria V device shows that the proposed strategies lead
to over 25% reduction in dynamic power and lower resource
utilization, with only marginal loss in accuracy
Introduction
We want to implement a power-optimized and stream-based ORB feature extractor
to achieve real-time performance in computer vision systems. Yet, there are two challenges:
Firstly, the key-point detection algorithm is computationally intensive due to the numerous addition and multiplication operators
needed for calculating the corner response.
Secondly, a large number of row buffers are
typically used in stream processing to cache the input pixel stream, which contributes to significant dynamic power
consumption and hardware resources.
We thus introduce four main contributions to solve the aformentioned challenges:
We propose a new stream-based architecture for ORB feature extraction that utilizes an approximation angle
discretization method to acehive rotation invariance of the descriptors that avoids costly operations.
We present strategies to lower the dynamic power by effectively inhibiting unncessary signal activities
in the key-point detection and descriptor computation pipeline.
We introduce a bit-width optimization strategy in the key-point detection stage to lower power consumption.
We show that the proposed strategies maintain a marginal loss of accuracy while achieving notable savings in
dynamic power and resource utilization on the Altera Aria V FPGA.
Accuracy Evaluation
First, we measure the accuracy of the key-points detector in which we use repeatability criterion, which measures the robustness against
variety of changes in image conditions, i.e. rotation, scaling, and illumination. Here we report the difference in repeatability of
Prop2-KD and Prop1-KD which is computed in Equation (1).
$$
\begin{equation}
\tag{1}
\Delta r = \frac{|\text{repeat}_{\text{Prop2-KD}} - \text{repeat}_{\text{Prop1-KD}}|}{\text{repeat}_{\text{Prop1-KD}}} \times 100 \%
\end{equation}
$$
Figure 2 illustrates the difference in repeatability rate of
Prop2-KD
and Prop1-KD, where the x-axis is the truncated
bit-width of gradient values (G = 11 to 5). The results show
that the repeatability difference increases with larger bit-width
truncation which implies higher accuracy degradation. When
6 bits of the gradients are truncated, the loss of accuracy
of the proposed architecture is marginal (less than 6%) for
all datasets, but the gains in terms of power and resource
utilization reduction are significant (as shown in the next subsection).
Second, we compare the descriptor accuracy which is determined by the Hamming distance
of the 256-bit descriptor vectors of each implementation(Existing
and Prop2).
Figure 3 shows the average
Hamming distance of Existing
and Prop2. It is evident that our
design results in fewer error bits than Existing. Particularly,
the proposed ORB feature descriptor achieves about 50% reduced
Hamming distance compared to Existing for all four datasets.
One of the main reasons that our design resulted in higher
accuracy is the rounding scheme that we have employed in the
Angle unit for approximating the nearest index of the \(\cos \theta \) and
\(\sin \theta \) LUT, and for determining the point-pairs for binary test.
Unlike Existing, we employ rounding to the nearest integer,
which produces more accurate results.
This research project is partially funded by the National
Research Foundation Singapore under its Campus for Research Excellence and Technological Enterprise (CREATE)
programme
Discussion
For any questions regarding the publications, please contact me or post
a comment below and I will respond shortly.