User:Eirisu


eirisu@ifi.uio.no


Master's Thesis:

  • Stereo Vision for an Unmanned Ground Vehicle

My Master's Thesis concerns the use of optical sensors to construct 3D models of the area around an autonomous vehicle.

Timeline

October

  • Finish Road-Detection Algorithm
  • Implement the algorithm in C++ (Maybe as a ROS-node)
  • Demonstration of the working system at FFI (October 30)
  • Test 1

November

  • Development of the dirt-road version of the algorithm.
  • Do the same approaches work in this terrain?
  • Test 2

December

  • Development of map representation (free space, road, path, obstacle, etc.)
  • Classification

January

  • Open area path segmentation

February

  • Begin Writing the Master's Thesis

March

April

  • Deliver draft for review

May

  • Deliver Master's Thesis

Road Detection

One of the first things the Norwegian Defence Research Establishment (FFI) wants to achieve is an autonomous vehicle able to drive by itself on tarmac roads. Thus, I've begun making an algorithm that can detect where the road is in a color image.

As a first step, I've chosen to take a segment of the image and calculate a Gaussian model for this section. The assumption is that the road is always right in front of the vehicle. Then, a region-growing search is executed to find the other 8-connected pixels belonging to the Gaussian model, according to the Bayesian cost function:

\epsilon = \frac{1}{(2\pi)^{\frac{n}{2}}|\Sigma|^{\frac{1}{2}}}\exp\bigg(-\frac{1}{2}(I(n,m) - \mu)^{T}\Sigma^{-1}(I(n,m) - \mu)\bigg)
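
A minimal sketch of this step (in Python/NumPy for illustration; the actual implementation is planned in C++, and the seed-region coordinates and acceptance threshold below are placeholders I've picked, not tuned values):

import numpy as np
from collections import deque

def fit_seed_model(image, rows, cols):
    """Fit a Gaussian (mean, covariance) to the RGB values of a seed patch
    assumed to lie on the road, right in front of the vehicle."""
    patch = image[rows[0]:rows[1], cols[0]:cols[1]].reshape(-1, 3).astype(np.float64)
    return patch.mean(axis=0), np.cov(patch, rowvar=False)

def grow_road_region(image, seed, mu, sigma, threshold=1e-6):
    """Region growing over 8-connected neighbours: a pixel joins the road
    region if its Gaussian density under the seed model exceeds `threshold`
    (placeholder value)."""
    h, w = image.shape[:2]
    inv_sigma = np.linalg.inv(sigma)
    norm = 1.0 / ((2 * np.pi) ** (mu.size / 2.0) * np.sqrt(np.linalg.det(sigma)))
    mask = np.zeros((h, w), dtype=bool)
    mask[seed] = True
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                rr, cc = r + dr, c + dc
                if 0 <= rr < h and 0 <= cc < w and not mask[rr, cc]:
                    diff = image[rr, cc].astype(np.float64) - mu
                    density = norm * np.exp(-0.5 * diff @ inv_sigma @ diff)
                    if density > threshold:
                        mask[rr, cc] = True
                        queue.append((rr, cc))
    return mask

Here fit_seed_model would be called on a rectangle just in front of the vehicle, and the returned mask is the detected road region.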


Bayesian Classification on RGB space

The results below are computed using only the RGB color space as features and the Bayesian cost function, on a 373×113-pixel image. The left images show some good results. In these examples, the road texture is fairly homogeneous, and there is little ambiguity. The two images on the right show some bad results. In the top image, the section right in front of the car, which is used for making a Gaussian model of the road, is dominated by shadows, which causes the Gaussian model to be false. Shadows are a problem, but there are methods to handle them. The bottom image shows a situation where several areas display similar RGB values to the road.


Including Pixel-Position Into the Feature-Space

Fill in here


Texture-Based Approach

I've experimented with using the Gray-Level Co-Occurrence Matrix (GLCM) in order to extract textural information from the image. In my experiment, I've computed the GLCM for a 31×31-pixel window surrounding each pixel in the image. Then, 26 different features are computed by applying different functions to the GLCM. I've tested different combinations of these features, to see which feature combination provides the most robust differentiation. Below are some of the best features I found:
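
As a sketch of how such features can be computed per pixel, assuming scikit-image's GLCM helpers (only a handful of the 26 features are shown here, and the quantisation to 32 grey levels is a placeholder choice):

import numpy as np
from skimage.feature import graycomatrix, graycoprops  # greycomatrix/greycoprops in older scikit-image

def glcm_features(gray, row, col, win=31, levels=32):
    """Compute a few standard GLCM statistics for the window centred on
    (row, col). Border handling is omitted; `gray` is assumed uint8 in [0, 255]."""
    half = win // 2
    patch = gray[row - half:row + half + 1, col - half:col + half + 1]
    # Quantise to `levels` grey levels to keep the co-occurrence matrix small.
    patch = (patch.astype(np.float64) / 256.0 * levels).astype(np.uint8)
    glcm = graycomatrix(patch, distances=[1], angles=[0, np.pi / 2],
                        levels=levels, symmetric=True, normed=True)
    return np.array([graycoprops(glcm, p).mean()
                     for p in ("contrast", "homogeneity", "energy", "correlation")])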


Optimization using Expectation Maximization

Road images often contain different elements which make classification difficult. Differences in lighting, shadows, texture variance, etc. contribute to making the class distribution of the road complex. Until now I've tried to find features which are robust to these differences, but another approach is to make the model which represents the road class more sophisticated. To make the Gaussian model of the road features more complex, I've utilized the Expectation Maximization algorithm. The algorithm is originally a clustering algorithm, which tries to fit K Gaussian models to a feature space. However, instead of using the models as clusters, I use them as a distance measure. By estimating the probability density function across all the models, we can achieve a distance measure that takes into account a complex class spread:

  • We construct K Gaussian models, each with its own mean μ_i and covariance matrix Σ_i.
  • We then weight each of these models with a factor ω_i, based on the number of samples belonging to each.
  • We then compute the probability density function across all the Gaussian models:
    p(x) = \sum_{i=1}^{K}\omega_{i} f_{pdf}(x|\mu_i, \Sigma_i)

    where f_{pdf}(x|\mu_i, \Sigma_i) is given by:
    f_{pdf}(x|\mu_i, \Sigma_i) = \frac{1}{(2\pi)^{D/2}|\Sigma_i|^{1/2}}\exp\big(-\frac{1}{2}(x-\mu_i)^{T}\Sigma_i^{-1}(x-\mu_i)\big)
    where D is the number of features in the feature-space.
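
A small sketch of this mixture density, using scikit-learn's EM implementation to fit the K components (K = 3 is only a placeholder):

import numpy as np
from scipy.stats import multivariate_normal
from sklearn.mixture import GaussianMixture

def fit_road_mixture(road_features, K=3):
    """Fit K Gaussian components to the road feature samples with EM."""
    gmm = GaussianMixture(n_components=K, covariance_type="full").fit(road_features)
    return gmm.weights_, gmm.means_, gmm.covariances_

def mixture_density(x, weights, means, covariances):
    """p(x) = sum_i w_i * N(x | mu_i, Sigma_i), used as the distance measure."""
    return sum(w * multivariate_normal.pdf(x, mean=mu, cov=cov)
               for w, mu, cov in zip(weights, means, covariances))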

Below is an example of a single run of the algorithm.

However, we can further develop this method by letting these Gaussian models evolve over time (i.e., from image to image):

  • The first image captured leads to the first K Gaussian models.
  • For each image captured after that, we make K new Gaussian models.
  • We compare each old model to each new one by the Mahalanobis distance between the models.
    • If the distance is sufficiently low, the current Gaussian model is updated by a factor of the new Gaussian model's properties.
    • If not, the new model takes over for the weakest old Gaussian model in the existing Gaussian mixture model.
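
A rough sketch of this frame-to-frame update, with each Gaussian stored as a (weight, mean, covariance) tuple; the Mahalanobis threshold and the blending factor alpha are placeholder values:

import numpy as np

def mahalanobis(mu_a, mu_b, cov_b):
    """Mahalanobis distance between one mean and another Gaussian."""
    diff = mu_a - mu_b
    return np.sqrt(diff @ np.linalg.inv(cov_b) @ diff)

def update_mixture(old, new, dist_thresh=3.0, alpha=0.3):
    """Blend each new component into the closest old one if the Mahalanobis
    distance is small enough; otherwise replace the weakest old component.
    `old` and `new` are lists of (weight, mean, covariance) tuples."""
    old = list(old)
    for w_n, mu_n, cov_n in new:
        dists = [mahalanobis(mu_n, mu_o, cov_o) for _, mu_o, cov_o in old]
        i = int(np.argmin(dists))
        if dists[i] < dist_thresh:
            w_o, mu_o, cov_o = old[i]
            old[i] = (w_o, (1 - alpha) * mu_o + alpha * mu_n,
                      (1 - alpha) * cov_o + alpha * cov_n)
        else:
            weakest = int(np.argmin([w for w, _, _ in old]))
            old[weakest] = (w_n, mu_n, cov_n)
    return old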

Below is an example of the algorithm applied to some image sequences from the KITTI dataset.

Illumination-Invariant Images

Shadows and different lighting conditions are a big problem for many segmentation algorithms. Therefore, it is beneficial to be able to filter out these differences in lighting, before segmentation is initiated.

By using knowledge about the camera's spectral response, one can compute an illumination-invariant image with the following formula: ii_image = 0.5 + log(I_G) − α·log(I_B) − (1 − α)·log(I_R), where α is the spectral response coefficient for the particular camera, I_R, I_G, I_B are the original image separated into the three RGB color channels, and ii_image is the illumination-invariant output image.
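
A minimal sketch of the formula, assuming an OpenCV-style BGR image; α = 0.45 is only a placeholder, since the real value depends on the camera's spectral response:

import numpy as np

def illumination_invariant(image_bgr, alpha=0.45):
    """ii = 0.5 + log(G) - alpha*log(B) - (1 - alpha)*log(R)."""
    img = image_bgr.astype(np.float64) / 255.0 + 1e-6  # small offset to avoid log(0)
    b, g, r = img[..., 0], img[..., 1], img[..., 2]
    return 0.5 + np.log(g) - alpha * np.log(b) - (1.0 - alpha) * np.log(r)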

Results

Vanishing Point and Road Edge Detection

Vanishing Point Detection

A vanishing point is a point in the image where parallel lines seem to meet, from the observer's perspective. Vanishing points have a lot of applications in computer vision (camera calibration, SLAM, 3D reconstruction), but for road detection they can help refine the search space. Since the vanishing point is a point far away from the observer, and since our application is always stationed on the ground, we assume that pixels above this point are either too far away or belong to nearby obstacles.

(Figure: Example of a Gabor filter bank containing Gabor kernels at 8 different orientations and at 5 different scales.)

To detect these vanishing points, I've used an edge-detection filter, known as the Gabor filter, defined as a 2D sinusoid multiplied by a Gaussian. The orientation of the sinusoid determines the orientation of the Gabor kernel, and by convolving the image with a Gabor kernel of a specific orientation, we get a probability of each pixel belonging to that orientation. The purpose of using this technique is to find an individual orientation probability for each pixel. I therefore construct multiple Gabor kernels with different orientations. It is also a good idea to construct Gabor kernels at different scales, to be able to detect lines of different thickness.

I've constructed a Gabor filter bank with 8 different orientations between 0 and 157.5 degrees, at 5 different scales. To get a probability estimate of the pixel orientations, I've done the following:

  • Convolve the image with each of the Gabor kernels G_{\omega,\varphi} to get an 8x5 bank of filtered images.
  • For each filtered image, compute the complex response: I_{\omega,\varphi}(x,y) = Re(G_{\omega,\varphi}(x,y))^2 + Im(G_{\omega,\varphi}(x,y))^2
  • For each pixel in the complex response images, compute the average over the 5 scales: R_{\varphi}(x,y) = mean_{\omega}(I_{\omega,\varphi}(x,y))
  • Then finally, the orientation of each pixel is estimated by taking, for each pixel, the maximum over the 8 response images: \theta(x,y) = argmax_{\varphi} R_{\varphi}(x,y)

At the last step, in order to avoid too much noisy classification, we introduce a confidence threshold for the orientation estimation. This means we only assign an orientation to a pixel if its detected maximum orientation response, compared to the other orientations, is above a certain threshold.
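
A sketch of these steps using OpenCV's Gabor kernels; the kernel sizes, sigma/lambda parameters and the confidence rule are placeholders, and an even/odd kernel pair stands in for the real and imaginary parts of a complex Gabor filter:

import numpy as np
import cv2

def orientation_map(gray, n_orient=8, scales=(7, 11, 15, 21, 27), conf_thresh=0.2):
    """Estimate a per-pixel orientation from a Gabor filter bank
    (8 orientations x 5 scales, as described above)."""
    gray = gray.astype(np.float32)
    thetas = np.arange(n_orient) * np.pi / n_orient
    responses = []
    for theta in thetas:
        per_scale = []
        for k in scales:
            # Even (cosine) and odd (sine) kernels give the real/imaginary parts.
            even = cv2.getGaborKernel((k, k), sigma=k / 4.0, theta=theta,
                                      lambd=k / 2.0, gamma=0.5, psi=0)
            odd = cv2.getGaborKernel((k, k), sigma=k / 4.0, theta=theta,
                                     lambd=k / 2.0, gamma=0.5, psi=np.pi / 2)
            re = cv2.filter2D(gray, cv2.CV_32F, even)
            im = cv2.filter2D(gray, cv2.CV_32F, odd)
            per_scale.append(re ** 2 + im ** 2)          # squared complex response
        responses.append(np.mean(per_scale, axis=0))     # average over the scales
    responses = np.stack(responses)                      # shape (n_orient, H, W)
    best = responses.max(axis=0)
    second = np.partition(responses, -2, axis=0)[-2]
    theta_idx = responses.argmax(axis=0)
    # Confidence check: keep a pixel only if the winning orientation clearly dominates.
    confident = (best - second) > conf_thresh * best
    return theta_idx, thetas, confident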

From this, the vanishing point is estimated by a voting scheme. We iterate through the orientation image θ, and for each pixel we do the following:

  • Find all pixels below the current pixel P, within the radius r = 0.3 * imageHeight.
  • For each sub-pixel V:
    • We calculate the angular difference γ between the orientation of V and the direction of the line from V to P; if γ is lower than a certain threshold, P receives a vote from V.
    • The voting-score is a function of this angular difference, and the distance, d(P,V), between the two pixels:  Vote(P,V) = \frac{1}{1 + (\gamma d(P,V))^2}
  • The pixel which receives the highest score is selected as the vanishing point.
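
A brute-force sketch of this voting scheme (candidate pixels are subsampled with a step parameter to keep it tractable; both the step and the angular threshold are placeholders):

import numpy as np

def vanishing_point(theta_idx, thetas, confident, gamma_thresh=np.deg2rad(8), step=4):
    """Vote for each candidate pixel P using confident voter pixels V below it,
    within a radius of 0.3 * image height, with
    Vote(P, V) = 1 / (1 + (gamma * d(P, V))^2)."""
    h, w = theta_idx.shape
    radius = 0.3 * h
    scores = np.zeros((h, w))
    vr, vc = np.nonzero(confident)               # voter pixels with a confident orientation
    voter_theta = thetas[theta_idx[vr, vc]]
    for pr in range(0, h, step):                 # candidate vanishing-point rows
        for pc in range(0, w, step):
            below = (vr > pr) & (np.hypot(vr - pr, vc - pc) <= radius)
            if not below.any():
                continue
            dy = pr - vr[below]                  # negative: P lies above V
            dx = pc - vc[below]
            # Orientation of the line through V and P, folded into [0, pi)
            # (the sign flip accounts for image rows growing downwards).
            line_angle = np.arctan2(-dy, dx) % np.pi
            gamma = np.abs(line_angle - voter_theta[below])
            gamma = np.minimum(gamma, np.pi - gamma)
            ok = gamma < gamma_thresh
            d = np.hypot(dy[ok], dx[ok])
            scores[pr, pc] = np.sum(1.0 / (1.0 + (gamma[ok] * d) ** 2))
    return np.unravel_index(scores.argmax(), scores.shape)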


Below is a visualization of the different steps of the algorithm.

Road Edge Detection

When a vanishing point is detected, the road edges can be estimated. The algorithm is quite straightforward:

  • First apply an edge-detector on the input image, for instance Canny.
  • Detect the N strongest edges through the Hough transform.
    • Filter out edges that are close to horizontal and above the vanishing point.
  • Sort the N strongest edges into left edges and right edges, depending on whether their midpoint is in the left or right half of the image.
  • For each line in each of these two sets:
    • Extract k sample-points (pixels) on the line.
      • For each point, calculate the angular difference between the pixel's orientation and the line's orientation.
      • If the angular difference is below threshold, the line receives a vote.
    • Extract two patches on either side of the line, A1 and A2.
      • Compute the difference between these two patches:  diff(A1, A2) = \frac{|mean(A1) - mean(A2)|}{\sqrt{var(A1) + var(A2)}}
    • The line's score is multiplied by the difference between A1 and A2.
  • When the highest-scoring left edge and right edge are found, we refine our search space as the area going from the vanishing point, through the lowest point on each of the edges, to the image border.
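
A partial sketch of the first steps (Canny + probabilistic Hough, filtering of near-horizontal and above-vanishing-point lines, and the patch-difference score); the thresholds are placeholders, and the orientation voting over sample points is omitted:

import numpy as np
import cv2

def candidate_road_edges(gray, vp_row, min_angle_deg=15):
    """Find candidate road-edge lines with Canny + probabilistic Hough and
    split them into left/right sets. Ranking by edge strength is omitted."""
    h, w = gray.shape
    edges = cv2.Canny(gray, 50, 150)
    lines = cv2.HoughLinesP(edges, 1, np.pi / 180, threshold=50,
                            minLineLength=h // 8, maxLineGap=10)
    left, right = [], []
    if lines is None:
        return left, right
    for x1, y1, x2, y2 in lines[:, 0]:
        angle = abs(np.degrees(np.arctan2(y2 - y1, x2 - x1)))
        angle = min(angle, 180 - angle)
        mid_x, mid_y = (x1 + x2) / 2.0, (y1 + y2) / 2.0
        # Filter out near-horizontal lines and lines above the vanishing point.
        if angle < min_angle_deg or mid_y < vp_row:
            continue
        (left if mid_x < w / 2 else right).append((x1, y1, x2, y2))
    return left, right

def patch_difference(a1, a2):
    """diff(A1, A2) = |mean(A1) - mean(A2)| / sqrt(var(A1) + var(A2))."""
    return abs(a1.mean() - a2.mean()) / np.sqrt(a1.var() + a2.var())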

Below are some results from the algorithm:
