Monday, June 4, 2012

Final Blog Post

Tracking Attention Success*!

Intersecting the tracked target's normal vector with the plane of the capture source has been implemented. Here are some screenshots.

Slightly off, due to the normal not actually being aligned with the other points... acceptable.

Missed the capture shot.


Damnit. It looked so perfect too!

The best shot I got. Also, my first one.


Technical Implementation Details Involved:
- Haar Classifier for the Frontal Face Profile, for face detection.
- Image Thresholding techniques, for point extraction.
- (Non-Coplanar) Pose Estimation, to determine transformations mapping known reference points to the captured source points.
- Vector-to-Plane Intersection.
- Basic knowledge of Graphics and C++ Programming.

Resources used during the ten-week period:
http://www.aishack.in/2010/07/tracking-colored-objects-in-opencv/
http://www.aforgenet.com/articles/posit/
http://opencv.willowgarage.com/wiki/Posit
http://nghiaho.com/?page_id=576 (Planar Pose Estimation; ultimately did not use it, but it proved to be a valuable experience)
http://opencv.willowgarage.com/wiki/FaceDetection

*Limited success, really, considering that face detection has become so unreliable that it takes approximately 30 seconds of praying for it to detect the target for every 5 seconds of "found face" capture time.

Monday, May 28, 2012

Week 9 - Calculating the Intersection of the Normal Ray with the Plane.

Implemented the calculations to find where the normal intersects the plane. Now I need to test the feed against these calculations and make adjustments based on the physical attributes of the device and the tracked target. Hoping to get that part done before Wednesday.



Update: Minor bug in which the points are flipped from their expected positions. Otherwise working as expected.
Update 2: Upon further inspection, it wasn't that the points were flipped, but that the matrix I am using is rotated about 180 degrees from what I was expecting. I will probably need to improve how I am acquiring the points of interest before delving deeper into the problem, as my method of acquiring them has been fickle...

Wednesday, May 23, 2012

Week 8 - Determining Surface Intersection.

Now that we know which way the object we are tracking is facing, and we are able to capture where it is on a 2D plane, we need to determine where its normal will intersect our capture surface.  To do this, we need to make two assumptions.

One: We know the measurements of the object we are tracking and are able to convert them into pixel coordinates.

Two: We assume the surface we are to intersect lies on the same plane as the capture source.

My method will then be to project a ray from the tracked object to the surface to determine the intersection location.
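
In code, under assumption two the capture surface is simply the plane z = 0 in camera coordinates, so this reduces to a standard ray/plane intersection. A minimal sketch in C++ (the target position and normal below are made-up placeholder values, not measured ones):

// Intersect a ray with the capture plane z = 0 (assumption two).
#include <cstdio>

struct Vec3 { double x, y, z; };

// Casts a ray from 'origin' along 'dir' into the plane z = 0.
// Returns false if the ray is parallel to the plane or points away from it.
bool intersectCapturePlane(const Vec3& origin, const Vec3& dir, Vec3& hit) {
    if (dir.z == 0.0) return false;   // parallel: no intersection
    double t = -origin.z / dir.z;     // solve origin.z + t*dir.z = 0
    if (t < 0.0) return false;        // plane is behind the ray
    hit.x = origin.x + t * dir.x;
    hit.y = origin.y + t * dir.y;
    hit.z = 0.0;
    return true;
}

int main() {
    Vec3 target = { 2.0, 1.5, 24.0 };    // hypothetical position (inches)
    Vec3 normal = { -0.1, -0.05, -1.0 }; // hypothetical normal, toward camera
    Vec3 hit;
    if (intersectCapturePlane(target, normal, hit))
        std::printf("hit capture plane at (%f, %f)\n", hit.x, hit.y);
    return 0;
}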

---------

Another possibility I can start with is to assume that I know the source object's dimensions, or that these dimensions can be configured. I can then determine the distance to the tracked object by casting two rays toward two of its known points, and then using the angle between them and a little bit of trigonometry. From there, the methodology should be the same.
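
To make that trigonometry concrete, here is a small sketch assuming a pinhole camera; the focal length, the physical separation between the two points, and the image coordinates are all placeholder values:

// Estimate distance to a target from two image points whose physical
// separation is known, via the angle between the two viewing rays.
#include <cmath>
#include <cstdio>

int main() {
    double f   = 760.0; // assumed focal length, in pixels
    double sep = 6.0;   // known separation of the two points (inches)

    // Image coordinates of the two points, relative to the image center.
    double u1 = -40, v1 = 12, u2 = 52, v2 = 10;

    // Viewing rays through the pinhole toward each image point.
    double a[3] = { u1, v1, f }, b[3] = { u2, v2, f };
    double dot = a[0]*b[0] + a[1]*b[1] + a[2]*b[2];
    double na  = std::sqrt(a[0]*a[0] + a[1]*a[1] + a[2]*a[2]);
    double nb  = std::sqrt(b[0]*b[0] + b[1]*b[1] + b[2]*b[2]);
    double theta = std::acos(dot / (na * nb)); // angle between the rays

    // If the separation subtends angle theta (target roughly centered and
    // facing the camera), basic trigonometry gives the distance:
    double dist = (sep / 2.0) / std::tan(theta / 2.0);
    std::printf("estimated distance: %f inches\n", dist);
    return 0;
}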

Something like this...:

Sunday, May 13, 2012

Successful calculation of rotation by feeding non-coplanar points into the pose estimator

So I have a functional pose estimator now, and can begin either improving its performance or moving on to the next part of my project, which would involve getting the physical dimensions of the work environment relative to the user.
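
For reference, a minimal sketch of how the non-coplanar POSIT call can be wired up through OpenCV's legacy C API, roughly following the Willow Garage demo. The model points reuse the dimensions from the May 6 post below; the image points and focal length are made-up placeholders:

// Non-coplanar POSIT with OpenCV's legacy C API (OpenCV 2.x era).
#include <opencv/cv.h>
#include <vector>
#include <cstdio>

int main() {
    // Model points in object space (inches); POSIT expects the reference
    // point (origin) first. RG, RB, RV match the physical model.
    std::vector<CvPoint3D32f> model;
    model.push_back(cvPoint3D32f(0, 0, 0));
    model.push_back(cvPoint3D32f(0, 0, 5.75f));     // RG
    model.push_back(cvPoint3D32f(6.0f, 0, 0));      // RB
    model.push_back(cvPoint3D32f(0, 6.0f, 1.25f));  // RV

    // Matching 2D points from thresholding, relative to the image center
    // (hypothetical values for illustration).
    std::vector<CvPoint2D32f> image;
    image.push_back(cvPoint2D32f(-10, 8));
    image.push_back(cvPoint2D32f(-4, 60));
    image.push_back(cvPoint2D32f(55, 5));
    image.push_back(cvPoint2D32f(-6, -52));

    CvPOSITObject* posit = cvCreatePOSITObject(&model[0], (int)model.size());

    float rotation[9];           // 3x3 rotation matrix, row-major
    float translation[3];        // translation vector
    double focalLength = 760.0;  // assumed focal length, in pixels

    CvTermCriteria criteria =
        cvTermCriteria(CV_TERMCRIT_EPS | CV_TERMCRIT_ITER, 100, 1e-5);
    cvPOSIT(posit, &image[0], focalLength, criteria, rotation, translation);

    // The third column of the rotation matrix is the model's z-axis
    // expressed in camera coordinates, i.e. the normal we care about.
    std::printf("normal: %f %f %f\n", rotation[2], rotation[5], rotation[8]);

    cvReleasePOSITObject(&posit);
    return 0;
}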

Link to a demonstration of it: 
http://youtu.be/5BeG51WrvNw

As well as some images:
(Yaw Test)


(Pitch Test) 



Not sure if I should factor in roll... the face isn't detected that way currently...

Sunday, May 6, 2012

Transitioning to a (Non-Coplanar) POSIT Approach & Beginning Analysis of the Particular System.

Building a non-coplanar POSIT model for testing purposes.
Dimensions are: RG: (0", 0", 5.75"), RB: (6", 0", 0"), RV: (0", 6", 1.25")


After doing some tests, the colored thumbtacks (ping pong balls removed) currently prove too weak to be detected properly through the thresholding algorithm.
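
For context, the thresholding step looks roughly like this; a sketch in the spirit of the aishack.in colored-object tutorial, where the HSV bounds are hypothetical and must be tuned to the actual marker colors and lighting (which is exactly where the thumbtacks fall short):

// Color thresholding for point extraction (OpenCV 2.x C++ API).
#include <opencv2/core/core.hpp>
#include <opencv2/imgproc/imgproc.hpp>

// Returns the centroid of the pixels inside [lo, hi] in HSV space,
// or (-1, -1) when too few pixels match (detection too weak).
cv::Point2f thresholdCentroid(const cv::Mat& frameBGR,
                              const cv::Scalar& lo, const cv::Scalar& hi)
{
    cv::Mat hsv, mask;
    cv::cvtColor(frameBGR, hsv, CV_BGR2HSV);
    cv::inRange(hsv, lo, hi, mask);
    cv::Moments m = cv::moments(mask, /*binaryImage=*/true);
    if (m.m00 < 50) return cv::Point2f(-1.f, -1.f);
    return cv::Point2f((float)(m.m10 / m.m00), (float)(m.m01 / m.m00));
}

Something like thresholdCentroid(frame, cv::Scalar(170, 120, 120), cv::Scalar(180, 255, 255)) would then pull out a saturated red point, with one call per tracked color.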

While working on building a better model (perhaps painting ping pong balls with the respective colors), I was looking into possible alternative coplanar approaches, stemming from discussions of the project with a few students, and considering how to use perspective to aid estimation (comparing relative edges against each other; parallel edges that are farther away appear smaller).

Sunday, April 29, 2012

Merging Retrieval of Thresholded Points from the Feed and Feeding Them into Planar Estimation.

I had encountered a bug with the "ideal case" for pose-estimation, which I apparently missed while working on random sample cases, and have since fixed it.

Since integrating the feed into the program, I had encountered odd behavior in the normal calculation, and decided to go back to testing smaller cases.  Doing so, I realized the following problem:





THIS is what all the people have been talking about when they said there are two corresponding planes that could result from coplanar points. I had initially thought they meant there were simply two normals, one pointing in one direction and the other in the opposite direction.

Also, there are a number of things I've forgotten to do for the transition, in particular going from Cartesian coordinates to a barycentric coordinate system, which would be useful for visual debugging. I neglected it because I thought that, RELATIVELY, it would still calculate the normals correctly.
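
For the record, the conversion itself is short. A sketch of Cartesian-to-barycentric coordinates for a 2D triangle (vertex and variable names here are arbitrary):

// Convert a 2D point to barycentric coordinates (u, v, w) with respect
// to the triangle (a, b, c); useful for the visual debugging mentioned.
struct P2 { double x, y; };

void toBarycentric(const P2& a, const P2& b, const P2& c, const P2& p,
                   double& u, double& v, double& w)
{
    // Denominator is twice the (signed) area of the triangle.
    double d = (b.y - c.y) * (a.x - c.x) + (c.x - b.x) * (a.y - c.y);
    u = ((b.y - c.y) * (p.x - c.x) + (c.x - b.x) * (p.y - c.y)) / d;
    v = ((c.y - a.y) * (p.x - c.x) + (a.x - c.x) * (p.y - c.y)) / d;
    w = 1.0 - u - v; // the three coordinates always sum to 1
}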

Fixed some visual issues, and am able to display some MARGINAL success:

The blue detection point still looks off.  Probably an error in my conversion of coordinate systems.


Sunday, April 22, 2012

CoPlanar Pose Estimation - Iterated Error Minimization approach

So this week I finished testing out the approach using 3 coplanar points with the following givens:
- Two of the points are equidistant from a third point (generating two vectors rg and rb, where |rg| = |rb|).
- Those two vectors are also perpendicular to each other (allowing you to assume dot(rg, rb) = 0).

Breaking down the two equations, you reach:
rgZ = -(rgX*rbX + rgY*rbY)/rbZ
and
rbZ = sqrt(rgX^2 + rgY^2 + rgZ^2 - rbX^2 - rbY^2)

Using these two formulas, it is possible to iteratively guess at one (or both) of the possible normals. In my demo I chose to guess at only one of the normals, thinking it should be possible to determine which one you want by always taking the one that points toward you. This needs to be tweaked a bit, as that result isn't being reached with my current approach.
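
A minimal sketch of that guessing loop, alternating between the two formulas above (the x/y components are placeholder inputs; whether and where it converges depends on the initial guess for rbZ):

// Iteratively solve for rgZ and rbZ given the x/y components, using
// rgZ from dot(rg, rb) = 0 and rbZ from |rg| = |rb|.
#include <cmath>
#include <cstdio>

int main() {
    double rgX = 31.58, rgY = -78.39;   // placeholder measured components
    double rbX = -29.55, rbY = -90.05;

    double rbZ = 1.0;   // initial guess (must be nonzero)
    double rgZ = 0.0;

    for (int i = 0; i < 100; ++i) {
        rgZ = -(rgX * rbX + rgY * rbY) / rbZ;  // from dot(rg, rb) = 0
        double s = rgX*rgX + rgY*rgY + rgZ*rgZ
                 - rbX*rbX - rbY*rbY;          // from |rg| = |rb|
        if (s < 0) break;   // no real rbZ for this guess
        rbZ = std::sqrt(s); // taking the positive root picks one normal
    }

    // "Error": residual of the unused (orthogonality) equation.
    double err = rgX*rbX + rgY*rbY + rgZ*rbZ;

    // The normal is the cross product of the two recovered vectors.
    double nX = rgY*rbZ - rgZ*rbY;
    double nY = rgZ*rbX - rgX*rbZ;
    double nZ = rgX*rbY - rgY*rbX;
    std::printf("normal: (%f, %f, %f), error: %g\n", nX, nY, nZ, err);
    return 0;
}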

Some screenshots (thick transparent lines are expected normals, thin opaque lines are estimated normals):

As you can see, the guesses are by no means perfect: there is a large error visible here, and the cyan normal is facing the wrong direction.


What seems like a nice estimate for the magenta normal is unlikely to occur in practice, as those source points would be very unlikely to attain from a planar target.

Another example of the normal facing the other direction.

Log of random points ((rgX, rgY, rgZ) and (rbX, rbY, rbZ)) being fed to the algorithm:
VecG is simply (rgX, rgY) with the computed rgZ. The same is true for VecB.
Normal is the cross product of both.
"Error" is computed by plugging the results into the second (unused) equation.


VecG: x: 31.5836 y: -78.3898 z: -84.347
VecB: x: -29.5511 y: -90.051 z: 72.6256
NormalVector is: x: 31.5836 y: -78.3898 z: -84.347
"Error", should be very close to Zero: 1.42109e-014

VecG: x: -68.688 y: -80.2789 z: -43.0637
VecB: x: -37.2845 y: -24.4362 z: 105.024
NormalVector is: x: -68.688 y: -80.2789 z: -43.0637
"Error", should be very close to Zero:  1.42109e-014

VecG: x: 47.2518 y: -11.8229 z: -49.4074
VecB: x: 48.9731 y: -7.34275 z: 48.5935
NormalVector is: x: 47.2518 y: -11.8229 z: -49.4074
"Error", should be very close to Zero: 1.42109e-014

VecG: x: -20.7617 y: -77.7093 z: -95.7285
VecB: x: 24.3629 y: -97.7477 z: 74.0646
NormalVector is: x: -20.7617 y: -77.7093 z: -95.7285
"Error", should be very close to Zero: 1.42109e-014

VecG: x: -71.3187 y: -10.7273 z: 103.581
VecB: x: 99.1394 y: -45.4115 z: 63.5577
NormalVector is: x: -71.3187 y: -10.7273 z: 103.581
"Error", should be very close to Zero: 1.42109e-014

VecG: x: 16.3549 y: -28.9468 z: -30.6332
VecB: x: 28.4036 y: -16.7058 z: 30.9507
NormalVector is: x: 16.3549 y: -28.9468 z: -30.6332
"Error", should be very close to Zero: 1.42109e-014

VecG: x: -8.72524 y: -24.6681 z: -76.4579
VecB: x: -70.22 y: -35.023 z: 19.3131
NormalVector is: x: -8.72524 y: -24.6681 z: -76.4579
"Error", should be very close to Zero: 1.42109e-014

VecG: x: -50.6455 y: -88.3694 z: 19.7959
VecB: x: 57.7013 y: -14.008 z: 85.0901
NormalVector is: x: -50.6455 y: -88.3694 z: 19.7959
"Error", should be very close to Zero: 1.42109e-014

VecG: x: -19.48 y: -20.7312 z: -95.9996
VecB: x: -96.9359 y: -11.6703 z: 22.1902
NormalVector is: x: -19.48 y: -20.7312 z: -95.9996
"Error", should be very close to Zero: 4.26326e-014

VecG: x: -36.8877 y: -35.8715 z: 109.191
VecB: x: 87.1273 y: -83.517 z: 1.99698
NormalVector is: x: -36.8877 y: -35.8715 z: 109.191
"Error", should be very close to Zero: 1.98952e-013



I also looked into other approaches to coplanar pose estimation and found a REALLY NICE demo using AForge, a C# framework... here is a screenshot:



The theory is described in a paper titled "Iterative Pose Estimation Using Coplanar Feature Points" by Oberkampf, Daniel DeMenthon, and Larry Davis.  It uses four source points and generates the matrix formed by the translation and rotation transformations, along with the alternate matrix (as it is possible to have two normals).


Goals for Next time:
- Integrate normal estimation with captured source points.
- Improve upon estimation approach through other algorithms (reverse engineer Coplanar Class from AForge?)

Sunday, April 15, 2012

POSIT Algorithm, OpenGL + OpenCV Libraries

EDIT:
I am backtracking and scaling down the problem I'll be working on this week to detecting the orientation of a plane given known points, as I misjudged the difficulty of the original problem.

I've incorporated OpenGL libraries to let me visually track the progress of my work on pose estimation, and have currently integrated a POSIT demo found on the Willow Garage wiki.


I've also created an improvised "AR target" with a valid detectable face image, along with 3 points to use as control points for orientation, using thresholding to derive the source points to be used by the POSIT algorithm for pose estimation.


I am currently reading up on the POSIT algorithm (http://www.cfar.umd.edu/~daniel/daniel_papersfordownload/Pose25Lines.pdf), described by Daniel F. DeMenthon and Larry S. Davis.

A concern arises at the start of the first paragraph:
"We assume that we can detect and match in the image four or more noncoplanar feature points of the object and that we know their relative geometry on the object "

An alternative that I plan to look into after I finish reading the paper is located at http://nghiaho.com/?page_id=576, which describes a library for robust pose estimation from a planar target.

Monday, April 9, 2012

Week 2 - Face Detection with Haar Classifier; Next Step: Pose Estimation

At the end of week one, following a tutorial on how to use a Haar classifier for face detection, I am now able to track a face, and using multi-threading I am able to present the program (for now, the display of the webcam feed) on a separate thread from the detection calculation.
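
For reference, the detection loop looks roughly like this; a single-threaded sketch where the cascade filename and detection parameters are assumptions, and the multi-threaded split between display and detection is omitted for brevity:

// Haar-cascade face detection on a webcam feed (OpenCV 2.x C++ API).
#include <opencv2/objdetect/objdetect.hpp>
#include <opencv2/highgui/highgui.hpp>
#include <opencv2/imgproc/imgproc.hpp>
#include <vector>

int main() {
    cv::CascadeClassifier face;
    if (!face.load("haarcascade_frontalface_alt.xml")) return 1;

    cv::VideoCapture cap(0);  // default webcam
    if (!cap.isOpened()) return 1;

    cv::Mat frame, gray;
    while (cap.read(frame)) {
        cv::cvtColor(frame, gray, CV_BGR2GRAY);
        cv::equalizeHist(gray, gray);  // helps detection under poor lighting

        std::vector<cv::Rect> faces;
        face.detectMultiScale(gray, faces, 1.1, 3, 0, cv::Size(80, 80));

        for (size_t i = 0; i < faces.size(); ++i)
            cv::rectangle(frame, faces[i], cv::Scalar(0, 255, 0), 2);

        cv::imshow("feed", frame);
        if (cv::waitKey(10) == 27) break;  // ESC quits
    }
    return 0;
}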

Added in a frame capture to document what I have so far.
Some images of the face detection capture:

(Good)

Unexpectedly, the image also contains the previous detection rectangle. Not sure why exactly; however, perhaps this could lead into work on motion estimation.
Some Examples:
(First Image of Sequence)


(Second Image of Sequence, lingering rectangle in image)


(Third Image of Sequence, only one lingering rectangle in image)



My next steps will be to do pose estimation, perhaps beginning on static images.
Currently I am studying the following documentation of previous work on pose estimation to get a better idea of what I need to implement:
Real-Time Face Pose Estimation in Video Sequence, Xiaoping Chen, Qianqian Yang, Honghong Liao, Weiping Sun, Shengsheng Yu, College of Computer Science and Technology, Huazhong University of Science and Technology

and

Head Tracking and Gesture Recognition Library, Louis-Philippe Morency

- Alfred

Thursday, March 29, 2012

Tracking Attention

Abstract
The time when it was necessary to convey information to a computer through the keyboard and mouse has long since passed. With the advent of technologies such as touch screens, web cameras, and microphones, computers now have a vast resource of information to collect from. The goal of this project is to take on the challenge of identifying the user's attention focus, employing techniques used for eye/gaze tracking, and to use toggle-based actions to decide how to interpret the object of the user's attention, whether by zooming in on that portion of the screen, selecting the object, or other methods of interaction.

Pre-Quarter Results:
- Set up environment and able to access OpenCV libraries.
- Able to access Camera.
- Implemented Face Detection tutorial, performance is.... slow.

Proposal can be found here:
https://docs.google.com/open?id=0BzA4y4s-uSXHWE9zLVBLbDBSMzZJUHpLQnNLR3RmUQ