Author Topic: Improving to 6 DOF head tracking  (Read 20190 times)

Offline bencoder

  • Posts: 2
  • Karma: +0/-0
on: January 11, 2008, 06:42:57 AM
I have a wiimote but no bluetooth adapter yet so I can't test this out yet but I'm going to work on this once I get set up.

By setting up three IR LEDs in a triangular arrangement, it should be possible to determine all 6 degrees of freedom for head tracking. I have attached an image explaining how to track the rotational movements. Translational movements would be tracked as they are at the moment (change in distance between the LEDs for forwards and backwards, and movement of the LEDs for left/right and up/down).
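Here's a rough Python sketch of the kind of math I have in mind, assuming the Wiimote reports the three LEDs as (x, y) pixel coordinates and the middle LED is offset toward the camera. The baseline constant is a placeholder that would need calibrating, and the angles are only small-angle approximations, so treat it as a sketch rather than a working tracker.

Code:
import math

def head_pose_from_triangle(top, mid, bot, baseline_px=200.0):
    """Rough 6 DOF estimate from three LEDs arranged top/middle/bottom,
    with the middle LED offset toward the camera.  Each argument is an
    (x, y) tuple in Wiimote camera pixels (0..1023, 0..767).
    baseline_px is the assumed top-to-bottom pixel distance at a
    reference distance and must be calibrated."""
    # Translation: the centroid gives left/right and up/down, and the
    # apparent size of the triangle gives forward/back (relative only).
    cx = (top[0] + mid[0] + bot[0]) / 3.0
    cy = (top[1] + mid[1] + bot[1]) / 3.0
    height = math.hypot(bot[0] - top[0], bot[1] - top[1])
    depth = baseline_px / height

    # Roll: tilt of the line joining the outer LEDs.
    roll = math.atan2(bot[0] - top[0], bot[1] - top[1])

    # Yaw/pitch: how far the middle LED sits from the midpoint of the
    # outer two.  Turning or nodding the head shifts the offset LED
    # relative to that midpoint.
    mx = (top[0] + bot[0]) / 2.0
    my = (top[1] + bot[1]) / 2.0
    yaw = math.atan2(mid[0] - mx, height)
    pitch = math.atan2(mid[1] - my, height)

    return (cx, cy, depth), (yaw, pitch, roll)

# Example: head roughly centered, facing the camera, turned slightly right.
print(head_pose_from_triangle((512, 300), (520, 384), (512, 468)))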



Offline zembahk

  • Posts: 2
  • Karma: +0/-0
Reply #1 on: January 11, 2008, 08:03:27 AM
I like it. It looks like it would work. But what if it was on its side, so that safety glasses could easily be modded across the top rim? So it's right, middle, left instead of top, middle, bottom.



Offline bencoder

  • Posts: 2
  • Karma: +0/-0
Reply #2 on: January 11, 2008, 08:12:31 AM
Yep, that should work. Pretty much any orientation should work.

I should just mention that this isn't an original idea. I believe it was originally introduced by the company NaturalPoint, who have a product called TrackIR which, as far as I can tell, uses this exact technique (they have an IR camera you clip onto your monitor, and it tracks IR LEDs clipped onto the side of your head or onto a visor):
http://www.naturalpoint.com/trackir/

There is also an open-source implementation of the concept, again using a webcam modified for IR, called FreeTrack; it is available here:
http://www.free-track.net/english/



Offline Demonic69

  • Posts: 8
  • Karma: +2/-0
Reply #3 on: January 11, 2008, 10:07:10 AM
I like it. It looks like it would work. But what if it was on its side, so that safety glasses could easily be modded across the top rim? So it's right, middle, left instead of top, middle, bottom.

That wouldn't work unless you could factor in distance from the Wiimote; turning the head just squeezes the points together, the same as moving further away:

.  .  . Face on
. . . Turning either way.
Get me?



Offline Alrecenk

  • Posts: 10
  • Karma: +0/-0
Reply #4 on: January 11, 2008, 10:42:48 AM
I attempted to work out the mathematics for more general 3D tracking, but I ultimately arrived at a system of 3 quadratics that I was unable to solve. The basic idea was to have a set of calibration points with positions given both in actual space and in camera space. With 3 calibration points it should be possible to determine all of their distances from the camera and, given the FOV etc. of the camera, find their positions in world space relative to the camera.

This calibration should work as long as those 3 points remain in their original configuration, but it's not enough information to track individual points in 3D. For that you need more than one camera, and to calibrate a camera's position and orientation completely it takes 4 calibration points, I believe. I was originally thinking about this in relation to a general application that could track a point in 3D using 2 or more Wiimotes, and I hadn't really thought about using it for advanced head tracking until I read this post. Given that you can assume there are always 3 points in a set configuration, the math should be considerably simplified. At the moment my compiler and I don't seem to be getting along, but I'd be happy to provide mathematical assistance if someone wants to attempt this.
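For anyone who wants to attempt it, here is a rough numerical version of that system of quadratics in Python (numpy/scipy). The intrinsics in ray() are assumptions rather than measured Wiimote values, and a generic root finder stands in for a closed-form solution, so the answer depends on the initial guess, and systems like this can have more than one valid solution.

Code:
import numpy as np
from scipy.optimize import fsolve

def ray(px, py, f=1280.0, cx=512.0, cy=384.0):
    """Unit view ray through pixel (px, py) for a pinhole camera.
    f, cx, cy are assumed intrinsics (focal length and image center in
    pixels); the real Wiimote values would have to be calibrated."""
    v = np.array([px - cx, py - cy, f])
    return v / np.linalg.norm(v)

def distances_to_points(pixels, D01, D12, D02):
    """Solve for the distances d0, d1, d2 from the camera to three
    points whose pairwise real-world distances D01, D12, D02 are known.
    pixels is a list of three (px, py) observations."""
    r = [ray(px, py) for px, py in pixels]
    c01, c12, c02 = r[0] @ r[1], r[1] @ r[2], r[0] @ r[2]

    # Law of cosines on each camera/point/point triangle: three
    # quadratic equations in the three unknown distances.
    def equations(d):
        d0, d1, d2 = d
        return [d0*d0 + d1*d1 - 2*d0*d1*c01 - D01*D01,
                d1*d1 + d2*d2 - 2*d1*d2*c12 - D12*D12,
                d0*d0 + d2*d2 - 2*d0*d2*c02 - D02*D02]

    return fsolve(equations, x0=[1000.0, 1000.0, 1000.0])

# Once d0, d1, d2 are known, point i sits at d_i * ray_i in camera space.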



Offline gabort

  • Posts: 39
  • Karma: +3/-0
Reply #5 on: January 12, 2008, 06:30:23 PM
I think someone mentioned this in another thread already. There is already an implementation of this, but it uses 4 points of reference.

Check out: http://idav.ucdavis.edu/~okreylos/ResDev/Wiimote/index.html



Offline dougnukem

  • Posts: 1
  • Karma: +0/-0
Reply #6 on: January 15, 2008, 11:04:08 AM
Awesome idea! I think the 3-point and 4-point tracking on the Wiimote is going to be difficult, mainly because we're not able to get a snapshot image of all four points at any given time. From what I can tell, image data is sent in packets containing (x, y, radius), giving the x and y location and the approximate radius of each light being detected.

So to track 3 or 4 points we need a way to know, when a point shows up at a new location, which point it moved from, so we can be sure it's the same one. I don't have my Wiimote yet, so maybe I'm off base and the Wiimote libraries already help coordinate this information.

Also, is there any reason why it says you need 4 points (not in the same plane) to track a 3D object?

http://idav.ucdavis.edu/~okreylos/ResDev/Wiimote/index.html
Quote
Projection of a custom IR beacon with four LEDs in a non-planar arrangement onto the Wiimote camera's image plane.

Is that for general 3D object tracking, not simply VR head tracking? (There's less rotation in head tracking, as you can really only move about 180 degrees, 90 in each direction.)

I feel like the 3-point method would work fine for VR head tracking (it seems to work for the TrackIR software).

Has anyone worked out the intrinsic camera properties for the Wiimote camera, so that we can accurately track a measured 3- or 4-point object? I think if someone sets up a test case (place a 3-point object at a set distance from the camera), then we can figure out the projection plane, frustum geometry, and focal point position.

I think the intrinsic Wiimote camera properties that need to be calculated are (according to Oliver Kreylos; a rough projection sketch follows the list):
  • Pixel size - The width and height of each pixel on the camera's sensor in physical coordinate units, e.g., millimeters.
  • Focal length - The orthogonal distance of the camera's focus point (center of its lens) from the image plane in physical coordinate units
  • Center of projection - The position of the orthogonal projection of the camera's focus point onto its image plane in physical coordinate units. This 2D coordinate can be combined with the camera's focal length to express the focus point position as a 3D point.
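To make those three quantities concrete, here is a minimal pinhole-projection sketch in Python. Every number in it is a placeholder rather than a measured Wiimote value; it only shows how the three intrinsics combine to map a 3D point in camera coordinates to a pixel.

Code:
def project(point_cam, focal_mm, pixel_w_mm, pixel_h_mm, center_px):
    """Project a 3D point (camera coordinates, millimeters) onto the
    image plane using the three intrinsics listed above:
      focal_mm               - focal length
      pixel_w_mm, pixel_h_mm - physical size of one sensor pixel
      center_px              - (cx, cy), center of projection in pixels"""
    x, y, z = point_cam
    u = center_px[0] + (focal_mm * x / z) / pixel_w_mm
    v = center_px[1] + (focal_mm * y / z) / pixel_h_mm
    return u, v

# Example with made-up intrinsics: a point 1 m ahead and 100 mm to the
# right of the camera lands to the right of the image center.
print(project((100.0, 0.0, 1000.0),
              focal_mm=3.0, pixel_w_mm=0.003, pixel_h_mm=0.003,
              center_px=(512.0, 384.0)))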



Offline Azel

  • Posts: 8
  • Karma: +0/-0
Reply #7 on: January 15, 2008, 01:58:55 PM
So to track 3 or 4 points we need a way to know, when a point shows up at a new location, which point it moved from, so we can be sure it's the same one. I don't have my Wiimote yet, so maybe I'm off base and the Wiimote libraries already help coordinate this information.
I don't think they do, because this is really hard to accomplish. I also think knowing which point is which is the biggest problem when tracking more than two points with a WiiMote.

Also, is there any reason why it says you need 4 points (not in the same plane) to track a 3D object?

Oliver Kreylos says:
"If four 3D points and their projections are known, this results in nine total non-linear equations for seven unknowns (8 from the points and one from the quaternion's unity condition). In principle, it would be sufficient to know the projected positions of three known 3D points (leading to seven equations for seven unknowns); however, three points in 3D space are always planar, and the resulting system is very instable."

This seems reasonable and would explain why FreeTrack switched from 4-point tracking to 3-point tracking. However, with 3 LEDs/points you will have problems in some situations, e.g. when the WiiMote sees the 3 points on a straight line (which reduces the number of equations and makes the problem unsolvable). I think FreeTrack's code assumes that the user will never turn their head at a right angle to the screen, which is also reasonable. =)

I personally, since seeing Johnny's and Oliver's videos, have been thinking of a way to track a sword in 6 DOF. That's right, I want to play a real sword-fighting simulation =). The best solution is probably to use 2 (or more) WiiMotes as "cameras" (some commercial tracking systems actually use 6 to 8 cams). But the great problem remains: how do I know which point is the tip of the sword and which is the hilt (and probably which ones are on the parrying bar)?

The only information the WiiMote gives about a detected point, apart from its position, is its approximate radius. Maybe this information could be used to distinguish 4 points by giving each of them a unique radius, and thus an "ID". But the radius also changes depending on the distance between the point and the WiiMote. Perhaps the last successfully tracked position of a point could be used to guess its future radius? And there we are again, guessing distances and locations.
Oliver Kreylos: "and predicted target point projections are matched with camera observations on a nearest-neighbor basis"
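A toy version of that nearest-neighbor matching, sketched in Python on the assumption that we keep one predicted position per tracked LED (its last position, or last position plus velocity):

Code:
def match_points(predicted, observed):
    """Greedily assign each predicted LED to the nearest observed blob.
    predicted: dict led_id -> (x, y) predicted position
    observed:  list of (x, y) blobs reported by the WiiMote
    Returns a dict led_id -> (x, y).  A real tracker would also handle
    missing or extra blobs and could use accelerometer data to improve
    the prediction."""
    assignments = {}
    free = list(observed)
    for led_id, (px, py) in predicted.items():
        if not free:
            break
        nearest = min(free, key=lambda o: (o[0] - px)**2 + (o[1] - py)**2)
        assignments[led_id] = nearest
        free.remove(nearest)
    return assignments

# Example: predicted tip and hilt of the sword vs. two observed blobs.
print(match_points({"tip": (400, 300), "hilt": (420, 500)},
                   [(415, 505), (398, 297)]))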

If ID by radius does not work, I think it is impossible to solve the problem with WiiMotes in 6 DOF. One would always have to make restrictions, guess, or take the brute-force approach.

And back to the sword fight: I'd probably use a third WiiMote as the hilt, with LEDs or reflectors attached to it, so motion-sensor/accelerometer information could be used together with IR point tracking to track the whole sword.
Now will someone please implement this, and send me a copy? ;P
(along with a Wii, as I sadly don't own one =( )

Azel
« Last Edit: January 15, 2008, 02:15:03 PM by Azel »



Offline HighDesert

  • Posts: 3
  • Karma: +0/-0
Reply #8 on: January 21, 2008, 08:47:42 PM
I personally, since seeing Johnny's and Oliver's videos, have been thinking of a way to track a sword in 6 DOF. That's
Oliver has a rudimentary light saber demo.

Oliver Kreylos: "and predicted target point projections are matched with camera observations on a nearest-neighbor basis"

If ID by radius does not work, I think it is impossible to solve the problem with WiiMotes in 6 DOF. One would always have to make restrictions, guess, or take the brute-force approach.
Oliver's predictive method seems pretty good.  In Johnny's video on the automatic projector calibration, he mentions that they use predictive modelling when one of the sensors (fiber optics in this case) falls outside the view of the system.

http://www.youtube.com/watch?v=XgrGjJUBF_I

There are apparently other, more sophisticated techniques out there, one of which underlies FreeTrack.  FreeTrack's Wikipedia page,

http://en.wikipedia.org/wiki/FreeTrack

references Daniel DeMenthon's work on what's called "pose calculation" of 3D objects from 2D projections.  There are oodles of mathematical papers at his site:

http://www.cfar.umd.edu/~daniel/

FreeTrack uses a version of DeMenthon's POSIT algorithm.  I can't help but think that FreeTrack could be adapted by throwing out the initial image-analysis step that finds the LEDs and just substituting the output from the Wiimote.

As for identifying the LEDs, how fast is the refresh rate on the WiiMote's CCD?  Would it be possible to chirp each LED with a unique time signature?  Or continuously blink them each with a different blink frequency?

Myself, I'm interested in something like a 6 DOF mouse for the analysis of 3D image data.  I want to be able to move the Wiimote around and have my 3D data rendered following my motion.  There are expensive solutions out there; this is the first cheap solution that looks practical.

John.



Offline Azel

  • Posts: 8
  • Karma: +0/-0
Reply #9 on: January 22, 2008, 01:05:01 PM
Oliver has a rudimentary light saber demo.
I know, but the direction of his sword is very restricted: the WiiMote must be pointed at his beacon for 3D tracking.

Oliver's predictive method seems pretty good.
Yes, I'd like to know how he uses the accelerometer data to estimate the WiiMote position.

In Johnny's video on the automatic projector calibration, he mentions that they use predictive modelling when one of the sensors (fiber optics in this case) falls outside the view of the system.
The problem is that Johnny uses 4 points but needs only two (for the location). This would be analogous to using 6 or 8 points for 3D tracking when you need 4. But we only have 4 points. =/

There are oodles of mathematical papers at his site:

http://www.cfar.umd.edu/~daniel/
Thanks for the link. I'll definitely be reading this.

FreeTrack uses a version of DeMenthon's POSIT algorithm.  I can't help but think that FreeTrack could be adapted by throwing out the initial image analysis step that finds the LEDs and just substituting the output from the wii.
Yes, this might work. As I understand it, the Wiimote implements the image analysis in hardware, which is a great advantage.

As for identifying the LEDs, how fast is the refresh rate on the WiiMote's CCD?  Would it be possible to chirp each LED with a unique time signature?  Or continuously blink them each with a different blink frequency?
Johnny said it's 100 Hz, while I have heard others say it's only 33 Hz.
At 100 Hz this might work in theory, good idea! Though you would have to synchronize the "blinking" LEDs with the WiiMote, and reading the refresh rate from the WiiMote would require serious hardware hacking. Another problem I see is that the LEDs might not be able to go from full power back to zero in 1/25 of a second.

Myself, I'm interested in something like a 6 DOF mouse for the analysis of 3D image data.  I want to be able to move the Wiimote around and have my 3D data rendered following my motion.  There are expensive solutions out there; this is the first cheap solution that looks practical.
I think tracking two points with two WiiMotes is definitely doable. There are only two possibilities, and the brute-force approach looks easy =).

Thanks for the reply!

Azel



Offline Hillfire

  • Posts: 4
  • Karma: +0/-0
Reply #10 on: January 23, 2008, 11:17:10 PM
It sounds like identifying the LEDs is the most promising way to go with this. You could even trick the wiimote into reading more than 4 points with this method.

If we assume the simplest scenario, with two Wiimotes and 4 LEDs, where each LED is uniquely identifiable and the distance/angle between the Wiimotes is known, then 6 DOF is certainly possible. The math is fairly straightforward, as has been discussed in this thread... or in most 3D graphics classes.
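For reference, here's the core of that math sketched in Python with numpy, assuming both Wiimotes' positions and orientations have already been calibrated and each blob has already been matched to its LED: each camera turns its 2D blob into a 3D view ray, and the LED sits at the midpoint of the closest approach of the two rays.

Code:
import numpy as np

def triangulate(o1, d1, o2, d2):
    """Midpoint of closest approach between two view rays.
    o1, o2: camera positions; d1, d2: unit ray directions (world frame).
    Assumes both camera poses are known from a prior calibration."""
    # Minimize |(o1 + t1*d1) - (o2 + t2*d2)| over the scalars t1, t2.
    w0 = o1 - o2
    a, b, c = d1 @ d1, d1 @ d2, d2 @ d2
    d, e = d1 @ w0, d2 @ w0
    denom = a * c - b * b              # approaches 0 for parallel rays
    t1 = (b * e - c * d) / denom
    t2 = (a * e - b * d) / denom
    return (o1 + t1 * d1 + o2 + t2 * d2) / 2.0

# Example: two Wiimotes 500 mm apart, both seeing an LED at (100, 0, 1000).
o1, o2 = np.array([0.0, 0.0, 0.0]), np.array([500.0, 0.0, 0.0])
led = np.array([100.0, 0.0, 1000.0])
d1 = (led - o1) / np.linalg.norm(led - o1)
d2 = (led - o2) / np.linalg.norm(led - o2)
print(triangulate(o1, d1, o2, d2))     # ~[100, 0, 1000]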

The issue is really just point identification. If we assume that the refresh rate of the CCD is 100 Hz (which I sincerely hope it is), then Nyquist[1] says the frequency of the LED chirp (at 50% duty cycle) needs to be less than 50 Hz. Most remote-control IR diodes can handle 40 kHz[2], so no problem there. Some error will occur, but as long as the chirp frequencies are known, it can work. A simple microcontroller mounted inside a sword (I really like that idea) or on the back of a glove[3] can handle this easily.

It gets more complicated when the sampling of the two Wiimotes is not synchronized. Worst case, they are 90 degrees out of sync, and then the maximum frequency that can be reliably read by both sensors is about 25 Hz.
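A toy sketch of the identification side in Python: count how often a tracked blob switches from off to on per second, and map that rate back to an LED ID. The sample rate and the specific blink frequencies are assumptions for illustration.

Code:
def identify_led(visible, sample_rate_hz=100.0, led_freqs_hz=(5.0, 10.0, 20.0)):
    """Guess which LED a tracked blob is from its on/off history.
    visible: one boolean per camera frame (blob seen / not seen).
    led_freqs_hz: blink frequency assigned to each LED, all well under
    the 50 Hz Nyquist limit of an assumed 100 Hz camera.
    Returns the index of the closest matching LED."""
    # Estimate the blink frequency by counting off->on transitions.
    cycles = sum(1 for a, b in zip(visible, visible[1:]) if not a and b)
    measured_hz = cycles / (len(visible) / sample_rate_hz)
    return min(range(len(led_freqs_hz)),
               key=lambda i: abs(led_freqs_hz[i] - measured_hz))

# Example: one second of samples blinking at roughly 10 Hz -> LED index 1.
samples = ([True] * 5 + [False] * 5) * 10
print(identify_led(samples))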

The sword thing only sounds really interesting if you combine it with DesktopVR. With the point IDs you could track both your head and your sword... that would be some seriously immersive (and ninja-like) gameplay.

I will work up some test code and hardware for this once I get my Bluetooth receiver.

As an aside, being able to uniquely identify each point, even in a one wiimote scenario, will allow very complex interface gestures by providing unique paths for each point... but that's another forum.

[1] - http://en.wikipedia.org/wiki/Nyquist%E2%80%93Shannon_sampling_theorem
[2] - http://scv.bu.edu/GC/shammi/ir/
[3] - http://www.sparkfun.com/commerce/product_info.php?products_id=8465

  - Ken



Offline HighDesert

  • Posts: 3
  • Karma: +0/-0
Reply #11 on: January 24, 2008, 01:43:57 PM
Oliver has a rudimentary light saber demo.
I know, but the direction of his sword is very restricted: the WiiMote must be pointed at his beacon for 3D tracking.
Yes, I thought of that as soon as I thought more about my own problem (manipulating 3D data) and what would happen if I rotated the Wiimote out of view of the LEDs.  It does make the option of leaving the Wiimote stationary while rotating a tetrahedron in its view more appealing, except that I'd lose all those buttons for interaction!  In either case, I'd have to solve the point-recognition problem.

FreeTrack uses a version of DeMenthon's POSIT algorithm.  I can't help but think that FreeTrack could be adapted by throwing out the initial image-analysis step that finds the LEDs and just substituting the output from the Wiimote.
Yes, this might work. As I understand it, the Wiimote implements the image analysis in hardware, which is a great advantage.
If I get a chance, I may dig into FreeTrack's code. 

The issue is really just point identification. If we assume that the refresh rate of the CCD is 100 Hz (which I sincerely hope it is), then Nyquist[1] says the frequency of the LED chirp (at 50% duty cycle) needs to be less than 50 Hz. Most remote-control IR diodes can handle 40 kHz[2], so no problem there. Some error will occur, but as long as the chirp frequencies are known, it can work. A simple microcontroller mounted inside a sword (I really like that idea) or on the back of a glove[3] can handle this easily.
With a little bandpass filtering you could select the signal from each particular LED.  The trick will be storing enough values to compute the frequencies without introducing too much lag in the response.

On the other hand, would we need to chirp all the LEDs?  Would a known geometric arrangement with one chirped LED be enough?  Two?

A few more ideas for identifying the LEDs:

1. Fixed temporal intervals: Instead of blinking continuously at separable frequencies, perhaps the LEDs could be blinked in order at fixed intervals, with all LEDs off except one, and with the interval between each LED changing in a fixed pattern.  Then the interval between measured coordinates indicates which two LEDs fired most recently (a small decoding sketch follows this list).  Intuitively, I'd think the chances of an obscured LED erroneously introducing a known interval into the sequence would be small.  You'd know, "Hey, that gap was too large! It's the LED2 - LED4 gap.  LED3 is obscured."

2. Geometric LED curtain: A more elaborate and expensive geometric arrangement of LEDs, similar to digital paper.  Or a patterned IR-reflective material lit by a single infrared source.  The mathematics of the arrangement would probably be much more complicated, since the Wiimote can change its view of the "field", while a typical digital-paper pen always "looks" straight at the paper.

Basically, with either the flashing or the spatially positioned LEDs, the idea is to impart extra information to the setup that would enable the identification of a particular LED.
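Here's the promised tiny sketch of idea 1 in Python. The gap table and timestamps are made-up numbers for illustration; the point is only that each distinct off-time tells you which LED lit up next, so one blob stream can be de-multiplexed back into individual LEDs.

Code:
def decode_intervals(flash_times, gap_table, tolerance=0.002):
    """Identify LEDs from the gaps between consecutive flashes.
    flash_times: timestamps (seconds) at which the single visible blob
    was reported.
    gap_table:   nominal gap length -> (previous_led, next_led) pair it
                 encodes; the values are invented for this example.
    Returns a list of (timestamp, led_id); led_id is None when a gap
    matches nothing, i.e. an LED was probably obscured."""
    ids = []
    for prev, curr in zip(flash_times, flash_times[1:]):
        dt = curr - prev
        for nominal, (_, next_led) in gap_table.items():
            if abs(dt - nominal) < tolerance:
                ids.append((curr, next_led))
                break
        else:
            ids.append((curr, None))
    return ids

# Example using a made-up gap table for four LEDs.
gaps = {0.010: (1, 2), 0.014: (2, 3), 0.018: (3, 4), 0.022: (4, 1)}
print(decode_intervals([0.000, 0.010, 0.024, 0.042, 0.064, 0.074], gaps))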

John.

P.S. Regarding the digital-paper-like solution: what about projecting an infrared pattern on the wall?  Are there infrared projection systems?  Would there be enough reflection off standard walls or furniture for the Wiimote to detect?
« Last Edit: January 24, 2008, 01:49:02 PM by HighDesert »



Offline Hillfire

  • Posts: 4
  • Karma: +0/-0
Reply #12 on: January 24, 2008, 09:14:19 PM
Fixed temporal intervals is really where I was going with that rant, I was just too tired to convey it. The frequency calculation really just determines how long each LED has to be off for both Wiimotes to recognize it; to summarize, it would be 0.04 seconds or longer. HighDesert, kudos, you described this method much better than I did.

Geometric placement would really depend on the scenario. In all scenarios, you can leave exactly one of the LEDs solid without any ambiguity as long as all the others are identified. In the DesktopVR/sword example, you would only need to identify two of the LEDs, one on the sword and one on the head. The position of the second LED on the head can be guessed at. However, you may hold the sword tip (if the handle LED was identified) near/in front of your head, again making it ambiguous. As usual, some logic could get around this with relative accuracy, but that just adds calculation time.

Unfortunately, the "geometric LED curtain" doesn't work with the wiimote. The device is limited to exactly four points of detection. We should consider this lucky since Nintendo had no reason to detect any more than two.

This has certainly gotten off the topic of this thread.

  - Ken



Offline HighDesert

  • Posts: 3
  • Karma: +0/-0
Reply #13 on: January 24, 2008, 10:18:52 PM
Unfortunately, the "geometric LED curtain" doesn't work with the wiimote. The device is limited to exactly four points of detection. We should consider this lucky since Nintendo had no reason to detect any more than two.

A little more off topic, but the thing about digital paper is that the local dot arrangements tell the digital pen exactly where on the paper it is without it seeing the entire paper.  Similar to the temporal chirping of the LEDs, the spatial positions of neighboring dots on digital paper are such that an absolute location can be determined.

I mention the digital paper or "LED curtain" idea as a way to get around the limited field of view of a single Wiimote.  Perhaps accurate pointing could still be achieved with a suitable geometric distribution of LEDs.  Sure, it can only see 4 LEDs at a time, but it may be possible to position every set of 4 LEDs in such a way that you know exactly which way the Wiimote is pointing - absolute pointing being related to the effort to compute absolute position.

Again, this would be quite an elaborate solution.  I think the other approaches are easier to consider for a first attempt at 6DOF positioning.

John.



Offline Helza

  • Posts: 36
  • Karma: +3/-0
Reply #14 on: January 28, 2008, 05:07:43 PM
I just updated WiimoteLib to support 4 IR dots, so if anyone can do the math and add it to the library, that would be awesome :)
http://www.wiimoteproject.com/index.php?topic=308.0