Basic Computer Vision and Flash Player 10
After a bit of a delay, I’m finally beginning to play with Flash Player 10. I got off to a false start when the betas first hit the streets, because the tooling options weren’t particularly polished and I still had Flash Player 9 projects on my to-do list. Another big question was where to start. Like most people, I landed on the new 2.5D features, and then tried to add something new by using a web camera to control the rotation of a display object.
Image processing has always been a favorite subject of mine. I don’t have the math or computer science skills to really make it sing, but I manage just the same. For this example I wanted to leverage basic computer vision. I would have the web camera capture a specific color and then track that color throughout the coverage area. Depending on where the color was found to be most dominant in the images from the web camera, 2.5D rotations would be applied to an image (displayed as a plane).
The first step is to give the user an area in which to provide a color sample. I chose a 10 x 10 pixel space, represented on the screen as a little square in the middle of the web camera display. When something enters that space, the color across its 100 pixels is averaged. When you get the color you want, you can press the space bar to “lock in” your color average.
// Requires imports: flash.display.BitmapData, flash.geom.Point, flash.geom.Rectangle
public function sampleImage():void
{
    var bmpd:BitmapData = new BitmapData( VIDEO_WIDTH, VIDEO_HEIGHT );
    var capture:BitmapData = new BitmapData( CAPTURE_WIDTH, CAPTURE_HEIGHT );
    var count:uint = CAPTURE_WIDTH * CAPTURE_HEIGHT;
    var pixel:uint = 0;
    var r:uint = 0;
    var g:uint = 0;
    var b:uint = 0;
    var totalr:uint = 0;
    var totalg:uint = 0;
    var totalb:uint = 0;

    // Grab the current video frame and copy out the region under the sample square
    bmpd.draw( video );
    capture.copyPixels( bmpd, new Rectangle( sample.x - video.x + 1, sample.y - video.y + 1, CAPTURE_WIDTH, CAPTURE_HEIGHT ), new Point( 0, 0 ) );

    swatch.graphics.clear();
    swatch.graphics.lineStyle( 1, 0xFF0000, 0 );

    for( var h:int = 0; h < capture.height; h++ )
    {
        for( var w:int = 0; w < capture.width; w++ )
        {
            // Split the pixel into channels and accumulate the totals
            pixel = capture.getPixel( w, h );
            r = ( pixel & 0xFF0000 ) >>> 16;
            totalr = totalr + r;
            g = ( pixel & 0x00FF00 ) >>> 8;
            totalg = totalg + g;
            b = pixel & 0x0000FF;
            totalb = totalb + b;

            // Draw the captured pixel enlarged ten times in the swatch
            swatch.graphics.moveTo( w * 10, h * 10 );
            swatch.graphics.beginFill( pixel );
            swatch.graphics.drawRect( w * 10, h * 10, 9, 9 );
            swatch.graphics.endFill();
        }
    }

    // Average each channel over the captured pixels
    r = Math.round( totalr / count );
    g = Math.round( totalg / count );
    b = Math.round( totalb / count );

    // Draw the frame of the selection box
    select.graphics.clear();
    select.graphics.moveTo( 0, 0 );
    select.graphics.lineStyle( 1, 0x838383 );
    select.graphics.beginFill( 0xE8E8E8 );
    select.graphics.drawRect( 0, 0, 24, 20 );
    select.graphics.endFill();

    // Fill the selection box with the averaged color
    select.graphics.moveTo( 3, 3 );
    select.graphics.lineStyle( 1, 0xFF0000, 0 );
    select.graphics.beginFill( ( 0x000000 | r << 16 | g << 8 | b ) );
    select.graphics.drawRect( 3, 3, 19, 15 );
    select.graphics.endFill();

    // Release the temporary bitmaps, since this runs repeatedly
    bmpd.dispose();
    capture.dispose();
}
At that point the application starts scanning the entire image for the strongest impression of that color average. I was originally sampling 10 x 10 pixel areas and averaging, but found that accuracy and ease of use increased significantly when I just looked for the closest match among all the pixels in the web camera’s coverage area. This does of course mean that the smaller the web camera image, the fewer pixels there are to compare and the better the performance. Too small, however, and the plane becomes difficult to control. I landed on 160 x 120 at 24 frames per second.
// Requires import: flash.display.BitmapData
public function processImage():void
{
    var bmpd:BitmapData = new BitmapData( VIDEO_WIDTH, VIDEO_HEIGHT );
    var pixel:uint = 0;
    var distance:Number = 0;
    var percent:Number = 0;

    // Channels of the locked-in color; int keeps the differences below signed
    var r1:int = ( color >> 16 ) & 0xFF;
    var g1:int = ( color >> 8 ) & 0xFF;
    var b1:int = color & 0xFF;
    var r2:int = 0;
    var g2:int = 0;
    var b2:int = 0;

    bmpd.draw( video );

    // Start above the largest possible RGB distance (about 441.7)
    closest.color = 0;
    closest.distance = 1000;
    closest.x = 0;
    closest.y = 0;

    // Find the pixel whose color is nearest to the locked-in color
    for( var h:int = 0; h < bmpd.height; h++ )
    {
        for( var w:int = 0; w < bmpd.width; w++ )
        {
            pixel = bmpd.getPixel( w, h );
            r2 = ( pixel >> 16 ) & 0xFF;
            g2 = ( pixel >> 8 ) & 0xFF;
            b2 = pixel & 0xFF;

            // Straight-line distance between the two colors in RGB space
            distance = Math.sqrt( Math.pow( r2 - r1, 2 ) + Math.pow( g2 - g1, 2 ) + Math.pow( b2 - b1, 2 ) );

            if( closest.distance > distance )
            {
                closest.distance = distance;
                closest.x = w;
                closest.y = h;
                closest.color = pixel;
            }
        }
    }

    // Center the 10 x 10 sample square on the best match
    sample.x = video.x + closest.x - 5;
    sample.y = video.y + closest.y - 5;

    // Rotation on x-axis: the top edge maps to -MAX_ROTATION / 2, the bottom
    // edge to +MAX_ROTATION / 2. Angles wrap at 360, so negative values are fine.
    percent = closest.y / bmpd.height;
    photo.rotationX = MAX_ROTATION * ( percent - 0.50 );

    // Rotation on y-axis: the same mapping, from the left edge to the right edge
    percent = closest.x / bmpd.width;
    photo.rotationY = MAX_ROTATION * ( percent - 0.50 );

    // Release the temporary bitmap, since this runs on every frame
    bmpd.dispose();
}
There’s a lot of tweaking that can go on here to improve the result. As an example, HSB values generally work better than RGB for this kind of matching, but RGB is easier for me to grasp mentally, so I started there. Then there’s the definition of the “closest” color match, which here is simply the shortest straight-line distance between two colors in RGB space. Tracking would also likely be smoother if I focused on the area of the screen that represented the overall closest color match rather than on a single pixel. That being said, if you have Flash Player 10 (and a web camera), here’s what I came up with for your viewing pleasure.
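As for the HSB tweak mentioned above, here is a minimal sketch of what it might look like. The `HSBColor` class, its method names, and the weights in `distance()` are all hypothetical; none of them come from the original source, and the weights in particular are guesses that would need tuning against a real camera.

```actionscript
package
{
    /**
     * Hypothetical helper (not part of the original source): converts packed
     * RGB colors to hue/saturation/brightness and compares them.
     */
    public class HSBColor
    {
        public var hue:Number;        // 0 to 360 degrees
        public var saturation:Number; // 0 to 1
        public var brightness:Number; // 0 to 1

        public function HSBColor( hue:Number = 0, saturation:Number = 0, brightness:Number = 0 )
        {
            this.hue = hue;
            this.saturation = saturation;
            this.brightness = brightness;
        }

        /** Converts a packed 0xRRGGBB value to HSB. */
        public static function fromRGB( rgb:uint ):HSBColor
        {
            var r:Number = ( ( rgb >> 16 ) & 0xFF ) / 255;
            var g:Number = ( ( rgb >> 8 ) & 0xFF ) / 255;
            var b:Number = ( rgb & 0xFF ) / 255;
            var max:Number = Math.max( r, g, b );
            var min:Number = Math.min( r, g, b );
            var delta:Number = max - min;
            var h:Number = 0;

            // Hue is undefined for grays (delta == 0); leave it at 0 in that case
            if( delta > 0 )
            {
                if( max == r )      h = 60 * ( ( g - b ) / delta );
                else if( max == g ) h = 60 * ( ( b - r ) / delta + 2 );
                else                h = 60 * ( ( r - g ) / delta + 4 );

                if( h < 0 ) h += 360;
            }

            return new HSBColor( h, ( max == 0 ) ? 0 : delta / max, max );
        }

        /**
         * Distance between two colors that weights hue most heavily and
         * brightness least. The weights (4, 1, 0.25) are assumptions.
         */
        public static function distance( a:HSBColor, b:HSBColor ):Number
        {
            var dh:Number = Math.abs( a.hue - b.hue );
            if( dh > 180 ) dh = 360 - dh; // hue wraps around the color wheel
            dh /= 180;                    // normalize to the range 0 to 1

            var ds:Number = a.saturation - b.saturation;
            var db:Number = a.brightness - b.brightness;

            return Math.sqrt( 4 * dh * dh + ds * ds + 0.25 * db * db );
        }
    }
}
```

Inside the pixel loop, the RGB distance could then be swapped for something like `HSBColor.distance( target, HSBColor.fromRGB( pixel ) )`, with `target` converted once before the loop. Creating an object per pixel is not free, so at 160 x 120 and 24 frames per second it would be worth measuring before committing to this.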
Note: A special thanks goes out to one of my colleagues, Serge Jespers, for discovering that you may need to change the privacy settings on the Flash Player to allow access to your web camera from this domain in order for this example to work. To enable web camera usage, right-click on the application and select “Settings…”.
Update: I added the ability to “push” and “pull” the image based on the closeness of the object to the web camera. This is done by monitoring how much of the color is present in the web camera’s viewable space, and is highly sensitive to the contrast of the selected color value. For this reason, push/pull is off by default and can be enabled by selecting the check box in the lower left-hand corner.
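The push/pull code itself isn’t shown in this post, so the following is only a sketch of one way to measure “how much of the color is present”. The `ColorCoverage` class, the `measure()` method, and the `tolerance` parameter are hypothetical names introduced for illustration.

```actionscript
package
{
    import flash.display.BitmapData;

    /** Hypothetical helper (not taken from the original source). */
    public class ColorCoverage
    {
        /**
         * Returns the fraction (0 to 1) of pixels in the frame whose RGB
         * distance to the target color is at most the given tolerance.
         */
        public static function measure( frame:BitmapData, target:uint, tolerance:Number ):Number
        {
            var r1:int = ( target >> 16 ) & 0xFF;
            var g1:int = ( target >> 8 ) & 0xFF;
            var b1:int = target & 0xFF;

            // Compare squared distances so no square root is needed per pixel
            var limit:Number = tolerance * tolerance;
            var matches:int = 0;
            var pixel:uint = 0;
            var dr:int = 0;
            var dg:int = 0;
            var db:int = 0;

            for( var y:int = 0; y < frame.height; y++ )
            {
                for( var x:int = 0; x < frame.width; x++ )
                {
                    pixel = frame.getPixel( x, y );
                    dr = ( ( pixel >> 16 ) & 0xFF ) - r1;
                    dg = ( ( pixel >> 8 ) & 0xFF ) - g1;
                    db = ( pixel & 0xFF ) - b1;

                    if( dr * dr + dg * dg + db * db <= limit )
                    {
                        matches++;
                    }
                }
            }

            return matches / ( frame.width * frame.height );
        }
    }
}
```

A caller might then map the result to depth with something like `photo.z = farZ - coverage * ( farZ - nearZ )`, where `nearZ` and `farZ` are made-up limits: the more of the frame the color fills, the closer the object is assumed to be, and the closer the image is pulled.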
This all works best if you a) use a high-contrast color such as a neon green, b) bring the object close for the initial sampling (space bar), and then c) pull the object relatively far back to reduce its overall footprint in the web camera display. I couldn’t figure out what to do with the web camera/color selection area once an initial selection had been made, so if you have any feedback, I’d very much welcome your thoughts. In general, I just found it easier to control the image (whatever has been recently loaded onto Flickr) if the web camera was still front and center.
Rotating display objects in Flash Player 10 works just like the standard rotation property we all know and enjoy using today, but there are now rotationX and rotationY properties as well. The hard part is that you have to maintain the z-ordering yourself. I would expect the community to be quick to address this shortcoming when the final release of Flash Player 10 (and related tooling) is available.
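To illustrate the z-ordering point, here is a rough sketch of a manual depth sort. The `DepthSorter` class is a hypothetical helper, not something from the original source. It assumes the children share one unrotated parent and do not intersect each other, so comparing their raw `z` values is enough; anything more involved would need a more careful comparison.

```actionscript
package
{
    import flash.display.DisplayObject;
    import flash.display.DisplayObjectContainer;

    /** Hypothetical helper (not taken from the original source). */
    public class DepthSorter
    {
        /**
         * Reorders the children of the container so that children with a
         * larger z value (further from the viewer) are drawn first.
         */
        public static function sortByZ( container:DisplayObjectContainer ):void
        {
            var children:Array = [];
            var i:int;

            for( i = 0; i < container.numChildren; i++ )
            {
                children.push( container.getChildAt( i ) );
            }

            // Largest z first, so the furthest child ends up at index 0
            children.sortOn( "z", Array.NUMERIC | Array.DESCENDING );

            for( i = 0; i < children.length; i++ )
            {
                container.setChildIndex( DisplayObject( children[ i ] ), i );
            }
        }
    }
}
```

After changing something like `photo.rotationY` or `photo.z`, a call such as `DepthSorter.sortByZ( this )` from the parent container would restore the drawing order.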
While it’s not an amazing new 3D library or game, my first little footsteps into Flash Player 10 have been really fun. Even my four-year-old daughter enjoys making the picture move with her hands - and her colorful toys lying around the house have made great test subjects. Not bad for an afternoon of work and the industry-leading innovation of Flash Player 10! If you want the source code, I am making that available for download as usual.