Xbox 360 Kinect: Much More Than Just a Toy

Note: This article is hosted here for archival purposes only. It does not necessarily represent the values of the Iron Warrior or Waterloo Engineering Society in the present day.

The Xbox 360 Kinect, released last November, has been a phenomenal success in the gaming world. The Kinect was expected to ship only 5 million units in its first two months; instead, 8 million of the 3D motion-sensing devices were shipped. To the unsuspecting consumer, the Kinect is just another gaming toy that lets players physically interact with video games through their Xbox 360 consoles. To engineers and imaging experts, however, the Kinect is not just another toy but a powerful image-capturing tool.

The Kinect sensor uses an array of cameras and microphones to track multiple people within its sensor range. The sensor itself contains three visual cameras, the most basic of which is a standard RGB video camera for picture and video capture. The two additional cameras are strictly grayscale, but use an infrared tracking system to interpret depth and movement. The depth tracking system projects thousands of dots, viewable with infrared goggles or cameras, into the area in front of the sensor. The infrared cameras pick up these dots and measure differences in their apparent size to interpret the depth of objects: basically, the smaller the dots appear, the farther away the surface is. Based on the information the depth sensors record, as well as the different viewpoint each camera sees, the sensor can combine the data into a 3D representation of the world, similar to how our eyes work. The Kinect microphone array is able to isolate multiple voice channels based on the different microphone pickups on the sensor; combined with the cameras, this lets it support voice chat for multiple people in the room.
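The "similar to how our eyes work" idea can be illustrated with the standard stereo triangulation relation, in which depth equals focal length times the distance between the two viewpoints, divided by how far a feature shifts between the two views (the disparity). The sketch below is only an illustration of that textbook relation, not the Kinect's actual depth algorithm, which is not described here. The function name, focal length, and baseline are all assumed values chosen for the example, not Kinect specifications.

```python
def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Estimate depth in metres using Z = f * B / d.

    disparity_px    -- shift of a feature between the two views, in pixels
    focal_length_px -- camera focal length, in pixels (assumed value)
    baseline_m      -- distance between the two viewpoints, in metres (assumed value)
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_length_px * baseline_m / disparity_px


# Illustrative numbers only; these are not measured Kinect parameters.
FOCAL_LENGTH_PX = 580.0
BASELINE_M = 0.075

if __name__ == "__main__":
    # A smaller shift between the two views means a more distant surface.
    for disparity in (40.0, 20.0, 10.0):
        depth = depth_from_disparity(disparity, FOCAL_LENGTH_PX, BASELINE_M)
        print(f"disparity {disparity:5.1f} px -> depth {depth:.3f} m")
```

Halving the disparity doubles the estimated depth, which is why distant surfaces are harder to measure precisely: a one-pixel error matters more when the shift is small.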

The Kinect software is able to use the 3D map of the sensor’s surroundings to track the motion of people, overlaying a virtual skeleton from which it extracts the location, motion, and speed of an individual. Gesture recognition is an important part of the Kinect’s system, allowing it to translate body motion into actual commands for the system.
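To make the skeleton idea concrete, here is a minimal sketch of how joint positions might be turned into a speed measurement and a simple command. It assumes a hypothetical skeleton format (a dictionary mapping joint names to x, y, z positions in metres, with y pointing up); the joint names, the "hand above head" gesture, and the "select" command are invented for illustration and do not come from any Kinect driver or SDK.

```python
import math


def joint_speed(prev_pos, curr_pos, dt_s):
    """Speed of one joint in metres per second between two frames."""
    return math.dist(prev_pos, curr_pos) / dt_s


def is_hand_raised(skeleton):
    """True when the right hand is higher than the head (y axis points up)."""
    return skeleton["right_hand"][1] > skeleton["head"][1]


def to_command(skeleton):
    """Map a pose to a command name, or None if no gesture is recognised."""
    return "select" if is_hand_raised(skeleton) else None


# Two hypothetical frames captured half a second apart.
frame_a = {"head": (0.0, 1.6, 2.0), "right_hand": (0.3, 1.0, 2.0)}
frame_b = {"head": (0.0, 1.6, 2.0), "right_hand": (0.3, 1.8, 2.0)}

if __name__ == "__main__":
    speed = joint_speed(frame_a["right_hand"], frame_b["right_hand"], 0.5)
    print(f"right hand speed: {speed:.2f} m/s")
    print("command in frame A:", to_command(frame_a))
    print("command in frame B:", to_command(frame_b))
```

A real system would need to smooth noisy joint positions and recognise gestures over many frames, but the basic pattern of comparing joint positions over time is the same.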

Shortly after the Kinect’s release, developers set out to use the Kinect sensor for third-party development. Drivers were written to take advantage of the sensor’s hardware, and soon after, people began sharing their Kinect “hacks” all over the internet. Microsoft initially opposed the use of the Kinect with anything other than its Xbox 360 console, but more recently the company has embraced the development community, promising to release a Kinect Software Development Kit (SDK) this upcoming spring.

Although no formal SDK exists yet, third-party development so far has been quite impressive. For example, two university groups have each used the Kinect’s gesture recognition capabilities as a user interface. One group has used the Kinect camera as a replacement for keyboard input in World of Warcraft, using basic arm movements to control the camera as well as player movement. More detailed hand motions are used to select weapons and spells as well as for on-screen selection.

Another group is developing a Kinect program to recognize basic sign language, with plans to increase its vocabulary and accuracy. The program is currently approximately 95% accurate on simple sentences built from a select number of words. While the program is currently limited, it shows the technology’s future potential. Imagine a hearing-impaired person signing to the Kinect, which interprets and reads out the conversation to a blind person on the other end of the chat.

Groups within Waterloo are also using the Kinect for engineering and scientific purposes. One of those groups is a Mechatronics fourth-year design project (FYDP) team made up of Aditya Sharma, Daryl Tiong, Kirk MacTavish, and Sean Anderson. The goal for their FYDP, according to correspondence with Sharma, was “to develop a low-cost module that provides both your position and surrounding map information for any custom user applications.” Initially, the group wanted to use a stereo camera system in combination with Light Detection and Ranging (LIDAR) technology to give small robots a solution for mapping and path planning, but according to Anderson, the release of the Kinect required them to shift their strategy: “Since the release of the Kinect, we were forced to rethink our initial plan. The Microsoft Kinect is capable of providing a very high quality 3D image, similar to that of other Time-of-Flight Cameras which usually carry a price tag near $10,000. With the Kinect being sold at toy stores for a mere $150, it gave us the opportunity to use much richer data for a fraction of our initial price estimate.” The Mechatronics FYDP symposium is set to take place on March 21, where their project will be on display. If you would like to track the group’s progress, check out their website at http://www.ic2020.ca.

The value of the Kinect in the engineering world clearly surpasses its entertainment purpose. It has provided cheap access to cutting-edge imaging technology, letting users around the world apply the Kinect to purposes for which it was not designed. The Kinect has also opened up gesture-based control to the masses, and that style of control will likely spread to many forms of technology in the coming years. Imagine being able to control kitchen appliances or televisions with hand gestures from across the room. The technology is already available, and its implementation is only a few short years away.