A webcam is enough to produce a real-time 3D model of a moving hand
Capturing hand and finger movements within milliseconds is becoming increasingly important for many applications, from virtual reality to human-machine interaction and Industry 4.0. Until now, this has required enormous technical effort, which in turn has limited the possible applications. Computer scientists at the Max Planck Institute for Informatics have now developed a software system that, thanks to the interplay of several neural networks, requires only the built-in camera of a laptop. The researchers will present the program for the first time at stand G75 in hall 27 of the computer fair Cebit, which takes place in Hannover from June 11th onward.
When computer scientist Franziska Müller holds her hand in front of the laptop camera, the hand's virtual counterpart appears on the screen and is immediately overlaid with a colorful virtual hand skeleton. No matter what movements Müller's hand makes in front of the webcam, the colored bones of the model do the same. Müller is demonstrating the software she developed together with Professor Christian Theobalt and other researchers from the Max Planck Institute for Informatics in Saarbrücken, Stanford University and the Spanish King Juan Carlos University. So far, no other software can do this with such a low-cost camera.
Because it works in almost every kind of filmed scene, the software can be used anywhere, which gives it an edge over previous approaches that require a depth camera or multiple cameras. The algorithm with which the software transforms the two-dimensional information of the video image into a three-dimensional movement model of the hand's bones in real time is based on a special kind of neural network: a so-called "convolutional neural network", or CNN for short. The researchers have trained it to detect the bones of the hand, and they generated the necessary training data with another neural network. The result: the software calculates the exact 3D poses of the hand's bones in milliseconds. Even if some of the bones are occluded, for example by an apple being held in the hand, the software is not affected. Only several hands working together still confuse it. Solving this is the researchers' next goal.
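To illustrate the final overlay step, the following Python sketch shows how 3D joint positions, once a network has estimated them, could be turned into 2D line segments drawn over the video frame as a hand skeleton. The 21-joint layout, the function names and the camera parameters are assumptions made for this example. They are not taken from the researchers' software, and the neural networks themselves are not shown.

```python
# Hypothetical 21-joint hand layout: index 0 is the wrist,
# followed by four joints for each of the five fingers.
FINGERS = ("thumb", "index", "middle", "ring", "little")


def build_bones():
    """Return (parent, child) joint index pairs: 4 bones per finger, 20 in total."""
    bones = []
    for f in range(len(FINGERS)):
        base = 1 + 4 * f
        bones.append((0, base))  # wrist to the first joint of this finger
        for j in range(base, base + 3):
            bones.append((j, j + 1))  # consecutive joints along the finger
    return bones


def project(point, focal_px, cx, cy):
    """Pinhole projection of a camera-space 3D point to pixel coordinates."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must lie in front of the camera")
    return (focal_px * x / z + cx, focal_px * y / z + cy)


def overlay_segments(joints_3d, focal_px=600.0, cx=320.0, cy=240.0):
    """Return one 2D line segment per bone, ready to draw over a frame.

    The default camera parameters are placeholder values, not calibrated ones.
    """
    if len(joints_3d) != 21:
        raise ValueError("expected 21 joints (wrist plus 4 per finger)")
    pixels = [project(p, focal_px, cx, cy) for p in joints_3d]
    return [(pixels[a], pixels[b]) for a, b in build_bones()]
```

Calling `overlay_segments` with the 21 estimated joints of one frame yields 20 segments, one per bone, which a drawing routine could then color by finger.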
Press photos: www.uni-saarland.de/pressefotos
Questions can be directed to:
Max Planck Institute for Informatics
Saarland Informatics Campus E1.4
Phone: +49 681 9325 4057
Competence Center Computer Science Saarland
Saarland Informatics Campus
Phone: +49 681 302 70741