Facebook has released an update on its visual perception technology for VR and AR. Software engineers at the social network have posted details of a computation model that uses deep learning to enable effective full-body tracking on a mobile device.
“We recently developed a new technology that can accurately detect body poses and segment a person from their background,” said the Facebook AI Camera Team in a research post.
Although the tracking capability itself is not revolutionary, the research team went into some detail about how it created the neural network model needed to run on a smartphone in real time.
Rather than relying on the cumbersome ResNet backbone typically used in Facebook’s initial Mask R-CNN model, engineers optimized the number of convolution layers as well as the width of each layer to lighten the compute load.
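To see why trimming depth and width matters, the rough cost of a stack of convolution layers can be estimated in a few lines. The sketch below is illustrative only: the function name, the layer counts, channel widths and feature-map size are assumptions chosen for the example, not figures from Facebook's post, and real backbones include other layer types that are ignored here.

```python
def conv_stack_cost(depth, channels, feature_h=56, feature_w=56,
                    kernel=3, in_channels=3):
    """Estimate (parameters, multiply-accumulates) for a plain stack of
    same-padded, stride-1 convolution layers.

    depth    -- number of convolution layers (hypothetical value)
    channels -- output channels per layer, i.e. the layer "width"
    """
    params = 0
    macs = 0
    c_in = in_channels
    for _ in range(depth):
        weights = kernel * kernel * c_in * channels
        params += weights + channels             # weights plus one bias per channel
        macs += weights * feature_h * feature_w  # one MAC per weight per output pixel
        c_in = channels                          # next layer consumes this layer's output
    return params, macs


if __name__ == "__main__":
    # Hypothetical configurations: a wider, deeper stack versus a slimmer one.
    for depth, channels in [(10, 256), (10, 128), (6, 128)]:
        params, macs = conv_stack_cost(depth, channels)
        print(f"depth={depth:2d} width={channels:3d} "
              f"params={params:,} MACs={macs:,}")
```

Because each layer's weight count depends on both its input and output channels, halving the width cuts the cost of a deep stack to roughly a quarter, while removing layers cuts it roughly in proportion to the number removed.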
“Developing computer vision models for mobile devices is a challenging task,” researchers wrote. “A mobile model has to be small, fast and accurate without large memory requirements.”
Data scientists noted that the small model size and fast runtime were made possible by the modularity of Caffe2, Facebook’s scalable deep learning framework (not to be confused with Intel Coffee Lake CPUs), first made open source in April 2017. Engineers utilised mobile CPU and GPU libraries “including NNPack, SNPE and Metal” to boost computation speed.
“Our final model is only a few megabytes and is very accurate,” researchers said.
Facebook AI Research (FAIR) has made Caffe2 operators open-source, including GenerateProposalsOp, BBoxTransformOp, BoxWithNMSLimit, and RoIAlignOp.
Earlier in January, Facebook partnered with the University of Washington to establish a new research centre for AR/VR technologies. The UW Reality Lab, built within the School of Computer Science & Engineering and located in Seattle, is also home to the Oculus Research division of Facebook which produces the company’s popular headset.
“As we work to give people the power to build community and bring the world closer together, augmented reality and virtual reality will form a growing role as the technical foundation for many experiences,” said Michael Cohen, Director of Computational Photography at Facebook and UW Reality Lab Advisory Board Member.