This New Rendering Framework Lets Neural Networks Turn 2D Images 3D

"Our approach allows for accurate optimization over vertex positions, colors, normals, light directions and texture coordinates through a variety of lighting models.”

By CBR Staff Writer

Researchers at Nvidia say they have created a rendering framework that can produce 3D objects from 2D images with the correct shape, color, texture and lighting, an ability that could help machine learning models achieve depth perception.

The rendering framework, called DIB-R (a differentiable interpolation-based renderer), produces 3D objects from 2D images and was presented this week at the annual Conference on Neural Information Processing Systems in Vancouver, Canada.

The framework, when wrapped around a neural network, learns to predict shape, texture, and light from single images and generate 3D shapes from a photo.

In the paper presented this week, the researchers (from Nvidia, the University of Toronto, Vector Institute, McGill University and Aalto University) noted: “Many machine learning models operate on images, but ignore the fact that images are 2D projections formed by 3D geometry interacting with light, in a process called rendering…

“Enabling machine learning models to understand the image formation process could facilitate disentanglement of geometry from the lighting effects, which is key in achieving invariance and robustness.”

[Image: 2D to 3D rendering. Credit: Nvidia]

DIB-R 2D to 3D Rendering

DIB-R uses an encoder-decoder architecture to transform the input 2D image into a feature map, which is then used to predict properties of the object such as its shape, color, texture and lighting.

DIB-R starts from a polygon sphere and deforms it until it matches the object in the 2D image it is trying to reproduce in 3D. The researchers trained the model on several image datasets, ranging from a collection of bird photos to images of vehicles.

DIB-R could potentially be used by archaeological researchers to create 3D models of objects that have been discovered and photographed during excavations.

[Image: 2D to 3D rendering. Credit: DIB-R Paper]

Training the model takes just two days on a single NVIDIA V100 GPU. Once trained, DIB-R can create a 3D object from the data of a 2D image within 100 milliseconds. DIB-R is built on the machine learning framework PyTorch.

The researchers noted: “Key to our approach is to view foreground rasterization as a weighted interpolation of local properties and background rasterization as a distance-based aggregation of global geometry. Our approach allows for accurate optimization over vertex positions, colors, normals, light directions and texture coordinates through a variety of lighting models.”
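
To make the two ideas in that quote more concrete, here is a minimal Python sketch. It is not the researchers' code: the function names, the 2D setup, the point-to-triangle distance and the `delta` sharpness parameter are all assumptions chosen for illustration. A pixel inside a triangle takes its color as a barycentric-weighted blend of the three vertex colors (the "weighted interpolation of local properties"), while every pixel receives a soft coverage value that falls off with its distance to each triangle (a "distance-based aggregation").

```python
import math

def barycentric(p, a, b, c):
    """Barycentric weights of 2D point p with respect to triangle (a, b, c)."""
    det = (b[1] - c[1]) * (a[0] - c[0]) + (c[0] - b[0]) * (a[1] - c[1])
    w0 = ((b[1] - c[1]) * (p[0] - c[0]) + (c[0] - b[0]) * (p[1] - c[1])) / det
    w1 = ((c[1] - a[1]) * (p[0] - c[0]) + (a[0] - c[0]) * (p[1] - c[1])) / det
    return w0, w1, 1.0 - w0 - w1

def interpolate(p, tri, attrs):
    """Foreground: blend per-vertex attributes (e.g. RGB) with barycentric weights."""
    weights = barycentric(p, *tri)
    return tuple(
        sum(w * attr[k] for w, attr in zip(weights, attrs))
        for k in range(len(attrs[0]))
    )

def dist_to_segment(p, a, b):
    """Euclidean distance from point p to the segment a-b."""
    abx, aby = b[0] - a[0], b[1] - a[1]
    t = ((p[0] - a[0]) * abx + (p[1] - a[1]) * aby) / (abx * abx + aby * aby)
    t = max(0.0, min(1.0, t))  # clamp the projection onto the segment
    return math.hypot(p[0] - (a[0] + t * abx), p[1] - (a[1] + t * aby))

def dist_to_triangle(p, tri):
    """Zero inside the triangle, otherwise distance to the nearest edge."""
    if all(w >= 0.0 for w in barycentric(p, *tri)):
        return 0.0
    a, b, c = tri
    return min(dist_to_segment(p, a, b),
               dist_to_segment(p, b, c),
               dist_to_segment(p, c, a))

def soft_alpha(p, triangles, delta=0.1):
    """Background: aggregate a distance-based influence from every triangle.

    Each triangle contributes exp(-distance / delta); contributions are
    combined as 1 - prod(1 - contribution). delta is an assumed parameter.
    """
    miss = 1.0
    for tri in triangles:
        miss *= 1.0 - math.exp(-dist_to_triangle(p, tri) / delta)
    return 1.0 - miss

# Hypothetical example: one triangle with red, green and blue corners.
triangle = ((0.0, 0.0), (1.0, 0.0), (0.0, 1.0))
colors = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))

inside_color = interpolate((0.25, 0.25), triangle, colors)
alpha_inside = soft_alpha((0.25, 0.25), [triangle])
alpha_near = soft_alpha((-0.05, 0.5), [triangle])
alpha_far = soft_alpha((2.0, 2.0), [triangle])
```

Because both the interpolated color and the soft coverage value vary smoothly with the vertex positions and vertex colors, a framework such as PyTorch can compute gradients through them, which is what lets a renderer of this kind sit inside a neural network's training loop.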
