What's new about this? That it's faster? People have been reconstructing 3D images from multiple photos for over a decade. The experimental work today is constructing a 3D image from a single photo, using a neural net to fill in a reasonable model of the stuff you can't see.
Yep. There has been some work to figure out frame-to-frame coherence on a sequence of images in 2D. But, I think this skips over that problem by working in 3D.
All of these videos that zoom into a static image get me every time I see one. Does anyone have insight into how to create a reasonably accurate 3D image like this? I'm sure there's a dataset available for public use, since the work was publicly funded. I'm really curious how much hardware (RAM/CPU/GPU) it takes, and what kind of render times are involved to make these types of videos.
I think that 3D scanners would be a perfect solution to this. It becomes easier and easier to digitize real world objects, even based on a single-camera video.
Take pictures of a 3D object from, say, 100 different angles, recording which angle each was taken from. Now use this Mona Lisa algorithm to place 3D polygons inside a box of sufficient size, keeping those that look most like the original object when viewed from all those angles.
Will you get a compressed 3D representation of an object this way?
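The "keep geometry consistent with every view" idea above is essentially classical visual-hull reconstruction (space carving). Here's a toy sketch under simplifying assumptions I've added: a voxel grid instead of polygons, and three orthographic silhouette views instead of 100 calibrated photos.

```python
import numpy as np

# Toy space carving: keep only the voxels whose projection lands inside
# the object's silhouette in every view. Real systems use many calibrated
# perspective cameras; this uses three orthographic views for simplicity.

N = 32
# Ground-truth object: a solid sphere inside an N^3 voxel grid.
idx = np.indices((N, N, N))
center, radius = N / 2, N / 4
truth = ((idx - center) ** 2).sum(axis=0) <= radius ** 2

# "Photographs": binary silhouettes seen along each of the three axes.
silhouettes = [truth.any(axis=a) for a in range(3)]

# Start from a fully occupied box and carve away any voxel that falls
# outside some silhouette.
carved = np.ones((N, N, N), dtype=bool)
for axis, sil in enumerate(silhouettes):
    carved &= np.expand_dims(sil, axis)

# The carved result is the visual hull: it contains the true object but
# may over-approximate it (concavities can't be carved from silhouettes).
print(int(truth.sum()), int(carved.sum()))
```

Note the answer to the compression question: the hull is only an over-approximation, and storing it as voxels or polygons isn't inherently compressed, but adding views monotonically shrinks the hull toward the true shape.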
With enough training data I’m sure it can be done... I’ve seen CNNs that fill in 3D scenes and animate them from two images. I would guess this was a simpler problem?