
That should be easily solvable using 3D convolutions and processing a short clip (~10 frames) instead of a single picture.
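A minimal sketch of that idea, using a naive NumPy 3D convolution over a short clip; the clip size and the single random 3x3x3 spatiotemporal kernel are placeholders, not a real trained model:

```python
import numpy as np

def conv3d(clip, kernel):
    """Naive valid-mode 3D convolution over a (frames, H, W) clip."""
    kf, kh, kw = kernel.shape
    f, h, w = clip.shape
    out = np.zeros((f - kf + 1, h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            for k in range(out.shape[2]):
                out[i, j, k] = np.sum(clip[i:i+kf, j:j+kh, k:k+kw] * kernel)
    return out

clip = np.random.rand(10, 32, 32)   # a ~10-frame clip instead of one image
kernel = np.random.rand(3, 3, 3)    # one placeholder spatiotemporal filter
features = conv3d(clip, kernel)     # responses across space AND time
```

The point is only that the kernel slides along the time axis too, so motion across neighboring frames contributes to each output value.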



I like to imagine doing this with 3D Gaussian splatting of the videos. Not computationally feasible yet.

Interesting idea. The issue is that (I'm guessing) you'd also need some intense software to decipher the image and convert it into a 3D file.

What's new about this? That it's faster? People have been reconstructing 3D images from multiple photos for over a decade. The experimental work today is constructing a 3D image from a single photo, using a neural net to fill in a reasonable model of the stuff you can't see.

Yep. There has been some work to figure out frame-to-frame coherence on a sequence of images in 2D. But, I think this skips over that problem by working in 3D.

What's the next step? Generating 3d scenes out of these images? Would that be feasible?

All of these videos that zoom into the static image get me every time I see one. Does anyone have any insights on how to create a somewhat accurate 3D image like this? I'm sure there's a dataset available for public use since it was publicly funded. I'm really curious how much RAM/CPU/GPU it takes, and what kind of render times are involved in making these types of videos.

I think that 3D scanners would be a perfect solution to this. It becomes easier and easier to digitize real world objects, even based on a single-camera video.

http://youtu.be/gu5Ywwb4RaU


The video is wild. Now we need an AI 3D infill to block in all the missing data at the edges of the view.

Here's something I'd like to see someone try.

Take pictures of a 3D object from, say, 100 different angles, remembering which angles they were taken from. Now use this Mona Lisa algorithm to put 3D polygons inside a box of sufficient size, keeping those that most resemble the original object when viewed from all those angles.

Will you get a compressed 3D representation of an object this way?
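A toy version of that keep-it-if-it-looks-right loop, with two simplifying assumptions: a voxel grid stands in for polygons, and axis-aligned sum-projections stand in for rendered camera views:

```python
import numpy as np

rng = np.random.default_rng(0)

# The "object" to reconstruct, and its three projected views.
target = np.zeros((8, 8, 8))
target[2:6, 2:6, 2:6] = 1.0
target_views = [target.sum(axis=a) for a in range(3)]

def error(scene):
    """How different the scene's projections look from the target's."""
    return sum(np.abs(scene.sum(axis=a) - v).sum()
               for a, v in enumerate(target_views))

scene = np.zeros_like(target)          # start with an empty box
best = error(scene)
for _ in range(5000):
    i, j, k = rng.integers(0, 8, size=3)
    scene[i, j, k] = 1.0 - scene[i, j, k]   # random mutation: toggle a voxel
    e = error(scene)
    if e < best:
        best = e                            # keep mutations that help...
    else:
        scene[i, j, k] = 1.0 - scene[i, j, k]  # ...revert the rest
```

Whether the result is "compressed" depends on the primitive count; a voxel grid isn't, but the same loop over a small set of polygons could be.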


Probably with some simple optical flow view morphing. There's enough data from all the cameras to enable creating a stereo image.
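A crude sketch of view morphing, assuming a dense flow field between the two camera views is already available; it forward-warps one image halfway along the flow, with no occlusion handling and nearest-pixel splatting:

```python
import numpy as np

def morph_halfway(img_a, flow):
    """Forward-warp img_a halfway along a dense (dy, dx) flow field,
    approximating the view midway between two cameras."""
    h, w = img_a.shape
    mid = np.zeros_like(img_a)
    for y in range(h):
        for x in range(w):
            dy, dx = flow[y, x]
            ty = int(round(y + 0.5 * dy))   # move only half the way
            tx = int(round(x + 0.5 * dx))
            if 0 <= ty < h and 0 <= tx < w:
                mid[ty, tx] = img_a[y, x]
    return mid

img = np.zeros((8, 8)); img[2, 2] = 1.0
flow = np.full((8, 8, 2), 4.0)   # toy flow: every pixel moves (+4, +4)
mid = morph_halfway(img, flow)   # the bright pixel lands at (4, 4)
```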

You'd need four pictures, two of each side, to capture enough information to generate a 3D model, instead of just the one.
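For recovering actual 3D positions, even two calibrated views suffice per point. A standard linear (DLT) triangulation sketch, with toy camera matrices chosen purely for illustration:

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one 3D point from two views.
    P1, P2: 3x4 camera projection matrices; x1, x2: (u, v) image points."""
    A = np.array([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]                       # null vector = homogeneous 3D point
    return X[:3] / X[3]

# Toy setup: identity camera, and one translated 1 unit along x.
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])
X_true = np.array([0.5, 0.2, 4.0])
x1 = X_true[:2] / X_true[2]                               # image point in view 1
x2 = (X_true[:2] + np.array([-1.0, 0.0])) / X_true[2]     # image point in view 2
X_est = triangulate(P1, P2, x1, x2)
```

With noise-free correspondences the SVD recovers the point exactly; real matches need more views or a robust solver.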

Better, but still vulnerable.


That's super interesting. Are you trying to do single-view image to 3D?

Wow, I really want to see a photogrammetry system that can turn that video into a reconstruction of the area.

The next step is video. Adding a temporal dimension will make it easier to extrapolate the true 3D shapes of recognized objects.

With enough training data I’m sure it can be done... I’ve seen CNNs that fill in 3D scenes and animate them from two images. I would guess this was a simpler problem?

A more effective version of this would capture a 3D depth map with the 2D image.
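Once you have a depth map alongside the 2D image, back-projecting it into a point cloud is a few lines with a pinhole camera model; the intrinsics (fx, fy, cx, cy) below are placeholders for real calibration values:

```python
import numpy as np

def depth_to_points(depth, fx, fy, cx, cy):
    """Back-project a depth map into an (N, 3) point cloud
    using a pinhole camera model."""
    h, w = depth.shape
    v, u = np.mgrid[0:h, 0:w]        # pixel coordinates
    z = depth
    x = (u - cx) * z / fx            # invert the pinhole projection
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=-1).reshape(-1, 3)

depth = np.full((4, 4), 2.0)         # toy depth map: a flat wall 2 m away
points = depth_to_points(depth, fx=100.0, fy=100.0, cx=2.0, cy=2.0)
```

Each RGB pixel can then be attached to its corresponding point to give a colored cloud.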

Probably the easiest solution is a 360° video camera, since YT allows you to view them... but I'm not sure how good the quality is or how bad the distortion.

This is from last year's conference, running on a laptop in real time: https://www.youtube.com/watch?v=oJt3Ln8H03s

'Computer tricks' are already here, with full 3D reconstruction in real time.


Suggestion: Use "Photosynth"-style techniques to fuse the video frames from the moving cameras into a 3D model of the room.
