Getting minute, subpixel movements can ironically give you MORE resolution if you process them over time, though you'd probably need some sort of "anchor" points.
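Here's a toy 1D sketch of that idea (purely illustrative; it assumes the subpixel offsets are known exactly, which is exactly what the "anchor" points would have to provide):

```python
import numpy as np

# Toy illustration: a detailed 1D signal is sampled at low resolution four
# times, each capture shifted by a quarter "pixel". Interleaving the shifted
# captures recovers the full-resolution signal, because together the shifted
# samples cover a 4x denser grid.
fine = np.sin(np.linspace(0, 8 * np.pi, 400))  # "ground truth" scene
factor = 4                                     # subpixel steps per pixel

# Each capture sees every 4th sample, starting at a different offset.
captures = [fine[offset::factor] for offset in range(factor)]

# With the offsets known (the "anchor" points), interleave back to full res.
recovered = np.empty_like(fine)
for offset, capture in enumerate(captures):
    recovered[offset::factor] = capture

assert np.allclose(recovered, fine)  # all detail recovered, none invented
```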
This is one of those articles that could benefit immensely from a single picture or a quick video/GIF, but it provides neither.
> The other thing I need to do is subpixel-correct lines, because they're so much nicer in motion than non-subpixel-correct lines. Don't confuse this with AA lines - you can have chunky non-AA single-colour pixel lines that are ALSO subpixel-correct.
> Why this is important is hard to convey in a static picture, but in motion the quality difference is very obvious. And it turns out the speed difference is not actually significant, so why not go for the extra quality?
This claim is made almost immediately in the article, but then no proof is offered in picture OR video!
The breakdown of the algorithm and general writing is great though.
Imagine that instead of getting a grid of pixels once every 30th of a second, you get one pixel's value, along with its location and the time stamp at which the pixel's change was noticed. Event cameras can have very fine time stamp resolution (orders of magnitude better than 1/30th of a second), and so a bright moving pixel can be tracked very accurately.
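A minimal sketch of what that stream looks like and how you might track a bright spot from it (the Event type and the tracking helper here are hypothetical, just to make the idea concrete):

```python
from dataclasses import dataclass

@dataclass
class Event:
    x: int
    y: int
    t_us: int      # microsecond timestamp, far finer than 1/30 s
    polarity: int  # +1 brightness went up, -1 it went down

def track_bright_spot(events, window_us=1000):
    """Estimate the spot's position once per window by averaging the
    locations of all events that fired inside that time window."""
    track, bucket, window_start = [], [], None
    for ev in sorted(events, key=lambda e: e.t_us):
        if window_start is None:
            window_start = ev.t_us
        if ev.t_us - window_start >= window_us and bucket:
            track.append((window_start,
                          sum(e.x for e in bucket) / len(bucket),
                          sum(e.y for e in bucket) / len(bucket)))
            bucket, window_start = [], ev.t_us
        bucket.append(ev)
    if bucket:  # flush the final partial window
        track.append((window_start,
                      sum(e.x for e in bucket) / len(bucket),
                      sum(e.y for e in bucket) / len(bucket)))
    return track
```

Shrinking window_us trades noise for temporal precision; with microsecond timestamps you can go far below one frame time.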
Yes, I've also always felt there must be ways to extract more data from a moving clip, precisely because of the effect he explains, but then it seems that just superimposing the images doesn't actually extract that information, at least not all of it.
But I wonder how to actually do it. Do you have concrete ideas for a simple algorithm?
> it should be possible to create images from a static video feed that have higher resolution than the video
You can resolve past the compression artifacts and noise, but you can only gain real resolution if the video is not perfectly static. Astronomy has specific techniques, like Bayer Drizzle, that push past the resolution of the sensor itself via slight motion, but a perfectly locked-off shot won't get you extra information.
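A minimal sketch of the shift-and-add idea behind drizzle-style stacking (not Drizzle proper, which also shrinks each input pixel's footprint before placing it; the per-frame shifts here are assumed to be known already from registration):

```python
import numpy as np

def shift_and_add(frames, shifts, factor=2):
    """Naive shift-and-add super-resolution.

    frames: low-res captures of the same (nearly) static scene
    shifts: per-frame (dy, dx) subpixel offsets, assumed known
    factor: upsampling factor of the fine output grid
    """
    h, w = frames[0].shape
    acc = np.zeros((h * factor, w * factor))
    hits = np.zeros_like(acc)
    ys, xs = np.mgrid[0:h, 0:w]
    for frame, (dy, dx) in zip(frames, shifts):
        # Land each low-res sample on the fine grid at its shifted position.
        fy = np.clip(np.round((ys + dy) * factor).astype(int), 0, h * factor - 1)
        fx = np.clip(np.round((xs + dx) * factor).astype(int), 0, w * factor - 1)
        np.add.at(acc, (fy, fx), frame)
        np.add.at(hits, (fy, fx), 1.0)
    return acc / np.maximum(hits, 1)  # average overlaps; unhit cells stay 0
```

The slight motion is what spreads the samples over different fine-grid cells; with a truly locked-off shot every frame lands in the same cells and you just average noise.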
It might be less true for pixelated video. I know that some IR camera companies use very small camera movements to compute the picture at a higher resolution. And I think the first picture of a black hole also used a variant of this technique.
Not really. You would notice a completely different type of picture then; it would not be consistent. It has to be moving a bit everywhere in every single frame.
From a series of images, i.e. video, sure. From a single image? Not so much.
In video there is a lot of temporal information, and even if the spatial resolution isn't high enough in a single image, one can accumulate a higher-resolution version of the scene from multiple observations.
It should also be possible to train it on itself to improve moving scenes by using the motion itself as temporal super-sampling, just like the human eye does.
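For that accumulation to work you first have to estimate the per-frame subpixel shifts. Phase correlation is the standard trick; this sketch returns the integer-pixel peak, which real pipelines then refine to subpixel accuracy by interpolating around it:

```python
import numpy as np

def phase_correlation_shift(a, b):
    """Estimate the translation between two frames of the same scene.

    The cross-power spectrum of the two images has a sharp peak at
    their relative shift."""
    fa, fb = np.fft.fft2(a), np.fft.fft2(b)
    cross = fa * np.conj(fb)
    cross /= np.maximum(np.abs(cross), 1e-12)  # keep only the phase
    corr = np.fft.ifft2(cross).real
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    # Wrap shifts larger than half the frame back to negative values.
    if dy > a.shape[0] // 2:
        dy -= a.shape[0]
    if dx > a.shape[1] // 2:
        dx -= a.shape[1]
    return dy, dx
```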
While this is a cool idea, how feasible would it be to extend any particular scene by that short a time? If the video is output at a typical 24 fps, the shortest interval by which any scene could be modified would be ~42 ms. Even with a lot of newer footage being filmed at 48 fps, it can still only be clipped at a granularity of ~21 ms.
Past that, I've seen a small amount of time shifting take place during a not-so-careful re-encode. At 1 ms precision, even this would be enough to throw off such a tracking system.
If you can assume the scene you're capturing is fairly static over time, you can sometimes sort of cheat physics a little bit (keeping space constant and leveraging time) by resampling areas multiple times from different perspectives and cleverly combining that data together.
The problem is that most of the interesting bits people want more data on are quite dynamic in both space and time, not just time. Even when they're not, the improvement doesn't scale linearly with the number of subpixel samples, and such approaches eventually hit diminishing returns.
I wonder if there are interesting applications of being able to capture images at this kind of speed at high resolution. For example, using image analysis between frames to get some kind of depth map, based on subtle differences between each frame due to slight hand movements.
The structure-from-motion stuff is good enough if you've got a couple of pictures from around the human without the human moving (much), but I suspect getting enough pictures without the human moving too far might be difficult.
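A rough sketch of the flow-as-parallax idea (assuming OpenCV is available): between two frames taken a hand tremor apart, nearer points slide further across the image than distant ones, so dense optical-flow magnitude is a crude inverse-depth proxy. A real pipeline would also have to estimate and remove camera rotation, which produces flow without parallax.

```python
import cv2
import numpy as np

def depth_proxy(frame_a, frame_b):
    """Crude inverse-depth map from the parallax between two frames."""
    ga = cv2.cvtColor(frame_a, cv2.COLOR_BGR2GRAY)
    gb = cv2.cvtColor(frame_b, cv2.COLOR_BGR2GRAY)
    # Dense per-pixel motion between the two frames.
    flow = cv2.calcOpticalFlowFarneback(
        ga, gb, None,
        pyr_scale=0.5, levels=3, winsize=15,
        iterations=3, poly_n=5, poly_sigma=1.2, flags=0)
    parallax = np.linalg.norm(flow, axis=2)      # pixels moved per point
    return parallax / max(parallax.max(), 1e-6)  # larger = closer (roughly)
```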
> How does better temporal resolution (higher frame rate) enable better spatial resolution (see makeup detail)? In no or low motion, there can be no difference.
Because you're not watching a single still frame - it's more like your brain is taking in an integral of visual information over time. You don't need much motion to perceive a visual difference (even if it's just a 'feeling'). E.g. http://iwdrm.tumblr.com/
Can you elaborate on this? If you have multiple frames where the target has moved around and you line up the images, couldn't you get a single image with more information?
By the time you can see the motion on your image sensors it's too late - if you want to detect and correct the motion, you need to do it while you're imaging, before it smears all your measurements.