
> The copying process was never that much of a big deal

I don't know about that? Texture memory management in games can be quite painful. You have to consider different hardware setups and whether you can keep the textures a given scene needs resident in memory (and if you can't, you get texture thrashing).




> And of course your next keystroke is not zero copy. It needs to be sent to the GPU and the texture needs to be updated.

No textures need to be updated. The texture stays in VRAM across frames; you just need to change which texture the character cell points to. That is zero-copy.

If you were to do this with a software rendered terminal, at minimum the software would tell the windowing system which region of the window changed and then copy that region to VRAM. That's only if the windowing system supports region updates; if not, you'd need to copy the entire window to VRAM each frame. Much slower than just twiddling a pointer to a texture.
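
Roughly, the per-keystroke work looks like this (a minimal sketch of the idea, not anyone's actual code; the struct and field names are made up, and the atlas upload itself is elided):

    // The glyph atlas and the cell grid live on the GPU; a keystroke only rewrites
    // a few integers in the cell buffer, never any pixel data.
    #include <cstdint>
    #include <vector>

    struct Cell {
        uint32_t glyph_index;  // which rectangle of the glyph atlas to sample
        uint32_t fg_color;     // packed RGBA foreground
        uint32_t bg_color;     // packed RGBA background
    };

    struct TerminalGrid {
        int cols, rows;
        std::vector<Cell> cells;  // mirrored to a small GPU instance/storage buffer

        // Called on a keystroke: no texture upload, just a few bytes change.
        void put_char(int col, int row, uint32_t glyph, uint32_t fg, uint32_t bg) {
            cells[row * cols + col] = {glyph, fg, bg};
        }
    };

    // Each frame the renderer draws cols*rows quads and the shader looks up the
    // atlas rectangle for glyph_index. The atlas texture itself is only written
    // when a glyph is rasterized for the first time (e.g. a new font or size).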


> but there's no way a 60GB+ game today would ever fit on, say, a DVD, at the same level of fidelity while maintaining the same player experience.

I mean, you could make a modern open-world game where all the textures are procedurally generated from set formulae, à la https://en.wikipedia.org/wiki/.kkrieger .

It might even load quickly, if the texture generation algorithms were themselves parallelized as compute shaders.

But it'd be a 100% different approach to an art pipeline, one where you can't just throw industry artists/designers at the problem of "adding content" to your world, but instead have to essentially hire mathematicians (sort of like Pixar does when they're building something new). Costly!
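
To give a rough idea of what "textures from set formulae" means, here is a toy sketch (nothing to do with .kkrieger's actual generators; every name and constant is invented): a texture that exists only as code until it is evaluated at load time.

    #include <cstdint>
    #include <cmath>
    #include <vector>

    // Cheap hash-based value noise; the constants are arbitrary.
    static float noise2(int x, int y) {
        uint32_t h = static_cast<uint32_t>(x) * 374761393u
                   + static_cast<uint32_t>(y) * 668265263u;
        h = (h ^ (h >> 13)) * 1274126177u;
        return static_cast<float>(h & 0xFFFFu) / 65535.0f;
    }

    // A "rusty metal" texture built from a few octaves of noise plus a stripe
    // pattern - only this function ships on disk, not megabytes of pixels.
    std::vector<uint8_t> make_rust_texture(int size) {
        std::vector<uint8_t> rgba(size * size * 4);
        for (int y = 0; y < size; ++y) {
            for (int x = 0; x < size; ++x) {
                float n = 0.5f * noise2(x / 8, y / 8)
                        + 0.3f * noise2(x / 2, y / 2)
                        + 0.2f * noise2(x, y);
                float stripe = 0.5f + 0.5f * std::sin(x * 0.05f);
                float v = 0.6f * n + 0.4f * stripe;
                uint8_t* p = &rgba[(y * size + x) * 4];
                p[0] = static_cast<uint8_t>(40 + 180 * v);  // R
                p[1] = static_cast<uint8_t>(30 + 90 * v);   // G
                p[2] = static_cast<uint8_t>(20 + 60 * v);   // B
                p[3] = 255;                                  // A
            }
        }
        return rgba;
    }

Each such generator is effectively a tiny program per material, which is exactly where the "hire mathematicians" cost comes from.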


> A game without textures would be hardly playable at all.

It'd be fine. No textures doesn't mean everything is gray.


> Using a texture atlas was a great idea, and we didn't know about it until you told us.

One is forced to wonder how, in an entire team working on a terminal emulator, no one thought to use a texture atlas. I mean, that's basically how the actual physical terminals the software is emulating worked in the first place. Then, when told about the obvious implementation, the reaction was not to slap one's forehead and say "Of course, it was so obvious!", but instead to tell a game engine developer with decades of experience that they didn't understand how rendering works.

Mind boggling.


> Pray-tell, what magic compression technique would you use to do better AND still support 4K textures?

How about downloading textures and movies on demand in the background when you enter new areas? It's not like all that stuff is required for the game to start.


> High-res textures are a different thing, since they actually have to be painted.

Ah, I want to clarify again—I was imagining the developers already had higher quality original textures, which they had downscaled for release. The textures in Assassin's Creed II, for instance, have the look of images that were saved down to a lower resolution from larger originals. But I could be wrong, or even if I'm not, it might be less common nowadays.

As you say, the goal is to include things that only require computational resources at runtime (even an order of magnitude more).


> if games shipped without textures and shaders

Back in the day we used mods and texture packs to remove (hardware-)expensive textures from the game so we'd get more FPS. I'm not sure your argument applies to all games. Many games have a competitive scene that usually doesn't give a shit about visuals and would trade most visual features for more frames per second.


> I'm acutely aware of how much does or does not go into textures. Modern shaders can account for as much as half of rendering time, with tracing of rays accounting for the other half. This is the entire shader, not just textures and is an extreme example.

At least at the VFX level (Pixar's slightly different, as they use a lot of procedural textures), texture I/O time can be a significant fraction of render time.

> This is not true at all. Render farm nodes are typically built with memory to CPU core ratios that match as the main priority.

I don't know what you mean by this (I assume that memory scales with cores?), but most high-end render farms have extremely expensive, fast I/O caches very close to the render nodes (usually Avere solutions), mainly just for the textures.

The raw source textures are normally on the order of hundreds of gigabytes and thus have to be out-of-core. Pulling them off disk, uncompressing them and filtering them (even tiled and pre-mipped) is extremely expensive.

> This is also not true. In 1995 an Onyx with a maximum of 32 _Sockets_ had a maximum of 2GB of memory. The bandwidth to PCIe 3.0 16x is about 16GB/s and plenty of cards already have 16GB of memory. The textures would also stay in memory for multiple frames, since most textures are not animated.

This is true. One of the reasons why GPU renderers still aren't being used at high-level VFX in general is precisely because of both memory limits (once you go out-of-core on a GPU, you might as well have stayed on the CPU) and due to PCI transfer costs of getting the stuff onto the GPU.

On top of that, almost all final rendering is still done on a per-frame basis, so for each frame you start the renderer, give it the source scene/geo, and it then loads the textures again and again for each different frame - precisely why fast texture caches are needed.


> But you try to lock() on a backgroundworker (the sane thing to do)

That's just a leaky abstraction. Updating an image from a background thread is a sane thing to do from a general-purpose programmer's POV. To understand why it's not such a good idea, and why it's not supported, you need to know what happens under the hood - specifically, how 3D GPU hardware works and executes these commands.

> No solution exists.

From a graphics programmer's POV, the sane solution is to only call the GPU from a single thread. In D3D11 there are things which can be called from background threads. It's possible to create a new texture on the background thread (uploading the new data to VRAM), pass the texture to the GUI thread, then on the GUI thread either copy between textures (very fast) or replace the texture, destroying the old one (even faster). Unfortunately, doing so is slower in many practical use cases, like updating that bitmap at 60Hz: creating new resources is relatively expensive, more so than updating a preexisting one.
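
Roughly this pattern, sketched against the D3D11 API (error handling and synchronization are hand-waved away, and the helper names are mine, not anything standard):

    #include <d3d11.h>
    #include <wrl/client.h>
    using Microsoft::WRL::ComPtr;

    // Worker thread: ID3D11Device methods are free-threaded, so it's legal to
    // create a brand-new texture here, already filled with the new pixels.
    ComPtr<ID3D11Texture2D> CreateFilledTexture(ID3D11Device* device, const void* pixels,
                                                UINT width, UINT height, UINT strideBytes) {
        D3D11_TEXTURE2D_DESC desc = {};
        desc.Width = width;
        desc.Height = height;
        desc.MipLevels = 1;
        desc.ArraySize = 1;
        desc.Format = DXGI_FORMAT_B8G8R8A8_UNORM;
        desc.SampleDesc.Count = 1;
        desc.Usage = D3D11_USAGE_IMMUTABLE;           // data is uploaded at creation time
        desc.BindFlags = D3D11_BIND_SHADER_RESOURCE;

        D3D11_SUBRESOURCE_DATA init = {};
        init.pSysMem = pixels;
        init.SysMemPitch = strideBytes;

        ComPtr<ID3D11Texture2D> tex;
        device->CreateTexture2D(&desc, &init, &tex);  // OK off the GUI thread
        return tex;
    }

    // GUI thread: either copy into the (default-usage) texture the renderer is
    // already using - a GPU-side copy, very fast...
    void AdoptByCopy(ID3D11DeviceContext* immediateContext,
                     ID3D11Texture2D* liveTexture, ID3D11Texture2D* freshTexture) {
        immediateContext->CopyResource(liveTexture, freshTexture);
    }
    // ...or just swap the ComPtr so the old texture gets released ("even faster"),
    // at the price of recreating the shader resource view that points at it.

The catch mentioned above is exactly the CreateTexture2D call: for a 60Hz stream of updates, allocating a fresh resource every frame costs more than updating a texture that already exists.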


> One of the big problems at the moment is that photorealistic texturing basically requires finding and mapping a real life actor, which is all sorts of expensive.

As a hobby game dev myself, I find this to not be a problem at all. I think human skin is largely a solved problem. It mostly boils down to creating a proper material/shader; the material will have many layers of textures. These days, with the proper tools, this is easy, and generally more of an artistic affair than a technical one.

In fact, I will claim that texturing for games, in general, is largely a solved problem.

Look into both the Substance suite and Quixel's offerings, and watch/read tutorials. Both Allegorithmic and Quixel revolutionized texturing and made AAA-quality texturing possible for indies, just like Unreal Engine made AAA-quality engines available to them.


> Be generous; you even said in your previous sentence you don't know if it's common or not, so how do you know developers refuse to implement basic compression? If it's that easy, I'm sure there are plenty of AAA game studios (including mine) that will hire you to just implement basic compression.

The thing I said I didn't know was common was keeping assets in compressed form in memory. I admit I don't know much about the specifics of how game rendering works. What I do know something about is the extremely poor compression applied to many video games in their on-disk form. I'm willing to grant that your studio may indeed be an exception, but the general principle is hard to deny. Just by enabling lossless Windows file compression on a game folder, you can frequently see the size of a game on disk drop by 50% or more, as I discuss in my comment here: https://news.ycombinator.com/item?id=34042164

Surely a lossless format designed specifically for encoding assets would be even more effective, and faster, than generic Windows file compression!


> but go pick up any modern AAA video game and you can tell there is also so much more depth than we used to have. No?

No. There is a lot more texture memory and some good shader programming, but we knew how to render graphics as good as or better than this 25 years ago - we just didn't know how to do it anywhere near fast enough. For your example, most of that gain is in hardware and art budgets.


> The one place where today's GPUs aren't as good as Toy Story is filtering & anti-aliasing

This only makes sense if you are locked into some texture filtering algorithm already, which isn't true. CPU renderers aren't doing anything with their texture filtering that can't be replicated on GPUs. Where the line should be drawn between using the GPU's native texture filtering and doing more thorough software filtering would be something to explore, but there is no reason why a single texture sample, in software-renderer terms, has to map to a single texture sample on the GPU.
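
As a concrete example of "one logical sample, several fetches": a Catmull-Rom bicubic filter written as plain arithmetic over 16 point lookups. This is just a sketch (the tiny stand-in texture and the lookup helper are mine), and the same arithmetic could equally be expressed in a pixel or compute shader:

    #include <cmath>

    // Tiny stand-in texture so the sketch is self-contained.
    static const int W = 4, H = 4;
    static const float texels[H][W] = {
        {0, 1, 2, 3}, {1, 2, 3, 4}, {2, 3, 4, 5}, {3, 4, 5, 6}};

    static float fetch_texel(int x, int y) {   // clamped nearest-neighbour lookup
        x = x < 0 ? 0 : (x >= W ? W - 1 : x);
        y = y < 0 ? 0 : (y >= H ? H - 1 : y);
        return texels[y][x];
    }

    // Catmull-Rom weights for the 4 taps around fractional position t in [0,1).
    static void catmull_rom_weights(float t, float w[4]) {
        float t2 = t * t, t3 = t2 * t;
        w[0] = 0.5f * (-t3 + 2.0f * t2 - t);
        w[1] = 0.5f * (3.0f * t3 - 5.0f * t2 + 2.0f);
        w[2] = 0.5f * (-3.0f * t3 + 4.0f * t2 + t);
        w[3] = 0.5f * (t3 - t2);
    }

    // One "software" sample = 16 point fetches, weighted in two dimensions.
    float sample_bicubic(float u, float v) {   // u, v in texel coordinates
        int ix = static_cast<int>(std::floor(u));
        int iy = static_cast<int>(std::floor(v));
        float wx[4], wy[4];
        catmull_rom_weights(u - ix, wx);
        catmull_rom_weights(v - iy, wy);

        float result = 0.0f;
        for (int j = 0; j < 4; ++j)
            for (int i = 0; i < 4; ++i)
                result += wx[i] * wy[j] * fetch_texel(ix - 1 + i, iy - 1 + j);
        return result;
    }

(With a bit more cleverness the 16 fetches collapse into 4 hardware bilinear fetches, which is the usual GPU trick, but the point stands either way.)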

> BTW, textures & texture sampling are a huge portion of the cost of the runtime on render farms. They comprise the majority of the data needed to render a frame.

I'm acutely aware of how much does or does not go into textures. Modern shaders can account for as much as half of rendering time, with tracing of rays accounting for the other half. This is the entire shader, not just textures and is an extreme example.

> The entire architecture of a render farm is built around texture caching.

This is not true at all. Render farm nodes are typically built with memory to CPU core ratios that match as the main priority.

> Just getting textures into a GPU would also pose a significant speed problem.

This is also not true. In 1995 an Onyx with a maximum of 32 _Sockets_ had a maximum of 2GB of memory. The bandwidth to PCIe 3.0 16x is about 16GB/s and plenty of cards already have 16GB of memory. The textures would also stay in memory for multiple frames, since most textures are not animated.


> Man, I don't know why you're in hyperbolic attack mode,

There isn't anything like that in my posts, just corrections along with pointing out irrelevancies; no need to be defensive.

> It is completely relevant, if you can't fit the textures into memory in the first place, which is precisely what was happening in my studio around the same time Toy Story was produced, and what I would speculate was also happening during production of Toy Story.

Yes, PRMan has always had great texture caching, and like I mentioned earlier, a 32-socket SGI Onyx would max out at 2GB of memory. I think a fraction of that was much more common.

> Still, this would mean that a good chunk of the software, the antialiased frame buffer, all animation and geometry data, all texture data -- all assets for the film -- would have to fit on the GPU in an accessible (potentially uncompressed) state.

I think you mean all assets for a shot, not the whole film.


> and shaders are involved, doesn't everyone use texture compression formats like ETC1 and PVRTC?

Yes, when they can pay the computational price for compression beforehand. It's not going to work if you have to compress in real time.


> So the Saturn emulated 3D by transforming squares into triangles. Imagine trying to do that in assembly. It’s definitely beyond my capabilities that’s for sure.

All my early 3D engines were in software (x86 assembler); this was around 1990, before any 3D hardware came along. Doing the 3D texture mapping is actually really simple, even in assembler, once you know how. Figuring out the algorithm was really hard, because in those days it was so hard to find any information about this stuff.

What I couldn't figure out until several years later was how to fix the issue of perspective warping on the textures. Again, the solution is so simple once you know it, but as a kid it was beyond me, until the Internet came along and opened up a world of coding I'd never seen before.
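
For anyone wondering, the fix is roughly this (a sketch of one scanline, not anything from my old engines; the pixel and texture helpers are placeholders): instead of interpolating u and v directly across the span, you interpolate u/z, v/z and 1/z, and do one divide per pixel.

    struct Endpoint { float x, z, u, v; };  // a scanline endpoint after projection

    void draw_span(const Endpoint& a, const Endpoint& b, int y) {
        int x0 = static_cast<int>(a.x), x1 = static_cast<int>(b.x);
        if (x1 <= x0) return;
        float inv = 1.0f / static_cast<float>(x1 - x0);

        // These three ARE linear in screen space, unlike u and v themselves.
        float uoz = a.u / a.z, voz = a.v / a.z, ooz = 1.0f / a.z;
        float duoz = (b.u / b.z - uoz) * inv;
        float dvoz = (b.v / b.z - voz) * inv;
        float dooz = (1.0f / b.z - ooz) * inv;

        for (int x = x0; x < x1; ++x) {
            float z = 1.0f / ooz;        // the per-pixel divide (the "expensive" part)
            float u = uoz * z;           // perspective-correct texture coordinates
            float v = voz * z;
            // put_pixel(x, y, sample_texture(u, v));   // hypothetical helpers
            uoz += duoz; voz += dvoz; ooz += dooz;
        }
    }

Affine mapping skips the divide and interpolates u and v directly, which is why textures swim when a polygon gets close to the camera; renderers of that era often split the difference by doing the divide only every 8 or 16 pixels and interpolating linearly in between.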


> My main takeaway from Demoscene is not "compression", but the possibility of writing code which performs very well

This 100%. It's not exclusive to demoscene, but demoscene is certainly yet another inspiration to write lean apps. (The performance of Electron is another :) )

I'll never ever be able to work for an AAA game company because I couldn't stomach dumping 100GB of raw texture data onto someone's hard drive.


> Initially I planned for the game to be flat shaded and added textures as an experiment, then realised the performance wasn't as horrible as I expected,

Heh, same!

> so it's mostly just a leftover from that. I did end up writing a scanline renderer for the flat shaded triangles, I think for those the bottleneck was transforming the vertices though.

IME vertex transformation is not expensive at all, compared to actually drawing on screen, at least if there is HW support for multiplication. Optimizations like merging identical block faces next to each other also reduce the total number of vertices, but this makes the texture distortions more apparent.
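
To put rough numbers on that (a back-of-the-envelope sketch, not code from either of our projects; 16.16 fixed point, assuming a hardware multiply):

    #include <cstdint>

    struct Vec3 { int32_t x, y, z; };     // 16.16 fixed point
    struct Mat3 { int32_t m[3][3]; };     // rotation matrix, 16.16 fixed point

    static inline int32_t fxmul(int32_t a, int32_t b) {
        return static_cast<int32_t>((static_cast<int64_t>(a) * b) >> 16);
    }

    // 9 multiplies and 9 adds per vertex, and a block face shares its 4 vertices
    // with its neighbours, so even this cost gets amortized.
    Vec3 transform(const Mat3& r, const Vec3& t, const Vec3& v) {
        return {
            fxmul(r.m[0][0], v.x) + fxmul(r.m[0][1], v.y) + fxmul(r.m[0][2], v.z) + t.x,
            fxmul(r.m[1][0], v.x) + fxmul(r.m[1][1], v.y) + fxmul(r.m[1][2], v.z) + t.y,
            fxmul(r.m[2][0], v.x) + fxmul(r.m[2][1], v.y) + fxmul(r.m[2][2], v.z) + t.z,
        };
    }

A nearby face can easily cover a 50x50 pixel area on screen, i.e. roughly 2500 per-pixel texture fetches and writes against those few dozen multiplies, so the fill side dominates long before the transforms do.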

> Maybe I can deal with z-sorting entities by splitting them on voxel boundaries, and I guess dividing them again at any boundaries within non-full blocks (is this similar to https://en.wikipedia.org/wiki/Binary_space_partitioning?)

Kind of, yeah. With the structure of blocks in a chunk there is basically already a layer of partitioning that can be used, though it might not help sorting that much. Is there enough RAM for a Z buffer? If so, it might just be fast enough...

> For the affine texture splitting, as far as I can tell your game is more focused on creative mode/building whereas I might also want to add a survival mode so for that you're more likely to be up close to the blocks (e.g. in thin underground tunnels) and then the artifacts get much worse so I guess it matters more for me.

I never really bothered with that, also because in tunnels/caves there is a bigger issue: there is no hidden surface elimination, which results in overdraw and, due to the rather low effective fillrate, drops in FPS. I couldn't come up with a good way to implement that properly without hurting performance itself.

> The code is at https://github.com/Heath123/minecraft-casio if you want to look at it (textures are in a branch) but it is quite messy

I had a look, it's really not. The asm is also readable. Is it currently memory or CPU bound?


Rough context: when the GPU (or the CPU, back in the day) has to draw a texture, it has to sample texels, maybe one, maybe four, do some (say bilinear) filtering between them, and then use that result as the output pixel.

Now several problems:

1. If your texture is sampled at roughly one texel per screen pixel, and the texels are read linearly ("horizontally"), then you are good with the cache: for the first pixel the cache may be cold at that address, but it will load the next few texels just in case you need them, and here is the catch - you want to always benefit from that, like always using your coupons, deals, and employer perks. So the CPU/GPU might read 4, 8, 16 - who really knows (but you should) - bytes in advance, or around that texel.

2. But then you turn the texture 90 degrees, and suddenly it's much slower. You draw a pixel from the texture, but the next texel is 256, 512, or more bytes away ("vertically"), and so is the next one; the cache line of 4, 8, 16, 32 or 64 bytes you read contains data you did not use, and by the time you might need it again, the cache has discarded it. Hence the slowness - much, much slower!

4. To fix it, you come up with "swizzling" - e.g. instead of the texture being stored purely scanline by scanline, you split the image into blocks - maybe 8x8 or 32x32 tiles - and make sure those tiles are written linearly one after another. You can go even further, but the idea is that, whatever the angle you read at, you will most likely hit texels from a cache line you've already read. It's not that simple, and my explanation is poor (there's a small sketch of one such layout after this list), but here is someone who can explain it better than me: https://fgiesen.wordpress.com/2011/01/17/texture-tiling-and-...

8. But even with swizzling, tiling, whatever you call that magic that keeps nearby texels together in any direction, it stops working as soon as you have to draw that big texture/image at a much smaller scale.

16. Say a 256x256 texture has to be drawn as 16x16 - and you say "I don't need mipmapping, I don't need smaller versions of my image, I can just use my own image" - well then, no matter how you swizzle/tile, you'll be skipping a lot of texels, hopping from here to there, and losing cache lines.

32. For that reason mipmaps are here to help, but stay with me for one more minute - and see how they also almost fix the shimmering problem of textures with "fences", "grills on a car", a pet's cage, or something like that.

64. And when you hear an artist is ready to put a real Spider-Man logo made out of real polygons on the character, a real grill made out of real polygons in front of the car, and a real, very nice looking barbed-wire fence made out of polygons - then stay away, as those polys would shimmer, no good level of detail can be done for them (it'll pop), and such things are just much more easily done with textures and mipmaps - it solves a lot of problems.

128. Read about impostor textures.
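
Since point 4 promised a sketch, here is the swizzling idea expressed with Morton (Z-order) addressing - only one possible layout, and real GPUs use their own tile formats, but it shows how nearby texels end up at nearby addresses:

    #include <cstdint>

    // Spread the lower 16 bits of v apart so there is a zero bit between each one.
    static uint32_t part1by1(uint32_t v) {
        v &= 0x0000FFFFu;
        v = (v | (v << 8)) & 0x00FF00FFu;
        v = (v | (v << 4)) & 0x0F0F0F0Fu;
        v = (v | (v << 2)) & 0x33333333u;
        v = (v | (v << 1)) & 0x55555555u;
        return v;
    }

    // Linear ("scanline") layout: stepping down one row jumps `width` texels in memory.
    static uint32_t linear_index(uint32_t x, uint32_t y, uint32_t width) {
        return y * width + x;
    }

    // Morton layout: interleave the bits of x and y, so the 2x2, 4x4, 8x8... block
    // around (x, y) shares its high address bits - walking the texture "vertically"
    // still mostly lands on cache lines you've already touched.
    static uint32_t morton_index(uint32_t x, uint32_t y) {
        return part1by1(x) | (part1by1(y) << 1);
    }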
