SideWays Scroller - 760 Tiles vs 2 Tiles?

Started by
7 comments, last by Nick72c 2 years, 1 month ago

How many background tiles should you use?

Here using SDL2 within C++ I have two versions of a sideways scroller:

Using a Window size of 1800x by 1000y = 1'800'000 pixels

The first renders 20x38 tiles (760 tiles) per frame.

Each tile is 50x50 pixels (2500 pixels)

760 x 2500 = 1'900'000 pixels rendered per frame: the entire Window (1'800'000 pixels) plus an offset of one column of tiles on both the left and right of the Window (20 x 2500 pixels x 2 = 100'000 pixels).

The second renders 2x tiles of full Window Size (1800 x 1000 x 2 pixels = 3'600'000 rendered pixels per frame).

but which is more efficient?

Clearly the 760 tiles per frame is 760 unique (and expensive) SDL_RenderCopy lines per frame, plus navigating a tile map of 20x72 cells to read the new frame layout…. but it only renders 1'900'000 pixels per frame.

The 2 tile version is a lot less code, with only 2x SDL_RenderCopy instructions per frame, but is rendering a much larger 3'600'000 pixels per frame.

The 760 tile version is rendering 100'000 pixels per frame as an offset outside the viewable Window.

Where the 2 tile version is rendering 1'800'000 pixels outside the viewable Window each frame.
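The arithmetic above can be written out as compile-time constants. This is only a sketch of the numbers in the post; the constant names are made up for illustration. Note that these figures count the destination area submitted to the renderer, which is not necessarily the number of pixels the GPU actually shades, since drawing outside the render target is normally clipped.

```cpp
// Pixel budget per frame for the two layouts described in the post.
// All figures come from the post; the names are illustrative only.
constexpr long windowW = 1800, windowH = 1000;
constexpr long windowPixels = windowW * windowH;             // 1'800'000

// Version 1: 20 rows x 38 columns of 50x50 tiles.
constexpr long tileSize = 50, tileRows = 20, tileCols = 38;
constexpr long tiledPixels =
    tileRows * tileCols * tileSize * tileSize;               // 1'900'000
constexpr long tiledOffscreen = tiledPixels - windowPixels;  // 100'000

// Version 2: two panels, each the size of the window.
constexpr long panelPixels = 2 * windowPixels;               // 3'600'000
constexpr long panelOffscreen = panelPixels - windowPixels;  // 1'800'000
```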

I'm therefore assuming that the 760 SDL_RenderCopy instructions are more expensive in terms of time and CPU usage, but the 2 tile version is more expensive in terms of memory resource used.

…so, what size should my background tiles be, as a base for making a 2D scroller?

Nick72c said:
…so, what size should my background tiles be, as a base for making a 2D scroller?

Do what suits your game design. Any 20-year-old computer should be able to draw a simple 2D game at a fluid frame rate.

Nick72c said:
Each tile is 50x50 pixels

Why 50? It can help to use a number with many divisors, like 48, which is a multiple of 24, 16, 12, 8, 6, 4 and 3. That would make pattern designs easier, if you intend to use them.

  • If your game has a tile-based structure (e.g. Super Mario Bros, where distances are measured in blocks that are important objects in their own right) graphics should be tile-based too (which doesn't exclude decorative, non-tiled background layers).
  • Depending on the intended graphical style, either a system of tiles or large panels could be easier to draw and to modify.
    You are still very far from a good-looking game, and you should plan beyond three tile types or 8-color doodles.
  • Regarding performance, small square tiles “scale up” to large levels, while drawing the whole background and relying on clipping doesn't.
    If level size is intended to be limited to a few screens (e.g. Bubble Bobble and Street Fighter II, to represent both graphical approaches) efficiency for large levels (e.g. R-Type) is not important.
  • Basic profiling can show whether the overhead of issuing more calls to blit smaller tiles is significant (very likely, not at all).
    Smaller tiles offer a more important benefit than wasting fewer clipped pixels: they are cheaper to draw than large ones of comparable quality, up to the point where their small resolution makes them too repetitive.
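The point about small tiles scaling up to large levels comes down to only drawing the columns that intersect the window, however wide the map is. A minimal sketch, assuming a horizontal scroller with a non-negative camera position in pixels; `VisibleColumns` and `visibleColumns` are hypothetical names, and the 72-column map width is taken from the first post.

```cpp
// Which map columns need drawing for a given camera position.
struct VisibleColumns {
    int first;    // index of the first map column to draw
    int count;    // number of columns to draw
    int offsetX;  // window x of the first column (zero or negative)
};

// Assumes cameraX >= 0 and cameraX < mapCols * tileSize.
constexpr VisibleColumns visibleColumns(int cameraX, int windowW,
                                        int tileSize, int mapCols) {
    int first = cameraX / tileSize;
    int offsetX = -(cameraX % tileSize);
    // Columns needed to cover the window, starting at offsetX (round up).
    int count = (windowW - offsetX + tileSize - 1) / tileSize;
    if (first + count > mapCols) {
        count = mapCols - first;  // do not read past the end of the map
    }
    return {first, count, offsetX};
}
```

With an 1800 pixel window and 50 pixel tiles this draws 36 or 37 columns per frame, whether the level is 72 columns wide or several thousand.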

Omae Wa Mou Shindeiru

While it is important to match your design, it is also important to match your implementation to the hardware.

Decades ago the hardware was oriented around tiles and sprites. On many systems you would fill up a sprite array, then provide another array of which sprite elements to use to fill the screen. Scrolling was done by a shift value, where you could scroll the entire screen by a fraction of a tile. It was quite efficient, especially on systems that could access memory on their cartridge directly. They could simply point the sprite buffer at a location in their card for rendering, and use a small array to represent the active screen. A small bit-field would flip the sprite horizontally, flip it vertically, or rotate it in 45 degree increments. The implementation wasn't to match the design, it was to match the hardware.

These days hardware is more oriented toward meshes and sprite clouds with support for multi-gigabyte megatextures. The arrays of tiles and sprites are distant memories. As you have realized, a naive implementation trying to mimic hardware from decades past will result in tons of draw calls and quickly bog the system down.

Smarter use of the system resources based on how today's hardware actually works can give amazing performance. Instead of drawing 760 individual tiles every frame, composite all the stationary tiles just once, then draw them one time. Don't just composite a single screen: you can easily make a single, very large composite image on the graphics card, perhaps something that is two or even four screens wide, say 4096x2048. Then you can have a single draw call that handles all of it, and you only need to generate the next version when you begin to approach the boundary of the current image. Making new regions can be done gradually rather than as a massive single-frame task. You might occasionally need to update small subsections (e.g. a Super Mario Bros block gets broken and needs to be removed from the static image and replaced with an active sprite) but that's a very small operation compared to the thousand render calls per frame. Or alternatively, instance them. It is more complex and relies on shaders instead, but then you're drawing each sprite once per screen, exchanging video memory versus data arrays.
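One way to decide when the large composite needs regenerating is sketched below. It only covers the bookkeeping; the function names, the margin parameter and the recentering policy are assumptions for illustration, not something prescribed here. In SDL2 the compositing itself would typically be done by drawing the tiles into a texture created with SDL_TEXTUREACCESS_TARGET after selecting it with SDL_SetRenderTarget.

```cpp
// The composite covers world x in [originX, originX + compositeW).
// Rebuild when the visible window gets within `margin` pixels of an edge,
// unless the composite already sits at that end of the level.
constexpr bool needsRebuild(int cameraX, int windowW, int originX,
                            int compositeW, int margin, int levelW) {
    bool nearLeft = originX > 0 && (cameraX - originX) < margin;
    bool nearRight = (originX + compositeW) < levelW &&
                     (originX + compositeW) - (cameraX + windowW) < margin;
    return nearLeft || nearRight;
}

// New origin that centers the composite on the window, clamped to the level.
constexpr int recenteredOrigin(int cameraX, int windowW,
                               int compositeW, int levelW) {
    int origin = cameraX + windowW / 2 - compositeW / 2;
    if (origin > levelW - compositeW) origin = levelW - compositeW;
    if (origin < 0) origin = 0;
    return origin;
}
```

A larger margin gives more frames in which to build the next composite gradually, at the cost of rebuilding more often.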

Whatever direction you go, have your implementation match what today's hardware is actually doing. Just like how decades ago, implementations went with sprites and tiles to match what their hardware was actually doing.

Thank you for the feedback @lorenzogatti

I knocked up those tiles / panels in 2 minutes just to see how they implement, I wasn't intending them to be the final product.

Your point regarding fewer clipped pixels with smaller tiles is at the heart of my question. With the 50x50 pixel tiles (again just an arbitrary tile size to see how they implement) I'm only drawing 100 pixels wide off screen per render, but with the two tile approach I'm drawing 1800 pixels wide off screen per render.

I take it from your response that you believe this larger render cost is more significant/expensive than the number of overall render calls.

How would I check with "Basic Profiling"? I've checked memory usage with Windows Task Manager and it's 53.8MB with the tile array, and 60.9MB with the two panel approach, which doesn't seem particularly significant. Is there a way to get better profiling metrics?
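Task Manager reports memory, not time per frame. A basic way to compare the two versions is to time the render section of the main loop and average over many frames. The sketch below uses std::chrono from the standard library; `FrameTimer` and `averageMs` are hypothetical names. It assumes begin() is called before the first SDL_RenderCopy of a frame and end() after SDL_RenderPresent, because the render calls may only queue work, and that vsync is off so the frame time is not simply locked to the display refresh rate.

```cpp
#include <chrono>

// Average milliseconds per frame from a total in nanoseconds.
constexpr double averageMs(long long totalNs, long long frames) {
    return frames == 0 ? 0.0 : static_cast<double>(totalNs) / frames / 1.0e6;
}

// Accumulates wall-clock time spent between begin() and end() across frames.
class FrameTimer {
    using Clock = std::chrono::steady_clock;

public:
    void begin() { start_ = Clock::now(); }

    void end() {
        totalNs_ += std::chrono::duration_cast<std::chrono::nanoseconds>(
                        Clock::now() - start_).count();
        ++frames_;
    }

    double averageMs() const { return ::averageMs(totalNs_, frames_); }
    long long frames() const { return frames_; }

private:
    Clock::time_point start_{};
    long long totalNs_ = 0;
    long long frames_ = 0;
};
```

Run each version for a few thousand frames over the same scroll path and compare the two averages; a dedicated profiler can then break the time down further if the difference turns out to matter.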

Thank you @frob

This is really interesting.

I could draw the entire background, let's say 2 screens high and 4 screens wide, with an art package, save it as a PNG file, and load it once before the first render.

I'm currently using this line to render my graphics to the screen:

SDL_RenderCopy(Window::renderer, _image_texture, nullptr, &asset);

The format of SDL_RenderCopy is:

int SDL_RenderCopy(SDL_Renderer * renderer,
                   SDL_Texture * texture,
                   const SDL_Rect * srcrect,
                   const SDL_Rect * dstrect);

In the two panel version I'm calling this twice, once for each panel (where each panel is a separate image texture), with a region 1800 pixels wide being rendered outside the viewing area (Window::renderer). I'm not sure if SDL_RenderCopy is clipping/truncating all pixels outside the viewable area, but it's not causing any issues that I can see - it just feels expensive.

Perhaps I can load the entire 2x screen by 4x screen size background into a single image texture, but instead of passing a nullptr argument into the const SDL_Rect* srcrect parameter, use this to point to a one-screen-size rectangle of the image texture to be rendered.

This would be extremely efficient, as it would only use one SDL_RenderCopy call per frame, and only render within the viewable window.

I'm only now seeing that SDL_Rect is a structure with predefined x and y that I can manipulate to point to the portion of the image texture I want rendered to screen.
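A rough sketch of that idea: compute a window-sized source rectangle from the camera position and clamp it so it never leaves the texture. To keep the sketch compilable without SDL, `Rect` is a stand-in with the same four int fields as SDL_Rect (x, y, w, h), and `sourceRect` is a hypothetical helper. The 7200x2000 size in the comment assumes a background 4 screens wide and 2 screens high.

```cpp
// Stand-in for SDL_Rect (same fields), so this compiles without SDL.
struct Rect {
    int x, y, w, h;
};

// Window-sized region of the background texture to show for this camera
// position, clamped so it never extends past the texture edges.
constexpr Rect sourceRect(int cameraX, int cameraY, int windowW, int windowH,
                          int textureW, int textureH) {
    int maxX = textureW - windowW;
    int maxY = textureH - windowH;
    int x = cameraX < 0 ? 0 : (cameraX > maxX ? maxX : cameraX);
    int y = cameraY < 0 ? 0 : (cameraY > maxY ? maxY : cameraY);
    return {x, y, windowW, windowH};
}

// With SDL2 the same values would go into an SDL_Rect, for example:
//   SDL_Rect src{ r.x, r.y, r.w, r.h };   // r = sourceRect(camX, camY,
//                                         //     1800, 1000, 7200, 2000)
//   SDL_RenderCopy(Window::renderer, _image_texture, &src, nullptr);
// Passing nullptr as dstrect stretches src over the whole render target.
```

One thing to check before relying on a very large texture is the renderer's maximum texture size, which SDL_GetRendererInfo reports in max_texture_width and max_texture_height.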

I'll go and try this.

Thank you.

Right. And as mentioned, the lowest cost implementation that comes closest to the bunch of sprites is instanced rendering, which is something you should investigate if you're looking at bigger gains. That leverages something the hardware is really good at, while also closely mirroring what your design is using.

There are lots of other options, too. Large blocks for backgrounds can work well. If you want to keep with a bunch of little pieces you can compose the static scene into smaller blocks than the giant one I mentioned above, so you may have 4 or 8 or 16 in the screen rather than 760. Or you might try many other options instead.

The key takeaway is to recognize that the pattern of using sprites was driven by the old hardware. What you see as “old school” is what they saw as “adapting to recent hardware optimizations”. Those were cutting edge designs, a break from the earlier patterns requiring huge innovation. It's just that 30 years have passed, and people are still focusing on what worked well all those decades ago. Hardware has moved on, so it's best to make designs that target today's hardware.

Thanks again for the tips @frob

I got it going with a single texture of 4x screens wide, then added some layers for a bit of parallax scrolling.

Again - please ignore the ‘artwork’ which is just a placeholder.
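For anyone curious about the parallax part, the usual idea is that each layer scrolls at a fraction of the camera speed, with distant layers using smaller fractions. The sketch below is a generic illustration of that idea, not the exact code used here; the function names and the scroll factors are assumptions.

```cpp
// Horizontal scroll position of a layer, given the camera position and the
// layer's scroll factor (1.0 = moves with the foreground, 0.0 = fixed sky).
constexpr int layerScrollX(int cameraX, double factor) {
    return static_cast<int>(cameraX * factor);
}

// Same, but wrapped for a layer image that repeats every layerW pixels.
// Assumes cameraX >= 0 and layerW > 0.
constexpr int wrappedScrollX(int cameraX, double factor, int layerW) {
    return layerScrollX(cameraX, factor) % layerW;
}
```

The result can be used as the x of the source rectangle for that layer's texture, drawing the layers from the farthest to the nearest.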
