disable alpha blending for textures with no transparency
This gives a large speed-up in software rendering: in the perf profile, BlitNtoNPixelAlpha accounted for roughly 85% of total application time while I was hammering key presses, and drops to roughly 15% with this change.
Collected with:
    $ perf record --call-graph dwarf -F 99 ./_build/osk-sdl -d a -n a -v
I collected this data with SDL2 software rendering, but there is a similar speed-up with directfb too. It is just much easier to see in the perf report with SDL2 software rendering, because the dfb profile has a lot of noise from dfb internals.
before:
    99.75%     0.00%  osk-sdl  osk-sdl  [.] main
            |
            ---main
               |
               |--86.03%--SDL_RenderPresent_REAL
               |          FlushRenderCommands
               |          SW_RunCommandQueue
               |          |
               |          |--84.49%--SDL_UpperBlit_REAL
               |          |          SDL_SoftBlit
               |          |          BlitNtoNPixelAlpha
               |          |
               |           --1.55%--SDL_FillRects_REAL
               |                     SDL_FillRect4
after:
    99.31%     0.00%  osk-sdl  osk-sdl  [.] main
            |
            ---main
               |
               |--47.90%--SDL_UpdateWindowSurface_REAL
               |          SDL_UpdateWindowTexture
               |          |
               |          |--46.91%--SDL_UpdateTexture_REAL
               |          |          GL_UpdateTexture
               |          |          0xffff893b3567
               |          |          |
               |          |           --46.74%--0xffff893ae997
               |          |                     .....
               |          |
               |           --0.83%--KMSDRM_GLES_SwapWindow
               |                     eglSwapBuffers
               |
               |--47.71%--SDL_RenderPresent_REAL
               |          FlushRenderCommands
               |          |
               |           --47.56%--SW_RunCommandQueue
               |                     |
               |                     |--38.84%--SDL_UpperBlit_REAL
               |                     |          SDL_SoftBlit
               |                     |          |
               |                     |          |--23.86%--Blit_3or4_to_3or4__inversed_rgb
               |                     |          |
               |                     |           --14.97%--BlitNtoNPixelAlpha
               |                     |
               |                      --8.73%--SDL_FillRects_REAL
               |                                SDL_FillRect4
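
For illustration only, here is a minimal sketch of the idea behind this change: decide per texture whether blending is needed, and skip it for fully opaque pixel data. It is not the code from this commit. The names `BlendMode`, `hasTransparency` and `chooseBlendMode` are hypothetical, and the sketch assumes 32-bit pixels with alpha in the top byte (e.g. ARGB8888). It avoids SDL2 headers so that it compiles on its own.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for SDL_BlendMode so the sketch builds without SDL2.
enum class BlendMode { None, Blend };

// Returns true if at least one pixel is not fully opaque.
// Assumption: 32-bit pixels with the alpha channel in the top byte.
bool hasTransparency(const std::vector<std::uint32_t> &pixels)
{
	for (std::uint32_t p : pixels) {
		if ((p >> 24) != 0xFF)
			return true;
	}
	return false;
}

// Only ask for alpha blending when the pixel data actually needs it.
BlendMode chooseBlendMode(const std::vector<std::uint32_t> &pixels)
{
	return hasTransparency(pixels) ? BlendMode::Blend : BlendMode::None;
}
```

With SDL2, the result would map to SDL_BLENDMODE_NONE or SDL_BLENDMODE_BLEND, passed to SDL_SetSurfaceBlendMode() or SDL_SetTextureBlendMode(). If it is already known at creation time which textures are opaque, the per-pixel scan is unnecessary and the blend mode can be set directly.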