Draft: RFT GTK4.0 ngl renderer
Some time ago we disabled the new ngl renderer on a number of devices because it performed poorly or surfaced GPU driver bugs that did not happen with the "good old" gl renderer. I was approached by a GTK dev, who told me they have put great effort into making the new renderer behave more like the old one. As a consequence, they expect that our devices will no longer see a big difference, and that we can revert our changes. To test that hypothesis, we need people with those devices to actually test it!
Relates https://gitlab.com/postmarketOS/pmaports/-/issues/1069 https://gitlab.com/postmarketOS/pmaports/-/issues/2681 https://gitlab.com/postmarketOS/pmaports/-/merge_requests/4961 https://gitlab.com/postmarketOS/pmaports/-/merge_requests/4958 https://gitlab.com/postmarketOS/pmaports/-/merge_requests/5059
Instructions for testing
We have received enough test results already and are now waiting on a decision. So if you came here because of the ping, thanks a lot, but you can save yourself the time.
- Wait for pipelines to succeed
- Run `mrtest add 5559`, and upgrade every package you have installed. This will actually downgrade gtk4.0. This is expected
- Run a couple of GTK4 apps. See if you notice remarkable changes in performance, if you see some artifacts, or if things stay more or less how they are
Testing team
- Qualcomm Snapdragon 410/412 (MSM8916). Reports of crashing harder than with 4.14.4 and gl renderer
  - alcatel-idol347 (@mtekman)
  - bq-paella (@panzersajt)
  - samsung-a3 (@colorant31)
  - samsung-gt58 (@Breakfastisready)
  - @TravMurav as the author of the original MR, and maintainer of chipset
- Qualcomm Snapdragon 450/625/626/632 (MSM8953):
  - samsung-a6plte (@colorant31)
  - xiaomi-mido (@Serg12344)
  - @abologna as the author of this SoC MR
  - @barni2000 as maintainer of SoC, if you want to provide any feedback
- lg-hammerhead:
  - @adamthiede as author of the MR
- pine64-pinetab:
  - @dylanvanassche as maintainer
Activity
added request-for-test technical debt type::fix labels
added 1 commit
- cdfc8d0a - temp/gtk4.0: upgrade to 4.15.6
By Pablo Correa Gomez on 2024-09-05T09:34:08
added 1 commit
- 4380e112 - temp/gtk4.0: upgrade to 4.15.6
By Pablo Correa Gomez on 2024-09-05T09:38:18
I'm afraid I have to NACK this for msm8916: GTK4 apps still cause a GPU crash. Moreover, whatever we have packaged in pmaports causes only a semi-recoverable GPU lockup, but installing this MR causes a full SoC crash when GTK4 apps try to render (which is known to happen with some apps, but now GTK4 tickles that spot too).
We will likely have to keep the old renderer until it's removed from GTK, I'm afraid.
By Nikita Travkin on 2024-09-05T13:34:16
Yeah, GNOME is glitchy on msm8916, by the way (full of graphical glitches). I'm not sure whether it's caused by the way I installed it, but it doesn't seem so.
By ΞЖKƆ/QVH on 2024-09-05T20:12:47
Edited by Administrator
Still quite interested in people testing the other SoCs!
By Pablo Correa Gomez on 2024-09-05T13:46:58
Edited by Administrator
Hi!
I tested it on bq-paella:
I installed it on top of a fresh `20240904-0119-postmarketOS-v24.06-phosh-22.3-bq-paella.img` using `mrtest add 5559`, then upgrade, then confirm.
Then rebooted just to be sure.
I used `flatpak install flathub org.gnome.Solanum` to test a GTK4 app as described in one of the referenced MRs.
I could not see any major change in performance, but the preinstalled apps like `Console`, `Calculator`, `Maps`, etc. have kept crashing Phosh whenever I try to open them, ever since installing the MR!
By Panzer Sajt on 2024-09-06T06:52:08
Tested on `xiaomi-tissot` for MSM8953.
It definitely runs better than it did before. The performance is no longer unusable, and from a quick test I haven't been able to hit any of the weird glitches such as flashing text in the terminal emulator.
However, it's still not as good as the old `gl` renderer. With that one, transitions between `gnome-control-center` pages are buttery smooth, while they're fast but still somewhat choppy with the `ngl` renderer.
In the "scrolling" demo from `gtk4-demo`, the "scrolling icons" only reaches ~20fps instead of ~30fps; the "scrolling text with emoji" still displays the worst performance regression, dropping from ~30fps to ~10fps with the new renderer.
tl;dr excellent progress overall, but still a significant downgrade from the old renderer I'm afraid :(
By Andrea Bolognani on 2024-09-08T18:16:21
Edited by Ghost User
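To put the frame rates reported for `xiaomi-tissot` in relative terms, here is a small Python sketch of the arithmetic. The function name `relative_drop` is made up for illustration, and the inputs are the approximate ("~") figures from the comment above, so the results are rough estimates only.

```python
def relative_drop(old_fps: float, new_fps: float) -> float:
    """Return the frame-rate loss as a percentage of the old value."""
    if old_fps <= 0:
        raise ValueError("old_fps must be positive")
    return (old_fps - new_fps) / old_fps * 100.0


# Approximate figures reported for xiaomi-tissot: (gl renderer, ngl renderer).
reported = {
    "scrolling icons": (30, 20),
    "scrolling text with emoji": (30, 10),
}

for demo, (gl_fps, ngl_fps) in reported.items():
    loss = relative_drop(gl_fps, ngl_fps)
    print(f"{demo}: about {loss:.0f}% fewer frames per second")
```

With these inputs the loss is roughly a third for the icons demo and roughly two thirds for the emoji text demo.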
Tested on `alcatel-idol347` following a fresh install.
At this point, the Calculator and Console apps are working fine.
> using `mrtest add 5559`, then upgrade, then confirm.
> Then rebooted just to be sure.

At this point, using the Calculator app causes the phone to reboot.
> I used `flatpak install flathub org.gnome.Solanum` to test GTK4 app as described in one of the referenced MRs.

Running this after the mrtest brought up the app fine, but Calculator and Console would still trigger a reboot.
I tried opening the `gtk4.0-demo` but that also resulted in a reboot.
By mtekman on 2024-09-07T16:20:36
@pabloyoyoista mind updating this to 4.16.0, just to have some more crash fixes for more tests?
FWIW, in the cases where GTK causes GPU crashes, by definition that means there are kernel bugs, and maybe also Mesa ones. If we could get any kind of logs for these cases, that would be great!
By Robert Mader on 2024-09-07T17:40:38
> FWIW, in the cases where GTK causes GPU crashes, by definition that means there are kernel bugs - and maybe also Mesa ones. If we could get any kind of logs for these cases, that would be great!

`dmesg -w` does not show anything during the error, and my phone does not appear to have `ram_console` built in to persist the last dmesg over a reboot.
Doing `logread -f` during the error also yields nothing, and doing `logread -n 1000` on the next boot doesn't seem to show anything GPU specific.
Here are some lines grepped for "error":
[Sep 07 19:39:44] daemon NetworkManager[1883]: <warn> [1725734384.3635] modem-manager: error poking ModemManager: GDBus.Error:org.freedesktop.DBus.Error.ServiceUnknown: The name org.freedesktop.ModemManager1 was not provided by any .service files
<-- There were messages at 20:08:13 that did not seem to persist across reboot -->
[Sep 07 20:08:45] kern kernel: spmi-temp-alarm 200f000.spmi:pmic@0:temp-alarm@2400: error -ENODEV: failed to register sensor
[Sep 07 20:08:45] daemon kernel: udevd[123]: ctx=0xffff8c69cd80 path=/lib/modules/6.6.0-msm8916/kernel/drivers/remoteproc/qcom_pil_info.ko error=No such file or directory
[Sep 07 20:08:45] daemon kernel: udevd[116]: ctx=0xffff8c69cd80 path=/lib/modules/6.6.0-msm8916/kernel/sound/soc/qcom/snd-soc-qcom-common.ko error=No such file or directory
[Sep 07 20:08:45] daemon kernel: udevd[108]: ctx=0xffff8c69cd80 path=/lib/modules/6.6.0-msm8916/kernel/drivers/media/v4l2-core/v4l2-mem2mem.ko error=No such file or directory
[Sep 07 20:08:45] daemon kernel: udevd[120]: ctx=0xffff8c69cd80 path=/lib/modules/6.6.0-msm8916/kernel/drivers/remoteproc/qcom_pil_info.ko error=No such file or directory
[Sep 07 20:08:45] daemon kernel: udevd[125]: ctx=0xffff8c69cd80 path=/lib/modules/6.6.0-msm8916/kernel/drivers/iio/gyro/bmg160_core.ko error=No such file or directory
[Sep 07 20:08:45] kern kernel: msm_mdp 1a01000.display-controller: Direct firmware load for qcom/a300_pm4.fw failed with error -2
[Sep 07 20:08:45] kern kernel: [drm:mdp5_irq_error_handler [msm]] *ERROR* errors: 04000000
[Sep 07 20:08:48] daemon NetworkManager[2108]: <warn> [1725736128.6556] modem-manager: error poking ModemManager: GDBus.Error:org.freedesktop.DBus.Error.ServiceUnknown: The name org.freedesktop.ModemManager1 was not provided by any .service files
[Sep 07 20:08:48] user nm-dns-filter[2294]: error: unable to determine IP config for primary connection!
[Sep 07 20:08:48] user nm-dns-filter[2323]: error: unable to determine IP config for primary connection!
[Sep 07 20:09:10] daemon gnome-session-binary[3025]: WARNING: Falling back to non-systemd startup procedure due to error: GDBus.Error:org.freedesktop.DBus.Error.NameHasNoOwner: Name "org.freedesktop.systemd1" does not exist
[Sep 07 20:09:22] daemon [2624]: <wrn> [modem0] error initializing: Modem in failed state: sim-missing
I don't think these are GPU related though.
By mtekman on 2024-09-07T19:18:19
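When a log is grepped only for "error", display and GPU messages get buried among unrelated ones. The following Python sketch narrows a list of log lines down to those that look display/GPU related. The patterns in `GPU_PATTERNS` are illustrative guesses based on the lines quoted in this thread, not an official or complete list, and the function name `gpu_related` is hypothetical.

```python
import re

# Assumed patterns for display/GPU-related kernel messages on msm devices.
# These are illustrative guesses, not an exhaustive or official list.
GPU_PATTERNS = [
    re.compile(r"\[drm[:\]]"),            # DRM subsystem messages, e.g. "[drm:..."
    re.compile(r"\bmsm_mdp\b"),           # MDP display controller driver
    re.compile(r"\ba\d{3}_\w+\.fw\b"),    # firmware files named like a300_pm4.fw
    re.compile(r"adreno", re.IGNORECASE),
]


def gpu_related(lines):
    """Return only the log lines that match one of GPU_PATTERNS."""
    return [line for line in lines if any(p.search(line) for p in GPU_PATTERNS)]


# Sample lines copied from the log excerpt in this thread.
sample = [
    "[Sep 07 20:08:45] daemon kernel: udevd[123]: ctx=0xffff8c69cd80 "
    "path=/lib/modules/6.6.0-msm8916/kernel/drivers/remoteproc/qcom_pil_info.ko "
    "error=No such file or directory",
    "[Sep 07 20:08:45] kern kernel: msm_mdp 1a01000.display-controller: "
    "Direct firmware load for qcom/a300_pm4.fw failed with error -2",
    "[Sep 07 20:08:45] kern kernel: [drm:mdp5_irq_error_handler [msm]] "
    "*ERROR* errors: 04000000",
    "[Sep 07 20:08:48] user nm-dns-filter[2294]: error: unable to determine "
    "IP config for primary connection!",
]

for line in gpu_related(sample):
    print(line)
```

A match only means the line comes from the display/GPU stack. It does not show that the line explains the crash, and both matching lines here were logged during boot rather than at the moment of the failure.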
> @pabloyoyoista mind updating this to 4.16.0, just to have some more crash fixes for more tests?

I think, like you say, the general problems must lie in the kernel/Mesa drivers. I'll discuss with Company and come back.
By Pablo Correa Gomez on 2024-09-08T20:44:00
Edited by Ghost User
Thanks everybody for the tests! We don't need any more; we just need to reach a decision and find a way forward.
By Pablo Correa Gomez on 2024-09-08T18:20:25
Edited by Ghost User
I've been wondering:
- Is GTK using Vulkan or GL?
  Because GTK is picking Vulkan if the driver supports dmabufs and I think the adreno driver does?
- Is the Adreno Vulkan driver (turnip) working as well as the GL driver (freedreno)?
  When I started testing GTK on my Raspberry Pi 4, it certainly wasn't as good.

So for everyone running those tests:
- Can you run once with `GSK_DEBUG=renderer` to have GTK print out which renderer it chooses?
- Can you test both Vulkan and the NGL renderer by specifying `GSK_RENDERER=vulkan` and `GSK_RENDERER=ngl` and compare how well they work?

In my experience on low-powered hardware, NGL and Vulkan should perform almost the same if the drivers work well.
By Benjamin Otte on 2024-09-10T08:17:49
Edited by Ghost User
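The two runs requested above can be scripted. This is a minimal Python sketch, assuming the GTK4 app is started from a terminal inside the running session; `renderer_env` and `run_with_renderer` are hypothetical helper names, and only the `GSK_RENDERER` and `GSK_DEBUG` variables come from the request itself. Standard output and standard error are captured together so it does not matter which stream the debug lines go to.

```python
import os
import subprocess


def renderer_env(renderer, base_env=None):
    """Build an environment that forces a GSK renderer and enables renderer debug output."""
    env = dict(os.environ if base_env is None else base_env)
    env["GSK_RENDERER"] = renderer
    env["GSK_DEBUG"] = "renderer"
    return env


def run_with_renderer(command, renderer, seconds=30):
    """Run a GTK4 app for a limited time and return what it printed.

    GUI apps do not exit on their own, so the process is stopped after
    `seconds` and the output captured up to that point is returned.
    """
    try:
        done = subprocess.run(
            command,
            env=renderer_env(renderer),
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            text=True,
            timeout=seconds,
        )
        return done.stdout
    except subprocess.TimeoutExpired as expired:
        output = expired.output or ""
        # Depending on the Python version, the partial output may be bytes.
        if isinstance(output, bytes):
            return output.decode(errors="replace")
        return output
```

Calling `run_with_renderer(["gtk4-demo"], "vulkan")` and then the same with `"ngl"` gives one captured log per renderer to compare. On devices where the app takes down the whole SoC, nothing will be returned, so this only helps where the app actually starts.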
@abologna @TravMurav would you be able to manually run the tests requested? The behavior seems mostly consistent across devices with the same SoC, so we don't think we need more than one test per family.
By Pablo Correa Gomez on 2024-09-10T09:44:00
a3xx and a5xx are too old to use Turnip (mesa's vulkan impl for adreno), so it for sure uses GL, but as I've mentioned already, for msm8916 the problem is that a3xx freedreno support is too broken and effectively abandoned in Mesa.
If we can keep the devices alive with the old renderer for a while longer, nice. Once the old renderer is dropped, we will likely have to force swrast and/or denylist GPU-accelerated UIs on a3xx devices, since the circles of people interested in keeping it alive and people who know how to write/fix GPU drivers unfortunately don't overlap.
There is nothing GTK can/should do with the new renderer in this case I believe since I'd hope it already conforms to OpenGL spec and expects the driver to be as conformant.
By Nikita Travkin on 2024-09-10T12:16:39
Edited by Ghost User
We'd need somebody like this person, fixing bugs on really old Intel :/
By Robert Mader on 2024-09-10T12:19:25
> a3xx and a5xx are too old to use Turnip

Thanks, didn't know that!
> We'd need somebody like this person, fixing bugs on really old Intel :/

We've discussed trying to fund somebody to work on a5xx at some point, even if it didn't materialize. Though I agree with Nikita, a3xx is not on the table
By Pablo Correa Gomez on 2024-09-10T12:29:04
Edited by Administrator
> There is nothing GTK can/should do with the new renderer in this case I believe since I'd hope it already conforms to OpenGL spec and expects the driver to be as conformant.

There are often things GTK can do slightly differently and gain a ton of performance that way, usually because some hardware or its driver is better at doing X than doing Y.
For a fun example of that, you can look at the table in https://gitlab.gnome.org/GNOME/gtk/-/merge_requests/7570 where some hardware got faster, some got slower and some hardware got faster with Vulkan and slower with GL.
But that requires learning about all the different kinds of hardware and what they are good/bad at.
And Adreno is a kind of hardware I don't know much about. Neither is Mali btw.
By Benjamin Otte on 2024-09-10T14:03:50
Tested on `xiaomi-tissot` for MSM8953.
> Can you run once with `GSK_DEBUG=renderer` to have GTK print out which renderer it chooses?

$ GSK_DEBUG=renderer gtk4-demo
Not using Vulkan: Could not create a Vulkan instance: The requested version of Vulkan is not supported by the driver or is otherwise incompatible for implementation-specific reasons. (VK_ERROR_INCOMPATIBLE_DRIVER)
Using renderer 'GskNglRenderer' for surface 'GdkWaylandToplevel'

> Can you test both Vulkan and the NGL renderer by specifying `GSK_RENDERER=vulkan` and `GSK_RENDERER=ngl` and compare how well they work?

They seem to perform roughly the same: somewhat choppy/laggy transitions between panels in `gnome-control-center`, ~20fps for the "scrolling icons" demo and ~10fps for the "scrolling text with emoji" one.
Based on the output above, I would assume that this is because even when explicitly asking for the `vulkan` renderer we're actually getting the `ngl` one, so I'm effectively running the same code in both cases.
By Andrea Bolognani on 2024-09-16T17:44:54
Edited by Ghost User
> Based on the output above, I would assume that this is because even when explicitly asking for the `vulkan` renderer we're actually getting the `ngl` one, so I'm effectively running the same code in both cases.

Yup.
$ GSK_DEBUG=renderer GSK_RENDERER=vulkan gtk4-demo
Environment variable GSK_RENDERER=vulkan set, trying GskVulkanRenderer
Not using Vulkan: Could not create a Vulkan instance: The requested version of Vulkan is not supported by the driver or is otherwise incompatible for implementation-specific reasons. (VK_ERROR_INCOMPATIBLE_DRIVER)
Using renderer 'GskNglRenderer' for surface 'GdkWaylandToplevel'
By Andrea Bolognani on 2024-09-15T10:21:10
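The fallback seen in that output can be detected mechanically. The sketch below parses `GSK_DEBUG=renderer` output and reports whether the renderer that was asked for is the one GTK ended up using. The two regular expressions are derived only from the message wording pasted in this thread, so treat them as an assumption: other GTK versions may word these lines differently. `renderer_summary` is a hypothetical helper name.

```python
import re

# Message formats assumed from the output pasted in this thread.
REQUESTED_RE = re.compile(r"GSK_RENDERER=(\w+) set, trying (\w+)")
USING_RE = re.compile(r"Using renderer '(\w+)'")


def renderer_summary(output):
    """Extract the requested and the actually used renderer class from debug output."""
    requested = REQUESTED_RE.search(output)
    used = USING_RE.search(output)
    requested_class = requested.group(2) if requested else None
    used_class = used.group(1) if used else None
    fell_back = (
        requested_class is not None
        and used_class is not None
        and requested_class != used_class
    )
    return {"requested": requested_class, "used": used_class, "fell_back": fell_back}


# Output copied from the GSK_RENDERER=vulkan run on xiaomi-tissot.
sample = (
    "Environment variable GSK_RENDERER=vulkan set, trying GskVulkanRenderer\n"
    "Not using Vulkan: Could not create a Vulkan instance: The requested version "
    "of Vulkan is not supported by the driver or is otherwise incompatible for "
    "implementation-specific reasons. (VK_ERROR_INCOMPATIBLE_DRIVER)\n"
    "Using renderer 'GskNglRenderer' for surface 'GdkWaylandToplevel'\n"
)

print(renderer_summary(sample))
```

For the sample, `fell_back` is true: `GskVulkanRenderer` was requested but `GskNglRenderer` was used, which matches the conclusion that both runs exercised the same code.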
mentioned in issue #3197
By Pablo Correa Gomez on 2024-09-23T14:28:04
After some in-person discussion, I've decided to open #3197 and close this instead. We will keep using the gl renderer on a3xx until it breaks, and will not ask upstream to block on it. If we ever implement #3197, we'd hope to get some a5xx user who can engage with upstream to fix issues. Otherwise, we know we can live with a slower renderer, albeit not an amazing one.
By Pablo Correa Gomez on 2024-09-23T14:31:04
mentioned in issue #3212
By ΞЖKƆ/QVH on 2024-09-28T01:57:33
mentioned in issue #3533 (closed)