
Draft: RFT GTK4.0 ngl renderer

Closed Imported Administrator requested to merge pabloyoyoista/gtk4-ngl into master
5 unresolved threads

Some time ago we disabled the new ngl renderer on a number of devices because it performed poorly or surfaced GPU driver bugs that did not happen with the "good-old" gl renderer. I was approached by a GTK dev, who told me they have put great effort into making the new renderer behave closer to the old one. Consequently, they expect that our devices will no longer see a big difference, and that we can revert our changes. To test that hypothesis, we need people with those devices to actually test it!

Relates to:

  • https://gitlab.com/postmarketOS/pmaports/-/issues/1069
  • https://gitlab.com/postmarketOS/pmaports/-/issues/2681
  • https://gitlab.com/postmarketOS/pmaports/-/merge_requests/4961
  • https://gitlab.com/postmarketOS/pmaports/-/merge_requests/4958
  • https://gitlab.com/postmarketOS/pmaports/-/merge_requests/5059

Instructions for testing

We have gotten enough tests already and we're waiting to have a decision. So if you are coming here for the ping, thanks a lot, but you can spare the time :smile:

  • Wait for pipelines to succeed
  • Run mrtest add 5559, and upgrade every package you have installed. This will actually downgrade gtk4.0. This is expected
  • Run a couple of GTK4 apps. See if you notice remarkable changes in performance, if you see some artifacts, or if things stay more or less how they are

Testing team

  • Qualcomm Snapdragon 410/412 (MSM8916). Reports of crashing harder than with 4.14.4 and gl renderer
    • alcatel-idol347 (@mtekman)
    • bq-paella (@panzersajt)
    • samsung-a3 (@colorant31)
    • samsung-gt58 (@Breakfastisready)
    • @TravMurav as the author of the original MR, and maintainer of chipset
  • Qualcomm Snapdragon 450/625/626/632 (MSM8953):

    • samsung-a6plte (@colorant31)
    • xiaomi-mido (@Serg12344)
    • @abologna as the author of this SoC MR
    • @barni2000 as maintainer of SoC, if you want to provide any feedback
  • lg-hammerhead:

  • pine64-pinetab:

    • @dylanvanassche as maintainer
Edited by Administrator

Activity

  • Administrator added 1 commit · Imported

    • cdfc8d0a - temp/gtk4.0: upgrade to 4.15.6

    Compare with previous version

    By Pablo Correa Gomez on 2024-09-05T09:34:08

  • Administrator changed the description · Imported

    By Pablo Correa Gomez on 2024-09-05T09:36:07

  • Administrator changed the description · Imported

    By Pablo Correa Gomez on 2024-09-05T09:36:36

  • Administrator added 1 commit · Imported

    • 4380e112 - temp/gtk4.0: upgrade to 4.15.6

    Compare with previous version

    By Pablo Correa Gomez on 2024-09-05T09:38:18

  • Administrator changed the description · Imported

    By Pablo Correa Gomez on 2024-09-05T09:38:42

  • Administrator added 2 commits · Imported

    • 819d243e - device-pine64-pinetab: remove GTK renderer workaround
    • 6168202f - temp/gtk4.0: upgrade to 4.15.6

    Compare with previous version

    By Pablo Correa Gomez on 2024-09-05T10:33:49

    • Author Owner

      I'm afraid I have to NACK this for msm8916; GTK4 apps still cause GPU crashes. Moreover, whatever we have packaged in pmaports causes only a semi-recoverable GPU lockup, but installing this MR causes a full SoC crash when GTK4 apps try to render (which is known to happen with some apps, but now GTK4 tickles that spot too).

      We will likely have to keep the old renderer until it's removed from GTK, I'm afraid.

      By Nikita Travkin on 2024-09-05T13:34:16

    • Author Owner

      Ok, that's bad news, but thanks for testing!

      whatever we have packaged in pmaports

      FTR, 4.14.4. This MR has 4.15.6

      By Pablo Correa Gomez on 2024-09-05T13:46:31

    • Author Owner

      Yeah, GNOME is glitchy on msm8916 btw (full of graphical glitches). Not sure if it's the way I installed it, but it seems not.

      By ΞЖKƆ/QVH on 2024-09-05T20:12:47

      Edited by Administrator
  • Author Owner

    Still quite interested in people testing the other SoCs!

    By Pablo Correa Gomez on 2024-09-05T13:46:58

    Edited by Administrator
  • Administrator changed the description · Imported

    By Pablo Correa Gomez on 2024-09-05T13:47:38

  • Author Owner

    Unfortunately can't test this, as I no longer have a functional Nexus 5. I think it'd probably be fine to remove the workaround and see if there are any bug reports.

    By Adam Thiede on 2024-09-05T20:23:44

  • Author Owner

    Hi!

    I tested it on bq-paella:

    I installed it on top of fresh 20240904-0119-postmarketOS-v24.06-phosh-22.3-bq-paella.img using mrtest add 5559, then upgrade, then confirm. Then rebooted just to be sure.

    I used flatpak install flathub org.gnome.Solanum to test GTK4 app as described in one of the referenced MRs.

    I could not see any major change in performance, but the preinstalled apps like Console, Calculator, Maps, etc. have kept crashing Phosh whenever I try to open them, ever since installing the MR!

    By Panzer Sajt on 2024-09-06T06:52:08

    • Author Owner

      Tested on xiaomi-tissot for MSM8953.

      It definitely runs better than it did before. The performance is no longer unusable, and from a quick test I haven't been able to hit any of the weird glitches such as flashing text in the terminal emulator.

      However, it's still not as good as the old gl renderer. With that one, transitions between gnome-control-center pages are buttery smooth, while they're fast but still somewhat choppy with the ngl renderer.

      In the "scrolling" demo from gtk4-demo, the "scrolling icons" only reaches ~20fps instead of ~30fps; the "scrolling text with emoji" still displays the worst performance regression, dropping from ~30fps to ~10fps with the new renderer.

      tl;dr excellent progress overall, but still a significant downgrade from the old renderer I'm afraid :(

      By Andrea Bolognani on 2024-09-08T18:16:21

      Edited by Ghost User
    • Author Owner

      Those are very good tests! I hope the FPS ones can give the maintainers upstream some extra feedback :) We'll get back after some discussion

      By Pablo Correa Gomez on 2024-09-08T18:16:21

  • Author Owner

    Tested on alcatel-idol347 following a fresh install.

    At this point, the Calculator and Console apps were working fine.

    Then I used mrtest add 5559, then upgrade, then confirm. Then rebooted just to be sure.

    At this point, using the Calculator app causes the phone to reboot.

    I used flatpak install flathub org.gnome.Solanum to test GTK4 app as described in one of the referenced MRs.

    Running this after the mrtest brought up the app fine, but Calculator and Console would still trigger a reboot.

    I tried opening the gtk4.0-demo but that also resulted in a reboot.

    By mtekman on 2024-09-07T16:20:36

    • Author Owner

      @pabloyoyoista mind updating this to 4.16.0, just to have some more crash fixes for more tests?

      FWIW., in the cases where GTK causes GPU crashes, by definition that means there are kernel bugs - and maybe also Mesa ones. If we could get any kind of logs for these cases, that would be great!

      By Robert Mader on 2024-09-07T17:40:38

    • Author Owner

      FWIW., in the cases where GTK causes GPU crashes, by definition that means there are kernel bugs - and maybe also Mesa ones. If we could get any kind of logs for these cases, that would be great!

      dmesg -w does not show anything during the error, and my phone does not appear to have ram_console built in to persist the last dmesg over a reboot.

      Doing logread -f during the error also yields nothing, and doing logread -n 1000 on the next boot doesn't seem to show anything GPU specific.

      Here are some lines grepped for "error"

      [Sep 07 19:39:44] daemon NetworkManager[1883]: <warn>  [1725734384.3635] modem-manager: error poking ModemManager: GDBus.Error:org.freedesktop.DBus.Error.ServiceUnknown: The name org.freedesktop.ModemManager1 was not provided by any .service files 
      <-- There were messages at 20:08:13 that did not seem to
          persist across reboot -->
      [Sep 07 20:08:45] kern kernel: spmi-temp-alarm 200f000.spmi:pmic@0:temp-alarm@2400: error -ENODEV: failed to register sensor
      [Sep 07 20:08:45] daemon kernel: udevd[123]: ctx=0xffff8c69cd80 path=/lib/modules/6.6.0-msm8916/kernel/drivers/remoteproc/qcom_pil_info.ko error=No such file or directory
      [Sep 07 20:08:45] daemon kernel: udevd[116]: ctx=0xffff8c69cd80 path=/lib/modules/6.6.0-msm8916/kernel/sound/soc/qcom/snd-soc-qcom-common.ko error=No such file or directory
      [Sep 07 20:08:45] daemon kernel: udevd[108]: ctx=0xffff8c69cd80 path=/lib/modules/6.6.0-msm8916/kernel/drivers/media/v4l2-core/v4l2-mem2mem.ko error=No such file or directory
      [Sep 07 20:08:45] daemon kernel: udevd[120]: ctx=0xffff8c69cd80 path=/lib/modules/6.6.0-msm8916/kernel/drivers/remoteproc/qcom_pil_info.ko error=No such file or directory
      [Sep 07 20:08:45] daemon kernel: udevd[125]: ctx=0xffff8c69cd80 path=/lib/modules/6.6.0-msm8916/kernel/drivers/iio/gyro/bmg160_core.ko error=No such file or directory
      [Sep 07 20:08:45] kern kernel: msm_mdp 1a01000.display-controller: Direct firmware load for qcom/a300_pm4.fw failed with error -2
      [Sep 07 20:08:45] kern kernel: [drm:mdp5_irq_error_handler [msm]] *ERROR* errors: 04000000
      [Sep 07 20:08:48] daemon NetworkManager[2108]: <warn>  [1725736128.6556] modem-manager: error poking ModemManager: GDBus.Error:org.freedesktop.DBus.Error.ServiceUnknown: The name org.freedesktop.ModemManager1 was not provided by any .service files 
      [Sep 07 20:08:48] user nm-dns-filter[2294]: error: unable to determine IP config for primary connection!
      [Sep 07 20:08:48] user nm-dns-filter[2323]: error: unable to determine IP config for primary connection!
      [Sep 07 20:09:10] daemon gnome-session-binary[3025]: WARNING: Falling back to non-systemd startup procedure due to error: GDBus.Error:org.freedesktop.DBus.Error.NameHasNoOwner: Name "org.freedesktop.systemd1" does not exist 
      [Sep 07 20:09:22] daemon [2624]: <wrn> [modem0] error initializing: Modem in failed state: sim-missing

      I don't think these are GPU related though.
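
Grepping for "error" pulls in a lot of unrelated noise. As a rough sketch, the same logs could be narrowed to lines that mention the DRM/display stack or Adreno firmware. The pattern is an assumption for illustration rather than a complete list of relevant drivers, and `filter_gpu_lines` is a hypothetical helper name:

```shell
#!/bin/sh
# Keep only log lines (read from stdin) that mention the DRM/display stack,
# a GPU, or an Adreno-style firmware file such as a300_pm4.fw.
# The pattern is a guess at what is relevant, not an exhaustive list.
filter_gpu_lines() {
    grep -Ei 'drm|adreno|gpu|msm_mdp|a[0-9]+_[a-z0-9]+\.fw'
}

# Possible use on a device, with the logread invocation from this comment:
#   logread -n 1000 | filter_gpu_lines
```

The pattern deliberately avoids a bare `msm`, since that would also match every module path containing `6.6.0-msm8916`.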

      By mtekman on 2024-09-07T19:18:19

    • Author Owner

      @pabloyoyoista mind updating this to 4.16.0, just to have some more crash fixes for more tests?

      I think, like you say, the general problems have to be in the kernel/Mesa drivers. I'll discuss with Company and come back :smile:

      By Pablo Correa Gomez on 2024-09-08T20:44:00

      Edited by Ghost User
  • Administrator changed the description · Imported

    By Pablo Correa Gomez on 2024-09-08T18:04:25

  • Author Owner

    Thanks everybody for the tests! We no longer need any more, we just need to figure out a decision and a way forward :smile:

    By Pablo Correa Gomez on 2024-09-08T18:20:25

    Edited by Ghost User
    • Author Owner

      I've been wondering:

      1. Is GTK using Vulkan or GL?
        Because GTK is picking Vulkan if the driver supports dmabufs and I think the adreno driver does?

      2. Is the Adreno Vulkan driver (turnip) working as well as the GL driver (freedreno)?
        When I started testing GTK on my Raspberry Pi 4, it certainly wasn't as good.

      So for everyone running those tests:

      1. Can you run once with GSK_DEBUG=renderer to have GTK print out which renderer it chooses?

      2. Can you test both Vulkan and the NGL renderer by specifying GSK_RENDERER=vulkan and GSK_RENDERER=ngl and compare how well they work?

      In my experience on low-powered hardware, NGL and Vulkan should perform almost the same if the drivers work well.
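
A minimal sketch of how the two requested runs could be scripted, assuming the app under test is passed as the command. `run_each_renderer` is a hypothetical helper name; the `GSK_DEBUG` and `GSK_RENDERER` values are the ones named above:

```shell
#!/bin/sh
# Run the given command once per renderer, with GSK_DEBUG=renderer set so
# that GTK prints which renderer it actually picked for each run.
run_each_renderer() {
    for renderer in vulkan ngl; do
        echo "== GSK_RENDERER=$renderer =="
        GSK_DEBUG=renderer GSK_RENDERER="$renderer" "$@"
    done
}

# Possible use on a device (close the app to move on to the next renderer):
#   run_each_renderer gtk4-demo
```

Running the same app back to back this way keeps everything else constant, so any difference in smoothness can be attributed to the renderer choice.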

      By Benjamin Otte on 2024-09-10T08:17:49

      Edited by Ghost User
    • Author Owner

      if the driver supports dmabufs and I think the adreno driver does?

      All mesa drivers do ;)

      By Robert Mader on 2024-09-10T08:17:49

    • Author Owner

      @abologna @TravMurav would you be able to manually run the tests requested? The behavior seems mostly consistent across devices with the same SoC, so we don't think we need more than one test per family.

      By Pablo Correa Gomez on 2024-09-10T09:44:00

    • Author Owner

      a3xx and a5xx are too old to use Turnip (mesa's vulkan impl for adreno), so it for sure uses GL, but as I've mentioned already, for msm8916 the problem is that a3xx freedreno support is too broken and effectively abandoned in Mesa.

      If we can keep the devices alive with the old renderer for a while longer, nice. When it comes to the point where the old renderer is dropped, we will likely have to force swrast and/or denylist GPU-accelerated UIs on a3xx devices, since the circles of people interested in keeping it alive and people who know how to write/fix GPU drivers unfortunately don't overlap.

      There is nothing GTK can/should do with the new renderer in this case I believe since I'd hope it already conforms to OpenGL spec and expects the driver to be as conformant.

      By Nikita Travkin on 2024-09-10T12:16:39

      Edited by Ghost User
    • Author Owner

      We'd need somebody like this person, fixing bugs on really old Intel :/

      By Robert Mader on 2024-09-10T12:19:25

    • Author Owner

      a3xx and a5xx are too old to use Turnip

      Thanks, didn't know that!

      We'd need somebody like this person, fixing bugs on really old Intel :/

      We've discussed trying to fund somebody to work on a5xx at some point, even if it didn't materialize. Though I agree with Nikita, a3xx is not on the table

      By Pablo Correa Gomez on 2024-09-10T12:29:04

      Edited by Administrator
    • Author Owner

      There is nothing GTK can/should do with the new renderer in this case I believe since I'd hope it already conforms to OpenGL spec and expects the driver to be as conformant.

      There are often things GTK can do slightly differently and gain a ton of performance that way, usually because some hardware or its driver is better at doing X than doing Y.
      For a fun example of that, you can look at the table in https://gitlab.gnome.org/GNOME/gtk/-/merge_requests/7570 where some hardware got faster, some got slower and some hardware got faster with Vulkan and slower with GL.

      But that requires learning about all the different kinds of hardware and what they are good/bad at.
      And Adreno is a kind of hardware I don't know much about. Neither is Mali btw.

      By Benjamin Otte on 2024-09-10T14:03:50

    • Author Owner

      Tested on xiaomi-tissot for MSM8953.

      Can you run once with GSK_DEBUG=renderer to have GTK print out which renderer it chooses?

      $ GSK_DEBUG=renderer gtk4-demo
      Not using Vulkan: Could not create a Vulkan instance: The requested version of Vulkan is not supported by the driver or is otherwise incompatible for implementation-specific reasons. (VK_ERROR_INCOMPATIBLE_DRIVER)
      Using renderer 'GskNglRenderer' for surface 'GdkWaylandToplevel'

      Can you test both Vulkan and the NGL renderer by specifying GSK_RENDERER=vulkan and GSK_RENDERER=ngl and compare how well they work?

      They seem to perform roughly the same: somewhat choppy/laggy transitions between panels in gnome-control-center, ~20fps for the "scrolling icons" demo and ~10fps for the "scrolling text with emoji" one.

      Based on the output above, I would assume that this is because even when explicitly asking for the vulkan renderer we're actually getting the ngl one, so I'm effectively running the same code in both cases.

      By Andrea Bolognani on 2024-09-16T17:44:54

      Edited by Ghost User
    • Author Owner

      Based on the output above, I would assume that this is because even when explicitly asking for the vulkan renderer we're actually getting the ngl one, so I'm effectively running the same code in both cases.

      Yup.

      $ GSK_DEBUG=renderer GSK_RENDERER=vulkan gtk4-demo
      Environment variable GSK_RENDERER=vulkan set, trying GskVulkanRenderer
      Not using Vulkan: Could not create a Vulkan instance: The requested version of Vulkan is not supported by the driver or is otherwise incompatible for implementation-specific reasons. (VK_ERROR_INCOMPATIBLE_DRIVER)
      Using renderer 'GskNglRenderer' for surface 'GdkWaylandToplevel'
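
This kind of silent fallback can be checked for mechanically. The sketch below parses the `Using renderer '...'` line shown above; `actual_renderer` and `renderer_matches` are hypothetical helper names, and it assumes the debug output format stays as printed here:

```shell
#!/bin/sh
# Print the renderer class named in GSK_DEBUG=renderer output read from
# stdin, e.g. "GskNglRenderer". Prints nothing if no such line is present.
actual_renderer() {
    sed -n "s/^[[:space:]]*Using renderer '\([^']*\)'.*/\1/p" | head -n 1
}

# Succeed only if the output on stdin shows the expected renderer class.
# $1 is the expected class name, e.g. GskVulkanRenderer.
renderer_matches() {
    [ "$(actual_renderer)" = "$1" ]
}

# Possible use, assuming the debug output was saved to a file first:
#   renderer_matches GskVulkanRenderer < gsk-debug.log || echo "fell back"
```

Fed the output above, `renderer_matches GskVulkanRenderer` fails, which is the fallback case described here.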

      By Andrea Bolognani on 2024-09-15T10:21:10

  • Administrator mentioned in issue #3197 · Imported

    By Pablo Correa Gomez on 2024-09-23T14:28:04

  • Author Owner

    After some in-person discussion, I've decided to open #3197 and close this instead. We will keep using the gl renderer on a3xx until it breaks, and not ask upstream to block on it. If we ever implement #3197, we'd hope to get some a5xx user who can engage with upstream to fix issues. Otherwise, we know we can live with a slower renderer, even if it's not amazing.

    By Pablo Correa Gomez on 2024-09-23T14:31:04

  • Administrator closed · Imported

  • Author Owner

    And also, thanks everybody for the tests and the feedback. You provided very good input that helped us make a decision and discuss it with upstream!

    By Pablo Correa Gomez on 2024-09-23T14:31:46

  • Administrator mentioned in issue #3212 · Imported

    By ΞЖKƆ/QVH on 2024-09-28T01:57:33

  • mentioned in issue #3533 (closed)
