Skip to content
Snippets Groups Projects
  1. Jan 20, 2025
  2. Jan 16, 2025
    • Matthew Brost's avatar
      drm/xe: Mark ComputeCS read mode as UC on iGPU · b1231ff7
      Matthew Brost authored
      
      RING_CMD_CCTL read index should be UC on iGPU parts due to L3 caching
      structure. Having this as WB blocks ULLS from being enabled. Change to
      UC to unblock ULLS on iGPU.
      
      v2:
       - Drop internal communications commnet, bspec is updated
      
      Cc: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
      Cc: Michal Mrozek <michal.mrozek@intel.com>
      Cc: Paulo Zanoni <paulo.r.zanoni@intel.com>
      Cc: José Roberto de Souza <jose.souza@intel.com>
      Cc: stable@vger.kernel.org
      Fixes: 328e089b ("drm/xe: Leverage ComputeCS read L3 caching")
      Signed-off-by: default avatarMatthew Brost <matthew.brost@intel.com>
      Acked-by: default avatarMichal Mrozek <michal.mrozek@intel.com>
      Reviewed-by: default avatarStuart Summers <stuart.summers@intel.com>
      Reviewed-by: default avatarMatt Roper <matthew.d.roper@intel.com>
      Link: https://patchwork.freedesktop.org/patch/msgid/20250114002507.114087-1-matthew.brost@intel.com
      
      
      (cherry picked from commit 758debf35b9cda5450e40996991a6e4b222899bd)
      Signed-off-by: default avatarThomas Hellström <thomas.hellstrom@linux.intel.com>
      b1231ff7
    • Xiaolei Wang's avatar
      pmdomain: imx8mp-blk-ctrl: add missing loop break condition · 726efa92
      Xiaolei Wang authored
      
      Currently imx8mp_blk_ctrl_remove() will continue the for loop
      until an out-of-bounds exception occurs.
      
      pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
      pc : dev_pm_domain_detach+0x8/0x48
      lr : imx8mp_blk_ctrl_shutdown+0x58/0x90
      sp : ffffffc084f8bbf0
      x29: ffffffc084f8bbf0 x28: ffffff80daf32ac0 x27: 0000000000000000
      x26: ffffffc081658d78 x25: 0000000000000001 x24: ffffffc08201b028
      x23: ffffff80d0db9490 x22: ffffffc082340a78 x21: 00000000000005b0
      x20: ffffff80d19bc180 x19: 000000000000000a x18: ffffffffffffffff
      x17: ffffffc080a39e08 x16: ffffffc080a39c98 x15: 4f435f464f006c72
      x14: 0000000000000004 x13: ffffff80d0172110 x12: 0000000000000000
      x11: ffffff80d0537740 x10: ffffff80d05376c0 x9 : ffffffc0808ed2d8
      x8 : ffffffc084f8bab0 x7 : 0000000000000000 x6 : 0000000000000000
      x5 : ffffff80d19b9420 x4 : fffffffe03466e60 x3 : 0000000080800077
      x2 : 0000000000000000 x1 : 0000000000000001 x0 : 0000000000000000
      Call trace:
       dev_pm_domain_detach+0x8/0x48
       platform_shutdown+0x2c/0x48
       device_shutdown+0x158/0x268
       kernel_restart_prepare+0x40/0x58
       kernel_kexec+0x58/0xe8
       __do_sys_reboot+0x198/0x258
       __arm64_sys_reboot+0x2c/0x40
       invoke_syscall+0x5c/0x138
       el0_svc_common.constprop.0+0x48/0xf0
       do_el0_svc+0x24/0x38
       el0_svc+0x38/0xc8
       el0t_64_sync_handler+0x120/0x130
       el0t_64_sync+0x190/0x198
      Code: 8128c2d0 ffffffc0 aa1e03e9 d503201f
      
      Fixes: 556f5cf9 ("soc: imx: add i.MX8MP HSIO blk-ctrl")
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarXiaolei Wang <xiaolei.wang@windriver.com>
      Reviewed-by: default avatarLucas Stach <l.stach@pengutronix.de>
      Reviewed-by: default avatarFabio Estevam <festevam@gmail.com>
      Reviewed-by: default avatarFrank Li <Frank.Li@nxp.com>
      Link: https://lore.kernel.org/r/20250115014118.4086729-1-xiaolei.wang@windriver.com
      
      
      Signed-off-by: default avatarUlf Hansson <ulf.hansson@linaro.org>
      726efa92
    • Leon Romanovsky's avatar
      net/mlx5e: Always start IPsec sequence number from 1 · 7f95b024
      Leon Romanovsky authored
      
      According to RFC4303, section "3.3.3. Sequence Number Generation",
      the first packet sent using a given SA will contain a sequence
      number of 1.
      
      This is applicable to both ESN and non-ESN mode, which was not covered
      in commit mentioned in Fixes line.
      
      Fixes: 3d42c8cc ("net/mlx5e: Ensure that IPsec sequence packet number starts from 1")
      Signed-off-by: default avatarLeon Romanovsky <leonro@nvidia.com>
      Reviewed-by: default avatarJacob Keller <jacob.e.keller@intel.com>
      Signed-off-by: default avatarTariq Toukan <tariqt@nvidia.com>
      Signed-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      7f95b024
    • Leon Romanovsky's avatar
      net/mlx5e: Rely on reqid in IPsec tunnel mode · 25f23524
      Leon Romanovsky authored
      
      All packet offloads SAs have reqid in it to make sure they have
      corresponding policy. While it is not strictly needed for transparent
      mode, it is extremely important in tunnel mode. In that mode, policy and
      SAs have different match criteria.
      
      Policy catches the whole subnet addresses, and SA catches the tunnel gateways
      addresses. The source address of such tunnel is not known during egress packet
      traversal in flow steering as it is added only after successful encryption.
      
      As reqid is required for packet offload and it is unique for every SA,
      we can safely rely on it only.
      
      The output below shows the configured egress policy and SA by strongswan:
      
      [leonro@vm ~]$ sudo ip x s
      src 192.169.101.2 dst 192.169.101.1
              proto esp spi 0xc88b7652 reqid 1 mode tunnel
              replay-window 0 flag af-unspec esn
              aead rfc4106(gcm(aes)) 0xe406a01083986e14d116488549094710e9c57bc6 128
              anti-replay esn context:
               seq-hi 0x0, seq 0x0, oseq-hi 0x0, oseq 0x0
               replay_window 1, bitmap-length 1
               00000000
              crypto offload parameters: dev eth2 dir out mode packet
      
      [leonro@064 ~]$ sudo ip x p
      src 192.170.0.0/16 dst 192.170.0.0/16
              dir out priority 383615 ptype main
              tmpl src 192.169.101.2 dst 192.169.101.1
                      proto esp spi 0xc88b7652 reqid 1 mode tunnel
              crypto offload parameters: dev eth2 mode packet
      
      Fixes: b3beba1f ("net/mlx5e: Allow policies with reqid 0, to support IKE policy holes")
      Signed-off-by: default avatarLeon Romanovsky <leonro@nvidia.com>
      Reviewed-by: default avatarJacob Keller <jacob.e.keller@intel.com>
      Signed-off-by: default avatarTariq Toukan <tariqt@nvidia.com>
      Signed-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      25f23524
    • Leon Romanovsky's avatar
      net/mlx5e: Fix inversion dependency warning while enabling IPsec tunnel · 2c368809
      Leon Romanovsky authored
      
      Attempt to enable IPsec packet offload in tunnel mode in debug kernel
      generates the following kernel panic, which is happening due to two
      issues:
      1. In SA add section, the should be _bh() variant when marking SA mode.
      2. There is not needed flush_workqueue in SA delete routine. It is not
      needed as at this stage as it is removed from SADB and the running work
      will be canceled later in SA free.
      
       =====================================================
       WARNING: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected
       6.12.0+ #4 Not tainted
       -----------------------------------------------------
       charon/1337 [HC0[0]:SC0[4]:HE1:SE0] is trying to acquire:
       ffff88810f365020 (&xa->xa_lock#24){+.+.}-{3:3}, at: mlx5e_xfrm_del_state+0xca/0x1e0 [mlx5_core]
      
       and this task is already holding:
       ffff88813e0f0d48 (&x->lock){+.-.}-{3:3}, at: xfrm_state_delete+0x16/0x30
       which would create a new lock dependency:
        (&x->lock){+.-.}-{3:3} -> (&xa->xa_lock#24){+.+.}-{3:3}
      
       but this new dependency connects a SOFTIRQ-irq-safe lock:
        (&x->lock){+.-.}-{3:3}
      
       ... which became SOFTIRQ-irq-safe at:
         lock_acquire+0x1be/0x520
         _raw_spin_lock_bh+0x34/0x40
         xfrm_timer_handler+0x91/0xd70
         __hrtimer_run_queues+0x1dd/0xa60
         hrtimer_run_softirq+0x146/0x2e0
         handle_softirqs+0x266/0x860
         irq_exit_rcu+0x115/0x1a0
         sysvec_apic_timer_interrupt+0x6e/0x90
         asm_sysvec_apic_timer_interrupt+0x16/0x20
         default_idle+0x13/0x20
         default_idle_call+0x67/0xa0
         do_idle+0x2da/0x320
         cpu_startup_entry+0x50/0x60
         start_secondary+0x213/0x2a0
         common_startup_64+0x129/0x138
      
       to a SOFTIRQ-irq-unsafe lock:
        (&xa->xa_lock#24){+.+.}-{3:3}
      
       ... which became SOFTIRQ-irq-unsafe at:
       ...
         lock_acquire+0x1be/0x520
         _raw_spin_lock+0x2c/0x40
         xa_set_mark+0x70/0x110
         mlx5e_xfrm_add_state+0xe48/0x2290 [mlx5_core]
         xfrm_dev_state_add+0x3bb/0xd70
         xfrm_add_sa+0x2451/0x4a90
         xfrm_user_rcv_msg+0x493/0x880
         netlink_rcv_skb+0x12e/0x380
         xfrm_netlink_rcv+0x6d/0x90
         netlink_unicast+0x42f/0x740
         netlink_sendmsg+0x745/0xbe0
         __sock_sendmsg+0xc5/0x190
         __sys_sendto+0x1fe/0x2c0
         __x64_sys_sendto+0xdc/0x1b0
         do_syscall_64+0x6d/0x140
         entry_SYSCALL_64_after_hwframe+0x4b/0x53
      
       other info that might help us debug this:
      
        Possible interrupt unsafe locking scenario:
      
              CPU0                    CPU1
              ----                    ----
         lock(&xa->xa_lock#24);
                                      local_irq_disable();
                                      lock(&x->lock);
                                      lock(&xa->xa_lock#24);
         <Interrupt>
           lock(&x->lock);
      
        *** DEADLOCK ***
      
       2 locks held by charon/1337:
        #0: ffffffff87f8f858 (&net->xfrm.xfrm_cfg_mutex){+.+.}-{4:4}, at: xfrm_netlink_rcv+0x5e/0x90
        #1: ffff88813e0f0d48 (&x->lock){+.-.}-{3:3}, at: xfrm_state_delete+0x16/0x30
      
       the dependencies between SOFTIRQ-irq-safe lock and the holding lock:
       -> (&x->lock){+.-.}-{3:3} ops: 29 {
          HARDIRQ-ON-W at:
                           lock_acquire+0x1be/0x520
                           _raw_spin_lock_bh+0x34/0x40
                           xfrm_alloc_spi+0xc0/0xe60
                           xfrm_alloc_userspi+0x5f6/0xbc0
                           xfrm_user_rcv_msg+0x493/0x880
                           netlink_rcv_skb+0x12e/0x380
                           xfrm_netlink_rcv+0x6d/0x90
                           netlink_unicast+0x42f/0x740
                           netlink_sendmsg+0x745/0xbe0
                           __sock_sendmsg+0xc5/0x190
                           __sys_sendto+0x1fe/0x2c0
                           __x64_sys_sendto+0xdc/0x1b0
                           do_syscall_64+0x6d/0x140
                           entry_SYSCALL_64_after_hwframe+0x4b/0x53
          IN-SOFTIRQ-W at:
                           lock_acquire+0x1be/0x520
                           _raw_spin_lock_bh+0x34/0x40
                           xfrm_timer_handler+0x91/0xd70
                           __hrtimer_run_queues+0x1dd/0xa60
                           hrtimer_run_softirq+0x146/0x2e0
                           handle_softirqs+0x266/0x860
                           irq_exit_rcu+0x115/0x1a0
                           sysvec_apic_timer_interrupt+0x6e/0x90
                           asm_sysvec_apic_timer_interrupt+0x16/0x20
                           default_idle+0x13/0x20
                           default_idle_call+0x67/0xa0
                           do_idle+0x2da/0x320
                           cpu_startup_entry+0x50/0x60
                           start_secondary+0x213/0x2a0
                           common_startup_64+0x129/0x138
          INITIAL USE at:
                          lock_acquire+0x1be/0x520
                          _raw_spin_lock_bh+0x34/0x40
                          xfrm_alloc_spi+0xc0/0xe60
                          xfrm_alloc_userspi+0x5f6/0xbc0
                          xfrm_user_rcv_msg+0x493/0x880
                          netlink_rcv_skb+0x12e/0x380
                          xfrm_netlink_rcv+0x6d/0x90
                          netlink_unicast+0x42f/0x740
                          netlink_sendmsg+0x745/0xbe0
                          __sock_sendmsg+0xc5/0x190
                          __sys_sendto+0x1fe/0x2c0
                          __x64_sys_sendto+0xdc/0x1b0
                          do_syscall_64+0x6d/0x140
                          entry_SYSCALL_64_after_hwframe+0x4b/0x53
        }
        ... key      at: [<ffffffff87f9cd20>] __key.18+0x0/0x40
      
       the dependencies between the lock to be acquired
        and SOFTIRQ-irq-unsafe lock:
       -> (&xa->xa_lock#24){+.+.}-{3:3} ops: 9 {
          HARDIRQ-ON-W at:
                           lock_acquire+0x1be/0x520
                           _raw_spin_lock_bh+0x34/0x40
                           mlx5e_xfrm_add_state+0xc5b/0x2290 [mlx5_core]
                           xfrm_dev_state_add+0x3bb/0xd70
                           xfrm_add_sa+0x2451/0x4a90
                           xfrm_user_rcv_msg+0x493/0x880
                           netlink_rcv_skb+0x12e/0x380
                           xfrm_netlink_rcv+0x6d/0x90
                           netlink_unicast+0x42f/0x740
                           netlink_sendmsg+0x745/0xbe0
                           __sock_sendmsg+0xc5/0x190
                           __sys_sendto+0x1fe/0x2c0
                           __x64_sys_sendto+0xdc/0x1b0
                           do_syscall_64+0x6d/0x140
                           entry_SYSCALL_64_after_hwframe+0x4b/0x53
          SOFTIRQ-ON-W at:
                           lock_acquire+0x1be/0x520
                           _raw_spin_lock+0x2c/0x40
                           xa_set_mark+0x70/0x110
                           mlx5e_xfrm_add_state+0xe48/0x2290 [mlx5_core]
                           xfrm_dev_state_add+0x3bb/0xd70
                           xfrm_add_sa+0x2451/0x4a90
                           xfrm_user_rcv_msg+0x493/0x880
                           netlink_rcv_skb+0x12e/0x380
                           xfrm_netlink_rcv+0x6d/0x90
                           netlink_unicast+0x42f/0x740
                           netlink_sendmsg+0x745/0xbe0
                           __sock_sendmsg+0xc5/0x190
                           __sys_sendto+0x1fe/0x2c0
                           __x64_sys_sendto+0xdc/0x1b0
                           do_syscall_64+0x6d/0x140
                           entry_SYSCALL_64_after_hwframe+0x4b/0x53
          INITIAL USE at:
                          lock_acquire+0x1be/0x520
                          _raw_spin_lock_bh+0x34/0x40
                          mlx5e_xfrm_add_state+0xc5b/0x2290 [mlx5_core]
                          xfrm_dev_state_add+0x3bb/0xd70
                          xfrm_add_sa+0x2451/0x4a90
                          xfrm_user_rcv_msg+0x493/0x880
                          netlink_rcv_skb+0x12e/0x380
                          xfrm_netlink_rcv+0x6d/0x90
                          netlink_unicast+0x42f/0x740
                          netlink_sendmsg+0x745/0xbe0
                          __sock_sendmsg+0xc5/0x190
                          __sys_sendto+0x1fe/0x2c0
                          __x64_sys_sendto+0xdc/0x1b0
                          do_syscall_64+0x6d/0x140
                          entry_SYSCALL_64_after_hwframe+0x4b/0x53
        }
        ... key      at: [<ffffffffa078ff60>] __key.48+0x0/0xfffffffffff210a0 [mlx5_core]
        ... acquired at:
          __lock_acquire+0x30a0/0x5040
          lock_acquire+0x1be/0x520
          _raw_spin_lock_bh+0x34/0x40
          mlx5e_xfrm_del_state+0xca/0x1e0 [mlx5_core]
          xfrm_dev_state_delete+0x90/0x160
          __xfrm_state_delete+0x662/0xae0
          xfrm_state_delete+0x1e/0x30
          xfrm_del_sa+0x1c2/0x340
          xfrm_user_rcv_msg+0x493/0x880
          netlink_rcv_skb+0x12e/0x380
          xfrm_netlink_rcv+0x6d/0x90
          netlink_unicast+0x42f/0x740
          netlink_sendmsg+0x745/0xbe0
          __sock_sendmsg+0xc5/0x190
          __sys_sendto+0x1fe/0x2c0
          __x64_sys_sendto+0xdc/0x1b0
          do_syscall_64+0x6d/0x140
          entry_SYSCALL_64_after_hwframe+0x4b/0x53
      
       stack backtrace:
       CPU: 7 UID: 0 PID: 1337 Comm: charon Not tainted 6.12.0+ #4
       Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
       Call Trace:
        <TASK>
        dump_stack_lvl+0x74/0xd0
        check_irq_usage+0x12e8/0x1d90
        ? print_shortest_lock_dependencies_backwards+0x1b0/0x1b0
        ? check_chain_key+0x1bb/0x4c0
        ? __lockdep_reset_lock+0x180/0x180
        ? check_path.constprop.0+0x24/0x50
        ? mark_lock+0x108/0x2fb0
        ? print_circular_bug+0x9b0/0x9b0
        ? mark_lock+0x108/0x2fb0
        ? print_usage_bug.part.0+0x670/0x670
        ? check_prev_add+0x1c4/0x2310
        check_prev_add+0x1c4/0x2310
        __lock_acquire+0x30a0/0x5040
        ? lockdep_set_lock_cmp_fn+0x190/0x190
        ? lockdep_set_lock_cmp_fn+0x190/0x190
        lock_acquire+0x1be/0x520
        ? mlx5e_xfrm_del_state+0xca/0x1e0 [mlx5_core]
        ? lockdep_hardirqs_on_prepare+0x400/0x400
        ? __xfrm_state_delete+0x5f0/0xae0
        ? lock_downgrade+0x6b0/0x6b0
        _raw_spin_lock_bh+0x34/0x40
        ? mlx5e_xfrm_del_state+0xca/0x1e0 [mlx5_core]
        mlx5e_xfrm_del_state+0xca/0x1e0 [mlx5_core]
        xfrm_dev_state_delete+0x90/0x160
        __xfrm_state_delete+0x662/0xae0
        xfrm_state_delete+0x1e/0x30
        xfrm_del_sa+0x1c2/0x340
        ? xfrm_get_sa+0x250/0x250
        ? check_chain_key+0x1bb/0x4c0
        xfrm_user_rcv_msg+0x493/0x880
        ? copy_sec_ctx+0x270/0x270
        ? check_chain_key+0x1bb/0x4c0
        ? lockdep_set_lock_cmp_fn+0x190/0x190
        ? lockdep_set_lock_cmp_fn+0x190/0x190
        netlink_rcv_skb+0x12e/0x380
        ? copy_sec_ctx+0x270/0x270
        ? netlink_ack+0xd90/0xd90
        ? netlink_deliver_tap+0xcd/0xb60
        xfrm_netlink_rcv+0x6d/0x90
        netlink_unicast+0x42f/0x740
        ? netlink_attachskb+0x730/0x730
        ? lock_acquire+0x1be/0x520
        netlink_sendmsg+0x745/0xbe0
        ? netlink_unicast+0x740/0x740
        ? __might_fault+0xbb/0x170
        ? netlink_unicast+0x740/0x740
        __sock_sendmsg+0xc5/0x190
        ? fdget+0x163/0x1d0
        __sys_sendto+0x1fe/0x2c0
        ? __x64_sys_getpeername+0xb0/0xb0
        ? do_user_addr_fault+0x856/0xe30
        ? lock_acquire+0x1be/0x520
        ? __task_pid_nr_ns+0x117/0x410
        ? lock_downgrade+0x6b0/0x6b0
        __x64_sys_sendto+0xdc/0x1b0
        ? lockdep_hardirqs_on_prepare+0x284/0x400
        do_syscall_64+0x6d/0x140
        entry_SYSCALL_64_after_hwframe+0x4b/0x53
       RIP: 0033:0x7f7d31291ba4
       Code: 7d e8 89 4d d4 e8 4c 42 f7 ff 44 8b 4d d0 4c 8b 45 c8 89 c3 44 8b 55 d4 8b 7d e8 b8 2c 00 00 00 48 8b 55 d8 48 8b 75 e0 0f 05 <48> 3d 00 f0 ff ff 77 34 89 df 48 89 45 e8 e8 99 42 f7 ff 48 8b 45
       RSP: 002b:00007f7d2ccd94f0 EFLAGS: 00000297 ORIG_RAX: 000000000000002c
       RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007f7d31291ba4
       RDX: 0000000000000028 RSI: 00007f7d2ccd96a0 RDI: 000000000000000a
       RBP: 00007f7d2ccd9530 R08: 00007f7d2ccd9598 R09: 000000000000000c
       R10: 0000000000000000 R11: 0000000000000297 R12: 0000000000000028
       R13: 00007f7d2ccd9598 R14: 00007f7d2ccd96a0 R15: 00000000000000e1
        </TASK>
      
      Fixes: 4c24272b ("net/mlx5e: Listen to ARP events to update IPsec L2 headers in tunnel mode")
      Signed-off-by: default avatarLeon Romanovsky <leonro@nvidia.com>
      Signed-off-by: default avatarTariq Toukan <tariqt@nvidia.com>
      Signed-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      2c368809
    • Mark Zhang's avatar
      net/mlx5: Clear port select structure when fail to create · 5641e82c
      Mark Zhang authored
      
      Clear the port select structure on error so no stale values left after
      definers are destroyed. That's because the mlx5_lag_destroy_definers()
      always try to destroy all lag definers in the tt_map, so in the flow
      below lag definers get double-destroyed and cause kernel crash:
      
        mlx5_lag_port_sel_create()
          mlx5_lag_create_definers()
            mlx5_lag_create_definer()     <- Failed on tt 1
              mlx5_lag_destroy_definers() <- definers[tt=0] gets destroyed
        mlx5_lag_port_sel_create()
          mlx5_lag_create_definers()
            mlx5_lag_create_definer()     <- Failed on tt 0
              mlx5_lag_destroy_definers() <- definers[tt=0] gets double-destroyed
      
       Unable to handle kernel NULL pointer dereference at virtual address 0000000000000008
       Mem abort info:
         ESR = 0x0000000096000005
         EC = 0x25: DABT (current EL), IL = 32 bits
         SET = 0, FnV = 0
         EA = 0, S1PTW = 0
         FSC = 0x05: level 1 translation fault
       Data abort info:
         ISV = 0, ISS = 0x00000005, ISS2 = 0x00000000
         CM = 0, WnR = 0, TnD = 0, TagAccess = 0
         GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
       user pgtable: 64k pages, 48-bit VAs, pgdp=0000000112ce2e00
       [0000000000000008] pgd=0000000000000000, p4d=0000000000000000, pud=0000000000000000
       Internal error: Oops: 0000000096000005 [#1] PREEMPT SMP
       Modules linked in: iptable_raw bonding ip_gre ip6_gre gre ip6_tunnel tunnel6 geneve ip6_udp_tunnel udp_tunnel ipip tunnel4 ip_tunnel rdma_ucm(OE) rdma_cm(OE) iw_cm(OE) ib_ipoib(OE) ib_cm(OE) ib_umad(OE) mlx5_ib(OE) ib_uverbs(OE) mlx5_fwctl(OE) fwctl(OE) mlx5_core(OE) mlxdevm(OE) ib_core(OE) mlxfw(OE) memtrack(OE) mlx_compat(OE) openvswitch nsh nf_conncount psample xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo xt_addrtype iptable_filter iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 br_netfilter bridge stp llc netconsole overlay efi_pstore sch_fq_codel zram ip_tables crct10dif_ce qemu_fw_cfg fuse ipv6 crc_ccitt [last unloaded: mlx_compat(OE)]
        CPU: 3 UID: 0 PID: 217 Comm: kworker/u53:2 Tainted: G           OE      6.11.0+ #2
        Tainted: [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
        Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015
        Workqueue: mlx5_lag mlx5_do_bond_work [mlx5_core]
        pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
        pc : mlx5_del_flow_rules+0x24/0x2c0 [mlx5_core]
        lr : mlx5_lag_destroy_definer+0x54/0x100 [mlx5_core]
        sp : ffff800085fafb00
        x29: ffff800085fafb00 x28: ffff0000da0c8000 x27: 0000000000000000
        x26: ffff0000da0c8000 x25: ffff0000da0c8000 x24: ffff0000da0c8000
        x23: ffff0000c31f81a0 x22: 0400000000000000 x21: ffff0000da0c8000
        x20: 0000000000000000 x19: 0000000000000001 x18: 0000000000000000
        x17: 0000000000000000 x16: 0000000000000000 x15: 0000ffff8b0c9350
        x14: 0000000000000000 x13: ffff800081390d18 x12: ffff800081dc3cc0
        x11: 0000000000000001 x10: 0000000000000b10 x9 : ffff80007ab7304c
        x8 : ffff0000d00711f0 x7 : 0000000000000004 x6 : 0000000000000190
        x5 : ffff00027edb3010 x4 : 0000000000000000 x3 : 0000000000000000
        x2 : ffff0000d39b8000 x1 : ffff0000d39b8000 x0 : 0400000000000000
        Call trace:
         mlx5_del_flow_rules+0x24/0x2c0 [mlx5_core]
         mlx5_lag_destroy_definer+0x54/0x100 [mlx5_core]
         mlx5_lag_destroy_definers+0xa0/0x108 [mlx5_core]
         mlx5_lag_port_sel_create+0x2d4/0x6f8 [mlx5_core]
         mlx5_activate_lag+0x60c/0x6f8 [mlx5_core]
         mlx5_do_bond_work+0x284/0x5c8 [mlx5_core]
         process_one_work+0x170/0x3e0
         worker_thread+0x2d8/0x3e0
         kthread+0x11c/0x128
         ret_from_fork+0x10/0x20
        Code: a9025bf5 aa0003f6 a90363f7 f90023f9 (f9400400)
        ---[ end trace 0000000000000000 ]---
      
      Fixes: dc48516e ("net/mlx5: Lag, add support to create definers for LAG")
      Signed-off-by: default avatarMark Zhang <markzhang@nvidia.com>
      Reviewed-by: default avatarLeon Romanovsky <leonro@nvidia.com>
      Reviewed-by: default avatarMark Bloch <mbloch@nvidia.com>
      Reviewed-by: default avatarJacob Keller <jacob.e.keller@intel.com>
      Signed-off-by: default avatarTariq Toukan <tariqt@nvidia.com>
      Signed-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      5641e82c
    • Chris Mi's avatar
      net/mlx5: SF, Fix add port error handling · 2011a2a1
      Chris Mi authored
      
      If failed to add SF, error handling doesn't delete the SF from the
      SF table. But the hw resources are deleted. So when unload driver,
      hw resources will be deleted again. Firmware will report syndrome
      0x68def3 which means "SF is not allocated can not deallocate".
      
      Fix it by delete SF from SF table if failed to add SF.
      
      Fixes: 2597ee19 ("net/mlx5: Call mlx5_sf_id_erase() once in mlx5_sf_dealloc()")
      Signed-off-by: default avatarChris Mi <cmi@nvidia.com>
      Reviewed-by: default avatarShay Drori <shayd@nvidia.com>
      Reviewed-by: default avatarJacob Keller <jacob.e.keller@intel.com>
      Signed-off-by: default avatarTariq Toukan <tariqt@nvidia.com>
      Signed-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      2011a2a1
    • Yishai Hadas's avatar
      net/mlx5: Fix a lockdep warning as part of the write combining test · 1b10a519
      Yishai Hadas authored
      
      Fix a lockdep warning [1] observed during the write combining test.
      
      The warning indicates a potential nested lock scenario that could lead
      to a deadlock.
      
      However, this is a false positive alarm because the SF lock and its
      parent lock are distinct ones.
      
      The lockdep confusion arises because the locks belong to the same object
      class (i.e., struct mlx5_core_dev).
      
      To resolve this, the code has been refactored to avoid taking both
      locks. Instead, only the parent lock is acquired.
      
      [1]
      raw_ethernet_bw/2118 is trying to acquire lock:
      [  213.619032] ffff88811dd75e08 (&dev->wc_state_lock){+.+.}-{3:3}, at:
                     mlx5_wc_support_get+0x18c/0x210 [mlx5_core]
      [  213.620270]
      [  213.620270] but task is already holding lock:
      [  213.620943] ffff88810b585e08 (&dev->wc_state_lock){+.+.}-{3:3}, at:
                     mlx5_wc_support_get+0x10c/0x210 [mlx5_core]
      [  213.622045]
      [  213.622045] other info that might help us debug this:
      [  213.622778]  Possible unsafe locking scenario:
      [  213.622778]
      [  213.623465]        CPU0
      [  213.623815]        ----
      [  213.624148]   lock(&dev->wc_state_lock);
      [  213.624615]   lock(&dev->wc_state_lock);
      [  213.625071]
      [  213.625071]  *** DEADLOCK ***
      [  213.625071]
      [  213.625805]  May be due to missing lock nesting notation
      [  213.625805]
      [  213.626522] 4 locks held by raw_ethernet_bw/2118:
      [  213.627019]  #0: ffff88813f80d578 (&uverbs_dev->disassociate_srcu){.+.+}-{0:0},
                      at: ib_uverbs_ioctl+0xc4/0x170 [ib_uverbs]
      [  213.628088]  #1: ffff88810fb23930 (&file->hw_destroy_rwsem){.+.+}-{3:3},
                      at: ib_init_ucontext+0x2d/0xf0 [ib_uverbs]
      [  213.629094]  #2: ffff88810fb23878 (&file->ucontext_lock){+.+.}-{3:3},
                      at: ib_init_ucontext+0x49/0xf0 [ib_uverbs]
      [  213.630106]  #3: ffff88810b585e08 (&dev->wc_state_lock){+.+.}-{3:3},
                      at: mlx5_wc_support_get+0x10c/0x210 [mlx5_core]
      [  213.631185]
      [  213.631185] stack backtrace:
      [  213.631718] CPU: 1 UID: 0 PID: 2118 Comm: raw_ethernet_bw Not tainted
                     6.12.0-rc7_internal_net_next_mlx5_89a0ad0 #1
      [  213.632722] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
                     rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
      [  213.633785] Call Trace:
      [  213.634099]
      [  213.634393]  dump_stack_lvl+0x7e/0xc0
      [  213.634806]  print_deadlock_bug+0x278/0x3c0
      [  213.635265]  __lock_acquire+0x15f4/0x2c40
      [  213.635712]  lock_acquire+0xcd/0x2d0
      [  213.636120]  ? mlx5_wc_support_get+0x18c/0x210 [mlx5_core]
      [  213.636722]  ? mlx5_ib_enable_lb+0x24/0xa0 [mlx5_ib]
      [  213.637277]  __mutex_lock+0x81/0xda0
      [  213.637697]  ? mlx5_wc_support_get+0x18c/0x210 [mlx5_core]
      [  213.638305]  ? mlx5_wc_support_get+0x18c/0x210 [mlx5_core]
      [  213.638902]  ? rcu_read_lock_sched_held+0x3f/0x70
      [  213.639400]  ? mlx5_wc_support_get+0x18c/0x210 [mlx5_core]
      [  213.640016]  mlx5_wc_support_get+0x18c/0x210 [mlx5_core]
      [  213.640615]  set_ucontext_resp+0x68/0x2b0 [mlx5_ib]
      [  213.641144]  ? debug_mutex_init+0x33/0x40
      [  213.641586]  mlx5_ib_alloc_ucontext+0x18e/0x7b0 [mlx5_ib]
      [  213.642145]  ib_init_ucontext+0xa0/0xf0 [ib_uverbs]
      [  213.642679]  ib_uverbs_handler_UVERBS_METHOD_GET_CONTEXT+0x95/0xc0
                      [ib_uverbs]
      [  213.643426]  ? _copy_from_user+0x46/0x80
      [  213.643878]  ib_uverbs_cmd_verbs+0xa6b/0xc80 [ib_uverbs]
      [  213.644426]  ? ib_uverbs_handler_UVERBS_METHOD_INVOKE_WRITE+0x130/0x130
                     [ib_uverbs]
      [  213.645213]  ? __lock_acquire+0xa99/0x2c40
      [  213.645675]  ? lock_acquire+0xcd/0x2d0
      [  213.646101]  ? ib_uverbs_ioctl+0xc4/0x170 [ib_uverbs]
      [  213.646625]  ? reacquire_held_locks+0xcf/0x1f0
      [  213.647102]  ? do_user_addr_fault+0x45d/0x770
      [  213.647586]  ib_uverbs_ioctl+0xe0/0x170 [ib_uverbs]
      [  213.648102]  ? ib_uverbs_ioctl+0xc4/0x170 [ib_uverbs]
      [  213.648632]  __x64_sys_ioctl+0x4d3/0xaa0
      [  213.649060]  ? do_user_addr_fault+0x4a8/0x770
      [  213.649528]  do_syscall_64+0x6d/0x140
      [  213.649947]  entry_SYSCALL_64_after_hwframe+0x4b/0x53
      [  213.650478] RIP: 0033:0x7fa179b0737b
      [  213.650893] Code: ff ff ff 85 c0 79 9b 49 c7 c4 ff ff ff ff 5b 5d 4c
                     89 e0 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa b8
                     10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d
                     7d 2a 0f 00 f7 d8 64 89 01 48
      [  213.652619] RSP: 002b:00007ffd2e6d46e8 EFLAGS: 00000246 ORIG_RAX:
                     0000000000000010
      [  213.653390] RAX: ffffffffffffffda RBX: 00007ffd2e6d47f8 RCX:
                     00007fa179b0737b
      [  213.654084] RDX: 00007ffd2e6d47e0 RSI: 00000000c0181b01 RDI:
                     0000000000000003
      [  213.654767] RBP: 00007ffd2e6d47c0 R08: 00007fa1799be010 R09:
                     0000000000000002
      [  213.655453] R10: 00007ffd2e6d4960 R11: 0000000000000246 R12:
                     00007ffd2e6d487c
      [  213.656170] R13: 0000000000000027 R14: 0000000000000001 R15:
                     00007ffd2e6d4f70
      
      Fixes: d98995b4 ("net/mlx5: Reimplement write combining test")
      Signed-off-by: default avatarYishai Hadas <yishaih@nvidia.com>
      Reviewed-by: default avatarMichael Guralnik <michaelgur@nvidia.com>
      Reviewed-by: default avatarLarysa Zaremba <larysa.zaremba@intel.com>
      Signed-off-by: default avatarTariq Toukan <tariqt@nvidia.com>
      Signed-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      1b10a519
    • Patrisious Haddad's avatar
      net/mlx5: Fix RDMA TX steering prio · c08d3e62
      Patrisious Haddad authored
      
      User added steering rules at RDMA_TX were being added to the first prio,
      which is the counters prio.
      Fix that so that they are correctly added to the BYPASS_PRIO instead.
      
      Fixes: 24670b1a ("net/mlx5: Add support for RDMA TX steering")
      Signed-off-by: default avatarPatrisious Haddad <phaddad@nvidia.com>
      Reviewed-by: default avatarMark Bloch <mbloch@nvidia.com>
      Reviewed-by: default avatarJacob Keller <jacob.e.keller@intel.com>
      Signed-off-by: default avatarTariq Toukan <tariqt@nvidia.com>
      Signed-off-by: default avatarPaolo Abeni <pabeni@redhat.com>
      c08d3e62
  3. Jan 15, 2025
Loading