  1. Mar 25, 2025
    • ARM: dts: qcom-msm8960: add missing clocks to the timer node · 5b64b9a2
      LogicalErzor authored
      In order to fix DT schema warning and describe hardware properly, add
      missing sleep clock to the timer node.
      
      Solved by Dmitry Baryshkov on the APQ8064 SoC
      Link: https://lore.kernel.org/all/20250318-fix-nexus-4-v2-6-bcedd1406790@oss.qualcomm.com/
      
      
      
      Signed-off-by: Rudraksha Gupta <guptarud@gmail.com>
    • ARM: dts: qcom: msm8960: Add thermal sensor (tsens) · 84e9f9e7
      LogicalErzor authored
      
      Add support for the thermal sensor (tsens) on the MSM8960 by copying and
      modifying the relevant nodes from the APQ8064 dtsi. These changes enable
      thermal management.
      
      Signed-off-by: Rudraksha Gupta <guptarud@gmail.com>
    • dt-bindings: nvmem: Add compatible for MSM8960 · 154d0147
      LogicalErzor authored
      
      Document the QFPROM on MSM8960.
      
      Signed-off-by: Rudraksha Gupta <guptarud@gmail.com>
    • ARM: dts: qcom: msm8960: Add BAM · 44ed498d
      LogicalErzor authored
      
      Copy bam nodes from qcom-ipq8064.dtsi and change
      the reg values to match msm8960.
      
      Co-developed-by: Sam Day <me@samcday.com>
      Signed-off-by: Sam Day <me@samcday.com>
      Reviewed-by: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
      Signed-off-by: Rudraksha Gupta <guptarud@gmail.com>
    • arm: rust: Enable Rust support for ARMv7 · d8e68a5b
      Christian Schrefl authored and LogicalErzor committed
      
      This commit allows building ARMv7 kernels with Rust support.
      
      The rust core library expects some __aeabi_... functions
      that are not implemented in the kernel.
      Those functions are some float operations and __aeabi_uldivmod.
      For now those are implemented with define_panicking_intrinsics!.
      
      This is based on the code by Sven Van Asbroeck from the original
      rust branch and inspired by the AArch64 version by Jamie Cunliffe.
      
      I have tested the rust samples and a custom simple MMIO module
      on hardware (De1SoC FPGA + Arm A9 CPU).
      
      Tested-by: Rudraksha Gupta <guptarud@gmail.com>
      Reviewed-by: Alice Ryhl <aliceryhl@google.com>
      Acked-by: Miguel Ojeda <ojeda@kernel.org>
      Tested-by: Miguel Ojeda <ojeda@kernel.org>
      Acked-by: Ard Biesheuvel <ardb@kernel.org>
      Signed-off-by: Christian Schrefl <chrisi.schrefl@gmail.com>
    • Merge tag 'rcu-next-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/rcu/linux · 3ba7dfb8
      Linus Torvalds authored
      Pull RCU updates from Boqun Feng:
       "Documentation:
         - Add broken-timing possibility to stallwarn.rst
         - Improve discussion of this_cpu_ptr(), add raw_cpu_ptr()
         - Document self-propagating callbacks
         - Point call_srcu() to call_rcu() for detailed memory ordering
         - Add CONFIG_RCU_LAZY delays to call_rcu() kernel-doc header
         - Clarify RCU_LAZY and RCU_LAZY_DEFAULT_OFF help text
         - Remove references to old grace-period-wait primitives
      
        srcu:
         - Introduce srcu_read_{un,}lock_fast(), which is similar to
           srcu_read_{un,}lock_lite(): avoid smp_mb()s in lock and unlock
           at the cost of calling synchronize_rcu() in synchronize_srcu()
      
           Moreover, by returning the percpu offset of the counter at
           srcu_read_lock_fast() time, srcu_read_unlock_fast() can avoid
           extra pointer dereferencing, which makes it faster than
           srcu_read_{un,}lock_lite()
      
           srcu_read_{un,}lock_fast() are intended to replace
           rcu_read_{un,}lock_trace() if possible
      
        RCU torture:
         - Add get_torture_init_jiffies() to return the start time of the test
         - Add a test_boost_holdoff module parameter to allow delaying
           boosting tests when building rcutorture as built-in
         - Add grace period sequence number logging at the beginning and end
           of failure/close-call results
         - Switch to hexadecimal for the expedited grace period sequence
           number in the rcu_exp_grace_period trace point
         - Make cur_ops->format_gp_seqs take buffer length
         - Move RCU_TORTURE_TEST_{CHK_RDR_STATE,LOG_CPU} to bool
         - Complain when invalid SRCU reader_flavor is specified
         - Add FORCE_NEED_SRCU_NMI_SAFE Kconfig for testing, which forces SRCU
           to use atomics even when percpu ops are NMI safe, and use the
           Kconfig for SRCU lockdep testing
      
        Misc:
         - Split rcu_report_exp_cpu_mult() mask parameter and use for tracing
         - Remove READ_ONCE() for rdp->gpwrap access in __note_gp_changes()
         - Fix get_state_synchronize_rcu_full() GP-start detection
         - Move RCU Tasks self-tests to core_initcall()
         - Print segment lengths in show_rcu_nocb_gp_state()
         - Make RCU watch ct_kernel_exit_state() warning
         - Flush console log from kernel_power_off()
         - rcutorture: Allow a negative value for nfakewriters
         - rcu: Update TREE05.boot to test normal synchronize_rcu()
         - rcu: Use _full() API to debug synchronize_rcu()
      
        Make RCU handle PREEMPT_LAZY better:
         - Fix header guard for rcu_all_qs()
         - rcu: Rename PREEMPT_AUTO to PREEMPT_LAZY
         - Update __cond_resched comment about RCU quiescent states
         - Handle unstable rdp in rcu_read_unlock_strict()
         - Handle quiescent states for PREEMPT_RCU=n, PREEMPT_COUNT=y
         - osnoise: Provide quiescent states
         - Adjust rcutorture with possible PREEMPT_RCU=n && PREEMPT_COUNT=y
           combination
         - Limit PREEMPT_RCU configurations
         - Make rcutorture scenario TREE07 and scenario TREE10 use
           PREEMPT_LAZY=y"
      
      * tag 'rcu-next-v6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/rcu/linux: (59 commits)
        rcutorture: Make scenario TREE07 build CONFIG_PREEMPT_LAZY=y
        rcutorture: Make scenario TREE10 build CONFIG_PREEMPT_LAZY=y
        rcu: limit PREEMPT_RCU configurations
        rcutorture: Update ->extendables check for lazy preemption
        rcutorture: Update rcutorture_one_extend_check() for lazy preemption
        osnoise: provide quiescent states
        rcu: Use _full() API to debug synchronize_rcu()
        rcu: Update TREE05.boot to test normal synchronize_rcu()
        rcutorture: Allow a negative value for nfakewriters
        Flush console log from kernel_power_off()
        context_tracking: Make RCU watch ct_kernel_exit_state() warning
        rcu/nocb: Print segment lengths in show_rcu_nocb_gp_state()
        rcu-tasks: Move RCU Tasks self-tests to core_initcall()
        rcu: Fix get_state_synchronize_rcu_full() GP-start detection
        torture: Make SRCU lockdep testing use srcu_read_lock_nmisafe()
        srcu: Add FORCE_NEED_SRCU_NMI_SAFE Kconfig for testing
        rcutorture: Complain when invalid SRCU reader_flavor is specified
        rcutorture: Move RCU_TORTURE_TEST_{CHK_RDR_STATE,LOG_CPU} to bool
        rcutorture: Make cur_ops->format_gp_seqs take buffer length
        rcutorture: Add ftrace-compatible timestamp to GP# failure/close-call output
        ...
    • Merge tag 'bitmap-for-6.15' of https://github.com/norov/linux · 2f2d5294
      Linus Torvalds authored
      Pull bitmap updates from Yury Norov:
      
       - cpumask_next_wrap() rework (me)
      
       - GENMASK() simplification (I Hsin)
      
       - rust bindings for cpumasks (Viresh and me)
      
       - scattered cleanups (Andy, Tamir, Vincent, Ignacio and Joel)
      
      * tag 'bitmap-for-6.15' of https://github.com/norov/linux: (22 commits)
        cpumask: align text in comment
        riscv: fix test_and_{set,clear}_bit ordering documentation
        treewide: fix typo 'unsigned __init128' -> 'unsigned __int128'
        MAINTAINERS: add rust bindings entry for bitmap API
        rust: Add cpumask helpers
        uapi: Revert "bitops: avoid integer overflow in GENMASK(_ULL)"
        cpumask: drop cpumask_next_wrap_old()
        PCI: hv: Switch hv_compose_multi_msi_req_get_cpu() to using cpumask_next_wrap()
        scsi: lpfc: rework lpfc_next_{online,present}_cpu()
        scsi: lpfc: switch lpfc_irq_rebalance() to using cpumask_next_wrap()
        s390: switch stop_machine_yield() to using cpumask_next_wrap()
        padata: switch padata_find_next() to using cpumask_next_wrap()
        cpumask: use cpumask_next_wrap() where appropriate
        cpumask: re-introduce cpumask_next{,_and}_wrap()
        cpumask: deprecate cpumask_next_wrap()
        powerpc/xmon: simplify xmon_batch_next_cpu()
        ibmvnic: simplify ibmvnic_set_queue_affinity()
        virtio_net: simplify virtnet_set_affinity()
        objpool: rework objpool_pop()
        cpumask: add for_each_{possible,online}_cpu_wrap
        ...
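
The wrap-around iteration behind the cpumask_next_wrap() rework can be illustrated with a small userspace sketch. Everything here is hypothetical: `demo_next_wrap()`, `demo_test_cpu()` and `DEMO_NR_CPUS` are illustrative stand-ins operating on a single `unsigned long`, not the kernel's cpumask API or its implementation. The assumed behavior is: return the next set bit after `n`, wrap to bit 0 at the end of the mask, and return the mask size when no bit is set.

```c
#include <assert.h>

/* Hypothetical userspace stand-in for the number of CPU bits in the mask. */
#define DEMO_NR_CPUS 8u

/* Return 1 if bit `cpu` is set in `mask`, else 0. */
static int demo_test_cpu(unsigned long mask, unsigned int cpu)
{
    return (int)((mask >> cpu) & 1UL);
}

/*
 * Return the first set bit strictly after `n`, wrapping around to bit 0
 * once the end of the mask is reached. Passing n == -1 starts the search
 * at bit 0. If `n` is the only set bit, the search wraps back to `n`.
 * Returns DEMO_NR_CPUS when the mask is empty.
 */
static unsigned int demo_next_wrap(int n, unsigned long mask)
{
    unsigned int step;

    assert(n >= -1 && n < (int)DEMO_NR_CPUS);

    for (step = 1; step <= DEMO_NR_CPUS; step++) {
        unsigned int cpu = (unsigned int)(n + (int)step) % DEMO_NR_CPUS;

        if (demo_test_cpu(mask, cpu))
            return cpu;
    }
    return DEMO_NR_CPUS;
}
```

With a mask of 0x24 (bits 2 and 5 set), calling `demo_next_wrap()` repeatedly with the previous result visits 2, 5, 2, 5 and so on, which is the round-robin pattern the converted callers in the shortlog above rely on.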
    • Merge tag 'docs-6.15' of git://git.lwn.net/linux · f81c2b81
      Linus Torvalds authored
      Pull documentation updates from Jonathan Corbet:
       "It has been a reasonably busy cycle for docs...
      
         - Significant changes throughout the tree to bring Python code up to
           current standards and raise the minimum Python required to 3.9
      
           Much of this is preparatory to replacing the ancient Perl
           scripts/kernel-doc horror with a slightly less horrifying Python
           implementation, expected for 6.16
      
         - Update the minimum Sphinx required to 3.4.3, allowing us to remove
           a bunch of older compatibility code
      
         - Rework and improve the generation of the ABI documentation
      
        (All of the above done by Mauro)
      
         - Lots of translation updates. Alex Shi and Yanteng Si are taking on
           responsibility for the Chinese translations going forward; that
           work will still get to you via docs-next
      
         - Try to standardize the format for indicating a developer's
           affiliation in commit tags
      
         - Clarify the TAB's role in CoC enforcement actions
      
         - Try to spell out the rules for when a commit tag can name another
           developer without their explicit permission
      
        Plus lots of other typo fixes and updates"
      
      * tag 'docs-6.15' of git://git.lwn.net/linux: (98 commits)
        docs/zh_CN: fix spelling mistake
        docs/Chinese: change the disclaimer words
        docs/zh_CN: Add snp-tdx-threat-model index Chinese translation
        docs: driver-api: firmware: clarify userspace requirements
        docs: clarify rules wrt tagging other people
        docs: Remove outdated highuid.rst documentation
        Documentation: dma-buf: heaps: Add heap name definitions
        docs/.../submit-checklist: Use Documentation/admin-guide/abi.rst for cross-ref of README
        docs: Correct installation instruction
        Documentation: kcsan: fix "Plain Accesses and Data Races" URL in kcsan.rst
        Documentation/CoC: Spell out the TAB role in enforcement decisions
        Documentation: ocxl.rst: Update consortium site
        scripts: get_feat.pl: substitute s390x with s390
        scripts/kernel-doc: drop dead code for Wcontents_before_sections
        scripts/kernel-doc: don't add not needed new lines
        docs: driver-api/infiniband.rst: fix Kerneldoc markup
        drivers: firewire: firewire-cdev.h: fix identation on a kernel-doc markup
        drivers: media: intel-ipu3.h: fix identation on a kernel-doc markup
        include/asm-generic/io.h: fix kerneldoc markup
        Docs/arch/arm64: Fix spelling in amu.rst
        ...
    • Merge tag 'stop-machine.2025.03.21a' of... · 8541bc1a
      Linus Torvalds authored
      Merge tag 'stop-machine.2025.03.21a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu
      
      Pull stop-machine update from Paul McKenney:
      
       - Add a comment for the call to rcu_momentary_eqs() from
         multi_cpu_stop() explaining that its purpose is to suppress
         false-positive RCU CPU stall warnings
      
      * tag 'stop-machine.2025.03.21a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu:
        stop-machine: Add comment for rcu_momentary_eqs()
    • Merge tag 'lkmm.2025.03.21a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu · 72b40807
      Linus Torvalds authored
      Pull kernel memory model updates from Paul McKenney:
       "Add more atomic operations, rework tags, and update documentation:
      
         - Add additional atomic operations (Puranjay Mohan)
      
         - Make better use of herd7 tags (Jonas Oberhauser)
      
         - Update documentation (Akira Yokosawa)
      
        These changes require v7.58 of the herd7 and klitmus tools, up from
        v7.52"
      
      * tag 'lkmm.2025.03.21a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu:
        tools/memory-model: glossary.txt: Fix indents
        tools/memory-model/README: Fix typo
        tools/memory-model: Distinguish between syntactic and semantic tags
        tools/memory-model: Switch to softcoded herd7 tags
        tools/memory-model: Define effect of Mb tags on RMWs in tools/...
        tools/memory-model: Define applicable tags on operation in tools/...
        tools/memory-model: Legitimize current use of tags in LKMM macros
        tools/memory-model: Add atomic_andnot() with its variants
        tools/memory-model: Add atomic_and()/or()/xor() and add_negative
    • Merge tag 'nolibc-20250308-for-6.15-1' of... · 418becac
      Linus Torvalds authored
      Merge tag 'nolibc-20250308-for-6.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu
      
      Pull nolibc updates from Paul McKenney:
       - 32bit s390 support
       - opendir() and friends
       - openat() support
       - sscanf() support
       - various cleanups
      
      [ Paul has just forwarded the pull request from Thomas Weißschuh, so
        the tag signature is from Thomas, not Paul   - Linus ]
      
      * tag 'nolibc-20250308-for-6.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: (26 commits)
        tools/nolibc: don't use asm/ UAPI headers
        selftests/nolibc: stop testing constructor order
        selftests/nolibc: use O_RDONLY flag instead of 0
        tools/nolibc: drop outdated example from overview comment
        tools/nolibc: process open() vararg as mode_t
        tools/nolibc: always use openat(2) instead of open(2)
        tools/nolibc: add support for openat(2)
        selftests/nolibc: add armthumb configuration
        selftests/nolibc: explicitly enable ARM mode
        Revert "selftests: kselftest: Fix build failure with NOLIBC"
        tools/nolibc: add support for [v]sscanf()
        tools/nolibc: add support for 32-bit s390
        selftests/nolibc: rename s390 to s390x
        selftests/nolibc: only run constructor tests on nolibc
        selftests/nolibc: split up architecture list in run-tests.sh
        tools/nolibc: add support for directory access
        tools/nolibc: add support for sys_llseek()
        selftests/nolibc: always keep test kernel configuration up to date
        selftests/nolibc: execute defconfig before other targets
        selftests/nolibc: drop call to mrproper target
        ...
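
As a rough illustration of the calls this pull adds to nolibc (sscanf() and the opendir() family), the sketch below uses only standard C and POSIX interfaces. It is written against the regular libc headers; building it against nolibc, and which conversion specifiers nolibc's sscanf() accepts, are assumptions that are not confirmed here. The helper names `demo_parse_version()` and `demo_count_entries()` are hypothetical.

```c
#include <dirent.h>
#include <stdio.h>

/*
 * Parse a "major.minor" string such as "6.15" with sscanf().
 * Returns 1 on success, 0 if the string does not match.
 */
static int demo_parse_version(const char *s, int *major, int *minor)
{
    return sscanf(s, "%d.%d", major, minor) == 2;
}

/*
 * Count the entries in a directory using opendir()/readdir()/closedir().
 * Returns the number of entries (including "." and ".." where the
 * filesystem reports them), or -1 if the directory cannot be opened.
 */
static int demo_count_entries(const char *path)
{
    DIR *dir = opendir(path);
    int count = 0;

    if (!dir)
        return -1;

    while (readdir(dir) != NULL)
        count++;

    closedir(dir);
    return count;
}
```

A caller would pass a version string and two `int` out-parameters to `demo_parse_version()`, and a directory path to `demo_count_entries()`.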
    • Merge tag 'sched_ext-for-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext · bcb04425
      Linus Torvalds authored
      Pull sched_ext updates from Tejun Heo:
      
       - Add mechanism to count and report internal events. This significantly
         improves visibility on subtle corner conditions.
      
       - The default idle CPU selection logic is revamped and improved in
         multiple ways including being made topology aware.
      
       - sched_ext was disabling ttwu_queue for simplicity, which can be
         costly when hardware topology is more complex. Implement
         SCX_OPS_ALLOWED_QUEUED_WAKEUP so that BPF schedulers can selectively
         enable ttwu_queue.
      
       - tools/sched_ext updates to improve compatibility among others.
      
       - Other misc updates and fixes.
      
       - sched_ext/for-6.14-fixes were pulled a few times to receive
         prerequisite fixes and resolve conflicts.
      
      * tag 'sched_ext-for-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext: (42 commits)
        sched_ext: idle: Refactor scx_select_cpu_dfl()
        sched_ext: idle: Honor idle flags in the built-in idle selection policy
        sched_ext: Skip per-CPU tasks in scx_bpf_reenqueue_local()
        sched_ext: Add trace point to track sched_ext core events
        sched_ext: Change the event type from u64 to s64
        sched_ext: Documentation: add task lifecycle summary
        tools/sched_ext: Provide a compatible helper for scx_bpf_events()
        selftests/sched_ext: Add NUMA-aware scheduler test
        tools/sched_ext: Provide consistent access to scx flags
        sched_ext: idle: Fix scx_bpf_pick_any_cpu_node() behavior
        sched_ext: idle: Introduce scx_bpf_nr_node_ids()
        sched_ext: idle: Introduce node-aware idle cpu kfunc helpers
        sched_ext: idle: Per-node idle cpumasks
        sched_ext: idle: Introduce SCX_OPS_BUILTIN_IDLE_PER_NODE
        sched_ext: idle: Make idle static keys private
        sched/topology: Introduce for_each_node_numadist() iterator
        mm/numa: Introduce nearest_node_nodemask()
        nodemask: numa: reorganize inclusion path
        nodemask: add nodes_copy()
        tools/sched_ext: Sync with scx repo
        ...
  2. Mar 24, 2025
    • Merge tag 'cgroup-for-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup · 94dc216a
      Linus Torvalds authored
      Pull cgroup updates from Tejun Heo:
      
       - Add deprecation info messages to cgroup1-only features
      
       - rstat updates including a bug fix and breaking up a critical section
         to reduce interrupt latency impact
      
       - Other misc and doc updates
      
      * tag 'cgroup-for-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
        cgroup: rstat: Cleanup flushing functions and locking
        cgroup/rstat: avoid disabling irqs for O(num_cpu)
        mm: Fix a build breakage in memcontrol-v1.c
        blk-cgroup: Simplify policy files registration
        cgroup: Update file naming comment
        cgroup: Add deprecation message to legacy freezer controller
        mm: Add transformation message for per-memcg swappiness
        RFC cgroup/cpuset-v1: Add deprecation messages to sched_relax_domain_level
        cgroup/cpuset-v1: Add deprecation messages to memory_migrate
        cgroup/cpuset-v1: Add deprecation messages to mem_exclusive and mem_hardwall
        cgroup: Print message when /proc/cgroups is read on v2-only system
        cgroup/blkio: Add deprecation messages to reset_stats
        cgroup/cpuset-v1: Add deprecation messages to memory_spread_page and memory_spread_slab
        cgroup/cpuset-v1: Add deprecation messages to sched_load_balance and memory_pressure_enabled
        cgroup, docs: Be explicit about independence of RT_GROUP_SCHED and non-cpu controllers
        cgroup/rstat: Fix forceidle time in cpu.stat
        cgroup/misc: Remove unused misc_cg_res_total_usage
        cgroup/cpuset: Move procfs cpuset attribute under cgroup-v1.c
        cgroup: update comment about dropping cgroup kn refs
    • Merge tag 'wq-for-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq · b05f8fbe
      Linus Torvalds authored
      Pull workqueue update from Tejun Heo:
       "Just one commit to expose system BH workqueues to rust"
      
      * tag 'wq-for-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
        rust: workqueue: define built-in bh queues
    • Merge tag 'slab-for-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab · 05b00ffd
      Linus Torvalds authored
      Pull slab updates from Vlastimil Babka:
      
       - Move the TINY_RCU kvfree_rcu() implementation from RCU to SLAB
         subsystem and cleanup its integration (Vlastimil Babka)
      
         Following the move of the TREE_RCU batching kvfree_rcu()
         implementation in 6.14, move also the simpler TINY_RCU variant.
         Refactor the #ifdef guards so that the simple implementation is also
         used with SLUB_TINY.
      
         Remove the need for RCU to recognize fake callback function pointers
         (__is_kvfree_rcu_offset()) when handling call_rcu() by implementing a
         callback that calculates the object's address from the embedded
         rcu_head address without knowing its offset.
      
       - Improve kmalloc cache randomization in kvmalloc (GONG Ruiqi)
      
         Due to an extra layer of function call, all kvmalloc() allocations
         used the same set of random caches. Thanks to moving the kvmalloc()
         implementation to slub.c, this is improved and randomization now
         works for kvmalloc.
      
       - Various improvements to debugging, testing and other cleanups (Hyesoo
         Yu, Lilith Gkini, Uladzislau Rezki, Matthew Wilcox, Kevin Brodsky, Ye
         Bin)
      
      * tag 'slab-for-6.15' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab:
        slub: Handle freelist cycle in on_freelist()
        mm/slab: call kmalloc_noprof() unconditionally in kmalloc_array_noprof()
        slab: Mark large folios for debugging purposes
        kunit, slub: Add test_kfree_rcu_wq_destroy use case
        mm, slab: cleanup slab_bug() parameters
        mm: slub: call WARN() when detecting a slab corruption
        mm: slub: Print the broken data before restoring them
        slab: Achieve better kmalloc caches randomization in kvmalloc
        slab: Adjust placement of __kvmalloc_node_noprof
        mm/slab: simplify SLAB_* flag handling
        slab: don't batch kvfree_rcu() with SLUB_TINY
        rcu, slab: use a regular callback function for kvfree_rcu
        rcu: remove trace_rcu_kvfree_callback
        slab, rcu: move TINY_RCU variant of kvfree_rcu() to SLAB
    • Merge tag 'pstore-v6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux · 95c61e1a
      Linus Torvalds authored
      Pull tiny pstore update from Kees Cook:
      
       - pstore: Change kmsg_bytes storage size to u32
      
      * tag 'pstore-v6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
        pstore: Change kmsg_bytes storage size to u32
    • Merge tag 'seccomp-v6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux · 11c2b2e3
      Linus Torvalds authored
      Pull seccomp updates from Kees Cook:
      
       - avoid the lock trip seccomp_filter_release in common case (Mateusz
         Guzik)
      
       - remove unused 'sd' argument throughout (Oleg Nesterov)
      
       - selftests/seccomp: Add hard-coded __NR_uretprobe for x86_64
      
      * tag 'seccomp-v6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
        seccomp: avoid the lock trip seccomp_filter_release in common case
        seccomp: remove the 'sd' argument from __seccomp_filter()
        seccomp: remove the 'sd' argument from __secure_computing()
        seccomp: fix the __secure_computing() stub for !HAVE_ARCH_SECCOMP_FILTER
        seccomp/mips: change syscall_trace_enter() to use secure_computing()
        selftests/seccomp: Add hard-coded __NR_uretprobe for x86_64
    • Merge tag 'hardening-v6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux · fc13a78e
      Linus Torvalds authored
      Pull hardening updates from Kees Cook:
       "As usual, it's scattered changes all over. Patches touching things
        outside of our traditional areas in the tree have been Acked by
        maintainers or were trivial changes:
      
         - loadpin: remove unsupported MODULE_COMPRESS_NONE (Arulpandiyan
           Vadivel)
      
         - samples/check-exec: Fix script name (Mickaël Salaün)
      
         - yama: remove needless locking in yama_task_prctl() (Oleg Nesterov)
      
         - lib/string_choices: Sort by function name (R Sundar)
      
         - hardening: Allow default HARDENED_USERCOPY to be set at compile
           time (Mel Gorman)
      
         - uaccess: Split out compile-time checks into ucopysize.h
      
         - kbuild: clang: Support building UM with SUBARCH=i386
      
         - x86: Enable i386 FORTIFY_SOURCE on Clang 16+
      
         - ubsan/overflow: Rework integer overflow sanitizer option
      
         - Add missing __nonstring annotations for callers of
           memtostr*()/strtomem*()
      
         - Add __must_be_noncstr() and have memtostr*()/strtomem*() check for
           it
      
         - Introduce __nonstring_array for silencing future GCC 15 warnings"
      
      * tag 'hardening-v6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (26 commits)
        compiler_types: Introduce __nonstring_array
        hardening: Enable i386 FORTIFY_SOURCE on Clang 16+
        x86/build: Remove -ffreestanding on i386 with GCC
        ubsan/overflow: Enable ignorelist parsing and add type filter
        ubsan/overflow: Enable pattern exclusions
        ubsan/overflow: Rework integer overflow sanitizer option to turn on everything
        samples/check-exec: Fix script name
        yama: don't abuse rcu_read_lock/get_task_struct in yama_task_prctl()
        kbuild: clang: Support building UM with SUBARCH=i386
        loadpin: remove MODULE_COMPRESS_NONE as it is no longer supported
        lib/string_choices: Rearrange functions in sorted order
        string.h: Validate memtostr*()/strtomem*() arguments more carefully
        compiler.h: Introduce __must_be_noncstr()
        nilfs2: Mark on-disk strings as nonstring
        uapi: stddef.h: Introduce __kernel_nonstring
        x86/tdx: Mark message.bytes as nonstring
        string: kunit: Mark nonstring test strings as __nonstring
        scsi: qla2xxx: Mark device strings as nonstring
        scsi: mpt3sas: Mark device strings as nonstring
        scsi: mpi3mr: Mark device strings as nonstring
        ...
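
The __nonstring work in this pull concerns fixed-size byte arrays that are deliberately not NUL-terminated. A minimal userspace sketch of the idea follows, assuming a compiler that may or may not support the `nonstring` attribute. `DEMO_NONSTRING`, `struct demo_record` and `demo_mem_to_str()` are hypothetical names; the helper is a simplified stand-in and not the kernel's memtostr() implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Use the compiler's nonstring attribute when available, else nothing. */
#if defined(__has_attribute)
# if __has_attribute(nonstring)
#  define DEMO_NONSTRING __attribute__((nonstring))
# endif
#endif
#ifndef DEMO_NONSTRING
# define DEMO_NONSTRING
#endif

/* A record with a fixed-width tag that carries no NUL terminator. */
struct demo_record {
    char tag[4] DEMO_NONSTRING;
};

/*
 * Copy a non-terminated byte array into a NUL-terminated destination,
 * truncating if the destination is too small. The source length is
 * passed explicitly because the source has no terminator to search for.
 */
static void demo_mem_to_str(char *dst, size_t dst_size,
                            const char *src, size_t src_size)
{
    size_t n;

    assert(dst_size > 0);
    n = src_size < dst_size - 1 ? src_size : dst_size - 1;
    memcpy(dst, src, n);
    dst[n] = '\0';
}
```

The point of the annotation is that `tag` is filled with `memcpy()` and read with an explicit length, never handed to functions such as `strlen()` that expect a terminator.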
    • Merge tag 'move-lib-kunit-v6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux · 06961fbb
      Linus Torvalds authored
      Pull lib kunit selftest move from Kees Cook:
       "This is a one-off tree to coordinate the move of selftests out of lib/
        and into lib/tests/. A separate tree was used for this to keep the
        paths sane with all the work in the same place.
      
         - move lib/ selftests into lib/tests/ (Kees Cook, Gabriela
           Bittencourt, Luis Felipe Hernandez, Lukas Bulwahn, Tamir
           Duberstein)
      
         - lib/math: Add int_log test suite (Bruno Sobreira França)
      
         - lib/math: Add Kunit test suite for gcd() (Yu-Chun Lin)
      
         - lib/tests/kfifo_kunit.c: add tests for the kfifo structure (Diego
           Vieira)
      
         - unicode: refactor selftests into KUnit (Gabriela Bittencourt)
      
         - lib/prime_numbers: convert self-test to KUnit (Tamir Duberstein)
      
         - printf: convert self-test to KUnit (Tamir Duberstein)
      
         - scanf: convert self-test to KUnit (Tamir Duberstein)"
      
      * tag 'move-lib-kunit-v6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: (21 commits)
        scanf: break kunit into test cases
        scanf: convert self-test to KUnit
        scanf: remove redundant debug logs
        scanf: implicate test line in failure messages
        printf: implicate test line in failure messages
        printf: break kunit into test cases
        printf: convert self-test to KUnit
        kunit/fortify: Replace "volatile" with OPTIMIZER_HIDE_VAR()
        kunit/fortify: Expand testing of __compiletime_strlen()
        kunit/stackinit: Use fill byte different from Clang i386 pattern
        kunit/overflow: Fix DEFINE_FLEX tests for counted_by
        selftests: remove reference to prime_numbers.sh
        MAINTAINERS: adjust entries in FORTIFY_SOURCE and KERNEL HARDENING
        lib/prime_numbers: convert self-test to KUnit
        lib/math: Add Kunit test suite for gcd()
        unicode: kunit: change tests filename and path
        unicode: kunit: refactor selftest to kunit tests
        lib/tests/kfifo_kunit.c: add tests for the kfifo structure
        lib: Move KUnit tests into tests/ subdirectory
        lib/math: Add int_log test suite
        ...
    • Merge tag 'execve-v6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux · 4f773fcb
      Linus Torvalds authored
      Pull execve updates from Kees Cook:
      
       - elf: Define and use note name macros (Akihiko Odaki)
      
       - elf: add remaining SHF_ flag macros (Timur Tabi)
      
       - binfmt: Remove loader from linux_binprm struct (Yonatan Goldschmidt)
      
       - binfmt_elf_fdpic: fix variable set but not used warning (sunliming)
      
      * tag 'execve-v6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
        binfmt_elf_fdpic: fix variable set but not used warning
        elf: add remaining SHF_ flag macros
        binfmt: Remove loader from linux_binprm struct
        crash: Remove KEXEC_CORE_NOTE_NAME
        s390/crash: Use note name macros
        crash: Use note name macros
        powerpc/crash: Use note name macros
        binfmt_elf: Use note name macros
        elf: Define note name macros
    • Merge tag 'kernel-6.15-rc1.tasklist_lock' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs · b0cb56cb
      Linus Torvalds authored
      Pull tasklist_lock optimizations from Christian Brauner:
       "According to the performance testbots this brings a 23% performance
        increase when creating new processes:
      
         - Reduce tasklist_lock hold time on exit:
             - Perform add_device_randomness() without tasklist_lock
             - Perform free_pid() calls outside of tasklist_lock
      
         - Drop irq disablement around pidmap_lock
      
         - Add some tasklist_lock asserts
      
         - Call flush_sigqueue() lockless by changing release_task()
      
         - Don't pointlessly clear TIF_SIGPENDING in __exit_signal() ->
           clear_tsk_thread_flag()"
      
      * tag 'kernel-6.15-rc1.tasklist_lock' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
        pid: drop irq disablement around pidmap_lock
        pid: perform free_pid() calls outside of tasklist_lock
        pid: sprinkle tasklist_lock asserts
        exit: hoist get_pid() in release_task() outside of tasklist_lock
        exit: perform add_device_randomness() without tasklist_lock
        exit: kill the pointless __exit_signal()->clear_tsk_thread_flag(TIF_SIGPENDING)
        exit: change the release_task() paths to call flush_sigqueue() lockless
      b0cb56cb
    • Linus Torvalds's avatar
      Merge tag 'vfs-6.15-rc1.rust' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs · 56e7a8b0
      Linus Torvalds authored
      Pull vfs rust updates from Christian Brauner:
       "This contains minor fixes and improvements to rust file bindings:
      
         - Optimize rust symbol generation for FileDescriptorReservation
      
         - Optimize rust symbol generation for SeqFile"
      
      * tag 'vfs-6.15-rc1.rust' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
        rust: optimize rust symbol generation for SeqFile
        rust: file: optimize rust symbol generation for FileDescriptorReservation
      56e7a8b0
    • Linus Torvalds's avatar
      Merge tag 'vfs-6.15-rc1.file' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs · 912b82dc
      Linus Torvalds authored
      Pull vfs file handling updates from Christian Brauner:
       "This contains performance improvements for struct file's new refcount
        mechanism and various other performance work:
      
         - The stock kernel transitioning the file to no refs held penalizes
           the caller with an extra atomic to block any increments. For cases
           where the file is highly likely to be going away this is easily
           avoidable.
      
           Add file_ref_put_close() to better handle the common case where
           closing a file descriptor also operates on the last reference and
           build fput_close_sync() and fput_close() on top of it. This brings
           about 1% performance improvement by eliding one atomic in the
           common case.
      
         - Predict no error in close() since the vast majority of the time
           the system call returns 0.
      
         - Reduce the work done in fdget_pos() by predicting that the file was
           found and by explicitly comparing the reference count to one and
           ignoring the dead zone"
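
The saved atomic can be illustrated with a minimal userspace sketch in C11. This is a model, not the kernel's implementation: the names ref_model, ref_put_model() and ref_put_close_model(), and the counter encoding (the count equals the number of references, -1 means dead), are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define REF_DEAD_MODEL (-1L)	/* hypothetical marker: no new references allowed */

struct ref_model {
	atomic_long count;	/* number of references currently held */
};

/*
 * Generic put: one atomic to drop the reference, plus a second atomic
 * to mark the object dead once the count has reached zero.
 * Returns true if the caller dropped the last reference.
 */
static bool ref_put_model(struct ref_model *r)
{
	long old = atomic_fetch_sub(&r->count, 1);
	long zero = 0;

	assert(old > 0);	/* model only: putting a dead object is a bug */
	if (old != 1)
		return false;
	return atomic_compare_exchange_strong(&r->count, &zero, REF_DEAD_MODEL);
}

/*
 * Close-time put: predict that we hold the last reference and try to go
 * from one reference straight to dead with a single atomic. Fall back
 * to the generic path if the prediction was wrong.
 */
static bool ref_put_close_model(struct ref_model *r)
{
	long expected = 1;

	if (__builtin_expect(atomic_compare_exchange_strong(&r->count, &expected,
							    REF_DEAD_MODEL), 1))
		return true;
	return ref_put_model(r);
}
```

In this model a close of a file with a single reference costs one compare-and-exchange, while the generic path costs a decrement followed by a compare-and-exchange.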
      
      * tag 'vfs-6.15-rc1.file' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
        fs: reduce work in fdget_pos()
        fs: use fput_close() in path_openat()
        fs: use fput_close() in filp_close()
        fs: use fput_close_sync() in close()
        file: add fput and file_ref_put routines optimized for use when closing a fd
        fs: predict no error in close()
      912b82dc
    • Linus Torvalds's avatar
      Merge tag 'vfs-6.15-rc1.orangefs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs · d41066dd
      Linus Torvalds authored
      Pull vfs orangefs updates from Christian Brauner:
       "This contains the work to remove orangefs_writepage() and partially
        convert it to folios.
      
        A few regular bugfixes are included as well"
      
      * tag 'vfs-6.15-rc1.orangefs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
        orangefs: Convert orangefs_writepages to contain an array of folios
        orangefs: Simplify bvec setup in orangefs_writepages_work()
        orangefs: Unify error & success paths in orangefs_writepages_work()
        orangefs: Pass mapping to orangefs_writepages_work()
        orangefs: Convert orangefs_writepage_locked() to take a folio
        orangefs: Remove orangefs_writepage()
        orangefs: make open_for_read and open_for_write boolean
        orangefs: Move s_kmod_keyword_mask_map to orangefs-debugfs.c
        orangefs: Do not truncate file size
      d41066dd
    • Linus Torvalds's avatar
      Merge tag 'vfs-6.15-rc1.afs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs · 9483c37e
      Linus Torvalds authored
      Pull vfs afs updates from Christian Brauner:
       "This contains the work for afs for this cycle:
      
         - Fix an occasional hang that's only really encountered when
           rmmod'ing the kafs module
      
         - Remove the "-o autocell" mount option. This is obsolete with the
           dynamic root and removing it makes the next patch slightly easier
      
         - Change how the dynamic root mount is constructed. Currently, the
           root directory is (de)populated when it is (un)mounted if there are
           cells already configured and, further, pairs of automount points
           have to be created/removed each time a cell is added/deleted
      
           This is changed so that readdir on the root dir lists all the known
           cell automount pairs plus the @cell symlinks and the inodes and
           dentries are constructed by lookup on demand. This simplifies the
           cell management code
      
         - A few improvements to the afs_volume and afs_server tracepoints
      
         - Pass trace info into the afs_lookup_cell() function to allow the
           trace log to indicate the purpose of the lookup
      
         - Remove the 'net' parameter from afs_unuse_cell() as it's
           superfluous
      
         - In rxrpc, allow a kernel app (such as kafs) to store a word of
           information on rxrpc_peer records
      
         - Use the information stored on the rxrpc_peer record to point to the
           afs_server record. This allows the server address lookup to be done
           away with
      
         - Simplify the afs_server ref/activity accounting to make each one
           self-contained and not garbage collected from the cell management
           work item
      
         - Simplify the afs_cell ref/activity accounting to make each one of
           these also self-contained and not driven by a central management
           work item
      
           The current code was intended to make it such that a single timer
           for the namespace and one work item per cell could do all the work
           required to maintain these records. This, however, made for some
           sequencing problems when cleaning up these records. Further, the
           attempt to pass refs along with timers and work items made getting
           it right rather tricky when the timer or work item already had a
           ref attached and now a ref had to be got rid of"
      
      * tag 'vfs-6.15-rc1.afs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
        afs: Simplify cell record handling
        afs: Fix afs_server ref accounting
        afs: Use the per-peer app data provided by rxrpc
        rxrpc: Allow the app to store private data on peer structs
        afs: Drop the net parameter from afs_unuse_cell()
        afs: Make afs_lookup_cell() take a trace note
        afs: Improve server refcount/active count tracing
        afs: Improve afs_volume tracing to display a debug ID
        afs: Change dynroot to create contents on demand
        afs: Remove the "autocell" mount option
      9483c37e
    • Linus Torvalds's avatar
      Merge tag 'vfs-6.15-rc1.initramfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs · c1c98301
      Linus Torvalds authored
      Pull vfs initramfs updates from Christian Brauner:
       "This adds basic kunit test coverage for initramfs unpacking and cleans
        up some buffer handling issues and inefficiencies"
      
      * tag 'vfs-6.15-rc1.initramfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
        MAINTAINERS: append initramfs files to the VFS section
        initramfs: avoid static buffer for error message
        initramfs: fix hardlink hash leak without TRAILER
        initramfs: reuse name_len for dir mtime tracking
        initramfs: allocate heap buffers together
        initramfs: avoid memcpy for hex header fields
        vsprintf: add simple_strntoul
        initramfs_test: kunit tests for initramfs unpacking
        init: add initramfs_internal.h
      c1c98301
    • Linus Torvalds's avatar
      Merge tag 'vfs-6.15-rc1.ceph' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs · e63046ad
      Linus Torvalds authored
      Pull vfs ceph updates from Christian Brauner:
       "This contains the work to remove access to page->index from ceph
        and fixes the test failure observed for ceph with generic/421 by
        refactoring ceph_writepages_start()"
      
      * tag 'vfs-6.15-rc1.ceph' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
        fscrypt: Change fscrypt_encrypt_pagecache_blocks() to take a folio
        ceph: Fix error handling in fill_readdir_cache()
        fs: Remove page_mkwrite_check_truncate()
        ceph: Pass a folio to ceph_allocate_page_array()
        ceph: Convert ceph_move_dirty_page_in_page_array() to move_dirty_folio_in_page_array()
        ceph: Remove uses of page from ceph_process_folio_batch()
        ceph: Convert ceph_check_page_before_write() to use a folio
        ceph: Convert writepage_nounlock() to write_folio_nounlock()
        ceph: Convert ceph_readdir_cache_control to store a folio
        ceph: Convert ceph_find_incompatible() to take a folio
        ceph: Use a folio in ceph_page_mkwrite()
        ceph: Remove ceph_writepage()
        ceph: fix generic/421 test failure
        ceph: introduce ceph_submit_write() method
        ceph: introduce ceph_process_folio_batch() method
        ceph: extend ceph_writeback_ctl for ceph_writepages_start() refactoring
      e63046ad
    • Linus Torvalds's avatar
      Merge tag 'vfs-6.15-rc1.pagesize' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs · e41170cc
      Linus Torvalds authored
      Pull vfs pagesize updates from Christian Brauner:
       "This enables block sizes greater than the page size for block devices.
      
        With this we can start supporting block devices with logical block
        sizes larger than 4k.
      
        It also lifts the device cache sector size support to 64k.
        This allows filesystems which can use larger sector sizes up to 64k to
        ensure that the filesystem will not generate writes that are smaller
        than the specified sector size"
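
As a rough sketch of what a 64k limit means for validation, the check below accepts only power-of-two block sizes within a range. The function and macro names are hypothetical, and the 512-byte lower bound is an assumption of this sketch; only the 64k upper bound comes from the text above.

```c
#include <assert.h>
#include <stdbool.h>

#define MIN_BLOCK_SIZE_MODEL 512u		/* assumed lower bound */
#define MAX_BLOCK_SIZE_MODEL (64u * 1024u)	/* 64k, per the text above */

static_assert((MAX_BLOCK_SIZE_MODEL & (MAX_BLOCK_SIZE_MODEL - 1)) == 0,
	      "upper bound must be a power of two");

/* True if n has exactly one bit set. */
static bool is_power_of_two_model(unsigned int n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/* Hypothetical validation helper: power of two within [512, 64k]. */
static bool block_size_ok_model(unsigned int bsize)
{
	return bsize >= MIN_BLOCK_SIZE_MODEL &&
	       bsize <= MAX_BLOCK_SIZE_MODEL &&
	       is_power_of_two_model(bsize);
}
```

With these assumed bounds, 16k and 64k blocks pass while 128k and non-power-of-two sizes are rejected.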
      
      * tag 'vfs-6.15-rc1.pagesize' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
        bdev: add back PAGE_SIZE block size validation for sb_set_blocksize()
        bdev: use bdev_io_min() for statx block size
        block/bdev: lift block size restrictions to 64k
        block/bdev: enable large folio support for large logical block sizes
        fs/buffer fs/mpage: remove large folio restriction
        fs/mpage: use blocks_per_folio instead of blocks_per_page
        fs/mpage: avoid negative shift for large blocksize
        fs/buffer: remove batching from async read
        fs/buffer: simplify block_read_full_folio() with bh_offset()
      e41170cc
    • Linus Torvalds's avatar
      Merge tag 'vfs-6.15-rc1.mount.namespace' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs · 130e696a
      Linus Torvalds authored
      Pull vfs mount namespace updates from Christian Brauner:
       "This expands the ability of anonymous mount namespaces:
      
         - Creating detached mounts from detached mounts
      
           Currently, detached mounts can only be created from attached
           mounts. This limitation prevents various use-cases. For example, the
           ability to mount a subdirectory without ever having to make the
           whole filesystem visible first.
      
           The current permission model is:
      
            (1) Check that the caller is privileged over the owning user
                namespace of its current mount namespace.
      
            (2) Check that the caller is located in the mount namespace of the
                mount it wants to create a detached copy of.
      
           While it is not strictly necessary to do it this way it is
           consistently applied in the new mount api. This model will also be
           used when allowing the creation of detached mount from another
           detached mount.
      
           The (1) requirement can simply be met by performing the same check
           as for the non-detached case, i.e., verify that the caller is
           privileged over its current mount namespace.
      
           To meet the (2) requirement it must be possible to infer the origin
           mount namespace that the anonymous mount namespace of the detached
           mount was created from.
      
           The origin mount namespace of an anonymous mount is the mount
           namespace that the mounts that were copied into the anonymous mount
           namespace originate from.
      
           In order to check the origin mount namespace of an anonymous mount
           namespace the sequence number of the original mount namespace is
           recorded in the anonymous mount namespace.
      
           With this in place it is possible to perform an equivalent check
           (2') to (2). The origin mount namespace of the anonymous mount
           namespace must be the same as the caller's mount namespace. To
           establish this the sequence number of the caller's mount namespace
           and the origin sequence number of the anonymous mount namespace are
           compared.
      
           The caller is always located in a non-anonymous mount namespace
           since anonymous mount namespaces cannot be setns()ed into. The
           caller's mount namespace will thus always have a valid sequence
           number.
      
           The owning namespace of any mount namespace, anonymous or
           non-anonymous, can never change. A mount attached to a
           non-anonymous mount namespace can never change mount namespace.
      
           If the sequence number of the non-anonymous mount namespace and the
           origin sequence number of the anonymous mount namespace match, the
           owning namespaces must match as well.
      
           Hence, the capability check on the owning namespace of the caller's
           mount namespace ensures that the caller has the ability to copy the
           mount tree.
      
         - Allow mounting detached mounts onto detached mounts
      
           Currently, detached mounts can only be mounted onto attached
           mounts. This limitation makes it impossible to assemble a new
           private rootfs and move it into place. Instead, a detached tree
           must be created, attached, then mounted upon and then either moved
           or detached again. Lift this restriction.
      
           In order to allow mounting detached mounts onto other detached
           mounts the same permission model used for creating detached mounts
           from detached mounts can be used (cf. above).
      
           Allowing to mount detached mounts onto detached mounts leaves three
           cases to consider:
      
            (1) The source mount is an attached mount and the target mount is
                a detached mount. This would be equivalent to moving a mount
                between different mount namespaces. A caller could move an
                attached mount to a detached mount. The detached mount can now
                be freely attached to any mount namespace. This changes the
                current delegation model significantly for no good reason. So
                this will fail.
      
            (2) Anonymous mount namespaces are always attached fully, i.e., it
                is not possible to only attach a subtree of an anonymous mount
                namespace. This simplifies the implementation and reasoning.
      
                Consequently, if the anonymous mount namespace of the source
                detached mount and the target detached mount are identical
                the mount request will fail.
      
            (3) The source mount's anonymous mount namespace is different from
                the target mount's anonymous mount namespace.
      
                In this case the source anonymous mount namespace of the
                source mount tree must be freed after its mounts have been
                moved to the target anonymous mount namespace. The source
                anonymous mount namespace must be empty afterwards.
      
           By allowing to mount detached mounts onto detached mounts a caller
           may do the following:
      
             fd_tree1 = open_tree(-EBADF, "/mnt", OPEN_TREE_CLONE)
             fd_tree2 = open_tree(-EBADF, "/tmp", OPEN_TREE_CLONE)
      
           fd_tree1 and fd_tree2 refer to two different detached mount trees
           that belong to two different anonymous mount namespaces.
      
           It is important to note that fd_tree1 and fd_tree2 both refer to
           the root of their respective anonymous mount namespaces.
      
           By allowing to mount detached mounts onto detached mounts the
           caller may now do:
      
               move_mount(fd_tree1, "", fd_tree2, "",
                          MOVE_MOUNT_F_EMPTY_PATH | MOVE_MOUNT_T_EMPTY_PATH)
      
           This will cause the detached mount referred to by fd_tree1 to be
           mounted on top of the detached mount referred to by fd_tree2.
      
           Thus, the detached mount fd_tree1 is moved from its separate
           anonymous mount namespace into fd_tree2's anonymous mount
           namespace.
      
           It also means that while fd_tree2 continues to refer to the root of
           its respective anonymous mount namespace fd_tree1 doesn't anymore.
      
           This has the consequence that only fd_tree2 can be moved to another
           anonymous or non-anonymous mount namespace. Moving fd_tree1 will
           now fail as fd_tree1 doesn't refer to the root of an anonymous mount
           namespace anymore.
      
           Now fd_tree1 and fd_tree2 refer to separate detached mount trees
           referring to the same anonymous mount namespace.
      
           This is conceptually fine. The new mount api does allow for this to
           happen already via:
      
             mount -t tmpfs tmpfs /mnt
             mkdir -p /mnt/A
             mount -t tmpfs tmpfs /mnt/A
      
             fd_tree3 = open_tree(-EBADF, "/mnt", OPEN_TREE_CLONE | AT_RECURSIVE)
             fd_tree4 = open_tree(-EBADF, "/mnt/A", 0)
      
           Both fd_tree3 and fd_tree4 refer to two different detached mount
           trees but both detached mount trees refer to the same anonymous
           mount namespace. And as with fd_tree1 and fd_tree2, only fd_tree3
           may be moved to another mount namespace as fd_tree3 refers to the
           root of the anonymous mount namespace while fd_tree4 doesn't.
      
           However, there's an important difference between the
           fd_tree3/fd_tree4 and the fd_tree1/fd_tree2 example.
      
           Closing fd_tree4 and releasing the respective struct file will have
           no further effect on fd_tree3's detached mount tree.
      
           However, closing fd_tree3 will cause the mount tree and the
           respective anonymous mount namespace to be destroyed causing the
           detached mount tree of fd_tree4 to be invalid for further mounting.
      
           By allowing to mount detached mounts on detached mounts as in the
           fd_tree1/fd_tree2 example both struct files will affect each other.
      
           Both fd_tree1 and fd_tree2 refer to struct files that have
           FMODE_NEED_UNMOUNT set.
      
           To handle this we use the fact that @fd_tree1 will have a parent
           mount once it has been attached to @fd_tree2.
      
           When dissolve_on_fput() is called the mount that has been passed in
           will refer to the root of the anonymous mount namespace. If it
           doesn't it would mean that mounts are leaked. So before allowing to
           mount detached mounts onto detached mounts this would be a bug.
      
           Now that detached mounts can be mounted onto detached mounts it
           just means that the mount has been attached to another anonymous
           mount namespace and thus dissolve_on_fput() must not unmount the
           mount tree or free the anonymous mount namespace as the file
           referring to the root of the namespace hasn't been closed yet.
      
           If it had already been closed it would be obvious because the mount
           namespace would be NULL, i.e., the @fd_tree1 would have already
           been unmounted. If @fd_tree1 hasn't been unmounted yet and has a
           parent mount it is safe to skip any cleanup as closing @fd_tree2
           will take care of all cleanup operations.
      
         - Allow mount propagation for detached mount trees
      
           In commit ee2e3f50 ("mount: fix mounting of detached mounts
           onto targets that reside on shared mounts") I fixed a bug where
           propagating the source mount tree of an anonymous mount namespace
           into a target mount tree of a non-anonymous mount namespace could
           be used to trigger an integer overflow in the non-anonymous mount
           namespace causing any new mounts to fail.
      
           The cause of this was that the propagation algorithm was unable to
           recognize mounts from the source mount tree that were already
           propagated into the target mount tree and then reappeared as
           propagation targets when walking the destination propagation mount
           tree.
      
           When fixing this I disabled mount propagation into anonymous mount
           namespaces. Make it possible for anonymous mount namespaces to
           receive mount propagation events correctly. This is also a
           correctness issue now that we allow mounting detached mount trees
           onto detached mount trees.
      
           Mark the source anonymous mount namespace with MNTNS_PROPAGATING
           indicating that all mounts belonging to this mount namespace are
           currently in the process of being propagated and make the
           propagation algorithm discard those if they appear as propagation
           targets"
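
The permission checks (1), (2) and (2') for creating detached mounts can be modelled in a few lines of userspace C. This is only a sketch under stated assumptions: the struct and function names are hypothetical, a sequence number of 0 is used here to mark an anonymous mount namespace, and the privilege check is reduced to a boolean supplied by the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct mnt_ns_model {
	uint64_t seq;		/* sequence number; 0 marks an anonymous namespace in this sketch */
	uint64_t seq_origin;	/* anonymous only: seq of the namespace the mounts were copied from */
};

static bool is_anon_model(const struct mnt_ns_model *ns)
{
	return ns->seq == 0;
}

/* Hypothetical model of the check for copying a mount tree. */
static bool may_copy_tree_model(const struct mnt_ns_model *caller_ns,
				const struct mnt_ns_model *source_ns,
				bool caller_privileged)
{
	/* The caller is always located in a non-anonymous mount namespace. */
	assert(!is_anon_model(caller_ns));

	/* (1) The caller must be privileged over the owning user namespace. */
	if (!caller_privileged)
		return false;

	/* (2) Attached source: the caller must be in the same mount namespace. */
	if (!is_anon_model(source_ns))
		return source_ns->seq == caller_ns->seq;

	/* (2') Detached source: its origin must be the caller's mount namespace. */
	return source_ns->seq_origin == caller_ns->seq;
}
```

Comparing sequence numbers instead of pointers is what lets the origin be checked without keeping the original mount namespace alive from the anonymous one.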
      
      * tag 'vfs-6.15-rc1.mount.namespace' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (21 commits)
        selftests: test subdirectory mounting
        selftests: add test for detached mount tree propagation
        fs: namespace: fix uninitialized variable use
        mount: handle mount propagation for detached mount trees
        fs: allow creating detached mounts from fsmount() file descriptors
        selftests: seventh test for mounting detached mounts onto detached mounts
        selftests: sixth test for mounting detached mounts onto detached mounts
        selftests: fifth test for mounting detached mounts onto detached mounts
        selftests: fourth test for mounting detached mounts onto detached mounts
        selftests: third test for mounting detached mounts onto detached mounts
        selftests: second test for mounting detached mounts onto detached mounts
        selftests: first test for mounting detached mounts onto detached mounts
        fs: mount detached mounts onto detached mounts
        fs: support getname_maybe_null() in move_mount()
        selftests: create detached mounts from detached mounts
        fs: create detached mounts from detached mounts
        fs: add may_copy_tree()
        fs: add fastpath for dissolve_on_fput()
        fs: add assert for move_mount()
        fs: add mnt_ns_empty() helper
        ...
      130e696a
    • Linus Torvalds's avatar
      Merge tag 'vfs-6.15-rc1.nsfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs · 74adf9e3
      Linus Torvalds authored
      Pull vfs nsfs updates from Christian Brauner:
       "This contains non-urgent fixes for nsfs to validate ioctls before
        performing any relevant operations.
      
        We already did this for a few other filesystems last cycle"
      
      * tag 'vfs-6.15-rc1.nsfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
        selftests/nsfs: add ioctl validation tests
        nsfs: validate ioctls
      74adf9e3
    • Linus Torvalds's avatar
      Merge tag 'vfs-6.15-rc1.sysv' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs · aaca83f7
      Linus Torvalds authored
      Pull vfs sysv removal from Christian Brauner:
       "This removes the sysv filesystem. We've discussed this various times.
      
        It's time to try"
      
      * tag 'vfs-6.15-rc1.sysv' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
        sysv: Remove the filesystem
      aaca83f7
    • Linus Torvalds's avatar
      Merge tag 'vfs-6.15-rc1.async.dir' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs · 26d8e430
      Linus Torvalds authored
      Pull vfs async dir updates from Christian Brauner:
       "This contains cleanups that fell out of the work from async directory
        handling:
      
         - Change kern_path_locked() and user_path_locked_at() to never return
           a negative dentry. This simplifies the usability of these helpers
           in various places
      
         - Drop d_exact_alias() from the remaining place in NFS where it is
           still used. This also allows us to drop the d_exact_alias() helper
           completely
      
         - Drop an unnecessary call to fh_update() from nfsd_create_locked()
      
         - Change i_op->mkdir() to return a struct dentry
      
           Change vfs_mkdir() to return a dentry provided by the filesystems
           which is hashed and positive. This allows us to reduce the number
           of cases where the resulting dentry is not positive to very few
           cases. The code in these places becomes simpler and easier to
           understand.
      
         - Repack DENTRY_* and LOOKUP_* flags"
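
The effect of returning a dentry from mkdir on callers can be sketched in userspace C. The types and helpers below are hypothetical stand-ins: the error-pointer helpers imitate the kernel's convention of encoding a negative errno in a pointer value, and mkdir_model() is an invented function that either fails or hands back a positive dentry.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_ERRNO_MODEL 4095

/* Encode a negative errno value in a pointer (model of the kernel convention). */
static void *err_ptr_model(long err)
{
	assert(err < 0 && err >= -MAX_ERRNO_MODEL);
	return (void *)(intptr_t)err;
}

static long ptr_err_model(const void *p)
{
	return (long)(intptr_t)p;
}

static bool is_err_model(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO_MODEL;
}

struct dentry_model {
	const char *name;
	bool positive;		/* true once the dentry refers to an inode */
};

/* Hypothetical mkdir: returns a positive dentry or an error pointer. */
static struct dentry_model *mkdir_model(struct dentry_model *dentry, bool fail)
{
	if (fail)
		return err_ptr_model(-EEXIST);
	dentry->positive = true;
	return dentry;
}
```

A caller then needs a single error check on the returned pointer and can use the result directly, rather than re-validating the dentry it passed in.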
      
      * tag 'vfs-6.15-rc1.async.dir' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
        doc: fix inline emphasis warning
        VFS: Change vfs_mkdir() to return the dentry.
        nfs: change mkdir inode_operation to return alternate dentry if needed.
        fuse: return correct dentry for ->mkdir
        ceph: return the correct dentry on mkdir
        hostfs: store inode in dentry after mkdir if possible.
        Change inode_operations.mkdir to return struct dentry *
        nfsd: drop fh_update() from S_IFDIR branch of nfsd_create_locked()
        nfs/vfs: discard d_exact_alias()
        VFS: add common error checks to lookup_one_qstr_excl()
        VFS: change kern_path_locked() and user_path_locked_at() to never return negative dentry
        VFS: repack LOOKUP_ bit flags.
        VFS: repack DENTRY_ flags.
      26d8e430
    • Linus Torvalds's avatar
      Merge tag 'vfs-6.15-rc1.overlayfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs · 804382d5
      Linus Torvalds authored
      Pull vfs overlayfs updates from Christian Brauner:
       "Currently overlayfs uses the mounter's credentials for its
        override_creds() calls. That provides a consistent permission model.
      
        This patch allows a caller to instruct overlayfs to use its
        credentials instead. The caller must be located in the same user
        namespace hierarchy as the user namespace the overlayfs instance will
        be mounted in. This provides a consistent and simple security model.
      
        With this it is possible to e.g., mount an overlayfs instance where
        the mounter must have CAP_SYS_ADMIN but the credentials used for
        override_creds() have dropped CAP_SYS_ADMIN. It also allows the usage
        of custom fs{g,u}id different from the callers and other tweaks"
      
      * tag 'vfs-6.15-rc1.overlayfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
        selftests/ovl: add third selftest for "override_creds"
        selftests/ovl: add second selftest for "override_creds"
        selftests/filesystems: add utils.{c,h}
        selftests/ovl: add first selftest for "override_creds"
        ovl: allow to specify override credentials
      804382d5
    • Linus Torvalds's avatar
      Merge tag 'vfs-6.15-rc1.iomap' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs · 0ec0d4ec
      Linus Torvalds authored
      Pull vfs iomap updates from Christian Brauner:
      
       - Allow the filesystem to submit the writeback bios.
      
          - Allow the filesystem to track completions on a per-bio basis
            instead of the entire I/O.
      
          - Change writeback_ops so that ->submit_bio can be done by the
            filesystem.
      
          - A new ANON_WRITE flag for writes that don't have a block number
            assigned to them at the iomap level leaving the filesystem to do
            that work in the submission handler.
      
       - Incremental iterator advance
      
         The folio_batch support for zero range where the filesystem provides
         a batch of folios to process that might not be logically contiguous
         requires more flexibility than the current offset based iteration
         offers.
      
         Update all iomap operations to advance the iterator within the
         operation and thus remove the need to advance from the core iomap
         iterator.
      
       - Make buffered writes work with RWF_DONTCACHE
      
         If RWF_DONTCACHE is set for a write, mark the folios being written as
         uncached. On writeback completion the pages will be dropped.
      
       - Introduce infrastructure for large atomic writes
      
         This will eventually be used by xfs and ext4.
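
The incremental iterator advance described above can be pictured with a small userspace model in which the operation itself consumes the range piece by piece, so the outer loop no longer has to advance on its behalf. All names here are hypothetical and the error code is an assumption of this sketch; it does not reproduce the iomap API.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

struct range_iter_model {
	uint64_t pos;	/* current file offset */
	uint64_t len;	/* bytes left to process */
};

/* Consume count bytes of the range; refuse to run past its end. */
static int range_iter_advance_model(struct range_iter_model *it, uint64_t count)
{
	if (count > it->len)
		return -EIO;
	it->pos += count;
	it->len -= count;
	return 0;
}

/*
 * Hypothetical operation that processes the range in pieces of at most
 * chunk bytes and advances the iterator after each piece.
 * Returns the number of pieces processed.
 */
static unsigned int process_range_model(struct range_iter_model *it, uint64_t chunk)
{
	unsigned int steps = 0;

	if (chunk == 0)
		return 0;
	while (it->len > 0) {
		uint64_t n = it->len < chunk ? it->len : chunk;

		assert(n > 0);
		if (range_iter_advance_model(it, n) != 0)
			break;
		steps++;
	}
	return steps;
}
```

Because each piece may have a different size, the same loop works when the pieces are not evenly sized or not logically contiguous in the way a fixed offset-based step would require.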
      
      * tag 'vfs-6.15-rc1.iomap' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (42 commits)
        iomap: rework IOMAP atomic flags
        iomap: comment on atomic write checks in iomap_dio_bio_iter()
        iomap: inline iomap_dio_bio_opflags()
        iomap: fix inline data on buffered read
        iomap: Lift blocksize restriction on atomic writes
        iomap: Support SW-based atomic writes
        iomap: Rename IOMAP_ATOMIC -> IOMAP_ATOMIC_HW
        xfs: flag as supporting FOP_DONTCACHE
        iomap: make buffered writes work with RWF_DONTCACHE
        iomap: introduce a full map advance helper
        iomap: rename iomap_iter processed field to status
        iomap: remove unnecessary advance from iomap_iter()
        dax: advance the iomap_iter on pte and pmd faults
        dax: advance the iomap_iter on dedupe range
        dax: advance the iomap_iter on unshare range
        dax: advance the iomap_iter on zero range
        dax: push advance down into dax_iomap_iter() for read and write
        dax: advance the iomap_iter in the read/write path
        iomap: convert misc simple ops to incremental advance
        iomap: advance the iter on direct I/O
        ...
      0ec0d4ec
    • Linus Torvalds's avatar
      Merge tag 'vfs-6.15-rc1.pidfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs · df00ded2
      Linus Torvalds authored
      Pull vfs pidfs updates from Christian Brauner:
      
       - Allow retrieving exit information after a process has been reaped
         through pidfds via the new PIDFD_INFO_EXIT extension for the
         PIDFD_GET_INFO ioctl. Various tools need access to information about
         a process/task even after it has already been reaped.
      
         Pidfd polling allows waiting on either task exit or for a task to
         have been reaped. The contract for PIDFD_INFO_EXIT is simply that
         EPOLLHUP must be observed before exit information can be retrieved,
         i.e., exit information is only provided once the task has been reaped
         and then can be retrieved as long as the pidfd is open.
      
       - Add PIDFD_SELF_{THREAD,THREAD_GROUP} sentinels allowing userspace to
         forgo allocating a file descriptor for their own process. This is
         useful in scenarios where users want to act on their own process
         through pidfds and is akin to AT_FDCWD.
      
       - Improve premature thread-group leader and subthread exec behavior
         when polling on pidfds:
      
         (1) During a multi-threaded exec by a subthread, i.e.,
             non-thread-group leader thread, all other threads in the
             thread-group including the thread-group leader are killed and the
             struct pid of the thread-group leader will be taken over by the
             subthread that called exec. IOW, two tasks change their TIDs.
      
         (2) A premature thread-group leader exit means that the thread-group
             leader exited before all of the other subthreads in the
             thread-group have exited.
      
         Both cases lead to inconsistencies for pidfd polling with
         PIDFD_THREAD. Any caller that holds a PIDFD_THREAD pidfd to the
         current thread-group leader may or may not see an exit notification
         on the file descriptor depending on when poll is performed. If the
         poll is performed before the exec of the subthread has concluded an
         exit notification is generated for the old thread-group leader. If
         the poll is performed after the exec of the subthread has concluded
         no exit notification is generated for the old thread-group leader.
      
         The correct behavior is to simply not generate an exit notification
         on the struct pid of a subthread exec because the struct pid is
         taken over by the subthread and thus remains alive.
      
         But this is difficult to handle because a thread-group leader may
         exit prematurely, as mentioned in (2). In that case an exit notification is
         reliably generated but the subthreads may continue to run for an
         indeterminate amount of time and thus also may exec at some point.
      
         After this pull no exit notifications will be generated for a
         PIDFD_THREAD pidfd for a thread-group leader until all subthreads
         have been reaped. If a subthread execs before that, no exit
         notification will be generated until that task exits or it creates
         subthreads and repeats the cycle.
      
         This means an exit notification indicates that the parent is able
         to reap the child.
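
      The polling contract described above (task exit versus task reaped)
      can be observed from userspace. What follows is a minimal sketch, not
      the selftest code from this pull: the helper name watch_child_exit()
      is hypothetical, and it relies only on os.pidfd_open() (Python 3.9+,
      Linux 5.3+), select.poll() and os.waitpid(). It deliberately leaves
      out the PIDFD_GET_INFO ioctl and only shows the poll side. Whether a
      hangup is reported after reaping depends on the running kernel, so
      the sketch records it without relying on it.

```python
import os
import select


def watch_child_exit(exit_code=7, timeout_ms=5000):
    """Fork a child and watch it through a pidfd (hypothetical helper).

    Returns None when pidfds are unavailable, otherwise a dict describing
    what poll() reported before and after the child was reaped.
    """
    if not hasattr(os, "pidfd_open"):
        return None  # needs Python 3.9+ on Linux

    pid = os.fork()
    if pid == 0:
        os._exit(exit_code)  # child: exit immediately

    try:
        pidfd = os.pidfd_open(pid)
    except OSError:
        os.waitpid(pid, 0)  # still reap the child
        return None  # kernel or sandbox without pidfd_open()

    try:
        poller = select.poll()
        poller.register(pidfd, select.POLLIN)

        # The pidfd becomes readable once the child has exited (zombie).
        before = dict(poller.poll(timeout_ms)).get(pidfd, 0)

        # Reap the child; a hangup can only be reported after this point.
        _, status = os.waitpid(pid, 0)
        after = dict(poller.poll(timeout_ms)).get(pidfd, 0)
    finally:
        os.close(pidfd)

    return {
        "exit_code": os.WEXITSTATUS(status),
        "readable_before_reap": bool(before & select.POLLIN),
        "hangup_after_reap": bool(after & select.POLLHUP),
    }
```

      Calling watch_child_exit() and printing the result shows whether the
      running kernel reports the hangup that PIDFD_INFO_EXIT callers are
      expected to wait for.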
      
      * tag 'vfs-6.15-rc1.pidfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (25 commits)
        selftests/pidfd: third test for multi-threaded exec polling
        selftests/pidfd: second test for multi-threaded exec polling
        selftests/pidfd: first test for multi-threaded exec polling
        pidfs: improve multi-threaded exec and premature thread-group leader exit polling
        pidfs: ensure that PIDFS_INFO_EXIT is available
        selftests/pidfd: add seventh PIDFD_INFO_EXIT selftest
        selftests/pidfd: add sixth PIDFD_INFO_EXIT selftest
        selftests/pidfd: add fifth PIDFD_INFO_EXIT selftest
        selftests/pidfd: add fourth PIDFD_INFO_EXIT selftest
        selftests/pidfd: add third PIDFD_INFO_EXIT selftest
        selftests/pidfd: add second PIDFD_INFO_EXIT selftest
        selftests/pidfd: add first PIDFD_INFO_EXIT selftest
        selftests/pidfd: expand common pidfd header
        pidfs/selftests: ensure correct headers for ioctl handling
        selftests/pidfd: fix header inclusion
        pidfs: allow to retrieve exit information
        pidfs: record exit code and cgroupid at exit
        pidfs: use private inode slab cache
        pidfs: move setting flags into pidfs_alloc_file()
        pidfd: rely on automatic cleanup in __pidfd_prepare()
        ...
      df00ded2
    • Linus Torvalds's avatar
      Merge tag 'vfs-6.15-rc1.pipe' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs · 71ee2fde
      Linus Torvalds authored
      Pull vfs pipe updates from Christian Brauner:
      
       - Introduce struct file_operations pipeanon_fops
      
       - Don't update {a,c,m}time for anonymous pipes to avoid the performance
         costs associated with it
      
       - Change pipe_write() to never add a zero-sized buffer
      
       - Limit the slots in pipe_resize_ring()
      
       - Use pipe_buf() to retrieve the pipe buffer everywhere
      
       - Drop an always true check in anon_pipe_write()
      
       - Cache 2 pages instead of 1
      
       - Avoid spurious calls to prepare_to_wait_event() in ___wait_event()
      
      * tag 'vfs-6.15-rc1.pipe' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
        fs/splice: Use pipe_buf() helper to retrieve pipe buffer
        fs/pipe: Use pipe_buf() helper to retrieve pipe buffer
        kernel/watch_queue: Use pipe_buf() to retrieve the pipe buffer
        fs/pipe: Limit the slots in pipe_resize_ring()
        wait: avoid spurious calls to prepare_to_wait_event() in ___wait_event()
        pipe: cache 2 pages instead of 1
        pipe: drop an always true check in anon_pipe_write()
        pipe: change pipe_write() to never add a zero-sized buffer
        pipe: don't update {a,c,m}time for anonymous pipes
        pipe: introduce struct file_operations pipeanon_fops
      71ee2fde
    • Linus Torvalds's avatar
      Merge tag 'vfs-6.15-rc1.mount' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs · fd101da6
      Linus Torvalds authored
      Pull vfs mount updates from Christian Brauner:
      
       - Mount notifications
      
         The day has come where we finally provide a new api to listen for
         mount topology changes outside of /proc/<pid>/mountinfo. A mount
         namespace file descriptor can be supplied and registered with
         fanotify to listen for mount topology changes.
      
         Currently notifications for mount, umount and moving mounts are
         generated. The generated notification record contains the unique
         mount id of the mount.
      
         The listmount() and statmount() api can be used to query detailed
         information about the mount using the received unique mount id.
      
         This allows userspace to figure out exactly how the mount topology
         changed without having to generate diffs of /proc/<pid>/mountinfo
         in userspace.
      
       - Support O_PATH file descriptors with FSCONFIG_SET_FD in the new mount
         api
      
       - Support detached mounts in overlayfs
      
         Since last cycle we support specifying overlayfs layers via file
         descriptors. However, we don't allow detached mounts which means
         userspace cannot use file descriptors received via
         open_tree(OPEN_TREE_CLONE) and fsmount() directly. They have to
         attach them to a mount namespace via move_mount() first.
      
         This is cumbersome and means they have to undo mounts via umount().
         Allow them to directly use detached mounts.
      
       - Allow retrieving idmappings with statmount()
      
         Currently it isn't possible to figure out what idmapping has been
         attached to an idmapped mount. Add an extension to statmount() which
         allows reading the idmapping from the mount.
      
       - Allow creating idmapped mounts from mounts that are already idmapped
      
         So far it hasn't been possible to create idmapped mounts from
         already idmapped mounts, as this has significant lifetime
         implications. Make the creation of idmapped mounts atomic by allowing
         struct mount_attr to be passed to the new open_tree_attr() system
         call, which solves these issues without complicating VFS lookup in
         any way.
      
         The system call has in general the benefit that creating a detached
         mount and applying mount attributes to it becomes an atomic operation
         for userspace.
      
       - Add a way to query statmount() for supported options
      
         Allow userspace to query which mount information can be retrieved
         through statmount().
      
       - Allow superblock owners to force unmount
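
      For context on the mount notification item above, this is roughly
      the kind of work userspace had to do without it: snapshot
      /proc/<pid>/mountinfo twice and diff the results. The sketch below
      is illustrative only. The function names are hypothetical, and it
      reads just the first, second and fifth mountinfo fields (mount ID,
      parent ID, mount point).

```python
def parse_mountinfo(text):
    """Map mount ID -> (parent ID, mount point) from mountinfo-formatted text."""
    mounts = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        mounts[int(fields[0])] = (int(fields[1]), fields[4])
    return mounts


def snapshot(path="/proc/self/mountinfo"):
    """Read and parse the current mount table of the calling process."""
    with open(path) as f:
        return parse_mountinfo(f.read())


def diff_mounts(before, after):
    """Return (added, removed, changed) between two parsed snapshots."""
    added = {mid: after[mid] for mid in after.keys() - before.keys()}
    removed = {mid: before[mid] for mid in before.keys() - after.keys()}
    changed = {
        mid: (before[mid], after[mid])
        for mid in before.keys() & after.keys()
        if before[mid] != after[mid]
    }
    return added, removed, changed
```

      Note that the IDs shown in mountinfo are the older, reusable mount
      IDs, not the unique mount IDs carried in the new notification
      records, so a diff like this can miss a mount that was removed and
      replaced between two snapshots.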
      
      * tag 'vfs-6.15-rc1.mount' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (21 commits)
        umount: Allow superblock owners to force umount
        selftests: add tests for mount notification
        selinux: add FILE__WATCH_MOUNTNS
        samples/vfs: fix printf format string for size_t
        fs: allow changing idmappings
        fs: add kflags member to struct mount_kattr
        fs: add open_tree_attr()
        fs: add copy_mount_setattr() helper
        fs: add vfs_open_tree() helper
        statmount: add a new supported_mask field
        samples/vfs: add STATMOUNT_MNT_{G,U}IDMAP
        selftests: add tests for using detached mount with overlayfs
        samples/vfs: check whether flag was raised
        statmount: allow to retrieve idmappings
        uidgid: add map_id_range_up()
        fs: allow detached mounts in clone_private_mount()
        selftests/overlayfs: test specifying layers as O_PATH file descriptors
        fs: support O_PATH fds with FSCONFIG_SET_FD
        vfs: add notifications for mount attach and detach
        fanotify: notify on mount attach and detach
        ...
      fd101da6
    • Linus Torvalds's avatar
      Merge tag 'vfs-6.15-rc1.eventpoll' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs · a79a09a0
      Linus Torvalds authored
      Pull vfs eventpoll updates from Christian Brauner:
       "This contains a few preparatory changes to eventpoll to allow io_uring
        to support epoll"
      
      * tag 'vfs-6.15-rc1.eventpoll' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
        eventpoll: add epoll_sendevents() helper
        eventpoll: abstract out ep_try_send_events() helper
        eventpoll: abstract out parameter sanity checking
      a79a09a0
    • Linus Torvalds's avatar
      Merge tag 'vfs-6.15-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs · 99c21bea
      Linus Torvalds authored
      Pull misc vfs updates from Christian Brauner:
       "Features:
      
         - Add CONFIG_DEBUG_VFS infrastructure:
            - Catch invalid modes in open
            - Use the new debug macros in inode_set_cached_link()
            - Use debug-only asserts around fd allocation and install
      
         - Place f_ref to 3rd cache line in struct file to resolve false
           sharing
      
        Cleanups:
      
         - Start using anon_inode_getfile_fmode() helper in various places
      
         - Don't take f_lock during SEEK_CUR if exclusion is guaranteed by
           f_pos_lock
      
         - Add unlikely() to kcmp()
      
         - Remove legacy ->remount_fs method from ecryptfs after port to the
           new mount api
      
         - Remove invalidate_inodes() in favour of evict_inodes()
      
         - Simplify ep_busy_loop() by removing an unused argument
      
         - Avoid mmap sem relocks when coredumping with many missing pages
      
         - Inline getname()
      
         - Inline new_inode_pseudo() and de-staticize alloc_inode()
      
         - Dodge an atomic in putname if ref == 1
      
         - Consistently deref the files table with rcu_dereference_raw()
      
         - Dedup handling of struct filename init and refcount bumps
      
         - Use wq_has_sleeper() in end_dir_add()
      
         - Drop the lock trip around I_NEW wake up in evict()
      
         - Load the ->i_sb pointer once in inode_sb_list_{add,del}
      
         - Predict not reaching the limit in alloc_empty_file()
      
         - Tidy up do_sys_openat2() with likely/unlikely
      
         - Call inode_sb_list_add() outside of inode hash lock
      
         - Sort out fd allocation vs dup2 race commentary
      
         - Turn page_offset() into a wrapper around folio_pos()
      
         - Remove locking in exportfs around ->get_parent() call
      
         - try_lookup_one_len() does not need any locks in autofs
      
         - Fix return type of several functions from long to int in open
      
         - Fix return type of several functions from long to int in ioctls
      
        Fixes:
      
         - Fix watch queue accounting mismatch"
      
      * tag 'vfs-6.15-rc1.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (30 commits)
        fs: sort out fd allocation vs dup2 race commentary, take 2
        fs: call inode_sb_list_add() outside of inode hash lock
        fs: tidy up do_sys_openat2() with likely/unlikely
        fs: predict not reaching the limit in alloc_empty_file()
        fs: load the ->i_sb pointer once in inode_sb_list_{add,del}
        fs: drop the lock trip around I_NEW wake up in evict()
        fs: use wq_has_sleeper() in end_dir_add()
        VFS/autofs: try_lookup_one_len() does not need any locks
        fs: dedup handling of struct filename init and refcounts bumps
        fs: consistently deref the files table with rcu_dereference_raw()
        exportfs: remove locking around ->get_parent() call.
        fs: use debug-only asserts around fd allocation and install
        fs: dodge an atomic in putname if ref == 1
        vfs: Remove invalidate_inodes()
        ecryptfs: remove NULL remount_fs from super_operations
        watch_queue: fix pipe accounting mismatch
        fs: place f_ref to 3rd cache line in struct file to resolve false sharing
        epoll: simplify ep_busy_loop by removing always 0 argument
        fs: Turn page_offset() into a wrapper around folio_pos()
        kcmp: improve performance adding an unlikely hint to task comparisons
        ...
      99c21bea
    • Linus Torvalds's avatar
      Merge tag 'vfs-6.15-rc1.mount.api' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs · c4cff1ea
      Linus Torvalds authored
      Pull vfs mount API updates from Christian Brauner:
       "This converts the remaining pseudo filesystems to the new mount api.
      
        The sysv conversion is a bit gratuitous because we remove sysv in
        another pull request. But if we have to revert the removal we at least
        will have it converted to the new mount api already"
      
      * tag 'vfs-6.15-rc1.mount.api' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
        sysv: convert sysv to use the new mount api
        vfs: remove some unused old mount api code
        devtmpfs: replace ->mount with ->get_tree in public instance
        vfs: Convert devpts to use the new mount API
        pstore: convert to the new mount API
      c4cff1ea