kernel/git/tip/tip.git - Unnamed repository; edit this file 'description' to name the repository.

Age	Commit message (Collapse)	Author	Files	Lines
4 days	Merge tag 'locking-futex-2025-12-10' of ↵	Linus Torvalds	6	-9/+562
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull futex updates from Ingo Molnar: - Standardize on ktime_t in restart_block::time as well (Thomas Weißschuh) - Futex selftests: - Add robust list testcases (André Almeida) - Formatting fixes/cleanups (Carlos Llamas) * tag 'locking-futex-2025-12-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: futex: Store time as ktime_t in restart block selftests/futex: Create test for robust list selftests/futex: Skip tests if shmget unsupported selftests/futex: Add newline to ksft_exit_fail_msg() selftests/futex: Remove unused test_futex_mpol()
6 days	selftests/bpf: replace "__auto_type" with "auto"	H. Peter Anvin	1	-2/+7
	Replace instances of "__auto_type" with "auto" in: tools/testing/selftests/bpf/prog_tests/socket_helpers.h This file does not seem to be including <linux/compiler_types.h> directly or indirectly, so copy the definition but guard it with !defined(auto). Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
7 days	Merge tag 'tty-6.19-rc1' of ↵	Linus Torvalds	4	-1/+657
	git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull tty/serial updates from Greg KH: "Here is the big set of tty/serial driver changes for 6.19-rc1. Nothing major at all, just small constant churn to make the tty layer "cleaner" as well as serial driver updates and even a new test added! Included in here are: - More tty/serial cleanups from Jiri - tty tiocsti test added to hopefully ensure we don't regress in this area again - sc16is7xx driver updates - imx serial driver updates - 8250 driver updates - new hardware device ids added - other minor serial/tty driver cleanups and tweaks All of these have been in linux-next for a while with no reported issues" * tag 'tty-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (60 commits) serial: sh-sci: Fix deadlock during RSCI FIFO overrun error dt-bindings: serial: rsci: Drop "uart-has-rtscts: false" LoongArch: dts: Add uart new compatible string serial: 8250: Add Loongson uart driver support dt-bindings: serial: 8250: Add Loongson uart compatible serial: 8250: add driver for KEBA UART serial: Keep rs485 settings for devices without firmware node serial: qcom-geni: Enable Serial on SA8255p Qualcomm platforms serial: qcom-geni: Enable PM runtime for serial driver serial: sprd: Return -EPROBE_DEFER when uart clock is not ready tty: serial: samsung: Declare earlycon for Exynos850 serial: icom: Convert PCIBIOS_* return codes to errnos serial: 8250-of: Fix style issues in 8250_of.c serial: add support of CPCI cards serial: mux: Fix kernel doc for mux_poll() tty: replace use of system_unbound_wq with system_dfl_wq serial: 8250_platform: simplify IRQF_SHARED handling serial: 8250: make share_irqs local to 8250_platform serial: 8250: move skip_txen_test to core serial: drop SERIAL_8250_DEPRECATED_OPTIONS ...
8 days	Merge tag 'mm-nonmm-stable-2025-12-06-11-14' of ↵	Linus Torvalds	317	-307/+1291
	git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull non-MM updates from Andrew Morton: - "panic: sys_info: Refactor and fix a potential issue" (Andy Shevchenko) fixes a build issue and does some cleanup in ib/sys_info.c - "Implement mul_u64_u64_div_u64_roundup()" (David Laight) enhances the 64-bit math code on behalf of a PWM driver and beefs up the test module for these library functions - "scripts/gdb/symbols: make BPF debug info available to GDB" (Ilya Leoshkevich) makes BPF symbol names, sizes, and line numbers available to the GDB debugger - "Enable hung_task and lockup cases to dump system info on demand" (Feng Tang) adds a sysctl which can be used to cause additional info dumping when the hung-task and lockup detectors fire - "lib/base64: add generic encoder/decoder, migrate users" (Kuan-Wei Chiu) adds a general base64 encoder/decoder to lib/ and migrates several users away from their private implementations - "rbree: inline rb_first() and rb_last()" (Eric Dumazet) makes TCP a little faster - "liveupdate: Rework KHO for in-kernel users" (Pasha Tatashin) reworks the KEXEC Handover interfaces in preparation for Live Update Orchestrator (LUO), and possibly for other future clients - "kho: simplify state machine and enable dynamic updates" (Pasha Tatashin) increases the flexibility of KEXEC Handover. Also preparation for LUO - "Live Update Orchestrator" (Pasha Tatashin) is a major new feature targeted at cloud environments. Quoting the cover letter: This series introduces the Live Update Orchestrator, a kernel subsystem designed to facilitate live kernel updates using a kexec-based reboot. This capability is critical for cloud environments, allowing hypervisors to be updated with minimal downtime for running virtual machines. LUO achieves this by preserving the state of selected resources, such as memory, devices and their dependencies, across the kernel transition. As a key feature, this series includes support for preserving memfd file descriptors, which allows critical in-memory data, such as guest RAM or any other large memory region, to be maintained in RAM across the kexec reboot. Mike Rappaport merits a mention here, for his extensive review and testing work. - "kexec: reorganize kexec and kdump sysfs" (Sourabh Jain) moves the kexec and kdump sysfs entries from /sys/kernel/ to /sys/kernel/kexec/ and adds back-compatibility symlinks which can hopefully be removed one day - "kho: fixes for vmalloc restoration" (Mike Rapoport) fixes a BUG which was being hit during KHO restoration of vmalloc() regions * tag 'mm-nonmm-stable-2025-12-06-11-14' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (139 commits) calibrate: update header inclusion Reinstate "resource: avoid unnecessary lookups in find_next_iomem_res()" vmcoreinfo: track and log recoverable hardware errors kho: fix restoring of contiguous ranges of order-0 pages kho: kho_restore_vmalloc: fix initialization of pages array MAINTAINERS: TPM DEVICE DRIVER: update the W-tag init: replace simple_strtoul with kstrtoul to improve lpj_setup KHO: fix boot failure due to kmemleak access to non-PRESENT pages Documentation/ABI: new kexec and kdump sysfs interface Documentation/ABI: mark old kexec sysfs deprecated kexec: move sysfs entries to /sys/kernel/kexec test_kho: always print restore status kho: free chunks using free_page() instead of kfree() selftests/liveupdate: add kexec test for multiple and empty sessions selftests/liveupdate: add simple kexec-based selftest for LUO selftests/liveupdate: add userspace API selftests docs: add documentation for memfd preservation via LUO mm: memfd_luo: allow preserving memfd liveupdate: luo_file: add private argument to store runtime state mm: shmem: export some functions to internal.h ...
8 days	Merge tag 'landlock-6.19-rc1' of ↵	Linus Torvalds	2	-9/+1467
	git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux Pull landlock updates from Mickaël Salaün: "This mainly fixes handling of disconnected directories and adds new tests" * tag 'landlock-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux: selftests/landlock: Add disconnected leafs and branch test suites selftests/landlock: Add tests for access through disconnected paths landlock: Improve variable scope landlock: Fix handling of disconnected directories selftests/landlock: Fix makefile header list landlock: Make docs in cred.h and domain.h visible landlock: Minor comments improvements
8 days	Merge tag 'libnvdimm-for-6.19' of ↵	Linus Torvalds	1	-1/+6
	git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm Pull nvdimm updates from Ira Weiny: "These are mainly bug fixes and code updates. There is a new feature to divide up memmap= carve outs and a fix caught in linux-next for that patch. Managing memmap memory on the fly for multiple VM's was proving difficult and Mike provided a driver which allows for the memory to be better manged. Summary: - Allow exposing RAM carveouts as NVDIMM DIMM devices - Prevent integer overflow in ramdax_get_config_data() - Replace use of system_wq with system_percpu_wq - Documentation: btt: Unwrap bit 31-30 nested table - tools/testing/nvdimm: Use per-DIMM device handle" * tag 'libnvdimm-for-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm: nvdimm: Prevent integer overflow in ramdax_get_config_data() Documentation: btt: Unwrap bit 31-30 nested table nvdimm: replace use of system_wq with system_percpu_wq tools/testing/nvdimm: Use per-DIMM device handle nvdimm: allow exposing RAM carveouts as NVDIMM DIMM devices
8 days	Merge tag 'dma-mapping-6.19-2025-12-05' of ↵	Linus Torvalds	3	-136/+0
	git://git.kernel.org/pub/scm/linux/kernel/git/mszyprowski/linux Pull dma-mapping updates from Marek Szyprowski: - More DMA mapping API refactoring to physical addresses as the primary interface instead of page+offset parameters. This time dma_map_ops callbacks are converted to physical addresses, what in turn results also in some simplification of architecture specific code (Leon Romanovsky and Jason Gunthorpe) - Clarify that dma_map_benchmark is not a kernel self-test, but standalone tool (Qinxin Xia) * tag 'dma-mapping-6.19-2025-12-05' of git://git.kernel.org/pub/scm/linux/kernel/git/mszyprowski/linux: dma-mapping: remove unused map_page callback xen: swiotlb: Convert mapping routine to rely on physical address x86: Use physical address for DMA mapping sparc: Use physical address DMA mapping powerpc: Convert to physical address DMA mapping parisc: Convert DMA map_page to map_phys interface MIPS/jazzdma: Provide physical address directly alpha: Convert mapping routine to rely on physical address dma-mapping: remove unused mapping resource callbacks xen: swiotlb: Switch to physical address mapping callbacks ARM: dma-mapping: Switch to physical address mapping callbacks ARM: dma-mapping: Reduce struct page exposure in arch_sync_dma*() dma-mapping: convert dummy ops to physical address mapping dma-mapping: prepare dma_map_ops to conversion to physical address tools/dma: move dma_map_benchmark from selftests to tools/dma
9 days	Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm	Linus Torvalds	47	-325/+2177
	Pull KVM updates from Paolo Bonzini: "ARM: - Support for userspace handling of synchronous external aborts (SEAs), allowing the VMM to potentially handle the abort in a non-fatal manner - Large rework of the VGIC's list register handling with the goal of supporting more active/pending IRQs than available list registers in hardware. In addition, the VGIC now supports EOImode==1 style deactivations for IRQs which may occur on a separate vCPU than the one that acked the IRQ - Support for FEAT_XNX (user / privileged execute permissions) and FEAT_HAF (hardware update to the Access Flag) in the software page table walkers and shadow MMU - Allow page table destruction to reschedule, fixing long need_resched latencies observed when destroying a large VM - Minor fixes to KVM and selftests Loongarch: - Get VM PMU capability from HW GCFG register - Add AVEC basic support - Use 64-bit register definition for EIOINTC - Add KVM timer test cases for tools/selftests RISC/V: - SBI message passing (MPXY) support for KVM guest - Give a new, more specific error subcode for the case when in-kernel AIA virtualization fails to allocate IMSIC VS-file - Support KVM_DIRTY_LOG_INITIALLY_SET, enabling dirty log gradually in small chunks - Fix guest page fault within HLV* instructions - Flush VS-stage TLB after VCPU migration for Andes cores s390: - Always allocate ESCA (Extended System Control Area), instead of starting with the basic SCA and converting to ESCA with the addition of the 65th vCPU. The price is increased number of exits (and worse performance) on z10 and earlier processor; ESCA was introduced by z114/z196 in 2010 - VIRT_XFER_TO_GUEST_WORK support - Operation exception forwarding support - Cleanups x86: - Skip the costly "zap all SPTEs" on an MMIO generation wrap if MMIO SPTE caching is disabled, as there can't be any relevant SPTEs to zap - Relocate a misplaced export - Fix an async #PF bug where KVM would clear the completion queue when the guest transitioned in and out of paging mode, e.g. when handling an SMI and then returning to paged mode via RSM - Leave KVM's user-return notifier registered even when disabling virtualization, as long as kvm.ko is loaded. On reboot/shutdown, keeping the notifier registered is ok; the kernel does not use the MSRs and the callback will run cleanly and restore host MSRs if the CPU manages to return to userspace before the system goes down - Use the checked version of {get,put}_user() - Fix a long-lurking bug where KVM's lack of catch-up logic for periodic APIC timers can result in a hard lockup in the host - Revert the periodic kvmclock sync logic now that KVM doesn't use a clocksource that's subject to NTP corrections - Clean up KVM's handling of MMIO Stale Data and L1TF, and bury the latter behind CONFIG_CPU_MITIGATIONS - Context switch XCR0, XSS, and PKRU outside of the entry/exit fast path; the only reason they were handled in the fast path was to paper of a bug in the core #MC code, and that has long since been fixed - Add emulator support for AVX MOV instructions, to play nice with emulated devices whose guest drivers like to access PCI BARs with large multi-byte instructions x86 (AMD): - Fix a few missing "VMCB dirty" bugs - Fix the worst of KVM's lack of EFER.LMSLE emulation - Add AVIC support for addressing 4k vCPUs in x2AVIC mode - Fix incorrect handling of selective CR0 writes when checking intercepts during emulation of L2 instructions - Fix a currently-benign bug where KVM would clobber SPEC_CTRL[63:32] on VMRUN and #VMEXIT - Fix a bug where KVM corrupt the guest code stream when re-injecting a soft interrupt if the guest patched the underlying code after the VM-Exit, e.g. when Linux patches code with a temporary INT3 - Add KVM_X86_SNP_POLICY_BITS to advertise supported SNP policy bits to userspace, and extend KVM "support" to all policy bits that don't require any actual support from KVM x86 (Intel): - Use the root role from kvm_mmu_page to construct EPTPs instead of the current vCPU state, partly as worthwhile cleanup, but mostly to pave the way for tracking per-root TLB flushes, and elide EPT flushes on pCPU migration if the root is clean from a previous flush - Add a few missing nested consistency checks - Rip out support for doing "early" consistency checks via hardware as the functionality hasn't been used in years and is no longer useful in general; replace it with an off-by-default module param to WARN if hardware fails a check that KVM does not perform - Fix a currently-benign bug where KVM would drop the guest's SPEC_CTRL[63:32] on VM-Enter - Misc cleanups - Overhaul the TDX code to address systemic races where KVM (acting on behalf of userspace) could inadvertantly trigger lock contention in the TDX-Module; KVM was either working around these in weird, ugly ways, or was simply oblivious to them (though even Yan's devilish selftests could only break individual VMs, not the host kernel) - Fix a bug where KVM could corrupt a vCPU's cpu_list when freeing a TDX vCPU, if creating said vCPU failed partway through - Fix a few sparse warnings (bad annotation, 0 != NULL) - Use struct_size() to simplify copying TDX capabilities to userspace - Fix a bug where TDX would effectively corrupt user-return MSR values if the TDX Module rejects VP.ENTER and thus doesn't clobber host MSRs as expected Selftests: - Fix a math goof in mmu_stress_test when running on a single-CPU system/VM - Forcefully override ARCH from x86_64 to x86 to play nice with specifying ARCH=x86_64 on the command line - Extend a bunch of nested VMX to validate nested SVM as well - Add support for LA57 in the core VM_MODE_xxx macro, and add a test to verify KVM can save/restore nested VMX state when L1 is using 5-level paging, but L2 is not - Clean up the guest paging code in anticipation of sharing the core logic for nested EPT and nested NPT guest_memfd: - Add NUMA mempolicy support for guest_memfd, and clean up a variety of rough edges in guest_memfd along the way - Define a CLASS to automatically handle get+put when grabbing a guest_memfd from a memslot to make it harder to leak references - Enhance KVM selftests to make it easer to develop and debug selftests like those added for guest_memfd NUMA support, e.g. where test and/or KVM bugs often result in hard-to-debug SIGBUS errors - Misc cleanups Generic: - Use the recently-added WQ_PERCPU when creating the per-CPU workqueue for irqfd cleanup - Fix a goof in the dirty ring documentation - Fix choice of target for directed yield across different calls to kvm_vcpu_on_spin(); the function was always starting from the first vCPU instead of continuing the round-robin search" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (260 commits) KVM: arm64: at: Update AF on software walk only if VM has FEAT_HAFDBS KVM: arm64: at: Use correct HA bit in TCR_EL2 when regime is EL2 KVM: arm64: Document KVM_PGTABLE_PROT_{UX,PX} KVM: arm64: Fix spelling mistake "Unexpeced" -> "Unexpected" KVM: arm64: Add break to default case in kvm_pgtable_stage2_pte_prot() KVM: arm64: Add endian casting to kvm_swap_s[12]_desc() KVM: arm64: Fix compilation when CONFIG_ARM64_USE_LSE_ATOMICS=n KVM: arm64: selftests: Add test for AT emulation KVM: arm64: nv: Expose hardware access flag management to NV guests KVM: arm64: nv: Implement HW access flag management in stage-2 SW PTW KVM: arm64: Implement HW access flag management in stage-1 SW PTW KVM: arm64: Propagate PTW errors up to AT emulation KVM: arm64: Add helper for swapping guest descriptor KVM: arm64: nv: Use pgtable definitions in stage-2 walk KVM: arm64: Handle endianness in read helper for emulated PTW KVM: arm64: nv: Stop passing vCPU through void ptr in S2 PTW KVM: arm64: Call helper for reading descriptors directly KVM: arm64: nv: Advertise support for FEAT_XNX KVM: arm64: Teach ptdump about FEAT_XNX permissions KVM: s390: Use generic VIRT_XFER_TO_GUEST_WORK functions ...
9 days	Merge tag 'riscv-for-linus-6.19-mw1' of ↵	Linus Torvalds	3	-30/+274
	git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux Pull RISC-V updates from Paul Walmsley: - Enable parallel hotplug for RISC-V - Optimize vector regset allocation for ptrace() - Add a kernel selftest for the vector ptrace interface - Enable the userspace RAID6 test to build and run using RISC-V vectors - Add initial support for the Zalasr RISC-V ratified ISA extension - For the Zicbop RISC-V ratified ISA extension to userspace, expose hardware and kernel support to userspace and add a kselftest for Zicbop - Convert open-coded instances of 'asm goto's that are controlled by runtime ALTERNATIVEs to use riscv_has_extension_{un,}likely(), following arm64's alternative_has_cap_{un,}likely() - Remove an unnecessary mask in the GFP flags used in some calls to pagetable_alloc() * tag 'riscv-for-linus-6.19-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: selftests/riscv: Add Zicbop prefetch test riscv: hwprobe: Expose Zicbop extension and its block size riscv: Introduce Zalasr instructions riscv: hwprobe: Export Zalasr extension dt-bindings: riscv: Add Zalasr ISA extension description riscv: Add ISA extension parsing for Zalasr selftests: riscv: Add test for the Vector ptrace interface riscv: ptrace: Optimize the allocation of vector regset raid6: test: Add support for RISC-V raid6: riscv: Allow code to be compiled in userspace raid6: riscv: Prevent compiler from breaking inline vector assembly code riscv: cmpxchg: Use riscv_has_extension_likely riscv: bitops: Use riscv_has_extension_likely riscv: hweight: Use riscv_has_extension_likely riscv: checksum: Use riscv_has_extension_likely riscv: pgtable: Use riscv_has_extension_unlikely riscv: Remove __GFP_HIGHMEM masking RISC-V: Enable HOTPLUG_PARALLEL for secondary CPUs
9 days	Merge tag 'mm-stable-2025-12-03-21-26' of ↵	Linus Torvalds	16	-219/+1944
	git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull MM updates from Andrew Morton: "__vmalloc()/kvmalloc() and no-block support" (Uladzislau Rezki) Rework the vmalloc() code to support non-blocking allocations (GFP_ATOIC, GFP_NOWAIT) "ksm: fix exec/fork inheritance" (xu xin) Fix a rare case where the KSM MMF_VM_MERGE_ANY prctl state is not inherited across fork/exec "mm/zswap: misc cleanup of code and documentations" (SeongJae Park) Some light maintenance work on the zswap code "mm/page_owner: add debugfs files 'show_handles' and 'show_stacks_handles'" (Mauricio Faria de Oliveira) Enhance the /sys/kernel/debug/page_owner debug feature by adding unique identifiers to differentiate the various stack traces so that userspace monitoring tools can better match stack traces over time "mm/page_alloc: pcp->batch cleanups" (Joshua Hahn) Minor alterations to the page allocator's per-cpu-pages feature "Improve UFFDIO_MOVE scalability by removing anon_vma lock" (Lokesh Gidra) Address a scalability issue in userfaultfd's UFFDIO_MOVE operation "kasan: cleanups for kasan_enabled() checks" (Sabyrzhan Tasbolatov) "drivers/base/node: fold node register and unregister functions" (Donet Tom) Clean up the NUMA node handling code a little "mm: some optimizations for prot numa" (Kefeng Wang) Cleanups and small optimizations to the NUMA allocation hinting code "mm/page_alloc: Batch callers of free_pcppages_bulk" (Joshua Hahn) Address long lock hold times at boot on large machines. These were causing (harmless) softlockup warnings "optimize the logic for handling dirty file folios during reclaim" (Baolin Wang) Remove some now-unnecessary work from page reclaim "mm/damon: allow DAMOS auto-tuned for per-memcg per-node memory usage" (SeongJae Park) Enhance the DAMOS auto-tuning feature "mm/damon: fixes for address alignment issues in DAMON_LRU_SORT and DAMON_RECLAIM" (Quanmin Yan) Fix DAMON_LRU_SORT and DAMON_RECLAIM with certain userspace configuration "expand mmap_prepare functionality, port more users" (Lorenzo Stoakes) Enhance the new(ish) file_operations.mmap_prepare() method and port additional callsites from the old ->mmap() over to ->mmap_prepare() "Fix stale IOTLB entries for kernel address space" (Lu Baolu) Fix a bug (and possible security issue on non-x86) in the IOMMU code. In some situations the IOMMU could be left hanging onto a stale kernel pagetable entry "mm/huge_memory: cleanup __split_unmapped_folio()" (Wei Yang) Clean up and optimize the folio splitting code "mm, swap: misc cleanup and bugfix" (Kairui Song) Some cleanups and a minor fix in the swap discard code "mm/damon: misc documentation fixups" (SeongJae Park) "mm/damon: support pin-point targets removal" (SeongJae Park) Permit userspace to remove a specific monitoring target in the middle of the current targets list "mm: MISC follow-up patches for linux/pgalloc.h" (Harry Yoo) A couple of cleanups related to mm header file inclusion "mm/swapfile.c: select swap devices of default priority round robin" (Baoquan He) improve the selection of swap devices for NUMA machines "mm: Convert memory block states (MEM_) macros to enums" (Israel Batista) Change the memory block labels from macros to enums so they will appear in kernel debug info "ksm: perform a range-walk to jump over holes in break_ksm" (Pedro Demarchi Gomes) Address an inefficiency when KSM unmerges an address range "mm/damon/tests: fix memory bugs in kunit tests" (SeongJae Park) Fix leaks and unhandled malloc() failures in DAMON userspace unit tests "some cleanups for pageout()" (Baolin Wang) Clean up a couple of minor things in the page scanner's writeback-for-eviction code "mm/hugetlb: refactor sysfs/sysctl interfaces" (Hui Zhu) Move hugetlb's sysfs/sysctl handling code into a new file "introduce VM_MAYBE_GUARD and make it sticky" (Lorenzo Stoakes) Make the VMA guard regions available in /proc/pid/smaps and improves the mergeability of guarded VMAs "mm: perform guard region install/remove under VMA lock" (Lorenzo Stoakes) Reduce mmap lock contention for callers performing VMA guard region operations "vma_start_write_killable" (Matthew Wilcox) Start work on permitting applications to be killed when they are waiting on a read_lock on the VMA lock "mm/damon/tests: add more tests for online parameters commit" (SeongJae Park) Add additional userspace testing of DAMON's "commit" feature "mm/damon: misc cleanups" (SeongJae Park) "make VM_SOFTDIRTY a sticky VMA flag" (Lorenzo Stoakes) Address the possible loss of a VMA's VM_SOFTDIRTY flag when that VMA is merged with another "mm: support device-private THP" (Balbir Singh) Introduce support for Transparent Huge Page (THP) migration in zone device-private memory "Optimize folio split in memory failure" (Zi Yan) "mm/huge_memory: Define split_type and consolidate split support checks" (Wei Yang) Some more cleanups in the folio splitting code "mm: remove is_swap_[pte, pmd]() + non-swap entries, introduce leaf entries" (Lorenzo Stoakes) Clean up our handling of pagetable leaf entries by introducing the concept of 'software leaf entries', of type softleaf_t "reparent the THP split queue" (Muchun Song) Reparent the THP split queue to its parent memcg. This is in preparation for addressing the long-standing "dying memcg" problem, wherein dead memcg's linger for too long, consuming memory resources "unify PMD scan results and remove redundant cleanup" (Wei Yang) A little cleanup in the hugepage collapse code "zram: introduce writeback bio batching" (Sergey Senozhatsky) Improve zram writeback efficiency by introducing batched bio writeback support "memcg: cleanup the memcg stats interfaces" (Shakeel Butt) Clean up our handling of the interrupt safety of some memcg stats "make vmalloc gfp flags usage more apparent" (Vishal Moola) Clean up vmalloc's handling of incoming GFP flags "mm: Add soft-dirty and uffd-wp support for RISC-V" (Chunyan Zhang) Teach soft dirty and userfaultfd write protect tracking to use RISC-V's Svrsw60t59b extension "mm: swap: small fixes and comment cleanups" (Youngjun Park) Fix a small bug and clean up some of the swap code "initial work on making VMA flags a bitmap" (Lorenzo Stoakes) Start work on converting the vma struct's flags to a bitmap, so we stop running out of them, especially on 32-bit "mm/swapfile: fix and cleanup swap list iterations" (Youngjun Park) Address a possible bug in the swap discard code and clean things up a little [ This merge also reverts commit ebb9aeb980e5 ("vfio/nvgrace-gpu: register device memory for poison handling") because it looks broken to me, I've asked for clarification - Linus ] tag 'mm-stable-2025-12-03-21-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (321 commits) mm: fix vma_start_write_killable() signal handling mm/swapfile: use plist_for_each_entry in __folio_throttle_swaprate mm/swapfile: fix list iteration when next node is removed during discard fs/proc/task_mmu.c: fix make_uffd_wp_huge_pte() huge pte handling mm/kfence: add reboot notifier to disable KFENCE on shutdown memcg: remove inc/dec_lruvec_kmem_state helpers selftests/mm/uffd: initialize char variable to Null mm: fix DEBUG_RODATA_TEST indentation in Kconfig mm: introduce VMA flags bitmap type tools/testing/vma: eliminate dependency on vma->__vm_flags mm: simplify and rename mm flags function for clarity mm: declare VMA flags by bit zram: fix a spelling mistake mm/page_alloc: optimize lowmem_reserve max lookup using its semantic monotonicity mm/vmscan: skip increasing kswapd_failures when reclaim was boosted pagemap: update BUDDY flag documentation mm: swap: remove scan_swap_map_slots() references from comments mm: swap: change swap_alloc_slow() to void mm, swap: remove redundant comment for read_swap_cache_async mm, swap: use SWP_SOLIDSTATE to determine if swap is rotational ...
9 days	Merge tag 'ktest-v6.19' of ↵	Linus Torvalds	1	-2/+2
	git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-ktest Pull ktest fix from Steven Rostedt: - Fix incorrect variable in error message in config-bisect.pl If the old config file fails to get copied as the last good or bad config file, then it fails the program and prints an error message. But the variable used to print what the old config's name was incorrect. It was $config when it should have been $output_config. * tag 'ktest-v6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-ktest: ktest.pl: Fix uninitialized var in config-bisect.pl
9 days	Merge tag 'trace-rv-6.19' of ↵	Linus Torvalds	13	-14/+278
	git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull runtime verifier updates from Steven Rostedt: - Adapt the ftracetest script to be run from a different folder This uses the already existing OPT_TEST_DIR but extends it further to run independent tests, then add an --rv flag to allow using the script for testing RV (mostly) independently on ftrace. - Add basic RV selftests in selftests/verification for more validations Add more validations for available/enabled monitors and reactors. This could have caught the bug introducing kernel panic solved above. Tests use ftracetest. - Convert react() function in reactor to use va_list directly Use a central helper to handle the variadic arguments. Clean up macros and mark functions as static. - Add lockdep annotations to reactors to have lockdep complain of errors If the reactors are called from improper context. Useful to develop new reactors. This highlights a warning in the panic reactor that is related to the printk subsystem and not to RV. - Convert core RV code to use lock guards and __free helpers This completely removes goto statements. - Fix compilation if !CONFIG_RV_REACTORS Fix the warning by keeping LTL monitor variable as always static. * tag 'trace-rv-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: rv: Fix compilation if !CONFIG_RV_REACTORS rv: Convert to use __free rv: Convert to use lock guard rv: Add explicit lockdep context for reactors rv: Make rv_reacting_on() static rv: Pass va_list to reactors selftests/verification: Add initial RV tests selftest/ftrace: Generalise ftracetest to use with RV
9 days	Merge tag 'tpmdd-next-6.19-rc1-v4' of ↵	Linus Torvalds	1	-2/+2
	git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd Pull tpm updates from Jarkko Sakkinen: "This contains changes to unify TPM return code translation between trusted_tpm2 and TPM driver itself. Other than that the changes are either bug fixes or minor imrovements" * tag 'tpmdd-next-6.19-rc1-v4' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd: KEYS: trusted: Use tpm_ret_to_err() in trusted_tpm2 tpm: Use -EPERM as fallback error code in tpm_ret_to_err tpm: Cap the number of PCR banks tpm: Remove tpm_find_get_ops tpm: add WQ_PERCPU to alloc_workqueue users tpm_crb: add missing loc parameter to kerneldoc tpm_crb: Fix a spelling mistake selftests: tpm2: Fix ill defined assertions
9 days	Merge tag 'for-linus-iommufd' of ↵	Linus Torvalds	2	-0/+87
	git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd Pull iommufd updates from Jason Gunthorpe: "This is a pretty consequential cycle for iommufd, though this pull is not too big. It is based on a shared branch with VFIO that introduces VFIO_DEVICE_FEATURE_DMA_BUF a DMABUF exporter for VFIO device's MMIO PCI BARs. This was a large multiple series journey over the last year and a half. Based on that work IOMMUFD gains support for VFIO DMABUF's in its existing IOMMU_IOAS_MAP_FILE, which closes the last major gap to support PCI peer to peer transfers within VMs. In Joerg's iommu tree we have the "generic page table" work which aims to consolidate all the duplicated page table code in every iommu driver into a single algorithm. This will be used by iommufd to implement unique page table operations to start adding new features and improve performance. In here: - Expand IOMMU_IOAS_MAP_FILE to accept a DMABUF exported from VFIO. This is the first step to broader DMABUF support in iommufd, right now it only works with VFIO. This closes the last functional gap with classic VFIO type 1 to safely support PCI peer to peer DMA by mapping the VFIO device's MMIO into the IOMMU. - Relax SMMUv3 restrictions on nesting domains to better support qemu's sequence to have an identity mapping before the vSID is established" * tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd: iommu/arm-smmu-v3-iommufd: Allow attaching nested domain for GBPA cases iommufd/selftest: Add some tests for the dmabuf flow iommufd: Accept a DMABUF through IOMMU_IOAS_MAP_FILE iommufd: Have iopt_map_file_pages convert the fd to a file iommufd: Have pfn_reader process DMABUF iopt_pages iommufd: Allow MMIO pages in a batch iommufd: Allow a DMABUF to be revoked iommufd: Do not map/unmap revoked DMABUFs iommufd: Add DMABUF to iopt_pages vfio/pci: Add vfio_pci_dma_buf_iommufd_map()
9 days	Merge tag 'vfio-v6.19-rc1' of https://github.com/awilliam/linux-vfio	Linus Torvalds	26	-1071/+1492
	Pull VFIO updates from Alex Williamson: - Move libvfio selftest artifacts in preparation of more tightly coupled integration with KVM selftests (David Matlack) - Fix comment typo in mtty driver (Chu Guangqing) - Support for new hardware revision in the hisi_acc vfio-pci variant driver where the migration registers can now be accessed via the PF. When enabled for this support, the full BAR can be exposed to the user (Longfang Liu) - Fix vfio cdev support for VF token passing, using the correct size for the kernel structure, thereby actually allowing userspace to provide a non-zero UUID token. Also set the match token callback for the hisi_acc, fixing VF token support for this this vfio-pci variant driver (Raghavendra Rao Ananta) - Introduce internal callbacks on vfio devices to simplify and consolidate duplicate code for generating VFIO_DEVICE_GET_REGION_INFO data, removing various ioctl intercepts with a more structured solution (Jason Gunthorpe) - Introduce dma-buf support for vfio-pci devices, allowing MMIO regions to be exposed through dma-buf objects with lifecycle managed through move operations. This enables low-level interactions such as a vfio-pci based SPDK drivers interacting directly with dma-buf capable RDMA devices to enable peer-to-peer operations. IOMMUFD is also now able to build upon this support to fill a long standing feature gap versus the legacy vfio type1 IOMMU backend with an implementation of P2P support for VM use cases that better manages the lifecycle of the P2P mapping (Leon Romanovsky, Jason Gunthorpe, Vivek Kasireddy) - Convert eventfd triggering for error and request signals to use RCU mechanisms in order to avoid a 3-way lockdep reported deadlock issue (Alex Williamson) - Fix a 32-bit overflow introduced via dma-buf support manifesting with large DMA buffers (Alex Mastro) - Convert nvgrace-gpu vfio-pci variant driver to insert mappings on fault rather than at mmap time. This conversion serves both to make use of huge PFNMAPs but also to both avoid corrected RAS events during reset by now being subject to vfio-pci-core's use of unmap_mapping_range(), and to enable a device readiness test after reset (Ankit Agrawal) - Refactoring of vfio selftests to support multi-device tests and split code to provide better separation between IOMMU and device objects. This work also enables a new test suite addition to measure parallel device initialization latency (David Matlack) * tag 'vfio-v6.19-rc1' of https://github.com/awilliam/linux-vfio: (65 commits) vfio: selftests: Add vfio_pci_device_init_perf_test vfio: selftests: Eliminate INVALID_IOVA vfio: selftests: Split libvfio.h into separate header files vfio: selftests: Move vfio_selftests_*() helpers into libvfio.c vfio: selftests: Rename vfio_util.h to libvfio.h vfio: selftests: Stop passing device for IOMMU operations vfio: selftests: Move IOVA allocator into iova_allocator.c vfio: selftests: Move IOMMU library code into iommu.c vfio: selftests: Rename struct vfio_dma_region to dma_region vfio: selftests: Upgrade driver logging to dev_err() vfio: selftests: Prefix logs with device BDF where relevant vfio: selftests: Eliminate overly chatty logging vfio: selftests: Support multiple devices in the same container/iommufd vfio: selftests: Introduce struct iommu vfio: selftests: Rename struct vfio_iommu_mode to iommu_mode vfio: selftests: Allow passing multiple BDFs on the command line vfio: selftests: Split run.sh into separate scripts vfio: selftests: Move run.sh into scripts directory vfio/nvgrace-gpu: wait for the GPU mem to be ready vfio/nvgrace-gpu: Inform devmem unmapped after reset ...
9 days	Merge tag 'iommu-updates-v6.19' of ↵	Linus Torvalds	2	-22/+50
	git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux Pull iommu updates from Joerg Roedel: - Introduction of the generic IO page-table framework with support for Intel and AMD IOMMU formats from Jason. This has good potential for unifying more IO page-table implementations and making future enhancements more easy. But this also needed quite some fixes during development. All known issues have been fixed, but my feeling is that there is a higher potential than usual that more might be needed. - Intel VT-d updates: - Use right invalidation hint in qi_desc_iotlb() - Reduce the scope of INTEL_IOMMU_FLOPPY_WA - ARM-SMMU updates: - Qualcomm device-tree binding updates for Kaanapali and Glymur SoCs and a new clock for the TBU. - Fix error handling if level 1 CD table allocation fails. - Permit more than the architectural maximum number of SMRs for funky Qualcomm mis-implementations of SMMUv2. - Mediatek driver: - MT8189 iommu support - Move ARM IO-pgtable selftests to kunit - Device leak fixes for a couple of drivers - Random smaller fixes and improvements * tag 'iommu-updates-v6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux: (81 commits) iommupt/vtd: Support mgaw's less than a 4 level walk for first stage iommupt/vtd: Allow VT-d to have a larger table top than the vasz requires powerpc/pseries/svm: Make mem_encrypt.h self contained genpt: Make GENERIC_PT invisible iommupt: Avoid a compiler bug with sw_bit iommu/arm-smmu-qcom: Enable use of all SMR groups when running bare-metal iommupt: Fix unlikely flows in increase_top() iommu/amd: Propagate the error code returned by __modify_irte_ga() in modify_irte_ga() MAINTAINERS: Update my email address iommu/arm-smmu-v3: Fix error check in arm_smmu_alloc_cd_tables dt-bindings: iommu: qcom_iommu: Allow 'tbu' clock iommu/vt-d: Restore previous domain::aperture_end calculation iommu/vt-d: Fix unused invalidation hint in qi_desc_iotlb iommu/vt-d: Set INTEL_IOMMU_FLOPPY_WA depend on BLK_DEV_FD iommu/tegra: fix device leak on probe_device() iommu/sun50i: fix device leak on of_xlate() iommu/omap: simplify probe_device() error handling iommu/omap: fix device leaks on probe_device() iommu/mediatek-v1: add missing larb count sanity check iommu/mediatek-v1: fix device leaks on probe() ...
9 days	Merge tag 'cxl-for-6.19' of ↵	Linus Torvalds	7	-77/+525
	git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl Pull compute express link (CXL) updates from Dave Jiang: "The additions of note are adding CXL region remove support for locked CXL decoders, adding unit testing support for XOR address translation, and adding unit testing support for extended linear cache. Misc: - Remove incorrect page-allocator quirk section in documentation - Remove unused devm_cxl_port_enumerate_dports() function - Fix typo in cdat.c code comment - Replace use of system_wq with system_percpu_wq - Add locked CXL decoder support for region removal - Return when generic target updated - Rename region_res_match_cxl_range() to spa_maps_hpa() - Clarify comment in spa_maps_hpa() Enable unit testing for XOR address translation of SPA to DPA and vice versa: - Refactor address translation funcs for testing in cxl_region - Make the XOR calculations available for testing - Add cxl_translate module for address translation testing in cxl_test Extended Linear Cache changes: - Add extended linear cache size sysfs attribute - Adjust failure emission of extended linear cache detection in cxl_acpi - Added extended linear cache unit testing support in cxl_test Preparation refactor patches for PRM translation support: - Simplify cxl_rd_ops allocation and handling - Group xor arithmetric setup code in a single block - Remove local variable @inc in cxl_port_setup_targets()" * tag 'cxl-for-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (22 commits) cxl/test: Assign overflow_err_count from log->nr_overflow cxl/test: Remove ret_limit race condition in mock_get_event() cxl/test: remove unused mock function for cxl_rcd_component_reg_phys() cxl/test: Add support for acpi extended linear cache cxl/test: Add cxl_test CFMWS support for extended linear cache cxl/test: Standardize CXL auto region size cxl/region: Remove local variable @inc in cxl_port_setup_targets() cxl/acpi: Group xor arithmetric setup code in a single block cxl: Simplify cxl_rd_ops allocation and handling cxl: Clarify comment in spa_maps_hpa() cxl: Rename region_res_match_cxl_range() to spa_maps_hpa() acpi/hmat: Return when generic target is updated cxl: Add handling of locked CXL decoder cxl/region: Add support to indicate region has extended linear cache cxl: Adjust extended linear cache failure emission in cxl_acpi cxl/test: Add cxl_translate module for address translation testing cxl/acpi: Make the XOR calculations available for testing cxl/region: Refactor address translation funcs for testing cxl/pci: replace use of system_wq with system_percpu_wq cxl: fix typos in cdat.c comments ...
10 days	Merge tag 'hid-for-linus-2025120201' of ↵	Linus Torvalds	1	-0/+71
	git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid Pull HID updates from Jiri Kosina: - Proper mapping of HID_GD_Z to ABS_DISTANCE for stylus/pen types of devices (Ping Cheng) - Power management/hibernation improvements in intel-ish (Zhang Lixu) - Improved support for several Logitech devices, e.g. G Pro X Superlight 2, new iteration of Lighspeed receiver, G13, G510 (Nathan Rossi, Mavroudis Chatzilazaridis, Leo L Schwab, Hans de Goede) - Support for UcLogic XP-PEN Artist 24 Pro (Joshua Goins) - WinWing Orion2 throttle support improvement (Ivan Gorinov) - other assorted small fixes and device ID additions * tag 'hid-for-linus-2025120201' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid: (37 commits) drivers: hid: renegotiate resolution multipliers with device after reset HID: evision: Fix Report Descriptor for Evision Wireless Receiver 320f:226f HID: logitech-dj: Fix probe failure when used with KVM HID: logitech-dj: Remove duplicate error logging HID: logitech-dj: Add support for G Pro X Superlight 2 receiver selftests/hid-tablet: add ABS_DISTANCE test for stylus/pen HID: input: map HID_GD_Z to ABS_DISTANCE for stylus/pen HID: bpf: fix typo in HID usage table HID: bpf: add the Huion Kamvas 27 Pro HID: bpf: add heuristics to the Huion Inspiroy 2S eraser button HID: bpf: Add support for XP-Pen Deco02 HID: bpf: Add support for the XP-Pen Deco 01 V3 HID: bpf: Add support for the Waltop Batteryless Tablet HID: bpf: Add fixup for Logitech SpaceNavigator variants HID: bpf: support for Huion Kamvas 16 Gen 3 HID: bpf: add support for Huion Kamvas 13 (Gen 3) (model GS1333) HID: bpf: Add support for the Inspiroy 2M Documentation: hid-alps: Format DataByte* subsection headings Documentation: hid-alps: Fix packet format section headings HID: nintendo: add WQ_PERCPU to alloc_workqueue users ...
10 days	Merge tag 'sound-6.19-rc1' of ↵	Linus Torvalds	1	-1/+1
	git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound updates from Takashi Iwai: "The majority of changes at this time were about ASoC with a lot of code refactoring works. From the functionality POV, there isn't much to see, but we have a wide range of device-specific fixes and updates. Here are some highlights: - Continued ASoC API cleanup work, spanned over many files - Added a SoundWire SCDA generic class driver with regmap support - Enhancements and fixes for Cirrus, Intel, Maxim and Qualcomm. - Support for ASoC Allwinner A523, Mediatek MT8189, Qualcomm QCM2290, QRB2210 and SM6115, SpacemiT K1, and TI TAS2568, TAS5802, TAS5806, TAS5815, TAS5828 and TAS5830 - Usual HD-audio and USB-audio quirks and fixups - Support for Onkyo SE-300PCIE, TASCAM IF-FW/DM MkII Some gpiolib changes for shared GPIOs are included along with this PR for covering ASoC drivers changes" * tag 'sound-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (739 commits) ALSA: hda/realtek: Add PCI SSIDs to HP ProBook quirks ALSA: usb-audio: Simplify with usb_endpoint_max_periodic_payload() ALSA: hda/realtek: fix mute/micmute LEDs don't work for more HP laptops ALSA: rawmidi: Fix inconsistent indenting warning reported by smatch ALSA: dice: fix buffer overflow in detect_stream_formats() ASoC: codecs: Modify awinic amplifier dsp read and write functions ASoC: SDCA: Fixup some more Kconfig issues ASoC: cs35l56: Log a message if firmware is missing ASoC: nau8325: Delete a stray tab firmware: cs_dsp: Add test cases for client_ops == NULL firmware: cs_dsp: Don't require client to provide a struct cs_dsp_client_ops ASoC: fsl_micfil: Set channel range control ASoC: fsl_micfil: Add default quality for different platforms ASoC: intel: sof_sdw: Add codec_info for cs42l45 ASoC: sdw_utils: Add cs42l45 support functions ASoC: intel: sof_sdw: Add ability to have auxiliary devices ASoC: sdw_utils: Move codec_name to dai info ASoC: sdw_utils: Add codec_conf for every DAI ASoC: SDCA: Add terminal type into input/output widget name ASoC: SDCA: Align mute controls to ALSA expectations ...
10 days	Merge tag 'for-6.19/block-20251201' of ↵	Linus Torvalds	2	-34/+45
	git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux Pull block updates from Jens Axboe: - Fix head insertion for mq-deadline, a regression from when priority support was added - Series simplifying and improving the ublk user copy code - Various ublk related cleanups - Fixup REQ_NOWAIT handling in loop/zloop, clearing NOWAIT when the request is punted to a thread for handling - Merge and then later revert loop dio nowait support, as it ended up causing excessive stack usage for when the inline issue code needs to dip back into the full file system code - Improve auto integrity code, making it less deadlock prone - Speedup polled IO handling, but manually managing the hctx lookups - Fixes for blk-throttle for SSD devices - Small series with fixes for the S390 dasd driver - Add support for caching zones, avoiding unnecessary report zone queries - MD pull requests via Yu: - fix null-ptr-dereference regression for dm-raid0 - fix IO hang for raid5 when array is broken with IO inflight - remove legacy 1s delay to speed up system shutdown - change maintainer's email address - data can be lost if array is created with different lbs devices, fix this problem and record lbs of the array in metadata - fix rcu protection for md_thread - fix mddev kobject lifetime regression - enable atomic writes for md-linear - some cleanups - bcache updates via Coly - remove useless discard and cache device code - improve usage of per-cpu workqueues - Reorganize the IO scheduler switching code, fixing some lockdep reports as well - Improve the block layer P2P DMA support - Add support to the block tracing code for zoned devices - Segment calculation improves, and memory alignment flexibility improvements - Set of prep and cleanups patches for ublk batching support. The actual batching hasn't been added yet, but helps shrink down the workload of getting that patchset ready for 6.20 - Fix for how the ps3 block driver handles segments offsets - Improve how block plugging handles batch tag allocations - nbd fixes for use-after-free of the configuration on device clear/put - Set of improvements and fixes for zloop - Add Damien as maintainer of the block zoned device code handling - Various other fixes and cleanups * tag 'for-6.19/block-20251201' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux: (162 commits) block/rnbd: correct all kernel-doc complaints blk-mq: use queue_hctx in blk_mq_map_queue_type md: remove legacy 1s delay in md_notify_reboot md/raid5: fix IO hang when array is broken with IO inflight md: warn about updating super block failure md/raid0: fix NULL pointer dereference in create_strip_zones() for dm-raid sbitmap: fix all kernel-doc warnings ublk: add helper of __ublk_fetch() ublk: pass const pointer to ublk_queue_is_zoned() ublk: refactor auto buffer register in ublk_dispatch_req() ublk: add `union ublk_io_buf` with improved naming ublk: add parameter `struct io_uring_cmd *` to ublk_prep_auto_buf_reg() kfifo: add kfifo_alloc_node() helper for NUMA awareness blk-mq: fix potential uaf for 'queue_hw_ctx' blk-mq: use array manage hctx map instead of xarray ublk: prevent invalid access with DEBUG s390/dasd: Use scnprintf() instead of sprintf() s390/dasd: Move device name formatting into separate function s390/dasd: Remove unnecessary debugfs_create() return checks s390/dasd: Fix gendisk parent after copy pair swap ...
11 days	Merge tag 'net-next-6.19' of ↵	Linus Torvalds	90	-1288/+4187
	git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking updates from Jakub Kicinski: "Core & protocols: - Replace busylock at the Tx queuing layer with a lockless list. Resulting in a 300% (4x) improvement on heavy TX workloads, sending twice the number of packets per second, for half the cpu cycles. - Allow constantly busy flows to migrate to a more suitable CPU/NIC queue. Normally we perform queue re-selection when flow comes out of idle, but under extreme circumstances the flows may be constantly busy. Add sysctl to allow periodic rehashing even if it'd risk packet reordering. - Optimize the NAPI skb cache, make it larger, use it in more paths. - Attempt returning Tx skbs to the originating CPU (like we already did for Rx skbs). - Various data structure layout and prefetch optimizations from Eric. - Remove ktime_get() from the recvmsg() fast path, ktime_get() is sadly quite expensive on recent AMD machines. - Extend threaded NAPI polling to allow the kthread busy poll for packets. - Make MPTCP use Rx backlog processing. This lowers the lock pressure, improving the Rx performance. - Support memcg accounting of MPTCP socket memory. - Allow admin to opt sockets out of global protocol memory accounting (using a sysctl or BPF-based policy). The global limits are a poor fit for modern container workloads, where limits are imposed using cgroups. - Improve heuristics for when to kick off AF_UNIX garbage collection. - Allow users to control TCP SACK compression, and default to 33% of RTT. - Add tcp_rcvbuf_low_rtt sysctl to let datacenter users avoid unnecessarily aggressive rcvbuf growth and overshot when the connection RTT is low. - Preserve skb metadata space across skb_push / skb_pull operations. - Support for IPIP encapsulation in the nftables flowtable offload. - Support appending IP interface information to ICMP messages (RFC 5837). - Support setting max record size in TLS (RFC 8449). - Remove taking rtnl_lock from RTM_GETNEIGHTBL and RTM_SETNEIGHTBL. - Use a dedicated lock (and RCU) in MPLS, instead of rtnl_lock. - Let users configure the number of write buffers in SMC. - Add new struct sockaddr_unsized for sockaddr of unknown length, from Kees. - Some conversions away from the crypto_ahash API, from Eric Biggers. - Some preparations for slimming down struct page. - YAML Netlink protocol spec for WireGuard. - Add a tool on top of YAML Netlink specs/lib for reporting commonly computed derived statistics and summarized system state. Driver API: - Add CAN XL support to the CAN Netlink interface. - Add uAPI for reporting PHY Mean Square Error (MSE) diagnostics, as defined by the OPEN Alliance's "Advanced diagnostic features for 100BASE-T1 automotive Ethernet PHYs" specification. - Add DPLL phase-adjust-gran pin attribute (and implement it in zl3073x). - Refactor xfrm_input lock to reduce contention when NIC offloads IPsec and performs RSS. - Add info to devlink params whether the current setting is the default or a user override. Allow resetting back to default. - Add standard device stats for PSP crypto offload. - Leverage DSA frame broadcast to implement simple HSR frame duplication for a lot of switches without dedicated HSR offload. - Add uAPI defines for 1.6Tbps link modes. Device drivers: - Add Motorcomm YT921x gigabit Ethernet switch support. - Add MUCSE driver for N500/N210 1GbE NIC series. - Convert drivers to support dedicated ops for timestamping control, and away from the direct IOCTL handling. While at it support GET operations for PHY timestamping. - Add (and convert most drivers to) a dedicated ethtool callback for reading the Rx ring count. - Significant refactoring efforts in the STMMAC driver, which supports Synopsys turn-key MAC IP integrated into a ton of SoCs. - Ethernet high-speed NICs: - Broadcom (bnxt): - support PPS in/out on all pins - Intel (100G, ice, idpf): - ice: implement standard ethtool and timestamping stats - i40e: support setting the max number of MAC addresses per VF - iavf: support RSS of GTP tunnels for 5G and LTE deployments - nVidia/Mellanox (mlx5): - reduce downtime on interface reconfiguration - disable being an XDP redirect target by default (same as other drivers) to avoid wasting resources if feature is unused - Meta (fbnic): - add support for Linux-managed PCS on 25G, 50G, and 100G links - Wangxun: - support Rx descriptor merge, and Tx head writeback - support Rx coalescing offload - support 25G SPF and 40G QSFP modules - Ethernet virtual: - Google (gve): - allow ethtool to configure rx_buf_len - implement XDP HW RX Timestamping support for DQ descriptor format - Microsoft vNIC (mana): - support HW link state events - handle hardware recovery events when probing the device - Ethernet NICs consumer, and embedded: - usbnet: add support for Byte Queue Limits (BQL) - AMD (amd-xgbe): - add device selftests - NXP (enetc): - add i.MX94 support - Broadcom integrated MACs (bcmgenet, bcmasp): - bcmasp: add support for PHY-based Wake-on-LAN - Broadcom switches (b53): - support port isolation - support BCM5389/97/98 and BCM63XX ARL formats - Lantiq/MaxLinear switches: - support bridge FDB entries on the CPU port - use regmap for register access - allow user to enable/disable learning - support Energy Efficient Ethernet - support configuring RMII clock delays - add tagging driver for MaxLinear GSW1xx switches - Synopsys (stmmac): - support using the HW clock in free running mode - add Eswin EIC7700 support - add Rockchip RK3506 support - add Altera Agilex5 support - Cadence (macb): - cleanup and consolidate descriptor and DMA address handling - add EyeQ5 support - TI: - icssg-prueth: support AF_XDP - Airoha access points: - add missing Ethernet stats and link state callback - add AN7583 support - support out-of-order Tx completion processing - Power over Ethernet: - pd692x0: preserve PSE configuration across reboots - add support for TPS23881B devices - Ethernet PHYs: - Open Alliance OATC14 10BASE-T1S PHY cable diagnostic support - Support 50G SerDes and 100G interfaces in Linux-managed PHYs - micrel: - support for non PTP SKUs of lan8814 - enable in-band auto-negotiation on lan8814 - realtek: - cable testing support on RTL8224 - interrupt support on RTL8221B - motorcomm: support for PHY LEDs on YT853 - microchip: support for LAN867X Rev.D0 PHYs w/ SQI and cable diag - mscc: support for PHY LED control - CAN drivers: - m_can: add support for optional reset and system wake up - remove can_change_mtu() obsoleted by core handling - mcp251xfd: support GPIO controller functionality - Bluetooth: - add initial support for PASTa - WiFi: - split ieee80211.h file, it's way too big - improvements in VHT radiotap reporting, S1G, Channel Switch Announcement handling, rate tracking in mesh networks - improve multi-radio monitor mode support, and add a cfg80211 debugfs interface for it - HT action frame handling on 6 GHz - initial chanctx work towards NAN - MU-MIMO sniffer improvements - WiFi drivers: - RealTek (rtw89): - support USB devices RTL8852AU and RTL8852CU - initial work for RTL8922DE - improved injection support - Intel: - iwlwifi: new sniffer API support - MediaTek (mt76): - WED support for >32-bit DMA - airoha NPU support - regdomain improvements - continued WiFi7/MLO work - Qualcomm/Atheros: - ath10k: factory test support - ath11k: TX power insertion support - ath12k: BSS color change support - ath12k: statistics improvements - brcmfmac: Acer A1 840 tablet quirk - rtl8xxxu: 40 MHz connection fixes/support" * tag 'net-next-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1381 commits) net: page_pool: sanitise allocation order net: page pool: xa init with destroy on pp init net/mlx5e: Support XDP target xmit with dummy program net/mlx5e: Update XDP features in switch channels selftests/tc-testing: Test CAKE scheduler when enqueue drops packets net/sched: sch_cake: Fix incorrect qlen reduction in cake_drop wireguard: netlink: generate netlink code wireguard: uapi: generate header with ynl-gen wireguard: uapi: move flag enums wireguard: uapi: move enum wg_cmd wireguard: netlink: add YNL specification selftests: drv-net: Fix tolerance calculation in devlink_rate_tc_bw.py selftests: drv-net: Fix and clarify TC bandwidth split in devlink_rate_tc_bw.py selftests: drv-net: Set shell=True for sysfs writes in devlink_rate_tc_bw.py selftests: drv-net: Use Iperf3Runner in devlink_rate_tc_bw.py selftests: drv-net: introduce Iperf3Runner for measurement use cases selftests: drv-net: Add devlink_rate_tc_bw.py to TEST_PROGS net: ps3_gelic_net: Use napi_alloc_skb() and napi_gro_receive() Documentation: net: dsa: mention simple HSR offload helpers Documentation: net: dsa: mention availability of RedBox ...
11 days	Merge tag 'bpf-next-6.19' of ↵	Linus Torvalds	87	-3506/+8034
	git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Pull bpf updates from Alexei Starovoitov: - Convert selftests/bpf/test_tc_edt and test_tc_tunnel from .sh to test_progs runner (Alexis Lothoré) - Convert selftests/bpf/test_xsk to test_progs runner (Bastien Curutchet) - Replace bpf memory allocator with kmalloc_nolock() in bpf_local_storage (Amery Hung), and in bpf streams and range tree (Puranjay Mohan) - Introduce support for indirect jumps in BPF verifier and x86 JIT (Anton Protopopov) and arm64 JIT (Puranjay Mohan) - Remove runqslower bpf tool (Hoyeon Lee) - Fix corner cases in the verifier to close several syzbot reports (Eduard Zingerman, KaFai Wan) - Several improvements in deadlock detection in rqspinlock (Kumar Kartikeya Dwivedi) - Implement "jmp" mode for BPF trampoline and corresponding DYNAMIC_FTRACE_WITH_JMP. It improves "fexit" program type performance from 80 M/s to 136 M/s. With Steven's Ack. (Menglong Dong) - Add ability to test non-linear skbs in BPF_PROG_TEST_RUN (Paul Chaignon) - Do not let BPF_PROG_TEST_RUN emit invalid GSO types to stack (Daniel Borkmann) - Generalize buildid reader into bpf_dynptr (Mykyta Yatsenko) - Optimize bpf_map_update_elem() for map-in-map types (Ritesh Oedayrajsingh Varma) - Introduce overwrite mode for BPF ring buffer (Xu Kuohai) * tag 'bpf-next-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (169 commits) bpf: optimize bpf_map_update_elem() for map-in-map types bpf: make kprobe_multi_link_prog_run always_inline selftests/bpf: do not hardcode target rate in test_tc_edt BPF program selftests/bpf: remove test_tc_edt.sh selftests/bpf: integrate test_tc_edt into test_progs selftests/bpf: rename test_tc_edt.bpf.c section to expose program type selftests/bpf: Add success stats to rqspinlock stress test rqspinlock: Precede non-head waiter queueing with AA check rqspinlock: Disable spinning for trylock fallback rqspinlock: Use trylock fallback when per-CPU rqnode is busy rqspinlock: Perform AA checks immediately rqspinlock: Enclose lock/unlock within lock entry acquisitions bpf: Remove runqslower tool selftests/bpf: Remove usage of lsm/file_alloc_security in selftest bpf: Disable file_alloc_security hook bpf: check for insn arrays in check_ptr_alignment bpf: force BPF_F_RDONLY_PROG on insn array creation bpf: Fix exclusive map memory leak selftests/bpf: Make CS length configurable for rqspinlock stress test selftests/bpf: Add lock wait time stats to rqspinlock stress test ...
11 days	ktest.pl: Fix uninitialized var in config-bisect.pl	Steven Rostedt	1	-2/+2
	The error path of copying the old config used the wrong variable in the error message: $ mkdir /tmp/build $ ./tools/testing/ktest/config-bisect.pl -b /tmp/build config-good /tmp/config-bad $ chmod 0 /tmp/build $ ./tools/testing/ktest/config-bisect.pl -b /tmp/build config-good /tmp/config-bad good cp /tmp/build//.config config-good.tmp ... [0 seconds] FAILED! Use of uninitialized value $config in concatenation (.) or string at ./tools/testing/ktest/config-bisect.pl line 744. failed to copy to config-good.tmp When it should have shown: failed to copy /tmp/build//.config to config-good.tmp Cc: stable@vger.kernel.org Cc: John 'Warthog9' Hawley <warthog9@kernel.org> Fixes: 0f0db065999cf ("ktest: Add standalone config-bisect.pl program") Link: https://patch.msgid.link/20251203180924.6862bd26@gandalf.local.home Reported-by: "John W. Krahn" <jwkrahn@shaw.ca> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
11 days	Merge tag 'linux_kselftest-next-6.19-rc1' of ↵	Linus Torvalds	6	-19/+176
	git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull kselftest updates from Shuah Khan: - Add basic test for trace_marker_raw file to tracing selftest - Fix invalid array access in printf dma_map_benchmark selftest - Add tprobe enable/disable testcase to tracing selftest - Update fprobe selftest for ftrace based fprobe * tag 'linux_kselftest-next-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: selftests: tracing: Update fprobe selftest for ftrace based fprobe selftests: tracing: Add tprobe enable/disable testcase selftests/run_kselftest.sh: exit with error if tests fail selftests/dma: fix invalid array access in printf selftests/tracing: Add basic test for trace_marker_raw file
11 days	Merge tag 'livepatching-for-6.19' of ↵	Linus Torvalds	1	-1/+5
	git://git.kernel.org/pub/scm/linux/kernel/git/livepatching/livepatching Pull livepatching updates from Petr Mladek: - Support both paths where tracefs is typically mounted in selftests - Make old_sympos 0 and 1 equal. They both are valid when there is only one symbol with the given name. * tag 'livepatching-for-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/livepatching/livepatching: selftests: livepatch: use canonical ftrace path livepatch: Match old_sympos 0 and 1 in klp_find_func()
11 days	Merge tag 'sched_ext-for-6.19' of ↵	Linus Torvalds	3	-0/+476
	git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext Pull sched_ext updates from Tejun Heo: - Improve recovery from misbehaving BPF schedulers. When a scheduler puts many tasks with varying affinity restrictions on a shared DSQ, CPUs scanning through tasks they cannot run can overwhelm the system, causing lockups. Bypass mode now uses per-CPU DSQs with a load balancer to avoid this, and hooks into the hardlockup detector to attempt recovery. Add scx_cpu0 example scheduler to demonstrate this scenario. - Add lockless peek operation for DSQs to reduce lock contention for schedulers that need to query queue state during load balancing. - Allow scx_bpf_reenqueue_local() to be called from anywhere in preparation for deprecating cpu_acquire/release() callbacks in favor of generic BPF hooks. - Prepare for hierarchical scheduler support: add scx_bpf_task_set_slice() and scx_bpf_task_set_dsq_vtime() kfuncs, make scx_bpf_dsq_insert() return bool, and wrap kfunc args in structs for future aux__prog parameter. - Implement cgroup_set_idle() callback to notify BPF schedulers when a cgroup's idle state changes. - Fix migration tasks being incorrectly downgraded from stop_sched_class to rt_sched_class across sched_ext enable/disable. Applied late as the fix is low risk and the bug subtle but needs stable backporting. - Various fixes and cleanups including cgroup exit ordering, SCX_KICK_WAIT reliability, and backward compatibility improvements. tag 'sched_ext-for-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext: (44 commits) sched_ext: Fix incorrect sched_class settings for per-cpu migration tasks sched_ext: tools: Removing duplicate targets during non-cross compilation sched_ext: Use kvfree_rcu() to release per-cpu ksyncs object sched_ext: Pass locked CPU parameter to scx_hardlockup() and add docs sched_ext: Update comments replacing breather with aborting mechanism sched_ext: Implement load balancer for bypass mode sched_ext: Factor out abbreviated dispatch dequeue into dispatch_dequeue_locked() sched_ext: Factor out scx_dsq_list_node cursor initialization into INIT_DSQ_LIST_CURSOR sched_ext: Add scx_cpu0 example scheduler sched_ext: Hook up hardlockup detector sched_ext: Make handle_lockup() propagate scx_verror() result sched_ext: Refactor lockup handlers into handle_lockup() sched_ext: Make scx_exit() and scx_vexit() return bool sched_ext: Exit dispatch and move operations immediately when aborting sched_ext: Simplify breather mechanism with scx_aborting flag sched_ext: Use per-CPU DSQs instead of per-node global DSQs in bypass mode sched_ext: Refactor do_enqueue_task() local and global DSQ paths sched_ext: Use shorter slice in bypass mode sched_ext: Mark racy bitfields to prevent adding fields that can't tolerate races sched_ext: Minor cleanups to scx_task_iter ...
11 days	Merge tag 'cgroup-for-6.19' of ↵	Linus Torvalds	8	-24/+32
	git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup updates from Tejun Heo: - Defer task cgroup unlink until after the dying task's final context switch so that controllers see the cgroup properly populated until the task is truly gone - cpuset cleanups and simplifications. Enforce that domain isolated CPUs stay in root or isolated partitions and fail if isolated+nohz_full would leave no housekeeping CPU. Fix sched/deadline root domain handling during CPU hot-unplug and race for tasks in attaching cpusets - Misc fixes including memory reclaim protection documentation and selftest KTAP conformance * tag 'cgroup-for-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (21 commits) cpuset: Treat cpusets in attaching as populated sched/deadline: Walk up cpuset hierarchy to decide root domain when hot-unplug cgroup/cpuset: Introduce cpuset_cpus_allowed_locked() docs: cgroup: No special handling of unpopulated memcgs docs: cgroup: Note about sibling relative reclaim protection docs: cgroup: Explain reclaim protection target selftests/cgroup: conform test to KTAP format output cpuset: remove need_rebuild_sched_domains cpuset: remove global remote_children list cpuset: simplify node setting on error cgroup: include missing header for struct irq_work cgroup: Fix sleeping from invalid context warning on PREEMPT_RT cgroup/cpuset: Globally track isolated_cpus update cgroup/cpuset: Ensure domain isolated CPUs stay in root or isolated partition cgroup/cpuset: Move up prstate_housekeeping_conflict() helper cgroup/cpuset: Fail if isolated and nohz_full don't leave any housekeeping cgroup/cpuset: Rename update_unbound_workqueue_cpumask() to update_isolation_cpumasks() cgroup: Defer task cgroup unlink until after the task is done switching out cgroup: Move dying_tasks cleanup from cgroup_task_release() to cgroup_task_free() cgroup: Rename cgroup lifecycle hooks to cgroup_task_*() ...
11 days	selftests: tpm2: Fix ill defined assertions	Maurice Hieronymus	1	-2/+2
	Remove parentheses around assert statements in Python. With parentheses, assert always evaluates to True, making the checks ineffective. Signed-off-by: Maurice Hieronymus <mhi@mailbox.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
11 days	Merge tag 'rcu.release.v6.19' of ↵	Linus Torvalds	4	-17/+158
	git://git.kernel.org/pub/scm/linux/kernel/git/rcu/linux Pull RCU updates from Frederic Weisbecker: "SRCU: - Properly handle SRCU readers within IRQ disabled sections in tiny SRCU - Preparation to reimplement RCU Tasks Trace on top of SRCU fast: - Introduce API to expedite a grace period and test it through rcutorture - Split srcu-fast in two flavours: SRCU-fast and SRCU-fast-updown. Both are still targeted toward faster readers (without full barriers on LOCK and UNLOCK) at the expense of heavier write side (using full RCU grace period ordering instead of simply full ordering) as compared to "traditional" non-fast SRCU. But those srcu-fast flavours are going to be optimized in two different ways: - SRCU-fast will become the reimplementation basis for RCU-TASK-TRACE for consolidation. Since RCU-TASK-TRACE must be NMI safe, SRCU-fast must be as well. - SRCU-fast-updown will be needed for uretprobes code in order to get rid of the read-side memory barriers while still allowing entering the reader at task level while exiting it in a timer handler. It is considered semaphore-like in that it can have different owners between LOCK and UNLOCK. However it is not NMI-safe. The actual optimizations are work in progress for the next cycle. Only the new interfaces are added for now, along with related torture and scalability test code. - Create/document/debug/torture new proper initializers for RCU fast: DEFINE_SRCU_FAST() and init_srcu_struct_fast() This allows for using right away the proper ordering on the write side (either full ordering or full RCU grace period ordering) without waiting for the read side to tell which to use. This also optimizes the read side altogether with moving flavour debug checks under debug config and with removing a costly RmW operation on their first call. - Make some diagnostic functions tracing safe Refscale: - Add performance testing for common context synchronizations (Preemption, IRQ, Softirq) and per-cpu increments. Those are relevant comparisons against SRCU-fast read side APIs, especially as they are planned to synchronize further tracing fast-path code Miscellanous: - In order to prepare the layout for nohz_full work deferral to user exit, the context tracking state must shrink the counter of transitions to/from RCU not watching. The only possible hazard is to trigger wrap-around more easily, delaying a bit grace periods when that happens. This should be a rare event though. Yet add debugging and torture code to test that assumption - Fix memory leak on locktorture module - Annotate accesses in rculist_nulls.h to prevent from KCSAN warnings. On recent discussions, we also concluded that all those WRITE_ONCE() and READ_ONCE() on list APIs deserve appropriate comments. Something to be expected for the next cycle - Provide a script to apply several configs to several commits with torture - Allow torture to reuse a build directory in order to save needless rebuild time - Various cleanups" * tag 'rcu.release.v6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/rcu/linux: (29 commits) refscale: Add SRCU-fast-updown readers refscale: Exercise DEFINE_STATIC_SRCU_FAST() and init_srcu_struct_fast() rcutorture: Make srcu{,d}_torture_init() announce the SRCU type srcu: Create an SRCU-fast-updown API refscale: Do not disable interrupts for tests involving local_bh_enable() refscale: Add non-atomic per-CPU increment readers refscale: Add this_cpu_inc() readers refscale: Add preempt_disable() readers refscale: Add local_bh_disable() readers refscale: Add local_irq_disable() and local_irq_save() readers torture: Permit negative kvm.sh --kconfig numberic arguments srcu: Add SRCU_READ_FLAVOR_FAST_UPDOWN CPP macro rcu: Mark diagnostic functions as notrace rcutorture: Make TREE04 use CONFIG_RCU_DYNTICKS_TORTURE rcutorture: Remove redundant rcutorture_one_extend() from rcu_torture_one_read() rcutorture: Permit kvm-again.sh to re-use the build directory torture: Add kvm-series.sh to test commit/scenario combination rcu: use WRITE_ONCE() for ->next and ->pprev of hlist_nulls locktorture: Fix memory leak in param_set_cpumask() doc: Update for SRCU-fast definitions and initialization ...
11 days	Merge tag 'nolibc-20251130-for-6.19-1' of ↵	Linus Torvalds	3	-1/+15
	git://git.kernel.org/pub/scm/linux/kernel/git/nolibc/linux-nolibc Pull nolibc updates from Thomas Weißschuh: - Preparations to the use of nolibc in UML: - Cleanup of sparse warnings - Library mode without _start() - More consistency when disabling errno - Unconditional installation of all architecture support files - Always 64-bit wide ino_t and off_t - Various cleanups and bug fixes * tag 'nolibc-20251130-for-6.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/nolibc/linux-nolibc: (25 commits) selftests/nolibc: error out on linker warnings selftests/nolibc: use lld to link loongarch binaries tools/nolibc: remove more __nolibc_enosys() fallbacks tools/nolibc: remove now superfluous overflow check in llseek tools/nolibc: use 64-bit off_t tools/nolibc: prefer the llseek syscall tools/nolibc: handle 64-bit off_t for llseek tools/nolibc: use 64-bit ino_t tools/nolibc: avoid using plain integer as NULL pointer tools/nolibc: add support for fchdir() tools/nolibc: clean up outdated comments in generic arch.h tools/nolibc: make the "headers" target install all supported archs tools/nolibc: add the more portable inttypes.h tools/nolibc: provide the portable sys/select.h tools/nolibc: add missing memchr() to string.h tools/nolibc: fix misleading help message regarding installation path tools/nolibc: add uio.h with readv and writev tools/nolibc: add option to disable runtime tools/nolibc: use __fallthrough__ rather than fallthrough tools/nolibc: implement %m if errno is not defined ...
12 days	Merge tag 'arm64-upstream' of ↵	Linus Torvalds	4	-8/+70
	git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Catalin Marinas: "These are the arm64 updates for 6.19. The biggest part is the Arm MPAM driver under drivers/resctrl/. There's a patch touching mm/ to handle spurious faults for huge pmd (similar to the pte version). The corresponding arm64 part allows us to avoid the TLB maintenance if a (huge) page is reused after a write fault. There's EFI refactoring to allow runtime services with preemption enabled and the rest is the usual perf/PMU updates and several cleanups/typos. Summary: Core features: - Basic Arm MPAM (Memory system resource Partitioning And Monitoring) driver under drivers/resctrl/ which makes use of the fs/rectrl/ API Perf and PMU: - Avoid cycle counter on multi-threaded CPUs - Extend CSPMU device probing and add additional filtering support for NVIDIA implementations - Add support for the PMUs on the NoC S3 interconnect - Add additional compatible strings for new Cortex and C1 CPUs - Add support for data source filtering to the SPE driver - Add support for i.MX8QM and "DB" PMU in the imx PMU driver Memory managemennt: - Avoid broadcast TLBI if page reused in write fault - Elide TLB invalidation if the old PTE was not valid - Drop redundant cpu_set__tcr_t0sz() macros - Propagate pgtable_alloc() errors outside of __create_pgd_mapping() - Propagate return value from __change_memory_common() ACPI and EFI: - Call EFI runtime services without disabling preemption - Remove unused ACPI function Miscellaneous: - ptrace support to disable streaming on SME-only systems - Improve sysreg generation to include a 'Prefix' descriptor - Replace __ASSEMBLY__ with __ASSEMBLER__ - Align register dumps in the kselftest zt-test - Remove some no longer used macros/functions - Various spelling corrections" tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (94 commits) arm64/mm: Document why linear map split failure upon vm_reset_perms is not problematic arm64/pageattr: Propagate return value from __change_memory_common arm64/sysreg: Remove unused define ARM64_FEATURE_FIELD_BITS KVM: arm64: selftests: Consider all 7 possible levels of cache KVM: arm64: selftests: Remove ARM64_FEATURE_FIELD_BITS and its last user arm64: atomics: lse: Remove unused parameters from ATOMIC_FETCH_OP_AND macros Documentation/arm64: Fix the typo of register names ACPI: GTDT: Get rid of acpi_arch_timer_mem_init() perf: arm_spe: Add support for filtering on data source perf: Add perf_event_attr::config4 perf/imx_ddr: Add support for PMU in DB (system interconnects) perf/imx_ddr: Get and enable optional clks perf/imx_ddr: Move ida_alloc() from ddr_perf_init() to ddr_perf_probe() dt-bindings: perf: fsl-imx-ddr: Add compatible string for i.MX8QM, i.MX8QXP and i.MX8DXL arm64: remove duplicate ARCH_HAS_MEM_ENCRYPT arm64: mm: use untagged address to calculate page index MAINTAINERS: new entry for MPAM Driver arm_mpam: Add kunit tests for props_mismatch() arm_mpam: Add kunit test for bitmap reset arm_mpam: Add helper to reset saved mbwu state ...
12 days	Merge tag 's390-6.19-1' of ↵	Linus Torvalds	4	-53/+1
	git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Heiko Carstens: - Provide a new interface for dynamic configuration and deconfiguration of hotplug memory, allowing with and without memmap_on_memory support. This makes the way memory hotplug is handled on s390 much more similar to other architectures - Remove compat support. There shouldn't be any compat user space around anymore, therefore get rid of a lot of code which also doesn't need to be tested anymore - Add stackprotector support. GCC 16 will get new compiler options, which allow to generate code required for kernel stackprotector support - Merge pai_crypto and pai_ext PMU drivers into a new driver. This removes a lot of duplicated code. The new driver is also extendable and allows to support new PMUs - Add driver override support for AP queues - Rework and extend zcrypt and AP trace events to allow for tracing of crypto requests - Support block sizes larger than 65535 bytes for CCW tape devices - Since the rework of the virtual kernel address space the module area and the kernel image are within the same 4GB area. This eliminates the need of weak per cpu variables. Get rid of ARCH_MODULE_NEEDS_WEAK_PER_CPU - Various other small improvements and fixes * tag 's390-6.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (92 commits) watchdog: diag288_wdt: Remove KMSG_COMPONENT macro s390/entry: Use lay instead of aghik s390/vdso: Get rid of -m64 flag handling s390/vdso: Rename vdso64 to vdso s390: Rename head64.S to head.S s390/vdso: Use common STABS_DEBUG and DWARF_DEBUG macros s390: Add stackprotector support s390/modules: Simplify module_finalize() slightly s390: Remove KMSG_COMPONENT macro s390/percpu: Get rid of ARCH_MODULE_NEEDS_WEAK_PER_CPU s390/ap: Restrict driver_override versus apmask and aqmask use s390/ap: Rename mutex ap_perms_mutex to ap_attr_mutex s390/ap: Support driver_override for AP queue devices s390/ap: Use all-bits-one apmask/aqmask for vfio in_use() checks s390/debug: Update description of resize operation s390/syscalls: Switch to generic system call table generation s390/syscalls: Remove system call table pointer from thread_struct s390/uapi: Remove 31 bit support from uapi header files s390: Remove compat support tools: Remove s390 compat support ...
12 days	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	Jakub Kicinski	2	-0/+29
	Merge in late fixes in preparation for the net-next PR. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
12 days	Merge tag 'x86_cpu_for_6.19-rc1' of ↵	Linus Torvalds	1	-9/+12
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 CPU feature updates from Dave Hansen: "The biggest thing of note here is Linear Address Space Separation (LASS). It represents the first time I can think of that the upper=>kernel/lower=>user address space convention is actually recognized by the hardware on x86. It ensures that userspace can not even get the hardware to _start_ page walks for the kernel address space. This, of course, is a really nice generic side channel defense. This is really only a down payment on LASS support. There are still some details to work out in its interaction with EFI calls and vsyscall emulation. For now, LASS is disabled if either of those features is compiled in (which is almost always the case). There's also one straggler commit in here which converts an under-utilized AMD CPU feature leaf into a generic Linux-defined leaf so more feature can be packed in there. Summary: - Enable Linear Address Space Separation (LASS) - Change X86_FEATURE leaf 17 from an AMD leaf to Linux-defined" * tag 'x86_cpu_for_6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/cpu: Enable LASS during CPU initialization selftests/x86: Update the negative vsyscall tests to expect a #GP x86/traps: Communicate a LASS violation in #GP message x86/kexec: Disable LASS during relocate kernel x86/alternatives: Disable LASS when patching kernel code x86/asm: Introduce inline memcpy and memset x86/cpu: Add an LASS dependency on SMAP x86/cpufeatures: Enumerate the LASS feature bits x86/cpufeatures: Make X86_FEATURE leaf 17 Linux-specific
12 days	Merge tag 'kvm-s390-next-6.19-1' of ↵	Paolo Bonzini	2	-0/+141
	https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD - SCA rework - VIRT_XFER_TO_GUEST_WORK support - Operation exception forwarding support - Cleanups
12 days	Merge tag 'timers-core-2025-11-30' of ↵	Linus Torvalds	2	-10/+77
	git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer core updates from Thomas Gleixner: - Prevent a thundering herd problem when the timekeeper CPU is delayed and a large number of CPUs compete to acquire jiffies_lock to do the update. Limit it to one CPU with a separate "uncontended" atomic variable. - A set of improvements for the timer migration mechanism: - Support imbalanced NUMA trees correctly - Support dynamic exclusion of CPUs from the migrator duty to allow the cpuset/isolation mechanism to exclude them from handling timers of remote idle CPUs - The usual small updates, cleanups and enhancements * tag 'timers-core-2025-11-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: timers/migration: Exclude isolated cpus from hierarchy cpumask: Add initialiser to use cleanup helpers sched/isolation: Force housekeeping if isolcpus and nohz_full don't leave any cgroup/cpuset: Rename update_unbound_workqueue_cpumask() to update_isolation_cpumasks() timers/migration: Use scoped_guard on available flag set/clear timers/migration: Add mask for CPUs available in the hierarchy timers/migration: Rename 'online' bit to 'available' selftests/timers/nanosleep: Add tests for return of remaining time selftests/timers: Clean up kernel version check in posix_timers time: Fix a few typos in time[r] related code comments time: tick-oneshot: Add missing Return and parameter descriptions to kernel-doc hrtimer: Store time as ktime_t in restart block timers/migration: Remove dead code handling idle CPU checking for remote timers timers/migration: Remove unused "cpu" parameter from tmigr_get_group() timers/migration: Assert that hotplug preparing CPU is part of stable active hierarchy timers/migration: Fix imbalanced NUMA trees timers/migration: Remove locking on group connection timers/migration: Convert "while" loops to use "for" tick/sched: Limit non-timekeeper CPUs calling jiffies update
12 days	Merge tag 'kvmarm-6.19' of ↵	Paolo Bonzini	13	-22/+819
	https://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD KVM/arm64 updates for 6.19 - Support for userspace handling of synchronous external aborts (SEAs), allowing the VMM to potentially handle the abort in a non-fatal manner. - Large rework of the VGIC's list register handling with the goal of supporting more active/pending IRQs than available list registers in hardware. In addition, the VGIC now supports EOImode==1 style deactivations for IRQs which may occur on a separate vCPU than the one that acked the IRQ. - Support for FEAT_XNX (user / privileged execute permissions) and FEAT_HAF (hardware update to the Access Flag) in the software page table walkers and shadow MMU. - Allow page table destruction to reschedule, fixing long need_resched latencies observed when destroying a large VM. - Minor fixes to KVM and selftests
12 days	Merge tag 'kvm-riscv-6.19-1' of https://github.com/kvm-riscv/linux into HEAD	Paolo Bonzini	1	-0/+4
	KVM/riscv changes for 6.19 - SBI MPXY support for KVM guest - New KVM_EXIT_FAIL_ENTRY_NO_VSFILE for the case when in-kernel AIA virtualization fails to allocate IMSIC VS-file - Support enabling dirty log gradually in small chunks - Fix guest page fault within HLV* instructions - Flush VS-stage TLB after VCPU migration for Andes cores
12 days	Merge tag 'loongarch-kvm-6.19' of ↵	Paolo Bonzini	6	-3/+417
	git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson into HEAD LoongArch KVM changes for v6.19 1. Get VM PMU capability from HW GCFG register. 2. Add AVEC basic support. 3. Use 64-bit register definition for EIOINTC. 4. Add KVM timer test cases for tools/selftests.
12 days	selftests/tc-testing: Test CAKE scheduler when enqueue drops packets	Xiang Mei	1	-0/+28
	Add tests that trigger packet drops in cake_enqueue(): "CAKE with QFQ Parent - CAKE enqueue with packets dropping". It forces CAKE_enqueue to return NET_XMIT_CN after dropping the packets when it has a QFQ parent. Signed-off-by: Xiang Mei <xmei5@asu.edu> Reviewed-by: Toke Høiland-Jørgensen <toke@toke.dk> Link: https://patch.msgid.link/20251128001415.377823-3-xmei5@asu.edu Signed-off-by: Paolo Abeni <pabeni@redhat.com>
12 days	Merge tag 'asoc-v6.19' of ↵	Takashi Iwai	6	-12/+188
	https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus ASoC: Updates for v6.19 This is a very large set of updates, as well as some more extensive cleanup work from Morimto-san we've also added a generic SCDA class driver for SoundWire devices enabling us to support many chips with no custom code. There's also a batch of new drivers added for both SoCs and CODECs. - Added a SoundWire SCDA generic class driver, pulling in a little regmap work to support it. - A lot of cleaup and API improvement work from Morimoto-san. - Lots of work on the existing Cirrus, Intel, Maxim and Qualcomm drivers. - Support for Allwinner A523, Mediatek MT8189, Qualcomm QCM2290, QRB2210 and SM6115, SpacemiT K1, and TI TAS2568, TAS5802, TAS5806, TAS5815, TAS5828 and TAS5830. This also pulls in some gpiolib changes supporting shared GPIOs in the core there so we can convert some of the ASoC drivers open coding handling of that to the core functionality.
13 days	selftests: drv-net: Fix tolerance calculation in devlink_rate_tc_bw.py	Carolina Jubran	1	-44/+30
	Currently, tolerance is computed against the TC’s expected percentage, making TC3 (20%) validation overly strict and TC4 (80%) overly loose. Update BandwidthValidator to take a dict of shares and compute bounds relative to the overall total, so that all shares are validated consistently. Signed-off-by: Carolina Jubran <cjubran@nvidia.com> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Reviewed-by: Nimrod Oren <noren@nvidia.com> Link: https://patch.msgid.link/20251130091938.4109055-7-cjubran@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 days	selftests: drv-net: Fix and clarify TC bandwidth split in devlink_rate_tc_bw.py	Carolina Jubran	1	-13/+13
	Correct the documented bandwidth distribution between TC3 and TC4 from 80/20 to 20/80. Update test descriptions and printed messages to consistently reflect the intended split. Signed-off-by: Carolina Jubran <cjubran@nvidia.com> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Reviewed-by: Nimrod Oren <noren@nvidia.com> Link: https://patch.msgid.link/20251130091938.4109055-6-cjubran@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 days	selftests: drv-net: Set shell=True for sysfs writes in devlink_rate_tc_bw.py	Carolina Jubran	1	-2/+2
	Commit 7c32f7a2d3db ("selftests: net: py: don't default to shell=True") changed the cmd() helper to avoid spawning a shell unless explicitly requested. The devlink_rate_tc_bw test enables SR-IOV by writing to the sriov_numvfs sysfs attribute using redirection. Without shell=True the redirection is not interpreted and the VF device never appears, causing the test to fail. Fix by explicitly passing shell=True in the two places that update sriov_numvfs. Signed-off-by: Carolina Jubran <cjubran@nvidia.com> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Reviewed-by: Nimrod Oren <noren@nvidia.com> Link: https://patch.msgid.link/20251130091938.4109055-5-cjubran@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 days	selftests: drv-net: Use Iperf3Runner in devlink_rate_tc_bw.py	Carolina Jubran	1	-41/+29
	Replace the inline iperf3 subprocess and JSON parsing with Iperf3Runner. Signed-off-by: Carolina Jubran <cjubran@nvidia.com> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Reviewed-by: Nimrod Oren <noren@nvidia.com> Link: https://patch.msgid.link/20251130091938.4109055-4-cjubran@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 days	selftests: drv-net: introduce Iperf3Runner for measurement use cases	Carolina Jubran	3	-12/+82
	GenerateTraffic was added to spin up long-running iperf3 load, mainly to drive high PPS background traffic. It was never meant to provide stable throughput numbers, and trying to repurpose it for measurement does not make sense. Introduce Iperf3Runner to allow tests to split out server/client configuration, control start/stop, and collect JSON output for analysis. This makes it possible to measure bandwidth directly when validating egress shaping. GenerateTraffic stays as the background load generator, reusing the common iperf3 helpers under the hood. Signed-off-by: Carolina Jubran <cjubran@nvidia.com> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Reviewed-by: Nimrod Oren <noren@nvidia.com> Link: https://patch.msgid.link/20251130091938.4109055-3-cjubran@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 days	selftests: drv-net: Add devlink_rate_tc_bw.py to TEST_PROGS	Carolina Jubran	1	-0/+1
	This makes devlink_rate_tc_bw.py present in the Makefile under the same directory. Signed-off-by: Carolina Jubran <cjubran@nvidia.com> Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com> Reviewed-by: Nimrod Oren <noren@nvidia.com> Link: https://patch.msgid.link/20251130091938.4109055-2-cjubran@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 days	selftests: netconsole: remove log noise due to socat exit	Andre Carvalho	1	-1/+1
	This removes some noise that can be distracting while looking at selftests by redirecting socat stderr to /dev/null. Before this commit, netcons_basic would output: Running with target mode: basic (ipv6) 2025/11/29 12:08:03 socat[259] W exiting on signal 15 2025/11/29 12:08:03 socat[271] W exiting on signal 15 basic : ipv6 : Test passed Running with target mode: basic (ipv4) 2025/11/29 12:08:05 socat[329] W exiting on signal 15 2025/11/29 12:08:05 socat[322] W exiting on signal 15 basic : ipv4 : Test passed Running with target mode: extended (ipv6) 2025/11/29 12:08:08 socat[386] W exiting on signal 15 2025/11/29 12:08:08 socat[386] W exiting on signal 15 2025/11/29 12:08:08 socat[380] W exiting on signal 15 extended : ipv6 : Test passed Running with target mode: extended (ipv4) 2025/11/29 12:08:10 socat[440] W exiting on signal 15 2025/11/29 12:08:10 socat[435] W exiting on signal 15 2025/11/29 12:08:10 socat[435] W exiting on signal 15 extended : ipv4 : Test passed After these changes, output looks like: Running with target mode: basic (ipv6) basic : ipv6 : Test passed Running with target mode: basic (ipv4) basic : ipv4 : Test passed Running with target mode: extended (ipv6) extended : ipv6 : Test passed Running with target mode: extended (ipv4) extended : ipv4 : Test passed Signed-off-by: Andre Carvalho <asantostc@gmail.com> Reviewed-by: Simon Horman <horms@kernel.org> Link: https://patch.msgid.link/20251129-netcons-socat-noise-v1-1-605a0cea8fca@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 days	selftests: net: add a hint about MACAddressPolicy=persistent	Jakub Kicinski	1	-1/+1
	New NIPA installation had been reporting a few flaky tests. arp_ndisc_evict_nocarrier is most flaky of them all. I suspect that the flakiness is due to udev swapping the MAC addresses on the interfaces. Extend the message in arp_ndisc_evict_nocarrier to hint at this potential issue. Having the neigh get fail right after ping is rather unusual, unless udev changes the MAC addr causing a flush in the meantime. Link: https://patch.msgid.link/20251127194556.2409574-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 days	selftests: net: py: handle interrupt during cleanup	Jakub Kicinski	1	-2/+16
	Following up on the old discussion [1]. Let the BaseExceptions out of defer()'ed cleanup. And handle it in the main loop. This allows us to exit the tests if user hit Ctrl-C during defer(). Link: https://lore.kernel.org/20251119063228.3adfd743@kernel.org # [1] Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Petr Machata <petrm@nvidia.com> Link: https://patch.msgid.link/20251128004846.2602687-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 days	Merge tag 'vfs-6.19-rc1.coredump' of ↵	Linus Torvalds	9	-1663/+2851
	git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull pidfd and coredump updates from Christian Brauner: "Features: - Expose coredump signal via pidfd Expose the signal that caused the coredump through the pidfd interface. The recent changes to rework coredump handling to rely on unix sockets are in the process of being used in systemd. The previous systemd coredump container interface requires the coredump file descriptor and basic information including the signal number to be sent to the container. This means the signal number needs to be available before sending the coredump to the container. - Add supported_mask field to pidfd Add a new supported_mask field to struct pidfd_info that indicates which information fields are supported by the running kernel. This allows userspace to detect feature availability without relying on error codes or kernel version checks. Cleanups: - Drop struct pidfs_exit_info and prepare to drop exit_info pointer, simplifying the internal publication mechanism for exit and coredump information retrievable via the pidfd ioctl - Use guard() for task_lock in pidfs - Reduce wait_pidfd lock scope - Add missing PIDFD_INFO_SIZE_VER1 constant - Add missing BUILD_BUG_ON() assert on struct pidfd_info Fixes: - Fix PIDFD_INFO_COREDUMP handling Selftests: - Split out coredump socket tests and common helpers into separate files for better organization - Fix userspace coredump client detection issues - Handle edge-triggered epoll correctly - Ignore ENOSPC errors in tests - Add debug logging to coredump socket tests, socket protocol tests, and test helpers - Add tests for PIDFD_INFO_COREDUMP_SIGNAL - Add tests for supported_mask field - Update pidfd header for selftests" * tag 'vfs-6.19-rc1.coredump' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (23 commits) pidfs: reduce wait_pidfd lock scope selftests/coredump: add second PIDFD_INFO_COREDUMP_SIGNAL test selftests/coredump: add first PIDFD_INFO_COREDUMP_SIGNAL test selftests/coredump: ignore ENOSPC errors selftests/coredump: add debug logging to coredump socket protocol tests selftests/coredump: add debug logging to coredump socket tests selftests/coredump: add debug logging to test helpers selftests/coredump: handle edge-triggered epoll correctly selftests/coredump: fix userspace coredump client detection selftests/coredump: fix userspace client detection selftests/coredump: split out coredump socket tests selftests/coredump: split out common helpers selftests/pidfd: add second supported_mask test selftests/pidfd: add first supported_mask test selftests/pidfd: update pidfd header pidfs: expose coredump signal pidfs: drop struct pidfs_exit_info pidfs: prepare to drop exit_info pointer pidfd: add a new supported_mask field pidfs: add missing BUILD_BUG_ON() assert on struct pidfd_info ...
13 days	Merge tag 'namespace-6.19-rc1' of ↵	Linus Torvalds	14	-58/+8274
	git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull namespace updates from Christian Brauner: "This contains substantial namespace infrastructure changes including a new system call, active reference counting, and extensive header cleanups. The branch depends on the shared kbuild branch for -fms-extensions support. Features: - listns() system call Add a new listns() system call that allows userspace to iterate through namespaces in the system. This provides a programmatic interface to discover and inspect namespaces, addressing longstanding limitations: Currently, there is no direct way for userspace to enumerate namespaces. Applications must resort to scanning /proc//ns/ across all processes, which is: - Inefficient - requires iterating over all processes - Incomplete - misses namespaces not attached to any running process but kept alive by file descriptors, bind mounts, or parent references - Permission-heavy - requires access to /proc for many processes - No ordering or ownership information - No filtering per namespace type The listns() system call solves these problems: ssize_t listns(const struct ns_id_req req, u64 ns_ids, size_t nr_ns_ids, unsigned int flags); struct ns_id_req { __u32 size; __u32 spare; __u64 ns_id; struct / listns / { __u32 ns_type; __u32 spare2; __u64 user_ns_id; }; }; Features include: - Pagination support for large namespace sets - Filtering by namespace type (MNT_NS, NET_NS, USER_NS, etc.) - Filtering by owning user namespace - Permission checks respecting namespace isolation - Active Reference Counting Introduce an active reference count that tracks namespace visibility to userspace. A namespace is visible in the following cases: - The namespace is in use by a task - The namespace is persisted through a VFS object (namespace file descriptor or bind-mount) - The namespace is a hierarchical type and is the parent of child namespaces The active reference count does not regulate lifetime (that's still done by the normal reference count) - it only regulates visibility to namespace file handles and listns(). This prevents resurrection of namespaces that are pinned only for internal kernel reasons (e.g., user namespaces held by file->f_cred, lazy TLB references on idle CPUs, etc.) which should not be accessible via (1)-(3). - Unified Namespace Tree Introduce a unified tree structure for all namespaces with: - Fixed IDs assigned to initial namespaces - Lookup based solely on inode number - Maintained list of owned namespaces per user namespace - Simplified rbtree comparison helpers Cleanups - Header Reorganization: - Move namespace types into separate header (ns_common_types.h) - Decouple nstree from ns_common header - Move nstree types into separate header - Switch to new ns_tree_{node,root} structures with helper functions - Use guards for ns_tree_lock - Initial Namespace Reference Count Optimization - Make all reference counts on initial namespaces a nop to avoid pointless cacheline ping-pong for namespaces that can never go away - Drop custom reference count initialization for initial namespaces - Add NS_COMMON_INIT() macro and use it for all namespaces - pid: rely on common reference count behavior - Miscellaneous Cleanups - Rename exit_task_namespaces() to exit_nsproxy_namespaces() - Rename is_initial_namespace() and make argument const - Use boolean to indicate anonymous mount namespace - Simplify owner list iteration in nstree - nsfs: raise SB_I_NODEV, SB_I_NOEXEC, and DCACHE_DONTCACHE explicitly - nsfs: use inode_just_drop() - pidfs: raise DCACHE_DONTCACHE explicitly - pidfs: simplify PIDFD_GET__NAMESPACE ioctls - libfs: allow to specify s_d_flags - cgroup: add cgroup namespace to tree after owner is set - nsproxy: fix free_nsproxy() and simplify create_new_namespaces() Fixes: - setns(pidfd, ...) race condition Fix a subtle race when using pidfds with setns(). When the target task exits after prepare_nsset() but before commit_nsset(), the namespace's active reference count might have been dropped. If setns() then installs the namespaces, it would bump the active reference count from zero without taking the required reference on the owner namespace, leading to underflow when later decremented. The fix resurrects the ownership chain if necessary - if the caller succeeded in grabbing passive references, the setns() should succeed even if the target task exits or gets reaped. - Return EFAULT on put_user() error instead of success - Make sure references are dropped outside of RCU lock (some namespaces like mount namespace sleep when putting the last reference) - Don't skip active reference count initialization for network namespace - Add asserts for active refcount underflow - Add asserts for initial namespace reference counts (both passive and active) - ipc: enable is_ns_init_id() assertions - Fix kernel-doc comments for internal nstree functions - Selftests - 15 active reference count tests - 9 listns() functionality tests - 7 listns() permission tests - 12 inactive namespace resurrection tests - 3 threaded active reference count tests - commit_creds() active reference tests - Pagination and stress tests - EFAULT handling test - nsid tests fixes" tag 'namespace-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (103 commits) pidfs: simplify PIDFD_GET_<type>_NAMESPACE ioctls nstree: fix kernel-doc comments for internal functions nsproxy: fix free_nsproxy() and simplify create_new_namespaces() selftests/namespaces: fix nsid tests ns: drop custom reference count initialization for initial namespaces pid: rely on common reference count behavior ns: add asserts for initial namespace active reference counts ns: add asserts for initial namespace reference counts ns: make all reference counts on initial namespace a nop ipc: enable is_ns_init_id() assertions fs: use boolean to indicate anonymous mount namespace ns: rename is_initial_namespace() ns: make is_initial_namespace() argument const nstree: use guards for ns_tree_lock nstree: simplify owner list iteration nstree: switch to new structures nstree: add helper to operate on struct ns_tree_{node,root} nstree: move nstree types into separate header nstree: decouple from ns_common header ns: move namespace types into separate header ...
13 days	Merge branch 'for-linus' into for-next	Takashi Iwai	38	-97/+1626
	Pull remaining 6.18-devel changes. Signed-off-by: Takashi Iwai <tiwai@suse.de>
13 days	Merge branch 'kvm-arm64/nv-xnx-haf' into kvmarm/next	Oliver Upton	4	-0/+178
	* kvm-arm64/nv-xnx-haf: (22 commits) : Support for FEAT_XNX and FEAT_HAF in nested : : Add support for a couple of MMU-related features that weren't : implemented by KVM's software page table walk: : : - FEAT_XNX: Allows the hypervisor to describe execute permissions : separately for EL0 and EL1 : : - FEAT_HAF: Hardware update of the Access Flag, which in the context of : nested means software walkers must also set the Access Flag. : : The series also adds some basic support for testing KVM's emulation of : the AT instruction, including the implementation detail that AT sets the : Access Flag in KVM. KVM: arm64: at: Update AF on software walk only if VM has FEAT_HAFDBS KVM: arm64: at: Use correct HA bit in TCR_EL2 when regime is EL2 KVM: arm64: Document KVM_PGTABLE_PROT_{UX,PX} KVM: arm64: Fix spelling mistake "Unexpeced" -> "Unexpected" KVM: arm64: Add break to default case in kvm_pgtable_stage2_pte_prot() KVM: arm64: Add endian casting to kvm_swap_s[12]_desc() KVM: arm64: Fix compilation when CONFIG_ARM64_USE_LSE_ATOMICS=n KVM: arm64: selftests: Add test for AT emulation KVM: arm64: nv: Expose hardware access flag management to NV guests KVM: arm64: nv: Implement HW access flag management in stage-2 SW PTW KVM: arm64: Implement HW access flag management in stage-1 SW PTW KVM: arm64: Propagate PTW errors up to AT emulation KVM: arm64: Add helper for swapping guest descriptor KVM: arm64: nv: Use pgtable definitions in stage-2 walk KVM: arm64: Handle endianness in read helper for emulated PTW KVM: arm64: nv: Stop passing vCPU through void ptr in S2 PTW KVM: arm64: Call helper for reading descriptors directly KVM: arm64: nv: Advertise support for FEAT_XNX KVM: arm64: Teach ptdump about FEAT_XNX permissions KVM: arm64: nv: Forward FEAT_XNX permissions to the shadow stage-2 ... Signed-off-by: Oliver Upton <oupton@kernel.org>
13 days	Merge branch 'kvm-arm64/vgic-lr-overflow' into kvmarm/next	Oliver Upton	5	-22/+288
	* kvm-arm64/vgic-lr-overflow: (50 commits) : Support for VGIC LR overflows, courtesy of Marc Zyngier : : Address deficiencies in KVM's GIC emulation when a vCPU has more active : IRQs than can be represented in the VGIC list registers. Sort the AP : list to prioritize inactive and pending IRQs, potentially spilling : active IRQs outside of the LRs. : : Handle deactivation of IRQs outside of the LRs for both EOImode=0/1, : which involves special consideration for SPIs being deactivated from a : different vCPU than the one that acked it. KVM: arm64: Convert ICH_HCR_EL2_TDIR cap to EARLY_LOCAL_CPU_FEATURE KVM: arm64: selftests: vgic_irq: Add timer deactivation test KVM: arm64: selftests: vgic_irq: Add Group-0 enable test KVM: arm64: selftests: vgic_irq: Add asymmetric SPI deaectivation test KVM: arm64: selftests: vgic_irq: Perform EOImode==1 deactivation in ack order KVM: arm64: selftests: vgic_irq: Remove LR-bound limitation KVM: arm64: selftests: vgic_irq: Exclude timer-controlled interrupts KVM: arm64: selftests: vgic_irq: Change configuration before enabling interrupt KVM: arm64: selftests: vgic_irq: Fix GUEST_ASSERT_IAR_EMPTY() helper KVM: arm64: selftests: gic_v3: Disable Group-0 interrupts by default KVM: arm64: selftests: gic_v3: Add irq group setting helper KVM: arm64: GICv2: Always trap GICV_DIR register KVM: arm64: GICv2: Handle deactivation via GICV_DIR traps KVM: arm64: GICv2: Handle LR overflow when EOImode==0 KVM: arm64: GICv3: Force exit to sync ICH_HCR_EL2.En KVM: arm64: GICv3: nv: Plug L1 LR sync into deactivation primitive KVM: arm64: GICv3: nv: Resync LRs/VMCR/HCR early for better MI emulation KVM: arm64: GICv3: Avoid broadcast kick on CPUs lacking TDIR KVM: arm64: GICv3: Handle in-LR deactivation when possible KVM: arm64: GICv3: Add SPI tracking to handle asymmetric deactivation ... Signed-off-by: Oliver Upton <oupton@kernel.org>
13 days	Merge branch 'kvm-arm64/sea-user' into kvmarm/next	Oliver Upton	3	-0/+333
	* kvm-arm64/sea-user: : Userspace handling of SEAs, courtesy of Jiaqi Yan : : Add support for processing external aborts in userspace in situations : where the host has failed to do so, allowing the VMM to potentially : reinject an external abort into the VM. Documentation: kvm: new UAPI for handling SEA KVM: selftests: Test for KVM_EXIT_ARM_SEA KVM: arm64: VM exit to userspace to handle SEA Signed-off-by: Oliver Upton <oupton@kernel.org>
13 days	KVM: arm64: Fix spelling mistake "Unexpeced" -> "Unexpected"	Colin Ian King	1	-1/+1
	There is a spelling mistake in a TEST_FAIL message. Fix it. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Link: https://msgid.link/20251128175124.319094-1-colin.i.king@gmail.com Signed-off-by: Oliver Upton <oupton@kernel.org>
13 days	KVM: arm64: selftests: Add test for AT emulation	Oliver Upton	4	-0/+178
	Add a basic test for AT emulation in the EL2&0 and EL1&0 translation regimes. Reviewed-by: Marc Zyngier <maz@kernel.org> Tested-by: Marc Zyngier <maz@kernel.org> Link: https://msgid.link/20251124190158.177318-16-oupton@kernel.org Signed-off-by: Oliver Upton <oupton@kernel.org>
14 days	Merge branch 'rcu/misc' into next	Frederic Weisbecker	3	-16/+157
	- In order to prepare the layout for nohz_full work deferral to user exit, the context tracking state must shrink the counter of transitions to/from RCU not watching. The only possible hazard is to trigger wrap-around more easily, delaying a bit grace periods when that happens. This should be a rare event though. Yet add debugging and torture code to test that assumption. - Fix memory leak on locktorture module - Annotate accesses in rculist_nulls.h to prevent from KCSAN warnings. On recent discussions, we also concluded that all those WRITE_ONCE() and READ_ONCE() on list APIs deserve appropriate comments. Something to be expected for the next cycle. - Provide a script to apply several configs to several commits with torture. - Allow torture to reuse a build directory in order to save needless rebuild time. - Various cleanups.
2025-11-29	selftests/mm/uffd: initialize char variable to Null	Ankit Khushwaha	2	-5/+5
	In "uffd-stress.c" & "uffd-unit-tests.c". address of char variable having garbage value (uninitialized) is passed to 'write' syscall triggers warning. uffd-stress.c:246:39: warning: variable 'c' is uninitialized when passed as a const pointer argument here [-Wuninitialized-const-pointer] uffd-unit-tests.c:581:31: warning: variable 'c' is uninitialized when passed as a const pointer argument here [-Wuninitialized-const-pointer] so the fix is to assign char variable to '\0' to prevent writing of garbage value. Link: https://lkml.kernel.org/r/20251126160830.52124-1-ankitkhushwaha.linux@gmail.com Signed-off-by: Ankit Khushwaha <ankitkhushwaha.linux@gmail.com> Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Cc: Bill Wendling <morbo@google.com> Cc: Justin Stitt <justinstitt@google.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Peter Xu <peterx@redhat.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-11-29	mm: introduce VMA flags bitmap type	Lorenzo Stoakes	1	-30/+120
	It is useful to transition to using a bitmap for VMA flags so we can avoid running out of flags, especially for 32-bit kernels which are constrained to 32 flags, necessitating some features to be limited to 64-bit kernels only. By doing so, we remove any constraint on the number of VMA flags moving forwards no matter the platform and can decide in future to extend beyond 64 if required. We start by declaring an opaque types, vma_flags_t (which resembles mm_struct flags of type mm_flags_t), setting it to precisely the same size as vm_flags_t, and place it in union with vm_flags in the VMA declaration. We additionally update struct vm_area_desc equivalently placing the new opaque type in union with vm_flags. This change therefore does not impact the size of struct vm_area_struct or struct vm_area_desc. In order for the change to be iterative and to avoid impacting performance, we designate VM_xxx declared bitmap flag values as those which must exist in the first system word of the VMA flags bitmap. We therefore declare vma_flags_clear_all(), vma_flags_overwrite_word(), vma_flags_overwrite_word(), vma_flags_overwrite_word_once(), vma_flags_set_word() and vma_flags_clear_word() in order to allow us to update the existing vm_flags_*() functions to utilise these helpers. This is a stepping stone towards converting users to the VMA flags bitmap and behaves precisely as before. By doing this, we can eliminate the existing private vma->__vm_flags field in the vma->vm_flags union and replace it with the newly introduced opaque type vma_flags, which we call flags so we refer to the new bitmap field as vma->flags. We update vma_flag_[test, set]_atomic() to account for the change also. We adapt vm_flags_reset_once() to only clear those bits above the first system word providing write-once semantics to the first system word (which it is presumed the caller requires - and in all current use cases this is so). As we currently only specify that the VMA flags bitmap size is equal to BITS_PER_LONG number of bits, this is a noop, but is defensive in preparation for a future change that increases this. We additionally update the VMA userland test declarations to implement the same changes there. Finally, we update the rust code to reference vma->vm_flags on update rather than vma->__vm_flags which has been removed. This is safe for now, albeit it is implicitly performing a const cast. Once we introduce flag helpers we can improve this more. No functional change intended. Link: https://lkml.kernel.org/r/bab179d7b153ac12f221b7d65caac2759282cfe9.1764064557.git.lorenzo.stoakes@oracle.com Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Pedro Falcato <pfalcato@suse.de> Acked-by: Alice Ryhl <aliceryhl@google.com> [rust] Cc: Alex Gaynor <alex.gaynor@gmail.com> Cc: Alistair Popple <apopple@nvidia.com> Cc: Andreas Hindborg <a.hindborg@kernel.org> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: Baoquan He <bhe@redhat.com> Cc: Barry Song <baohua@kernel.org> Cc: Ben Segall <bsegall@google.com> Cc: Björn Roy Baron <bjorn3_gh@protonmail.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Byungchul Park <byungchul@sk.com> Cc: Chengming Zhou <chengming.zhou@linux.dev> Cc: Chris Li <chrisl@kernel.org> Cc: Danilo Krummrich <dakr@kernel.org> Cc: David Hildenbrand <david@redhat.com> Cc: David Rientjes <rientjes@google.com> Cc: Dev Jain <dev.jain@arm.com> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Gary Guo <gary@garyguo.net> Cc: Gregory Price <gourry@gourry.net> Cc: "Huang, Ying" <ying.huang@linux.alibaba.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Joshua Hahn <joshua.hahnjy@gmail.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Kairui Song <kasong@tencent.com> Cc: Kees Cook <kees@kernel.org> Cc: Kemeng Shi <shikemeng@huaweicloud.com> Cc: Lance Yang <lance.yang@linux.dev> Cc: Leon Romanovsky <leon@kernel.org> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Mathew Brost <matthew.brost@intel.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Mel Gorman <mgorman <mgorman@suse.de> Cc: Michal Hocko <mhocko@suse.com> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Mike Rapoport <rppt@kernel.org> Cc: Muchun Song <muchun.song@linux.dev> Cc: Nhat Pham <nphamcs@gmail.com> Cc: Nico Pache <npache@redhat.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Peter Xu <peterx@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Rakie Kim <rakie.kim@sk.com> Cc: Rik van Riel <riel@surriel.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Shakeel Butt <shakeel.butt@linux.dev> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Trevor Gross <tmgross@umich.edu> Cc: Valentin Schneider <vschneid@redhat.com> Cc: Vincent Guittot <vincent.guittot@linaro.org> Cc: Wei Xu <weixugc@google.com> Cc: xu xin <xu.xin16@zte.com.cn> Cc: Yuanchu Xie <yuanchu@google.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-11-29	tools/testing/vma: eliminate dependency on vma->__vm_flags	Lorenzo Stoakes	1	-10/+10
	The userland VMA test code relied on an internal implementation detail - the existence of vma->__vm_flags to directly access VMA flags. There is no need to do so when we have the vm_flags_*() helper functions available. This is ugly, but also a subsequent commit will eliminate this field altogether so this will shortly become broken. This patch has us utilise the helper functions instead. Link: https://lkml.kernel.org/r/6275c53a6bb20743edcbe92d3e130183b47d18d0.1764064557.git.lorenzo.stoakes@oracle.com Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Pedro Falcato <pfalcato@suse.de> Acked-by: Alice Ryhl <aliceryhl@google.com> [rust] Cc: Alex Gaynor <alex.gaynor@gmail.com> Cc: Alistair Popple <apopple@nvidia.com> Cc: Andreas Hindborg <a.hindborg@kernel.org> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: Baoquan He <bhe@redhat.com> Cc: Barry Song <baohua@kernel.org> Cc: Ben Segall <bsegall@google.com> Cc: Björn Roy Baron <bjorn3_gh@protonmail.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Byungchul Park <byungchul@sk.com> Cc: Chengming Zhou <chengming.zhou@linux.dev> Cc: Chris Li <chrisl@kernel.org> Cc: Danilo Krummrich <dakr@kernel.org> Cc: David Hildenbrand <david@redhat.com> Cc: David Rientjes <rientjes@google.com> Cc: Dev Jain <dev.jain@arm.com> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Gary Guo <gary@garyguo.net> Cc: Gregory Price <gourry@gourry.net> Cc: "Huang, Ying" <ying.huang@linux.alibaba.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Joshua Hahn <joshua.hahnjy@gmail.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Kairui Song <kasong@tencent.com> Cc: Kees Cook <kees@kernel.org> Cc: Kemeng Shi <shikemeng@huaweicloud.com> Cc: Lance Yang <lance.yang@linux.dev> Cc: Leon Romanovsky <leon@kernel.org> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Mathew Brost <matthew.brost@intel.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Mel Gorman <mgorman <mgorman@suse.de> Cc: Michal Hocko <mhocko@suse.com> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Mike Rapoport <rppt@kernel.org> Cc: Muchun Song <muchun.song@linux.dev> Cc: Nhat Pham <nphamcs@gmail.com> Cc: Nico Pache <npache@redhat.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Peter Xu <peterx@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Rakie Kim <rakie.kim@sk.com> Cc: Rik van Riel <riel@surriel.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Shakeel Butt <shakeel.butt@linux.dev> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Trevor Gross <tmgross@umich.edu> Cc: Valentin Schneider <vschneid@redhat.com> Cc: Vincent Guittot <vincent.guittot@linaro.org> Cc: Wei Xu <weixugc@google.com> Cc: xu xin <xu.xin16@zte.com.cn> Cc: Yuanchu Xie <yuanchu@google.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-11-29	mm: declare VMA flags by bit	Lorenzo Stoakes	1	-45/+259
	Patch series "initial work on making VMA flags a bitmap", v3. We are in the rather silly situation that we are running out of VMA flags as they are currently limited to a system word in size. This leads to absurd situations where we limit features to 64-bit architectures only because we simply do not have the ability to add a flag for 32-bit ones. This is very constraining and leads to hacks or, in the worst case, simply an inability to implement features we want for entirely arbitrary reasons. This also of course gives us something of a Y2K type situation in mm where we might eventually exhaust all of the VMA flags even on 64-bit systems. This series lays the groundwork for getting away from this limitation by establishing VMA flags as a bitmap whose size we can increase in future beyond 64 bits if required. This is necessarily a highly iterative process given the extensive use of VMA flags throughout the kernel, so we start by performing basic steps. Firstly, we declare VMA flags by bit number rather than by value, retaining the VM_xxx fields but in terms of these newly introduced VMA_xxx_BIT fields. While we are here, we use sparse annotations to ensure that, when dealing with VMA bit number parameters, we cannot be passed values which are not declared as such - providing some useful type safety. We then introduce an opaque VMA flag type, much like the opaque mm_struct flag type introduced in commit bb6525f2f8c4 ("mm: add bitmap mm->flags field"), which we establish in union with vma->vm_flags (but still set at system word size meaning there is no functional or data type size change). We update the vm_flags_xxx() helpers to use this new bitmap, introducing sensible helpers to do so. This series lays the foundation for further work to expand the use of bitmap VMA flags and eventually eliminate these arbitrary restrictions. This patch (of 4): In order to lay the groundwork for VMA flags being a bitmap rather than a system word in size, we need to be able to consistently refer to VMA flags by bit number rather than value. Take this opportunity to do so in an enum which we which is additionally useful for tooling to extract metadata from. This additionally makes it very clear which bits are being used for what at a glance. We use the VMA_ prefix for the bit values as it is logical to do so since these reference VMAs. We consistently suffix with _BIT to make it clear what the values refer to. We declare bit values even when the flags that use them would not be enabled by config options as this is simply clearer and clearly defines what bit numbers are used for what, at no additional cost. We declare a sparse-bitwise type vma_flag_t which ensures that users can't pass around invalid VMA flags by accident and prepares for future work towards VMA flags being a bitmap where we want to ensure bit values are type safe. To make life easier, we declare some macro helpers - DECLARE_VMA_BIT() allows us to avoid duplication in the enum bit number declarations (and maintaining the sparse __bitwise attribute), and INIT_VM_FLAG() is used to assist with declaration of flags. Unfortunately we can't declare both in the enum, as we run into issue with logic in the kernel requiring that flags are preprocessor definitions, and additionally we cannot have a macro which declares another macro so we must define each flag macro directly. Additionally, update the VMA userland testing vma_internal.h header to include these changes. We also have to fix the parameters to the vma_flag_*_atomic() functions since VMA_MAYBE_GUARD_BIT is now of type vma_flag_t and sparse will complain otherwise. We have to update some rather silly if-deffery found in mm/task_mmu.c which would otherwise break. Finally, we update the rust binding helper as now it cannot auto-detect the flags at all. Link: https://lkml.kernel.org/r/cover.1764064556.git.lorenzo.stoakes@oracle.com Link: https://lkml.kernel.org/r/3a35e5a0bcfa00e84af24cbafc0653e74deda64a.1764064556.git.lorenzo.stoakes@oracle.com Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Pedro Falcato <pfalcato@suse.de> Acked-by: Alice Ryhl <aliceryhl@google.com> [rust] Cc: Alex Gaynor <alex.gaynor@gmail.com> Cc: Alistair Popple <apopple@nvidia.com> Cc: Andreas Hindborg <a.hindborg@kernel.org> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: Baoquan He <bhe@redhat.com> Cc: Barry Song <baohua@kernel.org> Cc: Ben Segall <bsegall@google.com> Cc: Björn Roy Baron <bjorn3_gh@protonmail.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Byungchul Park <byungchul@sk.com> Cc: Chengming Zhou <chengming.zhou@linux.dev> Cc: Chris Li <chrisl@kernel.org> Cc: Danilo Krummrich <dakr@kernel.org> Cc: David Hildenbrand <david@redhat.com> Cc: David Rientjes <rientjes@google.com> Cc: Dev Jain <dev.jain@arm.com> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Gary Guo <gary@garyguo.net> Cc: Gregory Price <gourry@gourry.net> Cc: "Huang, Ying" <ying.huang@linux.alibaba.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Joshua Hahn <joshua.hahnjy@gmail.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Kairui Song <kasong@tencent.com> Cc: Kees Cook <kees@kernel.org> Cc: Kemeng Shi <shikemeng@huaweicloud.com> Cc: Lance Yang <lance.yang@linux.dev> Cc: Leon Romanovsky <leon@kernel.org> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Mathew Brost <matthew.brost@intel.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Mel Gorman <mgorman <mgorman@suse.de> Cc: Michal Hocko <mhocko@suse.com> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Mike Rapoport <rppt@kernel.org> Cc: Muchun Song <muchun.song@linux.dev> Cc: Nhat Pham <nphamcs@gmail.com> Cc: Nico Pache <npache@redhat.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Peter Xu <peterx@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Rakie Kim <rakie.kim@sk.com> Cc: Rik van Riel <riel@surriel.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Shakeel Butt <shakeel.butt@linux.dev> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Trevor Gross <tmgross@umich.edu> Cc: Valentin Schneider <vschneid@redhat.com> Cc: Vincent Guittot <vincent.guittot@linaro.org> Cc: Wei Xu <weixugc@google.com> Cc: xu xin <xu.xin16@zte.com.cn> Cc: Yuanchu Xie <yuanchu@google.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-11-29	selftests/bpf: do not hardcode target rate in test_tc_edt BPF program	Alexis Lothoré (eBPF Foundation)	2	-3/+4
	test_tc_edt currently defines the target rate in both the userspace and BPF parts. This value could be defined once in the userspace part if we make it able to configure the BPF program before starting the test. Add a target_rate variable in the BPF part, and make the userspace part set it to the desired rate before attaching the shaping program. Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com> Link: https://lore.kernel.org/r/20251128-tc_edt-v2-4-26db48373e73@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-11-29	selftests/bpf: remove test_tc_edt.sh	Alexis Lothoré (eBPF Foundation)	2	-102/+0
	Now that test_tc_edt has been integrated in test_progs, remove the legacy shell script. Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com> Link: https://lore.kernel.org/r/20251128-tc_edt-v2-3-26db48373e73@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-11-29	selftests/bpf: integrate test_tc_edt into test_progs	Alexis Lothoré (eBPF Foundation)	2	-1/+145
	test_tc_edt.sh uses a pair of veth and a BPF program attached to the TX veth to shape the traffic to 5MBps. It then checks that the amount of received bytes (at interface level), compared to the TX duration, indeed matches 5Mbps. Convert this test script to the test_progs framework: - keep the double veth setup, isolated in two veths - run a small tcp server, and connect client to server - push a pre-configured amount of bytes, and measure how much time has been needed to push those - ensure that this rate is in a 2% error margin around the target rate This two percent value, while being tight, is hopefully large enough to not make the test too flaky in CI, while also turning it into a small example of BPF-based shaping. Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com> Link: https://lore.kernel.org/r/20251128-tc_edt-v2-2-26db48373e73@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-11-29	selftests/bpf: rename test_tc_edt.bpf.c section to expose program type	Alexis Lothoré (eBPF Foundation)	2	-2/+3
	The test_tc_edt BPF program uses a custom section name, which works fine when manually loading it with tc, but prevents it from being loaded with libbpf. Update the program section name to "tc" to be able to manipulate it with a libbpf-based C test. Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com> Link: https://lore.kernel.org/r/20251128-tc_edt-v2-1-26db48373e73@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-11-29	selftests/bpf: Add success stats to rqspinlock stress test	Kumar Kartikeya Dwivedi	1	-12/+43
	Add stats to observe the success and failure rate of lock acquisition attempts in various contexts. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20251128232802.1031906-7-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-11-28	selftests: bonding: add delay before each xvlan_over_bond connectivity check	Hangbin Liu	1	-0/+1
	Jakub reported increased flakiness in bond_macvlan_ipvlan.sh on regular kernel, while the tests consistently pass on a debug kernel. This suggests a timing-sensitive issue. To mitigate this, introduce a short sleep before each xvlan_over_bond connectivity check. The delay helps ensure neighbor and route cache have fully converged before verifying connectivity. The sleep interval is kept minimal since check_connection() is invoked nearly 100 times during the test. Fixes: 246af950b940 ("selftests: bonding: add macvlan over bond testing") Reported-by: Jakub Kicinski <kuba@kernel.org> Closes: https://lore.kernel.org/netdev/20251114082014.750edfad@kernel.org Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Link: https://patch.msgid.link/20251127143310.47740-1-liuhangbin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-28	bpf: Remove runqslower tool	Hoyeon Lee	3	-20/+1
	runqslower was added in commit 9c01546d26d2 "tools/bpf: Add runqslower tool to tools/bpf" as a BCC port to showcase early BPF CO-RE + libbpf workflows. runqslower continues to live in BCC (libbpf-tools), so there is no need to keep building and maintaining it. Drop tools/bpf/runqslower and remove all build hooks in tools/bpf and selftests accordingly. Signed-off-by: Hoyeon Lee <hoyeon.lee@suse.com> Link: https://lore.kernel.org/r/20251126093821.373291-1-hoyeon.lee@suse.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-11-28	selftests/bpf: Remove usage of lsm/file_alloc_security in selftest	Amery Hung	3	-7/+7
	file_alloc_security hook is disabled. Use other LSM hooks in selftests instead. Signed-off-by: Amery Hung <ameryhung@gmail.com> Link: https://lore.kernel.org/r/20251126202927.2584874-2-ameryhung@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-11-28	bpf: force BPF_F_RDONLY_PROG on insn array creation	Anton Protopopov	1	-1/+1
	The original implementation added a hack to check_mem_access() to prevent programs from writing into insn arrays. To get rid of this hack, enforce BPF_F_RDONLY_PROG on map creation. Also fix the corresponding selftest, as the error message changes with this patch. Suggested-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com> Link: https://lore.kernel.org/r/20251128063224.1305482-2-a.s.protopopov@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-11-28	vfio: selftests: Add vfio_pci_device_init_perf_test	David Matlack	2	-0/+171
	Add a new VFIO selftest for measuring the time it takes to run vfio_pci_device_init() in parallel for one or more devices. This test serves as manual regression test for the performance improvement of commit e908f58b6beb ("vfio/pci: Separate SR-IOV VF dev_set"). For example, when running this test with 64 VFs under the same PF: Before: $ ./vfio_pci_device_init_perf_test -r vfio_pci_device_init_perf_test.iommufd.init 0000:1a:00.0 0000:1a:00.1 ... ... Wall time: 6.653234463s Min init time (per device): 0.101215344s Max init time (per device): 6.652755941s Avg init time (per device): 3.377609608s After: $ ./vfio_pci_device_init_perf_test -r vfio_pci_device_init_perf_test.iommufd.init 0000:1a:00.0 0000:1a:00.1 ... ... Wall time: 0.122978332s Min init time (per device): 0.108121915s Max init time (per device): 0.122762761s Avg init time (per device): 0.113816748s This test does not make any assertions about performance, since any such assertion is likely to be flaky due to system differences and random noise. However this test can be fed into automation to detect regressions, and can be used by developers in the future to measure performance optimizations. Suggested-by: Aaron Lewis <aaronlewis@google.com> Reviewed-by: Alex Mastro <amastro@fb.com> Tested-by: Alex Mastro <amastro@fb.com> Reviewed-by: Raghavendra Rao Ananta <rananta@google.com> Signed-off-by: David Matlack <dmatlack@google.com> Link: https://lore.kernel.org/r/20251126231733.3302983-19-dmatlack@google.com Signed-off-by: Alex Williamson <alex@shazbot.org>
2025-11-28	vfio: selftests: Eliminate INVALID_IOVA	David Matlack	4	-10/+13
	Eliminate INVALID_IOVA as there are platforms where UINT64_MAX is a valid iova. Reviewed-by: Alex Mastro <amastro@fb.com> Tested-by: Alex Mastro <amastro@fb.com> Reviewed-by: Raghavendra Rao Ananta <rananta@google.com> Signed-off-by: David Matlack <dmatlack@google.com> Link: https://lore.kernel.org/r/20251126231733.3302983-18-dmatlack@google.com Signed-off-by: Alex Williamson <alex@shazbot.org>
2025-11-28	vfio: selftests: Split libvfio.h into separate header files	David Matlack	6	-334/+381
	Split out the contents of libvfio.h into separate header files, but keep libvfio.h as the top-level include that all tests can use. Put all new header files into a libvfio/ subdirectory to avoid future name conflicts in include paths when libvfio is used by other selftests like KVM. No functional change intended. Reviewed-by: Alex Mastro <amastro@fb.com> Tested-by: Alex Mastro <amastro@fb.com> Reviewed-by: Raghavendra Rao Ananta <rananta@google.com> Signed-off-by: David Matlack <dmatlack@google.com> Link: https://lore.kernel.org/r/20251126231733.3302983-17-dmatlack@google.com Signed-off-by: Alex Williamson <alex@shazbot.org>
2025-11-28	vfio: selftests: Move vfio_selftests_*() helpers into libvfio.c	David Matlack	3	-71/+79
	Move the vfio_selftests_*() helpers into their own file libvfio.c. These helpers have nothing to do with struct vfio_pci_device, so they don't make sense in vfio_pci_device.c. No functional change intended. Reviewed-by: Alex Mastro <amastro@fb.com> Tested-by: Alex Mastro <amastro@fb.com> Reviewed-by: Raghavendra Rao Ananta <rananta@google.com> Signed-off-by: David Matlack <dmatlack@google.com> Link: https://lore.kernel.org/r/20251126231733.3302983-16-dmatlack@google.com Signed-off-by: Alex Williamson <alex@shazbot.org>
2025-11-28	vfio: selftests: Rename vfio_util.h to libvfio.h	David Matlack	11	-13/+13
	Rename vfio_util.h to libvfio.h to match the name of libvfio.mk. No functional change intended. Reviewed-by: Alex Mastro <amastro@fb.com> Tested-by: Alex Mastro <amastro@fb.com> Reviewed-by: Raghavendra Rao Ananta <rananta@google.com> Signed-off-by: David Matlack <dmatlack@google.com> Link: https://lore.kernel.org/r/20251126231733.3302983-15-dmatlack@google.com Signed-off-by: Alex Williamson <alex@shazbot.org>
2025-11-28	vfio: selftests: Stop passing device for IOMMU operations	David Matlack	4	-65/+22
	Drop the struct vfio_pci_device wrappers for IOMMU map/unmap functions and require tests to directly call iommu_map(), iommu_unmap(), etc. This results in more concise code, and also makes it clear the map operations are happening on a struct iommu, not necessarily on a specific device, especially when multi-device tests are introduced. Do the same for iova_allocator_init() as that function only needs the struct iommu, not struct vfio_pci_device. Reviewed-by: Alex Mastro <amastro@fb.com> Tested-by: Alex Mastro <amastro@fb.com> Reviewed-by: Raghavendra Rao Ananta <rananta@google.com> Signed-off-by: David Matlack <dmatlack@google.com> Link: https://lore.kernel.org/r/20251126231733.3302983-14-dmatlack@google.com Signed-off-by: Alex Williamson <alex@shazbot.org>
2025-11-28	vfio: selftests: Move IOVA allocator into iova_allocator.c	David Matlack	3	-71/+95
	Move the IOVA allocator into its own file, to provide better separation between the allocator and the struct vfio_pci_device helper code. The allocator could go into iommu.c, but it is standalone enough that a separate file seems cleaner. This also continues the trend of having a .c for every major object in VFIO selftests (vfio_pci_device.c, vfio_pci_driver.c, iommu.c, and now iova_allocator.c). No functional change intended. Reviewed-by: Alex Mastro <amastro@fb.com> Tested-by: Alex Mastro <amastro@fb.com> Reviewed-by: Raghavendra Rao Ananta <rananta@google.com> Signed-off-by: David Matlack <dmatlack@google.com> Link: https://lore.kernel.org/r/20251126231733.3302983-13-dmatlack@google.com Signed-off-by: Alex Williamson <alex@shazbot.org>
2025-11-28	vfio: selftests: Move IOMMU library code into iommu.c	David Matlack	4	-453/+527
	Move all the IOMMU related library code into their own file iommu.c. This provides a better separation between the vfio_pci_device helper code and the iommu code. No function change intended. Reviewed-by: Alex Mastro <amastro@fb.com> Tested-by: Alex Mastro <amastro@fb.com> Reviewed-by: Raghavendra Rao Ananta <rananta@google.com> Signed-off-by: David Matlack <dmatlack@google.com> Link: https://lore.kernel.org/r/20251126231733.3302983-12-dmatlack@google.com Signed-off-by: Alex Williamson <alex@shazbot.org>
2025-11-28	vfio: selftests: Rename struct vfio_dma_region to dma_region	David Matlack	4	-21/+21
	Rename struct vfio_dma_region to dma_region. This is in preparation for separating the VFIO PCI device library code from the IOMMU library code. This name change also better reflects the fact that DMA mappings can be managed by either VFIO or IOMMUFD. i.e. the "vfio_" prefix is misleading. Reviewed-by: Alex Mastro <amastro@fb.com> Tested-by: Alex Mastro <amastro@fb.com> Reviewed-by: Raghavendra Rao Ananta <rananta@google.com> Signed-off-by: David Matlack <dmatlack@google.com> Link: https://lore.kernel.org/r/20251126231733.3302983-11-dmatlack@google.com Signed-off-by: Alex Williamson <alex@shazbot.org>
2025-11-28	vfio: selftests: Upgrade driver logging to dev_err()	David Matlack	2	-4/+4
	Upgrade various logging in the VFIO selftests drivers from dev_info() to dev_err(). All of these logs indicate scenarios that may be unexpected. For example, the logging during probing indicates matching devices but that aren't supported by the driver. And the memcpy errors can indicate a problem if the caller was not trying to do something like exercise I/O fault handling. Exercising I/O fault handling is certainly a valid thing to do, but the driver can't infer the caller's expectations, so better to just log with dev_err(). Suggested-by: Raghavendra Rao Ananta <rananta@google.com> Reviewed-by: Alex Mastro <amastro@fb.com> Tested-by: Alex Mastro <amastro@fb.com> Reviewed-by: Raghavendra Rao Ananta <rananta@google.com> Signed-off-by: David Matlack <dmatlack@google.com> Link: https://lore.kernel.org/r/20251126231733.3302983-10-dmatlack@google.com Signed-off-by: Alex Williamson <alex@shazbot.org>
2025-11-28	vfio: selftests: Prefix logs with device BDF where relevant	David Matlack	4	-25/+30
	Prefix log messages with the device's BDF where relevant. This will help understanding VFIO selftests logs when tests are run with multiple devices. Reviewed-by: Alex Mastro <amastro@fb.com> Tested-by: Alex Mastro <amastro@fb.com> Reviewed-by: Raghavendra Rao Ananta <rananta@google.com> Signed-off-by: David Matlack <dmatlack@google.com> Link: https://lore.kernel.org/r/20251126231733.3302983-9-dmatlack@google.com Signed-off-by: Alex Williamson <alex@shazbot.org>
2025-11-28	vfio: selftests: Eliminate overly chatty logging	David Matlack	1	-14/+0
	Eliminate overly chatty logs that are printed during almost every test. These logs are adding more noise than value. If a test cares about this information it can log it itself. This is especially true as the VFIO selftests gains support for multiple devices in a single test (which multiplies all these logs). Reviewed-by: Alex Mastro <amastro@fb.com> Tested-by: Alex Mastro <amastro@fb.com> Reviewed-by: Raghavendra Rao Ananta <rananta@google.com> Signed-off-by: David Matlack <dmatlack@google.com> Link: https://lore.kernel.org/r/20251126231733.3302983-8-dmatlack@google.com Signed-off-by: Alex Williamson <alex@shazbot.org>
2025-11-28	vfio: selftests: Support multiple devices in the same container/iommufd	David Matlack	5	-50/+107
	Support tests that want to add multiple devices to the same container/iommufd by decoupling struct vfio_pci_device from struct iommu. Multi-devices tests can now put multiple devices in the same container/iommufd like so: iommu = iommu_init(iommu_mode); device1 = vfio_pci_device_init(bdf1, iommu); device2 = vfio_pci_device_init(bdf2, iommu); device3 = vfio_pci_device_init(bdf3, iommu); ... vfio_pci_device_cleanup(device3); vfio_pci_device_cleanup(device2); vfio_pci_device_cleanup(device1); iommu_cleanup(iommu); To account for the new separation of vfio_pci_device and iommu, update existing tests to initialize and cleanup a struct iommu. Reviewed-by: Alex Mastro <amastro@fb.com> Tested-by: Alex Mastro <amastro@fb.com> Reviewed-by: Raghavendra Rao Ananta <rananta@google.com> Signed-off-by: David Matlack <dmatlack@google.com> Link: https://lore.kernel.org/r/20251126231733.3302983-7-dmatlack@google.com Signed-off-by: Alex Williamson <alex@shazbot.org>
2025-11-28	vfio: selftests: Introduce struct iommu	David Matlack	2	-42/+49
	Introduce struct iommu, which logically represents either a VFIO container or an iommufd IOAS, depending on which IOMMU mode is used by the test. This will be used in a subsequent commit to allow devices to be added to the same container/iommufd. Reviewed-by: Alex Mastro <amastro@fb.com> Tested-by: Alex Mastro <amastro@fb.com> Reviewed-by: Raghavendra Rao Ananta <rananta@google.com> Signed-off-by: David Matlack <dmatlack@google.com> Link: https://lore.kernel.org/r/20251126231733.3302983-6-dmatlack@google.com Signed-off-by: Alex Williamson <alex@shazbot.org>
2025-11-28	vfio: selftests: Rename struct vfio_iommu_mode to iommu_mode	David Matlack	2	-4/+4
	Rename struct vfio_iommu_mode to struct iommu_mode since the mode can include iommufd. This also prepares for splitting out all the IOMMU code into its own structs/helpers/files which are independent from the vfio_pci_device code. No function change intended. Reviewed-by: Alex Mastro <amastro@fb.com> Tested-by: Alex Mastro <amastro@fb.com> Reviewed-by: Raghavendra Rao Ananta <rananta@google.com> Signed-off-by: David Matlack <dmatlack@google.com> Link: https://lore.kernel.org/r/20251126231733.3302983-5-dmatlack@google.com Signed-off-by: Alex Williamson <alex@shazbot.org>
2025-11-28	vfio: selftests: Allow passing multiple BDFs on the command line	David Matlack	2	-11/+48
	Add support for passing multiple device BDFs to a test via the command line. This is a prerequisite for multi-device tests. Single-device tests can continue using vfio_selftests_get_bdf(), which will continue to return argv[argc - 1] (if it is a BDF string), or the environment variable $VFIO_SELFTESTS_BDF otherwise. For multi-device tests, a new helper called vfio_selftests_get_bdfs() is introduced which will return an array of all BDFs found at the end of argv[], as well as the number of BDFs found (passed back to the caller via argument). The array of BDFs returned does not need to be freed by the caller. The environment variable VFIO_SELFTESTS_BDF continues to support only a single BDF for the time being. Reviewed-by: Alex Mastro <amastro@fb.com> Tested-by: Alex Mastro <amastro@fb.com> Reviewed-by: Raghavendra Rao Ananta <rananta@google.com> Signed-off-by: David Matlack <dmatlack@google.com> Link: https://lore.kernel.org/r/20251126231733.3302983-4-dmatlack@google.com Signed-off-by: Alex Williamson <alex@shazbot.org>
2025-11-28	vfio: selftests: Split run.sh into separate scripts	David Matlack	5	-99/+140
	Split run.sh into separate scripts (setup.sh, run.sh, cleanup.sh) to enable multi-device testing, and prepare for VFIO selftests automatically detecting which devices to use for testing by storing device metadata on the filesystem. - setup.sh takes one or more BDFs as arguments and sets up each device. Metadata about each device is stored on the filesystem in the directory: ${TMPDIR:-/tmp}/vfio-selftests-devices Within this directory is a directory for each BDF, and then files in those directories that cleanup.sh uses to cleanup the device. - run.sh runs a selftest by passing it the BDFs of all set up devices. - cleanup.sh takes zero or more BDFs as arguments and cleans up each device. If no BDFs are provided, it cleans up all devices. This split enables multi-device testing by allowing multiple BDFs to be set up and passed into tests: For example: $ tools/testing/selftests/vfio/scripts/setup.sh <BDF1> <BDF2> $ tools/testing/selftests/vfio/scripts/setup.sh <BDF3> $ tools/testing/selftests/vfio/scripts/run.sh echo <BDF1> <BDF2> <BDF3> $ tools/testing/selftests/vfio/scripts/cleanup.sh In the future, VFIO selftests can automatically detect set up devices by inspecting ${TMPDIR:-/tmp}/vfio-selftests-devices. This will avoid the need for the run.sh script. Reviewed-by: Alex Mastro <amastro@fb.com> Tested-by: Alex Mastro <amastro@fb.com> Reviewed-by: Raghavendra Rao Ananta <rananta@google.com> Signed-off-by: David Matlack <dmatlack@google.com> Link: https://lore.kernel.org/r/20251126231733.3302983-3-dmatlack@google.com Signed-off-by: Alex Williamson <alex@shazbot.org>
2025-11-28	vfio: selftests: Move run.sh into scripts directory	David Matlack	2	-1/+3
	Move run.sh in a new sub-directory scripts/. This directory will be used to house various helper scripts to be used by humans and automation for running VFIO selftests. Opportunistically also switch run.sh from TEST_PROGS_EXTENDED to TEST_FILES. The former is for actual test executables that are just not run by default. TEST_FILES is a better fit for helper scripts. No functional change intended. Reviewed-by: Alex Mastro <amastro@fb.com> Tested-by: Alex Mastro <amastro@fb.com> Reviewed-by: Raghavendra Rao Ananta <rananta@google.com> Signed-off-by: David Matlack <dmatlack@google.com> Link: https://lore.kernel.org/r/20251126231733.3302983-2-dmatlack@google.com Signed-off-by: Alex Williamson <alex@shazbot.org>
2025-11-28	Merge tag 'vfio-v6.18-rc6' into v6.19/vfio/next	Alex Williamson	4	-9/+288
	Merge mainline vfio-selftest updates for ongoing v6.19 work. Signed-off-by: Alex Williamson <alex@shazbot.org>
2025-11-28	selftests/landlock: Add disconnected leafs and branch test suites	Mickaël Salaün	1	-0/+1051
	Test disconnected directories with two test suites (layout4_disconnected_leafs and layout5_disconnected_branch) and 43 variants to cover the main corner cases. These tests are complementary to the previous commit. Add test_renameat() and test_exchangeat() helpers. Test coverage for security/landlock is 92.1% of 1927 lines according to LLVM 20. Cc: Günther Noack <gnoack@google.com> Cc: Song Liu <song@kernel.org> Cc: Tingmao Wang <m@maowtm.org> Link: https://lore.kernel.org/r/20251128172200.760753-5-mic@digikod.net Signed-off-by: Mickaël Salaün <mic@digikod.net>
2025-11-28	selftests/landlock: Add tests for access through disconnected paths	Tingmao Wang	1	-8/+415
	This adds tests for the edge case discussed in [1], with specific ones for rename and link operations when the operands are through disconnected paths, as that go through a separate code path in Landlock. This has resulted in a warning, due to collect_domain_accesses() not expecting to reach a different root from path->mnt: # RUN layout1_bind.path_disconnected ... # OK layout1_bind.path_disconnected ok 96 layout1_bind.path_disconnected # RUN layout1_bind.path_disconnected_rename ... [..] ------------[ cut here ]------------ [..] WARNING: CPU: 3 PID: 385 at security/landlock/fs.c:1065 collect_domain_accesses [..] ... [..] RIP: 0010:collect_domain_accesses (security/landlock/fs.c:1065 (discriminator 2) security/landlock/fs.c:1031 (discriminator 2)) [..] current_check_refer_path (security/landlock/fs.c:1205) [..] ... [..] hook_path_rename (security/landlock/fs.c:1526) [..] security_path_rename (security/security.c:2026 (discriminator 1)) [..] do_renameat2 (fs/namei.c:5264) # OK layout1_bind.path_disconnected_rename ok 97 layout1_bind.path_disconnected_rename Move the const char definitions a bit above so that we can use the path for s4d1 in cleanup code. Cc: Günther Noack <gnoack@google.com> Cc: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/027d5190-b37a-40a8-84e9-4ccbc352bcdf@maowtm.org [1] Signed-off-by: Tingmao Wang <m@maowtm.org> Link: https://lore.kernel.org/r/20251128172200.760753-4-mic@digikod.net Signed-off-by: Mickaël Salaün <mic@digikod.net>
2025-11-28	Merge branch 'for-next/sysreg' into for-next/core	Catalin Marinas	1	-3/+7
	* for-next/sysreg: : arm64 sysreg updates/cleanups arm64/sysreg: Remove unused define ARM64_FEATURE_FIELD_BITS KVM: arm64: selftests: Consider all 7 possible levels of cache KVM: arm64: selftests: Remove ARM64_FEATURE_FIELD_BITS and its last user arm64/sysreg: Add ICH_VMCR_EL2 arm64/sysreg: Move generation of RES0/RES1/UNKN to function arm64/sysreg: Support feature-specific fields with 'Prefix' descriptor arm64/sysreg: Fix checks for incomplete sysreg definitions arm64/sysreg: Replace TCR_EL1 field macros
2025-11-28	Merge branches 'for-next/misc', 'for-next/kselftest', ↵	Catalin Marinas	3	-5/+63
	'for-next/efi-preempt', 'for-next/assembler-macro', 'for-next/typos', 'for-next/sme-ptrace-disable', 'for-next/local-tlbi-page-reused', 'for-next/mpam', 'for-next/acpi' and 'for-next/documentation', remote-tracking branch 'arm64/for-next/perf' into for-next/core * arm64/for-next/perf: perf: arm_spe: Add support for filtering on data source perf: Add perf_event_attr::config4 perf/imx_ddr: Add support for PMU in DB (system interconnects) perf/imx_ddr: Get and enable optional clks perf/imx_ddr: Move ida_alloc() from ddr_perf_init() to ddr_perf_probe() dt-bindings: perf: fsl-imx-ddr: Add compatible string for i.MX8QM, i.MX8QXP and i.MX8DXL arch_topology: Provide a stub topology_core_has_smt() for !CONFIG_GENERIC_ARCH_TOPOLOGY perf/arm-ni: Fix and optimise register offset calculation perf: arm_pmuv3: Add new Cortex and C1 CPU PMUs perf: arm_cspmu: fix error handling in arm_cspmu_impl_unregister() perf/arm-ni: Add NoC S3 support perf/arm_cspmu: nvidia: Add pmevfiltr2 support perf/arm_cspmu: nvidia: Add revision id matching perf/arm_cspmu: Add pmpidr support perf/arm_cspmu: Add callback to reset filter config perf: arm_pmuv3: Don't use PMCCNTR_EL0 on SMT cores * for-next/misc: : Miscellaneous patches arm64: atomics: lse: Remove unused parameters from ATOMIC_FETCH_OP_AND macros arm64: remove duplicate ARCH_HAS_MEM_ENCRYPT arm64: mm: use untagged address to calculate page index arm64: mm: make linear mapping permission update more robust for patial range arm64/mm: Elide TLB flush in certain pte protection transitions arm64/mm: Rename try_pgd_pgtable_alloc_init_mm arm64/mm: Allow __create_pgd_mapping() to propagate pgtable_alloc() errors arm64: add unlikely hint to MTE async fault check in el0_svc_common arm64: acpi: add newline to deferred APEI warning arm64: entry: Clean out some indirection arm64/mm: Ensure PGD_SIZE is aligned to 64 bytes when PA_BITS = 52 arm64/mm: Drop cpu_set_[default\|idmap]_tcr_t0sz() arm64: remove unused ARCH_PFN_OFFSET arm64: use SOFTIRQ_ON_OWN_STACK for enabling softirq stack arm64: Remove assertion on CONFIG_VMAP_STACK * for-next/kselftest: : arm64 kselftest patches kselftest/arm64: Align zt-test register dumps * for-next/efi-preempt: : arm64: Make EFI calls preemptible arm64/efi: Call EFI runtime services without disabling preemption arm64/efi: Move uaccess en/disable out of efi_set_pgd() arm64/efi: Drop efi_rt_lock spinlock from EFI arch wrapper arm64/fpsimd: Permit kernel mode NEON with IRQs off arm64/fpsimd: Don't warn when EFI execution context is preemptible efi/runtime-wrappers: Keep track of the efi_runtime_lock owner efi: Add missing static initializer for efi_mm::cpus_allowed_lock * for-next/assembler-macro: : arm64: Replace __ASSEMBLY__ with __ASSEMBLER__ in headers arm64: Replace __ASSEMBLY__ with __ASSEMBLER__ in non-uapi headers arm64: Replace __ASSEMBLY__ with __ASSEMBLER__ in uapi headers * for-next/typos: : Random typo/spelling fixes arm64: Fix double word in comments arm64: Fix typos and spelling errors in comments * for-next/sme-ptrace-disable: : Support disabling streaming mode via ptrace on SME only systems kselftest/arm64: Cover disabling streaming mode without SVE in fp-ptrace kselftst/arm64: Test NT_ARM_SVE FPSIMD format writes on non-SVE systems arm64/sme: Support disabling streaming mode via ptrace on SME only systems * for-next/local-tlbi-page-reused: : arm64, mm: avoid TLBI broadcast if page reused in write fault arm64, tlbflush: don't TLBI broadcast if page reused in write fault mm: add spurious fault fixing support for huge pmd * for-next/mpam: (34 commits) : Basic Arm MPAM driver (more to follow) MAINTAINERS: new entry for MPAM Driver arm_mpam: Add kunit tests for props_mismatch() arm_mpam: Add kunit test for bitmap reset arm_mpam: Add helper to reset saved mbwu state arm_mpam: Use long MBWU counters if supported arm_mpam: Probe for long/lwd mbwu counters arm_mpam: Consider overflow in bandwidth counter state arm_mpam: Track bandwidth counter state for power management arm_mpam: Add mpam_msmon_read() to read monitor value arm_mpam: Add helpers to allocate monitors arm_mpam: Probe and reset the rest of the features arm_mpam: Allow configuration to be applied and restored during cpu online arm_mpam: Use a static key to indicate when mpam is enabled arm_mpam: Register and enable IRQs arm_mpam: Extend reset logic to allow devices to be reset any time arm_mpam: Add a helper to touch an MSC from any CPU arm_mpam: Reset MSC controls from cpuhp callbacks arm_mpam: Merge supported features during mpam_enable() into mpam_class arm_mpam: Probe the hardware features resctrl supports arm_mpam: Add helpers for managing the locking around the mon_sel registers ... * for-next/acpi: : arm64 acpi updates ACPI: GTDT: Get rid of acpi_arch_timer_mem_init() * for-next/documentation: : arm64 Documentation updates Documentation/arm64: Fix the typo of register names
2025-11-28	Merge branches 'arm/smmu/updates', 'arm/smmu/bindings', 'mediatek', ↵	Joerg Roedel	2	-22/+50
	'nvidia/tegra', 'intel/vt-d', 'amd/amd-vi' and 'core' into next
2025-11-28	KVM: LoongArch: selftests: Add time counter test case	Bibo Mao	2	-0/+39
	With time counter test, it is to verify that time count starts from 0 and always grows up then. Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-11-28	KVM: LoongArch: selftests: Add SW emulated timer test case	Bibo Mao	1	-0/+40
	This test case setup one-shot timer and execute idle instruction immediately to indicate giving up CPU, hypervisor will emulate SW hrtimer and wakeup vCPU when SW hrtimer is fired. Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-11-28	KVM: LoongArch: selftests: Add timer interrupt test case	Bibo Mao	5	-2/+228
	Add timer test case based on common arch_timer code, timer interrupt with one-shot and period mode is tested. Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-11-28	selftests: netfilter: nft_flowtable.sh: Add the capability to send IPv6 TCP ↵	Lorenzo Bianconi	1	-14/+43
	traffic Introduce the capability to send TCP traffic over IPv6 to nft_flowtable netfilter selftest. Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2025-11-28	selftests: netfilter: nft_flowtable.sh: Add IPIP flowtable selftest	Lorenzo Bianconi	1	-0/+69
	Introduce specific selftest for IPIP flowtable SW acceleration in nft_flowtable.sh Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2025-11-27	selftests/liveupdate: add kexec test for multiple and empty sessions	Pasha Tatashin	2	-0/+163
	Introduce a new kexec-based selftest, luo_kexec_multi_session, to validate the end-to-end lifecycle of a more complex LUO scenario. While the existing luo_kexec_simple test covers the basic end-to-end lifecycle, it is limited to a single session with one preserved file. This new test significantly expands coverage by verifying LUO's ability to handle a mixed workload involving multiple sessions, some of which are intentionally empty. This ensures that the LUO core correctly preserves and restores the state of all session types across a reboot. The test validates the following sequence: Stage 1 (Pre-kexec): - Creates two empty test sessions (multi-test-empty-1, multi-test-empty-2). - Creates a session with one preserved memfd (multi-test-files-1). - Creates another session with two preserved memfds (multi-test-files-2), each containing unique data. - Creates a state-tracking session to manage the transition to Stage 2. - Executes a kexec reboot via the helper script. Stage 2 (Post-kexec): - Retrieves the state-tracking session to confirm it is in the post-reboot stage. - Retrieves all four test sessions (both the empty and non-empty ones). - For the non-empty sessions, restores the preserved memfds and verifies their contents match the original data patterns. - Finalizes all test sessions and the state session to ensure a clean teardown and that all associated kernel resources are correctly released. This test provides greater confidence in the robustness of the LUO framework by validating its behavior in a more realistic, multi-faceted scenario. Link: https://lkml.kernel.org/r/20251125165850.3389713-19-pasha.tatashin@soleen.com Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Tested-by: David Matlack <dmatlack@google.com> Cc: Aleksander Lobakin <aleksander.lobakin@intel.com> Cc: Alexander Graf <graf@amazon.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Andriy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: anish kumar <yesanishhere@gmail.com> Cc: Anna Schumaker <anna.schumaker@oracle.com> Cc: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Borislav Betkov <bp@alien8.de> Cc: Chanwoo Choi <cw00.choi@samsung.com> Cc: Chen Ridong <chenridong@huawei.com> Cc: Chris Li <chrisl@kernel.org> Cc: Christian Brauner <brauner@kernel.org> Cc: Daniel Wagner <wagi@kernel.org> Cc: Danilo Krummrich <dakr@kernel.org> Cc: Dan Williams <dan.j.williams@intel.com> Cc: David Hildenbrand <david@redhat.com> Cc: David Jeffery <djeffery@redhat.com> Cc: David Rientjes <rientjes@google.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Guixin Liu <kanie@linux.alibaba.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Hugh Dickins <hughd@google.com> Cc: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ira Weiny <ira.weiny@intel.com> Cc: Jann Horn <jannh@google.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Joanthan Cameron <Jonathan.Cameron@huawei.com> Cc: Joel Granados <joel.granados@kernel.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Lennart Poettering <lennart@poettering.net> Cc: Leon Romanovsky <leon@kernel.org> Cc: Leon Romanovsky <leonro@nvidia.com> Cc: Lukas Wunner <lukas@wunner.de> Cc: Marc Rutland <mark.rutland@arm.com> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Matthew Maurer <mmaurer@google.com> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Myugnjoo Ham <myungjoo.ham@samsung.com> Cc: Parav Pandit <parav@nvidia.com> Cc: Pratyush Yadav <pratyush@kernel.org> Cc: Pratyush Yadav <ptyadav@amazon.de> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: Saeed Mahameed <saeedm@nvidia.com> Cc: Samiullah Khawaja <skhawaja@google.com> Cc: Song Liu <song@kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Stuart Hayes <stuart.w.hayes@gmail.com> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleinxer <tglx@linutronix.de> Cc: Thomas Weißschuh <linux@weissschuh.net> Cc: Vincent Guittot <vincent.guittot@linaro.org> Cc: William Tu <witu@nvidia.com> Cc: Yoann Congal <yoann.congal@smile.fr> Cc: Zhu Yanjun <yanjun.zhu@linux.dev> Cc: Zijun Hu <quic_zijuhu@quicinc.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-11-27	selftests/liveupdate: add simple kexec-based selftest for LUO	Pasha Tatashin	5	-0/+421
	Introduce a kexec-based selftest, luo_kexec_simple, to validate the end-to-end lifecycle of a Live Update Orchestrator session across a reboot. While existing tests verify the uAPI in a pre-reboot context, this test ensures that the core functionality—preserving state via Kexec Handover and restoring it in a new kernel—works as expected. The test operates in two stages, managing its state across the reboot by preserving a dedicated "state session" containing a memfd. This mechanism dogfoods the LUO feature itself for state tracking, making the test self-contained. The test validates the following sequence: Stage 1 (Pre-kexec): - Creates a test session (test-session). - Creates and preserves a memfd with a known data pattern into the test session. - Creates the state-tracking session to signal progression to Stage 2. - Executes a kexec reboot via a helper script. Stage 2 (Post-kexec): - Retrieves the state-tracking session to confirm it is in the post-reboot stage. - Retrieves the preserved test session. - Restores the memfd from the test session and verifies its contents match the original data pattern written in Stage 1. - Finalizes both the test and state sessions to ensure a clean teardown. The test relies on a helper script (do_kexec.sh) to perform the reboot and a shared utility library (luo_test_utils.c) for common LUO operations, keeping the main test logic clean and focused. Link: https://lkml.kernel.org/r/20251125165850.3389713-18-pasha.tatashin@soleen.com Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> Reviewed-by: Zhu Yanjun <yanjun.zhu@linux.dev> Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Tested-by: David Matlack <dmatlack@google.com> Cc: Aleksander Lobakin <aleksander.lobakin@intel.com> Cc: Alexander Graf <graf@amazon.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Andriy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: anish kumar <yesanishhere@gmail.com> Cc: Anna Schumaker <anna.schumaker@oracle.com> Cc: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Borislav Betkov <bp@alien8.de> Cc: Chanwoo Choi <cw00.choi@samsung.com> Cc: Chen Ridong <chenridong@huawei.com> Cc: Chris Li <chrisl@kernel.org> Cc: Christian Brauner <brauner@kernel.org> Cc: Daniel Wagner <wagi@kernel.org> Cc: Danilo Krummrich <dakr@kernel.org> Cc: Dan Williams <dan.j.williams@intel.com> Cc: David Hildenbrand <david@redhat.com> Cc: David Jeffery <djeffery@redhat.com> Cc: David Rientjes <rientjes@google.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Guixin Liu <kanie@linux.alibaba.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Hugh Dickins <hughd@google.com> Cc: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ira Weiny <ira.weiny@intel.com> Cc: Jann Horn <jannh@google.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Joanthan Cameron <Jonathan.Cameron@huawei.com> Cc: Joel Granados <joel.granados@kernel.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Lennart Poettering <lennart@poettering.net> Cc: Leon Romanovsky <leon@kernel.org> Cc: Leon Romanovsky <leonro@nvidia.com> Cc: Lukas Wunner <lukas@wunner.de> Cc: Marc Rutland <mark.rutland@arm.com> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Matthew Maurer <mmaurer@google.com> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Myugnjoo Ham <myungjoo.ham@samsung.com> Cc: Parav Pandit <parav@nvidia.com> Cc: Pratyush Yadav <pratyush@kernel.org> Cc: Pratyush Yadav <ptyadav@amazon.de> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: Saeed Mahameed <saeedm@nvidia.com> Cc: Samiullah Khawaja <skhawaja@google.com> Cc: Song Liu <song@kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Stuart Hayes <stuart.w.hayes@gmail.com> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleinxer <tglx@linutronix.de> Cc: Thomas Weißschuh <linux@weissschuh.net> Cc: Vincent Guittot <vincent.guittot@linaro.org> Cc: William Tu <witu@nvidia.com> Cc: Yoann Congal <yoann.congal@smile.fr> Cc: Zijun Hu <quic_zijuhu@quicinc.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-11-27	selftests/liveupdate: add userspace API selftests	Pasha Tatashin	5	-0/+396
	Introduce a selftest suite for LUO. These tests validate the core userspace-facing API provided by the /dev/liveupdate device and its associated ioctls. The suite covers fundamental device behavior, session management, and the file preservation mechanism using memfd as a test case. This provides regression testing for the LUO uAPI. The following functionality is verified: Device Access: Basic open and close operations on /dev/liveupdate. Enforcement of exclusive device access (verifying EBUSY on a second open). Session Management: Successful creation of sessions with unique names. Failure to create sessions with duplicate names. File Preservation: Preserving a single memfd and verifying its content remains intact post-preservation. Preserving multiple memfds within a single session, each with unique data. A complex scenario involving multiple sessions, each containing a mix of empty and data-filled memfds. Note: This test suite is limited to verifying the pre-kexec functionality of LUO (e.g., session creation, file preservation). The post-kexec restoration of resources is not covered, as the kselftest framework does not currently support orchestrating a reboot and continuing execution in the new kernel. Link: https://lkml.kernel.org/r/20251125165850.3389713-17-pasha.tatashin@soleen.com Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> Reviewed-by: Pratyush Yadav <pratyush@kernel.org> Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Tested-by: David Matlack <dmatlack@google.com> Cc: Aleksander Lobakin <aleksander.lobakin@intel.com> Cc: Alexander Graf <graf@amazon.com> Cc: Alice Ryhl <aliceryhl@google.com> Cc: Andriy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: anish kumar <yesanishhere@gmail.com> Cc: Anna Schumaker <anna.schumaker@oracle.com> Cc: Bartosz Golaszewski <bartosz.golaszewski@linaro.org> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Borislav Betkov <bp@alien8.de> Cc: Chanwoo Choi <cw00.choi@samsung.com> Cc: Chen Ridong <chenridong@huawei.com> Cc: Chris Li <chrisl@kernel.org> Cc: Christian Brauner <brauner@kernel.org> Cc: Daniel Wagner <wagi@kernel.org> Cc: Danilo Krummrich <dakr@kernel.org> Cc: Dan Williams <dan.j.williams@intel.com> Cc: David Hildenbrand <david@redhat.com> Cc: David Jeffery <djeffery@redhat.com> Cc: David Rientjes <rientjes@google.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Guixin Liu <kanie@linux.alibaba.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Hugh Dickins <hughd@google.com> Cc: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ira Weiny <ira.weiny@intel.com> Cc: Jann Horn <jannh@google.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Jens Axboe <axboe@kernel.dk> Cc: Joanthan Cameron <Jonathan.Cameron@huawei.com> Cc: Joel Granados <joel.granados@kernel.org> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Lennart Poettering <lennart@poettering.net> Cc: Leon Romanovsky <leon@kernel.org> Cc: Leon Romanovsky <leonro@nvidia.com> Cc: Lukas Wunner <lukas@wunner.de> Cc: Marc Rutland <mark.rutland@arm.com> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Matthew Maurer <mmaurer@google.com> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Myugnjoo Ham <myungjoo.ham@samsung.com> Cc: Parav Pandit <parav@nvidia.com> Cc: Pratyush Yadav <ptyadav@amazon.de> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Roman Gushchin <roman.gushchin@linux.dev> Cc: Saeed Mahameed <saeedm@nvidia.com> Cc: Samiullah Khawaja <skhawaja@google.com> Cc: Song Liu <song@kernel.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Stuart Hayes <stuart.w.hayes@gmail.com> Cc: Tejun Heo <tj@kernel.org> Cc: Thomas Gleinxer <tglx@linutronix.de> Cc: Thomas Weißschuh <linux@weissschuh.net> Cc: Vincent Guittot <vincent.guittot@linaro.org> Cc: William Tu <witu@nvidia.com> Cc: Yoann Congal <yoann.congal@smile.fr> Cc: Zhu Yanjun <yanjun.zhu@linux.dev> Cc: Zijun Hu <quic_zijuhu@quicinc.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-11-27	kho: make debugfs interface optional	Pasha Tatashin	1	-0/+1
	Patch series "liveupdate: Rework KHO for in-kernel users", v9. This series refactors the KHO framework to better support in-kernel users like the upcoming LUO. The current design, which relies on a notifier chain and debugfs for control, is too restrictive for direct programmatic use. The core of this rework is the removal of the notifier chain in favor of a direct registration API. This decouples clients from the shutdown-time finalization sequence, allowing them to manage their preserved state more flexibly and at any time. In support of this new model, this series also: - Makes the debugfs interface optional. - Introduces APIs to unpreserve memory and fixes a bug in the abort path where client state was being incorrectly discarded. Note that this is an interim step, as a more comprehensive fix is planned as part of the stateless KHO work [1]. - Moves all KHO code into a new kernel/liveupdate/ directory to consolidate live update components. This patch (of 9): Currently, KHO is controlled via debugfs interface, but once LUO is introduced, it can control KHO, and the debug interface becomes optional. Add a separate config CONFIG_KEXEC_HANDOVER_DEBUGFS that enables the debugfs interface, and allows to inspect the tree. Move all debugfs related code to a new file to keep the .c files clear of ifdefs. Link: https://lkml.kernel.org/r/20251101142325.1326536-1-pasha.tatashin@soleen.com Link: https://lkml.kernel.org/r/20251101142325.1326536-2-pasha.tatashin@soleen.com Link: https://lore.kernel.org/all/20251020100306.2709352-1-jasonmiu@google.com [1] Co-developed-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Signed-off-by: Pasha Tatashin <pasha.tatashin@soleen.com> Reviewed-by: Pratyush Yadav <pratyush@kernel.org> Cc: Alexander Graf <graf@amazon.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Jonathan Corbet <corbet@lwn.net> Cc: Masahiro Yamada <masahiroy@kernel.org> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Randy Dunlap <rdunlap@infradead.org> Cc: Tejun Heo <tj@kernel.org> Cc: Changyuan Lyu <changyuanl@google.com> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Simon Horman <horms@kernel.org> Cc: Zhu Yanjun <yanjun.zhu@linux.dev> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-11-27	selftests: complete kselftest include centralization	Bala-Vignesh-Reddy	306	-307/+310
	This follow-up patch completes centralization of kselftest.h and ksefltest_harness.h includes in remaining seltests files, replacing all relative paths with a non-relative paths using shared -I include path in lib.mk Tested with gcc-13.3 and clang-18.1, and cross-compiled successfully on riscv, arm64, x86_64 and powerpc arch. [reddybalavignesh9979@gmail.com: add selftests include path for kselftest.h] Link: https://lkml.kernel.org/r/20251017090201.317521-1-reddybalavignesh9979@gmail.com Link: https://lkml.kernel.org/r/20251016104409.68985-1-reddybalavignesh9979@gmail.com Signed-off-by: Bala-Vignesh-Reddy <reddybalavignesh9979@gmail.com> Suggested-by: Andrew Morton <akpm@linux-foundation.org> Link: https://lore.kernel.org/lkml/20250820143954.33d95635e504e94df01930d0@linux-foundation.org/ Reviewed-by: Wei Yang <richard.weiyang@gmail.com> Cc: David Hildenbrand <david@redhat.com> Cc: David S. Miller <davem@davemloft.net> Cc: Eric Dumazet <edumazet@google.com> Cc: Günther Noack <gnoack@google.com> Cc: Jakub Kacinski <kuba@kernel.org> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mickael Salaun <mic@digikod.net> Cc: Ming Lei <ming.lei@redhat.com> Cc: Paolo Abeni <pabeni@redhat.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Simon Horman <horms@kernel.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-11-27	Merge branch 'mm-hotfixes-stable' into mm-nonmm-stable in order to be able	Andrew Morton	2	-9/+8
	to merge "kho: make debugfs interface optional" into mm-nonmm-stable.
2025-11-27	Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net	Jakub Kicinski	1	-8/+7
	Conflicts: net/xdp/xsk.c 0ebc27a4c67d ("xsk: avoid data corruption on cq descriptor number") 8da7bea7db69 ("xsk: add indirect call for xsk_destruct_skb") 30ed05adca4a ("xsk: use a smaller new lock for shared pool case") https://lore.kernel.org/20251127105450.4a1665ec@canb.auug.org.au https://lore.kernel.org/eb4eee14-7e24-4d1b-b312-e9ea738fefee@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-27	KVM: arm64: selftests: Consider all 7 possible levels of cache	Ben Horgan	1	-1/+1
	In test_clidr() if an empty cache level is not found then the TEST_ASSERT will not fire. Fix this by considering all 7 possible levels when iterating through the hierarchy. Found by inspection. Signed-off-by: Ben Horgan <ben.horgan@arm.com> Acked-by: Marc Zyngier <maz@kernel.org> Acked-by: Oliver Upton <oupton@kernel.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2025-11-27	KVM: arm64: selftests: Remove ARM64_FEATURE_FIELD_BITS and its last user	Ben Horgan	1	-2/+6
	ARM64_FEATURE_FIELD_BITS is set to 4 but not all ID register fields are 4 bits. See for instance ID_AA64SMFR0_EL1. The last user of this define, ARM64_FEATURE_FIELD_BITS, is the set_id_regs selftest. Its logic assumes the fields aren't a single bits; assert that's the case and stop using the define. As there are no more users, ARM64_FEATURE_FIELD_BITS is removed from the arm64 tools sysreg.h header. A separate commit removes this from the kernel version of the header. Signed-off-by: Ben Horgan <ben.horgan@arm.com> Acked-by: Marc Zyngier <maz@kernel.org> Acked-by: Oliver Upton <oupton@kernel.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2025-11-27	selftests: af_unix: remove unused stdlib.h include	Sunday Adelodun	1	-1/+0
	The unix_connreset.c test included <stdlib.h>, but no symbol from that header is used. This causes a fatal build error under certain linux-next configurations where stdlib.h is not available. Remove the unused include to fix the build. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/r/202511221800.hcgCKvVa-lkp@intel.com/ Signed-off-by: Sunday Adelodun <adelodunolaoluwa@yahoo.com> Link: https://patch.msgid.link/20251125113648.25903-1-adelodunolaoluwa@yahoo.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-11-27	KVM: LoongArch: selftests: Add exception handler register interface	Bibo Mao	2	-0/+45
	Add interrupt and exception handler register interface. When exception happens, execute registered exception handler if exists, else report an error. Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-11-27	KVM: LoongArch: selftests: Add basic interfaces	Bibo Mao	2	-0/+55
	Add some basic function interfaces such as CSR register access, local irq enable or disable APIs. Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-11-27	KVM: LoongArch: selftests: Add system registers save/restore on exception	Bibo Mao	2	-1/+10
	When system returns from exception with ertn instruction, PC comes from LOONGARCH_CSR_ERA, and CSR.CRMD comes LOONGARCH_CSR_PRMD. Here save CSR register CSR.ERA and CSR.PRMD into stack, and then restore them from stack. So it can be modified by exception handlers in future. Signed-off-by: Bibo Mao <maobibo@loongson.cn> Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
2025-11-26	selftests/net: packetdrill: pass send_omit_free to MSG_ZEROCOPY tests	Willem de Bruijn	12	-0/+29
	The --send_omit_free flag is needed for TCP zero copy tests, to ensure that packetdrill doesn't free the send() buffer after the send() call. Fixes: 1e42f73fd3c2 ("selftests/net: packetdrill: import tcp/zerocopy") Closes: https://lore.kernel.org/netdev/20251124071831.4cbbf412@kernel.org/ Suggested-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20251125234029.1320984-1-willemdebruijn.kernel@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-26	selftests/net: initialize char variable to null	Ankit Khushwaha	2	-2/+2
	char variable in 'so_txtime.c' & 'txtimestamp.c' were left uninitilized when switch default case taken. which raises following warning. txtimestamp.c:240:2: warning: variable 'tsname' is used uninitialized whenever switch default is taken [-Wsometimes-uninitialized] so_txtime.c:210:3: warning: variable 'reason' is used uninitialized whenever switch default is taken [-Wsometimes-uninitialized] initializing these variables to NULL to fix this. Signed-off-by: Ankit Khushwaha <ankitkhushwaha.linux@gmail.com> Link: https://patch.msgid.link/20251125165302.20079-1-ankitkhushwaha.linux@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-26	Merge tag 'mm-hotfixes-stable-2025-11-26-11-51' of ↵	Linus Torvalds	1	-8/+7
	git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "8 hotfixes. 4 are cc:stable, 7 are against mm/. All are singletons - please see the respective changelogs for details" * tag 'mm-hotfixes-stable-2025-11-26-11-51' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: mm/filemap: fix logic around SIGBUS in filemap_map_pages() mm/huge_memory: fix NULL pointer deference when splitting folio MAINTAINERS: add test_kho to KHO's entry mailmap: add entry for Sam Protsenko selftests/mm: fix division-by-zero in uffd-unit-tests mm/mmap_lock: reset maple state on lock_vma_under_rcu() retry mm/memfd: fix information leak in hugetlb folios mm: swap: remove duplicate nr_swap_pages decrement in get_swap_page_of_type()
2025-11-26	selftests/landlock: Fix makefile header list	Matthieu Buffet	1	-1/+1
	Make all headers part of make's dependencies computations. Otherwise, updating audit.h, common.h, scoped_base_variants.h, scoped_common.h, scoped_multiple_domain_variants.h, or wrappers.h, re-running make and running selftests could lead to testing stale headers. Fixes: 6a500b22971c ("selftests/landlock: Add tests for audit flags and domain IDs") Fixes: fefcf0f7cf47 ("selftests/landlock: Test abstract UNIX socket scoping") Fixes: 5147779d5e1b ("selftests/landlock: Add wrappers.h") Signed-off-by: Matthieu Buffet <matthieu@buffet.re> Link: https://lore.kernel.org/r/20251027011440.1838514-1-matthieu@buffet.re Signed-off-by: Mickaël Salaün <mic@digikod.net>
2025-11-26	Merge branch 'iommufd_dmabuf' into k.o-iommufd/for-next	Jason Gunthorpe	2	-0/+87
	Jason Gunthorpe says: ==================== This series is the start of adding full DMABUF support to iommufd. Currently it is limited to only work with VFIO's DMABUF exporter. It sits on top of Leon's series to add a DMABUF exporter to VFIO: https://lore.kernel.org/all/20251120-dmabuf-vfio-v9-0-d7f71607f371@nvidia.com/ The existing IOMMU_IOAS_MAP_FILE is enhanced to detect DMABUF fd's, but otherwise works the same as it does today for a memfd. The user can select a slice of the FD to map into the ioas and if the underliyng alignment requirements are met it will be placed in the iommu_domain. Though limited, it is enough to allow a VMM like QEMU to connect MMIO BAR memory from VFIO to an iommu_domain controlled by iommufd. This is used for PCI Peer to Peer support in VMs, and is the last feature that the VFIO type 1 container has that iommufd couldn't do. The VFIO type1 version extracts raw PFNs from VMAs, which has no lifetime control and is a use-after-free security problem. Instead iommufd relies on revokable DMABUFs. Whenever VFIO thinks there should be no access to the MMIO it can shoot down the mapping in iommufd which will unmap it from the iommu_domain. There is no automatic remap, this is a safety protocol so the kernel doesn't get stuck. Userspace is expected to know it is doing something that will revoke the dmabuf and map/unmap it around the activity. Eg when QEMU goes to issue FLR it should do the map/unmap to iommufd. Since DMABUF is missing some key general features for this use case it relies on a "private interconnect" between VFIO and iommufd via the vfio_pci_dma_buf_iommufd_map() call. The call confirms the DMABUF has revoke semantics and delivers a phys_addr for the memory suitable for use with iommu_map(). Medium term there is a desire to expand the supported DMABUFs to include GPU drivers to support DPDK/SPDK type use cases so future series will work to add a general concept of revoke and a general negotiation of interconnect to remove vfio_pci_dma_buf_iommufd_map(). I also plan another series to modify iommufd's vfio_compat to transparently pull a dmabuf out of a VFIO VMA to emulate more of the uAPI of type1. The latest series for interconnect negotation to exchange a phys_addr is: https://lore.kernel.org/r/20251027044712.1676175-1-vivek.kasireddy@intel.com And the discussion for design of revoke is here: https://lore.kernel.org/dri-devel/20250114173103.GE5556@nvidia.com/ ==================== Based on a shared branch with vfio. * iommufd_dmabuf: iommufd/selftest: Add some tests for the dmabuf flow iommufd: Accept a DMABUF through IOMMU_IOAS_MAP_FILE iommufd: Have iopt_map_file_pages convert the fd to a file iommufd: Have pfn_reader process DMABUF iopt_pages iommufd: Allow MMIO pages in a batch iommufd: Allow a DMABUF to be revoked iommufd: Do not map/unmap revoked DMABUFs iommufd: Add DMABUF to iopt_pages vfio/pci: Add vfio_pci_dma_buf_iommufd_map() vfio/nvgrace: Support get_dmabuf_phys vfio/pci: Add dma-buf export support for MMIO regions vfio/pci: Enable peer-to-peer DMA transactions by default vfio/pci: Share the core device pointer while invoking feature functions vfio: Export vfio device get and put registration helpers dma-buf: provide phys_vec to scatter-gather mapping routine PCI/P2PDMA: Document DMABUF model PCI/P2PDMA: Provide an access to pci_p2pdma_map_type() function PCI/P2PDMA: Refactor to separate core P2P functionality from memory allocation PCI/P2PDMA: Simplify bus address mapping API PCI/P2PDMA: Separate the mmap() support from the core logic Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-11-26	Merge tag 'kvm-x86-selftests-6.19' of https://github.com/kvm-x86/linux into HEAD	Paolo Bonzini	24	-166/+481
	KVM selftests changes for 6.19: - Fix a math goof in mmu_stress_test when running on a single-CPU system/VM. - Forcefully override ARCH from x86_64 to x86 to play nice with specifying ARCH=x86_64 on the command line. - Extend a bunch of nested VMX to validate nested SVM as well. - Add support for LA57 in the core VM_MODE_xxx macro, and add a test to verify KVM can save/restore nested VMX state when L1 is using 5-level paging, but L2 is not. - Clean up the guest paging code in anticipation of sharing the core logic for nested EPT and nested NPT.
2025-11-26	Merge tag 'kvm-x86-gmem-6.19' of https://github.com/kvm-x86/linux into HEAD	Paolo Bonzini	9	-134/+315
	KVM guest_memfd changes for 6.19: - Add NUMA mempolicy support for guest_memfd, and clean up a variety of rough edges in guest_memfd along the way. - Define a CLASS to automatically handle get+put when grabbing a guest_memfd from a memslot to make it harder to leak references. - Enhance KVM selftests to make it easer to develop and debug selftests like those added for guest_memfd NUMA support, e.g. where test and/or KVM bugs often result in hard-to-debug SIGBUS errors. - Misc cleanups.
2025-11-25	selftest: af_unix: Extend recv() timeout in so_peek_off.c.	Kuniyuki Iwashima	1	-2/+2
	so_peek_off.c is reported to be flaky on NIPA: # # so_peek_off.c:149:two_chunks_overlap_blocking:Expected -1 (-1) != bytes (-1) # # two_chunks_overlap_blocking: Test terminated by assertion # # FAIL so_peek_off.stream.two_chunks_overlap_blocking The test fork()s a child process to send() data after 1ms to wake up the parent process being blocked (up to 3ms) on recv(). But, from the log, the parent woke up after 3ms timeout, so it could be too short when the host is overloaded. Let's extend it to 5s. Reported-by: Jakub Kicinski <kuba@kernel.org> Closes: https://lore.kernel.org/netdev/20251124070722.1e828c53@kernel.org/ Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20251124212805.486235-3-kuniyu@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-25	selftest: af_unix: Create its own .gitignore.	Kuniyuki Iwashima	2	-8/+8
	Somehow AF_UNIX tests have reused ../.gitignore, but now NIPA warns about it. Let's create .gitignore under af_unix/. Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20251124212805.486235-2-kuniyu@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-25	tcp: remove icsk->icsk_retransmit_timer	Eric Dumazet	2	-4/+4
	Now sk->sk_timer is no longer used by TCP keepalive, we can use its storage for TCP and MPTCP retransmit timers for better cache locality. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20251124175013.1473655-5-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-25	tcp: introduce icsk->icsk_keepalive_timer	Eric Dumazet	2	-4/+4
	sk->sk_timer has been used for TCP keepalives. Keepalive timers are not in fast path, we want to use sk->sk_timer storage for retransmit timers, for better cache locality. Create icsk->icsk_keepalive_timer and change keepalive code to no longer use sk->sk_timer. Added space is reclaimed in the following patch. This includes changes to MPTCP, which was also using sk_timer. Alias icsk->mptcp_tout_timer and icsk->icsk_keepalive_timer for inet_sk_diag_fill() sake. Signed-off-by: Eric Dumazet <edumazet@google.com> Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20251124175013.1473655-4-edumazet@google.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-25	vsock/test: Extend transport change null-ptr-deref test	Michal Luczaj	1	-1/+6
	syzkaller reported a lockdep lock order inversion warning[1] due to commit 687aa0c5581b ("vsock: Fix transport_* TOCTOU"). This was fixed in commit f7c877e75352 ("vsock: fix lock inversion in vsock_assign_transport()"). Redo syzkaller's repro by piggybacking on a somewhat related test implemented in commit 3a764d93385c ("vsock/test: Add test for null ptr deref when transport changes"). [1]: https://lore.kernel.org/netdev/68f6cdb0.a70a0220.205af.0039.GAE@google.com/ Signed-off-by: Michal Luczaj <mhal@rbox.co> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Link: https://patch.msgid.link/20251123-vsock_test-linger-lockdep-warn-v1-1-4b1edf9d8cdc@rbox.co Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-25	selftests/bpf: Make CS length configurable for rqspinlock stress test	Kumar Kartikeya Dwivedi	1	-2/+12
	Allow users to configure the critical section delay for both task/normal and NMI contexts, and set to 20ms and 10ms as before by default. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20251125020749.2421610-4-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-11-25	selftests/bpf: Add lock wait time stats to rqspinlock stress test	Kumar Kartikeya Dwivedi	1	-0/+104
	Add statistics per-CPU broken down by context and various timing windows for the time taken to acquire an rqspinlock. Cases where all acquisitions fit into the 10ms window are skipped from printing, otherwise the full breakdown is displayed when printing the summary. This allows capturing precisely the number of times outlier attempts happened for a given lock in a given context. A critical detail is that time is captured regardless of success or failure, which is important to capture events for failed but long waiting timeout attempts. Output: [ 64.279459] rqspinlock acquisition latency histogram (ms): [ 64.279472] cpu1: total 528426 (normal 526559, nmi 1867) [ 64.279477] 0-1ms: total 524697 (normal 524697, nmi 0) [ 64.279480] 2-2ms: total 3652 (normal 1811, nmi 1841) [ 64.279482] 3-3ms: total 66 (normal 47, nmi 19) [ 64.279485] 4-4ms: total 2 (normal 1, nmi 1) [ 64.279487] 5-5ms: total 1 (normal 1, nmi 0) [ 64.279489] 6-6ms: total 1 (normal 0, nmi 1) [ 64.279490] 101-150ms: total 1 (normal 0, nmi 1) [ 64.279492] >= 251ms: total 6 (normal 2, nmi 4) ... Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20251125020749.2421610-3-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-11-25	selftests/bpf: Relax CPU requirements for rqspinlock stress test	Kumar Kartikeya Dwivedi	1	-1/+1
	Only require 2 CPUs for AA, 3 for ABBA, 4 for ABBCCA, which is calculated nicely by adding to the mode enum. Enables running single CPU AA tests. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20251125020749.2421610-2-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-11-25	selftests/bpf: Call bpf_get_numa_node_id() in trigger_count()	Menglong Dong	2	-4/+6
	The bench test "trig-kernel-count" can be used as a baseline comparison for fentry and other benchmarks, and the calling to bpf_get_numa_node_id() should be considered as composition of the baseline. So, let's call it in trigger_count(). Meanwhile, rename trigger_count() to trigger_kernel_count() to make it easier understand. Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn> Signed-off-by: Andrii Nakryiko <andrii@kernel.org> Link: https://lore.kernel.org/bpf/20251116014242.151110-1-dongml2@chinatelecom.cn
2025-11-25	iommufd/selftest: Add some tests for the dmabuf flow	Jason Gunthorpe	2	-0/+87
	Basic tests of establishing a dmabuf and revoking it. The selftest kernel side provides a basic small dmabuf for this testing. Link: https://patch.msgid.link/r/9-v2-b2c110338e3f+5c2-iommufd_dmabuf_jgg@nvidia.com Reviewed-by: Nicolin Chen <nicolinc@nvidia.com> Reviewed-by: Kevin Tian <kevin.tian@intel.com> Tested-by: Nicolin Chen <nicolinc@nvidia.com> Signed-off-by: Jason Gunthorpe <jgg@nvidia.com>
2025-11-24	selftests: af_unix: don't use SKIP for expected failures	Jakub Kicinski	1	-2/+6
	netdev CI reserves SKIP in selftests for cases which can't be executed due to setup issues, like missing or old commands. Tests which are expected to fail must use XFAIL. Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20251123021601.158709-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-24	selftests: netconsole: ensure required log level is set on netcons_basic	Andre Carvalho	1	-2/+3
	This commit ensures that the required log level is set at the start of the test iteration. Part of the cleanup performed at the end of each test iteration resets the log level (do_cleanup in lib_netcons.sh) to the values defined at the time test script started. This may cause further test iterations to fail if the default values are not sufficient. Signed-off-by: Andre Carvalho <asantostc@gmail.com> Reviewed-by: Breno Leitao <leitao@debian.org> Link: https://patch.msgid.link/20251121-netcons-basic-loglevel-v1-1-577f8586159c@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-24	selftests: hw-net: toeplitz: give the test up to 4 seconds	Jakub Kicinski	1	-1/+1
	Increase the receiver timeout. When running between machines in different geographic regions the test needs more than a second to SSH across and send the frames. The bkg() command that runs the receiver defaults to 5 sec timeout, so using 4 sec sounds like a reasonable value for the receiver itself. Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20251121040259.3647749-6-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-24	selftests: hw-net: toeplitz: read indirection table from the device	Jakub Kicinski	1	-1/+23
	Replace the simple modulo math with the real indirection table read from the device. This makes the tests pass for mlx5 and bnxt NICs. Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20251121040259.3647749-5-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-24	selftests: hw-net: toeplitz: read the RSS key directly from C	Jakub Kicinski	3	-8/+44
	Now that we have YNL support for RSS accessing the RSS info from C is very easy. Instead of passing the RSS key from Python do it directly in the C code. Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20251121040259.3647749-4-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-24	selftests: hw-net: toeplitz: make sure NICs have pure Toeplitz configured	Jakub Kicinski	1	-11/+18
	Make sure that the NIC under test is configured for pure Toeplitz hashing, and no input key transform (no symmetric hashing). Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20251121040259.3647749-3-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-24	selftests: hw-net: auto-disable building the iouring C code	Jakub Kicinski	1	-1/+16
	Looks like the liburing is not updated by distros very aggressively. Presumably because a lot of packages depend on it. I just updated to Fedora 43 and it's still on liburing 2.9. The test is 9mo old, at this stage I think this warrants handling the build failure more gracefully. Detect if iouring is recent enough and if not print a warning and exclude the C prog from build. The Python test will just fail since the binary won't exist. But it removes the major annoyance of having to update liburing from sources when developing other tests. Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20251121040259.3647749-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-24	selftests/bpf: Fix htab_update/reenter_update selftest failure	Saket Kumar Bhaskar	2	-15/+41
	Since commit 31158ad02ddb ("rqspinlock: Add deadlock detection and recovery") the updated path on re-entrancy now reports deadlock via -EDEADLK instead of the previous -EBUSY. Also, the way reentrancy was exercised (via fentry/lookup_elem_raw) has been fragile because lookup_elem_raw may be inlined (find_kernel_btf_id() will return -ESRCH). To fix this fentry is attached to bpf_obj_free_fields() instead of lookup_elem_raw() and: - The htab map is made to use a BTF-described struct val with a struct bpf_timer so that check_and_free_fields() reliably calls bpf_obj_free_fields() on element replacement. - The selftest is updated to do two updates to the same key (insert + replace) in prog_test. - The selftest is updated to align with expected errno with the kernel’s current behavior. Signed-off-by: Saket Kumar Bhaskar <skb99@linux.ibm.com> Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com> Link: https://lore.kernel.org/r/20251117060752.129648-1-skb99@linux.ibm.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-11-24	tools/testing/vma: add missing stub	Lorenzo Stoakes	1	-0/+7
	vm_flags_reset() is not available in the userland VMA tests, so add a stub which const-casts vma->vm_flags and avoids the upcoming removal of the vma->__vm_flags field. Link: https://lkml.kernel.org/r/4aff8bf7-d367-4ba3-90ad-13eef7a063fa@lucifer.local Fixes: c5c67c1de357 ("tools/testing/vma: eliminate dependency on vma->__vm_flags") Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-11-24	mm: softdirty: add pgtable_supports_soft_dirty()	Chunyan Zhang	1	-0/+2
	Patch series "mm: Add soft-dirty and uffd-wp support for RISC-V", v15. This patchset adds support for Svrsw60t59b [1] extension which is ratified now, also add soft dirty and userfaultfd write protect tracking for RISC-V. The patches 1 and 2 add macros to allow architectures to define their own checks if the soft-dirty / uffd_wp PTE bits are available, in other words for RISC-V, the Svrsw60t59b extension is supported on which device the kernel is running. Also patch1-2 are removing "ifdef CONFIG_MEM_SOFT_DIRTY" "ifdef CONFIG_HAVE_ARCH_USERFAULTFD_WP" and "ifdef CONFIG_PTE_MARKER_UFFD_WP" in favor of checks which if not overridden by the architecture, no change in behavior is expected. This patchset has been tested with kselftest mm suite in which soft-dirty, madv_populate, test_unmerge_uffd_wp, and uffd-unit-tests run and pass, and no regressions are observed in any of the other tests. This patch (of 6): Some platforms can customize the PTE PMD entry soft-dirty bit making it unavailable even if the architecture provides the resource. Add an API which architectures can define their specific implementations to detect if soft-dirty bit is available on which device the kernel is running. This patch is removing "ifdef CONFIG_MEM_SOFT_DIRTY" in favor of pgtable_supports_soft_dirty() checks that defaults to IS_ENABLED(CONFIG_MEM_SOFT_DIRTY), if not overridden by the architecture, no change in behavior is expected. We make sure to never set VM_SOFTDIRTY if !pgtable_supports_soft_dirty(), so we will never run into VM_SOFTDIRTY checks. [lorenzo.stoakes@oracle.com: fix VMA selftests] Link: https://lkml.kernel.org/r/dac6ddfe-773a-43d5-8f69-021b9ca4d24b@lucifer.local Link: https://lkml.kernel.org/r/20251113072806.795029-1-zhangchunyan@iscas.ac.cn Link: https://lkml.kernel.org/r/20251113072806.795029-2-zhangchunyan@iscas.ac.cn Link: https://github.com/riscv-non-isa/riscv-iommu/pull/543 [1] Signed-off-by: Chunyan Zhang <zhangchunyan@iscas.ac.cn> Acked-by: David Hildenbrand <david@redhat.com> Cc: Albert Ou <aou@eecs.berkeley.edu> Cc: Alexandre Ghiti <alex@ghiti.fr> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Christian Brauner <brauner@kernel.org> Cc: Conor Dooley <conor@kernel.org> Cc: Deepak Gupta <debug@rivosinc.com> Cc: Jan Kara <jack@suse.cz> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Palmer Dabbelt <palmer@dabbelt.com> Cc: Paul Walmsley <paul.walmsley@sifive.com> Cc: Peter Xu <peterx@redhat.com> Cc: Rob Herring <robh@kernel.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Yuanchu Xie <yuanchu@google.com> Cc: Alexandre Ghiti <alexghiti@rivosinc.com> Cc: Andrew Jones <ajones@ventanamicro.com> Cc: Conor Dooley <conor.dooley@microchip.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-11-24	selftests/mm: gup_test: fix comment regarding origin of FOLL_WRITE	Peng Li	1	-1/+1
	The 'FOLL_WRITE' of the copied source is located in mm_types.h of mm, not mm.h, so fix it. Link: https://lkml.kernel.org/r/20251117154012.197499-2-peng8420.li@gmail.com Signed-off-by: Peng Li <peng8420.li@gmail.com> Reviewed-by: John Hubbard <jhubbard@nvidia.com> Reviewed-by: David Hildenbrand (Red Hat) <david@kernel.org> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Oscar Salvador <osalvador@suse.de> Cc: Peter Xu <peterx@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-11-24	selftests/mm: gup_test: stop testing FOLL_TOUCH	Peng Li	1	-18/+4
	commit 0f20bba1688b ("mm/gup: explicitly define and check internal GUP flags, disallow FOLL_TOUCH") marked FOLL_TOUCH as a GUP-internal flag. This causes a warning to fire when running gup_test, for example: $ ./gup_test -L -r 100 -z dmesg: WARNING: CPU: 1 PID: 117 at mm/gup.c:2512 is_valid_gup_args+0x66/0x8c Therefore, remove the "FOLL_TOUCH" test code from gup_test.c. Link: https://lkml.kernel.org/r/20251117154012.197499-1-peng8420.li@gmail.com Signed-off-by: Peng Li <peng8420.li@gmail.com> Reviewed-by: John Hubbard <jhubbard@nvidia.com> Reviewed-by: David Hildenbrand (Red Hat) <david@kernel.org> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Oscar Salvador <osalvador@suse.de> Cc: Peter Xu <peterx@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-11-24	selftests/mm/hmm-tests: new throughput tests including THP	Balbir Singh	1	-1/+196
	Add new benchmark style support to test transfer bandwidth for zone device memory operations. Link: https://lkml.kernel.org/r/20251001065707.920170-16-balbirs@nvidia.com Signed-off-by: Balbir Singh <balbirs@nvidia.com> Cc: David Hildenbrand <david@redhat.com> Cc: Zi Yan <ziy@nvidia.com> Cc: Joshua Hahn <joshua.hahnjy@gmail.com> Cc: Rakie Kim <rakie.kim@sk.com> Cc: Byungchul Park <byungchul@sk.com> Cc: Gregory Price <gourry@gourry.net> Cc: Ying Huang <ying.huang@linux.alibaba.com> Cc: Alistair Popple <apopple@nvidia.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com> Cc: Nico Pache <npache@redhat.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Dev Jain <dev.jain@arm.com> Cc: Barry Song <baohua@kernel.org> Cc: Lyude Paul <lyude@redhat.com> Cc: Danilo Krummrich <dakr@kernel.org> Cc: David Airlie <airlied@gmail.com> Cc: Simona Vetter <simona@ffwll.ch> Cc: Ralph Campbell <rcampbell@nvidia.com> Cc: Mika Penttilä <mpenttil@redhat.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Francois Dugast <francois.dugast@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-11-24	selftests/mm/hmm-tests: partial unmap, mremap and anon_write tests	Matthew Brost	1	-60/+252
	Add partial unmap test case which munmaps memory while in the device. Add tests exercising mremap on faulted-in memory (CPU and GPU) at various offsets and verify correctness. Update anon_write_child to read device memory after fork verifying this flow works in the kernel. Both THP and non-THP cases are updated. Link: https://lkml.kernel.org/r/20251001065707.920170-15-balbirs@nvidia.com Signed-off-by: Balbir Singh <balbirs@nvidia.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Cc: David Hildenbrand <david@redhat.com> Cc: Zi Yan <ziy@nvidia.com> Cc: Joshua Hahn <joshua.hahnjy@gmail.com> Cc: Rakie Kim <rakie.kim@sk.com> Cc: Byungchul Park <byungchul@sk.com> Cc: Gregory Price <gourry@gourry.net> Cc: Ying Huang <ying.huang@linux.alibaba.com> Cc: Alistair Popple <apopple@nvidia.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com> Cc: Nico Pache <npache@redhat.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Dev Jain <dev.jain@arm.com> Cc: Barry Song <baohua@kernel.org> Cc: Lyude Paul <lyude@redhat.com> Cc: Danilo Krummrich <dakr@kernel.org> Cc: David Airlie <airlied@gmail.com> Cc: Simona Vetter <simona@ffwll.ch> Cc: Ralph Campbell <rcampbell@nvidia.com> Cc: Mika Penttilä <mpenttil@redhat.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Francois Dugast <francois.dugast@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-11-24	selftests/mm/hmm-tests: new tests for zone device THP migration	Balbir Singh	1	-0/+410
	Add new tests for migrating anon THP pages, including anon_huge, anon_huge_zero and error cases involving forced splitting of pages during migration. Link: https://lkml.kernel.org/r/20251001065707.920170-14-balbirs@nvidia.com Signed-off-by: Balbir Singh <balbirs@nvidia.com> Cc: David Hildenbrand <david@redhat.com> Cc: Zi Yan <ziy@nvidia.com> Cc: Joshua Hahn <joshua.hahnjy@gmail.com> Cc: Rakie Kim <rakie.kim@sk.com> Cc: Byungchul Park <byungchul@sk.com> Cc: Gregory Price <gourry@gourry.net> Cc: Ying Huang <ying.huang@linux.alibaba.com> Cc: Alistair Popple <apopple@nvidia.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com> Cc: Nico Pache <npache@redhat.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Dev Jain <dev.jain@arm.com> Cc: Barry Song <baohua@kernel.org> Cc: Lyude Paul <lyude@redhat.com> Cc: Danilo Krummrich <dakr@kernel.org> Cc: David Airlie <airlied@gmail.com> Cc: Simona Vetter <simona@ffwll.ch> Cc: Ralph Campbell <rcampbell@nvidia.com> Cc: Mika Penttilä <mpenttil@redhat.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: Francois Dugast <francois.dugast@intel.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-11-24	Merge branch 'mm-hotfixes-stable' into mm-stable in order to merge	Andrew Morton	1	-8/+7
	"mm/huge_memory: only get folio_order() once during __folio_split()" into mm-stable.
2025-11-24	KVM: arm64: selftests: vgic_irq: Add timer deactivation test	Marc Zyngier	1	-0/+65
	Add a new test case that triggers the HW deactivation emulation path when trapping ICV_DIR_EL1. This is obviously tied to the way KVM works now, but the test follows the expected architectural behaviour. Tested-by: Fuad Tabba <tabba@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Tested-by: Mark Brown <broonie@kernel.org> Link: https://msgid.link/20251120172540.2267180-50-maz@kernel.org Signed-off-by: Oliver Upton <oupton@kernel.org>
2025-11-24	KVM: arm64: selftests: vgic_irq: Add Group-0 enable test	Marc Zyngier	1	-0/+49
	Add a new test case that inject a Group-0 interrupt together with a bunch of Group-1 interrupts, Ack/EOI the G1 interrupts, and only then enable G0, expecting to get the G0 interrupt. Tested-by: Fuad Tabba <tabba@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Tested-by: Mark Brown <broonie@kernel.org> Link: https://msgid.link/20251120172540.2267180-49-maz@kernel.org Signed-off-by: Oliver Upton <oupton@kernel.org>
2025-11-24	KVM: arm64: selftests: vgic_irq: Add asymmetric SPI deaectivation test	Marc Zyngier	1	-0/+105
	Add a new test case that makes an interrupt pending on a vcpu, activates it, do the priority drop, and then get another vcpu to do the deactivation. Special care is taken not to trigger an exit in the process, so that we are sure that the active interrupt is in an LR. Joy. Tested-by: Fuad Tabba <tabba@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Tested-by: Mark Brown <broonie@kernel.org> Link: https://msgid.link/20251120172540.2267180-48-maz@kernel.org Signed-off-by: Oliver Upton <oupton@kernel.org>
2025-11-24	KVM: arm64: selftests: vgic_irq: Perform EOImode==1 deactivation in ack order	Marc Zyngier	1	-2/+12
	When EOImode==1, perform the deactivation in the order of activation, just to make things a bit worse for KVM. Yes, I'm nasty. Tested-by: Fuad Tabba <tabba@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Tested-by: Mark Brown <broonie@kernel.org> Link: https://msgid.link/20251120172540.2267180-47-maz@kernel.org Signed-off-by: Oliver Upton <oupton@kernel.org>
2025-11-24	KVM: arm64: selftests: vgic_irq: Remove LR-bound limitation	Marc Zyngier	1	-13/+6
	Good news: our GIC emulation is not completely broken, and we can activate as many interrupts as we want. Bump the test to cover all the SGIs, all the allowed PPIs, and 31 SPIs. Yes, 31, because we have 31 available priorities, and the test is not happy with having two interrupts with the same priority. Tested-by: Fuad Tabba <tabba@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Tested-by: Mark Brown <broonie@kernel.org> Link: https://msgid.link/20251120172540.2267180-46-maz@kernel.org Signed-off-by: Oliver Upton <oupton@kernel.org>
2025-11-24	KVM: arm64: selftests: vgic_irq: Exclude timer-controlled interrupts	Marc Zyngier	1	-6/+25
	The PPI injection API is clear that you can't inject the timer PPIs from userspace, since they are controlled by the timers themselves. Add an exclusion list for this purpose. Tested-by: Fuad Tabba <tabba@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Tested-by: Mark Brown <broonie@kernel.org> Link: https://msgid.link/20251120172540.2267180-45-maz@kernel.org Signed-off-by: Oliver Upton <oupton@kernel.org>
2025-11-24	KVM: arm64: selftests: vgic_irq: Change configuration before enabling interrupt	Marc Zyngier	1	-3/+3
	The architecture is pretty clear that changing the configuration of an enable interrupt is not OK. It doesn't really matter here, but doing the right thing is not more expensive. Tested-by: Fuad Tabba <tabba@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Tested-by: Mark Brown <broonie@kernel.org> Link: https://msgid.link/20251120172540.2267180-44-maz@kernel.org Signed-off-by: Oliver Upton <oupton@kernel.org>
2025-11-24	KVM: arm64: selftests: vgic_irq: Fix GUEST_ASSERT_IAR_EMPTY() helper	Marc Zyngier	1	-1/+1
	No, 0 is not a spurious INTID. Never been, never was. Tested-by: Fuad Tabba <tabba@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Tested-by: Mark Brown <broonie@kernel.org> Link: https://msgid.link/20251120172540.2267180-43-maz@kernel.org Signed-off-by: Oliver Upton <oupton@kernel.org>
2025-11-24	KVM: arm64: selftests: gic_v3: Disable Group-0 interrupts by default	Marc Zyngier	1	-0/+2
	Make sure G0 is disabled at the point of initialising the GIC. Tested-by: Fuad Tabba <tabba@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Tested-by: Mark Brown <broonie@kernel.org> Link: https://msgid.link/20251120172540.2267180-42-maz@kernel.org Signed-off-by: Oliver Upton <oupton@kernel.org>
2025-11-24	KVM: arm64: selftests: gic_v3: Add irq group setting helper	Marc Zyngier	4	-0/+23
	Being able to set the group of an interrupt is pretty useful. Add such a helper. Tested-by: Fuad Tabba <tabba@google.com> Signed-off-by: Marc Zyngier <maz@kernel.org> Tested-by: Mark Brown <broonie@kernel.org> Link: https://msgid.link/20251120172540.2267180-41-maz@kernel.org Signed-off-by: Oliver Upton <oupton@kernel.org>
2025-11-24	selftests/mm: fix division-by-zero in uffd-unit-tests	Carlos Llamas	1	-8/+7
	Commit 4dfd4bba8578 ("selftests/mm/uffd: refactor non-composite global vars into struct") moved some of the operations previously implemented in uffd_setup_environment() earlier in the main test loop. The calculation of nr_pages, which involves a division by page_size, now occurs before checking that default_huge_page_size() returns a non-zero This leads to a division-by-zero error on systems with !CONFIG_HUGETLB. Fix this by relocating the non-zero page_size check before the nr_pages calculation, as it was originally implemented. Link: https://lkml.kernel.org/r/20251113034623.3127012-1-cmllamas@google.com Fixes: 4dfd4bba8578 ("selftests/mm/uffd: refactor non-composite global vars into struct") Signed-off-by: Carlos Llamas <cmllamas@google.com> Acked-by: David Hildenbrand (Red Hat) <david@kernel.org> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Reviewed-by: Ujwal Kundur <ujwal.kundur@gmail.com> Cc: Brendan Jackman <jackmanb@google.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Peter Xu <peterx@redhat.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2025-11-24	selftests/bpf: Allow selftests to build with older xxd	Alan Maguire	2	-2/+5
	Currently selftests require xxd with the "-n <name>" option which allows the user to specify a name not derived from the input object path. Instead of relying on this newer feature, older xxd can be used if we link our desired name ("test_progs_verification_cert") to the input object. Many distros ship xxd in vim-common package and do not have the latest xxd with -n support. Fixes: b720903e2b14d ("selftests/bpf: Enable signature verification for some lskel tests") Signed-off-by: Alan Maguire <alan.maguire@oracle.com> Link: https://lore.kernel.org/r/20251120084754.640405-3-alan.maguire@oracle.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-11-24	KVM: riscv: selftests: Add SBI MPXY extension to get-reg-list	Anup Patel	1	-0/+4
	The KVM RISC-V allows SBI MPXY extensions for Guest/VM so add it to the get-reg-list test. Signed-off-by: Anup Patel <apatel@ventanamicro.com> Reviewed-by: Andrew Jones <ajones@ventanamicro.com> Link: https://lore.kernel.org/r/20251017155925.361560-5-apatel@ventanamicro.com Signed-off-by: Anup Patel <anup@brainfault.org>
2025-11-22	selftests/nolibc: error out on linker warnings	Thomas Weißschuh	1	-1/+1
	If the linker emits warnings these should abort the build. Otherwise they will be swallowed by run-tests.sh and not shown. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Acked-by: Willy Tarreau <w@1wt.eu>
2025-11-22	selftests/nolibc: use lld to link loongarch binaries	Thomas Weißschuh	1	-0/+1
	LLVM 21 switched to -mcmodel=medium for LoongArch64 compilations. This code model uses R_LARCH_ECALL36 relocations which might not be supported by GNU ld which to nolibc testsuite uses by default. ld will not resolve the relocation and all function calls will end up as busy loops. Use lld instead. We can not switch to lld for all LLVM builds, as it does not support all necessary architectures. Signed-off-by: Thomas Weißschuh <linux@weissschuh.net> Acked-by: Willy Tarreau <w@1wt.eu>
2025-11-21	selftests: bpf: Add tests for unbalanced rcu_read_lock	Puranjay Mohan	2	-0/+42
	As verifier now supports nested rcu critical sections, add new test cases to make sure unbalanced usage of rcu_read_lock()/unlock() is rejected. Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20251117200411.25563-3-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-11-21	bpf: support nested rcu critical sections	Puranjay Mohan	1	-1/+1
	Currently, nested rcu critical sections are rejected by the verifier and rcu_lock state is managed by a boolean variable. Add support for nested rcu critical sections by make active_rcu_locks a counter similar to active_preempt_locks. bpf_rcu_read_lock() increments this counter and bpf_rcu_read_unlock() decrements it, MEM_RCU -> PTR_UNTRUSTED transition happens when active_rcu_locks drops to 0. Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20251117200411.25563-2-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-11-21	bpf: test the correct stack liveness of tail calls	Eduard Zingerman	1	-0/+50
	A new test is added: caller_stack_write_tail_call tests that the live stack is correctly tracked for a tail call. Signed-off-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: Martin Teichmann <martin.teichmann@xfel.eu> Link: https://lore.kernel.org/r/20251119160355.1160932-5-martin.teichmann@xfel.eu Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-11-21	bpf: test the proper verification of tail calls	Martin Teichmann	2	-2/+90
	Three tests are added: - invalidate_pkt_pointers_by_tail_call checks that one can use the packet pointer after a tail call. This was originally possible and also poses not problems, but was made impossible by 1a4607ffba35. - invalidate_pkt_pointers_by_static_tail_call tests a corner case found by Eduard Zingerman during the discussion of the original fix, which was broken in that fix. - subprog_result_tail_call tests that precision propagation works correctly across tail calls. This did not work before. Signed-off-by: Martin Teichmann <martin.teichmann@xfel.eu> Acked-by: Eduard Zingerman <eddyz87@gmail.com> Link: https://lore.kernel.org/r/20251119160355.1160932-3-martin.teichmann@xfel.eu Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-11-21	selftests/bpf: Update test_tag to use sha256	Xing Guo	1	-1/+1
	commit 603b44162325 ("bpf: Update the bpf_prog_calc_tag to use SHA256") changed digest of prog_tag to SHA256 but forgot to update tests correspondingly. Fix it. Fixes: 603b44162325 ("bpf: Update the bpf_prog_calc_tag to use SHA256") Signed-off-by: Xing Guo <higuoxing@gmail.com> Link: https://lore.kernel.org/r/20251121061458.3145167-1-higuoxing@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-11-21	selftests/bpf: Improve reliability of test_perf_branches_no_hw()	Matt Bobrowski	2	-2/+17
	Currently, test_perf_branches_no_hw() relies on the busy loop within test_perf_branches_common() being slow enough to allow at least one perf event sample tick to occur before starting to tear down the backing perf event BPF program. With a relatively small fixed iteration count of 1,000,000, this is not guaranteed on modern fast CPUs, resulting in the test run to subsequently fail with the following: bpf_testmod.ko is already unloaded. Loading bpf_testmod.ko... Successfully loaded bpf_testmod.ko. test_perf_branches_common:PASS:test_perf_branches_load 0 nsec test_perf_branches_common:PASS:attach_perf_event 0 nsec test_perf_branches_common:PASS:set_affinity 0 nsec check_good_sample:PASS:output not valid 0 nsec check_good_sample:PASS:read_branches_size 0 nsec check_good_sample:PASS:read_branches_stack 0 nsec check_good_sample:PASS:read_branches_stack 0 nsec check_good_sample:PASS:read_branches_global 0 nsec check_good_sample:PASS:read_branches_global 0 nsec check_good_sample:PASS:read_branches_size 0 nsec test_perf_branches_no_hw:PASS:perf_event_open 0 nsec test_perf_branches_common:PASS:test_perf_branches_load 0 nsec test_perf_branches_common:PASS:attach_perf_event 0 nsec test_perf_branches_common:PASS:set_affinity 0 nsec check_bad_sample:FAIL:output not valid no valid sample from prog Summary: 0/1 PASSED, 0 SKIPPED, 1 FAILED Successfully unloaded bpf_testmod.ko. On a modern CPU (i.e. one with a 3.5 GHz clock rate), executing 1 million increments of a volatile integer can take significantly less than 1 millisecond. If the spin loop and detachment of the perf event BPF program elapses before the first 1 ms sampling interval elapses, the perf event will never end up firing. Fix this by bumping the loop iteration counter a little within test_perf_branches_common(), along with ensuring adding another loop termination condition which is directly influenced by the backing perf event BPF program executing. Notably, a concious decision was made to not adjust the sample_freq value as that is just not a reliable way to go about fixing the problem. It effectively still leaves the race window open. Fixes: 67306f84ca78c ("selftests/bpf: Add bpf_read_branch_records() selftest") Signed-off-by: Matt Bobrowski <mattbobrowski@google.com> Reviewed-by: Jiri Olsa <jolsa@kernel.org> Link: https://lore.kernel.org/r/20251119143540.2911424-1-mattbobrowski@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-11-21	selftests/bpf: skip test_perf_branches_hw() on unsupported platforms	Matt Bobrowski	1	-3/+3
	Gracefully skip the test_perf_branches_hw subtest on platforms that do not support LBR or require specialized perf event attributes to enable branch sampling. For example, AMD's Milan (Zen 3) supports BRS rather than traditional LBR. This requires specific configurations (attr.type = PERF_TYPE_RAW, attr.config = RETIRED_TAKEN_BRANCH_INSTRUCTIONS) that differ from the generic setup used within this test. Notably, it also probably doesn't hold much value to special case perf event configurations for selected micro architectures. Fixes: 67306f84ca78c ("selftests/bpf: Add bpf_read_branch_records() selftest") Signed-off-by: Matt Bobrowski <mattbobrowski@google.com> Acked-by: Song Liu <song@kernel.org> Link: https://lore.kernel.org/r/20251120142059.2836181-1-mattbobrowski@google.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-11-21	selftests: bpf: Enable gotox tests from arm64	Puranjay Mohan	1	-2/+2
	arm64 JIT now supports gotox instruction and jumptables, so run tests in verifier_gotox.c for arm64. Signed-off-by: Puranjay Mohan <puranjay@kernel.org> Reviewed-by: Anton Protopopov <a.s.protopopov@gmail.com> Link: https://lore.kernel.org/r/20251117130732.11107-4-puranjay@kernel.org Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-11-21	selftests/bpf: Use sockaddr_storage instead of sa46 in select_reuseport test	Hoyeon Lee	1	-33/+34
	The select_reuseport selftest uses a custom sa46 union to represent IPv4 and IPv6 addresses. This custom wrapper requires extra manual handling for address family and field extraction. Replace sa46 with sockaddr_storage and update the helper functions to operate on native socket structures. This simplifies the code and removes unnecessary custom address-handling logic. No functional changes intended. Reviewed-by: Amery Hung <ameryhung@gmail.com> Signed-off-by: Hoyeon Lee <hoyeon.lee@suse.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20251121081332.2309838-3-hoyeon.lee@suse.com
2025-11-21	selftests/bpf: Use sockaddr_storage directly in cls_redirect test	Hoyeon Lee	1	-79/+43
	The cls_redirect test uses a custom addr_port/tuple wrapper to represent IPv4/IPv6 addresses and ports. This custom wrapper requires extra conversion logic and specific helpers such as fill_addr_port(), which are no longer necessary when using standard socket address structures. This commit replaces addr_port/tuple with the standard sockaddr_storage so test handles address families and ports using native socket types. It removes the custom helper, eliminates redundant casts, and simplifies the setup helpers without functional changes. set_up_conn() and build_input() now take src/dst sockaddr_storage directly. Reviewed-by: Amery Hung <ameryhung@gmail.com> Signed-off-by: Hoyeon Lee <hoyeon.lee@suse.com> Signed-off-by: Martin KaFai Lau <martin.lau@kernel.org> Link: https://patch.msgid.link/20251121081332.2309838-2-hoyeon.lee@suse.com
2025-11-21	KVM: selftests: Make sure vm->vpages_mapped is always up-to-date	Yosry Ahmed	3	-3/+3
	Call paths leading to __virt_pg_map() are currently: (a) virt_pg_map() -> virt_arch_pg_map() -> __virt_pg_map() (b) virt_map_level() -> __virt_pg_map() For (a), calls to virt_pg_map() from kvm_util.c make sure they update vm->vpages_mapped, but other callers do not. Move the sparsebit_set() call into virt_pg_map() to make sure all callers are captured. For (b), call sparsebit_set_num() from virt_map_level(). It's tempting to have a single the call inside __virt_pg_map(), however: - The call path in (a) is not x86-specific, while (b) is. Moving the call into __virt_pg_map() would require doing something similar for other archs implementing virt_pg_map(). - Future changes will reusue __virt_pg_map() for nested PTEs, which should not update vm->vpages_mapped, i.e. a triple underscore version that does not update vm->vpages_mapped would need to be provided. Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev> Link: https://patch.msgid.link/20251021074736.1324328-12-yosry.ahmed@linux.dev Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-11-21	KVM: selftests: Stop using __virt_pg_map() directly in tests	Yosry Ahmed	2	-5/+3
	Replace __virt_pg_map() calls in tests by high-level equivalent functions, removing some loops in the process. No functional change intended. Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev> Link: https://patch.msgid.link/20251021074736.1324328-11-yosry.ahmed@linux.dev Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-11-21	KVM: s390: Add capability that forwards operation exceptions	Janosch Frank	2	-0/+141
	Setting KVM_CAP_S390_USER_OPEREXEC will forward all operation exceptions to user space. This also includes the 0x0000 instructions managed by KVM_CAP_S390_USER_INSTR0. It's helpful if user space wants to emulate instructions which do not (yet) have an opcode. While we're at it refine the documentation for KVM_CAP_S390_USER_INSTR0. Signed-off-by: Janosch Frank <frankja@linux.ibm.com> Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com> Acked-by: Christian Borntraeger <borntraeger@linux.ibm.com> Signed-off-by: Janosch Frank <frankja@linux.ibm.com>
2025-11-20	selftest: netdevsim: test devlink default params	Daniel Zahka	1	-6/+110
	Test querying default values and resetting to default values for netdevsim devlink params. This should cover the basic paths of interest: driverinit and non-driverinit cmodes, as well as bool and non-bool value type. Default param values of type bool are encoded with u8 netlink type as opposed to flag type, so that userspace can distinguish "not-present" from false. Signed-off-by: Daniel Zahka <daniel.zahka@gmail.com> Link: https://patch.msgid.link/20251119025038.651131-7-daniel.zahka@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-20	netconsole: Increase MAX_USERDATA_ITEMS	Gustavo Luiz Duarte	1	-1/+1
	Increase MAX_USERDATA_ITEMS from 16 to 256 entries now that the userdata buffer is allocated dynamically. The previous limit of 16 was necessary because the buffer was statically allocated for all targets. With dynamic allocation, we can support more entries without wasting memory on targets that don't use userdata. This allows users to attach more metadata to their netconsole messages, which is useful for complex debugging and logging scenarios. Also update the testcase accordingly. Signed-off-by: Gustavo Luiz Duarte <gustavold@gmail.com> Reviewed-by: Breno Leitao <leitao@debian.org> Link: https://patch.msgid.link/20251119-netconsole_dynamic_extradata-v3-4-497ac3191707@meta.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-20	selftests: net: remove old setup_* scripts	Jakub Kicinski	4	-167/+112
	gro.sh and toeplitz.sh used to source in one of two setup scripts depending on whether the test was expected to be run against veth or a real device. veth testing is replaced by netdevsim and existing "remote endpoint" support in our Python tests. Add a script which sets up loopback mode. The usage is a little bit more complicated than running the scripts used to be. Testing used to work like this: ./../gro.sh -i eth0 ... now the "setup script" has to be run explicitly: NETIF=eth0 ./../ksft_setup_loopback.sh ./../gro.sh But the functionality itself is retained. Reviewed-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20251120021024.2944527-13-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-20	selftests: drv-net: hw: convert the Toeplitz test to Python	Jakub Kicinski	5	-229/+215
	Rewrite the existing toeplitz.sh test in Python. The conversion is a lot less exact than the GRO one. We use Netlink APIs to get the device RSS and IRQ information. We expect that the device has neither RPS nor RFS configured, and set RPS up as part of the test. Reviewed-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20251120021024.2944527-11-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-20	selftests: drv-net: add a Python version of the GRO test	Jakub Kicinski	4	-106/+168
	Rewrite the existing gro.sh test in Python. The conversion not exact, the changes are related to integrating the test with our "remote endpoint" paradigm. The test now reads the IP addresses from the user config. It resolves the MAC address (including running over Layer 3 networks). Reviewed-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20251120021024.2944527-10-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-20	selftests: net: py: read ip link info about remote dev	Jakub Kicinski	2	-1/+3
	We're already saving the info about the local dev in env.dev for the tests, save remote dev as well. This is more symmetric, env generally provides the same info for local and remote end. While at it make sure that we reliably get the detailed info about the local dev. nsim used to read the dev info without -d. Reviewed-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20251120021024.2944527-8-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-20	selftests: net: py: support ksft ready without wait	Jakub Kicinski	1	-8/+12
	There's a common synchronization problem when a script (Python test) uses a C program to set up some state (usually start a receiving process for traffic). The script needs to know when the process has fully initialized. The inverse of the problem exists for shutting the process down - we need a reliable way to tell the process to exit. We added helpers to do this safely in commit 71477137994f ("selftests: drv-net: add a way to wait for a local process") unfortunately the two operations (wait for init, and shutdown) are controlled by a single parameter (ksft_wait). Add support for using ksft_ready without using the second fd for exit. This is useful for programs which wait for a specific number of packets to rx so exit_wait is a good match, but we still need to wait for init. Reviewed-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Reviewed-by: breno Leitao <leitao@debian.org> Link: https://patch.msgid.link/20251120021024.2944527-7-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-20	selftests: net: relocate gro and toeplitz tests to drivers/net	Jakub Kicinski	14	-15/+17
	The GRO test can run on a real device or a veth. The Toeplitz hash test can only run on a real device. Move them from net/ to drivers/net/ and drivers/net/hw/ respectively. There are two scripts which set up the environment for these tests setup_loopback.sh and setup_veth.sh. Move those scripts to net/lib. The paths to the setup files are a little ugly but they will be deleted shortly. toeplitz_client.sh is not a test in itself, but rather a helper to send traffic, so add it to TEST_FILES rather than TEST_PROGS. Reviewed-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20251120021024.2944527-6-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-20	selftests: drv-net: xdp: use variants for qstat tests	Jakub Kicinski	1	-28/+14
	Use just-added ksft variants for XDP qstat tests. While at it correct the number of packets, we're sending 1000 packets now. Reviewed-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20251120021024.2944527-5-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-20	selftests: net: py: add test variants	Jakub Kicinski	4	-7/+61
	There's a lot of cases where we try to re-run the same code with different parameters. We currently need to either use a generator method or create a "main" case implementation which then gets called by trivial case functions: def _test(x, y, z): ... def case_int(): _test(1, 2, 3) def case_str(): _test('a', 'b', 'c') Add support for variants, similar to kselftests_harness.h and a lot of other frameworks. Variants can be added as decorator to test functions: @ksft_variants([(1, 2, 3), ('a', 'b', 'c')]) def case(x, y, z): ... ksft_run() will auto-generate case names: case.1_2_3 case.a_b_c Because the names may not always be pretty (and to avoid forcing classes to implement case-friendly __str__()) add a wrapper class KsftNamedVariant which lets the user specify the name for the variant. Note that ksft_run's args are still supported. ksft_run splices args and variant params together. Reviewed-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Link: https://patch.msgid.link/20251120021024.2944527-4-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-20	selftests: net: py: extract the case generation logic	Jakub Kicinski	1	-8/+21
	In preparation for adding test variants move the test case collection logic to a dedicated function. New helper returns (function, args, name, ) tuples. The main test loop can simply run them, not much logic or discernment needed. Reviewed-by: Petr Machata <petrm@nvidia.com> Reviewed-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20251120021024.2944527-3-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-20	selftests: net: py: coding style improvements	Jakub Kicinski	1	-4/+3
	We're about to add more features here and finding new issues with old ones in place is hard. Address ruff checks: - bare exceptions - f-string with no params - unused import We need to use BaseException when handling defer(), as Petr points out. This retains the old behavior of ignoring SIGTERM while running cleanups. Reviewed-by: Willem de Bruijn <willemb@google.com> Reviewed-by: Petr Machata <petrm@nvidia.com> Link: https://patch.msgid.link/20251120021024.2944527-2-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-20	KVM: selftests: Add a VMX test for LA57 nested state	Jim Mattson	2	-0/+133
	Add a selftest that verifies KVM's ability to save and restore nested state when the L1 guest is using 5-level paging and the L2 guest is using 4-level paging. Specifically, canonicality tests of the VMCS12 host-state fields should accept 57-bit virtual addresses. Signed-off-by: Jim Mattson <jmattson@google.com> Link: https://patch.msgid.link/20251028225827.2269128-5-jmattson@google.com [sean: rename to vmx_nested_la57_state_test to prep nested_<test> namespace] Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-11-20	KVM: selftests: Change VM_MODE_PXXV48_4K to VM_MODE_PXXVYY_4K	Jim Mattson	6	-34/+40
	Use 57-bit addresses with 5-level paging on hardware that supports LA57. Continue to use 48-bit addresses with 4-level paging on hardware that doesn't support LA57. Suggested-by: Sean Christopherson <seanjc@google.com> Signed-off-by: Jim Mattson <jmattson@google.com> Link: https://patch.msgid.link/20251028225827.2269128-4-jmattson@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-11-20	KVM: selftests: Use a loop to walk guest page tables	Jim Mattson	1	-13/+10
	Walk the guest page tables via a loop when searching for a PTE, instead of using unique variables for each level of the page tables. This simplifies the code and makes it easier to support 5-level paging in the future. Signed-off-by: Jim Mattson <jmattson@google.com> Reviewed-by: Yosry Ahmed <yosry.ahmed@linux.dev> Link: https://patch.msgid.link/20251028225827.2269128-3-jmattson@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-11-20	KVM: selftests: Use a loop to create guest page tables	Jim Mattson	1	-14/+11
	Walk the guest page tables via a loop when creating new mappings, instead of using unique variables for each level of the page tables. This simplifies the code and makes it easier to support 5-level paging in the future. Signed-off-by: Jim Mattson <jmattson@google.com> Reviewed-by: Yosry Ahmed <yosry.ahmed@linux.dev> Link: https://patch.msgid.link/20251028225827.2269128-2-jmattson@google.com Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-11-20	KVM: selftests: Remove the unused argument to prepare_eptp()	Yosry Ahmed	4	-6/+4
	eptp_memslot is unused, remove it. No functional change intended. Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev> Link: https://patch.msgid.link/20251021074736.1324328-10-yosry.ahmed@linux.dev Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-11-20	KVM: selftests: Stop hardcoding PAGE_SIZE in x86 selftests	Yosry Ahmed	6	-18/+18
	Use PAGE_SIZE instead of 4096. Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev> Link: https://patch.msgid.link/20251021074736.1324328-9-yosry.ahmed@linux.dev Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-11-20	KVM: selftests: Extend vmx_tsc_adjust_test to cover SVM	Yosry Ahmed	2	-26/+45
	Add SVM L1 code to run the nested guest, and allow the test to run with SVM as well as VMX. Reviewed-by: Jim Mattson <jmattson@google.com> Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev> Link: https://patch.msgid.link/20251021074736.1324328-8-yosry.ahmed@linux.dev Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-11-20	KVM: selftests: Extend nested_invalid_cr3_test to cover SVM	Yosry Ahmed	1	-3/+40
	Add SVM L1 code to run the nested guest, and allow the test to run with SVM as well as VMX. Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev> Link: https://patch.msgid.link/20251021074736.1324328-7-yosry.ahmed@linux.dev Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-11-20	KVM: selftests: Move nested invalid CR3 check to its own test	Yosry Ahmed	3	-10/+80
	vmx_tsc_adjust_test currently verifies that a nested VMLAUNCH fails with an invalid CR3. This is irrelevant to TSC scaling, move it to a standalone test. Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev> Link: https://patch.msgid.link/20251021074736.1324328-6-yosry.ahmed@linux.dev Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-11-20	KVM: selftests: Extend vmx_nested_tsc_scaling_test to cover SVM	Yosry Ahmed	2	-6/+44
	Add SVM L1 code to run the nested guest, and allow the test to run with SVM as well as VMX. Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev> Link: https://patch.msgid.link/20251021074736.1324328-5-yosry.ahmed@linux.dev Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-11-20	KVM: selftests: Extend vmx_close_while_nested_test to cover SVM	Yosry Ahmed	2	-10/+34
	Add SVM L1 code to run the nested guest, and allow the test to run with SVM as well as VMX. Reviewed-by: Jim Mattson <jmattson@google.com> Signed-off-by: Yosry Ahmed <yosry.ahmed@linux.dev> Link: https://patch.msgid.link/20251021074736.1324328-4-yosry.ahmed@linux.dev [sean: rename to "nested_close_kvm_test" to provide nested_* sorting] Signed-off-by: Sean Christopherson <seanjc@google.com>
2025-11-20	selftests/hid-tablet: add ABS_DISTANCE test for stylus/pen	Ping Cheng	1	-0/+71
	For pen and stylus, the ABS_Z event reports ABS_DISTANCE values in the hid generic kernel driver. This test is to make sure that the assignment is properly done for all pen and stylus tools. Same as tilt, distance is an optional event. Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Signed-off-by: Ping Cheng <ping.cheng@wacom.com> Signed-off-by: Tatsunosuke Tobit <tatsunosuke.tobita@wacom.com> Signed-off-by: Jiri Kosina <jkosina@suse.com>
2025-11-20	testing/selftests/mm: add soft-dirty merge self-test	Lorenzo Stoakes	1	-1/+126
	Assert that we correctly merge VMAs containing VM_SOFTDIRTY flags now that we correctly handle these as sticky. In order to do so, we have to account for the fact the pagemap interface checks soft dirty PTEs and additionally that newly merged VMAs are marked VM_SOFTDIRTY. We do this by using use unfaulted anon VMAs, establishing one and clearing references on that one, before establishing another and merging the two before checking that soft-dirty is propagated as expected. We check that this functions correctly with mremap() and mprotect() as sample cases, because VMA merge of adjacent newly mapped VMAs will automatically be made soft-dirty due to existing logic which does so. We are therefore exercising other means of merging VMAs. Link: https://lkml.kernel.org/r/d5a0f735783fb4f30a604f570ede02ccc5e29be9.1763399675.git.lorenzo.stoakes@oracle.com Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Andrey Vagin <avagin@gmail.com> Cc: David Hildenbrand (Red Hat) <david@kernel.org> Cc: Jann Horn <jannh@google.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Mike Rapoport <rppt@kernel.org> Cc: Pedro Falcato <pfalcato@suse.de> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Cc: Cyrill Gorcunov <gorcunov@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>