path: root/tools
Age | Commit message | Author | Files | Lines
2 days | Merge tag 'locking-futex-2025-12-10' of ↵ [x86/boot] | Linus Torvalds | 6 | -9/+562
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull futex updates from Ingo Molnar: - Standardize on ktime_t in restart_block::time as well (Thomas Weißschuh) - Futex selftests: - Add robust list testcases (André Almeida) - Formatting fixes/cleanups (Carlos Llamas) * tag 'locking-futex-2025-12-10' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: futex: Store time as ktime_t in restart block selftests/futex: Create test for robust list selftests/futex: Skip tests if shmget unsupported selftests/futex: Add newline to ksft_exit_fail_msg() selftests/futex: Remove unused test_futex_mpol()
3 days | Merge tag 'platform-drivers-x86-v6.19-1' of ↵ | Linus Torvalds | 2 | -3/+45
git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86 Pull x86 platform driver updates from Ilpo Järvinen: - acer-wmi: Add PH16-72, PHN16-72, and PT14-51 fan control support - acpi: platform_profile: Add max-power profile option (power draw limited by the cooling hardware, may exceed battery power draw limit when on AC power) - amd/hsmp: Allow more than one data-fabric per socket - asus-armoury: Add WMI attributes driver to expose miscellaneous WMI functions through fw_attributes (deprecates the custom BIOS features interface through asus-wmi) - asus-wmi: Use brightness_set_blocking() for kbd led - ayaneo-ec: Add Ayaneo Embedded Controller driver - fs/nls: - Fix utf16 to utf8 string conversion when output size restricted - Improve error code consistency for utf8 to utf32 conversions - ideapad-laptop: Fast (Rapid Charge) charge type support - intel/hid: Add Dell Pro Rugged 10/12 tablet to VGBS DMI quirks - intel/pmc: - Arrow Lake telemetry GUID improvements - Add support for Wildcat Lake PMC information - intel_pmc_ipc: Fix ACPI buffer memleak - intel/punit_ipc: Fix memory corruption - intel/vsec: Wildcat Lake PMT telemetry support - lenovo-wmi-gamezone: Map "Extreme" performance mode to max-power - lg-laptop: Add support for the HDAP opregion field - serial-multi-instantiate: Add IRQ_RESOURCE_OPT for IRQ missing projects - thinkpad-t14s-ec: Improve suspend/resume support (lid LEDs, keyboard backlight) - uniwill: Add Uniwill laptop driver - wmi: Move under drivers/platform/wmi as non-x86 WMI support is around the corner and other WMI features will require adding more C files as well - tools/power/x86/intel-speed-select: v1.24 - Check feature status to check if the feature enablement was successful - Reset SST-TF bucket structure to display valid bucket info - Miscellaneous cleanups / refactoring / improvements * tag 'platform-drivers-x86-v6.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: (73 commits) 
tools/power/x86/intel-speed-select: v1.24 release tools/power/x86/intel-speed-select: Reset isst_turbo_freq_info for invalid buckets tools/power/x86/intel-speed-select: Check feature status platform/x86: asus-wmi: use brightness_set_blocking() for kbd led fs/nls: Fix inconsistency between utf8_to_utf32() and utf32_to_utf8() platform/x86: asus-armoury: add support for GA503QR platform/x86: intel_pmc_ipc: fix ACPI buffer memory leak platform/x86: hp-wmi: Order DMI board name arrays platform/x86/intel/hid: Add Dell Pro Rugged 10/12 tablet to VGBS DMI quirks platform: surface: replace use of system_wq with system_percpu_wq platform: x86: replace use of system_wq with system_percpu_wq platform/surface: acpi-notify: add WQ_PERCPU to alloc_workqueue users platform/x86: wmi-gamezone: Add Legion Go 2 Quirks platform/x86: lenovo-wmi-gamezone Use max-power rather than balanced-performance acpi: platform_profile - Add max-power profile option platform/x86/amd/pmf: Use devm_mutex_init() for mutex initialization platform/x86/amd/pmf: Add BIOS_INPUTS_MAX macro to replace hardcoded array size platform/x86: serial-multi-instantiate: Add IRQ_RESOURCE_OPT for IRQ missing projects platform/x86/amd/pmf: Refactor repetitive BIOS output handling platform/x86/uniwill: Add TUXEDO devices ...
3 days | Merge tag 'auto-type-conversion-for-v6.19-rc1' of ↵ | Linus Torvalds | 2 | -3/+8
git://git.kernel.org/pub/scm/linux/kernel/git/hpa/linux-auto Pull __auto_type to auto conversion from Peter Anvin: "Convert '__auto_type' to 'auto', defining a macro for 'auto' unless C23+ is in use" * tag 'auto-type-conversion-for-v6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/hpa/linux-auto: tools/virtio: replace "__auto_type" with "auto" selftests/bpf: replace "__auto_type" with "auto" arch/x86: replace "__auto_type" with "auto" arch/nios2: replace "__auto_type" and adjacent equivalent with "auto" fs/proc: replace "__auto_type" with "const auto" include/linux: change "__auto_type" to "auto" compiler_types.h: add "auto" as a macro for "__auto_type"
4 days | tools/virtio: replace "__auto_type" with "auto" | H. Peter Anvin | 1 | -1/+1
Replace one instance of "__auto_type" with "auto" in: tools/virtio/linux/compiler.h This file *does* include <linux/compiler_types.h> directly, so there is no need to duplicate the definition. Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
4 days | selftests/bpf: replace "__auto_type" with "auto" | H. Peter Anvin | 1 | -2/+7
Replace instances of "__auto_type" with "auto" in: tools/testing/selftests/bpf/prog_tests/socket_helpers.h This file does not seem to be including <linux/compiler_types.h> directly or indirectly, so copy the definition but guard it with !defined(auto). Acked-by: Eduard Zingerman <eddyz87@gmail.com> Signed-off-by: H. Peter Anvin (Intel) <hpa@zytor.com>
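The guarded definition described in the two commits above can be sketched as follows. This is a minimal illustration under stated assumptions, not the text of the actual patches: the exact preprocessor condition is a guess based on the phrases "unless C23+ is in use" and "guard it with !defined(auto)", and `max_of()` is a hypothetical helper that exists only to show type inference. It relies on the GCC/Clang `__auto_type` extension.

```c
/*
 * Sketch only: define "auto" as a macro for the GNU __auto_type
 * extension when the compiler is older than C23 and nothing has
 * already defined "auto". The exact condition is an assumption.
 */
#if (!defined(__STDC_VERSION__) || __STDC_VERSION__ < 202311L) && !defined(auto)
# define auto __auto_type
#endif

/* Hypothetical helper: the type of "m" is inferred from "a". */
static long max_of(long a, long b)
{
    auto m = a;    /* deduced as long, with or without the macro */

    if (b > m)
        m = b;
    return m;
}
```

With a C23 compiler the macro is skipped and `auto` is the language's own type-inference keyword; with an older GNU-compatible compiler the same source expands to `__auto_type`.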
4 days | Merge tag 'hyperv-next-signed-20251207' of ↵ | Linus Torvalds | 1 | -0/+4
git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux Pull hyperv updates from Wei Liu: - Enhancements to Linux as the root partition for Microsoft Hypervisor: - Support a new mode called L1VH, which allows Linux to drive the hypervisor running the Azure Host directly - Support for MSHV crash dump collection - Allow Linux's memory management subsystem to better manage guest memory regions - Fix issues that prevented a clean shutdown of the whole system on bare metal and nested configurations - ARM64 support for the MSHV driver - Various other bug fixes and cleanups - Add support for Confidential VMBus for Linux guest on Hyper-V - Secure AVIC support for Linux guests on Hyper-V - Add the mshv_vtl driver to allow Linux to run as the secure kernel in a higher virtual trust level for Hyper-V * tag 'hyperv-next-signed-20251207' of git://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux: (58 commits) mshv: Cleanly shutdown root partition with MSHV mshv: Use reboot notifier to configure sleep state mshv: Add definitions for MSHV sleep state configuration mshv: Add support for movable memory regions mshv: Add refcount and locking to mem regions mshv: Fix huge page handling in memory region traversal mshv: Move region management to mshv_regions.c mshv: Centralize guest memory region destruction mshv: Refactor and rename memory region handling functions mshv: adjust interrupt control structure for ARM64 Drivers: hv: use kmalloc_array() instead of kmalloc() mshv: Add ioctl for self targeted passthrough hvcalls Drivers: hv: Introduce mshv_vtl driver Drivers: hv: Export some symbols for mshv_vtl static_call: allow using STATIC_CALL_TRAMP_STR() from assembly mshv: Extend create partition ioctl to support cpu features mshv: Allow mappings that overlap in uaddr mshv: Fix create memory region overlap check mshv: add WQ_PERCPU to alloc_workqueue users Drivers: hv: Use kmalloc_array() instead of kmalloc() ...
5 days | Merge tag 'perf-tools-for-v6.19-2025-12-06' of ↵ | Linus Torvalds | 246 | -3670/+11130
git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools Pull perf tools updates from Namhyung Kim: "Perf event/metric description: Unify all event and metric descriptions in JSON format. Now event parsing and handling is greatly simplified by that. From users point of view, perf list will provide richer information about hardware events like the following. $ perf list hw List of pre-defined events (to be used in -e or -M): legacy hardware: branch-instructions [Retired branch instructions [This event is an alias of branches]. Unit: cpu] branch-misses [Mispredicted branch instructions. Unit: cpu] branches [Retired branch instructions [This event is an alias of branch-instructions]. Unit: cpu] bus-cycles [Bus cycles,which can be different from total cycles. Unit: cpu] cache-misses [Cache misses. Usually this indicates Last Level Cache misses; this is intended to be used in conjunction with the PERF_COUNT_HW_CACHE_REFERENCES event to calculate cache miss rates. Unit: cpu] cache-references [Cache accesses. Usually this indicates Last Level Cache accesses but this may vary depending on your CPU. This may include prefetches and coherency messages; again this depends on the design of your CPU. Unit: cpu] cpu-cycles [Total cycles. Be wary of what happens during CPU frequency scaling [This event is an alias of cycles]. Unit: cpu] cycles [Total cycles. Be wary of what happens during CPU frequency scaling [This event is an alias of cpu-cycles]. Unit: cpu] instructions [Retired instructions. Be careful,these can be affected by various issues,most notably hardware interrupt counts. Unit: cpu] ref-cycles [Total cycles; not affected by CPU frequency scaling. Unit: cpu] But most notable changes would be in the perf stat. On the right side, the default metrics are better named and aligned. 
:) $ perf stat -- perf test -w noploop Performance counter stats for 'perf test -w noploop': 11 context-switches # 10.8 cs/sec cs_per_second 0 cpu-migrations # 0.0 migrations/sec migrations_per_second 3,612 page-faults # 3532.5 faults/sec page_faults_per_second 1,022.51 msec task-clock # 1.0 CPUs CPUs_utilized 110,466 branch-misses # 0.0 % branch_miss_rate (88.66%) 6,934,452,104 branches # 6781.8 M/sec branch_frequency (88.66%) 4,657,032,590 cpu-cycles # 4.6 GHz cycles_frequency (88.65%) 27,755,874,218 instructions # 6.0 instructions insn_per_cycle (89.03%) TopdownL1 # 0.3 % tma_backend_bound # 9.3 % tma_bad_speculation (89.05%) # 9.7 % tma_frontend_bound (77.86%) # 80.7 % tma_retiring (88.81%) 1.025318171 seconds time elapsed 1.013248000 seconds user 0.012014000 seconds sys Deferred unwinding support: With the kernel support (commit c69993ecdd4d: "perf: Support deferred user unwind"), perf can use deferred callchains for userspace stack trace with frame pointers like below: $ perf record --call-graph fp,defer ... This will be transparent to users when it comes to other commands like perf report and perf script. They will merge the deferred callchains to the previous samples as if they were collected together. ARM SPE updates - Extensive enhancements to support various kinds of memory operations including GCS, MTE allocation tags, memcpy/memset, register access, and SIMD operations. - Add inverted data source filter (inv_data_src_filter) support to exclude certain data sources. - Improve documentation. Vendor event updates: - Intel: Updated event files for Sierra Forest, Panther Lake, Meteor Lake, Lunar Lake, Granite Rapids, and others. - Arm64: Added metrics for i.MX94 DDR PMU and Cortex-A720AE definitions. - RISC-V: Added JSON support for T-HEAD C920V2. Misc: - Improve pointer tracking in data type profiling. It'd give better output when the variable is using container_of() to convert type. - Annotation support for perf c2c report in TUI. 
Press 'a' key to enter annotation view from cacheline browser window. This will show which instruction is causing the cacheline contention. - Lots of fixes and test coverage improvements!" * tag 'perf-tools-for-v6.19-2025-12-06' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (214 commits) libperf: Use 'extern' in LIBPERF_API visibility macro perf stat: Improve handling of termination by signal perf tests stat: Add test for error for an offline CPU perf stat: When no events, don't report an error if there is none perf tests stat: Add "--null" coverage perf cpumap: Add "any" CPU handling to cpu_map__snprint_mask libperf cpumap: Fix perf_cpu_map__max for an empty/NULL map perf stat: Allow no events to open if this is a "--null" run perf test kvm: Add some basic perf kvm test coverage perf tests evlist: Add basic evlist test perf tests script dlfilter: Add a dlfilter test perf tests kallsyms: Add basic kallsyms test perf tests timechart: Add a perf timechart test perf tests top: Add basic perf top coverage test perf tests buildid: Add purge and remove testing perf tests c2c: Add a basic c2c perf c2c: Clean up some defensive gets and make asan clean perf jitdump: Fix missed dso__put perf mem-events: Don't leak online CPU map perf hist: In init, ensure mem_info is put on error paths ...
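The right-hand metric column in the `perf stat` sample above can be reproduced from the raw counters with simple ratios. The sketch below is an illustration only: the function names are hypothetical and are not perf's internal metric definitions, and because the sample counters are multiplexed (note the percentages in parentheses), the recomputed values only approximately match what perf prints.

```c
#include <stdint.h>

/* Events per second: raw count divided by task-clock time in seconds. */
static double per_second(uint64_t count, double task_clock_msec)
{
    return (double)count / (task_clock_msec / 1000.0);
}

/* Instructions retired per CPU cycle. */
static double insn_per_cycle(uint64_t instructions, uint64_t cycles)
{
    return (double)instructions / (double)cycles;
}

/* Mispredicted branches as a percentage of all retired branches. */
static double branch_miss_rate(uint64_t misses, uint64_t branches)
{
    return 100.0 * (double)misses / (double)branches;
}
```

For example, feeding in the sample's 11 context switches and 1,022.51 msec of task-clock gives a rate close to the 10.8 cs/sec shown, and 27,755,874,218 instructions over 4,657,032,590 cycles gives a value that rounds to the 6.0 shown for insn_per_cycle.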
5 days | Merge tag 'tty-6.19-rc1' of ↵ | Linus Torvalds | 4 | -1/+657
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty Pull tty/serial updates from Greg KH: "Here is the big set of tty/serial driver changes for 6.19-rc1. Nothing major at all, just small constant churn to make the tty layer "cleaner" as well as serial driver updates and even a new test added! Included in here are: - More tty/serial cleanups from Jiri - tty tiocsti test added to hopefully ensure we don't regress in this area again - sc16is7xx driver updates - imx serial driver updates - 8250 driver updates - new hardware device ids added - other minor serial/tty driver cleanups and tweaks All of these have been in linux-next for a while with no reported issues" * tag 'tty-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (60 commits) serial: sh-sci: Fix deadlock during RSCI FIFO overrun error dt-bindings: serial: rsci: Drop "uart-has-rtscts: false" LoongArch: dts: Add uart new compatible string serial: 8250: Add Loongson uart driver support dt-bindings: serial: 8250: Add Loongson uart compatible serial: 8250: add driver for KEBA UART serial: Keep rs485 settings for devices without firmware node serial: qcom-geni: Enable Serial on SA8255p Qualcomm platforms serial: qcom-geni: Enable PM runtime for serial driver serial: sprd: Return -EPROBE_DEFER when uart clock is not ready tty: serial: samsung: Declare earlycon for Exynos850 serial: icom: Convert PCIBIOS_* return codes to errnos serial: 8250-of: Fix style issues in 8250_of.c serial: add support of CPCI cards serial: mux: Fix kernel doc for mux_poll() tty: replace use of system_unbound_wq with system_dfl_wq serial: 8250_platform: simplify IRQF_SHARED handling serial: 8250: make share_irqs local to 8250_platform serial: 8250: move skip_txen_test to core serial: drop SERIAL_8250_DEPRECATED_OPTIONS ...
6 days | Merge tag 'mm-nonmm-stable-2025-12-06-11-14' of ↵ | Linus Torvalds | 317 | -307/+1291
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull non-MM updates from Andrew Morton:
- "panic: sys_info: Refactor and fix a potential issue" (Andy Shevchenko) fixes a build issue and does some cleanup in lib/sys_info.c
- "Implement mul_u64_u64_div_u64_roundup()" (David Laight) enhances the 64-bit math code on behalf of a PWM driver and beefs up the test module for these library functions
- "scripts/gdb/symbols: make BPF debug info available to GDB" (Ilya Leoshkevich) makes BPF symbol names, sizes, and line numbers available to the GDB debugger
- "Enable hung_task and lockup cases to dump system info on demand" (Feng Tang) adds a sysctl which can be used to cause additional info dumping when the hung-task and lockup detectors fire
- "lib/base64: add generic encoder/decoder, migrate users" (Kuan-Wei Chiu) adds a general base64 encoder/decoder to lib/ and migrates several users away from their private implementations
- "rbtree: inline rb_first() and rb_last()" (Eric Dumazet) makes TCP a little faster
- "liveupdate: Rework KHO for in-kernel users" (Pasha Tatashin) reworks the KEXEC Handover interfaces in preparation for Live Update Orchestrator (LUO), and possibly for other future clients
- "kho: simplify state machine and enable dynamic updates" (Pasha Tatashin) increases the flexibility of KEXEC Handover. Also preparation for LUO
- "Live Update Orchestrator" (Pasha Tatashin) is a major new feature targeted at cloud environments. Quoting the cover letter:
  This series introduces the Live Update Orchestrator, a kernel subsystem designed to facilitate live kernel updates using a kexec-based reboot. This capability is critical for cloud environments, allowing hypervisors to be updated with minimal downtime for running virtual machines. LUO achieves this by preserving the state of selected resources, such as memory, devices and their dependencies, across the kernel transition.
  As a key feature, this series includes support for preserving memfd file descriptors, which allows critical in-memory data, such as guest RAM or any other large memory region, to be maintained in RAM across the kexec reboot.
  Mike Rapoport merits a mention here, for his extensive review and testing work.
- "kexec: reorganize kexec and kdump sysfs" (Sourabh Jain) moves the kexec and kdump sysfs entries from /sys/kernel/ to /sys/kernel/kexec/ and adds back-compatibility symlinks which can hopefully be removed one day
- "kho: fixes for vmalloc restoration" (Mike Rapoport) fixes a BUG which was being hit during KHO restoration of vmalloc() regions
* tag 'mm-nonmm-stable-2025-12-06-11-14' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (139 commits)
  calibrate: update header inclusion
  Reinstate "resource: avoid unnecessary lookups in find_next_iomem_res()"
  vmcoreinfo: track and log recoverable hardware errors
  kho: fix restoring of contiguous ranges of order-0 pages
  kho: kho_restore_vmalloc: fix initialization of pages array
  MAINTAINERS: TPM DEVICE DRIVER: update the W-tag
  init: replace simple_strtoul with kstrtoul to improve lpj_setup
  KHO: fix boot failure due to kmemleak access to non-PRESENT pages
  Documentation/ABI: new kexec and kdump sysfs interface
  Documentation/ABI: mark old kexec sysfs deprecated
  kexec: move sysfs entries to /sys/kernel/kexec
  test_kho: always print restore status
  kho: free chunks using free_page() instead of kfree()
  selftests/liveupdate: add kexec test for multiple and empty sessions
  selftests/liveupdate: add simple kexec-based selftest for LUO
  selftests/liveupdate: add userspace API selftests
  docs: add documentation for memfd preservation via LUO
  mm: memfd_luo: allow preserving memfd
  liveupdate: luo_file: add private argument to store runtime state
  mm: shmem: export some functions to internal.h
  ...
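As an illustration of the mul_u64_u64_div_u64_roundup() item in the merge above, the operation (multiply two 64-bit values, divide by a third, round the quotient up, without losing the high bits of the product) can be sketched in userspace C. This is an assumption-laden stand-in, not the kernel's implementation: `mul_div_roundup_sketch()` is a hypothetical name, it relies on the GCC/Clang `unsigned __int128` extension, and it assumes the caller guarantees a non-zero divisor and a quotient that fits in 64 bits.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in: compute ceil(a * b / d) using a 128-bit
 * intermediate so the product cannot overflow. Not the kernel code.
 */
static uint64_t mul_div_roundup_sketch(uint64_t a, uint64_t b, uint64_t d)
{
    unsigned __int128 prod;

    assert(d != 0);                      /* caller must not divide by zero */
    prod = (unsigned __int128)a * b;     /* full 128-bit product */

    /* Adding d - 1 before dividing rounds the quotient up. */
    return (uint64_t)((prod + (d - 1)) / d);
}
```

The `prod + (d - 1)` sum cannot wrap, because the largest possible product of two 64-bit values leaves more than 2^64 of headroom in 128 bits.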
6 days | Merge tag 'objtool-urgent-2025-12-06' of ↵ | Linus Torvalds | 5 | -15/+154
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull objtool fixes from Ingo Molnar: "Address various objtool scalability bugs/inefficiencies exposed by allmodconfig builds, plus improve the quality of alternatives instructions generated code and disassembly" * tag 'objtool-urgent-2025-12-06' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: objtool: Simplify .annotate_insn code generation output some more objtool: Add more robust signal error handling, detect and warn about stack overflows objtool: Remove newlines and tabs from annotation macros objtool: Consolidate annotation macros x86/asm: Remove ANNOTATE_DATA_SPECIAL usage x86/alternative: Remove ANNOTATE_DATA_SPECIAL usage objtool: Fix stack overflow in validate_branch()
6 days | Merge tag 'nfsd-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux | Linus Torvalds | 24 | -33/+65
Pull nfsd updates from Chuck Lever: - Mike Snitzer's mechanism for disabling I/O caching introduced in v6.18 is extended to include using direct I/O. The goal is to further reduce the memory footprint consumed by NFS clients accessing large data sets via NFSD. - The NFSD community adopted a maintainer entry profile during this cycle. See Documentation/filesystems/nfs/nfsd-maintainer-entry-profile.rst - Work continues on hardening NFSD's implementation of the pNFS block layout type. This type enables pNFS clients to directly access the underlying block devices that contain an exported file system, reducing server overhead and increasing data throughput. - The remaining patches are clean-ups and minor optimizations. Many thanks to the contributors, reviewers, testers, and bug reporters who participated during the v6.19 NFSD development cycle. * tag 'nfsd-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux: (38 commits) NFSD: nfsd-io-modes: Separate lists NFSD: nfsd-io-modes: Wrap shell snippets in literal code blocks NFSD: Add toctree entry for NFSD IO modes docs NFSD: add Documentation/filesystems/nfs/nfsd-io-modes.rst NFSD: Implement NFSD_IO_DIRECT for NFS WRITE NFSD: Make FILE_SYNC WRITEs comply with spec NFSD: Add trace point for SCSI fencing operation. NFSD: use correct reservation type in nfsd4_scsi_fence_client xdrgen: Don't generate unnecessary semicolon xdrgen: Fix union declarations NFSD: don't start nfsd if sv_permsocks is empty xdrgen: handle _XdrString in union encoder/decoder xdrgen: Fix the variable-length opaque field decoder template xdrgen: Make the xdrgen script location-independent xdrgen: Generalize/harden pathname construction lockd: don't allow locking on reexported NFSv2/3 MAINTAINERS: add a nfsd blocklayout reviewer nfsd: Use MD5 library instead of crypto_shash nfsd: stop pretending that we cache the SEQUENCE reply. NFS: nfsd-maintainer-entry-profile: Inline function name prefixes ...
6 days | Merge tag 'landlock-6.19-rc1' of ↵ | Linus Torvalds | 2 | -9/+1467
git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux Pull landlock updates from Mickaël Salaün: "This mainly fixes handling of disconnected directories and adds new tests" * tag 'landlock-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux: selftests/landlock: Add disconnected leafs and branch test suites selftests/landlock: Add tests for access through disconnected paths landlock: Improve variable scope landlock: Fix handling of disconnected directories selftests/landlock: Fix makefile header list landlock: Make docs in cred.h and domain.h visible landlock: Minor comments improvements
6 days | Merge tag 'turbostat-v2025.12.02' of ↵ | Linus Torvalds | 3 | -619/+660
git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux Pull turbostat updates from Len Brown: - Add LLC statistics columns: LLCkRPS = Last Level Cache Thousands of References Per Second LLC%hit = Last Level Cache Hit % - Recognize Wildcat Lake and Nova Lake platforms - Add MSR check for Android - Add APERF check for VMWARE - Add RAPL check for AWS - Minor fixes to turbostat (and x86_energy_perf_policy) * tag 'turbostat-v2025.12.02' of git://git.kernel.org/pub/scm/linux/kernel/git/lenb/linux: (21 commits) tools/power turbostat: version 2025.12.02 tools/power turbostat: Print wide names only for RAW 64-bit columns tools/power turbostat: Print percentages in 8-columns tools/power turbostat: Print "nan" for out of range percentages tools/power turbostat: Validate APERF access for VMWARE tools/power turbostat: Enhance perf probe tools/power turbostat: Validate RAPL MSRs for AWS Nitro Hypervisor tools/power x86_energy_perf_policy: Fix potential NULL pointer dereference tools/power x86_energy_perf_policy: Fix format string in error message tools/power x86_energy_perf_policy: Simplify Android MSR probe tools/power x86_energy_perf_policy: Add Android MSR device support tools/power turbostat: Add run-time MSR driver probe tools/power turbostat: Set per_cpu_msr_sum to NULL after free tools/power turbostat: Add LLC stats tools/power turbostat: Remove dead code tools/power turbostat: Refactor floating point printout code tools/power turbostat.8: Update example tools/power turbostat: Refactor added-counter value printing code tools/power turbostat: Refactor added column header printing tools/power turbostat: Add Wildcat Lake and Nova Lake support ...
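The two new LLC columns named above can be read as simple derived values. The sketch below is a guess at the arithmetic implied by the column descriptions, not turbostat's code: the function names are hypothetical, the hit percentage is assumed to be derived from reference and miss counts, and returning 0.0 when there are no references is a choice made for this sketch only.

```c
#include <stdint.h>

/* LLCkRPS: last-level cache references per second, in thousands. */
static double llc_krps(uint64_t references, double interval_sec)
{
    return (double)references / interval_sec / 1000.0;
}

/*
 * LLC%hit: share of references that did not miss, as a percentage.
 * Assumes misses <= references; reports 0.0 when nothing was counted.
 */
static double llc_hit_pct(uint64_t references, uint64_t misses)
{
    if (references == 0)
        return 0.0;
    return 100.0 * (double)(references - misses) / (double)references;
}
```

Under these assumptions, 2,000,000 references over a 2-second interval correspond to 1000 LLCkRPS, and 250 misses out of 1000 references correspond to a 75% hit rate.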
6 days | Merge tag 'libnvdimm-for-6.19' of ↵ | Linus Torvalds | 1 | -1/+6
git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm
Pull nvdimm updates from Ira Weiny:
"These are mainly bug fixes and code updates. There is a new feature to divide up memmap= carve outs and a fix caught in linux-next for that patch. Managing memmap memory on the fly for multiple VMs was proving difficult and Mike provided a driver which allows for the memory to be better managed.
Summary:
- Allow exposing RAM carveouts as NVDIMM DIMM devices
- Prevent integer overflow in ramdax_get_config_data()
- Replace use of system_wq with system_percpu_wq
- Documentation: btt: Unwrap bit 31-30 nested table
- tools/testing/nvdimm: Use per-DIMM device handle"
* tag 'libnvdimm-for-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
  nvdimm: Prevent integer overflow in ramdax_get_config_data()
  Documentation: btt: Unwrap bit 31-30 nested table
  nvdimm: replace use of system_wq with system_percpu_wq
  tools/testing/nvdimm: Use per-DIMM device handle
  nvdimm: allow exposing RAM carveouts as NVDIMM DIMM devices
6 days | Merge tag 'dma-mapping-6.19-2025-12-05' of ↵ | Linus Torvalds | 6 | -14/+65
git://git.kernel.org/pub/scm/linux/kernel/git/mszyprowski/linux
Pull dma-mapping updates from Marek Szyprowski:
- More DMA mapping API refactoring to physical addresses as the primary interface instead of page+offset parameters. This time dma_map_ops callbacks are converted to physical addresses, which in turn also simplifies some architecture-specific code (Leon Romanovsky and Jason Gunthorpe)
- Clarify that dma_map_benchmark is not a kernel self-test, but a standalone tool (Qinxin Xia)
* tag 'dma-mapping-6.19-2025-12-05' of git://git.kernel.org/pub/scm/linux/kernel/git/mszyprowski/linux:
  dma-mapping: remove unused map_page callback
  xen: swiotlb: Convert mapping routine to rely on physical address
  x86: Use physical address for DMA mapping
  sparc: Use physical address DMA mapping
  powerpc: Convert to physical address DMA mapping
  parisc: Convert DMA map_page to map_phys interface
  MIPS/jazzdma: Provide physical address directly
  alpha: Convert mapping routine to rely on physical address
  dma-mapping: remove unused mapping resource callbacks
  xen: swiotlb: Switch to physical address mapping callbacks
  ARM: dma-mapping: Switch to physical address mapping callbacks
  ARM: dma-mapping: Reduce struct page exposure in arch_sync_dma*()
  dma-mapping: convert dummy ops to physical address mapping
  dma-mapping: prepare dma_map_ops to conversion to physical address
  tools/dma: move dma_map_benchmark from selftests to tools/dma
7 days | Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm | Linus Torvalds | 48 | -325/+2179
Pull KVM updates from Paolo Bonzini: "ARM: - Support for userspace handling of synchronous external aborts (SEAs), allowing the VMM to potentially handle the abort in a non-fatal manner - Large rework of the VGIC's list register handling with the goal of supporting more active/pending IRQs than available list registers in hardware. In addition, the VGIC now supports EOImode==1 style deactivations for IRQs which may occur on a separate vCPU than the one that acked the IRQ - Support for FEAT_XNX (user / privileged execute permissions) and FEAT_HAF (hardware update to the Access Flag) in the software page table walkers and shadow MMU - Allow page table destruction to reschedule, fixing long need_resched latencies observed when destroying a large VM - Minor fixes to KVM and selftests Loongarch: - Get VM PMU capability from HW GCFG register - Add AVEC basic support - Use 64-bit register definition for EIOINTC - Add KVM timer test cases for tools/selftests RISC/V: - SBI message passing (MPXY) support for KVM guest - Give a new, more specific error subcode for the case when in-kernel AIA virtualization fails to allocate IMSIC VS-file - Support KVM_DIRTY_LOG_INITIALLY_SET, enabling dirty log gradually in small chunks - Fix guest page fault within HLV* instructions - Flush VS-stage TLB after VCPU migration for Andes cores s390: - Always allocate ESCA (Extended System Control Area), instead of starting with the basic SCA and converting to ESCA with the addition of the 65th vCPU. 
The price is an increased number of exits (and worse performance) on z10 and earlier processors; ESCA was introduced by z114/z196 in 2010 - VIRT_XFER_TO_GUEST_WORK support - Operation exception forwarding support - Cleanups x86: - Skip the costly "zap all SPTEs" on an MMIO generation wrap if MMIO SPTE caching is disabled, as there can't be any relevant SPTEs to zap - Relocate a misplaced export - Fix an async #PF bug where KVM would clear the completion queue when the guest transitioned in and out of paging mode, e.g. when handling an SMI and then returning to paged mode via RSM - Leave KVM's user-return notifier registered even when disabling virtualization, as long as kvm.ko is loaded. On reboot/shutdown, keeping the notifier registered is ok; the kernel does not use the MSRs and the callback will run cleanly and restore host MSRs if the CPU manages to return to userspace before the system goes down - Use the checked version of {get,put}_user() - Fix a long-lurking bug where KVM's lack of catch-up logic for periodic APIC timers can result in a hard lockup in the host - Revert the periodic kvmclock sync logic now that KVM doesn't use a clocksource that's subject to NTP corrections - Clean up KVM's handling of MMIO Stale Data and L1TF, and bury the latter behind CONFIG_CPU_MITIGATIONS - Context switch XCR0, XSS, and PKRU outside of the entry/exit fast path; the only reason they were handled in the fast path was to paper over a bug in the core #MC code, and that has long since been fixed - Add emulator support for AVX MOV instructions, to play nice with emulated devices whose guest drivers like to access PCI BARs with large multi-byte instructions x86 (AMD): - Fix a few missing "VMCB dirty" bugs - Fix the worst of KVM's lack of EFER.LMSLE emulation - Add AVIC support for addressing 4k vCPUs in x2AVIC mode - Fix incorrect handling of selective CR0 writes when checking intercepts during emulation of L2 instructions - Fix a currently-benign bug where KVM would clobber 
SPEC_CTRL[63:32] on VMRUN and #VMEXIT - Fix a bug where KVM corrupts the guest code stream when re-injecting a soft interrupt if the guest patched the underlying code after the VM-Exit, e.g. when Linux patches code with a temporary INT3 - Add KVM_X86_SNP_POLICY_BITS to advertise supported SNP policy bits to userspace, and extend KVM "support" to all policy bits that don't require any actual support from KVM x86 (Intel): - Use the root role from kvm_mmu_page to construct EPTPs instead of the current vCPU state, partly as worthwhile cleanup, but mostly to pave the way for tracking per-root TLB flushes, and elide EPT flushes on pCPU migration if the root is clean from a previous flush - Add a few missing nested consistency checks - Rip out support for doing "early" consistency checks via hardware as the functionality hasn't been used in years and is no longer useful in general; replace it with an off-by-default module param to WARN if hardware fails a check that KVM does not perform - Fix a currently-benign bug where KVM would drop the guest's SPEC_CTRL[63:32] on VM-Enter - Misc cleanups - Overhaul the TDX code to address systemic races where KVM (acting on behalf of userspace) could inadvertently trigger lock contention in the TDX-Module; KVM was either working around these in weird, ugly ways, or was simply oblivious to them (though even Yan's devilish selftests could only break individual VMs, not the host kernel) - Fix a bug where KVM could corrupt a vCPU's cpu_list when freeing a TDX vCPU, if creating said vCPU failed partway through - Fix a few sparse warnings (bad annotation, 0 != NULL) - Use struct_size() to simplify copying TDX capabilities to userspace - Fix a bug where TDX would effectively corrupt user-return MSR values if the TDX Module rejects VP.ENTER and thus doesn't clobber host MSRs as expected Selftests: - Fix a math goof in mmu_stress_test when running on a single-CPU system/VM - Forcefully override ARCH from x86_64 to x86 to play nice with 
specifying ARCH=x86_64 on the command line - Extend a bunch of nested VMX to validate nested SVM as well - Add support for LA57 in the core VM_MODE_xxx macro, and add a test to verify KVM can save/restore nested VMX state when L1 is using 5-level paging, but L2 is not - Clean up the guest paging code in anticipation of sharing the core logic for nested EPT and nested NPT guest_memfd: - Add NUMA mempolicy support for guest_memfd, and clean up a variety of rough edges in guest_memfd along the way - Define a CLASS to automatically handle get+put when grabbing a guest_memfd from a memslot to make it harder to leak references - Enhance KVM selftests to make it easer to develop and debug selftests like those added for guest_memfd NUMA support, e.g. where test and/or KVM bugs often result in hard-to-debug SIGBUS errors - Misc cleanups Generic: - Use the recently-added WQ_PERCPU when creating the per-CPU workqueue for irqfd cleanup - Fix a goof in the dirty ring documentation - Fix choice of target for directed yield across different calls to kvm_vcpu_on_spin(); the function was always starting from the first vCPU instead of continuing the round-robin search" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (260 commits) KVM: arm64: at: Update AF on software walk only if VM has FEAT_HAFDBS KVM: arm64: at: Use correct HA bit in TCR_EL2 when regime is EL2 KVM: arm64: Document KVM_PGTABLE_PROT_{UX,PX} KVM: arm64: Fix spelling mistake "Unexpeced" -> "Unexpected" KVM: arm64: Add break to default case in kvm_pgtable_stage2_pte_prot() KVM: arm64: Add endian casting to kvm_swap_s[12]_desc() KVM: arm64: Fix compilation when CONFIG_ARM64_USE_LSE_ATOMICS=n KVM: arm64: selftests: Add test for AT emulation KVM: arm64: nv: Expose hardware access flag management to NV guests KVM: arm64: nv: Implement HW access flag management in stage-2 SW PTW KVM: arm64: Implement HW access flag management in stage-1 SW PTW KVM: arm64: Propagate PTW errors up to AT emulation KVM: arm64: 
Add helper for swapping guest descriptor KVM: arm64: nv: Use pgtable definitions in stage-2 walk KVM: arm64: Handle endianness in read helper for emulated PTW KVM: arm64: nv: Stop passing vCPU through void ptr in S2 PTW KVM: arm64: Call helper for reading descriptors directly KVM: arm64: nv: Advertise support for FEAT_XNX KVM: arm64: Teach ptdump about FEAT_XNX permissions KVM: s390: Use generic VIRT_XFER_TO_GUEST_WORK functions ...
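The directed-yield fix in the Generic section of this pull comes down to remembering where the previous search stopped. A minimal sketch in plain C of that idea, assuming a hypothetical `struct demo_vm` with a `last_boosted` index and a caller-supplied eligibility predicate; none of these names are KVM's, and the real kvm_vcpu_on_spin() logic is more involved:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for per-VM state; not KVM's real structure. */
struct demo_vm {
	size_t nr_vcpus;      /* number of vCPUs in the VM */
	size_t last_boosted;  /* index chosen by the previous call */
};

/* Example predicate for illustration: every vCPU is eligible. */
static int demo_always_eligible(size_t idx)
{
	(void)idx;
	return 1;
}

/*
 * Pick the next eligible vCPU, resuming just after the one chosen last
 * time instead of always restarting at index 0. Returns the chosen
 * index, or -1 if no vCPU other than 'self' is eligible.
 */
static long demo_pick_yield_target(struct demo_vm *vm, size_t self,
				   int (*eligible)(size_t idx))
{
	assert(vm->nr_vcpus > 0 && vm->last_boosted < vm->nr_vcpus);

	for (size_t step = 1; step <= vm->nr_vcpus; step++) {
		size_t idx = (vm->last_boosted + step) % vm->nr_vcpus;

		if (idx == self || !eligible(idx))
			continue;
		vm->last_boosted = idx;  /* remember for the next call */
		return (long)idx;
	}
	return -1;
}
```

Because `last_boosted` persists across calls, repeated calls from the same spinning vCPU walk through the other vCPUs in turn rather than repeatedly favouring the lowest-numbered one.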
7 days  Merge tag 'riscv-for-linus-6.19-mw1' of ↵  Linus Torvalds  3  -30/+274
git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V updates from Paul Walmsley:
 - Enable parallel hotplug for RISC-V
 - Optimize vector regset allocation for ptrace()
 - Add a kernel selftest for the vector ptrace interface
 - Enable the userspace RAID6 test to build and run using RISC-V vectors
 - Add initial support for the Zalasr RISC-V ratified ISA extension
 - For the Zicbop RISC-V ratified ISA extension, expose hardware and kernel support to userspace and add a kselftest for Zicbop
 - Convert open-coded instances of 'asm goto's that are controlled by runtime ALTERNATIVEs to use riscv_has_extension_{un,}likely(), following arm64's alternative_has_cap_{un,}likely()
 - Remove an unnecessary mask in the GFP flags used in some calls to pagetable_alloc()

* tag 'riscv-for-linus-6.19-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  selftests/riscv: Add Zicbop prefetch test
  riscv: hwprobe: Expose Zicbop extension and its block size
  riscv: Introduce Zalasr instructions
  riscv: hwprobe: Export Zalasr extension
  dt-bindings: riscv: Add Zalasr ISA extension description
  riscv: Add ISA extension parsing for Zalasr
  selftests: riscv: Add test for the Vector ptrace interface
  riscv: ptrace: Optimize the allocation of vector regset
  raid6: test: Add support for RISC-V
  raid6: riscv: Allow code to be compiled in userspace
  raid6: riscv: Prevent compiler from breaking inline vector assembly code
  riscv: cmpxchg: Use riscv_has_extension_likely
  riscv: bitops: Use riscv_has_extension_likely
  riscv: hweight: Use riscv_has_extension_likely
  riscv: checksum: Use riscv_has_extension_likely
  riscv: pgtable: Use riscv_has_extension_unlikely
  riscv: Remove __GFP_HIGHMEM masking
  RISC-V: Enable HOTPLUG_PARALLEL for secondary CPUs
7 days  Merge tag 'mm-stable-2025-12-03-21-26' of ↵  Linus Torvalds  17  -225/+1952
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:

"__vmalloc()/kvmalloc() and no-block support" (Uladzislau Rezki)
   Rework the vmalloc() code to support non-blocking allocations (GFP_ATOMIC, GFP_NOWAIT)
"ksm: fix exec/fork inheritance" (xu xin)
   Fix a rare case where the KSM MMF_VM_MERGE_ANY prctl state is not inherited across fork/exec
"mm/zswap: misc cleanup of code and documentations" (SeongJae Park)
   Some light maintenance work on the zswap code
"mm/page_owner: add debugfs files 'show_handles' and 'show_stacks_handles'" (Mauricio Faria de Oliveira)
   Enhance the /sys/kernel/debug/page_owner debug feature by adding unique identifiers to differentiate the various stack traces so that userspace monitoring tools can better match stack traces over time
"mm/page_alloc: pcp->batch cleanups" (Joshua Hahn)
   Minor alterations to the page allocator's per-cpu-pages feature
"Improve UFFDIO_MOVE scalability by removing anon_vma lock" (Lokesh Gidra)
   Address a scalability issue in userfaultfd's UFFDIO_MOVE operation
"kasan: cleanups for kasan_enabled() checks" (Sabyrzhan Tasbolatov)
"drivers/base/node: fold node register and unregister functions" (Donet Tom)
   Clean up the NUMA node handling code a little
"mm: some optimizations for prot numa" (Kefeng Wang)
   Cleanups and small optimizations to the NUMA allocation hinting code
"mm/page_alloc: Batch callers of free_pcppages_bulk" (Joshua Hahn)
   Address long lock hold times at boot on large machines. These were causing (harmless) softlockup warnings
"optimize the logic for handling dirty file folios during reclaim" (Baolin Wang)
   Remove some now-unnecessary work from page reclaim
"mm/damon: allow DAMOS auto-tuned for per-memcg per-node memory usage" (SeongJae Park)
   Enhance the DAMOS auto-tuning feature
"mm/damon: fixes for address alignment issues in DAMON_LRU_SORT and DAMON_RECLAIM" (Quanmin Yan)
   Fix DAMON_LRU_SORT and DAMON_RECLAIM with certain userspace configuration
"expand mmap_prepare functionality, port more users" (Lorenzo Stoakes)
   Enhance the new(ish) file_operations.mmap_prepare() method and port additional callsites from the old ->mmap() over to ->mmap_prepare()
"Fix stale IOTLB entries for kernel address space" (Lu Baolu)
   Fix a bug (and possible security issue on non-x86) in the IOMMU code. In some situations the IOMMU could be left hanging onto a stale kernel pagetable entry
"mm/huge_memory: cleanup __split_unmapped_folio()" (Wei Yang)
   Clean up and optimize the folio splitting code
"mm, swap: misc cleanup and bugfix" (Kairui Song)
   Some cleanups and a minor fix in the swap discard code
"mm/damon: misc documentation fixups" (SeongJae Park)
"mm/damon: support pin-point targets removal" (SeongJae Park)
   Permit userspace to remove a specific monitoring target in the middle of the current targets list
"mm: MISC follow-up patches for linux/pgalloc.h" (Harry Yoo)
   A couple of cleanups related to mm header file inclusion
"mm/swapfile.c: select swap devices of default priority round robin" (Baoquan He)
   Improve the selection of swap devices for NUMA machines
"mm: Convert memory block states (MEM_*) macros to enums" (Israel Batista)
   Change the memory block labels from macros to enums so they will appear in kernel debug info
"ksm: perform a range-walk to jump over holes in break_ksm" (Pedro Demarchi Gomes)
   Address an inefficiency when KSM unmerges an address range
"mm/damon/tests: fix memory bugs in kunit tests" (SeongJae Park)
   Fix leaks and unhandled malloc() failures in DAMON userspace unit tests
"some cleanups for pageout()" (Baolin Wang)
   Clean up a couple of minor things in the page scanner's writeback-for-eviction code
"mm/hugetlb: refactor sysfs/sysctl interfaces" (Hui Zhu)
   Move hugetlb's sysfs/sysctl handling code into a new file
"introduce VM_MAYBE_GUARD and make it sticky" (Lorenzo Stoakes)
   Make the VMA guard regions available in /proc/pid/smaps and improve the mergeability of guarded VMAs
"mm: perform guard region install/remove under VMA lock" (Lorenzo Stoakes)
   Reduce mmap lock contention for callers performing VMA guard region operations
"vma_start_write_killable" (Matthew Wilcox)
   Start work on permitting applications to be killed when they are waiting on a read_lock on the VMA lock
"mm/damon/tests: add more tests for online parameters commit" (SeongJae Park)
   Add additional userspace testing of DAMON's "commit" feature
"mm/damon: misc cleanups" (SeongJae Park)
"make VM_SOFTDIRTY a sticky VMA flag" (Lorenzo Stoakes)
   Address the possible loss of a VMA's VM_SOFTDIRTY flag when that VMA is merged with another
"mm: support device-private THP" (Balbir Singh)
   Introduce support for Transparent Huge Page (THP) migration in zone device-private memory
"Optimize folio split in memory failure" (Zi Yan)
"mm/huge_memory: Define split_type and consolidate split support checks" (Wei Yang)
   Some more cleanups in the folio splitting code
"mm: remove is_swap_[pte, pmd]() + non-swap entries, introduce leaf entries" (Lorenzo Stoakes)
   Clean up our handling of pagetable leaf entries by introducing the concept of 'software leaf entries', of type softleaf_t
"reparent the THP split queue" (Muchun Song)
   Reparent the THP split queue to its parent memcg. This is in preparation for addressing the long-standing "dying memcg" problem, wherein dead memcg's linger for too long, consuming memory resources
"unify PMD scan results and remove redundant cleanup" (Wei Yang)
   A little cleanup in the hugepage collapse code
"zram: introduce writeback bio batching" (Sergey Senozhatsky)
   Improve zram writeback efficiency by introducing batched bio writeback support
"memcg: cleanup the memcg stats interfaces" (Shakeel Butt)
   Clean up our handling of the interrupt safety of some memcg stats
"make vmalloc gfp flags usage more apparent" (Vishal Moola)
   Clean up vmalloc's handling of incoming GFP flags
"mm: Add soft-dirty and uffd-wp support for RISC-V" (Chunyan Zhang)
   Teach soft dirty and userfaultfd write protect tracking to use RISC-V's Svrsw60t59b extension
"mm: swap: small fixes and comment cleanups" (Youngjun Park)
   Fix a small bug and clean up some of the swap code
"initial work on making VMA flags a bitmap" (Lorenzo Stoakes)
   Start work on converting the vma struct's flags to a bitmap, so we stop running out of them, especially on 32-bit
"mm/swapfile: fix and cleanup swap list iterations" (Youngjun Park)
   Address a possible bug in the swap discard code and clean things up a little

[ This merge also reverts commit ebb9aeb980e5 ("vfio/nvgrace-gpu: register device memory for poison handling") because it looks broken to me, I've asked for clarification - Linus ]

* tag 'mm-stable-2025-12-03-21-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (321 commits)
  mm: fix vma_start_write_killable() signal handling
  mm/swapfile: use plist_for_each_entry in __folio_throttle_swaprate
  mm/swapfile: fix list iteration when next node is removed during discard
  fs/proc/task_mmu.c: fix make_uffd_wp_huge_pte() huge pte handling
  mm/kfence: add reboot notifier to disable KFENCE on shutdown
  memcg: remove inc/dec_lruvec_kmem_state helpers
  selftests/mm/uffd: initialize char variable to Null
  mm: fix DEBUG_RODATA_TEST indentation in Kconfig
  mm: introduce VMA flags bitmap type
  tools/testing/vma: eliminate dependency on vma->__vm_flags
  mm: simplify and rename mm flags function for clarity
  mm: declare VMA flags by bit
  zram: fix a spelling mistake
  mm/page_alloc: optimize lowmem_reserve max lookup using its semantic monotonicity
  mm/vmscan: skip increasing kswapd_failures when reclaim was boosted
  pagemap: update BUDDY flag documentation
  mm: swap: remove scan_swap_map_slots() references from comments
  mm: swap: change swap_alloc_slow() to void
  mm, swap: remove redundant comment for read_swap_cache_async
  mm, swap: use SWP_SOLIDSTATE to determine if swap is rotational
  ...
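The "initial work on making VMA flags a bitmap" item in this pull is motivated by running out of flag bits in a single word, especially on 32-bit. A minimal sketch of the general technique in plain C, assuming a hypothetical `struct demo_flags` with room for 64 flags; the names, sizes, and helpers here are illustrative and are not the kernel's types or API:

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical sizes and names for illustration; not the kernel's. */
#define DEMO_NR_FLAGS      64
#define DEMO_BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define DEMO_NR_WORDS \
	((DEMO_NR_FLAGS + DEMO_BITS_PER_WORD - 1) / DEMO_BITS_PER_WORD)

/* Flags live in an array of words, so the count is not tied to word size. */
struct demo_flags {
	unsigned long bits[DEMO_NR_WORDS];
};

/* Set flag number 'bit'. */
static void demo_flag_set(struct demo_flags *f, unsigned int bit)
{
	assert(bit < DEMO_NR_FLAGS);
	f->bits[bit / DEMO_BITS_PER_WORD] |= 1UL << (bit % DEMO_BITS_PER_WORD);
}

/* Return true if flag number 'bit' is set. */
static bool demo_flag_test(const struct demo_flags *f, unsigned int bit)
{
	assert(bit < DEMO_NR_FLAGS);
	return (f->bits[bit / DEMO_BITS_PER_WORD] >>
		(bit % DEMO_BITS_PER_WORD)) & 1UL;
}
```

With a 32-bit `unsigned long` the array has two words and flag 40 lands in the second one; with a 64-bit `unsigned long` a single word is enough. Callers address flags by bit number either way.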
7 days  Merge tag 'ktest-v6.19' of ↵  Linus Torvalds  1  -2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-ktest

Pull ktest fix from Steven Rostedt:
 - Fix incorrect variable in error message in config-bisect.pl
   If the old config file fails to get copied as the last good or bad config file, then it fails the program and prints an error message. But the variable used to print the old config's name was incorrect. It was $config when it should have been $output_config.

* tag 'ktest-v6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-ktest:
  ktest.pl: Fix uninitialized var in config-bisect.pl
7 days  libperf: Use 'extern' in LIBPERF_API visibility macro  Arnaldo Carvalho de Melo  1  -1/+1
Use 'extern' on LIBPERF_API to address this issue that started appearing with gcc 15, first seen in ubuntu 25.10:

  evlist.c: In function 'perf_evlist__purge':
  evlist.c:202:17: error: implicit declaration of function 'perf_evsel__delete'; did you mean 'perf_evsel__exit'? [-Wimplicit-function-declaration]
    202 |                 perf_evsel__delete(pos);
        |                 ^~~~~~~~~~~~~~~~~~
        |                 perf_evsel__exit
  evlist.c:202:17: error: nested extern declaration of 'perf_evsel__delete' [-Werror=nested-externs]
  evlist.c: In function 'perf_evlist__open':
  evlist.c:261:23: error: implicit declaration of function 'perf_evsel__open'; did you mean 'perf_evsel__exit'? [-Wimplicit-function-declaration]
    261 |                 err = perf_evsel__open(evsel, evsel->cpus, evsel->threads);
        |                       ^~~~~~~~~~~~~~~~
        |                       perf_evsel__exit
  evlist.c:261:23: error: nested extern declaration of 'perf_evsel__open' [-Werror=nested-externs]

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
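The commit message does not show the macro itself. The following is only a sketch of what a visibility macro that carries 'extern' can look like, using a hypothetical `DEMO_API` name and a hypothetical `demo_add()` function; the actual LIBPERF_API definition and the exact reason gcc 15 needs the change are not stated above:

```c
/*
 * Hypothetical macro modelled on the description above; this is not
 * the real LIBPERF_API definition.
 */
#if defined(__GNUC__)
#define DEMO_API extern __attribute__((visibility("default")))
#else
#define DEMO_API extern
#endif

/* Declaration as it would appear in a library's public header. */
DEMO_API int demo_add(int a, int b);

/* Definition as it would appear in the library's .c file. */
int demo_add(int a, int b)
{
	return a + b;
}
```

Every declaration tagged with the macro then reads as an explicit `extern` declaration with default symbol visibility, which keeps it exported even when the library is built with hidden visibility by default.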
7 days  Merge tag 'trace-rv-6.19' of ↵  Linus Torvalds  13  -14/+278
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull runtime verifier updates from Steven Rostedt:
 - Adapt the ftracetest script to be run from a different folder
   This uses the already existing OPT_TEST_DIR but extends it further to run independent tests, then add an --rv flag to allow using the script for testing RV (mostly) independently on ftrace.
 - Add basic RV selftests in selftests/verification for more validations
   Add more validations for available/enabled monitors and reactors. This could have caught the bug introducing kernel panic solved above. Tests use ftracetest.
 - Convert react() function in reactor to use va_list directly
   Use a central helper to handle the variadic arguments. Clean up macros and mark functions as static.
 - Add lockdep annotations to reactors to have lockdep complain of errors
   If the reactors are called from improper context. Useful to develop new reactors. This highlights a warning in the panic reactor that is related to the printk subsystem and not to RV.
 - Convert core RV code to use lock guards and __free helpers
   This completely removes goto statements.
 - Fix compilation if !CONFIG_RV_REACTORS
   Fix the warning by keeping LTL monitor variable as always static.

* tag 'trace-rv-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  rv: Fix compilation if !CONFIG_RV_REACTORS
  rv: Convert to use __free
  rv: Convert to use lock guard
  rv: Add explicit lockdep context for reactors
  rv: Make rv_reacting_on() static
  rv: Pass va_list to reactors
  selftests/verification: Add initial RV tests
  selftest/ftrace: Generalise ftracetest to use with RV
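The "central helper" pattern mentioned for react() can be sketched in userspace C as follows. The callback type, `demo_react()`, and `demo_buffer_reactor()` are hypothetical names chosen for illustration; they are not the RV subsystem's actual interfaces:

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical reactor callback: receives an already-started va_list. */
typedef int (*demo_react_fn)(char *buf, size_t len,
			     const char *fmt, va_list args);

/* Example reactor: formats the message into a caller-provided buffer. */
static int demo_buffer_reactor(char *buf, size_t len,
			       const char *fmt, va_list args)
{
	return vsnprintf(buf, len, fmt, args);
}

/*
 * Central helper: the only place that uses va_start()/va_end().
 * Each reactor is handed the va_list directly and never sees "...".
 */
static int demo_react(demo_react_fn react, char *buf, size_t len,
		      const char *fmt, ...)
{
	va_list args;
	int ret;

	va_start(args, fmt);
	ret = react(buf, len, fmt, args);
	va_end(args);
	return ret;
}
```

A call such as `demo_react(demo_buffer_reactor, buf, sizeof(buf), "monitor %s: event %d", "wip", 3)` leaves the formatted text in `buf`. Keeping the variadic handling in one function means individual reactors need no variadic macros of their own.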
7 days  Merge tag 'trace-tools-v6.19' of ↵  Linus Torvalds  14  -372/+246
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull rtla trace tooling updates from Steven Rostedt:
 - Officially add Tomas Glozar as a maintainer to RTLA tool
 - Add for_each_monitored_cpu() helper
   In multiple places, RTLA tools iterate over the list of CPUs running tracer threads. Use a single helper instead of repeating the for/if combination.
 - Remove unused variable option_index in argument parsing
   RTLA tools use getopt_long() for argument parsing. For its last argument, an unused variable "option_index" is passed. Remove the variable and pass NULL to getopt_long() to shorten the naturally long parsing functions, and make them more readable.
 - Fix unassigned nr_cpus after code consolidation
   In recent code consolidation, timerlat tool cleanup, previously implemented separately for each tool, was moved to a common function timerlat_free(). The cleanup relies on nr_cpus being set. This was not done in the new function, leaving the variable uninitialized. Initialize the variable properly, and remove silencing of compiler warning for uninitialized variables.
 - Stop tracing on user latency in BPF mode
   Despite the name, rtla-timerlat's -T/--thread option sets timerlat's stop_tracing_total_us option, which also stops tracing on return-from-user latency, not only on thread latency. Implement the same behavior also in BPF sample collection stop tracing handler to avoid a discrepancy and restore correspondence of behavior with the equivalent option of cyclictest.
 - Fix threshold actions always triggering
   A bug in threshold action logic caused the action to execute even if tracing did not stop because of threshold. Fix the logic to stop correctly.
 - Fix a few minor issues in tests
   Extend tests that were shown to need it to 5s, fix osnoise test calling timerlat by mistake, and use new, more reliable output checking in timerlat's "top stop at failed action" test.
 - Do not print usage on argument parsing error
   RTLA prints the entire usage message on encountering errors in argument parsing, like a malformed CPU list. The usage message has gotten too long. Instead of printing it, use the newly added fatal() helper function to simply exit with the error message, excluding the usage.
 - Fix unintuitive -C/--cgroup interface
   "-C cgroup" and "--cgroup cgroup" are invalid syntax, despite that being a common way to specify an option with argument. Moreover, using them fails silently and no cgroup is set. Create a new helper function to unify the handling of all such options and allow all of:
     -Xsomething
     -X=something
     -X something
   as well as the equivalent for the long option.
 - Fix -a overriding -t argument filename
   Fix a bug where -a following -t custom_file.txt overrides the custom filename with the default timerlat_trace.txt.
 - Stop tracing correctly on multiple events at once
   In some race scenarios, RTLA BPF sample collection might send multiple stop tracing events via the BPF ringbuffer at once. Compare the number of events for != 0 instead of == 1 to cover for this scenario and stop tracing properly.

* tag 'trace-tools-v6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
  rtla/timerlat: Exit top main loop on any non-zero wait_retval
  rtla/tests: Don't rely on matching ^1ALL
  rtla: Fix -a overriding -t argument
  rtla: Fix -C/--cgroup interface
  tools/rtla: Replace osnoise_hist_usage("...") with fatal("...")
  tools/rtla: Replace osnoise_top_usage("...") with fatal("...")
  tools/rtla: Replace timerlat_hist_usage("...") with fatal("...")
  tools/rtla: Replace timerlat_top_usage("...") with fatal("...")
  tools/rtla: Add fatal() and replace error handling pattern
  rtla/tests: Fix osnoise test calling timerlat
  rtla/tests: Extend action tests to 5s
  tools/rtla: Fix --on-threshold always triggering
  rtla/timerlat_bpf: Stop tracing on user latency
  tools/rtla: Fix unassigned nr_cpus
  tools/rtla: Remove unused optional option_index
  tools/rtla: Add for_each_monitored_cpu() helper
  MAINTAINERS: Add Tomas Glozar as a maintainer to RTLA tool
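The -C/--cgroup item in this pull describes accepting "-Xsomething", "-X=something" and "-X something" through one helper. A minimal sketch of such a helper, assuming a hypothetical `demo_option_arg()` that is given whatever text was attached to the option plus the remaining command line; this is not RTLA's actual function or signature:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not RTLA's real code): return the argument of an
 * option written as "-Xvalue", "-X=value" or "-X value".
 *
 * 'attached' is the text that followed the option letter in the same
 * argv entry (NULL or "" when nothing was attached). 'argc'/'argv'
 * describe the command line and '*next' is the index of the next unread
 * word; it is advanced when a separate word is consumed.
 * Returns NULL when no argument is present.
 */
static const char *demo_option_arg(const char *attached, int argc,
				   char *const argv[], int *next)
{
	assert(next != NULL);

	if (attached && attached[0] != '\0') {
		/* "-X=value": skip the '='; "-Xvalue": use as is. */
		return attached[0] == '=' ? attached + 1 : attached;
	}

	/* "-X value": take the next word unless it looks like an option. */
	if (*next < argc && argv[*next][0] != '-')
		return argv[(*next)++];

	return NULL;
}
```

Routing every option that takes an optional argument through one function like this keeps the three spellings consistent across options, and a missing argument shows up as an explicit NULL rather than a silent failure.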
7 days  Merge tag 'tpmdd-next-6.19-rc1-v4' of ↵  Linus Torvalds  1  -2/+2
git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd

Pull tpm updates from Jarkko Sakkinen:
 "This contains changes to unify TPM return code translation between trusted_tpm2 and the TPM driver itself. Other than that the changes are either bug fixes or minor improvements"

* tag 'tpmdd-next-6.19-rc1-v4' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd:
  KEYS: trusted: Use tpm_ret_to_err() in trusted_tpm2
  tpm: Use -EPERM as fallback error code in tpm_ret_to_err
  tpm: Cap the number of PCR banks
  tpm: Remove tpm_find_get_ops
  tpm: add WQ_PERCPU to alloc_workqueue users
  tpm_crb: add missing loc parameter to kerneldoc
  tpm_crb: Fix a spelling mistake
  selftests: tpm2: Fix ill defined assertions
7 days  Merge tag 'for-linus-iommufd' of ↵  Linus Torvalds  2  -0/+87
git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd

Pull iommufd updates from Jason Gunthorpe:
 "This is a pretty consequential cycle for iommufd, though this pull is not too big. It is based on a shared branch with VFIO that introduces VFIO_DEVICE_FEATURE_DMA_BUF, a DMABUF exporter for VFIO device's MMIO PCI BARs. This was a large multiple series journey over the last year and a half.

  Based on that work IOMMUFD gains support for VFIO DMABUF's in its existing IOMMU_IOAS_MAP_FILE, which closes the last major gap to support PCI peer to peer transfers within VMs.

  In Joerg's iommu tree we have the "generic page table" work which aims to consolidate all the duplicated page table code in every iommu driver into a single algorithm. This will be used by iommufd to implement unique page table operations to start adding new features and improve performance.

  In here:
  - Expand IOMMU_IOAS_MAP_FILE to accept a DMABUF exported from VFIO. This is the first step to broader DMABUF support in iommufd, right now it only works with VFIO. This closes the last functional gap with classic VFIO type 1 to safely support PCI peer to peer DMA by mapping the VFIO device's MMIO into the IOMMU.
  - Relax SMMUv3 restrictions on nesting domains to better support qemu's sequence to have an identity mapping before the vSID is established"

* tag 'for-linus-iommufd' of git://git.kernel.org/pub/scm/linux/kernel/git/jgg/iommufd:
  iommu/arm-smmu-v3-iommufd: Allow attaching nested domain for GBPA cases
  iommufd/selftest: Add some tests for the dmabuf flow
  iommufd: Accept a DMABUF through IOMMU_IOAS_MAP_FILE
  iommufd: Have iopt_map_file_pages convert the fd to a file
  iommufd: Have pfn_reader process DMABUF iopt_pages
  iommufd: Allow MMIO pages in a batch
  iommufd: Allow a DMABUF to be revoked
  iommufd: Do not map/unmap revoked DMABUFs
  iommufd: Add DMABUF to iopt_pages
  vfio/pci: Add vfio_pci_dma_buf_iommufd_map()
7 days  Merge tag 'vfio-v6.19-rc1' of https://github.com/awilliam/linux-vfio  Linus Torvalds  26  -1071/+1492
Pull VFIO updates from Alex Williamson:
 - Move libvfio selftest artifacts in preparation of more tightly coupled integration with KVM selftests (David Matlack)
 - Fix comment typo in mtty driver (Chu Guangqing)
 - Support for new hardware revision in the hisi_acc vfio-pci variant driver where the migration registers can now be accessed via the PF. When enabled for this support, the full BAR can be exposed to the user (Longfang Liu)
 - Fix vfio cdev support for VF token passing, using the correct size for the kernel structure, thereby actually allowing userspace to provide a non-zero UUID token. Also set the match token callback for the hisi_acc, fixing VF token support for this vfio-pci variant driver (Raghavendra Rao Ananta)
 - Introduce internal callbacks on vfio devices to simplify and consolidate duplicate code for generating VFIO_DEVICE_GET_REGION_INFO data, removing various ioctl intercepts with a more structured solution (Jason Gunthorpe)
 - Introduce dma-buf support for vfio-pci devices, allowing MMIO regions to be exposed through dma-buf objects with lifecycle managed through move operations. This enables low-level interactions such as vfio-pci based SPDK drivers interacting directly with dma-buf capable RDMA devices to enable peer-to-peer operations. IOMMUFD is also now able to build upon this support to fill a long standing feature gap versus the legacy vfio type1 IOMMU backend with an implementation of P2P support for VM use cases that better manages the lifecycle of the P2P mapping (Leon Romanovsky, Jason Gunthorpe, Vivek Kasireddy)
 - Convert eventfd triggering for error and request signals to use RCU mechanisms in order to avoid a 3-way lockdep reported deadlock issue (Alex Williamson)
 - Fix a 32-bit overflow introduced via dma-buf support manifesting with large DMA buffers (Alex Mastro)
 - Convert nvgrace-gpu vfio-pci variant driver to insert mappings on fault rather than at mmap time. This conversion serves to make use of huge PFNMAPs, to avoid corrected RAS events during reset by now being subject to vfio-pci-core's use of unmap_mapping_range(), and to enable a device readiness test after reset (Ankit Agrawal)
 - Refactoring of vfio selftests to support multi-device tests and split code to provide better separation between IOMMU and device objects. This work also enables a new test suite addition to measure parallel device initialization latency (David Matlack)

* tag 'vfio-v6.19-rc1' of https://github.com/awilliam/linux-vfio: (65 commits)
  vfio: selftests: Add vfio_pci_device_init_perf_test
  vfio: selftests: Eliminate INVALID_IOVA
  vfio: selftests: Split libvfio.h into separate header files
  vfio: selftests: Move vfio_selftests_*() helpers into libvfio.c
  vfio: selftests: Rename vfio_util.h to libvfio.h
  vfio: selftests: Stop passing device for IOMMU operations
  vfio: selftests: Move IOVA allocator into iova_allocator.c
  vfio: selftests: Move IOMMU library code into iommu.c
  vfio: selftests: Rename struct vfio_dma_region to dma_region
  vfio: selftests: Upgrade driver logging to dev_err()
  vfio: selftests: Prefix logs with device BDF where relevant
  vfio: selftests: Eliminate overly chatty logging
  vfio: selftests: Support multiple devices in the same container/iommufd
  vfio: selftests: Introduce struct iommu
  vfio: selftests: Rename struct vfio_iommu_mode to iommu_mode
  vfio: selftests: Allow passing multiple BDFs on the command line
  vfio: selftests: Split run.sh into separate scripts
  vfio: selftests: Move run.sh into scripts directory
  vfio/nvgrace-gpu: wait for the GPU mem to be ready
  vfio/nvgrace-gpu: Inform devmem unmapped after reset
  ...
7 days  Merge tag 'iommu-updates-v6.19' of ↵  Linus Torvalds  2  -22/+50
git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux

Pull iommu updates from Joerg Roedel:
 - Introduction of the generic IO page-table framework with support for Intel and AMD IOMMU formats from Jason.
   This has good potential for unifying more IO page-table implementations and making future enhancements easier. But this also needed quite some fixes during development. All known issues have been fixed, but my feeling is that there is a higher potential than usual that more might be needed.
 - Intel VT-d updates:
    - Use right invalidation hint in qi_desc_iotlb()
    - Reduce the scope of INTEL_IOMMU_FLOPPY_WA
 - ARM-SMMU updates:
    - Qualcomm device-tree binding updates for Kaanapali and Glymur SoCs and a new clock for the TBU.
    - Fix error handling if level 1 CD table allocation fails.
    - Permit more than the architectural maximum number of SMRs for funky Qualcomm mis-implementations of SMMUv2.
 - Mediatek driver:
    - MT8189 iommu support
 - Move ARM IO-pgtable selftests to kunit
 - Device leak fixes for a couple of drivers
 - Random smaller fixes and improvements

* tag 'iommu-updates-v6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/iommu/linux: (81 commits)
  iommupt/vtd: Support mgaw's less than a 4 level walk for first stage
  iommupt/vtd: Allow VT-d to have a larger table top than the vasz requires
  powerpc/pseries/svm: Make mem_encrypt.h self contained
  genpt: Make GENERIC_PT invisible
  iommupt: Avoid a compiler bug with sw_bit
  iommu/arm-smmu-qcom: Enable use of all SMR groups when running bare-metal
  iommupt: Fix unlikely flows in increase_top()
  iommu/amd: Propagate the error code returned by __modify_irte_ga() in modify_irte_ga()
  MAINTAINERS: Update my email address
  iommu/arm-smmu-v3: Fix error check in arm_smmu_alloc_cd_tables
  dt-bindings: iommu: qcom_iommu: Allow 'tbu' clock
  iommu/vt-d: Restore previous domain::aperture_end calculation
  iommu/vt-d: Fix unused invalidation hint in qi_desc_iotlb
  iommu/vt-d: Set INTEL_IOMMU_FLOPPY_WA depend on BLK_DEV_FD
  iommu/tegra: fix device leak on probe_device()
  iommu/sun50i: fix device leak on of_xlate()
  iommu/omap: simplify probe_device() error handling
  iommu/omap: fix device leaks on probe_device()
  iommu/mediatek-v1: add missing larb count sanity check
  iommu/mediatek-v1: fix device leaks on probe()
  ...
7 days  Merge tag 'cxl-for-6.19' of ↵  Linus Torvalds  7  -77/+525
git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl

Pull compute express link (CXL) updates from Dave Jiang:
 "The additions of note are adding CXL region removal support for locked CXL decoders, adding unit testing support for XOR address translation, and adding unit testing support for extended linear cache.

  Misc:
  - Remove incorrect page-allocator quirk section in documentation
  - Remove unused devm_cxl_port_enumerate_dports() function
  - Fix typo in cdat.c code comment
  - Replace use of system_wq with system_percpu_wq
  - Add locked CXL decoder support for region removal
  - Return when generic target updated
  - Rename region_res_match_cxl_range() to spa_maps_hpa()
  - Clarify comment in spa_maps_hpa()

  Enable unit testing for XOR address translation of SPA to DPA and vice versa:
  - Refactor address translation funcs for testing in cxl_region
  - Make the XOR calculations available for testing
  - Add cxl_translate module for address translation testing in cxl_test

  Extended Linear Cache changes:
  - Add extended linear cache size sysfs attribute
  - Adjust failure emission of extended linear cache detection in cxl_acpi
  - Added extended linear cache unit testing support in cxl_test

  Preparation refactor patches for PRM translation support:
  - Simplify cxl_rd_ops allocation and handling
  - Group xor arithmetic setup code in a single block
  - Remove local variable @inc in cxl_port_setup_targets()"

* tag 'cxl-for-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (22 commits)
  cxl/test: Assign overflow_err_count from log->nr_overflow
  cxl/test: Remove ret_limit race condition in mock_get_event()
  cxl/test: remove unused mock function for cxl_rcd_component_reg_phys()
  cxl/test: Add support for acpi extended linear cache
  cxl/test: Add cxl_test CFMWS support for extended linear cache
  cxl/test: Standardize CXL auto region size
  cxl/region: Remove local variable @inc in cxl_port_setup_targets()
  cxl/acpi: Group xor arithmetric setup code in a single block
  cxl: Simplify cxl_rd_ops allocation and handling
  cxl: Clarify comment in spa_maps_hpa()
  cxl: Rename region_res_match_cxl_range() to spa_maps_hpa()
  acpi/hmat: Return when generic target is updated
  cxl: Add handling of locked CXL decoder
  cxl/region: Add support to indicate region has extended linear cache
  cxl: Adjust extended linear cache failure emission in cxl_acpi
  cxl/test: Add cxl_translate module for address translation testing
  cxl/acpi: Make the XOR calculations available for testing
  cxl/region: Refactor address translation funcs for testing
  cxl/pci: replace use of system_wq with system_percpu_wq
  cxl: fix typos in cdat.c comments
  ...
8 days  Merge tag 'hid-for-linus-2025120201' of ↵  Linus Torvalds  1  -0/+71
git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid

Pull HID updates from Jiri Kosina:
 - Proper mapping of HID_GD_Z to ABS_DISTANCE for stylus/pen types of devices (Ping Cheng)
 - Power management/hibernation improvements in intel-ish (Zhang Lixu)
 - Improved support for several Logitech devices, e.g. G Pro X Superlight 2, new iteration of Lightspeed receiver, G13, G510 (Nathan Rossi, Mavroudis Chatzilazaridis, Leo L Schwab, Hans de Goede)
 - Support for UcLogic XP-PEN Artist 24 Pro (Joshua Goins)
 - WinWing Orion2 throttle support improvement (Ivan Gorinov)
 - Other assorted small fixes and device ID additions

* tag 'hid-for-linus-2025120201' of git://git.kernel.org/pub/scm/linux/kernel/git/hid/hid: (37 commits)
  drivers: hid: renegotiate resolution multipliers with device after reset
  HID: evision: Fix Report Descriptor for Evision Wireless Receiver 320f:226f
  HID: logitech-dj: Fix probe failure when used with KVM
  HID: logitech-dj: Remove duplicate error logging
  HID: logitech-dj: Add support for G Pro X Superlight 2 receiver
  selftests/hid-tablet: add ABS_DISTANCE test for stylus/pen
  HID: input: map HID_GD_Z to ABS_DISTANCE for stylus/pen
  HID: bpf: fix typo in HID usage table
  HID: bpf: add the Huion Kamvas 27 Pro
  HID: bpf: add heuristics to the Huion Inspiroy 2S eraser button
  HID: bpf: Add support for XP-Pen Deco02
  HID: bpf: Add support for the XP-Pen Deco 01 V3
  HID: bpf: Add support for the Waltop Batteryless Tablet
  HID: bpf: Add fixup for Logitech SpaceNavigator variants
  HID: bpf: support for Huion Kamvas 16 Gen 3
  HID: bpf: add support for Huion Kamvas 13 (Gen 3) (model GS1333)
  HID: bpf: Add support for the Inspiroy 2M
  Documentation: hid-alps: Format DataByte* subsection headings
  Documentation: hid-alps: Fix packet format section headings
  HID: nintendo: add WQ_PERCPU to alloc_workqueue users
  ...
8 days | perf stat: Improve handling of termination by signal | Ian Rogers | 1 | -5/+16
When interrupting perf stat in repeat mode with a signal the signal is passed to the child process but the repeat doesn't terminate:

```
$ perf stat -v --null --repeat 10 sleep 1
Control descriptor is not initialized
[ perf stat: executing run #1 ... ]
[ perf stat: executing run #2 ... ]
^Csleep: Interrupt
[ perf stat: executing run #3 ... ]
[ perf stat: executing run #4 ... ]
[ perf stat: executing run #5 ... ]
[ perf stat: executing run #6 ... ]
[ perf stat: executing run #7 ... ]
[ perf stat: executing run #8 ... ]
[ perf stat: executing run #9 ... ]
[ perf stat: executing run #10 ... ]

 Performance counter stats for 'sleep 1' (10 runs):

            0.9500 +- 0.0512 seconds time elapsed  ( +-  5.39% )

0.01user 0.02system 0:09.53elapsed 0%CPU (0avgtext+0avgdata 18940maxresident)k
29944inputs+0outputs (0major+2629minor)pagefaults 0swaps
```

Terminate the repeated run and give a reasonable exit value:

```
$ perf stat -v --null --repeat 10 sleep 1
Control descriptor is not initialized
[ perf stat: executing run #1 ... ]
[ perf stat: executing run #2 ... ]
[ perf stat: executing run #3 ... ]
^Csleep: Interrupt

 Performance counter stats for 'sleep 1' (10 runs):

             0.680 +- 0.321 seconds time elapsed  ( +- 47.16% )

Command exited with non-zero status 130
0.00user 0.01system 0:02.05elapsed 0%CPU (0avgtext+0avgdata 70688maxresident)k
0inputs+0outputs (0major+5002minor)pagefaults 0swaps
```

Note, this also changes the exit value for non-repeat runs when interrupted by a signal.

Reported-by: Ingo Molnar <mingo@kernel.org>
Closes: https://lore.kernel.org/lkml/aS5wjmbAM9ka3M2g@gmail.com/
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
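The status 130 in the second transcript is consistent with the common shell convention of 128 plus the signal number (SIGINT is 2). What follows is a minimal Python sketch of that behaviour, not perf's C implementation: `shell_exit_status` and `run_repeated` are hypothetical names, and stopping the loop at the first signal-terminated run is an assumption drawn from the description above.

```python
import signal
import subprocess
import sys


def shell_exit_status(returncode: int) -> int:
    """Map a subprocess return code to a shell-style exit status.

    subprocess reports "killed by signal N" as -N; shells conventionally
    report the same event as 128 + N.
    """
    if returncode < 0:
        return 128 + (-returncode)
    return returncode


def run_repeated(cmd, repeat):
    """Run cmd up to `repeat` times; stop early if a run dies from a signal.

    Returns (number of runs started, shell-style status of the last run).
    """
    runs = 0
    status = 0
    for _ in range(repeat):
        runs += 1
        proc = subprocess.run(cmd)
        status = shell_exit_status(proc.returncode)
        if proc.returncode < 0:
            # Assumption: an interrupted child ends the whole repeat loop.
            break
    return runs, status


if __name__ == "__main__":
    # SIGINT has signal number 2, hence the status 130 seen above.
    print(shell_exit_status(-signal.SIGINT))
    print(run_repeated([sys.executable, "-c", "pass"], 2))
```

Keeping the status mapping separate from the loop makes the 128 + N rule easy to check without having to deliver a real signal.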
8 days | perf tests stat: Add test for error for an offline CPU | Ian Rogers | 1 | -0/+27
Add a test checking that perf stat fails when an offline CPU is requested.

  $ perf test -vv "perf stat tests"
  101: perf stat tests:
  --- start ---
  test child forked, pid 46965
  Basic stat command test
  Basic stat command test [Success]
  Null stat command test
  Null stat command test [Success]
  Offline CPU stat command test (cpu 8)
  Offline CPU stat command test [Success]
  stat record and report test
  stat record and report test [Success]
  stat record and script test
  stat record and script test [Success]
  stat repeat weak groups test
  stat repeat weak groups test [Success]
  Topdown event group test
  Topdown event group test [Success]
  Topdown weak groups test
  Topdown weak groups test [Skipped event parsing failed]
  cputype test
  cputype test [Success]
  hybrid test
  hybrid test [Success]
  ---- end(0) ----
  101: perf stat tests : Ok

Reported-by: Thomas Richter <tmricht@linux.ibm.com>
Closes: https://lore.kernel.org/linux-perf-users/94313b82-888b-4f42-9fb0-4585f9e90080@linux.ibm.com/
Signed-off-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
8 days | Merge tag 'sound-6.19-rc1' of ↵ | Linus Torvalds | 1 | -1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound updates from Takashi Iwai: "The majority of changes at this time were about ASoC with a lot of code refactoring works. From the functionality POV, there isn't much to see, but we have a wide range of device-specific fixes and updates. Here are some highlights: - Continued ASoC API cleanup work, spanned over many files - Added a SoundWire SCDA generic class driver with regmap support - Enhancements and fixes for Cirrus, Intel, Maxim and Qualcomm. - Support for ASoC Allwinner A523, Mediatek MT8189, Qualcomm QCM2290, QRB2210 and SM6115, SpacemiT K1, and TI TAS2568, TAS5802, TAS5806, TAS5815, TAS5828 and TAS5830 - Usual HD-audio and USB-audio quirks and fixups - Support for Onkyo SE-300PCIE, TASCAM IF-FW/DM MkII Some gpiolib changes for shared GPIOs are included along with this PR for covering ASoC drivers changes" * tag 'sound-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: (739 commits) ALSA: hda/realtek: Add PCI SSIDs to HP ProBook quirks ALSA: usb-audio: Simplify with usb_endpoint_max_periodic_payload() ALSA: hda/realtek: fix mute/micmute LEDs don't work for more HP laptops ALSA: rawmidi: Fix inconsistent indenting warning reported by smatch ALSA: dice: fix buffer overflow in detect_stream_formats() ASoC: codecs: Modify awinic amplifier dsp read and write functions ASoC: SDCA: Fixup some more Kconfig issues ASoC: cs35l56: Log a message if firmware is missing ASoC: nau8325: Delete a stray tab firmware: cs_dsp: Add test cases for client_ops == NULL firmware: cs_dsp: Don't require client to provide a struct cs_dsp_client_ops ASoC: fsl_micfil: Set channel range control ASoC: fsl_micfil: Add default quality for different platforms ASoC: intel: sof_sdw: Add codec_info for cs42l45 ASoC: sdw_utils: Add cs42l45 support functions ASoC: intel: sof_sdw: Add ability to have auxiliary devices ASoC: sdw_utils: Move codec_name to dai info ASoC: sdw_utils: Add codec_conf for every DAI ASoC: 
SDCA: Add terminal type into input/output widget name ASoC: SDCA: Align mute controls to ALSA expectations ...
8 days | perf stat: When no events, don't report an error if there is none | Ian Rogers | 1 | -2/+4
Events may fail to open as no supported CPUs were specified on the command line. In this case a confusing "error" message of "success" can be reported. Let's skip the error in that case.

Before:
```
$ perf stat -C2048 -e cycles -- true
WARNING: A requested CPU in '2048' is not supported by PMU 'cpu' (CPUs 0-7) for event 'cycles'
Error: No supported events found.
The sys_perf_event_open() syscall returned with 0 (Success) for event (cpu/unknown-hardware/).
"dmesg | grep -i perf" may provide additional information.
```

After:
```
$ perf stat -C2048 -e cycles -- true
WARNING: A requested CPU in '2048' is not supported by PMU 'cpu' (CPUs 0-7) for event 'cycles'
Error: No supported events found.
```

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
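The before/after output above suggests that the syscall detail is only worth printing when an error code was actually recorded. A small Python sketch of that idea follows; `describe_open_failure` is a hypothetical helper, the message text is copied from the transcripts, and this is not the code perf itself uses.

```python
import os


def describe_open_failure(event: str, err: int) -> str:
    """Build the message shown when no events could be opened."""
    lines = ["Error: No supported events found."]
    if err != 0:
        # Only mention the syscall when it really reported an error;
        # an error number of 0 would otherwise read as "Success".
        lines.append(
            f"The sys_perf_event_open() syscall returned with {err} "
            f"({os.strerror(err)}) for event ({event})."
        )
    return "\n".join(lines)


if __name__ == "__main__":
    print(describe_open_failure("cpu/unknown-hardware/", 0))
```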
8 days | perf tests stat: Add "--null" coverage | Ian Rogers | 1 | -0/+12
Ensure "--null" does a minimal run.

Reported-by: Ingo Molnar <mingo@kernel.org>
Closes: https://lore.kernel.org/linux-perf-users/aSwt7yzFjVJCEmVp@gmail.com/
Tested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
8 days | perf cpumap: Add "any" CPU handling to cpu_map__snprint_mask | Ian Rogers | 1 | -2/+7
If the perf_cpu_map is empty or is just the any CPU value, then early return. Don't process the "any" CPU when creating the bitmap.

Tested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
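The two rules in this message (return early for an empty or "any"-only map, and skip the "any" entry while setting bits) can be sketched in Python as below. This is only an illustration: `ANY_CPU = -1` as the sentinel, the function name `cpu_mask_hex`, the empty-string early return, and the plain hex output are all assumptions, and the real cpu_map__snprint_mask output format is not reproduced.

```python
ANY_CPU = -1  # assumption: sentinel value standing for "any CPU"


def cpu_mask_hex(cpus):
    """Return a hex bitmap of the CPUs in `cpus`, ignoring the "any" entry."""
    if not cpus or all(cpu == ANY_CPU for cpu in cpus):
        # Nothing to put in the bitmap: empty map or only the "any" CPU.
        return ""
    mask = 0
    for cpu in cpus:
        if cpu == ANY_CPU:
            # A negative index must never be turned into a bit position.
            continue
        mask |= 1 << cpu
    return format(mask, "x")


if __name__ == "__main__":
    print(cpu_mask_hex([0, 1, 3]))
```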
8 days | libperf cpumap: Fix perf_cpu_map__max for an empty/NULL map | Ian Rogers | 1 | -4/+6
Passing an empty map to perf_cpu_map__max triggered a SEGV. Explicitly test for the empty map.

Reported-by: Ingo Molnar <mingo@kernel.org>
Closes: https://lore.kernel.org/linux-perf-users/aSwt7yzFjVJCEmVp@gmail.com/
Tested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
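As a rough analogue of the fix, the Python sketch below guards the lookup of the largest CPU with an explicit emptiness test. It assumes a sorted list stands in for the map and that -1 is returned for an empty or missing map; `cpu_map_max` and `ANY_CPU` are hypothetical names, not libperf's API.

```python
ANY_CPU = -1  # assumption: value reported when the map holds no real CPU


def cpu_map_max(cpus):
    """Return the largest CPU in a sorted list, or ANY_CPU if there is none."""
    if not cpus:
        # Covers both None and an empty list. Without this test, cpus[-1]
        # would raise (the Python counterpart of the crash described above).
        return ANY_CPU
    return cpus[-1]


if __name__ == "__main__":
    print(cpu_map_max([0, 1, 2, 7]))
    print(cpu_map_max(None))
```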
8 days | perf stat: Allow no events to open if this is a "--null" run | Ian Rogers | 1 | -1/+1
It is intended that a "--null" run doesn't open any events.

Fixes: 2cc7aa995ce9 ("perf stat: Refactor retry/skip/fatal error handling")
Tested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Thomas Richter <tmricht@linux.ibm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
8 days | Merge tag 'for-6.19/block-20251201' of ↵ | Linus Torvalds | 2 | -34/+45
git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux Pull block updates from Jens Axboe: - Fix head insertion for mq-deadline, a regression from when priority support was added - Series simplifying and improving the ublk user copy code - Various ublk related cleanups - Fixup REQ_NOWAIT handling in loop/zloop, clearing NOWAIT when the request is punted to a thread for handling - Merge and then later revert loop dio nowait support, as it ended up causing excessive stack usage for when the inline issue code needs to dip back into the full file system code - Improve auto integrity code, making it less deadlock prone - Speedup polled IO handling, but manually managing the hctx lookups - Fixes for blk-throttle for SSD devices - Small series with fixes for the S390 dasd driver - Add support for caching zones, avoiding unnecessary report zone queries - MD pull requests via Yu: - fix null-ptr-dereference regression for dm-raid0 - fix IO hang for raid5 when array is broken with IO inflight - remove legacy 1s delay to speed up system shutdown - change maintainer's email address - data can be lost if array is created with different lbs devices, fix this problem and record lbs of the array in metadata - fix rcu protection for md_thread - fix mddev kobject lifetime regression - enable atomic writes for md-linear - some cleanups - bcache updates via Coly - remove useless discard and cache device code - improve usage of per-cpu workqueues - Reorganize the IO scheduler switching code, fixing some lockdep reports as well - Improve the block layer P2P DMA support - Add support to the block tracing code for zoned devices - Segment calculation improves, and memory alignment flexibility improvements - Set of prep and cleanups patches for ublk batching support. 
The actual batching hasn't been added yet, but helps shrink down the workload of getting that patchset ready for 6.20 - Fix for how the ps3 block driver handles segments offsets - Improve how block plugging handles batch tag allocations - nbd fixes for use-after-free of the configuration on device clear/put - Set of improvements and fixes for zloop - Add Damien as maintainer of the block zoned device code handling - Various other fixes and cleanups * tag 'for-6.19/block-20251201' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux: (162 commits) block/rnbd: correct all kernel-doc complaints blk-mq: use queue_hctx in blk_mq_map_queue_type md: remove legacy 1s delay in md_notify_reboot md/raid5: fix IO hang when array is broken with IO inflight md: warn about updating super block failure md/raid0: fix NULL pointer dereference in create_strip_zones() for dm-raid sbitmap: fix all kernel-doc warnings ublk: add helper of __ublk_fetch() ublk: pass const pointer to ublk_queue_is_zoned() ublk: refactor auto buffer register in ublk_dispatch_req() ublk: add `union ublk_io_buf` with improved naming ublk: add parameter `struct io_uring_cmd *` to ublk_prep_auto_buf_reg() kfifo: add kfifo_alloc_node() helper for NUMA awareness blk-mq: fix potential uaf for 'queue_hw_ctx' blk-mq: use array manage hctx map instead of xarray ublk: prevent invalid access with DEBUG s390/dasd: Use scnprintf() instead of sprintf() s390/dasd: Move device name formatting into separate function s390/dasd: Remove unnecessary debugfs_create() return checks s390/dasd: Fix gendisk parent after copy pair swap ...
9 days | Merge tag 'net-next-6.19' of ↵ | Linus Torvalds | 113 | -1472/+7097
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking updates from Jakub Kicinski: "Core & protocols: - Replace busylock at the Tx queuing layer with a lockless list. Resulting in a 300% (4x) improvement on heavy TX workloads, sending twice the number of packets per second, for half the cpu cycles. - Allow constantly busy flows to migrate to a more suitable CPU/NIC queue. Normally we perform queue re-selection when flow comes out of idle, but under extreme circumstances the flows may be constantly busy. Add sysctl to allow periodic rehashing even if it'd risk packet reordering. - Optimize the NAPI skb cache, make it larger, use it in more paths. - Attempt returning Tx skbs to the originating CPU (like we already did for Rx skbs). - Various data structure layout and prefetch optimizations from Eric. - Remove ktime_get() from the recvmsg() fast path, ktime_get() is sadly quite expensive on recent AMD machines. - Extend threaded NAPI polling to allow the kthread busy poll for packets. - Make MPTCP use Rx backlog processing. This lowers the lock pressure, improving the Rx performance. - Support memcg accounting of MPTCP socket memory. - Allow admin to opt sockets out of global protocol memory accounting (using a sysctl or BPF-based policy). The global limits are a poor fit for modern container workloads, where limits are imposed using cgroups. - Improve heuristics for when to kick off AF_UNIX garbage collection. - Allow users to control TCP SACK compression, and default to 33% of RTT. - Add tcp_rcvbuf_low_rtt sysctl to let datacenter users avoid unnecessarily aggressive rcvbuf growth and overshot when the connection RTT is low. - Preserve skb metadata space across skb_push / skb_pull operations. - Support for IPIP encapsulation in the nftables flowtable offload. - Support appending IP interface information to ICMP messages (RFC 5837). - Support setting max record size in TLS (RFC 8449). 
- Remove taking rtnl_lock from RTM_GETNEIGHTBL and RTM_SETNEIGHTBL. - Use a dedicated lock (and RCU) in MPLS, instead of rtnl_lock. - Let users configure the number of write buffers in SMC. - Add new struct sockaddr_unsized for sockaddr of unknown length, from Kees. - Some conversions away from the crypto_ahash API, from Eric Biggers. - Some preparations for slimming down struct page. - YAML Netlink protocol spec for WireGuard. - Add a tool on top of YAML Netlink specs/lib for reporting commonly computed derived statistics and summarized system state. Driver API: - Add CAN XL support to the CAN Netlink interface. - Add uAPI for reporting PHY Mean Square Error (MSE) diagnostics, as defined by the OPEN Alliance's "Advanced diagnostic features for 100BASE-T1 automotive Ethernet PHYs" specification. - Add DPLL phase-adjust-gran pin attribute (and implement it in zl3073x). - Refactor xfrm_input lock to reduce contention when NIC offloads IPsec and performs RSS. - Add info to devlink params whether the current setting is the default or a user override. Allow resetting back to default. - Add standard device stats for PSP crypto offload. - Leverage DSA frame broadcast to implement simple HSR frame duplication for a lot of switches without dedicated HSR offload. - Add uAPI defines for 1.6Tbps link modes. Device drivers: - Add Motorcomm YT921x gigabit Ethernet switch support. - Add MUCSE driver for N500/N210 1GbE NIC series. - Convert drivers to support dedicated ops for timestamping control, and away from the direct IOCTL handling. While at it support GET operations for PHY timestamping. - Add (and convert most drivers to) a dedicated ethtool callback for reading the Rx ring count. - Significant refactoring efforts in the STMMAC driver, which supports Synopsys turn-key MAC IP integrated into a ton of SoCs. 
- Ethernet high-speed NICs: - Broadcom (bnxt): - support PPS in/out on all pins - Intel (100G, ice, idpf): - ice: implement standard ethtool and timestamping stats - i40e: support setting the max number of MAC addresses per VF - iavf: support RSS of GTP tunnels for 5G and LTE deployments - nVidia/Mellanox (mlx5): - reduce downtime on interface reconfiguration - disable being an XDP redirect target by default (same as other drivers) to avoid wasting resources if feature is unused - Meta (fbnic): - add support for Linux-managed PCS on 25G, 50G, and 100G links - Wangxun: - support Rx descriptor merge, and Tx head writeback - support Rx coalescing offload - support 25G SFP and 40G QSFP modules - Ethernet virtual: - Google (gve): - allow ethtool to configure rx_buf_len - implement XDP HW RX Timestamping support for DQ descriptor format - Microsoft vNIC (mana): - support HW link state events - handle hardware recovery events when probing the device - Ethernet NICs consumer, and embedded: - usbnet: add support for Byte Queue Limits (BQL) - AMD (amd-xgbe): - add device selftests - NXP (enetc): - add i.MX94 support - Broadcom integrated MACs (bcmgenet, bcmasp): - bcmasp: add support for PHY-based Wake-on-LAN - Broadcom switches (b53): - support port isolation - support BCM5389/97/98 and BCM63XX ARL formats - Lantiq/MaxLinear switches: - support bridge FDB entries on the CPU port - use regmap for register access - allow user to enable/disable learning - support Energy Efficient Ethernet - support configuring RMII clock delays - add tagging driver for MaxLinear GSW1xx switches - Synopsys (stmmac): - support using the HW clock in free running mode - add Eswin EIC7700 support - add Rockchip RK3506 support - add Altera Agilex5 support - Cadence (macb): - cleanup and consolidate descriptor and DMA address handling - add EyeQ5 support - TI: - icssg-prueth: support AF_XDP - Airoha access points: - add missing Ethernet stats and link state callback - add AN7583 support - support 
out-of-order Tx completion processing - Power over Ethernet: - pd692x0: preserve PSE configuration across reboots - add support for TPS23881B devices - Ethernet PHYs: - Open Alliance OATC14 10BASE-T1S PHY cable diagnostic support - Support 50G SerDes and 100G interfaces in Linux-managed PHYs - micrel: - support for non PTP SKUs of lan8814 - enable in-band auto-negotiation on lan8814 - realtek: - cable testing support on RTL8224 - interrupt support on RTL8221B - motorcomm: support for PHY LEDs on YT853 - microchip: support for LAN867X Rev.D0 PHYs w/ SQI and cable diag - mscc: support for PHY LED control - CAN drivers: - m_can: add support for optional reset and system wake up - remove can_change_mtu() obsoleted by core handling - mcp251xfd: support GPIO controller functionality - Bluetooth: - add initial support for PASTa - WiFi: - split ieee80211.h file, it's way too big - improvements in VHT radiotap reporting, S1G, Channel Switch Announcement handling, rate tracking in mesh networks - improve multi-radio monitor mode support, and add a cfg80211 debugfs interface for it - HT action frame handling on 6 GHz - initial chanctx work towards NAN - MU-MIMO sniffer improvements - WiFi drivers: - RealTek (rtw89): - support USB devices RTL8852AU and RTL8852CU - initial work for RTL8922DE - improved injection support - Intel: - iwlwifi: new sniffer API support - MediaTek (mt76): - WED support for >32-bit DMA - airoha NPU support - regdomain improvements - continued WiFi7/MLO work - Qualcomm/Atheros: - ath10k: factory test support - ath11k: TX power insertion support - ath12k: BSS color change support - ath12k: statistics improvements - brcmfmac: Acer A1 840 tablet quirk - rtl8xxxu: 40 MHz connection fixes/support" * tag 'net-next-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1381 commits) net: page_pool: sanitise allocation order net: page pool: xa init with destroy on pp init net/mlx5e: Support XDP target xmit with dummy program net/mlx5e: Update 
XDP features in switch channels selftests/tc-testing: Test CAKE scheduler when enqueue drops packets net/sched: sch_cake: Fix incorrect qlen reduction in cake_drop wireguard: netlink: generate netlink code wireguard: uapi: generate header with ynl-gen wireguard: uapi: move flag enums wireguard: uapi: move enum wg_cmd wireguard: netlink: add YNL specification selftests: drv-net: Fix tolerance calculation in devlink_rate_tc_bw.py selftests: drv-net: Fix and clarify TC bandwidth split in devlink_rate_tc_bw.py selftests: drv-net: Set shell=True for sysfs writes in devlink_rate_tc_bw.py selftests: drv-net: Use Iperf3Runner in devlink_rate_tc_bw.py selftests: drv-net: introduce Iperf3Runner for measurement use cases selftests: drv-net: Add devlink_rate_tc_bw.py to TEST_PROGS net: ps3_gelic_net: Use napi_alloc_skb() and napi_gro_receive() Documentation: net: dsa: mention simple HSR offload helpers Documentation: net: dsa: mention availability of RedBox ...
9 days | Merge tag 'bpf-next-6.19' of ↵ | Linus Torvalds | 106 | -3961/+8439
git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Pull bpf updates from Alexei Starovoitov: - Convert selftests/bpf/test_tc_edt and test_tc_tunnel from .sh to test_progs runner (Alexis Lothoré) - Convert selftests/bpf/test_xsk to test_progs runner (Bastien Curutchet) - Replace bpf memory allocator with kmalloc_nolock() in bpf_local_storage (Amery Hung), and in bpf streams and range tree (Puranjay Mohan) - Introduce support for indirect jumps in BPF verifier and x86 JIT (Anton Protopopov) and arm64 JIT (Puranjay Mohan) - Remove runqslower bpf tool (Hoyeon Lee) - Fix corner cases in the verifier to close several syzbot reports (Eduard Zingerman, KaFai Wan) - Several improvements in deadlock detection in rqspinlock (Kumar Kartikeya Dwivedi) - Implement "jmp" mode for BPF trampoline and corresponding DYNAMIC_FTRACE_WITH_JMP. It improves "fexit" program type performance from 80 M/s to 136 M/s. With Steven's Ack. (Menglong Dong) - Add ability to test non-linear skbs in BPF_PROG_TEST_RUN (Paul Chaignon) - Do not let BPF_PROG_TEST_RUN emit invalid GSO types to stack (Daniel Borkmann) - Generalize buildid reader into bpf_dynptr (Mykyta Yatsenko) - Optimize bpf_map_update_elem() for map-in-map types (Ritesh Oedayrajsingh Varma) - Introduce overwrite mode for BPF ring buffer (Xu Kuohai) * tag 'bpf-next-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (169 commits) bpf: optimize bpf_map_update_elem() for map-in-map types bpf: make kprobe_multi_link_prog_run always_inline selftests/bpf: do not hardcode target rate in test_tc_edt BPF program selftests/bpf: remove test_tc_edt.sh selftests/bpf: integrate test_tc_edt into test_progs selftests/bpf: rename test_tc_edt.bpf.c section to expose program type selftests/bpf: Add success stats to rqspinlock stress test rqspinlock: Precede non-head waiter queueing with AA check rqspinlock: Disable spinning for trylock fallback rqspinlock: Use trylock fallback when per-CPU rqnode is busy rqspinlock: Perform AA 
checks immediately rqspinlock: Enclose lock/unlock within lock entry acquisitions bpf: Remove runqslower tool selftests/bpf: Remove usage of lsm/file_alloc_security in selftest bpf: Disable file_alloc_security hook bpf: check for insn arrays in check_ptr_alignment bpf: force BPF_F_RDONLY_PROG on insn array creation bpf: Fix exclusive map memory leak selftests/bpf: Make CS length configurable for rqspinlock stress test selftests/bpf: Add lock wait time stats to rqspinlock stress test ...
9 days | ktest.pl: Fix uninitialized var in config-bisect.pl | Steven Rostedt | 1 | -2/+2
The error path of copying the old config used the wrong variable in the error message:

  $ mkdir /tmp/build
  $ ./tools/testing/ktest/config-bisect.pl -b /tmp/build config-good /tmp/config-bad
  $ chmod 0 /tmp/build
  $ ./tools/testing/ktest/config-bisect.pl -b /tmp/build config-good /tmp/config-bad good
  cp /tmp/build//.config config-good.tmp ... [0 seconds] FAILED!
  Use of uninitialized value $config in concatenation (.) or string at ./tools/testing/ktest/config-bisect.pl line 744.
  failed to copy to config-good.tmp

When it should have shown:

  failed to copy /tmp/build//.config to config-good.tmp

Cc: stable@vger.kernel.org
Cc: John 'Warthog9' Hawley <warthog9@kernel.org>
Fixes: 0f0db065999cf ("ktest: Add standalone config-bisect.pl program")
Link: https://patch.msgid.link/20251203180924.6862bd26@gandalf.local.home
Reported-by: "John W. Krahn" <jwkrahn@shaw.ca>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
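The intended message names both the source and the destination of the failed copy. The script itself is Perl; the Python sketch below only illustrates the same error-path pattern, with `copy_config` as a hypothetical helper that builds the message from the very variables passed to the copy call.

```python
import shutil


def copy_config(src: str, dst: str) -> None:
    """Copy a config file, naming both paths if the copy fails."""
    try:
        shutil.copyfile(src, dst)
    except OSError as exc:
        # Build the message from the same variables used for the copy,
        # so neither path can end up blank in the report.
        raise RuntimeError(f"failed to copy {src} to {dst}") from exc


if __name__ == "__main__":
    try:
        copy_config("/nonexistent/build/.config", "config-good.tmp")
    except RuntimeError as err:
        print(err)
```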
9 days | Merge tag 'linux_kselftest-next-6.19-rc1' of ↵ | Linus Torvalds | 6 | -19/+176
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest

Pull kselftest updates from Shuah Khan:

 - Add basic test for trace_marker_raw file to tracing selftest
 - Fix invalid array access in printf dma_map_benchmark selftest
 - Add tprobe enable/disable testcase to tracing selftest
 - Update fprobe selftest for ftrace based fprobe

* tag 'linux_kselftest-next-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest:
  selftests: tracing: Update fprobe selftest for ftrace based fprobe
  selftests: tracing: Add tprobe enable/disable testcase
  selftests/run_kselftest.sh: exit with error if tests fail
  selftests/dma: fix invalid array access in printf
  selftests/tracing: Add basic test for trace_marker_raw file
9 days | Merge tag 'livepatching-for-6.19' of ↵ | Linus Torvalds | 1 | -1/+5
git://git.kernel.org/pub/scm/linux/kernel/git/livepatching/livepatching

Pull livepatching updates from Petr Mladek:

 - Support both paths where tracefs is typically mounted in selftests
 - Make old_sympos 0 and 1 equal. They both are valid when there is only one symbol with the given name.

* tag 'livepatching-for-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/livepatching/livepatching:
  selftests: livepatch: use canonical ftrace path
  livepatch: Match old_sympos 0 and 1 in klp_find_func()
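The old_sympos rule can be illustrated with a short Python sketch in which positions 0 and 1 compare as equal during lookup. The `Func` class, `normalize_sympos` and `find_func` are hypothetical stand-ins chosen for illustration; the kernel's klp_find_func() is C and may be structured differently.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Func:
    old_name: str
    old_sympos: int = 0


def normalize_sympos(sympos: int) -> int:
    """Treat 0 as equivalent to 1 (the first, possibly only, occurrence)."""
    return 1 if sympos == 0 else sympos


def find_func(funcs: Sequence[Func], wanted: Func) -> Optional[Func]:
    """Return the first entry matching `wanted` by name and symbol position."""
    for func in funcs:
        if (func.old_name == wanted.old_name
                and normalize_sympos(func.old_sympos)
                == normalize_sympos(wanted.old_sympos)):
            return func
    return None


if __name__ == "__main__":
    table = [Func("vfs_read", 1), Func("do_exit", 2)]
    print(find_func(table, Func("vfs_read", 0)))
```

Normalizing both sides before comparing keeps the equality symmetric, so it does not matter which of the two entries used 0.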
9 days | Merge tag 'sched_ext-for-6.19' of ↵ | Linus Torvalds | 11 | -129/+950
git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext Pull sched_ext updates from Tejun Heo: - Improve recovery from misbehaving BPF schedulers. When a scheduler puts many tasks with varying affinity restrictions on a shared DSQ, CPUs scanning through tasks they cannot run can overwhelm the system, causing lockups. Bypass mode now uses per-CPU DSQs with a load balancer to avoid this, and hooks into the hardlockup detector to attempt recovery. Add scx_cpu0 example scheduler to demonstrate this scenario. - Add lockless peek operation for DSQs to reduce lock contention for schedulers that need to query queue state during load balancing. - Allow scx_bpf_reenqueue_local() to be called from anywhere in preparation for deprecating cpu_acquire/release() callbacks in favor of generic BPF hooks. - Prepare for hierarchical scheduler support: add scx_bpf_task_set_slice() and scx_bpf_task_set_dsq_vtime() kfuncs, make scx_bpf_dsq_insert*() return bool, and wrap kfunc args in structs for future aux__prog parameter. - Implement cgroup_set_idle() callback to notify BPF schedulers when a cgroup's idle state changes. - Fix migration tasks being incorrectly downgraded from stop_sched_class to rt_sched_class across sched_ext enable/disable. Applied late as the fix is low risk and the bug subtle but needs stable backporting. - Various fixes and cleanups including cgroup exit ordering, SCX_KICK_WAIT reliability, and backward compatibility improvements. 
* tag 'sched_ext-for-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext: (44 commits) sched_ext: Fix incorrect sched_class settings for per-cpu migration tasks sched_ext: tools: Removing duplicate targets during non-cross compilation sched_ext: Use kvfree_rcu() to release per-cpu ksyncs object sched_ext: Pass locked CPU parameter to scx_hardlockup() and add docs sched_ext: Update comments replacing breather with aborting mechanism sched_ext: Implement load balancer for bypass mode sched_ext: Factor out abbreviated dispatch dequeue into dispatch_dequeue_locked() sched_ext: Factor out scx_dsq_list_node cursor initialization into INIT_DSQ_LIST_CURSOR sched_ext: Add scx_cpu0 example scheduler sched_ext: Hook up hardlockup detector sched_ext: Make handle_lockup() propagate scx_verror() result sched_ext: Refactor lockup handlers into handle_lockup() sched_ext: Make scx_exit() and scx_vexit() return bool sched_ext: Exit dispatch and move operations immediately when aborting sched_ext: Simplify breather mechanism with scx_aborting flag sched_ext: Use per-CPU DSQs instead of per-node global DSQs in bypass mode sched_ext: Refactor do_enqueue_task() local and global DSQ paths sched_ext: Use shorter slice in bypass mode sched_ext: Mark racy bitfields to prevent adding fields that can't tolerate races sched_ext: Minor cleanups to scx_task_iter ...
9 days | Merge tag 'cgroup-for-6.19' of ↵ | Linus Torvalds | 8 | -24/+32
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup updates from Tejun Heo: - Defer task cgroup unlink until after the dying task's final context switch so that controllers see the cgroup properly populated until the task is truly gone - cpuset cleanups and simplifications. Enforce that domain isolated CPUs stay in root or isolated partitions and fail if isolated+nohz_full would leave no housekeeping CPU. Fix sched/deadline root domain handling during CPU hot-unplug and race for tasks in attaching cpusets - Misc fixes including memory reclaim protection documentation and selftest KTAP conformance * tag 'cgroup-for-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (21 commits) cpuset: Treat cpusets in attaching as populated sched/deadline: Walk up cpuset hierarchy to decide root domain when hot-unplug cgroup/cpuset: Introduce cpuset_cpus_allowed_locked() docs: cgroup: No special handling of unpopulated memcgs docs: cgroup: Note about sibling relative reclaim protection docs: cgroup: Explain reclaim protection target selftests/cgroup: conform test to KTAP format output cpuset: remove need_rebuild_sched_domains cpuset: remove global remote_children list cpuset: simplify node setting on error cgroup: include missing header for struct irq_work cgroup: Fix sleeping from invalid context warning on PREEMPT_RT cgroup/cpuset: Globally track isolated_cpus update cgroup/cpuset: Ensure domain isolated CPUs stay in root or isolated partition cgroup/cpuset: Move up prstate_housekeeping_conflict() helper cgroup/cpuset: Fail if isolated and nohz_full don't leave any housekeeping cgroup/cpuset: Rename update_unbound_workqueue_cpumask() to update_isolation_cpumasks() cgroup: Defer task cgroup unlink until after the task is done switching out cgroup: Move dying_tasks cleanup from cgroup_task_release() to cgroup_task_free() cgroup: Rename cgroup lifecycle hooks to cgroup_task_*() ...
9 days | selftests: tpm2: Fix ill defined assertions | Maurice Hieronymus | 1 | -2/+2
Remove parentheses around assert statements in Python. With parentheses, assert always evaluates to True, making the checks ineffective.

Signed-off-by: Maurice Hieronymus <mhi@mailbox.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
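The commit message does not show the affected statements. Assuming they had the two-element form `assert(condition, message)`, the sketch below shows why such a check can never fail: Python evaluates the truthiness of the tuple, not of the condition. Both function names are hypothetical.

```python
def parenthesized_check_passes(value: int) -> bool:
    # This is what `assert (value == 0, "msg")` really tests: a two-element
    # tuple, which is truthy no matter what the comparison yields.
    return bool((value == 0, "value must be zero"))


def checked(value: int) -> int:
    # Correct form: condition and message separated by a comma,
    # with no parentheses around the pair.
    assert value == 0, "value must be zero"
    return value


if __name__ == "__main__":
    print(parenthesized_check_passes(1))
    try:
        checked(1)
    except AssertionError as exc:
        print("AssertionError:", exc)
```

The tuple is built explicitly here, rather than written as a parenthesized assert, so that the example itself does not trip Python's "assertion is always true" syntax warning.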
9 days | Merge tag 'rcu.release.v6.19' of ↵ | Linus Torvalds | 4 | -17/+158
git://git.kernel.org/pub/scm/linux/kernel/git/rcu/linux Pull RCU updates from Frederic Weisbecker: "SRCU: - Properly handle SRCU readers within IRQ disabled sections in tiny SRCU - Preparation to reimplement RCU Tasks Trace on top of SRCU fast: - Introduce API to expedite a grace period and test it through rcutorture - Split srcu-fast in two flavours: SRCU-fast and SRCU-fast-updown. Both are still targeted toward faster readers (without full barriers on LOCK and UNLOCK) at the expense of heavier write side (using full RCU grace period ordering instead of simply full ordering) as compared to "traditional" non-fast SRCU. But those srcu-fast flavours are going to be optimized in two different ways: - SRCU-fast will become the reimplementation basis for RCU-TASK-TRACE for consolidation. Since RCU-TASK-TRACE must be NMI safe, SRCU-fast must be as well. - SRCU-fast-updown will be needed for uretprobes code in order to get rid of the read-side memory barriers while still allowing entering the reader at task level while exiting it in a timer handler. It is considered semaphore-like in that it can have different owners between LOCK and UNLOCK. However it is not NMI-safe. The actual optimizations are work in progress for the next cycle. Only the new interfaces are added for now, along with related torture and scalability test code. - Create/document/debug/torture new proper initializers for RCU fast: DEFINE_SRCU_FAST() and init_srcu_struct_fast() This allows for using right away the proper ordering on the write side (either full ordering or full RCU grace period ordering) without waiting for the read side to tell which to use. This also optimizes the read side altogether with moving flavour debug checks under debug config and with removing a costly RmW operation on their first call. - Make some diagnostic functions tracing safe Refscale: - Add performance testing for common context synchronizations (Preemption, IRQ, Softirq) and per-cpu increments. 
Those are relevant comparisons against SRCU-fast read side APIs, especially as they are planned to synchronize further tracing fast-path code Miscellaneous: - In order to prepare the layout for nohz_full work deferral to user exit, the context tracking state must shrink the counter of transitions to/from RCU not watching. The only possible hazard is to trigger wrap-around more easily, delaying grace periods a bit when that happens. This should be a rare event though. Yet add debugging and torture code to test that assumption - Fix memory leak on locktorture module - Annotate accesses in rculist_nulls.h to prevent KCSAN warnings. In recent discussions, we also concluded that all those WRITE_ONCE() and READ_ONCE() on list APIs deserve appropriate comments. Something to be expected for the next cycle - Provide a script to apply several configs to several commits with torture - Allow torture to reuse a build directory in order to save needless rebuild time - Various cleanups" * tag 'rcu.release.v6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/rcu/linux: (29 commits) refscale: Add SRCU-fast-updown readers refscale: Exercise DEFINE_STATIC_SRCU_FAST() and init_srcu_struct_fast() rcutorture: Make srcu{,d}_torture_init() announce the SRCU type srcu: Create an SRCU-fast-updown API refscale: Do not disable interrupts for tests involving local_bh_enable() refscale: Add non-atomic per-CPU increment readers refscale: Add this_cpu_inc() readers refscale: Add preempt_disable() readers refscale: Add local_bh_disable() readers refscale: Add local_irq_disable() and local_irq_save() readers torture: Permit negative kvm.sh --kconfig numberic arguments srcu: Add SRCU_READ_FLAVOR_FAST_UPDOWN CPP macro rcu: Mark diagnostic functions as notrace rcutorture: Make TREE04 use CONFIG_RCU_DYNTICKS_TORTURE rcutorture: Remove redundant rcutorture_one_extend() from rcu_torture_one_read() rcutorture: Permit kvm-again.sh to re-use the build directory torture: Add kvm-series.sh to test 
commit/scenario combination rcu: use WRITE_ONCE() for ->next and ->pprev of hlist_nulls locktorture: Fix memory leak in param_set_cpumask() doc: Update for SRCU-fast definitions and initialization ...
9 daysMerge tag 'docs-6.19' of git://git.lwn.net/linuxLinus Torvalds30-107/+9623
Pull documentation updates from Jonathan Corbet: "This has been another busy cycle for documentation, with a lot of build-system thrashing. That work should slow down from here on out. - The various scripts and tools for documentation were spread out in several directories; now they are (almost) all coalesced under tools/docs/. The holdout is the kernel-doc script, which cannot be easily moved without some further thought. - As the amount of Python code increases, we are accumulating modules that are imported by multiple programs. These modules have been pulled together under tools/lib/python/ -- at least, for documentation-related programs. There is other Python code in the tree that might eventually want to move toward this organization. - The Perl kernel-doc.pl script has been removed. It is no longer used by default, and nobody has missed it, least of all anybody who actually had to look at it. - The docs build was controlled by a complex mess of makefiles that few dared to touch. Mauro has moved that logic into a new program (tools/docs/sphinx-build-wrapper) that, with any luck at all, will be far easier to understand and maintain. - The get_feat.pl program, used to access information under Documentation/features/, has been rewritten in Python, bringing an end to the use of Perl in the docs subsystem. - The top-level README file has been reorganized into a more reader-friendly presentation. 
- A lot of Chinese translation additions - Typo fixes and documentation updates as usual" * tag 'docs-6.19' of git://git.lwn.net/linux: (164 commits) docs: makefile: move rustdoc check to the build wrapper README: restructure with role-based documentation and guidelines docs: kdoc: various fixes for grammar, spelling, punctuation docs: kdoc_parser: use '@' for Excess enum value docs: submitting-patches: Clarify that removal of Acks needs explanation too docs: kdoc_parser: add data/function attributes to ignore docs: MAINTAINERS: update Mauro's files/paths docs/zh_CN: Add wd719x.rst translation docs/zh_CN: Add libsas.rst translation get_feat.pl: remove it, as it got replaced by get_feat.py Documentation/sphinx/kernel_feat.py: use class directly tools/docs/get_feat.py: convert get_feat.pl to Python Documentation/admin-guide: fix typo and comment in cscope example docs/zh_CN: Add data-integrity.rst translation docs/zh_CN: Add blk-mq.rst translation docs/zh_CN: Add block/index.rst translation docs/zh_CN: Update the Chinese translation of kbuild.rst docs: bring some order to our Python module hierarchy docs: Move the python libraries to tools/lib/python Documentation/kernel-parameters: Move the kernel build options ...
9 daysperf test kvm: Add some basic perf kvm test coverageIan Rogers1-0/+154
Set up qemu with KVM, then run kvm stat and some host recording/reporting/build-id tests. Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
9 daysperf tests evlist: Add basic evlist testIan Rogers1-0/+79
Add test that evlist reports expected events from perf record. Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
9 daysperf tests script dlfilter: Add a dlfilter testIan Rogers1-0/+107
Compile a simple dlfilter and make sure it removes samples from everything other than a test_loop. Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
9 daysperf tests kallsyms: Add basic kallsyms testIan Rogers1-0/+56
Add test that kallsyms finds a well known symbol and fails for another. Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
9 daysperf tests timechart: Add a perf timechart testIan Rogers1-0/+67
Basic coverage for `perf timechart` doing a record and then a basic sanity test of the generated SVG file. Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
9 daysperf tests top: Add basic perf top coverage testIan Rogers1-0/+74
The test starts a background thloop workload and monitors it using cpu-clock, ensuring test_loop appears in the output. Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
9 daysperf tests buildid: Add purge and remove testingIan Rogers1-26/+177
Add testing for the purge and remove commands. Use the noploop workload rather than just a return to avoid missing samples in the workload in perf record. Tidy up the cleanup code to clean up when signals happen. Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
9 daysperf tests c2c: Add a basic c2cIan Rogers1-0/+62
Add basic c2c record and report testing to gain some coverage. Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
9 daysperf c2c: Clean up some defensive gets and make asan cleanIan Rogers1-22/+14
To deal with histogram code that had missing gets, the c2c code had some defensive gets. Those other issues were cleaned up by the reference count checker; clean up the defensive gets for the c2c command here. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
9 daysperf jitdump: Fix missed dso__putIan Rogers1-0/+2
Reference count checking caught a missing dso__put following a machine__findnew_dso_id. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
9 daysperf mem-events: Don't leak online CPU mapIan Rogers1-1/+4
Reference count checking found the online CPU map was being gotten but not put. Add in the missing put. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
9 daysperf hist: In init, ensure mem_info is put on error pathsIan Rogers1-4/+2
Rather than exiting the internal map_symbols directly, put the mem-info, which does this and also lowers the reference count on the mem-info itself; otherwise the mem-info is leaked. Fixes: 56e144fe98260a0f ("perf mem_info: Add and use map_symbol__exit and addr_map_symbol__exit") Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
9 daysperf probe-event: Ensure probe event nsinfo is always clearedIan Rogers1-6/+6
Move nsinfo__zput from cleanup_perf_probe_events to clear_perf_probe_event so it is always executed. Clean up clear_perf_probe_events to not call nsinfo__zput and use the pev variable to avoid repeated array accesses. Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
9 daysperf symbol: Add missed dso__putIan Rogers1-0/+1
Add missing dso__put for the dso created in maps__split_kallsyms. Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
9 daysperf symbol-elf: Add missing puts on error pathIan Rogers1-1/+4
In dso__process_kernel_symbol if inserting a map fails, probably ENOMEM, then the reference count puts were missing on the dso and map. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
9 daysperf timechart: Add record support for output perf.data pathIan Rogers2-6/+12
The '-o' option exists for the SVG creation but not for `perf timechart record`. Add to better allow testing. Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
9 daysperf kvm: Fix debug assertionIan Rogers1-1/+1
There are 2 slots left for kvm_add_default_arch_event, fix the assertion so that debug builds don't fail the assert and to agree with the comment. Fixes: 45ff39f6e70aa55d0 ("perf tools kvm: Fix the potential out of range memory access issue") Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
9 daysperf vendor events intel: Update sierraforest events from 1.12 to 1.13Ian Rogers3-11/+20
The updated events were published in: https://github.com/intel/perfmon/commit/445e38f5128592f8b5c38da30267fff025e37613 Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
9 daysperf vendor events intel: Update pantherlake events from 1.00 to 1.02Ian Rogers5-2/+425
The updated events were published in: https://github.com/intel/perfmon/commit/6edacf434dffa046435de2f6a182c00df3cf4edc Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
9 daysperf vendor events intel: Update meteorlake events from 1.17 to 1.18Ian Rogers2-11/+11
The updated events were published in: https://github.com/intel/perfmon/commit/348f33fae477f281812c32e1c07812b7e35614dd Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
9 daysperf vendor events intel: Update lunarlake events from 1.18 to 1.19Ian Rogers4-14/+35
The updated events were published in: https://github.com/intel/perfmon/commit/09a0c74b23b5d20adf1f97e5022856568d05494c Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
9 daysperf vendor events intel: Update icelakex events from 1.28 to 1.30Ian Rogers2-3/+3
The updated events were published in: https://github.com/intel/perfmon/commit/dc6ffee20c74bfd21d7a7e338345578d4b7ca9ca Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
9 daysperf vendor events intel: Update graniterapids events from 1.15 to 1.16Ian Rogers3-3/+12
The updated events were published in: https://github.com/intel/perfmon/commit/b4acc3fd520eb098db41083010b65b75ae906c96 Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
9 daysperf vendor events intel: Update cascadelakex metric unitsIan Rogers2-7/+7
The updated metrics were published in: https://github.com/intel/perfmon/pull/348/commits/2dce436130ddfb8b442fc373d103f970de26cb78 Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
9 daysperf vendor events intel: Update arrowlake events from 1.13 to 1.14Ian Rogers8-19/+1111
The updated events were published in: https://github.com/intel/perfmon/commit/588dd77675039e1aaacee27a414cbcf3625c58a3 Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
9 daysperf vendor events intel: Update alderlake events from 1.34 to 1.35Ian Rogers5-22/+26
The updated events were published in: https://github.com/intel/perfmon/commit/c74f1cefa94d224cb3338507961b59d8a2a1c4e9 Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
9 daysperf arm_spe: Add CPU variants supporting common data source packetLeo Yan1-0/+5
Add the following CPU variants to the list for data source decoding: - Cortex-A715 [1] - Cortex-A78C [2] - Cortex-X1 [3] - Cortex-X4 [4] - Neoverse V3 [5] [1] https://developer.arm.com/documentation/101590/0103/Statistical-Profiling-Extension-Support/Statistical-Profiling-Extension-data-source-packet [2] https://developer.arm.com/documentation/102226/0002/Debug-descriptions/Statistical-Profiling-Extension/implementation-defined-features-of-SPE [3] https://developer.arm.com/documentation/101433/0102/Debug-descriptions/Statistical-Profiling-Extension/implementation-defined-features-of-SPE [4] https://developer.arm.com/documentation/102484/0003/Statistical-Profiling-Extension-support/Statistical-Profiling-Extension-data-source-packet [5] https://developer.arm.com/documentation/107734/0002/Statistical-Profiling-Extension-support/Statistical-Profiling-Extension-data-source-packet Signed-off-by: Leo Yan <leo.yan@arm.com> Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
9 daysperf auxtrace: Include sys/types.h for pid_tArnaldo Carvalho de Melo1-0/+1
In 754187ad73b73bcb ("perf build: Remove NO_AUXTRACE build option") sys/types.h was removed, which broke the build in all Alpine Linux releases, as musl libc has pid_t defined via sys/types.h, add it back. Fixes: 754187ad73b73bcb ("perf build: Remove NO_AUXTRACE build option") Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
9 daysobjtool: Add more robust signal error handling, detect and warn about stack ↵Josh Poimboeuf4-1/+141
overflows When the kernel build fails due to an objtool segfault, the error message is a bit obtuse and confusing:
  make[5]: *** [scripts/Makefile.build:503: drivers/scsi/qla2xxx/qla2xxx.o] Error 139
                                                                            ^^^^^^^^^
  make[5]: *** Deleting file 'drivers/scsi/qla2xxx/qla2xxx.o'
  make[4]: *** [scripts/Makefile.build:556: drivers/scsi/qla2xxx] Error 2
  make[3]: *** [scripts/Makefile.build:556: drivers/scsi] Error 2
  make[2]: *** [scripts/Makefile.build:556: drivers] Error 2
  make[1]: *** [/home/jpoimboe/git/linux/Makefile:2013: .] Error 2
  make: *** [Makefile:248: __sub-make] Error 2
Add a signal handler to objtool which prints an error message like the following if the local stack has overflowed (which can happen, as objtool makes heavy use of recursion):
  drivers/scsi/qla2xxx/qla2xxx.o: error: SIGSEGV: objtool stack overflow!
or:
  drivers/scsi/qla2xxx/qla2xxx.o: error: SIGSEGV: objtool crash!
Also, re-raise the signal so the core dump still gets triggered. [ mingo: Applied a build fix, added more comments and prettified the code. ] Suggested-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Alexandre Chartre <alexandre.chartre@oracle.com> Cc: David Laight <david.laight.linux@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://patch.msgid.link/mi4tihk4dbncn7belrhp6ooudhpw4vdggerktu5333w3gqf3uf@vqlhc3y667mg
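The approach described in this commit message can be sketched in user-space C as follows. This is only an illustration and not the objtool code: the names `near_stack_limit()` and `install_segv_handler()`, the 1 MiB slack, and the heuristic of comparing the fault address against the soft RLIMIT_STACK are all assumptions made for the example. It shows the three ingredients the message mentions: a handler that runs on an alternate stack (the normal stack may be exhausted), a check for whether the fault looks like a stack overflow, and re-raising the signal so the default action (and core dump) still happens.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <sys/resource.h>
#include <unistd.h>

static uintptr_t stack_top;        /* approximate top of the main stack */
static uintptr_t stack_limit;      /* soft RLIMIT_STACK in bytes (0 if unknown) */
static char alt_stack[64 * 1024];  /* the handler cannot run on an exhausted stack */

/* Hypothetical heuristic: does the fault sit roughly "limit" bytes below the top? */
int near_stack_limit(uintptr_t fault, uintptr_t top, uintptr_t limit)
{
	const uintptr_t slack = (uintptr_t)1 << 20;	/* assumed tolerance: 1 MiB */
	uintptr_t depth, lo, hi;

	if (fault > top)
		return 0;
	depth = top - fault;
	lo = limit > slack ? limit - slack : 0;
	hi = limit > UINTPTR_MAX - slack ? UINTPTR_MAX : limit + slack;
	return depth >= lo && depth <= hi;
}

static void put_msg(const char *msg)
{
	/* write(2) is async-signal-safe, unlike printf() */
	ssize_t ret = write(STDERR_FILENO, msg, strlen(msg));
	(void)ret;
}

static void segv_handler(int sig, siginfo_t *info, void *ctx)
{
	(void)ctx;
	if (near_stack_limit((uintptr_t)info->si_addr, stack_top, stack_limit))
		put_msg("error: SIGSEGV: stack overflow!\n");
	else
		put_msg("error: SIGSEGV: crash!\n");

	/* Restore the default action and re-raise so a core dump is still produced */
	signal(sig, SIG_DFL);
	raise(sig);
}

int install_segv_handler(void)
{
	struct rlimit rl;
	struct sigaction sa;
	stack_t ss;
	char marker;	/* its address approximates the current stack top */

	stack_top = (uintptr_t)&marker;
	stack_limit = getrlimit(RLIMIT_STACK, &rl) == 0 ? (uintptr_t)rl.rlim_cur : 0;

	memset(&ss, 0, sizeof(ss));
	ss.ss_sp = alt_stack;
	ss.ss_size = sizeof(alt_stack);
	if (sigaltstack(&ss, NULL))
		return -1;

	memset(&sa, 0, sizeof(sa));
	sa.sa_sigaction = segv_handler;
	sa.sa_flags = SA_SIGINFO | SA_ONSTACK;	/* run the handler on alt_stack */
	sigemptyset(&sa.sa_mask);
	return sigaction(SIGSEGV, &sa, NULL);
}
```

A program would call `install_segv_handler()` early in `main()`. The sketch assumes a downward-growing stack and prints fixed strings; the messages quoted in the commit also include the object file name.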
9 daysMerge tag 'nolibc-20251130-for-6.19-1' of ↵Linus Torvalds37-163/+290
git://git.kernel.org/pub/scm/linux/kernel/git/nolibc/linux-nolibc Pull nolibc updates from Thomas Weißschuh: - Preparations to the use of nolibc in UML: - Cleanup of sparse warnings - Library mode without _start() - More consistency when disabling errno - Unconditional installation of all architecture support files - Always 64-bit wide ino_t and off_t - Various cleanups and bug fixes * tag 'nolibc-20251130-for-6.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/nolibc/linux-nolibc: (25 commits) selftests/nolibc: error out on linker warnings selftests/nolibc: use lld to link loongarch binaries tools/nolibc: remove more __nolibc_enosys() fallbacks tools/nolibc: remove now superfluous overflow check in llseek tools/nolibc: use 64-bit off_t tools/nolibc: prefer the llseek syscall tools/nolibc: handle 64-bit off_t for llseek tools/nolibc: use 64-bit ino_t tools/nolibc: avoid using plain integer as NULL pointer tools/nolibc: add support for fchdir() tools/nolibc: clean up outdated comments in generic arch.h tools/nolibc: make the "headers" target install all supported archs tools/nolibc: add the more portable inttypes.h tools/nolibc: provide the portable sys/select.h tools/nolibc: add missing memchr() to string.h tools/nolibc: fix misleading help message regarding installation path tools/nolibc: add uio.h with readv and writev tools/nolibc: add option to disable runtime tools/nolibc: use __fallthrough__ rather than fallthrough tools/nolibc: implement %m if errno is not defined ...
9 daystools/power/x86/intel-speed-select: v1.24 releaseSrinivas Pandruvada1-1/+1
This version includes the following changes: - Check feature status to check if the feature enablement was successful - Reset SST-TF bucket structure to display valid bucket info Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
9 daystools/power/x86/intel-speed-select: Reset isst_turbo_freq_info for invalid ↵Srinivas Pandruvada1-0/+1
buckets With SST-TF version 2 only 3 buckets are present. The information in other buckets can be junk. So initialize the info structure of type isst_turbo_freq_info before issuing the ioctl to get bucket information. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
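The fix described above follows a common pattern: clear an output structure before a call that may fill in only part of it. A minimal sketch, assuming a hypothetical `struct turbo_info` with eight bucket slots and a stand-in `fake_query()` in place of the real ioctl (neither is taken from intel-speed-select):

```c
#include <string.h>

#define MAX_BUCKETS 8	/* hypothetical capacity of the info structure */

struct bucket {
	int high_priority_cores;
	int max_freq_mhz;
};

struct turbo_info {
	struct bucket buckets[MAX_BUCKETS];
};

/* Stand-in for the ioctl: writes only the first nr_valid buckets */
void fake_query(struct turbo_info *info, int nr_valid)
{
	for (int i = 0; i < nr_valid && i < MAX_BUCKETS; i++) {
		info->buckets[i].high_priority_cores = 2 * (i + 1);
		info->buckets[i].max_freq_mhz = 3000 + 100 * i;
	}
}

void get_turbo_info(struct turbo_info *info, int nr_valid)
{
	/* Clear first, so buckets the query does not touch read as zero, not junk */
	memset(info, 0, sizeof(*info));
	fake_query(info, nr_valid);
}
```

With three valid buckets, slots 3 through 7 then read as zero even if the caller's structure previously held stale data.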
9 daystools/power/x86/intel-speed-select: Check feature statusSrinivas Pandruvada1-2/+43
After changing the enable/disable status of SST-CP, SST-TF or SST-BF, check whether the hardware status change was successful. If it is not successful even after retries, return failure. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
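The check-after-set behaviour can be sketched as a small retry loop. Everything named here (`set_and_verify()`, `struct fake_hw`, the callbacks) is hypothetical and only illustrates the idea; the real tool talks to hardware and may wait between attempts, which this sketch omits.

```c
#include <stdbool.h>

typedef int (*set_status_fn)(void *hw, bool enable);
typedef int (*get_status_fn)(void *hw, bool *enabled);

/* Request a state, then read it back up to retries + 1 times; 0 on success */
int set_and_verify(void *hw, set_status_fn set, get_status_fn get,
		   bool enable, int retries)
{
	bool now;

	if (set(hw, enable))
		return -1;

	for (int i = 0; i <= retries; i++) {
		if (get(hw, &now) == 0 && now == enable)
			return 0;
	}
	return -1;	/* hardware never reported the requested state */
}

/* Fake hardware for illustration: the change shows up after "lag" reads */
struct fake_hw {
	bool requested;
	bool actual;
	int lag;
};

int fake_set(void *hw, bool enable)
{
	struct fake_hw *f = hw;

	f->requested = enable;
	return 0;
}

int fake_get(void *hw, bool *enabled)
{
	struct fake_hw *f = hw;

	if (f->lag > 0)
		f->lag--;
	else
		f->actual = f->requested;
	*enabled = f->actual;
	return 0;
}
```

Returning failure when the read-back never matches is the point of the change: a write that appears to succeed is not taken as proof that the feature was actually enabled or disabled.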
9 daysperf test: Add kallsyms split testNamhyung Kim4-0/+159
Create a fake root directory for /proc/{version,modules,kallsyms} in /tmp for testing. The kallsyms has a bad symbol in the module, which would cause the main map to be split. The test ensures it only has two maps - kernel and the module - and that it finds the initial map after the module without creating split maps like [kernel].0 and so on. $ perf test -vv "split kallsyms" 69: split kallsyms: --- start --- test child forked, pid 1016196 try to create fake root directory create kernel maps from the fake root directory maps__set_modules_path_dir: cannot open /tmp/perf-test.Zrv6Sy/lib/modules/X.Y.Z dir Problems setting modules path maps, continuing anyway... Failed to open /tmp/perf-test.Zrv6Sy/proc/kcore. Note /proc/kcore requires CAP_SYS_RAWIO capability to access. Using /tmp/perf-test.Zrv6Sy/proc/kallsyms for symbols kernel map loaded - check symbol and map ---- end(0) ---- 69: split kallsyms : Ok Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
9 daysperf tools: Use machine->root_dir to find /proc/kallsymsNamhyung Kim1-1/+7
This is for test functions to find the kallsyms correctly. It can find the machine from the kernel maps and use its root_dir. This is helpful for setting up a fake /proc directory for testing. Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
9 daysperf tools: Fallback to initial kernel map properlyNamhyung Kim1-1/+2
In maps__split_kallsyms(), it assumes new kernel map when it finds a symbol without module after any module and the initial kernel map has some symbols. Because it expects modules are out of the kernel map so modules should not have symbols in the kernel map. For example, the following memory map shows symbols and maps. Any symbols in the module 1 area will go to the module 1. The main kernel map starts at 0xffffffffbc200000. But if any symbol has a module between the symbols in that area, next symbols after 0xffffffffbd008000 will generate new kernel maps like [kernel].1.
  kernel address
                      |                     |
                      |                     |
  0xffffffffc0000000  |---------------------|
                      |      (symbols)      |
                      |         ...         |  <--- [kernel].N
  0xffffffffbc400000  |---------------------|
                      |      (symbols)      |
                      |      module 2       |  <--- bad?
  0xffffffffbc380000  |---------------------|
                      |         ...         |
                      |      (symbols)      |
                      |  [kernel.kallsyms]  |  <--- initial map
  0xffffffffbc200000  |---------------------|
                      |                     |
                      |                     |
  0xffffffffabcde000  |---------------------|
                      |      (symbols)      |
                      |      module 1       |
  0xffffffffabcd0000  |---------------------|
This is very fragile when the module has a symbol that falls into the main kernel map for some reason. My system has a livepatch module with such symbols. And it created a lot of new kernel maps after those symbols. But the symbol may have broken addresses and the later symbols can still be found in the initial kernel map. Let's check the symbol address in the initial map and use it if found. Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
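The decision this commit describes can be sketched as a range check against the initial kernel map before any new map is created. The types and names below (`struct kmap`, `place_symbol()`) are hypothetical and much simpler than the perf data structures, and the address range used in the usage note is taken from the diagram above purely for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

struct kmap {
	uint64_t start;	/* inclusive */
	uint64_t end;	/* exclusive */
};

enum sym_target {
	USE_INITIAL_MAP,
	NEW_KERNEL_MAP,
};

bool map_contains(const struct kmap *map, uint64_t addr)
{
	return addr >= map->start && addr < map->end;
}

/* Decide where a symbol without a module name goes */
enum sym_target place_symbol(const struct kmap *initial, uint64_t addr,
			     bool seen_module)
{
	/* No module symbol seen yet: still in the initial kernel map */
	if (!seen_module)
		return USE_INITIAL_MAP;

	/* The fallback: the address still lies inside the initial map */
	if (map_contains(initial, addr))
		return USE_INITIAL_MAP;

	/* Genuinely outside the initial map: a split map such as [kernel].N */
	return NEW_KERNEL_MAP;
}
```

With an initial map spanning 0xffffffffbc200000 to 0xffffffffc0000000, a kernel symbol at 0xffffffffbc400000 seen after a stray module symbol stays in the initial map instead of starting a new one.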
9 daysperf tools: Fix split kallsyms DSO countingNamhyung Kim1-2/+2
It's counted twice as it's increased after calling maps__insert(). I guess we want to increase it only after it's added properly. Reviewed-by: Ian Rogers <irogers@google.com> Fixes: 2e538c4a1847291cf ("perf tools: Improve kernel/modules symbol lookup") Signed-off-by: Namhyung Kim <namhyung@kernel.org>
9 daysperf tools: Mark split kallsyms DSOs as loadedNamhyung Kim1-0/+1
The maps__split_kallsyms() will split symbols to module DSOs if it comes from a module. It also handled some unusual kernel symbols after modules by creating new kernel maps like "[kernel].0". But they are pseudo DSOs to have those unexpected symbols. They should not be considered as unloaded kernel DSOs. Otherwise the dso__load() for them will end up calling dso__load_kallsyms() and then maps__split_kallsyms() again and again. Reviewed-by: Ian Rogers <irogers@google.com> Fixes: 2e538c4a1847291cf ("perf tools: Improve kernel/modules symbol lookup") Signed-off-by: Namhyung Kim <namhyung@kernel.org>
9 daysperf tools: Flush remaining samples w/o deferred callchainsNamhyung Kim1-0/+50
It's possible that some kernel samples don't have matching deferred callchain records when the profiling session was ended before the threads came back to userspace. Let's flush the samples before finishing the session. Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
9 daysperf tools: Merge deferred user callchainsNamhyung Kim11-1/+133
Save samples with deferred callchains in a separate list and deliver them after merging the user callchains. If users don't want to merge they can set tool->merge_deferred_callchains to false to prevent the behavior. With the previous result, perf script will now show the merged callchains. $ perf script ... pwd 2312 121.163435: 249113 cpu/cycles/P: ffffffff845b78d8 __build_id_parse.isra.0+0x218 ([kernel.kallsyms]) ffffffff83bb5bf6 perf_event_mmap+0x2e6 ([kernel.kallsyms]) ffffffff83c31959 mprotect_fixup+0x1e9 ([kernel.kallsyms]) ffffffff83c31dc5 do_mprotect_pkey+0x2b5 ([kernel.kallsyms]) ffffffff83c3206f __x64_sys_mprotect+0x1f ([kernel.kallsyms]) ffffffff845e6692 do_syscall_64+0x62 ([kernel.kallsyms]) ffffffff8360012f entry_SYSCALL_64_after_hwframe+0x76 ([kernel.kallsyms]) 7f18fe337fa7 mprotect+0x7 (/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2) 7f18fe330e0f _dl_sysdep_start+0x7f (/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2) 7f18fe331448 _dl_start_user+0x0 (/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2) ... The old output can be obtained using the --no-merge-callchain option. Also perf report can get the user callchain entry at the end. $ perf report --no-children --stdio -q -S __build_id_parse.isra.0 # symbol: __build_id_parse.isra.0 8.40% pwd [kernel.kallsyms] | ---__build_id_parse.isra.0 perf_event_mmap mprotect_fixup do_mprotect_pkey __x64_sys_mprotect do_syscall_64 entry_SYSCALL_64_after_hwframe mprotect _dl_sysdep_start _dl_start_user Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
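The merge step can be pictured as follows: samples whose kernel callchain ended in a cookie wait in a pending list, and when the deferred record carrying the same cookie arrives, its user frames are appended. This is a simplified sketch with made-up types (`struct pending_sample`, `struct deferred_chain`) and a fixed-size frame array; matching on the cookie alone is an assumption of the example, and the perf implementation is not reproduced here.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_FRAMES 16	/* hypothetical fixed capacity for the sketch */

struct pending_sample {
	uint64_t cookie;		/* identifies the deferred user callchain */
	size_t nr;			/* frames currently stored in ips[] */
	uint64_t ips[MAX_FRAMES];	/* kernel frames, later followed by user frames */
	int merged;
};

struct deferred_chain {
	uint64_t cookie;
	size_t nr;
	const uint64_t *ips;		/* user-space frames */
};

/* Append the deferred user frames to every waiting sample with the same cookie */
size_t merge_deferred(struct pending_sample *pending, size_t count,
		      const struct deferred_chain *d)
{
	size_t merged = 0;

	for (size_t i = 0; i < count; i++) {
		struct pending_sample *s = &pending[i];

		if (s->merged || s->cookie != d->cookie)
			continue;

		/* Copy as many user frames as fit after the kernel frames */
		for (size_t j = 0; j < d->nr && s->nr < MAX_FRAMES; j++)
			s->ips[s->nr++] = d->ips[j];

		s->merged = 1;
		merged++;
	}
	return merged;
}
```

Samples that are still unmerged when the session ends would have to be delivered as they are, with kernel frames only.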
9 daysperf script: Display PERF_RECORD_CALLCHAIN_DEFERREDNamhyung Kim2-1/+93
Handle the deferred callchains in the script output. $ perf script ... pwd 2312 121.163435: 249113 cpu/cycles/P: ffffffff845b78d8 __build_id_parse.isra.0+0x218 ([kernel.kallsyms]) ffffffff83bb5bf6 perf_event_mmap+0x2e6 ([kernel.kallsyms]) ffffffff83c31959 mprotect_fixup+0x1e9 ([kernel.kallsyms]) ffffffff83c31dc5 do_mprotect_pkey+0x2b5 ([kernel.kallsyms]) ffffffff83c3206f __x64_sys_mprotect+0x1f ([kernel.kallsyms]) ffffffff845e6692 do_syscall_64+0x62 ([kernel.kallsyms]) ffffffff8360012f entry_SYSCALL_64_after_hwframe+0x76 ([kernel.kallsyms]) b00000006 (cookie) ([unknown]) pwd 2312 121.163447: DEFERRED CALLCHAIN [cookie: b00000006] 7f18fe337fa7 mprotect+0x7 (/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2) 7f18fe330e0f _dl_sysdep_start+0x7f (/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2) 7f18fe331448 _dl_start_user+0x0 (/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2) Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
9 daysperf record: Add --call-graph fp,defer option for deferred callchainsNamhyung Kim6-3/+41
Add a new callchain record mode option for deferred callchains. For now it only works with FP (frame-pointer) mode. And add the missing feature detection logic to clear the flag on old kernels. $ perf record --call-graph fp,defer -vv true ... ------------------------------------------------------------ perf_event_attr: type 0 (PERF_TYPE_HARDWARE) size 136 config 0 (PERF_COUNT_HW_CPU_CYCLES) { sample_period, sample_freq } 4000 sample_type IP|TID|TIME|CALLCHAIN|PERIOD read_format ID|LOST disabled 1 inherit 1 mmap 1 comm 1 freq 1 enable_on_exec 1 task 1 sample_id_all 1 mmap2 1 comm_exec 1 ksymbol 1 bpf_event 1 defer_callchain 1 defer_output 1 ------------------------------------------------------------ sys_perf_event_open: pid 162755 cpu 0 group_fd -1 flags 0x8 sys_perf_event_open failed, error -22 switching off deferred callchain support Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
9 daysMerge tag 'thermal-6.19-rc1' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull thermal control updates from Rafael Wysocki: "These add Nova Lake processor support to the Intel thermal drivers and DPTF code, update thermal control documentation, simplify the ACPI DPTF code related to thermal control, add QCS8300 compatible to the tsens thermal DT bindings, add DT bindings for NXP i.MX91 thermal module and add support for it to the imx91 thermal driver, update a few other thermal drivers and fix a format string issue in a thermal utility: - Add Nova Lake processor thermal device to the int340x processor_thermal driver, add DLVR support for Nova Lake to it, add Nova Lake support to the ACPI DPTF code, document thermal throttling on Intel platforms, and update workload type hint interface documentation (Srinivas Pandruvada) - Remove int340x thermal scan handler from the ACPI DPTF code because it turned out to be unnecessary (Slawomir Rosek) - Clean up the Intel int340x thermal driver (Kaushlendra Kumar) - Document the RZ/V2H TSU DT bindings (Ovidiu Panait) - Document the Kaanapali Temperature Sensor (Manaf Meethalavalappu Pallikunhi) - Document R-Car Gen4 and RZ/G2 support in driver comment (Marek Vasut) - Convert to DEFINE_SIMPLE_DEV_PM_OPS() in R-Car [Gen3] (Geert Uytterhoeven) - Fix format string bug in thermal-engine (Malaya Kumar Rout) - Make ipq5018 tsens standalone compatible (George Moussalem) - Add the QCS8300 compatible for QCom Tsens (Gaurav Kohli) - Add support for the NXP i.MX91 thermal module, including the DT bindings (Pengfei Li)" * tag 'thermal-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: thermal/drivers/imx91: Add support for i.MX91 thermal monitoring unit dt-bindings: thermal: fsl,imx91-tmu: add bindings for NXP i.MX91 thermal module dt-bindings: thermal: tsens: Add QCS8300 compatible dt-bindings: thermal: qcom-tsens: make ipq5018 tsens standalone compatible tools/thermal/thermal-engine: Fix format string bug in thermal-engine 
docs: driver-api/thermal/intel_dptf: Add new workload type hint thermal/drivers/rcar_gen3: Convert to DEFINE_SIMPLE_DEV_PM_OPS() thermal/drivers/rcar: Convert to DEFINE_SIMPLE_DEV_PM_OPS() Documentation: thermal: Document thermal throttling on Intel platforms ACPI: DPTF: Support Nova Lake thermal: intel: int340x: Add DLVR support for Nova Lake thermal: int340x: processor_thermal: Add Nova Lake processor thermal device thermal: intel: int340x: Replace sprintf() with sysfs_emit() thermal: intel: int340x: Use symbolic constant for UUID comparison thermal/drivers/rcar_gen3: Document R-Car Gen4 and RZ/G2 support in driver comment dt-bindings: thermal: qcom-tsens: document the Kaanapali Temperature Sensor dt-bindings: thermal: r9a09g047-tsu: Document RZ/V2H TSU ACPI: DPTF: Remove int340x thermal scan handler thermal: intel: Select INT340X_THERMAL from INTEL_SOC_DTS_THERMAL
10 daysMerge tag 'pm-6.19-rc1' of ↵Linus Torvalds1-11/+21
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull power management updates from Rafael Wysocki: "There are quite a few interesting things here, including new hardware support, new features, some bug fixes and documentation updates. In addition, there is the usual bunch of minor fixes and cleanups all over. In the new hardware support category, there are intel_pstate and intel_rapl driver updates to support new processors, Panther Lake, Wildcat Lake, Nova Lake, and Diamond Rapids in the OOB mode, OPP and bandwidth allocation support in the tegra186 cpufreq driver, and JH7110S SOC support in dt-platdev cpufreq. The new features are the PM QoS CPU latency limit for suspend-to-idle, the netlink support for the energy model management, support for terminating system suspend via a wakeup event during the sync of file systems, configurable number of hibernation compression threads, the runtime PM auto-cleanup macros, and the "poweroff" PM event that is expected to be used during system shutdown. Bugs are mostly fixed in cpuidle governors, but there are also fixes elsewhere, like in the amd-pstate cpufreq driver. Documentation updates include, but are not limited to, a new doc on debugging shutdown hangs, cross-referencing fixes and cleanups in the intel_pstate documentation, and updates of comments in the core hibernation code. 
Specifics: - Introduce and document a QoS limit on CPU exit latency during wakeup from suspend-to-idle (Ulf Hansson) - Add support for building libcpupower statically (Zuo An) - Add support for sending netlink notifications to user space on energy model updates (Changwoo Mini, Peng Fan) - Minor improvements to the Rust OPP interface (Tamir Duberstein) - Fixes to scope-based pointers in the OPP library (Viresh Kumar) - Use residency threshold in polling state override decisions in the menu cpuidle governor (Aboorva Devarajan) - Add sanity check for exit latency and target residency in the cpufreq core (Rafael Wysocki) - Use this_cpu_ptr() where possible in the teo governor (Christian Loehle) - Rework the handling of tick wakeups in the teo cpuidle governor to increase the likelihood of stopping the scheduler tick in the cases when tick wakeups can be counted as non-timer ones (Rafael Wysocki) - Fix a reverse condition in the teo cpuidle governor and drop a misguided target residency check from it (Rafael Wysocki) - Clean up multiple minor defects in the teo cpuidle governor (Rafael Wysocki) - Update header inclusion to make it follow the Include What You Use principle (Andy Shevchenko) - Enable MSR-based RAPL PMU support in the intel_rapl power capping driver and arrange for using it on the Panther Lake and Wildcat Lake processors (Kuppuswamy Sathyanarayanan) - Add support for Nova Lake and Wildcat Lake processors to the intel_rapl power capping driver (Kaushlendra Kumar, Srinivas Pandruvada) - Add OPP and bandwidth support for Tegra186 (Aaron Kling) - Optimizations for parameter array handling in the amd-pstate cpufreq driver (Mario Limonciello) - Fix for mode changes with offline CPUs in the amd-pstate cpufreq driver (Gautham Shenoy) - Preserve freq_table_sorted across suspend/hibernate in the cpufreq core (Zihuan Zhang) - Adjust energy model rules for Intel hybrid platforms in the intel_pstate cpufreq driver and improve printing of debug messages in it (Rafael 
Wysocki) - Replace deprecated strcpy() in cpufreq_unregister_governor() (Thorsten Blum) - Fix duplicate hyperlink target errors in the intel_pstate cpufreq driver documentation and use :ref: directive for internal linking in it (Swaraj Gaikwad, Bagas Sanjaya) - Add Diamond Rapids OOB mode support to the intel_pstate cpufreq driver (Kuppuswamy Sathyanarayanan) - Use mutex guard for driver locking in the intel_pstate driver and eliminate some code duplication from it (Rafael Wysocki) - Replace udelay() with usleep_range() in ACPI cpufreq (Kaushlendra Kumar) - Minor improvements to various cpufreq drivers (Christian Marangi, Hal Feng, Jie Zhan, Marco Crivellari, Miaoqian Lin, and Shuhao Fu) - Replace snprintf() with scnprintf() in show_trace_dev_match() (Kaushlendra Kumar) - Fix memory allocation error handling in pm_vt_switch_required() (Malaya Kumar Rout) - Introduce CALL_PM_OP() macro and use it to simplify code in generic PM operations (Kaushlendra Kumar) - Add module param to backtrace all CPUs in the device power management watchdog (Sergey Senozhatsky) - Rework message printing in swsusp_save() (Rafael Wysocki) - Make it possible to change the number of hibernation compression threads (Xueqin Luo) - Clarify that only cgroup1 freezer uses PM freezer (Tejun Heo) - Add document on debugging shutdown hangs to PM documentation and correct a mistaken configuration option in it (Mario Limonciello) - Shut down wakeup source timer before removing the wakeup source from the list (Kaushlendra Kumar, Rafael Wysocki) - Introduce new PMSG_POWEROFF event for system shutdown handling with the help of PM device callbacks (Mario Limonciello) - Make pm_test delay interruptible by wakeup events (Riwen Lu) - Clean up kernel-doc comment style usage in the core hibernation code and remove unuseful comments from it (Sunday Adelodun, Rafael Wysocki) - Add support for handling wakeup events and aborting the suspend process while it is syncing file systems (Samuel Wu, Rafael Wysocki) - 
Add WQ_UNBOUND to pm_wq workqueue (Marco Crivellari) - Add runtime PM wrapper macros for ACQUIRE()/ACQUIRE_ERR() and use them in the PCI core and the ACPI TAD driver (Rafael Wysocki) - Improve runtime PM in the ACPI TAD driver (Rafael Wysocki) - Update pm_runtime_allow/forbid() documentation (Rafael Wysocki) - Fix typos in runtime.c comments (Malaya Kumar Rout) - Move governor.h from devfreq under include/linux/ and rename to devfreq-governor.h to allow devfreq governor definitions outside of drivers/devfreq/ (Dmitry Baryshkov) - Use min() to improve readability in tegra30-devfreq.c (Thorsten Blum) - Fix potential use-after-free issue of OPP handling in hisi_uncore_freq.c (Pengjie Zhang) - Fix typo in DFSO_DOWNDIFFERENTIAL macro name in governor_simpleondemand.c in devfreq (Riwen Lu)" * tag 'pm-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (96 commits) PM / devfreq: Fix typo in DFSO_DOWNDIFFERENTIAL macro name cpuidle: Warn instead of bailing out if target residency check fails cpuidle: Update header inclusion Documentation: power/cpuidle: Document the CPU system wakeup latency QoS cpuidle: Respect the CPU system wakeup QoS limit for cpuidle sched: idle: Respect the CPU system wakeup QoS limit for s2idle pmdomain: Respect the CPU system wakeup QoS limit for cpuidle pmdomain: Respect the CPU system wakeup QoS limit for s2idle PM: QoS: Introduce a CPU system wakeup QoS limit cpuidle: governors: teo: Add missing space to the description PM: hibernate: Extra cleanup of comments in swap handling code PM / devfreq: tegra30: use min to simplify actmon_cpu_to_emc_rate PM / devfreq: hisi: Fix potential UAF in OPP handling PM / devfreq: Move governor.h to a public header location powercap: intel_rapl: Enable MSR-based RAPL PMU support powercap: intel_rapl: Prepare read_raw() interface for atomic-context callers cpufreq: qcom-nvmem: fix compilation warning for qcom_cpufreq_ipq806x_match_list PM: sleep: Call pm_sleep_fs_sync() instead of 
ksys_sync_helper() PM: sleep: Add support for wakeup during filesystem sync cpufreq: ACPI: Replace udelay() with usleep_range() ...
10 daysMerge tag 'acpi-6.19-rc1' of ↵Linus Torvalds1-2/+5
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI updates from Rafael Wysocki: "These add Microsoft fan extensions support to the ACPI fan driver, fix a bug in ACPICA, update other ACPI drivers (processor, time and alarm device), update ACPI power management code and ACPI device properties management, and fix an ACPI utility: - Avoid walking the ACPI namespace in the AML interpreter if the starting node cannot be determined (Cryolitia PukNgae) - Use min() instead of min_t() in the ACPI device properties handling code to avoid discarding significant bits (David Laight) - Fix potential fwnode refcount leak in acpi_fwnode_graph_parse_endpoint() that may prevent the parent fwnode from being released (Haotian Zhang) - Rework acpi_graph_get_next_endpoint() to use ACPI functions only, remove unnecessary conditionals from it to make it easier to follow, and make acpi_get_next_subnode() static (Sakari Ailus) - Drop unused function acpi_get_lps0_constraint(), make some Low-Power S0 callback functions for suspend-to-idle static, and rearrange the code retrieving Low-Power S0 constraints so it only runs when the constraints are actually used (Rafael Wysocki) - Drop redundant locking from the ACPI battery driver (Rafael Wysocki) - Improve runtime PM in the ACPI time and alarm device (TAD) driver using guard macros and rearrange code related to runtime PM in acpi_tad_remove() (Rafael Wysocki) - Add support for Microsoft fan extensions to the ACPI fan driver along with notification support and work around a 64-bit firmware bug in that driver (Armin Wolf) - Use ACPI_FREE() to free ACPI buffer in the ACPI DPTF code (Kaushlendra Kumar) - Fix a memory leak and a resource leak in the ACPI pfrut utility (Malaya Kumar Rout) - Replace `core::mem::zeroed` with `pin_init::zeroed` in the ACPI Rust code (Siyuan Huang) - Update the ACPI code to use the new style of allocating workqueues and new global workqueues (Marco Crivellari) - Fix two spelling mistakes in the ACPI 
code (Chu Guangqing) - Fix ISAPNP to generate uevents to auto-load modules (René Rebe) - Relocate the state flags initialization in the ACPI processor idle driver and drop redundant C-state count checks from it (Huisong Li) - Fix map_x2apic_id() in the ACPI processor core driver for amd-pstate on am4 (René Rebe)" * tag 'acpi-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (30 commits) ACPI: PM: Fix a spelling mistake ACPI: LPSS: Fix a spelling mistake ACPI: processor_core: fix map_x2apic_id for amd-pstate on am4 ACPICA: Avoid walking the Namespace if start_node is NULL ACPI: tools: pfrut: fix memory leak and resource leak in pfrut.c ACPI: property: use min() instead of min_t() PNP: Fix ISAPNP to generate uevents to auto-load modules ACPI: property: Fix fwnode refcount leak in acpi_fwnode_graph_parse_endpoint() ACPI: DPTF: Use ACPI_FREE() for ACPI buffer deallocation ACPI: processor: idle: Drop redundant C-state count checks ACPI: thermal: Add WQ_PERCPU to alloc_workqueue() users ACPI: OSL: Add WQ_PERCPU to alloc_workqueue() users ACPI: EC: Add WQ_PERCPU to alloc_workqueue() users ACPI: OSL: replace use of system_wq with system_percpu_wq ACPI: scan: replace use of system_unbound_wq with system_dfl_wq ACPI: fan: Add support for Microsoft fan extensions ACPI: fan: Add hwmon notification support ACPI: fan: Add basic notification support ACPI: TAD: Improve runtime PM using guard macros ACPI: TAD: Rearrange runtime PM operations in acpi_tad_remove() ...
10 daysMerge tag 'arm64-upstream' of ↵Linus Torvalds9-23/+83
git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux Pull arm64 updates from Catalin Marinas: "These are the arm64 updates for 6.19. The biggest part is the Arm MPAM driver under drivers/resctrl/. There's a patch touching mm/ to handle spurious faults for huge pmd (similar to the pte version). The corresponding arm64 part allows us to avoid the TLB maintenance if a (huge) page is reused after a write fault. There's EFI refactoring to allow runtime services with preemption enabled and the rest is the usual perf/PMU updates and several cleanups/typos. Summary: Core features: - Basic Arm MPAM (Memory system resource Partitioning And Monitoring) driver under drivers/resctrl/ which makes use of the fs/resctrl/ API Perf and PMU: - Avoid cycle counter on multi-threaded CPUs - Extend CSPMU device probing and add additional filtering support for NVIDIA implementations - Add support for the PMUs on the NoC S3 interconnect - Add additional compatible strings for new Cortex and C1 CPUs - Add support for data source filtering to the SPE driver - Add support for i.MX8QM and "DB" PMU in the imx PMU driver Memory management: - Avoid broadcast TLBI if page reused in write fault - Elide TLB invalidation if the old PTE was not valid - Drop redundant cpu_set_*_tcr_t0sz() macros - Propagate pgtable_alloc() errors outside of __create_pgd_mapping() - Propagate return value from __change_memory_common() ACPI and EFI: - Call EFI runtime services without disabling preemption - Remove unused ACPI function Miscellaneous: - ptrace support to disable streaming on SME-only systems - Improve sysreg generation to include a 'Prefix' descriptor - Replace __ASSEMBLY__ with __ASSEMBLER__ - Align register dumps in the kselftest zt-test - Remove some no longer used macros/functions - Various spelling corrections" * tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (94 commits) arm64/mm: Document why linear map split failure upon vm_reset_perms is not problematic 
arm64/pageattr: Propagate return value from __change_memory_common arm64/sysreg: Remove unused define ARM64_FEATURE_FIELD_BITS KVM: arm64: selftests: Consider all 7 possible levels of cache KVM: arm64: selftests: Remove ARM64_FEATURE_FIELD_BITS and its last user arm64: atomics: lse: Remove unused parameters from ATOMIC_FETCH_OP_AND macros Documentation/arm64: Fix the typo of register names ACPI: GTDT: Get rid of acpi_arch_timer_mem_init() perf: arm_spe: Add support for filtering on data source perf: Add perf_event_attr::config4 perf/imx_ddr: Add support for PMU in DB (system interconnects) perf/imx_ddr: Get and enable optional clks perf/imx_ddr: Move ida_alloc() from ddr_perf_init() to ddr_perf_probe() dt-bindings: perf: fsl-imx-ddr: Add compatible string for i.MX8QM, i.MX8QXP and i.MX8DXL arm64: remove duplicate ARCH_HAS_MEM_ENCRYPT arm64: mm: use untagged address to calculate page index MAINTAINERS: new entry for MPAM Driver arm_mpam: Add kunit tests for props_mismatch() arm_mpam: Add kunit test for bitmap reset arm_mpam: Add helper to reset saved mbwu state ...
10 daysMerge tag 's390-6.19-1' of ↵Linus Torvalds9-69/+2
git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux Pull s390 updates from Heiko Carstens: - Provide a new interface for dynamic configuration and deconfiguration of hotplug memory, which works both with and without memmap_on_memory support. This makes the way memory hotplug is handled on s390 much more similar to other architectures - Remove compat support. There shouldn't be any compat user space around anymore, therefore get rid of a lot of code which also doesn't need to be tested anymore - Add stackprotector support. GCC 16 will get new compiler options that allow generating the code required for kernel stackprotector support - Merge pai_crypto and pai_ext PMU drivers into a new driver. This removes a lot of duplicated code. The new driver is also extensible and can support new PMUs - Add driver override support for AP queues - Rework and extend zcrypt and AP trace events to allow for tracing of crypto requests - Support block sizes larger than 65535 bytes for CCW tape devices - Since the rework of the virtual kernel address space the module area and the kernel image are within the same 4GB area. This eliminates the need for weak per-CPU variables. 
Get rid of ARCH_MODULE_NEEDS_WEAK_PER_CPU - Various other small improvements and fixes * tag 's390-6.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (92 commits) watchdog: diag288_wdt: Remove KMSG_COMPONENT macro s390/entry: Use lay instead of aghik s390/vdso: Get rid of -m64 flag handling s390/vdso: Rename vdso64 to vdso s390: Rename head64.S to head.S s390/vdso: Use common STABS_DEBUG and DWARF_DEBUG macros s390: Add stackprotector support s390/modules: Simplify module_finalize() slightly s390: Remove KMSG_COMPONENT macro s390/percpu: Get rid of ARCH_MODULE_NEEDS_WEAK_PER_CPU s390/ap: Restrict driver_override versus apmask and aqmask use s390/ap: Rename mutex ap_perms_mutex to ap_attr_mutex s390/ap: Support driver_override for AP queue devices s390/ap: Use all-bits-one apmask/aqmask for vfio in_use() checks s390/debug: Update description of resize operation s390/syscalls: Switch to generic system call table generation s390/syscalls: Remove system call table pointer from thread_struct s390/uapi: Remove 31 bit support from uapi header files s390: Remove compat support tools: Remove s390 compat support ...
10 daysperf tools: Minimal DEFERRED_CALLCHAIN supportNamhyung Kim9-3/+73
Add a new event type for deferred callchains and a new callback for the struct perf_tool. For now it doesn't actually handle the deferred callchains but it just marks the sample if it has the PERF_CONTEXT_USER_DEFERRED in the callchain array. At least, perf report can dump the raw data with this change. Actually this requires the next commit to enable attr.defer_callchain, but if you already have a data file, it'll show the following result.

  $ perf report -D
  ...
  0x2158@perf.data [0x40]: event: 22
  .
  . ... raw event: size 64 bytes
  .  0000:  16 00 00 00 02 00 40 00 06 00 00 00 0b 00 00 00  ......@.........
  .  0010:  03 00 00 00 00 00 00 00 a7 7f 33 fe 18 7f 00 00  ..........3.....
  .  0020:  0f 0e 33 fe 18 7f 00 00 48 14 33 fe 18 7f 00 00  ..3.....H.3.....
  .  0030:  08 09 00 00 08 09 00 00 e6 7a e7 35 1c 00 00 00  .........z.5....

  121163447014 0x2158 [0x40]: PERF_RECORD_CALLCHAIN_DEFERRED(IP, 0x2): 2312/2312: 0xb00000006
  ... FP chain: nr:3
  .....  0: 00007f18fe337fa7
  .....  1: 00007f18fe330e0f
  .....  2: 00007f18fe331448
  : unhandled!

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
10 daystools headers UAPI: Sync linux/perf_event.h for deferred callchainsNamhyung Kim1-1/+20
It needs to sync with the kernel to support user space changes for the deferred callchains. Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
10 daysperf jevents: Skip optional metrics in metric group listIan Rogers1-3/+5
For metric groups, skip metrics in the list that are None. This allows functions to better optionally return metrics. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Falcon <thomas.falcon@intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
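As a rough illustration of the idea, and not the jevents code itself, optional metrics can be modelled as builder functions that return None, with the group filtering those entries out. The names `Metric`, `MetricGroup`, and `maybe_metric` below are hypothetical stand-ins.

```python
from typing import List, Optional


class Metric:
    """Hypothetical stand-in for a jevents metric."""

    def __init__(self, name: str):
        self.name = name


class MetricGroup:
    """Hypothetical group that tolerates None entries in its metric list."""

    def __init__(self, name: str, metrics: List[Optional[Metric]]):
        self.name = name
        # Skip entries that are None so a builder may opt out of a metric.
        self.metrics = [m for m in metrics if m is not None]


def maybe_metric(name: str, supported: bool) -> Optional[Metric]:
    # Return None when the metric does not apply to the current CPU model.
    return Metric(name) if supported else None


group = MetricGroup("cache", [maybe_metric("l1_miss", True),
                              maybe_metric("l3_miss", False)])
print([m.name for m in group.metrics])
```

With this shape, a function that cannot produce a metric on a given model simply returns None instead of forcing every caller to special-case it.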
10 daysperf jevents: Drop duplicate pending metricsIan Rogers1-1/+2
Drop adding a pending metric if there is an existing one. Ensure the PMUs differ for hybrid systems. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Falcon <thomas.falcon@intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
10 daysperf jevents: Move json encoding to its own functionsIan Rogers1-12/+22
Have dedicated encode functions rather than having them embedded in MetricGroup. This is to provide some uniformity in the Metric ToXXX routines. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Falcon <thomas.falcon@intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
10 daysperf jevents: Add threshold expressions to MetricIan Rogers1-1/+6
Allow threshold expressions for metrics to be generated. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Falcon <thomas.falcon@intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
10 daysperf jevents: Term list fix in event parsingIan Rogers1-1/+6
Fix events seemingly broken apart at a comma. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Falcon <thomas.falcon@intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
10 daysperf jevents: Support parsing negative exponentsIan Rogers2-1/+5
Support negative exponents when parsing from a json metric string by making the numbers after the 'e' optional in the 'Event' insertion fix up. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Falcon <thomas.falcon@intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
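A simplified sketch of the kind of fix-up the message describes, under the assumption that event names are first wrapped by a regular expression and numbers with exponents are repaired afterwards. The patterns and the `wrap_events` helper are illustrative and are not copied from jevents.

```python
import re


def wrap_events(expr: str) -> str:
    """Wrap identifier-like tokens in Event(r"...") and repair split numbers."""
    # Step 1 (simplified, hypothetical pattern): treat every identifier as an event.
    wrapped = re.sub(r'([a-zA-Z_][a-zA-Z0-9_.]*)', r'Event(r"\1")', expr)
    # Step 2: a number such as 2e6 or 1e-3 had its exponent marker wrapped too.
    # The digits after 'e' are optional, so "1e-3" (where '-' ends the
    # identifier match right after 'e') is repaired as well as "2e6".
    return re.sub(r'([0-9]+)Event\(r"(e[0-9]*)"\)', r'\1\2', wrapped)


positive = wrap_events("cycles * 2e6")
negative = wrap_events("cycles * 1e-3")
print(positive)
print(negative)
```

If the second pattern required at least one digit after the `e`, the `1e-3` case would be left as a stray `Event(r"e")` in the middle of the number.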
10 daysperf jevents: Allow metric groups not to be namedIan Rogers1-1/+2
It can be convenient to have unnamed metric groups for the sake of organizing other metrics and metric groups. An unspecified name shouldn't contribute to the MetricGroup json value, so don't record it. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Falcon <thomas.falcon@intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
10 daysperf jevents: Add descriptions to metricgroup abstractionIan Rogers1-2/+12
Add a function to recursively generate metric group descriptions. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Falcon <thomas.falcon@intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
10 daysperf jevents: Update metric constraint supportIan Rogers1-4/+10
Previous metric constraints were binary, either none or don't group when the NMI watchdog is present. Update to match the definitions in 'enum metric_event_groups' in pmu-events.h. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Falcon <thomas.falcon@intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
10 daysperf jevents: Allow multiple metricgroups.json filesIan Rogers1-2/+2
Allow multiple metricgroups.json files by handling any file ending with metricgroups.json as a metricgroups file. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Falcon <thomas.falcon@intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
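The suffix rule can be sketched as follows. The helper name and the example file names are hypothetical, chosen only to show which names would and would not be treated as metricgroups files.

```python
def is_metricgroups_file(filename: str) -> bool:
    # Any file whose name ends with "metricgroups.json" qualifies,
    # not only a file named exactly "metricgroups.json".
    return filename.endswith("metricgroups.json")


candidates = ["metricgroups.json", "extra-metricgroups.json", "cache.json"]
selected = [name for name in candidates if is_metricgroups_file(name)]
print(selected)
```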
10 daysperf ilist: Be tolerant of reading a metric on the wrong CPUIan Rogers1-2/+6
This happens on hybrid machine metrics. Be tolerant and don't cause the ilist application to crash with an exception. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Falcon <thomas.falcon@intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
10 daysperf python: Correct copying of metric_leader in an evselIan Rogers2-22/+61
Ensure the metric_leader is copied and set up correctly. In compute_metric determine the correct metric_leader event to match the requested CPU. Fixes the handling of metrics particularly on hybrid machines. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Falcon <thomas.falcon@intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
10 daysperf test: Add python JIT dump testNamhyung Kim1-0/+81
Add a test case for the python interpreter like below so that we can make sure it won't break again. To validate the effect of build-ID generation, it adds and removes the JIT'ed DSOs to/from the build-ID cache for the test.

  $ perf test -vv jitdump
   84: python profiling with jitdump:
  --- start ---
  test child forked, pid 214316
  Run python with -Xperf_jit
  [ perf record: Woken up 5 times to write data ]
  [ perf record: Captured and wrote 1.180 MB /tmp/__perf_test.perf.data.XbqZNm (140 samples) ]
  Generate JIT-ed DSOs using perf inject
  Add JIT-ed DSOs to the build-ID cache
  Check the symbol containing the script name
  Found 108 matching lines
  Remove JIT-ed DSOs from the build-ID cache
  ---- end(0) ----
   84: python profiling with jitdump : Ok

Cc: Pablo Galindo <pablogsal@gmail.com>
Link: https://docs.python.org/3/howto/perf_profiling.html#how-to-work-without-frame-pointers
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
10 daysperf jitdump: Add sym/str-tables to build-ID generationNamhyung Kim1-2/+30
It was reported that python backtrace with JIT dump was broken after the change to the built-in SHA-1 implementation. It seems python generates the same JIT code for each function. They will become separate DSOs but the contents are the same. The only difference is in the symbol name. But this caused a problem: every JIT'ed DSO will have the same build-ID, which confuses perf. And it resulted in no python symbols (from JIT) in the output. Looking back at the original code before the conversion, it used the load_addr as well as the code section to distinguish each DSO. But it'd be better to use the contents of symtab and strtab instead as it aligns with some linker behaviors. This patch adds a buffer to save all the contents in a single place for SHA-1 calculation. Probably we need to add sha1_update() or similar to update the existing hash value with different contents and use it here. But it's out of scope for this change and I'd like something that can be backported to the stable trees easily. Reviewed-by: Ian Rogers <irogers@google.com> Cc: Eric Biggers <ebiggers@kernel.org> Cc: Pablo Galindo <pablogsal@gmail.com> Cc: Fangrui Song <maskray@sourceware.org> Link: https://github.com/python/cpython/issues/139544 Fixes: e3f612c1d8f3945b ("perf genelf: Remove libcrypto dependency and use built-in sha1()") Signed-off-by: Namhyung Kim <namhyung@kernel.org>
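The effect can be sketched with Python's hashlib. This is only an analogy for the approach described above: the byte strings are made up, and `build_id` is a hypothetical helper, not perf's genelf code.

```python
import hashlib


def build_id(code: bytes, symtab: bytes, strtab: bytes) -> str:
    # Gather all inputs in one buffer, then hash the buffer once,
    # mirroring the single-buffer approach the message describes.
    buf = code + symtab + strtab
    return hashlib.sha1(buf).hexdigest()


# Two JIT'ed functions with identical code that differ only in symbol name.
code = b"\x90\x90\xc3"        # made-up machine code bytes
symtab = b"\x01\x00\x00\x00"  # made-up symbol table bytes

code_only_a = hashlib.sha1(code).hexdigest()
code_only_b = hashlib.sha1(code).hexdigest()
with_names_a = build_id(code, symtab, b"\0py::foo\0")
with_names_b = build_id(code, symtab, b"\0py::bar\0")

print(code_only_a == code_only_b)    # hashing code alone collides
print(with_names_a == with_names_b)  # including the string table does not
```

Hashing the code alone gives both DSOs the same ID, while adding the string table makes the symbol name part of the hashed input.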
10 daysperf test: Fix hybrid testing of event fallback testIan Rogers1-17/+2
The mem-loads-aux event exists on hybrid systems but the "cpu" PMU does not. This causes an event parsing error which erroneously makes the test look like it is failing. Avoid naming the PMU to avoid this. Rather than cleaning up perf.data in the directory the test is run, explicitly send the 'perf record' output to /dev/null and avoid any cleanup scripts. Fixes: fc9c17b22352 ("perf test: Add a perf event fallback test") Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
10 daysperf tools: Remove a trailing newline in the event termsNamhyung Kim1-0/+2
So that it can show the correct encoding info in the JSON output.

  $ perf list -j hw
  [
  {
      "Unit": "cpu",
      "Topic": "legacy hardware",
      "EventName": "branch-instructions",
      "EventType": "Kernel PMU event",
      "BriefDescription": "Retired branch instructions [This event is an alias of branches]",
      "Encoding": "cpu/event=0xc4/"
  },
  ...

Reviewed-by: Ian Rogers <irogers@google.com>
Suggested-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
10 daysMerge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski2-0/+29
Merge in late fixes in preparation for the net-next PR. Signed-off-by: Jakub Kicinski <kuba@kernel.org>
10 daysMerge tag 'x86_cpu_for_6.19-rc1' of ↵Linus Torvalds2-10/+16
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 CPU feature updates from Dave Hansen: "The biggest thing of note here is Linear Address Space Separation (LASS). It represents the first time I can think of that the upper=>kernel/lower=>user address space convention is actually recognized by the hardware on x86. It ensures that userspace can not even get the hardware to _start_ page walks for the kernel address space. This, of course, is a really nice generic side channel defense. This is really only a down payment on LASS support. There are still some details to work out in its interaction with EFI calls and vsyscall emulation. For now, LASS is disabled if either of those features is compiled in (which is almost always the case). There's also one straggler commit in here which converts an under-utilized AMD CPU feature leaf into a generic Linux-defined leaf so more features can be packed in there. Summary: - Enable Linear Address Space Separation (LASS) - Change X86_FEATURE leaf 17 from an AMD leaf to Linux-defined" * tag 'x86_cpu_for_6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/cpu: Enable LASS during CPU initialization selftests/x86: Update the negative vsyscall tests to expect a #GP x86/traps: Communicate a LASS violation in #GP message x86/kexec: Disable LASS during relocate kernel x86/alternatives: Disable LASS when patching kernel code x86/asm: Introduce inline memcpy and memset x86/cpu: Add an LASS dependency on SMAP x86/cpufeatures: Enumerate the LASS feature bits x86/cpufeatures: Make X86_FEATURE leaf 17 Linux-specific
10 daystools/power turbostat: version 2025.12.02Len Brown1-218/+108
Since release 2025.09.09:

Add LLC statistics columns:
    LLCkRPS = Last Level Cache Thousands of References Per Second
    LLC%hit = Last Level Cache Hit %
Recognize Wildcat Lake and Nova Lake platforms
Add MSR check for Android
Add APERF check for VMWARE
Add RAPL check for AWS
minor fixes

This patch:

White-space only, resulting from running Lindent on everything except the tab-justified data-tables, and using -l150 instead of -l80 to allow long lines.

Signed-off-by: Len Brown <len.brown@intel.com>
10 daystools/power turbostat: Print wide names only for RAW 64-bit columnsLen Brown1-19/+21
Print a wide column header only for the case of a 64-bit RAW counter. It turns out that wide column headers otherwise are more harm than good. Signed-off-by: Len Brown <len.brown@intel.com>
10 daystools/power turbostat: Print percentages in 8-columnsLen Brown1-2/+2
Added counters that are FORMAT_PERCENT do not need to be 64-bits -- 32 is plenty. This allows the output code to fit them, and their header, into 8-columns. Signed-off-by: Len Brown <len.brown@intel.com>
10 daystools/power turbostat: Print "nan" for out of range percentagesLen Brown1-39/+53
Sometimes counters return junk. For the cases where a value > 100% is invalid, print "nan". Signed-off-by: Len Brown <len.brown@intel.com>
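A minimal sketch of the rule, assuming a two-decimal output format. The function names are hypothetical and the real tool is written in C, so this only illustrates the check.

```python
def percent_or_nan(value: float) -> float:
    # A percentage above 100 cannot be valid, so replace it with NaN
    # rather than printing a misleading number.
    return float("nan") if value > 100.0 else value


def format_percent(value: float) -> str:
    # Python renders NaN as "nan" under the fixed-point format.
    return f"{percent_or_nan(value):.2f}"


print(format_percent(42.5))
print(format_percent(250.0))
```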
10 daystools/power turbostat: Validate APERF access for VMWARELen Brown1-7/+7
VMWARE correctly enumerates lack of APERF and MPERF in CPUID, but turbostat didn't consult that before attempting to access them. Since VMWARE allows access, but always returns 0, turbostat got confused into an infinite reset loop. Head this off by listening to CPUID.6.APERF_MPERF (and rename the existing variable to make this more clear). Reported-by: David Arcari <darcari@redhat.com> Tested-by: David Arcari <darcari@redhat.com> Signed-off-by: Len Brown <len.brown@intel.com>
10 daystools/power turbostat: Enhance perf probeLen Brown1-10/+44
check_perf_access() will now check both IPC and LLC perf counters if they are enabled. If any fail, it now disables perf and all perf counters. Signed-off-by: Len Brown <len.brown@intel.com>
10 daystools/power turbostat: Validate RAPL MSRs for AWS Nitro HypervisorLen Brown1-58/+98
Even though the platform->plat_rapl_msrs enumeration may be accurate, a VM, such as AWS Nitro Hypervisor, may deny access to the underlying MSRs. Probe if PKG_ENERGY is readable and non-zero. If not, ignore all RAPL MSRs. Reported-by: Emily Ehlert <ehemily@amazon.de> Tested-by: Emily Ehlert <ehemily@amazon.de> Signed-off-by: Len Brown <len.brown@intel.com>
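The probe logic can be sketched as below, with the MSR reader injected as a callable so the example runs without hardware access. The helper names are hypothetical, and the register address 0x611 is assumed here to be the Intel package energy status MSR.

```python
from typing import Callable

# Assumed address of the package energy status MSR (illustrative constant).
MSR_PKG_ENERGY_STATUS = 0x611


def rapl_usable(read_msr: Callable[[int], int]) -> bool:
    """Return True only if the energy counter is readable and non-zero."""
    try:
        return read_msr(MSR_PKG_ENERGY_STATUS) != 0
    except OSError:
        # The hypervisor denied access: treat all RAPL MSRs as unavailable.
        return False


def denied(msr: int) -> int:
    # Fake reader that models a VM refusing the access.
    raise PermissionError("MSR access denied")


print(rapl_usable(lambda msr: 123456))  # readable and non-zero
print(rapl_usable(lambda msr: 0))       # readable but always zero
print(rapl_usable(denied))              # access refused
```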
10 daystools/power x86_energy_perf_policy: Fix potential NULL pointer dereferenceMalaya Kumar Rout1-1/+6
In err_on_hypervisor(), strstr() is called to search for "flags" in the buffer, but the return value is not checked before being used in pointer arithmetic (flags - buffer). If strstr() returns NULL because "flags" is not found in /proc/cpuinfo, this will cause undefined behavior and likely a crash. Add a NULL check after the strstr() call and handle the error appropriately by cleaning up resources and reporting a meaningful error message. Signed-off-by: Malaya Kumar Rout <mrout@redhat.com> Signed-off-by: Len Brown <len.brown@intel.com>
10 daystools/power x86_energy_perf_policy: Fix format string in error messageMalaya Kumar Rout1-1/+1
The error message in validate_cpu_selected_set() uses an incomplete format specifier "cpu%" instead of "cpu%d", resulting in the error message printing "Requested cpu% is not present" rather than showing the actual CPU number. Fix the format string to properly display the CPU number. Signed-off-by: Malaya Kumar Rout <mrout@redhat.com> Signed-off-by: Len Brown <len.brown@intel.com>
10 daystools/power x86_energy_perf_policy: Simplify Android MSR probeLen Brown1-27/+11
no functional change Signed-off-by: Len Brown <len.brown@intel.com>
10 daystools/power x86_energy_perf_policy: Add Android MSR device supportKaushlendra Kumar1-8/+46
Add support for Android MSR device paths which use /dev/msrN format instead of the standard Linux /dev/cpu/N/msr format. The tool now probes both path formats at startup and uses the appropriate one. This enables x86_energy_perf_policy to work on Android systems where MSR devices follow a different naming convention while maintaining full compatibility with standard Linux systems. Signed-off-by: Kaushlendra Kumar <kaushlendra.kumar@intel.com> Signed-off-by: Len Brown <len.brown@intel.com>
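A small sketch of probing both path formats. The two path shapes come from the message above; the `probe_msr_path` helper and its `root` parameter, used so the demo can run against a temporary directory, are assumptions for illustration.

```python
import os
import tempfile
from typing import Optional


def probe_msr_path(cpu: int, root: str = "/") -> Optional[str]:
    """Return the first MSR device path that exists for this CPU, if any."""
    candidates = [
        os.path.join(root, "dev", "cpu", str(cpu), "msr"),  # standard Linux
        os.path.join(root, "dev", f"msr{cpu}"),             # Android
    ]
    for path in candidates:
        if os.path.exists(path):
            return path
    return None


# Demo against a throwaway tree that only has the Android-style node.
with tempfile.TemporaryDirectory() as fake_root:
    os.makedirs(os.path.join(fake_root, "dev"))
    open(os.path.join(fake_root, "dev", "msr0"), "w").close()
    found = probe_msr_path(0, fake_root)
    missing = probe_msr_path(1, fake_root)

print(found is not None, missing)
```

Probing once at startup and remembering which format matched avoids compile-time conditionals for the two platforms.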
10 daystools/power turbostat: Add run-time MSR driver probeLen Brown1-29/+39
Rather than starting down the conditional-compile road... Probe the location of the MSR files at run-time. Signed-off-by: Len Brown <len.brown@intel.com>
10 daystools/power turbostat: Set per_cpu_msr_sum to NULL after freeEmily Ehlert1-0/+1
Set per_cpu_msr_sum to NULL after freeing it in the error path of msr_sum_record() to prevent potential use-after-free issues. Signed-off-by: Emily Ehlert <ehemily@amazon.com> Signed-off-by: Len Brown <len.brown@intel.com>
10 daystools/power turbostat: Add LLC statsLen Brown2-28/+164
LLCkRPS = Last Level Cache Thousands of References Per Second
LLC%hit = Last Level Cache Hit %

These columns are enabled by-default. They can be controlled with the --show/--hide options by individual column names above, or together using the "llc" or "cache" groups.

Signed-off-by: Len Brown <len.brown@intel.com>
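One plausible way to derive the two columns from raw counts over a measurement interval is sketched below. The formulas are an assumption based on the column definitions, in particular that the hit percentage is computed from references minus misses, and `llc_stats` is a hypothetical helper.

```python
from typing import Tuple


def llc_stats(references: int, misses: int,
              interval_sec: float) -> Tuple[float, float]:
    """Return (LLCkRPS, LLC%hit) for one measurement interval."""
    # Thousands of last-level cache references per second.
    krps = references / interval_sec / 1000.0
    # Share of references that did not miss; undefined without references.
    if references == 0:
        return krps, float("nan")
    hit_pct = 100.0 * (references - misses) / references
    return krps, hit_pct


krps, hit_pct = llc_stats(references=2_000_000, misses=500_000,
                          interval_sec=2.0)
print(krps, hit_pct)
```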
10 daysMerge tag 'x86_cleanups_for_v6.19_rc1' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cleanups from Borislav Petkov: - The mandatory pile of cleanups the cat drags in every merge window * tag 'x86_cleanups_for_v6.19_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/boot: Clean up whitespace in a20.c x86/mm: Delete disabled debug code x86/{boot,mtrr}: Remove unused function declarations x86/percpu: Use BIT_WORD() and BIT_MASK() macros x86/cpufeatures: Correct LKGS feature flag description x86/idtentry: Add missing '*' to kernel-doc lines
10 daysMerge tag 'kvm-s390-next-6.19-1' of ↵Paolo Bonzini2-0/+141
https://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into HEAD - SCA rework - VIRT_XFER_TO_GUEST_WORK support - Operation exception forwarding support - Cleanups
10 daysMerge tag 'timers-core-2025-11-30' of ↵Linus Torvalds2-10/+77
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull timer core updates from Thomas Gleixner: - Prevent a thundering herd problem when the timekeeper CPU is delayed and a large number of CPUs compete to acquire jiffies_lock to do the update. Limit it to one CPU with a separate "uncontended" atomic variable. - A set of improvements for the timer migration mechanism: - Support imbalanced NUMA trees correctly - Support dynamic exclusion of CPUs from the migrator duty to allow the cpuset/isolation mechanism to exclude them from handling timers of remote idle CPUs - The usual small updates, cleanups and enhancements * tag 'timers-core-2025-11-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: timers/migration: Exclude isolated cpus from hierarchy cpumask: Add initialiser to use cleanup helpers sched/isolation: Force housekeeping if isolcpus and nohz_full don't leave any cgroup/cpuset: Rename update_unbound_workqueue_cpumask() to update_isolation_cpumasks() timers/migration: Use scoped_guard on available flag set/clear timers/migration: Add mask for CPUs available in the hierarchy timers/migration: Rename 'online' bit to 'available' selftests/timers/nanosleep: Add tests for return of remaining time selftests/timers: Clean up kernel version check in posix_timers time: Fix a few typos in time[r] related code comments time: tick-oneshot: Add missing Return and parameter descriptions to kernel-doc hrtimer: Store time as ktime_t in restart block timers/migration: Remove dead code handling idle CPU checking for remote timers timers/migration: Remove unused "cpu" parameter from tmigr_get_group() timers/migration: Assert that hotplug preparing CPU is part of stable active hierarchy timers/migration: Fix imbalanced NUMA trees timers/migration: Remove locking on group connection timers/migration: Convert "while" loops to use "for" tick/sched: Limit non-timekeeper CPUs calling jiffies update
10 days | Merge tag 'kvmarm-6.19' of ↵ | Paolo Bonzini | 14 | -22/+821
https://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD

KVM/arm64 updates for 6.19

 - Support for userspace handling of synchronous external aborts (SEAs), allowing the VMM to potentially handle the abort in a non-fatal manner.

 - Large rework of the VGIC's list register handling with the goal of supporting more active/pending IRQs than available list registers in hardware. In addition, the VGIC now supports EOImode==1 style deactivations for IRQs which may occur on a separate vCPU than the one that acked the IRQ.

 - Support for FEAT_XNX (user / privileged execute permissions) and FEAT_HAF (hardware update to the Access Flag) in the software page table walkers and shadow MMU.

 - Allow page table destruction to reschedule, fixing long need_resched latencies observed when destroying a large VM.

 - Minor fixes to KVM and selftests
10 days | Merge tag 'kvm-riscv-6.19-1' of https://github.com/kvm-riscv/linux into HEAD | Paolo Bonzini | 1 | -0/+4
KVM/riscv changes for 6.19

 - SBI MPXY support for KVM guest
 - New KVM_EXIT_FAIL_ENTRY_NO_VSFILE for the case when in-kernel AIA virtualization fails to allocate IMSIC VS-file
 - Support enabling dirty log gradually in small chunks
 - Fix guest page fault within HLV* instructions
 - Flush VS-stage TLB after VCPU migration for Andes cores
10 days | Merge tag 'loongarch-kvm-6.19' of ↵ | Paolo Bonzini | 6 | -3/+417
git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson into HEAD

LoongArch KVM changes for v6.19

1. Get VM PMU capability from HW GCFG register.
2. Add AVEC basic support.
3. Use 64-bit register definition for EIOINTC.
4. Add KVM timer test cases for tools/selftests.
10 days | objtool: Fix stack overflow in validate_branch() | Josh Poimboeuf | 1 | -14/+13
On an allmodconfig kernel compiled with Clang, objtool is segfaulting in drivers/scsi/qla2xxx/qla2xxx.o due to a stack overflow in validate_branch().

Due in part to KASAN being enabled, the qla2xxx code has a large number of conditional jumps, causing objtool to go quite deep in its recursion.

By far the biggest offender of stack usage is the recently added 'prev_state' stack variable in validate_insn(), coming in at 328 bytes.

Move that variable (and its tracing usage) to handle_insn_ops() and make handle_insn_ops() noinline to keep its stack frame outside the recursive call chain.

Reported-by: Nathan Chancellor <nathan@kernel.org>
Fixes: fcb268b47a2f ("objtool: Trace instruction state changes during function validation")
Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://patch.msgid.link/21bb161c23ca0d8c942a960505c0d327ca2dc7dc.1764691895.git.jpoimboe@kernel.org
Closes: https://lore.kernel.org/20251201202329.GA3225984@ax162
10 days | selftests/tc-testing: Test CAKE scheduler when enqueue drops packets | Xiang Mei | 1 | -0/+28
Add tests that trigger packet drops in cake_enqueue(): "CAKE with QFQ Parent - CAKE enqueue with packets dropping". The test forces cake_enqueue() to return NET_XMIT_CN after dropping the packets when it has a QFQ parent.

Signed-off-by: Xiang Mei <xmei5@asu.edu>
Reviewed-by: Toke Høiland-Jørgensen <toke@toke.dk>
Link: https://patch.msgid.link/20251128001415.377823-3-xmei5@asu.edu
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
10 days | Merge tag 'asoc-v6.19' of ↵ | Takashi Iwai | 7 | -14/+191
https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-linus

ASoC: Updates for v6.19

This is a very large set of updates. As well as some more extensive cleanup work from Morimoto-san, we've also added a generic SDCA class driver for SoundWire devices, enabling us to support many chips with no custom code. There's also a batch of new drivers added for both SoCs and CODECs.

 - Added a SoundWire SDCA generic class driver, pulling in a little regmap work to support it.
 - A *lot* of cleanup and API improvement work from Morimoto-san.
 - Lots of work on the existing Cirrus, Intel, Maxim and Qualcomm drivers.
 - Support for Allwinner A523, Mediatek MT8189, Qualcomm QCM2290, QRB2210 and SM6115, SpacemiT K1, and TI TAS2568, TAS5802, TAS5806, TAS5815, TAS5828 and TAS5830.

This also pulls in some gpiolib changes supporting shared GPIOs in the core, so that the ASoC drivers which open-code this handling can be converted to the core functionality.
10 days | Merge tag 'perf-core-2025-12-01' of ↵ | Linus Torvalds | 2 | -4/+22
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull performance events updates from Ingo Molnar: "Callchain support: - Add support for deferred user-space stack unwinding for perf, enabled on x86. (Peter Zijlstra, Steven Rostedt) - unwind_user/x86: Enable frame pointer unwinding on x86 (Josh Poimboeuf) x86 PMU support and infrastructure: - x86/insn: Simplify for_each_insn_prefix() (Peter Zijlstra) - x86/insn,uprobes,alternative: Unify insn_is_nop() (Peter Zijlstra) Intel PMU driver: - Large series to prepare for and implement architectural PEBS support for Intel platforms such as Clearwater Forest (CWF) and Panther Lake (PTL). (Dapeng Mi, Kan Liang) - Check dynamic constraints (Kan Liang) - Optimize PEBS extended config (Peter Zijlstra) - cstates: - Remove PC3 support from LunarLake (Zhang Rui) - Add Pantherlake support (Zhang Rui) - Clearwater Forest support (Zide Chen) AMD PMU driver: - x86/amd: Check event before enable to avoid GPF (George Kennedy) Fixes and cleanups: - task_work: Fix NMI race condition (Peter Zijlstra) - perf/x86: Fix NULL event access and potential PEBS record loss (Dapeng Mi) - Misc other fixes and cleanups (Dapeng Mi, Ingo Molnar, Peter Zijlstra)" * tag 'perf-core-2025-12-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (38 commits) perf/x86/intel: Fix and clean up intel_pmu_drain_arch_pebs() type use perf/x86/intel: Optimize PEBS extended config perf/x86/intel: Check PEBS dyn_constraints perf/x86/intel: Add a check for dynamic constraints perf/x86/intel: Add counter group support for arch-PEBS perf/x86/intel: Setup PEBS data configuration and enable legacy groups perf/x86/intel: Update dyn_constraint base on PEBS event precise level perf/x86/intel: Allocate arch-PEBS buffer and initialize PEBS_BASE MSR perf/x86/intel: Process arch-PEBS records or record fragments perf/x86/intel/ds: Factor out PEBS group processing code to functions perf/x86/intel/ds: Factor out PEBS record processing code to functions perf/x86/intel: 
Initialize architectural PEBS perf/x86/intel: Correct large PEBS flag check perf/x86/intel: Replace x86_pmu.drain_pebs calling with static call perf/x86: Fix NULL event access and potential PEBS record loss perf/x86: Remove redundant is_x86_event() prototype entry,unwind/deferred: Fix unwind_reset_info() placement unwind_user/x86: Fix arch=um build perf: Support deferred user unwind unwind_user/x86: Teach FP unwind about start of function ...
10 days | Merge tag 'objtool-core-2025-12-01' of ↵ | Linus Torvalds | 50 | -994/+6036
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull objtool updates from Ingo Molnar: - klp-build livepatch module generation (Josh Poimboeuf) Introduce new objtool features and a klp-build script to generate livepatch modules using a source .patch as input. This builds on concepts from the longstanding out-of-tree kpatch project which began in 2012 and has been used for many years to generate livepatch modules for production kernels. However, this is a complete rewrite which incorporates hard-earned lessons from 12+ years of maintaining kpatch. Key improvements compared to kpatch-build: - Integrated with objtool: Leverages objtool's existing control-flow graph analysis to help detect changed functions. - Works on vmlinux.o: Supports late-linked objects, making it compatible with LTO, IBT, and similar. - Simplified code base: ~3k fewer lines of code. - Upstream: No more out-of-tree #ifdef hacks, far less cruft. - Cleaner internals: Vastly simplified logic for symbol/section/reloc inclusion and special section extraction. - Robust __LINE__ macro handling: Avoids false positive binary diffs caused by the __LINE__ macro by introducing a fix-patch-lines script which injects #line directives into the source .patch to preserve the original line numbers at compile time. - Disassemble code with libopcodes instead of running objdump (Alexandre Chartre) - Disassemble support (-d option to objtool) by Alexandre Chartre, which supports the decoding of various Linux kernel code generation specials such as alternatives: 17ef: sched_balance_find_dst_group+0x62f mov 0x34(%r9),%edx 17f3: sched_balance_find_dst_group+0x633 | <alternative.17f3> | X86_FEATURE_POPCNT 17f3: sched_balance_find_dst_group+0x633 | call 0x17f8 <__sw_hweight64> | popcnt %rdi,%rax 17f8: sched_balance_find_dst_group+0x638 cmp %eax,%edx ... 
jump table alternatives: 1895: sched_use_asym_prio+0x5 test $0x8,%ch 1898: sched_use_asym_prio+0x8 je 0x18a9 <sched_use_asym_prio+0x19> 189a: sched_use_asym_prio+0xa | <jump_table.189a> | JUMP 189a: sched_use_asym_prio+0xa | jmp 0x18ae <sched_use_asym_prio+0x1e> | nop2 189c: sched_use_asym_prio+0xc mov $0x1,%eax 18a1: sched_use_asym_prio+0x11 and $0x80,%ecx ... exception table alternatives: native_read_msr: 5b80: native_read_msr+0x0 mov %edi,%ecx 5b82: native_read_msr+0x2 | <ex_table.5b82> | EXCEPTION 5b82: native_read_msr+0x2 | rdmsr | resume at 0x5b84 <native_read_msr+0x4> 5b84: native_read_msr+0x4 shl $0x20,%rdx .... x86 feature flag decoding (also see the X86_FEATURE_POPCNT example in sched_balance_find_dst_group() above): 2faaf: start_thread_common.constprop.0+0x1f jne 0x2fba4 <start_thread_common.constprop.0+0x114> 2fab5: start_thread_common.constprop.0+0x25 | <alternative.2fab5> | X86_FEATURE_ALWAYS | X86_BUG_NULL_SEG 2fab5: start_thread_common.constprop.0+0x25 | jmp 0x2faba <.altinstr_aux+0x2f4> | jmp 0x4b0 <start_thread_common.constprop.0+0x3f> | nop5 2faba: start_thread_common.constprop.0+0x2a mov $0x2b,%eax ... NOP sequence shortening: 1048e2: snapshot_write_finalize+0xc2 je 0x104917 <snapshot_write_finalize+0xf7> 1048e4: snapshot_write_finalize+0xc4 nop6 1048ea: snapshot_write_finalize+0xca nop11 1048f5: snapshot_write_finalize+0xd5 nop11 104900: snapshot_write_finalize+0xe0 mov %rax,%rcx 104903: snapshot_write_finalize+0xe3 mov 0x10(%rdx),%rax ... and much more. 
- Function validation tracing support (Alexandre Chartre) - Various -ffunction-sections fixes (Josh Poimboeuf) - Clang AutoFDO (Automated Feedback-Directed Optimizations) support (Josh Poimboeuf) - Misc fixes and cleanups (Borislav Petkov, Chen Ni, Dylan Hatch, Ingo Molnar, John Wang, Josh Poimboeuf, Pankaj Raghav, Peter Zijlstra, Thorsten Blum) * tag 'objtool-core-2025-12-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (129 commits) objtool: Fix segfault on unknown alternatives objtool: Build with disassembly can fail when including bdf.h objtool: Trim trailing NOPs in alternative objtool: Add wide output for disassembly objtool: Compact output for alternatives with one instruction objtool: Improve naming of group alternatives objtool: Add Function to get the name of a CPU feature objtool: Provide access to feature and flags of group alternatives objtool: Fix address references in alternatives objtool: Disassemble jump table alternatives objtool: Disassemble exception table alternatives objtool: Print addresses with alternative instructions objtool: Disassemble group alternatives objtool: Print headers for alternatives objtool: Preserve alternatives order objtool: Add the --disas=<function-pattern> action objtool: Do not validate IBT for .return_sites and .call_sites objtool: Improve tracing of alternative instructions objtool: Add functions to better name alternatives objtool: Identify the different types of alternatives ...
11 days | selftests: drv-net: Fix tolerance calculation in devlink_rate_tc_bw.py | Carolina Jubran | 1 | -44/+30
Currently, tolerance is computed against the TC’s expected percentage, making TC3 (20%) validation overly strict and TC4 (80%) overly loose.

Update BandwidthValidator to take a dict of shares and compute bounds relative to the overall total, so that all shares are validated consistently.

Signed-off-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Nimrod Oren <noren@nvidia.com>
Link: https://patch.msgid.link/20251130091938.4109055-7-cjubran@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
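The effect of the two tolerance schemes can be sketched in a few lines. This is an illustrative sketch only and not the code in devlink_rate_tc_bw.py: the function names, the 12-point tolerance, and the measured throughput values are all assumptions chosen to show the asymmetry.

```python
def bounds_relative_to_share(expected_pct, tolerance_pct):
    """Old scheme: the margin scales with the TC's own expected share."""
    margin = expected_pct * tolerance_pct / 100
    return expected_pct - margin, expected_pct + margin


def bounds_relative_to_total(expected_pct, tolerance_pct):
    """New scheme: the margin is a fixed slice of the overall total (100%)."""
    return expected_pct - tolerance_pct, expected_pct + tolerance_pct


def validate(shares, measured, tolerance_pct, bounds_fn):
    """Return {tc: bool} telling whether each measured share is in bounds."""
    total = sum(measured.values())
    result = {}
    for tc, expected_pct in shares.items():
        actual_pct = 100 * measured[tc] / total
        low, high = bounds_fn(expected_pct, tolerance_pct)
        result[tc] = low <= actual_pct <= high
    return result


shares = {"tc3": 20, "tc4": 80}        # expected split, in percent
measured = {"tc3": 2.5, "tc4": 7.5}    # hypothetical throughput: a 25% / 75% split

old = validate(shares, measured, 12, bounds_relative_to_share)
new = validate(shares, measured, 12, bounds_relative_to_total)
```

With these made-up numbers both classes are off by the same five percentage points, yet the share-relative scheme rejects TC3 (allowed range 17.6 to 22.4) while accepting TC4 (70.4 to 89.6). The total-relative scheme gives both the same 12-point margin and accepts both.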
11 days | selftests: drv-net: Fix and clarify TC bandwidth split in devlink_rate_tc_bw.py | Carolina Jubran | 1 | -13/+13
Correct the documented bandwidth distribution between TC3 and TC4 from 80/20 to 20/80. Update test descriptions and printed messages to consistently reflect the intended split.

Signed-off-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Nimrod Oren <noren@nvidia.com>
Link: https://patch.msgid.link/20251130091938.4109055-6-cjubran@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 days | selftests: drv-net: Set shell=True for sysfs writes in devlink_rate_tc_bw.py | Carolina Jubran | 1 | -2/+2
Commit 7c32f7a2d3db ("selftests: net: py: don't default to shell=True") changed the cmd() helper to avoid spawning a shell unless explicitly requested.

The devlink_rate_tc_bw test enables SR-IOV by writing to the sriov_numvfs sysfs attribute using redirection. Without shell=True the redirection is not interpreted and the VF device never appears, causing the test to fail.

Fix by explicitly passing shell=True in the two places that update sriov_numvfs.

Signed-off-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Nimrod Oren <noren@nvidia.com>
Link: https://patch.msgid.link/20251130091938.4109055-5-cjubran@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
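The failure mode is easy to reproduce outside the selftest framework. The sketch below uses the standard subprocess module rather than the selftests' cmd() helper, and writes to a temporary file that merely stands in for the sriov_numvfs attribute. It assumes a POSIX system with an echo binary and /bin/sh.

```python
import os
import shlex
import subprocess
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for the sysfs attribute (hypothetical path).
    target = os.path.join(tmp, "sriov_numvfs")

    # Without a shell, ">" and the path reach echo as ordinary arguments.
    no_shell = subprocess.run(
        ["echo", "1", ">", target], capture_output=True, text=True, check=True
    )
    created_without_shell = os.path.exists(target)

    # With shell=True, /bin/sh interprets the redirection and writes the file.
    subprocess.run(f"echo 1 > {shlex.quote(target)}", shell=True, check=True)
    with open(target) as f:
        written = f.read().strip()
```

In the first call echo simply prints its arguments, including the ">" and the path, and no file is created. Only the second call performs the write.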
11 days | selftests: drv-net: Use Iperf3Runner in devlink_rate_tc_bw.py | Carolina Jubran | 1 | -41/+29
Replace the inline iperf3 subprocess and JSON parsing with Iperf3Runner.

Signed-off-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Nimrod Oren <noren@nvidia.com>
Link: https://patch.msgid.link/20251130091938.4109055-4-cjubran@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 days | selftests: drv-net: introduce Iperf3Runner for measurement use cases | Carolina Jubran | 3 | -12/+82
GenerateTraffic was added to spin up long-running iperf3 load, mainly to drive high PPS background traffic. It was never meant to provide stable throughput numbers, and trying to repurpose it for measurement does not make sense.

Introduce Iperf3Runner to allow tests to split out server/client configuration, control start/stop, and collect JSON output for analysis. This makes it possible to measure bandwidth directly when validating egress shaping.

GenerateTraffic stays as the background load generator, reusing the common iperf3 helpers under the hood.

Signed-off-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Nimrod Oren <noren@nvidia.com>
Link: https://patch.msgid.link/20251130091938.4109055-3-cjubran@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 days | selftests: drv-net: Add devlink_rate_tc_bw.py to TEST_PROGS | Carolina Jubran | 1 | -0/+1
Add devlink_rate_tc_bw.py to TEST_PROGS in the Makefile of the same directory.

Signed-off-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Nimrod Oren <noren@nvidia.com>
Link: https://patch.msgid.link/20251130091938.4109055-2-cjubran@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 days | selftests: netconsole: remove log noise due to socat exit | Andre Carvalho | 1 | -1/+1
This removes some noise that can be distracting while looking at selftests by redirecting socat stderr to /dev/null.

Before this commit, netcons_basic would output:

    Running with target mode: basic (ipv6)
    2025/11/29 12:08:03 socat[259] W exiting on signal 15
    2025/11/29 12:08:03 socat[271] W exiting on signal 15
    basic : ipv6 : Test passed
    Running with target mode: basic (ipv4)
    2025/11/29 12:08:05 socat[329] W exiting on signal 15
    2025/11/29 12:08:05 socat[322] W exiting on signal 15
    basic : ipv4 : Test passed
    Running with target mode: extended (ipv6)
    2025/11/29 12:08:08 socat[386] W exiting on signal 15
    2025/11/29 12:08:08 socat[386] W exiting on signal 15
    2025/11/29 12:08:08 socat[380] W exiting on signal 15
    extended : ipv6 : Test passed
    Running with target mode: extended (ipv4)
    2025/11/29 12:08:10 socat[440] W exiting on signal 15
    2025/11/29 12:08:10 socat[435] W exiting on signal 15
    2025/11/29 12:08:10 socat[435] W exiting on signal 15
    extended : ipv4 : Test passed

After these changes, output looks like:

    Running with target mode: basic (ipv6)
    basic : ipv6 : Test passed
    Running with target mode: basic (ipv4)
    basic : ipv4 : Test passed
    Running with target mode: extended (ipv6)
    extended : ipv6 : Test passed
    Running with target mode: extended (ipv4)
    extended : ipv4 : Test passed

Signed-off-by: Andre Carvalho <asantostc@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20251129-netcons-socat-noise-v1-1-605a0cea8fca@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 days | selftests: net: add a hint about MACAddressPolicy=persistent | Jakub Kicinski | 1 | -1/+1
A new NIPA installation has been reporting a few flaky tests, and arp_ndisc_evict_nocarrier is the flakiest of them all. I suspect that the flakiness is due to udev swapping the MAC addresses on the interfaces. Extend the message in arp_ndisc_evict_nocarrier to hint at this potential issue. It is rather unusual for the neigh get to fail right after the ping, unless udev changes the MAC address in the meantime and causes a flush.

Link: https://patch.msgid.link/20251127194556.2409574-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 days | selftests: net: py: handle interrupt during cleanup | Jakub Kicinski | 1 | -2/+16
Following up on the old discussion [1]: let BaseExceptions propagate out of defer()'ed cleanup and handle them in the main loop. This allows us to exit the tests if the user hits Ctrl-C during defer().

Link: https://lore.kernel.org/20251119063228.3adfd743@kernel.org # [1]
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Link: https://patch.msgid.link/20251128004846.2602687-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
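The distinction this change relies on is that KeyboardInterrupt derives from BaseException rather than Exception. A minimal sketch of the pattern follows. It is not the selftest library's code: run_deferred(), main_loop(), and the result strings are hypothetical names used only for illustration.

```python
def run_deferred(cleanups):
    """Run cleanups in LIFO order; return errors from ordinary failures."""
    errors = []
    while cleanups:
        func = cleanups.pop()
        try:
            func()
        except Exception as exc:  # ordinary failure: record it, keep going
            errors.append(exc)
        # KeyboardInterrupt and SystemExit are BaseException subclasses, so
        # they are not caught here and propagate to the caller.
    return errors


def main_loop(tests):
    """Run (name, test, cleanups) entries; stop early on an interrupt."""
    results = []
    for name, test, cleanups in tests:
        try:
            test()
            errors = run_deferred(cleanups)
            results.append((name, "fail" if errors else "pass"))
        except KeyboardInterrupt:
            results.append((name, "interrupted"))
            break  # the user asked to stop: skip the remaining tests
    return results


def broken_cleanup():
    raise RuntimeError("cleanup failed")


def interrupted_cleanup():
    raise KeyboardInterrupt  # simulates Ctrl-C arriving during cleanup


results = main_loop([
    ("t1", lambda: None, [broken_cleanup]),
    ("t2", lambda: None, [interrupted_cleanup]),
    ("t3", lambda: None, []),
])
```

Here a failing cleanup only marks its test as failed, while a simulated Ctrl-C during cleanup ends the run before the third test starts.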
11 days | ynl: samples: Fix spelling mistake "failedq" -> "failed" | Colin Ian King | 1 | -1/+1
There is a spelling mistake in an error message. Fix it.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Link: https://patch.msgid.link/20251128173802.318520-1-colin.i.king@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
11 days | Merge tag 'vfs-6.19-rc1.coredump' of ↵ | Linus Torvalds | 9 | -1663/+2851
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull pidfd and coredump updates from Christian Brauner: "Features: - Expose coredump signal via pidfd Expose the signal that caused the coredump through the pidfd interface. The recent changes to rework coredump handling to rely on unix sockets are in the process of being used in systemd. The previous systemd coredump container interface requires the coredump file descriptor and basic information including the signal number to be sent to the container. This means the signal number needs to be available before sending the coredump to the container. - Add supported_mask field to pidfd Add a new supported_mask field to struct pidfd_info that indicates which information fields are supported by the running kernel. This allows userspace to detect feature availability without relying on error codes or kernel version checks. Cleanups: - Drop struct pidfs_exit_info and prepare to drop exit_info pointer, simplifying the internal publication mechanism for exit and coredump information retrievable via the pidfd ioctl - Use guard() for task_lock in pidfs - Reduce wait_pidfd lock scope - Add missing PIDFD_INFO_SIZE_VER1 constant - Add missing BUILD_BUG_ON() assert on struct pidfd_info Fixes: - Fix PIDFD_INFO_COREDUMP handling Selftests: - Split out coredump socket tests and common helpers into separate files for better organization - Fix userspace coredump client detection issues - Handle edge-triggered epoll correctly - Ignore ENOSPC errors in tests - Add debug logging to coredump socket tests, socket protocol tests, and test helpers - Add tests for PIDFD_INFO_COREDUMP_SIGNAL - Add tests for supported_mask field - Update pidfd header for selftests" * tag 'vfs-6.19-rc1.coredump' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (23 commits) pidfs: reduce wait_pidfd lock scope selftests/coredump: add second PIDFD_INFO_COREDUMP_SIGNAL test selftests/coredump: add first PIDFD_INFO_COREDUMP_SIGNAL test selftests/coredump: 
ignore ENOSPC errors selftests/coredump: add debug logging to coredump socket protocol tests selftests/coredump: add debug logging to coredump socket tests selftests/coredump: add debug logging to test helpers selftests/coredump: handle edge-triggered epoll correctly selftests/coredump: fix userspace coredump client detection selftests/coredump: fix userspace client detection selftests/coredump: split out coredump socket tests selftests/coredump: split out common helpers selftests/pidfd: add second supported_mask test selftests/pidfd: add first supported_mask test selftests/pidfd: update pidfd header pidfs: expose coredump signal pidfs: drop struct pidfs_exit_info pidfs: prepare to drop exit_info pointer pidfd: add a new supported_mask field pidfs: add missing BUILD_BUG_ON() assert on struct pidfd_info ...
11 days | Merge tag 'namespace-6.19-rc1' of ↵ | Linus Torvalds | 15 | -58/+8344
git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs Pull namespace updates from Christian Brauner: "This contains substantial namespace infrastructure changes including a new system call, active reference counting, and extensive header cleanups. The branch depends on the shared kbuild branch for -fms-extensions support. Features: - listns() system call Add a new listns() system call that allows userspace to iterate through namespaces in the system. This provides a programmatic interface to discover and inspect namespaces, addressing longstanding limitations: Currently, there is no direct way for userspace to enumerate namespaces. Applications must resort to scanning /proc/*/ns/ across all processes, which is: - Inefficient - requires iterating over all processes - Incomplete - misses namespaces not attached to any running process but kept alive by file descriptors, bind mounts, or parent references - Permission-heavy - requires access to /proc for many processes - No ordering or ownership information - No filtering per namespace type The listns() system call solves these problems: ssize_t listns(const struct ns_id_req *req, u64 *ns_ids, size_t nr_ns_ids, unsigned int flags); struct ns_id_req { __u32 size; __u32 spare; __u64 ns_id; struct /* listns */ { __u32 ns_type; __u32 spare2; __u64 user_ns_id; }; }; Features include: - Pagination support for large namespace sets - Filtering by namespace type (MNT_NS, NET_NS, USER_NS, etc.) - Filtering by owning user namespace - Permission checks respecting namespace isolation - Active Reference Counting Introduce an active reference count that tracks namespace visibility to userspace. 
A namespace is visible in the following cases: - The namespace is in use by a task - The namespace is persisted through a VFS object (namespace file descriptor or bind-mount) - The namespace is a hierarchical type and is the parent of child namespaces The active reference count does not regulate lifetime (that's still done by the normal reference count) - it only regulates visibility to namespace file handles and listns(). This prevents resurrection of namespaces that are pinned only for internal kernel reasons (e.g., user namespaces held by file->f_cred, lazy TLB references on idle CPUs, etc.) which should not be accessible via (1)-(3). - Unified Namespace Tree Introduce a unified tree structure for all namespaces with: - Fixed IDs assigned to initial namespaces - Lookup based solely on inode number - Maintained list of owned namespaces per user namespace - Simplified rbtree comparison helpers Cleanups - Header Reorganization: - Move namespace types into separate header (ns_common_types.h) - Decouple nstree from ns_common header - Move nstree types into separate header - Switch to new ns_tree_{node,root} structures with helper functions - Use guards for ns_tree_lock - Initial Namespace Reference Count Optimization - Make all reference counts on initial namespaces a nop to avoid pointless cacheline ping-pong for namespaces that can never go away - Drop custom reference count initialization for initial namespaces - Add NS_COMMON_INIT() macro and use it for all namespaces - pid: rely on common reference count behavior - Miscellaneous Cleanups - Rename exit_task_namespaces() to exit_nsproxy_namespaces() - Rename is_initial_namespace() and make argument const - Use boolean to indicate anonymous mount namespace - Simplify owner list iteration in nstree - nsfs: raise SB_I_NODEV, SB_I_NOEXEC, and DCACHE_DONTCACHE explicitly - nsfs: use inode_just_drop() - pidfs: raise DCACHE_DONTCACHE explicitly - pidfs: simplify PIDFD_GET_<type>_NAMESPACE ioctls - libfs: allow to specify 
s_d_flags - cgroup: add cgroup namespace to tree after owner is set - nsproxy: fix free_nsproxy() and simplify create_new_namespaces() Fixes: - setns(pidfd, ...) race condition Fix a subtle race when using pidfds with setns(). When the target task exits after prepare_nsset() but before commit_nsset(), the namespace's active reference count might have been dropped. If setns() then installs the namespaces, it would bump the active reference count from zero without taking the required reference on the owner namespace, leading to underflow when later decremented. The fix resurrects the ownership chain if necessary - if the caller succeeded in grabbing passive references, the setns() should succeed even if the target task exits or gets reaped. - Return EFAULT on put_user() error instead of success - Make sure references are dropped outside of RCU lock (some namespaces like mount namespace sleep when putting the last reference) - Don't skip active reference count initialization for network namespace - Add asserts for active refcount underflow - Add asserts for initial namespace reference counts (both passive and active) - ipc: enable is_ns_init_id() assertions - Fix kernel-doc comments for internal nstree functions - Selftests - 15 active reference count tests - 9 listns() functionality tests - 7 listns() permission tests - 12 inactive namespace resurrection tests - 3 threaded active reference count tests - commit_creds() active reference tests - Pagination and stress tests - EFAULT handling test - nsid tests fixes" * tag 'namespace-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs: (103 commits) pidfs: simplify PIDFD_GET_<type>_NAMESPACE ioctls nstree: fix kernel-doc comments for internal functions nsproxy: fix free_nsproxy() and simplify create_new_namespaces() selftests/namespaces: fix nsid tests ns: drop custom reference count initialization for initial namespaces pid: rely on common reference count behavior ns: add asserts for initial namespace 
active reference counts ns: add asserts for initial namespace reference counts ns: make all reference counts on initial namespace a nop ipc: enable is_ns_init_id() assertions fs: use boolean to indicate anonymous mount namespace ns: rename is_initial_namespace() ns: make is_initial_namespace() argument const nstree: use guards for ns_tree_lock nstree: simplify owner list iteration nstree: switch to new structures nstree: add helper to operate on struct ns_tree_{node,root} nstree: move nstree types into separate header nstree: decouple from ns_common header ns: move namespace types into separate header ...
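The request structure quoted in the pull message above can be mirrored from userspace to check its layout. The sketch below only models struct ns_id_req with ctypes and prepares a result buffer; it does not invoke listns(), since the pull message gives no syscall number. Setting the size field to the structure size is an assumption based on the usual convention for extensible structs, and the 64-entry buffer is an arbitrary choice.

```python
import ctypes


class NsIdReq(ctypes.Structure):
    """Mirror of struct ns_id_req as quoted in the pull message."""
    _fields_ = [
        ("size", ctypes.c_uint32),
        ("spare", ctypes.c_uint32),
        ("ns_id", ctypes.c_uint64),
        # Members of the anonymous "listns" struct, flattened here.
        ("ns_type", ctypes.c_uint32),
        ("spare2", ctypes.c_uint32),
        ("user_ns_id", ctypes.c_uint64),
    ]


req = NsIdReq()
req.size = ctypes.sizeof(NsIdReq)   # assumed convention: struct describes its own size
ids = (ctypes.c_uint64 * 64)()      # output buffer; its length would be nr_ns_ids

layout = {name: getattr(NsIdReq, name).offset for name, _ in NsIdReq._fields_}
```

The fields pack without padding, giving a 32-byte structure with user_ns_id at offset 24, which is a quick sanity check before passing such a buffer to any real system call wrapper.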
11 days | Merge branch 'for-linus' into for-next | Takashi Iwai | 65 | -155/+1868
Pull remaining 6.18-devel changes.

Signed-off-by: Takashi Iwai <tiwai@suse.de>
11 days | objtool: Fix segfault on unknown alternatives (objtool-core-2025-12-01, objtool/core) | Ingo Molnar | 1 | -0/+3
So 'objtool --link -d vmlinux.o' gets surprised by this endbr64+endbr64 pattern in ___bpf_prog_run():

    ___bpf_prog_run:
    1e7680: ___bpf_prog_run+0x0 push %r12
    1e7682: ___bpf_prog_run+0x2 mov %rdi,%r12
    1e7685: ___bpf_prog_run+0x5 push %rbp
    1e7686: ___bpf_prog_run+0x6 xor %ebp,%ebp
    1e7688: ___bpf_prog_run+0x8 push %rbx
    1e7689: ___bpf_prog_run+0x9 mov %rsi,%rbx
    1e768c: ___bpf_prog_run+0xc movzbl (%rbx),%esi
    1e768f: ___bpf_prog_run+0xf movzbl %sil,%edx
    1e7693: ___bpf_prog_run+0x13 mov %esi,%eax
    1e7695: ___bpf_prog_run+0x15 mov 0x0(,%rdx,8),%rdx
    1e769d: ___bpf_prog_run+0x1d jmp 0x1e76a2 <__x86_indirect_thunk_rdx>
    1e76a2: ___bpf_prog_run+0x22 endbr64
    1e76a6: ___bpf_prog_run+0x26 endbr64
    1e76aa: ___bpf_prog_run+0x2a mov 0x4(%rbx),%edx

And crashes due to blindly dereferencing alt->insn->alt_group.

Bail out on NULL ->alt_group, which produces this warning and continues with the disassembly, instead of a segfault:

    .git/O/vmlinux.o: warning: objtool: <alternative.1e769d>: failed to disassemble alternative

Cc: Alexandre Chartre <alexandre.chartre@oracle.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
11 days | Merge branch 'kvm-arm64/nv-xnx-haf' into kvmarm/next | Oliver Upton | 4 | -0/+178
* kvm-arm64/nv-xnx-haf: (22 commits)
  : Support for FEAT_XNX and FEAT_HAF in nested
  :
  : Add support for a couple of MMU-related features that weren't
  : implemented by KVM's software page table walk:
  :
  : - FEAT_XNX: Allows the hypervisor to describe execute permissions
  : separately for EL0 and EL1
  :
  : - FEAT_HAF: Hardware update of the Access Flag, which in the context of
  : nested means software walkers must also set the Access Flag.
  :
  : The series also adds some basic support for testing KVM's emulation of
  : the AT instruction, including the implementation detail that AT sets the
  : Access Flag in KVM.
  KVM: arm64: at: Update AF on software walk only if VM has FEAT_HAFDBS
  KVM: arm64: at: Use correct HA bit in TCR_EL2 when regime is EL2
  KVM: arm64: Document KVM_PGTABLE_PROT_{UX,PX}
  KVM: arm64: Fix spelling mistake "Unexpeced" -> "Unexpected"
  KVM: arm64: Add break to default case in kvm_pgtable_stage2_pte_prot()
  KVM: arm64: Add endian casting to kvm_swap_s[12]_desc()
  KVM: arm64: Fix compilation when CONFIG_ARM64_USE_LSE_ATOMICS=n
  KVM: arm64: selftests: Add test for AT emulation
  KVM: arm64: nv: Expose hardware access flag management to NV guests
  KVM: arm64: nv: Implement HW access flag management in stage-2 SW PTW
  KVM: arm64: Implement HW access flag management in stage-1 SW PTW
  KVM: arm64: Propagate PTW errors up to AT emulation
  KVM: arm64: Add helper for swapping guest descriptor
  KVM: arm64: nv: Use pgtable definitions in stage-2 walk
  KVM: arm64: Handle endianness in read helper for emulated PTW
  KVM: arm64: nv: Stop passing vCPU through void ptr in S2 PTW
  KVM: arm64: Call helper for reading descriptors directly
  KVM: arm64: nv: Advertise support for FEAT_XNX
  KVM: arm64: Teach ptdump about FEAT_XNX permissions
  KVM: arm64: nv: Forward FEAT_XNX permissions to the shadow stage-2
  ...

Signed-off-by: Oliver Upton <oupton@kernel.org>
11 days | Merge branch 'kvm-arm64/vgic-lr-overflow' into kvmarm/next | Oliver Upton | 5 | -22/+288
* kvm-arm64/vgic-lr-overflow: (50 commits)
  : Support for VGIC LR overflows, courtesy of Marc Zyngier
  :
  : Address deficiencies in KVM's GIC emulation when a vCPU has more active
  : IRQs than can be represented in the VGIC list registers. Sort the AP
  : list to prioritize inactive and pending IRQs, potentially spilling
  : active IRQs outside of the LRs.
  :
  : Handle deactivation of IRQs outside of the LRs for both EOImode=0/1,
  : which involves special consideration for SPIs being deactivated from a
  : different vCPU than the one that acked it.
  KVM: arm64: Convert ICH_HCR_EL2_TDIR cap to EARLY_LOCAL_CPU_FEATURE
  KVM: arm64: selftests: vgic_irq: Add timer deactivation test
  KVM: arm64: selftests: vgic_irq: Add Group-0 enable test
  KVM: arm64: selftests: vgic_irq: Add asymmetric SPI deaectivation test
  KVM: arm64: selftests: vgic_irq: Perform EOImode==1 deactivation in ack order
  KVM: arm64: selftests: vgic_irq: Remove LR-bound limitation
  KVM: arm64: selftests: vgic_irq: Exclude timer-controlled interrupts
  KVM: arm64: selftests: vgic_irq: Change configuration before enabling interrupt
  KVM: arm64: selftests: vgic_irq: Fix GUEST_ASSERT_IAR_EMPTY() helper
  KVM: arm64: selftests: gic_v3: Disable Group-0 interrupts by default
  KVM: arm64: selftests: gic_v3: Add irq group setting helper
  KVM: arm64: GICv2: Always trap GICV_DIR register
  KVM: arm64: GICv2: Handle deactivation via GICV_DIR traps
  KVM: arm64: GICv2: Handle LR overflow when EOImode==0
  KVM: arm64: GICv3: Force exit to sync ICH_HCR_EL2.En
  KVM: arm64: GICv3: nv: Plug L1 LR sync into deactivation primitive
  KVM: arm64: GICv3: nv: Resync LRs/VMCR/HCR early for better MI emulation
  KVM: arm64: GICv3: Avoid broadcast kick on CPUs lacking TDIR
  KVM: arm64: GICv3: Handle in-LR deactivation when possible
  KVM: arm64: GICv3: Add SPI tracking to handle asymmetric deactivation
  ...

Signed-off-by: Oliver Upton <oupton@kernel.org>
11 daysMerge branch 'kvm-arm64/sea-user' into kvmarm/nextOliver Upton4-0/+335
* kvm-arm64/sea-user: : Userspace handling of SEAs, courtesy of Jiaqi Yan : : Add support for processing external aborts in userspace in situations : where the host has failed to do so, allowing the VMM to potentially : reinject an external abort into the VM. Documentation: kvm: new UAPI for handling SEA KVM: selftests: Test for KVM_EXIT_ARM_SEA KVM: arm64: VM exit to userspace to handle SEA Signed-off-by: Oliver Upton <oupton@kernel.org>
11 daysKVM: arm64: Fix spelling mistake "Unexpeced" -> "Unexpected"Colin Ian King1-1/+1
There is a spelling mistake in a TEST_FAIL message. Fix it. Signed-off-by: Colin Ian King <colin.i.king@gmail.com> Link: https://msgid.link/20251128175124.319094-1-colin.i.king@gmail.com Signed-off-by: Oliver Upton <oupton@kernel.org>
11 daysKVM: arm64: selftests: Add test for AT emulationOliver Upton4-0/+178
Add a basic test for AT emulation in the EL2&0 and EL1&0 translation regimes. Reviewed-by: Marc Zyngier <maz@kernel.org> Tested-by: Marc Zyngier <maz@kernel.org> Link: https://msgid.link/20251124190158.177318-16-oupton@kernel.org Signed-off-by: Oliver Upton <oupton@kernel.org>
12 daysMerge branch 'rcu/misc' into nextFrederic Weisbecker3-16/+157
- In order to prepare the layout for nohz_full work deferral to user exit, the context tracking state must shrink the counter of transitions to/from RCU not watching. The only possible hazard is to trigger wrap-around more easily, slightly delaying grace periods when that happens. This should be a rare event though. Yet add debugging and torture code to test that assumption. - Fix memory leak on locktorture module - Annotate accesses in rculist_nulls.h to prevent KCSAN warnings. In recent discussions, we also concluded that all those WRITE_ONCE() and READ_ONCE() on list APIs deserve appropriate comments. Something to be expected for the next cycle. - Provide a script to apply several configs to several commits with torture. - Allow torture to reuse a build directory in order to save needless rebuild time. - Various cleanups.
13 daysperf trace: Skip internal syscall argumentsNamhyung Kim1-0/+21
Recent changes in the linux-next kernel will add a new field to the syscall tracepoints that exposes the contents of userspace buffers, as shown below. # cat /sys/kernel/tracing/events/syscalls/sys_enter_write/format name: sys_enter_write ID: 758 format: field:unsigned short common_type; offset:0; size:2; signed:0; field:unsigned char common_flags; offset:2; size:1; signed:0; field:unsigned char common_preempt_count; offset:3; size:1; signed:0; field:int common_pid; offset:4; size:4; signed:1; field:int __syscall_nr; offset:8; size:4; signed:1; field:unsigned int fd; offset:16; size:8; signed:0; field:const char * buf; offset:24; size:8; signed:0; field:size_t count; offset:32; size:8; signed:0; field:__data_loc char[] __buf_val; offset:40; size:4; signed:0; print fmt: "fd: 0x%08lx, buf: 0x%08lx (%s), count: 0x%08lx", ((unsigned long)(REC->fd)), ((unsigned long)(REC->buf)), __print_dynamic_array(__buf_val, 1), ((unsigned long)(REC->count)) perf trace handles those arguments in a different way, so this change confuses it and makes some tests fail. Fix it by skipping the new fields that have the "__data_loc char[]" type. Maybe we can switch to this instead of the BPF augmentation later. Reviewed-by: Howard Chu <howardchu95@gmail.com> Tested-by: Thomas Richter <tmricht@linux.ibm.com> Tested-by: Steven Rostedt (Google) <rostedt@goodmis.org> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Howard Chu <howardchu95@gmail.com> Reported-by: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
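The skip described in this commit message can be sketched as follows. This is a minimal illustration and not the perf trace implementation: `struct field`, `is_internal_arg()` and `count_syscall_args()` are hypothetical names, and the only detail taken from the message is that the new fields have a type beginning with `__data_loc char[]`.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified view of one tracepoint format field. */
struct field {
	const char *type;
	const char *name;
};

/* True when the field's type starts with "__data_loc char[]". */
static bool is_internal_arg(const struct field *f)
{
	static const char prefix[] = "__data_loc char[]";

	return strncmp(f->type, prefix, sizeof(prefix) - 1) == 0;
}

/* Count the fields that should be treated as real syscall arguments. */
static size_t count_syscall_args(const struct field *fields, size_t n)
{
	size_t i, count = 0;

	for (i = 0; i < n; i++) {
		if (is_internal_arg(&fields[i]))
			continue; /* skip internal fields such as __buf_val */
		count++;
	}
	return count;
}
```

For the sys_enter_write format shown in the message, a filter of this kind would keep fd, buf and count and drop __buf_val.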
13 daysselftests/mm/uffd: initialize char variable to NullAnkit Khushwaha2-5/+5
In "uffd-stress.c" and "uffd-unit-tests.c", the address of an uninitialized char variable (holding a garbage value) is passed to the 'write' syscall, which triggers the following warnings: uffd-stress.c:246:39: warning: variable 'c' is uninitialized when passed as a const pointer argument here [-Wuninitialized-const-pointer] uffd-unit-tests.c:581:31: warning: variable 'c' is uninitialized when passed as a const pointer argument here [-Wuninitialized-const-pointer] The fix is to initialize the char variable to '\0' so that no garbage value is written. Link: https://lkml.kernel.org/r/20251126160830.52124-1-ankitkhushwaha.linux@gmail.com Signed-off-by: Ankit Khushwaha <ankitkhushwaha.linux@gmail.com> Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org> Cc: Bill Wendling <morbo@google.com> Cc: Justin Stitt <justinstitt@google.com> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Cc: Michal Hocko <mhocko@suse.com> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Peter Xu <peterx@redhat.com> Cc: Shuah Khan <shuah@kernel.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Vlastimil Babka <vbabka@suse.cz> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
13 daysmm: introduce VMA flags bitmap typeLorenzo Stoakes1-30/+120
It is useful to transition to using a bitmap for VMA flags so we can avoid running out of flags, especially for 32-bit kernels which are constrained to 32 flags, necessitating some features to be limited to 64-bit kernels only. By doing so, we remove any constraint on the number of VMA flags moving forwards no matter the platform and can decide in future to extend beyond 64 if required. We start by declaring an opaque type, vma_flags_t (which resembles mm_struct flags of type mm_flags_t), setting it to precisely the same size as vm_flags_t, and place it in union with vm_flags in the VMA declaration. We additionally update struct vm_area_desc equivalently placing the new opaque type in union with vm_flags. This change therefore does not impact the size of struct vm_area_struct or struct vm_area_desc. In order for the change to be iterative and to avoid impacting performance, we designate VM_xxx declared bitmap flag values as those which must exist in the first system word of the VMA flags bitmap. We therefore declare vma_flags_clear_all(), vma_flags_overwrite_word(), vma_flags_overwrite_word_once(), vma_flags_set_word() and vma_flags_clear_word() in order to allow us to update the existing vm_flags_*() functions to utilise these helpers. This is a stepping stone towards converting users to the VMA flags bitmap and behaves precisely as before. By doing this, we can eliminate the existing private vma->__vm_flags field in the vma->vm_flags union and replace it with the newly introduced opaque type vma_flags, which we call flags so we refer to the new bitmap field as vma->flags. We update vma_flag_[test, set]_atomic() to account for the change also. We adapt vm_flags_reset_once() to only clear those bits above the first system word providing write-once semantics to the first system word (which it is presumed the caller requires - and in all current use cases this is so).
As we currently only specify that the VMA flags bitmap size is equal to BITS_PER_LONG number of bits, this is a noop, but is defensive in preparation for a future change that increases this. We additionally update the VMA userland test declarations to implement the same changes there. Finally, we update the rust code to reference vma->vm_flags on update rather than vma->__vm_flags which has been removed. This is safe for now, albeit it is implicitly performing a const cast. Once we introduce flag helpers we can improve this more. No functional change intended. Link: https://lkml.kernel.org/r/bab179d7b153ac12f221b7d65caac2759282cfe9.1764064557.git.lorenzo.stoakes@oracle.com Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Pedro Falcato <pfalcato@suse.de> Acked-by: Alice Ryhl <aliceryhl@google.com> [rust] Cc: Alex Gaynor <alex.gaynor@gmail.com> Cc: Alistair Popple <apopple@nvidia.com> Cc: Andreas Hindborg <a.hindborg@kernel.org> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: Baoquan He <bhe@redhat.com> Cc: Barry Song <baohua@kernel.org> Cc: Ben Segall <bsegall@google.com> Cc: Björn Roy Baron <bjorn3_gh@protonmail.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Byungchul Park <byungchul@sk.com> Cc: Chengming Zhou <chengming.zhou@linux.dev> Cc: Chris Li <chrisl@kernel.org> Cc: Danilo Krummrich <dakr@kernel.org> Cc: David Hildenbrand <david@redhat.com> Cc: David Rientjes <rientjes@google.com> Cc: Dev Jain <dev.jain@arm.com> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Gary Guo <gary@garyguo.net> Cc: Gregory Price <gourry@gourry.net> Cc: "Huang, Ying" <ying.huang@linux.alibaba.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Joshua Hahn <joshua.hahnjy@gmail.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: 
Kairui Song <kasong@tencent.com> Cc: Kees Cook <kees@kernel.org> Cc: Kemeng Shi <shikemeng@huaweicloud.com> Cc: Lance Yang <lance.yang@linux.dev> Cc: Leon Romanovsky <leon@kernel.org> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Mathew Brost <matthew.brost@intel.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Michal Hocko <mhocko@suse.com> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Mike Rapoport <rppt@kernel.org> Cc: Muchun Song <muchun.song@linux.dev> Cc: Nhat Pham <nphamcs@gmail.com> Cc: Nico Pache <npache@redhat.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Peter Xu <peterx@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Rakie Kim <rakie.kim@sk.com> Cc: Rik van Riel <riel@surriel.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Shakeel Butt <shakeel.butt@linux.dev> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Trevor Gross <tmgross@umich.edu> Cc: Valentin Schneider <vschneid@redhat.com> Cc: Vincent Guittot <vincent.guittot@linaro.org> Cc: Wei Xu <weixugc@google.com> Cc: xu xin <xu.xin16@zte.com.cn> Cc: Yuanchu Xie <yuanchu@google.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
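As an illustration of the union and first-word helpers described in this commit message, here is a minimal userspace sketch in C. All `demo_` names are hypothetical and are not the kernel's definitions. Unlike the commit, which keeps the bitmap at exactly one system word, the sketch assumes a 128-bit bitmap so that clearing the words above the first one has a visible effect.

```c
#include <limits.h>
#include <stddef.h>

#define DEMO_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define DEMO_NUM_FLAG_BITS 128 /* assumption: deliberately larger than one word */
#define DEMO_NUM_WORDS \
	((DEMO_NUM_FLAG_BITS + DEMO_BITS_PER_LONG - 1) / DEMO_BITS_PER_LONG)

/* Opaque-style bitmap type; callers are meant to go through the helpers. */
typedef struct {
	unsigned long bits[DEMO_NUM_WORDS];
} demo_vma_flags_t;

struct demo_vma {
	union {
		unsigned long vm_flags; /* legacy single-word view */
		demo_vma_flags_t flags; /* bitmap view; word 0 aliases vm_flags */
	};
};

/* Zero every word of the bitmap. */
static void demo_flags_clear_all(demo_vma_flags_t *f)
{
	size_t i;

	for (i = 0; i < DEMO_NUM_WORDS; i++)
		f->bits[i] = 0;
}

/* Replace the first word, leaving any higher words untouched. */
static void demo_flags_overwrite_word(demo_vma_flags_t *f, unsigned long v)
{
	f->bits[0] = v;
}

/* Set bits in the first word. */
static void demo_flags_set_word(demo_vma_flags_t *f, unsigned long v)
{
	f->bits[0] |= v;
}

/* Clear bits in the first word. */
static void demo_flags_clear_word(demo_vma_flags_t *f, unsigned long v)
{
	f->bits[0] &= ~v;
}

/* Reset: clear only the words above the first, then overwrite the first. */
static void demo_vm_flags_reset(struct demo_vma *vma, unsigned long v)
{
	size_t i;

	for (i = 1; i < DEMO_NUM_WORDS; i++)
		vma->flags.bits[i] = 0;
	demo_flags_overwrite_word(&vma->flags, v);
}
```

Because the first word of the bitmap shares storage with the legacy field, code that still reads `vm_flags` observes the same value that the word helpers wrote.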
13 daystools/testing/vma: eliminate dependency on vma->__vm_flagsLorenzo Stoakes1-10/+10
The userland VMA test code relied on an internal implementation detail - the existence of vma->__vm_flags to directly access VMA flags. There is no need to do so when we have the vm_flags_*() helper functions available. This is ugly, but also a subsequent commit will eliminate this field altogether so this will shortly become broken. This patch has us utilise the helper functions instead. Link: https://lkml.kernel.org/r/6275c53a6bb20743edcbe92d3e130183b47d18d0.1764064557.git.lorenzo.stoakes@oracle.com Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Acked-by: Pedro Falcato <pfalcato@suse.de> Acked-by: Alice Ryhl <aliceryhl@google.com> [rust] Cc: Alex Gaynor <alex.gaynor@gmail.com> Cc: Alistair Popple <apopple@nvidia.com> Cc: Andreas Hindborg <a.hindborg@kernel.org> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: Baoquan He <bhe@redhat.com> Cc: Barry Song <baohua@kernel.org> Cc: Ben Segall <bsegall@google.com> Cc: Björn Roy Baron <bjorn3_gh@protonmail.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Byungchul Park <byungchul@sk.com> Cc: Chengming Zhou <chengming.zhou@linux.dev> Cc: Chris Li <chrisl@kernel.org> Cc: Danilo Krummrich <dakr@kernel.org> Cc: David Hildenbrand <david@redhat.com> Cc: David Rientjes <rientjes@google.com> Cc: Dev Jain <dev.jain@arm.com> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Gary Guo <gary@garyguo.net> Cc: Gregory Price <gourry@gourry.net> Cc: "Huang, Ying" <ying.huang@linux.alibaba.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Joshua Hahn <joshua.hahnjy@gmail.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Kairui Song <kasong@tencent.com> Cc: Kees Cook <kees@kernel.org> Cc: Kemeng Shi <shikemeng@huaweicloud.com> Cc: Lance Yang <lance.yang@linux.dev> Cc: Leon Romanovsky 
<leon@kernel.org> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Mathew Brost <matthew.brost@intel.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Michal Hocko <mhocko@suse.com> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Mike Rapoport <rppt@kernel.org> Cc: Muchun Song <muchun.song@linux.dev> Cc: Nhat Pham <nphamcs@gmail.com> Cc: Nico Pache <npache@redhat.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Peter Xu <peterx@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Rakie Kim <rakie.kim@sk.com> Cc: Rik van Riel <riel@surriel.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Shakeel Butt <shakeel.butt@linux.dev> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Trevor Gross <tmgross@umich.edu> Cc: Valentin Schneider <vschneid@redhat.com> Cc: Vincent Guittot <vincent.guittot@linaro.org> Cc: Wei Xu <weixugc@google.com> Cc: xu xin <xu.xin16@zte.com.cn> Cc: Yuanchu Xie <yuanchu@google.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
13 daysmm: declare VMA flags by bitLorenzo Stoakes1-45/+259
Patch series "initial work on making VMA flags a bitmap", v3. We are in the rather silly situation that we are running out of VMA flags as they are currently limited to a system word in size. This leads to absurd situations where we limit features to 64-bit architectures only because we simply do not have the ability to add a flag for 32-bit ones. This is very constraining and leads to hacks or, in the worst case, simply an inability to implement features we want for entirely arbitrary reasons. This also of course gives us something of a Y2K type situation in mm where we might eventually exhaust all of the VMA flags even on 64-bit systems. This series lays the groundwork for getting away from this limitation by establishing VMA flags as a bitmap whose size we can increase in future beyond 64 bits if required. This is necessarily a highly iterative process given the extensive use of VMA flags throughout the kernel, so we start by performing basic steps. Firstly, we declare VMA flags by bit number rather than by value, retaining the VM_xxx fields but in terms of these newly introduced VMA_xxx_BIT fields. While we are here, we use sparse annotations to ensure that, when dealing with VMA bit number parameters, we cannot be passed values which are not declared as such - providing some useful type safety. We then introduce an opaque VMA flag type, much like the opaque mm_struct flag type introduced in commit bb6525f2f8c4 ("mm: add bitmap mm->flags field"), which we establish in union with vma->vm_flags (but still set at system word size meaning there is no functional or data type size change). We update the vm_flags_xxx() helpers to use this new bitmap, introducing sensible helpers to do so. This series lays the foundation for further work to expand the use of bitmap VMA flags and eventually eliminate these arbitrary restrictions. 
This patch (of 4): In order to lay the groundwork for VMA flags being a bitmap rather than a system word in size, we need to be able to consistently refer to VMA flags by bit number rather than value. Take this opportunity to do so in an enum, which is additionally useful for tooling to extract metadata from. This additionally makes it very clear which bits are being used for what at a glance. We use the VMA_ prefix for the bit values as it is logical to do so since these reference VMAs. We consistently suffix with _BIT to make it clear what the values refer to. We declare bit values even when the flags that use them would not be enabled by config options as this is simply clearer and clearly defines what bit numbers are used for what, at no additional cost. We declare a sparse-bitwise type vma_flag_t which ensures that users can't pass around invalid VMA flags by accident and prepares for future work towards VMA flags being a bitmap where we want to ensure bit values are type safe. To make life easier, we declare some macro helpers - DECLARE_VMA_BIT() allows us to avoid duplication in the enum bit number declarations (and maintaining the sparse __bitwise attribute), and INIT_VM_FLAG() is used to assist with declaration of flags. Unfortunately we can't declare both in the enum, as we run into issues with logic in the kernel requiring that flags are preprocessor definitions, and additionally we cannot have a macro which declares another macro so we must define each flag macro directly. Additionally, update the VMA userland testing vma_internal.h header to include these changes. We also have to fix the parameters to the vma_flag_*_atomic() functions since VMA_MAYBE_GUARD_BIT is now of type vma_flag_t and sparse will complain otherwise. We have to update some rather silly if-deffery found in mm/task_mmu.c which would otherwise break. Finally, we update the rust binding helper as now it cannot auto-detect the flags at all.
Link: https://lkml.kernel.org/r/cover.1764064556.git.lorenzo.stoakes@oracle.com Link: https://lkml.kernel.org/r/3a35e5a0bcfa00e84af24cbafc0653e74deda64a.1764064556.git.lorenzo.stoakes@oracle.com Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com> Acked-by: Vlastimil Babka <vbabka@suse.cz> Reviewed-by: Pedro Falcato <pfalcato@suse.de> Acked-by: Alice Ryhl <aliceryhl@google.com> [rust] Cc: Alex Gaynor <alex.gaynor@gmail.com> Cc: Alistair Popple <apopple@nvidia.com> Cc: Andreas Hindborg <a.hindborg@kernel.org> Cc: Axel Rasmussen <axelrasmussen@google.com> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: Baoquan He <bhe@redhat.com> Cc: Barry Song <baohua@kernel.org> Cc: Ben Segall <bsegall@google.com> Cc: Björn Roy Baron <bjorn3_gh@protonmail.com> Cc: Boqun Feng <boqun.feng@gmail.com> Cc: Byungchul Park <byungchul@sk.com> Cc: Chengming Zhou <chengming.zhou@linux.dev> Cc: Chris Li <chrisl@kernel.org> Cc: Danilo Krummrich <dakr@kernel.org> Cc: David Hildenbrand <david@redhat.com> Cc: David Rientjes <rientjes@google.com> Cc: Dev Jain <dev.jain@arm.com> Cc: Dietmar Eggemann <dietmar.eggemann@arm.com> Cc: Gary Guo <gary@garyguo.net> Cc: Gregory Price <gourry@gourry.net> Cc: "Huang, Ying" <ying.huang@linux.alibaba.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jann Horn <jannh@google.com> Cc: Jason Gunthorpe <jgg@ziepe.ca> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: John Hubbard <jhubbard@nvidia.com> Cc: Joshua Hahn <joshua.hahnjy@gmail.com> Cc: Juri Lelli <juri.lelli@redhat.com> Cc: Kairui Song <kasong@tencent.com> Cc: Kees Cook <kees@kernel.org> Cc: Kemeng Shi <shikemeng@huaweicloud.com> Cc: Lance Yang <lance.yang@linux.dev> Cc: Leon Romanovsky <leon@kernel.org> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Mathew Brost <matthew.brost@intel.com> Cc: Matthew Wilcox (Oracle) <willy@infradead.org> Cc: Mel Gorman <mgorman@suse.de> Cc: Michal Hocko <mhocko@suse.com> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Mike Rapoport <rppt@kernel.org> Cc: Muchun Song 
<muchun.song@linux.dev> Cc: Nhat Pham <nphamcs@gmail.com> Cc: Nico Pache <npache@redhat.com> Cc: Oscar Salvador <osalvador@suse.de> Cc: Peter Xu <peterx@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Qi Zheng <zhengqi.arch@bytedance.com> Cc: Rakie Kim <rakie.kim@sk.com> Cc: Rik van Riel <riel@surriel.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Shakeel Butt <shakeel.butt@linux.dev> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Suren Baghdasaryan <surenb@google.com> Cc: Trevor Gross <tmgross@umich.edu> Cc: Valentin Schneider <vschneid@redhat.com> Cc: Vincent Guittot <vincent.guittot@linaro.org> Cc: Wei Xu <weixugc@google.com> Cc: xu xin <xu.xin16@zte.com.cn> Cc: Yuanchu Xie <yuanchu@google.com> Cc: Zi Yan <ziy@nvidia.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
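The idea of declaring flags by bit number and deriving the flag values from those bits can be sketched as below. This is an illustrative userspace sketch under stated assumptions: the `DEMO_` names, the four bits chosen and the helper `demo_flag_test()` are hypothetical, and the sparse `__bitwise` type checking mentioned in the message is not modeled.

```c
#include <stdbool.h>

/* Hypothetical bit numbers, one enum entry per flag. */
enum demo_vma_flag_bit {
	DEMO_VMA_READ_BIT   = 0,
	DEMO_VMA_WRITE_BIT  = 1,
	DEMO_VMA_EXEC_BIT   = 2,
	DEMO_VMA_SHARED_BIT = 3,
};

/* Derive a flag value from its bit number. */
#define DEMO_INIT_VM_FLAG(name) (1UL << DEMO_VMA_##name##_BIT)

/*
 * Each flag remains a preprocessor definition and is written out one by
 * one, since a macro cannot define another macro.
 */
#define DEMO_VM_READ   DEMO_INIT_VM_FLAG(READ)
#define DEMO_VM_WRITE  DEMO_INIT_VM_FLAG(WRITE)
#define DEMO_VM_EXEC   DEMO_INIT_VM_FLAG(EXEC)
#define DEMO_VM_SHARED DEMO_INIT_VM_FLAG(SHARED)

/* Test a flag word by bit number rather than by mask value. */
static bool demo_flag_test(unsigned long flags, enum demo_vma_flag_bit bit)
{
	return (flags >> bit) & 1UL;
}
```

Keeping the bit numbers in an enum means callers can pass a bit index rather than a mask, which is what a multi-word bitmap needs once flags no longer fit in a single word.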
13 daysselftests/bpf: do not hardcode target rate in test_tc_edt BPF programAlexis Lothoré (eBPF Foundation)2-3/+4
test_tc_edt currently defines the target rate in both the userspace and BPF parts. This value could be defined once in the userspace part if we make it able to configure the BPF program before starting the test. Add a target_rate variable in the BPF part, and make the userspace part set it to the desired rate before attaching the shaping program. Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com> Link: https://lore.kernel.org/r/20251128-tc_edt-v2-4-26db48373e73@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
13 daysselftests/bpf: remove test_tc_edt.shAlexis Lothoré (eBPF Foundation)2-102/+0
Now that test_tc_edt has been integrated in test_progs, remove the legacy shell script. Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com> Link: https://lore.kernel.org/r/20251128-tc_edt-v2-3-26db48373e73@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
13 daysselftests/bpf: integrate test_tc_edt into test_progsAlexis Lothoré (eBPF Foundation)2-1/+145
test_tc_edt.sh uses a pair of veth and a BPF program attached to the TX veth to shape the traffic to 5MBps. It then checks that the amount of received bytes (at interface level), compared to the TX duration, indeed matches 5Mbps. Convert this test script to the test_progs framework: - keep the double veth setup, isolated in two veths - run a small tcp server, and connect client to server - push a pre-configured amount of bytes, and measure how much time has been needed to push those - ensure that this rate is in a 2% error margin around the target rate This two percent value, while being tight, is hopefully large enough to not make the test too flaky in CI, while also turning it into a small example of BPF-based shaping. Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com> Link: https://lore.kernel.org/r/20251128-tc_edt-v2-2-26db48373e73@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
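The final check described above (measure how long a fixed number of bytes took, then require the resulting rate to fall within a 2% margin of the target) can be sketched as follows. The function names are hypothetical and the unit is assumed to be bytes per second; the actual selftest may compute and compare the rate differently.

```c
#include <stdbool.h>
#include <stdint.h>

#define NS_PER_SEC 1000000000ULL

/*
 * Average rate in bytes per second for 'bytes' pushed over 'duration_ns'.
 * Assumes bytes * NS_PER_SEC fits in 64 bits, which holds for test-sized
 * transfers.
 */
static uint64_t measured_rate(uint64_t bytes, uint64_t duration_ns)
{
	if (duration_ns == 0)
		return 0;
	return bytes * NS_PER_SEC / duration_ns;
}

/* True when 'rate' is within 'pct' percent of 'target', above or below. */
static bool rate_within_margin(uint64_t rate, uint64_t target, unsigned int pct)
{
	uint64_t margin = target * pct / 100;
	uint64_t diff = rate > target ? rate - target : target - rate;

	return diff <= margin;
}
```

With a target of 5,000,000 bytes per second, pushing 10,000,000 bytes in 2.03 s passes a 2% check under this sketch, while taking 2.1 s does not.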
13 daysselftests/bpf: rename test_tc_edt.bpf.c section to expose program typeAlexis Lothoré (eBPF Foundation)2-2/+3
The test_tc_edt BPF program uses a custom section name, which works fine when manually loading it with tc, but prevents it from being loaded with libbpf. Update the program section name to "tc" to be able to manipulate it with a libbpf-based C test. Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com> Link: https://lore.kernel.org/r/20251128-tc_edt-v2-1-26db48373e73@bootlin.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
13 daysselftests/bpf: Add success stats to rqspinlock stress testKumar Kartikeya Dwivedi1-12/+43
Add stats to observe the success and failure rate of lock acquisition attempts in various contexts. Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com> Link: https://lore.kernel.org/r/20251128232802.1031906-7-memxor@gmail.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
13 daysdocs: makefile: move rustdoc check to the build wrapperMauro Carvalho Chehab1-9/+32
The makefile logic to detect whether Rust is enabled does not work as expected: instead of using the current setting of CONFIG_RUST, the logic inside docs/Makefile uses a cached version from the last time a non-documentation target was executed. That's perfectly fine for the Sphinx build, as it doesn't need to read or depend on any CONFIG_*. So, instead of relying on the cache, move the logic to the wrapper script and let it check the current content of .config to verify whether CONFIG_RUST was selected. Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Signed-off-by: Jonathan Corbet <corbet@lwn.net> Message-ID: <c06b1834ef02099735c13ee1109fa2a2b9e47795.1763722971.git.mchehab+huawei@kernel.org>
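Checking the current content of .config for a selected option amounts to looking for an exact `CONFIG_RUST=y` line. The C sketch below only illustrates that matching logic; `config_enabled()` is a hypothetical helper operating on the file's text and is not the wrapper script's code.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Return true if 'text' (the content of a .config file) contains a line
 * that is exactly "<symbol>=y". Lines such as "# CONFIG_RUST is not set"
 * or longer symbols sharing the same prefix do not match.
 */
static bool config_enabled(const char *text, const char *symbol)
{
	size_t len = strlen(symbol);
	const char *line = text;

	while (line && *line) {
		const char *eol = strchr(line, '\n');
		size_t line_len = eol ? (size_t)(eol - line) : strlen(line);

		if (line_len == len + 2 &&
		    strncmp(line, symbol, len) == 0 &&
		    line[len] == '=' && line[len + 1] == 'y')
			return true;

		line = eol ? eol + 1 : NULL;
	}
	return false;
}
```

Reading the file at the time the documentation target runs, rather than reusing a value saved by an earlier build, is what avoids the stale result described in the message.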
13 daysdocs: kdoc: various fixes for grammar, spelling, punctuationRandy Dunlap5-26/+26
Correct grammar, spelling, and punctuation in comments, strings, print messages, logs. Change two instances of two spaces between words to just one space. codespell was used to find misspelled words. Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Cc: Jonathan Corbet <corbet@lwn.net> Cc: linux-doc@vger.kernel.org Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Signed-off-by: Jonathan Corbet <corbet@lwn.net> Message-ID: <20251124041011.3030571-1-rdunlap@infradead.org>
13 daysdocs: kdoc_parser: use '@' for Excess enum valueRandy Dunlap1-1/+1
kdoc is looking for "@value" here, so use that kind of string in the warning message. The "%value" can be confusing. This changes: Warning: drivers/net/wireless/mediatek/mt76/testmode.h:92 Excess enum value '%MT76_TM_ATTR_TX_PENDING' description in 'mt76_testmode_attr' to this: Warning: drivers/net/wireless/mediatek/mt76/testmode.h:92 Excess enum value '@MT76_TM_ATTR_TX_PENDING' description in 'mt76_testmode_attr' Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Reviewed-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Signed-off-by: Jonathan Corbet <corbet@lwn.net> Message-ID: <20251126061752.3497106-1-rdunlap@infradead.org>
13 daysdocs: kdoc_parser: add data/function attributes to ignoreRandy Dunlap1-0/+3
Recognize and ignore __rcu (in struct members), __private (in struct members), and __always_unused (in function parameters) to prevent kernel-doc warnings: Warning: include/linux/rethook.h:38 struct member 'void (__rcu *handler' not described in 'rethook' Warning: include/linux/hrtimer_types.h:47 Invalid param: enum hrtimer_restart (*__private function)(struct hrtimer *) Warning: security/ipe/hooks.c:81 function parameter '__always_unused' not described in 'ipe_mmap_file' Warning: security/ipe/hooks.c:109 function parameter '__always_unused' not described in 'ipe_file_mprotect' There are more of these (in compiler_types.h, compiler_attributes.h) that can be added as needed. Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Jonathan Corbet <corbet@lwn.net> Message-ID: <20251127063117.150384-1-rdunlap@infradead.org>
13 daysselftests: bonding: add delay before each xvlan_over_bond connectivity checkHangbin Liu1-0/+1
Jakub reported increased flakiness in bond_macvlan_ipvlan.sh on regular kernel, while the tests consistently pass on a debug kernel. This suggests a timing-sensitive issue. To mitigate this, introduce a short sleep before each xvlan_over_bond connectivity check. The delay helps ensure neighbor and route cache have fully converged before verifying connectivity. The sleep interval is kept minimal since check_connection() is invoked nearly 100 times during the test. Fixes: 246af950b940 ("selftests: bonding: add macvlan over bond testing") Reported-by: Jakub Kicinski <kuba@kernel.org> Closes: https://lore.kernel.org/netdev/20251114082014.750edfad@kernel.org Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Link: https://patch.msgid.link/20251127143310.47740-1-liuhangbin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 daysMerge tag 'nf-next-25-11-28' of ↵Jakub Kicinski1-14/+112
git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next Pablo Neira Ayuso says: ==================== Netfilter updates for net-next The following batch contains Netfilter updates for net-next: 0) Add sanity check for maximum encapsulations in bridge vlan, reported by the new AI robot. 1) Move the flowtable path discovery code to its own file, the nft_flow_offload.c mixes the nf_tables evaluation with the path discovery logic, just split this in two for clarity. 2) Consolidate flowtable xmit path by using dev_queue_xmit() and the real device behind the layer 2 vlan/pppoe device. This allows to inline encapsulation. After this update, hw_ifidx can be removed since both ifidx and hw_ifidx now point to the same device. 3) Support for IPIP encapsulation in the flowtable, extend selftest to cover for this new layer 3 offload, from Lorenzo Bianconi. 4) Push down the skb into the conncount API to fix duplicates in the conncount list for packets with non-confirmed conntrack entries, this is due to an optimization introduced in d265929930e2 ("netfilter: nf_conncount: reduce unnecessary GC"). From Fernando Fernandez Mancera. 5) In conncount, disable BH when performing garbage collection to consolidate existing behaviour in the conncount API, also from Fernando. 6) A matching packet with a confirmed conntrack invokes GC if conncount reaches the limit in an attempt to release slots. This allows the existing extensions to be used for real conntrack counting, not just limiting new connections, from Fernando. 7) Support for updating ct count objects in nf_tables, from Fernando. 8) Extend nft_flowtables.sh selftest to send IPv6 TCP traffic, from Lorenzo Bianconi. 9) Fixes for UAPI kernel-doc documentation, from Randy Dunlap. 
* tag 'nf-next-25-11-28' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next: netfilter: nf_tables: improve UAPI kernel-doc comments netfilter: ip6t_srh: fix UAPI kernel-doc comments format selftests: netfilter: nft_flowtable.sh: Add the capability to send IPv6 TCP traffic netfilter: nft_connlimit: add support to object update operation netfilter: nft_connlimit: update the count if add was skipped netfilter: nf_conncount: make nf_conncount_gc_list() to disable BH netfilter: nf_conncount: rework API to use sk_buff directly selftests: netfilter: nft_flowtable.sh: Add IPIP flowtable selftest netfilter: flowtable: Add IPIP tx sw acceleration netfilter: flowtable: Add IPIP rx sw acceleration netfilter: flowtable: use tuple address to calculate next hop netfilter: flowtable: remove hw_ifidx netfilter: flowtable: inline pppoe encapsulation in xmit path netfilter: flowtable: inline vlan encapsulation in xmit path netfilter: flowtable: consolidate xmit path netfilter: flowtable: move path discovery infrastructure to its own file netfilter: flowtable: check for maximum number of encapsulations in bridge vlan ==================== Link: https://patch.msgid.link/20251128002345.29378-1-pablo@netfilter.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 daystools: ynl: add a lint makefile targetDonald Hunter1-1/+3
Add a lint target to run yamllint on the YNL specs. make -C tools/net/ynl lint make: Entering directory '/home/donaldh/net-next/tools/net/ynl' yamllint ../../../Documentation/netlink/specs/*.yaml ../../../Documentation/netlink/specs/ethtool.yaml 1272:21 warning truthy value should be one of [false, true] (truthy) make: Leaving directory '/home/donaldh/net-next/tools/net/ynl' Signed-off-by: Donald Hunter <donald.hunter@gmail.com> Link: https://patch.msgid.link/20251127123502.89142-3-donald.hunter@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
13 daystools: ynl: add schema checkingDonald Hunter2-6/+35
Add a --validate flag to pyynl for explicit schema check with error reporting and add a schema_check make target to check all YNL specs. make -C tools/net/ynl schema_check make: Entering directory '/home/donaldh/net-next/tools/net/ynl' ok 1 binder.yaml schema validation not ok 2 conntrack.yaml schema validation 'labels mask' does not match '^[0-9a-z-]+$' Failed validating 'pattern' in schema['properties']['attribute-sets']['items']['properties']['attributes']['items']['properties']['name']: {'type': 'string', 'pattern': '^[0-9a-z-]+$'} On instance['attribute-sets'][14]['attributes'][22]['name']: 'labels mask' ok 3 devlink.yaml schema validation [...] Signed-off-by: Donald Hunter <donald.hunter@gmail.com> Link: https://patch.msgid.link/20251127123502.89142-2-donald.hunter@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
14 daysbpf: Remove runqslower toolHoyeon Lee9-413/+4
runqslower was added in commit 9c01546d26d2 "tools/bpf: Add runqslower tool to tools/bpf" as a BCC port to showcase early BPF CO-RE + libbpf workflows. runqslower continues to live in BCC (libbpf-tools), so there is no need to keep building and maintaining it. Drop tools/bpf/runqslower and remove all build hooks in tools/bpf and selftests accordingly. Signed-off-by: Hoyeon Lee <hoyeon.lee@suse.com> Link: https://lore.kernel.org/r/20251126093821.373291-1-hoyeon.lee@suse.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
14 days | selftests/bpf: Remove usage of lsm/file_alloc_security in selftest | Amery Hung | 3 files | -7/+7
    file_alloc_security hook is disabled. Use other LSM hooks in selftests instead.

    Signed-off-by: Amery Hung <ameryhung@gmail.com>
    Link: https://lore.kernel.org/r/20251126202927.2584874-2-ameryhung@gmail.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
14 days | bpf: force BPF_F_RDONLY_PROG on insn array creation | Anton Protopopov | 1 file | -1/+1
    The original implementation added a hack to check_mem_access() to prevent programs from writing into insn arrays. To get rid of this hack, enforce BPF_F_RDONLY_PROG on map creation.

    Also fix the corresponding selftest, as the error message changes with this patch.

    Suggested-by: Alexei Starovoitov <ast@kernel.org>
    Signed-off-by: Anton Protopopov <a.s.protopopov@gmail.com>
    Link: https://lore.kernel.org/r/20251128063224.1305482-2-a.s.protopopov@gmail.com
    Signed-off-by: Alexei Starovoitov <ast@kernel.org>
14 days | vfio: selftests: Add vfio_pci_device_init_perf_test | David Matlack | 2 files | -0/+171
    Add a new VFIO selftest for measuring the time it takes to run vfio_pci_device_init() in parallel for one or more devices.

    This test serves as a manual regression test for the performance improvement of commit e908f58b6beb ("vfio/pci: Separate SR-IOV VF dev_set"). For example, when running this test with 64 VFs under the same PF:

    Before:

      $ ./vfio_pci_device_init_perf_test -r vfio_pci_device_init_perf_test.iommufd.init 0000:1a:00.0 0000:1a:00.1 ...
      ...
      Wall time: 6.653234463s
      Min init time (per device): 0.101215344s
      Max init time (per device): 6.652755941s
      Avg init time (per device): 3.377609608s

    After:

      $ ./vfio_pci_device_init_perf_test -r vfio_pci_device_init_perf_test.iommufd.init 0000:1a:00.0 0000:1a:00.1 ...
      ...
      Wall time: 0.122978332s
      Min init time (per device): 0.108121915s
      Max init time (per device): 0.122762761s
      Avg init time (per device): 0.113816748s

    This test does not make any assertions about performance, since any such assertion is likely to be flaky due to system differences and random noise. However, this test can be fed into automation to detect regressions, and can be used by developers in the future to measure performance optimizations.

    Suggested-by: Aaron Lewis <aaronlewis@google.com>
    Reviewed-by: Alex Mastro <amastro@fb.com>
    Tested-by: Alex Mastro <amastro@fb.com>
    Reviewed-by: Raghavendra Rao Ananta <rananta@google.com>
    Signed-off-by: David Matlack <dmatlack@google.com>
    Link: https://lore.kernel.org/r/20251126231733.3302983-19-dmatlack@google.com
    Signed-off-by: Alex Williamson <alex@shazbot.org>
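The four figures reported by this test (wall time plus per-device min, max, and average) can be illustrated with a small Python sketch. It is not the selftest itself, which is written in C. `timed_parallel_init`, `summarize`, and `fake_init` are hypothetical names, and the sleep is only a placeholder for vfio_pci_device_init().

```python
import threading
import time


def timed_parallel_init(devices, init_fn):
    """Run init_fn(device) on one thread per device and time each call."""
    durations = [0.0] * len(devices)

    def worker(index, device):
        start = time.monotonic()
        init_fn(device)
        durations[index] = time.monotonic() - start

    threads = [
        threading.Thread(target=worker, args=(i, dev))
        for i, dev in enumerate(devices)
    ]
    wall_start = time.monotonic()
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return time.monotonic() - wall_start, durations


def summarize(wall, durations):
    """Reduce per-device durations to the four reported figures."""
    return {
        "wall": wall,
        "min": min(durations),
        "max": max(durations),
        "avg": sum(durations) / len(durations),
    }


def fake_init(device):
    # Placeholder workload standing in for vfio_pci_device_init().
    time.sleep(0.01)


if __name__ == "__main__":
    wall, durations = timed_parallel_init(["dev0", "dev1", "dev2"], fake_init)
    stats = summarize(wall, durations)
    print(f"Wall time: {stats['wall']:.9f}s")
    print(f"Min init time (per device): {stats['min']:.9f}s")
    print(f"Max init time (per device): {stats['max']:.9f}s")
    print(f"Avg init time (per device): {stats['avg']:.9f}s")
```

Read this way, the "Before" numbers are consistent with the 64 initializations running one after another: the maximum is almost equal to the wall time, the average is about half of it, and 6.653s / 64 is roughly 0.104s, close to the reported minimum. In the "After" numbers the minimum, maximum, and wall time are all close together, which is what overlapping initializations would produce.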
14 days | vfio: selftests: Eliminate INVALID_IOVA | David Matlack | 4 files | -10/+13
    Eliminate INVALID_IOVA as there are platforms where UINT64_MAX is a valid iova.

    Reviewed-by: Alex Mastro <amastro@fb.com>
    Tested-by: Alex Mastro <amastro@fb.com>
    Reviewed-by: Raghavendra Rao Ananta <rananta@google.com>
    Signed-off-by: David Matlack <dmatlack@google.com>
    Link: https://lore.kernel.org/r/20251126231733.3302983-18-dmatlack@google.com
    Signed-off-by: Alex Williamson <alex@shazbot.org>
14 days | vfio: selftests: Split libvfio.h into separate header files | David Matlack | 6 files | -334/+381
    Split out the contents of libvfio.h into separate header files, but keep libvfio.h as the top-level include that all tests can use.

    Put all new header files into a libvfio/ subdirectory to avoid future name conflicts in include paths when libvfio is used by other selftests like KVM.

    No functional change intended.

    Reviewed-by: Alex Mastro <amastro@fb.com>
    Tested-by: Alex Mastro <amastro@fb.com>
    Reviewed-by: Raghavendra Rao Ananta <rananta@google.com>
    Signed-off-by: David Matlack <dmatlack@google.com>
    Link: https://lore.kernel.org/r/20251126231733.3302983-17-dmatlack@google.com
    Signed-off-by: Alex Williamson <alex@shazbot.org>
14 days | vfio: selftests: Move vfio_selftests_*() helpers into libvfio.c | David Matlack | 3 files | -71/+79
    Move the vfio_selftests_*() helpers into their own file libvfio.c. These helpers have nothing to do with struct vfio_pci_device, so they don't make sense in vfio_pci_device.c.

    No functional change intended.

    Reviewed-by: Alex Mastro <amastro@fb.com>
    Tested-by: Alex Mastro <amastro@fb.com>
    Reviewed-by: Raghavendra Rao Ananta <rananta@google.com>
    Signed-off-by: David Matlack <dmatlack@google.com>
    Link: https://lore.kernel.org/r/20251126231733.3302983-16-dmatlack@google.com
    Signed-off-by: Alex Williamson <alex@shazbot.org>
14 days | vfio: selftests: Rename vfio_util.h to libvfio.h | David Matlack | 11 files | -13/+13
    Rename vfio_util.h to libvfio.h to match the name of libvfio.mk.

    No functional change intended.

    Reviewed-by: Alex Mastro <amastro@fb.com>
    Tested-by: Alex Mastro <amastro@fb.com>
    Reviewed-by: Raghavendra Rao Ananta <rananta@google.com>
    Signed-off-by: David Matlack <dmatlack@google.com>
    Link: https://lore.kernel.org/r/20251126231733.3302983-15-dmatlack@google.com
    Signed-off-by: Alex Williamson <alex@shazbot.org>
14 days | vfio: selftests: Stop passing device for IOMMU operations | David Matlack | 4 files | -65/+22
    Drop the struct vfio_pci_device wrappers for IOMMU map/unmap functions and require tests to directly call iommu_map(), iommu_unmap(), etc. This results in more concise code, and also makes it clear the map operations are happening on a struct iommu, not necessarily on a specific device, especially when multi-device tests are introduced.

    Do the same for iova_allocator_init() as that function only needs the struct iommu, not struct vfio_pci_device.

    Reviewed-by: Alex Mastro <amastro@fb.com>
    Tested-by: Alex Mastro <amastro@fb.com>
    Reviewed-by: Raghavendra Rao Ananta <rananta@google.com>
    Signed-off-by: David Matlack <dmatlack@google.com>
    Link: https://lore.kernel.org/r/20251126231733.3302983-14-dmatlack@google.com
    Signed-off-by: Alex Williamson <alex@shazbot.org>
14 days | vfio: selftests: Move IOVA allocator into iova_allocator.c | David Matlack | 3 files | -71/+95
    Move the IOVA allocator into its own file, to provide better separation between the allocator and the struct vfio_pci_device helper code. The allocator could go into iommu.c, but it is standalone enough that a separate file seems cleaner. This also continues the trend of having a .c for every major object in VFIO selftests (vfio_pci_device.c, vfio_pci_driver.c, iommu.c, and now iova_allocator.c).

    No functional change intended.

    Reviewed-by: Alex Mastro <amastro@fb.com>
    Tested-by: Alex Mastro <amastro@fb.com>
    Reviewed-by: Raghavendra Rao Ananta <rananta@google.com>
    Signed-off-by: David Matlack <dmatlack@google.com>
    Link: https://lore.kernel.org/r/20251126231733.3302983-13-dmatlack@google.com
    Signed-off-by: Alex Williamson <alex@shazbot.org>
14 days | vfio: selftests: Move IOMMU library code into iommu.c | David Matlack | 4 files | -453/+527
    Move all the IOMMU-related library code into its own file, iommu.c. This provides a better separation between the vfio_pci_device helper code and the iommu code.

    No functional change intended.

    Reviewed-by: Alex Mastro <amastro@fb.com>
    Tested-by: Alex Mastro <amastro@fb.com>
    Reviewed-by: Raghavendra Rao Ananta <rananta@google.com>
    Signed-off-by: David Matlack <dmatlack@google.com>
    Link: https://lore.kernel.org/r/20251126231733.3302983-12-dmatlack@google.com
    Signed-off-by: Alex Williamson <alex@shazbot.org>
14 days | vfio: selftests: Rename struct vfio_dma_region to dma_region | David Matlack | 4 files | -21/+21
    Rename struct vfio_dma_region to dma_region. This is in preparation for separating the VFIO PCI device library code from the IOMMU library code. This name change also better reflects the fact that DMA mappings can be managed by either VFIO or IOMMUFD, i.e. the "vfio_" prefix is misleading.

    Reviewed-by: Alex Mastro <amastro@fb.com>
    Tested-by: Alex Mastro <amastro@fb.com>
    Reviewed-by: Raghavendra Rao Ananta <rananta@google.com>
    Signed-off-by: David Matlack <dmatlack@google.com>
    Link: https://lore.kernel.org/r/20251126231733.3302983-11-dmatlack@google.com
    Signed-off-by: Alex Williamson <alex@shazbot.org>
14 days | vfio: selftests: Upgrade driver logging to dev_err() | David Matlack | 2 files | -4/+4
    Upgrade various logging in the VFIO selftests drivers from dev_info() to dev_err(). All of these logs indicate scenarios that may be unexpected. For example, the logging during probing indicates devices that match but aren't supported by the driver. And the memcpy errors can indicate a problem if the caller was not trying to do something like exercise I/O fault handling. Exercising I/O fault handling is certainly a valid thing to do, but the driver can't infer the caller's expectations, so it is better to just log with dev_err().

    Suggested-by: Raghavendra Rao Ananta <rananta@google.com>
    Reviewed-by: Alex Mastro <amastro@fb.com>
    Tested-by: Alex Mastro <amastro@fb.com>
    Reviewed-by: Raghavendra Rao Ananta <rananta@google.com>
    Signed-off-by: David Matlack <dmatlack@google.com>
    Link: https://lore.kernel.org/r/20251126231733.3302983-10-dmatlack@google.com
    Signed-off-by: Alex Williamson <alex@shazbot.org>
14 days | vfio: selftests: Prefix logs with device BDF where relevant | David Matlack | 4 files | -25/+30
    Prefix log messages with the device's BDF where relevant. This will help with understanding VFIO selftests logs when tests are run with multiple devices.

    Reviewed-by: Alex Mastro <amastro@fb.com>
    Tested-by: Alex Mastro <amastro@fb.com>
    Reviewed-by: Raghavendra Rao Ananta <rananta@google.com>
    Signed-off-by: David Matlack <dmatlack@google.com>
    Link: https://lore.kernel.org/r/20251126231733.3302983-9-dmatlack@google.com
    Signed-off-by: Alex Williamson <alex@shazbot.org>
14 days | vfio: selftests: Eliminate overly chatty logging | David Matlack | 1 file | -14/+0
    Eliminate overly chatty logs that are printed during almost every test. These logs are adding more noise than value. If a test cares about this information it can log it itself. This is especially true as the VFIO selftests gain support for multiple devices in a single test (which multiplies all these logs).

    Reviewed-by: Alex Mastro <amastro@fb.com>
    Tested-by: Alex Mastro <amastro@fb.com>
    Reviewed-by: Raghavendra Rao Ananta <rananta@google.com>
    Signed-off-by: David Matlack <dmatlack@google.com>
    Link: https://lore.kernel.org/r/20251126231733.3302983-8-dmatlack@google.com
    Signed-off-by: Alex Williamson <alex@shazbot.org>
14 days | vfio: selftests: Support multiple devices in the same container/iommufd | David Matlack | 5 files | -50/+107
    Support tests that want to add multiple devices to the same container/iommufd by decoupling struct vfio_pci_device from struct iommu.

    Multi-device tests can now put multiple devices in the same container/iommufd like so:

      iommu = iommu_init(iommu_mode);

      device1 = vfio_pci_device_init(bdf1, iommu);
      device2 = vfio_pci_device_init(bdf2, iommu);
      device3 = vfio_pci_device_init(bdf3, iommu);

      ...

      vfio_pci_device_cleanup(device3);
      vfio_pci_device_cleanup(device2);
      vfio_pci_device_cleanup(device1);

      iommu_cleanup(iommu);

    To account for the new separation of vfio_pci_device and iommu, update existing tests to initialize and clean up a struct iommu.

    Reviewed-by: Alex Mastro <amastro@fb.com>
    Tested-by: Alex Mastro <amastro@fb.com>
    Reviewed-by: Raghavendra Rao Ananta <rananta@google.com>
    Signed-off-by: David Matlack <dmatlack@google.com>
    Link: https://lore.kernel.org/r/20251126231733.3302983-7-dmatlack@google.com
    Signed-off-by: Alex Williamson <alex@shazbot.org>
14 days | vfio: selftests: Introduce struct iommu | David Matlack | 2 files | -42/+49
    Introduce struct iommu, which logically represents either a VFIO container or an iommufd IOAS, depending on which IOMMU mode is used by the test.

    This will be used in a subsequent commit to allow devices to be added to the same container/iommufd.

    Reviewed-by: Alex Mastro <amastro@fb.com>
    Tested-by: Alex Mastro <amastro@fb.com>
    Reviewed-by: Raghavendra Rao Ananta <rananta@google.com>
    Signed-off-by: David Matlack <dmatlack@google.com>
    Link: https://lore.kernel.org/r/20251126231733.3302983-6-dmatlack@google.com
    Signed-off-by: Alex Williamson <alex@shazbot.org>
14 days | vfio: selftests: Rename struct vfio_iommu_mode to iommu_mode | David Matlack | 2 files | -4/+4
    Rename struct vfio_iommu_mode to struct iommu_mode since the mode can include iommufd. This also prepares for splitting out all the IOMMU code into its own structs/helpers/files which are independent from the vfio_pci_device code.

    No functional change intended.

    Reviewed-by: Alex Mastro <amastro@fb.com>
    Tested-by: Alex Mastro <amastro@fb.com>
    Reviewed-by: Raghavendra Rao Ananta <rananta@google.com>
    Signed-off-by: David Matlack <dmatlack@google.com>
    Link: https://lore.kernel.org/r/20251126231733.3302983-5-dmatlack@google.com
    Signed-off-by: Alex Williamson <alex@shazbot.org>
14 days | vfio: selftests: Allow passing multiple BDFs on the command line | David Matlack | 2 files | -11/+48
    Add support for passing multiple device BDFs to a test via the command line. This is a prerequisite for multi-device tests.

    Single-device tests can continue using vfio_selftests_get_bdf(), which will continue to return argv[argc - 1] (if it is a BDF string), or the environment variable $VFIO_SELFTESTS_BDF otherwise.

    For multi-device tests, a new helper called vfio_selftests_get_bdfs() is introduced which will return an array of all BDFs found at the end of argv[], as well as the number of BDFs found (passed back to the caller via argument). The array of BDFs returned does not need to be freed by the caller.

    The environment variable VFIO_SELFTESTS_BDF continues to support only a single BDF for the time being.

    Reviewed-by: Alex Mastro <amastro@fb.com>
    Tested-by: Alex Mastro <amastro@fb.com>
    Reviewed-by: Raghavendra Rao Ananta <rananta@google.com>
    Signed-off-by: David Matlack <dmatlack@google.com>
    Link: https://lore.kernel.org/r/20251126231733.3302983-4-dmatlack@google.com
    Signed-off-by: Alex Williamson <alex@shazbot.org>
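The "all BDFs found at the end of argv[]" behaviour described in this commit can be sketched in Python as follows. This is an illustration of the idea only, not the C helper. `trailing_bdfs` is a hypothetical name, the BDF regular expression is an assumed shape (domain:bus:device.function in hex), and the fallback to a single BDF from the environment variable is an assumption drawn from the last paragraph of the commit message.

```python
import os
import re

# Assumed BDF shape: domain:bus:device.function, e.g. 0000:1a:00.0.
BDF_RE = re.compile(r"[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]")


def trailing_bdfs(argv, env=None):
    """Return the run of BDF-looking arguments at the end of argv."""
    env = os.environ if env is None else env
    bdfs = []
    # Walk backwards from the last argument and stop at the first non-BDF.
    for arg in reversed(argv[1:]):
        if not BDF_RE.fullmatch(arg):
            break
        bdfs.append(arg)
    bdfs.reverse()
    if bdfs:
        return bdfs
    # Assumed fallback: the environment variable carries at most one BDF.
    single = env.get("VFIO_SELFTESTS_BDF")
    return [single] if single else []


print(trailing_bdfs(["test", "-r", "name", "0000:1a:00.0", "0000:1a:00.1"], env={}))
```

Scanning from the end means options and their values can precede the BDFs without being mistaken for devices, while a BDF followed by some other argument is not picked up.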
14 days | vfio: selftests: Split run.sh into separate scripts | David Matlack | 5 files | -99/+140
    Split run.sh into separate scripts (setup.sh, run.sh, cleanup.sh) to enable multi-device testing, and prepare for VFIO selftests automatically detecting which devices to use for testing by storing device metadata on the filesystem.

     - setup.sh takes one or more BDFs as arguments and sets up each device. Metadata about each device is stored on the filesystem in the directory:

           ${TMPDIR:-/tmp}/vfio-selftests-devices

       Within this directory is a directory for each BDF, and then files in those directories that cleanup.sh uses to clean up the device.

     - run.sh runs a selftest by passing it the BDFs of all set up devices.

     - cleanup.sh takes zero or more BDFs as arguments and cleans up each device. If no BDFs are provided, it cleans up all devices.

    This split enables multi-device testing by allowing multiple BDFs to be set up and passed into tests. For example:

      $ tools/testing/selftests/vfio/scripts/setup.sh <BDF1> <BDF2>
      $ tools/testing/selftests/vfio/scripts/setup.sh <BDF3>
      $ tools/testing/selftests/vfio/scripts/run.sh echo <BDF1> <BDF2> <BDF3>
      $ tools/testing/selftests/vfio/scripts/cleanup.sh

    In the future, VFIO selftests can automatically detect set up devices by inspecting ${TMPDIR:-/tmp}/vfio-selftests-devices. This will avoid the need for the run.sh script.

    Reviewed-by: Alex Mastro <amastro@fb.com>
    Tested-by: Alex Mastro <amastro@fb.com>
    Reviewed-by: Raghavendra Rao Ananta <rananta@google.com>
    Signed-off-by: David Matlack <dmatlack@google.com>
    Link: https://lore.kernel.org/r/20251126231733.3302983-3-dmatlack@google.com
    Signed-off-by: Alex Williamson <alex@shazbot.org>
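The metadata layout described in this commit (one sub-directory per BDF under ${TMPDIR:-/tmp}/vfio-selftests-devices) suggests how set-up devices could be discovered. A minimal Python sketch, assuming only that layout; `devices_dir` and `set_up_bdfs` are hypothetical helpers, and the selftests do not ship this code:

```python
import os
from pathlib import Path


def devices_dir(env=None) -> Path:
    """Resolve ${TMPDIR:-/tmp}/vfio-selftests-devices."""
    env = os.environ if env is None else env
    # The ":-" form falls back to /tmp when TMPDIR is unset or empty.
    return Path(env.get("TMPDIR") or "/tmp") / "vfio-selftests-devices"


def set_up_bdfs(root: Path) -> list:
    """List the BDFs that have a metadata directory under root."""
    if not root.is_dir():
        return []
    # Each set-up device is represented by a directory named after its BDF.
    return sorted(entry.name for entry in root.iterdir() if entry.is_dir())


print(set_up_bdfs(devices_dir()))
```

On a machine where setup.sh has not been run, the directory does not exist and the list is empty.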
14 days | vfio: selftests: Move run.sh into scripts directory | David Matlack | 2 files | -1/+3
    Move run.sh into a new sub-directory scripts/. This directory will be used to house various helper scripts to be used by humans and automation for running VFIO selftests.

    Opportunistically also switch run.sh from TEST_PROGS_EXTENDED to TEST_FILES. The former is for actual test executables that are just not run by default. TEST_FILES is a better fit for helper scripts.

    No functional change intended.

    Reviewed-by: Alex Mastro <amastro@fb.com>
    Tested-by: Alex Mastro <amastro@fb.com>
    Reviewed-by: Raghavendra Rao Ananta <rananta@google.com>
    Signed-off-by: David Matlack <dmatlack@google.com>
    Link: https://lore.kernel.org/r/20251126231733.3302983-2-dmatlack@google.com
    Signed-off-by: Alex Williamson <alex@shazbot.org>
14 days | Merge tag 'vfio-v6.18-rc6' into v6.19/vfio/next | Alex Williamson | 4 files | -9/+288
    Merge mainline vfio-selftest updates for ongoing v6.19 work.

    Signed-off-by: Alex Williamson <alex@shazbot.org>
14 days | selftests/landlock: Add disconnected leafs and branch test suites | Mickaël Salaün | 1 file | -0/+1051
    Test disconnected directories with two test suites (layout4_disconnected_leafs and layout5_disconnected_branch) and 43 variants to cover the main corner cases.

    These tests are complementary to the previous commit.

    Add test_renameat() and test_exchangeat() helpers.

    Test coverage for security/landlock is 92.1% of 1927 lines according to LLVM 20.

    Cc: Günther Noack <gnoack@google.com>
    Cc: Song Liu <song@kernel.org>
    Cc: Tingmao Wang <m@maowtm.org>
    Link: https://lore.kernel.org/r/20251128172200.760753-5-mic@digikod.net
    Signed-off-by: Mickaël Salaün <mic@digikod.net>