Age | Commit message | Author | Files | Lines
10 days | scsi: sd: reject invalid pr_read_keys() num_keys values | Stefan Hajnoczi | 1 | -1/+11
The pr_read_keys() interface has a u32 num_keys parameter. The SCSI PERSISTENT RESERVE IN command has a maximum READ KEYS service action size of 65536 bytes. Reject num_keys values that are too large to fit into the SCSI command. This will become important when pr_read_keys() is exposed to untrusted userspace via an <linux/pr.h> ioctl. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
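The bound this check implies can be sketched as below. This is a Python model, not the kernel code: it assumes the READ KEYS parameter data is an 8-byte header followed by 8-byte reservation keys, and it takes the 65536-byte limit from the commit message. The names `max_read_keys` and `check_num_keys` are hypothetical.

```python
import errno

# Assumed layout of the READ KEYS parameter data: an 8-byte header
# (PRGENERATION + ADDITIONAL LENGTH) followed by 8-byte reservation keys.
READ_KEYS_HEADER_BYTES = 8
KEY_BYTES = 8
# Maximum READ KEYS size as quoted in the commit message.
MAX_READ_KEYS_BYTES = 65536


def max_read_keys(limit_bytes: int = MAX_READ_KEYS_BYTES) -> int:
    """Largest key count whose parameter data still fits in limit_bytes."""
    return (limit_bytes - READ_KEYS_HEADER_BYTES) // KEY_BYTES


def check_num_keys(num_keys: int) -> int:
    """Return 0 if num_keys fits, else -EINVAL (kernel-style return value)."""
    if num_keys < 0 or num_keys > max_read_keys():
        return -errno.EINVAL
    return 0
```

Under these assumptions any u32 value above `max_read_keys()` is rejected before a command is built, which is the property the commit wants before the interface is reachable from untrusted userspace.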
10 days | block: enable per-cpu bio cache by default | Fengnan Chang | 3 | -19/+12
Since commit 12e4e8c7ab59 ("io_uring/rw: enable bio caches for IRQ rw"), bio_put is safe for task and irq context, and bio_alloc_bioset is safe for task context and no one calls it in irq context, so we can enable the per-cpu bio cache by default. Benchmarked with t/io_uring and ext4+nvme: taskset -c 6 /root/fio/t/io_uring -p0 -d128 -b4096 -s1 -c1 -F1 -B1 -R1 -X1 -n1 -P1 /mnt/testfile base IOPS is 562K, patch IOPS is 574K. The CPU usage of bio_alloc_bioset decreased from 1.42% to 1.22%. The worst case is allocating a bio on CPU A but freeing it on CPU B; still using t/io_uring and ext4+nvme: base IOPS is 648K, patch IOPS is 647K. Also tested ext4/xfs with fio using libaio/sync/io_uring on null_blk and nvme, with no obvious performance regression. Signed-off-by: Fengnan Chang <changfengnan@bytedance.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
10 days | block: use bio_alloc_bioset for passthru IO by default | Fengnan Chang | 2 | -55/+37
Use bio_alloc_bioset for passthru IO by default, so that we can enable the bio cache for irq and polled passthru IO later. Signed-off-by: Fengnan Chang <changfengnan@bytedance.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
10 days | io_uring/trace: rename io_uring_queue_async_work event "rw" field | Caleb Sander Mateos | 1 | -6/+6
The io_uring_queue_async_work tracepoint event stores an int rw field that represents whether the work item is hashed. Rename it to "hashed" and change its type to bool to more accurately reflect its value. Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
10 days | io_uring/io-wq: always retry worker create on ERESTART* | Caleb Sander Mateos | 1 | -2/+3
If a task has a pending signal when create_io_thread() is called, copy_process() will return -ERESTARTNOINTR. io_should_retry_thread() will request a retry of create_io_thread() up to WORKER_INIT_LIMIT = 3 times. If all retries fail, the io_uring request will fail with ECANCELED. Commit 3918315c5dc ("io-wq: backoff when retrying worker creation") added a linear backoff to allow the thread to handle its signal before the retry. However, a thread receiving frequent signals may get unlucky and have a signal pending at every retry. Since the userspace task doesn't control when it receives signals, there's no easy way for it to prevent the create_io_thread() failure due to pending signals. The task may also lack the information necessary to regenerate the canceled SQE. So always retry the create_io_thread() on the ERESTART* errors, analogous to what a fork() syscall would do. EAGAIN can occur due to various persistent conditions such as exceeding RLIMIT_NPROC, so respect the WORKER_INIT_LIMIT retry limit for EAGAIN errors. Signed-off-by: Caleb Sander Mateos <csander@purestorage.com> Signed-off-by: Jens Axboe <axboe@kernel.dk>
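The retry policy described above can be modelled roughly as follows. This is a Python sketch of the decision only, not the kernel's io_should_retry_thread(): the function name and the `retries_so_far` parameter are hypothetical, any other conditions the kernel checks are omitted, and the ERESTART* values are the kernel-internal codes spelled out by hand because Python's `errno` module does not export them.

```python
import errno

# Kernel-internal restart codes (include/linux/errno.h); never seen by
# userspace, so they are defined here rather than taken from errno.
ERESTARTSYS = 512
ERESTARTNOINTR = 513
ERESTARTNOHAND = 514
ERESTART_RESTARTBLOCK = 516
RESTART_ERRORS = {ERESTARTSYS, ERESTARTNOINTR, ERESTARTNOHAND,
                  ERESTART_RESTARTBLOCK}

WORKER_INIT_LIMIT = 3  # retry cap named in the commit message


def should_retry_worker_create(err: int, retries_so_far: int) -> bool:
    """Decide whether to retry; err is a negative kernel-style error code."""
    code = -err
    if code in RESTART_ERRORS:
        # A pending signal interrupted thread creation: always retry,
        # analogous to what fork() would do.
        return True
    if code == errno.EAGAIN:
        # EAGAIN may be persistent (e.g. RLIMIT_NPROC), so keep the cap.
        return retries_so_far < WORKER_INIT_LIMIT
    return False
```

The point of the split is that signal-induced failures are transient and outside the task's control, while EAGAIN may never clear, so only the latter stays bounded.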
10 days | io_uring/poll: correctly handle io_poll_add() return value on update | Jens Axboe | 1 | -2/+7
When the core of io_uring was updated to handle completions consistently and with fixed return codes, the POLL_REMOVE opcode with updates got slightly broken. If a POLL_ADD is pending and then POLL_REMOVE is used to update the events of that request, if that update causes the POLL_ADD to now trigger, then that completion is lost and a CQE is never posted. Additionally, ensure that if an update does cause an existing POLL_ADD to complete, that the completion value isn't always overwritten with -ECANCELED. For that case, whatever io_poll_add() set the value to should just be retained. Cc: stable@vger.kernel.org Fixes: 97b388d70b53 ("io_uring: handle completions in the core") Reported-by: syzbot+641eec6b7af1f62f2b99@syzkaller.appspotmail.com Tested-by: syzbot+641eec6b7af1f62f2b99@syzkaller.appspotmail.com Signed-off-by: Jens Axboe <axboe@kernel.dk>
10 days | ASoC: cros_ec_codec: Remove unnecessary selection of CRYPTO | Eric Biggers | 1 | -1/+0
The only crypto-related functionality this codec uses is the sha256() function, which is provided by CRYPTO_LIB_SHA256. Originally CRYPTO_LIB_SHA256 was visible only when CRYPTO; however, that was fixed years ago and the libraries can now be selected on their own. So, remove the unnecessary selection of CRYPTO. Signed-off-by: Eric Biggers <ebiggers@kernel.org> Link: https://patch.msgid.link/20251204052954.488568-1-ebiggers@kernel.org Signed-off-by: Mark Brown <broonie@kernel.org>
10 days | drm/rcar-du: dsi: Handle both DRM_MODE_FLAG_N.SYNC and !DRM_MODE_FLAG_P.SYNC | Marek Vasut | 1 | -2/+2
Since commit 94fe479fae96 ("drm/rcar-du: dsi: Clean up handling of DRM mode flags") the driver does not set TXVMVPRMSET0R_VSPOL_LOW and TXVMVPRMSET0R_HSPOL_LOW for modes which set neither DRM_MODE_FLAG_[PN].SYNC. The previous behavior was to assume that neither flag means DRM_MODE_FLAG_N.SYNC. Restore the previous behavior for maximum compatibility.

The change of behavior is visible below. Consider the Vertical mode->flags for simplicity's sake, although the same applies to the Horizontal ones:

Before 94fe479fae96 ("drm/rcar-du: dsi: Clean up handling of DRM mode flags"):
- DRM_MODE_FLAG_PVSYNC => vprmset0r |= 0
- DRM_MODE_FLAG_NVSYNC => vprmset0r |= TXVMVPRMSET0R_VSPOL_LOW
- Neither DRM_MODE_FLAG_[PN]VSYNC => vprmset0r |= TXVMVPRMSET0R_VSPOL_LOW

After 94fe479fae96 ("drm/rcar-du: dsi: Clean up handling of DRM mode flags"):
- DRM_MODE_FLAG_PVSYNC => vprmset0r |= 0
- DRM_MODE_FLAG_NVSYNC => vprmset0r |= TXVMVPRMSET0R_VSPOL_LOW
- Neither DRM_MODE_FLAG_[PN]VSYNC => vprmset0r |= 0 <---------- This broke

The "Neither" case behavior is different, because DRM_MODE_FLAG_N[HV]SYNC is really not equivalent to !DRM_MODE_FLAG_P[HV]SYNC.

Fixes: 94fe479fae96 ("drm/rcar-du: dsi: Clean up handling of DRM mode flags") Signed-off-by: Marek Vasut <marek.vasut+renesas@mailbox.org> Reviewed-by: Laurent Pinchart <laurent.pinchart+renesas@ideasonboard.com> Link: https://patch.msgid.link/20251202181146.138365-1-marek.vasut+renesas@mailbox.org Signed-off-by: Tomi Valkeinen <tomi.valkeinen@ideasonboard.com>
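The before/after cases listed in this commit message can be modelled in a few lines. This is a Python sketch, not the driver code: the DRM_MODE_FLAG_* values follow the DRM UAPI header, while the two TXVMVPRMSET0R_* bit positions are placeholders chosen for illustration, and both function names are hypothetical. The model only reproduces the three listed cases; what the driver does when both the P and N flags are set is not stated in the message.

```python
# Sync-polarity mode flags, values as in the DRM UAPI (drm_mode.h).
DRM_MODE_FLAG_PHSYNC = 1 << 0
DRM_MODE_FLAG_NHSYNC = 1 << 1
DRM_MODE_FLAG_PVSYNC = 1 << 2
DRM_MODE_FLAG_NVSYNC = 1 << 3

# Placeholder bit positions: the real register layout is not given in the
# commit message, so these two values are assumptions.
TXVMVPRMSET0R_VSPOL_LOW = 1 << 0
TXVMVPRMSET0R_HSPOL_LOW = 1 << 1


def polarity_bits_regressed(flags: int) -> int:
    """Behaviour after 94fe479fae96: active-low only when the N flag is set."""
    bits = 0
    if flags & DRM_MODE_FLAG_NVSYNC:
        bits |= TXVMVPRMSET0R_VSPOL_LOW
    if flags & DRM_MODE_FLAG_NHSYNC:
        bits |= TXVMVPRMSET0R_HSPOL_LOW
    return bits


def polarity_bits_restored(flags: int) -> int:
    """Restored behaviour: active-low unless the P flag is set."""
    bits = 0
    if not flags & DRM_MODE_FLAG_PVSYNC:
        bits |= TXVMVPRMSET0R_VSPOL_LOW
    if not flags & DRM_MODE_FLAG_PHSYNC:
        bits |= TXVMVPRMSET0R_HSPOL_LOW
    return bits
```

The two functions agree whenever exactly one of the P/N flags is set and differ only for a mode that sets neither, which is the case the commit restores.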
10 days | ASoC: codecs: wcd93xx: fix OF node leaks on probe | Mark Brown | 3 | -6/+4
Merge series from Johan Hovold <johan@kernel.org>: The original wcd938x driver has a couple of OF node reference leaks which have been reproduced in the two later added drivers.
10 days | ASoC: ak4458 & ak5558: disable regulator if error | Mark Brown | 2 | -2/+18
Merge series from Shengjiu Wang <shengjiu.wang@nxp.com>: Disable regulator when error happens to balance the reference count.
10 days | autofs: fix per-dentry timeout warning | Ian Kent | 1 | -10/+12
The check that determines if the message that warns about the per-dentry timeout being greater than the super block timeout is not correct. The initial value for this field is -1 and the type of the field is unsigned long. I could change the type to long but the message is in the wrong place too, it should come after the timeout setting. So leave everything else as it is and move the message and check the timeout is actually set as an additional condition on issuing the message. Also fix the timeout comparison. Signed-off-by: Ian Kent <raven@themaw.net> Link: https://patch.msgid.link/20251111060439.19593-2-raven@themaw.net Signed-off-by: Christian Brauner <brauner@kernel.org>
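Why a -1 initial value in an unsigned long defeats a plain "greater than" check can be shown with a small model. This is a Python sketch under stated assumptions: a 64-bit unsigned long, the -1 sentinel living in the per-dentry timeout, and hypothetical function names. It illustrates the "timeout is actually set" condition only and does not reproduce the actual autofs code.

```python
ULONG_BITS = 64  # assumption: 64-bit unsigned long
ULONG_MAX = (1 << ULONG_BITS) - 1
# -1 stored in an unsigned long wraps to the largest representable value.
TIMEOUT_UNSET = -1 & ULONG_MAX


def naive_should_warn(dentry_timeout: int, sb_timeout: int) -> bool:
    """Plain comparison: an unset timeout always looks 'too large'."""
    return dentry_timeout > sb_timeout


def should_warn(dentry_timeout: int, sb_timeout: int) -> bool:
    """Warn only when a per-dentry timeout was really set and exceeds sb's."""
    return dentry_timeout != TIMEOUT_UNSET and dentry_timeout > sb_timeout
```

With the sentinel excluded, the warning fires only for a timeout the user actually configured.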
10 days | perf stat: When no events, don't report an error if there is none | Ian Rogers | 1 | -2/+4
Events may fail to open as no supported CPUs were specified on the command line. In this case a confusing "error" message of "success" can be reported. Let's skip the error in that case. Before: ``` $ perf stat -C2048 -e cycles -- true WARNING: A requested CPU in '2048' is not supported by PMU 'cpu' (CPUs 0-7) for event 'cycles' Error: No supported events found. The sys_perf_event_open() syscall returned with 0 (Success) for event (cpu/unknown-hardware/). "dmesg | grep -i perf" may provide additional information. ``` After: ``` $ perf stat -C2048 -e cycles -- true WARNING: A requested CPU in '2048' is not supported by PMU 'cpu' (CPUs 0-7) for event 'cycles' Error: No supported events found. ``` Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
10 days | perf tests stat: Add "--null" coverage | Ian Rogers | 1 | -0/+12
Ensure "--null" does a minimal run. Reported-by: Ingo Molnar <mingo@kernel.org> Closes: https://lore.kernel.org/linux-perf-users/aSwt7yzFjVJCEmVp@gmail.com/ Tested-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
10 days | perf cpumap: Add "any" CPU handling to cpu_map__snprint_mask | Ian Rogers | 1 | -2/+7
If the perf_cpu_map is empty or is just the any CPU value, then early return. Don't process the "any" CPU when creating the bitmap. Tested-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
10 days | libperf cpumap: Fix perf_cpu_map__max for an empty/NULL map | Ian Rogers | 1 | -4/+6
Passing an empty map to perf_cpu_map__max triggered a SEGV. Explicitly test for the empty map. Reported-by: Ingo Molnar <mingo@kernel.org> Closes: https://lore.kernel.org/linux-perf-users/aSwt7yzFjVJCEmVp@gmail.com/ Tested-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
10 days | perf stat: Allow no events to open if this is a "--null" run | Ian Rogers | 1 | -1/+1
It is intended that a "--null" run doesn't open any events. Fixes: 2cc7aa995ce9 ("perf stat: Refactor retry/skip/fatal error handling") Tested-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
10 days | cifs: client: enforce consistent handling of multichannel and max_channels | Rajasi Mandal | 3 | -18/+50
Previously, the behavior of the multichannel and max_channels mount options was inconsistent and order-dependent. For example, specifying "multichannel,max_channels=1" would result in 2 channels, while "max_channels=1,multichannel" would result in 1 channel. Additionally, conflicting combinations such as "nomultichannel,max_channels=3" or "multichannel,max_channels=1" did not produce errors and could lead to unexpected channel counts. This commit introduces two new fields in smb3_fs_context to explicitly track whether multichannel and max_channels were specified during mount. The option parsing and validation logic is updated to ensure: - The outcome is no longer dependent on the order of options. - Conflicting combinations (e.g., "nomultichannel,max_channels=3" or "multichannel,max_channels=1") are detected and result in an error. - The number of channels created is consistent with the specified options. This improves the reliability and predictability of mount option handling for SMB3 multichannel support. Reviewed-by: Shyam Prasad N <sprasad@microsoft.com> Signed-off-by: Rajasi Mandal <rajasimandal@microsoft.com> Signed-off-by: Steve French <stfrench@microsoft.com>
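The order-independent validation described above can be sketched as a two-phase parse: record what was specified first, then validate once everything has been seen. This is a Python model, not the cifs option parser. The function name is hypothetical, and three details are assumptions not stated in the commit message: a default of 2 channels for a bare "multichannel", that `max_channels` greater than 1 on its own is accepted as given, and that values below 1 are rejected.

```python
DEFAULT_MULTICHANNEL_COUNT = 2  # assumption: channels for bare "multichannel"


def resolve_channels(options: str) -> int:
    """Parse a comma-separated option string and return the channel count.

    Raises ValueError for conflicting combinations.
    """
    multichannel = None   # None means "not specified on the command line"
    max_channels = None
    # Phase 1: only record what the user specified.
    for opt in (o.strip() for o in options.split(",")):
        if opt == "multichannel":
            multichannel = True
        elif opt == "nomultichannel":
            multichannel = False
        elif opt.startswith("max_channels="):
            max_channels = int(opt.split("=", 1)[1])
        # Any other option is ignored by this sketch.

    # Phase 2: validate after parsing, so option order cannot matter.
    if max_channels is not None and max_channels < 1:
        raise ValueError("max_channels must be at least 1")
    if multichannel is True and max_channels == 1:
        raise ValueError("multichannel conflicts with max_channels=1")
    if multichannel is False and max_channels is not None and max_channels > 1:
        raise ValueError("nomultichannel conflicts with max_channels>1")

    if max_channels is not None:
        return max_channels
    return DEFAULT_MULTICHANNEL_COUNT if multichannel else 1
```

For example, `resolve_channels("multichannel,max_channels=3")` and `resolve_channels("max_channels=3,multichannel")` both yield 3, while either ordering of "multichannel" with "max_channels=1" raises an error instead of silently picking a count.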
10 days | Merge tag 'ntfs3_for_6.19' of ↵ | Linus Torvalds | 13 | -331/+421
https://github.com/Paragon-Software-Group/linux-ntfs3 Pull ntfs3 updates from Konstantin Komarov: "New code: - support timestamps prior to epoch - do not overwrite uptodate pages - disable readahead for compressed files - setting of dummy blocksize to read boot_block when mounting - the run_lock initialization when loading $Extend - initialization of allocated memory before use - support for the NTFS3_IOC_SHUTDOWN ioctl - check for minimum alignment when performing direct I/O reads - check for shutdown in fsync Fixes: - mount failure for sparse runs in run_unpack() - use-after-free of sbi->options in cmp_fnames - KMSAN uninit bug after failed mi_read in mi_format_new - uninit error after buffer allocation by __getname() - KMSAN uninit-value in ni_create_attr_list - double free of sbi->options->nls and ownership of fc->fs_private - incorrect vcn adjustments in attr_collapse_range() - mode update when ACL can be reduced to mode - memory leaks in add sub record Changes: - refactor code, updated terminology, spelling - do not kmap pages in (de)compression code - after ntfs_look_free_mft(), code that fails must put mft_inode - default mount options for "acl" and "prealloc" Replaced: - use unsafe_memcpy() to avoid memcpy size warning - ntfs_bio_pages with page cache for compressed files" * tag 'ntfs3_for_6.19' of https://github.com/Paragon-Software-Group/linux-ntfs3: (26 commits) fs/ntfs3: check for shutdown in fsync fs/ntfs3: change the default mount options for "acl" and "prealloc" fs/ntfs3: Prevent memory leaks in add sub record fs/ntfs3: out1 also needs to put mi fs/ntfs3: Fix spelling mistake "recommened" -> "recommended" fs/ntfs3: update mode in xattr when ACL can be reduced to mode fs/ntfs3: check minimum alignment for direct I/O fs/ntfs3: implement NTFS3_IOC_SHUTDOWN ioctl fs/ntfs3: correct attr_collapse_range when file is too fragmented ntfs3: fix double free of sbi->options->nls and clarify ownership of fc->fs_private fs/ntfs3: Initialize allocated memory 
before use fs/ntfs3: remove ntfs_bio_pages and use page cache for compressed I/O ntfs3: avoid memcpy size warning fs/ntfs3: fix KMSAN uninit-value in ni_create_attr_list ntfs3: init run lock for extend inode ntfs: set dummy blocksize to read boot_block when mounting fs/ntfs3: disable readahead for compressed files ntfs3: Fix uninit buffer allocated by __getname() ntfs3: fix uninit memory after failed mi_read in mi_format_new ntfs3: fix use-after-free of sbi->options in cmp_fnames ...
10 days | Merge tag 'ext4_for_linus-6.19-rc1' of ↵ | Linus Torvalds | 28 | -711/+872
git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4 Pull ext4 updates from Ted Ts'o: "New features and improvements for the ext4 file system: - Optimize online defragmentation by using folios instead of individual buffer heads - Improve error codes stored in the superblock when the journal aborts - Minor cleanups and clarifications in ext4_map_blocks() - Add documentation of the casefold and encrypt flags - Add support for file systems with a blocksize greater than the pagesize - Improve performance by caching the fact that an inode does not have a POSIX ACL Various Bug Fixes: - Fix false positive complaints from smatch - Fix error code which is returned by ext4fs_dirhash() when Siphash is used without the encryption key - Fix races when writing to inline data files which could trigger a BUG - Fix potential NULL dereference when there is a corrupt file system with an extended attribute value stored in an inode - Fix false positive lockdep report when syzbot uses ext4 and ocfs2 together - Fix false positive reported by DEPT by adjusting lock annotation - Avoid a potential BUG_ON in jbd2 when a file system is massively corrupted - Fix a WARN_ON when superblock is corrupted with a non-NULL terminated mount options field - Add check if the userspace passes in a non-NULL terminated mount options field to EXT4_IOC_SET_TUNE_SB_PARAM - Fix a potential journal checksum failure when a file system is copied while it is mounted read-only - Fix a potential orphan file tracking error which only showed on 32-bit systems - Fix assertion checks in mballoc (which have to be explicitly enabled by manually enabling AGGRESSIVE_CHECKS and recompiling) - Avoid complaining about overly large orphan files created by mke2fs with file systems with a 64k block size" * tag 'ext4_for_linus-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (58 commits) ext4: mark inodes without acls in __ext4_iget() ext4: enable block size larger than page size
ext4: add checks for large folio incompatibilities when BS > PS ext4: support verifying data from large folios with fs-verity ext4: make data=journal support large block size ext4: support large block size in __ext4_block_zero_page_range() ext4: support large block size in mpage_prepare_extent_to_map() ext4: support large block size in mpage_map_and_submit_buffers() ext4: support large block size in ext4_block_write_begin() ext4: support large block size in ext4_mpage_readpages() ext4: rename 'page' references to 'folio' in multi-block allocator ext4: prepare buddy cache inode for BS > PS with large folios ext4: support large block size in ext4_mb_init_cache() ext4: support large block size in ext4_mb_get_buddy_page_lock() ext4: support large block size in ext4_mb_load_buddy_gfp() ext4: add EXT4_LBLK_TO_PG and EXT4_PG_TO_LBLK for block/page conversion ext4: add EXT4_LBLK_TO_B macro for logical block to bytes conversion ext4: support large block size in ext4_readdir() ext4: support large block size in ext4_calculate_overhead() ext4: introduce s_min_folio_order for future BS > PS support ...
10 days | Merge tag 'gfs2-for-6.19' of ↵ | Linus Torvalds | 27 | -768/+396
git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2 Pull gfs2 updates from Andreas Gruenbacher: - Major withdraw / error handling overhaul based on dlm's new DLM_RELEASE_RECOVER feature: this allows gfs to treat withdraws like node failures. Make withdraws asynchronous - Fix a bug in commit e4a8b5481c59a that caused 'df' to remain out of sync. ('df' is still allowed to go slightly out of sync for short periods of time) - Prevent recursive memory reclaim in gfs2_unstuff_dinode() - Clean up SDF_JOURNAL_LIVE flag handling - Fix remote evict for read-only filesystems - Fix a misuse of bio_chain() - Various other minor cleanups * tag 'gfs2-for-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2: (35 commits) gfs2: Fix use of bio_chain gfs2: Clean up SDF_JOURNAL_LIVE flag handling gfs2: No longer thaw filesystems during a withdraw gfs2: Withdraw immediately in gfs2_trans_add_meta gfs2: New gfs2_withdraw_helper gfs2: Clean up properly during a withdraw gfs2: Rename gfs2_{gl_dq_holders => withdraw_glocks} Revert "gfs2: fix infinite loop when checking ail item count before go_inval" Revert "gfs2: Allow some glocks to be used during withdraw" Revert "gfs2: Check for log write errors before telling dlm to unlock" Revert "gfs2: fix a deadlock on withdraw-during-mount" Revert "gfs2: Force withdraw to replay journals and wait for it to finish" (6/6) Revert "gfs2: Force withdraw to replay journals and wait for it to finish" (5/6) Revert "gfs2: Force withdraw to replay journals and wait for it to finish" (4/6) Revert "gfs2: Force withdraw to replay journals and wait for it to finish" (3/6) Revert "gfs2: Force withdraw to replay journals and wait for it to finish" (2/6) Revert "gfs2: Force withdraw to replay journals and wait for it to finish" (1/6) Revert "gfs2: don't stop reads while withdraw in progress" gfs2: Rename LM_FLAG_{NOEXP -> RECOVER} gfs2: Kill gfs2_io_error_bh_wd ...
10 days | Merge tag 'v6.19-rc-smb-fixes' of git://git.samba.org/ksmbd | Linus Torvalds | 44 | -1743/+1194
Pull smb client and server updates from Steve French: - server fixes: - IPC use after free locking fix - fix locking bug in delete paths - fix use after free in disconnect - fix underflow in locking check - error mapping improvement - socket listening improvement - return code mapping fixes - crypto improvements (use default libraries) - cleanup patches: - netfs - client checkpatch cleanup - server cleanup - move server/client duplicate code to common code - fix some defines to better match protocol specification - smbdirect (RDMA) fixes - client debugging improvements for leases * tag 'v6.19-rc-smb-fixes' of git://git.samba.org/ksmbd: (44 commits) cifs: Use netfs_alloc/free_folioq_buffer() smb: client: show smb lease key in open_dirs output smb: client: show smb lease key in open_files output ksmbd: ipc: fix use-after-free in ipc_msg_send_request smb: client: relax WARN_ON_ONCE(SMBDIRECT_SOCKET_*) checks in recv_done() and smbd_conn_upcall() smb: server: relax WARN_ON_ONCE(SMBDIRECT_SOCKET_*) checks in recv_done() and smb_direct_cm_handler() smb: smbdirect: introduce SMBDIRECT_CHECK_STATUS_{WARN,DISCONNECT}() smb: smbdirect: introduce SMBDIRECT_DEBUG_ERR_PTR() helper ksmbd: vfs: fix race on m_flags in vfs_cache ksmbd: Replace strcpy + strcat to improve convert_to_nt_pathname smb: move FILE_SYSTEM_ATTRIBUTE_INFO to common/fscc.h ksmbd: implement error handling for STATUS_INFO_LENGTH_MISMATCH in smb server ksmbd: fix use-after-free in ksmbd_tree_connect_put under concurrency ksmbd: server: avoid busy polling in accept loop smb: move create_durable_reconn to common/smb2pdu.h smb: fix some warnings reported by scripts/checkpatch.pl smb: do some cleanups smb: move FILE_SYSTEM_SIZE_INFO to common/fscc.h smb: move some duplicate struct definitions to common/fscc.h smb: move list of FileSystemAttributes to common/fscc.h ...
10 days | Merge tag 'xfs-merge-6.19' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux | Linus Torvalds | 29 | -743/+363
Pull xfs updates from Carlos Maiolino: "There are no major changes in xfs. This contains mostly some code cleanups, a few bug fixes and documentation update. Highlights are: - Quota locking cleanup - Getting rid of old xlog_in_core_2_t type" * tag 'xfs-merge-6.19' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux: (33 commits) docs: remove obsolete links in the xfs online repair documentation xfs: move some code out of xfs_iget_recycle xfs: use zi more in xfs_zone_gc_mount xfs: remove the unused bv field in struct xfs_gc_bio xfs: remove xarray mark for reclaimable zones xfs: remove the xlog_in_core_t typedef xfs: remove l_iclog_heads xfs: remove the xlog_rec_header_t typedef xfs: remove xlog_in_core_2_t xfs: remove a very outdated comment from xlog_alloc_log xfs: cleanup xlog_alloc_log a bit xfs: don't use xlog_in_core_2_t in struct xlog_in_core xfs: add a on-disk log header cycle array accessor xfs: add a XLOG_CYCLE_DATA_SIZE constant xfs: reduce ilock roundtrips in xfs_qm_vop_dqalloc xfs: move xfs_dquot_tree calls into xfs_qm_dqget_cache_{lookup,insert} xfs: move quota locking into xrep_quota_item xfs: move quota locking into xqcheck_commit_dquot xfs: move q_qlock locking into xqcheck_compare_dquot xfs: move q_qlock locking into xchk_quota_item ...
10 days | Merge tag 'erofs-for-6.19-rc1' of ↵ | Linus Torvalds | 11 | -148/+178
git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs Pull erofs updates from Gao Xiang: - Fix a WARNING caused by a recent FSDAX misdetection regression - Fix the filesystem stacking limit for file-backed mounts - Print more informative diagnostics on decompression errors - Switch the on-disk definition `erofs_fs.h` to the MIT license - Minor cleanups * tag 'erofs-for-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/xiang/erofs: erofs: switch on-disk header `erofs_fs.h` to MIT license erofs: get rid of raw bi_end_io() usage erofs: enable error reporting for z_erofs_fixup_insize() erofs: enable error reporting for z_erofs_stream_switch_bufs() erofs: improve Zstd, LZMA and DEFLATE error strings erofs: improve decompression error reporting erofs: tidy up z_erofs_lz4_handle_overlap() erofs: limit the level of fs stacking for file-backed mounts erofs: correct FSDAX detection
10 days | Merge tag 'hfs-v6.19-tag1' of ↵ | Linus Torvalds | 30 | -914/+2698
git://git.kernel.org/pub/scm/linux/kernel/git/vdubeyko/hfs Pull hfs/hfsplus updates from Viacheslav Dubeyko: "Several fixes for syzbot reported issues, HFS/HFS+ fixes of xfstests failures, Kunit-based unit-tests introduction, and code cleanup: - Dan Carpenter fixed a potential use-after-free issue in the hfs_correct_next_unused_CNID() method. Tetsuo Handa has made a nice fix for a syzbot reported issue related to incorrect inode->i_mode management if the volume has been corrupted somehow. Yang Chenzhi has made a really good fix for a potential race condition in the __hfs_bnode_create() method for the HFS+ file system. - Several fixes to xfstests failures. In particular, the generic/070, generic/073, and generic/101 test-cases now finish successfully for the HFS+ file system. - HFS and HFS+ drivers share multiple structures of on-disk layout declarations. Some structures are used without any change. However, we had two independent declarations of the same structures in HFS and HFS+ drivers. The on-disk layout declarations have been moved into include/linux/hfs_common.h to eliminate the duplicated declarations and to keep the HFS/HFS+ on-disk layout declarations in one place. Also, this patch prepares the basis for creating a hfslib that can aggregate common functionality without needing to duplicate the same code in HFS and HFS+ drivers. - HFS/HFS+ really need unit-tests because of multiple xfstests failures.
The first two patches introduce Kunit-based unit-tests for string operations in the HFS/HFS+ file system drivers" * tag 'hfs-v6.19-tag1' of git://git.kernel.org/pub/scm/linux/kernel/git/vdubeyko/hfs: hfs/hfsplus: move on-disk layout declarations into hfs_common.h hfsplus: fix volume corruption issue for generic/101 hfsplus: introduce KUnit tests for HFS+ string operations hfs: introduce KUnit tests for HFS string operations hfsplus: fix volume corruption issue for generic/073 hfsplus: Verify inode mode when loading from disk hfsplus: fix volume corruption issue for generic/070 hfs/hfsplus: prevent getting negative values of offset/length hfsplus: fix missing hfs_bnode_get() in __hfs_bnode_create hfs: fix potential use after free in hfs_correct_next_unused_CNID()
10 days | Merge tag 'for-6.19-tag' of ↵ | Linus Torvalds | 74 | -2298/+2773
git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux Pull btrfs updates from David Sterba: "Features: - shutdown ioctl support (needs CONFIG_BTRFS_EXPERIMENTAL for now): - set filesystem state as being shut down (also named going down in other filesystems), where all active operations return EIO and this cannot be changed until unmount - pending operations are attempted to be finished but error messages may still show up depending on where exactly the shutdown happened - scrub (and device replace) vs suspend/hibernate: - a running scrub will prevent suspend, which can be annoying as suspend is an immediate request and scrub is not critical - filesystem freezing before suspend was not sufficient as the problem was in process freezing - behaviour change: on suspend scrub and device replace are cancelled, where scrub can record the last state and continue from there; the device replace has to be restarted from the beginning - zone stats exported in sysfs, from the perspective of the filesystem this includes active, reclaimable, relocation etc zones Performance: - improvements when processing space reservation tickets by optimizing locking and shrinking critical sections, cumulative improvements in lockstat numbers show +15% Notable fixes: - use vmalloc fallback when allocating bios as high order allocations can happen with wide checksums (like sha256) - scrub will always track the last position of progress so it's not starting from zero after an error Core: - under experimental config, checksum calculations are offloaded to process context, simplifies locking and allows to remove compression write worker kthread(s): - speed improvement in direct IO throughput with buffered IO fallback is +15% when not offloaded but this is more related to internal crypto subsystem improvements - this will be probably default in the future removing the sysfs tunable - (experimental) block size > page size updates: - support more operations when not using large folios (encoded 
read/write and send) - raid56 - more preparations for fscrypt support Other: - more conversions to auto-cleaned variables - parameter cleanups and removals - extended warning fixes - improved printing of structured values like keys - lots of other cleanups and refactoring" * tag 'for-6.19-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (147 commits) btrfs: remove unnecessary inode key in btrfs_log_all_parents() btrfs: remove redundant zero/NULL initializations in btrfs_alloc_root() btrfs: remaining BTRFS_PATH_AUTO_FREE conversions btrfs: send: do not allocate memory for xattr data when checking it exists btrfs: send: add unlikely to all unexpected overflow checks btrfs: reduce arguments to btrfs_del_inode_ref_in_log() btrfs: remove root argument from btrfs_del_dir_entries_in_log() btrfs: use test_and_set_bit() in btrfs_delayed_delete_inode_ref() btrfs: don't search back for dir inode item in INO_LOOKUP_USER btrfs: don't rewrite ret from inode_permission btrfs: add orig_logical to btrfs_bio for encryption btrfs: disable verity on encrypted inodes btrfs: disable various operations on encrypted inodes btrfs: remove redundant level reset in btrfs_del_items() btrfs: simplify leaf traversal after path release in btrfs_next_old_leaf() btrfs: optimize balance_level() path reference handling btrfs: factor out root promotion logic into promote_child_to_root() btrfs: raid56: remove the "_step" infix btrfs: raid56: enable bs > ps support btrfs: raid56: prepare finish_parity_scrub() to support bs > ps cases ...
10 days | Merge tag 'for-6.19/block-20251201' of ↵ | Linus Torvalds | 108 | -1557/+3019
git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux Pull block updates from Jens Axboe: - Fix head insertion for mq-deadline, a regression from when priority support was added - Series simplifying and improving the ublk user copy code - Various ublk related cleanups - Fixup REQ_NOWAIT handling in loop/zloop, clearing NOWAIT when the request is punted to a thread for handling - Merge and then later revert loop dio nowait support, as it ended up causing excessive stack usage when the inline issue code needs to dip back into the full file system code - Improve auto integrity code, making it less deadlock prone - Speed up polled IO handling by manually managing the hctx lookups - Fixes for blk-throttle for SSD devices - Small series with fixes for the S390 dasd driver - Add support for caching zones, avoiding unnecessary report zone queries - MD pull requests via Yu: - fix null-ptr-dereference regression for dm-raid0 - fix IO hang for raid5 when array is broken with IO inflight - remove legacy 1s delay to speed up system shutdown - change maintainer's email address - data can be lost if array is created with different lbs devices, fix this problem and record lbs of the array in metadata - fix rcu protection for md_thread - fix mddev kobject lifetime regression - enable atomic writes for md-linear - some cleanups - bcache updates via Coly - remove useless discard and cache device code - improve usage of per-cpu workqueues - Reorganize the IO scheduler switching code, fixing some lockdep reports as well - Improve the block layer P2P DMA support - Add support to the block tracing code for zoned devices - Segment calculation improvements, and memory alignment flexibility improvements - Set of prep and cleanup patches for ublk batching support.
The actual batching hasn't been added yet, but helps shrink down the workload of getting that patchset ready for 6.20 - Fix for how the ps3 block driver handles segments offsets - Improve how block plugging handles batch tag allocations - nbd fixes for use-after-free of the configuration on device clear/put - Set of improvements and fixes for zloop - Add Damien as maintainer of the block zoned device code handling - Various other fixes and cleanups * tag 'for-6.19/block-20251201' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux: (162 commits) block/rnbd: correct all kernel-doc complaints blk-mq: use queue_hctx in blk_mq_map_queue_type md: remove legacy 1s delay in md_notify_reboot md/raid5: fix IO hang when array is broken with IO inflight md: warn about updating super block failure md/raid0: fix NULL pointer dereference in create_strip_zones() for dm-raid sbitmap: fix all kernel-doc warnings ublk: add helper of __ublk_fetch() ublk: pass const pointer to ublk_queue_is_zoned() ublk: refactor auto buffer register in ublk_dispatch_req() ublk: add `union ublk_io_buf` with improved naming ublk: add parameter `struct io_uring_cmd *` to ublk_prep_auto_buf_reg() kfifo: add kfifo_alloc_node() helper for NUMA awareness blk-mq: fix potential uaf for 'queue_hw_ctx' blk-mq: use array manage hctx map instead of xarray ublk: prevent invalid access with DEBUG s390/dasd: Use scnprintf() instead of sprintf() s390/dasd: Move device name formatting into separate function s390/dasd: Remove unnecessary debugfs_create() return checks s390/dasd: Fix gendisk parent after copy pair swap ...
10 days Merge tag 'for-6.19/io_uring-20251201' of ↵ Linus Torvalds 46 -856/+1299
git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux

Pull io_uring updates from Jens Axboe:

 - Unify how task_work cancelations are detected, placing it in the task_work running state rather than needing to check the task state
 - Series cleaning up and moving the cancelation code to where it belongs, in cancel.c
 - Cleanup of waitid and futex argument handling
 - Add support for mixed sized SQEs. 6.18 added support for mixed sized CQEs, improving flexibility and efficiency of workloads that need big CQEs. This adds similar support for SQEs, where the occasional need for a 128-byte SQE doesn't necessitate having all SQEs be 128 bytes in size
 - Introduce zcrx and SQ/CQ layout queries. The former returns what zcrx features are available. And both return the ring size information to help with allocation size calculation for user provided rings like IORING_SETUP_NO_MMAP and IORING_MEM_REGION_TYPE_USER
 - Zcrx updates for 6.19. It includes a bunch of small patches, IORING_REGISTER_ZCRX_CTRL and RQ flushing and David's work on sharing zcrx between multiple io_uring instances
 - Series cleaning up ring initializations, notably deduplicating ring size and offset calculations. It also moves most of the checking before doing any allocations, making the code simpler
 - Add support for getsockname and getpeername, which is mostly a trivial hookup after a bit of refactoring on the networking side
 - Various fixes and cleanups

* tag 'for-6.19/io_uring-20251201' of git://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux: (68 commits)
  io_uring: Introduce getsockname io_uring cmd
  socket: Split out a getsockname helper for io_uring
  socket: Unify getsockname and getpeername implementation
  io_uring/query: drop unused io_handle_query_entry() ctx arg
  io_uring/kbuf: remove obsolete buf_nr_pages and update comments
  io_uring/register: use correct location for io_rings_layout
  io_uring/zcrx: share an ifq between rings
  io_uring/zcrx: add io_fill_zcrx_offsets()
  io_uring/zcrx: export zcrx via a file
  io_uring/zcrx: move io_zcrx_scrub() and dependencies up
  io_uring/zcrx: count zcrx users
  io_uring/zcrx: add sync refill queue flushing
  io_uring/zcrx: introduce IORING_REGISTER_ZCRX_CTRL
  io_uring/zcrx: elide passing msg flags
  io_uring/zcrx: use folio_nr_pages() instead of shift operation
  io_uring/zcrx: convert to use netmem_desc
  io_uring/query: introduce rings info query
  io_uring/query: introduce zcrx query
  io_uring: move cq/sq user offset init around
  io_uring: pre-calculate scq layout
  ...
10 days f2fs: ignore discard return value Chaitanya Kulkarni 1 -7/+3
__blkdev_issue_discard() always returns 0, making the error assignment in __submit_discard_cmd() dead code. Initialize err to 0 and remove the error assignment from the __blkdev_issue_discard() call to err. Move fault injection code into already present if branch where err is set to -EIO. This preserves the fault injection behavior while removing dead error handling. Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com> Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Chaitanya Kulkarni <ckulkarnilinux@gmail.com> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
10 days f2fs: optimize trace_f2fs_write_checkpoint with enums YH Lin 3 -8/+23
This patch optimizes the tracepoint by replacing its hardcoded strings with a new enumeration, f2fs_cp_phase.

1. Defines enum f2fs_cp_phase with values for each checkpoint phase.
2. Updates trace_f2fs_write_checkpoint to accept a u16 phase argument instead of a string pointer.
3. Uses __print_symbolic in TP_printk to convert the enum values back to their corresponding strings for human-readable trace output.

This change reduces the storage overhead for each trace event by replacing a variable-length string with a 2-byte integer, while maintaining the same readable output in ftrace.

Signed-off-by: YH Lin <yhli@google.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
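The enum-plus-symbolic-lookup idea described above can be sketched in userspace C. This is only an illustration under stated assumptions: the enumerator names, the phase strings, and the `cp_phase_name()` helper are hypothetical stand-ins, and the real tracepoint relies on `__print_symbolic()` inside `TP_printk` rather than a hand-written loop.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical phase values; the real enum f2fs_cp_phase may differ. */
enum cp_phase {
	CP_PHASE_START,
	CP_PHASE_FLUSH,
	CP_PHASE_FINISH,
};

/* Value/name pairs, the same kind of table a symbolic printer consumes. */
struct sym {
	uint16_t val;
	const char *name;
};

static const struct sym cp_phase_syms[] = {
	{ CP_PHASE_START,  "start"  },
	{ CP_PHASE_FLUSH,  "flush"  },
	{ CP_PHASE_FINISH, "finish" },
};

/* The event record stores only the 2-byte value; the name is resolved at print time. */
static const char *cp_phase_name(uint16_t phase)
{
	size_t n = sizeof(cp_phase_syms) / sizeof(cp_phase_syms[0]);

	for (size_t i = 0; i < n; i++)
		if (cp_phase_syms[i].val == phase)
			return cp_phase_syms[i].name;
	return "unknown";
}
```

The record shrinks because only the integer is stored per event, while the string table lives once in the printing code.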
10 days f2fs: fix to not account invalid blocks in get_left_section_blocks() Chao Yu 1 -3/+5
In LFS mode, get_left_section_blocks() should not count blocks that were used before and have since been invalidated. Otherwise has_curseg_enough_space() counts those blocks as free, so GC is not triggered in time.

Cc: stable@kernel.org
Fixes: 249ad438e1d9 ("f2fs: add a method for calculating the remaining blocks in the current segment in LFS mode.")
Fixes: bf34c93d2645 ("f2fs: check curseg space before foreground GC")
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
10 days f2fs: support to show curseg.next_blkoff in debugfs Chao Yu 2 -10/+20
cat /sys/kernel/debug/f2fs/status

Main area: 17 segs, 17 secs 17 zones
    TYPE           blkoff    segno    secno   zoneno  dirty_seg   full_seg  valid_blk
  - COLD   data:        0        4        4        4          0          0          0
  - WARM   data:        0        7        7        7          0          0          0
  - HOT    data:        1        5        5        5          2          0        512
  - Dir   dnode:        3        0        0        0          1          0          2
  - File  dnode:        0        1        1        1          0          0          0
  - Indir nodes:        0        2        2        2          0          0          0
  - Pinned file:        0       -1       -1       -1
  - ATGC   data:        0       -1       -1       -1

Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
10 days docs: f2fs: wrap ASCII tables in literal blocks to fix LaTeX build Masaharu Noguchi 1 -62/+69
Sphinx's LaTeX builder fails when converting the nested ASCII tables in f2fs.rst, producing the following error: "Markup is unsupported in LaTeX: longtable does not support nesting a table." Wrap the affected ASCII tables in literal code blocks to force Sphinx to render them verbatim. This prevents nested longtables and fixes the PDF build failure on Sphinx 8.2.x. Acked-by: Bagas Sanjaya <bagasdotme@gmail.com> Reviewed-by: Akira Yokosawa <akiyks@gmail.com> Signed-off-by: Masaharu Noguchi <nogunix@gmail.com> Acked-by: Jonathan Corbet <corbet@lwn.net> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
10 days f2fs: expand scalability of f2fs mount option Chao Yu 2 -58/+63
The opt field in struct f2fs_mount_info and the opt_mask field in struct f2fs_fs_context are 32-bit variables, and we're running out of available bits in them. Expand them to 64 bits for better scalability.

Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
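To illustrate why the width matters, here is a minimal sketch of option bits kept in a 64-bit mask. The helper names and the bit index are hypothetical and are not taken from the f2fs sources; the point is only that bit positions 32 and above need both a 64-bit field and a 64-bit shift.

```c
#include <stdint.h>

/* Hypothetical option index past the 32-bit limit; not an actual f2fs option. */
#define OPT_EXAMPLE_BIT 40

/* Set an option bit; 1ULL keeps the shift in 64-bit arithmetic. */
static inline void set_opt64(uint64_t *opt, unsigned int bit)
{
	*opt |= 1ULL << bit;
}

/* Return 1 if the option bit is set, 0 otherwise. */
static inline int test_opt64(uint64_t opt, unsigned int bit)
{
	return (int)((opt >> bit) & 1);
}
```

With a 32-bit field, bit 40 could not be stored at all, which is the limit the commit message describes.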
10 days f2fs: change default schedule timeout value Chao Yu 2 -3/+5
This patch changes the default schedule timeout value from 20ms to 1ms, in order to give the caller more chances to check whether the IO or non-IO congestion condition has already been mitigated.

In addition, the default interval of periodic discard submission is kept at 20ms.

Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
10 days f2fs: introduce f2fs_schedule_timeout() Chao Yu 6 -16/+24
In f2fs retry logic, we call f2fs_io_schedule_timeout() to sleep in uninterruptible state (waiting for IO) for a while. However, in the several paths below, we are not blocked by IO:
- f2fs_write_single_data_page() returns -EAGAIN due to racing on cp_rwsem.
- f2fs_flush_device_cache() fails to submit the preflush command.
- __issue_discard_cmd_range() sleeps periodically between two batched discard submissions.

So, in order to reveal the state of the task more accurately, let's introduce f2fs_schedule_timeout() and call it in the above paths, where we are waiting for non-IO reasons. Then we can get the real reason for the uninterruptible sleep of a thread in tracepoints, perfetto, etc.

Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
10 days f2fs: use memalloc_retry_wait() as much as possible Chao Yu 2 -2/+2
memalloc_retry_wait() is recommended in memory allocation retry logic, use it as much as possible. Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
10 days f2fs: add a sysfs entry to show max open zones Yongpeng Yang 2 -0/+8
This patch adds a sysfs entry showing the max zones that F2FS can write concurrently. Signed-off-by: Yongpeng Yang <yangyongpeng@xiaomi.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
10 days f2fs: wrap all unusable_blocks_per_sec code in CONFIG_BLK_DEV_ZONED Yongpeng Yang 2 -1/+6
The usage of unusable_blocks_per_sec is already wrapped by CONFIG_BLK_DEV_ZONED, except for its declaration and the definitions of CAP_BLKS_PER_SEC and CAP_SEGS_PER_SEC. This patch ensures that all code related to unusable_blocks_per_sec is properly wrapped under the CONFIG_BLK_DEV_ZONED option. Signed-off-by: Yongpeng Yang <yangyongpeng@xiaomi.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
10 days f2fs: simplify list initialization in f2fs_recover_fsync_data() Baolin Liu 1 -6/+3
In f2fs_recover_fsync_data(), use LIST_HEAD() to declare and initialize the list_head in one step instead of using INIT_LIST_HEAD() separately.

No functional change.

Signed-off-by: Baolin Liu <liubaolin@kylinos.cn>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
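The difference between the two forms can be shown with a simplified userspace copy of the list macros. This is a sketch, not the kernel header: the kernel's INIT_LIST_HEAD() has extra details omitted here, and the `two_step()` and `one_step()` wrappers are hypothetical names used only for the comparison.

```c
/* Simplified userspace stand-ins for the kernel's list primitives. */
struct list_head {
	struct list_head *next, *prev;
};

#define LIST_HEAD_INIT(name) { &(name), &(name) }
#define LIST_HEAD(name) struct list_head name = LIST_HEAD_INIT(name)

static inline void INIT_LIST_HEAD(struct list_head *list)
{
	list->next = list;
	list->prev = list;
}

static inline int list_empty(const struct list_head *head)
{
	return head->next == head;
}

/* Two-step form: declare first, initialize in a separate statement. */
static int two_step(void)
{
	struct list_head inode_list;

	INIT_LIST_HEAD(&inode_list);
	return list_empty(&inode_list);
}

/* One-step form: LIST_HEAD() declares and initializes together. */
static int one_step(void)
{
	LIST_HEAD(inode_list);

	return list_empty(&inode_list);
}
```

Both functions end up with an empty, self-referencing list head, which is why the change is described as having no functional effect.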
10 days f2fs: revert summary entry count from 2048 to 512 in 16kb block support Daeho Jeong 8 -63/+130
The recent increase in the number of Segment Summary Area (SSA) entries from 512 to 2048 was an unintentional change in logic of 16kb block support. This commit corrects the issue. To better utilize the space available from the erroneous 2048-entry calculation, we are implementing a solution to share the currently unused SSA space with neighboring segments. This enhances overall SSA utilization without impacting the established 8MB segment size. Fixes: d7e9a9037de2 ("f2fs: Support Block Size == Page Size") Signed-off-by: Daeho Jeong <daehojeong@google.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
10 days f2fs: fix to detect recoverable inode during dryrun of find_fsync_dnodes() Chao Yu 1 -8/+12
mkfs.f2fs -f /dev/vdd
mount /dev/vdd /mnt/f2fs
touch /mnt/f2fs/foo
sync		# avoid CP_UMOUNT_FLAG in last f2fs_checkpoint.ckpt_flags
touch /mnt/f2fs/bar
f2fs_io fsync /mnt/f2fs/bar
f2fs_io shutdown 2 /mnt/f2fs
umount /mnt/f2fs
blockdev --setro /dev/vdd
mount /dev/vdd /mnt/f2fs
mount: /mnt/f2fs: WARNING: source write-protected, mounted read-only.

If we create and fsync a new inode before a sudden power cut, then the following mount, without the norecovery or disable_roll_forward mount option, will succeed without recovering the last fsynced inode.

The problem is that we only check the inode_list list after find_fsync_dnodes() in f2fs_recover_fsync_data() to find out whether there is recoverable data in the image. There is a missed case: if the last fsynced inode does not exist in the last checkpoint, we fail to get its inode because the NAT entry of the inode node does not exist in the last checkpoint, so the inode won't be linked in inode_list.

Let's detect such a case in dryrun mode to fix this issue.

After this change, mount will fail as expected below:
mount: /mnt/f2fs: cannot mount /dev/vdd read-only.
       dmesg(1) may have more information after failed mount system call.
dmesg:
F2FS-fs (vdd): Need to recover fsync data, but write access unavailable, please try mount w/ disable_roll_forward or norecovery

Cc: stable@kernel.org
Fixes: 6781eabba1bd ("f2fs: give -EINVAL for norecovery and rw mount")
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
10 days f2fs: fix return value of f2fs_recover_fsync_data() Chao Yu 1 -5/+9
With the script below, f2fs will panic:

mkfs.f2fs -f /dev/vdd
mount /dev/vdd /mnt/f2fs
touch /mnt/f2fs/foo
sync
echo 111 >> /mnt/f2fs/foo
f2fs_io fsync /mnt/f2fs/foo
f2fs_io shutdown 2 /mnt/f2fs
umount /mnt/f2fs
mount -o ro,norecovery /dev/vdd /mnt/f2fs
or
mount -o ro,disable_roll_forward /dev/vdd /mnt/f2fs

F2FS-fs (vdd): f2fs_recover_fsync_data: recovery fsync data, check_only: 0
F2FS-fs (vdd): Mounted with checkpoint version = 7f5c361f
F2FS-fs (vdd): Stopped filesystem due to reason: 0
F2FS-fs (vdd): f2fs_recover_fsync_data: recovery fsync data, check_only: 1
Filesystem f2fs get_tree() didn't set fc->root, returned 1
------------[ cut here ]------------
kernel BUG at fs/super.c:1761!
Oops: invalid opcode: 0000 [#1] SMP PTI
CPU: 3 UID: 0 PID: 722 Comm: mount Not tainted 6.18.0-rc2+ #721 PREEMPT(voluntary)
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
RIP: 0010:vfs_get_tree.cold+0x18/0x1a
Call Trace:
 <TASK>
 fc_mount+0x13/0xa0
 path_mount+0x34e/0xc50
 __x64_sys_mount+0x121/0x150
 do_syscall_64+0x84/0x800
 entry_SYSCALL_64_after_hwframe+0x76/0x7e
RIP: 0033:0x7fa6cc126cfe

The root cause is that we failed to handle the error number returned from f2fs_recover_fsync_data() when mounting an image with the ro,norecovery or ro,disable_roll_forward mount option, which resulted in returning a positive error number to vfs_get_tree(). Fix it.

Cc: stable@kernel.org
Fixes: 6781eabba1bd ("f2fs: give -EINVAL for norecovery and rw mount")
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
10 days f2fs: add fadvise tracepoint Jaegeuk Kim 2 -0/+34
This adds a tracepoint in the fadvise call path. Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
10 days f2fs: fix age extent cache insertion skip on counter overflow Xiaole He 3 -4/+16
The age extent cache uses last_blocks (derived from allocated_data_blocks) to determine data age. However, there's a conflict between the deletion marker (last_blocks=0) and legitimate last_blocks=0 cases when allocated_data_blocks overflows to 0 after reaching ULLONG_MAX.

In this case, valid extents are incorrectly skipped due to the "if (!tei->last_blocks)" check in __update_extent_tree_range().

This patch fixes the issue by:
1. Reserving ULLONG_MAX as an invalid/deletion marker
2. Limiting allocated_data_blocks to range [0, ULLONG_MAX-1]
3. Using F2FS_EXTENT_AGE_INVALID for deletion scenarios
4. Adjusting overflow age calculation from ULLONG_MAX to (ULLONG_MAX-1)

Reproducer (using a patched kernel with allocated_data_blocks initialized to ULLONG_MAX - 3 for quick testing):

Step 1: Mount and check initial state
  # dd if=/dev/zero of=/tmp/test.img bs=1M count=100
  # mkfs.f2fs -f /tmp/test.img
  # mkdir -p /mnt/f2fs_test
  # mount -t f2fs -o loop,age_extent_cache /tmp/test.img /mnt/f2fs_test
  # cat /sys/kernel/debug/f2fs/status | grep -A 4 "Block Age"
  Allocated Data Blocks: 18446744073709551612 # ULLONG_MAX - 3
  Inner Struct Count: tree: 1(0), node: 0

Step 2: Create files and write data to trigger overflow
  # touch /mnt/f2fs_test/{1,2,3,4}.txt; sync
  # cat /sys/kernel/debug/f2fs/status | grep -A 4 "Block Age"
  Allocated Data Blocks: 18446744073709551613 # ULLONG_MAX - 2
  Inner Struct Count: tree: 5(0), node: 1

  # dd if=/dev/urandom of=/mnt/f2fs_test/1.txt bs=4K count=1; sync
  # cat /sys/kernel/debug/f2fs/status | grep -A 4 "Block Age"
  Allocated Data Blocks: 18446744073709551614 # ULLONG_MAX - 1
  Inner Struct Count: tree: 5(0), node: 2

  # dd if=/dev/urandom of=/mnt/f2fs_test/2.txt bs=4K count=1; sync
  # cat /sys/kernel/debug/f2fs/status | grep -A 4 "Block Age"
  Allocated Data Blocks: 18446744073709551615 # ULLONG_MAX
  Inner Struct Count: tree: 5(0), node: 3

  # dd if=/dev/urandom of=/mnt/f2fs_test/3.txt bs=4K count=1; sync
  # cat /sys/kernel/debug/f2fs/status | grep -A 4 "Block Age"
  Allocated Data Blocks: 0 # Counter overflowed!
  Inner Struct Count: tree: 5(0), node: 4

Step 3: Trigger the bug - next write should create node but gets skipped
  # dd if=/dev/urandom of=/mnt/f2fs_test/4.txt bs=4K count=1; sync
  # cat /sys/kernel/debug/f2fs/status | grep -A 4 "Block Age"
  Allocated Data Blocks: 1
  Inner Struct Count: tree: 5(0), node: 4

  Expected: node: 5 (new extent node for 4.txt)
  Actual: node: 4 (extent insertion was incorrectly skipped due to last_blocks = allocated_data_blocks = 0 in __get_new_block_age)

After this fix, the extent node is correctly inserted and node count becomes 5 as expected.

Fixes: 71644dff4811 ("f2fs: add block_age-based extent cache")
Cc: stable@kernel.org
Signed-off-by: Xiaole He <hexiaole1994@126.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
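The sentinel scheme from the list above (reserve ULLONG_MAX as the marker, keep the counter in [0, ULLONG_MAX-1]) can be modelled in a few lines. This is a sketch under assumptions: `advance_allocated_blocks()` and `is_deletion_marker()` are hypothetical helpers, and the kernel's actual wrap-around code may be written differently.

```c
#include <limits.h>

/* Per the commit message, ULLONG_MAX is reserved as the invalid/deletion marker. */
#define EXTENT_AGE_INVALID ULLONG_MAX

/*
 * Hypothetical helper: advance the counter by count, wrapping inside
 * [0, ULLONG_MAX - 1] so the result never equals the reserved marker.
 * Assumes cur is already within that range.
 */
static unsigned long long advance_allocated_blocks(unsigned long long cur,
						   unsigned long long count)
{
	/* Steps left before reaching the last valid value. */
	unsigned long long room = (ULLONG_MAX - 1) - cur;

	if (count <= room)
		return cur + count;
	/* Wrap past ULLONG_MAX - 1 back to 0. */
	return count - room - 1;
}

/* A last_blocks value of 0 is now legitimate; only the marker means "deleted". */
static int is_deletion_marker(unsigned long long last_blocks)
{
	return last_blocks == EXTENT_AGE_INVALID;
}
```

Because 0 no longer doubles as the marker, an extent recorded right after the counter wraps is not mistaken for a deletion.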
10 days f2fs: Add sanity checks before unlinking and loading inodes Nikola Z. Ivanov 2 -5/+18
Add check for inode->i_nlink == 1 for directories during unlink, as their value is decremented twice, which can trigger a warning in drop_nlink. In such case mark the filesystem as corrupted and return from the function call with the relevant failure return value. Additionally add the check for i_nlink == 1 in sanity_check_inode in order to detect on-disk corruption early. Reported-by: syzbot+c07d47c7bc68f47b9083@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=c07d47c7bc68f47b9083 Tested-by: syzbot+c07d47c7bc68f47b9083@syzkaller.appspotmail.com Signed-off-by: Nikola Z. Ivanov <zlatistiv@gmail.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
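The reasoning behind the i_nlink == 1 check can be sketched as follows. Both helpers are hypothetical models rather than f2fs functions, the -1 return is a placeholder for the real failure value, and the extra guard for a zero link count is an assumption added here to keep the sketch from underflowing.

```c
#include <stdbool.h>

/*
 * Unlinking a directory drops its link count twice, so a directory
 * whose count is already 1 cannot be unlinked without underflowing.
 */
static bool dir_nlink_is_corrupted(bool is_dir, unsigned int nlink)
{
	return is_dir && nlink == 1;
}

/* Returns the link count after unlink, or -1 when the inode looks corrupted. */
static int nlink_after_unlink(bool is_dir, unsigned int nlink)
{
	/* The nlink == 0 guard is an assumption of this sketch, not from the commit. */
	if (nlink == 0 || dir_nlink_is_corrupted(is_dir, nlink))
		return -1;
	return (int)nlink - (is_dir ? 2 : 1);
}
```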
10 days f2fs: Rename f2fs_unlink exit label Nikola Z. Ivanov 1 -7/+7
Rename "fail" label to "out" as it's used as a default exit path out of f2fs_unlink as well as error path. Signed-off-by: Nikola Z. Ivanov <zlatistiv@gmail.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
10 days f2fs: ensure minimum trim granularity accounts for all devices Yongpeng Yang 2 -6/+18
When F2FS uses multiple block devices, each device may have a different discard granularity. The minimum trim granularity must be at least the maximum discard granularity of all devices, excluding zoned devices. Use max_t instead of the max() macro to compute the maximum value. Signed-off-by: Yongpeng Yang <yangyongpeng@xiaomi.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
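As a rough model of the rule above, the loop below takes the largest discard granularity across the non-zoned devices. The `struct dev_info` layout, the `min_trim_granularity()` helper, and the `base` parameter are all hypothetical, and the `max_t()` macro here is a simplified userspace stand-in for the kernel macro of the same name.

```c
#include <stddef.h>

/* Simplified stand-in for the kernel's max_t(): cast both sides, then compare. */
#define max_t(type, a, b) ((type)(a) > (type)(b) ? (type)(a) : (type)(b))

/* Hypothetical per-device description; not the f2fs structure. */
struct dev_info {
	unsigned int discard_granularity;	/* in bytes */
	int zoned;				/* non-zero for zoned devices */
};

/* The minimum trim granularity must cover the largest non-zoned device granularity. */
static unsigned int min_trim_granularity(const struct dev_info *devs, size_t n,
					 unsigned int base)
{
	unsigned int gran = base;

	for (size_t i = 0; i < n; i++) {
		if (devs[i].zoned)
			continue;	/* zoned devices are excluded */
		gran = max_t(unsigned int, gran, devs[i].discard_granularity);
	}
	return gran;
}
```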
10 days f2fs: fix uninitialized one_time_gc in victim_sel_policy Xiaole He 1 -1/+1
The one_time_gc field in struct victim_sel_policy is conditionally initialized but unconditionally read, leading to undefined behavior that triggers UBSAN warnings. In f2fs_get_victim() at fs/f2fs/gc.c:774, the victim_sel_policy structure is declared without initialization: struct victim_sel_policy p; The field p.one_time_gc is only assigned when the 'one_time' parameter is true (line 789): if (one_time) { p.one_time_gc = one_time; ... } However, this field is unconditionally read in subsequent get_gc_cost() at line 395: if (p->one_time_gc && (valid_thresh_ratio < 100) && ...) When one_time is false, p.one_time_gc contains uninitialized stack memory. Hence p.one_time_gc is an invalid bool value. UBSAN detects this invalid bool value: UBSAN: invalid-load in fs/f2fs/gc.c:395:7 load of value 77 is not a valid value for type '_Bool' CPU: 3 UID: 0 PID: 1297 Comm: f2fs_gc-252:16 Not tainted 6.18.0-rc3 #5 PREEMPT(voluntary) Hardware name: OpenStack Foundation OpenStack Nova, BIOS 1.13.0-1ubuntu1.1 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x70/0x90 dump_stack+0x14/0x20 __ubsan_handle_load_invalid_value+0xb3/0xf0 ? dl_server_update+0x2e/0x40 ? update_curr+0x147/0x170 f2fs_get_victim.cold+0x66/0x134 [f2fs] ? sched_balance_newidle+0x2ca/0x470 ? finish_task_switch.isra.0+0x8d/0x2a0 f2fs_gc+0x2ba/0x8e0 [f2fs] ? _raw_spin_unlock_irqrestore+0x12/0x40 ? __timer_delete_sync+0x80/0xe0 ? timer_delete_sync+0x14/0x20 ? schedule_timeout+0x82/0x100 gc_thread_func+0x38b/0x860 [f2fs] ? gc_thread_func+0x38b/0x860 [f2fs] ? __pfx_autoremove_wake_function+0x10/0x10 kthread+0x10b/0x220 ? __pfx_gc_thread_func+0x10/0x10 [f2fs] ? _raw_spin_unlock_irq+0x12/0x40 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x11a/0x160 ? 
__pfx_kthread+0x10/0x10 ret_from_fork_asm+0x1a/0x30 </TASK> This issue is reliably reproducible with the following steps on a 100GB SSD /dev/vdb: mkfs.f2fs -f /dev/vdb mount /dev/vdb /mnt/f2fs_test fio --name=gc --directory=/mnt/f2fs_test --rw=randwrite \ --bs=4k --size=8G --numjobs=12 --fsync=4 --runtime=10 \ --time_based echo 1 > /sys/fs/f2fs/vdb/gc_urgent The uninitialized value causes incorrect GC victim selection, leading to unpredictable garbage collection behavior. Fix by zero-initializing the entire victim_sel_policy structure to ensure all fields have defined values. Fixes: e791d00bd06c ("f2fs: add valid block ratio not to do excessive GC for one time GC") Cc: stable@kernel.org Signed-off-by: Xiaole He <hexiaole1994@126.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
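The fix pattern, zero-initializing the whole structure so that a conditionally assigned field still has a defined value, can be shown on a reduced model. The struct below is a hypothetical cut-down version with illustrative fields, and `read_one_time_gc()` is not a kernel function.

```c
#include <stdbool.h>

/* Reduced, hypothetical version of struct victim_sel_policy. */
struct victim_sel_policy_sketch {
	int alloc_mode;
	unsigned int min_cost;
	bool one_time_gc;
};

static bool read_one_time_gc(bool one_time)
{
	/* Zero-initialize every field, so one_time_gc is false unless set below. */
	struct victim_sel_policy_sketch p = { 0 };

	if (one_time)
		p.one_time_gc = one_time;

	/* This read is well defined on both paths. */
	return p.one_time_gc;
}
```

Without the initializer, the false path would read whatever happened to be on the stack, which is the invalid bool value UBSAN reported.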
10 days f2fs: fix to access i_size w/ i_size_read() Chao Yu 1 -4/+4
It is recommended to use i_size_{read,write}() to access and update i_size; otherwise we may read a torn value, because the high 32 bits and the low 32 bits of the i_size field are not updated atomically on 32-bit architecture machines.

Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
10 days f2fs: ensure node page reads complete before f2fs_put_super() finishes Jan Prusakowski 1 -8/+9
Xfstests generic/335, generic/336 sometimes crash with the following message: F2FS-fs (dm-0): detect filesystem reference count leak during umount, type: 9, count: 1 ------------[ cut here ]------------ kernel BUG at fs/f2fs/super.c:1939! Oops: invalid opcode: 0000 [#1] SMP NOPTI CPU: 1 UID: 0 PID: 609351 Comm: umount Tainted: G W 6.17.0-rc5-xfstests-g9dd1835ecda5 #1 PREEMPT(none) Tainted: [W]=WARN Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014 RIP: 0010:f2fs_put_super+0x3b3/0x3c0 Call Trace: <TASK> generic_shutdown_super+0x7e/0x190 kill_block_super+0x1a/0x40 kill_f2fs_super+0x9d/0x190 deactivate_locked_super+0x30/0xb0 cleanup_mnt+0xba/0x150 task_work_run+0x5c/0xa0 exit_to_user_mode_loop+0xb7/0xc0 do_syscall_64+0x1ae/0x1c0 entry_SYSCALL_64_after_hwframe+0x76/0x7e </TASK> ---[ end trace 0000000000000000 ]--- It appears that sometimes it is possible that f2fs_put_super() is called before all node page reads are completed. Adding a call to f2fs_wait_on_all_pages() for F2FS_RD_NODE fixes the problem. Cc: stable@kernel.org Fixes: 20872584b8c0b ("f2fs: fix to drop all dirty meta/node pages during umount()") Signed-off-by: Jan Prusakowski <jprusakowski@google.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
10 days f2fs: block cache/dio write during f2fs_enable_checkpoint() Chao Yu 3 -9/+34
If there are too many background IOs during f2fs_enable_checkpoint(), sync_inodes_sb() may be blocked for a long time, because it loops writing the dirty data that parallel write() calls keep generating.

Let's change as below to resolve this issue:
- hold cp_enable_rwsem write lock to block any cache/dio write
- decrease DEF_ENABLE_INTERVAL from 16 to 5

In addition, dump more logs during f2fs_enable_checkpoint().

Testcase:
1. fill data into filesystem until 90% usage.
2. mount -o remount,checkpoint=disable:10% /data
3. fio --rw=randwrite --bs=4kb --size=1GB --numjobs=10 \
   --iodepth=64 --ioengine=psync --time_based --runtime=600 \
   --directory=/data/fio_dir/ &
4. mount -o remount,checkpoint=enable /data

Before:
F2FS-fs (dm-51): f2fs_enable_checkpoint() finishes, writeback:7232, sync:39793, cp:457

After:
F2FS-fs (dm-51): f2fs_enable_checkpoint end, writeback:5032, lock:0, sync_inode:5552, sync_fs:84

Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
10 days f2fs: fix to propagate error from f2fs_enable_checkpoint() Chao Yu 1 -10/+16
Propagate the error from f2fs_enable_checkpoint() so that userspace can detect it rather than suffering a silent failure.

Fixes: 4354994f097d ("f2fs: checkpoint disabling")
Cc: stable@kernel.org
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
10 days f2fs: change the unlock parameter of f2fs_put_page to bool Yongpeng Yang 6 -16/+14
Change the type of the unlock parameter of f2fs_put_page to bool. All callers should consistently pass true or false. No logical change. Signed-off-by: Yongpeng Yang <yangyongpeng@xiaomi.com> Reviewed-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
10 days f2fs: invalidate dentry cache on failed whiteout creation Deepanshu Kartikey 1 -2/+4
F2FS can mount filesystems with corrupted directory depth values that get runtime-clamped to MAX_DIR_HASH_DEPTH. When RENAME_WHITEOUT operations are performed on such directories, f2fs_rename performs directory modifications (updating target entry and deleting source entry) before attempting to add the whiteout entry via f2fs_add_link.

If f2fs_add_link fails due to the corrupted directory structure, the function returns an error to VFS, but the partial directory modifications have already been committed to disk. VFS assumes the entire rename operation failed and does not update the dentry cache, leaving stale mappings.

In the error path, VFS does not call d_move() to update the dentry cache. This results in new_dentry still pointing to the old inode (new_inode) which has already had its i_nlink decremented to zero. The stale cache causes subsequent operations to incorrectly reference the freed inode.

This causes subsequent operations to use cached dentry information that no longer matches the on-disk state. When a second rename targets the same entry, VFS attempts to decrement i_nlink on the stale inode, which may already have i_nlink=0, triggering a WARNING in drop_nlink().

Example sequence:
1. First rename (RENAME_WHITEOUT): file2 → file1
   - f2fs updates file1 entry on disk (points to inode 8)
   - f2fs deletes file2 entry on disk
   - f2fs_add_link(whiteout) fails (corrupted directory)
   - Returns error to VFS
   - VFS does not call d_move() due to error
   - VFS cache still has: file1 → inode 7 (stale!)
   - inode 7 has i_nlink=0 (already decremented)

2. Second rename: file3 → file1
   - VFS uses stale cache: file1 → inode 7
   - Tries to drop_nlink on inode 7 (i_nlink already 0)
   - WARNING in drop_nlink()

Fix this by explicitly invalidating old_dentry and new_dentry when f2fs_add_link fails during whiteout creation. This forces VFS to refresh from disk on subsequent operations, ensuring cache consistency even when the rename partially succeeds.

Reproducer:
1. Mount F2FS image with corrupted i_current_depth
2. renameat2(file2, file1, RENAME_WHITEOUT)
3. renameat2(file3, file1, 0)
4. System triggers WARNING in drop_nlink()

Fixes: 7e01e7ad746b ("f2fs: support RENAME_WHITEOUT")
Reported-by: syzbot+632cf32276a9a564188d@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=632cf32276a9a564188d
Suggested-by: Chao Yu <chao@kernel.org>
Link: https://lore.kernel.org/all/20251022233349.102728-1-kartikey406@gmail.com/ [v1]
Cc: stable@vger.kernel.org
Signed-off-by: Deepanshu Kartikey <kartikey406@gmail.com>
Reviewed-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
10 days f2fs: use global inline_xattr_slab instead of per-sb slab cache Chao Yu 4 -37/+25
As Hong Yun reported on the mailing list:

loop7: detected capacity change from 0 to 131072
------------[ cut here ]------------
kmem_cache of name 'f2fs_xattr_entry-7:7' already exists
WARNING: CPU: 0 PID: 24426 at mm/slab_common.c:110 kmem_cache_sanity_check mm/slab_common.c:109 [inline]
WARNING: CPU: 0 PID: 24426 at mm/slab_common.c:110 __kmem_cache_create_args+0xa6/0x320 mm/slab_common.c:307
CPU: 0 UID: 0 PID: 24426 Comm: syz.7.1370 Not tainted 6.17.0-rc4 #1 PREEMPT(full)
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
RIP: 0010:kmem_cache_sanity_check mm/slab_common.c:109 [inline]
RIP: 0010:__kmem_cache_create_args+0xa6/0x320 mm/slab_common.c:307
Call Trace:
 __kmem_cache_create include/linux/slab.h:353 [inline]
 f2fs_kmem_cache_create fs/f2fs/f2fs.h:2943 [inline]
 f2fs_init_xattr_caches+0xa5/0xe0 fs/f2fs/xattr.c:843
 f2fs_fill_super+0x1645/0x2620 fs/f2fs/super.c:4918
 get_tree_bdev_flags+0x1fb/0x260 fs/super.c:1692
 vfs_get_tree+0x43/0x140 fs/super.c:1815
 do_new_mount+0x201/0x550 fs/namespace.c:3808
 do_mount fs/namespace.c:4136 [inline]
 __do_sys_mount fs/namespace.c:4347 [inline]
 __se_sys_mount+0x298/0x2f0 fs/namespace.c:4324
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0x8e/0x3a0 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x76/0x7e

The bug can be reproduced with the script below:
- mount /dev/vdb /mnt1
- mount /dev/vdc /mnt2
- umount /mnt1
- mount /dev/vdb /mnt1

The reason is that we created two slab caches, named f2fs_xattr_entry-7:3 and f2fs_xattr_entry-7:7, with the same slab size. The slab system actually creates only one slab cache core structure, named "f2fs_xattr_entry-7:3", and the two slab caches share that structure and cache address.

So, if we destroy the f2fs_xattr_entry-7:3 cache by its cache address, this only decreases the reference count of the slab cache rather than releasing it entirely, since one more user still references the cache.

Then, if we try to create a slab cache named "f2fs_xattr_entry-7:3" again, the slab system finds an existing cache with the same name and triggers the warning.

Let's change to use a global inline_xattr_slab instead of a per-sb slab cache to fix this.

Fixes: a999150f4fe3 ("f2fs: use kmem_cache pool during inline xattr lookups")
Cc: stable@kernel.org
Reported-by: Hong Yun <yhong@link.cuhk.edu.hk>
Tested-by: Hong Yun <yhong@link.cuhk.edu.hk>
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
10 days f2fs: fix to avoid updating compression context during writeback Chao Yu 4 -3/+23
Bai, Shuangpeng <sjb7183@psu.edu> reported a bug as below:

Oops: divide error: 0000 [#1] SMP KASAN PTI
CPU: 0 UID: 0 PID: 11441 Comm: syz.0.46 Not tainted 6.17.0 #1 PREEMPT(full)
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
RIP: 0010:f2fs_all_cluster_page_ready+0x106/0x550 fs/f2fs/compress.c:857
Call Trace:
 <TASK>
 f2fs_write_cache_pages fs/f2fs/data.c:3078 [inline]
 __f2fs_write_data_pages fs/f2fs/data.c:3290 [inline]
 f2fs_write_data_pages+0x1c19/0x3600 fs/f2fs/data.c:3317
 do_writepages+0x38e/0x640 mm/page-writeback.c:2634
 filemap_fdatawrite_wbc mm/filemap.c:386 [inline]
 __filemap_fdatawrite_range mm/filemap.c:419 [inline]
 file_write_and_wait_range+0x2ba/0x3e0 mm/filemap.c:794
 f2fs_do_sync_file+0x6e6/0x1b00 fs/f2fs/file.c:294
 generic_write_sync include/linux/fs.h:3043 [inline]
 f2fs_file_write_iter+0x76e/0x2700 fs/f2fs/file.c:5259
 new_sync_write fs/read_write.c:593 [inline]
 vfs_write+0x7e9/0xe00 fs/read_write.c:686
 ksys_write+0x19d/0x2d0 fs/read_write.c:738
 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
 do_syscall_64+0xf7/0x470 arch/x86/entry/syscall_64.c:94
 entry_SYSCALL_64_after_hwframe+0x77/0x7f

The bug was triggered by the race condition below:

fsync                                setattr                 ioctl
- f2fs_do_sync_file
 - file_write_and_wait_range
  - f2fs_write_cache_pages
  : inode is non-compressed
  : cc.cluster_size =
    F2FS_I(inode)->i_cluster_size = 0
   - tag_pages_for_writeback
                                     - f2fs_setattr
                                      - truncate_setsize
                                      - f2fs_truncate
                                                             - f2fs_fileattr_set
                                                              - f2fs_setflags_common
                                                               - set_compress_context
                                                               : F2FS_I(inode)->i_cluster_size = 4
                                                               : set_inode_flag(inode, FI_COMPRESSED_FILE)
   - f2fs_compressed_file
   : return true
   - f2fs_all_cluster_page_ready
   : "pgidx % cc->cluster_size" trigger dividing 0 issue

Let's change as below to fix this issue:
- introduce a new atomic type variable .writeback in structure f2fs_inode_info to track the number of threads that are calling f2fs_write_cache_pages().
- use .i_sem lock to protect .writeback update.
- check .writeback before updating the compression context in f2fs_setflags_common() to avoid a race w/ ->writepages.

Fixes: 4c8ff7095bef ("f2fs: support data compression")
Cc: stable@kernel.org
Reported-by: Bai, Shuangpeng <sjb7183@psu.edu>
Tested-by: Bai, Shuangpeng <sjb7183@psu.edu>
Closes: https://lore.kernel.org/lkml/44D8F7B3-68AD-425F-9915-65D27591F93F@psu.edu
Signed-off-by: Chao Yu <chao@kernel.org>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
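The three-part fix above (a writeback counter, a lock around its updates, and a check before changing the compression context) can be modelled in userspace. This is a sketch only: `struct inode_sketch` and the helper functions are hypothetical, a pthread mutex stands in for the i_sem lock, a plain int under that lock stands in for the atomic counter, and `-EBUSY` is an assumed error code since the message does not say what the kernel returns.

```c
#include <errno.h>
#include <pthread.h>

/* Hypothetical model of the fields involved; not struct f2fs_inode_info. */
struct inode_sketch {
	pthread_mutex_t i_sem;		/* stands in for the i_sem lock */
	int writeback;			/* threads currently in the writeback loop */
	unsigned int cluster_size;	/* 0 while the file is not compressed */
};

/* Writeback path: register under the lock before reading cluster_size. */
static void writeback_begin(struct inode_sketch *inode)
{
	pthread_mutex_lock(&inode->i_sem);
	inode->writeback++;
	pthread_mutex_unlock(&inode->i_sem);
}

static void writeback_end(struct inode_sketch *inode)
{
	pthread_mutex_lock(&inode->i_sem);
	inode->writeback--;
	pthread_mutex_unlock(&inode->i_sem);
}

/* Flag-setting path: refuse to change the context while writeback is active. */
static int set_compress_context(struct inode_sketch *inode, unsigned int cluster_size)
{
	int ret = 0;

	pthread_mutex_lock(&inode->i_sem);
	if (inode->writeback > 0)
		ret = -EBUSY;		/* assumed error code */
	else
		inode->cluster_size = cluster_size;
	pthread_mutex_unlock(&inode->i_sem);
	return ret;
}
```

Holding the same lock for the counter update and for the check-then-set is what prevents the writeback path from seeing a cluster size of 0 together with a compressed flag that was set afterwards.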
10 days f2fs: fix to avoid updating zero-sized extent in extent cache Chao Yu 1 -2/+5
As syzbot reported: F2FS-fs (loop0): __update_extent_tree_range: extent len is zero, type: 0, extent [0, 0, 0], age [0, 0] ------------[ cut here ]------------ kernel BUG at fs/f2fs/extent_cache.c:678! Oops: invalid opcode: 0000 [#1] SMP KASAN NOPTI CPU: 0 UID: 0 PID: 5336 Comm: syz.0.0 Not tainted syzkaller #0 PREEMPT(full) Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014 RIP: 0010:__update_extent_tree_range+0x13bc/0x1500 fs/f2fs/extent_cache.c:678 Call Trace: <TASK> f2fs_update_read_extent_cache_range+0x192/0x3e0 fs/f2fs/extent_cache.c:1085 f2fs_do_zero_range fs/f2fs/file.c:1657 [inline] f2fs_zero_range+0x10c1/0x1580 fs/f2fs/file.c:1737 f2fs_fallocate+0x583/0x990 fs/f2fs/file.c:2030 vfs_fallocate+0x669/0x7e0 fs/open.c:342 ioctl_preallocate fs/ioctl.c:289 [inline] file_ioctl+0x611/0x780 fs/ioctl.c:-1 do_vfs_ioctl+0xb33/0x1430 fs/ioctl.c:576 __do_sys_ioctl fs/ioctl.c:595 [inline] __se_sys_ioctl+0x82/0x170 fs/ioctl.c:583 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0xfa/0x3b0 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7f07bc58eec9 In error path of f2fs_zero_range(), it may add a zero-sized extent into extent cache, it should be avoided. Fixes: 6e9619499f53 ("f2fs: support in batch fzero in dnode page") Cc: stable@kernel.org Reported-by: syzbot+24124df3170c3638b35f@syzkaller.appspotmail.com Closes: https://lore.kernel.org/linux-f2fs-devel/68e5d698.050a0220.256323.0032.GAE@google.com Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
10 daysf2fs: fix to avoid potential deadlockChao Yu3-46/+1
As Jiaming Zhang and syzbot reported, there is potential deadlock in f2fs as below: Chain exists of: &sbi->cp_rwsem --> fs_reclaim --> sb_internal#2 Possible unsafe locking scenario: CPU0 CPU1 ---- ---- rlock(sb_internal#2); lock(fs_reclaim); lock(sb_internal#2); rlock(&sbi->cp_rwsem); *** DEADLOCK *** 3 locks held by kswapd0/73: #0: ffffffff8e247a40 (fs_reclaim){+.+.}-{0:0}, at: balance_pgdat mm/vmscan.c:7015 [inline] #0: ffffffff8e247a40 (fs_reclaim){+.+.}-{0:0}, at: kswapd+0x951/0x2800 mm/vmscan.c:7389 #1: ffff8880118400e0 (&type->s_umount_key#50){.+.+}-{4:4}, at: super_trylock_shared fs/super.c:562 [inline] #1: ffff8880118400e0 (&type->s_umount_key#50){.+.+}-{4:4}, at: super_cache_scan+0x91/0x4b0 fs/super.c:197 #2: ffff888011840610 (sb_internal#2){.+.+}-{0:0}, at: f2fs_evict_inode+0x8d9/0x1b60 fs/f2fs/inode.c:890 stack backtrace: CPU: 0 UID: 0 PID: 73 Comm: kswapd0 Not tainted syzkaller #0 PREEMPT(full) Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x189/0x250 lib/dump_stack.c:120 print_circular_bug+0x2ee/0x310 kernel/locking/lockdep.c:2043 check_noncircular+0x134/0x160 kernel/locking/lockdep.c:2175 check_prev_add kernel/locking/lockdep.c:3165 [inline] check_prevs_add kernel/locking/lockdep.c:3284 [inline] validate_chain+0xb9b/0x2140 kernel/locking/lockdep.c:3908 __lock_acquire+0xab9/0xd20 kernel/locking/lockdep.c:5237 lock_acquire+0x120/0x360 kernel/locking/lockdep.c:5868 down_read+0x46/0x2e0 kernel/locking/rwsem.c:1537 f2fs_down_read fs/f2fs/f2fs.h:2278 [inline] f2fs_lock_op fs/f2fs/f2fs.h:2357 [inline] f2fs_do_truncate_blocks+0x21c/0x10c0 fs/f2fs/file.c:791 f2fs_truncate_blocks+0x10a/0x300 fs/f2fs/file.c:867 f2fs_truncate+0x489/0x7c0 fs/f2fs/file.c:925 f2fs_evict_inode+0x9f2/0x1b60 fs/f2fs/inode.c:897 evict+0x504/0x9c0 fs/inode.c:810 f2fs_evict_inode+0x1dc/0x1b60 fs/f2fs/inode.c:853 evict+0x504/0x9c0 fs/inode.c:810 dispose_list fs/inode.c:852 [inline] 
prune_icache_sb+0x21b/0x2c0 fs/inode.c:1000 super_cache_scan+0x39b/0x4b0 fs/super.c:224 do_shrink_slab+0x6ef/0x1110 mm/shrinker.c:437 shrink_slab_memcg mm/shrinker.c:550 [inline] shrink_slab+0x7ef/0x10d0 mm/shrinker.c:628 shrink_one+0x28a/0x7c0 mm/vmscan.c:4955 shrink_many mm/vmscan.c:5016 [inline] lru_gen_shrink_node mm/vmscan.c:5094 [inline] shrink_node+0x315d/0x3780 mm/vmscan.c:6081 kswapd_shrink_node mm/vmscan.c:6941 [inline] balance_pgdat mm/vmscan.c:7124 [inline] kswapd+0x147c/0x2800 mm/vmscan.c:7389 kthread+0x70e/0x8a0 kernel/kthread.c:463 ret_from_fork+0x4bc/0x870 arch/x86/kernel/process.c:158 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245 </TASK> The root cause is deadlock among four locks as below: kswapd - fs_reclaim --- Lock A - shrink_one - evict - f2fs_evict_inode - sb_start_intwrite --- Lock B - iput - evict - f2fs_evict_inode - sb_start_intwrite --- Lock B - f2fs_truncate - f2fs_truncate_blocks - f2fs_do_truncate_blocks - f2fs_lock_op --- Lock C ioctl - f2fs_ioc_commit_atomic_write - f2fs_lock_op --- Lock C - __f2fs_commit_atomic_write - __replace_atomic_write_block - f2fs_get_dnode_of_data - __get_node_folio - f2fs_check_nid_range - f2fs_handle_error - f2fs_record_errors - f2fs_down_write --- Lock D open - do_open - do_truncate - security_inode_need_killpriv - f2fs_getxattr - lookup_all_xattrs - f2fs_handle_error - f2fs_record_errors - f2fs_down_write --- Lock D - f2fs_commit_super - read_mapping_folio - filemap_alloc_folio_noprof - prepare_alloc_pages - fs_reclaim_acquire --- Lock A In order to avoid such deadlock, we need to avoid grabbing sb_lock in f2fs_handle_error(), so, let's use asynchronous method instead: - remove f2fs_handle_error() implementation - rename f2fs_handle_error_async() to f2fs_handle_error() - spread f2fs_handle_error() Fixes: 95fa90c9e5a7 ("f2fs: support recording errors into superblock") Cc: stable@kernel.org Reported-by: syzbot+14b90e1156b9f6fc1266@syzkaller.appspotmail.com Closes: 
https://lore.kernel.org/linux-f2fs-devel/68eae49b.050a0220.ac43.0001.GAE@google.com Reported-by: Jiaming Zhang <r772577952@gmail.com> Closes: https://lore.kernel.org/lkml/CANypQFa-Gy9sD-N35o3PC+FystOWkNuN8pv6S75HLT0ga-Tzgw@mail.gmail.com Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
10 daysf2fs: use f2fs_filemap_get_folio() to support fault injectionChao Yu2-3/+3
Use f2fs_filemap_get_folio() instead of __filemap_get_folio() in: - f2fs_find_data_folio - f2fs_write_begin - f2fs_read_merkle_tree_page so that we can trigger fault injection in those places. Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
10 daysf2fs: use f2fs_filemap_get_folio() instead of f2fs_pagecache_get_page()Chao Yu2-20/+13
Let's use f2fs_filemap_get_folio() instead of f2fs_pagecache_get_page() in ra_data_block() and move_data_block(), then remove f2fs_pagecache_get_page() since it no longer has any users. Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
10 daysf2fs: convert add_ipu_page() to use folioChao Yu1-4/+3
No logic changes. Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
10 daysf2fs: clean up w/ bio_add_folio_nofail()Chao Yu1-4/+3
In add_bio_entry(), adding a page to a newly allocated bio should never fail, so let's use bio_add_folio_nofail() instead of bio_add_page() and drop the unnecessary error handling. Signed-off-by: Chao Yu <chao@kernel.org> Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
10 daysMerge tag 'net-next-6.19' of ↵Linus Torvalds1652-23922/+57995
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next Pull networking updates from Jakub Kicinski: "Core & protocols: - Replace busylock at the Tx queuing layer with a lockless list. Resulting in a 300% (4x) improvement on heavy TX workloads, sending twice the number of packets per second, for half the cpu cycles. - Allow constantly busy flows to migrate to a more suitable CPU/NIC queue. Normally we perform queue re-selection when flow comes out of idle, but under extreme circumstances the flows may be constantly busy. Add sysctl to allow periodic rehashing even if it'd risk packet reordering. - Optimize the NAPI skb cache, make it larger, use it in more paths. - Attempt returning Tx skbs to the originating CPU (like we already did for Rx skbs). - Various data structure layout and prefetch optimizations from Eric. - Remove ktime_get() from the recvmsg() fast path, ktime_get() is sadly quite expensive on recent AMD machines. - Extend threaded NAPI polling to allow the kthread busy poll for packets. - Make MPTCP use Rx backlog processing. This lowers the lock pressure, improving the Rx performance. - Support memcg accounting of MPTCP socket memory. - Allow admin to opt sockets out of global protocol memory accounting (using a sysctl or BPF-based policy). The global limits are a poor fit for modern container workloads, where limits are imposed using cgroups. - Improve heuristics for when to kick off AF_UNIX garbage collection. - Allow users to control TCP SACK compression, and default to 33% of RTT. - Add tcp_rcvbuf_low_rtt sysctl to let datacenter users avoid unnecessarily aggressive rcvbuf growth and overshot when the connection RTT is low. - Preserve skb metadata space across skb_push / skb_pull operations. - Support for IPIP encapsulation in the nftables flowtable offload. - Support appending IP interface information to ICMP messages (RFC 5837). - Support setting max record size in TLS (RFC 8449). 
- Remove taking rtnl_lock from RTM_GETNEIGHTBL and RTM_SETNEIGHTBL. - Use a dedicated lock (and RCU) in MPLS, instead of rtnl_lock. - Let users configure the number of write buffers in SMC. - Add new struct sockaddr_unsized for sockaddr of unknown length, from Kees. - Some conversions away from the crypto_ahash API, from Eric Biggers. - Some preparations for slimming down struct page. - YAML Netlink protocol spec for WireGuard. - Add a tool on top of YAML Netlink specs/lib for reporting commonly computed derived statistics and summarized system state. Driver API: - Add CAN XL support to the CAN Netlink interface. - Add uAPI for reporting PHY Mean Square Error (MSE) diagnostics, as defined by the OPEN Alliance's "Advanced diagnostic features for 100BASE-T1 automotive Ethernet PHYs" specification. - Add DPLL phase-adjust-gran pin attribute (and implement it in zl3073x). - Refactor xfrm_input lock to reduce contention when NIC offloads IPsec and performs RSS. - Add info to devlink params whether the current setting is the default or a user override. Allow resetting back to default. - Add standard device stats for PSP crypto offload. - Leverage DSA frame broadcast to implement simple HSR frame duplication for a lot of switches without dedicated HSR offload. - Add uAPI defines for 1.6Tbps link modes. Device drivers: - Add Motorcomm YT921x gigabit Ethernet switch support. - Add MUCSE driver for N500/N210 1GbE NIC series. - Convert drivers to support dedicated ops for timestamping control, and away from the direct IOCTL handling. While at it support GET operations for PHY timestamping. - Add (and convert most drivers to) a dedicated ethtool callback for reading the Rx ring count. - Significant refactoring efforts in the STMMAC driver, which supports Synopsys turn-key MAC IP integrated into a ton of SoCs. 
- Ethernet high-speed NICs: - Broadcom (bnxt): - support PPS in/out on all pins - Intel (100G, ice, idpf): - ice: implement standard ethtool and timestamping stats - i40e: support setting the max number of MAC addresses per VF - iavf: support RSS of GTP tunnels for 5G and LTE deployments - nVidia/Mellanox (mlx5): - reduce downtime on interface reconfiguration - disable being an XDP redirect target by default (same as other drivers) to avoid wasting resources if feature is unused - Meta (fbnic): - add support for Linux-managed PCS on 25G, 50G, and 100G links - Wangxun: - support Rx descriptor merge, and Tx head writeback - support Rx coalescing offload - support 25G SPF and 40G QSFP modules - Ethernet virtual: - Google (gve): - allow ethtool to configure rx_buf_len - implement XDP HW RX Timestamping support for DQ descriptor format - Microsoft vNIC (mana): - support HW link state events - handle hardware recovery events when probing the device - Ethernet NICs consumer, and embedded: - usbnet: add support for Byte Queue Limits (BQL) - AMD (amd-xgbe): - add device selftests - NXP (enetc): - add i.MX94 support - Broadcom integrated MACs (bcmgenet, bcmasp): - bcmasp: add support for PHY-based Wake-on-LAN - Broadcom switches (b53): - support port isolation - support BCM5389/97/98 and BCM63XX ARL formats - Lantiq/MaxLinear switches: - support bridge FDB entries on the CPU port - use regmap for register access - allow user to enable/disable learning - support Energy Efficient Ethernet - support configuring RMII clock delays - add tagging driver for MaxLinear GSW1xx switches - Synopsys (stmmac): - support using the HW clock in free running mode - add Eswin EIC7700 support - add Rockchip RK3506 support - add Altera Agilex5 support - Cadence (macb): - cleanup and consolidate descriptor and DMA address handling - add EyeQ5 support - TI: - icssg-prueth: support AF_XDP - Airoha access points: - add missing Ethernet stats and link state callback - add AN7583 support - support 
out-of-order Tx completion processing - Power over Ethernet: - pd692x0: preserve PSE configuration across reboots - add support for TPS23881B devices - Ethernet PHYs: - Open Alliance OATC14 10BASE-T1S PHY cable diagnostic support - Support 50G SerDes and 100G interfaces in Linux-managed PHYs - micrel: - support for non PTP SKUs of lan8814 - enable in-band auto-negotiation on lan8814 - realtek: - cable testing support on RTL8224 - interrupt support on RTL8221B - motorcomm: support for PHY LEDs on YT853 - microchip: support for LAN867X Rev.D0 PHYs w/ SQI and cable diag - mscc: support for PHY LED control - CAN drivers: - m_can: add support for optional reset and system wake up - remove can_change_mtu() obsoleted by core handling - mcp251xfd: support GPIO controller functionality - Bluetooth: - add initial support for PASTa - WiFi: - split ieee80211.h file, it's way too big - improvements in VHT radiotap reporting, S1G, Channel Switch Announcement handling, rate tracking in mesh networks - improve multi-radio monitor mode support, and add a cfg80211 debugfs interface for it - HT action frame handling on 6 GHz - initial chanctx work towards NAN - MU-MIMO sniffer improvements - WiFi drivers: - RealTek (rtw89): - support USB devices RTL8852AU and RTL8852CU - initial work for RTL8922DE - improved injection support - Intel: - iwlwifi: new sniffer API support - MediaTek (mt76): - WED support for >32-bit DMA - airoha NPU support - regdomain improvements - continued WiFi7/MLO work - Qualcomm/Atheros: - ath10k: factory test support - ath11k: TX power insertion support - ath12k: BSS color change support - ath12k: statistics improvements - brcmfmac: Acer A1 840 tablet quirk - rtl8xxxu: 40 MHz connection fixes/support" * tag 'net-next-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1381 commits) net: page_pool: sanitise allocation order net: page pool: xa init with destroy on pp init net/mlx5e: Support XDP target xmit with dummy program net/mlx5e: Update 
XDP features in switch channels selftests/tc-testing: Test CAKE scheduler when enqueue drops packets net/sched: sch_cake: Fix incorrect qlen reduction in cake_drop wireguard: netlink: generate netlink code wireguard: uapi: generate header with ynl-gen wireguard: uapi: move flag enums wireguard: uapi: move enum wg_cmd wireguard: netlink: add YNL specification selftests: drv-net: Fix tolerance calculation in devlink_rate_tc_bw.py selftests: drv-net: Fix and clarify TC bandwidth split in devlink_rate_tc_bw.py selftests: drv-net: Set shell=True for sysfs writes in devlink_rate_tc_bw.py selftests: drv-net: Use Iperf3Runner in devlink_rate_tc_bw.py selftests: drv-net: introduce Iperf3Runner for measurement use cases selftests: drv-net: Add devlink_rate_tc_bw.py to TEST_PROGS net: ps3_gelic_net: Use napi_alloc_skb() and napi_gro_receive() Documentation: net: dsa: mention simple HSR offload helpers Documentation: net: dsa: mention availability of RedBox ...
10 daysMerge tag 'bpf-next-6.19' of ↵Linus Torvalds157-4998/+10852
git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next Pull bpf updates from Alexei Starovoitov: - Convert selftests/bpf/test_tc_edt and test_tc_tunnel from .sh to test_progs runner (Alexis Lothoré) - Convert selftests/bpf/test_xsk to test_progs runner (Bastien Curutchet) - Replace bpf memory allocator with kmalloc_nolock() in bpf_local_storage (Amery Hung), and in bpf streams and range tree (Puranjay Mohan) - Introduce support for indirect jumps in BPF verifier and x86 JIT (Anton Protopopov) and arm64 JIT (Puranjay Mohan) - Remove runqslower bpf tool (Hoyeon Lee) - Fix corner cases in the verifier to close several syzbot reports (Eduard Zingerman, KaFai Wan) - Several improvements in deadlock detection in rqspinlock (Kumar Kartikeya Dwivedi) - Implement "jmp" mode for BPF trampoline and corresponding DYNAMIC_FTRACE_WITH_JMP. It improves "fexit" program type performance from 80 M/s to 136 M/s. With Steven's Ack. (Menglong Dong) - Add ability to test non-linear skbs in BPF_PROG_TEST_RUN (Paul Chaignon) - Do not let BPF_PROG_TEST_RUN emit invalid GSO types to stack (Daniel Borkmann) - Generalize buildid reader into bpf_dynptr (Mykyta Yatsenko) - Optimize bpf_map_update_elem() for map-in-map types (Ritesh Oedayrajsingh Varma) - Introduce overwrite mode for BPF ring buffer (Xu Kuohai) * tag 'bpf-next-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (169 commits) bpf: optimize bpf_map_update_elem() for map-in-map types bpf: make kprobe_multi_link_prog_run always_inline selftests/bpf: do not hardcode target rate in test_tc_edt BPF program selftests/bpf: remove test_tc_edt.sh selftests/bpf: integrate test_tc_edt into test_progs selftests/bpf: rename test_tc_edt.bpf.c section to expose program type selftests/bpf: Add success stats to rqspinlock stress test rqspinlock: Precede non-head waiter queueing with AA check rqspinlock: Disable spinning for trylock fallback rqspinlock: Use trylock fallback when per-CPU rqnode is busy rqspinlock: Perform AA 
checks immediately rqspinlock: Enclose lock/unlock within lock entry acquisitions bpf: Remove runqslower tool selftests/bpf: Remove usage of lsm/file_alloc_security in selftest bpf: Disable file_alloc_security hook bpf: check for insn arrays in check_ptr_alignment bpf: force BPF_F_RDONLY_PROG on insn array creation bpf: Fix exclusive map memory leak selftests/bpf: Make CS length configurable for rqspinlock stress test selftests/bpf: Add lock wait time stats to rqspinlock stress test ...
10 daysMerge tag 'linux_kselftest-kunit-6.19-rc1' of ↵Linus Torvalds3-3/+35
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull kunit updates from Shuah Khan: - Make filter parameters configurable via Kconfig - Add description of kunit.enable parameter to documentation * tag 'linux_kselftest-kunit-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: kunit: Make filter parameters configurable via Kconfig Documentation: kunit: add description of kunit.enable parameter
10 daysktest.pl: Fix uninitialized var in config-bisect.plSteven Rostedt1-2/+2
The error path of copying the old config used the wrong variable in the error message: $ mkdir /tmp/build $ ./tools/testing/ktest/config-bisect.pl -b /tmp/build config-good /tmp/config-bad $ chmod 0 /tmp/build $ ./tools/testing/ktest/config-bisect.pl -b /tmp/build config-good /tmp/config-bad good cp /tmp/build//.config config-good.tmp ... [0 seconds] FAILED! Use of uninitialized value $config in concatenation (.) or string at ./tools/testing/ktest/config-bisect.pl line 744. failed to copy to config-good.tmp When it should have shown: failed to copy /tmp/build//.config to config-good.tmp Cc: stable@vger.kernel.org Cc: John 'Warthog9' Hawley <warthog9@kernel.org> Fixes: 0f0db065999cf ("ktest: Add standalone config-bisect.pl program") Link: https://patch.msgid.link/20251203180924.6862bd26@gandalf.local.home Reported-by: "John W. Krahn" <jwkrahn@shaw.ca> Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
10 dayspinctrl: starfive: use dynamic GPIO base allocationAli Tariq4-6/+1
The JH7110 pinctrl driver currently sets a static GPIO base number from platform data: sfp->gc.base = info->gc_base; Static base assignment is deprecated and results in the following warning: gpio gpiochip0: Static allocation of GPIO base is deprecated, use dynamic allocation. Set `sfp->gc.base = -1` to let the GPIO core dynamically allocate the base number. This removes the warning and aligns the driver with current GPIO guidelines. Since the GPIO base is now allocated dynamically, remove `gc_base` field in `struct jh7110_pinctrl_soc_info` and the associated `JH7110_SYS_GC_BASE` and `JH7110_AON_GC_BASE` constants as they are no longer used anywhere in the driver. Tested on VisionFive 2 (JH7110 SoC). Signed-off-by: Ali Tariq <alitariq45892@gmail.com> Reviewed-by: Emil Renner Berthing <emil.renner.berthing@canonical.com> Reviewed-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com> Signed-off-by: Linus Walleij <linusw@kernel.org>
10 dayspinctrl: single: Fix incorrect type for error return variableHaotian Zhang1-3/+4
pcs_pinconf_get() and pcs_pinconf_set() declare ret as unsigned int, but assign it the return values of pcs_get_function() that may return negative error codes. This causes negative error codes to be converted to large positive values. Change ret from unsigned int to int in both functions. Fixes: 9dddb4df90d1 ("pinctrl: single: support generic pinconf") Signed-off-by: Haotian Zhang <vulab@iscas.ac.cn> Signed-off-by: Linus Walleij <linusw@kernel.org>
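The type problem described above can be reproduced in a few lines of userspace C. This is a minimal sketch, assuming a hypothetical helper `lookup_function()` that stands in for a function returning a negative errno on failure; it is not the driver's code.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a lookup that fails with a negative errno. */
static int lookup_function(int pin)
{
	return pin < 0 ? -EINVAL : 0;
}

/*
 * Buggy pattern: the result is stored in an unsigned variable, so -EINVAL
 * becomes a large positive value and no longer looks like an error.
 */
static bool failed_buggy(int pin)
{
	unsigned int ret = lookup_function(pin);

	/* Widen to a signed type to inspect the value that was stored. */
	return (long long)ret < 0;
}

/* Fixed pattern: a signed variable preserves the negative error code. */
static bool failed_fixed(int pin)
{
	int ret = lookup_function(pin);

	return ret < 0;
}
```

With an invalid pin, `failed_buggy()` reports no failure while `failed_fixed()` does, which is the behavior difference the type change is meant to remove.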
10 daysMerge tag 'linux_kselftest-next-6.19-rc1' of ↵Linus Torvalds6-19/+176
git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest Pull kselftest updates from Shuah Khan: - Add basic test for trace_marker_raw file to tracing selftest - Fix invalid array access in printf dma_map_benchmark selftest - Add tprobe enable/disable testcase to tracing selftest - Update fprobe selftest for ftrace based fprobe * tag 'linux_kselftest-next-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest: selftests: tracing: Update fprobe selftest for ftrace based fprobe selftests: tracing: Add tprobe enable/disable testcase selftests/run_kselftest.sh: exit with error if tests fail selftests/dma: fix invalid array access in printf selftests/tracing: Add basic test for trace_marker_raw file
10 daysMerge tag 'kbuild-6.19-1' of ↵Linus Torvalds17-115/+207
git://git.kernel.org/pub/scm/linux/kernel/git/kbuild/linux Pull Kbuild updates from Nicolas Schier: - Enable -fms-extensions, allowing anonymous use of tagged struct or union in struct/union (tag kbuild-ms-extensions-6.19). An exemplary conversion patch is added here, too (btrfs). [ Editor's note: the core of this actually came in early through a shared branch and a few other trees - Linus ] - Introduce architecture-specific CC_CAN_LINK and flags for userprogs - Add new packaging target 'modules-cpio-pkg' for building a initramfs cpio w/ kmods - Handle included .c files in gen_compile_commands - Minor kbuild changes: - Use objtree for module signing key path, fixing oot kmod signing - Improve documentation of KBUILD_BUILD_TIMESTAMP - Reuse KBUILD_USERCFLAGS for UAPI, instead of defining twice - Rename scripts/Makefile.extrawarn to Makefile.warn - Drop obsolete types.h check from headers_check.pl - Remove outdated config leak ignore entries * tag 'kbuild-6.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/kbuild/linux: kbuild: add target to build a cpio containing modules initramfs: add gen_init_cpio to hostprogs unconditionally kbuild: allow architectures to override CC_CAN_LINK init: deduplicate cc-can-link.sh invocations kbuild: don't enable CC_CAN_LINK if the dummy program generates warnings scripts: headers_install.sh: Remove two outdated config leak ignore entries scripts/clang-tools: Handle included .c files in gen_compile_commands kbuild: uapi: Drop types.h check from headers_check.pl kbuild: Rename Makefile.extrawarn to Makefile.warn MAINTAINERS, .mailmap: Update mail address for Nicolas Schier kbuild: uapi: reuse KBUILD_USERCFLAGS kbuild: doc: improve KBUILD_BUILD_TIMESTAMP documentation kbuild: Use objtree for module signing key path btrfs: send: make use of -fms-extensions for defining struct fs_path
10 daysMerge tag 'rust-6.19' of ↵Linus Torvalds131-477/+59666
git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux Pull Rust updates from Miguel Ojeda: "Toolchain and infrastructure: - Add support for 'syn'. Syn is a parsing library for parsing a stream of Rust tokens into a syntax tree of Rust source code. Currently this library is geared toward use in Rust procedural macros, but contains some APIs that may be useful more generally. 'syn' allows us to greatly simplify writing complex macros such as 'pin-init' (Benno has already prepared the 'syn'-based version). We will use it in the 'macros' crate too. 'syn' is the most downloaded Rust crate (according to crates.io), and it is also used by the Rust compiler itself. While the amount of code is substantial, there should not be many updates needed for these crates, and even if there are, they should not be too big, e.g. +7k -3k lines across the 3 crates in the last year. 'syn' requires two smaller dependencies: 'quote' and 'proc-macro2'. I only modified their code to remove a third dependency ('unicode-ident') and to add the SPDX identifiers. The code can be easily verified to exactly match upstream with the provided scripts. They are all licensed under "Apache-2.0 OR MIT", like the other vendored 'alloc' crate we had for a while. Please see the merge commit with the cover letter for more context. - Allow 'unreachable_pub' and 'clippy::disallowed_names' for doctests. Examples (i.e. doctests) may want to do things like show public items and use names such as 'foo'. Nevertheless, we still try to keep examples as close to real code as possible (this is part of why running Clippy on doctests is important for us, e.g. for safety comments, which userspace Rust does not support yet but we are stricter). 'kernel' crate: - Replace our custom 'CStr' type with 'core::ffi::CStr'. Using the standard library type reduces our custom code footprint, and we retain needed custom functionality through an extension trait and a new 'fmt!' macro which replaces the previous 'core' import. 
This started in 6.17 and continued in 6.18, and we finally land the replacement now. This required quite some stamina from Tamir, who split the changes in steps to prepare for the flag day change here. - Replace 'kernel::c_str!' with C string literals. C string literals were added in Rust 1.77, which produce '&CStr's (the 'core' one), so now we can write: c"hi" instead of: c_str!("hi") - Add 'num' module for numerical features. It includes the 'Integer' trait, implemented for all primitive integer types. It also includes the 'Bounded' integer wrapping type: an integer value that requires only the 'N' least significant bits of the wrapped type to be encoded: // An unsigned 8-bit integer, of which only the 4 LSBs are used. let v = Bounded::<u8, 4>::new::<15>(); assert_eq!(v.get(), 15); 'Bounded' is useful to e.g. enforce guarantees when working with bitfields that have an arbitrary number of bits. Values can also be constructed from simple non-constant expressions or, for more complex ones, validated at runtime. 'Bounded' also comes with comparison and arithmetic operations (with both their backing type and other 'Bounded's with a compatible backing type), casts to change the backing type, extending/shrinking and infallible/fallible conversions from/to primitives as applicable. - 'rbtree' module: add immutable cursor ('Cursor'). It enables to use just an immutable tree reference where appropriate. The existing fully-featured mutable cursor is renamed to 'CursorMut'. kallsyms: - Fix wrong "big" kernel symbol type read from procfs. 'pin-init' crate: - A couple minor fixes (Benno asked me to pick these patches up for him this cycle). Documentation: - Quick Start guide: add Debian 13 (Trixie). Debian Stable is now able to build Linux, since Debian 13 (released 2025-08-09) packages Rust 1.85.0, which is recent enough. 
We are planning to propose that the minimum supported Rust version in Linux follows Debian Stable releases, with Debian 13 being the first one we upgrade to, i.e. Rust 1.85. MAINTAINERS: - Add entry for the new 'num' module. - Remove Alex as Rust maintainer: he hasn't had the time to contribute for a few years now, so it is a no-op change in practice. And a few other cleanups and improvements" * tag 'rust-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/ojeda/linux: (53 commits) rust: macros: support `proc-macro2`, `quote` and `syn` rust: syn: enable support in kbuild rust: syn: add `README.md` rust: syn: remove `unicode-ident` dependency rust: syn: add SPDX License Identifiers rust: syn: import crate rust: quote: enable support in kbuild rust: quote: add `README.md` rust: quote: add SPDX License Identifiers rust: quote: import crate rust: proc-macro2: enable support in kbuild rust: proc-macro2: add `README.md` rust: proc-macro2: remove `unicode_ident` dependency rust: proc-macro2: add SPDX License Identifiers rust: proc-macro2: import crate rust: kbuild: support using libraries in `rustc_procmacro` rust: kbuild: support skipping flags in `rustc_test_library` rust: kbuild: add proc macro library support rust: kbuild: simplify `--cfg` handling rust: kbuild: introduce `core-flags` and `core-skip_flags` ...
11 daysMerge tag 'livepatching-for-6.19' of ↵Linus Torvalds2-2/+12
git://git.kernel.org/pub/scm/linux/kernel/git/livepatching/livepatching Pull livepatching updates from Petr Mladek: - Support both paths where tracefs is typically mounted in selftests - Make old_sympos 0 and 1 equal. They both are valid when there is only one symbol with the given name. * tag 'livepatching-for-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/livepatching/livepatching: selftests: livepatch: use canonical ftrace path livepatch: Match old_sympos 0 and 1 in klp_find_func()
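The "old_sympos 0 and 1 are equal" rule can be sketched as a small comparison helper. The struct and function names below (`struct func_ref`, `normalize_sympos()`, `same_function()`) are hypothetical and only illustrate the idea; the actual change in klp_find_func() may be written differently.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical reference to a patched function: name plus symbol position. */
struct func_ref {
	const char *old_name;
	unsigned long old_sympos;
};

/*
 * Position 0 means "the symbol name is unique", which identifies the same
 * symbol as position 1, so fold 0 into 1 before comparing.
 */
static unsigned long normalize_sympos(unsigned long pos)
{
	return pos ? pos : 1;
}

/* Two references match if the names and normalized positions agree. */
static bool same_function(const struct func_ref *a, const struct func_ref *b)
{
	return strcmp(a->old_name, b->old_name) == 0 &&
	       normalize_sympos(a->old_sympos) == normalize_sympos(b->old_sympos);
}
```

Normalizing before comparing keeps the check symmetric: (0, 1) and (1, 0) both match, while (1, 2) still does not.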
11 daysMerge tag 'sched_ext-for-6.19' of ↵Linus Torvalds20-411/+1893
git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext Pull sched_ext updates from Tejun Heo: - Improve recovery from misbehaving BPF schedulers. When a scheduler puts many tasks with varying affinity restrictions on a shared DSQ, CPUs scanning through tasks they cannot run can overwhelm the system, causing lockups. Bypass mode now uses per-CPU DSQs with a load balancer to avoid this, and hooks into the hardlockup detector to attempt recovery. Add scx_cpu0 example scheduler to demonstrate this scenario. - Add lockless peek operation for DSQs to reduce lock contention for schedulers that need to query queue state during load balancing. - Allow scx_bpf_reenqueue_local() to be called from anywhere in preparation for deprecating cpu_acquire/release() callbacks in favor of generic BPF hooks. - Prepare for hierarchical scheduler support: add scx_bpf_task_set_slice() and scx_bpf_task_set_dsq_vtime() kfuncs, make scx_bpf_dsq_insert*() return bool, and wrap kfunc args in structs for future aux__prog parameter. - Implement cgroup_set_idle() callback to notify BPF schedulers when a cgroup's idle state changes. - Fix migration tasks being incorrectly downgraded from stop_sched_class to rt_sched_class across sched_ext enable/disable. Applied late as the fix is low risk and the bug subtle but needs stable backporting. - Various fixes and cleanups including cgroup exit ordering, SCX_KICK_WAIT reliability, and backward compatibility improvements. 
* tag 'sched_ext-for-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/sched_ext: (44 commits) sched_ext: Fix incorrect sched_class settings for per-cpu migration tasks sched_ext: tools: Removing duplicate targets during non-cross compilation sched_ext: Use kvfree_rcu() to release per-cpu ksyncs object sched_ext: Pass locked CPU parameter to scx_hardlockup() and add docs sched_ext: Update comments replacing breather with aborting mechanism sched_ext: Implement load balancer for bypass mode sched_ext: Factor out abbreviated dispatch dequeue into dispatch_dequeue_locked() sched_ext: Factor out scx_dsq_list_node cursor initialization into INIT_DSQ_LIST_CURSOR sched_ext: Add scx_cpu0 example scheduler sched_ext: Hook up hardlockup detector sched_ext: Make handle_lockup() propagate scx_verror() result sched_ext: Refactor lockup handlers into handle_lockup() sched_ext: Make scx_exit() and scx_vexit() return bool sched_ext: Exit dispatch and move operations immediately when aborting sched_ext: Simplify breather mechanism with scx_aborting flag sched_ext: Use per-CPU DSQs instead of per-node global DSQs in bypass mode sched_ext: Refactor do_enqueue_task() local and global DSQ paths sched_ext: Use shorter slice in bypass mode sched_ext: Mark racy bitfields to prevent adding fields that can't tolerate races sched_ext: Minor cleanups to scx_task_iter ...
11 daysMerge tag 'cgroup-for-6.19' of ↵Linus Torvalds20-206/+436
git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup updates from Tejun Heo: - Defer task cgroup unlink until after the dying task's final context switch so that controllers see the cgroup properly populated until the task is truly gone - cpuset cleanups and simplifications. Enforce that domain isolated CPUs stay in root or isolated partitions and fail if isolated+nohz_full would leave no housekeeping CPU. Fix sched/deadline root domain handling during CPU hot-unplug and race for tasks in attaching cpusets - Misc fixes including memory reclaim protection documentation and selftest KTAP conformance * tag 'cgroup-for-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (21 commits) cpuset: Treat cpusets in attaching as populated sched/deadline: Walk up cpuset hierarchy to decide root domain when hot-unplug cgroup/cpuset: Introduce cpuset_cpus_allowed_locked() docs: cgroup: No special handling of unpopulated memcgs docs: cgroup: Note about sibling relative reclaim protection docs: cgroup: Explain reclaim protection target selftests/cgroup: conform test to KTAP format output cpuset: remove need_rebuild_sched_domains cpuset: remove global remote_children list cpuset: simplify node setting on error cgroup: include missing header for struct irq_work cgroup: Fix sleeping from invalid context warning on PREEMPT_RT cgroup/cpuset: Globally track isolated_cpus update cgroup/cpuset: Ensure domain isolated CPUs stay in root or isolated partition cgroup/cpuset: Move up prstate_housekeeping_conflict() helper cgroup/cpuset: Fail if isolated and nohz_full don't leave any housekeeping cgroup/cpuset: Rename update_unbound_workqueue_cpumask() to update_isolation_cpumasks() cgroup: Defer task cgroup unlink until after the task is done switching out cgroup: Move dying_tasks cleanup from cgroup_task_release() to cgroup_task_free() cgroup: Rename cgroup lifecycle hooks to cgroup_task_*() ...
11 daysKEYS: trusted: Use tpm_ret_to_err() in trusted_tpm2Jarkko Sakkinen1-19/+7
Use tpm_ret_to_err() to transmute TPM return codes in trusted_tpm2. Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@opinsys.com> Acked-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
11 daystpm: Use -EPERM as fallback error code in tpm_ret_to_errJarkko Sakkinen1-1/+3
Using -EFAULT as the tpm_ret_to_err() fallback error code makes it incompatible with how trusted keys transmute TPM return codes. Change the fallback to -EPERM in order to gain compatibility with trusted keys. In addition, map TPM_RC_HASH to -EINVAL in order to be compatible with tpm2_seal_trusted() return values. Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@opinsys.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
11 daystpm: Cap the number of PCR banksJarkko Sakkinen4-14/+8
tpm2_get_pcr_allocation() does not enforce an upper limit on the number of banks. Cap the limit to eight banks so that out-of-bounds values coming from external I/O cause only limited harm. Cc: stable@vger.kernel.org # v5.10+ Fixes: bcfff8384f6c ("tpm: dynamically allocate the allocated_banks array") Tested-by: Lai Yi <yi1.lai@linux.intel.com> Reviewed-by: Jonathan McDowell <noodles@meta.com> Reviewed-by: Roberto Sassu <roberto.sassu@huawei.com> Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@opinsys.com>
11 daystpm: Remove tpm_find_get_opsJonathan McDowell4-43/+17
tpm_find_get_ops() looks for the first valid TPM if the caller passes in NULL. All internal users have been converted to either associate themselves with a TPM directly, or call tpm_default_chip() as part of their setup. Remove the no longer necessary tpm_find_get_ops(). Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jonathan McDowell <noodles@meta.com> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
11 daystpm: add WQ_PERCPU to alloc_workqueue usersMarco Crivellari1-1/+2
Currently, if a user enqueues a work item using schedule_delayed_work(), the wq used is "system_wq" (per-cpu wq), while queue_delayed_work() uses WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to schedule_work(), which uses system_wq, and queue_work(), which again makes use of WORK_CPU_UNBOUND. This lack of consistency cannot be addressed without refactoring the API. alloc_workqueue() treats all queues as per-CPU by default, while unbound workqueues must opt-in via WQ_UNBOUND. This default is suboptimal: most workloads benefit from unbound queues, allowing the scheduler to place worker threads where they’re needed and reducing noise when CPUs are isolated. This continues the effort to refactor workqueue APIs, which began with the introduction of new workqueues and a new alloc_workqueue flag in: commit 128ea9f6ccfb ("workqueue: Add system_percpu_wq and system_dfl_wq") commit 930c2ea566af ("workqueue: Add new WQ_PERCPU flag") This change adds a new WQ_PERCPU flag to explicitly request alloc_workqueue() to be per-cpu when WQ_UNBOUND has not been specified. With the introduction of the WQ_PERCPU flag (equivalent to !WQ_UNBOUND), any alloc_workqueue() caller that doesn’t explicitly specify WQ_UNBOUND must now use WQ_PERCPU. Once migration is complete, WQ_UNBOUND can be removed and unbound will become the implicit default. Suggested-by: Tejun Heo <tj@kernel.org> Signed-off-by: Marco Crivellari <marco.crivellari@suse.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
11 daystpm_crb: add missing loc parameter to kerneldocStuart Yoder1-0/+2
Update the kerneldoc parameter definitions for __crb_go_idle and __crb_cmd_ready to include the loc parameter. Signed-off-by: Stuart Yoder <stuart.yoder@arm.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
11 daystpm_crb: Fix a spelling mistakeChu Guangqing1-1/+1
The spelling of the word "requrest" is incorrect; it should be "request". Signed-off-by: Chu Guangqing <chuguangqing@inspur.com> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
11 daysselftests: tpm2: Fix ill defined assertionsMaurice Hieronymus1-2/+2
Remove parentheses around assert statements in Python. With parentheses, assert always evaluates to True, making the checks ineffective. Signed-off-by: Maurice Hieronymus <mhi@mailbox.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
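The pitfall described above can be reproduced in a few lines. This is a minimal sketch, not the selftest code itself: it assumes the original assertions had the form `assert (condition, message)`, and the helper names `parenthesized_check` and `plain_check` are hypothetical. The operand of such an assert is a two-element tuple, and a non-empty tuple is always truthy.

```python
def parenthesized_check(value, expected):
    # Mirrors `assert (cond, msg)`: the operand is a two-element tuple.
    # A non-empty tuple is truthy, so this assert can never fail.
    operand = (value == expected, "values differ")
    assert operand
    return True


def plain_check(value, expected):
    # Correct form: the condition and the message are separated by a comma
    # and are not wrapped in parentheses.
    assert value == expected, "values differ"
    return True


if __name__ == "__main__":
    print(parenthesized_check(1, 2))  # the mismatch goes unnoticed
    try:
        plain_check(1, 2)
    except AssertionError as exc:
        print("caught:", exc)
```

The tuple is built in a separate variable here only to avoid the SyntaxWarning that recent Python versions emit for a literal `assert (cond, msg)`.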
11 daysMerge tag 'wq-for-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wqLinus Torvalds1-36/+50
Pull workqueue updates from Tejun Heo: - Rescuer affinity management: Affinity is now updated only when detached using wq_unbound_cpumask consistently. DISASSOCIATED workers also follow unbound cpumask changes to avoid breaking CPU isolation - Rescuer cleanups preparing for fetching work items one by one from pool list: Work assignment factored out, optimized to skip pwqs no longer needing rescue, and shutdown logic simplified - Unused assert_rcu_or_wq_mutex_or_pool_mutex() removed * tag 'wq-for-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq: workqueue: Don't rely on wq->rescuer to stop rescuer workqueue: Only assign rescuer work when really needed workqueue: Factor out assign_rescuer_work() workqueue: Init rescuer's affinities as wq_unbound_cpumask workqueue: Let DISASSOCIATED workers follow unbound wq cpumask changes workqueue: Update the rescuer's affinity only when it is detached workqueue: Remove unused assert_rcu_or_wq_mutex_or_pool_mutex
11 daysMerge tag 'printk-for-6.19' of ↵Linus Torvalds39-387/+678
git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux Pull printk updates from Petr Mladek: - Allow creating nbcon console drivers with an unsafe write_atomic() callback that can only be called by the final nbcon_atomic_flush_unsafe(). Otherwise, the driver would rely on the kthread. It is going to be used as a best-effort approach for an experimental nbcon netconsole driver, see https://lore.kernel.org/r/20251121-nbcon-v1-2-503d17b2b4af@debian.org Note that a safe .write_atomic() callback is supposed to work in NMI context. But some networking drivers are not safe even in IRQ context: https://lore.kernel.org/r/oc46gdpmmlly5o44obvmoatfqo5bhpgv7pabpvb6sjuqioymcg@gjsma3ghoz35 In an ideal world, all networking drivers would be fixed first and the atomic flush would be blocked only in NMI context. But that raises the question of how reliable networking drivers are when the system is in a bad state. They might block flushing more reliable serial consoles which are more suitable for serious debugging anyway. - Allow using the last 4 bytes of the printk ring buffer. - Prevent queuing IRQ work and block printk kthreads when consoles are suspended. Otherwise, they create unnecessary churn or even block the suspend. - Release console_lock() between each record in the kthread used for legacy consoles on RT. It might significantly speed up the boot. - Release nbcon context between each record in the atomic flush. 
It prevents stalls of the related printk kthread after it has lost the ownership in the middle of a record - Add support for NBCON consoles into KDB - Add %ptSp modifier for printing struct timespec64 and use it where possible - Misc code clean up * tag 'printk-for-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/printk/linux: (48 commits) printk: Use console_is_usable on console_unblank arch: um: kmsg_dump: Use console_is_usable drivers: serial: kgdboc: Drop checks for CON_ENABLED and CON_BOOT lib/vsprintf: Unify FORMAT_STATE_NUM handlers printk: Avoid irq_work for printk_deferred() on suspend printk: Avoid scheduling irq_work on suspend printk: Allow printk_trigger_flush() to flush all types tracing: Switch to use %ptSp scsi: snic: Switch to use %ptSp scsi: fnic: Switch to use %ptSp s390/dasd: Switch to use %ptSp ptp: ocp: Switch to use %ptSp pps: Switch to use %ptSp PCI: epf-test: Switch to use %ptSp net: dsa: sja1105: Switch to use %ptSp mmc: mmc_test: Switch to use %ptSp media: av7110: Switch to use %ptSp ipmi: Switch to use %ptSp igb: Switch to use %ptSp e1000e: Switch to use %ptSp ...
11 daysMerge tag 'lkmm.2025.12.01a' of ↵Linus Torvalds1-3/+3
git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu Pull lkmm documentation update from Paul McKenney: - Sort the memory-barriers.txt file's wait_event* and wait_on_bit* list alphabetically * tag 'lkmm.2025.12.01a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: memory-barriers.txt: Sort wait_event* and wait_on_bit* list alphabetically
11 daysMerge branch 'pci/misc'Bjorn Helgaas2-2/+2
- Use max() instead of max_t() to ease static analysis (David Laight) - Add Manivannan Sadhasivam as PCI/pwrctrl maintainer (Bartosz Golaszewski) * pci/misc: MAINTAINERS: Add Manivannan Sadhasivam as PCI/pwrctrl maintainer PCI: Use max() instead of max_t() to ease static analysis
11 daysMerge branch 'pci/pwrctrl-tc9563'Bjorn Helgaas8-0/+876
- Add a struct pci_ops.assert_perst() function pointer to assert/deassert PCIe PERST# and implement it for the qcom driver (Krishna Chaitanya Chundru) - Add DT binding and pwrctrl driver for the Toshiba TC9563 PCIe switch, which must be held in reset after poweron so the pwrctrl driver can configure the switch via I2C before bringing up the links (Krishna Chaitanya Chundru) * pci/pwrctrl-tc9563: PCI: pwrctrl: Add power control driver for TC9563 PCI: qcom: Implement .assert_perst() PCI: dwc: Implement .assert_perst() for dwc glue drivers PCI: Add .assert_perst() to control PCIe PERST# dt-bindings: PCI: Add binding for Toshiba TC9563 PCIe switch
11 daysMerge branch 'pci/controller/stm32'Bjorn Helgaas3-33/+27
- Fix a race between link training and endpoint register initialization (Christian Bruel) - Align endpoint allocations to match the ATU requirements (Christian Bruel) - Add #includes to avoid depending on 'proxy' headers (Andy Shevchenko) * pci/controller/stm32: PCI: stm32: Don't use 'proxy' headers PCI: stm32: Fix EP page_size alignment PCI: stm32: Fix LTSSM EP race with start link
11 daysMerge branch 'pci/controller/spacemit-k1'Bjorn Helgaas4-0/+528
- Add DT binding and driver for SpacemiT K1 (Alex Elder) * pci/controller/spacemit-k1: PCI: spacemit: Add SpacemiT PCIe host driver dt-bindings: pci: spacemit: Introduce PCIe host controller
11 daysMerge branch 'pci/controller/sky1'Bjorn Helgaas15-516/+1844
- Add module support for platform controller driver (Manikandan K Pillai) - Split headers into 'legacy' (LGA) and 'high perf' (HPA) (Manikandan K Pillai) - Add DT binding and driver for CIX Sky1 (Hans Zhang) * pci/controller/sky1: MAINTAINERS: Add CIX Sky1 PCIe controller driver maintainer PCI: sky1: Add PCIe host support for CIX Sky1 dt-bindings: PCI: Add CIX Sky1 PCIe Root Complex bindings PCI: cadence: Add support for High Perf Architecture (HPA) controller PCI: cadence: Move PCIe RP common functions to a separate file PCI: cadence: Split PCIe controller header file PCI: cadence: Add module support for platform controller driver
11 daysMerge branch 'pci/controller/sg2042'Bjorn Helgaas1-3/+0
- Fix sg2042_pcie_remove() reference count issue (Christophe JAILLET) * pci/controller/sg2042: PCI: sg2042: Fix a reference count issue in sg2042_pcie_remove()
11 daysMerge branch 'pci/controller/s32g'Bjorn Helgaas6-0/+564
- Add NXP S32G host controller DT binding and driver (Vincent Guittot) * pci/controller/s32g: MAINTAINERS: Add NXP S32G PCIe controller driver maintainer PCI: s32g: Add NXP S32G PCIe controller driver (RC) PCI: dwc: Add register and bitfield definitions dt-bindings: PCI: s32g: Add NXP S32G PCIe controller
11 daysMerge branch 'pci/controller/rzg3s-host'Bjorn Helgaas5-0/+2028
- Add Renesas RZ/G3S host controller DT binding and driver (Claudiu Beznea) * pci/controller/rzg3s-host: PCI: Add Renesas RZ/G3S host controller driver dt-bindings: PCI: Add Renesas RZ/G3S PCIe controller binding
11 daysMerge branch 'pci/controller/rcar-gen2'Bjorn Helgaas1-4/+3
- Drop ARM dependency so we can build test on other arches (Geert Uytterhoeven) * pci/controller/rcar-gen2: PCI: rcar-gen2: Drop ARM dependency from PCI_RCAR_GEN2
11 daysMerge branch 'pci/controller/qcom'Bjorn Helgaas1-2/+15
- Look up OPP using both frequency and data rate (not just frequency) so RPMh votes can account for both (Krishna Chaitanya Chundru) * pci/controller/qcom: PCI: qcom: Use frequency and level based OPP lookup
11 daysMerge branch 'pci/controller/meson'Bjorn Helgaas3-11/+25
- Update DT binding to name DBI region "dbi", not "elbi", and update driver to support both (Manivannan Sadhasivam) * pci/controller/meson: PCI: meson: Fix parsing the DBI register region dt-bindings: PCI: amlogic: Fix the register name of the DBI region
11 daysMerge branch 'pci/controller/mediatek'Bjorn Helgaas4-321/+683
- Convert DT binding to YAML schema (Christian Marangi) - Add Airoha AN7583 DT compatible and driver support (Christian Marangi) * pci/controller/mediatek: PCI: mediatek: Add support for Airoha AN7583 SoC PCI: mediatek: Use generic MACRO for TPVPERL delay PCI: mediatek: Convert bool to single quirks entry and bitmap dt-bindings: PCI: mediatek: Add support for Airoha AN7583 dt-bindings: PCI: mediatek: Convert to YAML schema
11 daysMerge branch 'pci/controller/keystone'Bjorn Helgaas7-37/+65
- Fail the probe instead of silently succeeding if ks_pcie_of_data didn't specify Root Complex or Endpoint mode (Siddharth Vadapalli) - Make keystone buildable as a loadable module, except on ARM32 where hook_fault_code() is __init (Siddharth Vadapalli) * pci/controller/keystone: PCI: keystone: Add support to build as a loadable module PCI: dwc: Export dw_pcie_allocate_domains() and dw_pcie_ep_raise_msix_irq() PCI: Export pci_get_host_bridge_device() for use by pci-keystone PCI: keystone: Exit ks_pcie_probe() for invalid mode
11 daysMerge branch 'pci/controller/j721e'Bjorn Helgaas1-22/+11
- Use devm_clk_get_optional_enabled() instead of open-coding devm_clk_get_optional() and clk_prepare_enable() (Anand Moon) * pci/controller/j721e: PCI: j721e: Use 'pcie->reset_gpio' directly and drop the local variable PCI: j721e: Use devm_clk_get_optional_enabled() to get and enable the clock
11 daysMerge branch 'pci/controller/ixp4xx'Bjorn Helgaas2-1/+7
- Guard ARM32-specific hook_fault_code() with ifdefs so we can build test on other arches (Bjorn Helgaas) * pci/controller/ixp4xx: PCI: ixp4xx: Guard ARM32-specific hook_fault_code()
11 daysMerge branch 'pci/controller/dw-rockchip'Bjorn Helgaas1-17/+6
- Use devm_regulator_get_enable_optional() to simplify probing (Anand Moon) * pci/controller/dw-rockchip: PCI: dw-rockchip: Simplify regulator setup with devm_regulator_get_enable_optional()
11 daysMerge branch 'pci/controller/dwc'Bjorn Helgaas6-41/+79
- Update PORT_LOGIC_LTSSM_STATE_MASK to be a 6-bit mask as per spec, not a 5-bit mask (Shawn Lin) - Clear L1 PM Substate Capability 'Supported' bits unless glue driver says it's supported, which prevents users from enabling non-working L1SS. Currently only qcom and tegra194 support L1SS (Bjorn Helgaas) - Remove now-superfluous L1SS disable code from tegra194 (Bjorn Helgaas) - Configure L1SS support in dw-rockchip when DT says 'supports-clkreq' (Shawn Lin) * pci/controller/dwc: PCI: dw-rockchip: Configure L1SS support PCI: tegra194: Remove unnecessary L1SS disable code PCI: dwc: Advertise L1 PM Substates only if driver requests it PCI: dwc: Fix wrong PORT_LOGIC_LTSSM_STATE_MASK definition
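The effect of the mask width fix mentioned above can be illustrated with a small sketch. It assumes the LTSSM state occupies the low bits of the register (bits 5:0 after the fix, bits 4:0 before); `genmask()` is a hypothetical Python stand-in for the kernel's GENMASK() macro, and the register value is made up for the demonstration.

```python
def genmask(high, low):
    """Return an integer with bits low..high (inclusive) set."""
    return ((1 << (high - low + 1)) - 1) << low


# Assumed layouts: the old definition covered 5 bits, the fixed one covers 6.
MASK_5BIT = genmask(4, 0)  # 0x1f
MASK_6BIT = genmask(5, 0)  # 0x3f


def ltssm_state(reg, mask):
    # Extract the state field from a raw register value.
    return reg & mask


if __name__ == "__main__":
    raw = 0x23  # hypothetical register value whose state has bit 5 set
    print(hex(ltssm_state(raw, MASK_5BIT)))  # bit 5 is dropped
    print(hex(ltssm_state(raw, MASK_6BIT)))  # full state is preserved
```

With the narrower mask, any state value of 0x20 or above aliases onto a lower-numbered state, which is the kind of misreading a too-short mask would cause.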
11 daysMerge branch 'pci/controller/brcmstb'Bjorn Helgaas1-13/+196
- Disable advertising ASPM L0s support correctly (Jim Quinlan) - Add a panic/die handler to print diagnostic info in case PCIe caused an unrecoverable abort (Jim Quinlan) * pci/controller/brcmstb: PCI: brcmstb: Add panic/die handler to driver PCI: brcmstb: Add a way to indicate if PCIe bridge is active PCI: brcmstb: Fix disabling L0s capability
11 daysMerge branch 'pci/controller/host-common'Bjorn Helgaas3-43/+14
- Move struct pci_host_bridge allocation from pci_host_common_init() to callers, which significantly simplifies pcie-apple (Marc Zyngier) * pci/controller/host-common: PCI: host-generic: Move bridge allocation outside of pci_host_common_init()
11 daysMerge branch 'pci/endpoint'Bjorn Helgaas4-54/+275
- Convert the endpoint doorbell test to use a threaded IRQ to fix a 'sleeping while atomic' issue (Bhanu Seshu Kumar Valluri) - Add endpoint VNTB MSI doorbell support to reduce latency between host and endpoint (Frank Li) * pci/endpoint: PCI: endpoint: pci-epf-vntb: Add MSI doorbell support PCI: endpoint: Add pci_epf_assign_bar_space() API PCI: endpoint: Add pci_epf_get_required_bar_size() helper PCI: endpoint: Rename 'epf_bar::aligned_size' to 'epf_bar:mem_size' PCI: endpoint: pci-epf-test: Fix sleeping function being called from atomic context
11 daysMerge branch 'pci/dt-binding'Bjorn Helgaas18-26/+72
- Add Rockchip RK3528 compatible strings in DT binding (Yao Zi) - Add Qualcomm Kaanapali to SM8550 DT binding (Qiang Yu) - Add 'contains' to the 'select' schema to enable the amlogic,axg-pcie binding (Rob Herring) - Update Manivannan Sadhasivam's email address in bindings (Manivannan Sadhasivam) - Add required 'power-domains' and 'resets' to qcom sa8775p, sc7280, sc8280xp, sm8150, sm8250, sm8350, sm8450, sm8550, x1e80100 DT schemas (Krzysztof Kozlowski) * pci/dt-binding: dt-bindings: PCI: qcom,pcie-x1e80100: Add missing required power-domains and resets dt-bindings: PCI: qcom,pcie-sm8550: Add missing required power-domains and resets dt-bindings: PCI: qcom,pcie-sm8450: Add missing required power-domains and resets dt-bindings: PCI: qcom,pcie-sm8350: Add missing required power-domains and resets dt-bindings: PCI: qcom,pcie-sm8250: Add missing required power-domains and resets dt-bindings: PCI: qcom,pcie-sm8150: Add missing required power-domains and resets dt-bindings: PCI: qcom,pcie-sc8280xp: Add missing required power-domains and resets dt-bindings: PCI: qcom,pcie-sc7280: Add missing required power-domains and resets dt-bindings: PCI: qcom,pcie-sa8775p: Add missing required power-domains and resets dt-bindings: PCI: Update the email address for Manivannan Sadhasivam dt-bindings: PCI: amlogic,axg-pcie: Fix select schema dt-bindings: PCI: qcom,pcie-sm8550: Add Kaanapali compatible dt-bindings: PCI: dwc: rockchip: Add RK3528 variant
11 daysMerge branch 'pci/resource'Bjorn Helgaas14-363/+475
- Prevent resource tree corruption when BAR resize fails (Ilpo Järvinen) - Restore BARs to the original size if a BAR resize fails (Ilpo Järvinen) - Remove BAR release from BAR resize attempts by the xe, i915, and amdgpu drivers so the PCI core can restore BARs if the resize fails (Ilpo Järvinen) - Move Resizable BAR code to rebar.c (Ilpo Järvinen) - Add pci_rebar_size_supported() and use it in i915 and xe (Ilpo Järvinen) - Add pci_rebar_get_max_size() and use it in xe and amdgpu (Ilpo Järvinen) * pci/resource: PCI: Validate pci_rebar_size_supported() input PCI: Convert BAR sizes bitmasks to u64 drm/amdgpu: Use pci_rebar_get_max_size() drm/xe/vram: Use pci_rebar_get_max_size() PCI: Add pci_rebar_get_max_size() drm/xe/vram: Use PCI rebar helpers in resize_vram_bar() drm/i915/gt: Use pci_rebar_size_supported() PCI: Add pci_rebar_size_supported() helper PCI: Improve Resizable BAR functions kernel doc PCI: Move pci_rebar_size_to_bytes() and export it PCI: Move pci_rebar_bytes_to_size() and clean it up PCI: Move Resizable BAR code to rebar.c PCI: Prevent restoring assigned resources drm/amdgpu: Remove driver side BAR release before resize drm/i915: Remove driver side BAR release before resize drm/xe: Remove driver side BAR release before resize PCI: Add kerneldoc for pci_resize_resource() PCI: Fix restoring BARs on BAR resize rollback path PCI: Free saved list without holding pci_bus_sem PCI: Try BAR resize even when no window was released PCI: Change pci_dev variable from 'bridge' to 'dev' PCI/IOV: Adjust ->barsz[] when changing BAR size PCI: Prevent resource tree corruption when BAR resize fails
11 daysMerge branch 'pci/ptm'Bjorn Helgaas2-0/+25
- Enable PTM only if device advertises support for a relevant role, to prevent invalid PTM Requests that cause ACS violations that are reported as AER Uncorrectable Non-Fatal errors (Mika Westerberg) * pci/ptm: PCI/PTM: Enable only if device advertises relevant role
11 daysMerge branch 'pci/err'Bjorn Helgaas35-64/+22
- For drivers using PCI legacy suspend, save config state at suspend so that state (not any earlier state from enumeration, probe, or error recovery) will be restored when resuming (Lukas Wunner) - For devices with no driver or a driver that lacks PM, save config state at hibernate so that state (not any earlier state from enumeration, probe, or error recovery) will be restored when resuming (Lukas Wunner) - Save device config space on device addition, before driver binding, so error recovery works more reliably (Lukas Wunner) - Drop pci_save_state() from several drivers that no longer need it since the PCI core always does it and pci_restore_state() no longer invalidates the saved state (Lukas Wunner) - Document use of pci_save_state() by drivers to capture the state they want restored during error recovery (Lukas Wunner) * pci/err: Documentation: PCI: Amend error recovery doc with pci_save_state() rules treewide: Drop pci_save_state() after pci_restore_state() PCI/ERR: Ensure error recoverability at all times PCI/PM: Stop needlessly clearing state_saved on enumeration and thaw PCI/PM: Reinstate clearing state_saved in legacy and !PM codepaths
11 daysMerge branch 'pci/enumeration'Bjorn Helgaas5-78/+63
- Enable host bridge emulation for PCI_DOMAINS_GENERIC platforms (Dan Williams) - Switch vmd from custom domain number allocator to the common allocator (Dan Williams) * pci/enumeration: PCI: vmd: Switch to pci_bus_find_emul_domain_nr() PCI: Enable host bridge emulation for PCI_DOMAINS_GENERIC platforms
11 daysMerge tag 'rcu.release.v6.19' of ↵Linus Torvalds21-142/+1041
git://git.kernel.org/pub/scm/linux/kernel/git/rcu/linux Pull RCU updates from Frederic Weisbecker: "SRCU: - Properly handle SRCU readers within IRQ disabled sections in tiny SRCU - Preparation to reimplement RCU Tasks Trace on top of SRCU fast: - Introduce API to expedite a grace period and test it through rcutorture - Split srcu-fast in two flavours: SRCU-fast and SRCU-fast-updown. Both are still targeted toward faster readers (without full barriers on LOCK and UNLOCK) at the expense of heavier write side (using full RCU grace period ordering instead of simply full ordering) as compared to "traditional" non-fast SRCU. But those srcu-fast flavours are going to be optimized in two different ways: - SRCU-fast will become the reimplementation basis for RCU-TASK-TRACE for consolidation. Since RCU-TASK-TRACE must be NMI safe, SRCU-fast must be as well. - SRCU-fast-updown will be needed for uretprobes code in order to get rid of the read-side memory barriers while still allowing entering the reader at task level while exiting it in a timer handler. It is considered semaphore-like in that it can have different owners between LOCK and UNLOCK. However it is not NMI-safe. The actual optimizations are work in progress for the next cycle. Only the new interfaces are added for now, along with related torture and scalability test code. - Create/document/debug/torture new proper initializers for RCU fast: DEFINE_SRCU_FAST() and init_srcu_struct_fast() This allows for using right away the proper ordering on the write side (either full ordering or full RCU grace period ordering) without waiting for the read side to tell which to use. This also optimizes the read side altogether with moving flavour debug checks under debug config and with removing a costly RmW operation on their first call. - Make some diagnostic functions tracing safe Refscale: - Add performance testing for common context synchronizations (Preemption, IRQ, Softirq) and per-cpu increments. 
Those are relevant comparisons against SRCU-fast read side APIs, especially as they are planned to synchronize further tracing fast-path code Miscellaneous: - In order to prepare the layout for nohz_full work deferral to user exit, the context tracking state must shrink the counter of transitions to/from RCU not watching. The only possible hazard is to trigger wrap-around more easily, delaying a bit grace periods when that happens. This should be a rare event though. Yet add debugging and torture code to test that assumption - Fix memory leak on locktorture module - Annotate accesses in rculist_nulls.h to prevent KCSAN warnings. In recent discussions, we also concluded that all those WRITE_ONCE() and READ_ONCE() on list APIs deserve appropriate comments. Something to be expected for the next cycle - Provide a script to apply several configs to several commits with torture - Allow torture to reuse a build directory in order to save needless rebuild time - Various cleanups" * tag 'rcu.release.v6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/rcu/linux: (29 commits) refscale: Add SRCU-fast-updown readers refscale: Exercise DEFINE_STATIC_SRCU_FAST() and init_srcu_struct_fast() rcutorture: Make srcu{,d}_torture_init() announce the SRCU type srcu: Create an SRCU-fast-updown API refscale: Do not disable interrupts for tests involving local_bh_enable() refscale: Add non-atomic per-CPU increment readers refscale: Add this_cpu_inc() readers refscale: Add preempt_disable() readers refscale: Add local_bh_disable() readers refscale: Add local_irq_disable() and local_irq_save() readers torture: Permit negative kvm.sh --kconfig numberic arguments srcu: Add SRCU_READ_FLAVOR_FAST_UPDOWN CPP macro rcu: Mark diagnostic functions as notrace rcutorture: Make TREE04 use CONFIG_RCU_DYNTICKS_TORTURE rcutorture: Remove redundant rcutorture_one_extend() from rcu_torture_one_read() rcutorture: Permit kvm-again.sh to re-use the build directory torture: Add kvm-series.sh to test 
commit/scenario combination rcu: use WRITE_ONCE() for ->next and ->pprev of hlist_nulls locktorture: Fix memory leak in param_set_cpumask() doc: Update for SRCU-fast definitions and initialization ...
11 daysMerge tag 'slab-for-6.19' of ↵Linus Torvalds13-679/+758
git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab Pull slab updates from Vlastimil Babka: - mempool_alloc_bulk() support for upcoming users in the block layer that need to allocate multiple objects at once with the mempool's guaranteed progress semantics, which is not achievable by allocating single objects in a loop. Along with refactoring and various improvements (Christoph Hellwig) - Preparations for the upcoming separation of struct slab from struct page, mostly by removing the struct folio layer, as the purpose of struct folio has shifted since it became used in slab code (Matthew Wilcox) - Modernisation of slab's boot param API usage, which removes some unexpected parsing corner cases (Petr Tesarik) - Refactoring of freelist_aba_t (now struct freelist_counters) and associated functions for double cmpxchg, enabled by -fms-extensions (Vlastimil Babka) - Cleanups and improvements related to sheaves caching layer, that were part of the full conversion to sheaves, which is planned for the next release (Vlastimil Babka) * tag 'slab-for-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab: (42 commits) slab: Remove unnecessary call to compound_head() in alloc_from_pcs() mempool: clarify behavior of mempool_alloc_preallocated() mempool: drop the file name in the top of file comment mempool: de-typedef mempool: remove mempool_{init,create}_kvmalloc_pool mempool: legitimize the io_schedule_timeout in mempool_alloc_from_pool mempool: add mempool_{alloc,free}_bulk mempool: factor out a mempool_alloc_from_pool helper slab: Remove references to folios from virt_to_slab() kasan: Remove references to folio in __kasan_mempool_poison_object() memcg: Convert mem_cgroup_from_obj_folio() to mem_cgroup_from_obj_slab() mempool: factor out a mempool_adjust_gfp helper mempool: add error injection support mempool: improve kerneldoc comments mm: improve kerneldoc comments for __alloc_pages_bulk fault-inject: make enum fault_flags available unconditionally 
usercopy: Remove folio references from check_heap_object() slab: Remove folio references from kfree_nolock() slab: Remove folio references from kfree_rcu_sheaf() slab: Remove folio references from build_detached_freelist() ...
11 daysMerge tag 'docs-6.19' of git://git.lwn.net/linuxLinus Torvalds287-5290/+9639
Pull documentation updates from Jonathan Corbet: "This has been another busy cycle for documentation, with a lot of build-system thrashing. That work should slow down from here on out. - The various scripts and tools for documentation were spread out in several directories; now they are (almost) all coalesced under tools/docs/. The holdout is the kernel-doc script, which cannot be easily moved without some further thought. - As the amount of Python code increases, we are accumulating modules that are imported by multiple programs. These modules have been pulled together under tools/lib/python/ -- at least, for documentation-related programs. There is other Python code in the tree that might eventually want to move toward this organization. - The Perl kernel-doc.pl script has been removed. It is no longer used by default, and nobody has missed it, least of all anybody who actually had to look at it. - The docs build was controlled by a complex mess of makefiles that few dared to touch. Mauro has moved that logic into a new program (tools/docs/sphinx-build-wrapper) that, with any luck at all, will be far easier to understand and maintain. - The get_feat.pl program, used to access information under Documentation/features/, has been rewritten in Python, bringing an end to the use of Perl in the docs subsystem. - The top-level README file has been reorganized into a more reader-friendly presentation. 
- A lot of Chinese translation additions - Typo fixes and documentation updates as usual" * tag 'docs-6.19' of git://git.lwn.net/linux: (164 commits) docs: makefile: move rustdoc check to the build wrapper README: restructure with role-based documentation and guidelines docs: kdoc: various fixes for grammar, spelling, punctuation docs: kdoc_parser: use '@' for Excess enum value docs: submitting-patches: Clarify that removal of Acks needs explanation too docs: kdoc_parser: add data/function attributes to ignore docs: MAINTAINERS: update Mauro's files/paths docs/zh_CN: Add wd719x.rst translation docs/zh_CN: Add libsas.rst translation get_feat.pl: remove it, as it got replaced by get_feat.py Documentation/sphinx/kernel_feat.py: use class directly tools/docs/get_feat.py: convert get_feat.pl to Python Documentation/admin-guide: fix typo and comment in cscope example docs/zh_CN: Add data-integrity.rst translation docs/zh_CN: Add blk-mq.rst translation docs/zh_CN: Add block/index.rst translation docs/zh_CN: Update the Chinese translation of kbuild.rst docs: bring some order to our Python module hierarchy docs: Move the python libraries to tools/lib/python Documentation/kernel-parameters: Move the kernel build options ...
11 daysMerge tag 'v6.19-p1' of ↵Linus Torvalds170-1682/+2028
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 Pull crypto updates from Herbert Xu: "API: - Rewrite memcpy_sglist from scratch - Add on-stack AEAD request allocation - Fix partial block processing in ahash Algorithms: - Remove ansi_cprng - Remove tcrypt tests for poly1305 - Fix EINPROGRESS processing in authenc - Fix double-free in zstd Drivers: - Use drbg ctr helper when reseeding xilinx-trng - Add support for PCI device 0x115A to ccp - Add support of paes in caam - Add support for aes-xts in dthev2 Others: - Use likely in rhashtable lookup - Fix lockdep false-positive in padata by removing a helper" * tag 'v6.19-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (71 commits) crypto: zstd - fix double-free in per-CPU stream cleanup crypto: ahash - Zero positive err value in ahash_update_finish crypto: ahash - Fix crypto_ahash_import with partial block data crypto: lib/mpi - use min() instead of min_t() crypto: ccp - use min() instead of min_t() hwrng: core - use min3() instead of nested min_t() crypto: aesni - ctr_crypt() use min() instead of min_t() crypto: drbg - Delete unused ctx from struct sdesc crypto: testmgr - Add missing DES weak and semi-weak key tests Revert "crypto: scatterwalk - Move skcipher walk and use it for memcpy_sglist" crypto: scatterwalk - Fix memcpy_sglist() to always succeed crypto: iaa - Request to add Kanchana P Sridhar to Maintainers. crypto: tcrypt - Remove unused poly1305 support crypto: ansi_cprng - Remove unused ansi_cprng algorithm crypto: asymmetric_keys - fix uninitialized pointers with free attribute KEYS: Avoid -Wflex-array-member-not-at-end warning crypto: ccree - Correctly handle return of sg_nents_for_len crypto: starfive - Correctly handle return of sg_nents_for_len crypto: iaa - Fix incorrect return value in save_iaa_wq() crypto: zstd - Remove unnecessary size_t cast ...
11 daysMerge tag 'ipe-pr-20251202' of ↵Linus Torvalds5-4/+47
git://git.kernel.org/pub/scm/linux/kernel/git/wufan/ipe Pull IPE updates from Fan Wu: "The primary change is the addition of support for the AT_EXECVE_CHECK flag. This allows interpreters to signal the kernel to perform IPE security checks on script files before execution, extending IPE enforcement to indirectly executed scripts. Update documentation for it, and also fix a comment" * tag 'ipe-pr-20251202' of git://git.kernel.org/pub/scm/linux/kernel/git/wufan/ipe: ipe: Update documentation for script enforcement ipe: Add AT_EXECVE_CHECK support for script enforcement ipe: Drop a duplicated CONFIG_ prefix in the ifdeffery
11 daysMerge tag 'integrity-v6.19' of ↵Linus Torvalds7-28/+123
git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity Pull integrity updates from Mimi Zohar: "Bug fixes: - defer credentials checking from the bprm_check_security hook to the bprm_creds_from_file security hook - properly ignore IMA policy rules based on undefined SELinux labels IMA policy rule extensions: - extend IMA to limit including file hashes in the audit logs (dont_audit action) - define a new filesystem subtype policy option (fs_subtype) Misc: - extend IMA to support in-kernel module decompression by deferring the IMA signature verification in kernel_read_file() to after the kernel module is decompressed" * tag 'integrity-v6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity: ima: Handle error code returned by ima_filter_rule_match() ima: Access decompressed kernel module to verify appended signature ima: add fs_subtype condition for distinguishing FUSE instances ima: add dont_audit action to suppress audit actions ima: Attach CREDS_CHECK IMA hook to bprm_creds_from_file LSM hook
11 daysperf test kvm: Add some basic perf kvm test coverageIan Rogers1-0/+154
Setup qemu with KVM then run kvm stat and some host recording/reporting/build-id tests. Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
11 daysperf tests evlist: Add basic evlist testIan Rogers1-0/+79
Add test that evlist reports expected events from perf record. Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
11 daysperf tests script dlfilter: Add a dlfilter testIan Rogers1-0/+107
Compile a simple dlfilter and make sure it removes samples from everything other than a test_loop. Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
11 daysperf tests kallsyms: Add basic kallsyms testIan Rogers1-0/+56
Add test that kallsyms finds a well known symbol and fails for another. Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
11 daysperf tests timechart: Add a perf timechart testIan Rogers1-0/+67
Basic coverage for `perf timechart` doing a record and then a basic sanity test of the generated SVG file. Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
11 daysperf tests top: Add basic perf top coverage testIan Rogers1-0/+74
The test starts a background thloop workload and monitors it using cpu-clock, ensuring test_loop appears in the output. Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
11 daysperf tests buildid: Add purge and remove testingIan Rogers1-26/+177
Add testing for the purge and remove commands. Use the noploop workload rather than just a return to avoid missing samples in the workload in perf record. Tidy up the cleanup code so that it also cleans up when signals happen. Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
11 daysperf tests c2c: Add a basic c2cIan Rogers1-0/+62
Add basic c2c record and report testing to gain some coverage. Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
11 daysperf c2c: Clean up some defensive gets and make asan cleanIan Rogers1-22/+14
To deal with histogram code that had missing gets the c2c code had some defensive gets. Those other issues were cleaned up by the reference count checker, clean them up for the c2c command here. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
11 daysperf jitdump: Fix missed dso__putIan Rogers1-0/+2
Reference count checking caught a missing dso__put following a machine__findnew_dso_id. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
11 daysperf mem-events: Don't leak online CPU mapIan Rogers1-1/+4
Reference count checking found the online CPU map was being gotten but not put. Add in the missing put. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
11 daysperf hist: In init, ensure mem_info is put on error pathsIan Rogers1-4/+2
Rather than exit the internal map_symbols directly, put the mem-info that does this and also lowers the reference count on the mem-info itself otherwise the mem-info is being leaked. Fixes: 56e144fe98260a0f ("perf mem_info: Add and use map_symbol__exit and addr_map_symbol__exit") Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
11 daysperf probe-event: Ensure probe event nsinfo is always clearedIan Rogers1-6/+6
Move nsinfo__zput from cleanup_perf_probe_events to clear_perf_probe_event so it is always executed. Clean up clear_perf_probe_events to not call nsinfo__zput and use the pev variable to avoid repeated array accesses. Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
11 daysperf symbol: Add missed dso__putIan Rogers1-0/+1
Add missing dso__put for the dso created in maps__split_kallsyms. Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
11 daysperf symbol-elf: Add missing puts on error pathIan Rogers1-1/+4
In dso__process_kernel_symbol if inserting a map fails, probably ENOMEM, then the reference count puts were missing on the dso and map. Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
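The missing-put pattern fixed by several of the perf commits above can be illustrated with a minimal sketch. All names here (`struct obj`, `obj_new()`, `obj_get()`, `obj_put()`, `container_insert()`, `process()`) are hypothetical and are not perf APIs; the sketch only assumes a plain reference count where the creator holds one reference and must drop it on every path, including the error path.

```c
#include <stdlib.h>

/* Hypothetical reference-counted object; not a perf data structure. */
struct obj {
	int refcnt;
};

static struct obj *obj_new(void)
{
	struct obj *o = calloc(1, sizeof(*o));

	if (o)
		o->refcnt = 1;	/* the creator holds the first reference */
	return o;
}

static struct obj *obj_get(struct obj *o)
{
	if (o)
		o->refcnt++;
	return o;
}

static void obj_put(struct obj *o)
{
	if (o && --o->refcnt == 0)
		free(o);
}

/* Hypothetical insert that may fail, e.g. on memory exhaustion. */
static int container_insert(struct obj **slot, struct obj *o, int fail)
{
	if (fail)
		return -1;
	*slot = obj_get(o);	/* the container takes its own reference */
	return 0;
}

static int process(struct obj **slot, int fail)
{
	struct obj *o = obj_new();
	int err;

	if (!o)
		return -1;

	err = container_insert(slot, o, fail);
	/*
	 * Drop the local reference on success and on error. Without this
	 * put, a failed insert would leak the object.
	 */
	obj_put(o);
	return err;
}
```

On success the container's reference keeps the object alive; on failure the single put frees it.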
11 daysperf timechart: Add record support for output perf.data pathIan Rogers2-6/+12
The '-o' option exists for the SVG creation but not for `perf timechart record`. Add to better allow testing. Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
11 daysperf kvm: Fix debug assertionIan Rogers1-1/+1
There are 2 slots left for kvm_add_default_arch_event, fix the assertion so that debug builds don't fail the assert and to agree with the comment. Fixes: 45ff39f6e70aa55d0 ("perf tools kvm: Fix the potential out of range memory access issue") Signed-off-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
11 daysperf vendor events intel: Update sierraforest events from 1.12 to 1.13Ian Rogers3-11/+20
The updated events were published in: https://github.com/intel/perfmon/commit/445e38f5128592f8b5c38da30267fff025e37613 Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
11 daysperf vendor events intel: Update pantherlake events from 1.00 to 1.02Ian Rogers5-2/+425
The updated events were published in: https://github.com/intel/perfmon/commit/6edacf434dffa046435de2f6a182c00df3cf4edc Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
11 daysperf vendor events intel: Update meteorlake events from 1.17 to 1.18Ian Rogers2-11/+11
The updated events were published in: https://github.com/intel/perfmon/commit/348f33fae477f281812c32e1c07812b7e35614dd Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
11 daysperf vendor events intel: Update lunarlake events from 1.18 to 1.19Ian Rogers4-14/+35
The updated events were published in: https://github.com/intel/perfmon/commit/09a0c74b23b5d20adf1f97e5022856568d05494c Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
11 daysperf vendor events intel: Update icelakex events from 1.28 to 1.30Ian Rogers2-3/+3
The updated events were published in: https://github.com/intel/perfmon/commit/dc6ffee20c74bfd21d7a7e338345578d4b7ca9ca Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
11 daysperf vendor events intel: Update graniterapids events from 1.15 to 1.16Ian Rogers3-3/+12
The updated events were published in: https://github.com/intel/perfmon/commit/b4acc3fd520eb098db41083010b65b75ae906c96 Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
11 daysperf vendor events intel: Update cascadelakex metric unitsIan Rogers2-7/+7
The updated metrics were published in: https://github.com/intel/perfmon/pull/348/commits/2dce436130ddfb8b442fc373d103f970de26cb78 Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
11 daysperf vendor events intel: Update arrowlake events from 1.13 to 1.14Ian Rogers8-19/+1111
The updated events were published in: https://github.com/intel/perfmon/commit/588dd77675039e1aaacee27a414cbcf3625c58a3 Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
11 daysperf vendor events intel: Update alderlake events from 1.34 to 1.35Ian Rogers5-22/+26
The updated events were published in: https://github.com/intel/perfmon/commit/c74f1cefa94d224cb3338507961b59d8a2a1c4e9 Signed-off-by: Ian Rogers <irogers@google.com> Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
11 daysperf arm_spe: Add CPU variants supporting common data source packetLeo Yan1-0/+5
Add the following CPU variants to the list for data source decoding: - Cortex-A715 [1] - Cortex-A78C [2] - Cortex-X1 [3] - Cortex-X4 [4] - Neoverse V3 [5] [1] https://developer.arm.com/documentation/101590/0103/Statistical-Profiling-Extension-Support/Statistical-Profiling-Extension-data-source-packet [2] https://developer.arm.com/documentation/102226/0002/Debug-descriptions/Statistical-Profiling-Extension/implementation-defined-features-of-SPE [3] https://developer.arm.com/documentation/101433/0102/Debug-descriptions/Statistical-Profiling-Extension/implementation-defined-features-of-SPE [4] https://developer.arm.com/documentation/102484/0003/Statistical-Profiling-Extension-support/Statistical-Profiling-Extension-data-source-packet [5] https://developer.arm.com/documentation/107734/0002/Statistical-Profiling-Extension-support/Statistical-Profiling-Extension-data-source-packet Signed-off-by: Leo Yan <leo.yan@arm.com> Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
11 daysperf auxtrace: Include sys/types.h for pid_tArnaldo Carvalho de Melo1-0/+1
In 754187ad73b73bcb ("perf build: Remove NO_AUXTRACE build option") sys/types.h was removed, which broke the build in all Alpine Linux releases, as musl libc has pid_t defined via sys/types.h, add it back. Fixes: 754187ad73b73bcb ("perf build: Remove NO_AUXTRACE build option") Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com> Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
11 daysMerge tag 'Smack-for-6.19' of https://github.com/cschaufler/smack-nextLinus Torvalds4-119/+275
Pull smack updates from Casey Schaufler: - fix several cases where labels were treated inconsistently when imported from user space - clean up the assignment of extended attributes - documentation improvements * tag 'Smack-for-6.19' of https://github.com/cschaufler/smack-next: Smack: function parameter 'gfp' not described smack: fix kernel-doc warnings for smk_import_valid_label() smack: fix bug: setting task label silently ignores input garbage smack: fix bug: unprivileged task can create labels smack: fix bug: invalid label of unix socket file smack: always "instantiate" inode in smack_inode_init_security() smack: deduplicate xattr setting in smack_inode_init_security() smack: fix bug: SMACK64TRANSMUTE set on non-directory smack: deduplicate "does access rule request transmutation"
11 daysMerge tag 'audit-pr-20251201' of ↵Linus Torvalds3-27/+21
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit Pull audit updates from Paul Moore: - Consolidate the loops in __audit_inode_child() to improve performance When logging a child inode in __audit_inode_child(), we first run through the list of recorded inodes looking for the parent and then we repeat the search looking for a matching child entry. This pull request consolidates both searches into one pass through the recorded inodes, resulting in approximately a 50% reduction in audit overhead. See the commit description for the testing details. - Combine kmalloc()/memset() into kzalloc() in audit_krule_to_data() - Comment fixes * tag 'audit-pr-20251201' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit: audit: merge loops in __audit_inode_child() audit: Use kzalloc() instead of kmalloc()/memset() in audit_krule_to_data() audit: fix comment misindentation in audit.h
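The loop consolidation described in the audit pull request can be sketched in a few lines of C. This is only an illustration under stated assumptions: `struct rec` and `find_parent_and_child()` are hypothetical stand-ins, not the kernel's recorded-inode list or the actual __audit_inode_child() code. The point is that one pass can remember both matches instead of walking the list twice.

```c
#include <stddef.h>

/* Hypothetical record; stands in for one entry in the list of recorded inodes. */
struct rec {
	unsigned long ino;
	int is_parent;
};

/*
 * Single pass: remember the first parent match and the first child match,
 * and stop early once both have been found.
 */
static void find_parent_and_child(const struct rec *recs, size_t n,
				  unsigned long parent_ino,
				  unsigned long child_ino,
				  const struct rec **parent,
				  const struct rec **child)
{
	*parent = NULL;
	*child = NULL;

	for (size_t i = 0; i < n; i++) {
		const struct rec *r = &recs[i];

		if (!*parent && r->is_parent && r->ino == parent_ino)
			*parent = r;
		else if (!*child && !r->is_parent && r->ino == child_ino)
			*child = r;

		if (*parent && *child)
			break;
	}
}
```

In the worst case the list is still walked once in full, but never twice.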
11 daysMerge tag 'selinux-pr-20251201' of ↵Linus Torvalds11-47/+110
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux Pull selinux updates from Paul Moore: - Improve the granularity of SELinux labeling for memfd files Currently when creating a memfd file, SELinux treats it the same as any other tmpfs, or hugetlbfs, file. While simple, the drawback is that it is not possible to differentiate between memfd and tmpfs files. This adds a call to the security_inode_init_security_anon() LSM hook and wires up SELinux to provide a set of memfd specific access controls, including the ability to control the execution of memfds. As usual, the commit message has more information. - Improve the SELinux AVC lookup performance Adopt MurmurHash3 for the SELinux AVC hash function instead of the custom hash function currently used. MurmurHash3 is already used for the SELinux access vector table so the impact to the code is minimal, and performance tests have shown improvements in both hash distribution and latency. See the commit message for the performance measurements. - Introduce a Kconfig option for the SELinux AVC bucket/slot size While we have the ability to grow the number of AVC hash buckets today, the size of the buckets (slot size) is fixed at 512. This pull request makes that slot size configurable at build time through a new Kconfig knob, CONFIG_SECURITY_SELINUX_AVC_HASH_BITS. * tag 'selinux-pr-20251201' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux: selinux: improve bucket distribution uniformity of avc_hash() selinux: Move avtab_hash() to a shared location for future reuse selinux: Introduce a new config to make avc cache slot size adjustable memfd,selinux: call security_inode_init_security_anon()
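For readers unfamiliar with MurmurHash3, here is a minimal sketch of a 32-bit MurmurHash3-style hash over three 32-bit keys, reduced to a power-of-two slot count. The function names, the choice of three input words, the zero seed and the `bits` parameter are assumptions made for illustration; this is not the kernel's avc_hash() or avtab_hash() implementation.

```c
#include <stdint.h>

static inline uint32_t rotl32(uint32_t x, unsigned int r)
{
	return (x << r) | (x >> (32 - r));
}

/* One MurmurHash3 (x86_32) body round: mix a 32-bit key word into the state. */
static inline uint32_t murmur3_mix(uint32_t h, uint32_t k)
{
	k *= 0xcc9e2d51u;
	k = rotl32(k, 15);
	k *= 0x1b873593u;

	h ^= k;
	h = rotl32(h, 13);
	return h * 5u + 0xe6546b64u;
}

/* MurmurHash3 32-bit finalizer: spreads the entropy across all bits. */
static inline uint32_t murmur3_fmix(uint32_t h)
{
	h ^= h >> 16;
	h *= 0x85ebca6bu;
	h ^= h >> 13;
	h *= 0xc2b2ae35u;
	h ^= h >> 16;
	return h;
}

/*
 * Hypothetical helper: hash three words and keep the low 'bits' bits,
 * giving a slot index in [0, 1 << bits). 'bits' must be between 1 and 31.
 */
static uint32_t sketch_hash3(uint32_t a, uint32_t b, uint32_t c,
			     unsigned int bits)
{
	uint32_t h = 0;		/* seed, assumed to be zero here */

	h = murmur3_mix(h, a);
	h = murmur3_mix(h, b);
	h = murmur3_mix(h, c);
	h ^= 12;		/* input length in bytes: 3 * 4 */
	h = murmur3_fmix(h);

	return h & ((1u << bits) - 1);
}
```

With `bits` set to 9 the index falls in a table of 512 slots, the fixed size mentioned above; a build-time knob such as CONFIG_SECURITY_SELINUX_AVC_HASH_BITS would change only that final mask.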
11 daysobjtool: Simplify .annotate_insn code generation output some moreobjtool-urgent-2025-12-06objtool/urgentJosh Poimboeuf1-7/+6
Remove the superfluous section name quotes, and combine the longs into a single command. Before: 911: .pushsection ".discard.annotate_insn", "M", @progbits, 8; .long 911b - .; .long 2; .popsection After: 911: .pushsection .discard.annotate_insn, "M", @progbits, 8; .long 911b - ., 2; .popsection No change in functionality. Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://patch.msgid.link/hpsfcihgqmhcdrg7pop7z73ptymakgjq7qlxrawrjxilosk43l@xikqif3ievj4
11 daysobjtool: Add more robust signal error handling, detect and warn about stack ↵Josh Poimboeuf4-1/+141
overflows When the kernel build fails due to an objtool segfault, the error message is a bit obtuse and confusing: make[5]: *** [scripts/Makefile.build:503: drivers/scsi/qla2xxx/qla2xxx.o] Error 139 ^^^^^^^^^ make[5]: *** Deleting file 'drivers/scsi/qla2xxx/qla2xxx.o' make[4]: *** [scripts/Makefile.build:556: drivers/scsi/qla2xxx] Error 2 make[3]: *** [scripts/Makefile.build:556: drivers/scsi] Error 2 make[2]: *** [scripts/Makefile.build:556: drivers] Error 2 make[1]: *** [/home/jpoimboe/git/linux/Makefile:2013: .] Error 2 make: *** [Makefile:248: __sub-make] Error 2 Add a signal handler to objtool which prints an error message like the following if the local stack has overflowed (which is a real possibility, as objtool makes heavy use of recursion): drivers/scsi/qla2xxx/qla2xxx.o: error: SIGSEGV: objtool stack overflow! or: drivers/scsi/qla2xxx/qla2xxx.o: error: SIGSEGV: objtool crash! Also, re-raise the signal so the core dump still gets triggered. [ mingo: Applied a build fix, added more comments and prettified the code. ] Suggested-by: Ingo Molnar <mingo@kernel.org> Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Alexandre Chartre <alexandre.chartre@oracle.com> Cc: David Laight <david.laight.linux@gmail.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://patch.msgid.link/mi4tihk4dbncn7belrhp6ooudhpw4vdggerktu5333w3gqf3uf@vqlhc3y667mg
11 daysobjtool: Remove newlines and tabs from annotation macrosJosh Poimboeuf15-23/+23
Remove newlines and tabs from the annotation macros so the invoking code can insert them as needed to match the style of the surrounding code. Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://patch.msgid.link/66305834c2eb78f082217611b756231ae9c0b555.1764694625.git.jpoimboe@kernel.org
11 daysobjtool: Consolidate annotation macrosJosh Poimboeuf1-21/+15
Consolidate __ASM_ANNOTATE into a single macro which is used by both C and asm. This also makes the code generation a bit more palatable by putting it all on a single line. Turn this: 911: .pushsection .discard.annotate_insn,"M", @progbits, 8 .long 911b - . .long 1 .popsection jmp __x86_return_thunk Into: 911: .pushsection ".discard.annotate_insn", "M", @progbits, 8; .long 911b - .; .long 1; .popsection jmp __x86_return_thunk Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://patch.msgid.link/c05ff40d3383e85c3b59018ef0b3c7aaf993a60d.1764694625.git.jpoimboe@kernel.org
11 daysperf/uprobes: Remove <space><Tab> whitespace noiseIngo Molnar1-4/+4
A few cases of space-Tab noise snuck in. Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://patch.msgid.link/176478594889.498.15611228524880763978.tip-bot2@tip-bot2
11 daysMerge branches 'clk-visconti', 'clk-imx', 'clk-microchip', 'clk-rockchip' ↵Stephen Boyd79-245/+8298
and 'clk-qcom' into clk-next * clk-visconti: clk: visconti: Add VIIF clocks dt-bindings: clock: tmpv770x: Add VIIF clocks dt-bindings: clock: tmpv770x: Remove definition of number of clocks clk: visconti: Do not define number of clocks in bindings * clk-imx: clk: imx: add driver for imx8ulp's sim lpav dt-bindings: clock: document 8ULP's SIM LPAV clk: imx: imx8mp-audiomix: use devm_auxiliary_device_create() to simple code clk: imx: Add some delay before deassert the reset * clk-microchip: reset: mpfs: add non-auxiliary bus probing clk: lan966x: remove unused dt-bindings include clk: microchip: mpfs: use regmap for clocks dt-bindings: clk: microchip: mpfs: remove first reg region * clk-rockchip: clk: rockchip: Add clock and reset driver for RK3506 dt-bindings: clock: rockchip: Add RK3506 clock and reset unit clk: rockchip: Add clock controller for the RV1126B dt-bindings: clock, reset: Add support for rv1126b clk: rockchip: Implement rockchip_clk_register_armclk_multi_pll() dt-bindings: clock: rk3568: Drop CLK_NR_CLKS define clk: rockchip: rk3568: Drop CLK_NR_CLKS usage dt-bindings: clock: rk3568: Add SCMI clock ids * clk-qcom: (48 commits) clk: qcom: Mark camcc_sm7150_hws static clk: qcom: x1e80100-dispcc: Add USB4 router link resets dt-bindings: clock: qcom: x1e80100-dispcc: Add USB4 router link resets clk: qcom: videocc-sm8750: Add video clock controller driver for SM8750 dt-bindings: clock: qcom: Add SM8750 video clock controller clk: qcom: branch: Extend invert logic for branch2 mem clocks clk: qcom: ecpricc-qdu100: Add mem_enable_mask to the clock memory branch clk: qcom: clk_mem_branch: add enable mask and invert flags clk: qcom: mmcc-sdm660: Add missing MDSS reset dt-bindings: clock: mmcc-sdm660: Add missing MDSS reset clk: qcom: use different Kconfig prompts for APSS IPQ5424/6018 drivers clk: qcom: apss-ipq5424: remove unused 'apss_clk' structure dt-bindings: clock: qcom: Add Kaanapali Global clock controller dt-bindings: clock: qcom: Document the Kaanapali 
TCSR Clock Controller dt-bindings: clock: qcom-rpmhcc: Add RPMHCC for Kaanapali clk: qcom: tcsrcc-glymur: Update register offsets for clock refs clk: qcom: gcc-qcs615: Update the SDCC clock to use shared_floor_ops clk: qcom: camcc-sm7150: Fix PLL config of PLL2 clk: qcom: camcc-sm6350: Fix PLL config of PLL2 clk: qcom: Add NSS clock controller driver for IPQ5424 ...
11 daysMerge branches 'clk-socfpga', 'clk-renesas', 'clk-cleanup', 'clk-samsung' ↵Stephen Boyd50-133/+1933
and 'clk-mediatek' into clk-next * clk-socfpga: clk: socfpga: agilex5: add clock driver for Agilex5 * clk-renesas: (35 commits) clk: renesas: r9a09g077: Add SPI module clocks clk: renesas: r9a09g056: Add USB3.0 clocks/resets clk: renesas: r9a09g057: Add USB3.0 clocks/resets clk: renesas: r9a09g047: Add RSCI clocks/resets dt-bindings: clock: renesas,r9a09g056-cpg: Add USB3.0 core clocks dt-bindings: clock: renesas,r9a09g057-cpg: Add USB3.0 core clocks clk: renesas: r9a06g032: Fix memory leak in error path clk: renesas: r9a09g077: Use devm_ helpers for divider clock registration clk: renesas: r9a09g077: Remove stray blank line clk: renesas: r9a09g077: Propagate rate changes to parent clocks clk: renesas: r8a779a0: Add 3DGE module clock clk: renesas: r8a779a0: Add ZG Core clock clk: renesas: rcar-gen4: Add support for clock dividers in FRQCRB dt-bindings: clock: r8a779a0: Add ZG core clock clk: renesas: r9a09g056: Add clock and reset entries for ISP clk: renesas: r9a09g056: Add support for PLLVDO, CRU clocks, and resets clk: renesas: r9a09g056: Add clocks and resets for DSI and LCDC modules clk: renesas: r9a09g077: Add TSU module clock clk: renesas: r9a09g057: Add clock and reset entries for DSI and LCDC clk: renesas: rzv2h: Add support for DSI clocks ... 
* clk-cleanup: clk: keystone: fix compile testing clk: keystone: syscon-clk: fix regmap leak on probe failure clk: samsung: exynos-clkout: Assign .num before accessing .hws clk: actions: Fix discarding const qualifier by 'container_of' macro clk: spacemit: Set clk_hw_onecell_data::num before using flex array clk: spacemit: fix comment typo clk: keystone: Fix discarded const qualifiers clk: sprd: sc9860: Simplify with of_device_get_match_data() * clk-samsung: firmware: exynos-acpm: add empty method to allow compile test MAINTAINERS: add ACPM clock bindings and driver clk: samsung: add Exynos ACPM clock driver firmware: exynos-acpm: register ACPM clocks pdev firmware: exynos-acpm: add DVFS protocol dt-bindings: firmware: google,gs101-acpm-ipc: add ACPM clocks clk: samsung: clk-pll: simplify samsung_pll_lock_wait() clk: samsung: exynosautov920: add block mfc clock support clk: samsung: exynosautov920: add clock support dt-bindings: clock: exynosautov920: add mfc clock definitions dt-bindings: clock: exynosautov920: add m2m clock definitions dt-bindings: clock: google,gs101-clock: add power-domains * clk-mediatek: clk: en7523: Add reset-controller support for EN7523 SoC dt-bindings: clock: airoha: Add reset support to EN7523 clock binding
11 daysMerge tag 'lsm-pr-20251201' of ↵Linus Torvalds53-742/+1025
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm Pull LSM updates from Paul Moore: - Rework the LSM initialization code What started as a "quick" patch to enable a notification event once all of the individual LSMs were initialized, snowballed a bit into a 30+ patch patchset when everything was done. Most of the patches, and the diffstat, are due to splitting out the initialization code into security/lsm_init.c and cleaning up some of the mess that was there. While not strictly necessary, it does clean up the code significantly, and hopefully makes the upkeep a bit easier in the future. Aside from the new LSM_STARTED_ALL notification, these changes also ensure that individual LSM initcalls are only called when the LSM is enabled at boot time. There should be a minor reduction in boot times for those who build multiple LSMs into their kernels, but only enable a subset at boot. It is worth mentioning that nothing at present makes use of the LSM_STARTED_ALL notification, but there is work in progress which is dependent upon LSM_STARTED_ALL. 
- Make better use of the seq_put*() helpers in device_cgroup * tag 'lsm-pr-20251201' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/lsm: (36 commits) lsm: use unrcu_pointer() for current->cred in security_init() device_cgroup: Refactor devcgroup_seq_show to use seq_put* helpers lsm: add a LSM_STARTED_ALL notification event lsm: consolidate all of the LSM framework initcalls selinux: move initcalls to the LSM framework ima,evm: move initcalls to the LSM framework lockdown: move initcalls to the LSM framework apparmor: move initcalls to the LSM framework safesetid: move initcalls to the LSM framework tomoyo: move initcalls to the LSM framework smack: move initcalls to the LSM framework ipe: move initcalls to the LSM framework loadpin: move initcalls to the LSM framework lsm: introduce an initcall mechanism into the LSM framework lsm: group lsm_order_parse() with the other lsm_order_*() functions lsm: output available LSMs when debugging lsm: cleanup the debug and console output in lsm_init.c lsm: add/tweak function header comment blocks in lsm_init.c lsm: fold lsm_init_ordered() into security_init() lsm: cleanup initialize_lsm() and rename to lsm_init_single() ...
11 daysx86/boot/Documentation: Prefix hexadecimal literals with 0xx86-urgent-2025-12-06Ingo Molnar1-23/+23
The x86 bootloader ID specification text uses hexadecimal values without a 0x prefix: D kexec-tools E Extended (see ext_loader_type) F Special (0xFF = undefined) 10 Reserved 11 Minimal Linux Bootloader <http://sebastian-plotz.blogspot.de> 12 OVMF UEFI virtualization stack 13 barebox Which beyond the ambiguity of '13' in isolation, also made me fail a grep -wi '0xd' when I was looking for the kexec bootloader ID definition and caused quite a bit of head-scratching before I found out why it didn't show up. Furthermore, the actual explanatory text uses the 0x prefix: For boot loader IDs above T = 0xD, write T = 0xE to this field and write the extended ID minus 0x10 to the ext_loader_type field. Similarly, the ext_loader_ver field can be used to provide more than four bits for the bootloader version. So make it all both unambiguous, easy to grep and consistent across the entire documentation by prefixing the IDs with 0x. Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: linux-kernel@vger.kernel.org
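The extended-ID rule quoted in this commit message can be made concrete with a short sketch. The struct and function names are hypothetical. The 0xTV layout of type_of_loader and the split of the version between its low nibble and ext_loader_ver follow my reading of the x86 boot protocol documentation and should be treated as assumptions here. The sketch also assumes the ID is either at most 0xD or at least 0x10, since 0xE and 0xF have special meanings.

```c
#include <stdint.h>

/* Hypothetical container for the three boot-protocol fields involved. */
struct loader_fields {
	uint8_t type_of_loader;		/* 0xTV: T = loader ID, V = low version nibble */
	uint8_t ext_loader_type;	/* extended ID minus 0x10, used when T == 0xE */
	uint8_t ext_loader_ver;		/* version bits above the low four */
};

static struct loader_fields encode_loader(unsigned int id, unsigned int ver)
{
	struct loader_fields f = { 0, 0, 0 };
	unsigned int t = id;

	if (id > 0xD) {
		/* IDs above 0xD do not fit in T: write 0xE and store the rest. */
		t = 0xE;
		f.ext_loader_type = (uint8_t)(id - 0x10);
	}

	f.type_of_loader = (uint8_t)((t << 4) | (ver & 0xF));
	f.ext_loader_ver = (uint8_t)(ver >> 4);
	return f;
}
```

For example, barebox (ID 0x13) with version 0x25 would be encoded as type_of_loader 0xE5, ext_loader_type 0x03 and ext_loader_ver 0x02, while kexec-tools (ID 0xD) with version 0 stays at type_of_loader 0xD0.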
11 daysx86/boot/Documentation: Spell 'ID' consistentlyIngo Molnar1-2/+2
The bootloader ID specification text uses 2 capitalization variants for the same thing: 'id', 'ids', 'ID' and 'IDs'. Use 'ID/IDs' consistently. Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: linux-kernel@vger.kernel.org
11 daysMerge tag 'keys-trusted-next-rc1' of ↵Linus Torvalds3-22/+22
git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd Pull trusted key updates from Jarkko Sakkinen: - Remove duplicate 'tpm2_hash_map' in favor of 'tpm2_find_hash_alg()' - Fix a memory leak on failure paths of 'tpm2_load_cmd' * tag 'keys-trusted-next-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd: KEYS: trusted: Fix a memory leak in tpm2_load_cmd KEYS: trusted: Replace a redundant instance of tpm2_hash_map
11 daysMerge tag 'keys-next-6.19-rc1' of ↵Linus Torvalds6-11/+13
git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd Pull keys update from Jarkko Sakkinen: "This contains only three fixes" * tag 'keys-next-6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/jarkko/linux-tpmdd: keys: Fix grammar and formatting in 'struct key_type' comments keys: Replace deprecated strncpy in ecryptfs_fill_auth_tok keys: Remove redundant less-than-zero checks
11 daysMerge tag 'nolibc-20251130-for-6.19-1' of ↵Linus Torvalds37-163/+290
git://git.kernel.org/pub/scm/linux/kernel/git/nolibc/linux-nolibc Pull nolibc updates from Thomas Weißschuh: - Preparations to the use of nolibc in UML: - Cleanup of sparse warnings - Library mode without _start() - More consistency when disabling errno - Unconditional installation of all architecture support files - Always 64-bit wide ino_t and off_t - Various cleanups and bug fixes * tag 'nolibc-20251130-for-6.19-1' of git://git.kernel.org/pub/scm/linux/kernel/git/nolibc/linux-nolibc: (25 commits) selftests/nolibc: error out on linker warnings selftests/nolibc: use lld to link loongarch binaries tools/nolibc: remove more __nolibc_enosys() fallbacks tools/nolibc: remove now superfluous overflow check in llseek tools/nolibc: use 64-bit off_t tools/nolibc: prefer the llseek syscall tools/nolibc: handle 64-bit off_t for llseek tools/nolibc: use 64-bit ino_t tools/nolibc: avoid using plain integer as NULL pointer tools/nolibc: add support for fchdir() tools/nolibc: clean up outdated comments in generic arch.h tools/nolibc: make the "headers" target install all supported archs tools/nolibc: add the more portable inttypes.h tools/nolibc: provide the portable sys/select.h tools/nolibc: add missing memchr() to string.h tools/nolibc: fix misleading help message regarding installation path tools/nolibc: add uio.h with readv and writev tools/nolibc: add option to disable runtime tools/nolibc: use __fallthrough__ rather than fallthrough tools/nolibc: implement %m if errno is not defined ...
11 daysASoc: qcom: q6afe: fix bad guard conversionJohan Hovold1-2/+2
A recent spinlock guard conversion used the wrong guard so that interrupts are no longer disabled while holding the port list lock. Based on a cursory look this appears to be safe currently, but it could cause a deadlock if one of these helpers is ever called in interrupt context. Fixes: 4b1edbb028fb ("ASoC: qcom: q6afe: Use guard() for spin locks") Cc: Srinivas Kandagatla <srinivas.kandagatla@oss.qualcomm.com> Signed-off-by: Johan Hovold <johan@kernel.org> Reviewed-by: Srinivas Kandagatla <srinivas.kandagatla@oss.qualcomm.com> Link: https://patch.msgid.link/20251203105542.24765-2-johan@kernel.org Signed-off-by: Mark Brown <broonie@kernel.org>
11 daysdt-bindings: thermal: qcom-tsens: Remove invalid tab characterRob Herring (Arm)1-1/+1
Commit 1ee90870ce79 ("dt-bindings: thermal: tsens: Add QCS8300 compatible") uses a tab character which is illegal in YAML (at the beginning of a line). The original patch was correct, so this got corrupted when applied. Fixes: 1ee90870ce79 ("dt-bindings: thermal: tsens: Add QCS8300 compatible") Signed-off-by: Rob Herring (Arm) <robh@kernel.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
11 daystools/power/x86/intel-speed-select: v1.24 releaseSrinivas Pandruvada1-1/+1
This version includes the following changes: - Check feature status to check if the feature enablement was successful - Reset SST-TF bucket structure to display valid bucket info Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
11 daystools/power/x86/intel-speed-select: Reset isst_turbo_freq_info for invalid bucketsSrinivas Pandruvada1-0/+1
With SST-TF version 2 only 3 buckets are present. The information in other buckets can be junk. So initialize the info structure of type isst_turbo_freq_info before issuing the ioctl to get bucket information. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
11 daystools/power/x86/intel-speed-select: Check feature statusSrinivas Pandruvada1-2/+43
After changing the enable/disable status of SST-CP, SST-TF or SST-BF, check whether the hardware status change was successful. If it is still not successful after retries, return failure. Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
11 daysx86/asm: Remove ANNOTATE_DATA_SPECIAL usageJosh Poimboeuf7-12/+20
Instead of manually annotating each __ex_table entry, just make the section mergeable and store the entry size in the ELF section header. Either way works for objtool create_fake_symbols(), but this way produces cleaner code generation. Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://patch.msgid.link/b858cb7891c1ba0080e22a9c32595e6c302435e2.1764694625.git.jpoimboe@kernel.org
11 daysx86/alternative: Remove ANNOTATE_DATA_SPECIAL usageJosh Poimboeuf3-4/+7
Instead of manually annotating each .altinstructions entry, just make the section mergeable and store the entry size in the ELF section header. Either way works for objtool create_fake_symbols(), but this way produces cleaner code generation. Signed-off-by: Josh Poimboeuf <jpoimboe@kernel.org> Signed-off-by: Ingo Molnar <mingo@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: https://patch.msgid.link/5ac04e6db5be6453dce8003a771ebb0c47b4cd7a.1764694625.git.jpoimboe@kernel.org
11 daysdt-bindings: kbuild: Skip validating empty examplesRob Herring (Arm)1-1/+2
Extracting empty examples results in just the empty template being generated and then validated. That's pointless and not free, so filter out the schemas without any examples from the targets. There's currently a little less than 10% of the binding schema files without examples. Removing them improves the build time by ~6%. Link: https://patch.msgid.link/20251201175030.3785060-1-robh@kernel.org Signed-off-by: Rob Herring (Arm) <robh@kernel.org>
11 daysASoC: rockchip: Fix Wvoid-pointer-to-enum-cast warning (again)Krzysztof Kozlowski1-1/+1
'version' is an enum, thus cast of pointer on 64-bit compile test with clang W=1 causes: rockchip_pdm.c:583:17: error: cast to smaller integer type 'enum rk_pdm_version' from 'const void *' [-Werror,-Wvoid-pointer-to-enum-cast] This was already fixed in commit 49a4a8d12612 ("ASoC: rockchip: Fix Wvoid-pointer-to-enum-cast warning") but then got bad in commit 9958d85968ed ("ASoC: Use device_get_match_data()"). Discussion on LKML also pointed out that 'uintptr_t' is not the correct type and either 'kernel_ulong_t' or 'unsigned long' should be used, with several arguments towards the latter [1]. Link: https://lore.kernel.org/r/CAMuHMdX7t=mabqFE5O-Cii3REMuyaePHmqX+j_mqyrn6XXzsoA@mail.gmail.com/ [1] Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com> Link: https://patch.msgid.link/20251203141644.106459-2-krzysztof.kozlowski@oss.qualcomm.com Signed-off-by: Mark Brown <broonie@kernel.org>
11 daysASoC: codecs: nau8325: Silence uninitialized variables warningsKrzysztof Kozlowski1-2/+2
clang W=1 builds warn: nau8325.c:430:13: error: variable 'n2_max' is uninitialized when used here [-Werror,-Wuninitialized] which are false positives, because the variables will always be initialized when used (guarded by the mclk_max != 0 check). However, initializing them upfront makes the code more obvious and easier to follow, plus it silences the warning. Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com> Link: https://patch.msgid.link/20251203140611.87191-2-krzysztof.kozlowski@oss.qualcomm.com Signed-off-by: Mark Brown <broonie@kernel.org>
11 daysNFSD: nfsd-io-modes: Separate listsBagas Sanjaya1-0/+5
Sphinx reports htmldocs indentation warnings: Documentation/filesystems/nfs/nfsd-io-modes.rst:58: ERROR: Unexpected indentation. [docutils] Documentation/filesystems/nfs/nfsd-io-modes.rst:59: WARNING: Block quote ends without a blank line; unexpected unindent. [docutils] These caused the lists to be shown as long running paragraphs merged with their previous paragraphs. Fix these by separating the lists with a blank line. Fixes: fa8d4e6784d1b6 ("NFSD: add Documentation/filesystems/nfs/nfsd-io-modes.rst") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Closes: https://lore.kernel.org/linux-next/20251202152506.7a2d2d41@canb.auug.org.au/ Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Tested-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
11 daysNFSD: nfsd-io-modes: Wrap shell snippets in literal code blocksBagas Sanjaya1-12/+16
Sphinx reports htmldocs indentation warnings: Documentation/filesystems/nfs/nfsd-io-modes.rst:29: ERROR: Unexpected indentation. [docutils] Documentation/filesystems/nfs/nfsd-io-modes.rst:34: ERROR: Unexpected indentation. [docutils] Fix these by wrapping shell snippets in literal code blocks. Fixes: fa8d4e6784d1b6 ("NFSD: add Documentation/filesystems/nfs/nfsd-io-modes.rst") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Closes: https://lore.kernel.org/linux-next/20251202152506.7a2d2d41@canb.auug.org.au/ Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Tested-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
11 daysNFSD: Add toctree entry for NFSD IO modes docsBagas Sanjaya1-0/+1
Commit fa8d4e6784d1b6 ("NFSD: add Documentation/filesystems/nfs/nfsd-io-modes.rst") adds documentation for NFSD I/O modes, but it forgets to add toctree entry for it. Hence, Sphinx reports: Documentation/filesystems/nfs/nfsd-io-modes.rst: WARNING: document isn't included in any toctree [toc.not_included] Add the entry. Fixes: fa8d4e6784d1b6 ("NFSD: add Documentation/filesystems/nfs/nfsd-io-modes.rst") Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Closes: https://lore.kernel.org/linux-next/20251202152506.7a2d2d41@canb.auug.org.au/ Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com> Reviewed-by: Jeff Layton <jlayton@kernel.org> Reviewed-by: Randy Dunlap <rdunlap@infradead.org> Tested-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
11 daysregulator: check the return value of gpiod_set_value_cansleep()Bartosz Golaszewski1-3/+10
gpiod_set_value_cansleep() now returns an integer and can indicate failures in the GPIO layer. Propagate any potential errors to regulator core. Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com> Link: https://patch.msgid.link/20251203084737.15891-1-bartosz.golaszewski@oss.qualcomm.com Signed-off-by: Mark Brown <broonie@kernel.org>
11 daysASoC: ak5558: Disable regulator when error happensShengjiu Wang1-1/+9
Disable regulator in runtime resume when error happens to balance the reference count of regulator. Fixes: 2ff6d5a108c6 ("ASoC: ak5558: Add regulator support") Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com> Link: https://patch.msgid.link/20251203100529.3841203-3-shengjiu.wang@nxp.com Signed-off-by: Mark Brown <broonie@kernel.org>
11 daysASoC: ak4458: Disable regulator when error happensShengjiu Wang1-1/+9
Disable regulator in runtime resume when error happens to balance the reference count of regulator. Fixes: 7e3096e8f823 ("ASoC: ak4458: Add regulator support") Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com> Link: https://patch.msgid.link/20251203100529.3841203-2-shengjiu.wang@nxp.com Signed-off-by: Mark Brown <broonie@kernel.org>
11 daysgpio: mmio: fix bad guard conversionJohan Hovold1-5/+5
A recent spinlock guard conversion consistently used the wrong guard so that interrupts are no longer disabled while holding the chip lock (which can cause deadlocks). Fixes: 7e061b462b3d ("gpio: mmio: use lock guards") Signed-off-by: Johan Hovold <johan@kernel.org> Link: https://lore.kernel.org/r/20251203105206.24453-1-johan@kernel.org Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@oss.qualcomm.com>
11 daysASoC: amd: acp: Audio is not resuming after s0ixHemalatha Pinnamreddy1-6/+24
Audio fails to resume after the system exits suspend mode because an incorrect ring buffer address is accessed during resume. This patch resolves the issue by selecting the correct address based on the ACP version. Fixes: f6f7d25b11033 ("ASoC: amd: acp: Add pte configuration for ACP7.0 platform") Signed-off-by: Hemalatha Pinnamreddy <hemalatha.pinnamreddy2@amd.com> Signed-off-by: Raghavendra Prasad Mallela <raghavendraprasad.mallela@amd.com> Reviewed-by: Mario Limonciello (AMD) <superm1@kernel.org> Link: https://patch.msgid.link/20251203064650.2554625-1-raghavendraprasad.mallela@amd.com Signed-off-by: Mark Brown <broonie@kernel.org>
11 daysASoC: dt-bindings: cirrus,cs42xx8: Reference common DAI propertiesShengjiu Wang1-1/+4
Reference the dai-common.yaml schema to allow '#sound-dai-cells' and 'sound-name-prefix' to be used because cirrus,cs42xx8 is a codec DAI. Signed-off-by: Shengjiu Wang <shengjiu.wang@nxp.com> Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@oss.qualcomm.com> Link: https://patch.msgid.link/20251203102836.3856471-1-shengjiu.wang@nxp.com Signed-off-by: Mark Brown <broonie@kernel.org>
11 daysASoC: bcm: bcm63xx-pcm-whistler: Check return value of of_dma_configure()Haotian Zhang1-1/+3
bcm63xx_soc_pcm_new() does not check the return value of of_dma_configure(), which may fail with -EPROBE_DEFER or other errors, allowing PCM setup to continue with incomplete DMA configuration. Add error checking for of_dma_configure() and return on failure. Fixes: 88eb404ccc3e ("ASoC: brcm: Add DSL/PON SoC audio driver") Signed-off-by: Haotian Zhang <vulab@iscas.ac.cn> Link: https://patch.msgid.link/20251202101642.492-1-vulab@iscas.ac.cn Signed-off-by: Mark Brown <broonie@kernel.org>
11 daysrust: pci: fix build failure when CONFIG_PCI_MSI is disabledDanilo Krummrich1-1/+13
When CONFIG_PCI_MSI is disabled pci_alloc_irq_vectors() and pci_free_irq_vectors() are defined as inline functions and hence require a Rust helper. error[E0425]: cannot find function `pci_alloc_irq_vectors` in crate `bindings` --> rust/kernel/pci/irq.rs:144:23 | 144 | ...s::pci_alloc_irq_vectors(dev.as_raw(), min_vecs, max_vecs, irq_types.as_raw()) | ^^^^^^^^^^^^^^^^^^^^^ help: a function with a similar name exists: `pci_irq_vector` | ::: .../rust/bindings/bindings_helpers_generated.rs:1197:5 | 1197 | pub fn pci_irq_vector(pdev: *mut pci_dev, nvec: ffi::c_uint) -> ffi::c_int; | --------------------------------------------------------------------------- similarly named function `pci_irq_vector` defined here error[E0425]: cannot find function `pci_free_irq_vectors` in crate `bindings` --> rust/kernel/pci/irq.rs:170:28 | 170 | unsafe { bindings::pci_free_irq_vectors(self.dev.as_raw()) }; | ^^^^^^^^^^^^^^^^^^^^ help: a function with a similar name exists: `pci_irq_vector` | ::: .../rust/bindings/bindings_helpers_generated.rs:1197:5 | 1197 | pub fn pci_irq_vector(pdev: *mut pci_dev, nvec: ffi::c_uint) -> ffi::c_int; | --------------------------------------------------------------------------- similarly named function `pci_irq_vector` defined here error: aborting due to 2 previous errors Fix this by adding the corresponding helpers. Fixes: 340ccc973544 ("rust: pci: Allocate and manage PCI interrupt vectors") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202512012238.YgVvRRUx-lkp@intel.com/ Reviewed-by: Alice Ryhl <aliceryhl@google.com> Reviewed-by: Joel Fernandes <joelagnelf@nvidia.com> Link: https://patch.msgid.link/20251202210501.40998-1-dakr@kernel.org Signed-off-by: Danilo Krummrich <dakr@kernel.org>
11 daysfs: assert on I_FREEING not being set in iput() and iput_not_last()Mateusz Guzik1-1/+2
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com> Link: https://patch.msgid.link/20251201132037.22835-1-mjguzik@gmail.com Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
11 daysfs: PM: Fix reverse check in filesystems_freeze_callback()Rafael J. Wysocki1-1/+1
The freeze_all_ptr check in filesystems_freeze_callback() introduced by commit a3f8f8662771 ("power: always freeze efivarfs") is reversed, which quite confusingly causes all file systems to be frozen when filesystem_freeze_enabled is false. On my systems it causes the WARN_ON_ONCE() in __set_task_frozen() to trigger, most likely due to an attempt to freeze a file system that is not ready for that. Add a logical negation to the check in question to reverse it as appropriate. Fixes: a3f8f8662771 ("power: always freeze efivarfs") Cc: 6.18+ <stable@vger.kernel.org> # 6.18+ Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com> Link: https://patch.msgid.link/12788397.O9o76ZdvQC@rafael.j.wysocki Signed-off-by: Christian Brauner <brauner@kernel.org>
11 daysdrm/gem-shmem: revert the 8-byte alignment constraintLudovic Desroches1-1/+1
Using drm_mode_size_dumb() to compute the size of dumb buffers introduced an 8-byte alignment constraint on the pitch that wasn’t present before. Let’s remove this constraint, which isn’t necessarily required and may cause buffers to be allocated larger than needed. Signed-off-by: Ludovic Desroches <ludovic.desroches@microchip.com> Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Fixes: 4977dcecb931 ("drm/gem-shmem: Compute dumb-buffer sizes with drm_mode_size_dumb()") Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patch.msgid.link/20251126-lcd_pitch_alignment-v1-2-991610a1e369@microchip.com
11 daysdrm/gem-dma: revert the 8-byte alignment constraintLudovic Desroches1-1/+1
Using drm_mode_size_dumb() to compute the size of dumb buffers introduced an 8-byte alignment constraint on the pitch that wasn’t present before. Let’s remove this constraint, which isn’t necessarily required and may cause buffers to be allocated larger than needed. Signed-off-by: Ludovic Desroches <ludovic.desroches@microchip.com> Reviewed-by: Thomas Zimmermann <tzimmermann@suse.de> Fixes: dcacfcd35cef ("drm/gem-dma: Compute dumb-buffer sizes with drm_mode_size_dumb()") Signed-off-by: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patch.msgid.link/20251126-lcd_pitch_alignment-v1-1-991610a1e369@microchip.com
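The effect of the pitch constraint removed by the two reverts above can be seen with a little arithmetic. The following is only a sketch, assuming the pitch is the byte width of one scanline rounded up to a multiple of the alignment; the helper name and the sample geometry are hypothetical and are not taken from the DRM code.

```python
def dumb_buffer_size(width, height, bpp, pitch_align=1):
    """Return (pitch, size) in bytes for a width x height buffer.

    Hypothetical helper for illustration only. It is not the kernel's
    drm_mode_size_dumb() and it ignores any alignment of the total size.
    """
    bytes_per_line = (width * bpp + 7) // 8  # round bits up to whole bytes
    # Round the pitch up to the next multiple of pitch_align.
    pitch = -(-bytes_per_line // pitch_align) * pitch_align
    return pitch, pitch * height


# A 321x240 buffer at 16 bpp needs 642 bytes per scanline.
aligned = dumb_buffer_size(321, 240, 16, pitch_align=8)
unaligned = dumb_buffer_size(321, 240, 16)
print("8-byte aligned pitch:", aligned)
print("no pitch constraint: ", unaligned)
```

With these made-up dimensions the aligned pitch grows from 642 to 648 bytes, so the buffer is 1440 bytes larger than it needs to be.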
11 daysplatform/x86: asus-wmi: use brightness_set_blocking() for kbd ledAnton Khirnov1-4/+4
kbd_led_set() can sleep, and so may not be used as the brightness_set() callback. Otherwise using this led with a trigger leads to system hangs accompanied by: BUG: scheduling while atomic: acpi_fakekeyd/2588/0x00000003 CPU: 4 UID: 0 PID: 2588 Comm: acpi_fakekeyd Not tainted 6.17.9+deb14-amd64 #1 PREEMPT(lazy) Debian 6.17.9-1 Hardware name: ASUSTeK COMPUTER INC. ASUS EXPERTBOOK B9403CVAR/B9403CVAR, BIOS B9403CVAR.311 12/24/2024 Call Trace: <TASK> [...] schedule_timeout+0xbd/0x100 __down_common+0x175/0x290 down_timeout+0x67/0x70 acpi_os_wait_semaphore+0x57/0x90 [...] asus_wmi_evaluate_method3+0x87/0x190 [asus_wmi] led_trigger_event+0x3f/0x60 [...] Fixes: 9fe44fc98ce4 ("platform/x86: asus-wmi: Simplify the keyboard brightness updating process") Signed-off-by: Anton Khirnov <anton@khirnov.net> Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Denis Benato <benato.denis96@gmail.com> Link: https://patch.msgid.link/20251129101307.18085-3-anton@khirnov.net Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com> Signed-off-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
11 daysmedia: uapi: c3-isp: Fix documentation warningJacopo Mondi1-1/+1
Building htmldocs generates a warning: WARNING: include/uapi/linux/media/amlogic/c3-isp-config.h:199 error: Cannot parse struct or union! Which correctly highlights that the c3_isp_params_block_header symbol is wrongly documented as a struct while it's a plain #define instead. Fix this by removing the 'struct' identifier from the documentation of the c3_isp_params_block_header symbol. [ribalda: Add Closes:] Reported-by: Stephen Rothwell <sfr@canb.auug.org.au> Closes: https://lore.kernel.org/all/20251127131425.4b5b6644@canb.auug.org.au/ Fixes: 45662082855c ("media: uapi: Convert Amlogic C3 to V4L2 extensible params") Cc: stable@vger.kernel.org Signed-off-by: Jacopo Mondi <jacopo.mondi@ideasonboard.com> Signed-off-by: Ricardo Ribalda <ribalda@chromium.org> Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
11 daysscsi: Revert "scsi: libsas: Fix exp-attached device scan after probe failure scanned in again after probe failed"Xingui Yang1-14/+0
This reverts commit ab2068a6fb84751836a84c26ca72b3beb349619d. When probing the exp-attached sata device, libsas/libata will issue a hard reset in sas_probe_sata() -> ata_sas_async_probe(), then a broadcast event will be received after the disk probe fails, and this commit causes the probe to be re-executed on the disk, so a faulty disk may get into an indefinite probe loop. Therefore, revert this commit, although it can fix some temporary issues with disk probe failure. Signed-off-by: Xingui Yang <yangxingui@huawei.com> Reviewed-by: Jason Yan <yanaijie@huawei.com> Reviewed-by: John Garry <john.g.garry@oracle.com> Link: https://patch.msgid.link/20251202065627.140361-1-yangxingui@huawei.com Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
11 daysscsi: ufs: core: Fix RPMB link error by reversing Kconfig dependenciesBean Huo2-1/+1
When CONFIG_SCSI_UFSHCD=y and CONFIG_RPMB=m, the kernel fails to link with undefined references to ufs_rpmb_probe() and ufs_rpmb_remove(): ld: drivers/ufs/core/ufshcd.c:8950: undefined reference to `ufs_rpmb_probe' ld: drivers/ufs/core/ufshcd.c:10505: undefined reference to `ufs_rpmb_remove' The issue is that RPMB depends on its consumers (MMC, UFS) in Kconfig, which is backwards. This prevents proper module dependency handling when the library is modular but consumers are built-in. Fix by reversing the dependency: - Remove 'depends on MMC || SCSI_UFSHCD' from RPMB Kconfig - Add 'depends on RPMB || !RPMB' to SCSI_UFSHCD Kconfig This allows RPMB to be an independent library while ensuring correct linking in all module/built-in combinations. Fixes: b06b8c421485 ("scsi: ufs: core: Add OP-TEE based RPMB driver for UFS devices") Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/oe-kbuild-all/202511300443.h7sotuL0-lkp@intel.com/ Suggested-by: Arnd Bergmann <arnd@arndb.de> Cc: Bart Van Assche <bvanassche@acm.org> Cc: Jens Wiklander <jens.wiklander@linaro.org> Cc: Ulf Hansson <ulf.hansson@linaro.org> Signed-off-by: Bean Huo <beanhuo@micron.com> Reviewed-by: Arnd Bergmann <arnd@arndb.de> Reviewed-by: Jens Wiklander <jens.wiklander@linaro.org> Reviewed-by: Bart Van Assche <bvanassche@acm.org> Link: https://patch.msgid.link/20251202155138.2607210-1-beanhuo@iokpp.de Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
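The 'depends on RPMB || !RPMB' idiom is easier to follow with Kconfig's tristate arithmetic written out. Below is a small sketch, assuming the rules from the Kconfig language documentation (n, m, y map to 0, 1, 2; '!' is 2 minus the value; '||' is the maximum; a dependency caps the value the dependent symbol may take). The function names are made up for the illustration.

```python
N, M, Y = 0, 1, 2  # Kconfig tristate values: n < m < y


def t_not(a):
    """Kconfig '!': 2 minus the tristate value."""
    return Y - a


def t_or(a, b):
    """Kconfig '||': maximum of the two tristate values."""
    return max(a, b)


def ufshcd_upper_bound(rpmb):
    """Highest value SCSI_UFSHCD may take under 'depends on RPMB || !RPMB'."""
    return t_or(rpmb, t_not(rpmb))


for name, rpmb in (("n", N), ("m", M), ("y", Y)):
    bound = "nmy"[ufshcd_upper_bound(rpmb)]
    print(f"RPMB={name}: SCSI_UFSHCD can be at most {bound}")
```

The only restricted case is RPMB=m, where SCSI_UFSHCD is capped at m. That rules out the built-in consumer with a modular library that produced the link error.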
11 daysperf test: Add kallsyms split testNamhyung Kim4-0/+159
Create a fake root directory for /proc/{version,modules,kallsyms} in /tmp for testing. The kallsyms has a bad symbol in the module, which causes the main map to be split. The test ensures it only has two maps (the kernel and the module) and that it finds the initial map after the module without creating split maps like [kernel].0 and so on. $ perf test -vv "split kallsyms" 69: split kallsyms: --- start --- test child forked, pid 1016196 try to create fake root directory create kernel maps from the fake root directory maps__set_modules_path_dir: cannot open /tmp/perf-test.Zrv6Sy/lib/modules/X.Y.Z dir Problems setting modules path maps, continuing anyway... Failed to open /tmp/perf-test.Zrv6Sy/proc/kcore. Note /proc/kcore requires CAP_SYS_RAWIO capability to access. Using /tmp/perf-test.Zrv6Sy/proc/kallsyms for symbols kernel map loaded - check symbol and map ---- end(0) ---- 69: split kallsyms : Ok Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
11 daysperf tools: Use machine->root_dir to find /proc/kallsymsNamhyung Kim1-1/+7
This is for test functions to find the kallsyms correctly. It can find the machine from the kernel maps and use its root_dir. This helps set up a fake /proc directory for testing. Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
11 daysperf tools: Fallback to initial kernel map properlyNamhyung Kim1-1/+2
In maps__split_kallsyms(), it assumes new kernel map when it finds a symbol without module after any module and the initial kernel map has some symbols. Because it expects modules are out of the kernel map so modules should not have symbols in the kernel map.

For example, the following memory map shows symbols and maps. Any symbols in the module 1 area will go to the module 1. The main kernel map starts at 0xffffffffbc200000. But if any symbol has a module between the symbols in that area, next symbols after 0xffffffffbd008000 will generate new kernel maps like [kernel].1.

  kernel address     |                     |
                     |                     |
  0xffffffffc0000000 |---------------------|
                     |      (symbols)      |
                     |         ...         |  <--- [kernel].N
  0xffffffffbc400000 |---------------------|
                     |      (symbols)      |
                     |      module 2       |  <--- bad?
  0xffffffffbc380000 |---------------------|
                     |         ...         |
                     |      (symbols)      |
                     |  [kernel.kallsyms]  |  <--- initial map
  0xffffffffbc200000 |---------------------|
                     |                     |
                     |                     |
  0xffffffffabcde000 |---------------------|
                     |      (symbols)      |
                     |      module 1       |
  0xffffffffabcd0000 |---------------------|

This is very fragile when the module has a symbol that falls into the main kernel map for some reason. My system has a livepatch module with such symbols. And it created a lot of new kernel maps after those symbols. But the symbol may have broken addresses and the later symbols can still be found in the initial kernel map. Let's check the symbol address in the initial map and use it if found.

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
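The fallback described above can be sketched as follows. This is a simplified model rather than the perf implementation: the function name, the (address, module) tuples and the explicit initial map range are all assumptions made for the illustration.

```python
INITIAL = "[kernel.kallsyms]"


def place_symbols(symbols, initial_start, initial_end):
    """Assign each (address, module_or_None) symbol to a map name.

    Hypothetical model of the splitting logic: a module symbol goes to its
    module, and a non-module symbol seen right after a module would start a
    new "[kernel].N" map unless its address lies inside the initial map.
    """
    placement = []
    split_count = 0
    current = INITIAL
    prev_was_module = False
    for addr, module in symbols:
        if module is not None:
            current = module
            prev_was_module = True
        elif initial_start <= addr < initial_end:
            # The fallback: the address belongs to the initial kernel map.
            current = INITIAL
            prev_was_module = False
        elif prev_was_module:
            current = f"[kernel].{split_count}"
            split_count += 1
            prev_was_module = False
        placement.append((addr, current))
    return placement


symbols = [
    (0xFFFFFFFFABCD1000, "module 1"),
    (0xFFFFFFFFBC200100, None),
    (0xFFFFFFFFBC380100, "module 2"),  # the "bad" module symbol
    (0xFFFFFFFFBC400100, None),
]
for addr, name in place_symbols(symbols, 0xFFFFFFFFBC200000, 0xFFFFFFFFC0000000):
    print(hex(addr), name)
```

In this model the last symbol lands back in the initial map. If the range check were not there, it would open a [kernel].0 map.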
11 daysperf tools: Fix split kallsyms DSO countingNamhyung Kim1-2/+2
It's counted twice as it's increased after calling maps__insert(). I guess we want to increase it only after it's added properly. Reviewed-by: Ian Rogers <irogers@google.com> Fixes: 2e538c4a1847291cf ("perf tools: Improve kernel/modules symbol lookup") Signed-off-by: Namhyung Kim <namhyung@kernel.org>
11 daysperf tools: Mark split kallsyms DSOs as loadedNamhyung Kim1-0/+1
The maps__split_kallsyms() will split symbols to module DSOs if it comes from a module. It also handled some unusual kernel symbols after modules by creating new kernel maps like "[kernel].0". But they are pseudo DSOs to have those unexpected symbols. They should not be considered as unloaded kernel DSOs. Otherwise the dso__load() for them will end up calling dso__load_kallsyms() and then maps__split_kallsyms() again and again. Reviewed-by: Ian Rogers <irogers@google.com> Fixes: 2e538c4a1847291cf ("perf tools: Improve kernel/modules symbol lookup") Signed-off-by: Namhyung Kim <namhyung@kernel.org>
11 daysperf tools: Flush remaining samples w/o deferred callchainsNamhyung Kim1-0/+50
It's possible that some kernel samples don't have matching deferred callchain records when the profiling session was ended before the threads came back to userspace. Let's flush the samples before finishing the session. Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
11 daysperf tools: Merge deferred user callchainsNamhyung Kim11-1/+133
Save samples with deferred callchains in a separate list and deliver them after merging the user callchains. If users don't want to merge they can set tool->merge_deferred_callchains to false to prevent the behavior.

With the previous result, perf script will now show the merged callchains.

  $ perf script
  ...
  pwd    2312   121.163435:     249113 cpu/cycles/P:
          ffffffff845b78d8 __build_id_parse.isra.0+0x218 ([kernel.kallsyms])
          ffffffff83bb5bf6 perf_event_mmap+0x2e6 ([kernel.kallsyms])
          ffffffff83c31959 mprotect_fixup+0x1e9 ([kernel.kallsyms])
          ffffffff83c31dc5 do_mprotect_pkey+0x2b5 ([kernel.kallsyms])
          ffffffff83c3206f __x64_sys_mprotect+0x1f ([kernel.kallsyms])
          ffffffff845e6692 do_syscall_64+0x62 ([kernel.kallsyms])
          ffffffff8360012f entry_SYSCALL_64_after_hwframe+0x76 ([kernel.kallsyms])
              7f18fe337fa7 mprotect+0x7 (/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2)
              7f18fe330e0f _dl_sysdep_start+0x7f (/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2)
              7f18fe331448 _dl_start_user+0x0 (/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2)
  ...

The old output can be obtained using the --no-merge-callchain option. Also perf report can get the user callchain entry at the end.

  $ perf report --no-children --stdio -q -S __build_id_parse.isra.0
  # symbol: __build_id_parse.isra.0
       8.40%  pwd      [kernel.kallsyms]
              |
              ---__build_id_parse.isra.0
                 perf_event_mmap
                 mprotect_fixup
                 do_mprotect_pkey
                 __x64_sys_mprotect
                 do_syscall_64
                 entry_SYSCALL_64_after_hwframe
                 mprotect
                 _dl_sysdep_start
                 _dl_start_user

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
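As a rough illustration of the merge step, the sketch below pairs each held-back sample with the deferred user callchain carrying the same cookie and appends the user frames to the kernel frames. The dictionary layout and the function name are hypothetical; the real perf code works on event records, not on Python objects.

```python
def merge_deferred(samples, deferred):
    """Append deferred user frames to samples that carry a matching cookie.

    samples:  list of dicts with a 'callchain' list and an optional 'cookie'
    deferred: dict mapping cookie -> list of user-space frames
    Samples without a matching deferred record keep their kernel frames only.
    """
    merged = []
    for sample in samples:
        chain = list(sample["callchain"])
        user_frames = deferred.get(sample.get("cookie"))
        if user_frames is not None:
            chain.extend(user_frames)
        merged.append({**sample, "callchain": chain})
    return merged


samples = [
    {"comm": "pwd", "cookie": 0xB00000006,
     "callchain": ["__build_id_parse.isra.0", "perf_event_mmap", "do_syscall_64"]},
    {"comm": "pwd", "cookie": 0xB00000007,  # no deferred record arrived
     "callchain": ["do_syscall_64"]},
]
deferred = {0xB00000006: ["mprotect", "_dl_sysdep_start", "_dl_start_user"]}

for sample in merge_deferred(samples, deferred):
    print(sample["comm"], " <- ".join(sample["callchain"]))
```

The second sample shows the case where the session ends before the thread returns to userspace: it is still delivered, just without user frames.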
11 daysperf script: Display PERF_RECORD_CALLCHAIN_DEFERREDNamhyung Kim2-1/+93
Handle the deferred callchains in the script output.

  $ perf script
  ...
  pwd    2312   121.163435:     249113 cpu/cycles/P:
          ffffffff845b78d8 __build_id_parse.isra.0+0x218 ([kernel.kallsyms])
          ffffffff83bb5bf6 perf_event_mmap+0x2e6 ([kernel.kallsyms])
          ffffffff83c31959 mprotect_fixup+0x1e9 ([kernel.kallsyms])
          ffffffff83c31dc5 do_mprotect_pkey+0x2b5 ([kernel.kallsyms])
          ffffffff83c3206f __x64_sys_mprotect+0x1f ([kernel.kallsyms])
          ffffffff845e6692 do_syscall_64+0x62 ([kernel.kallsyms])
          ffffffff8360012f entry_SYSCALL_64_after_hwframe+0x76 ([kernel.kallsyms])
                 b00000006 (cookie) ([unknown])

  pwd    2312   121.163447: DEFERRED CALLCHAIN [cookie: b00000006]
              7f18fe337fa7 mprotect+0x7 (/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2)
              7f18fe330e0f _dl_sysdep_start+0x7f (/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2)
              7f18fe331448 _dl_start_user+0x0 (/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2)

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
11 daysperf record: Add --call-graph fp,defer option for deferred callchainsNamhyung Kim6-3/+41
Add a new callchain record mode option for deferred callchains. For now it only works with FP (frame-pointer) mode. And add the missing feature detection logic to clear the flag on old kernels. $ perf record --call-graph fp,defer -vv true ... ------------------------------------------------------------ perf_event_attr: type 0 (PERF_TYPE_HARDWARE) size 136 config 0 (PERF_COUNT_HW_CPU_CYCLES) { sample_period, sample_freq } 4000 sample_type IP|TID|TIME|CALLCHAIN|PERIOD read_format ID|LOST disabled 1 inherit 1 mmap 1 comm 1 freq 1 enable_on_exec 1 task 1 sample_id_all 1 mmap2 1 comm_exec 1 ksymbol 1 bpf_event 1 defer_callchain 1 defer_output 1 ------------------------------------------------------------ sys_perf_event_open: pid 162755 cpu 0 group_fd -1 flags 0x8 sys_perf_event_open failed, error -22 switching off deferred callchain support Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org>
11 daysipe: Update documentation for script enforcementYanzhu Huang1-3/+14
This patch adds an explanation of the script enforcement mechanism to the admin guide documentation. It describes how IPE supports integrity enforcement for indirectly executed scripts through the AT_EXECVE_CHECK flag, and how this differs from kernel enforcement for compiled executables. Signed-off-by: Yanzhu Huang <yanzhuhuang@linux.microsoft.com> Signed-off-by: Fan Wu <wufan@kernel.org>
11 daysipe: Add AT_EXECVE_CHECK support for script enforcementYanzhu Huang4-0/+32
This patch adds a new ipe_bprm_creds_for_exec() hook that integrates with the AT_EXECVE_CHECK mechanism. To enable script enforcement, interpreters need to incorporate the AT_EXECVE_CHECK flag when calling execveat() on script files before execution. When a userspace interpreter calls execveat() with the AT_EXECVE_CHECK flag, this hook triggers IPE policy evaluation on the script file. The hook only triggers IPE when bprm->is_check is true, ensuring it's being called from an AT_EXECVE_CHECK context. It then builds an evaluation context for an IPE_OP_EXEC operation and invokes IPE policy. The kernel returns the policy decision to the interpreter, which can then decide whether to proceed with script execution. This extends IPE enforcement to indirectly executed scripts, permitting trusted scripts to execute while denying untrusted ones. Signed-off-by: Yanzhu Huang <yanzhuhuang@linux.microsoft.com> Signed-off-by: Fan Wu <wufan@kernel.org>