aboutsummaryrefslogtreecommitdiffstats
AgeCommit message (Collapse)AuthorFilesLines
2025-11-28bpf: convert bpf_iter_new_fd() to FD_PREPARE()Christian Brauner1-21/+8
Link: https://patch.msgid.link/20251123-work-fd-prepare-v4-23-b6efa1706cfd@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-28ipc: convert do_mq_open() to FD_ADD()Christian Brauner1-29/+25
Christian Brauner <brauner@kernel.org> says: The fix sent in [1] was squashed into this commit. Fixes: https://lore.kernel.org/c41de645-8234-465f-a3be-f0385e3a163c@sirena.org.uk [1] Reported-by: Mark Brown <broonie@kernel.org> [1] Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> [1] Link: https://patch.msgid.link/20251123-work-fd-prepare-v4-22-b6efa1706cfd@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-28exec: convert begin_new_exec() to FD_ADD()Christian Brauner1-2/+1
Link: https://patch.msgid.link/20251123-work-fd-prepare-v4-21-b6efa1706cfd@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-28af_unix: convert unix_file_open() to FD_ADD()Christian Brauner1-15/+1
Link: https://patch.msgid.link/20251123-work-fd-prepare-v4-19-b6efa1706cfd@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-28dma: convert dma_buf_fd() to FD_ADD()Christian Brauner1-9/+1
Link: https://patch.msgid.link/20251123-work-fd-prepare-v4-18-b6efa1706cfd@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-28xfs: convert xfs_open_by_handle() to FD_PREPARE()Christian Brauner1-39/+17
Link: https://patch.msgid.link/20251123-work-fd-prepare-v4-17-b6efa1706cfd@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-28userfaultfd: convert new_userfaultfd() to FD_PREPARE()Christian Brauner1-20/+10
Link: https://patch.msgid.link/20251123-work-fd-prepare-v4-16-b6efa1706cfd@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-28timerfd: convert timerfd_create() to FD_ADD()Christian Brauner1-20/+9
Link: https://patch.msgid.link/20251123-work-fd-prepare-v4-15-b6efa1706cfd@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-28signalfd: convert do_signalfd4() to FD_ADD()Christian Brauner1-18/+11
Link: https://patch.msgid.link/20251123-work-fd-prepare-v4-14-b6efa1706cfd@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-28open: convert do_sys_openat2() to FD_ADD()Christian Brauner1-14/+3
Link: https://patch.msgid.link/20251123-work-fd-prepare-v4-13-b6efa1706cfd@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-28eventpoll: convert do_epoll_create() to FD_PREPARE()Christian Brauner1-22/+10
Link: https://patch.msgid.link/20251123-work-fd-prepare-v4-12-b6efa1706cfd@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-28autofs: convert autofs_dev_ioctl_open_mountpoint() to FD_ADD()Christian Brauner1-24/+6
Link: https://patch.msgid.link/20251123-work-fd-prepare-v4-11-b6efa1706cfd@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-28nsfs: convert ns_ioctl() to FD_PREPARE()Christian Brauner1-23/+12
Link: https://patch.msgid.link/20251123-work-fd-prepare-v4-10-b6efa1706cfd@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-28nsfs: convert open_namespace() to FD_PREPARE()Christian Brauner1-11/+1
Link: https://patch.msgid.link/20251123-work-fd-prepare-v4-9-b6efa1706cfd@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-28fanotify: convert fanotify_init() to FD_PREPARE()Christian Brauner1-38/+22
Christian Brauner <brauner@kernel.org> says: The fix sent in [1] was squashed into this commit. Link: https://lore.kernel.org/20251127201618.2115275-1-kuniyu@google.com [1] Reported-by: syzbot+321168dfa622eda99689@syzkaller.appspotmail.com Closes: https://lore.kernel.org/lkml/6928b121.a70a0220.d98e3.0110.GAE@google.com Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Link: https://patch.msgid.link/20251123-work-fd-prepare-v4-8-b6efa1706cfd@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-28namespace: convert fsmount() to FD_PREPARE()Christian Brauner1-40/+30
Christian Brauner <brauner@kernel.org> says: A variant of the fix sent in [1] was squashed into this commit. Link: https://lore.kernel.org/20251128035149.392402-1-kartikey406@gmail.com [1] Reported-by: Deepanshu Kartikey <kartikey406@gmail.com> Reported-by: syzbot+94048264da5715c251f9@syzkaller.appspotmail.com Tested-by: syzbot+94048264da5715c251f9@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=94048264da5715c251f9 Link: https://patch.msgid.link/20251123-work-fd-prepare-v4-7-b6efa1706cfd@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-28namespace: convert open_tree_attr() to FD_PREPARE()Christian Brauner1-13/+6
Link: https://patch.msgid.link/20251123-work-fd-prepare-v4-6-b6efa1706cfd@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-28namespace: convert open_tree() to FD_ADD()Christian Brauner1-13/+1
Link: https://patch.msgid.link/20251123-work-fd-prepare-v4-5-b6efa1706cfd@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-28fhandle: convert do_handle_open() to FD_ADD()Christian Brauner1-17/+13
Link: https://patch.msgid.link/20251123-work-fd-prepare-v4-4-b6efa1706cfd@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-28eventfd: convert do_eventfd() to FD_PREPARE()Christian Brauner1-20/+11
Link: https://patch.msgid.link/20251123-work-fd-prepare-v4-3-b6efa1706cfd@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-28anon_inodes: convert to FD_ADD()Christian Brauner1-21/+2
Link: https://patch.msgid.link/20251123-work-fd-prepare-v4-2-b6efa1706cfd@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-28file: add FD_{ADD,PREPARE}()Christian Brauner2-0/+133
I've been playing with this to allow for moderately flexible usage of the get_unused_fd_flags() + create file + fd_install() pattern that's used quite extensively. How callers allocate files is really heterogenous so it's not really convenient to fold them into a single class. It's possibe to split them into subclasses like for anon inodes. I think that's not necessarily nice as well. My take is to add two primites: (1) FD_ADD() the simple cases a file is installed: fd = FD_ADD(O_CLOEXEC, open_file(some, args))); if (fd >= 0) kvm_get_kvm(vcpu->kvm); return fd; (2) FD_PREPARE() that captures all the cases where access to fd or file or additional work before publishing the fd is needed: FD_PREPARE(fdf, open_flag, file_open_handle(&path, open_flag)); if (fdf.err) return fdf.err; if (copy_to_user(/* something something */)) return -EFAULT; return fd_publish(fdf); I've converted all of the easy cases over to it and it gets rid of an aweful lot of convoluted cleanup logic. It's centered around struct fd_prepare. FD_PREPARE() encapsulates all of allocation and cleanup logic and must be followed by a call to fd_publish() which associates the fd with the file and installs it into the callers fdtable. If fd_publish() isn't called both are deallocated. It mandates a specific order namely that first we allocate the fd and then instantiate the file. But that shouldn't be a problem nearly everyone I've converted uses this exact pattern anyway. There's a bunch of additional cases where it would be easy to convert them to this pattern. For example, the whole sync file stuff in dma currently retains the containing structure of the file instead of the file itself even though it's only used to allocate files. Changing that would make it fall into the FD_PREPARE() pattern easily. I've not done that work yet. There's room for extending this in a way that wed'd have subclasses for some particularly often use patterns but as I said I'm not even sure that's worth it. Link: https://patch.msgid.link/20251123-work-fd-prepare-v4-1-b6efa1706cfd@kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-28afs: Fix delayed allocation of a cell's anonymous keyDavid Howells3-43/+49
The allocation of a cell's anonymous key is done in a background thread along with other cell setup such as doing a DNS upcall. In the reported bug, this is triggered by afs_parse_source() parsing the device name given to mount() and calling afs_lookup_cell() with the name of the cell. The normal key lookup then tries to use the key description on the anonymous authentication key as the reference for request_key() - but it may not yet be set and so an oops can happen. This has been made more likely to happen by the fix for dynamic lookup failure. Fix this by firstly allocating a reference name and attaching it to the afs_cell record when the record is created. It can share the memory allocation with the cell name (unfortunately it can't just overlap the cell name by prepending it with "afs@" as the cell name already has a '.' prepended for other purposes). This reference name is then passed to request_key(). Secondly, the anon key is now allocated on demand at the point a key is requested in afs_request_key() if it is not already allocated. A mutex is used to prevent multiple allocation for a cell. Thirdly, make afs_request_key_rcu() return NULL if the anonymous key isn't yet allocated (if we need it) and then the caller can return -ECHILD to drop out of RCU-mode and afs_request_key() can be called. Note that the anonymous key is kind of necessary to make the key lookup cache work as that doesn't currently cache a negative lookup, but it's probably worth some investigation to see if NULL can be used instead. Fixes: 330e2c514823 ("afs: Fix dynamic lookup to fail on cell lookup failure") Reported-by: syzbot+41c68824eefb67cdf00c@syzkaller.appspotmail.com Signed-off-by: David Howells <dhowells@redhat.com> Link: https://patch.msgid.link/800328.1764325145@warthog.procyon.org.uk cc: Marc Dionne <marc.dionne@auristor.com> cc: linux-afs@lists.infradead.org cc: linux-fsdevel@vger.kernel.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-28ovl: remove unneeded semicolonChen Ni1-1/+1
Remove unnecessary semicolons reported by Coccinelle/coccicheck and the semantic patch at scripts/coccinelle/misc/semicolon.cocci. Signed-off-by: Chen Ni <nichen@iscas.ac.cn> Fixed: 7ab96df840e60 ("VFS/nfsd/cachefiles/ovl: add start_creating() and end_creating()") Signed-off-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-28vfs: add needed headers for new struct delegation definitionJeff Layton1-0/+5
The definition of struct delegation uses stdint.h integer types. Add the necessary headers to ensure that always works. Fixes: 1602bad16d7d ("vfs: expose delegation support to userland") Signed-off-by: Jeff Layton <jlayton@kernel.org> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-28ovl: fail ovl_lock_rename_workdir() if either target is unhashedNeilBrown1-2/+2
As well as checking that the parent hasn't changed after getting the lock we need to check that the dentry hasn't been unhashed. Otherwise we might try to rename something that has been removed. Reported-by: syzbot+bfc9a0ccf0de47d04e8c@syzkaller.appspotmail.com Fixes: d2c995581c7c ("ovl: Call ovl_create_temp() without lock held.") Signed-off-by: NeilBrown <neil@brown.name> Link: https://patch.msgid.link/176429295510.634289.1552337113663461690@noble.neil.brown.name Tested-by: syzbot+bfc9a0ccf0de47d04e8c@syzkaller.appspotmail.com Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-28dcache: touch up predicts in __d_lookup_rcu()Mateusz Guzik1-3/+12
Rationale is that if the parent dentry is the same and the length is the same, then you have to be unlucky for the name to not match. At the same time the dentry was literally just found on the hash, so you have to be even more unlucky to determine it is unhashed. While here add commentary while d_unhashed() is necessary. It was already removed once and brought back in: 2e321806b681b192 ("Revert "vfs: remove unnecessary d_unhashed() check from __d_lookup_rcu"") Signed-off-by: Mateusz Guzik <mjguzik@gmail.com> Link: https://patch.msgid.link/20251127131526.4137768-1-mjguzik@gmail.com Reviewed-by: Jan Kara <jack@suse.cz> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-28filelock: __fcntl_getlease: fix kernel-doc warningsRandy Dunlap1-1/+2
Use the correct function name and add description for the @flavor parameter to avoid these kernel-doc warnings: Warning: fs/locks.c:1706 function parameter 'flavor' not described in '__fcntl_getlease' WARNING: fs/locks.c:1706 expecting prototype for fcntl_getlease(). Prototype was for __fcntl_getlease() instead Fixes: 1602bad16d7d ("vfs: expose delegation support to userland") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Link: https://patch.msgid.link/20251128000826.457120-1-rdunlap@infradead.org Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-28nfsd: fix end_creating() conversionNeil Brown2-4/+5
Avoid a double-unlock as nfs_create_locked() will have unlocked the parent and do the dput() manually. Christian Brauner <brauner@kernel.org> says: I've taken Neil's proposed fix from [1] and added a commit message. Fixes: https://lore.kernel.org/202511252132.2c621407-lkp@intel.com [1] Fixes: bd6ede8a06e8 ("VFS/nfsd/cachefiles/ovl: introduce start_removing() and end_removing()") Signed-off-by: Neil Brown <neil@brown.name> Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-27Merge tag 'drm-fixes-2025-11-28' of https://gitlab.freedesktop.org/drm/kernelLinus Torvalds20-65/+85
Pull drm fixes from Dave Airlie: "Last one for this round hopefully, mostly the usual suspects, xe/amdgpu, with some single fixes otherwise. There is one amdgpu HDMI blackscreen bug that came in late in the cycle, but it was bisected and the revert is in here. i915: - Reject async flips when PSR's selective fetch is enabled xe: - Fix resource leak in xe_guc_ct_init_noalloc()'s error path - Fix stack_depot usage without STACKDEPOT_ALWAYS_INIT - Fix overflow in conversion from clock tics to msec amdgpu: - Unified MES fix - HDMI fix - Cursor fix - Bightness fix - EDID reading improvement - UserQ fix - Cyan Skillfish IP discovery fix bridge: - sil902x: Fix HDMI detection imagination: - Update documentation sti: - Fix leaks in probe vga_switcheroo: - Avoid race condition during fbcon initialization" * tag 'drm-fixes-2025-11-28' of https://gitlab.freedesktop.org/drm/kernel: drm/amdgpu: fix cyan_skillfish2 gpu info fw handling drm/amdgpu: attach tlb fence to the PTs update drm/amd/display: Increase EDID read retries drm/amd/display: Don't change brightness for disabled connectors drm/amd/display: Check NULL before accessing Revert "drm/amd/display: Move setup_stream_attribute" drm/xe: Fix conversion from clock ticks to milliseconds drm/xe/guc: Fix stack_depot usage drm/xe/guc: Fix resource leak in xe_guc_ct_init_noalloc() drm/i915/psr: Reject async flips when selective fetch is enabled drm, fbcon, vga_switcheroo: Avoid race condition in fbcon setup drm/amd/amdgpu: reserve vm invalidation engine for uni_mes drm: sti: fix device leaks at component probe drm/imagination: Document pvr_device.power member drm/bridge: sii902x: Fix HDMI detection with DRM_BRIDGE_ATTACH_NO_CONNECTOR
2025-11-27Merge branch 'bnxt_en-updates-for-net-next'Jakub Kicinski6-18/+99
Michael Chan says: ==================== bnxt_en: Updates for net-next (part) This series includes an enhnacement to the priority TX counters, an enhancement to a PHY module error extack message, cleanup of unneeded MSIX logic in bnxt_ulp.c, adding CQ dump during TX timeout, LRO/HW_GRO performance improvement by enabling Relaxed Ordering, and improved SRIOV admin link state support. ==================== Link: https://patch.msgid.link/20251126215648.1885936-1-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-27bnxt_en: Add Virtual Admin Link State Support for VFsRob Miller3-5/+57
The firmware can now cache the virtual link admin state (auto/on/off) of all VFs and as such, the PF driver no longer has to intercept the VF driver's port_phy_qcfg() call and then provide the link admin state. If the FW does not have this capability, fall back to the existing interception method. The initial default link admin state (auto) is also set initially when the VFs are created. Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Reviewed-by: Mohammad Shuab Siddique <mohammad-shuab.siddique@broadcom.com> Signed-off-by: Rob Miller <rmiller@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://patch.msgid.link/20251126215648.1885936-7-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-27bnxt_en: Do not set EOP on RX AGG BDs on 5760X chipsMichael Chan2-1/+9
With End-of-Packet padding (EOP) set, the chip will disable Relaxed Ordering (RO) of TPA data packets. A TPA segment with EOP set will be padded to the next cache boundary and can potentially overwrite the beginning bytes of the next TPA segment when RO is enabled on 5760X. To prevent that, the chip disables RO for TPA when EOP is set. To take advantge of RO and higher performance, do not set EOP on 5760X chips when TPA is enabled. Define a proper RX_BD_FLAGS_AGG_EOP constant to make it clear that we are setting EOP. Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://patch.msgid.link/20251126215648.1885936-6-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-27bnxt_en: Add CQ ring dump to bnxt_dump_cp_sw_state()Michael Chan1-2/+10
On newer chips that use NQs and CQs, add the CQ ring dump to bnxt_dump_cp_sw_state() to make it more complete. Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://patch.msgid.link/20251126215648.1885936-5-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-27bnxt_en: Remove the redundant BNXT_EN_FLAG_MSIX_REQUESTED flagKalesh AP2-6/+2
MSIX is always requested when the RoCE driver calls bnxt_register_dev(). We already check bnxt_ulp_registered(), so checking the flag is redundant. It was a left-over flag after converting to auxbus, so remove it. Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com> Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://patch.msgid.link/20251126215648.1885936-4-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-27bnxt_en: Enhance log message in bnxt_get_module_status()Gautam R A1-0/+5
Rturn early with -EOPNOTSUPP and an extack message if the PHY type is BaseT since module status is not available for BaseT. Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com> Signed-off-by: Gautam R A <gautam-r.a@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://patch.msgid.link/20251126215648.1885936-3-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-27bnxt_en: Enhance TX pri countersMichael Chan3-4/+16
The priority packet and byte counters in ethtool -S are returned by the driver based on the pri2cos mapping. The assumption is that each priority is mapped to one and only one hardware CoS queue. In a special RoCE configuration, the FW uses combined CoS queue 0 and CoS queue 1 for the priority mapped to CoS queue 0. In this special case, we need to add the CoS queue 0 and CoS queue 1 counters for the priority packet and byte counters. Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com> Signed-off-by: Michael Chan <michael.chan@broadcom.com> Link: https://patch.msgid.link/20251126215648.1885936-2-michael.chan@broadcom.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-27Merge branch ↵Jakub Kicinski13-65/+47
'intel-wired-lan-driver-updates-2025-11-25-ice-idpf-iavf-ixgbe-ixgbevf-e1000e' Tony Nguyen says: ==================== Intel Wired LAN Driver Updates 2025-11-25 (ice, idpf, iavf, ixgbe, ixgbevf, e1000e) Natalia cleans up ixgbevf_q_vector struct removing an unused field. Emil converts vport state tracking from enum to bitmap and removes unneeded states for idpf. Tony removes an unneeded check from e1000e. Alok Tiwari removes an unnecessary second call to ixgbe_non_sfp_link_config() and adjusts the checked member, in idpf, to reflect the member that is later used. He also fixes various typos and messages for better clarity misc Intel drivers. ==================== Link: https://patch.msgid.link/20251125223632.1857532-1-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-27iavf: clarify VLAN add/delete log messages and lower log levelAlok Tiwari1-4/+8
The current dev_warn messages for too many VLAN changes are confusing and one place incorrectly references "add" instead of "delete" VLANs due to copy-paste errors. - Use dev_info instead of dev_warn to lower the log level. - Rephrase the message to: "virtchnl: Too many VLAN [add|delete] ([v1|v2]) requests; splitting into multiple messages to PF\n". Suggested-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20251125223632.1857532-12-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-27ice: fix comment typo and correct module format stringAlok Tiwari2-2/+2
- Fix a typo in the ice_fdir_has_frag() kernel-doc comment ("is" -> "if") - Correct the NVM erase error message format string from "0x02%x" to "0x%02x" so the module value is printed correctly. Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20251125223632.1857532-11-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-27idpf: correct queue index in Rx allocation error messagesAlok Tiwari1-4/+4
The error messages in idpf_rx_desc_alloc_all() used the group index i when reporting memory allocation failures for individual Rx and Rx buffer queues. This is incorrect. Update the messages to use the correct queue index j and include the queue group index i for clearer identification of the affected Rx and Rx buffer queues. Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com> Reviewed-by: Simon Horman <horms@kernel.org> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20251125223632.1857532-10-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-27idpf: use desc_ring when checking completion queue DMA allocationAlok Tiwari1-1/+1
idpf_compl_queue uses a union for comp, comp_4b, and desc_ring. The release path should check complq->desc_ring to determine whether the DMA descriptor ring is allocated. The current check against comp works but is leftover from a previous commit and is misleading in this context. Switching the check to desc_ring improves readability and more directly reflects the intended meaning, since desc_ring is the field representing the allocated DMA-backed descriptor ring. No functional change. Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20251125223632.1857532-9-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-27ixgbe: avoid redundant call to ixgbe_non_sfp_link_config()Alok Tiwari1-2/+2
ixgbe_non_sfp_link_config() is called twice in ixgbe_open() once to assign its return value to err and again in the conditional check. This patch uses the stored err value instead of calling the function a second time. This avoids redundant work and ensures consistent error reporting. Also fix a small typo in the ixgbe_remove() comment: "The could be caused" -> "This could be caused". Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com> Reviewed-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20251125223632.1857532-8-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-27e1000e: Remove unneeded checksTony Nguyen1-5/+1
The caller, ethtool_set_eeprom(), already performs the same checks so these are unnecessary in the driver. This reverts commit 90fb7db49c6d ("e1000e: fix heap overflow in e1000_set_eeprom"), however, corrections for RCT have been kept. Link: https://lore.kernel.org/all/db92fcc8-114d-4e85-9d15-7860545bc65e@suse.de/ Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20251125223632.1857532-7-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-27idpf: convert vport state to bitmapEmil Tantilov7-30/+28
Convert vport state to a bitmap and remove the DOWN state which is redundant in the existing logic. There are no functional changes aside from the use of bitwise operations when setting and checking the states. Removed the double underscore to be consistent with the naming of other bitmaps in the header and renamed current_state to vport_is_up to match the meaning of the new variable. Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Chittim Madhu <madhu.chittim@intel.com> Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com> Tested-by: Samuel Salin <Samuel.salin@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20251125223632.1857532-6-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-27ixgbevf: ixgbevf_q_vector clean upNatalia Wochtman1-17/+1
Flex array should be at the end of the structure and use [] syntax Remove unused fields of ixgbevf_q_vector. They aren't used since busy poll was moved to core code in commit 508aac6dee02 ("ixgbevf: get rid of custom busy polling code"). Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Signed-off-by: Natalia Wochtman <natalia.wochtman@intel.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Tested-by: Rafal Romanowski <rafal.romanowski@intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com> Link: https://patch.msgid.link/20251125223632.1857532-5-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-27dibs: Remove KMSG_COMPONENT macroHeiko Carstens1-2/+1
The KMSG_COMPONENT macro is a leftover of the s390 specific "kernel message catalog" from 2008 [1] which never made it upstream. The macro was added to s390 code to allow for an out-of-tree patch which used this to generate unique message ids. Also this out-of-tree doesn't exist anymore. The pattern of how the KMSG_COMPONENT is used was partially also used for non s390 specific code, for whatever reasons. Remove the macro in order to get rid of a pointless indirection. [1] https://lwn.net/Articles/292650/ Signed-off-by: Heiko Carstens <hca@linux.ibm.com> Acked-by: Alexandra Winter <wintera@linux.ibm.com> Link: https://patch.msgid.link/20251126142242.2124317-1-hca@linux.ibm.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-28Merge tag 'drm-xe-fixes-2025-11-27' of ↵Dave Airlie2-12/+10
https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes Driver Changes: - Fix resource leak in xe_guc_ct_init_noalloc()'s error path (Shuicheng Lin) - Fix stack_depot usage without STACKDEPOT_ALWAYS_INIT (Lucas) - Fix overflow in conversion from clock tics to msec (Harish Chegondi) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patch.msgid.link/7ejiqjgthpqybg5svmkind2pszk4fqadxuq7rngchaaw76iept@5pn6sngqj6lk
2025-11-27net: thunder: convert to use .get_rx_ring_countBreno Leitao1-13/+3
Convert the Cavium Thunder NIC VF driver to use the new .get_rx_ring_count ethtool operation instead of implementing .get_rxnfc solely for handling ETHTOOL_GRXRINGS command. This simplifies the code by removing the switch statement and replacing it with a direct return of the queue count. The new callback provides the same functionality in a more direct way, following the ongoing ethtool API modernization. Signed-off-by: Breno Leitao <leitao@debian.org> Link: https://patch.msgid.link/20251126-gxring_cavium-v1-1-a066c0c9e0c6@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-27net: stmmac: fix rx limit check in stmmac_rx_zc()Alexey Kodanev1-1/+1
The extra "count >= limit" check in stmmac_rx_zc() is redundant and has no effect because the value of "count" doesn't change after the while condition at this point. However, it can change after "read_again:" label: while (count < limit) { ... if (count >= limit) break; read_again: ... /* XSK pool expects RX frame 1:1 mapped to XSK buffer */ if (likely(status & rx_not_ls)) { xsk_buff_free(buf->xdp); buf->xdp = NULL; dirty++; count++; goto read_again; } ... This patch addresses the same issue previously resolved in stmmac_rx() by commit fa02de9e7588 ("net: stmmac: fix rx budget limit check"). The fix is the same: move the check after the label to ensure that it bounds the goto loop. Fixes: bba2556efad6 ("net: stmmac: Enable RX via AF_XDP zero-copy") Signed-off-by: Alexey Kodanev <aleksei.kodanev@bell-sw.com> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/20251126104327.175590-1-aleksei.kodanev@bell-sw.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-28Merge tag 'drm-misc-fixes-2025-11-27' of ↵Dave Airlie5-27/+31
https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes Short summary of fixes pull: bridge: - sil902x: Fix HDMI detection imagination: - Update documentation sti: - Fix leaks in probe vga_switcheroo: - Avoid race condition during fbcon initialization Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patch.msgid.link/20251127081007.GA13578@2a02-2454-fd5e-fd00-689d-32c0-780c-bb87.dyn6.pyur.net
2025-11-28Merge tag 'amd-drm-fixes-6.18-2025-11-26' of ↵Dave Airlie11-20/+36
https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-6.18-2025-11-26: amdgpu: - Unified MES fix - HDMI fix - Cursor fix - Bightness fix - EDID reading improvement - UserQ fix - Cyan Skillfish IP discovery fix Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patch.msgid.link/20251126204925.3316684-1-alexander.deucher@amd.com
2025-11-27Merge branch 'net-dsa-yt921x-fix-parsing-mib-attributes'Jakub Kicinski2-55/+104
David Yang says: ==================== net: dsa: yt921x: Fix parsing MIB attributes ==================== Link: https://patch.msgid.link/20251126084024.2843851-1-mmyangfl@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-27net: dsa: yt921x: Use macros for MIB locationsDavid Yang2-54/+103
Extract MIB constants into the header file to improve code style. This patch will not change the behavior of the function. Signed-off-by: David Yang <mmyangfl@gmail.com> Link: https://patch.msgid.link/20251126084024.2843851-3-mmyangfl@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-27net: dsa: yt921x: Fix parsing MIB attributesDavid Yang1-11/+11
There are hard-to-find unused fields in the MIB table I didn't notice in the example driver code, causing wrong interpretation of the MIB data. For some 64-bit attributes, the current (wrong) implementation took the correct lower 32 bits, but messed up the upper 32 bits, so it would work accidentally until 32-bit overflows happen. Fix that too. Fixes: 186623f4aa72 ("net: dsa: yt921x: Add support for Motorcomm YT921x") Signed-off-by: David Yang <mmyangfl@gmail.com> Link: https://patch.msgid.link/20251126084024.2843851-2-mmyangfl@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-27r8169: add DASH support for RTL8127APJaven Xu1-0/+1
This adds DASH support for chip RTL8127AP. Its mac version is RTL_GIGA_MAC_VER_80. DASH is a standard for remote management of network device, allowing out-of-band control. Signed-off-by: Javen Xu <javen_xu@realsil.com.cn> Link: https://patch.msgid.link/20251126055950.2050-1-javen_xu@realsil.com.cn Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-27net: vxlan: prevent NULL deref in vxlan_xmit_oneAntoine Tenart1-3/+15
Neither sock4 nor sock6 pointers are guaranteed to be non-NULL in vxlan_xmit_one, e.g. if the iface is brought down. This can lead to the following NULL dereference: BUG: kernel NULL pointer dereference, address: 0000000000000010 Oops: Oops: 0000 [#1] SMP NOPTI RIP: 0010:vxlan_xmit_one+0xbb3/0x1580 Call Trace: vxlan_xmit+0x429/0x610 dev_hard_start_xmit+0x55/0xa0 __dev_queue_xmit+0x6d0/0x7f0 ip_finish_output2+0x24b/0x590 ip_output+0x63/0x110 Mentioned commits changed the code path in vxlan_xmit_one and as a side effect the sock4/6 pointer validity checks in vxlan(6)_get_route were lost. Fix this by adding back checks. Since both commits being fixed were released in the same version (v6.7) and are strongly related, bundle the fixes in a single commit. Reported-by: Liang Li <liali@redhat.com> Fixes: 6f19b2c136d9 ("vxlan: use generic function for tunnel IPv4 route lookup") Fixes: 2aceb896ee18 ("vxlan: use generic function for tunnel IPv6 route lookup") Cc: Beniamino Galvani <b.galvani@gmail.com> Signed-off-by: Antoine Tenart <atenart@kernel.org> Reviewed-by: Ido Schimmel <idosch@nvidia.com> Tested-by: Ido Schimmel <idosch@nvidia.com> Link: https://patch.msgid.link/20251126102627.74223-1-atenart@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-27iavf: Implement settime64 with -EOPNOTSUPPMichal Schmidt1-0/+7
ptp_clock_settime() assumes every ptp_clock has implemented settime64(). Stub it with -EOPNOTSUPP to prevent a NULL dereference. The fix is similar to commit 329d050bbe63 ("gve: Implement settime64 with -EOPNOTSUPP"). Fixes: d734223b2f0d ("iavf: add initial framework for registering PTP clock") Signed-off-by: Michal Schmidt <mschmidt@redhat.com> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Reviewed-by: Tim Hostetler <thostet@google.com> Link: https://patch.msgid.link/20251126094850.2842557-1-mschmidt@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-27if_ether.h: Clarify ethertype validity for gsw1xx dsaPeter Enderborg1-1/+3
This 0x88C3 is registered to Infineon Technologies Corporate Research ST and are used by MaxLinear. Infineon made a spin off called Lantiq. Lantiq was acquired by Intel MaxLinear acquired Intels Connected Home division. The product FAQ from MaxLinear describes it's history from the F24S. The driver for the gsw1xx is based on Lantiq showing it's similarities. Ref https://standards-oui.ieee.org/ethertype/eth.txt Signed-off-by: Peter Enderborg <Peter.Enderborg@axis.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-27net: wwan: mhi_wwan_mbim: Avoid -Wflex-array-member-not-at-end warningGustavo A. R. Silva1-8/+9
Use DEFINE_RAW_FLEX() to avoid a -Wflex-array-member-not-at-end warning. Remove fixed-size array struct usb_cdc_ncm_dpe16 dpe16[2]; from struct mbim_tx_hdr, so that flex-array member struct mbim_tx_hdr::ndp16.dpe16[] ends last in this structure. Compensate for this by using the DEFINE_RAW_FLEX() helper to declare the on-stack struct instance that contains struct usb_cdc_ncm_ndp16 as a member. Adjust the rest of the code, accordingly. So, with these changes fix the following warning: drivers/net/wwan/mhi_wwan_mbim.c:81:34: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end] Reviewed-by: Loic Poulain <loic.poulain@oss.qualcomm.com> Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-27gve: Fix race condition on tx->dropped_pkt updateMax Yuan2-0/+8
The tx->dropped_pkt counter is a 64-bit integer that is incremented directly. On 32-bit architectures, this operation is not atomic and can lead to read/write tearing if a reader accesses the counter during the update. This can result in incorrect values being reported for dropped packets. To prevent this potential data corruption, wrap the increment operation with u64_stats_update_begin() and u64_stats_update_end(). This ensures that updates to the 64-bit counter are atomic, even on 32-bit systems, by using a sequence lock. The u64_stats_sync API requires the writer to have exclusive access, which is already provided in this context by the network stack's serialization of the transmit path (net_device_ops::ndo_start_xmit [1]) for a given queue. [1]: https://www.kernel.org/doc/Documentation/networking/netdevices.txt Signed-off-by: Max Yuan <maxyuan@google.com> Reviewed-by: Jordan Rhee <jordanrhee@google.com> Signed-off-by: Harshitha Ramamurthy <hramamurthy@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-27net: restore napi_consume_skb()'s NULL-handlingJakub Kicinski1-1/+1
Commit e20dfbad8aab ("net: fix napi_consume_skb() with alien skbs") added a skb->cpu check to napi_consume_skb(), before the point where napi_consume_skb() validated skb is not NULL. Add an explicit check to the early exit condition. Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-27eth: bnxt: make use of napi_consume_skb()Jakub Kicinski1-1/+1
As those following recent changes from Eric know very well using NAPI skb cache is crucial to achieve good perf, at least on recent AMD platforms. Make sure bnxt feeds the skb cache with Tx skbs. Reviewed-by: Michael Chan <michael.chan@broadcom.com> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-27netmem, devmem, tcp: access pp fields through @desc in net_iovByungchul Park3-6/+6
Convert all the legacy code directly accessing the pp fields in net_iov to access them through @desc in net_iov. Signed-off-by: Byungchul Park <byungchul@sk.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-27Merge tag 'dma-mapping-6.18-2025-11-27' of ↵Linus Torvalds2-2/+4
git://git.kernel.org/pub/scm/linux/kernel/git/mszyprowski/linux Pull dma-mapping fixes from Marek Szyprowski: "Two last minute fixes for the recently modified DMA API infrastructure: - proper handling of DMA_ATTR_MMIO in dma_iova_unlink() function (me) - regression fix for the code refactoring related to P2PDMA (Pranjal Shrivastava)" * tag 'dma-mapping-6.18-2025-11-27' of git://git.kernel.org/pub/scm/linux/kernel/git/mszyprowski/linux: dma-direct: Fix missing sg_dma_len assignment in P2PDMA bus mappings iommu/dma: add missing support for DMA_ATTR_MMIO for dma_iova_unlink()
2025-11-27Merge tag 'acpi-6.18-rc8-2' of ↵Linus Torvalds1-1/+1
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI fix from Rafael Wysocki: "One more urgent ACPI support fix for 6.18 There is one more commit that needs to be reverted after reverting problematic commit 7a8c994cbb2d ("ACPI: processor: idle: Optimize ACPI idle driver registration"), so revert it" * tag 'acpi-6.18-rc8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: Revert "ACPI: processor: Update cpuidle driver check in __acpi_processor_start()"
2025-11-28Merge tag 'drm-intel-fixes-2025-11-26' of ↵Dave Airlie2-6/+8
https://gitlab.freedesktop.org/drm/i915/kernel into drm-fixes - Reject async flips when PSR's selective fetch is enabled (Ville) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patch.msgid.link/aScgY8QMjmyJRBX2@intel.com
2025-11-28netfilter: nf_tables: improve UAPI kernel-doc commentsRandy Dunlap1-7/+7
In include/uapi/linux/netfilter/nf_tables.h, correct the kernel-doc comments for mistyped enum names and enum values to avoid these kernel-doc warnings and improve the documentation: nf_tables.h:896: warning: Enum value 'NFT_EXTHDR_OP_TCPOPT' not described in enum 'nft_exthdr_op' nf_tables.h:896: warning: Excess enum value 'NFT_EXTHDR_OP_TCP' description in 'nft_exthdr_op' nf_tables.h:1210: warning: expecting prototype for enum nft_flow_attributes. Prototype was for enum nft_offload_attributes instead nf_tables.h:1428: warning: expecting prototype for enum nft_reject_code. Prototype was for enum nft_reject_inet_code instead (add beginning '@' to each enum value description:) nf_tables.h:1493: warning: Enum value 'NFTA_TPROXY_FAMILY' not described in enum 'nft_tproxy_attributes' nf_tables.h:1493: warning: Enum value 'NFTA_TPROXY_REG_ADDR' not described in enum 'nft_tproxy_attributes' nf_tables.h:1493: warning: Enum value 'NFTA_TPROXY_REG_PORT' not described in enum 'nft_tproxy_attributes' nf_tables.h:1796: warning: expecting prototype for enum nft_device_attributes. Prototype was for enum nft_devices_attributes instead Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2025-11-28netfilter: ip6t_srh: fix UAPI kernel-doc comments formatRandy Dunlap1-20/+20
Fix the kernel-doc format for struct members to be "@member" instead of "@ member" to avoid kernel-doc warnings. Warning: ip6t_srh.h:60 struct member 'next_hdr' not described in 'ip6t_srh' Warning: ip6t_srh.h:60 struct member 'hdr_len' not described in 'ip6t_srh' Warning: ip6t_srh.h:60 struct member 'segs_left' not described in 'ip6t_srh' Warning: ip6t_srh.h:60 struct member 'last_entry' not described in 'ip6t_srh' Warning: ip6t_srh.h:60 struct member 'tag' not described in 'ip6t_srh' Warning: ip6t_srh.h:60 struct member 'mt_flags' not described in 'ip6t_srh' Warning: ip6t_srh.h:60 struct member 'mt_invflags' not described in 'ip6t_srh' Warning: ip6t_srh.h:93 struct member 'next_hdr' not described in 'ip6t_srh1' Warning: ip6t_srh.h:93 struct member 'hdr_len' not described in 'ip6t_srh1' Warning: ip6t_srh.h:93 struct member 'segs_left' not described in 'ip6t_srh1' Warning: ip6t_srh.h:93 struct member 'last_entry' not described in 'ip6t_srh1' Warning: ip6t_srh.h:93 struct member 'tag' not described in 'ip6t_srh1' Warning: ip6t_srh.h:93 struct member 'psid_addr' not described in 'ip6t_srh1' Warning: ip6t_srh.h:93 struct member 'nsid_addr' not described in 'ip6t_srh1' Warning: ip6t_srh.h:93 struct member 'lsid_addr' not described in 'ip6t_srh1' Warning: ip6t_srh.h:93 struct member 'psid_msk' not described in 'ip6t_srh1' Warning: ip6t_srh.h:93 struct member 'nsid_msk' not described in 'ip6t_srh1' Warning: ip6t_srh.h:93 struct member 'lsid_msk' not described in 'ip6t_srh1' Warning: ip6t_srh.h:93 struct member 'mt_flags' not described in 'ip6t_srh1' Warning: ip6t_srh.h:93 struct member 'mt_invflags' not described in 'ip6t_srh1' Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2025-11-28selftests: netfilter: nft_flowtable.sh: Add the capability to send IPv6 TCP ↵Lorenzo Bianconi1-14/+43
traffic Introduce the capability to send TCP traffic over IPv6 to nft_flowtable netfilter selftest. Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2025-11-28netfilter: nft_connlimit: add support to object update operationFernando Fernandez Mancera1-1/+12
This is useful to update the limit or flags without clearing the connections tracked. Use READ_ONCE() on packetpath as it can be modified on controlplane. Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2025-11-28netfilter: nft_connlimit: update the count if add was skippedFernando Fernandez Mancera2-6/+19
Connlimit expression can be used for all kind of packets and not only for packets with connection state new. See this ruleset as example: table ip filter { chain input { type filter hook input priority filter; policy accept; tcp dport 22 ct count over 4 counter } } Currently, if the connection count goes over the limit the counter will count the packets. When a connection is closed, the connection count won't decrement as it should because it is only updated for new connections due to an optimization on __nf_conncount_add() that prevents updating the list if the connection is duplicated. To solve this problem, check whether the connection was skipped and if so, update the list. Adjust count_tree() too so the same fix is applied for xt_connlimit. Fixes: 976afca1ceba ("netfilter: nf_conncount: Early exit in nf_conncount_lookup() and cleanup") Closes: https://lore.kernel.org/netfilter/trinity-85c72a88-d762-46c3-be97-36f10e5d9796-1761173693813@3c-app-mailcom-bs12/ Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2025-11-28netfilter: nf_conncount: make nf_conncount_gc_list() to disable BHFernando Fernandez Mancera2-13/+18
For convenience when performing GC over the connection list, make nf_conncount_gc_list() to disable BH. This unifies the behavior with nf_conncount_add() and nf_conncount_count(). Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2025-11-28netfilter: nf_conncount: rework API to use sk_buff directlyFernando Fernandez Mancera5-103/+142
When using nf_conncount infrastructure for non-confirmed connections a duplicated track is possible due to an optimization introduced since commit d265929930e2 ("netfilter: nf_conncount: reduce unnecessary GC"). In order to fix this introduce a new conncount API that receives directly an sk_buff struct. It fetches the tuple and zone and the corresponding ct from it. It comes with both existing conncount variants nf_conncount_count_skb() and nf_conncount_add_skb(). In addition remove the old API and adjust all the users to use the new one. This way, for each sk_buff struct it is possible to check if there is a ct present and already confirmed. If so, skip the add operation. Fixes: d265929930e2 ("netfilter: nf_conncount: reduce unnecessary GC") Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2025-11-28selftests: netfilter: nft_flowtable.sh: Add IPIP flowtable selftestLorenzo Bianconi1-0/+69
Introduce specific selftest for IPIP flowtable SW acceleration in nft_flowtable.sh Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2025-11-28netfilter: flowtable: Add IPIP tx sw accelerationLorenzo Bianconi2-4/+106
Introduce sw acceleration for tx path of IPIP tunnels relying on the netfilter flowtable infrastructure. This patch introduces basic infrastructure to accelerate other tunnel types (e.g. IP6IP6). IPIP sw tx acceleration can be tested running the following scenario where the traffic is forwarded between two NICs (eth0 and eth1) and an IPIP tunnel is used to access a remote site (using eth1 as the underlay device): ETH0 -- TUN0 <==> ETH1 -- [IP network] -- TUN1 (192.168.100.2) $ip addr show 6: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000 link/ether 00:00:22:33:11:55 brd ff:ff:ff:ff:ff:ff inet 192.168.0.2/24 scope global eth0 valid_lft forever preferred_lft forever 7: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000 link/ether 00:11:22:33:11:55 brd ff:ff:ff:ff:ff:ff inet 192.168.1.1/24 scope global eth1 valid_lft forever preferred_lft forever 8: tun0@NONE: <POINTOPOINT,NOARP,UP,LOWER_UP> mtu 1480 qdisc noqueue state UNKNOWN group default qlen 1000 link/ipip 192.168.1.1 peer 192.168.1.2 inet 192.168.100.1/24 scope global tun0 valid_lft forever preferred_lft forever $ip route show default via 192.168.100.2 dev tun0 192.168.0.0/24 dev eth0 proto kernel scope link src 192.168.0.2 192.168.1.0/24 dev eth1 proto kernel scope link src 192.168.1.1 192.168.100.0/24 dev tun0 proto kernel scope link src 192.168.100.1 $nft list ruleset table inet filter { flowtable ft { hook ingress priority filter devices = { eth0, eth1 } } chain forward { type filter hook forward priority filter; policy accept; meta l4proto { tcp, udp } flow add @ft } } Reproducing the scenario described above using veths I got the following results: - TCP stream trasmitted into the IPIP tunnel: - net-next: (baseline) ~ 85Gbps - net-next + IPIP flowtable support: ~102Gbps Co-developed-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2025-11-28netfilter: flowtable: Add IPIP rx sw accelerationLorenzo Bianconi6-13/+153
Introduce sw acceleration for rx path of IPIP tunnels relying on the netfilter flowtable infrastructure. Subsequent patches will add sw acceleration for IPIP tunnels tx path. This series introduces basic infrastructure to accelerate other tunnel types (e.g. IP6IP6). IPIP rx sw acceleration can be tested running the following scenario where the traffic is forwarded between two NICs (eth0 and eth1) and an IPIP tunnel is used to access a remote site (using eth1 as the underlay device): ETH0 -- TUN0 <==> ETH1 -- [IP network] -- TUN1 (192.168.100.2) $ip addr show 6: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000 link/ether 00:00:22:33:11:55 brd ff:ff:ff:ff:ff:ff inet 192.168.0.2/24 scope global eth0 valid_lft forever preferred_lft forever 7: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000 link/ether 00:11:22:33:11:55 brd ff:ff:ff:ff:ff:ff inet 192.168.1.1/24 scope global eth1 valid_lft forever preferred_lft forever 8: tun0@NONE: <POINTOPOINT,NOARP,UP,LOWER_UP> mtu 1480 qdisc noqueue state UNKNOWN group default qlen 1000 link/ipip 192.168.1.1 peer 192.168.1.2 inet 192.168.100.1/24 scope global tun0 valid_lft forever preferred_lft forever $ip route show default via 192.168.100.2 dev tun0 192.168.0.0/24 dev eth0 proto kernel scope link src 192.168.0.2 192.168.1.0/24 dev eth1 proto kernel scope link src 192.168.1.1 192.168.100.0/24 dev tun0 proto kernel scope link src 192.168.100.1 $nft list ruleset table inet filter { flowtable ft { hook ingress priority filter devices = { eth0, eth1 } } chain forward { type filter hook forward priority filter; policy accept; meta l4proto { tcp, udp } flow add @ft } } Reproducing the scenario described above using veths I got the following results: - TCP stream received from the IPIP tunnel: - net-next: (baseline) ~ 71Gbps - net-next + IPIP flowtbale support: ~101Gbps Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2025-11-28netfilter: flowtable: use tuple address to calculate next hopPablo Neira Ayuso1-4/+12
This simplifies IPIP tunnel support coming in follow up patches. No function changes are intended. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2025-11-28netfilter: flowtable: remove hw_ifidxPablo Neira Ayuso4-6/+1
hw_ifidx was originally introduced to store the real netdevice as a requirement for the hardware offload support in: 73f97025a972 ("netfilter: nft_flow_offload: use direct xmit if hardware offload is enabled") Since ("netfilter: flowtable: consolidate xmit path"), ifidx and hw_ifidx points to the real device in the xmit path, remove it. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2025-11-28netfilter: flowtable: inline pppoe encapsulation in xmit pathPablo Neira Ayuso2-7/+44
Push the pppoe header from the flowtable xmit path, inlining is faster than the original xmit path because it can avoid some locking. This is based on a patch originally written by wenxu. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2025-11-28netfilter: flowtable: inline vlan encapsulation in xmit pathPablo Neira Ayuso2-3/+29
Push the vlan header from the flowtable xmit path, instead of passing the packet to the vlan device. This is based on a patch originally written by wenxu. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2025-11-27netfilter: flowtable: consolidate xmit pathPablo Neira Ayuso4-39/+57
Use dev_queue_xmit() for the XMIT_NEIGH case. Store the interface index of the real device behind the vlan/pppoe device, this introduces an extra lookup for the real device in the xmit path because rt->dst.dev provides the vlan/pppoe device. XMIT_NEIGH now looks more similar to XMIT_DIRECT but the check for stale dst and the neighbour lookup still remain in place which is convenient to deal with network topology changes. Note that nft_flow_route() needs to relax the check for _XMIT_NEIGH so the existing basic xfrm offload (which only works in one direction) does not break. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2025-11-27netfilter: flowtable: move path discovery infrastructure to its own filePablo Neira Ayuso4-259/+281
This file contains the path discovery that is run from the forward chain for the packet offloading the flow into the flowtable. This consists of a series of calls to dev_fill_forward_path() for each device stack. More topologies may be supported in the future, so move this code to its own file to separate it from the nftables flow_offload expression. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2025-11-27netfilter: flowtable: check for maximum number of encapsulations in bridge vlanPablo Neira Ayuso1-1/+8
Add a sanity check to skip path discovery if the maximum number of encapsulation is reached. While at it, check for underflow too. Fixes: 26267bf9bb57 ("netfilter: flowtable: bridge vlan hardware offload and switchdev") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2025-11-27keys: Fix grammar and formatting in 'struct key_type' commentsThorsten Blum1-3/+6
s/it/if/ and s/revokation/revocation/, capitalize "clear", and add a period after the sentence. Fix the comment formatting. Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2025-11-27keys: Replace deprecated strncpy in ecryptfs_fill_auth_tokThorsten Blum1-2/+1
strncpy() is deprecated for NUL-terminated destination buffers; use strscpy_pad() instead to retain the NUL-padding behavior of strncpy(). The destination buffer is initialized using kzalloc() with a 'signature' size of ECRYPTFS_PASSWORD_SIG_SIZE + 1. strncpy() then copies up to ECRYPTFS_PASSWORD_SIG_SIZE bytes from 'key_desc', NUL-padding any remaining bytes if needed, but expects the last byte to be zero. strscpy_pad() also copies the source string to 'signature', and NUL-pads the destination buffer if needed, but ensures it's always NUL-terminated without relying on it being zero-initialized. strscpy_pad() automatically determines the size of the fixed-length destination buffer via sizeof() when the optional size argument is omitted, making an explicit size unnecessary. In encrypted_init(), the source string 'key_desc' is validated by valid_ecryptfs_desc() before calling ecryptfs_fill_auth_tok(), and is therefore NUL-terminated and satisfies the __must_be_cstr() requirement of strscpy_pad(). Link: https://github.com/KSPP/linux/issues/90 Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Reviewed-by: Kees Cook <kees@kernel.org> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2025-11-27keys: Remove redundant less-than-zero checksThorsten Blum4-6/+6
The local variables 'size_t datalen' are unsigned and cannot be less than zero. Remove the redundant conditions. Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev> Reviewed-by: Mimi Zohar <zohar@linux.ibm.com> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
2025-11-27Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/netJakub Kicinski192-1158/+1936
Conflicts: net/xdp/xsk.c 0ebc27a4c67d ("xsk: avoid data corruption on cq descriptor number") 8da7bea7db69 ("xsk: add indirect call for xsk_destruct_skb") 30ed05adca4a ("xsk: use a smaller new lock for shared pool case") https://lore.kernel.org/20251127105450.4a1665ec@canb.auug.org.au https://lore.kernel.org/eb4eee14-7e24-4d1b-b312-e9ea738fefee@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-27Revert "ACPI: processor: Update cpuidle driver check in ↵Rafael J. Wysocki1-1/+1
__acpi_processor_start()" Revert commit 8a1b5d412cb4 ("ACPI: processor: Update cpuidle driver check in __acpi_processor_start()") which depends on commit 7a8c994cbb2d ("ACPI: processor: idle: Optimize ACPI idle driver registration") that got reverted. Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-11-27Merge tag 'ceph-for-6.18-rc8' of https://github.com/ceph/ceph-clientLinus Torvalds7-42/+66
Pull ceph fixes from Ilya Dryomov: "A patch to make sparse read handling work in msgr2 secure mode from Slava and a couple of fixes from Ziming and myself to avoid operating on potentially invalid memory, all marked for stable" * tag 'ceph-for-6.18-rc8' of https://github.com/ceph/ceph-client: libceph: prevent potential out-of-bounds writes in handle_auth_session_key() libceph: replace BUG_ON with bounds check for map->max_osd ceph: fix crash in process_v2_sparse_read() for encrypted directories libceph: drop started parameter of __ceph_open_session() libceph: fix potential use-after-free in have_mon_and_osd_map()
2025-11-27arm64/sysreg: Remove unused define ARM64_FEATURE_FIELD_BITSBen Horgan1-2/+0
The define ARM64_FEATURE_FIELD_BITS is now unused and feature id fields don't always have 4 bits. Remove it. Signed-off-by: Ben Horgan <ben.horgan@arm.com> Acked-by: Marc Zyngier <maz@kernel.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2025-11-27KVM: arm64: selftests: Consider all 7 possible levels of cacheBen Horgan1-1/+1
In test_clidr() if an empty cache level is not found then the TEST_ASSERT will not fire. Fix this by considering all 7 possible levels when iterating through the hierarchy. Found by inspection. Signed-off-by: Ben Horgan <ben.horgan@arm.com> Acked-by: Marc Zyngier <maz@kernel.org> Acked-by: Oliver Upton <oupton@kernel.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2025-11-27KVM: arm64: selftests: Remove ARM64_FEATURE_FIELD_BITS and its last userBen Horgan2-4/+6
ARM64_FEATURE_FIELD_BITS is set to 4 but not all ID register fields are 4 bits. See for instance ID_AA64SMFR0_EL1. The last user of this define, ARM64_FEATURE_FIELD_BITS, is the set_id_regs selftest. Its logic assumes the fields aren't a single bits; assert that's the case and stop using the define. As there are no more users, ARM64_FEATURE_FIELD_BITS is removed from the arm64 tools sysreg.h header. A separate commit removes this from the kernel version of the header. Signed-off-by: Ben Horgan <ben.horgan@arm.com> Acked-by: Marc Zyngier <maz@kernel.org> Acked-by: Oliver Upton <oupton@kernel.org> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2025-11-27arm64: atomics: lse: Remove unused parameters from ATOMIC_FETCH_OP_AND macrosSeongsu Park1-10/+10
The ATOMIC_FETCH_OP_AND and ATOMIC64_FETCH_OP_AND macros accept 'mb' and 'cl' parameters but never use them in their implementation. These macros simply delegate to the corresponding andnot functions, which handle the actual atomic operations and memory barriers. Signed-off-by: Seongsu Park <sgsu.park@samsung.com> Acked-by: Mark Rutland <mark.rutland@arm.com> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
2025-11-27Merge tag 'net-6.18-rc8' of ↵Linus Torvalds49-353/+700
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net Pull networking fixes from Paolo Abeni: "Including fixes from bluetooth and CAN. No known outstanding regressions. Current release - regressions: - mptcp: initialize rcv_mss before calling tcp_send_active_reset() - eth: mlx5e: fix validation logic in rate limiting Previous releases - regressions: - xsk: avoid data corruption on cq descriptor number - bluetooth: - prevent race in socket write iter and sock bind - fix not generating mackey and ltk when repairing - can: - kvaser_usb: fix potential infinite loop in command parsers - rcar_canfd: fix CAN-FD mode as default - eth: - veth: reduce XDP no_direct return section to fix race - virtio-net: avoid unnecessary checksum calculation on guest RX Previous releases - always broken: - sched: fix TCF_LAYER_TRANSPORT handling in tcf_get_base_ptr() - bluetooth: mediatek: fix kernel crash when releasing iso interface - vhost: rewind next_avail_head while discarding descriptors - eth: - r8169: fix RTL8127 hang on suspend/shutdown - aquantia: add missing descriptor cache invalidation on ATL2 - dsa: microchip: fix resource releases in error path" * tag 'net-6.18-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (47 commits) mptcp: Initialise rcv_mss before calling tcp_send_active_reset() in mptcp_do_fastclose(). net: fec: do not register PPS event for PEROUT net: fec: do not allow enabling PPS and PEROUT simultaneously net: fec: do not update PEROUT if it is enabled net: fec: cancel perout_timer when PEROUT is disabled net: mctp: unconditionally set skb->dev on dst output net: atlantic: fix fragment overflow handling in RX path MAINTAINERS: separate VIRTIO NET DRIVER and add netdev virtio-net: avoid unnecessary checksum calculation on guest RX eth: fbnic: Fix counter roll-over issue mptcp: clear scheduled subflows on retransmit net: dsa: sja1105: fix SGMII linking at 10M or 100M but not passing traffic s390/net: list Aswin Karuvally as maintainer net: wwan: mhi: Keep modem name match with Foxconn T99W640 vhost: rewind next_avail_head while discarding descriptors net/sched: em_canid: fix uninit-value in em_canid_match can: rcar_canfd: Fix CAN-FD mode as default xsk: avoid data corruption on cq descriptor number r8169: fix RTL8127 hang on suspend/shutdown net: sxgbe: fix potential NULL dereference in sxgbe_rx() ...
2025-11-27Merge tag 'platform-drivers-x86-v6.18-5' of ↵Linus Torvalds2-7/+11
git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86 Pull platform driver fixes from Ilpo Järvinen: - arm64/thinkpad-t14s-ec: - Fix IRQ race condition - Sleep after EC access - intel/punit_ipc: Fix memory corruption * tag 'platform-drivers-x86-v6.18-5' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86: platform/x86: intel: punit_ipc: fix memory corruption platform: arm64: thinkpad-t14s-ec: sleep after EC access platform: arm64: thinkpad-t14s-ec: fix IRQ race condition
2025-11-27debugobjects: Use LD_WAIT_CONFIG instead of LD_WAIT_SLEEPSebastian Andrzej Siewior1-2/+2
fill_pool_map is used to suppress nesting violations caused by acquiring a spinlock_t (from within the memory allocator) while holding a raw_spinlock_t. The used annotation is wrong. LD_WAIT_SLEEP is for always sleeping lock types such as mutex_t. LD_WAIT_CONFIG is for lock type which are sleeping while spinning on PREEMPT_RT such as spinlock_t. Use LD_WAIT_CONFIG as override. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://patch.msgid.link/20251127153652.291697-3-bigeasy@linutronix.de
2025-11-27debugobjects: Allow to refill the pool before SYSTEM_SCHEDULINGSebastian Andrzej Siewior1-1/+1
The pool of free objects is refilled on several occasions such as object initialisation. On PREEMPT_RT refilling is limited to preemptible sections due to sleeping locks used by the memory allocator. The system boots with disabled interrupts so the pool can not be refilled. If too many objects are initialized and the pool gets empty then debugobjects disables itself. Refiling can also happen early in the boot with disabled interrupts as long as the scheduler is not operational. If the scheduler can not preempt a task then a sleeping lock can not be contended. Allow to additionally refill the pool if the scheduler is not operational. Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Link: https://patch.msgid.link/20251127153652.291697-2-bigeasy@linutronix.de
2025-11-27Merge tag 'devfreq-next-for-6.19' of ↵Rafael J. Wysocki9-50/+45
git://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux Pull devfreq changes for v6.19 from Chanwoo Choi: "- Move governor.h under include/linux/ and rename to devfreq-governor.h in order to allow devfreq governor definitions in out of drivers/devfreq/. - Fix potential use-after-free issue of OPP handling on hisi_uncore_freq.c - Use min() to improve the readability on tegra30-devfreq.c - Fix typo in DFSO_DOWNDIFFERENTIAL macro name on governor_simpleondemand.c" * tag 'devfreq-next-for-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux: PM / devfreq: Fix typo in DFSO_DOWNDIFFERENTIAL macro name PM / devfreq: tegra30: use min to simplify actmon_cpu_to_emc_rate PM / devfreq: hisi: Fix potential UAF in OPP handling PM / devfreq: Move governor.h to a public header location
2025-11-27printk: Use console_is_usable on console_unblankMarcos Paulo de Souza1-8/+4
The macro for_each_console_srcu iterates over all registered consoles. It's implied that all registered consoles have CON_ENABLED flag set, making the check for the flag unnecessary. Call console_is_usable function to fully verify if the given console is usable before calling the ->unblank callback. Signed-off-by: Marcos Paulo de Souza <mpdesouza@suse.com> Reviewed-by: Petr Mladek <pmladek@suse.com> Link: https://patch.msgid.link/20251121-printk-cleanup-part2-v2-3-57b8b78647f4@suse.com Signed-off-by: Petr Mladek <pmladek@suse.com>
2025-11-27arch: um: kmsg_dump: Use console_is_usableMarcos Paulo de Souza1-1/+1
All consoles found on for_each_console are registered, meaning that all of them have the CON_ENABLED flag set. Since NBCON was introduced it's important to check if a given console also implements the NBCON callbacks. The function console_is_usable does exactly that. Signed-off-by: Marcos Paulo de Souza <mpdesouza@suse.com> Reviewed-by: Petr Mladek <pmladek@suse.com> Link: https://patch.msgid.link/20251121-printk-cleanup-part2-v2-2-57b8b78647f4@suse.com Signed-off-by: Petr Mladek <pmladek@suse.com>
2025-11-27drivers: serial: kgdboc: Drop checks for CON_ENABLED and CON_BOOTMarcos Paulo de Souza1-1/+0
The original code tried to find a console that has CON_BOOT _or_ CON_ENABLED flag set. The flag CON_ENABLED is set to all registered consoles, so in this case this check is always true, even for the CON_BOOT consoles. The initial intent of the kgdboc_earlycon_init was to get a console early (CON_BOOT) or later on in the process (CON_ENABLED). The code was using for_each_console macro, meaning that all console structs were previously registered on the printk() machinery. At this point, any console found on for_each_console is safe for kgdboc_earlycon_init to use. Dropping the check makes the code cleaner, and avoids further confusion by future readers of the code. Signed-off-by: Marcos Paulo de Souza <mpdesouza@suse.com> Reviewed-by: Petr Mladek <pmladek@suse.com> Link: https://patch.msgid.link/20251121-printk-cleanup-part2-v2-1-57b8b78647f4@suse.com Signed-off-by: Petr Mladek <pmladek@suse.com>
2025-11-27Merge tag 'linux-can-next-for-6.19-20251126' of ↵Paolo Abeni14-171/+1166
git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next Marc Kleine-Budde says: ==================== pull-request: can-next 2025-11-26 this is a pull request of 27 patches for net-next/main. The first 17 patches are by Vincent Mailhol and Oliver Hartkopp and add CAN XL support to the CAN netlink interface. Geert Uytterhoeven and Biju Das provide 7 patches for the rcar_canfd driver to add suspend/resume support. The next 2 patches are by Markus Schneider-Pargmann and add them as the m_can maintainer. Conor Dooley's patch updates the mpfs-can DT bindungs. linux-can-next-for-6.19-20251126 * tag 'linux-can-next-for-6.19-20251126' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next: (27 commits) dt-bindings: can: mpfs: document resets MAINTAINERS: Simplify m_can section MAINTAINERS: Add myself as m_can maintainer can: rcar_canfd: Add suspend/resume support can: rcar_canfd: Convert to DEFINE_SIMPLE_DEV_PM_OPS() can: rcar_canfd: Invert CAN clock and close_candev() order can: rcar_canfd: Extract rcar_canfd_global_{,de}init() can: rcar_canfd: Use devm_clk_get_optional() for RAM clk can: rcar_canfd: Invert global vs. channel teardown can: rcar_canfd: Invert reset assert order can: dev: print bitrate error with two decimal digits can: raw: instantly reject unsupported CAN frames can: add dummy_can driver can: calc_bittiming: add can_calc_sample_point_pwm() can: calc_bittiming: add can_calc_sample_point_nrz() can: calc_bittiming: replace misleading "nominal" by "reference" can: netlink: add PWM netlink interface can: calc_bittiming: add PWM calculation can: bittiming: add PWM validation can: bittiming: add PWM parameters ... ==================== Link: https://patch.msgid.link/20251126120106.154635-1-mkl@pengutronix.de Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-11-27x86/mm: Delete disabled debug codeBrendan Jackman1-3/+0
This code doesn't run. Since 2008: 4f9c11dd49fb ("x86, 64-bit: adjust mapping of physical pagetables to work with Xen") the kernel has gained more flexible logging and tracing capabilities; presumably if anyone wanted to take advantage of this log message they would have got rid of the "if (0)" so they could use these capabilities. Since they haven't, just delete it. Signed-off-by: Brendan Jackman <jackmanb@google.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Link: https://patch.msgid.link/20251003-x86-init-cleanup-v1-1-f2b7994c2ad6@google.com
2025-11-27refscale: Exercise DEFINE_STATIC_SRCU_FAST() and init_srcu_struct_fast()Paul E. McKenney1-1/+8
This commit updates the initialization for the "srcu-fast" scale type to use DEFINE_STATIC_SRCU_FAST() when reader_flavor is equal to SRCU_READ_FLAVOR_FAST. Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Cc: Steven Rostedt <rostedt@goodmis.org> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: <bpf@vger.kernel.org> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
2025-11-27rcutorture: Make srcu{,d}_torture_init() announce the SRCU typePaul E. McKenney1-6/+23
This commit causes rcutorture's srcu_torture_init() and srcud_torture_init() functions to announce on the console log which variant of SRCU is being tortured, for example: "torture: srcud_torture_init fast SRCU". [ paulmck: Apply feedback from kernel test robot. ] Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
2025-11-27srcu: Create an SRCU-fast-updown APIPaul E. McKenney5-16/+183
This commit creates an SRCU-fast-updown API, including DEFINE_SRCU_FAST_UPDOWN(), DEFINE_STATIC_SRCU_FAST_UPDOWN(), __init_srcu_struct_fast_updown(), init_srcu_struct_fast_updown(), srcu_read_lock_fast_updown(), srcu_read_unlock_fast_updown(), __srcu_read_lock_fast_updown(), and __srcu_read_unlock_fast_updown(). These are initially identical to their SRCU-fast counterparts, but both SRCU-fast and SRCU-fast-updown will be optimized in different directions by later commits. SRCU-fast will lack any sort of srcu_down_read() and srcu_up_read() APIs, which will enable extremely efficient NMI safety. For its part, SRCU-fast-updown will not be NMI safe, which will enable reasonably efficient implementations of srcu_down_read_fast() and srcu_up_read_fast(). This API fork happens to meet two different future use cases. * SRCU-fast will become the reimplementation basis for RCU-TASK-TRACE for consolidation. Since RCU-TASK-TRACE must be NMI safe, SRCU-fast must be as well. * SRCU-fast-updown will be needed for uretprobes code in order to get rid of the read-side memory barriers while still allowing entering the reader at task level while exiting it in a timer handler. This commit also adds rcutorture tests for the new APIs. This (annoyingly) needs to be in the same commit for bisectability. With this commit, the 0x8 value tests SRCU-fast-updown. However, most SRCU-fast testing will be via the RCU Tasks Trace wrappers. [ paulmck: Apply s/0x8/0x4/ missing change per Boqun Feng feedback. ] [ paulmck: Apply Akira Yokosawa feedback. ] Signed-off-by: Paul E. McKenney <paulmck@kernel.org> Cc: Andrii Nakryiko <andrii@kernel.org> Cc: Alexei Starovoitov <ast@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: <bpf@vger.kernel.org> Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
2025-11-27ACPI: PM: Fix a spelling mistakeChu Guangqing1-1/+1
Fix spelling by replacing "interrups" with "interrupts". Signed-off-by: Chu Guangqing <chuguangqing@inspur.com> [ rjw: Changelog edits ] Link: https://patch.msgid.link/20251125022403.2614-1-chuguangqing@inspur.com Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-11-27ACPI: LPSS: Fix a spelling mistakeChu Guangqing1-1/+1
Fix spelling by replacing "successfull" with "successful". Signed-off-by: Chu Guangqing <chuguangqing@inspur.com> [ rjw: Changelog edits ] Link: https://patch.msgid.link/20251125021431.2243-1-chuguangqing@inspur.com Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2025-11-27mptcp: Initialise rcv_mss before calling tcp_send_active_reset() in ↵Kuniyuki Iwashima1-0/+6
mptcp_do_fastclose(). syzbot reported divide-by-zero in __tcp_select_window() by MPTCP socket. [0] We had a similar issue for the bare TCP and fixed in commit 499350a5a6e7 ("tcp: initialize rcv_mss to TCP_MIN_MSS instead of 0"). Let's apply the same fix to mptcp_do_fastclose(). [0]: Oops: divide error: 0000 [#1] SMP KASAN PTI CPU: 0 UID: 0 PID: 6068 Comm: syz.0.17 Not tainted syzkaller #0 PREEMPT(full) Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/25/2025 RIP: 0010:__tcp_select_window+0x824/0x1320 net/ipv4/tcp_output.c:3336 Code: ff ff ff 44 89 f1 d3 e0 89 c1 f7 d1 41 01 cc 41 21 c4 e9 a9 00 00 00 e8 ca 49 01 f8 e9 9c 00 00 00 e8 c0 49 01 f8 44 89 e0 99 <f7> 7c 24 1c 41 29 d4 48 bb 00 00 00 00 00 fc ff df e9 80 00 00 00 RSP: 0018:ffffc90003017640 EFLAGS: 00010293 RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff88807b469e40 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 RBP: ffffc90003017730 R08: ffff888033268143 R09: 1ffff1100664d028 R10: dffffc0000000000 R11: ffffed100664d029 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 FS: 000055557faa0500(0000) GS:ffff888126135000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f64a1912ff8 CR3: 0000000072122000 CR4: 00000000003526f0 Call Trace: <TASK> tcp_select_window net/ipv4/tcp_output.c:281 [inline] __tcp_transmit_skb+0xbc7/0x3aa0 net/ipv4/tcp_output.c:1568 tcp_transmit_skb net/ipv4/tcp_output.c:1649 [inline] tcp_send_active_reset+0x2d1/0x5b0 net/ipv4/tcp_output.c:3836 mptcp_do_fastclose+0x27e/0x380 net/mptcp/protocol.c:2793 mptcp_disconnect+0x238/0x710 net/mptcp/protocol.c:3253 mptcp_sendmsg_fastopen+0x2f8/0x580 net/mptcp/protocol.c:1776 mptcp_sendmsg+0x1774/0x1980 net/mptcp/protocol.c:1855 sock_sendmsg_nosec net/socket.c:727 [inline] __sock_sendmsg+0xe5/0x270 net/socket.c:742 __sys_sendto+0x3bd/0x520 net/socket.c:2244 __do_sys_sendto net/socket.c:2251 [inline] __se_sys_sendto net/socket.c:2247 [inline] __x64_sys_sendto+0xde/0x100 net/socket.c:2247 do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline] do_syscall_64+0xfa/0xfa0 arch/x86/entry/syscall_64.c:94 entry_SYSCALL_64_after_hwframe+0x77/0x7f RIP: 0033:0x7f66e998f749 Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007ffff9acedb8 EFLAGS: 00000246 ORIG_RAX: 000000000000002c RAX: ffffffffffffffda RBX: 00007f66e9be5fa0 RCX: 00007f66e998f749 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000003 RBP: 00007ffff9acee10 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001 R13: 00007f66e9be5fa0 R14: 00007f66e9be5fa0 R15: 0000000000000006 </TASK> Fixes: ae155060247b ("mptcp: fix duplicate reset on fastclose") Reported-by: syzbot+3a92d359bc2ec6255a33@syzkaller.appspotmail.com Closes: https://lore.kernel.org/netdev/69260882.a70a0220.d98e3.00b4.GAE@google.com/ Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20251125195331.309558-1-kuniyu@google.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-11-27net: stmmac: dwmac: Disable flushing frames on Rx Buffer UnavailableRohan G Thomas5-2/+14
In Store and Forward mode, flushing frames when the receive buffer is unavailable, can cause the MTL Rx FIFO to go out of sync. This results in buffering of a few frames in the FIFO without triggering Rx DMA from transferring the data to the system memory until another packet is received. Once the issue happens, for a ping request, the packet is forwarded to the system memory only after we receive another packet and hece we observe a latency equivalent to the ping interval. 64 bytes from 192.168.2.100: seq=1 ttl=64 time=1000.344 ms Also, we can observe constant gmacgrp_debug register value of 0x00000120, which indicates "Reading frame data". The issue is not reproducible after disabling frame flushing when Rx buffer is unavailable. But in that case, the Rx DMA enters a suspend state due to buffer unavailability. To resume operation, software must write to the receive_poll_demand register after adding new descriptors, which reactivates the Rx DMA. This issue is observed in the socfpga platforms which has dwmac1000 IP like Arria 10, Cyclone V and Agilex 7. Issue is reproducible after running iperf3 server at the DUT for UDP lower packet sizes. Signed-off-by: Rohan G Thomas <rohan.g.thomas@altera.com> Reviewed-by: Matthew Gerlach <matthew.gerlach@altera.com> Tested-by: Maxime Chevallier <maxime.chevallier@bootlin.com> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Link: https://patch.msgid.link/20251126-a10_ext_fix-v1-1-d163507f646f@altera.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-11-27selftests: af_unix: remove unused stdlib.h includeSunday Adelodun1-1/+0
The unix_connreset.c test included <stdlib.h>, but no symbol from that header is used. This causes a fatal build error under certain linux-next configurations where stdlib.h is not available. Remove the unused include to fix the build. Reported-by: kernel test robot <lkp@intel.com> Closes: https://lore.kernel.org/r/202511221800.hcgCKvVa-lkp@intel.com/ Signed-off-by: Sunday Adelodun <adelodunolaoluwa@yahoo.com> Link: https://patch.msgid.link/20251125113648.25903-1-adelodunolaoluwa@yahoo.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-11-27spi: nxp-fspi: Propagate fwnode in ACPI case as wellAndy Shevchenko1-5/+5
Propagate fwnode of the ACPI device to the SPI controller Linux device. Currently only OF case propagates fwnode to the controller. While at it, replace several calls to dev_fwnode() with a single one cached in a local variable, and unify checks for fwnode type by using is_*_node() APIs. Fixes: 55ab8487e01d ("spi: spi-nxp-fspi: Add ACPI support") Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Reviewed-by: Haibo Chen <haibo.chen@nxp.com> Link: https://patch.msgid.link/20251126202501.2319679-1-andriy.shevchenko@linux.intel.com Signed-off-by: Mark Brown <broonie@kernel.org>
2025-11-27regulator: rtq2208: Correct LDO2 logic judgment bitsChiYuan Huang1-1/+1
The LDO2 judgement bit position should be 7, not 6. Cc: stable@vger.kernel.org Reported-by: Yoon Dong Min <dm.youn@telechips.com> Fixes: b65439d90150 ("regulator: rtq2208: Fix the LDO DVS capability") Signed-off-by: ChiYuan Huang <cy_huang@richtek.com> Link: https://patch.msgid.link/faadb009f84b88bfcabe39fc5009c7357b00bbe2.1764209258.git.cy_huang@richtek.com Signed-off-by: Mark Brown <broonie@kernel.org>
2025-11-27regulator: rtq2208: Correct buck group2 phase mapping logicChiYuan Huang1-2/+2
Correct buck group2 H and F mapping logic. Cc: stable@vger.kernel.org Reported-by: Yoon Dong Min <dm.youn@telechips.com> Fixes: 1742e7e978ba ("regulator: rtq2208: Fix incorrect buck converter phase mapping") Signed-off-by: ChiYuan Huang <cy_huang@richtek.com> Link: https://patch.msgid.link/8527ae02a72b754d89b7580a5fe7474d6f80f5c3.1764209258.git.cy_huang@richtek.com Signed-off-by: Mark Brown <broonie@kernel.org>
2025-11-27Merge branch 'net-fec-fix-some-ptp-related-issues'Paolo Abeni2-12/+53
Wei Fang says: ==================== net: fec: fix some PTP related issues There are some issues which were introduced by the commit 350749b909bf ("net: fec: Add support for periodic output signal of PPS"). See each patch for more details. ==================== Link: https://patch.msgid.link/20251125085210.1094306-1-wei.fang@nxp.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-11-27net: fec: do not register PPS event for PEROUTWei Fang1-2/+5
There are currently two situations that can trigger the PTP interrupt, one is the PPS event, the other is the PEROUT event. However, the irq handler fec_pps_interrupt() does not check the irq event type and directly registers a PPS event into the system, but the event may be a PEROUT event. This is incorrect because PEROUT is an output signal, while PPS is the input of the kernel PPS system. Therefore, add a check for the event type, if pps_enable is true, it means that the current event is a PPS event, and then the PPS event is registered. Fixes: 350749b909bf ("net: fec: Add support for periodic output signal of PPS") Signed-off-by: Wei Fang <wei.fang@nxp.com> Link: https://patch.msgid.link/20251125085210.1094306-5-wei.fang@nxp.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-11-27net: fec: do not allow enabling PPS and PEROUT simultaneouslyWei Fang1-0/+12
In the current driver, PPS and PEROUT use the same channel to generate the events, so they cannot be enabled at the same time. Otherwise, the later configuration will overwrite the earlier configuration. Therefore, when configuring PPS, the driver will check whether PEROUT is enabled. Similarly, when configuring PEROUT, the driver will check whether PPS is enabled. Fixes: 350749b909bf ("net: fec: Add support for periodic output signal of PPS") Signed-off-by: Wei Fang <wei.fang@nxp.com> Link: https://patch.msgid.link/20251125085210.1094306-4-wei.fang@nxp.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-11-27net: fec: do not update PEROUT if it is enabledWei Fang2-10/+34
If the previously set PEROUT is already active, updating it will cause the new PEROUT to start immediately instead of at the specified time. This is because fep->reload_period is updated whithout check whether the PEROUT is enabled, and the old PEROUT is not disabled. Therefore, the pulse period will be updated immediately in the pulse interrupt handler fec_pps_interrupt(). Currently, the driver does not support directly updating PEROUT and it will make the logic be more complicated. To fix the current issue, add a check before enabling the PEROUT, the driver will return an error if PEROUT is enabled. If users wants to update a new PEROUT, they should disable the old PEROUT first. Fixes: 350749b909bf ("net: fec: Add support for periodic output signal of PPS") Signed-off-by: Wei Fang <wei.fang@nxp.com> Link: https://patch.msgid.link/20251125085210.1094306-3-wei.fang@nxp.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-11-27net: fec: cancel perout_timer when PEROUT is disabledWei Fang1-0/+2
The PEROUT allows the user to set a specified future time to output the periodic signal. If the future time is far from the current time, the FEC driver will use hrtimer to configure PEROUT one second before the future time. However, the hrtimer will not be canceled if the PEROUT is disabled before the hrtimer expires. So the PEROUT will be configured when the hrtimer expires, which is not as expected. Therefore, cancel the hrtimer in fec_ptp_pps_disable() to fix this issue. Fixes: 350749b909bf ("net: fec: Add support for periodic output signal of PPS") Signed-off-by: Wei Fang <wei.fang@nxp.com> Link: https://patch.msgid.link/20251125085210.1094306-2-wei.fang@nxp.com Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-11-27net: mctp: unconditionally set skb->dev on dst outputJeremy Kerr1-0/+1
On transmit, we are currently relying on skb->dev being set by mctp_local_output() when we first set up the skb destination fields. However, forwarded skbs do not use the local_output path, so will retain their incoming netdev as their ->dev on tx. This does not work when we're forwarding between interfaces. Set skb->dev unconditionally in the transmit path, to allow for proper forwarding. We keep the skb->dev initialisation in mctp_local_output(), as we use it for fragmentation. Fixes: 269936db5eb3 ("net: mctp: separate routing database from routing operations") Suggested-by: Vince Chang <vince_chang@aspeedtech.com> Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au> Link: https://patch.msgid.link/20251125-dev-forward-v1-1-54ecffcd0616@codeconstruct.com.au Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-11-27Merge branch 'net-phy-add-support-for-fbnic-phy-w-25g-50g-and-100g-support'Paolo Abeni17-147/+613
Alexander Duyck says: ==================== net: phy: Add support for fbnic PHY w/ 25G, 50G, and 100G support To transition the fbnic driver to using the XPCS driver we need to address the fact that we need a representation for the FW managed PMD that is actually a SerDes PHY to handle link bouncing during link training. This patch set introduces the necessary bits to the XPCS driver code to enable it to read 25G, 50G, and 100G speeds from the PCS ctrl1 register, and adds support for the approriate interfaces. The rest of this patch set enables the changes to fbnic to make use of these interfaces and expose a PMD that can provide a necessary link delay to avoid link flapping in the event that a cable is disconnected and reconnected, and to correctly expose the count for the link down events. With this we have the basic groundwork laid as with this all the bits and pieces are in place in terms of reading the configuration. The general plan for follow-on patch sets is to start looking at enabling changing the configuration in environments where that is supported. ==================== Link: https://patch.msgid.link/176374310349.959489.838154632023183753.stgit@ahduyck-xeon-server.home.arpa Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-11-27fbnic: Replace use of internal PCS w/ Designware XPCSAlexander Duyck5-63/+55
As we have exposed the PCS registers via the SWMII we can now start looking at connecting the XPCS driver to those registers and let it mange the PCS instead of us doing it directly from the fbnic driver. For now this just gets us the ability to detect link. The hope is in the future to add some of the vendor specific registers to begin enabling XPCS configuration of the interface. Signed-off-by: Alexander Duyck <alexanderduyck@fb.com> Link: https://patch.msgid.link/176374325295.959489.14521115864034905277.stgit@ahduyck-xeon-server.home.arpa Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-11-27fbnic: Add SW shim for MDIO interface to PMD and PCSAlexander Duyck5-0/+205
In order for us to support a PCS device we need to add an MDIO bus to allow the drivers to have access to the registers for the device. This change adds such an interface. The interface will consist of 2 PHY addrs, the first one consisting of a PMD and PCS, and the second just being a PCS. There is a need for 2 PHYs addrs due to the fact that in order to support the 50GBase-CR2 mode we will need to access and configure the PCS vendor registers and RSFEC registers from the second lane identical to the first. Signed-off-by: Alexander Duyck <alexanderduyck@fb.com> Link: https://patch.msgid.link/176374324532.959489.15389723111560978054.stgit@ahduyck-xeon-server.home.arpa Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-11-27fbnic: Add handler for reporting link down event statisticsAlexander Duyck1-0/+9
We were previously not displaying the number of link_down_events tracked by the device. With this change we should now be able to display the value. The value itself tracks the calls from the phylink interface to the mac_link_down call. Signed-off-by: Alexander Duyck <alexanderduyck@fb.com> Link: https://patch.msgid.link/176374323824.959489.6915296616773178954.stgit@ahduyck-xeon-server.home.arpa Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-11-27fbnic: Add logic to track PMD state via MAC/PCS signalsAlexander Duyck9-52/+175
One complication with the design of our part is that the PMD doesn't provide a direct signal to the host. Instead we have visibility to signals that the PCS provides to the MAC that allow us to check the link state through that. We will need to account for several things in the PMD and firmware when managing the link. Specifically when the link first starts to come up the PMD will cause the link to flap. This is due to the firmware starting a training cycle when the link is first detected. This will cause link flapping if we were to immediately report link up when the PCS first detects it. To address that we are adding a pmd_state variable that is meant to be a countdown of sorts indicating the state of the PMD. If the link is down or has been reconfigured the PMD will start out in the initialize state. By default the link is assumed to be in the SEND_DATA state if it is available on initial link inspection. If link is detected while in the initialize state the PMD state will switch to training, and if after 4 seconds the link is still stable we will transition to link_ready, and finally the send_data state. With this we can avoid link flapping when a cable is first connected to the NIC. One side effect of this is that we need to pull the link state away from the PCS. For now we use a union of the PCS link state register value and the pmd_state. The plan is to add a PMD register to report the pmd_state to the phylink interface. With that we can then look at switching over to the use of the XPCS driver for fbnic instead of having an internal one. Signed-off-by: Alexander Duyck <alexanderduyck@fb.com> Link: https://patch.msgid.link/176374323107.959489.14951134213387615059.stgit@ahduyck-xeon-server.home.arpa Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-11-27fbnic: Rename PCS IRQ to MAC IRQ as it is actually a MAC interruptAlexander Duyck6-32/+32
Throughout several spots in the code I had called out the IRQ as being related to the PCS. However the actual IRQ is a part of the MAC and it is just exposing PCS data. To more accurately reflect the owner of the calls this change makes it so that we rename the functions and values that are taking in the interrupt value and processing it to reflect that it is a MAC call and not a PCS one. This change is mostly motivated by the fact that we will be moving the handling of this interrupt from being PCS focused to being more PMA/PMD focused as this will drive the phydev driver that I am adding instead of driving the PCS directly. Signed-off-by: Alexander Duyck <alexanderduyck@fb.com> Link: https://patch.msgid.link/176374322373.959489.12018231545479053860.stgit@ahduyck-xeon-server.home.arpa Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-11-27net: pcs: xpcs: Add support for FBNIC 25G, 50G, 100G PMDAlexander Duyck2-2/+24
The fbnic driver is planning to make use of the XPCS driver to enable support for PCS and better integration with phylink. To do this though we will need to enable several workarounds since the PMD interface for fbnic is likely to be unique since it is a mix of two different vendor products with a unique wrapper around the IP. I have generated a PHY identifier based on IEEE 802.3-2022 22.2.4.3.1 using an OUI belonging to Meta Platforms and used with our NICs. Using this we will provide it as the PMD ID via the SW based MDIO interface so that the fbnic device can be identified and necessary workarounds enabled in the XPCS driver. As an initial workaround this change adds an exception so that soft_reset is not set when the driver is initially bound to the PCS. In addition I have added logic to integrate the PMD Rx signal detect state into the link state for the PCS. With this we can avoid the link coming up too soon on the FBNIC PMD and as a result of it being in the training state so we can avoid link flaps. Signed-off-by: Alexander Duyck <alexanderduyck@fb.com> Link: https://patch.msgid.link/176374321695.959489.6648161125012056619.stgit@ahduyck-xeon-server.home.arpa Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-11-27net: pcs: xpcs: Fix PMA identifier handling in XPCSAlexander Duyck3-6/+10
The XPCS driver was mangling the PMA identifier as the original code appears to have been focused on just capturing the OUI. Rather than store a mangled ID it is better to work with the actual PMA ID and instead just mask out the values that don't apply rather than shifting them and reordering them as you still don't get the original OUI for the NIC without having to bitswap the values as per the definition of the layout in IEEE 802.3-2022 22.2.4.3.1. By laying it out as it was in the hardware it is also less likely for us to have an unintentional collision as the enum values will occupy the revision number area while the OUI occupies the upper 22 bits. Signed-off-by: Alexander Duyck <alexanderduyck@fb.com> Link: https://patch.msgid.link/176374320920.959489.17267159479370601070.stgit@ahduyck-xeon-server.home.arpa Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-11-27net: pcs: xpcs: Add support for 25G, 50G, and 100G interfacesAlexander Duyck2-4/+107
With this change we are adding support for 25G, 50G, and 100G interface types to the XPCS driver. This had supposedly been enabled with the addition of XLGMII but I don't see any capability for configuration there so I suspect it may need to be refactored in the future. With this change we can enable the XPCS driver with the selected interface and it should be able to detect link, speed, and report the link status to the phylink interface. Signed-off-by: Alexander Duyck <alexanderduyck@fb.com> Link: https://patch.msgid.link/176374320248.959489.11649590675011158859.stgit@ahduyck-xeon-server.home.arpa Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-11-27net: phy: Add MDIO_PMA_CTRL1_SPEED for 2.5G and 5G to reflect PMA valuesAlexander Duyck2-6/+14
The 2.5G and 5G values are not consistent between the PCS CTRL1 and PMA CTRL1 values. In order to avoid confusion between the two I am updating the values to include "PMA" in the name similar to values used in similar places. To avoid breaking UAPI I have retained the original macros and just defined them as the new PMA based defines. Signed-off-by: Alexander Duyck <alexanderduyck@fb.com> Link: https://patch.msgid.link/176374319569.959489.6610469879021800710.stgit@ahduyck-xeon-server.home.arpa Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2025-11-27MAINTAINERS: add German Maglione as virtiofs co-maintainerStefan Hajnoczi1-0/+1
German Maglione is a co-maintainer of the virtiofsd userspace device implementation (https://gitlab.com/virtio-fs/virtiofsd) and is currently one of the most active virtiofs developers outside the kernel. I have not worked on virtiofs except to review kernel patches for a few years now and would like German to take over from me gradually. It is healthier to have a kernel maintainer who is actively involved. I expect to remove myself in a few months. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Link: https://patch.msgid.link/20251126211548.598469-1-stefanha@redhat.com Signed-off-by: Christian Brauner <brauner@kernel.org>
2025-11-27libceph: prevent potential out-of-bounds writes in handle_auth_session_key()ziming zhang1-0/+2
The len field originates from untrusted network packets. Boundary checks have been added to prevent potential out-of-bounds writes when decrypting the connection secret or processing service tickets. [ idryomov: changelog ] Cc: stable@vger.kernel.org Signed-off-by: ziming zhang <ezrakiez@gmail.com> Reviewed-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2025-11-27libceph: replace BUG_ON with bounds check for map->max_osdziming zhang1-7/+11
OSD indexes come from untrusted network packets. Boundary checks are added to validate these against map->max_osd. [ idryomov: drop BUG_ON in ceph_get_primary_affinity(), minor cosmetic edits ] Cc: stable@vger.kernel.org Signed-off-by: ziming zhang <ezrakiez@gmail.com> Reviewed-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2025-11-27ceph: fix crash in process_v2_sparse_read() for encrypted directoriesViacheslav Dubeyko1-4/+7
The crash in process_v2_sparse_read() for fscrypt-encrypted directories has been reported. Issue takes place for Ceph msgr2 protocol in secure mode. It can be reproduced by the steps: sudo mount -t ceph :/ /mnt/cephfs/ -o name=admin,fs=cephfs,ms_mode=secure (1) mkdir /mnt/cephfs/fscrypt-test-3 (2) cp area_decrypted.tar /mnt/cephfs/fscrypt-test-3 (3) fscrypt encrypt --source=raw_key --key=./my.key /mnt/cephfs/fscrypt-test-3 (4) fscrypt lock /mnt/cephfs/fscrypt-test-3 (5) fscrypt unlock --key=my.key /mnt/cephfs/fscrypt-test-3 (6) cat /mnt/cephfs/fscrypt-test-3/area_decrypted.tar (7) Issue has been triggered [ 408.072247] ------------[ cut here ]------------ [ 408.072251] WARNING: CPU: 1 PID: 392 at net/ceph/messenger_v2.c:865 ceph_con_v2_try_read+0x4b39/0x72f0 [ 408.072267] Modules linked in: intel_rapl_msr intel_rapl_common intel_uncore_frequency_common intel_pmc_core pmt_telemetry pmt_discovery pmt_class intel_pmc_ssram_telemetry intel_vsec kvm_intel joydev kvm irqbypass polyval_clmulni ghash_clmulni_intel aesni_intel rapl input_leds psmouse serio_raw i2c_piix4 vga16fb bochs vgastate i2c_smbus floppy mac_hid qemu_fw_cfg pata_acpi sch_fq_codel rbd msr parport_pc ppdev lp parport efi_pstore [ 408.072304] CPU: 1 UID: 0 PID: 392 Comm: kworker/1:3 Not tainted 6.17.0-rc7+ [ 408.072307] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.17.0-5.fc42 04/01/2014 [ 408.072310] Workqueue: ceph-msgr ceph_con_workfn [ 408.072314] RIP: 0010:ceph_con_v2_try_read+0x4b39/0x72f0 [ 408.072317] Code: c7 c1 20 f0 d4 ae 50 31 d2 48 c7 c6 60 27 d5 ae 48 c7 c7 f8 8e 6f b0 68 60 38 d5 ae e8 00 47 61 fe 48 83 c4 18 e9 ac fc ff ff <0f> 0b e9 06 fe ff ff 4c 8b 9d 98 fd ff ff 0f 84 64 e7 ff ff 89 85 [ 408.072319] RSP: 0018:ffff88811c3e7a30 EFLAGS: 00010246 [ 408.072322] RAX: ffffed1024874c6f RBX: ffffea00042c2b40 RCX: 0000000000000f38 [ 408.072324] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 [ 408.072325] RBP: ffff88811c3e7ca8 R08: 0000000000000000 R09: 00000000000000c8 [ 408.072326] R10: 00000000000000c8 R11: 0000000000000000 R12: 00000000000000c8 [ 408.072327] R13: dffffc0000000000 R14: ffff8881243a6030 R15: 0000000000003000 [ 408.072329] FS: 0000000000000000(0000) GS:ffff88823eadf000(0000) knlGS:0000000000000000 [ 408.072331] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 408.072332] CR2: 000000c0003c6000 CR3: 000000010c106005 CR4: 0000000000772ef0 [ 408.072336] PKRU: 55555554 [ 408.072337] Call Trace: [ 408.072338] <TASK> [ 408.072340] ? sched_clock_noinstr+0x9/0x10 [ 408.072344] ? __pfx_ceph_con_v2_try_read+0x10/0x10 [ 408.072347] ? _raw_spin_unlock+0xe/0x40 [ 408.072349] ? finish_task_switch.isra.0+0x15d/0x830 [ 408.072353] ? __kasan_check_write+0x14/0x30 [ 408.072357] ? mutex_lock+0x84/0xe0 [ 408.072359] ? __pfx_mutex_lock+0x10/0x10 [ 408.072361] ceph_con_workfn+0x27e/0x10e0 [ 408.072364] ? metric_delayed_work+0x311/0x2c50 [ 408.072367] process_one_work+0x611/0xe20 [ 408.072371] ? __kasan_check_write+0x14/0x30 [ 408.072373] worker_thread+0x7e3/0x1580 [ 408.072375] ? __pfx__raw_spin_lock_irqsave+0x10/0x10 [ 408.072378] ? __pfx_worker_thread+0x10/0x10 [ 408.072381] kthread+0x381/0x7a0 [ 408.072383] ? __pfx__raw_spin_lock_irq+0x10/0x10 [ 408.072385] ? __pfx_kthread+0x10/0x10 [ 408.072387] ? __kasan_check_write+0x14/0x30 [ 408.072389] ? recalc_sigpending+0x160/0x220 [ 408.072392] ? _raw_spin_unlock_irq+0xe/0x50 [ 408.072394] ? calculate_sigpending+0x78/0xb0 [ 408.072395] ? __pfx_kthread+0x10/0x10 [ 408.072397] ret_from_fork+0x2b6/0x380 [ 408.072400] ? __pfx_kthread+0x10/0x10 [ 408.072402] ret_from_fork_asm+0x1a/0x30 [ 408.072406] </TASK> [ 408.072407] ---[ end trace 0000000000000000 ]--- [ 408.072418] Oops: general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] SMP KASAN NOPTI [ 408.072984] KASAN: null-ptr-deref in range [0x0000000000000000- 0x0000000000000007] [ 408.073350] CPU: 1 UID: 0 PID: 392 Comm: kworker/1:3 Tainted: G W 6.17.0-rc7+ #1 PREEMPT(voluntary) [ 408.073886] Tainted: [W]=WARN [ 408.074042] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.17.0-5.fc42 04/01/2014 [ 408.074468] Workqueue: ceph-msgr ceph_con_workfn [ 408.074694] RIP: 0010:ceph_msg_data_advance+0x79/0x1a80 [ 408.074976] Code: fc ff df 49 8d 77 08 48 c1 ee 03 80 3c 16 00 0f 85 07 11 00 00 48 ba 00 00 00 00 00 fc ff df 49 8b 5f 08 48 89 de 48 c1 ee 03 <0f> b6 14 16 84 d2 74 09 80 fa 03 0f 8e 0f 0e 00 00 8b 13 83 fa 03 [ 408.075884] RSP: 0018:ffff88811c3e7990 EFLAGS: 00010246 [ 408.076305] RAX: ffff8881243a6388 RBX: 0000000000000000 RCX: 0000000000000000 [ 408.076909] RDX: dffffc0000000000 RSI: 0000000000000000 RDI: ffff8881243a6378 [ 408.077466] RBP: ffff88811c3e7a20 R08: 0000000000000000 R09: 00000000000000c8 [ 408.078034] R10: ffff8881243a6388 R11: 0000000000000000 R12: ffffed1024874c71 [ 408.078575] R13: dffffc0000000000 R14: ffff8881243a6030 R15: ffff8881243a6378 [ 408.079159] FS: 0000000000000000(0000) GS:ffff88823eadf000(0000) knlGS:0000000000000000 [ 408.079736] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 408.080039] CR2: 000000c0003c6000 CR3: 000000010c106005 CR4: 0000000000772ef0 [ 408.080376] PKRU: 55555554 [ 408.080513] Call Trace: [ 408.080630] <TASK> [ 408.080729] ceph_con_v2_try_read+0x49b9/0x72f0 [ 408.081115] ? __pfx_ceph_con_v2_try_read+0x10/0x10 [ 408.081348] ? _raw_spin_unlock+0xe/0x40 [ 408.081538] ? finish_task_switch.isra.0+0x15d/0x830 [ 408.081768] ? __kasan_check_write+0x14/0x30 [ 408.081986] ? mutex_lock+0x84/0xe0 [ 408.082160] ? __pfx_mutex_lock+0x10/0x10 [ 408.082343] ceph_con_workfn+0x27e/0x10e0 [ 408.082529] ? metric_delayed_work+0x311/0x2c50 [ 408.082737] process_one_work+0x611/0xe20 [ 408.082948] ? __kasan_check_write+0x14/0x30 [ 408.083156] worker_thread+0x7e3/0x1580 [ 408.083331] ? __pfx__raw_spin_lock_irqsave+0x10/0x10 [ 408.083557] ? __pfx_worker_thread+0x10/0x10 [ 408.083751] kthread+0x381/0x7a0 [ 408.083922] ? __pfx__raw_spin_lock_irq+0x10/0x10 [ 408.084139] ? __pfx_kthread+0x10/0x10 [ 408.084310] ? __kasan_check_write+0x14/0x30 [ 408.084510] ? recalc_sigpending+0x160/0x220 [ 408.084708] ? _raw_spin_unlock_irq+0xe/0x50 [ 408.084917] ? calculate_sigpending+0x78/0xb0 [ 408.085138] ? __pfx_kthread+0x10/0x10 [ 408.085335] ret_from_fork+0x2b6/0x380 [ 408.085525] ? __pfx_kthread+0x10/0x10 [ 408.085720] ret_from_fork_asm+0x1a/0x30 [ 408.085922] </TASK> [ 408.086036] Modules linked in: intel_rapl_msr intel_rapl_common intel_uncore_frequency_common intel_pmc_core pmt_telemetry pmt_discovery pmt_class intel_pmc_ssram_telemetry intel_vsec kvm_intel joydev kvm irqbypass polyval_clmulni ghash_clmulni_intel aesni_intel rapl input_leds psmouse serio_raw i2c_piix4 vga16fb bochs vgastate i2c_smbus floppy mac_hid qemu_fw_cfg pata_acpi sch_fq_codel rbd msr parport_pc ppdev lp parport efi_pstore [ 408.087778] ---[ end trace 0000000000000000 ]--- [ 408.088007] RIP: 0010:ceph_msg_data_advance+0x79/0x1a80 [ 408.088260] Code: fc ff df 49 8d 77 08 48 c1 ee 03 80 3c 16 00 0f 85 07 11 00 00 48 ba 00 00 00 00 00 fc ff df 49 8b 5f 08 48 89 de 48 c1 ee 03 <0f> b6 14 16 84 d2 74 09 80 fa 03 0f 8e 0f 0e 00 00 8b 13 83 fa 03 [ 408.089118] RSP: 0018:ffff88811c3e7990 EFLAGS: 00010246 [ 408.089357] RAX: ffff8881243a6388 RBX: 0000000000000000 RCX: 0000000000000000 [ 408.089678] RDX: dffffc0000000000 RSI: 0000000000000000 RDI: ffff8881243a6378 [ 408.090020] RBP: ffff88811c3e7a20 R08: 0000000000000000 R09: 00000000000000c8 [ 408.090360] R10: ffff8881243a6388 R11: 0000000000000000 R12: ffffed1024874c71 [ 408.090687] R13: dffffc0000000000 R14: ffff8881243a6030 R15: ffff8881243a6378 [ 408.091035] FS: 0000000000000000(0000) GS:ffff88823eadf000(0000) knlGS:0000000000000000 [ 408.091452] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 408.092015] CR2: 000000c0003c6000 CR3: 000000010c106005 CR4: 0000000000772ef0 [ 408.092530] PKRU: 55555554 [ 417.112915] ================================================================== [ 417.113491] BUG: KASAN: slab-use-after-free in __mutex_lock.constprop.0+0x1522/0x1610 [ 417.114014] Read of size 4 at addr ffff888124870034 by task kworker/2:0/4951 [ 417.114587] CPU: 2 UID: 0 PID: 4951 Comm: kworker/2:0 Tainted: G D W 6.17.0-rc7+ #1 PREEMPT(voluntary) [ 417.114592] Tainted: [D]=DIE, [W]=WARN [ 417.114593] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.17.0-5.fc42 04/01/2014 [ 417.114596] Workqueue: events handle_timeout [ 417.114601] Call Trace: [ 417.114602] <TASK> [ 417.114604] dump_stack_lvl+0x5c/0x90 [ 417.114610] print_report+0x171/0x4dc [ 417.114613] ? __pfx__raw_spin_lock_irqsave+0x10/0x10 [ 417.114617] ? kasan_complete_mode_report_info+0x80/0x220 [ 417.114621] kasan_report+0xbd/0x100 [ 417.114625] ? __mutex_lock.constprop.0+0x1522/0x1610 [ 417.114628] ? __mutex_lock.constprop.0+0x1522/0x1610 [ 417.114630] __asan_report_load4_noabort+0x14/0x30 [ 417.114633] __mutex_lock.constprop.0+0x1522/0x1610 [ 417.114635] ? queue_con_delay+0x8d/0x200 [ 417.114638] ? __pfx___mutex_lock.constprop.0+0x10/0x10 [ 417.114641] ? __send_subscribe+0x529/0xb20 [ 417.114644] __mutex_lock_slowpath+0x13/0x20 [ 417.114646] mutex_lock+0xd4/0xe0 [ 417.114649] ? __pfx_mutex_lock+0x10/0x10 [ 417.114652] ? ceph_monc_renew_subs+0x2a/0x40 [ 417.114654] ceph_con_keepalive+0x22/0x110 [ 417.114656] handle_timeout+0x6b3/0x11d0 [ 417.114659] ? _raw_spin_unlock_irq+0xe/0x50 [ 417.114662] ? __pfx_handle_timeout+0x10/0x10 [ 417.114664] ? queue_delayed_work_on+0x8e/0xa0 [ 417.114669] process_one_work+0x611/0xe20 [ 417.114672] ? __kasan_check_write+0x14/0x30 [ 417.114676] worker_thread+0x7e3/0x1580 [ 417.114678] ? __pfx__raw_spin_lock_irqsave+0x10/0x10 [ 417.114682] ? __pfx_sched_setscheduler_nocheck+0x10/0x10 [ 417.114687] ? __pfx_worker_thread+0x10/0x10 [ 417.114689] kthread+0x381/0x7a0 [ 417.114692] ? __pfx__raw_spin_lock_irq+0x10/0x10 [ 417.114694] ? __pfx_kthread+0x10/0x10 [ 417.114697] ? __kasan_check_write+0x14/0x30 [ 417.114699] ? recalc_sigpending+0x160/0x220 [ 417.114703] ? _raw_spin_unlock_irq+0xe/0x50 [ 417.114705] ? calculate_sigpending+0x78/0xb0 [ 417.114707] ? __pfx_kthread+0x10/0x10 [ 417.114710] ret_from_fork+0x2b6/0x380 [ 417.114713] ? __pfx_kthread+0x10/0x10 [ 417.114715] ret_from_fork_asm+0x1a/0x30 [ 417.114720] </TASK> [ 417.125171] Allocated by task 2: [ 417.125333] kasan_save_stack+0x26/0x60 [ 417.125522] kasan_save_track+0x14/0x40 [ 417.125742] kasan_save_alloc_info+0x39/0x60 [ 417.125945] __kasan_slab_alloc+0x8b/0xb0 [ 417.126133] kmem_cache_alloc_node_noprof+0x13b/0x460 [ 417.126381] copy_process+0x320/0x6250 [ 417.126595] kernel_clone+0xb7/0x840 [ 417.126792] kernel_thread+0xd6/0x120 [ 417.126995] kthreadd+0x85c/0xbe0 [ 417.127176] ret_from_fork+0x2b6/0x380 [ 417.127378] ret_from_fork_asm+0x1a/0x30 [ 417.127692] Freed by task 0: [ 417.127851] kasan_save_stack+0x26/0x60 [ 417.128057] kasan_save_track+0x14/0x40 [ 417.128267] kasan_save_free_info+0x3b/0x60 [ 417.128491] __kasan_slab_free+0x6c/0xa0 [ 417.128708] kmem_cache_free+0x182/0x550 [ 417.128906] free_task+0xeb/0x140 [ 417.129070] __put_task_struct+0x1d2/0x4f0 [ 417.129259] __put_task_struct_rcu_cb+0x15/0x20 [ 417.129480] rcu_do_batch+0x3d3/0xe70 [ 417.129681] rcu_core+0x549/0xb30 [ 417.129839] rcu_core_si+0xe/0x20 [ 417.130005] handle_softirqs+0x160/0x570 [ 417.130190] __irq_exit_rcu+0x189/0x1e0 [ 417.130369] irq_exit_rcu+0xe/0x20 [ 417.130531] sysvec_apic_timer_interrupt+0x9f/0xd0 [ 417.130768] asm_sysvec_apic_timer_interrupt+0x1b/0x20 [ 417.131082] Last potentially related work creation: [ 417.131305] kasan_save_stack+0x26/0x60 [ 417.131484] kasan_record_aux_stack+0xae/0xd0 [ 417.131695] __call_rcu_common+0xcd/0x14b0 [ 417.131909] call_rcu+0x31/0x50 [ 417.132071] delayed_put_task_struct+0x128/0x190 [ 417.132295] rcu_do_batch+0x3d3/0xe70 [ 417.132478] rcu_core+0x549/0xb30 [ 417.132658] rcu_core_si+0xe/0x20 [ 417.132808] handle_softirqs+0x160/0x570 [ 417.132993] __irq_exit_rcu+0x189/0x1e0 [ 417.133181] irq_exit_rcu+0xe/0x20 [ 417.133353] sysvec_apic_timer_interrupt+0x9f/0xd0 [ 417.133584] asm_sysvec_apic_timer_interrupt+0x1b/0x20 [ 417.133921] Second to last potentially related work creation: [ 417.134183] kasan_save_stack+0x26/0x60 [ 417.134362] kasan_record_aux_stack+0xae/0xd0 [ 417.134566] __call_rcu_common+0xcd/0x14b0 [ 417.134782] call_rcu+0x31/0x50 [ 417.134929] put_task_struct_rcu_user+0x58/0xb0 [ 417.135143] finish_task_switch.isra.0+0x5d3/0x830 [ 417.135366] __schedule+0xd30/0x5100 [ 417.135534] schedule_idle+0x5a/0x90 [ 417.135712] do_idle+0x25f/0x410 [ 417.135871] cpu_startup_entry+0x53/0x70 [ 417.136053] start_secondary+0x216/0x2c0 [ 417.136233] common_startup_64+0x13e/0x141 [ 417.136894] The buggy address belongs to the object at ffff888124870000 which belongs to the cache task_struct of size 10504 [ 417.138122] The buggy address is located 52 bytes inside of freed 10504-byte region [ffff888124870000, ffff888124872908) [ 417.139465] The buggy address belongs to the physical page: [ 417.140016] page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x124870 [ 417.140789] head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0 pincount:0 [ 417.141519] memcg:ffff88811aa20e01 [ 417.141874] anon flags: 0x17ffffc0000040(head|node=0|zone=2|lastcpupid=0x1fffff) [ 417.142600] page_type: f5(slab) [ 417.142922] raw: 0017ffffc0000040 ffff88810094f040 0000000000000000 dead000000000001 [ 417.143554] raw: 0000000000000000 0000000000030003 00000000f5000000 ffff88811aa20e01 [ 417.143954] head: 0017ffffc0000040 ffff88810094f040 0000000000000000 dead000000000001 [ 417.144329] head: 0000000000000000 0000000000030003 00000000f5000000 ffff88811aa20e01 [ 417.144710] head: 0017ffffc0000003 ffffea0004921c01 00000000ffffffff 00000000ffffffff [ 417.145106] head: ffffffffffffffff 0000000000000000 00000000ffffffff 0000000000000008 [ 417.145485] page dumped because: kasan: bad access detected [ 417.145859] Memory state around the buggy address: [ 417.146094] ffff88812486ff00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 417.146439] ffff88812486ff80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 417.146791] >ffff888124870000: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 417.147145] ^ [ 417.147387] ffff888124870080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 417.147751] ffff888124870100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [ 417.148123] ================================================================== First of all, we have warning in get_bvec_at() because cursor->total_resid contains zero value. And, finally, we have crash in ceph_msg_data_advance() because cursor->data is NULL. It means that get_bvec_at() receives not initialized ceph_msg_data_cursor structure because data is NULL and total_resid contains zero. Moreover, we don't have likewise issue for the case of Ceph msgr1 protocol because ceph_msg_data_cursor_init() has been called before reading sparse data. This patch adds calling of ceph_msg_data_cursor_init() in the beginning of process_v2_sparse_read() with the goal to guarantee that logic of reading sparse data works correctly for the case of Ceph msgr2 protocol. Cc: stable@vger.kernel.org Link: https://tracker.ceph.com/issues/73152 Signed-off-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com> Reviewed-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2025-11-27x86/bugs: Make i386 use GENERIC_BUG_RELATIVE_POINTERSPeter Zijlstra2-11/+5
Linus figured less #ifdef is more better and making x86-32 use GENERIC_BUG_RELATIVE_POINTERS removes one layer of macro magic from the bug.h bits. Suggested-by: Linus Torvalds <torvalds@linuxfoundation.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
2025-11-27x86/bug: Fix BUG_FORMAT vs KASLRPeter Zijlstra2-3/+17
Encoding a relative NULL pointer doesn't work for KASLR, when the whole kernel image gets shifted, the __bug_table and the target string get shifted by the same amount and the relative offset is preserved. However when the target is an absolute 0 value and the __bug_table gets moved about, the end result in a pointer equivalent to kaslr_offset(), not NULL. Notably, this will generate SHN_UNDEF relocations, and Ard would really like to not have those at all. Use the empty string to denote no-string. Suggested-by: Linus Torvalds <torvalds@linuxfoundation.org> Cc: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
2025-11-27objtool: Build with disassembly can fail when including bdf.hAlexandre Chartre1-1/+1
Building objtool with disassembly support can fail when including the bdf.h file: In file included from tools/objtool/include/objtool/arch.h:108, from check.c:14: /usr/include/bfd.h:35:2: error: #error config.h must be included before this header 35 | #error config.h must be included before this header | ^~~~~ This check is present in the bfd.h file generated from the binutils source code, but it is not necessarily present in the bfd.h file provided in a binutil package (for example, it is not present in the binutil RPM). The solution to this issue is to define the PACKAGE macro before including bfd.h. This is the solution suggested by the binutil developer in bug 14243, and it is used by other kernel tools which also use bfd.h (perf and bpf). Fixes: 59953303827ec ("objtool: Disassemble code with libopcodes instead of running objdump") Closes: https://lore.kernel.org/all/3fa261fd-3b46-4cbe-b48d-7503aabc96cb@oracle.com/ Reported-by: Nathan Chancellor <nathan@kernel.org> Suggested-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Alexandre Chartre <alexandre.chartre@oracle.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://sourceware.org/bugzilla/show_bug.cgi?id=14243 Link: https://patch.msgid.link/20251126134519.1760889-1-alexandre.chartre@oracle.com
2025-11-26Merge tag 'v6.18rc7-SMB-client-fix' of git://git.samba.org/sfrench/cifs-2.6Linus Torvalds1-0/+1
Pull smb client fix from Steve French: "smb client multiuser (with cifscreds) mount fix" * tag 'v6.18rc7-SMB-client-fix' of git://git.samba.org/sfrench/cifs-2.6: smb: client: fix memory leak in cifs_construct_tcon()
2025-11-26Merge tag 'linux-can-fixes-for-6.18-20251126' of ↵Jakub Kicinski6-41/+127
git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can Marc Kleine-Budde says: ==================== pull-request: can 2025-11-26 this is a pull request of 8 patches for net/main. Seungjin Bae provides a patch for the kvaser_usb driver to fix a potential infinite loop in the USB data stream command parser. Thomas Mühlbacher's patch for the sja1000 driver IRQ handler's max loop handling, that might lead to unhandled interrupts. 3 patches by me for the gs_usb driver fix handling of failed transmit URBs and add checking of the actual length of received URBs before accessing the data. The next patch is by me and is a port of Thomas Mühlbacher's patch (fix IRQ handler's max loop handling, that might lead to unhandled interrupts.) to the sun4i_can driver. Biju Das provides a patch for the rcar_canfd driver to fix the CAN-FD mode setting. The last patch is by Shaurya Rane for the em_canid filter to ensure that the complete CAN frame is present in the linear data buffer before accessing it. * tag 'linux-can-fixes-for-6.18-20251126' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can: net/sched: em_canid: fix uninit-value in em_canid_match can: rcar_canfd: Fix CAN-FD mode as default can: sun4i_can: sun4i_can_interrupt(): fix max irq loop handling can: gs_usb: gs_usb_receive_bulk_callback(): check actual_length before accessing data can: gs_usb: gs_usb_receive_bulk_callback(): check actual_length before accessing header can: gs_usb: gs_usb_xmit_callback(): fix handling of failed transmitted URBs can: sja1000: fix max irq loop handling can: kvaser_usb: leaf: Fix potential infinite loop in command parsers ==================== Link: https://patch.msgid.link/20251126155713.217105-1-mkl@pengutronix.de Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-26net: atlantic: fix fragment overflow handling in RX pathJiefeng Zhang1-0/+5
The atlantic driver can receive packets with more than MAX_SKB_FRAGS (17) fragments when handling large multi-descriptor packets. This causes an out-of-bounds write in skb_add_rx_frag_netmem() leading to kernel panic. The issue occurs because the driver doesn't check the total number of fragments before calling skb_add_rx_frag(). When a packet requires more than MAX_SKB_FRAGS fragments, the fragment index exceeds the array bounds. Fix by assuming there will be an extra frag if buff->len > AQ_CFG_RX_HDR_SIZE, then all fragments are accounted for. And reusing the existing check to prevent the overflow earlier in the code path. This crash occurred in production with an Aquantia AQC113 10G NIC. Stack trace from production environment: ``` RIP: 0010:skb_add_rx_frag_netmem+0x29/0xd0 Code: 90 f3 0f 1e fa 0f 1f 44 00 00 48 89 f8 41 89 ca 48 89 d7 48 63 ce 8b 90 c0 00 00 00 48 c1 e1 04 48 01 ca 48 03 90 c8 00 00 00 <48> 89 7a 30 44 89 52 3c 44 89 42 38 40 f6 c7 01 75 74 48 89 fa 83 RSP: 0018:ffffa9bec02a8d50 EFLAGS: 00010287 RAX: ffff925b22e80a00 RBX: ffff925ad38d2700 RCX: fffffffe0a0c8000 RDX: ffff9258ea95bac0 RSI: ffff925ae0a0c800 RDI: 0000000000037a40 RBP: 0000000000000024 R08: 0000000000000000 R09: 0000000000000021 R10: 0000000000000848 R11: 0000000000000000 R12: ffffa9bec02a8e24 R13: ffff925ad8615570 R14: 0000000000000000 R15: ffff925b22e80a00 FS: 0000000000000000(0000) GS:ffff925e47880000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffff9258ea95baf0 CR3: 0000000166022004 CR4: 0000000000f72ef0 PKRU: 55555554 Call Trace: <IRQ> aq_ring_rx_clean+0x175/0xe60 [atlantic] ? aq_ring_rx_clean+0x14d/0xe60 [atlantic] ? aq_ring_tx_clean+0xdf/0x190 [atlantic] ? kmem_cache_free+0x348/0x450 ? aq_vec_poll+0x81/0x1d0 [atlantic] ? __napi_poll+0x28/0x1c0 ? net_rx_action+0x337/0x420 ``` Fixes: 6aecbba12b5c ("net: atlantic: add check for MAX_SKB_FRAGS") Changes in v4: - Add Fixes: tag to satisfy patch validation requirements. Changes in v3: - Fix by assuming there will be an extra frag if buff->len > AQ_CFG_RX_HDR_SIZE, then all fragments are accounted for. Signed-off-by: Jiefeng Zhang <jiefeng.z.zhang@gmail.com> Link: https://patch.msgid.link/20251126032249.69358-1-jiefeng.z.zhang@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-26MAINTAINERS: separate VIRTIO NET DRIVER and add netdevJon Kohler1-3/+14
Changes to virtio network stack should be cc'd to netdev DL, separate it into its own group to add netdev in addition to virtualization DL. Signed-off-by: Jon Kohler <jon@nutanix.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Link: https://patch.msgid.link/20251126015750.2200267-1-jon@nutanix.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-26virtio-net: avoid unnecessary checksum calculation on guest RXJon Kohler3-5/+7
Commit a2fb4bc4e2a6 ("net: implement virtio helpers to handle UDP GSO tunneling.") inadvertently altered checksum offload behavior for guests not using UDP GSO tunneling. Before, tun_put_user called tun_vnet_hdr_from_skb, which passed has_data_valid = true to virtio_net_hdr_from_skb. After, tun_put_user began calling tun_vnet_hdr_tnl_from_skb instead, which passes has_data_valid = false into both call sites. This caused virtio hdr flags to not include VIRTIO_NET_HDR_F_DATA_VALID for SKBs where skb->ip_summed == CHECKSUM_UNNECESSARY. As a result, guests are forced to recalculate checksums unnecessarily. Restore the previous behavior by ensuring has_data_valid = true is passed in the !tnl_gso_type case, but only from tun side, as virtio_net_hdr_tnl_from_skb() is used also by the virtio_net driver, which in turn must not use VIRTIO_NET_HDR_F_DATA_VALID on tx. cc: stable@vger.kernel.org Fixes: a2fb4bc4e2a6 ("net: implement virtio helpers to handle UDP GSO tunneling.") Signed-off-by: Jon Kohler <jon@nutanix.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Link: https://patch.msgid.link/20251125222754.1737443-1-jon@nutanix.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-26amd-xgbe: let the MAC manage PHY PMRaju Rangoju2-5/+10
Use the MAC managed PM flag to indicate that MAC driver takes care of suspending/resuming the PHY, and reset it when the device is brought up. Signed-off-by: Raju Rangoju <Raju.Rangoju@amd.com> Link: https://patch.msgid.link/20251123163721.442162-1-Raju.Rangoju@amd.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-26eth: fbnic: Fix counter roll-over issueMohsin Bashir1-1/+1
Fix a potential counter roll-over issue in fbnic_mbx_alloc_rx_msgs() when calculating descriptor slots. The issue occurs when head - tail results in a large positive value (unsigned) and the compiler interprets head - tail - 1 as a signed value. Since FBNIC_IPC_MBX_DESC_LEN is a power of two, use a masking operation, which is a common way of avoiding this problem when dealing with these sort of ring space calculations. Fixes: da3cde08209e ("eth: fbnic: Add FW communication mechanism") Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com> Link: https://patch.msgid.link/20251125211704.3222413-1-mohsin.bashr@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-26mptcp: clear scheduled subflows on retransmitPaolo Abeni1-2/+11
When __mptcp_retrans() kicks-in, it schedules one or more subflows for retransmission, but such subflows could be actually left alone if there is no more data to retransmit and/or in case of concurrent fallback. Scheduled subflows could be processed much later in time, i.e. when new data will be transmitted, leading to bad subflow selection. Explicitly clear all scheduled subflows before leaving the retransmission function. Fixes: ee2708aedad0 ("mptcp: use get_retrans wrapper") Cc: stable@vger.kernel.org Reported-by: Filip Pokryvka <fpokryvk@redhat.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Link: https://patch.msgid.link/20251125-net-mptcp-clear-sched-rtx-v1-1-1cea4ad2165f@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-26Merge branch ↵Jakub Kicinski6-43/+272
'net-hibmcge-add-support-for-tracepoint-and-pagepool-on-hibmcge-driver' Jijie Shao says: ==================== net: hibmcge: Add support for tracepoint and pagepool on hibmcge driver In this patch set: 1: add support for tracepoint for rx descriptor 2: double the rx queue depth to reduce packet drop 3: add support for pagepool on rx ==================== Link: https://patch.msgid.link/20251122034657.3373143-1-shaojijie@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-26net: hibmcge: add support for pagepool on rxJijie Shao3-33/+142
add support for pagepool on rx, and remove the legacy path Signed-off-by: Jijie Shao <shaojijie@huawei.com> Link: https://patch.msgid.link/20251122034657.3373143-4-shaojijie@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-26net: hibmcge: reduce packet drop under stress testingJijie Shao1-10/+37
Under stress test scenarios, hibmcge driver may not receive packets in a timely manner, which can lead to the buffer of the hardware queue being exhausted, resulting in packet drop. This patch doubles the software queue depth and uses half of the buffer to fill the hardware queue before receiving packets, thus preventing packet loss caused by the hardware queue buffer being exhausted. Signed-off-by: Jijie Shao <shaojijie@huawei.com> Link: https://patch.msgid.link/20251122034657.3373143-3-shaojijie@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-26net: hibmcge: add support for tracepoint to dump some fields of rx_descTao Lan4-0/+93
add support for tracepoint to dump some fields of rx_desc Signed-off-by: Tao Lan <lantao5@huawei.com> Signed-off-by: Jijie Shao <shaojijie@huawei.com> Link: https://patch.msgid.link/20251122034657.3373143-2-shaojijie@huawei.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-26net: dsa: sja1105: fix SGMII linking at 10M or 100M but not passing trafficVladimir Oltean1-7/+0
When using the SGMII PCS as a fixed-link chip-to-chip connection, it is easy to miss the fact that traffic passes only at 1G, since that's what any normal such connection would use. When using the SGMII PCS connected towards an on-board PHY or an SFP module, it is immediately noticeable that when the link resolves to a speed other than 1G, traffic from the MAC fails to pass: TX counters increase, but nothing gets decoded by the other end, and no local RX counters increase either. Artificially lowering a fixed-link rate to speed = <100> makes us able to see the same issue as in the case of having an SGMII PHY. Some debugging shows that the XPCS configuration is A-OK, but that the MAC Configuration Table entry for the port has the SPEED bits still set to 1000Mbps, due to a special condition in the driver. Deleting that condition, and letting the resolved link speed be programmed directly into the MAC speed field, results in a functional link at all 3 speeds. This piece of evidence, based on testing on both generations with SGMII support (SJA1105S and SJA1110A) directly contradicts the statement from the blamed commit that "the MAC is fixed at 1 Gbps and we need to configure the PCS only (if even that)". Worse, that statement is not backed by any documentation, and no one from NXP knows what it might refer to. I am unable to recall sufficient context regarding my testing from March 2020 to understand what led me to draw such a braindead and factually incorrect conclusion. Yet, there is nothing of value regarding forcing the MAC speed, either for SGMII or 2500Base-X (introduced at a later stage), so remove all such logic. Fixes: ffe10e679cec ("net: dsa: sja1105: Add support for the SGMII port") Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20251122111324.136761-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-26net: fman_memac: report structured ethtool countersVladimir Oltean3-0/+146
The FMan driver has support for 2 MACs: mEMAC (newer, present on Layerscape and PowerPC T series) and dTSEC/TGEC (older, present on PowerPC P series). I only have handy access to the mEMAC, and this adds support for MAC counters for those platforms. MAC counters are necessary for any kind of low-level debugging, and currently there is no mechanism to dump them. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20251122115931.151719-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-26net: dpaa: fman_memac: complete phylink support with 2500base-xVladimir Oltean1-0/+4
The DPAA phylink conversion in the following commits partially developed code for handling the 2500base-x host interface mode (called "2.5G SGMII" in LS1043A/LS1046A reference manuals). - 0fc83bd79589 ("net: fman: memac: Add serdes support") - 5d93cfcf7360 ("net: dpaa: Convert to phylink") In principle, having phy-interface-mode = "2500base-x" and a pcsphy-handle (unnamed or with pcs-handle-names = "sgmii") in the MAC device tree node results in PHY_INTERFACE_MODE_2500BASEX being set in phylink_config :: supported_interfaces, but this isn't sufficient. Because memac_select_pcs() returns no PCS for PHY_INTERFACE_MODE_2500BASEX, the Lynx PCS code never engages. There's a chance the PCS driver doesn't have any configuration to change for 2500base-x fixed-link (based on bootloader pre-initialization), but there's an even higher chance that this is not the case, and the PCS remains misconfigured. More importantly, memac_if_mode() does not handle PHY_INTERFACE_MODE_2500BASEX, and it should be telling the mEMAC to configure itself in GMII mode (which is upclocked by the PCS). Currently it prints a WARN_ON() and returns zero, aka IF_MODE_10G (incorrect). The additional case statement in memac_prepare() for calling phy_set_mode_ext() does not make any difference, because there is no generic PHY driver for the Lynx 10G SerDes from LS1043A/LS1046A. But we add it nonetheless, for consistency. Regarding the question "did 2500base-x ever work with the FMan mEMAC mainline code prior to the phylink conversion?" - the answer is more nuanced. For context, the previous phylib-based implementation was unable to describe the fixed-link speed as 2500, because the software PHY implementation is limited to 1G. However, improperly describing the link as an sgmii fixed-link with speed = <1000> would have resulted in a functional 2.5G speed, because there is no other difference than the SerDes lane clock net frequency (3.125 GHz for 2500base-x) - all the other higher-level settings are the same, and the SerDes lane frequency is currently handled by the RCW. But this hack cannot be extended towards a phylib PHY such as Aquantia operating in OCSGMII, because the latter requires phy-mode = "2500base-x", which the mEMAC driver did not support prior to the phylink conversion. So I do not really consider this a regression, just completing support for a missing feature. The FMan mEMAC driver sets phylink's "default_an_inband" property to true, making it as if the device tree node had the managed = "in-band-status" property anyway. This default made sense for SGMII, where it was added to avoid regressions, but for 2500base-x we learned only recently how to enable in-band autoneg: https://lore.kernel.org/netdev/20251122113433.141930-1-vladimir.oltean@nxp.com/ so the driver needs to opt out of this default in-band enabled behaviour, and only enable in-band based on the device tree property. Suggested-by: Russell King (Oracle) <linux@armlinux.org.uk> Link: https://lore.kernel.org/netdev/aIyx0OLWGw5zKarX@shell.armlinux.org.uk/#t Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20251122115523.150260-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-26net: phy: dp83867: implement configurability for SGMII in-band auto-negotiationVladimir Oltean1-7/+29
Implement the inband_caps() and config_inband() PHY driver methods, to allow working with PCS devices that do not support or want in-band to be used. There is a complication due to existing logic from commit c76acfb7e19d ("net: phy: dp83867: retrigger SGMII AN when link change") which might re-enable what dp83867_config_inband() has disabled. So we need to modify dp83867_link_change_notify() to use phy_modify_changed() when temporarily disabling in-band autoneg. If the return code is 0, it means the original in-band was disabled and we need to keep it disabled. If the return code is 1, the original was enabled and we need to re-enable it. If negative, there was an error, which was silent before, and remains silent now. dp83867_config_inband() and dp83867_link_change_notify() are serialized by the phydev->lock. Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com> Link: https://patch.msgid.link/20251122110427.133035-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-26tools: ynl: add YNL test frameworkHangbin Liu5-2/+593
Add a test framework for YAML Netlink (YNL) tools, covering both CLI and ethtool functionality. The framework includes: 1) cli: family listing, netdev, ethtool, rt-* families, and nlctrl operations 2) ethtool: device info, statistics, ring/coalesce/pause parameters, and feature gettings The current YNL syntax is a bit obscure, and end users may not always know how to use it. This test framework provides usage examples and also serves as a regression test to catch potential breakages caused by future changes. Reviewed-by: Donald Hunter <donald.hunter@gmail.com> Acked-by: Matthieu Baerts (NGI0) <matttbe@kernel.org> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Link: https://patch.msgid.link/20251124022055.33389-1-liuhangbin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-26netlink: specs: add big-endian byte-order for u32 IPv4 addressesHangbin Liu2-0/+3
The fix commit converted several IPv4 address attributes from binary to u32, but forgot to specify byte-order: big-endian. Without this, YNL tools display IPv4 addresses incorrectly due to host-endian interpretation. Add the missing byte-order: big-endian to all affected u32 IPv4 address fields to ensure correct parsing and display. Fixes: 1064d521d177 ("netlink: specs: support ipv4-or-v6 for dual-stack fields") Reported-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Reviewed-by: Asbjørn Sloth Tønnesen <ast@fiberby.net> Link: https://patch.msgid.link/20251125112048.37631-1-liuhangbin@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-26s390/net: list Aswin Karuvally as maintainerAlexandra Winter1-1/+1
Thank you Aswin for taking this responsibility. Signed-off-by: Alexandra Winter <wintera@linux.ibm.com> Acked-by: Aswin Karuvally <aswin@linux.ibm.com> Link: https://patch.msgid.link/20251125085829.3679506-1-wintera@linux.ibm.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-26Merge branch 'net-intel-migrate-to-get_rx_ring_count-ethtool-callback'Jakub Kicinski8-47/+86
Breno Leitao says: ==================== net: intel: migrate to .get_rx_ring_count() ethtool callback This series migrates Intel network drivers to use the new .get_rx_ring_count() ethtool callback introduced in commit 84eaf4359c36 ("net: ethtool: add get_rx_ring_count callback to optimize RX ring queries"). The new callback simplifies the .get_rxnfc() implementation by removing ETHTOOL_GRXRINGS handling and moving it to a dedicated callback. This provides a cleaner separation of concerns and aligns these drivers with the modern ethtool API. The series updates the following Intel drivers: - idpf - igb - igc - ixgbevf - fm10k ==================== Link: https://patch.msgid.link/20251125-gxring_intel-v2-0-f55cd022d28b@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-26fm10k: extract GRXRINGS from .get_rxnfcBreno Leitao1-14/+3
Commit 84eaf4359c36 ("net: ethtool: add get_rx_ring_count callback to optimize RX ring queries") added specific support for GRXRINGS callback, simplifying .get_rxnfc. Remove the handling of GRXRINGS in .get_rxnfc() by moving it to the new .get_rx_ring_count(). This simplifies the RX ring count retrieval and aligns fm10k with the new ethtool API for querying RX ring parameters. Signed-off-by: Breno Leitao <leitao@debian.org> Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Link: https://patch.msgid.link/20251125-gxring_intel-v2-8-f55cd022d28b@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-26ixgbevf: extract GRXRINGS from .get_rxnfcBreno Leitao1-11/+3
Commit 84eaf4359c36 ("net: ethtool: add get_rx_ring_count callback to optimize RX ring queries") added specific support for GRXRINGS callback, simplifying .get_rxnfc. Remove the handling of GRXRINGS in .get_rxnfc() by moving it to the new .get_rx_ring_count(). This simplifies the RX ring count retrieval and aligns ixgbevf with the new ethtool API for querying RX ring parameters. Signed-off-by: Breno Leitao <leitao@debian.org> Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Link: https://patch.msgid.link/20251125-gxring_intel-v2-7-f55cd022d28b@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-26igc: extract GRXRINGS from .get_rxnfcBreno Leitao1-3/+8
Commit 84eaf4359c36 ("net: ethtool: add get_rx_ring_count callback to optimize RX ring queries") added specific support for GRXRINGS callback, simplifying .get_rxnfc. Remove the handling of GRXRINGS in .get_rxnfc() by moving it to the new .get_rx_ring_count(). This simplifies the RX ring count retrieval and aligns igc with the new ethtool API for querying RX ring parameters. Signed-off-by: Breno Leitao <leitao@debian.org> Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Link: https://patch.msgid.link/20251125-gxring_intel-v2-6-f55cd022d28b@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-26igb: extract GRXRINGS from .get_rxnfcBreno Leitao1-4/+8
Commit 84eaf4359c36 ("net: ethtool: add get_rx_ring_count callback to optimize RX ring queries") added specific support for GRXRINGS callback, simplifying .get_rxnfc. Remove the handling of GRXRINGS in .get_rxnfc() by moving it to the new .get_rx_ring_count(). This simplifies the RX ring count retrieval and aligns igb with the new ethtool API for querying RX ring parameters. Signed-off-by: Breno Leitao <leitao@debian.org> Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Link: https://patch.msgid.link/20251125-gxring_intel-v2-5-f55cd022d28b@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-26idpf: extract GRXRINGS from .get_rxnfcBreno Leitao1-3/+20
Commit 84eaf4359c36 ("net: ethtool: add get_rx_ring_count callback to optimize RX ring queries") added specific support for GRXRINGS callback, simplifying .get_rxnfc. Remove the handling of GRXRINGS in .get_rxnfc() by moving it to the new .get_rx_ring_count(). This simplifies the RX ring count retrieval and aligns idpf with the new ethtool API for querying RX ring parameters. I was not totally convinced I needed to have the lock, but, I decided to be on the safe side and get the exact same behaviour it was before. Signed-off-by: Breno Leitao <leitao@debian.org> Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Link: https://patch.msgid.link/20251125-gxring_intel-v2-4-f55cd022d28b@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-26ice: extract GRXRINGS from .get_rxnfcBreno Leitao1-4/+15
Commit 84eaf4359c36 ("net: ethtool: add get_rx_ring_count callback to optimize RX ring queries") added specific support for GRXRINGS callback, simplifying .get_rxnfc. Remove the handling of GRXRINGS in .get_rxnfc() by moving it to the new .get_rx_ring_count(). This simplifies the RX ring count retrieval and aligns ice with the new ethtool API for querying RX ring parameters. Signed-off-by: Breno Leitao <leitao@debian.org> Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Link: https://patch.msgid.link/20251125-gxring_intel-v2-3-f55cd022d28b@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-26iavf: extract GRXRINGS from .get_rxnfcBreno Leitao1-4/+14
Commit 84eaf4359c36 ("net: ethtool: add get_rx_ring_count callback to optimize RX ring queries") added specific support for GRXRINGS callback, simplifying .get_rxnfc. Remove the handling of GRXRINGS in .get_rxnfc() by moving it to the new .get_rx_ring_count(). This simplifies the RX ring count retrieval and aligns iavf with the new ethtool API for querying RX ring parameters. Signed-off-by: Breno Leitao <leitao@debian.org> Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Link: https://patch.msgid.link/20251125-gxring_intel-v2-2-f55cd022d28b@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-26i40e: extract GRXRINGS from .get_rxnfcBreno Leitao1-4/+15
Commit 84eaf4359c36 ("net: ethtool: add get_rx_ring_count callback to optimize RX ring queries") added specific support for GRXRINGS callback, simplifying .get_rxnfc. Remove the handling of GRXRINGS in .get_rxnfc() by moving it to the new .get_rx_ring_count(). This simplifies the RX ring count retrieval and aligns i40e with the new ethtool API for querying RX ring parameters. Signed-off-by: Breno Leitao <leitao@debian.org> Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com> Link: https://patch.msgid.link/20251125-gxring_intel-v2-1-f55cd022d28b@debian.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-26Merge branch 'unify-platform-suspend-resume-routines-for-pci-dwmac-glue'Jakub Kicinski6-68/+76
Yao Zi says: ==================== Unify platform suspend/resume routines for PCI DWMAC glue There are currently three PCI-based DWMAC glue drivers in tree, stmmac_pci.c, dwmac-intel.c, and dwmac-loongson.c. Both stmmac_pci.c and dwmac-intel.c implements the same and duplicated platform suspend/resume routines. This series introduces a new PCI helper library, stmmac_libpci.c, providing a pair of helpers, stmmac_pci_plat_{suspend,resume}, and replaces the driver-specific implementation with the helpers to reduce code duplication. The helper will also simplify the Motorcomm DWMAC glue driver which I'm working on. The glue driver for Intel controllers isn't covered by the series, since its suspend routine doesn't call pci_disable_device() and thus is a little different from the new generic helpers. I only have Loongson hardware on hand, thus the series is only tested on Loongson 3A5000 machine. I could confirm the controller works after resume, and WoL works as expected. This shouldn't break stmmac_pci.c, either, since the new helpers have the exactly same code as the old driver-specific suspend/resume hooks. ==================== Link: https://patch.msgid.link/20251124160417.51514-1-ziyao@disroot.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-26net: stmmac: pci: Use generic PCI suspend/resume routinesYao Zi2-34/+5
Convert STMMAC PCI glue driver to use the generic platform suspend/resume routines for PCI controllers, instead of implementing its own one. Signed-off-by: Yao Zi <ziyao@disroot.org> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Yanteng Si <siyanteng@cqsoftware.com.cn> Link: https://patch.msgid.link/20251124160417.51514-4-ziyao@disroot.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-26net: stmmac: loongson: Use generic PCI suspend/resume routinesYao Zi2-34/+5
Convert glue driver for Loongson DWMAC controller to use the generic platform suspend/resume routines for PCI controllers, instead of implementing its own one. Signed-off-by: Yao Zi <ziyao@disroot.org> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Acked-by: Yanteng Si <siyanteng@cqsoftware.com.cn> Link: https://patch.msgid.link/20251124160417.51514-3-ziyao@disroot.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-26net: stmmac: Add generic suspend/resume helper for PCI-based controllersYao Zi4-0/+66
Most glue driver for PCI-based DWMAC controllers utilize similar platform suspend/resume routines. Add a generic implementation to reduce duplicated code. Signed-off-by: Yao Zi <ziyao@disroot.org> Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Yanteng Si <siyanteng@cqsoftware.com.cn> Link: https://patch.msgid.link/20251124160417.51514-2-ziyao@disroot.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-26net: wwan: mhi: Keep modem name match with Foxconn T99W640Slark Xiao1-1/+1
Correct it since M.2 device T99W640 has updated from T99W515. We need to align it with MHI side otherwise this modem can't get the network. Fixes: ae5a34264354 ("bus: mhi: host: pci_generic: Fix the modem name of Foxconn T99W640") Signed-off-by: Slark Xiao <slark_xiao@163.com> Reviewed-by: Loic Poulain <loic.poulain@oss.qualcomm.com> Link: https://patch.msgid.link/20251125070900.33324-1-slark_xiao@163.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-26Merge branch 'add-hwtstamp_get-callback-to-phy-drivers'Jakub Kicinski12-51/+139
Vadim Fedorenko says: ==================== add hwtstamp_get callback to phy drivers PHY drivers are able to configure HW time stamping and are not able to report configuration back to user space. Add callback to report configuration like it's done for net_device and add implementation to the drivers. ==================== Link: https://patch.msgid.link/20251124181151.277256-1-vadim.fedorenko@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-26ptp: ptp_ines: add HW timestamp configuration reportingVadim Fedorenko1-0/+23
The driver partially stores HW timestamping configuration, but missing pieces can be read from HW. Add callback to report configuration. Reviewed-by: Kory Maincent <kory.maincent@bootlin.com> Signed-off-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Link: https://patch.msgid.link/20251124181151.277256-8-vadim.fedorenko@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-26net: phy: nxp-c45-tja11xx: add HW timestamp configuration reportingVadim Fedorenko1-0/+14
The driver stores HW timestamping configuration and can technically report it. Add callback to do it. Reviewed-by: Kory Maincent <kory.maincent@bootlin.com> Signed-off-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Link: https://patch.msgid.link/20251124181151.277256-7-vadim.fedorenko@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-26phy: mscc: add HW timestamp configuration reportingVadim Fedorenko1-0/+13
The driver stores HW configuration and can technically report it. Add callback to do it. Reviewed-by: Kory Maincent <kory.maincent@bootlin.com> Signed-off-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Link: https://patch.msgid.link/20251124181151.277256-6-vadim.fedorenko@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-26net: phy: dp83640: add HW timestamp configuration reportingVadim Fedorenko1-4/+17
The driver stores configuration of TX timestamping and can technically report it. Patch RX timestamp configuration storage to be more precise on reporting and add callback to actually report it. Reviewed-by: Kory Maincent <kory.maincent@bootlin.com> Signed-off-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Link: https://patch.msgid.link/20251124181151.277256-5-vadim.fedorenko@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-26net: phy: broadcom: add HW timestamp configuration reportingVadim Fedorenko1-0/+13
The driver stores configuration information and can technically report it. Implement hwtstamp_get callback to report the configuration. Reviewed-by: Kory Maincent <kory.maincent@bootlin.com> Signed-off-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com> Link: https://patch.msgid.link/20251124181151.277256-4-vadim.fedorenko@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-26phy: add hwtstamp_get callback to phy driversVadim Fedorenko3-4/+13
PHY devices had lack of hwtstamp_get callback even though most of them are tracking configuration info. Introduce new call back to mii_timestamper. Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Kory Maincent <kory.maincent@bootlin.com> Signed-off-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Link: https://patch.msgid.link/20251124181151.277256-3-vadim.fedorenko@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-26phy: rename hwtstamp callback to hwtstamp_setVadim Fedorenko11-43/+46
PHY devices has hwtstamp callback which actually performs set operation. Rename it to better reflect the action. Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> Reviewed-by: Kory Maincent <kory.maincent@bootlin.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: Vadim Fedorenko <vadim.fedorenko@linux.dev> Link: https://patch.msgid.link/20251124181151.277256-2-vadim.fedorenko@linux.dev Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-26selftests/net: packetdrill: pass send_omit_free to MSG_ZEROCOPY testsWillem de Bruijn12-0/+29
The --send_omit_free flag is needed for TCP zero copy tests, to ensure that packetdrill doesn't free the send() buffer after the send() call. Fixes: 1e42f73fd3c2 ("selftests/net: packetdrill: import tcp/zerocopy") Closes: https://lore.kernel.org/netdev/20251124071831.4cbbf412@kernel.org/ Suggested-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Willem de Bruijn <willemb@google.com> Link: https://patch.msgid.link/20251125234029.1320984-1-willemdebruijn.kernel@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-26selftests/net: initialize char variable to nullAnkit Khushwaha2-2/+2
char variable in 'so_txtime.c' & 'txtimestamp.c' were left uninitilized when switch default case taken. which raises following warning. txtimestamp.c:240:2: warning: variable 'tsname' is used uninitialized whenever switch default is taken [-Wsometimes-uninitialized] so_txtime.c:210:3: warning: variable 'reason' is used uninitialized whenever switch default is taken [-Wsometimes-uninitialized] initializing these variables to NULL to fix this. Signed-off-by: Ankit Khushwaha <ankitkhushwaha.linux@gmail.com> Link: https://patch.msgid.link/20251125165302.20079-1-ankitkhushwaha.linux@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-26vhost: rewind next_avail_head while discarding descriptorsJason Wang3-36/+103
When discarding descriptors with IN_ORDER, we should rewind next_avail_head otherwise it would run out of sync with last_avail_idx. This would cause driver to report "id X is not a head". Fixing this by returning the number of descriptors that is used for each buffer via vhost_get_vq_desc_n() so caller can use the value while discarding descriptors. Fixes: 67a873df0c41 ("vhost: basic in order support") Cc: stable@vger.kernel.org Signed-off-by: Jason Wang <jasowang@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Link: https://patch.msgid.link/20251120022950.10117-1-jasowang@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2025-11-26libceph: drop started parameter of __ceph_open_session()Ilya Dryomov3-6/+4
With the previous commit revamping the timeout handling, started isn't used anymore. It could be taken into account by adjusting the initial value of the timeout, but there is little point as both callers capture the timestamp shortly before calling __ceph_open_session() -- the only thing of note that happens in the interim is taking client->mount_mutex and that isn't expected to take multiple seconds. Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
2025-11-26libceph: fix potential use-after-free in have_mon_and_osd_map()Ilya Dryomov2-25/+42
The wait loop in __ceph_open_session() can race with the client receiving a new monmap or osdmap shortly after the initial map is received. Both ceph_monc_handle_map() and handle_one_map() install a new map immediately after freeing the old one kfree(monc->monmap); monc->monmap = monmap; ceph_osdmap_destroy(osdc->osdmap); osdc->osdmap = newmap; under client->monc.mutex and client->osdc.lock respectively, but because neither is taken in have_mon_and_osd_map() it's possible for client->monc.monmap->epoch and client->osdc.osdmap->epoch arms in client->monc.monmap && client->monc.monmap->epoch && client->osdc.osdmap && client->osdc.osdmap->epoch; condition to dereference an already freed map. This happens to be reproducible with generic/395 and generic/397 with KASAN enabled: BUG: KASAN: slab-use-after-free in have_mon_and_osd_map+0x56/0x70 Read of size 4 at addr ffff88811012d810 by task mount.ceph/13305 CPU: 2 UID: 0 PID: 13305 Comm: mount.ceph Not tainted 6.14.0-rc2-build2+ #1266 ... Call Trace: <TASK> have_mon_and_osd_map+0x56/0x70 ceph_open_session+0x182/0x290 ceph_get_tree+0x333/0x680 vfs_get_tree+0x49/0x180 do_new_mount+0x1a3/0x2d0 path_mount+0x6dd/0x730 do_mount+0x99/0xe0 __do_sys_mount+0x141/0x180 do_syscall_64+0x9f/0x100 entry_SYSCALL_64_after_hwframe+0x76/0x7e </TASK> Allocated by task 13305: ceph_osdmap_alloc+0x16/0x130 ceph_osdc_init+0x27a/0x4c0 ceph_create_client+0x153/0x190 create_fs_client+0x50/0x2a0 ceph_get_tree+0xff/0x680 vfs_get_tree+0x49/0x180 do_new_mount+0x1a3/0x2d0 path_mount+0x6dd/0x730 do_mount+0x99/0xe0 __do_sys_mount+0x141/0x180 do_syscall_64+0x9f/0x100 entry_SYSCALL_64_after_hwframe+0x76/0x7e Freed by task 9475: kfree+0x212/0x290 handle_one_map+0x23c/0x3b0 ceph_osdc_handle_map+0x3c9/0x590 mon_dispatch+0x655/0x6f0 ceph_con_process_message+0xc3/0xe0 ceph_con_v1_try_read+0x614/0x760 ceph_con_workfn+0x2de/0x650 process_one_work+0x486/0x7c0 process_scheduled_works+0x73/0x90 worker_thread+0x1c8/0x2a0 kthread+0x2ec/0x300 ret_from_fork+0x24/0x40 ret_from_fork_asm+0x1a/0x30 Rewrite the wait loop to check the above condition directly with client->monc.mutex and client->osdc.lock taken as appropriate. While at it, improve the timeout handling (previously mount_timeout could be exceeded in case wait_event_interruptible_timeout() slept more than once) and access client->auth_err under client->monc.mutex to match how it's set in finish_auth(). monmap_show() and osdmap_show() now take the respective lock before accessing the map as well. Cc: stable@vger.kernel.org Reported-by: David Howells <dhowells@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com> Reviewed-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
2025-11-26Merge tag 'trace-ringbuffer-v6.18-rc7' of ↵Linus Torvalds1-0/+10
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace Pull ring-buffer fix from Steven Rostedt: - Do not allow mmapped ring buffer to be split When the ring buffer VMA is split by a partial munmap or a MAP_FIXED, the kernel calls vm_ops->close() on each portion. This causes the ring_buffer_unmap() to be called multiple times. This causes subsequent calls to return -ENODEV and triggers a warning. There's no reason to allow user space to split up memory mapping of the ring buffer. Have it return -EINVAL when that happens. * tag 'trace-ringbuffer-v6.18-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: tracing: Fix WARN_ON in tracing_buffers_mmap_close for split VMAs
2025-11-26kbuild: add target to build a cpio containing modulesSascha Hauer1-0/+20
Add a new package target to build a cpio archive containing the kernel modules. This is particularly useful to supplement an existing initramfs with the kernel modules so that the root filesystem can be started with all needed kernel modules without modifying it. Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de> Reviewed-by: Simon Glass <sjg@chromium.org> Tested-by: Simon Glass <sjg@chromium.org> Co-developed-by: Ahmad Fatoum <a.fatoum@pengutronix.de> Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de> Reviewed-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Tested-by: Nathan Chancellor <nathan@kernel.org> Tested-by: Nicolas Schier <nsc@kernel.org> Link: https://patch.msgid.link/20251125-cpio-modules-pkg-v2-2-aa8277d89682@pengutronix.de Signed-off-by: Nicolas Schier <nsc@kernel.org>
2025-11-26initramfs: add gen_init_cpio to hostprogs unconditionallyAhmad Fatoum1-2/+2
gen_init_cpio is currently only needed when an initramfs cpio archive is to be created out of CONFIG_INITRAMFS_SOURCE's contents. In other cases, it's not added to hostprogs and no make target is available. In preparation to use the host program from Makefile.package, define it unconditionally. The program will still only be built as needed. Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de> Reviewed-by: Simon Glass <sjg@chromium.org> Reviewed-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de> Reviewed-by: Nathan Chancellor <nathan@kernel.org> Tested-by: Nathan Chancellor <nathan@kernel.org> Tested-by: Nicolas Schier <nsc@kernel.org> Link: https://patch.msgid.link/20251125-cpio-modules-pkg-v2-1-aa8277d89682@pengutronix.de Signed-off-by: Nicolas Schier <nsc@kernel.org>
2025-11-26dma-direct: Fix missing sg_dma_len assignment in P2PDMA bus mappingsPranjal Shrivastava1-0/+1
Prior to commit a25e7962db0d7 ("PCI/P2PDMA: Refactor the p2pdma mapping helpers"), P2P segments were mapped using the pci_p2pdma_map_segment() helper. This helper was responsible for populating sg->dma_address, marking the bus address, and also setting sg_dma_len(sg). The refactor[1] removed this helper and moved the mapping logic directly into the callers. While iommu_dma_map_sg() was correctly updated to set the length in the new flow, it was missed in dma_direct_map_sg(). Thus, in dma_direct_map_sg(), the PCI_P2PDMA_MAP_BUS_ADDR case sets the dma_address and marks the segment, but immediately executes 'continue', which causes the loop to skip the standard assignment logic at the end: sg_dma_len(sg) = sg->length; As a result, when CONFIG_NEED_SG_DMA_LENGTH is enabled, the dma_length field remains uninitialized (zero) for P2P bus address mappings. This breaks upper-layer drivers (for e.g. RDMA/IB) that rely on sg_dma_len() to determine the transfer size. Fix this by explicitly setting the DMA length in the PCI_P2PDMA_MAP_BUS_ADDR case before continuing to the next scatterlist entry. Fixes: a25e7962db0d7 ("PCI/P2PDMA: Refactor the p2pdma mapping helpers") Reported-by: Jacob Moroni <jmoroni@google.com> Signed-off-by: Pranjal Shrivastava <praan@google.com> [1] https://lore.kernel.org/all/ac14a0e94355bf898de65d023ccf8a2ad22a3ece.1746424934.git.leon@kernel.org/ Reviewed-by: Logan Gunthorpe <logang@deltatee.com> Reviewed-by: Leon Romanovsky <leonro@nvidia.com> Reviewed-by: Shivaji Kant <shivajikant@google.com> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> Link: https://lore.kernel.org/r/20251126114112.3694469-1-praan@google.com
2025-11-26Merge tag 'mm-hotfixes-stable-2025-11-26-11-51' of ↵Linus Torvalds8-36/+63
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm Pull misc fixes from Andrew Morton: "8 hotfixes. 4 are cc:stable, 7 are against mm/. All are singletons - please see the respective changelogs for details" * tag 'mm-hotfixes-stable-2025-11-26-11-51' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: mm/filemap: fix logic around SIGBUS in filemap_map_pages() mm/huge_memory: fix NULL pointer deference when splitting folio MAINTAINERS: add test_kho to KHO's entry mailmap: add entry for Sam Protsenko selftests/mm: fix division-by-zero in uffd-unit-tests mm/mmap_lock: reset maple state on lock_vma_under_rcu() retry mm/memfd: fix information leak in hugetlb folios mm: swap: remove duplicate nr_swap_pages decrement in get_swap_page_of_type()
2025-11-26Fix Intel Dollar Cove TI battery driver 32-bit build errorLinus Torvalds1-4/+6
The driver is doing a 64-bit divide, rather than using the proper helpers, causing link errors on i386 allyesconfig builds: x86_64-linux-ld: drivers/power/supply/intel_dc_ti_battery.o: in function `dc_ti_battery_get_voltage_and_current_now': intel_dc_ti_battery.c:(.text+0x5c): undefined reference to `__udivdi3' x86_64-linux-ld: intel_dc_ti_battery.c:(.text+0x96): undefined reference to `__udivdi3' and while fixing that, fix the double rounding: keep the timing difference in nanoseconds ('ktime'), and then just convert to usecs at the end. Not because the timing precision is likely to matter, but because doing it right also makes the code simpler. Reported-by: Guenter Roeck <linux@roeck-us.net> Cc: Hans de Goede <hansg@kernel.org> Cc: Sebastian Reichel <sebastian.reichel@collabora.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2025-11-26Increase the default 32-bit build frame size warning limit to 1280 bytesLinus Torvalds1-2/+1
That was already the limit with KASAN enabled, and the 32-bit x86 build ends up having a couple of drm cases that have stack frames _just_ over 1kB on my allmodconfig test. So the minimal fix for this build issue for now is to just bump the limit and make it independent of KASAN. [ Side note: XTENSA already used 1.5k and PARISC uses 2k, so 1280 is still relatively conservative ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2025-11-26bpf: Fix exclusive map memory leakEdward Adam Davis1-1/+2
When excl_prog_hash is 0 and excl_prog_hash_size is non-zero, the map also needs to be freed. Otherwise, the map memory will not be reclaimed, just like the memory leak problem reported by syzbot [1]. syzbot reported: BUG: memory leak backtrace (crc 7b9fb9b4): map_create+0x322/0x11e0 kernel/bpf/syscall.c:1512 __sys_bpf+0x3556/0x3610 kernel/bpf/syscall.c:6131 Fixes: baefdbdf6812 ("bpf: Implement exclusive map creation") Reported-by: syzbot+cf08c551fecea9fd1320@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=cf08c551fecea9fd1320 Tested-by: syzbot+cf08c551fecea9fd1320@syzkaller.appspotmail.com Signed-off-by: Edward Adam Davis <eadavis@qq.com> Acked-by: Yonghong Song <yonghong.song@linux.dev> Link: https://lore.kernel.org/r/tencent_3F226F882CE56DCC94ACE90EED1ECCFC780A@qq.com Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2025-11-26Merge tag 'sound-6.18' of ↵Linus Torvalds5-5/+11
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound Pull sound fixes from Takashi Iwai: "A collection of small fixes. All changes are device-specific and trivial, mostly HD-audio and USB-audio quirks and fixups" * tag 'sound-6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound: ALSA: hda/realtek: Add quirk for HP ProBook 450 G8 ALSA: usb-audio: fix uac2 clock source at terminal parser ALSA: hda/realtek: add quirk for HP pavilion aero laptop 13z-be200 ALSA: hda/cirrus fix cs420x MacPro 6,1 inverted jack detection ALSA: usb-audio: Add DSD quirk for LEAK Stereo 230 ALSA: au88x0: Fix incorrect error handling for PCI config reads
2025-11-26Merge tag 'acpi-6.18-rc8' of ↵Linus Torvalds3-79/+76
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm Pull ACPI fix from Rafael Wysocki: "Revert a commit that attempted to make the code in the ACPI processor driver more straightforward, but it turned out to cause the kernel to crash on at least one system, along with some further cleanups on top of it" * tag 'acpi-6.18-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: Revert "ACPI: processor: idle: Optimize ACPI idle driver registration" Revert "ACPI: processor: Remove unused empty stubs of some functions" Revert "ACPI: processor: idle: Rearrange declarations in header file" Revert "ACPI: processor: idle: Redefine two functions as void" Revert "ACPI: processor: Do not expose global variable acpi_idle_driver"
2025-11-26drm/amdgpu: fix cyan_skillfish2 gpu info fw handlingAlex Deucher1-0/+2
If the board supports IP discovery, we don't need to parse the gpu info firmware. Backport to 6.18. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4721 Fixes: fa819e3a7c1e ("drm/amdgpu: add support for cyan skillfish gpu_info") Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 5427e32fa3a0ba9a016db83877851ed277b065fb)
2025-11-26drm/amdgpu: attach tlb fence to the PTs updatePrike Liang1-1/+1
Ensure the userq TLB flush is emitted only after the VM update finishes and the PT BOs have been annotated with bookkeeping fences. Suggested-by: Christian König <christian.koenig@amd.com> Signed-off-by: Prike Liang <Prike.Liang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit f3854e04b708d73276c4488231a8bd66d30b4671) Cc: stable@vger.kernel.org
2025-11-26drm/amd/display: Increase EDID read retriesMario Limonciello (AMD)1-4/+4
[WHY] When monitor is still booting EDID read can fail while DPCD read is successful. In this case no EDID data will be returned, and this could happen for a while. [HOW] Increase number of attempts to read EDID in dm_helpers_read_local_edid() to 25. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4672 Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Alex Hung <alex.hung@amd.com> Tested-by: Dan Wheeler <daniel.wheeler@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit a76d6f2c76c3abac519ba753e2723e6ffe8e461c) Cc: stable@vger.kernel.org
2025-11-26drm/amd/display: Don't change brightness for disabled connectorsMario Limonciello (AMD)1-0/+15
[WHY] When a laptop lid is closed the connector is disabled but userspace can still try to change brightness. This doesn't work because the panel is turned off. It will eventually time out, but there is a lot of stutter along the way. [How] Iterate all connectors to check whether the matching one for the backlight index is enabled. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4675 Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Ray Wu <ray.wu@amd.com> Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org> Signed-off-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit f6eeab30323d1174a4cc022e769d248fe8241304) Cc: stable@vger.kernel.org
2025-11-26drm/amd/display: Check NULL before accessingAlex Hung1-3/+8
[WHAT] IGT kms_cursor_legacy's long-nonblocking-modeset-vs-cursor-atomic fails with NULL pointer dereference. This can be reproduced with both an eDP panel and a DP monitors connected. BUG: kernel NULL pointer dereference, address: 0000000000000000 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: Oops: 0000 [#1] SMP NOPTI CPU: 13 UID: 0 PID: 2960 Comm: kms_cursor_lega Not tainted 6.16.0-99-custom #8 PREEMPT(voluntary) Hardware name: AMD ........ RIP: 0010:dc_stream_get_scanoutpos+0x34/0x130 [amdgpu] Code: 57 4d 89 c7 41 56 49 89 ce 41 55 49 89 d5 41 54 49 89 fc 53 48 83 ec 18 48 8b 87 a0 64 00 00 48 89 75 d0 48 c7 c6 e0 41 30 c2 <48> 8b 38 48 8b 9f 68 06 00 00 e8 8d d7 fd ff 31 c0 48 81 c3 e0 02 RSP: 0018:ffffd0f3c2bd7608 EFLAGS: 00010292 RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffd0f3c2bd7668 RDX: ffffd0f3c2bd7664 RSI: ffffffffc23041e0 RDI: ffff8b32494b8000 RBP: ffffd0f3c2bd7648 R08: ffffd0f3c2bd766c R09: ffffd0f3c2bd7760 R10: ffffd0f3c2bd7820 R11: 0000000000000000 R12: ffff8b32494b8000 R13: ffffd0f3c2bd7664 R14: ffffd0f3c2bd7668 R15: ffffd0f3c2bd766c FS: 000071f631b68700(0000) GS:ffff8b399f114000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000000 CR3: 00000001b8105000 CR4: 0000000000f50ef0 PKRU: 55555554 Call Trace: <TASK> dm_crtc_get_scanoutpos+0xd7/0x180 [amdgpu] amdgpu_display_get_crtc_scanoutpos+0x86/0x1c0 [amdgpu] ? __pfx_amdgpu_crtc_get_scanout_position+0x10/0x10[amdgpu] amdgpu_crtc_get_scanout_position+0x27/0x50 [amdgpu] drm_crtc_vblank_helper_get_vblank_timestamp_internal+0xf7/0x400 drm_crtc_vblank_helper_get_vblank_timestamp+0x1c/0x30 drm_crtc_get_last_vbltimestamp+0x55/0x90 drm_crtc_next_vblank_start+0x45/0xa0 drm_atomic_helper_wait_for_fences+0x81/0x1f0 ... Cc: Mario Limonciello <mario.limonciello@amd.com> Cc: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com> Signed-off-by: Alex Hung <alex.hung@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 621e55f1919640acab25383362b96e65f2baea3c) Cc: stable@vger.kernel.org
2025-11-26Revert "drm/amd/display: Move setup_stream_attribute"Alex Deucher5-12/+3
This reverts commit 2681bf4ae8d24df950138b8c9ea9c271cd62e414. This results in a blank screen on the HDMI port on some systems. Revert for now so as not to regress 6.18, can be addressed in 6.19 once the issue is root caused. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4652 Cc: Sunpeng.Li@amd.com Cc: ivan.lipski@amd.com Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit d0e9de7a81503cdde37fb2d37f1d102f9e0f38fb)