| Age | Commit message (Collapse) | Author | Files | Lines |
|
Link: https://patch.msgid.link/20251123-work-fd-prepare-v4-23-b6efa1706cfd@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Christian Brauner <brauner@kernel.org> says:
The fix sent in [1] was squashed into this commit.
Fixes: https://lore.kernel.org/c41de645-8234-465f-a3be-f0385e3a163c@sirena.org.uk [1]
Reported-by: Mark Brown <broonie@kernel.org> [1]
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> [1]
Link: https://patch.msgid.link/20251123-work-fd-prepare-v4-22-b6efa1706cfd@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Link: https://patch.msgid.link/20251123-work-fd-prepare-v4-21-b6efa1706cfd@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Link: https://patch.msgid.link/20251123-work-fd-prepare-v4-19-b6efa1706cfd@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Link: https://patch.msgid.link/20251123-work-fd-prepare-v4-18-b6efa1706cfd@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Link: https://patch.msgid.link/20251123-work-fd-prepare-v4-17-b6efa1706cfd@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Link: https://patch.msgid.link/20251123-work-fd-prepare-v4-16-b6efa1706cfd@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Link: https://patch.msgid.link/20251123-work-fd-prepare-v4-15-b6efa1706cfd@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Link: https://patch.msgid.link/20251123-work-fd-prepare-v4-14-b6efa1706cfd@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Link: https://patch.msgid.link/20251123-work-fd-prepare-v4-13-b6efa1706cfd@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Link: https://patch.msgid.link/20251123-work-fd-prepare-v4-12-b6efa1706cfd@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Link: https://patch.msgid.link/20251123-work-fd-prepare-v4-11-b6efa1706cfd@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Link: https://patch.msgid.link/20251123-work-fd-prepare-v4-10-b6efa1706cfd@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Link: https://patch.msgid.link/20251123-work-fd-prepare-v4-9-b6efa1706cfd@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Christian Brauner <brauner@kernel.org> says:
The fix sent in [1] was squashed into this commit.
Link: https://lore.kernel.org/20251127201618.2115275-1-kuniyu@google.com [1]
Reported-by: syzbot+321168dfa622eda99689@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/lkml/6928b121.a70a0220.d98e3.0110.GAE@google.com
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20251123-work-fd-prepare-v4-8-b6efa1706cfd@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Christian Brauner <brauner@kernel.org> says:
A variant of the fix sent in [1] was squashed into this commit.
Link: https://lore.kernel.org/20251128035149.392402-1-kartikey406@gmail.com [1]
Reported-by: Deepanshu Kartikey <kartikey406@gmail.com>
Reported-by: syzbot+94048264da5715c251f9@syzkaller.appspotmail.com
Tested-by: syzbot+94048264da5715c251f9@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=94048264da5715c251f9
Link: https://patch.msgid.link/20251123-work-fd-prepare-v4-7-b6efa1706cfd@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Link: https://patch.msgid.link/20251123-work-fd-prepare-v4-6-b6efa1706cfd@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Link: https://patch.msgid.link/20251123-work-fd-prepare-v4-5-b6efa1706cfd@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Link: https://patch.msgid.link/20251123-work-fd-prepare-v4-4-b6efa1706cfd@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Link: https://patch.msgid.link/20251123-work-fd-prepare-v4-3-b6efa1706cfd@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Link: https://patch.msgid.link/20251123-work-fd-prepare-v4-2-b6efa1706cfd@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
I've been playing with this to allow for moderately flexible usage of
the get_unused_fd_flags() + create file + fd_install() pattern that's
used quite extensively.
How callers allocate files is really heterogenous so it's not really
convenient to fold them into a single class. It's possibe to split them
into subclasses like for anon inodes. I think that's not necessarily
nice as well.
My take is to add two primites:
(1) FD_ADD() the simple cases a file is installed:
fd = FD_ADD(O_CLOEXEC, open_file(some, args)));
if (fd >= 0)
kvm_get_kvm(vcpu->kvm);
return fd;
(2) FD_PREPARE() that captures all the cases where access to fd or file
or additional work before publishing the fd is needed:
FD_PREPARE(fdf, open_flag, file_open_handle(&path, open_flag));
if (fdf.err)
return fdf.err;
if (copy_to_user(/* something something */))
return -EFAULT;
return fd_publish(fdf);
I've converted all of the easy cases over to it and it gets rid of an
aweful lot of convoluted cleanup logic.
It's centered around struct fd_prepare. FD_PREPARE() encapsulates all of
allocation and cleanup logic and must be followed by a call to
fd_publish() which associates the fd with the file and installs it into
the callers fdtable. If fd_publish() isn't called both are deallocated.
It mandates a specific order namely that first we allocate the fd and
then instantiate the file. But that shouldn't be a problem nearly
everyone I've converted uses this exact pattern anyway.
There's a bunch of additional cases where it would be easy to convert
them to this pattern. For example, the whole sync file stuff in dma
currently retains the containing structure of the file instead of the
file itself even though it's only used to allocate files. Changing that
would make it fall into the FD_PREPARE() pattern easily. I've not done
that work yet.
There's room for extending this in a way that wed'd have subclasses for
some particularly often use patterns but as I said I'm not even sure
that's worth it.
Link: https://patch.msgid.link/20251123-work-fd-prepare-v4-1-b6efa1706cfd@kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
The allocation of a cell's anonymous key is done in a background thread
along with other cell setup such as doing a DNS upcall. In the reported
bug, this is triggered by afs_parse_source() parsing the device name given
to mount() and calling afs_lookup_cell() with the name of the cell.
The normal key lookup then tries to use the key description on the
anonymous authentication key as the reference for request_key() - but it
may not yet be set and so an oops can happen.
This has been made more likely to happen by the fix for dynamic lookup
failure.
Fix this by firstly allocating a reference name and attaching it to the
afs_cell record when the record is created. It can share the memory
allocation with the cell name (unfortunately it can't just overlap the cell
name by prepending it with "afs@" as the cell name already has a '.'
prepended for other purposes). This reference name is then passed to
request_key().
Secondly, the anon key is now allocated on demand at the point a key is
requested in afs_request_key() if it is not already allocated. A mutex is
used to prevent multiple allocation for a cell.
Thirdly, make afs_request_key_rcu() return NULL if the anonymous key isn't
yet allocated (if we need it) and then the caller can return -ECHILD to
drop out of RCU-mode and afs_request_key() can be called.
Note that the anonymous key is kind of necessary to make the key lookup
cache work as that doesn't currently cache a negative lookup, but it's
probably worth some investigation to see if NULL can be used instead.
Fixes: 330e2c514823 ("afs: Fix dynamic lookup to fail on cell lookup failure")
Reported-by: syzbot+41c68824eefb67cdf00c@syzkaller.appspotmail.com
Signed-off-by: David Howells <dhowells@redhat.com>
Link: https://patch.msgid.link/800328.1764325145@warthog.procyon.org.uk
cc: Marc Dionne <marc.dionne@auristor.com>
cc: linux-afs@lists.infradead.org
cc: linux-fsdevel@vger.kernel.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Remove unnecessary semicolons reported by Coccinelle/coccicheck and the
semantic patch at scripts/coccinelle/misc/semicolon.cocci.
Signed-off-by: Chen Ni <nichen@iscas.ac.cn>
Fixed: 7ab96df840e60 ("VFS/nfsd/cachefiles/ovl: add start_creating() and end_creating()")
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
The definition of struct delegation uses stdint.h integer types. Add the
necessary headers to ensure that always works.
Fixes: 1602bad16d7d ("vfs: expose delegation support to userland")
Signed-off-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
As well as checking that the parent hasn't changed after getting the
lock we need to check that the dentry hasn't been unhashed.
Otherwise we might try to rename something that has been removed.
Reported-by: syzbot+bfc9a0ccf0de47d04e8c@syzkaller.appspotmail.com
Fixes: d2c995581c7c ("ovl: Call ovl_create_temp() without lock held.")
Signed-off-by: NeilBrown <neil@brown.name>
Link: https://patch.msgid.link/176429295510.634289.1552337113663461690@noble.neil.brown.name
Tested-by: syzbot+bfc9a0ccf0de47d04e8c@syzkaller.appspotmail.com
Reviewed-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Rationale is that if the parent dentry is the same and the length is the
same, then you have to be unlucky for the name to not match.
At the same time the dentry was literally just found on the hash, so you
have to be even more unlucky to determine it is unhashed.
While here add commentary while d_unhashed() is necessary. It was
already removed once and brought back in:
2e321806b681b192 ("Revert "vfs: remove unnecessary d_unhashed() check from __d_lookup_rcu"")
Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
Link: https://patch.msgid.link/20251127131526.4137768-1-mjguzik@gmail.com
Reviewed-by: Jan Kara <jack@suse.cz>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Use the correct function name and add description for the @flavor
parameter to avoid these kernel-doc warnings:
Warning: fs/locks.c:1706 function parameter 'flavor' not described in
'__fcntl_getlease'
WARNING: fs/locks.c:1706 expecting prototype for fcntl_getlease().
Prototype was for __fcntl_getlease() instead
Fixes: 1602bad16d7d ("vfs: expose delegation support to userland")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Link: https://patch.msgid.link/20251128000826.457120-1-rdunlap@infradead.org
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Avoid a double-unlock as nfs_create_locked() will have unlocked the
parent and do the dput() manually.
Christian Brauner <brauner@kernel.org> says:
I've taken Neil's proposed fix from [1] and added a commit message.
Fixes: https://lore.kernel.org/202511252132.2c621407-lkp@intel.com [1]
Fixes: bd6ede8a06e8 ("VFS/nfsd/cachefiles/ovl: introduce start_removing() and end_removing()")
Signed-off-by: Neil Brown <neil@brown.name>
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
Pull drm fixes from Dave Airlie:
"Last one for this round hopefully, mostly the usual suspects,
xe/amdgpu, with some single fixes otherwise.
There is one amdgpu HDMI blackscreen bug that came in late in the
cycle, but it was bisected and the revert is in here.
i915:
- Reject async flips when PSR's selective fetch is enabled
xe:
- Fix resource leak in xe_guc_ct_init_noalloc()'s error path
- Fix stack_depot usage without STACKDEPOT_ALWAYS_INIT
- Fix overflow in conversion from clock tics to msec
amdgpu:
- Unified MES fix
- HDMI fix
- Cursor fix
- Bightness fix
- EDID reading improvement
- UserQ fix
- Cyan Skillfish IP discovery fix
bridge:
- sil902x: Fix HDMI detection
imagination:
- Update documentation
sti:
- Fix leaks in probe
vga_switcheroo:
- Avoid race condition during fbcon initialization"
* tag 'drm-fixes-2025-11-28' of https://gitlab.freedesktop.org/drm/kernel:
drm/amdgpu: fix cyan_skillfish2 gpu info fw handling
drm/amdgpu: attach tlb fence to the PTs update
drm/amd/display: Increase EDID read retries
drm/amd/display: Don't change brightness for disabled connectors
drm/amd/display: Check NULL before accessing
Revert "drm/amd/display: Move setup_stream_attribute"
drm/xe: Fix conversion from clock ticks to milliseconds
drm/xe/guc: Fix stack_depot usage
drm/xe/guc: Fix resource leak in xe_guc_ct_init_noalloc()
drm/i915/psr: Reject async flips when selective fetch is enabled
drm, fbcon, vga_switcheroo: Avoid race condition in fbcon setup
drm/amd/amdgpu: reserve vm invalidation engine for uni_mes
drm: sti: fix device leaks at component probe
drm/imagination: Document pvr_device.power member
drm/bridge: sii902x: Fix HDMI detection with DRM_BRIDGE_ATTACH_NO_CONNECTOR
|
|
Michael Chan says:
====================
bnxt_en: Updates for net-next (part)
This series includes an enhnacement to the priority TX counters,
an enhancement to a PHY module error extack message, cleanup of
unneeded MSIX logic in bnxt_ulp.c, adding CQ dump during TX timeout,
LRO/HW_GRO performance improvement by enabling Relaxed Ordering,
and improved SRIOV admin link state support.
====================
Link: https://patch.msgid.link/20251126215648.1885936-1-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The firmware can now cache the virtual link admin state (auto/on/off) of
all VFs and as such, the PF driver no longer has to intercept the VF
driver's port_phy_qcfg() call and then provide the link admin state.
If the FW does not have this capability, fall back to the existing
interception method.
The initial default link admin state (auto) is also set initially when
the VFs are created.
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Reviewed-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Reviewed-by: Mohammad Shuab Siddique <mohammad-shuab.siddique@broadcom.com>
Signed-off-by: Rob Miller <rmiller@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20251126215648.1885936-7-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
With End-of-Packet padding (EOP) set, the chip will disable Relaxed
Ordering (RO) of TPA data packets. A TPA segment with EOP set will be
padded to the next cache boundary and can potentially overwrite the
beginning bytes of the next TPA segment when RO is enabled on 5760X.
To prevent that, the chip disables RO for TPA when EOP is set.
To take advantge of RO and higher performance, do not set EOP on
5760X chips when TPA is enabled. Define a proper RX_BD_FLAGS_AGG_EOP
constant to make it clear that we are setting EOP.
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20251126215648.1885936-6-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
On newer chips that use NQs and CQs, add the CQ ring dump to
bnxt_dump_cp_sw_state() to make it more complete.
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20251126215648.1885936-5-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
MSIX is always requested when the RoCE driver calls bnxt_register_dev().
We already check bnxt_ulp_registered(), so checking the flag is
redundant. It was a left-over flag after converting to auxbus, so
remove it.
Reviewed-by: Pavan Chebbi <pavan.chebbi@broadcom.com>
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Kalesh AP <kalesh-anakkur.purayil@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20251126215648.1885936-4-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Rturn early with -EOPNOTSUPP and an extack message if the PHY type is
BaseT since module status is not available for BaseT.
Reviewed-by: Somnath Kotur <somnath.kotur@broadcom.com>
Signed-off-by: Gautam R A <gautam-r.a@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20251126215648.1885936-3-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The priority packet and byte counters in ethtool -S are returned by
the driver based on the pri2cos mapping. The assumption is that each
priority is mapped to one and only one hardware CoS queue. In a
special RoCE configuration, the FW uses combined CoS queue 0 and CoS
queue 1 for the priority mapped to CoS queue 0. In this special
case, we need to add the CoS queue 0 and CoS queue 1 counters for
the priority packet and byte counters.
Reviewed-by: Andy Gospodarek <andrew.gospodarek@broadcom.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Link: https://patch.msgid.link/20251126215648.1885936-2-michael.chan@broadcom.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
'intel-wired-lan-driver-updates-2025-11-25-ice-idpf-iavf-ixgbe-ixgbevf-e1000e'
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2025-11-25 (ice, idpf, iavf, ixgbe, ixgbevf, e1000e)
Natalia cleans up ixgbevf_q_vector struct removing an unused field.
Emil converts vport state tracking from enum to bitmap and removes
unneeded states for idpf.
Tony removes an unneeded check from e1000e.
Alok Tiwari removes an unnecessary second call to
ixgbe_non_sfp_link_config() and adjusts the checked member, in idpf, to
reflect the member that is later used. He also fixes various typos and
messages for better clarity misc Intel drivers.
====================
Link: https://patch.msgid.link/20251125223632.1857532-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The current dev_warn messages for too many VLAN changes are confusing
and one place incorrectly references "add" instead of "delete" VLANs
due to copy-paste errors.
- Use dev_info instead of dev_warn to lower the log level.
- Rephrase the message to: "virtchnl: Too many VLAN [add|delete]
([v1|v2]) requests; splitting into multiple messages to PF\n".
Suggested-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://patch.msgid.link/20251125223632.1857532-12-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
- Fix a typo in the ice_fdir_has_frag() kernel-doc comment ("is" -> "if")
- Correct the NVM erase error message format string from "0x02%x" to
"0x%02x" so the module value is printed correctly.
Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://patch.msgid.link/20251125223632.1857532-11-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The error messages in idpf_rx_desc_alloc_all() used the group index i
when reporting memory allocation failures for individual Rx and Rx buffer
queues. This is incorrect.
Update the messages to use the correct queue index j and include the
queue group index i for clearer identification of the affected Rx and Rx
buffer queues.
Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://patch.msgid.link/20251125223632.1857532-10-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
idpf_compl_queue uses a union for comp, comp_4b, and desc_ring. The
release path should check complq->desc_ring to determine whether the DMA
descriptor ring is allocated. The current check against comp works but is
leftover from a previous commit and is misleading in this context.
Switching the check to desc_ring improves readability and more directly
reflects the intended meaning, since desc_ring is the field representing
the allocated DMA-backed descriptor ring.
No functional change.
Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://patch.msgid.link/20251125223632.1857532-9-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
ixgbe_non_sfp_link_config() is called twice in ixgbe_open()
once to assign its return value to err and again in the
conditional check. This patch uses the stored err value
instead of calling the function a second time. This avoids
redundant work and ensures consistent error reporting.
Also fix a small typo in the ixgbe_remove() comment:
"The could be caused" -> "This could be caused".
Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com>
Reviewed-by: Jedrzej Jagielski <jedrzej.jagielski@intel.com>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel)
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://patch.msgid.link/20251125223632.1857532-8-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The caller, ethtool_set_eeprom(), already performs the same checks so
these are unnecessary in the driver. This reverts commit
90fb7db49c6d ("e1000e: fix heap overflow in e1000_set_eeprom"), however,
corrections for RCT have been kept.
Link: https://lore.kernel.org/all/db92fcc8-114d-4e85-9d15-7860545bc65e@suse.de/
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://patch.msgid.link/20251125223632.1857532-7-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Convert vport state to a bitmap and remove the DOWN state which is
redundant in the existing logic. There are no functional changes aside
from the use of bitwise operations when setting and checking the states.
Removed the double underscore to be consistent with the naming of other
bitmaps in the header and renamed current_state to vport_is_up to match
the meaning of the new variable.
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Chittim Madhu <madhu.chittim@intel.com>
Signed-off-by: Emil Tantilov <emil.s.tantilov@intel.com>
Tested-by: Samuel Salin <Samuel.salin@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://patch.msgid.link/20251125223632.1857532-6-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Flex array should be at the end of the structure and use [] syntax
Remove unused fields of ixgbevf_q_vector.
They aren't used since busy poll was moved to core code in commit
508aac6dee02 ("ixgbevf: get rid of custom busy polling code").
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Signed-off-by: Natalia Wochtman <natalia.wochtman@intel.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Link: https://patch.msgid.link/20251125223632.1857532-5-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The KMSG_COMPONENT macro is a leftover of the s390 specific "kernel message
catalog" from 2008 [1] which never made it upstream.
The macro was added to s390 code to allow for an out-of-tree patch which
used this to generate unique message ids. Also this out-of-tree doesn't
exist anymore.
The pattern of how the KMSG_COMPONENT is used was partially also used for
non s390 specific code, for whatever reasons.
Remove the macro in order to get rid of a pointless indirection.
[1] https://lwn.net/Articles/292650/
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Acked-by: Alexandra Winter <wintera@linux.ibm.com>
Link: https://patch.msgid.link/20251126142242.2124317-1-hca@linux.ibm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes
Driver Changes:
- Fix resource leak in xe_guc_ct_init_noalloc()'s error path (Shuicheng Lin)
- Fix stack_depot usage without STACKDEPOT_ALWAYS_INIT (Lucas)
- Fix overflow in conversion from clock tics to msec (Harish Chegondi)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Lucas De Marchi <lucas.demarchi@intel.com>
Link: https://patch.msgid.link/7ejiqjgthpqybg5svmkind2pszk4fqadxuq7rngchaaw76iept@5pn6sngqj6lk
|
|
Convert the Cavium Thunder NIC VF driver to use the new .get_rx_ring_count
ethtool operation instead of implementing .get_rxnfc solely for handling
ETHTOOL_GRXRINGS command. This simplifies the code by removing the
switch statement and replacing it with a direct return of the queue
count.
The new callback provides the same functionality in a more direct way,
following the ongoing ethtool API modernization.
Signed-off-by: Breno Leitao <leitao@debian.org>
Link: https://patch.msgid.link/20251126-gxring_cavium-v1-1-a066c0c9e0c6@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The extra "count >= limit" check in stmmac_rx_zc() is redundant and
has no effect because the value of "count" doesn't change after the
while condition at this point.
However, it can change after "read_again:" label:
while (count < limit) {
...
if (count >= limit)
break;
read_again:
...
/* XSK pool expects RX frame 1:1 mapped to XSK buffer */
if (likely(status & rx_not_ls)) {
xsk_buff_free(buf->xdp);
buf->xdp = NULL;
dirty++;
count++;
goto read_again;
}
...
This patch addresses the same issue previously resolved in stmmac_rx()
by commit fa02de9e7588 ("net: stmmac: fix rx budget limit check").
The fix is the same: move the check after the label to ensure that it
bounds the goto loop.
Fixes: bba2556efad6 ("net: stmmac: Enable RX via AF_XDP zero-copy")
Signed-off-by: Alexey Kodanev <aleksei.kodanev@bell-sw.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/20251126104327.175590-1-aleksei.kodanev@bell-sw.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes
Short summary of fixes pull:
bridge:
- sil902x: Fix HDMI detection
imagination:
- Update documentation
sti:
- Fix leaks in probe
vga_switcheroo:
- Avoid race condition during fbcon initialization
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Thomas Zimmermann <tzimmermann@suse.de>
Link: https://patch.msgid.link/20251127081007.GA13578@2a02-2454-fd5e-fd00-689d-32c0-780c-bb87.dyn6.pyur.net
|
|
https://gitlab.freedesktop.org/agd5f/linux into drm-fixes
amd-drm-fixes-6.18-2025-11-26:
amdgpu:
- Unified MES fix
- HDMI fix
- Cursor fix
- Bightness fix
- EDID reading improvement
- UserQ fix
- Cyan Skillfish IP discovery fix
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Alex Deucher <alexander.deucher@amd.com>
Link: https://patch.msgid.link/20251126204925.3316684-1-alexander.deucher@amd.com
|
|
David Yang says:
====================
net: dsa: yt921x: Fix parsing MIB attributes
====================
Link: https://patch.msgid.link/20251126084024.2843851-1-mmyangfl@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Extract MIB constants into the header file to improve code style. This
patch will not change the behavior of the function.
Signed-off-by: David Yang <mmyangfl@gmail.com>
Link: https://patch.msgid.link/20251126084024.2843851-3-mmyangfl@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
There are hard-to-find unused fields in the MIB table I didn't notice in
the example driver code, causing wrong interpretation of the MIB data.
For some 64-bit attributes, the current (wrong) implementation took the
correct lower 32 bits, but messed up the upper 32 bits, so it would work
accidentally until 32-bit overflows happen. Fix that too.
Fixes: 186623f4aa72 ("net: dsa: yt921x: Add support for Motorcomm YT921x")
Signed-off-by: David Yang <mmyangfl@gmail.com>
Link: https://patch.msgid.link/20251126084024.2843851-2-mmyangfl@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This adds DASH support for chip RTL8127AP. Its mac version is
RTL_GIGA_MAC_VER_80. DASH is a standard for remote management of network
device, allowing out-of-band control.
Signed-off-by: Javen Xu <javen_xu@realsil.com.cn>
Link: https://patch.msgid.link/20251126055950.2050-1-javen_xu@realsil.com.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Neither sock4 nor sock6 pointers are guaranteed to be non-NULL in
vxlan_xmit_one, e.g. if the iface is brought down. This can lead to the
following NULL dereference:
BUG: kernel NULL pointer dereference, address: 0000000000000010
Oops: Oops: 0000 [#1] SMP NOPTI
RIP: 0010:vxlan_xmit_one+0xbb3/0x1580
Call Trace:
vxlan_xmit+0x429/0x610
dev_hard_start_xmit+0x55/0xa0
__dev_queue_xmit+0x6d0/0x7f0
ip_finish_output2+0x24b/0x590
ip_output+0x63/0x110
Mentioned commits changed the code path in vxlan_xmit_one and as a side
effect the sock4/6 pointer validity checks in vxlan(6)_get_route were
lost. Fix this by adding back checks.
Since both commits being fixed were released in the same version (v6.7)
and are strongly related, bundle the fixes in a single commit.
Reported-by: Liang Li <liali@redhat.com>
Fixes: 6f19b2c136d9 ("vxlan: use generic function for tunnel IPv4 route lookup")
Fixes: 2aceb896ee18 ("vxlan: use generic function for tunnel IPv6 route lookup")
Cc: Beniamino Galvani <b.galvani@gmail.com>
Signed-off-by: Antoine Tenart <atenart@kernel.org>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Tested-by: Ido Schimmel <idosch@nvidia.com>
Link: https://patch.msgid.link/20251126102627.74223-1-atenart@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
ptp_clock_settime() assumes every ptp_clock has implemented settime64().
Stub it with -EOPNOTSUPP to prevent a NULL dereference.
The fix is similar to commit 329d050bbe63 ("gve: Implement settime64
with -EOPNOTSUPP").
Fixes: d734223b2f0d ("iavf: add initial framework for registering PTP clock")
Signed-off-by: Michal Schmidt <mschmidt@redhat.com>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Tim Hostetler <thostet@google.com>
Link: https://patch.msgid.link/20251126094850.2842557-1-mschmidt@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This 0x88C3 is registered to Infineon Technologies Corporate Research ST
and are used by MaxLinear.
Infineon made a spin off called Lantiq.
Lantiq was acquired by Intel
MaxLinear acquired Intels Connected Home division.
The product FAQ from MaxLinear describes it's history from the F24S.
The driver for the gsw1xx is based on Lantiq showing it's similarities.
Ref https://standards-oui.ieee.org/ethertype/eth.txt
Signed-off-by: Peter Enderborg <Peter.Enderborg@axis.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Use DEFINE_RAW_FLEX() to avoid a -Wflex-array-member-not-at-end warning.
Remove fixed-size array struct usb_cdc_ncm_dpe16 dpe16[2]; from struct
mbim_tx_hdr, so that flex-array member struct mbim_tx_hdr::ndp16.dpe16[]
ends last in this structure.
Compensate for this by using the DEFINE_RAW_FLEX() helper to declare the
on-stack struct instance that contains struct usb_cdc_ncm_ndp16 as a
member. Adjust the rest of the code, accordingly.
So, with these changes fix the following warning:
drivers/net/wwan/mhi_wwan_mbim.c:81:34: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end]
Reviewed-by: Loic Poulain <loic.poulain@oss.qualcomm.com>
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The tx->dropped_pkt counter is a 64-bit integer that is incremented
directly. On 32-bit architectures, this operation is not atomic and
can lead to read/write tearing if a reader accesses the counter during
the update. This can result in incorrect values being reported for
dropped packets.
To prevent this potential data corruption, wrap the increment
operation with u64_stats_update_begin() and u64_stats_update_end().
This ensures that updates to the 64-bit counter are atomic, even on
32-bit systems, by using a sequence lock.
The u64_stats_sync API requires the writer to have exclusive access,
which is already provided in this context by the network stack's
serialization of the transmit path (net_device_ops::ndo_start_xmit
[1]) for a given queue.
[1]: https://www.kernel.org/doc/Documentation/networking/netdevices.txt
Signed-off-by: Max Yuan <maxyuan@google.com>
Reviewed-by: Jordan Rhee <jordanrhee@google.com>
Signed-off-by: Harshitha Ramamurthy <hramamurthy@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Commit e20dfbad8aab ("net: fix napi_consume_skb() with alien skbs")
added a skb->cpu check to napi_consume_skb(), before the point where
napi_consume_skb() validated skb is not NULL.
Add an explicit check to the early exit condition.
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
As those following recent changes from Eric know very well
using NAPI skb cache is crucial to achieve good perf, at
least on recent AMD platforms. Make sure bnxt feeds the skb
cache with Tx skbs.
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Convert all the legacy code directly accessing the pp fields in net_iov
to access them through @desc in net_iov.
Signed-off-by: Byungchul Park <byungchul@sk.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mszyprowski/linux
Pull dma-mapping fixes from Marek Szyprowski:
"Two last minute fixes for the recently modified DMA API infrastructure:
- proper handling of DMA_ATTR_MMIO in dma_iova_unlink() function (me)
- regression fix for the code refactoring related to P2PDMA (Pranjal
Shrivastava)"
* tag 'dma-mapping-6.18-2025-11-27' of git://git.kernel.org/pub/scm/linux/kernel/git/mszyprowski/linux:
dma-direct: Fix missing sg_dma_len assignment in P2PDMA bus mappings
iommu/dma: add missing support for DMA_ATTR_MMIO for dma_iova_unlink()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI fix from Rafael Wysocki:
"One more urgent ACPI support fix for 6.18
There is one more commit that needs to be reverted after reverting
problematic commit 7a8c994cbb2d ("ACPI: processor: idle: Optimize ACPI
idle driver registration"), so revert it"
* tag 'acpi-6.18-rc8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
Revert "ACPI: processor: Update cpuidle driver check in __acpi_processor_start()"
|
|
https://gitlab.freedesktop.org/drm/i915/kernel into drm-fixes
- Reject async flips when PSR's selective fetch is enabled (Ville)
Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Rodrigo Vivi <rodrigo.vivi@intel.com>
Link: https://patch.msgid.link/aScgY8QMjmyJRBX2@intel.com
|
|
In include/uapi/linux/netfilter/nf_tables.h,
correct the kernel-doc comments for mistyped enum names and enum values to
avoid these kernel-doc warnings and improve the documentation:
nf_tables.h:896: warning: Enum value 'NFT_EXTHDR_OP_TCPOPT' not described
in enum 'nft_exthdr_op'
nf_tables.h:896: warning: Excess enum value 'NFT_EXTHDR_OP_TCP' description
in 'nft_exthdr_op'
nf_tables.h:1210: warning: expecting prototype for enum
nft_flow_attributes. Prototype was for enum nft_offload_attributes instead
nf_tables.h:1428: warning: expecting prototype for enum nft_reject_code.
Prototype was for enum nft_reject_inet_code instead
(add beginning '@' to each enum value description:)
nf_tables.h:1493: warning: Enum value 'NFTA_TPROXY_FAMILY' not described
in enum 'nft_tproxy_attributes'
nf_tables.h:1493: warning: Enum value 'NFTA_TPROXY_REG_ADDR' not described
in enum 'nft_tproxy_attributes'
nf_tables.h:1493: warning: Enum value 'NFTA_TPROXY_REG_PORT' not described
in enum 'nft_tproxy_attributes'
nf_tables.h:1796: warning: expecting prototype for enum
nft_device_attributes. Prototype was for enum
nft_devices_attributes instead
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Fix the kernel-doc format for struct members to be "@member" instead of
"@ member" to avoid kernel-doc warnings.
Warning: ip6t_srh.h:60 struct member 'next_hdr' not described in 'ip6t_srh'
Warning: ip6t_srh.h:60 struct member 'hdr_len' not described in 'ip6t_srh'
Warning: ip6t_srh.h:60 struct member 'segs_left' not described
in 'ip6t_srh'
Warning: ip6t_srh.h:60 struct member 'last_entry' not described
in 'ip6t_srh'
Warning: ip6t_srh.h:60 struct member 'tag' not described in 'ip6t_srh'
Warning: ip6t_srh.h:60 struct member 'mt_flags' not described in 'ip6t_srh'
Warning: ip6t_srh.h:60 struct member 'mt_invflags' not described
in 'ip6t_srh'
Warning: ip6t_srh.h:93 struct member 'next_hdr' not described
in 'ip6t_srh1'
Warning: ip6t_srh.h:93 struct member 'hdr_len' not described in 'ip6t_srh1'
Warning: ip6t_srh.h:93 struct member 'segs_left' not described
in 'ip6t_srh1'
Warning: ip6t_srh.h:93 struct member 'last_entry' not described
in 'ip6t_srh1'
Warning: ip6t_srh.h:93 struct member 'tag' not described in 'ip6t_srh1'
Warning: ip6t_srh.h:93 struct member 'psid_addr' not described
in 'ip6t_srh1'
Warning: ip6t_srh.h:93 struct member 'nsid_addr' not described
in 'ip6t_srh1'
Warning: ip6t_srh.h:93 struct member 'lsid_addr' not described
in 'ip6t_srh1'
Warning: ip6t_srh.h:93 struct member 'psid_msk' not described
in 'ip6t_srh1'
Warning: ip6t_srh.h:93 struct member 'nsid_msk' not described
in 'ip6t_srh1'
Warning: ip6t_srh.h:93 struct member 'lsid_msk' not described
in 'ip6t_srh1'
Warning: ip6t_srh.h:93 struct member 'mt_flags' not described
in 'ip6t_srh1'
Warning: ip6t_srh.h:93 struct member 'mt_invflags' not described
in 'ip6t_srh1'
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
traffic
Introduce the capability to send TCP traffic over IPv6 to
nft_flowtable netfilter selftest.
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
This is useful to update the limit or flags without clearing the
connections tracked. Use READ_ONCE() on packetpath as it can be modified
on controlplane.
Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Connlimit expression can be used for all kind of packets and not only
for packets with connection state new. See this ruleset as example:
table ip filter {
chain input {
type filter hook input priority filter; policy accept;
tcp dport 22 ct count over 4 counter
}
}
Currently, if the connection count goes over the limit the counter will
count the packets. When a connection is closed, the connection count
won't decrement as it should because it is only updated for new
connections due to an optimization on __nf_conncount_add() that prevents
updating the list if the connection is duplicated.
To solve this problem, check whether the connection was skipped and if
so, update the list. Adjust count_tree() too so the same fix is applied
for xt_connlimit.
Fixes: 976afca1ceba ("netfilter: nf_conncount: Early exit in nf_conncount_lookup() and cleanup")
Closes: https://lore.kernel.org/netfilter/trinity-85c72a88-d762-46c3-be97-36f10e5d9796-1761173693813@3c-app-mailcom-bs12/
Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
For convenience when performing GC over the connection list, make
nf_conncount_gc_list() to disable BH. This unifies the behavior with
nf_conncount_add() and nf_conncount_count().
Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
When using nf_conncount infrastructure for non-confirmed connections a
duplicated track is possible due to an optimization introduced since
commit d265929930e2 ("netfilter: nf_conncount: reduce unnecessary GC").
In order to fix this introduce a new conncount API that receives
directly an sk_buff struct. It fetches the tuple and zone and the
corresponding ct from it. It comes with both existing conncount variants
nf_conncount_count_skb() and nf_conncount_add_skb(). In addition remove
the old API and adjust all the users to use the new one.
This way, for each sk_buff struct it is possible to check if there is a
ct present and already confirmed. If so, skip the add operation.
Fixes: d265929930e2 ("netfilter: nf_conncount: reduce unnecessary GC")
Signed-off-by: Fernando Fernandez Mancera <fmancera@suse.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Introduce specific selftest for IPIP flowtable SW acceleration in
nft_flowtable.sh
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Introduce sw acceleration for tx path of IPIP tunnels relying on the
netfilter flowtable infrastructure.
This patch introduces basic infrastructure to accelerate other tunnel
types (e.g. IP6IP6).
IPIP sw tx acceleration can be tested running the following scenario where
the traffic is forwarded between two NICs (eth0 and eth1) and an IPIP
tunnel is used to access a remote site (using eth1 as the underlay device):
ETH0 -- TUN0 <==> ETH1 -- [IP network] -- TUN1 (192.168.100.2)
$ip addr show
6: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
link/ether 00:00:22:33:11:55 brd ff:ff:ff:ff:ff:ff
inet 192.168.0.2/24 scope global eth0
valid_lft forever preferred_lft forever
7: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
link/ether 00:11:22:33:11:55 brd ff:ff:ff:ff:ff:ff
inet 192.168.1.1/24 scope global eth1
valid_lft forever preferred_lft forever
8: tun0@NONE: <POINTOPOINT,NOARP,UP,LOWER_UP> mtu 1480 qdisc noqueue state UNKNOWN group default qlen 1000
link/ipip 192.168.1.1 peer 192.168.1.2
inet 192.168.100.1/24 scope global tun0
valid_lft forever preferred_lft forever
$ip route show
default via 192.168.100.2 dev tun0
192.168.0.0/24 dev eth0 proto kernel scope link src 192.168.0.2
192.168.1.0/24 dev eth1 proto kernel scope link src 192.168.1.1
192.168.100.0/24 dev tun0 proto kernel scope link src 192.168.100.1
$nft list ruleset
table inet filter {
flowtable ft {
hook ingress priority filter
devices = { eth0, eth1 }
}
chain forward {
type filter hook forward priority filter; policy accept;
meta l4proto { tcp, udp } flow add @ft
}
}
Reproducing the scenario described above using veths I got the following
results:
- TCP stream trasmitted into the IPIP tunnel:
- net-next: (baseline) ~ 85Gbps
- net-next + IPIP flowtable support: ~102Gbps
Co-developed-by: Pablo Neira Ayuso <pablo@netfilter.org>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Introduce sw acceleration for rx path of IPIP tunnels relying on the
netfilter flowtable infrastructure. Subsequent patches will add sw
acceleration for IPIP tunnels tx path.
This series introduces basic infrastructure to accelerate other tunnel
types (e.g. IP6IP6).
IPIP rx sw acceleration can be tested running the following scenario where
the traffic is forwarded between two NICs (eth0 and eth1) and an IPIP
tunnel is used to access a remote site (using eth1 as the underlay device):
ETH0 -- TUN0 <==> ETH1 -- [IP network] -- TUN1 (192.168.100.2)
$ip addr show
6: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
link/ether 00:00:22:33:11:55 brd ff:ff:ff:ff:ff:ff
inet 192.168.0.2/24 scope global eth0
valid_lft forever preferred_lft forever
7: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
link/ether 00:11:22:33:11:55 brd ff:ff:ff:ff:ff:ff
inet 192.168.1.1/24 scope global eth1
valid_lft forever preferred_lft forever
8: tun0@NONE: <POINTOPOINT,NOARP,UP,LOWER_UP> mtu 1480 qdisc noqueue state UNKNOWN group default qlen 1000
link/ipip 192.168.1.1 peer 192.168.1.2
inet 192.168.100.1/24 scope global tun0
valid_lft forever preferred_lft forever
$ip route show
default via 192.168.100.2 dev tun0
192.168.0.0/24 dev eth0 proto kernel scope link src 192.168.0.2
192.168.1.0/24 dev eth1 proto kernel scope link src 192.168.1.1
192.168.100.0/24 dev tun0 proto kernel scope link src 192.168.100.1
$nft list ruleset
table inet filter {
flowtable ft {
hook ingress priority filter
devices = { eth0, eth1 }
}
chain forward {
type filter hook forward priority filter; policy accept;
meta l4proto { tcp, udp } flow add @ft
}
}
Reproducing the scenario described above using veths I got the following
results:
- TCP stream received from the IPIP tunnel:
- net-next: (baseline) ~ 71Gbps
- net-next + IPIP flowtbale support: ~101Gbps
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
This simplifies IPIP tunnel support coming in follow up patches.
No function changes are intended.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
hw_ifidx was originally introduced to store the real netdevice as a
requirement for the hardware offload support in:
73f97025a972 ("netfilter: nft_flow_offload: use direct xmit if hardware offload is enabled")
Since ("netfilter: flowtable: consolidate xmit path"), ifidx and
hw_ifidx points to the real device in the xmit path, remove it.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Push the pppoe header from the flowtable xmit path, inlining is faster
than the original xmit path because it can avoid some locking.
This is based on a patch originally written by wenxu.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Push the vlan header from the flowtable xmit path, instead of passing
the packet to the vlan device.
This is based on a patch originally written by wenxu.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Use dev_queue_xmit() for the XMIT_NEIGH case. Store the interface index
of the real device behind the vlan/pppoe device, this introduces an
extra lookup for the real device in the xmit path because rt->dst.dev
provides the vlan/pppoe device.
XMIT_NEIGH now looks more similar to XMIT_DIRECT but the check for stale
dst and the neighbour lookup still remain in place which is convenient
to deal with network topology changes.
Note that nft_flow_route() needs to relax the check for _XMIT_NEIGH so
the existing basic xfrm offload (which only works in one direction) does
not break.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
This file contains the path discovery that is run from the forward chain
for the packet offloading the flow into the flowtable. This consists
of a series of calls to dev_fill_forward_path() for each device stack.
More topologies may be supported in the future, so move this code to its
own file to separate it from the nftables flow_offload expression.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
Add a sanity check to skip path discovery if the maximum number of
encapsulation is reached. While at it, check for underflow too.
Fixes: 26267bf9bb57 ("netfilter: flowtable: bridge vlan hardware offload and switchdev")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
s/it/if/ and s/revokation/revocation/, capitalize "clear", and add a
period after the sentence. Fix the comment formatting.
Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
|
|
strncpy() is deprecated for NUL-terminated destination buffers; use
strscpy_pad() instead to retain the NUL-padding behavior of strncpy().
The destination buffer is initialized using kzalloc() with a 'signature'
size of ECRYPTFS_PASSWORD_SIG_SIZE + 1. strncpy() then copies up to
ECRYPTFS_PASSWORD_SIG_SIZE bytes from 'key_desc', NUL-padding any
remaining bytes if needed, but expects the last byte to be zero.
strscpy_pad() also copies the source string to 'signature', and NUL-pads
the destination buffer if needed, but ensures it's always NUL-terminated
without relying on it being zero-initialized.
strscpy_pad() automatically determines the size of the fixed-length
destination buffer via sizeof() when the optional size argument is
omitted, making an explicit size unnecessary.
In encrypted_init(), the source string 'key_desc' is validated by
valid_ecryptfs_desc() before calling ecryptfs_fill_auth_tok(), and is
therefore NUL-terminated and satisfies the __must_be_cstr() requirement
of strscpy_pad().
Link: https://github.com/KSPP/linux/issues/90
Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Reviewed-by: Kees Cook <kees@kernel.org>
Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org>
Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
|
|
The local variables 'size_t datalen' are unsigned and cannot be less
than zero. Remove the redundant conditions.
Signed-off-by: Thorsten Blum <thorsten.blum@linux.dev>
Reviewed-by: Mimi Zohar <zohar@linux.ibm.com>
Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org>
|
|
Conflicts:
net/xdp/xsk.c
0ebc27a4c67d ("xsk: avoid data corruption on cq descriptor number")
8da7bea7db69 ("xsk: add indirect call for xsk_destruct_skb")
30ed05adca4a ("xsk: use a smaller new lock for shared pool case")
https://lore.kernel.org/20251127105450.4a1665ec@canb.auug.org.au
https://lore.kernel.org/eb4eee14-7e24-4d1b-b312-e9ea738fefee@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
__acpi_processor_start()"
Revert commit 8a1b5d412cb4 ("ACPI: processor: Update cpuidle driver
check in __acpi_processor_start()") which depends on commit
7a8c994cbb2d ("ACPI: processor: idle: Optimize ACPI idle driver
registration") that got reverted.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Pull ceph fixes from Ilya Dryomov:
"A patch to make sparse read handling work in msgr2 secure mode from
Slava and a couple of fixes from Ziming and myself to avoid operating
on potentially invalid memory, all marked for stable"
* tag 'ceph-for-6.18-rc8' of https://github.com/ceph/ceph-client:
libceph: prevent potential out-of-bounds writes in handle_auth_session_key()
libceph: replace BUG_ON with bounds check for map->max_osd
ceph: fix crash in process_v2_sparse_read() for encrypted directories
libceph: drop started parameter of __ceph_open_session()
libceph: fix potential use-after-free in have_mon_and_osd_map()
|
|
The define ARM64_FEATURE_FIELD_BITS is now unused and feature id
fields don't always have 4 bits. Remove it.
Signed-off-by: Ben Horgan <ben.horgan@arm.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
In test_clidr() if an empty cache level is not found then the TEST_ASSERT
will not fire. Fix this by considering all 7 possible levels when iterating
through the hierarchy. Found by inspection.
Signed-off-by: Ben Horgan <ben.horgan@arm.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Acked-by: Oliver Upton <oupton@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
ARM64_FEATURE_FIELD_BITS is set to 4 but not all ID register fields are 4
bits. See for instance ID_AA64SMFR0_EL1. The last user of this define,
ARM64_FEATURE_FIELD_BITS, is the set_id_regs selftest. Its logic assumes
the fields aren't a single bits; assert that's the case and stop using the
define. As there are no more users, ARM64_FEATURE_FIELD_BITS is removed
from the arm64 tools sysreg.h header. A separate commit removes this from
the kernel version of the header.
Signed-off-by: Ben Horgan <ben.horgan@arm.com>
Acked-by: Marc Zyngier <maz@kernel.org>
Acked-by: Oliver Upton <oupton@kernel.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
The ATOMIC_FETCH_OP_AND and ATOMIC64_FETCH_OP_AND macros accept 'mb' and
'cl' parameters but never use them in their implementation. These macros
simply delegate to the corresponding andnot functions, which handle the
actual atomic operations and memory barriers.
Signed-off-by: Seongsu Park <sgsu.park@samsung.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
"Including fixes from bluetooth and CAN. No known outstanding
regressions.
Current release - regressions:
- mptcp: initialize rcv_mss before calling tcp_send_active_reset()
- eth: mlx5e: fix validation logic in rate limiting
Previous releases - regressions:
- xsk: avoid data corruption on cq descriptor number
- bluetooth:
- prevent race in socket write iter and sock bind
- fix not generating mackey and ltk when repairing
- can:
- kvaser_usb: fix potential infinite loop in command parsers
- rcar_canfd: fix CAN-FD mode as default
- eth:
- veth: reduce XDP no_direct return section to fix race
- virtio-net: avoid unnecessary checksum calculation on guest RX
Previous releases - always broken:
- sched: fix TCF_LAYER_TRANSPORT handling in tcf_get_base_ptr()
- bluetooth: mediatek: fix kernel crash when releasing iso interface
- vhost: rewind next_avail_head while discarding descriptors
- eth:
- r8169: fix RTL8127 hang on suspend/shutdown
- aquantia: add missing descriptor cache invalidation on ATL2
- dsa: microchip: fix resource releases in error path"
* tag 'net-6.18-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (47 commits)
mptcp: Initialise rcv_mss before calling tcp_send_active_reset() in mptcp_do_fastclose().
net: fec: do not register PPS event for PEROUT
net: fec: do not allow enabling PPS and PEROUT simultaneously
net: fec: do not update PEROUT if it is enabled
net: fec: cancel perout_timer when PEROUT is disabled
net: mctp: unconditionally set skb->dev on dst output
net: atlantic: fix fragment overflow handling in RX path
MAINTAINERS: separate VIRTIO NET DRIVER and add netdev
virtio-net: avoid unnecessary checksum calculation on guest RX
eth: fbnic: Fix counter roll-over issue
mptcp: clear scheduled subflows on retransmit
net: dsa: sja1105: fix SGMII linking at 10M or 100M but not passing traffic
s390/net: list Aswin Karuvally as maintainer
net: wwan: mhi: Keep modem name match with Foxconn T99W640
vhost: rewind next_avail_head while discarding descriptors
net/sched: em_canid: fix uninit-value in em_canid_match
can: rcar_canfd: Fix CAN-FD mode as default
xsk: avoid data corruption on cq descriptor number
r8169: fix RTL8127 hang on suspend/shutdown
net: sxgbe: fix potential NULL dereference in sxgbe_rx()
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86
Pull platform driver fixes from Ilpo Järvinen:
- arm64/thinkpad-t14s-ec:
- Fix IRQ race condition
- Sleep after EC access
- intel/punit_ipc: Fix memory corruption
* tag 'platform-drivers-x86-v6.18-5' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
platform/x86: intel: punit_ipc: fix memory corruption
platform: arm64: thinkpad-t14s-ec: sleep after EC access
platform: arm64: thinkpad-t14s-ec: fix IRQ race condition
|
|
fill_pool_map is used to suppress nesting violations caused by acquiring
a spinlock_t (from within the memory allocator) while holding a
raw_spinlock_t. The used annotation is wrong.
LD_WAIT_SLEEP is for always sleeping lock types such as mutex_t.
LD_WAIT_CONFIG is for lock type which are sleeping while spinning on
PREEMPT_RT such as spinlock_t.
Use LD_WAIT_CONFIG as override.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://patch.msgid.link/20251127153652.291697-3-bigeasy@linutronix.de
|
|
The pool of free objects is refilled on several occasions such as object
initialisation. On PREEMPT_RT refilling is limited to preemptible
sections due to sleeping locks used by the memory allocator. The system
boots with disabled interrupts so the pool can not be refilled.
If too many objects are initialized and the pool gets empty then
debugobjects disables itself.
Refiling can also happen early in the boot with disabled interrupts as
long as the scheduler is not operational. If the scheduler can not
preempt a task then a sleeping lock can not be contended.
Allow to additionally refill the pool if the scheduler is not
operational.
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://patch.msgid.link/20251127153652.291697-2-bigeasy@linutronix.de
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux
Pull devfreq changes for v6.19 from Chanwoo Choi:
"- Move governor.h under include/linux/ and rename to devfreq-governor.h
in order to allow devfreq governor definitions in out of drivers/devfreq/.
- Fix potential use-after-free issue of OPP handling on hisi_uncore_freq.c
- Use min() to improve the readability on tegra30-devfreq.c
- Fix typo in DFSO_DOWNDIFFERENTIAL macro name on governor_simpleondemand.c"
* tag 'devfreq-next-for-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/chanwoo/linux:
PM / devfreq: Fix typo in DFSO_DOWNDIFFERENTIAL macro name
PM / devfreq: tegra30: use min to simplify actmon_cpu_to_emc_rate
PM / devfreq: hisi: Fix potential UAF in OPP handling
PM / devfreq: Move governor.h to a public header location
|
|
The macro for_each_console_srcu iterates over all registered consoles. It's
implied that all registered consoles have CON_ENABLED flag set, making
the check for the flag unnecessary. Call console_is_usable function to
fully verify if the given console is usable before calling the ->unblank
callback.
Signed-off-by: Marcos Paulo de Souza <mpdesouza@suse.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Link: https://patch.msgid.link/20251121-printk-cleanup-part2-v2-3-57b8b78647f4@suse.com
Signed-off-by: Petr Mladek <pmladek@suse.com>
|
|
All consoles found on for_each_console are registered, meaning that all
of them have the CON_ENABLED flag set. Since NBCON was introduced it's
important to check if a given console also implements the NBCON callbacks.
The function console_is_usable does exactly that.
Signed-off-by: Marcos Paulo de Souza <mpdesouza@suse.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Link: https://patch.msgid.link/20251121-printk-cleanup-part2-v2-2-57b8b78647f4@suse.com
Signed-off-by: Petr Mladek <pmladek@suse.com>
|
|
The original code tried to find a console that has CON_BOOT _or_
CON_ENABLED flag set. The flag CON_ENABLED is set to all registered
consoles, so in this case this check is always true, even for the
CON_BOOT consoles.
The initial intent of the kgdboc_earlycon_init was to get a console
early (CON_BOOT) or later on in the process (CON_ENABLED). The
code was using for_each_console macro, meaning that all console structs
were previously registered on the printk() machinery. At this point,
any console found on for_each_console is safe for kgdboc_earlycon_init
to use.
Dropping the check makes the code cleaner, and avoids further confusion
by future readers of the code.
Signed-off-by: Marcos Paulo de Souza <mpdesouza@suse.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
Link: https://patch.msgid.link/20251121-printk-cleanup-part2-v2-1-57b8b78647f4@suse.com
Signed-off-by: Petr Mladek <pmladek@suse.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next
Marc Kleine-Budde says:
====================
pull-request: can-next 2025-11-26
this is a pull request of 27 patches for net-next/main.
The first 17 patches are by Vincent Mailhol and Oliver Hartkopp and
add CAN XL support to the CAN netlink interface.
Geert Uytterhoeven and Biju Das provide 7 patches for the rcar_canfd
driver to add suspend/resume support.
The next 2 patches are by Markus Schneider-Pargmann and add them as
the m_can maintainer.
Conor Dooley's patch updates the mpfs-can DT bindungs.
linux-can-next-for-6.19-20251126
* tag 'linux-can-next-for-6.19-20251126' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next: (27 commits)
dt-bindings: can: mpfs: document resets
MAINTAINERS: Simplify m_can section
MAINTAINERS: Add myself as m_can maintainer
can: rcar_canfd: Add suspend/resume support
can: rcar_canfd: Convert to DEFINE_SIMPLE_DEV_PM_OPS()
can: rcar_canfd: Invert CAN clock and close_candev() order
can: rcar_canfd: Extract rcar_canfd_global_{,de}init()
can: rcar_canfd: Use devm_clk_get_optional() for RAM clk
can: rcar_canfd: Invert global vs. channel teardown
can: rcar_canfd: Invert reset assert order
can: dev: print bitrate error with two decimal digits
can: raw: instantly reject unsupported CAN frames
can: add dummy_can driver
can: calc_bittiming: add can_calc_sample_point_pwm()
can: calc_bittiming: add can_calc_sample_point_nrz()
can: calc_bittiming: replace misleading "nominal" by "reference"
can: netlink: add PWM netlink interface
can: calc_bittiming: add PWM calculation
can: bittiming: add PWM validation
can: bittiming: add PWM parameters
...
====================
Link: https://patch.msgid.link/20251126120106.154635-1-mkl@pengutronix.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
This code doesn't run. Since 2008:
4f9c11dd49fb ("x86, 64-bit: adjust mapping of physical pagetables to work with Xen")
the kernel has gained more flexible logging and tracing capabilities;
presumably if anyone wanted to take advantage of this log message they would
have got rid of the "if (0)" so they could use these capabilities.
Since they haven't, just delete it.
Signed-off-by: Brendan Jackman <jackmanb@google.com>
Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
Link: https://patch.msgid.link/20251003-x86-init-cleanup-v1-1-f2b7994c2ad6@google.com
|
|
This commit updates the initialization for the "srcu-fast" scale
type to use DEFINE_STATIC_SRCU_FAST() when reader_flavor is equal to
SRCU_READ_FLAVOR_FAST.
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: <bpf@vger.kernel.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
|
|
This commit causes rcutorture's srcu_torture_init() and
srcud_torture_init() functions to announce on the console log
which variant of SRCU is being tortured, for example: "torture:
srcud_torture_init fast SRCU".
[ paulmck: Apply feedback from kernel test robot. ]
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
|
|
This commit creates an SRCU-fast-updown API, including
DEFINE_SRCU_FAST_UPDOWN(), DEFINE_STATIC_SRCU_FAST_UPDOWN(),
__init_srcu_struct_fast_updown(), init_srcu_struct_fast_updown(),
srcu_read_lock_fast_updown(), srcu_read_unlock_fast_updown(),
__srcu_read_lock_fast_updown(), and __srcu_read_unlock_fast_updown().
These are initially identical to their SRCU-fast counterparts, but both
SRCU-fast and SRCU-fast-updown will be optimized in different directions
by later commits. SRCU-fast will lack any sort of srcu_down_read() and
srcu_up_read() APIs, which will enable extremely efficient NMI safety.
For its part, SRCU-fast-updown will not be NMI safe, which will enable
reasonably efficient implementations of srcu_down_read_fast() and
srcu_up_read_fast().
This API fork happens to meet two different future use cases.
* SRCU-fast will become the reimplementation basis for RCU-TASK-TRACE
for consolidation. Since RCU-TASK-TRACE must be NMI safe, SRCU-fast
must be as well.
* SRCU-fast-updown will be needed for uretprobes code in order to get
rid of the read-side memory barriers while still allowing entering the
reader at task level while exiting it in a timer handler.
This commit also adds rcutorture tests for the new APIs. This
(annoyingly) needs to be in the same commit for bisectability. With this
commit, the 0x8 value tests SRCU-fast-updown. However, most SRCU-fast
testing will be via the RCU Tasks Trace wrappers.
[ paulmck: Apply s/0x8/0x4/ missing change per Boqun Feng feedback. ]
[ paulmck: Apply Akira Yokosawa feedback. ]
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: <bpf@vger.kernel.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
|
|
Fix spelling by replacing "interrups" with "interrupts".
Signed-off-by: Chu Guangqing <chuguangqing@inspur.com>
[ rjw: Changelog edits ]
Link: https://patch.msgid.link/20251125022403.2614-1-chuguangqing@inspur.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
Fix spelling by replacing "successfull" with "successful".
Signed-off-by: Chu Guangqing <chuguangqing@inspur.com>
[ rjw: Changelog edits ]
Link: https://patch.msgid.link/20251125021431.2243-1-chuguangqing@inspur.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
|
|
mptcp_do_fastclose().
syzbot reported divide-by-zero in __tcp_select_window() by
MPTCP socket. [0]
We had a similar issue for the bare TCP and fixed in commit
499350a5a6e7 ("tcp: initialize rcv_mss to TCP_MIN_MSS instead
of 0").
Let's apply the same fix to mptcp_do_fastclose().
[0]:
Oops: divide error: 0000 [#1] SMP KASAN PTI
CPU: 0 UID: 0 PID: 6068 Comm: syz.0.17 Not tainted syzkaller #0 PREEMPT(full)
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/25/2025
RIP: 0010:__tcp_select_window+0x824/0x1320 net/ipv4/tcp_output.c:3336
Code: ff ff ff 44 89 f1 d3 e0 89 c1 f7 d1 41 01 cc 41 21 c4 e9 a9 00 00 00 e8 ca 49 01 f8 e9 9c 00 00 00 e8 c0 49 01 f8 44 89 e0 99 <f7> 7c 24 1c 41 29 d4 48 bb 00 00 00 00 00 fc ff df e9 80 00 00 00
RSP: 0018:ffffc90003017640 EFLAGS: 00010293
RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff88807b469e40
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffffc90003017730 R08: ffff888033268143 R09: 1ffff1100664d028
R10: dffffc0000000000 R11: ffffed100664d029 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
FS: 000055557faa0500(0000) GS:ffff888126135000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f64a1912ff8 CR3: 0000000072122000 CR4: 00000000003526f0
Call Trace:
<TASK>
tcp_select_window net/ipv4/tcp_output.c:281 [inline]
__tcp_transmit_skb+0xbc7/0x3aa0 net/ipv4/tcp_output.c:1568
tcp_transmit_skb net/ipv4/tcp_output.c:1649 [inline]
tcp_send_active_reset+0x2d1/0x5b0 net/ipv4/tcp_output.c:3836
mptcp_do_fastclose+0x27e/0x380 net/mptcp/protocol.c:2793
mptcp_disconnect+0x238/0x710 net/mptcp/protocol.c:3253
mptcp_sendmsg_fastopen+0x2f8/0x580 net/mptcp/protocol.c:1776
mptcp_sendmsg+0x1774/0x1980 net/mptcp/protocol.c:1855
sock_sendmsg_nosec net/socket.c:727 [inline]
__sock_sendmsg+0xe5/0x270 net/socket.c:742
__sys_sendto+0x3bd/0x520 net/socket.c:2244
__do_sys_sendto net/socket.c:2251 [inline]
__se_sys_sendto net/socket.c:2247 [inline]
__x64_sys_sendto+0xde/0x100 net/socket.c:2247
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xfa/0xfa0 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f66e998f749
Code: ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 a8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ffff9acedb8 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
RAX: ffffffffffffffda RBX: 00007f66e9be5fa0 RCX: 00007f66e998f749
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000003
RBP: 00007ffff9acee10 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001
R13: 00007f66e9be5fa0 R14: 00007f66e9be5fa0 R15: 0000000000000006
</TASK>
Fixes: ae155060247b ("mptcp: fix duplicate reset on fastclose")
Reported-by: syzbot+3a92d359bc2ec6255a33@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/69260882.a70a0220.d98e3.00b4.GAE@google.com/
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20251125195331.309558-1-kuniyu@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
In Store and Forward mode, flushing frames when the receive buffer is
unavailable, can cause the MTL Rx FIFO to go out of sync. This results
in buffering of a few frames in the FIFO without triggering Rx DMA
from transferring the data to the system memory until another packet
is received. Once the issue happens, for a ping request, the packet is
forwarded to the system memory only after we receive another packet
and hece we observe a latency equivalent to the ping interval.
64 bytes from 192.168.2.100: seq=1 ttl=64 time=1000.344 ms
Also, we can observe constant gmacgrp_debug register value of
0x00000120, which indicates "Reading frame data".
The issue is not reproducible after disabling frame flushing when Rx
buffer is unavailable. But in that case, the Rx DMA enters a suspend
state due to buffer unavailability. To resume operation, software
must write to the receive_poll_demand register after adding new
descriptors, which reactivates the Rx DMA.
This issue is observed in the socfpga platforms which has dwmac1000 IP
like Arria 10, Cyclone V and Agilex 7. Issue is reproducible after
running iperf3 server at the DUT for UDP lower packet sizes.
Signed-off-by: Rohan G Thomas <rohan.g.thomas@altera.com>
Reviewed-by: Matthew Gerlach <matthew.gerlach@altera.com>
Tested-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://patch.msgid.link/20251126-a10_ext_fix-v1-1-d163507f646f@altera.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The unix_connreset.c test included <stdlib.h>, but no symbol from that
header is used. This causes a fatal build error under certain
linux-next configurations where stdlib.h is not available.
Remove the unused include to fix the build.
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/r/202511221800.hcgCKvVa-lkp@intel.com/
Signed-off-by: Sunday Adelodun <adelodunolaoluwa@yahoo.com>
Link: https://patch.msgid.link/20251125113648.25903-1-adelodunolaoluwa@yahoo.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Propagate fwnode of the ACPI device to the SPI controller Linux device.
Currently only OF case propagates fwnode to the controller.
While at it, replace several calls to dev_fwnode() with a single one
cached in a local variable, and unify checks for fwnode type by using
is_*_node() APIs.
Fixes: 55ab8487e01d ("spi: spi-nxp-fspi: Add ACPI support")
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Haibo Chen <haibo.chen@nxp.com>
Link: https://patch.msgid.link/20251126202501.2319679-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
The LDO2 judgement bit position should be 7, not 6.
Cc: stable@vger.kernel.org
Reported-by: Yoon Dong Min <dm.youn@telechips.com>
Fixes: b65439d90150 ("regulator: rtq2208: Fix the LDO DVS capability")
Signed-off-by: ChiYuan Huang <cy_huang@richtek.com>
Link: https://patch.msgid.link/faadb009f84b88bfcabe39fc5009c7357b00bbe2.1764209258.git.cy_huang@richtek.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Correct buck group2 H and F mapping logic.
Cc: stable@vger.kernel.org
Reported-by: Yoon Dong Min <dm.youn@telechips.com>
Fixes: 1742e7e978ba ("regulator: rtq2208: Fix incorrect buck converter phase mapping")
Signed-off-by: ChiYuan Huang <cy_huang@richtek.com>
Link: https://patch.msgid.link/8527ae02a72b754d89b7580a5fe7474d6f80f5c3.1764209258.git.cy_huang@richtek.com
Signed-off-by: Mark Brown <broonie@kernel.org>
|
|
Wei Fang says:
====================
net: fec: fix some PTP related issues
There are some issues which were introduced by the commit 350749b909bf
("net: fec: Add support for periodic output signal of PPS"). See each
patch for more details.
====================
Link: https://patch.msgid.link/20251125085210.1094306-1-wei.fang@nxp.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
There are currently two situations that can trigger the PTP interrupt,
one is the PPS event, the other is the PEROUT event. However, the irq
handler fec_pps_interrupt() does not check the irq event type and
directly registers a PPS event into the system, but the event may be
a PEROUT event. This is incorrect because PEROUT is an output signal,
while PPS is the input of the kernel PPS system. Therefore, add a check
for the event type, if pps_enable is true, it means that the current
event is a PPS event, and then the PPS event is registered.
Fixes: 350749b909bf ("net: fec: Add support for periodic output signal of PPS")
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Link: https://patch.msgid.link/20251125085210.1094306-5-wei.fang@nxp.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
In the current driver, PPS and PEROUT use the same channel to generate
the events, so they cannot be enabled at the same time. Otherwise, the
later configuration will overwrite the earlier configuration. Therefore,
when configuring PPS, the driver will check whether PEROUT is enabled.
Similarly, when configuring PEROUT, the driver will check whether PPS
is enabled.
Fixes: 350749b909bf ("net: fec: Add support for periodic output signal of PPS")
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Link: https://patch.msgid.link/20251125085210.1094306-4-wei.fang@nxp.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
If the previously set PEROUT is already active, updating it will cause
the new PEROUT to start immediately instead of at the specified time.
This is because fep->reload_period is updated whithout check whether
the PEROUT is enabled, and the old PEROUT is not disabled. Therefore,
the pulse period will be updated immediately in the pulse interrupt
handler fec_pps_interrupt().
Currently, the driver does not support directly updating PEROUT and it
will make the logic be more complicated. To fix the current issue, add
a check before enabling the PEROUT, the driver will return an error if
PEROUT is enabled. If users wants to update a new PEROUT, they should
disable the old PEROUT first.
Fixes: 350749b909bf ("net: fec: Add support for periodic output signal of PPS")
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Link: https://patch.msgid.link/20251125085210.1094306-3-wei.fang@nxp.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The PEROUT allows the user to set a specified future time to output the
periodic signal. If the future time is far from the current time, the FEC
driver will use hrtimer to configure PEROUT one second before the future
time. However, the hrtimer will not be canceled if the PEROUT is disabled
before the hrtimer expires. So the PEROUT will be configured when the
hrtimer expires, which is not as expected. Therefore, cancel the hrtimer
in fec_ptp_pps_disable() to fix this issue.
Fixes: 350749b909bf ("net: fec: Add support for periodic output signal of PPS")
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Link: https://patch.msgid.link/20251125085210.1094306-2-wei.fang@nxp.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
On transmit, we are currently relying on skb->dev being set by
mctp_local_output() when we first set up the skb destination fields.
However, forwarded skbs do not use the local_output path, so will retain
their incoming netdev as their ->dev on tx. This does not work when
we're forwarding between interfaces.
Set skb->dev unconditionally in the transmit path, to allow for proper
forwarding.
We keep the skb->dev initialisation in mctp_local_output(), as we use it
for fragmentation.
Fixes: 269936db5eb3 ("net: mctp: separate routing database from routing operations")
Suggested-by: Vince Chang <vince_chang@aspeedtech.com>
Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Link: https://patch.msgid.link/20251125-dev-forward-v1-1-54ecffcd0616@codeconstruct.com.au
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Alexander Duyck says:
====================
net: phy: Add support for fbnic PHY w/ 25G, 50G, and 100G support
To transition the fbnic driver to using the XPCS driver we need to address
the fact that we need a representation for the FW managed PMD that is
actually a SerDes PHY to handle link bouncing during link training.
This patch set introduces the necessary bits to the XPCS driver code to
enable it to read 25G, 50G, and 100G speeds from the PCS ctrl1 register,
and adds support for the approriate interfaces.
The rest of this patch set enables the changes to fbnic to make use of
these interfaces and expose a PMD that can provide a necessary link delay
to avoid link flapping in the event that a cable is disconnected and
reconnected, and to correctly expose the count for the link down events.
With this we have the basic groundwork laid as with this all the bits and
pieces are in place in terms of reading the configuration. The general plan
for follow-on patch sets is to start looking at enabling changing the
configuration in environments where that is supported.
====================
Link: https://patch.msgid.link/176374310349.959489.838154632023183753.stgit@ahduyck-xeon-server.home.arpa
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
As we have exposed the PCS registers via the SWMII we can now start looking
at connecting the XPCS driver to those registers and let it mange the PCS
instead of us doing it directly from the fbnic driver.
For now this just gets us the ability to detect link. The hope is in the
future to add some of the vendor specific registers to begin enabling XPCS
configuration of the interface.
Signed-off-by: Alexander Duyck <alexanderduyck@fb.com>
Link: https://patch.msgid.link/176374325295.959489.14521115864034905277.stgit@ahduyck-xeon-server.home.arpa
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
In order for us to support a PCS device we need to add an MDIO bus to allow
the drivers to have access to the registers for the device. This change
adds such an interface.
The interface will consist of 2 PHY addrs, the first one consisting of a
PMD and PCS, and the second just being a PCS. There is a need for 2 PHYs
addrs due to the fact that in order to support the 50GBase-CR2 mode we will
need to access and configure the PCS vendor registers and RSFEC registers
from the second lane identical to the first.
Signed-off-by: Alexander Duyck <alexanderduyck@fb.com>
Link: https://patch.msgid.link/176374324532.959489.15389723111560978054.stgit@ahduyck-xeon-server.home.arpa
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
We were previously not displaying the number of link_down_events tracked by
the device. With this change we should now be able to display the value.
The value itself tracks the calls from the phylink interface to the
mac_link_down call.
Signed-off-by: Alexander Duyck <alexanderduyck@fb.com>
Link: https://patch.msgid.link/176374323824.959489.6915296616773178954.stgit@ahduyck-xeon-server.home.arpa
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
One complication with the design of our part is that the PMD doesn't
provide a direct signal to the host. Instead we have visibility to signals
that the PCS provides to the MAC that allow us to check the link state
through that.
We will need to account for several things in the PMD and firmware when
managing the link. Specifically when the link first starts to come up the
PMD will cause the link to flap. This is due to the firmware starting a
training cycle when the link is first detected. This will cause link
flapping if we were to immediately report link up when the PCS first
detects it.
To address that we are adding a pmd_state variable that is meant to be a
countdown of sorts indicating the state of the PMD. If the link is down or
has been reconfigured the PMD will start out in the initialize state. By
default the link is assumed to be in the SEND_DATA state if it is available
on initial link inspection. If link is detected while in the initialize
state the PMD state will switch to training, and if after 4 seconds the
link is still stable we will transition to link_ready, and finally the
send_data state. With this we can avoid link flapping when a cable is
first connected to the NIC.
One side effect of this is that we need to pull the link state away from
the PCS. For now we use a union of the PCS link state register value and
the pmd_state. The plan is to add a PMD register to report the pmd_state
to the phylink interface. With that we can then look at switching over to
the use of the XPCS driver for fbnic instead of having an internal one.
Signed-off-by: Alexander Duyck <alexanderduyck@fb.com>
Link: https://patch.msgid.link/176374323107.959489.14951134213387615059.stgit@ahduyck-xeon-server.home.arpa
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Throughout several spots in the code I had called out the IRQ as being
related to the PCS. However the actual IRQ is a part of the MAC and it is
just exposing PCS data. To more accurately reflect the owner of the calls
this change makes it so that we rename the functions and values that are
taking in the interrupt value and processing it to reflect that it is a MAC
call and not a PCS one.
This change is mostly motivated by the fact that we will be moving the
handling of this interrupt from being PCS focused to being more PMA/PMD
focused as this will drive the phydev driver that I am adding instead of
driving the PCS directly.
Signed-off-by: Alexander Duyck <alexanderduyck@fb.com>
Link: https://patch.msgid.link/176374322373.959489.12018231545479053860.stgit@ahduyck-xeon-server.home.arpa
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The fbnic driver is planning to make use of the XPCS driver to enable
support for PCS and better integration with phylink. To do this though we
will need to enable several workarounds since the PMD interface for fbnic
is likely to be unique since it is a mix of two different vendor products
with a unique wrapper around the IP.
I have generated a PHY identifier based on IEEE 802.3-2022 22.2.4.3.1 using
an OUI belonging to Meta Platforms and used with our NICs. Using this we
will provide it as the PMD ID via the SW based MDIO interface so that
the fbnic device can be identified and necessary workarounds enabled in the
XPCS driver.
As an initial workaround this change adds an exception so that soft_reset
is not set when the driver is initially bound to the PCS.
In addition I have added logic to integrate the PMD Rx signal detect state
into the link state for the PCS. With this we can avoid the link coming up
too soon on the FBNIC PMD and as a result of it being in the training state
so we can avoid link flaps.
Signed-off-by: Alexander Duyck <alexanderduyck@fb.com>
Link: https://patch.msgid.link/176374321695.959489.6648161125012056619.stgit@ahduyck-xeon-server.home.arpa
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The XPCS driver was mangling the PMA identifier as the original code
appears to have been focused on just capturing the OUI. Rather than store a
mangled ID it is better to work with the actual PMA ID and instead just
mask out the values that don't apply rather than shifting them and
reordering them as you still don't get the original OUI for the NIC without
having to bitswap the values as per the definition of the layout in IEEE
802.3-2022 22.2.4.3.1.
By laying it out as it was in the hardware it is also less likely for us to
have an unintentional collision as the enum values will occupy the revision
number area while the OUI occupies the upper 22 bits.
Signed-off-by: Alexander Duyck <alexanderduyck@fb.com>
Link: https://patch.msgid.link/176374320920.959489.17267159479370601070.stgit@ahduyck-xeon-server.home.arpa
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
With this change we are adding support for 25G, 50G, and 100G interface
types to the XPCS driver. This had supposedly been enabled with the
addition of XLGMII but I don't see any capability for configuration there
so I suspect it may need to be refactored in the future.
With this change we can enable the XPCS driver with the selected interface
and it should be able to detect link, speed, and report the link status to
the phylink interface.
Signed-off-by: Alexander Duyck <alexanderduyck@fb.com>
Link: https://patch.msgid.link/176374320248.959489.11649590675011158859.stgit@ahduyck-xeon-server.home.arpa
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The 2.5G and 5G values are not consistent between the PCS CTRL1 and PMA
CTRL1 values. In order to avoid confusion between the two I am updating the
values to include "PMA" in the name similar to values used in similar
places.
To avoid breaking UAPI I have retained the original macros and just defined
them as the new PMA based defines.
Signed-off-by: Alexander Duyck <alexanderduyck@fb.com>
Link: https://patch.msgid.link/176374319569.959489.6610469879021800710.stgit@ahduyck-xeon-server.home.arpa
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
German Maglione is a co-maintainer of the virtiofsd userspace device
implementation (https://gitlab.com/virtio-fs/virtiofsd) and is currently
one of the most active virtiofs developers outside the kernel.
I have not worked on virtiofs except to review kernel patches for a few
years now and would like German to take over from me gradually. It is
healthier to have a kernel maintainer who is actively involved. I expect
to remove myself in a few months.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Link: https://patch.msgid.link/20251126211548.598469-1-stefanha@redhat.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
|
|
The len field originates from untrusted network packets. Boundary
checks have been added to prevent potential out-of-bounds writes when
decrypting the connection secret or processing service tickets.
[ idryomov: changelog ]
Cc: stable@vger.kernel.org
Signed-off-by: ziming zhang <ezrakiez@gmail.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
OSD indexes come from untrusted network packets. Boundary checks are
added to validate these against map->max_osd.
[ idryomov: drop BUG_ON in ceph_get_primary_affinity(), minor cosmetic
edits ]
Cc: stable@vger.kernel.org
Signed-off-by: ziming zhang <ezrakiez@gmail.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
The crash in process_v2_sparse_read() for fscrypt-encrypted directories
has been reported. Issue takes place for Ceph msgr2 protocol in secure
mode. It can be reproduced by the steps:
sudo mount -t ceph :/ /mnt/cephfs/ -o name=admin,fs=cephfs,ms_mode=secure
(1) mkdir /mnt/cephfs/fscrypt-test-3
(2) cp area_decrypted.tar /mnt/cephfs/fscrypt-test-3
(3) fscrypt encrypt --source=raw_key --key=./my.key /mnt/cephfs/fscrypt-test-3
(4) fscrypt lock /mnt/cephfs/fscrypt-test-3
(5) fscrypt unlock --key=my.key /mnt/cephfs/fscrypt-test-3
(6) cat /mnt/cephfs/fscrypt-test-3/area_decrypted.tar
(7) Issue has been triggered
[ 408.072247] ------------[ cut here ]------------
[ 408.072251] WARNING: CPU: 1 PID: 392 at net/ceph/messenger_v2.c:865
ceph_con_v2_try_read+0x4b39/0x72f0
[ 408.072267] Modules linked in: intel_rapl_msr intel_rapl_common
intel_uncore_frequency_common intel_pmc_core pmt_telemetry pmt_discovery
pmt_class intel_pmc_ssram_telemetry intel_vsec kvm_intel joydev kvm irqbypass
polyval_clmulni ghash_clmulni_intel aesni_intel rapl input_leds psmouse
serio_raw i2c_piix4 vga16fb bochs vgastate i2c_smbus floppy mac_hid qemu_fw_cfg
pata_acpi sch_fq_codel rbd msr parport_pc ppdev lp parport efi_pstore
[ 408.072304] CPU: 1 UID: 0 PID: 392 Comm: kworker/1:3 Not tainted 6.17.0-rc7+
[ 408.072307] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
1.17.0-5.fc42 04/01/2014
[ 408.072310] Workqueue: ceph-msgr ceph_con_workfn
[ 408.072314] RIP: 0010:ceph_con_v2_try_read+0x4b39/0x72f0
[ 408.072317] Code: c7 c1 20 f0 d4 ae 50 31 d2 48 c7 c6 60 27 d5 ae 48 c7 c7 f8
8e 6f b0 68 60 38 d5 ae e8 00 47 61 fe 48 83 c4 18 e9 ac fc ff ff <0f> 0b e9 06
fe ff ff 4c 8b 9d 98 fd ff ff 0f 84 64 e7 ff ff 89 85
[ 408.072319] RSP: 0018:ffff88811c3e7a30 EFLAGS: 00010246
[ 408.072322] RAX: ffffed1024874c6f RBX: ffffea00042c2b40 RCX: 0000000000000f38
[ 408.072324] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[ 408.072325] RBP: ffff88811c3e7ca8 R08: 0000000000000000 R09: 00000000000000c8
[ 408.072326] R10: 00000000000000c8 R11: 0000000000000000 R12: 00000000000000c8
[ 408.072327] R13: dffffc0000000000 R14: ffff8881243a6030 R15: 0000000000003000
[ 408.072329] FS: 0000000000000000(0000) GS:ffff88823eadf000(0000)
knlGS:0000000000000000
[ 408.072331] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 408.072332] CR2: 000000c0003c6000 CR3: 000000010c106005 CR4: 0000000000772ef0
[ 408.072336] PKRU: 55555554
[ 408.072337] Call Trace:
[ 408.072338] <TASK>
[ 408.072340] ? sched_clock_noinstr+0x9/0x10
[ 408.072344] ? __pfx_ceph_con_v2_try_read+0x10/0x10
[ 408.072347] ? _raw_spin_unlock+0xe/0x40
[ 408.072349] ? finish_task_switch.isra.0+0x15d/0x830
[ 408.072353] ? __kasan_check_write+0x14/0x30
[ 408.072357] ? mutex_lock+0x84/0xe0
[ 408.072359] ? __pfx_mutex_lock+0x10/0x10
[ 408.072361] ceph_con_workfn+0x27e/0x10e0
[ 408.072364] ? metric_delayed_work+0x311/0x2c50
[ 408.072367] process_one_work+0x611/0xe20
[ 408.072371] ? __kasan_check_write+0x14/0x30
[ 408.072373] worker_thread+0x7e3/0x1580
[ 408.072375] ? __pfx__raw_spin_lock_irqsave+0x10/0x10
[ 408.072378] ? __pfx_worker_thread+0x10/0x10
[ 408.072381] kthread+0x381/0x7a0
[ 408.072383] ? __pfx__raw_spin_lock_irq+0x10/0x10
[ 408.072385] ? __pfx_kthread+0x10/0x10
[ 408.072387] ? __kasan_check_write+0x14/0x30
[ 408.072389] ? recalc_sigpending+0x160/0x220
[ 408.072392] ? _raw_spin_unlock_irq+0xe/0x50
[ 408.072394] ? calculate_sigpending+0x78/0xb0
[ 408.072395] ? __pfx_kthread+0x10/0x10
[ 408.072397] ret_from_fork+0x2b6/0x380
[ 408.072400] ? __pfx_kthread+0x10/0x10
[ 408.072402] ret_from_fork_asm+0x1a/0x30
[ 408.072406] </TASK>
[ 408.072407] ---[ end trace 0000000000000000 ]---
[ 408.072418] Oops: general protection fault, probably for non-canonical
address 0xdffffc0000000000: 0000 [#1] SMP KASAN NOPTI
[ 408.072984] KASAN: null-ptr-deref in range [0x0000000000000000-
0x0000000000000007]
[ 408.073350] CPU: 1 UID: 0 PID: 392 Comm: kworker/1:3 Tainted: G W
6.17.0-rc7+ #1 PREEMPT(voluntary)
[ 408.073886] Tainted: [W]=WARN
[ 408.074042] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
1.17.0-5.fc42 04/01/2014
[ 408.074468] Workqueue: ceph-msgr ceph_con_workfn
[ 408.074694] RIP: 0010:ceph_msg_data_advance+0x79/0x1a80
[ 408.074976] Code: fc ff df 49 8d 77 08 48 c1 ee 03 80 3c 16 00 0f 85 07 11 00
00 48 ba 00 00 00 00 00 fc ff df 49 8b 5f 08 48 89 de 48 c1 ee 03 <0f> b6 14 16
84 d2 74 09 80 fa 03 0f 8e 0f 0e 00 00 8b 13 83 fa 03
[ 408.075884] RSP: 0018:ffff88811c3e7990 EFLAGS: 00010246
[ 408.076305] RAX: ffff8881243a6388 RBX: 0000000000000000 RCX: 0000000000000000
[ 408.076909] RDX: dffffc0000000000 RSI: 0000000000000000 RDI: ffff8881243a6378
[ 408.077466] RBP: ffff88811c3e7a20 R08: 0000000000000000 R09: 00000000000000c8
[ 408.078034] R10: ffff8881243a6388 R11: 0000000000000000 R12: ffffed1024874c71
[ 408.078575] R13: dffffc0000000000 R14: ffff8881243a6030 R15: ffff8881243a6378
[ 408.079159] FS: 0000000000000000(0000) GS:ffff88823eadf000(0000)
knlGS:0000000000000000
[ 408.079736] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 408.080039] CR2: 000000c0003c6000 CR3: 000000010c106005 CR4: 0000000000772ef0
[ 408.080376] PKRU: 55555554
[ 408.080513] Call Trace:
[ 408.080630] <TASK>
[ 408.080729] ceph_con_v2_try_read+0x49b9/0x72f0
[ 408.081115] ? __pfx_ceph_con_v2_try_read+0x10/0x10
[ 408.081348] ? _raw_spin_unlock+0xe/0x40
[ 408.081538] ? finish_task_switch.isra.0+0x15d/0x830
[ 408.081768] ? __kasan_check_write+0x14/0x30
[ 408.081986] ? mutex_lock+0x84/0xe0
[ 408.082160] ? __pfx_mutex_lock+0x10/0x10
[ 408.082343] ceph_con_workfn+0x27e/0x10e0
[ 408.082529] ? metric_delayed_work+0x311/0x2c50
[ 408.082737] process_one_work+0x611/0xe20
[ 408.082948] ? __kasan_check_write+0x14/0x30
[ 408.083156] worker_thread+0x7e3/0x1580
[ 408.083331] ? __pfx__raw_spin_lock_irqsave+0x10/0x10
[ 408.083557] ? __pfx_worker_thread+0x10/0x10
[ 408.083751] kthread+0x381/0x7a0
[ 408.083922] ? __pfx__raw_spin_lock_irq+0x10/0x10
[ 408.084139] ? __pfx_kthread+0x10/0x10
[ 408.084310] ? __kasan_check_write+0x14/0x30
[ 408.084510] ? recalc_sigpending+0x160/0x220
[ 408.084708] ? _raw_spin_unlock_irq+0xe/0x50
[ 408.084917] ? calculate_sigpending+0x78/0xb0
[ 408.085138] ? __pfx_kthread+0x10/0x10
[ 408.085335] ret_from_fork+0x2b6/0x380
[ 408.085525] ? __pfx_kthread+0x10/0x10
[ 408.085720] ret_from_fork_asm+0x1a/0x30
[ 408.085922] </TASK>
[ 408.086036] Modules linked in: intel_rapl_msr intel_rapl_common
intel_uncore_frequency_common intel_pmc_core pmt_telemetry pmt_discovery
pmt_class intel_pmc_ssram_telemetry intel_vsec kvm_intel joydev kvm irqbypass
polyval_clmulni ghash_clmulni_intel aesni_intel rapl input_leds psmouse
serio_raw i2c_piix4 vga16fb bochs vgastate i2c_smbus floppy mac_hid qemu_fw_cfg
pata_acpi sch_fq_codel rbd msr parport_pc ppdev lp parport efi_pstore
[ 408.087778] ---[ end trace 0000000000000000 ]---
[ 408.088007] RIP: 0010:ceph_msg_data_advance+0x79/0x1a80
[ 408.088260] Code: fc ff df 49 8d 77 08 48 c1 ee 03 80 3c 16 00 0f 85 07 11 00
00 48 ba 00 00 00 00 00 fc ff df 49 8b 5f 08 48 89 de 48 c1 ee 03 <0f> b6 14 16
84 d2 74 09 80 fa 03 0f 8e 0f 0e 00 00 8b 13 83 fa 03
[ 408.089118] RSP: 0018:ffff88811c3e7990 EFLAGS: 00010246
[ 408.089357] RAX: ffff8881243a6388 RBX: 0000000000000000 RCX: 0000000000000000
[ 408.089678] RDX: dffffc0000000000 RSI: 0000000000000000 RDI: ffff8881243a6378
[ 408.090020] RBP: ffff88811c3e7a20 R08: 0000000000000000 R09: 00000000000000c8
[ 408.090360] R10: ffff8881243a6388 R11: 0000000000000000 R12: ffffed1024874c71
[ 408.090687] R13: dffffc0000000000 R14: ffff8881243a6030 R15: ffff8881243a6378
[ 408.091035] FS: 0000000000000000(0000) GS:ffff88823eadf000(0000)
knlGS:0000000000000000
[ 408.091452] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 408.092015] CR2: 000000c0003c6000 CR3: 000000010c106005 CR4: 0000000000772ef0
[ 408.092530] PKRU: 55555554
[ 417.112915]
==================================================================
[ 417.113491] BUG: KASAN: slab-use-after-free in
__mutex_lock.constprop.0+0x1522/0x1610
[ 417.114014] Read of size 4 at addr ffff888124870034 by task kworker/2:0/4951
[ 417.114587] CPU: 2 UID: 0 PID: 4951 Comm: kworker/2:0 Tainted: G D W
6.17.0-rc7+ #1 PREEMPT(voluntary)
[ 417.114592] Tainted: [D]=DIE, [W]=WARN
[ 417.114593] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
1.17.0-5.fc42 04/01/2014
[ 417.114596] Workqueue: events handle_timeout
[ 417.114601] Call Trace:
[ 417.114602] <TASK>
[ 417.114604] dump_stack_lvl+0x5c/0x90
[ 417.114610] print_report+0x171/0x4dc
[ 417.114613] ? __pfx__raw_spin_lock_irqsave+0x10/0x10
[ 417.114617] ? kasan_complete_mode_report_info+0x80/0x220
[ 417.114621] kasan_report+0xbd/0x100
[ 417.114625] ? __mutex_lock.constprop.0+0x1522/0x1610
[ 417.114628] ? __mutex_lock.constprop.0+0x1522/0x1610
[ 417.114630] __asan_report_load4_noabort+0x14/0x30
[ 417.114633] __mutex_lock.constprop.0+0x1522/0x1610
[ 417.114635] ? queue_con_delay+0x8d/0x200
[ 417.114638] ? __pfx___mutex_lock.constprop.0+0x10/0x10
[ 417.114641] ? __send_subscribe+0x529/0xb20
[ 417.114644] __mutex_lock_slowpath+0x13/0x20
[ 417.114646] mutex_lock+0xd4/0xe0
[ 417.114649] ? __pfx_mutex_lock+0x10/0x10
[ 417.114652] ? ceph_monc_renew_subs+0x2a/0x40
[ 417.114654] ceph_con_keepalive+0x22/0x110
[ 417.114656] handle_timeout+0x6b3/0x11d0
[ 417.114659] ? _raw_spin_unlock_irq+0xe/0x50
[ 417.114662] ? __pfx_handle_timeout+0x10/0x10
[ 417.114664] ? queue_delayed_work_on+0x8e/0xa0
[ 417.114669] process_one_work+0x611/0xe20
[ 417.114672] ? __kasan_check_write+0x14/0x30
[ 417.114676] worker_thread+0x7e3/0x1580
[ 417.114678] ? __pfx__raw_spin_lock_irqsave+0x10/0x10
[ 417.114682] ? __pfx_sched_setscheduler_nocheck+0x10/0x10
[ 417.114687] ? __pfx_worker_thread+0x10/0x10
[ 417.114689] kthread+0x381/0x7a0
[ 417.114692] ? __pfx__raw_spin_lock_irq+0x10/0x10
[ 417.114694] ? __pfx_kthread+0x10/0x10
[ 417.114697] ? __kasan_check_write+0x14/0x30
[ 417.114699] ? recalc_sigpending+0x160/0x220
[ 417.114703] ? _raw_spin_unlock_irq+0xe/0x50
[ 417.114705] ? calculate_sigpending+0x78/0xb0
[ 417.114707] ? __pfx_kthread+0x10/0x10
[ 417.114710] ret_from_fork+0x2b6/0x380
[ 417.114713] ? __pfx_kthread+0x10/0x10
[ 417.114715] ret_from_fork_asm+0x1a/0x30
[ 417.114720] </TASK>
[ 417.125171] Allocated by task 2:
[ 417.125333] kasan_save_stack+0x26/0x60
[ 417.125522] kasan_save_track+0x14/0x40
[ 417.125742] kasan_save_alloc_info+0x39/0x60
[ 417.125945] __kasan_slab_alloc+0x8b/0xb0
[ 417.126133] kmem_cache_alloc_node_noprof+0x13b/0x460
[ 417.126381] copy_process+0x320/0x6250
[ 417.126595] kernel_clone+0xb7/0x840
[ 417.126792] kernel_thread+0xd6/0x120
[ 417.126995] kthreadd+0x85c/0xbe0
[ 417.127176] ret_from_fork+0x2b6/0x380
[ 417.127378] ret_from_fork_asm+0x1a/0x30
[ 417.127692] Freed by task 0:
[ 417.127851] kasan_save_stack+0x26/0x60
[ 417.128057] kasan_save_track+0x14/0x40
[ 417.128267] kasan_save_free_info+0x3b/0x60
[ 417.128491] __kasan_slab_free+0x6c/0xa0
[ 417.128708] kmem_cache_free+0x182/0x550
[ 417.128906] free_task+0xeb/0x140
[ 417.129070] __put_task_struct+0x1d2/0x4f0
[ 417.129259] __put_task_struct_rcu_cb+0x15/0x20
[ 417.129480] rcu_do_batch+0x3d3/0xe70
[ 417.129681] rcu_core+0x549/0xb30
[ 417.129839] rcu_core_si+0xe/0x20
[ 417.130005] handle_softirqs+0x160/0x570
[ 417.130190] __irq_exit_rcu+0x189/0x1e0
[ 417.130369] irq_exit_rcu+0xe/0x20
[ 417.130531] sysvec_apic_timer_interrupt+0x9f/0xd0
[ 417.130768] asm_sysvec_apic_timer_interrupt+0x1b/0x20
[ 417.131082] Last potentially related work creation:
[ 417.131305] kasan_save_stack+0x26/0x60
[ 417.131484] kasan_record_aux_stack+0xae/0xd0
[ 417.131695] __call_rcu_common+0xcd/0x14b0
[ 417.131909] call_rcu+0x31/0x50
[ 417.132071] delayed_put_task_struct+0x128/0x190
[ 417.132295] rcu_do_batch+0x3d3/0xe70
[ 417.132478] rcu_core+0x549/0xb30
[ 417.132658] rcu_core_si+0xe/0x20
[ 417.132808] handle_softirqs+0x160/0x570
[ 417.132993] __irq_exit_rcu+0x189/0x1e0
[ 417.133181] irq_exit_rcu+0xe/0x20
[ 417.133353] sysvec_apic_timer_interrupt+0x9f/0xd0
[ 417.133584] asm_sysvec_apic_timer_interrupt+0x1b/0x20
[ 417.133921] Second to last potentially related work creation:
[ 417.134183] kasan_save_stack+0x26/0x60
[ 417.134362] kasan_record_aux_stack+0xae/0xd0
[ 417.134566] __call_rcu_common+0xcd/0x14b0
[ 417.134782] call_rcu+0x31/0x50
[ 417.134929] put_task_struct_rcu_user+0x58/0xb0
[ 417.135143] finish_task_switch.isra.0+0x5d3/0x830
[ 417.135366] __schedule+0xd30/0x5100
[ 417.135534] schedule_idle+0x5a/0x90
[ 417.135712] do_idle+0x25f/0x410
[ 417.135871] cpu_startup_entry+0x53/0x70
[ 417.136053] start_secondary+0x216/0x2c0
[ 417.136233] common_startup_64+0x13e/0x141
[ 417.136894] The buggy address belongs to the object at ffff888124870000
which belongs to the cache task_struct of size 10504
[ 417.138122] The buggy address is located 52 bytes inside of
freed 10504-byte region [ffff888124870000, ffff888124872908)
[ 417.139465] The buggy address belongs to the physical page:
[ 417.140016] page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0
pfn:0x124870
[ 417.140789] head: order:3 mapcount:0 entire_mapcount:0 nr_pages_mapped:0
pincount:0
[ 417.141519] memcg:ffff88811aa20e01
[ 417.141874] anon flags:
0x17ffffc0000040(head|node=0|zone=2|lastcpupid=0x1fffff)
[ 417.142600] page_type: f5(slab)
[ 417.142922] raw: 0017ffffc0000040 ffff88810094f040 0000000000000000
dead000000000001
[ 417.143554] raw: 0000000000000000 0000000000030003 00000000f5000000
ffff88811aa20e01
[ 417.143954] head: 0017ffffc0000040 ffff88810094f040 0000000000000000
dead000000000001
[ 417.144329] head: 0000000000000000 0000000000030003 00000000f5000000
ffff88811aa20e01
[ 417.144710] head: 0017ffffc0000003 ffffea0004921c01 00000000ffffffff
00000000ffffffff
[ 417.145106] head: ffffffffffffffff 0000000000000000 00000000ffffffff
0000000000000008
[ 417.145485] page dumped because: kasan: bad access detected
[ 417.145859] Memory state around the buggy address:
[ 417.146094] ffff88812486ff00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
fc
[ 417.146439] ffff88812486ff80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
fc
[ 417.146791] >ffff888124870000: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb
fb
[ 417.147145] ^
[ 417.147387] ffff888124870080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
fb
[ 417.147751] ffff888124870100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
fb
[ 417.148123]
==================================================================
First of all, we have warning in get_bvec_at() because
cursor->total_resid contains zero value. And, finally,
we have crash in ceph_msg_data_advance() because
cursor->data is NULL. It means that get_bvec_at()
receives not initialized ceph_msg_data_cursor structure
because data is NULL and total_resid contains zero.
Moreover, we don't have likewise issue for the case of
Ceph msgr1 protocol because ceph_msg_data_cursor_init()
has been called before reading sparse data.
This patch adds calling of ceph_msg_data_cursor_init()
in the beginning of process_v2_sparse_read() with
the goal to guarantee that logic of reading sparse data
works correctly for the case of Ceph msgr2 protocol.
Cc: stable@vger.kernel.org
Link: https://tracker.ceph.com/issues/73152
Signed-off-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
Reviewed-by: Ilya Dryomov <idryomov@gmail.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
|
|
Linus figured less #ifdef is more better and making x86-32 use
GENERIC_BUG_RELATIVE_POINTERS removes one layer of macro magic from
the bug.h bits.
Suggested-by: Linus Torvalds <torvalds@linuxfoundation.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
|
|
Encoding a relative NULL pointer doesn't work for KASLR, when the
whole kernel image gets shifted, the __bug_table and the target string
get shifted by the same amount and the relative offset is preserved.
However when the target is an absolute 0 value and the __bug_table
gets moved about, the end result in a pointer equivalent to
kaslr_offset(), not NULL.
Notably, this will generate SHN_UNDEF relocations, and Ard would
really like to not have those at all.
Use the empty string to denote no-string.
Suggested-by: Linus Torvalds <torvalds@linuxfoundation.org>
Cc: Ard Biesheuvel <ardb@kernel.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
|
|
Building objtool with disassembly support can fail when including
the bdf.h file:
In file included from tools/objtool/include/objtool/arch.h:108,
from check.c:14:
/usr/include/bfd.h:35:2: error: #error config.h must be included before this header
35 | #error config.h must be included before this header
| ^~~~~
This check is present in the bfd.h file generated from the binutils
source code, but it is not necessarily present in the bfd.h file
provided in a binutil package (for example, it is not present in
the binutil RPM).
The solution to this issue is to define the PACKAGE macro before
including bfd.h. This is the solution suggested by the binutil
developer in bug 14243, and it is used by other kernel tools
which also use bfd.h (perf and bpf).
Fixes: 59953303827ec ("objtool: Disassemble code with libopcodes instead of running objdump")
Closes: https://lore.kernel.org/all/3fa261fd-3b46-4cbe-b48d-7503aabc96cb@oracle.com/
Reported-by: Nathan Chancellor <nathan@kernel.org>
Suggested-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Alexandre Chartre <alexandre.chartre@oracle.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://sourceware.org/bugzilla/show_bug.cgi?id=14243
Link: https://patch.msgid.link/20251126134519.1760889-1-alexandre.chartre@oracle.com
|
|
Pull smb client fix from Steve French:
"smb client multiuser (with cifscreds) mount fix"
* tag 'v6.18rc7-SMB-client-fix' of git://git.samba.org/sfrench/cifs-2.6:
smb: client: fix memory leak in cifs_construct_tcon()
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can
Marc Kleine-Budde says:
====================
pull-request: can 2025-11-26
this is a pull request of 8 patches for net/main.
Seungjin Bae provides a patch for the kvaser_usb driver to fix a
potential infinite loop in the USB data stream command parser.
Thomas Mühlbacher's patch for the sja1000 driver IRQ handler's max
loop handling, that might lead to unhandled interrupts.
3 patches by me for the gs_usb driver fix handling of failed transmit
URBs and add checking of the actual length of received URBs before
accessing the data.
The next patch is by me and is a port of Thomas Mühlbacher's patch
(fix IRQ handler's max loop handling, that might lead to unhandled
interrupts.) to the sun4i_can driver.
Biju Das provides a patch for the rcar_canfd driver to fix the CAN-FD
mode setting.
The last patch is by Shaurya Rane for the em_canid filter to ensure
that the complete CAN frame is present in the linear data buffer
before accessing it.
* tag 'linux-can-fixes-for-6.18-20251126' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can:
net/sched: em_canid: fix uninit-value in em_canid_match
can: rcar_canfd: Fix CAN-FD mode as default
can: sun4i_can: sun4i_can_interrupt(): fix max irq loop handling
can: gs_usb: gs_usb_receive_bulk_callback(): check actual_length before accessing data
can: gs_usb: gs_usb_receive_bulk_callback(): check actual_length before accessing header
can: gs_usb: gs_usb_xmit_callback(): fix handling of failed transmitted URBs
can: sja1000: fix max irq loop handling
can: kvaser_usb: leaf: Fix potential infinite loop in command parsers
====================
Link: https://patch.msgid.link/20251126155713.217105-1-mkl@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The atlantic driver can receive packets with more than MAX_SKB_FRAGS (17)
fragments when handling large multi-descriptor packets. This causes an
out-of-bounds write in skb_add_rx_frag_netmem() leading to kernel panic.
The issue occurs because the driver doesn't check the total number of
fragments before calling skb_add_rx_frag(). When a packet requires more
than MAX_SKB_FRAGS fragments, the fragment index exceeds the array bounds.
Fix by assuming there will be an extra frag if buff->len > AQ_CFG_RX_HDR_SIZE,
then all fragments are accounted for. And reusing the existing check to
prevent the overflow earlier in the code path.
This crash occurred in production with an Aquantia AQC113 10G NIC.
Stack trace from production environment:
```
RIP: 0010:skb_add_rx_frag_netmem+0x29/0xd0
Code: 90 f3 0f 1e fa 0f 1f 44 00 00 48 89 f8 41 89
ca 48 89 d7 48 63 ce 8b 90 c0 00 00 00 48 c1 e1 04 48 01 ca 48 03 90
c8 00 00 00 <48> 89 7a 30 44 89 52 3c 44 89 42 38 40 f6 c7 01 75 74 48
89 fa 83
RSP: 0018:ffffa9bec02a8d50 EFLAGS: 00010287
RAX: ffff925b22e80a00 RBX: ffff925ad38d2700 RCX:
fffffffe0a0c8000
RDX: ffff9258ea95bac0 RSI: ffff925ae0a0c800 RDI:
0000000000037a40
RBP: 0000000000000024 R08: 0000000000000000 R09:
0000000000000021
R10: 0000000000000848 R11: 0000000000000000 R12:
ffffa9bec02a8e24
R13: ffff925ad8615570 R14: 0000000000000000 R15:
ffff925b22e80a00
FS: 0000000000000000(0000)
GS:ffff925e47880000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffff9258ea95baf0 CR3: 0000000166022004 CR4:
0000000000f72ef0
PKRU: 55555554
Call Trace:
<IRQ>
aq_ring_rx_clean+0x175/0xe60 [atlantic]
? aq_ring_rx_clean+0x14d/0xe60 [atlantic]
? aq_ring_tx_clean+0xdf/0x190 [atlantic]
? kmem_cache_free+0x348/0x450
? aq_vec_poll+0x81/0x1d0 [atlantic]
? __napi_poll+0x28/0x1c0
? net_rx_action+0x337/0x420
```
Fixes: 6aecbba12b5c ("net: atlantic: add check for MAX_SKB_FRAGS")
Changes in v4:
- Add Fixes: tag to satisfy patch validation requirements.
Changes in v3:
- Fix by assuming there will be an extra frag if buff->len > AQ_CFG_RX_HDR_SIZE,
then all fragments are accounted for.
Signed-off-by: Jiefeng Zhang <jiefeng.z.zhang@gmail.com>
Link: https://patch.msgid.link/20251126032249.69358-1-jiefeng.z.zhang@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Changes to virtio network stack should be cc'd to netdev DL, separate
it into its own group to add netdev in addition to virtualization DL.
Signed-off-by: Jon Kohler <jon@nutanix.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://patch.msgid.link/20251126015750.2200267-1-jon@nutanix.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Commit a2fb4bc4e2a6 ("net: implement virtio helpers to handle UDP
GSO tunneling.") inadvertently altered checksum offload behavior
for guests not using UDP GSO tunneling.
Before, tun_put_user called tun_vnet_hdr_from_skb, which passed
has_data_valid = true to virtio_net_hdr_from_skb.
After, tun_put_user began calling tun_vnet_hdr_tnl_from_skb instead,
which passes has_data_valid = false into both call sites.
This caused virtio hdr flags to not include VIRTIO_NET_HDR_F_DATA_VALID
for SKBs where skb->ip_summed == CHECKSUM_UNNECESSARY. As a result,
guests are forced to recalculate checksums unnecessarily.
Restore the previous behavior by ensuring has_data_valid = true is
passed in the !tnl_gso_type case, but only from tun side, as
virtio_net_hdr_tnl_from_skb() is used also by the virtio_net driver,
which in turn must not use VIRTIO_NET_HDR_F_DATA_VALID on tx.
cc: stable@vger.kernel.org
Fixes: a2fb4bc4e2a6 ("net: implement virtio helpers to handle UDP GSO tunneling.")
Signed-off-by: Jon Kohler <jon@nutanix.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Link: https://patch.msgid.link/20251125222754.1737443-1-jon@nutanix.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Use the MAC managed PM flag to indicate that MAC driver takes care of
suspending/resuming the PHY, and reset it when the device is brought up.
Signed-off-by: Raju Rangoju <Raju.Rangoju@amd.com>
Link: https://patch.msgid.link/20251123163721.442162-1-Raju.Rangoju@amd.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Fix a potential counter roll-over issue in fbnic_mbx_alloc_rx_msgs()
when calculating descriptor slots. The issue occurs when head - tail
results in a large positive value (unsigned) and the compiler interprets
head - tail - 1 as a signed value.
Since FBNIC_IPC_MBX_DESC_LEN is a power of two, use a masking operation,
which is a common way of avoiding this problem when dealing with these
sort of ring space calculations.
Fixes: da3cde08209e ("eth: fbnic: Add FW communication mechanism")
Signed-off-by: Mohsin Bashir <mohsin.bashr@gmail.com>
Link: https://patch.msgid.link/20251125211704.3222413-1-mohsin.bashr@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When __mptcp_retrans() kicks-in, it schedules one or more subflows for
retransmission, but such subflows could be actually left alone if there
is no more data to retransmit and/or in case of concurrent fallback.
Scheduled subflows could be processed much later in time, i.e. when new
data will be transmitted, leading to bad subflow selection.
Explicitly clear all scheduled subflows before leaving the
retransmission function.
Fixes: ee2708aedad0 ("mptcp: use get_retrans wrapper")
Cc: stable@vger.kernel.org
Reported-by: Filip Pokryvka <fpokryvk@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://patch.msgid.link/20251125-net-mptcp-clear-sched-rtx-v1-1-1cea4ad2165f@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
'net-hibmcge-add-support-for-tracepoint-and-pagepool-on-hibmcge-driver'
Jijie Shao says:
====================
net: hibmcge: Add support for tracepoint and pagepool on hibmcge driver
In this patch set:
1: add support for tracepoint for rx descriptor
2: double the rx queue depth to reduce packet drop
3: add support for pagepool on rx
====================
Link: https://patch.msgid.link/20251122034657.3373143-1-shaojijie@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
add support for pagepool on rx, and remove the legacy path
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Link: https://patch.msgid.link/20251122034657.3373143-4-shaojijie@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Under stress test scenarios, hibmcge driver may not receive packets
in a timely manner, which can lead to the buffer of the hardware queue
being exhausted, resulting in packet drop.
This patch doubles the software queue depth and uses half of the buffer
to fill the hardware queue before receiving packets, thus preventing
packet loss caused by the hardware queue buffer being exhausted.
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Link: https://patch.msgid.link/20251122034657.3373143-3-shaojijie@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
add support for tracepoint to dump some fields of rx_desc
Signed-off-by: Tao Lan <lantao5@huawei.com>
Signed-off-by: Jijie Shao <shaojijie@huawei.com>
Link: https://patch.msgid.link/20251122034657.3373143-2-shaojijie@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When using the SGMII PCS as a fixed-link chip-to-chip connection, it is
easy to miss the fact that traffic passes only at 1G, since that's what
any normal such connection would use.
When using the SGMII PCS connected towards an on-board PHY or an SFP
module, it is immediately noticeable that when the link resolves to a
speed other than 1G, traffic from the MAC fails to pass: TX counters
increase, but nothing gets decoded by the other end, and no local RX
counters increase either.
Artificially lowering a fixed-link rate to speed = <100> makes us able
to see the same issue as in the case of having an SGMII PHY.
Some debugging shows that the XPCS configuration is A-OK, but that the
MAC Configuration Table entry for the port has the SPEED bits still set
to 1000Mbps, due to a special condition in the driver. Deleting that
condition, and letting the resolved link speed be programmed directly
into the MAC speed field, results in a functional link at all 3 speeds.
This piece of evidence, based on testing on both generations with SGMII
support (SJA1105S and SJA1110A) directly contradicts the statement from
the blamed commit that "the MAC is fixed at 1 Gbps and we need to
configure the PCS only (if even that)". Worse, that statement is not
backed by any documentation, and no one from NXP knows what it might
refer to.
I am unable to recall sufficient context regarding my testing from March
2020 to understand what led me to draw such a braindead and factually
incorrect conclusion. Yet, there is nothing of value regarding forcing
the MAC speed, either for SGMII or 2500Base-X (introduced at a later
stage), so remove all such logic.
Fixes: ffe10e679cec ("net: dsa: sja1105: Add support for the SGMII port")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://patch.msgid.link/20251122111324.136761-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The FMan driver has support for 2 MACs: mEMAC (newer, present on
Layerscape and PowerPC T series) and dTSEC/TGEC (older, present on
PowerPC P series). I only have handy access to the mEMAC, and this adds
support for MAC counters for those platforms.
MAC counters are necessary for any kind of low-level debugging, and
currently there is no mechanism to dump them.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://patch.msgid.link/20251122115931.151719-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The DPAA phylink conversion in the following commits partially developed
code for handling the 2500base-x host interface mode (called "2.5G
SGMII" in LS1043A/LS1046A reference manuals).
- 0fc83bd79589 ("net: fman: memac: Add serdes support")
- 5d93cfcf7360 ("net: dpaa: Convert to phylink")
In principle, having phy-interface-mode = "2500base-x" and a pcsphy-handle
(unnamed or with pcs-handle-names = "sgmii") in the MAC device tree node
results in PHY_INTERFACE_MODE_2500BASEX being set in phylink_config ::
supported_interfaces, but this isn't sufficient.
Because memac_select_pcs() returns no PCS for PHY_INTERFACE_MODE_2500BASEX,
the Lynx PCS code never engages. There's a chance the PCS driver doesn't
have any configuration to change for 2500base-x fixed-link (based on
bootloader pre-initialization), but there's an even higher chance that
this is not the case, and the PCS remains misconfigured.
More importantly, memac_if_mode() does not handle
PHY_INTERFACE_MODE_2500BASEX, and it should be telling the mEMAC to
configure itself in GMII mode (which is upclocked by the PCS). Currently
it prints a WARN_ON() and returns zero, aka IF_MODE_10G (incorrect).
The additional case statement in memac_prepare() for calling
phy_set_mode_ext() does not make any difference, because there is no
generic PHY driver for the Lynx 10G SerDes from LS1043A/LS1046A. But we
add it nonetheless, for consistency.
Regarding the question "did 2500base-x ever work with the FMan mEMAC
mainline code prior to the phylink conversion?" - the answer is more
nuanced.
For context, the previous phylib-based implementation was unable to
describe the fixed-link speed as 2500, because the software PHY
implementation is limited to 1G. However, improperly describing the link
as an sgmii fixed-link with speed = <1000> would have resulted in a
functional 2.5G speed, because there is no other difference than the
SerDes lane clock net frequency (3.125 GHz for 2500base-x) - all the
other higher-level settings are the same, and the SerDes lane frequency
is currently handled by the RCW.
But this hack cannot be extended towards a phylib PHY such as Aquantia
operating in OCSGMII, because the latter requires phy-mode = "2500base-x",
which the mEMAC driver did not support prior to the phylink conversion.
So I do not really consider this a regression, just completing support
for a missing feature.
The FMan mEMAC driver sets phylink's "default_an_inband" property to
true, making it as if the device tree node had the managed =
"in-band-status" property anyway. This default made sense for SGMII,
where it was added to avoid regressions, but for 2500base-x we learned
only recently how to enable in-band autoneg:
https://lore.kernel.org/netdev/20251122113433.141930-1-vladimir.oltean@nxp.com/
so the driver needs to opt out of this default in-band enabled
behaviour, and only enable in-band based on the device tree property.
Suggested-by: Russell King (Oracle) <linux@armlinux.org.uk>
Link: https://lore.kernel.org/netdev/aIyx0OLWGw5zKarX@shell.armlinux.org.uk/#t
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://patch.msgid.link/20251122115523.150260-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Implement the inband_caps() and config_inband() PHY driver methods, to
allow working with PCS devices that do not support or want in-band to be
used.
There is a complication due to existing logic from commit c76acfb7e19d
("net: phy: dp83867: retrigger SGMII AN when link change") which might
re-enable what dp83867_config_inband() has disabled. So we need to
modify dp83867_link_change_notify() to use phy_modify_changed() when
temporarily disabling in-band autoneg. If the return code is 0, it means
the original in-band was disabled and we need to keep it disabled.
If the return code is 1, the original was enabled and we need to
re-enable it. If negative, there was an error, which was silent before,
and remains silent now.
dp83867_config_inband() and dp83867_link_change_notify() are serialized
by the phydev->lock.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://patch.msgid.link/20251122110427.133035-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Add a test framework for YAML Netlink (YNL) tools, covering both CLI and
ethtool functionality. The framework includes:
1) cli: family listing, netdev, ethtool, rt-* families, and nlctrl
operations
2) ethtool: device info, statistics, ring/coalesce/pause parameters, and
feature gettings
The current YNL syntax is a bit obscure, and end users may not always know
how to use it. This test framework provides usage examples and also serves
as a regression test to catch potential breakages caused by future changes.
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Acked-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://patch.msgid.link/20251124022055.33389-1-liuhangbin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The fix commit converted several IPv4 address attributes from binary
to u32, but forgot to specify byte-order: big-endian. Without this,
YNL tools display IPv4 addresses incorrectly due to host-endian
interpretation.
Add the missing byte-order: big-endian to all affected u32 IPv4
address fields to ensure correct parsing and display.
Fixes: 1064d521d177 ("netlink: specs: support ipv4-or-v6 for dual-stack fields")
Reported-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Reviewed-by: Asbjørn Sloth Tønnesen <ast@fiberby.net>
Link: https://patch.msgid.link/20251125112048.37631-1-liuhangbin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Thank you Aswin for taking this responsibility.
Signed-off-by: Alexandra Winter <wintera@linux.ibm.com>
Acked-by: Aswin Karuvally <aswin@linux.ibm.com>
Link: https://patch.msgid.link/20251125085829.3679506-1-wintera@linux.ibm.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Breno Leitao says:
====================
net: intel: migrate to .get_rx_ring_count() ethtool callback
This series migrates Intel network drivers to use the new .get_rx_ring_count()
ethtool callback introduced in commit 84eaf4359c36 ("net: ethtool: add
get_rx_ring_count callback to optimize RX ring queries").
The new callback simplifies the .get_rxnfc() implementation by removing
ETHTOOL_GRXRINGS handling and moving it to a dedicated callback. This provides
a cleaner separation of concerns and aligns these drivers with the modern
ethtool API.
The series updates the following Intel drivers:
- idpf
- igb
- igc
- ixgbevf
- fm10k
====================
Link: https://patch.msgid.link/20251125-gxring_intel-v2-0-f55cd022d28b@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Commit 84eaf4359c36 ("net: ethtool: add get_rx_ring_count callback to
optimize RX ring queries") added specific support for GRXRINGS callback,
simplifying .get_rxnfc.
Remove the handling of GRXRINGS in .get_rxnfc() by moving it to the new
.get_rx_ring_count().
This simplifies the RX ring count retrieval and aligns fm10k with the new
ethtool API for querying RX ring parameters.
Signed-off-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Link: https://patch.msgid.link/20251125-gxring_intel-v2-8-f55cd022d28b@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Commit 84eaf4359c36 ("net: ethtool: add get_rx_ring_count callback to
optimize RX ring queries") added specific support for GRXRINGS callback,
simplifying .get_rxnfc.
Remove the handling of GRXRINGS in .get_rxnfc() by moving it to the new
.get_rx_ring_count().
This simplifies the RX ring count retrieval and aligns ixgbevf with the new
ethtool API for querying RX ring parameters.
Signed-off-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Link: https://patch.msgid.link/20251125-gxring_intel-v2-7-f55cd022d28b@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Commit 84eaf4359c36 ("net: ethtool: add get_rx_ring_count callback to
optimize RX ring queries") added specific support for GRXRINGS callback,
simplifying .get_rxnfc.
Remove the handling of GRXRINGS in .get_rxnfc() by moving it to the new
.get_rx_ring_count().
This simplifies the RX ring count retrieval and aligns igc with the new
ethtool API for querying RX ring parameters.
Signed-off-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Link: https://patch.msgid.link/20251125-gxring_intel-v2-6-f55cd022d28b@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Commit 84eaf4359c36 ("net: ethtool: add get_rx_ring_count callback to
optimize RX ring queries") added specific support for GRXRINGS callback,
simplifying .get_rxnfc.
Remove the handling of GRXRINGS in .get_rxnfc() by moving it to the new
.get_rx_ring_count().
This simplifies the RX ring count retrieval and aligns igb with the new
ethtool API for querying RX ring parameters.
Signed-off-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Link: https://patch.msgid.link/20251125-gxring_intel-v2-5-f55cd022d28b@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Commit 84eaf4359c36 ("net: ethtool: add get_rx_ring_count callback to
optimize RX ring queries") added specific support for GRXRINGS callback,
simplifying .get_rxnfc.
Remove the handling of GRXRINGS in .get_rxnfc() by moving it to the new
.get_rx_ring_count().
This simplifies the RX ring count retrieval and aligns idpf with the new
ethtool API for querying RX ring parameters.
I was not totally convinced I needed to have the lock, but, I decided to
be on the safe side and get the exact same behaviour it was before.
Signed-off-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Link: https://patch.msgid.link/20251125-gxring_intel-v2-4-f55cd022d28b@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Commit 84eaf4359c36 ("net: ethtool: add get_rx_ring_count callback to
optimize RX ring queries") added specific support for GRXRINGS callback,
simplifying .get_rxnfc.
Remove the handling of GRXRINGS in .get_rxnfc() by moving it to the new
.get_rx_ring_count().
This simplifies the RX ring count retrieval and aligns ice with the new
ethtool API for querying RX ring parameters.
Signed-off-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Link: https://patch.msgid.link/20251125-gxring_intel-v2-3-f55cd022d28b@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Commit 84eaf4359c36 ("net: ethtool: add get_rx_ring_count callback to
optimize RX ring queries") added specific support for GRXRINGS callback,
simplifying .get_rxnfc.
Remove the handling of GRXRINGS in .get_rxnfc() by moving it to the new
.get_rx_ring_count().
This simplifies the RX ring count retrieval and aligns iavf with the new
ethtool API for querying RX ring parameters.
Signed-off-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Link: https://patch.msgid.link/20251125-gxring_intel-v2-2-f55cd022d28b@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Commit 84eaf4359c36 ("net: ethtool: add get_rx_ring_count callback to
optimize RX ring queries") added specific support for GRXRINGS callback,
simplifying .get_rxnfc.
Remove the handling of GRXRINGS in .get_rxnfc() by moving it to the new
.get_rx_ring_count().
This simplifies the RX ring count retrieval and aligns i40e with the new
ethtool API for querying RX ring parameters.
Signed-off-by: Breno Leitao <leitao@debian.org>
Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Link: https://patch.msgid.link/20251125-gxring_intel-v2-1-f55cd022d28b@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Yao Zi says:
====================
Unify platform suspend/resume routines for PCI DWMAC glue
There are currently three PCI-based DWMAC glue drivers in tree,
stmmac_pci.c, dwmac-intel.c, and dwmac-loongson.c. Both stmmac_pci.c and
dwmac-intel.c implements the same and duplicated platform suspend/resume
routines.
This series introduces a new PCI helper library, stmmac_libpci.c,
providing a pair of helpers, stmmac_pci_plat_{suspend,resume}, and
replaces the driver-specific implementation with the helpers to reduce
code duplication. The helper will also simplify the Motorcomm DWMAC glue
driver which I'm working on.
The glue driver for Intel controllers isn't covered by the series, since
its suspend routine doesn't call pci_disable_device() and thus is a
little different from the new generic helpers.
I only have Loongson hardware on hand, thus the series is only tested on
Loongson 3A5000 machine. I could confirm the controller works after
resume, and WoL works as expected. This shouldn't break stmmac_pci.c,
either, since the new helpers have the exactly same code as the old
driver-specific suspend/resume hooks.
====================
Link: https://patch.msgid.link/20251124160417.51514-1-ziyao@disroot.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Convert STMMAC PCI glue driver to use the generic platform
suspend/resume routines for PCI controllers, instead of implementing its
own one.
Signed-off-by: Yao Zi <ziyao@disroot.org>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Yanteng Si <siyanteng@cqsoftware.com.cn>
Link: https://patch.msgid.link/20251124160417.51514-4-ziyao@disroot.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Convert glue driver for Loongson DWMAC controller to use the generic
platform suspend/resume routines for PCI controllers, instead of
implementing its own one.
Signed-off-by: Yao Zi <ziyao@disroot.org>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Acked-by: Yanteng Si <siyanteng@cqsoftware.com.cn>
Link: https://patch.msgid.link/20251124160417.51514-3-ziyao@disroot.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Most glue driver for PCI-based DWMAC controllers utilize similar
platform suspend/resume routines. Add a generic implementation to reduce
duplicated code.
Signed-off-by: Yao Zi <ziyao@disroot.org>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Yanteng Si <siyanteng@cqsoftware.com.cn>
Link: https://patch.msgid.link/20251124160417.51514-2-ziyao@disroot.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Correct it since M.2 device T99W640 has updated from T99W515.
We need to align it with MHI side otherwise this modem can't
get the network.
Fixes: ae5a34264354 ("bus: mhi: host: pci_generic: Fix the modem name of Foxconn T99W640")
Signed-off-by: Slark Xiao <slark_xiao@163.com>
Reviewed-by: Loic Poulain <loic.poulain@oss.qualcomm.com>
Link: https://patch.msgid.link/20251125070900.33324-1-slark_xiao@163.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Vadim Fedorenko says:
====================
add hwtstamp_get callback to phy drivers
PHY drivers are able to configure HW time stamping and are not able to
report configuration back to user space. Add callback to report
configuration like it's done for net_device and add implementation to
the drivers.
====================
Link: https://patch.msgid.link/20251124181151.277256-1-vadim.fedorenko@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The driver partially stores HW timestamping configuration, but missing
pieces can be read from HW. Add callback to report configuration.
Reviewed-by: Kory Maincent <kory.maincent@bootlin.com>
Signed-off-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Link: https://patch.msgid.link/20251124181151.277256-8-vadim.fedorenko@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The driver stores HW timestamping configuration and can technically
report it. Add callback to do it.
Reviewed-by: Kory Maincent <kory.maincent@bootlin.com>
Signed-off-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Link: https://patch.msgid.link/20251124181151.277256-7-vadim.fedorenko@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The driver stores HW configuration and can technically report it.
Add callback to do it.
Reviewed-by: Kory Maincent <kory.maincent@bootlin.com>
Signed-off-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Link: https://patch.msgid.link/20251124181151.277256-6-vadim.fedorenko@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The driver stores configuration of TX timestamping and can technically
report it. Patch RX timestamp configuration storage to be more precise
on reporting and add callback to actually report it.
Reviewed-by: Kory Maincent <kory.maincent@bootlin.com>
Signed-off-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Link: https://patch.msgid.link/20251124181151.277256-5-vadim.fedorenko@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The driver stores configuration information and can technically report
it. Implement hwtstamp_get callback to report the configuration.
Reviewed-by: Kory Maincent <kory.maincent@bootlin.com>
Signed-off-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Link: https://patch.msgid.link/20251124181151.277256-4-vadim.fedorenko@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
PHY devices had lack of hwtstamp_get callback even though most of them
are tracking configuration info. Introduce new call back to
mii_timestamper.
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Kory Maincent <kory.maincent@bootlin.com>
Signed-off-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Link: https://patch.msgid.link/20251124181151.277256-3-vadim.fedorenko@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
PHY devices has hwtstamp callback which actually performs set operation.
Rename it to better reflect the action.
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Kory Maincent <kory.maincent@bootlin.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Link: https://patch.msgid.link/20251124181151.277256-2-vadim.fedorenko@linux.dev
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The --send_omit_free flag is needed for TCP zero copy tests, to ensure
that packetdrill doesn't free the send() buffer after the send() call.
Fixes: 1e42f73fd3c2 ("selftests/net: packetdrill: import tcp/zerocopy")
Closes: https://lore.kernel.org/netdev/20251124071831.4cbbf412@kernel.org/
Suggested-by: Neal Cardwell <ncardwell@google.com>
Signed-off-by: Willem de Bruijn <willemb@google.com>
Link: https://patch.msgid.link/20251125234029.1320984-1-willemdebruijn.kernel@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
char variable in 'so_txtime.c' & 'txtimestamp.c' were left uninitilized
when switch default case taken. which raises following warning.
txtimestamp.c:240:2: warning: variable 'tsname' is used uninitialized
whenever switch default is taken [-Wsometimes-uninitialized]
so_txtime.c:210:3: warning: variable 'reason' is used uninitialized
whenever switch default is taken [-Wsometimes-uninitialized]
initializing these variables to NULL to fix this.
Signed-off-by: Ankit Khushwaha <ankitkhushwaha.linux@gmail.com>
Link: https://patch.msgid.link/20251125165302.20079-1-ankitkhushwaha.linux@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When discarding descriptors with IN_ORDER, we should rewind
next_avail_head otherwise it would run out of sync with
last_avail_idx. This would cause driver to report
"id X is not a head".
Fixing this by returning the number of descriptors that is used for
each buffer via vhost_get_vq_desc_n() so caller can use the value
while discarding descriptors.
Fixes: 67a873df0c41 ("vhost: basic in order support")
Cc: stable@vger.kernel.org
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Link: https://patch.msgid.link/20251120022950.10117-1-jasowang@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
With the previous commit revamping the timeout handling, started isn't
used anymore. It could be taken into account by adjusting the initial
value of the timeout, but there is little point as both callers capture
the timestamp shortly before calling __ceph_open_session() -- the only
thing of note that happens in the interim is taking client->mount_mutex
and that isn't expected to take multiple seconds.
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
|
|
The wait loop in __ceph_open_session() can race with the client
receiving a new monmap or osdmap shortly after the initial map is
received. Both ceph_monc_handle_map() and handle_one_map() install
a new map immediately after freeing the old one
kfree(monc->monmap);
monc->monmap = monmap;
ceph_osdmap_destroy(osdc->osdmap);
osdc->osdmap = newmap;
under client->monc.mutex and client->osdc.lock respectively, but
because neither is taken in have_mon_and_osd_map() it's possible for
client->monc.monmap->epoch and client->osdc.osdmap->epoch arms in
client->monc.monmap && client->monc.monmap->epoch &&
client->osdc.osdmap && client->osdc.osdmap->epoch;
condition to dereference an already freed map. This happens to be
reproducible with generic/395 and generic/397 with KASAN enabled:
BUG: KASAN: slab-use-after-free in have_mon_and_osd_map+0x56/0x70
Read of size 4 at addr ffff88811012d810 by task mount.ceph/13305
CPU: 2 UID: 0 PID: 13305 Comm: mount.ceph Not tainted 6.14.0-rc2-build2+ #1266
...
Call Trace:
<TASK>
have_mon_and_osd_map+0x56/0x70
ceph_open_session+0x182/0x290
ceph_get_tree+0x333/0x680
vfs_get_tree+0x49/0x180
do_new_mount+0x1a3/0x2d0
path_mount+0x6dd/0x730
do_mount+0x99/0xe0
__do_sys_mount+0x141/0x180
do_syscall_64+0x9f/0x100
entry_SYSCALL_64_after_hwframe+0x76/0x7e
</TASK>
Allocated by task 13305:
ceph_osdmap_alloc+0x16/0x130
ceph_osdc_init+0x27a/0x4c0
ceph_create_client+0x153/0x190
create_fs_client+0x50/0x2a0
ceph_get_tree+0xff/0x680
vfs_get_tree+0x49/0x180
do_new_mount+0x1a3/0x2d0
path_mount+0x6dd/0x730
do_mount+0x99/0xe0
__do_sys_mount+0x141/0x180
do_syscall_64+0x9f/0x100
entry_SYSCALL_64_after_hwframe+0x76/0x7e
Freed by task 9475:
kfree+0x212/0x290
handle_one_map+0x23c/0x3b0
ceph_osdc_handle_map+0x3c9/0x590
mon_dispatch+0x655/0x6f0
ceph_con_process_message+0xc3/0xe0
ceph_con_v1_try_read+0x614/0x760
ceph_con_workfn+0x2de/0x650
process_one_work+0x486/0x7c0
process_scheduled_works+0x73/0x90
worker_thread+0x1c8/0x2a0
kthread+0x2ec/0x300
ret_from_fork+0x24/0x40
ret_from_fork_asm+0x1a/0x30
Rewrite the wait loop to check the above condition directly with
client->monc.mutex and client->osdc.lock taken as appropriate. While
at it, improve the timeout handling (previously mount_timeout could be
exceeded in case wait_event_interruptible_timeout() slept more than
once) and access client->auth_err under client->monc.mutex to match
how it's set in finish_auth().
monmap_show() and osdmap_show() now take the respective lock before
accessing the map as well.
Cc: stable@vger.kernel.org
Reported-by: David Howells <dhowells@redhat.com>
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
Reviewed-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull ring-buffer fix from Steven Rostedt:
- Do not allow mmapped ring buffer to be split
When the ring buffer VMA is split by a partial munmap or a MAP_FIXED,
the kernel calls vm_ops->close() on each portion. This causes the
ring_buffer_unmap() to be called multiple times. This causes
subsequent calls to return -ENODEV and triggers a warning.
There's no reason to allow user space to split up memory mapping of
the ring buffer. Have it return -EINVAL when that happens.
* tag 'trace-ringbuffer-v6.18-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
tracing: Fix WARN_ON in tracing_buffers_mmap_close for split VMAs
|
|
Add a new package target to build a cpio archive containing the kernel
modules. This is particularly useful to supplement an existing initramfs
with the kernel modules so that the root filesystem can be started with
all needed kernel modules without modifying it.
Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
Reviewed-by: Simon Glass <sjg@chromium.org>
Tested-by: Simon Glass <sjg@chromium.org>
Co-developed-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
Reviewed-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Tested-by: Nicolas Schier <nsc@kernel.org>
Link: https://patch.msgid.link/20251125-cpio-modules-pkg-v2-2-aa8277d89682@pengutronix.de
Signed-off-by: Nicolas Schier <nsc@kernel.org>
|
|
gen_init_cpio is currently only needed when an initramfs cpio archive is
to be created out of CONFIG_INITRAMFS_SOURCE's contents. In other cases,
it's not added to hostprogs and no make target is available.
In preparation to use the host program from Makefile.package, define it
unconditionally. The program will still only be built as needed.
Signed-off-by: Ahmad Fatoum <a.fatoum@pengutronix.de>
Reviewed-by: Simon Glass <sjg@chromium.org>
Reviewed-by: Thomas Weißschuh <thomas.weissschuh@linutronix.de>
Reviewed-by: Nathan Chancellor <nathan@kernel.org>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Tested-by: Nicolas Schier <nsc@kernel.org>
Link: https://patch.msgid.link/20251125-cpio-modules-pkg-v2-1-aa8277d89682@pengutronix.de
Signed-off-by: Nicolas Schier <nsc@kernel.org>
|
|
Prior to commit a25e7962db0d7 ("PCI/P2PDMA: Refactor the p2pdma mapping
helpers"), P2P segments were mapped using the pci_p2pdma_map_segment()
helper. This helper was responsible for populating sg->dma_address,
marking the bus address, and also setting sg_dma_len(sg).
The refactor[1] removed this helper and moved the mapping logic directly
into the callers. While iommu_dma_map_sg() was correctly updated to set
the length in the new flow, it was missed in dma_direct_map_sg().
Thus, in dma_direct_map_sg(), the PCI_P2PDMA_MAP_BUS_ADDR case sets the
dma_address and marks the segment, but immediately executes 'continue',
which causes the loop to skip the standard assignment logic at the end:
sg_dma_len(sg) = sg->length;
As a result, when CONFIG_NEED_SG_DMA_LENGTH is enabled, the dma_length
field remains uninitialized (zero) for P2P bus address mappings. This
breaks upper-layer drivers (for e.g. RDMA/IB) that rely on sg_dma_len()
to determine the transfer size.
Fix this by explicitly setting the DMA length in the
PCI_P2PDMA_MAP_BUS_ADDR case before continuing to the next scatterlist
entry.
Fixes: a25e7962db0d7 ("PCI/P2PDMA: Refactor the p2pdma mapping helpers")
Reported-by: Jacob Moroni <jmoroni@google.com>
Signed-off-by: Pranjal Shrivastava <praan@google.com>
[1]
https://lore.kernel.org/all/ac14a0e94355bf898de65d023ccf8a2ad22a3ece.1746424934.git.leon@kernel.org/
Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Shivaji Kant <shivajikant@google.com>
Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com>
Link: https://lore.kernel.org/r/20251126114112.3694469-1-praan@google.com
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull misc fixes from Andrew Morton:
"8 hotfixes. 4 are cc:stable, 7 are against mm/.
All are singletons - please see the respective changelogs for details"
* tag 'mm-hotfixes-stable-2025-11-26-11-51' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm:
mm/filemap: fix logic around SIGBUS in filemap_map_pages()
mm/huge_memory: fix NULL pointer deference when splitting folio
MAINTAINERS: add test_kho to KHO's entry
mailmap: add entry for Sam Protsenko
selftests/mm: fix division-by-zero in uffd-unit-tests
mm/mmap_lock: reset maple state on lock_vma_under_rcu() retry
mm/memfd: fix information leak in hugetlb folios
mm: swap: remove duplicate nr_swap_pages decrement in get_swap_page_of_type()
|
|
The driver is doing a 64-bit divide, rather than using the proper
helpers, causing link errors on i386 allyesconfig builds:
x86_64-linux-ld: drivers/power/supply/intel_dc_ti_battery.o: in function `dc_ti_battery_get_voltage_and_current_now':
intel_dc_ti_battery.c:(.text+0x5c): undefined reference to `__udivdi3'
x86_64-linux-ld: intel_dc_ti_battery.c:(.text+0x96): undefined reference to `__udivdi3'
and while fixing that, fix the double rounding: keep the timing
difference in nanoseconds ('ktime'), and then just convert to usecs at
the end.
Not because the timing precision is likely to matter, but because doing
it right also makes the code simpler.
Reported-by: Guenter Roeck <linux@roeck-us.net>
Cc: Hans de Goede <hansg@kernel.org>
Cc: Sebastian Reichel <sebastian.reichel@collabora.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
That was already the limit with KASAN enabled, and the 32-bit x86 build
ends up having a couple of drm cases that have stack frames _just_ over
1kB on my allmodconfig test. So the minimal fix for this build issue
for now is to just bump the limit and make it independent of KASAN.
[ Side note: XTENSA already used 1.5k and PARISC uses 2k, so 1280 is
still relatively conservative ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
When excl_prog_hash is 0 and excl_prog_hash_size is non-zero, the map also
needs to be freed. Otherwise, the map memory will not be reclaimed, just
like the memory leak problem reported by syzbot [1].
syzbot reported:
BUG: memory leak
backtrace (crc 7b9fb9b4):
map_create+0x322/0x11e0 kernel/bpf/syscall.c:1512
__sys_bpf+0x3556/0x3610 kernel/bpf/syscall.c:6131
Fixes: baefdbdf6812 ("bpf: Implement exclusive map creation")
Reported-by: syzbot+cf08c551fecea9fd1320@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=cf08c551fecea9fd1320
Tested-by: syzbot+cf08c551fecea9fd1320@syzkaller.appspotmail.com
Signed-off-by: Edward Adam Davis <eadavis@qq.com>
Acked-by: Yonghong Song <yonghong.song@linux.dev>
Link: https://lore.kernel.org/r/tencent_3F226F882CE56DCC94ACE90EED1ECCFC780A@qq.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound
Pull sound fixes from Takashi Iwai:
"A collection of small fixes. All changes are device-specific and
trivial, mostly HD-audio and USB-audio quirks and fixups"
* tag 'sound-6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound:
ALSA: hda/realtek: Add quirk for HP ProBook 450 G8
ALSA: usb-audio: fix uac2 clock source at terminal parser
ALSA: hda/realtek: add quirk for HP pavilion aero laptop 13z-be200
ALSA: hda/cirrus fix cs420x MacPro 6,1 inverted jack detection
ALSA: usb-audio: Add DSD quirk for LEAK Stereo 230
ALSA: au88x0: Fix incorrect error handling for PCI config reads
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI fix from Rafael Wysocki:
"Revert a commit that attempted to make the code in the ACPI processor
driver more straightforward, but it turned out to cause the kernel to
crash on at least one system, along with some further cleanups on top
of it"
* tag 'acpi-6.18-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
Revert "ACPI: processor: idle: Optimize ACPI idle driver registration"
Revert "ACPI: processor: Remove unused empty stubs of some functions"
Revert "ACPI: processor: idle: Rearrange declarations in header file"
Revert "ACPI: processor: idle: Redefine two functions as void"
Revert "ACPI: processor: Do not expose global variable acpi_idle_driver"
|
|
If the board supports IP discovery, we don't need to
parse the gpu info firmware.
Backport to 6.18.
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4721
Fixes: fa819e3a7c1e ("drm/amdgpu: add support for cyan skillfish gpu_info")
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 5427e32fa3a0ba9a016db83877851ed277b065fb)
|
|
Ensure the userq TLB flush is emitted only after
the VM update finishes and the PT BOs have been
annotated with bookkeeping fences.
Suggested-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Prike Liang <Prike.Liang@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit f3854e04b708d73276c4488231a8bd66d30b4671)
Cc: stable@vger.kernel.org
|
|
[WHY]
When monitor is still booting EDID read can fail while DPCD read
is successful. In this case no EDID data will be returned, and this
could happen for a while.
[HOW]
Increase number of attempts to read EDID in dm_helpers_read_local_edid()
to 25.
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4672
Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Tested-by: Dan Wheeler <daniel.wheeler@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit a76d6f2c76c3abac519ba753e2723e6ffe8e461c)
Cc: stable@vger.kernel.org
|
|
[WHY]
When a laptop lid is closed the connector is disabled but userspace
can still try to change brightness. This doesn't work because the
panel is turned off. It will eventually time out, but there is a lot
of stutter along the way.
[How]
Iterate all connectors to check whether the matching one for the backlight
index is enabled.
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4675
Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Ray Wu <ray.wu@amd.com>
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit f6eeab30323d1174a4cc022e769d248fe8241304)
Cc: stable@vger.kernel.org
|
|
[WHAT]
IGT kms_cursor_legacy's long-nonblocking-modeset-vs-cursor-atomic
fails with NULL pointer dereference. This can be reproduced with
both an eDP panel and a DP monitors connected.
BUG: kernel NULL pointer dereference, address: 0000000000000000
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
Oops: Oops: 0000 [#1] SMP NOPTI
CPU: 13 UID: 0 PID: 2960 Comm: kms_cursor_lega Not tainted
6.16.0-99-custom #8 PREEMPT(voluntary)
Hardware name: AMD ........
RIP: 0010:dc_stream_get_scanoutpos+0x34/0x130 [amdgpu]
Code: 57 4d 89 c7 41 56 49 89 ce 41 55 49 89 d5 41 54 49
89 fc 53 48 83 ec 18 48 8b 87 a0 64 00 00 48 89 75 d0 48 c7 c6 e0 41 30
c2 <48> 8b 38 48 8b 9f 68 06 00 00 e8 8d d7 fd ff 31 c0 48 81 c3 e0 02
RSP: 0018:ffffd0f3c2bd7608 EFLAGS: 00010292
RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffd0f3c2bd7668
RDX: ffffd0f3c2bd7664 RSI: ffffffffc23041e0 RDI: ffff8b32494b8000
RBP: ffffd0f3c2bd7648 R08: ffffd0f3c2bd766c R09: ffffd0f3c2bd7760
R10: ffffd0f3c2bd7820 R11: 0000000000000000 R12: ffff8b32494b8000
R13: ffffd0f3c2bd7664 R14: ffffd0f3c2bd7668 R15: ffffd0f3c2bd766c
FS: 000071f631b68700(0000) GS:ffff8b399f114000(0000)
knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000000 CR3: 00000001b8105000 CR4: 0000000000f50ef0
PKRU: 55555554
Call Trace:
<TASK>
dm_crtc_get_scanoutpos+0xd7/0x180 [amdgpu]
amdgpu_display_get_crtc_scanoutpos+0x86/0x1c0 [amdgpu]
? __pfx_amdgpu_crtc_get_scanout_position+0x10/0x10[amdgpu]
amdgpu_crtc_get_scanout_position+0x27/0x50 [amdgpu]
drm_crtc_vblank_helper_get_vblank_timestamp_internal+0xf7/0x400
drm_crtc_vblank_helper_get_vblank_timestamp+0x1c/0x30
drm_crtc_get_last_vbltimestamp+0x55/0x90
drm_crtc_next_vblank_start+0x45/0xa0
drm_atomic_helper_wait_for_fences+0x81/0x1f0
...
Cc: Mario Limonciello <mario.limonciello@amd.com>
Cc: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Aurabindo Pillai <aurabindo.pillai@amd.com>
Signed-off-by: Alex Hung <alex.hung@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit 621e55f1919640acab25383362b96e65f2baea3c)
Cc: stable@vger.kernel.org
|
|
This reverts commit 2681bf4ae8d24df950138b8c9ea9c271cd62e414.
This results in a blank screen on the HDMI port on some systems.
Revert for now so as not to regress 6.18, can be addressed
in 6.19 once the issue is root caused.
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/4652
Cc: Sunpeng.Li@amd.com
Cc: ivan.lipski@amd.com
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit d0e9de7a81503cdde37fb2d37f1d102f9e0f38fb)
|