| Age | Commit message (Collapse) | Author | Files | Lines |
|
Cross-merge networking fixes after downstream PR (net-6.18-rc6).
No conflicts, adjacent changes in:
drivers/net/phy/micrel.c
96a9178a29a6 ("net: phy: micrel: lan8814 fix reset of the QSGMII interface")
61b7ade9ba8c ("net: phy: micrel: Add support for non PTP SKUs for lan8814")
and a trivial one in tools/testing/selftests/drivers/net/Makefile.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
syzbot reported use-after-free of tipc_net(net)->monitors[]
in tipc_mon_reinit_self(). [0]
The array is protected by RTNL, but tipc_mon_reinit_self()
iterates over it without RTNL.
tipc_mon_reinit_self() is called from tipc_net_finalize(),
which is always under RTNL except for tipc_net_finalize_work().
Let's hold RTNL in tipc_net_finalize_work().
[0]:
BUG: KASAN: slab-use-after-free in __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
BUG: KASAN: slab-use-after-free in _raw_spin_lock_irqsave+0xa7/0xf0 kernel/locking/spinlock.c:162
Read of size 1 at addr ffff88805eae1030 by task kworker/0:7/5989
CPU: 0 UID: 0 PID: 5989 Comm: kworker/0:7 Not tainted syzkaller #0 PREEMPT_{RT,(full)}
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/18/2025
Workqueue: events tipc_net_finalize_work
Call Trace:
<TASK>
dump_stack_lvl+0x189/0x250 lib/dump_stack.c:120
print_address_description mm/kasan/report.c:378 [inline]
print_report+0xca/0x240 mm/kasan/report.c:482
kasan_report+0x118/0x150 mm/kasan/report.c:595
__kasan_check_byte+0x2a/0x40 mm/kasan/common.c:568
kasan_check_byte include/linux/kasan.h:399 [inline]
lock_acquire+0x8d/0x360 kernel/locking/lockdep.c:5842
__raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline]
_raw_spin_lock_irqsave+0xa7/0xf0 kernel/locking/spinlock.c:162
rtlock_slowlock kernel/locking/rtmutex.c:1894 [inline]
rwbase_rtmutex_lock_state kernel/locking/spinlock_rt.c:160 [inline]
rwbase_write_lock+0xd3/0x7e0 kernel/locking/rwbase_rt.c:244
rt_write_lock+0x76/0x110 kernel/locking/spinlock_rt.c:243
write_lock_bh include/linux/rwlock_rt.h:99 [inline]
tipc_mon_reinit_self+0x79/0x430 net/tipc/monitor.c:718
tipc_net_finalize+0x115/0x190 net/tipc/net.c:140
process_one_work kernel/workqueue.c:3236 [inline]
process_scheduled_works+0xade/0x17b0 kernel/workqueue.c:3319
worker_thread+0x8a0/0xda0 kernel/workqueue.c:3400
kthread+0x70e/0x8a0 kernel/kthread.c:463
ret_from_fork+0x439/0x7d0 arch/x86/kernel/process.c:148
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
</TASK>
Allocated by task 6089:
kasan_save_stack mm/kasan/common.c:47 [inline]
kasan_save_track+0x3e/0x80 mm/kasan/common.c:68
poison_kmalloc_redzone mm/kasan/common.c:388 [inline]
__kasan_kmalloc+0x93/0xb0 mm/kasan/common.c:405
kasan_kmalloc include/linux/kasan.h:260 [inline]
__kmalloc_cache_noprof+0x1a8/0x320 mm/slub.c:4407
kmalloc_noprof include/linux/slab.h:905 [inline]
kzalloc_noprof include/linux/slab.h:1039 [inline]
tipc_mon_create+0xc3/0x4d0 net/tipc/monitor.c:657
tipc_enable_bearer net/tipc/bearer.c:357 [inline]
__tipc_nl_bearer_enable+0xe16/0x13f0 net/tipc/bearer.c:1047
__tipc_nl_compat_doit net/tipc/netlink_compat.c:371 [inline]
tipc_nl_compat_doit+0x3bc/0x5f0 net/tipc/netlink_compat.c:393
tipc_nl_compat_handle net/tipc/netlink_compat.c:-1 [inline]
tipc_nl_compat_recv+0x83c/0xbe0 net/tipc/netlink_compat.c:1321
genl_family_rcv_msg_doit+0x215/0x300 net/netlink/genetlink.c:1115
genl_family_rcv_msg net/netlink/genetlink.c:1195 [inline]
genl_rcv_msg+0x60e/0x790 net/netlink/genetlink.c:1210
netlink_rcv_skb+0x208/0x470 net/netlink/af_netlink.c:2552
genl_rcv+0x28/0x40 net/netlink/genetlink.c:1219
netlink_unicast_kernel net/netlink/af_netlink.c:1320 [inline]
netlink_unicast+0x846/0xa10 net/netlink/af_netlink.c:1346
netlink_sendmsg+0x805/0xb30 net/netlink/af_netlink.c:1896
sock_sendmsg_nosec net/socket.c:714 [inline]
__sock_sendmsg+0x21c/0x270 net/socket.c:729
____sys_sendmsg+0x508/0x820 net/socket.c:2614
___sys_sendmsg+0x21f/0x2a0 net/socket.c:2668
__sys_sendmsg net/socket.c:2700 [inline]
__do_sys_sendmsg net/socket.c:2705 [inline]
__se_sys_sendmsg net/socket.c:2703 [inline]
__x64_sys_sendmsg+0x1a1/0x260 net/socket.c:2703
do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
do_syscall_64+0xfa/0x3b0 arch/x86/entry/syscall_64.c:94
entry_SYSCALL_64_after_hwframe+0x77/0x7f
Freed by task 6088:
kasan_save_stack mm/kasan/common.c:47 [inline]
kasan_save_track+0x3e/0x80 mm/kasan/common.c:68
kasan_save_free_info+0x46/0x50 mm/kasan/generic.c:576
poison_slab_object mm/kasan/common.c:243 [inline]
__kasan_slab_free+0x5b/0x80 mm/kasan/common.c:275
kasan_slab_free include/linux/kasan.h:233 [inline]
slab_free_hook mm/slub.c:2422 [inline]
slab_free mm/slub.c:4695 [inline]
kfree+0x195/0x550 mm/slub.c:4894
tipc_l2_device_event+0x380/0x650 net/tipc/bearer.c:-1
notifier_call_chain+0x1b3/0x3e0 kernel/notifier.c:85
call_netdevice_notifiers_extack net/core/dev.c:2267 [inline]
call_netdevice_notifiers net/core/dev.c:2281 [inline]
unregister_netdevice_many_notify+0x14d7/0x1fe0 net/core/dev.c:12166
unregister_netdevice_many net/core/dev.c:12229 [inline]
unregister_netdevice_queue+0x33c/0x380 net/core/dev.c:12073
unregister_netdevice include/linux/netdevice.h:3385 [inline]
__tun_detach+0xe4d/0x1620 drivers/net/tun.c:621
tun_detach drivers/net/tun.c:637 [inline]
tun_chr_close+0x10d/0x1c0 drivers/net/tun.c:3433
__fput+0x458/0xa80 fs/file_table.c:468
task_work_run+0x1d4/0x260 kernel/task_work.c:227
resume_user_mode_work include/linux/resume_user_mode.h:50 [inline]
exit_to_user_mode_loop+0xec/0x110 kernel/entry/common.c:43
exit_to_user_mode_prepare include/linux/irq-entry-common.h:225 [inline]
syscall_exit_to_user_mode_work include/linux/entry-common.h:175 [inline]
syscall_exit_to_user_mode include/linux/entry-common.h:210 [inline]
do_syscall_64+0x2bd/0x3b0 arch/x86/entry/syscall_64.c:100
entry_SYSCALL_64_after_hwframe+0x77/0x7f
Fixes: 46cb01eeeb86 ("tipc: update mon's self addr when node addr generated")
Reported-by: syzbot+d7dad7fd4b3921104957@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/690c323a.050a0220.baf87.007f.GAE@google.com/
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20251107064038.2361188-1-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Update all struct proto_ops connect() callback function prototypes from
"struct sockaddr *" to "struct sockaddr_unsized *" to avoid lying to the
compiler about object sizes. Calls into struct proto handlers gain casts
that will be removed in the struct proto conversion patch.
No binary changes expected.
Signed-off-by: Kees Cook <kees@kernel.org>
Link: https://patch.msgid.link/20251104002617.2752303-3-kees@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Update all struct proto_ops bind() callback function prototypes from
"struct sockaddr *" to "struct sockaddr_unsized *" to avoid lying to the
compiler about object sizes. Calls into struct proto handlers gain casts
that will be removed in the struct proto conversion patch.
No binary changes expected.
Signed-off-by: Kees Cook <kees@kernel.org>
Link: https://patch.msgid.link/20251104002617.2752303-2-kees@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
sk->sk_refcnt has been converted to refcount_t in 2017.
__sock_put(sk) being refcount_dec(&sk->sk_refcnt), it will complain
loudly if the current refcnt is 1 (or less) in a non racy way.
We can remove four WARN_ON() in favor of the generic refcount_dec()
check.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Xuanqiang Luo<luoxuanqiang@kylinos.cn>
Link: https://patch.msgid.link/20251014140605.2982703-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Fix typos in doc comments.
Signed-off-by: Bhanu Seshu Kumar Valluri <bhanuseshukumar@gmail.com>
Link: https://patch.msgid.link/20251001105715.50462-1-bhanuseshukumar@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Since the value returned by 'tipc_nodeid2string()' is not used, the
function may be adjusted to return the length of the result, which
is helpful to drop a few calls to 'strlen()' in 'tipc_link_create()'
and 'tipc_link_bc_create()'. Compile tested only.
Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250926074113.914399-1-dmantipov@yandex.ru
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
We want to split sk->sk_drops in the future to reduce
potential contention on this field.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@google.com>
Link: https://patch.msgid.link/20250826125031.1578842-2-edumazet@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Cross-merge networking fixes after downstream PR (net-6.16-rc6).
No conflicts.
Adjacent changes:
Documentation/devicetree/bindings/net/allwinner,sun8i-a83t-emac.yaml
0a12c435a1d6 ("dt-bindings: net: sun8i-emac: Add A100 EMAC compatible")
b3603c0466a8 ("dt-bindings: net: sun8i-emac: Rename A523 EMAC0 to GMAC0")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
syzbot reported a null-ptr-deref in tipc_conn_close() during netns
dismantle. [0]
tipc_topsrv_stop() iterates tipc_net(net)->topsrv->conn_idr and calls
tipc_conn_close() for each tipc_conn.
The problem is that tipc_conn_close() is called after releasing the
IDR lock.
At the same time, there might be tipc_conn_recv_work() running and it
could call tipc_conn_close() for the same tipc_conn and release its
last ->kref.
Once we release the IDR lock in tipc_topsrv_stop(), there is no
guarantee that the tipc_conn is alive.
Let's hold the ref before releasing the lock and put the ref after
tipc_conn_close() in tipc_topsrv_stop().
[0]:
BUG: KASAN: use-after-free in tipc_conn_close+0x122/0x140 net/tipc/topsrv.c:165
Read of size 8 at addr ffff888099305a08 by task kworker/u4:3/435
CPU: 0 PID: 435 Comm: kworker/u4:3 Not tainted 4.19.204-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: netns cleanup_net
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x1fc/0x2ef lib/dump_stack.c:118
print_address_description.cold+0x54/0x219 mm/kasan/report.c:256
kasan_report_error.cold+0x8a/0x1b9 mm/kasan/report.c:354
kasan_report mm/kasan/report.c:412 [inline]
__asan_report_load8_noabort+0x88/0x90 mm/kasan/report.c:433
tipc_conn_close+0x122/0x140 net/tipc/topsrv.c:165
tipc_topsrv_stop net/tipc/topsrv.c:701 [inline]
tipc_topsrv_exit_net+0x27b/0x5c0 net/tipc/topsrv.c:722
ops_exit_list+0xa5/0x150 net/core/net_namespace.c:153
cleanup_net+0x3b4/0x8b0 net/core/net_namespace.c:553
process_one_work+0x864/0x1570 kernel/workqueue.c:2153
worker_thread+0x64c/0x1130 kernel/workqueue.c:2296
kthread+0x33f/0x460 kernel/kthread.c:259
ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:415
Allocated by task 23:
kmem_cache_alloc_trace+0x12f/0x380 mm/slab.c:3625
kmalloc include/linux/slab.h:515 [inline]
kzalloc include/linux/slab.h:709 [inline]
tipc_conn_alloc+0x43/0x4f0 net/tipc/topsrv.c:192
tipc_topsrv_accept+0x1b5/0x280 net/tipc/topsrv.c:470
process_one_work+0x864/0x1570 kernel/workqueue.c:2153
worker_thread+0x64c/0x1130 kernel/workqueue.c:2296
kthread+0x33f/0x460 kernel/kthread.c:259
ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:415
Freed by task 23:
__cache_free mm/slab.c:3503 [inline]
kfree+0xcc/0x210 mm/slab.c:3822
tipc_conn_kref_release net/tipc/topsrv.c:150 [inline]
kref_put include/linux/kref.h:70 [inline]
conn_put+0x2cd/0x3a0 net/tipc/topsrv.c:155
process_one_work+0x864/0x1570 kernel/workqueue.c:2153
worker_thread+0x64c/0x1130 kernel/workqueue.c:2296
kthread+0x33f/0x460 kernel/kthread.c:259
ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:415
The buggy address belongs to the object at ffff888099305a00
which belongs to the cache kmalloc-512 of size 512
The buggy address is located 8 bytes inside of
512-byte region [ffff888099305a00, ffff888099305c00)
The buggy address belongs to the page:
page:ffffea000264c140 count:1 mapcount:0 mapping:ffff88813bff0940 index:0x0
flags: 0xfff00000000100(slab)
raw: 00fff00000000100 ffffea00028b6b88 ffffea0002cd2b08 ffff88813bff0940
raw: 0000000000000000 ffff888099305000 0000000100000006 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff888099305900: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff888099305980: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff888099305a00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
ffff888099305a80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff888099305b00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
Fixes: c5fa7b3cf3cb ("tipc: introduce new TIPC server infrastructure")
Reported-by: syzbot+d333febcf8f4bc5f6110@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=27169a847a70550d17be
Signed-off-by: Kuniyuki Iwashima <kuniyu@google.com>
Reviewed-by: Tung Nguyen <tung.quang.nguyen@est.tech>
Link: https://patch.msgid.link/20250702014350.692213-1-kuniyu@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Difference between sock_i_uid() and sk_uid() is that
after sock_orphan(), sock_i_uid() returns GLOBAL_ROOT_UID
while sk_uid() returns the last cached sk->sk_uid value.
None of sock_i_uid() callers care about this.
Use sk_uid() which is much faster and inlined.
Note that diag/dump users are calling sock_i_ino() and
can not see the full benefit yet.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Lorenzo Colitti <lorenzo@google.com>
Reviewed-by: Maciej Żenczykowski <maze@google.com>
Link: https://patch.msgid.link/20250620133001.4090592-3-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Cross-merge networking fixes after downstream PR (net-6.16-rc3).
No conflicts or adjacent changes.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The reproduction steps:
1. create a tun interface
2. enable l2 bearer
3. TIPC_NL_UDP_GET_REMOTEIP with media name set to tun
tipc: Started in network mode
tipc: Node identity 8af312d38a21, cluster identity 4711
tipc: Enabled bearer <eth:syz_tun>, priority 1
Oops: general protection fault
KASAN: null-ptr-deref in range
CPU: 1 UID: 1000 PID: 559 Comm: poc Not tainted 6.16.0-rc1+ #117 PREEMPT
Hardware name: QEMU Ubuntu 24.04 PC
RIP: 0010:tipc_udp_nl_dump_remoteip+0x4a4/0x8f0
the ub was in fact a struct dev.
when bid != 0 && skip_cnt != 0, bearer_list[bid] may be NULL or
other media when other thread changes it.
fix this by checking media_id.
Fixes: 832629ca5c313 ("tipc: add UDP remoteip dump to netlink API")
Signed-off-by: Haixia Qu <hxqu@hillstonenet.com>
Reviewed-by: Tung Nguyen <tung.quang.nguyen@est.tech>
Link: https://patch.msgid.link/20250617055624.2680-1-hxqu@hillstonenet.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
ip6tunnel_xmit() erases the contents of the SKB control block. In order to
be able to set particular IP6CB flags on the SKB, add a corresponding
parameter, and propagate it to udp_tunnel6_xmit_skb() as well.
In one of the following patches, VXLAN driver will use this facility to
mark packets as subject to IPv6 multicast routing.
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://patch.msgid.link/acb4f9f3e40c3a931236c3af08a720b017fbfbfb.1750113335.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The function always returns zero, thus the return value does not carry any
signal. Just make it void.
Most callers already ignore the return value. However:
- Refold arguments of the call from sctp_v6_xmit() so that they fit into
the 80-column limit.
- tipc_udp_xmit() initializes err from the return value, but that should
already be always zero at that point. So there's no practical change, but
elision of the assignment prompts a couple more tweaks to clean up the
function.
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Link: https://patch.msgid.link/7facacf9d8ca3ca9391a4aee88160913671b868d.1750113335.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
iptunnel_xmit() erases the contents of the SKB control block. In order to
be able to set particular IPCB flags on the SKB, add a corresponding
parameter, and propagate it to udp_tunnel_xmit_skb() as well.
In one of the following patches, VXLAN driver will use this facility to
mark packets as subject to IP multicast routing.
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Reviewed-by: Nikolay Aleksandrov <razor@blackwall.org>
Acked-by: Antonio Quartulli <antonio@openvpn.net>
Link: https://patch.msgid.link/89c9daf9f2dc088b6b92ccebcc929f51742de91f.1750113335.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Move this API to the canonical timer_*() namespace.
[ tglx: Redone against pre rc1 ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/aB2X0jCKQO56WdMt@gmail.com
|
|
syzbot reported a refcount warning [1] caused by calling get_net() on
a network namespace that is being destroyed (refcount=0). This happens
when a TIPC discovery timer fires during network namespace cleanup.
The recently added get_net() call in commit e279024617134 ("net/tipc:
fix slab-use-after-free Read in tipc_aead_encrypt_done") attempts to
hold a reference to the network namespace. However, if the namespace
is already being destroyed, its refcount might be zero, leading to the
use-after-free warning.
Replace get_net() with maybe_get_net(), which safely checks if the
refcount is non-zero before incrementing it. If the namespace is being
destroyed, return -ENODEV early, after releasing the bearer reference.
[1]: https://lore.kernel.org/all/68342b55.a70a0220.253bc2.0091.GAE@google.com/T/#m12019cf9ae77e1954f666914640efa36d52704a2
Reported-by: syzbot+f0c4a4aba757549ae26c@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/68342b55.a70a0220.253bc2.0091.GAE@google.com/T/#m12019cf9ae77e1954f666914640efa36d52704a2
Fixes: e27902461713 ("net/tipc: fix slab-use-after-free Read in tipc_aead_encrypt_done")
Signed-off-by: Charalampos Mitrodimas <charmitro@posteo.net>
Reviewed-by: Tung Nguyen <tung.quang.nguyen@est.tech>
Link: https://patch.msgid.link/20250527-net-tipc-warning-v2-1-df3dc398a047@posteo.net
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
The tipc_aead_free() function currently uses kfree() to release the aead
structure. However, this structure contains sensitive information, such
as key's SALT value, which should be securely erased from memory to
prevent potential leakage.
To enhance security, replace kfree() with kfree_sensitive() when freeing
the aead structure. This change ensures that sensitive data is explicitly
cleared before memory deallocation, aligning with the approach used in
tipc_aead_init() and adhering to best practices for handling confidential
information.
Signed-off-by: Zilin Guan <zilin@seu.edu.cn>
Reviewed-by: Tung Nguyen <tung.quang.nguyen@est.tech>
Link: https://patch.msgid.link/20250523114717.4021518-1-zilin@seu.edu.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Cross-merge networking fixes after downstream PR (net-6.15-rc8).
Conflicts:
80f2ab46c2ee ("irdma: free iwdev->rf after removing MSI-X")
4bcc063939a5 ("ice, irdma: fix an off by one in error handling code")
c24a65b6a27c ("iidc/ice/irdma: Update IDC to support multiple consumers")
https://lore.kernel.org/20250513130630.280ee6c5@canb.auug.org.au
No extra adjacent changes.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Syzbot reported a slab-use-after-free with the following call trace:
==================================================================
BUG: KASAN: slab-use-after-free in tipc_aead_encrypt_done+0x4bd/0x510 net/tipc/crypto.c:840
Read of size 8 at addr ffff88807a733000 by task kworker/1:0/25
Call Trace:
kasan_report+0xd9/0x110 mm/kasan/report.c:601
tipc_aead_encrypt_done+0x4bd/0x510 net/tipc/crypto.c:840
crypto_request_complete include/crypto/algapi.h:266
aead_request_complete include/crypto/internal/aead.h:85
cryptd_aead_crypt+0x3b8/0x750 crypto/cryptd.c:772
crypto_request_complete include/crypto/algapi.h:266
cryptd_queue_worker+0x131/0x200 crypto/cryptd.c:181
process_one_work+0x9fb/0x1b60 kernel/workqueue.c:3231
Allocated by task 8355:
kzalloc_noprof include/linux/slab.h:778
tipc_crypto_start+0xcc/0x9e0 net/tipc/crypto.c:1466
tipc_init_net+0x2dd/0x430 net/tipc/core.c:72
ops_init+0xb9/0x650 net/core/net_namespace.c:139
setup_net+0x435/0xb40 net/core/net_namespace.c:343
copy_net_ns+0x2f0/0x670 net/core/net_namespace.c:508
create_new_namespaces+0x3ea/0xb10 kernel/nsproxy.c:110
unshare_nsproxy_namespaces+0xc0/0x1f0 kernel/nsproxy.c:228
ksys_unshare+0x419/0x970 kernel/fork.c:3323
__do_sys_unshare kernel/fork.c:3394
Freed by task 63:
kfree+0x12a/0x3b0 mm/slub.c:4557
tipc_crypto_stop+0x23c/0x500 net/tipc/crypto.c:1539
tipc_exit_net+0x8c/0x110 net/tipc/core.c:119
ops_exit_list+0xb0/0x180 net/core/net_namespace.c:173
cleanup_net+0x5b7/0xbf0 net/core/net_namespace.c:640
process_one_work+0x9fb/0x1b60 kernel/workqueue.c:3231
After freed the tipc_crypto tx by delete namespace, tipc_aead_encrypt_done
may still visit it in cryptd_queue_worker workqueue.
I reproduce this issue by:
ip netns add ns1
ip link add veth1 type veth peer name veth2
ip link set veth1 netns ns1
ip netns exec ns1 tipc bearer enable media eth dev veth1
ip netns exec ns1 tipc node set key this_is_a_master_key master
ip netns exec ns1 tipc bearer disable media eth dev veth1
ip netns del ns1
The key of reproduction is that, simd_aead_encrypt is interrupted, leading
to crypto_simd_usable() return false. Thus, the cryptd_queue_worker is
triggered, and the tipc_crypto tx will be visited.
tipc_disc_timeout
tipc_bearer_xmit_skb
tipc_crypto_xmit
tipc_aead_encrypt
crypto_aead_encrypt
// encrypt()
simd_aead_encrypt
// crypto_simd_usable() is false
child = &ctx->cryptd_tfm->base;
simd_aead_encrypt
crypto_aead_encrypt
// encrypt()
cryptd_aead_encrypt_enqueue
cryptd_aead_enqueue
cryptd_enqueue_request
// trigger cryptd_queue_worker
queue_work_on(smp_processor_id(), cryptd_wq, &cpu_queue->work)
Fix this by holding net reference count before encrypt.
Reported-by: syzbot+55c12726619ff85ce1f6@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=55c12726619ff85ce1f6
Fixes: fc1b6d6de220 ("tipc: introduce TIPC encryption & authentication")
Signed-off-by: Wang Liang <wangliang74@huawei.com>
Link: https://patch.msgid.link/20250520101404.1341730-1-wangliang74@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Cross-merge networking fixes after downstream PR (net-6.15-rc4).
This pull includes wireless and a fix to vxlan which isn't
in Linus's tree just yet. The latter creates with a silent conflict
/ build breakage, so merging it now to avoid causing problems.
drivers/net/vxlan/vxlan_vnifilter.c
094adad91310 ("vxlan: Use a single lock to protect the FDB table")
087a9eb9e597 ("vxlan: vnifilter: Fix unlocked deletion of default FDB entry")
https://lore.kernel.org/20250423145131.513029-1-idosch@nvidia.com
No "normal" conflicts, or adjacent changes.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
syzbot reported:
tipc: Node number set to 1055423674
Oops: general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] SMP KASAN NOPTI
KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
CPU: 3 UID: 0 PID: 6017 Comm: kworker/3:5 Not tainted 6.15.0-rc1-syzkaller-00246-g900241a5cc15 #0 PREEMPT(full)
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2~bpo12+1 04/01/2014
Workqueue: events tipc_net_finalize_work
RIP: 0010:tipc_mon_reinit_self+0x11c/0x210 net/tipc/monitor.c:719
...
RSP: 0018:ffffc9000356fb68 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 000000003ee87cba
RDX: 0000000000000000 RSI: ffffffff8dbc56a7 RDI: ffff88804c2cc010
RBP: dffffc0000000000 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000007
R13: fffffbfff2111097 R14: ffff88804ead8000 R15: ffff88804ead9010
FS: 0000000000000000(0000) GS:ffff888097ab9000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000f720eb00 CR3: 000000000e182000 CR4: 0000000000352ef0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
tipc_net_finalize+0x10b/0x180 net/tipc/net.c:140
process_one_work+0x9cc/0x1b70 kernel/workqueue.c:3238
process_scheduled_works kernel/workqueue.c:3319 [inline]
worker_thread+0x6c8/0xf10 kernel/workqueue.c:3400
kthread+0x3c2/0x780 kernel/kthread.c:464
ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:153
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
</TASK>
...
RIP: 0010:tipc_mon_reinit_self+0x11c/0x210 net/tipc/monitor.c:719
...
RSP: 0018:ffffc9000356fb68 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 000000003ee87cba
RDX: 0000000000000000 RSI: ffffffff8dbc56a7 RDI: ffff88804c2cc010
RBP: dffffc0000000000 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000007
R13: fffffbfff2111097 R14: ffff88804ead8000 R15: ffff88804ead9010
FS: 0000000000000000(0000) GS:ffff888097ab9000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000f720eb00 CR3: 000000000e182000 CR4: 0000000000352ef0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
There is a racing condition between workqueue created when enabling
bearer and another thread created when disabling bearer right after
that as follow:
enabling_bearer | disabling_bearer
--------------- | ----------------
tipc_disc_timeout() |
{ | bearer_disable()
... | {
schedule_work(&tn->work); | tipc_mon_delete()
... | {
} | ...
| write_lock_bh(&mon->lock);
| mon->self = NULL;
| write_unlock_bh(&mon->lock);
| ...
| }
tipc_net_finalize_work() | }
{ |
... |
tipc_net_finalize() |
{ |
... |
tipc_mon_reinit_self() |
{ |
... |
write_lock_bh(&mon->lock); |
mon->self->addr = tipc_own_addr(net); |
write_unlock_bh(&mon->lock); |
... |
} |
... |
} |
... |
} |
'mon->self' is set to NULL in disabling_bearer thread and dereferenced
later in enabling_bearer thread.
This commit fixes this issue by validating 'mon->self' before assigning
node address to it.
Reported-by: syzbot+ed60da8d686dc709164c@syzkaller.appspotmail.com
Fixes: 46cb01eeeb86 ("tipc: update mon's self addr when node addr generated")
Signed-off-by: Tung Nguyen <tung.quang.nguyen@est.tech>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20250417074826.578115-1-tung.quang.nguyen@est.tech
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This patch suggests the replacement of strncpy with strscpy
as per Documentation/process/deprecated.
The strncpy() fails to guarantee NULL termination,
The function adds zero pads which isn't really convenient for short strings
as it may cause performance issues.
strscpy() is a preferred replacement because
it overcomes the limitations of strncpy mentioned above.
Compile Tested
Signed-off-by: Kevin Paul Reddy Janagari <kevinpaul468@gmail.com>
Reviewed-by: Tung Nguyen <tung.quang.nguyen@est.tech>
Tested-by: Tung Nguyen <tung.quang.nguyen@est.tech>
Link: https://patch.msgid.link/20250411085010.6249-1-kevinpaul468@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
"Including fixes from netfilter.
Current release - regressions:
- core: hold instance lock during NETDEV_CHANGE
- rtnetlink: fix bad unlock balance in do_setlink()
- ipv6:
- fix null-ptr-deref in addrconf_add_ifaddr()
- align behavior across nexthops during path selection
Previous releases - regressions:
- sctp: prevent transport UaF in sendmsg
- mptcp: only inc MPJoinAckHMacFailure for HMAC failures
Previous releases - always broken:
- sched:
- make ->qlen_notify() idempotent
- ensure sufficient space when sending filter netlink notifications
- sch_sfq: really don't allow 1 packet limit
- netfilter: fix incorrect avx2 match of 5th field octet
- tls: explicitly disallow disconnect
- eth: octeontx2-pf: fix VF root node parent queue priority"
* tag 'net-6.15-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (38 commits)
ethtool: cmis_cdb: Fix incorrect read / write length extension
selftests: netfilter: add test case for recent mismatch bug
nft_set_pipapo: fix incorrect avx2 match of 5th field octet
net: ppp: Add bound checking for skb data on ppp_sync_txmung
net: Fix null-ptr-deref by sock_lock_init_class_and_name() and rmmod.
ipv6: Align behavior across nexthops during path selection
net: phy: allow MDIO bus PM ops to start/stop state machine for phylink-controlled PHY
net: phy: move phy_link_change() prior to mdio_bus_phy_may_suspend()
selftests/tc-testing: sfq: check that a derived limit of 1 is rejected
net_sched: sch_sfq: move the limit validation
net_sched: sch_sfq: use a temporary work area for validating configuration
net: libwx: handle page_pool_dev_alloc_pages error
selftests: mptcp: validate MPJoin HMacFailure counters
mptcp: only inc MPJoinAckHMacFailure for HMAC failures
rtnetlink: Fix bad unlock balance in do_setlink().
net: ethtool: Don't call .cleanup_data when prepare_data fails
tc: Ensure we have enough buffer space when sending filter netlink notifications
net: libwx: Fix the wrong Rx descriptor field
octeontx2-pf: qos: fix VF root node parent queue index
selftests: tls: check that disconnect does nothing
...
|
|
In case the backlog transmit queue for system-importance messages is overloaded,
tipc_link_xmit() returns -ENOBUFS but the skb list is not purged. This leads to
memory leak and failure when a skb is allocated.
This commit fixes this issue by purging the skb list before tipc_link_xmit()
returns.
Fixes: 365ad353c256 ("tipc: reduce risk of user starvation during link congestion")
Signed-off-by: Tung Nguyen <tung.quang.nguyen@est.tech>
Link: https://patch.msgid.link/20250403092431.514063-1-tung.quang.nguyen@est.tech
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
timer_delete[_sync]() replaces del_timer[_sync](). Convert the whole tree
over and remove the historical wrapper inlines.
Conversion was done with coccinelle plus manual fixups where necessary.
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
|
|
The address of a data structure member was determined before
a corresponding null pointer check in the implementation of
the function “tipc_link_tnl_prepare”.
Thus avoid the risk for undefined behaviour by moving the definition
for the local variable “fdefq” into an if branch at the end.
This issue was detected by using the Coccinelle software.
Signed-off-by: Markus Elfring <elfring@users.sourceforge.net>
Link: https://patch.msgid.link/08fe8fc3-19c3-4324-8719-0ee74b0f32c9@web.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
On a 32bit system the "keylen + sizeof(struct tipc_aead_key)" math could
have an integer wrapping issue. It doesn't matter because the "keylen"
is checked on the next line, but just to make life easier for static
analysis tools, let's re-order these conditions and avoid the integer
overflow.
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Cross-merge networking fixes after downstream PR (net-6.13-rc3).
No conflicts or adjacent changes.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
tipc_exit_net() is very slow and is abused by syzbot.
tipc_nametbl_stop() is called for each netns being dismantled.
Calling synchronize_net() right before freeing tn->nametbl
is a big hammer.
Replace this with kfree_rcu().
Note that RCU is not properly used here, otherwise
tn->nametbl should be cleared before the synchronize_net()
or kfree_rcu(), or even before the cleanup loop.
We might need to fix this at some point.
Also note tipc uses other synchronize_rcu() calls,
more work is needed to make tipc_exit_net() much faster.
List of remaining calls to synchronize_rcu()
tipc_detach_loopback() (dev_remove_pack())
tipc_bcast_stop()
tipc_sk_rht_destroy()
Signed-off-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20241204210234.319484-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
syzbot found [1] that after blamed commit, ub->ubsock->sk
was NULL when attempting the atomic_dec() :
atomic_dec(&tipc_net(sock_net(ub->ubsock->sk))->wq_count);
Fix this by caching the tipc_net pointer.
[1]
Oops: general protection fault, probably for non-canonical address 0xdffffc0000000006: 0000 [#1] PREEMPT SMP KASAN PTI
KASAN: null-ptr-deref in range [0x0000000000000030-0x0000000000000037]
CPU: 0 UID: 0 PID: 5896 Comm: kworker/0:3 Not tainted 6.13.0-rc1-next-20241203-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024
Workqueue: events cleanup_bearer
RIP: 0010:read_pnet include/net/net_namespace.h:387 [inline]
RIP: 0010:sock_net include/net/sock.h:655 [inline]
RIP: 0010:cleanup_bearer+0x1f7/0x280 net/tipc/udp_media.c:820
Code: 18 48 89 d8 48 c1 e8 03 42 80 3c 28 00 74 08 48 89 df e8 3c f7 99 f6 48 8b 1b 48 83 c3 30 e8 f0 e4 60 00 48 89 d8 48 c1 e8 03 <42> 80 3c 28 00 74 08 48 89 df e8 1a f7 99 f6 49 83 c7 e8 48 8b 1b
RSP: 0018:ffffc9000410fb70 EFLAGS: 00010206
RAX: 0000000000000006 RBX: 0000000000000030 RCX: ffff88802fe45a00
RDX: 0000000000000001 RSI: 0000000000000008 RDI: ffffc9000410f900
RBP: ffff88807e1f0908 R08: ffffc9000410f907 R09: 1ffff92000821f20
R10: dffffc0000000000 R11: fffff52000821f21 R12: ffff888031d19980
R13: dffffc0000000000 R14: dffffc0000000000 R15: ffff88807e1f0918
FS: 0000000000000000(0000) GS:ffff8880b8600000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000556ca050b000 CR3: 0000000031c0c000 CR4: 00000000003526f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Fixes: 6a2fa13312e5 ("tipc: Fix use-after-free of kernel socket in cleanup_bearer().")
Reported-by: syzbot+46aa5474f179dacd1a3b@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/67508b5f.050a0220.17bd51.0070.GAE@google.com/T/#u
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20241204170548.4152658-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
syzkaller reported a use-after-free of UDP kernel socket
in cleanup_bearer() without repro. [0][1]
When bearer_disable() calls tipc_udp_disable(), cleanup
of the UDP kernel socket is deferred by work calling
cleanup_bearer().
tipc_net_stop() waits for such works to finish by checking
tipc_net(net)->wq_count. However, the work decrements the
count too early before releasing the kernel socket,
unblocking cleanup_net() and resulting in use-after-free.
Let's move the decrement after releasing the socket in
cleanup_bearer().
[0]:
ref_tracker: net notrefcnt@000000009b3d1faf has 1/1 users at
sk_alloc+0x438/0x608
inet_create+0x4c8/0xcb0
__sock_create+0x350/0x6b8
sock_create_kern+0x58/0x78
udp_sock_create4+0x68/0x398
udp_sock_create+0x88/0xc8
tipc_udp_enable+0x5e8/0x848
__tipc_nl_bearer_enable+0x84c/0xed8
tipc_nl_bearer_enable+0x38/0x60
genl_family_rcv_msg_doit+0x170/0x248
genl_rcv_msg+0x400/0x5b0
netlink_rcv_skb+0x1dc/0x398
genl_rcv+0x44/0x68
netlink_unicast+0x678/0x8b0
netlink_sendmsg+0x5e4/0x898
____sys_sendmsg+0x500/0x830
[1]:
BUG: KMSAN: use-after-free in udp_hashslot include/net/udp.h:85 [inline]
BUG: KMSAN: use-after-free in udp_lib_unhash+0x3b8/0x930 net/ipv4/udp.c:1979
udp_hashslot include/net/udp.h:85 [inline]
udp_lib_unhash+0x3b8/0x930 net/ipv4/udp.c:1979
sk_common_release+0xaf/0x3f0 net/core/sock.c:3820
inet_release+0x1e0/0x260 net/ipv4/af_inet.c:437
inet6_release+0x6f/0xd0 net/ipv6/af_inet6.c:489
__sock_release net/socket.c:658 [inline]
sock_release+0xa0/0x210 net/socket.c:686
cleanup_bearer+0x42d/0x4c0 net/tipc/udp_media.c:819
process_one_work kernel/workqueue.c:3229 [inline]
process_scheduled_works+0xcaf/0x1c90 kernel/workqueue.c:3310
worker_thread+0xf6c/0x1510 kernel/workqueue.c:3391
kthread+0x531/0x6b0 kernel/kthread.c:389
ret_from_fork+0x60/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x11/0x20 arch/x86/entry/entry_64.S:244
Uninit was created at:
slab_free_hook mm/slub.c:2269 [inline]
slab_free mm/slub.c:4580 [inline]
kmem_cache_free+0x207/0xc40 mm/slub.c:4682
net_free net/core/net_namespace.c:454 [inline]
cleanup_net+0x16f2/0x19d0 net/core/net_namespace.c:647
process_one_work kernel/workqueue.c:3229 [inline]
process_scheduled_works+0xcaf/0x1c90 kernel/workqueue.c:3310
worker_thread+0xf6c/0x1510 kernel/workqueue.c:3391
kthread+0x531/0x6b0 kernel/kthread.c:389
ret_from_fork+0x60/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x11/0x20 arch/x86/entry/entry_64.S:244
CPU: 0 UID: 0 PID: 54 Comm: kworker/0:2 Not tainted 6.12.0-rc1-00131-gf66ebf37d69c #7 91723d6f74857f70725e1583cba3cf4adc716cfa
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
Workqueue: events cleanup_bearer
Fixes: 26abe14379f8 ("net: Modify sk_alloc to not reference count the netns of kernel sockets.")
Reported-by: syzkaller <syzkaller@googlegroups.com>
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://patch.msgid.link/20241127050512.28438-1-kuniyu@amazon.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Merge in late fixes to prepare for the 6.12 net-next PR.
No conflicts or adjacent changes.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Clang static checker (scan-build) warning:
net/tipc/bcast.c:305:4:
The expression is an uninitialized value. The computed value will also
be garbage [core.uninitialized.Assign]
305 | (*cong_link_cnt)++;
| ^~~~~~~~~~~~~~~~~~
tipc_rcast_xmit() will increase cong_link_cnt's value, but cong_link_cnt
is uninitialized. Although it won't really cause a problem, it's better
to fix it.
Fixes: dca4a17d24ee ("tipc: fix potential hanging after b/rcast changing")
Signed-off-by: Su Hui <suhui@nfschina.com>
Reviewed-by: Justin Stitt <justinstitt@google.com>
Link: https://patch.msgid.link/20240912110119.2025503-1-suhui@nfschina.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
list_head can be initialized automatically with LIST_HEAD()
instead of calling INIT_LIST_HEAD(). Here we can simplify
the code.
Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Link: https://patch.msgid.link/20240904093243.3345012-3-lihongbo22@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The deprecated helper strcpy() performs no bounds checking on the
destination buffer. This could result in linear overflows beyond
the end of the buffer, leading to all kinds of misbehaviors.
The safe replacement is strscpy() [1].
Link: https://www.kernel.org/doc/html/latest/process/deprecated.html#strcpy [1]
Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
Link: https://patch.msgid.link/20240828123224.3697672-6-lihongbo22@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When calculating size of own domain based on number of peers, the result
should be less than MAX_MON_DOMAIN, so using min() here is very semantic.
Signed-off-by: Li Zetao <lizetao1@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240822133908.1042240-8-lizetao1@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Smatch reports that copying media_name and if_name to name_parts may
overwrite the destination.
.../bearer.c:166 bearer_name_validate() error: strcpy() 'media_name' too large for 'name_parts->media_name' (32 vs 16)
.../bearer.c:167 bearer_name_validate() error: strcpy() 'if_name' too large for 'name_parts->if_name' (1010102 vs 16)
This does seem to be the case so guard against this possibility by using
strscpy() and failing if truncation occurs.
Introduced by commit b97bf3fd8f6a ("[TIPC] Initial merge")
Compile tested only.
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Simon Horman <horms@kernel.org>
Link: https://patch.msgid.link/20240801-tipic-overrun-v2-1-c5b869d1f074@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from bpf and netfilter.
A lot of networking people were at a conference last week, busy
catching COVID, so relatively short PR.
Current release - regressions:
- tcp: process the 3rd ACK with sk_socket for TFO and MPTCP
Current release - new code bugs:
- l2tp: protect session IDR and tunnel session list with one lock,
make sure the state is coherent to avoid a warning
- eth: bnxt_en: update xdp_rxq_info in queue restart logic
- eth: airoha: fix location of the MBI_RX_AGE_SEL_MASK field
Previous releases - regressions:
- xsk: require XDP_UMEM_TX_METADATA_LEN to actuate tx_metadata_len,
the field reuses previously un-validated pad
Previous releases - always broken:
- tap/tun: drop short frames to prevent crashes later in the stack
- eth: ice: add a per-VF limit on number of FDIR filters
- af_unix: disable MSG_OOB handling for sockets in sockmap/sockhash"
* tag 'net-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (34 commits)
tun: add missing verification for short frame
tap: add missing verification for short frame
mISDN: Fix a use after free in hfcmulti_tx()
gve: Fix an edge case for TSO skb validity check
bnxt_en: update xdp_rxq_info in queue restart logic
tcp: process the 3rd ACK with sk_socket for TFO/MPTCP
selftests/bpf: Add XDP_UMEM_TX_METADATA_LEN to XSK TX metadata test
xsk: Require XDP_UMEM_TX_METADATA_LEN to actuate tx_metadata_len
bpf: Fix a segment issue when downgrading gso_size
net: mediatek: Fix potential NULL pointer dereference in dummy net_device handling
MAINTAINERS: make Breno the netconsole maintainer
MAINTAINERS: Update bonding entry
net: nexthop: Initialize all fields in dumped nexthops
net: stmmac: Correct byte order of perfect_match
selftests: forwarding: skip if kernel not support setting bridge fdb learning limit
tipc: Return non-zero value from tipc_udp_addr2str() on error
netfilter: nft_set_pipapo_avx2: disable softinterrupts
ice: Fix recipe read procedure
ice: Add a per-VF limit on number of FDIR filters
net: bonding: correctly annotate RCU in bond_should_notify_peers()
...
|
|
tipc_udp_addr2str() should return non-zero value if the UDP media
address is invalid. Otherwise, a buffer overflow access can occur in
tipc_media_addr_printf(). Fix this by returning 1 on an invalid UDP
media address.
Fixes: d0f91938bede ("tipc: add ip/udp media type")
Signed-off-by: Shigeru Yoshida <syoshida@redhat.com>
Reviewed-by: Tung Nguyen <tung.q.nguyen@endava.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull non-MM updates from Andrew Morton:
- In the series "treewide: Refactor heap related implementation",
Kuan-Wei Chiu has significantly reworked the min_heap library code
and has taught bcachefs to use the new more generic implementation.
- Yury Norov's series "Cleanup cpumask.h inclusion in core headers"
reworks the cpumask and nodemask headers to make things generally
more rational.
- Kuan-Wei Chiu has sent along some maintenance work against our
sorting library code in the series "lib/sort: Optimizations and
cleanups".
- More library maintainance work from Christophe Jaillet in the series
"Remove usage of the deprecated ida_simple_xx() API".
- Ryusuke Konishi continues with the nilfs2 fixes and clanups in the
series "nilfs2: eliminate the call to inode_attach_wb()".
- Kuan-Ying Lee has some fixes to the gdb scripts in the series "Fix
GDB command error".
- Plus the usual shower of singleton patches all over the place. Please
see the relevant changelogs for details.
* tag 'mm-nonmm-stable-2024-07-21-15-07' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (98 commits)
ia64: scrub ia64 from poison.h
watchdog/perf: properly initialize the turbo mode timestamp and rearm counter
tsacct: replace strncpy() with strscpy()
lib/bch.c: use swap() to improve code
test_bpf: convert comma to semicolon
init/modpost: conditionally check section mismatch to __meminit*
init: remove unused __MEMINIT* macros
nilfs2: Constify struct kobj_type
nilfs2: avoid undefined behavior in nilfs_cnt32_ge macro
math: rational: add missing MODULE_DESCRIPTION() macro
lib/zlib: add missing MODULE_DESCRIPTION() macro
fs: ufs: add MODULE_DESCRIPTION()
lib/rbtree.c: fix the example typo
ocfs2: add bounds checking to ocfs2_check_dir_entry()
fs: add kernel-doc comments to ocfs2_prepare_orphan_dir()
coredump: simplify zap_process()
selftests/fpu: add missing MODULE_DESCRIPTION() macro
compiler.h: simplify data_race() macro
build-id: require program headers to be right after ELF header
resource: add missing MODULE_DESCRIPTION()
...
|
|
link_is_up() and tipc_link_is_up() have the same functionality.
Consolidate these functions.
Signed-off-by: Shigeru Yoshida <syoshida@redhat.com>
Reviewed-by: Tung Nguyen <tung.q.nguyen@endava.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
struct tipc_name_table in core.h is not used. Remove this declaration.
Signed-off-by: Shigeru Yoshida <syoshida@redhat.com>
Reviewed-by: Tung Nguyen <tung.q.nguyen@endava.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
There were several instances of the string "assocat" in the kernel, which
should have been spelled "associat", with the various endings of -ive,
-ed, -ion, and sometimes beginnging with dis-.
Add to the spelling dictionary the corrections so that future instances
will be caught by checkpatch, and fix the instances found.
Originally noticed by accident with a 'git grep socat'.
Link: https://lkml.kernel.org/r/20240612001247.356867-1-jesse.brandeburg@intel.com
Signed-off-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
As it says in commit 3bc07321ccc2 ("xfrm: Force a dst refcount before
entering the xfrm type handlers"):
"Crypto requests might return asynchronous. In this case we leave the
rcu protected region, so force a refcount on the skb's destination
entry before we enter the xfrm type input/output handlers."
On TIPC decryption path it has the same problem, and skb_dst_force()
should be called before doing decryption to avoid a possible crash.
Shuang reported this issue when this warning is triggered:
[] WARNING: include/net/dst.h:337 tipc_sk_rcv+0x1055/0x1ea0 [tipc]
[] Kdump: loaded Tainted: G W --------- - - 4.18.0-496.el8.x86_64+debug
[] Workqueue: crypto cryptd_queue_worker
[] RIP: 0010:tipc_sk_rcv+0x1055/0x1ea0 [tipc]
[] Call Trace:
[] tipc_sk_mcast_rcv+0x548/0xea0 [tipc]
[] tipc_rcv+0xcf5/0x1060 [tipc]
[] tipc_aead_decrypt_done+0x215/0x2e0 [tipc]
[] cryptd_aead_crypt+0xdb/0x190
[] cryptd_queue_worker+0xed/0x190
[] process_one_work+0x93d/0x17e0
Fixes: fc1b6d6de220 ("tipc: introduce TIPC encryption & authentication")
Reported-by: Shuang Li <shuali@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Link: https://lore.kernel.org/r/fbe3195fad6997a4eec62d9bf076b2ad03ac336b.1718476040.git.lucien.xin@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
With the rework of how the __string() handles dynamic strings where it
saves off the source string in field in the helper structure[1], the
assignment of that value to the trace event field is stored in the helper
value and does not need to be passed in again.
This means that with:
__string(field, mystring)
Which use to be assigned with __assign_str(field, mystring), no longer
needs the second parameter and it is unused. With this, __assign_str()
will now only get a single parameter.
There's over 700 users of __assign_str() and because coccinelle does not
handle the TRACE_EVENT() macro I ended up using the following sed script:
git grep -l __assign_str | while read a ; do
sed -e 's/\(__assign_str([^,]*[^ ,]\) *,[^;]*/\1)/' $a > /tmp/test-file;
mv /tmp/test-file $a;
done
I then searched for __assign_str() that did not end with ';' as those
were multi line assignments that the sed script above would fail to catch.
Note, the same updates will need to be done for:
__assign_str_len()
__assign_rel_str()
__assign_rel_str_len()
I tested this with both an allmodconfig and an allyesconfig (build only for both).
[1] https://lore.kernel.org/linux-trace-kernel/20240222211442.634192653@goodmis.org/
Link: https://lore.kernel.org/linux-trace-kernel/20240516133454.681ba6a0@rorschach.local.home
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Julia Lawall <Julia.Lawall@inria.fr>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Acked-by: Jani Nikula <jani.nikula@intel.com>
Acked-by: Christian König <christian.koenig@amd.com> for the amdgpu parts.
Acked-by: Thomas Hellström <thomas.hellstrom@linux.intel.com> #for
Acked-by: Rafael J. Wysocki <rafael@kernel.org> # for thermal
Acked-by: Takashi Iwai <tiwai@suse.de>
Acked-by: Darrick J. Wong <djwong@kernel.org> # xfs
Tested-by: Guenter Roeck <linux@roeck-us.net>
|
|
Rather than pass in flags, error pointer, and whether this is a kernel
invocation or not, add a struct proto_accept_arg struct as the argument.
This then holds all of these arguments, and prepares accept for being
able to pass back more information.
No functional changes in this patch.
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
This commit comes at the tail end of a greater effort to remove the
empty elements at the end of the ctl_table arrays (sentinels) which
will reduce the overall build time size of the kernel and run time
memory bloat by ~64 bytes per sentinel (further information Link :
https://lore.kernel.org/all/ZO5Yx5JFogGi%2FcBo@bombadil.infradead.org/)
To avoid lots of small commits, this commit brings together network
changes from (as they appear in MAINTAINERS) LLC, MPTCP, NETROM NETWORK
LAYER, PHONET PROTOCOL, ROSE NETWORK LAYER, RXRPC SOCKETS, SCTP
PROTOCOL, SHARED MEMORY COMMUNICATIONS (SMC), TIPC NETWORK LAYER and
NETWORKING [IPSEC]
* Remove sentinel element from ctl_table structs.
* Replace empty array registration with the register_net_sysctl_sz call
in llc_sysctl_init
* Replace the for loop stop condition that tests for procname == NULL
with one that depends on array size in sctp_sysctl_net_register
* Remove instances where an array element is zeroed out to make it look
like a sentinel in xfrm_sysctl_init. This is not longer needed and is
safe after commit c899710fe7f9 ("networking: Update to
register_net_sysctl_sz") added the array size to the ctl_table
registration
* Use a table_size variable to keep the value of ARRAY_SIZE
Signed-off-by: Joel Granados <j.granados@samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Cross-merge networking fixes after downstream PR.
Conflicts:
include/linux/filter.h
kernel/bpf/core.c
66e13b615a0c ("bpf: verifier: prevent userspace memory access")
d503a04f8bc0 ("bpf: Add support for certain atomics in bpf_arena to x86 JIT")
https://lore.kernel.org/all/20240429114939.210328b0@canb.auug.org.au/
No adjacent changes.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
__skb_linearize() doesn't free the skb when it fails, so move
'*buf = NULL' after __skb_linearize(), so that the skb can be
freed on the err path.
Fixes: b7df21cf1b79 ("tipc: skb_linearize the head skb when reassembling msgs")
Reported-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Tung Nguyen <tung.q.nguyen@dektech.com.au>
Link: https://lore.kernel.org/r/90710748c29a1521efac4f75ea01b3b7e61414cf.1714485818.git.lucien.xin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Sam Page (sam4k) working with Trend Micro Zero Day Initiative reported
a UAF in the tipc_buf_append() error path:
BUG: KASAN: slab-use-after-free in kfree_skb_list_reason+0x47e/0x4c0
linux/net/core/skbuff.c:1183
Read of size 8 at addr ffff88804d2a7c80 by task poc/8034
CPU: 1 PID: 8034 Comm: poc Not tainted 6.8.2 #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
1.16.0-debian-1.16.0-5 04/01/2014
Call Trace:
<IRQ>
__dump_stack linux/lib/dump_stack.c:88
dump_stack_lvl+0xd9/0x1b0 linux/lib/dump_stack.c:106
print_address_description linux/mm/kasan/report.c:377
print_report+0xc4/0x620 linux/mm/kasan/report.c:488
kasan_report+0xda/0x110 linux/mm/kasan/report.c:601
kfree_skb_list_reason+0x47e/0x4c0 linux/net/core/skbuff.c:1183
skb_release_data+0x5af/0x880 linux/net/core/skbuff.c:1026
skb_release_all linux/net/core/skbuff.c:1094
__kfree_skb linux/net/core/skbuff.c:1108
kfree_skb_reason+0x12d/0x210 linux/net/core/skbuff.c:1144
kfree_skb linux/./include/linux/skbuff.h:1244
tipc_buf_append+0x425/0xb50 linux/net/tipc/msg.c:186
tipc_link_input+0x224/0x7c0 linux/net/tipc/link.c:1324
tipc_link_rcv+0x76e/0x2d70 linux/net/tipc/link.c:1824
tipc_rcv+0x45f/0x10f0 linux/net/tipc/node.c:2159
tipc_udp_recv+0x73b/0x8f0 linux/net/tipc/udp_media.c:390
udp_queue_rcv_one_skb+0xad2/0x1850 linux/net/ipv4/udp.c:2108
udp_queue_rcv_skb+0x131/0xb00 linux/net/ipv4/udp.c:2186
udp_unicast_rcv_skb+0x165/0x3b0 linux/net/ipv4/udp.c:2346
__udp4_lib_rcv+0x2594/0x3400 linux/net/ipv4/udp.c:2422
ip_protocol_deliver_rcu+0x30c/0x4e0 linux/net/ipv4/ip_input.c:205
ip_local_deliver_finish+0x2e4/0x520 linux/net/ipv4/ip_input.c:233
NF_HOOK linux/./include/linux/netfilter.h:314
NF_HOOK linux/./include/linux/netfilter.h:308
ip_local_deliver+0x18e/0x1f0 linux/net/ipv4/ip_input.c:254
dst_input linux/./include/net/dst.h:461
ip_rcv_finish linux/net/ipv4/ip_input.c:449
NF_HOOK linux/./include/linux/netfilter.h:314
NF_HOOK linux/./include/linux/netfilter.h:308
ip_rcv+0x2c5/0x5d0 linux/net/ipv4/ip_input.c:569
__netif_receive_skb_one_core+0x199/0x1e0 linux/net/core/dev.c:5534
__netif_receive_skb+0x1f/0x1c0 linux/net/core/dev.c:5648
process_backlog+0x101/0x6b0 linux/net/core/dev.c:5976
__napi_poll.constprop.0+0xba/0x550 linux/net/core/dev.c:6576
napi_poll linux/net/core/dev.c:6645
net_rx_action+0x95a/0xe90 linux/net/core/dev.c:6781
__do_softirq+0x21f/0x8e7 linux/kernel/softirq.c:553
do_softirq linux/kernel/softirq.c:454
do_softirq+0xb2/0xf0 linux/kernel/softirq.c:441
</IRQ>
<TASK>
__local_bh_enable_ip+0x100/0x120 linux/kernel/softirq.c:381
local_bh_enable linux/./include/linux/bottom_half.h:33
rcu_read_unlock_bh linux/./include/linux/rcupdate.h:851
__dev_queue_xmit+0x871/0x3ee0 linux/net/core/dev.c:4378
dev_queue_xmit linux/./include/linux/netdevice.h:3169
neigh_hh_output linux/./include/net/neighbour.h:526
neigh_output linux/./include/net/neighbour.h:540
ip_finish_output2+0x169f/0x2550 linux/net/ipv4/ip_output.c:235
__ip_finish_output linux/net/ipv4/ip_output.c:313
__ip_finish_output+0x49e/0x950 linux/net/ipv4/ip_output.c:295
ip_finish_output+0x31/0x310 linux/net/ipv4/ip_output.c:323
NF_HOOK_COND linux/./include/linux/netfilter.h:303
ip_output+0x13b/0x2a0 linux/net/ipv4/ip_output.c:433
dst_output linux/./include/net/dst.h:451
ip_local_out linux/net/ipv4/ip_output.c:129
ip_send_skb+0x3e5/0x560 linux/net/ipv4/ip_output.c:1492
udp_send_skb+0x73f/0x1530 linux/net/ipv4/udp.c:963
udp_sendmsg+0x1a36/0x2b40 linux/net/ipv4/udp.c:1250
inet_sendmsg+0x105/0x140 linux/net/ipv4/af_inet.c:850
sock_sendmsg_nosec linux/net/socket.c:730
__sock_sendmsg linux/net/socket.c:745
__sys_sendto+0x42c/0x4e0 linux/net/socket.c:2191
__do_sys_sendto linux/net/socket.c:2203
__se_sys_sendto linux/net/socket.c:2199
__x64_sys_sendto+0xe0/0x1c0 linux/net/socket.c:2199
do_syscall_x64 linux/arch/x86/entry/common.c:52
do_syscall_64+0xd8/0x270 linux/arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x6f/0x77 linux/arch/x86/entry/entry_64.S:120
RIP: 0033:0x7f3434974f29
Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48
89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d
01 f0 ff ff 73 01 c3 48 8b 0d 37 8f 0d 00 f7 d8 64 89 01 48
RSP: 002b:00007fff9154f2b8 EFLAGS: 00000212 ORIG_RAX: 000000000000002c
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f3434974f29
RDX: 00000000000032c8 RSI: 00007fff9154f300 RDI: 0000000000000003
RBP: 00007fff915532e0 R08: 00007fff91553360 R09: 0000000000000010
R10: 0000000000000000 R11: 0000000000000212 R12: 000055ed86d261d0
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
</TASK>
In the critical scenario, either the relevant skb is freed or its
ownership is transferred into a frag_lists. In both cases, the cleanup
code must not free it again: we need to clear the skb reference earlier.
Fixes: 1149557d64c9 ("tipc: eliminate unnecessary linearization of incoming buffers")
Cc: stable@vger.kernel.org
Reported-by: zdi-disclosures@trendmicro.com # ZDI-CAN-23852
Acked-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/752f1ccf762223d109845365d07f55414058e5a3.1714484273.git.pabeni@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
I added dst_rt6_info() in commit
e8dfd42c17fa ("ipv6: introduce dst_rt6_info() helper")
This patch does a similar change for IPv4.
Instead of (struct rtable *)dst casts, we can use :
#define dst_rtable(_ptr) \
container_of_const(_ptr, struct rtable, dst)
Patch is smaller than IPv6 one, because IPv4 has skb_rtable() helper.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Reviewed-by: Sabrina Dubroca <sd@queasysnail.net>
Link: https://lore.kernel.org/r/20240429133009.1227754-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Variable err is being assigned a zero value and it is never read
afterwards in either the break path or continue path, the assignment
is redundant and can be removed. With it removed, the if statement
can also be simplified.
Cleans up clang scan warning:
net/tipc/socket.c:3570:5: warning: Value stored to 'err' is never
read [deadcode.DeadStores]
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Link: https://lore.kernel.org/r/20240411091704.306752-1-colin.i.king@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Consolidate the error paths of tipc_nl_bearer_add() under the common label
if the function holds rtnl_lock.
Signed-off-by: Shigeru Yoshida <syoshida@redhat.com>
Reviewed-by: Tung Nguyen <tung.q.nguyen@dektech.com.au>
Link: https://lore.kernel.org/r/20240213134058.386123-1-syoshida@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Cross-merge networking fixes after downstream PR.
No conflicts.
Adjacent changes:
drivers/net/ethernet/stmicro/stmmac/common.h
38cc3c6dcc09 ("net: stmmac: protect updates of 64-bit statistics counters")
fd5a6a71313e ("net: stmmac: est: Per Tx-queue error count for HLBF")
c5c3e1bfc9e0 ("net: stmmac: Offload queueMaxSDU from tc-taprio")
drivers/net/wireless/microchip/wilc1000/netdev.c
c9013880284d ("wifi: fill in MODULE_DESCRIPTION()s for wilc1000")
328efda22af8 ("wifi: wilc1000: do not realloc workqueue everytime an interface is added")
net/unix/garbage.c
11498715f266 ("af_unix: Remove io_uring code for GC.")
1279f9d9dec2 ("af_unix: Call kfree_skb() for dead unix_(sk)->oob_skb in GC.")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The 'bool' is already specified for these options.
The second 'bool' under the help message is redundant.
While I am here, I moved 'default y' above, as it is common to place
the help text last.
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
It is not appropriate for TIPC to use "diag" as its diag module name
while the other protocols are using "$(protoname)_diag" like tcp_diag,
udp_diag and sctp_diag etc.
So this patch is to rename diag.ko to tipc_diag.ko in tipc's Makefile.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Reviewed-by: Tung Nguyen <tung.q.nguyen@dektech.com.au>
Link: https://lore.kernel.org/r/d909edeef072da1810bd5869fdbbfe84411efdb2.1706904669.git.lucien.xin@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
syzbot reported the following general protection fault [1]:
general protection fault, probably for non-canonical address 0xdffffc0000000010: 0000 [#1] PREEMPT SMP KASAN
KASAN: null-ptr-deref in range [0x0000000000000080-0x0000000000000087]
...
RIP: 0010:tipc_udp_is_known_peer+0x9c/0x250 net/tipc/udp_media.c:291
...
Call Trace:
<TASK>
tipc_udp_nl_bearer_add+0x212/0x2f0 net/tipc/udp_media.c:646
tipc_nl_bearer_add+0x21e/0x360 net/tipc/bearer.c:1089
genl_family_rcv_msg_doit+0x1fc/0x2e0 net/netlink/genetlink.c:972
genl_family_rcv_msg net/netlink/genetlink.c:1052 [inline]
genl_rcv_msg+0x561/0x800 net/netlink/genetlink.c:1067
netlink_rcv_skb+0x16b/0x440 net/netlink/af_netlink.c:2544
genl_rcv+0x28/0x40 net/netlink/genetlink.c:1076
netlink_unicast_kernel net/netlink/af_netlink.c:1341 [inline]
netlink_unicast+0x53b/0x810 net/netlink/af_netlink.c:1367
netlink_sendmsg+0x8b7/0xd70 net/netlink/af_netlink.c:1909
sock_sendmsg_nosec net/socket.c:730 [inline]
__sock_sendmsg+0xd5/0x180 net/socket.c:745
____sys_sendmsg+0x6ac/0x940 net/socket.c:2584
___sys_sendmsg+0x135/0x1d0 net/socket.c:2638
__sys_sendmsg+0x117/0x1e0 net/socket.c:2667
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0x40/0x110 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x63/0x6b
The cause of this issue is that when tipc_nl_bearer_add() is called with
the TIPC_NLA_BEARER_UDP_OPTS attribute, tipc_udp_nl_bearer_add() is called
even if the bearer is not UDP.
tipc_udp_is_known_peer() called by tipc_udp_nl_bearer_add() assumes that
the media_ptr field of the tipc_bearer has an udp_bearer type object, so
the function goes crazy for non-UDP bearers.
This patch fixes the issue by checking the bearer type before calling
tipc_udp_nl_bearer_add() in tipc_nl_bearer_add().
Fixes: ef20cd4dd163 ("tipc: introduce UDP replicast")
Reported-and-tested-by: syzbot+5142b87a9abc510e14fa@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=5142b87a9abc510e14fa [1]
Signed-off-by: Shigeru Yoshida <syoshida@redhat.com>
Reviewed-by: Tung Nguyen <tung.q.nguyen@dektech.com.au>
Link: https://lore.kernel.org/r/20240131152310.4089541-1-syoshida@redhat.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Remove 2 kernel-doc descriptions to squelch warnings:
node.c:150: warning: Excess struct member 'inputq' description in 'tipc_node'
node.c:150: warning: Excess struct member 'namedq' description in 'tipc_node'
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Jon Maloy <jmaloy@redhat.com>
Cc: Ying Xue <ying.xue@windriver.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240123051152.23684-1-rdunlap@infradead.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Remove a kernel-doc description to squelch a warning:
socket.c:143: warning: Excess struct member 'blocking_link' description in 'tipc_sock'
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Jon Maloy <jmaloy@redhat.com>
Cc: Ying Xue <ying.xue@windriver.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240123051201.24701-1-rdunlap@infradead.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Following patch is going to use RCU instead of
sock_diag_table_mutex acquisition.
This patch is a preparation, no change of behavior yet.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Guillaume Nault <gnault@redhat.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Remove documentation for nonexistent struct members, addressing these
warnings:
./net/tipc/link.c:228: warning: Excess struct member 'media_addr' description in 'tipc_link'
./net/tipc/link.c:228: warning: Excess struct member 'timer' description in 'tipc_link'
./net/tipc/link.c:228: warning: Excess struct member 'refcnt' description in 'tipc_link'
./net/tipc/link.c:228: warning: Excess struct member 'proto_msg' description in 'tipc_link'
./net/tipc/link.c:228: warning: Excess struct member 'pmsg' description in 'tipc_link'
./net/tipc/link.c:228: warning: Excess struct member 'backlog_limit' description in 'tipc_link'
./net/tipc/link.c:228: warning: Excess struct member 'exp_msg_count' description in 'tipc_link'
./net/tipc/link.c:228: warning: Excess struct member 'reset_rcv_checkpt' description in 'tipc_link'
./net/tipc/link.c:228: warning: Excess struct member 'transmitq' description in 'tipc_link'
./net/tipc/link.c:228: warning: Excess struct member 'snt_nxt' description in 'tipc_link'
./net/tipc/link.c:228: warning: Excess struct member 'deferred_queue' description in 'tipc_link'
./net/tipc/link.c:228: warning: Excess struct member 'unacked_window' description in 'tipc_link'
./net/tipc/link.c:228: warning: Excess struct member 'next_out' description in 'tipc_link'
./net/tipc/link.c:228: warning: Excess struct member 'long_msg_seq_no' description in 'tipc_link'
./net/tipc/link.c:228: warning: Excess struct member 'bc_rcvr' description in 'tipc_link'
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Cross-merge networking fixes after downstream PR.
Conflicts:
drivers/net/ethernet/intel/ice/ice_main.c
c9663f79cd82 ("ice: adjust switchdev rebuild path")
7758017911a4 ("ice: restore timestamp configuration after device reset")
https://lore.kernel.org/all/20231121211259.3348630-1-anthony.l.nguyen@intel.com/
Adjacent changes:
kernel/bpf/verifier.c
bb124da69c47 ("bpf: keep track of max number of bpf_loop callback iterations")
5f99f312bd3b ("bpf: add register bounds sanity checks and sanitization")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
W=1 builds now warn if module is built without a MODULE_DESCRIPTION().
Add descriptions to all the sock diag modules in one fell swoop.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The purpose of TLV_SPACE() is to add the TLV descriptor size to the size of
the TLV value passed as argument and align the resulting size to
TLV_ALIGNTO.
tipc_tlv_alloc() calls TLV_SPACE() on its argument. In other words,
tipc_tlv_alloc() takes its argument as the size of the TLV value. So the
call to TLV_SPACE() in tipc_get_err_tlv() is redundant. Let's remove this
redundancy.
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Shigeru Yoshida <syoshida@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
KMSAN reported the following kernel-infoleak issue:
=====================================================
BUG: KMSAN: kernel-infoleak in instrument_copy_to_user include/linux/instrumented.h:114 [inline]
BUG: KMSAN: kernel-infoleak in copy_to_user_iter lib/iov_iter.c:24 [inline]
BUG: KMSAN: kernel-infoleak in iterate_ubuf include/linux/iov_iter.h:29 [inline]
BUG: KMSAN: kernel-infoleak in iterate_and_advance2 include/linux/iov_iter.h:245 [inline]
BUG: KMSAN: kernel-infoleak in iterate_and_advance include/linux/iov_iter.h:271 [inline]
BUG: KMSAN: kernel-infoleak in _copy_to_iter+0x4ec/0x2bc0 lib/iov_iter.c:186
instrument_copy_to_user include/linux/instrumented.h:114 [inline]
copy_to_user_iter lib/iov_iter.c:24 [inline]
iterate_ubuf include/linux/iov_iter.h:29 [inline]
iterate_and_advance2 include/linux/iov_iter.h:245 [inline]
iterate_and_advance include/linux/iov_iter.h:271 [inline]
_copy_to_iter+0x4ec/0x2bc0 lib/iov_iter.c:186
copy_to_iter include/linux/uio.h:197 [inline]
simple_copy_to_iter net/core/datagram.c:532 [inline]
__skb_datagram_iter.5+0x148/0xe30 net/core/datagram.c:420
skb_copy_datagram_iter+0x52/0x210 net/core/datagram.c:546
skb_copy_datagram_msg include/linux/skbuff.h:3960 [inline]
netlink_recvmsg+0x43d/0x1630 net/netlink/af_netlink.c:1967
sock_recvmsg_nosec net/socket.c:1044 [inline]
sock_recvmsg net/socket.c:1066 [inline]
__sys_recvfrom+0x476/0x860 net/socket.c:2246
__do_sys_recvfrom net/socket.c:2264 [inline]
__se_sys_recvfrom net/socket.c:2260 [inline]
__x64_sys_recvfrom+0x130/0x200 net/socket.c:2260
do_syscall_x64 arch/x86/entry/common.c:51 [inline]
do_syscall_64+0x44/0x110 arch/x86/entry/common.c:82
entry_SYSCALL_64_after_hwframe+0x63/0x6b
Uninit was created at:
slab_post_alloc_hook+0x103/0x9e0 mm/slab.h:768
slab_alloc_node mm/slub.c:3478 [inline]
kmem_cache_alloc_node+0x5f7/0xb50 mm/slub.c:3523
kmalloc_reserve+0x13c/0x4a0 net/core/skbuff.c:560
__alloc_skb+0x2fd/0x770 net/core/skbuff.c:651
alloc_skb include/linux/skbuff.h:1286 [inline]
tipc_tlv_alloc net/tipc/netlink_compat.c:156 [inline]
tipc_get_err_tlv+0x90/0x5d0 net/tipc/netlink_compat.c:170
tipc_nl_compat_recv+0x1042/0x15d0 net/tipc/netlink_compat.c:1324
genl_family_rcv_msg_doit net/netlink/genetlink.c:972 [inline]
genl_family_rcv_msg net/netlink/genetlink.c:1052 [inline]
genl_rcv_msg+0x1220/0x12c0 net/netlink/genetlink.c:1067
netlink_rcv_skb+0x4a4/0x6a0 net/netlink/af_netlink.c:2545
genl_rcv+0x41/0x60 net/netlink/genetlink.c:1076
netlink_unicast_kernel net/netlink/af_netlink.c:1342 [inline]
netlink_unicast+0xf4b/0x1230 net/netlink/af_netlink.c:1368
netlink_sendmsg+0x1242/0x1420 net/netlink/af_netlink.c:1910
sock_sendmsg_nosec net/socket.c:730 [inline]
__sock_sendmsg net/socket.c:745 [inline]
____sys_sendmsg+0x997/0xd60 net/socket.c:2588
___sys_sendmsg+0x271/0x3b0 net/socket.c:2642
__sys_sendmsg net/socket.c:2671 [inline]
__do_sys_sendmsg net/socket.c:2680 [inline]
__se_sys_sendmsg net/socket.c:2678 [inline]
__x64_sys_sendmsg+0x2fa/0x4a0 net/socket.c:2678
do_syscall_x64 arch/x86/entry/common.c:51 [inline]
do_syscall_64+0x44/0x110 arch/x86/entry/common.c:82
entry_SYSCALL_64_after_hwframe+0x63/0x6b
Bytes 34-35 of 36 are uninitialized
Memory access of size 36 starts at ffff88802d464a00
Data copied to user address 00007ff55033c0a0
CPU: 0 PID: 30322 Comm: syz-executor.0 Not tainted 6.6.0-14500-g1c41041124bd #10
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc38 04/01/2014
=====================================================
tipc_add_tlv() puts TLV descriptor and value onto `skb`. This size is
calculated with TLV_SPACE() macro. It adds the size of struct tlv_desc and
the length of TLV value passed as an argument, and aligns the result to a
multiple of TLV_ALIGNTO, i.e., a multiple of 4 bytes.
If the size of struct tlv_desc plus the length of TLV value is not aligned,
the current implementation leaves the remaining bytes uninitialized. This
is the cause of the above kernel-infoleak issue.
This patch resolves this issue by clearing data up to an aligned size.
Fixes: d0796d1ef63d ("tipc: convert legacy nl bearer dump to nl compat")
Signed-off-by: Shigeru Yoshida <syoshida@redhat.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
syzbot reported the following uninit-value access issue [1]:
=====================================================
BUG: KMSAN: uninit-value in strlen lib/string.c:418 [inline]
BUG: KMSAN: uninit-value in strstr+0xb8/0x2f0 lib/string.c:756
strlen lib/string.c:418 [inline]
strstr+0xb8/0x2f0 lib/string.c:756
tipc_nl_node_reset_link_stats+0x3ea/0xb50 net/tipc/node.c:2595
genl_family_rcv_msg_doit net/netlink/genetlink.c:971 [inline]
genl_family_rcv_msg net/netlink/genetlink.c:1051 [inline]
genl_rcv_msg+0x11ec/0x1290 net/netlink/genetlink.c:1066
netlink_rcv_skb+0x371/0x650 net/netlink/af_netlink.c:2545
genl_rcv+0x40/0x60 net/netlink/genetlink.c:1075
netlink_unicast_kernel net/netlink/af_netlink.c:1342 [inline]
netlink_unicast+0xf47/0x1250 net/netlink/af_netlink.c:1368
netlink_sendmsg+0x1238/0x13d0 net/netlink/af_netlink.c:1910
sock_sendmsg_nosec net/socket.c:730 [inline]
sock_sendmsg net/socket.c:753 [inline]
____sys_sendmsg+0x9c2/0xd60 net/socket.c:2541
___sys_sendmsg+0x28d/0x3c0 net/socket.c:2595
__sys_sendmsg net/socket.c:2624 [inline]
__do_sys_sendmsg net/socket.c:2633 [inline]
__se_sys_sendmsg net/socket.c:2631 [inline]
__x64_sys_sendmsg+0x307/0x490 net/socket.c:2631
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
Uninit was created at:
slab_post_alloc_hook+0x12f/0xb70 mm/slab.h:767
slab_alloc_node mm/slub.c:3478 [inline]
kmem_cache_alloc_node+0x577/0xa80 mm/slub.c:3523
kmalloc_reserve+0x13d/0x4a0 net/core/skbuff.c:559
__alloc_skb+0x318/0x740 net/core/skbuff.c:650
alloc_skb include/linux/skbuff.h:1286 [inline]
netlink_alloc_large_skb net/netlink/af_netlink.c:1214 [inline]
netlink_sendmsg+0xb34/0x13d0 net/netlink/af_netlink.c:1885
sock_sendmsg_nosec net/socket.c:730 [inline]
sock_sendmsg net/socket.c:753 [inline]
____sys_sendmsg+0x9c2/0xd60 net/socket.c:2541
___sys_sendmsg+0x28d/0x3c0 net/socket.c:2595
__sys_sendmsg net/socket.c:2624 [inline]
__do_sys_sendmsg net/socket.c:2633 [inline]
__se_sys_sendmsg net/socket.c:2631 [inline]
__x64_sys_sendmsg+0x307/0x490 net/socket.c:2631
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
TIPC bearer-related names including link names must be null-terminated
strings. If a link name which is not null-terminated is passed through
netlink, strstr() and similar functions can cause buffer overrun. This
causes the above issue.
This patch changes the nla_policy for bearer-related names from NLA_STRING
to NLA_NUL_STRING. This resolves the issue by ensuring that only
null-terminated strings are accepted as bearer-related names.
syzbot reported similar uninit-value issue related to bearer names [2]. The
root cause of this issue is that a non-null-terminated bearer name was
passed. This patch also resolved this issue.
Fixes: 7be57fc69184 ("tipc: add link get/dump to new netlink api")
Fixes: 0655f6a8635b ("tipc: add bearer disable/enable to new netlink api")
Reported-and-tested-by: syzbot+5138ca807af9d2b42574@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=5138ca807af9d2b42574 [1]
Reported-and-tested-by: syzbot+9425c47dccbcb4c17d51@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=9425c47dccbcb4c17d51 [2]
Signed-off-by: Shigeru Yoshida <syoshida@redhat.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20231030075540.3784537-1-syoshida@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Cross-merge networking fixes after downstream PR.
No conflicts (or adjacent changes of note).
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
It seems that tipc_crypto_key_revoke() could be be invoked by
wokequeue tipc_crypto_work_rx() under process context and
timer/rx callback under softirq context, thus the lock acquisition
on &tx->lock seems better use spin_lock_bh() to prevent possible
deadlock.
This flaw was found by an experimental static analysis tool I am
developing for irq-related deadlock.
tipc_crypto_work_rx() <workqueue>
--> tipc_crypto_key_distr()
--> tipc_bcast_xmit()
--> tipc_bcbase_xmit()
--> tipc_bearer_bc_xmit()
--> tipc_crypto_xmit()
--> tipc_ehdr_build()
--> tipc_crypto_key_revoke()
--> spin_lock(&tx->lock)
<timer interrupt>
--> tipc_disc_timeout()
--> tipc_bearer_xmit_skb()
--> tipc_crypto_xmit()
--> tipc_ehdr_build()
--> tipc_crypto_key_revoke()
--> spin_lock(&tx->lock) <deadlock here>
Signed-off-by: Chengfeng Ye <dg573847474@gmail.com>
Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
Acked-by: Jon Maloy <jmaloy@redhat.com>
Fixes: fc1b6d6de220 ("tipc: introduce TIPC encryption & authentication")
Link: https://lore.kernel.org/r/20230927181414.59928-1-dg573847474@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
If, for any reason, the open-coded arithmetic causes a wraparound,
the protection that `struct_size()` adds against potential integer
overflows is defeated. Fix this by hardening call to `struct_size()`
with `size_add()`.
Fixes: e034c6d23bc4 ("tipc: Use struct_size() helper")
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
- Some swap cleanups from Ma Wupeng ("fix WARN_ON in
add_to_avail_list")
- Peter Xu has a series (mm/gup: Unify hugetlb, speed up thp") which
reduces the special-case code for handling hugetlb pages in GUP. It
also speeds up GUP handling of transparent hugepages.
- Peng Zhang provides some maple tree speedups ("Optimize the fast path
of mas_store()").
- Sergey Senozhatsky has improved te performance of zsmalloc during
compaction (zsmalloc: small compaction improvements").
- Domenico Cerasuolo has developed additional selftest code for zswap
("selftests: cgroup: add zswap test program").
- xu xin has doe some work on KSM's handling of zero pages. These
changes are mainly to enable the user to better understand the
effectiveness of KSM's treatment of zero pages ("ksm: support
tracking KSM-placed zero-pages").
- Jeff Xu has fixes the behaviour of memfd's
MEMFD_NOEXEC_SCOPE_NOEXEC_ENFORCED sysctl ("mm/memfd: fix sysctl
MEMFD_NOEXEC_SCOPE_NOEXEC_ENFORCED").
- David Howells has fixed an fscache optimization ("mm, netfs, fscache:
Stop read optimisation when folio removed from pagecache").
- Axel Rasmussen has given userfaultfd the ability to simulate memory
poisoning ("add UFFDIO_POISON to simulate memory poisoning with
UFFD").
- Miaohe Lin has contributed some routine maintenance work on the
memory-failure code ("mm: memory-failure: remove unneeded PageHuge()
check").
- Peng Zhang has contributed some maintenance work on the maple tree
code ("Improve the validation for maple tree and some cleanup").
- Hugh Dickins has optimized the collapsing of shmem or file pages into
THPs ("mm: free retracted page table by RCU").
- Jiaqi Yan has a patch series which permits us to use the healthy
subpages within a hardware poisoned huge page for general purposes
("Improve hugetlbfs read on HWPOISON hugepages").
- Kemeng Shi has done some maintenance work on the pagetable-check code
("Remove unused parameters in page_table_check").
- More folioification work from Matthew Wilcox ("More filesystem folio
conversions for 6.6"), ("Followup folio conversions for zswap"). And
from ZhangPeng ("Convert several functions in page_io.c to use a
folio").
- page_ext cleanups from Kemeng Shi ("minor cleanups for page_ext").
- Baoquan He has converted some architectures to use the
GENERIC_IOREMAP ioremap()/iounmap() code ("mm: ioremap: Convert
architectures to take GENERIC_IOREMAP way").
- Anshuman Khandual has optimized arm64 tlb shootdown ("arm64: support
batched/deferred tlb shootdown during page reclamation/migration").
- Better maple tree lockdep checking from Liam Howlett ("More strict
maple tree lockdep"). Liam also developed some efficiency
improvements ("Reduce preallocations for maple tree").
- Cleanup and optimization to the secondary IOMMU TLB invalidation,
from Alistair Popple ("Invalidate secondary IOMMU TLB on permission
upgrade").
- Ryan Roberts fixes some arm64 MM selftest issues ("selftests/mm fixes
for arm64").
- Kemeng Shi provides some maintenance work on the compaction code
("Two minor cleanups for compaction").
- Some reduction in mmap_lock pressure from Matthew Wilcox ("Handle
most file-backed faults under the VMA lock").
- Aneesh Kumar contributes code to use the vmemmap optimization for DAX
on ppc64, under some circumstances ("Add support for DAX vmemmap
optimization for ppc64").
- page-ext cleanups from Kemeng Shi ("add page_ext_data to get client
data in page_ext"), ("minor cleanups to page_ext header").
- Some zswap cleanups from Johannes Weiner ("mm: zswap: three
cleanups").
- kmsan cleanups from ZhangPeng ("minor cleanups for kmsan").
- VMA handling cleanups from Kefeng Wang ("mm: convert to
vma_is_initial_heap/stack()").
- DAMON feature work from SeongJae Park ("mm/damon/sysfs-schemes:
implement DAMOS tried total bytes file"), ("Extend DAMOS filters for
address ranges and DAMON monitoring targets").
- Compaction work from Kemeng Shi ("Fixes and cleanups to compaction").
- Liam Howlett has improved the maple tree node replacement code
("maple_tree: Change replacement strategy").
- ZhangPeng has a general code cleanup - use the K() macro more widely
("cleanup with helper macro K()").
- Aneesh Kumar brings memmap-on-memory to ppc64 ("Add support for
memmap on memory feature on ppc64").
- pagealloc cleanups from Kemeng Shi ("Two minor cleanups for pcp list
in page_alloc"), ("Two minor cleanups for get pageblock
migratetype").
- Vishal Moola introduces a memory descriptor for page table tracking,
"struct ptdesc" ("Split ptdesc from struct page").
- memfd selftest maintenance work from Aleksa Sarai ("memfd: cleanups
for vm.memfd_noexec").
- MM include file rationalization from Hugh Dickins ("arch: include
asm/cacheflush.h in asm/hugetlb.h").
- THP debug output fixes from Hugh Dickins ("mm,thp: fix sloppy text
output").
- kmemleak improvements from Xiaolei Wang ("mm/kmemleak: use
object_cache instead of kmemleak_initialized").
- More folio-related cleanups from Matthew Wilcox ("Remove _folio_dtor
and _folio_order").
- A VMA locking scalability improvement from Suren Baghdasaryan
("Per-VMA lock support for swap and userfaults").
- pagetable handling cleanups from Matthew Wilcox ("New page table
range API").
- A batch of swap/thp cleanups from David Hildenbrand ("mm/swap: stop
using page->private on tail pages for THP_SWAP + cleanups").
- Cleanups and speedups to the hugetlb fault handling from Matthew
Wilcox ("Change calling convention for ->huge_fault").
- Matthew Wilcox has also done some maintenance work on the MM
subsystem documentation ("Improve mm documentation").
* tag 'mm-stable-2023-08-28-18-26' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (489 commits)
maple_tree: shrink struct maple_tree
maple_tree: clean up mas_wr_append()
secretmem: convert page_is_secretmem() to folio_is_secretmem()
nios2: fix flush_dcache_page() for usage from irq context
hugetlb: add documentation for vma_kernel_pagesize()
mm: add orphaned kernel-doc to the rst files.
mm: fix clean_record_shared_mapping_range kernel-doc
mm: fix get_mctgt_type() kernel-doc
mm: fix kernel-doc warning from tlb_flush_rmaps()
mm: remove enum page_entry_size
mm: allow ->huge_fault() to be called without the mmap_lock held
mm: move PMD_ORDER to pgtable.h
mm: remove checks for pte_index
memcg: remove duplication detection for mem_cgroup_uncharge_swap
mm/huge_memory: work on folio->swap instead of page->private when splitting folio
mm/swap: inline folio_set_swap_entry() and folio_swap_entry()
mm/swap: use dedicated entry for swap in folio
mm/swap: stop using page->private on tail pages for THP_SWAP
selftests/mm: fix WARNING comparing pointer to 0
selftests: cgroup: fix test_kmem_memcg_deletion kernel mem check
...
|
|
Patch series "New page table range API", v6.
This patchset changes the API used by the MM to set up page table entries.
The four APIs are:
set_ptes(mm, addr, ptep, pte, nr)
update_mmu_cache_range(vma, addr, ptep, nr)
flush_dcache_folio(folio)
flush_icache_pages(vma, page, nr)
flush_dcache_folio() isn't technically new, but no architecture
implemented it, so I've done that for them. The old APIs remain around
but are mostly implemented by calling the new interfaces.
The new APIs are based around setting up N page table entries at once.
The N entries belong to the same PMD, the same folio and the same VMA, so
ptep++ is a legitimate operation, and locking is taken care of for you.
Some architectures can do a better job of it than just a loop, but I have
hesitated to make too deep a change to architectures I don't understand
well.
One thing I have changed in every architecture is that PG_arch_1 is now a
per-folio bit instead of a per-page bit when used for dcache clean/dirty
tracking. This was something that would have to happen eventually, and it
makes sense to do it now rather than iterate over every page involved in a
cache flush and figure out if it needs to happen.
The point of all this is better performance, and Fengwei Yin has measured
improvement on x86. I suspect you'll see improvement on your architecture
too. Try the new will-it-scale test mentioned here:
https://lore.kernel.org/linux-mm/20230206140639.538867-5-fengwei.yin@intel.com/
You'll need to run it on an XFS filesystem and have
CONFIG_TRANSPARENT_HUGEPAGE set.
This patchset is the basis for much of the anonymous large folio work
being done by Ryan, so it's received quite a lot of testing over the last
few months.
This patch (of 38):
Determine if a value lies within a range more efficiently (subtraction +
comparison vs two comparisons and an AND). It also has useful (under some
circumstances) behaviour if the range exceeds the maximum value of the
type. Convert all the conflicting definitions of in_range() within the
kernel; some can use the generic definition while others need their own
definition.
Link: https://lkml.kernel.org/r/20230802151406.3735276-1-willy@infradead.org
Link: https://lkml.kernel.org/r/20230802151406.3735276-2-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
Since dumps carry struct genl_info now, use the attrs pointer
from genl_info and remove the one in struct genl_dumpit_info.
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
Reviewed-by: Miquel Raynal <miquel.raynal@bootlin.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20230814214723.2924989-6-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Only three families use info->userhdr today and going forward
we discourage using fixed headers in new families.
So having the pointer to user header in struct genl_info
is an overkill. Compute the header pointer at runtime.
Reviewed-by: Johannes Berg <johannes@sipsolutions.net>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Aaron Conole <aconole@redhat.com>
Link: https://lore.kernel.org/r/20230814214723.2924989-4-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Commit 526669866140 ("tipc: let broadcast packet reception use new link receive function")
declared but never implemented this.
Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20230807142926.45752-1-yuehaibing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Commit d50ccc2d3909 ("tipc: add 128-bit node identifier") declared but never
implemented tipc_node_id2hash().
Also commit 5c216e1d28c8 ("tipc: Allow run-time alteration of default link settings")
never implemented tipc_media_set_priority() and tipc_media_set_window(),
commit cad2929dc432 ("tipc: update a binding service via broadcast") only declared
tipc_named_bcast().
Since commit be07f056396d ("tipc: simplify the finalize work queue")
tipc_sched_net_finalize() is removed and declaration is unused.
Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20230802034659.39840-1-yuehaibing@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
If tipc_link_bc_create() fails inside tipc_node_create() for a newly
allocated tipc node then we should stop its tipc crypto and free the
resources allocated with a call to tipc_crypto_start().
As the node ref is initialized to one to that point, just put the ref on
tipc_link_bc_create() error case that would lead to tipc_node_free() be
eventually executed and properly clean the node and its crypto resources.
Found by Linux Verification Center (linuxtesting.org).
Fixes: cb8092d70a6f ("tipc: move bc link creation back to tipc_node_create")
Suggested-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Fedor Pchelkin <pchelkin@ispras.ru>
Reviewed-by: Xin Long <lucien.xin@gmail.com>
Link: https://lore.kernel.org/r/20230725214628.25246-1-pchelkin@ispras.ru
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
goto free_skb if an unexpected result is returned by pskb_tirm()
in tipc_crypto_rcv_complete().
Fixes: fc1b6d6de220 ("tipc: introduce TIPC encryption & authentication")
Signed-off-by: Yuanjun Gong <ruc_gongyuanjun@163.com>
Reviewed-by: Tung Nguyen <tung.q.nguyen@dektech.com.au>
Link: https://lore.kernel.org/r/20230725064810.5820-1-ruc_gongyuanjun@163.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Remove ->sendpage() and ->sendpage_locked(). sendmsg() with
MSG_SPLICE_PAGES should be used instead. This allows multiple pages and
multipage folios to be passed through.
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Marc Kleine-Budde <mkl@pengutronix.de> # for net/can
cc: Jens Axboe <axboe@kernel.dk>
cc: Matthew Wilcox <willy@infradead.org>
cc: linux-afs@lists.infradead.org
cc: mptcp@lists.linux.dev
cc: rds-devel@oss.oracle.com
cc: tipc-discussion@lists.sourceforge.net
cc: virtualization@lists.linux-foundation.org
Link: https://lore.kernel.org/r/20230623225513.2732256-16-dhowells@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Cross-merge networking fixes after downstream PR.
Conflicts:
include/linux/mlx5/driver.h
617f5db1a626 ("RDMA/mlx5: Fix affinity assignment")
dc13180824b7 ("net/mlx5: Enable devlink port for embedded cpu VF vports")
https://lore.kernel.org/all/20230613125939.595e50b8@canb.auug.org.au/
tools/testing/selftests/net/mptcp/mptcp_join.sh
47867f0a7e83 ("selftests: mptcp: join: skip check if MIB counter not supported")
425ba803124b ("selftests: mptcp: join: support RM_ADDR for used endpoints or not")
45b1a1227a7a ("mptcp: introduces more address related mibs")
0639fa230a21 ("selftests: mptcp: add explicit check for new mibs")
https://lore.kernel.org/netdev/20230609-upstream-net-20230610-mptcp-selftests-support-old-kernels-part-3-v1-0-2896fe2ee8a3@tessares.net/
No adjacent changes.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
According to nla_parse_nested_deprecated(), the tb[] is supposed to the
destination array with maxtype+1 elements. In current
tipc_nl_media_get() and __tipc_nl_media_set(), a larger array is used
which is unnecessary. This patch resize them to a proper size.
Fixes: 1e55417d8fc6 ("tipc: add media set to new netlink api")
Fixes: 46f15c6794fb ("tipc: add media get/dump to new netlink api")
Signed-off-by: Lin Ma <linma@zju.edu.cn>
Reviewed-by: Florian Westphal <fw@strlen.de>
Reviewed-by: Tung Nguyen <tung.q.nguyen@dektech.com.au>
Link: https://lore.kernel.org/r/20230614120604.1196377-1-linma@zju.edu.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Replace these open-code bearer rcu_dereference access with bearer_get(),
like other places in bearer.c. While at it, also use tipc_net() instead
of net_generic(net, tipc_net_id) to get "tn" in bearer.c.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Reviewed-by: Larysa Zaremba <larysa.zaremba@intel.com>
Reviewed-by: Tung Nguyen <tung.q.nguyen@dektech.com.au>
Link: https://lore.kernel.org/r/1072588a8691f970bda950c7e2834d1f2983f58e.1685976044.git.lucien.xin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Since commit a4dfa72d0acd ("tipc: set default MTU for UDP media"), it's
been no longer using dev->mtu for b->mtu, and the issue described in
commit 3de81b758853 ("tipc: check minimum bearer MTU") doesn't exist
in UDP bearer any more.
Besides, dev->mtu can still be changed to a too small mtu after the UDP
bearer is created even with tipc_mtu_bad() check in tipc_udp_enable().
Note that NETDEV_CHANGEMTU event processing in tipc_l2_device_event()
doesn't really work for UDP bearer.
So this patch deletes the unnecessary tipc_mtu_bad from tipc_udp_enable.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Reviewed-by: Tung Nguyen <tung.q.nguyen@dektech.com.au>
Link: https://lore.kernel.org/r/282f1f5cc40e6cad385aa1c60569e6c5b70e2fb3.1685371933.git.lucien.xin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Checking the bearer min mtu with tipc_udp_mtu_bad() only works for
IPv4 UDP bearer, and IPv6 UDP bearer has a different value for the
min mtu. This patch checks with encap_hlen + TIPC_MIN_BEARER_MTU
for min mtu, which works for both IPv4 and IPv6 UDP bearer.
Note that tipc_udp_mtu_bad() is still used to check media min mtu
in __tipc_nl_media_set(), as m->mtu currently is only used by the
IPv4 UDP bearer as its default mtu value.
Fixes: 682cd3cf946b ("tipc: confgiure and apply UDP bearer MTU on running links")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When doing link mtu negotiation, a malicious peer may send Activate msg
with a very small mtu, e.g. 4 in Shuang's testing, without checking for
the minimum mtu, l->mtu will be set to 4 in tipc_link_proto_rcv(), then
n->links[bearer_id].mtu is set to 4294967228, which is a overflow of
'4 - INT_H_SIZE - EMSG_OVERHEAD' in tipc_link_mss().
With tipc_link.mtu = 4, tipc_link_xmit() kept printing the warning:
tipc: Too large msg, purging xmit list 1 5 0 40 4!
tipc: Too large msg, purging xmit list 1 15 0 60 4!
And with tipc_link_entry.mtu 4294967228, a huge skb was allocated in
named_distribute(), and when purging it in tipc_link_xmit(), a crash
was even caused:
general protection fault, probably for non-canonical address 0x2100001011000dd: 0000 [#1] PREEMPT SMP PTI
CPU: 0 PID: 0 Comm: swapper/0 Kdump: loaded Not tainted 6.3.0.neta #19
RIP: 0010:kfree_skb_list_reason+0x7e/0x1f0
Call Trace:
<IRQ>
skb_release_data+0xf9/0x1d0
kfree_skb_reason+0x40/0x100
tipc_link_xmit+0x57a/0x740 [tipc]
tipc_node_xmit+0x16c/0x5c0 [tipc]
tipc_named_node_up+0x27f/0x2c0 [tipc]
tipc_node_write_unlock+0x149/0x170 [tipc]
tipc_rcv+0x608/0x740 [tipc]
tipc_udp_recv+0xdc/0x1f0 [tipc]
udp_queue_rcv_one_skb+0x33e/0x620
udp_unicast_rcv_skb.isra.72+0x75/0x90
__udp4_lib_rcv+0x56d/0xc20
ip_protocol_deliver_rcu+0x100/0x2d0
This patch fixes it by checking the new mtu against tipc_bearer_min_mtu(),
and not updating mtu if it is too small.
Fixes: ed193ece2649 ("tipc: simplify link mtu negotiation")
Reported-by: Shuang Li <shuali@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
As different media may requires different min mtu, and even the
same media with different net family requires different min mtu,
add tipc_bearer_min_mtu() to calculate min mtu accordingly.
This API will be used to check the new mtu when doing the link
mtu negotiation in the next patch.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
__condition is evaluated twice in sk_wait_event() macro.
First invocation is lockless, and reads can race with writes,
as spotted by syzbot.
BUG: KCSAN: data-race in sk_stream_wait_connect / tcp_disconnect
write to 0xffff88812d83d6a0 of 4 bytes by task 9065 on cpu 1:
tcp_disconnect+0x2cd/0xdb0
inet_shutdown+0x19e/0x1f0 net/ipv4/af_inet.c:911
__sys_shutdown_sock net/socket.c:2343 [inline]
__sys_shutdown net/socket.c:2355 [inline]
__do_sys_shutdown net/socket.c:2363 [inline]
__se_sys_shutdown+0xf8/0x140 net/socket.c:2361
__x64_sys_shutdown+0x31/0x40 net/socket.c:2361
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
read to 0xffff88812d83d6a0 of 4 bytes by task 9040 on cpu 0:
sk_stream_wait_connect+0x1de/0x3a0 net/core/stream.c:75
tcp_sendmsg_locked+0x2e4/0x2120 net/ipv4/tcp.c:1266
tcp_sendmsg+0x30/0x50 net/ipv4/tcp.c:1484
inet6_sendmsg+0x63/0x80 net/ipv6/af_inet6.c:651
sock_sendmsg_nosec net/socket.c:724 [inline]
sock_sendmsg net/socket.c:747 [inline]
__sys_sendto+0x246/0x300 net/socket.c:2142
__do_sys_sendto net/socket.c:2154 [inline]
__se_sys_sendto net/socket.c:2150 [inline]
__x64_sys_sendto+0x78/0x90 net/socket.c:2150
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
value changed: 0x00000000 -> 0x00000068
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Jakub Kicinski:
"Core:
- Add dedicated kmem_cache for typical/small skb->head, avoid having
to access struct page at kfree time, and improve memory use.
- Introduce sysctl to set default RPS configuration for new netdevs.
- Define Netlink protocol specification format which can be used to
describe messages used by each family and auto-generate parsers.
Add tools for generating kernel data structures and uAPI headers.
- Expose all net/core sysctls inside netns.
- Remove 4s sleep in netpoll if carrier is instantly detected on
boot.
- Add configurable limit of MDB entries per port, and port-vlan.
- Continue populating drop reasons throughout the stack.
- Retire a handful of legacy Qdiscs and classifiers.
Protocols:
- Support IPv4 big TCP (TSO frames larger than 64kB).
- Add IP_LOCAL_PORT_RANGE socket option, to control local port range
on socket by socket basis.
- Track and report in procfs number of MPTCP sockets used.
- Support mixing IPv4 and IPv6 flows in the in-kernel MPTCP path
manager.
- IPv6: don't check net.ipv6.route.max_size and rely on garbage
collection to free memory (similarly to IPv4).
- Support Penultimate Segment Pop (PSP) flavor in SRv6 (RFC8986).
- ICMP: add per-rate limit counters.
- Add support for user scanning requests in ieee802154.
- Remove static WEP support.
- Support minimal Wi-Fi 7 Extremely High Throughput (EHT) rate
reporting.
- WiFi 7 EHT channel puncturing support (client & AP).
BPF:
- Add a rbtree data structure following the "next-gen data structure"
precedent set by recently added linked list, that is, by using
kfunc + kptr instead of adding a new BPF map type.
- Expose XDP hints via kfuncs with initial support for RX hash and
timestamp metadata.
- Add BPF_F_NO_TUNNEL_KEY extension to bpf_skb_set_tunnel_key to
better support decap on GRE tunnel devices not operating in collect
metadata.
- Improve x86 JIT's codegen for PROBE_MEM runtime error checks.
- Remove the need for trace_printk_lock for bpf_trace_printk and
bpf_trace_vprintk helpers.
- Extend libbpf's bpf_tracing.h support for tracing arguments of
kprobes/uprobes and syscall as a special case.
- Significantly reduce the search time for module symbols by
livepatch and BPF.
- Enable cpumasks to be used as kptrs, which is useful for tracing
programs tracking which tasks end up running on which CPUs in
different time intervals.
- Add support for BPF trampoline on s390x and riscv64.
- Add capability to export the XDP features supported by the NIC.
- Add __bpf_kfunc tag for marking kernel functions as kfuncs.
- Add cgroup.memory=nobpf kernel parameter option to disable BPF
memory accounting for container environments.
Netfilter:
- Remove the CLUSTERIP target. It has been marked as obsolete for
years, and we still have WARN splats wrt races of the out-of-band
/proc interface installed by this target.
- Add 'destroy' commands to nf_tables. They are identical to the
existing 'delete' commands, but do not return an error if the
referenced object (set, chain, rule...) did not exist.
Driver API:
- Improve cpumask_local_spread() locality to help NICs set the right
IRQ affinity on AMD platforms.
- Separate C22 and C45 MDIO bus transactions more clearly.
- Introduce new DCB table to control DSCP rewrite on egress.
- Support configuration of Physical Layer Collision Avoidance (PLCA)
Reconciliation Sublayer (RS) (802.3cg-2019). Modern version of
shared medium Ethernet.
- Support for MAC Merge layer (IEEE 802.3-2018 clause 99). Allowing
preemption of low priority frames by high priority frames.
- Add support for controlling MACSec offload using netlink SET.
- Rework devlink instance refcounts to allow registration and
de-registration under the instance lock. Split the code into
multiple files, drop some of the unnecessarily granular locks and
factor out common parts of netlink operation handling.
- Add TX frame aggregation parameters (for USB drivers).
- Add a new attr TCA_EXT_WARN_MSG to report TC (offload) warning
messages with notifications for debug.
- Allow offloading of UDP NEW connections via act_ct.
- Add support for per action HW stats in TC.
- Support hardware miss to TC action (continue processing in SW from
a specific point in the action chain).
- Warn if old Wireless Extension user space interface is used with
modern cfg80211/mac80211 drivers. Do not support Wireless
Extensions for Wi-Fi 7 devices at all. Everyone should switch to
using nl80211 interface instead.
- Improve the CAN bit timing configuration. Use extack to return
error messages directly to user space, update the SJW handling,
including the definition of a new default value that will benefit
CAN-FD controllers, by increasing their oscillator tolerance.
New hardware / drivers:
- Ethernet:
- nVidia BlueField-3 support (control traffic driver)
- Ethernet support for imx93 SoCs
- Motorcomm yt8531 gigabit Ethernet PHY
- onsemi NCN26000 10BASE-T1S PHY (with support for PLCA)
- Microchip LAN8841 PHY (incl. cable diagnostics and PTP)
- Amlogic gxl MDIO mux
- WiFi:
- RealTek RTL8188EU (rtl8xxxu)
- Qualcomm Wi-Fi 7 devices (ath12k)
- CAN:
- Renesas R-Car V4H
Drivers:
- Bluetooth:
- Set Per Platform Antenna Gain (PPAG) for Intel controllers.
- Ethernet NICs:
- Intel (1G, igc):
- support TSN / Qbv / packet scheduling features of i226 model
- Intel (100G, ice):
- use GNSS subsystem instead of TTY
- multi-buffer XDP support
- extend support for GPIO pins to E823 devices
- nVidia/Mellanox:
- update the shared buffer configuration on PFC commands
- implement PTP adjphase function for HW offset control
- TC support for Geneve and GRE with VF tunnel offload
- more efficient crypto key management method
- multi-port eswitch support
- Netronome/Corigine:
- add DCB IEEE support
- support IPsec offloading for NFP3800
- Freescale/NXP (enetc):
- support XDP_REDIRECT for XDP non-linear buffers
- improve reconfig, avoid link flap and waiting for idle
- support MAC Merge layer
- Other NICs:
- sfc/ef100: add basic devlink support for ef100
- ionic: rx_push mode operation (writing descriptors via MMIO)
- bnxt: use the auxiliary bus abstraction for RDMA
- r8169: disable ASPM and reset bus in case of tx timeout
- cpsw: support QSGMII mode for J721e CPSW9G
- cpts: support pulse-per-second output
- ngbe: add an mdio bus driver
- usbnet: optimize usbnet_bh() by avoiding unnecessary queuing
- r8152: handle devices with FW with NCM support
- amd-xgbe: support 10Mbps, 2.5GbE speeds and rx-adaptation
- virtio-net: support multi buffer XDP
- virtio/vsock: replace virtio_vsock_pkt with sk_buff
- tsnep: XDP support
- Ethernet high-speed switches:
- nVidia/Mellanox (mlxsw):
- add support for latency TLV (in FW control messages)
- Microchip (sparx5):
- separate explicit and implicit traffic forwarding rules, make
the implicit rules always active
- add support for egress DSCP rewrite
- IS0 VCAP support (Ingress Classification)
- IS2 VCAP filters (protos, L3 addrs, L4 ports, flags, ToS
etc.)
- ES2 VCAP support (Egress Access Control)
- support for Per-Stream Filtering and Policing (802.1Q,
8.6.5.1)
- Ethernet embedded switches:
- Marvell (mv88e6xxx):
- add MAB (port auth) offload support
- enable PTP receive for mv88e6390
- NXP (ocelot):
- support MAC Merge layer
- support for the the vsc7512 internal copper phys
- Microchip:
- lan9303: convert to PHYLINK
- lan966x: support TC flower filter statistics
- lan937x: PTP support for KSZ9563/KSZ8563 and LAN937x
- lan937x: support Credit Based Shaper configuration
- ksz9477: support Energy Efficient Ethernet
- other:
- qca8k: convert to regmap read/write API, use bulk operations
- rswitch: Improve TX timestamp accuracy
- Intel WiFi (iwlwifi):
- EHT (Wi-Fi 7) rate reporting
- STEP equalizer support: transfer some STEP (connection to radio
on platforms with integrated wifi) related parameters from the
BIOS to the firmware.
- Qualcomm 802.11ax WiFi (ath11k):
- IPQ5018 support
- Fine Timing Measurement (FTM) responder role support
- channel 177 support
- MediaTek WiFi (mt76):
- per-PHY LED support
- mt7996: EHT (Wi-Fi 7) support
- Wireless Ethernet Dispatch (WED) reset support
- switch to using page pool allocator
- RealTek WiFi (rtw89):
- support new version of Bluetooth co-existance
- Mobile:
- rmnet: support TX aggregation"
* tag 'net-next-6.3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1872 commits)
page_pool: add a comment explaining the fragment counter usage
net: ethtool: fix __ethtool_dev_mm_supported() implementation
ethtool: pse-pd: Fix double word in comments
xsk: add linux/vmalloc.h to xsk.c
sefltests: netdevsim: wait for devlink instance after netns removal
selftest: fib_tests: Always cleanup before exit
net/mlx5e: Align IPsec ASO result memory to be as required by hardware
net/mlx5e: TC, Set CT miss to the specific ct action instance
net/mlx5e: Rename CHAIN_TO_REG to MAPPED_OBJ_TO_REG
net/mlx5: Refactor tc miss handling to a single function
net/mlx5: Kconfig: Make tc offload depend on tc skb extension
net/sched: flower: Support hardware miss to tc action
net/sched: flower: Move filter handle initialization earlier
net/sched: cls_api: Support hardware miss to tc action
net/sched: Rename user cookie and act cookie
sfc: fix builds without CONFIG_RTC_LIB
sfc: clean up some inconsistent indentings
net/mlx4_en: Introduce flexible array to silence overflow warning
net: lan966x: Fix possible deadlock inside PTP
net/ulp: Remove redundant ->clone() test in inet_clone_ulp().
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6
Pull crypto update from Herbert Xu:
"API:
- Use kmap_local instead of kmap_atomic
- Change request callback to take void pointer
- Print FIPS status in /proc/crypto (when enabled)
Algorithms:
- Add rfc4106/gcm support on arm64
- Add ARIA AVX2/512 support on x86
Drivers:
- Add TRNG driver for StarFive SoC
- Delete ux500/hash driver (subsumed by stm32/hash)
- Add zlib support in qat
- Add RSA support in aspeed"
* tag 'v6.3-p1' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6: (156 commits)
crypto: x86/aria-avx - Do not use avx2 instructions
crypto: aspeed - Fix modular aspeed-acry
crypto: hisilicon/qm - fix coding style issues
crypto: hisilicon/qm - update comments to match function
crypto: hisilicon/qm - change function names
crypto: hisilicon/qm - use min() instead of min_t()
crypto: hisilicon/qm - remove some unused defines
crypto: proc - Print fips status
crypto: crypto4xx - Call dma_unmap_page when done
crypto: octeontx2 - Fix objects shared between several modules
crypto: nx - Fix sparse warnings
crypto: ecc - Silence sparse warning
tls: Pass rec instead of aead_req into tls_encrypt_done
crypto: api - Remove completion function scaffolding
tls: Remove completion function scaffolding
tipc: Remove completion function scaffolding
net: ipv6: Remove completion function scaffolding
net: ipv4: Remove completion function scaffolding
net: macsec: Remove completion function scaffolding
dm: Remove completion function scaffolding
...
|
|
Some of the devlink bits were tricky, but I think I got it right.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When sending a SYN message, this kernel stack trace is observed:
...
[ 13.396352] RIP: 0010:_copy_from_iter+0xb4/0x550
...
[ 13.398494] Call Trace:
[ 13.398630] <TASK>
[ 13.398630] ? __alloc_skb+0xed/0x1a0
[ 13.398630] tipc_msg_build+0x12c/0x670 [tipc]
[ 13.398630] ? shmem_add_to_page_cache.isra.71+0x151/0x290
[ 13.398630] __tipc_sendmsg+0x2d1/0x710 [tipc]
[ 13.398630] ? tipc_connect+0x1d9/0x230 [tipc]
[ 13.398630] ? __local_bh_enable_ip+0x37/0x80
[ 13.398630] tipc_connect+0x1d9/0x230 [tipc]
[ 13.398630] ? __sys_connect+0x9f/0xd0
[ 13.398630] __sys_connect+0x9f/0xd0
[ 13.398630] ? preempt_count_add+0x4d/0xa0
[ 13.398630] ? fpregs_assert_state_consistent+0x22/0x50
[ 13.398630] __x64_sys_connect+0x16/0x20
[ 13.398630] do_syscall_64+0x42/0x90
[ 13.398630] entry_SYSCALL_64_after_hwframe+0x63/0xcd
It is because commit a41dad905e5a ("iov_iter: saner checks for attempt
to copy to/from iterator") has introduced sanity check for copying
from/to iov iterator. Lacking of copy direction from the iterator
viewpoint would lead to kernel stack trace like above.
This commit fixes this issue by initializing the iov iterator with
the correct copy direction when sending SYN or ACK without data.
Fixes: f25dcc7687d4 ("tipc: tipc ->sendmsg() conversion")
Reported-by: syzbot+d43608d061e8847ec9f3@syzkaller.appspotmail.com
Acked-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: Tung Nguyen <tung.q.nguyen@dektech.com.au>
Link: https://lore.kernel.org/r/20230214012606.5804-1-tung.q.nguyen@dektech.com.au
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This patch removes the temporary scaffolding now that the comletion
function signature has been converted.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
This patch adds temporary scaffolding so that the Crypto API
completion function can take a void * instead of crypto_async_request.
Once affected users have been converted this can be removed.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
|
|
Move string_is_valid() to the header for wider use.
While at it, rename to string_is_terminated() to be
precise about its semantics.
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20230208133153.22528-1-andriy.shevchenko@linux.intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
As suggested by Cong, introduce a tracepoint for all ->sk_data_ready()
callback implementations. For example:
<...>
iperf-609 [002] ..... 70.660425: sk_data_ready: family=2 protocol=6 func=sock_def_readable
iperf-609 [002] ..... 70.660436: sk_data_ready: family=2 protocol=6 func=sock_def_readable
<...>
Suggested-by: Cong Wang <cong.wang@bytedance.com>
Signed-off-by: Peilin Ye <peilin.ye@bytedance.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This unexpected behavior is observed:
node 1 | node 2
------ | ------
link is established | link is established
reboot | link is reset
up | send discovery message
receive discovery message |
link is established | link is established
send discovery message |
| receive discovery message
| link is reset (unexpected)
| send reset message
link is reset |
It is due to delayed re-discovery as described in function
tipc_node_check_dest(): "this link endpoint has already reset
and re-established contact with the peer, before receiving a
discovery message from that node."
However, commit 598411d70f85 has changed the condition for calling
tipc_node_link_down() which was the acceptance of new media address.
This commit fixes this by restoring the old and correct behavior.
Fixes: 598411d70f85 ("tipc: make resetting of links non-atomic")
Acked-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: Tung Nguyen <tung.q.nguyen@dektech.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Due to several bugs caused by timers being re-armed after they are
shutdown and just before they are freed, a new state of timers was added
called "shutdown". After a timer is set to this state, then it can no
longer be re-armed.
The following script was run to find all the trivial locations where
del_timer() or del_timer_sync() is called in the same function that the
object holding the timer is freed. It also ignores any locations where
the timer->function is modified between the del_timer*() and the free(),
as that is not considered a "trivial" case.
This was created by using a coccinelle script and the following
commands:
$ cat timer.cocci
@@
expression ptr, slab;
identifier timer, rfield;
@@
(
- del_timer(&ptr->timer);
+ timer_shutdown(&ptr->timer);
|
- del_timer_sync(&ptr->timer);
+ timer_shutdown_sync(&ptr->timer);
)
... when strict
when != ptr->timer
(
kfree_rcu(ptr, rfield);
|
kmem_cache_free(slab, ptr);
|
kfree(ptr);
)
$ spatch timer.cocci . > /tmp/t.patch
$ patch -p1 < /tmp/t.patch
Link: https://lore.kernel.org/lkml/20221123201306.823305113@linutronix.de/
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Acked-by: Pavel Machek <pavel@ucw.cz> [ LED ]
Acked-by: Kalle Valo <kvalo@kernel.org> [ wireless ]
Acked-by: Paolo Abeni <pabeni@redhat.com> [ networking ]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs
Pull iov_iter updates from Al Viro:
"iov_iter work; most of that is about getting rid of direction
misannotations and (hopefully) preventing more of the same for the
future"
* tag 'pull-iov_iter' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
use less confusing names for iov_iter direction initializers
iov_iter: saner checks for attempt to copy to/from iterator
[xen] fix "direction" argument of iov_iter_kvec()
[vhost] fix 'direction' argument of iov_iter_{init,bvec}()
[target] fix iov_iter_bvec() "direction" argument
[s390] memcpy_real(): WRITE is "data source", not destination...
[s390] zcore: WRITE is "data source", not destination...
[infiniband] READ is "data destination", not source...
[fsi] WRITE is "data source", not destination...
[s390] copy_oldmem_kernel() - WRITE is "data source", not destination
csum_and_copy_to_iter(): handle ITER_DISCARD
get rid of unlikely() on page_copy_sane() calls
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/crng/random
Pull random number generator updates from Jason Donenfeld:
- Replace prandom_u32_max() and various open-coded variants of it,
there is now a new family of functions that uses fast rejection
sampling to choose properly uniformly random numbers within an
interval:
get_random_u32_below(ceil) - [0, ceil)
get_random_u32_above(floor) - (floor, U32_MAX]
get_random_u32_inclusive(floor, ceil) - [floor, ceil]
Coccinelle was used to convert all current users of
prandom_u32_max(), as well as many open-coded patterns, resulting in
improvements throughout the tree.
I'll have a "late" 6.1-rc1 pull for you that removes the now unused
prandom_u32_max() function, just in case any other trees add a new
use case of it that needs to converted. According to linux-next,
there may be two trivial cases of prandom_u32_max() reintroductions
that are fixable with a 's/.../.../'. So I'll have for you a final
conversion patch doing that alongside the removal patch during the
second week.
This is a treewide change that touches many files throughout.
- More consistent use of get_random_canary().
- Updates to comments, documentation, tests, headers, and
simplification in configuration.
- The arch_get_random*_early() abstraction was only used by arm64 and
wasn't entirely useful, so this has been replaced by code that works
in all relevant contexts.
- The kernel will use and manage random seeds in non-volatile EFI
variables, refreshing a variable with a fresh seed when the RNG is
initialized. The RNG GUID namespace is then hidden from efivarfs to
prevent accidental leakage.
These changes are split into random.c infrastructure code used in the
EFI subsystem, in this pull request, and related support inside of
EFISTUB, in Ard's EFI tree. These are co-dependent for full
functionality, but the order of merging doesn't matter.
- Part of the infrastructure added for the EFI support is also used for
an improvement to the way vsprintf initializes its siphash key,
replacing an sleep loop wart.
- The hardware RNG framework now always calls its correct random.c
input function, add_hwgenerator_randomness(), rather than sometimes
going through helpers better suited for other cases.
- The add_latent_entropy() function has long been called from the fork
handler, but is a no-op when the latent entropy gcc plugin isn't
used, which is fine for the purposes of latent entropy.
But it was missing out on the cycle counter that was also being mixed
in beside the latent entropy variable. So now, if the latent entropy
gcc plugin isn't enabled, add_latent_entropy() will expand to a call
to add_device_randomness(NULL, 0), which adds a cycle counter,
without the absent latent entropy variable.
- The RNG is now reseeded from a delayed worker, rather than on demand
when used. Always running from a worker allows it to make use of the
CPU RNG on platforms like S390x, whose instructions are too slow to
do so from interrupts. It also has the effect of adding in new inputs
more frequently with more regularity, amounting to a long term
transcript of random values. Plus, it helps a bit with the upcoming
vDSO implementation (which isn't yet ready for 6.2).
- The jitter entropy algorithm now tries to execute on many different
CPUs, round-robining, in hopes of hitting even more memory latencies
and other unpredictable effects. It also will mix in a cycle counter
when the entropy timer fires, in addition to being mixed in from the
main loop, to account more explicitly for fluctuations in that timer
firing. And the state it touches is now kept within the same cache
line, so that it's assured that the different execution contexts will
cause latencies.
* tag 'random-6.2-rc1-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/crng/random: (23 commits)
random: include <linux/once.h> in the right header
random: align entropy_timer_state to cache line
random: mix in cycle counter when jitter timer fires
random: spread out jitter callback to different CPUs
random: remove extraneous period and add a missing one in comments
efi: random: refresh non-volatile random seed when RNG is initialized
vsprintf: initialize siphash key using notifier
random: add back async readiness notifier
random: reseed in delayed work rather than on-demand
random: always mix cycle counter in add_latent_entropy()
hw_random: use add_hwgenerator_randomness() for early entropy
random: modernize documentation comment on get_random_bytes()
random: adjust comment to account for removed function
random: remove early archrandom abstraction
random: use random.trust_{bootloader,cpu} command line option only
stackprotector: actually use get_random_canary()
stackprotector: move get_random_canary() into stackprotector.h
treewide: use get_random_u32_inclusive() when possible
treewide: use get_random_u32_{above,below}() instead of manual loop
treewide: use get_random_u32_below() instead of deprecated function
...
|
|
When sending packets between nodes in netns, it calls tipc_lxc_xmit() for
peer node to receive the packets where tipc_sk_mcast_rcv()/tipc_sk_rcv()
might be called, and it's pretty much like in tipc_rcv().
Currently the local 'node rw lock' is held during calling tipc_lxc_xmit()
to protect the peer_net not being freed by another thread. However, when
receiving these packets, tipc_node_add_conn() might be called where the
peer 'node rw lock' is acquired. Then a dead lock warning is triggered by
lockdep detector, although it is not a real dead lock:
WARNING: possible recursive locking detected
--------------------------------------------
conn_server/1086 is trying to acquire lock:
ffff8880065cb020 (&n->lock#2){++--}-{2:2}, \
at: tipc_node_add_conn.cold.76+0xaa/0x211 [tipc]
but task is already holding lock:
ffff8880065cd020 (&n->lock#2){++--}-{2:2}, \
at: tipc_node_xmit+0x285/0xb30 [tipc]
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0
----
lock(&n->lock#2);
lock(&n->lock#2);
*** DEADLOCK ***
May be due to missing lock nesting notation
4 locks held by conn_server/1086:
#0: ffff8880036d1e40 (sk_lock-AF_TIPC){+.+.}-{0:0}, \
at: tipc_accept+0x9c0/0x10b0 [tipc]
#1: ffff8880036d5f80 (sk_lock-AF_TIPC/1){+.+.}-{0:0}, \
at: tipc_accept+0x363/0x10b0 [tipc]
#2: ffff8880065cd020 (&n->lock#2){++--}-{2:2}, \
at: tipc_node_xmit+0x285/0xb30 [tipc]
#3: ffff888012e13370 (slock-AF_TIPC){+...}-{2:2}, \
at: tipc_sk_rcv+0x2da/0x1b40 [tipc]
Call Trace:
<TASK>
dump_stack_lvl+0x44/0x5b
__lock_acquire.cold.77+0x1f2/0x3d7
lock_acquire+0x1d2/0x610
_raw_write_lock_bh+0x38/0x80
tipc_node_add_conn.cold.76+0xaa/0x211 [tipc]
tipc_sk_finish_conn+0x21e/0x640 [tipc]
tipc_sk_filter_rcv+0x147b/0x3030 [tipc]
tipc_sk_rcv+0xbb4/0x1b40 [tipc]
tipc_lxc_xmit+0x225/0x26b [tipc]
tipc_node_xmit.cold.82+0x4a/0x102 [tipc]
__tipc_sendstream+0x879/0xff0 [tipc]
tipc_accept+0x966/0x10b0 [tipc]
do_accept+0x37d/0x590
This patch avoids this warning by not holding the 'node rw lock' before
calling tipc_lxc_xmit(). As to protect the 'peer_net', rcu_read_lock()
should be enough, as in cleanup_net() when freeing the netns, it calls
synchronize_rcu() before the free is continued.
Also since tipc_lxc_xmit() is like the RX path in tipc_rcv(), it makes
sense to call it under rcu_read_lock(). Note that the right lock order
must be:
rcu_read_lock();
tipc_node_read_lock(n);
tipc_node_read_unlock(n);
tipc_lxc_xmit();
rcu_read_unlock();
instead of:
tipc_node_read_lock(n);
rcu_read_lock();
tipc_node_read_unlock(n);
tipc_lxc_xmit();
rcu_read_unlock();
and we have to call tipc_node_read_lock/unlock() twice in
tipc_node_xmit().
Fixes: f73b12812a3d ("tipc: improve throughput between nodes in netns")
Reported-by: Shuang Li <shuali@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Link: https://lore.kernel.org/r/5bdd1f8fee9db695cfff4528a48c9b9d0523fb00.1670110641.git.lucien.xin@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
Fix the potential risk of OOB if skb_linearize() fails in
tipc_link_proto_rcv().
Fixes: 5cbb28a4bf65 ("tipc: linearize arriving NAME_DISTR and LINK_PROTO buffers")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Link: https://lore.kernel.org/r/20221203094635.29024-1-yuehaibing@huawei.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
As the call trace shows, the original skb was freed in tipc_msg_validate(),
and dereferencing the old skb cb would cause an use-after-free crash.
BUG: KASAN: use-after-free in tipc_crypto_rcv_complete+0x1835/0x2240 [tipc]
Call Trace:
<IRQ>
tipc_crypto_rcv_complete+0x1835/0x2240 [tipc]
tipc_crypto_rcv+0xd32/0x1ec0 [tipc]
tipc_rcv+0x744/0x1150 [tipc]
...
Allocated by task 47078:
kmem_cache_alloc_node+0x158/0x4d0
__alloc_skb+0x1c1/0x270
tipc_buf_acquire+0x1e/0xe0 [tipc]
tipc_msg_create+0x33/0x1c0 [tipc]
tipc_link_build_proto_msg+0x38a/0x2100 [tipc]
tipc_link_timeout+0x8b8/0xef0 [tipc]
tipc_node_timeout+0x2a1/0x960 [tipc]
call_timer_fn+0x2d/0x1c0
...
Freed by task 47078:
tipc_msg_validate+0x7b/0x440 [tipc]
tipc_crypto_rcv_complete+0x4b5/0x2240 [tipc]
tipc_crypto_rcv+0xd32/0x1ec0 [tipc]
tipc_rcv+0x744/0x1150 [tipc]
This patch fixes it by re-fetching the skb cb from the new allocated skb
after calling tipc_msg_validate().
Fixes: fc1b6d6de220 ("tipc: introduce TIPC encryption & authentication")
Reported-by: Shuang Li <shuali@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Link: https://lore.kernel.org/r/1b1cdba762915325bd8ef9a98d0276eb673df2a5.1669398403.git.lucien.xin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
READ/WRITE proved to be actively confusing - the meanings are
"data destination, as used with read(2)" and "data source, as
used with write(2)", but people keep interpreting those as
"we read data from it" and "we write data to it", i.e. exactly
the wrong way.
Call them ITER_DEST and ITER_SOURCE - at least that is harder
to misinterpret...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
If skb_linearize() fails in tipc_disc_rcv(), we need to free the skb instead of
handle it.
Fixes: 25b0b9c4e835 ("tipc: handle collisions of 32-bit node address hash values")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Jon Maloy <jmaloy@redhat.com>
Link: https://lore.kernel.org/r/20221119072832.7896-1-yuehaibing@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
One extra conn_get() is needed in tipc_conn_alloc(), as after
tipc_conn_alloc() is called, tipc_conn_close() may free this
con before deferencing it in tipc_topsrv_accept():
tipc_conn_alloc();
newsk = newsock->sk;
<---- tipc_conn_close();
write_lock_bh(&sk->sk_callback_lock);
newsk->sk_data_ready = tipc_conn_data_ready;
Then an uaf issue can be triggered:
BUG: KASAN: use-after-free in tipc_topsrv_accept+0x1e7/0x370 [tipc]
Call Trace:
<TASK>
dump_stack_lvl+0x33/0x46
print_report+0x178/0x4b0
kasan_report+0x8c/0x100
kasan_check_range+0x179/0x1e0
tipc_topsrv_accept+0x1e7/0x370 [tipc]
process_one_work+0x6a3/0x1030
worker_thread+0x8a/0xdf0
This patch fixes it by holding it in tipc_conn_alloc(), then after
all accessing in tipc_topsrv_accept() releasing it. Note when does
this in tipc_topsrv_kern_subscr(), as tipc_conn_rcv_sub() returns
0 or -1 only, we don't need to check for "> 0".
Fixes: c5fa7b3cf3cb ("tipc: introduce new TIPC server infrastructure")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
A crash was reported by Wei Chen:
BUG: kernel NULL pointer dereference, address: 0000000000000018
RIP: 0010:tipc_conn_close+0x12/0x100
Call Trace:
tipc_topsrv_exit_net+0x139/0x320
ops_exit_list.isra.9+0x49/0x80
cleanup_net+0x31a/0x540
process_one_work+0x3fa/0x9f0
worker_thread+0x42/0x5c0
It was caused by !con->sock in tipc_conn_close(). In tipc_topsrv_accept(),
con is allocated in conn_idr then its sock is set:
con = tipc_conn_alloc();
... <----[1]
con->sock = newsock;
If tipc_conn_close() is called in anytime of [1], the null-pointer-def
is triggered by con->sock->sk due to con->sock is not yet set.
This patch fixes it by moving the con->sock setting to tipc_conn_alloc()
under s->idr_lock. So that con->sock can never be NULL when getting the
con from s->conn_idr. It will be also safer to move con->server and flag
CF_CONNECTED setting under s->idr_lock, as they should all be set before
tipc_conn_alloc() is called.
Fixes: c5fa7b3cf3cb ("tipc: introduce new TIPC server infrastructure")
Reported-by: Wei Chen <harperchen1110@gmail.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This is a simple mechanical transformation done by:
@@
expression E;
@@
- prandom_u32_max
+ get_random_u32_below
(E)
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Darrick J. Wong <djwong@kernel.org> # for xfs
Reviewed-by: SeongJae Park <sj@kernel.org> # for damon
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com> # for infiniband
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> # for arm
Acked-by: Ulf Hansson <ulf.hansson@linaro.org> # for mmc
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
This is a follow-up for commit 974cb0e3e7c9 ("tipc: fix uninit-value
in tipc_nl_compat_name_table_dump") where it should have type casted
sizeof(..) to int to work when TLV_GET_DATA_LEN() returns a negative
value.
syzbot reported a call trace because of it:
BUG: KMSAN: uninit-value in ...
tipc_nl_compat_name_table_dump+0x841/0xea0 net/tipc/netlink_compat.c:934
__tipc_nl_compat_dumpit+0xab2/0x1320 net/tipc/netlink_compat.c:238
tipc_nl_compat_dumpit+0x991/0xb50 net/tipc/netlink_compat.c:321
tipc_nl_compat_recv+0xb6e/0x1640 net/tipc/netlink_compat.c:1324
genl_family_rcv_msg_doit net/netlink/genetlink.c:731 [inline]
genl_family_rcv_msg net/netlink/genetlink.c:775 [inline]
genl_rcv_msg+0x103f/0x1260 net/netlink/genetlink.c:792
netlink_rcv_skb+0x3a5/0x6c0 net/netlink/af_netlink.c:2501
genl_rcv+0x3c/0x50 net/netlink/genetlink.c:803
netlink_unicast_kernel net/netlink/af_netlink.c:1319 [inline]
netlink_unicast+0xf3b/0x1270 net/netlink/af_netlink.c:1345
netlink_sendmsg+0x1288/0x1440 net/netlink/af_netlink.c:1921
sock_sendmsg_nosec net/socket.c:714 [inline]
sock_sendmsg net/socket.c:734 [inline]
Reported-by: syzbot+e5dbaaa238680ce206ea@syzkaller.appspotmail.com
Fixes: 974cb0e3e7c9 ("tipc: fix uninit-value in tipc_nl_compat_name_table_dump")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Link: https://lore.kernel.org/r/ccd6a7ea801b15aec092c3b532a883b4c5708695.1667594933.git.lucien.xin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
syzbot found a crash in tipc_topsrv_accept:
KASAN: null-ptr-deref in range [0x0000000000000008-0x000000000000000f]
Workqueue: tipc_rcv tipc_topsrv_accept
RIP: 0010:kernel_accept+0x22d/0x350 net/socket.c:3487
Call Trace:
<TASK>
tipc_topsrv_accept+0x197/0x280 net/tipc/topsrv.c:460
process_one_work+0x991/0x1610 kernel/workqueue.c:2289
worker_thread+0x665/0x1080 kernel/workqueue.c:2436
kthread+0x2e4/0x3a0 kernel/kthread.c:376
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:306
It was caused by srv->listener that might be set to null by
tipc_topsrv_stop() in net .exit whereas it's still used in
tipc_topsrv_accept() worker.
srv->listener is protected by srv->idr_lock in tipc_topsrv_stop(), so add
a check for srv->listener under srv->idr_lock in tipc_topsrv_accept() to
avoid the null-ptr-deref. To ensure the lsock is not released during the
tipc_topsrv_accept(), move sock_release() after tipc_topsrv_work_stop()
where it's waiting until the tipc_topsrv_accept worker to be done.
Note that sk_callback_lock is used to protect sk->sk_user_data instead of
srv->listener, and it should check srv in tipc_topsrv_listener_data_ready()
instead. This also ensures that no more tipc_topsrv_accept worker will be
started after tipc_conn_close() is called in tipc_topsrv_stop() where it
sets sk->sk_user_data to null.
Fixes: 0ef897be12b8 ("tipc: separate topology server listener socket from subcsriber sockets")
Reported-by: syzbot+c5ce866a8d30f4be0651@syzkaller.appspotmail.com
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Jon Maloy <jmaloy@redhat.com>
Link: https://lore.kernel.org/r/4eee264380c409c61c6451af1059b7fb271a7e7b.1666120790.git.lucien.xin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
"Including fixes from netfilter.
Current release - regressions:
- revert "net: fix cpu_max_bits_warn() usage in
netif_attrmask_next{,_and}"
- revert "net: sched: fq_codel: remove redundant resource cleanup in
fq_codel_init()"
- dsa: uninitialized variable in dsa_slave_netdevice_event()
- eth: sunhme: uninitialized variable in happy_meal_init()
Current release - new code bugs:
- eth: octeontx2: fix resource not freed after malloc
Previous releases - regressions:
- sched: fix return value of qdisc ingress handling on success
- sched: fix race condition in qdisc_graft()
- udp: update reuse->has_conns under reuseport_lock.
- tls: strp: make sure the TCP skbs do not have overlapping data
- hsr: avoid possible NULL deref in skb_clone()
- tipc: fix an information leak in tipc_topsrv_kern_subscr
- phylink: add mac_managed_pm in phylink_config structure
- eth: i40e: fix DMA mappings leak
- eth: hyperv: fix a RX-path warning
- eth: mtk: fix memory leaks
Previous releases - always broken:
- sched: cake: fix null pointer access issue when cake_init() fails"
* tag 'net-6.1-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (43 commits)
net: phy: dp83822: disable MDI crossover status change interrupt
net: sched: fix race condition in qdisc_graft()
net: hns: fix possible memory leak in hnae_ae_register()
wwan_hwsim: fix possible memory leak in wwan_hwsim_dev_new()
sfc: include vport_id in filter spec hash and equal()
genetlink: fix kdoc warnings
selftests: add selftest for chaining of tc ingress handling to egress
net: Fix return value of qdisc ingress handling on success
net: sched: sfb: fix null pointer access issue when sfb_init() fails
Revert "net: sched: fq_codel: remove redundant resource cleanup in fq_codel_init()"
net: sched: cake: fix null pointer access issue when cake_init() fails
ethernet: marvell: octeontx2 Fix resource not freed after malloc
netfilter: nf_tables: relax NFTA_SET_ELEM_KEY_END set flags requirements
netfilter: rpfilter/fib: Set ->flowic_uid correctly for user namespaces.
ionic: catch NULL pointer issue on reconfig
net: hsr: avoid possible NULL deref in skb_clone()
bnxt_en: fix memory leak in bnxt_nvm_test()
ip6mr: fix UAF issue in ip6mr_sk_done() when addrconf_init_net() failed
udp: Update reuse->has_conns under reuseport_lock.
net: ethernet: mediatek: ppe: Remove the unused function mtk_foe_entry_usable()
...
|
|
Use a 8-byte write to initialize sub.usr_handle in
tipc_topsrv_kern_subscr(), otherwise four bytes remain uninitialized
when issuing setsockopt(..., SOL_TIPC, ...).
This resulted in an infoleak reported by KMSAN when the packet was
received:
=====================================================
BUG: KMSAN: kernel-infoleak in copyout+0xbc/0x100 lib/iov_iter.c:169
instrument_copy_to_user ./include/linux/instrumented.h:121
copyout+0xbc/0x100 lib/iov_iter.c:169
_copy_to_iter+0x5c0/0x20a0 lib/iov_iter.c:527
copy_to_iter ./include/linux/uio.h:176
simple_copy_to_iter+0x64/0xa0 net/core/datagram.c:513
__skb_datagram_iter+0x123/0xdc0 net/core/datagram.c:419
skb_copy_datagram_iter+0x58/0x200 net/core/datagram.c:527
skb_copy_datagram_msg ./include/linux/skbuff.h:3903
packet_recvmsg+0x521/0x1e70 net/packet/af_packet.c:3469
____sys_recvmsg+0x2c4/0x810 net/socket.c:?
___sys_recvmsg+0x217/0x840 net/socket.c:2743
__sys_recvmsg net/socket.c:2773
__do_sys_recvmsg net/socket.c:2783
__se_sys_recvmsg net/socket.c:2780
__x64_sys_recvmsg+0x364/0x540 net/socket.c:2780
do_syscall_x64 arch/x86/entry/common.c:50
do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd arch/x86/entry/entry_64.S:120
...
Uninit was stored to memory at:
tipc_sub_subscribe+0x42d/0xb50 net/tipc/subscr.c:156
tipc_conn_rcv_sub+0x246/0x620 net/tipc/topsrv.c:375
tipc_topsrv_kern_subscr+0x2e8/0x400 net/tipc/topsrv.c:579
tipc_group_create+0x4e7/0x7d0 net/tipc/group.c:190
tipc_sk_join+0x2a8/0x770 net/tipc/socket.c:3084
tipc_setsockopt+0xae5/0xe40 net/tipc/socket.c:3201
__sys_setsockopt+0x87f/0xdc0 net/socket.c:2252
__do_sys_setsockopt net/socket.c:2263
__se_sys_setsockopt net/socket.c:2260
__x64_sys_setsockopt+0xe0/0x160 net/socket.c:2260
do_syscall_x64 arch/x86/entry/common.c:50
do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd arch/x86/entry/entry_64.S:120
Local variable sub created at:
tipc_topsrv_kern_subscr+0x57/0x400 net/tipc/topsrv.c:562
tipc_group_create+0x4e7/0x7d0 net/tipc/group.c:190
Bytes 84-87 of 88 are uninitialized
Memory access of size 88 starts at ffff88801ed57cd0
Data copied to user address 0000000020000400
...
=====================================================
Signed-off-by: Alexander Potapenko <glider@google.com>
Fixes: 026321c6d056a5 ("tipc: rename tipc_server to tipc_topsrv")
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The trial period exists until jiffies is after addr_trial_end. But as
jiffies will eventually overflow, just using time_after will eventually
give incorrect results. As the node address is set once the trial period
ends, this can be used to know that we are not in the trial period.
Fixes: e415577f57f4 ("tipc: correct discovery message handling during address trial period")
Signed-off-by: Mark Tomlinson <mark.tomlinson@alliedtelesis.co.nz>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Rather than incurring a division or requesting too many random bytes for
the given range, use the prandom_u32_max() function, which only takes
the minimum required bytes from the RNG and avoids divisions. This was
done mechanically with this coccinelle script:
@basic@
expression E;
type T;
identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32";
typedef u64;
@@
(
- ((T)get_random_u32() % (E))
+ prandom_u32_max(E)
|
- ((T)get_random_u32() & ((E) - 1))
+ prandom_u32_max(E * XXX_MAKE_SURE_E_IS_POW2)
|
- ((u64)(E) * get_random_u32() >> 32)
+ prandom_u32_max(E)
|
- ((T)get_random_u32() & ~PAGE_MASK)
+ prandom_u32_max(PAGE_SIZE)
)
@multi_line@
identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32";
identifier RAND;
expression E;
@@
- RAND = get_random_u32();
... when != RAND
- RAND %= (E);
+ RAND = prandom_u32_max(E);
// Find a potential literal
@literal_mask@
expression LITERAL;
type T;
identifier get_random_u32 =~ "get_random_int|prandom_u32|get_random_u32";
position p;
@@
((T)get_random_u32()@p & (LITERAL))
// Add one to the literal.
@script:python add_one@
literal << literal_mask.LITERAL;
RESULT;
@@
value = None
if literal.startswith('0x'):
value = int(literal, 16)
elif literal[0] in '123456789':
value = int(literal, 10)
if value is None:
print("I don't know how to handle %s" % (literal))
cocci.include_match(False)
elif value == 2**32 - 1 or value == 2**31 - 1 or value == 2**24 - 1 or value == 2**16 - 1 or value == 2**8 - 1:
print("Skipping 0x%x for cleanup elsewhere" % (value))
cocci.include_match(False)
elif value & (value + 1) != 0:
print("Skipping 0x%x because it's not a power of two minus one" % (value))
cocci.include_match(False)
elif literal.startswith('0x'):
coccinelle.RESULT = cocci.make_expr("0x%x" % (value + 1))
else:
coccinelle.RESULT = cocci.make_expr("%d" % (value + 1))
// Replace the literal mask with the calculated result.
@plus_one@
expression literal_mask.LITERAL;
position literal_mask.p;
expression add_one.RESULT;
identifier FUNC;
@@
- (FUNC()@p & (LITERAL))
+ prandom_u32_max(RESULT)
@collapse_ret@
type T;
identifier VAR;
expression E;
@@
{
- T VAR;
- VAR = (E);
- return VAR;
+ return E;
}
@drop_var@
type T;
identifier VAR;
@@
{
- T VAR;
... when != VAR
}
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Yury Norov <yury.norov@gmail.com>
Reviewed-by: KP Singh <kpsingh@kernel.org>
Reviewed-by: Jan Kara <jack@suse.cz> # for ext4 and sbitmap
Reviewed-by: Christoph Böhmwalder <christoph.boehmwalder@linbit.com> # for drbd
Acked-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Heiko Carstens <hca@linux.ibm.com> # for s390
Acked-by: Ulf Hansson <ulf.hansson@linaro.org> # for mmc
Acked-by: Darrick J. Wong <djwong@kernel.org> # for xfs
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
|
|
After commit 09b5678c778f("tipc: remove dead code in tipc_net and relatives"),
struct distr_queue_item is not used any more and can be removed as well.
Signed-off-by: Yuan Can <yuancan@huawei.com>
Acked-by: Jon Maloy <jmaloy@redhat.com>
Link: https://lore.kernel.org/r/20220928085636.71749-1-yuancan@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
drivers/net/ethernet/freescale/fec.h
7d650df99d52 ("net: fec: add pm_qos support on imx6q platform")
40c79ce13b03 ("net: fec: add stop mode support for imx8 platform")
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
There is a shift wrapping bug in this code so anything thing above
31 will return false.
Fixes: 35c55c9877f8 ("tipc: add neighbor monitoring framework")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
We had historically not checked that genlmsghdr.reserved
is 0 on input which prevents us from using those precious
bytes in the future.
One use case would be to extend the cmd field, which is
currently just 8 bits wide and 256 is not a lot of commands
for some core families.
To make sure that new families do the right thing by default
put the onus of opting out of validation on existing families.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Paul Moore <paul@paul-moore.com> (NetLabel)
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
No conflicts.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
While reading these sysctl variables, they can be changed concurrently.
Thus, we need to add READ_ONCE() to their readers.
- .sysctl_rmem
- .sysctl_rwmem
- .sysctl_rmem_offset
- .sysctl_wmem_offset
- sysctl_tcp_rmem[1, 2]
- sysctl_tcp_wmem[1, 2]
- sysctl_decnet_rmem[1]
- sysctl_decnet_wmem[1]
- sysctl_tipc_rmem[1]
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
drivers/net/ethernet/microchip/sparx5/sparx5_switchdev.c
9c5de246c1db ("net: sparx5: mdb add/del handle non-sparx5 devices")
fbb89d02e33a ("net: sparx5: Allow mdb entries to both CPU and ports")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Free sk in case tipc_sk_insert() fails.
Signed-off-by: Hangyu Hua <hbh25y@gmail.com>
Reviewed-by: Tung Nguyen <tung.q.nguyen@dektech.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Shuang Li reported a NULL pointer dereference crash:
[] BUG: kernel NULL pointer dereference, address: 0000000000000068
[] RIP: 0010:tipc_link_is_up+0x5/0x10 [tipc]
[] Call Trace:
[] <IRQ>
[] tipc_bcast_rcv+0xa2/0x190 [tipc]
[] tipc_node_bc_rcv+0x8b/0x200 [tipc]
[] tipc_rcv+0x3af/0x5b0 [tipc]
[] tipc_udp_recv+0xc7/0x1e0 [tipc]
It was caused by the 'l' passed into tipc_bcast_rcv() is NULL. When it
creates a node in tipc_node_check_dest(), after inserting the new node
into hashtable in tipc_node_create(), it creates the bc link. However,
there is a gap between this insert and bc link creation, a bc packet
may come in and get the node from the hashtable then try to dereference
its bc link, which is NULL.
This patch is to fix it by moving the bc link creation before inserting
into the hashtable.
Note that for a preliminary node becoming "real", the bc link creation
should also be called before it's rehashed, as we don't create it for
preliminary nodes.
Fixes: 4cbf8ac2fe5a ("tipc: enable creating a "preliminary" node")
Reported-by: Shuang Li <shuali@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
No conflicts.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
tipc_dest_list_len() is not being called anywhere. Clean it up.
Acked-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: Hoang Le <hoang.h.le@dektech.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
syzbot found the following issue on:
==================================================================
BUG: KASAN: use-after-free in tipc_named_reinit+0x94f/0x9b0
net/tipc/name_distr.c:413
Read of size 8 at addr ffff88805299a000 by task kworker/1:9/23764
CPU: 1 PID: 23764 Comm: kworker/1:9 Not tainted
5.18.0-rc4-syzkaller-00878-g17d49e6e8012 #0
Hardware name: Google Compute Engine/Google Compute Engine,
BIOS Google 01/01/2011
Workqueue: events tipc_net_finalize_work
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xcd/0x134 lib/dump_stack.c:106
print_address_description.constprop.0.cold+0xeb/0x495
mm/kasan/report.c:313
print_report mm/kasan/report.c:429 [inline]
kasan_report.cold+0xf4/0x1c6 mm/kasan/report.c:491
tipc_named_reinit+0x94f/0x9b0 net/tipc/name_distr.c:413
tipc_net_finalize+0x234/0x3d0 net/tipc/net.c:138
process_one_work+0x996/0x1610 kernel/workqueue.c:2289
worker_thread+0x665/0x1080 kernel/workqueue.c:2436
kthread+0x2e9/0x3a0 kernel/kthread.c:376
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:298
</TASK>
[...]
==================================================================
In the commit
d966ddcc3821 ("tipc: fix a deadlock when flushing scheduled work"),
the cancel_work_sync() function just to make sure ONLY the work
tipc_net_finalize_work() is executing/pending on any CPU completed before
tipc namespace is destroyed through tipc_exit_net(). But this function
is not guaranteed the work is the last queued. So, the destroyed instance
may be accessed in the work which will try to enqueue later.
In order to completely fix, we re-order the calling of cancel_work_sync()
to make sure the work tipc_net_finalize_work() was last queued and it
must be completed by calling cancel_work_sync().
Reported-by: syzbot+47af19f3307fc9c5c82e@syzkaller.appspotmail.com
Fixes: d966ddcc3821 ("tipc: fix a deadlock when flushing scheduled work")
Acked-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Hoang Le <hoang.h.le@dektech.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Netdev reference helpers have a dev_ prefix for historic
reasons. Renaming the old helpers would be too much churn
but we can rename the tracking ones which are relatively
recent and should be the default for new code.
Rename:
dev_hold_track() -> netdev_hold()
dev_put_track() -> netdev_put()
dev_replace_track() -> netdev_ref_replace()
Link: https://lore.kernel.org/r/20220608043955.919359-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
syzbot reported uninit-value:
=====================================================
BUG: KMSAN: uninit-value in string_nocheck lib/vsprintf.c:644 [inline]
BUG: KMSAN: uninit-value in string+0x4f9/0x6f0 lib/vsprintf.c:725
string_nocheck lib/vsprintf.c:644 [inline]
string+0x4f9/0x6f0 lib/vsprintf.c:725
vsnprintf+0x2222/0x3650 lib/vsprintf.c:2806
vprintk_store+0x537/0x2150 kernel/printk/printk.c:2158
vprintk_emit+0x28b/0xab0 kernel/printk/printk.c:2256
vprintk_default+0x86/0xa0 kernel/printk/printk.c:2283
vprintk+0x15f/0x180 kernel/printk/printk_safe.c:50
_printk+0x18d/0x1cf kernel/printk/printk.c:2293
tipc_enable_bearer net/tipc/bearer.c:371 [inline]
__tipc_nl_bearer_enable+0x2022/0x22a0 net/tipc/bearer.c:1033
tipc_nl_bearer_enable+0x6c/0xb0 net/tipc/bearer.c:1042
genl_family_rcv_msg_doit net/netlink/genetlink.c:731 [inline]
- Do sanity check the attribute length for TIPC_NLA_BEARER_NAME.
- Do not use 'illegal name' in printing message.
Reported-by: syzbot+e820fdc8ce362f2dea51@syzkaller.appspotmail.com
Fixes: cb30a63384bc ("tipc: refactor function tipc_enable_bearer()")
Acked-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: Hoang Le <hoang.h.le@dektech.com.au>
Link: https://lore.kernel.org/r/20220602063053.5892-1-hoang.h.le@dektech.com.au
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Merge in overtime fixes, no conflicts.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In the timer callback function tipc_sk_timeout(), we're trying to
reschedule another timeout to retransmit a setup request if destination
link is congested. But we use the incorrect timeout value
(msecs_to_jiffies(100)) instead of (jiffies + msecs_to_jiffies(100)),
so that the timer expires immediately, it's irrelevant for original
description.
In this commit we correct the timeout value in sk_reset_timer()
Fixes: 6787927475e5 ("tipc: buffer overflow handling in listener socket")
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Hoang Le <hoang.h.le@dektech.com.au>
Link: https://lore.kernel.org/r/20220321042229.314288-1-hoang.h.le@dektech.com.au
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
|
|
net/dsa/dsa2.c
commit afb3cc1a397d ("net: dsa: unlock the rtnl_mutex when dsa_master_setup() fails")
commit e83d56537859 ("net: dsa: replay master state events in dsa_tree_{setup,teardown}_master")
https://lore.kernel.org/all/20220307101436.7ae87da0@canb.auug.org.au/
drivers/net/ethernet/intel/ice/ice.h
commit 97b0129146b1 ("ice: Fix error with handling of bonding MTU")
commit 43113ff73453 ("ice: add TTY for GNSS module for E810T device")
https://lore.kernel.org/all/20220310112843.3233bcf1@canb.auug.org.au/
drivers/staging/gdm724x/gdm_lte.c
commit fc7f750dc9d1 ("staging: gdm724x: fix use after free in gdm_lte_rx()")
commit 4bcc4249b4cf ("staging: Use netif_rx().")
https://lore.kernel.org/all/20220308111043.1018a59d@canb.auug.org.au/
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When receiving a state message, function tipc_link_validate_msg()
is called to validate its header portion. Then, its data portion
is validated before it can be accessed correctly. However, current
data sanity check is done after the message header is accessed to
update some link variables.
This commit fixes this issue by moving the data sanity check to
the beginning of state message handling and right after the header
sanity check.
Fixes: 9aa422ad3266 ("tipc: improve size validations for received domain records")
Acked-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: Tung Nguyen <tung.q.nguyen@dektech.com.au>
Link: https://lore.kernel.org/r/20220308021200.9245-1-tung.q.nguyen@dektech.com.au
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Since commit
baebdf48c3600 ("net: dev: Makes sure netif_rx() can be invoked in any context.")
the function netif_rx() can be used in preemptible/thread context as
well as in interrupt context.
Use netif_rx().
Cc: Jon Maloy <jmaloy@redhat.com>
Cc: Ying Xue <ying.xue@windriver.com>
Cc: tipc-discussion@lists.sourceforge.net
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When enabling a bearer on a node, a kernel panic is observed:
[ 4.498085] RIP: 0010:tipc_mon_prep+0x4e/0x130 [tipc]
...
[ 4.520030] Call Trace:
[ 4.520689] <IRQ>
[ 4.521236] tipc_link_build_proto_msg+0x375/0x750 [tipc]
[ 4.522654] tipc_link_build_state_msg+0x48/0xc0 [tipc]
[ 4.524034] __tipc_node_link_up+0xd7/0x290 [tipc]
[ 4.525292] tipc_rcv+0x5da/0x730 [tipc]
[ 4.526346] ? __netif_receive_skb_core+0xb7/0xfc0
[ 4.527601] tipc_l2_rcv_msg+0x5e/0x90 [tipc]
[ 4.528737] __netif_receive_skb_list_core+0x20b/0x260
[ 4.530068] netif_receive_skb_list_internal+0x1bf/0x2e0
[ 4.531450] ? dev_gro_receive+0x4c2/0x680
[ 4.532512] napi_complete_done+0x6f/0x180
[ 4.533570] virtnet_poll+0x29c/0x42e [virtio_net]
...
The node in question is receiving activate messages in another
thread after changing bearer status to allow message sending/
receiving in current thread:
thread 1 | thread 2
-------- | --------
|
tipc_enable_bearer() |
test_and_set_bit_lock() |
tipc_bearer_xmit_skb() |
| tipc_l2_rcv_msg()
| tipc_rcv()
| __tipc_node_link_up()
| tipc_link_build_state_msg()
| tipc_link_build_proto_msg()
| tipc_mon_prep()
| {
| ...
| // null-pointer dereference
| u16 gen = mon->dom_gen;
| ...
| }
// Not being executed yet |
tipc_mon_create() |
{ |
... |
// allocate |
mon = kzalloc(); |
... |
} |
Monitoring pointer in thread 2 is dereferenced before monitoring data
is allocated in thread 1. This causes kernel panic.
This commit fixes it by allocating the monitoring data before enabling
the bearer to receive messages.
Fixes: 35c55c9877f8 ("tipc: add neighbor monitoring framework")
Reported-by: Shuang Li <shuali@redhat.com>
Acked-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: Tung Nguyen <tung.q.nguyen@dektech.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
tools/testing/selftests/net/mptcp/mptcp_join.sh
34aa6e3bccd8 ("selftests: mptcp: add ip mptcp wrappers")
857898eb4b28 ("selftests: mptcp: add missing join check")
6ef84b1517e0 ("selftests: mptcp: more robust signal race test")
https://lore.kernel.org/all/20220221131842.468893-1-broonie@kernel.org/
drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/act.h
drivers/net/ethernet/mellanox/mlx5/core/en/tc/act/ct.c
fb7e76ea3f3b6 ("net/mlx5e: TC, Skip redundant ct clear actions")
c63741b426e11 ("net/mlx5e: Fix MPLSoUDP encap to use MPLS action information")
09bf97923224f ("net/mlx5e: TC, Move pedit_headers_action to parse_attr")
84ba8062e383 ("net/mlx5e: Test CT and SAMPLE on flow attr")
efe6f961cd2e ("net/mlx5e: CT, Don't set flow flag CT for ct clear flow")
3b49a7edec1d ("net/mlx5e: TC, Reject rules with multiple CT actions")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
These tests are supposed to check if the loop exited via a break or not.
However the tests are wrong because if we did not exit via a break then
"p" is not a valid pointer. In that case, it's the equivalent of
"if (*(u32 *)sr == *last_key) {". That's going to work most of the time,
but there is a potential for those to be equal.
Fixes: 1593123a6a49 ("tipc: add name table dump to new netlink api")
Fixes: 1a1a143daf84 ("tipc: add publication dump to new netlink api")
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
No conflicts.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The previous bug fix had an unfortunate side effect that broke
distribution of binding table entries between nodes. The updated
tipc_sock_addr struct is also used further down in the same
function, and there the old value is still the correct one.
Fixes: 032062f363b4 ("tipc: fix wrong publisher node address in link publications")
Signed-off-by: Jon Maloy <jmaloy@redhat.com>
Link: https://lore.kernel.org/r/20220216020009.3404578-1-jmaloy@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
When a link comes up we add its presence to the name table to make it
possible for users to subscribe for link up/down events. However, after
a previous call signature change the binding is wrongly published with
the peer node as publishing node, instead of the own node as it should
be. This has the effect that the command 'tipc name table show' will
list the link binding (service type 2) with node scope and a peer node
as originator, something that obviously is impossible.
We correct this bug here.
Fixes: 50a3499ab853 ("tipc: simplify signature of tipc_namtbl_publish()")
Signed-off-by: Jon Maloy <jmaloy@redhat.com>
Link: https://lore.kernel.org/r/20220214013852.2803940-1-jmaloy@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
msg_data_sz return a 32bit value, but size is 16bit. This may lead to a
bit overflow.
Signed-off-by: Hangyu Hua <hbh25y@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
No conflicts.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Jakub Kicinski:
"Including fixes from netfilter and can.
Current release - new code bugs:
- sparx5: fix get_stat64 out-of-bound access and crash
- smc: fix netdev ref tracker misuse
Previous releases - regressions:
- eth: ixgbevf: require large buffers for build_skb on 82599VF, avoid
overflows
- eth: ocelot: fix all IP traffic getting trapped to CPU with PTP
over IP
- bonding: fix rare link activation misses in 802.3ad mode
Previous releases - always broken:
- tcp: fix tcp sock mem accounting in zero-copy corner cases
- remove the cached dst when uncloning an skb dst and its metadata,
since we only have one ref it'd lead to an UaF
- netfilter:
- conntrack: don't refresh sctp entries in closed state
- conntrack: re-init state for retransmitted syn-ack, avoid
connection establishment getting stuck with strange stacks
- ctnetlink: disable helper autoassign, avoid it getting lost
- nft_payload: don't allow transport header access for fragments
- dsa: fix use of devres for mdio throughout drivers
- eth: amd-xgbe: disable interrupts during pci removal
- eth: dpaa2-eth: unregister netdev before disconnecting the PHY
- eth: ice: fix IPIP and SIT TSO offload"
* tag 'net-5.17-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (53 commits)
net: dsa: mv88e6xxx: fix use-after-free in mv88e6xxx_mdios_unregister
net: mscc: ocelot: fix mutex lock error during ethtool stats read
ice: Avoid RTNL lock when re-creating auxiliary device
ice: Fix KASAN error in LAG NETDEV_UNREGISTER handler
ice: fix IPIP and SIT TSO offload
ice: fix an error code in ice_cfg_phy_fec()
net: mpls: Fix GCC 12 warning
dpaa2-eth: unregister the netdev before disconnecting from the PHY
skbuff: cleanup double word in comment
net: macb: Align the dma and coherent dma masks
mptcp: netlink: process IPv6 addrs in creating listening sockets
selftests: mptcp: add missing join check
net: usb: qmi_wwan: Add support for Dell DW5829e
vlan: move dev_put into vlan_dev_uninit
vlan: introduce vlan_dev_free_egress_priority
ax25: fix UAF bugs of net_device caused by rebinding operation
net: dsa: fix panic when DSA master device unbinds on shutdown
net: amd-xgbe: disable interrupts during pci removal
tipc: rate limit warning for received illegal binding update
net: mdio: aspeed: Add missing MODULE_DEVICE_TABLE
...
|
|
The function tipc_mon_rcv() allows a node to receive and process
domain_record structs from peer nodes to track their views of the
network topology.
This patch verifies that the number of members in a received domain
record does not exceed the limit defined by MAX_MON_DOMAIN, something
that may otherwise lead to a stack overflow.
tipc_mon_rcv() is called from the function tipc_link_proto_rcv(), where
we are reading a 32 bit message data length field into a uint16. To
avert any risk of bit overflow, we add an extra sanity check for this in
that function. We cannot see that happen with the current code, but
future designers being unaware of this risk, may introduce it by
allowing delivery of very large (> 64k) sk buffers from the bearer
layer. This potential problem was identified by Eric Dumazet.
This fixes CVE-2022-0435
Reported-by: Samuel Page <samuel.page@appgate.com>
Reported-by: Eric Dumazet <edumazet@google.com>
Fixes: 35c55c9877f8 ("tipc: add neighbor monitoring framework")
Signed-off-by: Jon Maloy <jmaloy@redhat.com>
Reviewed-by: Xin Long <lucien.xin@gmail.com>
Reviewed-by: Samuel Page <samuel.page@appgate.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
It would be easy to craft a message containing an illegal binding table
update operation. This is handled correctly by the code, but the
corresponding warning printout is not rate limited as is should be.
We fix this now.
Fixes: b97bf3fd8f6a ("[TIPC] Initial merge")
Signed-off-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
IIUC the TIPC msg helpers are not meant to provide
and exhaustive API, so remove the unused ones.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
No conflicts.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
struct tipc_socket_addr.ref has a 4-byte hole,and __tipc_getname() currently
copying it to user space,causing kernel-infoleak.
BUG: KMSAN: kernel-infoleak in instrument_copy_to_user include/linux/instrumented.h:121 [inline]
BUG: KMSAN: kernel-infoleak in instrument_copy_to_user include/linux/instrumented.h:121 [inline] lib/usercopy.c:33
BUG: KMSAN: kernel-infoleak in _copy_to_user+0x1c9/0x270 lib/usercopy.c:33 lib/usercopy.c:33
instrument_copy_to_user include/linux/instrumented.h:121 [inline]
instrument_copy_to_user include/linux/instrumented.h:121 [inline] lib/usercopy.c:33
_copy_to_user+0x1c9/0x270 lib/usercopy.c:33 lib/usercopy.c:33
copy_to_user include/linux/uaccess.h:209 [inline]
copy_to_user include/linux/uaccess.h:209 [inline] net/socket.c:287
move_addr_to_user+0x3f6/0x600 net/socket.c:287 net/socket.c:287
__sys_getpeername+0x470/0x6b0 net/socket.c:1987 net/socket.c:1987
__do_sys_getpeername net/socket.c:1997 [inline]
__se_sys_getpeername net/socket.c:1994 [inline]
__do_sys_getpeername net/socket.c:1997 [inline] net/socket.c:1994
__se_sys_getpeername net/socket.c:1994 [inline] net/socket.c:1994
__x64_sys_getpeername+0xda/0x120 net/socket.c:1994 net/socket.c:1994
do_syscall_x64 arch/x86/entry/common.c:51 [inline]
do_syscall_x64 arch/x86/entry/common.c:51 [inline] arch/x86/entry/common.c:82
do_syscall_64+0x54/0xd0 arch/x86/entry/common.c:82 arch/x86/entry/common.c:82
entry_SYSCALL_64_after_hwframe+0x44/0xae
Uninit was stored to memory at:
tipc_getname+0x575/0x5e0 net/tipc/socket.c:757 net/tipc/socket.c:757
__sys_getpeername+0x3b3/0x6b0 net/socket.c:1984 net/socket.c:1984
__do_sys_getpeername net/socket.c:1997 [inline]
__se_sys_getpeername net/socket.c:1994 [inline]
__do_sys_getpeername net/socket.c:1997 [inline] net/socket.c:1994
__se_sys_getpeername net/socket.c:1994 [inline] net/socket.c:1994
__x64_sys_getpeername+0xda/0x120 net/socket.c:1994 net/socket.c:1994
do_syscall_x64 arch/x86/entry/common.c:51 [inline]
do_syscall_x64 arch/x86/entry/common.c:51 [inline] arch/x86/entry/common.c:82
do_syscall_64+0x54/0xd0 arch/x86/entry/common.c:82 arch/x86/entry/common.c:82
entry_SYSCALL_64_after_hwframe+0x44/0xae
Uninit was stored to memory at:
msg_set_word net/tipc/msg.h:212 [inline]
msg_set_destport net/tipc/msg.h:619 [inline]
msg_set_word net/tipc/msg.h:212 [inline] net/tipc/socket.c:1486
msg_set_destport net/tipc/msg.h:619 [inline] net/tipc/socket.c:1486
__tipc_sendmsg+0x44fa/0x5890 net/tipc/socket.c:1486 net/tipc/socket.c:1486
tipc_sendmsg+0xeb/0x140 net/tipc/socket.c:1402 net/tipc/socket.c:1402
sock_sendmsg_nosec net/socket.c:704 [inline]
sock_sendmsg net/socket.c:724 [inline]
sock_sendmsg_nosec net/socket.c:704 [inline] net/socket.c:2409
sock_sendmsg net/socket.c:724 [inline] net/socket.c:2409
____sys_sendmsg+0xe11/0x12c0 net/socket.c:2409 net/socket.c:2409
___sys_sendmsg net/socket.c:2463 [inline]
___sys_sendmsg net/socket.c:2463 [inline] net/socket.c:2492
__sys_sendmsg+0x704/0x840 net/socket.c:2492 net/socket.c:2492
__do_sys_sendmsg net/socket.c:2501 [inline]
__se_sys_sendmsg net/socket.c:2499 [inline]
__do_sys_sendmsg net/socket.c:2501 [inline] net/socket.c:2499
__se_sys_sendmsg net/socket.c:2499 [inline] net/socket.c:2499
__x64_sys_sendmsg+0xe2/0x120 net/socket.c:2499 net/socket.c:2499
do_syscall_x64 arch/x86/entry/common.c:51 [inline]
do_syscall_x64 arch/x86/entry/common.c:51 [inline] arch/x86/entry/common.c:82
do_syscall_64+0x54/0xd0 arch/x86/entry/common.c:82 arch/x86/entry/common.c:82
entry_SYSCALL_64_after_hwframe+0x44/0xae
Local variable skaddr created at:
__tipc_sendmsg+0x2d0/0x5890 net/tipc/socket.c:1419 net/tipc/socket.c:1419
tipc_sendmsg+0xeb/0x140 net/tipc/socket.c:1402 net/tipc/socket.c:1402
Bytes 4-7 of 16 are uninitialized
Memory access of size 16 starts at ffff888113753e00
Data copied to user address 0000000020000280
Reported-by: syzbot+cdbd40e0c3ca02cae3b7@syzkaller.appspotmail.com
Signed-off-by: Haimin Zhang <tcs_kernel@tencent.com>
Acked-by: Jon Maloy <jmaloy@redhat.com>
Link: https://lore.kernel.org/r/1640918123-14547-1-git-send-email-tcs.kernel@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
include/net/sock.h
commit 8f905c0e7354 ("inet: fully convert sk->sk_rx_dst to RCU rules")
commit 43f51df41729 ("net: move early demux fields close to sk_refcnt")
https://lore.kernel.org/all/20211222141641.0caa0ab3@canb.auug.org.au/
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This reverts commit 86c3a3e964d910a62eeb277d60b2a60ebefa9feb.
The tipc_aead_init() function can be calling from an interrupt routine.
This allocation might sleep with GFP_KERNEL flag, hence the following BUG
is reported.
[ 17.657509] BUG: sleeping function called from invalid context at include/linux/sched/mm.h:230
[ 17.660916] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 0, name: swapper/3
[ 17.664093] preempt_count: 302, expected: 0
[ 17.665619] RCU nest depth: 2, expected: 0
[ 17.667163] Preemption disabled at:
[ 17.667165] [<0000000000000000>] 0x0
[ 17.669753] CPU: 3 PID: 0 Comm: swapper/3 Kdump: loaded Tainted: G W 5.16.0-rc4+ #1
[ 17.673006] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
[ 17.675540] Call Trace:
[ 17.676285] <IRQ>
[ 17.676913] dump_stack_lvl+0x34/0x44
[ 17.678033] __might_resched.cold+0xd6/0x10f
[ 17.679311] kmem_cache_alloc_trace+0x14d/0x220
[ 17.680663] tipc_crypto_start+0x4a/0x2b0 [tipc]
[ 17.682146] ? kmem_cache_alloc_trace+0xd3/0x220
[ 17.683545] tipc_node_create+0x2f0/0x790 [tipc]
[ 17.684956] tipc_node_check_dest+0x72/0x680 [tipc]
[ 17.686706] ? ___cache_free+0x31/0x350
[ 17.688008] ? skb_release_data+0x128/0x140
[ 17.689431] tipc_disc_rcv+0x479/0x510 [tipc]
[ 17.690904] tipc_rcv+0x71c/0x730 [tipc]
[ 17.692219] ? __netif_receive_skb_core+0xb7/0xf60
[ 17.693856] tipc_l2_rcv_msg+0x5e/0x90 [tipc]
[ 17.695333] __netif_receive_skb_list_core+0x20b/0x260
[ 17.697072] netif_receive_skb_list_internal+0x1bf/0x2e0
[ 17.698870] ? dev_gro_receive+0x4c2/0x680
[ 17.700255] napi_complete_done+0x6f/0x180
[ 17.701657] virtnet_poll+0x29c/0x42e [virtio_net]
[ 17.703262] __napi_poll+0x2c/0x170
[ 17.704429] net_rx_action+0x22f/0x280
[ 17.705706] __do_softirq+0xfd/0x30a
[ 17.706921] common_interrupt+0xa4/0xc0
[ 17.708206] </IRQ>
[ 17.708922] <TASK>
[ 17.709651] asm_common_interrupt+0x1e/0x40
[ 17.711078] RIP: 0010:default_idle+0x18/0x20
Fixes: 86c3a3e964d9 ("tipc: use consistent GFP flags")
Acked-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: Hoang Le <hoang.h.le@dektech.com.au>
Link: https://lore.kernel.org/r/20211217030059.5947-1-hoang.h.le@dektech.com.au
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Most notable changes are in af_packet, tipc ones are trivial.
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Jon Maloy <jmaloy@redhat.com>
Cc: Ying Xue <ying.xue@windriver.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When key_exchange is disabled, there is no reason to accept MSG_CRYPTO
msgs if it doesn't send MSG_CRYPTO msgs.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When a skb comes to tipc_aead_encrypt(), it's always linear. The
unlikely check 'skb_cloned(skb) && tailen <= skb_tailroom(skb)'
can completely be taken care of in skb_cow_data() by the code
in branch "if (!skb_has_frag_list())".
Also, remove the 'TODO:' annotation, as the pages in skbs are not
writable, see more on commit 3cf4375a0904 ("tipc: do not write
skb_shinfo frags when doing decrytion").
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Jon Maloy <jmaloy@redhat.com>
Link: https://lore.kernel.org/r/47a478da0b6095b76e3cbe7a75cbd25d9da1df9a.1637773872.git.lucien.xin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
kmemdup can return a null pointer so need to check for it, otherwise
the null key will be dereferenced later in tipc_crypto_key_xmit as
can be seen in the trace [1].
Cc: tipc-discussion@lists.sourceforge.net
Cc: stable@vger.kernel.org # 5.15, 5.14, 5.10
[1] https://syzkaller.appspot.com/bug?id=bca180abb29567b189efdbdb34cbf7ba851c2a58
Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Tadeusz Struk <tadeusz.struk@linaro.org>
Acked-by: Ying Xue <ying.xue@windriver.com>
Acked-by: Jon Maloy <jmaloy@redhat.com>
Link: https://lore.kernel.org/r/20211115160143.5099-1-tadeusz.struk@linaro.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The MSG_CRYPTO msgs are always encrypted and sent to other nodes
for keys' deployment. But when receiving in peers, if those nodes
do not validate it and make sure it's encrypted, one could craft
a malicious MSG_CRYPTO msg to deploy its key with no need to know
other nodes' keys.
This patch is to do that by checking TIPC_SKB_CB(skb)->decrypted
and discard it if this packet never got decrypted.
Note that this is also a supplementary fix to CVE-2021-43267 that
can be triggered by an unencrypted malicious MSG_CRYPTO msg.
Fixes: 1ef6f7c9390f ("tipc: add automatic session key exchange")
Acked-by: Ying Xue <ying.xue@windriver.com>
Acked-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Some functions, like tipc_crypto_start use inconsisten GFP flags
when allocating memory. The mentioned function use GFP_ATOMIC to
to alloc a crypto instance, and then calls alloc_ordered_workqueue()
which allocates memory with GFP_KERNEL. tipc_aead_init() function
even uses GFP_KERNEL and GFP_ATOMIC interchangeably.
No doc comment specifies what context a function is designed to
work in, but the flags should at least be consistent within a function.
Cc: Jon Maloy <jmaloy@redhat.com>
Cc: Ying Xue <ying.xue@windriver.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: netdev@vger.kernel.org
Cc: tipc-discussion@lists.sourceforge.net
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Tadeusz Struk <tadeusz.struk@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
include/net/sock.h
7b50ecfcc6cd ("net: Rename ->stream_memory_read to ->sock_is_readable")
4c1e34c0dbff ("vsock: Enable y2038 safe timeval for timeout")
drivers/net/ethernet/marvell/octeontx2/af/rvu_debugfs.c
0daa55d033b0 ("octeontx2-af: cn10k: debugfs for dumping LMTST map table")
e77bcdd1f639 ("octeontx2-af: Display all enabled PF VF rsrc_alloc entries.")
Adjacent code addition in both cases, keep both.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The function tipc_crypto_key_rcv is used to parse MSG_CRYPTO messages
to receive keys from other nodes in the cluster in order to decrypt any
further messages from them.
This patch verifies that any supplied sizes in the message body are
valid for the received message.
Fixes: 1ef6f7c9390f ("tipc: add automatic session key exchange")
Signed-off-by: Max VA <maxv@sentinelone.com>
Acked-by: Ying Xue <ying.xue@windriver.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Acked-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In preparation for netdev->dev_addr being constant
make all relevant arguments in tipc constant.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
In tipc_sk_enqueue() we use hardcoded 2 jiffies to extract
socket buffer from generic queue to particular socket.
The 2 jiffies is too short in case there are other high priority
tasks get CPU cycles for multiple jiffies update. As result, no
buffer could be enqueued to particular socket.
To solve this, we switch to use constant timeout 20msecs.
Then, the function will be expired between 2 jiffies (CONFIG_100HZ)
and 20 jiffies (CONFIG_1000HZ).
Fixes: c637c1035534 ("tipc: resolve race problem at unicast message reception")
Acked-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: Hoang Le <hoang.h.le@dektech.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
There is a statement that is indented one character too deeply,
clean this up.
Signed-off-by: Colin Ian King <colin.king@canonical.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
drivers/ptp/Kconfig:
55c8fca1dae1 ("ptp_pch: Restore dependency on PCI")
e5f31552674e ("ethernet: fix PTP_1588_CLOCK dependencies")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
__tipc_sendmsg() is called to send SYN packet by either tipc_sendmsg()
or tipc_connect(). The difference is in tipc_connect(), it will call
tipc_wait_for_connect() after __tipc_sendmsg() to wait until connecting
is done. So there's no need to wait in __tipc_sendmsg() for this case.
This patch is to fix it by calling tipc_wait_for_connect() only when dlen
is not 0 in __tipc_sendmsg(), which means it's called by tipc_connect().
Note this also fixes the failure in tipcutils/test/ptts/:
# ./tipcTS &
# ./tipcTC 9
(hang)
Fixes: 36239dab6da7 ("tipc: fix implicit-connect for SYN+")
Reported-by: Shuang Li <shuali@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Conflicts:
drivers/net/ethernet/broadcom/bnxt/bnxt_ptp.h
9e26680733d5 ("bnxt_en: Update firmware call to retrieve TX PTP timestamp")
9e518f25802c ("bnxt_en: 1PPS functions to configure TSIO pins")
099fdeda659d ("bnxt_en: Event handler for PPS events")
kernel/bpf/helpers.c
include/linux/bpf-cgroup.h
a2baf4e8bb0f ("bpf: Fix potentially incorrect results with bpf_get_local_storage()")
c7603cfa04e7 ("bpf: Add ambient BPF runtime context stored in current")
drivers/net/ethernet/mellanox/mlx5/core/pci_irq.c
5957cc557dc5 ("net/mlx5: Set all field of mlx5_irq before inserting it to the xarray")
2d0b41a37679 ("net/mlx5: Refcount mlx5_irq with integer")
MAINTAINERS
7b637cd52f02 ("MAINTAINERS: fix Microchip CAN BUS Analyzer Tool entry typo")
7d901a1e878a ("net: phy: add Maxlinear GPY115/21x/24x driver")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
This reverts commit 0efea3c649f0 because of:
- The returning -ENOBUF error is fine on socket buffer allocation.
- There is side effect in the calling path
tipc_node_xmit()->tipc_link_xmit() when checking error code returning.
Fixes: 0efea3c649f0 ("tipc: Return the correct errno code")
Acked-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: Hoang Le <hoang.h.le@dektech.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Conflicting commits, all resolutions pretty trivial:
drivers/bus/mhi/pci_generic.c
5c2c85315948 ("bus: mhi: pci-generic: configurable network interface MRU")
56f6f4c4eb2a ("bus: mhi: pci_generic: Apply no-op for wake using sideband wake boolean")
drivers/nfc/s3fwrn5/firmware.c
a0302ff5906a ("nfc: s3fwrn5: remove unnecessary label")
46573e3ab08f ("nfc: s3fwrn5: fix undefined parameter values in dev_err()")
801e541c79bb ("nfc: s3fwrn5: fix undefined parameter values in dev_err()")
MAINTAINERS
7d901a1e878a ("net: phy: add Maxlinear GPY115/21x/24x driver")
8a7b46fa7902 ("MAINTAINERS: add Yasushi SHOJI as reviewer for the Microchip CAN BUS Analyzer Tool driver")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
syzbot reported an use-after-free crash:
BUG: KASAN: use-after-free in tipc_recvmsg+0xf77/0xf90 net/tipc/socket.c:1979
Call Trace:
tipc_recvmsg+0xf77/0xf90 net/tipc/socket.c:1979
sock_recvmsg_nosec net/socket.c:943 [inline]
sock_recvmsg net/socket.c:961 [inline]
sock_recvmsg+0xca/0x110 net/socket.c:957
tipc_conn_rcv_from_sock+0x162/0x2f0 net/tipc/topsrv.c:398
tipc_conn_recv_work+0xeb/0x190 net/tipc/topsrv.c:421
process_one_work+0x98d/0x1630 kernel/workqueue.c:2276
worker_thread+0x658/0x11f0 kernel/workqueue.c:2422
As Hoang pointed out, it was caused by skb_cb->bytes_read still accessed
after calling tsk_advance_rx_queue() to free the skb in tipc_recvmsg().
This patch is to fix it by accessing skb_cb->bytes_read earlier than
calling tsk_advance_rx_queue().
Fixes: f4919ff59c28 ("tipc: keep the skb in rcv queue until the whole data is read")
Reported-by: syzbot+e6741b97d5552f97c24d@syzkaller.appspotmail.com
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
One skb's skb_shinfo frags are not writable, and they can be shared with
other skbs' like by pskb_copy(). To write the frags may cause other skb's
data crash.
So before doing en/decryption, skb_cow_data() should always be called for
a cloned or nonlinear skb if req dst is using the same sg as req src.
While at it, the likely branch can be removed, as it will be covered
by skb_cow_data().
Note that esp_input() has the same issue, and I will fix it in another
patch. tipc_aead_encrypt() doesn't have this issue, as it only processes
linear data in the unlikely branch.
Fixes: fc1b6d6de220 ("tipc: introduce TIPC encryption & authentication")
Reported-by: Shuang Li <shuali@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The release_sock() is blocking function, it would change the state
after sleeping. In order to evaluate the stated condition outside
the socket lock context, switch to use wait_woken() instead.
Fixes: 6398e23cdb1d8 ("tipc: standardize accept routine")
Acked-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: Hoang Le <hoang.h.le@dektech.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
For implicit-connect, when it's either SYN- or SYN+, an ACK should
be sent back to the client immediately. It's not appropriate for
the client to enter established state only after receiving data
from the server.
On client side, after the SYN is sent out, tipc_wait_for_connect()
should be called to wait for the ACK if timeout is set.
This patch also restricts __tipc_sendstream() to call __sendmsg()
only when it's in TIPC_OPEN state, so that the client can program
in a single loop doing both connecting and data sending like:
for (...)
sendmsg(dest, buf);
This makes the implicit-connect more implicit.
Fixes: b97bf3fd8f6a ("[TIPC] Initial merge")
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Currently, when userspace reads a datagram with a buffer that is
smaller than this datagram, the data will be truncated and only
part of it can be received by users. It doesn't seem right that
users don't know the datagram size and have to use a huge buffer
to read it to avoid the truncation.
This patch to fix it by keeping the skb in rcv queue until the
whole data is read by users. Only the last msg of the datagram
will be marked with MSG_EOR, just as TCP/SCTP does.
Note that this will work as above only when MSG_EOR is set in the
flags parameter of recvmsg(), so that it won't break any old user
applications.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Jakub Kicinski:
"Core:
- BPF:
- add syscall program type and libbpf support for generating
instructions and bindings for in-kernel BPF loaders (BPF loaders
for BPF), this is a stepping stone for signed BPF programs
- infrastructure to migrate TCP child sockets from one listener to
another in the same reuseport group/map to improve flexibility
of service hand-off/restart
- add broadcast support to XDP redirect
- allow bypass of the lockless qdisc to improving performance (for
pktgen: +23% with one thread, +44% with 2 threads)
- add a simpler version of "DO_ONCE()" which does not require jump
labels, intended for slow-path usage
- virtio/vsock: introduce SOCK_SEQPACKET support
- add getsocketopt to retrieve netns cookie
- ip: treat lowest address of a IPv4 subnet as ordinary unicast
address allowing reclaiming of precious IPv4 addresses
- ipv6: use prandom_u32() for ID generation
- ip: add support for more flexible field selection for hashing
across multi-path routes (w/ offload to mlxsw)
- icmp: add support for extended RFC 8335 PROBE (ping)
- seg6: add support for SRv6 End.DT46 behavior
- mptcp:
- DSS checksum support (RFC 8684) to detect middlebox meddling
- support Connection-time 'C' flag
- time stamping support
- sctp: packetization Layer Path MTU Discovery (RFC 8899)
- xfrm: speed up state addition with seq set
- WiFi:
- hidden AP discovery on 6 GHz and other HE 6 GHz improvements
- aggregation handling improvements for some drivers
- minstrel improvements for no-ack frames
- deferred rate control for TXQs to improve reaction times
- switch from round robin to virtual time-based airtime scheduler
- add trace points:
- tcp checksum errors
- openvswitch - action execution, upcalls
- socket errors via sk_error_report
Device APIs:
- devlink: add rate API for hierarchical control of max egress rate
of virtual devices (VFs, SFs etc.)
- don't require RCU read lock to be held around BPF hooks in NAPI
context
- page_pool: generic buffer recycling
New hardware/drivers:
- mobile:
- iosm: PCIe Driver for Intel M.2 Modem
- support for Qualcomm MSM8998 (ipa)
- WiFi: Qualcomm QCN9074 and WCN6855 PCI devices
- sparx5: Microchip SparX-5 family of Enterprise Ethernet switches
- Mellanox BlueField Gigabit Ethernet (control NIC of the DPU)
- NXP SJA1110 Automotive Ethernet 10-port switch
- Qualcomm QCA8327 switch support (qca8k)
- Mikrotik 10/25G NIC (atl1c)
Driver changes:
- ACPI support for some MDIO, MAC and PHY devices from Marvell and
NXP (our first foray into MAC/PHY description via ACPI)
- HW timestamping (PTP) support: bnxt_en, ice, sja1105, hns3, tja11xx
- Mellanox/Nvidia NIC (mlx5)
- NIC VF offload of L2 bridging
- support IRQ distribution to Sub-functions
- Marvell (prestera):
- add flower and match all
- devlink trap
- link aggregation
- Netronome (nfp): connection tracking offload
- Intel 1GE (igc): add AF_XDP support
- Marvell DPU (octeontx2): ingress ratelimit offload
- Google vNIC (gve): new ring/descriptor format support
- Qualcomm mobile (rmnet & ipa): inline checksum offload support
- MediaTek WiFi (mt76)
- mt7915 MSI support
- mt7915 Tx status reporting
- mt7915 thermal sensors support
- mt7921 decapsulation offload
- mt7921 enable runtime pm and deep sleep
- Realtek WiFi (rtw88)
- beacon filter support
- Tx antenna path diversity support
- firmware crash information via devcoredump
- Qualcomm WiFi (wcn36xx)
- Wake-on-WLAN support with magic packets and GTK rekeying
- Micrel PHY (ksz886x/ksz8081): add cable test support"
* tag 'net-next-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2168 commits)
tcp: change ICSK_CA_PRIV_SIZE definition
tcp_yeah: check struct yeah size at compile time
gve: DQO: Fix off by one in gve_rx_dqo()
stmmac: intel: set PCI_D3hot in suspend
stmmac: intel: Enable PHY WOL option in EHL
net: stmmac: option to enable PHY WOL with PMT enabled
net: say "local" instead of "static" addresses in ndo_dflt_fdb_{add,del}
net: use netdev_info in ndo_dflt_fdb_{add,del}
ptp: Set lookup cookie when creating a PTP PPS source.
net: sock: add trace for socket errors
net: sock: introduce sk_error_report
net: dsa: replay the local bridge FDB entries pointing to the bridge dev too
net: dsa: ensure during dsa_fdb_offload_notify that dev_hold and dev_put are on the same dev
net: dsa: include fdb entries pointing to bridge in the host fdb list
net: dsa: include bridge addresses which are local in the host fdb list
net: dsa: sync static FDB entries on foreign interfaces to hardware
net: dsa: install the host MDB and FDB entries in the master's RX filter
net: dsa: reference count the FDB addresses at the cross-chip notifier level
net: dsa: introduce a separate cross-chip notifier type for host FDBs
net: dsa: reference count the MDB entries at the cross-chip notifier level
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux
Pull fallthrough fixes from Gustavo Silva:
"Fix many fall-through warnings when building with Clang 12.0.0 and
'-Wimplicit-fallthrough' so that we at some point will be able to
enable that warning by default"
* tag 'fallthrough-fixes-clang-5.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gustavoars/linux: (26 commits)
rxrpc: Fix fall-through warnings for Clang
drm/nouveau/clk: Fix fall-through warnings for Clang
drm/nouveau/therm: Fix fall-through warnings for Clang
drm/nouveau: Fix fall-through warnings for Clang
xfs: Fix fall-through warnings for Clang
xfrm: Fix fall-through warnings for Clang
tipc: Fix fall-through warnings for Clang
sctp: Fix fall-through warnings for Clang
rds: Fix fall-through warnings for Clang
net/packet: Fix fall-through warnings for Clang
net: netrom: Fix fall-through warnings for Clang
ide: Fix fall-through warnings for Clang
hwmon: (max6621) Fix fall-through warnings for Clang
hwmon: (corsair-cpro) Fix fall-through warnings for Clang
firewire: core: Fix fall-through warnings for Clang
braille_console: Fix fall-through warnings for Clang
ipv4: Fix fall-through warnings for Clang
qlcnic: Fix fall-through warnings for Clang
bnxt_en: Fix fall-through warnings for Clang
netxen_nic: Fix fall-through warnings for Clang
...
|
|
The function align() which is defined in msg.c is redundant, replace it
with ALIGN() and introduce a BUF_ALIGN().
Signed-off-by: Menglong Dong <dong.menglong@zte.com.cn>
Acked-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
FB_MTU is used in 'tipc_msg_build()' to alloc smaller skb when memory
allocation fails, which can avoid unnecessary sending failures.
The value of FB_MTU now is 3744, and the data size will be:
(3744 + SKB_DATA_ALIGN(sizeof(struct skb_shared_info)) + \
SKB_DATA_ALIGN(BUF_HEADROOM + BUF_TAILROOM + 3))
which is larger than one page(4096), and two pages will be allocated.
To avoid it, replace '3744' with a calculation:
(PAGE_SIZE - SKB_DATA_ALIGN(BUF_OVERHEAD) - \
SKB_DATA_ALIGN(sizeof(struct skb_shared_info)))
What's more, alloc_skb_fclone() will call SKB_DATA_ALIGN for data size,
and it's not necessary to make alignment for buf_size in
tipc_buf_acquire(). So, just remove it.
Fixes: 4c94cc2d3d57 ("tipc: fall back to smaller MTU if allocation of local send skb fails")
Signed-off-by: Menglong Dong <dong.menglong@zte.com.cn>
Acked-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Fix a spelling mistake.
Signed-off-by: gushengxian <gushengxian@yulong.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Fix the use of copular verb.
Signed-off-by: gushengxian <gushengxian@yulong.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Fix the use of indefinite article.
Signed-off-by: gushengxian <gushengxian@yulong.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When kalloc or kmemdup failed, should return ENOMEM rather than ENOBUF.
Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
We introduce a new macro TIPC_ANY_SCOPE to make the handling of the
lookup scope value more comprehensible during multicast reception.
The (unchanged) rules go as follows:
1) Multicast messages sent from own node are delivered to all matching
sockets on the own node, irrespective of their binding scope.
2) Multicast messages sent from other nodes arrive here because they
have found TIPC_CLUSTER_SCOPE bindings emanating from this node.
Those messages should be delivered to exactly those sockets, but not
to local sockets bound with TIPC_NODE_SCOPE, since the latter
obviously were not meant to be visible for those senders.
3) Group multicast/broadcast messages are delivered to the sockets with
a binding scope matching exactly the lookup scope indicated in the
message header, and nobody else.
Reviewed-by: Xin Long <lucien.xin@gmail.com>
Tested-by: Hoang Le <hoang.h.le@dektech.com.au>
Signed-off-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
We refactor tipc_sk_anc_data_recv() to make it slightly more
comprehensible, but also to facilitate application of some additions
to the code in a future commit.
Reviewed-by: Xin Long <lucien.xin@gmail.com>
Tested-by: Hoang Le <hoang.h.le@dektech.com.au>
Signed-off-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
We eliminate the redundant fields conn_type and conn_instance in
struct tipc_sock. On the connecting side, this information is already
present in the unused (after the connection is established) part of
the pre-allocated header, and on the accepting side, we put it there
when the new socket is created.
Reviewed-by: Xin Long <lucien.xin@gmail.com>
Tested-by: Hoang Le <hoang.h.le@dektech.com.au>
Signed-off-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch is to use "struct work_struct" for the finalize work queue
instead of "struct tipc_net_work", as it can get the "net" and "addr"
from tipc_net's other members and there is no need to add extra net
and addr in tipc_net by defining "struct tipc_net_work".
Note that it's safe to get net from tn->bcl as bcl is always released
after the finalize work queue is done.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In preparation to enable -Wimplicit-fallthrough for Clang, fix a warning
by explicitly adding a break statement instead of letting the code fall
through to the next case.
Link: https://github.com/KSPP/linux/issues/115
Signed-off-by: Gustavo A. R. Silva <gustavoars@kernel.org>
|
|
On some host, a crash could be triggered simply by repeating these
commands several times:
# modprobe tipc
# tipc bearer enable media udp name UDP1 localip 127.0.0.1
# rmmod tipc
[] BUG: unable to handle kernel paging request at ffffffffc096bb00
[] Workqueue: events 0xffffffffc096bb00
[] Call Trace:
[] ? process_one_work+0x1a7/0x360
[] ? worker_thread+0x30/0x390
[] ? create_worker+0x1a0/0x1a0
[] ? kthread+0x116/0x130
[] ? kthread_flush_work_fn+0x10/0x10
[] ? ret_from_fork+0x35/0x40
When removing the TIPC module, the UDP tunnel sock will be delayed to
release in a work queue as sock_release() can't be done in rtnl_lock().
If the work queue is schedule to run after the TIPC module is removed,
kernel will crash as the work queue function cleanup_beareri() code no
longer exists when trying to invoke it.
To fix it, this patch introduce a member wq_count in tipc_net to track
the numbers of work queues in schedule, and wait and exit until all
work queues are done in tipc_exit_net().
Fixes: d0f91938bede ("tipc: add ip/udp media type")
Reported-by: Shuang Li <shuali@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
It's not a good idea to append the frag skb to a skb's frag_list if
the frag_list already has skbs from elsewhere, such as this skb was
created by pskb_copy() where the frag_list was cloned (all the skbs
in it were skb_get'ed) and shared by multiple skbs.
However, the new appended frag skb should have been only seen by the
current skb. Otherwise, it will cause use after free crashes as this
appended frag skb are seen by multiple skbs but it only got skb_get
called once.
The same thing happens with a skb updated by pskb_may_pull() with a
skb_cloned skb. Li Shuang has reported quite a few crashes caused
by this when doing testing over macvlan devices:
[] kernel BUG at net/core/skbuff.c:1970!
[] Call Trace:
[] skb_clone+0x4d/0xb0
[] macvlan_broadcast+0xd8/0x160 [macvlan]
[] macvlan_process_broadcast+0x148/0x150 [macvlan]
[] process_one_work+0x1a7/0x360
[] worker_thread+0x30/0x390
[] kernel BUG at mm/usercopy.c:102!
[] Call Trace:
[] __check_heap_object+0xd3/0x100
[] __check_object_size+0xff/0x16b
[] simple_copy_to_iter+0x1c/0x30
[] __skb_datagram_iter+0x7d/0x310
[] __skb_datagram_iter+0x2a5/0x310
[] skb_copy_datagram_iter+0x3b/0x90
[] tipc_recvmsg+0x14a/0x3a0 [tipc]
[] ____sys_recvmsg+0x91/0x150
[] ___sys_recvmsg+0x7b/0xc0
[] kernel BUG at mm/slub.c:305!
[] Call Trace:
[] <IRQ>
[] kmem_cache_free+0x3ff/0x400
[] __netif_receive_skb_core+0x12c/0xc40
[] ? kmem_cache_alloc+0x12e/0x270
[] netif_receive_skb_internal+0x3d/0xb0
[] ? get_rx_page_info+0x8e/0xa0 [be2net]
[] be_poll+0x6ef/0xd00 [be2net]
[] ? irq_exit+0x4f/0x100
[] net_rx_action+0x149/0x3b0
...
This patch is to fix it by linearizing the head skb if it has frag_list
set in tipc_buf_append(). Note that we choose to do this before calling
skb_unshare(), as __skb_linearize() will avoid skb_copy(). Also, we can
not just drop the frag_list either as the early time.
Fixes: 45c8b7b175ce ("tipc: allow non-linear first fragment buffer")
Reported-by: Li Shuang <shuali@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Acked-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This reverts commit 6bf24dc0cc0cc43b29ba344b66d78590e687e046.
Above fix is not correct and caused memory leak issue.
Fixes: 6bf24dc0cc0c ("net:tipc: Fix a double free in tipc_sk_mcast_rcv")
Acked-by: Jon Maloy <jmaloy@redhat.com>
Acked-by: Tung Nguyen <tung.q.nguyen@dektech.com.au>
Signed-off-by: Hoang Le <hoang.h.le@dektech.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The using of the node address and node link identity are not thread safe,
meaning that two publications may be published the same values, as result
one of them will get failure because of already existing in the name table.
To avoid this we have to use the node address and node link identity values
from inside the node item's write lock protection.
Fixes: 50a3499ab853 ("tipc: simplify signature of tipc_namtbl_publish()")
Acked-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: Hoang Le <hoang.h.le@dektech.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next
Pull networking updates from Jakub Kicinski:
"Core:
- bpf:
- allow bpf programs calling kernel functions (initially to
reuse TCP congestion control implementations)
- enable task local storage for tracing programs - remove the
need to store per-task state in hash maps, and allow tracing
programs access to task local storage previously added for
BPF_LSM
- add bpf_for_each_map_elem() helper, allowing programs to walk
all map elements in a more robust and easier to verify fashion
- sockmap: support UDP and cross-protocol BPF_SK_SKB_VERDICT
redirection
- lpm: add support for batched ops in LPM trie
- add BTF_KIND_FLOAT support - mostly to allow use of BTF on
s390 which has floats in its headers files
- improve BPF syscall documentation and extend the use of kdoc
parsing scripts we already employ for bpf-helpers
- libbpf, bpftool: support static linking of BPF ELF files
- improve support for encapsulation of L2 packets
- xdp: restructure redirect actions to avoid a runtime lookup,
improving performance by 4-8% in microbenchmarks
- xsk: build skb by page (aka generic zerocopy xmit) - improve
performance of software AF_XDP path by 33% for devices which don't
need headers in the linear skb part (e.g. virtio)
- nexthop: resilient next-hop groups - improve path stability on
next-hops group changes (incl. offload for mlxsw)
- ipv6: segment routing: add support for IPv4 decapsulation
- icmp: add support for RFC 8335 extended PROBE messages
- inet: use bigger hash table for IP ID generation
- tcp: deal better with delayed TX completions - make sure we don't
give up on fast TCP retransmissions only because driver is slow in
reporting that it completed transmitting the original
- tcp: reorder tcp_congestion_ops for better cache locality
- mptcp:
- add sockopt support for common TCP options
- add support for common TCP msg flags
- include multiple address ids in RM_ADDR
- add reset option support for resetting one subflow
- udp: GRO L4 improvements - improve 'forward' / 'frag_list'
co-existence with UDP tunnel GRO, allowing the first to take place
correctly even for encapsulated UDP traffic
- micro-optimize dev_gro_receive() and flow dissection, avoid
retpoline overhead on VLAN and TEB GRO
- use less memory for sysctls, add a new sysctl type, to allow using
u8 instead of "int" and "long" and shrink networking sysctls
- veth: allow GRO without XDP - this allows aggregating UDP packets
before handing them off to routing, bridge, OvS, etc.
- allow specifing ifindex when device is moved to another namespace
- netfilter:
- nft_socket: add support for cgroupsv2
- nftables: add catch-all set element - special element used to
define a default action in case normal lookup missed
- use net_generic infra in many modules to avoid allocating
per-ns memory unnecessarily
- xps: improve the xps handling to avoid potential out-of-bound
accesses and use-after-free when XPS change race with other
re-configuration under traffic
- add a config knob to turn off per-cpu netdev refcnt to catch
underflows in testing
Device APIs:
- add WWAN subsystem to organize the WWAN interfaces better and
hopefully start driving towards more unified and vendor-
independent APIs
- ethtool:
- add interface for reading IEEE MIB stats (incl. mlx5 and bnxt
support)
- allow network drivers to dump arbitrary SFP EEPROM data,
current offset+length API was a poor fit for modern SFP which
define EEPROM in terms of pages (incl. mlx5 support)
- act_police, flow_offload: add support for packet-per-second
policing (incl. offload for nfp)
- psample: add additional metadata attributes like transit delay for
packets sampled from switch HW (and corresponding egress and
policy-based sampling in the mlxsw driver)
- dsa: improve support for sandwiched LAGs with bridge and DSA
- netfilter:
- flowtable: use direct xmit in topologies with IP forwarding,
bridging, vlans etc.
- nftables: counter hardware offload support
- Bluetooth:
- improvements for firmware download w/ Intel devices
- add support for reading AOSP vendor capabilities
- add support for virtio transport driver
- mac80211:
- allow concurrent monitor iface and ethernet rx decap
- set priority and queue mapping for injected frames
- phy: add support for Clause-45 PHY Loopback
- pci/iov: add sysfs MSI-X vector assignment interface to distribute
MSI-X resources to VFs (incl. mlx5 support)
New hardware/drivers:
- dsa: mv88e6xxx: add support for Marvell mv88e6393x - 11-port
Ethernet switch with 8x 1-Gigabit Ethernet and 3x 10-Gigabit
interfaces.
- dsa: support for legacy Broadcom tags used on BCM5325, BCM5365 and
BCM63xx switches
- Microchip KSZ8863 and KSZ8873; 3x 10/100Mbps Ethernet switches
- ath11k: support for QCN9074 a 802.11ax device
- Bluetooth: Broadcom BCM4330 and BMC4334
- phy: Marvell 88X2222 transceiver support
- mdio: add BCM6368 MDIO mux bus controller
- r8152: support RTL8153 and RTL8156 (USB Ethernet) chips
- mana: driver for Microsoft Azure Network Adapter (MANA)
- Actions Semi Owl Ethernet MAC
- can: driver for ETAS ES58X CAN/USB interfaces
Pure driver changes:
- add XDP support to: enetc, igc, stmmac
- add AF_XDP support to: stmmac
- virtio:
- page_to_skb() use build_skb when there's sufficient tailroom
(21% improvement for 1000B UDP frames)
- support XDP even without dedicated Tx queues - share the Tx
queues with the stack when necessary
- mlx5:
- flow rules: add support for mirroring with conntrack, matching
on ICMP, GTP, flex filters and more
- support packet sampling with flow offloads
- persist uplink representor netdev across eswitch mode changes
- allow coexistence of CQE compression and HW time-stamping
- add ethtool extended link error state reporting
- ice, iavf: support flow filters, UDP Segmentation Offload
- dpaa2-switch:
- move the driver out of staging
- add spanning tree (STP) support
- add rx copybreak support
- add tc flower hardware offload on ingress traffic
- ionic:
- implement Rx page reuse
- support HW PTP time-stamping
- octeon: support TC hardware offloads - flower matching on ingress
and egress ratelimitting.
- stmmac:
- add RX frame steering based on VLAN priority in tc flower
- support frame preemption (FPE)
- intel: add cross time-stamping freq difference adjustment
- ocelot:
- support forwarding of MRP frames in HW
- support multiple bridges
- support PTP Sync one-step timestamping
- dsa: mv88e6xxx, dpaa2-switch: offload bridge port flags like
learning, flooding etc.
- ipa: add IPA v4.5, v4.9 and v4.11 support (Qualcomm SDX55, SM8350,
SC7280 SoCs)
- mt7601u: enable TDLS support
- mt76:
- add support for 802.3 rx frames (mt7915/mt7615)
- mt7915 flash pre-calibration support
- mt7921/mt7663 runtime power management fixes"
* tag 'net-next-5.13' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (2451 commits)
net: selftest: fix build issue if INET is disabled
net: netrom: nr_in: Remove redundant assignment to ns
net: tun: Remove redundant assignment to ret
net: phy: marvell: add downshift support for M88E1240
net: dsa: ksz: Make reg_mib_cnt a u8 as it never exceeds 255
net/sched: act_ct: Remove redundant ct get and check
icmp: standardize naming of RFC 8335 PROBE constants
bpf, selftests: Update array map tests for per-cpu batched ops
bpf: Add batched ops support for percpu array
bpf: Implement formatted output helpers with bstr_printf
seq_file: Add a seq_bprintf function
sfc: adjust efx->xdp_tx_queue_count with the real number of initialized queues
net:nfc:digital: Fix a double free in digital_tg_recv_dep_req
net: fix a concurrency bug in l2tp_tunnel_register()
net/smc: Remove redundant assignment to rc
mpls: Remove redundant assignment to err
llc2: Remove redundant assignment to rc
net/tls: Remove redundant initialization of record
rds: Remove redundant assignment to nr_sig
dt-bindings: net: mdio-gpio: add compatible for microchip,mdio-smi0
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull CFI on arm64 support from Kees Cook:
"This builds on last cycle's LTO work, and allows the arm64 kernels to
be built with Clang's Control Flow Integrity feature. This feature has
happily lived in Android kernels for almost 3 years[1], so I'm excited
to have it ready for upstream.
The wide diffstat is mainly due to the treewide fixing of mismatched
list_sort prototypes. Other things in core kernel are to address
various CFI corner cases. The largest code portion is the CFI runtime
implementation itself (which will be shared by all architectures
implementing support for CFI). The arm64 pieces are Acked by arm64
maintainers rather than coming through the arm64 tree since carrying
this tree over there was going to be awkward.
CFI support for x86 is still under development, but is pretty close.
There are a handful of corner cases on x86 that need some improvements
to Clang and objtool, but otherwise works well.
Summary:
- Clean up list_sort prototypes (Sami Tolvanen)
- Introduce CONFIG_CFI_CLANG for arm64 (Sami Tolvanen)"
* tag 'cfi-v5.13-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
arm64: allow CONFIG_CFI_CLANG to be selected
KVM: arm64: Disable CFI for nVHE
arm64: ftrace: use function_nocfi for ftrace_call
arm64: add __nocfi to __apply_alternatives
arm64: add __nocfi to functions that jump to a physical address
arm64: use function_nocfi with __pa_symbol
arm64: implement function_nocfi
psci: use function_nocfi for cpu_resume
lkdtm: use function_nocfi
treewide: Change list_sort to use const pointers
bpf: disable CFI in dispatcher functions
kallsyms: strip ThinLTO hashes from static functions
kthread: use WARN_ON_FUNCTION_MISMATCH
workqueue: use WARN_ON_FUNCTION_MISMATCH
module: ensure __cfi_check alignment
mm: add generic function_nocfi macro
cfi: add __cficanonical
add support for Clang CFI
|
|
Conflicts:
MAINTAINERS
- keep Chandrasekar
drivers/net/ethernet/mellanox/mlx5/core/en_main.c
- simple fix + trust the code re-added to param.c in -next is fine
include/linux/bpf.h
- trivial
include/linux/ethtool.h
- trivial, fix kdoc while at it
include/linux/skmsg.h
- move to relevant place in tcp.c, comment re-wrapped
net/core/skmsg.c
- add the sk = sk // sk = NULL around calls
net/tipc/crypto.c
- trivial
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
list_sort() internally casts the comparison function passed to it
to a different type with constant struct list_head pointers, and
uses this pointer to call the functions, which trips indirect call
Control-Flow Integrity (CFI) checking.
Instead of removing the consts, this change defines the
list_cmp_func_t type and changes the comparison function types of
all list_sort() callers to use const pointers, thus avoiding type
mismatches.
Suggested-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Kees Cook <keescook@chromium.org>
Tested-by: Nick Desaulniers <ndesaulniers@google.com>
Tested-by: Nathan Chancellor <nathan@kernel.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: https://lore.kernel.org/r/20210408182843.1754385-10-samitolvanen@google.com
|
|
These patches fix a series of spelling errors in net/tipc module.
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Zheng Yongjun <zhengyongjun3@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Li Shuang found a NULL pointer dereference crash in her testing:
[] BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
[] RIP: 0010:tipc_crypto_rcv_complete+0xc8/0x7e0 [tipc]
[] Call Trace:
[] <IRQ>
[] tipc_crypto_rcv+0x2d9/0x8f0 [tipc]
[] tipc_rcv+0x2fc/0x1120 [tipc]
[] tipc_udp_recv+0xc6/0x1e0 [tipc]
[] udpv6_queue_rcv_one_skb+0x16a/0x460
[] udp6_unicast_rcv_skb.isra.35+0x41/0xa0
[] ip6_protocol_deliver_rcu+0x23b/0x4c0
[] ip6_input+0x3d/0xb0
[] ipv6_rcv+0x395/0x510
[] __netif_receive_skb_core+0x5fc/0xc40
This is caused by NULL returned by tipc_aead_get(), and then crashed when
dereferencing it later in tipc_crypto_rcv_complete(). This might happen
when tipc_crypto_rcv_complete() is called by two threads at the same time:
the tmp attached by tipc_crypto_key_attach() in one thread may be released
by the one attached by that in the other thread.
This patch is to fix it by incrementing the tmp's refcnt before attaching
it instead of calling tipc_aead_get() after attaching it.
Fixes: fc1b6d6de220 ("tipc: introduce TIPC encryption & authentication")
Reported-by: Li Shuang <shuali@redhat.com>
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Fix kernel-doc warning:
Documentation/networking/tipc:66: /home/sfr/next/next/net/tipc/name_table.c
:558: WARNING: Unexpected indentation.
Documentation/networking/tipc:66: /home/sfr/next/next/net/tipc/name_table.c
:559: WARNING: Block quote ends without a blank line; unexpected unindent.
Due to blank line missing.
Fixes: 908148bc5046 ("tipc: refactor tipc_sendmsg() and tipc_lookup_anycast()")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Link: https://lore.kernel.org/netdev/20210318172255.63185609@canb.auug.org.au/
Signed-off-by: Wu XiangCheng <bobwxc@email.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
struct ip6_sf_socklist and ipv6_mc_socklist are per-socket MLD data.
These data are protected by rtnl lock, socket lock, and RCU.
So, when these are used, it verifies whether rtnl lock is acquired or not.
ip6_mc_msfget() is called by do_ipv6_getsockopt().
But caller doesn't acquire rtnl lock.
So, when these data are used in the ip6_mc_msfget() lockdep warns about it.
But accessing these is actually safe because socket lock was acquired by
do_ipv6_getsockopt().
So, it changes lockdep annotation from rtnl lock to socket lock.
(rtnl_dereference -> sock_dereference)
Locking graph for mld data is like below:
When writing mld data:
do_ipv6_setsockopt()
rtnl_lock
lock_sock
(mld functions)
idev->mc_lock(if per-interface mld data is modified)
When reading mld data:
do_ipv6_getsockopt()
lock_sock
ip6_mc_msfget()
Splat looks like:
=============================
WARNING: suspicious RCU usage
5.12.0-rc4+ #503 Not tainted
-----------------------------
net/ipv6/mcast.c:610 suspicious rcu_dereference_protected() usage!
other info that might help us debug this:
rcu_scheduler_active = 2, debug_locks = 1
1 lock held by mcast-listener-/923:
#0: ffff888007958a70 (sk_lock-AF_INET6){+.+.}-{0:0}, at:
ipv6_get_msfilter+0xaf/0x190
stack backtrace:
CPU: 1 PID: 923 Comm: mcast-listener- Not tainted 5.12.0-rc4+ #503
Call Trace:
dump_stack+0xa4/0xe5
ip6_mc_msfget+0x553/0x6c0
? ipv6_sock_mc_join_ssm+0x10/0x10
? lockdep_hardirqs_on_prepare+0x3e0/0x3e0
? mark_held_locks+0xb7/0x120
? lockdep_hardirqs_on_prepare+0x27c/0x3e0
? __local_bh_enable_ip+0xa5/0xf0
? lock_sock_nested+0x82/0xf0
ipv6_get_msfilter+0xc3/0x190
? compat_ipv6_get_msfilter+0x300/0x300
? lock_downgrade+0x690/0x690
do_ipv6_getsockopt.isra.6.constprop.13+0x1809/0x29e0
? do_ipv6_mcast_group_source+0x150/0x150
? register_lock_class+0x1750/0x1750
? kvm_sched_clock_read+0x14/0x30
? sched_clock+0x5/0x10
? sched_clock_cpu+0x18/0x170
? find_held_lock+0x3a/0x1c0
? lock_downgrade+0x690/0x690
? ipv6_getsockopt+0xdb/0x1b0
ipv6_getsockopt+0xdb/0x1b0
[ ... ]
Fixes: 88e2ca308094 ("mld: convert ifmcaddr6 to RCU")
Reported-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
When enabling a bearer by name, we don't sanity check its name with
higher slot in bearer list. This may have the effect that the name
of an already enabled bearer bypasses the check.
To fix the above issue, we just perform an extra checking with all
existing bearers.
Fixes: cb30a63384bc9 ("tipc: refactor function tipc_enable_bearer()")
Cc: stable@vger.kernel.org
Acked-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: Hoang Le <hoang.h.le@dektech.com.au>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add the missing destroy_workqueue() before return from
tipc_crypto_start() in the error handling case.
Fixes: 1ef6f7c9390f ("tipc: add automatic session key exchange")
Reported-by: Hulk Robot <hulkci@huawei.com>
Signed-off-by: Yang Yingliang <yangyingliang@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
We fix a warning from the htmldoc tool and an indentation error reported
by smatch. There are no functional changes in this commit.
Signed-off-by: Jon Maloy <jmaloy@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In the if(skb_peek(arrvq) == skb) branch, it calls __skb_dequeue(arrvq) to get
the skb by skb = skb_peek(arrvq). Then __skb_dequeue() unlinks the skb from arrvq
and returns the skb which equals to skb_peek(arrvq). After __skb_dequeue(arrvq)
finished, the skb is freed by kfree_skb(__skb_dequeue(arrvq)) in the first time.
Unfortunately, the same skb is freed in the second time by kfree_skb(skb) after
the branch completed.
My patch removes kfree_skb() in the if(skb_peek(arrvq) == skb) branch, because
this skb will be freed by kfree_skb(skb) finally.
Fixes: cb1b728096f54 ("tipc: eliminate race condition at multicast reception")
Signed-off-by: Lv Yunlong <lyl2019@mail.ustc.edu.cn>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
s/synching/syncing/
Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|