summaryrefslogtreecommitdiff
path: root/net
AgeCommit message (Collapse)AuthorFilesLines
2020-10-01batman-adv: mcast/TT: fix wrongly dropped or rerouted packetsLinus Lüssing1-0/+4
[ Upstream commit 7dda5b3384121181c4e79f6eaeac2b94c0622c8d ] The unicast packet rerouting code makes several assumptions. For instance it assumes that there is always exactly one destination in the TT. This breaks for multicast frames in a unicast packets in several ways: For one thing if there is actually no TT entry and the destination node was selected due to the multicast tvlv flags it announced. Then an intermediate node will wrongly drop the packet. For another thing if there is a TT entry but the TTVN of this entry is newer than the originally addressed destination node: Then the intermediate node will wrongly redirect the packet, leading to duplicated multicast packets at a multicast listener and missing packets at other multicast listeners or multicast routers. Fixing this by not applying the unicast packet rerouting to batman-adv unicast packets with a multicast payload. We are not able to detect a roaming multicast listener at the moment and will just continue to send the multicast frame to both the new and old destination for a while in case of such a roaming multicast listener. Fixes: a73105b8d4c7 ("batman-adv: improved client announcement mechanism") Signed-off-by: Linus Lüssing <linus.luessing@c0d3.blue> Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-01batman-adv: bla: fix type misuse for backbone_gw hash indexingLinus Lüssing1-3/+4
[ Upstream commit 097930e85f90f252c44dc0d084598265dd44ca48 ] It seems that due to a copy & paste error the void pointer in batadv_choose_backbone_gw() is cast to the wrong type. Fixing this by using "struct batadv_bla_backbone_gw" instead of "struct batadv_bla_claim" which better matches the caller's side. For now it seems that we were lucky because the two structs both have their orig/vid and addr/vid in the beginning. However I stumbled over this issue when I was trying to add some debug variables in front of "orig" in batadv_backbone_gw, which caused hash lookups to fail. Fixes: 07568d0369f9 ("batman-adv: don't rely on positions in struct for hashing") Signed-off-by: Linus Lüssing <ll@simonwunderlich.de> Signed-off-by: Sven Eckelmann <sven@narfation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-01net: openvswitch: use div_u64() for 64-by-32 divisionsTonghao Zhang1-1/+1
[ Upstream commit 659d4587fe7233bfdff303744b20d6f41ad04362 ] Compile the kernel for arm 32 platform, the build warning found. To fix that, should use div_u64() for divisions. | net/openvswitch/meter.c:396: undefined reference to `__udivdi3' [add more commit msg, change reported tag, and use div_u64 instead of do_div by Tonghao] Fixes: e57358873bb5d6ca ("net: openvswitch: use u64 for meter bucket") Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com> Tested-by: Tonghao Zhang <xiangxia.m.yue@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-01tipc: fix memory leak in service subscriptingTuong Lien1-1/+3
[ Upstream commit 0771d7df819284d46cf5cfb57698621b503ec17f ] Upon receipt of a service subscription request from user via a topology connection, one 'sub' object will be allocated in kernel, so it will be able to send an event of the service if any to the user correspondingly then. Also, in case of any failure, the connection will be shutdown and all the pertaining 'sub' objects will be freed. However, there is a race condition as follows resulting in memory leak: receive-work connection send-work | | | sub-1 |<------//-------| | sub-2 |<------//-------| | | |<---------------| evt for sub-x sub-3 |<------//-------| | : : : : : : | /--------| | | | * peer closed | | | | | | | |<-------X-------| evt for sub-y | | |<===============| sub-n |<------/ X shutdown | -> orphan | | That is, the 'receive-work' may get the last subscription request while the 'send-work' is shutting down the connection due to peer close. We had a 'lock' on the connection, so the two actions cannot be carried out simultaneously. If the last subscription is allocated e.g. 'sub-n', before the 'send-work' closes the connection, there will be no issue at all, the 'sub' objects will be freed. In contrast the last subscription will become orphan since the connection was closed, and we released all references. This commit fixes the issue by simply adding one test if the connection remains in 'connected' state right after we obtain the connection lock, then a subscription object can be created as usual, otherwise we ignore it. Acked-by: Ying Xue <ying.xue@windriver.com> Acked-by: Jon Maloy <jmaloy@redhat.com> Reported-by: Thang Ngo <thang.h.ngo@dektech.com.au> Signed-off-by: Tuong Lien <tuong.t.lien@dektech.com.au> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-01Bluetooth: Handle Inquiry Cancel error after Inquiry CompleteSonny Sasaka1-2/+17
[ Upstream commit adf1d6926444029396861413aba8a0f2a805742a ] After sending Inquiry Cancel command to the controller, it is possible that Inquiry Complete event comes before Inquiry Cancel command complete event. In this case the Inquiry Cancel command will have status of Command Disallowed since there is no Inquiry session to be cancelled. This case should not be treated as error, otherwise we can reach an inconsistent state. Example of a btmon trace when this happened: < HCI Command: Inquiry Cancel (0x01|0x0002) plen 0 > HCI Event: Inquiry Complete (0x01) plen 1 Status: Success (0x00) > HCI Event: Command Complete (0x0e) plen 4 Inquiry Cancel (0x01|0x0002) ncmd 1 Status: Command Disallowed (0x0c) Signed-off-by: Sonny Sasaka <sonnysasaka@chromium.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-01atm: fix a memory leak of vcc->user_backCong Wang1-0/+6
[ Upstream commit 8d9f73c0ad2f20e9fed5380de0a3097825859d03 ] In lec_arp_clear_vccs() only entry->vcc is freed, but vcc could be installed on entry->recv_vcc too in lec_vcc_added(). This fixes the following memory leak: unreferenced object 0xffff8880d9266b90 (size 16): comm "atm2", pid 425, jiffies 4294907980 (age 23.488s) hex dump (first 16 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 6b 6b 6b a5 ............kkk. backtrace: [<(____ptrval____)>] kmem_cache_alloc_trace+0x10e/0x151 [<(____ptrval____)>] lane_ioctl+0x4b3/0x569 [<(____ptrval____)>] do_vcc_ioctl+0x1ea/0x236 [<(____ptrval____)>] svc_ioctl+0x17d/0x198 [<(____ptrval____)>] sock_do_ioctl+0x47/0x12f [<(____ptrval____)>] sock_ioctl+0x2f9/0x322 [<(____ptrval____)>] vfs_ioctl+0x1e/0x2b [<(____ptrval____)>] ksys_ioctl+0x61/0x80 [<(____ptrval____)>] __x64_sys_ioctl+0x16/0x19 [<(____ptrval____)>] do_syscall_64+0x57/0x65 [<(____ptrval____)>] entry_SYSCALL_64_after_hwframe+0x49/0xb3 Cc: Gengming Liu <l.dmxcsnsbh@gmail.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-01net: openvswitch: use u64 for meter bucketTonghao Zhang2-2/+2
[ Upstream commit e57358873bb5d6caa882b9684f59140912b37dde ] When setting the meter rate to 4+Gbps, there is an overflow, the meters don't work as expected. Cc: Pravin B Shelar <pshelar@ovn.org> Cc: Andy Zhou <azhou@ovn.org> Signed-off-by: Tonghao Zhang <xiangxia.m.yue@gmail.com> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-01svcrdma: Fix leak of transport addressesChuck Lever1-0/+1
[ Upstream commit 1a33d8a284b1e85e03b8c7b1ea8fb985fccd1d71 ] Kernel memory leak detected: unreferenced object 0xffff888849cdf480 (size 8): comm "kworker/u8:3", pid 2086, jiffies 4297898756 (age 4269.856s) hex dump (first 8 bytes): 30 00 cd 49 88 88 ff ff 0..I.... backtrace: [<00000000acfc370b>] __kmalloc_track_caller+0x137/0x183 [<00000000a2724354>] kstrdup+0x2b/0x43 [<0000000082964f84>] xprt_rdma_format_addresses+0x114/0x17d [rpcrdma] [<00000000dfa6ed00>] xprt_setup_rdma_bc+0xc0/0x10c [rpcrdma] [<0000000073051a83>] xprt_create_transport+0x3f/0x1a0 [sunrpc] [<0000000053531a8e>] rpc_create+0x118/0x1cd [sunrpc] [<000000003a51b5f8>] setup_callback_client+0x1a5/0x27d [nfsd] [<000000001bd410af>] nfsd4_process_cb_update.isra.7+0x16c/0x1ac [nfsd] [<000000007f4bbd56>] nfsd4_run_cb_work+0x4c/0xbd [nfsd] [<0000000055c5586b>] process_one_work+0x1b2/0x2fe [<00000000b1e3e8ef>] worker_thread+0x1a6/0x25a [<000000005205fb78>] kthread+0xf6/0xfb [<000000006d2dc057>] ret_from_fork+0x3a/0x50 Introduce a call to xprt_rdma_free_addresses() similar to the way that the TCP backchannel releases a transport's peer address strings. Fixes: 5d252f90a800 ("svcrdma: Add class for RDMA backwards direction transport") Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-01SUNRPC: Fix a potential buffer overflow in 'svc_print_xprts()'Christophe JAILLET1-5/+14
[ Upstream commit b25b60d7bfb02a74bc3c2d998e09aab159df8059 ] 'maxlen' is the total size of the destination buffer. There is only one caller and this value is 256. When we compute the size already used and what we would like to add in the buffer, the trailling NULL character is not taken into account. However, this trailling character will be added by the 'strcat' once we have checked that we have enough place. So, there is a off-by-one issue and 1 byte of the stack could be erroneously overwridden. Take into account the trailling NULL, when checking if there is enough place in the destination buffer. While at it, also replace a 'sprintf' by a safer 'snprintf', check for output truncation and avoid a superfluous 'strlen'. Fixes: dc9a16e49dbba ("svc: Add /proc/sys/sunrpc/transport files") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> [ cel: very minor fix to documenting comment Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-01Bluetooth: L2CAP: handle l2cap config request during open stateHoward Chung1-1/+2
[ Upstream commit 96298f640104e4cd9a913a6e50b0b981829b94ff ] According to Core Spec Version 5.2 | Vol 3, Part A 6.1.5, the incoming L2CAP_ConfigReq should be handled during OPEN state. The section below shows the btmon trace when running L2CAP/COS/CFD/BV-12-C before and after this change. === Before === ... > ACL Data RX: Handle 256 flags 0x02 dlen 12 #22 L2CAP: Connection Request (0x02) ident 2 len 4 PSM: 1 (0x0001) Source CID: 65 < ACL Data TX: Handle 256 flags 0x00 dlen 16 #23 L2CAP: Connection Response (0x03) ident 2 len 8 Destination CID: 64 Source CID: 65 Result: Connection successful (0x0000) Status: No further information available (0x0000) < ACL Data TX: Handle 256 flags 0x00 dlen 12 #24 L2CAP: Configure Request (0x04) ident 2 len 4 Destination CID: 65 Flags: 0x0000 > HCI Event: Number of Completed Packets (0x13) plen 5 #25 Num handles: 1 Handle: 256 Count: 1 > HCI Event: Number of Completed Packets (0x13) plen 5 #26 Num handles: 1 Handle: 256 Count: 1 > ACL Data RX: Handle 256 flags 0x02 dlen 16 #27 L2CAP: Configure Request (0x04) ident 3 len 8 Destination CID: 64 Flags: 0x0000 Option: Unknown (0x10) [hint] 01 00 .. < ACL Data TX: Handle 256 flags 0x00 dlen 18 #28 L2CAP: Configure Response (0x05) ident 3 len 10 Source CID: 65 Flags: 0x0000 Result: Success (0x0000) Option: Maximum Transmission Unit (0x01) [mandatory] MTU: 672 > HCI Event: Number of Completed Packets (0x13) plen 5 #29 Num handles: 1 Handle: 256 Count: 1 > ACL Data RX: Handle 256 flags 0x02 dlen 14 #30 L2CAP: Configure Response (0x05) ident 2 len 6 Source CID: 64 Flags: 0x0000 Result: Success (0x0000) > ACL Data RX: Handle 256 flags 0x02 dlen 20 #31 L2CAP: Configure Request (0x04) ident 3 len 12 Destination CID: 64 Flags: 0x0000 Option: Unknown (0x10) [hint] 01 00 91 02 11 11 ...... < ACL Data TX: Handle 256 flags 0x00 dlen 14 #32 L2CAP: Command Reject (0x01) ident 3 len 6 Reason: Invalid CID in request (0x0002) Destination CID: 64 Source CID: 65 > HCI Event: Number of Completed Packets (0x13) plen 5 #33 Num handles: 1 Handle: 256 Count: 1 ... === After === ... > ACL Data RX: Handle 256 flags 0x02 dlen 12 #22 L2CAP: Connection Request (0x02) ident 2 len 4 PSM: 1 (0x0001) Source CID: 65 < ACL Data TX: Handle 256 flags 0x00 dlen 16 #23 L2CAP: Connection Response (0x03) ident 2 len 8 Destination CID: 64 Source CID: 65 Result: Connection successful (0x0000) Status: No further information available (0x0000) < ACL Data TX: Handle 256 flags 0x00 dlen 12 #24 L2CAP: Configure Request (0x04) ident 2 len 4 Destination CID: 65 Flags: 0x0000 > HCI Event: Number of Completed Packets (0x13) plen 5 #25 Num handles: 1 Handle: 256 Count: 1 > HCI Event: Number of Completed Packets (0x13) plen 5 #26 Num handles: 1 Handle: 256 Count: 1 > ACL Data RX: Handle 256 flags 0x02 dlen 16 #27 L2CAP: Configure Request (0x04) ident 3 len 8 Destination CID: 64 Flags: 0x0000 Option: Unknown (0x10) [hint] 01 00 .. < ACL Data TX: Handle 256 flags 0x00 dlen 18 #28 L2CAP: Configure Response (0x05) ident 3 len 10 Source CID: 65 Flags: 0x0000 Result: Success (0x0000) Option: Maximum Transmission Unit (0x01) [mandatory] MTU: 672 > HCI Event: Number of Completed Packets (0x13) plen 5 #29 Num handles: 1 Handle: 256 Count: 1 > ACL Data RX: Handle 256 flags 0x02 dlen 14 #30 L2CAP: Configure Response (0x05) ident 2 len 6 Source CID: 64 Flags: 0x0000 Result: Success (0x0000) > ACL Data RX: Handle 256 flags 0x02 dlen 20 #31 L2CAP: Configure Request (0x04) ident 3 len 12 Destination CID: 64 Flags: 0x0000 Option: Unknown (0x10) [hint] 01 00 91 02 11 11 ..... < ACL Data TX: Handle 256 flags 0x00 dlen 18 #32 L2CAP: Configure Response (0x05) ident 3 len 10 Source CID: 65 Flags: 0x0000 Result: Success (0x0000) Option: Maximum Transmission Unit (0x01) [mandatory] MTU: 672 < ACL Data TX: Handle 256 flags 0x00 dlen 12 #33 L2CAP: Configure Request (0x04) ident 3 len 4 Destination CID: 65 Flags: 0x0000 > HCI Event: Number of Completed Packets (0x13) plen 5 #34 Num handles: 1 Handle: 256 Count: 1 > HCI Event: Number of Completed Packets (0x13) plen 5 #35 Num handles: 1 Handle: 256 Count: 1 ... Signed-off-by: Howard Chung <howardchung@google.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-01Bluetooth: guard against controllers sending zero'd eventsAlain Michaud1-0/+6
[ Upstream commit 08bb4da90150e2a225f35e0f642cdc463958d696 ] Some controllers have been observed to send zero'd events under some conditions. This change guards against this condition as well as adding a trace to facilitate diagnosability of this condition. Signed-off-by: Alain Michaud <alainm@chromium.org> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-01skbuff: fix a data race in skb_queue_len()Qian Cai1-2/+9
[ Upstream commit 86b18aaa2b5b5bb48e609cd591b3d2d0fdbe0442 ] sk_buff.qlen can be accessed concurrently as noticed by KCSAN, BUG: KCSAN: data-race in __skb_try_recv_from_queue / unix_dgram_sendmsg read to 0xffff8a1b1d8a81c0 of 4 bytes by task 5371 on cpu 96: unix_dgram_sendmsg+0x9a9/0xb70 include/linux/skbuff.h:1821 net/unix/af_unix.c:1761 ____sys_sendmsg+0x33e/0x370 ___sys_sendmsg+0xa6/0xf0 __sys_sendmsg+0x69/0xf0 __x64_sys_sendmsg+0x51/0x70 do_syscall_64+0x91/0xb47 entry_SYSCALL_64_after_hwframe+0x49/0xbe write to 0xffff8a1b1d8a81c0 of 4 bytes by task 1 on cpu 99: __skb_try_recv_from_queue+0x327/0x410 include/linux/skbuff.h:2029 __skb_try_recv_datagram+0xbe/0x220 unix_dgram_recvmsg+0xee/0x850 ____sys_recvmsg+0x1fb/0x210 ___sys_recvmsg+0xa2/0xf0 __sys_recvmsg+0x66/0xf0 __x64_sys_recvmsg+0x51/0x70 do_syscall_64+0x91/0xb47 entry_SYSCALL_64_after_hwframe+0x49/0xbe Since only the read is operating as lockless, it could introduce a logic bug in unix_recvq_full() due to the load tearing. Fix it by adding a lockless variant of skb_queue_len() and unix_recvq_full() where READ_ONCE() is on the read while WRITE_ONCE() is on the write similar to the commit d7d16a89350a ("net: add skb_queue_empty_lockless()"). Signed-off-by: Qian Cai <cai@lca.pw> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-01Bluetooth: prefetch channel before killing sockHillf Danton1-4/+6
[ Upstream commit 2a154903cec20fb64ff4d7d617ca53c16f8fd53a ] Prefetch channel before killing sock in order to fix UAF like BUG: KASAN: use-after-free in l2cap_sock_release+0x24c/0x290 net/bluetooth/l2cap_sock.c:1212 Read of size 8 at addr ffff8880944904a0 by task syz-fuzzer/9751 Reported-by: syzbot+c3c5bdea7863886115dc@syzkaller.appspotmail.com Fixes: 6c08fc896b60 ("Bluetooth: Fix refcount use-after-free issue") Cc: Manish Mandlik <mmandlik@google.com> Signed-off-by: Hillf Danton <hdanton@sina.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-01Bluetooth: Fix refcount use-after-free issueManish Mandlik2-14/+28
[ Upstream commit 6c08fc896b60893c5d673764b0668015d76df462 ] There is no lock preventing both l2cap_sock_release() and chan->ops->close() from running at the same time. If we consider Thread A running l2cap_chan_timeout() and Thread B running l2cap_sock_release(), expected behavior is: A::l2cap_chan_timeout()->l2cap_chan_close()->l2cap_sock_teardown_cb() A::l2cap_chan_timeout()->l2cap_sock_close_cb()->l2cap_sock_kill() B::l2cap_sock_release()->sock_orphan() B::l2cap_sock_release()->l2cap_sock_kill() where, sock_orphan() clears "sk->sk_socket" and l2cap_sock_teardown_cb() marks socket as SOCK_ZAPPED. In l2cap_sock_kill(), there is an "if-statement" that checks if both sock_orphan() and sock_teardown() has been run i.e. sk->sk_socket is NULL and socket is marked as SOCK_ZAPPED. Socket is killed if the condition is satisfied. In the race condition, following occurs: A::l2cap_chan_timeout()->l2cap_chan_close()->l2cap_sock_teardown_cb() B::l2cap_sock_release()->sock_orphan() B::l2cap_sock_release()->l2cap_sock_kill() A::l2cap_chan_timeout()->l2cap_sock_close_cb()->l2cap_sock_kill() In this scenario, "if-statement" is true in both B::l2cap_sock_kill() and A::l2cap_sock_kill() and we hit "refcount: underflow; use-after-free" bug. Similar condition occurs at other places where teardown/sock_kill is happening: l2cap_disconnect_rsp()->l2cap_chan_del()->l2cap_sock_teardown_cb() l2cap_disconnect_rsp()->l2cap_sock_close_cb()->l2cap_sock_kill() l2cap_conn_del()->l2cap_chan_del()->l2cap_sock_teardown_cb() l2cap_conn_del()->l2cap_sock_close_cb()->l2cap_sock_kill() l2cap_disconnect_req()->l2cap_chan_del()->l2cap_sock_teardown_cb() l2cap_disconnect_req()->l2cap_sock_close_cb()->l2cap_sock_kill() l2cap_sock_cleanup_listen()->l2cap_chan_close()->l2cap_sock_teardown_cb() l2cap_sock_cleanup_listen()->l2cap_sock_kill() Protect teardown/sock_kill and orphan/sock_kill by adding hold_lock on l2cap channel to ensure that the socket is killed only after marked as zapped and orphan. Signed-off-by: Manish Mandlik <mmandlik@google.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-01sctp: move trace_sctp_probe_path into sctp_outq_sackKevin Kou1-0/+6
[ Upstream commit f643ee295c1c63bc117fb052d4da681354d6f732 ] The original patch bringed in the "SCTP ACK tracking trace event" feature was committed at Dec.20, 2017, it replaced jprobe usage with trace events, and bringed in two trace events, one is TRACE_EVENT(sctp_probe), another one is TRACE_EVENT(sctp_probe_path). The original patch intended to trigger the trace_sctp_probe_path in TRACE_EVENT(sctp_probe) as below code, +TRACE_EVENT(sctp_probe, + + TP_PROTO(const struct sctp_endpoint *ep, + const struct sctp_association *asoc, + struct sctp_chunk *chunk), + + TP_ARGS(ep, asoc, chunk), + + TP_STRUCT__entry( + __field(__u64, asoc) + __field(__u32, mark) + __field(__u16, bind_port) + __field(__u16, peer_port) + __field(__u32, pathmtu) + __field(__u32, rwnd) + __field(__u16, unack_data) + ), + + TP_fast_assign( + struct sk_buff *skb = chunk->skb; + + __entry->asoc = (unsigned long)asoc; + __entry->mark = skb->mark; + __entry->bind_port = ep->base.bind_addr.port; + __entry->peer_port = asoc->peer.port; + __entry->pathmtu = asoc->pathmtu; + __entry->rwnd = asoc->peer.rwnd; + __entry->unack_data = asoc->unack_data; + + if (trace_sctp_probe_path_enabled()) { + struct sctp_transport *sp; + + list_for_each_entry(sp, &asoc->peer.transport_addr_list, + transports) { + trace_sctp_probe_path(sp, asoc); + } + } + ), But I found it did not work when I did testing, and trace_sctp_probe_path had no output, I finally found that there is trace buffer lock operation(trace_event_buffer_reserve) in include/trace/trace_events.h: static notrace void \ trace_event_raw_event_##call(void *__data, proto) \ { \ struct trace_event_file *trace_file = __data; \ struct trace_event_data_offsets_##call __maybe_unused __data_offsets;\ struct trace_event_buffer fbuffer; \ struct trace_event_raw_##call *entry; \ int __data_size; \ \ if (trace_trigger_soft_disabled(trace_file)) \ return; \ \ __data_size = trace_event_get_offsets_##call(&__data_offsets, args); \ \ entry = trace_event_buffer_reserve(&fbuffer, trace_file, \ sizeof(*entry) + __data_size); \ \ if (!entry) \ return; \ \ tstruct \ \ { assign; } \ \ trace_event_buffer_commit(&fbuffer); \ } The reason caused no output of trace_sctp_probe_path is that trace_sctp_probe_path written in TP_fast_assign part of TRACE_EVENT(sctp_probe), and it will be placed( { assign; } ) after the trace_event_buffer_reserve() when compiler expands Macro, entry = trace_event_buffer_reserve(&fbuffer, trace_file, \ sizeof(*entry) + __data_size); \ \ if (!entry) \ return; \ \ tstruct \ \ { assign; } \ so trace_sctp_probe_path finally can not acquire trace_event_buffer and return no output, that is to say the nest of tracepoint entry function is not allowed. The function call flow is: trace_sctp_probe() -> trace_event_raw_event_sctp_probe() -> lock buffer -> trace_sctp_probe_path() -> trace_event_raw_event_sctp_probe_path() --nested -> buffer has been locked and return no output. This patch is to remove trace_sctp_probe_path from the TP_fast_assign part of TRACE_EVENT(sctp_probe) to avoid the nest of entry function, and trigger sctp_probe_path_trace in sctp_outq_sack. After this patch, you can enable both events individually, # cd /sys/kernel/debug/tracing # echo 1 > events/sctp/sctp_probe/enable # echo 1 > events/sctp/sctp_probe_path/enable Or, you can enable all the events under sctp. # echo 1 > events/sctp/enable Signed-off-by: Kevin Kou <qdkevin.kou@gmail.com> Acked-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-01ipv6_route_seq_next should increase position indexVasily Averin1-5/+2
[ Upstream commit 4fc427e0515811250647d44de38d87d7b0e0790f ] if seq_file .next fuction does not change position index, read after some lseek can generate unexpected output. https://bugzilla.kernel.org/show_bug.cgi?id=206283 Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-01rt_cpu_seq_next should increase position indexVasily Averin1-0/+1
[ Upstream commit a3ea86739f1bc7e121d921842f0f4a8ab1af94d9 ] if seq_file .next fuction does not change position index, read after some lseek can generate unexpected output. https://bugzilla.kernel.org/show_bug.cgi?id=206283 Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-01neigh_stat_seq_next() should increase position indexVasily Averin1-0/+1
[ Upstream commit 1e3f9f073c47bee7c23e77316b07bc12338c5bba ] if seq_file .next fuction does not change position index, read after some lseek can generate unexpected output. https://bugzilla.kernel.org/show_bug.cgi?id=206283 Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-01net: silence data-races on sk_backlog.tailEric Dumazet2-2/+2
[ Upstream commit 9ed498c6280a2f2b51d02df96df53037272ede49 ] sk->sk_backlog.tail might be read without holding the socket spinlock, we need to add proper READ_ONCE()/WRITE_ONCE() to silence the warnings. KCSAN reported : BUG: KCSAN: data-race in tcp_add_backlog / tcp_recvmsg write to 0xffff8881265109f8 of 8 bytes by interrupt on cpu 1: __sk_add_backlog include/net/sock.h:907 [inline] sk_add_backlog include/net/sock.h:938 [inline] tcp_add_backlog+0x476/0xce0 net/ipv4/tcp_ipv4.c:1759 tcp_v4_rcv+0x1a70/0x1bd0 net/ipv4/tcp_ipv4.c:1947 ip_protocol_deliver_rcu+0x4d/0x420 net/ipv4/ip_input.c:204 ip_local_deliver_finish+0x110/0x140 net/ipv4/ip_input.c:231 NF_HOOK include/linux/netfilter.h:305 [inline] NF_HOOK include/linux/netfilter.h:299 [inline] ip_local_deliver+0x133/0x210 net/ipv4/ip_input.c:252 dst_input include/net/dst.h:442 [inline] ip_rcv_finish+0x121/0x160 net/ipv4/ip_input.c:413 NF_HOOK include/linux/netfilter.h:305 [inline] NF_HOOK include/linux/netfilter.h:299 [inline] ip_rcv+0x18f/0x1a0 net/ipv4/ip_input.c:523 __netif_receive_skb_one_core+0xa7/0xe0 net/core/dev.c:4929 __netif_receive_skb+0x37/0xf0 net/core/dev.c:5043 netif_receive_skb_internal+0x59/0x190 net/core/dev.c:5133 napi_skb_finish net/core/dev.c:5596 [inline] napi_gro_receive+0x28f/0x330 net/core/dev.c:5629 receive_buf+0x284/0x30b0 drivers/net/virtio_net.c:1061 virtnet_receive drivers/net/virtio_net.c:1323 [inline] virtnet_poll+0x436/0x7d0 drivers/net/virtio_net.c:1428 napi_poll net/core/dev.c:6311 [inline] net_rx_action+0x3ae/0xa90 net/core/dev.c:6379 __do_softirq+0x115/0x33f kernel/softirq.c:292 invoke_softirq kernel/softirq.c:373 [inline] irq_exit+0xbb/0xe0 kernel/softirq.c:413 exiting_irq arch/x86/include/asm/apic.h:536 [inline] do_IRQ+0xa6/0x180 arch/x86/kernel/irq.c:263 ret_from_intr+0x0/0x19 native_safe_halt+0xe/0x10 arch/x86/kernel/paravirt.c:71 arch_cpu_idle+0x1f/0x30 arch/x86/kernel/process.c:571 default_idle_call+0x1e/0x40 kernel/sched/idle.c:94 cpuidle_idle_call kernel/sched/idle.c:154 [inline] do_idle+0x1af/0x280 kernel/sched/idle.c:263 cpu_startup_entry+0x1b/0x20 kernel/sched/idle.c:355 start_secondary+0x208/0x260 arch/x86/kernel/smpboot.c:264 secondary_startup_64+0xa4/0xb0 arch/x86/kernel/head_64.S:241 read to 0xffff8881265109f8 of 8 bytes by task 8057 on cpu 0: tcp_recvmsg+0x46e/0x1b40 net/ipv4/tcp.c:2050 inet_recvmsg+0xbb/0x250 net/ipv4/af_inet.c:838 sock_recvmsg_nosec net/socket.c:871 [inline] sock_recvmsg net/socket.c:889 [inline] sock_recvmsg+0x92/0xb0 net/socket.c:885 sock_read_iter+0x15f/0x1e0 net/socket.c:967 call_read_iter include/linux/fs.h:1889 [inline] new_sync_read+0x389/0x4f0 fs/read_write.c:414 __vfs_read+0xb1/0xc0 fs/read_write.c:427 vfs_read fs/read_write.c:461 [inline] vfs_read+0x143/0x2c0 fs/read_write.c:446 ksys_read+0xd5/0x1b0 fs/read_write.c:587 __do_sys_read fs/read_write.c:597 [inline] __se_sys_read fs/read_write.c:595 [inline] __x64_sys_read+0x4c/0x60 fs/read_write.c:595 do_syscall_64+0xcc/0x370 arch/x86/entry/common.c:290 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Reported by Kernel Concurrency Sanitizer on: CPU: 0 PID: 8057 Comm: syz-fuzzer Not tainted 5.4.0-rc6+ #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-09-26tcp_bbr: adapt cwnd based on ack aggregation estimationPriyaranjan Jha1-1/+121
commit 78dc70ebaa38aa303274e333be6c98eef87619e2 upstream. Aggregation effects are extremely common with wifi, cellular, and cable modem link technologies, ACK decimation in middleboxes, and LRO and GRO in receiving hosts. The aggregation can happen in either direction, data or ACKs, but in either case the aggregation effect is visible to the sender in the ACK stream. Previously BBR's sending was often limited by cwnd under severe ACK aggregation/decimation because BBR sized the cwnd at 2*BDP. If packets were acked in bursts after long delays (e.g. one ACK acking 5*BDP after 5*RTT), BBR's sending was halted after sending 2*BDP over 2*RTT, leaving the bottleneck idle for potentially long periods. Note that loss-based congestion control does not have this issue because when facing aggregation it continues increasing cwnd after bursts of ACKs, growing cwnd until the buffer is full. To achieve good throughput in the presence of aggregation effects, this algorithm allows the BBR sender to put extra data in flight to keep the bottleneck utilized during silences in the ACK stream that it has evidence to suggest were caused by aggregation. A summary of the algorithm: when a burst of packets are acked by a stretched ACK or a burst of ACKs or both, BBR first estimates the expected amount of data that should have been acked, based on its estimated bandwidth. Then the surplus ("extra_acked") is recorded in a windowed-max filter to estimate the recent level of observed ACK aggregation. Then cwnd is increased by the ACK aggregation estimate. The larger cwnd avoids BBR being cwnd-limited in the face of ACK silences that recent history suggests were caused by aggregation. As a sanity check, the ACK aggregation degree is upper-bounded by the cwnd (at the time of measurement) and a global max of BW * 100ms. The algorithm is further described by the following presentation: https://datatracker.ietf.org/meeting/101/materials/slides-101-iccrg-an-update-on-bbr-work-at-google-00 In our internal testing, we observed a significant increase in BBR throughput (measured using netperf), in a basic wifi setup. - Host1 (sender on ethernet) -> AP -> Host2 (receiver on wifi) - 2.4 GHz -> BBR before: ~73 Mbps; BBR after: ~102 Mbps; CUBIC: ~100 Mbps - 5.0 GHz -> BBR before: ~362 Mbps; BBR after: ~593 Mbps; CUBIC: ~601 Mbps Also, this code is running globally on YouTube TCP connections and produced significant bandwidth increases for YouTube traffic. This is based on Ian Swett's max_ack_height_ algorithm from the QUIC BBR implementation. Signed-off-by: Priyaranjan Jha <priyarjha@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-09-26tcp_bbr: refactor bbr_target_cwnd() for general inflight provisioningPriyaranjan Jha1-21/+39
commit 232aa8ec3ed979d4716891540c03a806ecab0c37 upstream. Because bbr_target_cwnd() is really a general-purpose BBR helper for computing some volume of inflight data as a function of the estimated BDP, refactor it into following helper functions: - bbr_bdp() - bbr_quantization_budget() - bbr_inflight() Signed-off-by: Priyaranjan Jha <priyarjha@google.com> Signed-off-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-09-26ipv4: Update exception handling for multipath routes via same deviceDavid Ahern1-5/+8
[ Upstream commit 2fbc6e89b2f1403189e624cabaf73e189c5e50c6 ] Kfir reported that pmtu exceptions are not created properly for deployments where multipath routes use the same device. After some digging I see 2 compounding problems: 1. ip_route_output_key_hash_rcu is updating the flowi4_oif *after* the route lookup. This is the second use case where this has been a problem (the first is related to use of vti devices with VRF). I can not find any reason for the oif to be changed after the lookup; the code goes back to the start of git. It does not seem logical so remove it. 2. fib_lookups for exceptions do not call fib_select_path to handle multipath route selection based on the hash. The end result is that the fib_lookup used to add the exception always creates it based using the first leg of the route. An example topology showing the problem: | host1 +------+ | eth0 | .209 +------+ | +------+ switch | br0 | +------+ | +---------+---------+ | host2 | host3 +------+ +------+ | eth0 | .250 | eth0 | 192.168.252.252 +------+ +------+ +-----+ +-----+ | vti | .2 | vti | 192.168.247.3 +-----+ +-----+ \ / ================================= tunnels 192.168.247.1/24 for h in host1 host2 host3; do ip netns add ${h} ip -netns ${h} link set lo up ip netns exec ${h} sysctl -wq net.ipv4.ip_forward=1 done ip netns add switch ip -netns switch li set lo up ip -netns switch link add br0 type bridge stp 0 ip -netns switch link set br0 up for n in 1 2 3; do ip -netns switch link add eth-sw type veth peer name eth-h${n} ip -netns switch li set eth-h${n} master br0 up ip -netns switch li set eth-sw netns host${n} name eth0 done ip -netns host1 addr add 192.168.252.209/24 dev eth0 ip -netns host1 link set dev eth0 up ip -netns host1 route add 192.168.247.0/24 \ nexthop via 192.168.252.250 dev eth0 nexthop via 192.168.252.252 dev eth0 ip -netns host2 addr add 192.168.252.250/24 dev eth0 ip -netns host2 link set dev eth0 up ip -netns host2 addr add 192.168.252.252/24 dev eth0 ip -netns host3 link set dev eth0 up ip netns add tunnel ip -netns tunnel li set lo up ip -netns tunnel li add br0 type bridge ip -netns tunnel li set br0 up for n in $(seq 11 20); do ip -netns tunnel addr add dev br0 192.168.247.${n}/24 done for n in 2 3 do ip -netns tunnel link add vti${n} type veth peer name eth${n} ip -netns tunnel link set eth${n} mtu 1360 master br0 up ip -netns tunnel link set vti${n} netns host${n} mtu 1360 up ip -netns host${n} addr add dev vti${n} 192.168.247.${n}/24 done ip -netns tunnel ro add default nexthop via 192.168.247.2 nexthop via 192.168.247.3 ip netns exec host1 ping -M do -s 1400 -c3 -I 192.168.252.209 192.168.247.11 ip netns exec host1 ping -M do -s 1400 -c3 -I 192.168.252.209 192.168.247.15 ip -netns host1 ro ls cache Before this patch the cache always shows exceptions against the first leg in the multipath route; 192.168.252.250 per this example. Since the hash has an initial random seed, you may need to vary the final octet more than what is listed. In my tests, using addresses between 11 and 19 usually found 1 that used both legs. With this patch, the cache will have exceptions for both legs. Fixes: 4895c771c7f0 ("ipv4: Add FIB nexthop exceptions") Reported-by: Kfir Itzhak <mastertheknife@gmail.com> Signed-off-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-09-26net: qrtr: check skb_put_padto() return valueEric Dumazet1-9/+11
[ Upstream commit 3ca1a42a52ca4b4f02061683851692ad65fefac8 ] If skb_put_padto() returns an error, skb has been freed. Better not touch it anymore, as reported by syzbot [1] Note to qrtr maintainers : this suggests qrtr_sendmsg() should adjust sock_alloc_send_skb() second parameter to account for the potential added alignment to avoid reallocation. [1] BUG: KASAN: use-after-free in __skb_insert include/linux/skbuff.h:1907 [inline] BUG: KASAN: use-after-free in __skb_queue_before include/linux/skbuff.h:2016 [inline] BUG: KASAN: use-after-free in __skb_queue_tail include/linux/skbuff.h:2049 [inline] BUG: KASAN: use-after-free in skb_queue_tail+0x6b/0x120 net/core/skbuff.c:3146 Write of size 8 at addr ffff88804d8ab3c0 by task syz-executor.4/4316 CPU: 1 PID: 4316 Comm: syz-executor.4 Not tainted 5.9.0-rc4-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1d6/0x29e lib/dump_stack.c:118 print_address_description+0x66/0x620 mm/kasan/report.c:383 __kasan_report mm/kasan/report.c:513 [inline] kasan_report+0x132/0x1d0 mm/kasan/report.c:530 __skb_insert include/linux/skbuff.h:1907 [inline] __skb_queue_before include/linux/skbuff.h:2016 [inline] __skb_queue_tail include/linux/skbuff.h:2049 [inline] skb_queue_tail+0x6b/0x120 net/core/skbuff.c:3146 qrtr_tun_send+0x1a/0x40 net/qrtr/tun.c:23 qrtr_node_enqueue+0x44f/0xc00 net/qrtr/qrtr.c:364 qrtr_bcast_enqueue+0xbe/0x140 net/qrtr/qrtr.c:861 qrtr_sendmsg+0x680/0x9c0 net/qrtr/qrtr.c:960 sock_sendmsg_nosec net/socket.c:651 [inline] sock_sendmsg net/socket.c:671 [inline] sock_write_iter+0x317/0x470 net/socket.c:998 call_write_iter include/linux/fs.h:1882 [inline] new_sync_write fs/read_write.c:503 [inline] vfs_write+0xa96/0xd10 fs/read_write.c:578 ksys_write+0x11b/0x220 fs/read_write.c:631 do_syscall_64+0x31/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xa9 RIP: 0033:0x45d5b9 Code: 5d b4 fb ff c3 66 2e 0f 1f 84 00 00 00 00 00 66 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 0f 83 2b b4 fb ff c3 66 2e 0f 1f 84 00 00 00 00 RSP: 002b:00007f84b5b81c78 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 RAX: ffffffffffffffda RBX: 0000000000038b40 RCX: 000000000045d5b9 RDX: 0000000000000055 RSI: 0000000020001240 RDI: 0000000000000003 RBP: 00007f84b5b81ca0 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 000000000000000f R13: 00007ffcbbf86daf R14: 00007f84b5b829c0 R15: 000000000118cf4c Allocated by task 4316: kasan_save_stack mm/kasan/common.c:48 [inline] kasan_set_track mm/kasan/common.c:56 [inline] __kasan_kmalloc+0x100/0x130 mm/kasan/common.c:461 slab_post_alloc_hook+0x3e/0x290 mm/slab.h:518 slab_alloc mm/slab.c:3312 [inline] kmem_cache_alloc+0x1c1/0x2d0 mm/slab.c:3482 skb_clone+0x1b2/0x370 net/core/skbuff.c:1449 qrtr_bcast_enqueue+0x6d/0x140 net/qrtr/qrtr.c:857 qrtr_sendmsg+0x680/0x9c0 net/qrtr/qrtr.c:960 sock_sendmsg_nosec net/socket.c:651 [inline] sock_sendmsg net/socket.c:671 [inline] sock_write_iter+0x317/0x470 net/socket.c:998 call_write_iter include/linux/fs.h:1882 [inline] new_sync_write fs/read_write.c:503 [inline] vfs_write+0xa96/0xd10 fs/read_write.c:578 ksys_write+0x11b/0x220 fs/read_write.c:631 do_syscall_64+0x31/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xa9 Freed by task 4316: kasan_save_stack mm/kasan/common.c:48 [inline] kasan_set_track+0x3d/0x70 mm/kasan/common.c:56 kasan_set_free_info+0x17/0x30 mm/kasan/generic.c:355 __kasan_slab_free+0xdd/0x110 mm/kasan/common.c:422 __cache_free mm/slab.c:3418 [inline] kmem_cache_free+0x82/0xf0 mm/slab.c:3693 __skb_pad+0x3f5/0x5a0 net/core/skbuff.c:1823 __skb_put_padto include/linux/skbuff.h:3233 [inline] skb_put_padto include/linux/skbuff.h:3252 [inline] qrtr_node_enqueue+0x62f/0xc00 net/qrtr/qrtr.c:360 qrtr_bcast_enqueue+0xbe/0x140 net/qrtr/qrtr.c:861 qrtr_sendmsg+0x680/0x9c0 net/qrtr/qrtr.c:960 sock_sendmsg_nosec net/socket.c:651 [inline] sock_sendmsg net/socket.c:671 [inline] sock_write_iter+0x317/0x470 net/socket.c:998 call_write_iter include/linux/fs.h:1882 [inline] new_sync_write fs/read_write.c:503 [inline] vfs_write+0xa96/0xd10 fs/read_write.c:578 ksys_write+0x11b/0x220 fs/read_write.c:631 do_syscall_64+0x31/0x70 arch/x86/entry/common.c:46 entry_SYSCALL_64_after_hwframe+0x44/0xa9 The buggy address belongs to the object at ffff88804d8ab3c0 which belongs to the cache skbuff_head_cache of size 224 The buggy address is located 0 bytes inside of 224-byte region [ffff88804d8ab3c0, ffff88804d8ab4a0) The buggy address belongs to the page: page:00000000ea8cccfb refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff88804d8abb40 pfn:0x4d8ab flags: 0xfffe0000000200(slab) raw: 00fffe0000000200 ffffea0002237ec8 ffffea00029b3388 ffff88821bb66800 raw: ffff88804d8abb40 ffff88804d8ab000 000000010000000b 0000000000000000 page dumped because: kasan: bad access detected Fixes: ce57785bf91b ("net: qrtr: fix len of skb_put_padto in qrtr_node_enqueue") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Cc: Carl Huang <cjhuang@codeaurora.org> Cc: Wen Gong <wgong@codeaurora.org> Cc: Bjorn Andersson <bjorn.andersson@linaro.org> Cc: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Acked-by: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org> Reviewed-by: Bjorn Andersson <bjorn.andersson@linaro.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-09-26tipc: use skb_unshare() instead in tipc_buf_append()Xin Long1-1/+2
[ Upstream commit ff48b6222e65ebdba5a403ef1deba6214e749193 ] In tipc_buf_append() it may change skb's frag_list, and it causes problems when this skb is cloned. skb_unclone() doesn't really make this skb's flag_list available to change. Shuang Li has reported an use-after-free issue because of this when creating quite a few macvlan dev over the same dev, where the broadcast packets will be cloned and go up to the stack: [ ] BUG: KASAN: use-after-free in pskb_expand_head+0x86d/0xea0 [ ] Call Trace: [ ] dump_stack+0x7c/0xb0 [ ] print_address_description.constprop.7+0x1a/0x220 [ ] kasan_report.cold.10+0x37/0x7c [ ] check_memory_region+0x183/0x1e0 [ ] pskb_expand_head+0x86d/0xea0 [ ] process_backlog+0x1df/0x660 [ ] net_rx_action+0x3b4/0xc90 [ ] [ ] Allocated by task 1786: [ ] kmem_cache_alloc+0xbf/0x220 [ ] skb_clone+0x10a/0x300 [ ] macvlan_broadcast+0x2f6/0x590 [macvlan] [ ] macvlan_process_broadcast+0x37c/0x516 [macvlan] [ ] process_one_work+0x66a/0x1060 [ ] worker_thread+0x87/0xb10 [ ] [ ] Freed by task 3253: [ ] kmem_cache_free+0x82/0x2a0 [ ] skb_release_data+0x2c3/0x6e0 [ ] kfree_skb+0x78/0x1d0 [ ] tipc_recvmsg+0x3be/0xa40 [tipc] So fix it by using skb_unshare() instead, which would create a new skb for the cloned frag and it'll be safe to change its frag_list. The similar things were also done in sctp_make_reassembled_event(), which is using skb_copy(). Reported-by: Shuang Li <shuali@redhat.com> Fixes: 37e22164a8a3 ("tipc: rename and move message reassembly function") Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-09-26tipc: fix shutdown() of connection oriented socketTetsuo Handa1-4/+1
[ Upstream commit a4b5cc9e10803ecba64a7d54c0f47e4564b4a980 ] I confirmed that the problem fixed by commit 2a63866c8b51a3f7 ("tipc: fix shutdown() of connectionless socket") also applies to stream socket. ---------- #include <sys/socket.h> #include <unistd.h> #include <sys/wait.h> int main(int argc, char *argv[]) { int fds[2] = { -1, -1 }; socketpair(PF_TIPC, SOCK_STREAM /* or SOCK_DGRAM */, 0, fds); if (fork() == 0) _exit(read(fds[0], NULL, 1)); shutdown(fds[0], SHUT_RDWR); /* This must make read() return. */ wait(NULL); /* To be woken up by _exit(). */ return 0; } ---------- Since shutdown(SHUT_RDWR) should affect all processes sharing that socket, unconditionally setting sk->sk_shutdown to SHUTDOWN_MASK will be the right behavior. Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-09-26tipc: Fix memory leak in tipc_group_create_member()Peilin Ye1-4/+10
[ Upstream commit bb3a420d47ab00d7e1e5083286cab15235a96680 ] tipc_group_add_to_tree() returns silently if `key` matches `nkey` of an existing node, causing tipc_group_create_member() to leak memory. Let tipc_group_add_to_tree() return an error in such a case, so that tipc_group_create_member() can handle it properly. Fixes: 75da2163dbb6 ("tipc: introduce communication groups") Reported-and-tested-by: syzbot+f95d90c454864b3b5bc9@syzkaller.appspotmail.com Cc: Hillf Danton <hdanton@sina.com> Link: https://syzkaller.appspot.com/bug?id=048390604fe1b60df34150265479202f10e13aff Signed-off-by: Peilin Ye <yepeilin.cs@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-09-26net: sch_generic: aviod concurrent reset and enqueue op for lockless qdiscYunsheng Lin1-16/+33
[ Upstream commit 2fb541c862c987d02dfdf28f1545016deecfa0d5 ] Currently there is concurrent reset and enqueue operation for the same lockless qdisc when there is no lock to synchronize the q->enqueue() in __dev_xmit_skb() with the qdisc reset operation in qdisc_deactivate() called by dev_deactivate_queue(), which may cause out-of-bounds access for priv->ring[] in hns3 driver if user has requested a smaller queue num when __dev_xmit_skb() still enqueue a skb with a larger queue_mapping after the corresponding qdisc is reset, and call hns3_nic_net_xmit() with that skb later. Reused the existing synchronize_net() in dev_deactivate_many() to make sure skb with larger queue_mapping enqueued to old qdisc(which is saved in dev_queue->qdisc_sleeping) will always be reset when dev_reset_queue() is called. Fixes: 6b3ba9146fe6 ("net: sched: allow qdiscs to handle locking") Signed-off-by: Yunsheng Lin <linyunsheng@huawei.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-09-26net: ipv6: fix kconfig dependency warning for IPV6_SEG6_HMACNecip Fazil Yildiran1-0/+1
[ Upstream commit db7cd91a4be15e1485d6b58c6afc8761c59c4efb ] When IPV6_SEG6_HMAC is enabled and CRYPTO is disabled, it results in the following Kbuild warning: WARNING: unmet direct dependencies detected for CRYPTO_HMAC Depends on [n]: CRYPTO [=n] Selected by [y]: - IPV6_SEG6_HMAC [=y] && NET [=y] && INET [=y] && IPV6 [=y] WARNING: unmet direct dependencies detected for CRYPTO_SHA1 Depends on [n]: CRYPTO [=n] Selected by [y]: - IPV6_SEG6_HMAC [=y] && NET [=y] && INET [=y] && IPV6 [=y] WARNING: unmet direct dependencies detected for CRYPTO_SHA256 Depends on [n]: CRYPTO [=n] Selected by [y]: - IPV6_SEG6_HMAC [=y] && NET [=y] && INET [=y] && IPV6 [=y] The reason is that IPV6_SEG6_HMAC selects CRYPTO_HMAC, CRYPTO_SHA1, and CRYPTO_SHA256 without depending on or selecting CRYPTO while those configs are subordinate to CRYPTO. Honor the kconfig menu hierarchy to remove kconfig dependency warnings. Fixes: bf355b8d2c30 ("ipv6: sr: add core files for SR HMAC support") Signed-off-by: Necip Fazil Yildiran <fazilyildiran@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: