Age | Commit message (Collapse) | Author | Files | Lines |
|
Handle the few cases that need special treatment in-line using
in_compat_syscall(). This also removes all the now unused
compat_{get,set}sockopt methods.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Handle the few cases that need special treatment in-line using
in_compat_syscall().
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add the compat handling to sock_common_{get,set}sockopt instead,
keyed of in_compat_syscall(). This allow to remove the now unused
->compat_{get,set}sockopt methods from struct proto_ops.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Acked-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Acked-by: Stefan Schmidt <stefan@datenfreihafen.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add a helper that copies either a native or compat bpf_fprog from
userspace after verifying the length, and remove the compat setsockopt
handlers that now aren't required.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This reverts commit aebe4426ccaa4838f36ea805cdf7d76503e65117.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Mirred currently does not mix well with blocks executed after the qdisc
root lock is taken. This includes classification blocks (such as in PRIO,
ETS, DRR qdiscs) and qevents. The locking caused by the packet mirrored by
mirred can cause deadlocks: either when the thread of execution attempts to
take the lock a second time, or when two threads end up waiting on each
other's locks.
The qevent patchset attempted to not introduce further badness of this
sort, and dropped the lock before executing the qevent block. However this
lead to too little locking and races between qdisc configuration and packet
enqueue in the RED qdisc.
Before the deadlock issues are solved in a way that can be applied across
many qdiscs reasonably easily, do for qevents what is done for the
classification blocks and just keep holding the root lock.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Using uninitialized_var() is dangerous as it papers over real bugs[1]
(or can in the future), and suppresses unrelated compiler warnings
(e.g. "unused variable"). If the compiler thinks it is uninitialized,
either simply initialize the variable or make compiler changes.
In preparation for removing[2] the[3] macro[4], remove all remaining
needless uses with the following script:
git grep '\buninitialized_var\b' | cut -d: -f1 | sort -u | \
xargs perl -pi -e \
's/\buninitialized_var\(([^\)]+)\)/\1/g;
s:\s*/\* (GCC be quiet|to make compiler happy) \*/$::g;'
drivers/video/fbdev/riva/riva_hw.c was manually tweaked to avoid
pathological white-space.
No outstanding warnings were found building allmodconfig with GCC 9.3.0
for x86_64, i386, arm64, arm, powerpc, powerpc64le, s390x, mips, sparc64,
alpha, and m68k.
[1] https://lore.kernel.org/lkml/20200603174714.192027-1-glider@google.com/
[2] https://lore.kernel.org/lkml/CA+55aFw+Vbj0i=1TGqCR5vQkCzWJ0QxK6CernOU6eedsudAixw@mail.gmail.com/
[3] https://lore.kernel.org/lkml/CA+55aFwgbgqhbp1fkxvRKEpzyR5J8n1vKT1VZdz9knmPuXhOeg@mail.gmail.com/
[4] https://lore.kernel.org/lkml/CA+55aFz2500WfbKXAx8s67wrm9=yVJu65TpLgN_ybYNv0VEOKA@mail.gmail.com/
Reviewed-by: Leon Romanovsky <leonro@mellanox.com> # drivers/infiniband and mlx4/mlx5
Acked-by: Jason Gunthorpe <jgg@mellanox.com> # IB
Acked-by: Kalle Valo <kvalo@codeaurora.org> # wireless drivers
Reviewed-by: Chao Yu <yuchao0@huawei.com> # erofs
Signed-off-by: Kees Cook <keescook@chromium.org>
|
|
Introduce XDP_REDIRECT support for eBPF programs attached to cpumap
entries.
This patch has been tested on Marvell ESPRESSObin using a modified
version of xdp_redirect_cpu sample in order to attach a XDP program
to CPUMAP entries to perform a redirect on the mvneta interface.
In particular the following scenario has been tested:
rq (cpu0) --> mvneta - XDP_REDIRECT (cpu0) --> CPUMAP - XDP_REDIRECT (cpu1) --> mvneta
$./xdp_redirect_cpu -p xdp_cpu_map0 -d eth0 -c 1 -e xdp_redirect \
-f xdp_redirect_kern.o -m tx_port -r eth0
tx: 285.2 Kpps rx: 285.2 Kpps
Attaching a simple XDP program on eth0 to perform XDP_TX gives
comparable results:
tx: 288.4 Kpps rx: 288.4 Kpps
Co-developed-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Link: https://lore.kernel.org/bpf/2cf8373a731867af302b00c4ff16c122630c4980.1594734381.git.lorenzo@kernel.org
|
|
Introduce the capability to attach an eBPF program to cpumap entries.
The idea behind this feature is to add the possibility to define on
which CPU run the eBPF program if the underlying hw does not support
RSS. Current supported verdicts are XDP_DROP and XDP_PASS.
This patch has been tested on Marvell ESPRESSObin using xdp_redirect_cpu
sample available in the kernel tree to identify possible performance
regressions. Results show there are no observable differences in
packet-per-second:
$./xdp_redirect_cpu --progname xdp_cpu_map0 --dev eth0 --cpu 1
rx: 354.8 Kpps
rx: 356.0 Kpps
rx: 356.8 Kpps
rx: 356.3 Kpps
rx: 356.6 Kpps
rx: 356.6 Kpps
rx: 356.7 Kpps
rx: 355.8 Kpps
rx: 356.8 Kpps
rx: 356.8 Kpps
Co-developed-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Link: https://lore.kernel.org/bpf/5c9febdf903d810b3415732e5cd98491d7d9067a.1594734381.git.lorenzo@kernel.org
|
|
Move the guts of xdp_convert_buff_to_frame to a new helper,
xdp_update_frame_from_buff so it can be reused removing code duplication
Suggested-by: Jesper Dangaard Brouer <brouer@redhat.com>
Co-developed-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: David Ahern <dsahern@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Link: https://lore.kernel.org/bpf/90a68c283d7ebeb48924934c9b7ac79492300472.1594734381.git.lorenzo@kernel.org
|
|
Starting with the upgrade to v5.8-rc3, I've noticed I wasn't able to
connect to my Bluetooth headset properly anymore. While connecting to
the device would eventually succeed, bluetoothd seemed to be confused
about the current connection state where the state was flapping hence
and forth. Bisecting this issue led to commit 3ca44c16b0dc (Bluetooth:
Consolidate encryption handling in hci_encrypt_cfm, 2020-05-19), which
refactored `hci_encrypt_cfm` to also handle updating the connection
state.
The commit in question changed the code to call `hci_connect_cfm` inside
`hci_encrypt_cfm` and to change the connection state. But with the
conversion, we now only update the connection state if a status was set
already. In fact, the reverse should be true: the status should be
updated if no status is yet set. So let's fix the isuse by reversing the
condition.
Fixes: 3ca44c16b0dc ("Bluetooth: Consolidate encryption handling in hci_encrypt_cfm")
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Acked-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
|
|
Drop doubled words in several comments.
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: netdev@vger.kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Drop doubled word "the" in a comment.
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: netdev@vger.kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Drop doubled word "to" in a comment.
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: netdev@vger.kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Drop doubled words "or" and "the" in several comments.
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: netdev@vger.kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Drop doubled word "not" in a comment.
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: netdev@vger.kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Drop doubled words in two comments.
Fix a spello/typo.
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: netdev@vger.kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
commit 263e1201a2c3 ("mptcp: consolidate synack processing.")
left behind this, remove it.
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
It is not used since commit 09c7570480f7 ("xfrm: remove flow cache")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
They are not used any more since commit b1edeb102397 ("netlabel: Replace
protocol/NetLabel linking with refrerence counts")
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
Acked-by: Paul Moore <paul@paul-moore.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Extend switchdev API to add support for MRP interconnect. The HW is
notified in the following cases:
SWITCHDEV_OBJ_ID_IN_ROLE_MRP: This is used when the interconnect role
of the node changes. The supported roles are MIM and MIC.
SWITCHDEV_OBJ_ID_IN_STATE_MRP: This is used when the interconnect ring
changes it states to open or closed.
SWITCHDEV_OBJ_ID_IN_TEST_MRP: This is used to start/stop sending
MRP_InTest frames on all MRP ports. This is called only on nodes that
have the interconnect role MIM.
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Alexei Starovoitov says:
====================
pull-request: bpf-next 2020-07-13
The following pull-request contains BPF updates for your *net-next* tree.
We've added 36 non-merge commits during the last 7 day(s) which contain
a total of 62 files changed, 2242 insertions(+), 468 deletions(-).
The main changes are:
1) Avoid trace_printk warning banner by switching bpf_trace_printk to use
its own tracing event, from Alan.
2) Better libbpf support on older kernels, from Andrii.
3) Additional AF_XDP stats, from Ciara.
4) build time resolution of BTF IDs, from Jiri.
5) BPF_CGROUP_INET_SOCK_RELEASE hook, from Stanislav.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Previously, shared blocks were only relevant for the pseudo-qdiscs ingress
and clsact. Recently, a qevent facility was introduced, which allows to
bind blocks to well-defined slots of a qdisc instance. RED in particular
got two qevents: early_drop and mark. Drivers that wish to offload these
blocks will be sent the usual notification, and need to know which qdisc it
is related to.
To that end, extend flow_block_offload with a "sch" pointer, and initialize
as appropriate. This prompts changes in the indirect block facility, which
now tracks the scheduler in addition to the netdevice. Update signatures of
several functions similarly.
Signed-off-by: Petr Machata <petrm@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
It can be useful for the user to know the reason behind a dropped packet.
Introduce new counters which track drops on the receive path caused by:
1. rx ring being full
2. fill ring being empty
Also, on the tx path introduce a counter which tracks the number of times
we attempt pull from the tx ring when it is empty.
Signed-off-by: Ciara Loftus <ciara.loftus@intel.com>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Link: https://lore.kernel.org/bpf/20200708072835.4427-2-ciara.loftus@intel.com
|
|
In preparation for users of the "install a received file" logic outside
of net/ (pidfd and seccomp), relocate and rename __scm_install_fd() from
net/core/scm.c to __receive_fd() in fs/file.c, and provide a wrapper
named receive_fd_user(), as future patches will change the interface
to __receive_fd().
Additionally add a comment to fd_install() as a counterpoint to how
__receive_fd() interacts with fput().
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Dmitry Kadashev <dkadashev@gmail.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Sargun Dhillon <sargun@sargun.me>
Cc: Ido Schimmel <idosch@idosch.org>
Cc: Ioana Ciornei <ioana.ciornei@nxp.com>
Cc: linux-fsdevel@vger.kernel.org
Cc: netdev@vger.kernel.org
Reviewed-by: Sargun Dhillon <sargun@sargun.me>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
|
|
Duplicate the cleanups from commit 2618d530dd8b ("net/scm: cleanup
scm_detach_fds") into the compat code.
Replace open-coded __receive_sock() with a call to the helper.
Move the check added in commit 1f466e1f15cf ("net: cleanly handle kernel
vs user buffers for ->msg_control") to before the compat call, even
though it should be impossible for an in-kernel call to also be compat.
Correct the int "flags" argument to unsigned int to match fd_install()
and similar APIs.
Regularize any remaining differences, including a whitespace issue,
a checkpatch warning, and add the check from commit 6900317f5eff ("net,
scm: fix PaX detected msg_controllen overflow in scm_detach_fds") which
fixed an overflow unique to 64-bit. To avoid confusion when comparing
the compat handler to the native handler, just include the same check
in the compat handler.
Cc: Christoph Hellwig <hch@lst.de>
Cc: Sargun Dhillon <sargun@sargun.me>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: netdev@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
|
|
Add missed sock updates to compat path via a new helper, which will be
used more in coming patches. (The net/core/scm.c code is left as-is here
to assist with -stable backports for the compat path.)
Cc: Christoph Hellwig <hch@lst.de>
Cc: Sargun Dhillon <sargun@sargun.me>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: stable@vger.kernel.org
Fixes: 48a87cc26c13 ("net: netprio: fd passed in SCM_RIGHTS datagram not set correctly")
Fixes: d84295067fc7 ("net: net_cls: fd passed in SCM_RIGHTS datagram not set correctly")
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
|
|
All conflicts seemed rather trivial, with some guidance from
Saeed Mameed on the tc_ct.c one.
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
In order to use new devlink port health reporters infrastructure, add
corresponding constructor and destructor functions.
Signed-off-by: Vladyslav Tarasiuk <vladyslavt@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add devlink-health reporter support on per-port basis.
The main difference existing devlink-health is that port reporters are
stored in per-devlink_port lists. Upon creation of such health reporter the
reference to a port it belongs to is stored in reporter struct.
Fill the port index attribute in devlink-health response to
allow devlink userspace utility to distinguish between device and port
reporters.
Signed-off-by: Vladyslav Tarasiuk <vladyslavt@mellanox.com>
Reviewed-by: Moshe Shemesh <moshe@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add an interface to report offloaded UDP ports via ethtool netlink.
Now that core takes care of tracking which UDP tunnel ports the NICs
are aware of we can quite easily export this information out to
user space.
The responsibility of writing the netlink dumps is split between
ethtool code and udp_tunnel_nic.c - since udp_tunnel module may
not always be loaded, yet we should always report the capabilities
of the NIC.
$ ethtool --show-tunnels eth0
Tunnel information for eth0:
UDP port table 0:
Size: 4
Types: vxlan
No entries
UDP port table 1:
Size: 4
Types: geneve, vxlan-gpe
Entries (1):
port 1230, vxlan-gpe
v4:
- back to v2, build fix is now directly in udp_tunnel.h
v3:
- don't compile ETHTOOL_MSG_TUNNEL_INFO_GET in if CONFIG_INET
not set.
v2:
- fix string set count,
- reorder enums in the uAPI,
- fix type of ETHTOOL_A_TUNNEL_UDP_TABLE_TYPES to bitset
in docs and comments.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Cater to devices which:
(a) may want to sleep in the callbacks;
(b) only have IPv4 support;
(c) need all the programming to happen while the netdev is up.
Drivers attach UDP tunnel offload info struct to their netdevs,
where they declare how many UDP ports of various tunnel types
they support. Core takes care of tracking which ports to offload.
Use a fixed-size array since this matches what almost all drivers
do, and avoids a complexity and uncertainty around memory allocations
in an atomic context.
Make sure that tunnel drivers don't try to replay the ports when
new NIC netdev is registered. Automatic replays would mess up
reference counting, and will be removed completely once all drivers
are converted.
v4:
- use a #define NULL to avoid build issues with CONFIG_INET=n.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Make it possible to use tunnel types as flags more easily.
There doesn't appear to be any user using the type as an
array index, so this should make no difference.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add a new attribute that indicates the split ability of devlink port.
Drivers are expected to set it via devlink_port_attrs_set(), before
registering the port.
Signed-off-by: Danielle Ratson <danieller@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Add a new devlink port attribute that indicates the port's number of lanes.
Drivers are expected to set it via devlink_port_attrs_set(), before
registering the port.
The attribute is not passed to user space in case the number of lanes is
invalid (0).
Signed-off-by: Danielle Ratson <danieller@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Currently, devlink_port_attrs_set accepts a long list of parameters,
that most of them are devlink port's attributes.
Use the devlink_port_attrs struct to replace the relevant parameters.
Signed-off-by: Danielle Ratson <danieller@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The struct devlink_port_attrs holds the attributes of devlink_port.
Similarly to the previous patch, 'switch_port' attribute is another
exception.
Move 'switch_port' to be devlink_port's field.
Signed-off-by: Danielle Ratson <danieller@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The struct devlink_port_attrs holds the attributes of devlink_port.
The 'set' field is not devlink_port's attribute as opposed to most of the
others.
Move 'set' to be devlink_port's field called 'attrs_set'.
Signed-off-by: Danielle Ratson <danieller@mellanox.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
bpf_sk_reuseport_detach is currently called when sk->sk_user_data
is not NULL. It is incorrect because sk->sk_user_data may not be
managed by the bpf's reuseport_array. It has been reported in [1] that,
the bpf_sk_reuseport_detach() which is called from udp_lib_unhash() has
corrupted the sk_user_data managed by l2tp.
This patch solves it by using another bit (defined as SK_USER_DATA_BPF)
of the sk_user_data pointer value. It marks that a sk_user_data is
managed/owned by BPF.
The patch depends on a PTRMASK introduced in
commit f1ff5ce2cd5e ("net, sk_msg: Clear sk_user_data pointer on clone if tagged").
[ Note: sk->sk_user_data is used by bpf's reuseport_array only when a sk is
added to the bpf's reuseport_array.
i.e. doing setsockopt(SO_REUSEPORT) and having "sk->sk_reuseport == 1"
alone will not stop sk->sk_user_data being used by other means. ]
[1]: https://lore.kernel.org/netdev/20200706121259.GA20199@katalix.com/
Fixes: 5dc4c4b7d4e8 ("bpf: Introduce BPF_MAP_TYPE_REUSEPORT_SOCKARRAY")
Reported-by: James Chapman <jchapman@katalix.com>
Reported-by: syzbot+9f092552ba9a5efca5df@syzkaller.appspotmail.com
Signed-off-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: James Chapman <jchapman@katalix.com>
Acked-by: James Chapman <jchapman@katalix.com>
Link: https://lore.kernel.org/bpf/20200709061110.4019316-1-kafai@fb.com
|
|
This patch is to register a callback function tunnel6_rcv_cb with
is_ipip set in a xfrm_input_afinfo object for tunnel6 and tunnel46.
It will be called by xfrm_rcv_cb() from xfrm_input() when family
is AF_INET6 and proto is IPPROTO_IPIP or IPPROTO_IPV6.
v1->v2:
- Fix a sparse warning caused by the missing "__rcu", as Jakub
noticed.
- Handle the err returned by xfrm_input_register_afinfo() in
tunnel6_init/fini(), as Sabrina noticed.
v2->v3:
- Add "#if IS_ENABLED(CONFIG_INET6_XFRM_TUNNEL)" to fix the build error
when xfrm is disabled, reported by kbuild test robot
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
This patch is to register a callback function tunnel4_rcv_cb with
is_ipip set in a xfrm_input_afinfo object for tunnel4 and tunnel64.
It will be called by xfrm_rcv_cb() from xfrm_input() when family
is AF_INET and proto is IPPROTO_IPIP or IPPROTO_IPV6.
v1->v2:
- Fix a sparse warning caused by the missing "__rcu", as Jakub
noticed.
- Handle the err returned by xfrm_input_register_afinfo() in
tunnel4_init/fini(), as Sabrina noticed.
v2->v3:
- Add "#if IS_ENABLED(CONFIG_INET_XFRM_TUNNEL)" to fix the build error
when xfrm is disabled, reported by kbuild test robot.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
This patch is to add a new member is_ipip to struct xfrm_input_afinfo,
to allow another group family of callback functions to be registered
with is_ipip set.
This will be used for doing a callback for struct xfrm(6)_tunnel of
ipip/ipv6 tunnels in xfrm_input() by calling xfrm_rcv_cb(), which is
needed by ipip/ipv6 tunnels' support in ip(6)_vti and xfrm interface
in the next patches.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
|
|
This implements the known parts of the Realtek 4 byte
tag protocol version 0xA, as found in the RTL8366RB
DSA switch.
It is designated as protocol version 0xA as a
different Realtek 4 byte tag format with protocol
version 0x9 is known to exist in the Realtek RTL8306
chips.
The tag and switch chip lacks public documentation, so
the tag format has been reverse-engineered from
packet dumps. As only ingress traffic has been available
for analysis an egress tag has not been possible to
develop (even using educated guesses about bit fields)
so this is as far as it gets. It is not known if the
switch even supports egress tagging.
Excessive attempts to figure out the egress tag format
was made. When nothing else worked, I just tried all bit
combinations with 0xannp where a is protocol and p is
port. I looped through all values several times trying
to get a response from ping, without any positive
result.
Using just these ingress tags however, the switch
functionality is vastly improved and the packets find
their way into the destination port without any
tricky VLAN configuration. On the D-Link DIR-685 the
LAN ports now come up and respond to ping without
any command line configuration so this is a real
improvement for users.
Egress packets need to be restricted to the proper
target ports using VLAN, which the RTL8366RB DSA
switch driver already sets up.
Cc: DENG Qingfang <dqfext@gmail.com>
Cc: Mauri Sandberg <sandberg@mailfence.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
Pablo Neira Ayuso says:
====================
Netfilter/IPVS updates for net-next
The following patchset contains Netfilter updates for net-next:
1) Support for rejecting packets from the prerouting chain, from
Laura Garcia Liebana.
2) Remove useless assignment in pipapo, from Stefano Brivio.
3) On demand hook registration in IPVS, from Julian Anastasov.
4) Expire IPVS connection from process context to not overload
timers, also from Julian.
5) Fallback to conntrack TCP tracker to handle connection reuse
in IPVS, from Julian Anastasov.
6) Several patches to support for chain bindings.
7) Expose enum nft_chain_flags through UAPI.
8) Reject unsupported chain flags from the netlink control plane.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
The packets from tunnel devices (eg bareudp) may have only
metadata in the dst pointer of skb. Hence a pointer check of
neigh_lookup is needed in dst_neigh_lookup_skb
Kernel crashes when packets from bareudp device is processed in
the kernel neighbour subsytem.
[ 133.384484] BUG: kernel NULL pointer dereference, address: 0000000000000000
[ 133.385240] #PF: supervisor instruction fetch in kernel mode
[ 133.385828] #PF: error_code(0x0010) - not-present page
[ 133.386603] PGD 0 P4D 0
[ 133.386875] Oops: 0010 [#1] SMP PTI
[ 133.387275] CPU: 0 PID: 5045 Comm: ping Tainted: G W 5.8.0-rc2+ #15
[ 133.388052] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[ 133.391076] RIP: 0010:0x0
[ 133.392401] Code: Bad RIP value.
[ 133.394029] RSP: 0018:ffffb79980003d50 EFLAGS: 00010246
[ 133.396656] RAX: 0000000080000102 RBX: ffff9de2fe0d6600 RCX: ffff9de2fe5e9d00
[ 133.399018] RDX: 0000000000000000 RSI: ffff9de2fe5e9d00 RDI: ffff9de2fc21b400
[ 133.399685] RBP: ffff9de2fe5e9d00 R08: 0000000000000000 R09: 0000000000000000
[ 133.400350] R10: ffff9de2fbc6be22 R11: ffff9de2fe0d6600 R12: ffff9de2fc21b400
[ 133.401010] R13: ffff9de2fe0d6628 R14: 0000000000000001 R15: 0000000000000003
[ 133.401667] FS: 00007fe014918740(0000) GS:ffff9de2fec00000(0000) knlGS:0000000000000000
[ 133.402412] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 133.402948] CR2: ffffffffffffffd6 CR3: 000000003bb72000 CR4: 00000000000006f0
[ 133.403611] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 133.404270] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 133.404933] Call Trace:
[ 133.405169] <IRQ>
[ 133.405367] __neigh_update+0x5a4/0x8f0
[ 133.405734] arp_process+0x294/0x820
[ 133.406076] ? __netif_receive_skb_core+0x866/0xe70
[ 133.406557] arp_rcv+0x129/0x1c0
[ 133.406882] __netif_receive_skb_one_core+0x95/0xb0
[ 133.407340] process_backlog+0xa7/0x150
[ 133.407705] net_rx_action+0x2af/0x420
[ 133.408457] __do_softirq+0xda/0x2a8
[ 133.408813] asm_call_on_stack+0x12/0x20
[ 133.409290] </IRQ>
[ 133.409519] do_softirq_own_stack+0x39/0x50
[ 133.410036] do_softirq+0x50/0x60
[ 133.410401] __local_bh_enable_ip+0x50/0x60
[ 133.410871] ip_finish_output2+0x195/0x530
[ 133.411288] ip_output+0x72/0xf0
[ 133.411673] ? __ip_finish_output+0x1f0/0x1f0
[ 133.412122] ip_send_skb+0x15/0x40
[ 133.412471] raw_sendmsg+0x853/0xab0
[ 133.412855] ? insert_pfn+0xfe/0x270
[ 133.413827] ? vvar_fault+0xec/0x190
[ 133.414772] sock_sendmsg+0x57/0x80
[ 133.415685] __sys_sendto+0xdc/0x160
[ 133.416605] ? syscall_trace_enter+0x1d4/0x2b0
[ 133.417679] ? __audit_syscall_exit+0x1d9/0x280
[ 133.418753] ? __prepare_exit_to_usermode+0x5d/0x1a0
[ 133.419819] __x64_sys_sendto+0x24/0x30
[ 133.420848] do_syscall_64+0x4d/0x90
[ 133.421768] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 133.422833] RIP: 0033:0x7fe013689c03
[ 133.423749] Code: Bad RIP value.
[ 133.424624] RSP: 002b:00007ffc7288f418 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
[ 133.425940] RAX: ffffffffffffffda RBX: 000056151fc63720 RCX: 00007fe013689c03
[ 133.427225] RDX: 0000000000000040 RSI: 000056151fc63720 RDI: 0000000000000003
[ 133.428481] RBP: 00007ffc72890b30 R08: 000056151fc60500 R09: 0000000000000010
[ 133.429757] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000040
[ 133.431041] R13: 000056151fc636e0 R14: 000056151fc616bc R15: 0000000000000080
[ 133.432481] Modules linked in: mpls_iptunnel act_mirred act_tunnel_key cls_flower sch_ingress veth mpls_router ip_tunnel bareudp ip6_udp_tunnel udp_tunnel macsec udp_diag inet_diag unix_diag af_packet_diag netlink_diag binfmt_misc xt_MASQUERADE iptable_nat xt_addrtype xt_conntrack nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 br_netfilter bridge stp llc ebtable_filter ebtables overlay ip6table_filter ip6_tables iptable_filter sunrpc ext4 mbcache jbd2 pcspkr i2c_piix4 virtio_balloon joydev ip_tables xfs libcrc32c ata_generic qxl pata_acpi drm_ttm_helper ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm ata_piix libata virtio_net net_failover virtio_console failover virtio_blk i2c_core virtio_pci virtio_ring serio_raw floppy virtio dm_mirror dm_region_hash dm_log dm_mod
[ 133.444045] CR2: 0000000000000000
[ 133.445082] ---[ end trace f4aeee1958fd1638 ]---
[ 133.446236] RIP: 0010:0x0
[ 133.447180] Code: Bad RIP value.
[ 133.448152] RSP: 0018:ffffb79980003d50 EFLAGS: 00010246
[ 133.449363] RAX: 0000000080000102 RBX: ffff9de2fe0d6600 RCX: ffff9de2fe5e9d00
[ 133.450835] RDX: 0000000000000000 RSI: ffff9de2fe5e9d00 RDI: ffff9de2fc21b400
[ 133.452237] RBP: ffff9de2fe5e9d00 R08: 0000000000000000 R09: 0000000000000000
[ 133.453722] R10: ffff9de2fbc6be22 R11: ffff9de2fe0d6600 R12: ffff9de2fc21b400
[ 133.455149] R13: ffff9de2fe0d6628 R14: 0000000000000001 R15: 0000000000000003
[ 133.456520] FS: 00007fe014918740(0000) GS:ffff9de2fec00000(0000) knlGS:0000000000000000
[ 133.458046] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 133.459342] CR2: ffffffffffffffd6 CR3: 000000003bb72000 CR4: 00000000000006f0
[ 133.460782] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 133.462240] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 133.463697] Kernel panic - not syncing: Fatal exception in interrupt
[ 133.465226] Kernel Offset: 0xfa00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[ 133.467025] ---[ end Kernel panic - not syncing: Fatal exception in interrupt ]---
Fixes: aaa0c23cb901 ("Fix dst_neigh_lookup/dst_neigh_lookup_skb return value handling bug")
Signed-off-by: Martin Varghese <martin.varghese@nokia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This patch adds a configurable LE autoconnect timeout.
Signed-off-by: Alain Michaud <alainm@chromium.org>
Signed-off-by: Marcel Holtmann <marcel@holtmann.org>
|
|
Daniel Borkmann says:
====================
pull-request: bpf-next 2020-07-04
The following pull-request contains BPF updates for your *net-next* tree.
We've added 73 non-merge commits during the last 17 day(s) which contain
a total of 106 files changed, 5233 insertions(+), 1283 deletions(-).
The main changes are:
1) bpftool ability to show PIDs of processes having open file descriptors
for BPF map/program/link/BTF objects, relying on BPF iterator progs
to extract this info efficiently, from Andrii Nakryiko.
2) Addition of BPF iterator progs for dumping TCP and UDP sockets to
seq_files, from Yonghong Song.
3) Support access to BPF map fields in struct bpf_map from programs
through BTF struct access, from Andrey Ignatov.
4) Add a bpf_get_task_stack() helper to be able to dump /proc/*/stack
via seq_file from BPF iterator progs, from Song Liu.
5) Make SO_KEEPALIVE and related options available to bpf_setsockopt()
helper, from Dmitry Yakunin.
6) Optimize BPF sk_storage selection of its caching index, from Martin
KaFai Lau.
7) Removal of redundant synchronize_rcu()s from BPF map destruction which
has been a historic leftover, from Alexei Starovoitov.
8) Several improvements to test_progs to make it easier to create a shell
loop that invokes each test individually which is useful for some CIs,
from Jesper Dangaard Brouer.
9) Fix bpftool prog dump segfault when compiled without skeleton code on
older clang versions, from John Fastabend.
10) Bunch of cleanups and minor improvements, from various others.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
|
|
This new chain flag specifies that:
* the kernel dynamically allocates the chain name, if no chain name
is specified.
* If the immediate expression that refers to this chain is removed,
then this bound chain (and its content) is destroyed.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
This enum definition was never exposed through UAPI. Rename
NFT_BASE_CHAIN to NFT_CHAIN_BASE for consistency.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|
|
This netlink attribute allows you to refer to chains inside a
transaction as an alternative to the name and the handle. The chain
binding support requires this new chain ID approach.
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
|