linux.git/include, branch v4.1.17

arm64: fix building without CONFIG_UID16

2016-01-31T19:23:39+00:00

commit fbc416ff86183e2203cdf975e2881d7c164b0271 upstream.

As reported by Michal Simek, building an ARM64 kernel with CONFIG_UID16
disabled currently fails because the system call table still needs to
reference the individual function entry points that are provided by
kernel/sys_ni.c in this case, and the declarations are hidden inside
of #ifdef CONFIG_UID16:

arch/arm64/include/asm/unistd32.h:57:8: error: 'sys_lchown16' undeclared here (not in a function)
 __SYSCALL(__NR_lchown, sys_lchown16)

I believe this problem only exists on ARM64, because older architectures
tend to not need declarations when their system call table is built
in assembly code, while newer architectures tend to not need UID16
support. ARM64 only uses these system calls for compatibility with
32-bit ARM binaries.

This changes the CONFIG_UID16 check into CONFIG_HAVE_UID16, which is
set unconditionally on ARM64 with CONFIG_COMPAT, so we see the
declarations whenever we need them, but otherwise the behavior is
unchanged.

Fixes: af1839eb4bd4 ("Kconfig: clean up the long arch list for the UID16 config option")
Signed-off-by: Arnd Bergmann 
Acked-by: Will Deacon 
Signed-off-by: Catalin Marinas 
Signed-off-by: Greg Kroah-Hartman

tcp/dccp: fix old style declarations

2016-01-31T19:23:37+00:00

[ Upstream commit 8695a144da9e500a5a60fa34c06694346ec1048f ]

I’m using the compilation flag -Werror=old-style-declaration, which
requires that the “inline” word would come at the beginning of the code
line.

$ make drivers/net/ethernet/intel/e1000e/e1000e.ko
...
include/net/inet_timewait_sock.h:116:1: error: ‘inline’ is not at
beginning of declaration [-Werror=old-style-declaration]
static void inline inet_twsk_schedule(struct inet_timewait_sock *tw, int
timeo)

include/net/inet_timewait_sock.h:121:1: error: ‘inline’ is not at
beginning of declaration [-Werror=old-style-declaration]
static void inline inet_twsk_reschedule(struct inet_timewait_sock *tw,
int timeo)

Fixes: ed2e92394589 ("tcp/dccp: fix timewait races in timer handling")
Signed-off-by: Raanan Avargil 
Signed-off-by: Eric Dumazet 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman

tcp/dccp: fix timewait races in timer handling

2016-01-31T19:23:37+00:00

[ Upstream commit ed2e923945892a8372ab70d2f61d364b0b6d9054 ]

When creating a timewait socket, we need to arm the timer before
allowing other cpus to find it. The signal allowing cpus to find
the socket is setting tw_refcnt to non zero value.

As we set tw_refcnt in __inet_twsk_hashdance(), we therefore need to
call inet_twsk_schedule() first.

This also means we need to remove tw_refcnt changes from
inet_twsk_schedule() and let the caller handle it.

Note that because we use mod_timer_pinned(), we have the guarantee
the timer wont expire before we set tw_refcnt as we run in BH context.

To make things more readable I introduced inet_twsk_reschedule() helper.

When rearming the timer, we can use mod_timer_pending() to make sure
we do not rearm a canceled timer.

Note: This bug can possibly trigger if packets of a flow can hit
multiple cpus. This does not normally happen, unless flow steering
is broken somehow. This explains this bug was spotted ~5 months after
its introduction.

A similar fix is needed for SYN_RECV sockets in reqsk_queue_hash_req(),
but will be provided in a separate patch for proper tracking.

Fixes: 789f558cfb36 ("tcp/dccp: get rid of central timewait timer")
Signed-off-by: Eric Dumazet 
Reported-by: Ying Cai 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman

ipv6: update skb->csum when CE mark is propagated

2016-01-31T19:23:37+00:00

[ Upstream commit 34ae6a1aa0540f0f781dd265366036355fdc8930 ]

When a tunnel decapsulates the outer header, it has to comply
with RFC 6080 and eventually propagate CE mark into inner header.

It turns out IP6_ECN_set_ce() does not correctly update skb->csum
for CHECKSUM_COMPLETE packets, triggering infamous "hw csum failure"
messages and stack traces.

Signed-off-by: Eric Dumazet 
Acked-by: Herbert Xu 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman

net: preserve IP control block during GSO segmentation

2016-01-31T19:23:36+00:00

[ Upstream commit 9207f9d45b0ad071baa128e846d7e7ed85016df3 ]

Skb_gso_segment() uses skb control block during segmentation.
This patch adds 32-bytes room for previous control block which
will be copied into all resulting segments.

This patch fixes kernel crash during fragmenting forwarded packets.
Fragmentation requires valid IP CB in skb for clearing ip options.
Also patch removes custom save/restore in ovs code, now it's redundant.

Signed-off-by: Konstantin Khlebnikov 
Link: http://lkml.kernel.org/r/CALYGNiP-0MZ-FExV2HutTvE9U-QQtkKSoE--KN=JQE5STYsjAA@mail.gmail.com
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman

net: filter: make JITs zero A for SKF_AD_ALU_XOR_X

2016-01-31T19:23:36+00:00

[ Upstream commit 55795ef5469290f89f04e12e662ded604909e462 ]

The SKF_AD_ALU_XOR_X ancillary is not like the other ancillary data
instructions since it XORs A with X while all the others replace A with
some loaded value.  All the BPF JITs fail to clear A if this is used as
the first instruction in a filter.  This was found using american fuzzy
lop.

Add a helper to determine if A needs to be cleared given the first
instruction in a filter, and use this in the JITs.  Except for ARM, the
rest have only been compile-tested.

Fixes: 3480593131e0 ("net: filter: get rid of BPF_S_* enum")
Signed-off-by: Rabin Vincent 
Acked-by: Daniel Borkmann 
Acked-by: Alexei Starovoitov 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman

unix: properly account for FDs passed over unix sockets

2016-01-31T19:23:36+00:00

[ Upstream commit 712f4aad406bb1ed67f3f98d04c044191f0ff593 ]

It is possible for a process to allocate and accumulate far more FDs than
the process' limit by sending them over a unix socket then closing them
to keep the process' fd count low.

This change addresses this problem by keeping track of the number of FDs
in flight per user and preventing non-privileged processes from having
more FDs in flight than their configured FD limit.

Reported-by: socketpair@gmail.com
Reported-by: Tetsuo Handa 
Mitigates: CVE-2013-4312 (Linux 2.0+)
Suggested-by: Linus Torvalds 
Acked-by: Hannes Frederic Sowa 
Signed-off-by: Willy Tarreau 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman

net: cdc_ncm: avoid changing RX/TX buffers on MTU changes

2016-01-31T19:23:35+00:00

[ Upstream commit 1dfddff5fcd869fcab0c52fafae099dfa435a935 ]

NCM buffer sizes are negotiated with the device independently of
the network device MTU.  The RX buffers are allocated by the
usbnet framework based on the rx_urb_size value set by cdc_ncm. A
single RX buffer can hold a number of MTU sized packets.

The default usbnet change_mtu ndo only modifies rx_urb_size if it
is equal to hard_mtu.  And the cdc_ncm driver will set rx_urb_size
and hard_mtu independently of each other, based on dwNtbInMaxSize
and dwNtbOutMaxSize respectively. It was therefore assumed that
usbnet_change_mtu() would never touch rx_urb_size.  This failed to
consider the case where dwNtbInMaxSize and dwNtbOutMaxSize happens
to be equal.

Fix by implementing an NCM specific change_mtu ndo, modifying the
netdev MTU without touching the buffer size settings.

Signed-off-by: Bjørn Mork 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman

net: fix IP early demux races

2016-01-23T04:54:14+00:00

[ Upstream commit 5037e9ef9454917b047f9f3a19b4dd179fbf7cd4 ]

David Wilder reported crashes caused by dst reuse.


  I am seeing a crash on a distro V4.2.3 kernel caused by a double
  release of a dst_entry.  In ipv4_dst_destroy() the call to
  list_empty() finds a poisoned next pointer, indicating the dst_entry
  has already been removed from the list and freed. The crash occurs
  18 to 24 hours into a run of a network stress exerciser.


Thanks to his detailed report and analysis, we were able to understand
the core issue.

IP early demux can associate a dst to skb, after a lookup in TCP/UDP
sockets.

When socket cache is not properly set, we want to store into
sk->sk_dst_cache the dst for future IP early demux lookups,
by acquiring a stable refcount on the dst.

Problem is this acquisition is simply using an atomic_inc(),
which works well, unless the dst was queued for destruction from
dst_release() noticing dst refcount went to zero, if DST_NOCACHE
was set on dst.

We need to make sure current refcount is not zero before incrementing
it, or risk double free as David reported.

This patch, being a stable candidate, adds two new helpers, and use
them only from IP early demux problematic paths.

It might be possible to merge in net-next skb_dst_force() and
skb_dst_force_safe(), but I prefer having the smallest patch for stable
kernels : Maybe some skb_dst_force() callers do not expect skb->dst
can suddenly be cleared.

Can probably be backported back to linux-3.6 kernels

Reported-by: David J. Wilder 
Tested-by: David J. Wilder 
Signed-off-by: Eric Dumazet 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman

net: add validation for the socket syscall protocol argument

2016-01-23T04:54:13+00:00

[ Upstream commit 79462ad02e861803b3840cc782248c7359451cd9 ]

郭永刚 reported that one could simply crash the kernel as root by
using a simple program:

	int socket_fd;
	struct sockaddr_in addr;
	addr.sin_port = 0;
	addr.sin_addr.s_addr = INADDR_ANY;
	addr.sin_family = 10;

	socket_fd = socket(10,3,0x40000000);
	connect(socket_fd , &addr,16);

AF_INET, AF_INET6 sockets actually only support 8-bit protocol
identifiers. inet_sock's skc_protocol field thus is sized accordingly,
thus larger protocol identifiers simply cut off the higher bits and
store a zero in the protocol fields.

This could lead to e.g. NULL function pointer because as a result of
the cut off inet_num is zero and we call down to inet_autobind, which
is NULL for raw sockets.

kernel: Call Trace:
kernel:  [] ? inet_autobind+0x2e/0x70
kernel:  [] inet_dgram_connect+0x54/0x80
kernel:  [] SYSC_connect+0xd9/0x110
kernel:  [] ? ptrace_notify+0x5b/0x80
kernel:  [] ? syscall_trace_enter_phase2+0x108/0x200
kernel:  [] SyS_connect+0xe/0x10
kernel:  [] tracesys_phase2+0x84/0x89

I found no particular commit which introduced this problem.

CVE: CVE-2015-8543
Cc: Cong Wang 
Reported-by: 郭永刚 
Signed-off-by: Hannes Frederic Sowa 
Signed-off-by: David S. Miller 
Signed-off-by: Greg Kroah-Hartman