2017-08-11 22:02:42

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.4 00/15] 4.4.82-stable review

This is the start of the stable review cycle for the 4.4.82 release.
There are 15 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.

Responses should be made by Sun Aug 13 22:02:10 UTC 2017.
Anything received after that time might be too late.

The whole patch series can be found in one patch at:
kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.4.82-rc1.gz
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.4.y
and the diffstat can be found below.

thanks,

greg k-h

-------------
Pseudo-Shortlog of commits:

Greg Kroah-Hartman <[email protected]>
Linux 4.4.82-rc1

Michal Kubeček <[email protected]>
net: account for current skb length when deciding about UFO

zheng li <[email protected]>
ipv4: Should use consistent conditional judgement for ip fragment in __ip_append_data and ip_finish_output

Matthew Dawson <[email protected]>
mm/mempool: avoid KASAN marking mempool poison checks as use-after-free

Suzuki K Poulose <[email protected]>
KVM: arm/arm64: Handle hva aging while destroying the vm

Rob Gardner <[email protected]>
sparc64: Prevent perf from running during super critical sections

Willem de Bruijn <[email protected]>
udp: consistently apply ufo or fragmentation

Greg Kroah-Hartman <[email protected]>
revert "ipv4: Should use consistent conditional judgement for ip fragment in __ip_append_data and ip_finish_output"

Greg Kroah-Hartman <[email protected]>
revert "net: account for current skb length when deciding about UFO"

Willem de Bruijn <[email protected]>
packet: fix tp_reserve race in packet_set_ring

Willem de Bruijn <[email protected]>
net: avoid skb_warn_bad_offload false positives on UFO

Eric Dumazet <[email protected]>
tcp: fastopen: tcp_connect() must refresh the route

Xin Long <[email protected]>
net: sched: set xt_tgchk_param par.nft_compat as 0 in ipt_init_target

Daniel Borkmann <[email protected]>
bpf, s390: fix jit branch offset related to ldimm64

Eric Dumazet <[email protected]>
net: fix keepalive code vs TCP_FASTOPEN_CONNECT

Yuchung Cheng <[email protected]>
tcp: avoid setting cwnd to invalid ssthresh after cwnd reduction states


-------------

Diffstat:

Makefile | 4 ++--
arch/arm/kvm/mmu.c | 4 ++++
arch/s390/net/bpf_jit_comp.c | 3 ++-
arch/sparc/include/asm/mmu_context_64.h | 14 +++++++++-----
arch/sparc/kernel/tsb.S | 12 ++++++++++++
arch/sparc/power/hibernate.c | 3 +--
mm/mempool.c | 2 +-
net/core/dev.c | 2 +-
net/ipv4/ip_output.c | 8 +++++---
net/ipv4/tcp_input.c | 4 ++--
net/ipv4/tcp_output.c | 3 +++
net/ipv4/tcp_timer.c | 3 ++-
net/ipv4/udp.c | 2 +-
net/ipv4/udp_offload.c | 2 +-
net/ipv6/ip6_output.c | 7 ++++---
net/ipv6/udp_offload.c | 2 +-
net/packet/af_packet.c | 13 +++++++++----
net/sched/act_ipt.c | 2 +-
18 files changed, 61 insertions(+), 29 deletions(-)



2017-08-11 22:02:44

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.4 11/15] sparc64: Prevent perf from running during super critical sections

4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Rob Gardner <[email protected]>

commit fc290a114fc6034b0f6a5a46e2fb7d54976cf87a upstream.

This fixes another cause of random segfaults and bus errors that may
occur while running perf with the callgraph option.

Critical sections beginning with spin_lock_irqsave() raise the interrupt
level to PIL_NORMAL_MAX (14) and intentionally do not block performance
counter interrupts, which arrive at PIL_NMI (15).

But some sections of code are "super critical" with respect to perf
because the perf_callchain_user() path accesses user space and may cause
TLB activity as well as faults as it unwinds the user stack.

One particular critical section occurs in switch_mm:

spin_lock_irqsave(&mm->context.lock, flags);
...
load_secondary_context(mm);
tsb_context_switch(mm);
...
spin_unlock_irqrestore(&mm->context.lock, flags);

If a perf interrupt arrives in between load_secondary_context() and
tsb_context_switch(), then perf_callchain_user() could execute with
the context ID of one process, but with an active TSB for a different
process. When the user stack is accessed, it is very likely to
incur a TLB miss, since the h/w context ID has been changed. The TLB
will then be reloaded with a translation from the TSB for one process,
but using a context ID for another process. This exposes memory from
one process to another, and since it is a mapping for stack memory,
this usually causes the new process to crash quickly.

This super critical section needs more protection than is provided
by spin_lock_irqsave() since perf interrupts must not be allowed in.

Since __tsb_context_switch already goes through the trouble of
disabling interrupts completely, we fix this by moving the secondary
context load down into this better protected region.

Orabug: 25577560

Signed-off-by: Dave Aldridge <[email protected]>
Signed-off-by: Rob Gardner <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/sparc/include/asm/mmu_context_64.h | 14 +++++++++-----
arch/sparc/kernel/tsb.S | 12 ++++++++++++
arch/sparc/power/hibernate.c | 3 +--
3 files changed, 22 insertions(+), 7 deletions(-)

--- a/arch/sparc/include/asm/mmu_context_64.h
+++ b/arch/sparc/include/asm/mmu_context_64.h
@@ -25,9 +25,11 @@ void destroy_context(struct mm_struct *m
void __tsb_context_switch(unsigned long pgd_pa,
struct tsb_config *tsb_base,
struct tsb_config *tsb_huge,
- unsigned long tsb_descr_pa);
+ unsigned long tsb_descr_pa,
+ unsigned long secondary_ctx);

-static inline void tsb_context_switch(struct mm_struct *mm)
+static inline void tsb_context_switch_ctx(struct mm_struct *mm,
+ unsigned long ctx)
{
__tsb_context_switch(__pa(mm->pgd),
&mm->context.tsb_block[0],
@@ -38,9 +40,12 @@ static inline void tsb_context_switch(st
#else
NULL
#endif
- , __pa(&mm->context.tsb_descr[0]));
+ , __pa(&mm->context.tsb_descr[0]),
+ ctx);
}

+#define tsb_context_switch(X) tsb_context_switch_ctx(X, 0)
+
void tsb_grow(struct mm_struct *mm,
unsigned long tsb_index,
unsigned long mm_rss);
@@ -110,8 +115,7 @@ static inline void switch_mm(struct mm_s
* cpu0 to update it's TSB because at that point the cpu_vm_mask
* only had cpu1 set in it.
*/
- load_secondary_context(mm);
- tsb_context_switch(mm);
+ tsb_context_switch_ctx(mm, CTX_HWBITS(mm->context));

/* Any time a processor runs a context on an address space
* for the first time, we must flush that context out of the
--- a/arch/sparc/kernel/tsb.S
+++ b/arch/sparc/kernel/tsb.S
@@ -375,6 +375,7 @@ tsb_flush:
* %o1: TSB base config pointer
* %o2: TSB huge config pointer, or NULL if none
* %o3: Hypervisor TSB descriptor physical address
+ * %o4: Secondary context to load, if non-zero
*
* We have to run this whole thing with interrupts
* disabled so that the current cpu doesn't change
@@ -387,6 +388,17 @@ __tsb_context_switch:
rdpr %pstate, %g1
wrpr %g1, PSTATE_IE, %pstate

+ brz,pn %o4, 1f
+ mov SECONDARY_CONTEXT, %o5
+
+661: stxa %o4, [%o5] ASI_DMMU
+ .section .sun4v_1insn_patch, "ax"
+ .word 661b
+ stxa %o4, [%o5] ASI_MMU
+ .previous
+ flush %g6
+
+1:
TRAP_LOAD_TRAP_BLOCK(%g2, %g3)

stx %o0, [%g2 + TRAP_PER_CPU_PGD_PADDR]
--- a/arch/sparc/power/hibernate.c
+++ b/arch/sparc/power/hibernate.c
@@ -35,6 +35,5 @@ void restore_processor_state(void)
{
struct mm_struct *mm = current->active_mm;

- load_secondary_context(mm);
- tsb_context_switch(mm);
+ tsb_context_switch_ctx(mm, CTX_HWBITS(mm->context));
}


2017-08-11 22:02:55

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.4 06/15] net: avoid skb_warn_bad_offload false positives on UFO

4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Willem de Bruijn <[email protected]>


[ Upstream commit 8d63bee643f1fb53e472f0e135cae4eb99d62d19 ]

skb_warn_bad_offload triggers a warning when an skb enters the GSO
stack at __skb_gso_segment that does not have CHECKSUM_PARTIAL
checksum offload set.

Commit b2504a5dbef3 ("net: reduce skb_warn_bad_offload() noise")
observed that SKB_GSO_DODGY producers can trigger the check and
that passing those packets through the GSO handlers will fix it
up. But, the software UFO handler will set ip_summed to
CHECKSUM_NONE.

When __skb_gso_segment is called from the receive path, this
triggers the warning again.

Make UFO set CHECKSUM_UNNECESSARY instead of CHECKSUM_NONE. On
Tx these two are equivalent. On Rx, this better matches the
skb state (checksum computed), as CHECKSUM_NONE here means no
checksum computed.

See also this thread for context:
http://patchwork.ozlabs.org/patch/799015/

Fixes: b2504a5dbef3 ("net: reduce skb_warn_bad_offload() noise")
Signed-off-by: Willem de Bruijn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/core/dev.c | 2 +-
net/ipv4/udp_offload.c | 2 +-
net/ipv6/udp_offload.c | 2 +-
3 files changed, 3 insertions(+), 3 deletions(-)

--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2551,7 +2551,7 @@ static inline bool skb_needs_check(struc
{
if (tx_path)
return skb->ip_summed != CHECKSUM_PARTIAL &&
- skb->ip_summed != CHECKSUM_NONE;
+ skb->ip_summed != CHECKSUM_UNNECESSARY;

return skb->ip_summed == CHECKSUM_NONE;
}
--- a/net/ipv4/udp_offload.c
+++ b/net/ipv4/udp_offload.c
@@ -231,7 +231,7 @@ static struct sk_buff *udp4_ufo_fragment
if (uh->check == 0)
uh->check = CSUM_MANGLED_0;

- skb->ip_summed = CHECKSUM_NONE;
+ skb->ip_summed = CHECKSUM_UNNECESSARY;

/* Fragment the skb. IP headers of the fragments are updated in
* inet_gso_segment()
--- a/net/ipv6/udp_offload.c
+++ b/net/ipv6/udp_offload.c
@@ -86,7 +86,7 @@ static struct sk_buff *udp6_ufo_fragment
if (uh->check == 0)
uh->check = CSUM_MANGLED_0;

- skb->ip_summed = CHECKSUM_NONE;
+ skb->ip_summed = CHECKSUM_UNNECESSARY;

/* Check if there is enough headroom to insert fragment header. */
tnl_hlen = skb_tnl_header_len(skb);


2017-08-11 22:02:54

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.4 07/15] packet: fix tp_reserve race in packet_set_ring

4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Willem de Bruijn <[email protected]>


[ Upstream commit c27927e372f0785f3303e8fad94b85945e2c97b7 ]

Updates to tp_reserve can race with reads of the field in
packet_set_ring. Avoid this by holding the socket lock during
updates in setsockopt PACKET_RESERVE.

This bug was discovered by syzkaller.

Fixes: 8913336a7e8d ("packet: add PACKET_RESERVE sockopt")
Reported-by: Andrey Konovalov <[email protected]>
Signed-off-by: Willem de Bruijn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/packet/af_packet.c | 13 +++++++++----
1 file changed, 9 insertions(+), 4 deletions(-)

--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -3622,14 +3622,19 @@ packet_setsockopt(struct socket *sock, i

if (optlen != sizeof(val))
return -EINVAL;
- if (po->rx_ring.pg_vec || po->tx_ring.pg_vec)
- return -EBUSY;
if (copy_from_user(&val, optval, sizeof(val)))
return -EFAULT;
if (val > INT_MAX)
return -EINVAL;
- po->tp_reserve = val;
- return 0;
+ lock_sock(sk);
+ if (po->rx_ring.pg_vec || po->tx_ring.pg_vec) {
+ ret = -EBUSY;
+ } else {
+ po->tp_reserve = val;
+ ret = 0;
+ }
+ release_sock(sk);
+ return ret;
}
case PACKET_LOSS:
{


2017-08-11 22:02:52

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.4 05/15] tcp: fastopen: tcp_connect() must refresh the route

4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <[email protected]>


[ Upstream commit 8ba60924710cde564a3905588b6219741d6356d0 ]

With new TCP_FASTOPEN_CONNECT socket option, there is a possibility
to call tcp_connect() while socket sk_dst_cache is either NULL
or invalid.

+0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 4
+0 fcntl(4, F_SETFL, O_RDWR|O_NONBLOCK) = 0
+0 setsockopt(4, SOL_TCP, TCP_FASTOPEN_CONNECT, [1], 4) = 0
+0 connect(4, ..., ...) = 0

<< sk->sk_dst_cache becomes obsolete, or even set to NULL >>

+1 sendto(4, ..., 1000, MSG_FASTOPEN, ..., ...) = 1000

We need to refresh the route otherwise bad things can happen,
especially when syzkaller is running on the host :/

Fixes: 19f6d3f3c8422 ("net/tcp-fastopen: Add new API support")
Reported-by: Dmitry Vyukov <[email protected]>
Signed-off-by: Eric Dumazet <[email protected]>
Cc: Wei Wang <[email protected]>
Cc: Yuchung Cheng <[email protected]>
Acked-by: Wei Wang <[email protected]>
Acked-by: Yuchung Cheng <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/ipv4/tcp_output.c | 3 +++
1 file changed, 3 insertions(+)

--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -3256,6 +3256,9 @@ int tcp_connect(struct sock *sk)
struct sk_buff *buff;
int err;

+ if (inet_csk(sk)->icsk_af_ops->rebuild_header(sk))
+ return -EHOSTUNREACH; /* Routing failure or similar. */
+
tcp_connect_init(sk);

if (unlikely(tp->repair)) {


2017-08-11 22:07:21

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.4 09/15] revert "ipv4: Should use consistent conditional judgement for ip fragment in __ip_append_data and ip_finish_output"

4.4-stable review patch. If anyone has any objections, please let me know.

------------------

This reverts commit f102bb7164c9020e12662998f0fd99c3be72d4f6 which is
commit 0a28cfd51e17f4f0a056bcf66bfbe492c3b99f38 upstream as there is
another patch that needs to be applied instead of this one.

Cc: Zheng Li <[email protected]>
Cc: David S. Miller <[email protected]>
Cc: Sasha Levin <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
net/ipv4/ip_output.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -922,7 +922,7 @@ static int __ip_append_data(struct sock
csummode = CHECKSUM_PARTIAL;

cork->length += length;
- if ((((length + fragheaderlen) > mtu) || (skb && skb_is_gso(skb))) &&
+ if (((length > mtu) || (skb && skb_is_gso(skb))) &&
(sk->sk_protocol == IPPROTO_UDP) &&
(rt->dst.dev->features & NETIF_F_UFO) && !rt->dst.header_len &&
(sk->sk_type == SOCK_DGRAM) && !sk->sk_no_check_tx) {


2017-08-11 22:07:45

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.4 08/15] revert "net: account for current skb length when deciding about UFO"

4.4-stable review patch. If anyone has any objections, please let me know.

------------------

This reverts commit ef09c9ff343122a0b245416066992d096416ff19 which is
commit a5cb659bbc1c8644efa0c3138a757a1e432a4880 upstream as it causes
merge issues with later patches that are much more important...

Cc: Michal Kubecek <[email protected]>
Cc: Vlad Yasevich <[email protected]>
Cc: David S. Miller <[email protected]>
Cc: Sasha Levin <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
net/ipv4/ip_output.c | 3 +--
net/ipv6/ip6_output.c | 2 +-
2 files changed, 2 insertions(+), 3 deletions(-)

--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -922,8 +922,7 @@ static int __ip_append_data(struct sock
csummode = CHECKSUM_PARTIAL;

cork->length += length;
- if ((((length + (skb ? skb->len : fragheaderlen)) > mtu) ||
- (skb && skb_is_gso(skb))) &&
+ if ((((length + fragheaderlen) > mtu) || (skb && skb_is_gso(skb))) &&
(sk->sk_protocol == IPPROTO_UDP) &&
(rt->dst.dev->features & NETIF_F_UFO) && !rt->dst.header_len &&
(sk->sk_type == SOCK_DGRAM) && !sk->sk_no_check_tx) {
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -1357,7 +1357,7 @@ emsgsize:
*/

cork->length += length;
- if ((((length + (skb ? skb->len : headersize)) > mtu) ||
+ if ((((length + fragheaderlen) > mtu) ||
(skb && skb_is_gso(skb))) &&
(sk->sk_protocol == IPPROTO_UDP) &&
(rt->dst.dev->features & NETIF_F_UFO) &&


2017-08-11 22:08:03

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.4 03/15] bpf, s390: fix jit branch offset related to ldimm64

4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Daniel Borkmann <[email protected]>


[ Upstream commit b0a0c2566f28e71e5e32121992ac8060cec75510 ]

While testing some other work that required JIT modifications, I
run into test_bpf causing a hang when JIT enabled on s390. The
problematic test case was the one from ddc665a4bb4b (bpf, arm64:
fix jit branch offset related to ldimm64), and turns out that we
do have a similar issue on s390 as well. In bpf_jit_prog() we
update next instruction address after returning from bpf_jit_insn()
with an insn_count. bpf_jit_insn() returns either -1 in case of
error (e.g. unsupported insn), 1 or 2. The latter is only the
case for ldimm64 due to spanning 2 insns, however, next address
is only set to i + 1 not taking actual insn_count into account,
thus fix is to use insn_count instead of 1. bpf_jit_enable in
mode 2 provides also disasm on s390:

Before fix:

000003ff800349b6: a7f40003 brc 15,3ff800349bc ; target
000003ff800349ba: 0000 unknown
000003ff800349bc: e3b0f0700024 stg %r11,112(%r15)
000003ff800349c2: e3e0f0880024 stg %r14,136(%r15)
000003ff800349c8: 0db0 basr %r11,%r0
000003ff800349ca: c0ef00000000 llilf %r14,0
000003ff800349d0: e320b0360004 lg %r2,54(%r11)
000003ff800349d6: e330b03e0004 lg %r3,62(%r11)
000003ff800349dc: ec23ffeda065 clgrj %r2,%r3,10,3ff800349b6 ; jmp
000003ff800349e2: e3e0b0460004 lg %r14,70(%r11)
000003ff800349e8: e3e0b04e0004 lg %r14,78(%r11)
000003ff800349ee: b904002e lgr %r2,%r14
000003ff800349f2: e3b0f0700004 lg %r11,112(%r15)
000003ff800349f8: e3e0f0880004 lg %r14,136(%r15)
000003ff800349fe: 07fe bcr 15,%r14

After fix:

000003ff80ef3db4: a7f40003 brc 15,3ff80ef3dba
000003ff80ef3db8: 0000 unknown
000003ff80ef3dba: e3b0f0700024 stg %r11,112(%r15)
000003ff80ef3dc0: e3e0f0880024 stg %r14,136(%r15)
000003ff80ef3dc6: 0db0 basr %r11,%r0
000003ff80ef3dc8: c0ef00000000 llilf %r14,0
000003ff80ef3dce: e320b0360004 lg %r2,54(%r11)
000003ff80ef3dd4: e330b03e0004 lg %r3,62(%r11)
000003ff80ef3dda: ec230006a065 clgrj %r2,%r3,10,3ff80ef3de6 ; jmp
000003ff80ef3de0: e3e0b0460004 lg %r14,70(%r11)
000003ff80ef3de6: e3e0b04e0004 lg %r14,78(%r11) ; target
000003ff80ef3dec: b904002e lgr %r2,%r14
000003ff80ef3df0: e3b0f0700004 lg %r11,112(%r15)
000003ff80ef3df6: e3e0f0880004 lg %r14,136(%r15)
000003ff80ef3dfc: 07fe bcr 15,%r14

test_bpf.ko suite runs fine after the fix.

Fixes: 054623105728 ("s390/bpf: Add s390x eBPF JIT compiler backend")
Signed-off-by: Daniel Borkmann <[email protected]>
Tested-by: Michael Holzheu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/s390/net/bpf_jit_comp.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

--- a/arch/s390/net/bpf_jit_comp.c
+++ b/arch/s390/net/bpf_jit_comp.c
@@ -1250,7 +1250,8 @@ static int bpf_jit_prog(struct bpf_jit *
insn_count = bpf_jit_insn(jit, fp, i);
if (insn_count < 0)
return -1;
- jit->addrs[i + 1] = jit->prg; /* Next instruction address */
+ /* Next instruction address */
+ jit->addrs[i + insn_count] = jit->prg;
}
bpf_jit_epilogue(jit);



2017-08-11 22:08:02

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.4 04/15] net: sched: set xt_tgchk_param par.nft_compat as 0 in ipt_init_target

4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Xin Long <[email protected]>


[ Upstream commit 96d9703050a0036a3360ec98bb41e107c90664fe ]

Commit 55917a21d0cc ("netfilter: x_tables: add context to know if
extension runs from nft_compat") introduced a member nft_compat to
xt_tgchk_param structure.

But it didn't set it's value for ipt_init_target. With unexpected
value in par.nft_compat, it may return unexpected result in some
target's checkentry.

This patch is to set all it's fields as 0 and only initialize the
non-zero fields in ipt_init_target.

v1->v2:
As Wang Cong's suggestion, fix it by setting all it's fields as
0 and only initializing the non-zero fields.

Fixes: 55917a21d0cc ("netfilter: x_tables: add context to know if extension runs from nft_compat")
Suggested-by: Cong Wang <[email protected]>
Signed-off-by: Xin Long <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/sched/act_ipt.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/net/sched/act_ipt.c
+++ b/net/sched/act_ipt.c
@@ -42,8 +42,8 @@ static int ipt_init_target(struct xt_ent
return PTR_ERR(target);

t->u.kernel.target = target;
+ memset(&par, 0, sizeof(par));
par.table = table;
- par.entryinfo = NULL;
par.target = target;
par.targinfo = t->data;
par.hook_mask = hook;


2017-08-11 22:02:48

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.4 13/15] mm/mempool: avoid KASAN marking mempool poison checks as use-after-free

4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Matthew Dawson <[email protected]>

commit 7640131032db9118a78af715ac77ba2debeeb17c upstream.

When removing an element from the mempool, mark it as unpoisoned in KASAN
before verifying its contents for SLUB/SLAB debugging. Otherwise KASAN
will flag the reads checking the element use-after-free writes as
use-after-free reads.

Signed-off-by: Matthew Dawson <[email protected]>
Acked-by: Andrey Ryabinin <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Cc: Andrii Bordunov <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
mm/mempool.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/mm/mempool.c
+++ b/mm/mempool.c
@@ -135,8 +135,8 @@ static void *remove_element(mempool_t *p
void *element = pool->elements[--pool->curr_nr];

BUG_ON(pool->curr_nr < 0);
- check_element(pool, element);
kasan_unpoison_element(pool, element);
+ check_element(pool, element);
return element;
}



2017-08-11 22:08:38

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.4 02/15] net: fix keepalive code vs TCP_FASTOPEN_CONNECT

4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <[email protected]>


[ Upstream commit 2dda640040876cd8ae646408b69eea40c24f9ae9 ]

syzkaller was able to trigger a divide by 0 in TCP stack [1]

Issue here is that keepalive timer needs to be updated to not attempt
to send a probe if the connection setup was deferred using
TCP_FASTOPEN_CONNECT socket option added in linux-4.11

[1]
divide error: 0000 [#1] SMP
CPU: 18 PID: 0 Comm: swapper/18 Not tainted
task: ffff986f62f4b040 ti: ffff986f62fa2000 task.ti: ffff986f62fa2000
RIP: 0010:[<ffffffff8409cc0d>] [<ffffffff8409cc0d>] __tcp_select_window+0x8d/0x160
Call Trace:
<IRQ>
[<ffffffff8409d951>] tcp_transmit_skb+0x11/0x20
[<ffffffff8409da21>] tcp_xmit_probe_skb+0xc1/0xe0
[<ffffffff840a0ee8>] tcp_write_wakeup+0x68/0x160
[<ffffffff840a151b>] tcp_keepalive_timer+0x17b/0x230
[<ffffffff83b3f799>] call_timer_fn+0x39/0xf0
[<ffffffff83b40797>] run_timer_softirq+0x1d7/0x280
[<ffffffff83a04ddb>] __do_softirq+0xcb/0x257
[<ffffffff83ae03ac>] irq_exit+0x9c/0xb0
[<ffffffff83a04c1a>] smp_apic_timer_interrupt+0x6a/0x80
[<ffffffff83a03eaf>] apic_timer_interrupt+0x7f/0x90
<EOI>
[<ffffffff83fed2ea>] ? cpuidle_enter_state+0x13a/0x3b0
[<ffffffff83fed2cd>] ? cpuidle_enter_state+0x11d/0x3b0

Tested:

Following packetdrill no longer crashes the kernel

`echo 0 >/proc/sys/net/ipv4/tcp_timestamps`

// Cache warmup: send a Fast Open cookie request
0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
+0 fcntl(3, F_SETFL, O_RDWR|O_NONBLOCK) = 0
+0 setsockopt(3, SOL_TCP, TCP_FASTOPEN_CONNECT, [1], 4) = 0
+0 connect(3, ..., ...) = -1 EINPROGRESS (Operation is now in progress)
+0 > S 0:0(0) <mss 1460,nop,nop,sackOK,nop,wscale 8,FO,nop,nop>
+.01 < S. 123:123(0) ack 1 win 14600 <mss 1460,nop,nop,sackOK,nop,wscale 6,FO abcd1234,nop,nop>
+0 > . 1:1(0) ack 1
+0 close(3) = 0
+0 > F. 1:1(0) ack 1
+0 < F. 1:1(0) ack 2 win 92
+0 > . 2:2(0) ack 2

+0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 4
+0 fcntl(4, F_SETFL, O_RDWR|O_NONBLOCK) = 0
+0 setsockopt(4, SOL_TCP, TCP_FASTOPEN_CONNECT, [1], 4) = 0
+0 setsockopt(4, SOL_SOCKET, SO_KEEPALIVE, [1], 4) = 0
+.01 connect(4, ..., ...) = 0
+0 setsockopt(4, SOL_TCP, TCP_KEEPIDLE, [5], 4) = 0
+10 close(4) = 0

`echo 1 >/proc/sys/net/ipv4/tcp_timestamps`

Fixes: 19f6d3f3c842 ("net/tcp-fastopen: Add new API support")
Signed-off-by: Eric Dumazet <[email protected]>
Reported-by: Dmitry Vyukov <[email protected]>
Cc: Wei Wang <[email protected]>
Cc: Yuchung Cheng <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/ipv4/tcp_timer.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

--- a/net/ipv4/tcp_timer.c
+++ b/net/ipv4/tcp_timer.c
@@ -606,7 +606,8 @@ static void tcp_keepalive_timer (unsigne
goto death;
}

- if (!sock_flag(sk, SOCK_KEEPOPEN) || sk->sk_state == TCP_CLOSE)
+ if (!sock_flag(sk, SOCK_KEEPOPEN) ||
+ ((1 << sk->sk_state) & (TCPF_CLOSE | TCPF_SYN_SENT)))
goto out;

elapsed = keepalive_time_when(tp);


2017-08-11 22:09:01

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.4 14/15] ipv4: Should use consistent conditional judgement for ip fragment in __ip_append_data and ip_finish_output

4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: zheng li <[email protected]>

commit 0a28cfd51e17f4f0a056bcf66bfbe492c3b99f38 upstream.

There is an inconsistent conditional judgement in __ip_append_data and
ip_finish_output functions, the variable length in __ip_append_data just
include the length of application's payload and udp header, don't include
the length of ip header, but in ip_finish_output use
(skb->len > ip_skb_dst_mtu(skb)) as judgement, and skb->len include the
length of ip header.

That causes some particular application's udp payload whose length is
between (MTU - IP Header) and MTU were fragmented by ip_fragment even
though the rst->dev support UFO feature.

Add the length of ip header to length in __ip_append_data to keep
consistent conditional judgement as ip_finish_output for ip fragment.

Signed-off-by: Zheng Li <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
net/ipv4/ip_output.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -923,7 +923,7 @@ static int __ip_append_data(struct sock

cork->length += length;
if ((skb && skb_is_gso(skb)) ||
- ((length > mtu) &&
+ (((length + fragheaderlen) > mtu) &&
(skb_queue_len(queue) <= 1) &&
(sk->sk_protocol == IPPROTO_UDP) &&
(rt->dst.dev->features & NETIF_F_UFO) && !rt->dst.header_len &&


2017-08-11 22:09:30

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.4 12/15] KVM: arm/arm64: Handle hva aging while destroying the vm

4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Suzuki K Poulose <[email protected]>

commit 7e5a672289c9754d07e1c3b33649786d3d70f5e4 upstream.

The mmu_notifier_release() callback of KVM triggers cleaning up
the stage2 page table on kvm-arm. However there could be other
notifier callbacks in parallel with the mmu_notifier_release(),
which could cause the call backs ending up in an empty stage2
page table. Make sure we check it for all the notifier callbacks.

Fixes: commit 293f29363 ("kvm-arm: Unmap shadow pagetables properly")
Reported-by: Alex Graf <[email protected]>
Reviewed-by: Christoffer Dall <[email protected]>
Signed-off-by: Suzuki K Poulose <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>


---
arch/arm/kvm/mmu.c | 4 ++++
1 file changed, 4 insertions(+)

--- a/arch/arm/kvm/mmu.c
+++ b/arch/arm/kvm/mmu.c
@@ -1629,12 +1629,16 @@ static int kvm_test_age_hva_handler(stru

int kvm_age_hva(struct kvm *kvm, unsigned long start, unsigned long end)
{
+ if (!kvm->arch.pgd)
+ return 0;
trace_kvm_age_hva(start, end);
return handle_hva_to_gpa(kvm, start, end, kvm_age_hva_handler, NULL);
}

int kvm_test_age_hva(struct kvm *kvm, unsigned long hva)
{
+ if (!kvm->arch.pgd)
+ return 0;
trace_kvm_test_age_hva(hva);
return handle_hva_to_gpa(kvm, hva, hva, kvm_test_age_hva_handler, NULL);
}


2017-08-11 22:09:51

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.4 10/15] udp: consistently apply ufo or fragmentation

4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Willem de Bruijn <[email protected]>


[ Upstream commit 85f1bd9a7b5a79d5baa8bf44af19658f7bf77bfa ]

When iteratively building a UDP datagram with MSG_MORE and that
datagram exceeds MTU, consistently choose UFO or fragmentation.

Once skb_is_gso, always apply ufo. Conversely, once a datagram is
split across multiple skbs, do not consider ufo.

Sendpage already maintains the first invariant, only add the second.
IPv6 does not have a sendpage implementation to modify.

A gso skb must have a partial checksum, do not follow sk_no_check_tx
in udp_send_skb.

Found by syzkaller.

Fixes: e89e9cf539a2 ("[IPv4/IPv6]: UFO Scatter-gather approach")
Reported-by: Andrey Konovalov <[email protected]>
Signed-off-by: Willem de Bruijn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/ipv4/ip_output.c | 7 +++++--
net/ipv4/udp.c | 2 +-
net/ipv6/ip6_output.c | 7 ++++---
3 files changed, 10 insertions(+), 6 deletions(-)

--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -922,10 +922,12 @@ static int __ip_append_data(struct sock
csummode = CHECKSUM_PARTIAL;

cork->length += length;
- if (((length > mtu) || (skb && skb_is_gso(skb))) &&
+ if ((skb && skb_is_gso(skb)) ||
+ ((length > mtu) &&
+ (skb_queue_len(queue) <= 1) &&
(sk->sk_protocol == IPPROTO_UDP) &&
(rt->dst.dev->features & NETIF_F_UFO) && !rt->dst.header_len &&
- (sk->sk_type == SOCK_DGRAM) && !sk->sk_no_check_tx) {
+ (sk->sk_type == SOCK_DGRAM) && !sk->sk_no_check_tx)) {
err = ip_ufo_append_data(sk, queue, getfrag, from, length,
hh_len, fragheaderlen, transhdrlen,
maxfraglen, flags);
@@ -1241,6 +1243,7 @@ ssize_t ip_append_page(struct sock *sk,
return -EINVAL;

if ((size + skb->len > mtu) &&
+ (skb_queue_len(&sk->sk_write_queue) == 1) &&
(sk->sk_protocol == IPPROTO_UDP) &&
(rt->dst.dev->features & NETIF_F_UFO)) {
if (skb->ip_summed != CHECKSUM_PARTIAL)
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -819,7 +819,7 @@ static int udp_send_skb(struct sk_buff *
if (is_udplite) /* UDP-Lite */
csum = udplite_csum(skb);

- else if (sk->sk_no_check_tx) { /* UDP csum disabled */
+ else if (sk->sk_no_check_tx && !skb_is_gso(skb)) { /* UDP csum off */

skb->ip_summed = CHECKSUM_NONE;
goto send;
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -1357,11 +1357,12 @@ emsgsize:
*/

cork->length += length;
- if ((((length + fragheaderlen) > mtu) ||
- (skb && skb_is_gso(skb))) &&
+ if ((skb && skb_is_gso(skb)) ||
+ (((length + fragheaderlen) > mtu) &&
+ (skb_queue_len(queue) <= 1) &&
(sk->sk_protocol == IPPROTO_UDP) &&
(rt->dst.dev->features & NETIF_F_UFO) &&
- (sk->sk_type == SOCK_DGRAM) && !udp_get_no_check6_tx(sk)) {
+ (sk->sk_type == SOCK_DGRAM) && !udp_get_no_check6_tx(sk))) {
err = ip6_ufo_append_data(sk, queue, getfrag, from, length,
hh_len, fragheaderlen, exthdrlen,
transhdrlen, mtu, flags, fl6);


2017-08-11 22:10:08

by Greg Kroah-Hartman

[permalink] [raw]
Subject: [PATCH 4.4 01/15] tcp: avoid setting cwnd to invalid ssthresh after cwnd reduction states

4.4-stable review patch. If anyone has any objections, please let me know.

------------------

From: Yuchung Cheng <[email protected]>


[ Upstream commit ed254971edea92c3ac5c67c6a05247a92aa6075e ]

If the sender switches the congestion control during ECN-triggered
cwnd-reduction state (CA_CWR), upon exiting recovery cwnd is set to
the ssthresh value calculated by the previous congestion control. If
the previous congestion control is BBR that always keep ssthresh
to TCP_INIFINITE_SSTHRESH, cwnd ends up being infinite. The safe
step is to avoid assigning invalid ssthresh value when recovery ends.

Signed-off-by: Yuchung Cheng <[email protected]>
Signed-off-by: Neal Cardwell <[email protected]>
Acked-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/ipv4/tcp_input.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -2503,8 +2503,8 @@ static inline void tcp_end_cwnd_reductio
struct tcp_sock *tp = tcp_sk(sk);

/* Reset cwnd to ssthresh in CWR or Recovery (unless it's undone) */
- if (inet_csk(sk)->icsk_ca_state == TCP_CA_CWR ||
- (tp->undo_marker && tp->snd_ssthresh < TCP_INFINITE_SSTHRESH)) {
+ if (tp->snd_ssthresh < TCP_INFINITE_SSTHRESH &&
+ (inet_csk(sk)->icsk_ca_state == TCP_CA_CWR || tp->undo_marker)) {
tp->snd_cwnd = tp->snd_ssthresh;
tp->snd_cwnd_stamp = tcp_time_stamp;
}


2017-08-12 01:57:39

by Shuah Khan

[permalink] [raw]
Subject: Re: [PATCH 4.4 00/15] 4.4.82-stable review

On 08/11/2017 04:02 PM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 4.4.82 release.
> There are 15 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Sun Aug 13 22:02:10 UTC 2017.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.4.82-rc1.gz
> or in the git tree and branch at:
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.4.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h
>

Compiled and booted test system. No dmesg regressions.

thanks,
-- Shuah

2017-08-12 12:24:13

by Guenter Roeck

[permalink] [raw]
Subject: Re: [PATCH 4.4 00/15] 4.4.82-stable review

On 08/11/2017 03:02 PM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 4.4.82 release.
> There are 15 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Sun Aug 13 22:02:10 UTC 2017.
> Anything received after that time might be too late.
>

Build results:
total: 145 pass: 145 fail: 0
Qemu test results:
total: 115 pass: 115 fail: 0

Details are available at http://kerneltests.org/builders.

Guenter