2017-08-11 22:02:35

by Greg Kroah-Hartman

Subject: [PATCH 4.9 00/16] 4.9.43-stable review

This is the start of the stable review cycle for the 4.9.43 release.
There are 16 patches in this series; all will be posted as responses
to this one. If anyone has any issues with these being applied, please
let me know.

Responses should be made by Sun Aug 13 22:01:23 UTC 2017.
Anything received after that time might be too late.

The whole patch series can be found in one patch at:
kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.43-rc1.gz
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y
and the diffstat can be found below.

thanks,

greg k-h

-------------
Pseudo-Shortlog of commits:

Greg Kroah-Hartman <[email protected]>
Linux 4.9.43-rc1

Suzuki K Poulose <[email protected]>
KVM: arm/arm64: Handle hva aging while destroying the vm

Rob Gardner <[email protected]>
sparc64: Prevent perf from running during super critical sections

Willem de Bruijn <[email protected]>
udp: consistently apply ufo or fragmentation

Greg Kroah-Hartman <[email protected]>
revert "ipv4: Should use consistent conditional judgement for ip fragment in __ip_append_data and ip_finish_output"

Greg Kroah-Hartman <[email protected]>
revert "net: account for current skb length when deciding about UFO"

Willem de Bruijn <[email protected]>
packet: fix tp_reserve race in packet_set_ring

Nikolay Borisov <[email protected]>
igmp: Fix regression caused by igmp sysctl namespace code.

Willem de Bruijn <[email protected]>
net: avoid skb_warn_bad_offload false positives on UFO

Eric Dumazet <[email protected]>
tcp: fastopen: tcp_connect() must refresh the route

Xin Long <[email protected]>
net: sched: set xt_tgchk_param par.nft_compat as 0 in ipt_init_target

Davide Caratti <[email protected]>
net/mlx4_en: don't set CHECKSUM_COMPLETE on SCTP packets

Daniel Borkmann <[email protected]>
bpf, s390: fix jit branch offset related to ldimm64

Eric Dumazet <[email protected]>
net: fix keepalive code vs TCP_FASTOPEN_CONNECT

Yuchung Cheng <[email protected]>
tcp: avoid setting cwnd to invalid ssthresh after cwnd reduction states

Guillaume Nault <[email protected]>
ppp: fix xmit recursion detection on ppp channels

Gao Feng <[email protected]>
ppp: Fix false xmit recursion detect with two ppp devices


-------------

Diffstat:

Makefile | 4 +--
arch/arm/kvm/mmu.c | 4 +++
arch/s390/net/bpf_jit_comp.c | 3 +-
arch/sparc/include/asm/mmu_context_64.h | 14 ++++++----
arch/sparc/kernel/tsb.S | 12 ++++++++
arch/sparc/power/hibernate.c | 3 +-
drivers/net/ethernet/mellanox/mlx4/en_rx.c | 29 ++++++++++++--------
drivers/net/ppp/ppp_generic.c | 44 ++++++++++++++++++++----------
net/core/dev.c | 2 +-
net/ipv4/af_inet.c | 7 +++++
net/ipv4/igmp.c | 6 ----
net/ipv4/ip_output.c | 8 ++++--
net/ipv4/tcp_input.c | 4 +--
net/ipv4/tcp_output.c | 3 ++
net/ipv4/tcp_timer.c | 3 +-
net/ipv4/udp.c | 2 +-
net/ipv4/udp_offload.c | 2 +-
net/ipv6/ip6_output.c | 7 +++--
net/ipv6/udp_offload.c | 2 +-
net/packet/af_packet.c | 13 ++++++---
net/sched/act_ipt.c | 2 +-
21 files changed, 114 insertions(+), 60 deletions(-)



2017-08-11 22:02:05

by Greg Kroah-Hartman

Subject: [PATCH 4.9 13/16] revert "ipv4: Should use consistent conditional judgement for ip fragment in __ip_append_data and ip_finish_output"

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

This reverts commit f102bb7164c9020e12662998f0fd99c3be72d4f6 which is
commit 0a28cfd51e17f4f0a056bcf66bfbe492c3b99f38 upstream as there is
another patch that needs to be applied instead of this one.

Cc: Zheng Li <[email protected]>
Cc: David S. Miller <[email protected]>
Cc: Sasha Levin <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
net/ipv4/ip_output.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -936,7 +936,7 @@ static int __ip_append_data(struct sock
csummode = CHECKSUM_PARTIAL;

cork->length += length;
- if ((((length + fragheaderlen) > mtu) || (skb && skb_is_gso(skb))) &&
+ if (((length > mtu) || (skb && skb_is_gso(skb))) &&
(sk->sk_protocol == IPPROTO_UDP) &&
(rt->dst.dev->features & NETIF_F_UFO) && !rt->dst.header_len &&
(sk->sk_type == SOCK_DGRAM) && !sk->sk_no_check_tx) {


2017-08-11 22:02:20

by Greg Kroah-Hartman

Subject: [PATCH 4.9 09/16] net: avoid skb_warn_bad_offload false positives on UFO

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Willem de Bruijn <[email protected]>


[ Upstream commit 8d63bee643f1fb53e472f0e135cae4eb99d62d19 ]

skb_warn_bad_offload triggers a warning when an skb enters the GSO
stack at __skb_gso_segment that does not have CHECKSUM_PARTIAL
checksum offload set.

Commit b2504a5dbef3 ("net: reduce skb_warn_bad_offload() noise")
observed that SKB_GSO_DODGY producers can trigger the check and
that passing those packets through the GSO handlers will fix it
up. But, the software UFO handler will set ip_summed to
CHECKSUM_NONE.

When __skb_gso_segment is called from the receive path, this
triggers the warning again.

Make UFO set CHECKSUM_UNNECESSARY instead of CHECKSUM_NONE. On
Tx these two are equivalent. On Rx, this better matches the
skb state (checksum computed), as CHECKSUM_NONE here means no
checksum computed.

See also this thread for context:
http://patchwork.ozlabs.org/patch/799015/

Fixes: b2504a5dbef3 ("net: reduce skb_warn_bad_offload() noise")
Signed-off-by: Willem de Bruijn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/core/dev.c | 2 +-
net/ipv4/udp_offload.c | 2 +-
net/ipv6/udp_offload.c | 2 +-
3 files changed, 3 insertions(+), 3 deletions(-)

--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -2703,7 +2703,7 @@ static inline bool skb_needs_check(struc
{
if (tx_path)
return skb->ip_summed != CHECKSUM_PARTIAL &&
- skb->ip_summed != CHECKSUM_NONE;
+ skb->ip_summed != CHECKSUM_UNNECESSARY;

return skb->ip_summed == CHECKSUM_NONE;
}
--- a/net/ipv4/udp_offload.c
+++ b/net/ipv4/udp_offload.c
@@ -232,7 +232,7 @@ static struct sk_buff *udp4_ufo_fragment
if (uh->check == 0)
uh->check = CSUM_MANGLED_0;

- skb->ip_summed = CHECKSUM_NONE;
+ skb->ip_summed = CHECKSUM_UNNECESSARY;

/* If there is no outer header we can fake a checksum offload
* due to the fact that we have already done the checksum in
--- a/net/ipv6/udp_offload.c
+++ b/net/ipv6/udp_offload.c
@@ -72,7 +72,7 @@ static struct sk_buff *udp6_ufo_fragment
if (uh->check == 0)
uh->check = CSUM_MANGLED_0;

- skb->ip_summed = CHECKSUM_NONE;
+ skb->ip_summed = CHECKSUM_UNNECESSARY;

/* If there is no outer header we can fake a checksum offload
* due to the fact that we have already done the checksum in
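
For readers following the checksum-state change, here is a minimal
stand-alone sketch of the patched skb_needs_check() logic. It is not
kernel code: the demo_ names are made up for illustration, and the enum
values are assumed to mirror the kernel's CHECKSUM_* constants. It only
shows that an skb marked CHECKSUM_UNNECESSARY by the software UFO handler
no longer trips the check on either the transmit or the receive path.

```c
#include <stdbool.h>

/* Hypothetical mirror of the kernel's CHECKSUM_* values (assumed 0..3). */
enum demo_csum {
	DEMO_CHECKSUM_NONE = 0,
	DEMO_CHECKSUM_UNNECESSARY = 1,
	DEMO_CHECKSUM_COMPLETE = 2,
	DEMO_CHECKSUM_PARTIAL = 3
};

/* Same shape as the patched skb_needs_check(): true means "warn". */
static bool demo_skb_needs_check(enum demo_csum ip_summed, bool tx_path)
{
	if (tx_path)
		return ip_summed != DEMO_CHECKSUM_PARTIAL &&
		       ip_summed != DEMO_CHECKSUM_UNNECESSARY;

	return ip_summed == DEMO_CHECKSUM_NONE;
}
```

With this predicate, CHECKSUM_NONE is still flagged on both paths, while
the value UFO now sets is accepted on both.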


2017-08-11 22:02:18

by Greg Kroah-Hartman

Subject: [PATCH 4.9 07/16] net: sched: set xt_tgchk_param par.nft_compat as 0 in ipt_init_target

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Xin Long <[email protected]>


[ Upstream commit 96d9703050a0036a3360ec98bb41e107c90664fe ]

Commit 55917a21d0cc ("netfilter: x_tables: add context to know if
extension runs from nft_compat") introduced a member nft_compat to
xt_tgchk_param structure.

But it didn't set its value for ipt_init_target. With an unexpected
value in par.nft_compat, some targets' checkentry may return an
unexpected result.

This patch sets all of its fields to 0 and only initializes the
non-zero fields in ipt_init_target.

v1->v2:
As per Wang Cong's suggestion, fix it by setting all of its fields
to 0 and only initializing the non-zero fields.

Fixes: 55917a21d0cc ("netfilter: x_tables: add context to know if extension runs from nft_compat")
Suggested-by: Cong Wang <[email protected]>
Signed-off-by: Xin Long <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/sched/act_ipt.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

--- a/net/sched/act_ipt.c
+++ b/net/sched/act_ipt.c
@@ -49,8 +49,8 @@ static int ipt_init_target(struct xt_ent
return PTR_ERR(target);

t->u.kernel.target = target;
+ memset(&par, 0, sizeof(par));
par.table = table;
- par.entryinfo = NULL;
par.target = target;
par.targinfo = t->data;
par.hook_mask = hook;
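
The zero-then-fill pattern used by this fix can be sketched outside the
kernel as follows. The struct and function names are hypothetical
stand-ins (the real structure is struct xt_tgchk_param, which has more
members); the point is only that memset() gives every field not assigned
explicitly, including nft_compat, a defined value of 0.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical, reduced stand-in for struct xt_tgchk_param. */
struct demo_tgchk_param {
	const char *table;
	const void *entryinfo;
	const void *target;
	void *targinfo;
	unsigned int hook_mask;
	bool nft_compat;
};

/* Zero the whole structure first, then set only the non-zero fields. */
static void demo_init_param(struct demo_tgchk_param *par, const char *table,
			    const void *target, void *targinfo,
			    unsigned int hook)
{
	memset(par, 0, sizeof(*par));
	par->table = table;
	par->target = target;
	par->targinfo = targinfo;
	par->hook_mask = hook;
}
```

A caller that passes in a structure holding stale stack contents gets
entryinfo == NULL and nft_compat == false back, with no per-field
assignment to forget when a new member is added later.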


2017-08-11 22:10:39

by Greg Kroah-Hartman

Subject: [PATCH 4.9 08/16] tcp: fastopen: tcp_connect() must refresh the route

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <[email protected]>


[ Upstream commit 8ba60924710cde564a3905588b6219741d6356d0 ]

With new TCP_FASTOPEN_CONNECT socket option, there is a possibility
to call tcp_connect() while socket sk_dst_cache is either NULL
or invalid.

+0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 4
+0 fcntl(4, F_SETFL, O_RDWR|O_NONBLOCK) = 0
+0 setsockopt(4, SOL_TCP, TCP_FASTOPEN_CONNECT, [1], 4) = 0
+0 connect(4, ..., ...) = 0

<< sk->sk_dst_cache becomes obsolete, or even set to NULL >>

+1 sendto(4, ..., 1000, MSG_FASTOPEN, ..., ...) = 1000

We need to refresh the route otherwise bad things can happen,
especially when syzkaller is running on the host :/

Fixes: 19f6d3f3c8422 ("net/tcp-fastopen: Add new API support")
Reported-by: Dmitry Vyukov <[email protected]>
Signed-off-by: Eric Dumazet <[email protected]>
Cc: Wei Wang <[email protected]>
Cc: Yuchung Cheng <[email protected]>
Acked-by: Wei Wang <[email protected]>
Acked-by: Yuchung Cheng <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/ipv4/tcp_output.c | 3 +++
1 file changed, 3 insertions(+)

--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -3344,6 +3344,9 @@ int tcp_connect(struct sock *sk)
struct sk_buff *buff;
int err;

+ if (inet_csk(sk)->icsk_af_ops->rebuild_header(sk))
+ return -EHOSTUNREACH; /* Routing failure or similar. */
+
tcp_connect_init(sk);

if (unlikely(tp->repair)) {


2017-08-11 22:02:04

by Greg Kroah-Hartman

Subject: [PATCH 4.9 14/16] udp: consistently apply ufo or fragmentation

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Willem de Bruijn <[email protected]>


[ Upstream commit 85f1bd9a7b5a79d5baa8bf44af19658f7bf77bfa ]

When iteratively building a UDP datagram with MSG_MORE and that
datagram exceeds MTU, consistently choose UFO or fragmentation.

Once skb_is_gso, always apply ufo. Conversely, once a datagram is
split across multiple skbs, do not consider ufo.

Sendpage already maintains the first invariant, only add the second.
IPv6 does not have a sendpage implementation to modify.

A gso skb must have a partial checksum, do not follow sk_no_check_tx
in udp_send_skb.

Found by syzkaller.

Fixes: e89e9cf539a2 ("[IPv4/IPv6]: UFO Scatter-gather approach")
Reported-by: Andrey Konovalov <[email protected]>
Signed-off-by: Willem de Bruijn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/ipv4/ip_output.c | 7 +++++--
net/ipv4/udp.c | 2 +-
net/ipv6/ip6_output.c | 7 ++++---
3 files changed, 10 insertions(+), 6 deletions(-)

--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -936,10 +936,12 @@ static int __ip_append_data(struct sock
csummode = CHECKSUM_PARTIAL;

cork->length += length;
- if (((length > mtu) || (skb && skb_is_gso(skb))) &&
+ if ((skb && skb_is_gso(skb)) ||
+ ((length > mtu) &&
+ (skb_queue_len(queue) <= 1) &&
(sk->sk_protocol == IPPROTO_UDP) &&
(rt->dst.dev->features & NETIF_F_UFO) && !rt->dst.header_len &&
- (sk->sk_type == SOCK_DGRAM) && !sk->sk_no_check_tx) {
+ (sk->sk_type == SOCK_DGRAM) && !sk->sk_no_check_tx)) {
err = ip_ufo_append_data(sk, queue, getfrag, from, length,
hh_len, fragheaderlen, transhdrlen,
maxfraglen, flags);
@@ -1255,6 +1257,7 @@ ssize_t ip_append_page(struct sock *sk,
return -EINVAL;

if ((size + skb->len > mtu) &&
+ (skb_queue_len(&sk->sk_write_queue) == 1) &&
(sk->sk_protocol == IPPROTO_UDP) &&
(rt->dst.dev->features & NETIF_F_UFO)) {
if (skb->ip_summed != CHECKSUM_PARTIAL)
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -813,7 +813,7 @@ static int udp_send_skb(struct sk_buff *
if (is_udplite) /* UDP-Lite */
csum = udplite_csum(skb);

- else if (sk->sk_no_check_tx) { /* UDP csum disabled */
+ else if (sk->sk_no_check_tx && !skb_is_gso(skb)) { /* UDP csum off */

skb->ip_summed = CHECKSUM_NONE;
goto send;
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -1372,11 +1372,12 @@ emsgsize:
*/

cork->length += length;
- if ((((length + fragheaderlen) > mtu) ||
- (skb && skb_is_gso(skb))) &&
+ if ((skb && skb_is_gso(skb)) ||
+ (((length + fragheaderlen) > mtu) &&
+ (skb_queue_len(queue) <= 1) &&
(sk->sk_protocol == IPPROTO_UDP) &&
(rt->dst.dev->features & NETIF_F_UFO) && !rt->dst.header_len &&
- (sk->sk_type == SOCK_DGRAM) && !udp_get_no_check6_tx(sk)) {
+ (sk->sk_type == SOCK_DGRAM) && !udp_get_no_check6_tx(sk))) {
err = ip6_ufo_append_data(sk, queue, getfrag, from, length,
hh_len, fragheaderlen, exthdrlen,
transhdrlen, mtu, flags, fl6);
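
To make the two invariants easier to see, the IPv4 condition before and
after this patch can be modelled as plain predicates. This is only a
sketch: the struct, its field names and the single udp_ufo_ok flag
(standing in for the protocol, device-feature, header_len, socket-type
and no-check tests) are assumptions made for illustration.

```c
#include <stdbool.h>

/* Hypothetical snapshot of the state __ip_append_data() looks at. */
struct demo_append_state {
	bool have_skb;          /* a tail skb already exists on the queue */
	bool skb_is_gso;        /* that skb is already a GSO (UFO) skb */
	unsigned int length;    /* bytes being appended */
	unsigned int mtu;
	unsigned int queue_len; /* skbs already queued for this datagram */
	bool udp_ufo_ok;        /* all remaining UDP/UFO eligibility tests */
};

/* Condition before the patch. */
static bool demo_use_ufo_old(const struct demo_append_state *s)
{
	return ((s->length > s->mtu) || (s->have_skb && s->skb_is_gso)) &&
	       s->udp_ufo_ok;
}

/* Condition after the patch. */
static bool demo_use_ufo_new(const struct demo_append_state *s)
{
	return (s->have_skb && s->skb_is_gso) ||
	       ((s->length > s->mtu) && (s->queue_len <= 1) && s->udp_ufo_ok);
}
```

Two cases differ. A datagram that is already GSO stays on the UFO path
even if a later eligibility test fails, and a datagram already split
across several skbs never switches to UFO.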


2017-08-11 22:11:02

by Greg Kroah-Hartman

Subject: [PATCH 4.9 06/16] net/mlx4_en: don't set CHECKSUM_COMPLETE on SCTP packets

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Davide Caratti <[email protected]>


[ Upstream commit e718fe450e616227b74d27a233cdf37b4df0c82b ]

If the NIC fails to validate the checksum on TCP/UDP, and validation of IP
checksum is successful, the driver subtracts the pseudo-header checksum
from the value obtained by the hardware and sets CHECKSUM_COMPLETE. Don't
do that if protocol is IPPROTO_SCTP, otherwise CRC32c validation fails.

V2: don't test MLX4_CQE_STATUS_IPV6 if MLX4_CQE_STATUS_IPV4 is set

Reported-by: Shuang Li <[email protected]>
Fixes: f8c6455bb04b ("net/mlx4_en: Extend checksum offloading by CHECKSUM COMPLETE")
Signed-off-by: Davide Caratti <[email protected]>
Acked-by: Saeed Mahameed <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/net/ethernet/mellanox/mlx4/en_rx.c | 29 ++++++++++++++++++-----------
1 file changed, 18 insertions(+), 11 deletions(-)

--- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
@@ -724,16 +724,21 @@ static inline __wsum get_fixed_vlan_csum
* header, the HW adds it. To address that, we are subtracting the pseudo
* header checksum from the checksum value provided by the HW.
*/
-static void get_fixed_ipv4_csum(__wsum hw_checksum, struct sk_buff *skb,
- struct iphdr *iph)
+static int get_fixed_ipv4_csum(__wsum hw_checksum, struct sk_buff *skb,
+ struct iphdr *iph)
{
__u16 length_for_csum = 0;
__wsum csum_pseudo_header = 0;
+ __u8 ipproto = iph->protocol;
+
+ if (unlikely(ipproto == IPPROTO_SCTP))
+ return -1;

length_for_csum = (be16_to_cpu(iph->tot_len) - (iph->ihl << 2));
csum_pseudo_header = csum_tcpudp_nofold(iph->saddr, iph->daddr,
- length_for_csum, iph->protocol, 0);
+ length_for_csum, ipproto, 0);
skb->csum = csum_sub(hw_checksum, csum_pseudo_header);
+ return 0;
}

#if IS_ENABLED(CONFIG_IPV6)
@@ -744,17 +749,20 @@ static void get_fixed_ipv4_csum(__wsum h
static int get_fixed_ipv6_csum(__wsum hw_checksum, struct sk_buff *skb,
struct ipv6hdr *ipv6h)
{
+ __u8 nexthdr = ipv6h->nexthdr;
__wsum csum_pseudo_hdr = 0;

- if (unlikely(ipv6h->nexthdr == IPPROTO_FRAGMENT ||
- ipv6h->nexthdr == IPPROTO_HOPOPTS))
+ if (unlikely(nexthdr == IPPROTO_FRAGMENT ||
+ nexthdr == IPPROTO_HOPOPTS ||
+ nexthdr == IPPROTO_SCTP))
return -1;
- hw_checksum = csum_add(hw_checksum, (__force __wsum)htons(ipv6h->nexthdr));
+ hw_checksum = csum_add(hw_checksum, (__force __wsum)htons(nexthdr));

csum_pseudo_hdr = csum_partial(&ipv6h->saddr,
sizeof(ipv6h->saddr) + sizeof(ipv6h->daddr), 0);
csum_pseudo_hdr = csum_add(csum_pseudo_hdr, (__force __wsum)ipv6h->payload_len);
- csum_pseudo_hdr = csum_add(csum_pseudo_hdr, (__force __wsum)ntohs(ipv6h->nexthdr));
+ csum_pseudo_hdr = csum_add(csum_pseudo_hdr,
+ (__force __wsum)htons(nexthdr));

skb->csum = csum_sub(hw_checksum, csum_pseudo_hdr);
skb->csum = csum_add(skb->csum, csum_partial(ipv6h, sizeof(struct ipv6hdr), 0));
@@ -777,11 +785,10 @@ static int check_csum(struct mlx4_cqe *c
}

if (cqe->status & cpu_to_be16(MLX4_CQE_STATUS_IPV4))
- get_fixed_ipv4_csum(hw_checksum, skb, hdr);
+ return get_fixed_ipv4_csum(hw_checksum, skb, hdr);
#if IS_ENABLED(CONFIG_IPV6)
- else if (cqe->status & cpu_to_be16(MLX4_CQE_STATUS_IPV6))
- if (unlikely(get_fixed_ipv6_csum(hw_checksum, skb, hdr)))
- return -1;
+ if (cqe->status & cpu_to_be16(MLX4_CQE_STATUS_IPV6))
+ return get_fixed_ipv6_csum(hw_checksum, skb, hdr);
#endif
return 0;
}


2017-08-11 22:11:29

by Greg Kroah-Hartman

Subject: [PATCH 4.9 03/16] tcp: avoid setting cwnd to invalid ssthresh after cwnd reduction states

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Yuchung Cheng <[email protected]>


[ Upstream commit ed254971edea92c3ac5c67c6a05247a92aa6075e ]

If the sender switches the congestion control during ECN-triggered
cwnd-reduction state (CA_CWR), upon exiting recovery cwnd is set to
the ssthresh value calculated by the previous congestion control. If
the previous congestion control is BBR, which always keeps ssthresh
at TCP_INFINITE_SSTHRESH, cwnd ends up being infinite. The safe
step is to avoid assigning an invalid ssthresh value when recovery ends.

Signed-off-by: Yuchung Cheng <[email protected]>
Signed-off-by: Neal Cardwell <[email protected]>
Acked-by: Eric Dumazet <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/ipv4/tcp_input.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -2560,8 +2560,8 @@ static inline void tcp_end_cwnd_reductio
return;

/* Reset cwnd to ssthresh in CWR or Recovery (unless it's undone) */
- if (inet_csk(sk)->icsk_ca_state == TCP_CA_CWR ||
- (tp->undo_marker && tp->snd_ssthresh < TCP_INFINITE_SSTHRESH)) {
+ if (tp->snd_ssthresh < TCP_INFINITE_SSTHRESH &&
+ (inet_csk(sk)->icsk_ca_state == TCP_CA_CWR || tp->undo_marker)) {
tp->snd_cwnd = tp->snd_ssthresh;
tp->snd_cwnd_stamp = tcp_time_stamp;
}
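
The effect of reordering the condition can be checked with a small
stand-alone model. The function names are hypothetical, and
DEMO_INFINITE_SSTHRESH is assumed to match the kernel's
TCP_INFINITE_SSTHRESH (0x7fffffff); the two predicates follow the removed
and added lines of the hunk above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed to equal the kernel's TCP_INFINITE_SSTHRESH. */
#define DEMO_INFINITE_SSTHRESH 0x7fffffffu

/* Before the patch: CA_CWR alone was enough to copy ssthresh into cwnd. */
static bool demo_reset_cwnd_old(bool in_cwr, bool undo_marker,
				uint32_t ssthresh)
{
	return in_cwr || (undo_marker && ssthresh < DEMO_INFINITE_SSTHRESH);
}

/* After the patch: ssthresh must be finite in every case. */
static bool demo_reset_cwnd_new(bool in_cwr, bool undo_marker,
				uint32_t ssthresh)
{
	return ssthresh < DEMO_INFINITE_SSTHRESH && (in_cwr || undo_marker);
}
```

The only input where the two differ is CA_CWR with an infinite ssthresh,
which is the case the commit message describes.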


2017-08-11 22:11:27

by Greg Kroah-Hartman

Subject: [PATCH 4.9 04/16] net: fix keepalive code vs TCP_FASTOPEN_CONNECT

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Eric Dumazet <[email protected]>


[ Upstream commit 2dda640040876cd8ae646408b69eea40c24f9ae9 ]

syzkaller was able to trigger a divide by 0 in TCP stack [1]

The issue here is that the keepalive timer needs to be updated to not
attempt to send a probe if the connection setup was deferred using the
TCP_FASTOPEN_CONNECT socket option added in linux-4.11.

[1]
divide error: 0000 [#1] SMP
CPU: 18 PID: 0 Comm: swapper/18 Not tainted
task: ffff986f62f4b040 ti: ffff986f62fa2000 task.ti: ffff986f62fa2000
RIP: 0010:[<ffffffff8409cc0d>] [<ffffffff8409cc0d>] __tcp_select_window+0x8d/0x160
Call Trace:
<IRQ>
[<ffffffff8409d951>] tcp_transmit_skb+0x11/0x20
[<ffffffff8409da21>] tcp_xmit_probe_skb+0xc1/0xe0
[<ffffffff840a0ee8>] tcp_write_wakeup+0x68/0x160
[<ffffffff840a151b>] tcp_keepalive_timer+0x17b/0x230
[<ffffffff83b3f799>] call_timer_fn+0x39/0xf0
[<ffffffff83b40797>] run_timer_softirq+0x1d7/0x280
[<ffffffff83a04ddb>] __do_softirq+0xcb/0x257
[<ffffffff83ae03ac>] irq_exit+0x9c/0xb0
[<ffffffff83a04c1a>] smp_apic_timer_interrupt+0x6a/0x80
[<ffffffff83a03eaf>] apic_timer_interrupt+0x7f/0x90
<EOI>
[<ffffffff83fed2ea>] ? cpuidle_enter_state+0x13a/0x3b0
[<ffffffff83fed2cd>] ? cpuidle_enter_state+0x11d/0x3b0

Tested:

Following packetdrill no longer crashes the kernel

`echo 0 >/proc/sys/net/ipv4/tcp_timestamps`

// Cache warmup: send a Fast Open cookie request
0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3
+0 fcntl(3, F_SETFL, O_RDWR|O_NONBLOCK) = 0
+0 setsockopt(3, SOL_TCP, TCP_FASTOPEN_CONNECT, [1], 4) = 0
+0 connect(3, ..., ...) = -1 EINPROGRESS (Operation is now in progress)
+0 > S 0:0(0) <mss 1460,nop,nop,sackOK,nop,wscale 8,FO,nop,nop>
+.01 < S. 123:123(0) ack 1 win 14600 <mss 1460,nop,nop,sackOK,nop,wscale 6,FO abcd1234,nop,nop>
+0 > . 1:1(0) ack 1
+0 close(3) = 0
+0 > F. 1:1(0) ack 1
+0 < F. 1:1(0) ack 2 win 92
+0 > . 2:2(0) ack 2

+0 socket(..., SOCK_STREAM, IPPROTO_TCP) = 4
+0 fcntl(4, F_SETFL, O_RDWR|O_NONBLOCK) = 0
+0 setsockopt(4, SOL_TCP, TCP_FASTOPEN_CONNECT, [1], 4) = 0
+0 setsockopt(4, SOL_SOCKET, SO_KEEPALIVE, [1], 4) = 0
+.01 connect(4, ..., ...) = 0
+0 setsockopt(4, SOL_TCP, TCP_KEEPIDLE, [5], 4) = 0
+10 close(4) = 0

`echo 1 >/proc/sys/net/ipv4/tcp_timestamps`

Fixes: 19f6d3f3c842 ("net/tcp-fastopen: Add new API support")
Signed-off-by: Eric Dumazet <[email protected]>
Reported-by: Dmitry Vyukov <[email protected]>
Cc: Wei Wang <[email protected]>
Cc: Yuchung Cheng <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/ipv4/tcp_timer.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

--- a/net/ipv4/tcp_timer.c
+++ b/net/ipv4/tcp_timer.c
@@ -654,7 +654,8 @@ static void tcp_keepalive_timer (unsigne
goto death;
}

- if (!sock_flag(sk, SOCK_KEEPOPEN) || sk->sk_state == TCP_CLOSE)
+ if (!sock_flag(sk, SOCK_KEEPOPEN) ||
+ ((1 << sk->sk_state) & (TCPF_CLOSE | TCPF_SYN_SENT)))
goto out;

elapsed = keepalive_time_when(tp);
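
The new test relies on the usual trick of turning a state number into a
single bit and masking it against a set of states. A minimal sketch,
assuming the kernel's numbering (TCP_ESTABLISHED = 1, TCP_SYN_SENT = 2,
TCP_CLOSE = 7) and using made-up demo_ names:

```c
#include <stdbool.h>

/* Assumed to mirror the kernel's TCP state numbering. */
enum {
	DEMO_TCP_ESTABLISHED = 1,
	DEMO_TCP_SYN_SENT = 2,
	DEMO_TCP_CLOSE = 7
};

#define DEMO_TCPF_SYN_SENT (1 << DEMO_TCP_SYN_SENT)
#define DEMO_TCPF_CLOSE    (1 << DEMO_TCP_CLOSE)

/* True when the keepalive timer should bail out without probing. */
static bool demo_keepalive_should_skip(bool keepopen, int state)
{
	return !keepopen ||
	       ((1 << state) & (DEMO_TCPF_CLOSE | DEMO_TCPF_SYN_SENT));
}
```

A socket whose connect() was deferred by TCP_FASTOPEN_CONNECT is still in
SYN_SENT when the timer fires, so it now takes the early exit instead of
reaching the probe path shown in the trace.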


2017-08-11 22:11:26

by Greg Kroah-Hartman

Subject: [PATCH 4.9 05/16] bpf, s390: fix jit branch offset related to ldimm64

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Daniel Borkmann <[email protected]>


[ Upstream commit b0a0c2566f28e71e5e32121992ac8060cec75510 ]

While testing some other work that required JIT modifications, I
ran into test_bpf causing a hang when the JIT is enabled on s390. The
problematic test case was the one from ddc665a4bb4b (bpf, arm64:
fix jit branch offset related to ldimm64), and turns out that we
do have a similar issue on s390 as well. In bpf_jit_prog() we
update next instruction address after returning from bpf_jit_insn()
with an insn_count. bpf_jit_insn() returns either -1 in case of
error (e.g. unsupported insn), 1 or 2. The latter is only the
case for ldimm64 due to spanning 2 insns, however, next address
is only set to i + 1 not taking actual insn_count into account,
thus the fix is to use insn_count instead of 1. bpf_jit_enable in
mode 2 also provides disasm on s390:

Before fix:

000003ff800349b6: a7f40003 brc 15,3ff800349bc ; target
000003ff800349ba: 0000 unknown
000003ff800349bc: e3b0f0700024 stg %r11,112(%r15)
000003ff800349c2: e3e0f0880024 stg %r14,136(%r15)
000003ff800349c8: 0db0 basr %r11,%r0
000003ff800349ca: c0ef00000000 llilf %r14,0
000003ff800349d0: e320b0360004 lg %r2,54(%r11)
000003ff800349d6: e330b03e0004 lg %r3,62(%r11)
000003ff800349dc: ec23ffeda065 clgrj %r2,%r3,10,3ff800349b6 ; jmp
000003ff800349e2: e3e0b0460004 lg %r14,70(%r11)
000003ff800349e8: e3e0b04e0004 lg %r14,78(%r11)
000003ff800349ee: b904002e lgr %r2,%r14
000003ff800349f2: e3b0f0700004 lg %r11,112(%r15)
000003ff800349f8: e3e0f0880004 lg %r14,136(%r15)
000003ff800349fe: 07fe bcr 15,%r14

After fix:

000003ff80ef3db4: a7f40003 brc 15,3ff80ef3dba
000003ff80ef3db8: 0000 unknown
000003ff80ef3dba: e3b0f0700024 stg %r11,112(%r15)
000003ff80ef3dc0: e3e0f0880024 stg %r14,136(%r15)
000003ff80ef3dc6: 0db0 basr %r11,%r0
000003ff80ef3dc8: c0ef00000000 llilf %r14,0
000003ff80ef3dce: e320b0360004 lg %r2,54(%r11)
000003ff80ef3dd4: e330b03e0004 lg %r3,62(%r11)
000003ff80ef3dda: ec230006a065 clgrj %r2,%r3,10,3ff80ef3de6 ; jmp
000003ff80ef3de0: e3e0b0460004 lg %r14,70(%r11)
000003ff80ef3de6: e3e0b04e0004 lg %r14,78(%r11) ; target
000003ff80ef3dec: b904002e lgr %r2,%r14
000003ff80ef3df0: e3b0f0700004 lg %r11,112(%r15)
000003ff80ef3df6: e3e0f0880004 lg %r14,136(%r15)
000003ff80ef3dfc: 07fe bcr 15,%r14

test_bpf.ko suite runs fine after the fix.

Fixes: 054623105728 ("s390/bpf: Add s390x eBPF JIT compiler backend")
Signed-off-by: Daniel Borkmann <[email protected]>
Tested-by: Michael Holzheu <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
arch/s390/net/bpf_jit_comp.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

--- a/arch/s390/net/bpf_jit_comp.c
+++ b/arch/s390/net/bpf_jit_comp.c
@@ -1252,7 +1252,8 @@ static int bpf_jit_prog(struct bpf_jit *
insn_count = bpf_jit_insn(jit, fp, i);
if (insn_count < 0)
return -1;
- jit->addrs[i + 1] = jit->prg; /* Next instruction address */
+ /* Next instruction address */
+ jit->addrs[i + insn_count] = jit->prg;
}
bpf_jit_epilogue(jit);
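
The address bookkeeping can be illustrated with a toy model of the loop.
Everything here is hypothetical (the function, its parameters and the
byte sizes are invented); it assumes only what the commit message states:
each translated instruction consumes 1 slot, or 2 for ldimm64, and the
table entry for the next instruction must receive the current output
position.

```c
#include <stddef.h>

/*
 * insn_len[i]:  slots consumed by the instruction starting at slot i
 *               (1, or 2 for an ldimm64-style pair).
 * emit_size[i]: bytes of native code emitted for that instruction.
 * addrs:        n + 1 entries, zeroed by the caller.
 * use_insn_count selects the fixed (non-zero) or old (zero) indexing.
 */
static void demo_fill_addrs(const int *insn_len, const int *emit_size,
			    size_t n, int use_insn_count, int *addrs)
{
	int prg = 0;
	size_t i;

	for (i = 0; i < n; i += (size_t)insn_len[i]) {
		size_t step = use_insn_count ? (size_t)insn_len[i] : 1;

		prg += emit_size[i];
		addrs[i + step] = prg; /* next instruction address */
	}
}
```

With a two-slot instruction at slot 1 of a four-slot program, the old
indexing writes the address into slot 2 and leaves slot 3 untouched, so a
branch to the instruction at slot 3 resolves to a stale address. The
fixed indexing fills slot 3.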



2017-08-11 22:02:02

by Greg Kroah-Hartman

Subject: [PATCH 4.9 15/16] sparc64: Prevent perf from running during super critical sections

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Rob Gardner <[email protected]>

commit fc290a114fc6034b0f6a5a46e2fb7d54976cf87a upstream.

This fixes another cause of random segfaults and bus errors that may
occur while running perf with the callgraph option.

Critical sections beginning with spin_lock_irqsave() raise the interrupt
level to PIL_NORMAL_MAX (14) and intentionally do not block performance
counter interrupts, which arrive at PIL_NMI (15).

But some sections of code are "super critical" with respect to perf
because the perf_callchain_user() path accesses user space and may cause
TLB activity as well as faults as it unwinds the user stack.

One particular critical section occurs in switch_mm:

spin_lock_irqsave(&mm->context.lock, flags);
...
load_secondary_context(mm);
tsb_context_switch(mm);
...
spin_unlock_irqrestore(&mm->context.lock, flags);

If a perf interrupt arrives in between load_secondary_context() and
tsb_context_switch(), then perf_callchain_user() could execute with
the context ID of one process, but with an active TSB for a different
process. When the user stack is accessed, it is very likely to
incur a TLB miss, since the h/w context ID has been changed. The TLB
will then be reloaded with a translation from the TSB for one process,
but using a context ID for another process. This exposes memory from
one process to another, and since it is a mapping for stack memory,
this usually causes the new process to crash quickly.

This super critical section needs more protection than is provided
by spin_lock_irqsave() since perf interrupts must not be allowed in.

Since __tsb_context_switch already goes through the trouble of
disabling interrupts completely, we fix this by moving the secondary
context load down into this better protected region.

Orabug: 25577560

Signed-off-by: Dave Aldridge <[email protected]>
Signed-off-by: Rob Gardner <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
arch/sparc/include/asm/mmu_context_64.h | 14 +++++++++-----
arch/sparc/kernel/tsb.S | 12 ++++++++++++
arch/sparc/power/hibernate.c | 3 +--
3 files changed, 22 insertions(+), 7 deletions(-)

--- a/arch/sparc/include/asm/mmu_context_64.h
+++ b/arch/sparc/include/asm/mmu_context_64.h
@@ -25,9 +25,11 @@ void destroy_context(struct mm_struct *m
void __tsb_context_switch(unsigned long pgd_pa,
struct tsb_config *tsb_base,
struct tsb_config *tsb_huge,
- unsigned long tsb_descr_pa);
+ unsigned long tsb_descr_pa,
+ unsigned long secondary_ctx);

-static inline void tsb_context_switch(struct mm_struct *mm)
+static inline void tsb_context_switch_ctx(struct mm_struct *mm,
+ unsigned long ctx)
{
__tsb_context_switch(__pa(mm->pgd),
&mm->context.tsb_block[0],
@@ -38,9 +40,12 @@ static inline void tsb_context_switch(st
#else
NULL
#endif
- , __pa(&mm->context.tsb_descr[0]));
+ , __pa(&mm->context.tsb_descr[0]),
+ ctx);
}

+#define tsb_context_switch(X) tsb_context_switch_ctx(X, 0)
+
void tsb_grow(struct mm_struct *mm,
unsigned long tsb_index,
unsigned long mm_rss);
@@ -110,8 +115,7 @@ static inline void switch_mm(struct mm_s
* cpu0 to update it's TSB because at that point the cpu_vm_mask
* only had cpu1 set in it.
*/
- load_secondary_context(mm);
- tsb_context_switch(mm);
+ tsb_context_switch_ctx(mm, CTX_HWBITS(mm->context));

/* Any time a processor runs a context on an address space
* for the first time, we must flush that context out of the
--- a/arch/sparc/kernel/tsb.S
+++ b/arch/sparc/kernel/tsb.S
@@ -375,6 +375,7 @@ tsb_flush:
* %o1: TSB base config pointer
* %o2: TSB huge config pointer, or NULL if none
* %o3: Hypervisor TSB descriptor physical address
+ * %o4: Secondary context to load, if non-zero
*
* We have to run this whole thing with interrupts
* disabled so that the current cpu doesn't change
@@ -387,6 +388,17 @@ __tsb_context_switch:
rdpr %pstate, %g1
wrpr %g1, PSTATE_IE, %pstate

+ brz,pn %o4, 1f
+ mov SECONDARY_CONTEXT, %o5
+
+661: stxa %o4, [%o5] ASI_DMMU
+ .section .sun4v_1insn_patch, "ax"
+ .word 661b
+ stxa %o4, [%o5] ASI_MMU
+ .previous
+ flush %g6
+
+1:
TRAP_LOAD_TRAP_BLOCK(%g2, %g3)

stx %o0, [%g2 + TRAP_PER_CPU_PGD_PADDR]
--- a/arch/sparc/power/hibernate.c
+++ b/arch/sparc/power/hibernate.c
@@ -35,6 +35,5 @@ void restore_processor_state(void)
{
struct mm_struct *mm = current->active_mm;

- load_secondary_context(mm);
- tsb_context_switch(mm);
+ tsb_context_switch_ctx(mm, CTX_HWBITS(mm->context));
}


2017-08-11 22:12:50

by Greg Kroah-Hartman

Subject: [PATCH 4.9 12/16] revert "net: account for current skb length when deciding about UFO"

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

This reverts commit ef09c9ff343122a0b245416066992d096416ff19 which is
commit a5cb659bbc1c8644efa0c3138a757a1e432a4880 upstream as it causes
merge issues with later patches that are much more important...

Cc: Michal Kubecek <[email protected]>
Cc: Vlad Yasevich <[email protected]>
Cc: David S. Miller <[email protected]>
Cc: Sasha Levin <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>

---
net/ipv4/ip_output.c | 3 +--
net/ipv6/ip6_output.c | 2 +-
2 files changed, 2 insertions(+), 3 deletions(-)

--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -936,8 +936,7 @@ static int __ip_append_data(struct sock
csummode = CHECKSUM_PARTIAL;

cork->length += length;
- if ((((length + (skb ? skb->len : fragheaderlen)) > mtu) ||
- (skb && skb_is_gso(skb))) &&
+ if ((((length + fragheaderlen) > mtu) || (skb && skb_is_gso(skb))) &&
(sk->sk_protocol == IPPROTO_UDP) &&
(rt->dst.dev->features & NETIF_F_UFO) && !rt->dst.header_len &&
(sk->sk_type == SOCK_DGRAM) && !sk->sk_no_check_tx) {
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -1372,7 +1372,7 @@ emsgsize:
*/

cork->length += length;
- if ((((length + (skb ? skb->len : headersize)) > mtu) ||
+ if ((((length + fragheaderlen) > mtu) ||
(skb && skb_is_gso(skb))) &&
(sk->sk_protocol == IPPROTO_UDP) &&
(rt->dst.dev->features & NETIF_F_UFO) && !rt->dst.header_len &&


2017-08-11 22:12:46

by Greg Kroah-Hartman

Subject: [PATCH 4.9 16/16] KVM: arm/arm64: Handle hva aging while destroying the vm

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Suzuki K Poulose <[email protected]>

commit 7e5a672289c9754d07e1c3b33649786d3d70f5e4 upstream.

The mmu_notifier_release() callback of KVM triggers cleaning up
the stage2 page table on kvm-arm. However there could be other
notifier callbacks in parallel with the mmu_notifier_release(),
which could cause the callbacks to end up in an empty stage2
page table. Make sure we check it for all the notifier callbacks.

Fixes: commit 293f29363 ("kvm-arm: Unmap shadow pagetables properly")
Reported-by: Alex Graf <[email protected]>
Reviewed-by: Christoffer Dall <[email protected]>
Signed-off-by: Suzuki K Poulose <[email protected]>
Signed-off-by: Marc Zyngier <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>


---
arch/arm/kvm/mmu.c | 4 ++++
1 file changed, 4 insertions(+)

--- a/arch/arm/kvm/mmu.c
+++ b/arch/arm/kvm/mmu.c
@@ -1664,12 +1664,16 @@ static int kvm_test_age_hva_handler(stru

int kvm_age_hva(struct kvm *kvm, unsigned long start, unsigned long end)
{
+ if (!kvm->arch.pgd)
+ return 0;
trace_kvm_age_hva(start, end);
return handle_hva_to_gpa(kvm, start, end, kvm_age_hva_handler, NULL);
}

int kvm_test_age_hva(struct kvm *kvm, unsigned long hva)
{
+ if (!kvm->arch.pgd)
+ return 0;
trace_kvm_test_age_hva(hva);
return handle_hva_to_gpa(kvm, hva, hva, kvm_test_age_hva_handler, NULL);
}
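
The guard added in this patch can be modelled in user space roughly as follows. This is a minimal sketch, not kernel code: `struct vm`, `walk_stage2()` and `vm_age_range()` are hypothetical stand-ins for `struct kvm`, `handle_hva_to_gpa()` and `kvm_age_hva()`, and the sketch only illustrates returning early once the page table has been torn down.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct kvm: pgd is NULL once the table is freed. */
struct vm {
    unsigned long *pgd;
    int walks;              /* counts how often the table was actually walked */
};

/* Hypothetical stand-in for handle_hva_to_gpa(): dereferences the table. */
static int walk_stage2(struct vm *vm, unsigned long start, unsigned long end)
{
    vm->walks++;
    return (vm->pgd[0] >= start && vm->pgd[0] <= end) ? 1 : 0;
}

/* Same shape as the patched kvm_age_hva(): bail out if there is no table. */
static int vm_age_range(struct vm *vm, unsigned long start, unsigned long end)
{
    assert(vm != NULL);
    if (!vm->pgd)
        return 0;
    return walk_stage2(vm, start, end);
}
```

With the early return in place, a callback that races with teardown reports "not aged" instead of walking a freed table.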


2017-08-11 22:12:45

by Greg Kroah-Hartman

Subject: [PATCH 4.9 02/16] ppp: fix xmit recursion detection on ppp channels

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Guillaume Nault <[email protected]>


[ Upstream commit 0a0e1a85c83775a648041be2b15de6d0a2f2b8eb ]

Commit e5dadc65f9e0 ("ppp: Fix false xmit recursion detect with two ppp
devices") dropped the xmit_recursion counter incrementation in
ppp_channel_push() and relied on ppp_xmit_process() for this task.
But __ppp_channel_push() can also send packets directly (using the
.start_xmit() channel callback), in which case the xmit_recursion
counter isn't incremented anymore. If such packets get routed back to
the parent ppp unit, ppp_xmit_process() won't notice the recursion and
will call ppp_channel_push() on the same channel, effectively creating
the deadlock situation that the xmit_recursion mechanism was supposed
to prevent.

This patch re-introduces the xmit_recursion counter incrementation in
ppp_channel_push(). Since the xmit_recursion variable is now part of
the parent ppp unit, incrementation is skipped if the channel doesn't
have any. This is fine because only packets routed through the parent
unit may enter the channel recursively.

Finally, we have to ensure that pch->ppp is not going to be modified
while executing ppp_channel_push(). Instead of taking this lock only
while calling ppp_xmit_process(), we now have to hold it for the full
ppp_channel_push() execution. This respects the ppp locks ordering
which requires locking ->upl before ->downl.

Fixes: e5dadc65f9e0 ("ppp: Fix false xmit recursion detect with two ppp devices")
Signed-off-by: Guillaume Nault <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/net/ppp/ppp_generic.c | 18 ++++++++++--------
1 file changed, 10 insertions(+), 8 deletions(-)

--- a/drivers/net/ppp/ppp_generic.c
+++ b/drivers/net/ppp/ppp_generic.c
@@ -1914,21 +1914,23 @@ static void __ppp_channel_push(struct ch
spin_unlock_bh(&pch->downl);
/* see if there is anything from the attached unit to be sent */
if (skb_queue_empty(&pch->file.xq)) {
- read_lock_bh(&pch->upl);
ppp = pch->ppp;
if (ppp)
- ppp_xmit_process(ppp);
- read_unlock_bh(&pch->upl);
+ __ppp_xmit_process(ppp);
}
}

static void ppp_channel_push(struct channel *pch)
{
- local_bh_disable();
-
- __ppp_channel_push(pch);
-
- local_bh_enable();
+ read_lock_bh(&pch->upl);
+ if (pch->ppp) {
+ (*this_cpu_ptr(pch->ppp->xmit_recursion))++;
+ __ppp_channel_push(pch);
+ (*this_cpu_ptr(pch->ppp->xmit_recursion))--;
+ } else {
+ __ppp_channel_push(pch);
+ }
+ read_unlock_bh(&pch->upl);
}

/*


2017-08-11 22:13:36

by Greg Kroah-Hartman

Subject: [PATCH 4.9 11/16] packet: fix tp_reserve race in packet_set_ring

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Willem de Bruijn <[email protected]>


[ Upstream commit c27927e372f0785f3303e8fad94b85945e2c97b7 ]

Updates to tp_reserve can race with reads of the field in
packet_set_ring. Avoid this by holding the socket lock during
updates in setsockopt PACKET_RESERVE.

This bug was discovered by syzkaller.

Fixes: 8913336a7e8d ("packet: add PACKET_RESERVE sockopt")
Reported-by: Andrey Konovalov <[email protected]>
Signed-off-by: Willem de Bruijn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/packet/af_packet.c | 13 +++++++++----
1 file changed, 9 insertions(+), 4 deletions(-)

--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -3698,14 +3698,19 @@ packet_setsockopt(struct socket *sock, i

if (optlen != sizeof(val))
return -EINVAL;
- if (po->rx_ring.pg_vec || po->tx_ring.pg_vec)
- return -EBUSY;
if (copy_from_user(&val, optval, sizeof(val)))
return -EFAULT;
if (val > INT_MAX)
return -EINVAL;
- po->tp_reserve = val;
- return 0;
+ lock_sock(sk);
+ if (po->rx_ring.pg_vec || po->tx_ring.pg_vec) {
+ ret = -EBUSY;
+ } else {
+ po->tp_reserve = val;
+ ret = 0;
+ }
+ release_sock(sk);
+ return ret;
}
case PACKET_LOSS:
{
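
The locking pattern in this patch, where the ring check and the field update happen under one lock hold, can be sketched in user space as below. This is an illustration under stated assumptions rather than the kernel implementation: `struct pkt_sock` and `pkt_set_reserve()` are hypothetical names, and a pthread mutex plays the role of `lock_sock()`/`release_sock()`.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for struct packet_sock plus its socket lock. */
struct pkt_sock {
    pthread_mutex_t lock;   /* plays the role of the socket lock */
    void *rx_ring;          /* non-NULL once a ring has been set up */
    void *tx_ring;
    unsigned int reserve;   /* plays the role of tp_reserve */
};

/*
 * Check for an existing ring and update reserve under a single lock hold,
 * so a ring setup path that takes the same lock cannot interleave.
 */
static int pkt_set_reserve(struct pkt_sock *po, unsigned int val)
{
    int ret;

    assert(po != NULL);
    if (val > (unsigned int)INT_MAX)
        return -EINVAL;

    pthread_mutex_lock(&po->lock);
    if (po->rx_ring || po->tx_ring) {
        ret = -EBUSY;
    } else {
        po->reserve = val;
        ret = 0;
    }
    pthread_mutex_unlock(&po->lock);
    return ret;
}
```

The point of the single exit path is that the lock is released on both the busy and the success branch, which is why the patch replaces the early returns with a `ret` variable.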


2017-08-11 22:13:51

by Greg Kroah-Hartman

Subject: [PATCH 4.9 10/16] igmp: Fix regression caused by igmp sysctl namespace code.

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Nikolay Borisov <[email protected]>


[ Upstream commit 1714020e42b17135032c8606f7185b3fb2ba5d78 ]

Commit dcd87999d415 ("igmp: net: Move igmp namespace init to correct file")
moved the igmp sysctls initialization from tcp_sk_init to igmp_net_init. This
function is only called as part of per-namespace initialization, and only if
CONFIG_IP_MULTICAST is defined; otherwise the igmp_mc_init() call in ip_init is
compiled out, causing the igmp pernet ops to not be registered and those sysctls
to be left initialized to 0. However, there are certain functions, such as
ip_mc_join_group which are always compiled and make use of some of those
sysctls. Let's do a partial revert of the aforementioned commit and move the
sysctl initialization into inet_init_net, that way they will always have
sane values.

Fixes: dcd87999d415 ("igmp: net: Move igmp namespace init to correct file")
Link: https://bugzilla.kernel.org/show_bug.cgi?id=196595
Reported-by: Gerardo Exequiel Pozzi <[email protected]>
Signed-off-by: Nikolay Borisov <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
net/ipv4/af_inet.c | 7 +++++++
net/ipv4/igmp.c | 6 ------
2 files changed, 7 insertions(+), 6 deletions(-)

--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -1693,6 +1693,13 @@ static __net_init int inet_init_net(stru
net->ipv4.sysctl_ip_dynaddr = 0;
net->ipv4.sysctl_ip_early_demux = 1;

+ /* Some igmp sysctl, whose values are always used */
+ net->ipv4.sysctl_igmp_max_memberships = 20;
+ net->ipv4.sysctl_igmp_max_msf = 10;
+ /* IGMP reports for link-local multicast groups are enabled by default */
+ net->ipv4.sysctl_igmp_llm_reports = 1;
+ net->ipv4.sysctl_igmp_qrv = 2;
+
return 0;
}

--- a/net/ipv4/igmp.c
+++ b/net/ipv4/igmp.c
@@ -2974,12 +2974,6 @@ static int __net_init igmp_net_init(stru
goto out_sock;
}

- /* Sysctl initialization */
- net->ipv4.sysctl_igmp_max_memberships = 20;
- net->ipv4.sysctl_igmp_max_msf = 10;
- /* IGMP reports for link-local multicast groups are enabled by default */
- net->ipv4.sysctl_igmp_llm_reports = 1;
- net->ipv4.sysctl_igmp_qrv = 2;
return 0;

out_sock:
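
Why zero-initialized sysctls matter can be shown with a small user-space model. This is a simplified sketch: `struct ns_ipv4`, `ns_init_always()` and `ns_can_join()` are hypothetical names, and the limit check is only an assumed, reduced form of what an always-compiled consumer such as ip_mc_join_group might do with the membership limit.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the igmp fields of the per-namespace ipv4 state. */
struct ns_ipv4 {
    int igmp_max_memberships;
    int igmp_max_msf;
    int igmp_llm_reports;
    int igmp_qrv;
};

/* Always runs, like inet_init_net(): the defaults live here after the patch. */
static void ns_init_always(struct ns_ipv4 *ns)
{
    assert(ns != NULL);
    ns->igmp_max_memberships = 20;
    ns->igmp_max_msf = 10;
    ns->igmp_llm_reports = 1;
    ns->igmp_qrv = 2;
}

/* Always-compiled consumer (assumed shape): refuses once the limit is hit. */
static int ns_can_join(const struct ns_ipv4 *ns, int current_groups)
{
    return current_groups < ns->igmp_max_memberships;
}
```

In this model a namespace whose fields were never initialized has a limit of 0, so even the first join is refused; moving the defaults into the unconditional init path avoids that.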


2017-08-11 22:14:15

by Greg Kroah-Hartman

Subject: [PATCH 4.9 01/16] ppp: Fix false xmit recursion detect with two ppp devices

4.9-stable review patch. If anyone has any objections, please let me know.

------------------

From: Gao Feng <[email protected]>


[ Upstream commit e5dadc65f9e0177eb649bcd9d333f1ebf871223e ]

The global percpu variable ppp_xmit_recursion is used to detect ppp
xmit recursion and so avoid the deadlock caused by one CPU trying to
lock the xmit lock twice. But it reports false recursion when one CPU
wants to send an skb through two different PPP devices, such as L2TP
over PPPoE, which is actually a normal case.

Now use a percpu member of struct ppp instead of the global variable to
detect the xmit recursion of one ppp device.

Fixes: 55454a565836 ("ppp: avoid dealock on recursive xmit")
Signed-off-by: Gao Feng <[email protected]>
Signed-off-by: Liu Jianying <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
---
drivers/net/ppp/ppp_generic.c | 30 +++++++++++++++++++++---------
1 file changed, 21 insertions(+), 9 deletions(-)

--- a/drivers/net/ppp/ppp_generic.c
+++ b/drivers/net/ppp/ppp_generic.c
@@ -119,6 +119,7 @@ struct ppp {
int n_channels; /* how many channels are attached 54 */
spinlock_t rlock; /* lock for receive side 58 */
spinlock_t wlock; /* lock for transmit side 5c */
+ int *xmit_recursion __percpu; /* xmit recursion detect */
int mru; /* max receive unit 60 */
unsigned int flags; /* control bits 64 */
unsigned int xstate; /* transmit state bits 68 */
@@ -1024,6 +1025,7 @@ static int ppp_dev_configure(struct net
struct ppp *ppp = netdev_priv(dev);
int indx;
int err;
+ int cpu;

ppp->dev = dev;
ppp->ppp_net = src_net;
@@ -1038,6 +1040,15 @@ static int ppp_dev_configure(struct net
INIT_LIST_HEAD(&ppp->channels);
spin_lock_init(&ppp->rlock);
spin_lock_init(&ppp->wlock);
+
+ ppp->xmit_recursion = alloc_percpu(int);
+ if (!ppp->xmit_recursion) {
+ err = -ENOMEM;
+ goto err1;
+ }
+ for_each_possible_cpu(cpu)
+ (*per_cpu_ptr(ppp->xmit_recursion, cpu)) = 0;
+
#ifdef CONFIG_PPP_MULTILINK
ppp->minseq = -1;
skb_queue_head_init(&ppp->mrq);
@@ -1049,11 +1060,15 @@ static int ppp_dev_configure(struct net

err = ppp_unit_register(ppp, conf->unit, conf->ifname_is_set);
if (err < 0)
- return err;
+ goto err2;

conf->file->private_data = &ppp->file;

return 0;
+err2:
+ free_percpu(ppp->xmit_recursion);
+err1:
+ return err;
}

static const struct nla_policy ppp_nl_policy[IFLA_PPP_MAX + 1] = {
@@ -1399,18 +1414,16 @@ static void __ppp_xmit_process(struct pp
ppp_xmit_unlock(ppp);
}

-static DEFINE_PER_CPU(int, ppp_xmit_recursion);
-
static void ppp_xmit_process(struct ppp *ppp)
{
local_bh_disable();

- if (unlikely(__this_cpu_read(ppp_xmit_recursion)))
+ if (unlikely(*this_cpu_ptr(ppp->xmit_recursion)))
goto err;

- __this_cpu_inc(ppp_xmit_recursion);
+ (*this_cpu_ptr(ppp->xmit_recursion))++;
__ppp_xmit_process(ppp);
- __this_cpu_dec(ppp_xmit_recursion);
+ (*this_cpu_ptr(ppp->xmit_recursion))--;

local_bh_enable();

@@ -1904,7 +1917,7 @@ static void __ppp_channel_push(struct ch
read_lock_bh(&pch->upl);
ppp = pch->ppp;
if (ppp)
- __ppp_xmit_process(ppp);
+ ppp_xmit_process(ppp);
read_unlock_bh(&pch->upl);
}
}
@@ -1913,9 +1926,7 @@ static void ppp_channel_push(struct chan
{
local_bh_disable();

- __this_cpu_inc(ppp_xmit_recursion);
__ppp_channel_push(pch);
- __this_cpu_dec(ppp_xmit_recursion);

local_bh_enable();
}
@@ -3056,6 +3067,7 @@ static void ppp_destroy_interface(struct
#endif /* CONFIG_PPP_FILTER */

kfree_skb(ppp->xmit_pending);
+ free_percpu(ppp->xmit_recursion);

free_netdev(ppp->dev);
}
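
The difference a per-device counter makes can be illustrated with a single-CPU user-space model. This is a sketch under stated assumptions, not the driver code: `struct unit` and `unit_xmit()` are hypothetical stand-ins for `struct ppp` and `ppp_xmit_process()`, and a plain `int` replaces the per-CPU counter because only one CPU is modelled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct ppp on a single modelled CPU. */
struct unit {
    int xmit_recursion;   /* per-device recursion counter */
    struct unit *lower;   /* device this unit's packets go through, or NULL */
    int sent;
    int dropped;
};

/* Per-device check: only re-entering the same unit counts as recursion. */
static void unit_xmit(struct unit *u)
{
    assert(u != NULL);
    if (u->xmit_recursion) {
        u->dropped++;     /* would deadlock on the xmit lock, so give up */
        return;
    }
    u->xmit_recursion++;
    u->sent++;
    if (u->lower)
        unit_xmit(u->lower);
    u->xmit_recursion--;
}
```

Stacking one unit on a different one (the L2TP over PPPoE case) transmits on both, while a unit routed back into itself is still caught. A single global counter would have rejected the stacked case as well.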


2017-08-12 01:56:53

by Shuah Khan

Subject: Re: [PATCH 4.9 00/16] 4.9.43-stable review

On 08/11/2017 04:01 PM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 4.9.43 release.
> There are 16 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Sun Aug 13 22:01:23 UTC 2017.
> Anything received after that time might be too late.
>
> The whole patch series can be found in one patch at:
> kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.43-rc1.gz
> or in the git tree and branch at:
> git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y
> and the diffstat can be found below.
>
> thanks,
>
> greg k-h
>

Compiled and booted on my test system. No dmesg regressions.

thanks,
-- Shuah

2017-08-12 12:36:12

by Guenter Roeck

Subject: Re: [PATCH 4.9 00/16] 4.9.43-stable review

On 08/11/2017 03:01 PM, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 4.9.43 release.
> There are 16 patches in this series, all will be posted as a response
> to this one. If anyone has any issues with these being applied, please
> let me know.
>
> Responses should be made by Sun Aug 13 22:01:23 UTC 2017.
> Anything received after that time might be too late.
>

Build results:
total: 145 pass: 145 fail: 0
Qemu test results:
total: 122 pass: 110 fail: 12
Failed tests:
arm:beagle:multi_v7_defconfig:omap3-beagle
arm:beaglexm:multi_v7_defconfig:omap3-beagle-xm
arm:overo:multi_v7_defconfig:omap3-overo-tobi
arm:sabrelite:multi_v7_defconfig:imx6dl-sabrelite
arm:vexpress-a9:multi_v7_defconfig:vexpress-v2p-ca9
arm:vexpress-a15:multi_v7_defconfig:vexpress-v2p-ca15-tc1
arm:vexpress-a15-a7:multi_v7_defconfig:vexpress-v2p-ca15_a7
arm:xilinx-zynq-a9:multi_v7_defconfig:zynq-zc702
arm:xilinx-zynq-a9:multi_v7_defconfig:zynq-zc706
arm:xilinx-zynq-a9:multi_v7_defconfig:zynq-zed
arm:midway:multi_v7_defconfig:ecx-2000
arm:smdkc210:multi_v7_defconfig:exynos4210-smdkv310

Failures are:

Error log:
make[1]: *** No rule to make target 'arch/arm/boot/dts/sun8i-h3-nanopi-m1.dtb', needed by '__build'. Stop.

This is due to 'ARM: dts: sun8i: Support DTB build for NanoPi M1'. The associated .dts
file is missing in 4.9. Also applying 10efbf5f1633, which introduces it, won't be
sufficient since it depends on additional commits.

Details are available at http://kerneltests.org/builders.

Guenter

2017-08-12 16:07:46

by Greg Kroah-Hartman

Subject: Re: [PATCH 4.9 00/16] 4.9.43-stable review

On Sat, Aug 12, 2017 at 05:36:08AM -0700, Guenter Roeck wrote:
> On 08/11/2017 03:01 PM, Greg Kroah-Hartman wrote:
> > This is the start of the stable review cycle for the 4.9.43 release.
> > There are 16 patches in this series, all will be posted as a response
> > to this one. If anyone has any issues with these being applied, please
> > let me know.
> >
> > Responses should be made by Sun Aug 13 22:01:23 UTC 2017.
> > Anything received after that time might be too late.
> >
>
> Build results:
> total: 145 pass: 145 fail: 0
> Qemu test results:
> total: 122 pass: 110 fail: 12
> Failed tests:
> arm:beagle:multi_v7_defconfig:omap3-beagle
> arm:beaglexm:multi_v7_defconfig:omap3-beagle-xm
> arm:overo:multi_v7_defconfig:omap3-overo-tobi
> arm:sabrelite:multi_v7_defconfig:imx6dl-sabrelite
> arm:vexpress-a9:multi_v7_defconfig:vexpress-v2p-ca9
> arm:vexpress-a15:multi_v7_defconfig:vexpress-v2p-ca15-tc1
> arm:vexpress-a15-a7:multi_v7_defconfig:vexpress-v2p-ca15_a7
> arm:xilinx-zynq-a9:multi_v7_defconfig:zynq-zc702
> arm:xilinx-zynq-a9:multi_v7_defconfig:zynq-zc706
> arm:xilinx-zynq-a9:multi_v7_defconfig:zynq-zed
> arm:midway:multi_v7_defconfig:ecx-2000
> arm:smdkc210:multi_v7_defconfig:exynos4210-smdkv310
>
> Failures are:
>
> Error log:
> make[1]: *** No rule to make target 'arch/arm/boot/dts/sun8i-h3-nanopi-m1.dtb', needed by '__build'. Stop.
>
> This is due to 'ARM: dts: sun8i: Support DTB build for NanoPi M1'. The associated .dts
> file is missing in 4.9. Also applying 10efbf5f1633, which introduces it, won't be
> sufficient since it depends on additional commits.

Ugh, Sasha got this wrong, and I didn't check it :(

It's in 4.9.42, so this should have failed there too. I'll go revert
this now, thanks for letting me know.

greg k-h

2017-08-12 16:27:45

by Guenter Roeck

Subject: Re: [PATCH 4.9 00/16] 4.9.43-stable review

On 08/12/2017 09:07 AM, Greg Kroah-Hartman wrote:
> On Sat, Aug 12, 2017 at 05:36:08AM -0700, Guenter Roeck wrote:
>> On 08/11/2017 03:01 PM, Greg Kroah-Hartman wrote:
>>> This is the start of the stable review cycle for the 4.9.43 release.
>>> There are 16 patches in this series, all will be posted as a response
>>> to this one. If anyone has any issues with these being applied, please
>>> let me know.
>>>
>>> Responses should be made by Sun Aug 13 22:01:23 UTC 2017.
>>> Anything received after that time might be too late.
>>>
>>
>> Build results:
>> total: 145 pass: 145 fail: 0
>> Qemu test results:
>> total: 122 pass: 110 fail: 12
>> Failed tests:
>> arm:beagle:multi_v7_defconfig:omap3-beagle
>> arm:beaglexm:multi_v7_defconfig:omap3-beagle-xm
>> arm:overo:multi_v7_defconfig:omap3-overo-tobi
>> arm:sabrelite:multi_v7_defconfig:imx6dl-sabrelite
>> arm:vexpress-a9:multi_v7_defconfig:vexpress-v2p-ca9
>> arm:vexpress-a15:multi_v7_defconfig:vexpress-v2p-ca15-tc1
>> arm:vexpress-a15-a7:multi_v7_defconfig:vexpress-v2p-ca15_a7
>> arm:xilinx-zynq-a9:multi_v7_defconfig:zynq-zc702
>> arm:xilinx-zynq-a9:multi_v7_defconfig:zynq-zc706
>> arm:xilinx-zynq-a9:multi_v7_defconfig:zynq-zed
>> arm:midway:multi_v7_defconfig:ecx-2000
>> arm:smdkc210:multi_v7_defconfig:exynos4210-smdkv310
>>
>> Failures are:
>>
>> Error log:
>> make[1]: *** No rule to make target 'arch/arm/boot/dts/sun8i-h3-nanopi-m1.dtb', needed by '__build'. Stop.
>>
>> This is due to 'ARM: dts: sun8i: Support DTB build for NanoPi M1'. The associated .dts
>> file is missing in 4.9. Also applying 10efbf5f1633, which introduces it, won't be
>> sufficient since it depends on additional commits.
>
> Ugh, Sasha got this wrong, and I didn't check it :(
>
> It's in 4.9.42, so this should have failed there too. I'll go revert
> this now, thanks for letting me know.
>
Yes, that is correct. Sorry, I must have dropped the ball there;
I thought I had told you that the builds are failing, but I guess
it got lost in the other failures.

Guenter