Drivers like vxlan use the recently introduced
udp_tunnel_xmit_skb/udp_tunnel6_xmit_skb APIs. udp_tunnel6_xmit_skb
makes use of ip6tunnel_xmit, and ip6tunnel_xmit, after sending the
packet, updates the struct stats using the usual
u64_stats_update_begin/end calls on this_cpu_ptr(dev->tstats).
udp_tunnel_xmit_skb makes use of iptunnel_xmit, which doesn't touch
tstats, so drivers like vxlan, immediately after, call
iptunnel_xmit_stats, which does the same thing - calls
u64_stats_update_begin/end on this_cpu_ptr(dev->tstats).
While vxlan is probably fine (I haven't verified this), calling a similar
function from, say, an unbound workqueue on a fully preemptible kernel
causes real issues:
[ 188.434537] BUG: using smp_processor_id() in preemptible [00000000] code: kworker/u8:0/6
[ 188.435579] caller is debug_smp_processor_id+0x17/0x20
[ 188.435583] CPU: 0 PID: 6 Comm: kworker/u8:0 Not tainted 4.2.6 #2
[ 188.435607] Call Trace:
[ 188.435611] [<ffffffff8234e936>] dump_stack+0x4f/0x7b
[ 188.435615] [<ffffffff81915f3d>] check_preemption_disabled+0x19d/0x1c0
[ 188.435619] [<ffffffff81915f77>] debug_smp_processor_id+0x17/0x20
The solution is to wrap each
this_cpu_ptr(dev->tstats)/u64_stats_update_begin/end block in a section
that disables preemption beforehand and re-enables it afterwards.
Signed-off-by: Jason A. Donenfeld <[email protected]>
---
include/net/ip6_tunnel.h | 5 ++++-
include/net/ip_tunnels.h | 6 ++++--
2 files changed, 8 insertions(+), 3 deletions(-)
diff --git a/include/net/ip6_tunnel.h b/include/net/ip6_tunnel.h
index fa915fa..67dc00d 100644
--- a/include/net/ip6_tunnel.h
+++ b/include/net/ip6_tunnel.h
@@ -90,11 +90,14 @@ static inline void ip6tunnel_xmit(struct sock *sk, struct sk_buff *skb,
 	err = ip6_local_out_sk(sk, skb);
 	if (net_xmit_eval(err) == 0) {
-		struct pcpu_sw_netstats *tstats = this_cpu_ptr(dev->tstats);
+		struct pcpu_sw_netstats *tstats;
+		preempt_disable();
+		tstats = this_cpu_ptr(dev->tstats);
 		u64_stats_update_begin(&tstats->syncp);
 		tstats->tx_bytes += pkt_len;
 		tstats->tx_packets++;
 		u64_stats_update_end(&tstats->syncp);
+		preempt_enable();
 	} else {
 		stats->tx_errors++;
 		stats->tx_aborted_errors++;
diff --git a/include/net/ip_tunnels.h b/include/net/ip_tunnels.h
index f6dafec..6544955 100644
--- a/include/net/ip_tunnels.h
+++ b/include/net/ip_tunnels.h
@@ -287,12 +287,14 @@ static inline void iptunnel_xmit_stats(int err,
 				       struct pcpu_sw_netstats __percpu *stats)
 {
 	if (err > 0) {
-		struct pcpu_sw_netstats *tstats = this_cpu_ptr(stats);
-
+		struct pcpu_sw_netstats *tstats;
+		preempt_disable();
+		tstats = this_cpu_ptr(stats);
 		u64_stats_update_begin(&tstats->syncp);
 		tstats->tx_bytes += err;
 		tstats->tx_packets++;
 		u64_stats_update_end(&tstats->syncp);
+		preempt_enable();
 	} else if (err < 0) {
 		err_stats->tx_errors++;
 		err_stats->tx_aborted_errors++;
--
2.6.2
On Thu, Nov 12, 2015, at 16:30, Jason A. Donenfeld wrote:
> if (err > 0) {
> - struct pcpu_sw_netstats *tstats = this_cpu_ptr(stats);
> -
> + struct pcpu_sw_netstats *tstats;
> + preempt_disable();
> + tstats = this_cpu_ptr(stats);
The canonical way is get_cpu_ptr(stats) / put_cpu_ptr.
Bye,
Hannes
Drivers like vxlan use the recently introduced
udp_tunnel_xmit_skb/udp_tunnel6_xmit_skb APIs. udp_tunnel6_xmit_skb
makes use of ip6tunnel_xmit, and ip6tunnel_xmit, after sending the
packet, updates the struct stats using the usual
u64_stats_update_begin/end calls on this_cpu_ptr(dev->tstats).
udp_tunnel_xmit_skb makes use of iptunnel_xmit, which doesn't touch
tstats, so drivers like vxlan, immediately after, call
iptunnel_xmit_stats, which does the same thing - calls
u64_stats_update_begin/end on this_cpu_ptr(dev->tstats).
While vxlan is probably fine (I haven't verified this), calling a similar
function from, say, an unbound workqueue on a fully preemptible kernel
causes real issues:
[ 188.434537] BUG: using smp_processor_id() in preemptible [00000000] code: kworker/u8:0/6
[ 188.435579] caller is debug_smp_processor_id+0x17/0x20
[ 188.435583] CPU: 0 PID: 6 Comm: kworker/u8:0 Not tainted 4.2.6 #2
[ 188.435607] Call Trace:
[ 188.435611] [<ffffffff8234e936>] dump_stack+0x4f/0x7b
[ 188.435615] [<ffffffff81915f3d>] check_preemption_disabled+0x19d/0x1c0
[ 188.435619] [<ffffffff81915f77>] debug_smp_processor_id+0x17/0x20
The solution is to wrap each
this_cpu_ptr(dev->tstats)/u64_stats_update_begin/end block in a section
that disables preemption beforehand and re-enables it afterwards.
Signed-off-by: Jason A. Donenfeld <[email protected]>
---
include/net/ip6_tunnel.h | 3 ++-
include/net/ip_tunnels.h | 3 ++-
2 files changed, 4 insertions(+), 2 deletions(-)
diff --git a/include/net/ip6_tunnel.h b/include/net/ip6_tunnel.h
index fa915fa..d49a8f8 100644
--- a/include/net/ip6_tunnel.h
+++ b/include/net/ip6_tunnel.h
@@ -90,11 +90,12 @@ static inline void ip6tunnel_xmit(struct sock *sk, struct sk_buff *skb,
 	err = ip6_local_out_sk(sk, skb);
 	if (net_xmit_eval(err) == 0) {
-		struct pcpu_sw_netstats *tstats = this_cpu_ptr(dev->tstats);
+		struct pcpu_sw_netstats *tstats = get_cpu_ptr(dev->tstats);
 		u64_stats_update_begin(&tstats->syncp);
 		tstats->tx_bytes += pkt_len;
 		tstats->tx_packets++;
 		u64_stats_update_end(&tstats->syncp);
+		put_cpu_ptr(tstats);
 	} else {
 		stats->tx_errors++;
 		stats->tx_aborted_errors++;
diff --git a/include/net/ip_tunnels.h b/include/net/ip_tunnels.h
index f6dafec..62a750a 100644
--- a/include/net/ip_tunnels.h
+++ b/include/net/ip_tunnels.h
@@ -287,12 +287,13 @@ static inline void iptunnel_xmit_stats(int err,
 				       struct pcpu_sw_netstats __percpu *stats)
 {
 	if (err > 0) {
-		struct pcpu_sw_netstats *tstats = this_cpu_ptr(stats);
+		struct pcpu_sw_netstats *tstats = get_cpu_ptr(stats);
 		u64_stats_update_begin(&tstats->syncp);
 		tstats->tx_bytes += err;
 		tstats->tx_packets++;
 		u64_stats_update_end(&tstats->syncp);
+		put_cpu_ptr(tstats);
 	} else if (err < 0) {
 		err_stats->tx_errors++;
 		err_stats->tx_aborted_errors++;
--
2.6.2
On Thu, Nov 12, 2015 at 5:25 PM, Hannes Frederic Sowa
<[email protected]> wrote:
> The canonical way is get_cpu_ptr(stats) / put_cpu_ptr.
Thanks for the pointer. Fixed in v2.
By the way, in case anybody is interested, I did some historical
digging. The functions in question date back to aa0010f8 from 2012.
Before that commit, each tunnel driver dereferenced the per-cpu
variable itself and then incremented the statistics structure. When
this was factored out into iptunnel_xmit_stats, the author of aa0010f8
simply took the code from the tunnel drivers, including the use of
"this_cpu_ptr" instead of "get_cpu_ptr", presumably because each driver
could ensure preemption was disabled in its own codepath. Now that this
functionality has been generalized into the globally useful
iptunnel_xmit_stats, we need to use "get_cpu_ptr" instead.
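
To make the this_cpu_ptr/get_cpu_ptr distinction concrete, here is a
minimal userspace sketch. It is not kernel code: every model_* name is
hypothetical, the preempt counter is a plain int, and the "warning" only
imitates what the debug_smp_processor_id() check reports in the trace
above. It assumes, as in v1 of the patch, that get_cpu_ptr amounts to
disabling preemption and then taking the per-cpu pointer.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define MODEL_NR_CPUS 2

/* Hypothetical stand-in for struct pcpu_sw_netstats (tx side only). */
struct model_stats {
	uint64_t tx_bytes;
	uint64_t tx_packets;
};

static struct model_stats model_percpu[MODEL_NR_CPUS];
static int model_cpu;           /* CPU the modelled task runs on */
static int model_preempt_count; /* > 0 means preemption is disabled */
static int model_warnings;      /* imitates the "BUG: using smp_processor_id()" splat */

/* Models this_cpu_ptr(): no protection, complains if called while preemptible. */
static struct model_stats *model_this_cpu_ptr(void)
{
	if (model_preempt_count == 0)
		model_warnings++;
	return &model_percpu[model_cpu];
}

/* Models get_cpu_ptr(): disable preemption first, then take the pointer. */
static struct model_stats *model_get_cpu_ptr(void)
{
	model_preempt_count++;
	return model_this_cpu_ptr();
}

/* Models put_cpu_ptr(): re-enable preemption. */
static void model_put_cpu_ptr(struct model_stats *p)
{
	(void)p;
	assert(model_preempt_count > 0);
	model_preempt_count--;
}

/* The modelled scheduler may move the task only while it is preemptible. */
static int model_try_migrate(int cpu)
{
	if (model_preempt_count > 0)
		return 0;
	model_cpu = cpu;
	return 1;
}

/* Shape of the code before the patch. */
static void model_stats_unprotected(uint64_t len)
{
	struct model_stats *s = model_this_cpu_ptr();

	s->tx_bytes += len;
	s->tx_packets++;
}

/* Shape of the code after v2 of the patch. */
static void model_stats_protected(uint64_t len)
{
	struct model_stats *s = model_get_cpu_ptr();

	s->tx_bytes += len;
	s->tx_packets++;
	model_put_cpu_ptr(s);
}

/* Exercises both variants and returns the number of modelled warnings. */
int model_demo(void)
{
	struct model_stats *s;
	int moved;

	model_stats_unprotected(100); /* preemptible caller: warns */
	model_stats_protected(100);   /* protected: no warning */

	s = model_get_cpu_ptr();
	moved = model_try_migrate(1); /* refused while the pointer is held */
	model_put_cpu_ptr(s);

	printf("warnings=%d migrated_while_held=%d\n", model_warnings, moved);
	return model_warnings;
}
```

Calling model_demo() from a main() shows that only the unprotected
variant trips the modelled check, and that the task cannot be moved to
another CPU between taking the pointer and releasing it.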
On Thu, Nov 12, 2015, at 17:35, Jason A. Donenfeld wrote:
> Signed-off-by: Jason A. Donenfeld <[email protected]>
Acked-by: Hannes Frederic Sowa <[email protected]>
Thanks!
From: "Jason A. Donenfeld" <[email protected]>
Date: Thu, 12 Nov 2015 17:35:58 +0100
> Signed-off-by: Jason A. Donenfeld <[email protected]>
Applied and queued up for -stable, thanks Jason.
Arguably, ip6tunnel_xmit() is primarily a ->ndo_start_xmit() helper and
therefore could assume that it only ever ran inside a BH-disabled code
sequence. As you noted, once it was turned into a general-purpose
helper function, that guarantee was no longer necessarily there.
So another fix could have been to do local_bh_disable() in the
udp_tunnel6_xmit_skb() helper.
Thanks again.
On Mon, Nov 16, 2015 at 8:17 PM, David Miller <[email protected]> wrote:
> So another fix could have been to do local_bh_disable() in the
> udp_tunnel6_xmit_skb() helper.
This would have fixed one problem, but everywhere
udp_tunnel_xmit_skb() (4, not 6) is called, iptunnel_xmit_stats is
called right after it, so all of those call sites would need patching
too.
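
The asymmetry between the two helpers can be sketched in userspace as
follows. This is a model, not kernel code: all model_* names are
hypothetical, and it assumes that a BH-disabled section is also
non-preemptible, which is how local_bh_disable() behaves on mainline
non-RT kernels by raising the preempt count.

```c
#include <assert.h>
#include <stdio.h>

static int model_preempt_count;       /* > 0 means not preemptible */
static int model_warnings;            /* imitates the debug_smp_processor_id() splat */
static unsigned long model_tx_packets;

/* Models local_bh_disable()/local_bh_enable() as preempt-count changes. */
static void model_local_bh_disable(void)
{
	model_preempt_count++;
}

static void model_local_bh_enable(void)
{
	assert(model_preempt_count > 0);
	model_preempt_count--;
}

/* Models a this_cpu_ptr()-based stats update, i.e. the unpatched code. */
static void model_stats_update(void)
{
	if (model_preempt_count == 0)
		model_warnings++;
	model_tx_packets++;
}

/* v6 shape: the stats update happens inside the helper itself. */
static void model_v6_helper(void)
{
	model_local_bh_disable();
	model_stats_update(); /* covered by the BH-disabled section */
	model_local_bh_enable();
}

/* v4 shape: the helper only transmits and reports a positive length. */
static int model_v4_helper(void)
{
	model_local_bh_disable();
	/* transmit only, no stats update here */
	model_local_bh_enable();
	return 1;
}

/* v4 caller: updates stats after the helper has already returned. */
static void model_v4_caller(void)
{
	int err = model_v4_helper();

	if (err > 0)
		model_stats_update(); /* outside the BH-disabled section */
}

/* Runs both shapes and returns the number of modelled warnings. */
int model_demo(void)
{
	model_v6_helper();
	model_v4_caller();
	printf("warnings=%d tx_packets=%lu\n", model_warnings, model_tx_packets);
	return model_warnings;
}
```

In this model, disabling BH inside the helper covers the v6 path, while
the v4 path still warns because its stats update runs in the caller.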
By the way, there's something else I noticed while dealing with these
functions: the return value of udp_tunnel_xmit_skb is different from
that of udp_tunnel6_xmit_skb. Nobody is using them incorrectly, so far
as I can see, but it is confusing that they return different things. I
had started to clean this up and send a patch, but it got a bit
invasive in drivers I shouldn't really touch. If somebody with a bit
more top-down command of things wants to poke at this, it's
low-hanging fruit as far as I can see.
>
> Thanks again.
My pleasure.