2022-08-16 08:24:52

by Kuniyuki Iwashima

Subject: [PATCH v1 net 00/15] sysctl: Fix data-races around net.core.XXX (Round 1)

This series fixes data-races around 22 knobs in net_core_table.
The following knobs are skipped:

- netdev_rss_key: Written only once by net_get_random_once() and exposed
  as a read-only knob
- rps_sock_flow_entries: Protected with sock_flow_mutex
- flow_limit_cpu_bitmap: Protected with flow_limit_update_mutex
- flow_limit_table_len: Protected with flow_limit_update_mutex
- default_qdisc: Protected with qdisc_mod_lock
- warnings: Unused

Note that the 9th patch fixes net.core.message_cost and net.core.message_burst,
and that lib/ratelimit.c does not have an explicit maintainer.

The next round is the final round for net.core.XXX and starts from
netdev_budget_usecs.
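
For reference, the idiom applied throughout the series is roughly the
sketch below. The knob name and helpers are made up for illustration and
do not appear in any patch; the point is only that a plain int knob
written by its sysctl handler and read locklessly needs ONCE annotations
on both sides to avoid load/store tearing:

/* Illustrative sketch only; sysctl_example_knob and the helpers are
 * placeholders, not code from this series.
 */
static int sysctl_example_knob __read_mostly = 1;

/* sysctl handler (writer) side */
static void example_knob_write(int val)
{
        WRITE_ONCE(sysctl_example_knob, val);
}

/* lockless fast-path (reader) side: one snapshot so every check agrees */
static bool example_knob_enabled(void)
{
        return READ_ONCE(sysctl_example_knob) > 0;
}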


Kuniyuki Iwashima (15):
net: Fix data-races around sysctl_[rw]mem_(max|default).
net: Fix data-races around weight_p and dev_weight_[rt]x_bias.
net: Fix data-races around netdev_max_backlog.
bpf: Fix data-races around bpf_jit_enable.
bpf: Fix data-races around bpf_jit_harden.
bpf: Fix data-races around bpf_jit_kallsyms.
bpf: Fix a data-race around bpf_jit_limit.
net: Fix data-races around netdev_tstamp_prequeue.
ratelimit: Fix data-races in ___ratelimit().
net: Fix data-races around sysctl_optmem_max.
net: Fix a data-race around sysctl_tstamp_allow_data.
net: Fix a data-race around sysctl_net_busy_poll.
net: Fix a data-race around sysctl_net_busy_read.
net: Fix a data-race around netdev_budget.
net: Fix data-races around sysctl_max_skb_frags.

Documentation/admin-guide/sysctl/net.rst | 2 +-
arch/arm/net/bpf_jit_32.c | 2 +-
arch/arm64/net/bpf_jit_comp.c | 2 +-
arch/mips/net/bpf_jit_comp.c | 2 +-
arch/powerpc/net/bpf_jit_comp.c | 5 +++--
arch/riscv/net/bpf_jit_core.c | 2 +-
arch/s390/net/bpf_jit_comp.c | 2 +-
arch/sparc/net/bpf_jit_comp_32.c | 5 +++--
arch/sparc/net/bpf_jit_comp_64.c | 5 +++--
arch/x86/net/bpf_jit_comp.c | 2 +-
arch/x86/net/bpf_jit_comp32.c | 2 +-
include/linux/filter.h | 16 ++++++++++------
include/net/busy_poll.h | 2 +-
kernel/bpf/core.c | 2 +-
lib/ratelimit.c | 8 +++++---
net/core/bpf_sk_storage.c | 5 +++--
net/core/dev.c | 16 ++++++++--------
net/core/filter.c | 13 +++++++------
net/core/gro_cells.c | 2 +-
net/core/skbuff.c | 2 +-
net/core/sock.c | 18 ++++++++++--------
net/core/sysctl_net_core.c | 10 ++++++----
net/ipv4/ip_output.c | 2 +-
net/ipv4/ip_sockglue.c | 6 +++---
net/ipv4/tcp.c | 4 ++--
net/ipv4/tcp_output.c | 2 +-
net/ipv6/ipv6_sockglue.c | 4 ++--
net/mptcp/protocol.c | 2 +-
net/netfilter/ipvs/ip_vs_sync.c | 4 ++--
net/sched/sch_generic.c | 2 +-
net/xfrm/espintcp.c | 2 +-
net/xfrm/xfrm_input.c | 2 +-
32 files changed, 85 insertions(+), 70 deletions(-)

--
2.30.2


2022-08-16 08:26:01

by Kuniyuki Iwashima

Subject: [PATCH v1 net 04/15] bpf: Fix data-races around bpf_jit_enable.

The sysctl variable bpf_jit_enable is accessed concurrently, and there is
always a chance of a data-race. So, all of its readers and its writer need
some basic protection to avoid load/store tearing.
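
The arch JITs that test the knob more than once per compile take a single
snapshot into a local variable so that the per-pass debug dumps cannot
disagree if the sysctl is rewritten mid-compile. A minimal sketch of that
reader-side idiom (the function and messages below are placeholders, not
the real JIT entry points):

static void example_jit_compile(void)
{
        int jit_enable = READ_ONCE(bpf_jit_enable);

        if (jit_enable > 1)
                pr_info("pass 1 debug output\n");

        /* ... code generation passes ... */

        if (jit_enable > 1)
                pr_info("pass 2 debug output\n");
}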

Fixes: 0a14842f5a3c ("net: filter: Just In Time compiler for x86-64")
Signed-off-by: Kuniyuki Iwashima <[email protected]>
---
arch/arm/net/bpf_jit_32.c | 2 +-
arch/arm64/net/bpf_jit_comp.c | 2 +-
arch/mips/net/bpf_jit_comp.c | 2 +-
arch/powerpc/net/bpf_jit_comp.c | 5 +++--
arch/riscv/net/bpf_jit_core.c | 2 +-
arch/s390/net/bpf_jit_comp.c | 2 +-
arch/sparc/net/bpf_jit_comp_32.c | 5 +++--
arch/sparc/net/bpf_jit_comp_64.c | 5 +++--
arch/x86/net/bpf_jit_comp.c | 2 +-
arch/x86/net/bpf_jit_comp32.c | 2 +-
include/linux/filter.h | 2 +-
net/core/sysctl_net_core.c | 4 ++--
12 files changed, 19 insertions(+), 16 deletions(-)

diff --git a/arch/arm/net/bpf_jit_32.c b/arch/arm/net/bpf_jit_32.c
index 6a1c9fca5260..4b6b62a6fdd4 100644
--- a/arch/arm/net/bpf_jit_32.c
+++ b/arch/arm/net/bpf_jit_32.c
@@ -1999,7 +1999,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
}
flush_icache_range((u32)header, (u32)(ctx.target + ctx.idx));

- if (bpf_jit_enable > 1)
+ if (READ_ONCE(bpf_jit_enable) > 1)
/* there are 2 passes here */
bpf_jit_dump(prog->len, image_size, 2, ctx.target);

diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c
index 389623ae5a91..03bb40352d2c 100644
--- a/arch/arm64/net/bpf_jit_comp.c
+++ b/arch/arm64/net/bpf_jit_comp.c
@@ -1568,7 +1568,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
}

/* And we're done. */
- if (bpf_jit_enable > 1)
+ if (READ_ONCE(bpf_jit_enable) > 1)
bpf_jit_dump(prog->len, prog_size, 2, ctx.image);

bpf_flush_icache(header, ctx.image + ctx.idx);
diff --git a/arch/mips/net/bpf_jit_comp.c b/arch/mips/net/bpf_jit_comp.c
index b17130d510d4..1e623ae7eadf 100644
--- a/arch/mips/net/bpf_jit_comp.c
+++ b/arch/mips/net/bpf_jit_comp.c
@@ -1012,7 +1012,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
flush_icache_range((unsigned long)header,
(unsigned long)&ctx.target[ctx.jit_index]);

- if (bpf_jit_enable > 1)
+ if (READ_ONCE(bpf_jit_enable) > 1)
bpf_jit_dump(prog->len, image_size, 2, ctx.target);

prog->bpf_func = (void *)ctx.target;
diff --git a/arch/powerpc/net/bpf_jit_comp.c b/arch/powerpc/net/bpf_jit_comp.c
index 43e634126514..c71d1e94ee7e 100644
--- a/arch/powerpc/net/bpf_jit_comp.c
+++ b/arch/powerpc/net/bpf_jit_comp.c
@@ -122,6 +122,7 @@ bool bpf_jit_needs_zext(void)

struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
{
+ int jit_enable = READ_ONCE(bpf_jit_enable);
u32 proglen;
u32 alloclen;
u8 *image = NULL;
@@ -263,13 +264,13 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
}
bpf_jit_build_epilogue(code_base, &cgctx);

- if (bpf_jit_enable > 1)
+ if (jit_enable > 1)
pr_info("Pass %d: shrink = %d, seen = 0x%x\n", pass,
proglen - (cgctx.idx * 4), cgctx.seen);
}

skip_codegen_passes:
- if (bpf_jit_enable > 1)
+ if (jit_enable > 1)
/*
* Note that we output the base address of the code_base
* rather than image, since opcodes are in code_base.
diff --git a/arch/riscv/net/bpf_jit_core.c b/arch/riscv/net/bpf_jit_core.c
index 737baf8715da..603b5b66379b 100644
--- a/arch/riscv/net/bpf_jit_core.c
+++ b/arch/riscv/net/bpf_jit_core.c
@@ -151,7 +151,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
}
bpf_jit_build_epilogue(ctx);

- if (bpf_jit_enable > 1)
+ if (READ_ONCE(bpf_jit_enable) > 1)
bpf_jit_dump(prog->len, prog_size, pass, ctx->insns);

prog->bpf_func = (void *)ctx->insns;
diff --git a/arch/s390/net/bpf_jit_comp.c b/arch/s390/net/bpf_jit_comp.c
index af35052d06ed..06897a4e9c62 100644
--- a/arch/s390/net/bpf_jit_comp.c
+++ b/arch/s390/net/bpf_jit_comp.c
@@ -1831,7 +1831,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp)
fp = orig_fp;
goto free_addrs;
}
- if (bpf_jit_enable > 1) {
+ if (READ_ONCE(bpf_jit_enable) > 1) {
bpf_jit_dump(fp->len, jit.size, pass, jit.prg_buf);
print_fn_code(jit.prg_buf, jit.size_prg);
}
diff --git a/arch/sparc/net/bpf_jit_comp_32.c b/arch/sparc/net/bpf_jit_comp_32.c
index b1dbf2fa8c0a..7c454b920250 100644
--- a/arch/sparc/net/bpf_jit_comp_32.c
+++ b/arch/sparc/net/bpf_jit_comp_32.c
@@ -326,13 +326,14 @@ do { *prog++ = BR_OPC | WDISP22(OFF); \
void bpf_jit_compile(struct bpf_prog *fp)
{
unsigned int cleanup_addr, proglen, oldproglen = 0;
+ int jit_enable = READ_ONCE(bpf_jit_enable);
u32 temp[8], *prog, *func, seen = 0, pass;
const struct sock_filter *filter = fp->insns;
int i, flen = fp->len, pc_ret0 = -1;
unsigned int *addrs;
void *image;

- if (!bpf_jit_enable)
+ if (!jit_enable)
return;

addrs = kmalloc_array(flen, sizeof(*addrs), GFP_KERNEL);
@@ -743,7 +744,7 @@ cond_branch: f_offset = addrs[i + filter[i].jf];
oldproglen = proglen;
}

- if (bpf_jit_enable > 1)
+ if (jit_enable > 1)
bpf_jit_dump(flen, proglen, pass + 1, image);

if (image) {
diff --git a/arch/sparc/net/bpf_jit_comp_64.c b/arch/sparc/net/bpf_jit_comp_64.c
index fa0759bfe498..74cc1fa1f97f 100644
--- a/arch/sparc/net/bpf_jit_comp_64.c
+++ b/arch/sparc/net/bpf_jit_comp_64.c
@@ -1479,6 +1479,7 @@ struct sparc64_jit_data {

struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
{
+ int jit_enable = READ_ONCE(bpf_jit_enable);
struct bpf_prog *tmp, *orig_prog = prog;
struct sparc64_jit_data *jit_data;
struct bpf_binary_header *header;
@@ -1549,7 +1550,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
}
build_epilogue(&ctx);

- if (bpf_jit_enable > 1)
+ if (jit_enable > 1)
pr_info("Pass %d: size = %u, seen = [%c%c%c%c%c%c]\n", pass,
ctx.idx * 4,
ctx.tmp_1_used ? '1' : ' ',
@@ -1596,7 +1597,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
goto out_off;
}

- if (bpf_jit_enable > 1)
+ if (jit_enable > 1)
bpf_jit_dump(prog->len, image_size, pass, ctx.image);

bpf_flush_icache(header, (u8 *)header + header->size);
diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index c1f6c1c51d99..a5c7df7cab2a 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -2439,7 +2439,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
cond_resched();
}

- if (bpf_jit_enable > 1)
+ if (READ_ONCE(bpf_jit_enable) > 1)
bpf_jit_dump(prog->len, proglen, pass + 1, image);

if (image) {
diff --git a/arch/x86/net/bpf_jit_comp32.c b/arch/x86/net/bpf_jit_comp32.c
index 429a89c5468b..745f15a29dd3 100644
--- a/arch/x86/net/bpf_jit_comp32.c
+++ b/arch/x86/net/bpf_jit_comp32.c
@@ -2597,7 +2597,7 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)
cond_resched();
}

- if (bpf_jit_enable > 1)
+ if (READ_ONCE(bpf_jit_enable) > 1)
bpf_jit_dump(prog->len, proglen, pass + 1, image);

if (image) {
diff --git a/include/linux/filter.h b/include/linux/filter.h
index a5f21dc3c432..ce8072626ccf 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -1080,7 +1080,7 @@ static inline bool bpf_jit_is_ebpf(void)

static inline bool ebpf_jit_enabled(void)
{
- return bpf_jit_enable && bpf_jit_is_ebpf();
+ return READ_ONCE(bpf_jit_enable) && bpf_jit_is_ebpf();
}

static inline bool bpf_prog_ebpf_jited(const struct bpf_prog *fp)
diff --git a/net/core/sysctl_net_core.c b/net/core/sysctl_net_core.c
index d82ba0c27175..022abf326dfe 100644
--- a/net/core/sysctl_net_core.c
+++ b/net/core/sysctl_net_core.c
@@ -265,7 +265,7 @@ static int proc_dointvec_minmax_bpf_enable(struct ctl_table *table, int write,
void *buffer, size_t *lenp,
loff_t *ppos)
{
- int ret, jit_enable = *(int *)table->data;
+ int ret, jit_enable = READ_ONCE(*(int *)table->data);
int min = *(int *)table->extra1;
int max = *(int *)table->extra2;
struct ctl_table tmp = *table;
@@ -278,7 +278,7 @@ static int proc_dointvec_minmax_bpf_enable(struct ctl_table *table, int write,
if (write && !ret) {
if (jit_enable < 2 ||
(jit_enable == 2 && bpf_dump_raw_ok(current_cred()))) {
- *(int *)table->data = jit_enable;
+ WRITE_ONCE(*(int *)table->data, jit_enable);
if (jit_enable == 2)
pr_warn("bpf_jit_enable = 2 was set! NEVER use this in production, only for JIT debugging!\n");
} else {
--
2.30.2

2022-08-16 08:46:36

by Kuniyuki Iwashima

Subject: [PATCH v1 net 09/15] ratelimit: Fix data-races in ___ratelimit().

While rs->interval and rs->burst are being read, they can be changed
concurrently. Thus, we need to add READ_ONCE() to their readers.
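
The snapshot matters because ___ratelimit() uses rs->interval twice. As a
rough illustration only (the helper below is a placeholder, not the patched
code), the unannotated reader could observe two different values across
those uses:

        /* e.g. pass the "disabled?" test with interval == 5 * HZ ... */
        if (!rs->interval)
                return 1;

        /* ... then build the window end after interval was reset to 0 */
        if (time_is_before_jiffies(rs->begin + rs->interval))
                example_reset_window(rs);       /* placeholder */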

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <[email protected]>
---
lib/ratelimit.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/lib/ratelimit.c b/lib/ratelimit.c
index e01a93f46f83..b59a1d3d0cc3 100644
--- a/lib/ratelimit.c
+++ b/lib/ratelimit.c
@@ -26,10 +26,12 @@
*/
int ___ratelimit(struct ratelimit_state *rs, const char *func)
{
+ int interval = READ_ONCE(rs->interval);
+ int burst = READ_ONCE(rs->burst);
unsigned long flags;
int ret;

- if (!rs->interval)
+ if (!interval)
return 1;

/*
@@ -44,7 +46,7 @@ int ___ratelimit(struct ratelimit_state *rs, const char *func)
if (!rs->begin)
rs->begin = jiffies;

- if (time_is_before_jiffies(rs->begin + rs->interval)) {
+ if (time_is_before_jiffies(rs->begin + interval)) {
if (rs->missed) {
if (!(rs->flags & RATELIMIT_MSG_ON_RELEASE)) {
printk_deferred(KERN_WARNING
@@ -56,7 +58,7 @@ int ___ratelimit(struct ratelimit_state *rs, const char *func)
rs->begin = jiffies;
rs->printed = 0;
}
- if (rs->burst && rs->burst > rs->printed) {
+ if (burst && burst > rs->printed) {
rs->printed++;
ret = 1;
} else {
--
2.30.2

2022-08-16 08:53:08

by Kuniyuki Iwashima

[permalink] [raw]
Subject: [PATCH v1 net 15/15] net: Fix data-races around sysctl_max_skb_frags.

While sysctl_max_skb_frags is being read, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its readers.

Fixes: 5f74f82ea34c ("net:Add sysctl_max_skb_frags")
Signed-off-by: Kuniyuki Iwashima <[email protected]>
---
CC: Hans Westgaard Ry <[email protected]>
---
net/ipv4/tcp.c | 4 ++--
net/mptcp/protocol.c | 2 +-
2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 970e9a2cca4a..9a6fe3d6ab26 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -1000,7 +1000,7 @@ static struct sk_buff *tcp_build_frag(struct sock *sk, int size_goal, int flags,

i = skb_shinfo(skb)->nr_frags;
can_coalesce = skb_can_coalesce(skb, i, page, offset);
- if (!can_coalesce && i >= sysctl_max_skb_frags) {
+ if (!can_coalesce && i >= READ_ONCE(sysctl_max_skb_frags)) {
tcp_mark_push(tp, skb);
goto new_segment;
}
@@ -1354,7 +1354,7 @@ int tcp_sendmsg_locked(struct sock *sk, struct msghdr *msg, size_t size)

if (!skb_can_coalesce(skb, i, pfrag->page,
pfrag->offset)) {
- if (i >= sysctl_max_skb_frags) {
+ if (i >= READ_ONCE(sysctl_max_skb_frags)) {
tcp_mark_push(tp, skb);
goto new_segment;
}
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index da4257504fad..d398f3810662 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -1263,7 +1263,7 @@ static int mptcp_sendmsg_frag(struct sock *sk, struct sock *ssk,

i = skb_shinfo(skb)->nr_frags;
can_coalesce = skb_can_coalesce(skb, i, dfrag->page, offset);
- if (!can_coalesce && i >= sysctl_max_skb_frags) {
+ if (!can_coalesce && i >= READ_ONCE(sysctl_max_skb_frags)) {
tcp_mark_push(tcp_sk(ssk), skb);
goto alloc_skb;
}
--
2.30.2

2022-08-16 08:53:19

by Kuniyuki Iwashima

Subject: [PATCH v1 net 08/15] net: Fix data-races around netdev_tstamp_prequeue.

While netdev_tstamp_prequeue is being read, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its readers.

Fixes: 3b098e2d7c69 ("net: Consistent skb timestamping")
Signed-off-by: Kuniyuki Iwashima <[email protected]>
---
net/core/dev.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index 07da69c1ac0a..4705e6630efa 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4928,7 +4928,7 @@ static int netif_rx_internal(struct sk_buff *skb)
{
int ret;

- net_timestamp_check(netdev_tstamp_prequeue, skb);
+ net_timestamp_check(READ_ONCE(netdev_tstamp_prequeue), skb);

trace_netif_rx(skb);

@@ -5281,7 +5281,7 @@ static int __netif_receive_skb_core(struct sk_buff **pskb, bool pfmemalloc,
int ret = NET_RX_DROP;
__be16 type;

- net_timestamp_check(!netdev_tstamp_prequeue, skb);
+ net_timestamp_check(!READ_ONCE(netdev_tstamp_prequeue), skb);

trace_netif_receive_skb(skb);

@@ -5664,7 +5664,7 @@ static int netif_receive_skb_internal(struct sk_buff *skb)
{
int ret;

- net_timestamp_check(netdev_tstamp_prequeue, skb);
+ net_timestamp_check(READ_ONCE(netdev_tstamp_prequeue), skb);

if (skb_defer_rx_timestamp(skb))
return NET_RX_SUCCESS;
@@ -5694,7 +5694,7 @@ void netif_receive_skb_list_internal(struct list_head *head)

INIT_LIST_HEAD(&sublist);
list_for_each_entry_safe(skb, next, head, list) {
- net_timestamp_check(netdev_tstamp_prequeue, skb);
+ net_timestamp_check(READ_ONCE(netdev_tstamp_prequeue), skb);
skb_list_del_init(skb);
if (!skb_defer_rx_timestamp(skb))
list_add_tail(&skb->list, &sublist);
--
2.30.2

2022-08-16 08:53:40

by Kuniyuki Iwashima

Subject: [PATCH v1 net 05/15] bpf: Fix data-races around bpf_jit_harden.

While bpf_jit_harden is being read, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its readers.

Fixes: 4f3446bb809f ("bpf: add generic constant blinding for use in jits")
Signed-off-by: Kuniyuki Iwashima <[email protected]>
---
CC: Daniel Borkmann <[email protected]>
---
include/linux/filter.h | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/include/linux/filter.h b/include/linux/filter.h
index ce8072626ccf..09566ad211bd 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -1090,6 +1090,8 @@ static inline bool bpf_prog_ebpf_jited(const struct bpf_prog *fp)

static inline bool bpf_jit_blinding_enabled(struct bpf_prog *prog)
{
+ int jit_harden = READ_ONCE(bpf_jit_harden);
+
/* These are the prerequisites, should someone ever have the
* idea to call blinding outside of them, we make sure to
* bail out.
@@ -1098,9 +1100,9 @@ static inline bool bpf_jit_blinding_enabled(struct bpf_prog *prog)
return false;
if (!prog->jit_requested)
return false;
- if (!bpf_jit_harden)
+ if (!jit_harden)
return false;
- if (bpf_jit_harden == 1 && capable(CAP_SYS_ADMIN))
+ if (jit_harden == 1 && capable(CAP_SYS_ADMIN))
return false;

return true;
@@ -1111,7 +1113,7 @@ static inline bool bpf_jit_kallsyms_enabled(void)
/* There are a couple of corner cases where kallsyms should
* not be enabled f.e. on hardening.
*/
- if (bpf_jit_harden)
+ if (READ_ONCE(bpf_jit_harden))
return false;
if (!bpf_jit_kallsyms)
return false;
--
2.30.2

2022-08-16 08:56:17

by Kuniyuki Iwashima

Subject: [PATCH v1 net 10/15] net: Fix data-races around sysctl_optmem_max.

While sysctl_optmem_max is being read, it can be changed concurrently.
Thus, we need to add READ_ONCE() to its readers.
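
Several of the call sites below compare against the limit twice (a size
cap plus a running-total check), so they snapshot the knob first. A
simplified sketch of that charge-style check, with a hypothetical function
name modeled on the sock_kmalloc() logic:

static bool example_omem_charge(struct sock *sk, int size)
{
        int optmem_max = READ_ONCE(sysctl_optmem_max);

        /* both comparisons see the same limit */
        if (size <= optmem_max &&
            atomic_read(&sk->sk_omem_alloc) + size < optmem_max) {
                atomic_add(size, &sk->sk_omem_alloc);
                return true;
        }
        return false;
}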

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Kuniyuki Iwashima <[email protected]>
---
net/core/bpf_sk_storage.c | 5 +++--
net/core/filter.c | 9 +++++----
net/core/sock.c | 8 +++++---
net/ipv4/ip_sockglue.c | 6 +++---
net/ipv6/ipv6_sockglue.c | 4 ++--
5 files changed, 18 insertions(+), 14 deletions(-)

diff --git a/net/core/bpf_sk_storage.c b/net/core/bpf_sk_storage.c
index 1b7f385643b4..94374d529ea4 100644
--- a/net/core/bpf_sk_storage.c
+++ b/net/core/bpf_sk_storage.c
@@ -310,11 +310,12 @@ BPF_CALL_2(bpf_sk_storage_delete, struct bpf_map *, map, struct sock *, sk)
static int bpf_sk_storage_charge(struct bpf_local_storage_map *smap,
void *owner, u32 size)
{
+ int optmem_max = READ_ONCE(sysctl_optmem_max);
struct sock *sk = (struct sock *)owner;

/* same check as in sock_kmalloc() */
- if (size <= sysctl_optmem_max &&
- atomic_read(&sk->sk_omem_alloc) + size < sysctl_optmem_max) {
+ if (size <= optmem_max &&
+ atomic_read(&sk->sk_omem_alloc) + size < optmem_max) {
atomic_add(size, &sk->sk_omem_alloc);
return 0;
}
diff --git a/net/core/filter.c b/net/core/filter.c
index c4f14ad82029..c191db80ce93 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -1214,10 +1214,11 @@ void sk_filter_uncharge(struct sock *sk, struct sk_filter *fp)
static bool __sk_filter_charge(struct sock *sk, struct sk_filter *fp)
{
u32 filter_size = bpf_prog_size(fp->prog->len);
+ int optmem_max = READ_ONCE(sysctl_optmem_max);

/* same check as in sock_kmalloc() */
- if (filter_size <= sysctl_optmem_max &&
- atomic_read(&sk->sk_omem_alloc) + filter_size < sysctl_optmem_max) {
+ if (filter_size <= optmem_max &&
+ atomic_read(&sk->sk_omem_alloc) + filter_size < optmem_max) {
atomic_add(filter_size, &sk->sk_omem_alloc);
return true;
}
@@ -1548,7 +1549,7 @@ int sk_reuseport_attach_filter(struct sock_fprog *fprog, struct sock *sk)
if (IS_ERR(prog))
return PTR_ERR(prog);

- if (bpf_prog_size(prog->len) > sysctl_optmem_max)
+ if (bpf_prog_size(prog->len) > READ_ONCE(sysctl_optmem_max))
err = -ENOMEM;
else
err = reuseport_attach_prog(sk, prog);
@@ -1615,7 +1616,7 @@ int sk_reuseport_attach_bpf(u32 ufd, struct sock *sk)
}
} else {
/* BPF_PROG_TYPE_SOCKET_FILTER */
- if (bpf_prog_size(prog->len) > sysctl_optmem_max) {
+ if (bpf_prog_size(prog->len) > READ_ONCE(sysctl_optmem_max)) {
err = -ENOMEM;
goto err_prog_put;
}
diff --git a/net/core/sock.c b/net/core/sock.c
index 303af52f3b79..95abf4604d88 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -2536,7 +2536,7 @@ struct sk_buff *sock_omalloc(struct sock *sk, unsigned long size,

/* small safe race: SKB_TRUESIZE may differ from final skb->truesize */
if (atomic_read(&sk->sk_omem_alloc) + SKB_TRUESIZE(size) >
- sysctl_optmem_max)
+ READ_ONCE(sysctl_optmem_max))
return NULL;

skb = alloc_skb(size, priority);
@@ -2554,8 +2554,10 @@ struct sk_buff *sock_omalloc(struct sock *sk, unsigned long size,
*/
void *sock_kmalloc(struct sock *sk, int size, gfp_t priority)
{
- if ((unsigned int)size <= sysctl_optmem_max &&
- atomic_read(&sk->sk_omem_alloc) + size < sysctl_optmem_max) {
+ int optmem_max = READ_ONCE(sysctl_optmem_max);
+
+ if ((unsigned int)size <= optmem_max &&
+ atomic_read(&sk->sk_omem_alloc) + size < optmem_max) {
void *mem;
/* First do the add, to avoid the race if kmalloc
* might sleep.
diff --git a/net/ipv4/ip_sockglue.c b/net/ipv4/ip_sockglue.c
index a8a323ecbb54..e49a61a053a6 100644
--- a/net/ipv4/ip_sockglue.c
+++ b/net/ipv4/ip_sockglue.c
@@ -772,7 +772,7 @@ static int ip_set_mcast_msfilter(struct sock *sk, sockptr_t optval, int optlen)

if (optlen < GROUP_FILTER_SIZE(0))
return -EINVAL;
- if (optlen > sysctl_optmem_max)
+ if (optlen > READ_ONCE(sysctl_optmem_max))
return -ENOBUFS;

gsf = memdup_sockptr(optval, optlen);
@@ -808,7 +808,7 @@ static int compat_ip_set_mcast_msfilter(struct sock *sk, sockptr_t optval,

if (optlen < size0)
return -EINVAL;
- if (optlen > sysctl_optmem_max - 4)
+ if (optlen > READ_ONCE(sysctl_optmem_max) - 4)
return -ENOBUFS;

p = kmalloc(optlen + 4, GFP_KERNEL);
@@ -1233,7 +1233,7 @@ static int do_ip_setsockopt(struct sock *sk, int level, int optname,

if (optlen < IP_MSFILTER_SIZE(0))
goto e_inval;
- if (optlen > sysctl_optmem_max) {
+ if (optlen > READ_ONCE(sysctl_optmem_max)) {
err = -ENOBUFS;
break;
}
diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
index 222f6bf220ba..e0dcc7a193df 100644
--- a/net/ipv6/ipv6_sockglue.c
+++ b/net/ipv6/ipv6_sockglue.c
@@ -210,7 +210,7 @@ static int ipv6_set_mcast_msfilter(struct sock *sk, sockptr_t optval,

if (optlen < GROUP_FILTER_SIZE(0))
return -EINVAL;
- if (optlen > sysctl_optmem_max)
+ if (optlen > READ_ONCE(sysctl_optmem_max))
return -ENOBUFS;

gsf = memdup_sockptr(optval, optlen);
@@ -244,7 +244,7 @@ static int compat_ipv6_set_mcast_msfilter(struct sock *sk, sockptr_t optval,

if (optlen < size0)
return -EINVAL;
- if (optlen > sysctl_optmem_max - 4)
+ if (optlen > READ_ONCE(sysctl_optmem_max) - 4)
return -ENOBUFS;

p = kmalloc(optlen + 4, GFP_KERNEL);
--
2.30.2

2022-08-16 16:54:45

by Jakub Kicinski

Subject: Re: [PATCH v1 net 00/15] sysctl: Fix data-races around net.core.XXX (Round 1)

On Mon, 15 Aug 2022 22:23:32 -0700 Kuniyuki Iwashima wrote:
> bpf: Fix data-races around bpf_jit_enable.
> bpf: Fix data-races around bpf_jit_harden.
> bpf: Fix data-races around bpf_jit_kallsyms.
> bpf: Fix a data-race around bpf_jit_limit.

The BPF stuff needs to go via the BPF tree, or get an ack from the BPF
maintainers. I see Daniel is CCed on some of the patches but not all.

2022-08-16 18:00:22

by Kuniyuki Iwashima

Subject: Re: [PATCH v1 net 00/15] sysctl: Fix data-races around net.core.XXX (Round 1)

From: Jakub Kicinski <[email protected]>
Date: Tue, 16 Aug 2022 09:27:03 -0700
> On Mon, 15 Aug 2022 22:23:32 -0700 Kuniyuki Iwashima wrote:
> > bpf: Fix data-races around bpf_jit_enable.
> > bpf: Fix data-races around bpf_jit_harden.
> > bpf: Fix data-races around bpf_jit_kallsyms.
> > bpf: Fix a data-race around bpf_jit_limit.
>
> The BPF stuff needs to go via the BPF tree, or get an ack from the BPF
> maintainers. I see Daniel is CCed on some of the patches but not all.

Sorry, I just added the author in CC.
Thanks for CCing the bpf mailing list; I'll wait for an ACK from them.

2022-08-17 16:08:35

by Kuniyuki Iwashima

Subject: Re: [PATCH v1 net 00/15] sysctl: Fix data-races around net.core.XXX (Round 1)

From: Jakub Kicinski <[email protected]>
Date: Wed, 17 Aug 2022 08:58:41 -0700
> On Tue, 16 Aug 2022 09:58:48 -0700 Kuniyuki Iwashima wrote:
> > From: Jakub Kicinski <[email protected]>
> > Date: Tue, 16 Aug 2022 09:27:03 -0700
> > > On Mon, 15 Aug 2022 22:23:32 -0700 Kuniyuki Iwashima wrote:
> > > > bpf: Fix data-races around bpf_jit_enable.
> > > > bpf: Fix data-races around bpf_jit_harden.
> > > > bpf: Fix data-races around bpf_jit_kallsyms.
> > > > bpf: Fix a data-race around bpf_jit_limit.
> > >
> > > The BPF stuff needs to go via the BPF tree, or get an ack from the BPF
> > > maintainers. I see Daniel is CCed on some of the patches but not all.
> >
> > Thanks for CCing bpf mailing list, I'll wait an ACK from them.
>
> So we got no reply from BPF folks and the patch got marked as Changes
> Requested overnight, so probably best if you split the series up
> and send to appropriate trees.

I see, I'll do so.
Sorry for bothering you.

2022-08-17 16:30:14

by Jakub Kicinski

Subject: Re: [PATCH v1 net 00/15] sysctl: Fix data-races around net.core.XXX (Round 1)

On Tue, 16 Aug 2022 09:58:48 -0700 Kuniyuki Iwashima wrote:
> From: Jakub Kicinski <[email protected]>
> Date: Tue, 16 Aug 2022 09:27:03 -0700
> > On Mon, 15 Aug 2022 22:23:32 -0700 Kuniyuki Iwashima wrote:
> > > bpf: Fix data-races around bpf_jit_enable.
> > > bpf: Fix data-races around bpf_jit_harden.
> > > bpf: Fix data-races around bpf_jit_kallsyms.
> > > bpf: Fix a data-race around bpf_jit_limit.
> >
> > The BPF stuff needs to go via the BPF tree, or get an ack from the BPF
> > maintainers. I see Daniel is CCed on some of the patches but not all.
>
> Sorry, I just added the author in CC.
> Thanks for CCing the bpf mailing list; I'll wait for an ACK from them.

So we got no reply from BPF folks and the patch got marked as Changes
Requested overnight, so probably best if you split the series up
and send to appropriate trees.