2020-03-19 22:13:42

by Qian Cai

Subject: [PATCH] ipv4: fix a RCU-list bug in inet_dump_fib()

The call path

inet_dump_fib()
  fib_table_dump()
    fn_trie_dump_leaf()
      hlist_for_each_entry_rcu()

is reached without rcu_read_lock() held, which triggers a warning:

WARNING: suspicious RCU usage
-----------------------------
net/ipv4/fib_trie.c:2216 RCU-list traversed in non-reader section!!

other info that might help us debug this:

rcu_scheduler_active = 2, debug_locks = 1
1 lock held by ip/1923:
#0: ffffffff8ce76e40 (rtnl_mutex){+.+.}, at: netlink_dump+0xd6/0x840

Call Trace:
dump_stack+0xa1/0xea
lockdep_rcu_suspicious+0x103/0x10d
fn_trie_dump_leaf+0x581/0x590
fib_table_dump+0x15f/0x220
inet_dump_fib+0x4ad/0x5d0
netlink_dump+0x350/0x840
__netlink_dump_start+0x315/0x3e0
rtnetlink_rcv_msg+0x4d1/0x720
netlink_rcv_skb+0xf0/0x220
rtnetlink_rcv+0x15/0x20
netlink_unicast+0x306/0x460
netlink_sendmsg+0x44b/0x770
__sys_sendto+0x259/0x270
__x64_sys_sendto+0x80/0xa0
do_syscall_64+0x69/0xf4
entry_SYSCALL_64_after_hwframe+0x49/0xb3

Signed-off-by: Qian Cai <[email protected]>
---
net/ipv4/fib_frontend.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/fib_frontend.c b/net/ipv4/fib_frontend.c
index 577db1d50a24..5e441282d647 100644
--- a/net/ipv4/fib_frontend.c
+++ b/net/ipv4/fib_frontend.c
@@ -987,6 +987,8 @@ static int inet_dump_fib(struct sk_buff *skb, struct netlink_callback *cb)
 	if (filter.flags & RTM_F_PREFIX)
 		return skb->len;
 
+	rcu_read_lock();
+
 	if (filter.table_id) {
 		tb = fib_get_table(net, filter.table_id);
 		if (!tb) {
@@ -1004,8 +1006,6 @@ static int inet_dump_fib(struct sk_buff *skb, struct netlink_callback *cb)
 	s_h = cb->args[0];
 	s_e = cb->args[1];
 
-	rcu_read_lock();
-
 	for (h = s_h; h < FIB_TABLE_HASHSZ; h++, s_e = 0) {
 		e = 0;
 		head = &net->ipv4.fib_table_hash[h];
--
2.21.0 (Apple Git-122.2)


2020-03-19 22:15:43

by Qian Cai

Subject: Re: [PATCH] ipv4: fix a RCU-list bug in inet_dump_fib()



> On Mar 19, 2020, at 6:11 PM, Qian Cai <[email protected]> wrote:
>
> There is a place,
>
> inet_dump_fib()
>   fib_table_dump
>     fn_trie_dump_leaf()
>       hlist_for_each_entry_rcu()
>
> without rcu_read_lock() triggers a warning,
>
> WARNING: suspicious RCU usage
> -----------------------------
> net/ipv4/fib_trie.c:2216 RCU-list traversed in non-reader section!!
>
> other info that might help us debug this:
>
> rcu_scheduler_active = 2, debug_locks = 1
> 1 lock held by ip/1923:
> #0: ffffffff8ce76e40 (rtnl_mutex){+.+.}, at: netlink_dump+0xd6/0x840
>
> Call Trace:
> dump_stack+0xa1/0xea
> lockdep_rcu_suspicious+0x103/0x10d
> fn_trie_dump_leaf+0x581/0x590
> fib_table_dump+0x15f/0x220
> inet_dump_fib+0x4ad/0x5d0
> netlink_dump+0x350/0x840
> __netlink_dump_start+0x315/0x3e0
> rtnetlink_rcv_msg+0x4d1/0x720
> netlink_rcv_skb+0xf0/0x220
> rtnetlink_rcv+0x15/0x20
> netlink_unicast+0x306/0x460
> netlink_sendmsg+0x44b/0x770
> __sys_sendto+0x259/0x270
> __x64_sys_sendto+0x80/0xa0
> do_syscall_64+0x69/0xf4
> entry_SYSCALL_64_after_hwframe+0x49/0xb3
>
> Signed-off-by: Qian Cai <[email protected]>

Self-NAK. I forgot to unlock. Will send a v2.

> ---
> net/ipv4/fib_frontend.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/net/ipv4/fib_frontend.c b/net/ipv4/fib_frontend.c
> index 577db1d50a24..5e441282d647 100644
> --- a/net/ipv4/fib_frontend.c
> +++ b/net/ipv4/fib_frontend.c
> @@ -987,6 +987,8 @@ static int inet_dump_fib(struct sk_buff *skb, struct netlink_callback *cb)
>  	if (filter.flags & RTM_F_PREFIX)
>  		return skb->len;
>  
> +	rcu_read_lock();
> +
>  	if (filter.table_id) {
>  		tb = fib_get_table(net, filter.table_id);
>  		if (!tb) {
> @@ -1004,8 +1006,6 @@ static int inet_dump_fib(struct sk_buff *skb, struct netlink_callback *cb)
>  	s_h = cb->args[0];
>  	s_e = cb->args[1];
>  
> -	rcu_read_lock();
> -
>  	for (h = s_h; h < FIB_TABLE_HASHSZ; h++, s_e = 0) {
>  		e = 0;
>  		head = &net->ipv4.fib_table_hash[h];
> --
> 2.21.0 (Apple Git-122.2)
>

2020-03-19 23:27:48

by David Ahern

Subject: Re: [PATCH] ipv4: fix a RCU-list bug in inet_dump_fib()

On 3/19/20 4:11 PM, Qian Cai wrote:
> There is a place,
>
> inet_dump_fib()
>   fib_table_dump
>     fn_trie_dump_leaf()
>       hlist_for_each_entry_rcu()
>
> without rcu_read_lock() triggers a warning,
>
> WARNING: suspicious RCU usage
> -----------------------------
> net/ipv4/fib_trie.c:2216 RCU-list traversed in non-reader section!!
>
> other info that might help us debug this:
>
> rcu_scheduler_active = 2, debug_locks = 1
> 1 lock held by ip/1923:
> #0: ffffffff8ce76e40 (rtnl_mutex){+.+.}, at: netlink_dump+0xd6/0x840
>
> Call Trace:
> dump_stack+0xa1/0xea
> lockdep_rcu_suspicious+0x103/0x10d
> fn_trie_dump_leaf+0x581/0x590
> fib_table_dump+0x15f/0x220
> inet_dump_fib+0x4ad/0x5d0
> netlink_dump+0x350/0x840
> __netlink_dump_start+0x315/0x3e0
> rtnetlink_rcv_msg+0x4d1/0x720
> netlink_rcv_skb+0xf0/0x220
> rtnetlink_rcv+0x15/0x20
> netlink_unicast+0x306/0x460
> netlink_sendmsg+0x44b/0x770
> __sys_sendto+0x259/0x270
> __x64_sys_sendto+0x80/0xa0
> do_syscall_64+0x69/0xf4
> entry_SYSCALL_64_after_hwframe+0x49/0xb3
>

Fixes: 18a8021a7be3 ("net/ipv4: Plumb support for filtering route dumps")

but you have a problem below ...

> Signed-off-by: Qian Cai <[email protected]>
> ---
> net/ipv4/fib_frontend.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/net/ipv4/fib_frontend.c b/net/ipv4/fib_frontend.c
> index 577db1d50a24..5e441282d647 100644
> --- a/net/ipv4/fib_frontend.c
> +++ b/net/ipv4/fib_frontend.c
> @@ -987,6 +987,8 @@ static int inet_dump_fib(struct sk_buff *skb, struct netlink_callback *cb)
>  	if (filter.flags & RTM_F_PREFIX)
>  		return skb->len;
>  
> +	rcu_read_lock();
> +
>  	if (filter.table_id) {
>  		tb = fib_get_table(net, filter.table_id);
>  		if (!tb) {

This branch has 2 return points that now exit with rcu_read_lock() still
held; you should have seen this when you tested the change.
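
To make the imbalance concrete, here is a minimal userspace sketch of the
two shapes. It is not kernel code: rcu_depth, model_rcu_read_lock(),
model_rcu_read_unlock(), model_table_dump(), dump_v1() and dump_balanced()
are hypothetical stand-ins invented for illustration, and the narrow
lock/unlock pair around the dump call is only one possible repair, not
necessarily what a v2 would do.

```c
#include <assert.h>

/* Hypothetical userspace model; a counter stands in for RCU reader nesting. */
static int rcu_depth;

static void model_rcu_read_lock(void)
{
	rcu_depth++;
}

static void model_rcu_read_unlock(void)
{
	assert(rcu_depth > 0);	/* an unlock must pair with an earlier lock */
	rcu_depth--;
}

/* Stand-in for fib_table_dump(): fails if called outside a reader section. */
static int model_table_dump(void)
{
	return rcu_depth > 0 ? 0 : -1;
}

/* Shape of the posted patch: lock before the branch, early returns skip unlock. */
static int dump_v1(int table_id, int table_exists)
{
	model_rcu_read_lock();

	if (table_id) {
		if (!table_exists)
			return 0;		/* returns with the lock held */
		return model_table_dump();	/* returns with the lock held */
	}

	/* the walk over all tables would go here */
	model_rcu_read_unlock();
	return 0;
}

/* One possible repair (assumption): hold the lock only around each traversal. */
static int dump_balanced(int table_id, int table_exists)
{
	int err;

	if (table_id) {
		if (!table_exists)
			return 0;
		model_rcu_read_lock();
		err = model_table_dump();
		model_rcu_read_unlock();
		return err;
	}

	model_rcu_read_lock();
	/* the walk over all tables would go here */
	model_rcu_read_unlock();
	return 0;
}
```

In this model, calling dump_v1() with a non-zero table_id leaves rcu_depth
at 1 on both return paths, while dump_balanced() leaves it at 0 on every
path.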


2020-03-25 03:06:06

by Chen, Rong A

Subject: [ipv4] f3f6f46e79: kernel-selftests.net.pmtu.sh.fail

FYI, we noticed the following commit (built with gcc-7):

commit: f3f6f46e7935c5fe707b4a88124556d8b9f10c92 ("[PATCH] ipv4: fix a RCU-list bug in inet_dump_fib()")
url: https://github.com/0day-ci/linux/commits/Qian-Cai/ipv4-fix-a-RCU-list-bug-in-inet_dump_fib/20200320-061342


in testcase: kernel-selftests
with following parameters:

group: kselftests-02
ucode: 0xd6

test-description: The kernel contains a set of "self tests" under the tools/testing/selftests/ directory. These are intended to be small unit tests to exercise individual code paths in the kernel.
test-url: https://www.kernel.org/doc/Documentation/kselftest.txt


on test machine: 8 threads Intel(R) Core(TM) i7-6700 CPU @ 3.40GHz with 16G memory

caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):


If you fix the issue, kindly add following tag
Reported-by: kernel test robot <[email protected]>



# selftests: net: pmtu.sh
# TEST: ipv4: PMTU exceptions [ OK ]
# TEST: ipv4: PMTU exceptions - nexthop objects [ OK ]
# TEST: ipv6: PMTU exceptions [ OK ]
# TEST: ipv6: PMTU exceptions - nexthop objects [ OK ]
# TEST: IPv4 over vxlan4: PMTU exceptions [ OK ]
# TEST: IPv4 over vxlan4: PMTU exceptions - nexthop objects [ OK ]
# TEST: IPv6 over vxlan4: PMTU exceptions [ OK ]
# TEST: IPv6 over vxlan4: PMTU exceptions - nexthop objects [ OK ]
# TEST: IPv4 over vxlan6: PMTU exceptions [ OK ]
# TEST: IPv4 over vxlan6: PMTU exceptions - nexthop objects [ OK ]
# TEST: IPv6 over vxlan6: PMTU exceptions [ OK ]
# TEST: IPv6 over vxlan6: PMTU exceptions - nexthop objects [ OK ]
# TEST: IPv4 over geneve4: PMTU exceptions [ OK ]
# TEST: IPv4 over geneve4: PMTU exceptions - nexthop objects [ OK ]
# TEST: IPv6 over geneve4: PMTU exceptions [ OK ]
# TEST: IPv6 over geneve4: PMTU exceptions - nexthop objects [ OK ]
# TEST: IPv4 over geneve6: PMTU exceptions [ OK ]
# TEST: IPv4 over geneve6: PMTU exceptions - nexthop objects [ OK ]
# TEST: IPv6 over geneve6: PMTU exceptions [ OK ]
# TEST: IPv6 over geneve6: PMTU exceptions - nexthop objects [ OK ]
# TEST: IPv4 over fou4: PMTU exceptions [ OK ]
# TEST: IPv4 over fou4: PMTU exceptions - nexthop objects [ OK ]
# TEST: IPv6 over fou4: PMTU exceptions [ OK ]
# TEST: IPv6 over fou4: PMTU exceptions - nexthop objects [ OK ]
# TEST: IPv4 over fou6: PMTU exceptions [ OK ]
# TEST: IPv4 over fou6: PMTU exceptions - nexthop objects [ OK ]
# TEST: IPv6 over fou6: PMTU exceptions [ OK ]
# TEST: IPv6 over fou6: PMTU exceptions - nexthop objects [ OK ]
# TEST: IPv4 over gue4: PMTU exceptions [ OK ]
# TEST: IPv4 over gue4: PMTU exceptions - nexthop objects [ OK ]
# TEST: IPv6 over gue4: PMTU exceptions [ OK ]
# TEST: IPv6 over gue4: PMTU exceptions - nexthop objects [ OK ]
# TEST: IPv4 over gue6: PMTU exceptions [ OK ]
# TEST: IPv4 over gue6: PMTU exceptions - nexthop objects [ OK ]
# TEST: IPv6 over gue6: PMTU exceptions [ OK ]
# TEST: IPv6 over gue6: PMTU exceptions - nexthop objects [ OK ]
# TEST: vti6: PMTU exceptions [ OK ]
# TEST: vti4: PMTU exceptions [ OK ]
# TEST: vti4: default MTU assignment [ OK ]
# TEST: vti6: default MTU assignment [ OK ]
# TEST: vti4: MTU setting on link creation [ OK ]
# TEST: vti6: MTU setting on link creation [ OK ]
# TEST: vti6: MTU changes on link changes [ OK ]
# TEST: ipv4: cleanup of cached exceptions [ OK ]
# TEST: ipv4: cleanup of cached exceptions - nexthop objects [ OK ]
# TEST: ipv6: cleanup of cached exceptions [ OK ]
# TEST: ipv6: cleanup of cached exceptions - nexthop objects [ OK ]
# Segmentation fault
# Segmentation fault
# TEST: ipv4: list and flush cached exceptions [FAIL]
# can't list cached exceptions
# Segmentation fault
# Segmentation fault
# TEST: ipv4: list and flush cached exceptions - nexthop objects [FAIL]
# can't list cached exceptions
# TEST: ipv6: list and flush cached exceptions [ OK ]
# TEST: ipv6: list and flush cached exceptions - nexthop objects [ OK ]
not ok 16 selftests: net: pmtu.sh # exit=1

To reproduce:

git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp run job.yaml



Thanks,
Rong Chen


Attachments:
(No filename) (5.25 kB)
config-5.6.0-rc5-01734-gf3f6f46e7935c (206.96 kB)
job-script (6.27 kB)
kmsg.xz (81.75 kB)
kernel-selftests (254.50 kB)
job.yaml (5.38 kB)
reproduce (1.18 kB)