Received: by 2002:ac0:a591:0:0:0:0:0 with SMTP id m17-v6csp1146406imm; Thu, 5 Jul 2018 15:58:22 -0700 (PDT) X-Google-Smtp-Source: AAOMgpdhtiwCF9+nFhUyPmRvUMI8/jFnLZRGhfEZPuBKFlylFLYvrpYj5ywWsbQqdWe/J7SGvBk2 X-Received: by 2002:a65:60cf:: with SMTP id r15-v6mr7213811pgv.41.1530831502417; Thu, 05 Jul 2018 15:58:22 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1530831502; cv=none; d=google.com; s=arc-20160816; b=amaKZy6v5q/dhq2QvN3b5yeYsKoW1uTLwlwlcB1iP4uc5YRamJRwgaAPFr4dVLp1VI rs4RgczvrmlfzZ/BjgH03OMQp5Y3TYGV8o/lH5R8dY7vxyu+tocGOeG/7mm1cDi1ITUh Xpq2ffYMgwzxc3U1TOu6Y0k2l8gMV9LWfg8MmReM85TeEqtM7mP/3ST/8WEORBumDsr3 WzcYPjVB8wk5bVGa4fnbUX9/0PhlAzL/3gKuMBdqW6vQi9YZmLBltZZFbRm5NHuNnlpi MVl8plscIADdGJ8f818P3Qj07w7JmqBc/iWW2aha8SE/eogJ4nuItrLv4kWUSxWuntss dj2A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:mime-version:message-id:references :in-reply-to:subject:cc:date:to:from:arc-authentication-results; bh=yjhaEAvc7h20llU6bPhiTEWGnKq8F0hCTEZZVV+1sqg=; b=Ck/0yjv8/blujrq/2JjkP8vpQmaIte2MxEfMaYYuxTTVH5tvvG5zQiu6gHUAFAWyEe fOIonAPXiCvnRhXF5n7Q181x4VbGPMkz/i61UOfB1GrHemlRSy0OIZeB3H7WrjNyFdGi qujpQyCV41m/yTyShq3gztkCrrlS1DgBZP20qH1CQH1EUby+Ve1YDHF49NMFDaGL5Ilg OK+IgilvH/5ZvMGxepSy1HXKlkowob7I0XcW7GiYFGcq2QYIx5A0JJY3wBG6Rbk1vkyF Meia2uISaDSc4Xd/q2yDoVX0C9ihKVqJTByVCmR8je6PsTFFcVkaLuVu52hOkZyepILI re9Q== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id c4-v6si7442686pfk.361.2018.07.05.15.57.43; Thu, 05 Jul 2018 15:58:22 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753526AbeGEW4Q (ORCPT + 99 others); Thu, 5 Jul 2018 18:56:16 -0400 Received: from mx2.suse.de ([195.135.220.15]:41552 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1753001AbeGEW4P (ORCPT ); Thu, 5 Jul 2018 18:56:15 -0400 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 9AF5DAE03; Thu, 5 Jul 2018 22:56:13 +0000 (UTC) From: NeilBrown To: Paolo Abeni , Thomas Graf , Herbert Xu , David Miller Date: Fri, 06 Jul 2018 08:56:03 +1000 Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org Subject: Re: [4/4] rhashtable: improve rhashtable_walk stability when stop/start used. In-Reply-To: <86f305ff238d7cdac7ab20b0d6395cc6571cf4e0.camel@redhat.com> References: <152452255351.1456.12384285355497513812.stgit@noble> <86f305ff238d7cdac7ab20b0d6395cc6571cf4e0.camel@redhat.com> Message-ID: <87bmblz3zw.fsf@notabene.neil.brown.name> MIME-Version: 1.0 Content-Type: multipart/signed; boundary="=-=-="; micalg=pgp-sha256; protocol="application/pgp-signature" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org --=-=-= Content-Type: text/plain Content-Transfer-Encoding: quoted-printable On Thu, Jul 05 2018, Paolo Abeni wrote: > On Tue, 2018-04-24 at 08:29 +1000, NeilBrown wrote: >> diff --git a/lib/rhashtable.c b/lib/rhashtable.c >> index 81edf1ab38ab..9427b5766134 100644 >> --- a/lib/rhashtable.c >> +++ b/lib/rhashtable.c >> @@ -727,6 +727,7 @@ int rhashtable_walk_start_check(struct rhashtable_it= er *iter) >> __acquires(RCU) >> { >> struct rhashtable *ht =3D iter->ht; >> + bool rhlist =3D ht->rhlist; >>=20=20 >> rcu_read_lock(); >>=20=20 >> @@ -735,13 +736,52 @@ int rhashtable_walk_start_check(struct rhashtable_= iter *iter) >> list_del(&iter->walker.list); >> spin_unlock(&ht->lock); >>=20=20 >> - if (!iter->walker.tbl && !iter->end_of_table) { >> + if (iter->end_of_table) >> + return 0; >> + if (!iter->walker.tbl) { >> iter->walker.tbl =3D rht_dereference_rcu(ht->tbl, ht); >> iter->slot =3D 0; >> iter->skip =3D 0; >> return -EAGAIN; >> } >>=20=20 >> + if (iter->p && !rhlist) { >> + /* >> + * We need to validate that 'p' is still in the table, and >> + * if so, update 'skip' >> + */ >> + struct rhash_head *p; >> + int skip =3D 0; >> + rht_for_each_rcu(p, iter->walker.tbl, iter->slot) { >> + skip++; >> + if (p =3D=3D iter->p) { >> + iter->skip =3D skip; >> + goto found; >> + } >> + } >> + iter->p =3D NULL; >> + } else if (iter->p && rhlist) { >> + /* Need to validate that 'list' is still in the table, and >> + * if so, update 'skip' and 'p'. >> + */ >> + struct rhash_head *p; >> + struct rhlist_head *list; >> + int skip =3D 0; >> + rht_for_each_rcu(p, iter->walker.tbl, iter->slot) { >> + for (list =3D container_of(p, struct rhlist_head, rhead); >> + list; >> + list =3D rcu_dereference(list->next)) { >> + skip++; >> + if (list =3D=3D iter->list) { >> + iter->p =3D p; >> + skip =3D skip; >> + goto found; >> + } >> + } >> + } >> + iter->p =3D NULL; >> + } >> +found: >> return 0; >> } >> EXPORT_SYMBOL_GPL(rhashtable_walk_start_check); > > While testing new code that uses the rhashtable walker, I'm obeserving > an use after free, that is apparently caused by the above: > > [ 146.834815] =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D > [ 146.842933] BUG: KASAN: use-after-free in inet_frag_worker+0x9f/0x210 > [ 146.850120] Read of size 4 at addr ffff881b6b9342d8 by task kworker/13= :1/177 > [ 146.857984]=20 > [ 146.859645] CPU: 13 PID: 177 Comm: kworker/13:1 Not tainted 4.18.0-rc3= .mirror_unclone_6_frag_dbg+ #1974 > [ 146.870128] Hardware name: Dell Inc. PowerEdge R730/072T6D, BIOS 2.1.7= 06/16/2016 > [ 146.878478] Workqueue: events inet_frag_worker > [ 146.883433] Call Trace: > [ 146.886162] dump_stack+0x90/0xe3 > [ 146.889861] print_address_description+0x6a/0x2a0 > [ 146.895109] kasan_report+0x176/0x2d0 > [ 146.899193] ? inet_frag_worker+0x9f/0x210 > [ 146.903762] inet_frag_worker+0x9f/0x210 > [ 146.908142] process_one_work+0x24f/0x6e0 > [ 146.912614] ? process_one_work+0x1a6/0x6e0 > [ 146.917285] worker_thread+0x4e/0x3d0 > [ 146.921373] kthread+0x106/0x140 > [ 146.924970] ? process_one_work+0x6e0/0x6e0 > [ 146.929637] ? kthread_bind+0x10/0x10 > [ 146.933723] ret_from_fork+0x3a/0x50 > [ 146.937717]=20 > [ 146.939377] Allocated by task 177: > [ 146.943170] kasan_kmalloc+0x86/0xb0 > [ 146.947158] __kmalloc_node+0x181/0x430 > [ 146.951438] kvmalloc_node+0x4f/0x70 > [ 146.955427] alloc_bucket_spinlocks+0x34/0xa0 > [ 146.960286] bucket_table_alloc.isra.13+0x61/0x180 > [ 146.965630] rhashtable_rehash_alloc+0x26/0x80 > [ 146.970585] rht_deferred_worker+0x394/0x810 > [ 146.975348] process_one_work+0x24f/0x6e0 > [ 146.979819] worker_thread+0x4e/0x3d0 > [ 146.983902] kthread+0x106/0x140 > [ 146.987502] ret_from_fork+0x3a/0x50 > [ 146.991487]=20 > [ 146.993146] Freed by task 90: > [ 146.996455] __kasan_slab_free+0x11d/0x180 > [ 147.001025] kfree+0x113/0x350 > [ 147.004431] bucket_table_free+0x1c/0x70 > [ 147.008806] rcu_process_callbacks+0x6c8/0x880 > [ 147.013762] __do_softirq+0xd5/0x505 > [ 147.017747]=20 > [ 147.019407] The buggy address belongs to the object at ffff881b6b934200 > [ 147.019407] which belongs to the cache kmalloc-8192 of size 8192 > [ 147.033574] The buggy address is located 216 bytes inside of > [ 147.033574] 8192-byte region [ffff881b6b934200, ffff881b6b936200) > [ 147.046773] The buggy address belongs to the page: > [ 147.052116] page:ffffea006dae4c00 count:1 mapcount:0 mapping:ffff88010= 7c0e400 index:0x0 compound_mapcount: 0 > [ 147.063086] flags: 0x17ffffc0008100(slab|head) > [ 147.068043] raw: 0017ffffc0008100 dead000000000100 dead000000000200 ff= ff880107c0e400 > [ 147.076684] raw: 0000000000000000 0000000000030003 00000001ffffffff 00= 00000000000000 > [ 147.085324] page dumped because: kasan: bad access detected > [ 147.091540]=20 > [ 147.093210] Memory state around the buggy address: > [ 147.098553] ffff881b6b934180: fc fc fc fc fc fc fc fc fc fc fc fc fc = fc fc fc > [ 147.106613] ffff881b6b934200: fb fb fb fb fb fb fb fb fb fb fb fb fb = fb fb fb > [ 147.114670] >ffff881b6b934280: fb fb fb fb fb fb fb fb fb fb fb fb fb = fb fb fb > [ 147.122730] ^ > [ 147.129527] ffff881b6b934300: fb fb fb fb fb fb fb fb fb fb fb fb fb = fb fb fb > [ 147.137587] ffff881b6b934380: fb fb fb fb fb fb fb fb fb fb fb fb fb = fb fb fb > [ 147.145646] =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D= =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D > > Reverting the above change avoid the issue.=20 > > I *think* that reusing iter->p after a > rcu_read_lock()/rcu_read_unlock() is unsafe: > if the table has been reashed we can still and reuse 'p', but the > related grace period could be already expired. Thanks for the report. It would be unsafe to dereference iter->p, but the code doesn't. At least, it doesn't dereference it until it has searched through the table and confirmed that the pointer is still in the table. Could you please use scripts/faddr2line to identify exactly where the error is occurring? e.g ./scripts/faddr2line vmlinux.o inet_frag_worker+0x9f/0x210 (any .o which contains inet_frag_worker should work). Thanks, NeilBrown > > I can't think of any better fix than a revert. Other opinions welcome! > > Paolo --=-=-= Content-Type: application/pgp-signature; name="signature.asc" -----BEGIN PGP SIGNATURE----- iQIzBAEBCAAdFiEEG8Yp69OQ2HB7X0l6Oeye3VZigbkFAls+ogMACgkQOeye3VZi gblPew/5AST8EU1fX5hTiEoqR1JAIUgVwarXIrX26zOQbAeKKB9AwPw5heYQAyAN fiCCfI32mexTpfwMzMSRpCpwVyeLRDRR+C4NKoC5y6LGGAgKAjKgvg5oRhxyrYf2 QBdW3dUTzHyLMfpTG3KMvtw0Et14YfnpZDosG3HsUkgrTvCC+bI86rb04XlqZ5km QR75yaSW3LgtDxAAAqeXp/UCRz/5QaFowvsxq3acNx7JB8tSTDh2Em8vl4YcXQEl +QS7tTSyqUrgt8u1rEUg2UChDrewblQNgrBVm7po6ve3ZfVp1l6rziQDiHIgie9Q E7e/gQqSh/QCWZDpdhje784ovaAQCC/EIXqWYkldQe++M+himgZ+cwJS3PkEsoO6 biaoWOZsAJK6WqVP0ccwXNJn2n0y4IJ8ZaxXzq3gWyRUvkPss/qI8Bvvm7OWQHJJ 4MluO92iy6wnamEz43saKehrjK1RXDWBcuEkT6kxPRFK8vF0e29v/RL0CGdkmB1r CB/RuCvxnftcchKhPiP3T9zEd+rkG1SNSF3iaXe3lBnshe+3g10Hj8JWpP83i5Co OvNqtn8XNYeVj1Fx/9/uFfGa9UwCuE2mQxgTlZJO+TgvkfillS4H9fr4juWAH9Po zDAWxm3YVXV1MBUUjXW0Sr11+SkzxRhEi8qpyiddZ+CF9ceCUEY= =mUQ9 -----END PGP SIGNATURE----- --=-=-=--