Received: by 2002:a5b:505:0:0:0:0:0 with SMTP id o5csp24732ybp; Tue, 8 Oct 2019 13:24:10 -0700 (PDT) X-Google-Smtp-Source: APXvYqxf01aCCyZgsfsMaEL0EKCppf0XDS9uIJS5kOJ49Ep9LKTfw4AP1nqubIWhe80lmcNgSbZX X-Received: by 2002:a17:907:2067:: with SMTP id qp7mr13938965ejb.138.1570566250224; Tue, 08 Oct 2019 13:24:10 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1570566250; cv=none; d=google.com; s=arc-20160816; b=Xh5DeHIA9buP5h8NTJtJ/Cf2CebuE5BuUM1TsFBzoHlyJ/qqxJ7H51xT10e0VHIFqH 61dZhleXG0OEjpNvHviZVQKdX84Bg/Wu1F9imgVzJ9+ojK/xfIir3GPO05Ch8x0OzTQw +s5+VULl89tumqW3m5Nq1vPKhxHkMHH6CV2BjlmQkWMHtrJuuBhT5k56uJ1uYzM3Wegy QWJniHqdYCiyVMMkzkNrzeVXC6PPq15g7gCT5DGt9xJY6WOgv4jJld7AxtnVX+Vq3eDA FlL82RKrdHljvSEiOlKEVjf9WYYjtfX7S0Uyx7aKmZnxpkVk9uSlg6XTZ1U5+Z5rf/hM G7Rg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:user-agent:in-reply-to :content-disposition:mime-version:references:message-id:subject:cc :to:from:date; bh=+Ol8r7x+pBe5FGeX8C685UXpJayFRA7zZaMbYQvDdFM=; b=kZ50fixA48CwQpl1UGXVaEHo83x0gvlavcIbrPDBPChso5LFpMje+OgV7pDz9dTfyG MKbsNKW1iQuvTZS9yxmqUCRSNMdRJfjAE3IDyr25vHQ6QD/WwaPCva9xZomhD3LAf+nd cHZOABULn3KY1yKEKZZa54H4ezLrnENf2Asywb9xvCdCxU7GNsyvbFdx+p/bk8hnrB2q 2YMg970p3LYLsw8PKXIDzwwUcG4ab6ep2Jk6VUkCraDKJhTo2q3pIw6FVxqd/z0FLy6S WQTOxT5R15B6/QTOoEhDpzPpKj4oII4xSf2fkBZ9N/qm1d1qZPByx237o/Mqr4hwQmkp 6nLg== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-nfs-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-nfs-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id j46si99618eda.9.2019.10.08.13.23.39; Tue, 08 Oct 2019 13:24:10 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-nfs-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-nfs-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-nfs-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730523AbfJHUXe (ORCPT + 99 others); Tue, 8 Oct 2019 16:23:34 -0400 Received: from fieldses.org ([173.255.197.46]:47314 "EHLO fieldses.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727835AbfJHUXe (ORCPT ); Tue, 8 Oct 2019 16:23:34 -0400 Received: by fieldses.org (Postfix, from userid 2815) id 02F8A2C0; Tue, 8 Oct 2019 16:23:33 -0400 (EDT) Date: Tue, 8 Oct 2019 16:23:32 -0400 From: "J . Bruce Fields" To: Pavel Tikhomirov Cc: Trond Myklebust , Anna Schumaker , Neil Brown , Chuck Lever , "David S . Miller" , "linux-nfs@vger.kernel.org" , "netdev@vger.kernel.org" , "linux-kernel@vger.kernel.org" , Konstantin Khorenko , Vasiliy Averin Subject: Re: [PATCH] sunrpc: fix crash when cache_head become valid before update Message-ID: <20191008202332.GB9151@fieldses.org> References: <20191001080359.6034-1-ptikhomirov@virtuozzo.com> <3e455bb4-2a03-551e-6efb-1d41b5258327@virtuozzo.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <3e455bb4-2a03-551e-6efb-1d41b5258327@virtuozzo.com> User-Agent: Mutt/1.5.21 (2010-09-15) Sender: linux-nfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-nfs@vger.kernel.org On Tue, Oct 08, 2019 at 10:02:53AM +0000, Pavel Tikhomirov wrote: > Add Neil to CC, sorry, had lost it somehow... Always happy when we can fix a bug by deleting code, and your explanation makes sense to me, but I'll give Neil a chance to look it over if he wants. --b. > > On 10/1/19 11:03 AM, Pavel Tikhomirov wrote: > > I was investigating a crash in our Virtuozzo7 kernel which happened in > > in svcauth_unix_set_client. I found out that we access m_client field > > in ip_map structure, which was received from sunrpc_cache_lookup (we > > have a bit older kernel, now the code is in sunrpc_cache_add_entry), and > > these field looks uninitialized (m_client == 0x74 don't look like a > > pointer) but in the cache_head in flags we see 0x1 which is CACHE_VALID. > > > > It looks like the problem appeared from our previous fix to sunrpc (1): > > commit 4ecd55ea0742 ("sunrpc: fix cache_head leak due to queued > > request") > > > > And we've also found a patch already fixing our patch (2): > > commit d58431eacb22 ("sunrpc: don't mark uninitialised items as VALID.") > > > > Though the crash is eliminated, I think the core of the problem is not > > completely fixed: > > > > Neil in the patch (2) makes cache_head CACHE_NEGATIVE, before > > cache_fresh_locked which was added in (1) to fix crash. These way > > cache_is_valid won't say the cache is valid anymore and in > > svcauth_unix_set_client the function cache_check will return error > > instead of 0, and we don't count entry as initialized. > > > > But it looks like we need to remove cache_fresh_locked completely in > > sunrpc_cache_lookup: > > > > In (1) we've only wanted to make cache_fresh_unlocked->cache_dequeue so > > that cache_requests with no readers also release corresponding > > cache_head, to fix their leak. We with Vasily were not sure if > > cache_fresh_locked and cache_fresh_unlocked should be used in pair or > > not, so we've guessed to use them in pair. > > > > Now we see that we don't want the CACHE_VALID bit set here by > > cache_fresh_locked, as "valid" means "initialized" and there is no > > initialization in sunrpc_cache_add_entry. Both expiry_time and > > last_refresh are not used in cache_fresh_unlocked code-path and also not > > required for the initial fix. > > > > So to conclude cache_fresh_locked was called by mistake, and we can just > > safely remove it instead of crutching it with CACHE_NEGATIVE. It looks > > ideologically better for me. Hope I don't miss something here. > > > > Here is our crash backtrace: > > [13108726.326291] BUG: unable to handle kernel NULL pointer dereference at 0000000000000074 > > [13108726.326365] IP: [] svcauth_unix_set_client+0x2ab/0x520 [sunrpc] > > [13108726.326448] PGD 0 > > [13108726.326468] Oops: 0002 [#1] SMP > > [13108726.326497] Modules linked in: nbd isofs xfs loop kpatch_cumulative_81_0_r1(O) xt_physdev nfnetlink_queue bluetooth rfkill ip6table_nat nf_nat_ipv6 ip_vs_wrr ip_vs_wlc ip_vs_sh nf_conntrack_netlink ip_vs_sed ip_vs_pe_sip nf_conntrack_sip ip_vs_nq ip_vs_lc ip_vs_lblcr ip_vs_lblc ip_vs_ftp ip_vs_dh nf_nat_ftp nf_conntrack_ftp iptable_raw xt_recent nf_log_ipv6 xt_hl ip6t_rt nf_log_ipv4 nf_log_common xt_LOG xt_limit xt_TCPMSS xt_tcpmss vxlan ip6_udp_tunnel udp_tunnel xt_statistic xt_NFLOG nfnetlink_log dummy xt_mark xt_REDIRECT nf_nat_redirect raw_diag udp_diag tcp_diag inet_diag netlink_diag af_packet_diag unix_diag rpcsec_gss_krb5 xt_addrtype ip6t_rpfilter ipt_REJECT nf_reject_ipv4 ip6t_REJECT nf_reject_ipv6 ebtable_nat ebtable_broute nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_mangle ip6table_raw nfsv4 > > [13108726.327173] dns_resolver cls_u32 binfmt_misc arptable_filter arp_tables ip6table_filter ip6_tables devlink fuse_kio_pcs ipt_MASQUERADE nf_nat_masquerade_ipv4 xt_nat iptable_nat nf_nat_ipv4 xt_comment nf_conntrack_ipv4 nf_defrag_ipv4 xt_wdog_tmo xt_multiport bonding xt_set xt_conntrack iptable_filter iptable_mangle kpatch(O) ebtable_filter ebt_among ebtables ip_set_hash_ip ip_set nfnetlink vfat fat skx_edac intel_powerclamp coretemp intel_rapl iosf_mbi kvm_intel kvm irqbypass fuse pcspkr ses enclosure joydev sg mei_me hpwdt hpilo lpc_ich mei ipmi_si shpchp ipmi_devintf ipmi_msghandler xt_ipvs acpi_power_meter ip_vs_rr nfsv3 nfsd auth_rpcgss nfs_acl nfs lockd grace fscache nf_nat cls_fw sch_htb sch_cbq sch_sfq ip_vs em_u32 nf_conntrack tun br_netfilter veth overlay ip6_vzprivnet ip6_vznetstat ip_vznetstat > > [13108726.327817] ip_vzprivnet vziolimit vzevent vzlist vzstat vznetstat vznetdev vzmon vzdev bridge pio_kaio pio_nfs pio_direct pfmt_raw pfmt_ploop1 ploop ip_tables ext4 mbcache jbd2 sd_mod crc_t10dif crct10dif_generic mgag200 i2c_algo_bit drm_kms_helper scsi_transport_iscsi 8021q syscopyarea sysfillrect garp sysimgblt fb_sys_fops mrp stp ttm llc bnx2x crct10dif_pclmul crct10dif_common crc32_pclmul crc32c_intel drm dm_multipath ghash_clmulni_intel uas aesni_intel lrw gf128mul glue_helper ablk_helper cryptd tg3 smartpqi scsi_transport_sas mdio libcrc32c i2c_core usb_storage ptp pps_core wmi sunrpc dm_mirror dm_region_hash dm_log dm_mod [last unloaded: kpatch_cumulative_82_0_r1] > > [13108726.328403] CPU: 35 PID: 63742 Comm: nfsd ve: 51332 Kdump: loaded Tainted: G W O ------------ 3.10.0-862.20.2.vz7.73.29 #1 73.29 > > [13108726.328491] Hardware name: HPE ProLiant DL360 Gen10/ProLiant DL360 Gen10, BIOS U32 10/02/2018 > > [13108726.328554] task: ffffa0a6a41b1160 ti: ffffa0c2a74bc000 task.ti: ffffa0c2a74bc000 > > [13108726.328610] RIP: 0010:[] [] svcauth_unix_set_client+0x2ab/0x520 [sunrpc] > > [13108726.328706] RSP: 0018:ffffa0c2a74bfd80 EFLAGS: 00010246 > > [13108726.328750] RAX: 0000000000000001 RBX: ffffa0a6183ae000 RCX: 0000000000000000 > > [13108726.328811] RDX: 0000000000000074 RSI: 0000000000000286 RDI: ffffa0c2a74bfcf0 > > [13108726.328864] RBP: ffffa0c2a74bfe00 R08: ffffa0bab8c22960 R09: 0000000000000001 > > [13108726.328916] R10: 0000000000000001 R11: 0000000000000001 R12: ffffa0a32aa7f000 > > [13108726.328969] R13: ffffa0a6183afac0 R14: ffffa0c233d88d00 R15: ffffa0c2a74bfdb4 > > [13108726.329022] FS: 0000000000000000(0000) GS:ffffa0e17f9c0000(0000) knlGS:0000000000000000 > > [13108726.329081] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > > [13108726.332311] CR2: 0000000000000074 CR3: 00000026a1b28000 CR4: 00000000007607e0 > > [13108726.334606] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > > [13108726.336754] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > > [13108726.338908] PKRU: 00000000 > > [13108726.341047] Call Trace: > > [13108726.343074] [] ? groups_alloc+0x34/0x110 > > [13108726.344837] [] svc_set_client+0x24/0x30 [sunrpc] > > [13108726.346631] [] svc_process_common+0x241/0x710 [sunrpc] > > [13108726.348332] [] svc_process+0x103/0x190 [sunrpc] > > [13108726.350016] [] nfsd+0xdf/0x150 [nfsd] > > [13108726.351735] [] ? nfsd_destroy+0x80/0x80 [nfsd] > > [13108726.353459] [] kthread+0xd1/0xe0 > > [13108726.355195] [] ? create_kthread+0x60/0x60 > > [13108726.356896] [] ret_from_fork_nospec_begin+0x7/0x21 > > [13108726.358577] [] ? create_kthread+0x60/0x60 > > [13108726.360240] Code: 4c 8b 45 98 0f 8e 2e 01 00 00 83 f8 fe 0f 84 76 fe ff ff 85 c0 0f 85 2b 01 00 00 49 8b 50 40 b8 01 00 00 00 48 89 93 d0 1a 00 00 0f c1 02 83 c0 01 83 f8 01 0f 8e 53 02 00 00 49 8b 44 24 38 > > [13108726.363769] RIP [] svcauth_unix_set_client+0x2ab/0x520 [sunrpc] > > [13108726.365530] RSP > > [13108726.367179] CR2: 0000000000000074 > > > > Fixes: d58431eacb22 ("sunrpc: don't mark uninitialised items as VALID.") > > Signed-off-by: Pavel Tikhomirov > > > > --- > > net/sunrpc/cache.c | 6 ------ > > 1 file changed, 6 deletions(-) > > > > diff --git a/net/sunrpc/cache.c b/net/sunrpc/cache.c > > index a349094f6fb7..f740cb51802a 100644 > > --- a/net/sunrpc/cache.c > > +++ b/net/sunrpc/cache.c > > @@ -53,9 +53,6 @@ static void cache_init(struct cache_head *h, struct cache_detail *detail) > > h->last_refresh = now; > > } > > > > -static inline int cache_is_valid(struct cache_head *h); > > -static void cache_fresh_locked(struct cache_head *head, time_t expiry, > > - struct cache_detail *detail); > > static void cache_fresh_unlocked(struct cache_head *head, > > struct cache_detail *detail); > > > > @@ -105,9 +102,6 @@ static struct cache_head *sunrpc_cache_add_entry(struct cache_detail *detail, > > if (cache_is_expired(detail, tmp)) { > > hlist_del_init_rcu(&tmp->cache_list); > > detail->entries --; > > - if (cache_is_valid(tmp) == -EAGAIN) > > - set_bit(CACHE_NEGATIVE, &tmp->flags); > > - cache_fresh_locked(tmp, 0, detail); > > freeme = tmp; > > break; > > } > > > > -- > Best regards, Tikhomirov Pavel > Software Developer, Virtuozzo.