X-Mailing-List: linux-nfs@vger.kernel.org
From: Jaroslav Pulchart
Date: Fri, 24 May 2024 10:21:27 +0200
Subject: Re: [regression] nfsstat/nfsd crash system "general protection fault, probably for non-canonical address ..." after 6.8.9->6.8.10 update
To: Chuck Lever
Cc: "linux-nfs@vger.kernel.org", Igor Raits, Jeff Layton, Josef Bacik

>
> On Wed, May 22, 2024 at 04:36:57AM -0400, Jaroslav Pulchart wrote:
> > Hello,
> >
> > I would like to report some issue causing a "general protection fault"
> > crash (constantly) after we updated the kernel from 6.8.9 to 6.8.10.
> > This is triggered when monitoring is using nfsstat on a server where
> > nfsd is running.
> >
> > [ 3049.260633] general protection fault, probably for non-canonical
> > address 0x66fb103e19e9cc89: 0000 [#1] PREEMPT SMP NOPTI
> > [ 3049.261628] CPU: 22 PID: 74991 Comm: nfsstat Tainted: G
> > E 6.8.10-1.gdc.el9.x86_64 #1
> > [ 3049.262336] Hardware name: RDO OpenStack Compute/RHEL, BIOS
> > edk2-20240214-2.el9 02/14/2024
> > [ 3049.263003] RIP: 0010:_raw_spin_lock_irqsave+0x19/0x40
> > [ 3049.263487] Code: cc 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90
> > 90 0f 1f 44 00 00 41 54 9c 41 5c fa 65 ff 05 a6 92 f5 42 31 c0 ba 01
> > 00 00 00 <f0> 0f b1 17 75 0a 4c 89 e0 41 5c c3 cc cc cc cc 89 c6 e8 d0
> > 07 00
> > [ 3049.264882] RSP: 0018:ffffb1bca6b9bd00 EFLAGS: 00010046
> > [ 3049.265365] RAX: 0000000000000000 RBX: 66fb103e19e9c989 RCX: 0000000000000001
> > [ 3049.265953] RDX: 0000000000000001 RSI: 0000000000000001 RDI: 66fb103e19e9cc89
> > [ 3049.266542] RBP: ffffffffc15df280 R08: 0000000000000001 R09: ffffa049a1785cb8
> > [ 3049.267112] R10: ffffb1bca6b9bd70 R11: ffffa04964e49000 R12: 0000000000000246
> > [ 3049.267702] R13: 66fb103e19e9cc89 R14: ffffa048445590a0 R15: 0000000000000001
> > [ 3049.268278] FS: 00007fa3ddf03740(0000) GS:ffffa05703d00000(0000)
> > knlGS:0000000000000000
> > [ 3049.268928] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [ 3049.269443] CR2: 00007fa3dddfca50 CR3: 0000000342d1e004 CR4: 0000000000770ef0
> > [ 3049.270025] PKRU: 55555554
> > [ 3049.270371] Call Trace:
> > [ 3049.270723] <TASK>
> > [ 3049.271035] ? die_addr+0x33/0x90
> > [ 3049.271423] ? exc_general_protection+0x1ea/0x450
> > [ 3049.271879] ? asm_exc_general_protection+0x22/0x30
> > [ 3049.272344] ? _raw_spin_lock_irqsave+0x19/0x40
> > [ 3049.272803] __percpu_counter_sum+0xd/0x70
> > [ 3049.273219] nfsd_show+0x4f/0x1d0 [nfsd]
> > [ 3049.273666] seq_read_iter+0x11d/0x4d0
> > [ 3049.274073] ? avc_has_perm+0x42/0xc0
> > [ 3049.274489] seq_read+0xfe/0x140
> > [ 3049.274866] proc_reg_read+0x56/0xa0
> > [ 3049.275257] vfs_read+0xa7/0x340
> > [ 3049.275647] ? __do_sys_newfstat+0x57/0x60
> > [ 3049.276059] ksys_read+0x5f/0xe0
> > [ 3049.276439] do_syscall_64+0x5e/0x170
> > [ 3049.276836] entry_SYSCALL_64_after_hwframe+0x78/0x80
> > [ 3049.277296] RIP: 0033:0x7fa3ddcfd9b2
> > [ 3049.277719] Code: c0 e9 b2 fe ff ff 50 48 8d 3d ea 1d 0c 00 e8 c5
> > fd 01 00 0f 1f 44 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75
> > 10 0f 05 <48> 3d 00 f0 ff ff 77 56 c3 0f 1f 44 00 00 48 83 ec 28 48 89
> > 54 24
> > [ 3049.279139] RSP: 002b:00007ffd930672e8 EFLAGS: 00000246 ORIG_RAX:
> > 0000000000000000
> > [ 3049.279788] RAX: ffffffffffffffda RBX: 0000555ded47c2a0 RCX: 00007fa3ddcfd9b2
> > [ 3049.280402] RDX: 0000000000000400 RSI: 0000555ded47c480 RDI: 0000000000000003
> > [ 3049.281046] RBP: 00007fa3dddf75e0 R08: 0000000000000003 R09: 0000000000000077
> > [ 3049.281673] R10: 000000000000005d R11: 0000000000000246 R12: 0000555ded47c2a0
> > [ 3049.282307] R13: 0000000000000d68 R14: 00007fa3dddf69e0 R15: 0000000000000d68
> > [ 3049.282928] </TASK>
> > [ 3049.283310] Modules linked in: mptcp_diag(E) xsk_diag(E)
> > raw_diag(E) unix_diag(E) af_packet_diag(E) netlink_diag(E) udp_diag(E)
> > tcp_diag(E) inet_diag(E) tun(E) br_netfilter(E) bridge(E) stp(E)
> > llc(E) nfsd(E) auth_rpcgss(E) nfs_acl(E) lockd(E) grace(E) sunrpc(E)
> > nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) binfmt_misc(E)
> > zram(E) tls(E) isofs(E) vfat(E) fat(E) intel_rapl_msr(E)
> > intel_rapl_common(E) kvm_amd(E) ccp(E) kvm(E) irqbypass(E)
> > virtio_net(E) i2c_i801(E) virtio_gpu(E) i2c_smbus(E) net_failover(E)
> > virtio_balloon(E) failover(E) virtio_dma_buf(E) fuse(E) ext4(E)
> > mbcache(E) jbd2(E) sr_mod(E) cdrom(E) sg(E) ahci(E) libahci(E)
> > crct10dif_pclmul(E) crc32_pclmul(Ea) polyval_clmulni(E)
> > polyval_generic(E) libata(E) ghash_clmulni_intel(E) sha512_ssse3(E)
> > virtio_blk(E) serio_raw(E) btrfs(E) xor(E) zstd_compress(E)
> > raid6_pq(E) libcrc32c(E) crc32c_intel(E) dm_mirror(E)
> > dm_region_hash(E) dm_log(E) dm_mod(E)
> > [ 3049.283345] Unloaded tainted modules: edac_mce_amd(E):1 padlock_aes(E)
> >
> > Any suggestion on how to fix it is appreciated.
>
> Bisect between v6.8.9 and v6.8.10 would give us the exact point
> where the failures were introduced.
>
> I see that GregKH pulled in:
>
> 26a0ddb04230 ("nfsd: rename NFSD_NET_* to NFSD_STATS_*")
> b7b05f98f3f0 ("nfsd: expose /proc/net/sunrpc/nfsd in net namespaces")
> abf5fb593c90 ("nfsd: make all of the nfsd stats per-network namespace")
>
> for v6.8.10 as a Stable-Dep-of: 18180a4550d0 ("NFSD: Fix nfsd4_encode_fattr4() crasher")
>
> Which is a little baffling, I don't see how those two change sets
> are mechanically related to each other. But I suspect the culprit is
> one of those three stat-related patches.
>
>
> --
> Chuck Lever

Hello,

I ran a bisect. The problem was easy to reproduce: simply running
"nfsstat" from a terminal hung the server. The result:

abf5fb593c90d3ab55d6cf1dea7bec8ee0bf3566 is the first bad commit

$ git bisect bad
abf5fb593c90d3ab55d6cf1dea7bec8ee0bf3566 is the first bad commit
commit abf5fb593c90d3ab55d6cf1dea7bec8ee0bf3566 (HEAD)
Author: Josef Bacik
Date:   Fri Jan 26 10:39:47 2024 -0500

    nfsd: make all of the nfsd stats per-network namespace

    [ Upstream commit 4b14885411f74b2b0ce0eb2b39d0fffe54e5ca0d ]

    We have a global set of counters that we modify for all of the nfsd
    operations, but now that we're exposing these stats across all network
    namespaces we need to make the stats also be per-network namespace. We
    already have some caching stats that are per-network namespace, so move
    these definitions into the same counter and then adjust all the helpers
    and users of these stats to provide the appropriate nfsd_net struct so
    that the stats are maintained for the per-network namespace objects.

    Signed-off-by: Josef Bacik
    Reviewed-by: Jeff Layton
    Signed-off-by: Chuck Lever
    Stable-dep-of: 18180a4550d0 ("NFSD: Fix nfsd4_encode_fattr4() crasher")
    Signed-off-by: Sasha Levin

 fs/nfsd/cache.h     |  2 --
 fs/nfsd/netns.h     | 17 +++++++++++++++--
 fs/nfsd/nfs4proc.c  |  6 +++---
 fs/nfsd/nfs4state.c |  3 ++-
 fs/nfsd/nfscache.c  | 36 +++++++-----------------------------
 fs/nfsd/nfsctl.c    | 12 +++---------
 fs/nfsd/nfsfh.c     |  3 ++-
 fs/nfsd/stats.c     | 26 ++++++++++++++------------
 fs/nfsd/stats.h     | 54 +++++++++++++++++++-----------------------------------
 fs/nfsd/vfs.c       |  6 ++++--
 10 files changed, 69 insertions(+), 96 deletions(-)

$ git bisect log
git bisect start
# status: waiting for both good and bad commits
# good: [f3d61438b613b87afb63118bea6fb18c50ba7a6b] Linux 6.8.9
git bisect good f3d61438b613b87afb63118bea6fb18c50ba7a6b
# status: waiting for bad commit, 1 good commit known
# bad: [a0c69a570e420e86c7569b8c052913213eef2b45] Linux 6.8.10
git bisect bad a0c69a570e420e86c7569b8c052913213eef2b45
# bad: [4aaed9dbe8acd2b6114458f0498a617283d6275b] hv_netvsc: Don't free decrypted memory
git bisect bad 4aaed9dbe8acd2b6114458f0498a617283d6275b
# bad: [ee190d04c2f99c8e557b00e997621c04592baed1] net: gro: add flush check in udp_gro_receive_segment
git bisect bad ee190d04c2f99c8e557b00e997621c04592baed1
# bad: [781e34b736014188ba9e46a71535237313dcda81] efi/unaccepted: touch soft lockup during memory accept
git bisect bad 781e34b736014188ba9e46a71535237313dcda81
# bad: [6a7b07689af6e4e023404bf69b1230f43b2a15bc] NFSD: Fix nfsd4_encode_fattr4() crasher
git bisect bad 6a7b07689af6e4e023404bf69b1230f43b2a15bc
# good: [e05194baae299f2148ab5f6bab659c6ce8d1f6d3] nfs: expose /proc/net/sunrpc/nfs in net namespaces
git bisect good e05194baae299f2148ab5f6bab659c6ce8d1f6d3
# good: [946ab150335d92f852288c1c6b0f0466b5d6e97f] power: supply: mt6360_charger: Fix of_match for usb-otg-vbus regulator
git bisect good 946ab150335d92f852288c1c6b0f0466b5d6e97f
# good: [b7b05f98f3f06fea3986b46e5c7fe2928676b02d] nfsd: expose /proc/net/sunrpc/nfsd in net namespaces
git bisect good b7b05f98f3f06fea3986b46e5c7fe2928676b02d
# bad: [0e8003af77879572dbc1df56860cbe2bfa8498f0] NFSD: add support for CB_GETATTR callback
git bisect bad 0e8003af77879572dbc1df56860cbe2bfa8498f0
# bad: [abf5fb593c90d3ab55d6cf1dea7bec8ee0bf3566] nfsd: make all of the nfsd stats per-network namespace
git bisect bad abf5fb593c90d3ab55d6cf1dea7bec8ee0bf3566
# first bad commit: [abf5fb593c90d3ab55d6cf1dea7bec8ee0bf3566] nfsd: make all of the nfsd stats per-network namespace
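
For reference, a small sketch of why the kernel labels the faulting value
"non-canonical". It assumes 48-bit virtual addresses (4-level paging), which
the report does not state; with 5-level paging the width would be 57. The
helper name is_canonical and the constant names are only for illustration,
and the values are copied from the register dump above. RDI is the first
argument register on x86-64, so this is the pointer that
_raw_spin_lock_irqsave was handed.

```python
# Sketch: test whether a 64-bit value is a canonical x86-64 virtual address.
# ASSUMPTION: 48-bit virtual addresses (4-level paging). Not stated in the report.
VA_BITS = 48


def is_canonical(addr: int, va_bits: int = VA_BITS) -> bool:
    """Return True if bits 63..(va_bits - 1) of addr are all equal."""
    top = addr >> (va_bits - 1)               # bit va_bits-1 and everything above it
    all_ones = (1 << (64 - va_bits + 1)) - 1  # the same field with every bit set
    return top == 0 or top == all_ones


# Values copied from the oops in this thread.
FAULT_ADDR = 0x66FB103E19E9CC89  # RDI and R13, also the address in the fault line
RBX = 0x66FB103E19E9C989
RSP = 0xFFFFB1BCA6B9BD00         # kernel stack pointer, shown for comparison

if __name__ == "__main__":
    for name, value in [("RDI", FAULT_ADDR), ("RBX", RBX), ("RSP", RSP)]:
        print(f"{name} = {value:#018x} canonical={is_canonical(value)}")
    print(f"RDI - RBX = {FAULT_ADDR - RBX:#x}")
```

Under that assumption RDI and RBX are both non-canonical while RSP is a
normal kernel address, so the lock pointer itself looks like garbage rather
than a valid pointer to a bad object. RDI is RBX plus 0x300; what RBX holds
at that point is not something the trace alone tells us.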