Subject: Re: oops in 5.4 on rdma
From: Chuck Lever
Date: Tue, 10 Dec 2019 12:37:18 -0500
To: Olga Kornievskaia
Cc: Linux NFS Mailing List
Message-Id: <4C697DAB-A780-4477-A94E-95B95A66E1A1@oracle.com>
Sender: linux-nfs-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-nfs@vger.kernel.org

> On Dec 10, 2019, at 12:34 PM, Olga Kornievskaia wrote:
>
> Hi Chuck,
>
> Is this known? Running your cel/testing from commit
> 37e235c0128566e9d97741ad1e546b44f324f108

The WARNING does not look familiar, but cel-testing has moved on.
Can you fetch it again?

> I started generic/013 and test hung for long time, got this but then
> test ran successfully.
>
> [ 153.452029] ------------[ cut here ]------------
> [ 153.507281] WARNING: CPU: 14 PID: 975 at
> drivers/infiniband/core/cq.c:310 ib_free_cq_user+0xea/0x100 [ib_core]
> [ 153.626988] Modules linked in: rpcsec_gss_krb5 nfsv4 dns_resolver
> nfs fscache rdma_rxe ip6_udp_tunnel udp_tunnel nfsd auth_rpcgss
> nfs_acl lockd grace xt_CHECKSUM xt_MASQUERADE tun bridge stp llc
> ip6t_rpfilter ipt_REJECT nf_reject_ipv4 ip6t_REJECT nf_reject_ipv6
> xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute ip6table_nat
> ip6table_mangle ip6table_security ip6table_raw iptable_nat nf_nat
> nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 iptable_mangle
> iptable_security iptable_raw ebtable_filter ebtables ip6table_filter
> ip6_tables iptable_filter ib_isert iscsi_target_mod ib_srpt
> target_core_mod ib_srp scsi_transport_srp rpcrdma sunrpc
> intel_rapl_msr intel_rapl_common rdma_ucm x86_pkg_temp_thermal ib_iser
> intel_powerclamp coretemp rdma_cm kvm_intel iw_cm ib_umad ib_ipoib
> libiscsi kvm scsi_transport_iscsi ib_cm irqbypass crct10dif_pclmul
> mlx5_ib crc32_pclmul iTCO_wdt ipmi_ssif ghash_clmulni_intel
> iTCO_vendor_support aesni_intel ib_uverbs crypto_simd ipmi_si cryptd
> ipmi_devintf pcspkr ib_core
> [ 153.627026] glue_helper i2c_i801 sg lpc_ich ipmi_msghandler wmi
> acpi_power_meter ip_tables xfs libcrc32c sd_mod mgag200 drm_kms_helper
> syscopyarea sysfillrect sysimgblt fb_sys_fops drm_vram_helper ttm isci
> mlx5_core libsas igb drm ahci qla2xxx libahci scsi_transport_sas
> libata dca crc32c_intel i2c_algo_bit i2c_core scsi_transport_fc
> pci_hyperv_intf dm_mirror dm_region_hash dm_log dm_mod
> [ 155.086407] CPU: 14 PID: 975 Comm: kworker/u52:0 Not tainted 5.4.0+ #1
> [ 155.164520] Hardware name: FUJITSU PRIMERGY RX200 S7/D3032-A1, BIOS
> V4.6.5.3 R2.29.0 for D3032-A1x 06/18/2018
> [ 155.283237] Workqueue: xprtiod xprt_autoclose [sunrpc]
> [ 155.344725] RIP: 0010:ib_free_cq_user+0xea/0x100 [ib_core]
> [ 155.410365] Code: d7 48 8b 03 48 85 c0 75 e8 e9 6a ff ff ff 48 8d
> 7f 40 e8 89 9a 52 d6 e9 57 ff ff ff 48 8d 7f 40 e8 0b de 86 d6 e9 49
> ff ff ff <0f> 0b 5b 5d 41 5c c3 0f 1f 44 00 00 66 2e 0f 1f 84 00 00 00
> 00 00
> [ 155.635114] RSP: 0018:ffff98e4c6aebda0 EFLAGS: 00010202
> [ 155.697624] RAX: 0000000000000001 RBX: ffff8b85efdb8000 RCX: 0000000000000000
> [ 155.783015] RDX: ffff8b861516ae80 RSI: 0000000000000000 RDI: ffff8b8df0087000
> [ 155.868404] RBP: ffff8b8df0087000 R08: 0000000000000001 R09: 0000000000000000
> [ 155.953795] R10: ffff8b8e1724b000 R11: ffffffffffffffa6 R12: ffff8b85efdb8000
> [ 156.039186] R13: 0000000000000000 R14: ffff8b86071cb000 R15: ffff8b85efdb8448
> [ 156.124577] FS: 0000000000000000(0000) GS:ffff8b861fa00000(0000)
> knlGS:0000000000000000
> [ 156.221405] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 156.290157] CR2: 00007fde99d85000 CR3: 000000025f20a003 CR4: 00000000000606e0
> [ 156.375548] Call Trace:
> [ 156.404805] rpcrdma_ep_destroy+0x43/0x70 [rpcrdma]
> [ 156.463171] rpcrdma_ep_disconnect+0xf2/0x1c0 [rpcrdma]
> [ 156.525683] ? __switch_to_asm+0x34/0x70
> [ 156.572589] ? __switch_to_asm+0x40/0x70
> [ 156.619500] ? __switch_to_asm+0x34/0x70
> [ 156.666409] ? __switch_to_asm+0x40/0x70
> [ 156.713321] ? __switch_to_asm+0x34/0x70
> [ 156.760238] xprt_rdma_close+0x49/0xc0 [rpcrdma]
> [ 156.815481] xprt_autoclose+0x50/0xb0 [sunrpc]
> [ 156.868635] process_one_work+0x171/0x380
> [ 156.916584] worker_thread+0x49/0x3f0
> [ 156.960375] kthread+0xf8/0x130
> [ 156.997926] ? max_active_store+0x80/0x80
> [ 157.045875] ? kthread_bind+0x10/0x10
> [ 157.089665] ret_from_fork+0x35/0x40
> [ 157.132416] ---[ end trace dcd41693526c20ae ]---
>
> Var log also had this:
> Dec 10 12:37:03 localhost kolga: run xfstest generic/013
> Dec 10 12:37:03 localhost journal: run fstests generic/013 at
> 2019-12-10 12:37:03
> Dec 10 12:39:54 localhost kernel: INFO: task kworker/6:2:295 blocked
> for more than 122 seconds.
> Dec 10 12:39:54 localhost kernel: Tainted: G W 5.4.0+ #1
> Dec 10 12:39:54 localhost kernel: "echo 0 >
> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Dec 10 12:39:55 localhost kernel: kworker/6:2 D 0 295 2 0x80004000
> Dec 10 12:39:55 localhost kernel: Workqueue: events xprt_destroy_cb [sunrpc]
> Dec 10 12:39:55 localhost kernel: Call Trace:
> Dec 10 12:39:55 localhost kernel: ? __schedule+0x2d1/0x6c0
> Dec 10 12:39:55 localhost kernel: schedule+0x39/0xa0
> Dec 10 12:39:55 localhost kernel: schedule_timeout+0x1c8/0x290
> Dec 10 12:39:55 localhost kernel: ? tracing_is_on+0x11/0x30
> Dec 10 12:39:55 localhost kernel: ? trace_save_cmdline+0x68/0xd0
> Dec 10 12:39:55 localhost kernel: wait_for_completion+0x123/0x190
> Dec 10 12:39:55 localhost kernel: ? wake_up_q+0x70/0x70
> Dec 10 12:39:55 localhost kernel: __flush_work.isra.35+0x11e/0x1a0
> Dec 10 12:39:55 localhost kernel: ? get_work_pool+0x40/0x40
> Dec 10 12:39:55 localhost kernel: __cancel_work_timer+0x103/0x190
> Dec 10 12:39:55 localhost kernel: xprt_rdma_destroy+0x22/0xb0 [rpcrdma]
> Dec 10 12:39:55 localhost kernel: process_one_work+0x171/0x380
> Dec 10 12:39:55 localhost kernel: worker_thread+0x49/0x3f0
> Dec 10 12:39:55 localhost kernel: kthread+0xf8/0x130
> Dec 10 12:39:55 localhost kernel: ? max_active_store+0x80/0x80
> Dec 10 12:39:55 localhost kernel: ? kthread_bind+0x10/0x10
> Dec 10 12:39:55 localhost kernel: ret_from_fork+0x35/0x40

--
Chuck Lever