Subject: Re: Spurious instability with NFSoRDMA under moderate load
From: Timo Rothenpieler
To: Chuck Lever III
Cc: Linux NFS Mailing List, linux-rdma
Date: Tue, 10 Aug 2021 23:40:47 +0200
References: <4da3b074-a6be-d83f-ccd4-b151557066aa@rothenpieler.org> <72ECF9E1-1F6E-44AF-850C-536BED898DDD@oracle.com> <9355de20-921c-69e0-e5a4-733b64e125e1@rothenpieler.org> <4BA2A532-9063-4893-AF53-E1DAB06095CC@oracle.com>
In-Reply-To: <4BA2A532-9063-4893-AF53-E1DAB06095CC@oracle.com>
X-Mailing-List: linux-nfs@vger.kernel.org

On 10.08.2021 19:17, Chuck Lever III wrote:
>
> What I see in this data is that the server is reporting
>
> SEQ4_STATUS_CB_PATH_DOWN
>
> and the client is attempting to recover (repeatedly) using
> BIND_CONN_TO_SESSION. But apparently the recovery didn't
> actually work, because the server continues to report a
> callback path problem.
>
> [1712389.125641] nfs41_handle_sequence_flag_errors: "10.110.10.200" (client ID 6765f8600a675814) flags=0x00000001
> [1712389.129264] nfs4_bind_conn_to_session: bind_conn_to_session was successful for server 10.110.10.200!
>
> [1712389.171953] nfs41_handle_sequence_flag_errors: "10.110.10.200" (client ID 6765f8600a675814) flags=0x00000001
> [1712389.178361] nfs4_bind_conn_to_session: bind_conn_to_session was successful for server 10.110.10.200!
>
> [1712389.195606] nfs41_handle_sequence_flag_errors: "10.110.10.200" (client ID 6765f8600a675814) flags=0x00000001
> [1712389.203891] nfs4_bind_conn_to_session: bind_conn_to_session was successful for server 10.110.10.200!
>
> I guess it's time to switch to tracing on the server side
> to see if you can nail down why the server's callback
> requests are failing. On your NFS server, run:
>
> # trace-cmd record -e nfsd -e sunrpc -e rpcgss -e rpcrdma
>
> at roughly the same point during your test that you captured
> the previous client-side trace.

I wonder if reverting 6820bf77864d on the server, to have an easier way
to reproduce this state, would be worth it.

Because it seems like the actual underlying issue is the inability of the
NFS server (and/or client) to reestablish the backchannel if it gets
disconnected for whatever reason?

I have already rebooted the client and everything is working again, so
otherwise I'll have to wait a potentially long time for this to happen
again.
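
For reference, here is a minimal sketch of how the flags value in the quoted client log lines could be decoded into names. It assumes the SEQ4_STATUS_* bit values defined for the SEQUENCE reply in RFC 5661; the helper name decode_sequence_flags is hypothetical and not part of any kernel or nfs-utils tooling.

```python
import re

# SEQ4_STATUS_* bit values for the NFSv4.1 SEQUENCE reply status flags
# (assumed here from RFC 5661; listed in ascending bit order).
SEQ4_STATUS_FLAGS = {
    0x00000001: "SEQ4_STATUS_CB_PATH_DOWN",
    0x00000002: "SEQ4_STATUS_CB_GSS_CONTEXTS_EXPIRING",
    0x00000004: "SEQ4_STATUS_CB_GSS_CONTEXTS_EXPIRED",
    0x00000008: "SEQ4_STATUS_EXPIRED_ALL_STATE_REVOKED",
    0x00000010: "SEQ4_STATUS_EXPIRED_SOME_STATE_REVOKED",
    0x00000020: "SEQ4_STATUS_ADMIN_STATE_REVOKED",
    0x00000040: "SEQ4_STATUS_RECALLABLE_STATE_REVOKED",
    0x00000080: "SEQ4_STATUS_LEASE_MOVED",
    0x00000100: "SEQ4_STATUS_RESTART_RECLAIM_NEEDED",
    0x00000200: "SEQ4_STATUS_CB_PATH_DOWN_SESSION",
    0x00000400: "SEQ4_STATUS_BACKCHANNEL_FAULT",
    0x00000800: "SEQ4_STATUS_DEVID_CHANGED",
    0x00001000: "SEQ4_STATUS_DEVID_DELETED",
}


def decode_sequence_flags(log_line):
    """Hypothetical helper: return the names of all flags set in a log line.

    Looks for a 'flags=0x...' token such as the one printed by
    nfs41_handle_sequence_flag_errors and returns an empty list if none
    is found.
    """
    match = re.search(r"flags=0x([0-9a-fA-F]+)", log_line)
    if match is None:
        return []
    value = int(match.group(1), 16)
    return [name for bit, name in SEQ4_STATUS_FLAGS.items() if value & bit]


line = ('nfs41_handle_sequence_flag_errors: "10.110.10.200" '
        '(client ID 6765f8600a675814) flags=0x00000001')
print(decode_sequence_flags(line))
```

Under these assumptions, the value 0x00000001 seen in every quoted log line corresponds to the callback path being reported down, which matches the diagnosis quoted above.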