Subject: Re: Regression in 5.1.20: Reading long directory fails
From: Chuck Lever
Date: Sun, 8 Sep 2019 12:51:18 -0400
To: Trond Myklebust, "bcodding@redhat.com"
Cc: "tibbs@math.uh.edu", Bruce Fields, "linux@stwm.de",
 Linux NFS Mailing List, "linux-kernel@vger.kernel.org", "km@cm4all.com"
Message-Id: <87C8F695-2729-4A65-9738-2E651A344BA7@oracle.com>
In-Reply-To: <9c3483e089986fa9f5e8af8ea9e552f4313ecd6a.camel@hammerspace.com>
References: <4418877.15LTP4gqqJ@stwm.de>
 <4198657.JbNDGbLXiX@h2o.as.studentenwerk.mhn.de>
 <20190906144837.GD17204@fieldses.org>
 <75F810C6-E99E-40C3-B5E1-34BA2CC42773@oracle.com>
 <1ebf86cff330eb15c02249f0dac415a8aff99f49.camel@hammerspace.com>
 <3B2EEB3C-3305-4A50-A55B-51093A985284@oracle.com>
 <9c3483e089986fa9f5e8af8ea9e552f4313ecd6a.camel@hammerspace.com>
X-Mailing-List: linux-nfs@vger.kernel.org

> On Sep 8, 2019, at 12:47 PM, Trond Myklebust wrote:
>
> On Sun, 2019-09-08 at 11:48 -0400, Chuck Lever wrote:
>>> On Sep 8, 2019, at 11:19 AM, Trond Myklebust <trondmy@hammerspace.com> wrote:
>>>
>>> On Sun, 2019-09-08 at 07:39 -0400, Benjamin Coddington wrote:
>>>> On 6 Sep 2019, at 16:50, Chuck Lever wrote:
>>>>
>>>>>> On Sep 6, 2019, at 4:47 PM, Jason L Tibbitts III <tibbs@math.uh.edu> wrote:
>>>>>>
>>>>>>>>>>> "JBF" == J Bruce Fields writes:
>>>>>>
>>>>>> JBF> Those readdir changes were client-side, right? Based on that I'd
>>>>>> JBF> been assuming a client bug, but maybe it'd be worth getting a full
>>>>>> JBF> packet capture of the readdir reply to make sure it's legit.
>>>>>>
>>>>>> I have been working with bcodding on IRC for the past couple of days on
>>>>>> this. Fortunately I was able to come up with way to fill up a directory
>>>>>> in such a way that it will fail with certainty and as a bonus doesn't
>>>>>> include any user data so I can feel OK about sharing packet captures. I
>>>>>> have a capture alongside a kernel trace of the problematic operation in
>>>>>> https://www.math.uh.edu/~tibbs/nfs/.
>>>>>> Not that I can particularly tell
>>>>>> anything useful from that, but bcodding says that it seems to point to
>>>>>> some issue in sunrpc.
>>>>>>
>>>>>> And because I can easily reproduce this and I was able to do a bisect:
>>>>>>
>>>>>> 2c94b8eca1a26cd46010d6e73a23da5f2e93a19d is the first bad commit
>>>>>> commit 2c94b8eca1a26cd46010d6e73a23da5f2e93a19d
>>>>>> Author: Chuck Lever
>>>>>> Date:   Mon Feb 11 11:25:41 2019 -0500
>>>>>>
>>>>>>     SUNRPC: Use au_rslack when computing reply buffer size
>>>>>>
>>>>>>     au_rslack is significantly smaller than (au_cslack << 2). Using
>>>>>>     that value results in smaller receive buffers. In some cases this
>>>>>>     eliminates an extra segment in Reply chunks (RPC/RDMA).
>>>>>>
>>>>>>     Signed-off-by: Chuck Lever
>>>>>>     Signed-off-by: Anna Schumaker
>>>>>>
>>>>>> :040000 040000 d4d1ce2fbe0035c5bd9df976b8c448df85dcb505 7011a792dfe72ff9cd70d66e45d353f3d7817e3e M net
>>>>>>
>>>>>> But of course, I can't say whether this is the actual bad commit or
>>>>>> whether it just introduced a behavior change which alters the conditions
>>>>>> under which the problem appears.
>>>>>
>>>>> The first place I'd start looking is the XDR constants at the head of
>>>>> fs/nfs/nfs4xdr.c having to do with READDIR.
>>>>>
>>>>> The report of behavior changes with the use of krb5p also makes this
>>>>> commit plausible.
>>>>
>>>> After sprinkling the printk's, we're coming up one word short in the
>>>> receive buffer. I think we're not accounting for the xdr pad of
>>>> buf->pages for NFS4 readdir -- but I need to check the RFCs. Anyone know
>>>> if v4 READDIR results have to be aligned?
>>>>
>>>> Also need to check just why krb5i is the only auth that cares..
>>>>
>>>
>>> I'm not seeing that.
>>> If you look at commit 02ef04e432ba, you'll see
>>> that Chuck did add a 'padding term' to decode_readdir_maxsz in the
>>> NFSv4 case.
>>> The other thing to remember is that a readdir 'dirlist4' entry is
>>> always word aligned (irrespective of the length of the filename), so
>>> there is no padding that needs to be taken into account.
>>>
>>> I think we probably rather want to look at how auth->au_ralign is being
>>> calculated for the case of krb5i. I'm really not understanding why
>>> auth->au_ralign should not take into account the presence of the mic.
>>> Chuck?
>>
>> I'm looking at gss_unwrap_resp_integ():
>>
>> 1971         auth->au_rslack = auth->au_verfsize + 2 + 1 + XDR_QUADLEN(mic.len);
>> 1972         auth->au_ralign = auth->au_verfsize + 2;
>>
>> au_ralign now sets the alignment of the _start_ of the RPC message body.
>> The MIC comes _after_ the RPC message body for krb5i.
>>
>> If Ben is off by one quad, that's not the MIC, which is typically 32
>> octets, isn't it?
>>
>> Maybe some variable-length data item in the returned file attributes is
>> missing an XDR pad.
>
> The only two pieces of variable length data in the readdir payload are
> the file name and the filehandle data. Those might present a problem
> when encoding on the server side, but not when decoding on the client
> side, since they are embedded in the dirlist4 (which, as I said, is
> automatically aligned).

The next thing I'd try, then, is to match the Wireshark-dissected READDIR4
reply that fails with the macros at the top of fs/nfs/nfs4xdr.c and look for
anything that is missing.

> Hmm... One thing that does bother me in both gss_unwrap_resp_integ()
> and gss_unwrap_resp_priv() is that if the seqno does not match, then we
> return EIO. What if we had to retransmit a request, but the server
> managed to squeeze off a reply to the first transmission?
> Note: it should be pretty easy to catch issues such as this, since we
> do have tracepoints for them. That said, it is pretty hard to imagine
> this being the problem here if the bug is always reproducible (since
> retransmissions typically are not).
>
> --
> Trond Myklebust
> Linux NFS client maintainer, Hammerspace
> trond.myklebust@hammerspace.com

--
Chuck Lever