Date: Wed, 15 Apr 2020 15:25:42 -0400
From: Bruce Fields
To: Chuck Lever
Cc: Jeff Layton, Linux NFS Mailing List
Subject: Re: GSS unwrapping breaks the DRC
Message-ID: <20200415192542.GA6466@fieldses.org>
X-Mailing-List: linux-nfs@vger.kernel.org

On Wed, Apr 15, 2020 at 01:05:11PM -0400, Chuck Lever wrote:
> Hi Bruce and Jeff:
>
> Testing intensive workloads with NFSv3 and NFSv4.0 on NFS/RDMA with krb5i
> or krb5p results in a pretty quick workload failure. Closer examination
> shows that the client is able to overrun the GSS sequence window with
> some regularity. When that happens, the server drops the connection.
>
> However, when the client retransmits requests with lost replies, they
> never hit in the DRC, and that results in unexpected failures of
> non-idempotent requests.
>
> The retransmitted XIDs are found in the DRC, but the retransmitted request
> has a different checksum than the original. We're hitting the "mismatch"
> case in nfsd_cache_key_cmp for these requests.
>
> I tracked down the problem to the way the DRC computes the length of the
> part of the buffer it wants to checksum. nfsd_cache_csum uses
>
>     head.iov_len + page_len
>
> and then caps that at RC_CSUMLEN.
>
> That works fine for krb5 and sys, but the GSS unwrap functions
> (integ_unwrap_data and priv_unwrap_data) don't appear to update head.iov_len
> properly. So nfsd_cache_csum's length computation is significantly larger
> than the clear-text message, and that allows stale parts of the xdr_buf
> to be included in the checksum.
>
> Using xdr_buf_subsegment() at the end of integ_unwrap_data sets the xdr_buf
> lengths properly and fixes the situation for krb5i.
>
> I don't see a similar solution for priv_unwrap_data: there's no MIC len
> available, and priv_len is not the actual length of the clear-text message.
>
> Moreover, the comment in fix_priv_head() is disturbing. I don't see anywhere
> where the relationship between the buf's head/len and how svc_defer works is
> authoritatively documented. It's not clear exactly how priv_unwrap_data is
> supposed to accommodate svc_defer, or whether integ_unwrap_data also needs
> to accommodate it.
>
> So I can't tell if the GSS unwrap functions are wrong or if there's a more
> accurate way to compute the message length in nfsd_cache_csum. I suspect
> both could use some improvement, but I'm not certain exactly what that
> might be.

I don't know, I tried looking through that code and didn't get any
further than you.

The gss unwrap code does look suspect to me. It needs some kind of
proper design, as it stands it's just an accumulation of fixes.

--b.
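
For anyone following along, the length problem Chuck describes can be reproduced in a toy userspace model. This is only a sketch, not the kernel code: struct toy_buf, toy_fill() and toy_csum() are hypothetical names, the additive checksum merely stands in for whatever nfsd_cache_csum really uses, the RC_CSUMLEN value of 256 is an assumption, and page data is not modelled (page_len is always 0).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RC_CSUMLEN 256u  /* cap on checksummed bytes; value assumed here */

/* Toy stand-in for an xdr_buf head: one flat array plus lengths. */
struct toy_buf {
	unsigned char data[512];
	size_t head_len;  /* what the checksum code believes is valid */
	size_t page_len;  /* pages are not modelled; always 0 */
};

/*
 * Place msg at the start of the buffer, followed by filler bytes up to
 * wire_len (the size of the wrapped request as received). If trim is
 * nonzero, head_len is cut back to the clear-text length, which is what
 * a correct unwrap is expected to do. Otherwise head_len keeps covering
 * the filler, mimicking the behaviour reported in this thread.
 */
void toy_fill(struct toy_buf *buf, const char *msg, size_t wire_len,
	      unsigned char stale, int trim)
{
	size_t msg_len = strlen(msg);

	assert(wire_len <= sizeof(buf->data));
	assert(msg_len <= wire_len);

	memset(buf->data, stale, wire_len);
	memcpy(buf->data, msg, msg_len);
	buf->head_len = trim ? msg_len : wire_len;
	buf->page_len = 0;
}

/* Checksum over min(head_len + page_len, RC_CSUMLEN) bytes. */
uint32_t toy_csum(const struct toy_buf *buf)
{
	size_t len = buf->head_len + buf->page_len;
	uint32_t sum = 0;
	size_t i;

	if (len > RC_CSUMLEN)
		len = RC_CSUMLEN;
	assert(len <= sizeof(buf->data));

	for (i = 0; i < len; i++)
		sum += buf->data[i];
	return sum;
}
```

Filling two buffers with the same message but different filler (say 0xAA and 0x55) and trim = 0 gives two different checksums, which corresponds to the "mismatch" case for a retransmit with the same XID. With trim = 1 the checksums agree, since only the clear-text bytes are summed.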