Subject: Re: [PATCH] NFS: Fix O_DIRECT read problem when another write is going on
To: Trond Myklebust
CC: "linux-nfs@vger.kernel.org", "linux-kernel@vger.kernel.org"
References: <1569834678-16117-1-git-send-email-suyj.fnst@cn.fujitsu.com>
From: Su Yanjun
Date: Mon, 7 Oct 2019 10:17:16 +0800
X-Mailing-List: linux-nfs@vger.kernel.org

On 2019/10/1 2:06, Trond Myklebust wrote:
> Hi Su,
>
> On Mon, 2019-09-30 at 17:11 +0800, Su Yanjun wrote:
>> In xfstests generic/465 tests failed. Because O_DIRECT r/w use
>> async rpc calls, when r/w rpc calls are running concurrently we
>> may read partial data which is wrong.
>>
>> For example as follows.
>>      user buffer
>>    /--------\
>>    |    |XXXX|
>>     rpc0 rpc1
>>
>> When rpc0 runs it encounters eof so return 0, then another writes
>> something. When rpc1 runs it returns some data. The total data
>> buffer contains wrong data.
>>
>> In this patch we check eof mark for each direct request. If encounters
>> eof then set eof mark in the request, when we meet it again report
>> -EAGAIN error. In nfs_direct_complete we convert -EAGAIN as if read
>> nothing. When the reader issue another read it will read ok.
>>
>> Signed-off-by: Su Yanjun
>> ---
>>  fs/nfs/direct.c | 14 +++++++++++++-
>>  1 file changed, 13 insertions(+), 1 deletion(-)
>>
>> diff --git a/fs/nfs/direct.c b/fs/nfs/direct.c
>> index 222d711..7f737a3 100644
>> --- a/fs/nfs/direct.c
>> +++ b/fs/nfs/direct.c
>> @@ -93,6 +93,7 @@ struct nfs_direct_req {
>>                                  bytes_left,     /* bytes left to be sent */
>>                                  error;          /* any reported error */
>>          struct completion       completion;     /* wait for i/o completion */
>> +        int                     eof;            /* eof mark in the req */
>>
>>          /* commit state */
>>          struct nfs_mds_commit_info mds_cinfo;   /* Storage for cinfo */
>> @@ -380,6 +381,12 @@ static void nfs_direct_complete(struct nfs_direct_req *dreq)
>>  {
>>          struct inode *inode = dreq->inode;
>>
>> +        /* read partial data just as read nothing */
>> +        if (dreq->error == -EAGAIN) {
>> +                dreq->count = 0;
>> +                dreq->error = 0;
>> +        }
>> +
>>          inode_dio_end(inode);
>>
>>          if (dreq->iocb) {
>> @@ -413,8 +420,13 @@ static void nfs_direct_read_completion(struct nfs_pgio_header *hdr)
>>          if (hdr->good_bytes != 0)
>>                  nfs_direct_good_bytes(dreq, hdr);
>>
>> -        if (test_bit(NFS_IOHDR_EOF, &hdr->flags))
>> +        if (dreq->eof)
>> +                dreq->error = -EAGAIN;
>> +
>> +        if (test_bit(NFS_IOHDR_EOF, &hdr->flags)) {
>>                  dreq->error = 0;
>> +                dreq->eof = 1;
>> +        }
>>
>>          spin_unlock(&dreq->lock);
>>
> Thanks for looking into this issue. I agree with your analysis of what
> is going wrong in generic/465.
>
> However, I think the problem is greater than just EOF. I think we also
> need to look at the generic error handling, and ensure that it handles
> a truncated RPC call in the middle of a series of calls correctly.
>
> Please see the two patches I sent you just now and check if they fix
> the problem for you.

The patchset you sent works for generic/465.

Thanks a lot.
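
For reference, the EOF check in the quoted patch can be modelled in user space roughly as below. This is only a sketch under stated assumptions, not the kernel code: struct model_dreq and the model_* functions are hypothetical stand-ins for the three nfs_direct_req fields and the two functions the patch touches, nfs_direct_good_bytes() is reduced to a plain addition, locking is left out, and the way the final result is derived from count and error is a simplification.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the nfs_direct_req fields used by the patch. */
struct model_dreq {
	long count;	/* bytes accumulated so far */
	long error;	/* any reported error */
	int eof;	/* eof mark in the req */
};

/*
 * Models the patched part of nfs_direct_read_completion() for one RPC
 * reply. good_bytes and hdr_eof stand in for hdr->good_bytes and the
 * NFS_IOHDR_EOF flag.
 */
void model_read_completion(struct model_dreq *dreq, long good_bytes, int hdr_eof)
{
	if (good_bytes != 0)
		dreq->count += good_bytes;	/* simplified nfs_direct_good_bytes() */

	/* An earlier RPC already saw EOF: flag the request for retry. */
	if (dreq->eof)
		dreq->error = -EAGAIN;

	if (hdr_eof) {
		dreq->error = 0;
		dreq->eof = 1;
	}
}

/* Models the patched part of nfs_direct_complete(); returns the read result. */
long model_complete(struct model_dreq *dreq)
{
	/* read partial data just as read nothing */
	if (dreq->error == -EAGAIN) {
		dreq->count = 0;
		dreq->error = 0;
	}

	assert(dreq->count >= 0);
	/* Assumed simplification: a non-zero byte count wins over the error. */
	return dreq->count != 0 ? dreq->count : dreq->error;
}

/* Replays a direct read split into two RPCs, each reporting (bytes, eof). */
long model_two_rpc_read(long bytes0, int eof0, long bytes1, int eof1)
{
	struct model_dreq dreq = { 0, 0, 0 };

	model_read_completion(&dreq, bytes0, eof0);
	model_read_completion(&dreq, bytes1, eof1);
	return model_complete(&dreq);
}
```

With rpc0 reporting 0 bytes plus EOF and rpc1 reporting 4 bytes, model_two_rpc_read(0, 1, 4, 0) yields 0, so the partial read is reported as if nothing was read and the reader can issue the read again. This only covers the EOF case, not a truncated RPC in the middle of a series of calls.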