Received: by 2002:ac0:a581:0:0:0:0:0 with SMTP id m1-v6csp2463212imm; Sat, 23 Jun 2018 19:40:41 -0700 (PDT) X-Google-Smtp-Source: ADUXVKKACPGt8OZh46PbqzmMdiPanmMXWF6Gqfhdz1Zo4d/71wRLMGV389iM0u9xG/cQ821i0D4M X-Received: by 2002:a62:32c4:: with SMTP id y187-v6mr3497568pfy.241.1529808041236; Sat, 23 Jun 2018 19:40:41 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1529808041; cv=none; d=google.com; s=arc-20160816; b=Dfi0BXmGUDfJ5CqmYI1ViDOTrXtyonTSDhws8GVl6rO0UTGWiaNA9BB82BDZYdsE08 BmoJHJjZGvxcwhSg8Ku+8MaFvn+amHNao8lJNnbmqSnbkSU119ZibhekoQj4fCaw3aq0 vl6/GKBTsxtaaxo9PCV1pDFMIp2Rw6+d0BkWBgwa6jpipe7MP8tuTFwaCekt2TlJORQb YRy/Ust+BYXDMU2dg9/yewUnQwAewBKnKwYKPB5T5/j39RYnWW2zio4fnK+kpor/R2kl rLHHMoXlS1dgajl1ha3XZyDV2vbIhhO0o3ZpMJBJQghHsNy9YxTSiC18AAAiG8cVh8s0 sMGQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding :content-language:in-reply-to:mime-version:user-agent:date :message-id:from:references:to:subject:arc-authentication-results; bh=zenLbPpPg/TrAhaL0dro0Ym3lKDsVm8hoDadsPL5z3M=; b=Zhxaf91gglGS5OuklfBCRQM8DhlauEtgZzi7lM5hiBoQLvoRMwEhsajAyKKvjEftwe zPJKBz3S4AiJQusaLBY4GU2z7xrNO7+Yo82v25igA/+n+cQkmYdICFqk7NAtXDGV4Zuk 3yMvfb+IWrXA3Wcp+kK0apyEVuSy1lGCGM5lnLs8QkqUs56WKkw2UgpBV9ahl9kV58RN Yr4j4tZoLztnYYXrmF0ZoYMi4qytAuyDougRjHuCA8cPTPhZBlAPESn3IaM4Q9yzkK44 igN3cWC6BFTqxMAAENhOKihf2QAPmZbR97M0fbvERG67s5Ikhj9zkqtL1vVf39rft4uX N9LA== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id b83-v6si11834906pfk.342.2018.06.23.19.40.26; Sat, 23 Jun 2018 19:40:41 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752182AbeFXCjq (ORCPT + 99 others); Sat, 23 Jun 2018 22:39:46 -0400 Received: from p3plsmtpa12-02.prod.phx3.secureserver.net ([68.178.252.231]:38623 "EHLO p3plsmtpa12-02.prod.phx3.secureserver.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752072AbeFXCjp (ORCPT ); Sat, 23 Jun 2018 22:39:45 -0400 Received: from [192.168.0.67] ([24.218.182.144]) by :SMTPAUTH: with ESMTPSA id WuwJf98qsJu7xWuwKfl3M3; Sat, 23 Jun 2018 19:39:44 -0700 Subject: Re: [Patch v2 13/15] CIFS: Add support for direct I/O read To: longli@microsoft.com, Steve French , linux-cifs@vger.kernel.org, samba-technical@lists.samba.org, linux-kernel@vger.kernel.org, linux-rdma@vger.kernel.org References: <20180530194807.31657-1-longli@linuxonhyperv.com> <20180530194807.31657-14-longli@linuxonhyperv.com> From: Tom Talpey Message-ID: Date: Sat, 23 Jun 2018 22:39:44 -0400 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:52.0) Gecko/20100101 Thunderbird/52.8.0 MIME-Version: 1.0 In-Reply-To: <20180530194807.31657-14-longli@linuxonhyperv.com> Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 7bit X-CMAE-Envelope: MS4wfFqz/LEyCAiZ3mwtULc00hg0oXkEqGnY4+LG3XsPqxFC2YGYNQ1KttJDMV8LliHSfAZCGn3QHPLp0pJZaEL/618lDDHp9TV2nIY6gGNi5DoqVTgZBL6Z 8hV7hsDfcjHHyIJHCD812OPyvigoh3yrenoZHhz2GIqNM7OyqMuPvF1dkkA+iZCkkK8FHhHE4Ao2v3jULkD3l3wRc4buW1CUSSgH7of/pQ3ZiBMsc2iz5FhF RXhs6IVDt5djY56Iqh1PB9dSMOA4z/cF0WJRL/5Hw7WD4aiZ+tv+P/6+r4B5DiHcKA+25t1Fgyams/cPJZb6g28K5eOfrJILgJ+Ls/oEC/bctyIwkAsPgsX8 1UHsnVE4 Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 5/30/2018 3:48 PM, Long Li wrote: > From: Long Li > > Implement the function for direct I/O read. It doesn't support AIO, which > will be implemented in a follow up patch. > > Signed-off-by: Long Li > --- > fs/cifs/cifsfs.h | 1 + > fs/cifs/file.c | 149 +++++++++++++++++++++++++++++++++++++++++++++++++++++++ > 2 files changed, 150 insertions(+) > > diff --git a/fs/cifs/cifsfs.h b/fs/cifs/cifsfs.h > index 5f02318..7fba9aa 100644 > --- a/fs/cifs/cifsfs.h > +++ b/fs/cifs/cifsfs.h > @@ -102,6 +102,7 @@ extern int cifs_open(struct inode *inode, struct file *file); > extern int cifs_close(struct inode *inode, struct file *file); > extern int cifs_closedir(struct inode *inode, struct file *file); > extern ssize_t cifs_user_readv(struct kiocb *iocb, struct iov_iter *to); > +extern ssize_t cifs_direct_readv(struct kiocb *iocb, struct iov_iter *to); > extern ssize_t cifs_strict_readv(struct kiocb *iocb, struct iov_iter *to); > extern ssize_t cifs_user_writev(struct kiocb *iocb, struct iov_iter *from); > extern ssize_t cifs_strict_writev(struct kiocb *iocb, struct iov_iter *from); > diff --git a/fs/cifs/file.c b/fs/cifs/file.c > index 87eece6..e6e6f24 100644 > --- a/fs/cifs/file.c > +++ b/fs/cifs/file.c > @@ -2955,6 +2955,18 @@ cifs_read_allocate_pages(struct cifs_readdata *rdata, unsigned int nr_pages) > return rc; > } > > +static void cifs_direct_readdata_release(struct kref *refcount) > +{ > + struct cifs_readdata *rdata = container_of(refcount, > + struct cifs_readdata, refcount); > + unsigned int i; > + > + for (i = 0; i < rdata->nr_pages; i++) > + put_page(rdata->pages[i]); > + > + cifs_readdata_release(refcount); > +} > + > static void > cifs_uncached_readdata_release(struct kref *refcount) > { > @@ -3267,6 +3279,143 @@ collect_uncached_read_data(struct cifs_aio_ctx *ctx) > complete(&ctx->done); > } > > +static void cifs_direct_readv_complete(struct work_struct *work) > +{ > + struct cifs_readdata *rdata = > + container_of(work, struct cifs_readdata, work); > + > + complete(&rdata->done); > + kref_put(&rdata->refcount, cifs_direct_readdata_release); > +} > + > +ssize_t cifs_direct_readv(struct kiocb *iocb, struct iov_iter *to) > +{ > + size_t len, cur_len, start; > + unsigned int npages, rsize, credits; > + struct file *file; > + struct cifs_sb_info *cifs_sb; > + struct cifsFileInfo *cfile; > + struct cifs_tcon *tcon; > + struct page **pagevec; > + ssize_t rc, total_read = 0; > + struct TCP_Server_Info *server; > + loff_t offset = iocb->ki_pos; > + pid_t pid; > + struct cifs_readdata *rdata; > + > + /* > + * iov_iter_get_pages_alloc() doesn't work with ITER_KVEC, > + * fall back to data copy read path > + */ > + if (to->type & ITER_KVEC) { > + cifs_dbg(FYI, "use non-direct cifs_user_readv for kvec I/O\n"); > + return cifs_user_readv(iocb, to); > + } > + > + len = iov_iter_count(to); > + if (!len) > + return 0; > + > + file = iocb->ki_filp; > + cifs_sb = CIFS_FILE_SB(file); > + cfile = file->private_data; > + tcon = tlink_tcon(cfile->tlink); > + server = tcon->ses->server; > + > + if (!server->ops->async_readv) > + return -ENOSYS; > + > + if (cifs_sb->mnt_cifs_flags & CIFS_MOUNT_RWPIDFORWARD) > + pid = cfile->pid; > + else > + pid = current->tgid; > + > + if ((file->f_flags & O_ACCMODE) == O_WRONLY) > + cifs_dbg(FYI, "attempting read on write only file instance\n"); Confusing. Maybe "attempting read on write-only filehandle"? > + > + do { > + rc = server->ops->wait_mtu_credits(server, cifs_sb->rsize, > + &rsize, &credits); > + if (rc) > + break; > + > + cur_len = min_t(const size_t, len, rsize); > + > + rc = iov_iter_get_pages_alloc(to, &pagevec, cur_len, &start); > + if (rc < 0) { > + cifs_dbg(VFS, > + "couldn't get user pages (rc=%zd) iter type %d" > + " iov_offset %lu count %lu\n", > + rc, to->type, to->iov_offset, to->count); > + dump_stack(); > + break; > + } > + > + rdata = cifs_readdata_direct_alloc( > + pagevec, cifs_direct_readv_complete); > + if (!rdata) { > + add_credits_and_wake_if(server, credits, 0); > + rc = -ENOMEM; > + break; > + } > + > + npages = (rc + start + PAGE_SIZE-1) / PAGE_SIZE; > + rdata->nr_pages = npages; > + rdata->page_offset = start; > + rdata->pagesz = PAGE_SIZE; > + rdata->tailsz = npages > 1 ? > + rc-(PAGE_SIZE-start)-(npages-2)*PAGE_SIZE : > + rc; This expression makes my head hurt. Surely it can be simplified, or expressed in a clearer way. > + cur_len = rc; > + > + rdata->cfile = cifsFileInfo_get(cfile); > + rdata->offset = offset; > + rdata->bytes = rc; > + rdata->pid = pid; > + rdata->read_into_pages = cifs_uncached_read_into_pages; > + rdata->copy_into_pages = cifs_uncached_copy_into_pages; > + rdata->credits = credits; > + > + rc = 0; > + if (rdata->cfile->invalidHandle) > + rc = cifs_reopen_file(rdata->cfile, true); > + > + if (!rc) > + rc = server->ops->async_readv(rdata); > + > + if (rc) { This whole rc thing is messy. Initializing to zero, setting only in one case, then testing the result, then setting it again, is twisted. I actually think a goto or two would read much more clearly. > + add_credits_and_wake_if(server, rdata->credits, 0); > + kref_put(&rdata->refcount, > + cifs_direct_readdata_release); > + if (rc == -EAGAIN) > + continue; > + break; It's worth a comment here that this either breaks or continues the entire do {} while (); and btw when it breaks it does *not* return "rc". Again, maybe a goto instead of a break? > + } > + > + wait_for_completion(&rdata->done); > + rc = rdata->result; > + if (rc) { > + kref_put( > + &rdata->refcount, > + cifs_direct_readdata_release); > + if (rc == -EAGAIN) > + continue; > + break; Ditto. > + } > + > + total_read += rdata->got_bytes; > + kref_put(&rdata->refcount, cifs_direct_readdata_release); > + > + iov_iter_advance(to, cur_len); > + len -= cur_len; > + offset += cur_len; > + } while (len); > + > + iocb->ki_pos += total_read; > + > + return total_read; > +} > + > ssize_t cifs_user_readv(struct kiocb *iocb, struct iov_iter *to) > { > struct file *file = iocb->ki_filp; >