Date: Fri, 19 Feb 2021 20:06:37 +0200
From: Lennert Buytenhek
To: Pavel Begunkov
Cc: Jens Axboe, Al Viro, linux-kernel@vger.kernel.org,
    linux-fsdevel@vger.kernel.org, io-uring@vger.kernel.org,
    David Laight, Matthew Wilcox
Subject: Re: [PATCH v3 2/2] io_uring: add support for IORING_OP_GETDENTS
Message-ID: <20210219180637.GC342512@wantstofly.org>
References: <20210218122640.GA334506@wantstofly.org>
    <20210218122755.GC334506@wantstofly.org>
    <9a6fb59b-be85-c36b-3c83-26cff37bcb87@gmail.com>
In-Reply-To: <9a6fb59b-be85-c36b-3c83-26cff37bcb87@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Feb 19, 2021 at 12:05:58PM +0000, Pavel Begunkov wrote:

> > IORING_OP_GETDENTS behaves much like getdents64(2) and takes the same
> > arguments, but with a small twist: it takes an additional offset
> > argument, and reading from the specified directory starts at the given
> > offset.
> >
> > For the first IORING_OP_GETDENTS call on a directory, the offset
> > parameter can be set to zero, and for subsequent calls, it can be
> > set to the ->d_off field of the last struct linux_dirent64 returned
> > by the previous IORING_OP_GETDENTS call.
> >
> > Internally, if necessary, IORING_OP_GETDENTS will vfs_llseek() to
> > the right directory position before calling vfs_getdents().
> >
> > IORING_OP_GETDENTS may or may not update the specified directory's
> > file offset, and the file offset should not be relied upon having
> > any particular value during or after an IORING_OP_GETDENTS call.
> >
> > Signed-off-by: Lennert Buytenhek
> > ---
> >  fs/io_uring.c                 | 73 +++++++++++++++++++++++++++++++++++
> >  include/uapi/linux/io_uring.h |  1 +
> >  2 files changed, 74 insertions(+)
> >
> > diff --git a/fs/io_uring.c b/fs/io_uring.c
> > index 056bd4c90ade..6853bf48369a 100644
> > --- a/fs/io_uring.c
> > +++ b/fs/io_uring.c
> > @@ -635,6 +635,13 @@ struct io_mkdir {
> >  	struct filename			*filename;
> >  };
> >
> [...]
> > +static int io_getdents(struct io_kiocb *req, unsigned int issue_flags)
> > +{
> > +	struct io_getdents *getdents = &req->getdents;
> > +	bool pos_unlock = false;
> > +	int ret = 0;
> > +
> > +	/* getdents always requires a blocking context */
> > +	if (issue_flags & IO_URING_F_NONBLOCK)
> > +		return -EAGAIN;
> > +
> > +	/* for vfs_llseek and to serialize ->iterate_shared() on this file */
> > +	if (file_count(req->file) > 1) {
>
> Looks racy, is it safe? E.g. can be concurrently dupped and used, or
> just several similar IORING_OP_GETDENTS requests.

I thought that it was safe, but I thought about it a bit more, and it
seems that it is unsafe -- if you IORING_REGISTER_FILES to register the
dirfd and then close the dirfd, you'll get a file_count of 1, while you
can submit concurrent operations.  So I'll remove the conditional
locking.  Thanks!

(If not for IORING_REGISTER_FILES, it seems safe, because then
io_file_get() will hold a(t least one) reference on the file while the
operation is in flight, so then if file_count(req->file) == 1 here,
then it means that the file is no longer referenced by any fdtable,
and nobody else should be able to get a reference to it -- but that's
a bit of a useless optimization.)

(Logic was taken from __fdget_pos, where it is safe for a different
reason, i.e. __fget_light will not bump the refcount iff current->files
is unshared.)

> > +		pos_unlock = true;
> > +		mutex_lock(&req->file->f_pos_lock);
> > +	}
> > +
> > +	if (req->file->f_pos != getdents->pos) {
> > +		loff_t res = vfs_llseek(req->file, getdents->pos, SEEK_SET);
>
> I may be missing the previous discussions, but can this ever become
> stateless, like passing an offset? Including readdir.c and beyond.

My aim was to only make the minimally required change initially, but
to make that optimization possible in the future (e.g. by reserving the
right to either update or not update the file position) -- but I'll
try doing the optimization now.

> > +		if (res < 0)
> > +			ret = res;
> > +	}
> > +
> > +	if (ret == 0) {
> > +		ret = vfs_getdents(req->file, getdents->dirent,
> > +				   getdents->count);
> > +	}
> > +
> > +	if (pos_unlock)
> > +		mutex_unlock(&req->file->f_pos_lock);
> > +
> > +	if (ret < 0) {
> > +		if (ret == -ERESTARTSYS)
> > +			ret = -EINTR;
> > +		req_set_fail_links(req);
> > +	}
> > +	io_req_complete(req, ret);
> > +	return 0;
> > +}
> [...]
>
> --
> Pavel Begunkov
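
For readers following along, the "seek if needed, then getdents" behaviour
discussed above can be approximated from userspace with plain lseek(2) and
getdents64(2).  This is only a rough sketch, not the patch and not an
io_uring example: getdents_at(), count_entries() and struct demo_dirent64
are hypothetical names made up for illustration, and the sketch assumes the
filesystem accepts a previously returned ->d_off as a seek target.  It also
does no locking, so like the quoted code it must not be used concurrently
on the same open file.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#ifndef _FILE_OFFSET_BITS
#define _FILE_OFFSET_BITS 64
#endif
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Record layout filled in by getdents64(2); the struct name is ours. */
struct demo_dirent64 {
	uint64_t	d_ino;
	int64_t		d_off;		/* position of the next entry */
	unsigned short	d_reclen;	/* size of this record in bytes */
	unsigned char	d_type;
	char		d_name[];
};

/*
 * Hypothetical helper: read entries from fd starting at pos.
 * Seeks only when the current file position differs from pos, then
 * reads.  Returns the number of bytes read, 0 at end of directory, or
 * -1 on error.  When entries were read, *next_pos is set to the d_off
 * of the last one, which is the offset to pass to the next call.
 */
long getdents_at(int fd, void *buf, size_t count, off_t pos, off_t *next_pos)
{
	if (lseek(fd, 0, SEEK_CUR) != pos && lseek(fd, pos, SEEK_SET) < 0)
		return -1;

	long n = syscall(SYS_getdents64, fd, buf, count);
	if (n <= 0)
		return n;

	for (long off = 0; off < n; ) {
		struct demo_dirent64 *d = (void *)((char *)buf + off);

		assert(d->d_reclen > 0);
		*next_pos = d->d_off;
		off += d->d_reclen;
	}
	return n;
}

/* Hypothetical helper: count all entries in path, bufsize bytes at a time. */
long count_entries(const char *path, size_t bufsize)
{
	int fd = open(path, O_RDONLY | O_DIRECTORY);
	if (fd < 0)
		return -1;

	char *buf = malloc(bufsize);
	if (!buf) {
		close(fd);
		return -1;
	}

	off_t pos = 0;		/* first call starts at offset zero */
	long total = 0, n;

	while ((n = getdents_at(fd, buf, bufsize, pos, &pos)) > 0) {
		for (long off = 0; off < n; ) {
			struct demo_dirent64 *d = (void *)(buf + off);

			total++;
			off += d->d_reclen;
		}
	}

	free(buf);
	close(fd);
	return n < 0 ? -1 : total;
}
```

Calling count_entries() on the same unchanged directory with a small and a
large buffer should give the same count, since each call resumes from the
->d_off returned by the previous one.  The buffer has to be large enough for
one full record (up to roughly 280 bytes for a 255-byte name), otherwise
getdents64(2) fails with EINVAL.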