From: Jann Horn
Date: Mon, 31 Aug 2020 17:46:19 +0200
Subject: Re: [PATCH v2] vfs: add RWF_NOAPPEND flag for pwritev2
To: Rich Felker, Alexander Viro, Jens Axboe
Cc: linux-fsdevel, kernel list, Linux API, Pavel Begunkov
In-Reply-To: <20200831153207.GO3265@brightrain.aerifal.cx>

On Mon, Aug 31, 2020 at 5:32 PM Rich Felker wrote:
> The pwrite function, originally defined by POSIX (thus the "p"), is
> defined to ignore O_APPEND and write at the offset passed as its
> argument. However, historically Linux honored O_APPEND if set and
> ignored the offset. This cannot be changed due to stability policy,
> but is documented in the man page as a bug.
>
> Now that there's a pwritev2 syscall providing a superset of the pwrite
> functionality that has a flags argument, the conforming behavior can
> be offered to userspace via a new flag.
> Since pwritev2 checks flag
> validity (in kiocb_set_rw_flags) and reports unknown ones with
> EOPNOTSUPP, callers will not get wrong behavior on old kernels that
> don't support the new flag; the error is reported and the caller can
> decide how to handle it.
>
> Signed-off-by: Rich Felker

Reviewed-by: Jann Horn

Note that if this lands, Michael Kerrisk will probably be happy if you
send a corresponding patch for the manpage man2/readv.2.

Btw, I'm not really sure whose tree this should go through - VFS is
normally Al Viro's turf, but it looks like the most recent
modifications to this function have gone through Jens Axboe's tree?

> ---
>
> Changes in v2: I've added a check to ensure that RWF_NOAPPEND does not
> override O_APPEND for S_APPEND (chattr +a) inodes, and fixed conflicts
> with 1752f0adea98ef85, which optimized kiocb_set_rw_flags to work with
> a local copy of flags. Unfortunately the same optimization does not
> work for RWF_NOAPPEND since it needs to remove flags from the original
> set at function entry.
>
> If desired, I could further change this so that kiocb_flags is
> initialized to ki->ki_flags, with assignment-back in place of |= at
> the end of the function. This would allow the same local variable
> pattern in the RWF_NOAPPEND code path, which might be more elegant,
> but I'm not sure if the emitted code would improve or get worse.
>
>
>  include/linux/fs.h      | 7 +++++++
>  include/uapi/linux/fs.h | 5 ++++-
>  2 files changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 7519ae003a08..924e17ac8e7e 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -3321,6 +3321,8 @@ static inline int kiocb_set_rw_flags(struct kiocb *ki, rwf_t flags)
>  		return 0;
>  	if (unlikely(flags & ~RWF_SUPPORTED))
>  		return -EOPNOTSUPP;
> +	if (unlikely((flags & RWF_APPEND) && (flags & RWF_NOAPPEND)))
> +		return -EINVAL;
>
>  	if (flags & RWF_NOWAIT) {
>  		if (!(ki->ki_filp->f_mode & FMODE_NOWAIT))
> @@ -3335,6 +3337,11 @@ static inline int kiocb_set_rw_flags(struct kiocb *ki, rwf_t flags)
>  		kiocb_flags |= (IOCB_DSYNC | IOCB_SYNC);
>  	if (flags & RWF_APPEND)
>  		kiocb_flags |= IOCB_APPEND;
> +	if ((flags & RWF_NOAPPEND) && (ki->ki_flags & IOCB_APPEND)) {
> +		if (IS_APPEND(file_inode(ki->ki_filp)))
> +			return -EPERM;
> +		ki->ki_flags &= ~IOCB_APPEND;
> +	}
>
>  	ki->ki_flags |= kiocb_flags;
>  	return 0;
> diff --git a/include/uapi/linux/fs.h b/include/uapi/linux/fs.h
> index f44eb0a04afd..d5e54e0742cf 100644
> --- a/include/uapi/linux/fs.h
> +++ b/include/uapi/linux/fs.h
> @@ -300,8 +300,11 @@ typedef int __bitwise __kernel_rwf_t;
>  /* per-IO O_APPEND */
>  #define RWF_APPEND	((__force __kernel_rwf_t)0x00000010)
>
> +/* per-IO negation of O_APPEND */
> +#define RWF_NOAPPEND	((__force __kernel_rwf_t)0x00000020)
> +
>  /* mask of flags supported by the kernel */
>  #define RWF_SUPPORTED	(RWF_HIPRI | RWF_DSYNC | RWF_SYNC | RWF_NOWAIT |\
> -			 RWF_APPEND)
> +			 RWF_APPEND | RWF_NOAPPEND)
>
>  #endif /* _UAPI_LINUX_FS_H */
> --
> 2.21.0
>
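
For readers who want to poke at the ordering of the checks, here is a rough userspace model of the append-related part of kiocb_set_rw_flags as the quoted hunks change it. It is only a sketch, not kernel code: every model_* and MODEL_* name is made up for illustration, the two RWF values are copied from the quoted uapi hunk, MODEL_IOCB_APPEND is a placeholder bit rather than the kernel's real IOCB_APPEND value, the inode_append field stands in for IS_APPEND(file_inode(ki->ki_filp)), and the other RWF flags (HIPRI, DSYNC, SYNC, NOWAIT) are left out entirely.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

enum {
	MODEL_RWF_APPEND   = 0x10, /* value from the quoted uapi hunk */
	MODEL_RWF_NOAPPEND = 0x20, /* value proposed in the quoted uapi hunk */
	MODEL_IOCB_APPEND  = 0x01, /* placeholder bit, NOT the kernel's value */
};

/* Hypothetical stand-in for the two pieces of kiocb state the hunk reads. */
struct model_kiocb {
	int  ki_flags;     /* models ki->ki_flags (set from O_APPEND at open) */
	bool inode_append; /* models IS_APPEND(), i.e. chattr +a on the inode */
};

/* Simplified model: only the append-related flags are "supported" here. */
int model_set_rw_flags(struct model_kiocb *ki, int flags)
{
	int kiocb_flags = 0;

	if (!flags)
		return 0;
	if (flags & ~(MODEL_RWF_APPEND | MODEL_RWF_NOAPPEND))
		return -EOPNOTSUPP;	/* unknown flag: caller can fall back */
	if ((flags & MODEL_RWF_APPEND) && (flags & MODEL_RWF_NOAPPEND))
		return -EINVAL;		/* contradictory request */

	if (flags & MODEL_RWF_APPEND)
		kiocb_flags |= MODEL_IOCB_APPEND;
	if ((flags & MODEL_RWF_NOAPPEND) && (ki->ki_flags & MODEL_IOCB_APPEND)) {
		if (ki->inode_append)
			return -EPERM;	/* append-only inode wins over the flag */
		ki->ki_flags &= ~MODEL_IOCB_APPEND;
	}

	ki->ki_flags |= kiocb_flags;
	return 0;
}

/* Usage: the outcomes a caller of the model can observe for NOAPPEND. */
void model_demo(void)
{
	struct model_kiocb plain  = { .ki_flags = MODEL_IOCB_APPEND, .inode_append = false };
	struct model_kiocb locked = { .ki_flags = MODEL_IOCB_APPEND, .inode_append = true };

	assert(model_set_rw_flags(&plain, MODEL_RWF_NOAPPEND) == 0);
	assert(!(plain.ki_flags & MODEL_IOCB_APPEND));
	assert(model_set_rw_flags(&locked, MODEL_RWF_NOAPPEND) == -EPERM);
	assert(locked.ki_flags & MODEL_IOCB_APPEND);
}
```

The model also shows why the NOAPPEND branch touches ki->ki_flags directly instead of the local kiocb_flags: the bit it has to clear was already set in the original flags before the function was entered, which is the point made in the v2 notes.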