Received: by 2002:a25:d7c1:0:0:0:0:0 with SMTP id o184csp4052477ybg; Mon, 21 Oct 2019 03:02:57 -0700 (PDT) X-Google-Smtp-Source: APXvYqyZD72cCykgDYqegQdTivDEtgekarzMpW5eB99GeDJCjI4GFaGUz7L8vXnBp99nU6SD9kyg X-Received: by 2002:a17:906:1be1:: with SMTP id t1mr21396377ejg.73.1571652176898; Mon, 21 Oct 2019 03:02:56 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1571652176; cv=none; d=google.com; s=arc-20160816; b=Acm+GB0c1PjLACNfIjfO7FS+dEXJJReGBGgkJUMLNSB020T4d0bs/x462KFmrxqXmB v/wmAlgOy5oXVLLT4+2T3IvGwFd9eoy/H65GeYJSPI9suH/5xf151p3jtWseTmq3n+jF pu/jfYGysgg6W4CyZHgnCW7tJbOmJNpu2No18nqqt2DBL7s3uYbrK3ftdga+gZ44pGKB LK3LWYa12Fguu381DWOr6VBdrpr/sttDvQLG8xWLtfqLAuU1KTQ73nNvydPZagXKsIqP hQk6aSs/kFH0DkGOtTE1C48iLZLOxqEbIKw52OFD67MywMUOw6HxWRTzGlBiFsANz73n 7lCw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:dkim-signature; bh=2hHJ5NLcnxlF/35z8dPd6x7teJob8DJgzbVJAWB7Dd0=; b=H8R3er+wfHI6Hd9VFXgeyL277VTr17v4A2HwOnb6Gu/5tpYH38P/dioEuX4aogfNfX tvKQfJ6dpCiFeEOchVUZlxFe0+2zLPZmWk/1FgzdHXcXNom811eGe39Yubt4WgMeWP6j TNoizHwMLlvTNs3Ve6/rILDIkP+A99Ae9gPxHv40QlaOCnZ2MgAJBNjIsVV8jWks7gDe oXKRYnJuo59ZxaHX6OTuqYo2hDEBQA26x4V2NcyFJVB9ti1Nn7K2GvIJDcFVUvqPl8M+ oZWPXXc/UDtTy+9j3AmoaFT7SV318cIIRE4Rbqoc8tjSNZpLzY8kFrs0ixIr2FpDEAad /4UA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=E21aNDbX; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id d35si9200358ede.440.2019.10.21.03.02.33; Mon, 21 Oct 2019 03:02:56 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=E21aNDbX; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727944AbfJUKAD (ORCPT + 99 others); Mon, 21 Oct 2019 06:00:03 -0400 Received: from mail-ot1-f65.google.com ([209.85.210.65]:40830 "EHLO mail-ot1-f65.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727735AbfJUKAD (ORCPT ); Mon, 21 Oct 2019 06:00:03 -0400 Received: by mail-ot1-f65.google.com with SMTP id y39so10448626ota.7 for ; Mon, 21 Oct 2019 03:00:02 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=2hHJ5NLcnxlF/35z8dPd6x7teJob8DJgzbVJAWB7Dd0=; b=E21aNDbXg874GinAApeZ/EftmLYQq+nttZxK1wdNrGsnWFeKvlbsM3rgwuKDu2kute UAakWBm04O5eVwtO6AGesMaWKD62BbI0PQmrSdzdI6MbBg0GmEIYXr8deCLc+JA+26I3 HZ44l1406jnBi4ERR8aPviUs0Nx7u/9vAQ+zTQu+ieJlXbEDS9kvFAari5ip0p4VcXyo a4hZT4m1QKxftprG8S7DCG6Q6Ytpm3ZSPwz51JsfSoGoGQsCmfmRIXDJbPA+9sNHwhjR ZfidHjhvVWLLnPnlfPowQ6xTbLfv18wym4HZMXg2M0wNRJK1/I1QnXPNvR/chKz2sHF9 Q9lQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=2hHJ5NLcnxlF/35z8dPd6x7teJob8DJgzbVJAWB7Dd0=; b=U+ahE1TKK0X+W4eg8m31iHUs5n4xZwVDpyrDh7D8R9I9YAtKCUP8z9hSLu13PLM77z v6nwMnGV0tLL9deJYNHKJjcq6XPZI5ISiJWf0/ABdLePqoanBJTCETpyJnIPGqkHEego WEoWuflVzGaIai+M9M6kS5h7WG3i0PFtE+RUFVPOFM2Gpm+PxfL6FKM0RqgEZ9IJbuGG tpYN3C8KfqkhO4y3SbJBDplHpcgynzL6jYelg3U0ZWTScLdIXf+KdctLH385z/XMgZ6n IiUTZpxmvb2Cxb9ISwvpjwNmxsQ8X/My3vCuZDPYEPKzy+vdJ6lpoR9ZY33obCWqVPbM KEOg== X-Gm-Message-State: APjAAAVEDV5fNZNYLpu8+m3enwkIWPJeaFrbShzVsaZokN5tneAMkEpo 8Hde0b4r+92cFY5gUE9YauIBGKzOvlKt7bAoapKwLA== X-Received: by 2002:a9d:5f0f:: with SMTP id f15mr636415oti.251.1571652001826; Mon, 21 Oct 2019 03:00:01 -0700 (PDT) MIME-Version: 1.0 References: <20191020173010.GA14744@avx2> In-Reply-To: From: Marco Elver Date: Mon, 21 Oct 2019 11:59:50 +0200 Message-ID: Subject: Re: [PATCH] proc: fix inode uid/gid writeback race To: Alexey Dobriyan Cc: Andrew Morton , LKML , linux-fsdevel@vger.kernel.org, viro@zeniv.linux.org.uk Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Mon, 21 Oct 2019 at 11:24, Marco Elver wrote: > > On Sun, 20 Oct 2019 at 19:30, Alexey Dobriyan wrote: > > > > (euid, egid) pair is snapshotted correctly from task under RCU, > > but writeback to inode can be done in any order. > > > > Fix by doing writeback under inode->i_lock where necessary > > (/proc/* , /proc/*/fd/* , /proc/*/map_files/* revalidate). > > > > Reported-by: syzbot+e392f8008a294fdf8891@syzkaller.appspotmail.com > > Signed-off-by: Alexey Dobriyan > > --- > > Thanks! > > This certainly fixes the problem of inconsistent uid/gid pair due to > racing writebacks, as well as the data-race. If that is the only > purpose of this patch, then from what I see this is fine: > > Acked-by: Marco Elver > > However, there is probably still a more fundamental problem as outlined below. > > > fs/proc/base.c | 25 +++++++++++++++++++++++-- > > fs/proc/fd.c | 2 +- > > fs/proc/internal.h | 2 ++ > > 3 files changed, 26 insertions(+), 3 deletions(-) > > > > --- a/fs/proc/base.c > > +++ b/fs/proc/base.c > > @@ -1743,6 +1743,25 @@ void task_dump_owner(struct task_struct *task, umode_t mode, > > *rgid = gid; > > } > > > > +/* use if inode is live */ > > +void task_dump_owner_to_inode(struct task_struct *task, umode_t mode, > > + struct inode *inode) > > +{ > > + kuid_t uid; > > + kgid_t gid; > > + > > + task_dump_owner(task, mode, &uid, &gid); > > + /* > > + * There is no atomic "change all credentials at once" system call, > > + * guaranteeing more than _some_ snapshot from "struct cred" ends up > > + * in inode is not possible. > > + */ > > + spin_lock(&inode->i_lock); > > + inode->i_uid = uid; > > + inode->i_gid = gid; > > + spin_unlock(&inode->i_lock); > > 2 tasks can still race here, and the inconsistent scenario I outlined in > https://lore.kernel.org/linux-fsdevel/000000000000328b2905951a7667@google.com/ > could still happen I think (although extremely unlikely). Mainly, > causality may still be violated -- but I may be wrong as I don't know > the rest of the code (so please be critical of my suggestion). > > The problem is that if 2 threads race here, one has snapshotted old > uid/gid, and the other the new uid/gid. Then it is still possible for > the old uid/gid to be written back after new uid/gid, which would > result in this bad scenario: > > === TASK 1 === > | seteuid(1000); > | seteuid(0); > | stat("/proc/", &fstat); > | assert(fstat.st_uid == 0); // fails > === TASK 2 === > | stat("/proc/", ...); > > AFAIK it's not something that can easily be fixed without some > timestamp on the uid/gid pair (timestamp updated after setuid/seteuid > etc) obtained in the RCU reader critical section. Then in this > critical section, uid/gid should only be written if the current pair > in inode is older according to snapshot timestamp. Note that timestamp is probably wrong here, but rather some kind of sequence number would be needed. > > +} > > + > > struct inode *proc_pid_make_inode(struct super_block * sb, > > struct task_struct *task, umode_t mode) > > { > > @@ -1769,6 +1788,7 @@ struct inode *proc_pid_make_inode(struct super_block * sb, > > if (!ei->pid) > > goto out_unlock; > > > > + /* fresh inode -- no races */ > > task_dump_owner(task, 0, &inode->i_uid, &inode->i_gid); > > security_task_to_inode(task, inode); > > > > @@ -1802,6 +1822,7 @@ int pid_getattr(const struct path *path, struct kstat *stat, > > */ > > return -ENOENT; > > } > > + /* "struct kstat" is thread local, atomic snapshot is enough */ > > task_dump_owner(task, inode->i_mode, &stat->uid, &stat->gid); > > } > > rcu_read_unlock(); > > @@ -1815,7 +1836,7 @@ int pid_getattr(const struct path *path, struct kstat *stat, > > */ > > void pid_update_inode(struct task_struct *task, struct inode *inode) > > { > > - task_dump_owner(task, inode->i_mode, &inode->i_uid, &inode->i_gid); > > + task_dump_owner_to_inode(task, inode->i_mode, inode); > > > > inode->i_mode &= ~(S_ISUID | S_ISGID); > > security_task_to_inode(task, inode); > > @@ -1990,7 +2011,7 @@ static int map_files_d_revalidate(struct dentry *dentry, unsigned int flags) > > mmput(mm); > > > > if (exact_vma_exists) { > > - task_dump_owner(task, 0, &inode->i_uid, &inode->i_gid); > > + task_dump_owner_to_inode(task, 0, inode); > > > > security_task_to_inode(task, inode); > > status = 1; > > --- a/fs/proc/fd.c > > +++ b/fs/proc/fd.c > > @@ -101,7 +101,7 @@ static bool tid_fd_mode(struct task_struct *task, unsigned fd, fmode_t *mode) > > static void tid_fd_update_inode(struct task_struct *task, struct inode *inode, > > fmode_t f_mode) > > { > > - task_dump_owner(task, 0, &inode->i_uid, &inode->i_gid); > > + task_dump_owner_to_inode(task, 0, inode); > > > > if (S_ISLNK(inode->i_mode)) { > > unsigned i_mode = S_IFLNK; > > --- a/fs/proc/internal.h > > +++ b/fs/proc/internal.h > > @@ -123,6 +123,8 @@ static inline struct task_struct *get_proc_task(const struct inode *inode) > > > > void task_dump_owner(struct task_struct *task, umode_t mode, > > kuid_t *ruid, kgid_t *rgid); > > +void task_dump_owner_to_inode(struct task_struct *task, umode_t mode, > > + struct inode *inode); > > > > unsigned name_to_int(const struct qstr *qstr); > > /*