Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1757244AbdLPWEr (ORCPT ); Sat, 16 Dec 2017 17:04:47 -0500 Received: from mx1.redhat.com ([209.132.183.28]:54664 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1757211AbdLPWEp (ORCPT ); Sat, 16 Dec 2017 17:04:45 -0500 From: Giuseppe Scrivano To: Cong Wang Cc: Al Viro , LKML , Ingo Molnar , Andrew Morton , Linus Torvalds , stable Subject: Re: [PATCH] exit: move exit_task_namespaces() after exit_task_work() References: <20171214201757.5393-1-xiyou.wangcong@gmail.com> <20171214210817.GF21978@ZenIV.linux.org.uk> Date: Sat, 16 Dec 2017 23:04:41 +0100 In-Reply-To: (Cong Wang's message of "Fri, 15 Dec 2017 15:59:46 -0800") Message-ID: <87y3m2pbwm.fsf@redhat.com> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.3 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.29]); Sat, 16 Dec 2017 22:04:45 +0000 (UTC) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2578 Lines: 72 Cong Wang writes: > On Thu, Dec 14, 2017 at 1:08 PM, Al Viro wrote: >> On Thu, Dec 14, 2017 at 12:17:57PM -0800, Cong Wang wrote: >>> syzbot reported we have a use-after-free when mqueue_evict_inode() >>> is called on __cleanup_mnt() path, where the ipc ns is already >>> freed by the previous exit_task_namespaces(). We can just move >>> it after after exit_task_work() to avoid this use-after-free. >> >> What's to prevent somebody else holding a reference to the same >> inode past the exit(2)? IOW, I don't believe that this is fixing >> anything - in the best case, your patch papers over a specific >> reproducer. > > You are right, I missed mq_clear_sbinfo(). yes, the patch 9c583773d036336176e9e50441890659bc4eeae8 introduced an issue since it doesn't reset s_fs_info when ns->mq_mnt == NULL. The following patch solves the issue for me. I store the superblock in "struct ipc_namespace" since we cannot assume "mq_mnt" exists. Does the patch look fine? I'll clean it up and submit it separately if this approach is correct. Regards, Giuseppe diff --git a/include/linux/ipc_namespace.h b/include/linux/ipc_namespace.h index 554e31494f69..66d4a5740a60 100644 --- a/include/linux/ipc_namespace.h +++ b/include/linux/ipc_namespace.h @@ -53,6 +53,7 @@ struct ipc_namespace { /* The kern_mount of the mqueuefs sb. We take a ref on it */ struct vfsmount *mq_mnt; + struct super_block *mq_sb; /* # queues in this ns, protected by mq_lock */ unsigned int mq_queues_count; diff --git a/ipc/mqueue.c b/ipc/mqueue.c index f78d6687a61b..16347cf1de2d 100644 --- a/ipc/mqueue.c +++ b/ipc/mqueue.c @@ -341,6 +341,7 @@ static int mqueue_fill_super(struct super_block *sb, void *data, int silent) sb->s_root = d_make_root(inode); if (!sb->s_root) return -ENOMEM; + ns->mq_sb = sb; return 0; } @@ -1554,6 +1555,7 @@ int mq_init_ns(struct ipc_namespace *ns, bool mount) ns->mq_msg_max = DFLT_MSGMAX; ns->mq_msgsize_max = DFLT_MSGSIZEMAX; ns->mq_msg_default = DFLT_MSG; + ns->mq_sb = NULL; ns->mq_msgsize_default = DFLT_MSGSIZE; if (!mount) @@ -1573,8 +1575,8 @@ int mq_init_ns(struct ipc_namespace *ns, bool mount) void mq_clear_sbinfo(struct ipc_namespace *ns) { - if (ns->mq_mnt) - ns->mq_mnt->mnt_sb->s_fs_info = NULL; + if (ns->mq_sb) + ns->mq_sb->s_fs_info = NULL; } void mq_put_mnt(struct ipc_namespace *ns)