Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754431AbdLUTTf (ORCPT ); Thu, 21 Dec 2017 14:19:35 -0500 Received: from mx1.redhat.com ([209.132.183.28]:40484 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752509AbdLUTTe (ORCPT ); Thu, 21 Dec 2017 14:19:34 -0500 From: Giuseppe Scrivano To: Al Viro Cc: Andrew Morton , LKML , alexander.deucher@amd.com, broonie@kernel.org, chris@chris-wilson.co.uk, David Miller , deepa.kernel@gmail.com, Greg KH , luc.vanoostenryck@gmail.com, lucien xin , Ingo Molnar , Neil Horman , syzkaller-bugs@googlegroups.com, Vladislav Yasevich Subject: Re: [PATCH linux-next] mqueue: fix IPC namespace use-after-free References: <20171219101440.19736-1-gscrivan@redhat.com> <20171219114819.GQ21978@ZenIV.linux.org.uk> <20171219153225.GA14771@ZenIV.linux.org.uk> <874lomhcwb.fsf@redhat.com> <87vah2ftn8.fsf@redhat.com> <20171219201411.GT21978@ZenIV.linux.org.uk> Date: Thu, 21 Dec 2017 20:19:27 +0100 In-Reply-To: <20171219201411.GT21978@ZenIV.linux.org.uk> (Al Viro's message of "Tue, 19 Dec 2017 20:14:12 +0000") Message-ID: <87inczopmo.fsf@redhat.com> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.3 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.28]); Thu, 21 Dec 2017 19:19:34 +0000 (UTC) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2333 Lines: 46 Al Viro writes: > On Tue, Dec 19, 2017 at 07:40:43PM +0100, Giuseppe Scrivano wrote: >> Giuseppe Scrivano writes: >> >> > The only issue I've seen with my version is that if I do: >> > >> > # unshare -im /bin/sh >> > # mount -t mqueue mqueue /dev/mqueue >> > # touch /dev/mqueue/foo >> > # umount /dev/mqueue >> > # mount -t mqueue mqueue /dev/mqueue >> > >> > then /dev/mqueue/foo doesn't exist at this point. Your patch does not >> > have this problem and /dev/mqueue/foo is again accessible after the >> > second mount. >> >> although, how much is that of an issue? Is there any other way to delay > > You do realize that with your patch you would end up with worse than that - > mount/touch/umount and now you've got ->mq_sb non-NULL and pointing to freed > super_block. With really unpleasant effects when you quit that ipcns... > >> the cost of kern_mount_data()? Most containers have /dev/mqueue mounted >> but it is not really going to be used. > > _What_ cost? At mount(2) time you are setting the superblock up anyway, so > what would you be delaying? kmem_cache_alloc() for struct mount and assignments > to its fields? That's noise; if anything, I would expect the main cost with > a plenty of containers to be in sget() scanning the list of mqueue superblocks. > And we can get rid of that, while we are at it - to hell with mount_ns(), with > that approach we can just use mount_nodev() instead. The logics in > mq_internal_mount() will deal with multiple instances - if somebody has already > triggered creation of internal mount, all subsequent calls in that ipcns will > end up avoiding kern_mount_data() entirely. And if you have two callers > racing - sure, you will get two superblocks. Not for long, though - the first > one to get to setting ->mq_mnt (serialized on mq_lock) wins, the second loses > and prompty destroys his vfsmount and superblock. I seriously suspect that > variant below would cut down on the cost a whole lot more - as it is, we have > the total of O(N^2) spent in the loop inside of sget_userns() when we create > N ipcns and mount in each of those; this patch should cut that to O(N)... Thanks for the explanation. I see what you mean now and how this cost is inevitable as anyway we need to setup the superblock. Giuseppe