Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1757429AbXJKTYs (ORCPT ); Thu, 11 Oct 2007 15:24:48 -0400 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S1753451AbXJKTYk (ORCPT ); Thu, 11 Oct 2007 15:24:40 -0400 Received: from e4.ny.us.ibm.com ([32.97.182.144]:48062 "EHLO e4.ny.us.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752373AbXJKTYk (ORCPT ); Thu, 11 Oct 2007 15:24:40 -0400 Subject: Re: [RFC][PATCH 2/7] get mount write in __dentry_open() From: Dave Hansen To: Miklos Szeredi Cc: linux-kernel@vger.kernel.org, hch@infradead.org In-Reply-To: References: <20071010163439.0F8089F7@kernel> <20071010163439.9DB7F219@kernel> <1192126616.31114.63.camel@localhost> Content-Type: text/plain Date: Thu, 11 Oct 2007 12:24:01 -0700 Message-Id: <1192130641.20859.30.camel@localhost> Mime-Version: 1.0 X-Mailer: Evolution 2.10.1 Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2553 Lines: 63 On Thu, 2007-10-11 at 20:31 +0200, Miklos Szeredi wrote: > > On Thu, 2007-10-11 at 17:08 +0200, Miklos Szeredi wrote: > > > > diff -puN fs/namei.c~get-write-in-__dentry_open fs/namei.c > > > > --- lxc/fs/namei.c~get-write-in-__dentry_open 2007-10-03 14:44:52.000000000 -0700 > > > > +++ lxc-dave/fs/namei.c 2007-10-04 18:02:48.000000000 -0700 > > > > @@ -1621,14 +1621,6 @@ int may_open(struct nameidata *nd, int a > > > > return -EACCES; > > > > > > > > flag &= ~O_TRUNC; > > > > - } else if (flag & FMODE_WRITE) { > > > > - /* > > > > - * effectively: !special_file() > > > > - * balanced by __fput() > > > > - */ > > > > - error = mnt_want_write(nd->mnt); > > > > - if (error) > > > > - return error; > > > > } > > > > > > Maybe readonly should still be checked here, so that the order of > > > error checking doesn't change. If racing with a read-only remount the > > > order is irrelevant anyway. Something like this? > > > > > > } else if (flag & FMODE_WRITE && __mnt_is_readonly(nd->mnt)) { > > > return -EROFS > > > } > > > > I think that would be a bug if anything actually managed to trip that > > code. all of the may_open() calls should have been covered by the > > __dentry_open() mnt writer. > > AFACIS, __dentry_open() will normally be called later than may_open(). > And we don't want it earlier, because ->open() may have side affects, > that could be unsafe if done before permission checking. I actually check the mount write count before the ->open() in __dentry_open(). The truncates are also definitely wrapped in their own mnt_want_write() calls now. > > > And they should be added around do_truncate() as well, since you > > > remove the protection from may_open(). > > > > > > This one introduces an interesting race between ro-remount and > > > open(O_TRUNC), where the truncate can succeed but the open fail with > > > EROFS. Is that a problem? > > > > You're right, this does introduce that race, and it is relatively hard > > to fix properly. But, the 'return a filp' patch makes it easy to fix. > > I've put a temporary kludge in the updated version of this patch, and > > fixed it properly in that later patch. > > If you fix this properly, that should take care of the first problem > as well. Yup. New series coming up. -- Dave - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/