Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S261584AbVEUNOY (ORCPT ); Sat, 21 May 2005 09:14:24 -0400 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S261279AbVEUNOY (ORCPT ); Sat, 21 May 2005 09:14:24 -0400 Received: from mx1.redhat.com ([66.187.233.31]:53469 "EHLO mx1.redhat.com") by vger.kernel.org with ESMTP id S261587AbVEUNOL (ORCPT ); Sat, 21 May 2005 09:14:11 -0400 Date: Sat, 21 May 2005 09:14:05 -0400 From: Neil Horman To: Al Viro Cc: Neil Horman , linux-kernel@vger.kernel.org Subject: Re: [Patch] vfs: increase scope of critical locked path in fget_light to avoid race Message-ID: <20050521131405.GA22489@hmsendeavour.rdu.redhat.com> References: <20050520132325.GE19229@hmsendeavour.rdu.redhat.com> <20050520133337.GP29811@parcelfarce.linux.theplanet.co.uk> <20050520134046.GQ29811@parcelfarce.linux.theplanet.co.uk> <20050520152550.GF19229@hmsendeavour.rdu.redhat.com> <20050520212824.GS29811@parcelfarce.linux.theplanet.co.uk> Mime-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20050520212824.GS29811@parcelfarce.linux.theplanet.co.uk> User-Agent: Mutt/1.4.1i Sender: linux-kernel-owner@vger.kernel.org X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 3740 Lines: 69 On Fri, May 20, 2005 at 10:28:24PM +0100, Al Viro wrote: > On Fri, May 20, 2005 at 11:25:50AM -0400, Neil Horman wrote: > > I don't have the complete race scenario, just a stack that suggests that > > files->fd was corrupted. This problem isn't recreatable at will (yet), so this > > is really based on a thought experiment more than anything else. The conditions > > that I was envisioning was a multithreaded application in which two threads > > modified the same file descriptor at the same time. Its my understanding, from > > the way I read the code that the ref count on a file_struct will still be one > > for a multithreaded application, and as such it would be possible, using the > > fget_light routine for one thread to be be preforming an operation on an > > descriptor in the fd array, while another thread preformed another operation > > Incorrect. References to files_struct are held by task_struct. Kernel > stack is determined by task_struct. So your two threads would have > to share task_struct (i.e. be purely userland ones) and could not > run in the kernel at the same time for very obvious reasons. > > The rules are simple: > * all access to files_struct is done from upper-half (i.e. is > process-synchronous). > * the only files_struct you can modify is *(current->files) > * each task_struct that has ->files pointing to given files_struct > contributes 1 to ->count of that files_struct. There might be other holders > of temporary references and they also contribute to ->count. > * all changes of task->files itself are process-synchronous. Only > two kinds of changes are possible: > 1) current->files can be set to NULL. That drops a reference to > original files_struct. > 2) current->files can be replaced with a pointer to a new copy of > previous files_struct. This operations drops a reference to old one and > sets the refcount on a copy to 1. It could either be done explicitly (when > unshare(2) gets merged into Linus' tree) or implicitly at the clone()/fork() > time. In the latter case that's done by parent to child before the child > gets a chance to run. > * at task creation time, child inherits ->files from parent; that > acquires a new reference to it. That might be followed by implicit unshare() > (see above). fork(2) always unshares ->files, clone(2) does that unless > CLONE_FILES had been passed to it in flags. > > IOW, the only way for two tasks to have ->files pointing to the same object > is to have it unchanged all the way back to common ancestor. In particular, > if current->files->count is 1, we know that no other task has ->files pointing > to our files_struct and that will remain true until we call clone(). It does > *not* mean that current->files->count will remain 1; somebody might acquire > a temporary reference to our files_struct. However, we are guaranteed that > all such references will be used only for read-only access (that happens, > e.g., when somebody does ls /proc//fd - they will grab a reference > to our ->files and go looking at the descriptor table; they are not allowed > to change it, though). Ok, thanks for the through explination here, quite educational. I'll go looking for other causes for this. Thanks and regards Neil -- /*************************************************** *Neil Horman *Software Engineer *Red Hat, Inc. *nhorman@redhat.com *gpg keyid: 1024D / 0x92A74FA1 *http://pgp.mit.edu ***************************************************/ - To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/