Date: Mon, 16 Nov 2015 11:32:41 +0200
From: "Kirill A. Shutemov"
To: "Kirill A. Shutemov", Andrew Morton, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Dmitry Vyukov, Manfred Spraul
Subject: Re: [PATCH, RESEND] ipc/shm: handle removed segments gracefully in shm_mmap()
Message-ID: <20151116093241.GB9778@node.shutemov.name>
References: <1447232220-36879-1-git-send-email-kirill.shutemov@linux.intel.com> <20151111170347.GA3502@linux-uzut.site> <20151111195023.GA17310@node.shutemov.name> <20151113053137.GB3502@linux-uzut.site> <20151113091259.GB28904@node.shutemov.name> <20151113192310.GC3502@linux-uzut.site>
In-Reply-To: <20151113192310.GC3502@linux-uzut.site>

On Fri, Nov 13, 2015 at 11:23:10AM -0800, Davidlohr Bueso wrote:
> On Fri, 13 Nov 2015, Kirill A. Shutemov wrote:
>
> >On Thu, Nov 12, 2015 at 09:31:37PM -0800, Davidlohr Bueso wrote:
> >>On Wed, 11 Nov 2015, Kirill A. Shutemov wrote:
> >>>>> 	ret = sfd->file->f_op->mmap(sfd->file, vma);
> >>>>>-	if (ret != 0)
> >>>>>+	if (ret) {
> >>>>>+		shm_close(vma);
> >>>>> 		return ret;
> >>>>>+	}
> >>>>
> >>>>Hmm what's this shm_close() about?
> >>>
> >>>Undo shp->shm_nattch++ in successful __shm_open().
> >>
> >>Yeah that's just nasty.
> >
> >I don't see why: we successfully opened the segment, but f_op->mmap
> >failed -- let's close the segment. It's normal error path.
>
> I was referring to the fact that I hate having to prematurely call shm_open()
> just for this case, and then have to backout, ie for nattach. Similarly, I
> dislike that you make shm_close behave one way and _shm_open another, looks
> hacky.
>
> That said, I do agree that we should inform EIDRM back to the shm_mmap
> caller. My immediate thought would be to recheck right after shm_open returns.
> I realize this is also hacky as we run into similar inconsistencies that I
> mentioned above. But that's a caller (and the only one), not the whole
> shm_open/close. Also, just like we are concerned about EIDRM, should we also
> care about EINVAL -- where we race with explicit user shmctl(RMID) calls but
> we hold reference to nattach?? I mean, why bother doing mmap if the segment is
> marked for deletion and ipc won't touch it again anyway (failed idr lookups).
> The downside to that is the extra lookup overhead, so perhaps your approach
> is better. But looks like the right thing to do conceptually. Something like so?
>
> shm_mmap()
> {
> 	err = shm_check_vma_validity()
> 	if (err)
>
> 	->mmap()
>
> 	shm_open()
> 	err = shm_check_vma_validity()
> 	if (err)
> 		return err; /* shm_open was a nop, return the corresponding error */
>
> 	return 0;
> }

The problem I have with this approach is that it assumes there is nothing
to undo from ->mmap() if the second shm_check_vma_validity() call fails.
That seems true at the moment, but I'm not sure we can assume it in
general, or that it is future-proof.

> So considering EINVAL, even your approach to bumping up nattach by calling
> _shm_open earlier isn't enough. Races exposed to user called rmid can still
> occur between dropping the lock and doing ->mmap(). Ugh..

I see. That's a problem. It looks like the problem we solved for mm_struct
by separating mm_count from mm_users. Should we have two counters instead
of shm_nattch?
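
To make the two-counter question concrete, here is a rough userspace model
of what I have in mind. It is only a sketch, not kernel code: struct
seg_model and all the seg_*() helpers are made-up names, there is no
locking, and "destroyed"/"freed" are just flags standing in for tearing
down the segment contents and freeing the structure. "users" plays the
role of shm_nattch (like mm_users), and "refs" only pins the structure
(like mm_count).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model, NOT the kernel's struct shmid_kernel. */
struct seg_model {
	int users;	/* attachments, in the role of shm_nattch / mm_users */
	int refs;	/* structure pins, in the role of mm_count */
	bool removed;	/* set by the RMID analogue below */
	bool destroyed;	/* segment contents have been torn down */
	bool freed;	/* the structure itself has been released */
};

void seg_init(struct seg_model *s)
{
	s->users = 0;
	s->refs = 1;	/* reference held by the id lookup table */
	s->removed = false;
	s->destroyed = false;
	s->freed = false;
}

/* Pin the structure only; this does not count as an attachment. */
void seg_get(struct seg_model *s)
{
	assert(!s->freed);
	s->refs++;
}

/* Drop a pin; the structure goes away with the last one. */
void seg_put(struct seg_model *s)
{
	assert(s->refs > 0);
	if (--s->refs == 0)
		s->freed = true;
}

/* Tear down the contents once the segment is removed and unused. */
void seg_teardown_if_unused(struct seg_model *s)
{
	if (s->removed && s->users == 0)
		s->destroyed = true;
}

/* Become a user; refused once the segment has been removed. */
int seg_attach(struct seg_model *s)
{
	if (s->removed)
		return -EIDRM;
	s->users++;
	return 0;
}

void seg_detach(struct seg_model *s)
{
	assert(s->users > 0);
	s->users--;
	seg_teardown_if_unused(s);
}

/* RMID analogue: mark removed and drop the lookup table's reference. */
void seg_remove(struct seg_model *s)
{
	if (s->removed)
		return;
	s->removed = true;
	seg_teardown_if_unused(s);
	seg_put(s);
}
```

In this model a path that has only done seg_get() can race with
seg_remove(): the contents are torn down right away because there are no
users, a later seg_attach() fails with -EIDRM, and the structure stays
valid until the final seg_put(). Whether the real shm code can be split
this way is exactly what I'm unsure about.
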

> Ultimately this leads to all ipc_valid_object() checks, as we totally
> ignore SHM_DEST segments nowadays since we forbid mapping previously
> removed segments.
>
> I think this is the first thing we must decide before going forward with this
> mess. ipc currently defines invalid objects by merely checking the deleted flag.

To me, this whole mess of flags should be replaced by proper refcounting.
Although, I admit, I don't understand the SysV IPC API well enough to say
for sure whether that's possible.

--
 Kirill A. Shutemov