From: KOSAKI Motohiro
To: Jeff Moyer
Subject: Re: [RFC][PATCH v3 4/6] aio: Don't inherit aio ring memory at fork
Cc: kosaki.motohiro@jp.fujitsu.com, LKML, Zach Brown, Jens Axboe, linux-api@vger.kernel.org, Linus Torvalds, Andrew Morton, Nick Piggin, Andrea Arcangeli, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org
References: <20090414151924.C653.A69D9226@jp.fujitsu.com>
Message-Id: <20090415091534.AC18.A69D9226@jp.fujitsu.com>
Date: Wed, 15 Apr 2009 09:56:34 +0900 (JST)

Hi!

> KOSAKI Motohiro writes:
>
> > AIO folks, am I missing anything?
> >
> > ===============
> > Subject: [RFC][PATCH] aio: Don't inherit aio ring memory at fork
> >
> > Currently, the mm_struct::ioctx_list member isn't copied at fork. IOW, the aio context isn't inherited at fork,
> > but the ring memory alone is inherited. That's strange.
> >
> > This patch marks the ring memory DONTFORK too.
>
> Well, given that clearly nobody relies on io contexts being copied to
> the child, I think it's okay to make this change. I think the current
> behaviour violates the principle of least surprise, but I'm having a
> hard time getting upset about that. ;)

OK.
So, can I get your Acked-by?

> > In addition, this patch has a good side effect: it also fixes the
> > "get_user_pages() vs fork" problem.
>
> Hmm, I don't follow you, here. As I understand it, the get_user_pages
> vs. fork problem has to do with the pages used for the actual I/O, not
> the pages used to store the completion data. So, could you elaborate a
> bit on what you mean by the above statement?

No.

The problem is that get_user_pages() increments page_count only,
but the VM page-fault logic doesn't look at page_count (it only looks at page::_mapcount).
So fork and a page fault can change the virtual-to-physical mapping even though
get_user_pages() has been called.

The worst-case aio scenario is drawn here:
-----------------------------------------------------------------------
io_setup() and gup increment page_count

fork increments mapcount
     and write-protects the pte

write to the ring from userland (*)
        page fault and
        COW break.
        the parent process gets the copied page and the child gets ownership of the original page.

kmap and memcpy from the kernel
        change the child's page (which means the data is lost).

(*) Does this happen?

MADV_DONTFORK, down_read(mmap_sem), down_read(mm_pinned_sem),
or the copy-at-fork mechanism (= Nick/Andrea patch) solves it.

> > I think "man fork" should also be changed. It only says
> >
> >        *  The child does not inherit outstanding asynchronous I/O operations from
> >           its parent (aio_read(3), aio_write(3)).
> > but aio_context_t (the return value of io_setup(2)) is also not inherited in the current implementation.
>
> I can certainly make that change, as I have other changes I need to push
> to Michael, anyway.

Thanks.