Date: Wed, 18 Jun 2008 16:29:44 +0900
From: KAMEZAWA Hiroyuki
To: KAMEZAWA Hiroyuki
Cc: Nick Piggin, Daisuke Nishimura, Andrew Morton, Rik van Riel,
	Lee Schermerhorn, Kosaki Motohiro, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, kernel-testers@vger.kernel.org,
	"hugh@veritas.com"
Subject: [PATCH -mm][BUGFIX] migration_entry_wait fix. v2
Message-Id: <20080618162944.2f8fd265.kamezawa.hiroyu@jp.fujitsu.com>
In-Reply-To: <20080618155233.7dd79312.kamezawa.hiroyu@jp.fujitsu.com>
References: <20080611225945.4da7bb7f.akpm@linux-foundation.org>
	<200806181535.58036.nickpiggin@yahoo.com.au>
	<20080618150436.dca5eb75.kamezawa.hiroyu@jp.fujitsu.com>
	<200806181642.38379.nickpiggin@yahoo.com.au>
	<20080618155233.7dd79312.kamezawa.hiroyu@jp.fujitsu.com>
Organization: Fujitsu

In the speculative page cache lookup protocol, page_count(page) is set to 0
while a radix-tree modification (truncation, migration, etc.) is in progress.

During page migration, a page fault on a page under migration does the
following:
 - look up the page table
 - find that the entry is a migration_entry_pte
 - decode the pfn from the migration_entry_pte and get the page via
   pfn_to_page(pfn)
 - wait until the page is unlocked

It currently does get_page() -> wait_on_page_locked() -> put_page().

In page migration's radix-tree replacement, page_freeze_refs() ->
page_unfreeze_refs() is called. page_count(page) drops to zero there and
must stay zero for the duration of the radix-tree replacement.

If get_page() is called on a page under radix-tree replacement, the kernel
panics. To avoid this, we must not increment page_count() when it is zero.
This patch uses get_page_unless_zero() instead.

Even if get_page_unless_zero() fails, the caller simply retries the fault,
at the cost of being a bit busier.

Change log v1->v2:
 - rewrote the patch description and added comments.

From: Daisuke Nishimura
Signed-off-by: Daisuke Nishimura
Signed-off-by: KAMEZAWA Hiroyuki
---
 mm/migrate.c |   11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

Index: test-2.6.26-rc5-mm3/mm/migrate.c
===================================================================
--- test-2.6.26-rc5-mm3.orig/mm/migrate.c
+++ test-2.6.26-rc5-mm3/mm/migrate.c
@@ -242,8 +242,15 @@ void migration_entry_wait(struct mm_stru
 		goto out;
 
 	page = migration_entry_to_page(entry);
-
-	get_page(page);
+	/*
+	 * Once radix-tree replacement of page migration started, page_count
+	 * *must* be zero. And, we don't want to call wait_on_page_locked()
+	 * against a page without get_page().
+	 * So, we use get_page_unless_zero(), here. Even failed, page fault
+	 * will occur again.
+	 */
+	if (!get_page_unless_zero(page))
+		goto out;
 	pte_unmap_unlock(ptep, ptl);
 	wait_on_page_locked(page);
 	put_page(page);
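
For readers who want to try the idea outside the kernel, here is a rough userspace sketch of the refcount rule the patch relies on. It is not the kernel implementation: struct toy_page and all toy_* functions are hypothetical names made up for this illustration, and C11 atomics stand in for the kernel's atomic_t helpers. It only models "take a reference unless the count is frozen at zero, otherwise back off and let the fault retry".

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct page: only the refcount is modeled. */
struct toy_page {
	atomic_int count;
};

/* Take a reference only if the count is non-zero. */
static bool toy_get_page_unless_zero(struct toy_page *page)
{
	int old = atomic_load(&page->count);

	while (old != 0) {
		/* On failure 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&page->count, &old, old + 1))
			return true;
	}
	return false;	/* frozen: caller must not touch the page */
}

static void toy_put_page(struct toy_page *page)
{
	atomic_fetch_sub(&page->count, 1);
}

/* Freeze: swing the count from the expected value to 0, or fail. */
static bool toy_freeze_refs(struct toy_page *page, int expected)
{
	return atomic_compare_exchange_strong(&page->count, &expected, 0);
}

/* Unfreeze: restore the count once the replacement is done. */
static void toy_unfreeze_refs(struct toy_page *page, int count)
{
	atomic_store(&page->count, count);
}

/*
 * Shape of the patched fault path. Returns true if a reference was
 * held across the (omitted) wait, false if the page was frozen and
 * the fault should simply be retried.
 */
static bool toy_migration_entry_wait(struct toy_page *page)
{
	if (!toy_get_page_unless_zero(page))
		return false;
	/* the kernel would call wait_on_page_locked() here */
	toy_put_page(page);
	return true;
}
```

With the count at 2, toy_migration_entry_wait() succeeds and leaves the count at 2. After toy_freeze_refs(&page, 2) it returns false and the count stays at 0, which is the case where a plain get_page() would have corrupted the frozen count.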