Date: Tue, 27 Aug 2013 12:37:18 +0400
From: Cyrill Gorcunov
To: Dave Jones
Cc: Hugh Dickins, Linus Torvalds, Hillf Danton, Linux-MM, Linux Kernel, Andrew Morton, Pavel Emelyanov
Subject: Re: unused swap offset / bad page map.
Message-ID: <20130827083718.GC7416@moon>
References: <20130823032127.GA5098@redhat.com> <20130823035344.GB5098@redhat.com> <20130826190757.GB27768@redhat.com> <20130826222833.GA24320@redhat.com>
In-Reply-To: <20130826222833.GA24320@redhat.com>

On Mon, Aug 26, 2013 at 06:28:33PM -0400, Dave Jones wrote:
> >
> > I've not tried matching up bits with Dave's reports, and just going
> > into a meeting now, but this patch looks worth a try: probably Cyrill
> > can improve it meanwhile to what he actually wants there (I'm
> > surprised anything special is needed for just moving a pte).
> >
> > Hugh
> >
> > --- 3.11-rc7/mm/mremap.c	2013-07-14 17:10:16.640003652 -0700
> > +++ linux/mm/mremap.c	2013-08-26 14:46:14.460027627 -0700
> > @@ -126,7 +126,7 @@ static void move_ptes(struct vm_area_str
> >  			continue;
> >  		pte = ptep_get_and_clear(mm, old_addr, old_pte);
> >  		pte = move_pte(pte, new_vma->vm_page_prot, old_addr, new_addr);
> > -		set_pte_at(mm, new_addr, new_pte, pte_mksoft_dirty(pte));
> > +		set_pte_at(mm, new_addr, new_pte, pte);
> >  	}
>
> I'll give this a shot once I'm done with the bisect.

I managed to trigger the issue as well.
The patch below fixes it. Dave, could you please give it a shot once time
permits? Pavel, I kept the 'make it dirty on move' logic, but I have some
doubts about it: won't plain pte copying (as in Hugh's patch) work for us?
---
From: Cyrill Gorcunov
Subject: [PATCH] mm: move_ptes -- Set soft dirty bit depending on pte type

Dave reported corrupted swap entries

 | [ 4588.541886] swap_free: Unused swap offset entry 00002d15
 | [ 4588.541952] BUG: Bad page map in process trinity-kid12 pte:005a2a80 pmd:22c01f067

and Hugh pointed out that in move_ptes the _PAGE_SOFT_DIRTY bit is set
regardless of the type of entry the pte holds. The trick here is that
when we carry soft dirty status in swap entries we must use
_PAGE_SWP_SOFT_DIRTY instead, because this is the only place in the pte
which can be used for our own needs without intersecting with the bits
owned by the swap entry type/offset.

Reported-by: Dave Jones
Signed-off-by: Cyrill Gorcunov
Cc: Pavel Emelyanov
Cc: Linus Torvalds
Cc: Hugh Dickins
Cc: Hillf Danton
Cc: Andrew Morton
---
 mm/mremap.c | 21 ++++++++++++++++++++-
 1 file changed, 20 insertions(+), 1 deletion(-)

Index: linux-2.6.git/mm/mremap.c
===================================================================
--- linux-2.6.git.orig/mm/mremap.c
+++ linux-2.6.git/mm/mremap.c
@@ -15,6 +15,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -69,6 +70,23 @@ static pmd_t *alloc_new_pmd(struct mm_st
 	return pmd;
 }
 
+static pte_t move_soft_dirty_pte(pte_t pte)
+{
+	/*
+	 * Set soft dirty bit so we can notice
+	 * in userspace the ptes were moved.
+	 */
+#ifdef CONFIG_MEM_SOFT_DIRTY
+	if (pte_present(pte))
+		pte = pte_mksoft_dirty(pte);
+	else if (is_swap_pte(pte))
+		pte = pte_swp_mksoft_dirty(pte);
+	else if (pte_file(pte))
+		pte = pte_file_mksoft_dirty(pte);
+#endif
+	return pte;
+}
+
 static void move_ptes(struct vm_area_struct *vma, pmd_t *old_pmd,
 		unsigned long old_addr, unsigned long old_end,
 		struct vm_area_struct *new_vma, pmd_t *new_pmd,
@@ -126,7 +144,8 @@ static void move_ptes(struct vm_area_str
 			continue;
 		pte = ptep_get_and_clear(mm, old_addr, old_pte);
 		pte = move_pte(pte, new_vma->vm_page_prot, old_addr, new_addr);
-		set_pte_at(mm, new_addr, new_pte, pte_mksoft_dirty(pte));
+		pte = move_soft_dirty_pte(pte);
+		set_pte_at(mm, new_addr, new_pte, pte);
 	}
 
 	arch_leave_lazy_mmu_mode();
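
For illustration only, here is a userspace toy model of why the bit choice matters for swap entries. It is a rough sketch, not kernel code: the bit positions, the toy_* helpers, and the sample offset 0x2d11 are all assumptions made up for the example, and the real layout is arch-specific. It shows that OR-ing a soft-dirty bit that lies inside the swap offset field changes the decoded offset, while a bit outside the type/offset fields leaves the entry intact.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed, illustrative bit layout (hypothetical; real values are arch-specific). */
#define SWP_TYPE_SHIFT     1            /* swap type sits just above the present bit */
#define SWP_TYPE_BITS      5
#define SWP_OFFSET_SHIFT   9            /* swap offset occupies bits 9 and up */
#define SOFT_DIRTY_PRESENT (1ULL << 11) /* soft-dirty bit meant for present ptes */
#define SOFT_DIRTY_SWP     (1ULL << 7)  /* soft-dirty bit set aside for swap ptes */

typedef uint64_t toy_pte_t;

/* Encode a swap entry (type, offset) into a non-present toy pte. */
static toy_pte_t toy_mk_swap_pte(unsigned type, uint64_t offset)
{
	return ((toy_pte_t)type << SWP_TYPE_SHIFT) | (offset << SWP_OFFSET_SHIFT);
}

/* Decode the swap offset from a toy pte. */
static uint64_t toy_swp_offset(toy_pte_t pte)
{
	return pte >> SWP_OFFSET_SHIFT;
}

/* Decode the swap type from a toy pte. */
static unsigned toy_swp_type(toy_pte_t pte)
{
	return (unsigned)((pte >> SWP_TYPE_SHIFT) & ((1u << SWP_TYPE_BITS) - 1));
}

/* Nonzero if OR-ing `bit` into the swap pte preserved both type and offset. */
static int toy_mark_keeps_entry(unsigned type, uint64_t offset, toy_pte_t bit)
{
	toy_pte_t pte = toy_mk_swap_pte(type, offset) | bit;

	return toy_swp_type(pte) == type && toy_swp_offset(pte) == offset;
}

/* Self-check exercising both cases with a hypothetical offset. */
static void toy_self_check(void)
{
	/* The swap-specific bit lies outside type/offset: entry survives. */
	assert(toy_mark_keeps_entry(0, 0x2d11, SOFT_DIRTY_SWP));
	/* The present-pte bit lands inside the offset field: entry is corrupted. */
	assert(!toy_mark_keeps_entry(0, 0x2d11, SOFT_DIRTY_PRESENT));
}
```

In this model, calling toy_self_check() from a main() passes both assertions, which mirrors the reasoning in the changelog: a swap pte needs its own soft-dirty bit that does not overlap the type/offset encoding.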