Date: Thu, 15 Aug 2013 17:14:16 +0200
From: Michal Hocko
To: Linus Torvalds
Cc: Ben Tebulin, Mel Gorman, Johannes Weiner, Balbir Singh, KAMEZAWA Hiroyuki, linux-mm, Rik van Riel, Andrew Morton, LKML, Peter Zijlstra
Subject: Re: [Bug] Reproducible data corruption on i5-3340M: Please revert 53a59fc67!
Message-ID: <20130815151416.GF27864@dhcp22.suse.cz>
References: <520BB225.8030807@gmail.com> <20130814174039.GA24033@dhcp22.suse.cz> <20130814182756.GD24033@dhcp22.suse.cz> <520C9E78.2020401@gmail.com> <20130815134031.GC27864@dhcp22.suse.cz> <20130815144600.GD27864@dhcp22.suse.cz> <20130815145332.GE27864@dhcp22.suse.cz>
In-Reply-To: <20130815145332.GE27864@dhcp22.suse.cz>

On Thu 15-08-13 16:53:32, Michal Hocko wrote:
> On Thu 15-08-13 16:46:00, Michal Hocko wrote:
> > On Thu 15-08-13 15:40:31, Michal Hocko wrote:
> > > On Thu 15-08-13 05:02:31, Linus Torvalds wrote:
> > > > On Thu, Aug 15, 2013 at 2:25 AM, Ben Tebulin wrote:
> > > > >
> > > > > I just cherry-picked e6c495a96ce0 into 3.9.11 and 3.7.10.
> > > > > Unfortunately this does _not resolve_ my issue (too good to be true) :-(
> > > >
> > > > Ho humm. I've found at least one other bug, but that one only affects
> > > > hugepages. Do you perhaps have transparent hugepages enabled? But even
> > > > then it looks quite unlikely.
> > >
> > > __unmap_hugepage_range is hugetlb not THP if you had that one in mind.
> > > And yes, it doesn't set the range which sounds buggy.
> >
> > Or, did you mean tlb_remove_page called from zap_huge_pmd? That one
> > should be safe as tlb_remove_pmd_tlb_entry sets need_flush and that
> > means that the full range is flushed.
>
> Dohh... But we need need_flush_all and that is not set here. So this
> really looks buggy.

This is a really dumb attempt to fix this but maybe it is worth trying
to confirm we are really seeing this problem. It still flushes too much
potentially but I am not sure how to find out the proper start... Will
think about it more.
---
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index a92012a..a16f452 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1381,7 +1381,11 @@ int zap_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
 			VM_BUG_ON(!PageHead(page));
 			tlb->mm->nr_ptes--;
 			spin_unlock(&tlb->mm->page_table_lock);
-			tlb_remove_page(tlb, page);
+			if (!__tlb_remove_page(tlb, page)) {
+				tlb->start = 0;
+				tlb->end = addr + HPAGE_SIZE;
+				tlb_flush_mmu(tlb);
+			}
 		}
 		pte_free(tlb->mm, pgtable);
 		ret = 1;
-- 
Michal Hocko
SUSE Labs
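
For anyone who wants to reason about the hunk above outside the kernel
tree, here is a rough userspace sketch of the same control flow. It is a
toy model, not kernel code: struct toy_gather, TOY_BATCH_MAX,
TOY_HPAGE_SIZE and all toy_* helpers are made up for illustration, and
the only things taken from the patch are the "queue returned 0, so set
start/end and flush" shape and the deliberately over-wide start of 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_BATCH_MAX  4             /* hypothetical batch capacity */
#define TOY_HPAGE_SIZE (2UL << 20)   /* assumed 2 MiB huge page */

/* Toy stand-in for struct mmu_gather; NOT the kernel's layout. */
struct toy_gather {
	unsigned long start, end;                 /* range to flush next */
	unsigned long flushed_start, flushed_end; /* last range flushed */
	size_t nr;                                /* pages queued so far */
	unsigned int flushes;                     /* number of flushes */
};

/* Queue one page; return the free slots left, 0 meaning "flush now". */
static size_t toy_remove_page(struct toy_gather *tlb)
{
	assert(tlb->nr < TOY_BATCH_MAX);
	tlb->nr++;
	return TOY_BATCH_MAX - tlb->nr;
}

/* Record the range that was flushed and empty the batch. */
static void toy_flush(struct toy_gather *tlb)
{
	tlb->flushed_start = tlb->start;
	tlb->flushed_end = tlb->end;
	tlb->flushes++;
	tlb->nr = 0;
}

/* Same shape as the patched branch: set the range before a forced flush. */
static void toy_zap_huge(struct toy_gather *tlb, unsigned long addr)
{
	if (!toy_remove_page(tlb)) {
		tlb->start = 0;                     /* over-wide, as in the patch */
		tlb->end = addr + TOY_HPAGE_SIZE;
		toy_flush(tlb);
	}
}

/* Did the last flush cover the huge page starting at addr? */
static bool toy_covers(const struct toy_gather *tlb, unsigned long addr)
{
	return tlb->flushed_start <= addr &&
	       addr + TOY_HPAGE_SIZE <= tlb->flushed_end;
}
```

Calling toy_zap_huge() TOY_BATCH_MAX times on consecutive huge pages
triggers exactly one flush in this model, and toy_covers() then reports
whether the page that filled the batch fell inside the flushed range.
Narrowing start from 0 to the real beginning of the gathered range is
the part the mail leaves open.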