Date: Tue, 1 Aug 2017 21:33:41 +0200
From: Andrea Arcangeli
To: Minchan Kim
Cc: Andrew Morton, linux-kernel@vger.kernel.org, linux-mm@kvack.org, kernel-team, Nadav Amit, Mel Gorman, Hugh Dickins
Subject: Re: [PATCH v2 4/4] mm: fix KSM data corruption
Message-ID: <20170801193341.GA24406@redhat.com>

Hello,

On Tue, Aug 01, 2017 at 02:56:17PM +0900, Minchan Kim wrote:
> CPU0              CPU1              CPU2              CPU3
> ----              ----              ----              ----
> Write the same
> value on page
>
> [cache PTE as
>  dirty in TLB]
>
>                   MADV_FREE
>                   pte_mkclean()
>
>                                     4 > clear_refs
>                                     pte_wrprotect()
>
>                                                       write_protect_page()
>                                                       [ success, no flush ]
>
>                                                       pages_indentical()
>                                                       [ ok ]
>
> Write to page
> different value
>
> [Ok, using stale
>  PTE]
>
>                                                       replace_page()
>
> Later, CPU1, CPU2 and CPU3 would flush the TLB, but that is too
> late. CPU0 already wrote on the page, but KSM ignored this write, and it
> got lost.
> "
>
> In the above scenario, MADV_FREE is fixed by changing the TLB batching API,
> including [set|clear]_tlb_flush_pending. What remains is the soft-dirty part.
>
> This patch changes soft-dirty to use the TLB batching API instead of
> flush_tlb_mm, and makes KSM check for a pending TLB flush with
> mm_tlb_flush_pending, so that it flushes the TLB to avoid data loss if
> other parallel threads have a TLB flush pending.
>
> [1] http://lkml.kernel.org/r/BD3A0EBE-ECF4-41D4-87FA-C755EA9AB6BD@gmail.com
>
> Note:
> I failed to reproduce this problem with Nadav's test program, which needs
> its timing tuned to my system's speed, so I could not confirm that the
> patch works.
> Nadav, could you test this patch on your test machine?

Reviewed-by: Andrea Arcangeli
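
For readers following the interleaving in the quoted diagram, the race can be sketched as a toy user-space model. This is not kernel code: `ToyMM`, `run_scenario` and `check_pending` are hypothetical names made up for illustration, the per-CPU TLB is reduced to a dictionary of cached PTE flags, and a write fault is only reported, not handled. The `check_pending` switch stands in for the mm_tlb_flush_pending check described in the quoted patch text.

```python
class ToyMM:
    """Toy model (assumption, not kernel code): one page in one mm."""

    def __init__(self, value):
        self.page = value                      # contents of the page
        self.pte = {"write": True, "dirty": False}
        self.tlb = {}                          # cpu -> cached copy of PTE flags
        self.flush_pending = 0                 # stands in for a pending-flush counter

    def flush_tlb(self):
        self.tlb.clear()

    def cpu_write(self, cpu, value):
        """Model a store; returns 'tlb', 'walk' or 'fault'."""
        cached = self.tlb.get(cpu)
        if cached and cached["write"] and cached["dirty"]:
            self.page = value                  # stale entry is trusted, PTE not consulted
            return "tlb"
        if not self.pte["write"]:
            return "fault"                     # the kernel would get to see this write
        self.pte["dirty"] = True
        self.tlb[cpu] = dict(self.pte)
        self.page = value
        return "walk"


def run_scenario(check_pending):
    mm = ToyMM(value=1)
    mm.cpu_write(0, 1)                         # CPU0: same value, PTE cached as dirty

    mm.pte["dirty"] = False                    # CPU1: MADV_FREE, pte_mkclean()
    mm.flush_pending += 1                      #       flush is batched (deferred)

    mm.pte["write"] = False                    # CPU2: clear_refs, pte_wrprotect()
    mm.flush_pending += 1                      #       flush is batched (deferred)

    # CPU3: write_protect_page() flushes only if the PTE looks writable or dirty,
    # or, with the check enabled, if another thread still owes a flush.
    need_flush = mm.pte["write"] or mm.pte["dirty"]
    if check_pending:
        need_flush = need_flush or mm.flush_pending > 0
    if need_flush:
        mm.flush_tlb()
    mm.pte["write"] = False
    snapshot = mm.page                         # content the page comparison sees

    how = mm.cpu_write(0, 2)                   # CPU0: write a different value

    mm.page = snapshot                         # CPU3: replace_page() maps the merged copy

    mm.flush_tlb()                             # deferred flushes finally run
    mm.flush_pending = 0

    lost = how == "tlb" and mm.page != 2       # store went through but is gone
    return how, lost


if __name__ == "__main__":
    print(run_scenario(check_pending=False))
    print(run_scenario(check_pending=True))
```

In this model, the run without the check lets CPU0's second store go through the stale TLB entry and then disappear when the page is replaced. With the check, the stale entry is flushed first, so the store hits the write-protected PTE and is reported as a fault instead of being silently dropped.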