Date: Wed, 8 Jul 2020 09:00:00 +0100
From: Will Deacon
To: Yang Shi
Cc: hannes@cmpxchg.org, catalin.marinas@arm.com, will.deacon@arm.com,
	akpm@linux-foundation.org, xuyu@linux.alibaba.com, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org
Subject: Re: [RFC PATCH] mm: avoid access flag update TLB flush for retried page fault
Message-ID: <20200708075959.GA25498@willie-the-truck>
References: <1594148072-91273-1-git-send-email-yang.shi@linux.alibaba.com>
In-Reply-To: <1594148072-91273-1-git-send-email-yang.shi@linux.alibaba.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Jul 08, 2020 at 02:54:32AM +0800, Yang Shi wrote:
> Recently we found regression when running will_it_scale/page_fault3 test
> on ARM64. Over 70% down for the multi processes cases and over 20% down
> for the multi threads cases. It turns out the regression is caused by commit
> 89b15332af7c0312a41e50846819ca6613b58b4c ("mm: drop mmap_sem before
> calling balance_dirty_pages() in write fault").
>
> The test mmaps a memory size file then write to the mapping, this would
> make all memory dirty and trigger dirty pages throttle, that upstream
> commit would release mmap_sem then retry the page fault. The retried
> page fault would see correct PTEs installed by the first try then update
> access flags and flush TLBs. The regression is caused by the excessive
> TLB flush. It is fine on x86 since x86 doesn't need flush TLB for
> access flag update.
>
> The page fault would be retried due to:
> 1. Waiting for page readahead
> 2. Waiting for page swapped in
> 3. Waiting for dirty pages throttling
>
> The first two cases don't have PTEs set up at all, so the retried page
> fault would install the PTEs, so they don't reach there. But the #3
> case usually has PTEs installed, the retried page fault would reach the
> access flag update. But it seems not necessary to update access flags
> for #3 since retried page fault is not real "second access", so it
> sounds safe to skip access flag update for retried page fault.
>
> With this fix the test result get back to normal.
>
> Reported-by: Xu Yu
> Debugged-by: Xu Yu
> Tested-by: Xu Yu
> Signed-off-by: Yang Shi
> ---
> I'm not sure if this is safe for non-x86 machines, we did some tests on arm64, but
> there may be still corner cases not covered.
>
>  mm/memory.c | 7 ++++++-
>  1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/mm/memory.c b/mm/memory.c
> index 87ec87c..3d4e671 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -4241,8 +4241,13 @@ static vm_fault_t handle_pte_fault(struct vm_fault *vmf)
>  	if (vmf->flags & FAULT_FLAG_WRITE) {
>  		if (!pte_write(entry))
>  			return do_wp_page(vmf);
> -		entry = pte_mkdirty(entry);
>  	}
> +
> +	if ((vmf->flags & FAULT_FLAG_WRITE) && !(vmf->flags & FAULT_FLAG_TRIED))
> +		entry = pte_mkdirty(entry);
> +	else if (vmf->flags & FAULT_FLAG_TRIED)
> +		goto unlock;
> +

Can you rewrite this as:

	if (vmf->flags & FAULT_FLAG_TRIED)
		goto unlock;

	if (vmf->flags & FAULT_FLAG_WRITE)
		entry = pte_mkdirty(entry);

? (I'm half-asleep this morning and there are people screaming and shouting
outside my window, so this might be rubbish)

If you _can_ make that change, then I don't understand why the existing
pte_mkdirty() line needs to move at all. Couldn't you just add:

	if (vmf->flags & FAULT_FLAG_TRIED)
		goto unlock;

after the existing "vmf->flags & FAULT_FLAG_WRITE" block?

Will
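
For anyone wanting to sanity-check the question above, the three forms
discussed in this mail can be compared in a small userspace model. This is
only a sketch under stated assumptions, not kernel code: the flag values are
arbitrary stand-ins, struct outcome and every function name are hypothetical,
the do_wp_page() branch is left out (the PTE is assumed to be writable
already), and "entry" is assumed to be a local copy whose dirtying has no
effect once the "goto unlock" path is taken.

```c
#include <stdbool.h>

/* Illustrative stand-ins for the kernel's fault flags; values are arbitrary. */
#define FAULT_FLAG_WRITE 0x1u
#define FAULT_FLAG_TRIED 0x2u

/* Hypothetical summary of what the rest of the function would go on to do. */
struct outcome {
	bool skip_update;	/* the "goto unlock" path was taken */
	bool dirty_applied;	/* a dirtied entry reaches the access flag update */
};

/* Form A: the logic as posted in the RFC patch. */
static struct outcome patch_as_posted(unsigned int flags)
{
	bool dirty = false, skip = false;

	if ((flags & FAULT_FLAG_WRITE) && !(flags & FAULT_FLAG_TRIED))
		dirty = true;
	else if (flags & FAULT_FLAG_TRIED)
		skip = true;

	return (struct outcome){ skip, dirty && !skip };
}

/* Form B: test TRIED first, then WRITE. */
static struct outcome tried_first(unsigned int flags)
{
	if (flags & FAULT_FLAG_TRIED)
		return (struct outcome){ true, false };

	return (struct outcome){ false, (flags & FAULT_FLAG_WRITE) != 0 };
}

/* Form C: keep the existing WRITE block, add the TRIED check after it. */
static struct outcome check_after_write_block(unsigned int flags)
{
	bool dirty = false;

	if (flags & FAULT_FLAG_WRITE)
		dirty = true;	/* models pte_mkdirty() on the local entry */

	if (flags & FAULT_FLAG_TRIED)
		return (struct outcome){ true, false };	/* entry is discarded */

	return (struct outcome){ false, dirty };
}

static bool same(struct outcome a, struct outcome b)
{
	return a.skip_update == b.skip_update &&
	       a.dirty_applied == b.dirty_applied;
}

/* Number of WRITE/TRIED combinations on which the three forms disagree. */
int count_mismatches(void)
{
	int mismatches = 0;

	for (unsigned int flags = 0; flags < 4; flags++) {
		struct outcome a = patch_as_posted(flags);

		if (!same(a, tried_first(flags)) ||
		    !same(a, check_after_write_block(flags)))
			mismatches++;
	}
	return mismatches;
}
```

Calling count_mismatches() from a small main() shows whether the forms agree
on all four WRITE/TRIED combinations within this model. It says nothing about
whether skipping the access flag update is safe on a given architecture.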