Message-ID: <1503954877.4850.19.camel@kernel.crashing.org>
Subject: Re: [PATCH v2 14/20] mm: Provide speculative fault infrastructure
From: Benjamin Herrenschmidt
To: Peter Zijlstra, "Kirill A. Shutemov"
Cc: Laurent Dufour, paulmck@linux.vnet.ibm.com, akpm@linux-foundation.org,
 ak@linux.intel.com, mhocko@kernel.org, dave@stgolabs.net, jack@suse.cz,
 Matthew Wilcox, mpe@ellerman.id.au, paulus@samba.org, Thomas Gleixner,
 Ingo Molnar, hpa@zytor.com, Will Deacon, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, haren@linux.vnet.ibm.com, khandual@linux.vnet.ibm.com,
 npiggin@gmail.com, bsingharora@gmail.com, Tim Chen,
 linuxppc-dev@lists.ozlabs.org, x86@kernel.org
Date: Tue, 29 Aug 2017 07:14:37 +1000
In-Reply-To: <20170828093727.5wldedputadanssh@hirez.programming.kicks-ass.net>
References: <1503007519-26777-1-git-send-email-ldufour@linux.vnet.ibm.com>
 <1503007519-26777-15-git-send-email-ldufour@linux.vnet.ibm.com>
 <20170827001823.n5wgkfq36z6snvf2@node.shutemov.name>
 <20170828093727.5wldedputadanssh@hirez.programming.kicks-ass.net>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, 2017-08-28 at 11:37 +0200, Peter Zijlstra wrote:
> > Doing all this job and just give up because we cannot allocate page tables
> > looks very wasteful to me.
> >
> > Have you considered to look how we can hand over from speculative to
> > non-speculative path without starting from scratch (when possible)?
>
> So we _can_ in fact allocate and install page-tables, but we have to be
> very careful about it. The interesting case is where we race with
> free_pgtables() and install a page that was just taken out.
>
> But since we already have the VMA I think we can do something like:

That makes me extremely nervous... there could be all sorts of assumptions,
esp. in arch code, about the fact that we never populate the tree without
the mm sem.

We'd have to audit archs closely. Things like the page walk cache
flushing on power etc...

I don't mind the "retry"... we've brought stuff into the L1 cache already,
which I would expect to be the bulk of the overhead, and the allocation
case isn't that common.

Do we have numbers to show how detrimental this is today?

Cheers,
Ben.
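
To make the "retry" fallback discussed above concrete, here is a minimal
user-space sketch, not kernel code. It assumes a speculative handler that
gives up whenever a page table would have to be allocated, and a caller that
then redoes the fault on the classic path, which stands in for the handler
running with the mm sem held. All names (toy_mm, speculative_fault,
locked_fault, handle_fault, FAULT_DONE, FAULT_RETRY) are hypothetical and do
not come from the patch series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical outcome codes for this sketch only. */
enum fault_result { FAULT_DONE, FAULT_RETRY };

/*
 * Toy stand-in for an address space: it records only whether the page
 * table covering the faulting address exists, and how often each path ran.
 */
struct toy_mm {
    bool pgtable_present;
    int speculative_attempts;
    int locked_attempts;
};

/*
 * Speculative path: models a handler running without the mm sem. It never
 * populates the page-table tree; if a table is missing it asks for a retry.
 */
static enum fault_result speculative_fault(struct toy_mm *mm)
{
    assert(mm != NULL);
    mm->speculative_attempts++;
    if (!mm->pgtable_present)
        return FAULT_RETRY;
    return FAULT_DONE;
}

/*
 * Classic path: models the handler running with the mm sem held, where
 * allocating and installing a page table is allowed.
 */
static enum fault_result locked_fault(struct toy_mm *mm)
{
    assert(mm != NULL);
    mm->locked_attempts++;
    mm->pgtable_present = true;
    return FAULT_DONE;
}

/* Try the speculative path first; start over on the classic path on retry. */
enum fault_result handle_fault(struct toy_mm *mm)
{
    if (speculative_fault(mm) == FAULT_DONE)
        return FAULT_DONE;
    return locked_fault(mm);
}
```

In this model only the first fault on an unpopulated range pays for both
paths; later faults on the same range complete speculatively. How much the
duplicated work costs on real hardware is the open question about numbers
raised above.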