Date: Thu, 13 Sep 2018 12:57:38 +0200
From: Peter Zijlstra
To: Martin Schwidefsky
Cc: will.deacon@arm.com, aneesh.kumar@linux.vnet.ibm.com,
    akpm@linux-foundation.org, npiggin@gmail.com, linux-arch@vger.kernel.org,
    linux-mm@kvack.org, linux-kernel@vger.kernel.org, linux@armlinux.org.uk,
    heiko.carstens@de.ibm.com
Subject: Re: [RFC][PATCH 01/11] asm-generic/tlb: Provide a comment
Message-ID: <20180913105738.GW24124@hirez.programming.kicks-ass.net>
References: <20180913092110.817204997@infradead.org>
    <20180913092811.894806629@infradead.org>
    <20180913123014.0d9321b8@mschwideX1>
In-Reply-To: <20180913123014.0d9321b8@mschwideX1>
User-Agent: Mutt/1.10.0 (2018-05-17)
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Sep 13, 2018 at 12:30:14PM +0200, Martin Schwidefsky wrote:
> > + * The mmu_gather data structure is used by the mm code to implement the
> > + * correct and efficient ordering of freeing pages and TLB invalidations.
> > + *
> > + * This correct ordering is:
> > + *
> > + * 1) unhook page
> > + * 2) TLB invalidate page
> > + * 3) free page
> > + *
> > + * That is, we must never free a page before we have ensured there are no live
> > + * translations left to it. Otherwise it might be possible to observe (or
> > + * worse, change) the page content after it has been reused.
> > + *
>
> This first comment already includes the reason why s390 is probably better off
> with its own mmu-gather implementation. It depends on the situation if we have
>
> 1) unhook the page and do a TLB flush at the same time
> 2) free page
>
> or
>
> 1) unhook page
> 2) free page
> 3) final TLB flush of the whole mm

that's the fullmm case, right?

> A variant of the second order we had in the past is to do the mm TLB flush first,
> then the unhooks and frees of the individual pages. There are some tricky corners
> switching between the two variants, see finish_arch_post_lock_switch.
>
> The point is: we *never* have the order 1) unhook, 2) TLB invalidate, 3) free.
> If there is concurrency due to a multi-threaded application we have to do the
> unhook of the page-table entry and the TLB flush with a single instruction.

You can still get the thing you want if for !fullmm you have a no-op
tlb_flush() implementation, assuming your arch page-table frobbing thing
has the required TLB flush in.

Note that that's not utterly unlike how the PowerPC/Sparc hash things
work, they clear and invalidate entries differently from others and don't
use the mmu_gather tlb-flush.
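
To make the no-op tlb_flush() idea concrete, here is a rough userspace toy
model. It is only a sketch and not kernel code: every toy_* name, the
struct layouts and NR_PAGES are made up for illustration, and it assumes
the generic "flush, then free" order at the end of a gather. It does not
model the s390 fullmm order quoted above, where the free comes before the
final flush. The point it tries to show is that when the arch page-table
operation already invalidates the TLB entry, a no-op flush for !fullmm
still never frees a page that has a live translation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define NR_PAGES 4	/* hypothetical size of the toy address space */

/* Toy address space; all fields are illustrative only. */
struct toy_mm {
	bool mapped[NR_PAGES];	/* page-table entry still hooked up */
	bool in_tlb[NR_PAGES];	/* a live translation may still exist */
	bool freed[NR_PAGES];	/* page was handed back to the allocator */
	int flushes;		/* TLB invalidation operations issued */
	int violations;		/* pages freed with a live translation */
};

/* Toy stand-in for a gather structure. */
struct toy_gather {
	struct toy_mm *mm;
	bool fullmm;		/* the whole address space is going away */
	size_t pending[NR_PAGES];
	size_t nr;
};

static void toy_mm_init(struct toy_mm *mm)
{
	memset(mm, 0, sizeof(*mm));
	for (size_t i = 0; i < NR_PAGES; i++)
		mm->mapped[i] = mm->in_tlb[i] = true;
}

/* Unhook and invalidate as one step ("a single instruction"). */
static void toy_ptep_clear_flush(struct toy_mm *mm, size_t page)
{
	mm->mapped[page] = false;
	mm->in_tlb[page] = false;
	mm->flushes++;
}

/* Unhook only; the translation may linger until a later flush. */
static void toy_ptep_clear(struct toy_mm *mm, size_t page)
{
	mm->mapped[page] = false;
}

/* No-op for !fullmm; a single whole-mm flush for fullmm. */
static void toy_tlb_flush(struct toy_gather *tlb)
{
	if (!tlb->fullmm)
		return;
	for (size_t i = 0; i < NR_PAGES; i++)
		tlb->mm->in_tlb[i] = false;
	tlb->mm->flushes++;
}

static void toy_free_page(struct toy_mm *mm, size_t page)
{
	if (mm->in_tlb[page])
		mm->violations++;	/* freed while still translatable */
	mm->freed[page] = true;
}

static void toy_unmap_page(struct toy_gather *tlb, size_t page)
{
	assert(tlb->nr < NR_PAGES);
	if (tlb->fullmm)
		toy_ptep_clear(tlb->mm, page);
	else
		toy_ptep_clear_flush(tlb->mm, page);
	tlb->pending[tlb->nr++] = page;
}

/* Generic order at the end of a gather: flush first, then free. */
static void toy_tlb_finish(struct toy_gather *tlb)
{
	toy_tlb_flush(tlb);
	for (size_t i = 0; i < tlb->nr; i++)
		toy_free_page(tlb->mm, tlb->pending[i]);
	tlb->nr = 0;
}

/* Reset the mm, unmap every page, and return the number of violations. */
int toy_teardown(struct toy_mm *mm, bool fullmm)
{
	struct toy_gather tlb = { .mm = mm, .fullmm = fullmm };

	toy_mm_init(mm);
	for (size_t page = 0; page < NR_PAGES; page++)
		toy_unmap_page(&tlb, page);
	toy_tlb_finish(&tlb);
	return mm->violations;
}
```

Calling toy_teardown(&mm, false) issues one invalidation per page and the
flush hook does nothing; toy_teardown(&mm, true) issues a single whole-mm
flush. Neither path frees a page while in_tlb is still set.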