Date: Wed, 31 Oct 2018 10:29:42 +0100
From: Peter Zijlstra
To: Matthew Wilcox
Cc: Andy Lutomirski, nadav.amit@gmail.com, Kees Cook, Igor Stoppa,
    Mimi Zohar, Dave Chinner, James Morris, Michal Hocko,
    Kernel Hardening, linux-integrity, linux-security-module,
    Igor Stoppa, Dave Hansen, Jonathan Corbet, Laura Abbott,
    Randy Dunlap, Mike Rapoport, "open list:DOCUMENTATION", LKML,
    Thomas Gleixner
Subject: Re: [PATCH 10/17] prmem: documentation
Message-ID: <20181031092942.GJ744@hirez.programming.kicks-ass.net>
In-Reply-To: <20181030212551.GD10491@bombadil.infradead.org>

On Tue, Oct 30, 2018 at 02:25:51PM -0700, Matthew Wilcox wrote:
> On Tue, Oct 30, 2018 at 11:51:17AM -0700, Andy Lutomirski wrote:
> > Finally, one issue: rare_alloc() is going to utterly suck
> > performance-wise due to the global IPI when the region gets zapped out
> > of the direct map or otherwise made RO. This is the same issue that
> > makes all existing XPO efforts so painful. We need to either optimize
> > the crap out of it somehow or we need to make sure it’s not called
> > except during rare events like device enumeration.
>
> Batching operations is kind of the whole point of the VM ;-) Either
> this rare memory gets used a lot, in which case we'll want to create slab
> caches for it, make it a MM zone and the whole nine yards, or it's not
> used very much in which case it doesn't matter that performance sucks.

Yes, for the dynamic case something along those lines would be needed.
If we have a single rare zone, we could even have __GFP_RARE or whatever
that manages this.

The page allocator would have to grow a rare memblock type, and every
rare alloc would allocate from a rare memblock; when none is available,
creation of a rare block would set up the mappings etc.

> For now, I'd suggest allocating 2MB chunks as needed, and having a
> shrinker to hand back any unused pieces.

Something like the percpu allocator?
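
To make the chunk idea above concrete, here is a rough userspace sketch,
not kernel code and not anything from the prmem series. It assumes a
bump allocator carving objects out of 2MB chunks, creating a chunk only
when no existing one has room, and a shrink pass that hands back chunks
with no live objects. All names (rare_pool, rare_chunk, rare_alloc,
rare_free, rare_shrink) are made up for illustration, and malloc()
stands in for whatever would really set up the mappings.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define RARE_CHUNK_SIZE (2UL << 20)	/* 2MB chunks, as suggested in the thread */

/* Hypothetical bookkeeping for one chunk; not a kernel structure. */
struct rare_chunk {
	struct rare_chunk *next;
	unsigned char *base;	/* start of the chunk's memory */
	size_t used;		/* bump offset of the next free byte */
	size_t live;		/* objects handed out and not yet freed */
};

struct rare_pool {
	struct rare_chunk *chunks;
	size_t nr_chunks;
};

/* Create a chunk; in a kernel this is where the expensive mapping setup would go. */
static struct rare_chunk *rare_chunk_create(struct rare_pool *pool)
{
	struct rare_chunk *c = calloc(1, sizeof(*c));

	if (!c)
		return NULL;
	c->base = malloc(RARE_CHUNK_SIZE);
	if (!c->base) {
		free(c);
		return NULL;
	}
	c->next = pool->chunks;
	pool->chunks = c;
	pool->nr_chunks++;
	return c;
}

/* Allocate from the first chunk with room; create a chunk only if none has any. */
static void *rare_alloc(struct rare_pool *pool, size_t size)
{
	struct rare_chunk *c;
	void *p;

	if (size == 0 || size > RARE_CHUNK_SIZE)
		return NULL;
	size = (size + 15) & ~(size_t)15;	/* keep objects 16-byte aligned */

	for (c = pool->chunks; c; c = c->next)
		if (RARE_CHUNK_SIZE - c->used >= size)
			break;
	if (!c)
		c = rare_chunk_create(pool);
	if (!c)
		return NULL;

	p = c->base + c->used;
	c->used += size;
	c->live++;
	return p;
}

/* Drop one object; a chunk with no live objects becomes reusable from offset 0. */
static void rare_free(struct rare_pool *pool, void *p)
{
	uintptr_t addr = (uintptr_t)p;
	struct rare_chunk *c;

	for (c = pool->chunks; c; c = c->next) {
		uintptr_t base = (uintptr_t)c->base;

		if (addr >= base && addr < base + RARE_CHUNK_SIZE) {
			if (--c->live == 0)
				c->used = 0;
			return;
		}
	}
}

/* "Shrinker": hand back every chunk with no live objects; returns chunks freed. */
static size_t rare_shrink(struct rare_pool *pool)
{
	struct rare_chunk **pp = &pool->chunks;
	size_t freed = 0;

	while (*pp) {
		struct rare_chunk *c = *pp;

		if (c->live == 0) {
			*pp = c->next;
			free(c->base);
			free(c);
			pool->nr_chunks--;
			freed++;
		} else {
			pp = &c->next;
		}
	}
	return freed;
}
```

The point of the sketch is only that the costly step (chunk creation,
standing in for the mapping change and its IPI) happens once per 2MB
rather than once per object. It ignores locking, per-object reuse within
a chunk, and write protection entirely.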