From: Andy Lutomirski
Date: Thu, 22 Feb 2018 19:01:35 +0000
Subject: Re: Use higher-order pages in vmalloc
To: Michal Hocko
Cc: Matthew Wilcox, Dave Hansen, Konstantin Khlebnikov, LKML,
 Christoph Hellwig, Linux-MM, Andy Lutomirski, Andrew Morton,
 "Kirill A. Shutemov"
In-Reply-To: <20180222133643.GJ30681@dhcp22.suse.cz>
References: <151670492223.658225.4605377710524021456.stgit@buzz>
 <151670493255.658225.2881484505285363395.stgit@buzz>
 <20180221154214.GA4167@bombadil.infradead.org>
 <20180221170129.GB27687@bombadil.infradead.org>
 <20180222065943.GA30681@dhcp22.suse.cz>
 <20180222122254.GA22703@bombadil.infradead.org>
 <20180222133643.GJ30681@dhcp22.suse.cz>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Feb 22, 2018 at 1:36 PM, Michal Hocko wrote:
> On Thu 22-02-18 04:22:54, Matthew Wilcox wrote:
>> On Thu, Feb 22, 2018 at 07:59:43AM +0100, Michal Hocko wrote:
>> > On Wed 21-02-18 09:01:29, Matthew Wilcox wrote:
>> > > Right. It helps with fragmentation if we can keep higher-order
>> > > allocations together.
>> >
>> > Hmm, wouldn't it help if we made vmalloc pages migrateable instead? That
>> > would help the compaction and get us to a lower fragmentation longterm
>> > without playing tricks in the allocation path.
>>
>> I was wondering about that possibility. If we want to migrate a page
>> then we have to shoot down the PTE across all CPUs, copy the data to the
>> new page, and insert the new PTE. Copying 4kB doesn't take long; if you
>> have 12GB/s (current example on Wikipedia: dual-channel memory and one
>> DDR2-800 module per channel gives a theoretical bandwidth of 12.8GB/s)
>> then we should be able to copy a page in 666ns). So there's no problem
>> holding a spinlock for it.
>>
>> But we can't handle a fault in vmalloc space today. It's handled in
>> arch-specific code, see vmalloc_fault() in arch/x86/mm/fault.c
>> If we're going to do this, it'll have to be something arches opt into
>> because I'm not taking on the job of fixing every architecture!
>
> yes.

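One reading of the 666ns figure quoted above (an assumption, the thread
does not spell it out) is that a copy moves the page twice, once as a
read and once as a write, so 4kB becomes 8kB of traffic at 12GB/s. A
rough user-space sketch of that arithmetic, with hypothetical helper
names:

```c
#include <assert.h>

/* Time in nanoseconds to move `bytes` of memory traffic at `bytes_per_sec`. */
static double transfer_time_ns(double bytes, double bytes_per_sec)
{
    assert(bytes_per_sec > 0.0);
    return bytes / bytes_per_sec * 1e9;
}

/*
 * Assumption: copying a page reads the source and writes the destination,
 * so the traffic is twice the page size.
 */
static double page_copy_time_ns(double page_bytes, double bytes_per_sec)
{
    return transfer_time_ns(2.0 * page_bytes, bytes_per_sec);
}
```

With decimal units (4000 bytes, 12e9 bytes/s) this gives roughly 667ns;
with a 4096-byte page at the 12.8GB/s theoretical peak it gives 640ns.
Either way it is well under a microsecond, and real copies will not hit
theoretical bandwidth.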
On x86, if you shoot down the PTE for the current stack, you're dead.
vmalloc_fault() might not even be called.  Instead we hit
do_double_fault(), and the manual warns extremely strongly against
trying to recover, and, in this case, I agree with the SDM.  If you
actually want this to work, there needs to be a special IPI broadcast
to the task in question (with appropriate synchronization) that calls
magic arch code that does the switcheroo.

Didn't someone (Christoph?) have a patch to teach the page allocator
to give high-order allocations if available and otherwise fall back to
low order?
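
The "high order if available, otherwise fall back to low order" idea
can be sketched in user space roughly as follows. This is only a toy
model, not the patch being asked about and not kernel code: struct
toy_allocator, TOY_NR_ORDERS and both functions are made up for
illustration, and unlike a real buddy allocator the model never splits
a larger block to satisfy a smaller request.

```c
#include <assert.h>

#define TOY_NR_ORDERS 4 /* hypothetical: models orders 0..3 */

/* Toy model: free_blocks[o] counts free blocks of 2^o contiguous pages. */
struct toy_allocator {
    unsigned int free_blocks[TOY_NR_ORDERS];
};

/* Take one block of exactly this order; returns 1 on success, 0 if none free. */
static int toy_alloc_exact(struct toy_allocator *a, unsigned int order)
{
    assert(order < TOY_NR_ORDERS);
    if (a->free_blocks[order] == 0)
        return 0;
    a->free_blocks[order]--;
    return 1;
}

/*
 * Try the preferred order first, then each lower order in turn.
 * Returns the order actually obtained, or -1 if nothing is free.
 */
static int toy_alloc_best_effort(struct toy_allocator *a,
                                 unsigned int preferred_order)
{
    assert(preferred_order < TOY_NR_ORDERS);
    for (int order = (int)preferred_order; order >= 0; order--) {
        if (toy_alloc_exact(a, (unsigned int)order))
            return order;
    }
    return -1;
}
```

The caller has to look at the returned order to know how many pages it
actually got, and keep asking until the whole request is covered.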