Received: by 10.223.185.116 with SMTP id b49csp2079253wrg; Mon, 12 Feb 2018 04:03:30 -0800 (PST) X-Google-Smtp-Source: AH8x227bu8hatlGS3I1wl9lBTg09O/VSdo0VMEbB4vr53lhLDqv9+vkbtqs3wLOl3SCK65e7CLAw X-Received: by 10.99.107.73 with SMTP id g70mr9132807pgc.38.1518437010643; Mon, 12 Feb 2018 04:03:30 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1518437010; cv=none; d=google.com; s=arc-20160816; b=Yj2BX1V1EskZS013kYt1pO/K/bj9B48lIIPP8zBlIOUG6VjNETkjMTv2sdeKeEjuuI npBVqGOApQpiqgWKYCsPOchrhoZR/3RwGGfFIpyKxFjNk4ML+pN28S7izRrblOwlnaZ5 ZwNlgqAKQcxYQ4U2AOpoj1rZAvhL9CISbADphNV1WlvEXu2EL9JsVo3X1G/3NQ6D38th h+3e8TTNeJpIf34VwUDVF1wPkSWKVwUbZ3hv+Xr57yheMGPNM7MpjNijeRPfDDlzVIMH xZtapb13cYhTkC8jYmaLkxXQX0YooVGhFt90y+lRUtNA8p8XfZLjPJAMXBUF5T67wS0I djuQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:user-agent:in-reply-to :content-transfer-encoding:content-disposition:mime-version :references:message-id:subject:cc:to:from:date :arc-authentication-results; bh=sArsLqS+4WEVC1v7YPPpSJzwvkrW3xLb+y84zKMJT54=; b=QTTV6qH2SNJPHQgpwdOEMcyMzszDYP0buuqxCRusMqbT8KLvXcmhRSqHTc7M1+90ak sKSVfHq7pvSUEV+0ydFWJuoy1t0dSEPZhF/xymub4Z0SrsaBKm+nxIMEp/mg7W/uL8fz ThVBSi3h2ceM4ff/oimOpVsrV1Ju609vO6T0KLGj6wQKVafH+MFzeoRiDicV41UFH2L6 BaNQORpBLqPlwRxv16890pHhFSh1ZCzmP6Pe44BUtjLC25+rkLI73EJOQiUS3idXCVVt UOCp+YWnM16U1Z2ghpTCVe4vqJsuwdJvEP1b85TsII22rN71G7Yfmh8wKP/4JkEX5hcZ gDTg== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id c2-v6si5844338plk.770.2018.02.12.04.03.16; Mon, 12 Feb 2018 04:03:30 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S933391AbeBLJuX (ORCPT + 99 others); Mon, 12 Feb 2018 04:50:23 -0500 Received: from mx2.suse.de ([195.135.220.15]:42941 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S933377AbeBLJuV (ORCPT ); Mon, 12 Feb 2018 04:50:21 -0500 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (charybdis-ext.suse.de [195.135.220.254]) by mx2.suse.de (Postfix) with ESMTP id C59F0AC57; Mon, 12 Feb 2018 09:50:19 +0000 (UTC) Date: Mon, 12 Feb 2018 10:50:19 +0100 From: Michal Hocko To: Matthew Wilcox Cc: Kai Heng Feng , Laura Abbott , linux-mm@kvack.org, Linux Kernel Mailing List , Andrew Morton Subject: Re: Regression after commit 19809c2da28a ("mm, vmalloc: use __GFP_HIGHMEM implicitly") Message-ID: <20180212095019.GX21609@dhcp22.suse.cz> References: <627DA40A-D0F6-41C1-BB5A-55830FBC9800@canonical.com> <20180208130649.GA15846@bombadil.infradead.org> <20180208232004.GA21027@bombadil.infradead.org> <20180211092652.GV21609@dhcp22.suse.cz> <20180211112808.GA4551@bombadil.infradead.org> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: <20180211112808.GA4551@bombadil.infradead.org> User-Agent: Mutt/1.9.2 (2017-12-15) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org [I am crawling over a large backlog after vacation so I will get to other emails in this thread later. Let's just fix the regression first. The patch with the full changelog is at the end of this email. CC Andrew - the original report is http://lkml.kernel.org/r/627DA40A-D0F6-41C1-BB5A-55830FBC9800@canonical.com] On Sun 11-02-18 03:28:08, Matthew Wilcox wrote: > On Sun, Feb 11, 2018 at 10:26:52AM +0100, Michal Hocko wrote: > > On Thu 08-02-18 15:20:04, Matthew Wilcox wrote: > > > ... nevertheless, 19809c2da28a does in fact break vmalloc_32 on 32-bit. Look: > > > > > > #if defined(CONFIG_64BIT) && defined(CONFIG_ZONE_DMA32) > > > #define GFP_VMALLOC32 GFP_DMA32 | GFP_KERNEL > > > #elif defined(CONFIG_64BIT) && defined(CONFIG_ZONE_DMA) > > > #define GFP_VMALLOC32 GFP_DMA | GFP_KERNEL > > > #else > > > #define GFP_VMALLOC32 GFP_KERNEL > > > #endif > > > > > > So we pass in GFP_KERNEL to __vmalloc_node, which calls __vmalloc_node_range > > > which calls __vmalloc_area_node, which ORs in __GFP_HIGHMEM. > > > > Dohh. I have missed this. I was convinced that we always add GFP_DMA32 > > when doing vmalloc_32. Sorry about that. The above definition looks > > quite weird to be honest. First of all do we have any 64b system without > > both DMA and DMA32 zones? If yes, what is the actual semantic of > > vmalloc_32? Or is there any magic forcing GFP_KERNEL into low 32b? > > mmzone.h has the following, which may be inaccurate / out of date: > > * parisc, ia64, sparc <4G > * s390 <2G > * arm Various > * alpha Unlimited or 0-16MB. > * > * i386, x86_64 and multiple other arches > * <16M. > > It claims ZONE_DMA32 is x86-64 only, which is incorrect; it's now used > by arm64, ia64, mips, powerpc, tile. yes, nobody seem to keep this one in sync. > > Also I would expect that __GFP_DMA32 should do the right thing on 32b > > systems. So something like the below should do the trick > > Oh, I see. Because we have: > > #ifdef CONFIG_ZONE_DMA32 > #define OPT_ZONE_DMA32 ZONE_DMA32 > #else > #define OPT_ZONE_DMA32 ZONE_NORMAL > #endif > > we'll end up allocating from ZONE_NORMAL if a non-DMA32 architecture asks > for GFP_DMA32 memory. Thanks; I missed that. yep > I'd recommend this instead then: > > #if defined(CONFIG_64BIT) && !defined(CONFIG_ZONE_DMA32) > #define GFP_VMALLOC32 GFP_DMA | GFP_KERNEL > #else > #define GFP_VMALLOC32 GFP_DMA32 | GFP_KERNEL > #endif > > I think it's clearer than the three-way #if. I do not have a strong opinion here. I just wanted the change to be obvious without meddling with the 64b ifdefs much. Follow up cleanups are certainly possible. > Now, longer-term, perhaps we should do the following: > > #ifdef CONFIG_ZONE_DMA32 > #define OPT_ZONE_DMA32 ZONE_DMA32 > #elif defined(CONFIG_64BIT) > #define OPT_ZONE_DMA OPT_ZONE_DMA > #else > #define OPT_ZONE_DMA32 ZONE_NORMAL > #endif > > Then we wouldn't need the ifdef here and could always use GFP_DMA32 > | GFP_KERNEL. Would need to audit current users and make sure they > wouldn't be broken by such a change. I am pretty sure improvements are possible. > I noticed a mistake in 704b862f9efd; > > - pages = __vmalloc_node(array_size, 1, nested_gfp|__GFP_HIGHMEM, > + pages = __vmalloc_node(array_size, 1, nested_gfp|highmem_mask, > > We should unconditionally use __GFP_HIGHMEM here instead of highmem_mask > because this is where we allocate the array to hold the struct page > pointers. This can be allocated from highmem, and does not need to be > allocated from ZONE_NORMAL. You seem to be right. nested_gfp doesn't include zone modifiers. Care to send a patch? > Similarly, > > - if (gfpflags_allow_blocking(gfp_mask)) > + if (gfpflags_allow_blocking(gfp_mask|highmem_mask)) > > is not needed (it's not *wrong*, it was just an unnecessary change). yes. highmem_mask has no influence on the blocking behavior. The fix for the regressions should be From 301c0acbce9dd80a854bba49c9db40991df0f9e4 Mon Sep 17 00:00:00 2001 From: Michal Hocko Date: Mon, 12 Feb 2018 10:37:19 +0100 Subject: [PATCH] vmalloc: fix __GFP_HIGHMEM usage for vmalloc_32 on 32b systems MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Kai Heng Feng has noticed that BUG_ON(PageHighMem(pg)) triggers in drivers/media/common/saa7146/saa7146_core.c since 19809c2da28a ("mm, vmalloc: use __GFP_HIGHMEM implicitly”). saa7146_vmalloc_build_pgtable uses vmalloc_32 and it is reasonable to expect that the resulting page is not in highmem. The above commit aimed to add __GFP_HIGHMEM only for those requests which do not specify any zone modifier gfp flag. vmalloc_32 relies on GFP_VMALLOC32 which should do the right thing. Except it has been missed that GFP_VMALLOC32 is an alias for GFP_KERNEL on 32b architectures. Thanks to Matthew to notice this. Fix the problem by unconditionally setting GFP_DMA32 in GFP_VMALLOC32 for !64b arches (as a bailout). This should do the right thing and use ZONE_NORMAL which should be always below 4G on 32b systems. Debugged-by: Matthew Wilcox Reported-by: Kai Heng Feng Fixes: 19809c2da28a ("mm, vmalloc: use __GFP_HIGHMEM implicitly”) Cc: stable Signed-off-by: Michal Hocko --- mm/vmalloc.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/mm/vmalloc.c b/mm/vmalloc.c index 673942094328..1d147078c469 100644 --- a/mm/vmalloc.c +++ b/mm/vmalloc.c @@ -1947,7 +1947,11 @@ void *vmalloc_exec(unsigned long size) #elif defined(CONFIG_64BIT) && defined(CONFIG_ZONE_DMA) #define GFP_VMALLOC32 GFP_DMA | GFP_KERNEL #else -#define GFP_VMALLOC32 GFP_KERNEL +/* + * 64b systems should always have either DMA or DMA32 zones. For others + * GFP_DMA32 should do the right thing and use the normal zone. + */ +#define GFP_VMALLOC32 GFP_DMA32 | GFP_KERNEL #endif /** -- 2.15.1 -- Michal Hocko SUSE Labs