From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Michal Hocko,
    Konstantin Khlebnikov, Kyungtae Kim, Vlastimil Babka, Balbir Singh,
    Mel Gorman,
    Pavel Tatashin, Oscar Salvador, Mike Rapoport, Aaron Lu, Joonsoo Kim,
    Byoungyoung Lee, "Dae R. Jeong", Andrew Morton, Linus Torvalds,
    Sasha Levin
Subject: [PATCH 4.14 064/100] mm, page_alloc: check for max order in hot path
Date: Thu, 29 Nov 2018 15:12:34 +0100
Message-Id: <20181129140104.499415132@linuxfoundation.org>
X-Mailer: git-send-email 2.19.2
In-Reply-To: <20181129140058.768942700@linuxfoundation.org>
References: <20181129140058.768942700@linuxfoundation.org>
User-Agent: quilt/0.65
X-stable: review
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

4.14-stable review patch.  If anyone has any objections, please let me know.

------------------

[ Upstream commit c63ae43ba53bc432b414fd73dd5f4b01fcb1ab43 ]

Konstantin has noticed that kvmalloc might trigger the following warning:

 WARNING: CPU: 0 PID: 6676 at mm/vmstat.c:986 __fragmentation_index+0x54/0x60
 [...]
 Call Trace:
  fragmentation_index+0x76/0x90
  compaction_suitable+0x4f/0xf0
  shrink_node+0x295/0x310
  node_reclaim+0x205/0x250
  get_page_from_freelist+0x649/0xad0
  __alloc_pages_nodemask+0x12a/0x2a0
  kmalloc_large_node+0x47/0x90
  __kmalloc_node+0x22b/0x2e0
  kvmalloc_node+0x3e/0x70
  xt_alloc_table_info+0x3a/0x80 [x_tables]
  do_ip6t_set_ctl+0xcd/0x1c0 [ip6_tables]
  nf_setsockopt+0x44/0x60
  SyS_setsockopt+0x6f/0xc0
  do_syscall_64+0x67/0x120
  entry_SYSCALL_64_after_hwframe+0x3d/0xa2

The problem is that we only check for an out-of-bound order in the slow
path, while node reclaim can already happen from the fast path.  This is
fixable by making sure that kvmalloc never uses kmalloc for requests
larger than KMALLOC_MAX_SIZE, but it also shows that the code is rather
fragile.  A recent UBSAN report underlines that with the following
report:

 UBSAN: Undefined behaviour in mm/page_alloc.c:3117:19
 shift exponent 51 is too large for 32-bit type 'int'
 CPU: 0 PID: 6520 Comm: syz-executor1 Not tainted 4.19.0-rc2 #1
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
 Call Trace:
  __dump_stack lib/dump_stack.c:77 [inline]
  dump_stack+0xd2/0x148 lib/dump_stack.c:113
  ubsan_epilogue+0x12/0x94 lib/ubsan.c:159
  __ubsan_handle_shift_out_of_bounds+0x2b6/0x30b lib/ubsan.c:425
  __zone_watermark_ok+0x2c7/0x400 mm/page_alloc.c:3117
  zone_watermark_fast mm/page_alloc.c:3216 [inline]
  get_page_from_freelist+0xc49/0x44c0 mm/page_alloc.c:3300
  __alloc_pages_nodemask+0x21e/0x640 mm/page_alloc.c:4370
  alloc_pages_current+0xcc/0x210 mm/mempolicy.c:2093
  alloc_pages include/linux/gfp.h:509 [inline]
  __get_free_pages+0x12/0x60 mm/page_alloc.c:4414
  dma_mem_alloc+0x36/0x50 arch/x86/include/asm/floppy.h:156
  raw_cmd_copyin drivers/block/floppy.c:3159 [inline]
  raw_cmd_ioctl drivers/block/floppy.c:3206 [inline]
  fd_locked_ioctl+0xa00/0x2c10 drivers/block/floppy.c:3544
  fd_ioctl+0x40/0x60 drivers/block/floppy.c:3571
  __blkdev_driver_ioctl block/ioctl.c:303 [inline]
  blkdev_ioctl+0xb3c/0x1a30 block/ioctl.c:601
  block_ioctl+0x105/0x150 fs/block_dev.c:1883
  vfs_ioctl fs/ioctl.c:46 [inline]
  do_vfs_ioctl+0x1c0/0x1150 fs/ioctl.c:687
  ksys_ioctl+0x9e/0xb0 fs/ioctl.c:702
  __do_sys_ioctl fs/ioctl.c:709 [inline]
  __se_sys_ioctl fs/ioctl.c:707 [inline]
  __x64_sys_ioctl+0x7e/0xc0 fs/ioctl.c:707
  do_syscall_64+0xc4/0x510 arch/x86/entry/common.c:290
  entry_SYSCALL_64_after_hwframe+0x49/0xbe

Note that this is not a kvmalloc path.  It is just that the fast path
really depends on having a sanitized order as well.
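As a minimal stand-alone illustration (editor's sketch, not part of the
patch): the shift UBSAN complains about can be reproduced in plain C.
The MAX_ORDER value below assumes the common x86-64 default of 11, and
the order of 51 is taken from the report above.

	#include <stdio.h>

	#define MAX_ORDER 11	/* assumed x86-64 default */

	int main(void)
	{
		unsigned int order = 51;	/* value from the UBSAN report */
		long free_pages = 1024;

		/* The check the patch moves into the fast path. */
		if (order >= MAX_ORDER) {
			fprintf(stderr, "order %u rejected (>= MAX_ORDER)\n", order);
			return 1;
		}

		/*
		 * Only reached for sane orders, so the shift is well defined.
		 * Without the check above, shifting a 32-bit int by 51 would
		 * be undefined behaviour, which is what UBSAN flagged in
		 * __zone_watermark_ok().
		 */
		free_pages -= (1 << order) - 1;
		printf("free_pages after adjustment: %ld\n", free_pages);
		return 0;
	}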
Therefore move the order check to the fast path.

Link: http://lkml.kernel.org/r/20181113094305.GM15120@dhcp22.suse.cz
Signed-off-by: Michal Hocko
Reported-by: Konstantin Khlebnikov
Reported-by: Kyungtae Kim
Acked-by: Vlastimil Babka
Cc: Balbir Singh
Cc: Mel Gorman
Cc: Pavel Tatashin
Cc: Oscar Salvador
Cc: Mike Rapoport
Cc: Aaron Lu
Cc: Joonsoo Kim
Cc: Byoungyoung Lee
Cc: "Dae R. Jeong"
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
Signed-off-by: Sasha Levin
---
 mm/page_alloc.c | 20 +++++++++-----------
 1 file changed, 9 insertions(+), 11 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index a604b5da6755..2074f424dabf 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3867,17 +3867,6 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
 	unsigned int cpuset_mems_cookie;
 	int reserve_flags;
 
-	/*
-	 * In the slowpath, we sanity check order to avoid ever trying to
-	 * reclaim >= MAX_ORDER areas which will never succeed. Callers may
-	 * be using allocators in order of preference for an area that is
-	 * too large.
-	 */
-	if (order >= MAX_ORDER) {
-		WARN_ON_ONCE(!(gfp_mask & __GFP_NOWARN));
-		return NULL;
-	}
-
 	/*
 	 * We also sanity check to catch abuse of atomic reserves being used by
 	 * callers that are not in atomic context.
@@ -4179,6 +4168,15 @@ __alloc_pages_nodemask(gfp_t gfp_mask, unsigned int order, int preferred_nid,
 	gfp_t alloc_mask; /* The gfp_t that was actually used for allocation */
 	struct alloc_context ac = { };
 
+	/*
+	 * There are several places where we assume that the order value is sane
+	 * so bail out early if the request is out of bound.
+	 */
+	if (unlikely(order >= MAX_ORDER)) {
+		WARN_ON_ONCE(!(gfp_mask & __GFP_NOWARN));
+		return NULL;
+	}
+
 	gfp_mask &= gfp_allowed_mask;
 	alloc_mask = gfp_mask;
 	if (!prepare_alloc_pages(gfp_mask, order, preferred_nid, nodemask, &ac, &alloc_mask, &alloc_flags))
-- 
2.17.1
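
A hypothetical caller sketch for context (illustrative only, not part of
the patch; the function name is made up): with the check now at the top
of __alloc_pages_nodemask(), an out-of-range order comes back as a plain
allocation failure before any watermark math runs, and __GFP_NOWARN
keeps the new WARN_ON_ONCE quiet.

	#include <linux/gfp.h>
	#include <linux/printk.h>

	/*
	 * Hypothetical example: an oversized order now gets 0 back from the
	 * allocator fast path instead of reaching get_page_from_freelist()
	 * with an unsanitized value.
	 */
	static unsigned long try_bogus_order(void)
	{
		unsigned long addr;

		/*
		 * order 51 >= MAX_ORDER, so __alloc_pages_nodemask() bails out
		 * early; __GFP_NOWARN suppresses the WARN_ON_ONCE in the check.
		 */
		addr = __get_free_pages(GFP_KERNEL | __GFP_NOWARN, 51);
		if (!addr)
			pr_debug("huge-order allocation rejected as expected\n");

		return addr;
	}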