Received: by 2002:ac0:b08d:0:0:0:0:0 with SMTP id l13csp4667499imc; Mon, 25 Feb 2019 08:50:37 -0800 (PST) X-Google-Smtp-Source: AHgI3IaKgTKjzqU55kw7E+NZXYFCP+hloSyJaVzGlGG2CkGTqYHEI8MU3bdskCehxBiyzymJSa1n X-Received: by 2002:a62:6702:: with SMTP id b2mr20970853pfc.244.1551113437597; Mon, 25 Feb 2019 08:50:37 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1551113437; cv=none; d=google.com; s=arc-20160816; b=GG0p6v3JStaeOSfQFgDV7GA9upKpvHfaEpbuzonB9QJqLpz2KcwF5lsGdb7iJy39DW Ggdwx8vre1FWxA5vSa0GZRrkDN6sZwEx5x4k6nlp9zTvpYXWvHY51fF/fxVtLnA0WQwv i+ybvpTcRUjnwisDEzTI8H1X4fYK49WQM/jpRzS9it5Y/kn2oErcuNcA65vG1558/7A8 6RwliX3aEFanEA/5VdKZjW89nLrmoPG9xnz6fWvwZoIpaUmdcDTPQgPjfx9dSMJ8f0le 2W1uVNKWQJhQ6xjGNIzhsii9U1fHMVbrQVWl9xzfDcjUATicT/nqrv3+tHB2HcVJ4RtF PK4Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding :content-language:in-reply-to:mime-version:user-agent:date :message-id:from:references:cc:to:subject:dkim-signature; bh=2DavdlfEcPUnZ5zlrCm/gMnuNXvBK6a6jQ40MXyfRN4=; b=zS/xvfcu0duZqiCa8LFBpjeLc0DcLpHv9VfY/PYk24mV8oY6C9JrfhZwtfH0sjRKcc K6u6kwlBqLKWCALZmCJ94dUqEGu9iLch6ulOiOu7CEiRLWuyHJL9eDJAlgYpihICBXnS xojKiHkNx0MYFlnlGEoRsESvf6Llo02YkBpjyqybwy6Y4SD/d+oUePzp5cu9Uvqywq2B SHDJ8cn0z990j2iXIsa3L3ilGLVWpvSum45m8MuMWbURGnpxX+0lEK8ekY4PzY1/zfsP mXlkqzSsoDM78W/hy1msSrKZZUryco6kxecRiJqBTd5mZs/W/z7S4EtA9SOj95seaRSw Xeyg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@oracle.com header.s=corp-2018-07-02 header.b=o+sIwtjm; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=oracle.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id b17si9636281pls.181.2019.02.25.08.50.22; Mon, 25 Feb 2019 08:50:37 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@oracle.com header.s=corp-2018-07-02 header.b=o+sIwtjm; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=oracle.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728490AbfBYQte (ORCPT + 99 others); Mon, 25 Feb 2019 11:49:34 -0500 Received: from aserp2130.oracle.com ([141.146.126.79]:55838 "EHLO aserp2130.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728138AbfBYQte (ORCPT ); Mon, 25 Feb 2019 11:49:34 -0500 Received: from pps.filterd (aserp2130.oracle.com [127.0.0.1]) by aserp2130.oracle.com (8.16.0.27/8.16.0.27) with SMTP id x1PGi1fu068593; Mon, 25 Feb 2019 16:49:15 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=subject : to : cc : references : from : message-id : date : mime-version : in-reply-to : content-type : content-transfer-encoding; s=corp-2018-07-02; bh=2DavdlfEcPUnZ5zlrCm/gMnuNXvBK6a6jQ40MXyfRN4=; b=o+sIwtjme3BGl5X7m88crmOr9mM+RYVdMKlB5tRKJTyW5GSs+1NPVHbFVM3kq82d+0Re OWptQ8ccl8MWYxNUxkjbYtdRTdoEfgyRgVr7zeaVTS+WDza8nuslcA3KDwxVmmrz9mUv y8mFGi8SqFhqTj8QxDHTbHFBDZi9dYoseBr8RyxbuppSJEicGyZdWOWxdhb5LyCF84LJ Jo8AeKK02wVuZQ5ctcs9HfVxSbGzQwUhzEXYqvoD12ANXZlbM3q30jxmwkVk+wGb8RAh KeE0Dn5ySuuz7nVIqy72r8JhOzbN2VAXj89i8qO1EorP9nG5dQnvAeCEggaOHDcp8M80 hg== Received: from aserv0022.oracle.com (aserv0022.oracle.com [141.146.126.234]) by aserp2130.oracle.com with ESMTP id 2qtupdyg7h-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Mon, 25 Feb 2019 16:49:15 +0000 Received: from userv0121.oracle.com (userv0121.oracle.com [156.151.31.72]) by aserv0022.oracle.com (8.14.4/8.14.4) with ESMTP id x1PGn9mB026766 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Mon, 25 Feb 2019 16:49:10 GMT Received: from abhmp0007.oracle.com (abhmp0007.oracle.com [141.146.116.13]) by userv0121.oracle.com (8.14.4/8.13.8) with ESMTP id x1PGn78D019409; Mon, 25 Feb 2019 16:49:08 GMT Received: from [192.168.1.164] (/50.38.38.67) by default (Oracle Beehive Gateway v4.0) with ESMTP ; Mon, 25 Feb 2019 08:49:07 -0800 Subject: Re: [PATCH v4] mm/hugetlb: Fix unsigned overflow in __nr_hugepages_store_common() To: David Rientjes Cc: Jing Xiangfeng , mhocko@kernel.org, akpm@linux-foundation.org, hughd@google.com, linux-mm@kvack.org, n-horiguchi@ah.jp.nec.com, aarcange@redhat.com, kirill.shutemov@linux.intel.com, linux-kernel@vger.kernel.org References: <1550885529-125561-1-git-send-email-jingxiangfeng@huawei.com> <388cbbf5-7086-1d04-4c49-049021504b9d@oracle.com> From: Mike Kravetz Message-ID: <8c167be7-06fa-a8c0-8ee7-0bfad41eaba2@oracle.com> Date: Mon, 25 Feb 2019 08:49:06 -0800 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.4.0 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit X-Proofpoint-Virus-Version: vendor=nai engine=5900 definitions=9178 signatures=668685 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 priorityscore=1501 malwarescore=0 suspectscore=2 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 mlxscore=0 impostorscore=0 mlxlogscore=999 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1810050000 definitions=main-1902250123 Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 2/24/19 7:17 PM, David Rientjes wrote: > On Sun, 24 Feb 2019, Mike Kravetz wrote: > >>> User can change a node specific hugetlb count. i.e. >>> /sys/devices/system/node/node1/hugepages/hugepages-2048kB/nr_hugepages >>> the calculated value of count is a total number of huge pages. It could >>> be overflow when a user entering a crazy high value. If so, the total >>> number of huge pages could be a small value which is not user expect. >>> We can simply fix it by setting count to ULONG_MAX, then it goes on. This >>> may be more in line with user's intention of allocating as many huge pages >>> as possible. >>> >>> Signed-off-by: Jing Xiangfeng >> >> Thank you. >> >> Acked-by: Mike Kravetz >> >>> --- >>> mm/hugetlb.c | 7 +++++++ >>> 1 file changed, 7 insertions(+) >>> >>> diff --git a/mm/hugetlb.c b/mm/hugetlb.c >>> index afef616..6688894 100644 >>> --- a/mm/hugetlb.c >>> +++ b/mm/hugetlb.c >>> @@ -2423,7 +2423,14 @@ static ssize_t __nr_hugepages_store_common(bool obey_mempolicy, >>> * per node hstate attribute: adjust count to global, >>> * but restrict alloc/free to the specified node. >>> */ >>> + unsigned long old_count = count; >>> count += h->nr_huge_pages - h->nr_huge_pages_node[nid]; >>> + /* >>> + * If user specified count causes overflow, set to >>> + * largest possible value. >>> + */ >>> + if (count < old_count) >>> + count = ULONG_MAX; >>> init_nodemask_of_node(nodes_allowed, nid); >>> } else >>> nodes_allowed = &node_states[N_MEMORY]; >>> > > Looks like this fixes the overflow issue, but isn't there already a > possible underflow since we don't hold hugetlb_lock? Even if > count == 0, what prevents h->nr_huge_pages_node[nid] being greater than > h->nr_huge_pages here? I think the per hstate values need to be read with > READ_ONCE() and stored on the stack to do any sane bounds checking. Yes, without holding the lock there is the potential for issues. Looking back to when the node specific code was added there is a comment about "re-use/share as much of the existing global hstate attribute initialization and handling". I suspect that is the reason for these calculations outside the lock. As you mention above, nr_huge_pages_node[nid] could be greater than nr_huge_pages. This is true even if we do READ_ONCE(). So, the code would need to test for this condition and 'fix up' values or re-read. It is just racy without holding the lock. If that is too ugly, then we could just add code for the node specific adjustments. set_max_huge_pages() is only called from here. It would be pretty easy to modify set_max_huge_pages() to take the node specific value and do calculations/adjustments under the lock. -- Mike Kravetz