Date: Fri, 24 Nov 2017 14:29:48 +0000
From: Andrea Reale
To: zhong jiang
Cc: linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, m.bielski@virtualopensystems.com,
 arunks@qti.qualcomm.com, mark.rutland@arm.com, scott.branden@broadcom.com,
 will.deacon@arm.com, qiuxishi@huawei.com, catalin.marinas@arm.com,
 mhocko@suse.com, realean2@ie.ibm.com
Subject: Re: [PATCH v2 4/5] mm: memory_hotplug: Add memory hotremove probe device
References: <22d34fe30df0fbacbfceeb47e20cb1184af73585.1511433386.git.ar@linux.vnet.ibm.com>
 <5A17F5DF.2040108@huawei.com> <20171124104401.GD18120@samekh>
 <5A180DF1.8060009@huawei.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
In-Reply-To: <5A180DF1.8060009@huawei.com>
User-Agent: Mutt/1.5.24 (2015-08-30)
Message-Id: <20171124142948.GA1966@samekh>
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

Hi zhongjiang,

On Fri 24 Nov 2017, 20:17, zhong jiang wrote:
> Hi, Andrea
>
> Most servers will benefit from NUMA, so it is best to solve the issue without
> special restrictions.
>
> At least we can obtain the NUMA information from the dtb, so the memory can
> be onlined correctly.

I fully agree it's an important feature that should eventually be there.
But, at least in my understanding, the implementation is not as
straightforward as it looks.

If I declare a memory node in the fdt, then, at boot, the kernel will
expect that memory to actually be there to be used: this is not true if
I want to plug in my DIMMs only later at runtime. So I think that
declaring the hotpluggable memory in an fdt memory node might not be
feasible without changes.

One idea could be to add a new property to memory nodes to specify
which memory is potentially hotpluggable. For example, something like:

	memory@0 {
		device_type = "memory";
		reg = <0x0 0x0 0x0 0x40000000>;
		hot-add-range = <0x0 0x40000000 0x0 0x40000000>;
		numa-node-id = <0>;
	};

	memory@10000000000 {
		device_type = "memory";
		reg = <0x100 0x0 0x0 0x40000000>;
		hot-add-range = <0x100 0x40000000 0x0 0x40000000>;
		numa-node-id = <1>;
	};

The information in this imaginary "hot-add-range" property would be
ignored at boot and only checked by the hot add process to see which
NUMA domain a given range of physical memory belongs to.

Of course this is just an example, and my limited knowledge of fdt
doesn't make me the best person to judge what the best approach is.

All this to say: in the absence of a clear and agreed approach, we
released the patch with the !NUMA limitation, so that we can get early
feedback.
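
To make the "hot-add-range" idea a bit more concrete, here is a rough
userspace sketch of the lookup the hot add process would need. It is
only an illustration under assumptions: struct hot_add_range,
example_ranges and hot_add_physaddr_to_nid() are hypothetical names, not
existing kernel code, and the table is hard-coded from the two example
nodes above (assuming two address cells and two size cells) instead of
being parsed from the fdt.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical decoded form of one "hot-add-range" entry, paired with
 * the numa-node-id of the memory node that declares it.
 */
struct hot_add_range {
	uint64_t base;	/* first physical address of the range */
	uint64_t size;	/* length of the range in bytes */
	int nid;	/* NUMA node taken from numa-node-id */
};

/* Hard-coded from the two example memory nodes; not parsed from an fdt. */
static const struct hot_add_range example_ranges[] = {
	{ 0x0000000040000000ULL, 0x40000000ULL, 0 },
	{ 0x0000010040000000ULL, 0x40000000ULL, 1 },
};

/*
 * Return the node whose range contains phys_addr, or -1 when no range
 * covers it, so the caller can refuse the hot add.
 */
static int hot_add_physaddr_to_nid(const struct hot_add_range *ranges,
				   size_t nr_ranges, uint64_t phys_addr)
{
	size_t i;

	assert(ranges != NULL || nr_ranges == 0);

	for (i = 0; i < nr_ranges; i++) {
		/* Compare the offset, not base + size, to avoid overflow. */
		if (phys_addr >= ranges[i].base &&
		    phys_addr - ranges[i].base < ranges[i].size)
			return ranges[i].nid;
	}

	return -1;
}
```

With something along these lines, an address that falls outside every
declared range gets no node at all, which seems safer than silently
defaulting to node 0.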

We also released it in the hope of kickstarting this discussion on the
best approach to supporting NUMA.

Ideas/suggestions?

Thanks,
Andrea

>
> Thanks
> zhongjiang
>
> On 2017/11/24 18:44, Andrea Reale wrote:
> > Hi zhongjiang,
> >
> > On Fri 24 Nov 2017, 18:35, zhong jiang wrote:
> >> Hi, Andrea
> >>
> >> I don't see "memory_add_physaddr_to_nid" in arch/arm64.
> >> Am I missing something?
> > When !CONFIG_NUMA it is defined in include/linux/memory_hotplug.h as 0.
> > In patch 1/5 of this series we require !NUMA to enable
> > ARCH_ENABLE_MEMORY_HOTPLUG.
> >
> > The reason for this simplification is simply that we would not know how
> > to decide the correct node to which to add memory when NUMA is on.
> > Any suggestion on that matter is welcome.
> >
> > Thanks,
> > Andrea
> >
> >> Thanks
> >> zhongjiang
> >>
> >> On 2017/11/23 19:14, Andrea Reale wrote:
> >>> Adding a "remove" sysfs handle that can be used to trigger
> >>> memory hotremove manually, exactly symmetrically with
> >>> what happens with the "probe" device for hot-add.
> >>>
> >>> This is useful for architectures that do not rely on
> >>> ACPI for memory hot-remove.
> >>>
> >>> Signed-off-by: Andrea Reale
> >>> Signed-off-by: Maciej Bielski
> >>> ---
> >>>  drivers/base/memory.c | 34 +++++++++++++++++++++++++++++++++-
> >>>  1 file changed, 33 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/drivers/base/memory.c b/drivers/base/memory.c
> >>> index 1d60b58..8ccb67c 100644
> >>> --- a/drivers/base/memory.c
> >>> +++ b/drivers/base/memory.c
> >>> @@ -530,7 +530,36 @@ memory_probe_store(struct device *dev, struct device_attribute *attr,
> >>>  }
> >>>
> >>>  static DEVICE_ATTR(probe, S_IWUSR, NULL, memory_probe_store);
> >>> -#endif
> >>> +
> >>> +#ifdef CONFIG_MEMORY_HOTREMOVE
> >>> +static ssize_t
> >>> +memory_remove_store(struct device *dev,
> >>> +		struct device_attribute *attr, const char *buf, size_t count)
> >>> +{
> >>> +	u64 phys_addr;
> >>> +	int nid, ret;
> >>> +	unsigned long pages_per_block = PAGES_PER_SECTION * sections_per_block;
> >>> +
> >>> +	ret = kstrtoull(buf, 0, &phys_addr);
> >>> +	if (ret)
> >>> +		return ret;
> >>> +
> >>> +	if (phys_addr & ((pages_per_block << PAGE_SHIFT) - 1))
> >>> +		return -EINVAL;
> >>> +
> >>> +	nid = memory_add_physaddr_to_nid(phys_addr);
> >>> +	ret = lock_device_hotplug_sysfs();
> >>> +	if (ret)
> >>> +		return ret;
> >>> +
> >>> +	remove_memory(nid, phys_addr,
> >>> +			 MIN_MEMORY_BLOCK_SIZE * sections_per_block);
> >>> +	unlock_device_hotplug();
> >>> +	return count;
> >>> +}
> >>> +static DEVICE_ATTR(remove, S_IWUSR, NULL, memory_remove_store);
> >>> +#endif /* CONFIG_MEMORY_HOTREMOVE */
> >>> +#endif /* CONFIG_ARCH_MEMORY_PROBE */
> >>>
> >>>  #ifdef CONFIG_MEMORY_FAILURE
> >>>  /*
> >>> @@ -790,6 +819,9 @@ bool is_memblock_offlined(struct memory_block *mem)
> >>>  static struct attribute *memory_root_attrs[] = {
> >>>  #ifdef CONFIG_ARCH_MEMORY_PROBE
> >>>  	&dev_attr_probe.attr,
> >>> +#ifdef CONFIG_MEMORY_HOTREMOVE
> >>> +	&dev_attr_remove.attr,
> >>> +#endif
> >>>  #endif
> >>>
> >>>  #ifdef CONFIG_MEMORY_FAILURE
> >>
> >
> > .
> >
>
>

From 1584950006998605305@xxx Fri Nov 24 12:21:54 +0000 2017
X-GM-THRID: 1584855261784805699
X-Gmail-Labels: Inbox,Category Forums,HistoricalUnread