From: "Aneesh Kumar K.V"
To: Wei Xu, Jonathan Cameron
Cc: Dave Hansen, Alistair Popple, Huang Ying, Andrew Morton, Greg Thelen,
    Yang Shi, Linux Kernel Mailing List, Jagdish Gediya, Michal Hocko,
    Tim C Chen, Baolin Wang, Feng Tang, Davidlohr Bueso, Dan Williams,
    David Rientjes, Linux MM, Brice Goglin, Hesham Almatary
Subject: Re: RFC: Memory Tiering Kernel Interfaces (v2)
References: <20220512160010.00005bc4@Huawei.com> <20220518130037.00001cce@Huawei.com>
Date: Tue, 24 May 2022 18:56:54 +0530
Message-ID: <8735gzdpsx.fsf@linux.ibm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Wei Xu writes:

> On Wed, May 18, 2022 at 5:00 AM Jonathan Cameron
> wrote:
>>
>> On Wed, 18 May 2022 00:09:48 -0700
>> Wei Xu wrote:

...

> Nice :)
>>
>> Initially I thought this was over complicated when compared to just leaving space, but
>> after a chat with Hesham just now you have us both convinced that this is an elegant solution.
>>
>> Few corners probably need fleshing out:
>> * Use of an allocator for new tiers. Flat number at startup, or new one on write of unique
>>   value to set_memtier perhaps? Also whether to allow drivers to allocate (I think
>>   we should).
>> * Multiple tiers with same rank. My assumption is from demotion path point of view you
>>   fuse them (treat them as if they were a single tier), but keep them expressed
>>   separately in the sysfs interface so that the rank can be changed independently.
>> * Some guidance on what values make sense for given rank default that might be set by
>>   a driver. If we have multiple GPU vendors, and someone mixes them in a system we
>>   probably don't want the default values they use to result in demotion between them.
>>   This might well be a guidance DOC or appropriate set of #define
>
> All of these are good ideas, though I am afraid that these can make
> tier management too complex for what it's worth.
>
> How about an alternative tier numbering scheme that uses major.minor
> device IDs? For simplicity, we can just start with 3 major tiers.
> New tiers can be inserted in-between using minor tier IDs.

What drives the creation of a new memory tier here? Jonathan was
suggesting we could do something similar to writing to set_memtier to
create a new memory tier.

$ echo "memtier128" > /sys/devices/system/node/node1/set_memtier

But I am wondering whether we should implement that now. If we keep the
"rank" concept and keep the tier index (memtier0 is the memory tier with
index 0) separate from rank, I assume we have enough flexibility for a
future extension that will allow us to create a memory tier from
userspace and assign it a rank value that places the device before or
after DRAM in demotion order.

i.e., for now we will only have memtier0, memtier1 and memtier2. We won't
add dynamic creation of memory tiers, and the above memory tiers will
have rank values 0, 1, 2, giving the demotion order 0 -> 1 -> 2.

-aneesh
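
To illustrate what keeping rank separate from the tier index buys us,
here is a minimal userspace sketch in Python. It is not kernel code and
not a proposed interface: MemTier and demotion_order are hypothetical
names made up for this example, and the rank value -1 given to the
later-created tier is an arbitrary assumption (whether ranks may be
negative or sparse is not decided anywhere in this thread).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemTier:
    """Hypothetical model of a memory tier; not a kernel structure."""
    index: int  # tier index, e.g. 0 for memtier0
    rank: int   # position in the demotion order (lower rank comes first)


def demotion_order(tiers):
    """Return tier names sorted by rank, ignoring the tier index."""
    return [f"memtier{t.index}" for t in sorted(tiers, key=lambda t: t.rank)]


# The fixed set discussed for now: memtier0..2 with rank equal to index.
tiers = [MemTier(0, 0), MemTier(1, 1), MemTier(2, 2)]
print(" -> ".join(demotion_order(tiers)))

# Hypothetical future extension: a tier created from userspace gets a
# large index but an (assumed) rank that places it ahead of memtier0.
tiers.append(MemTier(128, -1))
print(" -> ".join(demotion_order(tiers)))
```

The point of the sketch is only that the name (index) of a tier never
has to change when a new tier is placed before or after an existing one;
only the rank decides the order.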