Date: Wed, 15 Nov 2017 21:44:26 -0500
From: Jerome Glisse
To: chet l
Cc: Bob Liu, Dan Williams, "linux-kernel@vger.kernel.org", Linux MM, John Hubbard, David Nellans, Balbir Singh, Michal Hocko, Andrew Morton
Subject: Re: [PATCH 0/6] Cache coherent device memory (CDM) with HMM v5
Message-ID: <20171116024425.GC2934@redhat.com>
References: <20170905193644.GD19397@redhat.com> <20170911233649.GA4892@redhat.com> <20170926161635.GA3216@redhat.com> <0d7273c3-181c-6d68-3c5f-fa518e782374@huawei.com> <20170930224927.GC6775@redhat.com> <20171012153721.GA2986@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Nov 15, 2017 at 06:10:08PM -0800, chet l wrote:
> >> You may think of it as a CCIX device or CAPI device.
> >> The requirement is to eliminate any extra copy.
> >> A typical usecase/requirement is malloc() and madvise() allocate from
> >> device memory, then CPU write data to device memory directly and
> >> trigger device to read the data/do calculation.
> >
> > I suggest you rely on the device driver userspace API to do a migration after malloc
> > then. Something like:
> >   ptr = malloc(size);
> >   my_device_migrate(ptr, size);
> >
> > Which would call an ioctl of the device driver which itself would migrate memory or
> > allocate device memory for the range if the pointer returned by malloc is not yet
> > backed by any pages.
> >
>
> So for CCIX, I don't think there is going to be an inline device
> driver that would allocate any memory for you. The expansion memory
> will become part of the system memory as part of the boot process. So,
> if the host DDR is 256GB and the CCIX expansion memory is 4GB, the
> total system mem will be 260GB.
>
> Assume that the 'mm' is taught to mark/anoint the ZONE_DEVICE (or
> ZONE_XXX) range from 256 to 260 GB. Then, for kmalloc it (mm) won't use
> the ZONE_DEV range. But for a malloc, it will/can use that range.

HMM zone device memory would work with that; you just need to teach the
platform to identify this memory zone and not hotplug it. Again, you should
rely on a specific device driver API to allocate this memory.
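
For illustration only, the malloc-then-migrate pattern quoted above could be
sketched in userspace as follows. Everything driver-specific here is an
assumption and not part of any real driver: the request struct, the ioctl
number MY_DEVICE_IOC_MIGRATE, the extra dev_fd argument (an already opened
device file descriptor) and the my_device_malloc() helper are hypothetical
names chosen for the example.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/ioctl.h>

/* Hypothetical request layout; a real driver would define its own. */
struct my_device_migrate_args {
    uint64_t addr;  /* start of the user virtual address range */
    uint64_t size;  /* length of the range in bytes */
};

/* Hypothetical ioctl number: magic 'M' and command 1 are illustrative only. */
#define MY_DEVICE_IOC_MIGRATE _IOW('M', 1, struct my_device_migrate_args)

/*
 * Ask the (hypothetical) driver behind dev_fd to migrate [ptr, ptr + size)
 * to device memory, or to allocate device memory for the range if it is not
 * yet backed by any pages. Returns 0 on success, -1 with errno set on error.
 */
int my_device_migrate(int dev_fd, void *ptr, size_t size)
{
    if (ptr == NULL || size == 0) {
        errno = EINVAL;
        return -1;
    }

    struct my_device_migrate_args args = {
        .addr = (uint64_t)(uintptr_t)ptr,
        .size = (uint64_t)size,
    };

    return ioctl(dev_fd, MY_DEVICE_IOC_MIGRATE, &args);
}

/* malloc() followed by migration; returns NULL if either step fails. */
void *my_device_malloc(int dev_fd, size_t size)
{
    void *ptr = malloc(size);

    if (ptr == NULL)
        return NULL;
    if (my_device_migrate(dev_fd, ptr, size) != 0) {
        free(ptr);
        return NULL;
    }
    return ptr;
}
```

With this shape the application keeps using plain malloc() and pointers, and
the placement policy stays inside the driver behind the ioctl, which is the
point of going through a device-specific API instead of changing
madvise/mbind.
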

> > There have been several discussions already about madvise/mbind/set_mempolicy/
> > move_pages and at this time I don't think we want to add or change any of them to
> > understand device memory. My personal opinion is that we first need to have enough
>
> We will visit these APIs when we are closer to building exotic
> CCIX devices. And the plan is to present/express the CCIX proximity
> attributes just like a NUMA node-proximity attribute today. That way
> there would be minimal disruptions to the existing OS ecosystem.

NUMA has been rejected previously; see the CDM/CAPI threads. So I don't see
it being accepted for CCIX either. My belief is that we want to hide this
inside the device driver, and only once we see multiple devices all doing the
same kind of thing should we move toward building something generic that
caters to CCIX devices.

Jérôme