From: Dan Williams
Date: Thu, 15 Aug 2019 13:12:22 -0700
Subject: Re: [PATCH 04/15] mm: remove the pgmap field from struct hmm_vma_walk
To: Jerome Glisse
Cc: Jason Gunthorpe, Christoph Hellwig, Ben Skeggs, Felix Kuehling,
    Ralph Campbell, "linux-mm@kvack.org", "nouveau@lists.freedesktop.org",
    "dri-devel@lists.freedesktop.org", "amd-gfx@lists.freedesktop.org",
    "linux-kernel@vger.kernel.org"
In-Reply-To: <20190815194339.GC9253@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Aug 15, 2019 at 12:44 PM Jerome Glisse wrote:
>
> On Thu, Aug 15, 2019 at 12:36:58PM -0700, Dan Williams wrote:
> > On Thu, Aug 15, 2019 at 11:07 AM Jerome Glisse wrote:
> > >
> > > On Wed, Aug 14, 2019 at 07:48:28AM -0700, Dan Williams wrote:
> > > > On Wed, Aug 14, 2019 at 6:28 AM Jason Gunthorpe wrote:
> > > > >
> > > > > On Wed, Aug 14, 2019 at 09:38:54AM +0200, Christoph Hellwig wrote:
> > > > > > On Tue, Aug 13, 2019 at 06:36:33PM -0700, Dan Williams wrote:
> > > > > > > Section alignment constraints somewhat save us here. The only example
> > > > > > > I can think of a PMD not containing a uniform pgmap association for
> > > > > > > each pte is the case when the pgmap overlaps normal dram, i.e. shares
> > > > > > > the same 'struct memory_section' for a given span. Otherwise, distinct
> > > > > > > pgmaps arrange to manage their own exclusive sections (and now
> > > > > > > subsections as of v5.3). Otherwise the implementation could not
> > > > > > > guarantee different mapping lifetimes.
> > > > > > >
> > > > > > > That said, this seems to want a better mechanism to determine "pfn is
> > > > > > > ZONE_DEVICE".
> > > > > >
> > > > > > So I guess this patch is fine for now, and once you provide a better
> > > > > > mechanism we can switch over to it?
> > > > >
> > > > > What about the version I sent to just get rid of all the strange
> > > > > put_dev_pagemaps while scanning? Odds are good we will work with only
> > > > > a single pagemap, so it makes some sense to cache it once we find it?
> > > >
> > > > Yes, if the scan is over a single pmd then caching it makes sense.
> > >
> > > Quite frankly an easier an better solution is to remove the pagemap
> > > lookup as HMM user abide by mmu notifier it means we will not make
> > > use or dereference the struct page so that we are safe from any
> > > racing hotunplug of dax memory (as long as device driver using hmm
> > > do not have a bug).
> >
> > Yes, as long as the driver remove is synchronized against HMM
> > operations via another mechanism then there is no need to take pagemap
> > references. Can you briefly describe what that other mechanism is?
>
> So if you hotunplug some dax memory i assume that this can only
> happens once all the pages are unmapped (as it must have the
> zero refcount, well 1 because of the bias) and any unmap will
> trigger a mmu notifier callback. User of hmm mirror abiding by
> the API will never make use of information they get through the
> fault or snapshot function until checking for racing notifier
> under lock.

Hmm, that first assumption is not guaranteed by the dev_pagemap core.
The dev_pagemap end-of-life model is "disable, invalidate, drain", so
it's possible to call devm_memunmap_pages() while pages are still
mapped; it just won't complete the teardown of the pagemap until the
last reference is dropped. New references are blocked during this
teardown.

However, if the driver is validating the liveness of the mapping in the
mmu-notifier path and blocking new references, it sounds like it should
be ok. Might there be GPU driver unit tests that cover this racing
teardown case?
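
To make the "disable, invalidate, drain" ordering concrete, here is a
minimal single-threaded userspace sketch. It is only a toy model of the
behaviour described above, not the dev_pagemap implementation: every
name in it (struct toy_pagemap, toy_tryget(), toy_put(), toy_kill()) is
hypothetical, and it deliberately ignores the atomics and locking a real
implementation would need.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Toy, single-threaded model of a "disable, invalidate, drain" lifetime.
 * Every name here is hypothetical; none of this is kernel API.
 */
struct toy_pagemap {
	long refs;       /* references currently held on pages */
	bool dying;      /* "disable": refuse new references */
	bool mapped;     /* "invalidate": cleared when mappings are revoked */
	bool torn_down;  /* "drain": set once the last reference is gone */
};

static void toy_init(struct toy_pagemap *pm)
{
	pm->refs = 0;
	pm->dying = false;
	pm->mapped = true;
	pm->torn_down = false;
}

/* Take a reference; fails once teardown has started. */
static bool toy_tryget(struct toy_pagemap *pm)
{
	if (pm->dying)
		return false;
	pm->refs++;
	return true;
}

/* Drop a reference; the final put after a kill completes the teardown. */
static void toy_put(struct toy_pagemap *pm)
{
	assert(pm->refs > 0);
	pm->refs--;
	if (pm->dying && pm->refs == 0)
		pm->torn_down = true;
}

/*
 * Start teardown. Returns true if it completed immediately, false if it
 * is still waiting for outstanding references to drain.
 */
static bool toy_kill(struct toy_pagemap *pm)
{
	pm->dying = true;     /* disable */
	pm->mapped = false;   /* invalidate */
	if (pm->refs == 0)    /* drain: nothing left to wait for */
		pm->torn_down = true;
	return pm->torn_down;
}
```

In this model, a caller that holds one reference when toy_kill() runs
sees it return false, any later toy_tryget() fails, and the teardown
only completes on the final toy_put(). That is the window in which
pages can still be referenced after unmap has been requested.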