Date: Mon, 29 Jul 2019 19:43:12 -0400
From: Jerome Glisse
To: Christoph Hellwig
Cc: Jason Gunthorpe, Ben Skeggs, Ralph Campbell, Bharata B Rao,
    Andrew Morton, linux-mm@kvack.org, nouveau@lists.freedesktop.org,
    dri-devel@lists.freedesktop.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/9] mm: turn migrate_vma upside down
Message-ID: <20190729234312.GB7171@redhat.com>
References: <20190729142843.22320-1-hch@lst.de>
 <20190729142843.22320-2-hch@lst.de>
In-Reply-To: <20190729142843.22320-2-hch@lst.de>

On Mon, Jul 29, 2019 at 05:28:35PM +0300, Christoph Hellwig wrote:
> There isn't any good reason to pass callbacks to migrate_vma. Instead
> we can just export the three steps done by this function to drivers and
> let them sequence the operation without callbacks. This removes a lot
> of boilerplate code as-is, and will allow the drivers to drastically
> improve code flow and error handling further on.
> 
> Signed-off-by: Christoph Hellwig

I haven't finished the review, especially the nouveau code; I will look
into it once I get back. In the meantime, below are a few corrections.

> ---
>  Documentation/vm/hmm.rst               |  55 +-----
>  drivers/gpu/drm/nouveau/nouveau_dmem.c | 122 +++++++------
>  include/linux/migrate.h                | 118 ++----------
>  mm/migrate.c                           | 242 +++++++++++--------------
>  4 files changed, 193 insertions(+), 344 deletions(-)
> 

[...]

> diff --git a/mm/migrate.c b/mm/migrate.c
> index 8992741f10aa..dc4e60a496f2 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -2118,16 +2118,6 @@ int migrate_misplaced_transhuge_page(struct mm_struct *mm,
>  #endif /* CONFIG_NUMA */
>  
>  #if defined(CONFIG_MIGRATE_VMA_HELPER)
> -struct migrate_vma {
> -        struct vm_area_struct *vma;
> -        unsigned long *dst;
> -        unsigned long *src;
> -        unsigned long cpages;
> -        unsigned long npages;
> -        unsigned long start;
> -        unsigned long end;
> -};
> -
>  static int migrate_vma_collect_hole(unsigned long start,
>                  unsigned long end,
>                  struct mm_walk *walk)
> @@ -2578,6 +2568,108 @@ static void migrate_vma_unmap(struct migrate_vma *migrate)
>          }
>  }
>  
> +/**
> + * migrate_vma_setup() - prepare to migrate a range of memory
> + * @args: contains the vma, start, and pfns arrays for the migration
> + *
> + * Returns: negative errno on failures, 0 when 0 or more pages were migrated
> + * without an error.
> + *
> + * Prepare to migrate a range of memory virtual address range by collecting all
> + * the pages backing each virtual address in the range, saving them inside the
> + * src array. Then lock those pages and unmap them. Once the pages are locked
> + * and unmapped, check whether each page is pinned or not. Pages that aren't
> + * pinned have the MIGRATE_PFN_MIGRATE flag set (by this function) in the
> + * corresponding src array entry. Then restores any pages that are pinned, by
> + * remapping and unlocking those pages.
> + *
> + * The caller should then allocate destination memory and copy source memory to
> + * it for all those entries (ie with MIGRATE_PFN_VALID and MIGRATE_PFN_MIGRATE
> + * flag set). Once these are allocated and copied, the caller must update each
> + * corresponding entry in the dst array with the pfn value of the destination
> + * page and with the MIGRATE_PFN_VALID and MIGRATE_PFN_LOCKED flags set
> + * (destination pages must have their struct pages locked, via lock_page()).
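
A side note for readers following this: with the callbacks gone, the
caller drives all three steps itself. Below is a rough sketch (not from
the patch) of what a driver migrating one small chunk could look like;
dmem_alloc_page() and dmem_copy_page() are made-up placeholders for
real device code, and error handling is trimmed:

    #include <linux/migrate.h>
    #include <linux/mm.h>

    #define DMEM_CHUNK_NPAGES 64

    static int dmem_migrate_chunk(struct vm_area_struct *vma,
                                  unsigned long start, unsigned long end)
    {
            unsigned long src[DMEM_CHUNK_NPAGES] = {};
            unsigned long dst[DMEM_CHUNK_NPAGES] = {};
            struct migrate_vma args = {
                    .vma    = vma,
                    .src    = src,
                    .dst    = dst,
                    .start  = start,
                    .end    = end,  /* at most DMEM_CHUNK_NPAGES pages */
            };
            unsigned long i;
            int ret;

            /* Step 1: collect, lock and unmap the source pages. */
            ret = migrate_vma_setup(&args);
            if (ret)
                    return ret;

            for (i = 0; i < args.npages; i++) {
                    struct page *spage, *dpage;

                    /* Pinned pages do not get MIGRATE_PFN_MIGRATE. */
                    if (!(args.src[i] & MIGRATE_PFN_MIGRATE))
                            continue;
                    spage = migrate_pfn_to_page(args.src[i]);
                    if (!spage)
                            continue;       /* empty pte, see further down */

                    /* Step 2: allocate destination memory and copy. */
                    dpage = dmem_alloc_page();      /* hypothetical */
                    if (!dpage)
                            continue;       /* this entry just won't migrate */
                    dmem_copy_page(spage, dpage);   /* hypothetical */
                    lock_page(dpage);
                    args.dst[i] = migrate_pfn(page_to_pfn(dpage)) |
                                  MIGRATE_PFN_LOCKED;
            }

            /* Step 3: update page tables, then unlock and put the pages. */
            migrate_vma_pages(&args);
            migrate_vma_finalize(&args);
            return 0;
    }

migrate_pfn() already sets MIGRATE_PFN_VALID, so only the LOCKED flag
has to be added by hand; any dst entry left at 0 simply makes
migrate_vma_pages()/migrate_vma_finalize() restore the corresponding
source page.
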
> + *
> + * Note that the caller does not have to migrate all the pages that are marked
> + * with MIGRATE_PFN_MIGRATE flag in src array unless this is a migration from
> + * device memory to system memory. If the caller cannot migrate a device page
> + * back to system memory, then it must return VM_FAULT_SIGBUS, which will
> + * might have severe consequences for the userspace process, so it should best
      ^s/might//                                                      ^s/should best/must/
> + * be avoided if possible.
                 ^s/if possible//

Maybe add something about failing only on an unrecoverable device
error. The only reason we allow migration to fail here is that GPU
devices can go into a bad state (GPU lockup), and when that happens the
GPU memory might be corrupted (power to the GPU memory might be cut by
the GPU driver to recover the GPU). So failing a migration back to main
memory should only be a last-resort event.

> + *
> + * For empty entries inside CPU page table (pte_none() or pmd_none() is true) we
> + * do set MIGRATE_PFN_MIGRATE flag inside the corresponding source array thus
> + * allowing the caller to allocate device memory for those unback virtual
> + * address. For this the caller simply havs to allocate device memory and
                                          ^ has
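
To make the empty-entry case concrete: for a pte_none()/pmd_none()
entry the src entry has MIGRATE_PFN_MIGRATE set but no valid pfn, so
migrate_pfn_to_page() returns NULL. Instead of skipping such entries as
the sketch further up does, a driver can back the empty address with
fresh device memory; dmem_alloc_zeroed_page() is again a made-up
placeholder:

            spage = migrate_pfn_to_page(args.src[i]);
            if (!spage) {
                    /*
                     * Empty pte: nothing to copy, so back the virtual
                     * address with a zeroed device page instead of
                     * skipping the entry.
                     */
                    dpage = dmem_alloc_zeroed_page();  /* hypothetical */
                    if (!dpage)
                            continue;
                    lock_page(dpage);
                    args.dst[i] = migrate_pfn(page_to_pfn(dpage)) |
                                  MIGRATE_PFN_LOCKED;
                    continue;
            }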