From: Yang Shi
Date: Mon, 22 Jun 2020 16:54:45 -0700
Subject: Re: [PATCH 13/16] mm: support THP migration to device private memory
To: John Hubbard
Cc: Zi Yan, Ralph Campbell, nouveau@lists.freedesktop.org, linux-rdma@vger.kernel.org, Linux MM, linux-kselftest@vger.kernel.org, Linux Kernel Mailing List, Jerome Glisse, Christoph Hellwig, Jason Gunthorpe, Ben Skeggs, Andrew Morton, Shuah Khan, "Huang, Ying"
References: <20200619215649.32297-1-rcampbell@nvidia.com> <20200619215649.32297-14-rcampbell@nvidia.com> <4C364E23-0716-4D59-85A1-0C293B86BC2C@nvidia.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Jun 22, 2020 at 4:02 PM John Hubbard wrote:
>
> On 2020-06-22 15:33, Yang Shi wrote:
> > On Mon, Jun 22, 2020 at 3:30 PM Yang Shi wrote:
> >> On Mon, Jun 22, 2020 at 2:53 PM Zi Yan wrote:
> >>> On 22 Jun 2020, at 17:31, Ralph Campbell wrote:
> >>>> On 6/22/20 1:10 PM, Zi Yan wrote:
> >>>>> On 22 Jun 2020, at 15:36, Ralph Campbell wrote:
> >>>>>> On 6/21/20 4:20 PM, Zi Yan wrote:
> >>>>>>> On 19 Jun 2020, at 17:56, Ralph Campbell wrote:
> ...
> >>> Ying(cc’d) developed the code to swapout and swapin THP in one piece: https://lore.kernel.org/linux-mm/20181207054122.27822-1-ying.huang@intel.com/.
> >>> I am not sure whether the patchset makes into mainstream or not. It could be a good technical reference
> >>> for swapping in device private pages, although swapping in pages from disk and from device private
> >>> memory are two different scenarios.
> >>>
> >>> Since the device private memory swapin impacts core mm performance, we might want to discuss your patches
> >>> with more people, like the ones from Ying’s patchset, in the next version.
> >>
> >> I believe Ying will give you more insights about how THP swap works.
> >>
> >> But, IMHO device memory migration (migrate to system memory) seems
> >> like THP CoW more than swap.
>
>
> A fine point: overall, the desired behavior is "migrate", not CoW.
> That's important. Migrate means that you don't leave a page behind, even
> a read-only one. And that's exactly how device private migration is
> specified.
>
> We should try to avoid any erosion of clarity here. Even if somehow
> (really?) the underlying implementation calls this THP CoW, the actual
> goal is to migrate pages over to the device (and back).
>
>
> >>
> >> When migrating in:
> >
> > Sorry for my fat finger, hit sent button inadvertently, let me finish here.
> >
> > When migrating in:
> >
> > - if THP is enabled: allocate THP, but need handle allocation
> >   failure by falling back to base page
> > - if THP is disabled: fallback to base page
> >
>
> OK, but *all* page entries (base and huge/large pages) need to be cleared,
> when migrating to device memory, unless I'm really confused here.
> So: not CoW.

I realized the comment caused more confusion. I apologize for the
confusion. Yes, the trigger condition for swap/migration and CoW are
definitely different. Here I mean the fault handling part of migrating
into system memory.

Swap-in only needs to handle the base page case, since THP swap-in is not
supported upstream yet and the PMD is split in the swap-out phase (see
shrink_page_list).

The patch adds THP migration support to device memory, but you also need
to handle the migrate-in (back to system memory) case correctly. The
fault handling should look like the THP CoW fault handling behavior
(before 5.8):

- if THP is enabled: allocate a THP, and fall back to base pages if the
  allocation fails
- if THP is disabled: fall back to base pages

Swap fault handling doesn't look like the above. That is why I said it
seems more like THP CoW (the fault handling part only, before 5.8). I
hope that makes my point clear.

However, I didn't see such a fallback being handled. It looks like if
the THP allocation fails, it just returns SIGBUS, and there is no check
of the THP status, if I read the patches correctly. THP might be
disabled for the specific vma or system wide before migrating from
device memory back to system memory.

>
> thanks,
> --
> John Hubbard
> NVIDIA
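
For illustration only, the fallback order requested above for the migrate-in fault path can be modeled in a few lines of user-space C. This is a sketch under stated assumptions, not kernel code and not what the patch series does: every name here (fault_env, migrate_back_fault, the FAULT_* results) is hypothetical, and the three booleans stand in for the real THP policy check and page allocators.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names exist in the kernel. */

enum fault_result {
    FAULT_MAPPED_HUGE, /* range brought back to system memory as one huge page */
    FAULT_MAPPED_BASE, /* range brought back using base pages */
    FAULT_FAILED       /* no memory at all: the only case that reports an error */
};

struct fault_env {
    bool thp_enabled;        /* assumed: THP allowed for this vma and system wide */
    bool huge_alloc_ok;      /* assumed: a huge page allocation would succeed */
    bool base_alloc_ok;      /* assumed: a base page allocation would succeed */
    int huge_alloc_attempts; /* counts how often a huge allocation was tried */
};

/* Decide how a fault on a device private huge entry is resolved. */
enum fault_result migrate_back_fault(struct fault_env *env)
{
    assert(env != NULL);

    /* Check the THP status first; never try a huge page when THP is off. */
    if (env->thp_enabled) {
        env->huge_alloc_attempts++;
        if (env->huge_alloc_ok)
            return FAULT_MAPPED_HUGE;
        /* Huge allocation failed: fall through to base pages, no error yet. */
    }

    if (env->base_alloc_ok)
        return FAULT_MAPPED_BASE;

    return FAULT_FAILED;
}
```

The point of the model is the ordering: the THP status is consulted before any huge allocation is attempted, and a failed huge allocation leads to the base page path rather than straight to an error.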