Subject: Re: [PATCH v4 0/9] Huge page-table entries for TTM
From: Thomas Hellström (VMware)
Organization: VMware Inc.
Date: Fri, 28 Feb 2020 14:08:04 +0100
To: Michal Hocko, Andrew Morton
Cc: linux-mm@kvack.org, dri-devel@lists.freedesktop.org,
    linux-kernel@vger.kernel.org, Ralph Campbell, pv-drivers@vmware.com,
    Dan Williams, "Matthew Wilcox (Oracle)", Jérôme Glisse,
    linux-graphics-maintainer@vmware.com, Christian König,
    "Kirill A. Shutemov"
References: <20200220122719.4302-1-thomas_os@shipmail.org>
In-Reply-To: <20200220122719.4302-1-thomas_os@shipmail.org>
X-Mailing-List: linux-kernel@vger.kernel.org

Andrew, Michal,

I'm wondering what's the best way here to get the patches touching mm
reviewed and accepted?
While drm people and VMware internal people have looked at them, I think
the huge_fault() fallback splitting and the introduction of
vma_is_special_huge() need looking at more thoroughly.

Apart from that, if possible, I think the best way to merge this series
is also through a DRM tree.

Thanks,
Thomas

On 2/20/20 1:27 PM, Thomas Hellström (VMware) wrote:
> In order to reduce TLB misses and CPU usage this patchset enables huge-
> and giant page-table entries for TTM and TTM-enabled graphics drivers.
>
> Patches 1 and 2 introduce a vma_is_special_huge() function to make the mm
> code take the same path as DAX when splitting huge- and giant page-table
> entries (which currently means zapping the page-table entry and relying on
> re-faulting).
>
> Patch 3 makes the mm code split existing huge page-table entries
> on huge_fault fallbacks, typically on COW or on buffer objects that want
> write-notify. COW and write-notification are always done on the lowest
> page-table level. See the patch log message for additional considerations.
>
> Patch 4 introduces functions to allow the graphics drivers to manipulate
> the caching- and encryption flags of huge page-table entries without ugly
> hacks.
>
> Patch 5 implements the huge_fault handler in TTM.
> This enables huge page-table entries, provided that the kernel is configured
> to support transhuge pages, either by default or using madvise().
> However, they are unlikely to be inserted unless the kernel buffer object
> pfns and user-space addresses align perfectly. There are various options
> here, but since buffer objects that reside in system pages typically start
> at huge page boundaries if they are backed by huge pages, we try to enforce
> buffer object starting pfns and user-space addresses to be huge page-size
> aligned if their size exceeds a huge page-size. If pud-size transhuge
> ("giant") pages are enabled by the arch, the same holds for those.
>
> Patch 6 implements a specialized huge_fault handler for vmwgfx.
> The vmwgfx driver may perform dirty-tracking and needs some special code
> to handle that correctly.
>
> Patch 7 implements a drm helper to align user-space addresses according
> to the above scheme, if possible.
>
> Patch 8 implements a TTM range manager for vmwgfx that does the same for
> graphics IO memory. This may later be reused by other graphics drivers
> if necessary.
>
> Patch 9 finally hooks up the helpers of patches 7 and 8 to the vmwgfx driver.
> A similar change is needed for graphics drivers that want a reasonable
> likelihood of actually using huge page-table entries.
>
> If a buffer object size is not huge-page or giant-page aligned,
> its size will NOT be inflated by this patchset. This means that the buffer
> object tail will use smaller size page-table entries and thus no memory
> overhead occurs. Drivers that want to pay the memory overhead price need to
> implement their own scheme to inflate buffer-object sizes.
>
> PMD size huge page-table entries have been tested with vmwgfx and found to
> work well both with system memory backed and IO memory backed buffer objects.
>
> PUD size giant page-table entries have seen limited (fault and COW) testing
> using a modified kernel (to support 1GB page allocations) and a fake vmwgfx
> TTM memory type. The vmwgfx driver does not otherwise support 1GB-size IO
> memory resources.
>
> Comments and suggestions welcome.
> Thomas
>
> Changes since RFC:
> * Check for buffer objects present in contiguous IO memory (Christian König)
> * Rebased on the vmwgfx emulated coherent memory functionality. That rebase
>   adds patch 5.
> Changes since v1:
> * Make the new TTM range manager vmwgfx-specific. (Christian König)
> * Minor fixes for configs that don't support or only partially support
>   transhuge pages.
> Changes since v2:
> * Minor coding style and doc fixes in patch 5/9 (Christian König)
> * Patch 5/9 doesn't touch mm. Remove from the patch title.
> Changes since v3:
> * Added reviews and acks
> * Implemented ugly but generic ttm_pgprot_is_wrprotecting() instead of arch
>   specific code.
>
> Cc: Andrew Morton
> Cc: Michal Hocko
> Cc: "Matthew Wilcox (Oracle)"
> Cc: "Kirill A. Shutemov"
> Cc: Ralph Campbell
> Cc: "Jérôme Glisse"
> Cc: "Christian König"
> Cc: Dan Williams
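
For anyone reviewing the alignment scheme described under patch 5 and the
"size will NOT be inflated" note, the arithmetic can be sketched roughly as
below. This is an illustrative Python model only, not code from the series.
The function names are hypothetical, the page sizes assume x86-64 (4 KiB base
pages, 2 MiB PMD, 1 GiB PUD), and treating "exceeds a huge page-size" as
"greater than or equal" is my reading rather than something the cover letter
states.

```python
# Illustrative model of the alignment policy; NOT kernel code.
# Sizes below are assumptions for x86-64.
PAGE_SIZE = 4 * 1024              # base page
PMD_SIZE = 2 * 1024 * 1024        # "huge" page-table entry
PUD_SIZE = 1024 * 1024 * 1024     # "giant" page-table entry


def pick_alignment(size, pud_enabled=False):
    """Return the start alignment a buffer object of `size` bytes would want.

    Hypothetical helper: objects at least one huge page in size get
    huge-page alignment; smaller ones keep base-page alignment.
    """
    if pud_enabled and size >= PUD_SIZE:
        return PUD_SIZE
    if size >= PMD_SIZE:
        return PMD_SIZE
    return PAGE_SIZE


def align_up(addr, align):
    """Round `addr` up to the next multiple of `align` (a power of two)."""
    return (addr + align - 1) & ~(align - 1)


def count_entries(start, size):
    """Count (pmd_entries, pte_entries) needed to map [start, start + size).

    The size is never inflated: whatever does not fill a whole, aligned
    PMD range is mapped with base pages.
    """
    end = start + size
    pmd_entries = 0
    pte_entries = 0
    addr = start
    while addr < end:
        if addr % PMD_SIZE == 0 and end - addr >= PMD_SIZE:
            pmd_entries += 1
            addr += PMD_SIZE
        else:
            pte_entries += 1
            addr += PAGE_SIZE
    return pmd_entries, pte_entries
```

With these assumptions, a 5 MiB object starting at a 2 MiB-aligned address
maps as two PMD entries plus 256 base-page entries for the tail, whereas a
4 MiB object whose start is off by a single base page gets only one PMD entry
and 512 base-page entries. That is the reason the series tries to align both
the starting pfn and the user-space address.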