Date: Mon, 31 Jul 2023 17:38:50 +0100
From: Matthew Wilcox
To: David Hildenbrand
Cc: Rongwei Wang, linux-arch@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, "xuyu@linux.alibaba.com"
Subject: Re: [PATCH RFC v2 0/4] Add support for sharing page tables across processes (Previously mshare)
References: <74fe50d9-9be9-cc97-e550-3ca30aebfd13@linux.alibaba.com> <9faea1cf-d3da-47ff-eb41-adc5bd73e5ca@linux.alibaba.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Jul 31, 2023 at 06:30:22PM +0200, David Hildenbrand wrote:
> Assume we do do the page table sharing at mmap time, if the flags are right.
> Let's focus on the most common:
>
> mmap(memfd, PROT_READ | PROT_WRITE, MAP_SHARED)
>
> And doing the same in each and every process.

That may be the most common in your usage, but for a database, you're
looking at two usage scenarios.  Postgres calls mmap() on the database
file itself so that all processes share the kernel page cache.
Some Commercial Databases call mmap() on a hugetlbfs file so that all
processes share the same userspace buffer cache.  Other Commercial
Databases call shmget() / shmat() with SHM_HUGETLB for the exact
same reason.

This is why I proposed mshare().  Anyone can use it for anything.
We have such a diverse set of users who want to do stuff with shared
page tables that we should not be tying it to memfd or any other
filesystem.  Not to mention that it's more flexible; you can map
individual 4kB files into it and still get page table sharing.
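
For readers less familiar with the pattern under discussion, here is a minimal sketch of the "every process calls mmap() with MAP_SHARED on the same file" case. It is an illustration only, not code from the patch series: the function name `shared_file_mapping_demo` is hypothetical, and an anonymous temporary file stands in for the memfd or database file. Python's `mmap` module is used because its flag names match the ones quoted above. Note that this shows sharing of the underlying pages; with plain mmap() each process still sets up its own mapping, which is the overhead the thread is about.

```python
import mmap
import os
import tempfile

PAGE = mmap.PAGESIZE


def shared_file_mapping_demo(message: bytes = b"hello from child") -> bytes:
    """Map one file MAP_SHARED in two processes and return what the parent sees.

    Hypothetical helper for illustration; requires a Unix-like system (os.fork).
    """
    assert len(message) <= PAGE
    with tempfile.TemporaryFile() as backing:
        # Give the file one page of zeroes so it can be mapped.
        backing.truncate(PAGE)

        # Parent creates its mapping first (read-only view of the shared pages).
        with mmap.mmap(backing.fileno(), PAGE,
                       flags=mmap.MAP_SHARED, prot=mmap.PROT_READ) as parent_view:
            pid = os.fork()
            if pid == 0:
                # Child: make its own, separate mapping of the same file and write.
                status = 1
                try:
                    with mmap.mmap(backing.fileno(), PAGE,
                                   flags=mmap.MAP_SHARED,
                                   prot=mmap.PROT_READ | mmap.PROT_WRITE) as child_view:
                        child_view[:len(message)] = message
                    status = 0
                finally:
                    # Leave immediately without running the parent's cleanup code.
                    os._exit(status)

            # Parent: wait for the child, then read through its own mapping.
            _, wait_status = os.waitpid(pid, 0)
            if wait_status != 0:
                raise RuntimeError("child process failed")
            return bytes(parent_view[:len(message)])


if __name__ == "__main__":
    print(shared_file_mapping_demo())
```

The parent observes the child's write even though the two mappings were created independently, because both are backed by the same file pages.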