Date: Mon, 31 Jul 2023 18:48:47 +0200
Subject: Re: [PATCH RFC v2 0/4] Add support for sharing page tables across processes (Previously mshare)
From: David Hildenbrand
Organization: Red Hat
To: Matthew Wilcox
Cc: Rongwei Wang, linux-arch@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, "xuyu@linux.alibaba.com"
References: <74fe50d9-9be9-cc97-e550-3ca30aebfd13@linux.alibaba.com> <9faea1cf-d3da-47ff-eb41-adc5bd73e5ca@linux.alibaba.com>
List-ID: linux-kernel@vger.kernel.org

On 31.07.23 18:38, Matthew Wilcox wrote:
> On Mon, Jul 31, 2023 at 06:30:22PM +0200, David Hildenbrand wrote:
>> Assume we do do the page table sharing at mmap time, if the flags are right.
>> Let's focus on the most common:
>>
>> mmap(memfd, PROT_READ | PROT_WRITE, MAP_SHARED)
>>
>> And doing the same in each and every process.
>
> That may be the most common in your usage, but for a database, you're
> looking at two usage scenarios. Postgres calls mmap() on the database
> file itself so that all processes share the kernel page cache.
> Some Commercial Databases call mmap() on a hugetlbfs file so that all
> processes share the same userspace buffer cache. Other Commercial
> Databases call shmget() / shmat() with SHM_HUGETLB for the exact
> same reason.

I remember you said that Postgres might be looking into using shmem as
well, but maybe I am wrong.

memfd/hugetlb/shmem could all be handled alike; only "arbitrary
filesystems" would require more work.

>
> This is why I proposed mshare(). Anyone can use it for anything.
> We have such a diverse set of users who want to do stuff with shared
> page tables that we should not be tying it to memfd or any other
> filesystem. Not to mention that it's more flexible; you can map
> individual 4kB files into it and still get page table sharing.

That's not what the current proposal does, or am I wrong?

Also, I'm curious, is that a real requirement in the database world?

-- 
Cheers,

David / dhildenb
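
For readers following along, the userspace pattern quoted above (each process calling mmap(memfd, PROT_READ | PROT_WRITE, MAP_SHARED) on the same memfd) can be sketched roughly as below. This is only an illustration of the mapping pattern, not of the proposed kernel-side page table sharing. The region size, the memfd name "demo-region", and the function names are assumptions made for the example and do not come from the thread.

```c
#define _GNU_SOURCE
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define REGION_SIZE (2UL * 1024 * 1024) /* illustrative size, not from the thread */

/* Map the whole memfd shared and writable, as in the quoted mmap() call. */
static void *map_region(int memfd)
{
	return mmap(NULL, REGION_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, memfd, 0);
}

/*
 * Returns 0 if data written by a child through its own mapping is visible
 * through the parent's separate mapping of the same memfd, -1 otherwise.
 */
int memfd_shared_roundtrip(void)
{
	static const char msg[] = "hello from child";
	int memfd = memfd_create("demo-region", MFD_CLOEXEC); /* hypothetical name */

	if (memfd < 0)
		return -1;
	if (ftruncate(memfd, REGION_SIZE) < 0) {
		close(memfd);
		return -1;
	}

	pid_t pid = fork();
	if (pid < 0) {
		close(memfd);
		return -1;
	}
	if (pid == 0) {
		/* Child: its own mmap() call, so it populates its own page tables. */
		char *p = map_region(memfd);
		if (p == MAP_FAILED)
			_exit(1);
		memcpy(p, msg, sizeof(msg));
		_exit(0);
	}

	int status;
	int rc = -1;
	if (waitpid(pid, &status, 0) == pid &&
	    WIFEXITED(status) && WEXITSTATUS(status) == 0) {
		/* Parent: a second, independent mapping of the same pages. */
		char *p = map_region(memfd);
		if (p != MAP_FAILED) {
			rc = (memcmp(p, msg, sizeof(msg)) == 0) ? 0 : -1;
			munmap(p, REGION_SIZE);
		}
	}
	close(memfd);
	return rc;
}
```

On a Linux system with memfd support, calling memfd_shared_roundtrip() from main() is expected to return 0. Both processes see the same pages, but each one sets up its own mapping of them, which is the duplication the page table sharing discussion is about.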