From: David Hildenbrand
Organization: Red Hat
To: Tiberiu Georgescu
Cc: Peter Xu, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
    Alistair Popple, Ivan Teterevkov, Mike Rapoport, Hugh Dickins,
    Matthew Wilcox, Andrea Arcangeli, "Kirill A. Shutemov", Andrew Morton,
    Mike Kravetz, "Carl Waldspurger [C]", Florian Schmidt, Jonathan Davies
Subject: Re: [PATCH RFC 0/4] mm: Enable PM_SWAP for shmem with PTE_MARKER
Date: Wed, 18 Aug 2021 20:13:47 +0200
Message-ID: <6ab58270-c487-2a56-b522-ea5100edb13c@redhat.com>
In-Reply-To: <7F645772-1212-4F0D-88AF-2569D5BBC2CD@nutanix.com>
References: <20210807032521.7591-1-peterx@redhat.com>
    <16a765e7-c2a3-982a-e585-c04067766e3f@redhat.com>
    <7F645772-1212-4F0D-88AF-2569D5BBC2CD@nutanix.com>

>>
>>> I'm now wondering whether for Tiberiu's case mincore() can also be used. It
>>> should just still be a bit slow because it'll look up the cache too, but it
>>> should work similarly like the original proposal.
>
> I am afraid that the information returned by mincore is a little too vague
> to be of better help, compared to what the pagemap should provide in theory.
> I will have a look to see whether lseek on proc/map_files works as a
> "PM_SWAP" equivalent. However, the swap offset would still be missing.

Well, with mincore() you could at least decide "page is present" vs.
"page is swapped or not existent". At least for making pageout decisions
it shouldn't really matter, no? madvise(MADV_PAGEOUT) on a hole is a nop.

But I'm not 100% sure what exactly your use case is here and what you
would really need, so you know best :)

>>
>> Very right, maybe we can just avoid tampering with pagemap on shmem
>> completely (which sounds like an excellent idea to me) and document it as
>> "On shared memory, we will never indicate SWAPPED if the pages have been
>> swapped out.
>> Further, PRESENT might be under-indicated: if a shared page is currently
>> not mapped into the page table of a process.". I saw there was a related,
>> proposed doc update, maybe we can finetune that.
>>
> We could take into consideration an alternative approach to retrieving the
> shared page info in user space, like storing it in sys/fs instead of per
> process. However, just leaving the pagemap functionality incomplete, and
> not providing an alternative to retrieve the missing information, does not
> seem right. Updating the docs with a "can't do" should be temporary, until
> an alternative or fix.
>

As I stated before, making pagemap less broken is not a good idea IMHO.
Either make it really correct or just leave it all broken -- and document
that, e.g., other interfaces (lseek) shall be used. It sounds like they
exist and are good enough for CRIU.

And TBH, if other interfaces already exist and get the job done, I'm more
than happy that we can avoid mixing more shmem stuff into pagemap and
trying to compensate for performance problems by introducing inconsistency.

If it has an fd and we can punch that into syscalls, we should much rather
use that fd to look up stuff than go via process page tables -- if
possible, of course (to be evaluated, because I haven't looked into the
CRIU details and how they use lseek with anonymous shared memory).

> Also, I think you are talking about my own doc update patch[3]. If not,
> please share a link with your next reply.
>
> [3] https://marc.info/?m=162878395426774

No, that's it.

-- 
Thanks,

David / dhildenb
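
The mincore() distinction discussed above ("page is present" vs. "page is
swapped or not existent") can be sketched in C roughly as follows. This is
only an illustration under stated assumptions, not code from the thread:
the helper names page_residency() and demo_residency() are made up, and
the demo uses a small anonymous shared mapping as a stand-in for shmem. It
cannot show a swapped-out page, only touched pages versus never-touched
holes, and whether a hole is reported as non-resident may depend on kernel
configuration (for example huge page settings).

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE          /* for mincore() and MAP_ANONYMOUS on glibc */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: fill vec[0..npages-1] with 1 for pages that
 * mincore() reports as resident and 0 otherwise. addr must be page
 * aligned. Returns 0 on success, -1 on failure (errno set by mincore()).
 */
int page_residency(void *addr, size_t npages, unsigned char *vec)
{
    size_t page_size = (size_t)sysconf(_SC_PAGESIZE);

    assert(vec != NULL);
    if (mincore(addr, npages * page_size, vec) != 0)
        return -1;

    /* Only the least significant bit of each entry is defined. */
    for (size_t i = 0; i < npages; i++)
        vec[i] &= 1;
    return 0;
}

/*
 * Hypothetical demo: map four pages of anonymous shared memory, touch
 * pages 0 and 2, and record what mincore() says about all four pages.
 * Pages 1 and 3 are never touched, so they are holes in the mapping.
 */
int demo_residency(unsigned char vec[4])
{
    size_t page_size = (size_t)sysconf(_SC_PAGESIZE);
    size_t len = 4 * page_size;
    char *mem = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    int ret;

    if (mem == MAP_FAILED)
        return -1;

    mem[0] = 1;                  /* fault in page 0 */
    mem[2 * page_size] = 1;      /* fault in page 2 */

    ret = page_residency(mem, 4, vec);
    munmap(mem, len);
    return ret;
}
```

A caller making pageout decisions could skip madvise(MADV_PAGEOUT) for
every page whose vec entry is 0, since paging out a hole is a nop anyway.
What this approach cannot do is tell a swapped-out page from a hole, and it
does not provide the swap offset that was mentioned as missing.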