From: "prakash.sangappa"
Reply-To: prakash.sangappa@oracle.com
To: Andrea Arcangeli
Cc: Michal Hocko, Mike Rapoport, linux-kernel@vger.kernel.org,
    linux-mm@kvack.org, Mike Kravetz, Dave Hansen, Christoph Hellwig,
    linux-api@vger.kernel.org, John Stultz
Subject: Re: [RFC PATCH] userfaultfd: Add feature to request for a signal delivery
Date: Wed, 5 Jul 2017 15:24:14 -0700
In-Reply-To: <20170704164034.GH5738@redhat.com>
References: <20170627070643.GA28078@dhcp22.suse.cz>
    <20170627153557.GB10091@rapoport-lnx>
    <51508e99-d2dd-894f-8d8a-678e3747c1ee@oracle.com>
    <20170628131806.GD10091@rapoport-lnx>
    <3a8e0042-4c49-3ec8-c59f-9036f8e54621@oracle.com>
    <20170629080910.GC31603@dhcp22.suse.cz>
    <936bde7b-1913-5589-22f4-9bbfdb6a8dd5@oracle.com>
    <20170630094718.GE22917@dhcp22.suse.cz>
    <20170630130813.GA5738@redhat.com>
    <5956F2EC.1000805@oracle.com>
    <20170704164034.GH5738@redhat.com>

On 07/04/2017 09:40 AM, Andrea Arcangeli wrote:
> On Fri, Jun 30, 2017 at 05:55:08PM -0700, prakash sangappa wrote:
>> Interesting that UFFDIO_COPY is faster than fallocate(). In the DB use case
>> the page does not need to be allocated at the time a process trips on
>> the hugetlbfs file hole and receives SIGBUS. fallocate() is called on
>> the hugetlbfs file, when more memory needs to be allocated by a
>> separate process.
> The major difference is that with UFFDIO_COPY the hugepage will be
> immediately mapped into the virtual address without requiring any
> further minor fault. So it's ideal if you could arrange to call
> UFFDIO_COPY from the same process that is going to touch and use the
> hugetlbfs data immediately after. You would eliminate a minor fault
> that way.

OK, we will see how it could be used in the DB use case.

>
> UFFDIO_COPY at least for anon was measured to perform better than a
> regular page fault too.
>> Regarding the hugetlbfs mount option, one consideration is to allow mounts
>> of hugetlbfs inside a user namespace's mount namespace, which would allow
>> non-privileged processes to mount hugetlbfs for use inside a user
>> namespace.
>> This may be needed even for the 'min_size' mount option, using which an
>> application could reserve huge pages and mount a filesystem for its use,
>> without the need to have privileges, given the system has enough hugepages
>> configured. It seems that if non-privileged processes are allowed to mount
>> a hugetlbfs filesystem, then min_size should be subject to some resource
>> limits.
>>
>> Mounting inside a user namespace will be a different patch proposal later.
> There's no particular reason to make UFFDIO_FEATURE_SIGBUS a
> privileged op unless we want to eliminate the branch with the static
> key, so it's certainly simpler than dealing with hugetlbfs min_size
> reserves.

OK, so for now I will not make UFFDIO_FEATURE_SIGBUS a privileged op,
and will not use the static key to eliminate the branch.

> I'm positive about the UFFDIO_FEATURE_SIGBUS tradeoffs, but others
> feel free to comment.
>
> If you could make a second patch to extend the selftest to exercise and
> validate UFFDIO_FEATURE_SIGBUS in anon/shmem/hugetlbfs it'd be great.

Sure, I will update the tests and send a patch.

Thanks,
-Prakash.

>
> Thanks,
> Andrea
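
For readers following the thread, the "trip on a hole, get SIGBUS, allocate, retry" flow discussed above can be sketched in user space. This is only an analogy under stated assumptions: it uses an ordinary temporary file and a store past end-of-file (which raises SIGBUS on Linux) in place of a hugetlbfs file hole with the proposed userfaultfd feature, and it allocates from the same process with posix_fallocate() rather than from a separate one. The names touch_hole_with_retry, on_sigbus and retry_point are hypothetical and do not come from the patch.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <fcntl.h>
#include <setjmp.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf retry_point;

/* SIGBUS handler: jump back to the retry point instead of terminating. */
static void on_sigbus(int sig)
{
    (void)sig;
    siglongjmp(retry_point, 1);
}

/*
 * Hypothetical helper: map one page of an empty temporary file, store to it,
 * and back the page with posix_fallocate() once SIGBUS has been delivered.
 * Returns the number of SIGBUS signals taken before the store succeeded,
 * or -1 on a setup or allocation failure.
 */
int touch_hole_with_retry(void)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);

    char path[] = "/tmp/sigbus-demo-XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path); /* the open descriptor keeps the file alive */

    struct sigaction sa, old;
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_sigbus;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGBUS, &sa, &old) != 0) {
        close(fd);
        return -1;
    }

    /* The file is still empty, so this mapping lies entirely past EOF. */
    volatile char *p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                            MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) {
        sigaction(SIGBUS, &old, NULL);
        close(fd);
        return -1;
    }

    volatile int faults = 0;
    volatile int result = -1;

    if (sigsetjmp(retry_point, 1) != 0) {
        /* We arrive here from the handler: allocate backing store once. */
        faults++;
        if (faults > 1 || posix_fallocate(fd, 0, page) != 0)
            goto out;
    }

    p[0] = 42; /* first attempt raises SIGBUS, the retry succeeds */
    if (p[0] == 42)
        result = faults;

out:
    munmap((void *)p, (size_t)page);
    sigaction(SIGBUS, &old, NULL);
    close(fd);
    return result;
}
```

Called from a small main(), touch_hole_with_retry() should return 1 on a typical Linux filesystem: one SIGBUS, then a successful store after the allocation. In the hugetlbfs setup described in the thread, the faulting process would see SIGBUS only with the proposed feature enabled, and the allocation would come from a separate process; neither aspect is modeled here.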