Message-ID: <03b90a2f-2854-6e19-6ccd-41f9933d8813@redhat.com>
Date: Mon, 17 Oct 2022 14:00:16 +0200
Subject: Re: [External] Re: [PATCH] mm: hugetlb: support get/set_policy for hugetlb_vm_ops
To: 黄杰
Cc: songmuchun@bytedance.com, Mike Kravetz, Andrew Morton, linux-mm@kvack.org, linux-kernel@vger.kernel.org
References: <20221012081526.73067-1-huangjie.albert@bytedance.com> <2aaf2c3a-6e49-abb9-b9c8-19ce87404982@redhat.com> <2f41fc4c-68eb-ab7d-970b-fcb10f474fd4@redhat.com>
From: David Hildenbrand
Organization: Red Hat
X-Mailing-List: linux-kernel@vger.kernel.org

On 17.10.22 13:46, 黄杰 wrote:
> David Hildenbrand wrote on Mon, Oct 17, 2022 at 19:33:
>>
>> On 17.10.22 11:48, 黄杰 wrote:
>>> David Hildenbrand wrote on Mon, Oct 17, 2022 at 16:44:
>>>>
>>>> On 12.10.22 10:15, Albert Huang wrote:
>>>>> From: "huangjie.albert"
>>>>>
>>>>> implement these two functions so that we can set the mempolicy to
>>>>> the inode of the hugetlb file. This ensures that the mempolicy of
>>>>> all processes sharing this huge page file is consistent.
>>>>>
>>>>> In some scenarios where huge pages are shared:
>>>>> if we need to limit the memory usage of vm within node0, so I set qemu's
>>>>> mempolicy bind to node0, but if there is a process (such as virtiofsd)
>>>>> shared memory with the vm, in this case. If the page fault is triggered
>>>>> by virtiofsd, the allocated memory may go to node1 which depends on
>>>>> virtiofsd.
>>>>>
>>>>
>>>> Any VM that uses hugetlb should be preallocating memory. For example,
>>>> this is the expected default under QEMU when using huge pages.
>>>>
>>>> Once preallocation does the right thing regarding NUMA policy, there is
>>>> no need to worry about it in other sub-processes.
>>>>
>>>
>>> Hi, David
>>> thanks for your reminder
>>>
>>> Yes, you are absolutely right, However, the pre-allocation mechanism
>>> does solve this problem.
>>> However, some scenarios do not like to use the pre-allocation mechanism, such as
>>> scenarios that are sensitive to virtual machine startup time, or
>>> scenarios that require
>>> high memory utilization. The on-demand allocation mechanism may be better,
>>> so the key point is to find a way support for shared policy.
>>
>> Using hugetlb -- with a fixed pool size -- without preallocation is like
>> playing with fire. Hugetlb reservation makes one believe that on-demand
>> allocation is going to work, but there are various scenarios where that
>> can go seriously wrong, and you can run out of huge pages.
>>
>> If you're using hugetlb as memory backend for a VM without
>> preallocation, you really have to be very careful. I can only advise
>> against doing that.
>>
>>
>> Also: why does another process read/write *first* to a guest physical
>> memory location before the OS running inside the VM even initialized
>> that memory? That sounds very wrong. What am I missing?
>>
>
> for example: virtio ring buffer.
> For the avail descriptor, the guest kernel only gives an address to
> the backend,
> and does not actually access the memory.

Okay, thanks. So we're essentially providing uninitialized memory to a
device? Hm, that implies that the device might have access to memory
that was previously used by someone else ... not sure how to feel about
that, but maybe this is just the way of doing things.

The "easy" user-space fix would be to simply mbind() the same way in
the other processes where we mmap(). Has that option been explored?

-- 
Thanks,

David / dhildenb
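
For illustration, the mbind()-after-mmap() idea mentioned above could look
roughly like the following in each process that maps the shared hugetlb
file (e.g. virtiofsd). This is only a sketch under stated assumptions, not
code from the patch or from any existing project: the helper names
(node_to_mask, bind_range_to_node, map_shared_on_node) and the SKETCH_*
macros are hypothetical, the nodemask is limited to a single unsigned long,
and the raw mbind syscall is used so that no libnuma headers are needed.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Value of MPOL_BIND in the Linux UAPI; redefined under a hypothetical
 * name so the sketch builds without libnuma or kernel headers. */
#define SKETCH_MPOL_BIND 2L

/* Number of node bits that fit into the one-word mask used here. */
#define SKETCH_MASK_BITS (sizeof(unsigned long) * 8)

/* Hypothetical helper: mask with only `node` set, 0 if it does not fit. */
static unsigned long node_to_mask(unsigned int node)
{
    return node < SKETCH_MASK_BITS ? 1UL << node : 0UL;
}

/* Hypothetical helper: apply MPOL_BIND for one node to [addr, addr + len). */
static long bind_range_to_node(void *addr, size_t len, unsigned int node)
{
    unsigned long mask = node_to_mask(node);

    if (mask == 0) {
        errno = EINVAL;
        return -1;
    }
    /* Assumption: maxnode is passed as "bits + 1", following the common
     * libnuma-style convention, so that the top bit is not ignored. */
    return syscall(SYS_mbind, addr, len, SKETCH_MPOL_BIND,
                   &mask, SKETCH_MASK_BITS + 1, 0UL);
}

/* Hypothetical helper: map the shared file and bind the range before the
 * first touch, so that later page faults in this process follow the same
 * node policy. Returns MAP_FAILED on any error, with errno set. */
static void *map_shared_on_node(int fd, size_t len, unsigned int node)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

    if (p == MAP_FAILED)
        return MAP_FAILED;
    if (bind_range_to_node(p, len, node) != 0) {
        int saved = errno;

        munmap(p, len);
        errno = saved;
        return MAP_FAILED;
    }
    return p;
}
```

A process sharing guest memory would then call something like
map_shared_on_node(fd, len, 0), with fd referring to the hugetlbfs file
that backs the VM, instead of a bare mmap(). Note that mbind() can fail
(for example in restricted containers), so callers should check for
MAP_FAILED.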