From: Liang Li
Date: Tue, 5 Jan 2021 18:22:03 +0800
Subject: Re: [RFC v2 PATCH 0/4] speed up page allocation for __GFP_ZERO
To: David Hildenbrand
Cc: Alexander Duyck, Mel Gorman, Andrew Morton, Andrea Arcangeli,
    Dan Williams, "Michael S. Tsirkin", Jason Wang, Dave Hansen,
    Michal Hocko, Liang Li, linux-mm, LKML,
    virtualization@lists.linux-foundation.org
X-Mailing-List: linux-kernel@vger.kernel.org

> >> That's mostly already existing scheduling logic, no? (How many VMs
> >> can I put onto a specific machine eventually?)
> >
> > It depends on how the scheduling component is designed.
> > Yes, you can put 10 VMs with 4C8G (4 CPUs, 8G RAM) on a host and 20
> > VMs with 2C4G on another one. But if one type of them, e.g. 4C8G, is
> > sold out, customers can't buy more 4C8G VMs while there are some free
> > 2C4G VMs; the resource reserved for them can be provided as 4C8G VMs.
> >
>
> 1. You can, just the startup time will be a little slower? E.g., grow
> the pre-allocated 4G file to 8G.
>
> 2. Or let's be creative: teach QEMU to construct a single
> RAMBlock/MemoryRegion out of multiple tmpfs files. Works as long as
> you don't go crazy on different VM sizes / size differences.
>
> 3. In your example above, you can dynamically rebalance as VMs are
> getting sold, to make sure you always have "big ones" lying around
> that you can shrink on demand.
>

Yes, we can always come up with some ways to make things work (see the
rough sketch of option 1 at the end of this mail), but it will make the
developers of the upper-layer components crazy. :)

> >
> > You must know there are a lot of functions in the kernel which could
> > be done in userspace, e.g. some of the device emulations like the
> > APIC, or the vhost-net backend, which has a userspace
> > implementation. :)
> > Bad or not depends on the benefits the solution brings.
> > From the viewpoint of a user space application, the kernel should
> > provide a high-performance memory management service. That's why
> > I think it should be done in the kernel.
>
> As I expressed a couple of times already, I don't see why using
> hugetlbfs and implementing some sort of pre-zeroing there isn't
> sufficient.

Did I miss something before? I thought you doubted the need for
hugetlbfs free page pre-zeroing. Hugetlbfs is a good choice and is
sufficient.

> We really don't *want* complicated things deep down in the mm core if
> there are reasonable alternatives.
>

I understand your concern: we should have a sufficient reason to add a
new feature to the kernel. For this one, its main value is making the
application's life easier, and implementing it in hugetlbfs avoids
adding more complexity to core MM.

I will send out a new revision and drop the 'buddy free pages pre zero
out' part. Thanks for your suggestion!

Liang
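
For reference, option 1 above (growing a pre-allocated backing file
instead of keeping separate per-size pools) can be sketched in plain
userspace C roughly as follows. The mount point, file name and sizes
are made up for illustration only and are not part of the patch series;
fallocate(2) is used for the pre-allocation, which works on tmpfs and,
on kernels that support it, on hugetlbfs as well.

/*
 * Rough sketch only: grow a pre-allocated guest memory backing file
 * from 4G to 8G instead of keeping separate per-size pools.  The path
 * and sizes are hypothetical.
 */
#define _GNU_SOURCE
#define _FILE_OFFSET_BITS 64
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define GiB (1024ULL * 1024 * 1024)

static int prealloc(int fd, off_t size)
{
        /* Mode 0 extends the file and actually reserves backing pages. */
        if (fallocate(fd, 0, 0, size) < 0) {
                perror("fallocate");
                return -1;
        }
        return 0;
}

int main(void)
{
        /* Hypothetical hugetlbfs (or tmpfs) backing file for one guest. */
        const char *path = "/dev/hugepages/vm-mem";
        int fd = open(path, O_CREAT | O_RDWR, 0600);

        if (fd < 0) {
                perror("open");
                return EXIT_FAILURE;
        }

        /* Pre-allocate 4G, e.g. for a 2C4G guest sold today. */
        if (prealloc(fd, 4 * GiB))
                return EXIT_FAILURE;

        /*
         * Later, when a 4C8G guest is sold, grow the same file to 8G
         * instead of drawing from a separate 8G pool.  The extra 4G
         * still has to be faulted in (and zeroed) by the consumer,
         * which is the slower-startup trade-off mentioned above.
         */
        if (prealloc(fd, 8 * GiB))
                return EXIT_FAILURE;

        close(fd);
        return EXIT_SUCCESS;
}

On hugetlbfs this needs enough huge pages reserved in the pool; the VMM
(e.g. QEMU with a file-backed memory backend) can then simply be pointed
at the grown file.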