From: "Huang, Ying"
To: David Hildenbrand
Cc: Michal Hocko, Matthew Wilcox, Andrew Morton, Mel Gorman,
 Vlastimil Babka, Zi Yan, Peter Zijlstra, Dave Hansen, Minchan Kim,
 Johannes Weiner, Hugh Dickins, Alexander Duyck
Subject: Re: [RFC 0/3] mm: Discard lazily freed pages when migrating
Date: Tue, 03 Mar 2020 08:25:43 +0800
Message-ID: <878ski708o.fsf@yhuang-dev.intel.com>
In-Reply-To: <8005e5a1-e2f2-1e57-ccb4-0cb9371b080d@redhat.com>
 (David Hildenbrand's message of "Mon, 2 Mar 2020 15:23:16 +0100")
References: <20200228033819.3857058-1-ying.huang@intel.com>
 <20200228034248.GE29971@bombadil.infradead.org>
 <87a7538977.fsf@yhuang-dev.intel.com>
 <871rqf850z.fsf@yhuang-dev.intel.com>
 <20200228095048.GK3771@dhcp22.suse.cz>
 <87d09u7sm2.fsf@yhuang-dev.intel.com>
 <8005e5a1-e2f2-1e57-ccb4-0cb9371b080d@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

David Hildenbrand writes:

> On 02.03.20 15:12, Huang, Ying wrote:
>> Michal Hocko writes:
>>
>>> On Fri 28-02-20 16:55:40, Huang, Ying wrote:
>>>> David Hildenbrand writes:
>>> [...]
>>>>> E.g., free page reporting in QEMU wants to use MADV_FREE. The guest will
>>>>> report currently free pages to the hypervisor, which will MADV_FREE the
>>>>> reported memory. As long as there is no memory pressure, there is no
>>>>> need to actually free the pages. Once the guest reuses such a page, it
>>>>> could happen that there is still the old page and pulling in in a fresh
>>>>> (zeroed) page can be avoided.
>>>>>
>>>>> AFAIKs, after your change, we would get more pages discarded from our
>>>>> guest, resulting in more fresh (zeroed) pages having to be pulled in
>>>>> when a guest touches a reported free page again. But OTOH, page
>>>>> migration is speed up (avoiding to migrate these pages).
>>>>
>>>> Let's look at this problem in another perspective. To migrate the
>>>> MADV_FREE pages of the QEMU process from the node A to the node B, we
>>>> need to free the original pages in the node A, and (maybe) allocate the
>>>> same number of pages in the node B. So the question becomes
>>>>
>>>> - we may need to allocate some pages in the node B
>>>> - these pages may be accessed by the application or not
>>>> - we should allocate all these pages in advance or allocate them lazily
>>>>   when they are accessed.
>>>>
>>>> We thought the common philosophy in Linux kernel is to allocate lazily.
>>>
>>> The common philosophy is to cache as much as possible.
>>
>> Yes. This is another common philosophy. And MADV_FREE pages is
>> different from caches such as the page caches because it has no valid
>> contents.
>
> Side note: It might contain valid content until discarded/zeroed out.
> E.g., an application could use a marker bit (e.g., first bit) to detect
> if the page still contains valid data or not. If the data is still
> marked valid, the content could be reuse immediately. Not sure if there
> is any such application, though :)

I don't think this is the typical use case. But I admit that this is
possible.

Best Regards,
Huang, Ying
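
For readers following the thread, the marker idea from David's side note
could be sketched in userspace roughly as below. This is only an
illustration, not code from the RFC or from any known application: the
page layout, the MARKER_VALID value and all cache_page_* names are
hypothetical. The one extra step is an assumption drawn from the
madvise(2) documentation, which says a pending MADV_FREE is cancelled
when the caller writes into the page: the sketch stores to a scratch
byte before it looks at the marker, so that the page cannot be
discarded between the check and the reuse.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS and MADV_FREE on glibc */
#endif
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical page layout, for illustration only:
 *   byte 0     marker, MARKER_VALID while the payload is usable
 *   byte 1     scratch byte, written only to cancel a pending lazy free
 *   bytes 2..  payload
 */
#define MARKER_VALID 0xA5u

/* Map one private anonymous page; returns NULL on failure. */
unsigned char *cache_page_alloc(size_t *len_out)
{
    long page_size = sysconf(_SC_PAGESIZE);
    if (page_size <= 2)
        return NULL;

    void *p = mmap(NULL, (size_t)page_size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;

    *len_out = (size_t)page_size;
    return (unsigned char *)p;
}

/* Fill the payload with `value`, then mark the page as valid. */
void cache_page_fill(unsigned char *page, size_t len, unsigned char value)
{
    assert(page != NULL && len > 2);
    memset(page + 2, value, len - 2);
    page[0] = MARKER_VALID;
}

/* Hint that the kernel may discard the page lazily. 0 on success, -1 on error. */
int cache_page_release(unsigned char *page, size_t len)
{
#ifdef MADV_FREE
    return madvise(page, len, MADV_FREE);
#else
    (void)page;
    (void)len;
    errno = ENOSYS; /* MADV_FREE is not available in these headers */
    return -1;
#endif
}

/*
 * Try to take the page back. Returns 1 if the old payload survived,
 * 0 if the page was discarded and now reads as zeroes.
 */
int cache_page_reclaim(unsigned char *page)
{
    volatile unsigned char *vp = page;

    assert(page != NULL);
    /* Store first: this either cancels the pending free or faults in a
     * fresh zeroed page, so the marker read below sees a settled state. */
    vp[1] = 1;
    return vp[0] == MARKER_VALID;
}
```

A caller would fill a page, call cache_page_release() once the data is
no longer needed, and later call cache_page_reclaim(); a return value of
0 means the payload has to be regenerated. Whether the old contents
survive depends on memory pressure (and, with the proposed change, on
whether the page was migrated in the meantime), so the result is not
predictable from userspace.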