Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752003AbdCVVR0 (ORCPT ); Wed, 22 Mar 2017 17:17:26 -0400 Received: from mail-pf0-f174.google.com ([209.85.192.174]:35263 "EHLO mail-pf0-f174.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751851AbdCVVRG (ORCPT ); Wed, 22 Mar 2017 17:17:06 -0400 MIME-Version: 1.0 From: Dan Streetman Date: Wed, 22 Mar 2017 17:16:23 -0400 Message-ID: Subject: maybe revert commit c275a57f5ec3 "xen/balloon: Set balloon's initial state to number of existing RAM pages" To: Boris Ostrovsky , Konrad Rzeszutek Wilk Cc: David Vrabel , Juergen Gross , xen-devel@lists.xenproject.org, linux-kernel@vger.kernel.org Content-Type: text/plain; charset=UTF-8 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2876 Lines: 54 I have a question about a problem introduced by this commit: c275a57f5ec3056f732843b11659d892235faff7 "xen/balloon: Set balloon's initial state to number of existing RAM pages" It changed the xen balloon current_pages calculation to start with the number of physical pages in the system, instead of max_pfn. Since get_num_physpages() does not include holes, it's always less than the e820 map's max_pfn. However, the problem that commit introduced is, if the hypervisor sets the balloon target to equal to the e820 map's max_pfn, then the balloon target will *always* be higher than the initial current pages. Even if the hypervisor sets the target to (e820 max_pfn - holes), if the OS adds any holes, the balloon target will be higher than the current pages. This is the situation, for example, for Amazon AWS instances. The result is, the xen balloon will always immediately hotplug some memory at boot, but then make only (max_pfn - get_num_physpages()) available to the system. This balloon-hotplugged memory can cause problems, if the hypervisor wasn't expecting it; specifically, the system's physical page addresses now will exceed the e820 map's max_pfn, due to the balloon-hotplugged pages; if the hypervisor isn't expecting pt-device DMA to/from those physical pages above the e820 max_pfn, it causes problems. For example: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1668129 The additional small amount of balloon memory can cause other problems as well, for example: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1518457 Anyway, I'd like to ask, was the original commit added because hypervisors are supposed to set their balloon target to the guest system's number of phys pages (max_pfn - holes)? The mailing list discussion and commit description seem to indicate that. However I'm not sure how that is possible, because the kernel reserves its own holes, regardless of any predefined holes in the e820 map; for example, the kernel reserves 64k (by default) at phys addr 0 (the amount of reservation is configurable via CONFIG_X86_RESERVE_LOW). So the hypervisor really has no way to know what the "right" target to specify is; unless it knows the exact guest OS and kernel version, and kernel config values, it will never be able to correctly specify its target to be exactly (e820 max_pfn - all holes). Should this commit be reverted? Should the xen balloon target be adjusted based on kernel-added e820 holes? Should something else be done? For context, Amazon Linux has simply disabled Xen ballooning completely. Likewise, we're planning to disable Xen ballooning in the Ubuntu kernel for Amazon AWS-specific kernels (but not for non-AWS Ubuntu kernels). However, if reverting this patch makes sense in a bigger context (i.e. Xen users besides AWS), that would allow more Ubuntu kernels to work correctly in AWS instances.