From: kys@exchange.microsoft.com
To: gregkh@linuxfoundation.org, linux-kernel@vger.kernel.org,
 devel@linuxdriverproject.org, olaf@aepfle.de, apw@canonical.com,
 vkuznets@redhat.com, jasowang@redhat.com, leann.ogasawara@canonical.com,
 marcelo.cerri@canonical.com, sthemmin@microsoft.com
Cc: "K . Y . Srinivasan"
Subject: [PATCH 11/12] hv_balloon: fix bugs in num_pages_onlined accounting
Date: Sun, 11 Feb 2018 17:33:19 -0700
Message-Id: <20180212003320.6748-11-kys@exchange.microsoft.com>
X-Mailer: git-send-email 2.15.1
In-Reply-To: <20180212003320.6748-1-kys@exchange.microsoft.com>
References: <20180212002958.6679-1-kys@exchange.microsoft.com>
 <20180212003320.6748-1-kys@exchange.microsoft.com>
Reply-To: kys@microsoft.com
X-Mailing-List: linux-kernel@vger.kernel.org

From: Vitaly Kuznetsov

Our num_pages_onlined accounting is buggy:

1) In case we're offlining a memory block which was present at boot (e.g.
   when there was no hotplug at all) we subtract 32k from 0 and as
   num_pages_onlined is unsigned get a very big positive number.

2) Commit 6df8d9aaf3af ("Drivers: hv: balloon: Correctly update onlined page
   count") made num_pages_onlined counter accurate on onlining but totally
   incorrect on offlining for partly populated regions: no matter how many
   pages were onlined and what was actually added to num_pages_onlined
   counter we always subtract the full region (32k) so again,
   num_pages_onlined can wrap around zero. By onlining/offlining the same
   partly populated region multiple times we can make the situation worse.

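Both cases above come down to plain unsigned subtraction. The following is a minimal user-space sketch of that behaviour, not driver code: `BLOCK_PAGES` and `offline_unchecked` are hypothetical names, and it assumes "32k" means 32768 pages.

```c
/* Hypothetical constant: the 32k pages of one memory block (assumed 32768). */
#define BLOCK_PAGES 32768UL

/*
 * Hypothetical model of the old MEM_OFFLINE accounting: the full block is
 * subtracted from the counter with no underflow check.
 */
static unsigned long offline_unchecked(unsigned long num_pages_onlined)
{
	/* Unsigned subtraction wraps modulo ULONG_MAX + 1 instead of going negative. */
	return num_pages_onlined - BLOCK_PAGES;
}
```

Starting from 0 (case 1) or from any count below 32768 (case 2), the result lands near the top of the `unsigned long` range rather than below zero.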
Solve these issues by doing accurate accounting on offlining: walk HAS list,
check for covered range and gaps.

Fixes: 6df8d9aaf3af ("Drivers: hv: balloon: Correctly update onlined page count")
Signed-off-by: Vitaly Kuznetsov
Signed-off-by: K. Y. Srinivasan
---
 drivers/hv/hv_balloon.c | 82 +++++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 69 insertions(+), 13 deletions(-)

diff --git a/drivers/hv/hv_balloon.c b/drivers/hv/hv_balloon.c
index 5b8e1ad1bcfe..7514a32d0de4 100644
--- a/drivers/hv/hv_balloon.c
+++ b/drivers/hv/hv_balloon.c
@@ -576,11 +576,65 @@ static struct hv_dynmem_device dm_device;
 static void post_status(struct hv_dynmem_device *dm);
 
 #ifdef CONFIG_MEMORY_HOTPLUG
+static inline bool has_pfn_is_backed(struct hv_hotadd_state *has,
+				     unsigned long pfn)
+{
+	struct hv_hotadd_gap *gap;
+
+	/* The page is not backed. */
+	if ((pfn < has->covered_start_pfn) || (pfn >= has->covered_end_pfn))
+		return false;
+
+	/* Check for gaps. */
+	list_for_each_entry(gap, &has->gap_list, list) {
+		if ((pfn >= gap->start_pfn) && (pfn < gap->end_pfn))
+			return false;
+	}
+
+	return true;
+}
+
+static unsigned long hv_page_offline_check(unsigned long start_pfn,
+					   unsigned long nr_pages)
+{
+	unsigned long pfn = start_pfn, count = 0;
+	struct hv_hotadd_state *has;
+	bool found;
+
+	while (pfn < start_pfn + nr_pages) {
+		/*
+		 * Search for HAS which covers the pfn and when we find one
+		 * count how many consecutive PFNs are covered.
+		 */
+		found = false;
+		list_for_each_entry(has, &dm_device.ha_region_list, list) {
+			while ((pfn >= has->start_pfn) &&
+			       (pfn < has->end_pfn) &&
+			       (pfn < start_pfn + nr_pages)) {
+				found = true;
+				if (has_pfn_is_backed(has, pfn))
+					count++;
+				pfn++;
+			}
+		}
+
+		/*
+		 * This PFN is not in any HAS (e.g. we're offlining a region
+		 * which was present at boot), no need to account for it. Go
+		 * to the next one.
+		 */
+		if (!found)
+			pfn++;
+	}
+
+	return count;
+}
+
 static int hv_memory_notifier(struct notifier_block *nb, unsigned long val,
 			      void *v)
 {
 	struct memory_notify *mem = (struct memory_notify *)v;
-	unsigned long flags;
+	unsigned long flags, pfn_count;
 
 	switch (val) {
 	case MEM_ONLINE:
@@ -593,7 +647,19 @@ static int hv_memory_notifier(struct notifier_block *nb, unsigned long val,
 
 	case MEM_OFFLINE:
 		spin_lock_irqsave(&dm_device.ha_lock, flags);
-		dm_device.num_pages_onlined -= mem->nr_pages;
+		pfn_count = hv_page_offline_check(mem->start_pfn,
+						  mem->nr_pages);
+		if (pfn_count <= dm_device.num_pages_onlined) {
+			dm_device.num_pages_onlined -= pfn_count;
+		} else {
+			/*
+			 * We're offlining more pages than we managed to online.
+			 * This is unexpected. In any case don't let
+			 * num_pages_onlined wrap around zero.
+			 */
+			WARN_ON_ONCE(1);
+			dm_device.num_pages_onlined = 0;
+		}
 		spin_unlock_irqrestore(&dm_device.ha_lock, flags);
 		break;
 	case MEM_GOING_ONLINE:
@@ -612,19 +678,9 @@ static struct notifier_block hv_memory_nb = {
 /* Check if the particular page is backed and can be onlined and online it. */
 static void hv_page_online_one(struct hv_hotadd_state *has, struct page *pg)
 {
-	struct hv_hotadd_gap *gap;
-	unsigned long pfn = page_to_pfn(pg);
-
-	/* The page is not backed. */
-	if ((pfn < has->covered_start_pfn) || (pfn >= has->covered_end_pfn))
+	if (!has_pfn_is_backed(has, page_to_pfn(pg)))
 		return;
 
-	/* Check for gaps. */
-	list_for_each_entry(gap, &has->gap_list, list) {
-		if ((pfn >= gap->start_pfn) && (pfn < gap->end_pfn))
-			return;
-	}
-
 	/* This frame is currently backed; online the page. */
 	__online_page_set_limits(pg);
 	__online_page_increment_counters(pg);
-- 
2.15.1
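
For readers who want to try the accounting logic outside the kernel, here is a rough user-space model of what the patch does on offlining. It is a sketch under stated assumptions, not the driver code: `struct gap`, `struct region`, `pfn_is_backed`, `count_backed` and `offline_account` are hypothetical names, arrays stand in for the kernel linked lists, regions are assumed not to overlap, and a simple per-pfn loop replaces the nested walk in hv_page_offline_check().

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for struct hv_hotadd_gap and struct hv_hotadd_state. */
struct gap {
	unsigned long start_pfn, end_pfn;	/* half-open range [start, end) */
};

struct region {
	unsigned long start_pfn, end_pfn;			/* whole region */
	unsigned long covered_start_pfn, covered_end_pfn;	/* backed part */
	const struct gap *gaps;
	size_t nr_gaps;
};

/* A pfn is backed if it lies inside the covered range and inside no gap. */
static bool pfn_is_backed(const struct region *r, unsigned long pfn)
{
	if (pfn < r->covered_start_pfn || pfn >= r->covered_end_pfn)
		return false;

	for (size_t i = 0; i < r->nr_gaps; i++) {
		if (pfn >= r->gaps[i].start_pfn && pfn < r->gaps[i].end_pfn)
			return false;
	}
	return true;
}

/* Count backed pfns in [start_pfn, start_pfn + nr_pages). */
static unsigned long count_backed(const struct region *regions, size_t nr,
				  unsigned long start_pfn,
				  unsigned long nr_pages)
{
	unsigned long count = 0;

	for (unsigned long pfn = start_pfn; pfn < start_pfn + nr_pages; pfn++) {
		for (size_t i = 0; i < nr; i++) {
			const struct region *r = &regions[i];

			if (pfn >= r->start_pfn && pfn < r->end_pfn) {
				if (pfn_is_backed(r, pfn))
					count++;
				break;	/* assumption: regions do not overlap */
			}
		}
		/* A pfn found in no region (e.g. boot memory) is not counted. */
	}
	return count;
}

/* Subtract the offlined pages without letting the counter wrap below zero. */
static unsigned long offline_account(unsigned long onlined,
				     unsigned long pfn_count)
{
	return pfn_count <= onlined ? onlined - pfn_count : 0;
}
```

With a single region spanning pfns 100..199, a covered range of 100..149 and one gap at 120..129, offlining the whole region counts 40 pages rather than 100, and offlining memory that belongs to no region counts 0, which matches the two cases the commit message describes.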