Received: by 2002:ac0:a5a6:0:0:0:0:0 with SMTP id m35-v6csp2285256imm; Thu, 20 Sep 2018 10:34:11 -0700 (PDT) X-Google-Smtp-Source: ANB0VdYaziObWEVrpgE8L1JK6CApooQqQ7h40SOtYKNyxU/LpvrOPGO+FeawRBlNHvfpSDBzkm72 X-Received: by 2002:a63:1f55:: with SMTP id q21-v6mr37181272pgm.88.1537464851770; Thu, 20 Sep 2018 10:34:11 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1537464851; cv=none; d=google.com; s=arc-20160816; b=VHnfYQ5J8o8pXTu5cz4vFzCi1AGSh2uo6c8G7j0aBdcGwNBP42gv94wcBNrV96n5+6 aA/wRmoXKfUQ1PruVTvpqSahQUx7X6ykuoS3eJtkHI7oAbhTGlVqjajxkyhz8eiLXDh0 lszYKbZ+cQo25WctaLcM9ti2CbQDl5aeW8xVsAu3PxlduMoXd1ku7E8voCHCEMPBcrn0 2Se71KnrvAyCPJfS54wrNrg9mv6/M0JgstyL3MR6/Hl8/kMRVnag1Brf6UnHEC+n0jir b9FsWbtcPRs6MPtOfUelKPxzb89U6yZxdplPw780lRQ/6xr+2+ABRfgbG4oufFmcp1UF ZrUw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from; bh=xDs1j66JulCWQa3G0Rf1kxv31rT4nhRLNlrnws2ivCc=; b=WCmWydGCtKSPiXv+qvUliSP9/OCr9uJHlaE5LCRQ0CjdvnoJUaUYKhvQr7U+MrJnjO oXcOstg/fBUhFdcUsmuWYcEG8HAm72kCB/SXNMeEMJpnLrapC+Hcum2nHJYammicc1j6 sJq2w608tuX7EmZI9Bgw7ZpNJj5dsLOiitWJoE2c1foc9tO3Mf2XAX2kLuVXolyU/snW BlJ9gdc5pyBZEquvZ5I5qAa8HjPzx6rNzIXWJlsCZ68X0zDNzT9YjfGe5VjwuAWQ5Ql1 +TAcjenBE2kgpw0KwWmKzMV09hsY2wRXdqHX4+CANZfg5D0LwfDAIOV5JV8w4FVuWBoE 1q8w== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=vmware.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id 15-v6si24219901pgu.205.2018.09.20.10.33.54; Thu, 20 Sep 2018 10:34:11 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=vmware.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2388229AbeITXQv (ORCPT + 99 others); Thu, 20 Sep 2018 19:16:51 -0400 Received: from ex13-edg-ou-001.vmware.com ([208.91.0.189]:8434 "EHLO EX13-EDG-OU-001.vmware.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726540AbeITXQg (ORCPT ); Thu, 20 Sep 2018 19:16:36 -0400 Received: from sc9-mailhost3.vmware.com (10.113.161.73) by EX13-EDG-OU-001.vmware.com (10.113.208.155) with Microsoft SMTP Server id 15.0.1156.6; Thu, 20 Sep 2018 10:31:47 -0700 Received: from sc2-haas01-esx0118.eng.vmware.com (sc2-haas01-esx0118.eng.vmware.com [10.172.44.118]) by sc9-mailhost3.vmware.com (Postfix) with ESMTP id 86EA04069B; Thu, 20 Sep 2018 10:31:48 -0700 (PDT) From: Nadav Amit To: Greg Kroah-Hartman , Arnd Bergmann CC: , Xavier Deguillard , Nadav Amit Subject: [PATCH v2 19/20] vmw_balloon: memory shrinker Date: Thu, 20 Sep 2018 10:30:25 -0700 Message-ID: <20180920173026.141333-20-namit@vmware.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20180920173026.141333-1-namit@vmware.com> References: <20180920173026.141333-1-namit@vmware.com> MIME-Version: 1.0 Content-Type: text/plain Received-SPF: None (EX13-EDG-OU-001.vmware.com: namit@vmware.com does not designate permitted sender hosts) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Adding a shrinker to the VMware balloon to prevent out-of-memory events. We reuse the deflate logic for this matter. Deadlocks should not happen, as no memory allocation is performed while the locks of the communication (batch/page) and page-list are taken. In the unlikely event in which the configuration semaphore is taken for write we bail out and fail gracefully (causing processes to be killed). Once the shrinker is called, inflation is postponed for few seconds. The timeout is updated without any lock, but this should not cause any races, as it is written and read atomically. Reviewed-by: Xavier Deguillard Signed-off-by: Nadav Amit --- drivers/misc/vmw_balloon.c | 124 ++++++++++++++++++++++++++++++++++++- 1 file changed, 122 insertions(+), 2 deletions(-) diff --git a/drivers/misc/vmw_balloon.c b/drivers/misc/vmw_balloon.c index 9da8b76342fd..75407be8ddbc 100644 --- a/drivers/misc/vmw_balloon.c +++ b/drivers/misc/vmw_balloon.c @@ -42,6 +42,10 @@ MODULE_ALIAS("dmi:*:svnVMware*:*"); MODULE_ALIAS("vmware_vmmemctl"); MODULE_LICENSE("GPL"); +/* Delay in seconds after shrink before inflation. */ +#define VMBALLOON_SHRINK_DELAY (5) + +/* Maximum number of refused pages we accumulate during inflation cycle */ #define VMW_BALLOON_MAX_REFUSED 16 /* @@ -216,12 +220,13 @@ enum vmballoon_stat_general { VMW_BALLOON_STAT_TIMER, VMW_BALLOON_STAT_DOORBELL, VMW_BALLOON_STAT_RESET, - VMW_BALLOON_STAT_LAST = VMW_BALLOON_STAT_RESET + VMW_BALLOON_STAT_SHRINK, + VMW_BALLOON_STAT_SHRINK_FREE, + VMW_BALLOON_STAT_LAST = VMW_BALLOON_STAT_SHRINK_FREE }; #define VMW_BALLOON_STAT_NUM (VMW_BALLOON_STAT_LAST + 1) - static DEFINE_STATIC_KEY_TRUE(vmw_balloon_batching); static DEFINE_STATIC_KEY_FALSE(balloon_stat_enabled); @@ -320,6 +325,15 @@ struct vmballoon { */ struct page *page; + /** + * @shrink_timeout: timeout until the next inflation. + * + * After an shrink event, indicates the time in jiffies after which + * inflation is allowed again. Can be written concurrently with reads, + * so must use READ_ONCE/WRITE_ONCE when accessing. + */ + unsigned long shrink_timeout; + /* statistics */ struct vmballoon_stats *stats; @@ -360,6 +374,20 @@ struct vmballoon { * Lock ordering: @conf_sem -> @comm_lock . */ spinlock_t comm_lock; + + /** + * @shrinker: shrinker interface that is used to avoid over-inflation. + */ + struct shrinker shrinker; + + /** + * @shrinker_registered: whether the shrinker was registered. + * + * The shrinker interface does not handle gracefully the removal of + * shrinker that was not registered before. This indication allows to + * simplify the unregistration process. + */ + bool shrinker_registered; }; static struct vmballoon balloon; @@ -902,6 +930,10 @@ static int64_t vmballoon_change(struct vmballoon *b) size - target < vmballoon_page_in_frames(VMW_BALLOON_2M_PAGE)) return 0; + /* If an out-of-memory recently occurred, inflation is disallowed. */ + if (target > size && time_before(jiffies, READ_ONCE(b->shrink_timeout))) + return 0; + return target - size; } @@ -1396,6 +1428,86 @@ static void vmballoon_work(struct work_struct *work) } +/** + * vmballoon_shrinker_scan() - deflate the balloon due to memory pressure. + * @shrinker: pointer to the balloon shrinker. + * @sc: page reclaim information. + * + * Returns: number of pages that were freed during deflation. + */ +static unsigned long vmballoon_shrinker_scan(struct shrinker *shrinker, + struct shrink_control *sc) +{ + struct vmballoon *b = &balloon; + unsigned long deflated_frames; + + pr_debug("%s - size: %llu", __func__, atomic64_read(&b->size)); + + vmballoon_stats_gen_inc(b, VMW_BALLOON_STAT_SHRINK); + + /* + * If the lock is also contended for read, we cannot easily reclaim and + * we bail out. + */ + if (!down_read_trylock(&b->conf_sem)) + return 0; + + deflated_frames = vmballoon_deflate(b, sc->nr_to_scan, true); + + vmballoon_stats_gen_add(b, VMW_BALLOON_STAT_SHRINK_FREE, + deflated_frames); + + /* + * Delay future inflation for some time to mitigate the situations in + * which balloon continuously grows and shrinks. Use WRITE_ONCE() since + * the access is asynchronous. + */ + WRITE_ONCE(b->shrink_timeout, jiffies + HZ * VMBALLOON_SHRINK_DELAY); + + up_read(&b->conf_sem); + + return deflated_frames; +} + +/** + * vmballoon_shrinker_count() - return the number of ballooned pages. + * @shrinker: pointer to the balloon shrinker. + * @sc: page reclaim information. + * + * Returns: number of 4k pages that are allocated for the balloon and can + * therefore be reclaimed under pressure. + */ +static unsigned long vmballoon_shrinker_count(struct shrinker *shrinker, + struct shrink_control *sc) +{ + struct vmballoon *b = &balloon; + + return atomic64_read(&b->size); +} + +static void vmballoon_unregister_shrinker(struct vmballoon *b) +{ + if (b->shrinker_registered) + unregister_shrinker(&b->shrinker); + b->shrinker_registered = false; +} + +static int vmballoon_register_shrinker(struct vmballoon *b) +{ + int r; + + b->shrinker.scan_objects = vmballoon_shrinker_scan; + b->shrinker.count_objects = vmballoon_shrinker_count; + b->shrinker.seeks = DEFAULT_SEEKS; + + r = register_shrinker(&b->shrinker); + + if (r == 0) + b->shrinker_registered = true; + + return r; +} + /* * DEBUGFS Interface */ @@ -1413,6 +1525,8 @@ static const char * const vmballoon_stat_names[] = { [VMW_BALLOON_STAT_TIMER] = "timer", [VMW_BALLOON_STAT_DOORBELL] = "doorbell", [VMW_BALLOON_STAT_RESET] = "reset", + [VMW_BALLOON_STAT_SHRINK] = "shrink", + [VMW_BALLOON_STAT_SHRINK_FREE] = "shrinkFree" }; static int vmballoon_enable_stats(struct vmballoon *b) @@ -1757,6 +1871,10 @@ static int __init vmballoon_init(void) INIT_DELAYED_WORK(&balloon.dwork, vmballoon_work); + error = vmballoon_register_shrinker(&balloon); + if (error) + goto fail; + error = vmballoon_debugfs_init(&balloon); if (error) goto fail; @@ -1782,6 +1900,7 @@ static int __init vmballoon_init(void) return 0; fail: + vmballoon_unregister_shrinker(&balloon); vmballoon_compaction_deinit(&balloon); return error; } @@ -1796,6 +1915,7 @@ late_initcall(vmballoon_init); static void __exit vmballoon_exit(void) { + vmballoon_unregister_shrinker(&balloon); vmballoon_vmci_cleanup(&balloon); cancel_delayed_work_sync(&balloon.dwork); -- 2.17.1