Received: by 2002:a05:6a10:a0d1:0:0:0:0 with SMTP id j17csp1667071pxa; Sun, 16 Aug 2020 06:54:52 -0700 (PDT) X-Google-Smtp-Source: ABdhPJwGyOz2ThwMFE70d1E5hz2Mvz5W3jmZIz5myQycH59YuhrUQrxlr9MQANrBrxobwkdEs5Hp X-Received: by 2002:a17:906:d8f:: with SMTP id m15mr10829882eji.494.1597586092043; Sun, 16 Aug 2020 06:54:52 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1597586092; cv=none; d=google.com; s=arc-20160816; b=dfeUXdqopPtblvTlHggTlX2nOGrOgIVl4pUrYPkQbLMLvWMQfYLh/FZ6qCN1dqRTwD 8/XKtETEHQ0FKxTIP40nLKuTGHWNRUVFd+dlri+etPCi99yhDll5XiVZl6Hyq9JgXfJ8 +ZXt7gpedLqhya12YlkUkmllN9n3zJLj5GG4OS217UZMxTBnMMN3DWYofCwmw4fp2Xe+ hDGOEM/7u6PMw1hddBtmzmtcQagy85+8InHsAAtKzgr1HPk81Eb/WQL8qLKZMAhiyLoA a7o9/89MxElG9YTO8K+Om+5U5m8EdgCZR6+K9kNmDtnwQDgPOVkoh00ZSmnrTWPiVtyV r1nw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=mcFLaiIwXztw2tm8Rg2ib9m7zR+rRxxy65slu0ULfFE=; b=VeIb7ZxpkObAQM3E05hGv8d1YYjo0XKNwL7j1obi1LgMBE57kJAqgV/XjgQHnzM0Up XbG9GOVR0G39gkKzaZ2Ie5agZ04bLOmyM70Ua2ytBENnuaO1wADV+tyWdwCWNzH29EsZ q0TpVyzPk1qtxYocnZxPEz5I3vpV+H/Tkbqiaf0Se3nYQXZns2MbJwmNnsoJr1XlRQNt xAxa1WBlomKDOJD/mq5NChL69jR5+J5UkThblZC/qSFmdVkCIeS5cy01AQ7LmEvvd7T5 OBZgsB59GByCtMs1OpgoU3/ABdJXkVr4pXOljV+ir+/ixmHdbNOisFyfbc6FdyHLfC6M Os/Q== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=Cmk23l51; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id by28si9116719ejc.510.2020.08.16.06.54.05; Sun, 16 Aug 2020 06:54:52 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=Cmk23l51; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728984AbgHPMyO (ORCPT + 99 others); Sun, 16 Aug 2020 08:54:14 -0400 Received: from us-smtp-delivery-1.mimecast.com ([207.211.31.120]:24182 "EHLO us-smtp-1.mimecast.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1728871AbgHPMyL (ORCPT ); Sun, 16 Aug 2020 08:54:11 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1597582449; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=mcFLaiIwXztw2tm8Rg2ib9m7zR+rRxxy65slu0ULfFE=; b=Cmk23l516plcNiYTX9LxOFLG4ygJ/LKkIKyn/54kdVnlv06tdTq+EkfXf1E5IRAiPLxvfO xNWDb09cm6QGjDBlMYRMt2JX96mqOTS2AagvB9c8zyQCQ36dF+2TEhIlORmTZjyVe2vhxO MmR5DQOOZYM7xVxULrcelk9aFwj9jx0= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-579-3GEnkGrQOaqEsUnP1L-rWg-1; Sun, 16 Aug 2020 08:54:08 -0400 X-MC-Unique: 3GEnkGrQOaqEsUnP1L-rWg-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.phx2.redhat.com [10.5.11.12]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id C266D802B47; Sun, 16 Aug 2020 12:54:06 +0000 (UTC) Received: from t480s.redhat.com (ovpn-112-43.ams2.redhat.com [10.36.112.43]) by smtp.corp.redhat.com (Postfix) with ESMTP id 9EEAA196B8; Sun, 16 Aug 2020 12:54:01 +0000 (UTC) From: David Hildenbrand To: linux-kernel@vger.kernel.org Cc: virtualization@lists.linux-foundation.org, linux-mm@kvack.org, David Hildenbrand , Andrew Morton , Michal Hocko , "Michael S . Tsirkin" , Jason Wang , Mike Kravetz , Pankaj Gupta , Baoquan He Subject: [PATCH v5 5/6] virtio-mem: don't special-case ZONE_MOVABLE Date: Sun, 16 Aug 2020 14:53:32 +0200 Message-Id: <20200816125333.7434-6-david@redhat.com> In-Reply-To: <20200816125333.7434-1-david@redhat.com> References: <20200816125333.7434-1-david@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Scanned-By: MIMEDefang 2.79 on 10.5.11.12 Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org When introducing virtio-mem, the semantics of ZONE_MOVABLE were rather unclear, which is why we special-cased ZONE_MOVABLE such that partially plugged blocks would never end up in ZONE_MOVABLE. Now that the semantics are much clearer (and will be documented in a follow-up patch including the new virtio-mem behavior), let's allow to online partially plugged memory blocks to ZONE_MOVABLE and also consider memory blocks that were onlined to ZONE_MOVABLE when unplugging memory. While unplugged memory pages are, in general, unmovable, they can be skipped when offlining memory. virtio-mem only unplugs fairly big chunks (in the megabyte range) and rather tries to shrink the memory region than randomly choosing memory. In theory, if all other pages in the movable zone would be movable, virtio-mem would only shrink that zone and not create any kind of fragmentation. In the future, we might want to remember the zone again and use the information when (un)plugging memory. For now, let's keep it simple. Note: Support for defragmentation is planned, to deal with fragmentation after unplug due to memory chunks within memory blocks that could not get unplugged before (e.g., somebody pinning pages within ZONE_MOVABLE for a longer time). Cc: Andrew Morton Cc: Michal Hocko Cc: Michael S. Tsirkin Cc: Jason Wang Cc: Mike Kravetz Cc: Pankaj Gupta Cc: Baoquan He Signed-off-by: David Hildenbrand --- drivers/virtio/virtio_mem.c | 47 +++++++------------------------------ 1 file changed, 8 insertions(+), 39 deletions(-) diff --git a/drivers/virtio/virtio_mem.c b/drivers/virtio/virtio_mem.c index f26f5f64ae822..2ddfc4a0e2ee0 100644 --- a/drivers/virtio/virtio_mem.c +++ b/drivers/virtio/virtio_mem.c @@ -36,18 +36,10 @@ enum virtio_mem_mb_state { VIRTIO_MEM_MB_STATE_OFFLINE, /* Partially plugged, fully added to Linux, offline. */ VIRTIO_MEM_MB_STATE_OFFLINE_PARTIAL, - /* Fully plugged, fully added to Linux, online (!ZONE_MOVABLE). */ + /* Fully plugged, fully added to Linux, online. */ VIRTIO_MEM_MB_STATE_ONLINE, - /* Partially plugged, fully added to Linux, online (!ZONE_MOVABLE). */ + /* Partially plugged, fully added to Linux, online. */ VIRTIO_MEM_MB_STATE_ONLINE_PARTIAL, - /* - * Fully plugged, fully added to Linux, online (ZONE_MOVABLE). - * We are not allowed to allocate (unplug) parts of this block that - * are not movable (similar to gigantic pages). We will never allow - * to online OFFLINE_PARTIAL to ZONE_MOVABLE (as they would contain - * unmovable parts). - */ - VIRTIO_MEM_MB_STATE_ONLINE_MOVABLE, VIRTIO_MEM_MB_STATE_COUNT }; @@ -526,21 +518,10 @@ static bool virtio_mem_owned_mb(struct virtio_mem *vm, unsigned long mb_id) } static int virtio_mem_notify_going_online(struct virtio_mem *vm, - unsigned long mb_id, - enum zone_type zone) + unsigned long mb_id) { switch (virtio_mem_mb_get_state(vm, mb_id)) { case VIRTIO_MEM_MB_STATE_OFFLINE_PARTIAL: - /* - * We won't allow to online a partially plugged memory block - * to the MOVABLE zone - it would contain unmovable parts. - */ - if (zone == ZONE_MOVABLE) { - dev_warn_ratelimited(&vm->vdev->dev, - "memory block has holes, MOVABLE not supported\n"); - return NOTIFY_BAD; - } - return NOTIFY_OK; case VIRTIO_MEM_MB_STATE_OFFLINE: return NOTIFY_OK; default: @@ -560,7 +541,6 @@ static void virtio_mem_notify_offline(struct virtio_mem *vm, VIRTIO_MEM_MB_STATE_OFFLINE_PARTIAL); break; case VIRTIO_MEM_MB_STATE_ONLINE: - case VIRTIO_MEM_MB_STATE_ONLINE_MOVABLE: virtio_mem_mb_set_state(vm, mb_id, VIRTIO_MEM_MB_STATE_OFFLINE); break; @@ -579,24 +559,17 @@ static void virtio_mem_notify_offline(struct virtio_mem *vm, virtio_mem_retry(vm); } -static void virtio_mem_notify_online(struct virtio_mem *vm, unsigned long mb_id, - enum zone_type zone) +static void virtio_mem_notify_online(struct virtio_mem *vm, unsigned long mb_id) { unsigned long nb_offline; switch (virtio_mem_mb_get_state(vm, mb_id)) { case VIRTIO_MEM_MB_STATE_OFFLINE_PARTIAL: - BUG_ON(zone == ZONE_MOVABLE); virtio_mem_mb_set_state(vm, mb_id, VIRTIO_MEM_MB_STATE_ONLINE_PARTIAL); break; case VIRTIO_MEM_MB_STATE_OFFLINE: - if (zone == ZONE_MOVABLE) - virtio_mem_mb_set_state(vm, mb_id, - VIRTIO_MEM_MB_STATE_ONLINE_MOVABLE); - else - virtio_mem_mb_set_state(vm, mb_id, - VIRTIO_MEM_MB_STATE_ONLINE); + virtio_mem_mb_set_state(vm, mb_id, VIRTIO_MEM_MB_STATE_ONLINE); break; default: BUG(); @@ -675,7 +648,6 @@ static int virtio_mem_memory_notifier_cb(struct notifier_block *nb, const unsigned long start = PFN_PHYS(mhp->start_pfn); const unsigned long size = PFN_PHYS(mhp->nr_pages); const unsigned long mb_id = virtio_mem_phys_to_mb_id(start); - enum zone_type zone; int rc = NOTIFY_OK; if (!virtio_mem_overlaps_range(vm, start, size)) @@ -717,8 +689,7 @@ static int virtio_mem_memory_notifier_cb(struct notifier_block *nb, break; } vm->hotplug_active = true; - zone = page_zonenum(pfn_to_page(mhp->start_pfn)); - rc = virtio_mem_notify_going_online(vm, mb_id, zone); + rc = virtio_mem_notify_going_online(vm, mb_id); break; case MEM_OFFLINE: virtio_mem_notify_offline(vm, mb_id); @@ -726,8 +697,7 @@ static int virtio_mem_memory_notifier_cb(struct notifier_block *nb, mutex_unlock(&vm->hotplug_mutex); break; case MEM_ONLINE: - zone = page_zonenum(pfn_to_page(mhp->start_pfn)); - virtio_mem_notify_online(vm, mb_id, zone); + virtio_mem_notify_online(vm, mb_id); vm->hotplug_active = false; mutex_unlock(&vm->hotplug_mutex); break; @@ -1906,8 +1876,7 @@ static void virtio_mem_remove(struct virtio_device *vdev) if (vm->nb_mb_state[VIRTIO_MEM_MB_STATE_OFFLINE] || vm->nb_mb_state[VIRTIO_MEM_MB_STATE_OFFLINE_PARTIAL] || vm->nb_mb_state[VIRTIO_MEM_MB_STATE_ONLINE] || - vm->nb_mb_state[VIRTIO_MEM_MB_STATE_ONLINE_PARTIAL] || - vm->nb_mb_state[VIRTIO_MEM_MB_STATE_ONLINE_MOVABLE]) { + vm->nb_mb_state[VIRTIO_MEM_MB_STATE_ONLINE_PARTIAL]) { dev_warn(&vdev->dev, "device still has system memory added\n"); } else { virtio_mem_delete_resource(vm); -- 2.26.2