Let's trigger from offlining code when we're not allowed to touch online
memory.
Handle the other case (memmap possibly freeing up another memory block)
when actually removing memory. When removing via virtio_mem_remove(),
virtio_mem_retry() is a NOP and safe to use.
While at it, move retry handling when offlining out of
virtio_mem_notify_offline(), to share it with Device Block Mode (DBM)
soon.
Cc: "Michael S. Tsirkin" <[email protected]>
Cc: Jason Wang <[email protected]>
Cc: Pankaj Gupta <[email protected]>
Signed-off-by: David Hildenbrand <[email protected]>
---
drivers/virtio/virtio_mem.c | 40 ++++++++++++++++++++++++++-----------
1 file changed, 28 insertions(+), 12 deletions(-)
diff --git a/drivers/virtio/virtio_mem.c b/drivers/virtio/virtio_mem.c
index 5c93f8a65eba..8ea00f0b2ecd 100644
--- a/drivers/virtio/virtio_mem.c
+++ b/drivers/virtio/virtio_mem.c
@@ -158,6 +158,7 @@ static DEFINE_MUTEX(virtio_mem_mutex);
static LIST_HEAD(virtio_mem_devices);
static void virtio_mem_online_page_cb(struct page *page, unsigned int order);
+static void virtio_mem_retry(struct virtio_mem *vm);
/*
* Register a virtio-mem device so it will be considered for the online_page
@@ -435,9 +436,17 @@ static int virtio_mem_mb_add(struct virtio_mem *vm, unsigned long mb_id)
static int virtio_mem_mb_remove(struct virtio_mem *vm, unsigned long mb_id)
{
const uint64_t addr = virtio_mem_mb_id_to_phys(mb_id);
+ int rc;
dev_dbg(&vm->vdev->dev, "removing memory block: %lu\n", mb_id);
- return remove_memory(vm->nid, addr, memory_block_size_bytes());
+ rc = remove_memory(vm->nid, addr, memory_block_size_bytes());
+ if (!rc)
+ /*
+ * We might have freed up memory we can now unplug, retry
+ * immediately instead of waiting.
+ */
+ virtio_mem_retry(vm);
+ return rc;
}
/*
@@ -452,11 +461,19 @@ static int virtio_mem_mb_offline_and_remove(struct virtio_mem *vm,
unsigned long mb_id)
{
const uint64_t addr = virtio_mem_mb_id_to_phys(mb_id);
+ int rc;
dev_dbg(&vm->vdev->dev, "offlining and removing memory block: %lu\n",
mb_id);
- return offline_and_remove_memory(vm->nid, addr,
- memory_block_size_bytes());
+ rc = offline_and_remove_memory(vm->nid, addr,
+ memory_block_size_bytes());
+ if (!rc)
+ /*
+ * We might have freed up memory we can now unplug, retry
+ * immediately instead of waiting.
+ */
+ virtio_mem_retry(vm);
+ return rc;
}
/*
@@ -534,15 +551,6 @@ static void virtio_mem_notify_offline(struct virtio_mem *vm,
BUG();
break;
}
-
- /*
- * Trigger the workqueue, maybe we can now unplug memory. Also,
- * when we offline and remove a memory block, this will re-trigger
- * us immediately - which is often nice because the removal of
- * the memory block (e.g., memmap) might have freed up memory
- * on other memory blocks we manage.
- */
- virtio_mem_retry(vm);
}
static void virtio_mem_notify_online(struct virtio_mem *vm, unsigned long mb_id)
@@ -679,6 +687,14 @@ static int virtio_mem_memory_notifier_cb(struct notifier_block *nb,
break;
case MEM_OFFLINE:
virtio_mem_notify_offline(vm, mb_id);
+
+ /*
+ * Trigger the workqueue. Now that we have some offline memory,
+ * maybe we can handle pending unplug requests.
+ */
+ if (!unplug_online)
+ virtio_mem_retry(vm);
+
vm->hotplug_active = false;
mutex_unlock(&vm->hotplug_mutex);
break;
--
2.26.2
On Mon, Oct 12, 2020 at 02:53:03PM +0200, David Hildenbrand wrote:
>Let's trigger from offlining code when we're not allowed to touch online
>memory.
This describes the change in virtio_mem_memory_notifier_cb()?
>
>Handle the other case (memmap possibly freeing up another memory block)
>when actually removing memory. When removing via virtio_mem_remove(),
>virtio_mem_retry() is a NOP and safe to use.
>
>While at it, move retry handling when offlining out of
>virtio_mem_notify_offline(), to share it with Device Block Mode (DBM)
>soon.
I may not understand the logic fully. Here is my understanding of current
logic:
virtio_mem_run_wq()
virtio_mem_unplug_request()
virtio_mem_mb_unplug_any_sb_offline()
virtio_mem_mb_remove() --- 1
virtio_mem_mb_unplug_any_sb_online()
virtio_mem_mb_offline_and_remove() --- 2
This patch tries to trigger the wq at 1 and 2. And these two functions are
only valid during this code flow.
These two functions actually remove some memory from the system. So I am not
sure where extra unplug-able memory comes from. I guess those memory is from
memory block device and mem_sectioin, memmap? While those memory is still
marked as online, right?
In case we can gather extra memory at 1 and form a whole memory block. So that
we can unplug an online memory block (by moving data to a new place), this
just affect the process at 2. This means there is no need to trigger the wq at
1, and we can leave it at 2.
>
>Cc: "Michael S. Tsirkin" <[email protected]>
>Cc: Jason Wang <[email protected]>
>Cc: Pankaj Gupta <[email protected]>
>Signed-off-by: David Hildenbrand <[email protected]>
>---
> drivers/virtio/virtio_mem.c | 40 ++++++++++++++++++++++++++-----------
> 1 file changed, 28 insertions(+), 12 deletions(-)
>
>diff --git a/drivers/virtio/virtio_mem.c b/drivers/virtio/virtio_mem.c
>index 5c93f8a65eba..8ea00f0b2ecd 100644
>--- a/drivers/virtio/virtio_mem.c
>+++ b/drivers/virtio/virtio_mem.c
>@@ -158,6 +158,7 @@ static DEFINE_MUTEX(virtio_mem_mutex);
> static LIST_HEAD(virtio_mem_devices);
>
> static void virtio_mem_online_page_cb(struct page *page, unsigned int order);
>+static void virtio_mem_retry(struct virtio_mem *vm);
>
> /*
> * Register a virtio-mem device so it will be considered for the online_page
>@@ -435,9 +436,17 @@ static int virtio_mem_mb_add(struct virtio_mem *vm, unsigned long mb_id)
> static int virtio_mem_mb_remove(struct virtio_mem *vm, unsigned long mb_id)
> {
> const uint64_t addr = virtio_mem_mb_id_to_phys(mb_id);
>+ int rc;
>
> dev_dbg(&vm->vdev->dev, "removing memory block: %lu\n", mb_id);
>- return remove_memory(vm->nid, addr, memory_block_size_bytes());
>+ rc = remove_memory(vm->nid, addr, memory_block_size_bytes());
>+ if (!rc)
>+ /*
>+ * We might have freed up memory we can now unplug, retry
>+ * immediately instead of waiting.
>+ */
>+ virtio_mem_retry(vm);
>+ return rc;
> }
>
> /*
>@@ -452,11 +461,19 @@ static int virtio_mem_mb_offline_and_remove(struct virtio_mem *vm,
> unsigned long mb_id)
> {
> const uint64_t addr = virtio_mem_mb_id_to_phys(mb_id);
>+ int rc;
>
> dev_dbg(&vm->vdev->dev, "offlining and removing memory block: %lu\n",
> mb_id);
>- return offline_and_remove_memory(vm->nid, addr,
>- memory_block_size_bytes());
>+ rc = offline_and_remove_memory(vm->nid, addr,
>+ memory_block_size_bytes());
>+ if (!rc)
>+ /*
>+ * We might have freed up memory we can now unplug, retry
>+ * immediately instead of waiting.
>+ */
>+ virtio_mem_retry(vm);
>+ return rc;
> }
>
> /*
>@@ -534,15 +551,6 @@ static void virtio_mem_notify_offline(struct virtio_mem *vm,
> BUG();
> break;
> }
>-
>- /*
>- * Trigger the workqueue, maybe we can now unplug memory. Also,
>- * when we offline and remove a memory block, this will re-trigger
>- * us immediately - which is often nice because the removal of
>- * the memory block (e.g., memmap) might have freed up memory
>- * on other memory blocks we manage.
>- */
>- virtio_mem_retry(vm);
> }
>
> static void virtio_mem_notify_online(struct virtio_mem *vm, unsigned long mb_id)
>@@ -679,6 +687,14 @@ static int virtio_mem_memory_notifier_cb(struct notifier_block *nb,
> break;
> case MEM_OFFLINE:
> virtio_mem_notify_offline(vm, mb_id);
>+
>+ /*
>+ * Trigger the workqueue. Now that we have some offline memory,
>+ * maybe we can handle pending unplug requests.
>+ */
>+ if (!unplug_online)
>+ virtio_mem_retry(vm);
>+
> vm->hotplug_active = false;
> mutex_unlock(&vm->hotplug_mutex);
> break;
>--
>2.26.2
--
Wei Yang
Help you, Help me
On 16.10.20 06:03, Wei Yang wrote:
> On Mon, Oct 12, 2020 at 02:53:03PM +0200, David Hildenbrand wrote:
>> Let's trigger from offlining code when we're not allowed to touch online
>> memory.
>
> This describes the change in virtio_mem_memory_notifier_cb()?
Ah, yes, can try to make that clearer.
>
>>
>> Handle the other case (memmap possibly freeing up another memory block)
>> when actually removing memory. When removing via virtio_mem_remove(),
>> virtio_mem_retry() is a NOP and safe to use.
>>
>> While at it, move retry handling when offlining out of
>> virtio_mem_notify_offline(), to share it with Device Block Mode (DBM)
>> soon.
>
> I may not understand the logic fully. Here is my understanding of current
> logic:
>
>
> virtio_mem_run_wq()
> virtio_mem_unplug_request()
> virtio_mem_mb_unplug_any_sb_offline()
> virtio_mem_mb_remove() --- 1
> virtio_mem_mb_unplug_any_sb_online()
> virtio_mem_mb_offline_and_remove() --- 2
>
> This patch tries to trigger the wq at 1 and 2. And these two functions are
> only valid during this code flow.
Exactly.
>
> These two functions actually remove some memory from the system. So I am not
> sure where extra unplug-able memory comes from. I guess those memory is from
> memory block device and mem_sectioin, memmap? While those memory is still
> marked as online, right?
Imagine you end up (only after some repeating plugging and unplugging of
memory, otherwise it's obviously impossible):
Memory block X: Contains only movable data
Memory block X + 1: Contains memmap of Memory block X:
We start to unplug from high, to low.
1. Try to unplug/offline/remove block X + 1: fails, because of the
memmap
2. Try to unplug/offline/remove block X: succeeds.
3. Not all requested memory got unplugged. Sleep for 30 seconds.
4. Retry to unplug/offline/remove block X + 1: succeeds
What we do in 2, is that we trigger a retry of ourselves. That means,
that in 3. we don't actually sleep, but retry immediately.
This has been proven helpful in some of my tests, where you want to
unplug *a lot* of memory again, not just some parts.
Triggering a retry is fairly cheap. Assume you don't actually have to
perform any more unplugging. The workqueue wakes up, detects that
nothing is to do, and goes back to sleep.
--
Thanks,
David / dhildenb
On Fri, Oct 16, 2020 at 11:18:39AM +0200, David Hildenbrand wrote:
>On 16.10.20 06:03, Wei Yang wrote:
>> On Mon, Oct 12, 2020 at 02:53:03PM +0200, David Hildenbrand wrote:
>>> Let's trigger from offlining code when we're not allowed to touch online
Here "touch" means "unplug"? If so, maybe s/touch/unplug/ would be more easy
to understand.
>>> memory.
>>
>> This describes the change in virtio_mem_memory_notifier_cb()?
>
>Ah, yes, can try to make that clearer.
>
>>
>>>
>>> Handle the other case (memmap possibly freeing up another memory block)
>>> when actually removing memory. When removing via virtio_mem_remove(),
>>> virtio_mem_retry() is a NOP and safe to use.
>>>
>>> While at it, move retry handling when offlining out of
>>> virtio_mem_notify_offline(), to share it with Device Block Mode (DBM)
>>> soon.
>>
>> I may not understand the logic fully. Here is my understanding of current
>> logic:
>>
>>
>> virtio_mem_run_wq()
>> virtio_mem_unplug_request()
>> virtio_mem_mb_unplug_any_sb_offline()
>> virtio_mem_mb_remove() --- 1
>> virtio_mem_mb_unplug_any_sb_online()
>> virtio_mem_mb_offline_and_remove() --- 2
>>
I am trying to get more understanding about the logic of virtio_mem_retry().
Current logic seems clear to me. There are four places to trigger it:
* notify_offline
* notify_online
* timer_expired
* config_changed
In this patch, we try to optimize the first case, notify_offline.
Now, we would always trigger retry when one of our memory block get offlined.
Per my understanding, this logic is correct while missed one case (or be more
precise, not handle one case timely). The case this patch wants to improve is
virtio_mem_mb_remove(). If my understanding is correct.
virtio_mem_run_wq()
virtio_mem_unplug_request()
virtio_mem_mb_unplug_any_sb_offline()
virtio_mem_mb_remove() --- 1
virtio_mem_mb_unplug_any_sb_online()
virtio_mem_mb_offline_and_remove() --- 2
The above is two functions this patch adjusts. For 2), it will offline the
memory block, thus will trigger virtio_mem_retry() originally. But for 1), the
memory block is already offlined, so virtio_mem_retry() will not be triggered
originally. This is the case we want to improve in this patch. Instead of wait
for timer expire, we trigger retry immediately after unplug/remove an offlined
memory block.
And after this change, this patch still adjust the original
virtio_mem_notify_offline() path to just trigger virtio_mem_retry() when
unplug_online is false. (This means the offline event is notified from user
space instead of from unplug event).
If my above analysis is correct, I got one small suggestion for this patch.
Instead of adjust current notify_offline handling, how about just trigger
retry during virtio_mem_mb_remove()? Since per my understanding, we just want
to do immediate trigger retry when unplug an offlined memory block.
>> This patch tries to trigger the wq at 1 and 2. And these two functions are
>> only valid during this code flow.
>
>Exactly.
>
>>
>> These two functions actually remove some memory from the system. So I am not
>> sure where extra unplug-able memory comes from. I guess those memory is from
>> memory block device and mem_sectioin, memmap? While those memory is still
>> marked as online, right?
>
>Imagine you end up (only after some repeating plugging and unplugging of
>memory, otherwise it's obviously impossible):
>
>Memory block X: Contains only movable data
>
>Memory block X + 1: Contains memmap of Memory block X:
>
>
>We start to unplug from high, to low.
>
>1. Try to unplug/offline/remove block X + 1: fails, because of the
> memmap
>2. Try to unplug/offline/remove block X: succeeds.
>3. Not all requested memory got unplugged. Sleep for 30 seconds.
>4. Retry to unplug/offline/remove block X + 1: succeeds
>
>What we do in 2, is that we trigger a retry of ourselves. That means,
>that in 3. we don't actually sleep, but retry immediately.
>
>This has been proven helpful in some of my tests, where you want to
>unplug *a lot* of memory again, not just some parts.
>
>
>Triggering a retry is fairly cheap. Assume you don't actually have to
>perform any more unplugging. The workqueue wakes up, detects that
>nothing is to do, and goes back to sleep.
>
>--
>Thanks,
>
>David / dhildenb
--
Wei Yang
Help you, Help me
On 18.10.20 05:57, Wei Yang wrote:
> On Fri, Oct 16, 2020 at 11:18:39AM +0200, David Hildenbrand wrote:
>> On 16.10.20 06:03, Wei Yang wrote:
>>> On Mon, Oct 12, 2020 at 02:53:03PM +0200, David Hildenbrand wrote:
>>>> Let's trigger from offlining code when we're not allowed to touch online
>
> Here "touch" means "unplug"? If so, maybe s/touch/unplug/ would be more easy
> to understand.
Yes, much better.
[...]
> I am trying to get more understanding about the logic of virtio_mem_retry().
>
> Current logic seems clear to me. There are four places to trigger it:
>
> * notify_offline
> * notify_online
> * timer_expired
> * config_changed
>
> In this patch, we try to optimize the first case, notify_offline.
Yes.
>
> Now, we would always trigger retry when one of our memory block get offlined.
> Per my understanding, this logic is correct while missed one case (or be more
> precise, not handle one case timely). The case this patch wants to improve is
> virtio_mem_mb_remove(). If my understanding is correct.
>
Yes, that's one part of it. Read below.
> virtio_mem_run_wq()
> virtio_mem_unplug_request()
> virtio_mem_mb_unplug_any_sb_offline()
> virtio_mem_mb_remove() --- 1
> virtio_mem_mb_unplug_any_sb_online()
> virtio_mem_mb_offline_and_remove() --- 2
>
> The above is two functions this patch adjusts. For 2), it will offline the
> memory block, thus will trigger virtio_mem_retry() originally. But for 1), the
> memory block is already offlined, so virtio_mem_retry() will not be triggered
> originally. This is the case we want to improve in this patch. Instead of wait
> for timer expire, we trigger retry immediately after unplug/remove an offlined
> memory block.
>
> And after this change, this patch still adjust the original
> virtio_mem_notify_offline() path to just trigger virtio_mem_retry() when
> unplug_online is false. (This means the offline event is notified from user
> space instead of from unplug event).
>
> If my above analysis is correct, I got one small suggestion for this patch.
> Instead of adjust current notify_offline handling, how about just trigger
> retry during virtio_mem_mb_remove()? Since per my understanding, we just want
> to do immediate trigger retry when unplug an offlined memory block.
I probably should have added the following to the patch description:
"This is a preparation for Big Block Mode (BBM), whereby we can see some
temporary offlining of memory blocks without actually making progress"
Imagine you have a Big Block that spans to Linux memory blocks. Assume
the first Linux memory blocks has no unmovable data on it.
Assume you call offline_and_remove_memory()
1. Try to offline the first block. Works, notifiers triggered.
virtio_mem_retry().
2. Try to offline the second block. Does not work.
3. Re-online first block.
4. Exit to main loop, exit workqueue.
5. Retry immediately (due to virtio_mem_retry()), go to 1.
So, you'll keep retrying forever. Found while debugging that exact issue :)
--
Thanks,
David / dhildenb
On Mon, Oct 19, 2020 at 11:04:40AM +0200, David Hildenbrand wrote:
>On 18.10.20 05:57, Wei Yang wrote:
>> On Fri, Oct 16, 2020 at 11:18:39AM +0200, David Hildenbrand wrote:
>>> On 16.10.20 06:03, Wei Yang wrote:
>>>> On Mon, Oct 12, 2020 at 02:53:03PM +0200, David Hildenbrand wrote:
>>>>> Let's trigger from offlining code when we're not allowed to touch online
>>
>> Here "touch" means "unplug"? If so, maybe s/touch/unplug/ would be more easy
>> to understand.
>
>Yes, much better.
>
>[...]
>
>> I am trying to get more understanding about the logic of virtio_mem_retry().
>>
>> Current logic seems clear to me. There are four places to trigger it:
>>
>> * notify_offline
>> * notify_online
>> * timer_expired
>> * config_changed
>>
>> In this patch, we try to optimize the first case, notify_offline.
>
>Yes.
>
>>
>> Now, we would always trigger retry when one of our memory block get offlined.
>> Per my understanding, this logic is correct while missed one case (or be more
>> precise, not handle one case timely). The case this patch wants to improve is
>> virtio_mem_mb_remove(). If my understanding is correct.
>>
>
>Yes, that's one part of it. Read below.
>
>> virtio_mem_run_wq()
>> virtio_mem_unplug_request()
>> virtio_mem_mb_unplug_any_sb_offline()
>> virtio_mem_mb_remove() --- 1
>> virtio_mem_mb_unplug_any_sb_online()
>> virtio_mem_mb_offline_and_remove() --- 2
>>
>> The above is two functions this patch adjusts. For 2), it will offline the
>> memory block, thus will trigger virtio_mem_retry() originally. But for 1), the
>> memory block is already offlined, so virtio_mem_retry() will not be triggered
>> originally. This is the case we want to improve in this patch. Instead of wait
>> for timer expire, we trigger retry immediately after unplug/remove an offlined
>> memory block.
>>
>> And after this change, this patch still adjust the original
>> virtio_mem_notify_offline() path to just trigger virtio_mem_retry() when
>> unplug_online is false. (This means the offline event is notified from user
>> space instead of from unplug event).
>>
>> If my above analysis is correct, I got one small suggestion for this patch.
>> Instead of adjust current notify_offline handling, how about just trigger
>> retry during virtio_mem_mb_remove()? Since per my understanding, we just want
>> to do immediate trigger retry when unplug an offlined memory block.
>
>I probably should have added the following to the patch description:
>
>"This is a preparation for Big Block Mode (BBM), whereby we can see some
>temporary offlining of memory blocks without actually making progress"
>
>Imagine you have a Big Block that spans to Linux memory blocks. Assume
>the first Linux memory blocks has no unmovable data on it.
>
>Assume you call offline_and_remove_memory()
>
>1. Try to offline the first block. Works, notifiers triggered.
>virtio_mem_retry().
After this patch, the virtio_mem_retry() is remove here.
>2. Try to offline the second block. Does not work.
>3. Re-online first block.
>4. Exit to main loop, exit workqueue.
Since offline_and_remove_memory() doesn't succeed, virtio_mem_retry() is not
triggered.
>5. Retry immediately (due to virtio_mem_retry()), go to 1.
So we won't have endless loop.
>
>So, you'll keep retrying forever. Found while debugging that exact issue :)
>
If this is the case, my suggestion is to record it in the changelog.
Otherwise, we may lose this corner case which is important to this change.
>
>--
>Thanks,
>
>David / dhildenb
--
Wei Yang
Help you, Help me
>> So, you'll keep retrying forever. Found while debugging that exact issue :)
>>
>
> If this is the case, my suggestion is to record it in the changelog.
> Otherwise, we may lose this corner case which is important to this change.
Yes, already added it - thanks!
--
Thanks,
David / dhildenb