2020-08-21 10:36:20

by David Hildenbrand

Subject: [PATCH v1 2/5] kernel/resource: merge_system_ram_resources() to merge resources after hotplug

Some add_memory*() users add memory in small, contiguous memory blocks.
Examples include virtio-mem, hyper-v balloon, and the XEN balloon.

This can quickly result in a lot of memory resources whose individual
boundaries are not of interest (in contrast to, e.g., DIMMs, where the
boundaries exposed via /proc/iomem to user space might be relevant). We
really want to merge added resources in this scenario where possible.

Let's provide an interface to trigger merging of applicable child
resources. It will be, for example, used by virtio-mem to trigger
merging of system ram resources it added to its resource container, but
also by XEN and Hyper-V to trigger merging of system ram resources in
iomem_resource.

Note: We really want to merge after the whole operation succeeded, not
directly when adding a resource to the resource tree (it would break
add_memory_resource() and require splitting resources again when the
operation failed - e.g., due to -ENOMEM).

Cc: Andrew Morton <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: Jason Gunthorpe <[email protected]>
Cc: Kees Cook <[email protected]>
Cc: Ard Biesheuvel <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: "K. Y. Srinivasan" <[email protected]>
Cc: Haiyang Zhang <[email protected]>
Cc: Stephen Hemminger <[email protected]>
Cc: Wei Liu <[email protected]>
Cc: Boris Ostrovsky <[email protected]>
Cc: Juergen Gross <[email protected]>
Cc: Stefano Stabellini <[email protected]>
Cc: Roger Pau Monné <[email protected]>
Cc: Julien Grall <[email protected]>
Cc: Pankaj Gupta <[email protected]>
Cc: Baoquan He <[email protected]>
Cc: Wei Yang <[email protected]>
Signed-off-by: David Hildenbrand <[email protected]>
---
include/linux/ioport.h | 3 +++
kernel/resource.c | 52 ++++++++++++++++++++++++++++++++++++++++++
2 files changed, 55 insertions(+)

diff --git a/include/linux/ioport.h b/include/linux/ioport.h
index 52a91f5fa1a36..3bb0020cd6ddc 100644
--- a/include/linux/ioport.h
+++ b/include/linux/ioport.h
@@ -251,6 +251,9 @@ extern void __release_region(struct resource *, resource_size_t,
extern void release_mem_region_adjustable(struct resource *, resource_size_t,
resource_size_t);
#endif
+#ifdef CONFIG_MEMORY_HOTPLUG
+extern void merge_system_ram_resources(struct resource *res);
+#endif

/* Wrappers for managed devices */
struct device;
diff --git a/kernel/resource.c b/kernel/resource.c
index 1dcef5d53d76e..b4e0963edadd2 100644
--- a/kernel/resource.c
+++ b/kernel/resource.c
@@ -1360,6 +1360,58 @@ void release_mem_region_adjustable(struct resource *parent,
}
#endif /* CONFIG_MEMORY_HOTREMOVE */

+#ifdef CONFIG_MEMORY_HOTPLUG
+static bool system_ram_resources_mergeable(struct resource *r1,
+ struct resource *r2)
+{
+ return r1->flags == r2->flags && r1->end + 1 == r2->start &&
+ r1->name == r2->name && r1->desc == r2->desc &&
+ !r1->child && !r2->child;
+}
+
+/*
+ * merge_system_ram_resources - try to merge contiguous system ram resources
+ * @parent: parent resource descriptor
+ *
+ * This interface is intended for memory hotplug, whereby lots of contiguous
+ * system ram resources are added (e.g., via add_memory*()) by a driver, and
+ * the actual resource boundaries are not of interest (e.g., it might be
+ * relevant for DIMMs). Only immediate child resources that are busy and
+ * don't have any children are considered. All applicable child resources
+ * must be immutable during the request.
+ *
+ * Note:
+ * - The caller has to make sure that no pointers to resources that might
+ * get merged are held anymore. Callers should only trigger merging of child
+ * resources when they are the only one adding system ram resources to the
+ * parent (besides during boot).
+ * - release_mem_region_adjustable() will split on demand on memory hotunplug
+ */
+void merge_system_ram_resources(struct resource *parent)
+{
+ const unsigned long flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
+ struct resource *cur, *next;
+
+ write_lock(&resource_lock);
+
+ cur = parent->child;
+ while (cur && cur->sibling) {
+ next = cur->sibling;
+ if ((cur->flags & flags) == flags &&
+ system_ram_resources_mergeable(cur, next)) {
+ cur->end = next->end;
+ cur->sibling = next->sibling;
+ free_resource(next);
+ next = cur->sibling;
+ }
+ cur = next;
+ }
+
+ write_unlock(&resource_lock);
+}
+EXPORT_SYMBOL(merge_system_ram_resources);
+#endif /* CONFIG_MEMORY_HOTPLUG */
+
/*
* Managed region resource
*/
--
2.26.2
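
For readers who want to try the merge logic outside the kernel, here is a minimal user-space sketch of the sibling walk. It is not the patch itself: struct demo_res, the DEMO_* flag values, and the demo_* helpers are hypothetical stand-ins, there is no locking, and free() replaces free_resource(). One deliberate difference, based on my reading of the posted loop: after a merge the patch advances to the node following the merged one, so a single call may leave a run of three or more contiguous resources only partially merged. The sketch stays on the merged node instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Illustrative bit values only; the real IORESOURCE_* flags live in ioport.h. */
#define DEMO_SYSTEM_RAM 0x1UL
#define DEMO_BUSY       0x2UL

/* Hypothetical, simplified stand-in for the kernel's struct resource. */
struct demo_res {
	unsigned long start, end;	/* inclusive range */
	unsigned long flags;
	const char *name;
	struct demo_res *sibling, *child;
};

/* Same idea as system_ram_resources_mergeable(), minus the desc field. */
static bool demo_mergeable(const struct demo_res *r1, const struct demo_res *r2)
{
	return r1->flags == r2->flags && r1->end + 1 == r2->start &&
	       r1->name == r2->name && !r1->child && !r2->child;
}

/* Append a heap-allocated child to the end of the sibling list. */
static struct demo_res *demo_add_child(struct demo_res *parent,
				       unsigned long start, unsigned long end,
				       unsigned long flags, const char *name)
{
	struct demo_res **link = &parent->child;
	struct demo_res *res = calloc(1, sizeof(*res));

	if (!res)
		return NULL;
	res->start = start;
	res->end = end;
	res->flags = flags;
	res->name = name;
	while (*link)
		link = &(*link)->sibling;
	*link = res;
	return res;
}

/* Merge contiguous, childless, busy RAM children; returns the merge count. */
static int demo_merge_children(struct demo_res *parent)
{
	const unsigned long flags = DEMO_SYSTEM_RAM | DEMO_BUSY;
	struct demo_res *cur;
	int merged = 0;

	assert(parent);
	cur = parent->child;
	while (cur && cur->sibling) {
		struct demo_res *next = cur->sibling;

		if ((cur->flags & flags) == flags && demo_mergeable(cur, next)) {
			cur->end = next->end;
			cur->sibling = next->sibling;
			free(next);
			merged++;
			continue;	/* stay on cur: it may merge again */
		}
		cur = next;
	}
	return merged;
}
```

Adding three contiguous children and one child behind a gap, then calling demo_merge_children(), should leave two children: one covering the whole contiguous run and the one behind the gap.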


2020-08-31 09:37:04

by Pankaj Gupta

Subject: Re: [PATCH v1 2/5] kernel/resource: merge_system_ram_resources() to merge resources after hotplug

> Some add_memory*() users add memory in small, contiguous memory blocks.
> Examples include virtio-mem, hyper-v balloon, and the XEN balloon.
>
> This can quickly result in a lot of memory resources whose individual
> boundaries are not of interest (in contrast to, e.g., DIMMs, where the
> boundaries exposed via /proc/iomem to user space might be relevant). We
> really want to merge added resources in this scenario where possible.
>
> Let's provide an interface to trigger merging of applicable child
> resources. It will be, for example, used by virtio-mem to trigger
> merging of system ram resources it added to its resource container, but
> also by XEN and Hyper-V to trigger merging of system ram resources in
> iomem_resource.
>
> Note: We really want to merge after the whole operation succeeded, not
> directly when adding a resource to the resource tree (it would break
> add_memory_resource() and require splitting resources again when the
> operation failed - e.g., due to -ENOMEM).
>
> Cc: Andrew Morton <[email protected]>
> Cc: Michal Hocko <[email protected]>
> Cc: Dan Williams <[email protected]>
> Cc: Jason Gunthorpe <[email protected]>
> Cc: Kees Cook <[email protected]>
> Cc: Ard Biesheuvel <[email protected]>
> Cc: Thomas Gleixner <[email protected]>
> Cc: "K. Y. Srinivasan" <[email protected]>
> Cc: Haiyang Zhang <[email protected]>
> Cc: Stephen Hemminger <[email protected]>
> Cc: Wei Liu <[email protected]>
> Cc: Boris Ostrovsky <[email protected]>
> Cc: Juergen Gross <[email protected]>
> Cc: Stefano Stabellini <[email protected]>
> Cc: Roger Pau Monné <[email protected]>
> Cc: Julien Grall <[email protected]>
> Cc: Pankaj Gupta <[email protected]>
> Cc: Baoquan He <[email protected]>
> Cc: Wei Yang <[email protected]>
> Signed-off-by: David Hildenbrand <[email protected]>
> ---
> include/linux/ioport.h | 3 +++
> kernel/resource.c | 52 ++++++++++++++++++++++++++++++++++++++++++
> 2 files changed, 55 insertions(+)
>
> diff --git a/include/linux/ioport.h b/include/linux/ioport.h
> index 52a91f5fa1a36..3bb0020cd6ddc 100644
> --- a/include/linux/ioport.h
> +++ b/include/linux/ioport.h
> @@ -251,6 +251,9 @@ extern void __release_region(struct resource *, resource_size_t,
> extern void release_mem_region_adjustable(struct resource *, resource_size_t,
> resource_size_t);
> #endif
> +#ifdef CONFIG_MEMORY_HOTPLUG
> +extern void merge_system_ram_resources(struct resource *res);
> +#endif
>
> /* Wrappers for managed devices */
> struct device;
> diff --git a/kernel/resource.c b/kernel/resource.c
> index 1dcef5d53d76e..b4e0963edadd2 100644
> --- a/kernel/resource.c
> +++ b/kernel/resource.c
> @@ -1360,6 +1360,58 @@ void release_mem_region_adjustable(struct resource *parent,
> }
> #endif /* CONFIG_MEMORY_HOTREMOVE */
>
> +#ifdef CONFIG_MEMORY_HOTPLUG
> +static bool system_ram_resources_mergeable(struct resource *r1,
> + struct resource *r2)
> +{
> + return r1->flags == r2->flags && r1->end + 1 == r2->start &&
> + r1->name == r2->name && r1->desc == r2->desc &&
> + !r1->child && !r2->child;
> +}
> +
> +/*
> + * merge_system_ram_resources - try to merge contiguous system ram resources
> + * @parent: parent resource descriptor
> + *
> + * This interface is intended for memory hotplug, whereby lots of contiguous
> + * system ram resources are added (e.g., via add_memory*()) by a driver, and
> + * the actual resource boundaries are not of interest (e.g., it might be
> + * relevant for DIMMs). Only immediate child resources that are busy and
> + * don't have any children are considered. All applicable child resources
> + * must be immutable during the request.
> + *
> + * Note:
> + * - The caller has to make sure that no pointers to resources that might
> + * get merged are held anymore. Callers should only trigger merging of child
> + * resources when they are the only one adding system ram resources to the
> + * parent (besides during boot).
> + * - release_mem_region_adjustable() will split on demand on memory hotunplug
> + */
> +void merge_system_ram_resources(struct resource *parent)
> +{
> + const unsigned long flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
> + struct resource *cur, *next;
> +
> + write_lock(&resource_lock);
> +
> + cur = parent->child;
> + while (cur && cur->sibling) {
> + next = cur->sibling;
> + if ((cur->flags & flags) == flags &&

Maybe this can be changed to:
!(cur->flags & ~flags)

> + system_ram_resources_mergeable(cur, next)) {
> + cur->end = next->end;
> + cur->sibling = next->sibling;
> + free_resource(next);
> + next = cur->sibling;
> + }
> + cur = next;
> + }
> +
> + write_unlock(&resource_lock);
> +}
> +EXPORT_SYMBOL(merge_system_ram_resources);
> +#endif /* CONFIG_MEMORY_HOTPLUG */
> +
> /*
> * Managed region resource
> */
> --
> 2.26.2
>

2020-09-08 10:32:53

by David Hildenbrand

Subject: Re: [PATCH v1 2/5] kernel/resource: merge_system_ram_resources() to merge resources after hotplug

On 31.08.20 11:35, Pankaj Gupta wrote:
>> Some add_memory*() users add memory in small, contiguous memory blocks.
>> Examples include virtio-mem, hyper-v balloon, and the XEN balloon.
>>
>> This can quickly result in a lot of memory resources whose individual
>> boundaries are not of interest (in contrast to, e.g., DIMMs, where the
>> boundaries exposed via /proc/iomem to user space might be relevant). We
>> really want to merge added resources in this scenario where possible.
>>
>> Let's provide an interface to trigger merging of applicable child
>> resources. It will be, for example, used by virtio-mem to trigger
>> merging of system ram resources it added to its resource container, but
>> also by XEN and Hyper-V to trigger merging of system ram resources in
>> iomem_resource.
>>
>> Note: We really want to merge after the whole operation succeeded, not
>> directly when adding a resource to the resource tree (it would break
>> add_memory_resource() and require splitting resources again when the
>> operation failed - e.g., due to -ENOMEM).
>>
>> Cc: Andrew Morton <[email protected]>
>> Cc: Michal Hocko <[email protected]>
>> Cc: Dan Williams <[email protected]>
>> Cc: Jason Gunthorpe <[email protected]>
>> Cc: Kees Cook <[email protected]>
>> Cc: Ard Biesheuvel <[email protected]>
>> Cc: Thomas Gleixner <[email protected]>
>> Cc: "K. Y. Srinivasan" <[email protected]>
>> Cc: Haiyang Zhang <[email protected]>
>> Cc: Stephen Hemminger <[email protected]>
>> Cc: Wei Liu <[email protected]>
>> Cc: Boris Ostrovsky <[email protected]>
>> Cc: Juergen Gross <[email protected]>
>> Cc: Stefano Stabellini <[email protected]>
>> Cc: Roger Pau Monné <[email protected]>
>> Cc: Julien Grall <[email protected]>
>> Cc: Pankaj Gupta <[email protected]>
>> Cc: Baoquan He <[email protected]>
>> Cc: Wei Yang <[email protected]>
>> Signed-off-by: David Hildenbrand <[email protected]>
>> ---
>> include/linux/ioport.h | 3 +++
>> kernel/resource.c | 52 ++++++++++++++++++++++++++++++++++++++++++
>> 2 files changed, 55 insertions(+)
>>
>> diff --git a/include/linux/ioport.h b/include/linux/ioport.h
>> index 52a91f5fa1a36..3bb0020cd6ddc 100644
>> --- a/include/linux/ioport.h
>> +++ b/include/linux/ioport.h
>> @@ -251,6 +251,9 @@ extern void __release_region(struct resource *, resource_size_t,
>> extern void release_mem_region_adjustable(struct resource *, resource_size_t,
>> resource_size_t);
>> #endif
>> +#ifdef CONFIG_MEMORY_HOTPLUG
>> +extern void merge_system_ram_resources(struct resource *res);
>> +#endif
>>
>> /* Wrappers for managed devices */
>> struct device;
>> diff --git a/kernel/resource.c b/kernel/resource.c
>> index 1dcef5d53d76e..b4e0963edadd2 100644
>> --- a/kernel/resource.c
>> +++ b/kernel/resource.c
>> @@ -1360,6 +1360,58 @@ void release_mem_region_adjustable(struct resource *parent,
>> }
>> #endif /* CONFIG_MEMORY_HOTREMOVE */
>>
>> +#ifdef CONFIG_MEMORY_HOTPLUG
>> +static bool system_ram_resources_mergeable(struct resource *r1,
>> + struct resource *r2)
>> +{
>> + return r1->flags == r2->flags && r1->end + 1 == r2->start &&
>> + r1->name == r2->name && r1->desc == r2->desc &&
>> + !r1->child && !r2->child;
>> +}
>> +
>> +/*
>> + * merge_system_ram_resources - try to merge contiguous system ram resources
>> + * @parent: parent resource descriptor
>> + *
>> + * This interface is intended for memory hotplug, whereby lots of contiguous
>> + * system ram resources are added (e.g., via add_memory*()) by a driver, and
>> + * the actual resource boundaries are not of interest (e.g., it might be
>> + * relevant for DIMMs). Only immediate child resources that are busy and
>> + * don't have any children are considered. All applicable child resources
>> + * must be immutable during the request.
>> + *
>> + * Note:
>> + * - The caller has to make sure that no pointers to resources that might
>> + * get merged are held anymore. Callers should only trigger merging of child
>> + * resources when they are the only one adding system ram resources to the
>> + * parent (besides during boot).
>> + * - release_mem_region_adjustable() will split on demand on memory hotunplug
>> + */
>> +void merge_system_ram_resources(struct resource *parent)
>> +{
>> + const unsigned long flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
>> + struct resource *cur, *next;
>> +
>> + write_lock(&resource_lock);
>> +
>> + cur = parent->child;
>> + while (cur && cur->sibling) {
>> + next = cur->sibling;
>> + if ((cur->flags & flags) == flags &&
>
> Maybe this can be changed to:
> !(cur->flags & ~flags)

That would be a different check, I think.

(cur->flags & flags) == flags
checks that all "flags" are set (additional ones might be set).

!(cur->flags & ~flags)
checks that no other flags besides "flags" are set (and "flags" are not
required to be set).


find_next_iomem_res() uses the same check, e.g., when called via
walk_system_ram_range() with IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY.
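
To make the difference between the two expressions concrete, here is a small user-space sketch. The DEMO_* bit values and helper names are made up for illustration; only the two expressions themselves come from the discussion above.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit values only; the real IORESOURCE_* flags live in ioport.h. */
#define DEMO_SYSTEM_RAM 0x1UL
#define DEMO_BUSY       0x2UL
#define DEMO_OTHER      0x4UL

/* True if every bit in 'required' is set; additional bits are allowed. */
static bool has_all_flags(unsigned long flags, unsigned long required)
{
	return (flags & required) == required;
}

/* True if no bit outside 'allowed' is set; bits in 'allowed' may be clear. */
static bool has_no_other_flags(unsigned long flags, unsigned long allowed)
{
	return !(flags & ~allowed);
}

static void demo_flag_checks(void)
{
	const unsigned long want = DEMO_SYSTEM_RAM | DEMO_BUSY;

	/* Both required bits plus an extra one: only the first check passes. */
	assert(has_all_flags(want | DEMO_OTHER, want));
	assert(!has_no_other_flags(want | DEMO_OTHER, want));

	/* Only one of the required bits: only the second check passes. */
	assert(!has_all_flags(DEMO_BUSY, want));
	assert(has_no_other_flags(DEMO_BUSY, want));

	/* No bits at all: the second check still passes. */
	assert(!has_all_flags(0, want));
	assert(has_no_other_flags(0, want));
}
```

The last case is the important one for merging: a resource with no flags set at all would satisfy !(cur->flags & ~flags), although it is neither system RAM nor busy.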

Thanks for having a look!

--
Thanks,

David / dhildenb