2011-05-02 21:50:24

by Daniel Kiper

Subject: [PATCH V2 2/2] mm: Extend memory hotplug API to allow memory hotplug in virtual machines

This patch adds online_page_callback and the appropriate functions for
registering/unregistering online page callbacks. It allows machine-specific
tasks to be performed during the page onlining stage, which is required
to implement memory hotplug in virtual machines. Additionally, the
__online_page_set_limits(), __online_page_increment_counters() and
__online_page_free() functions were added to ease the generic
hotplug operation.

Signed-off-by: Daniel Kiper <[email protected]>
---
include/linux/memory_hotplug.h | 11 +++++-
mm/memory_hotplug.c | 68 ++++++++++++++++++++++++++++++++++++++--
2 files changed, 74 insertions(+), 5 deletions(-)

diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index 8122018..014bd96 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -68,12 +68,19 @@ static inline void zone_seqlock_init(struct zone *zone)
extern int zone_grow_free_lists(struct zone *zone, unsigned long new_nr_pages);
extern int zone_grow_waitqueues(struct zone *zone, unsigned long nr_pages);
extern int add_one_highpage(struct page *page, int pfn, int bad_ppro);
-/* need some defines for these for archs that don't support it */
-extern void online_page(struct page *page);
/* VM interface that may be used by firmware interface */
extern int online_pages(unsigned long, unsigned long);
extern void __offline_isolated_pages(unsigned long, unsigned long);

+typedef void (*online_page_callback_t)(struct page *page);
+
+extern int register_online_page_callback(online_page_callback_t callback);
+extern int unregister_online_page_callback(online_page_callback_t callback);
+
+extern void __online_page_set_limits(struct page *page);
+extern void __online_page_increment_counters(struct page *page);
+extern void __online_page_free(struct page *page);
+
#ifdef CONFIG_MEMORY_HOTREMOVE
extern bool is_pageblock_removable_nolock(struct page *page);
#endif /* CONFIG_MEMORY_HOTREMOVE */
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index a807ccb..6bf78be 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -34,6 +34,17 @@

#include "internal.h"

+/*
+ * online_page_callback contains pointer to current page onlining function.
+ * Initially it is generic_online_page(). If it is required it could be
+ * changed by calling register_online_page_callback() for callback registration
+ * and unregister_online_page_callback() for callback unregistration.
+ */
+
+static void generic_online_page(struct page *page);
+
+static online_page_callback_t online_page_callback = generic_online_page;
+
DEFINE_MUTEX(mem_hotplug_mutex);

void lock_memory_hotplug(void)
@@ -361,23 +372,74 @@ int __remove_pages(struct zone *zone, unsigned long phys_start_pfn,
}
EXPORT_SYMBOL_GPL(__remove_pages);

-void online_page(struct page *page)
+int register_online_page_callback(online_page_callback_t callback)
+{
+ int rc = -EPERM;
+
+ lock_memory_hotplug();
+
+ if (online_page_callback == generic_online_page) {
+ online_page_callback = callback;
+ rc = 0;
+ }
+
+ unlock_memory_hotplug();
+
+ return rc;
+}
+EXPORT_SYMBOL_GPL(register_online_page_callback);
+
+int unregister_online_page_callback(online_page_callback_t callback)
+{
+ int rc = -EPERM;
+
+ lock_memory_hotplug();
+
+ if (online_page_callback == callback) {
+ online_page_callback = generic_online_page;
+ rc = 0;
+ }
+
+ unlock_memory_hotplug();
+
+ return rc;
+}
+EXPORT_SYMBOL_GPL(unregister_online_page_callback);
+
+void __online_page_set_limits(struct page *page)
{
unsigned long pfn = page_to_pfn(page);

- totalram_pages++;
if (pfn >= num_physpages)
num_physpages = pfn + 1;
+}
+EXPORT_SYMBOL_GPL(__online_page_set_limits);
+
+void __online_page_increment_counters(struct page *page)
+{
+ totalram_pages++;

#ifdef CONFIG_HIGHMEM
if (PageHighMem(page))
totalhigh_pages++;
#endif
+}
+EXPORT_SYMBOL_GPL(__online_page_increment_counters);

+void __online_page_free(struct page *page)
+{
ClearPageReserved(page);
init_page_count(page);
__free_page(page);
}
+EXPORT_SYMBOL_GPL(__online_page_free);
+
+static void generic_online_page(struct page *page)
+{
+ __online_page_set_limits(page);
+ __online_page_increment_counters(page);
+ __online_page_free(page);
+}

static int online_pages_range(unsigned long start_pfn, unsigned long nr_pages,
void *arg)
@@ -388,7 +450,7 @@ static int online_pages_range(unsigned long start_pfn, unsigned long nr_pages,
if (PageReserved(pfn_to_page(start_pfn)))
for (i = 0; i < nr_pages; i++) {
page = pfn_to_page(start_pfn + i);
- online_page(page);
+ online_page_callback(page);
onlined_pages++;
}
*(unsigned long *)arg = onlined_pages;
--
1.5.6.5
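
The callback slot introduced by the patch can be modelled in user space to see how a driver would use it. The following is only a sketch under stated assumptions, not kernel code: struct page is a two-field stand-in, the counters are plain globals, locking is omitted (the patch takes lock_memory_hotplug()), and balloon_online_page() and online_range() are hypothetical names invented for illustration.

```c
/* User-space model of the patch's callback slot; not kernel code. */
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct page { unsigned long pfn; int reserved; };  /* stand-in for struct page */

typedef void (*online_page_callback_t)(struct page *page);

unsigned long totalram_pages;   /* models the kernel counter */
unsigned long num_physpages;    /* models the kernel limit */
unsigned long ballooned_pages;  /* hypothetical driver state */

void __online_page_set_limits(struct page *page)
{
	if (page->pfn >= num_physpages)
		num_physpages = page->pfn + 1;
}

void __online_page_increment_counters(struct page *page)
{
	(void)page;
	totalram_pages++;
}

void __online_page_free(struct page *page)
{
	page->reserved = 0;  /* models ClearPageReserved() + __free_page() */
}

void generic_online_page(struct page *page)
{
	__online_page_set_limits(page);
	__online_page_increment_counters(page);
	__online_page_free(page);
}

online_page_callback_t online_page_callback = generic_online_page;

/* Same check as the patch; the real code holds lock_memory_hotplug(). */
int register_online_page_callback(online_page_callback_t callback)
{
	if (online_page_callback != generic_online_page)
		return -EPERM;
	online_page_callback = callback;
	return 0;
}

int unregister_online_page_callback(online_page_callback_t callback)
{
	if (online_page_callback != callback)
		return -EPERM;
	online_page_callback = generic_online_page;
	return 0;
}

/* Hypothetical hypervisor driver: update the limit but keep the page. */
void balloon_online_page(struct page *page)
{
	__online_page_set_limits(page);
	ballooned_pages++;
}

/* Models the loop in online_pages_range(). */
unsigned long online_range(struct page *pages, size_t nr)
{
	unsigned long onlined = 0;

	for (size_t i = 0; i < nr; i++) {
		online_page_callback(&pages[i]);
		onlined++;
	}
	return onlined;
}
```

In this model a second register call fails with -EPERM until the first callback is unregistered, which is the single-slot behaviour the patch implements. Splitting generic_online_page() into three helpers is what lets the hypothetical driver reuse the limit update while skipping the counter increment and the free.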


2011-05-03 16:26:28

by Dave Hansen

Subject: Re: [PATCH V2 2/2] mm: Extend memory hotplug API to allow memory hotplug in virtual machines

On Mon, 2011-05-02 at 23:49 +0200, Daniel Kiper wrote:
> +int register_online_page_callback(online_page_callback_t callback)
> +{
> + int rc = -EPERM;
> +
> + lock_memory_hotplug();
> +
> + if (online_page_callback == generic_online_page) {
> + online_page_callback = callback;
> + rc = 0;
> + }
> +
> + unlock_memory_hotplug();
> +
> + return rc;
> +}
> +EXPORT_SYMBOL_GPL(register_online_page_callback);

-EPERM is a bit uninformative here. How about -EEXIST, plus a printk?

I also don't see the real use behind having a "register" that can only
take a single callback. At worst, it should be
"set_online_page_callback()" so it's more apparent that there can only
be one of these.

> +int unregister_online_page_callback(online_page_callback_t callback)
> +{
> + int rc = -EPERM;
> +
> + lock_memory_hotplug();
> +
> + if (online_page_callback == callback) {
> + online_page_callback = generic_online_page;
> + rc = 0;
> + }
> +
> + unlock_memory_hotplug();
> +
> + return rc;
> +}
> +EXPORT_SYMBOL_GPL(unregister_online_page_callback);

Again, -EPERM is a bad code here. -EEXIST, perhaps? It also deserves a
WARN_ON() or a printk on failure here.

Your changelog doesn't mention it, but whatever happened to doing
something dirt-simple like this? I have a short memory.

> void arch_free_hotplug_page(struct page *page)
> {
> if (xen_need_to_inflate_balloon())
> put_page_in_balloon(page);
> else
> __free_page(page);
> }

-- Dave

2011-05-03 20:14:42

by Daniel Kiper

Subject: Re: [PATCH V2 2/2] mm: Extend memory hotplug API to allow memory hotplug in virtual machines

On Tue, May 03, 2011 at 09:25:52AM -0700, Dave Hansen wrote:
> On Mon, 2011-05-02 at 23:49 +0200, Daniel Kiper wrote:
> > +int register_online_page_callback(online_page_callback_t callback)
> > +{
> > + int rc = -EPERM;
> > +
> > + lock_memory_hotplug();
> > +
> > + if (online_page_callback == generic_online_page) {
> > + online_page_callback = callback;
> > + rc = 0;
> > + }
> > +
> > + unlock_memory_hotplug();
> > +
> > + return rc;
> > +}
> > +EXPORT_SYMBOL_GPL(register_online_page_callback);
>
> -EPERM is a bit uninformative here. How about -EEXIST, plus a printk?

EEXIST means "File exists" (POSIX.1), which could be misleading. That is why
I decided to use EPERM; I could not find any better choice. Another option
is EINVAL, though in my opinion it is not the best one. Additionally, I am not
sure it should have a printk. I think it is the caller's role to notify (or not)
about possible errors.

> I also don't see the real use behind having a "register" that can only
> take a single callback. At worst, it should be
> "set_online_page_callback()" so it's more apparent that there can only
> be one of these.

OK.

> > +int unregister_online_page_callback(online_page_callback_t callback)
> > +{
> > + int rc = -EPERM;
> > +
> > + lock_memory_hotplug();
> > +
> > + if (online_page_callback == callback) {
> > + online_page_callback = generic_online_page;
> > + rc = 0;
> > + }
> > +
> > + unlock_memory_hotplug();
> > +
> > + return rc;
> > +}
> > +EXPORT_SYMBOL_GPL(unregister_online_page_callback);
>
> Again, -EPERM is a bad code here. -EEXIST, perhaps? It also deserves a
> WARN_ON() or a printk on failure here.

Please look above.

> Your changelog doesn't mention it, but whatever happened to doing
> something dirt-simple like this? I have a short memory.

Andrew Morton complained about (ab)use of notifiers. He suggested
using a callback mechanism (I could not find any better solution
in the Linux kernel). He convinced me.

Daniel
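
The two points settled in this exchange, the set_online_page_callback() name and leaving error reporting to the caller, could look roughly like the user-space sketch below. It is an illustration only: hv_online_page() and hv_driver_init() are hypothetical names, locking is omitted, fprintf() stands in for whatever reporting a real caller would choose, and -EPERM is simply carried over from the V2 patch because the thread does not settle on a replacement.

```c
/* User-space sketch of the renamed setter; not kernel code. */
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

struct page;  /* opaque here; only pointers are passed around */

typedef void (*online_page_callback_t)(struct page *page);

void generic_online_page(struct page *page) { (void)page; }

online_page_callback_t online_page_callback = generic_online_page;

/* "set" rather than "register": only one callback can be active. */
int set_online_page_callback(online_page_callback_t callback)
{
	if (online_page_callback != generic_online_page)
		return -EPERM;  /* code from the V2 patch; still under discussion */
	online_page_callback = callback;
	return 0;
}

/* Hypothetical driver callback. */
void hv_online_page(struct page *page) { (void)page; }

/* The caller, not the setter, decides whether and how to report failure. */
int hv_driver_init(void)
{
	int rc = set_online_page_callback(hv_online_page);

	if (rc)
		fprintf(stderr, "hv: cannot set online page callback: %s\n",
			strerror(-rc));
	return rc;
}
```

Keeping the setter silent means a caller that can tolerate the failure, for example by falling back to generic onlining, is not forced to produce log noise.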