2013-08-29 21:08:17

by Rafael J. Wysocki

[permalink] [raw]
Subject: [PATCH 0/3] ACPI / hotplug / mm: Rework mutual exclusion between hibernation and memory hotplug

Hi All,

One thing that bothers me quite a bit about memory hotplug is that
lock_hotplug_memory() acquires pm_mutex which is kind of a blunt thing
and has a huge potential for deadlocks.

This can be avoided if device_hotplug_lock is held around hibernation,
which is not too difficult to make happen and hence the following patch
series.

[1/3] ACPI: Acquire device_hotplug_lock before acpi_scan_lock (this is
necessary, because hibernation acquires acpi_scan_lock in linux-next).

[2/3] PM / hibernate: Allocate memory bitmaps after freezing user space
processes (the reason why is explained in the changelog).

[3/3] Rework mutual exclusion between hibernation and memory hotplug.

On top of linux-pm.git/linux-next.

Thanks,
Rafael


2013-08-29 21:08:16

by Rafael J. Wysocki

[permalink] [raw]
Subject: [PATCH 1/3] ACPI / scan: Change ordering of locks for device hotplug

From: Rafael J. Wysocki <[email protected]>

Change the ordering of device hotplug locks in scan.c so that
acpi_scan_lock is always acquired after device_hotplug_lock.

This will make it possible to use device_hotplug_lock around some
code paths that acquire acpi_scan_lock safely (most importantly
system suspend and hibernation). Apart from that, acpi_scan_lock
is platform-specific and device_hotplug_lock is general, so the
new ordering appears to be more appropriate from the overall
design viewpoint.

Signed-off-by: Rafael J. Wysocki <[email protected]>
---
drivers/acpi/scan.c | 15 ++++++---------
1 file changed, 6 insertions(+), 9 deletions(-)

Index: linux-pm/drivers/acpi/scan.c
===================================================================
--- linux-pm.orig/drivers/acpi/scan.c
+++ linux-pm/drivers/acpi/scan.c
@@ -204,8 +204,6 @@ static int acpi_scan_hot_remove(struct a
return -EINVAL;
}

- lock_device_hotplug();
-
/*
* Carry out two passes here and ignore errors in the first pass,
* because if the devices in question are memory blocks and
@@ -236,9 +234,6 @@ static int acpi_scan_hot_remove(struct a
ACPI_UINT32_MAX,
acpi_bus_online_companions, NULL,
NULL, NULL);
-
- unlock_device_hotplug();
-
put_device(&device->dev);
return -EBUSY;
}
@@ -249,8 +244,6 @@ static int acpi_scan_hot_remove(struct a

acpi_bus_trim(device);

- unlock_device_hotplug();
-
/* Device node has been unregistered. */
put_device(&device->dev);
device = NULL;
@@ -289,6 +282,7 @@ static void acpi_bus_device_eject(void *
u32 ost_code = ACPI_OST_SC_NON_SPECIFIC_FAILURE;
int error;

+ lock_device_hotplug();
mutex_lock(&acpi_scan_lock);

acpi_bus_get_device(handle, &device);
@@ -312,6 +306,7 @@ static void acpi_bus_device_eject(void *

out:
mutex_unlock(&acpi_scan_lock);
+ unlock_device_hotplug();
return;

err_out:
@@ -326,8 +321,8 @@ static void acpi_scan_bus_device_check(a
u32 ost_code = ACPI_OST_SC_NON_SPECIFIC_FAILURE;
int error;

- mutex_lock(&acpi_scan_lock);
lock_device_hotplug();
+ mutex_lock(&acpi_scan_lock);

if (ost_source != ACPI_NOTIFY_BUS_CHECK) {
acpi_bus_get_device(handle, &device);
@@ -353,9 +348,9 @@ static void acpi_scan_bus_device_check(a
kobject_uevent(&device->dev.kobj, KOBJ_ONLINE);

out:
- unlock_device_hotplug();
acpi_evaluate_hotplug_ost(handle, ost_source, ost_code, NULL);
mutex_unlock(&acpi_scan_lock);
+ unlock_device_hotplug();
}

static void acpi_scan_bus_check(void *context)
@@ -446,6 +441,7 @@ void acpi_bus_hot_remove_device(void *co
acpi_handle handle = device->handle;
int error;

+ lock_device_hotplug();
mutex_lock(&acpi_scan_lock);

error = acpi_scan_hot_remove(device);
@@ -455,6 +451,7 @@ void acpi_bus_hot_remove_device(void *co
NULL);

mutex_unlock(&acpi_scan_lock);
+ unlock_device_hotplug();
kfree(context);
}
EXPORT_SYMBOL(acpi_bus_hot_remove_device);

2013-08-29 21:08:15

by Rafael J. Wysocki

[permalink] [raw]
Subject: [PATCH 2/3] PM / hibernate: Create memory bitmaps after freezing user space

From: Rafael J. Wysocki <[email protected]>

The hibernation core uses special memory bitmaps during image
creation and restoration and traditionally those bitmaps are
allocated before freezing tasks, because in the past GFP_KERNEL
allocations might not work after all tasks had been frozen.

However, this is an anachronism, because hibernation_snapshot()
now calls hibernate_preallocate_memory() which allocates memory
for the image upfront anyway, so the memory bitmaps may be
allocated after freezing user space safely.

For this reason, move all of the create_basic_memory_bitmaps()
calls after freeze_processes() and all of the corresponding
free_basic_memory_bitmaps() calls before thaw_processes().

This will allow us to hold device_hotplug_lock around hibernation
without the need to worry about freezing issues with user space
processes attempting to acquire it via sysfs attributes after the
creation of memory bitmaps and before the freezing of tasks.

Signed-off-by: Rafael J. Wysocki <[email protected]>
---
kernel/power/hibernate.c | 41 +++++++++++++++++++----------------------
kernel/power/user.c | 22 ++++++++++++----------
2 files changed, 31 insertions(+), 32 deletions(-)

Index: linux-pm/kernel/power/hibernate.c
===================================================================
--- linux-pm.orig/kernel/power/hibernate.c
+++ linux-pm/kernel/power/hibernate.c
@@ -644,22 +644,22 @@ int hibernate(void)
if (error)
goto Exit;

- /* Allocate memory management structures */
- error = create_basic_memory_bitmaps();
- if (error)
- goto Exit;
-
printk(KERN_INFO "PM: Syncing filesystems ... ");
sys_sync();
printk("done.\n");

error = freeze_processes();
if (error)
- goto Free_bitmaps;
+ goto Exit;
+
+ /* Allocate memory management structures */
+ error = create_basic_memory_bitmaps();
+ if (error)
+ goto Thaw;

error = hibernation_snapshot(hibernation_mode == HIBERNATION_PLATFORM);
if (error || freezer_test_done)
- goto Thaw;
+ goto Free_bitmaps;

if (in_suspend) {
unsigned int flags = 0;
@@ -682,14 +682,13 @@ int hibernate(void)
pr_debug("PM: Image restored successfully.\n");
}

+ Free_bitmaps:
+ free_basic_memory_bitmaps();
Thaw:
thaw_processes();

/* Don't bother checking whether freezer_test_done is true */
freezer_test_done = false;
-
- Free_bitmaps:
- free_basic_memory_bitmaps();
Exit:
pm_notifier_call_chain(PM_POST_HIBERNATION);
pm_restore_console();
@@ -806,21 +805,19 @@ static int software_resume(void)
pm_prepare_console();
error = pm_notifier_call_chain(PM_RESTORE_PREPARE);
if (error)
- goto close_finish;
-
- error = create_basic_memory_bitmaps();
- if (error)
- goto close_finish;
+ goto Close_Finish;

pr_debug("PM: Preparing processes for restore.\n");
error = freeze_processes();
- if (error) {
- swsusp_close(FMODE_READ);
- goto Done;
- }
+ if (error)
+ goto Close_Finish;

pr_debug("PM: Loading hibernation image.\n");

+ error = create_basic_memory_bitmaps();
+ if (error)
+ goto Thaw;
+
error = swsusp_read(&flags);
swsusp_close(FMODE_READ);
if (!error)
@@ -828,9 +825,9 @@ static int software_resume(void)

printk(KERN_ERR "PM: Failed to load hibernation image, recovering.\n");
swsusp_free();
- thaw_processes();
- Done:
free_basic_memory_bitmaps();
+ Thaw:
+ thaw_processes();
Finish:
pm_notifier_call_chain(PM_POST_RESTORE);
pm_restore_console();
@@ -840,7 +837,7 @@ static int software_resume(void)
mutex_unlock(&pm_mutex);
pr_debug("PM: Hibernation image not present or could not be loaded.\n");
return error;
-close_finish:
+ Close_Finish:
swsusp_close(FMODE_READ);
goto Finish;
}
Index: linux-pm/kernel/power/user.c
===================================================================
--- linux-pm.orig/kernel/power/user.c
+++ linux-pm/kernel/power/user.c
@@ -60,11 +60,6 @@ static int snapshot_open(struct inode *i
error = -ENOSYS;
goto Unlock;
}
- if(create_basic_memory_bitmaps()) {
- atomic_inc(&snapshot_device_available);
- error = -ENOMEM;
- goto Unlock;
- }
nonseekable_open(inode, filp);
data = &snapshot_state;
filp->private_data = data;
@@ -90,10 +85,9 @@ static int snapshot_open(struct inode *i
if (error)
pm_notifier_call_chain(PM_POST_RESTORE);
}
- if (error) {
- free_basic_memory_bitmaps();
+ if (error)
atomic_inc(&snapshot_device_available);
- }
+
data->frozen = 0;
data->ready = 0;
data->platform_support = 0;
@@ -111,11 +105,11 @@ static int snapshot_release(struct inode
lock_system_sleep();

swsusp_free();
- free_basic_memory_bitmaps();
data = filp->private_data;
free_all_swap_pages(data->swap);
if (data->frozen) {
pm_restore_gfp_mask();
+ free_basic_memory_bitmaps();
thaw_processes();
}
pm_notifier_call_chain(data->mode == O_RDONLY ?
@@ -220,14 +214,22 @@ static long snapshot_ioctl(struct file *
printk("done.\n");

error = freeze_processes();
- if (!error)
+ if (error)
+ break;
+
+ error = create_basic_memory_bitmaps();
+ if (error)
+ thaw_processes();
+ else
data->frozen = 1;
+
break;

case SNAPSHOT_UNFREEZE:
if (!data->frozen || data->ready)
break;
pm_restore_gfp_mask();
+ free_basic_memory_bitmaps();
thaw_processes();
data->frozen = 0;
break;

2013-08-29 21:08:13

by Rafael J. Wysocki

[permalink] [raw]
Subject: [PATCH 3/3] PM / hibernate / memory hotplug: Rework mutual exclusion

From: Rafael J. Wysocki <[email protected]>

Since all of the memory hotplug operations have to be carried out
under device_hotplug_lock, they won't need to acquire pm_mutex if
device_hotplug_lock is held around hibernation.

For this reason, make the hibernation code acquire
device_hotplug_lock after freezing user space processes and
release it before thawing them. At the same tim drop the
lock_system_sleep() and unlock_system_sleep() calls from
lock_memory_hotplug() and unlock_memory_hotplug(), respectively.

Signed-off-by: Rafael J. Wysocki <[email protected]>
---
kernel/power/hibernate.c | 4 ++++
kernel/power/user.c | 2 ++
mm/memory_hotplug.c | 4 ----
3 files changed, 6 insertions(+), 4 deletions(-)

Index: linux-pm/kernel/power/hibernate.c
===================================================================
--- linux-pm.orig/kernel/power/hibernate.c
+++ linux-pm/kernel/power/hibernate.c
@@ -652,6 +652,7 @@ int hibernate(void)
if (error)
goto Exit;

+ lock_device_hotplug();
/* Allocate memory management structures */
error = create_basic_memory_bitmaps();
if (error)
@@ -685,6 +686,7 @@ int hibernate(void)
Free_bitmaps:
free_basic_memory_bitmaps();
Thaw:
+ unlock_device_hotplug();
thaw_processes();

/* Don't bother checking whether freezer_test_done is true */
@@ -814,6 +816,7 @@ static int software_resume(void)

pr_debug("PM: Loading hibernation image.\n");

+ lock_device_hotplug();
error = create_basic_memory_bitmaps();
if (error)
goto Thaw;
@@ -827,6 +830,7 @@ static int software_resume(void)
swsusp_free();
free_basic_memory_bitmaps();
Thaw:
+ unlock_device_hotplug();
thaw_processes();
Finish:
pm_notifier_call_chain(PM_POST_RESTORE);
Index: linux-pm/kernel/power/user.c
===================================================================
--- linux-pm.orig/kernel/power/user.c
+++ linux-pm/kernel/power/user.c
@@ -201,6 +201,7 @@ static long snapshot_ioctl(struct file *
if (!mutex_trylock(&pm_mutex))
return -EBUSY;

+ lock_device_hotplug();
data = filp->private_data;

switch (cmd) {
@@ -373,6 +374,7 @@ static long snapshot_ioctl(struct file *

}

+ unlock_device_hotplug();
mutex_unlock(&pm_mutex);

return error;
Index: linux-pm/mm/memory_hotplug.c
===================================================================
--- linux-pm.orig/mm/memory_hotplug.c
+++ linux-pm/mm/memory_hotplug.c
@@ -51,14 +51,10 @@ DEFINE_MUTEX(mem_hotplug_mutex);
void lock_memory_hotplug(void)
{
mutex_lock(&mem_hotplug_mutex);
-
- /* for exclusive hibernation if CONFIG_HIBERNATION=y */
- lock_system_sleep();
}

void unlock_memory_hotplug(void)
{
- unlock_system_sleep();
mutex_unlock(&mem_hotplug_mutex);
}

2013-08-31 00:19:12

by Toshi Kani

[permalink] [raw]
Subject: Re: [PATCH 1/3] ACPI / scan: Change ordering of locks for device hotplug

On Thu, 2013-08-29 at 23:15 +0200, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <[email protected]>
>
> Change the ordering of device hotplug locks in scan.c so that
> acpi_scan_lock is always acquired after device_hotplug_lock.
>
> This will make it possible to use device_hotplug_lock around some
> code paths that acquire acpi_scan_lock safely (most importantly
> system suspend and hibernation). Apart from that, acpi_scan_lock
> is platform-specific and device_hotplug_lock is general, so the
> new ordering appears to be more appropriate from the overall
> design viewpoint.
>
> Signed-off-by: Rafael J. Wysocki <[email protected]>

Acked-by: Toshi Kani <[email protected]>

Thanks,
-Toshi

2013-08-31 00:24:52

by Toshi Kani

[permalink] [raw]
Subject: Re: [PATCH 3/3] PM / hibernate / memory hotplug: Rework mutual exclusion

On Thu, 2013-08-29 at 23:18 +0200, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <[email protected]>
>
> Since all of the memory hotplug operations have to be carried out
> under device_hotplug_lock, they won't need to acquire pm_mutex if
> device_hotplug_lock is held around hibernation.
>
> For this reason, make the hibernation code acquire
> device_hotplug_lock after freezing user space processes and
> release it before thawing them. At the same tim drop the
> lock_system_sleep() and unlock_system_sleep() calls from
> lock_memory_hotplug() and unlock_memory_hotplug(), respectively.
>
> Signed-off-by: Rafael J. Wysocki <[email protected]>
> ---
> kernel/power/hibernate.c | 4 ++++
> kernel/power/user.c | 2 ++
> mm/memory_hotplug.c | 4 ----
> 3 files changed, 6 insertions(+), 4 deletions(-)
>
> Index: linux-pm/kernel/power/hibernate.c
> ===================================================================
> --- linux-pm.orig/kernel/power/hibernate.c
> +++ linux-pm/kernel/power/hibernate.c
> @@ -652,6 +652,7 @@ int hibernate(void)
> if (error)
> goto Exit;
>
> + lock_device_hotplug();

Since hibernate() can be called from sysfs, do you think the tool may
see this as a circular dependency with p_active again? This shouldn't
be a problem in practice, though.

Thanks,
-Toshi

2013-08-31 00:28:45

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH 3/3] PM / hibernate / memory hotplug: Rework mutual exclusion

On Friday, August 30, 2013 06:23:19 PM Toshi Kani wrote:
> On Thu, 2013-08-29 at 23:18 +0200, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki <[email protected]>
> >
> > Since all of the memory hotplug operations have to be carried out
> > under device_hotplug_lock, they won't need to acquire pm_mutex if
> > device_hotplug_lock is held around hibernation.
> >
> > For this reason, make the hibernation code acquire
> > device_hotplug_lock after freezing user space processes and
> > release it before thawing them. At the same tim drop the
> > lock_system_sleep() and unlock_system_sleep() calls from
> > lock_memory_hotplug() and unlock_memory_hotplug(), respectively.
> >
> > Signed-off-by: Rafael J. Wysocki <[email protected]>
> > ---
> > kernel/power/hibernate.c | 4 ++++
> > kernel/power/user.c | 2 ++
> > mm/memory_hotplug.c | 4 ----
> > 3 files changed, 6 insertions(+), 4 deletions(-)
> >
> > Index: linux-pm/kernel/power/hibernate.c
> > ===================================================================
> > --- linux-pm.orig/kernel/power/hibernate.c
> > +++ linux-pm/kernel/power/hibernate.c
> > @@ -652,6 +652,7 @@ int hibernate(void)
> > if (error)
> > goto Exit;
> >
> > + lock_device_hotplug();
>
> Since hibernate() can be called from sysfs, do you think the tool may
> see this as a circular dependency with p_active again? This shouldn't
> be a problem in practice, though.

/sys/power/state isn't a device attribute even and is never removed, so it
would be very sad and disappointing if lockdep reported that as a circular
dependency. The deadlock is surely not possible here anyway.

Thanks,
Rafael

2013-08-31 00:36:46

by Toshi Kani

[permalink] [raw]
Subject: Re: [PATCH 3/3] PM / hibernate / memory hotplug: Rework mutual exclusion

On Sat, 2013-08-31 at 02:39 +0200, Rafael J. Wysocki wrote:
> On Friday, August 30, 2013 06:23:19 PM Toshi Kani wrote:
> > On Thu, 2013-08-29 at 23:18 +0200, Rafael J. Wysocki wrote:
> > > From: Rafael J. Wysocki <[email protected]>
> > >
> > > Since all of the memory hotplug operations have to be carried out
> > > under device_hotplug_lock, they won't need to acquire pm_mutex if
> > > device_hotplug_lock is held around hibernation.
> > >
> > > For this reason, make the hibernation code acquire
> > > device_hotplug_lock after freezing user space processes and
> > > release it before thawing them. At the same tim drop the
> > > lock_system_sleep() and unlock_system_sleep() calls from
> > > lock_memory_hotplug() and unlock_memory_hotplug(), respectively.
> > >
> > > Signed-off-by: Rafael J. Wysocki <[email protected]>
> > > ---
> > > kernel/power/hibernate.c | 4 ++++
> > > kernel/power/user.c | 2 ++
> > > mm/memory_hotplug.c | 4 ----
> > > 3 files changed, 6 insertions(+), 4 deletions(-)
> > >
> > > Index: linux-pm/kernel/power/hibernate.c
> > > ===================================================================
> > > --- linux-pm.orig/kernel/power/hibernate.c
> > > +++ linux-pm/kernel/power/hibernate.c
> > > @@ -652,6 +652,7 @@ int hibernate(void)
> > > if (error)
> > > goto Exit;
> > >
> > > + lock_device_hotplug();
> >
> > Since hibernate() can be called from sysfs, do you think the tool may
> > see this as a circular dependency with p_active again? This shouldn't
> > be a problem in practice, though.
>
> /sys/power/state isn't a device attribute even and is never removed, so it
> would be very sad and disappointing if lockdep reported that as a circular
> dependency. The deadlock is surely not possible here anyway.

Agreed. The code looks good otherwise, and this is a nice cleanup. If
it is OK to ignore the possible warning from the tool (which I do not
know the rule here), feel free to add my ack to patch 2/3 and 3/3 as
well.

Thanks,
-Toshi

2013-08-31 00:39:43

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH 3/3] PM / hibernate / memory hotplug: Rework mutual exclusion

On Friday, August 30, 2013 06:35:12 PM Toshi Kani wrote:
> On Sat, 2013-08-31 at 02:39 +0200, Rafael J. Wysocki wrote:
> > On Friday, August 30, 2013 06:23:19 PM Toshi Kani wrote:
> > > On Thu, 2013-08-29 at 23:18 +0200, Rafael J. Wysocki wrote:
> > > > From: Rafael J. Wysocki <[email protected]>
> > > >
> > > > Since all of the memory hotplug operations have to be carried out
> > > > under device_hotplug_lock, they won't need to acquire pm_mutex if
> > > > device_hotplug_lock is held around hibernation.
> > > >
> > > > For this reason, make the hibernation code acquire
> > > > device_hotplug_lock after freezing user space processes and
> > > > release it before thawing them. At the same tim drop the
> > > > lock_system_sleep() and unlock_system_sleep() calls from
> > > > lock_memory_hotplug() and unlock_memory_hotplug(), respectively.
> > > >
> > > > Signed-off-by: Rafael J. Wysocki <[email protected]>
> > > > ---
> > > > kernel/power/hibernate.c | 4 ++++
> > > > kernel/power/user.c | 2 ++
> > > > mm/memory_hotplug.c | 4 ----
> > > > 3 files changed, 6 insertions(+), 4 deletions(-)
> > > >
> > > > Index: linux-pm/kernel/power/hibernate.c
> > > > ===================================================================
> > > > --- linux-pm.orig/kernel/power/hibernate.c
> > > > +++ linux-pm/kernel/power/hibernate.c
> > > > @@ -652,6 +652,7 @@ int hibernate(void)
> > > > if (error)
> > > > goto Exit;
> > > >
> > > > + lock_device_hotplug();
> > >
> > > Since hibernate() can be called from sysfs, do you think the tool may
> > > see this as a circular dependency with p_active again? This shouldn't
> > > be a problem in practice, though.
> >
> > /sys/power/state isn't a device attribute even and is never removed, so it
> > would be very sad and disappointing if lockdep reported that as a circular
> > dependency. The deadlock is surely not possible here anyway.
>
> Agreed. The code looks good otherwise, and this is a nice cleanup. If
> it is OK to ignore the possible warning from the tool (which I do not
> know the rule here),

Well, if it complains, we'll just need to add some annotations to this.
The code is correct I believe.

> feel free to add my ack to patch 2/3 and 3/3 as well.

I will, thanks!


--
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.