test_resume does not work in current kernel when using swapfile for hibernation.
This is because the swap device should be openned non-exclusively in test_resume mode.
Patch 1 is a preparation for patch 2 and it turns snapshot_test into a global variable.
Patch 2 opens swap device non-exclusively for test_resume mode, and exclusively for manual
hibernation resume.
Change since v1:
Turn snapshot_test into global variable and do not introduce parameters for swsusp_check()
nor load_image_and_restore().
Chen Yu (2):
PM: hibernate: Turn snapshot_test into global variable
PM: hibernate: Do not get block device exclusively in test_resume mode
kernel/power/hibernate.c | 12 +++++++++---
kernel/power/power.h | 1 +
kernel/power/swap.c | 5 +++--
3 files changed, 13 insertions(+), 5 deletions(-)
--
2.25.1
There is need to check snapshot_test and open block device
in different mode, so as to avoid the race condition.
No functional changes intended.
Suggested-by: Pavankumar Kondeti <[email protected]>
Signed-off-by: Chen Yu <[email protected]>
---
kernel/power/hibernate.c | 7 ++++++-
kernel/power/power.h | 1 +
2 files changed, 7 insertions(+), 1 deletion(-)
diff --git a/kernel/power/hibernate.c b/kernel/power/hibernate.c
index 793c55a2becb..aa551b093c3f 100644
--- a/kernel/power/hibernate.c
+++ b/kernel/power/hibernate.c
@@ -64,6 +64,7 @@ enum {
static int hibernation_mode = HIBERNATION_SHUTDOWN;
bool freezer_test_done;
+bool snapshot_test;
static const struct platform_hibernation_ops *hibernation_ops;
@@ -716,7 +717,6 @@ static int load_image_and_restore(void)
*/
int hibernate(void)
{
- bool snapshot_test = false;
unsigned int sleep_flags;
int error;
@@ -744,6 +744,9 @@ int hibernate(void)
if (error)
goto Exit;
+ /* protected by system_transition_mutex */
+ snapshot_test = false;
+
lock_device_hotplug();
/* Allocate memory management structures */
error = create_basic_memory_bitmaps();
@@ -940,6 +943,8 @@ static int software_resume(void)
*/
mutex_lock_nested(&system_transition_mutex, SINGLE_DEPTH_NESTING);
+ snapshot_test = false;
+
if (swsusp_resume_device)
goto Check_image;
diff --git a/kernel/power/power.h b/kernel/power/power.h
index b4f433943209..b83c8d5e188d 100644
--- a/kernel/power/power.h
+++ b/kernel/power/power.h
@@ -59,6 +59,7 @@ asmlinkage int swsusp_save(void);
/* kernel/power/hibernate.c */
extern bool freezer_test_done;
+extern bool snapshot_test;
extern int hibernation_snapshot(int platform_mode);
extern int hibernation_restore(int platform_mode);
--
2.25.1
The system refused to do a test_resume because it found that the
swap device has already been taken by someone else. Specificly,
the swsusp_check()->blkdev_get_by_dev(FMODE_EXCL) is supposed to
do this check.
Steps to reproduce:
dd if=/dev/zero of=/swapfile bs=$(cat /proc/meminfo |
awk '/MemTotal/ {print $2}') count=1024 conv=notrunc
mkswap /swapfile
swapon /swapfile
swap-offset /swapfile
echo 34816 > /sys/power/resume_offset
echo test_resume > /sys/power/disk
echo disk > /sys/power/state
PM: Using 3 thread(s) for compression
PM: Compressing and saving image data (293150 pages)...
PM: Image saving progress: 0%
PM: Image saving progress: 10%
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata1.00: configured for UDMA/100
ata2: SATA link down (SStatus 0 SControl 300)
ata5: SATA link down (SStatus 0 SControl 300)
ata6: SATA link down (SStatus 0 SControl 300)
ata3: SATA link down (SStatus 0 SControl 300)
ata4: SATA link down (SStatus 0 SControl 300)
PM: Image saving progress: 20%
PM: Image saving progress: 30%
PM: Image saving progress: 40%
PM: Image saving progress: 50%
pcieport 0000:00:02.5: pciehp: Slot(0-5): No device found
PM: Image saving progress: 60%
PM: Image saving progress: 70%
PM: Image saving progress: 80%
PM: Image saving progress: 90%
PM: Image saving done
PM: hibernation: Wrote 1172600 kbytes in 2.70 seconds (434.29 MB/s)
PM: S|
PM: hibernation: Basic memory bitmaps freed
PM: Image not found (code -16)
This is because when using the swapfile as the hibernation storage,
the block device where the swapfile is located has already been mounted
by the OS distribution(usually been mounted as the rootfs). This is not
an issue for normal hibernation, because software_resume()->swsusp_check()
happens before the block device(rootfs) mount. But it is a problem for the
test_resume mode. Because when test_resume happens, the block device has
been mounted already.
Thus remove the FMODE_EXCL for test_resume mode. This would not be a
problem because in test_resume stage, the processes have already been
frozen, and the race condition described in
Commit 39fbef4b0f77 ("PM: hibernate: Get block device exclusively in swsusp_check()")
is unlikely to happen.
Fixes: 39fbef4b0f77 ("PM: hibernate: Get block device exclusively in swsusp_check()")
Reported-by: Yifan Li <[email protected]>
Suggested-by: Pavankumar Kondeti <[email protected]>
Signed-off-by: Chen Yu <[email protected]>
---
kernel/power/hibernate.c | 5 +++--
kernel/power/swap.c | 5 +++--
2 files changed, 6 insertions(+), 4 deletions(-)
diff --git a/kernel/power/hibernate.c b/kernel/power/hibernate.c
index aa551b093c3f..defc2257b052 100644
--- a/kernel/power/hibernate.c
+++ b/kernel/power/hibernate.c
@@ -688,18 +688,19 @@ static int load_image_and_restore(void)
{
int error;
unsigned int flags;
+ fmode_t mode = snapshot_test ? FMODE_READ : (FMODE_READ | FMODE_EXCL);
pm_pr_dbg("Loading hibernation image.\n");
lock_device_hotplug();
error = create_basic_memory_bitmaps();
if (error) {
- swsusp_close(FMODE_READ | FMODE_EXCL);
+ swsusp_close(mode);
goto Unlock;
}
error = swsusp_read(&flags);
- swsusp_close(FMODE_READ | FMODE_EXCL);
+ swsusp_close(mode);
if (!error)
error = hibernation_restore(flags & SF_PLATFORM_MODE);
diff --git a/kernel/power/swap.c b/kernel/power/swap.c
index 36a1df48280c..0f699cd96a89 100644
--- a/kernel/power/swap.c
+++ b/kernel/power/swap.c
@@ -1518,9 +1518,10 @@ int swsusp_check(void)
{
int error;
void *holder;
+ fmode_t mode = snapshot_test ? FMODE_READ : (FMODE_READ | FMODE_EXCL);
hib_resume_bdev = blkdev_get_by_dev(swsusp_resume_device,
- FMODE_READ | FMODE_EXCL, &holder);
+ mode, &holder);
if (!IS_ERR(hib_resume_bdev)) {
set_blocksize(hib_resume_bdev, PAGE_SIZE);
clear_page(swsusp_header);
@@ -1547,7 +1548,7 @@ int swsusp_check(void)
put:
if (error)
- blkdev_put(hib_resume_bdev, FMODE_READ | FMODE_EXCL);
+ blkdev_put(hib_resume_bdev, mode);
else
pr_debug("Image signature found, resuming\n");
} else {
--
2.25.1
On Tue, Apr 11, 2023 at 08:18:43PM +0800, Chen Yu wrote:
> test_resume does not work in current kernel when using swapfile for hibernation.
> This is because the swap device should be openned non-exclusively in test_resume mode.
>
> Patch 1 is a preparation for patch 2 and it turns snapshot_test into a global variable.
> Patch 2 opens swap device non-exclusively for test_resume mode, and exclusively for manual
> hibernation resume.
>
> Change since v1:
> Turn snapshot_test into global variable and do not introduce parameters for swsusp_check()
> nor load_image_and_restore().
>
>
> Chen Yu (2):
> PM: hibernate: Turn snapshot_test into global variable
> PM: hibernate: Do not get block device exclusively in test_resume mode
>
> kernel/power/hibernate.c | 12 +++++++++---
> kernel/power/power.h | 1 +
> kernel/power/swap.c | 5 +++--
> 3 files changed, 13 insertions(+), 5 deletions(-)
>
Looks good to me.
I have verified test_resume on QEMU arm64 and it worked fine with
these two patches included.
Thanks,
Pavan
On Tue, Apr 11, 2023 at 6:23 AM Chen Yu <[email protected]> wrote:
>
> The system refused to do a test_resume because it found that the
> swap device has already been taken by someone else. Specificly,
"Specifically" I suppose.
> the swsusp_check()->blkdev_get_by_dev(FMODE_EXCL) is supposed to
> do this check.
>
> Steps to reproduce:
> dd if=/dev/zero of=/swapfile bs=$(cat /proc/meminfo |
> awk '/MemTotal/ {print $2}') count=1024 conv=notrunc
> mkswap /swapfile
> swapon /swapfile
> swap-offset /swapfile
> echo 34816 > /sys/power/resume_offset
> echo test_resume > /sys/power/disk
> echo disk > /sys/power/state
>
> PM: Using 3 thread(s) for compression
> PM: Compressing and saving image data (293150 pages)...
> PM: Image saving progress: 0%
> PM: Image saving progress: 10%
> ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
> ata1.00: configured for UDMA/100
> ata2: SATA link down (SStatus 0 SControl 300)
> ata5: SATA link down (SStatus 0 SControl 300)
> ata6: SATA link down (SStatus 0 SControl 300)
> ata3: SATA link down (SStatus 0 SControl 300)
> ata4: SATA link down (SStatus 0 SControl 300)
> PM: Image saving progress: 20%
> PM: Image saving progress: 30%
> PM: Image saving progress: 40%
> PM: Image saving progress: 50%
> pcieport 0000:00:02.5: pciehp: Slot(0-5): No device found
> PM: Image saving progress: 60%
> PM: Image saving progress: 70%
> PM: Image saving progress: 80%
> PM: Image saving progress: 90%
> PM: Image saving done
> PM: hibernation: Wrote 1172600 kbytes in 2.70 seconds (434.29 MB/s)
> PM: S|
> PM: hibernation: Basic memory bitmaps freed
> PM: Image not found (code -16)
>
> This is because when using the swapfile as the hibernation storage,
> the block device where the swapfile is located has already been mounted
> by the OS distribution(usually been mounted as the rootfs). This is not
"usually mounted"
> an issue for normal hibernation, because software_resume()->swsusp_check()
> happens before the block device(rootfs) mount. But it is a problem for the
> test_resume mode. Because when test_resume happens, the block device has
> been mounted already.
>
> Thus remove the FMODE_EXCL for test_resume mode. This would not be a
> problem because in test_resume stage, the processes have already been
> frozen, and the race condition described in
> Commit 39fbef4b0f77 ("PM: hibernate: Get block device exclusively in swsusp_check()")
> is unlikely to happen.
>
> Fixes: 39fbef4b0f77 ("PM: hibernate: Get block device exclusively in swsusp_check()")
> Reported-by: Yifan Li <[email protected]>
> Suggested-by: Pavankumar Kondeti <[email protected]>
> Signed-off-by: Chen Yu <[email protected]>
> ---
> kernel/power/hibernate.c | 5 +++--
> kernel/power/swap.c | 5 +++--
> 2 files changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/kernel/power/hibernate.c b/kernel/power/hibernate.c
> index aa551b093c3f..defc2257b052 100644
> --- a/kernel/power/hibernate.c
> +++ b/kernel/power/hibernate.c
> @@ -688,18 +688,19 @@ static int load_image_and_restore(void)
> {
> int error;
> unsigned int flags;
> + fmode_t mode = snapshot_test ? FMODE_READ : (FMODE_READ | FMODE_EXCL);
fmode_t mode = FMODE_READ;
if (snapshot_test)
mode |= FMODE_EXCL;
pretty please, and analogously below.
>
> pm_pr_dbg("Loading hibernation image.\n");
>
> lock_device_hotplug();
> error = create_basic_memory_bitmaps();
> if (error) {
> - swsusp_close(FMODE_READ | FMODE_EXCL);
> + swsusp_close(mode);
> goto Unlock;
> }
>
> error = swsusp_read(&flags);
> - swsusp_close(FMODE_READ | FMODE_EXCL);
> + swsusp_close(mode);
> if (!error)
> error = hibernation_restore(flags & SF_PLATFORM_MODE);
>
> diff --git a/kernel/power/swap.c b/kernel/power/swap.c
> index 36a1df48280c..0f699cd96a89 100644
> --- a/kernel/power/swap.c
> +++ b/kernel/power/swap.c
> @@ -1518,9 +1518,10 @@ int swsusp_check(void)
> {
> int error;
> void *holder;
> + fmode_t mode = snapshot_test ? FMODE_READ : (FMODE_READ | FMODE_EXCL);
>
> hib_resume_bdev = blkdev_get_by_dev(swsusp_resume_device,
> - FMODE_READ | FMODE_EXCL, &holder);
> + mode, &holder);
> if (!IS_ERR(hib_resume_bdev)) {
> set_blocksize(hib_resume_bdev, PAGE_SIZE);
> clear_page(swsusp_header);
> @@ -1547,7 +1548,7 @@ int swsusp_check(void)
>
> put:
> if (error)
> - blkdev_put(hib_resume_bdev, FMODE_READ | FMODE_EXCL);
> + blkdev_put(hib_resume_bdev, mode);
> else
> pr_debug("Image signature found, resuming\n");
> } else {
> --
On 2023-04-11 at 11:00:50 +0530, Pavan Kondeti wrote:
> On Tue, Apr 11, 2023 at 08:18:43PM +0800, Chen Yu wrote:
> > test_resume does not work in current kernel when using swapfile for hibernation.
> > This is because the swap device should be openned non-exclusively in test_resume mode.
> >
> > Patch 1 is a preparation for patch 2 and it turns snapshot_test into a global variable.
> > Patch 2 opens swap device non-exclusively for test_resume mode, and exclusively for manual
> > hibernation resume.
> >
> > Change since v1:
> > Turn snapshot_test into global variable and do not introduce parameters for swsusp_check()
> > nor load_image_and_restore().
> >
> >
> > Chen Yu (2):
> > PM: hibernate: Turn snapshot_test into global variable
> > PM: hibernate: Do not get block device exclusively in test_resume mode
> >
> > kernel/power/hibernate.c | 12 +++++++++---
> > kernel/power/power.h | 1 +
> > kernel/power/swap.c | 5 +++--
> > 3 files changed, 13 insertions(+), 5 deletions(-)
> >
> Looks good to me.
>
> I have verified test_resume on QEMU arm64 and it worked fine with
> these two patches included.
>
Thanks, can I add your Tested-by tag?
thanks,
Chenyu
> Thanks,
> Pavan
On 2023-04-11 at 18:21:36 +0200, Rafael J. Wysocki wrote:
> On Tue, Apr 11, 2023 at 6:23 AM Chen Yu <[email protected]> wrote:
> >
> > The system refused to do a test_resume because it found that the
> > swap device has already been taken by someone else. Specificly,
>
> "Specifically" I suppose.
>
Yes, will fix it.
> > the swsusp_check()->blkdev_get_by_dev(FMODE_EXCL) is supposed to
> > do this check.
> >
> > Steps to reproduce:
> > dd if=/dev/zero of=/swapfile bs=$(cat /proc/meminfo |
> > awk '/MemTotal/ {print $2}') count=1024 conv=notrunc
> > mkswap /swapfile
> > swapon /swapfile
> > swap-offset /swapfile
> > echo 34816 > /sys/power/resume_offset
> > echo test_resume > /sys/power/disk
> > echo disk > /sys/power/state
> >
> > PM: Using 3 thread(s) for compression
> > PM: Compressing and saving image data (293150 pages)...
> > PM: Image saving progress: 0%
> > PM: Image saving progress: 10%
> > ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
> > ata1.00: configured for UDMA/100
> > ata2: SATA link down (SStatus 0 SControl 300)
> > ata5: SATA link down (SStatus 0 SControl 300)
> > ata6: SATA link down (SStatus 0 SControl 300)
> > ata3: SATA link down (SStatus 0 SControl 300)
> > ata4: SATA link down (SStatus 0 SControl 300)
> > PM: Image saving progress: 20%
> > PM: Image saving progress: 30%
> > PM: Image saving progress: 40%
> > PM: Image saving progress: 50%
> > pcieport 0000:00:02.5: pciehp: Slot(0-5): No device found
> > PM: Image saving progress: 60%
> > PM: Image saving progress: 70%
> > PM: Image saving progress: 80%
> > PM: Image saving progress: 90%
> > PM: Image saving done
> > PM: hibernation: Wrote 1172600 kbytes in 2.70 seconds (434.29 MB/s)
> > PM: S|
> > PM: hibernation: Basic memory bitmaps freed
> > PM: Image not found (code -16)
> >
> > This is because when using the swapfile as the hibernation storage,
> > the block device where the swapfile is located has already been mounted
> > by the OS distribution(usually been mounted as the rootfs). This is not
>
> "usually mounted"
>
OK, will fix it.
> > an issue for normal hibernation, because software_resume()->swsusp_check()
> > happens before the block device(rootfs) mount. But it is a problem for the
> > test_resume mode. Because when test_resume happens, the block device has
> > been mounted already.
> >
> > Thus remove the FMODE_EXCL for test_resume mode. This would not be a
> > problem because in test_resume stage, the processes have already been
> > frozen, and the race condition described in
> > Commit 39fbef4b0f77 ("PM: hibernate: Get block device exclusively in swsusp_check()")
> > is unlikely to happen.
> >
> > Fixes: 39fbef4b0f77 ("PM: hibernate: Get block device exclusively in swsusp_check()")
> > Reported-by: Yifan Li <[email protected]>
> > Suggested-by: Pavankumar Kondeti <[email protected]>
> > Signed-off-by: Chen Yu <[email protected]>
> > ---
> > kernel/power/hibernate.c | 5 +++--
> > kernel/power/swap.c | 5 +++--
> > 2 files changed, 6 insertions(+), 4 deletions(-)
> >
> > diff --git a/kernel/power/hibernate.c b/kernel/power/hibernate.c
> > index aa551b093c3f..defc2257b052 100644
> > --- a/kernel/power/hibernate.c
> > +++ b/kernel/power/hibernate.c
> > @@ -688,18 +688,19 @@ static int load_image_and_restore(void)
> > {
> > int error;
> > unsigned int flags;
> > + fmode_t mode = snapshot_test ? FMODE_READ : (FMODE_READ | FMODE_EXCL);
>
> fmode_t mode = FMODE_READ;
>
> if (snapshot_test)
> mode |= FMODE_EXCL;
>
> pretty please, and analogously below.
>
OK, will fix it in next version.
thanks,
Chenyu
On Wed, Apr 12, 2023 at 09:13:55AM +0800, Chen Yu wrote:
> On 2023-04-11 at 11:00:50 +0530, Pavan Kondeti wrote:
> > On Tue, Apr 11, 2023 at 08:18:43PM +0800, Chen Yu wrote:
> > > test_resume does not work in current kernel when using swapfile for hibernation.
> > > This is because the swap device should be openned non-exclusively in test_resume mode.
> > >
> > > Patch 1 is a preparation for patch 2 and it turns snapshot_test into a global variable.
> > > Patch 2 opens swap device non-exclusively for test_resume mode, and exclusively for manual
> > > hibernation resume.
> > >
> > > Change since v1:
> > > Turn snapshot_test into global variable and do not introduce parameters for swsusp_check()
> > > nor load_image_and_restore().
> > >
> > >
> > > Chen Yu (2):
> > > PM: hibernate: Turn snapshot_test into global variable
> > > PM: hibernate: Do not get block device exclusively in test_resume mode
> > >
> > > kernel/power/hibernate.c | 12 +++++++++---
> > > kernel/power/power.h | 1 +
> > > kernel/power/swap.c | 5 +++--
> > > 3 files changed, 13 insertions(+), 5 deletions(-)
> > >
> > Looks good to me.
> >
> > I have verified test_resume on QEMU arm64 and it worked fine with
> > these two patches included.
> >
> Thanks, can I add your Tested-by tag?
>
Sure. Pls add
Tested-by: Pavankumar Kondeti <[email protected]>
Thanks,
Pavan