2015-07-10 05:12:28

by Minfei Huang

[permalink] [raw]
Subject: [PATCH v4] kexec: Make a pair of map and unmap reserved pages when kdump fails to start

For some arch, kexec shall map the reserved pages, then use them, when
we try to start the kdump service.

Now kexec will never unmap the reserved pages, once it fails to continue
starting the kdump service. So we make a pair of map/unmap reserved
pages whatever kexec fails or not in code path.

In order to make code readable, wrap a new function __kexec_load which
contains all of the logic to deal with the image loading.

Signed-off-by: Minfei Huang <[email protected]>
---
v3:
- reconstruct the patch, wrap a new function to deal with the code logic, based on Vivek and Michael's patch
v2:
- replace the "failure" label with "fail_unmap_pages"
v1:
- reconstruct the patch code
---
kernel/kexec.c | 112 ++++++++++++++++++++++++++++++++-------------------------
1 file changed, 63 insertions(+), 49 deletions(-)

diff --git a/kernel/kexec.c b/kernel/kexec.c
index a785c10..2232c90 100644
--- a/kernel/kexec.c
+++ b/kernel/kexec.c
@@ -1247,10 +1247,71 @@ int kexec_load_disabled;

static DEFINE_MUTEX(kexec_mutex);

+static int __kexec_load(unsigned long entry, unsigned long nr_segments,
+ struct kexec_segment __user *segments,
+ unsigned long flags)
+{
+ int result = 0;
+ struct kimage **dest_image, *image;
+
+ dest_image = &kexec_image;
+
+ if (flags & KEXEC_ON_CRASH)
+ dest_image = &kexec_crash_image;
+
+ if (nr_segments == 0) {
+ /* Install the new kernel, and Uninstall the old */
+ image = xchg(dest_image, image);
+ kimage_free(image);
+ } else {
+ unsigned long i;
+
+ if (flags & KEXEC_ON_CRASH) {
+ /*
+ * Loading another kernel to switch to if this one
+ * crashes. Free any current crash dump kernel before
+ * we corrupt it.
+ */
+ kimage_free(xchg(&kexec_crash_image, NULL));
+ }
+
+ result = kimage_alloc_init(&image, entry,
+ nr_segments, segments, flags);
+ if (result)
+ return result;
+
+ if (flags & KEXEC_ON_CRASH)
+ crash_map_reserved_pages();
+
+ if (flags & KEXEC_PRESERVE_CONTEXT)
+ image->preserve_context = 1;
+
+ result = machine_kexec_prepare(image);
+ if (result)
+ goto failure_unmap_mem;
+
+ for (i = 0; i < nr_segments; i++) {
+ result = kimage_load_segment(image, &image->segment[i]);
+ if (result)
+ goto failure_unmap_mem;
+ }
+
+ kimage_terminate(image);
+
+ /* Install the new kernel, and Uninstall the old */
+ image = xchg(dest_image, image);
+
+failure_unmap_mem:
+ if (flags & KEXEC_ON_CRASH)
+ crash_unmap_reserved_pages();
+ kimage_free(image);
+ }
+ return result;
+}
+
SYSCALL_DEFINE4(kexec_load, unsigned long, entry, unsigned long, nr_segments,
struct kexec_segment __user *, segments, unsigned long, flags)
{
- struct kimage **dest_image, *image;
int result;

/* We only trust the superuser with rebooting the system. */
@@ -1275,9 +1336,6 @@ SYSCALL_DEFINE4(kexec_load, unsigned long, entry, unsigned long, nr_segments,
if (nr_segments > KEXEC_SEGMENT_MAX)
return -EINVAL;

- image = NULL;
- result = 0;
-
/* Because we write directly to the reserved memory
* region when loading crash kernels we need a mutex here to
* prevent multiple crash kernels from attempting to load
@@ -1289,53 +1347,9 @@ SYSCALL_DEFINE4(kexec_load, unsigned long, entry, unsigned long, nr_segments,
if (!mutex_trylock(&kexec_mutex))
return -EBUSY;

- dest_image = &kexec_image;
- if (flags & KEXEC_ON_CRASH)
- dest_image = &kexec_crash_image;
- if (nr_segments > 0) {
- unsigned long i;
-
- if (flags & KEXEC_ON_CRASH) {
- /*
- * Loading another kernel to switch to if this one
- * crashes. Free any current crash dump kernel before
- * we corrupt it.
- */
+ result = __kexec_load(entry, nr_segments, segments, flags);

- kimage_free(xchg(&kexec_crash_image, NULL));
- result = kimage_alloc_init(&image, entry, nr_segments,
- segments, flags);
- crash_map_reserved_pages();
- } else {
- /* Loading another kernel to reboot into. */
-
- result = kimage_alloc_init(&image, entry, nr_segments,
- segments, flags);
- }
- if (result)
- goto out;
-
- if (flags & KEXEC_PRESERVE_CONTEXT)
- image->preserve_context = 1;
- result = machine_kexec_prepare(image);
- if (result)
- goto out;
-
- for (i = 0; i < nr_segments; i++) {
- result = kimage_load_segment(image, &image->segment[i]);
- if (result)
- goto out;
- }
- kimage_terminate(image);
- if (flags & KEXEC_ON_CRASH)
- crash_unmap_reserved_pages();
- }
- /* Install the new kernel, and Uninstall the old */
- image = xchg(dest_image, image);
-
-out:
mutex_unlock(&kexec_mutex);
- kimage_free(image);

return result;
}
--
2.2.2


2015-07-10 08:54:25

by Michael Holzheu

[permalink] [raw]
Subject: Re: [PATCH v4] kexec: Make a pair of map and unmap reserved pages when kdump fails to start

On Fri, 10 Jul 2015 13:12:17 +0800
Minfei Huang <[email protected]> wrote:

> For some arch, kexec shall map the reserved pages, then use them, when
> we try to start the kdump service.
>
> Now kexec will never unmap the reserved pages, once it fails to continue
> starting the kdump service. So we make a pair of map/unmap reserved
> pages whatever kexec fails or not in code path.
>
> In order to make code readable, wrap a new function __kexec_load which
> contains all of the logic to deal with the image loading.
>
> Signed-off-by: Minfei Huang <[email protected]>
> ---
> v3:
> - reconstruct the patch, wrap a new function to deal with the code logic, based on Vivek and Michael's patch
> v2:
> - replace the "failure" label with "fail_unmap_pages"
> v1:
> - reconstruct the patch code
> ---
> kernel/kexec.c | 112 ++++++++++++++++++++++++++++++++-------------------------
> 1 file changed, 63 insertions(+), 49 deletions(-)
>
> diff --git a/kernel/kexec.c b/kernel/kexec.c
> index a785c10..2232c90 100644
> --- a/kernel/kexec.c
> +++ b/kernel/kexec.c
> @@ -1247,10 +1247,71 @@ int kexec_load_disabled;
>
> static DEFINE_MUTEX(kexec_mutex);
>
> +static int __kexec_load(unsigned long entry, unsigned long nr_segments,
> + struct kexec_segment __user *segments,
> + unsigned long flags)
> +{
> + int result = 0;
> + struct kimage **dest_image, *image;
> +
> + dest_image = &kexec_image;
> +
> + if (flags & KEXEC_ON_CRASH)
> + dest_image = &kexec_crash_image;
> +
> + if (nr_segments == 0) {
> + /* Install the new kernel, and Uninstall the old */
> + image = xchg(dest_image, image);
> + kimage_free(image);

Well this is wrong and should probably be:

if (nr_segments == 0) {
/* Uninstall image */
image = xchg(dest_image, NULL);
kimage_free(image);

> + } else {
> + unsigned long i;
> +
> + if (flags & KEXEC_ON_CRASH) {
> + /*

[snip]

> + result = kimage_load_segment(image, &image->segment[i]);
> + if (result)
> + goto failure_unmap_mem;
> + }
> +
> + kimage_terminate(image);
> +
> + /* Install the new kernel, and Uninstall the old */

Perhaps fix the comment: Remove superfluous blank and lowercase "uninstall"?

> + image = xchg(dest_image, image);
> +
> +failure_unmap_mem:
> + if (flags & KEXEC_ON_CRASH)
> + crash_unmap_reserved_pages();
> + kimage_free(image);

Here the update patch:
---
diff --git a/kernel/kexec.c b/kernel/kexec.c
index e686a39..2f5b4aa 100644
--- a/kernel/kexec.c
+++ b/kernel/kexec.c
@@ -1249,8 +1249,8 @@ static int __kexec_load(unsigned long entry, unsigned long nr_segments,
dest_image = &kexec_crash_image;

if (nr_segments == 0) {
- /* Install the new kernel, and Uninstall the old */
- image = xchg(dest_image, image);
+ /* Uninstall image */
+ image = xchg(dest_image, NULL);
kimage_free(image);
} else {
unsigned long i;
@@ -1287,7 +1287,7 @@ static int __kexec_load(unsigned long entry, unsigned long nr_segments,

kimage_terminate(image);

- /* Install the new kernel, and Uninstall the old */
+ /* Install the new kernel, and uninstall the old */
image = xchg(dest_image, image);

failure_unmap_mem:

2015-07-10 09:03:32

by Minfei Huang

[permalink] [raw]
Subject: Re: [PATCH v4] kexec: Make a pair of map and unmap reserved pages when kdump fails to start

On 07/10/15 at 10:54P, Michael Holzheu wrote:
> On Fri, 10 Jul 2015 13:12:17 +0800
> Minfei Huang <[email protected]> wrote:
>
> > For some arch, kexec shall map the reserved pages, then use them, when
> > we try to start the kdump service.
> >
> > Now kexec will never unmap the reserved pages, once it fails to continue
> > starting the kdump service. So we make a pair of map/unmap reserved
> > pages whatever kexec fails or not in code path.
> >
> > In order to make code readable, wrap a new function __kexec_load which
> > contains all of the logic to deal with the image loading.
> >
> > Signed-off-by: Minfei Huang <[email protected]>
> > ---
> > v3:
> > - reconstruct the patch, wrap a new function to deal with the code logic, based on Vivek and Michael's patch
> > v2:
> > - replace the "failure" label with "fail_unmap_pages"
> > v1:
> > - reconstruct the patch code
> > ---
> > kernel/kexec.c | 112 ++++++++++++++++++++++++++++++++-------------------------
> > 1 file changed, 63 insertions(+), 49 deletions(-)
> >
> > diff --git a/kernel/kexec.c b/kernel/kexec.c
> > index a785c10..2232c90 100644
> > --- a/kernel/kexec.c
> > +++ b/kernel/kexec.c
> > @@ -1247,10 +1247,71 @@ int kexec_load_disabled;
> >
> > static DEFINE_MUTEX(kexec_mutex);
> >
> > +static int __kexec_load(unsigned long entry, unsigned long nr_segments,
> > + struct kexec_segment __user *segments,
> > + unsigned long flags)
> > +{
> > + int result = 0;
> > + struct kimage **dest_image, *image;
> > +
> > + dest_image = &kexec_image;
> > +
> > + if (flags & KEXEC_ON_CRASH)
> > + dest_image = &kexec_crash_image;
> > +
> > + if (nr_segments == 0) {
> > + /* Install the new kernel, and Uninstall the old */
> > + image = xchg(dest_image, image);
> > + kimage_free(image);
>
> Well this is wrong and should probably be:
>
> if (nr_segments == 0) {
> /* Uninstall image */
> image = xchg(dest_image, NULL);
> kimage_free(image);
>

You are right. It should be what you commented.

> > + } else {
> > + unsigned long i;
> > +
> > + if (flags & KEXEC_ON_CRASH) {
> > + /*
>
> [snip]
>
> > + result = kimage_load_segment(image, &image->segment[i]);
> > + if (result)
> > + goto failure_unmap_mem;
> > + }
> > +
> > + kimage_terminate(image);
> > +
> > + /* Install the new kernel, and Uninstall the old */
>
> Perhaps fix the comment: Remove superfluous blank and lowercase "uninstall"?
>

Thanks.
Minfei

2015-07-10 09:14:28

by Michael Holzheu

[permalink] [raw]
Subject: Re: [PATCH v4] kexec: Make a pair of map and unmap reserved pages when kdump fails to start

On Fri, 10 Jul 2015 17:03:22 +0800
Minfei Huang <[email protected]> wrote:

> On 07/10/15 at 10:54P, Michael Holzheu wrote:
> > On Fri, 10 Jul 2015 13:12:17 +0800
> > Minfei Huang <[email protected]> wrote:
> >
> > > For some arch, kexec shall map the reserved pages, then use them, when
> > > we try to start the kdump service.
> > >
> > > Now kexec will never unmap the reserved pages, once it fails to continue
> > > starting the kdump service. So we make a pair of map/unmap reserved
> > > pages whatever kexec fails or not in code path.
> > >
> > > In order to make code readable, wrap a new function __kexec_load which
> > > contains all of the logic to deal with the image loading.
> > >
> > > Signed-off-by: Minfei Huang <[email protected]>
> > > ---
> > > v3:
> > > - reconstruct the patch, wrap a new function to deal with the code logic, based on Vivek and Michael's patch
> > > v2:
> > > - replace the "failure" label with "fail_unmap_pages"
> > > v1:
> > > - reconstruct the patch code
> > > ---
> > > kernel/kexec.c | 112 ++++++++++++++++++++++++++++++++-------------------------
> > > 1 file changed, 63 insertions(+), 49 deletions(-)
> > >
> > > diff --git a/kernel/kexec.c b/kernel/kexec.c
> > > index a785c10..2232c90 100644
> > > --- a/kernel/kexec.c
> > > +++ b/kernel/kexec.c
> > > @@ -1247,10 +1247,71 @@ int kexec_load_disabled;
> > >
> > > static DEFINE_MUTEX(kexec_mutex);
> > >
> > > +static int __kexec_load(unsigned long entry, unsigned long nr_segments,
> > > + struct kexec_segment __user *segments,
> > > + unsigned long flags)
> > > +{
> > > + int result = 0;
> > > + struct kimage **dest_image, *image;
> > > +
> > > + dest_image = &kexec_image;
> > > +
> > > + if (flags & KEXEC_ON_CRASH)
> > > + dest_image = &kexec_crash_image;
> > > +
> > > + if (nr_segments == 0) {
> > > + /* Install the new kernel, and Uninstall the old */
> > > + image = xchg(dest_image, image);
> > > + kimage_free(image);
> >
> > Well this is wrong and should probably be:
> >
> > if (nr_segments == 0) {
> > /* Uninstall image */
> > image = xchg(dest_image, NULL);
> > kimage_free(image);
> >
>
> You are right. It should be what you commented.

And after rethinking a bit, I think a one liner and an early exit
would be better in this case:

if (nr_segments == 0) {
/* Uninstall image */
kimage_free(xchg(dest_image, NULL));
return 0;
}

What about the following patch:
---
diff --git a/kernel/kexec.c b/kernel/kexec.c
index 7a36fdc..7837c4e 100644
--- a/kernel/kexec.c
+++ b/kernel/kexec.c
@@ -1236,10 +1236,68 @@ int kexec_load_disabled;

static DEFINE_MUTEX(kexec_mutex);

+static int __kexec_load(unsigned long entry, unsigned long nr_segments,
+ struct kexec_segment __user *segments,
+ unsigned long flags)
+{
+ struct kimage **dest_image, *image;
+ unsigned long i;
+ int result;
+
+ if (flags & KEXEC_ON_CRASH)
+ dest_image = &kexec_crash_image;
+ else
+ dest_image = &kexec_image;
+
+ if (nr_segments == 0) {
+ /* Uninstall image */
+ kfree(xchg(dest_image, NULL));
+ return 0;
+ }
+ if (flags & KEXEC_ON_CRASH) {
+ /*
+ * Loading another kernel to switch to if this one
+ * crashes. Free any current crash dump kernel before
+ * we corrupt it.
+ */
+ kimage_free(xchg(&kexec_crash_image, NULL));
+ }
+
+ result = kimage_alloc_init(&image, entry, nr_segments, segments, flags);
+ if (result)
+ return result;
+
+ if (flags & KEXEC_ON_CRASH)
+ crash_map_reserved_pages();
+
+ if (flags & KEXEC_PRESERVE_CONTEXT)
+ image->preserve_context = 1;
+
+ result = machine_kexec_prepare(image);
+ if (result)
+ goto failure_unmap_mem;
+
+ for (i = 0; i < nr_segments; i++) {
+ result = kimage_load_segment(image, &image->segment[i]);
+ if (result)
+ goto failure_unmap_mem;
+ }
+
+ kimage_terminate(image);
+
+ /* Install the new kernel and uninstall the old */
+ image = xchg(dest_image, image);
+
+failure_unmap_mem:
+ if (flags & KEXEC_ON_CRASH)
+ crash_unmap_reserved_pages();
+ kimage_free(image);
+ return result;
+}
+
SYSCALL_DEFINE4(kexec_load, unsigned long, entry, unsigned long, nr_segments,
struct kexec_segment __user *, segments, unsigned long, flags)
{
- struct kimage **dest_image, *image;
int result;

/* We only trust the superuser with rebooting the system. */
@@ -1264,9 +1322,6 @@ SYSCALL_DEFINE4(kexec_load, unsigned long, entry, unsigned long, nr_segments,
if (nr_segments > KEXEC_SEGMENT_MAX)
return -EINVAL;

- image = NULL;
- result = 0;
-
/* Because we write directly to the reserved memory
* region when loading crash kernels we need a mutex here to
* prevent multiple crash kernels from attempting to load
@@ -1278,53 +1333,9 @@ SYSCALL_DEFINE4(kexec_load, unsigned long, entry, unsigned long, nr_segments,
if (!mutex_trylock(&kexec_mutex))
return -EBUSY;

- dest_image = &kexec_image;
- if (flags & KEXEC_ON_CRASH)
- dest_image = &kexec_crash_image;
- if (nr_segments > 0) {
- unsigned long i;
-
- if (flags & KEXEC_ON_CRASH) {
- /*
- * Loading another kernel to switch to if this one
- * crashes. Free any current crash dump kernel before
- * we corrupt it.
- */
-
- kimage_free(xchg(&kexec_crash_image, NULL));
- result = kimage_alloc_init(&image, entry, nr_segments,
- segments, flags);
- crash_map_reserved_pages();
- } else {
- /* Loading another kernel to reboot into. */
-
- result = kimage_alloc_init(&image, entry, nr_segments,
- segments, flags);
- }
- if (result)
- goto out;
-
- if (flags & KEXEC_PRESERVE_CONTEXT)
- image->preserve_context = 1;
- result = machine_kexec_prepare(image);
- if (result)
- goto out;
+ result = __kexec_load(entry, nr_segments, segments, flags);

- for (i = 0; i < nr_segments; i++) {
- result = kimage_load_segment(image, &image->segment[i]);
- if (result)
- goto out;
- }
- kimage_terminate(image);
- if (flags & KEXEC_ON_CRASH)
- crash_unmap_reserved_pages();
- }
- /* Install the new kernel, and Uninstall the old */
- image = xchg(dest_image, image);
-
-out:
mutex_unlock(&kexec_mutex);
- kimage_free(image);

return result;
}

2015-07-10 09:29:28

by Michael Holzheu

[permalink] [raw]
Subject: Re: [PATCH v4] kexec: Make a pair of map and unmap reserved pages when kdump fails to start

On Fri, 10 Jul 2015 11:14:06 +0200
Michael Holzheu <[email protected]> wrote:

> On Fri, 10 Jul 2015 17:03:22 +0800
> Minfei Huang <[email protected]> wrote:

[snip]

> +static int __kexec_load(unsigned long entry, unsigned long nr_segments,
> + struct kexec_segment __user *segments,
> + unsigned long flags)
> +{
> + struct kimage **dest_image, *image;
> + unsigned long i;
> + int result;
> +
> + if (flags & KEXEC_ON_CRASH)
> + dest_image = &kexec_crash_image;
> + else
> + dest_image = &kexec_image;
> +
> + if (nr_segments == 0) {
> + /* Uninstall image */
> + kfree(xchg(dest_image, NULL));

Sorry, too fast today...
Should be of course not kfree, but:

kimage_free(dest_image, NULL));

Michael

2015-07-14 14:09:23

by Vivek Goyal

[permalink] [raw]
Subject: Re: [PATCH v4] kexec: Make a pair of map and unmap reserved pages when kdump fails to start

On Fri, Jul 10, 2015 at 11:14:06AM +0200, Michael Holzheu wrote:

[..]
> What about the following patch:
> ---
> diff --git a/kernel/kexec.c b/kernel/kexec.c
> index 7a36fdc..7837c4e 100644
> --- a/kernel/kexec.c
> +++ b/kernel/kexec.c
> @@ -1236,10 +1236,68 @@ int kexec_load_disabled;
>
> static DEFINE_MUTEX(kexec_mutex);
>
> +static int __kexec_load(unsigned long entry, unsigned long nr_segments,

How about renaming the function to do_kexec_load()?

We also need to cleanup the description of commit. One needs to explain
problem better and what's the solution this patch is implemeting.

> + struct kexec_segment __user *segments,
> + unsigned long flags)
> +{
> + struct kimage **dest_image, *image;
> + unsigned long i;
> + int result;
> +
> + if (flags & KEXEC_ON_CRASH)
> + dest_image = &kexec_crash_image;
> + else
> + dest_image = &kexec_image;
> +
> + if (nr_segments == 0) {
> + /* Uninstall image */
> + kfree(xchg(dest_image, NULL));

kimage_free(), as you have already noted in a follow up mail.

> + return 0;
> + }
> + if (flags & KEXEC_ON_CRASH) {
> + /*
> + * Loading another kernel to switch to if this one
> + * crashes. Free any current crash dump kernel before
> + * we corrupt it.
> + */
> + kimage_free(xchg(&kexec_crash_image, NULL));
> + }
> +
> + result = kimage_alloc_init(&image, entry, nr_segments, segments, flags);
> + if (result)
> + return result;
> +
> + if (flags & KEXEC_ON_CRASH)
> + crash_map_reserved_pages();
> +
> + if (flags & KEXEC_PRESERVE_CONTEXT)
> + image->preserve_context = 1;
> +
> + result = machine_kexec_prepare(image);
> + if (result)
> + goto failure_unmap_mem;
> +
> + for (i = 0; i < nr_segments; i++) {
> + result = kimage_load_segment(image, &image->segment[i]);
> + if (result)
> + goto failure_unmap_mem;
> + }
> +
> + kimage_terminate(image);
> +
> + /* Install the new kernel and uninstall the old */
> + image = xchg(dest_image, image);
> +
> +failure_unmap_mem:

I don't like this tag "failure_unmap_mem". We are calling this both
in success path as well as failure path. So why not simply call it "out".

> + if (flags & KEXEC_ON_CRASH)
> + crash_unmap_reserved_pages();
> + kimage_free(image);

Now kimage_free() is called with kexec_mutex held. Previously that was
not the case. I hope that's not a problem.

> + return result;
> +}
> +

Thanks
Vivek

2015-07-14 18:10:25

by Michael Holzheu

[permalink] [raw]
Subject: Re: [PATCH v4] kexec: Make a pair of map and unmap reserved pages when kdump fails to start

On Tue, 14 Jul 2015 10:09:20 -0400
Vivek Goyal <[email protected]> wrote:

> On Fri, Jul 10, 2015 at 11:14:06AM +0200, Michael Holzheu wrote:
>
> [..]
> > What about the following patch:
> > ---
> > diff --git a/kernel/kexec.c b/kernel/kexec.c
> > index 7a36fdc..7837c4e 100644
> > --- a/kernel/kexec.c
> > +++ b/kernel/kexec.c
> > @@ -1236,10 +1236,68 @@ int kexec_load_disabled;
> >
> > static DEFINE_MUTEX(kexec_mutex);
> >
> > +static int __kexec_load(unsigned long entry, unsigned long nr_segments,
>

[snip]

> > +
> > +failure_unmap_mem:
>
> I don't like this tag "failure_unmap_mem". We are calling this both
> in success path as well as failure path. So why not simply call it "out".

Since the code is better readable now, I'm fine with "out" :-)

>
> > + if (flags & KEXEC_ON_CRASH)
> > + crash_unmap_reserved_pages();
> > + kimage_free(image);
>
> Now kimage_free() is called with kexec_mutex held. Previously that was
> not the case. I hope that's not a problem.

Yes, I noticed that. But also in the original code there is already
one spot where kimage_free() is called under lock:

/*
* In case of crash, new kernel gets loaded in reserved region. It is
* same memory where old crash kernel might be loaded. Free any
* current crash dump kernel before we corrupt it.
*/
if (flags & KEXEC_FILE_ON_CRASH)
kimage_free(xchg(&kexec_crash_image, NULL));

Therefore I thought it should be ok.

Michael