2018-09-27 14:22:38

by Bjorn Helgaas

[permalink] [raw]
Subject: [PATCH 0/3] find_next_iomem_res() fixes

These fix:

- A kexec off-by-one error that we probably never see in practice (only
affects a resource starting at exactly 640KB).

- A corner case in walk_iomem_res_desc(), walk_system_ram_res(), etc that
we probably also never see in practice (only affects a single byte
resource at the very end of the region we're searching)

- An issue in walk_iomem_res_desc() that apparently causes a kdump issue
(see Lianbo's note at [1]).

I think we need to fix the kdump issue either by these patches
(specifically the last one) or by the patch Lianbo posted [2].

I'm hoping to avoid deciding and merging these myself, but I'm not
sure who really wants to own kernel/resource.c.

[1] https://lore.kernel.org/lkml/[email protected]
[2] https://lore.kernel.org/lkml/[email protected]

---

Bjorn Helgaas (3):
x86/kexec: Correct KEXEC_BACKUP_SRC_END off-by-one error
resource: Include resource end in walk_*() interfaces
resource: Fix find_next_iomem_res() iteration issue


arch/x86/include/asm/kexec.h | 2 -
kernel/resource.c | 96 ++++++++++++++++++------------------------
2 files changed, 43 insertions(+), 55 deletions(-)


2018-09-27 14:22:37

by Bjorn Helgaas

[permalink] [raw]
Subject: [PATCH 1/3] x86/kexec: Correct KEXEC_BACKUP_SRC_END off-by-one error

From: Bjorn Helgaas <[email protected]>

The only use of KEXEC_BACKUP_SRC_END is as an argument to
walk_system_ram_res():

int crash_load_segments(struct kimage *image)
{
...
walk_system_ram_res(KEXEC_BACKUP_SRC_START, KEXEC_BACKUP_SRC_END,
image, determine_backup_region);

walk_system_ram_res() expects "start, end" arguments that are inclusive,
i.e., the range to be walked includes both the start and end addresses.

KEXEC_BACKUP_SRC_END was previously defined as (640 * 1024UL), which is the
first address *past* the desired 0-640KB range.

Define KEXEC_BACKUP_SRC_END as (640 * 1024UL - 1) so the KEXEC_BACKUP_SRC
region is [0-0x9ffff], not [0-0xa0000].

Fixes: dd5f726076cc ("kexec: support for kexec on panic using new system call")
Signed-off-by: Bjorn Helgaas <[email protected]>
---
arch/x86/include/asm/kexec.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/kexec.h b/arch/x86/include/asm/kexec.h
index f327236f0fa7..5125fca472bb 100644
--- a/arch/x86/include/asm/kexec.h
+++ b/arch/x86/include/asm/kexec.h
@@ -67,7 +67,7 @@ struct kimage;

/* Memory to backup during crash kdump */
#define KEXEC_BACKUP_SRC_START (0UL)
-#define KEXEC_BACKUP_SRC_END (640 * 1024UL) /* 640K */
+#define KEXEC_BACKUP_SRC_END (640 * 1024UL - 1) /* 640K */

/*
* CPU does not save ss and sp on stack if execution is already


2018-09-27 14:23:55

by Bjorn Helgaas

[permalink] [raw]
Subject: [PATCH 3/3] resource: Fix find_next_iomem_res() iteration issue

From: Bjorn Helgaas <[email protected]>

Previously find_next_iomem_res() used "*res" as both an input parameter for
the range to search and the type of resource to search for, and an output
parameter for the resource we found, which makes the interface confusing.

The current callers use find_next_iomem_res() incorrectly because they
allocate a single struct resource and use it for repeated calls to
find_next_iomem_res(). When find_next_iomem_res() returns a resource, it
overwrites the start, end, flags, and desc members of the struct. If we
call find_next_iomem_res() again, we must update or restore these fields.
The previous code restored res.start and res.end, but not res.flags or
res.desc.

Since the callers did not restore res.flags, if they searched for flags
IORESOURCE_MEM | IORESOURCE_BUSY and found a resource with flags
IORESOURCE_MEM | IORESOURCE_BUSY | IORESOURCE_SYSRAM, the next search would
incorrectly skip resources unless they were also marked as
IORESOURCE_SYSRAM.

Fix this by restructuring the interface so it takes explicit "start, end,
flags" parameters and uses "*res" only as an output parameter.

Original-patch: http://lore.kernel.org/lkml/[email protected]
Based-on-patch-by: Lianbo Jiang <[email protected]>
Signed-off-by: Bjorn Helgaas <[email protected]>
---
kernel/resource.c | 94 +++++++++++++++++++++++------------------------------
1 file changed, 41 insertions(+), 53 deletions(-)

diff --git a/kernel/resource.c b/kernel/resource.c
index 155ec873ea4d..9891ea90cc8f 100644
--- a/kernel/resource.c
+++ b/kernel/resource.c
@@ -319,23 +319,26 @@ int release_resource(struct resource *old)
EXPORT_SYMBOL(release_resource);

/*
- * Finds the lowest iomem resource existing within [res->start..res->end].
- * The caller must specify res->start, res->end, res->flags, and optionally
- * desc. If found, returns 0, res is overwritten, if not found, returns -1.
- * This function walks the whole tree and not just first level children until
- * and unless first_level_children_only is true.
+ * Finds the lowest iomem resource that covers part of [start..end]. The
+ * caller must specify start, end, flags, and desc (which may be
+ * IORES_DESC_NONE).
+ *
+ * If a resource is found, returns 0 and *res is overwritten with the part
+ * of the resource that's within [start..end]; if none is found, returns
+ * -1.
+ *
+ * This function walks the whole tree and not just first level children
+ * unless first_level_children_only is true.
*/
-static int find_next_iomem_res(struct resource *res, unsigned long desc,
- bool first_level_children_only)
+static int find_next_iomem_res(resource_size_t start, resource_size_t end,
+ unsigned long flags, unsigned long desc,
+ bool first_level_children_only,
+ struct resource *res)
{
- resource_size_t start, end;
struct resource *p;
bool sibling_only = false;

BUG_ON(!res);
-
- start = res->start;
- end = res->end;
BUG_ON(start >= end);

if (first_level_children_only)
@@ -344,7 +347,7 @@ static int find_next_iomem_res(struct resource *res, unsigned long desc,
read_lock(&resource_lock);

for (p = iomem_resource.child; p; p = next_resource(p, sibling_only)) {
- if ((p->flags & res->flags) != res->flags)
+ if ((p->flags & flags) != flags)
continue;
if ((desc != IORES_DESC_NONE) && (desc != p->desc))
continue;
@@ -359,32 +362,31 @@ static int find_next_iomem_res(struct resource *res, unsigned long desc,
read_unlock(&resource_lock);
if (!p)
return -1;
+
/* copy data */
- if (res->start < p->start)
- res->start = p->start;
- if (res->end > p->end)
- res->end = p->end;
+ res->start = max(start, p->start);
+ res->end = min(end, p->end);
res->flags = p->flags;
res->desc = p->desc;
return 0;
}

-static int __walk_iomem_res_desc(struct resource *res, unsigned long desc,
- bool first_level_children_only,
- void *arg,
+static int __walk_iomem_res_desc(resource_size_t start, resource_size_t end,
+ unsigned long flags, unsigned long desc,
+ bool first_level_children_only, void *arg,
int (*func)(struct resource *, void *))
{
- u64 orig_end = res->end;
+ struct resource res;
int ret = -1;

- while ((res->start < res->end) &&
- !find_next_iomem_res(res, desc, first_level_children_only)) {
- ret = (*func)(res, arg);
+ while (start < end &&
+ !find_next_iomem_res(start, end, flags, desc,
+ first_level_children_only, &res)) {
+ ret = (*func)(&res, arg);
if (ret)
break;

- res->start = res->end + 1;
- res->end = orig_end;
+ start = res.end + 1;
}

return ret;
@@ -407,13 +409,7 @@ static int __walk_iomem_res_desc(struct resource *res, unsigned long desc,
int walk_iomem_res_desc(unsigned long desc, unsigned long flags, u64 start,
u64 end, void *arg, int (*func)(struct resource *, void *))
{
- struct resource res;
-
- res.start = start;
- res.end = end;
- res.flags = flags;
-
- return __walk_iomem_res_desc(&res, desc, false, arg, func);
+ return __walk_iomem_res_desc(start, end, flags, desc, false, arg, func);
}
EXPORT_SYMBOL_GPL(walk_iomem_res_desc);

@@ -427,13 +423,9 @@ EXPORT_SYMBOL_GPL(walk_iomem_res_desc);
int walk_system_ram_res(u64 start, u64 end, void *arg,
int (*func)(struct resource *, void *))
{
- struct resource res;
-
- res.start = start;
- res.end = end;
- res.flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
+ unsigned long flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;

- return __walk_iomem_res_desc(&res, IORES_DESC_NONE, true,
+ return __walk_iomem_res_desc(start, end, flags, IORES_DESC_NONE, true,
arg, func);
}

@@ -444,13 +436,9 @@ int walk_system_ram_res(u64 start, u64 end, void *arg,
int walk_mem_res(u64 start, u64 end, void *arg,
int (*func)(struct resource *, void *))
{
- struct resource res;
-
- res.start = start;
- res.end = end;
- res.flags = IORESOURCE_MEM | IORESOURCE_BUSY;
+ unsigned long flags = IORESOURCE_MEM | IORESOURCE_BUSY;

- return __walk_iomem_res_desc(&res, IORES_DESC_NONE, true,
+ return __walk_iomem_res_desc(start, end, flags, IORES_DESC_NONE, true,
arg, func);
}

@@ -464,25 +452,25 @@ int walk_mem_res(u64 start, u64 end, void *arg,
int walk_system_ram_range(unsigned long start_pfn, unsigned long nr_pages,
void *arg, int (*func)(unsigned long, unsigned long, void *))
{
+ resource_size_t start, end;
+ unsigned long flags;
struct resource res;
unsigned long pfn, end_pfn;
- u64 orig_end;
int ret = -1;

- res.start = (u64) start_pfn << PAGE_SHIFT;
- res.end = ((u64)(start_pfn + nr_pages) << PAGE_SHIFT) - 1;
- res.flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
- orig_end = res.end;
- while ((res.start < res.end) &&
- (find_next_iomem_res(&res, IORES_DESC_NONE, true) >= 0)) {
+ start = (u64) start_pfn << PAGE_SHIFT;
+ end = ((u64)(start_pfn + nr_pages) << PAGE_SHIFT) - 1;
+ flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
+ while (start < end &&
+ !find_next_iomem_res(start, end, flags, IORES_DESC_NONE,
+ true, &res)) {
pfn = (res.start + PAGE_SIZE - 1) >> PAGE_SHIFT;
end_pfn = (res.end + 1) >> PAGE_SHIFT;
if (end_pfn > pfn)
ret = (*func)(pfn, end_pfn - pfn, arg);
if (ret)
break;
- res.start = res.end + 1;
- res.end = orig_end;
+ start = res.end + 1;
}
return ret;
}


2018-09-27 14:24:39

by Bjorn Helgaas

[permalink] [raw]
Subject: [PATCH 2/3] resource: Include resource end in walk_*() interfaces

From: Bjorn Helgaas <[email protected]>

find_next_iomem_res() finds an iomem resource that covers part of a range
described by "start, end". All callers expect that range to be inclusive,
i.e., both start and end are included, but find_next_iomem_res() doesn't
handle the end address correctly.

If it finds an iomem resource that contains exactly the end address, it
skips it, e.g., if "start, end" is [0x0-0x10000] and there happens to be an
iomem resource [mem 0x10000-0x10000] (the single byte at 0x10000), we skip
it:

find_next_iomem_res(...)
{
start = 0x0;
end = 0x10000;
for (p = next_resource(...)) {
# p->start = 0x10000;
# p->end = 0x10000;
# we *should* return this resource, but this condition is false:
if ((p->end >= start) && (p->start < end))
break;

Adjust find_next_iomem_res() so it allows a resource that includes the
single byte at the end of the range. This is a corner case that we
probably don't see in practice.

Fixes: 58c1b5b07907 ("[PATCH] memory hotadd fixes: find_next_system_ram catch range fix")
Signed-off-by: Bjorn Helgaas <[email protected]>
---
kernel/resource.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/resource.c b/kernel/resource.c
index 30e1bc68503b..155ec873ea4d 100644
--- a/kernel/resource.c
+++ b/kernel/resource.c
@@ -319,7 +319,7 @@ int release_resource(struct resource *old)
EXPORT_SYMBOL(release_resource);

/*
- * Finds the lowest iomem resource existing within [res->start.res->end).
+ * Finds the lowest iomem resource existing within [res->start..res->end].
* The caller must specify res->start, res->end, res->flags, and optionally
* desc. If found, returns 0, res is overwritten, if not found, returns -1.
* This function walks the whole tree and not just first level children until
@@ -352,7 +352,7 @@ static int find_next_iomem_res(struct resource *res, unsigned long desc,
p = NULL;
break;
}
- if ((p->end >= start) && (p->start < end))
+ if ((p->end >= start) && (p->start <= end))
break;
}



2018-09-28 13:18:08

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH 1/3] x86/kexec: Correct KEXEC_BACKUP_SRC_END off-by-one error

On Thu, Sep 27, 2018 at 09:21:55AM -0500, Bjorn Helgaas wrote:
> From: Bjorn Helgaas <[email protected]>
>
> The only use of KEXEC_BACKUP_SRC_END is as an argument to
> walk_system_ram_res():
>
> int crash_load_segments(struct kimage *image)
> {
> ...
> walk_system_ram_res(KEXEC_BACKUP_SRC_START, KEXEC_BACKUP_SRC_END,
> image, determine_backup_region);
>
> walk_system_ram_res() expects "start, end" arguments that are inclusive,
> i.e., the range to be walked includes both the start and end addresses.
>
> KEXEC_BACKUP_SRC_END was previously defined as (640 * 1024UL), which is the
> first address *past* the desired 0-640KB range.
>
> Define KEXEC_BACKUP_SRC_END as (640 * 1024UL - 1) so the KEXEC_BACKUP_SRC
> region is [0-0x9ffff], not [0-0xa0000].
>
> Fixes: dd5f726076cc ("kexec: support for kexec on panic using new system call")
> Signed-off-by: Bjorn Helgaas <[email protected]>
> ---
> arch/x86/include/asm/kexec.h | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/x86/include/asm/kexec.h b/arch/x86/include/asm/kexec.h
> index f327236f0fa7..5125fca472bb 100644
> --- a/arch/x86/include/asm/kexec.h
> +++ b/arch/x86/include/asm/kexec.h
> @@ -67,7 +67,7 @@ struct kimage;
>
> /* Memory to backup during crash kdump */
> #define KEXEC_BACKUP_SRC_START (0UL)
> -#define KEXEC_BACKUP_SRC_END (640 * 1024UL) /* 640K */
> +#define KEXEC_BACKUP_SRC_END (640 * 1024UL - 1) /* 640K */
>
> /*
> * CPU does not save ss and sp on stack if execution is already

Reviewed-by: Borislav Petkov <[email protected]>

--
Regards/Gruss,
Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)

2018-09-28 13:55:30

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH 2/3] resource: Include resource end in walk_*() interfaces

On Thu, Sep 27, 2018 at 09:22:02AM -0500, Bjorn Helgaas wrote:
> From: Bjorn Helgaas <[email protected]>
>
> find_next_iomem_res() finds an iomem resource that covers part of a range
> described by "start, end". All callers expect that range to be inclusive,
> i.e., both start and end are included, but find_next_iomem_res() doesn't
> handle the end address correctly.
>
> If it finds an iomem resource that contains exactly the end address, it
> skips it, e.g., if "start, end" is [0x0-0x10000] and there happens to be an
> iomem resource [mem 0x10000-0x10000] (the single byte at 0x10000), we skip
> it:
>
> find_next_iomem_res(...)
> {
> start = 0x0;
> end = 0x10000;
> for (p = next_resource(...)) {
> # p->start = 0x10000;
> # p->end = 0x10000;
> # we *should* return this resource, but this condition is false:
> if ((p->end >= start) && (p->start < end))
> break;
>
> Adjust find_next_iomem_res() so it allows a resource that includes the
> single byte at the end of the range. This is a corner case that we
> probably don't see in practice.

This is how one should write commit messages! Thanks for that - it was a
joy - for a change - to read it :-)

>
> Fixes: 58c1b5b07907 ("[PATCH] memory hotadd fixes: find_next_system_ram catch range fix")
> Signed-off-by: Bjorn Helgaas <[email protected]>
> ---
> kernel/resource.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)

Reviewed-by: Borislav Petkov <[email protected]>

> diff --git a/kernel/resource.c b/kernel/resource.c
> index 30e1bc68503b..155ec873ea4d 100644
> --- a/kernel/resource.c
> +++ b/kernel/resource.c
> @@ -319,7 +319,7 @@ int release_resource(struct resource *old)
> EXPORT_SYMBOL(release_resource);
>
> /*
> - * Finds the lowest iomem resource existing within [res->start.res->end).

What I'm still wondering about is, why was it ever even considered to
have a non-inclusive range. Looking at the git history, especially
58c1b5b07907 and 2842f11419704 - it looks like it was an omission and
then users started using it with inclusive ranges.

Thx.

--
Regards/Gruss,
Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)

2018-09-28 16:41:41

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH 3/3] resource: Fix find_next_iomem_res() iteration issue

On Thu, Sep 27, 2018 at 09:22:09AM -0500, Bjorn Helgaas wrote:
> From: Bjorn Helgaas <[email protected]>
>
> Previously find_next_iomem_res() used "*res" as both an input parameter for
> the range to search and the type of resource to search for, and an output
> parameter for the resource we found, which makes the interface confusing.
>
> The current callers use find_next_iomem_res() incorrectly because they
> allocate a single struct resource and use it for repeated calls to
> find_next_iomem_res(). When find_next_iomem_res() returns a resource, it
> overwrites the start, end, flags, and desc members of the struct. If we
> call find_next_iomem_res() again, we must update or restore these fields.
> The previous code restored res.start and res.end, but not res.flags or
> res.desc.

... which is a sure sign that the design of this thing is not the best one.

>
> Since the callers did not restore res.flags, if they searched for flags
> IORESOURCE_MEM | IORESOURCE_BUSY and found a resource with flags
> IORESOURCE_MEM | IORESOURCE_BUSY | IORESOURCE_SYSRAM, the next search would
> incorrectly skip resources unless they were also marked as
> IORESOURCE_SYSRAM.

Nice example!

> Fix this by restructuring the interface so it takes explicit "start, end,
> flags" parameters and uses "*res" only as an output parameter.
>
> Original-patch: http://lore.kernel.org/lkml/[email protected]
> Based-on-patch-by: Lianbo Jiang <[email protected]>
> Signed-off-by: Bjorn Helgaas <[email protected]>
> ---
> kernel/resource.c | 94 +++++++++++++++++++++++------------------------------
> 1 file changed, 41 insertions(+), 53 deletions(-)
>
> diff --git a/kernel/resource.c b/kernel/resource.c
> index 155ec873ea4d..9891ea90cc8f 100644
> --- a/kernel/resource.c
> +++ b/kernel/resource.c
> @@ -319,23 +319,26 @@ int release_resource(struct resource *old)
> EXPORT_SYMBOL(release_resource);
>
> /*

I guess this could be made kernel-doc, since you're touching it...

> - * Finds the lowest iomem resource existing within [res->start..res->end].
> - * The caller must specify res->start, res->end, res->flags, and optionally
> - * desc. If found, returns 0, res is overwritten, if not found, returns -1.
> - * This function walks the whole tree and not just first level children until
> - * and unless first_level_children_only is true.
> + * Finds the lowest iomem resource that covers part of [start..end]. The
> + * caller must specify start, end, flags, and desc (which may be
> + * IORES_DESC_NONE).
> + *
> + * If a resource is found, returns 0 and *res is overwritten with the part
> + * of the resource that's within [start..end]; if none is found, returns
> + * -1.
> + *
> + * This function walks the whole tree and not just first level children
> + * unless first_level_children_only is true.

... and then prepend that with '@' - @first_level_children_only to refer
to the function parameter.

> */
> -static int find_next_iomem_res(struct resource *res, unsigned long desc,
> - bool first_level_children_only)
> +static int find_next_iomem_res(resource_size_t start, resource_size_t end,
> + unsigned long flags, unsigned long desc,
> + bool first_level_children_only,
> + struct resource *res)
> {
> - resource_size_t start, end;
> struct resource *p;
> bool sibling_only = false;
>
> BUG_ON(!res);
> -
> - start = res->start;
> - end = res->end;
> BUG_ON(start >= end);

And since we're touching this, maybe replace that BUG_ON() fun with
simply return -EINVAL or some error code...

>
> if (first_level_children_only)

if (first_level_children_only)
sibling_only = true;

So this is just silly - replacing a bool function param with a local bool
var.

You could kill that, shorten first_level_children_only's name and use it
directly.

Depending on how much cleanup it amounts to, you could make that a
separate cleanup patch ontop, to keep the changes from the cleanup
separate.

> @@ -344,7 +347,7 @@ static int find_next_iomem_res(struct resource *res, unsigned long desc,
> read_lock(&resource_lock);
>
> for (p = iomem_resource.child; p; p = next_resource(p, sibling_only)) {
> - if ((p->flags & res->flags) != res->flags)
> + if ((p->flags & flags) != flags)
> continue;
> if ((desc != IORES_DESC_NONE) && (desc != p->desc))
> continue;
> @@ -359,32 +362,31 @@ static int find_next_iomem_res(struct resource *res, unsigned long desc,
> read_unlock(&resource_lock);
> if (!p)
> return -1;
> +
> /* copy data */
> - if (res->start < p->start)
> - res->start = p->start;
> - if (res->end > p->end)
> - res->end = p->end;
> + res->start = max(start, p->start);
> + res->end = min(end, p->end);

Should we use the min_t and max_t versions here for typechecking?

> res->flags = p->flags;
> res->desc = p->desc;
> return 0;
> }
>
> -static int __walk_iomem_res_desc(struct resource *res, unsigned long desc,
> - bool first_level_children_only,
> - void *arg,
> +static int __walk_iomem_res_desc(resource_size_t start, resource_size_t end,
> + unsigned long flags, unsigned long desc,
> + bool first_level_children_only, void *arg,
> int (*func)(struct resource *, void *))
> {
> - u64 orig_end = res->end;
> + struct resource res;
> int ret = -1;
>
> - while ((res->start < res->end) &&
> - !find_next_iomem_res(res, desc, first_level_children_only)) {
> - ret = (*func)(res, arg);
> + while (start < end &&
> + !find_next_iomem_res(start, end, flags, desc,
> + first_level_children_only, &res)) {
> + ret = (*func)(&res, arg);
> if (ret)
> break;
>
> - res->start = res->end + 1;
> - res->end = orig_end;
> + start = res.end + 1;
> }
>
> return ret;
> @@ -407,13 +409,7 @@ static int __walk_iomem_res_desc(struct resource *res, unsigned long desc,
> int walk_iomem_res_desc(unsigned long desc, unsigned long flags, u64 start,
> u64 end, void *arg, int (*func)(struct resource *, void *))

Align that function's parameters on the opening brace, pls, while you're
at it.

> {
> - struct resource res;
> -
> - res.start = start;
> - res.end = end;
> - res.flags = flags;
> -
> - return __walk_iomem_res_desc(&res, desc, false, arg, func);
> + return __walk_iomem_res_desc(start, end, flags, desc, false, arg, func);
> }
> EXPORT_SYMBOL_GPL(walk_iomem_res_desc);
>
> @@ -427,13 +423,9 @@ EXPORT_SYMBOL_GPL(walk_iomem_res_desc);
> int walk_system_ram_res(u64 start, u64 end, void *arg,
> int (*func)(struct resource *, void *))

Ditto.

> {
> - struct resource res;
> -
> - res.start = start;
> - res.end = end;
> - res.flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
> + unsigned long flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
>
> - return __walk_iomem_res_desc(&res, IORES_DESC_NONE, true,
> + return __walk_iomem_res_desc(start, end, flags, IORES_DESC_NONE, true,
> arg, func);
> }
>
> @@ -444,13 +436,9 @@ int walk_system_ram_res(u64 start, u64 end, void *arg,
> int walk_mem_res(u64 start, u64 end, void *arg,
> int (*func)(struct resource *, void *))
> {
> - struct resource res;
> -
> - res.start = start;
> - res.end = end;
> - res.flags = IORESOURCE_MEM | IORESOURCE_BUSY;
> + unsigned long flags = IORESOURCE_MEM | IORESOURCE_BUSY;
>
> - return __walk_iomem_res_desc(&res, IORES_DESC_NONE, true,
> + return __walk_iomem_res_desc(start, end, flags, IORES_DESC_NONE, true,
> arg, func);
> }
>
> @@ -464,25 +452,25 @@ int walk_mem_res(u64 start, u64 end, void *arg,
> int walk_system_ram_range(unsigned long start_pfn, unsigned long nr_pages,
> void *arg, int (*func)(unsigned long, unsigned long, void *))

Ditto.

With that addressed:

Reviewed-by: Borislav Petkov <[email protected]>

All good stuff and a charm to review, lemme know if I should take them
or you can carry them.

Thanks!

--
Regards/Gruss,
Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)

2018-09-30 09:21:41

by Dave Young

[permalink] [raw]
Subject: Re: [PATCH 1/3] x86/kexec: Correct KEXEC_BACKUP_SRC_END off-by-one error

Hi Bjorn,

On 09/27/18 at 09:21am, Bjorn Helgaas wrote:
> From: Bjorn Helgaas <[email protected]>
>
> The only use of KEXEC_BACKUP_SRC_END is as an argument to
> walk_system_ram_res():
>
> int crash_load_segments(struct kimage *image)
> {
> ...
> walk_system_ram_res(KEXEC_BACKUP_SRC_START, KEXEC_BACKUP_SRC_END,
> image, determine_backup_region);
>
> walk_system_ram_res() expects "start, end" arguments that are inclusive,
> i.e., the range to be walked includes both the start and end addresses.

Looking at the function comment of find_next_iomem_res, the res->end
should be exclusive, am I missing something?


/*
* Finds the lowest iomem resource existing within [res->start.res->end).
* The caller must specify res->start, res->end, res->flags, and optionally
* desc. If found, returns 0, res is overwritten, if not found, returns -1.
* This function walks the whole tree and not just first level children until
* and unless first_level_children_only is true.
*/
static int find_next_iomem_res(struct resource *res, unsigned long desc,
bool first_level_children_only)


>
> KEXEC_BACKUP_SRC_END was previously defined as (640 * 1024UL), which is the
> first address *past* the desired 0-640KB range.
>
> Define KEXEC_BACKUP_SRC_END as (640 * 1024UL - 1) so the KEXEC_BACKUP_SRC
> region is [0-0x9ffff], not [0-0xa0000].
>
> Fixes: dd5f726076cc ("kexec: support for kexec on panic using new system call")
> Signed-off-by: Bjorn Helgaas <[email protected]>
> ---
> arch/x86/include/asm/kexec.h | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/x86/include/asm/kexec.h b/arch/x86/include/asm/kexec.h
> index f327236f0fa7..5125fca472bb 100644
> --- a/arch/x86/include/asm/kexec.h
> +++ b/arch/x86/include/asm/kexec.h
> @@ -67,7 +67,7 @@ struct kimage;
>
> /* Memory to backup during crash kdump */
> #define KEXEC_BACKUP_SRC_START (0UL)
> -#define KEXEC_BACKUP_SRC_END (640 * 1024UL) /* 640K */
> +#define KEXEC_BACKUP_SRC_END (640 * 1024UL - 1) /* 640K */
>
> /*
> * CPU does not save ss and sp on stack if execution is already
>
>
> _______________________________________________
> kexec mailing list
> [email protected]
> http://lists.infradead.org/mailman/listinfo/kexec

Thanks
Dave

2018-09-30 09:29:47

by Dave Young

[permalink] [raw]
Subject: Re: [PATCH 1/3] x86/kexec: Correct KEXEC_BACKUP_SRC_END off-by-one error

On 09/30/18 at 05:21pm, Dave Young wrote:
> Hi Bjorn,
>
> On 09/27/18 at 09:21am, Bjorn Helgaas wrote:
> > From: Bjorn Helgaas <[email protected]>
> >
> > The only use of KEXEC_BACKUP_SRC_END is as an argument to
> > walk_system_ram_res():
> >
> > int crash_load_segments(struct kimage *image)
> > {
> > ...
> > walk_system_ram_res(KEXEC_BACKUP_SRC_START, KEXEC_BACKUP_SRC_END,
> > image, determine_backup_region);
> >
> > walk_system_ram_res() expects "start, end" arguments that are inclusive,
> > i.e., the range to be walked includes both the start and end addresses.
>
> Looking at the function comment of find_next_iomem_res, the res->end
> should be exclusive, am I missing something?

Oops, you fix it in 2nd patch, I apparently miss that.

Since the fix of checking the end is in another patch, probably merge
these two patches so that they are in one patch to avoid break bisect.

Thanks
Dave

Subject: [tip:x86/mm] x86/kexec: Correct KEXEC_BACKUP_SRC_END off-by-one error

Commit-ID: 51fbf14f2528a8c6401290e37f1c893a2412f1d3
Gitweb: https://git.kernel.org/tip/51fbf14f2528a8c6401290e37f1c893a2412f1d3
Author: Bjorn Helgaas <[email protected]>
AuthorDate: Thu, 27 Sep 2018 09:21:55 -0500
Committer: Borislav Petkov <[email protected]>
CommitDate: Tue, 9 Oct 2018 17:18:31 +0200

x86/kexec: Correct KEXEC_BACKUP_SRC_END off-by-one error

The only use of KEXEC_BACKUP_SRC_END is as an argument to
walk_system_ram_res():

int crash_load_segments(struct kimage *image)
{
...
walk_system_ram_res(KEXEC_BACKUP_SRC_START, KEXEC_BACKUP_SRC_END,
image, determine_backup_region);

walk_system_ram_res() expects "start, end" arguments that are inclusive,
i.e., the range to be walked includes both the start and end addresses.

KEXEC_BACKUP_SRC_END was previously defined as (640 * 1024UL), which is the
first address *past* the desired 0-640KB range.

Define KEXEC_BACKUP_SRC_END as (640 * 1024UL - 1) so the KEXEC_BACKUP_SRC
region is [0-0x9ffff], not [0-0xa0000].

Fixes: dd5f726076cc ("kexec: support for kexec on panic using new system call")
Signed-off-by: Bjorn Helgaas <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
CC: "H. Peter Anvin" <[email protected]>
CC: Andrew Morton <[email protected]>
CC: Brijesh Singh <[email protected]>
CC: Greg Kroah-Hartman <[email protected]>
CC: Ingo Molnar <[email protected]>
CC: Lianbo Jiang <[email protected]>
CC: Takashi Iwai <[email protected]>
CC: Thomas Gleixner <[email protected]>
CC: Tom Lendacky <[email protected]>
CC: Vivek Goyal <[email protected]>
CC: [email protected]
CC: [email protected]
CC: [email protected]
CC: [email protected]
CC: [email protected]
Link: http://lkml.kernel.org/r/153805811578.1157.6948388946904655969.stgit@bhelgaas-glaptop.roam.corp.google.com
---
arch/x86/include/asm/kexec.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/kexec.h b/arch/x86/include/asm/kexec.h
index f327236f0fa7..5125fca472bb 100644
--- a/arch/x86/include/asm/kexec.h
+++ b/arch/x86/include/asm/kexec.h
@@ -67,7 +67,7 @@ struct kimage;

/* Memory to backup during crash kdump */
#define KEXEC_BACKUP_SRC_START (0UL)
-#define KEXEC_BACKUP_SRC_END (640 * 1024UL) /* 640K */
+#define KEXEC_BACKUP_SRC_END (640 * 1024UL - 1) /* 640K */

/*
* CPU does not save ss and sp on stack if execution is already

Subject: [tip:x86/mm] resource: Include resource end in walk_*() interfaces

Commit-ID: a98959fdbda1849a01b2150bb635ed559ec06700
Gitweb: https://git.kernel.org/tip/a98959fdbda1849a01b2150bb635ed559ec06700
Author: Bjorn Helgaas <[email protected]>
AuthorDate: Thu, 27 Sep 2018 09:22:02 -0500
Committer: Borislav Petkov <[email protected]>
CommitDate: Tue, 9 Oct 2018 17:18:34 +0200

resource: Include resource end in walk_*() interfaces

find_next_iomem_res() finds an iomem resource that covers part of a range
described by "start, end". All callers expect that range to be inclusive,
i.e., both start and end are included, but find_next_iomem_res() doesn't
handle the end address correctly.

If it finds an iomem resource that contains exactly the end address, it
skips it, e.g., if "start, end" is [0x0-0x10000] and there happens to be an
iomem resource [mem 0x10000-0x10000] (the single byte at 0x10000), we skip
it:

find_next_iomem_res(...)
{
start = 0x0;
end = 0x10000;
for (p = next_resource(...)) {
# p->start = 0x10000;
# p->end = 0x10000;
# we *should* return this resource, but this condition is false:
if ((p->end >= start) && (p->start < end))
break;

Adjust find_next_iomem_res() so it allows a resource that includes the
single byte at the end of the range. This is a corner case that we
probably don't see in practice.

Fixes: 58c1b5b07907 ("[PATCH] memory hotadd fixes: find_next_system_ram catch range fix")
Signed-off-by: Bjorn Helgaas <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
CC: Andrew Morton <[email protected]>
CC: Brijesh Singh <[email protected]>
CC: Dan Williams <[email protected]>
CC: H. Peter Anvin <[email protected]>
CC: Lianbo Jiang <[email protected]>
CC: Takashi Iwai <[email protected]>
CC: Thomas Gleixner <[email protected]>
CC: Tom Lendacky <[email protected]>
CC: Vivek Goyal <[email protected]>
CC: Yaowei Bai <[email protected]>
CC: [email protected]
CC: [email protected]
CC: [email protected]
CC: [email protected]
CC: [email protected]
CC: x86-ml <[email protected]>
Link: http://lkml.kernel.org/r/153805812254.1157.16736368485811773752.stgit@bhelgaas-glaptop.roam.corp.google.com
---
kernel/resource.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/resource.c b/kernel/resource.c
index 30e1bc68503b..155ec873ea4d 100644
--- a/kernel/resource.c
+++ b/kernel/resource.c
@@ -319,7 +319,7 @@ int release_resource(struct resource *old)
EXPORT_SYMBOL(release_resource);

/*
- * Finds the lowest iomem resource existing within [res->start.res->end).
+ * Finds the lowest iomem resource existing within [res->start..res->end].
* The caller must specify res->start, res->end, res->flags, and optionally
* desc. If found, returns 0, res is overwritten, if not found, returns -1.
* This function walks the whole tree and not just first level children until
@@ -352,7 +352,7 @@ static int find_next_iomem_res(struct resource *res, unsigned long desc,
p = NULL;
break;
}
- if ((p->end >= start) && (p->start < end))
+ if ((p->end >= start) && (p->start <= end))
break;
}


Subject: [tip:x86/mm] resource: Fix find_next_iomem_res() iteration issue

Commit-ID: 010a93bf97c72f43aac664d0a685942f83d1a103
Gitweb: https://git.kernel.org/tip/010a93bf97c72f43aac664d0a685942f83d1a103
Author: Bjorn Helgaas <[email protected]>
AuthorDate: Thu, 27 Sep 2018 09:22:09 -0500
Committer: Borislav Petkov <[email protected]>
CommitDate: Tue, 9 Oct 2018 17:18:36 +0200

resource: Fix find_next_iomem_res() iteration issue

Previously find_next_iomem_res() used "*res" as both an input parameter for
the range to search and the type of resource to search for, and an output
parameter for the resource we found, which makes the interface confusing.

The current callers use find_next_iomem_res() incorrectly because they
allocate a single struct resource and use it for repeated calls to
find_next_iomem_res(). When find_next_iomem_res() returns a resource, it
overwrites the start, end, flags, and desc members of the struct. If we
call find_next_iomem_res() again, we must update or restore these fields.
The previous code restored res.start and res.end, but not res.flags or
res.desc.

Since the callers did not restore res.flags, if they searched for flags
IORESOURCE_MEM | IORESOURCE_BUSY and found a resource with flags
IORESOURCE_MEM | IORESOURCE_BUSY | IORESOURCE_SYSRAM, the next search would
incorrectly skip resources unless they were also marked as
IORESOURCE_SYSRAM.

Fix this by restructuring the interface so it takes explicit "start, end,
flags" parameters and uses "*res" only as an output parameter.

Based on a patch by Lianbo Jiang <[email protected]>.

[ bp: While at it:
- make comments kernel-doc style.
-

Originally-by: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Bjorn Helgaas <[email protected]>
Signed-off-by: Borislav Petkov <[email protected]>
CC: Andrew Morton <[email protected]>
CC: Brijesh Singh <[email protected]>
CC: Dan Williams <[email protected]>
CC: H. Peter Anvin <[email protected]>
CC: Lianbo Jiang <[email protected]>
CC: Takashi Iwai <[email protected]>
CC: Thomas Gleixner <[email protected]>
CC: Tom Lendacky <[email protected]>
CC: Vivek Goyal <[email protected]>
CC: Yaowei Bai <[email protected]>
CC: [email protected]
CC: [email protected]
CC: [email protected]
CC: [email protected]
CC: [email protected]
CC: x86-ml <[email protected]>
Link: http://lkml.kernel.org/r/153805812916.1157.177580438135143788.stgit@bhelgaas-glaptop.roam.corp.google.com
---
kernel/resource.c | 96 ++++++++++++++++++++++++-------------------------------
1 file changed, 42 insertions(+), 54 deletions(-)

diff --git a/kernel/resource.c b/kernel/resource.c
index 155ec873ea4d..38b8d11c9eaf 100644
--- a/kernel/resource.c
+++ b/kernel/resource.c
@@ -318,24 +318,27 @@ int release_resource(struct resource *old)

EXPORT_SYMBOL(release_resource);

-/*
- * Finds the lowest iomem resource existing within [res->start..res->end].
- * The caller must specify res->start, res->end, res->flags, and optionally
- * desc. If found, returns 0, res is overwritten, if not found, returns -1.
- * This function walks the whole tree and not just first level children until
- * and unless first_level_children_only is true.
+/**
+ * Finds the lowest iomem resource that covers part of [start..end]. The
+ * caller must specify start, end, flags, and desc (which may be
+ * IORES_DESC_NONE).
+ *
+ * If a resource is found, returns 0 and *res is overwritten with the part
+ * of the resource that's within [start..end]; if none is found, returns
+ * -1.
+ *
+ * This function walks the whole tree and not just first level children
+ * unless @first_level_children_only is true.
*/
-static int find_next_iomem_res(struct resource *res, unsigned long desc,
- bool first_level_children_only)
+static int find_next_iomem_res(resource_size_t start, resource_size_t end,
+ unsigned long flags, unsigned long desc,
+ bool first_level_children_only,
+ struct resource *res)
{
- resource_size_t start, end;
struct resource *p;
bool sibling_only = false;

BUG_ON(!res);
-
- start = res->start;
- end = res->end;
BUG_ON(start >= end);

if (first_level_children_only)
@@ -344,7 +347,7 @@ static int find_next_iomem_res(struct resource *res, unsigned long desc,
read_lock(&resource_lock);

for (p = iomem_resource.child; p; p = next_resource(p, sibling_only)) {
- if ((p->flags & res->flags) != res->flags)
+ if ((p->flags & flags) != flags)
continue;
if ((desc != IORES_DESC_NONE) && (desc != p->desc))
continue;
@@ -359,32 +362,31 @@ static int find_next_iomem_res(struct resource *res, unsigned long desc,
read_unlock(&resource_lock);
if (!p)
return -1;
+
/* copy data */
- if (res->start < p->start)
- res->start = p->start;
- if (res->end > p->end)
- res->end = p->end;
+ res->start = max(start, p->start);
+ res->end = min(end, p->end);
res->flags = p->flags;
res->desc = p->desc;
return 0;
}

-static int __walk_iomem_res_desc(struct resource *res, unsigned long desc,
- bool first_level_children_only,
- void *arg,
+static int __walk_iomem_res_desc(resource_size_t start, resource_size_t end,
+ unsigned long flags, unsigned long desc,
+ bool first_level_children_only, void *arg,
int (*func)(struct resource *, void *))
{
- u64 orig_end = res->end;
+ struct resource res;
int ret = -1;

- while ((res->start < res->end) &&
- !find_next_iomem_res(res, desc, first_level_children_only)) {
- ret = (*func)(res, arg);
+ while (start < end &&
+ !find_next_iomem_res(start, end, flags, desc,
+ first_level_children_only, &res)) {
+ ret = (*func)(&res, arg);
if (ret)
break;

- res->start = res->end + 1;
- res->end = orig_end;
+ start = res.end + 1;
}

return ret;
@@ -407,13 +409,7 @@ static int __walk_iomem_res_desc(struct resource *res, unsigned long desc,
int walk_iomem_res_desc(unsigned long desc, unsigned long flags, u64 start,
u64 end, void *arg, int (*func)(struct resource *, void *))
{
- struct resource res;
-
- res.start = start;
- res.end = end;
- res.flags = flags;
-
- return __walk_iomem_res_desc(&res, desc, false, arg, func);
+ return __walk_iomem_res_desc(start, end, flags, desc, false, arg, func);
}
EXPORT_SYMBOL_GPL(walk_iomem_res_desc);

@@ -427,13 +423,9 @@ EXPORT_SYMBOL_GPL(walk_iomem_res_desc);
int walk_system_ram_res(u64 start, u64 end, void *arg,
int (*func)(struct resource *, void *))
{
- struct resource res;
+ unsigned long flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;

- res.start = start;
- res.end = end;
- res.flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
-
- return __walk_iomem_res_desc(&res, IORES_DESC_NONE, true,
+ return __walk_iomem_res_desc(start, end, flags, IORES_DESC_NONE, true,
arg, func);
}

@@ -444,13 +436,9 @@ int walk_system_ram_res(u64 start, u64 end, void *arg,
int walk_mem_res(u64 start, u64 end, void *arg,
int (*func)(struct resource *, void *))
{
- struct resource res;
+ unsigned long flags = IORESOURCE_MEM | IORESOURCE_BUSY;

- res.start = start;
- res.end = end;
- res.flags = IORESOURCE_MEM | IORESOURCE_BUSY;
-
- return __walk_iomem_res_desc(&res, IORES_DESC_NONE, true,
+ return __walk_iomem_res_desc(start, end, flags, IORES_DESC_NONE, true,
arg, func);
}

@@ -464,25 +452,25 @@ int walk_mem_res(u64 start, u64 end, void *arg,
int walk_system_ram_range(unsigned long start_pfn, unsigned long nr_pages,
void *arg, int (*func)(unsigned long, unsigned long, void *))
{
+ resource_size_t start, end;
+ unsigned long flags;
struct resource res;
unsigned long pfn, end_pfn;
- u64 orig_end;
int ret = -1;

- res.start = (u64) start_pfn << PAGE_SHIFT;
- res.end = ((u64)(start_pfn + nr_pages) << PAGE_SHIFT) - 1;
- res.flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
- orig_end = res.end;
- while ((res.start < res.end) &&
- (find_next_iomem_res(&res, IORES_DESC_NONE, true) >= 0)) {
+ start = (u64) start_pfn << PAGE_SHIFT;
+ end = ((u64)(start_pfn + nr_pages) << PAGE_SHIFT) - 1;
+ flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
+ while (start < end &&
+ !find_next_iomem_res(start, end, flags, IORES_DESC_NONE,
+ true, &res)) {
pfn = (res.start + PAGE_SIZE - 1) >> PAGE_SHIFT;
end_pfn = (res.end + 1) >> PAGE_SHIFT;
if (end_pfn > pfn)
ret = (*func)(pfn, end_pfn - pfn, arg);
if (ret)
break;
- res.start = res.end + 1;
- res.end = orig_end;
+ start = res.end + 1;
}
return ret;
}

2018-10-09 17:31:45

by Bjorn Helgaas

[permalink] [raw]
Subject: Re: [PATCH 3/3] resource: Fix find_next_iomem_res() iteration issue

On Fri, Sep 28, 2018 at 11:42 AM Borislav Petkov <[email protected]> wrote:
>
> On Thu, Sep 27, 2018 at 09:22:09AM -0500, Bjorn Helgaas wrote:
> > From: Bjorn Helgaas <[email protected]>
> >
> > Previously find_next_iomem_res() used "*res" as both an input parameter for
> > the range to search and the type of resource to search for, and an output
> > parameter for the resource we found, which makes the interface confusing.
> >
> > The current callers use find_next_iomem_res() incorrectly because they
> > allocate a single struct resource and use it for repeated calls to
> > find_next_iomem_res(). When find_next_iomem_res() returns a resource, it
> > overwrites the start, end, flags, and desc members of the struct. If we
> > call find_next_iomem_res() again, we must update or restore these fields.
> > The previous code restored res.start and res.end, but not res.flags or
> > res.desc.
>
> ... which is a sure sign that the design of this thing is not the best one.
>
> >
> > Since the callers did not restore res.flags, if they searched for flags
> > IORESOURCE_MEM | IORESOURCE_BUSY and found a resource with flags
> > IORESOURCE_MEM | IORESOURCE_BUSY | IORESOURCE_SYSRAM, the next search would
> > incorrectly skip resources unless they were also marked as
> > IORESOURCE_SYSRAM.
>
> Nice example!
>
> > Fix this by restructuring the interface so it takes explicit "start, end,
> > flags" parameters and uses "*res" only as an output parameter.
> >
> > Original-patch: http://lore.kernel.org/lkml/[email protected]
> > Based-on-patch-by: Lianbo Jiang <[email protected]>
> > Signed-off-by: Bjorn Helgaas <[email protected]>
> > ---
> > kernel/resource.c | 94 +++++++++++++++++++++++------------------------------
> > 1 file changed, 41 insertions(+), 53 deletions(-)
> >
> > diff --git a/kernel/resource.c b/kernel/resource.c
> > index 155ec873ea4d..9891ea90cc8f 100644
> > --- a/kernel/resource.c
> > +++ b/kernel/resource.c
> > @@ -319,23 +319,26 @@ int release_resource(struct resource *old)
> > EXPORT_SYMBOL(release_resource);
> >
> > /*
>
> I guess this could be made kernel-doc, since you're touching it...
>
> > - * Finds the lowest iomem resource existing within [res->start..res->end].
> > - * The caller must specify res->start, res->end, res->flags, and optionally
> > - * desc. If found, returns 0, res is overwritten, if not found, returns -1.
> > - * This function walks the whole tree and not just first level children until
> > - * and unless first_level_children_only is true.
> > + * Finds the lowest iomem resource that covers part of [start..end]. The
> > + * caller must specify start, end, flags, and desc (which may be
> > + * IORES_DESC_NONE).
> > + *
> > + * If a resource is found, returns 0 and *res is overwritten with the part
> > + * of the resource that's within [start..end]; if none is found, returns
> > + * -1.
> > + *
> > + * This function walks the whole tree and not just first level children
> > + * unless first_level_children_only is true.
>
> ... and then prepend that with '@' - @first_level_children_only to refer
> to the function parameter.
>
> > */
> > -static int find_next_iomem_res(struct resource *res, unsigned long desc,
> > - bool first_level_children_only)
> > +static int find_next_iomem_res(resource_size_t start, resource_size_t end,
> > + unsigned long flags, unsigned long desc,
> > + bool first_level_children_only,
> > + struct resource *res)
> > {
> > - resource_size_t start, end;
> > struct resource *p;
> > bool sibling_only = false;
> >
> > BUG_ON(!res);
> > -
> > - start = res->start;
> > - end = res->end;
> > BUG_ON(start >= end);
>
> And since we're touching this, maybe replace that BUG_ON() fun with
> simply return -EINVAL or some error code...
>
> >
> > if (first_level_children_only)
>
> if (first_level_children_only)
> sibling_only = true;
>
> So this is just silly - replacing a bool function param with a local bool
> var.
>
> You could kill that, shorten first_level_children_only's name and use it
> directly.
>
> Depending on how much cleanup it amounts to, you could make that a
> separate cleanup patch ontop, to keep the changes from the cleanup
> separate.
>
> > @@ -344,7 +347,7 @@ static int find_next_iomem_res(struct resource *res, unsigned long desc,
> > read_lock(&resource_lock);
> >
> > for (p = iomem_resource.child; p; p = next_resource(p, sibling_only)) {
> > - if ((p->flags & res->flags) != res->flags)
> > + if ((p->flags & flags) != flags)
> > continue;
> > if ((desc != IORES_DESC_NONE) && (desc != p->desc))
> > continue;
> > @@ -359,32 +362,31 @@ static int find_next_iomem_res(struct resource *res, unsigned long desc,
> > read_unlock(&resource_lock);
> > if (!p)
> > return -1;
> > +
> > /* copy data */
> > - if (res->start < p->start)
> > - res->start = p->start;
> > - if (res->end > p->end)
> > - res->end = p->end;
> > + res->start = max(start, p->start);
> > + res->end = min(end, p->end);
>
> Should we use the min_t and max_t versions here for typechecking?
>
> > res->flags = p->flags;
> > res->desc = p->desc;
> > return 0;
> > }
> >
> > -static int __walk_iomem_res_desc(struct resource *res, unsigned long desc,
> > - bool first_level_children_only,
> > - void *arg,
> > +static int __walk_iomem_res_desc(resource_size_t start, resource_size_t end,
> > + unsigned long flags, unsigned long desc,
> > + bool first_level_children_only, void *arg,
> > int (*func)(struct resource *, void *))
> > {
> > - u64 orig_end = res->end;
> > + struct resource res;
> > int ret = -1;
> >
> > - while ((res->start < res->end) &&
> > - !find_next_iomem_res(res, desc, first_level_children_only)) {
> > - ret = (*func)(res, arg);
> > + while (start < end &&
> > + !find_next_iomem_res(start, end, flags, desc,
> > + first_level_children_only, &res)) {
> > + ret = (*func)(&res, arg);
> > if (ret)
> > break;
> >
> > - res->start = res->end + 1;
> > - res->end = orig_end;
> > + start = res.end + 1;
> > }
> >
> > return ret;
> > @@ -407,13 +409,7 @@ static int __walk_iomem_res_desc(struct resource *res, unsigned long desc,
> > int walk_iomem_res_desc(unsigned long desc, unsigned long flags, u64 start,
> > u64 end, void *arg, int (*func)(struct resource *, void *))
>
> Align that function's parameters on the opening brace, pls, while you're
> at it.
>
> > {
> > - struct resource res;
> > -
> > - res.start = start;
> > - res.end = end;
> > - res.flags = flags;
> > -
> > - return __walk_iomem_res_desc(&res, desc, false, arg, func);
> > + return __walk_iomem_res_desc(start, end, flags, desc, false, arg, func);
> > }
> > EXPORT_SYMBOL_GPL(walk_iomem_res_desc);
> >
> > @@ -427,13 +423,9 @@ EXPORT_SYMBOL_GPL(walk_iomem_res_desc);
> > int walk_system_ram_res(u64 start, u64 end, void *arg,
> > int (*func)(struct resource *, void *))
>
> Ditto.
>
> > {
> > - struct resource res;
> > -
> > - res.start = start;
> > - res.end = end;
> > - res.flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
> > + unsigned long flags = IORESOURCE_SYSTEM_RAM | IORESOURCE_BUSY;
> >
> > - return __walk_iomem_res_desc(&res, IORES_DESC_NONE, true,
> > + return __walk_iomem_res_desc(start, end, flags, IORES_DESC_NONE, true,
> > arg, func);
> > }
> >
> > @@ -444,13 +436,9 @@ int walk_system_ram_res(u64 start, u64 end, void *arg,
> > int walk_mem_res(u64 start, u64 end, void *arg,
> > int (*func)(struct resource *, void *))
> > {
> > - struct resource res;
> > -
> > - res.start = start;
> > - res.end = end;
> > - res.flags = IORESOURCE_MEM | IORESOURCE_BUSY;
> > + unsigned long flags = IORESOURCE_MEM | IORESOURCE_BUSY;
> >
> > - return __walk_iomem_res_desc(&res, IORES_DESC_NONE, true,
> > + return __walk_iomem_res_desc(start, end, flags, IORES_DESC_NONE, true,
> > arg, func);
> > }
> >
> > @@ -464,25 +452,25 @@ int walk_mem_res(u64 start, u64 end, void *arg,
> > int walk_system_ram_range(unsigned long start_pfn, unsigned long nr_pages,
> > void *arg, int (*func)(unsigned long, unsigned long, void *))
>
> Ditto.
>
> With that addressed:
>
> Reviewed-by: Borislav Petkov <[email protected]>
>
> All good stuff and a charm to review, lemme know if I should take them
> or you can carry them.

Sorry, I don't know what happened here because I didn't see these
comments until today. I suspect what happened was that my Gmail
filter auto-filed them in my linux-kernel folder, which I don't read.
On my @google.com account, I have another filter that pull out things
addressed directly to me, which I *do* read. But this thread didn't
cc that account until the tip-bot message, which *did* cc my @google
account because that's how I signed off the patches. Sigh.

Anyway, it looks like this stuff is on its way; let me know
([email protected]) if I should do anything else. I would address
your comments above, but since this seems to be applied and I saw a
cleanup patch from you, I assume you already took care of them.

Bjorn

2018-10-09 17:37:22

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH 3/3] resource: Fix find_next_iomem_res() iteration issue

On Tue, Oct 09, 2018 at 12:30:33PM -0500, Bjorn Helgaas wrote:
> Sorry, I don't know what happened here because I didn't see these
> comments until today. I suspect what happened was that my Gmail
> filter auto-filed them in my linux-kernel folder, which I don't read.
> On my @google.com account, I have another filter that pull out things
> addressed directly to me, which I *do* read. But this thread didn't
> cc that account until the tip-bot message, which *did* cc my @google
> account because that's how I signed off the patches. Sigh.

No worries, it all should be taken care of now! :-)

> Anyway, it looks like this stuff is on its way; let me know
> ([email protected]) if I should do anything else. I would address
> your comments above, but since this seems to be applied and I saw a
> cleanup patch from you, I assume you already took care of them.

Yeah, I think I've covered them all but if you see something out of the
ordinary, let me know so that I can fix it.

Thanks!

--
Regards/Gruss,
Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)

2018-10-15 04:52:35

by Dave Young

[permalink] [raw]
Subject: Re: [PATCH 1/3] x86/kexec: Correct KEXEC_BACKUP_SRC_END off-by-one error

On 09/30/18 at 05:27pm, Dave Young wrote:
> On 09/30/18 at 05:21pm, Dave Young wrote:
> > Hi Bjorn,
> >
> > On 09/27/18 at 09:21am, Bjorn Helgaas wrote:
> > > From: Bjorn Helgaas <[email protected]>
> > >
> > > The only use of KEXEC_BACKUP_SRC_END is as an argument to
> > > walk_system_ram_res():
> > >
> > > int crash_load_segments(struct kimage *image)
> > > {
> > > ...
> > > walk_system_ram_res(KEXEC_BACKUP_SRC_START, KEXEC_BACKUP_SRC_END,
> > > image, determine_backup_region);
> > >
> > > walk_system_ram_res() expects "start, end" arguments that are inclusive,
> > > i.e., the range to be walked includes both the start and end addresses.
> >
> > Looking at the function comment of find_next_iomem_res, the res->end
> > should be exclusive, am I missing something?
>
> Oops, you fix it in 2nd patch, I apparently miss that.
>
> Since the fix of checking the end is in another patch, probably merge
> these two patches so that they are in one patch to avoid break bisect.

Not sure if above comment was missed, Boris, would you mind to fold the
patch 1 and 2?

Thanks
Dave

2018-10-15 11:20:04

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH 1/3] x86/kexec: Correct KEXEC_BACKUP_SRC_END off-by-one error

On Mon, Oct 15, 2018 at 12:51:38PM +0800, Dave Young wrote:
> > Since the fix of checking the end is in another patch, probably merge
> > these two patches so that they are in one patch to avoid break bisect.
>
> Not sure if above comment was missed, Boris, would you mind to fold the
> patch 1 and 2?

Without any further explanation about a potential bisection breaking,
there's nothing I can do.

--
Regards/Gruss,
Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)

2018-10-15 13:44:44

by Bjorn Helgaas

[permalink] [raw]
Subject: Re: [PATCH 1/3] x86/kexec: Correct KEXEC_BACKUP_SRC_END off-by-one error

On Mon, Oct 15, 2018 at 12:51:38PM +0800, Dave Young wrote:
> On 09/30/18 at 05:27pm, Dave Young wrote:
> > On 09/30/18 at 05:21pm, Dave Young wrote:
> > > On 09/27/18 at 09:21am, Bjorn Helgaas wrote:
> > > > From: Bjorn Helgaas <[email protected]>
> > > >
> > > > The only use of KEXEC_BACKUP_SRC_END is as an argument to
> > > > walk_system_ram_res():
> > > >
> > > > int crash_load_segments(struct kimage *image)
> > > > {
> > > > ...
> > > > walk_system_ram_res(KEXEC_BACKUP_SRC_START, KEXEC_BACKUP_SRC_END,
> > > > image, determine_backup_region);
> > > >
> > > > walk_system_ram_res() expects "start, end" arguments that are inclusive,
> > > > i.e., the range to be walked includes both the start and end addresses.
> > >
> > > Looking at the function comment of find_next_iomem_res, the res->end
> > > should be exclusive, am I missing something?
> >
> > Oops, you fix it in 2nd patch, I apparently miss that.
> >
> > Since the fix of checking the end is in another patch, probably merge
> > these two patches so that they are in one patch to avoid break bisect.
>
> Not sure if above comment was missed, Boris, would you mind to fold the
> patch 1 and 2?

Sorry, I did miss this comment.

Patch 2 was for the very specific case of a single-byte resource at
the end address, which we probably never see in practice.

For patch 1, the find_next_iomem_res() function comment had
"[res->start.res->end)", but I think the code actually treated it as
"[res->start.res->end]", so the comment was inaccurate.

Before my patches we had:

#define KEXEC_BACKUP_SRC_START (0UL)
#define KEXEC_BACKUP_SRC_END (640 * 1024UL) # 0xa0000

The intention is to search for system RAM resources that intersect
this region:

[mem 0x0-0x9ffff]

The call is:

walk_system_ram_res(KEXEC_BACKUP_SRC_START, KEXEC_BACKUP_SRC_END,
..., determine_backup_region);
walk_system_ram_res(0, 0xa0000, ..., determine_backup_region);

Assume iomem_resource contains this system RAM resource:

[mem 0x90000-0xaffff]

In find_next_iomem_res(), the "res" input parameter is the region to
search:

res->start = 0; # KEXEC_BACKUP_SRC_START
res->end = 0xa0000; # KEXEC_BACKUP_SRC_END

In one of the loop iterations we find the [mem 0x90000-0xaffff]
resource (p):

p->start = 0x90000;
p->end = 0xaffff;

if (p->start > end) # 0x90000 > 0xa0000? false
if (p->end >= start && p->start < end) # 0xaffff >= 0 ? true
# 0x90000 < 0xa0000 ? true
break; # so we'll return part of "p"

if (res->start < p->start) # 0x0 < 0x90000 ? true
res->start = 0x90000; # trim beginning to p->start
if (res->end > p->end) # 0xa0000 > 0xaffff ? false

So find_next_iomem_res() returns with this:

res->start = 0x90000; # trimmed to p->start
res->end = 0xa0000; # unchanged from input

[mem 0x90000-0xa0000] # returned resource (res)

and we call determine_backup_region(res), which sets:

image->arch.backup_src_start = 0x90000;
image->arch.backup_src_sz = resource_size(res) # 0xa0000 - 0x90000 + 1
# (0x10001)

This is incorrect. What we wanted was the part of [mem 0x90000-0xaffff]
that intersects the first 640K, i.e., [mem 0x90000-0x9ffff], but what
we got was [mem 0x90000-0xa0000], which is one byte too long.

The resource returned find_next_iomem_res() always ends at the
"res->end" supplied as an input parameter *unless* the input res->end
is strictly greater than the p->end, when it is truncated to p->end.

Bottom line, I don't think patches 1 and 2 need to be folded together
because they fix different problems.

Bjorn

2018-10-16 02:52:14

by Dave Young

[permalink] [raw]
Subject: Re: [PATCH 1/3] x86/kexec: Correct KEXEC_BACKUP_SRC_END off-by-one error

On 10/15/18 at 08:44am, Bjorn Helgaas wrote:
> On Mon, Oct 15, 2018 at 12:51:38PM +0800, Dave Young wrote:
> > On 09/30/18 at 05:27pm, Dave Young wrote:
> > > On 09/30/18 at 05:21pm, Dave Young wrote:
> > > > On 09/27/18 at 09:21am, Bjorn Helgaas wrote:
> > > > > From: Bjorn Helgaas <[email protected]>
> > > > >
> > > > > The only use of KEXEC_BACKUP_SRC_END is as an argument to
> > > > > walk_system_ram_res():
> > > > >
> > > > > int crash_load_segments(struct kimage *image)
> > > > > {
> > > > > ...
> > > > > walk_system_ram_res(KEXEC_BACKUP_SRC_START, KEXEC_BACKUP_SRC_END,
> > > > > image, determine_backup_region);
> > > > >
> > > > > walk_system_ram_res() expects "start, end" arguments that are inclusive,
> > > > > i.e., the range to be walked includes both the start and end addresses.
> > > >
> > > > Looking at the function comment of find_next_iomem_res, the res->end
> > > > should be exclusive, am I missing something?
> > >
> > > Oops, you fix it in 2nd patch, I apparently miss that.
> > >
> > > Since the fix of checking the end is in another patch, probably merge
> > > these two patches so that they are in one patch to avoid break bisect.
> >
> > Not sure if above comment was missed, Boris, would you mind to fold the
> > patch 1 and 2?
>
> Sorry, I did miss this comment.
>
> Patch 2 was for the very specific case of a single-byte resource at
> the end address, which we probably never see in practice.
>
> For patch 1, the find_next_iomem_res() function comment had
> "[res->start.res->end)", but I think the code actually treated it as
> "[res->start.res->end]", so the comment was inaccurate.
>
> Before my patches we had:
>
> #define KEXEC_BACKUP_SRC_START (0UL)
> #define KEXEC_BACKUP_SRC_END (640 * 1024UL) # 0xa0000
>
> The intention is to search for system RAM resources that intersect
> this region:
>
> [mem 0x0-0x9ffff]
>
> The call is:
>
> walk_system_ram_res(KEXEC_BACKUP_SRC_START, KEXEC_BACKUP_SRC_END,
> ..., determine_backup_region);
> walk_system_ram_res(0, 0xa0000, ..., determine_backup_region);
>
> Assume iomem_resource contains this system RAM resource:
>
> [mem 0x90000-0xaffff]
>
> In find_next_iomem_res(), the "res" input parameter is the region to
> search:
>
> res->start = 0; # KEXEC_BACKUP_SRC_START
> res->end = 0xa0000; # KEXEC_BACKUP_SRC_END
>
> In one of the loop iterations we find the [mem 0x90000-0xaffff]
> resource (p):
>
> p->start = 0x90000;
> p->end = 0xaffff;
>
> if (p->start > end) # 0x90000 > 0xa0000? false
> if (p->end >= start && p->start < end) # 0xaffff >= 0 ? true
> # 0x90000 < 0xa0000 ? true
> break; # so we'll return part of "p"
>
> if (res->start < p->start) # 0x0 < 0x90000 ? true
> res->start = 0x90000; # trim beginning to p->start
> if (res->end > p->end) # 0xa0000 > 0xaffff ? false
>
> So find_next_iomem_res() returns with this:
>
> res->start = 0x90000; # trimmed to p->start
> res->end = 0xa0000; # unchanged from input
>
> [mem 0x90000-0xa0000] # returned resource (res)
>
> and we call determine_backup_region(res), which sets:
>
> image->arch.backup_src_start = 0x90000;
> image->arch.backup_src_sz = resource_size(res) # 0xa0000 - 0x90000 + 1
> # (0x10001)
>
> This is incorrect. What we wanted was the part of [mem 0x90000-0xaffff]
> that intersects the first 640K, i.e., [mem 0x90000-0x9ffff], but what
> we got was [mem 0x90000-0xa0000], which is one byte too long.
>
> The resource returned find_next_iomem_res() always ends at the
> "res->end" supplied as an input parameter *unless* the input res->end
> is strictly greater than the p->end, when it is truncated to p->end.
>
> Bottom line, I don't think patches 1 and 2 need to be folded together
> because they fix different problems.
>
> Bjorn

Bjorn, thanks for the detail explanations, it is very clear now to me.
Indeed 2nd patch is for different issue, please ignore my comment :)

For the series:
Reviewed-by: Dave Young <[email protected]>

Thanks
Dave