2015-12-02 15:07:16

by Seth Jennings

[permalink] [raw]
Subject: [PATCH 1/3] drivers: memory: clean up section counting

Right now, section_count is calculated in add_memory_block().
However, init_memory_block() increments section_count as well,
which, at first, seems like it would lead to an off-by-one error.
There is no harm done because add_memory_block() immediately overwrites
the mem->section_count, but it is messy.

This commit moves the increment out of the common init_memory_block()
(called by both add_memory_block() and register_new_memory()) and
adds it to register_new_memory().

Signed-off-by: Seth Jennings <[email protected]>
---
drivers/base/memory.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/base/memory.c b/drivers/base/memory.c
index 2804aed..ca2ce02 100644
--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -614,7 +614,6 @@ static int init_memory_block(struct memory_block **memory,
base_memory_block_id(scn_nr) * sections_per_block;
mem->end_section_nr = mem->start_section_nr + sections_per_block - 1;
mem->state = state;
- mem->section_count++;
start_pfn = section_nr_to_pfn(mem->start_section_nr);
mem->phys_device = arch_get_memory_phys_device(start_pfn);

@@ -668,6 +667,7 @@ int register_new_memory(int nid, struct mem_section *section)
ret = init_memory_block(&mem, section, MEM_OFFLINE);
if (ret)
goto out;
+ mem->section_count++;
}

if (mem->section_count == sections_per_block)
--
2.5.0


2015-12-02 15:08:17

by Seth Jennings

[permalink] [raw]
Subject: [PATCH 2/3] drivers: memory: rename remove_memory_block() to remove_memory_section()

The function removes a section, not a block. Rename to reflect
actual functionality.

Signed-off-by: Seth Jennings <[email protected]>
---
drivers/base/memory.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/base/memory.c b/drivers/base/memory.c
index ca2ce02..dd30744 100644
--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -688,7 +688,7 @@ unregister_memory(struct memory_block *memory)
device_unregister(&memory->dev);
}

-static int remove_memory_block(unsigned long node_id,
+static int remove_memory_section(unsigned long node_id,
struct mem_section *section, int phys_device)
{
struct memory_block *mem;
@@ -712,7 +712,7 @@ int unregister_memory_section(struct mem_section *section)
if (!present_section(section))
return -EINVAL;

- return remove_memory_block(0, section, 0);
+ return remove_memory_section(0, section, 0);
}
#endif /* CONFIG_MEMORY_HOTREMOVE */

--
2.5.0

2015-12-02 15:07:20

by Seth Jennings

[permalink] [raw]
Subject: [PATCH 3/3] drivers: memory: prohibit offlining of memory blocks with missing sections

bdee237c and 982792c7 introduced large block sizes for x86.
This made it possible to have multiple sections per memory
block where previously, there was a only every one section
per block.

Since blocks consist of contiguous ranges of section, there
can be holes in the blocks where sections are not present.
If one attempts to offline such a block, a crash occurs since
the code is not designed to deal with this.

This patch is a quick fix to gaurd against the crash by
not allowing blocks with non-present sections to be offlined.

Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=107781
Reported-by: Andrew Banman <[email protected]>
Signed-off-by: Seth Jennings <[email protected]>
---
drivers/base/memory.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/drivers/base/memory.c b/drivers/base/memory.c
index dd30744..6d7b14c 100644
--- a/drivers/base/memory.c
+++ b/drivers/base/memory.c
@@ -303,6 +303,10 @@ static int memory_subsys_offline(struct device *dev)
if (mem->state == MEM_OFFLINE)
return 0;

+ /* Can't offline block with non-present sections */
+ if (mem->section_count != sections_per_block)
+ return -EINVAL;
+
return memory_block_change_state(mem, MEM_OFFLINE, MEM_ONLINE);
}

--
2.5.0

2015-12-02 22:45:58

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH 3/3] drivers: memory: prohibit offlining of memory blocks with missing sections

On Wed, 2 Dec 2015 09:07:01 -0600 Seth Jennings <[email protected]> wrote:

> bdee237c and 982792c7 introduced large block sizes for x86.
> This made it possible to have multiple sections per memory
> block where previously, there was a only every one section
> per block.
>
> Since blocks consist of contiguous ranges of section, there
> can be holes in the blocks where sections are not present.
> If one attempts to offline such a block, a crash occurs since
> the code is not designed to deal with this.
>
> This patch is a quick fix to gaurd against the crash by
> not allowing blocks with non-present sections to be offlined.
>
> ...
>
> --- a/drivers/base/memory.c
> +++ b/drivers/base/memory.c
> @@ -303,6 +303,10 @@ static int memory_subsys_offline(struct device *dev)
> if (mem->state == MEM_OFFLINE)
> return 0;
>
> + /* Can't offline block with non-present sections */
> + if (mem->section_count != sections_per_block)
> + return -EINVAL;
> +
> return memory_block_change_state(mem, MEM_OFFLINE, MEM_ONLINE);
> }

[3/3] fixes a kernel crash so I've tagged it for -stable and shall move
it ahead of [1/2] and [2/2], which are merely cleanups.

This assumes that [3/3] is independent of the other two patches. I'll
eat my hat if it isn't.

2015-12-03 17:58:31

by Andrew Banman

[permalink] [raw]
Subject: [PATCH] drivers: memory: check for missing sections when testing zones

test_pages_in_a_zone does not account for the possibility of missing sections
in the given pfn range. Since pfn_valid_within always returns 1 when
CONFIG_HOLES_IN_ZONE is not set, invalid pfns from missing sections
will pass the test, resulting in a kernel oops. This is remedied by simply
checking for the presence of the pfn's section. We don't have to remove
the pfn_valid_within optimization.

The patch also prevents a crash from offlining memory devices with missing
sections. Despite this, it's probably best to keep

[PATCH 3/3] drivers: memory: prohibit offlining of memory blocks withmissing sections

because missing sections may indicate other problems, like overlapping mem
blocks and who knows what else (see the discussion at BZ 107781).

---
mm/memory_hotplug.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 67d488a..74f5bcd 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1383,6 +1383,9 @@ int test_pages_in_a_zone(unsigned long start_pfn, unsigned long end_pfn)
pfn < end_pfn;
pfn += MAX_ORDER_NR_PAGES) {
i = 0;
+ /* Make sure the memory section is present */
+ if (!present_section_nr(pfn_to_section_nr(pfn)))
+ continue;
/* This is just a CONFIG_HOLES_IN_ZONE check.*/
while ((i < MAX_ORDER_NR_PAGES) && !pfn_valid_within(pfn + i))
i++;
--
1.7.12.4


On 12/02/2015 04:45 PM, Andrew Morton wrote:
> On Wed, 2 Dec 2015 09:07:01 -0600 Seth Jennings <[email protected]> wrote:
>
>> bdee237c and 982792c7 introduced large block sizes for x86.
>> This made it possible to have multiple sections per memory
>> block where previously, there was a only every one section
>> per block.
>>
>> Since blocks consist of contiguous ranges of section, there
>> can be holes in the blocks where sections are not present.
>> If one attempts to offline such a block, a crash occurs since
>> the code is not designed to deal with this.
>>
>> This patch is a quick fix to gaurd against the crash by
>> not allowing blocks with non-present sections to be offlined.
>>
>> ...
>>
>> --- a/drivers/base/memory.c
>> +++ b/drivers/base/memory.c
>> @@ -303,6 +303,10 @@ static int memory_subsys_offline(struct device *dev)
>> if (mem->state == MEM_OFFLINE)
>> return 0;
>>
>> + /* Can't offline block with non-present sections */
>> + if (mem->section_count != sections_per_block)
>> + return -EINVAL;
>> +
>> return memory_block_change_state(mem, MEM_OFFLINE, MEM_ONLINE);
>> }
>
> [3/3] fixes a kernel crash so I've tagged it for -stable and shall move
> it ahead of [1/2] and [2/2], which are merely cleanups.
>
> This assumes that [3/3] is independent of the other two patches. I'll
> eat my hat if it isn't.
>

2015-12-05 00:03:06

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH] drivers: memory: check for missing sections when testing zones

On Thu, 3 Dec 2015 11:58:46 -0600 Andrew Banman <[email protected]> wrote:

> test_pages_in_a_zone does not account for the possibility of missing sections
> in the given pfn range. Since pfn_valid_within always returns 1 when
> CONFIG_HOLES_IN_ZONE is not set, invalid pfns from missing sections
> will pass the test, resulting in a kernel oops. This is remedied by simply
> checking for the presence of the pfn's section. We don't have to remove
> the pfn_valid_within optimization.
>
> The patch also prevents a crash from offlining memory devices with missing
> sections. Despite this, it's probably best to keep
>
> [PATCH 3/3] drivers: memory: prohibit offlining of memory blocks withmissing sections
>
> because missing sections may indicate other problems, like overlapping mem
> blocks and who knows what else (see the discussion at BZ 107781).
>
> ---
> mm/memory_hotplug.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 67d488a..74f5bcd 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -1383,6 +1383,9 @@ int test_pages_in_a_zone(unsigned long start_pfn, unsigned long end_pfn)
> pfn < end_pfn;
> pfn += MAX_ORDER_NR_PAGES) {
> i = 0;
> + /* Make sure the memory section is present */
> + if (!present_section_nr(pfn_to_section_nr(pfn)))
> + continue;
> /* This is just a CONFIG_HOLES_IN_ZONE check.*/
> while ((i < MAX_ORDER_NR_PAGES) && !pfn_valid_within(pfn + i))
> i++;

Please send a Signed-off-by: for this patch.

Your email client is replacing tabs with spaces.

Please confirm that this patch is applicable to current mainline.

Please confirm that this patch is suitable for backporting into -stable
trees.

Thanks.