2021-06-11 04:23:31

by Yaohui Wang

[permalink] [raw]
Subject: [PATCH v2 0/2] mm: unexpected behavior of __ioremap_caller

Due to some boundary calculation & boundary judgment issues,
__ioremap_caller may not work as expected. This may raise the risk of
misusing ioremap_xxx functions, and further, incur terrible performance
issues.

For example, suppose [phys_addr ~ phys_addr + PAGE_SIZE] is normal RAM.
Calling ioremap(phys_addr, PAGE_SIZE-1) will succeed (but it should not).
This will set the cache flag of the phys_addr's directing mapping pte to
be PCD. What's worse, iounmap won't revert this cache flag in the
directing mapping. So the pte in the directing mapping keeps polluted
until we use workarounds (calling ioremap_cache on phys_addr) to fix the
cache bit. If there is important data/code in the polluted page, which is
accessed frequently, then the performance of the machine will drop
terribly.

Here are two patches addressing this issue. The first address the boundary
calculation issue and the second one addresses the boundary judgment
issue.

Yahui Wang (2):
mm: fix pfn calculation mistake in __ioremap_check_ram
mm: fix boundary judgement issues

arch/x86/mm/ioremap.c | 4 ++--
kernel/resource.c | 4 ++--
2 files changed, 4 insertions(+), 4 deletions(-)

--
2.25.1


2021-06-11 04:25:13

by Yaohui Wang

[permalink] [raw]
Subject: [PATCH v2 1/2] mm: fix the pfn calculation mistake in __ioremap_check_ram

In arch/x86/mm/ioremap.c:__ioremap_check_ram, the original pfn wrapping
calculation may cause the pfn range to ignore the very start page, if
res->start is not page-aligned, or the very end page, if res->end is not
page aligned.

So start_pfn should wrap down the res->start address, and end_pfn should
wrap up the res->end address. This makes the pfn range completely
contain [res->start, res->end] ram range. This check is more strict and is
more reasonable.

Signed-off-by: Ben Luo <[email protected]>
Signed-off-by: Yahui Wang <[email protected]>
---
arch/x86/mm/ioremap.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
index 9e5ccc56f..79adf0d2d 100644
--- a/arch/x86/mm/ioremap.c
+++ b/arch/x86/mm/ioremap.c
@@ -74,8 +74,8 @@ static unsigned int __ioremap_check_ram(struct resource *res)
if ((res->flags & IORESOURCE_SYSTEM_RAM) != IORESOURCE_SYSTEM_RAM)
return 0;

- start_pfn = (res->start + PAGE_SIZE - 1) >> PAGE_SHIFT;
- stop_pfn = (res->end + 1) >> PAGE_SHIFT;
+ start_pfn = res->start >> PAGE_SHIFT;
+ stop_pfn = (res->end + PAGE_SIZE) >> PAGE_SHIFT;
if (stop_pfn > start_pfn) {
for (i = 0; i < (stop_pfn - start_pfn); ++i)
if (pfn_valid(start_pfn + i) &&
--
2.25.1

2021-06-11 04:25:30

by Yaohui Wang

[permalink] [raw]
Subject: [PATCH v2 2/2] mm: fix boundary judgment issues in kernel/resource.c

The original boundary judgment may ignore @end if @end equals @start. For
example, if we call ioremap(phys, 1), then @end == @start, and the memory
check will not be applied on the page where @end lives, which is
unexpected.

In kernel/resource.c:find_next_iomem_res, the mem region is a closed
interval (i.e. [@start..@end]). So @start == @end should be allowed.

In kernel/resource.c:__walk_iomem_res_desc, the mem region is a closed
interval (i.e. [@start..@end]). So @start == @end should be allowed.

Signed-off-by: Ben Luo <[email protected]>
Signed-off-by: Yahui Wang <[email protected]>
---
kernel/resource.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/resource.c b/kernel/resource.c
index 16e0c7e8e..b29c8c720 100644
--- a/kernel/resource.c
+++ b/kernel/resource.c
@@ -353,7 +353,7 @@ static int find_next_iomem_res(resource_size_t start, resource_size_t end,
if (!res)
return -EINVAL;

- if (start >= end)
+ if (start > end)
return -EINVAL;

read_lock(&resource_lock);
@@ -408,7 +408,7 @@ static int __walk_iomem_res_desc(resource_size_t start, resource_size_t end,
struct resource res;
int ret = -EINVAL;

- while (start < end &&
+ while (start <= end &&
!find_next_iomem_res(start, end, flags, desc, first_lvl, &res)) {
ret = (*func)(&res, arg);
if (ret)
--
2.25.1

2021-06-19 21:24:40

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH v2 1/2] mm: fix the pfn calculation mistake in __ioremap_check_ram

Yaohui!

On Fri, Jun 11 2021 at 12:21, Yaohui Wang wrote:

A few formal things upfront. The prefix of the subject is incorrect. It
should be "x86/ioremap:" git log $FILE helps to figure that out.

Looking at the Signed-off-by chain below this misses either a

From: Ben Luo <[email protected]>

right at the top of the changelog or a Co-Developed-by tag. See
Documentation/process/

> In arch/x86/mm/ioremap.c:__ioremap_check_ram, the original pfn
> wrapping

Just "In __ioremap_check_ram() ..." please. The file name is
uninteresting and we want the '()' at the end of the symbol so it's
obvious that this is a function.

> calculation may cause the pfn range to ignore the very start page, if
> res->start is not page-aligned, or the very end page, if res->end is not
> page aligned.
>
> So start_pfn should wrap down the res->start address, and end_pfn should
> wrap up the res->end address. This makes the pfn range completely
> contain [res->start, res->end] ram range. This check is more strict and is
> more reasonable.

This lacks a "Fixes:" tag

> Signed-off-by: Ben Luo <[email protected]>
> Signed-off-by: Yahui Wang <[email protected]>
> ---
> arch/x86/mm/ioremap.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/mm/ioremap.c b/arch/x86/mm/ioremap.c
> index 9e5ccc56f..79adf0d2d 100644
> --- a/arch/x86/mm/ioremap.c
> +++ b/arch/x86/mm/ioremap.c
> @@ -74,8 +74,8 @@ static unsigned int __ioremap_check_ram(struct resource *res)
> if ((res->flags & IORESOURCE_SYSTEM_RAM) != IORESOURCE_SYSTEM_RAM)
> return 0;
>
> - start_pfn = (res->start + PAGE_SIZE - 1) >> PAGE_SHIFT;
> - stop_pfn = (res->end + 1) >> PAGE_SHIFT;
> + start_pfn = res->start >> PAGE_SHIFT;
> + stop_pfn = (res->end + PAGE_SIZE) >> PAGE_SHIFT;

Please make that:

start_pfn = PFN_DOWN(res->start);
stop_pfn = PFN_UP(res->end);

which gives you the first and the last PFN of that range. That obviously
requires to fix the below as well, but that code is unreadable anyway.

> if (stop_pfn > start_pfn) {
> for (i = 0; i < (stop_pfn - start_pfn); ++i)
> if (pfn_valid(start_pfn + i) &&

npages = stop_pfn - start_pfn + 1;
for (i = 0; i < npages; i++) {
if (.....)
}

you get the idea, right?

Thanks,

tglx

2021-06-19 22:18:09

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH v2 2/2] mm: fix boundary judgment issues in kernel/resource.c

Yaohui!

On Fri, Jun 11 2021 at 12:21, Yaohui Wang wrote:

The same formal issues as with patch #1

> The original boundary judgment may ignore @end if @end equals @start. For

May means it can but it must not. But this is not the case here. end
equals start is always ignored.

Also 'original' is meaningless here. Before the patch is applied the
code is that way.

find_next_iomem_res() and __walk_iomem_res_desc() require that the
provided end address is larger than the start address, which ...


> example, if we call ioremap(phys, 1), then @end == @start, and the memory
> check will not be applied on the page where @end lives, which is
> unexpected.

Please avoid 'we' and 'I':

is incorrect when ioremap() is invoked with length=1.

> In kernel/resource.c:find_next_iomem_res, the mem region is a closed

See the reply to #1 vs. function names. Also please write out 'memory',
there is no shortage of space in change logs.

> interval (i.e. [@start..@end]). So @start == @end should be allowed.

closed interval reads strange. The usual terminology is: The end address
is inclusive.

Resources are described with the start address and the inclusive end
address, which means for a resource with 1 byte length the start
address is the same as the end address.

find_next_iomem_res() and __walk_iomem_res_desc() ignore resources
with 1 byte length, which prevents that ioremap(phys, 1) is checked
whether it touches non ioremappable resources.

...

Thanks,

tglx

2021-06-21 06:14:28

by Yaohui Wang

[permalink] [raw]
Subject: Re: [PATCH v2 2/2] mm: fix boundary judgment issues in kernel/resource.c

Hi, Thomas

Thanks for your detailed reply, and your patience for a kernel newbie.

I'll carefully address the formal issues in the next version of patch.


Thanks,

Yaohui

On 2021/6/20 06:16, Thomas Gleixner wrote:
> Yaohui!
>
> On Fri, Jun 11 2021 at 12:21, Yaohui Wang wrote:
>
> The same formal issues as with patch #1
>
>> The original boundary judgment may ignore @end if @end equals @start. For
>
> May means it can but it must not. But this is not the case here. end
> equals start is always ignored.
>
> Also 'original' is meaningless here. Before the patch is applied the
> code is that way.
>
> find_next_iomem_res() and __walk_iomem_res_desc() require that the
> provided end address is larger than the start address, which ...
>
>
>> example, if we call ioremap(phys, 1), then @end == @start, and the memory
>> check will not be applied on the page where @end lives, which is
>> unexpected.
>
> Please avoid 'we' and 'I':
>
> is incorrect when ioremap() is invoked with length=1.
>
>> In kernel/resource.c:find_next_iomem_res, the mem region is a closed
>
> See the reply to #1 vs. function names. Also please write out 'memory',
> there is no shortage of space in change logs.
>
>> interval (i.e. [@start..@end]). So @start == @end should be allowed.
>
> closed interval reads strange. The usual terminology is: The end address
> is inclusive.
>
> Resources are described with the start address and the inclusive end
> address, which means for a resource with 1 byte length the start
> address is the same as the end address.
>
> find_next_iomem_res() and __walk_iomem_res_desc() ignore resources
> with 1 byte length, which prevents that ioremap(phys, 1) is checked
> whether it touches non ioremappable resources.
>
> ...
>
> Thanks,
>
> tglx
>

2021-06-21 08:07:46

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH v2 1/2] mm: fix the pfn calculation mistake in __ioremap_check_ram

On Sat, Jun 19 2021 at 23:22, Thomas Gleixner wrote:
> Please make that:
>
> start_pfn = PFN_DOWN(res->start);
> stop_pfn = PFN_UP(res->end);

That should obviously be PFN_DOWN(res->end) as well.