2013-03-07 16:32:04

by Dave Hansen

[permalink] [raw]
Subject: [PATCH] do not try to sync identity map for non-mapped pages


The original bug reporter says this fixes it for him, so I'm
broadening the cc list a bit. I assume this should just get
sucked in to the x86 tree.

The double-signed-off-by from my is because my IBM email is
going away very shortly.

--

kernel_map_sync_memtype() is called from a variety of contexts. The
pat.c code that calls it seems to ensure that it is not called for
non-ram areas by checking via pat_pagerange_is_ram(). It is important
that it only be called on the actual identity map because there *IS*
no map to sync for highmem pages, or for memory holes.

The ioremap.c uses are not as careful as those from pat.c, and call
kernel_map_sync_memtype() on PCI space which is in the middle of the
kernel identity map _range_, but is not actually mapped.

This patch adds a check to kernel_map_sync_memtype() which probably
duplicates some of the checks already in pat.c. But, it is necessary
for the ioremap.c uses and shouldn't hurt other callers.

I have reproduced this bug and this patch fixes it for me and the
original bug reporter:

https://lkml.org/lkml/2013/2/5/396

Signed-off-by: Dave Hansen <[email protected]>
Signed-off-by: Dave Hansen <[email protected]>
Tested-by: Tetsuo Handa <[email protected]>
---

linux-2.6.git-dave/arch/x86/mm/pat.c | 7 +++++++
1 file changed, 7 insertions(+)

diff -puN arch/x86/mm/pat.c~dont-flush-map-for-non-ram-pages arch/x86/mm/pat.c
--- linux-2.6.git/arch/x86/mm/pat.c~dont-flush-map-for-non-ram-pages 2013-03-07 08:14:10.065558743 -0800
+++ linux-2.6.git-dave/arch/x86/mm/pat.c 2013-03-07 08:14:10.069558781 -0800
@@ -563,6 +563,13 @@ int kernel_map_sync_memtype(u64 base, un
if (base > __pa(high_memory-1))
return 0;

+ /*
+ * some areas in the middle of the kernel identity range
+ * are not mapped, like the PCI space.
+ */
+ if (!page_is_ram(base >> PAGE_SHIFT))
+ return 0;
+
id_sz = (__pa(high_memory-1) <= base + size) ?
__pa(high_memory) - base :
size;
_


Subject: [tip:x86/urgent] x86: Do not try to sync identity map for non-mapped pages

Commit-ID: 60f583d56aa515b896a9d94f860f52640c1e8a75
Gitweb: http://git.kernel.org/tip/60f583d56aa515b896a9d94f860f52640c1e8a75
Author: Dave Hansen <[email protected]>
AuthorDate: Thu, 7 Mar 2013 08:31:51 -0800
Committer: H. Peter Anvin <[email protected]>
CommitDate: Thu, 7 Mar 2013 13:23:28 -0800

x86: Do not try to sync identity map for non-mapped pages

kernel_map_sync_memtype() is called from a variety of contexts. The
pat.c code that calls it seems to ensure that it is not called for
non-ram areas by checking via pat_pagerange_is_ram(). It is important
that it only be called on the actual identity map because there *IS*
no map to sync for highmem pages, or for memory holes.

The ioremap.c uses are not as careful as those from pat.c, and call
kernel_map_sync_memtype() on PCI space which is in the middle of the
kernel identity map _range_, but is not actually mapped.

This patch adds a check to kernel_map_sync_memtype() which probably
duplicates some of the checks already in pat.c. But, it is necessary
for the ioremap.c uses and shouldn't hurt other callers.

I have reproduced this bug and this patch fixes it for me and the
original bug reporter:

https://lkml.org/lkml/2013/2/5/396

Signed-off-by: Dave Hansen <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Dave Hansen <[email protected]>
Tested-by: Tetsuo Handa <[email protected]>
Signed-off-by: H. Peter Anvin <[email protected]>
---
arch/x86/mm/pat.c | 7 +++++++
1 file changed, 7 insertions(+)

diff --git a/arch/x86/mm/pat.c b/arch/x86/mm/pat.c
index 2610bd9..6574388 100644
--- a/arch/x86/mm/pat.c
+++ b/arch/x86/mm/pat.c
@@ -563,6 +563,13 @@ int kernel_map_sync_memtype(u64 base, unsigned long size, unsigned long flags)
if (base > __pa(high_memory-1))
return 0;

+ /*
+ * some areas in the middle of the kernel identity range
+ * are not mapped, like the PCI space.
+ */
+ if (!page_is_ram(base >> PAGE_SHIFT))
+ return 0;
+
id_sz = (__pa(high_memory-1) <= base + size) ?
__pa(high_memory) - base :
size;

2013-03-07 22:05:25

by Tetsuo Handa

[permalink] [raw]
Subject: Re: [PATCH] do not try to sync identity map for non-mapped pages

Dave Hansen wrote:
>
> The original bug reporter says this fixes it for him, so I'm
> broadening the cc list a bit. I assume this should just get
> sucked in to the x86 tree.
>
> The double-signed-off-by from my is because my IBM email is
> going away very shortly.
>
> --
>
> kernel_map_sync_memtype() is called from a variety of contexts. The
> pat.c code that calls it seems to ensure that it is not called for
> non-ram areas by checking via pat_pagerange_is_ram(). It is important
> that it only be called on the actual identity map because there *IS*
> no map to sync for highmem pages, or for memory holes.
>
> The ioremap.c uses are not as careful as those from pat.c, and call
> kernel_map_sync_memtype() on PCI space which is in the middle of the
> kernel identity map _range_, but is not actually mapped.
>
> This patch adds a check to kernel_map_sync_memtype() which probably
> duplicates some of the checks already in pat.c. But, it is necessary
> for the ioremap.c uses and shouldn't hurt other callers.
>
> I have reproduced this bug and this patch fixes it for me and the
> original bug reporter:
>
> https://lkml.org/lkml/2013/2/5/396
>

Excuse me, but I didn't realize that the link is wrong.

https://lkml.org/lkml/2013/2/5/396 is a bug in CONFIG_MICROCODE_INTEL_EARLY=y
&& CONFIG_64BIT=n && CONFIG_DEBUG_VIRTUAL=y where patches are not available.

https://lkml.org/lkml/2013/3/5/314 is a bug in CONFIG_ACPI=y &&
CONFIG_DEBUG_VIRTUAL=y where your patch fixes.

Please use https://lkml.org/lkml/2013/3/5/314 rather than
https://lkml.org/lkml/2013/2/5/396 in your patch.

Regards.

2013-03-07 22:14:02

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [PATCH] do not try to sync identity map for non-mapped pages

On 03/07/2013 02:05 PM, Tetsuo Handa wrote:
>
> Excuse me, but I didn't realize that the link is wrong.
>
> https://lkml.org/lkml/2013/2/5/396 is a bug in CONFIG_MICROCODE_INTEL_EARLY=y
> && CONFIG_64BIT=n && CONFIG_DEBUG_VIRTUAL=y where patches are not available.
>
> https://lkml.org/lkml/2013/3/5/314 is a bug in CONFIG_ACPI=y &&
> CONFIG_DEBUG_VIRTUAL=y where your patch fixes.
>
> Please use https://lkml.org/lkml/2013/3/5/314 rather than
> https://lkml.org/lkml/2013/2/5/396 in your patch.
>

*Grump* I of course already committed it and other things on top.

We shouldn't really use lkml.org links anyway since if that site
disappears those numbers are meaningless; rather, using Message-IDs
(e.g. as embedded in lkml.kernel.org links) is better...

-hpa

2013-04-07 13:33:57

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH] do not try to sync identity map for non-mapped pages

Hey Dave,

On Thu, Mar 07, 2013 at 08:31:51AM -0800, Dave Hansen wrote:
>
> The original bug reporter says this fixes it for him, so I'm
> broadening the cc list a bit. I assume this should just get
> sucked in to the x86 tree.

looks like we haven't whacked all the moles - I keep seeing this when
testing 32-bit builds in qemu on latest Linus + tip. I'd guess this is
still that /dev/mem accessing thing called wdm.

I'm still wondering though whether we should BUG_ON on a /dev/mem
access?

I've added debug output to show why we're triggering:

[ 471.102902] slow_virt_to_phys((void *)x): 0x0, phys_addr: 0x37bfe000
[ 471.119500] ------------[ cut here ]------------
[ 471.119500] kernel BUG at arch/x86/mm/physaddr.c:85!
[ 471.119500] invalid opcode: 0000 [#1] PREEMPT SMP
[ 471.119500] Modules linked in:
[ 471.119500] Pid: 1571, comm: wdm Not tainted 3.9.0-rc5+ #42 Bochs Bochs
[ 471.119500] EIP: 0060:[<c1032f56>] EFLAGS: 00000206 CPU: 0
[ 471.119500] EIP is at __phys_addr+0x86/0xb0
[ 471.119500] EAX: 37bfe000 EBX: 37bfe000 ECX: 00000001 EDX: 37bfe000
[ 471.119500] ESI: f7bfe000 EDI: 00002000 EBP: f67f1f3c ESP: f67f1f28
[ 471.119500] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068
[ 471.119500] CR0: 8005003b CR2: bfeb12d4 CR3: 35edd000 CR4: 000006f0
[ 471.119500] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[ 471.119500] DR6: 00000000 DR7: 00000000
[ 471.119500] Process wdm (pid: 1571, ti=f67f0000 task=f5c90000 task.ti=f67f0000)
[ 471.119500] Stack:
[ 471.119500] c16b1f2c 00000000 37bfe000 00000000 00002000 f67f1f64 c131d074 00002000
[ 471.119500] 00000246 bfeb12ec 00000000 00000000 f67a6c40 c131d040 00002000 f67f1f8c
[ 471.119500] c1129c85 f67f1f98 f67f0000 f5ebe864 c131d040 00000020 f67a6c40 00000000
[ 471.119500] Call Trace:
[ 471.119500] [<c131d074>] read_mem+0x34/0x100
[ 471.119500] [<c131d040>] ? write_mem+0x110/0x110
[ 471.119500] [<c1129c85>] vfs_read+0x85/0x130
[ 471.119500] [<c131d040>] ? write_mem+0x110/0x110
[ 471.119500] [<c1129e87>] sys_read+0x47/0xa0
[ 471.119500] [<c1546e5e>] sysenter_do_call+0x12/0x36
[ 471.119500] Code: 0b a1 88 ae 0c c2 05 00 00 80 00 39 c6 72 bb a1 ac 1a 76 c1 2d 00 a0 3e 00 25 00 00 e0 ff 2d 00 20 00 00 39 c6 73 a3 0f 0b 0f 0b <0f> 0b 89 f0 e8 41 ca ff ff 89 5c 24 08 89 44 24 04 c7 04 24 2c
[ 471.119500] EIP: [<c1032f56>] __phys_addr+0x86/0xb0 SS:ESP 0068:f67f1f28
[ 471.508967] ---[ end trace 5fc00ac35d61284a ]---

Hmmm.

--
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

2013-04-07 16:34:10

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [PATCH] do not try to sync identity map for non-mapped pages

On 04/07/2013 06:33 AM, Borislav Petkov wrote:
>
> looks like we haven't whacked all the moles - I keep seeing this when
> testing 32-bit builds in qemu on latest Linus + tip. I'd guess this is
> still that /dev/mem accessing thing called wdm.
>
> I'm still wondering though whether we should BUG_ON on a /dev/mem
> access?
>

We shouldn't, no. /dev/mem really needs to be fixed along a bunch of
axes. Yes, it is privileged and extra creepy, but it should either work
or it should fail cleanly.

-hpa

2013-04-07 17:25:55

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH] do not try to sync identity map for non-mapped pages

On Sun, Apr 07, 2013 at 09:34:07AM -0700, H. Peter Anvin wrote:
> We shouldn't, no. /dev/mem really needs to be fixed along a bunch of
> axes. Yes, it is privileged and extra creepy, but it should either
> work or it should fail cleanly.

Can't we filter out accesses through /dev/mem and not BUG_ON in
__phys_addr() if they come through /dev/mem? Simply fail cleanly. No
idea whether we'll break something still reading /dev/mem though.

--
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

2013-04-08 20:33:13

by Dave Hansen

[permalink] [raw]
Subject: Re: [PATCH] do not try to sync identity map for non-mapped pages

On 04/07/2013 09:34 AM, H. Peter Anvin wrote:
> On 04/07/2013 06:33 AM, Borislav Petkov wrote:
>> looks like we haven't whacked all the moles - I keep seeing this when
>> testing 32-bit builds in qemu on latest Linus + tip. I'd guess this is
>> still that /dev/mem accessing thing called wdm.
>>
>> I'm still wondering though whether we should BUG_ON on a /dev/mem
>> access?
>
> We shouldn't, no. /dev/mem really needs to be fixed along a bunch of
> axes. Yes, it is privileged and extra creepy, but it should either work
> or it should fail cleanly.

I've got a set sitting around collecting dust that I've got to post. It
should at least restore sanity for /dev/mem on x86. I'll have it out
for some initial review tomorrow.