2008-01-16 02:41:23

by Pallipadi, Venkatesh

Subject: [patch 0/4] x86: PAT followup - Incremental changes and bug fixes

Some incremental changes and bug fixes for the PAT patchset. The changes come
from the feedback we received earlier. A few more pending changes will
follow soon.

Thanks,
Venki
--


2008-01-16 07:31:17

by Ingo Molnar

Subject: Re: [patch 0/4] x86: PAT followup - Incremental changes and bug fixes


* Pallipadi, Venkatesh wrote:

> Some incremental changes and bug fixes for PAT patchset. The changes
> are from the feedback we received earlier. There are few more pending
> changes that will follow soon.

thanks, applied them to x86.git.

Note that PAT is still hardcoded to disabled in arch/x86/mm/pat.c:

int __read_mostly pat_disabled = 1;

because one of my testsystems failed during bootup. I'll re-check
whether these fixes resolve that, and if it passes then we could enable
PAT.

Ingo

by Andreas Herrmann

Subject: Re: [patch 0/4] x86: PAT followup - Incremental changes and bug fixes

Hi,

I just want to report that the PAT support in x86/mm causes crashes
on two of my test machines. On both boxes the SATA detection does
not work when the PAT support is patched into the kernel.

Symptoms are as follows -- best described by a diff between the
two boot.logs:

# diff boot-failing.log boot-working.log

-Linux version 2.6.24-rc8-ga9f7faa5 (root@hunter) (gcc version ...
+Linux version 2.6.24-rc8-g2ea3cf43 (root@hunter) (gcc version ...
...
early_iounmap(ffffffff82a0b000, 00001000)
-early_ioremap(000000000000c000, 00001000) => -000002103394304
-early_iounmap(ffffffff82a0c000, 00001000)
early_iounmap(ffffffff82808000, 00001000)
...
-ACPI: PCI interrupt for device 0000:00:12.0 disabled
-sata_sil: probe of 0000:00:12.0 failed with error -12
+scsi0 : sata_sil
+scsi1 : sata_sil
+ata1: SATA max UDMA/100 mmio m512@0xc0403000 tf 0xc0403080 irq 22
...
-AC'97 space ioremap problem
-ACPI: PCI interrupt for device 0000:00:14.5 disabled
-ATI IXP AC97 controller: probe of 0000:00:14.5 failed with error -5
ALSA device list:
- No soundcards found.
+ #0: ATI IXP rev 80 with ALC655 at 0xc0403800, irq 17
...
-VFS: Cannot open root device "sda1" or unknown-block(0,0)
-Please append a correct "root=" boot option; here are the available partitions:
-1600 4194302 hdc driver: ide-cdrom
-Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(0,0)
+kjournald starting. Commit interval 5 seconds
+EXT3-fs: mounted filesystem with ordered data mode.
+VFS: Mounted root (ext3 filesystem) readonly.
...

<snip>

The second test machine uses ahci. But the symptoms are similar.

I performed a git-bisect on x86/mm. Last commit that worked for me was

2ea3cf43fddecbfd66353caafdf73ec21ea3760b (x86: fix early_ioremap() ISA window)

The subsequent commits for PAT support introduced the problem.
I noticed that PAT should be disabled by default, but obviously the patches
still have some side-effect. (Maybe ioremap changes lead to the problem?)

Boot-logs are attached:

boot-failing.log for x86/mm as of v2.6.24-rc8-672-ga9f7faa
boot-working.log for x86/mm as of v2.6.24-rc8-621-g2ea3cf4

Hopefully it helps to track down the problem.
Maybe someone has an idea why the PAT patches are causing that
ominous "PCI interrupt for device ... disabled" messages.


Thanks and regards,

Andreas


Attachments:
boot-failing.log (29.84 kB)
boot-working.log (31.45 kB)

2008-01-16 19:04:52

by Pallipadi, Venkatesh

Subject: RE: [patch 0/4] x86: PAT followup - Incremental changes and bug fixes


Can you attach the e820 map from the top of your dmesg?

Thanks,
Venki


2008-01-16 19:36:51

by Pallipadi, Venkatesh

Subject: RE: [patch 0/4] x86: PAT followup - Incremental changes and bug fixes


Sorry. Never mind about e820 map. Somehow I did not notice the boot.log
you had attached earlier.

Thanks,
Venki


2008-01-16 20:26:15

by Ingo Molnar

Subject: Re: [patch 0/4] x86: PAT followup - Incremental changes and bug fixes


* Andreas Herrmann wrote:

> I just want to report that the PAT support in x86/mm causes crashes on
> two of my test machines. On both boxes the SATA detection does not
> work when the PAT support is patched into the kernel.
>
> Symptoms are as follows -- best described by a diff between the two
> boot.logs:
>
> # diff boot-failing.log boot-working.log
>
> -Linux version 2.6.24-rc8-ga9f7faa5 (root@hunter) (gcc version ...
> +Linux version 2.6.24-rc8-g2ea3cf43 (root@hunter) (gcc version ...
> ...
> early_iounmap(ffffffff82a0b000, 00001000)
> -early_ioremap(000000000000c000, 00001000) => -000002103394304
> -early_iounmap(ffffffff82a0c000, 00001000)

hm, so the early_ioremap() stuff isnt working well enough ...

that's the main effect of the PAT patches at the moment: no kernel code
will access the low linear mappings (BIOS tables, ACPI data, etc.)
directly, it's all done via early_ioremap(). But it's apparently buggy
somewhere ...

Ingo

2008-01-16 20:33:37

by Pallipadi, Venkatesh

Subject: Re: [patch 0/4] x86: PAT followup - Incremental changes and bug fixes

On Wed, Jan 16, 2008 at 07:57:48PM +0100, Andreas Herrmann wrote:
> Hi,
>
> I just want to report that the PAT support in x86/mm causes crashes
> on two of my test machines. On both boxes the SATA detection does
> not work when the PAT support is patched into the kernel.
>
> Symptoms are as follows -- best described by a diff between the
> two boot.logs:
>
> # diff boot-failing.log boot-working.log
>
> -Linux version 2.6.24-rc8-ga9f7faa5 (root@hunter) (gcc version ...
> +Linux version 2.6.24-rc8-g2ea3cf43 (root@hunter) (gcc version ...
> ...
> early_iounmap(ffffffff82a0b000, 00001000)
> -early_ioremap(000000000000c000, 00001000) => -000002103394304
> -early_iounmap(ffffffff82a0c000, 00001000)
This does not look like the problem here. We just mapped a new low
address, possibly because of a different code path, but the mapping seems to
have worked fine.

> early_iounmap(ffffffff82808000, 00001000)
> ...
> -ACPI: PCI interrupt for device 0000:00:12.0 disabled
> -sata_sil: probe of 0000:00:12.0 failed with error -12
> +scsi0 : sata_sil
> +scsi1 : sata_sil
> +ata1: SATA max UDMA/100 mmio m512@0xc0403000 tf 0xc0403080 irq 22
> ...
> -AC'97 space ioremap problem
> -ACPI: PCI interrupt for device 0000:00:14.5 disabled
> -ATI IXP AC97 controller: probe of 0000:00:14.5 failed with error -5

This failing ioremap seems to be the real problem. It can be due to the
new tracking of ioremaps introduced by the PAT patches: we do not allow
conflicting ioremaps of the same region. That is probably what happens
in both the sound and SATA initialization, which makes the driver init fail.

Can you please try the debug patch below on top of the latest x86/mm, boot the
kernel with the debug boot option, and send us the dmesg from the failure?
That will give us better information about the ioremaps.

Thanks,
Venki


Index: linux-2.6.git/arch/x86/mm/ioremap_64.c
===================================================================
--- linux-2.6.git.orig/arch/x86/mm/ioremap_64.c 2008-01-16 03:38:32.000000000 -0800
+++ linux-2.6.git/arch/x86/mm/ioremap_64.c 2008-01-16 05:16:28.000000000 -0800
@@ -150,6 +150,8 @@

void __iomem *ioremap_nocache (unsigned long phys_addr, unsigned long size)
{
+ printk(KERN_DEBUG "ioremap_nocache: addr %lx, size %lx\n",
+ phys_addr, size);
return __ioremap(phys_addr, size, _PAGE_UC);
}
EXPORT_SYMBOL(ioremap_nocache);
Index: linux-2.6.git/include/asm-x86/io_64.h
===================================================================
--- linux-2.6.git.orig/include/asm-x86/io_64.h 2008-01-16 03:38:32.000000000 -0800
+++ linux-2.6.git/include/asm-x86/io_64.h 2008-01-16 05:16:57.000000000 -0800
@@ -154,6 +154,8 @@

static inline void __iomem * ioremap (unsigned long offset, unsigned long size)
{
+ printk(KERN_DEBUG "ioremap: addr %lx, size %lx\n",
+ offset, size);
return __ioremap(offset, size, 0);
}

2008-01-16 22:02:10

by Andi Kleen

Subject: Re: [patch 0/4] x86: PAT followup - Incremental changes and bug fixes

> This ioremap failing seems to be the real problem. This can be due to
> new tracking of ioremaps introduced by PAT patches. We do not allow
> conflicting ioremaps to same region. Probably that is happening

Normally if there is a conflict there should be a printk (or at least it was
so in the original mattr code if you haven't changed it)

-Andi

2008-01-16 22:13:53

by Pallipadi, Venkatesh

Subject: RE: [patch 0/4] x86: PAT followup - Incremental changes and bug fixes



>-----Original Message-----
>From: Andi Kleen
>Sent: Wednesday, January 16, 2008 2:02 PM
>To: Pallipadi, Venkatesh
>Subject: Re: [patch 0/4] x86: PAT followup - Incremental
>changes and bug fixes
>
>> This ioremap failing seems to be the real problem. This can be due to
>> new tracking of ioremaps introduced by PAT patches. We do not allow
>> conflicting ioremaps to same region. Probably that is happening
>
>Normally if there is a conflict there should be a printk (or
>at least it was
>so in the original mattr code if you haven't changed it)
>

Yes, the printks are there, but they use KERN_DEBUG now. We should change
them to at least KERN_WARNING.

Thanks,
Venki

2008-01-16 22:34:58

by Andi Kleen

Subject: Re: [patch 0/4] x86: PAT followup - Incremental changes and bug fixes

> Yes. Printks are there. But are with KERN_DEBUG now. We should change
> them to WARNING atleast.

I'm pretty sure they were without KERN_* originally. Another reason
why the checkpatch.pl KERN_* warnings suck -- the original state would
have been better and I bet you changed it just to shut up the dumb scripts.

-Andi

by Andreas Herrmann

Subject: Re: [patch 0/4] x86: PAT followup - Incremental changes and bug fixes

On Wed, Jan 16, 2008 at 12:33:28PM -0800, Venki Pallipadi wrote:
> This ioremap failing seems to be the real problem. This can be due to
> new tracking of ioremaps introduced by PAT patches. We do not allow
> conflicting ioremaps to same region. Probably that is happening
> in both Sound and sata initialization which results in driver init failing.
>
> Can you please try the debug patch below over latest x86/mm and boot kernel with
> debug boot option and send us the dmesg from the failure. That will give us
> better info about ioremaps.

Attached is the boot.log with x86/mm as of today (v2.6.24-rc8-720-gd294e9e).

For the failed devices I get:

sata_sil 0000:00:12.0: version 2.3
ACPI: PCI Interrupt 0000:00:12.0[A] -> GSI 22 (level, low) -> IRQ 22
ioremap_nocache: addr c0403000, size 200
swapper:1 conflicting cache attribute c0403000-c0404000 uncached<->default
ACPI: PCI interrupt for device 0000:00:12.0 disabled

and

Advanced Linux Sound Architecture Driver Version 1.0.15 (Tue Nov 20 19:16:42 2007 UTC).
ACPI: PCI Interrupt 0000:00:14.5[B] -> GSI 17 (level, low) -> IRQ 17
ioremap_nocache: addr c0403800, size 100
swapper:1 conflicting cache attribute c0403000-c0404000 uncached<->default
AC'97 space ioremap problem
ACPI: PCI interrupt for device 0000:00:14.5 disabled
ATI IXP AC97 controller: probe of 0000:00:14.5 failed with error -5

Grepping for ioremap/iounmap gives:

<snip>
ioremap: addr 77e72d10, size 6ad8
ioremap: addr 77e79982, size 544
ioremap: addr 77e7afc0, size 40
ioremap: addr c0403104, size fc
ioremap: addr 77e7ade1, size 3
ioremap: addr 77e7af04, size 1
ioremap: addr 77e7985c, size f4
ioremap: addr 77e79950, size 32
ioremap: addr 77e79ec6, size c0
ioremap: addr 77e79f86, size 7a
ioremap: addr 77e7af74, size 48
ioremap_nocache: addr c0400000, size 1000
ioremap_nocache: addr c0401000, size 1000
ioremap_nocache: addr c0402000, size 1000
ioremap_nocache: addr c0100000, size 80
ioremap_nocache: addr c0403000, size 200
ioremap_nocache: addr c0402000, size 1000
ioremap_nocache: addr c0400000, size 1000
ioremap_nocache: addr c0401000, size 1000
ioremap_nocache: addr c0403800, size 100

I guess the conflict for sata is
ioremap: addr c0403104, size fc
ioremap_nocache: addr c0403000, size 200

But where does the conflict for the sound card
(ioremap_nocache: addr c0403800, size 100)
come from?

And what can I do about conflicting regions?


Regards,

Andreas


Attachments:
x86-mm-test-debug.txt (38.20 kB)

by Andreas Herrmann

Subject: Re: [patch 0/4] x86: PAT followup - Incremental changes and bug fixes

On Thu, Jan 17, 2008 at 08:12:11PM +0100, Andreas Herrmann3 wrote:
> On Wed, Jan 16, 2008 at 12:33:28PM -0800, Venki Pallipadi wrote:

<snip>

>
> I guess the conflict for sata is
> ioremap: addr c0403104, size fc
> ioremap_nocache: addr c0403000, size 200
>
> But where does the conflict for the sound card
> (ioremap_nocache: addr c0403800, size 100)
> come from?

Sorry, forget this dumb question. It's the
same page as above.


Andreas
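
Note on the "same page" observation: mappings are tracked at page granularity, so all three requests from the log land on one 4 KiB page. The following is a minimal user-space sketch of that arithmetic, not kernel code; `SKETCH_PAGE_SIZE`, `page_span()` and `ranges_overlap()` are hypothetical names chosen for illustration, and only the addresses and sizes come from the log above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 4 KiB pages, as on the x86 machines discussed here. */
#define SKETCH_PAGE_SIZE 0x1000ULL

/* Half-open physical range [start, end), rounded out to page boundaries. */
struct page_range {
    uint64_t start;
    uint64_t end;
};

/* Hypothetical helper: the page-aligned span covering [phys_addr, phys_addr + size). */
static struct page_range page_span(uint64_t phys_addr, uint64_t size)
{
    struct page_range r;

    assert(size > 0);
    r.start = phys_addr & ~(SKETCH_PAGE_SIZE - 1);
    r.end = (phys_addr + size + SKETCH_PAGE_SIZE - 1) & ~(SKETCH_PAGE_SIZE - 1);
    return r;
}

/* Two half-open ranges overlap when each one starts before the other ends. */
static bool ranges_overlap(struct page_range a, struct page_range b)
{
    return a.start < b.end && b.start < a.end;
}
```

With the values from the log, `page_span(0xc0403104, 0xfc)`, `page_span(0xc0403000, 0x200)` and `page_span(0xc0403800, 0x100)` all give c0403000-c0404000, which matches the range printed in the "conflicting cache attribute" message.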



2008-01-17 20:36:49

by Ingo Molnar

Subject: Re: [patch 0/4] x86: PAT followup - Incremental changes and bug fixes


* Andreas Herrmann3 wrote:

> For the failed devices I get:
>
> sata_sil 0000:00:12.0: version 2.3
> ACPI: PCI Interrupt 0000:00:12.0[A] -> GSI 22 (level, low) -> IRQ 22
> ioremap_nocache: addr c0403000, size 200
> swapper:1 conflicting cache attribute c0403000-c0404000 uncached<->default
> ACPI: PCI interrupt for device 0000:00:12.0 disabled

hm, is the problem that the two devices share the same physical page,
and thus get an overlapping area?

as an intermediate fix, how about following the attribute of the already
existing mapping, instead of rejecting the ioremap due to the conflict?
I.e. something like below?

Ingo

---
arch/x86/mm/pat.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)

Index: linux-x86.q/arch/x86/mm/pat.c
===================================================================
--- linux-x86.q.orig/arch/x86/mm/pat.c
+++ linux-x86.q/arch/x86/mm/pat.c
@@ -174,7 +174,12 @@ int reserve_mattr(u64 start, u64 end, un
current->comm, current->pid,
start, end,
cattr_name(attr), cattr_name(ml->attr));
- err = -EBUSY;
+ /*
+ * Force the already existing attribute:
+ */
+ ma->attr = ml->attr;
+ if (*fattr)
+ *fatt = ml->attr;
break;
}
} else if (ml->start >= end) {

2008-01-17 20:43:49

by H. Peter Anvin

Subject: Re: [patch 0/4] x86: PAT followup - Incremental changes and bug fixes

Ingo Molnar wrote:
> * Andreas Herrmann3 wrote:
>
>> For the failed devices I get:
>>
>> sata_sil 0000:00:12.0: version 2.3
>> ACPI: PCI Interrupt 0000:00:12.0[A] -> GSI 22 (level, low) -> IRQ 22
>> ioremap_nocache: addr c0403000, size 200
>> swapper:1 conflicting cache attribute c0403000-c0404000 uncached<->default
>> ACPI: PCI interrupt for device 0000:00:12.0 disabled
>
> hm, is the problem that the two devices share the same physical page,
> and thus get an overlapping area?
>
> as an intermediate fix, how about following the attribute of the already
> existing mapping, instead of rejecting the ioremap due to the conflict?
> I.e. something like below?

The correct behaviour probably would be to go with the most restrictive
caching behaviour, i.e. uncached in this case.

-hpa

2008-01-17 20:46:01

by Ingo Molnar

Subject: Re: [patch 0/4] x86: PAT followup - Incremental changes and bug fixes


* Ingo Molnar wrote:

> I.e. something like below?

or the one below. (it even builds)

Ingo

---
arch/x86/mm/pat.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)

Index: linux-x86.q/arch/x86/mm/pat.c
===================================================================
--- linux-x86.q.orig/arch/x86/mm/pat.c
+++ linux-x86.q/arch/x86/mm/pat.c
@@ -174,7 +174,12 @@ int reserve_mattr(u64 start, u64 end, un
current->comm, current->pid,
start, end,
cattr_name(attr), cattr_name(ml->attr));
- err = -EBUSY;
+ /*
+ * Force the already existing attribute:
+ */
+ ma->attr = ml->attr;
+ if (*fattr)
+ *fattr = ml->attr;
break;
}
} else if (ml->start >= end) {

2008-01-17 20:58:48

by Ingo Molnar

Subject: Re: [patch 0/4] x86: PAT followup - Incremental changes and bug fixes


* H. Peter Anvin wrote:

>> as an intermediate fix, how about following the attribute of the
>> already existing mapping, instead of rejecting the ioremap due to the
>> conflict? I.e. something like below?
>
> The correct behaviour probably would be to go with the most
> restrictive caching behaviour, i.e. uncached in this case.

yeah. Or, to be on the safest side, forcing UC in this case. We'll have
a warning message anyway, so it wont go unnoticed - but we wont break
drivers.

Ingo

--------->
Subject: x86: patches/pat-conflict-fixup.patch
From: Ingo Molnar <[email protected]>

Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/mm/pat.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)

Index: linux-x86.q/arch/x86/mm/pat.c
===================================================================
--- linux-x86.q.orig/arch/x86/mm/pat.c
+++ linux-x86.q/arch/x86/mm/pat.c
@@ -174,7 +174,12 @@ int reserve_mattr(u64 start, u64 end, un
current->comm, current->pid,
start, end,
cattr_name(attr), cattr_name(ml->attr));
- err = -EBUSY;
+ /*
+ * Force UC on a conflict:
+ */
+ ma->attr = _PAGE_UC;
+ if (*fattr)
+ *fattr = _PAGE_UC;
break;
}
} else if (ml->start >= end) {
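
The policy in this patch (grant the mapping but force UC when the requested attribute conflicts with an overlapping tracked range, instead of returning -EBUSY) can be illustrated outside the kernel. The following is a simplified user-space sketch under stated assumptions, not the actual reserve_mattr() code: `struct tracker`, `reserve_range()`, `enum cache_attr` and the fixed-size array are hypothetical stand-ins for the kernel's list of tracked ranges.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical attribute set: only write-back and uncached are modelled. */
enum cache_attr { ATTR_WB, ATTR_UC };

/* One tracked, page-aligned, half-open physical range [start, end). */
struct tracked_range {
    uint64_t start;
    uint64_t end;
    enum cache_attr attr;
};

#define MAX_TRACKED 16

/* Assumption: a small fixed array instead of the kernel's linked list. */
struct tracker {
    struct tracked_range ranges[MAX_TRACKED];
    size_t count;
};

/*
 * Record a mapping request and return the attribute actually granted,
 * or -1 if the table is full. A request that overlaps an existing range
 * with a different attribute is not rejected; it is granted as UC.
 * Existing entries are left untouched, as in the patch above.
 */
static int reserve_range(struct tracker *t, uint64_t start, uint64_t end,
                         enum cache_attr want)
{
    enum cache_attr granted = want;
    size_t i;

    assert(start < end);
    if (t->count == MAX_TRACKED)
        return -1;

    for (i = 0; i < t->count; i++) {
        const struct tracked_range *r = &t->ranges[i];

        if (r->start < end && start < r->end && r->attr != want)
            granted = ATTR_UC;   /* conflict: fall back to uncached */
    }

    t->ranges[t->count].start = start;
    t->ranges[t->count].end = end;
    t->ranges[t->count].attr = granted;
    t->count++;
    return (int)granted;
}
```

In this sketch a write-back request for c0403000-c0404000 followed by an uncached request for the same page succeeds both times, and a later write-back request for that page is granted as uncached. The earlier write-back entry keeps its attribute, so the sketch does not remove the aliasing itself.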

2008-01-17 20:59:56

by Linus Torvalds

Subject: Re: [patch 0/4] x86: PAT followup - Incremental changes and bug fixes



On Thu, 17 Jan 2008, H. Peter Anvin wrote:
> > > swapper:1 conflicting cache attribute c0403000-c0404000 uncached<->default
> > > ACPI: PCI interrupt for device 0000:00:12.0 disabled
>
> The correct behaviour probably would be to go with the most restrictive
> caching behaviour, i.e. uncached in this case.

Well, the sad part is that in this case, uncached is the SAME THING as
default.

So it's not like there is any actual real conflict, other than in a PAT
confusion thing.

Linus

by Andreas Herrmann

Subject: Re: [patch 0/4] x86: PAT followup - Incremental changes and bug fixes

On Thu, Jan 17, 2008 at 09:36:00PM +0100, Ingo Molnar wrote:
>
> * Andreas Herrmann3 wrote:
>
> > For the failed devices I get:
> >
> > sata_sil 0000:00:12.0: version 2.3
> > ACPI: PCI Interrupt 0000:00:12.0[A] -> GSI 22 (level, low) -> IRQ 22
> > ioremap_nocache: addr c0403000, size 200
> > swapper:1 conflicting cache attribute c0403000-c0404000 uncached<->default
> > ACPI: PCI interrupt for device 0000:00:12.0 disabled
>
> hm, is the problem that the two devices share the same physical page,
> and thus get an overlapping area?

Yes.

Meanwhile I have figured out that it is some ACPI stuff that maps the page cached.
I've changed the ioremap's in drivers/acpi/osl.c to ioremap_nocache.
See attached patch.

Now the machine boots without conflicts.

ACPI: EC: Look up EC in DSDT
ioremap_nocache: addr c0403104, size fc
ioremap_nocache: addr 77e7ade1, size 3
ioremap_nocache: addr 77e7af04, size 1

...

sata_sil 0000:00:12.0: version 2.3
ACPI: PCI Interrupt 0000:00:12.0[A] -> GSI 22 (level, low) -> IRQ 22
ioremap_nocache: addr c0403000, size 200
scsi0 : sata_sil
scsi1 : sata_sil
ata1: SATA max UDMA/100 mmio m512@0xc0403000 tf 0xc0403080 irq 22
ata2: SATA max UDMA/100 mmio m512@0xc0403000 tf 0xc04030c0 irq 22
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)

...

dmesg-output is attached.


> as an intermediate fix, how about following the attribute of the already
> existing mapping, instead of rejecting the ioremap due to the conflict?
> I.e. something like below?

I guess it is not a good idea to use an existing cachable attribute if the
IO-region is non-prefetchable. And in this example there are 3 devices
which are potentially affected:

00:12.0 IDE interface: ATI Technologies Inc 4379 Serial ATA Controller (rev 80) (
...
Memory at c0403000 (32-bit, non-prefetchable) [size=512]
...

00:14.0 SMBus: ATI Technologies Inc IXP SB400 SMBus Controller (rev 82)
...
Memory at c0403400 (32-bit, non-prefetchable) [size=1K]
...

00:14.5 Multimedia audio controller: ATI Technologies Inc IXP SB400 AC'97 Audio Controller (rev 80)
...
Memory at c0403800 (32-bit, non-prefetchable) [size=256]
...

BTW, is there a need for osl.c to map all regions as cached?


Regards,

Andreas

---
diff --git a/drivers/acpi/osl.c b/drivers/acpi/osl.c
index 1f1ec4a..175e6a4 100644
--- a/drivers/acpi/osl.c
+++ b/drivers/acpi/osl.c
@@ -221,7 +221,7 @@ void __iomem *acpi_os_map_memory(acpi_physical_address phys, acpi_size size)
/*
* ioremap checks to ensure this is in reserved space
*/
- return ioremap((unsigned long)phys, size);
+ return ioremap_nocache((unsigned long)phys, size);
else
return __acpi_map_table((unsigned long)phys, size);
}
@@ -437,7 +437,7 @@ acpi_os_read_memory(acpi_physical_address phys_addr, u32 * value, u32 width)
u32 dummy;
void __iomem *virt_addr;

- virt_addr = ioremap(phys_addr, width);
+ virt_addr = ioremap_nocache(phys_addr, width);
if (!value)
value = &dummy;

@@ -465,7 +465,7 @@ acpi_os_write_memory(acpi_physical_address phys_addr, u32 value, u32 width)
{
void __iomem *virt_addr;

- virt_addr = ioremap(phys_addr, width);
+ virt_addr = ioremap_nocache(phys_addr, width);

switch (width) {
case 8:


Attachments:
dmesg.log (22.57 kB)

2008-01-17 21:13:49

by Ingo Molnar

Subject: Re: [patch 0/4] x86: PAT followup - Incremental changes and bug fixes


* Andreas Herrmann3 wrote:

> Yes.
>
> Meanwhile I have figured out that it is some ACPI stuff that maps the
> page cached. I've changed the ioremap's in drivers/acpi/osl.c to
> ioremap_nocache. See attached patch.
>
> Now the machine boots without conflicts.

ah, nice!

but in general we must be robust enough in this case and just degrade
any overlapping page to UC (and emit a warning perhaps) - instead of
failing the ioremap and thus failing the driver (and the bootup).

Does my third patch (which falls back to UC in case of attribute
conflicts, also attached below) instead of your ioremap_nocache() patch
solve your bootup problem too?

while ACPI should not hurt too much from using UC mappings, we should
still solve this intelligently and only use UC when needed. (Sane system
makers with a sane layout of IO areas and BIOS areas should not be
punished with UC overhead.)

> > as an intermediate fix, how about following the attribute of the
> > already existing mapping, instead of rejecting the ioremap due to
> > the conflict? I.e. something like below?
>
> I guess it is not a good idea to use an existing cachable attribute if
> the IO-region is non-prefetchable. And in this example there are 3
> devices which are potentially affected:
>
> 00:12.0 IDE interface: ATI Technologies Inc 4379 Serial ATA Controller (rev 80) (
> ...
> Memory at c0403000 (32-bit, non-prefetchable) [size=512]
> ...
>
> 00:14.0 SMBus: ATI Technologies Inc IXP SB400 SMBus Controller (rev 82)
> ...
> Memory at c0403400 (32-bit, non-prefetchable) [size=1K]
> ...
>
> 00:14.5 Multimedia audio controller: ATI Technologies Inc IXP SB400 AC'97 Audio Controller (rev 80)
> ...
> Memory at c0403800 (32-bit, non-prefetchable) [size=256]
> ...
>
> BTW, is there a need for osl.c to map all regions as cached?

no, there should be no such need. There can be "mapping leaks", in that
the mapped object is not unmapped. There's detection code in today's
x86.git that should report something like this if it occurs:

Debug warning: early ioremap leak of 1 areas detected.
please boot with early_ioremap_debug and report the dmesg.
------------[ cut here ]------------
WARNING: at arch/x86/mm/ioremap_32.c:346 ()

but i have not seen this message in your boot log. Could you boot with
early_ioremap_debug and send us the dmesg - i'm curious which ACPI
tables are actively mapped while those devices are initialized.

Ingo

-------------->
Subject: x86: patches/pat-conflict-fixup.patch
From: Ingo Molnar <[email protected]>

Signed-off-by: Ingo Molnar <[email protected]>
---
arch/x86/mm/pat.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)

Index: linux-x86.q/arch/x86/mm/pat.c
===================================================================
--- linux-x86.q.orig/arch/x86/mm/pat.c
+++ linux-x86.q/arch/x86/mm/pat.c
@@ -174,7 +174,12 @@ int reserve_mattr(u64 start, u64 end, un
current->comm, current->pid,
start, end,
cattr_name(attr), cattr_name(ml->attr));
- err = -EBUSY;
+ /*
+ * Force UC on a conflict:
+ */
+ ma->attr = _PAGE_UC;
+ if (*fattr)
+ *fattr = _PAGE_UC;
break;
}
} else if (ml->start >= end) {

2008-01-17 21:23:47

by Ingo Molnar

Subject: Re: [patch 0/4] x86: PAT followup - Incremental changes and bug fixes


* Ingo Molnar wrote:

> * Andreas Herrmann3 wrote:
>
> > Yes.
> >
> > Meanwhile I have figured out that it is some ACPI stuff that maps the
> > page cached. I've changed the ioremap's in drivers/acpi/osl.c to
> > ioremap_nocache. See attached patch.
> >
> > Now the machine boots without conflicts.
>
> ah, nice!
>
> but in general we must be robust enough in this case and just degrade
> any overlapping page to UC (and emit a warning perhaps) - instead of
> failing the ioremap and thus failing the driver (and the bootup).

btw., there's a change i did in today's x86.git: _all_ the old BIOS data
accesses now go through early_ioremap(). This cleaned up the boot code
quite significantly, as it's much more apparent now when we access a
BIOS data table. (it also solves the problem when BIOS data pages are in
reserved areas that we map via UC or dont map at all)

the same happens with all ISA ioremaps as well - no more "low 1MB is
treated special" exceptions.

[ This also solves the 'EFI puts data pages into really high memory we
dont have mapped yet' category of problems that BIOS writers are
apparently busy creating right now ;-) ]

the downside is that old linear-mapped assumptions might now result in
an early fault - boot with earlyprintk=vga or
earlyprintk=serial,ttyS0,115200. I fixed most such assumptions already
and booted an allyesconfig kernel on both 32-bit and 64-bit x86, but a
few more remain still. I've enhanced the early fault printout code as
well to make it easier to debug such things, so it should be relatively
easy to find the rest.

Ingo

2008-01-17 21:31:41

by Suresh Siddha

Subject: Re: [patch 0/4] x86: PAT followup - Incremental changes and bug fixes

On Thu, Jan 17, 2008 at 10:13:08PM +0100, Ingo Molnar wrote:
> but in general we must be robust enough in this case and just degrade
> any overlapping page to UC (and emit a warning perhaps) - instead of
> failing the ioremap and thus failing the driver (and the bootup).

But then, this will cause an attribute conflict. The old mapping specified
WB in PAT (ioremap with no flags) and the new ioremap specifies UC.

As Linus mentioned, main problem is to figure out the correct attribute
for ioremap() which doesn't specify the actual attribute to be used.

One mechanism to fix the issue generically (somewhat, at least) is to use
the MTRRs to figure out the default MTRR attribute for that physical address
and use it for ioremap().

> no, there should be no such need. There can be "mapping leaks", in that
> the mapped object is not unmapped. There's detection code in today's
> x86.git that should report something like this if it occurs:
>
> Debug warning: early ioremap leak of 1 areas detected.
> please boot with early_ioremap_debug and report the dmesg.
> ------------[ cut here ]------------
> WARNING: at arch/x86/mm/ioremap_32.c:346 ()
>
> but i have not seen this message in your boot log. Could you boot with
> early_ioremap_debug and send us the dmesg - i'm curious which ACPI
> tables are actively mapped while those devices are initialized.

In this scenario, ACPI is using ioremap() and leaving some dangling references.
Venki is looking to fix this code. Getting the attribute from the MTRR
for ioremap without flags might solve some of these issues as well. Will look
into this.

thanks,
suresh

Subject: Re: [patch 0/4] x86: PAT followup - Incremental changes and bug fixes

On Thu, Jan 17, 2008 at 10:13:08PM +0100, Ingo Molnar wrote:
>
> * Andreas Herrmann3 <[email protected]> wrote:
>
> > Yes.
> >
> > Meanwhile I have figured out that it is some ACPI stuff that maps the
> > page cached. I've changed the ioremap's in drivers/acpi/osl.c to
> > ioremap_nocache. See attached patch.
> >
> > Now the machine boots without conflicts.
>
> ah, nice!
>
> but in general we must be robust enough in this case and just degrade
> any overlapping page to UC (and emit a warning perhaps) - instead of
> failing the ioremap and thus failing the driver (and the bootup).
>
> Does my third patch (which falls back to UC in case of attribute
> conflicts, also attached below) instead of your ioremap_nocache() patch
> solve your bootup problem too?

I'll check this asap

> but i have not seen this message in your boot log. Could you boot with
> early_ioremap_debug and send us the dmesg - i'm curious which ACPI
> tables are actively mapped while those devices are initialized.

Hmm, early_ioremap_debug exists only in ioremap_32.c.
I'd have to adapt the 64-bit version first.

But wait, the 64-bit code already contains debug output for this. See
the boot logs that I attached to my previous mails.
(Interestingly, the code for 64-bit early_io(re/un)map resides not in
ioremap_64.c but in init_64.c.)


Andreas


2008-01-17 21:49:19

by Ingo Molnar

Subject: Re: [patch 0/4] x86: PAT followup - Incremental changes and bug fixes


* Siddha, Suresh B <[email protected]> wrote:

> On Thu, Jan 17, 2008 at 10:13:08PM +0100, Ingo Molnar wrote:
> > but in general we must be robust enough in this case and just degrade
> > any overlapping page to UC (and emit a warning perhaps) - instead of
> > failing the ioremap and thus failing the driver (and the bootup).
>
> But then, this will cause an attribute conflicit. Old one was
> specifying WB in PAT (ioremap with noflags) and the new ioremap
> specifies UC.

we could fix up all aliases of that page as well and degrade them to UC?

> As Linus mentioned, main problem is to figure out the correct
> attribute for ioremap() which doesn't specify the actual attribute to
> be used.

i think the problem is the proximity of some ACPI tables to actual
device mmio areas - they share the same physical page. The ACPI tables
will be mapped WB, the device mmio areas will be UC most of the time.

> One mechanism to fix the issue generically (somewhat atleast) is to
> use MTRR's and figure out the default MTRR attribute for that physical
> address and use it for ioremap().

how would this solve the problem at hand? I dont think it's possible to
guarantee that all the BIOS data pages and mmio areas will have
compatible attributes. BIOS data pages might be in plain RAM that we
intend to map WB. Or they might be in reserved areas near the mmio
addresses.

but if we fixed up aliases (only for that single conflicting page), so
that all mappings are degraded to UC, we'd have uniform behavior all
across and the least amount of surprise to drivers. Hm?

> > but i have not seen this message in your boot log. Could you boot
> > with early_ioremap_debug and send us the dmesg - i'm curious which
> > ACPI tables are actively mapped while those devices are initialized.
>
> In this scenario, ACPI is using ioremap() leaving some dangling
> references. Venki is looking to fix this code. Getting the attribute
> for MTRR for ioremap noflags, might solve some of these issues aswell.
> Will look into this.

ok. Resolving that would be nice anyway because the ACPI table might be
in plain RAM which might be reused by the kernel later on, etc. FYI,
there's also the patch from Yinghai Lu on lkml, for one such dangling
reference problem in the SRAT table.

Ingo

---------------->
From: Yinghai Lu <[email protected]>
Subject: [PATCH] x86: copy srat table and unmap in acpi_parse_table

[PATCH] x86: copy srat table and unmap in acpi_parse_table


The old acpi_numa_slit_init() saved the early-stage table address in acpi_slit,
so acpi_table_parse() could not unmap that address.
This patch copies the SLIT in the callback,
so the table can be unmapped in acpi_table_parse() instead of being tracked outside it.

need to revert
"
commit d8d28f25f33c6a035cdfb1d421c79293d16e5c58
Author: Ingo Molnar <[email protected]>
Date: Thu Jan 17 15:26:42 2008 +0100

x86: ACPI: fix mapping leaks

ioremap_early() is stateful, hence we cannot tolerate mapping leaks.
"

before applying this patch

Signed-off-by: Yinghai Lu <[email protected]>

Index: linux-2.6/arch/x86/mm/srat_64.c
===================================================================
--- linux-2.6.orig/arch/x86/mm/srat_64.c
+++ linux-2.6/arch/x86/mm/srat_64.c
@@ -23,7 +23,9 @@

int acpi_numa __initdata;

-static struct acpi_table_slit *acpi_slit;
+static int slit_copied;
+static u64 slit_locality_count;
+static u8 slit_entry[MAX_NUMNODES * MAX_NUMNODES];

static nodemask_t nodes_parsed __initdata;
static struct bootnode nodes[MAX_NUMNODES] __initdata;
@@ -130,7 +132,16 @@ void __init acpi_numa_slit_init(struct a
printk(KERN_INFO "ACPI: SLIT table looks invalid. Not used.\n");
return;
}
- acpi_slit = slit;
+
+ if (slit->locality_count > MAX_NUMNODES)
+ return;
+
+ slit_locality_count = slit->locality_count;
+
+ memcpy(slit_entry, slit->entry,
+ slit_locality_count * slit_locality_count);
+
+ slit_copied = 1;
}

/* Callback for Proximity Domain -> LAPIC mapping */
@@ -502,11 +513,11 @@ int __node_distance(int a, int b)
{
int index;

- if (!acpi_slit)
+ if (!slit_copied)
return null_slit_node_compare(a, b) ? LOCAL_DISTANCE :
REMOTE_DISTANCE;
- index = acpi_slit->locality_count * node_to_pxm(a);
- return acpi_slit->entry[index + node_to_pxm(b)];
+ index = slit_locality_count * node_to_pxm(a);
+ return slit_entry[index + node_to_pxm(b)];
}

EXPORT_SYMBOL(__node_distance);
Index: linux-2.6/drivers/acpi/tables.c
===================================================================
--- linux-2.6.orig/drivers/acpi/tables.c
+++ linux-2.6/drivers/acpi/tables.c
@@ -260,6 +260,7 @@ int __init acpi_table_parse(char *id, ac

if (table) {
handler(table);
+ acpi_os_unmap_memory(table, table->length);
return 0;
} else
return 1;
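
The copy-then-lookup idea in the patch above can be tried in isolation. What follows is only a minimal userspace sketch: MAX_NODES, copy_slit() and distance() are made-up names, the bound is deliberately tiny, and the node_to_pxm() translation used by the kernel code is left out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_NODES 4	/* small illustrative bound, not MAX_NUMNODES */

/* Private copy of the distance matrix, stored row by row. */
static uint8_t slit_entry[MAX_NODES * MAX_NODES];
static uint64_t slit_locality_count;

/* Copy a count x count matrix so the source table can be unmapped
 * afterwards. Returns -1 if the table is too large for the buffer. */
static int copy_slit(const uint8_t *entries, uint64_t count)
{
	if (count > MAX_NODES)
		return -1;
	slit_locality_count = count;
	memcpy(slit_entry, entries, count * count);
	return 0;
}

/* Row-major lookup: row a, column b. */
static int distance(unsigned int a, unsigned int b)
{
	assert(a < slit_locality_count && b < slit_locality_count);
	return slit_entry[slit_locality_count * a + b];
}
```

Because the lookup only touches the private copy, nothing keeps a pointer into the mapped table once the callback returns.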

2008-01-17 21:51:13

by H. Peter Anvin

Subject: Re: [patch 0/4] x86: PAT followup - Incremental changes and bug fixes

Siddha, Suresh B wrote:
>
> But then, this will cause an attribute conflicit. Old one was specifying
> WB in PAT (ioremap with noflags) and the new ioremap specifies UC.
>
> As Linus mentioned, main problem is to figure out the correct attribute
> for ioremap() which doesn't specify the actual attribute to be used.
>
> One mechanism to fix the issue generically (somewhat atleast) is to use
> MTRR's and figure out the default MTRR attribute for that physical address
> and use it for ioremap().
>

This is the matrix the CPU uses when combining MTRR and PAT behaviour.
It probably makes sense to mimic:

   | WB WT WC UC
---+------------
WB | WB WT WC UC
WT | WT WT UC UC
WC | WC UC WC UC
UC | UC UC UC UC

With the current PAT encoding:

WB = 00
WT = 01
WC = 10
UC = 11

... this is simply a bitwise OR. This makes sense, since one of the
bits denies delaying writes (WT, UC), and the other denies delaying
reads (WC, UC).

-hpa
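
The bitwise-OR rule described above can be checked mechanically. The sketch below assumes the two-bit encoding exactly as listed in the message (WB=00, WT=01, WC=10, UC=11); the enum and function names are invented for illustration and are not kernel identifiers.

```c
#include <assert.h>

/* Two-bit encoding as listed in the message above; illustrative names. */
enum cache_type { CT_WB = 0, CT_WT = 1, CT_WC = 2, CT_UC = 3 };

/* Combine two memory types by bitwise OR. Per the message, one bit
 * denies delaying writes (set in WT and UC) and the other denies
 * delaying reads (set in WC and UC), so OR keeps every restriction. */
static enum cache_type combine_types(enum cache_type a, enum cache_type b)
{
	assert(a <= CT_UC && b <= CT_UC);
	return (enum cache_type)(a | b);
}
```

For instance, combining WT with WC sets both bits and therefore yields UC, matching the corresponding cell of the matrix.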

Subject: Re: [patch 0/4] x86: PAT followup - Incremental changes and bug fixes

On Thu, Jan 17, 2008 at 10:42:09PM +0100, Ingo Molnar wrote:
>
> * Siddha, Suresh B <[email protected]> wrote:
>
> > On Thu, Jan 17, 2008 at 10:13:08PM +0100, Ingo Molnar wrote:
> > > but in general we must be robust enough in this case and just degrade
> > > any overlapping page to UC (and emit a warning perhaps) - instead of
> > > failing the ioremap and thus failing the driver (and the bootup).
> >
> > But then, this will cause an attribute conflicit. Old one was
> > specifying WB in PAT (ioremap with noflags) and the new ioremap
> > specifies UC.
>
> we could fix up all aliases of that page as well and degrade them to UC?

Yes, we must fix all aliases or reject the conflicting mapping.
But fixing all aliases might not be that easy.
(I've just seen a panic when using your patch ;-(


Andreas


2008-01-17 22:13:50

by Ingo Molnar

Subject: Re: [patch 0/4] x86: PAT followup - Incremental changes and bug fixes


* Andreas Herrmann3 <[email protected]> wrote:

> > but i have not seen this message in your boot log. Could you boot
> > with early_ioremap_debug and send us the dmesg - i'm curious which
> > ACPI tables are actively mapped while those devices are initialized.
>
> Hmm, early_ioremap_debug exists only in ioremap_32.c Have to adapt the
> 64-bit version first.
>
> But wait the 64-bit code contains already debug output for this. See
> the boot-logs that I have attached to my previous mails.
> (Interestingly the code for 64-bit early_io(re/un)map resides not in
> ioremap_64.c but in init_64.c.)

yeah, it's not unified yet.

Ingo

2008-01-17 22:15:31

by H. Peter Anvin

Subject: Re: [patch 0/4] x86: PAT followup - Incremental changes and bug fixes

Andreas Herrmann3 wrote:
>
> Yes, we must fix all aliases or reject the conflicting mapping.
> But fixing all aliases might not be that easy.
> (I've just seen a panic when using your patch ;-(
>

Avoiding inconsistent aliases is definitely fundamental to supporting PAT.

-hpa

2008-01-17 22:16:06

by Ingo Molnar

Subject: Re: [patch 0/4] x86: PAT followup - Incremental changes and bug fixes


* Andreas Herrmann3 <[email protected]> wrote:

> On Thu, Jan 17, 2008 at 10:42:09PM +0100, Ingo Molnar wrote:
> >
> > * Siddha, Suresh B <[email protected]> wrote:
> >
> > > On Thu, Jan 17, 2008 at 10:13:08PM +0100, Ingo Molnar wrote:
> > > > but in general we must be robust enough in this case and just degrade
> > > > any overlapping page to UC (and emit a warning perhaps) - instead of
> > > > failing the ioremap and thus failing the driver (and the bootup).
> > >
> > > But then, this will cause an attribute conflicit. Old one was
> > > specifying WB in PAT (ioremap with noflags) and the new ioremap
> > > specifies UC.
> >
> > we could fix up all aliases of that page as well and degrade them to UC?
>
> Yes, we must fix all aliases or reject the conflicting mapping. But
> fixing all aliases might not be that easy. (I've just seen a panic
> when using your patch ;-(

yes, indeed my patch is bad if you have PAT enabled: conflicting cache
attributes might be present. I'll go with your patch for now.

should we perhaps do UC by default for early_ioremap() as well? Normally
those mappings are only temporary - but in case of a leak they might
hang around in the pagetables and the CPU might stumble upon them. Also,
should early_iounmap() do a wbinvd() [/clflush()] call as well, to be
safe?

Ingo

Subject: Re: [patch 0/4] x86: PAT followup - Incremental changes and bug fixes

On Thu, Jan 17, 2008 at 10:42:28PM +0100, Andreas Herrmann3 wrote:
> On Thu, Jan 17, 2008 at 10:13:08PM +0100, Ingo Molnar wrote:
> >
> > * Andreas Herrmann3 <[email protected]> wrote:
> >
> > > Yes.
> > >
> > > Meanwhile I have figured out that it is some ACPI stuff that maps the
> > > page cached. I've changed the ioremap's in drivers/acpi/osl.c to
> > > ioremap_nocache. See attached patch.
> > >
> > > Now the machine boots without conflicts.
> >
> > ah, nice!
> >
> > but in general we must be robust enough in this case and just degrade
> > any overlapping page to UC (and emit a warning perhaps) - instead of
> > failing the ioremap and thus failing the driver (and the bootup).
> >
> > Does my third patch (which falls back to UC in case of attribute
> > conflicts, also attached below) instead of your ioremap_nocache() patch
> > solve your bootup problem too?
>
> I'll check this asap

Ok, here is the result:

sata_sil 0000:00:12.0: version 2.3
ACPI: PCI Interrupt 0000:00:12.0[A] -> GSI 22 (level, low) -> IRQ 22
ioremap_nocache: addr c0403000, size 200
swapper:1 conflicting cache attribute c0403000-c0404000 uncached<->default
Unable to handle kernel NULL pointer dereference at 0000000000000000 RIP:
[<ffffffff8102905d>] ? reserve_mattr+0x1a5/0x221
PGD 0
Oops: 0000 [1] SMP
CPU 3
Modules linked in:
Pid: 1, comm: swapper Not tainted 2.6.24-rc8-gd294e9ed-dirty #1
RIP: 0010:[<ffffffff8102905d>] [<ffffffff8102905d>] ? reserve_mattr+0x1a5/0x221
RSP: 0018:ffff810077581c60 EFLAGS: 00010282
RAX: 000000000000004e RBX: ffff8100775a7a00 RCX: 0000000000004c12
RDX: 000000000000a9a9 RSI: 0000000000000018 RDI: ffffffff8153bed4
RBP: 0000000000000000 R08: ffffffff81540fe7 R09: ffffffff81329d70
R10: 0000000000000000 R11: 0000000000000000 R12: 00000000c0404000
R13: 0000000000000018 R14: 00000000c0403000 R15: 00000000c0403000
FS: 0000000000000000(0000) GS:ffff8100775d6bc0(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: 0000000000000000 CR3: 0000000001001000 CR4: 00000000000006e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process swapper (pid: 1, threadinfo ffff810077580000, task ffff810077564790)
Stack: ffffffff81411900 0000000000001000 0000000000001000 00000000c0404000
ffffc200008ac000 00000000c0403000 ffff8100775a7a40 ffffffff810281e9
0000000000000018 0000000000000005 ffff810077631680 ffff8100777b7800
Call Trace:
[<ffffffff810281e9>] __ioremap+0xc2/0x11a
[<ffffffff8114a6b0>] pcim_iomap+0x43/0x53
[<ffffffff8114a74f>] pcim_iomap_regions+0x8f/0x104
[<ffffffff811fba72>] sil_init_one+0xb0/0x1eb
[<ffffffff81150f98>] pci_device_probe+0xd1/0x138
[<ffffffff811a4d9c>] driver_probe_device+0xe1/0x16a
[<ffffffff811a4f6d>] __driver_attach+0x90/0xcd
[<ffffffff811a4edd>] __driver_attach+0x0/0xcd
[<ffffffff811a4edd>] __driver_attach+0x0/0xcd
[<ffffffff811a4149>] bus_for_each_dev+0x43/0x6e
[<ffffffff811a44c9>] bus_add_driver+0x77/0x1be
[<ffffffff8115116e>] __pci_register_driver+0x58/0x8a
[<ffffffff814d2634>] kernel_init+0x170/0x2e0
[<ffffffff8100cb58>] child_rip+0xa/0x12
[<ffffffff814d24c4>] kernel_init+0x0/0x2e0
[<ffffffff8100cb4e>] child_rip+0x0/0x12


Code: 00 49 89 c9 48 81 c6 e0 02 00 00 48 89 3c 24 31 c0 4d 89 e0 4c 89 f1 48 c7 c7 c3 97 3e 81 e8 71 ef 00 00 48 c7 43 10 18 00 00 00 <48> 83 3c 25 00 00 00 00 00 74 36 48 c7 04 25 00 00 00 00 18 00
RIP [<ffffffff8102905d>] ? reserve_mattr+0x1a5/0x221
RSP <ffff810077581c60>
CR2: 0000000000000000
---[ end trace 5516cbea98bb72f9 ]---
Kernel panic - not syncing: Attempted to kill init!



I should have reviewed your patch.
I guess it must be

"if (fattr)" instead of "if (*fattr)"

I'll give it another try ...
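
The difference between the two checks is easy to show outside the kernel. This is a hypothetical stand-alone sketch, not the code from pat.c: force_uc() and ATTR_UC are names made up here, and the attribute value is arbitrary.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the uncached attribute; the value is arbitrary here. */
#define ATTR_UC 0x18UL

/* Hypothetical helper mirroring the conflict branch of the patch:
 * force the tracked attribute to UC and, only if the caller passed an
 * output pointer, report UC there as well. */
static void force_uc(unsigned long *tracked_attr, unsigned long *fattr)
{
	assert(tracked_attr != NULL);
	*tracked_attr = ATTR_UC;
	if (fattr)		/* test the pointer itself, not *fattr */
		*fattr = ATTR_UC;
}
```

With "if (*fattr)" a caller that passes NULL faults on the test itself, which is consistent with the CR2 value of 0 in the oops above.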


Andreas


Subject: Re: [patch 0/4] x86: PAT followup - Incremental changes and bug fixes

On Thu, Jan 17, 2008 at 10:42:28PM +0100, Andreas Herrmann3 wrote:
> On Thu, Jan 17, 2008 at 10:13:08PM +0100, Ingo Molnar wrote:
> >
> > * Andreas Herrmann3 <[email protected]> wrote:
> >
> > > Yes.
> > >
> > > Meanwhile I have figured out that it is some ACPI stuff that maps the
> > > page cached. I've changed the ioremap's in drivers/acpi/osl.c to
> > > ioremap_nocache. See attached patch.
> > >
> > > Now the machine boots without conflicts.
> >
> > ah, nice!
> >
> > but in general we must be robust enough in this case and just degrade
> > any overlapping page to UC (and emit a warning perhaps) - instead of
> > failing the ioremap and thus failing the driver (and the bootup).
> >
> > Does my third patch (which falls back to UC in case of attribute
> > conflicts, also attached below) instead of your ioremap_nocache() patch
> > solve your bootup problem too?
>
> I'll check this asap


So, now that I've avoided this tiny NULL-pointer-dereference, the system boots
fine as well with your (slightly modified) patch. See dmesg attached.


Andreas


Attachments:
(No filename) (1.05 kB)
dmesg.log (22.35 kB)

2008-01-17 22:47:58

by Ingo Molnar

Subject: Re: [patch 0/4] x86: PAT followup - Incremental changes and bug fixes


* Andreas Herrmann3 <[email protected]> wrote:

> > I'll check this asap
>
> So, now that I've avoided this tiny NULL-pointer-dereference, the
> system boots fine as well with your (slightly modified) patch. See
> dmesg attached.

for now i applied your ioremap_nocache() patch and removed my patch.

my patch might work if the MTRR marks that area UC. Does it on your
system?

if the MTRRs (as set up by the BIOS) keep it at WB, then the ACPI
ioremap() is already unsafe: the mmio area that happens to be there
might be prefetched by the CPU.

Ingo

Subject: Re: [patch 0/4] x86: PAT followup - Incremental changes and bug fixes

On Thu, Jan 17, 2008 at 11:15:05PM +0100, Ingo Molnar wrote:
>
> * Andreas Herrmann3 <[email protected]> wrote:
>
> > On Thu, Jan 17, 2008 at 10:42:09PM +0100, Ingo Molnar wrote:
> > >
> > > * Siddha, Suresh B <[email protected]> wrote:
> > >
> > > > On Thu, Jan 17, 2008 at 10:13:08PM +0100, Ingo Molnar wrote:
> > > > > but in general we must be robust enough in this case and just degrade
> > > > > any overlapping page to UC (and emit a warning perhaps) - instead of
> > > > > failing the ioremap and thus failing the driver (and the bootup).
> > > >
> > > > But then, this will cause an attribute conflicit. Old one was
> > > > specifying WB in PAT (ioremap with noflags) and the new ioremap
> > > > specifies UC.
> > >
> > > we could fix up all aliases of that page as well and degrade them to UC?
> >
> > Yes, we must fix all aliases or reject the conflicting mapping. But
> > fixing all aliases might not be that easy. (I've just seen a panic
> > when using your patch ;-(
>
> yes, indeed my patch is bad if you have PAT enabled: conflicting cache
> attributes might be present. I'll go with your patch for now.

I think the best is to just reject conflicting mappings. (Because right now
I am too tired to think about a safe way to change the aliases to the
most restrictive memory type. ;-)

But then of course such boot-time problems like I've seen on my test
machines should be avoided somehow.


Andreas


2008-01-17 23:04:42

by Pallipadi, Venkatesh

Subject: Re: [patch 0/4] x86: PAT followup - Incremental changes and bug fixes

On Thu, Jan 17, 2008 at 11:52:43PM +0100, Andreas Herrmann3 wrote:
> On Thu, Jan 17, 2008 at 11:15:05PM +0100, Ingo Molnar wrote:
> >
> > * Andreas Herrmann3 <[email protected]> wrote:
> >
> > > On Thu, Jan 17, 2008 at 10:42:09PM +0100, Ingo Molnar wrote:
> > > >
> > > > * Siddha, Suresh B <[email protected]> wrote:
> > > >
> > > > > On Thu, Jan 17, 2008 at 10:13:08PM +0100, Ingo Molnar wrote:
> > > > > > but in general we must be robust enough in this case and just degrade
> > > > > > any overlapping page to UC (and emit a warning perhaps) - instead of
> > > > > > failing the ioremap and thus failing the driver (and the bootup).
> > > > >
> > > > > But then, this will cause an attribute conflicit. Old one was
> > > > > specifying WB in PAT (ioremap with noflags) and the new ioremap
> > > > > specifies UC.
> > > >
> > > > we could fix up all aliases of that page as well and degrade them to UC?
> > >
> > > Yes, we must fix all aliases or reject the conflicting mapping. But
> > > fixing all aliases might not be that easy. (I've just seen a panic
> > > when using your patch ;-(
> >
> > yes, indeed my patch is bad if you have PAT enabled: conflicting cache
> > attributes might be present. I'll go with your patch for now.
>
> I think the best is to just reject conflicting mappings. (Because now
> I am too tired to think about a safe way how to change the aliases to the
> most restrictive memory type. ;-)
>
> But then of course such boot-time problems like I've seen on my test
> machines should be avoided somehow.
>
>

Below is another potential fix for the problem here. Going through the ACPI
ioremap usages, we found one place where the mapping is cached, possibly as
an optimization, and not unmapped later. The patch below always unmaps the
ioremap at this place in ACPICA.

Thanks,
Venki


Index: linux-2.6.git/drivers/acpi/executer/exregion.c
===================================================================
--- linux-2.6.git.orig/drivers/acpi/executer/exregion.c 2008-01-17 03:18:39.000000000 -0800
+++ linux-2.6.git/drivers/acpi/executer/exregion.c 2008-01-17 07:34:33.000000000 -0800
@@ -48,6 +48,8 @@
#define _COMPONENT ACPI_EXECUTER
ACPI_MODULE_NAME("exregion")

+static int ioremap_cache;
+
/*******************************************************************************
*
* FUNCTION: acpi_ex_system_memory_space_handler
@@ -249,6 +251,13 @@
break;
}

+ if (!ioremap_cache) {
+ acpi_os_unmap_memory(mem_info->mapped_logical_address,
+ window_size);
+ mem_info->mapped_logical_address = 0;
+ mem_info->mapped_physical_address = 0;
+ mem_info->mapped_length = 0;
+ }
return_ACPI_STATUS(status);
}

Subject: Re: [patch 0/4] x86: PAT followup - Incremental changes and bug fixes

On Thu, Jan 17, 2008 at 11:35:51PM +0100, Ingo Molnar wrote:
>
> * Andreas Herrmann3 <[email protected]> wrote:
>
> > > I'll check this asap
> >
> > So, now that I've avoided this tiny NULL-pointer-dereference, the
> > system boots fine as well with your (slightly modified) patch. See
> > dmesg attached.
>
> for now i applied your ioremap_nocache() patch and removed my patch.
>
> my patch might work if the MTRR marks that area UC. Does it on your
> system?

The region (c0403000-c04031ff) is not covered by any MTRR, so its type
comes from MTRRdefType, which is:

MTRRdefType = 0x0000000000000c00
MemType=0
MtrrDefTypeFixEn=0x1
MtrrDefTypeEn=0x1

=> 0==UC

So, that's why it worked.
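
The decoding above can be reproduced with a few shifts and masks. The sketch assumes the usual layout of the default-type register (bits 7:0 default memory type, bit 10 fixed-range enable, bit 11 MTRR enable); the struct and function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decoded fields of the MTRR default-type register (illustrative). */
struct mtrr_def_type {
	unsigned int mem_type;	/* bits 7:0, 0 means UC */
	int fixed_enabled;	/* bit 10 */
	int enabled;		/* bit 11 */
};

/* Split a raw register value into its fields. */
static void decode_mtrr_def_type(uint64_t msr, struct mtrr_def_type *out)
{
	assert(out != NULL);
	out->mem_type = (unsigned int)(msr & 0xff);
	out->fixed_enabled = (int)((msr >> 10) & 1);
	out->enabled = (int)((msr >> 11) & 1);
}
```

Feeding it 0xc00 gives a memory type of 0 with both enable bits set, which is the UC default reported above.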


Andreas


Subject: Re: [patch 0/4] x86: PAT followup - Incremental changes and bug fixes

On Thu, Jan 17, 2008 at 03:04:10PM -0800, Venki Pallipadi wrote:
>
> Below is another potential fix for the problem here. Going through ACPI
> ioremap usages, we found at one place the mapping is cached for possible
> optimization reason and not unmapped later. Patch below always unmaps
> ioremap at this place in ACPICA.
>
> Thanks,
> Venki
>
>
> Index: linux-2.6.git/drivers/acpi/executer/exregion.c
> ===================================================================
> --- linux-2.6.git.orig/drivers/acpi/executer/exregion.c 2008-01-17 03:18:39.000000000 -0800
> +++ linux-2.6.git/drivers/acpi/executer/exregion.c 2008-01-17 07:34:33.000000000 -0800
> @@ -48,6 +48,8 @@
> #define _COMPONENT ACPI_EXECUTER
> ACPI_MODULE_NAME("exregion")
>
> +static int ioremap_cache;
> +
> /*******************************************************************************
> *
> * FUNCTION: acpi_ex_system_memory_space_handler
> @@ -249,6 +251,13 @@
> break;
> }
>
> + if (!ioremap_cache) {
> + acpi_os_unmap_memory(mem_info->mapped_logical_address,
> + window_size);
> + mem_info->mapped_logical_address = 0;
> + mem_info->mapped_physical_address = 0;
> + mem_info->mapped_length = 0;
> + }
> return_ACPI_STATUS(status);
> }
>


Applying and compiling your patch I see:

CC drivers/acpi/executer/exregion.o
drivers/acpi/executer/exregion.c: In function 'acpi_ex_system_memory_space_handler':
drivers/acpi/executer/exregion.c:81: warning: 'window_size' may be used uninitialized in this function


After glancing through this file it seems that ioremap_cache is always 0
and acpi_os_unmap_memory will be executed unconditionally at the end of this function.
I am not familiar with that code, but I just want to make sure that this
is what you want. And if so, why is that variable needed?
But maybe I missed something ...
(I'll test it tomorrow, or I should rather say later today.)


Andreas


2008-01-17 23:41:54

by Pallipadi, Venkatesh

Subject: RE: [patch 0/4] x86: PAT followup - Incremental changes and bug fixes



>-----Original Message-----
>From: Andreas Herrmann3 [mailto:[email protected]]
>Sent: Thursday, January 17, 2008 3:25 PM
>To: Pallipadi, Venkatesh
>Cc: Ingo Molnar; Siddha, Suresh B; [email protected];
>[email protected]; [email protected];
>[email protected]; [email protected];
>[email protected]; [email protected]; [email protected];
>[email protected]; [email protected]; [email protected];
>Barnes, Jesse; [email protected]; [email protected]
>Subject: Re: [patch 0/4] x86: PAT followup - Incremental
>changes and bug fixes
>
>On Thu, Jan 17, 2008 at 03:04:10PM -0800, Venki Pallipadi wrote:
>>
>> Below is another potential fix for the problem here. Going through ACPI
>> ioremap usages, we found at one place the mapping is cached for possible
>> optimization reason and not unmapped later. Patch below always unmaps
>> ioremap at this place in ACPICA.
>>
>> Thanks,
>> Venki
>>
>>
>> Index: linux-2.6.git/drivers/acpi/executer/exregion.c
>> ===================================================================
>> --- linux-2.6.git.orig/drivers/acpi/executer/exregion.c 2008-01-17 03:18:39.000000000 -0800
>> +++ linux-2.6.git/drivers/acpi/executer/exregion.c 2008-01-17 07:34:33.000000000 -0800
>> @@ -48,6 +48,8 @@
>> #define _COMPONENT ACPI_EXECUTER
>> ACPI_MODULE_NAME("exregion")
>>
>> +static int ioremap_cache;
>> +
>>
>> /*******************************************************************************
>> *
>> * FUNCTION: acpi_ex_system_memory_space_handler
>> @@ -249,6 +251,13 @@
>> break;
>> }
>>
>> + if (!ioremap_cache) {
>> + acpi_os_unmap_memory(mem_info->mapped_logical_address,
>> + window_size);
>> + mem_info->mapped_logical_address = 0;
>> + mem_info->mapped_physical_address = 0;
>> + mem_info->mapped_length = 0;
>> + }
>> return_ACPI_STATUS(status);
>> }
>>
>
>
>Applying and compiling your patch I see:
>
> CC drivers/acpi/executer/exregion.o
>drivers/acpi/executer/exregion.c: In function
>'acpi_ex_system_memory_space_handler':
>drivers/acpi/executer/exregion.c:81: warning: 'window_size'
>may be used uninitialized in this function
>
>
>After glancing through this file it seems that ioremap_cache
>is always 0
>and acpi_os_unmap_memory will unconditionally be executed at
>end of this function.
>I am not familiar with that code. But I just want to reinsure that this
>is what you want. And if so, why is that variable needed?
>But maybe I missed something ...

I missed that warning. But it should not matter for testing this patch, as
we always initialize window_size with the patch.

Yes, the variable is not needed. With the patch I always map at the
beginning of this function and unmap at the end. I kept the
variable because I was initially planning to add a boot option to control
this, but later decided to keep the test patch simple, without any
boot option.
We can come up with a better patch once we know that the test patch
helps.

Thanks,
Venki

2008-01-18 04:25:36

by Andi Kleen

Subject: Re: [patch 0/4] x86: PAT followup - Incremental changes and bug fixes

> As Linus mentioned, main problem is to figure out the correct attribute
> for ioremap() which doesn't specify the actual attribute to be used.

In this case the correct attribute is the one of the underlying MTRR.

And if it conflicts with some other mapping that overrides an MTRR
the driver was always broken and it should probably error out and be
reevaluated/fixed.

-Andi

Subject: Re: [patch 0/4] x86: PAT followup - Incremental changes and bug fixes

On Thu, Jan 17, 2008 at 03:04:10PM -0800, Venki Pallipadi wrote:
>
> Below is another potential fix for the problem here. Going through ACPI
> ioremap usages, we found at one place the mapping is cached for possible
> optimization reason and not unmapped later. Patch below always unmaps
> ioremap at this place in ACPICA.

The patch does not fix the problem. The conflicting cache attributes are
still there.

I have done another test after I have added a debug message in iounmap
when a direct mapping gets removed. (See boot-exregion-2.log)
For the address in question this gives:

# grep c04 boot-exregion-2.log

ioremap: addr c0403104, size fc
ioremap: addr c0403184, size 7c
ioremap: addr c040310a, size f6
ioremap: addr c040310a, size f6
ioremap: addr c040310a, size f6
ioremap: addr c040310a, size f6
ioremap: addr c040318a, size 76
ioremap: addr c0403104, size fc
ioremap: addr c0403104, size fc
ioremap: addr c0403184, size 7c
ioremap: addr c0403104, size fc
ioremap: addr c0403184, size 7c
ioremap_nocache: addr c0400000, size 1000
iounmap: addr c0400000, size 1000
ioremap_nocache: addr c0401000, size 1000
iounmap: addr c0401000, size 1000
ioremap_nocache: addr c0402000, size 1000
iounmap: addr c0402000, size 1000
ioremap: addr c0403104, size fc
ioremap: addr c0403184, size 7c
ioremap_nocache: addr c0403000, size 200
swapper:1 conflicting cache attribute c0403000-c0404000 uncached<->default
...
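
The log suggests why the small cached mappings at c0403104 collide with the 0x200-byte uncached mapping at c0403000: the conflict message reports the page-rounded range c0403000-c0404000. A minimal sketch of that rounding, assuming 4 KiB pages and page-granular tracking; struct range, page_span() and ranges_overlap() are hypothetical names, not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE_4K 0x1000ULL

struct range { uint64_t start, end; };	/* half-open: [start, end) */

/* Round a byte range outwards to whole 4 KiB pages. */
static struct range page_span(uint64_t addr, uint64_t size)
{
	struct range r;

	assert(size > 0);
	r.start = addr & ~(PAGE_SIZE_4K - 1);
	r.end = (addr + size + PAGE_SIZE_4K - 1) & ~(PAGE_SIZE_4K - 1);
	return r;
}

/* Two half-open ranges overlap if each starts before the other ends. */
static int ranges_overlap(struct range a, struct range b)
{
	return a.start < b.end && b.start < a.end;
}
```

Under these assumptions both "addr c0403104, size fc" and "addr c0403000, size 200" round to the same page, whereas the earlier ioremap_nocache calls for c0400000 through c0402000 do not touch it.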


Regards,

Andreas


Attachments:
(No filename) (1.40 kB)
boot-exregion-2.log (24.13 kB)

2008-01-18 17:12:29

by Pallipadi, Venkatesh

Subject: RE: [patch 0/4] x86: PAT followup - Incremental changes and bug fixes



>-----Original Message-----
>From: Andreas Herrmann3 [mailto:[email protected]]
>Sent: Friday, January 18, 2008 8:11 AM
>To: Pallipadi, Venkatesh
>Cc: Ingo Molnar; Siddha, Suresh B; [email protected];
>[email protected]; [email protected];
>[email protected]; [email protected];
>[email protected]; [email protected]; [email protected];
>[email protected]; [email protected]; [email protected];
>Barnes, Jesse; [email protected]; [email protected]
>Subject: Re: [patch 0/4] x86: PAT followup - Incremental
>changes and bug fixes
>
>On Thu, Jan 17, 2008 at 03:04:10PM -0800, Venki Pallipadi wrote:
>>
>> Below is another potential fix for the problem here. Going through ACPI
>> ioremap usages, we found at one place the mapping is cached for possible
>> optimization reason and not unmapped later. Patch below always unmaps
>> ioremap at this place in ACPICA.
>
>The patch does not fix the problem. The conflicting cache attributes are
>still there.
>

Andreas,

Could you also try the patch Suresh Siddha sent out yesterday? It
covers the case where the attribute was not getting removed even after
unmap was called.

Thanks,
Venki

2008-01-18 17:33:51

by Balbir Singh

Subject: Re: [patch 0/4] x86: PAT followup - Incremental changes and bug fixes

* Pallipadi, Venkatesh <[email protected]> [2008-01-18 09:13:10]:

>
>
> >-----Original Message-----
> >From: Andreas Herrmann3 [mailto:[email protected]]
> >Sent: Friday, January 18, 2008 8:11 AM
> >To: Pallipadi, Venkatesh
> >Cc: Ingo Molnar; Siddha, Suresh B; [email protected];
> >[email protected]; [email protected];
> >[email protected]; [email protected];
> >[email protected]; [email protected]; [email protected];
> >[email protected]; [email protected]; [email protected];
> >Barnes, Jesse; [email protected]; [email protected]
> >Subject: Re: [patch 0/4] x86: PAT followup - Incremental
> >changes and bug fixes
> >
> >On Thu, Jan 17, 2008 at 03:04:10PM -0800, Venki Pallipadi wrote:
> >>
> >> Below is another potential fix for the problem here. Going through ACPI
> >> ioremap usages, we found at one place the mapping is cached for possible
> >> optimization reason and not unmapped later. Patch below always unmaps
> >> ioremap at this place in ACPICA.
> >
> >The patch does not fix the problem. The conflicting cache attributes are
> >still there.
> >
>
> Andreas,
>
> Could you also try the patch Suresh Siddha sent out yesterday? That
> covers the case where the attribute was not getting removed even after
> unmap was called.
>

An easy way to figure out whether our patch will solve your problem
is to look for any quirks for your device in drivers/pci/quirks.c
and/or the architecture-specific quirks file. If you see your device
there, then our patch is likely to solve your problem.

--
Warm Regards,
Balbir Singh
Linux Technology Center
IBM, ISTL

2008-01-24 20:45:13

by Eric W. Biederman

Subject: Re: [patch 0/4] x86: PAT followup - Incremental changes and bug fixes

"H. Peter Anvin" <[email protected]> writes:

> Siddha, Suresh B wrote:
>>
>> But then, this will cause an attribute conflict. Old one was specifying
>> WB in PAT (ioremap with noflags) and the new ioremap specifies UC.
>>
>> As Linus mentioned, main problem is to figure out the correct attribute
>> for ioremap() which doesn't specify the actual attribute to be used.
>>
>> One mechanism to fix the issue generically (somewhat, at least) is to use
>> MTRRs and figure out the default MTRR attribute for that physical address
>> and use it for ioremap().
>>
>
> This is the matrix the CPU uses when combining MTRR and PAT behaviour. It
> probably makes sense to mimic:
>
>    | WB WT WC UC
> ---+------------
> WB | WB WT WC UC
> WT | WT WT UC UC
> WC | WC UC WC UC
> UC | UC UC UC UC
>
> With the current PAT encoding:
>
> WB = 00
> WT = 01
> WC = 10
> UC = 11
>
> ... this is simply a bitwise OR. This makes sense, since one of the bits denies
> delaying writes (WT, UC), and the other denies delaying reads (WC, UC).

Almost. There is a specific and important case where MTRR UC + page table WC == WC.

But yes. For ioremap where we are WB + MTRR == MTRR we need to request the
same attributes as the e820 map, to get the attribute checking correct.
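
The combination rule discussed above can be sketched in C as follows. This
is only an illustration of the thread's description, not kernel code and not
a complete model of the hardware: the two-bit values are the encoding quoted
above, and the names mem_type, combine_or and combine_mtrr_pat are
hypothetical. combine_or reproduces the quoted matrix as a bitwise OR;
combine_mtrr_pat adds the one exception mentioned here (MTRR UC with page
table WC gives WC).

```c
#include <assert.h>

/* Two-bit encoding as quoted in the thread (assumption of this sketch). */
enum mem_type {
    MT_WB = 0x0,   /* write-back */
    MT_WT = 0x1,   /* write-through: bit 0 denies delayed writes */
    MT_WC = 0x2,   /* write-combining: bit 1 denies delayed reads */
    MT_UC = 0x3    /* uncached: both bits set */
};

/* The quoted matrix: combining two types is a bitwise OR of their bits. */
static enum mem_type combine_or(enum mem_type a, enum mem_type b)
{
    assert((a & ~0x3) == 0 && (b & ~0x3) == 0);
    return (enum mem_type)(a | b);
}

/* Same rule, plus the exception raised in the reply:
 * MTRR UC combined with page-table WC yields WC. */
static enum mem_type combine_mtrr_pat(enum mem_type mtrr, enum mem_type pat)
{
    if (mtrr == MT_UC && pat == MT_WC)
        return MT_WC;
    return combine_or(mtrr, pat);
}
```

Note that the exception makes the rule asymmetric: combine_mtrr_pat(MT_UC,
MT_WC) is WC, while combine_mtrr_pat(MT_WC, MT_UC) is still UC.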

Eric

2008-01-24 21:42:06

by H. Peter Anvin

Subject: Re: [patch 0/4] x86: PAT followup - Incremental changes and bug fixes

Eric W. Biederman wrote:
>>
>>    | WB WT WC UC
>> ---+------------
>> WB | WB WT WC UC
>> WT | WT WT UC UC
>> WC | WC UC WC UC
>> UC | UC UC UC UC
>>
>> With the current PAT encoding:
>>
>> WB = 00
>> WT = 01
>> WC = 10
>> UC = 11
>>
>> ... this is simply a bitwise OR. This makes sense, since one of the bits denies
>> delaying writes (WT, UC), and the other denies delaying reads (WC, UC).
>
> Almost. There is a specific and important case where MTRR UC + page table WC == WC.
>
> But yes. For ioremap where we are WB + MTRR == MTRR we need to request the
> same attributes as the e820 map, to get the attribute checking correct.
>

True; however, that shouldn't be followed for the case of conflicting
attempts at mapping.

Now, I *believe* it is safe to have some mappings UC and some WC. This
is also something to keep in mind (there are legitimate applications for
that particular form of aliasing, too.) If so, we may not want to thump
at those.
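
One way to express that policy is a small compatibility check, sketched here
in C. It assumes the belief above holds (that UC and WC aliases of the same
range are safe) and treats every other mismatch as a conflict; cache_type
and alias_allowed are hypothetical names, not an existing kernel interface.

```c
#include <assert.h>
#include <stdbool.h>

enum cache_type { CT_WB, CT_WT, CT_WC, CT_UC };

/* WC and UC both bypass the cache, which is why mixing them is assumed safe. */
static bool bypasses_cache(enum cache_type t)
{
    return t == CT_WC || t == CT_UC;
}

/* May a new mapping of type `want` alias an existing mapping of type `have`?
 * Identical types always may; UC and WC may alias each other; anything
 * else is reported as a conflict. */
static bool alias_allowed(enum cache_type have, enum cache_type want)
{
    assert(have <= CT_UC && want <= CT_UC);
    if (have == want)
        return true;
    return bypasses_cache(have) && bypasses_cache(want);
}
```

With such a check, a WB mapping followed by a UC request would still be
flagged, while a UC mapping followed by a WC request would pass silently.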

-hpa