2008-02-15 19:55:37

by Ingo Molnar

Subject: [patch 0/3] x86_64: CPA, fix cache attribute inconsistency bug


the following 3 patches are the backports to 2.6.22, 2.6.23 and 2.6.24
of a fix for a CPA bug we found while writing 2.6.25's new CPA code.

the bug affects the 64-bit kernel on two popular Linux distros, Fedora
and Ubuntu:

$ grep RELOCA /boot/config-2.6.23.9-85.fc8
CONFIG_RELOCATABLE=y

$ grep RELOC /boot/config-2.6.22-14-generic
CONFIG_RELOCATABLE=y

haven't checked more.

(There are no upstream commits for this because the affected code is no
longer present in the new CPA code.)

Ingo


2008-02-15 19:59:05

by Ingo Molnar

Subject: [patch 1/3] x86_64: CPA, fix cache attribute inconsistency bug, v2.6.22 backport


fix CPA cache attribute bug in v2.6.23. When phys_base is nonzero
(when CONFIG_RELOCATABLE=y) then change_page_attr_addr() miscalculates
the secondary alias address by -14 MB (depending on the configured
offset).

The default 64-bit kernels of Fedora and Ubuntu are affected:

$ grep RELOCA /boot/config-2.6.23.9-85.fc8
CONFIG_RELOCATABLE=y

$ grep RELOC /boot/config-2.6.22-14-generic
CONFIG_RELOCATABLE=y

and probably on many other distros as well.

the bug affects all pages in the first 40 MB of physical RAM that
are allocated by some subsystem that does ioremap_nocache() on them:

if (__pa(address) < KERNEL_TEXT_SIZE) {

Hence we might leave page table entries with inconsistent cache
attributes around (pages mapped at both UnCacheable and Write-Back),
and we can also set the wrong kernel text pages to UnCacheable.

the effects of this bug can be random slowdowns and other misbehavior.
If for example AGP allocates its aperture pages into the first 40 MB
of physical RAM, then the -14 MB bug might mark random kernel text
pages as uncacheable, slowing down a random portion of the 64-bit
kernel until the AGP driver is unloaded.

Signed-off-by: Ingo Molnar <[email protected]>
Acked-by: Thomas Gleixner <[email protected]>
---
arch/x86_64/mm/pageattr.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux-tmp/arch/x86_64/mm/pageattr.c
===================================================================
--- linux-tmp.orig/arch/x86_64/mm/pageattr.c
+++ linux-tmp/arch/x86_64/mm/pageattr.c
@@ -204,7 +204,7 @@ int change_page_attr_addr(unsigned long
if (__pa(address) < KERNEL_TEXT_SIZE) {
unsigned long addr2;
pgprot_t prot2;
- addr2 = __START_KERNEL_map + __pa(address);
+ addr2 = __START_KERNEL_map + __pa(address) - phys_base;
/* Make sure the kernel mappings stay executable */
prot2 = pte_pgprot(pte_mkexec(pfn_pte(0, prot)));
err = __change_page_attr(addr2, pfn, prot2,
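A user-space sketch of the address arithmetic touched by the one-line change above, not kernel code. The constants and the helper names (model_pa, alias_before, alias_after) are assumptions made for illustration: the layout values are modelled on x86_64 kernels of that era, and model_pa() is a simplified stand-in for __pa() that adds phys_base only for addresses in the kernel text mapping.

```c
#include <assert.h>
#include <stdint.h>

/* Layout constants modelled on x86_64 kernels of that era (assumed values). */
#define MODEL_START_KERNEL_MAP 0xffffffff80000000ULL
#define MODEL_PAGE_OFFSET      0xffff810000000000ULL

/* Simplified stand-in for __pa(): only text-mapping addresses add phys_base. */
static uint64_t model_pa(uint64_t vaddr, uint64_t phys_base)
{
    if (vaddr >= MODEL_START_KERNEL_MAP)
        return vaddr - MODEL_START_KERNEL_MAP + phys_base;
    assert(vaddr >= MODEL_PAGE_OFFSET);   /* otherwise a direct-map address */
    return vaddr - MODEL_PAGE_OFFSET;
}

/* Secondary alias as computed before the patch. */
static uint64_t alias_before(uint64_t address, uint64_t phys_base)
{
    return MODEL_START_KERNEL_MAP + model_pa(address, phys_base);
}

/* Secondary alias as computed after the patch. */
static uint64_t alias_after(uint64_t address, uint64_t phys_base)
{
    return MODEL_START_KERNEL_MAP + model_pa(address, phys_base) - phys_base;
}
```

In this model the two results differ by exactly phys_base, and only alias_after() translates back to the same physical address as the direct-map address it was derived from. With phys_base == 0 both functions agree, which matches the description that only relocated kernels are affected.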

2008-02-15 19:59:28

by Ingo Molnar

Subject: [patch 2/3] x86_64: CPA, fix cache attribute inconsistency bug, v2.6.23 backport


fix CPA cache attribute bug in v2.6.23. When phys_base is nonzero
(when CONFIG_RELOCATABLE=y) then change_page_attr_addr() miscalculates
the secondary alias address by -14 MB (depending on the configured
offset).

The default 64-bit kernels of Fedora and Ubuntu are affected:

$ grep RELOCA /boot/config-2.6.23.9-85.fc8
CONFIG_RELOCATABLE=y

$ grep RELOC /boot/config-2.6.22-14-generic
CONFIG_RELOCATABLE=y

and probably on many other distros as well.

the bug affects all pages in the first 40 MB of physical RAM that
are allocated by some subsystem that does ioremap_nocache() on them:

if (__pa(address) < KERNEL_TEXT_SIZE) {

Hence we might leave page table entries with inconsistent cache
attributes around (pages mapped at both UnCacheable and Write-Back),
and we can also set the wrong kernel text pages to UnCacheable.

the effects of this bug can be random slowdowns and other misbehavior.
If for example AGP allocates its aperture pages into the first 40 MB
of physical RAM, then the -14 MB bug might mark random kernel text
pages as uncacheable, slowing down a random portion of the 64-bit
kernel until the AGP driver is unloaded.

Signed-off-by: Ingo Molnar <[email protected]>
Acked-by: Thomas Gleixner <[email protected]>
---
arch/x86_64/mm/pageattr.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux-tmp/arch/x86_64/mm/pageattr.c
===================================================================
--- linux-tmp.orig/arch/x86_64/mm/pageattr.c
+++ linux-tmp/arch/x86_64/mm/pageattr.c
@@ -206,7 +206,7 @@ int change_page_attr_addr(unsigned long
if (__pa(address) < KERNEL_TEXT_SIZE) {
unsigned long addr2;
pgprot_t prot2;
- addr2 = __START_KERNEL_map + __pa(address);
+ addr2 = __START_KERNEL_map + __pa(address) - phys_base;
/* Make sure the kernel mappings stay executable */
prot2 = pte_pgprot(pte_mkexec(pfn_pte(0, prot)));
err = __change_page_attr(addr2, pfn, prot2,

2008-02-15 20:00:31

by Ingo Molnar

Subject: [patch 3/3] x86_64: CPA, fix cache attribute inconsistency bug, v2.6.24 backport


fix CPA cache attribute bug in v2.6.24. When phys_base is nonzero (when
CONFIG_RELOCATABLE=y) then change_page_attr_addr() miscalculates the
secondary alias address by -14 MB (depending on the configured offset).

The default 64-bit kernels of Fedora and Ubuntu are affected:

$ grep RELOCA /boot/config-2.6.23.9-85.fc8
CONFIG_RELOCATABLE=y

$ grep RELOC /boot/config-2.6.22-14-generic
CONFIG_RELOCATABLE=y

and probably on many other distros as well.

the bug affects all pages in the first 40 MB of physical RAM that
are allocated by some subsystem that does ioremap_nocache() on them:

if (__pa(address) < KERNEL_TEXT_SIZE) {

Hence we might leave page table entries with inconsistent cache
attributes around (pages mapped at both UnCacheable and Write-Back),
and we can also set the wrong kernel text pages to UnCacheable.

the effects of this bug can be random slowdowns and other misbehavior.
If for example AGP allocates its aperture pages into the first 40 MB
of physical RAM, then the -14 MB bug might mark random kernel text
pages as uncacheable, slowing down a random portion of the 64-bit
kernel until the AGP driver is unloaded.

Signed-off-by: Ingo Molnar <[email protected]>
Acked-by: Thomas Gleixner <[email protected]>
---
arch/x86/mm/pageattr_64.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

Index: linux-tmp/arch/x86/mm/pageattr_64.c
===================================================================
--- linux-tmp.orig/arch/x86/mm/pageattr_64.c
+++ linux-tmp/arch/x86/mm/pageattr_64.c
@@ -207,7 +207,7 @@ int change_page_attr_addr(unsigned long
if (__pa(address) < KERNEL_TEXT_SIZE) {
unsigned long addr2;
pgprot_t prot2;
- addr2 = __START_KERNEL_map + __pa(address);
+ addr2 = __START_KERNEL_map + __pa(address) - phys_base;
/* Make sure the kernel mappings stay executable */
prot2 = pte_pgprot(pte_mkexec(pfn_pte(0, prot)));
err = __change_page_attr(addr2, pfn, prot2,
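To illustrate the "inconsistent cache attributes" symptom described in the changelog, here is a toy model, not kernel code. Everything in it is hypothetical: two small arrays stand in for the direct mapping and the kernel text mapping, sizes are counted in pages, and the `fixed` flag selects between the old and the patched index calculation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum cache_attr { ATTR_WB, ATTR_UC };

#define TEXT_PAGES 10   /* toy stand-in for KERNEL_TEXT_SIZE, in pages */
#define RAM_PAGES  32   /* toy amount of physical RAM, in pages */

struct toy_tables {
    size_t phys_base_pfn;               /* load offset of the kernel, in pages */
    enum cache_attr direct[RAM_PAGES];  /* direct map: index == pfn */
    enum cache_attr text[TEXT_PAGES];   /* text map: index == pfn - phys_base_pfn */
};

static void toy_init(struct toy_tables *t, size_t phys_base_pfn)
{
    t->phys_base_pfn = phys_base_pfn;
    for (size_t i = 0; i < RAM_PAGES; i++)
        t->direct[i] = ATTR_WB;
    for (size_t i = 0; i < TEXT_PAGES; i++)
        t->text[i] = ATTR_WB;
}

/* Mark one page uncacheable in the direct map and in its text alias. */
static void toy_set_uc(struct toy_tables *t, size_t pfn, bool fixed)
{
    assert(pfn < RAM_PAGES);
    t->direct[pfn] = ATTR_UC;
    if (pfn >= TEXT_PAGES)                /* mirrors __pa(address) < KERNEL_TEXT_SIZE */
        return;
    if (fixed && pfn < t->phys_base_pfn)  /* page lies below the text mapping */
        return;
    t->text[fixed ? pfn - t->phys_base_pfn : pfn] = ATTR_UC;
}

/* True if every page visible through both maps has the same attribute. */
static bool toy_consistent(const struct toy_tables *t)
{
    for (size_t i = 0; i < TEXT_PAGES; i++) {
        size_t pfn = t->phys_base_pfn + i;
        if (pfn < RAM_PAGES && t->direct[pfn] != t->text[i])
            return false;
    }
    return true;
}
```

With a nonzero phys_base_pfn, calling toy_set_uc() with fixed == false leaves the target page uncacheable in one map and write-back in the other, and marks an unrelated text entry uncacheable. With fixed == true the two maps stay in agreement.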

2008-02-15 20:01:25

by Ingo Molnar

Subject: Re: [patch 1/3] x86_64: CPA, fix cache attribute inconsistency bug, v2.6.22 backport


correction:

> fix CPA cache attribute bug in v2.6.23. When phys_base is nonzero
v2.6.22

(the text says .23 but the fix is correctly against .22)

Ingo

2008-02-15 20:41:41

by Oliver Pinter

Subject: Re: [patch 1/3] x86_64: CPA, fix cache attribute inconsistency bug, v2.6.22 backport

hi Ingo,

thx, added to http://repo.or.cz/w/linux-2.6.22.y-op.git testing ;)


On 2/15/08, Ingo Molnar <[email protected]> wrote:
>
> fix CPA cache attribute bug in v2.6.23. When phys_base is nonzero
> (when CONFIG_RELOCATABLE=y) then change_page_attr_addr() miscalculates
> the secondary alias address by -14 MB (depending on the configured
> offset).
>
> The default 64-bit kernels of Fedora and Ubuntu are affected:
>
> $ grep RELOCA /boot/config-2.6.23.9-85.fc8
> CONFIG_RELOCATABLE=y
>
> $ grep RELOC /boot/config-2.6.22-14-generic
> CONFIG_RELOCATABLE=y
>
> and probably on many other distros as well.
>
> the bug affects all pages in the first 40 MB of physical RAM that
> are allocated by some subsystem that does ioremap_nocache() on them:
>
> if (__pa(address) < KERNEL_TEXT_SIZE) {
>
> Hence we might leave page table entries with inconsistent cache
> attributes around (pages mapped at both UnCacheable and Write-Back),
> and we can also set the wrong kernel text pages to UnCacheable.
>
> the effects of this bug can be random slowdowns and other misbehavior.
> If for example AGP allocates its aperture pages into the first 40 MB
> of physical RAM, then the -14 MB bug might mark random kernel text
> pages as uncacheable, slowing down a random portion of the 64-bit
> kernel until the AGP driver is unloaded.
>
> Signed-off-by: Ingo Molnar <[email protected]>
> Acked-by: Thomas Gleixner <[email protected]>
> ---
> arch/x86_64/mm/pageattr.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> Index: linux-tmp/arch/x86_64/mm/pageattr.c
> ===================================================================
> --- linux-tmp.orig/arch/x86_64/mm/pageattr.c
> +++ linux-tmp/arch/x86_64/mm/pageattr.c
> @@ -204,7 +204,7 @@ int change_page_attr_addr(unsigned long
> if (__pa(address) < KERNEL_TEXT_SIZE) {
> unsigned long addr2;
> pgprot_t prot2;
> - addr2 = __START_KERNEL_map + __pa(address);
> + addr2 = __START_KERNEL_map + __pa(address) - phys_base;
> /* Make sure the kernel mappings stay executable */
> prot2 = pte_pgprot(pte_mkexec(pfn_pte(0, prot)));
> err = __change_page_attr(addr2, pfn, prot2,
>
--
Thanks,
Oliver

2008-02-20 20:31:46

by Vivek Goyal

Subject: Re: [patch 1/3] x86_64: CPA, fix cache attribute inconsistency bug, v2.6.22 backport

On Fri, Feb 15, 2008 at 08:58:22PM +0100, Ingo Molnar wrote:
>
> fix CPA cache attribute bug in v2.6.23. When phys_base is nonzero
> (when CONFIG_RELOCATABLE=y) then change_page_attr_addr() miscalculates
> the secondary alias address by -14 MB (depending on the configured
> offset).
>
> The default 64-bit kernels of Fedora and Ubuntu are affected:
>
> $ grep RELOCA /boot/config-2.6.23.9-85.fc8
> CONFIG_RELOCATABLE=y
>
> $ grep RELOC /boot/config-2.6.22-14-generic
> CONFIG_RELOCATABLE=y
>
> and probably on many other distros as well.
>
> the bug affects all pages in the first 40 MB of physical RAM that
> are allocated by some subsystem that does ioremap_nocache() on them:
>
> if (__pa(address) < KERNEL_TEXT_SIZE) {
>
> Hence we might leave page table entries with inconsistent cache
> attributes around (pages mapped at both UnCacheable and Write-Back),
> and we can also set the wrong kernel text pages to UnCacheable.
>
> the effects of this bug can be random slowdowns and other misbehavior.
> If for example AGP allocates its aperture pages into the first 40 MB
> of physical RAM, then the -14 MB bug might mark random kernel text
> pages as uncacheable, slowing down a random portion of the 64-bit
> kernel until the AGP driver is unloaded.
>
> Signed-off-by: Ingo Molnar <[email protected]>
> Acked-by: Thomas Gleixner <[email protected]>
> ---
> arch/x86_64/mm/pageattr.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> Index: linux-tmp/arch/x86_64/mm/pageattr.c
> ===================================================================
> --- linux-tmp.orig/arch/x86_64/mm/pageattr.c
> +++ linux-tmp/arch/x86_64/mm/pageattr.c
> @@ -204,7 +204,7 @@ int change_page_attr_addr(unsigned long
> if (__pa(address) < KERNEL_TEXT_SIZE) {

Hi Ingo,

Should we also change the above condition to something like the following?

kernel_phys_start = __pa(__START_KERNEL_map) + phys_base
kernel_phys_end = kernel_phys_start + KERNEL_TEXT_SIZE

if (__pa(address) >= kernel_phys_start
&& __pa(address) <= kernel_phys_end)

It looks like we are checking whether a physical page is also mapped by
the kernel text/data region, and if so, changing the caching attributes
there too.

With a relocatable kernel, the first 40MB will not necessarily be mapped
by the kernel text/data region if the boot loader decides to load the
kernel at a higher address and phys_base is non-zero.

Thanks
Vivek

2008-02-20 20:45:42

by Vivek Goyal

Subject: Re: [patch 1/3] x86_64: CPA, fix cache attribute inconsistency bug, v2.6.22 backport

On Wed, Feb 20, 2008 at 03:30:18PM -0500, Vivek Goyal wrote:
> On Fri, Feb 15, 2008 at 08:58:22PM +0100, Ingo Molnar wrote:
> >
> > fix CPA cache attribute bug in v2.6.23. When phys_base is nonzero
> > (when CONFIG_RELOCATABLE=y) then change_page_attr_addr() miscalculates
> > the secondary alias address by -14 MB (depending on the configured
> > offset).
> >
> > The default 64-bit kernels of Fedora and Ubuntu are affected:
> >
> > $ grep RELOCA /boot/config-2.6.23.9-85.fc8
> > CONFIG_RELOCATABLE=y
> >
> > $ grep RELOC /boot/config-2.6.22-14-generic
> > CONFIG_RELOCATABLE=y
> >
> > and probably on many other distros as well.
> >
> > the bug affects all pages in the first 40 MB of physical RAM that
> > are allocated by some subsystem that does ioremap_nocache() on them:
> >
> > if (__pa(address) < KERNEL_TEXT_SIZE) {
> >
> > Hence we might leave page table entries with inconsistent cache
> > attributes around (pages mapped at both UnCacheable and Write-Back),
> > and we can also set the wrong kernel text pages to UnCacheable.
> >
> > the effects of this bug can be random slowdowns and other misbehavior.
> > If for example AGP allocates its aperture pages into the first 40 MB
> > of physical RAM, then the -14 MB bug might mark random kernel text
> > pages as uncacheable, slowing down a random portion of the 64-bit
> > kernel until the AGP driver is unloaded.
> >
> > Signed-off-by: Ingo Molnar <[email protected]>
> > Acked-by: Thomas Gleixner <[email protected]>
> > ---
> > arch/x86_64/mm/pageattr.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > Index: linux-tmp/arch/x86_64/mm/pageattr.c
> > ===================================================================
> > --- linux-tmp.orig/arch/x86_64/mm/pageattr.c
> > +++ linux-tmp/arch/x86_64/mm/pageattr.c
> > @@ -204,7 +204,7 @@ int change_page_attr_addr(unsigned long
> > if (__pa(address) < KERNEL_TEXT_SIZE) {
>
> Hi Ingo,
>
> Should we also change the above condition to something like the following?
>
> kernel_phys_start = __pa(__START_KERNEL_map) + phys_base


Oops. Just noticed that __pa() already takes care of adding phys_base.
So it should probably be:

kernel_phys_start = __pa(__START_KERNEL_map);
kernel_phys_end = kernel_phys_start + KERNEL_TEXT_SIZE;

if (__pa(address) >= kernel_phys_start
&& __pa(address) <= kernel_phys_end)

Thanks
Vivek
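
The proposed range check can be compared with the existing condition in a small user-space sketch, not kernel code. The function names are made up for illustration, and two things are assumptions of the sketch rather than statements from the thread: kernel_phys_start is taken to be phys_base (per the note that __pa() already adds it), and the window is treated as half-open, so the upper comparison is `<` where the mail writes `<=`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_KERNEL_TEXT_SIZE (40ULL << 20)   /* 40 MB, as in the thread */

/* Condition as it stands in the patched code: the first 40 MB of RAM. */
static bool in_text_range_current(uint64_t phys)
{
    return phys < MODEL_KERNEL_TEXT_SIZE;
}

/* Proposed condition: a 40 MB window that starts where the kernel was loaded. */
static bool in_text_range_proposed(uint64_t phys, uint64_t phys_base)
{
    uint64_t kernel_phys_start = phys_base;   /* assumed == __pa(__START_KERNEL_map) */
    uint64_t kernel_phys_end = kernel_phys_start + MODEL_KERNEL_TEXT_SIZE;

    return phys >= kernel_phys_start && phys < kernel_phys_end;
}
```

For a hypothetical phys_base of 16 MB, a page at 8 MB passes the current check but not the proposed one, while a page at 50 MB does the opposite. With phys_base == 0 the two checks accept the same pages.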