2008-07-22 04:32:22

by Rusty Russell

[permalink] [raw]
Subject: [PATCH] x86: fix pte_flags() to only return flags, fix lguest.

Change a15af1c9ea2750a9ff01e51615c45950bad8221b 'x86/paravirt: add
pte_flags to just get pte flags' removed lguest's private pte_flags()
in favor of a generic one.

Unfortunately, the generic one doesn't filter out the non-flags bits:
this results in lguest creating corrupt shadow page tables and blowing
up host memory.

Since noone is supposed to use the pfn part of pte_flags(), it seems
safest to always do the filtering.

Cc: Jeremy Fitzhardinge <[email protected]>
Signed-off-by: Rusty Russell <[email protected]>

diff -r ee1a6adad3d2 arch/x86/kernel/paravirt.c
--- a/arch/x86/kernel/paravirt.c Mon Jul 21 12:49:25 2008 +1000
+++ b/arch/x86/kernel/paravirt.c Tue Jul 22 14:09:33 2008 +1000
@@ -428,7 +428,7 @@ struct pv_mmu_ops pv_mmu_ops = {
#endif /* PAGETABLE_LEVELS >= 3 */

.pte_val = native_pte_val,
- .pte_flags = native_pte_val,
+ .pte_flags = native_pte_flags,
.pgd_val = native_pgd_val,

.make_pte = native_make_pte,
diff -r ee1a6adad3d2 include/asm-x86/page.h
--- a/include/asm-x86/page.h Mon Jul 21 12:49:25 2008 +1000
+++ b/include/asm-x86/page.h Tue Jul 22 14:09:33 2008 +1000
@@ -144,6 +144,18 @@ static inline pteval_t native_pte_val(pt
return pte.pte;
}

+/* This belongs in pgtable.h, but our includes are too much of a mess. */
+#if defined(CONFIG_X86_64) || defined(CONFIG_X86_PAE)
+#define PTE_FLAGS _AC(0x8000000000000FFF, ULL)
+#else
+#define PTE_FLAGS 0x00000FFF
+#endif
+
+static inline pteval_t native_pte_flags(pte_t pte)
+{
+ return native_pte_val(pte) & PTE_FLAGS;
+}
+
#define pgprot_val(x) ((x).pgprot)
#define __pgprot(x) ((pgprot_t) { (x) } )

@@ -165,7 +177,7 @@ static inline pteval_t native_pte_val(pt
#endif

#define pte_val(x) native_pte_val(x)
-#define pte_flags(x) native_pte_val(x)
+#define pte_flags(x) native_pte_flags(x)
#define __pte(x) native_make_pte(x)

#endif /* CONFIG_PARAVIRT */
diff -r ee1a6adad3d2 include/asm-x86/paravirt.h
--- a/include/asm-x86/paravirt.h Mon Jul 21 12:49:25 2008 +1000
+++ b/include/asm-x86/paravirt.h Tue Jul 22 14:09:33 2008 +1000
@@ -1083,6 +1083,9 @@ static inline pteval_t pte_flags(pte_t p
ret = PVOP_CALL1(pteval_t, pv_mmu_ops.pte_flags,
pte.pte);

+#ifdef CONFIG_PARAVIRT_DEBUG
+ BUG_ON(ret & ~PTE_FLAGS);
+#endif
return ret;
}


2008-07-22 04:38:20

by Stephen Rothwell

[permalink] [raw]
Subject: Re: [PATCH] x86: fix pte_flags() to only return flags, fix lguest.

Hi Rusty,

On Tue, 22 Jul 2008 14:31:58 +1000 Rusty Russell <[email protected]> wrote:
>
> +++ b/include/asm-x86/page.h Tue Jul 22 14:09:33 2008 +1000
> @@ -144,6 +144,18 @@ static inline pteval_t native_pte_val(pt
> return pte.pte;
> }
>
> +/* This belongs in pgtable.h, but our includes are too much of a mess. */
> +#if defined(CONFIG_X86_64) || defined(CONFIG_X86_PAE)
> +#define PTE_FLAGS _AC(0x8000000000000FFF, ULL)

Just a small point: You don't actually need the _AC() because this is
already in a #ifndef __ASSEMBLY__ section of the file. But better safe
than sorry, I guess. :-)

--
Cheers,
Stephen Rothwell [email protected]
http://www.canb.auug.org.au/~sfr/


Attachments:
(No filename) (706.00 B)
(No filename) (197.00 B)
Download all attachments

2008-07-22 04:50:09

by Jeremy Fitzhardinge

[permalink] [raw]
Subject: Re: [PATCH] x86: fix pte_flags() to only return flags, fix lguest.

Rusty Russell wrote:
> Change a15af1c9ea2750a9ff01e51615c45950bad8221b 'x86/paravirt: add
> pte_flags to just get pte flags' removed lguest's private pte_flags()
> in favor of a generic one.
>
> Unfortunately, the generic one doesn't filter out the non-flags bits:
> this results in lguest creating corrupt shadow page tables and blowing
> up host memory.
>
> Since noone is supposed to use the pfn part of pte_flags(), it seems
> safest to always do the filtering.
>

Thinking about this, I wonder if it needs to be a pv_op at all.
Generality says "yes", but there are no users which set it to anything
other than native_pte_flags. The point of it is to return the flags
as-is, without needing more complex stuff (like Xen's mfn to pfn
conversion).

In most cases, the surrounding code will be applying its own mask
anyway, so making it a simple inline will allow all the masks to get
folded together.

If a user which wants to do something other than return the bare flags,
it would be easy enough to put the pv_op back.

> Cc: Jeremy Fitzhardinge <[email protected]>
> Signed-off-by: Rusty Russell <[email protected]>
>
> diff -r ee1a6adad3d2 arch/x86/kernel/paravirt.c
> --- a/arch/x86/kernel/paravirt.c Mon Jul 21 12:49:25 2008 +1000
> +++ b/arch/x86/kernel/paravirt.c Tue Jul 22 14:09:33 2008 +1000
> @@ -428,7 +428,7 @@ struct pv_mmu_ops pv_mmu_ops = {
> #endif /* PAGETABLE_LEVELS >= 3 */
>
> .pte_val = native_pte_val,
> - .pte_flags = native_pte_val,
> + .pte_flags = native_pte_flags,
> .pgd_val = native_pgd_val,
>
> .make_pte = native_make_pte,
> diff -r ee1a6adad3d2 include/asm-x86/page.h
> --- a/include/asm-x86/page.h Mon Jul 21 12:49:25 2008 +1000
> +++ b/include/asm-x86/page.h Tue Jul 22 14:09:33 2008 +1000
> @@ -144,6 +144,18 @@ static inline pteval_t native_pte_val(pt
> return pte.pte;
> }
>
> +/* This belongs in pgtable.h, but our includes are too much of a mess. */
> +#if defined(CONFIG_X86_64) || defined(CONFIG_X86_PAE)
> +#define PTE_FLAGS _AC(0x8000000000000FFF, ULL)
> +#else
> +#define PTE_FLAGS 0x00000FFF
> +#endif
>
We already have this, it's called PTE_MASK.

J

2008-07-22 04:51:34

by Jeremy Fitzhardinge

[permalink] [raw]
Subject: Re: [PATCH] x86: fix pte_flags() to only return flags, fix lguest.

Stephen Rothwell wrote:
> Just a small point: You don't actually need the _AC() because this is
> already in a #ifndef __ASSEMBLY__ section of the file. But better safe
> than sorry, I guess. :-)
>

It's doubly wrong anyway. The correct thing to use here would be
_AT(pteval_t, 0x80....FFF), but even more correct to use PTE_MASK which
is already defined to have precisely this meaning.

J

2008-07-22 04:53:08

by Rusty Russell

[permalink] [raw]
Subject: Re: [PATCH] x86: fix pte_flags() to only return flags, fix lguest.

On Tuesday 22 July 2008 14:38:00 Stephen Rothwell wrote:
> Hi Rusty,
>
> On Tue, 22 Jul 2008 14:31:58 +1000 Rusty Russell <[email protected]>
wrote:
> > +/* This belongs in pgtable.h, but our includes are too much of a mess.
> > */ +#if defined(CONFIG_X86_64) || defined(CONFIG_X86_PAE)
> > +#define PTE_FLAGS _AC(0x8000000000000FFF, ULL)
>
> Just a small point: You don't actually need the _AC() because this is
> already in a #ifndef __ASSEMBLY__ section of the file. But better safe
> than sorry, I guess. :-)

Ah, yes, I had them in pgtable.h before I gave up. I'll leave it in: if
someone wants them in asm they can now safely move it.

Thanks,
Rusty.

2008-07-22 05:49:48

by Rusty Russell

[permalink] [raw]
Subject: [PATCH] x86: fix pte_flags() to only return flags, fix lguest (updated)

(Jeremy said:
rusty: use PTE_MASK
rusty: use PTE_MASK
rusty: use PTE_MASK
When I asked:
jsgf: does that include the NX flag?
He responded eloquently:
rusty: use PTE_MASK
rusty: use PTE_MASK
yes, it's the official constant of masking flags out of ptes
)

Change a15af1c9ea2750a9ff01e51615c45950bad8221b 'x86/paravirt: add
pte_flags to just get pte flags' removed lguest's private pte_flags()
in favor of a generic one.

Unfortunately, the generic one doesn't filter out the non-flags bits:
this results in lguest creating corrupt shadow page tables and blowing
up host memory.

Since noone is supposed to use the pfn part of pte_flags(), it seems
safest to always do the filtering.

Cc: Jeremy Fitzhardinge <[email protected]>
Signed-off-by: Rusty Russell <[email protected]>

diff -r ee1a6adad3d2 arch/x86/kernel/paravirt.c
--- a/arch/x86/kernel/paravirt.c Mon Jul 21 12:49:25 2008 +1000
+++ b/arch/x86/kernel/paravirt.c Tue Jul 22 15:31:04 2008 +1000
@@ -428,7 +428,7 @@ struct pv_mmu_ops pv_mmu_ops = {
#endif /* PAGETABLE_LEVELS >= 3 */

.pte_val = native_pte_val,
- .pte_flags = native_pte_val,
+ .pte_flags = native_pte_flags,
.pgd_val = native_pgd_val,

.make_pte = native_make_pte,
diff -r ee1a6adad3d2 include/asm-x86/page.h
--- a/include/asm-x86/page.h Mon Jul 21 12:49:25 2008 +1000
+++ b/include/asm-x86/page.h Tue Jul 22 15:31:04 2008 +1000
@@ -144,6 +144,11 @@ static inline pteval_t native_pte_val(pt
return pte.pte;
}

+static inline pteval_t native_pte_flags(pte_t pte)
+{
+ return native_pte_val(pte) & ~PTE_MASK;
+}
+
#define pgprot_val(x) ((x).pgprot)
#define __pgprot(x) ((pgprot_t) { (x) } )

@@ -165,7 +170,7 @@ static inline pteval_t native_pte_val(pt
#endif

#define pte_val(x) native_pte_val(x)
-#define pte_flags(x) native_pte_val(x)
+#define pte_flags(x) native_pte_flags(x)
#define __pte(x) native_make_pte(x)

#endif /* CONFIG_PARAVIRT */
diff -r ee1a6adad3d2 include/asm-x86/paravirt.h
--- a/include/asm-x86/paravirt.h Mon Jul 21 12:49:25 2008 +1000
+++ b/include/asm-x86/paravirt.h Tue Jul 22 15:31:04 2008 +1000
@@ -1083,6 +1083,9 @@ static inline pteval_t pte_flags(pte_t p
ret = PVOP_CALL1(pteval_t, pv_mmu_ops.pte_flags,
pte.pte);

+#ifdef CONFIG_PARAVIRT_DEBUG
+ BUG_ON(ret & PTE_MASK);
+#endif
return ret;
}

2008-07-22 06:00:13

by Jeremy Fitzhardinge

[permalink] [raw]
Subject: [PATCH 1/2] x86: rename PTE_MASK to PTE_PFN_MASK

Rusty, in his peevish way, complained that macros defining constants
should have a name which somewhat accurately reflects the actual
purpose of the constant.

Aside from the fact that PTE_MASK gives no clue as to what's actually
being masked, and is misleadingly similar to the functionally entirely
different PMD_MASK, PUD_MASK and PGD_MASK, I don't really see what the
problem is.

But if this patch silences the incessent noise, then it will have
achieved its goal (TODO: write test-case).

Signed-off-by: Jeremy Fitzhardinge <[email protected]>
---
arch/x86/mm/dump_pagetables.c | 10 +++++-----
arch/x86/xen/enlighten.c | 2 +-
arch/x86/xen/mmu.c | 8 ++++----
include/asm-x86/page.h | 6 +++---
include/asm-x86/paravirt.h | 2 +-
include/asm-x86/pgtable-3level.h | 8 ++++----
include/asm-x86/pgtable.h | 4 ++--
include/asm-x86/pgtable_32.h | 4 ++--
include/asm-x86/pgtable_64.h | 10 +++++-----
include/asm-x86/xen/page.h | 2 +-
10 files changed, 28 insertions(+), 28 deletions(-)

===================================================================
--- a/arch/x86/mm/dump_pagetables.c
+++ b/arch/x86/mm/dump_pagetables.c
@@ -148,8 +148,8 @@
* we have now. "break" is either changing perms, levels or
* address space marker.
*/
- prot = pgprot_val(new_prot) & ~(PTE_MASK);
- cur = pgprot_val(st->current_prot) & ~(PTE_MASK);
+ prot = pgprot_val(new_prot) & ~(PTE_PFN_MASK);
+ cur = pgprot_val(st->current_prot) & ~(PTE_PFN_MASK);

if (!st->level) {
/* First entry */
@@ -221,7 +221,7 @@
for (i = 0; i < PTRS_PER_PMD; i++) {
st->current_address = normalize_addr(P + i * PMD_LEVEL_MULT);
if (!pmd_none(*start)) {
- pgprotval_t prot = pmd_val(*start) & ~PTE_MASK;
+ pgprotval_t prot = pmd_val(*start) & ~PTE_PFN_MASK;

if (pmd_large(*start) || !pmd_present(*start))
note_page(m, st, __pgprot(prot), 3);
@@ -253,7 +253,7 @@
for (i = 0; i < PTRS_PER_PUD; i++) {
st->current_address = normalize_addr(P + i * PUD_LEVEL_MULT);
if (!pud_none(*start)) {
- pgprotval_t prot = pud_val(*start) & ~PTE_MASK;
+ pgprotval_t prot = pud_val(*start) & ~PTE_PFN_MASK;

if (pud_large(*start) || !pud_present(*start))
note_page(m, st, __pgprot(prot), 2);
@@ -288,7 +288,7 @@
for (i = 0; i < PTRS_PER_PGD; i++) {
st.current_address = normalize_addr(i * PGD_LEVEL_MULT);
if (!pgd_none(*start)) {
- pgprotval_t prot = pgd_val(*start) & ~PTE_MASK;
+ pgprotval_t prot = pgd_val(*start) & ~PTE_PFN_MASK;

if (pgd_large(*start) || !pgd_present(*start))
note_page(m, &st, __pgprot(prot), 1);
===================================================================
--- a/arch/x86/xen/enlighten.c
+++ b/arch/x86/xen/enlighten.c
@@ -1482,7 +1482,7 @@
{
phys_addr_t paddr;

- maddr &= PTE_MASK;
+ maddr &= PTE_PFN_MASK;
paddr = mfn_to_pfn(maddr >> PAGE_SHIFT) << PAGE_SHIFT;

return paddr;
===================================================================
--- a/arch/x86/xen/mmu.c
+++ b/arch/x86/xen/mmu.c
@@ -343,8 +343,8 @@
static pteval_t pte_mfn_to_pfn(pteval_t val)
{
if (val & _PAGE_PRESENT) {
- unsigned long mfn = (val & PTE_MASK) >> PAGE_SHIFT;
- pteval_t flags = val & ~PTE_MASK;
+ unsigned long mfn = (val & PTE_PFN_MASK) >> PAGE_SHIFT;
+ pteval_t flags = val & ~PTE_PFN_MASK;
val = ((pteval_t)mfn_to_pfn(mfn) << PAGE_SHIFT) | flags;
}

@@ -354,8 +354,8 @@
static pteval_t pte_pfn_to_mfn(pteval_t val)
{
if (val & _PAGE_PRESENT) {
- unsigned long pfn = (val & PTE_MASK) >> PAGE_SHIFT;
- pteval_t flags = val & ~PTE_MASK;
+ unsigned long pfn = (val & PTE_PFN_MASK) >> PAGE_SHIFT;
+ pteval_t flags = val & ~PTE_PFN_MASK;
val = ((pteval_t)pfn_to_mfn(pfn) << PAGE_SHIFT) | flags;
}

===================================================================
--- a/include/asm-x86/page.h
+++ b/include/asm-x86/page.h
@@ -18,8 +18,8 @@
(ie, 32-bit PAE). */
#define PHYSICAL_PAGE_MASK (((signed long)PAGE_MASK) & __PHYSICAL_MASK)

-/* PTE_MASK extracts the PFN from a (pte|pmd|pud|pgd)val_t */
-#define PTE_MASK ((pteval_t)PHYSICAL_PAGE_MASK)
+/* PTE_PFN_MASK extracts the PFN from a (pte|pmd|pud|pgd)val_t */
+#define PTE_PFN_MASK ((pteval_t)PHYSICAL_PAGE_MASK)

#define PMD_PAGE_SIZE (_AC(1, UL) << PMD_SHIFT)
#define PMD_PAGE_MASK (~(PMD_PAGE_SIZE-1))
@@ -146,7 +146,7 @@

static inline pteval_t native_pte_flags(pte_t pte)
{
- return native_pte_val(pte) & ~PTE_MASK;
+ return native_pte_val(pte) & ~PTE_PFN_MASK;
}

#define pgprot_val(x) ((x).pgprot)
===================================================================
--- a/include/asm-x86/paravirt.h
+++ b/include/asm-x86/paravirt.h
@@ -1071,7 +1071,7 @@
pte.pte);

#ifdef CONFIG_PARAVIRT_DEBUG
- BUG_ON(ret & PTE_MASK);
+ BUG_ON(ret & PTE_PFN_MASK);
#endif
return ret;
}
===================================================================
--- a/include/asm-x86/pgtable-3level.h
+++ b/include/asm-x86/pgtable-3level.h
@@ -25,7 +25,7 @@

static inline int pud_bad(pud_t pud)
{
- return (pud_val(pud) & ~(PTE_MASK | _KERNPG_TABLE | _PAGE_USER)) != 0;
+ return (pud_val(pud) & ~(PTE_PFN_MASK | _KERNPG_TABLE | _PAGE_USER)) != 0;
}

static inline int pud_present(pud_t pud)
@@ -120,9 +120,9 @@
write_cr3(pgd);
}

-#define pud_page(pud) ((struct page *) __va(pud_val(pud) & PTE_MASK))
+#define pud_page(pud) ((struct page *) __va(pud_val(pud) & PTE_PFN_MASK))

-#define pud_page_vaddr(pud) ((unsigned long) __va(pud_val(pud) & PTE_MASK))
+#define pud_page_vaddr(pud) ((unsigned long) __va(pud_val(pud) & PTE_PFN_MASK))


/* Find an entry in the second-level page table.. */
@@ -160,7 +160,7 @@

static inline unsigned long pte_pfn(pte_t pte)
{
- return (pte_val(pte) & PTE_MASK) >> PAGE_SHIFT;
+ return (pte_val(pte) & PTE_PFN_MASK) >> PAGE_SHIFT;
}

/*
===================================================================
--- a/include/asm-x86/pgtable.h
+++ b/include/asm-x86/pgtable.h
@@ -53,7 +53,7 @@
_PAGE_DIRTY)

/* Set of bits not changed in pte_modify */
-#define _PAGE_CHG_MASK (PTE_MASK | _PAGE_PCD | _PAGE_PWT | \
+#define _PAGE_CHG_MASK (PTE_PFN_MASK | _PAGE_PCD | _PAGE_PWT | \
_PAGE_ACCESSED | _PAGE_DIRTY)

#define _PAGE_CACHE_MASK (_PAGE_PCD | _PAGE_PWT)
@@ -286,7 +286,7 @@
return __pgprot(preservebits | addbits);
}

-#define pte_pgprot(x) __pgprot(pte_flags(x) & ~PTE_MASK)
+#define pte_pgprot(x) __pgprot(pte_flags(x) & ~PTE_PFN_MASK)

#define canon_pgprot(p) __pgprot(pgprot_val(p) & __supported_pte_mask)

===================================================================
--- a/include/asm-x86/pgtable_32.h
+++ b/include/asm-x86/pgtable_32.h
@@ -94,7 +94,7 @@
/* To avoid harmful races, pmd_none(x) should check only the lower when PAE */
#define pmd_none(x) (!(unsigned long)pmd_val((x)))
#define pmd_present(x) (pmd_val((x)) & _PAGE_PRESENT)
-#define pmd_bad(x) ((pmd_val(x) & (~PTE_MASK & ~_PAGE_USER)) != _KERNPG_TABLE)
+#define pmd_bad(x) ((pmd_val(x) & (~PTE_PFN_MASK & ~_PAGE_USER)) != _KERNPG_TABLE)

#define pages_to_mb(x) ((x) >> (20-PAGE_SHIFT))

@@ -145,7 +145,7 @@
#define pmd_page(pmd) (pfn_to_page(pmd_val((pmd)) >> PAGE_SHIFT))

#define pmd_page_vaddr(pmd) \
- ((unsigned long)__va(pmd_val((pmd)) & PTE_MASK))
+ ((unsigned long)__va(pmd_val((pmd)) & PTE_PFN_MASK))

#if defined(CONFIG_HIGHPTE)
#define pte_offset_map(dir, address) \
===================================================================
--- a/include/asm-x86/pgtable_64.h
+++ b/include/asm-x86/pgtable_64.h
@@ -158,17 +158,17 @@

static inline int pgd_bad(pgd_t pgd)
{
- return (pgd_val(pgd) & ~(PTE_MASK | _PAGE_USER)) != _KERNPG_TABLE;
+ return (pgd_val(pgd) & ~(PTE_PFN_MASK | _PAGE_USER)) != _KERNPG_TABLE;
}

static inline int pud_bad(pud_t pud)
{
- return (pud_val(pud) & ~(PTE_MASK | _PAGE_USER)) != _KERNPG_TABLE;
+ return (pud_val(pud) & ~(PTE_PFN_MASK | _PAGE_USER)) != _KERNPG_TABLE;
}

static inline int pmd_bad(pmd_t pmd)
{
- return (pmd_val(pmd) & ~(PTE_MASK | _PAGE_USER)) != _KERNPG_TABLE;
+ return (pmd_val(pmd) & ~(PTE_PFN_MASK | _PAGE_USER)) != _KERNPG_TABLE;
}

#define pte_none(x) (!pte_val((x)))
@@ -199,7 +199,7 @@
* Level 4 access.
*/
#define pgd_page_vaddr(pgd) \
- ((unsigned long)__va((unsigned long)pgd_val((pgd)) & PTE_MASK))
+ ((unsigned long)__va((unsigned long)pgd_val((pgd)) & PTE_PFN_MASK))
#define pgd_page(pgd) (pfn_to_page(pgd_val((pgd)) >> PAGE_SHIFT))
#define pgd_present(pgd) (pgd_val(pgd) & _PAGE_PRESENT)
static inline int pgd_large(pgd_t pgd) { return 0; }
@@ -222,7 +222,7 @@
}

/* PMD - Level 2 access */
-#define pmd_page_vaddr(pmd) ((unsigned long) __va(pmd_val((pmd)) & PTE_MASK))
+#define pmd_page_vaddr(pmd) ((unsigned long) __va(pmd_val((pmd)) & PTE_PFN_MASK))
#define pmd_page(pmd) (pfn_to_page(pmd_val((pmd)) >> PAGE_SHIFT))

#define pmd_index(address) (((address) >> PMD_SHIFT) & (PTRS_PER_PMD - 1))
===================================================================
--- a/include/asm-x86/xen/page.h
+++ b/include/asm-x86/xen/page.h
@@ -124,7 +124,7 @@

static inline unsigned long pte_mfn(pte_t pte)
{
- return (pte.pte & PTE_MASK) >> PAGE_SHIFT;
+ return (pte.pte & PTE_PFN_MASK) >> PAGE_SHIFT;
}

static inline pte_t mfn_pte(unsigned long page_nr, pgprot_t pgprot)

2008-07-22 06:00:52

by Jeremy Fitzhardinge

[permalink] [raw]
Subject: [PATCH 2/2] x86: add PTE_FLAGS_MASK

PTE_PFN_MASK was getting lonely, so I made it a friend.

Signed-off-by: Jeremy Fitzhardinge <[email protected]>
---
arch/x86/mm/dump_pagetables.c | 6 +++---
arch/x86/xen/mmu.c | 4 ++--
include/asm-x86/page.h | 5 ++++-
include/asm-x86/pgtable.h | 2 +-
include/asm-x86/pgtable_32.h | 2 +-
5 files changed, 11 insertions(+), 8 deletions(-)

===================================================================
--- a/arch/x86/mm/dump_pagetables.c
+++ b/arch/x86/mm/dump_pagetables.c
@@ -221,7 +221,7 @@
for (i = 0; i < PTRS_PER_PMD; i++) {
st->current_address = normalize_addr(P + i * PMD_LEVEL_MULT);
if (!pmd_none(*start)) {
- pgprotval_t prot = pmd_val(*start) & ~PTE_PFN_MASK;
+ pgprotval_t prot = pmd_val(*start) & PTE_FLAGS_MASK;

if (pmd_large(*start) || !pmd_present(*start))
note_page(m, st, __pgprot(prot), 3);
@@ -253,7 +253,7 @@
for (i = 0; i < PTRS_PER_PUD; i++) {
st->current_address = normalize_addr(P + i * PUD_LEVEL_MULT);
if (!pud_none(*start)) {
- pgprotval_t prot = pud_val(*start) & ~PTE_PFN_MASK;
+ pgprotval_t prot = pud_val(*start) & PTE_FLAGS_MASK;

if (pud_large(*start) || !pud_present(*start))
note_page(m, st, __pgprot(prot), 2);
@@ -288,7 +288,7 @@
for (i = 0; i < PTRS_PER_PGD; i++) {
st.current_address = normalize_addr(i * PGD_LEVEL_MULT);
if (!pgd_none(*start)) {
- pgprotval_t prot = pgd_val(*start) & ~PTE_PFN_MASK;
+ pgprotval_t prot = pgd_val(*start) & PTE_FLAGS_MASK;

if (pgd_large(*start) || !pgd_present(*start))
note_page(m, &st, __pgprot(prot), 1);
===================================================================
--- a/arch/x86/xen/mmu.c
+++ b/arch/x86/xen/mmu.c
@@ -344,7 +344,7 @@
{
if (val & _PAGE_PRESENT) {
unsigned long mfn = (val & PTE_PFN_MASK) >> PAGE_SHIFT;
- pteval_t flags = val & ~PTE_PFN_MASK;
+ pteval_t flags = val & PTE_FLAGS_MASK;
val = ((pteval_t)mfn_to_pfn(mfn) << PAGE_SHIFT) | flags;
}

@@ -355,7 +355,7 @@
{
if (val & _PAGE_PRESENT) {
unsigned long pfn = (val & PTE_PFN_MASK) >> PAGE_SHIFT;
- pteval_t flags = val & ~PTE_PFN_MASK;
+ pteval_t flags = val & PTE_FLAGS_MASK;
val = ((pteval_t)pfn_to_mfn(pfn) << PAGE_SHIFT) | flags;
}

===================================================================
--- a/include/asm-x86/page.h
+++ b/include/asm-x86/page.h
@@ -20,6 +20,9 @@

/* PTE_PFN_MASK extracts the PFN from a (pte|pmd|pud|pgd)val_t */
#define PTE_PFN_MASK ((pteval_t)PHYSICAL_PAGE_MASK)
+
+/* PTE_FLAGS_MASK extracts the flags from a (pte|pmd|pud|pgd)val_t */
+#define PTE_FLAGS_MASK (~PTE_PFN_MASK)

#define PMD_PAGE_SIZE (_AC(1, UL) << PMD_SHIFT)
#define PMD_PAGE_MASK (~(PMD_PAGE_SIZE-1))
@@ -146,7 +149,7 @@

static inline pteval_t native_pte_flags(pte_t pte)
{
- return native_pte_val(pte) & ~PTE_PFN_MASK;
+ return native_pte_val(pte) & PTE_FLAGS_MASK;
}

#define pgprot_val(x) ((x).pgprot)
===================================================================
--- a/include/asm-x86/pgtable.h
+++ b/include/asm-x86/pgtable.h
@@ -286,7 +286,7 @@
return __pgprot(preservebits | addbits);
}

-#define pte_pgprot(x) __pgprot(pte_flags(x) & ~PTE_PFN_MASK)
+#define pte_pgprot(x) __pgprot(pte_flags(x) & PTE_FLAGS_MASK)

#define canon_pgprot(p) __pgprot(pgprot_val(p) & __supported_pte_mask)

===================================================================
--- a/include/asm-x86/pgtable_32.h
+++ b/include/asm-x86/pgtable_32.h
@@ -94,7 +94,7 @@
/* To avoid harmful races, pmd_none(x) should check only the lower when PAE */
#define pmd_none(x) (!(unsigned long)pmd_val((x)))
#define pmd_present(x) (pmd_val((x)) & _PAGE_PRESENT)
-#define pmd_bad(x) ((pmd_val(x) & (~PTE_PFN_MASK & ~_PAGE_USER)) != _KERNPG_TABLE)
+#define pmd_bad(x) ((pmd_val(x) & (PTE_FLAGS_MASK & ~_PAGE_USER)) != _KERNPG_TABLE)

#define pages_to_mb(x) ((x) >> (20-PAGE_SHIFT))


2008-07-22 08:36:58

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH 1/2] x86: rename PTE_MASK to PTE_PFN_MASK


* Jeremy Fitzhardinge <[email protected]> wrote:

> Rusty, in his peevish way, complained that macros defining constants
> should have a name which somewhat accurately reflects the actual
> purpose of the constant.
>
> Aside from the fact that PTE_MASK gives no clue as to what's actually
> being masked, and is misleadingly similar to the functionally entirely
> different PMD_MASK, PUD_MASK and PGD_MASK, I don't really see what the
> problem is.

Has Rusty ever heard about the economy of the healthy flow of incoming
regressions? What will we do without obscure names and hard to find
bugs? First he writes a simple and readable hypervisor (ruining a whole
industry based on obscurity!) and now that. It's _so_ unamerican and
unaustralian. I'm worried.

Applied to tip/x86/cleanups anyway. Rusty will find out himself how bad
this whole concept of clean and understandable code is, soon enough!

Ingo

2008-07-22 09:04:59

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH] x86: fix pte_flags() to only return flags, fix lguest (updated)


* Rusty Russell <[email protected]> wrote:

> (Jeremy said:
> rusty: use PTE_MASK
> rusty: use PTE_MASK
> rusty: use PTE_MASK
> When I asked:
> jsgf: does that include the NX flag?
> He responded eloquently:
> rusty: use PTE_MASK
> rusty: use PTE_MASK
> yes, it's the official constant of masking flags out of ptes
> )
>
> Change a15af1c9ea2750a9ff01e51615c45950bad8221b 'x86/paravirt: add
> pte_flags to just get pte flags' removed lguest's private pte_flags()
> in favor of a generic one.
>
> Unfortunately, the generic one doesn't filter out the non-flags bits:
> this results in lguest creating corrupt shadow page tables and blowing
> up host memory.
>
> Since noone is supposed to use the pfn part of pte_flags(), it seems
> safest to always do the filtering.
>
> Cc: Jeremy Fitzhardinge <[email protected]>
> Signed-off-by: Rusty Russell <[email protected]>

applied to tip/x86/urgent - thanks Rusty!

i'm wondering. My randconfig tests boot up an lguest enabled kernel
every 30 minutes or so:

config-Mon_Jul_21_19_05_54_CEST_2008.good:CONFIG_LGUEST=y
config-Mon_Jul_21_19_43_13_CEST_2008.good:CONFIG_LGUEST=y
config-Mon_Jul_21_19_47_40_CEST_2008.good:CONFIG_LGUEST=y
config-Mon_Jul_21_20_37_41_CEST_2008.good:CONFIG_LGUEST=y
config-Mon_Jul_21_22_11_42_CEST_2008.good:CONFIG_LGUEST=y
config-Mon_Jul_21_22_16_59_CEST_2008.good:CONFIG_LGUEST=y
config-Mon_Jul_21_22_32_22_CEST_2008.good:CONFIG_LGUEST=y
config-Mon_Jul_21_23_25_55_CEST_2008.good:CONFIG_LGUEST=y
config-Mon_Jul_21_23_51_29_CEST_2008.good:CONFIG_LGUEST=y

Would it be possible to have some really stupid lguest self-test which
would complain spectacularly in the host kernel if it fails to reach
some minimal user-space?

Something that could be self-contained within a single bzImage. (i.e. it
would contain a minimalistic image of some sort with a very minimalistic
userspace component as well - or something like that)

I test many distros so installing anything on the user-space side is
quite a PITA and it can go bust without me noticing.

If we had something like that then we'd have noticed the lguest breakage
within 30 minutes of adding the bad commit. (Unless of course you insist
on us adding hard to find random breakages to lguest - which service i'm
afraid we are bound to provide to you in the future too! ;)

Ingo

2008-07-22 11:01:16

by Rusty Russell

[permalink] [raw]
Subject: Re: [PATCH 1/2] x86: rename PTE_MASK to PTE_PFN_MASK

On Tuesday 22 July 2008 18:36:26 Ingo Molnar wrote:
> * Jeremy Fitzhardinge <[email protected]> wrote:
> > Rusty, in his peevish way, complained that macros defining constants
> > should have a name which somewhat accurately reflects the actual
> > purpose of the constant.
>
> Applied to tip/x86/cleanups anyway. Rusty will find out himself how bad
> this whole concept of clean and understandable code is, soon enough!

I am disgusted with this inappropriate emphasis on clarity over obscurity. It
should be pretty clear to everyone here that we can't have both!

Fortunately, there is a way to partially rectify the situation. Ingo, please
apply.

Signed-off-by: Rusty Russell <[email protected]>
---
include/asm-x86/page.h | 2 ++
1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/include/asm-x86/page.h b/include/asm-x86/page.h
index 6c84622..4207518 100644
--- a/include/asm-x86/page.h
+++ b/include/asm-x86/page.h
@@ -10,6 +10,7 @@

#ifdef __KERNEL__

+/* There's something suspicious about this line: see PTE_PFN_MASK comment. */
#define __PHYSICAL_MASK ((phys_addr_t)(1ULL << __PHYSICAL_MASK_SHIFT) - 1)
#define __VIRTUAL_MASK ((1UL << __VIRTUAL_MASK_SHIFT) - 1)

@@ -19,6 +20,7 @@
#define PHYSICAL_PAGE_MASK (((signed long)PAGE_MASK) & __PHYSICAL_MASK)

/* PTE_PFN_MASK extracts the PFN from a (pte|pmd|pud|pgd)val_t */
+/* This line is quite subtle. See __PHYSICAL_MASK comment above. */
#define PTE_PFN_MASK ((pteval_t)PHYSICAL_PAGE_MASK)

/* PTE_FLAGS_MASK extracts the flags from a (pte|pmd|pud|pgd)val_t */

2008-07-22 11:56:19

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH 1/2] x86: rename PTE_MASK to PTE_PFN_MASK


* Rusty Russell <[email protected]> wrote:

> On Tuesday 22 July 2008 18:36:26 Ingo Molnar wrote:
> > * Jeremy Fitzhardinge <[email protected]> wrote:
> > > Rusty, in his peevish way, complained that macros defining constants
> > > should have a name which somewhat accurately reflects the actual
> > > purpose of the constant.
> >
> > Applied to tip/x86/cleanups anyway. Rusty will find out himself how bad
> > this whole concept of clean and understandable code is, soon enough!
>
> I am disgusted with this inappropriate emphasis on clarity over
> obscurity. It should be pretty clear to everyone here that we can't
> have both!
>
> Fortunately, there is a way to partially rectify the situation. Ingo,
> please apply.

> +/* There's something suspicious about this line: see PTE_PFN_MASK comment. */
> #define __PHYSICAL_MASK ((phys_addr_t)(1ULL << __PHYSICAL_MASK_SHIFT) - 1)

> /* PTE_PFN_MASK extracts the PFN from a (pte|pmd|pud|pgd)val_t */
> +/* This line is quite subtle. See __PHYSICAL_MASK comment above. */
> #define PTE_PFN_MASK ((pteval_t)PHYSICAL_PAGE_MASK)

Now that you and Jeremy have thoroughly destroyed this file's obscurity
with your disgusting cleanups and clarifications, i fear it's beyond
repair. No matter how much i'd love to apply this infinitely recursive
piece of documentation (what a genius it takes to even think of it!) i
regret that i cannot. So sad.

Ingo

2008-07-22 13:04:17

by Johannes Weiner

[permalink] [raw]
Subject: Re: [PATCH 1/2] x86: rename PTE_MASK to PTE_PFN_MASK

Jeremy Fitzhardinge <[email protected]> writes:

> Rusty, in his peevish way, complained that macros defining constants
> should have a name which somewhat accurately reflects the actual
> purpose of the constant.
>
> Aside from the fact that PTE_MASK gives no clue as to what's actually
> being masked, and is misleadingly similar to the functionally entirely
> different PMD_MASK, PUD_MASK and PGD_MASK, I don't really see what the
> problem is.

PTE_PFN_MASK is not symmetric to PAGE_MASK. How about PTE_PROT_MASK?

Hannes

2008-07-22 14:53:07

by Jeremy Fitzhardinge

[permalink] [raw]
Subject: Re: [PATCH 1/2] x86: rename PTE_MASK to PTE_PFN_MASK

Johannes Weiner wrote:
> PTE_PFN_MASK is not symmetric to PAGE_MASK.

No, it isn't. Is there anything about the name that suggests that it
should be? PTE_PFN_MASK is for operating on pteval_t-typed values
extracted from ptes; PAGE_MASK is for operating on addresses.

> How about PTE_PROT_MASK?
>

That's the opposite sense; the next patch defines it as PTE_FLAGS_MASK.

J

2008-07-22 15:18:53

by Johannes Weiner

[permalink] [raw]
Subject: Re: [PATCH 1/2] x86: rename PTE_MASK to PTE_PFN_MASK

Hi,

Jeremy Fitzhardinge <[email protected]> writes:

> Johannes Weiner wrote:
>> PTE_PFN_MASK is not symmetric to PAGE_MASK.
>
> No, it isn't. Is there anything about the name that suggests that it
> should be? PTE_PFN_MASK is for operating on pteval_t-typed values
> extracted from ptes; PAGE_MASK is for operating on addresses.

I meant the naming scheme, not the functionality.

The thing PAGE_MASK and PTE_MASK have in common is that they are masks
and their names indicate what is masked away when applied.

So PAGE_MASK suggests that it masks out page details. And PTE_MASK
suggests that it masks out PTE details.

PTE_PFN_MASK masks suggests that it masks out the flags, according to
the existing naming convention. But it does the opposite.

Hannes

2008-07-22 15:24:33

by Johannes Weiner

[permalink] [raw]
Subject: Re: [PATCH 1/2] x86: rename PTE_MASK to PTE_PFN_MASK

Johannes Weiner <[email protected]> writes:

> Hi,
>
> Jeremy Fitzhardinge <[email protected]> writes:
>
>> Johannes Weiner wrote:
>>> PTE_PFN_MASK is not symmetric to PAGE_MASK.
>>
>> No, it isn't. Is there anything about the name that suggests that it
>> should be? PTE_PFN_MASK is for operating on pteval_t-typed values
>> extracted from ptes; PAGE_MASK is for operating on addresses.
>
> I meant the naming scheme, not the functionality.
>
> The thing PAGE_MASK and PTE_MASK have in common is that they are masks
> and their names indicate what is masked away when applied.
>
> So PAGE_MASK suggests that it masks out page details. And PTE_MASK
> suggests that it masks out PTE details.
>
> PTE_PFN_MASK masks suggests that it masks out the flags, according to
> the existing naming convention. But it does the opposite.

As you explained me how PAGE_MASK was meant, scratch the above ;)

Hannes

2008-07-22 15:34:10

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH 1/2] x86: rename PTE_MASK to PTE_PFN_MASK


* Johannes Weiner <[email protected]> wrote:

> Johannes Weiner <[email protected]> writes:
>
> > Hi,
> >
> > Jeremy Fitzhardinge <[email protected]> writes:
> >
> >> Johannes Weiner wrote:
> >>> PTE_PFN_MASK is not symmetric to PAGE_MASK.
> >>
> >> No, it isn't. Is there anything about the name that suggests that it
> >> should be? PTE_PFN_MASK is for operating on pteval_t-typed values
> >> extracted from ptes; PAGE_MASK is for operating on addresses.
> >
> > I meant the naming scheme, not the functionality.
> >
> > The thing PAGE_MASK and PTE_MASK have in common is that they are masks
> > and their names indicate what is masked away when applied.
> >
> > So PAGE_MASK suggests that it masks out page details. And PTE_MASK
> > suggests that it masks out PTE details.
> >
> > PTE_PFN_MASK masks suggests that it masks out the flags, according
> > to the existing naming convention. But it does the opposite.
>
> As you explained me how PAGE_MASK was meant, scratch the above ;)

btw., feel free to send a patch that adds more comments that makes it
obvious at first sight if someone takes a look at the defines.

Ingo

2008-07-22 15:43:39

by Johannes Weiner

[permalink] [raw]
Subject: Re: [PATCH 1/2] x86: rename PTE_MASK to PTE_PFN_MASK

Hi,

Ingo Molnar <[email protected]> writes:

> * Johannes Weiner <[email protected]> wrote:
>
>> Johannes Weiner <[email protected]> writes:
>>
>> > Hi,
>> >
>> > Jeremy Fitzhardinge <[email protected]> writes:
>> >
>> >> Johannes Weiner wrote:
>> >>> PTE_PFN_MASK is not symmetric to PAGE_MASK.
>> >>
>> >> No, it isn't. Is there anything about the name that suggests that it
>> >> should be? PTE_PFN_MASK is for operating on pteval_t-typed values
>> >> extracted from ptes; PAGE_MASK is for operating on addresses.
>> >
>> > I meant the naming scheme, not the functionality.
>> >
>> > The thing PAGE_MASK and PTE_MASK have in common is that they are masks
>> > and their names indicate what is masked away when applied.
>> >
>> > So PAGE_MASK suggests that it masks out page details. And PTE_MASK
>> > suggests that it masks out PTE details.
>> >
>> > PTE_PFN_MASK masks suggests that it masks out the flags, according
>> > to the existing naming convention. But it does the opposite.
>>
>> As you explained me how PAGE_MASK was meant, scratch the above ;)
>
> btw., feel free to send a patch that adds more comments that makes it
> obvious at first sight if someone takes a look at the defines.

I will do that!

Hannes

2008-07-23 02:26:52

by Rusty Russell

[permalink] [raw]
Subject: Re: [PATCH] x86: fix pte_flags() to only return flags, fix lguest (updated)

On Tuesday 22 July 2008 19:04:32 Ingo Molnar wrote:
> i'm wondering. My randconfig tests boot up an lguest enabled kernel
> every 30 minutes or so:
>
> config-Mon_Jul_21_19_05_54_CEST_2008.good:CONFIG_LGUEST=y
> config-Mon_Jul_21_19_43_13_CEST_2008.good:CONFIG_LGUEST=y
> config-Mon_Jul_21_19_47_40_CEST_2008.good:CONFIG_LGUEST=y
> config-Mon_Jul_21_20_37_41_CEST_2008.good:CONFIG_LGUEST=y
> config-Mon_Jul_21_22_11_42_CEST_2008.good:CONFIG_LGUEST=y
> config-Mon_Jul_21_22_16_59_CEST_2008.good:CONFIG_LGUEST=y
> config-Mon_Jul_21_22_32_22_CEST_2008.good:CONFIG_LGUEST=y
> config-Mon_Jul_21_23_25_55_CEST_2008.good:CONFIG_LGUEST=y
> config-Mon_Jul_21_23_51_29_CEST_2008.good:CONFIG_LGUEST=y
>
> Would it be possible to have some really stupid lguest self-test which
> would complain spectacularly in the host kernel if it fails to reach
> some minimal user-space?
>
> Something that could be self-contained within a single bzImage. (i.e. it
> would contain a minimalistic image of some sort with a very minimalistic
> userspace component as well - or something like that)

Well, adding "make -C Documentation/lguest" to the build is a good start (this
finds those "e820.h not longer includable from userspace" bugs).

Secondly, if you put the resulting Documentation/lguest/lguest somewhere on
your booting test machine, it can just do something like

./lguest 64 /boot/vmlinuz-`uname -r` | grep 'VFS: Unable to mount root'

Won't quite test userspace, but it's easy and will get the worst breakage.
If we want to be more ambitious, I'd suggest a tiny initrd with a
statically-linked /hello_world to test userspace:

./lguest --initrd=/boot/hello_world.initrd 64 /boot/vmlinuz-`uname -r` \
rdinit=/hello_world | grep 'Hello world'

I can create one (and test the example) for you if you're interested?

Thanks,
Rusty.

2008-07-24 11:31:45

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH] x86: fix pte_flags() to only return flags, fix lguest (updated)


* Rusty Russell <[email protected]> wrote:

> On Tuesday 22 July 2008 19:04:32 Ingo Molnar wrote:
> > i'm wondering. My randconfig tests boot up an lguest enabled kernel
> > every 30 minutes or so:
> >
> > config-Mon_Jul_21_19_05_54_CEST_2008.good:CONFIG_LGUEST=y
> > config-Mon_Jul_21_19_43_13_CEST_2008.good:CONFIG_LGUEST=y
> > config-Mon_Jul_21_19_47_40_CEST_2008.good:CONFIG_LGUEST=y
> > config-Mon_Jul_21_20_37_41_CEST_2008.good:CONFIG_LGUEST=y
> > config-Mon_Jul_21_22_11_42_CEST_2008.good:CONFIG_LGUEST=y
> > config-Mon_Jul_21_22_16_59_CEST_2008.good:CONFIG_LGUEST=y
> > config-Mon_Jul_21_22_32_22_CEST_2008.good:CONFIG_LGUEST=y
> > config-Mon_Jul_21_23_25_55_CEST_2008.good:CONFIG_LGUEST=y
> > config-Mon_Jul_21_23_51_29_CEST_2008.good:CONFIG_LGUEST=y
> >
> > Would it be possible to have some really stupid lguest self-test which
> > would complain spectacularly in the host kernel if it fails to reach
> > some minimal user-space?
> >
> > Something that could be self-contained within a single bzImage. (i.e. it
> > would contain a minimalistic image of some sort with a very minimalistic
> > userspace component as well - or something like that)
>
> Well, adding "make -C Documentation/lguest" to the build is a good start (this
> finds those "e820.h not longer includable from userspace" bugs).
>
> Secondly, if you put the resulting Documentation/lguest/lguest somewhere on
> your booting test machine, it can just do something like
>
> ./lguest 64 /boot/vmlinuz-`uname -r` | grep 'VFS: Unable to mount root'

stupid question: what's the easiest way to filter out the case where
there's not sufficient kernel support in the bzImage to actually run
lguest?

I.e. if i extend the "is this bzImage working fine" check with the above
lguest bootup test - with the expectation of it getting down to the
"VFS: Unable to mount root" message [success case], how do i filter out
the case where it doesnt get to that message not due to some lguest
breakage, but because there's not enough lguest support there.

> Won't quite test userspace, but it's easy and will get the worst breakage.
> If we want to be more ambitious, I'd suggest a tiny initrd with a
> statically-linked /hello_world to test userspace:
>
> ./lguest --initrd=/boot/hello_world.initrd 64 /boot/vmlinuz-`uname -r` \
> rdinit=/hello_world | grep 'Hello world'
>
> I can create one (and test the example) for you if you're interested?

that would be great ...

Ingo

2008-07-25 01:56:00

by Rusty Russell

[permalink] [raw]
Subject: Re: [PATCH] x86: fix pte_flags() to only return flags, fix lguest (updated)

On Thursday 24 July 2008 21:31:22 Ingo Molnar wrote:
> * Rusty Russell <[email protected]> wrote:
> > On Tuesday 22 July 2008 19:04:32 Ingo Molnar wrote:
> > > i'm wondering. My randconfig tests boot up an lguest enabled kernel
> > > every 30 minutes or so:
> > >
> > > config-Mon_Jul_21_19_05_54_CEST_2008.good:CONFIG_LGUEST=y
> > > config-Mon_Jul_21_19_43_13_CEST_2008.good:CONFIG_LGUEST=y
> > > config-Mon_Jul_21_19_47_40_CEST_2008.good:CONFIG_LGUEST=y
> > > config-Mon_Jul_21_20_37_41_CEST_2008.good:CONFIG_LGUEST=y
> > > config-Mon_Jul_21_22_11_42_CEST_2008.good:CONFIG_LGUEST=y
> > > config-Mon_Jul_21_22_16_59_CEST_2008.good:CONFIG_LGUEST=y
> > > config-Mon_Jul_21_22_32_22_CEST_2008.good:CONFIG_LGUEST=y
> > > config-Mon_Jul_21_23_25_55_CEST_2008.good:CONFIG_LGUEST=y
> > > config-Mon_Jul_21_23_51_29_CEST_2008.good:CONFIG_LGUEST=y
> > >
> > > Would it be possible to have some really stupid lguest self-test which
> > > would complain spectacularly in the host kernel if it fails to reach
> > > some minimal user-space?
> > >
> > > Something that could be self-contained within a single bzImage. (i.e.
> > > it would contain a minimalistic image of some sort with a very
> > > minimalistic userspace component as well - or something like that)
> >
> > Well, adding "make -C Documentation/lguest" to the build is a good start
> > (this finds those "e820.h not longer includable from userspace" bugs).
> >
> > Secondly, if you put the resulting Documentation/lguest/lguest somewhere
> > on your booting test machine, it can just do something like
> >
> > ./lguest 64 /boot/vmlinuz-`uname -r` | grep 'VFS: Unable to mount root'
>
> stupid question: what's the easiest way to filter out the case where
> there's not sufficient kernel support in the bzImage to actually run
> lguest?
>
> I.e. if i extend the "is this bzImage working fine" check with the above
> lguest bootup test - with the expectation of it getting down to the
> "VFS: Unable to mount root" message [success case], how do i filter out
> the case where it doesnt get to that message not due to some lguest
> breakage, but because there's not enough lguest support there.

Easiest to check config: CONFIG_LGUEST and CONFIG_LGUEST_GUEST.

Well, there may be no host-for-lguest support, modular or builtin. "modprobe
lg" to be sure, then if lguest says: "lguest: Failed to open /dev/lguest: No
such file or directory" your host doesn't support it.

If there's no guest support, it's trickier. The boot will fail in some
non-obvious way depending on config options....

> > I can create one (and test the example) for you if you're interested?
>
> that would be great ...

OK, added to TODO.

Rusty.

2008-07-28 15:11:47

by Ingo Molnar

[permalink] [raw]
Subject: Re: [PATCH] x86: fix pte_flags() to only return flags, fix lguest (updated)


* Rusty Russell <[email protected]> wrote:

> On Thursday 24 July 2008 21:31:22 Ingo Molnar wrote:
> > * Rusty Russell <[email protected]> wrote:
> > > On Tuesday 22 July 2008 19:04:32 Ingo Molnar wrote:
> > > > i'm wondering. My randconfig tests boot up an lguest enabled kernel
> > > > every 30 minutes or so:
> > > >
> > > > config-Mon_Jul_21_19_05_54_CEST_2008.good:CONFIG_LGUEST=y
> > > > config-Mon_Jul_21_19_43_13_CEST_2008.good:CONFIG_LGUEST=y
> > > > config-Mon_Jul_21_19_47_40_CEST_2008.good:CONFIG_LGUEST=y
> > > > config-Mon_Jul_21_20_37_41_CEST_2008.good:CONFIG_LGUEST=y
> > > > config-Mon_Jul_21_22_11_42_CEST_2008.good:CONFIG_LGUEST=y
> > > > config-Mon_Jul_21_22_16_59_CEST_2008.good:CONFIG_LGUEST=y
> > > > config-Mon_Jul_21_22_32_22_CEST_2008.good:CONFIG_LGUEST=y
> > > > config-Mon_Jul_21_23_25_55_CEST_2008.good:CONFIG_LGUEST=y
> > > > config-Mon_Jul_21_23_51_29_CEST_2008.good:CONFIG_LGUEST=y
> > > >
> > > > Would it be possible to have some really stupid lguest self-test which
> > > > would complain spectacularly in the host kernel if it fails to reach
> > > > some minimal user-space?
> > > >
> > > > Something that could be self-contained within a single bzImage. (i.e.
> > > > it would contain a minimalistic image of some sort with a very
> > > > minimalistic userspace component as well - or something like that)
> > >
> > > Well, adding "make -C Documentation/lguest" to the build is a good start
> > > (this finds those "e820.h not longer includable from userspace" bugs).
> > >
> > > Secondly, if you put the resulting Documentation/lguest/lguest somewhere
> > > on your booting test machine, it can just do something like
> > >
> > > ./lguest 64 /boot/vmlinuz-`uname -r` | grep 'VFS: Unable to mount root'
> >
> > stupid question: what's the easiest way to filter out the case where
> > there's not sufficient kernel support in the bzImage to actually run
> > lguest?
> >
> > I.e. if i extend the "is this bzImage working fine" check with the above
> > lguest bootup test - with the expectation of it getting down to the
> > "VFS: Unable to mount root" message [success case], how do i filter out
> > the case where it doesnt get to that message not due to some lguest
> > breakage, but because there's not enough lguest support there.
>
> Easiest to check config: CONFIG_LGUEST and CONFIG_LGUEST_GUEST.
>
> Well, there may be no host-for-lguest support, modular or builtin.
> "modprobe lg" to be sure, then if lguest says: "lguest: Failed to open
> /dev/lguest: No such file or directory" your host doesn't support it.
>
> If there's no guest support, it's trickier. The boot will fail in
> some non-obvious way depending on config options....

that's why i'm lazily relying on in-kernel tests as much as possible.
Within the kernel we always know whether it's OK. Could we load an
lguest test-image via the firmware loader or something? That would make
automated testing really, really self-contained.

plus "make Documentation/lguest/" could be bound into the random-testing
environment as well. Would be glad to test-drive patches (even if just
half-cooked), if you send any ...

Ingo