Added the missing Signed-off-by line.
After a lot of debugging and a long read through the Linux kernel and Xen code,
I have finally killed a deeply hidden bug in pv-grub. Details below.
Additionally, I am CC'ing this e-mail to LKML because this issue
looks like a Linux kernel problem; however, it is not.
This patch applies to Xen 4.0, Xen 4.1 and the unstable tree.
# HG changeset patch
# User [email protected]
# Date 1303474763 -7200
# Node ID b33bf24be129b7b9cd2248460beb1298088c6af5
# Parent dbf2ddf652dc3dd927447e79ef4bc586de55d708
The introduction of Linux kernel git commit ceefccc93932b920a8ec6f35f596db05202a12fe
(x86: default CONFIG_PHYSICAL_START and CONFIG_PHYSICAL_ALIGN to 16 MB) revealed
a deeply hidden bug in pv-grub. During the kernel load stage, the dom->p2m_host[]
list was initialized incorrectly.
At the beginning of the kernel load stage, the dom->p2m_host[] list is populated
with the current pfn->mfn layout. Later, during memory allocation (memory is
allocated page by page in kexec_allocate()), the page order is changed to
establish a linear layout in the new domain. This is done by exchanging
subsequent mfns with newly allocated mfns. The dom->p2m_host[] list is indexed
by the currently requested pfn (which is incremented from 0) and by the pfn of
the newly allocated page. If the pfn of the newly allocated page is less than
the currently requested pfn, then the relevant earlier-allocated mfn is
overwritten, which leads to a domain crash later. This patch fixes that issue.
If the pfn of the newly allocated page is less than the currently requested pfn,
then the relevant pfn/mfn pair is properly calculated and the usual exchange
occurs later.
Signed-off-by: Daniel Kiper <[email protected]>
diff -r dbf2ddf652dc -r b33bf24be129 stubdom/grub/kexec.c
--- a/stubdom/grub/kexec.c Thu Apr 07 15:26:58 2011 +0100
+++ b/stubdom/grub/kexec.c Fri Apr 22 14:19:23 2011 +0200
@@ -91,6 +91,11 @@ int kexec_allocate(struct xc_dom_image *
         new_pfn = PHYS_PFN(to_phys(pages[i]));
         pages_mfns[i] = new_mfn = pfn_to_mfn(new_pfn);
 
+        if (new_pfn < i)
+            for (new_pfn = i; new_pfn < dom->total_pages; ++new_pfn)
+                if (dom->p2m_host[new_pfn] == new_mfn)
+                    break;
+
         /* Put old page at new PFN */
         dom->p2m_host[new_pfn] = old_mfn;
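To make the failure easier to follow outside of pv-grub, here is a small standalone sketch that replays the exchange on a toy list. It is an illustration only, not code from the tree: NPAGES, pfn_to_mfn_sim(), exchange_pages(), p2m_is_permutation() and run_scenario() are hypothetical names, the mapping "mfn = 100 + pfn" is made up, and the allocator is assumed to return PFN 5 and then PFN 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NPAGES 8  /* hypothetical domain size, for illustration only */

/* Stand-in for the stubdom's real, unchanging pfn->mfn mapping (assumed). */
static unsigned long pfn_to_mfn_sim(unsigned long pfn)
{
    return 100 + pfn;
}

/*
 * Replay the exchange loop of kexec_allocate() on a toy p2m list.
 * alloc_pfns[i] plays the role of PHYS_PFN(to_phys(pages[i])).
 * When with_fix is true, the lookup from the patch is applied.
 */
static void exchange_pages(unsigned long *p2m, const unsigned long *alloc_pfns,
                           size_t count, bool with_fix)
{
    for (size_t i = 0; i < count; i++) {
        unsigned long old_mfn = p2m[i];
        unsigned long new_pfn = alloc_pfns[i];
        unsigned long new_mfn = pfn_to_mfn_sim(new_pfn);

        if (with_fix && new_pfn < i) {
            /* new_mfn was moved earlier; find the slot that holds it now. */
            for (new_pfn = i; new_pfn < NPAGES; ++new_pfn)
                if (p2m[new_pfn] == new_mfn)
                    break;
            assert(new_pfn < NPAGES);
        }

        p2m[new_pfn] = old_mfn;  /* put old page at new PFN */
        p2m[i] = new_mfn;        /* put new page at PFN i */
    }
}

/* True if every original MFN still appears exactly once in p2m. */
static bool p2m_is_permutation(const unsigned long *p2m)
{
    for (unsigned long pfn = 0; pfn < NPAGES; pfn++) {
        size_t hits = 0;

        for (size_t j = 0; j < NPAGES; j++)
            if (p2m[j] == pfn_to_mfn_sim(pfn))
                hits++;
        if (hits != 1)
            return false;
    }
    return true;
}

/* Scenario: the allocator returns PFN 5 for request 0, then PFN 0 for request 1. */
static bool run_scenario(bool with_fix)
{
    unsigned long p2m[NPAGES];
    const unsigned long alloc_pfns[] = { 5, 0 };

    for (unsigned long pfn = 0; pfn < NPAGES; pfn++)
        p2m[pfn] = pfn_to_mfn_sim(pfn);
    exchange_pages(p2m, alloc_pfns, 2, with_fix);

    return p2m_is_permutation(p2m) &&
           p2m[0] == pfn_to_mfn_sim(5) && p2m[1] == pfn_to_mfn_sim(0);
}
```

In this toy setup run_scenario(false) returns false: the second exchange writes into slot 0, which already holds the mfn placed there by the first exchange, so one mfn is lost and another appears twice. run_scenario(true) returns true because the moved mfn is found in slot 5 first.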
Daniel
Hello,
Daniel Kiper wrote on Fri 22 Apr 2011 23:25:45 +0200:
> If the pfn of the newly allocated page is less than the currently
> requested pfn, then the relevant earlier-allocated mfn is overwritten,
> which leads to a domain crash later.
Oops, good catch! And unfortunately it happens only rarely... I guess it
may be the culprit for a fair number of other issues.
> + if (new_pfn < i)
> + for (new_pfn = i; new_pfn < dom->total_pages; ++new_pfn)
> + if (dom->p2m_host[new_pfn] == new_mfn)
> + break;
Instead of searching for the page, which takes linear time for each page
and thus potentially quadratic time overall, we should probably record
which PFN the MFNs < allocated have been moved to?
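Something along these lines, perhaps. This is an untested sketch on a toy list rather than a patch against kexec.c: moved_to[], NPAGES, pfn_to_mfn_sim(), exchange_pages_tracked() and tracked_scenario_ok() are hypothetical names, and the mapping "mfn = 100 + pfn" is made up for the example. The idea is to remember, for each processed PFN, where its old page went, and to follow those records instead of scanning the list.

```c
#include <assert.h>
#include <stddef.h>

#define NPAGES 8  /* hypothetical domain size, for illustration only */

/* Stand-in for the stubdom's real pfn->mfn mapping (an assumption). */
static unsigned long pfn_to_mfn_sim(unsigned long pfn)
{
    return 100 + pfn;
}

/*
 * Exchange loop that records, for every processed PFN i, the PFN its old
 * page was moved to. A freshly allocated page whose PFN is below i is then
 * located by following those records instead of scanning p2m[].
 */
static void exchange_pages_tracked(unsigned long *p2m,
                                   unsigned long *moved_to,
                                   const unsigned long *alloc_pfns,
                                   size_t count)
{
    for (size_t i = 0; i < count; i++) {
        unsigned long old_mfn = p2m[i];
        unsigned long new_pfn = alloc_pfns[i];
        unsigned long new_mfn = pfn_to_mfn_sim(new_pfn);

        /* Follow the chain of earlier moves until we reach a slot >= i. */
        while (new_pfn < i)
            new_pfn = moved_to[new_pfn];
        assert(p2m[new_pfn] == new_mfn);

        moved_to[i] = new_pfn;   /* remember where PFN i's old page went */
        p2m[new_pfn] = old_mfn;  /* put old page at new PFN */
        p2m[i] = new_mfn;        /* put new page at PFN i */
    }
}

/* Scenario: the allocator hands out PFNs 5, 0, 1 for requested PFNs 0, 1, 2. */
static int tracked_scenario_ok(void)
{
    unsigned long p2m[NPAGES], moved_to[NPAGES];
    const unsigned long alloc_pfns[] = { 5, 0, 1 };

    for (unsigned long pfn = 0; pfn < NPAGES; pfn++)
        p2m[pfn] = pfn_to_mfn_sim(pfn);
    exchange_pages_tracked(p2m, moved_to, alloc_pfns, 3);

    return p2m[0] == pfn_to_mfn_sim(5) &&
           p2m[1] == pfn_to_mfn_sim(0) &&
           p2m[2] == pfn_to_mfn_sim(1);
}
```

In the scenario above the third request needs two pieces of history: PFN 1's old page went to slot 5, which is where the page originally at PFN 0 had gone before it. Each lookup only walks recorded moves, so the scan over the whole list goes away, at the cost of one extra array the size of the allocation.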
Samuel
On Fri, Apr 22, 2011 at 11:25:45PM +0200, Daniel Kiper wrote:
> The introduction of Linux kernel git commit ceefccc93932b920a8ec6f35f596db05202a12fe
> (x86: default CONFIG_PHYSICAL_START and CONFIG_PHYSICAL_ALIGN to 16 MB) revealed
> a deeply hidden bug in pv-grub. During the kernel load stage, the dom->p2m_host[]
> list was initialized incorrectly.
>
> At the beginning of the kernel load stage, the dom->p2m_host[] list is populated
> with the current pfn->mfn layout. Later, during memory allocation (memory is
> allocated page by page in kexec_allocate()), the page order is changed to
> establish a linear layout in the new domain. This is done by exchanging
> subsequent mfns with newly allocated mfns. The dom->p2m_host[] list is indexed
> by the currently requested pfn (which is incremented from 0) and by the pfn of
> the newly allocated page. If the pfn of the newly allocated page is less than
> the currently requested pfn, then the relevant earlier-allocated mfn is
> overwritten, which leads to a domain crash later. This patch fixes that issue.
> If the pfn of the newly allocated page is less than the currently requested pfn,
> then the relevant pfn/mfn pair is properly calculated and the usual exchange
> occurs later.
Nice! I presume this fixes the issue you had at the Xen Hack-O-Thon with
your guest, right?
On Sat, Apr 23, 2011 at 12:33:32AM +0200, Samuel Thibault wrote:
> Hello,
>
> Daniel Kiper wrote on Fri 22 Apr 2011 23:25:45 +0200:
> > If the pfn of the newly allocated page is less than the currently
> > requested pfn, then the relevant earlier-allocated mfn is overwritten,
> > which leads to a domain crash later.
>
> Oops, good catch! And unfortunately it happens only rarely... I guess it
> may be the culprit for a fair number of other issues.
I discovered the issue on an i386 domU. It does not affect x86_64
in my environment. However, as you stated above, in some
circumstances it could lead to mysterious system crashes or data
corruption.
> > + if (new_pfn < i)
> > + for (new_pfn = i; new_pfn < dom->total_pages; ++new_pfn)
> > + if (dom->p2m_host[new_pfn] == new_mfn)
> > + break;
>
> Instead of searching for the page, which takes linear time for each page
> and thus potentially quadratic time overall, we should probably record
> which PFN the MFNs < allocated have been moved to?
I am going to post a new, time-optimized version
of the patch today or tomorrow.
Daniel
On Tue, Apr 26, 2011 at 09:42:42AM -0400, Konrad Rzeszutek Wilk wrote:
> On Fri, Apr 22, 2011 at 11:25:45PM +0200, Daniel Kiper wrote:
> Nice! I presume this fixes the issue you had at the Xen Hack-O-Thon with
> your guest, right?
Yes, it does. It was very difficult to track down because the
issue overlapped with other memory management issues that
have been surfacing recently. I am currently working on a
time-optimized version of the patch.
Daniel