2019-10-03 04:18:28

by Ben Luo

[permalink] [raw]
Subject: [PATCH] vfio/type1: remove hugepage checks in is_invalid_reserved_pfn()

Currently, no hugepage split code can transfer the reserved bit
from head to tail during the split, so checking the head can't make
a difference in a racing condition with hugepage spliting.

The buddy wouldn't allow a driver to allocate an hugepage if any
subpage is reserved in the e820 map at boot, if any driver sets the
reserved bit of head page before mapping the hugepage in userland,
it needs to set the reserved bit in all subpages to be safe.

Signed-off-by: Ben Luo <[email protected]>
---
drivers/vfio/vfio_iommu_type1.c | 26 ++++----------------------
1 file changed, 4 insertions(+), 22 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 054391f..e2019ba 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -287,31 +287,13 @@ static int vfio_lock_acct(struct vfio_dma *dma, long npage, bool async)
* Some mappings aren't backed by a struct page, for example an mmap'd
* MMIO range for our own or another device. These use a different
* pfn conversion and shouldn't be tracked as locked pages.
+ * For compound pages, any driver that sets the reserved bit in head
+ * page needs to set the reserved bit in all subpages to be safe.
*/
static bool is_invalid_reserved_pfn(unsigned long pfn)
{
- if (pfn_valid(pfn)) {
- bool reserved;
- struct page *tail = pfn_to_page(pfn);
- struct page *head = compound_head(tail);
- reserved = !!(PageReserved(head));
- if (head != tail) {
- /*
- * "head" is not a dangling pointer
- * (compound_head takes care of that)
- * but the hugepage may have been split
- * from under us (and we may not hold a
- * reference count on the head page so it can
- * be reused before we run PageReferenced), so
- * we've to check PageTail before returning
- * what we just read.
- */
- smp_rmb();
- if (PageTail(tail))
- return reserved;
- }
- return PageReserved(tail);
- }
+ if (pfn_valid(pfn))
+ return PageReserved(pfn_to_page(pfn));

return true;
}
--
1.8.3.1


2019-10-03 17:06:47

by Andrea Arcangeli

[permalink] [raw]
Subject: Re: [PATCH] vfio/type1: remove hugepage checks in is_invalid_reserved_pfn()

On Thu, Oct 03, 2019 at 11:49:42AM +0800, Ben Luo wrote:
> Currently, no hugepage split code can transfer the reserved bit
> from head to tail during the split, so checking the head can't make
> a difference in a racing condition with hugepage spliting.
>
> The buddy wouldn't allow a driver to allocate an hugepage if any
> subpage is reserved in the e820 map at boot, if any driver sets the
> reserved bit of head page before mapping the hugepage in userland,
> it needs to set the reserved bit in all subpages to be safe.
>
> Signed-off-by: Ben Luo <[email protected]>

Reviewed-by: Andrea Arcangeli <[email protected]>


> ---
> drivers/vfio/vfio_iommu_type1.c | 26 ++++----------------------
> 1 file changed, 4 insertions(+), 22 deletions(-)
>
> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> index 054391f..e2019ba 100644
> --- a/drivers/vfio/vfio_iommu_type1.c
> +++ b/drivers/vfio/vfio_iommu_type1.c
> @@ -287,31 +287,13 @@ static int vfio_lock_acct(struct vfio_dma *dma, long npage, bool async)
> * Some mappings aren't backed by a struct page, for example an mmap'd
> * MMIO range for our own or another device. These use a different
> * pfn conversion and shouldn't be tracked as locked pages.
> + * For compound pages, any driver that sets the reserved bit in head
> + * page needs to set the reserved bit in all subpages to be safe.
> */
> static bool is_invalid_reserved_pfn(unsigned long pfn)
> {
> - if (pfn_valid(pfn)) {
> - bool reserved;
> - struct page *tail = pfn_to_page(pfn);
> - struct page *head = compound_head(tail);
> - reserved = !!(PageReserved(head));
> - if (head != tail) {
> - /*
> - * "head" is not a dangling pointer
> - * (compound_head takes care of that)
> - * but the hugepage may have been split
> - * from under us (and we may not hold a
> - * reference count on the head page so it can
> - * be reused before we run PageReferenced), so
> - * we've to check PageTail before returning
> - * what we just read.
> - */
> - smp_rmb();
> - if (PageTail(tail))
> - return reserved;
> - }
> - return PageReserved(tail);
> - }
> + if (pfn_valid(pfn))
> + return PageReserved(pfn_to_page(pfn));
>
> return true;
> }
> --
> 1.8.3.1
>

2019-10-18 23:33:15

by Ben Luo

[permalink] [raw]
Subject: Re: [PATCH] vfio/type1: remove hugepage checks in is_invalid_reserved_pfn()

A friendly reminder :)


Thanks,

    Ben

在 2019/10/4 上午12:41, Andrea Arcangeli 写道:
> On Thu, Oct 03, 2019 at 11:49:42AM +0800, Ben Luo wrote:
>> Currently, no hugepage split code can transfer the reserved bit
>> from head to tail during the split, so checking the head can't make
>> a difference in a racing condition with hugepage spliting.
>>
>> The buddy wouldn't allow a driver to allocate an hugepage if any
>> subpage is reserved in the e820 map at boot, if any driver sets the
>> reserved bit of head page before mapping the hugepage in userland,
>> it needs to set the reserved bit in all subpages to be safe.
>>
>> Signed-off-by: Ben Luo <[email protected]>
> Reviewed-by: Andrea Arcangeli <[email protected]>
>
>
>> ---
>> drivers/vfio/vfio_iommu_type1.c | 26 ++++----------------------
>> 1 file changed, 4 insertions(+), 22 deletions(-)
>>
>> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
>> index 054391f..e2019ba 100644
>> --- a/drivers/vfio/vfio_iommu_type1.c
>> +++ b/drivers/vfio/vfio_iommu_type1.c
>> @@ -287,31 +287,13 @@ static int vfio_lock_acct(struct vfio_dma *dma, long npage, bool async)
>> * Some mappings aren't backed by a struct page, for example an mmap'd
>> * MMIO range for our own or another device. These use a different
>> * pfn conversion and shouldn't be tracked as locked pages.
>> + * For compound pages, any driver that sets the reserved bit in head
>> + * page needs to set the reserved bit in all subpages to be safe.
>> */
>> static bool is_invalid_reserved_pfn(unsigned long pfn)
>> {
>> - if (pfn_valid(pfn)) {
>> - bool reserved;
>> - struct page *tail = pfn_to_page(pfn);
>> - struct page *head = compound_head(tail);
>> - reserved = !!(PageReserved(head));
>> - if (head != tail) {
>> - /*
>> - * "head" is not a dangling pointer
>> - * (compound_head takes care of that)
>> - * but the hugepage may have been split
>> - * from under us (and we may not hold a
>> - * reference count on the head page so it can
>> - * be reused before we run PageReferenced), so
>> - * we've to check PageTail before returning
>> - * what we just read.
>> - */
>> - smp_rmb();
>> - if (PageTail(tail))
>> - return reserved;
>> - }
>> - return PageReserved(tail);
>> - }
>> + if (pfn_valid(pfn))
>> + return PageReserved(pfn_to_page(pfn));
>>
>> return true;
>> }
>> --
>> 1.8.3.1
>>

2019-10-19 08:30:57

by Alex Williamson

[permalink] [raw]
Subject: Re: [PATCH] vfio/type1: remove hugepage checks in is_invalid_reserved_pfn()

On Fri, 18 Oct 2019 14:42:32 +0800
Ben Luo <[email protected]> wrote:

> A friendly reminder :)
>

Thanks Ben! I've added this to the vfio next branch for v5.5 with
Andrea's R-b. Thanks,

Alex

> 在 2019/10/4 上午12:41, Andrea Arcangeli 写道:
> > On Thu, Oct 03, 2019 at 11:49:42AM +0800, Ben Luo wrote:
> >> Currently, no hugepage split code can transfer the reserved bit
> >> from head to tail during the split, so checking the head can't make
> >> a difference in a racing condition with hugepage spliting.
> >>
> >> The buddy wouldn't allow a driver to allocate an hugepage if any
> >> subpage is reserved in the e820 map at boot, if any driver sets the
> >> reserved bit of head page before mapping the hugepage in userland,
> >> it needs to set the reserved bit in all subpages to be safe.
> >>
> >> Signed-off-by: Ben Luo <[email protected]>
> > Reviewed-by: Andrea Arcangeli <[email protected]>
> >
> >
> >> ---
> >> drivers/vfio/vfio_iommu_type1.c | 26 ++++----------------------
> >> 1 file changed, 4 insertions(+), 22 deletions(-)
> >>
> >> diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
> >> index 054391f..e2019ba 100644
> >> --- a/drivers/vfio/vfio_iommu_type1.c
> >> +++ b/drivers/vfio/vfio_iommu_type1.c
> >> @@ -287,31 +287,13 @@ static int vfio_lock_acct(struct vfio_dma *dma, long npage, bool async)
> >> * Some mappings aren't backed by a struct page, for example an mmap'd
> >> * MMIO range for our own or another device. These use a different
> >> * pfn conversion and shouldn't be tracked as locked pages.
> >> + * For compound pages, any driver that sets the reserved bit in head
> >> + * page needs to set the reserved bit in all subpages to be safe.
> >> */
> >> static bool is_invalid_reserved_pfn(unsigned long pfn)
> >> {
> >> - if (pfn_valid(pfn)) {
> >> - bool reserved;
> >> - struct page *tail = pfn_to_page(pfn);
> >> - struct page *head = compound_head(tail);
> >> - reserved = !!(PageReserved(head));
> >> - if (head != tail) {
> >> - /*
> >> - * "head" is not a dangling pointer
> >> - * (compound_head takes care of that)
> >> - * but the hugepage may have been split
> >> - * from under us (and we may not hold a
> >> - * reference count on the head page so it can
> >> - * be reused before we run PageReferenced), so
> >> - * we've to check PageTail before returning
> >> - * what we just read.
> >> - */
> >> - smp_rmb();
> >> - if (PageTail(tail))
> >> - return reserved;
> >> - }
> >> - return PageReserved(tail);
> >> - }
> >> + if (pfn_valid(pfn))
> >> + return PageReserved(pfn_to_page(pfn));
> >>
> >> return true;
> >> }
> >> --
> >> 1.8.3.1
> >>