2021-08-20 21:33:32

by Alyssa Rosenzweig

[permalink] [raw]
Subject: [PATCH 0/3] drm/panfrost: Bug fixes for lock_region

Chris Morgan reported UBSAN errors in panfrost and tracked them down to
the size computation in lock_region. This calculation is overcomplicated
(seemingly cargo culted from kbase) and can be simplified with kernel
helpers and some mathematical identities. The first patch in the series
rewrites the calculation in a form avoiding undefined behaviour; Chris
confirms it placates UBSAN.

While researching this function, I noticed a pair of other potential
bugs: Bifrost can lock more than 4GiB at a time, but must lock at least
32KiB at a time. The latter patches in the series handle these cases.

The size computation was unit-tested in userspace. Relevant code below,
just missing some copypaste definitions for fls64/clamp/etc:

#define MIN_LOCK (1ULL << 12)
#define MAX_LOCK (1ULL << 48)

struct {
uint64_t size;
uint8_t encoded;
} tests[] = {
/* Clamping */
{ 0, 11 },
{ 1, 11 },
{ 2, 11 },
{ 4095, 11 },
/* Power of two */
{ 4096, 11 },
/* Round up */
{ 4097, 12 },
{ 8192, 12 },
{ 16384, 13 },
{ 16385, 14 },
/* Maximum */
{ ~0ULL, 47 },
};

static uint8_t region_width(uint64_t size)
{
size = clamp(size, MIN_LOCK, MAX_LOCK);
return fls64(size - 1) - 1;
}

int main(int argc, char **argv)
{
for (unsigned i = 0; i < ARRAY_SIZE(tests); ++i) {
uint64_t test = tests[i].size;
uint8_t expected = tests[i].encoded;
uint8_t actual = region_width(test);

assert(expected == actual);
}
}

Alyssa Rosenzweig (3):
drm/panfrost: Simplify lock_region calculation
drm/panfrost: Use u64 for size in lock_region
drm/panfrost: Clamp lock region to Bifrost minimum

drivers/gpu/drm/panfrost/panfrost_mmu.c | 31 +++++++++---------------
drivers/gpu/drm/panfrost/panfrost_regs.h | 2 ++
2 files changed, 13 insertions(+), 20 deletions(-)

--
2.30.2


2021-08-20 21:34:51

by Alyssa Rosenzweig

[permalink] [raw]
Subject: [PATCH 2/3] drm/panfrost: Use u64 for size in lock_region

Mali virtual addresses are 48-bit. Use a u64 instead of size_t to ensure
we can express the "lock everything" condition as ~0ULL without relying
on platform-specific behaviour.

Signed-off-by: Alyssa Rosenzweig <[email protected]>
Suggested-by: Rob Herring <[email protected]>
Tested-by: Chris Morgan <[email protected]>
---
drivers/gpu/drm/panfrost/panfrost_mmu.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c b/drivers/gpu/drm/panfrost/panfrost_mmu.c
index f6e02d0392f4..3a795273e505 100644
--- a/drivers/gpu/drm/panfrost/panfrost_mmu.c
+++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c
@@ -58,7 +58,7 @@ static int write_cmd(struct panfrost_device *pfdev, u32 as_nr, u32 cmd)
}

static void lock_region(struct panfrost_device *pfdev, u32 as_nr,
- u64 iova, size_t size)
+ u64 iova, u64 size)
{
u8 region_width;
u64 region = iova & PAGE_MASK;
@@ -78,7 +78,7 @@ static void lock_region(struct panfrost_device *pfdev, u32 as_nr,


static int mmu_hw_do_operation_locked(struct panfrost_device *pfdev, int as_nr,
- u64 iova, size_t size, u32 op)
+ u64 iova, u64 size, u32 op)
{
if (as_nr < 0)
return 0;
@@ -95,7 +95,7 @@ static int mmu_hw_do_operation_locked(struct panfrost_device *pfdev, int as_nr,

static int mmu_hw_do_operation(struct panfrost_device *pfdev,
struct panfrost_mmu *mmu,
- u64 iova, size_t size, u32 op)
+ u64 iova, u64 size, u32 op)
{
int ret;

@@ -112,7 +112,7 @@ static void panfrost_mmu_enable(struct panfrost_device *pfdev, struct panfrost_m
u64 transtab = cfg->arm_mali_lpae_cfg.transtab;
u64 memattr = cfg->arm_mali_lpae_cfg.memattr;

- mmu_hw_do_operation_locked(pfdev, as_nr, 0, ~0UL, AS_COMMAND_FLUSH_MEM);
+ mmu_hw_do_operation_locked(pfdev, as_nr, 0, ~0ULL, AS_COMMAND_FLUSH_MEM);

mmu_write(pfdev, AS_TRANSTAB_LO(as_nr), transtab & 0xffffffffUL);
mmu_write(pfdev, AS_TRANSTAB_HI(as_nr), transtab >> 32);
@@ -128,7 +128,7 @@ static void panfrost_mmu_enable(struct panfrost_device *pfdev, struct panfrost_m

static void panfrost_mmu_disable(struct panfrost_device *pfdev, u32 as_nr)
{
- mmu_hw_do_operation_locked(pfdev, as_nr, 0, ~0UL, AS_COMMAND_FLUSH_MEM);
+ mmu_hw_do_operation_locked(pfdev, as_nr, 0, ~0ULL, AS_COMMAND_FLUSH_MEM);

mmu_write(pfdev, AS_TRANSTAB_LO(as_nr), 0);
mmu_write(pfdev, AS_TRANSTAB_HI(as_nr), 0);
@@ -242,7 +242,7 @@ static size_t get_pgsize(u64 addr, size_t size)

static void panfrost_mmu_flush_range(struct panfrost_device *pfdev,
struct panfrost_mmu *mmu,
- u64 iova, size_t size)
+ u64 iova, u64 size)
{
if (mmu->as < 0)
return;
--
2.30.2

2021-08-20 21:35:10

by Alyssa Rosenzweig

[permalink] [raw]
Subject: [PATCH 3/3] drm/panfrost: Clamp lock region to Bifrost minimum

When locking a region, we currently clamp to a PAGE_SIZE as the minimum
lock region. While this is valid for Midgard, it is invalid for Bifrost,
where the minimum locking size is 8x larger than the 4k page size. Add a
hardware definition for the minimum lock region size (corresponding to
KBASE_LOCK_REGION_MIN_SIZE_LOG2 in kbase) and respect it.

Signed-off-by: Alyssa Rosenzweig <[email protected]>
Tested-by: Chris Morgan <[email protected]>
Cc: <[email protected]>
---
drivers/gpu/drm/panfrost/panfrost_mmu.c | 2 +-
drivers/gpu/drm/panfrost/panfrost_regs.h | 2 ++
2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c b/drivers/gpu/drm/panfrost/panfrost_mmu.c
index 3a795273e505..dfe5f1d29763 100644
--- a/drivers/gpu/drm/panfrost/panfrost_mmu.c
+++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c
@@ -66,7 +66,7 @@ static void lock_region(struct panfrost_device *pfdev, u32 as_nr,
/* The size is encoded as ceil(log2) minus(1), which may be calculated
* with fls. The size must be clamped to hardware bounds.
*/
- size = max_t(u64, size, PAGE_SIZE);
+ size = max_t(u64, size, AS_LOCK_REGION_MIN_SIZE);
region_width = fls64(size - 1) - 1;
region |= region_width;

diff --git a/drivers/gpu/drm/panfrost/panfrost_regs.h b/drivers/gpu/drm/panfrost/panfrost_regs.h
index 1940ff86e49a..6c5a11ef1ee8 100644
--- a/drivers/gpu/drm/panfrost/panfrost_regs.h
+++ b/drivers/gpu/drm/panfrost/panfrost_regs.h
@@ -316,6 +316,8 @@
#define AS_FAULTSTATUS_ACCESS_TYPE_READ (0x2 << 8)
#define AS_FAULTSTATUS_ACCESS_TYPE_WRITE (0x3 << 8)

+#define AS_LOCK_REGION_MIN_SIZE (1ULL << 15)
+
#define gpu_write(dev, reg, data) writel(data, dev->iomem + reg)
#define gpu_read(dev, reg) readl(dev->iomem + reg)

--
2.30.2

2021-08-20 21:37:44

by Alyssa Rosenzweig

[permalink] [raw]
Subject: [PATCH 1/3] drm/panfrost: Simplify lock_region calculation

In lock_region, simplify the calculation of the region_width parameter.
This field is the size, but encoded as log2(ceil(size)) - 1.
log2(ceil(size)) may be computed directly as fls(size - 1). However, we
want to use the 64-bit versions as the amount to lock can exceed
32-bits.

This avoids undefined behaviour when locking all memory (size ~0),
caught by UBSAN.

Signed-off-by: Alyssa Rosenzweig <[email protected]>
Reported-and-tested-by: Chris Morgan <[email protected]>
Cc: <[email protected]>
---
drivers/gpu/drm/panfrost/panfrost_mmu.c | 19 +++++--------------
1 file changed, 5 insertions(+), 14 deletions(-)

diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c b/drivers/gpu/drm/panfrost/panfrost_mmu.c
index 0da5b3100ab1..f6e02d0392f4 100644
--- a/drivers/gpu/drm/panfrost/panfrost_mmu.c
+++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c
@@ -62,21 +62,12 @@ static void lock_region(struct panfrost_device *pfdev, u32 as_nr,
{
u8 region_width;
u64 region = iova & PAGE_MASK;
- /*
- * fls returns:
- * 1 .. 32
- *
- * 10 + fls(num_pages)
- * results in the range (11 .. 42)
- */
-
- size = round_up(size, PAGE_SIZE);

- region_width = 10 + fls(size >> PAGE_SHIFT);
- if ((size >> PAGE_SHIFT) != (1ul << (region_width - 11))) {
- /* not pow2, so must go up to the next pow2 */
- region_width += 1;
- }
+ /* The size is encoded as ceil(log2) minus(1), which may be calculated
+ * with fls. The size must be clamped to hardware bounds.
+ */
+ size = max_t(u64, size, PAGE_SIZE);
+ region_width = fls64(size - 1) - 1;
region |= region_width;

/* Lock the region that needs to be updated */
--
2.30.2

2021-08-23 09:41:47

by Steven Price

[permalink] [raw]
Subject: Re: [PATCH 1/3] drm/panfrost: Simplify lock_region calculation

On 20/08/2021 22:31, Alyssa Rosenzweig wrote:
> In lock_region, simplify the calculation of the region_width parameter.
> This field is the size, but encoded as log2(ceil(size)) - 1.
> log2(ceil(size)) may be computed directly as fls(size - 1). However, we
> want to use the 64-bit versions as the amount to lock can exceed
> 32-bits.
>
> This avoids undefined behaviour when locking all memory (size ~0),
> caught by UBSAN.

It might have been useful to mention what it is that UBSAN specifically
picked up (it took me a while to spot) - but anyway I think there's a
bigger issue with it being completely wrong when size == ~0 (see below).

> Signed-off-by: Alyssa Rosenzweig <[email protected]>
> Reported-and-tested-by: Chris Morgan <[email protected]>
> Cc: <[email protected]>

However, I've confirmed this returns the same values and is certainly
more simple, so:

Reviewed-by: Steven Price <[email protected]>

> ---
> drivers/gpu/drm/panfrost/panfrost_mmu.c | 19 +++++--------------
> 1 file changed, 5 insertions(+), 14 deletions(-)
>
> diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c b/drivers/gpu/drm/panfrost/panfrost_mmu.c
> index 0da5b3100ab1..f6e02d0392f4 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_mmu.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c
> @@ -62,21 +62,12 @@ static void lock_region(struct panfrost_device *pfdev, u32 as_nr,
> {
> u8 region_width;
> u64 region = iova & PAGE_MASK;
> - /*
> - * fls returns:
> - * 1 .. 32
> - *
> - * 10 + fls(num_pages)
> - * results in the range (11 .. 42)
> - */
> -
> - size = round_up(size, PAGE_SIZE);

This seems to be the first issue - ~0 will be 'rounded up' to 0.

>
> - region_width = 10 + fls(size >> PAGE_SHIFT);

fls(0) == 0, so region_width == 10.

> - if ((size >> PAGE_SHIFT) != (1ul << (region_width - 11))) {

Presumably here's where UBSAN objects - we're shifting by a negative
value, which even it it happens to works means the lock region is tiny
and certainly not what was intended! It might well be worth a:

Fixes: f3ba91228e8e ("drm/panfrost: Add initial panfrost driver")

Note for anyone following along at (working-from-) home: although this
code was cargo culted from kbase - kbase is fine because it takes a pfn
and doesn't do the round_up() stage.

Which also exposes the second bug (fixed in patch 2): a size_t isn't big
enough on 32 bit platforms (all Midgard/Bifrost GPUs have a VA size
bigger than 32 bits). Again kbase gets away with a u32 because it's a pfn.

There is potentially a third bug which kbase only recently attempted to
fix. The lock address is effectively rounded down by the hardware (the
bottom bits are ignored). So if you have mask=(1<<region_width)-1 but
(iova & mask) != ((iova + size) & mask) then you are potentially failing
to lock the end of the intended region. kbase has added some code to
handle this:

> /* Round up if some memory pages spill into the next region. */
> region_frame_number_start = pfn >> (lockaddr_size_log2 - PAGE_SHIFT);
> region_frame_number_end =
> (pfn + num_pages - 1) >> (lockaddr_size_log2 - PAGE_SHIFT);
>
> if (region_frame_number_start < region_frame_number_end)
> lockaddr_size_log2 += 1;

I guess we should too?

Steve

> - /* not pow2, so must go up to the next pow2 */
> - region_width += 1;
> - }
> + /* The size is encoded as ceil(log2) minus(1), which may be calculated
> + * with fls. The size must be clamped to hardware bounds.
> + */
> + size = max_t(u64, size, PAGE_SIZE);
> + region_width = fls64(size - 1) - 1;
> region |= region_width;
>
> /* Lock the region that needs to be updated */
>

2021-08-23 09:42:51

by Steven Price

[permalink] [raw]
Subject: Re: [PATCH 2/3] drm/panfrost: Use u64 for size in lock_region

On 20/08/2021 22:31, Alyssa Rosenzweig wrote:
> Mali virtual addresses are 48-bit. Use a u64 instead of size_t to ensure
> we can express the "lock everything" condition as ~0ULL without relying
> on platform-specific behaviour.

'platform-specific behaviour' makes it sound like this is something to
do with a particular board. This is 32bit/64bit - it's going to be
broken on 32bit: large lock regions are not going to work.

> Signed-off-by: Alyssa Rosenzweig <[email protected]>
> Suggested-by: Rob Herring <[email protected]>
> Tested-by: Chris Morgan <[email protected]>

Fixes: f3ba91228e8e ("drm/panfrost: Add initial panfrost driver")

Reviewed-by: Steven Price <[email protected]>

Steve

> ---
> drivers/gpu/drm/panfrost/panfrost_mmu.c | 12 ++++++------
> 1 file changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c b/drivers/gpu/drm/panfrost/panfrost_mmu.c
> index f6e02d0392f4..3a795273e505 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_mmu.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c
> @@ -58,7 +58,7 @@ static int write_cmd(struct panfrost_device *pfdev, u32 as_nr, u32 cmd)
> }
>
> static void lock_region(struct panfrost_device *pfdev, u32 as_nr,
> - u64 iova, size_t size)
> + u64 iova, u64 size)
> {
> u8 region_width;
> u64 region = iova & PAGE_MASK;
> @@ -78,7 +78,7 @@ static void lock_region(struct panfrost_device *pfdev, u32 as_nr,
>
>
> static int mmu_hw_do_operation_locked(struct panfrost_device *pfdev, int as_nr,
> - u64 iova, size_t size, u32 op)
> + u64 iova, u64 size, u32 op)
> {
> if (as_nr < 0)
> return 0;
> @@ -95,7 +95,7 @@ static int mmu_hw_do_operation_locked(struct panfrost_device *pfdev, int as_nr,
>
> static int mmu_hw_do_operation(struct panfrost_device *pfdev,
> struct panfrost_mmu *mmu,
> - u64 iova, size_t size, u32 op)
> + u64 iova, u64 size, u32 op)
> {
> int ret;
>
> @@ -112,7 +112,7 @@ static void panfrost_mmu_enable(struct panfrost_device *pfdev, struct panfrost_m
> u64 transtab = cfg->arm_mali_lpae_cfg.transtab;
> u64 memattr = cfg->arm_mali_lpae_cfg.memattr;
>
> - mmu_hw_do_operation_locked(pfdev, as_nr, 0, ~0UL, AS_COMMAND_FLUSH_MEM);
> + mmu_hw_do_operation_locked(pfdev, as_nr, 0, ~0ULL, AS_COMMAND_FLUSH_MEM);
>
> mmu_write(pfdev, AS_TRANSTAB_LO(as_nr), transtab & 0xffffffffUL);
> mmu_write(pfdev, AS_TRANSTAB_HI(as_nr), transtab >> 32);
> @@ -128,7 +128,7 @@ static void panfrost_mmu_enable(struct panfrost_device *pfdev, struct panfrost_m
>
> static void panfrost_mmu_disable(struct panfrost_device *pfdev, u32 as_nr)
> {
> - mmu_hw_do_operation_locked(pfdev, as_nr, 0, ~0UL, AS_COMMAND_FLUSH_MEM);
> + mmu_hw_do_operation_locked(pfdev, as_nr, 0, ~0ULL, AS_COMMAND_FLUSH_MEM);
>
> mmu_write(pfdev, AS_TRANSTAB_LO(as_nr), 0);
> mmu_write(pfdev, AS_TRANSTAB_HI(as_nr), 0);
> @@ -242,7 +242,7 @@ static size_t get_pgsize(u64 addr, size_t size)
>
> static void panfrost_mmu_flush_range(struct panfrost_device *pfdev,
> struct panfrost_mmu *mmu,
> - u64 iova, size_t size)
> + u64 iova, u64 size)
> {
> if (mmu->as < 0)
> return;
>

2021-08-23 09:43:16

by Steven Price

[permalink] [raw]
Subject: Re: [PATCH 3/3] drm/panfrost: Clamp lock region to Bifrost minimum

On 20/08/2021 22:31, Alyssa Rosenzweig wrote:
> When locking a region, we currently clamp to a PAGE_SIZE as the minimum
> lock region. While this is valid for Midgard, it is invalid for Bifrost,

While the spec does seem to state it's invalid for Bifrost - kbase
didn't bother with a lower clamp for a long time. I actually think this
is in many ways more of a spec bug: i.e. implementation details of the
round-up that the hardware does. But it's much safer following the spec
;) And it seems like kbase eventually caught up too.

> where the minimum locking size is 8x larger than the 4k page size. Add a
> hardware definition for the minimum lock region size (corresponding to
> KBASE_LOCK_REGION_MIN_SIZE_LOG2 in kbase) and respect it.
>
> Signed-off-by: Alyssa Rosenzweig <[email protected]>
> Tested-by: Chris Morgan <[email protected]>
> Cc: <[email protected]>

Reviewed-by: Steven Price <[email protected]>

> ---
> drivers/gpu/drm/panfrost/panfrost_mmu.c | 2 +-
> drivers/gpu/drm/panfrost/panfrost_regs.h | 2 ++
> 2 files changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c b/drivers/gpu/drm/panfrost/panfrost_mmu.c
> index 3a795273e505..dfe5f1d29763 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_mmu.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c
> @@ -66,7 +66,7 @@ static void lock_region(struct panfrost_device *pfdev, u32 as_nr,
> /* The size is encoded as ceil(log2) minus(1), which may be calculated
> * with fls. The size must be clamped to hardware bounds.
> */
> - size = max_t(u64, size, PAGE_SIZE);
> + size = max_t(u64, size, AS_LOCK_REGION_MIN_SIZE);
> region_width = fls64(size - 1) - 1;
> region |= region_width;
>
> diff --git a/drivers/gpu/drm/panfrost/panfrost_regs.h b/drivers/gpu/drm/panfrost/panfrost_regs.h
> index 1940ff86e49a..6c5a11ef1ee8 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_regs.h
> +++ b/drivers/gpu/drm/panfrost/panfrost_regs.h
> @@ -316,6 +316,8 @@
> #define AS_FAULTSTATUS_ACCESS_TYPE_READ (0x2 << 8)
> #define AS_FAULTSTATUS_ACCESS_TYPE_WRITE (0x3 << 8)
>
> +#define AS_LOCK_REGION_MIN_SIZE (1ULL << 15)
> +
> #define gpu_write(dev, reg, data) writel(data, dev->iomem + reg)
> #define gpu_read(dev, reg) readl(dev->iomem + reg)
>
>

2021-08-23 21:11:43

by Alyssa Rosenzweig

[permalink] [raw]
Subject: Re: [PATCH 1/3] drm/panfrost: Simplify lock_region calculation

> > In lock_region, simplify the calculation of the region_width parameter.
> > This field is the size, but encoded as log2(ceil(size)) - 1.
> > log2(ceil(size)) may be computed directly as fls(size - 1). However, we
> > want to use the 64-bit versions as the amount to lock can exceed
> > 32-bits.
> >
> > This avoids undefined behaviour when locking all memory (size ~0),
> > caught by UBSAN.
>
> It might have been useful to mention what it is that UBSAN specifically
> picked up (it took me a while to spot) - but anyway I think there's a
> bigger issue with it being completely wrong when size == ~0 (see below).

Indeed. I've updated the commit message in v2 to explain what goes
wrong (your analysis was spot on, but a mailing list message is more
ephermal than a commit message). I'll send out v2 tomorrow assuming
nobody objects to v1 in the mean time.

Thanks for the review.

> There is potentially a third bug which kbase only recently attempted to
> fix. The lock address is effectively rounded down by the hardware (the
> bottom bits are ignored). So if you have mask=(1<<region_width)-1 but
> (iova & mask) != ((iova + size) & mask) then you are potentially failing
> to lock the end of the intended region. kbase has added some code to
> handle this:
>
> > /* Round up if some memory pages spill into the next region. */
> > region_frame_number_start = pfn >> (lockaddr_size_log2 - PAGE_SHIFT);
> > region_frame_number_end =
> > (pfn + num_pages - 1) >> (lockaddr_size_log2 - PAGE_SHIFT);
> >
> > if (region_frame_number_start < region_frame_number_end)
> > lockaddr_size_log2 += 1;
>
> I guess we should too?

Oh, I missed this one. Guess we have 4 bugs with this code instead of
just 3, yikes. How could such a short function be so deeply and horribly
broken? ????

Should I add a fourth patch to the series to fix this?

2021-08-23 21:13:14

by Alyssa Rosenzweig

[permalink] [raw]
Subject: Re: [PATCH 2/3] drm/panfrost: Use u64 for size in lock_region

> > Mali virtual addresses are 48-bit. Use a u64 instead of size_t to ensure
> > we can express the "lock everything" condition as ~0ULL without relying
> > on platform-specific behaviour.
>
> 'platform-specific behaviour' makes it sound like this is something to
> do with a particular board. This is 32bit/64bit - it's going to be
> broken on 32bit: large lock regions are not going to work.

Oh, my. I used the term as a weasel word since the spec is loose on how
big a size_t actually is. But if this is causing actual breakage on
armv7 we're in trouble. I'll add a Cc stable tag on v2, unless the Fixes
implies that?

Thanks for pointing this out.

2021-08-23 21:16:43

by Alyssa Rosenzweig

[permalink] [raw]
Subject: Re: [PATCH 3/3] drm/panfrost: Clamp lock region to Bifrost minimum

> > When locking a region, we currently clamp to a PAGE_SIZE as the minimum
> > lock region. While this is valid for Midgard, it is invalid for Bifrost,
>
> While the spec does seem to state it's invalid for Bifrost - kbase
> didn't bother with a lower clamp for a long time. I actually think this
> is in many ways more of a spec bug: i.e. implementation details of the
> round-up that the hardware does. But it's much safer following the spec
> ;) And it seems like kbase eventually caught up too.

Yeah, makes sense. Should I drop the Cc: stable in that case? If the
issue is purely theoretical.

2021-08-23 21:55:32

by Steven Price

[permalink] [raw]
Subject: Re: [PATCH 1/3] drm/panfrost: Simplify lock_region calculation

On 23 August 2021 22:09:08 BST, Alyssa Rosenzweig <[email protected]> wrote:
>> > In lock_region, simplify the calculation of the region_width parameter.
>> > This field is the size, but encoded as log2(ceil(size)) - 1.
>> > log2(ceil(size)) may be computed directly as fls(size - 1). However, we
>> > want to use the 64-bit versions as the amount to lock can exceed
>> > 32-bits.
>> >
>> > This avoids undefined behaviour when locking all memory (size ~0),
>> > caught by UBSAN.
>>
>> It might have been useful to mention what it is that UBSAN specifically
>> picked up (it took me a while to spot) - but anyway I think there's a
>> bigger issue with it being completely wrong when size == ~0 (see below).
>
>Indeed. I've updated the commit message in v2 to explain what goes
>wrong (your analysis was spot on, but a mailing list message is more
>ephermal than a commit message). I'll send out v2 tomorrow assuming
>nobody objects to v1 in the mean time.
>
>Thanks for the review.
>
>> There is potentially a third bug which kbase only recently attempted to
>> fix. The lock address is effectively rounded down by the hardware (the
>> bottom bits are ignored). So if you have mask=(1<<region_width)-1 but
>> (iova & mask) != ((iova + size) & mask) then you are potentially failing
>> to lock the end of the intended region. kbase has added some code to
>> handle this:
>>
>> > /* Round up if some memory pages spill into the next region. */
>> > region_frame_number_start = pfn >> (lockaddr_size_log2 - PAGE_SHIFT);
>> > region_frame_number_end =
>> > (pfn + num_pages - 1) >> (lockaddr_size_log2 - PAGE_SHIFT);
>> >
>> > if (region_frame_number_start < region_frame_number_end)
>> > lockaddr_size_log2 += 1;
>>
>> I guess we should too?
>
>Oh, I missed this one. Guess we have 4 bugs with this code instead of
>just 3, yikes. How could such a short function be so deeply and horribly
>broken? ����
>
>Should I add a fourth patch to the series to fix this?

Yes please!

Thanks,
Steve

2021-08-23 21:58:55

by Steven Price

[permalink] [raw]
Subject: Re: [PATCH 2/3] drm/panfrost: Use u64 for size in lock_region

On 23 August 2021 22:11:09 BST, Alyssa Rosenzweig <[email protected]> wrote:
>> > Mali virtual addresses are 48-bit. Use a u64 instead of size_t to ensure
>> > we can express the "lock everything" condition as ~0ULL without relying
>> > on platform-specific behaviour.
>>
>> 'platform-specific behaviour' makes it sound like this is something to
>> do with a particular board. This is 32bit/64bit - it's going to be
>> broken on 32bit: large lock regions are not going to work.
>
>Oh, my. I used the term as a weasel word since the spec is loose on how
>big a size_t actually is. But if this is causing actual breakage on
>armv7 we're in trouble. I'll add a Cc stable tag on v2, unless the Fixes
>implies that?

The usual practice is to put CC: stable in the commit message or the fixes tag (no need to actually CC the stable email address). So just Fixes: should work

>Thanks for pointing this out.

It's not actually quite so bad as it could be as >4GB regions are unlikely (especially on 32bit), but the GPU does of course support such a thing.

Thanks,
Steve

2021-08-23 22:04:16

by Steven Price

[permalink] [raw]
Subject: Re: [PATCH 3/3] drm/panfrost: Clamp lock region to Bifrost minimum

On 23 August 2021 22:13:45 BST, Alyssa Rosenzweig <[email protected]> wrote:
>> > When locking a region, we currently clamp to a PAGE_SIZE as the minimum
>> > lock region. While this is valid for Midgard, it is invalid for Bifrost,
>>
>> While the spec does seem to state it's invalid for Bifrost - kbase
>> didn't bother with a lower clamp for a long time. I actually think this
>> is in many ways more of a spec bug: i.e. implementation details of the
>> round-up that the hardware does. But it's much safer following the spec
>> ;) And it seems like kbase eventually caught up too.
>
>Yeah, makes sense. Should I drop the Cc: stable in that case? If the
>issue is purely theoretical.

I think it might still be worth fixing. Early Bifrost should be fine, but something triggered a bug report that caused kbase to be fixed, so I'm less confident that there's nothing out there that cares. Following both kbase and the spec seems the safest approach.

Thanks,
Steve