With deferred IO enabled, a page fault happens when data is written to the
framebuffer device. Then driver determines which page is being updated by
calculating the offset of the written virtual address within the virtual
memory area, and uses this offset to get the updated page within the
internal buffer. This page is later copied to hardware (thus the name
"deferred IO").
This calculation is only correct if the virtual memory area is mapped to
the beginning of the internal buffer. Otherwise this is wrong. For example,
if users do:
mmap(ptr, 4096, PROT_WRITE, MAP_FIXED | MAP_SHARED, fd, 0xff000);
Then the virtual memory area will mapped at offset 0xff000 within the
internal buffer. This offset 0xff000 is not accounted for, and wrong page
is updated. This will lead to wrong pixels being updated on the device.
However, it gets worse: if users do 2 mmap to the same virtual address, for
example:
int fd = open("/dev/fb0", O_RDWR, 0);
char *ptr = (char *) 0x20000000ul;
mmap(ptr, 4096, PROT_WRITE, MAP_FIXED | MAP_SHARED, fd, 0xff000);
*ptr = 0; // write #1
mmap(ptr, 4096, PROT_WRITE, MAP_FIXED | MAP_SHARED, fd, 0);
*ptr = 0; // write #2
In this case, both write #1 and write #2 apply to the same virtual address
(0x20000000ul), and the driver mistakenly thinks the same page is being
written to. When the second write happens, the driver thinks this is the
same page as the last time, and reuse the page from write #1. The driver
then lock_page() an incorrect page, and returns VM_FAULT_LOCKED with the
correct page unlocked. It is unclear what will happen with memory
management subsystem after that, but likely something terrible.
Fix this by taking the mapping offset into account.
Reported-and-tested-by: Harshit Mogalapalli <[email protected]>
Closes: https://lore.kernel.org/linux-fbdev/[email protected]
Fixes: 56c134f7f1b5 ("fbdev: Track deferred-I/O pages in pageref struct")
Cc: [email protected]
Signed-off-by: Nam Cao <[email protected]>
---
drivers/video/fbdev/core/fb_defio.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/video/fbdev/core/fb_defio.c b/drivers/video/fbdev/core/fb_defio.c
index dae96c9f61cf..d5d6cd9e8b29 100644
--- a/drivers/video/fbdev/core/fb_defio.c
+++ b/drivers/video/fbdev/core/fb_defio.c
@@ -196,7 +196,8 @@ static vm_fault_t fb_deferred_io_track_page(struct fb_info *info, unsigned long
*/
static vm_fault_t fb_deferred_io_page_mkwrite(struct fb_info *info, struct vm_fault *vmf)
{
- unsigned long offset = vmf->address - vmf->vma->vm_start;
+ unsigned long offset = vmf->address - vmf->vma->vm_start
+ + (vmf->vma->vm_pgoff << PAGE_SHIFT);
struct page *page = vmf->page;
file_update_time(vmf->vma->vm_file);
--
2.39.2
Hi,
thanks for following through with the bug and sending the patch
Am 19.04.24 um 21:00 schrieb Nam Cao:
> With deferred IO enabled, a page fault happens when data is written to the
> framebuffer device. Then driver determines which page is being updated by
> calculating the offset of the written virtual address within the virtual
> memory area, and uses this offset to get the updated page within the
> internal buffer. This page is later copied to hardware (thus the name
> "deferred IO").
>
> This calculation is only correct if the virtual memory area is mapped to
> the beginning of the internal buffer. Otherwise this is wrong. For example,
> if users do:
> mmap(ptr, 4096, PROT_WRITE, MAP_FIXED | MAP_SHARED, fd, 0xff000);
>
> Then the virtual memory area will mapped at offset 0xff000 within the
> internal buffer. This offset 0xff000 is not accounted for, and wrong page
> is updated. This will lead to wrong pixels being updated on the device.
>
> However, it gets worse: if users do 2 mmap to the same virtual address, for
> example:
>
> int fd = open("/dev/fb0", O_RDWR, 0);
> char *ptr = (char *) 0x20000000ul;
> mmap(ptr, 4096, PROT_WRITE, MAP_FIXED | MAP_SHARED, fd, 0xff000);
> *ptr = 0; // write #1
> mmap(ptr, 4096, PROT_WRITE, MAP_FIXED | MAP_SHARED, fd, 0);
> *ptr = 0; // write #2
>
> In this case, both write #1 and write #2 apply to the same virtual address
> (0x20000000ul), and the driver mistakenly thinks the same page is being
> written to. When the second write happens, the driver thinks this is the
> same page as the last time, and reuse the page from write #1. The driver
> then lock_page() an incorrect page, and returns VM_FAULT_LOCKED with the
> correct page unlocked. It is unclear what will happen with memory
> management subsystem after that, but likely something terrible.
Please tone down the drama. :)
>
> Fix this by taking the mapping offset into account.
>
> Reported-and-tested-by: Harshit Mogalapalli <[email protected]>
> Closes: https://lore.kernel.org/linux-fbdev/[email protected]
> Fixes: 56c134f7f1b5 ("fbdev: Track deferred-I/O pages in pageref struct")
> Cc: [email protected]
> Signed-off-by: Nam Cao <[email protected]>
> ---
> drivers/video/fbdev/core/fb_defio.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/video/fbdev/core/fb_defio.c b/drivers/video/fbdev/core/fb_defio.c
> index dae96c9f61cf..d5d6cd9e8b29 100644
> --- a/drivers/video/fbdev/core/fb_defio.c
> +++ b/drivers/video/fbdev/core/fb_defio.c
> @@ -196,7 +196,8 @@ static vm_fault_t fb_deferred_io_track_page(struct fb_info *info, unsigned long
> */
> static vm_fault_t fb_deferred_io_page_mkwrite(struct fb_info *info, struct vm_fault *vmf)
> {
> - unsigned long offset = vmf->address - vmf->vma->vm_start;
> + unsigned long offset = vmf->address - vmf->vma->vm_start
> + + (vmf->vma->vm_pgoff << PAGE_SHIFT);
The page-fault handler at [1] use vm_fault.pgoff to retrieve the page
structure. Can we do the same here and avoid that computation?
Best regards
Thomas
[1]
https://elixir.bootlin.com/linux/v6.8/source/drivers/video/fbdev/core/fb_defio.c#L100
> struct page *page = vmf->page;
>
> file_update_time(vmf->vma->vm_file);
--
--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstrasse 146, 90461 Nuernberg, Germany
GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman
HRB 36809 (AG Nuernberg)
On Tue, Apr 23, 2024 at 10:37:41AM +0200, Thomas Zimmermann wrote:
> Hi,
>
> thanks for following through with the bug and sending the patch
>
> Am 19.04.24 um 21:00 schrieb Nam Cao:
> > With deferred IO enabled, a page fault happens when data is written to the
> > framebuffer device. Then driver determines which page is being updated by
> > calculating the offset of the written virtual address within the virtual
> > memory area, and uses this offset to get the updated page within the
> > internal buffer. This page is later copied to hardware (thus the name
> > "deferred IO").
> >
> > This calculation is only correct if the virtual memory area is mapped to
> > the beginning of the internal buffer. Otherwise this is wrong. For example,
> > if users do:
> > mmap(ptr, 4096, PROT_WRITE, MAP_FIXED | MAP_SHARED, fd, 0xff000);
> >
> > Then the virtual memory area will mapped at offset 0xff000 within the
> > internal buffer. This offset 0xff000 is not accounted for, and wrong page
> > is updated. This will lead to wrong pixels being updated on the device.
> >
> > However, it gets worse: if users do 2 mmap to the same virtual address, for
> > example:
> >
> > int fd = open("/dev/fb0", O_RDWR, 0);
> > char *ptr = (char *) 0x20000000ul;
> > mmap(ptr, 4096, PROT_WRITE, MAP_FIXED | MAP_SHARED, fd, 0xff000);
> > *ptr = 0; // write #1
> > mmap(ptr, 4096, PROT_WRITE, MAP_FIXED | MAP_SHARED, fd, 0);
> > *ptr = 0; // write #2
> >
> > In this case, both write #1 and write #2 apply to the same virtual address
> > (0x20000000ul), and the driver mistakenly thinks the same page is being
> > written to. When the second write happens, the driver thinks this is the
> > same page as the last time, and reuse the page from write #1. The driver
> > then lock_page() an incorrect page, and returns VM_FAULT_LOCKED with the
> > correct page unlocked. It is unclear what will happen with memory
> > management subsystem after that, but likely something terrible.
>
> Please tone down the drama. :)
Sorry, that wasn't intentional. Writing is hard :(
Let me just cut this out, this info is not really needed to justify the
changes.
>
> >
> > Fix this by taking the mapping offset into account.
> >
> > Reported-and-tested-by: Harshit Mogalapalli <[email protected]>
> > Closes: https://lore.kernel.org/linux-fbdev/[email protected]
> > Fixes: 56c134f7f1b5 ("fbdev: Track deferred-I/O pages in pageref struct")
> > Cc: [email protected]
> > Signed-off-by: Nam Cao <[email protected]>
> > ---
> > drivers/video/fbdev/core/fb_defio.c | 3 ++-
> > 1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/video/fbdev/core/fb_defio.c b/drivers/video/fbdev/core/fb_defio.c
> > index dae96c9f61cf..d5d6cd9e8b29 100644
> > --- a/drivers/video/fbdev/core/fb_defio.c
> > +++ b/drivers/video/fbdev/core/fb_defio.c
> > @@ -196,7 +196,8 @@ static vm_fault_t fb_deferred_io_track_page(struct fb_info *info, unsigned long
> > */
> > static vm_fault_t fb_deferred_io_page_mkwrite(struct fb_info *info, struct vm_fault *vmf)
> > {
> > - unsigned long offset = vmf->address - vmf->vma->vm_start;
> > + unsigned long offset = vmf->address - vmf->vma->vm_start
> > + + (vmf->vma->vm_pgoff << PAGE_SHIFT);
>
> The page-fault handler at [1] use vm_fault.pgoff to retrieve the page
> structure. Can we do the same here and avoid that computation?
Yes, thanks for the suggestion.
It will change things a bit: offset will not be the exact value anymore,
but will be rounded down to multiple of PAGE_SIZE. But that doesn't matter,
because it will only be used to calculate the page offset later on.
We can clean this up and rename this "offset" to "pg_offset". But that's
for another day.
Best regards,
Nam
Hi
Am 23.04.24 um 11:55 schrieb Nam Cao:
[...]
>>> Fix this by taking the mapping offset into account.
>>>
>>> Reported-and-tested-by: Harshit Mogalapalli <[email protected]>
>>> Closes: https://lore.kernel.org/linux-fbdev/[email protected]
>>> Fixes: 56c134f7f1b5 ("fbdev: Track deferred-I/O pages in pageref struct")
>>> Cc: [email protected]
>>> Signed-off-by: Nam Cao <[email protected]>
>>> ---
>>> drivers/video/fbdev/core/fb_defio.c | 3 ++-
>>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/video/fbdev/core/fb_defio.c b/drivers/video/fbdev/core/fb_defio.c
>>> index dae96c9f61cf..d5d6cd9e8b29 100644
>>> --- a/drivers/video/fbdev/core/fb_defio.c
>>> +++ b/drivers/video/fbdev/core/fb_defio.c
>>> @@ -196,7 +196,8 @@ static vm_fault_t fb_deferred_io_track_page(struct fb_info *info, unsigned long
>>> */
>>> static vm_fault_t fb_deferred_io_page_mkwrite(struct fb_info *info, struct vm_fault *vmf)
>>> {
>>> - unsigned long offset = vmf->address - vmf->vma->vm_start;
>>> + unsigned long offset = vmf->address - vmf->vma->vm_start
>>> + + (vmf->vma->vm_pgoff << PAGE_SHIFT);
>> The page-fault handler at [1] use vm_fault.pgoff to retrieve the page
>> structure. Can we do the same here and avoid that computation?
> Yes, thanks for the suggestion.
>
> It will change things a bit: offset will not be the exact value anymore,
> but will be rounded down to multiple of PAGE_SIZE. But that doesn't matter,
> because it will only be used to calculate the page offset later on.
>
> We can clean this up and rename this "offset" to "pg_offset". But that's
> for another day.
But can't we use struct vm_fault.pgoff directly? The page-fault handler
has used it since forever. The look-up code for the pageref should
probably do the same, because there's a 1:1 connection between the page
and the pageref. The pageref structure only exists because we cannot
store its data in struct page directly.
AFAICT pgoff is exactly the value want to compute. See [1] and the
calculation at [2].
[1] https://elixir.bootlin.com/linux/v6.8/source/mm/memory.c#L5222
[2]
https://elixir.bootlin.com/linux/v6.8/source/include/linux/pagemap.h#L957
Best regards
Thomas
>
> Best regards,
> Nam
--
--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Frankenstrasse 146, 90461 Nuernberg, Germany
GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman
HRB 36809 (AG Nuernberg)
On Tue, Apr 23, 2024 at 01:27:09PM +0200, Thomas Zimmermann wrote:
> Am 23.04.24 um 11:55 schrieb Nam Cao:
..
> > > The page-fault handler at [1] use vm_fault.pgoff to retrieve the page
> > > structure. Can we do the same here and avoid that computation?
> > Yes, thanks for the suggestion.
> >
> > It will change things a bit: offset will not be the exact value anymore,
> > but will be rounded down to multiple of PAGE_SIZE. But that doesn't matter,
> > because it will only be used to calculate the page offset later on.
> >
> > We can clean this up and rename this "offset" to "pg_offset". But that's
> > for another day.
>
> But can't we use struct vm_fault.pgoff directly? The page-fault handler has
> used it since forever. The look-up code for the pageref should probably do
> the same, because there's a 1:1 connection between the page and the pageref.
> The pageref structure only exists because we cannot store its data in struct
> page directly.
>
> AFAICT pgoff is exactly the value want to compute. See [1] and? the
> calculation at [2].
Completely agree. I just wanted to point out that the value inside the
variable "offset" won't be the same as before, but it's still correct.
I just tested the patch on real hardware, will send it shortly.
Best regards,
Nam