From: Ankit Agrawal <[email protected]>
The GHES code allows calling of memory_failure() on the PFNs that pass the
pfn_valid() check. This contract is broken for the remapped PFNs which
fails the check and ghes_do_memory_failure() returns without triggering
memory_failure().
Update code to allow memory_failure() call on PFNs failing pfn_valid().
Signed-off-by: Ankit Agrawal <[email protected]>
---
drivers/acpi/apei/ghes.c | 12 +-----------
1 file changed, 1 insertion(+), 11 deletions(-)
diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index 63ad0541db38..0ca6ab9fccbe 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -471,20 +471,10 @@ static void ghes_kick_task_work(struct callback_head *head)
static bool ghes_do_memory_failure(u64 physical_addr, int flags)
{
- unsigned long pfn;
-
if (!IS_ENABLED(CONFIG_ACPI_APEI_MEMORY_FAILURE))
return false;
- pfn = PHYS_PFN(physical_addr);
- if (!pfn_valid(pfn) && !arch_is_platform_page(physical_addr)) {
- pr_warn_ratelimited(FW_WARN GHES_PFX
- "Invalid address in generic error data: %#llx\n",
- physical_addr);
- return false;
- }
-
- memory_failure_queue(pfn, flags);
+ memory_failure_queue(PHYS_PFN(physical_addr), flags);
return true;
}
--
2.17.1
On Thu, Nov 23, 2023 at 06:05:11AM +0530, [email protected] wrote:
> - pfn = PHYS_PFN(physical_addr);
> - if (!pfn_valid(pfn) && !arch_is_platform_page(physical_addr)) {
> - pr_warn_ratelimited(FW_WARN GHES_PFX
> - "Invalid address in generic error data: %#llx\n",
> - physical_addr);
> - return false;
> - }
You don't just remove a pfn valid test just because your weird device
can't stomach it - you extend it, like
3ad6fd77a2d6 ("x86/sgx: Add check for SGX pages to ghes_do_memory_failure()")
did, for example.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
On Sun, Dec 03, 2023 at 12:23:19AM +0100, Borislav Petkov wrote:
> On Thu, Nov 23, 2023 at 06:05:11AM +0530, [email protected] wrote:
> > - pfn = PHYS_PFN(physical_addr);
> > - if (!pfn_valid(pfn) && !arch_is_platform_page(physical_addr)) {
> > - pr_warn_ratelimited(FW_WARN GHES_PFX
> > - "Invalid address in generic error data: %#llx\n",
> > - physical_addr);
> > - return false;
> > - }
>
> You don't just remove a pfn valid test just because your weird device
> can't stomach it - you extend it, like
It wasn't removed. patch 1 moved it to memory_failure() where it makes
a lot more sense.
Jason
On Mon, Dec 04, 2023 at 10:36:50AM -0400, Jason Gunthorpe wrote:
> It wasn't removed. patch 1 moved it to memory_failure() where it makes
> a lot more sense.
Why is this a separate patch then?
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
>> It wasn't removed. patch 1 moved it to memory_failure() where it makes
>> a lot more sense.
>
> Why is this a separate patch then?
This was done to keep ghes code separate from the memory failure code.
I can merge them if that is preferable.
On Mon, Dec 04, 2023 at 03:54:52PM +0000, Ankit Agrawal wrote:
> >> It wasn't removed. patch 1 moved it to memory_failure() where it makes
> >> a lot more sense.
> >
> > Why is this a separate patch then?
>
> This was done to keep ghes code separate from the memory failure code.
> I can merge them if that is preferable.
A single patch to move just this code could be a good idea
Jason