2018-05-17 09:47:56

by Dou Liyang

[permalink] [raw]
Subject: [PATCH 0/1] Remove the check of duplicate processors

There is a bug in CPU hotplug code, I have two simple fix method, but I can't
ensure which one is better. So sent it out, seek help.

We found some processors which have the same processor ID but in different
PXM in ACPI namespace. such as this:

proc_id | pxm
--------------------
0 <-> 0
1 <-> 0
2 <-> 1
3 <-> 1
......
89 <-> 0
89 <-> 1
89 <-> 2
89 <-> 3
......

So we create a mechanism to validate them. make the processor(ID=89)
as invalid. And once a processor be hotplugged physically, we check its
processor id.

Commit 8e089eaa1999 ("acpi: Provide mechanism to validate processors in the ACPI tables")
Commit a77d6cd96849 ("acpi/processor: Check for duplicate processor ids at hotplug time")

Recently, I found the check mechanism has a bug, it didn't use the
acpi_processor_ids_walk() right and always gave us a wrong result.
First, I fixed it by modifying the value with AE_OK which is the
standard acpi_status value.

https://lkml.org/lkml/2018/3/20/273

But, now, I even think this check is useless. my reasons are following:

1). Based on the practical effect, It works well, and no bug be reported
2). Based on the code, the duplicate cases can be dealed with by

if (invalid_logical_cpuid(pr->id) || !cpu_present(pr->id))

That seems more reasonable, let's see the following case:

Before the patch, After the patch

the first processor(ID=89)
hot-add failed success

the others processor(ID=89)
hot-add failed failed


So, What's your idea about it.

Dou Liyang (1):
acpi/processor: Remove the check of duplicates processor ids

drivers/acpi/acpi_processor.c | 126 ------------------------------------------
include/linux/acpi.h | 3 -
2 files changed, 129 deletions(-)

--
2.14.3





2018-05-17 09:48:12

by Dou Liyang

[permalink] [raw]
Subject: [RFC PATCH 1/1] acpi/processor: Remove the check of duplicate processors

Now, Linux checks the duplicate processor ids at boot time based on the ACPI
namespace, and using it at hotplug time.

But, In fact, it doesn't work, becasue of the wrong walk in all APIC namespace
entries, and even if it would be right, it may miss an hotpluggable CPU(we can
hotplug one of the duplicate processor, the others can be ignored).

So, Remove the check code.

Signed-off-by: Dou Liyang <[email protected]>
---
drivers/acpi/acpi_processor.c | 126 ------------------------------------------
include/linux/acpi.h | 3 -
2 files changed, 129 deletions(-)

diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
index 449d86d39965..8358708e0bbb 100644
--- a/drivers/acpi/acpi_processor.c
+++ b/drivers/acpi/acpi_processor.c
@@ -281,13 +281,6 @@ static int acpi_processor_get_info(struct acpi_device *device)
pr->acpi_id = value;
}

- if (acpi_duplicate_processor_id(pr->acpi_id)) {
- dev_err(&device->dev,
- "Failed to get unique processor _UID (0x%x)\n",
- pr->acpi_id);
- return -ENODEV;
- }
-
pr->phys_id = acpi_get_phys_id(pr->handle, device_declaration,
pr->acpi_id);
if (invalid_phys_cpuid(pr->phys_id))
@@ -579,127 +572,8 @@ static struct acpi_scan_handler processor_container_handler = {
.attach = acpi_processor_container_attach,
};

-/* The number of the unique processor IDs */
-static int nr_unique_ids __initdata;
-
-/* The number of the duplicate processor IDs */
-static int nr_duplicate_ids;
-
-/* Used to store the unique processor IDs */
-static int unique_processor_ids[] __initdata = {
- [0 ... NR_CPUS - 1] = -1,
-};
-
-/* Used to store the duplicate processor IDs */
-static int duplicate_processor_ids[] = {
- [0 ... NR_CPUS - 1] = -1,
-};
-
-static void __init processor_validated_ids_update(int proc_id)
-{
- int i;
-
- if (nr_unique_ids == NR_CPUS||nr_duplicate_ids == NR_CPUS)
- return;
-
- /*
- * Firstly, compare the proc_id with duplicate IDs, if the proc_id is
- * already in the IDs, do nothing.
- */
- for (i = 0; i < nr_duplicate_ids; i++) {
- if (duplicate_processor_ids[i] == proc_id)
- return;
- }
-
- /*
- * Secondly, compare the proc_id with unique IDs, if the proc_id is in
- * the IDs, put it in the duplicate IDs.
- */
- for (i = 0; i < nr_unique_ids; i++) {
- if (unique_processor_ids[i] == proc_id) {
- duplicate_processor_ids[nr_duplicate_ids] = proc_id;
- nr_duplicate_ids++;
- return;
- }
- }
-
- /*
- * Lastly, the proc_id is a unique ID, put it in the unique IDs.
- */
- unique_processor_ids[nr_unique_ids] = proc_id;
- nr_unique_ids++;
-}
-
-static acpi_status __init acpi_processor_ids_walk(acpi_handle handle,
- u32 lvl,
- void *context,
- void **rv)
-{
- acpi_status status;
- acpi_object_type acpi_type;
- unsigned long long uid;
- union acpi_object object = { 0 };
- struct acpi_buffer buffer = { sizeof(union acpi_object), &object };
-
- status = acpi_get_type(handle, &acpi_type);
- if (ACPI_FAILURE(status))
- return false;
-
- switch (acpi_type) {
- case ACPI_TYPE_PROCESSOR:
- status = acpi_evaluate_object(handle, NULL, NULL, &buffer);
- if (ACPI_FAILURE(status))
- goto err;
- uid = object.processor.proc_id;
- break;
-
- case ACPI_TYPE_DEVICE:
- status = acpi_evaluate_integer(handle, "_UID", NULL, &uid);
- if (ACPI_FAILURE(status))
- goto err;
- break;
- default:
- goto err;
- }
-
- processor_validated_ids_update(uid);
- return true;
-
-err:
- acpi_handle_info(handle, "Invalid processor object\n");
- return false;
-
-}
-
-static void __init acpi_processor_check_duplicates(void)
-{
- /* check the correctness for all processors in ACPI namespace */
- acpi_walk_namespace(ACPI_TYPE_PROCESSOR, ACPI_ROOT_OBJECT,
- ACPI_UINT32_MAX,
- acpi_processor_ids_walk,
- NULL, NULL, NULL);
- acpi_get_devices(ACPI_PROCESSOR_DEVICE_HID, acpi_processor_ids_walk,
- NULL, NULL);
-}
-
-bool acpi_duplicate_processor_id(int proc_id)
-{
- int i;
-
- /*
- * compare the proc_id with duplicate IDs, if the proc_id is already
- * in the duplicate IDs, return true, otherwise, return false.
- */
- for (i = 0; i < nr_duplicate_ids; i++) {
- if (duplicate_processor_ids[i] == proc_id)
- return true;
- }
- return false;
-}
-
void __init acpi_processor_init(void)
{
- acpi_processor_check_duplicates();
acpi_scan_add_handler_with_hotplug(&processor_handler, "processor");
acpi_scan_add_handler(&processor_container_handler);
}
diff --git a/include/linux/acpi.h b/include/linux/acpi.h
index 15bfb15c2fa5..068dcfe6768b 100644
--- a/include/linux/acpi.h
+++ b/include/linux/acpi.h
@@ -285,9 +285,6 @@ static inline bool invalid_phys_cpuid(phys_cpuid_t phys_id)
return phys_id == PHYS_CPUID_INVALID;
}

-/* Validate the processor object's proc_id */
-bool acpi_duplicate_processor_id(int proc_id);
-
#ifdef CONFIG_ACPI_HOTPLUG_CPU
/* Arch dependent functions for cpu hotplug support */
int acpi_map_cpu(acpi_handle handle, phys_cpuid_t physid, u32 acpi_id,
--
2.14.3




2018-05-28 08:40:56

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH 0/1] Remove the check of duplicate processors

On Thu, May 17, 2018 at 11:46 AM, Dou Liyang <[email protected]> wrote:
> There is a bug in CPU hotplug code, I have two simple fix method, but I can't
> ensure which one is better. So sent it out, seek help.
>
> We found some processors which have the same processor ID but in different
> PXM in ACPI namespace. such as this:
>
> proc_id | pxm
> --------------------
> 0 <-> 0
> 1 <-> 0
> 2 <-> 1
> 3 <-> 1
> ......
> 89 <-> 0
> 89 <-> 1
> 89 <-> 2
> 89 <-> 3
> ......
>
> So we create a mechanism to validate them. make the processor(ID=89)
> as invalid. And once a processor be hotplugged physically, we check its
> processor id.
>
> Commit 8e089eaa1999 ("acpi: Provide mechanism to validate processors in the ACPI tables")
> Commit a77d6cd96849 ("acpi/processor: Check for duplicate processor ids at hotplug time")
>
> Recently, I found the check mechanism has a bug, it didn't use the
> acpi_processor_ids_walk() right and always gave us a wrong result.
> First, I fixed it by modifying the value with AE_OK which is the
> standard acpi_status value.
>
> https://lkml.org/lkml/2018/3/20/273
>
> But, now, I even think this check is useless. my reasons are following:
>
> 1). Based on the practical effect, It works well, and no bug be reported
> 2). Based on the code, the duplicate cases can be dealed with by
>
> if (invalid_logical_cpuid(pr->id) || !cpu_present(pr->id))
>
> That seems more reasonable, let's see the following case:
>
> Before the patch, After the patch
>
> the first processor(ID=89)
> hot-add failed success
>
> the others processor(ID=89)
> hot-add failed failed
>
>
> So, What's your idea about it.
>
> Dou Liyang (1):
> acpi/processor: Remove the check of duplicates processor ids

Can you please resend the patch with the above information in the
changelog of it?

That would make it easier to review and discuss it.

Also I think that we need some sort of a check against duplicate IDs.

2018-05-28 08:51:58

by Dou Liyang

[permalink] [raw]
Subject: Re: [PATCH 0/1] Remove the check of duplicate processors

Hi Rafael,

At 05/28/2018 04:40 PM, Rafael J. Wysocki wrote:

>
> Can you please resend the patch with the above information in the
> changelog of it?

Yes, I resend the patch right now.

>
> That would make it easier to review and discuss it.
>
> Also I think that we need some sort of a check against duplicate IDs.
>

Thanks,
dou
>
>