Hello!
Changes since RFC-v1:
* riscv is new, ia64 is gone
* The KVM support is different, and upstream - no need to patch the host.
---
This series adds what looks like cpuhotplug support to arm64 for use in
virtual machines. It does this by moving the cpu_register() calls for
architectures that support ACPI out of the arch code by using
GENERIC_CPU_DEVICES, then into the ACPI processor driver.
The kubernetes folk really want to be able to add CPUs to an existing VM,
in exactly the same way they do on x86. The use-case is pre-booting guests
with one CPU, then adding the number that were actually needed when the
workload is provisioned.
Wait? Doesn't arm64 support cpuhotplug already!?
In the arm world, cpuhotplug gets used to mean removing the power from a CPU.
The CPU is offline, and remains present. For x86, and ACPI, cpuhotplug
has the additional step of physically removing the CPU, so that it isn't
present anymore.
Arm64 doesn't support this, and can't support it: CPUs are really a slice
of the SoC, and there is not enough information in the existing ACPI tables
to describe which bits of the slice also got removed. Without a reference
machine, adding this support to the spec is a wild goose chase.
Critically: everything described in the firmware tables must remain present.
For a virtual machine this is easy as all the other bits of 'virtual SoC'
are emulated, so they can (and do) remain present when a vCPU is 'removed'.
On a system that supports cpuhotplug the MADT has to describe every possible
CPU at boot. Under KVM, the vGIC needs to know about every possible vCPU before
the guest is started.
With these constraints, virtual-cpuhotplug is really just a hypervisor/firmware
policy about which CPUs can be brought online.
This series adds support for virtual-cpuhotplug as exactly that: firmware
policy. This may even work on a physical machine too; for a guest the part of
firmware is played by the VMM (typically Qemu).
PSCI support is modified to return 'DENIED' if the CPU can't be brought
online/enabled yet. The CPU object's _STA method's enabled bit is used to
indicate firmware's current disposition. If the CPU has its enabled bit clear,
it will not be registered with sysfs, and attempts to bring it online will
fail. The notifications that _STA has changed its value then work in the same
way as physical hotplug, and firmware can cause the CPU to be registered some
time later, allowing it to be brought online.
This creates something that looks like cpuhotplug to user-space, as the sysfs
files appear and disappear, and the udev notifications look the same.
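The policy described above can be modelled in a few lines. This is only a sketch: struct sketch_vcpu and every sketch_* function are hypothetical names that exist neither in the kernel nor in the VMM, and the errno values returned by sketch_cpu_up() are my own assumption. The PSCI return codes are the values from the PSCI specification.

```c
#include <errno.h>
#include <stdbool.h>

/* PSCI return codes as defined by the PSCI specification. */
#define SKETCH_PSCI_RET_SUCCESS	0
#define SKETCH_PSCI_RET_DENIED	(-3)

/* Hypothetical model of one vCPU; none of these names exist in the kernel. */
struct sketch_vcpu {
	bool sta_enabled;	/* firmware policy: the _STA enabled bit */
	bool registered;	/* OS has created the sysfs entries */
	bool online;
};

/* Firmware side: refuse CPU_ON while policy says the CPU is disabled. */
static int sketch_psci_cpu_on(const struct sketch_vcpu *v)
{
	return v->sta_enabled ? SKETCH_PSCI_RET_SUCCESS : SKETCH_PSCI_RET_DENIED;
}

/* OS side: react to a notification that _STA has changed. */
static void sketch_handle_sta_notify(struct sketch_vcpu *v)
{
	v->registered = v->sta_enabled;
}

/* OS side: user-space asks for the CPU to be brought online. */
static int sketch_cpu_up(struct sketch_vcpu *v)
{
	/* Assumed error codes; the real paths may report something else. */
	if (!v->registered)
		return -ENODEV;
	if (sketch_psci_cpu_on(v) != SKETCH_PSCI_RET_SUCCESS)
		return -EPERM;

	v->online = true;
	return 0;
}
```

A vCPU that starts with sta_enabled clear is refused by both checks. Once firmware sets the bit and the notification is handled, sketch_cpu_up() succeeds.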
One notable difference is the CPU present mask, which is exposed via sysfs.
Because the CPUs remain present throughout, they can still be seen in that mask.
This value does get used by web browsers to estimate the number of CPUs,
as the CPU online mask is constantly changed on mobile phones.
Linux is tolerant of PSCI returning errors, as it has always been allowed to do
that. To avoid confusing OSes that can't tolerate this, we needed an additional
bit in the MADT GICC flags. This series copies ACPI_MADT_ONLINE_CAPABLE, which
appears to be for this purpose, but calls it ACPI_MADT_GICC_CPU_CAPABLE as it
has a different bit position in the GICC.
This code is unconditionally enabled for all ACPI architectures.
If there are problems with firmware tables on some devices, the CPUs will
already be online by the time the acpi_processor_make_enabled() is called.
A mismatch here causes a firmware-bug message and kernel taint. This should
only affect people with broken firmware who also boot with maxcpus=1, and
bring CPUs online later.
I had a go at switching the remaining architectures over to GENERIC_CPU_DEVICES,
so that the Kconfig symbol can be removed, but I got stuck with powerpc
and s390.
I've only build-tested LoongArch and riscv. I've removed the ia64 specific
patches, but left the changes in other patches to make git-grep review of
renames easier.
If folk want to play along at home, you'll need a copy of Qemu that supports this.
https://github.com/salil-mehta/qemu.git salil/virt-cpuhp-armv8/rfc-v2-rc6
Replace your '-smp' argument with something like:
| -smp cpus=1,maxcpus=3,cores=3,threads=1,sockets=1
then feed the following to the Qemu monitor:
| (qemu) device_add driver=host-arm-cpu,core-id=1,id=cpu1
| (qemu) device_del cpu1
Why is this still an RFC? I'm still looking for confirmation from the
kubernetes/kata folk that this works for them. Because of this I've culled
the CC list...
This series is based on v6.6-rc1, and can be retrieved from:
https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git/ virtual_cpu_hotplug/rfc/v2
Thanks,
James Morse (34):
ACPI: Move ACPI_HOTPLUG_CPU to be disabled on arm64 and riscv
drivers: base: Use present CPUs in GENERIC_CPU_DEVICES
drivers: base: Allow parts of GENERIC_CPU_DEVICES to be overridden
drivers: base: Move cpu_dev_init() after node_dev_init()
drivers: base: Print a warning instead of panic() when register_cpu()
fails
arm64: setup: Switch over to GENERIC_CPU_DEVICES using
arch_register_cpu()
x86: intel_epb: Don't rely on link order
x86/topology: Switch over to GENERIC_CPU_DEVICES
LoongArch: Switch over to GENERIC_CPU_DEVICES
riscv: Switch over to GENERIC_CPU_DEVICES
arch_topology: Make register_cpu_capacity_sysctl() tolerant to late
CPUs
ACPI: Use the acpi_device_is_present() helper in more places
ACPI: Rename acpi_scan_device_not_present() to be about enumeration
ACPI: Only enumerate enabled (or functional) devices
ACPI: processor: Add support for processors described as container
packages
ACPI: processor: Register CPUs that are online, but not described in
the DSDT
ACPI: processor: Register all CPUs from acpi_processor_get_info()
ACPI: Rename ACPI_HOTPLUG_CPU to include 'present'
ACPI: Move acpi_bus_trim_one() before acpi_scan_hot_remove()
ACPI: Rename acpi_processor_hotadd_init and remove pre-processor
guards
ACPI: Add post_eject to struct acpi_scan_handler for cpu hotplug
ACPI: Check _STA present bit before making CPUs not present
ACPI: Warn when the present bit changes but the feature is not enabled
drivers: base: Implement weak arch_unregister_cpu()
LoongArch: Use the __weak version of arch_unregister_cpu()
arm64: acpi: Move get_cpu_for_acpi_id() to a header
ACPICA: Add new MADT GICC flags fields [code first?]
arm64, irqchip/gic-v3, ACPI: Move MADT GICC enabled check into a
helper
irqchip/gic-v3: Don't return errors from gic_acpi_match_gicc()
irqchip/gic-v3: Add support for ACPI's disabled but 'online capable'
CPUs
ACPI: add support to register CPUs based on the _STA enabled bit
arm64: document virtual CPU hotplug's expectations
ACPI: Add _OSC bits to advertise OS support for toggling CPU
present/enabled
cpumask: Add enabled cpumask for present CPUs that can be brought
online
Jean-Philippe Brucker (1):
arm64: psci: Ignore DENIED CPUs
Documentation/arch/arm64/cpu-hotplug.rst | 79 ++++++++++
Documentation/arch/arm64/index.rst | 1 +
arch/arm64/Kconfig | 1 +
arch/arm64/include/asm/acpi.h | 11 ++
arch/arm64/include/asm/cpu.h | 1 -
arch/arm64/kernel/acpi_numa.c | 11 --
arch/arm64/kernel/psci.c | 2 +-
arch/arm64/kernel/setup.c | 13 +-
arch/arm64/kernel/smp.c | 5 +-
arch/ia64/Kconfig | 2 +
arch/ia64/include/asm/acpi.h | 2 +-
arch/ia64/include/asm/cpu.h | 5 -
arch/ia64/kernel/acpi.c | 6 +-
arch/ia64/kernel/setup.c | 2 +-
arch/ia64/kernel/topology.c | 2 +-
arch/loongarch/Kconfig | 2 +
arch/loongarch/configs/loongson3_defconfig | 2 +-
arch/loongarch/kernel/acpi.c | 4 +-
arch/loongarch/kernel/topology.c | 38 +----
arch/riscv/Kconfig | 1 +
arch/riscv/kernel/setup.c | 19 +--
arch/x86/Kconfig | 3 +
arch/x86/include/asm/cpu.h | 6 -
arch/x86/kernel/acpi/boot.c | 4 +-
arch/x86/kernel/cpu/intel_epb.c | 2 +-
arch/x86/kernel/topology.c | 25 +---
drivers/acpi/Kconfig | 14 +-
drivers/acpi/acpi_processor.c | 160 ++++++++++++++++-----
drivers/acpi/bus.c | 16 +++
drivers/acpi/device_pm.c | 2 +-
drivers/acpi/device_sysfs.c | 2 +-
drivers/acpi/internal.h | 1 -
drivers/acpi/processor_core.c | 2 +-
drivers/acpi/property.c | 2 +-
drivers/acpi/scan.c | 147 ++++++++++++-------
drivers/base/arch_topology.c | 38 +++--
drivers/base/cpu.c | 40 ++++--
drivers/base/init.c | 2 +-
drivers/firmware/psci/psci.c | 2 +
drivers/irqchip/irq-gic-v3.c | 38 ++---
include/acpi/acpi_bus.h | 1 +
include/acpi/actbl2.h | 1 +
include/acpi/processor.h | 2 +-
include/linux/acpi.h | 14 +-
include/linux/cpu.h | 6 +
include/linux/cpumask.h | 25 ++++
kernel/cpu.c | 3 +
47 files changed, 516 insertions(+), 251 deletions(-)
create mode 100644 Documentation/arch/arm64/cpu-hotplug.rst
--
2.39.2
acpi_processor_hotadd_init() will make a CPU present by mapping it
based on its hardware id.
'hotadd_init' is ambiguous once there are two different behaviours
for cpu hotplug. This is for toggling the _STA present bit. Subsequent
patches will add support for toggling the _STA enabled bit, named
acpi_processor_make_enabled().
Rename it acpi_processor_make_present() to make it clear this is
for CPUs that were not previously present.
Expose the function prototypes it uses to allow the preprocessor
guards to be removed. The IS_ENABLED() check will let the compiler
dead-code elimination pass remove this if it isn't going to be
used.
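As a rough illustration of the pattern (not the kernel's macro machinery), the sketch below uses a hypothetical constant, SKETCH_HOTPLUG_PRESENT_CPU, where the patch uses IS_ENABLED(CONFIG_ACPI_HOTPLUG_PRESENT_CPU). The sketch_* functions are made up for this example. A stub definition of sketch_map_cpu() is included so the example links at any optimisation level.

```c
#include <errno.h>

/*
 * Hypothetical stand-in for IS_ENABLED(CONFIG_ACPI_HOTPLUG_PRESENT_CPU).
 * The real macro is more involved; here it is a plain constant.
 */
#define SKETCH_HOTPLUG_PRESENT_CPU 0

/* The prototype is always visible, with no #ifdef around it. */
int sketch_map_cpu(int phys_id);

/* Stub so this example links even when nothing is optimised away. */
int sketch_map_cpu(int phys_id)
{
	return phys_id;
}

static int sketch_make_present(int phys_id)
{
	/* Constant condition: the compiler can discard everything below. */
	if (!SKETCH_HOTPLUG_PRESENT_CPU)
		return -ENODEV;

	return sketch_map_cpu(phys_id);
}
```

Compared with the #ifdef/#else pair, the body is always parsed and type-checked, so it cannot silently bit-rot on configurations that never build it.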
Signed-off-by: James Morse <[email protected]>
---
drivers/acpi/acpi_processor.c | 14 +++++---------
include/linux/acpi.h | 2 --
2 files changed, 5 insertions(+), 11 deletions(-)
diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
index 75257fae10e7..22a15a614f95 100644
--- a/drivers/acpi/acpi_processor.c
+++ b/drivers/acpi/acpi_processor.c
@@ -182,13 +182,15 @@ static void __init acpi_pcc_cpufreq_init(void) {}
#endif /* CONFIG_X86 */
/* Initialization */
-#ifdef CONFIG_ACPI_HOTPLUG_PRESENT_CPU
-static int acpi_processor_hotadd_init(struct acpi_processor *pr)
+static int acpi_processor_make_present(struct acpi_processor *pr)
{
unsigned long long sta;
acpi_status status;
int ret;
+ if (!IS_ENABLED(CONFIG_ACPI_HOTPLUG_PRESENT_CPU))
+ return -ENODEV;
+
if (invalid_phys_cpuid(pr->phys_id))
return -ENODEV;
@@ -222,12 +224,6 @@ static int acpi_processor_hotadd_init(struct acpi_processor *pr)
cpu_maps_update_done();
return ret;
}
-#else
-static inline int acpi_processor_hotadd_init(struct acpi_processor *pr)
-{
- return -ENODEV;
-}
-#endif /* CONFIG_ACPI_HOTPLUG_PRESENT_CPU */
static int acpi_processor_get_info(struct acpi_device *device)
{
@@ -335,7 +331,7 @@ static int acpi_processor_get_info(struct acpi_device *device)
* because cpuid <-> apicid mapping is persistent now.
*/
if (invalid_logical_cpuid(pr->id) || !cpu_present(pr->id)) {
- int ret = acpi_processor_hotadd_init(pr);
+ int ret = acpi_processor_make_present(pr);
if (ret)
return ret;
diff --git a/include/linux/acpi.h b/include/linux/acpi.h
index 651dd43976a9..b7ab85857bb7 100644
--- a/include/linux/acpi.h
+++ b/include/linux/acpi.h
@@ -316,12 +316,10 @@ static inline int acpi_processor_evaluate_cst(acpi_handle handle, u32 cpu,
}
#endif
-#ifdef CONFIG_ACPI_HOTPLUG_PRESENT_CPU
/* Arch dependent functions for cpu hotplug support */
int acpi_map_cpu(acpi_handle handle, phys_cpuid_t physid, u32 acpi_id,
int *pcpu);
int acpi_unmap_cpu(int cpu);
-#endif /* CONFIG_ACPI_HOTPLUG_PRESENT_CPU */
#ifdef CONFIG_ACPI_HOTPLUG_IOAPIC
int acpi_get_ioapic_id(acpi_handle handle, u32 gsi_base, u64 *phys_addr);
--
2.39.2
intel_epb_init() is called as a subsys_initcall() to register cpuhp
callbacks. The callbacks make use of get_cpu_device() which will return
NULL unless register_cpu() has been called. register_cpu() is called
from topology_init(), which is also a subsys_initcall().
This is fragile. Moving the register_cpu() to a different
subsys_initcall() leads to a NULL dereference during boot.
Make intel_epb_init() a late_initcall(); user-space can't provide a
policy before this point anyway.
Signed-off-by: James Morse <[email protected]>
---
subsys_initcall_sync() would be an option, but moving the register_cpu()
calls into ACPI also means adding a safety net for CPUs that are online
but not described properly by firmware. This lives in subsys_initcall_sync().
---
arch/x86/kernel/cpu/intel_epb.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kernel/cpu/intel_epb.c b/arch/x86/kernel/cpu/intel_epb.c
index e4c3ba91321c..f18d35fe27a9 100644
--- a/arch/x86/kernel/cpu/intel_epb.c
+++ b/arch/x86/kernel/cpu/intel_epb.c
@@ -237,4 +237,4 @@ static __init int intel_epb_init(void)
cpuhp_remove_state(CPUHP_AP_X86_INTEL_EPB_ONLINE);
return ret;
}
-subsys_initcall(intel_epb_init);
+late_initcall(intel_epb_init);
--
2.39.2
Now that GENERIC_CPU_DEVICES calls arch_register_cpu(), which can be
overridden by the arch code, switch over to this to allow common code
to choose when the register_cpu() call is made.
This allows topology_init() to be removed.
This is an intermediate step to the logic being moved to drivers/acpi,
where GENERIC_CPU_DEVICES will do the work when booting with acpi=off.
Signed-off-by: James Morse <[email protected]>
---
arch/loongarch/Kconfig | 1 +
arch/loongarch/kernel/topology.c | 29 ++---------------------------
2 files changed, 3 insertions(+), 27 deletions(-)
diff --git a/arch/loongarch/Kconfig b/arch/loongarch/Kconfig
index 2bddd202470e..5bed51adc68c 100644
--- a/arch/loongarch/Kconfig
+++ b/arch/loongarch/Kconfig
@@ -72,6 +72,7 @@ config LOONGARCH
select GENERIC_CLOCKEVENTS
select GENERIC_CMOS_UPDATE
select GENERIC_CPU_AUTOPROBE
+ select GENERIC_CPU_DEVICES
select GENERIC_ENTRY
select GENERIC_GETTIMEOFDAY
select GENERIC_IOREMAP if !ARCH_IOREMAP
diff --git a/arch/loongarch/kernel/topology.c b/arch/loongarch/kernel/topology.c
index caa7cd859078..8e4441c1ff39 100644
--- a/arch/loongarch/kernel/topology.c
+++ b/arch/loongarch/kernel/topology.c
@@ -7,20 +7,13 @@
#include <linux/percpu.h>
#include <asm/bootinfo.h>
-static DEFINE_PER_CPU(struct cpu, cpu_devices);
-
#ifdef CONFIG_HOTPLUG_CPU
int arch_register_cpu(int cpu)
{
- int ret;
struct cpu *c = &per_cpu(cpu_devices, cpu);
- c->hotpluggable = 1;
- ret = register_cpu(c, cpu);
- if (ret < 0)
- pr_warn("register_cpu %d failed (%d)\n", cpu, ret);
-
- return ret;
+ c->hotpluggable = !io_master(cpu);
+ return register_cpu(c, cpu);
}
EXPORT_SYMBOL(arch_register_cpu);
@@ -33,21 +26,3 @@ void arch_unregister_cpu(int cpu)
}
EXPORT_SYMBOL(arch_unregister_cpu);
#endif
-
-static int __init topology_init(void)
-{
- int i, ret;
-
- for_each_present_cpu(i) {
- struct cpu *c = &per_cpu(cpu_devices, i);
-
- c->hotpluggable = !io_master(i);
- ret = register_cpu(c, i);
- if (ret < 0)
- pr_warn("topology_init: register_cpu %d failed (%d)\n", i, ret);
- }
-
- return 0;
-}
-
-subsys_initcall(topology_init);
--
2.39.2
Today the ACPI enumeration code 'visits' all devices that are present.
This is a problem for arm64, where CPUs are always present, but not
always enabled. When a device-check occurs because the firmware-policy
has changed and a CPU is now enabled, the following error occurs:
| acpi ACPI0007:48: Enumeration failure
This is ultimately because acpi_dev_ready_for_enumeration() returns
true for a device that is not enabled. The ACPI Processor driver
will not register such CPUs as they are not 'decoding their resources'.
Change acpi_dev_ready_for_enumeration() to also check the enabled bit.
ACPI allows a device to be functional instead of maintaining the
present and enabled bit. Make this behaviour an explicit check with
a reference to the spec, and then check the present and enabled bits.
This is needed to avoid enumerating present && functional devices that
are not enabled.
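The resulting decision can be read as a small truth table. The sketch below restates it outside the kernel, assuming a simplified struct sketch_sta in place of the status bits in struct acpi_device. The dependency (dep_unmet) check is left out.

```c
#include <stdbool.h>

/* Simplified stand-in for the _STA-derived bits in struct acpi_device. */
struct sketch_sta {
	bool present;
	bool enabled;
	bool functional;
};

/* Mirrors the new check, without the honor_deps/dep_unmet test. */
static bool sketch_ready_for_enumeration(struct sketch_sta s)
{
	/* Neither present nor enabled: only 'functional' can qualify. */
	if (!s.present && !s.enabled)
		return s.functional;

	/* Otherwise both present and enabled must be set. */
	return s.present && s.enabled;
}
```

The case that changes is present && functional && !enabled: the old helper treated it as ready, the new check does not.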
Signed-off-by: James Morse <[email protected]>
---
If this change causes problems on deployed hardware, I suggest an
arch opt-in: ACPI_IGNORE_STA_ENABLED, that causes
acpi_dev_ready_for_enumeration() to only check the present bit.
---
drivers/acpi/device_pm.c | 2 +-
drivers/acpi/device_sysfs.c | 2 +-
drivers/acpi/internal.h | 1 -
drivers/acpi/property.c | 2 +-
drivers/acpi/scan.c | 23 +++++++++++++----------
5 files changed, 16 insertions(+), 14 deletions(-)
diff --git a/drivers/acpi/device_pm.c b/drivers/acpi/device_pm.c
index f007116a8427..76c38478a502 100644
--- a/drivers/acpi/device_pm.c
+++ b/drivers/acpi/device_pm.c
@@ -313,7 +313,7 @@ int acpi_bus_init_power(struct acpi_device *device)
return -EINVAL;
device->power.state = ACPI_STATE_UNKNOWN;
- if (!acpi_device_is_present(device)) {
+ if (!acpi_dev_ready_for_enumeration(device)) {
device->flags.initialized = false;
return -ENXIO;
}
diff --git a/drivers/acpi/device_sysfs.c b/drivers/acpi/device_sysfs.c
index b9bbf0746199..16e586d74aa2 100644
--- a/drivers/acpi/device_sysfs.c
+++ b/drivers/acpi/device_sysfs.c
@@ -141,7 +141,7 @@ static int create_pnp_modalias(const struct acpi_device *acpi_dev, char *modalia
struct acpi_hardware_id *id;
/* Avoid unnecessarily loading modules for non present devices. */
- if (!acpi_device_is_present(acpi_dev))
+ if (!acpi_dev_ready_for_enumeration(acpi_dev))
return 0;
/*
diff --git a/drivers/acpi/internal.h b/drivers/acpi/internal.h
index 866c7c4ed233..a1b45e345bcc 100644
--- a/drivers/acpi/internal.h
+++ b/drivers/acpi/internal.h
@@ -107,7 +107,6 @@ int acpi_device_setup_files(struct acpi_device *dev);
void acpi_device_remove_files(struct acpi_device *dev);
void acpi_device_add_finalize(struct acpi_device *device);
void acpi_free_pnp_ids(struct acpi_device_pnp *pnp);
-bool acpi_device_is_present(const struct acpi_device *adev);
bool acpi_device_is_battery(struct acpi_device *adev);
bool acpi_device_is_first_physical_node(struct acpi_device *adev,
const struct device *dev);
diff --git a/drivers/acpi/property.c b/drivers/acpi/property.c
index 413e4fcadcaf..e03f00b98701 100644
--- a/drivers/acpi/property.c
+++ b/drivers/acpi/property.c
@@ -1418,7 +1418,7 @@ static bool acpi_fwnode_device_is_available(const struct fwnode_handle *fwnode)
if (!is_acpi_device_node(fwnode))
return false;
- return acpi_device_is_present(to_acpi_device_node(fwnode));
+ return acpi_dev_ready_for_enumeration(to_acpi_device_node(fwnode));
}
static const void *
diff --git a/drivers/acpi/scan.c b/drivers/acpi/scan.c
index 17ab875a7d4e..f898591ce05f 100644
--- a/drivers/acpi/scan.c
+++ b/drivers/acpi/scan.c
@@ -304,7 +304,7 @@ static int acpi_scan_device_check(struct acpi_device *adev)
int error;
acpi_bus_get_status(adev);
- if (acpi_device_is_present(adev)) {
+ if (acpi_dev_ready_for_enumeration(adev)) {
/*
* This function is only called for device objects for which
* matching scan handlers exist. The only situation in which
@@ -338,7 +338,7 @@ static int acpi_scan_bus_check(struct acpi_device *adev, void *not_used)
int error;
acpi_bus_get_status(adev);
- if (!acpi_device_is_present(adev)) {
+ if (!acpi_dev_ready_for_enumeration(adev)) {
acpi_scan_device_not_enumerated(adev);
return 0;
}
@@ -1908,11 +1908,6 @@ static bool acpi_device_should_be_hidden(acpi_handle handle)
return true;
}
-bool acpi_device_is_present(const struct acpi_device *adev)
-{
- return adev->status.present || adev->status.functional;
-}
-
static bool acpi_scan_handler_matching(struct acpi_scan_handler *handler,
const char *idstr,
const struct acpi_device_id **matchid)
@@ -2375,16 +2370,24 @@ EXPORT_SYMBOL_GPL(acpi_dev_clear_dependencies);
* acpi_dev_ready_for_enumeration - Check if the ACPI device is ready for enumeration
* @device: Pointer to the &struct acpi_device to check
*
- * Check if the device is present and has no unmet dependencies.
+ * Check if the device is functional or enabled and has no unmet dependencies.
*
- * Return true if the device is ready for enumeratino. Otherwise, return false.
+ * Return true if the device is ready for enumeration. Otherwise, return false.
*/
bool acpi_dev_ready_for_enumeration(const struct acpi_device *device)
{
if (device->flags.honor_deps && device->dep_unmet)
return false;
- return acpi_device_is_present(device);
+ /*
+ * ACPI 6.5's 6.3.7 "_STA (Device Status)" allows firmware to return
+ * (!present && functional) for certain types of devices that should be
+ * enumerated.
+ */
+ if (!device->status.present && !device->status.enabled)
+ return device->status.functional;
+
+ return device->status.present && device->status.enabled;
}
EXPORT_SYMBOL_GPL(acpi_dev_ready_for_enumeration);
--
2.39.2
ACPI has two ways of describing processors in the DSDT. Either as a device
object with HID ACPI0007, or as a type 'C' package inside a Processor
Container. The ACPI processor driver probes CPUs described as devices, but
not those described as packages.
Duplicate descriptions are not allowed: the ACPI processor driver already
parses the UID from both devices and containers. acpi_processor_get_info()
returns an error if the UID exists twice in the DSDT.
The missing probe for CPUs described as packages creates a problem for
moving the cpu_register() calls into the acpi_processor driver, as CPUs
described like this don't get registered, leading to errors from other
subsystems when they try to add new sysfs entries to the CPU node.
(e.g. topology_sysfs_init()'s use of topology_add_dev() via cpuhp)
To fix this, parse the processor container and call acpi_processor_add()
for each processor that is discovered like this. The processor container
handler is added with acpi_scan_add_handler(), so no detach call will
arrive.
Qemu TCG describes CPUs using packages in a processor container.
Signed-off-by: James Morse <[email protected]>
---
drivers/acpi/acpi_processor.c | 22 ++++++++++++++++++++++
1 file changed, 22 insertions(+)
diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
index c0839bcf78c1..b4bde78121bb 100644
--- a/drivers/acpi/acpi_processor.c
+++ b/drivers/acpi/acpi_processor.c
@@ -625,9 +625,31 @@ static struct acpi_scan_handler processor_handler = {
},
};
+static acpi_status acpi_processor_container_walk(acpi_handle handle,
+ u32 lvl,
+ void *context,
+ void **rv)
+{
+ struct acpi_device *adev;
+ acpi_status status;
+
+ adev = acpi_get_acpi_dev(handle);
+ if (!adev)
+ return AE_ERROR;
+
+ status = acpi_processor_add(adev, &processor_device_ids[0]);
+ acpi_put_acpi_dev(adev);
+
+ return status;
+}
+
static int acpi_processor_container_attach(struct acpi_device *dev,
const struct acpi_device_id *id)
{
+ acpi_walk_namespace(ACPI_TYPE_PROCESSOR, dev->handle,
+ ACPI_UINT32_MAX, acpi_processor_container_walk,
+ NULL, NULL, NULL);
+
return 1;
}
--
2.39.2
LoongArch provides its own arch_unregister_cpu(). This clears the
hotpluggable flag, then unregisters the CPU.
It isn't necessary to clear the hotpluggable flag when unregistering
a cpu. unregister_cpu() writes NULL to the percpu cpu_sys_devices
pointer, meaning cpu_is_hotpluggable() will return false, as
get_cpu_device() has returned NULL.
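A minimal model of that argument, assuming a plain array in place of the percpu cpu_sys_devices pointer and hypothetical sketch_* names throughout. The real cpu_is_hotpluggable() works on struct device and may apply further checks.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_NR_CPUS 4

struct sketch_cpu {
	bool hotpluggable;
};

/* Stand-in for the percpu cpu_sys_devices pointer. */
static struct sketch_cpu *sketch_cpu_devices[SKETCH_NR_CPUS];

static void sketch_register_cpu(struct sketch_cpu *c, int cpu)
{
	sketch_cpu_devices[cpu] = c;
}

static void sketch_unregister_cpu(int cpu)
{
	/* Only the pointer is cleared; the hotpluggable flag is untouched. */
	sketch_cpu_devices[cpu] = NULL;
}

static bool sketch_cpu_is_hotpluggable(int cpu)
{
	struct sketch_cpu *c = sketch_cpu_devices[cpu];

	/* A NULL device short-circuits before the flag is ever read. */
	return c && c->hotpluggable;
}
```

After sketch_unregister_cpu() the query returns false even though the flag is still set, which is why clearing it first adds nothing.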
Remove arch_unregister_cpu() and use the __weak version.
Signed-off-by: James Morse <[email protected]>
---
arch/loongarch/kernel/topology.c | 9 ---------
1 file changed, 9 deletions(-)
diff --git a/arch/loongarch/kernel/topology.c b/arch/loongarch/kernel/topology.c
index 8e4441c1ff39..5a75e2cc0848 100644
--- a/arch/loongarch/kernel/topology.c
+++ b/arch/loongarch/kernel/topology.c
@@ -16,13 +16,4 @@ int arch_register_cpu(int cpu)
return register_cpu(c, cpu);
}
EXPORT_SYMBOL(arch_register_cpu);
-
-void arch_unregister_cpu(int cpu)
-{
- struct cpu *c = &per_cpu(cpu_devices, cpu);
-
- c->hotpluggable = 0;
- unregister_cpu(c);
-}
-EXPORT_SYMBOL(arch_unregister_cpu);
#endif
--
2.39.2
Add the new flag field to the MADT's GICC structure.
'Online Capable' indicates a disabled CPU can be enabled later.
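One way to read the two flags together is sketched below. The bit positions match this patch and the existing Enabled bit; the enum and sketch_classify_gicc() are hypothetical, and letting Enabled win when both bits are set is my assumption rather than something this patch defines.

```c
#include <stdint.h>

#define SKETCH_MADT_ENABLED		(1u << 0)	/* usable at boot */
#define SKETCH_MADT_GICC_CPU_CAPABLE	(1u << 3)	/* online capable */

enum sketch_cpu_state {
	SKETCH_CPU_UNUSABLE,		/* neither bit set */
	SKETCH_CPU_ENABLED,		/* Enabled bit set */
	SKETCH_CPU_ONLINE_CAPABLE,	/* disabled now, may be enabled later */
};

static enum sketch_cpu_state sketch_classify_gicc(uint32_t flags)
{
	/* Assumption: Enabled takes priority if both bits are set. */
	if (flags & SKETCH_MADT_ENABLED)
		return SKETCH_CPU_ENABLED;
	if (flags & SKETCH_MADT_GICC_CPU_CAPABLE)
		return SKETCH_CPU_ONLINE_CAPABLE;

	return SKETCH_CPU_UNUSABLE;
}
```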
Signed-off-by: James Morse <[email protected]>
---
This patch probably needs to go via the upstream acpica project,
but is included here so the feature can be tested.
---
include/acpi/actbl2.h | 1 +
1 file changed, 1 insertion(+)
diff --git a/include/acpi/actbl2.h b/include/acpi/actbl2.h
index 3751ae69432f..c433a079d8e1 100644
--- a/include/acpi/actbl2.h
+++ b/include/acpi/actbl2.h
@@ -1046,6 +1046,7 @@ struct acpi_madt_generic_interrupt {
/* ACPI_MADT_ENABLED (1) Processor is usable if set */
#define ACPI_MADT_PERFORMANCE_IRQ_MODE (1<<1) /* 01: Performance Interrupt Mode */
#define ACPI_MADT_VGIC_IRQ_MODE (1<<2) /* 02: VGIC Maintenance Interrupt mode */
+#define ACPI_MADT_GICC_CPU_CAPABLE (1<<3) /* 03: CPU is online capable */
/* 12: Generic Distributor (ACPI 5.0 + ACPI 6.0 changes) */
--
2.39.2
ACPI firmware can trigger the events to add and remove CPUs, but the
OS may not support this.
Print a warning when this happens.
This gives early warning on arm64 systems that don't support
CONFIG_ACPI_HOTPLUG_PRESENT_CPU, as making CPUs not present has
side effects for other parts of the system.
Signed-off-by: James Morse <[email protected]>
---
drivers/acpi/acpi_processor.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
index 2cafea1edc24..b67616079751 100644
--- a/drivers/acpi/acpi_processor.c
+++ b/drivers/acpi/acpi_processor.c
@@ -188,8 +188,10 @@ static int acpi_processor_make_present(struct acpi_processor *pr)
acpi_status status;
int ret;
- if (!IS_ENABLED(CONFIG_ACPI_HOTPLUG_PRESENT_CPU))
+ if (!IS_ENABLED(CONFIG_ACPI_HOTPLUG_PRESENT_CPU)) {
+ pr_err_once("Changing CPU present bit is not supported\n");
return -ENODEV;
+ }
if (invalid_phys_cpuid(pr->phys_id))
return -ENODEV;
@@ -462,8 +464,10 @@ static void acpi_processor_make_not_present(struct acpi_device *device)
{
struct acpi_processor *pr;
- if (!IS_ENABLED(CONFIG_ACPI_HOTPLUG_PRESENT_CPU))
+ if (!IS_ENABLED(CONFIG_ACPI_HOTPLUG_PRESENT_CPU)) {
+ pr_err_once("Changing CPU present bit is not supported\n");
return;
+ }
pr = acpi_driver_data(device);
if (pr->id >= nr_cpu_ids)
--
2.39.2
ACPI identifies CPUs by UID. get_cpu_for_acpi_id() maps the ACPI UID
to the linux CPU number.
The helper to retrieve this mapping is only available in arm64's numa
code.
Move it to live next to get_acpi_id_for_cpu().
Signed-off-by: James Morse <[email protected]>
---
arch/arm64/include/asm/acpi.h | 11 +++++++++++
arch/arm64/kernel/acpi_numa.c | 11 -----------
2 files changed, 11 insertions(+), 11 deletions(-)
diff --git a/arch/arm64/include/asm/acpi.h b/arch/arm64/include/asm/acpi.h
index 4d537d56eb84..ce5045038e87 100644
--- a/arch/arm64/include/asm/acpi.h
+++ b/arch/arm64/include/asm/acpi.h
@@ -100,6 +100,17 @@ static inline u32 get_acpi_id_for_cpu(unsigned int cpu)
return acpi_cpu_get_madt_gicc(cpu)->uid;
}
+static inline int get_cpu_for_acpi_id(u32 uid)
+{
+ int cpu;
+
+ for (cpu = 0; cpu < nr_cpu_ids; cpu++)
+ if (uid == get_acpi_id_for_cpu(cpu))
+ return cpu;
+
+ return -EINVAL;
+}
+
static inline void arch_fix_phys_package_id(int num, u32 slot) { }
void __init acpi_init_cpus(void);
int apei_claim_sea(struct pt_regs *regs);
diff --git a/arch/arm64/kernel/acpi_numa.c b/arch/arm64/kernel/acpi_numa.c
index e51535a5f939..0c036a9a3c33 100644
--- a/arch/arm64/kernel/acpi_numa.c
+++ b/arch/arm64/kernel/acpi_numa.c
@@ -34,17 +34,6 @@ int __init acpi_numa_get_nid(unsigned int cpu)
return acpi_early_node_map[cpu];
}
-static inline int get_cpu_for_acpi_id(u32 uid)
-{
- int cpu;
-
- for (cpu = 0; cpu < nr_cpu_ids; cpu++)
- if (uid == get_acpi_id_for_cpu(cpu))
- return cpu;
-
- return -EINVAL;
-}
-
static int __init acpi_parse_gicc_pxm(union acpi_subtable_headers *header,
const unsigned long end)
{
--
2.39.2
Platform firmware can disable a CPU, or make it not-present, by making
an eject-request notification, then waiting for the OS to make it offline
and call _EJx. Afterwards, the firmware updates _STA with the new status.
Not all operating systems support this. For arm64 making CPUs not-present
has never been supported. For all ACPI architectures, making CPUs disabled
has recently been added. Firmware can't know what the OS has support for.
Add two new _OSC bits to advertise whether the OS supports the _STA enabled
or present bits being toggled for CPUs. This will be important for arm64
if systems that support physical CPU hotplug ever appear as arm64 linux
doesn't currently support this, so firmware shouldn't try.
Advertising this support to firmware is useful for cloud orchestrators
to know whether they can scale a particular VM by adding CPUs.
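The negotiation amounts to setting bits in the request and testing which ones survive in the reply. A self-contained sketch follows, using the bit values proposed in this patch; struct sketch_osc_result and the sketch_* helpers are hypothetical and only stand in for the capbuf handling in acpi_bus_osc_negotiate_platform_control().

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit values as proposed in this patch. */
#define SKETCH_OSC_SB_HOTPLUG_ENABLED_SUPPORT	0x00800000u
#define SKETCH_OSC_SB_HOTPLUG_PRESENT_SUPPORT	0x01000000u

struct sketch_osc_result {
	bool enabled_acked;
	bool present_acked;
};

/* Build the support bits the OS advertises to firmware. */
static uint32_t sketch_osc_request(bool hotplug_present_cpu)
{
	uint32_t support = SKETCH_OSC_SB_HOTPLUG_ENABLED_SUPPORT;

	if (hotplug_present_cpu)
		support |= SKETCH_OSC_SB_HOTPLUG_PRESENT_SUPPORT;

	return support;
}

/* Record which of the two bits firmware left set in its reply. */
static struct sketch_osc_result sketch_osc_parse(uint32_t returned)
{
	struct sketch_osc_result r = {
		.enabled_acked =
			(returned & SKETCH_OSC_SB_HOTPLUG_ENABLED_SUPPORT) != 0,
		.present_acked =
			(returned & SKETCH_OSC_SB_HOTPLUG_PRESENT_SUPPORT) != 0,
	};

	return r;
}
```

An OS built without present-bit hotplug only ever sends the enabled bit, so firmware has no bit to acknowledge for the present case.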
Signed-off-by: James Morse <[email protected]>
---
I'm assuming ia64 with physical hotplug machines once existed, and
that LoongArch machines with support for this don't.
---
arch/ia64/Kconfig | 1 +
arch/x86/Kconfig | 1 +
drivers/acpi/Kconfig | 9 +++++++++
drivers/acpi/acpi_processor.c | 14 +++++++++++++-
drivers/acpi/bus.c | 16 ++++++++++++++++
include/linux/acpi.h | 4 ++++
6 files changed, 44 insertions(+), 1 deletion(-)
diff --git a/arch/ia64/Kconfig b/arch/ia64/Kconfig
index 54972f9fe804..13df676bad67 100644
--- a/arch/ia64/Kconfig
+++ b/arch/ia64/Kconfig
@@ -17,6 +17,7 @@ config IA64
select ARCH_MIGHT_HAVE_PC_SERIO
select ACPI
select ACPI_HOTPLUG_PRESENT_CPU if ACPI_PROCESSOR && HOTPLUG_CPU
+ select ACPI_HOTPLUG_IGNORE_OSC if ACPI
select ACPI_NUMA if NUMA
select ARCH_ENABLE_MEMORY_HOTPLUG
select ARCH_ENABLE_MEMORY_HOTREMOVE
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 295a7a3debb6..5fea3ce9594e 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -61,6 +61,7 @@ config X86
select ACPI_LEGACY_TABLES_LOOKUP if ACPI
select ACPI_SYSTEM_POWER_STATES_SUPPORT if ACPI
select ACPI_HOTPLUG_PRESENT_CPU if ACPI_PROCESSOR && HOTPLUG_CPU
+ select ACPI_HOTPLUG_IGNORE_OSC if ACPI && HOTPLUG_CPU
select ARCH_32BIT_OFF_T if X86_32
select ARCH_CLOCKSOURCE_INIT
select ARCH_CORRECT_STACKTRACE_ON_KRETPROBE
diff --git a/drivers/acpi/Kconfig b/drivers/acpi/Kconfig
index 417f9f3077d2..c49978b4b11f 100644
--- a/drivers/acpi/Kconfig
+++ b/drivers/acpi/Kconfig
@@ -310,6 +310,15 @@ config ACPI_HOTPLUG_PRESENT_CPU
depends on ACPI_PROCESSOR && HOTPLUG_CPU
select ACPI_CONTAINER
+config ACPI_HOTPLUG_IGNORE_OSC
+ bool
+ depends on ACPI_HOTPLUG_PRESENT_CPU
+ help
+ Ignore whether firmware acknowledged support for toggling the CPU
+ present bit in _STA. Some architectures predate the _OSC bits, so
+ firmware doesn't know to do this.
+
+
config ACPI_PROCESSOR_AGGREGATOR
tristate "Processor Aggregator"
depends on ACPI_PROCESSOR
diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
index b49859eab01a..87926f22c857 100644
--- a/drivers/acpi/acpi_processor.c
+++ b/drivers/acpi/acpi_processor.c
@@ -181,6 +181,18 @@ static void __init acpi_pcc_cpufreq_init(void)
static void __init acpi_pcc_cpufreq_init(void) {}
#endif /* CONFIG_X86 */
+static bool acpi_processor_hotplug_present_supported(void)
+{
+ if (!IS_ENABLED(CONFIG_ACPI_HOTPLUG_PRESENT_CPU))
+ return false;
+
+ /* x86 systems pre-date the _OSC bit */
+ if (IS_ENABLED(CONFIG_ACPI_HOTPLUG_IGNORE_OSC))
+ return true;
+
+ return osc_sb_hotplug_present_support_acked;
+}
+
/* Initialization */
static int acpi_processor_make_present(struct acpi_processor *pr)
{
@@ -188,7 +200,7 @@ static int acpi_processor_make_present(struct acpi_processor *pr)
acpi_status status;
int ret;
- if (!IS_ENABLED(CONFIG_ACPI_HOTPLUG_PRESENT_CPU)) {
+ if (!acpi_processor_hotplug_present_supported()) {
pr_err_once("Changing CPU present bit is not supported\n");
return -ENODEV;
}
diff --git a/drivers/acpi/bus.c b/drivers/acpi/bus.c
index f41dda2d3493..123c28c2eda3 100644
--- a/drivers/acpi/bus.c
+++ b/drivers/acpi/bus.c
@@ -298,6 +298,13 @@ EXPORT_SYMBOL_GPL(osc_sb_native_usb4_support_confirmed);
bool osc_sb_cppc2_support_acked;
+/*
+ * ACPI 6.? Proposed Operating System Capabilities for modifying CPU
+ * present/enable.
+ */
+bool osc_sb_hotplug_enabled_support_acked;
+bool osc_sb_hotplug_present_support_acked;
+
static u8 sb_uuid_str[] = "0811B06E-4A27-44F9-8D60-3CBBC22E7B48";
static void acpi_bus_osc_negotiate_platform_control(void)
{
@@ -346,6 +353,11 @@ static void acpi_bus_osc_negotiate_platform_control(void)
if (!ghes_disable)
capbuf[OSC_SUPPORT_DWORD] |= OSC_SB_APEI_SUPPORT;
+
+ capbuf[OSC_SUPPORT_DWORD] |= OSC_SB_HOTPLUG_ENABLED_SUPPORT;
+ if (IS_ENABLED(CONFIG_ACPI_HOTPLUG_PRESENT_CPU))
+ capbuf[OSC_SUPPORT_DWORD] |= OSC_SB_HOTPLUG_PRESENT_SUPPORT;
+
if (ACPI_FAILURE(acpi_get_handle(NULL, "\\_SB", &handle)))
return;
@@ -383,6 +395,10 @@ static void acpi_bus_osc_negotiate_platform_control(void)
capbuf_ret[OSC_SUPPORT_DWORD] & OSC_SB_NATIVE_USB4_SUPPORT;
osc_cpc_flexible_adr_space_confirmed =
capbuf_ret[OSC_SUPPORT_DWORD] & OSC_SB_CPC_FLEXIBLE_ADR_SPACE;
+ osc_sb_hotplug_enabled_support_acked =
+ capbuf_ret[OSC_SUPPORT_DWORD] & OSC_SB_HOTPLUG_ENABLED_SUPPORT;
+ osc_sb_hotplug_present_support_acked =
+ capbuf_ret[OSC_SUPPORT_DWORD] & OSC_SB_HOTPLUG_PRESENT_SUPPORT;
}
kfree(context.ret.pointer);
diff --git a/include/linux/acpi.h b/include/linux/acpi.h
index 92cb25349a18..2ba7e0b10bcf 100644
--- a/include/linux/acpi.h
+++ b/include/linux/acpi.h
@@ -580,12 +580,16 @@ acpi_status acpi_run_osc(acpi_handle handle, struct acpi_osc_context *context);
#define OSC_SB_NATIVE_USB4_SUPPORT 0x00040000
#define OSC_SB_PRM_SUPPORT 0x00200000
#define OSC_SB_FFH_OPR_SUPPORT 0x00400000
+#define OSC_SB_HOTPLUG_ENABLED_SUPPORT 0x00800000
+#define OSC_SB_HOTPLUG_PRESENT_SUPPORT 0x01000000
extern bool osc_sb_apei_support_acked;
extern bool osc_pc_lpi_support_confirmed;
extern bool osc_sb_native_usb4_support_confirmed;
extern bool osc_sb_cppc2_support_acked;
extern bool osc_cpc_flexible_adr_space_confirmed;
+extern bool osc_sb_hotplug_enabled_support_acked;
+extern bool osc_sb_hotplug_present_support_acked;
/* USB4 Capabilities */
#define OSC_USB_USB3_TUNNELING 0x00000001
--
2.39.2
gic_acpi_match_gicc() is only called via gic_acpi_count_gicr_regions().
It should only count the number of enabled redistributors, but it
also tries to sanity check the GICC entry, currently returning an
error if the Enabled bit is set, but the gicr_base_address is zero.
Adding support for the online-capable bit to the sanity check
complicates it, for no benefit. The existing check implicitly
depends on gic_acpi_count_gicr_regions() previously failing to find
any GICR regions (as it is valid to have gicr_base_address of zero if
the redistributors are described via a GICR entry).
Instead of complicating the check, remove it. Failures that happen
at this point cause the irqchip not to register, meaning no irqs
can be requested. The kernel grinds to a panic() pretty quickly.
Without the check, MADT tables that exhibit this problem are still
caught by gic_populate_rdist(), which helpfully also prints what
went wrong:
| CPU4: mpidr 100 has no re-distributor!
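
For illustration only, the simplified counting can be modelled in plain C
outside the kernel. This is a sketch, not the driver code: the struct, the
SKETCH_MADT_ENABLED constant and the function name are hypothetical
stand-ins, and the only assumption taken from the tables is that bit 0 of
the GICC flags is the Enabled bit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: bit 0 of the GICC flags is the Enabled bit. */
#define SKETCH_MADT_ENABLED 0x1u

/* Hypothetical stand-in for the two GICC fields the check looks at. */
struct sketch_gicc {
	uint32_t flags;
	uint64_t gicr_base_address;
};

/*
 * Count redistributors described via enabled GICC entries. No entry is
 * treated as an error here; a CPU left without a redistributor is
 * expected to be reported later, when redistributors are populated.
 */
static unsigned int sketch_count_enabled_rdists(const struct sketch_gicc *gicc,
						size_t nr_entries)
{
	unsigned int enabled_rdists = 0;
	size_t i;

	assert(gicc || nr_entries == 0);

	for (i = 0; i < nr_entries; i++) {
		if ((gicc[i].flags & SKETCH_MADT_ENABLED) &&
		    gicc[i].gicr_base_address)
			enabled_rdists++;
	}

	return enabled_rdists;
}
```

An enabled entry with a zero gicr_base_address is simply not counted,
which is the case the removed check used to turn into -ENODEV.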
Signed-off-by: James Morse <[email protected]>
---
drivers/irqchip/irq-gic-v3.c | 18 ++++++------------
1 file changed, 6 insertions(+), 12 deletions(-)
diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
index 72d3cdebdad1..0f54811262eb 100644
--- a/drivers/irqchip/irq-gic-v3.c
+++ b/drivers/irqchip/irq-gic-v3.c
@@ -2415,21 +2415,15 @@ static int __init gic_acpi_match_gicc(union acpi_subtable_headers *header,
/*
* If GICC is enabled and has valid gicr base address, then it means
- * GICR base is presented via GICC
+ * GICR base is presented via GICC. The redistributor is only known to
+ * be accessible if the GICC is marked as enabled. If this bit is not
+ * set, we'd need to add the redistributor at runtime, which isn't
+ * supported.
*/
- if (acpi_gicc_is_usable(gicc) && gicc->gicr_base_address) {
+ if (gicc->flags & ACPI_MADT_ENABLED && gicc->gicr_base_address)
acpi_data.enabled_rdists++;
- return 0;
- }
- /*
- * It's perfectly valid firmware can pass disabled GICC entry, driver
- * should not treat as errors, skip the entry instead of probe fail.
- */
- if (!acpi_gicc_is_usable(gicc))
- return 0;
-
- return -ENODEV;
+ return 0;
}
static int __init gic_acpi_count_gicr_regions(void)
--
2.39.2
Three of the five ACPI architectures create sysfs entries using
register_cpu() for present CPUs, whereas arm64, riscv and all
GENERIC_CPU_DEVICES architectures do this for possible CPUs.
Registering a CPU is what causes it to show up in sysfs.
It makes very little sense to register all possible CPUs. Registering
a CPU is what triggers the udev notifications allowing user-space to
react to newly added CPUs.
To allow all five ACPI architectures to use GENERIC_CPU_DEVICES, change
it to use for_each_present_cpu(). Making the ACPI architectures use
GENERIC_CPU_DEVICES is a pre-requisite step to centralise their
cpu_register() logic, before moving it into the ACPI processor driver.
When ACPI is disabled this work would be done by
cpu_dev_register_generic().
Of the ACPI architectures that register possible CPUs, arm64 and riscv
do not support making possible CPUs present as they use the weak 'always
fails' version of arch_register_cpu().
Only two of the eight architectures that use GENERIC_CPU_DEVICES have a
distinction between present and possible CPUs.
The following architectures use GENERIC_CPU_DEVICES but are not SMP,
so possible == present:
* m68k
* microblaze
* nios2
The following architectures use GENERIC_CPU_DEVICES and consider
possible == present:
* csky: setup_smp()
* parisc: smp_prepare_boot_cpu() marks the boot cpu as present,
processor_probe() sets possible for all CPUs and present for all CPUs
except the boot cpu.
um appears to be a subarchitecture of x86.
The remaining architectures using GENERIC_CPU_DEVICES are:
* openrisc and hexagon:
where smp_init_cpus() makes all CPUs < NR_CPUS possible,
whereas smp_prepare_cpus() only makes CPUs < setup_max_cpus present.
After this change, openrisc and hexagon systems that use the max_cpus
command line argument would not see the other CPUs present in sysfs.
This should not be a problem: these CPUs can't be brought online, as
_cpu_up() checks cpu_present().
After this change, only CPUs which are present appear in sysfs.
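
A rough userspace model of the openrisc/hexagon case may help show the
visible difference. It is only a sketch under stated assumptions: the
function names are hypothetical, CPU masks are squeezed into a 32-bit
word, and 'possible' and 'present' are derived from NR_CPUS and max_cpus
as described above.

```c
#include <assert.h>
#include <stdint.h>

/* Bitmask with one bit set for each CPU number below n (n <= 32). */
static uint32_t sketch_cpus_below(unsigned int n)
{
	assert(n <= 32);
	return n == 32 ? UINT32_MAX : (UINT32_C(1) << n) - 1;
}

/*
 * Model of an architecture where every CPU below nr_cpus is possible,
 * but only CPUs below max_cpus are present. Returns the mask of CPUs
 * that would get a sysfs entry.
 */
static uint32_t sketch_registered_cpus(unsigned int nr_cpus,
				       unsigned int max_cpus,
				       int walk_present)
{
	uint32_t possible = sketch_cpus_below(nr_cpus);
	uint32_t present = sketch_cpus_below(max_cpus) & possible;

	/* Before the change the loop walks possible, afterwards present. */
	return walk_present ? present : possible;
}
```

With nr_cpus of 4 and max_cpus of 2 the model registers CPUs 0-3 when
walking possible CPUs, and only CPUs 0-1 when walking present CPUs.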
Signed-off-by: James Morse <[email protected]>
---
drivers/base/cpu.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c
index 9ea22e165acd..34b48f660b6b 100644
--- a/drivers/base/cpu.c
+++ b/drivers/base/cpu.c
@@ -533,7 +533,7 @@ static void __init cpu_dev_register_generic(void)
#ifdef CONFIG_GENERIC_CPU_DEVICES
int i;
- for_each_possible_cpu(i) {
+ for_each_present_cpu(i) {
if (register_cpu(&per_cpu(cpu_devices, i), i))
panic("Failed to register CPU device");
}
--
2.39.2
Neither arm64 nor riscv support physical hotadd of CPUs that were not
present at boot. For arm64 much of the platform description is in static
tables which do not have update methods. arm64 does support HOTPLUG_CPU,
which is backed by a firmware interface to turn CPUs on and off.
acpi_processor_hotadd_init() and acpi_processor_remove() are for adding
and removing CPUs that were not present at boot. arm64 systems that do this
are not supported as there is currently insufficient information in the
platform description. (e.g. did the GICR get removed too?)
arm64 currently relies on the MADT enabled flag check in map_gicc_mpidr()
to prevent CPUs that were not described as present at boot from being
added to the system. Similarly, riscv relies on the same check in
map_rintc_hartid(). Both architectures also rely on the weak 'always fails'
definitions of acpi_map_cpu() and arch_register_cpu().
Subsequent changes will redefine ACPI_HOTPLUG_CPU as making possible
CPUs present. Neither arm64 nor riscv support this.
Disable ACPI_HOTPLUG_CPU for arm64 and riscv by removing 'default y' and
selecting it on the other three ACPI architectures. This allows the weak
definitions of some symbols to be removed.
Signed-off-by: James Morse <[email protected]>
---
Changes since RFC:
* Expanded conditions to avoid ACPI_HOTPLUG_CPU being enabled when
HOTPLUG_CPU isn't.
---
arch/ia64/Kconfig | 1 +
arch/loongarch/Kconfig | 1 +
arch/loongarch/include/asm/cpu.h | 7 +++++++
arch/x86/Kconfig | 1 +
drivers/acpi/Kconfig | 1 -
drivers/acpi/acpi_processor.c | 18 ------------------
6 files changed, 10 insertions(+), 19 deletions(-)
diff --git a/arch/ia64/Kconfig b/arch/ia64/Kconfig
index 53faa122b0f4..a3bfd42467ab 100644
--- a/arch/ia64/Kconfig
+++ b/arch/ia64/Kconfig
@@ -16,6 +16,7 @@ config IA64
select ARCH_MIGHT_HAVE_PC_PARPORT
select ARCH_MIGHT_HAVE_PC_SERIO
select ACPI
+ select ACPI_HOTPLUG_CPU if ACPI_PROCESSOR && HOTPLUG_CPU
select ACPI_NUMA if NUMA
select ARCH_ENABLE_MEMORY_HOTPLUG
select ARCH_ENABLE_MEMORY_HOTREMOVE
diff --git a/arch/loongarch/Kconfig b/arch/loongarch/Kconfig
index e14396a2ddcb..2bddd202470e 100644
--- a/arch/loongarch/Kconfig
+++ b/arch/loongarch/Kconfig
@@ -5,6 +5,7 @@ config LOONGARCH
select ACPI
select ACPI_GENERIC_GSI if ACPI
select ACPI_MCFG if ACPI
+ select ACPI_HOTPLUG_CPU if ACPI_PROCESSOR && HOTPLUG_CPU
select ACPI_PPTT if ACPI
select ACPI_SYSTEM_POWER_STATES_SUPPORT if ACPI
select ARCH_BINFMT_ELF_STATE
diff --git a/arch/loongarch/include/asm/cpu.h b/arch/loongarch/include/asm/cpu.h
index 48b9f7168bcc..7afe8cbb844e 100644
--- a/arch/loongarch/include/asm/cpu.h
+++ b/arch/loongarch/include/asm/cpu.h
@@ -128,4 +128,11 @@ enum cpu_type_enum {
#define LOONGARCH_CPU_HYPERVISOR BIT_ULL(CPU_FEATURE_HYPERVISOR)
#define LOONGARCH_CPU_PTW BIT_ULL(CPU_FEATURE_PTW)
+#if !defined(__ASSEMBLY__)
+#ifdef CONFIG_HOTPLUG_CPU
+int arch_register_cpu(int num);
+void arch_unregister_cpu(int cpu);
+#endif
+#endif /* ! __ASSEMBLY__ */
+
#endif /* _ASM_CPU_H */
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 982b777eadc7..a0100a1ab4a0 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -60,6 +60,7 @@ config X86
#
select ACPI_LEGACY_TABLES_LOOKUP if ACPI
select ACPI_SYSTEM_POWER_STATES_SUPPORT if ACPI
+ select ACPI_HOTPLUG_CPU if ACPI_PROCESSOR && HOTPLUG_CPU
select ARCH_32BIT_OFF_T if X86_32
select ARCH_CLOCKSOURCE_INIT
select ARCH_CORRECT_STACKTRACE_ON_KRETPROBE
diff --git a/drivers/acpi/Kconfig b/drivers/acpi/Kconfig
index cee82b473dc5..8456d48ba702 100644
--- a/drivers/acpi/Kconfig
+++ b/drivers/acpi/Kconfig
@@ -309,7 +309,6 @@ config ACPI_HOTPLUG_CPU
bool
depends on ACPI_PROCESSOR && HOTPLUG_CPU
select ACPI_CONTAINER
- default y
config ACPI_PROCESSOR_AGGREGATOR
tristate "Processor Aggregator"
diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
index c711db8a9c33..c0839bcf78c1 100644
--- a/drivers/acpi/acpi_processor.c
+++ b/drivers/acpi/acpi_processor.c
@@ -183,24 +183,6 @@ static void __init acpi_pcc_cpufreq_init(void) {}
/* Initialization */
#ifdef CONFIG_ACPI_HOTPLUG_CPU
-int __weak acpi_map_cpu(acpi_handle handle,
- phys_cpuid_t physid, u32 acpi_id, int *pcpu)
-{
- return -ENODEV;
-}
-
-int __weak acpi_unmap_cpu(int cpu)
-{
- return -ENODEV;
-}
-
-int __weak arch_register_cpu(int cpu)
-{
- return -ENODEV;
-}
-
-void __weak arch_unregister_cpu(int cpu) {}
-
static int acpi_processor_hotadd_init(struct acpi_processor *pr)
{
unsigned long long sta;
--
2.39.2
When called, acpi_processor_post_eject() unconditionally makes a CPU
not-present and unregisters it.
To add support for AML events where the CPU has become disabled, but
remains present, the _STA method should be checked before making the
CPU not-present.
Rename acpi_processor_post_eject() to acpi_processor_make_not_present(),
and add a new acpi_processor_post_eject() that checks _STA before
calling it.
Adding the function prototype for arch_unregister_cpu() allows the
preprocessor guards to be removed.
After this change CPUs will remain registered and visible to
user-space as offline if buggy firmware triggers an eject-request,
but doesn't clear the corresponding _STA bits after _EJ0 has been
called.
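
The decision made after an eject can be sketched as a small pure
function. This is an illustrative model, not the driver code: the enum
and function names are hypothetical, and the only ACPI detail assumed is
that bit 0 of the _STA result is the 'present' bit.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: bit 0 of the _STA result is the 'present' bit. */
#define SKETCH_STA_PRESENT 0x01ULL

enum sketch_eject_action {
	SKETCH_KEEP_REGISTERED,		/* leave the CPU as it is */
	SKETCH_MAKE_NOT_PRESENT,	/* unregister and clear present */
};

/*
 * cpu_present: the kernel currently considers the CPU present.
 * sta_ok:      evaluating _STA succeeded.
 * sta:         the value _STA returned (only valid if sta_ok).
 */
static enum sketch_eject_action sketch_post_eject(bool cpu_present,
						  bool sta_ok,
						  unsigned long long sta)
{
	/* If _STA can't be evaluated, do nothing. */
	if (!sta_ok)
		return SKETCH_KEEP_REGISTERED;

	/* Only act when firmware really cleared the present bit. */
	if (cpu_present && !(sta & SKETCH_STA_PRESENT))
		return SKETCH_MAKE_NOT_PRESENT;

	return SKETCH_KEEP_REGISTERED;
}
```

The buggy-firmware case described above corresponds to _STA still
reporting the present bit after _EJ0, which falls through to
SKETCH_KEEP_REGISTERED.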
Signed-off-by: James Morse <[email protected]>
---
drivers/acpi/acpi_processor.c | 31 +++++++++++++++++++++++++------
include/linux/cpu.h | 1 +
2 files changed, 26 insertions(+), 6 deletions(-)
diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
index 00dcc23d49a8..2cafea1edc24 100644
--- a/drivers/acpi/acpi_processor.c
+++ b/drivers/acpi/acpi_processor.c
@@ -457,13 +457,12 @@ static int acpi_processor_add(struct acpi_device *device,
return result;
}
-#ifdef CONFIG_ACPI_HOTPLUG_PRESENT_CPU
/* Removal */
-static void acpi_processor_post_eject(struct acpi_device *device)
+static void acpi_processor_make_not_present(struct acpi_device *device)
{
struct acpi_processor *pr;
- if (!device || !acpi_driver_data(device))
+ if (!IS_ENABLED(CONFIG_ACPI_HOTPLUG_PRESENT_CPU))
return;
pr = acpi_driver_data(device);
@@ -501,7 +500,29 @@ static void acpi_processor_post_eject(struct acpi_device *device)
free_cpumask_var(pr->throttling.shared_cpu_map);
kfree(pr);
}
-#endif /* CONFIG_ACPI_HOTPLUG_PRESENT_CPU */
+
+static void acpi_processor_post_eject(struct acpi_device *device)
+{
+ struct acpi_processor *pr;
+ unsigned long long sta;
+ acpi_status status;
+
+ if (!device)
+ return;
+
+ pr = acpi_driver_data(device);
+ if (!pr || pr->id >= nr_cpu_ids || invalid_phys_cpuid(pr->phys_id))
+ return;
+
+ status = acpi_evaluate_integer(pr->handle, "_STA", NULL, &sta);
+ if (ACPI_FAILURE(status))
+ return;
+
+ if (cpu_present(pr->id) && !(sta & ACPI_STA_DEVICE_PRESENT)) {
+ acpi_processor_make_not_present(device);
+ return;
+ }
+}
#ifdef CONFIG_ARCH_MIGHT_HAVE_ACPI_PDC
bool __init processor_physically_present(acpi_handle handle)
@@ -626,9 +647,7 @@ static const struct acpi_device_id processor_device_ids[] = {
static struct acpi_scan_handler processor_handler = {
.ids = processor_device_ids,
.attach = acpi_processor_add,
-#ifdef CONFIG_ACPI_HOTPLUG_PRESENT_CPU
.post_eject = acpi_processor_post_eject,
-#endif
.hotplug = {
.enabled = true,
},
diff --git a/include/linux/cpu.h b/include/linux/cpu.h
index a71691d7c2ca..e117c06e0c6b 100644
--- a/include/linux/cpu.h
+++ b/include/linux/cpu.h
@@ -81,6 +81,7 @@ struct device *cpu_device_create(struct device *parent, void *drvdata,
const struct attribute_group **groups,
const char *fmt, ...);
extern int arch_register_cpu(int cpu);
+extern void arch_unregister_cpu(int cpu);
#ifdef CONFIG_HOTPLUG_CPU
extern void unregister_cpu(struct cpu *cpu);
extern ssize_t arch_cpu_probe(const char *, size_t);
--
2.39.2
To allow ACPI's _STA value to hide CPUs that are present, but not
available to online right now due to VMM or firmware policy, the
register_cpu() call needs to be made by the ACPI machinery when ACPI
is in use. This allows it to hide CPUs that are unavailable from sysfs.
Switching to GENERIC_CPU_DEVICES is an intermediate step to allow all
five ACPI architectures to be modified at once.
Switch over to GENERIC_CPU_DEVICES, and provide an arch_register_cpu()
that populates the hotpluggable flag. arch_register_cpu() is also the
interface the ACPI machinery expects.
The struct cpu in struct cpuinfo_arm64 is never used directly. Remove
it and use the one GENERIC_CPU_DEVICES provides.
This changes the CPUs visible in sysfs from possible to present, but
on arm64 smp_prepare_cpus() ensures these are the same.
Signed-off-by: James Morse <[email protected]>
---
arch/arm64/Kconfig | 1 +
arch/arm64/include/asm/cpu.h | 1 -
arch/arm64/kernel/setup.c | 13 ++++---------
3 files changed, 5 insertions(+), 10 deletions(-)
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index b10515c0200b..7b3990abf87a 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -132,6 +132,7 @@ config ARM64
select GENERIC_ARCH_TOPOLOGY
select GENERIC_CLOCKEVENTS_BROADCAST
select GENERIC_CPU_AUTOPROBE
+ select GENERIC_CPU_DEVICES
select GENERIC_CPU_VULNERABILITIES
select GENERIC_EARLY_IOREMAP
select GENERIC_IDLE_POLL_SETUP
diff --git a/arch/arm64/include/asm/cpu.h b/arch/arm64/include/asm/cpu.h
index e749838b9c5d..887bd0d992bb 100644
--- a/arch/arm64/include/asm/cpu.h
+++ b/arch/arm64/include/asm/cpu.h
@@ -38,7 +38,6 @@ struct cpuinfo_32bit {
};
struct cpuinfo_arm64 {
- struct cpu cpu;
struct kobject kobj;
u64 reg_ctr;
u64 reg_cntfrq;
diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index 417a8a86b2db..165bd2c0dd5a 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -402,19 +402,14 @@ static inline bool cpu_can_disable(unsigned int cpu)
return false;
}
-static int __init topology_init(void)
+int arch_register_cpu(int num)
{
- int i;
+ struct cpu *cpu = &per_cpu(cpu_devices, num);
- for_each_possible_cpu(i) {
- struct cpu *cpu = &per_cpu(cpu_data.cpu, i);
- cpu->hotpluggable = cpu_can_disable(i);
- register_cpu(cpu, i);
- }
+ cpu->hotpluggable = cpu_can_disable(num);
- return 0;
+ return register_cpu(cpu, num);
}
-subsys_initcall(topology_init);
static void dump_kernel_offset(void)
{
--
2.39.2
To support virtual CPU hotplug, ACPI has added an 'online capable' bit
to the MADT GICC entries. This indicates that a disabled CPU entry may
not be possible to bring online via PSCI until firmware has set the
enabled bit in _STA.
What about the redistributor in the GICC entry? ACPI doesn't want to say.
Assume the worst: When a redistributor is described in the GICC entry,
but the entry is marked as disabled at boot, assume the redistributor
is inaccessible.
The GICv3 driver doesn't support late online of redistributors, so this
means the corresponding CPU can't be brought online either. Clear the
possible and present bits.
Systems that want CPU hotplug in a VM can ensure their redistributors
are always-on, and describe them that way with a GICR entry in the MADT.
When mapping redistributors found via GICC entries, handle the case
where the arch code believes the CPU is present and possible, but it
does not have an accessible redistributor. Print a warning and clear
the present and possible bits.
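
The resulting three-way decision for a GICC entry that describes its own
redistributor can be sketched as below. This is a standalone model with
hypothetical names. It assumes the Enabled flag is bit 0 and the online
capable flag is bit 3 of the GICC flags field; check both against the
ACPI specification before relying on them.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: Enabled is bit 0 of the GICC flags. */
#define SKETCH_GICC_ENABLED		(UINT32_C(1) << 0)
/* Assumption: online capable is bit 3 of the GICC flags. */
#define SKETCH_GICC_ONLINE_CAPABLE	(UINT32_C(1) << 3)

enum sketch_gicc_action {
	SKETCH_SKIP_ENTRY,	/* neither enabled nor online capable */
	SKETCH_BLOCK_CPU,	/* warn, clear present and possible */
	SKETCH_MAP_RDIST,	/* redistributor can be mapped now */
};

static enum sketch_gicc_action sketch_parse_gicc(uint32_t flags)
{
	/* Entries that can never be used are silently skipped. */
	if (!(flags & (SKETCH_GICC_ENABLED | SKETCH_GICC_ONLINE_CAPABLE)))
		return SKETCH_SKIP_ENTRY;

	/*
	 * Online capable but disabled: the redistributor may not be
	 * accessible yet, so the CPU can't be brought online.
	 */
	if (!(flags & SKETCH_GICC_ENABLED))
		return SKETCH_BLOCK_CPU;

	return SKETCH_MAP_RDIST;
}
```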
Signed-off-by: James Morse <[email protected]>
---
Disabled but online-capable CPUs cause this message to be printed
if their redistributors are described via GICC:
| GICv3: CPU 3's redistributor is inaccessible: this CPU can't be brought online
If ACPI's _STA tries to make the cpu present later, this message is printed:
| Changing CPU present bit is not supported
---
drivers/irqchip/irq-gic-v3.c | 14 ++++++++++++++
include/linux/acpi.h | 3 ++-
2 files changed, 16 insertions(+), 1 deletion(-)
diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
index 0f54811262eb..f56d064f4aa9 100644
--- a/drivers/irqchip/irq-gic-v3.c
+++ b/drivers/irqchip/irq-gic-v3.c
@@ -2365,11 +2365,25 @@ gic_acpi_parse_madt_gicc(union acpi_subtable_headers *header,
(struct acpi_madt_generic_interrupt *)header;
u32 reg = readl_relaxed(acpi_data.dist_base + GICD_PIDR2) & GIC_PIDR2_ARCH_MASK;
u32 size = reg == GIC_PIDR2_ARCH_GICv4 ? SZ_64K * 4 : SZ_64K * 2;
+ int cpu = get_cpu_for_acpi_id(gicc->uid);
void __iomem *redist_base;
if (!acpi_gicc_is_usable(gicc))
return 0;
+ /*
+ * Capable but disabled CPUs can be brought online later. What about
+ * the redistributor? ACPI doesn't want to say!
+ * Virtual hotplug systems can use the MADT's "always-on" GICR entries.
+ * Otherwise, prevent such CPUs from being brought online.
+ */
+ if (!(gicc->flags & ACPI_MADT_ENABLED)) {
+ pr_warn_once("CPU %u's redistributor is inaccessible: this CPU can't be brought online\n", cpu);
+ set_cpu_present(cpu, false);
+ set_cpu_possible(cpu, false);
+ return 0;
+ }
+
redist_base = ioremap(gicc->gicr_base_address, size);
if (!redist_base)
return -ENOMEM;
diff --git a/include/linux/acpi.h b/include/linux/acpi.h
index e3265a9eafae..92cb25349a18 100644
--- a/include/linux/acpi.h
+++ b/include/linux/acpi.h
@@ -258,7 +258,8 @@ void acpi_table_print_madt_entry (struct acpi_subtable_header *madt);
static inline bool acpi_gicc_is_usable(struct acpi_madt_generic_interrupt *gicc)
{
- return (gicc->flags & ACPI_MADT_ENABLED);
+ return ((gicc->flags & ACPI_MADT_ENABLED ||
+ gicc->flags & ACPI_MADT_GICC_CPU_CAPABLE));
}
/* the following numa functions are architecture-dependent */
--
2.39.2
register_cpu_capacity_sysctl() adds a property to sysfs that describes
the CPU's capacity. This is done from a subsys_initcall() that assumes
all possible CPUs are registered.
With CPU hotplug, possible CPUs aren't registered until they become
present (or, for arm64, enabled). This leads to messages during boot:
| register_cpu_capacity_sysctl: too early to get CPU1 device!
and once these CPUs are added to the system, the file is missing.
Move this to a cpuhp callback, so that the file is created once
CPUs are brought online. This covers CPUs that are added late by
mechanisms like hotplug.
One observable difference is the file is now missing for offline CPUs.
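
The lifetime of the file under the callback scheme can be modelled in
userspace as follows. This is a toy sketch with hypothetical names; it
assumes that registering the callbacks also runs the startup callback on
every CPU that is already online, and it tracks the file as a simple
boolean per CPU.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_NR_CPUS 4

static bool sketch_cpu_online[SKETCH_NR_CPUS];
static bool sketch_capacity_file[SKETCH_NR_CPUS];

/* Startup callback: create the per-CPU attribute. */
static int sketch_capacity_add(unsigned int cpu)
{
	sketch_capacity_file[cpu] = true;
	return 0;
}

/* Teardown callback: remove the per-CPU attribute. */
static int sketch_capacity_remove(unsigned int cpu)
{
	sketch_capacity_file[cpu] = false;
	return 0;
}

/* Registering the callbacks invokes startup on CPUs already online. */
static void sketch_setup_state(void)
{
	unsigned int cpu;

	for (cpu = 0; cpu < SKETCH_NR_CPUS; cpu++) {
		if (sketch_cpu_online[cpu])
			sketch_capacity_add(cpu);
	}
}

static void sketch_cpu_up(unsigned int cpu)
{
	assert(cpu < SKETCH_NR_CPUS);
	sketch_cpu_online[cpu] = true;
	sketch_capacity_add(cpu);
}

static void sketch_cpu_down(unsigned int cpu)
{
	assert(cpu < SKETCH_NR_CPUS);
	sketch_capacity_remove(cpu);
	sketch_cpu_online[cpu] = false;
}
```

In this model a CPU that has never been online, or that has been taken
offline again, has no file, which matches the observable difference
noted above.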
Signed-off-by: James Morse <[email protected]>
---
If the offline CPUs thing is a problem for the tools that consume
this value, we'd need to move cpu_capacity to be part of cpu.c's
common_cpu_attr_groups.
---
drivers/base/arch_topology.c | 38 ++++++++++++++++++++++++------------
1 file changed, 26 insertions(+), 12 deletions(-)
diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index b741b5ba82bd..9ccb7daee78e 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -220,20 +220,34 @@ static DECLARE_WORK(update_topology_flags_work, update_topology_flags_workfn);
static DEVICE_ATTR_RO(cpu_capacity);
+static int cpu_capacity_sysctl_add(unsigned int cpu)
+{
+ struct device *cpu_dev = get_cpu_device(cpu);
+
+ if (!cpu_dev)
+ return -ENOENT;
+
+ device_create_file(cpu_dev, &dev_attr_cpu_capacity);
+
+ return 0;
+}
+
+static int cpu_capacity_sysctl_remove(unsigned int cpu)
+{
+ struct device *cpu_dev = get_cpu_device(cpu);
+
+ if (!cpu_dev)
+ return -ENOENT;
+
+ device_remove_file(cpu_dev, &dev_attr_cpu_capacity);
+
+ return 0;
+}
+
static int register_cpu_capacity_sysctl(void)
{
- int i;
- struct device *cpu;
-
- for_each_possible_cpu(i) {
- cpu = get_cpu_device(i);
- if (!cpu) {
- pr_err("%s: too early to get CPU%d device!\n",
- __func__, i);
- continue;
- }
- device_create_file(cpu, &dev_attr_cpu_capacity);
- }
+ cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "topology/cpu-capacity",
+ cpu_capacity_sysctl_add, cpu_capacity_sysctl_remove);
return 0;
}
--
2.39.2
acpi_scan_device_not_present() is called when a device in the
hierarchy is not available for enumeration. Historically enumeration
was only based on whether the device was present.
To add support for only enumerating devices that are both present
and enabled, this helper should be renamed. As it was only ever about
enumeration, rename it to acpi_scan_device_not_enumerated().
No change in behaviour is intended.
Signed-off-by: James Morse <[email protected]>
---
drivers/acpi/scan.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/acpi/scan.c b/drivers/acpi/scan.c
index ed01e19514ef..17ab875a7d4e 100644
--- a/drivers/acpi/scan.c
+++ b/drivers/acpi/scan.c
@@ -289,10 +289,10 @@ static int acpi_scan_hot_remove(struct acpi_device *device)
return 0;
}
-static int acpi_scan_device_not_present(struct acpi_device *adev)
+static int acpi_scan_device_not_enumerated(struct acpi_device *adev)
{
if (!acpi_device_enumerated(adev)) {
- dev_warn(&adev->dev, "Still not present\n");
+ dev_warn(&adev->dev, "Still not enumerated\n");
return -EALREADY;
}
acpi_bus_trim(adev);
@@ -327,7 +327,7 @@ static int acpi_scan_device_check(struct acpi_device *adev)
error = -ENODEV;
}
} else {
- error = acpi_scan_device_not_present(adev);
+ error = acpi_scan_device_not_enumerated(adev);
}
return error;
}
@@ -339,7 +339,7 @@ static int acpi_scan_bus_check(struct acpi_device *adev, void *not_used)
acpi_bus_get_status(adev);
if (!acpi_device_is_present(adev)) {
- acpi_scan_device_not_present(adev);
+ acpi_scan_device_not_enumerated(adev);
return 0;
}
if (handler && handler->hotplug.scan_dependent)
--
2.39.2
A subsequent patch will change acpi_scan_hot_remove() to call
acpi_bus_trim_one() instead of acpi_bus_trim(), meaning it can no longer
rely on the prototype in the header file.
Move these functions further up the file.
No change in behaviour.
Signed-off-by: James Morse <[email protected]>
---
drivers/acpi/scan.c | 76 ++++++++++++++++++++++-----------------------
1 file changed, 38 insertions(+), 38 deletions(-)
diff --git a/drivers/acpi/scan.c b/drivers/acpi/scan.c
index f898591ce05f..a675333618ae 100644
--- a/drivers/acpi/scan.c
+++ b/drivers/acpi/scan.c
@@ -244,6 +244,44 @@ static int acpi_scan_try_to_offline(struct acpi_device *device)
return 0;
}
+static int acpi_bus_trim_one(struct acpi_device *adev, void *not_used)
+{
+ struct acpi_scan_handler *handler = adev->handler;
+
+ acpi_dev_for_each_child_reverse(adev, acpi_bus_trim_one, NULL);
+
+ adev->flags.match_driver = false;
+ if (handler) {
+ if (handler->detach)
+ handler->detach(adev);
+
+ adev->handler = NULL;
+ } else {
+ device_release_driver(&adev->dev);
+ }
+ /*
+ * Most likely, the device is going away, so put it into D3cold before
+ * that.
+ */
+ acpi_device_set_power(adev, ACPI_STATE_D3_COLD);
+ adev->flags.initialized = false;
+ acpi_device_clear_enumerated(adev);
+
+ return 0;
+}
+
+/**
+ * acpi_bus_trim - Detach scan handlers and drivers from ACPI device objects.
+ * @adev: Root of the ACPI namespace scope to walk.
+ *
+ * Must be called under acpi_scan_lock.
+ */
+void acpi_bus_trim(struct acpi_device *adev)
+{
+ acpi_bus_trim_one(adev, NULL);
+}
+EXPORT_SYMBOL_GPL(acpi_bus_trim);
+
static int acpi_scan_hot_remove(struct acpi_device *device)
{
acpi_handle handle = device->handle;
@@ -2506,44 +2544,6 @@ int acpi_bus_scan(acpi_handle handle)
}
EXPORT_SYMBOL(acpi_bus_scan);
-static int acpi_bus_trim_one(struct acpi_device *adev, void *not_used)
-{
- struct acpi_scan_handler *handler = adev->handler;
-
- acpi_dev_for_each_child_reverse(adev, acpi_bus_trim_one, NULL);
-
- adev->flags.match_driver = false;
- if (handler) {
- if (handler->detach)
- handler->detach(adev);
-
- adev->handler = NULL;
- } else {
- device_release_driver(&adev->dev);
- }
- /*
- * Most likely, the device is going away, so put it into D3cold before
- * that.
- */
- acpi_device_set_power(adev, ACPI_STATE_D3_COLD);
- adev->flags.initialized = false;
- acpi_device_clear_enumerated(adev);
-
- return 0;
-}
-
-/**
- * acpi_bus_trim - Detach scan handlers and drivers from ACPI device objects.
- * @adev: Root of the ACPI namespace scope to walk.
- *
- * Must be called under acpi_scan_lock.
- */
-void acpi_bus_trim(struct acpi_device *adev)
-{
- acpi_bus_trim_one(adev, NULL);
-}
-EXPORT_SYMBOL_GPL(acpi_bus_trim);
-
int acpi_bus_register_early_device(int type)
{
struct acpi_device *device = NULL;
--
2.39.2
Add a description of physical and virtual CPU hotplug, explain the
differences and elaborate on what is required in ACPI for a working
virtual hotplug system.
Signed-off-by: James Morse <[email protected]>
---
Documentation/arch/arm64/cpu-hotplug.rst | 79 ++++++++++++++++++++++++
Documentation/arch/arm64/index.rst | 1 +
2 files changed, 80 insertions(+)
create mode 100644 Documentation/arch/arm64/cpu-hotplug.rst
diff --git a/Documentation/arch/arm64/cpu-hotplug.rst b/Documentation/arch/arm64/cpu-hotplug.rst
new file mode 100644
index 000000000000..76ba8d932c72
--- /dev/null
+++ b/Documentation/arch/arm64/cpu-hotplug.rst
@@ -0,0 +1,79 @@
+.. SPDX-License-Identifier: GPL-2.0
+.. _cpuhp_index:
+
+====================
+CPU Hotplug and ACPI
+====================
+
+CPU hotplug in the arm64 world is commonly used to describe the kernel taking
+CPUs online/offline using PSCI. This document is about ACPI firmware allowing
+CPUs that were not available during boot to be added to the system later.
+
+``possible`` and ``present`` refer to the state of the CPU as seen by linux.
+
+
+CPU Hotplug on physical systems - CPUs not present at boot
+----------------------------------------------------------
+
+Physical systems need to mark a CPU that is ``possible`` but not ``present`` as
+being ``present``. An example would be a dual socket machine, where the package
+in one of the sockets can be replaced while the system is running.
+
+This is not supported.
+
+In the arm64 world CPUs are not a single device but a slice of the system.
+There are no systems that support the physical addition (or removal) of CPUs
+while the system is running, and ACPI is not able to sufficiently describe
+them.
+
+e.g. New CPUs come with new caches, but the platform's cache topology is
+described in a static table, the PPTT. How caches are shared between CPUs is
+not discoverable, and must be described by firmware.
+
+e.g. The GIC redistributor for each CPU must be accessed by the driver during
+boot to discover the system wide supported features. ACPI's MADT GICC
+structures can describe a redistributor associated with a disabled CPU, but
+can't describe whether the redistributor is accessible, only that it is not
+'always on'.
+
+arm64's ACPI tables assume that everything described is ``present``.
+
+
+CPU Hotplug on virtual systems - CPUs not enabled at boot
+---------------------------------------------------------
+
+Virtual systems have the advantage that all the properties the system will
+ever have can be described at boot. There are no power-domain considerations
+as such devices are emulated.
+
+CPU Hotplug on virtual systems is supported. It is distinct from physical
+CPU Hotplug as all resources are described as ``present``, but CPUs may be
+marked as disabled by firmware. Only the CPU's online/offline behaviour is
+influenced by firmware. An example is where a virtual machine boots with a
+single CPU, and additional CPUs are added once a cloud orchestrator deploys
+the workload.
+
+For a virtual machine, the VMM (e.g. Qemu) plays the part of firmware.
+
+Virtual hotplug is implemented as a firmware policy affecting which CPUs can be
+brought online. Firmware can enforce its policy via PSCI's return codes. e.g.
+``DENIED``.
+
+The ACPI tables must describe all the resources of the virtual machine. CPUs
+that firmware wishes to disable either from boot (or later) should not be
+``enabled`` in the MADT GICC structures, but should have the ``online capable``
+bit set, to indicate they can be enabled later. The boot CPU must be marked as
+``enabled``. The 'always on' GICR structure must be used to describe the
+redistributors.
+
+CPUs described as ``online capable`` but not ``enabled`` can be set to enabled
+by the DSDT's Processor object's _STA method. On virtual systems the _STA method
+must always report the CPU as ``present``. Changes to the firmware policy can
+be notified to the OS via device-check or eject-request.
+
+CPUs described as ``enabled`` in the static table, should not have their _STA
+modified dynamically by firmware. Soft-restart features such as kexec will
+re-read the static properties of the system from these static tables, and
+may malfunction if these no longer describe the running system. Linux will
+re-discover the dynamic properties of the system from the _STA method later
+during boot.
diff --git a/Documentation/arch/arm64/index.rst b/Documentation/arch/arm64/index.rst
index d08e924204bf..78544de0a8a9 100644
--- a/Documentation/arch/arm64/index.rst
+++ b/Documentation/arch/arm64/index.rst
@@ -13,6 +13,7 @@ ARM64 Architecture
asymmetric-32bit
booting
cpu-feature-registers
+ cpu-hotplug
elf_hwcaps
hugetlbpage
kdump
--
2.39.2
From: Jean-Philippe Brucker <[email protected]>
When a CPU is marked as disabled, but online capable in the MADT, PSCI
applies some firmware policy to control when it can be brought online.
PSCI returns DENIED to a CPU_ON request if this is not currently
permitted. The OS can learn the current policy from the _STA enabled bit.
Handle the PSCI DENIED return code gracefully instead of printing an
error.
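
The error handling can be illustrated with a standalone model. It is a
sketch under stated assumptions: the PSCI return values follow the PSCI
specification's numbering, EPROBE_DEFER is a kernel-internal errno that
userspace headers don't provide (517 is assumed here), the mapping of
the other PSCI errors is simplified, and all SKETCH_ and sketch_ names
are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* PSCI return values, as numbered by the PSCI specification. */
#define SKETCH_PSCI_RET_SUCCESS		0
#define SKETCH_PSCI_RET_NOT_SUPPORTED	(-1)
#define SKETCH_PSCI_RET_INVALID_PARAMS	(-2)
#define SKETCH_PSCI_RET_DENIED		(-3)

/* Assumption: kernel-internal errno value, absent from <errno.h>. */
#define SKETCH_EPROBE_DEFER		517

/* Translate a CPU_ON result, treating DENIED as 'try again later'. */
static int sketch_psci_cpu_on_result(int psci_ret)
{
	switch (psci_ret) {
	case SKETCH_PSCI_RET_SUCCESS:
		return 0;
	case SKETCH_PSCI_RET_DENIED:
		return -SKETCH_EPROBE_DEFER;
	case SKETCH_PSCI_RET_NOT_SUPPORTED:
		return -EOPNOTSUPP;
	default:
		/* Simplification: everything else is reported as -EINVAL. */
		return -EINVAL;
	}
}

/* Callers stay quiet for the policy case and log real failures. */
static bool sketch_should_log_boot_failure(int err)
{
	return err && err != -SKETCH_EPROBE_DEFER;
}
```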
Signed-off-by: Jean-Philippe Brucker <[email protected]>
[ morse: Rewrote commit message ]
Signed-off-by: James Morse <[email protected]>
---
arch/arm64/kernel/psci.c | 2 +-
arch/arm64/kernel/smp.c | 3 ++-
drivers/firmware/psci/psci.c | 2 ++
3 files changed, 5 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/kernel/psci.c b/arch/arm64/kernel/psci.c
index 29a8e444db83..4fcc0cdd757b 100644
--- a/arch/arm64/kernel/psci.c
+++ b/arch/arm64/kernel/psci.c
@@ -40,7 +40,7 @@ static int cpu_psci_cpu_boot(unsigned int cpu)
{
phys_addr_t pa_secondary_entry = __pa_symbol(secondary_entry);
int err = psci_ops.cpu_on(cpu_logical_map(cpu), pa_secondary_entry);
- if (err)
+ if (err && err != -EPROBE_DEFER)
pr_err("failed to boot CPU%d (%d)\n", cpu, err);
return err;
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index 8c8f55721786..e958db987665 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -124,7 +124,8 @@ int __cpu_up(unsigned int cpu, struct task_struct *idle)
/* Now bring the CPU into our world */
ret = boot_secondary(cpu, idle);
if (ret) {
- pr_err("CPU%u: failed to boot: %d\n", cpu, ret);
+ if (ret != -EPROBE_DEFER)
+ pr_err("CPU%u: failed to boot: %d\n", cpu, ret);
return ret;
}
diff --git a/drivers/firmware/psci/psci.c b/drivers/firmware/psci/psci.c
index d9629ff87861..f7ab3fed3528 100644
--- a/drivers/firmware/psci/psci.c
+++ b/drivers/firmware/psci/psci.c
@@ -218,6 +218,8 @@ static int __psci_cpu_on(u32 fn, unsigned long cpuid, unsigned long entry_point)
int err;
err = invoke_psci_fn(fn, cpuid, entry_point, 0);
+ if (err == PSCI_RET_DENIED)
+ return -EPROBE_DEFER;
return psci_to_linux_errno(err);
}
--
2.39.2
acpi_processor_get_info() registers all present CPUs. Registering a
CPU is what creates the sysfs entries and triggers the udev
notifications.
arm64 virtual machines that support 'virtual cpu hotplug' use the
enabled bit to indicate whether the CPU can be brought online, as
the existing ACPI tables require all hardware to be described and
present.
If firmware describes a CPU as present, but disabled, skip the
registration. Such CPUs are present, but can't be brought online for
whatever reason (e.g. firmware/hypervisor policy).
Once firmware sets the enabled bit, the CPU can be registered and
brought online by user-space. Online CPUs, and CPUs that are missing
an _STA method, must always be registered.
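
The registration policy described above can be summarised as a small
decision function. This is a user-space sketch with hypothetical
sketch_* names; it leaves out the case where evaluating _STA itself
fails, which the patch treats as -ENODEV.

```c
#include <assert.h>
#include <stdbool.h>

/* _STA bit values as defined by the ACPI specification. */
#define SKETCH_STA_PRESENT	0x01u
#define SKETCH_STA_ENABLED	0x02u

enum sketch_action {
	SKETCH_REGISTER,	/* register the CPU normally */
	SKETCH_REGISTER_FW_BUG,	/* register, but warn and taint */
	SKETCH_SKIP,		/* leave unregistered for now */
};

static enum sketch_action sketch_decide(bool has_sta, unsigned int sta,
					bool online)
{
	bool usable;

	/* No _STA method: the CPU must always be registered. */
	if (!has_sta)
		return SKETCH_REGISTER;

	usable = (sta & SKETCH_STA_PRESENT) && (sta & SKETCH_STA_ENABLED);
	if (usable)
		return SKETCH_REGISTER;

	/* An online CPU can't be ignored, even if firmware disagrees. */
	return online ? SKETCH_REGISTER_FW_BUG : SKETCH_SKIP;
}
```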
Signed-off-by: James Morse <[email protected]>
---
drivers/acpi/acpi_processor.c | 31 ++++++++++++++++++++++++++++++-
1 file changed, 30 insertions(+), 1 deletion(-)
diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
index b67616079751..b49859eab01a 100644
--- a/drivers/acpi/acpi_processor.c
+++ b/drivers/acpi/acpi_processor.c
@@ -227,6 +227,32 @@ static int acpi_processor_make_present(struct acpi_processor *pr)
return ret;
}
+static int acpi_processor_make_enabled(struct acpi_processor *pr)
+{
+ unsigned long long sta;
+ acpi_status status;
+ bool present, enabled;
+
+ if (!acpi_has_method(pr->handle, "_STA"))
+ return arch_register_cpu(pr->id);
+
+ status = acpi_evaluate_integer(pr->handle, "_STA", NULL, &sta);
+ if (ACPI_FAILURE(status))
+ return -ENODEV;
+
+ present = sta & ACPI_STA_DEVICE_PRESENT;
+ enabled = sta & ACPI_STA_DEVICE_ENABLED;
+
+ if (cpu_online(pr->id) && (!present || !enabled)) {
+ pr_err_once(FW_BUG "CPU %u is online, but described as not present or disabled!\n", pr->id);
+ add_taint(TAINT_FIRMWARE_WORKAROUND, LOCKDEP_STILL_OK);
+ } else if (!present || !enabled) {
+ return -ENODEV;
+ }
+
+ return arch_register_cpu(pr->id);
+}
+
static int acpi_processor_get_info(struct acpi_device *device)
{
union acpi_object object = { 0 };
@@ -318,7 +344,7 @@ static int acpi_processor_get_info(struct acpi_device *device)
*/
if (!invalid_logical_cpuid(pr->id) && cpu_present(pr->id) &&
!get_cpu_device(pr->id)) {
- int ret = arch_register_cpu(pr->id);
+ int ret = acpi_processor_make_enabled(pr);
if (ret)
return ret;
@@ -526,6 +552,9 @@ static void acpi_processor_post_eject(struct acpi_device *device)
acpi_processor_make_not_present(device);
return;
}
+
+ if (cpu_present(pr->id) && !(sta & ACPI_STA_DEVICE_ENABLED))
+ arch_unregister_cpu(pr->id);
}
#ifdef CONFIG_ARCH_MIGHT_HAVE_ACPI_PDC
--
2.39.2
ACPI, irqchip and the architecture code all inspect the enabled bit
of a GICC entry in the MADT.
The addition of an 'online capable' bit means all these sites need
updating.
Move the current checks behind a helper to make future updates easier.
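
As an illustration of why a single helper is convenient, the sketch
below shows one way the check might later grow to cover the 'online
capable' bit. It is not part of this patch: the struct, the sketch_*
names and the bit position used for 'online capable' (bit 3 of the GICC
flags, as I read ACPI 6.5) are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_GICC_ENABLED		(1u << 0)
#define SKETCH_GICC_ONLINE_CAPABLE	(1u << 3)	/* assumed position */

/* Hypothetical, cut-down stand-in for struct acpi_madt_generic_interrupt. */
struct sketch_gicc {
	uint32_t flags;
};

/* A GICC entry is usable if it is enabled now or may be enabled later. */
static bool sketch_gicc_is_usable(const struct sketch_gicc *gicc)
{
	return gicc->flags &
	       (SKETCH_GICC_ENABLED | SKETCH_GICC_ONLINE_CAPABLE);
}
```

Every call site converted by this patch would pick up such a change
without being touched again.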
Signed-off-by: James Morse <[email protected]>
---
arch/arm64/kernel/smp.c | 2 +-
drivers/acpi/processor_core.c | 2 +-
drivers/irqchip/irq-gic-v3.c | 10 ++++------
include/linux/acpi.h | 5 +++++
4 files changed, 11 insertions(+), 8 deletions(-)
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index 960b98b43506..8c8f55721786 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -520,7 +520,7 @@ acpi_map_gic_cpu_interface(struct acpi_madt_generic_interrupt *processor)
{
u64 hwid = processor->arm_mpidr;
- if (!(processor->flags & ACPI_MADT_ENABLED)) {
+ if (!acpi_gicc_is_usable(processor)) {
pr_debug("skipping disabled CPU entry with 0x%llx MPIDR\n", hwid);
return;
}
diff --git a/drivers/acpi/processor_core.c b/drivers/acpi/processor_core.c
index 7dd6dbaa98c3..b203cfe28550 100644
--- a/drivers/acpi/processor_core.c
+++ b/drivers/acpi/processor_core.c
@@ -90,7 +90,7 @@ static int map_gicc_mpidr(struct acpi_subtable_header *entry,
struct acpi_madt_generic_interrupt *gicc =
container_of(entry, struct acpi_madt_generic_interrupt, header);
- if (!(gicc->flags & ACPI_MADT_ENABLED))
+ if (!acpi_gicc_is_usable(gicc))
return -ENODEV;
/* device_declaration means Device object in DSDT, in the
diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
index eedfa8e9f077..72d3cdebdad1 100644
--- a/drivers/irqchip/irq-gic-v3.c
+++ b/drivers/irqchip/irq-gic-v3.c
@@ -2367,8 +2367,7 @@ gic_acpi_parse_madt_gicc(union acpi_subtable_headers *header,
u32 size = reg == GIC_PIDR2_ARCH_GICv4 ? SZ_64K * 4 : SZ_64K * 2;
void __iomem *redist_base;
- /* GICC entry which has !ACPI_MADT_ENABLED is not unusable so skip */
- if (!(gicc->flags & ACPI_MADT_ENABLED))
+ if (!acpi_gicc_is_usable(gicc))
return 0;
redist_base = ioremap(gicc->gicr_base_address, size);
@@ -2418,7 +2417,7 @@ static int __init gic_acpi_match_gicc(union acpi_subtable_headers *header,
* If GICC is enabled and has valid gicr base address, then it means
* GICR base is presented via GICC
*/
- if ((gicc->flags & ACPI_MADT_ENABLED) && gicc->gicr_base_address) {
+ if (acpi_gicc_is_usable(gicc) && gicc->gicr_base_address) {
acpi_data.enabled_rdists++;
return 0;
}
@@ -2427,7 +2426,7 @@ static int __init gic_acpi_match_gicc(union acpi_subtable_headers *header,
* It's perfectly valid firmware can pass disabled GICC entry, driver
* should not treat as errors, skip the entry instead of probe fail.
*/
- if (!(gicc->flags & ACPI_MADT_ENABLED))
+ if (!acpi_gicc_is_usable(gicc))
return 0;
return -ENODEV;
@@ -2486,8 +2485,7 @@ static int __init gic_acpi_parse_virt_madt_gicc(union acpi_subtable_headers *hea
int maint_irq_mode;
static int first_madt = true;
- /* Skip unusable CPUs */
- if (!(gicc->flags & ACPI_MADT_ENABLED))
+ if (!acpi_gicc_is_usable(gicc))
return 0;
maint_irq_mode = (gicc->flags & ACPI_MADT_VGIC_IRQ_MODE) ?
diff --git a/include/linux/acpi.h b/include/linux/acpi.h
index b7ab85857bb7..e3265a9eafae 100644
--- a/include/linux/acpi.h
+++ b/include/linux/acpi.h
@@ -256,6 +256,11 @@ acpi_table_parse_cedt(enum acpi_cedt_type id,
int acpi_parse_mcfg (struct acpi_table_header *header);
void acpi_table_print_madt_entry (struct acpi_subtable_header *madt);
+static inline bool acpi_gicc_is_usable(struct acpi_madt_generic_interrupt *gicc)
+{
+ return (gicc->flags & ACPI_MADT_ENABLED);
+}
+
/* the following numa functions are architecture-dependent */
void acpi_numa_slit_init (struct acpi_table_slit *slit);
--
2.39.2
Add arch_unregister_cpu() to allow the ACPI machinery to call
unregister_cpu(). This is enough for arm64, riscv and loongarch, but
needs to be overridden by x86 and ia64, which need to do more work.
CC: Jean-Philippe Brucker <[email protected]>
Signed-off-by: James Morse <[email protected]>
---
Changes since v1:
* Added CONFIG_HOTPLUG_CPU ifdeffery around unregister_cpu
---
arch/ia64/include/asm/cpu.h | 4 ----
arch/loongarch/include/asm/cpu.h | 6 ------
arch/x86/include/asm/cpu.h | 1 -
drivers/base/cpu.c | 9 ++++++++-
4 files changed, 8 insertions(+), 12 deletions(-)
diff --git a/arch/ia64/include/asm/cpu.h b/arch/ia64/include/asm/cpu.h
index a3e690e685e5..642d71675ddb 100644
--- a/arch/ia64/include/asm/cpu.h
+++ b/arch/ia64/include/asm/cpu.h
@@ -15,8 +15,4 @@ DECLARE_PER_CPU(struct ia64_cpu, cpu_devices);
DECLARE_PER_CPU(int, cpu_state);
-#ifdef CONFIG_HOTPLUG_CPU
-extern void arch_unregister_cpu(int);
-#endif
-
#endif /* _ASM_IA64_CPU_H_ */
diff --git a/arch/loongarch/include/asm/cpu.h b/arch/loongarch/include/asm/cpu.h
index b8568e637420..48b9f7168bcc 100644
--- a/arch/loongarch/include/asm/cpu.h
+++ b/arch/loongarch/include/asm/cpu.h
@@ -128,10 +128,4 @@ enum cpu_type_enum {
#define LOONGARCH_CPU_HYPERVISOR BIT_ULL(CPU_FEATURE_HYPERVISOR)
#define LOONGARCH_CPU_PTW BIT_ULL(CPU_FEATURE_PTW)
-#if !defined(__ASSEMBLY__)
-#ifdef CONFIG_HOTPLUG_CPU
-void arch_unregister_cpu(int cpu);
-#endif
-#endif /* ! __ASSEMBLY__ */
-
#endif /* _ASM_CPU_H */
diff --git a/arch/x86/include/asm/cpu.h b/arch/x86/include/asm/cpu.h
index f349c94510e8..91867a6a9f8e 100644
--- a/arch/x86/include/asm/cpu.h
+++ b/arch/x86/include/asm/cpu.h
@@ -24,7 +24,6 @@ static inline void prefill_possible_map(void) {}
#endif /* CONFIG_SMP */
#ifdef CONFIG_HOTPLUG_CPU
-extern void arch_unregister_cpu(int);
extern void soft_restart_cpu(void);
#endif
diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c
index 677f963e02ce..c709747c4a18 100644
--- a/drivers/base/cpu.c
+++ b/drivers/base/cpu.c
@@ -531,7 +531,14 @@ int __weak arch_register_cpu(int cpu)
{
return register_cpu(&per_cpu(cpu_devices, cpu), cpu);
}
-#endif
+
+#ifdef CONFIG_HOTPLUG_CPU
+void __weak arch_unregister_cpu(int num)
+{
+ unregister_cpu(&per_cpu(cpu_devices, num));
+}
+#endif /* CONFIG_HOTPLUG_CPU */
+#endif /* CONFIG_GENERIC_CPU_DEVICES */
static void __init cpu_dev_register_generic(void)
{
--
2.39.2
acpi_device_is_present() checks the present or functional bits
from the cached copy of _STA.
A few places open-code this check. Use the helper instead to
improve readability.
Signed-off-by: James Morse <[email protected]>
---
drivers/acpi/scan.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/acpi/scan.c b/drivers/acpi/scan.c
index 691d4b7686ee..ed01e19514ef 100644
--- a/drivers/acpi/scan.c
+++ b/drivers/acpi/scan.c
@@ -304,7 +304,7 @@ static int acpi_scan_device_check(struct acpi_device *adev)
int error;
acpi_bus_get_status(adev);
- if (adev->status.present || adev->status.functional) {
+ if (acpi_device_is_present(adev)) {
/*
* This function is only called for device objects for which
* matching scan handlers exist. The only situation in which
@@ -338,7 +338,7 @@ static int acpi_scan_bus_check(struct acpi_device *adev, void *not_used)
int error;
acpi_bus_get_status(adev);
- if (!(adev->status.present || adev->status.functional)) {
+ if (!acpi_device_is_present(adev)) {
acpi_scan_device_not_present(adev);
return 0;
}
--
2.39.2
The code behind ACPI_HOTPLUG_CPU allows a not-present CPU to become
present. This isn't the only use of HOTPLUG_CPU. On arm64 and riscv,
CPUs can be taken offline as a power-saving measure.
On arm64 an offline CPU may be disabled by firmware, preventing it from
being brought back online, but it remains present throughout.
Adding code to prevent user-space trying to online these disabled CPUs
needs some additional terminology.
Rename the Kconfig symbol ACPI_HOTPLUG_CPU to ACPI_HOTPLUG_PRESENT_CPU
to reflect that it makes possible CPUs present.
HOTPLUG_CPU is untouched as this is only about the ACPI mechanism.
Signed-off-by: James Morse <[email protected]>
---
arch/ia64/Kconfig | 2 +-
arch/ia64/include/asm/acpi.h | 2 +-
arch/ia64/kernel/acpi.c | 6 +++---
arch/ia64/kernel/setup.c | 2 +-
arch/loongarch/configs/loongson3_defconfig | 2 +-
arch/loongarch/kernel/acpi.c | 4 ++--
arch/x86/Kconfig | 2 +-
arch/x86/kernel/acpi/boot.c | 4 ++--
drivers/acpi/Kconfig | 4 ++--
drivers/acpi/acpi_processor.c | 10 +++++-----
include/acpi/processor.h | 2 +-
include/linux/acpi.h | 6 +++---
12 files changed, 23 insertions(+), 23 deletions(-)
diff --git a/arch/ia64/Kconfig b/arch/ia64/Kconfig
index a3bfd42467ab..54972f9fe804 100644
--- a/arch/ia64/Kconfig
+++ b/arch/ia64/Kconfig
@@ -16,7 +16,7 @@ config IA64
select ARCH_MIGHT_HAVE_PC_PARPORT
select ARCH_MIGHT_HAVE_PC_SERIO
select ACPI
- select ACPI_HOTPLUG_CPU if ACPI_PROCESSOR && HOTPLUG_CPU
+ select ACPI_HOTPLUG_PRESENT_CPU if ACPI_PROCESSOR && HOTPLUG_CPU
select ACPI_NUMA if NUMA
select ARCH_ENABLE_MEMORY_HOTPLUG
select ARCH_ENABLE_MEMORY_HOTREMOVE
diff --git a/arch/ia64/include/asm/acpi.h b/arch/ia64/include/asm/acpi.h
index 58500a964238..482ea994d1e1 100644
--- a/arch/ia64/include/asm/acpi.h
+++ b/arch/ia64/include/asm/acpi.h
@@ -52,7 +52,7 @@ extern unsigned int is_cpu_cpei_target(unsigned int cpu);
extern void set_cpei_target_cpu(unsigned int cpu);
extern unsigned int get_cpei_target_cpu(void);
extern void prefill_possible_map(void);
-#ifdef CONFIG_ACPI_HOTPLUG_CPU
+#ifdef CONFIG_ACPI_HOTPLUG_PRESENT_CPU
extern int additional_cpus;
#else
#define additional_cpus 0
diff --git a/arch/ia64/kernel/acpi.c b/arch/ia64/kernel/acpi.c
index 15f6cfddcc08..35881bf4b016 100644
--- a/arch/ia64/kernel/acpi.c
+++ b/arch/ia64/kernel/acpi.c
@@ -194,7 +194,7 @@ acpi_parse_plat_int_src(union acpi_subtable_headers * header,
return 0;
}
-#ifdef CONFIG_HOTPLUG_CPU
+#ifdef CONFIG_ACPI_HOTPLUG_PRESENT_CPU
unsigned int can_cpei_retarget(void)
{
extern int cpe_vector;
@@ -711,7 +711,7 @@ int acpi_isa_irq_to_gsi(unsigned isa_irq, u32 *gsi)
/*
* ACPI based hotplug CPU support
*/
-#ifdef CONFIG_ACPI_HOTPLUG_CPU
+#ifdef CONFIG_ACPI_HOTPLUG_PRESENT_CPU
int acpi_map_cpu2node(acpi_handle handle, int cpu, int physid)
{
#ifdef CONFIG_ACPI_NUMA
@@ -820,7 +820,7 @@ int acpi_unmap_cpu(int cpu)
return (0);
}
EXPORT_SYMBOL(acpi_unmap_cpu);
-#endif /* CONFIG_ACPI_HOTPLUG_CPU */
+#endif /* CONFIG_ACPI_HOTPLUG_PRESENT_CPU */
#ifdef CONFIG_ACPI_NUMA
static acpi_status acpi_map_iosapic(acpi_handle handle, u32 depth,
diff --git a/arch/ia64/kernel/setup.c b/arch/ia64/kernel/setup.c
index 5a55ac82c13a..44591716d07b 100644
--- a/arch/ia64/kernel/setup.c
+++ b/arch/ia64/kernel/setup.c
@@ -569,7 +569,7 @@ setup_arch (char **cmdline_p)
#ifdef CONFIG_ACPI_NUMA
acpi_numa_init();
acpi_numa_fixup();
-#ifdef CONFIG_ACPI_HOTPLUG_CPU
+#ifdef CONFIG_ACPI_HOTPLUG_PRESENT_CPU
prefill_possible_map();
#endif
per_cpu_scan_finalize((cpumask_empty(&early_cpu_possible_map) ?
diff --git a/arch/loongarch/configs/loongson3_defconfig b/arch/loongarch/configs/loongson3_defconfig
index a3b52aaa83b3..ef3bc76313e4 100644
--- a/arch/loongarch/configs/loongson3_defconfig
+++ b/arch/loongarch/configs/loongson3_defconfig
@@ -59,7 +59,7 @@ CONFIG_ACPI_SPCR_TABLE=y
CONFIG_ACPI_TAD=y
CONFIG_ACPI_DOCK=y
CONFIG_ACPI_IPMI=m
-CONFIG_ACPI_HOTPLUG_CPU=y
+CONFIG_ACPI_HOTPLUG_PRESENT_CPU=y
CONFIG_ACPI_PCI_SLOT=y
CONFIG_ACPI_HOTPLUG_MEMORY=y
CONFIG_EFI_ZBOOT=y
diff --git a/arch/loongarch/kernel/acpi.c b/arch/loongarch/kernel/acpi.c
index 9450e09073eb..b5153e395ad9 100644
--- a/arch/loongarch/kernel/acpi.c
+++ b/arch/loongarch/kernel/acpi.c
@@ -289,7 +289,7 @@ void __init arch_reserve_mem_area(acpi_physical_address addr, size_t size)
memblock_reserve(addr, size);
}
-#ifdef CONFIG_ACPI_HOTPLUG_CPU
+#ifdef CONFIG_ACPI_HOTPLUG_PRESENT_CPU
#include <acpi/processor.h>
@@ -341,4 +341,4 @@ int acpi_unmap_cpu(int cpu)
}
EXPORT_SYMBOL(acpi_unmap_cpu);
-#endif /* CONFIG_ACPI_HOTPLUG_CPU */
+#endif /* CONFIG_ACPI_HOTPLUG_PRESENT_CPU */
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 133ea5f561b5..295a7a3debb6 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -60,7 +60,7 @@ config X86
#
select ACPI_LEGACY_TABLES_LOOKUP if ACPI
select ACPI_SYSTEM_POWER_STATES_SUPPORT if ACPI
- select ACPI_HOTPLUG_CPU if ACPI_PROCESSOR && HOTPLUG_CPU
+ select ACPI_HOTPLUG_PRESENT_CPU if ACPI_PROCESSOR && HOTPLUG_CPU
select ARCH_32BIT_OFF_T if X86_32
select ARCH_CLOCKSOURCE_INIT
select ARCH_CORRECT_STACKTRACE_ON_KRETPROBE
diff --git a/arch/x86/kernel/acpi/boot.c b/arch/x86/kernel/acpi/boot.c
index 2a0ea38955df..84dd4133754b 100644
--- a/arch/x86/kernel/acpi/boot.c
+++ b/arch/x86/kernel/acpi/boot.c
@@ -814,7 +814,7 @@ static void __init acpi_set_irq_model_ioapic(void)
/*
* ACPI based hotplug support for CPU
*/
-#ifdef CONFIG_ACPI_HOTPLUG_CPU
+#ifdef CONFIG_ACPI_HOTPLUG_PRESENT_CPU
#include <acpi/processor.h>
static int acpi_map_cpu2node(acpi_handle handle, int cpu, int physid)
@@ -863,7 +863,7 @@ int acpi_unmap_cpu(int cpu)
return (0);
}
EXPORT_SYMBOL(acpi_unmap_cpu);
-#endif /* CONFIG_ACPI_HOTPLUG_CPU */
+#endif /* CONFIG_ACPI_HOTPLUG_PRESENT_CPU */
int acpi_register_ioapic(acpi_handle handle, u64 phys_addr, u32 gsi_base)
{
diff --git a/drivers/acpi/Kconfig b/drivers/acpi/Kconfig
index 8456d48ba702..417f9f3077d2 100644
--- a/drivers/acpi/Kconfig
+++ b/drivers/acpi/Kconfig
@@ -305,7 +305,7 @@ config ACPI_IPMI
To compile this driver as a module, choose M here:
the module will be called as acpi_ipmi.
-config ACPI_HOTPLUG_CPU
+config ACPI_HOTPLUG_PRESENT_CPU
bool
depends on ACPI_PROCESSOR && HOTPLUG_CPU
select ACPI_CONTAINER
@@ -399,7 +399,7 @@ config ACPI_PCI_SLOT
config ACPI_CONTAINER
bool "Container and Module Devices"
- default (ACPI_HOTPLUG_MEMORY || ACPI_HOTPLUG_CPU)
+ default (ACPI_HOTPLUG_MEMORY || ACPI_HOTPLUG_PRESENT_CPU)
help
This driver supports ACPI Container and Module devices (IDs
ACPI0004, PNP0A05, and PNP0A06).
diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
index 867782bc50b0..75257fae10e7 100644
--- a/drivers/acpi/acpi_processor.c
+++ b/drivers/acpi/acpi_processor.c
@@ -182,7 +182,7 @@ static void __init acpi_pcc_cpufreq_init(void) {}
#endif /* CONFIG_X86 */
/* Initialization */
-#ifdef CONFIG_ACPI_HOTPLUG_CPU
+#ifdef CONFIG_ACPI_HOTPLUG_PRESENT_CPU
static int acpi_processor_hotadd_init(struct acpi_processor *pr)
{
unsigned long long sta;
@@ -227,7 +227,7 @@ static inline int acpi_processor_hotadd_init(struct acpi_processor *pr)
{
return -ENODEV;
}
-#endif /* CONFIG_ACPI_HOTPLUG_CPU */
+#endif /* CONFIG_ACPI_HOTPLUG_PRESENT_CPU */
static int acpi_processor_get_info(struct acpi_device *device)
{
@@ -461,7 +461,7 @@ static int acpi_processor_add(struct acpi_device *device,
return result;
}
-#ifdef CONFIG_ACPI_HOTPLUG_CPU
+#ifdef CONFIG_ACPI_HOTPLUG_PRESENT_CPU
/* Removal */
static void acpi_processor_remove(struct acpi_device *device)
{
@@ -505,7 +505,7 @@ static void acpi_processor_remove(struct acpi_device *device)
free_cpumask_var(pr->throttling.shared_cpu_map);
kfree(pr);
}
-#endif /* CONFIG_ACPI_HOTPLUG_CPU */
+#endif /* CONFIG_ACPI_HOTPLUG_PRESENT_CPU */
#ifdef CONFIG_ARCH_MIGHT_HAVE_ACPI_PDC
bool __init processor_physically_present(acpi_handle handle)
@@ -630,7 +630,7 @@ static const struct acpi_device_id processor_device_ids[] = {
static struct acpi_scan_handler processor_handler = {
.ids = processor_device_ids,
.attach = acpi_processor_add,
-#ifdef CONFIG_ACPI_HOTPLUG_CPU
+#ifdef CONFIG_ACPI_HOTPLUG_PRESENT_CPU
.detach = acpi_processor_remove,
#endif
.hotplug = {
diff --git a/include/acpi/processor.h b/include/acpi/processor.h
index 94181fe9780a..fd6913370c72 100644
--- a/include/acpi/processor.h
+++ b/include/acpi/processor.h
@@ -465,7 +465,7 @@ extern int acpi_processor_ffh_lpi_probe(unsigned int cpu);
extern int acpi_processor_ffh_lpi_enter(struct acpi_lpi_state *lpi);
#endif
-#ifdef CONFIG_ACPI_HOTPLUG_CPU
+#ifdef CONFIG_ACPI_HOTPLUG_PRESENT_CPU
extern int arch_register_cpu(int cpu);
extern void arch_unregister_cpu(int cpu);
#endif
diff --git a/include/linux/acpi.h b/include/linux/acpi.h
index a73246c3c35e..651dd43976a9 100644
--- a/include/linux/acpi.h
+++ b/include/linux/acpi.h
@@ -316,12 +316,12 @@ static inline int acpi_processor_evaluate_cst(acpi_handle handle, u32 cpu,
}
#endif
-#ifdef CONFIG_ACPI_HOTPLUG_CPU
+#ifdef CONFIG_ACPI_HOTPLUG_PRESENT_CPU
/* Arch dependent functions for cpu hotplug support */
int acpi_map_cpu(acpi_handle handle, phys_cpuid_t physid, u32 acpi_id,
int *pcpu);
int acpi_unmap_cpu(int cpu);
-#endif /* CONFIG_ACPI_HOTPLUG_CPU */
+#endif /* CONFIG_ACPI_HOTPLUG_PRESENT_CPU */
#ifdef CONFIG_ACPI_HOTPLUG_IOAPIC
int acpi_get_ioapic_id(acpi_handle handle, u32 gsi_base, u64 *phys_addr);
@@ -644,7 +644,7 @@ static inline u32 acpi_osc_ctx_get_cxl_control(struct acpi_osc_context *context)
#define ACPI_GSB_ACCESS_ATTRIB_RAW_PROCESS 0x0000000F
/* Enable _OST when all relevant hotplug operations are enabled */
-#if defined(CONFIG_ACPI_HOTPLUG_CPU) && \
+#if defined(CONFIG_ACPI_HOTPLUG_PRESENT_CPU) && \
defined(CONFIG_ACPI_HOTPLUG_MEMORY) && \
defined(CONFIG_ACPI_CONTAINER)
#define ACPI_HOTPLUG_OST
--
2.39.2
struct acpi_scan_handler has a detach callback that is used to remove
a driver when a bus is changed. When interacting with an eject-request,
the detach callback is called before _EJ0.
This means the ACPI processor driver can't use _STA to determine whether
a CPU has been made not-present, or whether some of the other _STA bits
have changed. acpi_processor_remove() needs to know the value of _STA
after _EJ0 has been called.
Add a post_eject callback to struct acpi_scan_handler. This is called
after acpi_scan_hot_remove() has successfully called _EJ0. Because
acpi_bus_trim_one() also clears the handler pointer, it needs to be
told if the caller will go on to call acpi_bus_post_eject(), so
that acpi_device_clear_enumerated() and clearing the handler pointer
can be deferred. The existing not-used pointer is used for this.
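
The resulting ordering can be modelled in user-space as below. This is
only a sketch of the control flow: the struct and the sketch_* functions
are hypothetical, _EJ0 is reduced to "the status changes", and the
failure path for evaluating _STA is omitted.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_STA_PRESENT	0x01u
#define SKETCH_STA_ENABLED	0x02u

struct sketch_dev {
	unsigned int sta;		/* what _STA would report right now */
	bool handler_bound;		/* stands in for adev->handler != NULL */
	bool post_eject_ran;
	unsigned int sta_seen;		/* _STA value visible to post_eject */
};

/* Models acpi_bus_trim_one(): ->detach() would run here, before _EJ0. */
static void sketch_trim(struct sketch_dev *d, bool is_eject)
{
	if (!is_eject)
		d->handler_bound = false;	/* deferred for an eject */
}

/* Models acpi_bus_post_eject(): runs with the post-_EJ0 status. */
static void sketch_post_eject(struct sketch_dev *d)
{
	if (d->handler_bound) {
		d->sta_seen = d->sta;
		d->post_eject_ran = true;
		d->handler_bound = false;
	}
}

/* Models acpi_scan_hot_remove(): trim, eject, then check _STA. */
static void sketch_hot_remove(struct sketch_dev *d, unsigned int sta_after)
{
	sketch_trim(d, true);
	d->sta = sta_after;			/* effect of _EJ0 */
	if (!(d->sta & SKETCH_STA_ENABLED))
		sketch_post_eject(d);
}
```

Because the handler pointer survives the trim, the post_eject callback
sees the status firmware reports after the eject completed.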
Signed-off-by: James Morse <[email protected]>
---
drivers/acpi/acpi_processor.c | 4 +--
drivers/acpi/scan.c | 52 ++++++++++++++++++++++++++++++-----
include/acpi/acpi_bus.h | 1 +
3 files changed, 48 insertions(+), 9 deletions(-)
diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
index 22a15a614f95..00dcc23d49a8 100644
--- a/drivers/acpi/acpi_processor.c
+++ b/drivers/acpi/acpi_processor.c
@@ -459,7 +459,7 @@ static int acpi_processor_add(struct acpi_device *device,
#ifdef CONFIG_ACPI_HOTPLUG_PRESENT_CPU
/* Removal */
-static void acpi_processor_remove(struct acpi_device *device)
+static void acpi_processor_post_eject(struct acpi_device *device)
{
struct acpi_processor *pr;
@@ -627,7 +627,7 @@ static struct acpi_scan_handler processor_handler = {
.ids = processor_device_ids,
.attach = acpi_processor_add,
#ifdef CONFIG_ACPI_HOTPLUG_PRESENT_CPU
- .detach = acpi_processor_remove,
+ .post_eject = acpi_processor_post_eject,
#endif
.hotplug = {
.enabled = true,
diff --git a/drivers/acpi/scan.c b/drivers/acpi/scan.c
index a675333618ae..b6d2f01640a9 100644
--- a/drivers/acpi/scan.c
+++ b/drivers/acpi/scan.c
@@ -244,18 +244,28 @@ static int acpi_scan_try_to_offline(struct acpi_device *device)
return 0;
}
-static int acpi_bus_trim_one(struct acpi_device *adev, void *not_used)
+/**
+ * acpi_bus_trim_one() - Detach scan handlers and drivers from ACPI device
+ * objects.
+ * @adev: Root of the ACPI namespace scope to walk.
+ * @eject: Pointer to a bool that indicates if this was due to an
+ * eject-request.
+ *
+ * Must be called under acpi_scan_lock.
+ * If @eject points to true, clearing the device enumeration is deferred until
+ * acpi_bus_post_eject() is called.
+ */
+static int acpi_bus_trim_one(struct acpi_device *adev, void *eject)
{
struct acpi_scan_handler *handler = adev->handler;
+ bool is_eject = *(bool *)eject;
- acpi_dev_for_each_child_reverse(adev, acpi_bus_trim_one, NULL);
+ acpi_dev_for_each_child_reverse(adev, acpi_bus_trim_one, eject);
adev->flags.match_driver = false;
if (handler) {
if (handler->detach)
handler->detach(adev);
-
- adev->handler = NULL;
} else {
device_release_driver(&adev->dev);
}
@@ -265,7 +275,12 @@ static int acpi_bus_trim_one(struct acpi_device *adev, void *not_used)
*/
acpi_device_set_power(adev, ACPI_STATE_D3_COLD);
adev->flags.initialized = false;
- acpi_device_clear_enumerated(adev);
+
+ /* For eject this is deferred to acpi_bus_post_eject() */
+ if (!is_eject) {
+ adev->handler = NULL;
+ acpi_device_clear_enumerated(adev);
+ }
return 0;
}
@@ -278,15 +293,36 @@ static int acpi_bus_trim_one(struct acpi_device *adev, void *not_used)
*/
void acpi_bus_trim(struct acpi_device *adev)
{
- acpi_bus_trim_one(adev, NULL);
+ bool eject = false;
+
+ acpi_bus_trim_one(adev, &eject);
}
EXPORT_SYMBOL_GPL(acpi_bus_trim);
+static int acpi_bus_post_eject(struct acpi_device *adev, void *not_used)
+{
+ struct acpi_scan_handler *handler = adev->handler;
+
+ acpi_dev_for_each_child_reverse(adev, acpi_bus_post_eject, NULL);
+
+ if (handler) {
+ if (handler->post_eject)
+ handler->post_eject(adev);
+
+ adev->handler = NULL;
+ }
+
+ acpi_device_clear_enumerated(adev);
+
+ return 0;
+}
+
static int acpi_scan_hot_remove(struct acpi_device *device)
{
acpi_handle handle = device->handle;
unsigned long long sta;
acpi_status status;
+ bool eject = true;
if (device->handler && device->handler->hotplug.demand_offline) {
if (!acpi_scan_is_offline(device, true))
@@ -299,7 +335,7 @@ static int acpi_scan_hot_remove(struct acpi_device *device)
acpi_handle_debug(handle, "Ejecting\n");
- acpi_bus_trim(device);
+ acpi_bus_trim_one(device, &eject);
acpi_evaluate_lck(handle, 0);
/*
@@ -322,6 +358,8 @@ static int acpi_scan_hot_remove(struct acpi_device *device)
} else if (sta & ACPI_STA_DEVICE_ENABLED) {
acpi_handle_warn(handle,
"Eject incomplete - status 0x%llx\n", sta);
+ } else {
+ acpi_bus_post_eject(device, NULL);
}
return 0;
diff --git a/include/acpi/acpi_bus.h b/include/acpi/acpi_bus.h
index 254685085c82..1b7e1acf925b 100644
--- a/include/acpi/acpi_bus.h
+++ b/include/acpi/acpi_bus.h
@@ -127,6 +127,7 @@ struct acpi_scan_handler {
bool (*match)(const char *idstr, const struct acpi_device_id **matchid);
int (*attach)(struct acpi_device *dev, const struct acpi_device_id *id);
void (*detach)(struct acpi_device *dev);
+ void (*post_eject)(struct acpi_device *dev);
void (*bind)(struct device *phys_dev);
void (*unbind)(struct device *phys_dev);
struct acpi_hotplug_profile hotplug;
--
2.39.2
NUMA systems require the node descriptions to be ready before CPUs are
registered. This is so that the node symlinks can be created in sysfs.
Currently no NUMA platform uses GENERIC_CPU_DEVICES, meaning that CPUs
are registered by arch code, instead of cpu_dev_init().
Move cpu_dev_init() after node_dev_init() so that NUMA architectures
can use GENERIC_CPU_DEVICES.
Signed-off-by: James Morse <[email protected]>
---
drivers/base/init.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/base/init.c b/drivers/base/init.c
index 397eb9880cec..c4954835128c 100644
--- a/drivers/base/init.c
+++ b/drivers/base/init.c
@@ -35,8 +35,8 @@ void __init driver_init(void)
of_core_init();
platform_bus_init();
auxiliary_bus_init();
- cpu_dev_init();
memory_dev_init();
node_dev_init();
+ cpu_dev_init();
container_dev_init();
}
--
2.39.2
ACPI has two descriptions of CPUs, one in the MADT/APIC table, the other
in the DSDT. Both are required. (ACPI 6.5's 8.4 "Declaring Processors"
says "Each processor in the system must be declared in the ACPI
namespace"). Having two descriptions allows firmware authors to get
this wrong.
If CPUs are described in the MADT/APIC, they will be brought online
early during boot. Once the register_cpu() calls are moved to ACPI,
they will be based on the DSDT description of the CPUs. When CPUs are
missing from the DSDT description, they will end up online, but not
registered.
Add a helper that runs after acpi_init() has completed to register
CPUs that are online, but weren't found in the DSDT. Any CPU that
is registered by this code triggers a firmware-bug warning and kernel
taint.
Qemu TCG only describes the first CPU in the DSDT, unless cpu-hotplug
is configured.
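
The fix-up amounts to registering every CPU that is online but has no
device yet. A user-space sketch using plain 64-bit masks is shown below;
the struct and sketch_* names are hypothetical and only model the
bookkeeping, not the kernel's cpumask or device APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct sketch_state {
	uint64_t online;	/* CPUs brought up from the MADT description */
	uint64_t registered;	/* CPUs that already have a device */
	bool tainted;		/* set when a firmware bug was worked around */
};

/* Register online-but-unregistered CPUs; returns how many were fixed up. */
static unsigned int sketch_register_missing(struct sketch_state *s)
{
	unsigned int cpu, fixed = 0;

	for (cpu = 0; cpu < 64; cpu++) {
		uint64_t bit = UINT64_C(1) << cpu;

		if ((s->online & bit) && !(s->registered & bit)) {
			s->tainted = true;	/* firmware omitted this CPU */
			s->registered |= bit;
			fixed++;
		}
	}

	return fixed;
}
```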
Signed-off-by: James Morse <[email protected]>
---
drivers/acpi/acpi_processor.c | 19 +++++++++++++++++++
1 file changed, 19 insertions(+)
diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
index b4bde78121bb..a01e315aa16a 100644
--- a/drivers/acpi/acpi_processor.c
+++ b/drivers/acpi/acpi_processor.c
@@ -790,6 +790,25 @@ void __init acpi_processor_init(void)
acpi_pcc_cpufreq_init();
}
+static int __init acpi_processor_register_missing_cpus(void)
+{
+ int cpu;
+
+ if (acpi_disabled)
+ return 0;
+
+ for_each_online_cpu(cpu) {
+ if (!get_cpu_device(cpu)) {
+ pr_err_once(FW_BUG "CPU %u has no ACPI namespace description!\n", cpu);
+ add_taint(TAINT_FIRMWARE_WORKAROUND, LOCKDEP_STILL_OK);
+ arch_register_cpu(cpu);
+ }
+ }
+
+ return 0;
+}
+subsys_initcall_sync(acpi_processor_register_missing_cpus);
+
#ifdef CONFIG_ACPI_PROCESSOR_CSTATE
/**
* acpi_processor_claim_cst_control - Request _CST control from the platform.
--
2.39.2
The 'offline' file in sysfs shows all offline CPUs, including those
that aren't present. User-space is expected to remove not-present CPUs
from this list to learn which CPUs could be brought online.
CPUs can be present but not-enabled. These CPUs can't be brought online
until the firmware policy changes, which comes with an ACPI notification
that will register the CPUs.
With only the offline and present files, user-space is unable to
determine which CPUs it can try to bring online. Add a new CPU mask,
exposed through a new 'enabled' file in sysfs, that shows this based on
all the registered CPUs.
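
A user-space consumer could combine the new file with the existing
'online' file to find CPUs worth trying. The sketch below parses the
cpulist format that these files use (e.g. "0-3,5") into a bitmask. The
sketch_* helpers are hypothetical, they handle CPUs 0..63 only, and the
assumption that the file appears next to 'offline' under
/sys/devices/system/cpu/ follows from where the attribute is added.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Parse a cpulist such as "0-3,5\n" into a mask. Returns 0 or -1. */
static int sketch_parse_cpulist(const char *s, uint64_t *mask)
{
	*mask = 0;
	while (*s && *s != '\n') {
		char *end;
		unsigned long lo = strtoul(s, &end, 10), hi;

		if (end == s)
			return -1;	/* no digits where a CPU was expected */
		hi = lo;
		if (*end == '-') {	/* a range "lo-hi" */
			s = end + 1;
			hi = strtoul(s, &end, 10);
			if (end == s)
				return -1;
		}
		if (hi < lo || hi > 63)
			return -1;
		for (; lo <= hi; lo++)
			*mask |= UINT64_C(1) << lo;
		if (*end != ',' && *end != '\0' && *end != '\n')
			return -1;
		s = (*end == ',') ? end + 1 : end;
	}
	return 0;
}

/* CPUs that are registered but not yet running. */
static uint64_t sketch_onlinable(uint64_t enabled, uint64_t online)
{
	return enabled & ~online;
}
```

Feeding the contents of the 'enabled' and 'online' files through the
parser and then sketch_onlinable() yields the CPUs user-space can try
to bring online.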
Signed-off-by: James Morse <[email protected]>
---
drivers/base/cpu.c | 10 ++++++++++
include/linux/cpumask.h | 25 +++++++++++++++++++++++++
kernel/cpu.c | 3 +++
3 files changed, 38 insertions(+)
diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c
index c709747c4a18..a19a8be93102 100644
--- a/drivers/base/cpu.c
+++ b/drivers/base/cpu.c
@@ -95,6 +95,7 @@ void unregister_cpu(struct cpu *cpu)
{
int logical_cpu = cpu->dev.id;
+ set_cpu_enabled(logical_cpu, false);
unregister_cpu_under_node(logical_cpu, cpu_to_node(logical_cpu));
device_unregister(&cpu->dev);
@@ -273,6 +274,13 @@ static ssize_t print_cpus_offline(struct device *dev,
}
static DEVICE_ATTR(offline, 0444, print_cpus_offline, NULL);
+static ssize_t print_cpus_enabled(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ return sysfs_emit(buf, "%*pbl\n", cpumask_pr_args(cpu_enabled_mask));
+}
+static DEVICE_ATTR(enabled, 0444, print_cpus_enabled, NULL);
+
static ssize_t print_cpus_isolated(struct device *dev,
struct device_attribute *attr, char *buf)
{
@@ -413,6 +421,7 @@ int register_cpu(struct cpu *cpu, int num)
register_cpu_under_node(num, cpu_to_node(num));
dev_pm_qos_expose_latency_limit(&cpu->dev,
PM_QOS_RESUME_LATENCY_NO_CONSTRAINT);
+ set_cpu_enabled(num, true);
return 0;
}
@@ -494,6 +503,7 @@ static struct attribute *cpu_root_attrs[] = {
&cpu_attrs[2].attr.attr,
&dev_attr_kernel_max.attr,
&dev_attr_offline.attr,
+ &dev_attr_enabled.attr,
&dev_attr_isolated.attr,
#ifdef CONFIG_NO_HZ_FULL
&dev_attr_nohz_full.attr,
diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
index f10fb87d49db..a29ee03f13ff 100644
--- a/include/linux/cpumask.h
+++ b/include/linux/cpumask.h
@@ -92,6 +92,7 @@ static inline void set_nr_cpu_ids(unsigned int nr)
*
* cpu_possible_mask- has bit 'cpu' set iff cpu is populatable
* cpu_present_mask - has bit 'cpu' set iff cpu is populated
+ * cpu_enabled_mask - has bit 'cpu' set iff cpu can be brought online
* cpu_online_mask - has bit 'cpu' set iff cpu available to scheduler
* cpu_active_mask - has bit 'cpu' set iff cpu available to migration
*
@@ -124,11 +125,13 @@ static inline void set_nr_cpu_ids(unsigned int nr)
extern struct cpumask __cpu_possible_mask;
extern struct cpumask __cpu_online_mask;
+extern struct cpumask __cpu_enabled_mask;
extern struct cpumask __cpu_present_mask;
extern struct cpumask __cpu_active_mask;
extern struct cpumask __cpu_dying_mask;
#define cpu_possible_mask ((const struct cpumask *)&__cpu_possible_mask)
#define cpu_online_mask ((const struct cpumask *)&__cpu_online_mask)
+#define cpu_enabled_mask ((const struct cpumask *)&__cpu_enabled_mask)
#define cpu_present_mask ((const struct cpumask *)&__cpu_present_mask)
#define cpu_active_mask ((const struct cpumask *)&__cpu_active_mask)
#define cpu_dying_mask ((const struct cpumask *)&__cpu_dying_mask)
@@ -973,6 +976,7 @@ extern const DECLARE_BITMAP(cpu_all_bits, NR_CPUS);
#else
#define for_each_possible_cpu(cpu) for_each_cpu((cpu), cpu_possible_mask)
#define for_each_online_cpu(cpu) for_each_cpu((cpu), cpu_online_mask)
+#define for_each_enabled_cpu(cpu) for_each_cpu((cpu), cpu_enabled_mask)
#define for_each_present_cpu(cpu) for_each_cpu((cpu), cpu_present_mask)
#endif
@@ -995,6 +999,15 @@ set_cpu_possible(unsigned int cpu, bool possible)
cpumask_clear_cpu(cpu, &__cpu_possible_mask);
}
+static inline void
+set_cpu_enabled(unsigned int cpu, bool can_be_onlined)
+{
+ if (can_be_onlined)
+ cpumask_set_cpu(cpu, &__cpu_enabled_mask);
+ else
+ cpumask_clear_cpu(cpu, &__cpu_enabled_mask);
+}
+
static inline void
set_cpu_present(unsigned int cpu, bool present)
{
@@ -1074,6 +1087,7 @@ static __always_inline unsigned int num_online_cpus(void)
return raw_atomic_read(&__num_online_cpus);
}
#define num_possible_cpus() cpumask_weight(cpu_possible_mask)
+#define num_enabled_cpus() cpumask_weight(cpu_enabled_mask)
#define num_present_cpus() cpumask_weight(cpu_present_mask)
#define num_active_cpus() cpumask_weight(cpu_active_mask)
@@ -1082,6 +1096,11 @@ static inline bool cpu_online(unsigned int cpu)
return cpumask_test_cpu(cpu, cpu_online_mask);
}
+static inline bool cpu_enabled(unsigned int cpu)
+{
+ return cpumask_test_cpu(cpu, cpu_enabled_mask);
+}
+
static inline bool cpu_possible(unsigned int cpu)
{
return cpumask_test_cpu(cpu, cpu_possible_mask);
@@ -1106,6 +1125,7 @@ static inline bool cpu_dying(unsigned int cpu)
#define num_online_cpus() 1U
#define num_possible_cpus() 1U
+#define num_enabled_cpus() 1U
#define num_present_cpus() 1U
#define num_active_cpus() 1U
@@ -1119,6 +1139,11 @@ static inline bool cpu_possible(unsigned int cpu)
return cpu == 0;
}
+static inline bool cpu_enabled(unsigned int cpu)
+{
+ return cpu == 0;
+}
+
static inline bool cpu_present(unsigned int cpu)
{
return cpu == 0;
diff --git a/kernel/cpu.c b/kernel/cpu.c
index 6de7c6bb74ee..2201a6a449b5 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -3101,6 +3101,9 @@ EXPORT_SYMBOL(__cpu_possible_mask);
struct cpumask __cpu_online_mask __read_mostly;
EXPORT_SYMBOL(__cpu_online_mask);
+struct cpumask __cpu_enabled_mask __read_mostly;
+EXPORT_SYMBOL(__cpu_enabled_mask);
+
struct cpumask __cpu_present_mask __read_mostly;
EXPORT_SYMBOL(__cpu_present_mask);
--
2.39.2
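As an illustration of the accessors this patch adds, here is a minimal userspace sketch. It assumes a single 64-bit word in place of struct cpumask (so at most 64 CPUs), and every toy_* name is hypothetical; only the shape of set_cpu_enabled(), cpu_enabled() and num_enabled_cpus() is taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for struct cpumask: one bit per CPU, at most 64. */
typedef uint64_t toy_cpumask;

static toy_cpumask toy_enabled_mask;

/* Set or clear one CPU's bit; cpu must be below 64 in this sketch. */
static void toy_set_cpu(toy_cpumask *mask, unsigned int cpu, bool set)
{
	if (set)
		*mask |= UINT64_C(1) << cpu;
	else
		*mask &= ~(UINT64_C(1) << cpu);
}

static bool toy_test_cpu(toy_cpumask mask, unsigned int cpu)
{
	return (mask >> cpu) & 1;
}

/* Same shape as set_cpu_enabled() in the patch. */
static void toy_set_cpu_enabled(unsigned int cpu, bool can_be_onlined)
{
	toy_set_cpu(&toy_enabled_mask, cpu, can_be_onlined);
}

/* Same shape as cpu_enabled() in the patch. */
static bool toy_cpu_enabled(unsigned int cpu)
{
	return toy_test_cpu(toy_enabled_mask, cpu);
}

/* Counterpart of num_enabled_cpus(): population count of the mask. */
static unsigned int toy_num_enabled_cpus(void)
{
	unsigned int n = 0;
	toy_cpumask m = toy_enabled_mask;

	while (m) {
		m &= m - 1;	/* clear the lowest set bit */
		n++;
	}
	return n;
}
```

Marking CPUs 0 and 3 as enabled and then clearing CPU 3 should leave a count of one, which is the behaviour a caller of the real helpers would rely on.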
Now that GENERIC_CPU_DEVICES calls arch_register_cpu(), which can be
overridden by the arch code, switch over to this to allow common code
to choose when the register_cpu() call is made.
x86's struct cpus come from struct x86_cpu, which has no other members
or users. Remove this and use the version defined by common code.
This is an intermediate step to the logic being moved to drivers/acpi,
where GENERIC_CPU_DEVICES will do the work when booting with acpi=off.
Signed-off-by: James Morse <[email protected]>
----
Changes since RFC:
* Fixed the second copy of arch_register_cpu() used for non-hotplug
---
arch/x86/Kconfig | 1 +
arch/x86/include/asm/cpu.h | 4 ----
arch/x86/kernel/topology.c | 25 ++++++-------------------
3 files changed, 7 insertions(+), 23 deletions(-)
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index a0100a1ab4a0..133ea5f561b5 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -148,6 +148,7 @@ config X86
select GENERIC_CLOCKEVENTS_MIN_ADJUST
select GENERIC_CMOS_UPDATE
select GENERIC_CPU_AUTOPROBE
+ select GENERIC_CPU_DEVICES
select GENERIC_CPU_VULNERABILITIES
select GENERIC_EARLY_IOREMAP
select GENERIC_ENTRY
diff --git a/arch/x86/include/asm/cpu.h b/arch/x86/include/asm/cpu.h
index 96dc4665e87d..f349c94510e8 100644
--- a/arch/x86/include/asm/cpu.h
+++ b/arch/x86/include/asm/cpu.h
@@ -23,10 +23,6 @@ static inline void prefill_possible_map(void) {}
#endif /* CONFIG_SMP */
-struct x86_cpu {
- struct cpu cpu;
-};
-
#ifdef CONFIG_HOTPLUG_CPU
extern void arch_unregister_cpu(int);
extern void soft_restart_cpu(void);
diff --git a/arch/x86/kernel/topology.c b/arch/x86/kernel/topology.c
index 0bab03130033..ca08a1d138f0 100644
--- a/arch/x86/kernel/topology.c
+++ b/arch/x86/kernel/topology.c
@@ -35,38 +35,25 @@
#include <asm/io_apic.h>
#include <asm/cpu.h>
-static DEFINE_PER_CPU(struct x86_cpu, cpu_devices);
-
#ifdef CONFIG_HOTPLUG_CPU
int arch_register_cpu(int cpu)
{
- struct x86_cpu *xc = per_cpu_ptr(&cpu_devices, cpu);
+ struct cpu *c = per_cpu_ptr(&cpu_devices, cpu);
- xc->cpu.hotpluggable = cpu > 0;
- return register_cpu(&xc->cpu, cpu);
+ c->hotpluggable = cpu > 0;
+ return register_cpu(c, cpu);
}
EXPORT_SYMBOL(arch_register_cpu);
void arch_unregister_cpu(int num)
{
- unregister_cpu(&per_cpu(cpu_devices, num).cpu);
+ unregister_cpu(&per_cpu(cpu_devices, num));
}
EXPORT_SYMBOL(arch_unregister_cpu);
#else /* CONFIG_HOTPLUG_CPU */
-int __init arch_register_cpu(int num)
+int arch_register_cpu(int num)
{
- return register_cpu(&per_cpu(cpu_devices, num).cpu, num);
+ return register_cpu(&per_cpu(cpu_devices, num), num);
}
#endif /* CONFIG_HOTPLUG_CPU */
-
-static int __init topology_init(void)
-{
- int i;
-
- for_each_present_cpu(i)
- arch_register_cpu(i);
-
- return 0;
-}
-subsys_initcall(topology_init);
--
2.39.2
Now that GENERIC_CPU_DEVICES calls arch_register_cpu(), which can be
overridden by the arch code, switch over to this to allow common code
to choose when the register_cpu() call is made.
This allows topology_init() to be removed.
This is an intermediate step to the logic being moved to drivers/acpi,
where GENERIC_CPU_DEVICES will do the work when booting with acpi=off.
Signed-off-by: James Morse <[email protected]>
---
arch/riscv/Kconfig | 1 +
arch/riscv/kernel/setup.c | 19 ++++---------------
2 files changed, 5 insertions(+), 15 deletions(-)
diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index d607ab0f7c6d..eeb80fb55acc 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -69,6 +69,7 @@ config RISCV
select GENERIC_ARCH_TOPOLOGY
select GENERIC_ATOMIC64 if !64BIT
select GENERIC_CLOCKEVENTS_BROADCAST if SMP
+ select GENERIC_CPU_DEVICES
select GENERIC_EARLY_IOREMAP
select GENERIC_ENTRY
select GENERIC_GETTIMEOFDAY if HAVE_GENERIC_VDSO
diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c
index e600aab116a4..f5bd6b8d0c52 100644
--- a/arch/riscv/kernel/setup.c
+++ b/arch/riscv/kernel/setup.c
@@ -62,7 +62,6 @@ atomic_t hart_lottery __section(".sdata")
#endif
;
unsigned long boot_cpu_hartid;
-static DEFINE_PER_CPU(struct cpu, cpu_devices);
/*
* Place kernel memory regions on the resource tree so that
@@ -320,23 +319,13 @@ void __init setup_arch(char **cmdline_p)
riscv_set_dma_cache_alignment();
}
-static int __init topology_init(void)
+int arch_register_cpu(int cpu)
{
- int i, ret;
+ struct cpu *c = &per_cpu(cpu_devices, cpu);
- for_each_possible_cpu(i) {
- struct cpu *cpu = &per_cpu(cpu_devices, i);
-
- cpu->hotpluggable = cpu_has_hotplug(i);
- ret = register_cpu(cpu, i);
- if (unlikely(ret))
- pr_warn("Warning: %s: register_cpu %d failed (%d)\n",
- __func__, i, ret);
- }
-
- return 0;
+ c->hotpluggable = cpu_has_hotplug(cpu);
+ return register_cpu(c, cpu);
}
-subsys_initcall(topology_init);
void free_initmem(void)
{
--
2.39.2
Architectures often have extra per-cpu work that needs doing
before a CPU is registered, often to determine if a CPU is
hotpluggable.
To allow the ACPI architectures to use GENERIC_CPU_DEVICES, move
the cpu_register() call into arch_register_cpu(), which is made __weak
so architectures with extra work can override it.
This aligns with the way x86, ia64 and loongarch register hotplug CPUs
when they become present.
Signed-off-by: James Morse <[email protected]>
---
Changes since RFC:
* Dropped __init from x86/ia64 arch_register_cpu()
---
arch/ia64/include/asm/cpu.h | 1 -
arch/ia64/kernel/topology.c | 2 +-
arch/loongarch/include/asm/cpu.h | 1 -
arch/x86/include/asm/cpu.h | 1 -
arch/x86/kernel/topology.c | 2 +-
drivers/base/cpu.c | 14 ++++++++++----
include/linux/cpu.h | 5 +++++
7 files changed, 17 insertions(+), 9 deletions(-)
diff --git a/arch/ia64/include/asm/cpu.h b/arch/ia64/include/asm/cpu.h
index db125df9e088..a3e690e685e5 100644
--- a/arch/ia64/include/asm/cpu.h
+++ b/arch/ia64/include/asm/cpu.h
@@ -16,7 +16,6 @@ DECLARE_PER_CPU(struct ia64_cpu, cpu_devices);
DECLARE_PER_CPU(int, cpu_state);
#ifdef CONFIG_HOTPLUG_CPU
-extern int arch_register_cpu(int num);
extern void arch_unregister_cpu(int);
#endif
diff --git a/arch/ia64/kernel/topology.c b/arch/ia64/kernel/topology.c
index 94a848b06f15..741863a187a6 100644
--- a/arch/ia64/kernel/topology.c
+++ b/arch/ia64/kernel/topology.c
@@ -59,7 +59,7 @@ void __ref arch_unregister_cpu(int num)
}
EXPORT_SYMBOL(arch_unregister_cpu);
#else
-static int __init arch_register_cpu(int num)
+int __init arch_register_cpu(int num)
{
return register_cpu(&sysfs_cpus[num].cpu, num);
}
diff --git a/arch/loongarch/include/asm/cpu.h b/arch/loongarch/include/asm/cpu.h
index 7afe8cbb844e..b8568e637420 100644
--- a/arch/loongarch/include/asm/cpu.h
+++ b/arch/loongarch/include/asm/cpu.h
@@ -130,7 +130,6 @@ enum cpu_type_enum {
#if !defined(__ASSEMBLY__)
#ifdef CONFIG_HOTPLUG_CPU
-int arch_register_cpu(int num);
void arch_unregister_cpu(int cpu);
#endif
#endif /* ! __ASSEMBLY__ */
diff --git a/arch/x86/include/asm/cpu.h b/arch/x86/include/asm/cpu.h
index 3a233ebff712..96dc4665e87d 100644
--- a/arch/x86/include/asm/cpu.h
+++ b/arch/x86/include/asm/cpu.h
@@ -28,7 +28,6 @@ struct x86_cpu {
};
#ifdef CONFIG_HOTPLUG_CPU
-extern int arch_register_cpu(int num);
extern void arch_unregister_cpu(int);
extern void soft_restart_cpu(void);
#endif
diff --git a/arch/x86/kernel/topology.c b/arch/x86/kernel/topology.c
index ca004e2e4469..0bab03130033 100644
--- a/arch/x86/kernel/topology.c
+++ b/arch/x86/kernel/topology.c
@@ -54,7 +54,7 @@ void arch_unregister_cpu(int num)
EXPORT_SYMBOL(arch_unregister_cpu);
#else /* CONFIG_HOTPLUG_CPU */
-static int __init arch_register_cpu(int num)
+int __init arch_register_cpu(int num)
{
return register_cpu(&per_cpu(cpu_devices, num).cpu, num);
}
diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c
index 34b48f660b6b..579064fda97b 100644
--- a/drivers/base/cpu.c
+++ b/drivers/base/cpu.c
@@ -525,19 +525,25 @@ bool cpu_is_hotpluggable(unsigned int cpu)
EXPORT_SYMBOL_GPL(cpu_is_hotpluggable);
#ifdef CONFIG_GENERIC_CPU_DEVICES
-static DEFINE_PER_CPU(struct cpu, cpu_devices);
+DEFINE_PER_CPU(struct cpu, cpu_devices);
+
+int __weak arch_register_cpu(int cpu)
+{
+ return register_cpu(&per_cpu(cpu_devices, cpu), cpu);
+}
#endif
static void __init cpu_dev_register_generic(void)
{
-#ifdef CONFIG_GENERIC_CPU_DEVICES
int i;
+ if (!IS_ENABLED(CONFIG_GENERIC_CPU_DEVICES))
+ return;
+
for_each_present_cpu(i) {
- if (register_cpu(&per_cpu(cpu_devices, i), i))
+ if (arch_register_cpu(i))
panic("Failed to register CPU device");
}
-#endif
}
#ifdef CONFIG_GENERIC_CPU_VULNERABILITIES
diff --git a/include/linux/cpu.h b/include/linux/cpu.h
index 0abd60a7987b..a71691d7c2ca 100644
--- a/include/linux/cpu.h
+++ b/include/linux/cpu.h
@@ -80,12 +80,17 @@ extern __printf(4, 5)
struct device *cpu_device_create(struct device *parent, void *drvdata,
const struct attribute_group **groups,
const char *fmt, ...);
+extern int arch_register_cpu(int cpu);
#ifdef CONFIG_HOTPLUG_CPU
extern void unregister_cpu(struct cpu *cpu);
extern ssize_t arch_cpu_probe(const char *, size_t);
extern ssize_t arch_cpu_release(const char *, size_t);
#endif
+#ifdef CONFIG_GENERIC_CPU_DEVICES
+DECLARE_PER_CPU(struct cpu, cpu_devices);
+#endif
+
/*
* These states are not related to the core CPU hotplug mechanism. They are
* used by various (sub)architectures to track internal state
--
2.39.2
loongarch, mips, parisc, riscv and sh all print a warning if
register_cpu() returns an error. Architectures that use
GENERIC_CPU_DEVICES call panic() instead.
Errors in this path indicate something is wrong with the firmware
description of the platform, but the kernel is able to keep running.
Downgrade this to a warning to make it easier to debug this issue.
This will allow architectures that are switching over to GENERIC_CPU_DEVICES
to drop their warning, but keep the existing behaviour.
Signed-off-by: James Morse <[email protected]>
---
drivers/base/cpu.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c
index 579064fda97b..d31c936f0955 100644
--- a/drivers/base/cpu.c
+++ b/drivers/base/cpu.c
@@ -535,14 +535,15 @@ int __weak arch_register_cpu(int cpu)
static void __init cpu_dev_register_generic(void)
{
- int i;
+ int i, ret;
if (!IS_ENABLED(CONFIG_GENERIC_CPU_DEVICES))
return;
for_each_present_cpu(i) {
- if (arch_register_cpu(i))
- panic("Failed to register CPU device");
+ ret = arch_register_cpu(i);
+ if (ret)
+ pr_warn("register_cpu %d failed (%d)\n", i, ret);
}
}
--
2.39.2
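For readers following along, the combined effect of the __weak arch_register_cpu() default and the panic-to-warning change can be modelled in userspace. This is only a sketch under stated assumptions: the toy_* names, the four-CPU arrays, the forced failure on CPU 2 and the returned failure count are all invented for illustration, and the weak attribute is the GCC/Clang spelling rather than the kernel's __weak macro.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define TOY_NR_CPUS 4

/* Hypothetical stand-ins for the present mask and the registered devices. */
static bool toy_present[TOY_NR_CPUS] = { true, true, true, false };
static bool toy_registered[TOY_NR_CPUS];

/* Toy register_cpu(): fails for CPU 2 to exercise the error path. */
static int toy_register_cpu(int cpu)
{
	if (cpu == 2)
		return -1;
	toy_registered[cpu] = true;
	return 0;
}

/*
 * Weak default, in the spirit of the one in drivers/base/cpu.c. A strong
 * definition of the same symbol in another object file would replace it
 * at link time.
 */
__attribute__((weak)) int toy_arch_register_cpu(int cpu)
{
	return toy_register_cpu(cpu);
}

/*
 * Warn and carry on instead of panicking. Returning the failure count is
 * an addition for this sketch; the kernel function returns void.
 */
static int toy_cpu_dev_register_generic(void)
{
	int i, ret, failed = 0;

	for (i = 0; i < TOY_NR_CPUS; i++) {
		if (!toy_present[i])
			continue;
		ret = toy_arch_register_cpu(i);
		if (ret) {
			fprintf(stderr, "register_cpu %d failed (%d)\n", i, ret);
			failed++;
		}
	}
	return failed;
}
```

The point of the sketch is that one failing CPU no longer stops the remaining present CPUs from being registered.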
To allow ACPI to skip the call to arch_register_cpu() when the _STA
value indicates the CPU can't be brought online right now, move the
arch_register_cpu() call into acpi_processor_get_info().
Systems can still be booted with 'acpi=off', or not include an
ACPI description at all. For these, the CPUs continue to be
registered by cpu_dev_register_generic().
This moves the CPU register logic back to a subsys_initcall(),
while the memory nodes will have been registered earlier.
Signed-off-by: James Morse <[email protected]>
---
drivers/acpi/acpi_processor.c | 13 +++++++++++++
drivers/base/cpu.c | 2 +-
2 files changed, 14 insertions(+), 1 deletion(-)
diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
index a01e315aa16a..867782bc50b0 100644
--- a/drivers/acpi/acpi_processor.c
+++ b/drivers/acpi/acpi_processor.c
@@ -313,6 +313,19 @@ static int acpi_processor_get_info(struct acpi_device *device)
cpufreq_add_device("acpi-cpufreq");
}
+ /*
+ * Register CPUs that are present.
+ * Use get_cpu_device() to skip duplicate CPU descriptions from
+ * firmware.
+ */
+ if (!invalid_logical_cpuid(pr->id) && cpu_present(pr->id) &&
+ !get_cpu_device(pr->id)) {
+ int ret = arch_register_cpu(pr->id);
+
+ if (ret)
+ return ret;
+ }
+
/*
* Extra Processor objects may be enumerated on MP systems with
* less than the max # of CPUs. They should be ignored _iff
diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c
index d31c936f0955..677f963e02ce 100644
--- a/drivers/base/cpu.c
+++ b/drivers/base/cpu.c
@@ -537,7 +537,7 @@ static void __init cpu_dev_register_generic(void)
{
int i, ret;
- if (!IS_ENABLED(CONFIG_GENERIC_CPU_DEVICES))
+ if (!IS_ENABLED(CONFIG_GENERIC_CPU_DEVICES) || !acpi_disabled)
return;
for_each_present_cpu(i) {
--
2.39.2
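The guard added to acpi_processor_get_info() can be sketched as a small userspace model. The assumptions are loud ones: the toy_* helpers and arrays are hypothetical, toy_has_device[] stands in for get_cpu_device() returning non-NULL, and the toy ID check also rejects IDs beyond the array so the model stays memory-safe.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NR_CPUS 4

/* Hypothetical state: which CPUs are present, which already have a device. */
static bool toy_present[TOY_NR_CPUS] = { true, true, false, false };
static bool toy_has_device[TOY_NR_CPUS];
static int toy_register_calls;

/* Toy ID check; also bounds the index for the arrays above. */
static bool toy_invalid_logical_cpuid(int id)
{
	return id < 0 || id >= TOY_NR_CPUS;
}

static int toy_arch_register_cpu(int cpu)
{
	toy_has_device[cpu] = true;
	toy_register_calls++;
	return 0;
}

/*
 * Mirrors the condition in the patch: register only a valid, present CPU
 * that has not been registered yet, so duplicate firmware descriptions
 * of the same CPU are skipped.
 */
static int toy_processor_get_info(int id)
{
	if (!toy_invalid_logical_cpuid(id) && toy_present[id] &&
	    !toy_has_device[id])
		return toy_arch_register_cpu(id);

	return 0;
}
```

Calling toy_processor_get_info() twice for the same ID registers the CPU once, and a CPU that is not present is never registered.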
Hello James,
On Wed, 13 Sept 2023 at 18:41, James Morse <[email protected]> wrote:
>
> Add the new flag field to the MADT's GICC structure.
>
> 'Online Capable' indicates a disabled CPU can be enabled later.
>
Why do we need a bit for this? What would be the point of describing
disabled CPUs that cannot be enabled (and are you aware of
firmware doing this)?
So why are we not able to assume that this new bit can always be treated as '1'?
> Signed-off-by: James Morse <[email protected]>
> ---
> This patch probably needs to go via the upstream acpica project,
> but is included here so the feature can be tested.
> ---
> include/acpi/actbl2.h | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/include/acpi/actbl2.h b/include/acpi/actbl2.h
> index 3751ae69432f..c433a079d8e1 100644
> --- a/include/acpi/actbl2.h
> +++ b/include/acpi/actbl2.h
> @@ -1046,6 +1046,7 @@ struct acpi_madt_generic_interrupt {
> /* ACPI_MADT_ENABLED (1) Processor is usable if set */
> #define ACPI_MADT_PERFORMANCE_IRQ_MODE (1<<1) /* 01: Performance Interrupt Mode */
> #define ACPI_MADT_VGIC_IRQ_MODE (1<<2) /* 02: VGIC Maintenance Interrupt mode */
> +#define ACPI_MADT_GICC_CPU_CAPABLE (1<<3) /* 03: CPU is online capable */
>
> /* 12: Generic Distributor (ACPI 5.0 + ACPI 6.0 changes) */
>
> --
> 2.39.2
>
On Wed, Sep 13, 2023 at 04:38:16PM +0000, James Morse wrote:
> +static inline bool acpi_gicc_is_usable(struct acpi_madt_generic_interrupt *gicc)
> +{
> + return (gicc->flags & ACPI_MADT_ENABLED);
These parens are not needed.
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
On Wed, Sep 13, 2023 at 04:38:18PM +0000, James Morse wrote:
> static inline bool acpi_gicc_is_usable(struct acpi_madt_generic_interrupt *gicc)
> {
> - return (gicc->flags & ACPI_MADT_ENABLED);
> + return ((gicc->flags & ACPI_MADT_ENABLED ||
> + gicc->flags & ACPI_MADT_GICC_CPU_CAPABLE));
... and this starts getting silly with the number of parens.
return gicc->flags & ACPI_MADT_ENABLED ||
gicc->flags & ACPI_MADT_GICC_CPU_CAPABLE;
is entirely sufficient. Also:
return gicc->flags & (ACPI_MADT_ENABLED | ACPI_MADT_GICC_CPU_CAPABLE);
also works.
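
A quick way to convince oneself that the two forms agree is to compare them over every combination of the low four flag bits. This is a standalone sketch: the helpers take the flags word directly rather than a struct acpi_madt_generic_interrupt pointer, the function names are made up, and the flag values are the ones shown in the quoted patches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values as in the quoted patches. */
#define ACPI_MADT_ENABLED		(1)
#define ACPI_MADT_GICC_CPU_CAPABLE	(1 << 3)

/* First suggested form: two separate tests joined with ||. */
static bool usable_two_tests(uint32_t flags)
{
	return flags & ACPI_MADT_ENABLED ||
	       flags & ACPI_MADT_GICC_CPU_CAPABLE;
}

/* Second suggested form: one test against the combined mask. */
static bool usable_one_mask(uint32_t flags)
{
	return flags & (ACPI_MADT_ENABLED | ACPI_MADT_GICC_CPU_CAPABLE);
}

/* Compare both forms over all values of flag bits 0..3. */
static bool forms_agree(void)
{
	uint32_t flags;

	for (flags = 0; flags < 16; flags++)
		if (usable_two_tests(flags) != usable_one_mask(flags))
			return false;

	return true;
}
```

Both forms rely on the implicit conversion to bool of the return value, so any non-zero result of the mask test reads as true.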
--
On Wed, Sep 13, 2023 at 04:37:50PM +0000, James Morse wrote:
> Three of the five ACPI architectures create sysfs entries using
> register_cpu() for present CPUs, whereas arm64, riscv and all
> GENERIC_CPU_DEVICES do this for possible CPUs.
>
> Registering a CPU is what causes them to show up in sysfs.
>
> It makes very little sense to register all possible CPUs. Registering
> a CPU is what triggers the udev notifications allowing user-space to
> react to newly added CPUs.
>
> To allow all five ACPI architectures to use GENERIC_CPU_DEVICES, change
> it to use for_each_present_cpu(). Making the ACPI architectures use
> GENERIC_CPU_DEVICES is a pre-requisite step to centralise their
> cpu_register() logic, before moving it into the ACPI processor driver.
> When ACPI is disabled this work would be done by
> cpu_dev_register_generic().
>
> Of the ACPI architectures that register possible CPUs, arm64 and riscv
> do not support making possible CPUs present as they use the weak 'always
> fails' version of arch_register_cpu().
>
> Only two of the eight architectures that use GENERIC_CPU_DEVICES have a
> distinction between present and possible CPUs.
>
> The following architectures use GENERIC_CPU_DEVICES but are not SMP,
> so possible == present:
> * m68k
> * microblaze
> * nios2
>
> The following architectures use GENERIC_CPU_DEVICES and consider
> possible == present:
> * csky: setup_smp()
> * parisc: smp_prepare_boot_cpu() marks the boot cpu as present,
> processor_probe() sets possible for all CPUs and present for all CPUs
> except the boot cpu.
However, init/main.c::start_kernel() calls boot_cpu_init() which sets
the boot CPU in the online, active, present and possible masks. So,
_every_ architecture gets the boot CPU in all these masks no matter
what.
Only if something then clears the boot CPU from these masks (which
would be silly) would the boot CPU not be in all of these masks.
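
To make the redundancy concrete, here is a toy model, assuming four boolean arrays in place of the real cpumasks. All toy_* names are hypothetical, and toy_boot_cpu_init() only models what boot_cpu_init() is described as doing above; it is not the kernel code.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NR_CPUS 4

/* Hypothetical stand-ins for the four kernel masks named above. */
struct toy_masks {
	bool online[TOY_NR_CPUS];
	bool active[TOY_NR_CPUS];
	bool present[TOY_NR_CPUS];
	bool possible[TOY_NR_CPUS];
};

/* Models boot_cpu_init(): the boot CPU ends up in all four masks. */
static void toy_boot_cpu_init(struct toy_masks *m, int cpu)
{
	m->online[cpu] = true;
	m->active[cpu] = true;
	m->present[cpu] = true;
	m->possible[cpu] = true;
}

/* Models an arch hook that marks the boot CPU present a second time. */
static bool toy_arch_mark_present(struct toy_masks *m, int cpu)
{
	bool changed = !m->present[cpu];

	m->present[cpu] = true;
	return changed;	/* false means the call was redundant */
}
```

In this model the arch hook reports no change when it runs after toy_boot_cpu_init(), which is the sense in which the extra marking is unnecessary.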
--
On Wed, Sep 13, 2023 at 04:37:51PM +0000, James Morse wrote:
> Architectures often have extra per-cpu work that needs doing
> before a CPU is registered, often to determine if a CPU is
> hotpluggable.
>
> To allow the ACPI architectures to use GENERIC_CPU_DEVICES, move
> the cpu_register() call into arch_register_cpu(), which is made __weak
> so architectures with extra work can override it.
> This aligns with the way x86, ia64 and loongarch register hotplug CPUs
> when they become present.
>
> Signed-off-by: James Morse <[email protected]>
LGTM.
Reviewed-by: Russell King (Oracle) <[email protected]>
--
On Wed, Sep 13, 2023 at 04:37:49PM +0000, James Morse wrote:
> diff --git a/arch/loongarch/include/asm/cpu.h b/arch/loongarch/include/asm/cpu.h
> index 48b9f7168bcc..7afe8cbb844e 100644
> --- a/arch/loongarch/include/asm/cpu.h
> +++ b/arch/loongarch/include/asm/cpu.h
> @@ -128,4 +128,11 @@ enum cpu_type_enum {
> #define LOONGARCH_CPU_HYPERVISOR BIT_ULL(CPU_FEATURE_HYPERVISOR)
> #define LOONGARCH_CPU_PTW BIT_ULL(CPU_FEATURE_PTW)
>
> +#if !defined(__ASSEMBLY__)
> +#ifdef CONFIG_HOTPLUG_CPU
> +int arch_register_cpu(int num);
> +void arch_unregister_cpu(int cpu);
> +#endif
> +#endif /* ! __ASSEMBLY__ */
So, for loongarch:
grep arch_.*register_cpu arch/loongarch/ -r
arch/loongarch/kernel/topology.c:int arch_register_cpu(int cpu)
arch/loongarch/kernel/topology.c:EXPORT_SYMBOL(arch_register_cpu);
arch/loongarch/kernel/topology.c:void arch_unregister_cpu(int cpu)
arch/loongarch/kernel/topology.c:EXPORT_SYMBOL(arch_unregister_cpu);
So really this is a fix (since these functions should have prototypes)
and thus should probably be a separate patch.
However, I also wonder whether these prototypes should be added to
linux/cpu.h and be done with it (rather than have every arch prototype
these - it's not like the prototype can be different from this because
of the generic code).
I know in subsequent patches you do that, but it's rather piecemeal,
and I think this is a change that could be submitted now as both a
fix and clean up.
--
On Wed, Sep 13, 2023 at 04:37:52PM +0000, James Morse wrote:
> NUMA systems require the node descriptions to be ready before CPUs are
> registered. This is so that the node symlinks can be created in sysfs.
>
> Currently no NUMA platform uses GENERIC_CPU_DEVICES, meaning that CPUs
> are registered by arch code, instead of cpu_dev_init().
>
> Move cpu_dev_init() after node_dev_init() so that NUMA architectures
> can use GENERIC_CPU_DEVICES.
>
> Signed-off-by: James Morse <[email protected]>
I think this patch should be merged sooner rather than later so that
it gets a longer time to be tested, as moving the order that things
happen in init/main.c can be problematical.
Reviewed-by: Russell King (Oracle) <[email protected]>
Thanks!
--
On Wed, Sep 13, 2023 at 04:37:53PM +0000, James Morse wrote:
> loongarch, mips, parisc, riscv and sh all print a warning if
> register_cpu() returns an error. Architectures that use
> GENERIC_CPU_DEVICES call panic() instead.
>
> Errors in this path indicate something is wrong with the firmware
> description of the platform, but the kernel is able to keep running.
>
> Downgrade this to a warning to make it easier to debug this issue.
>
> This will allow architectures that are switching over to GENERIC_CPU_DEVICES
> to drop their warning, but keep the existing behaviour.
>
> Signed-off-by: James Morse <[email protected]>
Assuming other architectures do similar to x86 (which only return the
error code from register_cpu()), the only error that would occur here
is if device_register() fails, which would be catastrophic, and I
suspect the system would fail to boot anyway.
Downgrading the panic to a warning at least gives us a chance that
the system may come up sufficiently to examine what happened, so I
think this makes sense:
Reviewed-by: Russell King (Oracle) <[email protected]>
--
On Wed, Sep 13, 2023 at 04:37:54PM +0000, James Morse wrote:
> To allow ACPI's _STA value to hide CPUs that are present, but not
> available to online right now due to VMM or firmware policy, the
> register_cpu() call needs to be made by the ACPI machinery when ACPI
> is in use. This allows it to hide CPUs that are unavailable from sysfs.
>
> Switching to GENERIC_CPU_DEVICES is an intermediate step to allow all
> five ACPI architectures to be modified at once.
>
> Switch over to GENERIC_CPU_DEVICES, and provide an arch_register_cpu()
> that populates the hotpluggable flag. arch_register_cpu() is also the
> interface the ACPI machinery expects.
>
> The struct cpu in struct cpuinfo_arm64 is never used directly, remove
> it to use the one GENERIC_CPU_DEVICES provides.
>
> This changes the CPUs visible in sysfs from possible to present, but
> on arm64 smp_prepare_cpus() ensures these are the same.
>
> Signed-off-by: James Morse <[email protected]>
Reviewed-by: Russell King (Oracle) <[email protected]>
--
On Wed, Sep 13, 2023 at 04:37:58PM +0000, James Morse wrote:
> Now that GENERIC_CPU_DEVICES calls arch_register_cpu(), which can be
> overridden by the arch code, switch over to this to allow common code
> to choose when the register_cpu() call is made.
>
> This allows topology_init() to be removed.
>
> This is an intermediate step to the logic being moved to drivers/acpi,
> where GENERIC_CPU_DEVICES will do the work when booting with acpi=off.
>
> Signed-off-by: James Morse <[email protected]>
... and same concern as the previous patch.
--
On Wed, Sep 13, 2023 at 04:37:55PM +0000, James Morse wrote:
> intel_epb_init() is called as a subsys_initcall() to register cpuhp
> callbacks. The callbacks make use of get_cpu_device() which will return
> NULL unless register_cpu() has been called. register_cpu() is called
> from topology_init(), which is also a subsys_initcall().
>
> This is fragile. Moving the register_cpu() to a different
> subsys_initcall() leads to a NULL dereference during boot.
>
> Make intel_epb_init() a late_initcall(), user-space can't provide a
> policy before this point anyway.
>
> Signed-off-by: James Morse <[email protected]>
I think someone knowledgeable from x86 land needs to ack/review this.
--
> Signed-off-by: James Morse <[email protected]>
Seems sensible and well reasoned cleanup to me.
Technically an ABI change, but it would be seriously odd if any real code
relied on the current pointless behavior on the few architectures where
it is changing.
FWIW review is really of your analysis rather than the change :)
Seems like there may be some additional cleanup that makes sense from
Russell's analysis, but that's perhaps a parallel job.
Reviewed-by: Jonathan Cameron <[email protected]>
> ---
> drivers/base/cpu.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c
> index 9ea22e165acd..34b48f660b6b 100644
> --- a/drivers/base/cpu.c
> +++ b/drivers/base/cpu.c
> @@ -533,7 +533,7 @@ static void __init cpu_dev_register_generic(void)
> #ifdef CONFIG_GENERIC_CPU_DEVICES
> int i;
>
> - for_each_possible_cpu(i) {
> + for_each_present_cpu(i) {
> if (register_cpu(&per_cpu(cpu_devices, i), i))
> panic("Failed to register CPU device");
> }
On Thu, 14 Sep 2023 09:20:54 +0100
"Russell King (Oracle)" <[email protected]> wrote:
> On Wed, Sep 13, 2023 at 04:37:50PM +0000, James Morse wrote:
> > Three of the five ACPI architectures create sysfs entries using
> > register_cpu() for present CPUs, whereas arm64, riscv and all
> > GENERIC_CPU_DEVICES do this for possible CPUs.
> >
> > Registering a CPU is what causes them to show up in sysfs.
> >
> > It makes very little sense to register all possible CPUs. Registering
> > a CPU is what triggers the udev notifications allowing user-space to
> > react to newly added CPUs.
> >
> > To allow all five ACPI architectures to use GENERIC_CPU_DEVICES, change
> > it to use for_each_present_cpu(). Making the ACPI architectures use
> > GENERIC_CPU_DEVICES is a pre-requisite step to centralise their
> > cpu_register() logic, before moving it into the ACPI processor driver.
> > When ACPI is disabled this work would be done by
> > cpu_dev_register_generic().
> >
> > Of the ACPI architectures that register possible CPUs, arm64 and riscv
> > do not support making possible CPUs present as they use the weak 'always
> > fails' version of arch_register_cpu().
> >
> > Only two of the eight architectures that use GENERIC_CPU_DEVICES have a
> > distinction between present and possible CPUs.
> >
> > The following architectures use GENERIC_CPU_DEVICES but are not SMP,
> > so possible == present:
> > * m68k
> > * microblaze
> > * nios2
> >
> > The following architectures use GENERIC_CPU_DEVICES and consider
> > possible == present:
> > * csky: setup_smp()
> > * parisc: smp_prepare_boot_cpu() marks the boot cpu as present,
> > processor_probe() sets possible for all CPUs and present for all CPUs
> > except the boot cpu.
>
> However, init/main.c::start_kernel() calls boot_cpu_init() which sets
> the boot CPU in the online, active, present and possible masks. So,
> _every_ architecture gets the boot CPU in all these masks no matter
> what.
>
> Only if something then clears the boot CPU from these masks (which
> would be silly) would the boot CPU not be in all of these masks.
Hi Russell,
Upshot is that the code in parisc smp_prepare_boot_cpu() can be dropped?
Seems like another useful simplification to add to front of this series.
The function will end up with just a print then.
Seems there are lots of other empty implementations of smp_prepare_boot_cpu();
maybe worth making that optional whilst here and dropping all the empty ones?
There seem to be some other architectures setting at least some of the cpu masks
that could perhaps be tidied up a little via same logic?
Jonathan
On Wed, 13 Sep 2023 16:37:51 +0000
James Morse <[email protected]> wrote:
> Architectures often have extra per-cpu work that needs doing
> before a CPU is registered, often to determine if a CPU is
> hotpluggable.
>
> To allow the ACPI architectures to use GENERIC_CPU_DEVICES, move
> the cpu_register() call into arch_register_cpu(), which is made __weak
> so architectures with extra work can override it.
> This aligns with the way x86, ia64 and loongarch register hotplug CPUs
> when they become present.
>
> Signed-off-by: James Morse <[email protected]>
> ---
> Changes since RFC:
> * Dropped __init from x86/ia64 arch_register_cpu()
Confused...
> diff --git a/arch/ia64/kernel/topology.c b/arch/ia64/kernel/topology.c
> index 94a848b06f15..741863a187a6 100644
> --- a/arch/ia64/kernel/topology.c
> +++ b/arch/ia64/kernel/topology.c
> @@ -59,7 +59,7 @@ void __ref arch_unregister_cpu(int num)
> }
> EXPORT_SYMBOL(arch_unregister_cpu);
> #else
> -static int __init arch_register_cpu(int num)
> +int __init arch_register_cpu(int num)
Still seems to be here...
> {
> return register_cpu(&sysfs_cpus[num].cpu, num);
> }
Even more confused because the block wasn't in the RFC at all.
Maybe dropped static?
On Wed, 13 Sep 2023 16:37:52 +0000
James Morse <[email protected]> wrote:
> NUMA systems require the node descriptions to be ready before CPUs are
> registered. This is so that the node symlinks can be created in sysfs.
>
> Currently no NUMA platform uses GENERIC_CPU_DEVICES, meaning that CPUs
> are registered by arch code, instead of cpu_dev_init().
Worth saying why this matters, I think. I wrote a nice note on that being a
possible problem path, as node_dev_init() uses the results of cpu_dev_init()
if CONFIG_GENERIC_CPU_DEVICES is set, before seeing this comment and
realizing you had it covered (sort of anyway).
>
> Move cpu_dev_init() after node_dev_init() so that NUMA architectures
> can use GENERIC_CPU_DEVICES.
>
> Signed-off-by: James Morse <[email protected]>
> ---
> drivers/base/init.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/base/init.c b/drivers/base/init.c
> index 397eb9880cec..c4954835128c 100644
> --- a/drivers/base/init.c
> +++ b/drivers/base/init.c
> @@ -35,8 +35,8 @@ void __init driver_init(void)
> of_core_init();
> platform_bus_init();
> auxiliary_bus_init();
> - cpu_dev_init();
> memory_dev_init();
> node_dev_init();
> + cpu_dev_init();
> container_dev_init();
> }
On Thu, Sep 14, 2023 at 11:56:13AM +0100, Jonathan Cameron wrote:
> On Thu, 14 Sep 2023 09:20:54 +0100
> "Russell King (Oracle)" <[email protected]> wrote:
>
> > On Wed, Sep 13, 2023 at 04:37:50PM +0000, James Morse wrote:
> > > Three of the five ACPI architectures create sysfs entries using
> > > register_cpu() for present CPUs, whereas arm64, riscv and all
> > > GENERIC_CPU_DEVICES do this for possible CPUs.
> > >
> > > Registering a CPU is what causes them to show up in sysfs.
> > >
> > > It makes very little sense to register all possible CPUs. Registering
> > > a CPU is what triggers the udev notifications allowing user-space to
> > > react to newly added CPUs.
> > >
> > > To allow all five ACPI architectures to use GENERIC_CPU_DEVICES, change
> > > it to use for_each_present_cpu(). Making the ACPI architectures use
> > > GENERIC_CPU_DEVICES is a pre-requisite step to centralise their
> > > cpu_register() logic, before moving it into the ACPI processor driver.
> > > When ACPI is disabled this work would be done by
> > > cpu_dev_register_generic().
> > >
> > > Of the ACPI architectures that register possible CPUs, arm64 and riscv
> > > do not support making possible CPUs present as they use the weak 'always
> > > fails' version of arch_register_cpu().
> > >
> > > Only two of the eight architectures that use GENERIC_CPU_DEVICES have a
> > > distinction between present and possible CPUs.
> > >
> > > The following architectures use GENERIC_CPU_DEVICES but are not SMP,
> > > so possible == present:
> > > * m68k
> > > * microblaze
> > > * nios2
> > >
> > > The following architectures use GENERIC_CPU_DEVICES and consider
> > > possible == present:
> > > * csky: setup_smp()
> > > * parisc: smp_prepare_boot_cpu() marks the boot cpu as present,
> > > processor_probe() sets possible for all CPUs and present for all CPUs
> > > except the boot cpu.
> >
> > However, init/main.c::start_kernel() calls boot_cpu_init() which sets
> > the boot CPU in the online, active, present and possible masks. So,
> > _every_ architecture gets the boot CPU in all these masks no matter
> > what.
> >
> > Only if something then clears the boot CPU from these masks (which
> > would be silly) would the boot CPU not be in all of these masks.
> Hi Russell,
>
> Upshot is that the code in parisc smp_prepare_boot_cpu() can be dropped?
> Seems like another useful simplification to add to front of this series.
Yes - but I personally (and probably others) would like to see progress
made towards getting at least some of the changes in this series merged,
rather than seeing this series hang around longer and grow. Nothing in
this series touches any architecture's smp_prepare_boot_cpu(), so such
a change would not interfere with this series.
Therefore, I suggest that removing those two set_cpu_*() calls in
smp_prepare_boot_cpu() is something that could happen irrespective of
anything in this series, and I would encourage PA-RISC folk to do that
anyway.
The same is true of Loongarch, mips, sh, and sparc32, and they can
independently sort this.
> Seems there are lots of other empty implementations of smp_prepare_boot_cpu()
> maybe worth making that optional whilst here and dropping all the empty ones?
Yes, and again, this could be a series separate from this one. If one
arch wants to add the empty weak version of smp_prepare_boot_cpu(),
then it would be a matter of others deleting their empty implementation
(possibly after first having cleaned up the unnecessary set_cpu_*()
calls).
In any case, I would expect that patches doing any of the above would
end up being cherry-picked from a series by arch maintainers, so at
least to me it makes zero sense to include it with this already large
series, and would make the management of this series more complex.
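As a rough illustration of the point above, here is a minimal user-space
sketch (not kernel code) of why repeating the set_cpu_*() calls after
boot_cpu_init() changes nothing. All toy_* names are hypothetical, the masks
are modelled as plain bitmaps, and which two masks the arch hook touches is
an assumption made only for the example.

```c
#include <stdbool.h>

/* Toy stand-ins for the kernel's CPU masks: one bit per CPU. */
struct toy_masks {
	unsigned long possible, present, online, active;
};

static void toy_set(unsigned long *mask, unsigned int cpu)
{
	*mask |= 1UL << cpu;
}

/* Models boot_cpu_init() as described above: the boot CPU lands in all four masks. */
void toy_boot_cpu_init(struct toy_masks *m, unsigned int cpu)
{
	toy_set(&m->online, cpu);
	toy_set(&m->active, cpu);
	toy_set(&m->present, cpu);
	toy_set(&m->possible, cpu);
}

/* Models an arch hook that repeats two of those calls (assumed: present and possible). */
void toy_smp_prepare_boot_cpu(struct toy_masks *m, unsigned int cpu)
{
	toy_set(&m->present, cpu);
	toy_set(&m->possible, cpu);
}

/* Returns true if running the arch hook after boot_cpu_init() left every mask unchanged. */
bool toy_arch_hook_is_redundant(unsigned int cpu)
{
	struct toy_masks before = { 0, 0, 0, 0 };
	struct toy_masks after;

	toy_boot_cpu_init(&before, cpu);
	after = before;
	toy_smp_prepare_boot_cpu(&after, cpu);

	return before.possible == after.possible &&
	       before.present == after.present &&
	       before.online == after.online &&
	       before.active == after.active;
}
```

Setting a bit that is already set is a no-op, so the hook could be emptied
without changing the resulting masks in this model.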
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
On Wed, 13 Sep 2023 16:37:54 +0000
James Morse <[email protected]> wrote:
> To allow ACPI's _STA value to hide CPUs that are present, but not
> available to online right now due to VMM or firmware policy, the
> register_cpu() call needs to be made by the ACPI machinery when ACPI
> is in use. This allows it to hide CPUs that are unavailable from sysfs.
>
> Switching to GENERIC_CPU_DEVICES is an intermediate step to allow all
> five ACPI architectures to be modified at once.
>
> Switch over to GENERIC_CPU_DEVICES, and provide an arch_register_cpu()
> that populates the hotpluggable flag. arch_register_cpu() is also the
> interface the ACPI machinery expects.
>
> The struct cpu in struct cpuinfo_arm64 is never used directly, remove
> it to use the one GENERIC_CPU_DEVICES provides.
>
> This changes the CPUs visible in sysfs from possible to present, but
> on arm64 smp_prepare_cpus() ensures these are the same.
>
> Signed-off-by: James Morse <[email protected]>
After this the earlier question about ordering of cpu_dev_init()
and node_dev_init() is relevant.
Why won't node_dev_init() call
get_cpu_device() which queries per_cpu(cpu_sys_devices)
and get NULL as we haven't yet filled that in?
Or does it do so but that doesn't matter as we'll create the
relevant links later?
I've not had enough coffee yet today so might be missing the
obvious!
Jonathan
> ---
> arch/arm64/Kconfig | 1 +
> arch/arm64/include/asm/cpu.h | 1 -
> arch/arm64/kernel/setup.c | 13 ++++---------
> 3 files changed, 5 insertions(+), 10 deletions(-)
>
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index b10515c0200b..7b3990abf87a 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -132,6 +132,7 @@ config ARM64
> select GENERIC_ARCH_TOPOLOGY
> select GENERIC_CLOCKEVENTS_BROADCAST
> select GENERIC_CPU_AUTOPROBE
> + select GENERIC_CPU_DEVICES
> select GENERIC_CPU_VULNERABILITIES
> select GENERIC_EARLY_IOREMAP
> select GENERIC_IDLE_POLL_SETUP
> diff --git a/arch/arm64/include/asm/cpu.h b/arch/arm64/include/asm/cpu.h
> index e749838b9c5d..887bd0d992bb 100644
> --- a/arch/arm64/include/asm/cpu.h
> +++ b/arch/arm64/include/asm/cpu.h
> @@ -38,7 +38,6 @@ struct cpuinfo_32bit {
> };
>
> struct cpuinfo_arm64 {
> - struct cpu cpu;
> struct kobject kobj;
> u64 reg_ctr;
> u64 reg_cntfrq;
> diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
> index 417a8a86b2db..165bd2c0dd5a 100644
> --- a/arch/arm64/kernel/setup.c
> +++ b/arch/arm64/kernel/setup.c
> @@ -402,19 +402,14 @@ static inline bool cpu_can_disable(unsigned int cpu)
> return false;
> }
>
> -static int __init topology_init(void)
> +int arch_register_cpu(int num)
> {
> - int i;
> + struct cpu *cpu = &per_cpu(cpu_devices, num);
>
> - for_each_possible_cpu(i) {
> - struct cpu *cpu = &per_cpu(cpu_data.cpu, i);
> - cpu->hotpluggable = cpu_can_disable(i);
> - register_cpu(cpu, i);
> - }
> + cpu->hotpluggable = cpu_can_disable(num);
>
> - return 0;
> + return register_cpu(cpu, num);
> }
> -subsys_initcall(topology_init);
>
> static void dump_kernel_offset(void)
> {
On Wed, 13 Sep 2023 16:37:56 +0000
James Morse <[email protected]> wrote:
> Now that GENERIC_CPU_DEVICES calls arch_register_cpu(), which can be
> overridden by the arch code, switch over to this to allow common code
> to choose when the register_cpu() call is made.
>
> x86's struct cpus come from struct x86_cpu, which has no other members
> or users. Remove this and use the version defined by common code.
>
> This is an intermediate step to the logic being moved to drivers/acpi,
> where GENERIC_CPU_DEVICES will do the work when booting with acpi=off.
>
> Signed-off-by: James Morse <[email protected]>
> ----
> Changes since RFC:
> * Fixed the second copy of arch_register_cpu() used for non-hotplug
Hi James,
See below for comment on this. Upshot - I think you can delete that
function instead and rely on the weak version.
If you can't because of a later change, useful to call that out
in this patch description for those like me who read and review
in a linear fashion!
...
> EXPORT_SYMBOL(arch_unregister_cpu);
> #else /* CONFIG_HOTPLUG_CPU */
>
> -int __init arch_register_cpu(int num)
> +int arch_register_cpu(int num)
> {
> - return register_cpu(&per_cpu(cpu_devices, num).cpu, num);
> + return register_cpu(&per_cpu(cpu_devices, num), num);
> }
Looks like the weak version introduced in patch 3. Can this
implementation go away and fall back to that?
> #endif /* CONFIG_HOTPLUG_CPU */
> -
> -static int __init topology_init(void)
> -{
> - int i;
> -
> - for_each_present_cpu(i)
> - arch_register_cpu(i);
> -
> - return 0;
> -}
> -subsys_initcall(topology_init);
On Thu, 14 Sep 2023 11:04:27 +0100
"Russell King (Oracle)" <[email protected]> wrote:
> On Wed, Sep 13, 2023 at 04:37:58PM +0000, James Morse wrote:
> > Now that GENERIC_CPU_DEVICES calls arch_register_cpu(), which can be
> > overridden by the arch code, switch over to this to allow common code
> > to choose when the register_cpu() call is made.
> >
> > This allows topology_init() to be removed.
> >
> > This is an intermediate step to the logic being moved to drivers/acpi,
> > where GENERIC_CPU_DEVICES will do the work when booting with acpi=off.
> >
> > Signed-off-by: James Morse <[email protected]>
>
> ... and same concern as the previous patch.
>
Agreed - with that note added, this one looks simple.
Reviewed-by: Jonathan Cameron <[email protected]>
On Wed, 13 Sep 2023 16:37:59 +0000
James Morse <[email protected]> wrote:
> register_cpu_capacity_sysctl() adds a property to sysfs that describes
> the CPUs capacity. This is done from a subsys_initcall() that assumes
> all possible CPUs are registered.
>
> With CPU hotplug, possible CPUs aren't registered until they become
> present, (or for arm64 enabled). This leads to messages during boot:
> | register_cpu_capacity_sysctl: too early to get CPU1 device!
> and once these CPUs are added to the system, the file is missing.
>
> Move this to a cpuhp callback, so that the file is created once
> CPUs are brought online. This covers CPUs that are added late by
> mechanisms like hotplug.
> One observable difference is the file is now missing for offline CPUs.
>
> Signed-off-by: James Morse <[email protected]>
> ---
> If the offline CPUs thing is a problem for the tools that consume
> this value, we'd need to move cpu_capacity to be part of cpu.c's
> common_cpu_attr_groups.
I think we should do that anyway and then use an is_visible() if we want to
change whether it is visible for offline CPUs.
Dynamic sysfs file creation is horrible - particularly when done
from a totally different file from where the rest of the attributes
are registered. I'm curious what the history behind that is.
Whilst here, why is there a common_cpu_attr_groups which is
identical to the hotpluggable_cpu_attr_groups in base/cpu.c?
+CC GregKH
Given changes in drivers/base/
> ---
> drivers/base/arch_topology.c | 38 ++++++++++++++++++++++++------------
> 1 file changed, 26 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
> index b741b5ba82bd..9ccb7daee78e 100644
> --- a/drivers/base/arch_topology.c
> +++ b/drivers/base/arch_topology.c
> @@ -220,20 +220,34 @@ static DECLARE_WORK(update_topology_flags_work, update_topology_flags_workfn);
>
> static DEVICE_ATTR_RO(cpu_capacity);
>
> +static int cpu_capacity_sysctl_add(unsigned int cpu)
> +{
> + struct device *cpu_dev = get_cpu_device(cpu);
> +
> + if (!cpu_dev)
> + return -ENOENT;
> +
> + device_create_file(cpu_dev, &dev_attr_cpu_capacity);
> +
> + return 0;
> +}
> +
> +static int cpu_capacity_sysctl_remove(unsigned int cpu)
> +{
> + struct device *cpu_dev = get_cpu_device(cpu);
> +
> + if (!cpu_dev)
> + return -ENOENT;
> +
> + device_remove_file(cpu_dev, &dev_attr_cpu_capacity);
> +
> + return 0;
> +}
> +
> static int register_cpu_capacity_sysctl(void)
> {
> - int i;
> - struct device *cpu;
> -
> - for_each_possible_cpu(i) {
> - cpu = get_cpu_device(i);
> - if (!cpu) {
> - pr_err("%s: too early to get CPU%d device!\n",
> - __func__, i);
> - continue;
> - }
> - device_create_file(cpu, &dev_attr_cpu_capacity);
> - }
> + cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "topology/cpu-capacity",
> + cpu_capacity_sysctl_add, cpu_capacity_sysctl_remove);
>
> return 0;
> }
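To make the observable difference discussed above concrete, here is a small
user-space sketch (not kernel code) of the callback-based approach. It
assumes, as cpuhp_setup_state() is documented to do, that the startup
callback is also run for CPUs that are already online when the callbacks are
installed. Every toy_* name is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_NR_CPUS 4

typedef int (*toy_cpuhp_cb)(unsigned int cpu);

static bool toy_online[TOY_NR_CPUS];
static bool toy_capacity_file[TOY_NR_CPUS];
static toy_cpuhp_cb toy_startup_cb, toy_teardown_cb;

/* Stand-in for creating the cpu_capacity file. */
int toy_capacity_add(unsigned int cpu)
{
	toy_capacity_file[cpu] = true;
	return 0;
}

/* Stand-in for removing the cpu_capacity file. */
int toy_capacity_remove(unsigned int cpu)
{
	toy_capacity_file[cpu] = false;
	return 0;
}

/* Forget all state so a scenario can start from scratch. */
void toy_reset(void)
{
	for (unsigned int cpu = 0; cpu < TOY_NR_CPUS; cpu++) {
		toy_online[cpu] = false;
		toy_capacity_file[cpu] = false;
	}
	toy_startup_cb = NULL;
	toy_teardown_cb = NULL;
}

/* Install the callbacks and run startup on every CPU that is already online. */
void toy_cpuhp_setup(toy_cpuhp_cb startup, toy_cpuhp_cb teardown)
{
	toy_startup_cb = startup;
	toy_teardown_cb = teardown;
	for (unsigned int cpu = 0; cpu < TOY_NR_CPUS; cpu++)
		if (toy_online[cpu])
			startup(cpu);
}

/* Bring a CPU online; the startup callback runs if one is installed. */
void toy_cpu_up(unsigned int cpu)
{
	toy_online[cpu] = true;
	if (toy_startup_cb)
		toy_startup_cb(cpu);
}

/* Take a CPU offline; the teardown callback runs if one is installed. */
void toy_cpu_down(unsigned int cpu)
{
	if (toy_teardown_cb)
		toy_teardown_cb(cpu);
	toy_online[cpu] = false;
}

bool toy_capacity_file_present(unsigned int cpu)
{
	return toy_capacity_file[cpu];
}
```

In this model a CPU that was never brought online, or that has gone offline
again, has no file, which is the behaviour change called out in the commit
message.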
On Wed, 13 Sep 2023 16:38:00 +0000
James Morse <[email protected]> wrote:
> acpi_device_is_present() checks the present or functional bits
> from the cached copy of _STA.
>
> A few places open-code this check. Use the helper instead to
> improve readability.
>
> Signed-off-by: James Morse <[email protected]>
Reviewed-by: Jonathan Cameron <[email protected]>
Pull this one out and send it upstream in advance of the rest.
Jonathan
> ---
> drivers/acpi/scan.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/acpi/scan.c b/drivers/acpi/scan.c
> index 691d4b7686ee..ed01e19514ef 100644
> --- a/drivers/acpi/scan.c
> +++ b/drivers/acpi/scan.c
> @@ -304,7 +304,7 @@ static int acpi_scan_device_check(struct acpi_device *adev)
> int error;
>
> acpi_bus_get_status(adev);
> - if (adev->status.present || adev->status.functional) {
> + if (acpi_device_is_present(adev)) {
> /*
> * This function is only called for device objects for which
> * matching scan handlers exist. The only situation in which
> @@ -338,7 +338,7 @@ static int acpi_scan_bus_check(struct acpi_device *adev, void *not_used)
> int error;
>
> acpi_bus_get_status(adev);
> - if (!(adev->status.present || adev->status.functional)) {
> + if (!acpi_device_is_present(adev)) {
> acpi_scan_device_not_present(adev);
> return 0;
> }
On Wed, 13 Sep 2023 16:38:02 +0000
James Morse <[email protected]> wrote:
> Today the ACPI enumeration code 'visits' all devices that are present.
>
> This is a problem for arm64, where CPUs are always present, but not
> always enabled. When a device-check occurs because the firmware-policy
> has changed and a CPU is now enabled, the following error occurs:
> | acpi ACPI0007:48: Enumeration failure
>
> This is ultimately because acpi_dev_ready_for_enumeration() returns
> true for a device that is not enabled. The ACPI Processor driver
> will not register such CPUs as they are not 'decoding their resources'.
>
> Change acpi_dev_ready_for_enumeration() to also check the enabled bit.
> ACPI allows a device to be functional instead of maintaining the
> present and enabled bit. Make this behaviour an explicit check with
> a reference to the spec, and then check the present and enabled bits.
The "and then" only applies if the functional route hasn't been followed. Perhaps:
"if this is not the case, check the present and enabled bits."
> This is needed to avoid enumerating present && functional devices that
> are not enabled.
>
> Signed-off-by: James Morse <[email protected]>
> ---
> If this change causes problems on deployed hardware, I suggest an
> arch opt-in: ACPI_IGNORE_STA_ENABLED, that causes
> acpi_dev_ready_for_enumeration() to only check the present bit.
> ---
> drivers/acpi/device_pm.c | 2 +-
> drivers/acpi/device_sysfs.c | 2 +-
> drivers/acpi/internal.h | 1 -
> drivers/acpi/property.c | 2 +-
> drivers/acpi/scan.c | 23 +++++++++++++----------
> 5 files changed, 16 insertions(+), 14 deletions(-)
>
> diff --git a/drivers/acpi/device_pm.c b/drivers/acpi/device_pm.c
> index f007116a8427..76c38478a502 100644
> --- a/drivers/acpi/device_pm.c
> +++ b/drivers/acpi/device_pm.c
> @@ -313,7 +313,7 @@ int acpi_bus_init_power(struct acpi_device *device)
> return -EINVAL;
>
> device->power.state = ACPI_STATE_UNKNOWN;
> - if (!acpi_device_is_present(device)) {
> + if (!acpi_dev_ready_for_enumeration(device)) {
> device->flags.initialized = false;
> return -ENXIO;
> }
> diff --git a/drivers/acpi/device_sysfs.c b/drivers/acpi/device_sysfs.c
> index b9bbf0746199..16e586d74aa2 100644
> --- a/drivers/acpi/device_sysfs.c
> +++ b/drivers/acpi/device_sysfs.c
> @@ -141,7 +141,7 @@ static int create_pnp_modalias(const struct acpi_device *acpi_dev, char *modalia
> struct acpi_hardware_id *id;
>
> /* Avoid unnecessarily loading modules for non present devices. */
> - if (!acpi_device_is_present(acpi_dev))
> + if (!acpi_dev_ready_for_enumeration(acpi_dev))
> return 0;
>
> /*
> diff --git a/drivers/acpi/internal.h b/drivers/acpi/internal.h
> index 866c7c4ed233..a1b45e345bcc 100644
> --- a/drivers/acpi/internal.h
> +++ b/drivers/acpi/internal.h
> @@ -107,7 +107,6 @@ int acpi_device_setup_files(struct acpi_device *dev);
> void acpi_device_remove_files(struct acpi_device *dev);
> void acpi_device_add_finalize(struct acpi_device *device);
> void acpi_free_pnp_ids(struct acpi_device_pnp *pnp);
> -bool acpi_device_is_present(const struct acpi_device *adev);
> bool acpi_device_is_battery(struct acpi_device *adev);
> bool acpi_device_is_first_physical_node(struct acpi_device *adev,
> const struct device *dev);
> diff --git a/drivers/acpi/property.c b/drivers/acpi/property.c
> index 413e4fcadcaf..e03f00b98701 100644
> --- a/drivers/acpi/property.c
> +++ b/drivers/acpi/property.c
> @@ -1418,7 +1418,7 @@ static bool acpi_fwnode_device_is_available(const struct fwnode_handle *fwnode)
> if (!is_acpi_device_node(fwnode))
> return false;
>
> - return acpi_device_is_present(to_acpi_device_node(fwnode));
> + return acpi_dev_ready_for_enumeration(to_acpi_device_node(fwnode));
> }
>
> static const void *
> diff --git a/drivers/acpi/scan.c b/drivers/acpi/scan.c
> index 17ab875a7d4e..f898591ce05f 100644
> --- a/drivers/acpi/scan.c
> +++ b/drivers/acpi/scan.c
> @@ -304,7 +304,7 @@ static int acpi_scan_device_check(struct acpi_device *adev)
> int error;
>
> acpi_bus_get_status(adev);
> - if (acpi_device_is_present(adev)) {
> + if (acpi_dev_ready_for_enumeration(adev)) {
> /*
> * This function is only called for device objects for which
> * matching scan handlers exist. The only situation in which
> @@ -338,7 +338,7 @@ static int acpi_scan_bus_check(struct acpi_device *adev, void *not_used)
> int error;
>
> acpi_bus_get_status(adev);
> - if (!acpi_device_is_present(adev)) {
> + if (!acpi_dev_ready_for_enumeration(adev)) {
> acpi_scan_device_not_enumerated(adev);
> return 0;
> }
> @@ -1908,11 +1908,6 @@ static bool acpi_device_should_be_hidden(acpi_handle handle)
> return true;
> }
>
> -bool acpi_device_is_present(const struct acpi_device *adev)
> -{
> - return adev->status.present || adev->status.functional;
> -}
> -
> static bool acpi_scan_handler_matching(struct acpi_scan_handler *handler,
> const char *idstr,
> const struct acpi_device_id **matchid)
> @@ -2375,16 +2370,24 @@ EXPORT_SYMBOL_GPL(acpi_dev_clear_dependencies);
> * acpi_dev_ready_for_enumeration - Check if the ACPI device is ready for enumeration
> * @device: Pointer to the &struct acpi_device to check
> *
> - * Check if the device is present and has no unmet dependencies.
> + * Check if the device is functional or enabled and has no unmet dependencies.
> *
> - * Return true if the device is ready for enumeratino. Otherwise, return false.
> + * Return true if the device is ready for enumeration. Otherwise, return false.
> */
> bool acpi_dev_ready_for_enumeration(const struct acpi_device *device)
> {
> if (device->flags.honor_deps && device->dep_unmet)
> return false;
>
> - return acpi_device_is_present(device);
> + /*
> + * ACPI 6.5's 6.3.7 "_STA (Device Status)" allows firmware to return
> + * (!present && functional) for certain types of devices that should be
> + * enumerated.
I'd call out the fact that enumeration isn't the same as "device driver should be loaded",
which is the thing that functional is supposed to indicate should not happen.
> + */
> + if (!device->status.present && !device->status.enabled)
In theory there is no need to check !enabled if !present:
"If bit [0] is cleared, then bit 1 must also be cleared (in other words, a device that is not present cannot be enabled)."
We could report an ACPI bug if that's seen. If that bug case is ignored this code can
become the simpler:
	if (device->status.present)
		return device->status.enabled;
	else
		return device->status.functional;
Or the following is also valid here (as functional should be set for enabled present devices
unless they failed diagnostics):
	if (device->status.functional)
		return true;
	return device->status.present && device->status.enabled;
On the assumption that we want to enumerate dead devices for debug purposes...
> + return device->status.functional;
> +
> + return device->status.present && device->status.enabled;
> }
> EXPORT_SYMBOL_GPL(acpi_dev_ready_for_enumeration);
>
On Thu, 14 Sep 2023 13:27:32 +0100
Jonathan Cameron <[email protected]> wrote:
> On Wed, 13 Sep 2023 16:38:02 +0000
> James Morse <[email protected]> wrote:
>
> > Today the ACPI enumeration code 'visits' all devices that are present.
> >
> > This is a problem for arm64, where CPUs are always present, but not
> > always enabled. When a device-check occurs because the firmware-policy
> > has changed and a CPU is now enabled, the following error occurs:
> > | acpi ACPI0007:48: Enumeration failure
> >
> > This is ultimately because acpi_dev_ready_for_enumeration() returns
> > true for a device that is not enabled. The ACPI Processor driver
> > will not register such CPUs as they are not 'decoding their resources'.
> >
> > Change acpi_dev_ready_for_enumeration() to also check the enabled bit.
> > ACPI allows a device to be functional instead of maintaining the
> > present and enabled bit. Make this behaviour an explicit check with
> > a reference to the spec, and then check the present and enabled bits.
>
> The "and then" only applies if the functional route hasn't been followed. Perhaps:
> "if this is not the case, check the present and enabled bits."
>
> > This is needed to avoid enumerating present && functional devices that
> > are not enabled.
> >
> > Signed-off-by: James Morse <[email protected]>
> > ---
> > If this change causes problems on deployed hardware, I suggest an
> > arch opt-in: ACPI_IGNORE_STA_ENABLED, that causes
> > acpi_dev_ready_for_enumeration() to only check the present bit.
> > ---
> > drivers/acpi/device_pm.c | 2 +-
> > drivers/acpi/device_sysfs.c | 2 +-
> > drivers/acpi/internal.h | 1 -
> > drivers/acpi/property.c | 2 +-
> > drivers/acpi/scan.c | 23 +++++++++++++----------
> > 5 files changed, 16 insertions(+), 14 deletions(-)
> >
> > diff --git a/drivers/acpi/device_pm.c b/drivers/acpi/device_pm.c
> > index f007116a8427..76c38478a502 100644
> > --- a/drivers/acpi/device_pm.c
> > +++ b/drivers/acpi/device_pm.c
> > @@ -313,7 +313,7 @@ int acpi_bus_init_power(struct acpi_device *device)
> > return -EINVAL;
> >
> > device->power.state = ACPI_STATE_UNKNOWN;
> > - if (!acpi_device_is_present(device)) {
> > + if (!acpi_dev_ready_for_enumeration(device)) {
> > device->flags.initialized = false;
> > return -ENXIO;
> > }
> > diff --git a/drivers/acpi/device_sysfs.c b/drivers/acpi/device_sysfs.c
> > index b9bbf0746199..16e586d74aa2 100644
> > --- a/drivers/acpi/device_sysfs.c
> > +++ b/drivers/acpi/device_sysfs.c
> > @@ -141,7 +141,7 @@ static int create_pnp_modalias(const struct acpi_device *acpi_dev, char *modalia
> > struct acpi_hardware_id *id;
> >
> > /* Avoid unnecessarily loading modules for non present devices. */
> > - if (!acpi_device_is_present(acpi_dev))
> > + if (!acpi_dev_ready_for_enumeration(acpi_dev))
> > return 0;
> >
> > /*
> > diff --git a/drivers/acpi/internal.h b/drivers/acpi/internal.h
> > index 866c7c4ed233..a1b45e345bcc 100644
> > --- a/drivers/acpi/internal.h
> > +++ b/drivers/acpi/internal.h
> > @@ -107,7 +107,6 @@ int acpi_device_setup_files(struct acpi_device *dev);
> > void acpi_device_remove_files(struct acpi_device *dev);
> > void acpi_device_add_finalize(struct acpi_device *device);
> > void acpi_free_pnp_ids(struct acpi_device_pnp *pnp);
> > -bool acpi_device_is_present(const struct acpi_device *adev);
> > bool acpi_device_is_battery(struct acpi_device *adev);
> > bool acpi_device_is_first_physical_node(struct acpi_device *adev,
> > const struct device *dev);
> > diff --git a/drivers/acpi/property.c b/drivers/acpi/property.c
> > index 413e4fcadcaf..e03f00b98701 100644
> > --- a/drivers/acpi/property.c
> > +++ b/drivers/acpi/property.c
> > @@ -1418,7 +1418,7 @@ static bool acpi_fwnode_device_is_available(const struct fwnode_handle *fwnode)
> > if (!is_acpi_device_node(fwnode))
> > return false;
> >
> > - return acpi_device_is_present(to_acpi_device_node(fwnode));
> > + return acpi_dev_ready_for_enumeration(to_acpi_device_node(fwnode));
> > }
> >
> > static const void *
> > diff --git a/drivers/acpi/scan.c b/drivers/acpi/scan.c
> > index 17ab875a7d4e..f898591ce05f 100644
> > --- a/drivers/acpi/scan.c
> > +++ b/drivers/acpi/scan.c
> > @@ -304,7 +304,7 @@ static int acpi_scan_device_check(struct acpi_device *adev)
> > int error;
> >
> > acpi_bus_get_status(adev);
> > - if (acpi_device_is_present(adev)) {
> > + if (acpi_dev_ready_for_enumeration(adev)) {
> > /*
> > * This function is only called for device objects for which
> > * matching scan handlers exist. The only situation in which
> > @@ -338,7 +338,7 @@ static int acpi_scan_bus_check(struct acpi_device *adev, void *not_used)
> > int error;
> >
> > acpi_bus_get_status(adev);
> > - if (!acpi_device_is_present(adev)) {
> > + if (!acpi_dev_ready_for_enumeration(adev)) {
> > acpi_scan_device_not_enumerated(adev);
> > return 0;
> > }
> > @@ -1908,11 +1908,6 @@ static bool acpi_device_should_be_hidden(acpi_handle handle)
> > return true;
> > }
> >
> > -bool acpi_device_is_present(const struct acpi_device *adev)
> > -{
> > - return adev->status.present || adev->status.functional;
> > -}
> > -
> > static bool acpi_scan_handler_matching(struct acpi_scan_handler *handler,
> > const char *idstr,
> > const struct acpi_device_id **matchid)
> > @@ -2375,16 +2370,24 @@ EXPORT_SYMBOL_GPL(acpi_dev_clear_dependencies);
> > * acpi_dev_ready_for_enumeration - Check if the ACPI device is ready for enumeration
> > * @device: Pointer to the &struct acpi_device to check
> > *
> > - * Check if the device is present and has no unmet dependencies.
> > + * Check if the device is functional or enabled and has no unmet dependencies.
> > *
> > - * Return true if the device is ready for enumeratino. Otherwise, return false.
> > + * Return true if the device is ready for enumeration. Otherwise, return false.
> > */
> > bool acpi_dev_ready_for_enumeration(const struct acpi_device *device)
> > {
> > if (device->flags.honor_deps && device->dep_unmet)
> > return false;
> >
> > - return acpi_device_is_present(device);
> > + /*
> > + * ACPI 6.5's 6.3.7 "_STA (Device Status)" allows firmware to return
> > + * (!present && functional) for certain types of devices that should be
> > + * enumerated.
>
> I'd call out the fact that enumeration isn't the same as "device driver should be loaded",
> which is the thing that functional is supposed to indicate should not happen.
>
> > + */
> > + if (!device->status.present && !device->status.enabled)
>
> In theory there is no need to check !enabled if !present:
> "If bit [0] is cleared, then bit 1 must also be cleared (in other words, a device that is not present cannot be enabled)."
> We could report an ACPI bug if that's seen. If that bug case is ignored this code can
> become the simpler:
>
> 	if (device->status.present)
> 		return device->status.enabled;
> 	else
> 		return device->status.functional;
>
> Or the following is also valid here (as functional should be set for enabled present devices
> unless they failed diagnostics):
>
> 	if (device->status.functional)
> 		return true;
> 	return device->status.present && device->status.enabled;
>
> On the assumption that we want to enumerate dead devices for debug purposes...
Actually, ignore this. There could be a weird race with present and functional true,
but enabled not quite set - despite the device being there and self
tests having passed.
>
>
> > + return device->status.functional;
> > +
> > + return device->status.present && device->status.enabled;
>
>
> > }
> > EXPORT_SYMBOL_GPL(acpi_dev_ready_for_enumeration);
> >
>
>
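For anyone following along, the variants discussed in this sub-thread can be
compared side by side in a small user-space sketch. This is not the kernel
code: the _STA bits are passed as plain booleans, the dependency check is
left out, and all toy_* names are hypothetical.

```c
#include <stdbool.h>

/* The check removed by the patch: acpi_device_is_present(). */
bool toy_old_check(bool present, bool enabled, bool functional)
{
	(void)enabled;	/* the old check never looked at the enabled bit */
	return present || functional;
}

/* The check as posted in the patch (dependency handling omitted). */
bool toy_patch_check(bool present, bool enabled, bool functional)
{
	if (!present && !enabled)
		return functional;

	return present && enabled;
}

/* First suggested form: trust that !present implies !enabled. */
bool toy_alt_present_first(bool present, bool enabled, bool functional)
{
	if (present)
		return enabled;

	return functional;
}

/* Second suggested form: functional short-circuits to true. */
bool toy_alt_functional_first(bool present, bool enabled, bool functional)
{
	if (functional)
		return true;

	return present && enabled;
}
```

The forms agree for spec-conforming firmware except in one case: for a device
that is present and functional but not enabled, the second suggested form
returns true, where the posted patch returns false. The first suggested form
differs from the patch only for the !present && enabled combination that the
spec forbids.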
On Wed, 13 Sep 2023 16:38:04 +0000
James Morse <[email protected]> wrote:
> ACPI has two descriptions of CPUs, one in the MADT/APIC table, the other
> in the DSDT. Both are required. (ACPI 6.5's 8.4 "Declaring Processors"
> says "Each processor in the system must be declared in the ACPI
> namespace"). Having two descriptions allows firmware authors to get
> this wrong.
>
> If CPUs are described in the MADT/APIC, they will be brought online
> early during boot. Once the register_cpu() calls are moved to ACPI,
> they will be based on the DSDT description of the CPUs. When CPUs are
> missing from the DSDT description, they will end up online, but not
> registered.
>
> Add a helper that runs after acpi_init() has completed to register
> CPUs that are online, but weren't found in the DSDT. Any CPU that
> is registered by this code triggers a firmware-bug warning and kernel
> taint.
>
> Qemu TCG only describes the first CPU in the DSDT, unless cpu-hotplug
> is configured.
We should fix that as who likes warnings and taint :)
I dread to think how common this will turn out to be.
>
> Signed-off-by: James Morse <[email protected]>
LGTM
Reviewed-by: Jonathan Cameron <[email protected]>
> ---
> drivers/acpi/acpi_processor.c | 19 +++++++++++++++++++
> 1 file changed, 19 insertions(+)
>
> diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
> index b4bde78121bb..a01e315aa16a 100644
> --- a/drivers/acpi/acpi_processor.c
> +++ b/drivers/acpi/acpi_processor.c
> @@ -790,6 +790,25 @@ void __init acpi_processor_init(void)
> acpi_pcc_cpufreq_init();
> }
>
> +static int __init acpi_processor_register_missing_cpus(void)
> +{
> + int cpu;
> +
> + if (acpi_disabled)
> + return 0;
> +
> + for_each_online_cpu(cpu) {
> + if (!get_cpu_device(cpu)) {
> + pr_err_once(FW_BUG "CPU %u has no ACPI namespace description!\n", cpu);
> + add_taint(TAINT_FIRMWARE_WORKAROUND, LOCKDEP_STILL_OK);
> + arch_register_cpu(cpu);
> + }
> + }
> +
> + return 0;
> +}
> +subsys_initcall_sync(acpi_processor_register_missing_cpus);
> +
> #ifdef CONFIG_ACPI_PROCESSOR_CSTATE
> /**
> * acpi_processor_claim_cst_control - Request _CST control from the platform.
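The fallback described in the commit message can be pictured with a small
user-space model. This is a sketch only: the arrays stand in for the online
mask and for get_cpu_device() returning non-NULL, the warn-once and taint
behaviour is reduced to two flags, and every toy_* name is hypothetical.

```c
#include <stdbool.h>
#include <stdio.h>

#define TOY_NR_CPUS 4

static bool toy_cpu_online[TOY_NR_CPUS];
static bool toy_cpu_registered[TOY_NR_CPUS];
static bool toy_tainted;
static bool toy_warned;

/* Describe one CPU: is it online, and did the namespace walk register it? */
void toy_set_cpu(unsigned int cpu, bool online, bool registered)
{
	toy_cpu_online[cpu] = online;
	toy_cpu_registered[cpu] = registered;
}

bool toy_is_tainted(void)
{
	return toy_tainted;
}

/*
 * Register every CPU that is online but has no device yet.
 * Returns the number of CPUs that had to be registered here.
 */
int toy_register_missing_cpus(void)
{
	int fixed = 0;

	for (unsigned int cpu = 0; cpu < TOY_NR_CPUS; cpu++) {
		if (!toy_cpu_online[cpu] || toy_cpu_registered[cpu])
			continue;

		/* Warn only for the first offender, like a *_once() print. */
		if (!toy_warned) {
			fprintf(stderr, "firmware bug: CPU %u has no namespace description\n", cpu);
			toy_warned = true;
		}
		toy_tainted = true;
		toy_cpu_registered[cpu] = true;
		fixed++;
	}

	return fixed;
}
```

A second call finds nothing left to do, since each missing CPU is marked
registered the first time round.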
On Thu, Sep 14, 2023 at 12:27:15PM +0100, Jonathan Cameron wrote:
> On Wed, 13 Sep 2023 16:37:54 +0000
> James Morse <[email protected]> wrote:
>
> > To allow ACPI's _STA value to hide CPUs that are present, but not
> > available to online right now due to VMM or firmware policy, the
> > register_cpu() call needs to be made by the ACPI machinery when ACPI
> > is in use. This allows it to hide CPUs that are unavailable from sysfs.
> >
> > Switching to GENERIC_CPU_DEVICES is an intermediate step to allow all
> > five ACPI architectures to be modified at once.
> >
> > Switch over to GENERIC_CPU_DEVICES, and provide an arch_register_cpu()
> > that populates the hotpluggable flag. arch_register_cpu() is also the
> > interface the ACPI machinery expects.
> >
> > The struct cpu in struct cpuinfo_arm64 is never used directly, remove
> > it to use the one GENERIC_CPU_DEVICES provides.
> >
> > This changes the CPUs visible in sysfs from possible to present, but
> > on arm64 smp_prepare_cpus() ensures these are the same.
> >
> > Signed-off-by: James Morse <[email protected]>
>
> After this the earlier question about ordering of cpu_dev_init()
> and node_dev_init() is relevant.
>
> Why won't node_dev_init() call
> get_cpu_device() which queries per_cpu(cpu_sys_devices)
> and get NULL as we haven't yet filled that in?
>
> Or does it do so but that doesn't matter as we'll create the
> relevant links later?
node_dev_init() will walk through the nodes calling register_one_node()
on each. This will trickle down to __register_one_node() which walks
all present CPUs, calling register_cpu_under_node() on each.
register_cpu_under_node() will call get_cpu_device(cpu) for each and
will return NULL until the CPU is registered using register_cpu(),
which will now happen _after_ node_dev_init().
So, at this point, CPUs won't get registered, and initially one might
think that's a problem.
However, register_cpu() will itself call register_cpu_under_node(),
where get_cpu_device() will return the now populated entry, and the
sysfs links will be created.
So, I think what you've spotted is a potential chunk of code that
isn't necessary when using GENERIC_CPU_DEVICES after this change!
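The call flow above can be sketched as a tiny user-space model. It is not
the kernel code: the toy_* functions are hypothetical stand-ins that mirror
only the ordering described in this mail, with a single node and a
pointer array in place of per_cpu(cpu_sys_devices).

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_NR_CPUS 2

struct toy_cpu_dev {
	bool linked_to_node;
};

static struct toy_cpu_dev toy_devs[TOY_NR_CPUS];
/* Stand-in for per_cpu(cpu_sys_devices): NULL until the CPU is registered. */
static struct toy_cpu_dev *toy_registered[TOY_NR_CPUS];

static struct toy_cpu_dev *toy_get_cpu_device(unsigned int cpu)
{
	return toy_registered[cpu];
}

/* Creates the node link, or quietly does nothing if the CPU has no device yet. */
static int toy_register_cpu_under_node(unsigned int cpu)
{
	struct toy_cpu_dev *dev = toy_get_cpu_device(cpu);

	if (!dev)
		return 0;

	dev->linked_to_node = true;
	return 0;
}

void toy_reset(void)
{
	for (unsigned int cpu = 0; cpu < TOY_NR_CPUS; cpu++) {
		toy_devs[cpu].linked_to_node = false;
		toy_registered[cpu] = NULL;
	}
}

/* Walks the CPUs for the (single) node, as the node registration path does. */
void toy_node_dev_init(void)
{
	for (unsigned int cpu = 0; cpu < TOY_NR_CPUS; cpu++)
		toy_register_cpu_under_node(cpu);
}

/* Registering a CPU publishes its device and then links it under its node. */
void toy_register_cpu(unsigned int cpu)
{
	toy_registered[cpu] = &toy_devs[cpu];
	toy_register_cpu_under_node(cpu);
}

void toy_cpu_dev_init(void)
{
	for (unsigned int cpu = 0; cpu < TOY_NR_CPUS; cpu++)
		toy_register_cpu(cpu);
}

int toy_count_links(void)
{
	int links = 0;

	for (unsigned int cpu = 0; cpu < TOY_NR_CPUS; cpu++)
		if (toy_devs[cpu].linked_to_node)
			links++;

	return links;
}
```

With node init first, the walk over the CPUs creates no links, and all of
them appear once the CPUs are registered afterwards.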
On Wed, 13 Sep 2023 16:38:08 +0000
James Morse <[email protected]> wrote:
> acpi_processor_hotadd_init() will make a CPU present by mapping it
> based on its hardware id.
>
> 'hotadd_init' is ambiguous once there are two different behaviours
> for cpu hotplug. This is for toggling the _STA present bit. Subsequent
> patches will add support for toggling the _STA enabled bit, named
> acpi_processor_make_enabled().
>
> Rename it acpi_processor_make_present() to make it clear this is
> for CPUs that were not previously present.
>
> Expose the function prototypes it uses to allow the preprocessor
> guards to be removed. The IS_ENABLED() check will let the compiler
> dead-code elimination pass remove this if it isn't going to be
> used.
>
> Signed-off-by: James Morse <[email protected]>
> ---
> drivers/acpi/acpi_processor.c | 14 +++++---------
> include/linux/acpi.h | 2 --
> 2 files changed, 5 insertions(+), 11 deletions(-)
>
> diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
> index 75257fae10e7..22a15a614f95 100644
> --- a/drivers/acpi/acpi_processor.c
> +++ b/drivers/acpi/acpi_processor.c
> @@ -182,13 +182,15 @@ static void __init acpi_pcc_cpufreq_init(void) {}
> #endif /* CONFIG_X86 */
>
> /* Initialization */
> -#ifdef CONFIG_ACPI_HOTPLUG_PRESENT_CPU
> -static int acpi_processor_hotadd_init(struct acpi_processor *pr)
> +static int acpi_processor_make_present(struct acpi_processor *pr)
> {
> unsigned long long sta;
> acpi_status status;
> int ret;
>
> + if (!IS_ENABLED(CONFIG_ACPI_HOTPLUG_PRESENT_CPU))
> + return -ENODEV;
> +
> if (invalid_phys_cpuid(pr->phys_id))
> return -ENODEV;
>
> @@ -222,12 +224,6 @@ static int acpi_processor_hotadd_init(struct acpi_processor *pr)
> cpu_maps_update_done();
> return ret;
> }
> -#else
> -static inline int acpi_processor_hotadd_init(struct acpi_processor *pr)
> -{
> - return -ENODEV;
> -}
> -#endif /* CONFIG_ACPI_HOTPLUG_PRESENT_CPU */
>
> static int acpi_processor_get_info(struct acpi_device *device)
> {
> @@ -335,7 +331,7 @@ static int acpi_processor_get_info(struct acpi_device *device)
> * because cpuid <-> apicid mapping is persistent now.
> */
> if (invalid_logical_cpuid(pr->id) || !cpu_present(pr->id)) {
> - int ret = acpi_processor_hotadd_init(pr);
> + int ret = acpi_processor_make_present(pr);
>
> if (ret)
> return ret;
> diff --git a/include/linux/acpi.h b/include/linux/acpi.h
> index 651dd43976a9..b7ab85857bb7 100644
> --- a/include/linux/acpi.h
> +++ b/include/linux/acpi.h
> @@ -316,12 +316,10 @@ static inline int acpi_processor_evaluate_cst(acpi_handle handle, u32 cpu,
> }
> #endif
>
> -#ifdef CONFIG_ACPI_HOTPLUG_PRESENT_CPU
> /* Arch dependent functions for cpu hotplug support */
> int acpi_map_cpu(acpi_handle handle, phys_cpuid_t physid, u32 acpi_id,
> int *pcpu);
> int acpi_unmap_cpu(int cpu);
I've lost track somewhat, but I think the definitions of these are still under ifdefs,
which is messy if nothing else and might cause build issues.
> -#endif /* CONFIG_ACPI_HOTPLUG_PRESENT_CPU */
>
> #ifdef CONFIG_ACPI_HOTPLUG_IOAPIC
> int acpi_get_ioapic_id(acpi_handle handle, u32 gsi_base, u64 *phys_addr);
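To make the concern about ifdef'd definitions concrete, here is a hedged userspace sketch of the IS_ENABLED() pattern. The IS_ENABLED() macro below is a trivial stand-in (the kernel's version is more elaborate), and acpi_map_cpu_model() and make_present_model() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-ins: the kernel's IS_ENABLED() is more elaborate. */
#define CONFIG_ACPI_HOTPLUG_PRESENT_CPU 0
#define IS_ENABLED(option) (option)

/* The prototype must be visible whether or not the option is set. */
int acpi_map_cpu_model(int phys_id, int *pcpu);

static int make_present_model(int phys_id)
{
	int cpu;

	/* Constant condition: the compiler can drop everything below it. */
	if (!IS_ENABLED(CONFIG_ACPI_HOTPLUG_PRESENT_CPU))
		return -ENODEV;

	return acpi_map_cpu_model(phys_id, &cpu);
}

/*
 * Defined unconditionally so this sketch links at any optimisation level.
 * If the definition sat behind #ifdef, the build would rely on the
 * compiler discarding the dead call above.
 */
int acpi_map_cpu_model(int phys_id, int *pcpu)
{
	*pcpu = phys_id;
	return 0;
}
```

With the option defined as 0, make_present_model() returns -ENODEV without reaching the arch hook. The prototype is always needed; whether the definition can stay behind an ifdef depends on dead-code elimination, which is the point the review comment raises.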
On Wed, 13 Sep 2023 16:38:07 +0000
James Morse <[email protected]> wrote:
> A subsequent patch will change acpi_scan_hot_remove() to call
> acpi_bus_trim_one() instead of acpi_bus_trim(), meaning it can no longer
> rely on the prototype in the header file.
>
> Move these functions further up the file.
> No change in behaviour.
>
> Signed-off-by: James Morse <[email protected]>
FWIW
Reviewed-by: Jonathan Cameron <[email protected]>
> ---
> drivers/acpi/scan.c | 76 ++++++++++++++++++++++-----------------------
> 1 file changed, 38 insertions(+), 38 deletions(-)
>
> diff --git a/drivers/acpi/scan.c b/drivers/acpi/scan.c
> index f898591ce05f..a675333618ae 100644
> --- a/drivers/acpi/scan.c
> +++ b/drivers/acpi/scan.c
> @@ -244,6 +244,44 @@ static int acpi_scan_try_to_offline(struct acpi_device *device)
> return 0;
> }
>
> +static int acpi_bus_trim_one(struct acpi_device *adev, void *not_used)
> +{
> + struct acpi_scan_handler *handler = adev->handler;
> +
> + acpi_dev_for_each_child_reverse(adev, acpi_bus_trim_one, NULL);
> +
> + adev->flags.match_driver = false;
> + if (handler) {
> + if (handler->detach)
> + handler->detach(adev);
> +
> + adev->handler = NULL;
> + } else {
> + device_release_driver(&adev->dev);
> + }
> + /*
> + * Most likely, the device is going away, so put it into D3cold before
> + * that.
> + */
> + acpi_device_set_power(adev, ACPI_STATE_D3_COLD);
> + adev->flags.initialized = false;
> + acpi_device_clear_enumerated(adev);
> +
> + return 0;
> +}
> +
> +/**
> + * acpi_bus_trim - Detach scan handlers and drivers from ACPI device objects.
> + * @adev: Root of the ACPI namespace scope to walk.
> + *
> + * Must be called under acpi_scan_lock.
> + */
> +void acpi_bus_trim(struct acpi_device *adev)
> +{
> + acpi_bus_trim_one(adev, NULL);
> +}
> +EXPORT_SYMBOL_GPL(acpi_bus_trim);
> +
> static int acpi_scan_hot_remove(struct acpi_device *device)
> {
> acpi_handle handle = device->handle;
> @@ -2506,44 +2544,6 @@ int acpi_bus_scan(acpi_handle handle)
> }
> EXPORT_SYMBOL(acpi_bus_scan);
>
> -static int acpi_bus_trim_one(struct acpi_device *adev, void *not_used)
> -{
> - struct acpi_scan_handler *handler = adev->handler;
> -
> - acpi_dev_for_each_child_reverse(adev, acpi_bus_trim_one, NULL);
> -
> - adev->flags.match_driver = false;
> - if (handler) {
> - if (handler->detach)
> - handler->detach(adev);
> -
> - adev->handler = NULL;
> - } else {
> - device_release_driver(&adev->dev);
> - }
> - /*
> - * Most likely, the device is going away, so put it into D3cold before
> - * that.
> - */
> - acpi_device_set_power(adev, ACPI_STATE_D3_COLD);
> - adev->flags.initialized = false;
> - acpi_device_clear_enumerated(adev);
> -
> - return 0;
> -}
> -
> -/**
> - * acpi_bus_trim - Detach scan handlers and drivers from ACPI device objects.
> - * @adev: Root of the ACPI namespace scope to walk.
> - *
> - * Must be called under acpi_scan_lock.
> - */
> -void acpi_bus_trim(struct acpi_device *adev)
> -{
> - acpi_bus_trim_one(adev, NULL);
> -}
> -EXPORT_SYMBOL_GPL(acpi_bus_trim);
> -
> int acpi_bus_register_early_device(int type)
> {
> struct acpi_device *device = NULL;
On Wed, 13 Sep 2023 16:38:09 +0000
James Morse <[email protected]> wrote:
> struct acpi_scan_handler has a detach callback that is used to remove
> a driver when a bus is changed. When interacting with an eject-request,
> the detach callback is called before _EJ0.
>
> This means the ACPI processor driver can't use _STA to determine if a
> CPU has been made not-present, or some of the other _STA bits have been
> changed. acpi_processor_remove() needs to know the value of _STA after
> _EJ0 has been called.
Why hasn't it been a problem before?
>
> Add a post_eject callback to struct acpi_scan_handler. This is called
> after acpi_scan_hot_remove() has successfully called _EJ0. Because
> acpi_bus_trim_one() also clears the handler pointer, it needs to be
> told if the caller will go on to call acpi_bus_post_eject(), so
> that acpi_device_clear_enumerated() and clearing the handler pointer
> can be deferred. The existing not-used pointer is used for this.
>
> Signed-off-by: James Morse <[email protected]>
I briefly wondered if an alternative model where you always call the
post walk was cleaner, as the handler clear etc. would always be in the same place.
However, I couldn't make it work that nicely because you still need to indicate
whether it's an eject post handler or not, which just moves the messy code.
As such this LGTM
Reviewed-by: Jonathan Cameron <[email protected]>
> ---
> drivers/acpi/acpi_processor.c | 4 +--
> drivers/acpi/scan.c | 52 ++++++++++++++++++++++++++++++-----
> include/acpi/acpi_bus.h | 1 +
> 3 files changed, 48 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
> index 22a15a614f95..00dcc23d49a8 100644
> --- a/drivers/acpi/acpi_processor.c
> +++ b/drivers/acpi/acpi_processor.c
> @@ -459,7 +459,7 @@ static int acpi_processor_add(struct acpi_device *device,
>
> #ifdef CONFIG_ACPI_HOTPLUG_PRESENT_CPU
> /* Removal */
> -static void acpi_processor_remove(struct acpi_device *device)
> +static void acpi_processor_post_eject(struct acpi_device *device)
> {
> struct acpi_processor *pr;
>
> @@ -627,7 +627,7 @@ static struct acpi_scan_handler processor_handler = {
> .ids = processor_device_ids,
> .attach = acpi_processor_add,
> #ifdef CONFIG_ACPI_HOTPLUG_PRESENT_CPU
> - .detach = acpi_processor_remove,
> + .post_eject = acpi_processor_post_eject,
> #endif
> .hotplug = {
> .enabled = true,
> diff --git a/drivers/acpi/scan.c b/drivers/acpi/scan.c
> index a675333618ae..b6d2f01640a9 100644
> --- a/drivers/acpi/scan.c
> +++ b/drivers/acpi/scan.c
> @@ -244,18 +244,28 @@ static int acpi_scan_try_to_offline(struct acpi_device *device)
> return 0;
> }
>
> -static int acpi_bus_trim_one(struct acpi_device *adev, void *not_used)
> +/**
> + * acpi_bus_trim_one() - Detach scan handlers and drivers from ACPI device
> + * objects.
> + * @adev: Root of the ACPI namespace scope to walk.
> + * @eject: Pointer to a bool that indicates if this was due to an
> + * eject-request.
> + *
> + * Must be called under acpi_scan_lock.
> + * If @eject points to true, clearing the device enumeration is deferred until
> + * acpi_bus_post_eject() is called.
> + */
> +static int acpi_bus_trim_one(struct acpi_device *adev, void *eject)
> {
> struct acpi_scan_handler *handler = adev->handler;
> + bool is_eject = *(bool *)eject;
>
> - acpi_dev_for_each_child_reverse(adev, acpi_bus_trim_one, NULL);
> + acpi_dev_for_each_child_reverse(adev, acpi_bus_trim_one, eject);
>
> adev->flags.match_driver = false;
> if (handler) {
> if (handler->detach)
> handler->detach(adev);
> -
> - adev->handler = NULL;
> } else {
> device_release_driver(&adev->dev);
> }
> @@ -265,7 +275,12 @@ static int acpi_bus_trim_one(struct acpi_device *adev, void *not_used)
> */
> acpi_device_set_power(adev, ACPI_STATE_D3_COLD);
> adev->flags.initialized = false;
> - acpi_device_clear_enumerated(adev);
> +
> + /* For eject this is deferred to acpi_bus_post_eject() */
> + if (!is_eject) {
> + adev->handler = NULL;
> + acpi_device_clear_enumerated(adev);
> + }
>
> return 0;
> }
> @@ -278,15 +293,36 @@ static int acpi_bus_trim_one(struct acpi_device *adev, void *not_used)
> */
> void acpi_bus_trim(struct acpi_device *adev)
> {
> - acpi_bus_trim_one(adev, NULL);
> + bool eject = false;
> +
> + acpi_bus_trim_one(adev, &eject);
> }
> EXPORT_SYMBOL_GPL(acpi_bus_trim);
>
> +static int acpi_bus_post_eject(struct acpi_device *adev, void *not_used)
> +{
> + struct acpi_scan_handler *handler = adev->handler;
> +
> + acpi_dev_for_each_child_reverse(adev, acpi_bus_post_eject, NULL);
> +
> + if (handler) {
> + if (handler->post_eject)
> + handler->post_eject(adev);
> +
> + adev->handler = NULL;
> + }
> +
> + acpi_device_clear_enumerated(adev);
> +
> + return 0;
> +}
> +
> static int acpi_scan_hot_remove(struct acpi_device *device)
> {
> acpi_handle handle = device->handle;
> unsigned long long sta;
> acpi_status status;
> + bool eject = true;
>
> if (device->handler && device->handler->hotplug.demand_offline) {
> if (!acpi_scan_is_offline(device, true))
> @@ -299,7 +335,7 @@ static int acpi_scan_hot_remove(struct acpi_device *device)
>
> acpi_handle_debug(handle, "Ejecting\n");
>
> - acpi_bus_trim(device);
> + acpi_bus_trim_one(device, &eject);
>
> acpi_evaluate_lck(handle, 0);
> /*
> @@ -322,6 +358,8 @@ static int acpi_scan_hot_remove(struct acpi_device *device)
> } else if (sta & ACPI_STA_DEVICE_ENABLED) {
> acpi_handle_warn(handle,
> "Eject incomplete - status 0x%llx\n", sta);
> + } else {
> + acpi_bus_post_eject(device, NULL);
> }
>
> return 0;
> diff --git a/include/acpi/acpi_bus.h b/include/acpi/acpi_bus.h
> index 254685085c82..1b7e1acf925b 100644
> --- a/include/acpi/acpi_bus.h
> +++ b/include/acpi/acpi_bus.h
> @@ -127,6 +127,7 @@ struct acpi_scan_handler {
> bool (*match)(const char *idstr, const struct acpi_device_id **matchid);
> int (*attach)(struct acpi_device *dev, const struct acpi_device_id *id);
> void (*detach)(struct acpi_device *dev);
> + void (*post_eject)(struct acpi_device *dev);
> void (*bind)(struct device *phys_dev);
> void (*unbind)(struct device *phys_dev);
> struct acpi_hotplug_profile hotplug;
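The deferred-cleanup flow in this patch can be modelled in userspace as below. This is a simplified sketch under stated assumptions: the struct and function names are hypothetical, child walking and power-state handling are omitted, and sta_after_ej0 stands in for evaluating _STA after _EJ0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define STA_PRESENT 0x01ULL
#define STA_ENABLED 0x02ULL

struct dev_model;

struct handler_model {
	void (*detach)(struct dev_model *dev);
	void (*post_eject)(struct dev_model *dev);
};

struct dev_model {
	struct handler_model *handler;
	bool enumerated;
	int detach_calls;
	int post_eject_calls;
};

static void count_detach(struct dev_model *dev) { dev->detach_calls++; }
static void count_post_eject(struct dev_model *dev) { dev->post_eject_calls++; }

static void trim_one(struct dev_model *dev, bool is_eject)
{
	if (dev->handler && dev->handler->detach)
		dev->handler->detach(dev);

	/* For an eject, this is deferred to post_eject(). */
	if (!is_eject) {
		dev->handler = NULL;
		dev->enumerated = false;
	}
}

static void post_eject(struct dev_model *dev)
{
	if (dev->handler) {
		if (dev->handler->post_eject)
			dev->handler->post_eject(dev);
		dev->handler = NULL;
	}
	dev->enumerated = false;
}

/* Trim first, then only run post_eject if the device is no longer enabled. */
static void hot_remove(struct dev_model *dev, unsigned long long sta_after_ej0)
{
	trim_one(dev, true);
	if (!(sta_after_ej0 & STA_ENABLED))
		post_eject(dev);
}
```

In this model a successful eject runs detach and then post_eject, and clears the handler at the end. An incomplete eject leaves the handler pointer and enumeration state untouched.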
On Wed, 13 Sep 2023 16:38:11 +0000
James Morse <[email protected]> wrote:
> ACPI firmware can trigger the events to add and remove CPUs, but the
> OS may not support this.
>
> Print a warning when this happens.
>
> This gives early warning on arm64 systems that don't support
> CONFIG_ACPI_HOTPLUG_PRESENT_CPU, as making CPUs not present has
> side effects for other parts of the system.
>
> Signed-off-by: James Morse <[email protected]>
Seems like a good idea to me.
Reviewed-by: Jonathan Cameron <[email protected]>
> ---
> drivers/acpi/acpi_processor.c | 8 ++++++--
> 1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
> index 2cafea1edc24..b67616079751 100644
> --- a/drivers/acpi/acpi_processor.c
> +++ b/drivers/acpi/acpi_processor.c
> @@ -188,8 +188,10 @@ static int acpi_processor_make_present(struct acpi_processor *pr)
> acpi_status status;
> int ret;
>
> - if (!IS_ENABLED(CONFIG_ACPI_HOTPLUG_PRESENT_CPU))
> + if (!IS_ENABLED(CONFIG_ACPI_HOTPLUG_PRESENT_CPU)) {
> + pr_err_once("Changing CPU present bit is not supported\n");
> return -ENODEV;
> + }
>
> if (invalid_phys_cpuid(pr->phys_id))
> return -ENODEV;
> @@ -462,8 +464,10 @@ static void acpi_processor_make_not_present(struct acpi_device *device)
> {
> struct acpi_processor *pr;
>
> - if (!IS_ENABLED(CONFIG_ACPI_HOTPLUG_PRESENT_CPU))
> + if (!IS_ENABLED(CONFIG_ACPI_HOTPLUG_PRESENT_CPU)) {
> + pr_err_once("Changing CPU present bit is not supported\n");
> return;
> + }
>
> pr = acpi_driver_data(device);
> if (pr->id >= nr_cpu_ids)
On Wed, 13 Sep 2023 16:38:10 +0000
James Morse <[email protected]> wrote:
> When called, acpi_processor_post_eject() unconditionally makes a CPU
> not-present and unregisters it.
>
> To add support for AML events where the CPU has become disabled, but
> remains present, the _STA method should be checked before calling
> acpi_processor_remove().
>
> Rename acpi_processor_post_eject() acpi_processor_remove_possible(), and
> check the _STA before calling.
>
> Adding the function prototype for arch_unregister_cpu() allows the
> preprocessor guards to be removed.
>
> After this change CPUs will remain registered and visible to
> user-space as offline if buggy firmware triggers an eject-request,
> but doesn't clear the corresponding _STA bits after _EJ0 has been
> called.
Will be fun to see how many such buggy firmwares are out there.
>
> Signed-off-by: James Morse <[email protected]>
Comment inline but not directly related to this patch so with or
without that change
Reviewed-by: Jonathan Cameron <[email protected]>
> ---
> drivers/acpi/acpi_processor.c | 31 +++++++++++++++++++++++++------
> include/linux/cpu.h | 1 +
> 2 files changed, 26 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
> index 00dcc23d49a8..2cafea1edc24 100644
> --- a/drivers/acpi/acpi_processor.c
> +++ b/drivers/acpi/acpi_processor.c
> @@ -457,13 +457,12 @@ static int acpi_processor_add(struct acpi_device *device,
> return result;
> }
>
> -#ifdef CONFIG_ACPI_HOTPLUG_PRESENT_CPU
> /* Removal */
> -static void acpi_processor_post_eject(struct acpi_device *device)
> +static void acpi_processor_make_not_present(struct acpi_device *device)
> {
> struct acpi_processor *pr;
>
> - if (!device || !acpi_driver_data(device))
> + if (!IS_ENABLED(CONFIG_ACPI_HOTPLUG_PRESENT_CPU))
Would it be possible to do all the ifdef to IS_ENABLED changes in a separate
patch? I haven't figured out if any of them have dependencies on the other
changes, but they do create a bunch of noise I'd rather not see in the more
complex corners of this.
> return;
>
> pr = acpi_driver_data(device);
> @@ -501,7 +500,29 @@ static void acpi_processor_post_eject(struct acpi_device *device)
> free_cpumask_var(pr->throttling.shared_cpu_map);
> kfree(pr);
> }
> -#endif /* CONFIG_ACPI_HOTPLUG_PRESENT_CPU */
> +
> +static void acpi_processor_post_eject(struct acpi_device *device)
> +{
> + struct acpi_processor *pr;
> + unsigned long long sta;
> + acpi_status status;
> +
> + if (!device)
> + return;
> +
> + pr = acpi_driver_data(device);
> + if (!pr || pr->id >= nr_cpu_ids || invalid_phys_cpuid(pr->phys_id))
> + return;
> +
> + status = acpi_evaluate_integer(pr->handle, "_STA", NULL, &sta);
> + if (ACPI_FAILURE(status))
> + return;
> +
> + if (cpu_present(pr->id) && !(sta & ACPI_STA_DEVICE_PRESENT)) {
> + acpi_processor_make_not_present(device);
> + return;
> + }
> +}
>
> #ifdef CONFIG_ARCH_MIGHT_HAVE_ACPI_PDC
> bool __init processor_physically_present(acpi_handle handle)
> @@ -626,9 +647,7 @@ static const struct acpi_device_id processor_device_ids[] = {
> static struct acpi_scan_handler processor_handler = {
> .ids = processor_device_ids,
> .attach = acpi_processor_add,
> -#ifdef CONFIG_ACPI_HOTPLUG_PRESENT_CPU
> .post_eject = acpi_processor_post_eject,
> -#endif
> .hotplug = {
> .enabled = true,
> },
> diff --git a/include/linux/cpu.h b/include/linux/cpu.h
> index a71691d7c2ca..e117c06e0c6b 100644
> --- a/include/linux/cpu.h
> +++ b/include/linux/cpu.h
> @@ -81,6 +81,7 @@ struct device *cpu_device_create(struct device *parent, void *drvdata,
> const struct attribute_group **groups,
> const char *fmt, ...);
> extern int arch_register_cpu(int cpu);
> +extern void arch_unregister_cpu(int cpu);
> #ifdef CONFIG_HOTPLUG_CPU
> extern void unregister_cpu(struct cpu *cpu);
> extern ssize_t arch_cpu_probe(const char *, size_t);
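The decision the new acpi_processor_post_eject() makes can be reduced to a small predicate. The sketch below is illustrative only: should_make_not_present() is a hypothetical helper, and the two _STA bit values are written out locally rather than taken from kernel headers.

```c
#include <assert.h>
#include <stdbool.h>

#define ACPI_STA_DEVICE_PRESENT 0x01ULL
#define ACPI_STA_DEVICE_ENABLED 0x02ULL

/*
 * Hypothetical helper: true if the CPU should be torn down after _EJ0,
 * i.e. Linux still thinks it is present but firmware says it is not.
 */
static bool should_make_not_present(bool cpu_present, unsigned long long sta)
{
	return cpu_present && !(sta & ACPI_STA_DEVICE_PRESENT);
}
```

If buggy firmware leaves the present bit set after _EJ0, the predicate is false and the CPU stays registered, matching the behaviour described in the commit message.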
On Wed, 13 Sep 2023 16:37:51 +0000
James Morse <[email protected]> wrote:
> Architectures often have extra per-cpu work that needs doing
> before a CPU is registered, often to determine if a CPU is
> hotpluggable.
>
> To allow the ACPI architectures to use GENERIC_CPU_DEVICES, move
> the cpu_register() call into arch_register_cpu(), which is made __weak
> so architectures with extra work can override it.
> This aligns with the way x86, ia64 and loongarch register hotplug CPUs
> when they become present.
Perhaps call out that you are also making cpu_devices visible outside
of base/cpu.c
Note it isn't obvious to me why you do that in this patch. I assume
it will be needed later...
Otherwise seems sensible to me.
>
> Signed-off-by: James Morse <[email protected]>
> ---
> Changes since RFC:
> * Dropped __init from x86/ia64 arch_register_cpu()
> ---
> arch/ia64/include/asm/cpu.h | 1 -
> arch/ia64/kernel/topology.c | 2 +-
> arch/loongarch/include/asm/cpu.h | 1 -
> arch/x86/include/asm/cpu.h | 1 -
> arch/x86/kernel/topology.c | 2 +-
> drivers/base/cpu.c | 14 ++++++++++----
> include/linux/cpu.h | 5 +++++
> 7 files changed, 17 insertions(+), 9 deletions(-)
>
> diff --git a/arch/ia64/include/asm/cpu.h b/arch/ia64/include/asm/cpu.h
> index db125df9e088..a3e690e685e5 100644
> --- a/arch/ia64/include/asm/cpu.h
> +++ b/arch/ia64/include/asm/cpu.h
> @@ -16,7 +16,6 @@ DECLARE_PER_CPU(struct ia64_cpu, cpu_devices);
> DECLARE_PER_CPU(int, cpu_state);
>
> #ifdef CONFIG_HOTPLUG_CPU
> -extern int arch_register_cpu(int num);
> extern void arch_unregister_cpu(int);
> #endif
>
> diff --git a/arch/ia64/kernel/topology.c b/arch/ia64/kernel/topology.c
> index 94a848b06f15..741863a187a6 100644
> --- a/arch/ia64/kernel/topology.c
> +++ b/arch/ia64/kernel/topology.c
> @@ -59,7 +59,7 @@ void __ref arch_unregister_cpu(int num)
> }
> EXPORT_SYMBOL(arch_unregister_cpu);
> #else
> -static int __init arch_register_cpu(int num)
> +int __init arch_register_cpu(int num)
> {
> return register_cpu(&sysfs_cpus[num].cpu, num);
> }
> diff --git a/arch/loongarch/include/asm/cpu.h b/arch/loongarch/include/asm/cpu.h
> index 7afe8cbb844e..b8568e637420 100644
> --- a/arch/loongarch/include/asm/cpu.h
> +++ b/arch/loongarch/include/asm/cpu.h
> @@ -130,7 +130,6 @@ enum cpu_type_enum {
>
> #if !defined(__ASSEMBLY__)
> #ifdef CONFIG_HOTPLUG_CPU
> -int arch_register_cpu(int num);
> void arch_unregister_cpu(int cpu);
> #endif
> #endif /* ! __ASSEMBLY__ */
> diff --git a/arch/x86/include/asm/cpu.h b/arch/x86/include/asm/cpu.h
> index 3a233ebff712..96dc4665e87d 100644
> --- a/arch/x86/include/asm/cpu.h
> +++ b/arch/x86/include/asm/cpu.h
> @@ -28,7 +28,6 @@ struct x86_cpu {
> };
>
> #ifdef CONFIG_HOTPLUG_CPU
> -extern int arch_register_cpu(int num);
> extern void arch_unregister_cpu(int);
> extern void soft_restart_cpu(void);
> #endif
> diff --git a/arch/x86/kernel/topology.c b/arch/x86/kernel/topology.c
> index ca004e2e4469..0bab03130033 100644
> --- a/arch/x86/kernel/topology.c
> +++ b/arch/x86/kernel/topology.c
> @@ -54,7 +54,7 @@ void arch_unregister_cpu(int num)
> EXPORT_SYMBOL(arch_unregister_cpu);
> #else /* CONFIG_HOTPLUG_CPU */
>
> -static int __init arch_register_cpu(int num)
> +int __init arch_register_cpu(int num)
> {
> return register_cpu(&per_cpu(cpu_devices, num).cpu, num);
> }
> diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c
> index 34b48f660b6b..579064fda97b 100644
> --- a/drivers/base/cpu.c
> +++ b/drivers/base/cpu.c
> @@ -525,19 +525,25 @@ bool cpu_is_hotpluggable(unsigned int cpu)
> EXPORT_SYMBOL_GPL(cpu_is_hotpluggable);
>
> #ifdef CONFIG_GENERIC_CPU_DEVICES
> -static DEFINE_PER_CPU(struct cpu, cpu_devices);
> +DEFINE_PER_CPU(struct cpu, cpu_devices);
> +
> +int __weak arch_register_cpu(int cpu)
> +{
> + return register_cpu(&per_cpu(cpu_devices, cpu), cpu);
> +}
> #endif
>
> static void __init cpu_dev_register_generic(void)
> {
> -#ifdef CONFIG_GENERIC_CPU_DEVICES
> int i;
>
> + if (!IS_ENABLED(CONFIG_GENERIC_CPU_DEVICES))
> + return;
> +
> for_each_present_cpu(i) {
> - if (register_cpu(&per_cpu(cpu_devices, i), i))
> + if (arch_register_cpu(i))
> panic("Failed to register CPU device");
> }
> -#endif
> }
>
> #ifdef CONFIG_GENERIC_CPU_VULNERABILITIES
> diff --git a/include/linux/cpu.h b/include/linux/cpu.h
> index 0abd60a7987b..a71691d7c2ca 100644
> --- a/include/linux/cpu.h
> +++ b/include/linux/cpu.h
> @@ -80,12 +80,17 @@ extern __printf(4, 5)
> struct device *cpu_device_create(struct device *parent, void *drvdata,
> const struct attribute_group **groups,
> const char *fmt, ...);
> +extern int arch_register_cpu(int cpu);
> #ifdef CONFIG_HOTPLUG_CPU
> extern void unregister_cpu(struct cpu *cpu);
> extern ssize_t arch_cpu_probe(const char *, size_t);
> extern ssize_t arch_cpu_release(const char *, size_t);
> #endif
>
> +#ifdef CONFIG_GENERIC_CPU_DEVICES
> +DECLARE_PER_CPU(struct cpu, cpu_devices);
> +#endif
> +
> /*
> * These states are not related to the core CPU hotplug mechanism. They are
> * used by various (sub)architectures to track internal state
On Wed, 13 Sep 2023 16:38:14 +0000
James Morse <[email protected]> wrote:
> ACPI identifies CPUs by UID. get_cpu_for_acpi_id() maps the ACPI UID
> to the linux CPU number.
>
> The helper to retrieve this mapping is only available in arm64's numa
> code.
>
> Move it to live next to get_acpi_id_for_cpu().
>
> Signed-off-by: James Morse <[email protected]>
Seems reasonable
Reviewed-by: Jonathan Cameron <[email protected]>
> ---
> arch/arm64/include/asm/acpi.h | 11 +++++++++++
> arch/arm64/kernel/acpi_numa.c | 11 -----------
> 2 files changed, 11 insertions(+), 11 deletions(-)
>
> diff --git a/arch/arm64/include/asm/acpi.h b/arch/arm64/include/asm/acpi.h
> index 4d537d56eb84..ce5045038e87 100644
> --- a/arch/arm64/include/asm/acpi.h
> +++ b/arch/arm64/include/asm/acpi.h
> @@ -100,6 +100,17 @@ static inline u32 get_acpi_id_for_cpu(unsigned int cpu)
> return acpi_cpu_get_madt_gicc(cpu)->uid;
> }
>
> +static inline int get_cpu_for_acpi_id(u32 uid)
> +{
> + int cpu;
> +
> + for (cpu = 0; cpu < nr_cpu_ids; cpu++)
> + if (uid == get_acpi_id_for_cpu(cpu))
> + return cpu;
> +
> + return -EINVAL;
> +}
> +
> static inline void arch_fix_phys_package_id(int num, u32 slot) { }
> void __init acpi_init_cpus(void);
> int apei_claim_sea(struct pt_regs *regs);
> diff --git a/arch/arm64/kernel/acpi_numa.c b/arch/arm64/kernel/acpi_numa.c
> index e51535a5f939..0c036a9a3c33 100644
> --- a/arch/arm64/kernel/acpi_numa.c
> +++ b/arch/arm64/kernel/acpi_numa.c
> @@ -34,17 +34,6 @@ int __init acpi_numa_get_nid(unsigned int cpu)
> return acpi_early_node_map[cpu];
> }
>
> -static inline int get_cpu_for_acpi_id(u32 uid)
> -{
> - int cpu;
> -
> - for (cpu = 0; cpu < nr_cpu_ids; cpu++)
> - if (uid == get_acpi_id_for_cpu(cpu))
> - return cpu;
> -
> - return -EINVAL;
> -}
> -
> static int __init acpi_parse_gicc_pxm(union acpi_subtable_headers *header,
> const unsigned long end)
> {
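For readers unfamiliar with the helper being moved, here is a self-contained userspace sketch of the same linear lookup. The acpi_uid_of_cpu[] table and NR_CPU_IDS are assumptions standing in for the per-CPU MADT GICC data and nr_cpu_ids.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define NR_CPU_IDS 4

/* Hypothetical table standing in for the per-CPU MADT GICC uid. */
static const uint32_t acpi_uid_of_cpu[NR_CPU_IDS] = { 0x10, 0x11, 0x20, 0x21 };

static uint32_t get_acpi_id_for_cpu(unsigned int cpu)
{
	return acpi_uid_of_cpu[cpu];
}

/* Reverse mapping: scan every CPU until the UID matches. */
static int get_cpu_for_acpi_id(uint32_t uid)
{
	for (int cpu = 0; cpu < NR_CPU_IDS; cpu++)
		if (uid == get_acpi_id_for_cpu(cpu))
			return cpu;

	return -EINVAL;
}
```

The lookup is O(number of CPUs) and returns -EINVAL for an unknown UID, so callers need to check for a negative result before using it as a CPU number.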
On Wed, 13 Sep 2023 16:38:13 +0000
James Morse <[email protected]> wrote:
> LoongArch provides its own arch_unregister_cpu(). This clears the
> hotpluggable flag, then unregisters the CPU.
>
> It isn't necessary to clear the hotpluggable flag when unregistering
> a cpu. unregister_cpu() writes NULL to the percpu cpu_sys_devices
> pointer, meaning cpu_is_hotpluggable() will return false, as
> get_cpu_device() has returned NULL.
Thought that looked odd earlier but didn't care enough to dig.
Seems unlikely that state would persist for an unregistered cpu.
Great to see confirmation.
>
> Remove arch_unregister_cpu() and use the __weak version.
>
> Signed-off-by: James Morse <[email protected]>
Reviewed-by: Jonathan Cameron <[email protected]>
> ---
> arch/loongarch/kernel/topology.c | 9 ---------
> 1 file changed, 9 deletions(-)
>
> diff --git a/arch/loongarch/kernel/topology.c b/arch/loongarch/kernel/topology.c
> index 8e4441c1ff39..5a75e2cc0848 100644
> --- a/arch/loongarch/kernel/topology.c
> +++ b/arch/loongarch/kernel/topology.c
> @@ -16,13 +16,4 @@ int arch_register_cpu(int cpu)
> return register_cpu(c, cpu);
> }
> EXPORT_SYMBOL(arch_register_cpu);
> -
> -void arch_unregister_cpu(int cpu)
> -{
> - struct cpu *c = &per_cpu(cpu_devices, cpu);
> -
> - c->hotpluggable = 0;
> - unregister_cpu(c);
> -}
> -EXPORT_SYMBOL(arch_unregister_cpu);
> #endif
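The argument in this commit message can be checked with a small model. The sketch below is an assumption-laden simplification: struct cpu_model and the array replace the real struct cpu and percpu pointer, and any other conditions the real cpu_is_hotpluggable() checks are omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS 2

struct cpu_model {
	bool hotpluggable;
};

/* Models the percpu cpu_sys_devices pointer. */
static struct cpu_model *cpu_sys_devices[NR_CPUS];

static struct cpu_model *get_cpu_device(unsigned int cpu)
{
	return cpu_sys_devices[cpu];
}

static void register_cpu(struct cpu_model *c, unsigned int cpu)
{
	cpu_sys_devices[cpu] = c;
}

static void unregister_cpu(unsigned int cpu)
{
	/* The flag in the struct is left alone; only the pointer is cleared. */
	cpu_sys_devices[cpu] = NULL;
}

static bool cpu_is_hotpluggable(unsigned int cpu)
{
	struct cpu_model *c = get_cpu_device(cpu);

	return c && c->hotpluggable;
}
```

After unregister_cpu() the stale hotpluggable flag is unreachable through get_cpu_device(), so clearing it first has no observable effect in this model.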
On Wed, 13 Sep 2023 16:38:17 +0000
James Morse <[email protected]> wrote:
> gic_acpi_match_gicc() is only called via gic_acpi_count_gicr_regions().
> It should only count the number of enabled redistributors, but it
> also tries to sanity check the GICC entry, currently returning an
> error if the Enabled bit is set, but the gicr_base_address is zero.
>
> Adding support for the online-capable bit to the sanity check
> complicates it, for no benefit. The existing check implicitly
> depends on gic_acpi_count_gicr_regions() previously failing to find
> any GICR regions (as it is valid to have gicr_base_address of zero if
> the redistributors are described via a GICR entry).
>
> Instead of complicating the check, remove it. Failures that happen
> at this point cause the irqchip not to register, meaning no irqs
> can be requested. The kernel grinds to a panic() pretty quickly.
>
> Without the check, MADT tables that exhibit this problem are still
> caught by gic_populate_rdist(), which helpfully also prints what
> went wrong:
> | CPU4: mpidr 100 has no re-distributor!
>
> Signed-off-by: James Morse <[email protected]>
> ---
> drivers/irqchip/irq-gic-v3.c | 18 ++++++------------
> 1 file changed, 6 insertions(+), 12 deletions(-)
>
> diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
> index 72d3cdebdad1..0f54811262eb 100644
> --- a/drivers/irqchip/irq-gic-v3.c
> +++ b/drivers/irqchip/irq-gic-v3.c
> @@ -2415,21 +2415,15 @@ static int __init gic_acpi_match_gicc(union acpi_subtable_headers *header,
>
> /*
> * If GICC is enabled and has valid gicr base address, then it means
> - * GICR base is presented via GICC
> + * GICR base is presented via GICC. The redistributor is only known to
> + * be accessible if the GICC is marked as enabled. If this bit is not
> + * set, we'd need to add the redistributor at runtime, which isn't
> + * supported.
> */
> - if (acpi_gicc_is_usable(gicc) && gicc->gicr_base_address) {
> + if (gicc->flags & ACPI_MADT_ENABLED && gicc->gicr_base_address)
Going in circles...
> acpi_data.enabled_rdists++;
> - return 0;
> - }
>
> - /*
> - * It's perfectly valid firmware can pass disabled GICC entry, driver
> - * should not treat as errors, skip the entry instead of probe fail.
> - */
> - if (!acpi_gicc_is_usable(gicc))
> - return 0;
> -
> - return -ENODEV;
> + return 0;
> }
>
> static int __init gic_acpi_count_gicr_regions(void)
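After this patch the callback only counts, it no longer validates. A hedged userspace sketch of that counting rule follows; struct gicc_model and count_enabled_rdists() are hypothetical, and only the two fields the check uses are modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ACPI_MADT_ENABLED 1u

/* Hypothetical cut-down GICC entry: only the fields the check reads. */
struct gicc_model {
	uint32_t flags;
	uint64_t gicr_base_address;
};

/* Count entries that are enabled and carry a redistributor base address. */
static int count_enabled_rdists(const struct gicc_model *gicc, size_t n)
{
	int enabled_rdists = 0;

	for (size_t i = 0; i < n; i++)
		if ((gicc[i].flags & ACPI_MADT_ENABLED) && gicc[i].gicr_base_address)
			enabled_rdists++;

	return enabled_rdists;
}
```

An enabled entry with a zero base address is simply not counted here; per the commit message, a genuinely broken MADT is still expected to be caught later by gic_populate_rdist().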
On Thu, 14 Sep 2023 15:07:22 +0100
"Russell King (Oracle)" <[email protected]> wrote:
> On Thu, Sep 14, 2023 at 12:27:15PM +0100, Jonathan Cameron wrote:
> > On Wed, 13 Sep 2023 16:37:54 +0000
> > James Morse <[email protected]> wrote:
> >
> > > To allow ACPI's _STA value to hide CPUs that are present, but not
> > > available to online right now due to VMM or firmware policy, the
> > > register_cpu() call needs to be made by the ACPI machinery when ACPI
> > > is in use. This allows it to hide CPUs that are unavailable from sysfs.
> > >
> > > Switching to GENERIC_CPU_DEVICES is an intermediate step to allow all
> > > five ACPI architectures to be modified at once.
> > >
> > > Switch over to GENERIC_CPU_DEVICES, and provide an arch_register_cpu()
> > > that populates the hotpluggable flag. arch_register_cpu() is also the
> > > interface the ACPI machinery expects.
> > >
> > > The struct cpu in struct cpuinfo_arm64 is never used directly, remove
> > > it to use the one GENERIC_CPU_DEVICES provides.
> > >
> > > This changes the CPUs visible in sysfs from possible to present, but
> > > on arm64 smp_prepare_cpus() ensures these are the same.
> > >
> > > Signed-off-by: James Morse <[email protected]>
> >
> > After this the earlier question about ordering of cpu_dev_init()
> > and node_dev_init() is relevant.
> >
> > Why won't node_dev_init() call
> > > get_cpu_device() which queries per_cpu(cpu_sys_devices)
> > and get NULL as we haven't yet filled that in?
> >
> > > Or does it do so but that doesn't matter as we'll create the
> > relevant links later?
>
> node_dev_init() will walk through the nodes calling register_one_node()
> on each. This will trickle down to __register_one_node() which walks
> all present CPUs, calling register_cpu_under_node() on each.
>
> register_cpu_under_node() will call get_cpu_device(cpu) for each and
> will return NULL until the CPU is registered using register_cpu(),
> which will now happen _after_ node_dev_init().
>
> So, at this point, CPUs won't get registered, and initially one might
> think that's a problem.
>
> However, register_cpu() will itself call register_cpu_under_node(),
> where get_cpu_device() will return the now populated entry, and the
> sysfs links will be created.
>
> So, I think what you've spotted is a potential chunk of code that
> isn't necessary when using GENERIC_CPU_DEVICES after this change!
>
Makes sense, thanks. I was just being too lazy to check and bouncing it back
at James! *looks guilty*
Jonathan
On Thu, 14 Sep 2023 09:57:44 +0200
Ard Biesheuvel <[email protected]> wrote:
> Hello James,
>
> On Wed, 13 Sept 2023 at 18:41, James Morse <[email protected]> wrote:
> >
> > Add the new flag field to the MADT's GICC structure.
> >
> > 'Online Capable' indicates a disabled CPU can be enabled later.
> >
>
> Why do we need a bit for this? What would be the point of describing
> disabled CPUs that cannot be enabled (and are you aware of
> firmware doing this?).
Enabled not being set is common in some similar ACPI tables at least.
This is available in most ACPI tables to allow firmware to use 'nearly'
static tables and just tweak the 'enabled' bit to say whether the record should
be ignored or not. _STA reporting not-present is used for the same trick.
If you are doing clever dynamic tables, then you can just not present
the entry.
With that existing use case in mind, we need another bit to say this
one might one day turn up. Note this is copied from x86, though no
one seems to have implemented the kernel support for them yet.
Note as per my other reply - this isn't a code first proposal. It's in the
spec already (via a code first proposal last year I think).
>
> So why are we not able to assume that this new bit can always be treated as '1'?
Given the above, we need the extra bit so that things can be sized to allow
for the CPU showing up late.
>
>
> > Signed-off-by: James Morse <[email protected]>
> > ---
> > This patch probably needs to go via the upstream acpica project,
> > but is included here so the feature can be tested.
> > ---
> > include/acpi/actbl2.h | 1 +
> > 1 file changed, 1 insertion(+)
> >
> > diff --git a/include/acpi/actbl2.h b/include/acpi/actbl2.h
> > index 3751ae69432f..c433a079d8e1 100644
> > --- a/include/acpi/actbl2.h
> > +++ b/include/acpi/actbl2.h
> > @@ -1046,6 +1046,7 @@ struct acpi_madt_generic_interrupt {
> > /* ACPI_MADT_ENABLED (1) Processor is usable if set */
> > #define ACPI_MADT_PERFORMANCE_IRQ_MODE (1<<1) /* 01: Performance Interrupt Mode */
> > #define ACPI_MADT_VGIC_IRQ_MODE (1<<2) /* 02: VGIC Maintenance Interrupt mode */
> > +#define ACPI_MADT_GICC_CPU_CAPABLE (1<<3) /* 03: CPU is online capable */
> >
> > /* 12: Generic Distributor (ACPI 5.0 + ACPI 6.0 changes) */
> >
> > --
> > 2.39.2
> >
>
>
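A minimal sketch of the distinction argued for above, in case it helps: 'enabled' means usable now, 'online capable' means leave room for it, and neither means ignore the record. The two constants mirror ACPI_MADT_ENABLED and the ACPI_MADT_GICC_CPU_CAPABLE bit from the quoted hunk; the enum and the helpers gicc_classify() and count_possible() are made-up names for illustration, not kernel API.

```c
#include <stddef.h>
#include <stdint.h>

/* Bit positions as in the quoted actbl2.h hunk, redefined locally so the
 * example builds outside the kernel tree. */
#define MADT_ENABLED          (1u << 0)
#define MADT_GICC_CPU_CAPABLE (1u << 3)

/* Hypothetical classification, not a kernel type. */
enum gicc_state {
	GICC_IGNORE,      /* neither bit set: placeholder record, skip it */
	GICC_USABLE_NOW,  /* enabled: CPU may be brought online at boot */
	GICC_MAYBE_LATER, /* disabled but online capable: size for it */
};

static enum gicc_state gicc_classify(uint32_t flags)
{
	if (flags & MADT_ENABLED)
		return GICC_USABLE_NOW;
	if (flags & MADT_GICC_CPU_CAPABLE)
		return GICC_MAYBE_LATER;
	return GICC_IGNORE;
}

/* Number of CPUs the OS would have to leave room for. */
static size_t count_possible(const uint32_t *flags, size_t n)
{
	size_t i, possible = 0;

	for (i = 0; i < n; i++)
		if (gicc_classify(flags[i]) != GICC_IGNORE)
			possible++;
	return possible;
}
```

Treating the new bit as always '1' would make GICC_IGNORE unreachable, which is the case the existing 'nearly static tables' firmware relies on.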
On Wed, 13 Sep 2023 16:38:16 +0000
James Morse <[email protected]> wrote:
> ACPI, irqchip and the architecture code all inspect the MADT
> enabled bit for a GICC entry in the MADT.
>
> The addition of an 'online capable' bit means all these sites need
> updating.
>
> Move the current checks behind a helper to make future updates easier.
>
> Signed-off-by: James Morse <[email protected]>
Looks good to me and seems fine to add as part of a precursor mini
series to the main one. (fix Russell's observation of course!)
Reviewed-by: Jonathan Cameron <[email protected]>
> ---
> arch/arm64/kernel/smp.c | 2 +-
> drivers/acpi/processor_core.c | 2 +-
> drivers/irqchip/irq-gic-v3.c | 10 ++++------
> include/linux/acpi.h | 5 +++++
> 4 files changed, 11 insertions(+), 8 deletions(-)
>
> diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
> index 960b98b43506..8c8f55721786 100644
> --- a/arch/arm64/kernel/smp.c
> +++ b/arch/arm64/kernel/smp.c
> @@ -520,7 +520,7 @@ acpi_map_gic_cpu_interface(struct acpi_madt_generic_interrupt *processor)
> {
> u64 hwid = processor->arm_mpidr;
>
> - if (!(processor->flags & ACPI_MADT_ENABLED)) {
> + if (!acpi_gicc_is_usable(processor)) {
> pr_debug("skipping disabled CPU entry with 0x%llx MPIDR\n", hwid);
> return;
> }
> diff --git a/drivers/acpi/processor_core.c b/drivers/acpi/processor_core.c
> index 7dd6dbaa98c3..b203cfe28550 100644
> --- a/drivers/acpi/processor_core.c
> +++ b/drivers/acpi/processor_core.c
> @@ -90,7 +90,7 @@ static int map_gicc_mpidr(struct acpi_subtable_header *entry,
> struct acpi_madt_generic_interrupt *gicc =
> container_of(entry, struct acpi_madt_generic_interrupt, header);
>
> - if (!(gicc->flags & ACPI_MADT_ENABLED))
> + if (!acpi_gicc_is_usable(gicc))
> return -ENODEV;
>
> /* device_declaration means Device object in DSDT, in the
> diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
> index eedfa8e9f077..72d3cdebdad1 100644
> --- a/drivers/irqchip/irq-gic-v3.c
> +++ b/drivers/irqchip/irq-gic-v3.c
> @@ -2367,8 +2367,7 @@ gic_acpi_parse_madt_gicc(union acpi_subtable_headers *header,
> u32 size = reg == GIC_PIDR2_ARCH_GICv4 ? SZ_64K * 4 : SZ_64K * 2;
> void __iomem *redist_base;
>
> - /* GICC entry which has !ACPI_MADT_ENABLED is not unusable so skip */
> - if (!(gicc->flags & ACPI_MADT_ENABLED))
> + if (!acpi_gicc_is_usable(gicc))
> return 0;
>
> redist_base = ioremap(gicc->gicr_base_address, size);
> @@ -2418,7 +2417,7 @@ static int __init gic_acpi_match_gicc(union acpi_subtable_headers *header,
> * If GICC is enabled and has valid gicr base address, then it means
> * GICR base is presented via GICC
> */
> - if ((gicc->flags & ACPI_MADT_ENABLED) && gicc->gicr_base_address) {
> + if (acpi_gicc_is_usable(gicc) && gicc->gicr_base_address) {
> acpi_data.enabled_rdists++;
> return 0;
> }
> @@ -2427,7 +2426,7 @@ static int __init gic_acpi_match_gicc(union acpi_subtable_headers *header,
> * It's perfectly valid firmware can pass disabled GICC entry, driver
> * should not treat as errors, skip the entry instead of probe fail.
> */
> - if (!(gicc->flags & ACPI_MADT_ENABLED))
> + if (!acpi_gicc_is_usable(gicc))
> return 0;
>
> return -ENODEV;
> @@ -2486,8 +2485,7 @@ static int __init gic_acpi_parse_virt_madt_gicc(union acpi_subtable_headers *hea
> int maint_irq_mode;
> static int first_madt = true;
>
> - /* Skip unusable CPUs */
> - if (!(gicc->flags & ACPI_MADT_ENABLED))
> + if (!acpi_gicc_is_usable(gicc))
> return 0;
>
> maint_irq_mode = (gicc->flags & ACPI_MADT_VGIC_IRQ_MODE) ?
> diff --git a/include/linux/acpi.h b/include/linux/acpi.h
> index b7ab85857bb7..e3265a9eafae 100644
> --- a/include/linux/acpi.h
> +++ b/include/linux/acpi.h
> @@ -256,6 +256,11 @@ acpi_table_parse_cedt(enum acpi_cedt_type id,
> int acpi_parse_mcfg (struct acpi_table_header *header);
> void acpi_table_print_madt_entry (struct acpi_subtable_header *madt);
>
> +static inline bool acpi_gicc_is_usable(struct acpi_madt_generic_interrupt *gicc)
> +{
> + return (gicc->flags & ACPI_MADT_ENABLED);
> +}
> +
> /* the following numa functions are architecture-dependent */
> void acpi_numa_slit_init (struct acpi_table_slit *slit);
>
On Thu, 14 Sept 2023 at 16:55, Jonathan Cameron
<[email protected]> wrote:
>
> On Thu, 14 Sep 2023 09:57:44 +0200
> Ard Biesheuvel <[email protected]> wrote:
>
> > Hello James,
> >
> > On Wed, 13 Sept 2023 at 18:41, James Morse <[email protected]> wrote:
> > >
> > > Add the new flag field to the MADT's GICC structure.
> > >
> > > 'Online Capable' indicates a disabled CPU can be enabled later.
> > >
> >
> > Why do we need a bit for this? What would be the point of describing
> > disabled CPUs that cannot be enabled (and are you aware of
> > firmware doing this?).
>
> Enabled being not set is common at some similar ACPI tables at least.
>
> This is available in most ACPI tables to allow firmware to use 'nearly'
> static tables and just tweak the 'enabled' bit to say if the record should
> be ignored or not. Also _STA not present which is for same trick.
> If you are doing clever dynamic tables, then you can just not present
> the entry.
>
> With that existing use case in mind, need another bit to say this
> one might one day turn up. Note this is copied from x86 though no
> one seems to have implemented the kernel support for them yet.
>
> Note as per my other reply - this isn't a code first proposal. It's in the
> spec already (via a code first proposal last year I think).
>
> >
> > So why are we not able to assume that this new bit can always be treated as '1'?
>
> Given above, need the extra bit to size stuff to allow for the CPU showing up
> late.
>
So does this mean that on x86, the CPU object is instantiated only
when the hardware level hotplug occurs? And before that, the object
does not exist at all?
Because it seems to me that _STA, having both enabled and present
bits, could already describe what we need here, and arguably, a CPU
that is not both present and enabled should not be used by the OS.
This would leave room for representing off-line CPUs as present but
not enabled.
Apologies if I am missing something obvious here - the whole rationale
behind this thing is rather confusing to me.
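To spell out the reading of _STA I have in mind, here is a rough sketch. The two bit values are the 'present' and 'enabled' bits of _STA as defined by the ACPI spec; the enum and cpu_sta_decode() are hypothetical names used only for this illustration.

```c
#include <stdint.h>

/* _STA bit 0 (present) and bit 1 (enabled), redefined locally. */
#define STA_DEVICE_PRESENT 0x01u
#define STA_DEVICE_ENABLED 0x02u

/* Hypothetical names for the three cases in the question above. */
enum cpu_sta_state {
	CPU_STA_ABSENT,   /* not present: nothing there */
	CPU_STA_DISABLED, /* present but not enabled: must not be used yet */
	CPU_STA_USABLE,   /* present and enabled: the OS may use it */
};

static enum cpu_sta_state cpu_sta_decode(uint64_t sta)
{
	if (!(sta & STA_DEVICE_PRESENT))
		return CPU_STA_ABSENT;
	if (!(sta & STA_DEVICE_ENABLED))
		return CPU_STA_DISABLED;
	return CPU_STA_USABLE;
}
```

Note this only covers what _STA can say once the namespace is available; it says nothing about what has to be known when the MADT is parsed.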
On Thu, Sep 14, 2023 at 05:34:25PM +0200, Ard Biesheuvel wrote:
> On Thu, 14 Sept 2023 at 16:55, Jonathan Cameron
> <[email protected]> wrote:
> >
> > On Thu, 14 Sep 2023 09:57:44 +0200
> > Ard Biesheuvel <[email protected]> wrote:
> >
> > > Hello James,
> > >
> > > On Wed, 13 Sept 2023 at 18:41, James Morse <[email protected]> wrote:
> > > >
> > > > Add the new flag field to the MADT's GICC structure.
> > > >
> > > > 'Online Capable' indicates a disabled CPU can be enabled later.
> > > >
> > >
> > > Why do we need a bit for this? What would be the point of describing
> > > disabled CPUs that cannot be enabled (and are you aware of
> > > firmware doing this?).
> >
> > Enabled being not set is common at some similar ACPI tables at least.
> >
> > This is available in most ACPI tables to allow firmware to use 'nearly'
> > static tables and just tweak the 'enabled' bit to say if the record should
> > be ignored or not. Also _STA not present which is for same trick.
> > If you are doing clever dynamic tables, then you can just not present
> > the entry.
> >
> > With that existing use case in mind, need another bit to say this
> > one might one day turn up. Note this is copied from x86 though no
> > one seems to have implemented the kernel support for them yet.
> >
> > Note as per my other reply - this isn't a code first proposal. It's in the
> > spec already (via a code first proposal last year I think).
> >
> > >
> > > So why are we not able to assume that this new bit can always be treated as '1'?
> >
> > Given above, need the extra bit to size stuff to allow for the CPU showing up
> > late.
> >
>
> So does this mean that on x86, the CPU object is instantiated only
> when the hardware level hotplug occurs? And before that, the object
> does not exist at all?
>
> Because it seems to me that _STA, having both enabled and present
> bits, could already describe what we need here, and arguably, a CPU
> that is not both present and enabled should not be used by the OS.
> This would leave room for representing off-line CPUs as present but
> not enabled.
>
> Apologies if I am missing something obvious here - the whole rationale
> behind this thing is rather confusing to me.
Note that the bit is in the ACPI spec:
https://uefi.org/specs/ACPI/6.5/05_ACPI_Software_Programming_Model.html#gicc-cpu-interface-flags
The new bit has the same description as per the local-APIC equivalent:
https://uefi.org/specs/ACPI/6.5/05_ACPI_Software_Programming_Model.html#local-apic-flags
for a popular architecture that does have hot-pluggable physical CPUs ;)
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
On Wed, 13 Sep 2023 16:38:19 +0000
James Morse <[email protected]> wrote:
> From: Jean-Philippe Brucker <[email protected]>
>
> When a CPU is marked as disabled, but online capable in the MADT, PSCI
> applies some firmware policy to control when it can be brought online.
> PSCI returns DENIED to a CPU_ON request if this is not currently
> permitted. The OS can learn the current policy from the _STA enabled bit.
>
> Handle the PSCI DENIED return code gracefully instead of printing an
> error.
A specification reference would be good, particularly as it's only been
added as a possibility fairly recently.
>
> Signed-off-by: Jean-Philippe Brucker <[email protected]>
> [ morse: Rewrote commit message ]
> Signed-off-by: James Morse <[email protected]>
> ---
> arch/arm64/kernel/psci.c | 2 +-
> arch/arm64/kernel/smp.c | 3 ++-
> drivers/firmware/psci/psci.c | 2 ++
> 3 files changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/arch/arm64/kernel/psci.c b/arch/arm64/kernel/psci.c
> index 29a8e444db83..4fcc0cdd757b 100644
> --- a/arch/arm64/kernel/psci.c
> +++ b/arch/arm64/kernel/psci.c
> @@ -40,7 +40,7 @@ static int cpu_psci_cpu_boot(unsigned int cpu)
> {
> phys_addr_t pa_secondary_entry = __pa_symbol(secondary_entry);
> int err = psci_ops.cpu_on(cpu_logical_map(cpu), pa_secondary_entry);
> - if (err)
> + if (err && err != -EPROBE_DEFER)
Hmm. EPROBE_DEFER has a very specific meaning: a driver requesting a retry
once some other bit of the system has finished booting.
I'm not sure it's a good idea for this use case. Maybe just keep to EPERM,
which psci_to_linux_errno() will return anyway. Seems valid to me, or
is the requirement to use EPROBE_DEFER coming from further up the stack?
> pr_err("failed to boot CPU%d (%d)\n", cpu, err);
>
> return err;
> diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
> index 8c8f55721786..e958db987665 100644
> --- a/arch/arm64/kernel/smp.c
> +++ b/arch/arm64/kernel/smp.c
> @@ -124,7 +124,8 @@ int __cpu_up(unsigned int cpu, struct task_struct *idle)
> /* Now bring the CPU into our world */
> ret = boot_secondary(cpu, idle);
> if (ret) {
> - pr_err("CPU%u: failed to boot: %d\n", cpu, ret);
> + if (ret != -EPROBE_DEFER)
> + pr_err("CPU%u: failed to boot: %d\n", cpu, ret);
> return ret;
> }
>
> diff --git a/drivers/firmware/psci/psci.c b/drivers/firmware/psci/psci.c
> index d9629ff87861..f7ab3fed3528 100644
> --- a/drivers/firmware/psci/psci.c
> +++ b/drivers/firmware/psci/psci.c
> @@ -218,6 +218,8 @@ static int __psci_cpu_on(u32 fn, unsigned long cpuid, unsigned long entry_point)
> int err;
>
> err = invoke_psci_fn(fn, cpuid, entry_point, 0);
> + if (err == PSCI_RET_DENIED)
> + return -EPROBE_DEFER;
> return psci_to_linux_errno(err);
> }
>
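For what it's worth, the EPERM alternative suggested above could look roughly like the sketch below. It is a standalone model, not a patch: psci_errno() is a local copy modelled on psci_to_linux_errno(), the PSCI_RET_* values are those from the PSCI specification, and cpu_on_error_is_noisy() is a hypothetical helper standing in for the "should we pr_err()" test in the two callers.

```c
#include <errno.h>
#include <stdbool.h>

/* PSCI return values, redefined locally for a standalone build. */
#define PSCI_RET_SUCCESS          0
#define PSCI_RET_NOT_SUPPORTED   -1
#define PSCI_RET_INVALID_PARAMS  -2
#define PSCI_RET_DENIED          -3
#define PSCI_RET_INVALID_ADDRESS -9

/* Local model of the PSCI to errno translation. */
static int psci_errno(int psci_ret)
{
	switch (psci_ret) {
	case PSCI_RET_SUCCESS:
		return 0;
	case PSCI_RET_NOT_SUPPORTED:
		return -EOPNOTSUPP;
	case PSCI_RET_INVALID_PARAMS:
	case PSCI_RET_INVALID_ADDRESS:
		return -EINVAL;
	case PSCI_RET_DENIED:
		return -EPERM;
	}
	return -EINVAL;
}

/* Hypothetical helper: should a failed CPU_ON be reported with pr_err()? */
static bool cpu_on_error_is_noisy(int err)
{
	/* DENIED is firmware policy, not a failure worth shouting about. */
	return err && err != -EPERM;
}
```

With this shape __psci_cpu_on() needs no special case, and only the two print sites change their condition.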
On Wed, 13 Sep 2023 16:38:20 +0000
James Morse <[email protected]> wrote:
> acpi_processor_get_info() registers all present CPUs. Registering a
> CPU is what creates the sysfs entries and triggers the udev
> notifications.
>
> arm64 virtual machines that support 'virtual cpu hotplug' use the
> enabled bit to indicate whether the CPU can be brought online, as
> the existing ACPI tables require all hardware to be described and
> present.
>
> If firmware describes a CPU as present, but disabled, skip the
> registration. Such CPUs are present, but can't be brought online for
> whatever reason. (e.g. firmware/hypervisor policy).
>
> Once firmware sets the enabled bit, the CPU can be registered and
> brought online by user-space. Online CPUs, or CPUs that are missing
> an _STA method must always be registered.
>
> Signed-off-by: James Morse <[email protected]>
A small argument with myself inline. Feel free to ignore.
Reviewed-by: Jonathan Cameron <[email protected]>
> ---
> drivers/acpi/acpi_processor.c | 31 ++++++++++++++++++++++++++++++-
> 1 file changed, 30 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
> index b67616079751..b49859eab01a 100644
> --- a/drivers/acpi/acpi_processor.c
> +++ b/drivers/acpi/acpi_processor.c
> @@ -227,6 +227,32 @@ static int acpi_processor_make_present(struct acpi_processor *pr)
> return ret;
> }
>
> +static int acpi_processor_make_enabled(struct acpi_processor *pr)
> +{
> + unsigned long long sta;
> + acpi_status status;
> + bool present, enabled;
> +
> + if (!acpi_has_method(pr->handle, "_STA"))
> + return arch_register_cpu(pr->id);
> +
> + status = acpi_evaluate_integer(pr->handle, "_STA", NULL, &sta);
> + if (ACPI_FAILURE(status))
> + return -ENODEV;
> +
> + present = sta & ACPI_STA_DEVICE_PRESENT;
> + enabled = sta & ACPI_STA_DEVICE_ENABLED;
> +
> + if (cpu_online(pr->id) && (!present || !enabled)) {
> + pr_err_once(FW_BUG "CPU %u is online, but described as not present or disabled!\n", pr->id);
Why once? If this for some reason happened on multiple CPUs I think we'd want to know.
> + add_taint(TAINT_FIRMWARE_WORKAROUND, LOCKDEP_STILL_OK);
> + } else if (!present || !enabled) {
> + return -ENODEV;
> + }
I guess you didn't do a nested if here to avoid even longer lines.
Could flip things around though I don't like this much either as it makes
the normal good path exit mid way down.
	if (present && enabled)
		return arch_register_cpu(pr->id);
	if (!cpu_online(pr->id))
		return -ENODEV;
	pr_err...
	add_taint(...
	return arch_register_cpu(pr->id);
Ah well. Some code just has to be less than pretty.
> +
> + return arch_register_cpu(pr->id);
> +}
> +
> static int acpi_processor_get_info(struct acpi_device *device)
> {
> union acpi_object object = { 0 };
> @@ -318,7 +344,7 @@ static int acpi_processor_get_info(struct acpi_device *device)
> */
> if (!invalid_logical_cpuid(pr->id) && cpu_present(pr->id) &&
> !get_cpu_device(pr->id)) {
> - int ret = arch_register_cpu(pr->id);
> + int ret = acpi_processor_make_enabled(pr);
>
> if (ret)
> return ret;
> @@ -526,6 +552,9 @@ static void acpi_processor_post_eject(struct acpi_device *device)
> acpi_processor_make_not_present(device);
> return;
> }
> +
> + if (cpu_present(pr->id) && !(sta & ACPI_STA_DEVICE_ENABLED))
> + arch_unregister_cpu(pr->id);
> }
>
> #ifdef CONFIG_ARCH_MIGHT_HAVE_ACPI_PDC
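As a sanity check on the flipped ordering suggested inline above, here is a small standalone model of both variants. It is only a sketch: struct reg_result and the two decide_*() functions are invented for the comparison, with 'ret == 0' standing in for "call arch_register_cpu()" and 'fw_bug' standing in for the FW_BUG print plus taint.

```c
#include <errno.h>
#include <stdbool.h>

/* Outcome of the decision, standing in for the real side effects. */
struct reg_result {
	int ret;     /* 0 means "go on to register the CPU", else -errno */
	bool fw_bug; /* true where the patch prints FW_BUG and taints */
};

/* Control flow as posted in the patch. */
static struct reg_result decide_posted(bool online, bool present, bool enabled)
{
	struct reg_result r = { 0, false };

	if (online && (!present || !enabled))
		r.fw_bug = true;
	else if (!present || !enabled)
		r.ret = -ENODEV;
	return r;
}

/* Flipped ordering from the review comment. */
static struct reg_result decide_flipped(bool online, bool present, bool enabled)
{
	struct reg_result r = { 0, false };

	if (present && enabled)
		return r;
	if (!online) {
		r.ret = -ENODEV;
		return r;
	}
	r.fw_bug = true;
	return r;
}
```

Walking all eight combinations of online/present/enabled through both functions gives the same result, so the choice really is just about which shape reads better.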
On Wed, 13 Sep 2023 16:38:21 +0000
James Morse <[email protected]> wrote:
> Add a description of physical and virtual CPU hotplug, explain the
> differences and elaborate on what is required in ACPI for a working
> virtual hotplug system.
>
> Signed-off-by: James Morse <[email protected]>
> ---
> Documentation/arch/arm64/cpu-hotplug.rst | 79 ++++++++++++++++++++++++
> Documentation/arch/arm64/index.rst | 1 +
> 2 files changed, 80 insertions(+)
> create mode 100644 Documentation/arch/arm64/cpu-hotplug.rst
>
> diff --git a/Documentation/arch/arm64/cpu-hotplug.rst b/Documentation/arch/arm64/cpu-hotplug.rst
> new file mode 100644
> index 000000000000..76ba8d932c72
> --- /dev/null
> +++ b/Documentation/arch/arm64/cpu-hotplug.rst
> @@ -0,0 +1,79 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +.. _cpuhp_index:
> +
> +====================
> +CPU Hotplug and ACPI
> +====================
> +
> +CPU hotplug in the arm64 world is commonly used to describe the kernel taking
> +CPUs online/offline using PSCI. This document is about ACPI firmware allowing
> +CPUs that were not available during boot to be added to the system later.
> +
> +``possible`` and ``present`` refer to the state of the CPU as seen by linux.
> +
> +
> +CPU Hotplug on physical systems - CPUs not present at boot
> +----------------------------------------------------------
> +
> +Physical systems need to mark a CPU that is ``possible`` but not ``present`` as
> +being ``present``. An example would be a dual socket machine, where the package
> +in one of the sockets can be replaced while the system is running.
> +
> +This is not supported.
> +
> +In the arm64 world CPUs are not a single device but a slice of the system.
> +There are no systems that support the physical addition (or removal) of CPUs
> +while the system is running, and ACPI is not able to sufficiently describe
> +them.
> +
> +e.g. New CPUs come with new caches, but the platform's cache topology is
> +described in a static table, the PPTT. How caches are shared between CPUs is
> +not discoverable, and must be described by firmware.
> +
> +e.g. The GIC redistributor for each CPU must be accessed by the driver during
> +boot to discover the system wide supported features. ACPI's MADT GICC
> +structures can describe a redistributor associated with a disabled CPU, but
> +can't describe whether the redistributor is accessible, only that it is not
> +'always on'.
> +
> +arm64's ACPI tables assume that everything described is ``present``.
> +
> +
> +CPU Hotplug on virtual systems - CPUs not enabled at boot
> +---------------------------------------------------------
> +
> +Virtual systems have the advantage that all the properties the system will
> +ever have can be described at boot. There are no power-domain considerations
> +as such devices are emulated.
> +
> +CPU Hotplug on virtual systems is supported. It is distinct from physical
> +CPU Hotplug as all resources are described as ``present``, but CPUs may be
> +marked as disabled by firmware. Only the CPU's online/offline behaviour is
> +influenced by firmware. An example is where a virtual machine boots with a
> +single CPU, and additional CPUs are added once a cloud orchestrator deploys
> +the workload.
> +
> +For a virtual machine, the VMM (e.g. Qemu) plays the part of firmware.
> +
> +Virtual hotplug is implemented as a firmware policy affecting which CPUs can be
> +brought online. Firmware can enforce its policy via PSCI's return codes. e.g.
> +``DENIED``.
> +
> +The ACPI tables must describe all the resources of the virtual machine. CPUs
> +that firmware wishes to disable either from boot (or later) should not be
> +``enabled`` in the MADT GICC structures, but should have the ``online capable``
> +bit set, to indicate they can be enabled later. The boot CPU must be marked as
> +``enabled``. The 'always on' GICR structure must be used to describe the
> +redistributors.
Hi James,
I guess you know I'm going to comment on this given I got a bit fixated on it
at the Linaro Open Discussions call the other day.
This is the corner case that I think needs discussion. So far there is nothing
in the ACPI spec that says anything about unpluggability of CPUs, so I see this as a
Linux implementation choice, and I think it may be a problem for the cloud tenant
scalability use cases. The problem is legacy operating systems, some of which may have
a different interpretation of the ACPI spec unless we make sure it addresses this.
At VM startup, I want to provide a flexible number of CPUs, say 1 to 64,
but the customer has currently paid for 4, so I want to start them off with 4.
To make this model work I have to know if they are running a hotplug capable OS
and, even if the current OSC ACPI code first proposal goes forwards, I either have to
ask customers to tell me they support it, or boot to find out (relying on the OSC handshake
late in boot).
Code first proposal mentioned:
https://bugzilla.tianocore.org/show_bug.cgi?id=4481
If the guest doesn't support CPU hotplug I need to set enabled for the 4 CPUs. Once
booted I can use that OSC to discover if they can ever take advantage of hotplug
CPUs (arguably we could tweak that definition or add another to say they are fine
with me removing them as well).
If they do support CPU hotplug, maximum flexibility suggests I set enabled for CPU 0
and online capable for the next 3, and rely on the OS optimistically poking the
online capable ones to see if they are there at boot.
Of course one option is to stick with what you have here and treat it as customer
lock-in: they can only get bigger, not smaller. That might be acceptable, as might
the horrible approach of trying to hot unplug a CPU and getting no reply
(not a good user experience).
I probably haven't described that well.
Jonathan
> +
> +CPUs described as ``online capable`` but not ``enabled`` can be set to enabled
> +by the DSDT's Processor object's _STA method. On virtual systems the _STA method
> +must always report the CPU as ``present``. Changes to the firmware policy can
> +be notified to the OS via device-check or eject-request.
> +
> +CPUs described as ``enabled`` in the static table, should not have their _STA
> +modified dynamically by firmware. Soft-restart features such as kexec will
> +re-read the static properties of the system from these static tables, and
> +may malfunction if these no longer describe the running system. Linux will
> +re-discover the dynamic properties of the system from the _STA method later
> +during boot.
> diff --git a/Documentation/arch/arm64/index.rst b/Documentation/arch/arm64/index.rst
> index d08e924204bf..78544de0a8a9 100644
> --- a/Documentation/arch/arm64/index.rst
> +++ b/Documentation/arch/arm64/index.rst
> @@ -13,6 +13,7 @@ ARM64 Architecture
> asymmetric-32bit
> booting
> cpu-feature-registers
> + cpu-hotplug
> elf_hwcaps
> hugetlbpage
> kdump
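To make the scaling scenario from my comment above a bit more concrete, here is a rough sketch of the two boot-time table layouts a VMM might generate. Everything in it is an assumption for illustration: vmm_fill_gicc_flags() and vmm_cpu_allowed() are invented names, the flag values mirror the MADT bits from this series, and leaving the entries beyond the paid-for CPUs with no flags set in the legacy case is my guess at what a VMM would do, not something the spec or this series mandates.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MADT_ENABLED          (1u << 0)
#define MADT_GICC_CPU_CAPABLE (1u << 3)

/*
 * Hypothetical VMM-side helper: the first 'boot_enabled' vCPUs are marked
 * enabled; the rest are marked online capable if 'rest_capable' is true,
 * otherwise they get no flags (assumed to mean "ignore this record").
 */
static void vmm_fill_gicc_flags(uint32_t *flags, size_t max_cpus,
				size_t boot_enabled, bool rest_capable)
{
	size_t i;

	for (i = 0; i < max_cpus; i++) {
		if (i < boot_enabled)
			flags[i] = MADT_ENABLED;
		else
			flags[i] = rest_capable ? MADT_GICC_CPU_CAPABLE : 0;
	}
}

/*
 * Hypothetical run-time policy that would back the _STA enabled bit and
 * the PSCI DENIED answer: only the CPUs currently paid for are allowed.
 */
static bool vmm_cpu_allowed(size_t cpu, size_t paid)
{
	return cpu < paid;
}
```

For a hotplug-capable guest that would be vmm_fill_gicc_flags(flags, 64, 1, true) with the policy set to 4; for a legacy guest, vmm_fill_gicc_flags(flags, 64, 4, false). The catch is that the VMM has to pick one of the two before it knows which kind of guest it is booting.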
On Wed, 13 Sep 2023 16:38:23 +0000
James Morse <[email protected]> wrote:
> The 'offline' file in sysfs shows all offline CPUs, including those
> that aren't present. User-space is expected to remove not-present CPUs
> from this list to learn which CPUs could be brought online.
>
> CPUs can be present but not-enabled. These CPUs can't be brought online
> until the firmware policy changes, which comes with an ACPI notification
> that will register the CPUs.
>
> With only the offline and present files, user-space is unable to
> determine which CPUs it can try to bring online. Add a new CPU mask
> that shows this based on all the registered CPUs.
Bikeshed should be blue.
Enabled is a really confusing name for this - to the extent that I'm not sure
what it means. Assuming I have the sense right, how about the horrible
onlineable or online_capable?
>
> Signed-off-by: James Morse <[email protected]>
> ---
> drivers/base/cpu.c | 10 ++++++++++
> include/linux/cpumask.h | 25 +++++++++++++++++++++++++
> kernel/cpu.c | 3 +++
> 3 files changed, 38 insertions(+)
>
> diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c
> index c709747c4a18..a19a8be93102 100644
> --- a/drivers/base/cpu.c
> +++ b/drivers/base/cpu.c
> @@ -95,6 +95,7 @@ void unregister_cpu(struct cpu *cpu)
> {
> int logical_cpu = cpu->dev.id;
>
> + set_cpu_enabled(logical_cpu, false);
> unregister_cpu_under_node(logical_cpu, cpu_to_node(logical_cpu));
>
> device_unregister(&cpu->dev);
> @@ -273,6 +274,13 @@ static ssize_t print_cpus_offline(struct device *dev,
> }
> static DEVICE_ATTR(offline, 0444, print_cpus_offline, NULL);
>
> +static ssize_t print_cpus_enabled(struct device *dev,
> + struct device_attribute *attr, char *buf)
> +{
> + return sysfs_emit(buf, "%*pbl\n", cpumask_pr_args(cpu_enabled_mask));
> +}
> +static DEVICE_ATTR(enabled, 0444, print_cpus_enabled, NULL);
> +
> static ssize_t print_cpus_isolated(struct device *dev,
> struct device_attribute *attr, char *buf)
> {
> @@ -413,6 +421,7 @@ int register_cpu(struct cpu *cpu, int num)
> register_cpu_under_node(num, cpu_to_node(num));
> dev_pm_qos_expose_latency_limit(&cpu->dev,
> PM_QOS_RESUME_LATENCY_NO_CONSTRAINT);
> + set_cpu_enabled(num, true);
>
> return 0;
> }
> @@ -494,6 +503,7 @@ static struct attribute *cpu_root_attrs[] = {
> &cpu_attrs[2].attr.attr,
> &dev_attr_kernel_max.attr,
> &dev_attr_offline.attr,
> + &dev_attr_enabled.attr,
> &dev_attr_isolated.attr,
> #ifdef CONFIG_NO_HZ_FULL
> &dev_attr_nohz_full.attr,
> diff --git a/include/linux/cpumask.h b/include/linux/cpumask.h
> index f10fb87d49db..a29ee03f13ff 100644
> --- a/include/linux/cpumask.h
> +++ b/include/linux/cpumask.h
> @@ -92,6 +92,7 @@ static inline void set_nr_cpu_ids(unsigned int nr)
> *
> * cpu_possible_mask- has bit 'cpu' set iff cpu is populatable
> * cpu_present_mask - has bit 'cpu' set iff cpu is populated
> + * cpu_enabled_mask - has bit 'cpu' set iff cpu can be brought online
> * cpu_online_mask - has bit 'cpu' set iff cpu available to scheduler
> * cpu_active_mask - has bit 'cpu' set iff cpu available to migration
> *
> @@ -124,11 +125,13 @@ static inline void set_nr_cpu_ids(unsigned int nr)
>
> extern struct cpumask __cpu_possible_mask;
> extern struct cpumask __cpu_online_mask;
> +extern struct cpumask __cpu_enabled_mask;
> extern struct cpumask __cpu_present_mask;
> extern struct cpumask __cpu_active_mask;
> extern struct cpumask __cpu_dying_mask;
> #define cpu_possible_mask ((const struct cpumask *)&__cpu_possible_mask)
> #define cpu_online_mask ((const struct cpumask *)&__cpu_online_mask)
> +#define cpu_enabled_mask ((const struct cpumask *)&__cpu_enabled_mask)
> #define cpu_present_mask ((const struct cpumask *)&__cpu_present_mask)
> #define cpu_active_mask ((const struct cpumask *)&__cpu_active_mask)
> #define cpu_dying_mask ((const struct cpumask *)&__cpu_dying_mask)
> @@ -973,6 +976,7 @@ extern const DECLARE_BITMAP(cpu_all_bits, NR_CPUS);
> #else
> #define for_each_possible_cpu(cpu) for_each_cpu((cpu), cpu_possible_mask)
> #define for_each_online_cpu(cpu) for_each_cpu((cpu), cpu_online_mask)
> +#define for_each_enabled_cpu(cpu) for_each_cpu((cpu), cpu_enabled_mask)
> #define for_each_present_cpu(cpu) for_each_cpu((cpu), cpu_present_mask)
> #endif
>
> @@ -995,6 +999,15 @@ set_cpu_possible(unsigned int cpu, bool possible)
> cpumask_clear_cpu(cpu, &__cpu_possible_mask);
> }
>
> +static inline void
> +set_cpu_enabled(unsigned int cpu, bool can_be_onlined)
> +{
> + if (can_be_onlined)
> + cpumask_set_cpu(cpu, &__cpu_enabled_mask);
> + else
> + cpumask_clear_cpu(cpu, &__cpu_enabled_mask);
> +}
> +
> static inline void
> set_cpu_present(unsigned int cpu, bool present)
> {
> @@ -1074,6 +1087,7 @@ static __always_inline unsigned int num_online_cpus(void)
> return raw_atomic_read(&__num_online_cpus);
> }
> #define num_possible_cpus() cpumask_weight(cpu_possible_mask)
> +#define num_enabled_cpus() cpumask_weight(cpu_enabled_mask)
> #define num_present_cpus() cpumask_weight(cpu_present_mask)
> #define num_active_cpus() cpumask_weight(cpu_active_mask)
>
> @@ -1082,6 +1096,11 @@ static inline bool cpu_online(unsigned int cpu)
> return cpumask_test_cpu(cpu, cpu_online_mask);
> }
>
> +static inline bool cpu_enabled(unsigned int cpu)
> +{
> + return cpumask_test_cpu(cpu, cpu_enabled_mask);
> +}
> +
> static inline bool cpu_possible(unsigned int cpu)
> {
> return cpumask_test_cpu(cpu, cpu_possible_mask);
> @@ -1106,6 +1125,7 @@ static inline bool cpu_dying(unsigned int cpu)
>
> #define num_online_cpus() 1U
> #define num_possible_cpus() 1U
> +#define num_enabled_cpus() 1U
> #define num_present_cpus() 1U
> #define num_active_cpus() 1U
>
> @@ -1119,6 +1139,11 @@ static inline bool cpu_possible(unsigned int cpu)
> return cpu == 0;
> }
>
> +static inline bool cpu_enabled(unsigned int cpu)
> +{
> + return cpu == 0;
> +}
> +
> static inline bool cpu_present(unsigned int cpu)
> {
> return cpu == 0;
> diff --git a/kernel/cpu.c b/kernel/cpu.c
> index 6de7c6bb74ee..2201a6a449b5 100644
> --- a/kernel/cpu.c
> +++ b/kernel/cpu.c
> @@ -3101,6 +3101,9 @@ EXPORT_SYMBOL(__cpu_possible_mask);
> struct cpumask __cpu_online_mask __read_mostly;
> EXPORT_SYMBOL(__cpu_online_mask);
>
> +struct cpumask __cpu_enabled_mask __read_mostly;
> +EXPORT_SYMBOL(__cpu_enabled_mask);
> +
> struct cpumask __cpu_present_mask __read_mostly;
> EXPORT_SYMBOL(__cpu_present_mask);
>
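On the user-space side, the point of the new file is that "enabled minus online" becomes the set worth trying to bring online. A minimal sketch of that, assuming the new attribute shows up next to the existing ones as /sys/devices/system/cpu/enabled and uses the usual cpulist format: parse_cpulist() and onlineable() are illustrative helpers written for this mail, and the parser is limited to CPUs 0..63 to keep it short.

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Parse a sysfs cpulist string such as "0-3,8" (the "%*pbl" format) into a
 * bitmask. Returns 0 on success and -1 on malformed or out-of-range input.
 */
static int parse_cpulist(const char *s, uint64_t *mask)
{
	*mask = 0;
	while (*s && *s != '\n') {
		char *end;
		unsigned long lo = strtoul(s, &end, 10), hi;

		if (end == s)
			return -1;
		hi = lo;
		if (*end == '-') {
			/* Range "lo-hi": parse the upper bound too. */
			s = end + 1;
			hi = strtoul(s, &end, 10);
			if (end == s)
				return -1;
		}
		if (lo > hi || hi > 63)
			return -1;
		for (; lo <= hi; lo++)
			*mask |= UINT64_C(1) << lo;
		/* Accept a comma, a newline or the end of the string next. */
		s = (*end == ',') ? end + 1 : end;
		if (s == end && *s && *s != '\n')
			return -1;
	}
	return 0;
}

/* CPUs user-space could try to online: registered but not yet online. */
static uint64_t onlineable(uint64_t enabled, uint64_t online)
{
	return enabled & ~online;
}
```

A caller would read the 'enabled' and 'online' files, run each through parse_cpulist() and pass the two masks to onlineable(); with only 'offline' and 'present' there is no equivalent calculation.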
On Wed, 13 Sep 2023 16:38:22 +0000
James Morse <[email protected]> wrote:
> Platform firmware can disable a CPU, or make it not-present, by making
> an eject-request notification, then waiting for the OS to make it offline
> and call _EJx. Afterwards the firmware updates _STA with the new status.
>
> Not all operating systems support this. For arm64 making CPUs not-present
> has never been supported. For all ACPI architectures, making CPUs disabled
> has recently been added. Firmware can't know what the OS has support for.
>
> Add two new _OSC bits to advertise whether the OS supports the _STA enabled
> or present bits being toggled for CPUs. This will be important for arm64
> if systems that support physical CPU hotplug ever appear as arm64 linux
> doesn't currently support this, so firmware shouldn't try.
I'm not sure I like enabling this for all architectures though I guess
everyone will ignore it on those that have long supported
changing the enabled bit. The hypervisors won't care if Linux claims
to support it or not. I can see the argument for architectures that might
support it in the future.
I need to think a bit more about this, but maybe just having the online
capable bit OSC is safer in general. I guess it depends on whether there
are hypervisors out there implementing the x86 version of that even though
no one has yet posted patches for Linux.
Perhaps we just call these out as hints that we 'definitely' support them.
Otherwise we might for some architectures so poke it anyway.
OSC is late in boot, so what advantage is there in preventing it working?
We can't change any of the bring up / sizing etc as a result so might as
well let it through.
>
> Advertising this support to firmware is useful for cloud orchestrators
> to know whether they can scale a particular VM by adding CPUs.
>
> Signed-off-by: James Morse <[email protected]>
> ---
> I'm assuming ia64 with physical hotplug machines once existed, and
> that Loongarch machines with support for this don't.
> ---
> arch/ia64/Kconfig | 1 +
> arch/x86/Kconfig | 1 +
> drivers/acpi/Kconfig | 9 +++++++++
> drivers/acpi/acpi_processor.c | 14 +++++++++++++-
> drivers/acpi/bus.c | 16 ++++++++++++++++
> include/linux/acpi.h | 4 ++++
> 6 files changed, 44 insertions(+), 1 deletion(-)
>
> diff --git a/arch/ia64/Kconfig b/arch/ia64/Kconfig
> index 54972f9fe804..13df676bad67 100644
> --- a/arch/ia64/Kconfig
> +++ b/arch/ia64/Kconfig
> @@ -17,6 +17,7 @@ config IA64
> select ARCH_MIGHT_HAVE_PC_SERIO
> select ACPI
> select ACPI_HOTPLUG_PRESENT_CPU if ACPI_PROCESSOR && HOTPLUG_CPU
> + select ACPI_HOTPLUG_IGNORE_OSC if ACPI
> select ACPI_NUMA if NUMA
> select ARCH_ENABLE_MEMORY_HOTPLUG
> select ARCH_ENABLE_MEMORY_HOTREMOVE
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index 295a7a3debb6..5fea3ce9594e 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -61,6 +61,7 @@ config X86
> select ACPI_LEGACY_TABLES_LOOKUP if ACPI
> select ACPI_SYSTEM_POWER_STATES_SUPPORT if ACPI
> select ACPI_HOTPLUG_PRESENT_CPU if ACPI_PROCESSOR && HOTPLUG_CPU
> + select ACPI_HOTPLUG_IGNORE_OSC if ACPI && HOTPLUG_CPU
> select ARCH_32BIT_OFF_T if X86_32
> select ARCH_CLOCKSOURCE_INIT
> select ARCH_CORRECT_STACKTRACE_ON_KRETPROBE
> diff --git a/drivers/acpi/Kconfig b/drivers/acpi/Kconfig
> index 417f9f3077d2..c49978b4b11f 100644
> --- a/drivers/acpi/Kconfig
> +++ b/drivers/acpi/Kconfig
> @@ -310,6 +310,15 @@ config ACPI_HOTPLUG_PRESENT_CPU
> depends on ACPI_PROCESSOR && HOTPLUG_CPU
> select ACPI_CONTAINER
>
> +config ACPI_HOTPLUG_IGNORE_OSC
> + bool
> + depends on ACPI_HOTPLUG_PRESENT_CPU
> + help
> + Ignore whether firmware acknowledged support for toggling the CPU
> + present bit in _STA. Some architectures predate the _OSC bits, so
> + firmware doesn't know to do this.
> +
> +
> config ACPI_PROCESSOR_AGGREGATOR
> tristate "Processor Aggregator"
> depends on ACPI_PROCESSOR
> diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
> index b49859eab01a..87926f22c857 100644
> --- a/drivers/acpi/acpi_processor.c
> +++ b/drivers/acpi/acpi_processor.c
> @@ -181,6 +181,18 @@ static void __init acpi_pcc_cpufreq_init(void)
> static void __init acpi_pcc_cpufreq_init(void) {}
> #endif /* CONFIG_X86 */
>
> +static bool acpi_processor_hotplug_present_supported(void)
> +{
> + if (!IS_ENABLED(CONFIG_ACPI_HOTPLUG_PRESENT_CPU))
> + return false;
> +
> + /* x86 systems pre-date the _OSC bit */
> + if (IS_ENABLED(CONFIG_ACPI_HOTPLUG_IGNORE_OSC))
> + return true;
> +
> + return osc_sb_hotplug_present_support_acked;
> +}
> +
> /* Initialization */
> static int acpi_processor_make_present(struct acpi_processor *pr)
> {
> @@ -188,7 +200,7 @@ static int acpi_processor_make_present(struct acpi_processor *pr)
> acpi_status status;
> int ret;
>
> - if (!IS_ENABLED(CONFIG_ACPI_HOTPLUG_PRESENT_CPU)) {
> + if (!acpi_processor_hotplug_present_supported()) {
I don't see the advantage of blocking on basis of what the firmware said.
It was clearly lying or didn't understand the question ;)
> pr_err_once("Changing CPU present bit is not supported\n");
> return -ENODEV;
> }
> diff --git a/drivers/acpi/bus.c b/drivers/acpi/bus.c
> index f41dda2d3493..123c28c2eda3 100644
> --- a/drivers/acpi/bus.c
> +++ b/drivers/acpi/bus.c
> @@ -298,6 +298,13 @@ EXPORT_SYMBOL_GPL(osc_sb_native_usb4_support_confirmed);
>
> bool osc_sb_cppc2_support_acked;
>
> +/*
> + * ACPI 6.? Proposed Operating System Capabilities for modifying CPU
> + * present/enable.
> + */
> +bool osc_sb_hotplug_enabled_support_acked;
> +bool osc_sb_hotplug_present_support_acked;
> +
> static u8 sb_uuid_str[] = "0811B06E-4A27-44F9-8D60-3CBBC22E7B48";
> static void acpi_bus_osc_negotiate_platform_control(void)
> {
> @@ -346,6 +353,11 @@ static void acpi_bus_osc_negotiate_platform_control(void)
>
> if (!ghes_disable)
> capbuf[OSC_SUPPORT_DWORD] |= OSC_SB_APEI_SUPPORT;
> +
> + capbuf[OSC_SUPPORT_DWORD] |= OSC_SB_HOTPLUG_ENABLED_SUPPORT;
> + if (IS_ENABLED(CONFIG_ACPI_HOTPLUG_PRESENT_CPU))
> + capbuf[OSC_SUPPORT_DWORD] |= OSC_SB_HOTPLUG_PRESENT_SUPPORT;
> +
> if (ACPI_FAILURE(acpi_get_handle(NULL, "\\_SB", &handle)))
> return;
>
> @@ -383,6 +395,10 @@ static void acpi_bus_osc_negotiate_platform_control(void)
> capbuf_ret[OSC_SUPPORT_DWORD] & OSC_SB_NATIVE_USB4_SUPPORT;
> osc_cpc_flexible_adr_space_confirmed =
> capbuf_ret[OSC_SUPPORT_DWORD] & OSC_SB_CPC_FLEXIBLE_ADR_SPACE;
> + osc_sb_hotplug_enabled_support_acked =
> + capbuf_ret[OSC_SUPPORT_DWORD] & OSC_SB_HOTPLUG_ENABLED_SUPPORT;
> + osc_sb_hotplug_present_support_acked =
> + capbuf_ret[OSC_SUPPORT_DWORD] & OSC_SB_HOTPLUG_PRESENT_SUPPORT;
> }
>
> kfree(context.ret.pointer);
> diff --git a/include/linux/acpi.h b/include/linux/acpi.h
> index 92cb25349a18..2ba7e0b10bcf 100644
> --- a/include/linux/acpi.h
> +++ b/include/linux/acpi.h
> @@ -580,12 +580,16 @@ acpi_status acpi_run_osc(acpi_handle handle, struct acpi_osc_context *context);
> #define OSC_SB_NATIVE_USB4_SUPPORT 0x00040000
> #define OSC_SB_PRM_SUPPORT 0x00200000
> #define OSC_SB_FFH_OPR_SUPPORT 0x00400000
> +#define OSC_SB_HOTPLUG_ENABLED_SUPPORT 0x00800000
> +#define OSC_SB_HOTPLUG_PRESENT_SUPPORT 0x01000000
>
> extern bool osc_sb_apei_support_acked;
> extern bool osc_pc_lpi_support_confirmed;
> extern bool osc_sb_native_usb4_support_confirmed;
> extern bool osc_sb_cppc2_support_acked;
> extern bool osc_cpc_flexible_adr_space_confirmed;
> +extern bool osc_sb_hotplug_enabled_support_acked;
> +extern bool osc_sb_hotplug_present_support_acked;
>
> /* USB4 Capabilities */
> #define OSC_USB_USB3_TUNNELING 0x00000001
On Wed, 13 Sep 2023 16:38:15 +0000
James Morse <[email protected]> wrote:
> Add the new flag field to the MADT's GICC structure.
>
> 'Online Capable' indicates a disabled CPU can be enabled later.
>
> Signed-off-by: James Morse <[email protected]>
Why [code first?] It's in ACPI 6.5:
https://uefi.org/sites/default/files/resources/ACPI_Spec_6_5_Aug29.pdf
A spec reference would be good though. I think it's ACPI 6.5
Table 5.37: GICC CPU Interface Flags.
> ---
> This patch probably needs to go via the upstream acpica project,
> but is included here so the feature can be testd.
tested
> ---
> include/acpi/actbl2.h | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/include/acpi/actbl2.h b/include/acpi/actbl2.h
> index 3751ae69432f..c433a079d8e1 100644
> --- a/include/acpi/actbl2.h
> +++ b/include/acpi/actbl2.h
> @@ -1046,6 +1046,7 @@ struct acpi_madt_generic_interrupt {
> /* ACPI_MADT_ENABLED (1) Processor is usable if set */
> #define ACPI_MADT_PERFORMANCE_IRQ_MODE (1<<1) /* 01: Performance Interrupt Mode */
> #define ACPI_MADT_VGIC_IRQ_MODE (1<<2) /* 02: VGIC Maintenance Interrupt mode */
> +#define ACPI_MADT_GICC_CPU_CAPABLE (1<<3) /* 03: CPU is online capable */
bikeshed colour time....
It's capable of being a CPU?
ACPI_MADT_GICC_ONLINE_CAPABLE
GICC already tells us it's a CPU (last C) despite the table in ACPI being labeled
Table 5.37: GICC CPU Interface table
>
> /* 12: Generic Distributor (ACPI 5.0 + ACPI 6.0 changes) */
>
On Wed, Sep 13, 2023 at 04:37:56PM +0000, James Morse wrote:
> Now that GENERIC_CPU_DEVICES calls arch_register_cpu(), which can be
> overridden by the arch code, switch over to this to allow common code
> to choose when the register_cpu() call is made.
>
> x86's struct cpus come from struct x86_cpu, which has no other members
> or users. Remove this and use the version defined by common code.
>
> This is an intermediate step to the logic being moved to drivers/acpi,
> where GENERIC_CPU_DEVICES will do the work when booting with acpi=off.
I think it should also be noted that this moves the registration of
CPUs from subsys to driver core initialisation (before any other
initcalls are run.)
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
On Wed, Sep 13, 2023 at 04:37:57PM +0000, James Morse wrote:
> Now that GENERIC_CPU_DEVICES calls arch_register_cpu(), which can be
> overridden by the arch code, switch over to this to allow common code
> to choose when the register_cpu() call is made.
>
> This allows topology_init() to be removed.
>
> This is an intermediate step to the logic being moved to drivers/acpi,
> where GENERIC_CPU_DEVICES will do the work when booting with acpi=off.
>
> Signed-off-by: James Morse <[email protected]>
Same comment as x86 (moving the point at which cpus are registered
ought to be mentioned in the commit message.)
On Wed, 13 Sep 2023 16:37:57 +0000
James Morse <[email protected]> wrote:
> Now that GENERIC_CPU_DEVICES calls arch_register_cpu(), which can be
> overridden by the arch code, switch over to this to allow common code
> to choose when the register_cpu() call is made.
>
> This allows topology_init() to be removed.
>
> This is an intermediate step to the logic being moved to drivers/acpi,
> where GENERIC_CPU_DEVICES will do the work when booting with acpi=off.
>
> Signed-off-by: James Morse <[email protected]>
> ---
> arch/loongarch/Kconfig | 1 +
> arch/loongarch/kernel/topology.c | 29 ++---------------------------
> 2 files changed, 3 insertions(+), 27 deletions(-)
>
> diff --git a/arch/loongarch/Kconfig b/arch/loongarch/Kconfig
> index 2bddd202470e..5bed51adc68c 100644
> --- a/arch/loongarch/Kconfig
> +++ b/arch/loongarch/Kconfig
> @@ -72,6 +72,7 @@ config LOONGARCH
> select GENERIC_CLOCKEVENTS
> select GENERIC_CMOS_UPDATE
> select GENERIC_CPU_AUTOPROBE
> + select GENERIC_CPU_DEVICES
> select GENERIC_ENTRY
> select GENERIC_GETTIMEOFDAY
> select GENERIC_IOREMAP if !ARCH_IOREMAP
> diff --git a/arch/loongarch/kernel/topology.c b/arch/loongarch/kernel/topology.c
> index caa7cd859078..8e4441c1ff39 100644
> --- a/arch/loongarch/kernel/topology.c
> +++ b/arch/loongarch/kernel/topology.c
> @@ -7,20 +7,13 @@
> #include <linux/percpu.h>
> #include <asm/bootinfo.h>
>
> -static DEFINE_PER_CPU(struct cpu, cpu_devices);
> -
> #ifdef CONFIG_HOTPLUG_CPU
> int arch_register_cpu(int cpu)
> {
> - int ret;
> struct cpu *c = &per_cpu(cpu_devices, cpu);
>
> - c->hotpluggable = 1;
This is a bit subtle. Can loongarch hotplug a CPU that
is also io_master(cpu)? I have no idea if there is a subtle difference
between:
1) CPUs present at boot, which are not allowed to be hot removed if they
are an io_master.
2) CPUs that turn up (hotplugged) later, which are an io_master and, by the
original code, can be removed.
My guess is that no io_master CPU can be hotplugged in, making this irrelevant
and your code correct, as the =1 is just a micro optimization.
If we can confirm that, a one line addition to the patch description would be
great.
Otherwise LGTM
> - ret = register_cpu(c, cpu);
> - if (ret < 0)
> - pr_warn("register_cpu %d failed (%d)\n", cpu, ret);
> -
> - return ret;
> + c->hotpluggable = !io_master(cpu);
> + return register_cpu(c, cpu);
> }
> EXPORT_SYMBOL(arch_register_cpu);
>
> @@ -33,21 +26,3 @@ void arch_unregister_cpu(int cpu)
> }
> EXPORT_SYMBOL(arch_unregister_cpu);
> #endif
> -
> -static int __init topology_init(void)
> -{
> - int i, ret;
> -
> - for_each_present_cpu(i) {
> - struct cpu *c = &per_cpu(cpu_devices, i);
> -
> - c->hotpluggable = !io_master(i);
> - ret = register_cpu(c, i);
> - if (ret < 0)
> - pr_warn("topology_init: register_cpu %d failed (%d)\n", i, ret);
> - }
> -
> - return 0;
> -}
> -
> -subsys_initcall(topology_init);
On Wed, 13 Sep 2023 16:38:03 +0000
James Morse <[email protected]> wrote:
> ACPI has two ways of describing processors in the DSDT. Either as a device
> object with HID ACPI0007, or as a type 'C' package inside a Processor
> Container. The ACPI processor driver probes CPUs described as devices, but
> not those described as packages.
>
Specification reference needed...
Terminology wise, I'd just refer to Processor() objects as I think they
are named objects rather than data terms like a package (Which include
a PkgLength etc)
> Duplicate descriptions are not allowed, the ACPI processor driver already
> parses the UID from both devices and containers. acpi_processor_get_info()
> returns an error if the UID exists twice in the DSDT.
>
> The missing probe for CPUs described as packages creates a problem for
> moving the cpu_register() calls into the acpi_processor driver, as CPUs
> described like this don't get registered, leading to errors from other
> subsystems when they try to add new sysfs entries to the CPU node.
> (e.g. topology_sysfs_init()'s use of topology_add_dev() via cpuhp)
>
> To fix this, parse the processor container and call acpi_processor_add()
> for each processor that is discovered like this. The processor container
> handler is added with acpi_scan_add_handler(), so no detach call will
> arrive.
>
> Qemu TCG describes CPUs using packages in a processor container.
processor terms in a processor container.
>
> Signed-off-by: James Morse <[email protected]>
Otherwise looks fine to me.
Jonathan
> ---
> drivers/acpi/acpi_processor.c | 22 ++++++++++++++++++++++
> 1 file changed, 22 insertions(+)
>
> diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
> index c0839bcf78c1..b4bde78121bb 100644
> --- a/drivers/acpi/acpi_processor.c
> +++ b/drivers/acpi/acpi_processor.c
> @@ -625,9 +625,31 @@ static struct acpi_scan_handler processor_handler = {
> },
> };
>
> +static acpi_status acpi_processor_container_walk(acpi_handle handle,
> + u32 lvl,
> + void *context,
> + void **rv)
> +{
> + struct acpi_device *adev;
> + acpi_status status;
> +
> + adev = acpi_get_acpi_dev(handle);
> + if (!adev)
> + return AE_ERROR;
> +
> + status = acpi_processor_add(adev, &processor_device_ids[0]);
> + acpi_put_acpi_dev(adev);
> +
> + return status;
> +}
> +
> static int acpi_processor_container_attach(struct acpi_device *dev,
> const struct acpi_device_id *id)
> {
> + acpi_walk_namespace(ACPI_TYPE_PROCESSOR, dev->handle,
> + ACPI_UINT32_MAX, acpi_processor_container_walk,
> + NULL, NULL, NULL);
> +
> return 1;
> }
>
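For anyone following along, the shape of the walk in the hunk above can be
modelled in plain C. This is a userspace sketch under my own assumptions, not
ACPICA: struct ns_node, ns_walk_processors() and count_processor() are made-up
names, and stopping at the first non-zero return is a modelling choice here.

```c
#include <stddef.h>

enum ns_type { NS_DEVICE, NS_PROCESSOR };

/* Toy namespace node: a type plus an array of children. */
struct ns_node {
	enum ns_type type;
	const struct ns_node *children;
	size_t nr_children;
};

typedef int (*ns_visit_fn)(const struct ns_node *node, void *ctx);

/*
 * Depth-first walk below root, calling fn for every processor-type
 * node. Stops and returns the first non-zero value fn returns.
 */
static int ns_walk_processors(const struct ns_node *root, ns_visit_fn fn,
			      void *ctx)
{
	for (size_t i = 0; i < root->nr_children; i++) {
		const struct ns_node *child = &root->children[i];
		int ret;

		if (child->type == NS_PROCESSOR) {
			ret = fn(child, ctx);
			if (ret)
				return ret;
		}
		ret = ns_walk_processors(child, fn, ctx);
		if (ret)
			return ret;
	}
	return 0;
}

/* Example visitor: count the processors found under the container. */
static int count_processor(const struct ns_node *node, void *ctx)
{
	(void)node;
	(*(unsigned int *)ctx)++;
	return 0;
}
```

In the real patch the visitor role is played by
acpi_processor_container_walk(), which calls acpi_processor_add().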
On Wed, 13 Sep 2023 16:38:12 +0000
James Morse <[email protected]> wrote:
> Add arch_unregister_cpu() to allow the ACPI machinery to call
> unregister_cpu(). This is enough for arm64, riscv and loongarch, but
> needs to be overridden by x86 and ia64 who need to do more work.
>
> CC: Jean-Philippe Brucker <[email protected]>
> Signed-off-by: James Morse <[email protected]>
Ah. Was thinking this should happen in an earlier patch.
Reviewed-by: Jonathan Cameron <[email protected]>
> ---
> Changes since v1:
> * Added CONFIG_HOTPLUG_CPU ifdeffery around unregister_cpu
> ---
> arch/ia64/include/asm/cpu.h | 4 ----
> arch/loongarch/include/asm/cpu.h | 6 ------
> arch/x86/include/asm/cpu.h | 1 -
> drivers/base/cpu.c | 9 ++++++++-
> 4 files changed, 8 insertions(+), 12 deletions(-)
>
> diff --git a/arch/ia64/include/asm/cpu.h b/arch/ia64/include/asm/cpu.h
> index a3e690e685e5..642d71675ddb 100644
> --- a/arch/ia64/include/asm/cpu.h
> +++ b/arch/ia64/include/asm/cpu.h
> @@ -15,8 +15,4 @@ DECLARE_PER_CPU(struct ia64_cpu, cpu_devices);
>
> DECLARE_PER_CPU(int, cpu_state);
>
> -#ifdef CONFIG_HOTPLUG_CPU
> -extern void arch_unregister_cpu(int);
> -#endif
> -
> #endif /* _ASM_IA64_CPU_H_ */
> diff --git a/arch/loongarch/include/asm/cpu.h b/arch/loongarch/include/asm/cpu.h
> index b8568e637420..48b9f7168bcc 100644
> --- a/arch/loongarch/include/asm/cpu.h
> +++ b/arch/loongarch/include/asm/cpu.h
> @@ -128,10 +128,4 @@ enum cpu_type_enum {
> #define LOONGARCH_CPU_HYPERVISOR BIT_ULL(CPU_FEATURE_HYPERVISOR)
> #define LOONGARCH_CPU_PTW BIT_ULL(CPU_FEATURE_PTW)
>
> -#if !defined(__ASSEMBLY__)
> -#ifdef CONFIG_HOTPLUG_CPU
> -void arch_unregister_cpu(int cpu);
> -#endif
> -#endif /* ! __ASSEMBLY__ */
> -
> #endif /* _ASM_CPU_H */
> diff --git a/arch/x86/include/asm/cpu.h b/arch/x86/include/asm/cpu.h
> index f349c94510e8..91867a6a9f8e 100644
> --- a/arch/x86/include/asm/cpu.h
> +++ b/arch/x86/include/asm/cpu.h
> @@ -24,7 +24,6 @@ static inline void prefill_possible_map(void) {}
> #endif /* CONFIG_SMP */
>
> #ifdef CONFIG_HOTPLUG_CPU
> -extern void arch_unregister_cpu(int);
> extern void soft_restart_cpu(void);
> #endif
>
> diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c
> index 677f963e02ce..c709747c4a18 100644
> --- a/drivers/base/cpu.c
> +++ b/drivers/base/cpu.c
> @@ -531,7 +531,14 @@ int __weak arch_register_cpu(int cpu)
> {
> return register_cpu(&per_cpu(cpu_devices, cpu), cpu);
> }
> -#endif
> +
> +#ifdef CONFIG_HOTPLUG_CPU
> +void __weak arch_unregister_cpu(int num)
> +{
> + unregister_cpu(&per_cpu(cpu_devices, num));
> +}
> +#endif /* CONFIG_HOTPLUG_CPU */
> +#endif /* CONFIG_GENERIC_CPU_DEVICES */
>
> static void __init cpu_dev_register_generic(void)
> {
Hi Ard,
> From: Ard Biesheuvel <[email protected]>
> Sent: Thursday, September 14, 2023 4:34 PM
> To: Jonathan Cameron <[email protected]>
> Cc: James Morse <[email protected]>; [email protected];
> [email protected]; [email protected]; linux-
> [email protected]; [email protected]; linux-arm-
> [email protected]; [email protected];
> [email protected]; [email protected]; Salil Mehta
> <[email protected]>; Russell King <[email protected]>; Jean-
> Philippe Brucker <[email protected]>; [email protected];
> [email protected]
> Subject: Re: [RFC PATCH v2 27/35] ACPICA: Add new MADT GICC flags fields
> [code first?]
>
> On Thu, 14 Sept 2023 at 16:55, Jonathan Cameron
> <[email protected]> wrote:
> >
> > On Thu, 14 Sep 2023 09:57:44 +0200
> > Ard Biesheuvel <[email protected]> wrote:
> >
> > > Hello James,
> > >
> > > On Wed, 13 Sept 2023 at 18:41, James Morse <[email protected]> wrote:
> > > >
> > > > Add the new flag field to the MADT's GICC structure.
> > > >
> > > > 'Online Capable' indicates a disabled CPU can be enabled later.
> > > >
> > >
> > > Why do we need a bit for this? What would be the point of describing
> > > disabled CPUs that cannot be enabled (and are you are aware of
> > > firmware doing this?).
> >
> > Enabled being not set is common at some similar ACPI tables at least.
> >
> > This is available in most ACPI tables to allow firmware to use 'nearly'
> > static tables and just tweak the 'enabled' bit to say if the record should
> > be ignored or not. Also _STA not present which is for same trick.
> > If you are doing clever dynamic tables, then you can just not present
> > the entry.
> >
> > With that existing use case in mind, need another bit to say this
> > one might one day turn up. Note this is copied from x86 though no
> > one seems to have implemented the kernel support for them yet.
> >
> > Note as per my other reply - this isn't a code first proposal. It's in the
> > spec already (via a code first proposal last year I think).
> >
> > >
> > > So why are we not able to assume that this new bit can always be treated as '1'?
> >
> > Given above, need the extra bit to size stuff to allow for the CPU showing up
> > late.
> >
>
> So does this mean that on x86, the CPU object is instantiated only
> when the hardware level hotplug occurs? And before that, the object
> does not exist at all?
That is correct, but I am not sure that hardware hotplug really exists
on x86. It is all hidden behind firmware magic (I think), so
x86 is able to use the same infrastructure for both virtual and physical
CPU hotplug.
From ACPI 6.3 onwards, x86 has started to use the online-capable bit for
the local x2APIC in the MADT:
https://lore.kernel.org/lkml/168016878002.404.5262105401164408214.tip-bot2@tip-bot2/
https://lore.kernel.org/lkml/168016878085.404.6003734700616193238.tip-bot2@tip-bot2/
But there is a subtle difference in the way it is being used on x86
and on the ARM platform right now.
On x86, during init, if the MADT entry for LAPIC is found to be
online-capable and is enabled as well then possible and present
cpumask bits get set and a logical cpu-id is also allocated. If the
MADT entry is online-capable but not enabled, then the disabled CPUs
are still counted, but no logical cpu-id is allocated at
init time, and setting the present mask bits is
deferred until hotplug happens later.
static int acpi_register_lapic(int id, u32 acpiid, u8 enabled)
{
[...]
if (!enabled) { /* Not ACPI_MADT_ENABLED */
++disabled_cpus;
return -EINVAL;
}
[...]
cpu = generic_processor_info(id, ver); /* logical cpu-id, present mask */
[...]
return cpu;
}
static int __init acpi_parse_x2apic(union acpi_subtable_headers *header, const unsigned long end)
{
struct acpi_madt_local_x2apic *processor = NULL;
processor = (struct acpi_madt_local_x2apic *)header;
[...]
enabled = processor->lapic_flags & ACPI_MADT_ENABLED;
[...]
/* don't register processors that cannot be onlined */
if (!acpi_is_processor_usable(processor->lapic_flags))
return 0;
[...]
acpi_register_lapic(apic_id, processor->uid, enabled);
return 0;
}
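The three outcomes in the excerpt above can be summarised with a small
userspace model. It is a simplification of the excerpt only, not the kernel
code: struct lapic_accounting and account_lapic_entry() are made-up names.

```c
#include <stdbool.h>

/* Userspace model only; field and function names are hypothetical. */
struct lapic_accounting {
	unsigned int logical_ids;   /* given a logical cpu-id at init */
	unsigned int disabled_cpus; /* counted, no cpu-id until hotplug */
	unsigned int skipped;       /* can never be onlined, not registered */
};

static void account_lapic_entry(struct lapic_accounting *acct,
				bool enabled, bool online_capable)
{
	/* Neither enabled nor online-capable: the entry is ignored. */
	if (!enabled && !online_capable) {
		acct->skipped++;
		return;
	}

	/* Online-capable but disabled: only counted for sizing. */
	if (!enabled) {
		acct->disabled_cpus++;
		return;
	}

	/* Enabled: logical cpu-id allocated and present bit set. */
	acct->logical_ids++;
}
```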
On ARM, we similarly identify all MADT GICC entries which are
*usable*, i.e. either *ENABLED* or *online-capable*. But
unlike x86, all CPUs corresponding to usable MADT GICC entries
get logical cpu-ids allocated and their present mask bits set
during boot itself. Hence, the present mask is always equal to
the possible mask on ARM.
https://uefi.org/specs/ACPI/6.5/05_ACPI_Software_Programming_Model.html#gicc-cpu-interface-flags
For online-capable but *not* enabled CPUs we defer the
registration of the logical cpu-ids with the Linux Driver Model
until the ACPI hotplug event occurs. This means
register_cpu() is not called for the disabled CPUs at
init time. Hence, sysfs entries for the disabled CPUs
don’t exist.
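A rough userspace model of the ARM behaviour described above, for comparison
with x86. It assumes at most 64 CPUs and uses made-up names throughout
(struct arm_cpu_masks, arm_scan_gicc(), and the GICC_ONLINE_CAPABLE
placeholder for bit 3); the "enabled" mask here stands for the CPUs that get
register_cpu() called at boot.

```c
#include <stdint.h>

#define GICC_ENABLED        (1u << 0) /* ACPI_MADT_ENABLED */
#define GICC_ONLINE_CAPABLE (1u << 3) /* bit 3; placeholder name */

struct arm_cpu_masks {
	uint64_t possible;
	uint64_t present;
	uint64_t enabled; /* registered with the driver model at boot */
};

/* Scan an array of GICC flag words in MADT order. */
static struct arm_cpu_masks arm_scan_gicc(const uint32_t *flags,
					  unsigned int n)
{
	struct arm_cpu_masks m = { 0, 0, 0 };
	unsigned int cpu = 0;

	for (unsigned int i = 0; i < n && cpu < 64; i++) {
		/* Unusable entry: no logical cpu-id is allocated. */
		if (!(flags[i] & (GICC_ENABLED | GICC_ONLINE_CAPABLE)))
			continue;

		/* Every usable entry is both possible and present. */
		m.possible |= UINT64_C(1) << cpu;
		m.present |= UINT64_C(1) << cpu;

		/* Only enabled entries are registered at boot. */
		if (flags[i] & GICC_ENABLED)
			m.enabled |= UINT64_C(1) << cpu;
		cpu++;
	}
	return m;
}
```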
But the above creates a bit of confusion for users accustomed to x86,
as on ARM with our solution the present CPUs are always equal to the
possible CPUs.
$ cat /sys/devices/system/cpu/possible
0-5
$ cat /sys/devices/system/cpu/present
0-5
$ cat /sys/devices/system/cpu/online
0-1
$ cat /sys/devices/system/cpu/offline
2-5
There is no way to know which CPUs have been hotplugged
using the above interface. Hence, we have also added a new mask
of enabled CPUs in sysfs:
$ cat /sys/devices/system/cpu/possible
0-5
$ cat /sys/devices/system/cpu/present
0-5
$ cat /sys/devices/system/cpu/enabled
0-2
$ cat /sys/devices/system/cpu/online
0-1
$ cat /sys/devices/system/cpu/offline
2-5
Qemu parameters: -smp cpus=3,maxcpus=6
Kernel parameter: maxcpus=2
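As an illustration of how a tool could use these files, here is a sketch that
parses the list format shown above and works out which CPUs are present but
not yet enabled. It is userspace C under stated assumptions: it handles CPUs
0..63 only, and cpulist_to_mask() and cpus_awaiting_hotplug() are made-up
helper names.

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Parse a sysfs-style CPU list such as "0-2,4" into a bitmask.
 * Returns 0 on success, -1 on malformed input or a CPU number > 63.
 */
static int cpulist_to_mask(const char *s, uint64_t *mask)
{
	uint64_t m = 0;

	while (*s && *s != '\n') {
		char *end;
		unsigned long lo, hi;

		lo = strtoul(s, &end, 10);
		if (end == s)
			return -1;
		hi = lo;
		s = end;

		/* Optional "-N" turns a single CPU into a range. */
		if (*s == '-') {
			s++;
			hi = strtoul(s, &end, 10);
			if (end == s)
				return -1;
			s = end;
		}
		if (hi < lo || hi > 63)
			return -1;

		for (unsigned long cpu = lo; cpu <= hi; cpu++)
			m |= UINT64_C(1) << cpu;

		if (*s == ',')
			s++;
		else if (*s && *s != '\n')
			return -1;
	}
	*mask = m;
	return 0;
}

/* CPUs that are present but not enabled, i.e. candidates for hot-add. */
static uint64_t cpus_awaiting_hotplug(uint64_t present, uint64_t enabled)
{
	return present & ~enabled;
}
```

With the values listed above (present 0-5, enabled 0-2), the candidates for
hot-add are CPUs 3-5.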
>
> Because it seems to me that _STA, having both enabled and present
> bits, could already describe what we need here, and arguably, a CPU
> that is not both present and enabled should not be used by the OS.
> This would leave room for representing off-line CPUs as present but
> not enabled.
That is a correct understanding.
For plugged cpus:
_STA.Present=1 and _STA.Enabled=1
For unplugged cpus:
_STA.Present=1 and _STA.Enabled=0
Hot(un)plugging is only allowed if the GICC entries were
discovered as *online-capable* during boot. GICC entries which are
marked enabled in the MADT at boot cannot be hot-unplugged.
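The rules above can be written down as a short userspace sketch. The _STA bit
positions are the standard ones (bit 0 present, bit 1 enabled); everything
else (the enum, vcpu_state_from_sta(), vcpu_hotplug_allowed() and the
GICC_ONLINE_CAPABLE placeholder) is made up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define STA_PRESENT         (1u << 0) /* _STA bit 0 */
#define STA_ENABLED         (1u << 1) /* _STA bit 1 */
#define GICC_ONLINE_CAPABLE (1u << 3) /* MADT GICC flags bit 3 */

enum vcpu_plug_state {
	VCPU_NOT_PRESENT, /* not expected with this scheme */
	VCPU_UNPLUGGED,   /* Present=1, Enabled=0 */
	VCPU_PLUGGED,     /* Present=1, Enabled=1 */
};

/* Decode the _STA value of a CPU device into a plug state. */
static enum vcpu_plug_state vcpu_state_from_sta(uint32_t sta)
{
	if (!(sta & STA_PRESENT))
		return VCPU_NOT_PRESENT;
	return (sta & STA_ENABLED) ? VCPU_PLUGGED : VCPU_UNPLUGGED;
}

/* Hot(un)plug is only permitted for entries online-capable at boot. */
static bool vcpu_hotplug_allowed(uint32_t boot_gicc_flags)
{
	return (boot_gicc_flags & GICC_ONLINE_CAPABLE) != 0;
}
```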
Catch:
If hot unplugging is to be supported for all CPUs except the boot CPU,
then we MUST mark all CPUs except the boot CPU as *online-capable*.
This poses compatibility problems for a legacy OS running on the
latest machines/platforms that support hotplug. Such an OS might
ignore all the online-capable bits at boot, and hence only
one CPU, i.e. the boot CPU, might appear.
Hence, the MADT.GICC.Enabled and MADT.GICC.online-capable bits need
not be mutually exclusive. This requires more discussion!
You might find below useful:
https://kvm-forum.qemu.org/2023/talk/9SMPDQ/
>
> Apologies if I am missing something obvious here - the whole rationale
> behind this thing is rather confusing to me.
On Fri, Sep 15, 2023 at 02:29:13AM +0000, Salil Mehta wrote:
> On x86, during init, if the MADT entry for LAPIC is found to be
> online-capable and is enabled as well then possible and present
Note that the ACPI spec says enabled + online-capable isn't defined.
"The information conveyed by this bit depends on the value of the
Enabled bit. If the Enabled bit is set, this bit is reserved and
must be zero."
So, if x86 is doing something with the enabled && online-capable
state (other than ignoring the online-capable) then technically it
is doing something that the spec doesn't define - and it's
completely fine if aarch64 does something else (maybe treating it
strictly as per the spec and ignoring online-capable.)
On Fri, Sep 15, 2023 at 9:09 AM Russell King (Oracle)
<[email protected]> wrote:
>
> On Fri, Sep 15, 2023 at 02:29:13AM +0000, Salil Mehta wrote:
> > On x86, during init, if the MADT entry for LAPIC is found to be
> > online-capable and is enabled as well then possible and present
>
> Note that the ACPI spec says enabled + online-capable isn't defined.
>
> "The information conveyed by this bit depends on the value of the
> Enabled bit. If the Enabled bit is set, this bit is reserved and
> must be zero."
>
> So, if x86 is doing something with the enabled && online-capable
> state (other than ignoring the online-capable) then technically it
> is doing something that the spec doesn't define
And so it is wrong.
> - and it's
> completely fine if aarch64 does something else (maybe treating it
> strictly as per the spec and ignoring online-capable.)
That actually is the only compliant thing that can be done.
As per the spec (quoted above), a platform firmware setting
online-capable to 1 when Enabled is set is not compliant and it is
invalid to treat this as meaningful data.
As currently defined, online-capable is only applicable to CPUs that
are not enabled to start with and its role is to make it clear whether
or not they can be enabled later AFAICS.
If there is a need to represent the case in which a CPU that is
enabled to start with can be disabled, but cannot be enabled again,
the spec needs to be updated.
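Read strictly, the spec text quoted above gives four cases for the two flag
bits. A minimal userspace sketch of that reading follows; the enum and
gicc_classify() are made-up names, and bit 3 is the GICC position from the
patch under review.

```c
#include <stdint.h>

#define GICC_ENABLED        (1u << 0) /* ACPI_MADT_ENABLED */
#define GICC_ONLINE_CAPABLE (1u << 3) /* GICC flags bit 3 */

enum gicc_boot_state {
	GICC_USABLE_NOW,     /* Enabled=1, Online Capable=0 */
	GICC_MAY_BE_ENABLED, /* Enabled=0, Online Capable=1 */
	GICC_UNUSABLE,       /* both clear */
	GICC_NON_COMPLIANT,  /* both set: reserved bit is non-zero */
};

/* Classify a GICC flags word according to the quoted spec wording. */
static enum gicc_boot_state gicc_classify(uint32_t flags)
{
	int enabled = (flags & GICC_ENABLED) != 0;
	int capable = (flags & GICC_ONLINE_CAPABLE) != 0;

	if (enabled && capable)
		return GICC_NON_COMPLIANT;
	if (enabled)
		return GICC_USABLE_NOW;
	if (capable)
		return GICC_MAY_BE_ENABLED;
	return GICC_UNUSABLE;
}
```

An OS that simply ignores online-capable when Enabled is set would fold
GICC_NON_COMPLIANT into GICC_USABLE_NOW.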
Hi Russell,
> From: Russell King <[email protected]>
> Sent: Friday, September 15, 2023 8:09 AM
> To: Salil Mehta <[email protected]>
> Cc: Ard Biesheuvel <[email protected]>; Jonathan Cameron
> <[email protected]>; James Morse <[email protected]>; linux-
> [email protected]; [email protected]; [email protected];
> [email protected]; [email protected]; linux-arm-
> [email protected]; [email protected];
> [email protected]; [email protected]; Jean-Philippe Brucker <jean-
> [email protected]>; [email protected]; [email protected]
> Subject: Re: [RFC PATCH v2 27/35] ACPICA: Add new MADT GICC flags fields
> [code first?]
>
> On Fri, Sep 15, 2023 at 02:29:13AM +0000, Salil Mehta wrote:
> > On x86, during init, if the MADT entry for LAPIC is found to be
> > online-capable and is enabled as well then possible and present
>
> Note that the ACPI spec says enabled + online-capable isn't defined.
>
> "The information conveyed by this bit depends on the value of the
> Enabled bit. If the Enabled bit is set, this bit is reserved and
> must be zero."
>
> So, if x86 is doing something with the enabled && online-capable
> state (other than ignoring the online-capable) then technically it
> is doing something that the spec doesn't define - and it's
> completely fine if aarch64 does something else (maybe treating it
> strictly as per the spec and ignoring online-capable.)
I would suggest that we concentrate on what is actually
required. The fact of the matter is that there is no need to keep
the ACPI MADT.GICC.Enabled and MADT.GICC.online-capable bits
mutually exclusive (please correct me
if I am wrong here).
It is a different matter that x86 implemented the above
requirement first for their x2APIC and the spec still does not
reflect what has been implemented in the code
(for whatever reasons, I would add).
On ARM we have copied something from the x86 ACPI specification
which has not been updated yet. (Why has it not been updated? Maybe
the x86 folks can clarify this.) Even on ARM, mutual
exclusiveness of the bits is not required. But does it break
anything on ARM to *not* have mutual exclusiveness?
(AFAICS no, but can the ARM arch guys confirm this?)
If the bits are *not* required to be mutually exclusive on either
platform (x86/ARM) then, I think, it makes sense to update the
ACPI specification for both platforms.
Thanks
Salil.
> From: Rafael J. Wysocki <[email protected]>
> Sent: Friday, September 15, 2023 9:45 AM
> To: Russell King (Oracle) <[email protected]>
> Cc: Salil Mehta <[email protected]>; Ard Biesheuvel <[email protected]>;
> Jonathan Cameron <[email protected]>; James Morse
> <[email protected]>; [email protected]; [email protected];
> [email protected]; [email protected]; linux-
> [email protected]; [email protected]; linux-
> [email protected]; [email protected]; [email protected]; Jean-
> Philippe Brucker <[email protected]>; [email protected];
> [email protected]
> Subject: Re: [RFC PATCH v2 27/35] ACPICA: Add new MADT GICC flags fields
> [code first?]
>
> On Fri, Sep 15, 2023 at 9:09 AM Russell King (Oracle)
> <[email protected]> wrote:
> >
> > On Fri, Sep 15, 2023 at 02:29:13AM +0000, Salil Mehta wrote:
> > > On x86, during init, if the MADT entry for LAPIC is found to be
> > > online-capable and is enabled as well then possible and present
> >
> > Note that the ACPI spec says enabled + online-capable isn't defined.
> >
> > "The information conveyed by this bit depends on the value of the
> > Enabled bit. If the Enabled bit is set, this bit is reserved and
> > must be zero."
> >
> > So, if x86 is doing something with the enabled && online-capable
> > state (other than ignoring the online-capable) then technically it
> > is doing something that the spec doesn't define
>
> And so it is wrong.
Or maybe the specification has not been updated yet. Code-first?
>
> > - and it's
> > completely fine if aarch64 does something else (maybe treating it
> > strictly as per the spec and ignoring online-capable.)
>
> That actually is the only compliant thing that can be done.
Yes, but the question is whether it is what is required and whether it
solves the problem of hotplug. I think not.
Complying with what is in the spec means we have to
trade off not supporting hot(un)plugging
of the cold-plugged CPUs against the risk of breaking a legacy OS
attempting to use newer platforms with hotplug support. The latter
is more of an ARM problem, as we are not allowed to tweak the
ACPI tables once the system has booted.
>
> As per the spec (quoted above), a platform firmware setting
> online-capable to 1 when Enabled is set is not compliant and it is
> invalid to treat this as meaningful data.
Correct, but is it really what we need? We need both of the
bits to be set to support hot(un)plugging of cold-booted
CPUs.
>
> As currently defined, online-capable is only applicable to CPUs that
> are not enabled to start with and its role is to make it clear whether
> or not they can be enabled later AFAICS.
Correct.
>
> If there is a need to represent the case in which a CPU that is
> enabled to start with can be disabled, but cannot be enabled again,
> the spec needs to be updated.
Absolutely. And that’s what my humble suggestion is as well.
Thanks
Salil.
On Fri, Sep 15, 2023 at 11:34 AM Salil Mehta <[email protected]> wrote:
>
>
> > From: Rafael J. Wysocki <[email protected]>
> > Sent: Friday, September 15, 2023 9:45 AM
> > To: Russell King (Oracle) <[email protected]>
> > Cc: Salil Mehta <[email protected]>; Ard Biesheuvel <[email protected]>;
> > Jonathan Cameron <[email protected]>; James Morse
> > <[email protected]>; [email protected]; [email protected];
> > [email protected]; [email protected]; linux-
> > [email protected]; [email protected]; linux-
> > [email protected]; [email protected]; [email protected]; Jean-
> > Philippe Brucker <[email protected]>; [email protected];
> > [email protected]
> > Subject: Re: [RFC PATCH v2 27/35] ACPICA: Add new MADT GICC flags fields
> > [code first?]
> >
> > On Fri, Sep 15, 2023 at 9:09 AM Russell King (Oracle)
> > <[email protected]> wrote:
> > >
> > > On Fri, Sep 15, 2023 at 02:29:13AM +0000, Salil Mehta wrote:
> > > > On x86, during init, if the MADT entry for LAPIC is found to be
> > > > online-capable and is enabled as well then possible and present
> > >
> > > Note that the ACPI spec says enabled + online-capable isn't defined.
> > >
> > > "The information conveyed by this bit depends on the value of the
> > > Enabled bit. If the Enabled bit is set, this bit is reserved and
> > > must be zero."
> > >
> > > So, if x86 is doing something with the enabled && online-capable
> > > state (other than ignoring the online-capable) then technically it
> > > is doing something that the spec doesn't define
> >
> > And so it is wrong.
>
>
> Or maybe, specification has not been updated yet. code-first?
Well, if you are aware of any change requests related to this and
posted as code-first, please let me know.
> From: Rafael J. Wysocki <[email protected]>
> Sent: Friday, September 15, 2023 11:21 AM
> To: Salil Mehta <[email protected]>
> Subject: Re: [RFC PATCH v2 27/35] ACPICA: Add new MADT GICC flags fields
> [code first?]
>
> On Fri, Sep 15, 2023 at 11:34 AM Salil Mehta <[email protected]>
> wrote:
> >
> >
> > > On Fri, Sep 15, 2023 at 9:09 AM Russell King (Oracle)
> > > <[email protected]> wrote:
> > > >
> > > > On Fri, Sep 15, 2023 at 02:29:13AM +0000, Salil Mehta wrote:
> > > > > On x86, during init, if the MADT entry for LAPIC is found to be
> > > > > online-capable and is enabled as well then possible and present
> > > >
> > > > Note that the ACPI spec says enabled + online-capable isn't defined.
> > > >
> > > > "The information conveyed by this bit depends on the value of the
> > > > Enabled bit. If the Enabled bit is set, this bit is reserved and
> > > > must be zero."
> > > >
> > > > So, if x86 is doing something with the enabled && online-capable
> > > > state (other than ignoring the online-capable) then technically it
> > > > is doing something that the spec doesn't define
> > >
> > > And so it is wrong.
> >
> >
> > Or maybe, specification has not been updated yet. code-first?
>
> Well, if you are aware of any change requests related to this and
> posted as code-first, please let me know.
I am not aware of any on x86. Maybe we can do it on ARM first and
let other architectures raise their objections later? After all, there
is a legitimate use-case on ARM. Having mutually exclusive bits
breaks certain use-cases, and we have to make trade-offs.
This can be done in parallel while the other patches are being
reviewed, living with the trade-offs for the moment until the
specification is sorted out. But of course it depends on what the
other stakeholders, and most importantly the ARM architecture
people, think of it.
Thanks
Salil.
On Fri, Sep 15, 2023 at 02:49:41PM +0000, Salil Mehta wrote:
> I am not aware of any on x86. Maybe we can do it on ARM first and
> let other Arch pitch-in their objection later? After all, there is
> a legitimate use-case in case of ARM. Having mutually exclusive
> bits breaks certain use-cases and we have to do the tradeoffs.
... but let's not use that as an argument to delay the forward
progress of getting aarch64 vCPU hotplug patches merged.
If we want to later propose that Enabled=1 Online-Capable=1 means
that the CPU can be hot-unplugged, then that's something that can
be added to the spec later, and added to the kernel later. There
is no need to go through more iterations of patch sets to add this
feature before considering that aarch64 vCPU hotplug is ready to
be merged.
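
To make the flag combinations under discussion concrete, here is a
minimal sketch of how the two MADT GICC bits could be classified when
following the spec text quoted above. The macro names, the enum and the
helper are hypothetical, and the bit positions are assumptions for
illustration; check them against the ACPI specification and the ACPICA
headers before relying on them.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical masks; bit positions are assumed, not taken from ACPICA. */
#define GICC_FLAG_ENABLED        (1u << 0)
#define GICC_FLAG_ONLINE_CAPABLE (1u << 3)

static_assert((GICC_FLAG_ENABLED & GICC_FLAG_ONLINE_CAPABLE) == 0,
              "flag masks must not overlap");

enum cpu_boot_state {
    CPU_UNUSABLE,        /* Enabled=0, Online-Capable=0 */
    CPU_PRESENT_AT_BOOT, /* Enabled=1, Online-Capable=0 */
    CPU_HOTPLUGGABLE,    /* Enabled=0, Online-Capable=1 */
    CPU_UNDEFINED,       /* Enabled=1, Online-Capable=1: reserved by the spec */
};

/* Classify one GICC entry's flags according to the quoted spec wording. */
enum cpu_boot_state classify_gicc_flags(uint32_t flags)
{
    int enabled = !!(flags & GICC_FLAG_ENABLED);
    int online_capable = !!(flags & GICC_FLAG_ONLINE_CAPABLE);

    if (enabled && online_capable)
        return CPU_UNDEFINED;
    if (enabled)
        return CPU_PRESENT_AT_BOOT;
    if (online_capable)
        return CPU_HOTPLUGGABLE;
    return CPU_UNUSABLE;
}
```

Giving Enabled=1 Online-Capable=1 a meaning later would only change the
CPU_UNDEFINED case; the other three cases stay as they are.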
Like I said in my other email, it's time to stop this "well, if we
do this, then we can do that" cycle - stop playing games with what
can be done.
Delaying merging this code means not only does the maintenance
burden keep increasing (because more and more patches accumulate
which have to be constantly forward ported) but those who *want*
this feature are deprived for what, another year? two years?
decades? before it gets merged.
So please, stop dreaming up new features. Let's get aarch64 vCPU
hotplug that is compliant with the current ACPI spec, merged into
upstream. If we _then_ want to consider additional features, that's
the time to do it.
If you're not prepared to do that, do not be surprised if someone
else (such as myself) decides to fork James' work in order to get
it merged upstream - and yes, I _will_ do that if these games
carry on. I have already started to do that by proposing a patch
that is different from what James has to at least get some of
James' desired changes upstream - and I will continue doing that
all the time that (a) I see that there's a better way to address
something in James' patch and (b) I think in the longer term it
will reduce the maintenance burden of this patch set.
People are getting sick and tired of waiting for this feature.
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
Hi Russell,
Thanks for highlighting your concerns.
> From: Russell King <[email protected]>
> Sent: Friday, September 15, 2023 2:43 PM
> To: Salil Mehta <[email protected]>
> Subject: Re: [RFC PATCH v2 27/35] ACPICA: Add new MADT GICC flags fields
> [code first?]
>
> On Fri, Sep 15, 2023 at 09:34:46AM +0000, Salil Mehta wrote:
> > > > Note that the ACPI spec says enabled + online-capable isn't defined.
> > > >
> > > > "The information conveyed by this bit depends on the value of the
> > > > Enabled bit. If the Enabled bit is set, this bit is reserved and
> > > > must be zero."
> > > >
> > > > So, if x86 is doing something with the enabled && online-capable
> > > > state (other than ignoring the online-capable) then technically it
> > > > is doing something that the spec doesn't define
> > >
> > > And so it is wrong.
> >
> > Or maybe, specification has not been updated yet. code-first?
>
> What is the point in speculating. If you want to speculate about it,
> fine, but please don't use speculation as a reason that "oh we need
> to sort this out before we can merge the patches".
[already replied in the other thread but repeating it here]
Sorry, I am not aware of any; I was only suggesting it. Can we have
this done for ARM first, since there is a legitimate use-case? This
can be done in parallel while the other patches are being reviewed.
It would be great if they were accepted even in their current form.
> This is precisely why engineers are bad at producing products. They
> like to continually tweak the design, and the design never gets out
> the door. You need someone who is a project manager to tell engineers
> when to stop. Without a project manager to do that, eventually the
> project fades into insignificance because it becomes no longer relevant
> or has its funding cut.
>
> Hotplug VCPU on aarch64 feels exactly like that - it seems to be an
> engineer project that is just going to for-ever rumble on and never
> actually see the light of day.
Sometimes things are not in a single person's control. Yes, it is
frustrating; I do understand that.
> So please - stop speculating and lets get vCPU hotplug *actually*
> delivered and usable. Even if it's not 100% perfect.
We need to decide what the acceptance criteria are, and they can
vary across organizations depending on internal requirements.
The issues I pointed out are:
1. Legacy OS will not boot on the latest platform with hotplug support.
- Try running older Windows on an ARM platform with hotplug support.
- Older Windows will only see the boot CPU when the online-capable bit
  is used.
- Will Windows use _OSC to check compatibility?
- We have verified this with older Linux and it only shows 1 CPU.
2. Hot(un)plug of cold-booted CPUs.
- Its use-case is subjective. Maybe you can shed some light on this.
With the current definition of the bits, 1 and 2 cannot both be
supported simultaneously.
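
The conflict between the two points can be illustrated with a small
sketch. It assumes, as described above, that a legacy OS only brings up
entries with the Enabled bit set, while a hotplug-aware OS also counts
Online-Capable entries as possible CPUs. The macro and function names
are hypothetical and the bit positions are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical masks; bit positions are assumed for illustration. */
#define FLAG_ENABLED        (1u << 0)
#define FLAG_ONLINE_CAPABLE (1u << 3)

/* A legacy OS (as assumed here) only uses entries with Enabled set. */
size_t legacy_usable_cpus(const uint32_t *flags, size_t n)
{
    size_t count = 0;

    assert(flags != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (flags[i] & FLAG_ENABLED)
            count++;
    return count;
}

/* A hotplug-aware OS also treats Online-Capable entries as possible. */
size_t hotplug_aware_possible_cpus(const uint32_t *flags, size_t n)
{
    size_t count = 0;

    assert(flags != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (flags[i] & (FLAG_ENABLED | FLAG_ONLINE_CAPABLE))
            count++;
    return count;
}
```

With one Enabled boot CPU and the rest marked Online-Capable, the legacy
count is 1 even though more CPUs are possible. Marking the others
Enabled instead fixes the legacy count, but then nothing in the flags
says they may be unplugged.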
It is perfectly okay to live with them, as long as we clearly indicate
what we intend to support or are in the process of supporting.
But we do need an open discussion about how to proceed, to
avoid surprises later on.
BTW, I am just trying to make everyone aware of the problems.
Many thanks!
Best regards
Salil.
On Fri, 15 Sep 2023 16:17:21 +0100
Salil Mehta <[email protected]> wrote:
> Hi Russell,
> Thanks for highlighting your concerns.
>
> > On Fri, Sep 15, 2023 at 09:34:46AM +0000, Salil Mehta wrote:
> > > > > Note that the ACPI spec says enabled + online-capable isn't defined.
> > > > >
> > > > > "The information conveyed by this bit depends on the value of the
> > > > > Enabled bit. If the Enabled bit is set, this bit is reserved and
> > > > > must be zero."
> > > > >
> > > > > So, if x86 is doing something with the enabled && online-capable
> > > > > state (other than ignoring the online-capable) then technically it
> > > > > is doing something that the spec doesn't define
> > > >
> > > > And so it is wrong.
> > >
> > > Or maybe, specification has not been updated yet. code-first?
> >
> > What is the point in speculating. If you want to speculate about it,
> > fine, but please don't use speculation as a reason that "oh we need
> > to sort this out before we can merge the patches".
>
> [already replied in other thread but repeating it here]
>
> Sorry, I am not aware but I was suggesting this. Can we have this
> done for ARM first because there is a legitimate use-case. This
> can be done in parallel while other patches are getting reviewed.
> It would be great if they get accepted even in the current form.
>
>
> > This is precisely why engineers are bad at producing products. They
> > like to continually tweak the design, and the design never gets out
> > the door. You need someone who is a project manager to tell engineers
> > when to stop. Without a project manager to do that, eventually the
> > project fades into insignificance because it becomes no longer relevant
> > or has its funding cut.
> >
> > Hotplug VCPU on aarch64 feels exactly like that - it seems to be an
> > engineer project that is just going to for-ever rumble on and never
> > actually see the light of day.
>
>
> Sometimes things are not in single persons control. Yes, it is
> frustrating, I do understand that.
>
>
> > So please - stop speculating and lets get vCPU hotplug *actually*
> > delivered and usable. Even if it's not 100% perfect.
>
> We need to decide what is the criteria of acceptability and it can
> vary across organizations. It depends upon internal requirements.
> The issues what I pointed are,
>
> 1. Legacy OS will not boot on latest platform with hotplug support.
> - Try running older windows on ARM platform with hotplug support.
> - older windows will only see boot cpu with online-capable bit.
> - Will windows use _OSC to check compatibility?
> - We have verified this with older Linux and it only shows 1 CPU.
> 2. Hot(un)plug of cold-booted CPUs.
> - Its use-case is subjective. Maybe you can throw light on this.
>
> With current composition of bits both 1 & 2 cannot be supported
> simultaneously.
>
> It is perfectly okay to live with them while clearly indicating
> what we intend to support or are in process of supporting it.
> But we do need an open discussion about how to proceed. This is
> to avoid surprises later on.
>
> BTW, I am just trying to make every one aware of the problems.
Step 1 - just allow growing (and shrinking back to initial
enabled cpus). That is fine with current specification and legacy
OS. We only assume CPUs that are hotplugged can later be removed.
That covers most use cases.
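
A minimal sketch of that "step 1" policy, assuming a VMM-side table
that remembers which vCPUs were enabled at boot. The struct, the
function names and the MAX_VCPUS limit are hypothetical and not taken
from QEMU or the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_VCPUS 8 /* hypothetical limit for the sketch */

struct vcpu_policy {
    bool enabled_at_boot[MAX_VCPUS]; /* fixed once the guest has started */
    bool present[MAX_VCPUS];         /* current state */
};

/* The first boot_cpus vCPUs are enabled at boot; the rest start absent. */
void vcpu_policy_init(struct vcpu_policy *p, size_t boot_cpus)
{
    assert(p != NULL && boot_cpus <= MAX_VCPUS);
    for (size_t i = 0; i < MAX_VCPUS; i++) {
        p->enabled_at_boot[i] = i < boot_cpus;
        p->present[i] = i < boot_cpus;
    }
}

/* Growing: any vCPU that is not yet present may be added. */
bool vcpu_hotplug(struct vcpu_policy *p, size_t cpu)
{
    assert(p != NULL);
    if (cpu >= MAX_VCPUS || p->present[cpu])
        return false;
    p->present[cpu] = true;
    return true;
}

/* Shrinking: only vCPUs that were hotplugged may be removed again. */
bool vcpu_unplug(struct vcpu_policy *p, size_t cpu)
{
    assert(p != NULL);
    if (cpu >= MAX_VCPUS || !p->present[cpu] || p->enabled_at_boot[cpu])
        return false;
    p->present[cpu] = false;
    return true;
}
```

Refusing to unplug boot-enabled vCPUs keeps the guest at or above its
initial CPU count, which matches "shrinking back to initial enabled
cpus".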
So, effectively, what Russell said. Enable what we can with
the specifications as they stand before getting distracted by
modifying them (again).
Jonathan
>
> Many thanks!
>
> Best regards
> Salil.
>
>
>
On Fri, Sep 15, 2023 at 03:17:21PM +0000, Salil Mehta wrote:
> Hi Russell,
> Thanks for highlighting your concerns.
>
> > On Fri, Sep 15, 2023 at 09:34:46AM +0000, Salil Mehta wrote:
> > > > > Note that the ACPI spec says enabled + online-capable isn't defined.
> > > > >
> > > > > "The information conveyed by this bit depends on the value of the
> > > > > Enabled bit. If the Enabled bit is set, this bit is reserved and
> > > > > must be zero."
> > > > >
> > > > > So, if x86 is doing something with the enabled && online-capable
> > > > > state (other than ignoring the online-capable) then technically it
> > > > > is doing something that the spec doesn't define
> > > >
> > > > And so it is wrong.
> > >
> > > Or maybe, specification has not been updated yet. code-first?
> >
> > What is the point in speculating. If you want to speculate about it,
> > fine, but please don't use speculation as a reason that "oh we need
> > to sort this out before we can merge the patches".
>
> [already replied in other thread but repeating it here]
>
> Sorry, I am not aware but I was suggesting this. Can we have this
> done for ARM first because there is a legitimate use-case. This
> can be done in parallel while other patches are getting reviewed.
> It would be great if they get accepted even in the current form.
>
>
> > This is precisely why engineers are bad at producing products. They
> > like to continually tweak the design, and the design never gets out
> > the door. You need someone who is a project manager to tell engineers
> > when to stop. Without a project manager to do that, eventually the
> > project fades into insignificance because it becomes no longer relevant
> > or has its funding cut.
> >
> > Hotplug VCPU on aarch64 feels exactly like that - it seems to be an
> > engineer project that is just going to for-ever rumble on and never
> > actually see the light of day.
>
>
> Sometimes things are not in single persons control. Yes, it is
> frustrating, I do understand that.
>
>
> > So please - stop speculating and lets get vCPU hotplug *actually*
> > delivered and usable. Even if it's not 100% perfect.
>
> We need to decide what is the criteria of acceptability and it can
> vary across organizations. It depends upon internal requirements.
> The issues what I pointed are,
>
> 1. Legacy OS will not boot on latest platform with hotplug support.
> - Try running older windows on ARM platform with hotplug support.
> - older windows will only see boot cpu with online-capable bit.
> - Will windows use _OSC to check compatibility?
> - We have verified this with older Linux and it only shows 1 CPU.
> 2. Hot(un)plug of cold-booted CPUs.
> - Its use-case is subjective. Maybe you can throw light on this.
>
> With current composition of bits both 1 & 2 cannot be supported
> simultaneously.
>
> It is perfectly okay to live with them while clearly indicating
> what we intend to support or are in process of supporting it.
> But we do need an open discussion about how to proceed. This is
> to avoid surprises later on.
>
> BTW, I am just trying to make every one aware of the problems.
Please do it as a separate discussion then - rather than starting a
thread in response to a posting of patches which are _supposed_ to
be being reviewed.
Bringing up issues which are in effect future enhancements without
explicitly stating that they are future enhancements makes it look like
the patch set isn't ready to be merged - and is a distraction to trying
to get the series merged.
Hi Russell,
> From: Russell King <[email protected]>
> Sent: Friday, September 15, 2023 4:16 PM
> To: Salil Mehta <[email protected]>
> Subject: Re: [RFC PATCH v2 27/35] ACPICA: Add new MADT GICC flags fields
> [code first?]
>
> On Fri, Sep 15, 2023 at 02:49:41PM +0000, Salil Mehta wrote:
> > I am not aware of any on x86. Maybe we can do it on ARM first and
> > let other Arch pitch-in their objection later? After all, there is
> > a legitimate use-case in case of ARM. Having mutually exclusive
> > bits breaks certain use-cases and we have to do the tradeoffs.
>
> ... but let's not use that as an argument to delay the forward
> progress of getting aarch64 vCPU hotplug patches merged.
Why would anybody do that? We have been working with ARM for almost
3 years to get to the current point, where we have overcome most of
the architecture issues and made this feature viable in the
first place. It makes no sense that any of us would
want to delay its acceptance.
>
> If we want to later propose that Enabled=1 Online-Capable=1 means
> that the CPU can be hot-unplugged, then that's something that can
> be added to the spec later, and added to the kernel later. There
> is no need to go through more iterations of patch sets to add this
> feature before considering that aarch64 vCPU hotplug is ready to
> be merged.
Absolutely, but again these two things can be done in parallel.
And whether the patch-set is ready to be accepted is up to the
maintainers, and other community members as well, to decide.
You, James, I and others have already been making efforts in this
direction.
But I understand your concern that the current discussion might
create a bit of a distraction, and it can be put on hold.
>
> Like I said in my other email, it's time to stop this "well, if we
> do this, then we can do that" cycle - stop playing games with what
> can be done.
I really don't know which cyclic games are being referred to here.
I will leave it up to James to answer that.
> Delaying merging this code means not only does the maintenance
> burden keep increasing (because more and more patches accumulate
> which have to be constantly forward ported) but those who *want*
> this feature are deprived for what, another year? two years?
> decades? before it gets merged.
It is good to know that there are customers waiting for this
feature on your side as well. Let us hope this can get accepted
quickly.
> So please, stop dreaming up new features. Let's get aarch64 vCPU
> hotplug that is compliant with the current ACPI spec, merged into
> upstream. If we _then_ want to consider additional features, that's
> the time to do it.
That's what I suggested earlier as well, but discussion of the
problems cannot be ignored.
> If you're not prepared to do that, do not be surprised if someone
> else (such as myself) decides to fork James' work in order to get
> it merged upstream - and yes, I _will_ do that if these games
> carry on. I have already started to do that by proposing a patch
> that is different from what James has to at least get some of
> James' desired changes upstream - and I will continue doing that
> all the time that (a) I see that there's a better way to address
> something in James' patch and (b) I think in the longer term it
> will reduce the maintenance burden of this patch set.
Are you changing the approach of the kernel?
Thanks
Salil.
Hi Jonathan,
> From: Jonathan Cameron <[email protected]>
> Sent: Friday, September 15, 2023 4:33 PM
> To: Salil Mehta <[email protected]>
> Subject: Re: [RFC PATCH v2 27/35] ACPICA: Add new MADT GICC flags fields
> [code first?]
>
> On Fri, 15 Sep 2023 16:17:21 +0100
> Salil Mehta <[email protected]> wrote:
>
> > Hi Russell,
> > Thanks for highlighting your concerns.
> >
> > > On Fri, Sep 15, 2023 at 09:34:46AM +0000, Salil Mehta wrote:
> > > > > > Note that the ACPI spec says enabled + online-capable isn't defined.
> > > > > >
> > > > > > "The information conveyed by this bit depends on the value of the
> > > > > > Enabled bit. If the Enabled bit is set, this bit is reserved and
> > > > > > must be zero."
> > > > > >
> > > > > > So, if x86 is doing something with the enabled && online-capable
> > > > > > state (other than ignoring the online-capable) then technically it
> > > > > > is doing something that the spec doesn't define
> > > > >
> > > > > And so it is wrong.
> > > >
> > > > Or maybe, specification has not been updated yet. code-first?
> > >
> > > What is the point in speculating. If you want to speculate about it,
> > > fine, but please don't use speculation as a reason that "oh we need
> > > to sort this out before we can merge the patches".
> >
> > [already replied in other thread but repeating it here]
> >
> > Sorry, I am not aware but I was suggesting this. Can we have this
> > done for ARM first because there is a legitimate use-case. This
> > can be done in parallel while other patches are getting reviewed.
> > It would be great if they get accepted even in the current form.
> >
> >
> > > This is precisely why engineers are bad at producing products. They
> > > like to continually tweak the design, and the design never gets out
> > > the door. You need someone who is a project manager to tell engineers
> > > when to stop. Without a project manager to do that, eventually the
> > > project fades into insignificance because it becomes no longer relevant
> > > or has its funding cut.
> > >
> > > Hotplug VCPU on aarch64 feels exactly like that - it seems to be an
> > > engineer project that is just going to for-ever rumble on and never
> > > actually see the light of day.
> >
> >
> > Sometimes things are not in single persons control. Yes, it is
> > frustrating, I do understand that.
> >
> >
> > > So please - stop speculating and lets get vCPU hotplug *actually*
> > > delivered and usable. Even if it's not 100% perfect.
> >
> > We need to decide what is the criteria of acceptability and it can
> > vary across organizations. It depends upon internal requirements.
> > The issues what I pointed are,
> >
> > 1. Legacy OS will not boot on latest platform with hotplug support.
> > - Try running older windows on ARM platform with hotplug support.
> > - older windows will only see boot cpu with online-capable bit.
> > - Will windows use _OSC to check compatibility?
> > - We have verified this with older Linux and it only shows 1 CPU.
> > 2. Hot(un)plug of cold-booted CPUs.
> > - Its use-case is subjective. Maybe you can throw light on this.
> >
> > With current composition of bits both 1 & 2 cannot be supported
> > simultaneously.
> >
> > It is perfectly okay to live with them while clearly indicating
> > what we intend to support or are in process of supporting it.
> > But we do need an open discussion about how to proceed. This is
> > to avoid surprises later on.
> >
> > BTW, I am just trying to make every one aware of the problems.
>
> Step 1 - just allow growing (and shrinking back to initial
> enabled cpus). That is fine with current specification and legacy
> OS. We only assume CPUs that are hotplugged can later be removed.
> That covers most use cases.
Yes, we can do that for now (at least in QEMU): either do not
allow unplugging vCPUs which were cold-plugged, or allow
it as a debugging feature but print a warning.
> So what effectively what Russell said. Enable what we can with
> the specifications as they stand before getting distracted by
> modifying them (again).
Yes, agreed. The idea was to clearly highlight them. These can be
discussed as part of a separate thread in parallel - absolutely!
Thanks
Salil.
Hi Russell,
> From: Russell King <[email protected]>
> Sent: Friday, September 15, 2023 4:41 PM
> To: Salil Mehta <[email protected]>
> Subject: Re: [RFC PATCH v2 27/35] ACPICA: Add new MADT GICC flags fields
> [code first?]
>
> On Fri, Sep 15, 2023 at 03:17:21PM +0000, Salil Mehta wrote:
> > Hi Russell,
> > Thanks for highlighting your concerns.
> >
> > > On Fri, Sep 15, 2023 at 09:34:46AM +0000, Salil Mehta wrote:
> > > > > > Note that the ACPI spec says enabled + online-capable isn't defined.
> > > > > >
> > > > > > "The information conveyed by this bit depends on the value of the
> > > > > > Enabled bit. If the Enabled bit is set, this bit is reserved and
> > > > > > must be zero."
> > > > > >
> > > > > > So, if x86 is doing something with the enabled && online-capable
> > > > > > state (other than ignoring the online-capable) then technically it
> > > > > > is doing something that the spec doesn't define
> > > > >
> > > > > And so it is wrong.
> > > >
> > > > Or maybe, specification has not been updated yet. code-first?
> > >
> > > What is the point in speculating. If you want to speculate about it,
> > > fine, but please don't use speculation as a reason that "oh we need
> > > to sort this out before we can merge the patches".
> >
> > [already replied in other thread but repeating it here]
> >
> > Sorry, I am not aware but I was suggesting this. Can we have this
> > done for ARM first because there is a legitimate use-case. This
> > can be done in parallel while other patches are getting reviewed.
> > It would be great if they get accepted even in the current form.
> >
> >
> > > This is precisely why engineers are bad at producing products. They
> > > like to continually tweak the design, and the design never gets out
> > > the door. You need someone who is a project manager to tell engineers
> > > when to stop. Without a project manager to do that, eventually the
> > > project fades into insignificance because it becomes no longer relevant
> > > or has its funding cut.
> > >
> > > Hotplug VCPU on aarch64 feels exactly like that - it seems to be an
> > > engineer project that is just going to for-ever rumble on and never
> > > actually see the light of day.
> >
> >
> > Sometimes things are not in single persons control. Yes, it is
> > frustrating, I do understand that.
> >
> >
> > > So please - stop speculating and lets get vCPU hotplug *actually*
> > > delivered and usable. Even if it's not 100% perfect.
> >
> > We need to decide what is the criteria of acceptability and it can
> > vary across organizations. It depends upon internal requirements.
> > The issues what I pointed are,
> >
> > 1. Legacy OS will not boot on latest platform with hotplug support.
> > - Try running older windows on ARM platform with hotplug support.
> > - older windows will only see boot cpu with online-capable bit.
> > - Will windows use _OSC to check compatibility?
> > - We have verified this with older Linux and it only shows 1 CPU.
> > 2. Hot(un)plug of cold-booted CPUs.
> > - Its use-case is subjective. Maybe you can throw light on this.
> >
> > With current composition of bits both 1 & 2 cannot be supported
> > simultaneously.
> >
> > It is perfectly okay to live with them while clearly indicating
> > what we intend to support or are in process of supporting it.
> > But we do need an open discussion about how to proceed. This is
> > to avoid surprises later on.
> >
> > BTW, I am just trying to make every one aware of the problems.
>
> Please do it as a separate discussion then - rather than starting a
> thread in response to a posting of patches which are _supposed_ to
> be being reviewed.
Yes, we can discuss it as part of a separate thread.
> Bringing up issues which are in effect future enhancements without
> explicitly stating that they are future enhancements makes it look like
> the patch set isn't ready to be merged - and is a distraction to trying
> to get the series merged.
I beg to disagree on this, as these are not enhancements/features
but problems. But yes, we can sort these out step by step,
even after the patches have been accepted. I totally agree
that this can cause distraction, so let us defer it for the moment.
The original purpose was to highlight them here briefly, which
has been achieved!
Thanks
Salil.
On Fri, Sep 15, 2023 at 09:34:46AM +0000, Salil Mehta wrote:
> > > Note that the ACPI spec says enabled + online-capable isn't defined.
> > >
> > > "The information conveyed by this bit depends on the value of the
> > > Enabled bit. If the Enabled bit is set, this bit is reserved and
> > > must be zero."
> > >
> > > So, if x86 is doing something with the enabled && online-capable
> > > state (other than ignoring the online-capable) then technically it
> > > is doing something that the spec doesn't define
> >
> > And so it is wrong.
>
> Or maybe, specification has not been updated yet. code-first?
What is the point in speculating? If you want to speculate about it,
fine, but please don't use speculation as a reason that "oh we need
to sort this out before we can merge the patches".
This is precisely why engineers are bad at producing products. They
like to continually tweak the design, and the design never gets out
the door. You need someone who is a project manager to tell engineers
when to stop. Without a project manager to do that, eventually the
project fades into insignificance because it becomes no longer relevant
or has its funding cut.
Hotplug VCPU on aarch64 feels exactly like that - it seems to be an
engineer project that is just going to rumble on forever and never
actually see the light of day.
So please - stop speculating and let's get vCPU hotplug *actually*
delivered and usable. Even if it's not 100% perfect.
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
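The spec text quoted in the message above says that online-capable is only meaningful when Enabled is clear. A minimal sketch of that rule as a classifier follows. The enum, the function name and the idea of passing the bit masks as parameters are all assumptions made for illustration; the correct masks depend on the MADT entry type and should be taken from the spec, not from this sketch.

```c
#include <stdint.h>

/* Hypothetical classification of a MADT CPU entry's flags word. */
enum madt_cpu_state {
	MADT_CPU_UNUSABLE,		/* neither bit set: never usable */
	MADT_CPU_ENABLED,		/* usable at boot */
	MADT_CPU_ONLINE_CAPABLE,	/* not usable now, may be enabled later */
	MADT_CPU_UNDEFINED,		/* both set: reserved by the spec */
};

/*
 * The masks are parameters because the bit positions differ between
 * MADT entry types (assumption: the caller passes the right ones).
 */
enum madt_cpu_state madt_classify(uint32_t flags, uint32_t enabled_mask,
				  uint32_t online_capable_mask)
{
	int enabled = (flags & enabled_mask) != 0;
	int online_capable = (flags & online_capable_mask) != 0;

	if (enabled && online_capable)
		return MADT_CPU_UNDEFINED;
	if (enabled)
		return MADT_CPU_ENABLED;
	if (online_capable)
		return MADT_CPU_ONLINE_CAPABLE;
	return MADT_CPU_UNUSABLE;
}
```

Calling madt_classify(flags, 0x1, 0x8) with both bits set returns MADT_CPU_UNDEFINED, which is the combination being argued about in this sub-thread. The masks 0x1 and 0x8 are only example values here.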
On 9/14/23 02:38, James Morse wrote:
> ACPI has two descriptions of CPUs, one in the MADT/APIC table, the other
> in the DSDT. Both are required. (ACPI 6.5's 8.4 "Declaring Processors"
> says "Each processor in the system must be declared in the ACPI
> namespace"). Having two descriptions allows firmware authors to get
> this wrong.
>
> If CPUs are described in the MADT/APIC, they will be brought online
> early during boot. Once the register_cpu() calls are moved to ACPI,
> they will be based on the DSDT description of the CPUs. When CPUs are
> missing from the DSDT description, they will end up online, but not
> registered.
>
> Add a helper that runs after acpi_init() has completed to register
> CPUs that are online, but weren't found in the DSDT. Any CPU that
> is registered by this code triggers a firmware-bug warning and kernel
> taint.
>
> Qemu TCG only describes the first CPU in the DSDT, unless cpu-hotplug
> is configured.
>
> Signed-off-by: James Morse <[email protected]>
> ---
> drivers/acpi/acpi_processor.c | 19 +++++++++++++++++++
> 1 file changed, 19 insertions(+)
>
Reviewed-by: Gavin Shan <[email protected]>
> diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
> index b4bde78121bb..a01e315aa16a 100644
> --- a/drivers/acpi/acpi_processor.c
> +++ b/drivers/acpi/acpi_processor.c
> @@ -790,6 +790,25 @@ void __init acpi_processor_init(void)
> acpi_pcc_cpufreq_init();
> }
>
> +static int __init acpi_processor_register_missing_cpus(void)
> +{
> + int cpu;
> +
> + if (acpi_disabled)
> + return 0;
> +
> + for_each_online_cpu(cpu) {
> + if (!get_cpu_device(cpu)) {
> + pr_err_once(FW_BUG "CPU %u has no ACPI namespace description!\n", cpu);
> + add_taint(TAINT_FIRMWARE_WORKAROUND, LOCKDEP_STILL_OK);
> + arch_register_cpu(cpu);
> + }
> + }
> +
> + return 0;
> +}
> +subsys_initcall_sync(acpi_processor_register_missing_cpus);
> +
> #ifdef CONFIG_ACPI_PROCESSOR_CSTATE
> /**
> * acpi_processor_claim_cst_control - Request _CST control from the platform.
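The safety net in the patch above can be modelled outside the kernel. The following is a sketch only: struct cpu_model, MODEL_NR_CPUS and model_register_missing_cpus() are hypothetical stand-ins for the online mask, get_cpu_device(), add_taint() and arch_register_cpu(), and do not exist in the kernel.

```c
#include <stdbool.h>

#define MODEL_NR_CPUS 8	/* hypothetical limit for the model */

struct cpu_model {
	bool online[MODEL_NR_CPUS];	/* brought up from the MADT */
	bool registered[MODEL_NR_CPUS];	/* registered via the DSDT walk */
	bool tainted;			/* stands in for the kernel taint */
};

/*
 * Register every CPU that is online but was never registered, and
 * record that firmware was wrong. Returns how many CPUs needed fixing.
 */
int model_register_missing_cpus(struct cpu_model *m)
{
	int fixed = 0;

	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
		if (m->online[cpu] && !m->registered[cpu]) {
			m->tainted = true;
			m->registered[cpu] = true;
			fixed++;
		}
	}

	return fixed;
}
```

With three CPUs online and only CPU 0 described in the DSDT, the first call fixes two CPUs and sets the taint flag; a second call finds nothing left to do.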
On 9/14/23 02:37, James Morse wrote:
> loongarch, mips, parisc, riscv and sh all print a warning if
> register_cpu() returns an error. Architectures that use
> GENERIC_CPU_DEVICES call panic() instead.
>
> Errors in this path indicate something is wrong with the firmware
> description of the platform, but the kernel is able to keep running.
>
> Downgrade this to a warning to make it easier to debug this issue.
>
> This will allow architectures that are switching over to GENERIC_CPU_DEVICES
> to drop their warning, but keep the existing behaviour.
>
> Signed-off-by: James Morse <[email protected]>
> ---
> drivers/base/cpu.c | 7 ++++---
> 1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c
> index 579064fda97b..d31c936f0955 100644
> --- a/drivers/base/cpu.c
> +++ b/drivers/base/cpu.c
> @@ -535,14 +535,15 @@ int __weak arch_register_cpu(int cpu)
>
> static void __init cpu_dev_register_generic(void)
> {
> - int i;
> + int i, ret;
>
> if (!IS_ENABLED(CONFIG_GENERIC_CPU_DEVICES))
> return;
>
> for_each_present_cpu(i) {
> - if (arch_register_cpu(i))
> - panic("Failed to register CPU device");
> + ret = arch_register_cpu(i);
> + if (ret)
> + pr_warn("register_cpu %d failed (%d)\n", i, ret);
> }
> }
>
The same warning message is already printed by arch/loongarch/kernel/topology.c::arch_register_cpu().
To avoid the duplication, I think the warning message in arch/loongarch needs to be dropped?
Thanks,
Gavin
On 9/14/23 22:27, Jonathan Cameron wrote:
> On Wed, 13 Sep 2023 16:38:02 +0000
> James Morse <[email protected]> wrote:
>
>> Today the ACPI enumeration code 'visits' all devices that are present.
>>
>> This is a problem for arm64, where CPUs are always present, but not
>> always enabled. When a device-check occurs because the firmware-policy
>> has changed and a CPU is now enabled, the following error occurs:
>> | acpi ACPI0007:48: Enumeration failure
>>
>> This is ultimately because acpi_dev_ready_for_enumeration() returns
>> true for a device that is not enabled. The ACPI Processor driver
>> will not register such CPUs as they are not 'decoding their resources'.
>>
>> Change acpi_dev_ready_for_enumeration() to also check the enabled bit.
>> ACPI allows a device to be functional instead of maintaining the
>> present and enabled bit. Make this behaviour an explicit check with
>> a reference to the spec, and then check the present and enabled bits.
>
> "and then" only applies if the functional route hasn't been followed:
> "if this is not the case, check the present and enabled bits."
>
>> This is needed to avoid enumerating present && functional devices that
>> are not enabled.
>>
>> Signed-off-by: James Morse <[email protected]>
>> ---
>> If this change causes problems on deployed hardware, I suggest an
>> arch opt-in: ACPI_IGNORE_STA_ENABLED, that causes
>> acpi_dev_ready_for_enumeration() to only check the present bit.
>> ---
>> drivers/acpi/device_pm.c | 2 +-
>> drivers/acpi/device_sysfs.c | 2 +-
>> drivers/acpi/internal.h | 1 -
>> drivers/acpi/property.c | 2 +-
>> drivers/acpi/scan.c | 23 +++++++++++++----------
>> 5 files changed, 16 insertions(+), 14 deletions(-)
>>
>> diff --git a/drivers/acpi/device_pm.c b/drivers/acpi/device_pm.c
>> index f007116a8427..76c38478a502 100644
>> --- a/drivers/acpi/device_pm.c
>> +++ b/drivers/acpi/device_pm.c
>> @@ -313,7 +313,7 @@ int acpi_bus_init_power(struct acpi_device *device)
>> return -EINVAL;
>>
>> device->power.state = ACPI_STATE_UNKNOWN;
>> - if (!acpi_device_is_present(device)) {
>> + if (!acpi_dev_ready_for_enumeration(device)) {
>> device->flags.initialized = false;
>> return -ENXIO;
>> }
>> diff --git a/drivers/acpi/device_sysfs.c b/drivers/acpi/device_sysfs.c
>> index b9bbf0746199..16e586d74aa2 100644
>> --- a/drivers/acpi/device_sysfs.c
>> +++ b/drivers/acpi/device_sysfs.c
>> @@ -141,7 +141,7 @@ static int create_pnp_modalias(const struct acpi_device *acpi_dev, char *modalia
>> struct acpi_hardware_id *id;
>>
>> /* Avoid unnecessarily loading modules for non present devices. */
>> - if (!acpi_device_is_present(acpi_dev))
>> + if (!acpi_dev_ready_for_enumeration(acpi_dev))
>> return 0;
>>
>> /*
>> diff --git a/drivers/acpi/internal.h b/drivers/acpi/internal.h
>> index 866c7c4ed233..a1b45e345bcc 100644
>> --- a/drivers/acpi/internal.h
>> +++ b/drivers/acpi/internal.h
>> @@ -107,7 +107,6 @@ int acpi_device_setup_files(struct acpi_device *dev);
>> void acpi_device_remove_files(struct acpi_device *dev);
>> void acpi_device_add_finalize(struct acpi_device *device);
>> void acpi_free_pnp_ids(struct acpi_device_pnp *pnp);
>> -bool acpi_device_is_present(const struct acpi_device *adev);
>> bool acpi_device_is_battery(struct acpi_device *adev);
>> bool acpi_device_is_first_physical_node(struct acpi_device *adev,
>> const struct device *dev);
>> diff --git a/drivers/acpi/property.c b/drivers/acpi/property.c
>> index 413e4fcadcaf..e03f00b98701 100644
>> --- a/drivers/acpi/property.c
>> +++ b/drivers/acpi/property.c
>> @@ -1418,7 +1418,7 @@ static bool acpi_fwnode_device_is_available(const struct fwnode_handle *fwnode)
>> if (!is_acpi_device_node(fwnode))
>> return false;
>>
>> - return acpi_device_is_present(to_acpi_device_node(fwnode));
>> + return acpi_dev_ready_for_enumeration(to_acpi_device_node(fwnode));
>> }
>>
>> static const void *
>> diff --git a/drivers/acpi/scan.c b/drivers/acpi/scan.c
>> index 17ab875a7d4e..f898591ce05f 100644
>> --- a/drivers/acpi/scan.c
>> +++ b/drivers/acpi/scan.c
>> @@ -304,7 +304,7 @@ static int acpi_scan_device_check(struct acpi_device *adev)
>> int error;
>>
>> acpi_bus_get_status(adev);
>> - if (acpi_device_is_present(adev)) {
>> + if (acpi_dev_ready_for_enumeration(adev)) {
>> /*
>> * This function is only called for device objects for which
>> * matching scan handlers exist. The only situation in which
>> @@ -338,7 +338,7 @@ static int acpi_scan_bus_check(struct acpi_device *adev, void *not_used)
>> int error;
>>
>> acpi_bus_get_status(adev);
>> - if (!acpi_device_is_present(adev)) {
>> + if (!acpi_dev_ready_for_enumeration(adev)) {
>> acpi_scan_device_not_enumerated(adev);
>> return 0;
>> }
>> @@ -1908,11 +1908,6 @@ static bool acpi_device_should_be_hidden(acpi_handle handle)
>> return true;
>> }
>>
>> -bool acpi_device_is_present(const struct acpi_device *adev)
>> -{
>> - return adev->status.present || adev->status.functional;
>> -}
>> -
>> static bool acpi_scan_handler_matching(struct acpi_scan_handler *handler,
>> const char *idstr,
>> const struct acpi_device_id **matchid)
>> @@ -2375,16 +2370,24 @@ EXPORT_SYMBOL_GPL(acpi_dev_clear_dependencies);
>> * acpi_dev_ready_for_enumeration - Check if the ACPI device is ready for enumeration
>> * @device: Pointer to the &struct acpi_device to check
>> *
>> - * Check if the device is present and has no unmet dependencies.
>> + * Check if the device is functional or enabled and has no unmet dependencies.
>> *
>> - * Return true if the device is ready for enumeratino. Otherwise, return false.
>> + * Return true if the device is ready for enumeration. Otherwise, return false.
>> */
>> bool acpi_dev_ready_for_enumeration(const struct acpi_device *device)
>> {
>> if (device->flags.honor_deps && device->dep_unmet)
>> return false;
>>
>> - return acpi_device_is_present(device);
>> + /*
>> + * ACPI 6.5's 6.3.7 "_STA (Device Status)" allows firmware to return
>> + * (!present && functional) for certain types of devices that should be
>> + * enumerated.
>
> I'd call out the fact that enumeration isn't the same as "device driver should be loaded",
> which is the thing that functional is supposed to indicate should not happen.
>
>> + */
>> + if (!device->status.present && !device->status.enabled)
>
> In theory there is no need to check !enabled if !present:
> "If bit [0] is cleared, then bit 1 must also be cleared (in other words, a device that is not present cannot be enabled)."
> We could report an ACPI bug if that's seen. If that bug case is ignored, this code can
> become simpler:
>
> 	if (device->status.present)
> 		return device->status.enabled;
> 	else
> 		return device->status.functional;
>
> Or the following is also valid here (as functional should be set for enabled present devices
> unless they failed diagnostics):
>
> 	if (device->status.functional)
> 		return true;
> 	return device->status.present && device->status.enabled;
>
> On the assumption that we want to enumerate dead devices for debug purposes...
>
I think it's worth including a few words in the comment about the relationship between the
present and enabled bits, as outlined by Jonathan, to help readers understand the code.
Something like the following:
/*
* ACPI 6.5's 6.3.7 "_STA (Device Status)" allows firmware to return
* (!present && functional) for certain types of devices that should be
* enumerated. Note that the enabled bit can't be set until the present
* bit is set.
*/
>
>> + return device->status.functional;
>> +
>> + return device->status.present && device->status.enabled;
>
>
>> }
>> EXPORT_SYMBOL_GPL(acpi_dev_ready_for_enumeration);
>>
Thanks,
Gavin
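The three formulations discussed in this sub-thread (the check as posted, and Jonathan's two alternatives) can be compared side by side. This is a sketch under assumptions: the function names are invented, and the status bits are taken from a raw _STA value using bit 0 for present, bit 1 for enabled and bit 3 for functioning, rather than from struct acpi_device.

```c
#include <stdbool.h>
#include <stdint.h>

#define STA_PRESENT	(1u << 0)
#define STA_ENABLED	(1u << 1)
#define STA_FUNCTIONAL	(1u << 3)

/* The check as posted in the patch. */
bool ready_as_posted(uint32_t sta)
{
	bool present = (sta & STA_PRESENT) != 0;
	bool enabled = (sta & STA_ENABLED) != 0;
	bool functional = (sta & STA_FUNCTIONAL) != 0;

	if (!present && !enabled)
		return functional;

	return present && enabled;
}

/* First alternative: trust that !present implies !enabled. */
bool ready_present_first(uint32_t sta)
{
	if (sta & STA_PRESENT)
		return (sta & STA_ENABLED) != 0;

	return (sta & STA_FUNCTIONAL) != 0;
}

/* Second alternative: functional always wins. */
bool ready_functional_first(uint32_t sta)
{
	if (sta & STA_FUNCTIONAL)
		return true;

	return (sta & STA_PRESENT) && (sta & STA_ENABLED);
}
```

In this model the first alternative differs from the posted check only for (!present && enabled && functional), which the spec forbids. The second alternative returns true for (present && !enabled && functional), which appears to be the case the commit message wants to stop enumerating, so it may not be a drop-in replacement.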
On 9/14/23 02:38, James Morse wrote:
> ACPI has two ways of describing processors in the DSDT. Either as a device
> object with HID ACPI0007, or as a type 'C' package inside a Processor
> Container. The ACPI processor driver probes CPUs described as devices, but
> not those described as packages.
>
> Duplicate descriptions are not allowed, the ACPI processor driver already
> parses the UID from both devices and containers. acpi_processor_get_info()
> returns an error if the UID exists twice in the DSDT.
>
> The missing probe for CPUs described as packages creates a problem for
> moving the cpu_register() calls into the acpi_processor driver, as CPUs
> described like this don't get registered, leading to errors from other
> subsystems when they try to add new sysfs entries to the CPU node.
> (e.g. topology_sysfs_init()'s use of topology_add_dev() via cpuhp)
>
> To fix this, parse the processor container and call acpi_processor_add()
> for each processor that is discovered like this. The processor container
> handler is added with acpi_scan_add_handler(), so no detach call will
> arrive.
>
> Qemu TCG describes CPUs using packages in a processor container.
>
> Signed-off-by: James Morse <[email protected]>
> ---
> drivers/acpi/acpi_processor.c | 22 ++++++++++++++++++++++
> 1 file changed, 22 insertions(+)
>
I don't understand the last sentence of the commit log. QEMU
always uses "ACPI0007" for the processor devices.
#define ACPI_PROCESSOR_DEVICE_HID "ACPI0007"
#define ACPI_PROCESSOR_OBJECT_HID "LNXCPU"
[gshan@gshan q]$ git grep ACPI0007
hw/acpi/cpu.c: aml_append(dev, aml_name_decl("_HID", aml_string("ACPI0007")));
hw/arm/virt-acpi-build.c: aml_append(dev, aml_name_decl("_HID", aml_string("ACPI0007")));
hw/riscv/virt-acpi-build.c: aml_append(dev, aml_name_decl("_HID", aml_string("ACPI0007")));
[gshan@gshan q]$ git grep LNXCPU
> diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
> index c0839bcf78c1..b4bde78121bb 100644
> --- a/drivers/acpi/acpi_processor.c
> +++ b/drivers/acpi/acpi_processor.c
> @@ -625,9 +625,31 @@ static struct acpi_scan_handler processor_handler = {
> },
> };
>
> +static acpi_status acpi_processor_container_walk(acpi_handle handle,
> + u32 lvl,
> + void *context,
> + void **rv)
> +{
> + struct acpi_device *adev;
> + acpi_status status;
> +
> + adev = acpi_get_acpi_dev(handle);
> + if (!adev)
> + return AE_ERROR;
> +
> + status = acpi_processor_add(adev, &processor_device_ids[0]);
> + acpi_put_acpi_dev(adev);
> +
> + return status;
> +}
> +
> static int acpi_processor_container_attach(struct acpi_device *dev,
> const struct acpi_device_id *id)
> {
> + acpi_walk_namespace(ACPI_TYPE_PROCESSOR, dev->handle,
> + ACPI_UINT32_MAX, acpi_processor_container_walk,
> + NULL, NULL, NULL);
> +
> return 1;
> }
>
Thanks,
Gavin
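The shape of the container walk in the patch above is a filtered tree walk: visit every descendant of the container and call a handler only for objects of one type. A simplified model follows. All types and names here are hypothetical, and stopping at the first non-zero return is an assumption of the model rather than a statement about acpi_walk_namespace().

```c
#include <stddef.h>

enum node_type { NODE_DEVICE, NODE_PROCESSOR };

/* Hypothetical stand-in for an ACPI namespace object. */
struct ns_node {
	enum node_type type;
	int uid;
	const struct ns_node *children;
	size_t nr_children;
};

typedef int (*walk_fn)(const struct ns_node *node, void *context);

/*
 * Call fn for every descendant of start whose type matches. The walk
 * stops at the first non-zero return and propagates that value.
 */
int model_walk_namespace(enum node_type type, const struct ns_node *start,
			 walk_fn fn, void *context)
{
	for (size_t i = 0; i < start->nr_children; i++) {
		const struct ns_node *child = &start->children[i];
		int ret;

		if (child->type == type) {
			ret = fn(child, context);
			if (ret)
				return ret;
		}

		ret = model_walk_namespace(type, child, fn, context);
		if (ret)
			return ret;
	}

	return 0;
}

/* Example handler: count the processors found under the container. */
int model_count_processor(const struct ns_node *node, void *context)
{
	(void)node;
	(*(int *)context)++;
	return 0;
}
```

A container holding two processor objects and one unrelated device yields a count of two; the device is skipped because its type does not match.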
On 9/14/23 02:38, James Morse wrote:
> acpi_device_is_present() checks the present or functional bits
> from the cached copy of _STA.
>
> A few places open-code this check. Use the helper instead to
> improve readability.
>
> Signed-off-by: James Morse <[email protected]>
> ---
> drivers/acpi/scan.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
Reviewed-by: Gavin Shan <[email protected]>
> diff --git a/drivers/acpi/scan.c b/drivers/acpi/scan.c
> index 691d4b7686ee..ed01e19514ef 100644
> --- a/drivers/acpi/scan.c
> +++ b/drivers/acpi/scan.c
> @@ -304,7 +304,7 @@ static int acpi_scan_device_check(struct acpi_device *adev)
> int error;
>
> acpi_bus_get_status(adev);
> - if (adev->status.present || adev->status.functional) {
> + if (acpi_device_is_present(adev)) {
> /*
> * This function is only called for device objects for which
> * matching scan handlers exist. The only situation in which
> @@ -338,7 +338,7 @@ static int acpi_scan_bus_check(struct acpi_device *adev, void *not_used)
> int error;
>
> acpi_bus_get_status(adev);
> - if (!(adev->status.present || adev->status.functional)) {
> + if (!acpi_device_is_present(adev)) {
> acpi_scan_device_not_present(adev);
> return 0;
> }
On 9/14/23 02:37, James Morse wrote:
> intel_epb_init() is called as a subsys_initcall() to register cpuhp
> callbacks. The callbacks make use of get_cpu_device() which will return
> NULL unless register_cpu() has been called. register_cpu() is called
> from topology_init(), which is also a subsys_initcall().
>
> This is fragile. Moving the register_cpu() to a different
> subsys_initcall() leads to a NULL derefernce during boot.
^^^^^^^^^^
s/derefernce/dereference/
Reported by ./scripts/checkpatch.pl --codespell
>
> Make intel_epb_init() a late_initcall(), user-space can't provide a
> policy before this point anyway.
>
> Signed-off-by: James Morse <[email protected]>
> ---
> subsys_initcall_sync() would be an option, but moving the register_cpu()
> calls into ACPI also means adding a safety net for CPUs that are online
> but not described properly by firmware. This lives in subsys_initcall_sync().
> ---
> arch/x86/kernel/cpu/intel_epb.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/x86/kernel/cpu/intel_epb.c b/arch/x86/kernel/cpu/intel_epb.c
> index e4c3ba91321c..f18d35fe27a9 100644
> --- a/arch/x86/kernel/cpu/intel_epb.c
> +++ b/arch/x86/kernel/cpu/intel_epb.c
> @@ -237,4 +237,4 @@ static __init int intel_epb_init(void)
> cpuhp_remove_state(CPUHP_AP_X86_INTEL_EPB_ONLINE);
> return ret;
> }
> -subsys_initcall(intel_epb_init);
> +late_initcall(intel_epb_init);
Thanks,
Gavin
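The ordering problem described in the commit message above can be shown with a toy initcall runner. This is a sketch, not kernel code: the three levels are modelled as an enum, array order stands in for link order within a level, and every name is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Only the three levels mentioned in this thread, in execution order. */
enum initcall_level {
	INITCALL_SUBSYS,
	INITCALL_SUBSYS_SYNC,
	INITCALL_LATE,
};

struct boot_state {
	bool cpu_registered;	/* register_cpu() has run */
	bool consumer_saw_cpu;	/* what the cpuhp callback would find */
};

typedef void (*initcall_fn)(struct boot_state *s);

struct initcall {
	enum initcall_level level;
	initcall_fn fn;
};

void model_register_cpus(struct boot_state *s)
{
	s->cpu_registered = true;
}

/* Stands in for a callback that relies on get_cpu_device(). */
void model_epb_like_init(struct boot_state *s)
{
	s->consumer_saw_cpu = s->cpu_registered;
}

/* Run level by level; within a level, array order models link order. */
void model_run_initcalls(const struct initcall *calls, size_t n,
			 struct boot_state *s)
{
	for (int level = INITCALL_SUBSYS; level <= INITCALL_LATE; level++)
		for (size_t i = 0; i < n; i++)
			if ((int)calls[i].level == level)
				calls[i].fn(s);
}
```

When both calls sit at the same level, the consumer only works if it happens to be linked after the producer. Moving the consumer to the late level makes the result independent of link order, even with registration at the sync level.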
On Wed, Sep 13, 2023 at 04:37:48PM +0000, James Morse wrote:
> This series is based on v6.6-rc1, and can be retrieved from:
> https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git/ virtual_cpu_hotplug/rfc/v2
Hi James,
FYI, this doesn't seem to be based upon v6.6-rc1, but v6.4-rc5.
virtual_cpu_hotplug/rfc/v2 seems to have a hash of 505859b05e15.
Thanks.
On 9/14/23 02:38, James Morse wrote:
> acpi_scan_device_not_present() is called when a device in the
> hierarchy is not available for enumeration. Historically enumeration
> was only based on whether the device was present.
>
> To add support for only enumerating devices that are both present
> and enabled, this helper should be renamed. It was only ever about
> enumeration, rename it acpi_scan_device_not_enumerated().
>
> No change in behaviour is intended.
>
> Signed-off-by: James Morse <[email protected]>
> ---
> drivers/acpi/scan.c | 8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
Reviewed-by: Gavin Shan <[email protected]>
> diff --git a/drivers/acpi/scan.c b/drivers/acpi/scan.c
> index ed01e19514ef..17ab875a7d4e 100644
> --- a/drivers/acpi/scan.c
> +++ b/drivers/acpi/scan.c
> @@ -289,10 +289,10 @@ static int acpi_scan_hot_remove(struct acpi_device *device)
> return 0;
> }
>
> -static int acpi_scan_device_not_present(struct acpi_device *adev)
> +static int acpi_scan_device_not_enumerated(struct acpi_device *adev)
> {
> if (!acpi_device_enumerated(adev)) {
> - dev_warn(&adev->dev, "Still not present\n");
> + dev_warn(&adev->dev, "Still not enumerated\n");
> return -EALREADY;
> }
> acpi_bus_trim(adev);
> @@ -327,7 +327,7 @@ static int acpi_scan_device_check(struct acpi_device *adev)
> error = -ENODEV;
> }
> } else {
> - error = acpi_scan_device_not_present(adev);
> + error = acpi_scan_device_not_enumerated(adev);
> }
> return error;
> }
> @@ -339,7 +339,7 @@ static int acpi_scan_bus_check(struct acpi_device *adev, void *not_used)
>
> acpi_bus_get_status(adev);
> if (!acpi_device_is_present(adev)) {
> - acpi_scan_device_not_present(adev);
> + acpi_scan_device_not_enumerated(adev);
> return 0;
> }
> if (handler && handler->hotplug.scan_dependent)
Thanks,
Gavin
On 9/14/23 02:38, James Morse wrote:
> A subsequent patch will change acpi_scan_hot_remove() to call
> acpi_bus_trim_one() instead of acpi_bus_trim(), meaning it can no longer
> rely on the prototype in the header file.
>
> Move these functions further up the file.
> No change in behaviour.
>
> Signed-off-by: James Morse <[email protected]>
> ---
> drivers/acpi/scan.c | 76 ++++++++++++++++++++++-----------------------
> 1 file changed, 38 insertions(+), 38 deletions(-)
>
Reviewed-by: Gavin Shan <[email protected]>
> diff --git a/drivers/acpi/scan.c b/drivers/acpi/scan.c
> index f898591ce05f..a675333618ae 100644
> --- a/drivers/acpi/scan.c
> +++ b/drivers/acpi/scan.c
> @@ -244,6 +244,44 @@ static int acpi_scan_try_to_offline(struct acpi_device *device)
> return 0;
> }
>
> +static int acpi_bus_trim_one(struct acpi_device *adev, void *not_used)
> +{
> + struct acpi_scan_handler *handler = adev->handler;
> +
> + acpi_dev_for_each_child_reverse(adev, acpi_bus_trim_one, NULL);
> +
> + adev->flags.match_driver = false;
> + if (handler) {
> + if (handler->detach)
> + handler->detach(adev);
> +
> + adev->handler = NULL;
> + } else {
> + device_release_driver(&adev->dev);
> + }
> + /*
> + * Most likely, the device is going away, so put it into D3cold before
> + * that.
> + */
> + acpi_device_set_power(adev, ACPI_STATE_D3_COLD);
> + adev->flags.initialized = false;
> + acpi_device_clear_enumerated(adev);
> +
> + return 0;
> +}
> +
> +/**
> + * acpi_bus_trim - Detach scan handlers and drivers from ACPI device objects.
> + * @adev: Root of the ACPI namespace scope to walk.
> + *
> + * Must be called under acpi_scan_lock.
> + */
> +void acpi_bus_trim(struct acpi_device *adev)
> +{
> + acpi_bus_trim_one(adev, NULL);
> +}
> +EXPORT_SYMBOL_GPL(acpi_bus_trim);
> +
> static int acpi_scan_hot_remove(struct acpi_device *device)
> {
> acpi_handle handle = device->handle;
> @@ -2506,44 +2544,6 @@ int acpi_bus_scan(acpi_handle handle)
> }
> EXPORT_SYMBOL(acpi_bus_scan);
>
> -static int acpi_bus_trim_one(struct acpi_device *adev, void *not_used)
> -{
> - struct acpi_scan_handler *handler = adev->handler;
> -
> - acpi_dev_for_each_child_reverse(adev, acpi_bus_trim_one, NULL);
> -
> - adev->flags.match_driver = false;
> - if (handler) {
> - if (handler->detach)
> - handler->detach(adev);
> -
> - adev->handler = NULL;
> - } else {
> - device_release_driver(&adev->dev);
> - }
> - /*
> - * Most likely, the device is going away, so put it into D3cold before
> - * that.
> - */
> - acpi_device_set_power(adev, ACPI_STATE_D3_COLD);
> - adev->flags.initialized = false;
> - acpi_device_clear_enumerated(adev);
> -
> - return 0;
> -}
> -
> -/**
> - * acpi_bus_trim - Detach scan handlers and drivers from ACPI device objects.
> - * @adev: Root of the ACPI namespace scope to walk.
> - *
> - * Must be called under acpi_scan_lock.
> - */
> -void acpi_bus_trim(struct acpi_device *adev)
> -{
> - acpi_bus_trim_one(adev, NULL);
> -}
> -EXPORT_SYMBOL_GPL(acpi_bus_trim);
> -
> int acpi_bus_register_early_device(int type)
> {
> struct acpi_device *device = NULL;
Thanks,
Gavin
On 9/14/23 02:38, James Morse wrote:
> To allow ACPI to skip the call to arch_register_cpu() when the _STA
> value indicates the CPU can't be brought online right now, move the
> arch_register_cpu() call into acpi_processor_get_info().
>
> Systems can still be booted with 'acpi=off', or not include an
> ACPI description at all. For these, the CPUs continue to be
> registered by cpu_dev_register_generic().
>
> This moves the CPU register logic back to a subsys_initcall(),
> while the memory nodes will have been registered earlier.
>
> Signed-off-by: James Morse <[email protected]>
> ---
> drivers/acpi/acpi_processor.c | 13 +++++++++++++
> drivers/base/cpu.c | 2 +-
> 2 files changed, 14 insertions(+), 1 deletion(-)
>
With the following nits addressed:
Reviewed-by: Gavin Shan <[email protected]>
> diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
> index a01e315aa16a..867782bc50b0 100644
> --- a/drivers/acpi/acpi_processor.c
> +++ b/drivers/acpi/acpi_processor.c
> @@ -313,6 +313,19 @@ static int acpi_processor_get_info(struct acpi_device *device)
> cpufreq_add_device("acpi-cpufreq");
> }
>
> + /*
> + * Register CPUs that are present.
> + * Use get_cpu_device() to skip duplicate CPU descriptions from
> + * firmware.
> + */
> + if (!invalid_logical_cpuid(pr->id) && cpu_present(pr->id) &&
> + !get_cpu_device(pr->id)) {
> + int ret = arch_register_cpu(pr->id);
> +
> + if (ret)
> + return ret;
> + }
> +
The multiple lines of comments could be combined a bit:
/*
* Register CPUs that are present. get_cpu_device() is used to
* skip duplicate CPU descriptions from firmware.
*/
> /*
> * Extra Processor objects may be enumerated on MP systems with
> * less than the max # of CPUs. They should be ignored _iff
> diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c
> index d31c936f0955..677f963e02ce 100644
> --- a/drivers/base/cpu.c
> +++ b/drivers/base/cpu.c
> @@ -537,7 +537,7 @@ static void __init cpu_dev_register_generic(void)
> {
> int i, ret;
>
> - if (!IS_ENABLED(CONFIG_GENERIC_CPU_DEVICES))
> + if (!IS_ENABLED(CONFIG_GENERIC_CPU_DEVICES) || !acpi_disabled)
> return;
>
> for_each_present_cpu(i) {
A comment would be worthwhile here, to explain why "!acpi_disabled" is needed.
Thanks,
Gavin
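The split of responsibility in the patch above (the ACPI processor driver registers CPUs when ACPI is in use, the generic path does so otherwise) can be sketched as a small model. The struct and both functions are hypothetical; they only mirror the conditions visible in the quoted diff, with the registrations counter standing in for get_cpu_device().

```c
#include <stdbool.h>

#define REG_MODEL_NR_CPUS 4	/* hypothetical limit for the model */

struct reg_model {
	bool acpi_disabled;
	bool present[REG_MODEL_NR_CPUS];
	int registrations[REG_MODEL_NR_CPUS];
};

/* Generic path: a no-op whenever ACPI is in use. Returns CPUs registered. */
int model_register_generic(struct reg_model *m)
{
	int count = 0;

	if (!m->acpi_disabled)
		return 0;

	for (int cpu = 0; cpu < REG_MODEL_NR_CPUS; cpu++) {
		if (m->present[cpu]) {
			m->registrations[cpu]++;
			count++;
		}
	}

	return count;
}

/*
 * ACPI path: called once per processor description. Skips invalid ids,
 * CPUs that are not present, and CPUs that are already registered
 * (duplicate descriptions). Returns true if this call registered the CPU.
 */
bool model_register_from_acpi(struct reg_model *m, int cpu)
{
	if (m->acpi_disabled)
		return false;
	if (cpu < 0 || cpu >= REG_MODEL_NR_CPUS)
		return false;
	if (!m->present[cpu] || m->registrations[cpu])
		return false;

	m->registrations[cpu]++;
	return true;
}
```

In the model each present CPU ends up registered exactly once whichever path is active, which may be a useful property to state in the comment requested above.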
On 9/14/23 02:38, James Morse wrote:
> The code behind ACPI_HOTPLUG_CPU allows a not-present CPU to become
> present. This isn't the only use of HOTPLUG_CPU. On arm64 and riscv
> CPUs can be taken offline as a power saving measure.
>
> On arm64 an offline CPU may be disabled by firmware, preventing it from
> being brought back online, but it remains present throughout.
>
> Adding code to prevent user-space trying to online these disabled CPUs
> needs some additional terminology.
>
> Rename the Kconfig symbol to CONFIG_ACPI_HOTPLUG_PRESENT_CPU to reflect
> that it makes possible CPUs present.
>
> HOTPLUG_CPU is untouched as this is only about the ACPI mechanism.
>
> Signed-off-by: James Morse <[email protected]>
> ---
> arch/ia64/Kconfig | 2 +-
> arch/ia64/include/asm/acpi.h | 2 +-
> arch/ia64/kernel/acpi.c | 6 +++---
> arch/ia64/kernel/setup.c | 2 +-
> arch/loongarch/configs/loongson3_defconfig | 2 +-
> arch/loongarch/kernel/acpi.c | 4 ++--
> arch/x86/Kconfig | 2 +-
> arch/x86/kernel/acpi/boot.c | 4 ++--
> drivers/acpi/Kconfig | 4 ++--
> drivers/acpi/acpi_processor.c | 10 +++++-----
> include/acpi/processor.h | 2 +-
> include/linux/acpi.h | 6 +++---
> 12 files changed, 23 insertions(+), 23 deletions(-)
>
The replacement was missed in arch/loongarch/Kconfig:
[gshan@gshan l]$ git grep ACPI_HOTPLUG_CPU
arch/loongarch/Kconfig: select ACPI_HOTPLUG_CPU if ACPI_PROCESSOR && HOTPLUG_CPU
> diff --git a/arch/ia64/Kconfig b/arch/ia64/Kconfig
> index a3bfd42467ab..54972f9fe804 100644
> --- a/arch/ia64/Kconfig
> +++ b/arch/ia64/Kconfig
> @@ -16,7 +16,7 @@ config IA64
> select ARCH_MIGHT_HAVE_PC_PARPORT
> select ARCH_MIGHT_HAVE_PC_SERIO
> select ACPI
> - select ACPI_HOTPLUG_CPU if ACPI_PROCESSOR && HOTPLUG_CPU
> + select ACPI_HOTPLUG_PRESENT_CPU if ACPI_PROCESSOR && HOTPLUG_CPU
> select ACPI_NUMA if NUMA
> select ARCH_ENABLE_MEMORY_HOTPLUG
> select ARCH_ENABLE_MEMORY_HOTREMOVE
> diff --git a/arch/ia64/include/asm/acpi.h b/arch/ia64/include/asm/acpi.h
> index 58500a964238..482ea994d1e1 100644
> --- a/arch/ia64/include/asm/acpi.h
> +++ b/arch/ia64/include/asm/acpi.h
> @@ -52,7 +52,7 @@ extern unsigned int is_cpu_cpei_target(unsigned int cpu);
> extern void set_cpei_target_cpu(unsigned int cpu);
> extern unsigned int get_cpei_target_cpu(void);
> extern void prefill_possible_map(void);
> -#ifdef CONFIG_ACPI_HOTPLUG_CPU
> +#ifdef CONFIG_ACPI_HOTPLUG_PRESENT_CPU
> extern int additional_cpus;
> #else
> #define additional_cpus 0
> diff --git a/arch/ia64/kernel/acpi.c b/arch/ia64/kernel/acpi.c
> index 15f6cfddcc08..35881bf4b016 100644
> --- a/arch/ia64/kernel/acpi.c
> +++ b/arch/ia64/kernel/acpi.c
> @@ -194,7 +194,7 @@ acpi_parse_plat_int_src(union acpi_subtable_headers * header,
> return 0;
> }
>
> -#ifdef CONFIG_HOTPLUG_CPU
> +#ifdef CONFIG_ACPI_HOTPLUG_PRESENT_CPU
> unsigned int can_cpei_retarget(void)
> {
> extern int cpe_vector;
> @@ -711,7 +711,7 @@ int acpi_isa_irq_to_gsi(unsigned isa_irq, u32 *gsi)
> /*
> * ACPI based hotplug CPU support
> */
> -#ifdef CONFIG_ACPI_HOTPLUG_CPU
> +#ifdef CONFIG_ACPI_HOTPLUG_PRESENT_CPU
> int acpi_map_cpu2node(acpi_handle handle, int cpu, int physid)
> {
> #ifdef CONFIG_ACPI_NUMA
> @@ -820,7 +820,7 @@ int acpi_unmap_cpu(int cpu)
> return (0);
> }
> EXPORT_SYMBOL(acpi_unmap_cpu);
> -#endif /* CONFIG_ACPI_HOTPLUG_CPU */
> +#endif /* CONFIG_ACPI_HOTPLUG_PRESENT_CPU */
>
> #ifdef CONFIG_ACPI_NUMA
> static acpi_status acpi_map_iosapic(acpi_handle handle, u32 depth,
> diff --git a/arch/ia64/kernel/setup.c b/arch/ia64/kernel/setup.c
> index 5a55ac82c13a..44591716d07b 100644
> --- a/arch/ia64/kernel/setup.c
> +++ b/arch/ia64/kernel/setup.c
> @@ -569,7 +569,7 @@ setup_arch (char **cmdline_p)
> #ifdef CONFIG_ACPI_NUMA
> acpi_numa_init();
> acpi_numa_fixup();
> -#ifdef CONFIG_ACPI_HOTPLUG_CPU
> +#ifdef CONFIG_ACPI_HOTPLUG_PRESENT_CPU
> prefill_possible_map();
> #endif
> per_cpu_scan_finalize((cpumask_empty(&early_cpu_possible_map) ?
> diff --git a/arch/loongarch/configs/loongson3_defconfig b/arch/loongarch/configs/loongson3_defconfig
> index a3b52aaa83b3..ef3bc76313e4 100644
> --- a/arch/loongarch/configs/loongson3_defconfig
> +++ b/arch/loongarch/configs/loongson3_defconfig
> @@ -59,7 +59,7 @@ CONFIG_ACPI_SPCR_TABLE=y
> CONFIG_ACPI_TAD=y
> CONFIG_ACPI_DOCK=y
> CONFIG_ACPI_IPMI=m
> -CONFIG_ACPI_HOTPLUG_CPU=y
> +CONFIG_ACPI_HOTPLUG_PRESENT_CPU=y
> CONFIG_ACPI_PCI_SLOT=y
> CONFIG_ACPI_HOTPLUG_MEMORY=y
> CONFIG_EFI_ZBOOT=y
> diff --git a/arch/loongarch/kernel/acpi.c b/arch/loongarch/kernel/acpi.c
> index 9450e09073eb..b5153e395ad9 100644
> --- a/arch/loongarch/kernel/acpi.c
> +++ b/arch/loongarch/kernel/acpi.c
> @@ -289,7 +289,7 @@ void __init arch_reserve_mem_area(acpi_physical_address addr, size_t size)
> memblock_reserve(addr, size);
> }
>
> -#ifdef CONFIG_ACPI_HOTPLUG_CPU
> +#ifdef CONFIG_ACPI_HOTPLUG_PRESENT_CPU
>
> #include <acpi/processor.h>
>
> @@ -341,4 +341,4 @@ int acpi_unmap_cpu(int cpu)
> }
> EXPORT_SYMBOL(acpi_unmap_cpu);
>
> -#endif /* CONFIG_ACPI_HOTPLUG_CPU */
> +#endif /* CONFIG_ACPI_HOTPLUG_PRESENT_CPU */
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index 133ea5f561b5..295a7a3debb6 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -60,7 +60,7 @@ config X86
> #
> select ACPI_LEGACY_TABLES_LOOKUP if ACPI
> select ACPI_SYSTEM_POWER_STATES_SUPPORT if ACPI
> - select ACPI_HOTPLUG_CPU if ACPI_PROCESSOR && HOTPLUG_CPU
> + select ACPI_HOTPLUG_PRESENT_CPU if ACPI_PROCESSOR && HOTPLUG_CPU
> select ARCH_32BIT_OFF_T if X86_32
> select ARCH_CLOCKSOURCE_INIT
> select ARCH_CORRECT_STACKTRACE_ON_KRETPROBE
> diff --git a/arch/x86/kernel/acpi/boot.c b/arch/x86/kernel/acpi/boot.c
> index 2a0ea38955df..84dd4133754b 100644
> --- a/arch/x86/kernel/acpi/boot.c
> +++ b/arch/x86/kernel/acpi/boot.c
> @@ -814,7 +814,7 @@ static void __init acpi_set_irq_model_ioapic(void)
> /*
> * ACPI based hotplug support for CPU
> */
> -#ifdef CONFIG_ACPI_HOTPLUG_CPU
> +#ifdef CONFIG_ACPI_HOTPLUG_PRESENT_CPU
> #include <acpi/processor.h>
>
> static int acpi_map_cpu2node(acpi_handle handle, int cpu, int physid)
> @@ -863,7 +863,7 @@ int acpi_unmap_cpu(int cpu)
> return (0);
> }
> EXPORT_SYMBOL(acpi_unmap_cpu);
> -#endif /* CONFIG_ACPI_HOTPLUG_CPU */
> +#endif /* CONFIG_ACPI_HOTPLUG_PRESENT_CPU */
>
> int acpi_register_ioapic(acpi_handle handle, u64 phys_addr, u32 gsi_base)
> {
> diff --git a/drivers/acpi/Kconfig b/drivers/acpi/Kconfig
> index 8456d48ba702..417f9f3077d2 100644
> --- a/drivers/acpi/Kconfig
> +++ b/drivers/acpi/Kconfig
> @@ -305,7 +305,7 @@ config ACPI_IPMI
> To compile this driver as a module, choose M here:
> the module will be called as acpi_ipmi.
>
> -config ACPI_HOTPLUG_CPU
> +config ACPI_HOTPLUG_PRESENT_CPU
> bool
> depends on ACPI_PROCESSOR && HOTPLUG_CPU
> select ACPI_CONTAINER
> @@ -399,7 +399,7 @@ config ACPI_PCI_SLOT
>
> config ACPI_CONTAINER
> bool "Container and Module Devices"
> - default (ACPI_HOTPLUG_MEMORY || ACPI_HOTPLUG_CPU)
> + default (ACPI_HOTPLUG_MEMORY || ACPI_HOTPLUG_PRESENT_CPU)
> help
> This driver supports ACPI Container and Module devices (IDs
> ACPI0004, PNP0A05, and PNP0A06).
> diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
> index 867782bc50b0..75257fae10e7 100644
> --- a/drivers/acpi/acpi_processor.c
> +++ b/drivers/acpi/acpi_processor.c
> @@ -182,7 +182,7 @@ static void __init acpi_pcc_cpufreq_init(void) {}
> #endif /* CONFIG_X86 */
>
> /* Initialization */
> -#ifdef CONFIG_ACPI_HOTPLUG_CPU
> +#ifdef CONFIG_ACPI_HOTPLUG_PRESENT_CPU
> static int acpi_processor_hotadd_init(struct acpi_processor *pr)
> {
> unsigned long long sta;
> @@ -227,7 +227,7 @@ static inline int acpi_processor_hotadd_init(struct acpi_processor *pr)
> {
> return -ENODEV;
> }
> -#endif /* CONFIG_ACPI_HOTPLUG_CPU */
> +#endif /* CONFIG_ACPI_HOTPLUG_PRESENT_CPU */
>
> static int acpi_processor_get_info(struct acpi_device *device)
> {
> @@ -461,7 +461,7 @@ static int acpi_processor_add(struct acpi_device *device,
> return result;
> }
>
> -#ifdef CONFIG_ACPI_HOTPLUG_CPU
> +#ifdef CONFIG_ACPI_HOTPLUG_PRESENT_CPU
> /* Removal */
> static void acpi_processor_remove(struct acpi_device *device)
> {
> @@ -505,7 +505,7 @@ static void acpi_processor_remove(struct acpi_device *device)
> free_cpumask_var(pr->throttling.shared_cpu_map);
> kfree(pr);
> }
> -#endif /* CONFIG_ACPI_HOTPLUG_CPU */
> +#endif /* CONFIG_ACPI_HOTPLUG_PRESENT_CPU */
>
> #ifdef CONFIG_ARCH_MIGHT_HAVE_ACPI_PDC
> bool __init processor_physically_present(acpi_handle handle)
> @@ -630,7 +630,7 @@ static const struct acpi_device_id processor_device_ids[] = {
> static struct acpi_scan_handler processor_handler = {
> .ids = processor_device_ids,
> .attach = acpi_processor_add,
> -#ifdef CONFIG_ACPI_HOTPLUG_CPU
> +#ifdef CONFIG_ACPI_HOTPLUG_PRESENT_CPU
> .detach = acpi_processor_remove,
> #endif
> .hotplug = {
> diff --git a/include/acpi/processor.h b/include/acpi/processor.h
> index 94181fe9780a..fd6913370c72 100644
> --- a/include/acpi/processor.h
> +++ b/include/acpi/processor.h
> @@ -465,7 +465,7 @@ extern int acpi_processor_ffh_lpi_probe(unsigned int cpu);
> extern int acpi_processor_ffh_lpi_enter(struct acpi_lpi_state *lpi);
> #endif
>
> -#ifdef CONFIG_ACPI_HOTPLUG_CPU
> +#ifdef CONFIG_ACPI_HOTPLUG_PRESENT_CPU
> extern int arch_register_cpu(int cpu);
> extern void arch_unregister_cpu(int cpu);
> #endif
> diff --git a/include/linux/acpi.h b/include/linux/acpi.h
> index a73246c3c35e..651dd43976a9 100644
> --- a/include/linux/acpi.h
> +++ b/include/linux/acpi.h
> @@ -316,12 +316,12 @@ static inline int acpi_processor_evaluate_cst(acpi_handle handle, u32 cpu,
> }
> #endif
>
> -#ifdef CONFIG_ACPI_HOTPLUG_CPU
> +#ifdef CONFIG_ACPI_HOTPLUG_PRESENT_CPU
> /* Arch dependent functions for cpu hotplug support */
> int acpi_map_cpu(acpi_handle handle, phys_cpuid_t physid, u32 acpi_id,
> int *pcpu);
> int acpi_unmap_cpu(int cpu);
> -#endif /* CONFIG_ACPI_HOTPLUG_CPU */
> +#endif /* CONFIG_ACPI_HOTPLUG_PRESENT_CPU */
>
> #ifdef CONFIG_ACPI_HOTPLUG_IOAPIC
> int acpi_get_ioapic_id(acpi_handle handle, u32 gsi_base, u64 *phys_addr);
> @@ -644,7 +644,7 @@ static inline u32 acpi_osc_ctx_get_cxl_control(struct acpi_osc_context *context)
> #define ACPI_GSB_ACCESS_ATTRIB_RAW_PROCESS 0x0000000F
>
> /* Enable _OST when all relevant hotplug operations are enabled */
> -#if defined(CONFIG_ACPI_HOTPLUG_CPU) && \
> +#if defined(CONFIG_ACPI_HOTPLUG_PRESENT_CPU) && \
> defined(CONFIG_ACPI_HOTPLUG_MEMORY) && \
> defined(CONFIG_ACPI_CONTAINER)
> #define ACPI_HOTPLUG_OST
Thanks,
Gavin
On 9/14/23 02:38, James Morse wrote:
> LoongArch provides its own arch_unregister_cpu(). This clears the
> hotpluggable flag, then unregisters the CPU.
>
> It isn't necessary to clear the hotpluggable flag when unregistering
> a cpu. unregister_cpu() writes NULL to the percpu cpu_sys_devices
> pointer, meaning cpu_is_hotpluggable() will return false, as
> get_cpu_device() has returned NULL.
>
> Remove arch_unregister_cpu() and use the __weak version.
>
> Signed-off-by: James Morse <[email protected]>
> ---
> arch/loongarch/kernel/topology.c | 9 ---------
> 1 file changed, 9 deletions(-)
>
I think arch/x86/kernel/topology.c::arch_unregister_cpu() can be dropped as well.
Reviewed-by: Gavin Shan <[email protected]>
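For reference, the argument in the commit message can be modelled in user space roughly as below. This is only a sketch: the `*_model` names and the fixed-size array are hypothetical stand-ins for the kernel's struct cpu, the percpu cpu_sys_devices pointer, unregister_cpu() and cpu_is_hotpluggable(), not the real implementations.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS_MODEL 4

/* Hypothetical stand-in for the kernel's struct cpu. */
struct cpu_model {
	bool hotpluggable;
};

/* Hypothetical stand-in for the percpu cpu_sys_devices pointer. */
static struct cpu_model *cpu_sys_devices_model[NR_CPUS_MODEL];

void register_cpu_model(struct cpu_model *c, int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS_MODEL);
	cpu_sys_devices_model[cpu] = c;
}

void unregister_cpu_model(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS_MODEL);
	/* Only the lookup pointer is cleared; the hotpluggable flag is untouched. */
	cpu_sys_devices_model[cpu] = NULL;
}

bool cpu_is_hotpluggable_model(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS_MODEL);
	struct cpu_model *c = cpu_sys_devices_model[cpu];

	/* A NULL device means "not hotpluggable", whatever the stale flag says. */
	return c != NULL && c->hotpluggable;
}
```

In this model the flag is still set after unregistering, yet the query returns false, which is why clearing it first looks redundant.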
> diff --git a/arch/loongarch/kernel/topology.c b/arch/loongarch/kernel/topology.c
> index 8e4441c1ff39..5a75e2cc0848 100644
> --- a/arch/loongarch/kernel/topology.c
> +++ b/arch/loongarch/kernel/topology.c
> @@ -16,13 +16,4 @@ int arch_register_cpu(int cpu)
> return register_cpu(c, cpu);
> }
> EXPORT_SYMBOL(arch_register_cpu);
> -
> -void arch_unregister_cpu(int cpu)
> -{
> - struct cpu *c = &per_cpu(cpu_devices, cpu);
> -
> - c->hotpluggable = 0;
> - unregister_cpu(c);
> -}
> -EXPORT_SYMBOL(arch_unregister_cpu);
> #endif
Thanks,
Gavin
On 9/14/23 02:38, James Morse wrote:
> When called, acpi_processor_post_eject() unconditionally makes a CPU
> not-present and unregisters it.
>
> To add support for AML events where the CPU has become disabled, but
> remains present, the _STA method should be checked before calling
> acpi_processor_remove().
>
> Rename acpi_processor_post_eject() acpi_processor_remove_possible(), and
> check the _STA before calling.
>
> Adding the function prototype for arch_unregister_cpu() allows the
> preprocessor guards to be removed.
>
> After this change CPUs will remain registered and visible to
> user-space as offline if buggy firmware triggers an eject-request,
> but doesn't clear the corresponding _STA bits after _EJ0 has been
> called.
>
> Signed-off-by: James Morse <[email protected]>
> ---
> drivers/acpi/acpi_processor.c | 31 +++++++++++++++++++++++++------
> include/linux/cpu.h | 1 +
> 2 files changed, 26 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
> index 00dcc23d49a8..2cafea1edc24 100644
> --- a/drivers/acpi/acpi_processor.c
> +++ b/drivers/acpi/acpi_processor.c
> @@ -457,13 +457,12 @@ static int acpi_processor_add(struct acpi_device *device,
> return result;
> }
>
> -#ifdef CONFIG_ACPI_HOTPLUG_PRESENT_CPU
> /* Removal */
> -static void acpi_processor_post_eject(struct acpi_device *device)
> +static void acpi_processor_make_not_present(struct acpi_device *device)
> {
> struct acpi_processor *pr;
>
> - if (!device || !acpi_driver_data(device))
> + if (!IS_ENABLED(CONFIG_ACPI_HOTPLUG_PRESENT_CPU))
> return;
>
In order to use IS_ENABLED(),
> pr = acpi_driver_data(device);
> @@ -501,7 +500,29 @@ static void acpi_processor_post_eject(struct acpi_device *device)
> free_cpumask_var(pr->throttling.shared_cpu_map);
> kfree(pr);
> }
> -#endif /* CONFIG_ACPI_HOTPLUG_PRESENT_CPU */
> +
> +static void acpi_processor_post_eject(struct acpi_device *device)
> +{
> + struct acpi_processor *pr;
> + unsigned long long sta;
> + acpi_status status;
> +
> + if (!device)
> + return;
> +
> + pr = acpi_driver_data(device);
> + if (!pr || pr->id >= nr_cpu_ids || invalid_phys_cpuid(pr->phys_id))
> + return;
> +
Do we really need to validate the logical and hardware CPU IDs here? I think
the ACPI processor device can't be added successfully if either of them is
invalid.
> + status = acpi_evaluate_integer(pr->handle, "_STA", NULL, &sta);
> + if (ACPI_FAILURE(status))
> + return;
> +
> + if (cpu_present(pr->id) && !(sta & ACPI_STA_DEVICE_PRESENT)) {
> + acpi_processor_make_not_present(device);
> + return;
> + }
> +}
>
> #ifdef CONFIG_ARCH_MIGHT_HAVE_ACPI_PDC
> bool __init processor_physically_present(acpi_handle handle)
> @@ -626,9 +647,7 @@ static const struct acpi_device_id processor_device_ids[] = {
> static struct acpi_scan_handler processor_handler = {
> .ids = processor_device_ids,
> .attach = acpi_processor_add,
> -#ifdef CONFIG_ACPI_HOTPLUG_PRESENT_CPU
> .post_eject = acpi_processor_post_eject,
> -#endif
> .hotplug = {
> .enabled = true,
> },
> diff --git a/include/linux/cpu.h b/include/linux/cpu.h
> index a71691d7c2ca..e117c06e0c6b 100644
> --- a/include/linux/cpu.h
> +++ b/include/linux/cpu.h
> @@ -81,6 +81,7 @@ struct device *cpu_device_create(struct device *parent, void *drvdata,
> const struct attribute_group **groups,
> const char *fmt, ...);
> extern int arch_register_cpu(int cpu);
> +extern void arch_unregister_cpu(int cpu);
arch_unregister_cpu() is protected by CONFIG_HOTPLUG_CPU in the individual architectures,
for example arch/ia64/kernel/topology.c
> #ifdef CONFIG_HOTPLUG_CPU
> extern void unregister_cpu(struct cpu *cpu);
> extern ssize_t arch_cpu_probe(const char *, size_t);
Thanks,
Gavin
On 9/14/23 02:38, James Morse wrote:
> ACPI, irqchip and the architecture code all inspect the MADT
> enabled bit for a GICC entry in the MADT.
>
> The addition of an 'online capable' bit means all these sites need
> updating.
>
> Move the current checks behind a helper to make future updates easier.
>
> Signed-off-by: James Morse <[email protected]>
> ---
> arch/arm64/kernel/smp.c | 2 +-
> drivers/acpi/processor_core.c | 2 +-
> drivers/irqchip/irq-gic-v3.c | 10 ++++------
> include/linux/acpi.h | 5 +++++
> 4 files changed, 11 insertions(+), 8 deletions(-)
>
With Jonathan and Russell's comments addressed:
Reviewed-by: Gavin Shan <[email protected]>
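A sketch of why the helper pays off once the 'online capable' bit is taken into account: only one place has to change. The macro names, the struct and the bit position of the online-capable flag (bit 3 here) are assumptions for illustration, not taken from this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bit 0 is the Enabled flag tested by the patch. */
#define MADT_ENABLED_MODEL		(1u << 0)
/* Assumption: bit position of the 'online capable' flag. */
#define MADT_ONLINE_CAPABLE_MODEL	(1u << 3)

/* Hypothetical stand-in for struct acpi_madt_generic_interrupt. */
struct gicc_model {
	uint32_t flags;
};

/* One possible later form of the helper: usable if either bit is set. */
bool gicc_is_usable_model(const struct gicc_model *gicc)
{
	assert(gicc != NULL);
	return (gicc->flags &
		(MADT_ENABLED_MODEL | MADT_ONLINE_CAPABLE_MODEL)) != 0;
}
```

Every call site converted in this patch would pick up such a change without being touched again.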
> diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
> index 960b98b43506..8c8f55721786 100644
> --- a/arch/arm64/kernel/smp.c
> +++ b/arch/arm64/kernel/smp.c
> @@ -520,7 +520,7 @@ acpi_map_gic_cpu_interface(struct acpi_madt_generic_interrupt *processor)
> {
> u64 hwid = processor->arm_mpidr;
>
> - if (!(processor->flags & ACPI_MADT_ENABLED)) {
> + if (!acpi_gicc_is_usable(processor)) {
> pr_debug("skipping disabled CPU entry with 0x%llx MPIDR\n", hwid);
> return;
> }
> diff --git a/drivers/acpi/processor_core.c b/drivers/acpi/processor_core.c
> index 7dd6dbaa98c3..b203cfe28550 100644
> --- a/drivers/acpi/processor_core.c
> +++ b/drivers/acpi/processor_core.c
> @@ -90,7 +90,7 @@ static int map_gicc_mpidr(struct acpi_subtable_header *entry,
> struct acpi_madt_generic_interrupt *gicc =
> container_of(entry, struct acpi_madt_generic_interrupt, header);
>
> - if (!(gicc->flags & ACPI_MADT_ENABLED))
> + if (!acpi_gicc_is_usable(gicc))
> return -ENODEV;
>
> /* device_declaration means Device object in DSDT, in the
> diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
> index eedfa8e9f077..72d3cdebdad1 100644
> --- a/drivers/irqchip/irq-gic-v3.c
> +++ b/drivers/irqchip/irq-gic-v3.c
> @@ -2367,8 +2367,7 @@ gic_acpi_parse_madt_gicc(union acpi_subtable_headers *header,
> u32 size = reg == GIC_PIDR2_ARCH_GICv4 ? SZ_64K * 4 : SZ_64K * 2;
> void __iomem *redist_base;
>
> - /* GICC entry which has !ACPI_MADT_ENABLED is not unusable so skip */
> - if (!(gicc->flags & ACPI_MADT_ENABLED))
> + if (!acpi_gicc_is_usable(gicc))
> return 0;
>
> redist_base = ioremap(gicc->gicr_base_address, size);
> @@ -2418,7 +2417,7 @@ static int __init gic_acpi_match_gicc(union acpi_subtable_headers *header,
> * If GICC is enabled and has valid gicr base address, then it means
> * GICR base is presented via GICC
> */
> - if ((gicc->flags & ACPI_MADT_ENABLED) && gicc->gicr_base_address) {
> + if (acpi_gicc_is_usable(gicc) && gicc->gicr_base_address) {
> acpi_data.enabled_rdists++;
> return 0;
> }
> @@ -2427,7 +2426,7 @@ static int __init gic_acpi_match_gicc(union acpi_subtable_headers *header,
> * It's perfectly valid firmware can pass disabled GICC entry, driver
> * should not treat as errors, skip the entry instead of probe fail.
> */
> - if (!(gicc->flags & ACPI_MADT_ENABLED))
> + if (!acpi_gicc_is_usable(gicc))
> return 0;
>
> return -ENODEV;
> @@ -2486,8 +2485,7 @@ static int __init gic_acpi_parse_virt_madt_gicc(union acpi_subtable_headers *hea
> int maint_irq_mode;
> static int first_madt = true;
>
> - /* Skip unusable CPUs */
> - if (!(gicc->flags & ACPI_MADT_ENABLED))
> + if (!acpi_gicc_is_usable(gicc))
> return 0;
>
> maint_irq_mode = (gicc->flags & ACPI_MADT_VGIC_IRQ_MODE) ?
> diff --git a/include/linux/acpi.h b/include/linux/acpi.h
> index b7ab85857bb7..e3265a9eafae 100644
> --- a/include/linux/acpi.h
> +++ b/include/linux/acpi.h
> @@ -256,6 +256,11 @@ acpi_table_parse_cedt(enum acpi_cedt_type id,
> int acpi_parse_mcfg (struct acpi_table_header *header);
> void acpi_table_print_madt_entry (struct acpi_subtable_header *madt);
>
> +static inline bool acpi_gicc_is_usable(struct acpi_madt_generic_interrupt *gicc)
> +{
> + return (gicc->flags & ACPI_MADT_ENABLED);
> +}
> +
> /* the following numa functions are architecture-dependent */
> void acpi_numa_slit_init (struct acpi_table_slit *slit);
>
Thanks,
Gavin
On 9/14/23 02:38, James Morse wrote:
> Today the ACPI enumeration code 'visits' all devices that are present.
>
> This is a problem for arm64, where CPUs are always present, but not
> always enabled. When a device-check occurs because the firmware-policy
> has changed and a CPU is now enabled, the following error occurs:
> | acpi ACPI0007:48: Enumeration failure
>
> This is ultimately because acpi_dev_ready_for_enumeration() returns
> true for a device that is not enabled. The ACPI Processor driver
> will not register such CPUs as they are not 'decoding their resources'.
>
> Change acpi_dev_ready_for_enumeration() to also check the enabled bit.
> ACPI allows a device to be functional instead of maintaining the
> present and enabled bit. Make this behaviour an explicit check with
> a reference to the spec, and then check the present and enabled bits.
> This is needed to avoid enumerating present && functional devices that
> are not enabled.
>
> Signed-off-by: James Morse <[email protected]>
> ---
> If this change causes problems on deployed hardware, I suggest an
> arch opt-in: ACPI_IGNORE_STA_ENABLED, that causes
> acpi_dev_ready_for_enumeration() to only check the present bit.
> ---
> drivers/acpi/device_pm.c | 2 +-
> drivers/acpi/device_sysfs.c | 2 +-
> drivers/acpi/internal.h | 1 -
> drivers/acpi/property.c | 2 +-
> drivers/acpi/scan.c | 23 +++++++++++++----------
> 5 files changed, 16 insertions(+), 14 deletions(-)
>
> diff --git a/drivers/acpi/device_pm.c b/drivers/acpi/device_pm.c
> index f007116a8427..76c38478a502 100644
> --- a/drivers/acpi/device_pm.c
> +++ b/drivers/acpi/device_pm.c
> @@ -313,7 +313,7 @@ int acpi_bus_init_power(struct acpi_device *device)
> return -EINVAL;
>
> device->power.state = ACPI_STATE_UNKNOWN;
> - if (!acpi_device_is_present(device)) {
> + if (!acpi_dev_ready_for_enumeration(device)) {
> device->flags.initialized = false;
> return -ENXIO;
> }
> diff --git a/drivers/acpi/device_sysfs.c b/drivers/acpi/device_sysfs.c
> index b9bbf0746199..16e586d74aa2 100644
> --- a/drivers/acpi/device_sysfs.c
> +++ b/drivers/acpi/device_sysfs.c
> @@ -141,7 +141,7 @@ static int create_pnp_modalias(const struct acpi_device *acpi_dev, char *modalia
> struct acpi_hardware_id *id;
>
> /* Avoid unnecessarily loading modules for non present devices. */
> - if (!acpi_device_is_present(acpi_dev))
> + if (!acpi_dev_ready_for_enumeration(acpi_dev))
> return 0;
>
> /*
> diff --git a/drivers/acpi/internal.h b/drivers/acpi/internal.h
> index 866c7c4ed233..a1b45e345bcc 100644
> --- a/drivers/acpi/internal.h
> +++ b/drivers/acpi/internal.h
> @@ -107,7 +107,6 @@ int acpi_device_setup_files(struct acpi_device *dev);
> void acpi_device_remove_files(struct acpi_device *dev);
> void acpi_device_add_finalize(struct acpi_device *device);
> void acpi_free_pnp_ids(struct acpi_device_pnp *pnp);
> -bool acpi_device_is_present(const struct acpi_device *adev);
> bool acpi_device_is_battery(struct acpi_device *adev);
> bool acpi_device_is_first_physical_node(struct acpi_device *adev,
> const struct device *dev);
> diff --git a/drivers/acpi/property.c b/drivers/acpi/property.c
> index 413e4fcadcaf..e03f00b98701 100644
> --- a/drivers/acpi/property.c
> +++ b/drivers/acpi/property.c
> @@ -1418,7 +1418,7 @@ static bool acpi_fwnode_device_is_available(const struct fwnode_handle *fwnode)
> if (!is_acpi_device_node(fwnode))
> return false;
>
> - return acpi_device_is_present(to_acpi_device_node(fwnode));
> + return acpi_dev_ready_for_enumeration(to_acpi_device_node(fwnode));
> }
>
> static const void *
> diff --git a/drivers/acpi/scan.c b/drivers/acpi/scan.c
> index 17ab875a7d4e..f898591ce05f 100644
> --- a/drivers/acpi/scan.c
> +++ b/drivers/acpi/scan.c
> @@ -304,7 +304,7 @@ static int acpi_scan_device_check(struct acpi_device *adev)
> int error;
>
> acpi_bus_get_status(adev);
> - if (acpi_device_is_present(adev)) {
> + if (acpi_dev_ready_for_enumeration(adev)) {
> /*
> * This function is only called for device objects for which
> * matching scan handlers exist. The only situation in which
> @@ -338,7 +338,7 @@ static int acpi_scan_bus_check(struct acpi_device *adev, void *not_used)
> int error;
>
> acpi_bus_get_status(adev);
> - if (!acpi_device_is_present(adev)) {
> + if (!acpi_dev_ready_for_enumeration(adev)) {
> acpi_scan_device_not_enumerated(adev);
> return 0;
> }
> @@ -1908,11 +1908,6 @@ static bool acpi_device_should_be_hidden(acpi_handle handle)
> return true;
> }
>
> -bool acpi_device_is_present(const struct acpi_device *adev)
> -{
> - return adev->status.present || adev->status.functional;
> -}
> -
> static bool acpi_scan_handler_matching(struct acpi_scan_handler *handler,
> const char *idstr,
> const struct acpi_device_id **matchid)
> @@ -2375,16 +2370,24 @@ EXPORT_SYMBOL_GPL(acpi_dev_clear_dependencies);
> * acpi_dev_ready_for_enumeration - Check if the ACPI device is ready for enumeration
> * @device: Pointer to the &struct acpi_device to check
> *
> - * Check if the device is present and has no unmet dependencies.
> + * Check if the device is functional or enabled and has no unmet dependencies.
> *
> - * Return true if the device is ready for enumeratino. Otherwise, return false.
> + * Return true if the device is ready for enumeration. Otherwise, return false.
> */
> bool acpi_dev_ready_for_enumeration(const struct acpi_device *device)
> {
> if (device->flags.honor_deps && device->dep_unmet)
> return false;
>
> - return acpi_device_is_present(device);
> + /*
> + * ACPI 6.5's 6.3.7 "_STA (Device Status)" allows firmware to return
> + * (!present && functional) for certain types of devices that should be
> + * enumerated.
> + */
> + if (!device->status.present && !device->status.enabled)
> + return device->status.functional;
> +
> + return device->status.present && device->status.enabled;
> }
> EXPORT_SYMBOL_GPL(acpi_dev_ready_for_enumeration);
>
Looking at Salil's latest branch (vcpu-hotplug-RFCv2-rc7), there are 3 possible _STA values:
0x0 when the CPU isn't present
0xD when the CPU is present, but not enabled
0xF when the CPU is present and enabled
Previously, the ACPI device was enumerated on 0xD and 0xF. We want to avoid the enumeration
on 0xD since the processor isn't ready for enumeration in this specific case. The changed
check (device->status.present && device->status.enabled) ensures that. So the additional
check of device->status.functional seems irrelevant to ARM64 vCPU hot-add? I guess we
probably want it as a relaxation, now that the condition (device->status.present || device->status.functional)
becomes the stricter (device->status.present && device->status.enabled).
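To make the three cases concrete, here is a small user-space sketch comparing the old and new checks on raw _STA values. The function names are made up for illustration, and the honor_deps/dep_unmet test from the real function is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* _STA bits: bit 0 present, bit 1 enabled, bit 3 functioning. */
#define STA_PRESENT	0x1u
#define STA_ENABLED	0x2u
#define STA_FUNCTIONAL	0x8u

/* Models the removed acpi_device_is_present(): present || functional. */
bool old_check_model(unsigned int sta)
{
	return (sta & STA_PRESENT) || (sta & STA_FUNCTIONAL);
}

/* Models the new body of acpi_dev_ready_for_enumeration(). */
bool new_check_model(unsigned int sta)
{
	bool present = sta & STA_PRESENT;
	bool enabled = sta & STA_ENABLED;
	bool functional = sta & STA_FUNCTIONAL;

	if (!present && !enabled)
		return functional;

	return present && enabled;
}

/* The three values listed above. */
void check_reviewed_sta_values(void)
{
	assert(!old_check_model(0x0) && !new_check_model(0x0));
	assert(old_check_model(0xD) && !new_check_model(0xD));
	assert(old_check_model(0xF) && new_check_model(0xF));
}
```

Only 0xD changes behaviour. The functional branch matters for a value such as 0x8 (functional, but neither present nor enabled), which both checks accept.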
Thanks,
Gavin
On 9/14/23 02:38, James Morse wrote:
> Add arch_unregister_cpu() to allow the ACPI machinery to call
> unregister_cpu(). This is enough for arm64, riscv and loongarch, but
> needs to be overridden by x86 and ia64 who need to do more work.
>
> CC: Jean-Philippe Brucker <[email protected]>
> Signed-off-by: James Morse <[email protected]>
> ---
> Changes since v1:
> * Added CONFIG_HOTPLUG_CPU ifdeffery around unregister_cpu
> ---
> arch/ia64/include/asm/cpu.h | 4 ----
> arch/loongarch/include/asm/cpu.h | 6 ------
> arch/x86/include/asm/cpu.h | 1 -
> drivers/base/cpu.c | 9 ++++++++-
> 4 files changed, 8 insertions(+), 12 deletions(-)
>
I agree with Jonathan this patch needs to come early. Maybe move this
before the following one:
[RFC PATCH v2 19/35] ACPI: Move acpi_bus_trim_one() before acpi_scan_hot_remove()
> diff --git a/arch/ia64/include/asm/cpu.h b/arch/ia64/include/asm/cpu.h
> index a3e690e685e5..642d71675ddb 100644
> --- a/arch/ia64/include/asm/cpu.h
> +++ b/arch/ia64/include/asm/cpu.h
> @@ -15,8 +15,4 @@ DECLARE_PER_CPU(struct ia64_cpu, cpu_devices);
>
> DECLARE_PER_CPU(int, cpu_state);
>
> -#ifdef CONFIG_HOTPLUG_CPU
> -extern void arch_unregister_cpu(int);
> -#endif
> -
> #endif /* _ASM_IA64_CPU_H_ */
> diff --git a/arch/loongarch/include/asm/cpu.h b/arch/loongarch/include/asm/cpu.h
> index b8568e637420..48b9f7168bcc 100644
> --- a/arch/loongarch/include/asm/cpu.h
> +++ b/arch/loongarch/include/asm/cpu.h
> @@ -128,10 +128,4 @@ enum cpu_type_enum {
> #define LOONGARCH_CPU_HYPERVISOR BIT_ULL(CPU_FEATURE_HYPERVISOR)
> #define LOONGARCH_CPU_PTW BIT_ULL(CPU_FEATURE_PTW)
>
> -#if !defined(__ASSEMBLY__)
> -#ifdef CONFIG_HOTPLUG_CPU
> -void arch_unregister_cpu(int cpu);
> -#endif
> -#endif /* ! __ASSEMBLY__ */
> -
> #endif /* _ASM_CPU_H */
> diff --git a/arch/x86/include/asm/cpu.h b/arch/x86/include/asm/cpu.h
> index f349c94510e8..91867a6a9f8e 100644
> --- a/arch/x86/include/asm/cpu.h
> +++ b/arch/x86/include/asm/cpu.h
> @@ -24,7 +24,6 @@ static inline void prefill_possible_map(void) {}
> #endif /* CONFIG_SMP */
>
> #ifdef CONFIG_HOTPLUG_CPU
> -extern void arch_unregister_cpu(int);
> extern void soft_restart_cpu(void);
> #endif
>
> diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c
> index 677f963e02ce..c709747c4a18 100644
> --- a/drivers/base/cpu.c
> +++ b/drivers/base/cpu.c
> @@ -531,7 +531,14 @@ int __weak arch_register_cpu(int cpu)
> {
> return register_cpu(&per_cpu(cpu_devices, cpu), cpu);
> }
> -#endif
> +
> +#ifdef CONFIG_HOTPLUG_CPU
> +void __weak arch_unregister_cpu(int num)
> +{
> + unregister_cpu(&per_cpu(cpu_devices, num));
> +}
> +#endif /* CONFIG_HOTPLUG_CPU */
> +#endif /* CONFIG_GENERIC_CPU_DEVICES */
>
This seems to conflict with its declaration in include/linux/cpu.h. Besides,
isn't the function still needed by drivers/acpi/acpi_processor.c::acpi_processor_make_not_present()
even when both CONFIG_HOTPLUG_CPU and CONFIG_GENERIC_CPU_DEVICES are disabled?
> static void __init cpu_dev_register_generic(void)
> {
Thanks,
Gavin
On 9/14/23 02:38, James Morse wrote:
> gic_acpi_match_gicc() is only called via gic_acpi_count_gicr_regions().
> It should only count the number of enabled redistributors, but it
> also tries to sanity check the GICC entry, currently returning an
> error if the Enabled bit is set, but the gicr_base_address is zero.
>
> Adding support for the online-capable bit to the sanity check
> complicates it, for no benefit. The existing check implicitly
> depends on gic_acpi_count_gicr_regions() previously failing to find
> any GICR regions (as it is valid to have gicr_base_address of zero if
> the redistributors are described via a GICR entry).
>
> Instead of complicating the check, remove it. Failures that happen
> at this point cause the irqchip not to register, meaning no irqs
> can be requested. The kernel grinds to a panic() pretty quickly.
>
> Without the check, MADT tables that exhibit this problem are still
> caught by gic_populate_rdist(), which helpfully also prints what
> went wrong:
> | CPU4: mpidr 100 has no re-distributor!
>
> Signed-off-by: James Morse <[email protected]>
> ---
> drivers/irqchip/irq-gic-v3.c | 18 ++++++------------
> 1 file changed, 6 insertions(+), 12 deletions(-)
>
With the nits below resolved:
Reviewed-by: Gavin Shan <[email protected]>
> diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
> index 72d3cdebdad1..0f54811262eb 100644
> --- a/drivers/irqchip/irq-gic-v3.c
> +++ b/drivers/irqchip/irq-gic-v3.c
> @@ -2415,21 +2415,15 @@ static int __init gic_acpi_match_gicc(union acpi_subtable_headers *header,
>
> /*
> * If GICC is enabled and has valid gicr base address, then it means
> - * GICR base is presented via GICC
> + * GICR base is presented via GICC. The redistributor is only known to
> + * be accessible if the GICC is marked as enabled. If this bit is not
> + * set, we'd need to add the redistributor at runtime, which isn't
> + * supported.
> */
> - if (acpi_gicc_is_usable(gicc) && gicc->gicr_base_address) {
> + if (gicc->flags & ACPI_MADT_ENABLED && gicc->gicr_base_address)
> acpi_data.enabled_rdists++;
> - return 0;
> - }
>
if (acpi_gicc_is_usable(gicc) && gicc->gicr_base_address) {
> - /*
> - * It's perfectly valid firmware can pass disabled GICC entry, driver
> - * should not treat as errors, skip the entry instead of probe fail.
> - */
> - if (!acpi_gicc_is_usable(gicc))
> - return 0;
> -
> - return -ENODEV;
> + return 0;
> }
>
> static int __init gic_acpi_count_gicr_regions(void)
Thanks,
Gavin
On 9/14/23 02:38, James Morse wrote:
> struct acpi_scan_handler has a detach callback that is used to remove
> a driver when a bus is changed. When interacting with an eject-request,
> the detach callback is called before _EJ0.
>
> This means the ACPI processor driver can't use _STA to determine if a
> CPU has been made not-present, or some of the other _STA bits have been
> changed. acpi_processor_remove() needs to know the value of _STA after
> _EJ0 has been called.
>
It's helpful to mention which ACPI processor driver needs to use _STA
to determine the status here. I guess the ACPI processor driver will
behave differently depending on the status.
> Add a post_eject callback to struct acpi_scan_handler. This is called
> after acpi_scan_hot_remove() has successfully called _EJ0. Because
> acpi_bus_trim_one() also clears the handler pointer, it needs to be
> told if the caller will go on to call acpi_bus_post_eject(), so
> that acpi_device_clear_enumerated() and clearing the handler pointer
> can be deferred. The existing not-used pointer is used for this.
>
> Signed-off-by: James Morse <[email protected]>
> ---
> drivers/acpi/acpi_processor.c | 4 +--
> drivers/acpi/scan.c | 52 ++++++++++++++++++++++++++++++-----
> include/acpi/acpi_bus.h | 1 +
> 3 files changed, 48 insertions(+), 9 deletions(-)
>
Reviewed-by: Gavin Shan <[email protected]>
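The ordering this patch sets up can be sketched in user space as below. All `*_model` names are hypothetical; evaluating _EJ0 and re-reading _STA is reduced to assigning a caller-supplied value, and the child walk, locking and the _STA failure path are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define STA_ENABLED_MODEL 0x2u

struct dev_model;

/* Hypothetical stand-in for struct acpi_scan_handler. */
struct handler_model {
	void (*detach)(struct dev_model *dev);
	void (*post_eject)(struct dev_model *dev);
};

/* Hypothetical stand-in for struct acpi_device. */
struct dev_model {
	const struct handler_model *handler;
	unsigned int sta;		/* value _STA would report right now */
	unsigned int sta_seen;		/* value observed by post_eject */
	bool post_eject_ran;
};

/* Example post_eject callback: records the _STA value it can see. */
void record_sta_post_eject(struct dev_model *dev)
{
	dev->sta_seen = dev->sta;
	dev->post_eject_ran = true;
}

void trim_one_model(struct dev_model *dev, bool is_eject)
{
	if (dev->handler && dev->handler->detach)
		dev->handler->detach(dev);

	/* For an eject, dropping the handler is deferred to post-eject. */
	if (!is_eject)
		dev->handler = NULL;
}

void post_eject_model(struct dev_model *dev)
{
	if (dev->handler) {
		if (dev->handler->post_eject)
			dev->handler->post_eject(dev);
		dev->handler = NULL;
	}
}

void hot_remove_model(struct dev_model *dev, unsigned int sta_after_ej0)
{
	assert(dev != NULL);

	trim_one_model(dev, true);	/* detach runs before _EJ0 */
	dev->sta = sta_after_ej0;	/* stands in for _EJ0 updating _STA */

	if (dev->sta & STA_ENABLED_MODEL)
		return;			/* "Eject incomplete": no post_eject */

	post_eject_model(dev);		/* sees the post-_EJ0 status */
}
```

In this model the callback observes the status after the eject, which a detach callback cannot. Note that on the "Eject incomplete" path the handler pointer stays set in the model, since neither the trim nor the post-eject step clears it.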
> diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
> index 22a15a614f95..00dcc23d49a8 100644
> --- a/drivers/acpi/acpi_processor.c
> +++ b/drivers/acpi/acpi_processor.c
> @@ -459,7 +459,7 @@ static int acpi_processor_add(struct acpi_device *device,
>
> #ifdef CONFIG_ACPI_HOTPLUG_PRESENT_CPU
> /* Removal */
> -static void acpi_processor_remove(struct acpi_device *device)
> +static void acpi_processor_post_eject(struct acpi_device *device)
> {
> struct acpi_processor *pr;
>
> @@ -627,7 +627,7 @@ static struct acpi_scan_handler processor_handler = {
> .ids = processor_device_ids,
> .attach = acpi_processor_add,
> #ifdef CONFIG_ACPI_HOTPLUG_PRESENT_CPU
> - .detach = acpi_processor_remove,
> + .post_eject = acpi_processor_post_eject,
> #endif
> .hotplug = {
> .enabled = true,
> diff --git a/drivers/acpi/scan.c b/drivers/acpi/scan.c
> index a675333618ae..b6d2f01640a9 100644
> --- a/drivers/acpi/scan.c
> +++ b/drivers/acpi/scan.c
> @@ -244,18 +244,28 @@ static int acpi_scan_try_to_offline(struct acpi_device *device)
> return 0;
> }
>
> -static int acpi_bus_trim_one(struct acpi_device *adev, void *not_used)
> +/**
> + * acpi_bus_trim_one() - Detach scan handlers and drivers from ACPI device
> + * objects.
> + * @adev: Root of the ACPI namespace scope to walk.
> + * @eject: Pointer to a bool that indicates if this was due to an
> + * eject-request.
> + *
> + * Must be called under acpi_scan_lock.
> + * If @eject points to true, clearing the device enumeration is deferred until
> + * acpi_bus_post_eject() is called.
> + */
> +static int acpi_bus_trim_one(struct acpi_device *adev, void *eject)
> {
> struct acpi_scan_handler *handler = adev->handler;
> + bool is_eject = *(bool *)eject;
>
> - acpi_dev_for_each_child_reverse(adev, acpi_bus_trim_one, NULL);
> + acpi_dev_for_each_child_reverse(adev, acpi_bus_trim_one, eject);
>
> adev->flags.match_driver = false;
> if (handler) {
> if (handler->detach)
> handler->detach(adev);
> -
> - adev->handler = NULL;
> } else {
> device_release_driver(&adev->dev);
> }
> @@ -265,7 +275,12 @@ static int acpi_bus_trim_one(struct acpi_device *adev, void *not_used)
> */
> acpi_device_set_power(adev, ACPI_STATE_D3_COLD);
> adev->flags.initialized = false;
> - acpi_device_clear_enumerated(adev);
> +
> + /* For eject this is deferred to acpi_bus_post_eject() */
> + if (!is_eject) {
> + adev->handler = NULL;
> + acpi_device_clear_enumerated(adev);
> + }
>
> return 0;
> }
> @@ -278,15 +293,36 @@ static int acpi_bus_trim_one(struct acpi_device *adev, void *not_used)
> */
> void acpi_bus_trim(struct acpi_device *adev)
> {
> - acpi_bus_trim_one(adev, NULL);
> + bool eject = false;
> +
> + acpi_bus_trim_one(adev, &eject);
> }
> EXPORT_SYMBOL_GPL(acpi_bus_trim);
>
> +static int acpi_bus_post_eject(struct acpi_device *adev, void *not_used)
> +{
> + struct acpi_scan_handler *handler = adev->handler;
> +
> + acpi_dev_for_each_child_reverse(adev, acpi_bus_post_eject, NULL);
> +
> + if (handler) {
> + if (handler->post_eject)
> + handler->post_eject(adev);
> +
> + adev->handler = NULL;
> + }
> +
> + acpi_device_clear_enumerated(adev);
> +
> + return 0;
> +}
> +
> static int acpi_scan_hot_remove(struct acpi_device *device)
> {
> acpi_handle handle = device->handle;
> unsigned long long sta;
> acpi_status status;
> + bool eject = true;
>
> if (device->handler && device->handler->hotplug.demand_offline) {
> if (!acpi_scan_is_offline(device, true))
> @@ -299,7 +335,7 @@ static int acpi_scan_hot_remove(struct acpi_device *device)
>
> acpi_handle_debug(handle, "Ejecting\n");
>
> - acpi_bus_trim(device);
> + acpi_bus_trim_one(device, &eject);
>
> acpi_evaluate_lck(handle, 0);
> /*
> @@ -322,6 +358,8 @@ static int acpi_scan_hot_remove(struct acpi_device *device)
> } else if (sta & ACPI_STA_DEVICE_ENABLED) {
> acpi_handle_warn(handle,
> "Eject incomplete - status 0x%llx\n", sta);
> + } else {
> + acpi_bus_post_eject(device, NULL);
> }
>
> return 0;
> diff --git a/include/acpi/acpi_bus.h b/include/acpi/acpi_bus.h
> index 254685085c82..1b7e1acf925b 100644
> --- a/include/acpi/acpi_bus.h
> +++ b/include/acpi/acpi_bus.h
> @@ -127,6 +127,7 @@ struct acpi_scan_handler {
> bool (*match)(const char *idstr, const struct acpi_device_id **matchid);
> int (*attach)(struct acpi_device *dev, const struct acpi_device_id *id);
> void (*detach)(struct acpi_device *dev);
> + void (*post_eject)(struct acpi_device *dev);
> void (*bind)(struct device *phys_dev);
> void (*unbind)(struct device *phys_dev);
> struct acpi_hotplug_profile hotplug;
Thanks,
Gavin
On 9/14/23 02:38, James Morse wrote:
> ACPI identifies CPUs by UID. get_cpu_for_acpi_id() maps the ACPI UID
> to the linux CPU number.
>
> The helper to retrieve this mapping is only available in arm64's numa
> code.
>
> Move it to live next to get_acpi_id_for_cpu().
>
> Signed-off-by: James Morse <[email protected]>
> ---
> arch/arm64/include/asm/acpi.h | 11 +++++++++++
> arch/arm64/kernel/acpi_numa.c | 11 -----------
> 2 files changed, 11 insertions(+), 11 deletions(-)
>
Reviewed-by: Gavin Shan <[email protected]>
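For readers skimming the thread, the helper being moved is a plain linear reverse lookup. A user-space sketch, with a caller-supplied UID table standing in for get_acpi_id_for_cpu() (the function name and parameters here are made up):

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Returns the index ("logical CPU number") whose ACPI UID matches @uid,
 * or -EINVAL when no entry matches.
 */
int cpu_for_acpi_uid_model(const uint32_t *uids, size_t nr_cpus, uint32_t uid)
{
	assert(uids != NULL || nr_cpus == 0);

	for (size_t cpu = 0; cpu < nr_cpus; cpu++)
		if (uids[cpu] == uid)
			return (int)cpu;

	return -EINVAL;
}
```

The cost is linear in the number of possible CPUs per lookup, so callers on a hot path would presumably want to cache the result.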
> diff --git a/arch/arm64/include/asm/acpi.h b/arch/arm64/include/asm/acpi.h
> index 4d537d56eb84..ce5045038e87 100644
> --- a/arch/arm64/include/asm/acpi.h
> +++ b/arch/arm64/include/asm/acpi.h
> @@ -100,6 +100,17 @@ static inline u32 get_acpi_id_for_cpu(unsigned int cpu)
> return acpi_cpu_get_madt_gicc(cpu)->uid;
> }
>
> +static inline int get_cpu_for_acpi_id(u32 uid)
> +{
> + int cpu;
> +
> + for (cpu = 0; cpu < nr_cpu_ids; cpu++)
> + if (uid == get_acpi_id_for_cpu(cpu))
> + return cpu;
> +
> + return -EINVAL;
> +}
> +
> static inline void arch_fix_phys_package_id(int num, u32 slot) { }
> void __init acpi_init_cpus(void);
> int apei_claim_sea(struct pt_regs *regs);
> diff --git a/arch/arm64/kernel/acpi_numa.c b/arch/arm64/kernel/acpi_numa.c
> index e51535a5f939..0c036a9a3c33 100644
> --- a/arch/arm64/kernel/acpi_numa.c
> +++ b/arch/arm64/kernel/acpi_numa.c
> @@ -34,17 +34,6 @@ int __init acpi_numa_get_nid(unsigned int cpu)
> return acpi_early_node_map[cpu];
> }
>
> -static inline int get_cpu_for_acpi_id(u32 uid)
> -{
> - int cpu;
> -
> - for (cpu = 0; cpu < nr_cpu_ids; cpu++)
> - if (uid == get_acpi_id_for_cpu(cpu))
> - return cpu;
> -
> - return -EINVAL;
> -}
> -
> static int __init acpi_parse_gicc_pxm(union acpi_subtable_headers *header,
> const unsigned long end)
> {
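For readers following along, the lookup being moved is a plain linear search over the logical CPU numbers. A minimal userspace sketch, assuming a hypothetical fixed table of UIDs in place of the MADT GICC entries (every `demo_` name is illustrative, not kernel API):

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the UIDs found in the per-CPU MADT GICC entries. */
static const uint32_t demo_uid_of_cpu[] = { 0x10, 0x11, 0x20, 0x21 };
#define DEMO_NR_CPU_IDS (sizeof(demo_uid_of_cpu) / sizeof(demo_uid_of_cpu[0]))

/* Logical CPU number -> ACPI UID (models get_acpi_id_for_cpu()). */
static uint32_t demo_get_acpi_id_for_cpu(unsigned int cpu)
{
	assert(cpu < DEMO_NR_CPU_IDS);
	return demo_uid_of_cpu[cpu];
}

/* ACPI UID -> logical CPU number, or -EINVAL if no CPU has that UID. */
static int demo_get_cpu_for_acpi_id(uint32_t uid)
{
	unsigned int cpu;

	for (cpu = 0; cpu < DEMO_NR_CPU_IDS; cpu++)
		if (uid == demo_get_acpi_id_for_cpu(cpu))
			return (int)cpu;

	return -EINVAL;
}
```

Calling `demo_get_cpu_for_acpi_id(0x20)` walks the table until it reaches index 2; an unknown UID falls through to `-EINVAL`, so callers have to check for a negative return before using the result as a CPU number.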
On 9/14/23 02:38, James Morse wrote:
> ACPI firmware can trigger the events to add and remove CPUs, but the
> OS may not support this.
>
> Print a warning when this happens.
          ^^^^^^^
          error message
>
> This gives early warning on arm64 systems that don't support
> CONFIG_ACPI_HOTPLUG_PRESENT_CPU, as making CPUs not present has
> side effects for other parts of the system.
>
> Signed-off-by: James Morse <[email protected]>
> ---
> drivers/acpi/acpi_processor.c | 8 ++++++--
> 1 file changed, 6 insertions(+), 2 deletions(-)
>
Maybe a warning message is enough, but an error message
is also fine, I think.
Reviewed-by: Gavin Shan <[email protected]>
> diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
> index 2cafea1edc24..b67616079751 100644
> --- a/drivers/acpi/acpi_processor.c
> +++ b/drivers/acpi/acpi_processor.c
> @@ -188,8 +188,10 @@ static int acpi_processor_make_present(struct acpi_processor *pr)
> acpi_status status;
> int ret;
>
> - if (!IS_ENABLED(CONFIG_ACPI_HOTPLUG_PRESENT_CPU))
> + if (!IS_ENABLED(CONFIG_ACPI_HOTPLUG_PRESENT_CPU)) {
> + pr_err_once("Changing CPU present bit is not supported\n");
> return -ENODEV;
> + }
>
> if (invalid_phys_cpuid(pr->phys_id))
> return -ENODEV;
> @@ -462,8 +464,10 @@ static void acpi_processor_make_not_present(struct acpi_device *device)
> {
> struct acpi_processor *pr;
>
> - if (!IS_ENABLED(CONFIG_ACPI_HOTPLUG_PRESENT_CPU))
> + if (!IS_ENABLED(CONFIG_ACPI_HOTPLUG_PRESENT_CPU)) {
> + pr_err_once("Changing CPU present bit is not supported");
> return;
> + }
>
> pr = acpi_driver_data(device);
> if (pr->id >= nr_cpu_ids)
Thanks,
Gavin
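As an aside, the "print once, then bail out" pattern in the quoted hunks can be modelled outside the kernel. This is only a sketch: `demo_err_once()` is a hypothetical stand-in for pr_err_once(), and `DEMO_HOTPLUG_PRESENT_CPU` stands in for IS_ENABLED(CONFIG_ACPI_HOTPLUG_PRESENT_CPU). The message here carries a trailing newline, which the second hunk in the quoted patch omits.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

/* Counts lines actually emitted, so the "once" behaviour can be observed. */
static int demo_lines_printed;

/* Each expansion gets its own static flag, so each call site prints at most once. */
#define demo_err_once(msg) \
	do { \
		static bool demo_printed; \
		if (!demo_printed) { \
			demo_printed = true; \
			fputs(msg, stderr); \
			demo_lines_printed++; \
		} \
	} while (0)

/* Hypothetical stand-in for IS_ENABLED(CONFIG_ACPI_HOTPLUG_PRESENT_CPU). */
#define DEMO_HOTPLUG_PRESENT_CPU 0

/* Models the early exit: complain once, fail every time. */
static int demo_make_present(void)
{
	if (!DEMO_HOTPLUG_PRESENT_CPU) {
		demo_err_once("Changing CPU present bit is not supported\n");
		return -ENODEV;
	}

	return 0;
}
```

However many times firmware triggers the event, the caller sees -ENODEV each time while the log gets a single line.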
On 9/15/23 02:01, Jonathan Cameron wrote:
> On Wed, 13 Sep 2023 16:38:19 +0000
> James Morse <[email protected]> wrote:
>
>> From: Jean-Philippe Brucker <[email protected]>
>>
>> When a CPU is marked as disabled, but online capable in the MADT, PSCI
>> applies some firmware policy to control when it can be brought online.
>> PSCI returns DENIED to a CPU_ON request if this is not currently
>> permitted. The OS can learn the current policy from the _STA enabled bit.
>>
>> Handle the PSCI DENIED return code gracefully instead of printing an
>> error.
>
> Specification reference would be good particularly as it's only been
> added as a possibility fairly recently.
>
https://developer.arm.com/documentation/den0022/f/?lang=en page-58
It seems DENIED is the best-matching return code.
>>
>> Signed-off-by: Jean-Philippe Brucker <[email protected]>
>> [ morse: Rewrote commit message ]
>> Signed-off-by: James Morse <[email protected]>
>> ---
>> arch/arm64/kernel/psci.c | 2 +-
>> arch/arm64/kernel/smp.c | 3 ++-
>> drivers/firmware/psci/psci.c | 2 ++
>> 3 files changed, 5 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/arm64/kernel/psci.c b/arch/arm64/kernel/psci.c
>> index 29a8e444db83..4fcc0cdd757b 100644
>> --- a/arch/arm64/kernel/psci.c
>> +++ b/arch/arm64/kernel/psci.c
>> @@ -40,7 +40,7 @@ static int cpu_psci_cpu_boot(unsigned int cpu)
>> {
>> phys_addr_t pa_secondary_entry = __pa_symbol(secondary_entry);
>> int err = psci_ops.cpu_on(cpu_logical_map(cpu), pa_secondary_entry);
>> - if (err)
>> + if (err && err != -EPROBE_DEFER)
>
> Hmm. EPROBE_DEFER has very specific meaning around driver requesting a retry
> when some other bit of the system has finished booting.
> I'm not sure it's a good idea for this use case. Maybe just keep to EPERM
> as psci_to_linux_errno() will return anyway. Seems valid to me, or
> is the requirement to use EPROBE_DEFER coming from further up the stack?
>
I agree with Jonathan that -EPERM from psci_to_linux_errno(DENIED) is
good enough here. Actually, I think what we need is to bail out of
bringing up the CPU once psci_ops.cpu_on() returns an error, without
reporting it as an error, which is what -EPROBE_DEFER is used for here.
-EPERM can serve the same purpose.
>
>
>> pr_err("failed to boot CPU%d (%d)\n", cpu, err);
>>
>> return err;
>> diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
>> index 8c8f55721786..e958db987665 100644
>> --- a/arch/arm64/kernel/smp.c
>> +++ b/arch/arm64/kernel/smp.c
>> @@ -124,7 +124,8 @@ int __cpu_up(unsigned int cpu, struct task_struct *idle)
>> /* Now bring the CPU into our world */
>> ret = boot_secondary(cpu, idle);
>> if (ret) {
>> - pr_err("CPU%u: failed to boot: %d\n", cpu, ret);
>> + if (ret != -EPROBE_DEFER)
>> + pr_err("CPU%u: failed to boot: %d\n", cpu, ret);
>> return ret;
>> }
>>
>> diff --git a/drivers/firmware/psci/psci.c b/drivers/firmware/psci/psci.c
>> index d9629ff87861..f7ab3fed3528 100644
>> --- a/drivers/firmware/psci/psci.c
>> +++ b/drivers/firmware/psci/psci.c
>> @@ -218,6 +218,8 @@ static int __psci_cpu_on(u32 fn, unsigned long cpuid, unsigned long entry_point)
>> int err;
>>
>> err = invoke_psci_fn(fn, cpuid, entry_point, 0);
>> + if (err == PSCI_RET_DENIED)
>> + return -EPROBE_DEFER;
>> return psci_to_linux_errno(err);
>> }
>>
Thanks,
Gavin
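To make the -EPERM suggestion concrete, here is a rough userspace sketch of that alternative. It is not the posted patch: the return-code values are written from my reading of the PSCI specification, the errno mapping follows what the thread says psci_to_linux_errno() returns for DENIED, and all `demo_` names are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* PSCI return codes, as I understand them from the PSCI specification. */
#define DEMO_PSCI_RET_SUCCESS		0
#define DEMO_PSCI_RET_NOT_SUPPORTED	(-1)
#define DEMO_PSCI_RET_INVALID_PARAMS	(-2)
#define DEMO_PSCI_RET_DENIED		(-3)

/* Models psci_to_linux_errno(): DENIED becomes -EPERM, no special case needed. */
static int demo_psci_to_errno(int psci_ret)
{
	switch (psci_ret) {
	case DEMO_PSCI_RET_SUCCESS:
		return 0;
	case DEMO_PSCI_RET_NOT_SUPPORTED:
		return -EOPNOTSUPP;
	case DEMO_PSCI_RET_DENIED:
		return -EPERM;
	default:
		return -EINVAL;
	}
}

/* Counts error lines, so the quiet path can be checked. */
static int demo_boot_failures_logged;

/* Hypothetical caller: the firmware's answer is passed in rather than invoked. */
static int demo_cpu_boot(unsigned int cpu, int psci_ret)
{
	int err = demo_psci_to_errno(psci_ret);

	/* DENIED is firmware policy, not a fault, so stay quiet about it. */
	if (err && err != -EPERM) {
		fprintf(stderr, "failed to boot CPU%u (%d)\n", cpu, err);
		demo_boot_failures_logged++;
	}

	return err;
}
```

The caller still gets a non-zero error and stops bringing the CPU up; only the log message is suppressed for the policy case.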
On 9/19/23 13:39, Gavin Shan wrote:
>
> On 9/14/23 02:38, James Morse wrote:
>> gic_acpi_match_gicc() is only called via gic_acpi_count_gicr_regions().
>> It should only count the number of enabled redistributors, but it
>> also tries to sanity check the GICC entry, currently returning an
>> error if the Enabled bit is set, but the gicr_base_address is zero.
>>
>> Adding support for the online-capable bit to the sanity check
>> complicates it, for no benefit. The existing check implicitly
>> depends on gic_acpi_count_gicr_regions() previously failing to find
>> any GICR regions (as it is valid to have gicr_base_address of zero if
>> the redistributors are described via a GICR entry).
>>
>> Instead of complicating the check, remove it. Failures that happen
>> at this point cause the irqchip not to register, meaning no irqs
>> can be requested. The kernel grinds to a panic() pretty quickly.
>>
>> Without the check, MADT tables that exhibit this problem are still
>> caught by gic_populate_rdist(), which helpfully also prints what
>> went wrong:
>> | CPU4: mpidr 100 has no re-distributor!
>>
>> Signed-off-by: James Morse <[email protected]>
>> ---
>> drivers/irqchip/irq-gic-v3.c | 18 ++++++------------
>> 1 file changed, 6 insertions(+), 12 deletions(-)
>>
>
> With below nits resolved:
>
> Reviewed-by: Gavin Shan <[email protected]>
>
>> diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
>> index 72d3cdebdad1..0f54811262eb 100644
>> --- a/drivers/irqchip/irq-gic-v3.c
>> +++ b/drivers/irqchip/irq-gic-v3.c
>> @@ -2415,21 +2415,15 @@ static int __init gic_acpi_match_gicc(union acpi_subtable_headers *header,
>> /*
>> * If GICC is enabled and has valid gicr base address, then it means
>> - * GICR base is presented via GICC
>> + * GICR base is presented via GICC. The redistributor is only known to
>> + * be accessible if the GICC is marked as enabled. If this bit is not
>> + * set, we'd need to add the redistributor at runtime, which isn't
>> + * supported.
>> */
>> - if (acpi_gicc_is_usable(gicc) && gicc->gicr_base_address) {
>> + if (gicc->flags & ACPI_MADT_ENABLED && gicc->gicr_base_address)
>> acpi_data.enabled_rdists++;
>> - return 0;
>> - }
>
> if (acpi_gicc_is_usable(gicc) && gicc->gicr_base_address) {
>
Please ignore this: acpi_gicc_is_usable() is changed to also cover
the ACPI_MADT_GICC_CPU_CAPABLE bit in the next patch, so the explicit
"(gicc->flags & ACPI_MADT_ENABLED)" check is needed here.
>
>> - /*
>> - * It's perfectly valid firmware can pass disabled GICC entry, driver
>> - * should not treat as errors, skip the entry instead of probe fail.
>> - */
>> - if (!acpi_gicc_is_usable(gicc))
>> - return 0;
>> -
>> - return -ENODEV;
>> + return 0;
>> }
>> static int __init gic_acpi_count_gicr_regions(void)
Thanks,
Gavin
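For illustration, the simplified counting behaviour can be sketched in userspace as below. This assumes a cut-down, hypothetical `struct demo_gicc` with only the two fields the check reads; the enabled flag is bit 0 of the MADT GICC flags.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MADT_ENABLED	(1u << 0)

/* Hypothetical cut-down view of a MADT GICC entry. */
struct demo_gicc {
	uint32_t flags;
	uint64_t gicr_base_address;
};

/*
 * Count redistributors described via GICC entries. Only entries that are
 * enabled and carry a non-zero GICR base are counted; everything else is
 * skipped without being treated as an error.
 */
static unsigned int demo_count_enabled_rdists(const struct demo_gicc *gicc,
					      size_t nr)
{
	unsigned int count = 0;
	size_t i;

	for (i = 0; i < nr; i++)
		if ((gicc[i].flags & DEMO_MADT_ENABLED) &&
		    gicc[i].gicr_base_address)
			count++;

	return count;
}
```

An entry that is enabled but has a zero base address is no longer rejected here; per the commit message, that situation is left for gic_populate_rdist() to report.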
On Tue, Sep 19, 2023 at 02:46:22PM +1000, Gavin Shan wrote:
> The message needs to be split up into multiple lines to make ./scripts/checkpatch.pl
> happy:
>
> pr_err_once(FW_BUG "CPU %u is online, but described "
> "as not present or disabled!\n", pr->id);
No. checkpatch is wrong on this point. Splitting the message like this
hurts debuggability, because the message can no longer be grepped for.
What James has done is perfectly fine.
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
On Thu, Sep 14, 2023 at 05:13:41PM +0100, Jonathan Cameron wrote:
> On Wed, 13 Sep 2023 16:38:20 +0000
> James Morse <[email protected]> wrote:
> > +static int acpi_processor_make_enabled(struct acpi_processor *pr)
> > +{
> > + unsigned long long sta;
> > + acpi_status status;
> > + bool present, enabled;
> > +
> > + if (!acpi_has_method(pr->handle, "_STA"))
> > + return arch_register_cpu(pr->id);
> > +
> > + status = acpi_evaluate_integer(pr->handle, "_STA", NULL, &sta);
> > + if (ACPI_FAILURE(status))
> > + return -ENODEV;
> > +
> > + present = sta & ACPI_STA_DEVICE_PRESENT;
> > + enabled = sta & ACPI_STA_DEVICE_ENABLED;
> > +
> > + if (cpu_online(pr->id) && (!present || !enabled)) {
> > + pr_err_once(FW_BUG "CPU %u is online, but described as not present or disabled!\n", pr->id);
>
> Why once? If this for some reason happened on multiple CPUs I think we'd want to know.
>
> > + add_taint(TAINT_FIRMWARE_WORKAROUND, LOCKDEP_STILL_OK);
> > + } else if (!present || !enabled) {
> > + return -ENODEV;
> > + }
>
> I guess you didn't do a nested if here to avoid even longer lines.
> Could flip things around though I don't like this much either as it makes
> the normal good path exit mid way down.
>
> if (present && enabled)
> return arch_register_cpu(pr->id);
>
> if (!cpu_online(pr->id))
> return -ENODEV;
>
> pr_err...
> add_taint(...
>
> return arch_register_cpu(pr->id);
>
> Ah well. Some code just has to be less than pretty.
How about:
static int acpi_processor_should_register_cpu(struct acpi_processor *pr)
{
unsigned long long sta;
acpi_status status;
if (!acpi_has_method(pr->handle, "_STA"))
return 0;
status = acpi_evaluate_integer(pr->handle, "_STA", NULL, &sta);
if (ACPI_FAILURE(status))
return -ENODEV;
if (sta & ACPI_STA_DEVICE_PRESENT && sta & ACPI_STA_DEVICE_ENABLED)
return 0;
if (cpu_online(pr->id)) {
pr_err_once(FW_BUG
"CPU %u is online, but described as not present or disabled!\n",
pr->id);
/* Register the CPU anyway */
return 0;
}
return -ENODEV;
}
static int acpi_processor_make_enabled(struct acpi_processor *pr)
{
int ret = acpi_processor_should_register_cpu(pr);
if (ret)
return ret;
return arch_register_cpu(pr->id);
}
I would keep the "CPU online but !present or !enabled" case as a
sub-block because it reads better, and instead break the message as
I've indicated above.
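The decision logic of the proposed helper can be exercised outside the kernel with a small model. A sketch only: the _STA value and the online state are passed in as plain arguments instead of being read from ACPI, and the `demo_` names are hypothetical. The bit positions are the present (bit 0) and enabled (bit 1) bits of _STA.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

#define DEMO_STA_DEVICE_PRESENT	0x1ULL
#define DEMO_STA_DEVICE_ENABLED	0x2ULL

/* Returns 0 if the CPU should be registered, -ENODEV if it should be skipped. */
static int demo_should_register_cpu(bool has_sta, unsigned long long sta,
				    bool online, unsigned int cpu)
{
	/* No _STA method: the CPU must always be registered. */
	if (!has_sta)
		return 0;

	if ((sta & DEMO_STA_DEVICE_PRESENT) && (sta & DEMO_STA_DEVICE_ENABLED))
		return 0;

	if (online) {
		fprintf(stderr,
			"CPU %u is online, but described as not present or disabled!\n",
			cpu);
		/* Register the CPU anyway */
		return 0;
	}

	return -ENODEV;
}
```

Only the combination "has _STA, not both present and enabled, and offline" skips registration; an online CPU is registered regardless, with a complaint about the firmware description.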
On 9/14/23 18:10, Russell King (Oracle) wrote:
> On Wed, Sep 13, 2023 at 04:38:18PM +0000, James Morse wrote:
>> static inline bool acpi_gicc_is_usable(struct acpi_madt_generic_interrupt *gicc)
>> {
>> - return (gicc->flags & ACPI_MADT_ENABLED);
>> + return ((gicc->flags & ACPI_MADT_ENABLED ||
>> + gicc->flags & ACPI_MADT_GICC_CPU_CAPABLE));
>
> ... and this starts getting silly with the number of parens.
>
> return gicc->flags & ACPI_MADT_ENABLED ||
> gicc->flags & ACPI_MADT_GICC_CPU_CAPABLE;
>
> is entirely sufficient. Also:
>
> return gicc->flags & (ACPI_MADT_ENABLED | ACPI_MADT_GICC_CPU_CAPABLE);
>
> also works.
>
I vote for the second one, which is: gicc->flags & (ACPI_MADT_ENABLED | ACPI_MADT_GICC_CPU_CAPABLE)
Thanks,
Gavin
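A quick way to convince oneself that the two spellings agree is to compare them over every flag combination. A sketch, assuming the online-capable flag is bit 3 (an assumption on my part; the thread does not show the value of ACPI_MADT_GICC_CPU_CAPABLE):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEMO_MADT_ENABLED		(1u << 0)
#define DEMO_MADT_GICC_CPU_CAPABLE	(1u << 3)	/* assumed bit position */

/* First suggested form: two separate bit tests joined with ||. */
static bool demo_usable_two_tests(uint32_t flags)
{
	return flags & DEMO_MADT_ENABLED ||
	       flags & DEMO_MADT_GICC_CPU_CAPABLE;
}

/* Second suggested form: one test against the combined mask. */
static bool demo_usable_one_mask(uint32_t flags)
{
	return flags & (DEMO_MADT_ENABLED | DEMO_MADT_GICC_CPU_CAPABLE);
}
```

The single-mask form depends on the `bool` return type: any non-zero result of the `&` converts to `true`. With an `int` return type the two forms would still agree on truthiness, but not necessarily on the value returned.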
On 9/14/23 02:38, James Morse wrote:
> acpi_processor_get_info() registers all present CPUs. Registering a
> CPU is what creates the sysfs entries and triggers the udev
> notifications.
>
> arm64 virtual machines that support 'virtual cpu hotplug' use the
> enabled bit to indicate whether the CPU can be brought online, as
> the existing ACPI tables require all hardware to be described and
> present.
>
> If firmware describes a CPU as present, but disabled, skip the
> registration. Such CPUs are present, but can't be brought online for
> whatever reason. (e.g. firmware/hypervisor policy).
>
> Once firmware sets the enabled bit, the CPU can be registered and
> brought online by user-space. Online CPUs, or CPUs that are missing
> an _STA method must always be registered.
>
> Signed-off-by: James Morse <[email protected]>
> ---
> drivers/acpi/acpi_processor.c | 31 ++++++++++++++++++++++++++++++-
> 1 file changed, 30 insertions(+), 1 deletion(-)
>
With below nits addressed:
Reviewed-by: Gavin Shan <[email protected]>
> diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
> index b67616079751..b49859eab01a 100644
> --- a/drivers/acpi/acpi_processor.c
> +++ b/drivers/acpi/acpi_processor.c
> @@ -227,6 +227,32 @@ static int acpi_processor_make_present(struct acpi_processor *pr)
> return ret;
> }
>
> +static int acpi_processor_make_enabled(struct acpi_processor *pr)
> +{
> + unsigned long long sta;
> + acpi_status status;
> + bool present, enabled;
> +
> + if (!acpi_has_method(pr->handle, "_STA"))
> + return arch_register_cpu(pr->id);
> +
> + status = acpi_evaluate_integer(pr->handle, "_STA", NULL, &sta);
> + if (ACPI_FAILURE(status))
> + return -ENODEV;
> +
> + present = sta & ACPI_STA_DEVICE_PRESENT;
> + enabled = sta & ACPI_STA_DEVICE_ENABLED;
> +
> + if (cpu_online(pr->id) && (!present || !enabled)) {
> + pr_err_once(FW_BUG "CPU %u is online, but described as not present or disabled!\n", pr->id);
> + add_taint(TAINT_FIRMWARE_WORKAROUND, LOCKDEP_STILL_OK);
> + } else if (!present || !enabled) {
> + return -ENODEV;
> + }
> +
> + return arch_register_cpu(pr->id);
> +}
> +
The message needs to be split up into multiple lines to make ./scripts/checkpatch.pl
happy:
pr_err_once(FW_BUG "CPU %u is online, but described "
"as not present or disabled!\n", pr->id);
> static int acpi_processor_get_info(struct acpi_device *device)
> {
> union acpi_object object = { 0 };
> @@ -318,7 +344,7 @@ static int acpi_processor_get_info(struct acpi_device *device)
> */
> if (!invalid_logical_cpuid(pr->id) && cpu_present(pr->id) &&
> !get_cpu_device(pr->id)) {
> - int ret = arch_register_cpu(pr->id);
> + int ret = acpi_processor_make_enabled(pr);
>
> if (ret)
> return ret;
> @@ -526,6 +552,9 @@ static void acpi_processor_post_eject(struct acpi_device *device)
> acpi_processor_make_not_present(device);
> return;
> }
> +
> + if (cpu_present(pr->id) && !(sta & ACPI_STA_DEVICE_ENABLED))
> + arch_unregister_cpu(pr->id);
> }
>
> #ifdef CONFIG_ARCH_MIGHT_HAVE_ACPI_PDC
Thanks,
Gavin
On 9/15/23 00:17, Jonathan Cameron wrote:
> On Wed, 13 Sep 2023 16:38:08 +0000
> James Morse <[email protected]> wrote:
>
>> acpi_processor_hotadd_init() will make a CPU present by mapping it
>> based on its hardware id.
>>
>> 'hotadd_init' is ambiguous once there are two different behaviours
>> for cpu hotplug. This is for toggling the _STA present bit. Subsequent
>> patches will add support for toggling the _STA enabled bit, named
>> acpi_processor_make_enabled().
>>
>> Rename it acpi_processor_make_present() to make it clear this is
>> for CPUs that were not previously present.
>>
>> Expose the function prototypes it uses to allow the preprocessor
>> guards to be removed. The IS_ENABLED() check will let the compiler
>> dead-code elimination pass remove this if it isn't going to be
>> used.
>>
>> Signed-off-by: James Morse <[email protected]>
>> ---
>> drivers/acpi/acpi_processor.c | 14 +++++---------
>> include/linux/acpi.h | 2 --
>> 2 files changed, 5 insertions(+), 11 deletions(-)
>>
>> diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
>> index 75257fae10e7..22a15a614f95 100644
>> --- a/drivers/acpi/acpi_processor.c
>> +++ b/drivers/acpi/acpi_processor.c
>> @@ -182,13 +182,15 @@ static void __init acpi_pcc_cpufreq_init(void) {}
>> #endif /* CONFIG_X86 */
>>
>> /* Initialization */
>> -#ifdef CONFIG_ACPI_HOTPLUG_PRESENT_CPU
>> -static int acpi_processor_hotadd_init(struct acpi_processor *pr)
>> +static int acpi_processor_make_present(struct acpi_processor *pr)
>> {
>> unsigned long long sta;
>> acpi_status status;
>> int ret;
>>
>> + if (!IS_ENABLED(CONFIG_ACPI_HOTPLUG_PRESENT_CPU))
>> + return -ENODEV;
>> +
>> if (invalid_phys_cpuid(pr->phys_id))
>> return -ENODEV;
>>
>> @@ -222,12 +224,6 @@ static int acpi_processor_hotadd_init(struct acpi_processor *pr)
>> cpu_maps_update_done();
>> return ret;
>> }
>> -#else
>> -static inline int acpi_processor_hotadd_init(struct acpi_processor *pr)
>> -{
>> - return -ENODEV;
>> -}
>> -#endif /* CONFIG_ACPI_HOTPLUG_PRESENT_CPU */
>>
>> static int acpi_processor_get_info(struct acpi_device *device)
>> {
>> @@ -335,7 +331,7 @@ static int acpi_processor_get_info(struct acpi_device *device)
>> * because cpuid <-> apicid mapping is persistent now.
>> */
>> if (invalid_logical_cpuid(pr->id) || !cpu_present(pr->id)) {
>> - int ret = acpi_processor_hotadd_init(pr);
>> + int ret = acpi_processor_make_present(pr);
>>
>> if (ret)
>> return ret;
>> diff --git a/include/linux/acpi.h b/include/linux/acpi.h
>> index 651dd43976a9..b7ab85857bb7 100644
>> --- a/include/linux/acpi.h
>> +++ b/include/linux/acpi.h
>> @@ -316,12 +316,10 @@ static inline int acpi_processor_evaluate_cst(acpi_handle handle, u32 cpu,
>> }
>> #endif
>>
>> -#ifdef CONFIG_ACPI_HOTPLUG_PRESENT_CPU
>> /* Arch dependent functions for cpu hotplug support */
>> int acpi_map_cpu(acpi_handle handle, phys_cpuid_t physid, u32 acpi_id,
>> int *pcpu);
>> int acpi_unmap_cpu(int cpu);
>
> I've lost track somewhat but I think the definitions of these are still under ifdefs
> which is messy if nothing else and might cause build issues.
>
Yup, it's not safe to use 'if (!IS_ENABLED(CONFIG_ACPI_HOTPLUG_PRESENT_CPU))' in
acpi_processor_make_present() until the ifdefs are removed for those two functions
in individual architectures.
>> -#endif /* CONFIG_ACPI_HOTPLUG_PRESENT_CPU */
>>
>> #ifdef CONFIG_ACPI_HOTPLUG_IOAPIC
>> int acpi_get_ioapic_id(acpi_handle handle, u32 gsi_base, u64 *phys_addr);
Thanks,
Gavin
> From: James Morse <[email protected]>
> Sent: Wednesday, September 13, 2023 5:38 PM
[...]
>
> Hello!
>
> Changes since RFC-v1:
> * riscv is new, ia64 is gone
> * The KVM support is different, and upstream - no need to patch the host.
>
> ---
>
> This series adds what looks like cpuhotplug support to arm64 for use in
> virtual machines. It does this by moving the cpu_register() calls for
> architectures that support ACPI out of the arch code by using
> GENERIC_CPU_DEVICES, then into the ACPI processor driver.
>
> The kubernetes folk really want to be able to add CPUs to an existing VM,
> in exactly the same way they do on x86. The use-case is pre-booting guests
> with one CPU, then adding the number that were actually needed when the
> workload is provisioned.
>
[...]
>
> I had a go at switching the remaining architectures over to
> GENERIC_CPU_DEVICES,
> so that the Kconfig symbol can be removed, but I got stuck with powerpc
> and s390.
>
> I've only build tested Loongarch and riscv. I've removed the ia64 specific
> patches, but left the changes in other patches to make git-grep review of
> renames easier.
>
> If folk want to play along at home, you'll need a copy of Qemu that
> supports this.
> https://github.com/salil-mehta/qemu.git salil/virt-cpuhp-armv8/rfc-v2-rc6
Please use the latest pushed RFC V2 instead:
https://lore.kernel.org/qemu-devel/[email protected]/T/#m523b37819c4811c7827333982004e07a1ef03879
Repository:
https://github.com/salil-mehta/qemu.git virt-cpuhp-armv8/rfc-v2
Thanks
Salil.
[...]
> Why is this still an RFC? I'm still looking for confirmation from the
> kubernetes/kata folk that this works for them. Because of this I've culled
> the CC list...
>
>
> This series is based on v6.6-rc1, and can be retrieved from:
> https://git.kernel.org/pub/scm/linux/kernel/git/morse/linux.git/virtual_cpu_hotplug/rfc/v2
On Mon, Sep 18, 2023 at 01:33:37PM +1000, Gavin Shan wrote:
>
>
> On 9/14/23 02:37, James Morse wrote:
> > loongarch, mips, parisc, riscv and sh all print a warning if
> > register_cpu() returns an error. Architectures that use
> > GENERIC_CPU_DEVICES call panic() instead.
> >
> > Errors in this path indicate something is wrong with the firmware
> > description of the platform, but the kernel is able to keep running.
> >
> > Downgrade this to a warning to make it easier to debug this issue.
> >
> > This will allow architectures that are switching over to GENERIC_CPU_DEVICES
> > to drop their warning, but keep the existing behaviour.
> >
> > Signed-off-by: James Morse <[email protected]>
> > ---
> > drivers/base/cpu.c | 7 ++++---
> > 1 file changed, 4 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c
> > index 579064fda97b..d31c936f0955 100644
> > --- a/drivers/base/cpu.c
> > +++ b/drivers/base/cpu.c
> > @@ -535,14 +535,15 @@ int __weak arch_register_cpu(int cpu)
> > static void __init cpu_dev_register_generic(void)
> > {
> > - int i;
> > + int i, ret;
> > if (!IS_ENABLED(CONFIG_GENERIC_CPU_DEVICES))
> > return;
> > for_each_present_cpu(i) {
> > - if (arch_register_cpu(i))
> > - panic("Failed to register CPU device");
> > + ret = arch_register_cpu(i);
> > + if (ret)
> > + pr_warn("register_cpu %d failed (%d)\n", i, ret);
> > }
> > }
>
> The same warning message has been printed by arch/loongarch/kernel/topology.c::arch_register_cpu().
> In order to avoid the duplication, I think the warning message in arch/loongarch needs to be dropped?
No, it doesn't, as far as Loongarch is concerned. Given where this change
occurs in the series, the patch is correct as it stands.
The reason is that this code path can only be reached when
CONFIG_GENERIC_CPU_DEVICES is set, which is something the arch has to
select. Loongarch doesn't select that until patch 9 in the series,
"LoongArch: Switch over to GENERIC_CPU_DEVICES", and that patch is
where the warning message in arch/loongarch is removed.
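For reference, the behavioural change in the quoted hunk (warn and carry on, instead of panic()) looks roughly like this as a userspace model. `demo_register_cpu()` is a hypothetical stand-in for arch_register_cpu() that is made to fail for CPU 2 purely for demonstration.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical stand-in for arch_register_cpu(): CPU 2 fails to register. */
static int demo_register_cpu(int cpu)
{
	return cpu == 2 ? -ENOMEM : 0;
}

/* Register CPUs 0..nr_present-1; returns how many registered successfully. */
static int demo_register_present_cpus(int nr_present)
{
	int i, ret, registered = 0;

	for (i = 0; i < nr_present; i++) {
		ret = demo_register_cpu(i);
		if (ret) {
			/* Previously a panic(); now the loop keeps going. */
			fprintf(stderr, "register_cpu %d failed (%d)\n", i, ret);
			continue;
		}
		registered++;
	}

	return registered;
}
```

With four present CPUs the model registers three and logs one warning, which is the "kernel is able to keep running" case the commit message describes.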
On Thu, Sep 14, 2023 at 01:01:26PM +0100, Jonathan Cameron wrote:
> On Wed, 13 Sep 2023 16:37:59 +0000
> James Morse <[email protected]> wrote:
>
> > register_cpu_capacity_sysctl() adds a property to sysfs that describes
> > the CPUs capacity. This is done from a subsys_initcall() that assumes
> > all possible CPUs are registered.
> >
> > With CPU hotplug, possible CPUs aren't registered until they become
> > present, (or for arm64 enabled). This leads to messages during boot:
> > | register_cpu_capacity_sysctl: too early to get CPU1 device!
> > and once these CPUs are added to the system, the file is missing.
> >
> > Move this to a cpuhp callback, so that the file is created once
> > CPUs are brought online. This covers CPUs that are added late by
> > mechanisms like hotplug.
> > One observable difference is the file is now missing for offline CPUs.
> >
> > Signed-off-by: James Morse <[email protected]>
> > ---
> > If the offline CPUs thing is a problem for the tools that consume
> > this value, we'd need to move cpu_capacity to be part of cpu.c's
> > common_cpu_attr_groups.
>
> I think we should do that anyway and then use an is_visible() if we want to
> change whether it is visible in offline cpus.
>
> Dynamic sysfs file creation is horrible - particularly when done
> > from a totally different file from where the rest of the attributes
> are registered. I'm curious what the history behind that is.
>
> Whilst here, why is there a common_cpu_attr_groups which is
> identical to the hotpluggable_cpu_attr_groups in base/cpu.c?
>
>
> +CC GregKH
> Given changes in drivers/base/
It would be good to have a comment on this from Greg before I post
an updated series of James' patches with most of the comments
addressed, possibly later today.
Thanks.
On Thu, Sep 14, 2023 at 01:01:26PM +0100, Jonathan Cameron wrote:
> On Wed, 13 Sep 2023 16:37:59 +0000
> James Morse <[email protected]> wrote:
>
> > register_cpu_capacity_sysctl() adds a property to sysfs that describes
> > the CPUs capacity. This is done from a subsys_initcall() that assumes
> > all possible CPUs are registered.
> >
> > With CPU hotplug, possible CPUs aren't registered until they become
> > present, (or for arm64 enabled). This leads to messages during boot:
> > | register_cpu_capacity_sysctl: too early to get CPU1 device!
> > and once these CPUs are added to the system, the file is missing.
> >
> > Move this to a cpuhp callback, so that the file is created once
> > CPUs are brought online. This covers CPUs that are added late by
> > mechanisms like hotplug.
> > One observable difference is the file is now missing for offline CPUs.
> >
> > Signed-off-by: James Morse <[email protected]>
> > ---
> > If the offline CPUs thing is a problem for the tools that consume
> > this value, we'd need to move cpu_capacity to be part of cpu.c's
> > common_cpu_attr_groups.
>
> I think we should do that anyway and then use an is_visible() if we want to
> change whether it is visible in offline cpus.
>
> Dynamic sysfs file creation is horrible - particularly when done
> from a totally different file from where the rest of the attributes
> are registered. I'm curious what the history behind that is.
>
> Whilst here, why is there a common_cpu_attr_groups which is
> identical to the hotpluggable_cpu_attr_groups in base/cpu.c?
Looking into doing this, the easy bit is adding the attribute group
with an appropriate .is_visible dependent on cpu_present(), but we
need to be able to call sysfs_update_groups() when the state of the
.is_visible() changes.
Given the comment in sysfs_update_groups() about "if an error occurs",
rather than making this part of common_cpu_attr_groups, would it be
better that it's part of its own set of groups, thus limiting the
damage from a possible error? I suspect, however, that any error at
that point means that the system is rather fatally wounded.
This is what I have so far to implement your idea, less the necessary
sysfs_update_groups() call when we need to change the visibility of
the attributes.
diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
index 9ccb7daee78e..06c9fc6620d2 100644
--- a/drivers/base/arch_topology.c
+++ b/drivers/base/arch_topology.c
@@ -215,43 +215,24 @@ static ssize_t cpu_capacity_show(struct device *dev,
return sysfs_emit(buf, "%lu\n", topology_get_cpu_scale(cpu->dev.id));
}
-static void update_topology_flags_workfn(struct work_struct *work);
-static DECLARE_WORK(update_topology_flags_work, update_topology_flags_workfn);
-
static DEVICE_ATTR_RO(cpu_capacity);
-static int cpu_capacity_sysctl_add(unsigned int cpu)
-{
- struct device *cpu_dev = get_cpu_device(cpu);
-
- if (!cpu_dev)
- return -ENOENT;
-
- device_create_file(cpu_dev, &dev_attr_cpu_capacity);
-
- return 0;
-}
-
-static int cpu_capacity_sysctl_remove(unsigned int cpu)
+static umode_t cpu_present_attrs_visible(struct kobject *kobj,
+ struct attribute *attr, int index)
{
- struct device *cpu_dev = get_cpu_device(cpu);
-
- if (!cpu_dev)
- return -ENOENT;
-
- device_remove_file(cpu_dev, &dev_attr_cpu_capacity);
+ struct device *dev = kobj_to_dev(kobj);
+ struct cpu *cpu = container_of(dev, struct cpu, dev);
- return 0;
+ return cpu_present(cpu->dev.id) ? attr->mode : 0;
}
-static int register_cpu_capacity_sysctl(void)
-{
- cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "topology/cpu-capacity",
- cpu_capacity_sysctl_add, cpu_capacity_sysctl_remove);
+const struct attribute_group cpu_capacity_attr_group = {
+ .is_visible = cpu_present_attrs_visible,
+ .attrs = cpu_capacity_attrs
+};
- return 0;
-}
-subsys_initcall(register_cpu_capacity_sysctl);
+static void update_topology_flags_workfn(struct work_struct *work);
+static DECLARE_WORK(update_topology_flags_work, update_topology_flags_workfn);
static int update_topology;
diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c
index a19a8be93102..954b045705c2 100644
--- a/drivers/base/cpu.c
+++ b/drivers/base/cpu.c
@@ -192,6 +192,9 @@ static const struct attribute_group crash_note_cpu_attr_group = {
static const struct attribute_group *common_cpu_attr_groups[] = {
#ifdef CONFIG_KEXEC
&crash_note_cpu_attr_group,
+#endif
+#ifdef CONFIG_GENERIC_ARCH_TOPOLOGY
+ &cpu_capacity_attr_group,
#endif
NULL
};
diff --git a/include/linux/cpu.h b/include/linux/cpu.h
index e117c06e0c6b..745ad21e3dc8 100644
--- a/include/linux/cpu.h
+++ b/include/linux/cpu.h
@@ -30,6 +30,8 @@ struct cpu {
struct device dev;
};
+extern const struct attribute_group cpu_capacity_attr_group;
+
extern void boot_cpu_init(void);
extern void boot_cpu_hotplug_init(void);
extern void cpu_init(void);
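To show why the sysfs_update_groups() call matters, here is a small userspace model of an .is_visible callback. It is a sketch under simplifying assumptions: the hypothetical `demo_update_groups()` stands in for sysfs re-evaluating visibility, and a bool array stands in for the present mask.

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned short demo_mode_t;

struct demo_attr {
	const char *name;
	demo_mode_t mode;
};

/* Hypothetical model of one CPU's sysfs directory. */
struct demo_cpu_dir {
	bool has_cpu_capacity;
};

#define DEMO_NR_CPUS 4

/* Hypothetical stand-in for the present mask. */
static bool demo_cpu_present_map[DEMO_NR_CPUS] = { true, true, false, false };

/* Models .is_visible: the attribute's mode if the CPU is present, else 0. */
static demo_mode_t demo_present_attr_visible(unsigned int cpu,
					     const struct demo_attr *attr)
{
	assert(cpu < DEMO_NR_CPUS);
	return demo_cpu_present_map[cpu] ? attr->mode : 0;
}

/* Models re-evaluating visibility, as sysfs_update_groups() would trigger. */
static void demo_update_groups(unsigned int cpu, struct demo_cpu_dir *dir,
			       const struct demo_attr *attr)
{
	dir->has_cpu_capacity = demo_present_attr_visible(cpu, attr) != 0;
}
```

Flipping an entry in `demo_cpu_present_map` does nothing to `has_cpu_capacity` on its own; the file only appears after `demo_update_groups()` runs again, which is the missing piece described above.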
On Thu, Sep 14, 2023 at 02:09:40PM +0100, Jonathan Cameron wrote:
> On Thu, 14 Sep 2023 13:27:32 +0100
> Jonathan Cameron <[email protected]> wrote:
>
> > On Wed, 13 Sep 2023 16:38:02 +0000
> > James Morse <[email protected]> wrote:
> >
> > > Today the ACPI enumeration code 'visits' all devices that are present.
> > >
> > > This is a problem for arm64, where CPUs are always present, but not
> > > always enabled. When a device-check occurs because the firmware-policy
> > > has changed and a CPU is now enabled, the following error occurs:
> > > | acpi ACPI0007:48: Enumeration failure
> > >
> > > This is ultimately because acpi_dev_ready_for_enumeration() returns
> > > true for a device that is not enabled. The ACPI Processor driver
> > > will not register such CPUs as they are not 'decoding their resources'.
> > >
> > > Change acpi_dev_ready_for_enumeration() to also check the enabled bit.
> > > ACPI allows a device to be functional instead of maintaining the
> > > present and enabled bit. Make this behaviour an explicit check with
> > > a reference to the spec, and then check the present and enabled bits.
> >
> > "and the" only applies if the functional route hasn't been followed
> > "if not this case check the present and enabled bits."
> >
> > > This is needed to avoid enumerating present && functional devices that
> > > are not enabled.
> > >
> > > Signed-off-by: James Morse <[email protected]>
> > > ---
> > > If this change causes problems on deployed hardware, I suggest an
> > > arch opt-in: ACPI_IGNORE_STA_ENABLED, that causes
> > > acpi_dev_ready_for_enumeration() to only check the present bit.
> > > ---
> > > drivers/acpi/device_pm.c | 2 +-
> > > drivers/acpi/device_sysfs.c | 2 +-
> > > drivers/acpi/internal.h | 1 -
> > > drivers/acpi/property.c | 2 +-
> > > drivers/acpi/scan.c | 23 +++++++++++++----------
> > > 5 files changed, 16 insertions(+), 14 deletions(-)
> > >
> > > diff --git a/drivers/acpi/device_pm.c b/drivers/acpi/device_pm.c
> > > index f007116a8427..76c38478a502 100644
> > > --- a/drivers/acpi/device_pm.c
> > > +++ b/drivers/acpi/device_pm.c
> > > @@ -313,7 +313,7 @@ int acpi_bus_init_power(struct acpi_device *device)
> > > return -EINVAL;
> > >
> > > device->power.state = ACPI_STATE_UNKNOWN;
> > > - if (!acpi_device_is_present(device)) {
> > > + if (!acpi_dev_ready_for_enumeration(device)) {
> > > device->flags.initialized = false;
> > > return -ENXIO;
> > > }
> > > diff --git a/drivers/acpi/device_sysfs.c b/drivers/acpi/device_sysfs.c
> > > index b9bbf0746199..16e586d74aa2 100644
> > > --- a/drivers/acpi/device_sysfs.c
> > > +++ b/drivers/acpi/device_sysfs.c
> > > @@ -141,7 +141,7 @@ static int create_pnp_modalias(const struct acpi_device *acpi_dev, char *modalia
> > > struct acpi_hardware_id *id;
> > >
> > > /* Avoid unnecessarily loading modules for non present devices. */
> > > - if (!acpi_device_is_present(acpi_dev))
> > > + if (!acpi_dev_ready_for_enumeration(acpi_dev))
> > > return 0;
> > >
> > > /*
> > > diff --git a/drivers/acpi/internal.h b/drivers/acpi/internal.h
> > > index 866c7c4ed233..a1b45e345bcc 100644
> > > --- a/drivers/acpi/internal.h
> > > +++ b/drivers/acpi/internal.h
> > > @@ -107,7 +107,6 @@ int acpi_device_setup_files(struct acpi_device *dev);
> > > void acpi_device_remove_files(struct acpi_device *dev);
> > > void acpi_device_add_finalize(struct acpi_device *device);
> > > void acpi_free_pnp_ids(struct acpi_device_pnp *pnp);
> > > -bool acpi_device_is_present(const struct acpi_device *adev);
> > > bool acpi_device_is_battery(struct acpi_device *adev);
> > > bool acpi_device_is_first_physical_node(struct acpi_device *adev,
> > > const struct device *dev);
> > > diff --git a/drivers/acpi/property.c b/drivers/acpi/property.c
> > > index 413e4fcadcaf..e03f00b98701 100644
> > > --- a/drivers/acpi/property.c
> > > +++ b/drivers/acpi/property.c
> > > @@ -1418,7 +1418,7 @@ static bool acpi_fwnode_device_is_available(const struct fwnode_handle *fwnode)
> > > if (!is_acpi_device_node(fwnode))
> > > return false;
> > >
> > > - return acpi_device_is_present(to_acpi_device_node(fwnode));
> > > + return acpi_dev_ready_for_enumeration(to_acpi_device_node(fwnode));
> > > }
> > >
> > > static const void *
> > > diff --git a/drivers/acpi/scan.c b/drivers/acpi/scan.c
> > > index 17ab875a7d4e..f898591ce05f 100644
> > > --- a/drivers/acpi/scan.c
> > > +++ b/drivers/acpi/scan.c
> > > @@ -304,7 +304,7 @@ static int acpi_scan_device_check(struct acpi_device *adev)
> > > int error;
> > >
> > > acpi_bus_get_status(adev);
> > > - if (acpi_device_is_present(adev)) {
> > > + if (acpi_dev_ready_for_enumeration(adev)) {
> > > /*
> > > * This function is only called for device objects for which
> > > * matching scan handlers exist. The only situation in which
> > > @@ -338,7 +338,7 @@ static int acpi_scan_bus_check(struct acpi_device *adev, void *not_used)
> > > int error;
> > >
> > > acpi_bus_get_status(adev);
> > > - if (!acpi_device_is_present(adev)) {
> > > + if (!acpi_dev_ready_for_enumeration(adev)) {
> > > acpi_scan_device_not_enumerated(adev);
> > > return 0;
> > > }
> > > @@ -1908,11 +1908,6 @@ static bool acpi_device_should_be_hidden(acpi_handle handle)
> > > return true;
> > > }
> > >
> > > -bool acpi_device_is_present(const struct acpi_device *adev)
> > > -{
> > > - return adev->status.present || adev->status.functional;
> > > -}
> > > -
> > > static bool acpi_scan_handler_matching(struct acpi_scan_handler *handler,
> > > const char *idstr,
> > > const struct acpi_device_id **matchid)
> > > @@ -2375,16 +2370,24 @@ EXPORT_SYMBOL_GPL(acpi_dev_clear_dependencies);
> > > * acpi_dev_ready_for_enumeration - Check if the ACPI device is ready for enumeration
> > > * @device: Pointer to the &struct acpi_device to check
> > > *
> > > - * Check if the device is present and has no unmet dependencies.
> > > + * Check if the device is functional or enabled and has no unmet dependencies.
> > > *
> > > - * Return true if the device is ready for enumeratino. Otherwise, return false.
> > > + * Return true if the device is ready for enumeration. Otherwise, return false.
> > > */
> > > bool acpi_dev_ready_for_enumeration(const struct acpi_device *device)
> > > {
> > > if (device->flags.honor_deps && device->dep_unmet)
> > > return false;
> > >
> > > - return acpi_device_is_present(device);
> > > + /*
> > > + * ACPI 6.5's 6.3.7 "_STA (Device Status)" allows firmware to return
> > > + * (!present && functional) for certain types of devices that should be
> > > + * enumerated.
> >
> > I'd call out the fact that enumeration isn't same as "device driver should be loaded"
> > which is the thing that functional is supposed to indicate should not happen.
> >
> > > + */
> > > + if (!device->status.present && !device->status.enabled)
> >
> > In theory no need to check !enabled if !present
> > "If bit [0] is cleared, then bit 1 must also be cleared (in other words, a device that is not present cannot be enabled)."
> > We could report an ACPI bug if that's seen. If that bug case is ignored this code can
> > become simpler:
> >
> > if (device->status.present)
> > return device->status.enabled;
> > else
> > return device->status.functional;
> >
> > Or the following also valid here (as functional should be set for enabled present devices
> > unless they failed diagnostics).
> >
> > if (dev->status.functional)
> > return true;
> > return device->status.present && device->status.enabled;
> >
> > On assumption we want to enumerate dead devices for debug purposes...
> Actually ignore this. Could have weird race with present, functional true,
> but enabled not quite set - despite the device being there and self
> tests having passed.
Are you suggesting to ignore your entire suggestion, or just this
suggestion and go with the first one?

So, the code was originally effectively:

    return adev->status.present || adev->status.functional;

So it has the truth table:

    present  functional  result
    false    false       false
    false    true        true
    true     don't care  true

James' replacement code makes this:

    if (!device->status.present && !device->status.enabled)
        return device->status.functional;

    return device->status.present && device->status.enabled;

giving:

    present  enabled  functional  result
    false    false    false       false
    false    false    true        true
    false    true     don't care  false  <== invalid according to spec
    true     false    don't care  false
    true     true     don't care  true

So, I think what you're getting at is that we want the logic to be
according to the above table, but simplified, not caring about the
invalid state too much?

In which case, I would suggest going with your first suggestion, in
other words:

    if (device->status.present)
        return device->status.enabled;
    else
        return device->status.functional;

Yes?
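
For anyone who wants to check the tables above mechanically, here is a
minimal userspace sketch (plain C, not kernel code). The struct and the
function names are made up for illustration; only the three boolean
expressions are taken from this thread. It walks all eight bit
combinations and counts where James' version and the simplified version
disagree, optionally skipping the !present && enabled state that the
spec forbids.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the three _STA bits discussed here. */
struct sta_bits {
	bool present;
	bool enabled;
	bool functional;
};

/* Original behaviour: acpi_device_is_present() as quoted above. */
static bool ready_old(struct sta_bits s)
{
	return s.present || s.functional;
}

/* James' replacement, as posted in the patch. */
static bool ready_posted(struct sta_bits s)
{
	if (!s.present && !s.enabled)
		return s.functional;

	return s.present && s.enabled;
}

/* The simplified form suggested in this reply. */
static bool ready_simplified(struct sta_bits s)
{
	if (s.present)
		return s.enabled;

	return s.functional;
}

/* Count the states where the posted and simplified forms differ. */
static int count_mismatches(bool include_invalid)
{
	int mismatches = 0;

	for (int i = 0; i < 8; i++) {
		struct sta_bits s = {
			.present = (i & 1) != 0,
			.enabled = (i & 2) != 0,
			.functional = (i & 4) != 0,
		};
		/* The spec says a device that is not present cannot be enabled. */
		bool invalid = !s.present && s.enabled;

		if (invalid && !include_invalid)
			continue;
		if (ready_posted(s) != ready_simplified(s))
			mismatches++;
	}

	return mismatches;
}
```

With the invalid state skipped the two forms agree everywhere; with it
included they differ only for !present && enabled && functional, which
firmware should never report. ready_old() is there for comparison: it
still returns true for present && !enabled, which is the case the patch
wants to stop enumerating.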
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
On Tue, Sep 19, 2023 at 09:43:46AM +1000, Gavin Shan wrote:
> On 9/14/23 02:38, James Morse wrote:
> > + if (!device->status.present && !device->status.enabled)
> > + return device->status.functional;
> > +
> > + return device->status.present && device->status.enabled;
> > }
> > EXPORT_SYMBOL_GPL(acpi_dev_ready_for_enumeration);
>
> Looking at Salil's latest branch (vcpu-hotplug-RFCv2-rc7), there are 3 possible statuses:
>
> 0x0 when CPU isn't present
> 0xD when CPU is present, but not enabled
> 0xF when CPU is present and enabled
>
> Previously, the ACPI device is enumerated on 0xD and 0xF. We want to avoid the enumeration
> on 0xD since the processor isn't ready for enumeration in this specific case. The changed
> check (device->status.present && device->status.enabled) can ensure it. So the addition
> of checking @device->state.functional seems irrelevant to ARM64 vCPU hot-add? I guess we
> probably want a relaxation after the condition (device->status.present || device->status.enabled)
> becomes a more strict one (device->status.present && device->status.enabled)
Okay, I'm confused by your comment.
As mentioned in my reply to Jonathan, the current code tests for
device->status.present || device->status.functional, not
device->status.present || device->status.enabled.

Digging back in the history, the acpi_device_is_present() helper
was added in 202317a573b2 "ACPI / scan: Add acpi_device objects for all
device nodes in the namespace". The commit description states:

    Modify the ACPI namespace scanning code to register a struct
    acpi_device object for every namespace node representing a device,
    processor and so on, even if the device represented by that namespace
    node is reported to be not present and not functional by _STA.

It seems the code originally used this test

-       if (!(sta & ACPI_STA_DEVICE_PRESENT) &&
-           !(sta & ACPI_STA_DEVICE_FUNCTIONING)) {

So this commit is just continuing that "tradition".

Digging further back, we find:

778cbc1d3abd ACPI: factor out device type and status checking

-       case ACPI_BUS_TYPE_PROCESSOR:
-       case ACPI_BUS_TYPE_DEVICE:
...
-               /*
-                * When the device is neither present nor functional, the
-                * device should not be added to Linux ACPI device tree.
-                * When the status of the device is not present but functinal,
-                * it should be added to Linux ACPI tree. For example : bay
-                * device , dock device.
-                * In such conditions it is unncessary to check whether it is
-                * bay device or dock device.
-                */
-               if (!device->status.present && !device->status.functional) {

and that comment seems to indicate where the !present && functional
case comes from.

So, I think it's necessary to continue supporting the !present &&
functional case otherwise it seems to me that we'll be regressing some
platforms.
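
To make the raw _STA values concrete, here is a small userspace sketch
(not kernel code). The bit positions follow the _STA definition in the
ACPI spec (bit 0 present, bit 1 enabled, bit 2 shown in UI, bit 3
functioning); the macro and function names are hypothetical and chosen
only for this example. The "new" check is the simplified form discussed
earlier in the thread, which is an assumption about what the final code
will look like.

```c
#include <stdbool.h>
#include <stdint.h>

/* _STA bit layout; names are local to this sketch. */
#define DEMO_STA_PRESENT      0x1u
#define DEMO_STA_ENABLED      0x2u
#define DEMO_STA_SHOW_IN_UI   0x4u
#define DEMO_STA_FUNCTIONING  0x8u

/* Current behaviour: present || functional. */
static bool demo_sta_old_check(uint32_t sta)
{
	return (sta & DEMO_STA_PRESENT) != 0 ||
	       (sta & DEMO_STA_FUNCTIONING) != 0;
}

/*
 * Proposed behaviour: present devices must also be enabled, while
 * !present && functional devices are still enumerated.
 */
static bool demo_sta_new_check(uint32_t sta)
{
	if (sta & DEMO_STA_PRESENT)
		return (sta & DEMO_STA_ENABLED) != 0;

	return (sta & DEMO_STA_FUNCTIONING) != 0;
}
```

Feeding in the three values from Gavin's mail: 0x0 fails both checks,
0xF passes both, and 0xD (present, functioning, not enabled) passes the
old check but fails the new one. A value of 0x8 (not present but
functioning, the bay/dock style case from the old comment) passes both,
which is the behaviour that must not regress.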
On Wed, Sep 13, 2023 at 04:38:01PM +0000, James Morse wrote:
> acpi_scan_device_not_present() is called when a device in the
> hierarchy is not available for enumeration. Historically enumeration
> was only based on whether the device was present.
>
> To add support for only enumerating devices that are both present
> and enabled, this helper should be renamed. It was only ever about
> enumeration, rename it acpi_scan_device_not_enumerated().
>
> No change in behaviour is intended.
>
> Signed-off-by: James Morse <[email protected]>
Is this another patch which ought to be submitted without waiting
for the rest of the series?
Thanks.
On Fri, 20 Oct 2023 17:01:30 +0100
"Russell King (Oracle)" <[email protected]> wrote:
> On Wed, Sep 13, 2023 at 04:38:01PM +0000, James Morse wrote:
> > acpi_scan_device_not_present() is called when a device in the
> > hierarchy is not available for enumeration. Historically enumeration
> > was only based on whether the device was present.
> >
> > To add support for only enumerating devices that are both present
> > and enabled, this helper should be renamed. It was only ever about
> > enumeration, rename it acpi_scan_device_not_enumerated().
> >
> > No change in behaviour is intended.
> >
> > Signed-off-by: James Morse <[email protected]>
>
> Is this another patch which ought to be submitted without waiting
> for the rest of the series?
>
> Thanks.
>
Looks like a valid standalone change to me.
On Fri, 20 Oct 2023 16:32:17 +0100
"Russell King (Oracle)" <[email protected]> wrote:
> On Thu, Sep 14, 2023 at 02:09:40PM +0100, Jonathan Cameron wrote:
> > On Thu, 14 Sep 2023 13:27:32 +0100
> > Jonathan Cameron <[email protected]> wrote:
> >
> > > On Wed, 13 Sep 2023 16:38:02 +0000
> > > James Morse <[email protected]> wrote:
> > >
> > > > Today the ACPI enumeration code 'visits' all devices that are present.
> > > >
> > > > This is a problem for arm64, where CPUs are always present, but not
> > > > always enabled. When a device-check occurs because the firmware-policy
> > > > has changed and a CPU is now enabled, the following error occurs:
> > > > | acpi ACPI0007:48: Enumeration failure
> > > >
> > > > This is ultimately because acpi_dev_ready_for_enumeration() returns
> > > > true for a device that is not enabled. The ACPI Processor driver
> > > > will not register such CPUs as they are not 'decoding their resources'.
> > > >
> > > > Change acpi_dev_ready_for_enumeration() to also check the enabled bit.
> > > > ACPI allows a device to be functional instead of maintaining the
> > > > present and enabled bit. Make this behaviour an explicit check with
> > > > a reference to the spec, and then check the present and enabled bits.
> > >
> > > "and the" only applies if the functional route hasn't been followed
> > > "if not this case check the present and enabled bits."
> > >
> > > > This is needed to avoid enumerating present && functional devices that
> > > > are not enabled.
> > > >
> > > > Signed-off-by: James Morse <[email protected]>
> > > > ---
> > > > If this change causes problems on deployed hardware, I suggest an
> > > > arch opt-in: ACPI_IGNORE_STA_ENABLED, that causes
> > > > acpi_dev_ready_for_enumeration() to only check the present bit.
> > > > ---
> > > > drivers/acpi/device_pm.c | 2 +-
> > > > drivers/acpi/device_sysfs.c | 2 +-
> > > > drivers/acpi/internal.h | 1 -
> > > > drivers/acpi/property.c | 2 +-
> > > > drivers/acpi/scan.c | 23 +++++++++++++----------
> > > > 5 files changed, 16 insertions(+), 14 deletions(-)
> > > >
> > > > diff --git a/drivers/acpi/device_pm.c b/drivers/acpi/device_pm.c
> > > > index f007116a8427..76c38478a502 100644
> > > > --- a/drivers/acpi/device_pm.c
> > > > +++ b/drivers/acpi/device_pm.c
> > > > @@ -313,7 +313,7 @@ int acpi_bus_init_power(struct acpi_device *device)
> > > > return -EINVAL;
> > > >
> > > > device->power.state = ACPI_STATE_UNKNOWN;
> > > > - if (!acpi_device_is_present(device)) {
> > > > + if (!acpi_dev_ready_for_enumeration(device)) {
> > > > device->flags.initialized = false;
> > > > return -ENXIO;
> > > > }
> > > > diff --git a/drivers/acpi/device_sysfs.c b/drivers/acpi/device_sysfs.c
> > > > index b9bbf0746199..16e586d74aa2 100644
> > > > --- a/drivers/acpi/device_sysfs.c
> > > > +++ b/drivers/acpi/device_sysfs.c
> > > > @@ -141,7 +141,7 @@ static int create_pnp_modalias(const struct acpi_device *acpi_dev, char *modalia
> > > > struct acpi_hardware_id *id;
> > > >
> > > > /* Avoid unnecessarily loading modules for non present devices. */
> > > > - if (!acpi_device_is_present(acpi_dev))
> > > > + if (!acpi_dev_ready_for_enumeration(acpi_dev))
> > > > return 0;
> > > >
> > > > /*
> > > > diff --git a/drivers/acpi/internal.h b/drivers/acpi/internal.h
> > > > index 866c7c4ed233..a1b45e345bcc 100644
> > > > --- a/drivers/acpi/internal.h
> > > > +++ b/drivers/acpi/internal.h
> > > > @@ -107,7 +107,6 @@ int acpi_device_setup_files(struct acpi_device *dev);
> > > > void acpi_device_remove_files(struct acpi_device *dev);
> > > > void acpi_device_add_finalize(struct acpi_device *device);
> > > > void acpi_free_pnp_ids(struct acpi_device_pnp *pnp);
> > > > -bool acpi_device_is_present(const struct acpi_device *adev);
> > > > bool acpi_device_is_battery(struct acpi_device *adev);
> > > > bool acpi_device_is_first_physical_node(struct acpi_device *adev,
> > > > const struct device *dev);
> > > > diff --git a/drivers/acpi/property.c b/drivers/acpi/property.c
> > > > index 413e4fcadcaf..e03f00b98701 100644
> > > > --- a/drivers/acpi/property.c
> > > > +++ b/drivers/acpi/property.c
> > > > @@ -1418,7 +1418,7 @@ static bool acpi_fwnode_device_is_available(const struct fwnode_handle *fwnode)
> > > > if (!is_acpi_device_node(fwnode))
> > > > return false;
> > > >
> > > > - return acpi_device_is_present(to_acpi_device_node(fwnode));
> > > > + return acpi_dev_ready_for_enumeration(to_acpi_device_node(fwnode));
> > > > }
> > > >
> > > > static const void *
> > > > diff --git a/drivers/acpi/scan.c b/drivers/acpi/scan.c
> > > > index 17ab875a7d4e..f898591ce05f 100644
> > > > --- a/drivers/acpi/scan.c
> > > > +++ b/drivers/acpi/scan.c
> > > > @@ -304,7 +304,7 @@ static int acpi_scan_device_check(struct acpi_device *adev)
> > > > int error;
> > > >
> > > > acpi_bus_get_status(adev);
> > > > - if (acpi_device_is_present(adev)) {
> > > > + if (acpi_dev_ready_for_enumeration(adev)) {
> > > > /*
> > > > * This function is only called for device objects for which
> > > > * matching scan handlers exist. The only situation in which
> > > > @@ -338,7 +338,7 @@ static int acpi_scan_bus_check(struct acpi_device *adev, void *not_used)
> > > > int error;
> > > >
> > > > acpi_bus_get_status(adev);
> > > > - if (!acpi_device_is_present(adev)) {
> > > > + if (!acpi_dev_ready_for_enumeration(adev)) {
> > > > acpi_scan_device_not_enumerated(adev);
> > > > return 0;
> > > > }
> > > > @@ -1908,11 +1908,6 @@ static bool acpi_device_should_be_hidden(acpi_handle handle)
> > > > return true;
> > > > }
> > > >
> > > > -bool acpi_device_is_present(const struct acpi_device *adev)
> > > > -{
> > > > - return adev->status.present || adev->status.functional;
> > > > -}
> > > > -
> > > > static bool acpi_scan_handler_matching(struct acpi_scan_handler *handler,
> > > > const char *idstr,
> > > > const struct acpi_device_id **matchid)
> > > > @@ -2375,16 +2370,24 @@ EXPORT_SYMBOL_GPL(acpi_dev_clear_dependencies);
> > > > * acpi_dev_ready_for_enumeration - Check if the ACPI device is ready for enumeration
> > > > * @device: Pointer to the &struct acpi_device to check
> > > > *
> > > > - * Check if the device is present and has no unmet dependencies.
> > > > + * Check if the device is functional or enabled and has no unmet dependencies.
> > > > *
> > > > - * Return true if the device is ready for enumeratino. Otherwise, return false.
> > > > + * Return true if the device is ready for enumeration. Otherwise, return false.
> > > > */
> > > > bool acpi_dev_ready_for_enumeration(const struct acpi_device *device)
> > > > {
> > > > if (device->flags.honor_deps && device->dep_unmet)
> > > > return false;
> > > >
> > > > - return acpi_device_is_present(device);
> > > > + /*
> > > > + * ACPI 6.5's 6.3.7 "_STA (Device Status)" allows firmware to return
> > > > + * (!present && functional) for certain types of devices that should be
> > > > + * enumerated.
> > >
> > > I'd call out the fact that enumeration isn't same as "device driver should be loaded"
> > > which is the thing that functional is supposed to indicate should not happen.
> > >
> > > > + */
> > > > + if (!device->status.present && !device->status.enabled)
> > >
> > > In theory no need to check !enabled if !present
> > > "If bit [0] is cleared, then bit 1 must also be cleared (in other words, a device that is not present cannot be enabled)."
> > > We could report an ACPI bug if that's seen. If that bug case is ignored this code can
> > > > become simpler:
> > > >
> > > > if (device->status.present)
> > > > return device->status.enabled;
> > > else
> > > return device->status.functional;
> > >
> > > Or the following also valid here (as functional should be set for enabled present devices
> > > unless they failed diagnostics).
> > >
> > > if (dev->status.functional)
> > > return true;
> > > return device->status.present && device->status.enabled;
> > >
> > > On assumption we want to enumerate dead devices for debug purposes...
> > Actually ignore this. Could have weird race with present, functional true,
> > but enabled not quite set - despite the device being there and self
> > tests having passed.
>
> Are you suggesting to ignore your entire suggestion, or just this
> suggestion and go with the first one?
I meant just the last one. Sorry for the confusion.
>
> So, the code was originally effectively:
>
> return adev->status.present || adev->status.functional;
>
> So it has the truth table:
>
> present functional result
> false false false
> false true true
> true don't care true
>
> James' replacement code makes this:
>
> if (!device->status.present && !device->status.enabled)
> return device->status.functional;
>
> return device->status.present && device->status.enabled;
>
> giving:
>
> present enabled functional result
> false false false false
> false false true true
> false true don't care false <== invalid according to spec
> true false don't care false
> true true don't care true
>
> So, I think what you're getting at is that we want the logic to be
> according to the above table, but simplified, not caring about the
> invalid state too much?
>
> In which case, I would suggest going with your first suggestion, in
> other words:
>
> if (device->status.present)
> return device->status.enabled;
> else
> return device->status.functional;
>
> Yes?
>
Yes, I agree.
On Fri, Oct 20, 2023 at 12:53:29PM +0100, Russell King (Oracle) wrote:
> On Thu, Sep 14, 2023 at 01:01:26PM +0100, Jonathan Cameron wrote:
> > On Wed, 13 Sep 2023 16:37:59 +0000
> > James Morse <[email protected]> wrote:
> >
> > > register_cpu_capacity_sysctl() adds a property to sysfs that describes
> > > the CPUs capacity. This is done from a subsys_initcall() that assumes
> > > all possible CPUs are registered.
> > >
> > > With CPU hotplug, possible CPUs aren't registered until they become
> > > present, (or for arm64 enabled). This leads to messages during boot:
> > > | register_cpu_capacity_sysctl: too early to get CPU1 device!
> > > and once these CPUs are added to the system, the file is missing.
> > >
> > > Move this to a cpuhp callback, so that the file is created once
> > > CPUs are brought online. This covers CPUs that are added late by
> > > mechanisms like hotplug.
> > > One observable difference is the file is now missing for offline CPUs.
> > >
> > > Signed-off-by: James Morse <[email protected]>
> > > ---
> > > If the offline CPUs thing is a problem for the tools that consume
> > > this value, we'd need to move cpu_capacity to be part of cpu.c's
> > > common_cpu_attr_groups.
> >
> > I think we should do that anyway and then use an is_visible() if we want to
> > change whether it is visible in offline cpus.
> >
> > Dynamic sysfs file creation is horrible - particularly when done
> > from an totally different file from where the rest of the attributes
> > are registered. I'm curious what the history behind that is.
> >
> > Whilst here, why is there a common_cpu_attr_groups which is
> > identical to the hotpluggable_cpu_attr_groups in base/cpu.c?
> >
> >
> > +CC GregKH
> > Given changes in drivers/base/
>
> It would be good to have a comment on this from Greg before I post
> an updated series of James' patches with most of the comments
> addressed, possibly later today.
Sorry, I don't see what I am supposed to comment on, so please just send
a new series and I'll look at that.
thanks,
greg k-h
On Tue, Sep 19, 2023 at 10:59:23AM +1000, Gavin Shan wrote:
> On 9/14/23 02:38, James Morse wrote:
> > diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c
> > index 677f963e02ce..c709747c4a18 100644
> > --- a/drivers/base/cpu.c
> > +++ b/drivers/base/cpu.c
> > @@ -531,7 +531,14 @@ int __weak arch_register_cpu(int cpu)
> > {
> > return register_cpu(&per_cpu(cpu_devices, cpu), cpu);
> > }
> > -#endif
> > +
> > +#ifdef CONFIG_HOTPLUG_CPU
> > +void __weak arch_unregister_cpu(int num)
> > +{
> > + unregister_cpu(&per_cpu(cpu_devices, num));
> > +}
> > +#endif /* CONFIG_HOTPLUG_CPU */
> > +#endif /* CONFIG_GENERIC_CPU_DEVICES */
>
> It seems conflicting with its declaration in include/linux/cpu.h.

How so? The declaration is:

    extern void arch_unregister_cpu(int cpu);

So:

    void __weak arch_unregister_cpu(int num)

is compatible.

> Besides, the function is still needed by
> drivers/acpi/acpi_processor.c::acpi_processor_make_not_present()
> even both CONFIG_HOTPLUG_CPU and CONFIG_GENERIC_CPU_DEVICES are disabled?
Yes, I agree - it needs to be present when ACPI is built, so I'm
thinking the right solution is to move it out from under at least
CONFIG_HOTPLUG_CPU.
It can't be moved out from under CONFIG_GENERIC_CPU_DEVICES because
then we end up referencing the per-cpu variable cpu_devices which only
exists when CONFIG_GENERIC_CPU_DEVICES is enabled. Is that a problem
though, because in the case of !CONFIG_GENERIC_CPU_DEVICES, aren't
architectures required to provide both arch_.*register_cpu() functions?
Thanks.
On Thu, Sep 14, 2023 at 03:41:11PM +0100, Jonathan Cameron wrote:
> On Wed, 13 Sep 2023 16:38:13 +0000
> James Morse <[email protected]> wrote:
>
> > LoongArch provides its own arch_unregister_cpu(). This clears the
> > hotpluggable flag, then unregisters the CPU.
> >
> > It isn't necessary to clear the hotpluggable flag when unregistering
> > a cpu. unregister_cpu() writes NULL to the percpu cpu_sys_devices
> > pointer, meaning cpu_is_hotpluggable() will return false, as
> > get_cpu_device() has returned NULL.
>
> Thought that looked odd earlier but didn't care enough to dig.
> Seem unlikely state would persist for an unregistered cpu.
> Great to see confirmation.
Would it make sense to move this immediately after "LoongArch: Switch
over to GENERIC_CPU_DEVICES"?
On Mon, Oct 23, 2023 at 09:44:50AM +0100, Russell King (Oracle) wrote:
> On Tue, Sep 19, 2023 at 10:59:23AM +1000, Gavin Shan wrote:
> > On 9/14/23 02:38, James Morse wrote:
> > > diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c
> > > index 677f963e02ce..c709747c4a18 100644
> > > --- a/drivers/base/cpu.c
> > > +++ b/drivers/base/cpu.c
> > > @@ -531,7 +531,14 @@ int __weak arch_register_cpu(int cpu)
> > > {
> > > return register_cpu(&per_cpu(cpu_devices, cpu), cpu);
> > > }
> > > -#endif
> > > +
> > > +#ifdef CONFIG_HOTPLUG_CPU
> > > +void __weak arch_unregister_cpu(int num)
> > > +{
> > > + unregister_cpu(&per_cpu(cpu_devices, num));
> > > +}
> > > +#endif /* CONFIG_HOTPLUG_CPU */
> > > +#endif /* CONFIG_GENERIC_CPU_DEVICES */
> >
> > It seems conflicting with its declaration in include/linux/cpu.h.
>
> How so? The declaration is:
>
> extern void arch_unregister_cpu(int cpu);
>
> So:
>
> void __weak arch_unregister_cpu(int num)
>
> is compatible.
>
> > Besides, the function is still needed by
> > drivers/acpi/acpi_processor.c::acpi_processor_make_not_present()
> > even both CONFIG_HOTPLUG_CPU and CONFIG_GENERIC_CPU_DEVICES are disabled?
>
> Yes, I agree - it needs to be present when ACPI is built, so I'm
> thinking the right solution is to move it out from under at least
> CONFIG_HOTPLUG_CPU.
>
> It can't be moved out from under CONFIG_GENERIC_CPU_DEVICES because
> then we end up referencing the per-cpu variable cpu_devices which only
> exists when CONFIG_GENERIC_CPU_DEVICES is enabled. Is that a problem
> though, because in the case of !CONFIG_GENERIC_CPU_DEVICES, aren't
> architectures required to provide both arch_.*register_cpu() functions?
I'm also wondering why this patch isn't part of:
"drivers: base: Allow parts of GENERIC_CPU_DEVICES to be overridden"
because it seems to be doing something very similar.
The commit I refer to introduces a weak version of arch_register_cpu(),
and it seems it would also be appropriate to introduce a weak version
of its unregister paired function at the same time.
Any existing definitions of non-weak arch_unregister_cpu() would
override it so it shouldn't cause any issues.
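
As a rough illustration of the weak-default pattern, here is a
userspace sketch using the GCC/Clang weak attribute directly. It is not
the kernel code: every demo_* name is hypothetical, and the array simply
stands in for the per-cpu cpu_devices state.

```c
#define DEMO_NR_CPUS 4

/* Stand-in for the per-cpu registration state. */
static int demo_cpu_registered[DEMO_NR_CPUS] = { 1, 1, 1, 1 };

/* Stand-in for unregister_cpu(). */
static void demo_unregister_cpu(int cpu)
{
	demo_cpu_registered[cpu] = 0;
}

/*
 * Weak default: the linker uses this unless another object file
 * provides a non-weak definition of the same symbol.
 */
__attribute__((weak)) void demo_arch_unregister_cpu(int num)
{
	demo_unregister_cpu(num);
}
```

Calling demo_arch_unregister_cpu(2) here clears only entry 2. An
architecture wanting different behaviour would supply its own non-weak
definition in a separate file, and the default above would be dropped at
link time without any ifdef at the call site.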
Thanks.
On Thu, Sep 14, 2023 at 04:02:23PM +0100, Jonathan Cameron wrote:
> On Wed, 13 Sep 2023 16:38:17 +0000
> James Morse <[email protected]> wrote:
>
> > gic_acpi_match_gicc() is only called via gic_acpi_count_gicr_regions().
> > It should only count the number of enabled redistributors, but it
> > also tries to sanity check the GICC entry, currently returning an
> > error if the Enabled bit is set, but the gicr_base_address is zero.
> >
> > Adding support for the online-capable bit to the sanity check
> > complicates it, for no benefit. The existing check implicitly
> > depends on gic_acpi_count_gicr_regions() previous failing to find
> > any GICR regions (as it is valid to have gicr_base_address of zero if
> > the redistributors are described via a GICR entry).
> >
> > Instead of complicating the check, remove it. Failures that happen
> > at this point cause the irqchip not to register, meaning no irqs
> > can be requested. The kernel grinds to a panic() pretty quickly.
> >
> > Without the check, MADT tables that exhibit this problem are still
> > caught by gic_populate_rdist(), which helpfully also prints what
> > went wrong:
> > | CPU4: mpidr 100 has no re-distributor!
> >
> > Signed-off-by: James Morse <[email protected]>
> > ---
> > drivers/irqchip/irq-gic-v3.c | 18 ++++++------------
> > 1 file changed, 6 insertions(+), 12 deletions(-)
> >
> > diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
> > index 72d3cdebdad1..0f54811262eb 100644
> > --- a/drivers/irqchip/irq-gic-v3.c
> > +++ b/drivers/irqchip/irq-gic-v3.c
> > @@ -2415,21 +2415,15 @@ static int __init gic_acpi_match_gicc(union acpi_subtable_headers *header,
> >
> > /*
> > * If GICC is enabled and has valid gicr base address, then it means
> > - * GICR base is presented via GICC
> > + * GICR base is presented via GICC. The redistributor is only known to
> > + * be accessible if the GICC is marked as enabled. If this bit is not
> > + * set, we'd need to add the redistributor at runtime, which isn't
> > + * supported.
> > */
> > - if (acpi_gicc_is_usable(gicc) && gicc->gicr_base_address) {
> > + if (gicc->flags & ACPI_MADT_ENABLED && gicc->gicr_base_address)
>
> Going in circles...
It does seem that way. Are you suggesting something should change here?
Thanks.
On Mon, Sep 18, 2023 at 03:50:09PM +1000, Gavin Shan wrote:
>
> On 9/15/23 00:17, Jonathan Cameron wrote:
> > On Wed, 13 Sep 2023 16:38:08 +0000
> > James Morse <[email protected]> wrote:
> >
> > > acpi_processor_hotadd_init() will make a CPU present by mapping it
> > > based on its hardware id.
> > >
> > > 'hotadd_init' is ambiguous once there are two different behaviours
> > > for cpu hotplug. This is for toggling the _STA present bit. Subsequent
> > > patches will add support for toggling the _STA enabled bit, named
> > > acpi_processor_make_enabled().
> > >
> > > Rename it acpi_processor_make_present() to make it clear this is
> > > for CPUs that were not previously present.
> > >
> > > Expose the function prototypes it uses to allow the preprocessor
> > > guards to be removed. The IS_ENABLED() check will let the compiler
> > > dead-code elimination pass remove this if it isn't going to be
> > > used.
> > >
> > > Signed-off-by: James Morse <[email protected]>
> > > ---
> > > drivers/acpi/acpi_processor.c | 14 +++++---------
> > > include/linux/acpi.h | 2 --
> > > 2 files changed, 5 insertions(+), 11 deletions(-)
> > >
> > > diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
> > > index 75257fae10e7..22a15a614f95 100644
> > > --- a/drivers/acpi/acpi_processor.c
> > > +++ b/drivers/acpi/acpi_processor.c
> > > @@ -182,13 +182,15 @@ static void __init acpi_pcc_cpufreq_init(void) {}
> > > #endif /* CONFIG_X86 */
> > > /* Initialization */
> > > -#ifdef CONFIG_ACPI_HOTPLUG_PRESENT_CPU
> > > -static int acpi_processor_hotadd_init(struct acpi_processor *pr)
> > > +static int acpi_processor_make_present(struct acpi_processor *pr)
> > > {
> > > unsigned long long sta;
> > > acpi_status status;
> > > int ret;
> > > + if (!IS_ENABLED(CONFIG_ACPI_HOTPLUG_PRESENT_CPU))
> > > + return -ENODEV;
> > > +
> > > if (invalid_phys_cpuid(pr->phys_id))
> > > return -ENODEV;
> > > @@ -222,12 +224,6 @@ static int acpi_processor_hotadd_init(struct acpi_processor *pr)
> > > cpu_maps_update_done();
> > > return ret;
> > > }
> > > -#else
> > > -static inline int acpi_processor_hotadd_init(struct acpi_processor *pr)
> > > -{
> > > - return -ENODEV;
> > > -}
> > > -#endif /* CONFIG_ACPI_HOTPLUG_PRESENT_CPU */
> > > static int acpi_processor_get_info(struct acpi_device *device)
> > > {
> > > @@ -335,7 +331,7 @@ static int acpi_processor_get_info(struct acpi_device *device)
> > > * because cpuid <-> apicid mapping is persistent now.
> > > */
> > > if (invalid_logical_cpuid(pr->id) || !cpu_present(pr->id)) {
> > > - int ret = acpi_processor_hotadd_init(pr);
> > > + int ret = acpi_processor_make_present(pr);
> > > if (ret)
> > > return ret;
> > > diff --git a/include/linux/acpi.h b/include/linux/acpi.h
> > > index 651dd43976a9..b7ab85857bb7 100644
> > > --- a/include/linux/acpi.h
> > > +++ b/include/linux/acpi.h
> > > @@ -316,12 +316,10 @@ static inline int acpi_processor_evaluate_cst(acpi_handle handle, u32 cpu,
> > > }
> > > #endif
> > > -#ifdef CONFIG_ACPI_HOTPLUG_PRESENT_CPU
> > > /* Arch dependent functions for cpu hotplug support */
> > > int acpi_map_cpu(acpi_handle handle, phys_cpuid_t physid, u32 acpi_id,
> > > int *pcpu);
> > > int acpi_unmap_cpu(int cpu);
> >
> > I've lost track somewhat but I think the definitions of these are still under ifdefs
> > which is messy if nothing else and might cause build issues.
> >
>
> Yup, it's not safe to use 'if (!IS_ENABLED(CONFIG_ACPI_HOTPLUG_PRESENT_CPU))' in
> acpi_processor_make_present() until the ifdefs are removed for those two functions
> in individual architectures.
The same thing appears in a final patch that James seems to have added
to the repository:
ACPI: processor: Only call arch_unregister_cpu() if HOTPLUG_CPU is selected
in which acpi_processor_post_eject() has this change:
- if (!device)
+ if (!IS_ENABLED(CONFIG_HOTPLUG_CPU) || !device)
I'm wondering if that's caused by a previous patch making the weak
definition of arch_unregister_cpu() dependent on HOTPLUG_CPU, and
whether dropping that ifdef around the function would be better. I
think I already asked that question, but this final patch seems to be
the confirmation that we need to provide a definition of it.
I think the reason James did it like that is because unregister_cpu()
is dependent upon CONFIG_HOTPLUG_CPU, but it's probably better to do:
#ifdef CONFIG_HOTPLUG_CPU
void __weak arch_unregister_cpu(int num)
{
unregister_cpu(&per_cpu(cpu_devices, num));
}
#else
void __weak arch_unregister_cpu(int num)
{
}
#endif
Agreed?
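
For illustration only, a minimal userspace sketch of the weak-stub pattern
proposed above. Every demo_ name and the DEMO_HOTPLUG_CPU macro are
hypothetical stand-ins (for arch_unregister_cpu(), unregister_cpu() and
CONFIG_HOTPLUG_CPU respectively), not the kernel code. The point is that a
definition exists in both configurations, so a caller links without needing
an IS_ENABLED() check or an #ifdef around the call.

```c
/* Hypothetical stand-in for CONFIG_HOTPLUG_CPU: build with
 * -DDEMO_HOTPLUG_CPU=1 to model the option being selected. */
#ifndef DEMO_HOTPLUG_CPU
#define DEMO_HOTPLUG_CPU 0
#endif

/* Records the last CPU handed to the demo unregister path; -1 means none. */
static int demo_last_unregistered = -1;

#if DEMO_HOTPLUG_CPU
/* Stand-in for unregister_cpu(), which only exists with hotplug enabled. */
static void demo_unregister_cpu(int num)
{
	demo_last_unregistered = num;
}

__attribute__((weak)) void demo_arch_unregister_cpu(int num)
{
	demo_unregister_cpu(num);
}
#else
/* Empty weak stub: callers still link, but nothing is unregistered. */
__attribute__((weak)) void demo_arch_unregister_cpu(int num)
{
	(void)num;
}
#endif

/* A caller that needs no config guard around the call. */
int demo_post_eject(int cpu)
{
	demo_arch_unregister_cpu(cpu);
	return demo_last_unregistered;
}
```

With DEMO_HOTPLUG_CPU left at 0 the call is a no-op; an architecture could
still override the weak symbol with a strong definition in either case.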
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
On Thu, Sep 14, 2023 at 02:53:53PM +0100, Jonathan Cameron wrote:
> On Wed, 13 Sep 2023 16:38:03 +0000
> James Morse <[email protected]> wrote:
>
> > ACPI has two ways of describing processors in the DSDT. Either as a device
> > object with HID ACPI0007, or as a type 'C' package inside a Processor
> > Container. The ACPI processor driver probes CPUs described as devices, but
> > not those described as packages.
> >
>
> Specification reference needed...
>
> Terminology wise, I'd just refer to Processor() objects as I think they
> are named objects rather than data terms like a package (Which include
> a PkgLength etc)
I'm not sure what kind of reference you want for the above. Looking in
ACPI 6.5, I've found in 5.2.12:
"Starting with ACPI Specification 6.3, the use of the Processor() object
was deprecated. Only legacy systems should continue with this usage. On
the Itanium architecture only, a _UID is provided for the Processor()
that is a string object. This usage of _UID is also deprecated since it
can preclude an OSPM from being able to match a processor to a
non-enumerable device, such as those defined in the MADT. From ACPI
Specification 6.3 onward, all processor objects for all architectures
except Itanium must now use Device() objects with an _HID of ACPI0007,
and use only integer _UID values."
Also, there is:
https://uefi.org/specs/ACPI/6.5/08_Processor_Configuration_and_Control.html#declaring-processors
Unfortunately, using the search facility on that site to try and find
Processor() doesn't work - it appears to strip the "()" characters from
the search (which is completely dumb, why do search facilities do that?)
> > The missing probe for CPUs described as packages creates a problem for
> > moving the cpu_register() calls into the acpi_processor driver, as CPUs
> > described like this don't get registered, leading to errors from other
> > subsystems when they try to add new sysfs entries to the CPU node.
> > (e.g. topology_sysfs_init()'s use of topology_add_dev() via cpuhp)
> >
> > To fix this, parse the processor container and call acpi_processor_add()
> > for each processor that is discovered like this. The processor container
> > handler is added with acpi_scan_add_handler(), so no detach call will
> > arrive.
> >
> > Qemu TCG describes CPUs using packages in a processor container.
>
> processor terms in a processor container.
Are you wanting this to be:
"Qemu TCG describes CPUs using processor terms in a processor
container."
? Searching the ACPI spec for "processor terms" (with or without quotes)
only brings up results for "terms" - yet another reason to hate site-
provided search facilities, I don't know why sites bother. :(
On Mon, Sep 18, 2023 at 03:02:53PM +1000, Gavin Shan wrote:
>
> On 9/14/23 02:38, James Morse wrote:
> > ACPI has two ways of describing processors in the DSDT. Either as a device
> > object with HID ACPI0007, or as a type 'C' package inside a Processor
> > Container. The ACPI processor driver probes CPUs described as devices, but
> > not those described as packages.
> >
> > Duplicate descriptions are not allowed, the ACPI processor driver already
> > parses the UID from both devices and containers. acpi_processor_get_info()
> > returns an error if the UID exists twice in the DSDT.
> >
> > The missing probe for CPUs described as packages creates a problem for
> > moving the cpu_register() calls into the acpi_processor driver, as CPUs
> > described like this don't get registered, leading to errors from other
> > subsystems when they try to add new sysfs entries to the CPU node.
> > (e.g. topology_sysfs_init()'s use of topology_add_dev() via cpuhp)
> >
> > To fix this, parse the processor container and call acpi_processor_add()
> > for each processor that is discovered like this. The processor container
> > handler is added with acpi_scan_add_handler(), so no detach call will
> > arrive.
> >
> > Qemu TCG describes CPUs using packages in a processor container.
> >
> > Signed-off-by: James Morse <[email protected]>
> > ---
> > drivers/acpi/acpi_processor.c | 22 ++++++++++++++++++++++
> > 1 file changed, 22 insertions(+)
> >
>
> I don't understand the last sentence of the commit log. QEMU
> always has "ACPI0007" for the processor devices.
I think what James is referring to is the use of Processor Containers
(see
https://uefi.org/specs/ACPI/6.5/08_Processor_Configuration_and_Control.html#processor-container-device)
which are defined using HID ACPI0010. This seems to be what
build_cpus_aml() is doing. It creates:
\_SB.CPUS - processor container with ACPI0010
and then builds the processor devices underneath that object using
ACPI0007.
I think the use of "packages" there is wrong, it should be "processor
devices" - taking the terminology from the above specification link.
As far as I can see, QEMU does not (yet) use the option of embedding
child processor containers beneath a parent.
On Fri, Nov 03, 2023 at 10:43:15AM +0000, Russell King (Oracle) wrote:
> On Thu, Sep 14, 2023 at 02:53:53PM +0100, Jonathan Cameron wrote:
> > On Wed, 13 Sep 2023 16:38:03 +0000
> > James Morse <[email protected]> wrote:
> >
> > > ACPI has two ways of describing processors in the DSDT. Either as a device
> > > object with HID ACPI0007, or as a type 'C' package inside a Processor
> > > Container. The ACPI processor driver probes CPUs described as devices, but
> > > not those described as packages.
> > >
> >
> > Specification reference needed...
> >
> > Terminology wise, I'd just refer to Processor() objects as I think they
> > are named objects rather than data terms like a package (Which include
> > a PkgLength etc)
>
> I'm not sure what kind of reference you want for the above. Looking in
> ACPI 6.5, I've found in 5.2.12:
>
> "Starting with ACPI Specification 6.3, the use of the Processor() object
> was deprecated. Only legacy systems should continue with this usage. On
> the Itanium architecture only, a _UID is provided for the Processor()
> that is a string object. This usage of _UID is also deprecated since it
> can preclude an OSPM from being able to match a processor to a
> non-enumerable device, such as those defined in the MADT. From ACPI
> Specification 6.3 onward, all processor objects for all architectures
> except Itanium must now use Device() objects with an _HID of ACPI0007,
> and use only integer _UID values."
>
> Also, there is:
>
> https://uefi.org/specs/ACPI/6.5/08_Processor_Configuration_and_Control.html#declaring-processors
>
> Unfortunately, using the search facility on that site to try and find
> Processor() doesn't work - it appears to strip the "()" characters from
> the search (which is completely dumb, why do search facilities do that?)
>
> > > The missing probe for CPUs described as packages creates a problem for
> > > moving the cpu_register() calls into the acpi_processor driver, as CPUs
> > > described like this don't get registered, leading to errors from other
> > > subsystems when they try to add new sysfs entries to the CPU node.
> > > (e.g. topology_sysfs_init()'s use of topology_add_dev() via cpuhp)
> > >
> > > To fix this, parse the processor container and call acpi_processor_add()
> > > for each processor that is discovered like this. The processor container
> > > handler is added with acpi_scan_add_handler(), so no detach call will
> > > arrive.
> > >
> > > Qemu TCG describes CPUs using packages in a processor container.
> >
> > processor terms in a processor container.
>
> Are you wanting this to be:
>
> "Qemu TCG describes CPUs using processor terms in a processor
> container."
>
> ? Searching the ACPI spec for "processor terms" (with or without quotes)
> only brings up results for "terms" - yet another reason to hate site-
> provided search facilities, I don't know why sites bother. :(
Given what
https://uefi.org/specs/ACPI/6.5/08_Processor_Configuration_and_Control.html#processor-container-device
says, and what QEMU does (as I detailed in my reply to Gavin), I think
this should be:
"Qemu TCG describes CPUs using processor devices in a processor
container."
which uses the same terminology as the ACPI specification. Maybe also
including a reference to the above URL would be a good idea too?
On Wed, Sep 13, 2023 at 04:38:03PM +0000, James Morse wrote:
> ACPI has two ways of describing processors in the DSDT. Either as a device
> object with HID ACPI0007, or as a type 'C' package inside a Processor
> Container.
I'm struggling with the reference to a "type 'C' package inside a
Processor Container".
ACPI 6.0, which introduced the Processor Container, says: "This optional
device is a container object that acts much like a bus node in a
namespace. It may contain child objects that are either processor devices
or other processor containers"
For "Processor device":
"For more information on the declaration of the processor device object,
see Section 19.6.30, "Device (Declare Device Package).""
which leads one to the Device() object, not the Processor() object.
It also states:
"A Device definition for a processor is declared using the ACPI0007
hardware identifier (HID)."
My understanding from this is that Processor() is not allowed to be
used within a Processor Container, only Device()s with _HID of
ACPI0007 are permitted.
In light of this, please could you clarify your first sentence, as it
seems to be contrary to what is stated in ACPI 6.x specs. Thanks.
On Fri, 3 Nov 2023 10:43:14 +0000
"Russell King (Oracle)" <[email protected]> wrote:
> On Thu, Sep 14, 2023 at 02:53:53PM +0100, Jonathan Cameron wrote:
> > On Wed, 13 Sep 2023 16:38:03 +0000
> > James Morse <[email protected]> wrote:
> >
> > > ACPI has two ways of describing processors in the DSDT. Either as a device
> > > object with HID ACPI0007, or as a type 'C' package inside a Processor
> > > Container. The ACPI processor driver probes CPUs described as devices, but
> > > not those described as packages.
> > >
> >
> > Specification reference needed...
> >
> > Terminology wise, I'd just refer to Processor() objects as I think they
> > are named objects rather than data terms like a package (Which include
> > a PkgLength etc)
>
> I'm not sure what kind of reference you want for the above. Looking in
> ACPI 6.5, I've found in 5.2.12:
>
> "Starting with ACPI Specification 6.3, the use of the Processor() object
> was deprecated. Only legacy systems should continue with this usage. On
> the Itanium architecture only, a _UID is provided for the Processor()
> that is a string object. This usage of _UID is also deprecated since it
> can preclude an OSPM from being able to match a processor to a
> non-enumerable device, such as those defined in the MADT. From ACPI
> Specification 6.3 onward, all processor objects for all architectures
> except Itanium must now use Device() objects with an _HID of ACPI0007,
> and use only integer _UID values."
>
> Also, there is:
>
> https://uefi.org/specs/ACPI/6.5/08_Processor_Configuration_and_Control.html#declaring-processors
That pair of refs, just as 'where to look if you care' cross references, seem
to cover it as well as possible.
>
> Unfortunately, using the search facility on that site to try and find
> Processor() doesn't work - it appears to strip the "()" characters from
> the search (which is completely dumb, why do search facilities do that?)
Yeah. Not great.
>
> > > The missing probe for CPUs described as packages creates a problem for
> > > moving the cpu_register() calls into the acpi_processor driver, as CPUs
> > > described like this don't get registered, leading to errors from other
> > > subsystems when they try to add new sysfs entries to the CPU node.
> > > (e.g. topology_sysfs_init()'s use of topology_add_dev() via cpuhp)
> > >
> > > To fix this, parse the processor container and call acpi_processor_add()
> > > for each processor that is discovered like this. The processor container
> > > handler is added with acpi_scan_add_handler(), so no detach call will
> > > arrive.
> > >
> > > Qemu TCG describes CPUs using packages in a processor container.
> >
> > processor terms in a processor container.
>
> Are you wanting this to be:
>
> "Qemu TCG describes CPUs using processor terms in a processor
> container."
>
> ? Searching the ACPI spec for "processor terms" (with or without quotes)
> only brings up results for "terms" - yet another reason to hate site-
> provided search facilities, I don't know why sites bother. :(
Yup. I just use the PDFs partly for that reason.
It's not possible to find in 6.5 because, as it's deprecated, they removed
the information. Look at ACPI 6.3 and there is 19.6.108 Processor (Declare
Processor) deep in the ASL operator reference.
Wording wise I'm not sure exactly what they should be other than they
aren't packages (if my rough ASL understanding is right).
Different byte encoding.
Jonathan
>
On Thu, Sep 14, 2023 at 03:31:10PM +0100, Jonathan Cameron wrote:
> On Wed, 13 Sep 2023 16:38:10 +0000
> James Morse <[email protected]> wrote:
> > -#ifdef CONFIG_ACPI_HOTPLUG_PRESENT_CPU
> > /* Removal */
> > -static void acpi_processor_post_eject(struct acpi_device *device)
> > +static void acpi_processor_make_not_present(struct acpi_device *device)
> > {
> > struct acpi_processor *pr;
> >
> > - if (!device || !acpi_driver_data(device))
> > + if (!IS_ENABLED(CONFIG_ACPI_HOTPLUG_PRESENT_CPU))
>
> Would it be possible to do all the ifdef to IS_ENABLED changes in a separate
> patch? I haven't figured out if any of them have dependencies on the other
> changes, but they do create a bunch of noise I'd rather not see in the more
> complex corners of this.
I'm also wondering why we want to do this check here, rather than...
> > +static void acpi_processor_post_eject(struct acpi_device *device)
> > +{
> > + struct acpi_processor *pr;
> > + unsigned long long sta;
> > + acpi_status status;
... here, because none of the code below has any effect if
acpi_processor_make_not_present() merely returns. So the below seems
like a waste of code space when CONFIG_ACPI_HOTPLUG_PRESENT_CPU is
disabled.
> > +
> > + if (!device)
> > + return;
> > +
> > + pr = acpi_driver_data(device);
> > + if (!pr || pr->id >= nr_cpu_ids || invalid_phys_cpuid(pr->phys_id))
> > + return;
> > +
> > + status = acpi_evaluate_integer(pr->handle, "_STA", NULL, &sta);
> > + if (ACPI_FAILURE(status))
> > + return;
> > +
> > + if (cpu_present(pr->id) && !(sta & ACPI_STA_DEVICE_PRESENT)) {
> > + acpi_processor_make_not_present(device);
> > + return;
> > + }
> > +}
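
To make the point above concrete, here is a rough userspace sketch of doing
the config check in the caller, before _STA is looked at. The demo_ names
and the boolean parameter standing in for
IS_ENABLED(CONFIG_ACPI_HOTPLUG_PRESENT_CPU) are assumptions for
illustration; only the present-bit value (0x01, as ACPICA defines
ACPI_STA_DEVICE_PRESENT) is taken from the real headers.

```c
#include <stdbool.h>

/* Same value as ACPICA's ACPI_STA_DEVICE_PRESENT. */
#define DEMO_STA_DEVICE_PRESENT 0x01ULL

/*
 * Decide whether the eject path should make the CPU not-present.
 * hotplug_present_cpu models the config option being enabled;
 * cpu_is_present models cpu_present(pr->id); sta models the _STA result.
 */
bool demo_should_make_not_present(bool hotplug_present_cpu,
				  bool cpu_is_present,
				  unsigned long long sta)
{
	/* Checking the option first makes everything below dead code
	 * when the option is disabled. */
	if (!hotplug_present_cpu)
		return false;

	return cpu_is_present && !(sta & DEMO_STA_DEVICE_PRESENT);
}
```

With the check hoisted like this, the callee would not need its own
IS_ENABLED() test.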
On Tue, Sep 19, 2023 at 10:45:39AM +1000, Gavin Shan wrote:
> On 9/14/23 02:38, James Morse wrote:
> > When called, acpi_processor_post_eject() unconditionally makes a CPU
> > not-present and unregisters it.
> >
> > To add support for AML events where the CPU has become disabled, but
> > remains present, the _STA method should be checked before calling
> > acpi_processor_remove().
> >
> > Rename acpi_processor_post_eject() acpi_processor_remove_possible(), and
> > check the _STA before calling.
> >
> > Adding the function prototype for arch_unregister_cpu() allows the
> > preprocessor guards to be removed.
> >
> > After this change CPUs will remain registered and visible to
> > user-space as offline if buggy firmware triggers an eject-request,
> > but doesn't clear the corresponding _STA bits after _EJ0 has been
> > called.
> >
> > Signed-off-by: James Morse <[email protected]>
> > ---
> > drivers/acpi/acpi_processor.c | 31 +++++++++++++++++++++++++------
> > include/linux/cpu.h | 1 +
> > 2 files changed, 26 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
> > index 00dcc23d49a8..2cafea1edc24 100644
> > --- a/drivers/acpi/acpi_processor.c
> > +++ b/drivers/acpi/acpi_processor.c
> > @@ -457,13 +457,12 @@ static int acpi_processor_add(struct acpi_device *device,
> > return result;
> > }
> > -#ifdef CONFIG_ACPI_HOTPLUG_PRESENT_CPU
> > /* Removal */
> > -static void acpi_processor_post_eject(struct acpi_device *device)
> > +static void acpi_processor_make_not_present(struct acpi_device *device)
> > {
> > struct acpi_processor *pr;
> > - if (!device || !acpi_driver_data(device))
> > + if (!IS_ENABLED(CONFIG_ACPI_HOTPLUG_PRESENT_CPU))
> > return;
>
> In order to use IS_ENABLED(),
And the rest of this statement is where?
> > pr = acpi_driver_data(device);
> > @@ -501,7 +500,29 @@ static void acpi_processor_post_eject(struct acpi_device *device)
> > free_cpumask_var(pr->throttling.shared_cpu_map);
> > kfree(pr);
> > }
> > -#endif /* CONFIG_ACPI_HOTPLUG_PRESENT_CPU */
> > +
> > +static void acpi_processor_post_eject(struct acpi_device *device)
> > +{
> > + struct acpi_processor *pr;
> > + unsigned long long sta;
> > + acpi_status status;
> > +
> > + if (!device)
> > + return;
> > +
> > + pr = acpi_driver_data(device);
> > + if (!pr || pr->id >= nr_cpu_ids || invalid_phys_cpuid(pr->phys_id))
> > + return;
> > +
>
> Do we really need to validate the logical and hardware CPU IDs here? I think
> the ACPI processor device can't be added successfully if one of them is
> invalid.
>
> > + status = acpi_evaluate_integer(pr->handle, "_STA", NULL, &sta);
> > + if (ACPI_FAILURE(status))
> > + return;
> > +
> > + if (cpu_present(pr->id) && !(sta & ACPI_STA_DEVICE_PRESENT)) {
> > + acpi_processor_make_not_present(device);
> > + return;
> > + }
> > +}
> > #ifdef CONFIG_ARCH_MIGHT_HAVE_ACPI_PDC
> > bool __init processor_physically_present(acpi_handle handle)
> > @@ -626,9 +647,7 @@ static const struct acpi_device_id processor_device_ids[] = {
> > static struct acpi_scan_handler processor_handler = {
> > .ids = processor_device_ids,
> > .attach = acpi_processor_add,
> > -#ifdef CONFIG_ACPI_HOTPLUG_PRESENT_CPU
> > .post_eject = acpi_processor_post_eject,
> > -#endif
> > .hotplug = {
> > .enabled = true,
> > },
> > diff --git a/include/linux/cpu.h b/include/linux/cpu.h
> > index a71691d7c2ca..e117c06e0c6b 100644
> > --- a/include/linux/cpu.h
> > +++ b/include/linux/cpu.h
> > @@ -81,6 +81,7 @@ struct device *cpu_device_create(struct device *parent, void *drvdata,
> > const struct attribute_group **groups,
> > const char *fmt, ...);
> > extern int arch_register_cpu(int cpu);
> > +extern void arch_unregister_cpu(int cpu);
>
> arch_unregister_cpu() is protected by CONFIG_HOTPLUG_CPU in the individual architectures,
> for example arch/ia64/kernel/topology.c
Yes, I agree, there _may_ be a reference to arch_unregister_cpu() if
the compiler doesn't optimise the "if(0) return".
As things stand in my "head" tree (which I'll be posting once 6.7-rc1
is out) at the point that this patch exists in the series, there are
no architectures which provide arch_unregister_cpu(), and the only
implementation of it is the __weak one in drivers/base/cpu.c
That implementation is also ifdef'd with CONFIG_HOTPLUG_CPU and also
CONFIG_GENERIC_CPU_DEVICES.
Meanwhile, acpi_processor.c is always built with ACPI, and while we
have IS_ENABLED() clauses with James' patches for both of these
symbols, if the compiler doesn't optimise the code away, we will end
up with a reference and a link-time error. That being said, the 0-day
bot has not reported anything as yet (and it builds my tree.)
So, is this a problem that needs to be solved or not?
On Fri, 20 Oct 2023 14:44:35 +0100
"Russell King (Oracle)" <[email protected]> wrote:
> On Thu, Sep 14, 2023 at 01:01:26PM +0100, Jonathan Cameron wrote:
> > On Wed, 13 Sep 2023 16:37:59 +0000
> > James Morse <[email protected]> wrote:
> >
> > > register_cpu_capacity_sysctl() adds a property to sysfs that describes
> > > the CPU's capacity. This is done from a subsys_initcall() that assumes
> > > all possible CPUs are registered.
> > >
> > > With CPU hotplug, possible CPUs aren't registered until they become
> > > present, (or for arm64 enabled). This leads to messages during boot:
> > > | register_cpu_capacity_sysctl: too early to get CPU1 device!
> > > and once these CPUs are added to the system, the file is missing.
> > >
> > > Move this to a cpuhp callback, so that the file is created once
> > > CPUs are brought online. This covers CPUs that are added late by
> > > mechanisms like hotplug.
> > > One observable difference is the file is now missing for offline CPUs.
> > >
> > > Signed-off-by: James Morse <[email protected]>
> > > ---
> > > If the offline CPUs thing is a problem for the tools that consume
> > > this value, we'd need to move cpu_capacity to be part of cpu.c's
> > > common_cpu_attr_groups.
> >
> > I think we should do that anyway and then use an is_visible() if we want to
> > change whether it is visible in offline cpus.
> >
> > Dynamic sysfs file creation is horrible - particularly when done
> > from a totally different file from where the rest of the attributes
> > are registered. I'm curious what the history behind that is.
> >
> > Whilst here, why is there a common_cpu_attr_groups which is
> > identical to the hotpluggable_cpu_attr_groups in base/cpu.c?
>
> Looking into doing this, the easy bit is adding the attribute group
> with an appropriate .is_visible dependent on cpu_present(), but we
> need to be able to call sysfs_update_groups() when the state of the
> .is_visible() changes.
Hi Russell,
Sorry, somehow I missed this completely until you referred back to it :(
This is pretty much what I was thinking so thanks for doing it.
>
> Given the comment in sysfs_update_groups() about "if an error occurs",
> rather than making this part of common_cpu_attr_groups, would it be
> better that it's part of its own set of groups, thus limiting the
> damage from a possible error? I suspect, however, that any error at
> that point means that the system is rather fatally wounded.
>
> This is what I have so far to implement your idea, less the necessary
> sysfs_update_groups() call when we need to change the visibility of
> the attributes.
Fwiw (and I think you shouldn't add this to the critical path for your
main series for obvious reasons), I think you are right that it makes
sense to do this in a separate group, but that if we were going to see
an error I'd 'hope' we shouldn't see anything that hasn't occurred
when groups were originally added. Maybe that's overly optimistic.
Sorry again for lack of reply before now and thanks for pointing this
out. I'd love to see this posted after the ARM vCPU HP stuff is in.
Jonathan
>
> diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c
> index 9ccb7daee78e..06c9fc6620d2 100644
> --- a/drivers/base/arch_topology.c
> +++ b/drivers/base/arch_topology.c
> @@ -215,43 +215,24 @@ static ssize_t cpu_capacity_show(struct device *dev,
> return sysfs_emit(buf, "%lu\n", topology_get_cpu_scale(cpu->dev.id));
> }
>
> -static void update_topology_flags_workfn(struct work_struct *work);
> -static DECLARE_WORK(update_topology_flags_work, update_topology_flags_workfn);
> -
> static DEVICE_ATTR_RO(cpu_capacity);
>
> -static int cpu_capacity_sysctl_add(unsigned int cpu)
> -{
> - struct device *cpu_dev = get_cpu_device(cpu);
> -
> - if (!cpu_dev)
> - return -ENOENT;
> -
> - device_create_file(cpu_dev, &dev_attr_cpu_capacity);
> -
> - return 0;
> -}
> -
> -static int cpu_capacity_sysctl_remove(unsigned int cpu)
> +static umode_t cpu_present_attrs_visible(struct kobject *kobj,
> + struct attribute *attr, int index)
> {
> - struct device *cpu_dev = get_cpu_device(cpu);
> -
> - if (!cpu_dev)
> - return -ENOENT;
> -
> - device_remove_file(cpu_dev, &dev_attr_cpu_capacity);
> + struct device *dev = kobj_to_dev(kobj);
> + struct cpu *cpu = container_of(dev, struct cpu, dev);
>
> - return 0;
> + return cpu_present(cpu->dev.id) ? attr->mode : 0;
> }
>
> -static int register_cpu_capacity_sysctl(void)
> -{
> - cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "topology/cpu-capacity",
> - cpu_capacity_sysctl_add, cpu_capacity_sysctl_remove);
> +const struct attribute_group cpu_capacity_attr_group = {
> + .is_visible = cpu_present_attrs_visible,
> + .attrs = cpu_capacity_attrs
> +};
>
> - return 0;
> -}
> -subsys_initcall(register_cpu_capacity_sysctl);
> +static void update_topology_flags_workfn(struct work_struct *work);
> +static DECLARE_WORK(update_topology_flags_work, update_topology_flags_workfn);
>
> static int update_topology;
>
> diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c
> index a19a8be93102..954b045705c2 100644
> --- a/drivers/base/cpu.c
> +++ b/drivers/base/cpu.c
> @@ -192,6 +192,9 @@ static const struct attribute_group crash_note_cpu_attr_group = {
> static const struct attribute_group *common_cpu_attr_groups[] = {
> #ifdef CONFIG_KEXEC
> &crash_note_cpu_attr_group,
> +#endif
> +#ifdef CONFIG_GENERIC_ARCH_TOPOLOGY
> + &cpu_capacity_attr_group,
> #endif
> NULL
> };
> diff --git a/include/linux/cpu.h b/include/linux/cpu.h
> index e117c06e0c6b..745ad21e3dc8 100644
> --- a/include/linux/cpu.h
> +++ b/include/linux/cpu.h
> @@ -30,6 +30,8 @@ struct cpu {
> struct device dev;
> };
>
> +extern const struct attribute_group cpu_capacity_attr_group;
> +
> extern void boot_cpu_init(void);
> extern void boot_cpu_hotplug_init(void);
> extern void cpu_init(void);
>
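
As a rough illustration of the .is_visible() idea in the diff above, a
userspace sketch follows. The demo_ types, the fixed-size present map and
demo_set_cpu_present() are all hypothetical stand-ins (for struct attribute,
cpu_present() and the hotplug path), not kernel API. It only models the
decision the callback makes: expose the attribute with its normal mode when
the CPU is present, hide it otherwise.

```c
#include <stdbool.h>

typedef unsigned short demo_umode_t;

/* Hypothetical stand-in for struct attribute. */
struct demo_attribute {
	const char *name;
	demo_umode_t mode;
};

#define DEMO_NR_CPUS 4

/* Hypothetical stand-in for the present CPU mask. */
static bool demo_cpu_present_map[DEMO_NR_CPUS];

void demo_set_cpu_present(int cpu, bool present)
{
	if (cpu >= 0 && cpu < DEMO_NR_CPUS)
		demo_cpu_present_map[cpu] = present;
}

/* Return the attribute's mode if the CPU is present, or 0 to hide it. */
demo_umode_t demo_cpu_present_attrs_visible(int cpu,
					    const struct demo_attribute *attr)
{
	if (cpu < 0 || cpu >= DEMO_NR_CPUS)
		return 0;

	return demo_cpu_present_map[cpu] ? attr->mode : 0;
}
```

In this sketch the result changes as soon as the map changes. As noted
earlier in the thread, the real attribute would only change visibility once
sysfs_update_groups() is called after the CPU's present state changes.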