2020-05-07 23:57:42

by Dan Williams

Subject: [PATCH v2] ACPI: Drop rcu usage for MMIO mappings

Recently a performance problem was reported for a process invoking a
non-trivial ASL program. The method call in this case ends up
repetitively triggering a call path like:

acpi_ex_store
acpi_ex_store_object_to_node
acpi_ex_write_data_to_field
acpi_ex_insert_into_field
acpi_ex_write_with_update_rule
acpi_ex_field_datum_io
acpi_ex_access_region
acpi_ev_address_space_dispatch
acpi_ex_system_memory_space_handler
acpi_os_map_cleanup.part.14
_synchronize_rcu_expedited.constprop.89
schedule

The end result of frequent synchronize_rcu_expedited() invocation is
tiny sub-millisecond spurts of execution where the scheduler freely
migrates this apparently sleepy task. The overhead of frequent scheduler
invocation multiplies the execution time by a factor of 2-3X.
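Schematically, each access that misses the currently cached mapping does
something like this (a simplified sketch of the handler behavior, not
actual patch code):

	vaddr = acpi_os_map_memory(address, map_length); /* map a one-page window */
	/* ... read or write the field datum through vaddr ... */
	acpi_os_unmap_memory(vaddr, map_length);
	/*
	 * refcount hits zero -> acpi_os_map_cleanup() ->
	 * synchronize_rcu_expedited(), which sleeps and reschedules
	 */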

For example, performance improves from 16 minutes to 7 minutes for a
firmware update procedure across 24 devices.

Perhaps the rcu usage was intended to allow for not taking a sleeping
lock in the acpi_os_{read,write}_memory() path which ostensibly could be
called from an APEI NMI error interrupt? Neither rcu_read_lock() nor
ioremap() is interrupt safe, so add a WARN_ONCE() to validate that rcu
was not serving as a mechanism to avoid direct calls to ioremap(). Even
the original implementation had a spin_lock_irqsave(), but that is not
NMI safe.

APEI itself already has some concept of avoiding ioremap() from
interrupt context (see erst_exec_move_data()); if the new warning
triggers, it means that APEI either needs more instrumentation like that
to pre-emptively fail, or more infrastructure to arrange for pre-mapping
the resources it needs in NMI context.
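For reference, a minimal sketch of the kind of guard
erst_exec_move_data() applies before mapping (paraphrased with
placeholder variables, not a verbatim copy of the ERST code):

	/* ioremap() may sleep, so refuse to map from interrupt context */
	if (in_interrupt())
		return -EBUSY;
	vaddr = ioremap(paddr, size);
	if (!vaddr)
		return -ENOMEM;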

Cc: <[email protected]>
Fixes: 620242ae8c3d ("ACPI: Maintain a list of ACPI memory mapped I/O remappings")
Cc: Len Brown <[email protected]>
Cc: Borislav Petkov <[email protected]>
Cc: Ira Weiny <[email protected]>
Cc: James Morse <[email protected]>
Cc: Erik Kaneda <[email protected]>
Cc: Myron Stowe <[email protected]>
Cc: "Rafael J. Wysocki" <[email protected]>
Cc: Andy Shevchenko <[email protected]>
Signed-off-by: Dan Williams <[email protected]>
---
Changes since v1 [1]:

- Actually cc: the most important list for ACPI changes (Rafael)

- Cleanup unnecessary variable initialization (Andy)

Link: https://lore.kernel.org/linux-nvdimm/158880834905.2183490.15616329469420234017.stgit@dwillia2-desk3.amr.corp.intel.com/


drivers/acpi/osl.c | 117 +++++++++++++++++++++++++---------------------------
1 file changed, 57 insertions(+), 60 deletions(-)

diff --git a/drivers/acpi/osl.c b/drivers/acpi/osl.c
index 762c5d50b8fe..a44b75aac5d0 100644
--- a/drivers/acpi/osl.c
+++ b/drivers/acpi/osl.c
@@ -214,13 +214,13 @@ acpi_physical_address __init acpi_os_get_root_pointer(void)
return pa;
}

-/* Must be called with 'acpi_ioremap_lock' or RCU read lock held. */
static struct acpi_ioremap *
acpi_map_lookup(acpi_physical_address phys, acpi_size size)
{
struct acpi_ioremap *map;

- list_for_each_entry_rcu(map, &acpi_ioremaps, list, acpi_ioremap_lock_held())
+ lockdep_assert_held(&acpi_ioremap_lock);
+ list_for_each_entry(map, &acpi_ioremaps, list)
if (map->phys <= phys &&
phys + size <= map->phys + map->size)
return map;
@@ -228,7 +228,6 @@ acpi_map_lookup(acpi_physical_address phys, acpi_size size)
return NULL;
}

-/* Must be called with 'acpi_ioremap_lock' or RCU read lock held. */
static void __iomem *
acpi_map_vaddr_lookup(acpi_physical_address phys, unsigned int size)
{
@@ -263,7 +262,8 @@ acpi_map_lookup_virt(void __iomem *virt, acpi_size size)
{
struct acpi_ioremap *map;

- list_for_each_entry_rcu(map, &acpi_ioremaps, list, acpi_ioremap_lock_held())
+ lockdep_assert_held(&acpi_ioremap_lock);
+ list_for_each_entry(map, &acpi_ioremaps, list)
if (map->virt <= virt &&
virt + size <= map->virt + map->size)
return map;
@@ -360,7 +360,7 @@ void __iomem __ref
map->size = pg_sz;
map->refcount = 1;

- list_add_tail_rcu(&map->list, &acpi_ioremaps);
+ list_add_tail(&map->list, &acpi_ioremaps);

out:
mutex_unlock(&acpi_ioremap_lock);
@@ -374,20 +374,13 @@ void *__ref acpi_os_map_memory(acpi_physical_address phys, acpi_size size)
}
EXPORT_SYMBOL_GPL(acpi_os_map_memory);

-/* Must be called with mutex_lock(&acpi_ioremap_lock) */
-static unsigned long acpi_os_drop_map_ref(struct acpi_ioremap *map)
-{
- unsigned long refcount = --map->refcount;
-
- if (!refcount)
- list_del_rcu(&map->list);
- return refcount;
-}
-
-static void acpi_os_map_cleanup(struct acpi_ioremap *map)
+static void acpi_os_drop_map_ref(struct acpi_ioremap *map)
{
- synchronize_rcu_expedited();
+ lockdep_assert_held(&acpi_ioremap_lock);
+ if (--map->refcount > 0)
+ return;
acpi_unmap(map->phys, map->virt);
+ list_del(&map->list);
kfree(map);
}

@@ -408,7 +401,6 @@ static void acpi_os_map_cleanup(struct acpi_ioremap *map)
void __ref acpi_os_unmap_iomem(void __iomem *virt, acpi_size size)
{
struct acpi_ioremap *map;
- unsigned long refcount;

if (!acpi_permanent_mmap) {
__acpi_unmap_table(virt, size);
@@ -422,11 +414,8 @@ void __ref acpi_os_unmap_iomem(void __iomem *virt, acpi_size size)
WARN(true, PREFIX "%s: bad address %p\n", __func__, virt);
return;
}
- refcount = acpi_os_drop_map_ref(map);
+ acpi_os_drop_map_ref(map);
mutex_unlock(&acpi_ioremap_lock);
-
- if (!refcount)
- acpi_os_map_cleanup(map);
}
EXPORT_SYMBOL_GPL(acpi_os_unmap_iomem);

@@ -461,7 +450,6 @@ void acpi_os_unmap_generic_address(struct acpi_generic_address *gas)
{
u64 addr;
struct acpi_ioremap *map;
- unsigned long refcount;

if (gas->space_id != ACPI_ADR_SPACE_SYSTEM_MEMORY)
return;
@@ -477,11 +465,8 @@ void acpi_os_unmap_generic_address(struct acpi_generic_address *gas)
mutex_unlock(&acpi_ioremap_lock);
return;
}
- refcount = acpi_os_drop_map_ref(map);
+ acpi_os_drop_map_ref(map);
mutex_unlock(&acpi_ioremap_lock);
-
- if (!refcount)
- acpi_os_map_cleanup(map);
}
EXPORT_SYMBOL(acpi_os_unmap_generic_address);

@@ -700,55 +685,71 @@ int acpi_os_read_iomem(void __iomem *virt_addr, u64 *value, u32 width)
return 0;
}

+static void __iomem *acpi_os_rw_map(acpi_physical_address phys_addr,
+ unsigned int size, bool *did_fallback)
+{
+ void __iomem *virt_addr;
+
+ if (WARN_ONCE(in_interrupt(), "ioremap in interrupt context\n"))
+ return NULL;
+
+ /* Try to use a cached mapping and fallback otherwise */
+ *did_fallback = false;
+ mutex_lock(&acpi_ioremap_lock);
+ virt_addr = acpi_map_vaddr_lookup(phys_addr, size);
+ if (virt_addr)
+ return virt_addr;
+ mutex_unlock(&acpi_ioremap_lock);
+
+ virt_addr = acpi_os_ioremap(phys_addr, size);
+ *did_fallback = true;
+
+ return virt_addr;
+}
+
+static void acpi_os_rw_unmap(void __iomem *virt_addr, bool did_fallback)
+{
+ if (did_fallback) {
+ /* in the fallback case no lock is held */
+ iounmap(virt_addr);
+ return;
+ }
+
+ mutex_unlock(&acpi_ioremap_lock);
+}
+
acpi_status
acpi_os_read_memory(acpi_physical_address phys_addr, u64 *value, u32 width)
{
- void __iomem *virt_addr;
unsigned int size = width / 8;
- bool unmap = false;
+ bool did_fallback = false;
+ void __iomem *virt_addr;
u64 dummy;
int error;

- rcu_read_lock();
- virt_addr = acpi_map_vaddr_lookup(phys_addr, size);
- if (!virt_addr) {
- rcu_read_unlock();
- virt_addr = acpi_os_ioremap(phys_addr, size);
- if (!virt_addr)
- return AE_BAD_ADDRESS;
- unmap = true;
- }
-
+ virt_addr = acpi_os_rw_map(phys_addr, size, &did_fallback);
+ if (!virt_addr)
+ return AE_BAD_ADDRESS;
if (!value)
value = &dummy;

error = acpi_os_read_iomem(virt_addr, value, width);
BUG_ON(error);

- if (unmap)
- iounmap(virt_addr);
- else
- rcu_read_unlock();
-
+ acpi_os_rw_unmap(virt_addr, did_fallback);
return AE_OK;
}

acpi_status
acpi_os_write_memory(acpi_physical_address phys_addr, u64 value, u32 width)
{
- void __iomem *virt_addr;
unsigned int size = width / 8;
- bool unmap = false;
+ bool did_fallback = false;
+ void __iomem *virt_addr;

- rcu_read_lock();
- virt_addr = acpi_map_vaddr_lookup(phys_addr, size);
- if (!virt_addr) {
- rcu_read_unlock();
- virt_addr = acpi_os_ioremap(phys_addr, size);
- if (!virt_addr)
- return AE_BAD_ADDRESS;
- unmap = true;
- }
+ virt_addr = acpi_os_rw_map(phys_addr, size, &did_fallback);
+ if (!virt_addr)
+ return AE_BAD_ADDRESS;

switch (width) {
case 8:
@@ -767,11 +768,7 @@ acpi_os_write_memory(acpi_physical_address phys_addr, u64 value, u32 width)
BUG();
}

- if (unmap)
- iounmap(virt_addr);
- else
- rcu_read_unlock();
-
+ acpi_os_rw_unmap(virt_addr, did_fallback);
return AE_OK;
}



2020-05-13 08:58:01

by Chen, Rong A

Subject: [ACPI] 5a91d41f89: BUG:sleeping_function_called_from_invalid_context_at_kernel/locking/mutex.c

Greetings,

FYI, we noticed the following commit (built with gcc-7):

commit: 5a91d41f89e8874053e12766fa8eb5eaa997d277 ("[PATCH v2] ACPI: Drop rcu usage for MMIO mappings")
url: https://github.com/0day-ci/linux/commits/Dan-Williams/ACPI-Drop-rcu-usage-for-MMIO-mappings/20200509-013208
base: https://git.kernel.org/cgit/linux/kernel/git/rafael/linux-pm.git linux-next

in testcase: v4l2
with the following parameters:

test: device
ucode: 0x43



on test machine: 72 threads Intel(R) Xeon(R) CPU E5-2699 v3 @ 2.30GHz with 256G memory

caused the changes below (please refer to the attached dmesg/kmsg for the entire log/backtrace):




If you fix the issue, kindly add the following tag
Reported-by: kernel test robot <[email protected]>


[ 15.593161] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:935
[ 15.594078] in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 1, name: swapper/0
[ 15.594078] 2 locks held by swapper/0/1:
[ 15.594078] #0: ffff88a08055b188 (&dev->mutex){....}-{3:3}, at: device_driver_attach+0x1d/0x60
[ 15.594078] #1: ffffffff82a1a658 (ghes_notify_lock_irq){....}-{2:2}, at: ghes_probe+0x1c7/0x470
[ 15.594078] irq event stamp: 11350142
[ 15.594078] hardirqs last enabled at (11350141): [<ffffffff8137d9bf>] kfree+0x18f/0x2f0
[ 15.594078] hardirqs last disabled at (11350142): [<ffffffff81cc0c00>] _raw_spin_lock_irqsave+0x20/0x60
[ 15.594078] softirqs last enabled at (11350110): [<ffffffff82000353>] __do_softirq+0x353/0x466
[ 15.594078] softirqs last disabled at (11350103): [<ffffffff8113c277>] irq_exit+0xe7/0xf0
[ 15.594078] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 5.7.0-rc4-00022-g5a91d41f89e88 #1
[ 15.594078] Hardware name: Intel Corporation S2600WTT/S2600WTT, BIOS SE5C610.86B.01.01.0008.021120151325 02/11/2015
[ 15.594078] Call Trace:
[ 15.594078] dump_stack+0x8f/0xcb
[ 15.594078] ___might_sleep+0x175/0x260
[ 15.594078] __mutex_lock+0x55/0x9f0
[ 15.594078] ? cpumask_next+0x17/0x20
[ 15.594078] ? validate_chain+0xdec/0x1240
[ 15.594078] ? rdinit_setup+0x2b/0x2b
[ 15.594078] ? acpi_os_rw_map+0x34/0xb0
[ 15.594078] acpi_os_rw_map+0x34/0xb0
[ 15.594078] acpi_os_read_memory+0x34/0xc0
[ 15.594078] ? lock_acquire+0xac/0x390
[ 15.594078] apei_read+0x97/0xb0
[ 15.594078] __ghes_peek_estatus+0x27/0xc0
[ 15.594078] ghes_proc+0x37/0x120
[ 15.594078] ghes_probe+0x1d2/0x470
[ 15.594078] platform_drv_probe+0x37/0x90
[ 15.594078] really_probe+0xef/0x430
[ 15.594078] driver_probe_device+0x110/0x120
[ 15.594078] device_driver_attach+0x4f/0x60
[ 15.594078] __driver_attach+0x9c/0x140
[ 15.594078] ? device_driver_attach+0x60/0x60
[ 15.594078] bus_for_each_dev+0x79/0xc0
[ 15.594078] bus_add_driver+0x147/0x220
[ 15.594078] ? bert_init+0x229/0x229
[ 15.594078] driver_register+0x5b/0xf0
[ 15.594078] ? bert_init+0x229/0x229
[ 15.594078] ghes_init+0x83/0xde
[ 15.594078] do_one_initcall+0x5d/0x2f0
[ 15.594078] ? rdinit_setup+0x2b/0x2b
[ 15.594078] ? rcu_read_lock_sched_held+0x52/0x80
[ 15.594078] kernel_init_freeable+0x260/0x2da
[ 15.594078] ? rest_init+0x250/0x250
[ 15.594078] kernel_init+0xa/0x110
[ 15.594078] ret_from_fork+0x3a/0x50
[ 15.594078]
[ 15.594078] =============================
[ 15.594078] [ BUG: Invalid wait context ]
[ 15.594078] 5.7.0-rc4-00022-g5a91d41f89e88 #1 Tainted: G W
[ 15.594078] -----------------------------
[ 15.594078] swapper/0/1 is trying to lock:
[ 15.594078] ffffffff82a0b5c8 (acpi_ioremap_lock){+.+.}-{3:3}, at: acpi_os_rw_map+0x34/0xb0
[ 15.594078] other info that might help us debug this:
[ 15.594078] context-{4:4}
[ 15.594078] 2 locks held by swapper/0/1:
[ 15.594078] #0: ffff88a08055b188 (&dev->mutex){....}-{3:3}, at: device_driver_attach+0x1d/0x60
[ 15.594078] #1: ffffffff82a1a658 (ghes_notify_lock_irq){....}-{2:2}, at: ghes_probe+0x1c7/0x470
[ 15.594078] stack backtrace:
[ 15.594078] CPU: 1 PID: 1 Comm: swapper/0 Tainted: G W 5.7.0-rc4-00022-g5a91d41f89e88 #1
[ 15.594078] Hardware name: Intel Corporation S2600WTT/S2600WTT, BIOS SE5C610.86B.01.01.0008.021120151325 02/11/2015
[ 15.594078] Call Trace:
[ 15.594078] dump_stack+0x8f/0xcb
[ 15.594078] __lock_acquire+0x61e/0xbc0
[ 15.594078] lock_acquire+0xac/0x390
[ 15.594078] ? acpi_os_rw_map+0x34/0xb0
[ 15.594078] ? acpi_os_rw_map+0x34/0xb0
[ 15.594078] ? acpi_os_rw_map+0x34/0xb0
[ 15.594078] __mutex_lock+0xa1/0x9f0
[ 15.594078] ? acpi_os_rw_map+0x34/0xb0
[ 15.594078] ? cpumask_next+0x17/0x20
[ 15.594078] ? rdinit_setup+0x2b/0x2b
[ 15.594078] ? acpi_os_rw_map+0x34/0xb0
[ 15.594078] acpi_os_rw_map+0x34/0xb0
[ 15.594078] acpi_os_read_memory+0x34/0xc0
[ 15.594078] ? lock_acquire+0xac/0x390
[ 15.594078] apei_read+0x97/0xb0
[ 15.594078] __ghes_peek_estatus+0x27/0xc0
[ 15.594078] ghes_proc+0x37/0x120
[ 15.594078] ghes_probe+0x1d2/0x470
[ 15.594078] platform_drv_probe+0x37/0x90
[ 15.594078] really_probe+0xef/0x430
[ 15.594078] driver_probe_device+0x110/0x120
[ 15.594078] device_driver_attach+0x4f/0x60
[ 15.594078] __driver_attach+0x9c/0x140
[ 15.594078] ? device_driver_attach+0x60/0x60
[ 15.594078] bus_for_each_dev+0x79/0xc0
[ 15.594078] bus_add_driver+0x147/0x220
[ 15.594078] ? bert_init+0x229/0x229
[ 15.594078] driver_register+0x5b/0xf0
[ 15.594078] ? bert_init+0x229/0x229
[ 15.594078] ghes_init+0x83/0xde
[ 15.594078] do_one_initcall+0x5d/0x2f0
[ 15.594078] ? rdinit_setup+0x2b/0x2b
[ 15.594078] ? rcu_read_lock_sched_held+0x52/0x80
[ 15.594078] kernel_init_freeable+0x260/0x2da
[ 15.594078] ? rest_init+0x250/0x250
[ 15.594078] kernel_init+0xa/0x110
[ 15.594078] ret_from_fork+0x3a/0x50
[ 16.109557] GHES: APEI firmware first mode is enabled by APEI bit and WHEA _OSC.


To reproduce:

git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
bin/lkp install job.yaml # job file is attached in this email
bin/lkp run job.yaml



Thanks,
Rong Chen


Attachments:
(No filename) (9.55 kB)
config-5.7.0-rc4-00022-g5a91d41f89e88 (210.24 kB)
job-script (5.20 kB)
dmesg.xz (21.96 kB)
job.yaml (4.27 kB)

2020-06-05 13:34:42

by Rafael J. Wysocki

Subject: Re: [PATCH v2] ACPI: Drop rcu usage for MMIO mappings

On Fri, May 8, 2020 at 1:55 AM Dan Williams <[email protected]> wrote:
>
> Recently a performance problem was reported for a process invoking a
> non-trivial ASL program. The method call in this case ends up
> repetitively triggering a call path like:
>
> acpi_ex_store
> acpi_ex_store_object_to_node
> acpi_ex_write_data_to_field
> acpi_ex_insert_into_field
> acpi_ex_write_with_update_rule
> acpi_ex_field_datum_io
> acpi_ex_access_region
> acpi_ev_address_space_dispatch
> acpi_ex_system_memory_space_handler
> acpi_os_map_cleanup.part.14
> _synchronize_rcu_expedited.constprop.89
> schedule
>
> The end result of frequent synchronize_rcu_expedited() invocation is
> tiny sub-millisecond spurts of execution where the scheduler freely
> migrates this apparently sleepy task. The overhead of frequent scheduler
> invocation multiplies the execution time by a factor of 2-3X.
>
> For example, performance improves from 16 minutes to 7 minutes for a
> firmware update procedure across 24 devices.
>
> Perhaps the rcu usage was intended to allow for not taking a sleeping
> lock in the acpi_os_{read,write}_memory() path which ostensibly could be
> called from an APEI NMI error interrupt?

Not really.

acpi_os_{read|write}_memory() end up being called from non-NMI
interrupt context via acpi_hw_{read|write}(), respectively, and quite
obviously ioremap() cannot be run from there, but in those cases the
mappings in question are already in the list, so ioremap() is not
used then.

RCU is there to protect these users from walking the list while it is
being updated.
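For illustration, the pattern being protected reduces to the following
(condensed from osl.c; the reader may run in interrupt context, the
updater may not):

	/* Reader side: may run in (non-NMI) interrupt context. */
	rcu_read_lock();
	list_for_each_entry_rcu(map, &acpi_ioremaps, list)
		if (map->phys <= phys && phys + size <= map->phys + map->size)
			break;	/* found a covering mapping */
	rcu_read_unlock();

	/* Updater side: process context only. */
	mutex_lock(&acpi_ioremap_lock);
	list_del_rcu(&map->list);
	mutex_unlock(&acpi_ioremap_lock);
	synchronize_rcu();	/* wait for all readers before freeing */
	kfree(map);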

> Neither rcu_read_lock() nor ioremap() is interrupt safe, so add a WARN_ONCE() to validate that rcu
> was not serving as a mechanism to avoid direct calls to ioremap().

But it would produce false-positives if the IRQ context was not NMI,
wouldn't it?

> Even the original implementation had a spin_lock_irqsave(), but that is not
> NMI safe.

Which is not a problem (see above).

> APEI itself already has some concept of avoiding ioremap() from
> interrupt context (see erst_exec_move_data()); if the new warning
> triggers, it means that APEI either needs more instrumentation like that
> to pre-emptively fail, or more infrastructure to arrange for pre-mapping
> the resources it needs in NMI context.

Well, I'm not sure about that.

Thanks!

2020-06-05 14:09:02

by Rafael J. Wysocki

Subject: [RFT][PATCH] ACPI: OSL: Use rwlock instead of RCU for memory management

From: Rafael J. Wysocki <[email protected]>
Subject: [PATCH] ACPI: OSL: Use rwlock instead of RCU for memory management

The ACPI OS layer uses RCU to protect the list of ACPI memory
mappings from being walked while it is updated. Among other
situations, that list can be walked in non-NMI interrupt context,
so using a sleeping lock to protect it is not an option.

However, performance issues related to the RCU usage in there
appear, as described by Dan Williams:

"Recently a performance problem was reported for a process invoking
a non-trivial ASL program. The method call in this case ends up
repetitively triggering a call path like:

acpi_ex_store
acpi_ex_store_object_to_node
acpi_ex_write_data_to_field
acpi_ex_insert_into_field
acpi_ex_write_with_update_rule
acpi_ex_field_datum_io
acpi_ex_access_region
acpi_ev_address_space_dispatch
acpi_ex_system_memory_space_handler
acpi_os_map_cleanup.part.14
_synchronize_rcu_expedited.constprop.89
schedule

The end result of frequent synchronize_rcu_expedited() invocation is
tiny sub-millisecond spurts of execution where the scheduler freely
migrates this apparently sleepy task. The overhead of frequent
scheduler invocation multiplies the execution time by a factor
of 2-3X."

In order to avoid these issues, replace the RCU in the ACPI OS
layer by an rwlock.

That rwlock should not be frequently contended, so the performance
impact of it is not expected to be significant.

Reported-by: Dan Williams <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>
---

Hi Dan,

This is a possible fix for the ACPI OSL RCU-related performance issues, but
can you please arrange for the testing of it on the affected systems?

Cheers!

---
drivers/acpi/osl.c | 50 ++++++++++++++++++++++++++++++++++----------------
1 file changed, 34 insertions(+), 16 deletions(-)

Index: linux-pm/drivers/acpi/osl.c
===================================================================
--- linux-pm.orig/drivers/acpi/osl.c
+++ linux-pm/drivers/acpi/osl.c
@@ -81,8 +81,8 @@ struct acpi_ioremap {
};

static LIST_HEAD(acpi_ioremaps);
+static DEFINE_RWLOCK(acpi_ioremaps_list_lock);
static DEFINE_MUTEX(acpi_ioremap_lock);
-#define acpi_ioremap_lock_held() lock_is_held(&acpi_ioremap_lock.dep_map)

static void __init acpi_request_region (struct acpi_generic_address *gas,
unsigned int length, char *desc)
@@ -214,13 +214,13 @@ acpi_physical_address __init acpi_os_get
return pa;
}

-/* Must be called with 'acpi_ioremap_lock' or RCU read lock held. */
+/* Must be called with 'acpi_ioremap_lock' or 'acpi_ioremaps_list_lock' held. */
static struct acpi_ioremap *
acpi_map_lookup(acpi_physical_address phys, acpi_size size)
{
struct acpi_ioremap *map;

- list_for_each_entry_rcu(map, &acpi_ioremaps, list, acpi_ioremap_lock_held())
+ list_for_each_entry(map, &acpi_ioremaps, list)
if (map->phys <= phys &&
phys + size <= map->phys + map->size)
return map;
@@ -228,7 +228,7 @@ acpi_map_lookup(acpi_physical_address ph
return NULL;
}

-/* Must be called with 'acpi_ioremap_lock' or RCU read lock held. */
+/* Must be called with 'acpi_ioremap_lock' or 'acpi_ioremaps_list_lock' held. */
static void __iomem *
acpi_map_vaddr_lookup(acpi_physical_address phys, unsigned int size)
{
@@ -257,13 +257,13 @@ void __iomem *acpi_os_get_iomem(acpi_phy
}
EXPORT_SYMBOL_GPL(acpi_os_get_iomem);

-/* Must be called with 'acpi_ioremap_lock' or RCU read lock held. */
+/* Must be called with 'acpi_ioremap_lock' or 'acpi_ioremaps_list_lock' held. */
static struct acpi_ioremap *
acpi_map_lookup_virt(void __iomem *virt, acpi_size size)
{
struct acpi_ioremap *map;

- list_for_each_entry_rcu(map, &acpi_ioremaps, list, acpi_ioremap_lock_held())
+ list_for_each_entry(map, &acpi_ioremaps, list)
if (map->virt <= virt &&
virt + size <= map->virt + map->size)
return map;
@@ -360,7 +360,11 @@ void __iomem __ref
map->size = pg_sz;
map->refcount = 1;

- list_add_tail_rcu(&map->list, &acpi_ioremaps);
+ write_lock_irq(&acpi_ioremaps_list_lock);
+
+ list_add_tail(&map->list, &acpi_ioremaps);
+
+ write_unlock_irq(&acpi_ioremaps_list_lock);

out:
mutex_unlock(&acpi_ioremap_lock);
@@ -379,14 +383,18 @@ static unsigned long acpi_os_drop_map_re
{
unsigned long refcount = --map->refcount;

- if (!refcount)
- list_del_rcu(&map->list);
+ if (!refcount) {
+ write_lock_irq(&acpi_ioremaps_list_lock);
+
+ list_del(&map->list);
+
+ write_unlock_irq(&acpi_ioremaps_list_lock);
+ }
return refcount;
}

static void acpi_os_map_cleanup(struct acpi_ioremap *map)
{
- synchronize_rcu_expedited();
acpi_unmap(map->phys, map->virt);
kfree(map);
}
@@ -704,18 +712,23 @@ acpi_status
acpi_os_read_memory(acpi_physical_address phys_addr, u64 *value, u32 width)
{
void __iomem *virt_addr;
+ unsigned long flags;
unsigned int size = width / 8;
bool unmap = false;
u64 dummy;
int error;

- rcu_read_lock();
+ read_lock_irqsave(&acpi_ioremaps_list_lock, flags);
+
virt_addr = acpi_map_vaddr_lookup(phys_addr, size);
if (!virt_addr) {
- rcu_read_unlock();
+
+ read_unlock_irqrestore(&acpi_ioremaps_list_lock, flags);
+
virt_addr = acpi_os_ioremap(phys_addr, size);
if (!virt_addr)
return AE_BAD_ADDRESS;
+
unmap = true;
}

@@ -728,7 +741,7 @@ acpi_os_read_memory(acpi_physical_addres
if (unmap)
iounmap(virt_addr);
else
- rcu_read_unlock();
+ read_unlock_irqrestore(&acpi_ioremaps_list_lock, flags);

return AE_OK;
}
@@ -737,16 +750,21 @@ acpi_status
acpi_os_write_memory(acpi_physical_address phys_addr, u64 value, u32 width)
{
void __iomem *virt_addr;
+ unsigned long flags;
unsigned int size = width / 8;
bool unmap = false;

- rcu_read_lock();
+ read_lock_irqsave(&acpi_ioremaps_list_lock, flags);
+
virt_addr = acpi_map_vaddr_lookup(phys_addr, size);
if (!virt_addr) {
- rcu_read_unlock();
+
+ read_unlock_irqrestore(&acpi_ioremaps_list_lock, flags);
+
virt_addr = acpi_os_ioremap(phys_addr, size);
if (!virt_addr)
return AE_BAD_ADDRESS;
+
unmap = true;
}

@@ -770,7 +788,7 @@ acpi_os_write_memory(acpi_physical_addre
if (unmap)
iounmap(virt_addr);
else
- rcu_read_unlock();
+ read_unlock_irqrestore(&acpi_ioremaps_list_lock, flags);

return AE_OK;
}



2020-06-05 16:20:43

by Dan Williams

Subject: Re: [PATCH v2] ACPI: Drop rcu usage for MMIO mappings

On Fri, Jun 5, 2020 at 6:32 AM Rafael J. Wysocki <[email protected]> wrote:
>
> On Fri, May 8, 2020 at 1:55 AM Dan Williams <[email protected]> wrote:
> >
> > Recently a performance problem was reported for a process invoking a
> > non-trivial ASL program. The method call in this case ends up
> > repetitively triggering a call path like:
> >
> > acpi_ex_store
> > acpi_ex_store_object_to_node
> > acpi_ex_write_data_to_field
> > acpi_ex_insert_into_field
> > acpi_ex_write_with_update_rule
> > acpi_ex_field_datum_io
> > acpi_ex_access_region
> > acpi_ev_address_space_dispatch
> > acpi_ex_system_memory_space_handler
> > acpi_os_map_cleanup.part.14
> > _synchronize_rcu_expedited.constprop.89
> > schedule
> >
> > The end result of frequent synchronize_rcu_expedited() invocation is
> > tiny sub-millisecond spurts of execution where the scheduler freely
> > migrates this apparently sleepy task. The overhead of frequent scheduler
> > invocation multiplies the execution time by a factor of 2-3X.
> >
> > For example, performance improves from 16 minutes to 7 minutes for a
> > firmware update procedure across 24 devices.
> >
> > Perhaps the rcu usage was intended to allow for not taking a sleeping
> > lock in the acpi_os_{read,write}_memory() path which ostensibly could be
> > called from an APEI NMI error interrupt?
>
> Not really.
>
> acpi_os_{read|write}_memory() end up being called from non-NMI
> interrupt context via acpi_hw_{read|write}(), respectively, and quite
> obviously ioremap() cannot be run from there, but in those cases the
> mappings in question are already in the list, so ioremap() is not
> used then.
>
> RCU is there to protect these users from walking the list while it is
> being updated.
>
> > Neither rcu_read_lock() nor ioremap() is interrupt safe, so add a WARN_ONCE() to validate that rcu
> > was not serving as a mechanism to avoid direct calls to ioremap().
>
> But it would produce false-positives if the IRQ context was not NMI,
> wouldn't it?
>
> > Even the original implementation had a spin_lock_irqsave(), but that is not
> > NMI safe.
>
> Which is not a problem (see above).
>
> > APEI itself already has some concept of avoiding ioremap() from
> > interrupt context (see erst_exec_move_data()); if the new warning
> > triggers, it means that APEI either needs more instrumentation like that
> > to pre-emptively fail, or more infrastructure to arrange for pre-mapping
> > the resources it needs in NMI context.
>
> Well, I'm not sure about that.

Right, this patch set is about 2-3 generations behind the architecture
of the fix we are discussing internally; you might mention that.

The fix we are looking at now is to pre-map operation regions in a
manner similar to the way APEI resources are pre-mapped. The
pre-mapping would arrange for synchronize_rcu_expedited() to be elided
on each dynamic mapping attempt. The other piece is to arrange for
operation-regions to be mapped at their full size at once rather than
a page at a time.
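A rough sketch of that direction (hypothetical helper; the field names
are the existing ones in struct acpi_mem_space_context, everything else
is illustrative):

	/* Map the whole opregion once, at region activation time. */
	static int acpi_opregion_premap(struct acpi_mem_space_context *ctx)
	{
		ctx->mapped_logical_address =
			acpi_os_map_memory(ctx->address, ctx->length);
		if (!ctx->mapped_logical_address)
			return -ENOMEM;
		ctx->mapped_physical_address = ctx->address;
		ctx->mapped_length = ctx->length; /* full size, not one page */
		return 0;
	}

With a mapping that lives as long as the opregion itself,
acpi_os_map_cleanup() and its synchronize_rcu_expedited() stop firing
on the hot path.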

2020-06-05 16:26:23

by Rafael J. Wysocki

Subject: Re: [PATCH v2] ACPI: Drop rcu usage for MMIO mappings

On Fri, Jun 5, 2020 at 6:18 PM Dan Williams <[email protected]> wrote:
>
> On Fri, Jun 5, 2020 at 6:32 AM Rafael J. Wysocki <[email protected]> wrote:
> >
> > On Fri, May 8, 2020 at 1:55 AM Dan Williams <[email protected]> wrote:
> > >
> > > Recently a performance problem was reported for a process invoking a
> > > non-trivial ASL program. The method call in this case ends up
> > > repetitively triggering a call path like:
> > >
> > > acpi_ex_store
> > > acpi_ex_store_object_to_node
> > > acpi_ex_write_data_to_field
> > > acpi_ex_insert_into_field
> > > acpi_ex_write_with_update_rule
> > > acpi_ex_field_datum_io
> > > acpi_ex_access_region
> > > acpi_ev_address_space_dispatch
> > > acpi_ex_system_memory_space_handler
> > > acpi_os_map_cleanup.part.14
> > > _synchronize_rcu_expedited.constprop.89
> > > schedule
> > >
> > > The end result of frequent synchronize_rcu_expedited() invocation is
> > > tiny sub-millisecond spurts of execution where the scheduler freely
> > > migrates this apparently sleepy task. The overhead of frequent scheduler
> > > invocation multiplies the execution time by a factor of 2-3X.
> > >
> > > For example, performance improves from 16 minutes to 7 minutes for a
> > > firmware update procedure across 24 devices.
> > >
> > > Perhaps the rcu usage was intended to allow for not taking a sleeping
> > > lock in the acpi_os_{read,write}_memory() path which ostensibly could be
> > > called from an APEI NMI error interrupt?
> >
> > Not really.
> >
> > acpi_os_{read|write}_memory() end up being called from non-NMI
> > interrupt context via acpi_hw_{read|write}(), respectively, and quite
> > obviously ioremap() cannot be run from there, but in those cases the
> > mappings in question are already in the list, so ioremap() is not
> > used then.
> >
> > RCU is there to protect these users from walking the list while it is
> > being updated.
> >
> > > Neither rcu_read_lock() nor ioremap() is interrupt safe, so add a WARN_ONCE() to validate that rcu
> > > was not serving as a mechanism to avoid direct calls to ioremap().
> >
> > But it would produce false-positives if the IRQ context was not NMI,
> > wouldn't it?
> >
> > > Even the original implementation had a spin_lock_irqsave(), but that is not
> > > NMI safe.
> >
> > Which is not a problem (see above).
> >
> > > APEI itself already has some concept of avoiding ioremap() from
> > > interrupt context (see erst_exec_move_data()); if the new warning
> > > triggers, it means that APEI either needs more instrumentation like that
> > > to pre-emptively fail, or more infrastructure to arrange for pre-mapping
> > > the resources it needs in NMI context.
> >
> > Well, I'm not sure about that.
>
> Right, this patch set is about 2-3 generations behind the architecture
> of the fix we are discussing internally; you might mention that.

Yes, sorry.

> The fix we are looking at now is to pre-map operation regions in a
> > manner similar to the way APEI resources are pre-mapped. The
> pre-mapping would arrange for synchronize_rcu_expedited() to be elided
> on each dynamic mapping attempt. The other piece is to arrange for
> operation-regions to be mapped at their full size at once rather than
> a page at a time.

However, if the RCU usage in ACPI OSL can be replaced with an rwlock,
some of the ACPICA changes above may not be necessary anymore (even
though some of them may still be worth making).

2020-06-05 16:42:53

by Dan Williams

Subject: Re: [PATCH v2] ACPI: Drop rcu usage for MMIO mappings

On Fri, Jun 5, 2020 at 9:22 AM Rafael J. Wysocki <[email protected]> wrote:
[..]
> > The fix we are looking at now is to pre-map operation regions in a
> > manner similar to the way APEI resources are pre-mapped. The
> > pre-mapping would arrange for synchronize_rcu_expedited() to be elided
> > on each dynamic mapping attempt. The other piece is to arrange for
> > operation-regions to be mapped at their full size at once rather than
> > a page at a time.
>
> However, if the RCU usage in ACPI OSL can be replaced with an rwlock,
> some of the ACPICA changes above may not be necessary anymore (even
> though some of them may still be worth making).

I don't think you can replace the RCU usage in ACPI OSL and still
maintain NMI lookups in a dynamic list.

However, there are 3 solutions I see:

- Prevent acpi_os_map_cleanup() from triggering at high frequency by
pre-mapping and never unmapping operation-region resources (internal
discussion in progress)

- Prevent walks of the 'acpi_ioremaps' list (acpi_map_lookup_virt())
from NMI context by re-writing the physical addresses in the APEI
tables with pre-mapped virtual addresses, i.e. remove rcu_read_lock()
and list_for_each_entry_rcu() from NMI context.

- Split operation-region resources into a separate mapping mechanism
than APEI resources so that typical locking can be used for the
sleepable resources and let the NMI accessible resources be managed
separately.

That last one is one we have not discussed internally, but it occurred
to me when you mentioned replacing RCU.

2020-06-05 17:05:09

by Rafael J. Wysocki

Subject: Re: [PATCH v2] ACPI: Drop rcu usage for MMIO mappings

On Fri, Jun 5, 2020 at 6:39 PM Dan Williams <[email protected]> wrote:
>
> On Fri, Jun 5, 2020 at 9:22 AM Rafael J. Wysocki <[email protected]> wrote:
> [..]
> > > The fix we are looking at now is to pre-map operation regions in a
> > > manner similar to the way APEI resources are pre-mapped. The
> > > pre-mapping would arrange for synchronize_rcu_expedited() to be elided
> > > on each dynamic mapping attempt. The other piece is to arrange for
> > > operation-regions to be mapped at their full size at once rather than
> > > a page at a time.
> >
> > However, if the RCU usage in ACPI OSL can be replaced with an rwlock,
> > some of the ACPICA changes above may not be necessary anymore (even
> > though some of them may still be worth making).
>
> I don't think you can replace the RCU usage in ACPI OSL and still
> maintain NMI lookups in a dynamic list.

I'm not sure what NMI lookups have to do with the issue at hand.

If acpi_os_{read|write}_memory() is used from NMI, that is a bug
already in there which is unrelated to the performance problem with
opregions.

> However, there are 3 solutions I see:
>
> - Prevent acpi_os_map_cleanup() from triggering at high frequency by
> pre-mapping and never unmapping operation-regions resources (internal
> discussion in progress)

Yes, that can be done, if necessary.

> - Prevent walks of the 'acpi_ioremaps' list (acpi_map_lookup_virt())
> from NMI context by re-writing the physical addresses in the APEI
> tables with pre-mapped virtual address, i.e. remove rcu_read_lock()
> and list_for_each_entry_rcu() from NMI context.

That sounds a bit convoluted to me.

> - Split operation-region resources into a separate mapping mechanism
> than APEI resources so that typical locking can be used for the
> sleepable resources and let the NMI accessible resources be managed
> separately.
>
> That last one is one we have not discussed internally, but it occurred
> to me when you mentioned replacing RCU.

So NMI cannot use acpi_os_{read|write}_memory() safely, which you have
pointed out a few times.

But even if NMI resources are managed separately, the others will
still not be sleepable (at least not all of them).

Cheers!

2020-06-05 17:11:44

by Dan Williams

Subject: Re: [RFT][PATCH] ACPI: OSL: Use rwlock instead of RCU for memory management

On Fri, Jun 5, 2020 at 7:06 AM Rafael J. Wysocki <[email protected]> wrote:
>
> From: Rafael J. Wysocki <[email protected]>
> Subject: [PATCH] ACPI: OSL: Use rwlock instead of RCU for memory management
>
> The ACPI OS layer uses RCU to protect the list of ACPI memory
> mappings from being walked while it is updated. Among other
> situations, that list can be walked in non-NMI interrupt context,
> so using a sleeping lock to protect it is not an option.
>
> However, performance issues related to the RCU usage in there
> appear, as described by Dan Williams:
>
> "Recently a performance problem was reported for a process invoking
> a non-trivial ASL program. The method call in this case ends up
> repetitively triggering a call path like:
>
> acpi_ex_store
> acpi_ex_store_object_to_node
> acpi_ex_write_data_to_field
> acpi_ex_insert_into_field
> acpi_ex_write_with_update_rule
> acpi_ex_field_datum_io
> acpi_ex_access_region
> acpi_ev_address_space_dispatch
> acpi_ex_system_memory_space_handler
> acpi_os_map_cleanup.part.14
> _synchronize_rcu_expedited.constprop.89
> schedule
>
> The end result of frequent synchronize_rcu_expedited() invocation is
> tiny sub-millisecond spurts of execution where the scheduler freely
> migrates this apparently sleepy task. The overhead of frequent
> scheduler invocation multiplies the execution time by a factor
> of 2-3X."
>
> In order to avoid these issues, replace the RCU in the ACPI OS
> layer by an rwlock.
>
> That rwlock should not be frequently contended, so the performance
> impact of it is not expected to be significant.
>
> Reported-by: Dan Williams <[email protected]>
> Signed-off-by: Rafael J. Wysocki <[email protected]>
> ---
>
> Hi Dan,
>
> This is a possible fix for the ACPI OSL RCU-related performance issues, but
> can you please arrange for the testing of it on the affected systems?

Ugh, is it really this simple? I did not realize the read-side is NMI
safe. I'll take a look.

2020-06-05 19:43:01

by Andy Shevchenko

Subject: Re: [RFT][PATCH] ACPI: OSL: Use rwlock instead of RCU for memory management

On Fri, Jun 5, 2020 at 5:11 PM Rafael J. Wysocki <[email protected]> wrote:

...

> + if (!refcount) {
> + write_lock_irq(&acpi_ioremaps_list_lock);
> +
> + list_del(&map->list);
> +
> + write_unlock_irq(&acpi_ioremaps_list_lock);
> + }
> return refcount;

It seems we can decrease the indentation level at the same time:

	if (refcount)
		return refcount;

	...
	return 0;

--
With Best Regards,
Andy Shevchenko

2020-06-06 06:50:26

by Rafael J. Wysocki

Subject: Re: [RFT][PATCH] ACPI: OSL: Use rwlock instead of RCU for memory management

On Fri, Jun 5, 2020 at 9:40 PM Andy Shevchenko
<[email protected]> wrote:
>
> On Fri, Jun 5, 2020 at 5:11 PM Rafael J. Wysocki <[email protected]> wrote:
>
> ...
>
> > + if (!refcount) {
> > + write_lock_irq(&acpi_ioremaps_list_lock);
> > +
> > + list_del(&map->list);
> > +
> > + write_unlock_irq(&acpi_ioremaps_list_lock);
> > + }
> > return refcount;
>
> It seems we can decrease the indentation level at the same time:
>
> if (refcount)
> return refcount;
>
> ...
> return 0;

Right, but the patch will need to be dropped anyway, I think.

2020-06-06 06:58:38

by Rafael J. Wysocki

Subject: Re: [RFT][PATCH] ACPI: OSL: Use rwlock instead of RCU for memory management

On Fri, Jun 5, 2020 at 7:09 PM Dan Williams <[email protected]> wrote:
>
> On Fri, Jun 5, 2020 at 7:06 AM Rafael J. Wysocki <[email protected]> wrote:
> >
> > From: Rafael J. Wysocki <[email protected]>
> > Subject: [PATCH] ACPI: OSL: Use rwlock instead of RCU for memory management
> >
> > The ACPI OS layer uses RCU to protect the list of ACPI memory
> > mappings from being walked while it is updated. Among other
> > situations, that list can be walked in non-NMI interrupt context,
> > so using a sleeping lock to protect it is not an option.
> >
> > However, performance issues related to the RCU usage in there
> > appear, as described by Dan Williams:
> >
> > "Recently a performance problem was reported for a process invoking
> > a non-trivial ASL program. The method call in this case ends up
> > repetitively triggering a call path like:
> >
> > acpi_ex_store
> > acpi_ex_store_object_to_node
> > acpi_ex_write_data_to_field
> > acpi_ex_insert_into_field
> > acpi_ex_write_with_update_rule
> > acpi_ex_field_datum_io
> > acpi_ex_access_region
> > acpi_ev_address_space_dispatch
> > acpi_ex_system_memory_space_handler
> > acpi_os_map_cleanup.part.14
> > _synchronize_rcu_expedited.constprop.89
> > schedule
> >
> > The end result of frequent synchronize_rcu_expedited() invocation is
> > tiny sub-millisecond spurts of execution where the scheduler freely
> > migrates this apparently sleepy task. The overhead of frequent
> > scheduler invocation multiplies the execution time by a factor
> > of 2-3X."
> >
> > In order to avoid these issues, replace the RCU in the ACPI OS
> > layer by an rwlock.
> >
> > That rwlock should not be frequently contended, so the performance
> > impact of it is not expected to be significant.
> >
> > Reported-by: Dan Williams <[email protected]>
> > Signed-off-by: Rafael J. Wysocki <[email protected]>
> > ---
> >
> > Hi Dan,
> >
> > This is a possible fix for the ACPI OSL RCU-related performance issues, but
> > can you please arrange for the testing of it on the affected systems?
>
> Ugh, is it really this simple? I did not realize the read-side is NMI
> safe. I'll take a look.

But if an NMI triggers while the lock is being held for writing, it
will deadlock, won't it?

OTOH, according to the RCU documentation it is valid to call
rcu_read_[un]lock() from an NMI handler (see Interrupts and NMIs in
Documentation/RCU/Design/Requirements/Requirements.rst) so we are good
from this perspective today.
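Schematically (a sketch, both steps on the same CPU):

	write_lock_irq(&acpi_ioremaps_list_lock); /* IRQs masked, NMIs are not */
	/* NMI fires here; its handler then does: */
	read_lock(&acpi_ioremaps_list_lock);	/* spins behind the writer: deadlock */

rcu_read_lock(), by contrast, never spins waiting for an updater, which
is why the current RCU-based lookup is NMI-safe.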

Unless we teach APEI to avoid mapping lookups from
apei_{read|write}(), which wouldn't be unreasonable by itself, we need
to hold on to the RCU in ACPI OSL, so it looks like addressing the
problem in ACPICA is the best way to do it (and the current ACPICA
code in question is suboptimal, so it would be good to rework it
anyway).

Cheers!

2020-06-08 15:35:55

by Rafael J. Wysocki

Subject: Re: [RFT][PATCH] ACPI: OSL: Use rwlock instead of RCU for memory management

On Saturday, June 6, 2020 8:56:26 AM CEST Rafael J. Wysocki wrote:
> On Fri, Jun 5, 2020 at 7:09 PM Dan Williams <[email protected]> wrote:
> >
> > On Fri, Jun 5, 2020 at 7:06 AM Rafael J. Wysocki <[email protected]> wrote:
> > >
> > > From: Rafael J. Wysocki <[email protected]>
> > > Subject: [PATCH] ACPI: OSL: Use rwlock instead of RCU for memory management
> > >
> > > The ACPI OS layer uses RCU to protect the list of ACPI memory
> > > mappings from being walked while it is updated. Among other
> > > situations, that list can be walked in non-NMI interrupt context,
> > > so using a sleeping lock to protect it is not an option.
> > >
> > > However, performance issues related to the RCU usage in there
> > > appear, as described by Dan Williams:
> > >
> > > "Recently a performance problem was reported for a process invoking
> > > a non-trivial ASL program. The method call in this case ends up
> > > repetitively triggering a call path like:
> > >
> > > acpi_ex_store
> > > acpi_ex_store_object_to_node
> > > acpi_ex_write_data_to_field
> > > acpi_ex_insert_into_field
> > > acpi_ex_write_with_update_rule
> > > acpi_ex_field_datum_io
> > > acpi_ex_access_region
> > > acpi_ev_address_space_dispatch
> > > acpi_ex_system_memory_space_handler
> > > acpi_os_map_cleanup.part.14
> > > _synchronize_rcu_expedited.constprop.89
> > > schedule
> > >
> > > The end result of frequent synchronize_rcu_expedited() invocation is
> > > tiny sub-millisecond spurts of execution where the scheduler freely
> > > migrates this apparently sleepy task. The overhead of frequent
> > > scheduler invocation multiplies the execution time by a factor
> > > of 2-3X."
> > >
> > > In order to avoid these issues, replace the RCU in the ACPI OS
> > > layer by an rwlock.
> > >
> > > That rwlock should not be frequently contended, so the performance
> > > impact of it is not expected to be significant.
> > >
> > > Reported-by: Dan Williams <[email protected]>
> > > Signed-off-by: Rafael J. Wysocki <[email protected]>
> > > ---
> > >
> > > Hi Dan,
> > >
> > > This is a possible fix for the ACPI OSL RCU-related performance issues, but
> > > can you please arrange for the testing of it on the affected systems?
> >
> > Ugh, is it really this simple? I did not realize the read-side is NMI
> > safe. I'll take a look.
>
> But if an NMI triggers while the lock is being held for writing, it
> will deadlock, won't it?
>
> OTOH, according to the RCU documentation it is valid to call
> rcu_read_[un]lock() from an NMI handler (see Interrupts and NMIs in
> Documentation/RCU/Design/Requirements/Requirements.rst) so we are good
> from this perspective today.
>
> Unless we teach APEI to avoid mapping lookups from
> apei_{read|write}(), which wouldn't be unreasonable by itself, we need
> to hold on to the RCU in ACPI OSL, so it looks like addressing the
> problem in ACPICA is the best way to do it (and the current ACPICA
> code in question is suboptimal, so it would be good to rework it
> anyway).
>
> Cheers!

I've sent the prototype patch below to you, Bob and Erik in private, so
here it goes to the lists for completeness.

It introduces a "fast-path" variant of acpi_os_map_memory() that only
returns non-NULL if a matching mapping is already there in the list
and reworks acpi_ex_system_memory_space_handler() to use it.

The idea is to do a fast-path lookup first for every new mapping and
only run the full acpi_os_map_memory() if that returns NULL and then
save the mapping returned by it and do a fast-path lookup for it again
to bump up its reference counter in the OSL layer. That should prevent
the mappings from going away until the opregions that they belong to
go away (the opregion deactivation code is updated too to remove the
saved mappings), so in the cases when there's not too much opregion
creation and removal activity, it should make the RCU-related overhead
go away.

Please test.

Cheers!

---
drivers/acpi/acpica/evrgnini.c | 14 ++++++++-
drivers/acpi/acpica/exregion.c | 49 +++++++++++++++++++++++++++++--
drivers/acpi/osl.c | 59 ++++++++++++++++++++++++++++----------
include/acpi/actypes.h | 7 ++++
include/acpi/platform/aclinuxex.h | 2 +
5 files changed, 112 insertions(+), 19 deletions(-)

Index: linux-pm/drivers/acpi/osl.c
===================================================================
--- linux-pm.orig/drivers/acpi/osl.c
+++ linux-pm/drivers/acpi/osl.c
@@ -302,21 +302,8 @@ static void acpi_unmap(acpi_physical_add
iounmap(vaddr);
}

-/**
- * acpi_os_map_iomem - Get a virtual address for a given physical address range.
- * @phys: Start of the physical address range to map.
- * @size: Size of the physical address range to map.
- *
- * Look up the given physical address range in the list of existing ACPI memory
- * mappings. If found, get a reference to it and return a pointer to it (its
- * virtual address). If not found, map it, add it to that list and return a
- * pointer to it.
- *
- * During early init (when acpi_permanent_mmap has not been set yet) this
- * routine simply calls __acpi_map_table() to get the job done.
- */
-void __iomem __ref
-*acpi_os_map_iomem(acpi_physical_address phys, acpi_size size)
+static void __iomem __ref *__acpi_os_map_iomem(acpi_physical_address phys,
+ acpi_size size, bool fastpath)
{
struct acpi_ioremap *map;
void __iomem *virt;
@@ -339,6 +326,11 @@ void __iomem __ref
goto out;
}

+ if (fastpath) {
+ mutex_unlock(&acpi_ioremap_lock);
+ return NULL;
+ }
+
map = kzalloc(sizeof(*map), GFP_KERNEL);
if (!map) {
mutex_unlock(&acpi_ioremap_lock);
@@ -366,6 +358,25 @@ out:
mutex_unlock(&acpi_ioremap_lock);
return map->virt + (phys - map->phys);
}
+
+/**
+ * acpi_os_map_iomem - Get a virtual address for a given physical address range.
+ * @phys: Start of the physical address range to map.
+ * @size: Size of the physical address range to map.
+ *
+ * Look up the given physical address range in the list of existing ACPI memory
+ * mappings. If found, get a reference to it and return a pointer representing
+ * its virtual address. If not found, map it, add it to that list and return a
+ * pointer representing its virtual address.
+ *
+ * During early init (when acpi_permanent_mmap has not been set yet) call
+ * __acpi_map_table() to obtain the mapping.
+ */
+void __iomem __ref *acpi_os_map_iomem(acpi_physical_address phys,
+ acpi_size size)
+{
+ return __acpi_os_map_iomem(phys, size, false);
+}
EXPORT_SYMBOL_GPL(acpi_os_map_iomem);

void *__ref acpi_os_map_memory(acpi_physical_address phys, acpi_size size)
@@ -374,6 +385,24 @@ void *__ref acpi_os_map_memory(acpi_phys
}
EXPORT_SYMBOL_GPL(acpi_os_map_memory);

+/**
+ * acpi_os_map_memory_fastpath - Fast-path physical-to-virtual address mapping.
+ * @phys: Start of the physical address range to map.
+ * @size: Size of the physical address range to map.
+ *
+ * Look up the given physical address range in the list of existing ACPI memory
+ * mappings. If found, get a reference to it and return a pointer representing
+ * its virtual address. If not found, return NULL.
+ *
+ * During early init (when acpi_permanent_mmap has not been set yet) call
+ * __acpi_map_table() to obtain the mapping.
+ */
+void __ref *acpi_os_map_memory_fastpath(acpi_physical_address phys,
+ acpi_size size)
+{
+ return __acpi_os_map_iomem(phys, size, true);
+}
+
/* Must be called with mutex_lock(&acpi_ioremap_lock) */
static unsigned long acpi_os_drop_map_ref(struct acpi_ioremap *map)
{
Index: linux-pm/include/acpi/actypes.h
===================================================================
--- linux-pm.orig/include/acpi/actypes.h
+++ linux-pm/include/acpi/actypes.h
@@ -1200,12 +1200,19 @@ struct acpi_pci_id {
u16 function;
};

+struct acpi_mem_mapping {
+ u8 *logical_address;
+ acpi_size length;
+ struct acpi_mem_mapping *next;
+};
+
struct acpi_mem_space_context {
u32 length;
acpi_physical_address address;
acpi_physical_address mapped_physical_address;
u8 *mapped_logical_address;
acpi_size mapped_length;
+ struct acpi_mem_mapping *first_mapping;
};

/*
Index: linux-pm/drivers/acpi/acpica/exregion.c
===================================================================
--- linux-pm.orig/drivers/acpi/acpica/exregion.c
+++ linux-pm/drivers/acpi/acpica/exregion.c
@@ -44,6 +44,9 @@ acpi_ex_system_memory_space_handler(u32
u32 length;
acpi_size map_length;
acpi_size page_boundary_map_length;
+#ifdef ACPI_OS_MAP_MEMORY_FASTPATH
+ struct acpi_mem_mapping *new_mapping;
+#endif
#ifdef ACPI_MISALIGNMENT_NOT_SUPPORTED
u32 remainder;
#endif
@@ -143,9 +146,20 @@ acpi_ex_system_memory_space_handler(u32

/* Create a new mapping starting at the address given */

- mem_info->mapped_logical_address =
- acpi_os_map_memory(address, map_length);
- if (!mem_info->mapped_logical_address) {
+#ifdef ACPI_OS_MAP_MEMORY_FASTPATH
+ /* Look for an existing mapping matching the request at hand. */
+ logical_addr_ptr = acpi_os_map_memory_fastpath(address, length);
+ if (logical_addr_ptr) {
+ /*
+ * A matching mapping has been found, so cache it and
+ * carry out the access as requested.
+ */
+ goto access;
+ }
+#endif /* ACPI_OS_MAP_MEMORY_FASTPATH */
+
+ logical_addr_ptr = acpi_os_map_memory(address, map_length);
+ if (!logical_addr_ptr) {
ACPI_ERROR((AE_INFO,
"Could not map memory at 0x%8.8X%8.8X, size %u",
ACPI_FORMAT_UINT64(address),
@@ -154,8 +168,37 @@ acpi_ex_system_memory_space_handler(u32
return_ACPI_STATUS(AE_NO_MEMORY);
}

+#ifdef ACPI_OS_MAP_MEMORY_FASTPATH
+ new_mapping = ACPI_ALLOCATE_ZEROED(sizeof(*new_mapping));
+ if (new_mapping) {
+ new_mapping->logical_address = logical_addr_ptr;
+ new_mapping->length = map_length;
+ new_mapping->next = mem_info->first_mapping;
+ mem_info->first_mapping = new_mapping;
+ /*
+ * Carry out an extra fast-path lookup to get one more
+ * reference to this mapping to prevent it from getting
+ * dropped if a future access involving this region does
+ * not fall into it.
+ */
+ acpi_os_map_memory_fastpath(address, map_length);
+ } else {
+ /*
+ * No room to save the new mapping, but this is not
+ * critical. Just log the error and carry out the
+ * access as requested.
+ */
+ ACPI_ERROR((AE_INFO,
+ "Not enough memory to save memory mapping at 0x%8.8X%8.8X, size %u",
+ ACPI_FORMAT_UINT64(address),
+ (u32)map_length));
+ }
+
+access:
+#endif /* ACPI_OS_MAP_MEMORY_FASTPATH */
/* Save the physical address and mapping size */

+ mem_info->mapped_logical_address = logical_addr_ptr;
mem_info->mapped_physical_address = address;
mem_info->mapped_length = map_length;
}
Index: linux-pm/drivers/acpi/acpica/evrgnini.c
===================================================================
--- linux-pm.orig/drivers/acpi/acpica/evrgnini.c
+++ linux-pm/drivers/acpi/acpica/evrgnini.c
@@ -38,6 +38,9 @@ acpi_ev_system_memory_region_setup(acpi_
union acpi_operand_object *region_desc =
(union acpi_operand_object *)handle;
struct acpi_mem_space_context *local_region_context;
+#ifdef ACPI_OS_MAP_MEMORY_FASTPATH
+ struct acpi_mem_mapping *mapping;
+#endif

ACPI_FUNCTION_TRACE(ev_system_memory_region_setup);

@@ -46,13 +49,22 @@ acpi_ev_system_memory_region_setup(acpi_
local_region_context =
(struct acpi_mem_space_context *)*region_context;

- /* Delete a cached mapping if present */
+ /* Delete memory mappings if present */

if (local_region_context->mapped_length) {
acpi_os_unmap_memory(local_region_context->
mapped_logical_address,
local_region_context->
mapped_length);
+#ifdef ACPI_OS_MAP_MEMORY_FASTPATH
+ while (local_region_context->first_mapping) {
+ mapping = local_region_context->first_mapping;
+ local_region_context->first_mapping = mapping->next;
+ acpi_os_unmap_memory(mapping->logical_address,
+ mapping->length);
+ ACPI_FREE(mapping);
+ }
+#endif /* ACPI_OS_MAP_MEMORY_FASTPATH */
}
ACPI_FREE(local_region_context);
*region_context = NULL;
Index: linux-pm/include/acpi/platform/aclinuxex.h
===================================================================
--- linux-pm.orig/include/acpi/platform/aclinuxex.h
+++ linux-pm/include/acpi/platform/aclinuxex.h
@@ -138,6 +138,8 @@ static inline void acpi_os_terminate_deb
/*
* OSL interfaces added by Linux
*/
+#define ACPI_OS_MAP_MEMORY_FASTPATH
+void *acpi_os_map_memory_fastpath(acpi_physical_address where, acpi_size length);

#endif /* __KERNEL__ */




2020-06-08 18:30:22

by Rafael J. Wysocki

Subject: Re: [RFT][PATCH] ACPI: OSL: Use rwlock instead of RCU for memory management

On Mon, Jun 8, 2020 at 5:33 PM Rafael J. Wysocki <[email protected]> wrote:
>
> On Saturday, June 6, 2020 8:56:26 AM CEST Rafael J. Wysocki wrote:
> > On Fri, Jun 5, 2020 at 7:09 PM Dan Williams <[email protected]> wrote:
> > >
> > > On Fri, Jun 5, 2020 at 7:06 AM Rafael J. Wysocki <[email protected]> wrote:
> > > >
> > > > From: Rafael J. Wysocki <[email protected]>
> > > > Subject: [PATCH] ACPI: OSL: Use rwlock instead of RCU for memory management
> > > >
> > > > The ACPI OS layer uses RCU to protect the list of ACPI memory
> > > > mappings from being walked while it is updated. Among other
> > > > situations, that list can be walked in non-NMI interrupt context,
> > > > so using a sleeping lock to protect it is not an option.
> > > >
> > > > However, performance issues related to the RCU usage in there
> > > > appear, as described by Dan Williams:
> > > >
> > > > "Recently a performance problem was reported for a process invoking
> > > > a non-trivial ASL program. The method call in this case ends up
> > > > repetitively triggering a call path like:
> > > >
> > > > acpi_ex_store
> > > > acpi_ex_store_object_to_node
> > > > acpi_ex_write_data_to_field
> > > > acpi_ex_insert_into_field
> > > > acpi_ex_write_with_update_rule
> > > > acpi_ex_field_datum_io
> > > > acpi_ex_access_region
> > > > acpi_ev_address_space_dispatch
> > > > acpi_ex_system_memory_space_handler
> > > > acpi_os_map_cleanup.part.14
> > > > _synchronize_rcu_expedited.constprop.89
> > > > schedule
> > > >
> > > > The end result of frequent synchronize_rcu_expedited() invocation is
> > > > tiny sub-millisecond spurts of execution where the scheduler freely
> > > > migrates this apparently sleepy task. The overhead of frequent
> > > > scheduler invocation multiplies the execution time by a factor
> > > > of 2-3X."
> > > >
> > > > In order to avoid these issues, replace the RCU in the ACPI OS
> > > > layer by an rwlock.
> > > >
> > > > That rwlock should not be frequently contended, so the performance
> > > > impact of it is not expected to be significant.
> > > >
> > > > Reported-by: Dan Williams <[email protected]>
> > > > Signed-off-by: Rafael J. Wysocki <[email protected]>
> > > > ---
> > > >
> > > > Hi Dan,
> > > >
> > > > This is a possible fix for the ACPI OSL RCU-related performance issues, but
> > > > can you please arrange for the testing of it on the affected systems?
> > >
> > > Ugh, is it really this simple? I did not realize the read-side is NMI
> > > safe. I'll take a look.
> >
> > But if an NMI triggers while the lock is being held for writing, it
> > will deadlock, won't it?
> >
> > OTOH, according to the RCU documentation it is valid to call
> > rcu_read_[un]lock() from an NMI handler (see Interrupts and NMIs in
> > Documentation/RCU/Design/Requirements/Requirements.rst) so we are good
> > from this perspective today.
> >
> > Unless we teach APEI to avoid mapping lookups from
> > apei_{read|write}(), which wouldn't be unreasonable by itself, we need
> > to hold on to the RCU in ACPI OSL, so it looks like addressing the
> > problem in ACPICA is the best way to do it (and the current ACPICA
> > code in question is suboptimal, so it would be good to rework it
> > anyway).
> >
> > Cheers!
>
> I've sent the prototype patch below to you, Bob and Erik in private, so
> here it goes to the lists for completeness.
>
> It introduces a "fast-path" variant of acpi_os_map_memory() that only
> returns non-NULL if a matching mapping is already there in the list
> and reworks acpi_ex_system_memory_space_handler() to use it.
>
> The idea is to do a fast-path lookup first for every new mapping and
> only run the full acpi_os_map_memory() if that returns NULL and then
> save the mapping returned by it and do a fast-path lookup for it again
> to bump up its reference counter in the OSL layer. That should prevent
> the mappings from going away until the opregions that they belong to
> go away (the opregion deactivation code is updated too to remove the
> saved mappings), so in the cases when there's not too much opregion
> creation and removal activity, it should make the RCU-related overhead
> go away.
>
> Please test.
>
> Cheers!
>
> ---
> drivers/acpi/acpica/evrgnini.c | 14 ++++++++-
> drivers/acpi/acpica/exregion.c | 49 +++++++++++++++++++++++++++++--
> drivers/acpi/osl.c | 59 ++++++++++++++++++++++++++++----------
> include/acpi/actypes.h | 7 ++++
> include/acpi/platform/aclinuxex.h | 2 +
> 5 files changed, 112 insertions(+), 19 deletions(-)
>
> Index: linux-pm/drivers/acpi/osl.c
> ===================================================================
> --- linux-pm.orig/drivers/acpi/osl.c
> +++ linux-pm/drivers/acpi/osl.c
> @@ -302,21 +302,8 @@ static void acpi_unmap(acpi_physical_add
> iounmap(vaddr);
> }
>
> -/**
> - * acpi_os_map_iomem - Get a virtual address for a given physical address range.
> - * @phys: Start of the physical address range to map.
> - * @size: Size of the physical address range to map.
> - *
> - * Look up the given physical address range in the list of existing ACPI memory
> - * mappings. If found, get a reference to it and return a pointer to it (its
> - * virtual address). If not found, map it, add it to that list and return a
> - * pointer to it.
> - *
> - * During early init (when acpi_permanent_mmap has not been set yet) this
> - * routine simply calls __acpi_map_table() to get the job done.
> - */
> -void __iomem __ref
> -*acpi_os_map_iomem(acpi_physical_address phys, acpi_size size)
> +static void __iomem __ref *__acpi_os_map_iomem(acpi_physical_address phys,
> + acpi_size size, bool fastpath)
> {
> struct acpi_ioremap *map;
> void __iomem *virt;
> @@ -339,6 +326,11 @@ void __iomem __ref
> goto out;
> }
>
> + if (fastpath) {
> + mutex_unlock(&acpi_ioremap_lock);
> + return NULL;
> + }
> +
> map = kzalloc(sizeof(*map), GFP_KERNEL);
> if (!map) {
> mutex_unlock(&acpi_ioremap_lock);
> @@ -366,6 +358,25 @@ out:
> mutex_unlock(&acpi_ioremap_lock);
> return map->virt + (phys - map->phys);
> }
> +
> +/**
> + * acpi_os_map_iomem - Get a virtual address for a given physical address range.
> + * @phys: Start of the physical address range to map.
> + * @size: Size of the physical address range to map.
> + *
> + * Look up the given physical address range in the list of existing ACPI memory
> + * mappings. If found, get a reference to it and return a pointer representing
> + * its virtual address. If not found, map it, add it to that list and return a
> + * pointer representing its virtual address.
> + *
> + * During early init (when acpi_permanent_mmap has not been set yet) call
> + * __acpi_map_table() to obtain the mapping.
> + */
> +void __iomem __ref *acpi_os_map_iomem(acpi_physical_address phys,
> + acpi_size size)
> +{
> + return __acpi_os_map_iomem(phys, size, false);
> +}
> EXPORT_SYMBOL_GPL(acpi_os_map_iomem);
>
> void *__ref acpi_os_map_memory(acpi_physical_address phys, acpi_size size)
> @@ -374,6 +385,24 @@ void *__ref acpi_os_map_memory(acpi_phys
> }
> EXPORT_SYMBOL_GPL(acpi_os_map_memory);
>
> +/**
> + * acpi_os_map_memory_fastpath - Fast-path physical-to-virtual address mapping.
> + * @phys: Start of the physical address range to map.
> + * @size: Size of the physical address range to map.
> + *
> + * Look up the given physical address range in the list of existing ACPI memory
> + * mappings. If found, get a reference to it and return a pointer representing
> + * its virtual address. If not found, return NULL.
> + *
> + * During early init (when acpi_permanent_mmap has not been set yet) call
> + * __acpi_map_table() to obtain the mapping.
> + */
> +void __ref *acpi_os_map_memory_fastpath(acpi_physical_address phys,
> + acpi_size size)
> +{
> + return __acpi_os_map_iomem(phys, size, true);
> +}
> +
> /* Must be called with mutex_lock(&acpi_ioremap_lock) */
> static unsigned long acpi_os_drop_map_ref(struct acpi_ioremap *map)
> {
> Index: linux-pm/include/acpi/actypes.h
> ===================================================================
> --- linux-pm.orig/include/acpi/actypes.h
> +++ linux-pm/include/acpi/actypes.h
> @@ -1200,12 +1200,19 @@ struct acpi_pci_id {
> u16 function;
> };
>
> +struct acpi_mem_mapping {
> + u8 *logical_address;
> + acpi_size length;
> + struct acpi_mem_mapping *next;
> +};
> +
> struct acpi_mem_space_context {
> u32 length;
> acpi_physical_address address;
> acpi_physical_address mapped_physical_address;
> u8 *mapped_logical_address;
> acpi_size mapped_length;
> + struct acpi_mem_mapping *first_mapping;
> };
>
> /*
> Index: linux-pm/drivers/acpi/acpica/exregion.c
> ===================================================================
> --- linux-pm.orig/drivers/acpi/acpica/exregion.c
> +++ linux-pm/drivers/acpi/acpica/exregion.c
> @@ -44,6 +44,9 @@ acpi_ex_system_memory_space_handler(u32
> u32 length;
> acpi_size map_length;
> acpi_size page_boundary_map_length;
> +#ifdef ACPI_OS_MAP_MEMORY_FASTPATH
> + struct acpi_mem_mapping *new_mapping;
> +#endif
> #ifdef ACPI_MISALIGNMENT_NOT_SUPPORTED
> u32 remainder;
> #endif
> @@ -143,9 +146,20 @@ acpi_ex_system_memory_space_handler(u32
>
> /* Create a new mapping starting at the address given */
>
> - mem_info->mapped_logical_address =
> - acpi_os_map_memory(address, map_length);
> - if (!mem_info->mapped_logical_address) {
> +#ifdef ACPI_OS_MAP_MEMORY_FASTPATH
> + /* Look for an existing mapping matching the request at hand. */
> + logical_addr_ptr = acpi_os_map_memory_fastpath(address, length);

s/length/map_length/

but the patch should still work as is, with more overhead though.

> + if (logical_addr_ptr) {
> + /*
> + * A matching mapping has been found, so cache it and
> + * carry out the access as requested.
> + */
> + goto access;
> + }
> +#endif /* ACPI_OS_MAP_MEMORY_FASTPATH */
> +
> + logical_addr_ptr = acpi_os_map_memory(address, map_length);
> + if (!logical_addr_ptr) {
> ACPI_ERROR((AE_INFO,
> "Could not map memory at 0x%8.8X%8.8X, size %u",
> ACPI_FORMAT_UINT64(address),
> @@ -154,8 +168,37 @@ acpi_ex_system_memory_space_handler(u32
> return_ACPI_STATUS(AE_NO_MEMORY);
> }
>
> +#ifdef ACPI_OS_MAP_MEMORY_FASTPATH
> + new_mapping = ACPI_ALLOCATE_ZEROED(sizeof(*new_mapping));
> + if (new_mapping) {
> + new_mapping->logical_address = logical_addr_ptr;
> + new_mapping->length = map_length;
> + new_mapping->next = mem_info->first_mapping;
> + mem_info->first_mapping = new_mapping;
> + /*
> + * Carry out an extra fast-path lookup to get one more
> + * reference to this mapping to prevent it from getting
> + * dropped if a future access involving this region does
> + * not fall into it.
> + */
> + acpi_os_map_memory_fastpath(address, map_length);
> + } else {
> + /*
> + * No room to save the new mapping, but this is not
> + * critical. Just log the error and carry out the
> + * access as requested.
> + */
> + ACPI_ERROR((AE_INFO,
> + "Not enough memory to save memory mapping at 0x%8.8X%8.8X, size %u",
> + ACPI_FORMAT_UINT64(address),
> + (u32)map_length));
> + }
> +
> +access:
> +#endif /* ACPI_OS_MAP_MEMORY_FASTPATH */
> /* Save the physical address and mapping size */
>
> + mem_info->mapped_logical_address = logical_addr_ptr;
> mem_info->mapped_physical_address = address;
> mem_info->mapped_length = map_length;
> }
> Index: linux-pm/drivers/acpi/acpica/evrgnini.c
> ===================================================================
> --- linux-pm.orig/drivers/acpi/acpica/evrgnini.c
> +++ linux-pm/drivers/acpi/acpica/evrgnini.c
> @@ -38,6 +38,9 @@ acpi_ev_system_memory_region_setup(acpi_
> union acpi_operand_object *region_desc =
> (union acpi_operand_object *)handle;
> struct acpi_mem_space_context *local_region_context;
> +#ifdef ACPI_OS_MAP_MEMORY_FASTPATH
> + struct acpi_mem_mapping *mapping;
> +#endif
>
> ACPI_FUNCTION_TRACE(ev_system_memory_region_setup);
>
> @@ -46,13 +49,22 @@ acpi_ev_system_memory_region_setup(acpi_
> local_region_context =
> (struct acpi_mem_space_context *)*region_context;
>
> - /* Delete a cached mapping if present */
> + /* Delete memory mappings if present */
>
> if (local_region_context->mapped_length) {
> acpi_os_unmap_memory(local_region_context->
> mapped_logical_address,
> local_region_context->
> mapped_length);
> +#ifdef ACPI_OS_MAP_MEMORY_FASTPATH
> + while (local_region_context->first_mapping) {
> + mapping = local_region_context->first_mapping;
> + local_region_context->first_mapping = mapping->next;
> + acpi_os_unmap_memory(mapping->logical_address,
> + mapping->length);
> + ACPI_FREE(mapping);
> + }
> +#endif /* ACPI_OS_MAP_MEMORY_FASTPATH */
> }
> ACPI_FREE(local_region_context);
> *region_context = NULL;
> Index: linux-pm/include/acpi/platform/aclinuxex.h
> ===================================================================
> --- linux-pm.orig/include/acpi/platform/aclinuxex.h
> +++ linux-pm/include/acpi/platform/aclinuxex.h
> @@ -138,6 +138,8 @@ static inline void acpi_os_terminate_deb
> /*
> * OSL interfaces added by Linux
> */
> +#define ACPI_OS_MAP_MEMORY_FASTPATH
> +void *acpi_os_map_memory_fastpath(acpi_physical_address where, acpi_size length);
>
> #endif /* __KERNEL__ */
>
>
>
>

2020-06-10 12:26:42

by Rafael J. Wysocki

[permalink] [raw]
Subject: [RFT][PATCH 0/3] ACPI: ACPICA / OSL: Avoid unmapping ACPI memory inside of the AML interpreter

Hi All,

This series is to address the problem with RCU synchronization occurring,
possibly relatively often, inside of acpi_ex_system_memory_space_handler(),
when the namespace and interpreter mutexes are held.

The basic idea is to avoid the actual unmapping of memory in
acpi_ex_system_memory_space_handler() by making it take advantage of the
reference counting of memory mappings utilized by the OSL layer in Linux.

The basic assumption in patch [1/3] is that if the special
ACPI_OS_MAP_MEMORY_FAST_PATH() macro is present, it can be used to increment
the reference counter of a known-existing memory mapping in the OS layer
which then is dropped by the subsequent acpi_os_unmap_memory() without
unmapping the address range at hand. That can be utilized by
acpi_ex_system_memory_space_handler() to prevent the reference counters of
all mappings used by it from dropping down to 0 (which also prevents the
address ranges associated with them from being unmapped) so that they can
be unmapped later (specifically, at the operation region deactivation time).

Patch [2/3] defers the unmapping even further, until the namespace and
interpreter mutexes are released, to avoid invoking the RCU synchronization
under these mutexes.

Finally, patch [3/3] changes the OS layer in Linux to provide the
ACPI_OS_MAP_MEMORY_FAST_PATH() macro.

Note that if this macro is not defined, the code works the way it used to.
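
To illustrate the intended calling convention, here is a minimal sketch
(illustrative only; it uses the names from patch [3/3], and the refcount
comments are assumptions about the OSL behavior rather than part of the
interface):

	void *virt = acpi_os_map_memory(phys, size);	/* refcount: 1 */

	/* Pin the known-existing mapping with an extra reference; the
	 * macro must return the same virtual address or NULL. */
	if (ACPI_OS_MAP_MEMORY_FAST_PATH(phys, size)) {
		/* refcount: 2 */
	}

	acpi_os_unmap_memory(virt, size);	/* refcount: 1, still mapped */
	/* ... later, at opregion deactivation time ... */
	acpi_os_unmap_memory(virt, size);	/* refcount: 0, real unmap */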

The series is available from the git branch at

git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git \
acpica-osl

for easier testing.

Cheers,
Rafael



2020-06-10 12:26:48

by Rafael J. Wysocki

[permalink] [raw]
Subject: [RFT][PATCH 1/3] ACPICA: Defer unmapping of memory used in memory opregions

From: "Rafael J. Wysocki" <[email protected]>

The ACPI OS layer in Linux uses RCU to protect the list of ACPI
memory mappings from being walked while it is being updated. Among
other situations, that list can be walked in (NMI and non-NMI)
interrupt context, so using a sleeping lock to protect it is not
an option.

However, performance issues related to the RCU usage in there
appear, as described by Dan Williams:

"Recently a performance problem was reported for a process invoking
a non-trivial ASL program. The method call in this case ends up
repetitively triggering a call path like:

acpi_ex_store
acpi_ex_store_object_to_node
acpi_ex_write_data_to_field
acpi_ex_insert_into_field
acpi_ex_write_with_update_rule
acpi_ex_field_datum_io
acpi_ex_access_region
acpi_ev_address_space_dispatch
acpi_ex_system_memory_space_handler
acpi_os_map_cleanup.part.14
_synchronize_rcu_expedited.constprop.89
schedule

The end result of frequent synchronize_rcu_expedited() invocation is
tiny sub-millisecond spurts of execution where the scheduler freely
migrates this apparently sleepy task. The overhead of frequent
scheduler invocation multiplies the execution time by a factor
of 2-3X."

The source of this is that acpi_ex_system_memory_space_handler()
unmaps the memory mapping currently cached by it at access time
if that mapping doesn't cover the memory area being accessed.
Consequently, if there is a memory opregion with two fields
separated from each other by an unused chunk of address space that
is large enough for not being covered by a single mapping, and they
happen to be used in an alternating pattern, the unmapping will
occur on every acpi_ex_system_memory_space_handler() invocation for
that memory opregion and that will lead to significant overhead.
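
For illustration, the logic described above reduces to roughly the
following (a simplified sketch of the existing handler behavior, not
the literal ACPICA source):

	/* Remap whenever the cached one-page "window" misses. */
	if (!mem_info->mapped_length ||
	    address < mem_info->mapped_physical_address ||
	    (u64)address + length > (u64)mem_info->mapped_physical_address +
				    mem_info->mapped_length) {
		if (mem_info->mapped_length)
			acpi_os_unmap_memory(mem_info->mapped_logical_address,
					     mem_info->mapped_length);
		/* ... map a new "window" covering 'address' ... */
	}

With two alternating fields that never fit into one "window", every
access takes the unmap-and-remap branch above.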

To remedy that, modify acpi_ex_system_memory_space_handler() so it
can defer the unmapping of the memory mapped by it until the opregion
using that memory is deactivated, provided that the OS layer defines
a special ACPI_OS_MAP_MEMORY_FAST_PATH() macro allowing its users to
get an extra reference to a known-existing memory mapping without
actually mapping it again.

Namely, make acpi_ex_system_memory_space_handler() manage an internal
list of memory mappings covering all memory accesses through it that
have occurred so far if ACPI_OS_MAP_MEMORY_FAST_PATH() is present, so
that every new mapping is added to that list with an extra reference
obtained via ACPI_OS_MAP_MEMORY_FAST_PATH() which prevents it from
being unmapped by acpi_ex_system_memory_space_handler() itself.
Of course, those mappings need to go away at one point, so change
acpi_ev_system_memory_region_setup() to unmap them when the memory
opregion holding them is deactivated.

This should reduce the acpi_ex_system_memory_space_handler() overhead
for memory opregions that are not created and then deleted immediately
after a single access. [Of course, ACPI_OS_MAP_MEMORY_FAST_PATH()
needs to be implemented by the OS for this change to take effect.]

Reported-by: Dan Williams <[email protected]>
Signed-off-by: Rafael J. Wysocki <[email protected]>
---
drivers/acpi/acpica/evrgnini.c | 14 +++++-
drivers/acpi/acpica/exregion.c | 90 ++++++++++++++++++++++++++++++++--
include/acpi/actypes.h | 8 +++
3 files changed, 107 insertions(+), 5 deletions(-)

diff --git a/drivers/acpi/acpica/evrgnini.c b/drivers/acpi/acpica/evrgnini.c
index aefc0145e583..48a5e6eaf9b9 100644
--- a/drivers/acpi/acpica/evrgnini.c
+++ b/drivers/acpi/acpica/evrgnini.c
@@ -38,6 +38,9 @@ acpi_ev_system_memory_region_setup(acpi_handle handle,
union acpi_operand_object *region_desc =
(union acpi_operand_object *)handle;
struct acpi_mem_space_context *local_region_context;
+#ifdef ACPI_OS_MAP_MEMORY_FAST_PATH
+ struct acpi_mem_mapping *mapping;
+#endif

ACPI_FUNCTION_TRACE(ev_system_memory_region_setup);

@@ -46,13 +49,22 @@ acpi_ev_system_memory_region_setup(acpi_handle handle,
local_region_context =
(struct acpi_mem_space_context *)*region_context;

- /* Delete a cached mapping if present */
+ /* Delete memory mappings if present */

if (local_region_context->mapped_length) {
acpi_os_unmap_memory(local_region_context->
mapped_logical_address,
local_region_context->
mapped_length);
+#ifdef ACPI_OS_MAP_MEMORY_FAST_PATH
+ while (local_region_context->first_mapping) {
+ mapping = local_region_context->first_mapping;
+ local_region_context->first_mapping = mapping->next;
+ acpi_os_unmap_memory(mapping->logical_address,
+ mapping->length);
+ ACPI_FREE(mapping);
+ }
+#endif
}
ACPI_FREE(local_region_context);
*region_context = NULL;
diff --git a/drivers/acpi/acpica/exregion.c b/drivers/acpi/acpica/exregion.c
index d15a66de26c0..703868db9551 100644
--- a/drivers/acpi/acpica/exregion.c
+++ b/drivers/acpi/acpica/exregion.c
@@ -44,6 +44,9 @@ acpi_ex_system_memory_space_handler(u32 function,
u32 length;
acpi_size map_length;
acpi_size page_boundary_map_length;
+#ifdef ACPI_OS_MAP_MEMORY_FAST_PATH
+ struct acpi_mem_mapping *mapping;
+#endif
#ifdef ACPI_MISALIGNMENT_NOT_SUPPORTED
u32 remainder;
#endif
@@ -102,7 +105,7 @@ acpi_ex_system_memory_space_handler(u32 function,
mem_info->mapped_length))) {
/*
* The request cannot be resolved by the current memory mapping;
- * Delete the existing mapping and create a new one.
+ * Delete the current cached mapping and get a new one.
*/
if (mem_info->mapped_length) {

@@ -112,6 +115,40 @@ acpi_ex_system_memory_space_handler(u32 function,
mem_info->mapped_length);
}

+#ifdef ACPI_OS_MAP_MEMORY_FAST_PATH
+ /*
+ * Look for an existing saved mapping matching the request at
+ * hand. If found, bump up its reference counter in the OS
+ * layer, cache it and carry out the access.
+ */
+ for (mapping = mem_info->first_mapping; mapping;
+ mapping = mapping->next) {
+ if (address < mapping->physical_address)
+ continue;
+
+ if ((u64)address + length >
+ (u64)mapping->physical_address +
+ mapping->length)
+ continue;
+
+ /*
+ * When called on a known-existing memory mapping,
+ * ACPI_OS_MAP_MEMORY_FAST_PATH() must return the same
+ * logical address as before or NULL.
+ */
+ if (!ACPI_OS_MAP_MEMORY_FAST_PATH(mapping->physical_address,
+ mapping->length))
+ continue;
+
+ mem_info->mapped_logical_address =
+ mapping->logical_address;
+ mem_info->mapped_physical_address =
+ mapping->physical_address;
+ mem_info->mapped_length = mapping->length;
+ goto access;
+ }
+#endif /* ACPI_OS_MAP_MEMORY_FAST_PATH */
+
/*
* October 2009: Attempt to map from the requested address to the
* end of the region. However, we will never map more than one
@@ -143,9 +180,8 @@ acpi_ex_system_memory_space_handler(u32 function,

/* Create a new mapping starting at the address given */

- mem_info->mapped_logical_address =
- acpi_os_map_memory(address, map_length);
- if (!mem_info->mapped_logical_address) {
+ logical_addr_ptr = acpi_os_map_memory(address, map_length);
+ if (!logical_addr_ptr) {
ACPI_ERROR((AE_INFO,
"Could not map memory at 0x%8.8X%8.8X, size %u",
ACPI_FORMAT_UINT64(address),
@@ -156,10 +192,56 @@ acpi_ex_system_memory_space_handler(u32 function,

/* Save the physical address and mapping size */

+ mem_info->mapped_logical_address = logical_addr_ptr;
mem_info->mapped_physical_address = address;
mem_info->mapped_length = map_length;
+
+#ifdef ACPI_OS_MAP_MEMORY_FAST_PATH
+ /*
+ * Get memory to save the new mapping for removal at the
+ * operation region deactivation time.
+ */
+ mapping = ACPI_ALLOCATE_ZEROED(sizeof(*mapping));
+ if (!mapping) {
+ /*
+ * No room to save the new mapping, but this is not
+ * critical. Just log the error and carry out the
+ * access as requested.
+ */
+ ACPI_ERROR((AE_INFO,
+ "Not enough memory to save memory mapping at 0x%8.8X%8.8X, size %u",
+ ACPI_FORMAT_UINT64(address),
+ (u32)map_length));
+ goto access;
+ }
+ /*
+ * Bump up the mapping's reference counter in the OS layer to
+ * prevent it from getting dropped prematurely.
+ */
+ if (!ACPI_OS_MAP_MEMORY_FAST_PATH(address, map_length)) {
+ /*
+ * Something has gone wrong, but this is not critical.
+ * Log the error, free the memory that won't be used and
+ * carry out the access as requested.
+ */
+ ACPI_ERROR((AE_INFO,
+ "Unable to save memory mapping at 0x%8.8X%8.8X, size %u",
+ ACPI_FORMAT_UINT64(address),
+ (u32)map_length));
+ ACPI_FREE(mapping);
+ goto access;
+ }
+ mapping->physical_address = address;
+ mapping->logical_address = logical_addr_ptr;
+ mapping->length = map_length;
+ mapping->next = mem_info->first_mapping;
+ mem_info->first_mapping = mapping;
}

+access:
+#else /* !ACPI_OS_MAP_MEMORY_FAST_PATH */
+ }
+#endif /* !ACPI_OS_MAP_MEMORY_FAST_PATH */
/*
* Generate a logical pointer corresponding to the address we want to
* access
diff --git a/include/acpi/actypes.h b/include/acpi/actypes.h
index 4defed58ea33..64ab323b81b4 100644
--- a/include/acpi/actypes.h
+++ b/include/acpi/actypes.h
@@ -1200,12 +1200,20 @@ struct acpi_pci_id {
u16 function;
};

+struct acpi_mem_mapping {
+ acpi_physical_address physical_address;
+ u8 *logical_address;
+ acpi_size length;
+ struct acpi_mem_mapping *next;
+};
+
struct acpi_mem_space_context {
u32 length;
acpi_physical_address address;
acpi_physical_address mapped_physical_address;
u8 *mapped_logical_address;
acpi_size mapped_length;
+ struct acpi_mem_mapping *first_mapping;
};

/*
--
2.26.2




2020-06-10 17:58:38

by Rafael J. Wysocki

[permalink] [raw]
Subject: [RFT][PATCH 3/3] ACPI: OSL: Define ACPI_OS_MAP_MEMORY_FAST_PATH()

From: "Rafael J. Wysocki" <[email protected]>

Define the ACPI_OS_MAP_MEMORY_FAST_PATH() macro to allow
acpi_ex_system_memory_space_handler() to avoid memory unmapping
overhead by deferring the unmap operations to the point when the
AML interpreter is exited after removing the operation region
that held the memory mappings which are not used any more.

That macro, when called on a known-existing memory mapping,
causes the reference counter of that mapping in the OS layer to be
incremented and returns a pointer representing the virtual address
of the start of the mapped memory area without really mapping it,
so the first subsequent unmap operation on it will only decrement
the reference counter.

Signed-off-by: Rafael J. Wysocki <[email protected]>
---
drivers/acpi/osl.c | 67 +++++++++++++++++++++++--------
include/acpi/platform/aclinuxex.h | 4 ++
2 files changed, 55 insertions(+), 16 deletions(-)

diff --git a/drivers/acpi/osl.c b/drivers/acpi/osl.c
index 762c5d50b8fe..b75f3a17776f 100644
--- a/drivers/acpi/osl.c
+++ b/drivers/acpi/osl.c
@@ -302,21 +302,8 @@ static void acpi_unmap(acpi_physical_address pg_off, void __iomem *vaddr)
iounmap(vaddr);
}

-/**
- * acpi_os_map_iomem - Get a virtual address for a given physical address range.
- * @phys: Start of the physical address range to map.
- * @size: Size of the physical address range to map.
- *
- * Look up the given physical address range in the list of existing ACPI memory
- * mappings. If found, get a reference to it and return a pointer to it (its
- * virtual address). If not found, map it, add it to that list and return a
- * pointer to it.
- *
- * During early init (when acpi_permanent_mmap has not been set yet) this
- * routine simply calls __acpi_map_table() to get the job done.
- */
-void __iomem __ref
-*acpi_os_map_iomem(acpi_physical_address phys, acpi_size size)
+static void __iomem __ref *__acpi_os_map_iomem(acpi_physical_address phys,
+ acpi_size size, bool fast_path)
{
struct acpi_ioremap *map;
void __iomem *virt;
@@ -328,8 +315,12 @@ void __iomem __ref
return NULL;
}

- if (!acpi_permanent_mmap)
+ if (!acpi_permanent_mmap) {
+ if (WARN_ON(fast_path))
+ return NULL;
+
return __acpi_map_table((unsigned long)phys, size);
+ }

mutex_lock(&acpi_ioremap_lock);
/* Check if there's a suitable mapping already. */
@@ -339,6 +330,11 @@ void __iomem __ref
goto out;
}

+ if (fast_path) {
+ mutex_unlock(&acpi_ioremap_lock);
+ return NULL;
+ }
+
map = kzalloc(sizeof(*map), GFP_KERNEL);
if (!map) {
mutex_unlock(&acpi_ioremap_lock);
@@ -366,6 +362,25 @@ void __iomem __ref
mutex_unlock(&acpi_ioremap_lock);
return map->virt + (phys - map->phys);
}
+
+/**
+ * acpi_os_map_iomem - Get a virtual address for a given physical address range.
+ * @phys: Start of the physical address range to map.
+ * @size: Size of the physical address range to map.
+ *
+ * Look up the given physical address range in the list of existing ACPI memory
+ * mappings. If found, get a reference to it and return a pointer representing
+ * its virtual address. If not found, map it, add it to that list and return a
+ * pointer representing its virtual address.
+ *
+ * During early init (when acpi_permanent_mmap has not been set yet) call
+ * __acpi_map_table() to obtain the mapping.
+ */
+void __iomem __ref *acpi_os_map_iomem(acpi_physical_address phys,
+ acpi_size size)
+{
+ return __acpi_os_map_iomem(phys, size, false);
+}
EXPORT_SYMBOL_GPL(acpi_os_map_iomem);

void *__ref acpi_os_map_memory(acpi_physical_address phys, acpi_size size)
@@ -374,6 +389,24 @@ void *__ref acpi_os_map_memory(acpi_physical_address phys, acpi_size size)
}
EXPORT_SYMBOL_GPL(acpi_os_map_memory);

+/**
+ * acpi_os_map_memory_fast_path - Fast-path physical-to-virtual address mapping.
+ * @phys: Start of the physical address range to map.
+ * @size: Size of the physical address range to map.
+ *
+ * Look up the given physical address range in the list of existing ACPI memory
+ * mappings. If found, get a reference to it and return a pointer representing
+ * its virtual address. If not found, return NULL.
+ *
+ * During early init (when acpi_permanent_mmap has not been set yet) log a
+ * warning and return NULL.
+ */
+void __ref *acpi_os_map_memory_fast_path(acpi_physical_address phys,
+ acpi_size size)
+{
+ return __acpi_os_map_iomem(phys, size, true);
+}
+
/* Must be called with mutex_lock(&acpi_ioremap_lock) */
static unsigned long acpi_os_drop_map_ref(struct acpi_ioremap *map)
{
@@ -1571,6 +1604,10 @@ acpi_status acpi_release_memory(acpi_handle handle, struct resource *res,

-return acpi_walk_namespace(ACPI_TYPE_REGION, handle, level,
-acpi_deactivate_mem_region, NULL, res, NULL);
+acpi_status status = acpi_walk_namespace(ACPI_TYPE_REGION, handle, level,
+acpi_deactivate_mem_region, NULL, res, NULL);
+
+acpi_release_unused_memory_mappings();
+
+return status;
}
EXPORT_SYMBOL_GPL(acpi_release_memory);

diff --git a/include/acpi/platform/aclinuxex.h b/include/acpi/platform/aclinuxex.h
index 04f88f2de781..1d8be4ac9ef9 100644
--- a/include/acpi/platform/aclinuxex.h
+++ b/include/acpi/platform/aclinuxex.h
@@ -139,6 +139,10 @@ static inline void acpi_os_terminate_debugger(void)
* OSL interfaces added by Linux
*/

+void *acpi_os_map_memory_fast_path(acpi_physical_address where, acpi_size length);
+
+#define ACPI_OS_MAP_MEMORY_FAST_PATH(a, s) acpi_os_map_memory_fast_path(a, s)
+
#endif /* __KERNEL__ */

#endif /* __ACLINUXEX_H__ */
--
2.26.2




2020-06-13 19:21:42

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [RFT][PATCH 0/3] ACPI: ACPICA / OSL: Avoid unmapping ACPI memory inside of the AML interpreter

On Wednesday, June 10, 2020 2:17:04 PM CEST Rafael J. Wysocki wrote:
> Hi All,
>
> This series is to address the problem with RCU synchronization occurring,
> possibly relatively often, inside of acpi_ex_system_memory_space_handler(),
> when the namespace and interpreter mutexes are held.
>
> The basic idea is to avoid the actual unmapping of memory in
> acpi_ex_system_memory_space_handler() by making it take advantage of the
> reference counting of memory mappings utilized by the OSL layer in Linux.
>
> The basic assumption in patch [1/3] is that if the special
> ACPI_OS_MAP_MEMORY_FAST_PATH() macro is present, it can be used to increment
> the reference counter of a known-existing memory mapping in the OS layer
> which then is dropped by the subsequent acpi_os_unmap_memory() without
> unmapping the address range at hand. That can be utilized by
> acpi_ex_system_memory_space_handler() to prevent the reference counters of
> all mappings used by it from dropping down to 0 (which also prevents the
> address ranges associated with them from being unmapped) so that they can
> be unmapped later (specifically, at the operation region deactivation time).
>
> Patch [2/3] defers the unmapping even further, until the namespace and
> interpreter mutexes are released, to avoid invoking the RCU synchronization
> under these mutexes.
>
> Finally, patch [3/3] changes the OS layer in Linux to provide the
> ACPI_OS_MAP_MEMORY_FAST_PATH() macro.
>
> Note that if this macro is not defined, the code works the way it used to.
>
> The series is available from the git branch at
>
> git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git \
> acpica-osl
>
> for easier testing.

Please disregard this patch series; it will be replaced by a new one,
which is already there in the acpica-osl branch above.

Thanks!



2020-06-22 14:06:28

by Rafael J. Wysocki

[permalink] [raw]
Subject: [RFT][PATCH v2 0/4] ACPI: ACPICA / OSL: Avoid unmapping ACPI memory inside of the AML interpreter

Hi All,

This series is to address the problem with RCU synchronization occurring,
possibly relatively often, inside of acpi_ex_system_memory_space_handler(),
when the namespace and interpreter mutexes are held.

Like I said before, I had decided to change the approach used in the previous
iteration of this series and to allow the unmap operations carried out by
acpi_ex_system_memory_space_handler() to be deferred in the first place,
which is done in patches [1-2/4].

However, it turns out that the "fast-path" mapping is still useful on top of
the above to reduce the number of ioremap-iounmap cycles for the same address
range and so it is introduced by patches [3-4/4].

For details, please refer to the patch changelogs.

The series is available from the git branch at

git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git \
acpica-osl

for easier testing.

Cheers,
Rafael






2020-06-22 14:08:17

by Rafael J. Wysocki

[permalink] [raw]
Subject: [RFT][PATCH v2 3/4] ACPICA: Preserve memory opregion mappings if supported by OS

From: "Rafael J. Wysocki" <[email protected]>

ACPICA's strategy with respect to the handling of memory mappings
associated with memory operation regions is to avoid mapping the
entire region at once which may be problematic at least in principle
(for example, it may lead to conflicts with overlapping mappings
having different attributes created by drivers). It may also be
wasteful, because memory opregions on some systems take up vast
chunks of address space while the fields in those regions actually
accessed by AML are sparsely distributed.

For this reason, a one-page "window" is mapped for a given opregion
on the first memory access through it and if that "window" does not
cover an address range accessed through that opregion subsequently,
it is unmapped and a new "window" is mapped to replace it. Next,
if the new "window" is not sufficient to access memory through the
opregion in question in the future, it will be replaced with yet
another "window" and so on. That may lead to a suboptimal sequence
of memory mapping and unmapping operations, for example if two fields
in one opregion separated from each other by a sufficiently wide
chunk of unused address space are accessed in an alternating pattern.

The situation may still be suboptimal if the deferred unmapping
introduced previously is supported by the OS layer. For instance,
the alternating memory access pattern mentioned above may produce
a relatively long list of mappings to release with substantial
duplication among the entries in it, which could be avoided if
acpi_ex_system_memory_space_handler() did not release the mapping
used by it previously as soon as the current access was not covered
by it.

In order to improve that, modify acpi_ex_system_memory_space_handler()
to take advantage of the memory mappings reference counting at the OS
level if a suitable interface is provided.

Namely, if ACPI_USE_FAST_PATH_MAPPING is set, the OS is expected to
implement acpi_os_map_memory_fast_path() that will return NULL if
there is no mapping covering the given address range known to it.
If such a mapping is there, however, its reference counter will be
incremented and a pointer representing the requested virtual address
will be returned right away without any additional consequences.

That allows acpi_ex_system_memory_space_handler() to acquire
additional references to all new memory mappings with the help
of acpi_os_map_memory_fast_path() so as to retain them until the
memory opregions associated with them go away. The function will
still use a new "window" mapping if the current one does not
cover the address range at hand, but it will avoid unmapping the
current one right away by adding it to a list of "known" mappings
associated with the given memory opregion which will be deleted at
the opregion deactivation time. The mappings in that list can be
used every time a "new window" is needed so as to avoid overhead
related to the mapping and unmapping of memory.

Signed-off-by: Rafael J. Wysocki <[email protected]>
---
drivers/acpi/acpica/acinterp.h | 4 +
drivers/acpi/acpica/evrgnini.c | 7 +-
drivers/acpi/acpica/exregion.c | 159 ++++++++++++++++++++++++++++++++-
3 files changed, 162 insertions(+), 8 deletions(-)

diff --git a/drivers/acpi/acpica/acinterp.h b/drivers/acpi/acpica/acinterp.h
index 1f1026fb06e9..db9c279baa2e 100644
--- a/drivers/acpi/acpica/acinterp.h
+++ b/drivers/acpi/acpica/acinterp.h
@@ -479,8 +479,12 @@ void acpi_ex_pci_cls_to_string(char *dest, u8 class_code[3]);

u8 acpi_is_valid_space_id(u8 space_id);

+struct acpi_mem_space_context *acpi_ex_alloc_mem_space_context(void);
+
void acpi_ex_unmap_region_memory(struct acpi_mem_space_context *mem_info);

+void acpi_ex_unmap_all_region_mappings(struct acpi_mem_space_context *mem_info);
+
/*
* exregion - default op_region handlers
*/
diff --git a/drivers/acpi/acpica/evrgnini.c b/drivers/acpi/acpica/evrgnini.c
index 9f33114a74ca..f6c5feea10bc 100644
--- a/drivers/acpi/acpica/evrgnini.c
+++ b/drivers/acpi/acpica/evrgnini.c
@@ -46,10 +46,10 @@ acpi_ev_system_memory_region_setup(acpi_handle handle,
local_region_context =
(struct acpi_mem_space_context *)*region_context;

- /* Delete a cached mapping if present */
+ /* Delete memory mappings if present */

if (local_region_context->mapped_length) {
- acpi_ex_unmap_region_memory(local_region_context);
+ acpi_ex_unmap_all_region_mappings(local_region_context);
}
ACPI_FREE(local_region_context);
*region_context = NULL;
@@ -59,8 +59,7 @@ acpi_ev_system_memory_region_setup(acpi_handle handle,

/* Create a new context */

- local_region_context =
- ACPI_ALLOCATE_ZEROED(sizeof(struct acpi_mem_space_context));
+ local_region_context = acpi_ex_alloc_mem_space_context();
if (!(local_region_context)) {
return_ACPI_STATUS(AE_NO_MEMORY);
}
diff --git a/drivers/acpi/acpica/exregion.c b/drivers/acpi/acpica/exregion.c
index af777b7fccb0..9d97b6a67074 100644
--- a/drivers/acpi/acpica/exregion.c
+++ b/drivers/acpi/acpica/exregion.c
@@ -14,6 +14,40 @@
#define _COMPONENT ACPI_EXECUTER
ACPI_MODULE_NAME("exregion")

+struct acpi_mem_mapping {
+ acpi_physical_address physical_address;
+ u8 *logical_address;
+ acpi_size length;
+ struct acpi_mem_mapping *next_mm;
+};
+
+struct acpi_mm_context {
+ struct acpi_mem_space_context mem_info;
+ struct acpi_mem_mapping *first_mm;
+};
+
+/*****************************************************************************
+ *
+ * FUNCTION: acpi_ex_alloc_mem_space_context
+ *
+ * PARAMETERS: None
+ *
+ * RETURN: Pointer to a new region context object.
+ *
+ * DESCRIPTION: Allocate memory for memory operation region representation.
+ *
+ ****************************************************************************/
+struct acpi_mem_space_context *acpi_ex_alloc_mem_space_context(void)
+{
+ ACPI_FUNCTION_TRACE(acpi_ex_alloc_mem_space_context);
+
+#ifdef ACPI_USE_FAST_PATH_MAPPING
+ return ACPI_ALLOCATE_ZEROED(sizeof(struct acpi_mm_context));
+#else
+ return ACPI_ALLOCATE_ZEROED(sizeof(struct acpi_mem_space_context));
+#endif
+}
+
/*****************************************************************************
*
* FUNCTION: acpi_ex_unmap_region_memory
@@ -40,6 +74,44 @@ void acpi_ex_unmap_region_memory(struct acpi_mem_space_context *mem_info)
return_VOID;
}

+/*****************************************************************************
+ *
+ * FUNCTION: acpi_ex_unmap_all_region_mappings
+ *
+ * PARAMETERS: mem_info - Region specific context
+ *
+ * RETURN: None
+ *
+ * DESCRIPTION: Unmap all mappings associated with a memory operation region.
+ *
+ ****************************************************************************/
+void acpi_ex_unmap_all_region_mappings(struct acpi_mem_space_context *mem_info)
+{
+#ifdef ACPI_USE_FAST_PATH_MAPPING
+ struct acpi_mm_context *mm_context = (struct acpi_mm_context *)mem_info;
+ struct acpi_mem_mapping *mm;
+#endif
+
+ ACPI_FUNCTION_TRACE(acpi_ex_unmap_all_region_mappings);
+
+ acpi_ex_unmap_region_memory(mem_info);
+
+#ifdef ACPI_USE_FAST_PATH_MAPPING
+ while (mm_context->first_mm) {
+ mm = mm_context->first_mm;
+ mm_context->first_mm = mm->next_mm;
+#ifdef ACPI_USE_DEFERRED_UNMAPPING
+ acpi_os_unmap_deferred(mm->logical_address, mm->length);
+#else
+ acpi_os_unmap_memory(mm->logical_address, mm->length);
+#endif
+ ACPI_FREE(mm);
+ }
+#endif /* ACPI_USE_FAST_PATH_MAPPING */
+
+ return_VOID;
+}
+
/*******************************************************************************
*
* FUNCTION: acpi_ex_system_memory_space_handler
@@ -70,6 +142,10 @@ acpi_ex_system_memory_space_handler(u32 function,
u32 length;
acpi_size map_length;
acpi_size page_boundary_map_length;
+#ifdef ACPI_USE_FAST_PATH_MAPPING
+ struct acpi_mm_context *mm_context = (struct acpi_mm_context *)mem_info;
+ struct acpi_mem_mapping *mm;
+#endif
#ifdef ACPI_MISALIGNMENT_NOT_SUPPORTED
u32 remainder;
#endif
@@ -128,7 +204,7 @@ acpi_ex_system_memory_space_handler(u32 function,
mem_info->mapped_length))) {
/*
* The request cannot be resolved by the current memory mapping;
- * Delete the existing mapping and create a new one.
+ * Delete the current cached mapping and get a new one.
*/
if (mem_info->mapped_length) {

@@ -137,6 +213,36 @@ acpi_ex_system_memory_space_handler(u32 function,
acpi_ex_unmap_region_memory(mem_info);
}

+#ifdef ACPI_USE_FAST_PATH_MAPPING
+ /*
+ * Look for an existing saved mapping matching the address range
+ * at hand. If found, make the OS layer bump up the reference
+ * counter of that mapping, cache it and carry out the access.
+ */
+ for (mm = mm_context->first_mm; mm; mm = mm->next_mm) {
+ if (address < mm->physical_address)
+ continue;
+
+ if ((u64)address + length >
+ (u64)mm->physical_address + mm->length)
+ continue;
+
+ /*
+ * When called on a known-existing memory mapping,
+ * acpi_os_map_memory_fast_path() must return the same
+ * logical address as before or NULL.
+ */
+ if (!acpi_os_map_memory_fast_path(mm->physical_address,
+ mm->length))
+ continue;
+
+ mem_info->mapped_logical_address = mm->logical_address;
+ mem_info->mapped_physical_address = mm->physical_address;
+ mem_info->mapped_length = mm->length;
+ goto access;
+ }
+#endif /* ACPI_USE_FAST_PATH_MAPPING */
+
/*
* October 2009: Attempt to map from the requested address to the
* end of the region. However, we will never map more than one
@@ -168,9 +274,8 @@ acpi_ex_system_memory_space_handler(u32 function,

/* Create a new mapping starting at the address given */

- mem_info->mapped_logical_address =
- acpi_os_map_memory(address, map_length);
- if (!mem_info->mapped_logical_address) {
+ logical_addr_ptr = acpi_os_map_memory(address, map_length);
+ if (!logical_addr_ptr) {
ACPI_ERROR((AE_INFO,
"Could not map memory at 0x%8.8X%8.8X, size %u",
ACPI_FORMAT_UINT64(address),
@@ -181,10 +286,56 @@ acpi_ex_system_memory_space_handler(u32 function,

/* Save the physical address and mapping size */

+ mem_info->mapped_logical_address = logical_addr_ptr;
mem_info->mapped_physical_address = address;
mem_info->mapped_length = map_length;
+
+#ifdef ACPI_USE_FAST_PATH_MAPPING
+ /*
+ * Create a new mm list entry to save the new mapping for
+ * removal at the operation region deactivation time.
+ */
+ mm = ACPI_ALLOCATE_ZEROED(sizeof(*mm));
+ if (!mm) {
+ /*
+ * No room to save the new mapping, but this is not
+ * critical. Just log the error and carry out the
+ * access as requested.
+ */
+ ACPI_ERROR((AE_INFO,
+ "Not enough memory to save memory mapping at 0x%8.8X%8.8X, size %u",
+ ACPI_FORMAT_UINT64(address),
+ (u32)map_length));
+ goto access;
+ }
+ /*
+ * Bump up the new mapping's reference counter in the OS layer
+ * to prevent it from getting dropped prematurely.
+ */
+ if (!acpi_os_map_memory_fast_path(address, map_length)) {
+ /*
+ * Something has gone wrong, but this is not critical.
+ * Log the error, free the mm list entry that won't be
+ * used and carry out the access as requested.
+ */
+ ACPI_ERROR((AE_INFO,
+ "Unable to save memory mapping at 0x%8.8X%8.8X, size %u",
+ ACPI_FORMAT_UINT64(address),
+ (u32)map_length));
+ ACPI_FREE(mm);
+ goto access;
+ }
+ mm->physical_address = address;
+ mm->logical_address = logical_addr_ptr;
+ mm->length = map_length;
+ mm->next_mm = mm_context->first_mm;
+ mm_context->first_mm = mm;
}

+access:
+#else /* !ACPI_USE_FAST_PATH_MAPPING */
+ }
+#endif /* !ACPI_USE_FAST_PATH_MAPPING */
/*
* Generate a logical pointer corresponding to the address we want to
* access
--
2.26.2
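
A side note on the casts between struct acpi_mm_context and struct
acpi_mem_space_context in the patch above: they are valid only because
mem_info is the first member of struct acpi_mm_context, so a pointer to
the container object also points to it. A standalone sketch of the idiom
(generic stand-in types, not the ACPICA ones):

	#include <stdlib.h>

	struct base { int length; };

	struct container {
		struct base base;	/* must stay the first member */
		struct container *next;
	};

	int main(void)
	{
		/* Allocate the larger type, hand out the smaller view. */
		struct container *c = calloc(1, sizeof(*c));
		struct base *b = (struct base *)c;

		/* The views interconvert because a pointer to a structure
		 * also points to its initial member. */
		struct container *back = (struct container *)b;

		free(back);
		return 0;
	}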




2020-06-26 17:37:53

by Rafael J. Wysocki

[permalink] [raw]
Subject: [RFT][PATCH v3 0/4] ACPI: ACPICA / OSL: Avoid unmapping ACPI memory inside of the AML interpreter

Hi All,

On Monday, June 22, 2020 3:50:42 PM CEST Rafael J. Wysocki wrote:
> Hi All,
>
> This series is to address the problem with RCU synchronization occurring,
> possibly relatively often, inside of acpi_ex_system_memory_space_handler(),
> when the namespace and interpreter mutexes are held.
>
> Like I said before, I had decided to change the approach used in the previous
> iteration of this series and to allow the unmap operations carried out by
> acpi_ex_system_memory_space_handler() to be deferred in the first place,
> which is done in patches [1-2/4].

In the meantime I realized that calling synchronize_rcu_expedited() under the
"tables" mutex within ACPICA is not quite a good idea either and that there is no
reason for any users of acpi_os_unmap_memory() in the tree to use the "sync"
variant of unmapping.

So, unless I'm missing something, acpi_os_unmap_memory() can be changed to
always defer the final unmapping and the only ACPICA change needed to support
that is the addition of the acpi_os_release_unused_mappings() call to get rid
of the unused mappings when leaving the interpreter (modulo the extra call in
the debug code for consistency).

So patches [1-2/4] have been changed accordingly.
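
Roughly, the deferred scheme amounts to the following (a conceptual
sketch only; the helper, list and field names here are hypothetical,
and the actual code is in patches [1-2/4], which are not quoted in this
cover letter):

	/* Hypothetical helper: dropping the last reference queues the
	 * mapping for removal instead of unmapping it synchronously. */
	static void drop_map_ref_deferred(struct acpi_ioremap *map)
	{
		if (--map->refcount)
			return;

		list_del_rcu(&map->list);
		list_add_tail(&map->gc_list, &acpi_unused_mappings);
	}

	/* Called when leaving the AML interpreter: one grace period for
	 * the whole batch, then the real iounmap() calls. */
	void acpi_os_release_unused_mappings(void)
	{
		synchronize_rcu_expedited();
		/* ... iounmap() and free each entry queued above ... */
	}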

> However, it turns out that the "fast-path" mapping is still useful on top of
> the above to reduce the number of ioremap-iounmap cycles for the same address
> range and so it is introduced by patches [3-4/4].

Patches [3-4/4] still do what they did, but they have been simplified a bit
after rebasing on top of the new [1-2/4].

The information below is still valid, but it applies to v3, of course.

> For details, please refer to the patch changelogs.
>
> The series is available from the git branch at
>
> git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git \
> acpica-osl
>
> for easier testing.

Also, the series has been tested locally.

Thanks,
Rafael



2020-06-26 17:38:03

by Rafael J. Wysocki

[permalink] [raw]
Subject: [RFT][PATCH v3 3/4] ACPICA: Preserve memory opregion mappings if supported by OS

From: "Rafael J. Wysocki" <[email protected]>

The ACPICA's strategy with respect to the handling of memory mappings
associated with memory operation regions is to avoid mapping the
entire region at once which may be problematic at least in principle
(for example, it may lead to conflicts with overlapping mappings
having different attributes created by drivers). It may also be
wasteful, because memory opregions on some systems take up vast
chunks of address space while the fields in those regions actually
accessed by AML are sparsely distributed.

For this reason, a one-page "window" is mapped for a given opregion
on the first memory access through it and if that "window" does not
cover an address range accessed through that opregion subsequently,
it is unmapped and a new "window" is mapped to replace it. Next,
if the new "window" is not sufficient to acess memory through the
opregion in question in the future, it will be replaced with yet
another "window" and so on. That may lead to a suboptimal sequence
of memory mapping and unmapping operations, for example if two fields
in one opregion separated from each other by a sufficiently wide
chunk of unused address space are accessed in an alternating pattern.

The situation may still be suboptimal if the deferred unmapping
introduced previously is supported by the OS layer. For instance,
the alternating memory access pattern mentioned above may produce
a relatively long list of mappings to release with substantial
duplication among the entries in it, which could be avoided if
acpi_ex_system_memory_space_handler() did not release the mapping
used by it previously as soon as the current access was not covered
by it.

In order to improve that, modify acpi_ex_system_memory_space_handler()
to take advantage of the memory mappings reference counting at the OS
level if a suitable interface is provided.

Namely, if ACPI_USE_FAST_PATH_MAPPING is set, the OS is expected to
implement acpi_os_map_memory_fast_path() that will return NULL if
there is no mapping covering the given address range known to it.
If such a mapping is there, however, its reference counter will be
incremented and a pointer representing the requested virtual address
will be returned right away without any additional consequences.

That allows acpi_ex_system_memory_space_handler() to acquire
additional references to all new memory mappings with the help
of acpi_os_map_memory_fast_path() so as to retain them until the
memory opregions associated with them go away. The function will
still use a new "window" mapping if the current one does not
cover the address range at hand, but it will avoid unmapping the
current one right away by adding it to a list of "known" mappings
associated with the given memory opregion which will be deleted at
the opregion deactivation time. The mappings in that list can be
used every time a "new window" is needed so as to avoid overhead
related to the mapping and unmapping of memory.

Signed-off-by: Rafael J. Wysocki <[email protected]>
---
drivers/acpi/acpica/acinterp.h | 3 +
drivers/acpi/acpica/evrgnini.c | 9 +-
drivers/acpi/acpica/exregion.c | 154 ++++++++++++++++++++++++++++++++-
3 files changed, 156 insertions(+), 10 deletions(-)

diff --git a/drivers/acpi/acpica/acinterp.h b/drivers/acpi/acpica/acinterp.h
index a6d896cda2a5..95675a7a8a6b 100644
--- a/drivers/acpi/acpica/acinterp.h
+++ b/drivers/acpi/acpica/acinterp.h
@@ -479,6 +479,9 @@ void acpi_ex_pci_cls_to_string(char *dest, u8 class_code[3]);

u8 acpi_is_valid_space_id(u8 space_id);

+acpi_size acpi_ex_mem_space_context_size(void);
+void acpi_ex_unmap_all_region_mappings(struct acpi_mem_space_context *mem_info);
+
/*
* exregion - default op_region handlers
*/
diff --git a/drivers/acpi/acpica/evrgnini.c b/drivers/acpi/acpica/evrgnini.c
index aefc0145e583..82f466a128d5 100644
--- a/drivers/acpi/acpica/evrgnini.c
+++ b/drivers/acpi/acpica/evrgnini.c
@@ -46,13 +46,10 @@ acpi_ev_system_memory_region_setup(acpi_handle handle,
local_region_context =
(struct acpi_mem_space_context *)*region_context;

- /* Delete a cached mapping if present */
+ /* Delete memory mappings if present */

if (local_region_context->mapped_length) {
- acpi_os_unmap_memory(local_region_context->
- mapped_logical_address,
- local_region_context->
- mapped_length);
+ acpi_ex_unmap_all_region_mappings(local_region_context);
}
ACPI_FREE(local_region_context);
*region_context = NULL;
@@ -63,7 +60,7 @@ acpi_ev_system_memory_region_setup(acpi_handle handle,
/* Create a new context */

local_region_context =
- ACPI_ALLOCATE_ZEROED(sizeof(struct acpi_mem_space_context));
+ ACPI_ALLOCATE_ZEROED(acpi_ex_mem_space_context_size());
if (!(local_region_context)) {
return_ACPI_STATUS(AE_NO_MEMORY);
}
diff --git a/drivers/acpi/acpica/exregion.c b/drivers/acpi/acpica/exregion.c
index d15a66de26c0..4274582619d2 100644
--- a/drivers/acpi/acpica/exregion.c
+++ b/drivers/acpi/acpica/exregion.c
@@ -14,6 +14,73 @@
#define _COMPONENT ACPI_EXECUTER
ACPI_MODULE_NAME("exregion")

+struct acpi_mem_mapping {
+ acpi_physical_address physical_address;
+ u8 *logical_address;
+ acpi_size length;
+ struct acpi_mem_mapping *next_mm;
+};
+
+struct acpi_mm_context {
+ struct acpi_mem_space_context mem_info;
+ struct acpi_mem_mapping *first_mm;
+};
+
+/*******************************************************************************
+ *
+ * FUNCTION: acpi_ex_mem_space_context_size
+ *
+ * PARAMETERS: None
+ *
+ * RETURN: Size of internal memory operation region representation.
+ *
+ ******************************************************************************/
+acpi_size acpi_ex_mem_space_context_size(void)
+{
+ ACPI_FUNCTION_TRACE(acpi_ex_mem_space_context_size);
+
+#ifdef ACPI_USE_FAST_PATH_MAPPING
+ return sizeof(struct acpi_mm_context);
+#else
+ return sizeof(struct acpi_mem_space_context);
+#endif
+}
+
+/*******************************************************************************
+ *
+ * FUNCTION: acpi_ex_unmap_all_region_mappings
+ *
+ * PARAMETERS: mem_info - Region specific context
+ *
+ * RETURN: None
+ *
+ * DESCRIPTION: Unmap all mappings associated with a memory operation region.
+ *
+ ******************************************************************************/
+void acpi_ex_unmap_all_region_mappings(struct acpi_mem_space_context *mem_info)
+{
+#ifdef ACPI_USE_FAST_PATH_MAPPING
+ struct acpi_mm_context *mm_context = (struct acpi_mm_context *)mem_info;
+ struct acpi_mem_mapping *mm;
+#endif
+
+ ACPI_FUNCTION_TRACE(acpi_ex_unmap_all_region_mappings);
+
+ acpi_os_unmap_memory(mem_info->mapped_logical_address,
+ mem_info->mapped_length);
+
+#ifdef ACPI_USE_FAST_PATH_MAPPING
+ while (mm_context->first_mm) {
+ mm = mm_context->first_mm;
+ mm_context->first_mm = mm->next_mm;
+ acpi_os_unmap_memory(mm->logical_address, mm->length);
+ ACPI_FREE(mm);
+ }
+#endif
+
+ return_VOID;
+}
+
/*******************************************************************************
*
* FUNCTION: acpi_ex_system_memory_space_handler
@@ -44,6 +111,10 @@ acpi_ex_system_memory_space_handler(u32 function,
u32 length;
acpi_size map_length;
acpi_size page_boundary_map_length;
+#ifdef ACPI_USE_FAST_PATH_MAPPING
+ struct acpi_mm_context *mm_context = (struct acpi_mm_context *)mem_info;
+ struct acpi_mem_mapping *mm;
+#endif
#ifdef ACPI_MISALIGNMENT_NOT_SUPPORTED
u32 remainder;
#endif
@@ -102,7 +173,7 @@ acpi_ex_system_memory_space_handler(u32 function,
mem_info->mapped_length))) {
/*
* The request cannot be resolved by the current memory mapping;
- * Delete the existing mapping and create a new one.
+ * Delete the current cached mapping and get a new one.
*/
if (mem_info->mapped_length) {

@@ -112,6 +183,36 @@ acpi_ex_system_memory_space_handler(u32 function,
mem_info->mapped_length);
}

+#ifdef ACPI_USE_FAST_PATH_MAPPING
+ /*
+ * Look for an existing saved mapping matching the address range
+ * at hand. If found, make the OS layer bump up the reference
+ * counter of that mapping, cache it and carry out the access.
+ */
+ for (mm = mm_context->first_mm; mm; mm = mm->next_mm) {
+ if (address < mm->physical_address)
+ continue;
+
+ if ((u64)address + length >
+ (u64)mm->physical_address + mm->length)
+ continue;
+
+ /*
+ * When called on a known-existing memory mapping,
+ * acpi_os_map_memory_fast_path() must return the same
+ * logical address as before or NULL.
+ */
+ if (!acpi_os_map_memory_fast_path(mm->physical_address,
+ mm->length))
+ continue;
+
+ mem_info->mapped_logical_address = mm->logical_address;
+ mem_info->mapped_physical_address = mm->physical_address;
+ mem_info->mapped_length = mm->length;
+ goto access;
+ }
+#endif /* ACPI_USE_FAST_PATH_MAPPING */
+
/*
* October 2009: Attempt to map from the requested address to the
* end of the region. However, we will never map more than one
@@ -143,9 +244,8 @@ acpi_ex_system_memory_space_handler(u32 function,

/* Create a new mapping starting at the address given */

- mem_info->mapped_logical_address =
- acpi_os_map_memory(address, map_length);
- if (!mem_info->mapped_logical_address) {
+ logical_addr_ptr = acpi_os_map_memory(address, map_length);
+ if (!logical_addr_ptr) {
ACPI_ERROR((AE_INFO,
"Could not map memory at 0x%8.8X%8.8X, size %u",
ACPI_FORMAT_UINT64(address),
@@ -156,10 +256,56 @@ acpi_ex_system_memory_space_handler(u32 function,

/* Save the physical address and mapping size */

+ mem_info->mapped_logical_address = logical_addr_ptr;
mem_info->mapped_physical_address = address;
mem_info->mapped_length = map_length;
+
+#ifdef ACPI_USE_FAST_PATH_MAPPING
+ /*
+ * Create a new mm list entry to save the new mapping for
+ * removal at the operation region deactivation time.
+ */
+ mm = ACPI_ALLOCATE_ZEROED(sizeof(*mm));
+ if (!mm) {
+ /*
+ * No room to save the new mapping, but this is not
+ * critical. Just log the error and carry out the
+ * access as requested.
+ */
+ ACPI_ERROR((AE_INFO,
+ "Not enough memory to save memory mapping at 0x%8.8X%8.8X, size %u",
+ ACPI_FORMAT_UINT64(address),
+ (u32)map_length));
+ goto access;
+ }
+ /*
+ * Bump up the new mapping's reference counter in the OS layer
+ * to prevent it from getting dropped prematurely.
+ */
+ if (!acpi_os_map_memory_fast_path(address, map_length)) {
+ /*
+ * Something has gone wrong, but this is not critical.
+ * Log the error, free the mm list entry that won't be
+ * used and carry out the access as requested.
+ */
+ ACPI_ERROR((AE_INFO,
+ "Unable to save memory mapping at 0x%8.8X%8.8X, size %u",
+ ACPI_FORMAT_UINT64(address),
+ (u32)map_length));
+ ACPI_FREE(mm);
+ goto access;
+ }
+ mm->physical_address = address;
+ mm->logical_address = logical_addr_ptr;
+ mm->length = map_length;
+ mm->next_mm = mm_context->first_mm;
+ mm_context->first_mm = mm;
}

+access:
+#else /* !ACPI_USE_FAST_PATH_MAPPING */
+ }
+#endif /* !ACPI_USE_FAST_PATH_MAPPING */
/*
* Generate a logical pointer corresponding to the address we want to
* access
--
2.26.2




2020-06-26 18:45:03

by Dan Williams

[permalink] [raw]
Subject: Re: [RFT][PATCH v3 0/4] ACPI: ACPICA / OSL: Avoid unmapping ACPI memory inside of the AML interpreter

On Fri, Jun 26, 2020 at 10:34 AM Rafael J. Wysocki <[email protected]> wrote:
>
> Hi All,
>
> On Monday, June 22, 2020 3:50:42 PM CEST Rafael J. Wysocki wrote:
> > Hi All,
> >
> > This series is to address the problem with RCU synchronization occurring,
> > possibly relatively often, inside of acpi_ex_system_memory_space_handler(),
> > when the namespace and interpreter mutexes are held.
> >
> > Like I said before, I had decided to change the approach used in the previous
> > iteration of this series and to allow the unmap operations carried out by
> > acpi_ex_system_memory_space_handler() to be deferred in the first place,
> > which is done in patches [1-2/4].
>
> In the meantime I realized that calling synchronize_rcu_expedited() under the
> "tables" mutex within ACPICA is not a good idea either and that there is no
> reason for any users of acpi_os_unmap_memory() in the tree to use the "sync"
> variant of unmapping.
>
> So, unless I'm missing something, acpi_os_unmap_memory() can be changed to
> always defer the final unmapping and the only ACPICA change needed to support
> that is the addition of the acpi_os_release_unused_mappings() call to get rid
> of the unused mappings when leaving the interpreter (modulo the extra call in
> the debug code for consistency).
>
> So patches [1-2/4] have been changed accordingly.
>
> > However, it turns out that the "fast-path" mapping is still useful on top of
> > the above to reduce the number of ioremap-iounmap cycles for the same address
> > range and so it is introduced by patches [3-4/4].
>
> Patches [3-4/4] still do what they did, but they have been simplified a bit
> after rebasing on top of the new [1-2/4].
>
> The below information is still valid, but it applies to the v3, of course.
>
> > For details, please refer to the patch changelogs.
> >
> > The series is available from the git branch at
> >
> > git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git \
> > acpica-osl
> >
> > for easier testing.
>
> Also the series has been tested locally.

Ok, I'm still trying to get the original reporter to confirm this
reduces the execution time for ASL routines with a lot of OpRegion
touches. Shall I rebuild that test kernel with these changes, or are
the results from the original RFT still interesting?

2020-06-26 22:55:07

by Kaneda, Erik

[permalink] [raw]
Subject: RE: [RFT][PATCH v2 3/4] ACPICA: Preserve memory opregion mappings if supported by OS



> -----Original Message-----
> From: Rafael J. Wysocki <[email protected]>
> Sent: Monday, June 22, 2020 7:02 AM
> To: Williams, Dan J <[email protected]>; Kaneda, Erik
> <[email protected]>
> Cc: Wysocki, Rafael J <[email protected]>; Len Brown
> <[email protected]>; Borislav Petkov <[email protected]>; Weiny, Ira
> <[email protected]>; James Morse <[email protected]>; Myron
> Stowe <[email protected]>; Andy Shevchenko
> <[email protected]>; [email protected]; linux-
> [email protected]; [email protected]; Moore, Robert
> <[email protected]>
> Subject: [RFT][PATCH v2 3/4] ACPICA: Preserve memory opregion mappings
> if supported by OS
>
> From: "Rafael J. Wysocki" <[email protected]>
>
> The ACPICA's strategy with respect to the handling of memory mappings
> associated with memory operation regions is to avoid mapping the
> entire region at once which may be problematic at least in principle
> (for example, it may lead to conflicts with overlapping mappings
> having different attributes created by drivers). It may also be
> wasteful, because memory opregions on some systems take up vast
> chunks of address space while the fields in those regions actually
> accessed by AML are sparsely distributed.
>
> For this reason, a one-page "window" is mapped for a given opregion
> on the first memory access through it and if that "window" does not
> cover an address range accessed through that opregion subsequently,
> it is unmapped and a new "window" is mapped to replace it. Next,
> if the new "window" is not sufficient to access memory through the
> opregion in question in the future, it will be replaced with yet
> another "window" and so on. That may lead to a suboptimal sequence
> of memory mapping and unmapping operations, for example if two fields
> in one opregion separated from each other by a sufficiently wide
> chunk of unused address space are accessed in an alternating pattern.
>
> The situation may still be suboptimal if the deferred unmapping
> introduced previously is supported by the OS layer. For instance,
> the alternating memory access pattern mentioned above may produce
> a relatively long list of mappings to release with substantial
> duplication among the entries in it, which could be avoided if
> acpi_ex_system_memory_space_handler() did not release the mapping
> used by it previously as soon as the current access was not covered
> by it.
>
> In order to improve that, modify acpi_ex_system_memory_space_handler()
> to take advantage of the memory mappings reference counting at the OS
> level if a suitable interface is provided.
>
Hi,

> Namely, if ACPI_USE_FAST_PATH_MAPPING is set, the OS is expected to
> implement acpi_os_map_memory_fast_path() that will return NULL if
> there is no mapping covering the given address range known to it.
> If such a mapping is there, however, its reference counter will be
> incremented and a pointer representing the requested virtual address
> will be returned right away without any additional consequences.

I do not fully understand why this is under an #ifdef. Is this to support operating systems that might not want to add support for this behavior?

Also, instead of using the terminology fast_path, I think it would be easier to use terminology that describes the mechanism.
It might be easier for other operating systems to understand something like acpi_os_map_preserved_memory or acpi_os_map_sysmem_opregion_memory.

Thanks,
Erik
>
> That allows acpi_ex_system_memory_space_handler() to acquire
> additional references to all new memory mappings with the help
> of acpi_os_map_memory_fast_path() so as to retain them until the
> memory opregions associated with them go away. The function will
> still use a new "window" mapping if the current one does not
> cover the address range at hand, but it will avoid unmapping the
> current one right away by adding it to a list of "known" mappings
> associated with the given memory opregion which will be deleted at
> the opregion deactivation time. The mappings in that list can be
> used every time a "new window" is needed so as to avoid overhead
> related to the mapping and unmapping of memory.
>
> Signed-off-by: Rafael J. Wysocki <[email protected]>
> ---
> drivers/acpi/acpica/acinterp.h | 4 +
> drivers/acpi/acpica/evrgnini.c | 7 +-
> drivers/acpi/acpica/exregion.c | 159 ++++++++++++++++++++++++++++++++-
> 3 files changed, 162 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/acpi/acpica/acinterp.h b/drivers/acpi/acpica/acinterp.h
> index 1f1026fb06e9..db9c279baa2e 100644
> --- a/drivers/acpi/acpica/acinterp.h
> +++ b/drivers/acpi/acpica/acinterp.h
> @@ -479,8 +479,12 @@ void acpi_ex_pci_cls_to_string(char *dest, u8 class_code[3]);
>
> u8 acpi_is_valid_space_id(u8 space_id);
>
> +struct acpi_mem_space_context *acpi_ex_alloc_mem_space_context(void);
> +
> void acpi_ex_unmap_region_memory(struct acpi_mem_space_context *mem_info);
>
> +void acpi_ex_unmap_all_region_mappings(struct acpi_mem_space_context *mem_info);
> +
> /*
> * exregion - default op_region handlers
> */
> diff --git a/drivers/acpi/acpica/evrgnini.c b/drivers/acpi/acpica/evrgnini.c
> index 9f33114a74ca..f6c5feea10bc 100644
> --- a/drivers/acpi/acpica/evrgnini.c
> +++ b/drivers/acpi/acpica/evrgnini.c
> @@ -46,10 +46,10 @@ acpi_ev_system_memory_region_setup(acpi_handle handle,
> local_region_context =
> (struct acpi_mem_space_context *)*region_context;
>
> - /* Delete a cached mapping if present */
> + /* Delete memory mappings if present */
>
> if (local_region_context->mapped_length) {
> - acpi_ex_unmap_region_memory(local_region_context);
> + acpi_ex_unmap_all_region_mappings(local_region_context);
> }
> ACPI_FREE(local_region_context);
> *region_context = NULL;
> @@ -59,8 +59,7 @@ acpi_ev_system_memory_region_setup(acpi_handle handle,
>
> /* Create a new context */
>
> - local_region_context =
> - ACPI_ALLOCATE_ZEROED(sizeof(struct acpi_mem_space_context));
> + local_region_context = acpi_ex_alloc_mem_space_context();
> if (!(local_region_context)) {
> return_ACPI_STATUS(AE_NO_MEMORY);
> }
> diff --git a/drivers/acpi/acpica/exregion.c b/drivers/acpi/acpica/exregion.c
> index af777b7fccb0..9d97b6a67074 100644
> --- a/drivers/acpi/acpica/exregion.c
> +++ b/drivers/acpi/acpica/exregion.c
> @@ -14,6 +14,40 @@
> #define _COMPONENT ACPI_EXECUTER
> ACPI_MODULE_NAME("exregion")
>
> +struct acpi_mem_mapping {
> + acpi_physical_address physical_address;
> + u8 *logical_address;
> + acpi_size length;
> + struct acpi_mem_mapping *next_mm;
> +};
> +
> +struct acpi_mm_context {
> + struct acpi_mem_space_context mem_info;
> + struct acpi_mem_mapping *first_mm;
> +};
> +
> +/******************************************************************************
> + *
> + * FUNCTION: acpi_ex_alloc_mem_space_context
> + *
> + * PARAMETERS: None
> + *
> + * RETURN: Pointer to a new region context object.
> + *
> + * DESCRIPTION: Allocate memory for memory operation region representation.
> + *
> + ******************************************************************************/
> +struct acpi_mem_space_context *acpi_ex_alloc_mem_space_context(void)
> +{
> + ACPI_FUNCTION_TRACE(acpi_ex_alloc_mem_space_context);
> +
> +#ifdef ACPI_USE_FAST_PATH_MAPPING
> + return ACPI_ALLOCATE_ZEROED(sizeof(struct acpi_mm_context));
> +#else
> + return ACPI_ALLOCATE_ZEROED(sizeof(struct acpi_mem_space_context));
> +#endif
> +}
> +
>
> /******************************************************************************
> *
> * FUNCTION: acpi_ex_unmap_region_memory
> @@ -40,6 +74,44 @@ void acpi_ex_unmap_region_memory(struct acpi_mem_space_context *mem_info)
> return_VOID;
> }
>
> +/******************************************************************************
> + *
> + * FUNCTION: acpi_ex_unmap_all_region_mappings
> + *
> + * PARAMETERS: mem_info - Region specific context
> + *
> + * RETURN: None
> + *
> + * DESCRIPTION: Unmap all mappings associated with a memory operation region.
> + *
> + ******************************************************************************/
> +void acpi_ex_unmap_all_region_mappings(struct acpi_mem_space_context *mem_info)
> +{
> +#ifdef ACPI_USE_FAST_PATH_MAPPING
> + struct acpi_mm_context *mm_context = (struct acpi_mm_context *)mem_info;
> + struct acpi_mem_mapping *mm;
> +#endif
> +
> + ACPI_FUNCTION_TRACE(acpi_ex_unmap_all_region_mappings);
> +
> + acpi_ex_unmap_region_memory(mem_info);
> +
> +#ifdef ACPI_USE_FAST_PATH_MAPPING
> + while (mm_context->first_mm) {
> + mm = mm_context->first_mm;
> + mm_context->first_mm = mm->next_mm;
> +#ifdef ACPI_USE_DEFERRED_UNMAPPING
> + acpi_os_unmap_deferred(mm->logical_address, mm->length);
> +#else
> + acpi_os_unmap_memory(mm->logical_address, mm->length);
> +#endif
> + ACPI_FREE(mm);
> + }
> +#endif /* ACPI_USE_FAST_PATH_MAPPING */
> +
> + return_VOID;
> +}
> +
>
> /******************************************************************************
> *
> * FUNCTION: acpi_ex_system_memory_space_handler
> @@ -70,6 +142,10 @@ acpi_ex_system_memory_space_handler(u32 function,
> u32 length;
> acpi_size map_length;
> acpi_size page_boundary_map_length;
> +#ifdef ACPI_USE_FAST_PATH_MAPPING
> + struct acpi_mm_context *mm_context = (struct acpi_mm_context *)mem_info;
> + struct acpi_mem_mapping *mm;
> +#endif
> #ifdef ACPI_MISALIGNMENT_NOT_SUPPORTED
> u32 remainder;
> #endif
> @@ -128,7 +204,7 @@ acpi_ex_system_memory_space_handler(u32 function,
> mem_info->mapped_length))) {
> /*
> * The request cannot be resolved by the current memory mapping;
> - * Delete the existing mapping and create a new one.
> + * Delete the current cached mapping and get a new one.
> */
> if (mem_info->mapped_length) {
>
> @@ -137,6 +213,36 @@ acpi_ex_system_memory_space_handler(u32 function,
> acpi_ex_unmap_region_memory(mem_info);
> }
>
> +#ifdef ACPI_USE_FAST_PATH_MAPPING
> + /*
> + * Look for an existing saved mapping matching the address range
> + * at hand. If found, make the OS layer bump up the reference
> + * counter of that mapping, cache it and carry out the access.
> + */
> + for (mm = mm_context->first_mm; mm; mm = mm->next_mm) {
> + if (address < mm->physical_address)
> + continue;
> +
> + if ((u64)address + length >
> + (u64)mm->physical_address + mm->length)
> + continue;
> +
> + /*
> + * When called on a known-existing memory mapping,
> + * acpi_os_map_memory_fast_path() must return the same
> + * logical address as before or NULL.
> + */
> + if (!acpi_os_map_memory_fast_path(mm->physical_address,
> + mm->length))
> + continue;
> +
> + mem_info->mapped_logical_address = mm->logical_address;
> + mem_info->mapped_physical_address = mm->physical_address;
> + mem_info->mapped_length = mm->length;
> + goto access;
> + }
> +#endif /* ACPI_USE_FAST_PATH_MAPPING */
> +
> /*
> * October 2009: Attempt to map from the requested address to the
> * end of the region. However, we will never map more than one
> @@ -168,9 +274,8 @@ acpi_ex_system_memory_space_handler(u32 function,
>
> /* Create a new mapping starting at the address given */
>
> - mem_info->mapped_logical_address =
> - acpi_os_map_memory(address, map_length);
> - if (!mem_info->mapped_logical_address) {
> + logical_addr_ptr = acpi_os_map_memory(address, map_length);
> + if (!logical_addr_ptr) {
> ACPI_ERROR((AE_INFO,
> "Could not map memory at 0x%8.8X%8.8X, size %u",
> ACPI_FORMAT_UINT64(address),
> @@ -181,10 +286,56 @@ acpi_ex_system_memory_space_handler(u32 function,
>
> /* Save the physical address and mapping size */
>
> + mem_info->mapped_logical_address = logical_addr_ptr;
> mem_info->mapped_physical_address = address;
> mem_info->mapped_length = map_length;
> +
> +#ifdef ACPI_USE_FAST_PATH_MAPPING
> + /*
> + * Create a new mm list entry to save the new mapping for
> + * removal at the operation region deactivation time.
> + */
> + mm = ACPI_ALLOCATE_ZEROED(sizeof(*mm));
> + if (!mm) {
> + /*
> + * No room to save the new mapping, but this is not
> + * critical. Just log the error and carry out the
> + * access as requested.
> + */
> + ACPI_ERROR((AE_INFO,
> + "Not enough memory to save memory mapping at 0x%8.8X%8.8X, size %u",
> + ACPI_FORMAT_UINT64(address),
> + (u32)map_length));
> + goto access;
> + }
> + /*
> + * Bump up the new mapping's reference counter in the OS layer
> + * to prevent it from getting dropped prematurely.
> + */
> + if (!acpi_os_map_memory_fast_path(address, map_length)) {
> + /*
> + * Something has gone wrong, but this is not critical.
> + * Log the error, free the mm list entry that won't be
> + * used and carry out the access as requested.
> + */
> + ACPI_ERROR((AE_INFO,
> + "Unable to save memory mapping at 0x%8.8X%8.8X, size %u",
> + ACPI_FORMAT_UINT64(address),
> + (u32)map_length));
> + ACPI_FREE(mm);
> + goto access;
> + }
> + mm->physical_address = address;
> + mm->logical_address = logical_addr_ptr;
> + mm->length = map_length;
> + mm->next_mm = mm_context->first_mm;
> + mm_context->first_mm = mm;
> }
>
> +access:
> +#else /* !ACPI_USE_FAST_PATH_MAPPING */
> + }
> +#endif /* !ACPI_USE_FAST_PATH_MAPPING */
> /*
> * Generate a logical pointer corresponding to the address we want to
> * access
> --
> 2.26.2
>
>
>

2020-06-28 17:14:04

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [RFT][PATCH v3 0/4] ACPI: ACPICA / OSL: Avoid unmapping ACPI memory inside of the AML interpreter

On Fri, Jun 26, 2020 at 8:41 PM Dan Williams <[email protected]> wrote:
>
> On Fri, Jun 26, 2020 at 10:34 AM Rafael J. Wysocki <[email protected]> wrote:
> >
> > Hi All,
> >
> > On Monday, June 22, 2020 3:50:42 PM CEST Rafael J. Wysocki wrote:
> > > Hi All,
> > >
> > > This series is to address the problem with RCU synchronization occurring,
> > > possibly relatively often, inside of acpi_ex_system_memory_space_handler(),
> > > when the namespace and interpreter mutexes are held.
> > >
> > > Like I said before, I had decided to change the approach used in the previous
> > > iteration of this series and to allow the unmap operations carried out by
> > > acpi_ex_system_memory_space_handler() to be deferred in the first place,
> > > which is done in patches [1-2/4].
> >
> > In the meantime I realized that calling synchronize_rcu_expedited() under the
> > "tables" mutex within ACPICA is not a good idea either and that there is no
> > reason for any users of acpi_os_unmap_memory() in the tree to use the "sync"
> > variant of unmapping.
> >
> > So, unless I'm missing something, acpi_os_unmap_memory() can be changed to
> > always defer the final unmapping and the only ACPICA change needed to support
> > that is the addition of the acpi_os_release_unused_mappings() call to get rid
> > of the unused mappings when leaving the interpreter (modulo the extra call in
> > the debug code for consistency).
> >
> > So patches [1-2/4] have been changed accordingly.
> >
> > > However, it turns out that the "fast-path" mapping is still useful on top of
> > > the above to reduce the number of ioremap-iounmap cycles for the same address
> > > range and so it is introduced by patches [3-4/4].
> >
> > Patches [3-4/4] still do what they did, but they have been simplified a bit
> > after rebasing on top of the new [1-2/4].
> >
> > The below information is still valid, but it applies to the v3, of course.
> >
> > > For details, please refer to the patch changelogs.
> > >
> > > The series is available from the git branch at
> > >
> > > git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git \
> > > acpica-osl
> > >
> > > for easier testing.
> >
> > Also the series has been tested locally.
>
> Ok, I'm still trying to get the original reporter to confirm this
> reduces the execution time for ASL routines with a lot of OpRegion
> touches. Shall I rebuild that test kernel with these changes, or are
> the results from the original RFT still interesting?

I'm mostly interested in the results with the v3 applied.

Also it would be good to check the impact of the first two patches
alone relative to all four.

Thanks!

2020-06-29 19:09:12

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [RFT][PATCH v2 3/4] ACPICA: Preserve memory opregion mappings if supported by OS

On Sat, Jun 27, 2020 at 12:53 AM Kaneda, Erik <[email protected]> wrote:
>
>
>
> > -----Original Message-----
> > From: Rafael J. Wysocki <[email protected]>
> > Sent: Monday, June 22, 2020 7:02 AM
> > To: Williams, Dan J <[email protected]>; Kaneda, Erik
> > <[email protected]>
> > Cc: Wysocki, Rafael J <[email protected]>; Len Brown
> > <[email protected]>; Borislav Petkov <[email protected]>; Weiny, Ira
> > <[email protected]>; James Morse <[email protected]>; Myron
> > Stowe <[email protected]>; Andy Shevchenko
> > <[email protected]>; [email protected]; linux-
> > [email protected]; [email protected]; Moore, Robert
> > <[email protected]>
> > Subject: [RFT][PATCH v2 3/4] ACPICA: Preserve memory opregion mappings
> > if supported by OS
> >
> > From: "Rafael J. Wysocki" <[email protected]>
> >
> > The ACPICA's strategy with respect to the handling of memory mappings
> > associated with memory operation regions is to avoid mapping the
> > entire region at once which may be problematic at least in principle
> > (for example, it may lead to conflicts with overlapping mappings
> > having different attributes created by drivers). It may also be
> > wasteful, because memory opregions on some systems take up vast
> > chunks of address space while the fields in those regions actually
> > accessed by AML are sparsely distributed.
> >
> > For this reason, a one-page "window" is mapped for a given opregion
> > on the first memory access through it and if that "window" does not
> > cover an address range accessed through that opregion subsequently,
> > it is unmapped and a new "window" is mapped to replace it. Next,
> > if the new "window" is not sufficient to access memory through the
> > opregion in question in the future, it will be replaced with yet
> > another "window" and so on. That may lead to a suboptimal sequence
> > of memory mapping and unmapping operations, for example if two fields
> > in one opregion separated from each other by a sufficiently wide
> > chunk of unused address space are accessed in an alternating pattern.
> >
> > The situation may still be suboptimal if the deferred unmapping
> > introduced previously is supported by the OS layer. For instance,
> > the alternating memory access pattern mentioned above may produce
> > a relatively long list of mappings to release with substantial
> > duplication among the entries in it, which could be avoided if
> > acpi_ex_system_memory_space_handler() did not release the mapping
> > used by it previously as soon as the current access was not covered
> > by it.
> >
> > In order to improve that, modify acpi_ex_system_memory_space_handler()
> > to take advantage of the memory mappings reference counting at the OS
> > level if a suitable interface is provided.
> >
> Hi,
>
> > Namely, if ACPI_USE_FAST_PATH_MAPPING is set, the OS is expected to
> > implement acpi_os_map_memory_fast_path() that will return NULL if
> > there is no mapping covering the given address range known to it.
> > If such a mapping is there, however, its reference counter will be
> > incremented and a pointer representing the requested virtual address
> > will be returned right away without any additional consequences.
>
> I do not fully understand why this is under an #ifdef. Is this to support operating systems that might not want to add support for this behavior?

Yes, and to protect the ones that have not added support for it just yet.

Without the "fast-path" mapping support, ACPICA has no way to obtain
additional references to known-existing mappings and the new code
won't work as expected without it, so it is better to avoid building
that code at all in those cases IMO.
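
For illustration, the OS-side helper could look roughly like the minimal
sketch below, built on the acpi_ioremap bookkeeping in osl.c; the "refcount"
and "virt" field accesses are assumptions for the sake of the example, not
code from the patches:

    /*
     * Sketch only: return a new reference to a known-existing mapping,
     * or NULL if no saved mapping covers the requested range.
     */
    void __iomem *acpi_os_map_memory_fast_path(acpi_physical_address phys,
                                               acpi_size size)
    {
            struct acpi_ioremap *map;
            void __iomem *virt = NULL;

            mutex_lock(&acpi_ioremap_lock);
            map = acpi_map_lookup(phys, size); /* existing osl.c helper */
            if (map) {
                    /* Take an extra reference on the covering mapping. */
                    map->refcount++;
                    virt = map->virt + (phys - map->phys);
            }
            mutex_unlock(&acpi_ioremap_lock);
            return virt;
    }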

> Also, instead of using the terminology fast_path, I think it would be easier to use terminology that describes the mechanism.
> It might be easier for other operating systems to understand something like acpi_os_map_preserved_memory or acpi_os_map_sysmem_opregion_memory.

Well, the naming is not particularly important to me to be honest, but
this is mostly about being able to get a new reference to a
known-existing memory mapping.

So something like acpi_os_ref_memory_map() perhaps?

But I'm thinking that this can be implemented without the "fast-path"
mapping support too, so let me try to do that.

Cheers!

2020-06-29 19:25:00

by Rafael J. Wysocki

[permalink] [raw]
Subject: [PATCH v4 0/2] ACPI: ACPICA / OSL: Avoid unmapping ACPI memory inside of the AML interpreter

Hi All,

On Friday, June 26, 2020 7:28:27 PM CEST Rafael J. Wysocki wrote:
> Hi All,
>
> On Monday, June 22, 2020 3:50:42 PM CEST Rafael J. Wysocki wrote:
> > Hi All,
> >
> > This series is to address the problem with RCU synchronization occurring,
> > possibly relatively often, inside of acpi_ex_system_memory_space_handler(),
> > when the namespace and interpreter mutexes are held.
> >
> > Like I said before, I had decided to change the approach used in the previous
> > iteration of this series and to allow the unmap operations carried out by
> > acpi_ex_system_memory_space_handler() to be deferred in the first place,
> > which is done in patches [1-2/4].
>
> In the meantime I realized that calling synchronize_rcu_expedited() under the
> "tables" mutex within ACPICA is not a good idea either and that there is no
> reason for any users of acpi_os_unmap_memory() in the tree to use the "sync"
> variant of unmapping.
>
> So, unless I'm missing something, acpi_os_unmap_memory() can be changed to
> always defer the final unmapping and the only ACPICA change needed to support
> that is the addition of the acpi_os_release_unused_mappings() call to get rid
> of the unused mappings when leaving the interpreter (modulo the extra call in
> the debug code for consistency).
>
> So patches [1-2/4] have been changed accordingly.

And this still can be improved by using queue_rcu_work() to queue up the unused
mappings for removal, in which case ACPICA need not be modified at all for the
deferred unmapping to work.
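
A minimal sketch of that queue_rcu_work() approach, assuming struct
acpi_ioremap grows an rcu_work member (called "rwork" here purely for
illustration, with a plain "refcount" field protected by acpi_ioremap_lock):

    /* Final teardown, run only after an RCU grace period has elapsed. */
    static void acpi_os_map_remove(struct work_struct *work)
    {
            struct acpi_ioremap *map = container_of(to_rcu_work(work),
                                                    struct acpi_ioremap,
                                                    rwork);

            iounmap(map->virt);
            kfree(map);
    }

    /* Called with acpi_ioremap_lock held; defers the unmap instead of
     * waiting for RCU synchronization inline. */
    static void acpi_os_drop_map_ref(struct acpi_ioremap *map)
    {
            if (--map->refcount)
                    return;

            list_del_rcu(&map->list); /* hide the entry from RCU readers */
            INIT_RCU_WORK(&map->rwork, acpi_os_map_remove);
            queue_rcu_work(system_wq, &map->rwork);
    }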

Accordingly, patches [1-2/4] from the v3 (and earlier) are now replaced by one
patch, the [1/2].

> > However, it turns out that the "fast-path" mapping is still useful on top of
> > the above to reduce the number of ioremap-iounmap cycles for the same address
> > range and so it is introduced by patches [3-4/4].
>
> Patches [3-4/4] still do what they did, but they have been simplified a bit
> after rebasing on top of the new [1-2/4].

Moreover, the ACPICA part of the old patches [3-4/4] can be reworked to always
preserve memory mappings created by the memory opregion handler without the
need to take additional references to memory mappings at the OS level, so
patch [4/4] from the v3 (and earlier) is not needed now.

Again, for details, please refer to the patch changelogs, but I'm kind of
inclined to make these changes regardless, because they both are clear
improvements to me.

As before:

> > The series is available from the git branch at
> >
> > git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git \
> > acpica-osl
> >
> > for easier testing.
>
> Also the series has been tested locally.

Cheers,
Rafael



2020-06-29 20:47:47

by Dan Williams

[permalink] [raw]
Subject: Re: [RFT][PATCH v3 0/4] ACPI: ACPICA / OSL: Avoid unmapping ACPI memory inside of the AML interpreter

On Sun, Jun 28, 2020 at 10:09 AM Rafael J. Wysocki <[email protected]> wrote:
>
> On Fri, Jun 26, 2020 at 8:41 PM Dan Williams <[email protected]> wrote:
> >
> > On Fri, Jun 26, 2020 at 10:34 AM Rafael J. Wysocki <[email protected]> wrote:
> > >
> > > Hi All,
> > >
> > > On Monday, June 22, 2020 3:50:42 PM CEST Rafael J. Wysocki wrote:
> > > > Hi All,
> > > >
> > > > This series is to address the problem with RCU synchronization occurring,
> > > > possibly relatively often, inside of acpi_ex_system_memory_space_handler(),
> > > > when the namespace and interpreter mutexes are held.
> > > >
> > > > Like I said before, I had decided to change the approach used in the previous
> > > > iteration of this series and to allow the unmap operations carried out by
> > > > acpi_ex_system_memory_space_handler() to be deferred in the first place,
> > > > which is done in patches [1-2/4].
> > >
> > > In the meantime I realized that calling synchronize_rcu_expedited() under the
> > > "tables" mutex within ACPICA is not a good idea either and that there is no
> > > reason for any users of acpi_os_unmap_memory() in the tree to use the "sync"
> > > variant of unmapping.
> > >
> > > So, unless I'm missing something, acpi_os_unmap_memory() can be changed to
> > > always defer the final unmapping and the only ACPICA change needed to support
> > > that is the addition of the acpi_os_release_unused_mappings() call to get rid
> > > of the unused mappings when leaving the interpreter (modulo the extra call in
> > > the debug code for consistency).
> > >
> > > So patches [1-2/4] have been changed accordingly.
> > >
> > > > However, it turns out that the "fast-path" mapping is still useful on top of
> > > > the above to reduce the number of ioremap-iounmap cycles for the same address
> > > > range and so it is introduced by patches [3-4/4].
> > >
> > > Patches [3-4/4] still do what they did, but they have been simplified a bit
> > > after rebasing on top of the new [1-2/4].
> > >
> > > The below information is still valid, but it applies to the v3, of course.
> > >
> > > > For details, please refer to the patch changelogs.
> > > >
> > > > The series is available from the git branch at
> > > >
> > > > git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git \
> > > > acpica-osl
> > > >
> > > > for easier testing.
> > >
> > > Also the series has been tested locally.
> >
> > Ok, I'm still trying to get the original reporter to confirm this
> > reduces the execution time for ASL routines with a lot of OpRegion
> > touches. Shall I rebuild that test kernel with these changes, or are
> > the results from the original RFT still interesting?
>
> I'm mostly interested in the results with the v3 applied.
>

Ok, I just got feedback on v2 and it still showed the 30-minute
execution time where 7 minutes was achieved previously.

> Also it would be good to check the impact of the first two patches
> alone relative to all four.

I'll start with the full set and see if they can also support the
"first 2" experiment.

2020-06-30 11:08:36

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [RFT][PATCH v3 0/4] ACPI: ACPICA / OSL: Avoid unmapping ACPI memory inside of the AML interpreter

On Mon, Jun 29, 2020 at 10:46 PM Dan Williams <[email protected]> wrote:
>
> On Sun, Jun 28, 2020 at 10:09 AM Rafael J. Wysocki <[email protected]> wrote:
> >
> > On Fri, Jun 26, 2020 at 8:41 PM Dan Williams <[email protected]> wrote:
> > >
> > > On Fri, Jun 26, 2020 at 10:34 AM Rafael J. Wysocki <[email protected]> wrote:
> > > >
> > > > Hi All,
> > > >
> > > > On Monday, June 22, 2020 3:50:42 PM CEST Rafael J. Wysocki wrote:
> > > > > Hi All,
> > > > >
> > > > > This series is to address the problem with RCU synchronization occurring,
> > > > > possibly relatively often, inside of acpi_ex_system_memory_space_handler(),
> > > > > when the namespace and interpreter mutexes are held.
> > > > >
> > > > > Like I said before, I had decided to change the approach used in the previous
> > > > > iteration of this series and to allow the unmap operations carried out by
> > > > > acpi_ex_system_memory_space_handler() to be deferred in the first place,
> > > > > which is done in patches [1-2/4].
> > > >
> > > > In the meantime I realized that calling synchronize_rcu_expedited() under the
> > > > "tables" mutex within ACPICA is not a good idea either and that there is no
> > > > reason for any users of acpi_os_unmap_memory() in the tree to use the "sync"
> > > > variant of unmapping.
> > > >
> > > > So, unless I'm missing something, acpi_os_unmap_memory() can be changed to
> > > > always defer the final unmapping and the only ACPICA change needed to support
> > > > that is the addition of the acpi_os_release_unused_mappings() call to get rid
> > > > of the unused mappings when leaving the interpreter (modulo the extra call in
> > > > the debug code for consistency).
> > > >
> > > > So patches [1-2/4] have been changed accordingly.
> > > >
> > > > > However, it turns out that the "fast-path" mapping is still useful on top of
> > > > > the above to reduce the number of ioremap-iounmap cycles for the same address
> > > > > range and so it is introduced by patches [3-4/4].
> > > >
> > > > Patches [3-4/4] still do what they did, but they have been simplified a bit
> > > > after rebasing on top of the new [1-2/4].
> > > >
> > > > The below information is still valid, but it applies to the v3, of course.
> > > >
> > > > > For details, please refer to the patch changelogs.
> > > > >
> > > > > The series is available from the git branch at
> > > > >
> > > > > git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git \
> > > > > acpica-osl
> > > > >
> > > > > for easier testing.
> > > >
> > > > Also the series has been tested locally.
> > >
> > > Ok, I'm still trying to get the original reporter to confirm this
> > > reduces the execution time for ASL routines with a lot of OpRegion
> > > touches. Shall I rebuild that test kernel with these changes, or are
> > > the results from the original RFT still interesting?
> >
> > I'm mostly interested in the results with the v3 applied.
> >
>
> Ok, I just got feedback on v2 and it still showed the 30-minute
> execution time where 7 minutes was achieved previously.

This probably means that "transient" memory opregions, which appear
and go away during AML execution, are involved, and so moving the
RCU synchronization outside of the interpreter and namespace locks is
not enough to cover this case.

It should be covered by the v4
(https://lore.kernel.org/linux-acpi/1666722.UopIai5n7p@kreacher/T/#u),
though, because the unmapping there is completely asynchronous and
does not add any significant latency to the interpreter exit path.
I would expect to see much better results with the v4, so I'd
recommend testing that one next.

> > Also it would be good to check the impact of the first two patches
> > alone relative to all four.
>
> I'll start with the full set and see if they can also support the
> "first 2" experiment.

In the v4 there are just two patches, so it should be straightforward
enough to test with and without the top-most one. :-)