2023-12-15 05:26:12

by Vishal Verma

Subject: [PATCH v6 0/4] Add DAX ABI for memmap_on_memory

The DAX drivers were missing sysfs ABI documentation entirely. Add this
missing documentation for the sysfs ABI for DAX regions and DAX devices
in patch 1. Switch to guard(device) semantics for Scope Based Resource
Management for device_{lock,unlock} flows in drivers/dax/bus.c in patch
2. Export mhp_supports_memmap_on_memory() in patch 3. Add a new ABI for
toggling memmap_on_memory semantics in patch 4.

The missing ABI was spotted in [1]. This series splits out the initial
documentation creation and places the new ABI additions behind it.

[1]: https://lore.kernel.org/linux-cxl/[email protected]/

---
This series depends on [2] which adds the definition for guard(device).
[2]: https://lore.kernel.org/r/170250854466.1522182.17555361077409628655.stgit@dwillia2-xfh.jf.intel.com

---

Other Logistics -

Andrew, would you prefer patch 3 to go through mm? Or through the dax
tree with an mm ack? The remaining patches are all contained to dax, but
do depend on the memmap_on_memory set that is currently in mm-stable.

---
Changes in v6:
- Use sysfs_emit() in memmap_on_memory_show() (Greg)
- Change the ABI documentation date for memmap_on_memory to January 2024
as that's likely when the 6.8 merge window will fall (Greg)
- Fix dev->driver check (Ying)
- Link to v5: https://lore.kernel.org/r/[email protected]

Changes in v5:
- Export and check mhp_supports_memmap_on_memory() in the DAX sysfs ABI
(David)
- Obtain dax_drv under the device lock (Ying)
- Check dax_drv for NULL before dereferencing it (Ying)
- Clean up some repetition in sysfs-bus-dax documentation entries
(Jonathan)
- A few additional cleanups enabled by guard(device) (Jonathan)
- Drop the DEFINE_GUARD() part of patch 2, add dependency on Dan's patch
above so it can be backported / applied separately (Jonathan, Dan)
- Link to v4: https://lore.kernel.org/r/[email protected]

Changes in v4:
- Hold the device lock when checking if the dax_dev is bound to kmem
(Ying, Dan)
- Remove dax region checks (and locks) as they were unnecessary.
- Introduce guard(device) for device lock/unlock (Dan)
- Convert the rest of drivers/dax/bus.c to guard(device)
- Link to v3: https://lore.kernel.org/r/[email protected]

Changes in v3:
- Fix typo in ABI docs (Zhijian Li)
- Add kernel config and module parameter dependencies to the ABI docs
entry (David Hildenbrand)
- Ensure kmem isn't active when setting the sysfs attribute (Ying
Huang)
- Simplify returning from memmap_on_memory_store()
- Link to v2: https://lore.kernel.org/r/[email protected]

Changes in v2:
- Fix CC lists, patch 1/2 didn't get sent correctly in v1
- Link to v1: https://lore.kernel.org/r/[email protected]

Cc: <[email protected]>
Cc: <[email protected]>
Cc: <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Huang Ying <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: <[email protected]>
To: Dan Williams <[email protected]>
To: Vishal Verma <[email protected]>
To: Dave Jiang <[email protected]>
To: Andrew Morton <[email protected]>
To: Oscar Salvador <[email protected]>

---
Vishal Verma (4):
Documentation/ABI: Add ABI documentation for sys-bus-dax
dax/bus: Use guard(device) in sysfs attribute helpers
mm/memory_hotplug: export mhp_supports_memmap_on_memory()
dax: add a sysfs knob to control memmap_on_memory behavior

include/linux/memory_hotplug.h | 6 ++
drivers/dax/bus.c | 179 +++++++++++++++++---------------
mm/memory_hotplug.c | 17 ++-
Documentation/ABI/testing/sysfs-bus-dax | 153 +++++++++++++++++++++++++++
4 files changed, 260 insertions(+), 95 deletions(-)
---
base-commit: a6e0c2ca980d75d5ac6b2902c5c0028eaf094db3
change-id: 20231025-vv-dax_abi-17a219c46076

Best regards,
--
Vishal Verma <[email protected]>



2023-12-15 05:26:16

by Vishal Verma

Subject: [PATCH v6 1/4] Documentation/ABI: Add ABI documentation for sys-bus-dax

Add the missing sysfs ABI documentation for the device DAX subsystem.
Various ABI attributes under this have been present since v5.1, and more
have been added over time. In preparation for adding a new attribute,
add this file with the historical details.

Cc: Dan Williams <[email protected]>
Signed-off-by: Vishal Verma <[email protected]>
---
Documentation/ABI/testing/sysfs-bus-dax | 136 ++++++++++++++++++++++++++++++++
1 file changed, 136 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-bus-dax b/Documentation/ABI/testing/sysfs-bus-dax
new file mode 100644
index 000000000000..6359f7bc9bf4
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-bus-dax
@@ -0,0 +1,136 @@
+What: /sys/bus/dax/devices/daxX.Y/align
+Date: October, 2020
+KernelVersion: v5.10
+Contact: [email protected]
+Description:
+ (RW) Provides a way to specify an alignment for a dax device.
+ Values allowed are constrained by the physical address ranges
+ that back the dax device, and also by arch requirements.
+
+What: /sys/bus/dax/devices/daxX.Y/mapping
+Date: October, 2020
+KernelVersion: v5.10
+Contact: [email protected]
+Description:
+ (WO) Provides a way to allocate a mapping range under a dax
+ device. Specified in the format <start>-<end>.
+
+What: /sys/bus/dax/devices/daxX.Y/mapping[0..N]/start
+What: /sys/bus/dax/devices/daxX.Y/mapping[0..N]/end
+What: /sys/bus/dax/devices/daxX.Y/mapping[0..N]/page_offset
+Date: October, 2020
+KernelVersion: v5.10
+Contact: [email protected]
+Description:
+ (RO) A dax device may have multiple constituent discontiguous
+ address ranges. These are represented by the different
+ 'mappingX' subdirectories. The 'start' attribute indicates the
+ start physical address for the given range. The 'end' attribute
+ indicates the end physical address for the given range. The
+ 'page_offset' attribute indicates the offset of the current
+ range in the dax device.
+
+What: /sys/bus/dax/devices/daxX.Y/resource
+Date: June, 2019
+KernelVersion: v5.3
+Contact: [email protected]
+Description:
+ (RO) The resource attribute indicates the starting physical
+ address of a dax device. In case of a device with multiple
+ constituent ranges, it indicates the starting address of the
+ first range.
+
+What: /sys/bus/dax/devices/daxX.Y/size
+Date: October, 2020
+KernelVersion: v5.10
+Contact: [email protected]
+Description:
+ (RW) The size attribute indicates the total size of a dax
+ device. For creating subdivided dax devices, or for resizing
+ an existing device, the new size can be written to this as
+ part of the reconfiguration process.
+
+What: /sys/bus/dax/devices/daxX.Y/numa_node
+Date: November, 2019
+KernelVersion: v5.5
+Contact: [email protected]
+Description:
+ (RO) If NUMA is enabled and the platform has affinitized the
+ backing device for this dax device, emit the CPU node
+ affinity for this device.
+
+What: /sys/bus/dax/devices/daxX.Y/target_node
+Date: February, 2019
+KernelVersion: v5.1
+Contact: [email protected]
+Description:
+ (RO) The target-node attribute is the Linux numa-node that a
+ device-dax instance may create when it is online. Prior to
+ being online the device's 'numa_node' property reflects the
+ closest online cpu node which is the typical expectation of a
+ device 'numa_node'. Once it is online it becomes its own
+ distinct numa node.
+
+What: $(readlink -f /sys/bus/dax/devices/daxX.Y)/../dax_region/available_size
+Date: October, 2020
+KernelVersion: v5.10
+Contact: [email protected]
+Description:
+ (RO) The available_size attribute tracks available dax region
+ capacity. This only applies to volatile hmem devices, not pmem
+ devices, since pmem devices are defined by nvdimm namespace
+ boundaries.
+
+What: $(readlink -f /sys/bus/dax/devices/daxX.Y)/../dax_region/size
+Date: July, 2017
+KernelVersion: v5.1
+Contact: [email protected]
+Description:
+ (RO) The size attribute indicates the size of a given dax region
+ in bytes.
+
+What: $(readlink -f /sys/bus/dax/devices/daxX.Y)/../dax_region/align
+Date: October, 2020
+KernelVersion: v5.10
+Contact: [email protected]
+Description:
+ (RO) The align attribute indicates alignment of the dax region.
+ Changes on align may not always be valid, when say certain
+ mappings were created with 2M and then we switch to 1G. This
+ validates all ranges against the new value being attempted, post
+ resizing.
+
+What: $(readlink -f /sys/bus/dax/devices/daxX.Y)/../dax_region/seed
+Date: October, 2020
+KernelVersion: v5.10
+Contact: [email protected]
+Description:
+ (RO) The seed device is a concept for dynamic dax regions to be
+ able to split the region amongst multiple sub-instances. The
+ seed device, similar to libnvdimm seed devices, is a device
+ that starts with zero capacity allocated and unbound to a
+ driver.
+
+What: $(readlink -f /sys/bus/dax/devices/daxX.Y)/../dax_region/create
+Date: October, 2020
+KernelVersion: v5.10
+Contact: [email protected]
+Description:
+ (RW) The create interface to the dax region provides a way to
+ create a new unconfigured dax device under the given region, which
+ can then be configured (with a size etc.) and then probed.
+
+What: $(readlink -f /sys/bus/dax/devices/daxX.Y)/../dax_region/delete
+Date: October, 2020
+KernelVersion: v5.10
+Contact: [email protected]
+Description:
+ (WO) The delete interface for a dax region provides for deletion
+ of any 0-sized and idle dax devices.
+
+What: $(readlink -f /sys/bus/dax/devices/daxX.Y)/../dax_region/id
+Date: July, 2017
+KernelVersion: v5.1
+Contact: [email protected]
+Description:
+ (RO) The id attribute indicates the region id of a dax region.

--
2.41.0


2023-12-15 05:26:18

by Vishal Verma

Subject: [PATCH v6 2/4] dax/bus: Use guard(device) in sysfs attribute helpers

Use the guard(device) macro to lock a 'struct device', and unlock it
automatically when going out of scope using Scope Based Resource
Management semantics. A lot of the sysfs attribute writes in
drivers/dax/bus.c benefit from a cleanup using these, so change these
where applicable.
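
For readers unfamiliar with the pattern, the following is a minimal user-space sketch of scope-based unlocking, assuming a compiler that supports the GCC/Clang `cleanup` variable attribute. It is not the kernel's guard(device) definition (that comes from the dependency noted in the cover letter); `fake_device`, `fake_lock()`, `fake_unlock()`, `FAKE_GUARD()` and `fake_store()` are hypothetical names invented for illustration. The flow loosely mirrors the region-then-device locking seen in the store helpers.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for 'struct device'; only lock state is modelled. */
struct fake_device {
	int locked;		/* 1 while the pretend device lock is held */
	int has_driver;		/* 1 if a driver is bound */
};

static struct fake_device *fake_lock(struct fake_device *dev)
{
	assert(!dev->locked);
	dev->locked = 1;
	return dev;
}

/* Cleanup handler: called with a pointer to the guard variable. */
static void fake_unlock(struct fake_device **devp)
{
	assert((*devp)->locked);
	(*devp)->locked = 0;
}

#define FAKE_CONCAT_(a, b) a##b
#define FAKE_CONCAT(a, b) FAKE_CONCAT_(a, b)

/* Lock now; unlock automatically when the enclosing scope is left. */
#define FAKE_GUARD(dev) \
	struct fake_device *FAKE_CONCAT(fake_guard_, __LINE__) \
		__attribute__((cleanup(fake_unlock), unused)) = fake_lock(dev)

/* Region lock first, then device lock; no explicit unlock on any path. */
static long fake_store(struct fake_device *region, struct fake_device *dev,
		       size_t len)
{
	FAKE_GUARD(region);
	if (!region->has_driver)
		return -ENXIO;	/* region lock is dropped on this return */

	FAKE_GUARD(dev);
	return (long)len;	/* device lock dropped first, then region */
}
```

The point of the sketch is that every early return releases whatever has been acquired so far, which is what lets the goto/unlock ladders in the old code collapse into plain returns.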

Cc: Joao Martins <[email protected]>
Cc: Dan Williams <[email protected]>
Signed-off-by: Vishal Verma <[email protected]>
---
drivers/dax/bus.c | 143 ++++++++++++++++++++++--------------------------------
1 file changed, 59 insertions(+), 84 deletions(-)

diff --git a/drivers/dax/bus.c b/drivers/dax/bus.c
index 1ff1ab5fa105..6226de131d17 100644
--- a/drivers/dax/bus.c
+++ b/drivers/dax/bus.c
@@ -294,13 +294,10 @@ static ssize_t available_size_show(struct device *dev,
struct device_attribute *attr, char *buf)
{
struct dax_region *dax_region = dev_get_drvdata(dev);
- unsigned long long size;

- device_lock(dev);
- size = dax_region_avail_size(dax_region);
- device_unlock(dev);
+ guard(device)(dev);

- return sprintf(buf, "%llu\n", size);
+ return sprintf(buf, "%llu\n", dax_region_avail_size(dax_region));
}
static DEVICE_ATTR_RO(available_size);

@@ -309,17 +306,14 @@ static ssize_t seed_show(struct device *dev,
{
struct dax_region *dax_region = dev_get_drvdata(dev);
struct device *seed;
- ssize_t rc;

if (is_static(dax_region))
return -EINVAL;

- device_lock(dev);
+ guard(device)(dev);
seed = dax_region->seed;
- rc = sprintf(buf, "%s\n", seed ? dev_name(seed) : "");
- device_unlock(dev);

- return rc;
+ return sprintf(buf, "%s\n", seed ? dev_name(seed) : "");
}
static DEVICE_ATTR_RO(seed);

@@ -328,24 +322,28 @@ static ssize_t create_show(struct device *dev,
{
struct dax_region *dax_region = dev_get_drvdata(dev);
struct device *youngest;
- ssize_t rc;

if (is_static(dax_region))
return -EINVAL;

- device_lock(dev);
+ guard(device)(dev);
youngest = dax_region->youngest;
- rc = sprintf(buf, "%s\n", youngest ? dev_name(youngest) : "");
- device_unlock(dev);

- return rc;
+ return sprintf(buf, "%s\n", youngest ? dev_name(youngest) : "");
}

static ssize_t create_store(struct device *dev, struct device_attribute *attr,
const char *buf, size_t len)
{
struct dax_region *dax_region = dev_get_drvdata(dev);
+ struct dev_dax_data data = {
+ .dax_region = dax_region,
+ .size = 0,
+ .id = -1,
+ .memmap_on_memory = false,
+ };
unsigned long long avail;
+ struct dev_dax *dev_dax;
ssize_t rc;
int val;

@@ -358,38 +356,25 @@ static ssize_t create_store(struct device *dev, struct device_attribute *attr,
if (val != 1)
return -EINVAL;

- device_lock(dev);
+ guard(device)(dev);
avail = dax_region_avail_size(dax_region);
if (avail == 0)
- rc = -ENOSPC;
- else {
- struct dev_dax_data data = {
- .dax_region = dax_region,
- .size = 0,
- .id = -1,
- .memmap_on_memory = false,
- };
- struct dev_dax *dev_dax = devm_create_dev_dax(&data);
+ return -ENOSPC;

- if (IS_ERR(dev_dax))
- rc = PTR_ERR(dev_dax);
- else {
- /*
- * In support of crafting multiple new devices
- * simultaneously multiple seeds can be created,
- * but only the first one that has not been
- * successfully bound is tracked as the region
- * seed.
- */
- if (!dax_region->seed)
- dax_region->seed = &dev_dax->dev;
- dax_region->youngest = &dev_dax->dev;
- rc = len;
- }
- }
- device_unlock(dev);
+ dev_dax = devm_create_dev_dax(&data);
+ if (IS_ERR(dev_dax))
+ return PTR_ERR(dev_dax);

- return rc;
+ /*
+ * In support of crafting multiple new devices simultaneously multiple
+ * seeds can be created, but only the first one that has not been
+ * successfully bound is tracked as the region seed.
+ */
+ if (!dax_region->seed)
+ dax_region->seed = &dev_dax->dev;
+ dax_region->youngest = &dev_dax->dev;
+
+ return len;
}
static DEVICE_ATTR_RW(create);

@@ -481,12 +466,9 @@ static int __free_dev_dax_id(struct dev_dax *dev_dax)
static int free_dev_dax_id(struct dev_dax *dev_dax)
{
struct device *dev = &dev_dax->dev;
- int rc;

- device_lock(dev);
- rc = __free_dev_dax_id(dev_dax);
- device_unlock(dev);
- return rc;
+ guard(device)(dev);
+ return __free_dev_dax_id(dev_dax);
}

static int alloc_dev_dax_id(struct dev_dax *dev_dax)
@@ -908,9 +890,8 @@ static ssize_t size_show(struct device *dev,
struct dev_dax *dev_dax = to_dev_dax(dev);
unsigned long long size;

- device_lock(dev);
+ guard(device)(dev);
size = dev_dax_size(dev_dax);
- device_unlock(dev);

return sprintf(buf, "%llu\n", size);
}
@@ -1080,17 +1061,16 @@ static ssize_t size_store(struct device *dev, struct device_attribute *attr,
return -EINVAL;
}

- device_lock(dax_region->dev);
- if (!dax_region->dev->driver) {
- device_unlock(dax_region->dev);
+ guard(device)(dax_region->dev);
+ if (!dax_region->dev->driver)
return -ENXIO;
- }
- device_lock(dev);
+
+ guard(device)(dev);
rc = dev_dax_resize(dax_region, dev_dax, val);
- device_unlock(dev);
- device_unlock(dax_region->dev);
+ if (rc)
+ return rc;

- return rc == 0 ? len : rc;
+ return len;
}
static DEVICE_ATTR_RW(size);

@@ -1137,21 +1117,20 @@ static ssize_t mapping_store(struct device *dev, struct device_attribute *attr,
if (rc)
return rc;

- rc = -ENXIO;
- device_lock(dax_region->dev);
- if (!dax_region->dev->driver) {
- device_unlock(dax_region->dev);
- return rc;
- }
- device_lock(dev);
+ guard(device)(dax_region->dev);
+ if (!dax_region->dev->driver)
+ return -ENXIO;

+ guard(device)(dev);
to_alloc = range_len(&r);
- if (alloc_is_aligned(dev_dax, to_alloc))
- rc = alloc_dev_dax_range(dev_dax, r.start, to_alloc);
- device_unlock(dev);
- device_unlock(dax_region->dev);
+ if (!alloc_is_aligned(dev_dax, to_alloc))
+ return -ENXIO;

- return rc == 0 ? len : rc;
+ rc = alloc_dev_dax_range(dev_dax, r.start, to_alloc);
+ if (rc)
+ return rc;
+
+ return len;
}
static DEVICE_ATTR_WO(mapping);

@@ -1196,27 +1175,23 @@ static ssize_t align_store(struct device *dev, struct device_attribute *attr,
if (!dax_align_valid(val))
return -EINVAL;

- device_lock(dax_region->dev);
- if (!dax_region->dev->driver) {
- device_unlock(dax_region->dev);
+ guard(device)(dax_region->dev);
+ if (!dax_region->dev->driver)
return -ENXIO;
- }

- device_lock(dev);
- if (dev->driver) {
- rc = -EBUSY;
- goto out_unlock;
- }
+ guard(device)(dev);
+ if (dev->driver)
+ return -EBUSY;

align_save = dev_dax->align;
dev_dax->align = val;
rc = dev_dax_validate_align(dev_dax);
- if (rc)
+ if (rc) {
dev_dax->align = align_save;
-out_unlock:
- device_unlock(dev);
- device_unlock(dax_region->dev);
- return rc == 0 ? len : rc;
+ return rc;
+ }
+
+ return len;
}
static DEVICE_ATTR_RW(align);


--
2.41.0


2023-12-15 05:26:26

by Vishal Verma

Subject: [PATCH v6 3/4] mm/memory_hotplug: export mhp_supports_memmap_on_memory()

In preparation for adding sysfs ABI to toggle memmap_on_memory semantics
for drivers adding memory, export the mhp_supports_memmap_on_memory()
helper. This allows drivers to check if memmap_on_memory support is
available before trying to request it, and display an appropriate
message if it isn't available. As part of this, remove the size argument
to this - with recent updates to allow memmap_on_memory for larger
ranges, and the internal splitting of altmaps into respective memory
blocks, the size argument is meaningless.

Cc: Andrew Morton <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Oscar Salvador <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: Dave Jiang <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Huang Ying <[email protected]>
Suggested-by: David Hildenbrand <[email protected]>
Acked-by: David Hildenbrand <[email protected]>
Signed-off-by: Vishal Verma <[email protected]>
---
include/linux/memory_hotplug.h | 6 ++++++
mm/memory_hotplug.c | 17 ++++++-----------
2 files changed, 12 insertions(+), 11 deletions(-)

diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
index 7d2076583494..ebc9d528f00c 100644
--- a/include/linux/memory_hotplug.h
+++ b/include/linux/memory_hotplug.h
@@ -121,6 +121,7 @@ struct mhp_params {

bool mhp_range_allowed(u64 start, u64 size, bool need_mapping);
struct range mhp_get_pluggable_range(bool need_mapping);
+bool mhp_supports_memmap_on_memory(void);

/*
* Zone resizing functions
@@ -262,6 +263,11 @@ static inline bool movable_node_is_enabled(void)
return false;
}

+static bool mhp_supports_memmap_on_memory(void)
+{
+ return false;
+}
+
static inline void pgdat_kswapd_lock(pg_data_t *pgdat) {}
static inline void pgdat_kswapd_unlock(pg_data_t *pgdat) {}
static inline void pgdat_kswapd_lock_init(pg_data_t *pgdat) {}
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index 926e1cfb10e9..751664c519f7 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1325,7 +1325,7 @@ static inline bool arch_supports_memmap_on_memory(unsigned long vmemmap_size)
}
#endif

-static bool mhp_supports_memmap_on_memory(unsigned long size)
+bool mhp_supports_memmap_on_memory(void)
{
unsigned long vmemmap_size = memory_block_memmap_size();
unsigned long memmap_pages = memory_block_memmap_on_memory_pages();
@@ -1334,17 +1334,11 @@ static bool mhp_supports_memmap_on_memory(unsigned long size)
* Besides having arch support and the feature enabled at runtime, we
* need a few more assumptions to hold true:
*
- * a) We span a single memory block: memory onlining/offlinin;g happens
- * in memory block granularity. We don't want the vmemmap of online
- * memory blocks to reside on offline memory blocks. In the future,
- * we might want to support variable-sized memory blocks to make the
- * feature more versatile.
- *
- * b) The vmemmap pages span complete PMDs: We don't want vmemmap code
+ * a) The vmemmap pages span complete PMDs: We don't want vmemmap code
* to populate memory from the altmap for unrelated parts (i.e.,
* other memory blocks)
*
- * c) The vmemmap pages (and thereby the pages that will be exposed to
+ * b) The vmemmap pages (and thereby the pages that will be exposed to
* the buddy) have to cover full pageblocks: memory onlining/offlining
* code requires applicable ranges to be page-aligned, for example, to
* set the migratetypes properly.
@@ -1356,7 +1350,7 @@ static bool mhp_supports_memmap_on_memory(unsigned long size)
* altmap as an alternative source of memory, and we do not exactly
* populate a single PMD.
*/
- if (!mhp_memmap_on_memory() || size != memory_block_size_bytes())
+ if (!mhp_memmap_on_memory())
return false;

/*
@@ -1379,6 +1373,7 @@ static bool mhp_supports_memmap_on_memory(unsigned long size)

return arch_supports_memmap_on_memory(vmemmap_size);
}
+EXPORT_SYMBOL_GPL(mhp_supports_memmap_on_memory);

static void __ref remove_memory_blocks_and_altmaps(u64 start, u64 size)
{
@@ -1512,7 +1507,7 @@ int __ref add_memory_resource(int nid, struct resource *res, mhp_t mhp_flags)
* Self hosted memmap array
*/
if ((mhp_flags & MHP_MEMMAP_ON_MEMORY) &&
- mhp_supports_memmap_on_memory(memory_block_size_bytes())) {
+ mhp_supports_memmap_on_memory()) {
ret = create_altmaps_and_memory_blocks(nid, group, start, size);
if (ret)
goto error;

--
2.41.0


2023-12-15 05:26:49

by Vishal Verma

Subject: [PATCH v6 4/4] dax: add a sysfs knob to control memmap_on_memory behavior

Add a sysfs knob for dax devices to control the memmap_on_memory setting
if the dax device were to be hotplugged as system memory.

The default memmap_on_memory setting for dax devices originating via
pmem or hmem is set to 'false' - i.e. no memmap_on_memory semantics, to
preserve legacy behavior. For dax devices via CXL, the default is on.
The sysfs control allows the administrator to override the above
defaults if needed.
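
As a rough illustration of the rules the new attribute enforces on writes, here is a small user-space model of the decision order used by memmap_on_memory_store() in the diff below. It is only a sketch: `fake_dev_dax`, `bound_to_kmem`, `mhp_supported` and `fake_memmap_store()` are hypothetical names, and the real code takes the device lock and queries the driver type rather than reading plain flags.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the state the sysfs store handler consults. */
struct fake_dev_dax {
	bool memmap_on_memory;	/* current setting */
	bool bound_to_kmem;	/* true once the kmem driver owns the device */
};

/*
 * Returns 'len' on success or a negative errno, in the same order as the
 * checks in the patch: feature availability first, then the busy check.
 */
static long fake_memmap_store(struct fake_dev_dax *d, bool val,
			      bool mhp_supported, size_t len)
{
	assert(d != NULL);

	/* Enabling requires memmap_on_memory support in memory hotplug. */
	if (val && !mhp_supported)
		return -EOPNOTSUPP;

	/* Changing the value is refused while kmem is bound. */
	if (d->memmap_on_memory != val && d->bound_to_kmem)
		return -EBUSY;

	d->memmap_on_memory = val;
	return (long)len;
}
```

Note that writing the value that is already set succeeds even when the device is bound to kmem, since only an actual change is rejected with -EBUSY.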

Cc: David Hildenbrand <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: Dave Jiang <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Huang Ying <[email protected]>
Tested-by: Li Zhijian <[email protected]>
Reviewed-by: Jonathan Cameron <[email protected]>
Reviewed-by: David Hildenbrand <[email protected]>
Signed-off-by: Vishal Verma <[email protected]>
---
drivers/dax/bus.c | 36 +++++++++++++++++++++++++++++++++
Documentation/ABI/testing/sysfs-bus-dax | 17 ++++++++++++++++
2 files changed, 53 insertions(+)

diff --git a/drivers/dax/bus.c b/drivers/dax/bus.c
index 6226de131d17..3622b3d1c0de 100644
--- a/drivers/dax/bus.c
+++ b/drivers/dax/bus.c
@@ -1245,6 +1245,41 @@ static ssize_t numa_node_show(struct device *dev,
}
static DEVICE_ATTR_RO(numa_node);

+static ssize_t memmap_on_memory_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ struct dev_dax *dev_dax = to_dev_dax(dev);
+
+ return sysfs_emit(buf, "%d\n", dev_dax->memmap_on_memory);
+}
+
+static ssize_t memmap_on_memory_store(struct device *dev,
+ struct device_attribute *attr,
+ const char *buf, size_t len)
+{
+ struct dev_dax *dev_dax = to_dev_dax(dev);
+ ssize_t rc;
+ bool val;
+
+ rc = kstrtobool(buf, &val);
+ if (rc)
+ return rc;
+
+ if (val == true && !mhp_supports_memmap_on_memory()) {
+ dev_dbg(dev, "memmap_on_memory is not available\n");
+ return -EOPNOTSUPP;
+ }
+
+ guard(device)(dev);
+ if (dev_dax->memmap_on_memory != val && dev->driver &&
+ to_dax_drv(dev->driver)->type == DAXDRV_KMEM_TYPE)
+ return -EBUSY;
+ dev_dax->memmap_on_memory = val;
+
+ return len;
+}
+static DEVICE_ATTR_RW(memmap_on_memory);
+
static umode_t dev_dax_visible(struct kobject *kobj, struct attribute *a, int n)
{
struct device *dev = container_of(kobj, struct device, kobj);
@@ -1271,6 +1306,7 @@ static struct attribute *dev_dax_attributes[] = {
&dev_attr_align.attr,
&dev_attr_resource.attr,
&dev_attr_numa_node.attr,
+ &dev_attr_memmap_on_memory.attr,
NULL,
};

diff --git a/Documentation/ABI/testing/sysfs-bus-dax b/Documentation/ABI/testing/sysfs-bus-dax
index 6359f7bc9bf4..b34266bfae49 100644
--- a/Documentation/ABI/testing/sysfs-bus-dax
+++ b/Documentation/ABI/testing/sysfs-bus-dax
@@ -134,3 +134,20 @@ KernelVersion: v5.1
Contact: [email protected]
Description:
(RO) The id attribute indicates the region id of a dax region.
+
+What: /sys/bus/dax/devices/daxX.Y/memmap_on_memory
+Date: January, 2024
+KernelVersion: v6.8
+Contact: [email protected]
+Description:
+ (RW) Control the memmap_on_memory setting if the dax device
+ were to be hotplugged as system memory. This determines whether
+ the 'altmap' for the hotplugged memory will be placed on the
+ device being hotplugged (memmap_on_memory=1) or if it will be
+ placed on regular memory (memmap_on_memory=0). This attribute
+ must be set before the device is handed over to the 'kmem'
+ driver (i.e. hotplugged into system-ram). Additionally, this
+ depends on CONFIG_MHP_MEMMAP_ON_MEMORY, and a globally enabled
+ memmap_on_memory parameter for memory_hotplug. This is
+ typically set on the kernel command line -
+ memory_hotplug.memmap_on_memory set to 'true' or 'force'.

--
2.41.0


2023-12-15 05:56:49

by Matthew Wilcox

Subject: Re: [PATCH v6 2/4] dax/bus: Use guard(device) in sysfs attribute helpers

On Thu, Dec 14, 2023 at 10:25:27PM -0700, Vishal Verma wrote:
> @@ -294,13 +294,10 @@ static ssize_t available_size_show(struct device *dev,
> struct device_attribute *attr, char *buf)
> {
> struct dax_region *dax_region = dev_get_drvdata(dev);
> - unsigned long long size;
>
> - device_lock(dev);
> - size = dax_region_avail_size(dax_region);
> - device_unlock(dev);
> + guard(device)(dev);
>
> - return sprintf(buf, "%llu\n", size);
> + return sprintf(buf, "%llu\n", dax_region_avail_size(dax_region));
> }

Is this an appropriate use of guard()? sprintf is not the fastest of
functions, so we will end up holding the device_lock for longer than
we used to.

> @@ -908,9 +890,8 @@ static ssize_t size_show(struct device *dev,
> struct dev_dax *dev_dax = to_dev_dax(dev);
> unsigned long long size;
>
> - device_lock(dev);
> + guard(device)(dev);
> size = dev_dax_size(dev_dax);
> - device_unlock(dev);
>
> return sprintf(buf, "%llu\n", size);
> }

If it is appropriate, then you can do without the 'size' variable here.

> @@ -1137,21 +1117,20 @@ static ssize_t mapping_store(struct device *dev, struct device_attribute *attr,
> if (rc)
> return rc;
>
> - rc = -ENXIO;
> - device_lock(dax_region->dev);
> - if (!dax_region->dev->driver) {
> - device_unlock(dax_region->dev);
> - return rc;
> - }
> - device_lock(dev);
> + guard(device)(dax_region->dev);
> + if (!dax_region->dev->driver)
> + return -ENXIO;
>
> + guard(device)(dev);
> to_alloc = range_len(&r);
> - if (alloc_is_aligned(dev_dax, to_alloc))
> - rc = alloc_dev_dax_range(dev_dax, r.start, to_alloc);
> - device_unlock(dev);
> - device_unlock(dax_region->dev);
> + if (!alloc_is_aligned(dev_dax, to_alloc))
> + return -ENXIO;
>
> - return rc == 0 ? len : rc;
> + rc = alloc_dev_dax_range(dev_dax, r.start, to_alloc);
> + if (rc)
> + return rc;
> +
> + return len;
> }

Have I mentioned how much I hate the "rc" naming convention? It tells
you nothing useful about the contents of the variable. If you called it
'err', I'd know it was an error, and then the end of this function would
make sense.

if (err)
return err;
return len;


2023-12-15 06:34:14

by Vishal Verma

Subject: Re: [PATCH v6 2/4] dax/bus: Use guard(device) in sysfs attribute helpers

On Fri, 2023-12-15 at 05:56 +0000, Matthew Wilcox wrote:
> On Thu, Dec 14, 2023 at 10:25:27PM -0700, Vishal Verma wrote:
> > @@ -294,13 +294,10 @@ static ssize_t available_size_show(struct device *dev,
> >                 struct device_attribute *attr, char *buf)
> >  {
> >         struct dax_region *dax_region = dev_get_drvdata(dev);
> > -       unsigned long long size;
> >  
> > -       device_lock(dev);
> > -       size = dax_region_avail_size(dax_region);
> > -       device_unlock(dev);
> > +       guard(device)(dev);
> >  
> > -       return sprintf(buf, "%llu\n", size);
> > +       return sprintf(buf, "%llu\n", dax_region_avail_size(dax_region));
> >  }
>
> Is this an appropriate use of guard()?  sprintf is not the fastest of
> functions, so we will end up holding the device_lock for longer than
> we used to.

Hi Matthew,

Agreed that we end up holding the lock for a bit longer in many of
these. I'm inclined to say this is okay, since these are all user
configuration paths through sysfs, not affecting any sort of runtime
performance.
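
For comparison, if the longer hold time ever did matter, one option would be to confine the guard to a nested block so that only the read happens under the lock. The sketch below is a user-space illustration only, assuming the GCC/Clang `cleanup` attribute; `toy_device`, `toy_lock()`, `toy_unlock()`, `TOY_GUARD()` and `toy_size_show()` are hypothetical names and not part of this series.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical device model: 'locked' stands in for the device lock. */
struct toy_device {
	int locked;
	unsigned long long size;
};

static struct toy_device *toy_lock(struct toy_device *dev)
{
	assert(!dev->locked);
	dev->locked = 1;
	return dev;
}

/* Cleanup handler run when the guard variable leaves scope. */
static void toy_unlock(struct toy_device **devp)
{
	(*devp)->locked = 0;
}

#define TOY_GUARD(dev) \
	struct toy_device *toy_guard \
		__attribute__((cleanup(toy_unlock), unused)) = toy_lock(dev)

static int toy_size_show(struct toy_device *dev, char *buf, size_t len)
{
	unsigned long long size;

	{
		TOY_GUARD(dev);
		size = dev->size;	/* only the read is under the lock */
	}

	assert(!dev->locked);	/* lock already dropped before formatting */
	return snprintf(buf, len, "%llu\n", size);
}
```

This keeps the local variable that the patch removes, so it trades some of the cleanup for a shorter critical section.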

>
> > @@ -908,9 +890,8 @@ static ssize_t size_show(struct device *dev,
> >         struct dev_dax *dev_dax = to_dev_dax(dev);
> >         unsigned long long size;
> >  
> > -       device_lock(dev);
> > +       guard(device)(dev);
> >         size = dev_dax_size(dev_dax);
> > -       device_unlock(dev);
> >  
> >         return sprintf(buf, "%llu\n", size);
> >  }
>
> If it is appropriate, then you can do without the 'size' variable here.

Yep will remove. I suppose a lot of these can also switch to sysfs_emit
as Greg pointed out in a previous posting. I can add that as a separate
cleanup patch.

>
> > @@ -1137,21 +1117,20 @@ static ssize_t mapping_store(struct device *dev, struct device_attribute *attr,
> >         if (rc)
> >                 return rc;
> >  
> > -       rc = -ENXIO;
> > -       device_lock(dax_region->dev);
> > -       if (!dax_region->dev->driver) {
> > -               device_unlock(dax_region->dev);
> > -               return rc;
> > -       }
> > -       device_lock(dev);
> > +       guard(device)(dax_region->dev);
> > +       if (!dax_region->dev->driver)
> > +               return -ENXIO;
> >  
> > +       guard(device)(dev);
> >         to_alloc = range_len(&r);
> > -       if (alloc_is_aligned(dev_dax, to_alloc))
> > -               rc = alloc_dev_dax_range(dev_dax, r.start, to_alloc);
> > -       device_unlock(dev);
> > -       device_unlock(dax_region->dev);
> > +       if (!alloc_is_aligned(dev_dax, to_alloc))
> > +               return -ENXIO;
> >  
> > -       return rc == 0 ? len : rc;
> > +       rc = alloc_dev_dax_range(dev_dax, r.start, to_alloc);
> > +       if (rc)
> > +               return rc;
> > +
> > +       return len;
> >  }
>
> Have I mentioned how much I hate the "rc" naming convention?  It tells
> you nothing useful about the contents of the variable.  If you called it
> 'err', I'd know it was an error, and then the end of this function would
> make sense.
>
>         if (err)
>                 return err;
>         return len;
>
I'm a little hesitant to change this because the 'rc' convention is
used all over this file, and while I don't mind making this change for
the bits I touch in this patch, it would just result in a mix of 'rc'
and 'err' in this file.

2023-12-15 07:27:57

by Greg Kroah-Hartman

Subject: Re: [PATCH v6 2/4] dax/bus: Use guard(device) in sysfs attribute helpers

On Thu, Dec 14, 2023 at 10:25:27PM -0700, Vishal Verma wrote:
> Use the guard(device) macro to lock a 'struct device', and unlock it
> automatically when going out of scope using Scope Based Resource
> Management semantics. A lot of the sysfs attribute writes in
> drivers/dax/bus.c benefit from a cleanup using these, so change these
> where applicable.

Wait, why are you needing to call device_lock() at all here? Why is dax
special in needing this when no other subsystem requires it?

>
> Cc: Joao Martins <[email protected]>
> Cc: Dan Williams <[email protected]>
> Signed-off-by: Vishal Verma <[email protected]>
> ---
> drivers/dax/bus.c | 143 ++++++++++++++++++++++--------------------------------
> 1 file changed, 59 insertions(+), 84 deletions(-)
>
> diff --git a/drivers/dax/bus.c b/drivers/dax/bus.c
> index 1ff1ab5fa105..6226de131d17 100644
> --- a/drivers/dax/bus.c
> +++ b/drivers/dax/bus.c
> @@ -294,13 +294,10 @@ static ssize_t available_size_show(struct device *dev,
> struct device_attribute *attr, char *buf)
> {
> struct dax_region *dax_region = dev_get_drvdata(dev);
> - unsigned long long size;
>
> - device_lock(dev);
> - size = dax_region_avail_size(dax_region);
> - device_unlock(dev);
> + guard(device)(dev);

You have a valid device here, why are you locking it? How can it go
away? And if it can, shouldn't you have a local lock for it, and not
abuse the driver core lock?

>
> - return sprintf(buf, "%llu\n", size);
> + return sprintf(buf, "%llu\n", dax_region_avail_size(dax_region));

sysfs_emit() everywhere please.

But again, the issue is "why do you need a lock"?

thanks,

greg k-h

2023-12-15 07:55:24

by Huang, Ying

Subject: Re: [PATCH v6 4/4] dax: add a sysfs knob to control memmap_on_memory behavior

Vishal Verma <[email protected]> writes:

> Add a sysfs knob for dax devices to control the memmap_on_memory setting
> if the dax device were to be hotplugged as system memory.
>
> The default memmap_on_memory setting for dax devices originating via
> pmem or hmem is set to 'false' - i.e. no memmap_on_memory semantics, to
> preserve legacy behavior. For dax devices via CXL, the default is on.
> The sysfs control allows the administrator to override the above
> defaults if needed.
>
> Cc: David Hildenbrand <[email protected]>
> Cc: Dan Williams <[email protected]>
> Cc: Dave Jiang <[email protected]>
> Cc: Dave Hansen <[email protected]>
> Cc: Huang Ying <[email protected]>
> Tested-by: Li Zhijian <[email protected]>
> Reviewed-by: Jonathan Cameron <[email protected]>
> Reviewed-by: David Hildenbrand <[email protected]>
> Signed-off-by: Vishal Verma <[email protected]>

Looks good to me! Thanks!

Reviewed-by: "Huang, Ying" <[email protected]>

> ---
> drivers/dax/bus.c | 36 +++++++++++++++++++++++++++++++++
> Documentation/ABI/testing/sysfs-bus-dax | 17 ++++++++++++++++
> 2 files changed, 53 insertions(+)
>
> diff --git a/drivers/dax/bus.c b/drivers/dax/bus.c
> index 6226de131d17..3622b3d1c0de 100644
> --- a/drivers/dax/bus.c
> +++ b/drivers/dax/bus.c
> @@ -1245,6 +1245,41 @@ static ssize_t numa_node_show(struct device *dev,
> }
> static DEVICE_ATTR_RO(numa_node);
>
> +static ssize_t memmap_on_memory_show(struct device *dev,
> + struct device_attribute *attr, char *buf)
> +{
> + struct dev_dax *dev_dax = to_dev_dax(dev);
> +
> + return sysfs_emit(buf, "%d\n", dev_dax->memmap_on_memory);
> +}
> +
> +static ssize_t memmap_on_memory_store(struct device *dev,
> + struct device_attribute *attr,
> + const char *buf, size_t len)
> +{
> + struct dev_dax *dev_dax = to_dev_dax(dev);
> + ssize_t rc;
> + bool val;
> +
> + rc = kstrtobool(buf, &val);
> + if (rc)
> + return rc;
> +
> + if (val == true && !mhp_supports_memmap_on_memory()) {
> + dev_dbg(dev, "memmap_on_memory is not available\n");
> + return -EOPNOTSUPP;
> + }
> +
> + guard(device)(dev);
> + if (dev_dax->memmap_on_memory != val && dev->driver &&
> + to_dax_drv(dev->driver)->type == DAXDRV_KMEM_TYPE)
> + return -EBUSY;
> + dev_dax->memmap_on_memory = val;
> +
> + return len;
> +}
> +static DEVICE_ATTR_RW(memmap_on_memory);
> +
> static umode_t dev_dax_visible(struct kobject *kobj, struct attribute *a, int n)
> {
> struct device *dev = container_of(kobj, struct device, kobj);
> @@ -1271,6 +1306,7 @@ static struct attribute *dev_dax_attributes[] = {
> &dev_attr_align.attr,
> &dev_attr_resource.attr,
> &dev_attr_numa_node.attr,
> + &dev_attr_memmap_on_memory.attr,
> NULL,
> };
>
> diff --git a/Documentation/ABI/testing/sysfs-bus-dax b/Documentation/ABI/testing/sysfs-bus-dax
> index 6359f7bc9bf4..b34266bfae49 100644
> --- a/Documentation/ABI/testing/sysfs-bus-dax
> +++ b/Documentation/ABI/testing/sysfs-bus-dax
> @@ -134,3 +134,20 @@ KernelVersion: v5.1
> Contact: [email protected]
> Description:
> (RO) The id attribute indicates the region id of a dax region.
> +
> +What: /sys/bus/dax/devices/daxX.Y/memmap_on_memory
> +Date: January, 2024
> +KernelVersion: v6.8
> +Contact: [email protected]
> +Description:
> + (RW) Control the memmap_on_memory setting if the dax device
> + were to be hotplugged as system memory. This determines whether
> + the 'altmap' for the hotplugged memory will be placed on the
> + device being hotplugged (memmap_on_memory=1) or if it will be
> + placed on regular memory (memmap_on_memory=0). This attribute
> + must be set before the device is handed over to the 'kmem'
> + driver (i.e. hotplugged into system-ram). Additionally, this
> + depends on CONFIG_MHP_MEMMAP_ON_MEMORY, and a globally enabled
> + memmap_on_memory parameter for memory_hotplug. This is
> + typically set on the kernel command line -
> + memory_hotplug.memmap_on_memory set to 'true' or 'force'.

2023-12-15 16:25:28

by Greg Kroah-Hartman

Subject: Re: [PATCH v6 2/4] dax/bus: Use guard(device) in sysfs attribute helpers

On Fri, Dec 15, 2023 at 06:33:58AM +0000, Verma, Vishal L wrote:
> On Fri, 2023-12-15 at 05:56 +0000, Matthew Wilcox wrote:
> > On Thu, Dec 14, 2023 at 10:25:27PM -0700, Vishal Verma wrote:
> > > @@ -294,13 +294,10 @@ static ssize_t available_size_show(struct device *dev,
> > >                 struct device_attribute *attr, char *buf)
> > >  {
> > >         struct dax_region *dax_region = dev_get_drvdata(dev);
> > > -       unsigned long long size;
> > >  
> > > -       device_lock(dev);
> > > -       size = dax_region_avail_size(dax_region);
> > > -       device_unlock(dev);
> > > +       guard(device)(dev);
> > >  
> > > -       return sprintf(buf, "%llu\n", size);
> > > +       return sprintf(buf, "%llu\n", dax_region_avail_size(dax_region));
> > >  }
> >
> > Is this an appropriate use of guard()?  sprintf is not the fastest of
> > functions, so we will end up holding the device_lock for longer than
> > we used to.
>
> Hi Matthew,
>
> Agreed that we end up holding the lock for a bit longer in many of
> these. I'm inclined to say this is okay, since these are all user
> configuration paths through sysfs, not affecting any sort of runtime
> performance.

Why does the lock have to be taken at all? You have a valid reference,
isn't that all you need?

thanks,

greg k-h
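
The lock-hold-time concern raised above can be illustrated outside the kernel. The following is a minimal userspace sketch, assuming GCC or Clang (it relies on the cleanup attribute, the same compiler feature the kernel's guard() is built on) and a pthread mutex as a stand-in for the device lock. The names struct region, REGION_GUARD, show_wide and show_narrow are hypothetical and are not part of drivers/dax/bus.c.

```c
#include <assert.h>
#include <pthread.h>
#include <stdio.h>

/* Hypothetical stand-in for a kernel object protected by a lock. */
struct region {
	pthread_mutex_t lock;
	unsigned long long avail;
	int locked;	/* demo bookkeeping only: 1 while the lock is held */
};

static struct region *region_lock(struct region *r)
{
	int rc = pthread_mutex_lock(&r->lock);

	assert(rc == 0);
	(void)rc;
	r->locked = 1;
	return r;
}

/* Cleanup handler: runs automatically when the guard variable leaves scope. */
static void region_unlock(struct region **r)
{
	(*r)->locked = 0;
	int rc = pthread_mutex_unlock(&(*r)->lock);

	assert(rc == 0);
	(void)rc;
}

/* One guard per scope; unlocks at the end of the enclosing block. */
#define REGION_GUARD(r) \
	struct region *region_guard_ \
		__attribute__((cleanup(region_unlock), unused)) = region_lock(r)

/* Records whether the lock is held at the moment formatting happens.
 * Reading r->locked unlocked is only acceptable in this single-threaded demo. */
static int format_size(struct region *r, char *buf, size_t len,
		       unsigned long long v, int *held)
{
	*held = r->locked;
	return snprintf(buf, len, "%llu\n", v);
}

/* Guard at function scope: the lock is still held while formatting. */
static int show_wide(struct region *r, char *buf, size_t len, int *held)
{
	REGION_GUARD(r);
	return format_size(r, buf, len, r->avail, held);
}

/* Guard in an inner block: the lock is dropped before formatting. */
static int show_narrow(struct region *r, char *buf, size_t len, int *held)
{
	unsigned long long size;

	{
		REGION_GUARD(r);
		size = r->avail;
	}
	return format_size(r, buf, len, size, held);
}
```

In show_wide() the return expression is evaluated before the cleanup handler runs, so the formatting call executes under the lock. show_narrow() keeps the old hold time at the cost of the extra local variable that the guard() conversion removed.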

2023-12-15 17:27:19

by Dan Williams

Subject: Re: [PATCH v6 2/4] dax/bus: Use guard(device) in sysfs attribute helpers

Greg Kroah-Hartman wrote:
> On Thu, Dec 14, 2023 at 10:25:27PM -0700, Vishal Verma wrote:
> > Use the guard(device) macro to lock a 'struct device', and unlock it
> > automatically when going out of scope using Scope Based Resource
> > Management semantics. A lot of the sysfs attribute writes in
> > drivers/dax/bus.c benefit from a cleanup using these, so change these
> > where applicable.
>
> Wait, why are you needing to call device_lock() at all here? Why is dax
> special in needing this when no other subsystem requires it?
>
> >
> > Cc: Joao Martins <[email protected]>
> > Cc: Dan Williams <[email protected]>
> > Signed-off-by: Vishal Verma <[email protected]>
> > ---
> > drivers/dax/bus.c | 143 ++++++++++++++++++++++--------------------------------
> > 1 file changed, 59 insertions(+), 84 deletions(-)
> >
> > diff --git a/drivers/dax/bus.c b/drivers/dax/bus.c
> > index 1ff1ab5fa105..6226de131d17 100644
> > --- a/drivers/dax/bus.c
> > +++ b/drivers/dax/bus.c
> > @@ -294,13 +294,10 @@ static ssize_t available_size_show(struct device *dev,
> > struct device_attribute *attr, char *buf)
> > {
> > struct dax_region *dax_region = dev_get_drvdata(dev);
> > - unsigned long long size;
> >
> > - device_lock(dev);
> > - size = dax_region_avail_size(dax_region);
> > - device_unlock(dev);
> > + guard(device)(dev);
>
> You have a valid device here, why are you locking it? How can it go
> away? And if it can, shouldn't you have a local lock for it, and not
> abuse the driver core lock?

Yes, this is a driver-core lock abuse written by someone who should have
known better. And yes, a local lock to protect the dax_region resource
tree should replace this. A new rwsem to synchronize all list walks
seems appropriate.

2023-12-15 17:33:07

by Vishal Verma

Subject: Re: [PATCH v6 2/4] dax/bus: Use guard(device) in sysfs attribute helpers

On Fri, 2023-12-15 at 09:15 -0800, Dan Williams wrote:
> Greg Kroah-Hartman wrote:
> > On Thu, Dec 14, 2023 at 10:25:27PM -0700, Vishal Verma wrote:
> > > Use the guard(device) macro to lock a 'struct device', and unlock it
> > > automatically when going out of scope using Scope Based Resource
> > > Management semantics. A lot of the sysfs attribute writes in
> > > drivers/dax/bus.c benefit from a cleanup using these, so change these
> > > where applicable.
> >
> > Wait, why are you needing to call device_lock() at all here?  Why is dax
> > special in needing this when no other subsystem requires it?
> >
> > >
> > > Cc: Joao Martins <[email protected]>
> > > Cc: Dan Williams <[email protected]>
> > > Signed-off-by: Vishal Verma <[email protected]>
> > > ---
> > >  drivers/dax/bus.c | 143 ++++++++++++++++++++++--------------------------------
> > >  1 file changed, 59 insertions(+), 84 deletions(-)
> > >
> > > diff --git a/drivers/dax/bus.c b/drivers/dax/bus.c
> > > index 1ff1ab5fa105..6226de131d17 100644
> > > --- a/drivers/dax/bus.c
> > > +++ b/drivers/dax/bus.c
> > > @@ -294,13 +294,10 @@ static ssize_t available_size_show(struct device *dev,
> > >                 struct device_attribute *attr, char *buf)
> > >  {
> > >         struct dax_region *dax_region = dev_get_drvdata(dev);
> > > -       unsigned long long size;
> > >  
> > > -       device_lock(dev);
> > > -       size = dax_region_avail_size(dax_region);
> > > -       device_unlock(dev);
> > > +       guard(device)(dev);
> >
> > You have a valid device here, why are you locking it?  How can it go
> > away?  And if it can, shouldn't you have a local lock for it, and not
> > abuse the driver core lock?
>
> Yes, this is a driver-core lock abuse written by someone who should have
> known better. And yes, a local lock to protect the dax_region resource
> tree should replace this. A new rwsem to synchronize all list walks
> seems appropriate.

I see why _a_ lock is needed both here and in size_show() - the size
calculations do a walk over discontiguous ranges, and we don't want the
device to get reconfigured in the middle of that. A different local
lock seems reasonable - however can that go as a separate cleanup that
stands on its own?

For this series, I'll add a cleanup to replace the sprintfs with
sysfs_emit().
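
For background on the sprintf() to sysfs_emit() request: a sysfs show() callback is handed a PAGE_SIZE buffer, and sysfs_emit() bounds its output to that size where a bare sprintf() does not. The sketch below is a userspace approximation of that bounding only, not the kernel implementation. emit_bounded() and PAGE_SIZE_ASSUMED are hypothetical names, and the 4096-byte page size is an assumption.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

#define PAGE_SIZE_ASSUMED 4096	/* assumption: 4 KiB pages */

/*
 * Format into a buffer that is assumed to be PAGE_SIZE_ASSUMED bytes long.
 * Returns the number of characters actually stored, excluding the
 * terminating NUL, so the result never exceeds PAGE_SIZE_ASSUMED - 1.
 */
static int emit_bounded(char *buf, const char *fmt, ...)
{
	va_list args;
	int len;

	assert(buf && fmt);

	va_start(args, fmt);
	len = vsnprintf(buf, PAGE_SIZE_ASSUMED, fmt, args);
	va_end(args);

	if (len < 0)
		return 0;	/* encoding error: report nothing written */
	if (len >= PAGE_SIZE_ASSUMED)
		len = PAGE_SIZE_ASSUMED - 1;	/* output was truncated */
	return len;
}
```

The return value is clamped because vsnprintf() reports the length the output would have had, not what fit; a show() callback has to return the number of bytes actually placed in the buffer.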

2023-12-15 17:53:36

by Greg Kroah-Hartman

Subject: Re: [PATCH v6 2/4] dax/bus: Use guard(device) in sysfs attribute helpers

On Fri, Dec 15, 2023 at 05:32:50PM +0000, Verma, Vishal L wrote:
> On Fri, 2023-12-15 at 09:15 -0800, Dan Williams wrote:
> > Greg Kroah-Hartman wrote:
> > > On Thu, Dec 14, 2023 at 10:25:27PM -0700, Vishal Verma wrote:
> > > > Use the guard(device) macro to lock a 'struct device', and unlock it
> > > > automatically when going out of scope using Scope Based Resource
> > > > Management semantics. A lot of the sysfs attribute writes in
> > > > drivers/dax/bus.c benefit from a cleanup using these, so change these
> > > > where applicable.
> > >
> > > > Wait, why are you needing to call device_lock() at all here?  Why is dax
> > > special in needing this when no other subsystem requires it?
> > >
> > > >
> > > > Cc: Joao Martins <[email protected]>
> > > > Cc: Dan Williams <[email protected]>
> > > > Signed-off-by: Vishal Verma <[email protected]>
> > > > ---
> > > > >  drivers/dax/bus.c | 143 ++++++++++++++++++++++--------------------------------
> > > > >  1 file changed, 59 insertions(+), 84 deletions(-)
> > > >
> > > > diff --git a/drivers/dax/bus.c b/drivers/dax/bus.c
> > > > index 1ff1ab5fa105..6226de131d17 100644
> > > > --- a/drivers/dax/bus.c
> > > > +++ b/drivers/dax/bus.c
> > > > @@ -294,13 +294,10 @@ static ssize_t available_size_show(struct device *dev,
> > > > ????????????????struct device_attribute *attr, char *buf)
> > > > ?{
> > > > ????????struct dax_region *dax_region = dev_get_drvdata(dev);
> > > > -???????unsigned long long size;
> > > > ?
> > > > -???????device_lock(dev);
> > > > -???????size = dax_region_avail_size(dax_region);
> > > > -???????device_unlock(dev);
> > > > +???????guard(device)(dev);
> > >
> > > You have a valid device here, why are you locking it?? How can it go
> > > away?? And if it can, shouldn't you have a local lock for it, and not
> > > abuse the driver core lock?
> >
> > Yes, this is a driver-core lock abuse written by someone who should have
> > known better. And yes, a local lock to protect the dax_region resource
> > tree should replace this. A new rwsem to synchronize all list walks
> > seems appropriate.
>
> I see why _a_ lock is needed both here and in size_show() - the size
> calculations do a walk over discontiguous ranges, and we don't want the
> device to get reconfigured in the middle of that. A different local
> lock seems reasonable - however can that go as a separate cleanup that
> stands on its own?

Sure, but do not add a conversion to use guard(device) here, as that
will be pointless as you will just use a real lock instead.

> For this series, I'll add a cleanup to replace the sprintfs with
> sysfs_emit().

Why not have that be the first patch in the series? Then add your local
lock and convert everything to use it instead of the device lock?

thanks,

greg k-h

2023-12-19 15:29:12

by Jonathan Cameron

Subject: Re: [PATCH v6 2/4] dax/bus: Use guard(device) in sysfs attribute helpers

On Thu, 14 Dec 2023 22:25:27 -0700
Vishal Verma <[email protected]> wrote:

> Use the guard(device) macro to lock a 'struct device', and unlock it
> automatically when going out of scope using Scope Based Resource
> Management semantics. A lot of the sysfs attribute writes in
> drivers/dax/bus.c benefit from a cleanup using these, so change these
> where applicable.
>
> Cc: Joao Martins <[email protected]>
> Cc: Dan Williams <[email protected]>
> Signed-off-by: Vishal Verma <[email protected]>
Hi Vishal,

A few really minor suggestions inline if you happen to be doing a v7.
Either way
Reviewed-by: Jonathan Cameron <[email protected]>

>
> @@ -481,12 +466,9 @@ static int __free_dev_dax_id(struct dev_dax *dev_dax)
> static int free_dev_dax_id(struct dev_dax *dev_dax)
> {
> struct device *dev = &dev_dax->dev;
> - int rc;
>
> - device_lock(dev);
> - rc = __free_dev_dax_id(dev_dax);
> - device_unlock(dev);
> - return rc;
> + guard(device)(dev);

guard(device)(&dev_dax->dev); /* Only one user now */

> + return __free_dev_dax_id(dev_dax);
> }
>
> static int alloc_dev_dax_id(struct dev_dax *dev_dax)
> @@ -908,9 +890,8 @@ static ssize_t size_show(struct device *dev,
> struct dev_dax *dev_dax = to_dev_dax(dev);
> unsigned long long size;
>
> - device_lock(dev);
> + guard(device)(dev);
> size = dev_dax_size(dev_dax);
> - device_unlock(dev);
>
> return sprintf(buf, "%llu\n", size);
Might as well make this

guard(device)(dev);
return sprintf(buf, "%llu\n", dev_dax_size(to_dev_dax(dev)));

> }