2024-02-12 16:36:10

by Mathieu Desnoyers

[permalink] [raw]
Subject: [PATCH v5 0/8] Introduce cpu_dcache_is_aliasing() to fix DAX regression

This commit introduced in v4.0 prevents building FS_DAX on 32-bit ARM,
even on ARMv7 which does not have virtually aliased data caches:

commit d92576f1167c ("dax: does not work correctly with virtual aliasing caches")

It used to work fine before: I have customers using DAX over pmem on
ARMv7, but this regression will likely prevent them from upgrading their
kernel.

The root of the issue here is the fact that DAX was never designed to
handle virtually aliasing data caches (VIVT and VIPT with aliasing data
cache). It touches the pages through their linear mapping, which is not
consistent with the userspace mappings with virtually aliasing data
caches.

This patch series introduces cpu_dcache_is_aliasing() with the new
Kconfig option ARCH_HAS_CPU_CACHE_ALIASING and implements it for all
architectures. The implementation of cpu_dcache_is_aliasing() is either
evaluated to a constant at compile-time or a runtime check, which is
what is needed on ARM.

With this we can basically narrow down the list of architectures which
are unsupported by DAX to those which are really affected.

Testing done so far:

- Compile allyesconfig on x86-64,
- Compile allyesconfig on x86-64, with FS_DAX=n.
- Compile allyesconfig on x86-64, with DAX=n.
- Boot test after modifying alloc_dax() to force returning -EOPNOTSUPP
even on x86-64, thus simulating the behavior expected on an
architecture with data cache aliasing.

There are many more axes to test however. I would welcome Tested-by for:

- affected architectures,
- affected drivers,
- affected filesytems.

[ Based on commit "nvdimm/pmem: Fix leak on dax_add_host() failure". ]

Thanks,

Mathieu

Changes since v4:
- Move the change which makes alloc_dax() return ERR_PTR(-EOPNOTSUPP)
when CONFIG_DAX=n earlier in the series,
- Fold driver cleanup patches into their respective per-driver changes.
- Move "nvdimm/pmem: Fix leak on dax_add_host() failure" outside of this
series.

Changes since v3:
- Fix a leak on dax_add_host() failure in nvdimm/pmem.
- Split the series into a bissectable sequence of changes.
- Ensure that device-dax use-cases still works on data cache
aliasing architectures.

Changes since v2:
- Move DAX supported runtime check to alloc_dax(),
- Modify DM to handle alloc_dax() error as non-fatal,
- Remove all filesystem modifications, since the check is now done by
alloc_dax(),
- rename "dcache" and "cache" to "cpu dcache" and "cpu cache" to
eliminate confusion with VFS terminology.

Changes since v1:
- The order of the series was completely changed based on the
feedback received on v1,
- cache_is_aliasing() is renamed to dcache_is_aliasing(),
- ARCH_HAS_CACHE_ALIASING_DYNAMIC is gone,
- dcache_is_aliasing() vs ARCH_HAS_CACHE_ALIASING relationship is
simplified,
- the dax_is_supported() check was moved to its rightful place in all
filesystems.

Cc: Andrew Morton <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Vishal Verma <[email protected]>
Cc: Dave Jiang <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Russell King <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]

Mathieu Desnoyers (8):
dax: alloc_dax() return ERR_PTR(-EOPNOTSUPP) for CONFIG_DAX=n
nvdimm/pmem: Treat alloc_dax() -EOPNOTSUPP failure as non-fatal
dm: Treat alloc_dax() -EOPNOTSUPP failure as non-fatal
dcssblk: Handle alloc_dax() -EOPNOTSUPP failure
virtio: Treat alloc_dax() -EOPNOTSUPP failure as non-fatal
dax: Check for data cache aliasing at runtime
Introduce cpu_dcache_is_aliasing() across all architectures
dax: Fix incorrect list of data cache aliasing architectures

arch/arc/Kconfig | 1 +
arch/arc/include/asm/cachetype.h | 9 +++++++++
arch/arm/Kconfig | 1 +
arch/arm/include/asm/cachetype.h | 2 ++
arch/csky/Kconfig | 1 +
arch/csky/include/asm/cachetype.h | 9 +++++++++
arch/m68k/Kconfig | 1 +
arch/m68k/include/asm/cachetype.h | 9 +++++++++
arch/mips/Kconfig | 1 +
arch/mips/include/asm/cachetype.h | 9 +++++++++
arch/nios2/Kconfig | 1 +
arch/nios2/include/asm/cachetype.h | 10 ++++++++++
arch/parisc/Kconfig | 1 +
arch/parisc/include/asm/cachetype.h | 9 +++++++++
arch/sh/Kconfig | 1 +
arch/sh/include/asm/cachetype.h | 9 +++++++++
arch/sparc/Kconfig | 1 +
arch/sparc/include/asm/cachetype.h | 14 ++++++++++++++
arch/xtensa/Kconfig | 1 +
arch/xtensa/include/asm/cachetype.h | 10 ++++++++++
drivers/dax/super.c | 14 ++++++++++++++
drivers/md/dm.c | 17 +++++++++--------
drivers/nvdimm/pmem.c | 22 ++++++++++++----------
drivers/s390/block/dcssblk.c | 11 ++++++-----
fs/Kconfig | 1 -
fs/fuse/virtio_fs.c | 15 +++++++++++----
include/linux/cacheinfo.h | 6 ++++++
include/linux/dax.h | 6 +-----
mm/Kconfig | 6 ++++++
29 files changed, 165 insertions(+), 33 deletions(-)
create mode 100644 arch/arc/include/asm/cachetype.h
create mode 100644 arch/csky/include/asm/cachetype.h
create mode 100644 arch/m68k/include/asm/cachetype.h
create mode 100644 arch/mips/include/asm/cachetype.h
create mode 100644 arch/nios2/include/asm/cachetype.h
create mode 100644 arch/parisc/include/asm/cachetype.h
create mode 100644 arch/sh/include/asm/cachetype.h
create mode 100644 arch/sparc/include/asm/cachetype.h
create mode 100644 arch/xtensa/include/asm/cachetype.h

--
2.39.2


2024-02-12 16:36:17

by Mathieu Desnoyers

[permalink] [raw]
Subject: [PATCH v5 1/8] dax: alloc_dax() return ERR_PTR(-EOPNOTSUPP) for CONFIG_DAX=n

Change the return value from NULL to PTR_ERR(-EOPNOTSUPP) for
CONFIG_DAX=n to be consistent with the fact that CONFIG_DAX=y
never returns NULL.

This is done in preparation for using cpu_dcache_is_aliasing() in a
following change which will properly support architectures which detect
data cache aliasing at runtime.

Fixes: d92576f1167c ("dax: does not work correctly with virtual aliasing caches")
Signed-off-by: Mathieu Desnoyers <[email protected]>
Reviewed-by: Dan Williams <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: Vishal Verma <[email protected]>
Cc: Dave Jiang <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Russell King <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
---
drivers/dax/super.c | 5 +++++
include/linux/dax.h | 6 +-----
2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/dax/super.c b/drivers/dax/super.c
index 0da9232ea175..205b888d45bf 100644
--- a/drivers/dax/super.c
+++ b/drivers/dax/super.c
@@ -319,6 +319,11 @@ EXPORT_SYMBOL_GPL(dax_alive);
* that any fault handlers or operations that might have seen
* dax_alive(), have completed. Any operations that start after
* synchronize_srcu() has run will abort upon seeing !dax_alive().
+ *
+ * Note, because alloc_dax() returns an ERR_PTR() on error, callers
+ * typically store its result into a local variable in order to check
+ * the result. Therefore, care must be taken to populate the struct
+ * device dax_dev field make sure the dax_dev is not leaked.
*/
void kill_dax(struct dax_device *dax_dev)
{
diff --git a/include/linux/dax.h b/include/linux/dax.h
index b463502b16e1..df2d52b8a245 100644
--- a/include/linux/dax.h
+++ b/include/linux/dax.h
@@ -86,11 +86,7 @@ static inline void *dax_holder(struct dax_device *dax_dev)
static inline struct dax_device *alloc_dax(void *private,
const struct dax_operations *ops)
{
- /*
- * Callers should check IS_ENABLED(CONFIG_DAX) to know if this
- * NULL is an error or expected.
- */
- return NULL;
+ return ERR_PTR(-EOPNOTSUPP);
}
static inline void put_dax(struct dax_device *dax_dev)
{
--
2.39.2


2024-02-12 16:36:57

by Mathieu Desnoyers

[permalink] [raw]
Subject: [PATCH v5 3/8] dm: Treat alloc_dax() -EOPNOTSUPP failure as non-fatal

In preparation for checking whether the architecture has data cache
aliasing within alloc_dax(), modify the error handling of dm alloc_dev()
to treat alloc_dax() -EOPNOTSUPP failure as non-fatal.

Fixes: d92576f1167c ("dax: does not work correctly with virtual aliasing caches")
Suggested-by: Dan Williams <[email protected]>
Signed-off-by: Mathieu Desnoyers <[email protected]>
Reviewed-by: Dan Williams <[email protected]>
Cc: Alasdair Kergon <[email protected]>
Cc: Mike Snitzer <[email protected]>
Cc: Mikulas Patocka <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: Vishal Verma <[email protected]>
Cc: Dave Jiang <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Russell King <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
---
drivers/md/dm.c | 17 +++++++++--------
1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 23c32cd1f1d8..acdc00bc05be 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -2054,6 +2054,7 @@ static void cleanup_mapped_device(struct mapped_device *md)
static struct mapped_device *alloc_dev(int minor)
{
int r, numa_node_id = dm_get_numa_node();
+ struct dax_device *dax_dev;
struct mapped_device *md;
void *old_md;

@@ -2122,15 +2123,15 @@ static struct mapped_device *alloc_dev(int minor)
md->disk->private_data = md;
sprintf(md->disk->disk_name, "dm-%d", minor);

- if (IS_ENABLED(CONFIG_FS_DAX)) {
- md->dax_dev = alloc_dax(md, &dm_dax_ops);
- if (IS_ERR(md->dax_dev)) {
- md->dax_dev = NULL;
+ dax_dev = alloc_dax(md, &dm_dax_ops);
+ if (IS_ERR(dax_dev)) {
+ if (PTR_ERR(dax_dev) != -EOPNOTSUPP)
goto bad;
- }
- set_dax_nocache(md->dax_dev);
- set_dax_nomc(md->dax_dev);
- if (dax_add_host(md->dax_dev, md->disk))
+ } else {
+ set_dax_nocache(dax_dev);
+ set_dax_nomc(dax_dev);
+ md->dax_dev = dax_dev;
+ if (dax_add_host(dax_dev, md->disk))
goto bad;
}

--
2.39.2


2024-02-12 16:37:18

by Mathieu Desnoyers

[permalink] [raw]
Subject: [PATCH v5 4/8] dcssblk: Handle alloc_dax() -EOPNOTSUPP failure

In preparation for checking whether the architecture has data cache
aliasing within alloc_dax(), modify the error handling of dcssblk
dcssblk_add_store() to handle alloc_dax() -EOPNOTSUPP failures.

Considering that s390 is not a data cache aliasing architecture,
and considering that DCSSBLK selects DAX, a return value of -EOPNOTSUPP
from alloc_dax() should make dcssblk_add_store() fail.

Fixes: d92576f1167c ("dax: does not work correctly with virtual aliasing caches")
Signed-off-by: Mathieu Desnoyers <[email protected]>
Reviewed-by: Dan Williams <[email protected]>
Cc: Alasdair Kergon <[email protected]>
Cc: Mike Snitzer <[email protected]>
Cc: Mikulas Patocka <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: Vishal Verma <[email protected]>
Cc: Dave Jiang <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Russell King <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
---
drivers/s390/block/dcssblk.c | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/drivers/s390/block/dcssblk.c b/drivers/s390/block/dcssblk.c
index 4b7ecd4fd431..f363c1d51d9a 100644
--- a/drivers/s390/block/dcssblk.c
+++ b/drivers/s390/block/dcssblk.c
@@ -549,6 +549,7 @@ dcssblk_add_store(struct device *dev, struct device_attribute *attr, const char
int rc, i, j, num_of_segments;
struct dcssblk_dev_info *dev_info;
struct segment_info *seg_info, *temp;
+ struct dax_device *dax_dev;
char *local_buf;
unsigned long seg_byte_size;

@@ -677,13 +678,13 @@ dcssblk_add_store(struct device *dev, struct device_attribute *attr, const char
if (rc)
goto put_dev;

- dev_info->dax_dev = alloc_dax(dev_info, &dcssblk_dax_ops);
- if (IS_ERR(dev_info->dax_dev)) {
- rc = PTR_ERR(dev_info->dax_dev);
- dev_info->dax_dev = NULL;
+ dax_dev = alloc_dax(dev_info, &dcssblk_dax_ops);
+ if (IS_ERR(dax_dev)) {
+ rc = PTR_ERR(dax_dev);
goto put_dev;
}
- set_dax_synchronous(dev_info->dax_dev);
+ set_dax_synchronous(dax_dev);
+ dev_info->dax_dev = dax_dev;
rc = dax_add_host(dev_info->dax_dev, dev_info->gd);
if (rc)
goto out_dax;
--
2.39.2


2024-02-12 16:37:55

by Mathieu Desnoyers

[permalink] [raw]
Subject: [PATCH v5 6/8] dax: Check for data cache aliasing at runtime

Replace the following fs/Kconfig:FS_DAX dependency:

depends on !(ARM || MIPS || SPARC)

By a runtime check within alloc_dax(). This runtime check returns
ERR_PTR(-EOPNOTSUPP) if the @ops parameter is non-NULL (which means
the kernel is using an aliased mapping) on an architecture which
has data cache aliasing.

Change the return value from NULL to PTR_ERR(-EOPNOTSUPP) for
CONFIG_DAX=n for consistency.

This is done in preparation for using cpu_dcache_is_aliasing() in a
following change which will properly support architectures which detect
data cache aliasing at runtime.

Fixes: d92576f1167c ("dax: does not work correctly with virtual aliasing caches")
Signed-off-by: Mathieu Desnoyers <[email protected]>
Reviewed-by: Dan Williams <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: Vishal Verma <[email protected]>
Cc: Dave Jiang <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Russell King <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
---
drivers/dax/super.c | 10 ++++++++++
fs/Kconfig | 1 -
2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/drivers/dax/super.c b/drivers/dax/super.c
index 205b888d45bf..ce5bffa86bba 100644
--- a/drivers/dax/super.c
+++ b/drivers/dax/super.c
@@ -450,6 +450,16 @@ struct dax_device *alloc_dax(void *private, const struct dax_operations *ops)
dev_t devt;
int minor;

+ /*
+ * Unavailable on architectures with virtually aliased data caches,
+ * except for device-dax (NULL operations pointer), which does
+ * not use aliased mappings from the kernel.
+ */
+ if (ops && (IS_ENABLED(CONFIG_ARM) ||
+ IS_ENABLED(CONFIG_MIPS) ||
+ IS_ENABLED(CONFIG_SPARC)))
+ return ERR_PTR(-EOPNOTSUPP);
+
if (WARN_ON_ONCE(ops && !ops->zero_page_range))
return ERR_PTR(-EINVAL);

diff --git a/fs/Kconfig b/fs/Kconfig
index 42837617a55b..e5efdb3b276b 100644
--- a/fs/Kconfig
+++ b/fs/Kconfig
@@ -56,7 +56,6 @@ endif # BLOCK
config FS_DAX
bool "File system based Direct Access (DAX) support"
depends on MMU
- depends on !(ARM || MIPS || SPARC)
depends on ZONE_DEVICE || FS_DAX_LIMITED
select FS_IOMAP
select DAX
--
2.39.2


2024-02-12 16:38:11

by Mathieu Desnoyers

[permalink] [raw]
Subject: [PATCH v5 5/8] virtio: Treat alloc_dax() -EOPNOTSUPP failure as non-fatal

In preparation for checking whether the architecture has data cache
aliasing within alloc_dax(), modify the error handling of virtio
virtio_fs_setup_dax() to treat alloc_dax() -EOPNOTSUPP failure as
non-fatal.

Co-developed-by: Dan Williams <[email protected]>
Signed-off-by: Dan Williams <[email protected]>
Fixes: d92576f1167c ("dax: does not work correctly with virtual aliasing caches")
Signed-off-by: Mathieu Desnoyers <[email protected]>
Cc: Alasdair Kergon <[email protected]>
Cc: Mike Snitzer <[email protected]>
Cc: Mikulas Patocka <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: Vishal Verma <[email protected]>
Cc: Dave Jiang <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Russell King <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
---
fs/fuse/virtio_fs.c | 15 +++++++++++----
1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/fs/fuse/virtio_fs.c b/fs/fuse/virtio_fs.c
index 5f1be1da92ce..f9acd9972af2 100644
--- a/fs/fuse/virtio_fs.c
+++ b/fs/fuse/virtio_fs.c
@@ -16,6 +16,7 @@
#include <linux/fs_context.h>
#include <linux/fs_parser.h>
#include <linux/highmem.h>
+#include <linux/cleanup.h>
#include <linux/uio.h>
#include "fuse_i.h"

@@ -795,8 +796,11 @@ static void virtio_fs_cleanup_dax(void *data)
put_dax(dax_dev);
}

+DEFINE_FREE(cleanup_dax, struct dax_dev *, if (!IS_ERR(_T)) virtio_fs_cleanup_dax(_T))
+
static int virtio_fs_setup_dax(struct virtio_device *vdev, struct virtio_fs *fs)
{
+ struct dax_device *dax_dev __free(cleanup_dax) = ERR_PTR(-EOPNOTSUPP);
struct virtio_shm_region cache_reg;
struct dev_pagemap *pgmap;
bool have_cache;
@@ -804,6 +808,12 @@ static int virtio_fs_setup_dax(struct virtio_device *vdev, struct virtio_fs *fs)
if (!IS_ENABLED(CONFIG_FUSE_DAX))
return 0;

+ dax_dev = alloc_dax(fs, &virtio_fs_dax_ops);
+ if (IS_ERR(dax_dev)) {
+ int rc = PTR_ERR(dax_dev);
+ return rc == -EOPNOTSUPP ? 0 : rc;
+ }
+
/* Get cache region */
have_cache = virtio_get_shm_region(vdev, &cache_reg,
(u8)VIRTIO_FS_SHMCAP_ID_CACHE);
@@ -849,10 +859,7 @@ static int virtio_fs_setup_dax(struct virtio_device *vdev, struct virtio_fs *fs)
dev_dbg(&vdev->dev, "%s: window kaddr 0x%px phys_addr 0x%llx len 0x%llx\n",
__func__, fs->window_kaddr, cache_reg.addr, cache_reg.len);

- fs->dax_dev = alloc_dax(fs, &virtio_fs_dax_ops);
- if (IS_ERR(fs->dax_dev))
- return PTR_ERR(fs->dax_dev);
-
+ fs->dax_dev = no_free_ptr(dax_dev);
return devm_add_action_or_reset(&vdev->dev, virtio_fs_cleanup_dax,
fs->dax_dev);
}
--
2.39.2


2024-02-12 16:39:08

by Mathieu Desnoyers

[permalink] [raw]
Subject: [PATCH v5 8/8] dax: Fix incorrect list of data cache aliasing architectures

commit d92576f1167c ("dax: does not work correctly with virtual aliasing caches")
prevents DAX from building on architectures with virtually aliased
dcache with:

depends on !(ARM || MIPS || SPARC)

This check is too broad (e.g. recent ARMv7 don't have virtually aliased
dcaches), and also misses many other architectures with virtually
aliased data cache.

This is a regression introduced in the v4.0 Linux kernel where the
dax mount option is removed for 32-bit ARMv7 boards which have no data
cache aliasing, and therefore should work fine with FS_DAX.

This was turned into the following check in alloc_dax() by a preparatory
change:

if (ops && (IS_ENABLED(CONFIG_ARM) ||
IS_ENABLED(CONFIG_MIPS) ||
IS_ENABLED(CONFIG_SPARC)))
return NULL;

Use cpu_dcache_is_aliasing() instead to figure out whether the environment
has aliasing data caches.

Fixes: d92576f1167c ("dax: does not work correctly with virtual aliasing caches")
Signed-off-by: Mathieu Desnoyers <[email protected]>
Reviewed-by: Dan Williams <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: Vishal Verma <[email protected]>
Cc: Dave Jiang <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Russell King <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
---
drivers/dax/super.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/dax/super.c b/drivers/dax/super.c
index ce5bffa86bba..a21a7c262382 100644
--- a/drivers/dax/super.c
+++ b/drivers/dax/super.c
@@ -13,6 +13,7 @@
#include <linux/uio.h>
#include <linux/dax.h>
#include <linux/fs.h>
+#include <linux/cacheinfo.h>
#include "dax-private.h"

/**
@@ -455,9 +456,7 @@ struct dax_device *alloc_dax(void *private, const struct dax_operations *ops)
* except for device-dax (NULL operations pointer), which does
* not use aliased mappings from the kernel.
*/
- if (ops && (IS_ENABLED(CONFIG_ARM) ||
- IS_ENABLED(CONFIG_MIPS) ||
- IS_ENABLED(CONFIG_SPARC)))
+ if (ops && cpu_dcache_is_aliasing())
return ERR_PTR(-EOPNOTSUPP);

if (WARN_ON_ONCE(ops && !ops->zero_page_range))
--
2.39.2


2024-02-12 16:39:19

by Mathieu Desnoyers

[permalink] [raw]
Subject: [PATCH v5 7/8] Introduce cpu_dcache_is_aliasing() across all architectures

Introduce a generic way to query whether the data cache is virtually
aliased on all architectures. Its purpose is to ensure that subsystems
which are incompatible with virtually aliased data caches (e.g. FS_DAX)
can reliably query this.

For data cache aliasing, there are three scenarios dependending on the
architecture. Here is a breakdown based on my understanding:

A) The data cache is always aliasing:

* arc
* csky
* m68k (note: shared memory mappings are incoherent ? SHMLBA is missing there.)
* sh
* parisc

B) The data cache aliasing is statically known or depends on querying CPU
state at runtime:

* arm (cache_is_vivt() || cache_is_vipt_aliasing())
* mips (cpu_has_dc_aliases)
* nios2 (NIOS2_DCACHE_SIZE > PAGE_SIZE)
* sparc32 (vac_cache_size > PAGE_SIZE)
* sparc64 (L1DCACHE_SIZE > PAGE_SIZE)
* xtensa (DCACHE_WAY_SIZE > PAGE_SIZE)

C) The data cache is never aliasing:

* alpha
* arm64 (aarch64)
* hexagon
* loongarch (but with incoherent write buffers, which are disabled since
commit d23b7795 ("LoongArch: Change SHMLBA from SZ_64K to PAGE_SIZE"))
* microblaze
* openrisc
* powerpc
* riscv
* s390
* um
* x86

Require architectures in A) and B) to select ARCH_HAS_CPU_CACHE_ALIASING and
implement "cpu_dcache_is_aliasing()".

Architectures in C) don't select ARCH_HAS_CPU_CACHE_ALIASING, and thus
cpu_dcache_is_aliasing() simply evaluates to "false".

Note that this leaves "cpu_icache_is_aliasing()" to be implemented as future
work. This would be useful to gate features like XIP on architectures
which have aliasing CPU dcache-icache but not CPU dcache-dcache.

Use "cpu_dcache" and "cpu_cache" rather than just "dcache" and "cache"
to clarify that we really mean "CPU data cache" and "CPU cache" to
eliminate any possible confusion with VFS "dentry cache" and "page
cache".

Link: https://lore.kernel.org/lkml/[email protected]/
Fixes: d92576f1167c ("dax: does not work correctly with virtual aliasing caches")
Signed-off-by: Mathieu Desnoyers <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: Vishal Verma <[email protected]>
Cc: Dave Jiang <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Russell King <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
---
arch/arc/Kconfig | 1 +
arch/arc/include/asm/cachetype.h | 9 +++++++++
arch/arm/Kconfig | 1 +
arch/arm/include/asm/cachetype.h | 2 ++
arch/csky/Kconfig | 1 +
arch/csky/include/asm/cachetype.h | 9 +++++++++
arch/m68k/Kconfig | 1 +
arch/m68k/include/asm/cachetype.h | 9 +++++++++
arch/mips/Kconfig | 1 +
arch/mips/include/asm/cachetype.h | 9 +++++++++
arch/nios2/Kconfig | 1 +
arch/nios2/include/asm/cachetype.h | 10 ++++++++++
arch/parisc/Kconfig | 1 +
arch/parisc/include/asm/cachetype.h | 9 +++++++++
arch/sh/Kconfig | 1 +
arch/sh/include/asm/cachetype.h | 9 +++++++++
arch/sparc/Kconfig | 1 +
arch/sparc/include/asm/cachetype.h | 14 ++++++++++++++
arch/xtensa/Kconfig | 1 +
arch/xtensa/include/asm/cachetype.h | 10 ++++++++++
include/linux/cacheinfo.h | 6 ++++++
mm/Kconfig | 6 ++++++
22 files changed, 112 insertions(+)
create mode 100644 arch/arc/include/asm/cachetype.h
create mode 100644 arch/csky/include/asm/cachetype.h
create mode 100644 arch/m68k/include/asm/cachetype.h
create mode 100644 arch/mips/include/asm/cachetype.h
create mode 100644 arch/nios2/include/asm/cachetype.h
create mode 100644 arch/parisc/include/asm/cachetype.h
create mode 100644 arch/sh/include/asm/cachetype.h
create mode 100644 arch/sparc/include/asm/cachetype.h
create mode 100644 arch/xtensa/include/asm/cachetype.h

diff --git a/arch/arc/Kconfig b/arch/arc/Kconfig
index 1b0483c51cc1..7d294a3242a4 100644
--- a/arch/arc/Kconfig
+++ b/arch/arc/Kconfig
@@ -6,6 +6,7 @@
config ARC
def_bool y
select ARC_TIMERS
+ select ARCH_HAS_CPU_CACHE_ALIASING
select ARCH_HAS_CACHE_LINE_SIZE
select ARCH_HAS_DEBUG_VM_PGTABLE
select ARCH_HAS_DMA_PREP_COHERENT
diff --git a/arch/arc/include/asm/cachetype.h b/arch/arc/include/asm/cachetype.h
new file mode 100644
index 000000000000..05fc7ed59712
--- /dev/null
+++ b/arch/arc/include/asm/cachetype.h
@@ -0,0 +1,9 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __ASM_ARC_CACHETYPE_H
+#define __ASM_ARC_CACHETYPE_H
+
+#include <linux/types.h>
+
+#define cpu_dcache_is_aliasing() true
+
+#endif
diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index f8567e95f98b..cd13b1788973 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -5,6 +5,7 @@ config ARM
select ARCH_32BIT_OFF_T
select ARCH_CORRECT_STACKTRACE_ON_KRETPROBE if HAVE_KRETPROBES && FRAME_POINTER && !ARM_UNWIND
select ARCH_HAS_BINFMT_FLAT
+ select ARCH_HAS_CPU_CACHE_ALIASING
select ARCH_HAS_CPU_FINALIZE_INIT if MMU
select ARCH_HAS_CURRENT_STACK_POINTER
select ARCH_HAS_DEBUG_VIRTUAL if MMU
diff --git a/arch/arm/include/asm/cachetype.h b/arch/arm/include/asm/cachetype.h
index e8c30430be33..b9dbe1d4c8fe 100644
--- a/arch/arm/include/asm/cachetype.h
+++ b/arch/arm/include/asm/cachetype.h
@@ -20,6 +20,8 @@ extern unsigned int cacheid;
#define icache_is_vipt_aliasing() cacheid_is(CACHEID_VIPT_I_ALIASING)
#define icache_is_pipt() cacheid_is(CACHEID_PIPT)

+#define cpu_dcache_is_aliasing() (cache_is_vivt() || cache_is_vipt_aliasing())
+
/*
* __LINUX_ARM_ARCH__ is the minimum supported CPU architecture
* Mask out support which will never be present on newer CPUs.
diff --git a/arch/csky/Kconfig b/arch/csky/Kconfig
index cf2a6fd7dff8..8a91eccf76dc 100644
--- a/arch/csky/Kconfig
+++ b/arch/csky/Kconfig
@@ -2,6 +2,7 @@
config CSKY
def_bool y
select ARCH_32BIT_OFF_T
+ select ARCH_HAS_CPU_CACHE_ALIASING
select ARCH_HAS_DMA_PREP_COHERENT
select ARCH_HAS_GCOV_PROFILE_ALL
select ARCH_HAS_SYNC_DMA_FOR_CPU
diff --git a/arch/csky/include/asm/cachetype.h b/arch/csky/include/asm/cachetype.h
new file mode 100644
index 000000000000..98cbe3af662f
--- /dev/null
+++ b/arch/csky/include/asm/cachetype.h
@@ -0,0 +1,9 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __ASM_CSKY_CACHETYPE_H
+#define __ASM_CSKY_CACHETYPE_H
+
+#include <linux/types.h>
+
+#define cpu_dcache_is_aliasing() true
+
+#endif
diff --git a/arch/m68k/Kconfig b/arch/m68k/Kconfig
index 4b3e93cac723..a9c3e3de0c6d 100644
--- a/arch/m68k/Kconfig
+++ b/arch/m68k/Kconfig
@@ -3,6 +3,7 @@ config M68K
bool
default y
select ARCH_32BIT_OFF_T
+ select ARCH_HAS_CPU_CACHE_ALIASING
select ARCH_HAS_BINFMT_FLAT
select ARCH_HAS_CPU_FINALIZE_INIT if MMU
select ARCH_HAS_CURRENT_STACK_POINTER
diff --git a/arch/m68k/include/asm/cachetype.h b/arch/m68k/include/asm/cachetype.h
new file mode 100644
index 000000000000..7fad5d9ab8fe
--- /dev/null
+++ b/arch/m68k/include/asm/cachetype.h
@@ -0,0 +1,9 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __ASM_M68K_CACHETYPE_H
+#define __ASM_M68K_CACHETYPE_H
+
+#include <linux/types.h>
+
+#define cpu_dcache_is_aliasing() true
+
+#endif
diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig
index 797ae590ebdb..ab1c8bd96666 100644
--- a/arch/mips/Kconfig
+++ b/arch/mips/Kconfig
@@ -4,6 +4,7 @@ config MIPS
default y
select ARCH_32BIT_OFF_T if !64BIT
select ARCH_BINFMT_ELF_STATE if MIPS_FP_SUPPORT
+ select ARCH_HAS_CPU_CACHE_ALIASING
select ARCH_HAS_CPU_FINALIZE_INIT
select ARCH_HAS_CURRENT_STACK_POINTER if !CC_IS_CLANG || CLANG_VERSION >= 140000
select ARCH_HAS_DEBUG_VIRTUAL if !64BIT
diff --git a/arch/mips/include/asm/cachetype.h b/arch/mips/include/asm/cachetype.h
new file mode 100644
index 000000000000..9f4ba2fe1155
--- /dev/null
+++ b/arch/mips/include/asm/cachetype.h
@@ -0,0 +1,9 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __ASM_MIPS_CACHETYPE_H
+#define __ASM_MIPS_CACHETYPE_H
+
+#include <asm/cpu-features.h>
+
+#define cpu_dcache_is_aliasing() cpu_has_dc_aliases
+
+#endif
diff --git a/arch/nios2/Kconfig b/arch/nios2/Kconfig
index d54464021a61..760fb541ecd2 100644
--- a/arch/nios2/Kconfig
+++ b/arch/nios2/Kconfig
@@ -2,6 +2,7 @@
config NIOS2
def_bool y
select ARCH_32BIT_OFF_T
+ select ARCH_HAS_CPU_CACHE_ALIASING
select ARCH_HAS_DMA_PREP_COHERENT
select ARCH_HAS_SYNC_DMA_FOR_CPU
select ARCH_HAS_SYNC_DMA_FOR_DEVICE
diff --git a/arch/nios2/include/asm/cachetype.h b/arch/nios2/include/asm/cachetype.h
new file mode 100644
index 000000000000..eb9c416b8a1c
--- /dev/null
+++ b/arch/nios2/include/asm/cachetype.h
@@ -0,0 +1,10 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __ASM_NIOS2_CACHETYPE_H
+#define __ASM_NIOS2_CACHETYPE_H
+
+#include <asm/page.h>
+#include <asm/cache.h>
+
+#define cpu_dcache_is_aliasing() (NIOS2_DCACHE_SIZE > PAGE_SIZE)
+
+#endif
diff --git a/arch/parisc/Kconfig b/arch/parisc/Kconfig
index d14ccc948a29..0f25c227f74b 100644
--- a/arch/parisc/Kconfig
+++ b/arch/parisc/Kconfig
@@ -8,6 +8,7 @@ config PARISC
select HAVE_FUNCTION_GRAPH_TRACER
select HAVE_SYSCALL_TRACEPOINTS
select ARCH_WANT_FRAME_POINTERS
+ select ARCH_HAS_CPU_CACHE_ALIASING
select ARCH_HAS_DMA_ALLOC if PA11
select ARCH_HAS_ELF_RANDOMIZE
select ARCH_HAS_STRICT_KERNEL_RWX
diff --git a/arch/parisc/include/asm/cachetype.h b/arch/parisc/include/asm/cachetype.h
new file mode 100644
index 000000000000..e0868a1d3c47
--- /dev/null
+++ b/arch/parisc/include/asm/cachetype.h
@@ -0,0 +1,9 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __ASM_PARISC_CACHETYPE_H
+#define __ASM_PARISC_CACHETYPE_H
+
+#include <linux/types.h>
+
+#define cpu_dcache_is_aliasing() true
+
+#endif
diff --git a/arch/sh/Kconfig b/arch/sh/Kconfig
index 7500521b2b98..2ad3e29f0ebe 100644
--- a/arch/sh/Kconfig
+++ b/arch/sh/Kconfig
@@ -2,6 +2,7 @@
config SUPERH
def_bool y
select ARCH_32BIT_OFF_T
+ select ARCH_HAS_CPU_CACHE_ALIASING
select ARCH_ENABLE_MEMORY_HOTPLUG if SPARSEMEM && MMU
select ARCH_ENABLE_MEMORY_HOTREMOVE if SPARSEMEM && MMU
select ARCH_HAVE_NMI_SAFE_CMPXCHG if (GUSA_RB || CPU_SH4A)
diff --git a/arch/sh/include/asm/cachetype.h b/arch/sh/include/asm/cachetype.h
new file mode 100644
index 000000000000..a5fffe536068
--- /dev/null
+++ b/arch/sh/include/asm/cachetype.h
@@ -0,0 +1,9 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __ASM_SH_CACHETYPE_H
+#define __ASM_SH_CACHETYPE_H
+
+#include <linux/types.h>
+
+#define cpu_dcache_is_aliasing() true
+
+#endif
diff --git a/arch/sparc/Kconfig b/arch/sparc/Kconfig
index 49849790e66d..5ba627da15d7 100644
--- a/arch/sparc/Kconfig
+++ b/arch/sparc/Kconfig
@@ -13,6 +13,7 @@ config 64BIT
config SPARC
bool
default y
+ select ARCH_HAS_CPU_CACHE_ALIASING
select ARCH_MIGHT_HAVE_PC_PARPORT if SPARC64 && PCI
select ARCH_MIGHT_HAVE_PC_SERIO
select DMA_OPS
diff --git a/arch/sparc/include/asm/cachetype.h b/arch/sparc/include/asm/cachetype.h
new file mode 100644
index 000000000000..caf1c0045892
--- /dev/null
+++ b/arch/sparc/include/asm/cachetype.h
@@ -0,0 +1,14 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __ASM_SPARC_CACHETYPE_H
+#define __ASM_SPARC_CACHETYPE_H
+
+#include <asm/page.h>
+
+#ifdef CONFIG_SPARC32
+extern int vac_cache_size;
+#define cpu_dcache_is_aliasing() (vac_cache_size > PAGE_SIZE)
+#else
+#define cpu_dcache_is_aliasing() (L1DCACHE_SIZE > PAGE_SIZE)
+#endif
+
+#endif
diff --git a/arch/xtensa/Kconfig b/arch/xtensa/Kconfig
index 7d792077e5fd..2dfde54d1a84 100644
--- a/arch/xtensa/Kconfig
+++ b/arch/xtensa/Kconfig
@@ -2,6 +2,7 @@
config XTENSA
def_bool y
select ARCH_32BIT_OFF_T
+ select ARCH_HAS_CPU_CACHE_ALIASING
select ARCH_HAS_BINFMT_FLAT if !MMU
select ARCH_HAS_CURRENT_STACK_POINTER
select ARCH_HAS_DEBUG_VM_PGTABLE
diff --git a/arch/xtensa/include/asm/cachetype.h b/arch/xtensa/include/asm/cachetype.h
new file mode 100644
index 000000000000..51bd49e2a1c5
--- /dev/null
+++ b/arch/xtensa/include/asm/cachetype.h
@@ -0,0 +1,10 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __ASM_XTENSA_CACHETYPE_H
+#define __ASM_XTENSA_CACHETYPE_H
+
+#include <asm/cache.h>
+#include <asm/page.h>
+
+#define cpu_dcache_is_aliasing() (DCACHE_WAY_SIZE > PAGE_SIZE)
+
+#endif
diff --git a/include/linux/cacheinfo.h b/include/linux/cacheinfo.h
index d504eb4b49ab..2cb15fe4fe12 100644
--- a/include/linux/cacheinfo.h
+++ b/include/linux/cacheinfo.h
@@ -138,4 +138,10 @@ static inline int get_cpu_cacheinfo_id(int cpu, int level)
#define use_arch_cache_info() (false)
#endif

+#ifndef CONFIG_ARCH_HAS_CPU_CACHE_ALIASING
+#define cpu_dcache_is_aliasing() false
+#else
+#include <asm/cachetype.h>
+#endif
+
#endif /* _LINUX_CACHEINFO_H */
diff --git a/mm/Kconfig b/mm/Kconfig
index 57cd378c73d6..db09c9ad15c9 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -1016,6 +1016,12 @@ config IDLE_PAGE_TRACKING
See Documentation/admin-guide/mm/idle_page_tracking.rst for
more details.

+# Architectures which implement cpu_dcache_is_aliasing() to query
+# whether the data caches are aliased (VIVT or VIPT with dcache
+# aliasing) need to select this.
+config ARCH_HAS_CPU_CACHE_ALIASING
+ bool
+
config ARCH_HAS_CACHE_LINE_SIZE
bool

--
2.39.2


2024-02-12 16:40:08

by Mathieu Desnoyers

[permalink] [raw]
Subject: [PATCH v5 2/8] nvdimm/pmem: Treat alloc_dax() -EOPNOTSUPP failure as non-fatal

In preparation for checking whether the architecture has data cache
aliasing within alloc_dax(), modify the error handling of nvdimm/pmem
pmem_attach_disk() to treat alloc_dax() -EOPNOTSUPP failure as non-fatal.

[ Based on commit "nvdimm/pmem: Fix leak on dax_add_host() failure". ]

Fixes: d92576f1167c ("dax: does not work correctly with virtual aliasing caches")
Signed-off-by: Mathieu Desnoyers <[email protected]>
Reviewed-by: Dan Williams <[email protected]>
Cc: Alasdair Kergon <[email protected]>
Cc: Mike Snitzer <[email protected]>
Cc: Mikulas Patocka <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Linus Torvalds <[email protected]>
Cc: Dan Williams <[email protected]>
Cc: Vishal Verma <[email protected]>
Cc: Dave Jiang <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Russell King <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
---
drivers/nvdimm/pmem.c | 22 ++++++++++++----------
1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c
index 9fe358090720..e9898457a7bd 100644
--- a/drivers/nvdimm/pmem.c
+++ b/drivers/nvdimm/pmem.c
@@ -560,17 +560,19 @@ static int pmem_attach_disk(struct device *dev,
dax_dev = alloc_dax(pmem, &pmem_dax_ops);
if (IS_ERR(dax_dev)) {
rc = PTR_ERR(dax_dev);
- goto out;
+ if (rc != -EOPNOTSUPP)
+ goto out;
+ } else {
+ set_dax_nocache(dax_dev);
+ set_dax_nomc(dax_dev);
+ if (is_nvdimm_sync(nd_region))
+ set_dax_synchronous(dax_dev);
+ pmem->dax_dev = dax_dev;
+ rc = dax_add_host(dax_dev, disk);
+ if (rc)
+ goto out_cleanup_dax;
+ dax_write_cache(dax_dev, nvdimm_has_cache(nd_region));
}
- set_dax_nocache(dax_dev);
- set_dax_nomc(dax_dev);
- if (is_nvdimm_sync(nd_region))
- set_dax_synchronous(dax_dev);
- pmem->dax_dev = dax_dev;
- rc = dax_add_host(dax_dev, disk);
- if (rc)
- goto out_cleanup_dax;
- dax_write_cache(dax_dev, nvdimm_has_cache(nd_region));
rc = device_add_disk(dev, disk, pmem_attribute_groups);
if (rc)
goto out_remove_host;
--
2.39.2


2024-02-12 22:05:07

by Dan Williams

[permalink] [raw]
Subject: RE: [PATCH v5 5/8] virtio: Treat alloc_dax() -EOPNOTSUPP failure as non-fatal

Mathieu Desnoyers wrote:
> In preparation for checking whether the architecture has data cache
> aliasing within alloc_dax(), modify the error handling of virtio
> virtio_fs_setup_dax() to treat alloc_dax() -EOPNOTSUPP failure as
> non-fatal.
>
> Co-developed-by: Dan Williams <[email protected]>
> Signed-off-by: Dan Williams <[email protected]>
> Fixes: d92576f1167c ("dax: does not work correctly with virtual aliasing caches")
> Signed-off-by: Mathieu Desnoyers <[email protected]>
> Cc: Alasdair Kergon <[email protected]>
> Cc: Mike Snitzer <[email protected]>
> Cc: Mikulas Patocka <[email protected]>
> Cc: Andrew Morton <[email protected]>
> Cc: Linus Torvalds <[email protected]>
> Cc: Dan Williams <[email protected]>
> Cc: Vishal Verma <[email protected]>
> Cc: Dave Jiang <[email protected]>
> Cc: Matthew Wilcox <[email protected]>
> Cc: Arnd Bergmann <[email protected]>
> Cc: Russell King <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> Cc: [email protected]
> ---
> fs/fuse/virtio_fs.c | 15 +++++++++++----
> 1 file changed, 11 insertions(+), 4 deletions(-)
>
> diff --git a/fs/fuse/virtio_fs.c b/fs/fuse/virtio_fs.c
> index 5f1be1da92ce..f9acd9972af2 100644
> --- a/fs/fuse/virtio_fs.c
> +++ b/fs/fuse/virtio_fs.c
> @@ -16,6 +16,7 @@
> #include <linux/fs_context.h>
> #include <linux/fs_parser.h>
> #include <linux/highmem.h>
> +#include <linux/cleanup.h>
> #include <linux/uio.h>
> #include "fuse_i.h"
>
> @@ -795,8 +796,11 @@ static void virtio_fs_cleanup_dax(void *data)
> put_dax(dax_dev);
> }
>
> +DEFINE_FREE(cleanup_dax, struct dax_dev *, if (!IS_ERR(_T)) virtio_fs_cleanup_dax(_T))
> +
> static int virtio_fs_setup_dax(struct virtio_device *vdev, struct virtio_fs *fs)
> {
> + struct dax_device *dax_dev __free(cleanup_dax) = ERR_PTR(-EOPNOTSUPP);
> struct virtio_shm_region cache_reg;
> struct dev_pagemap *pgmap;
> bool have_cache;
> @@ -804,6 +808,12 @@ static int virtio_fs_setup_dax(struct virtio_device *vdev, struct virtio_fs *fs)
> if (!IS_ENABLED(CONFIG_FUSE_DAX))
> return 0;
>
> + dax_dev = alloc_dax(fs, &virtio_fs_dax_ops);
> + if (IS_ERR(dax_dev)) {
> + int rc = PTR_ERR(dax_dev);
> + return rc == -EOPNOTSUPP ? 0 : rc;
> + }
> +
> /* Get cache region */
> have_cache = virtio_get_shm_region(vdev, &cache_reg,
> (u8)VIRTIO_FS_SHMCAP_ID_CACHE);
> @@ -849,10 +859,7 @@ static int virtio_fs_setup_dax(struct virtio_device *vdev, struct virtio_fs *fs)
> dev_dbg(&vdev->dev, "%s: window kaddr 0x%px phys_addr 0x%llx len 0x%llx\n",
> __func__, fs->window_kaddr, cache_reg.addr, cache_reg.len);
>
> - fs->dax_dev = alloc_dax(fs, &virtio_fs_dax_ops);
> - if (IS_ERR(fs->dax_dev))
> - return PTR_ERR(fs->dax_dev);
> -
> + fs->dax_dev = no_free_ptr(dax_dev);

This works because the internals of virtio_fs_cleanup_dax(), "kill_dax()
and put_dax()", know how to handle a NULL @dax_dev. It is still early
days with the "cleanup" helpers, but I wonder if anyone else cares that
the DEFINE_FREE() above does not check for NULL?

I think it is ok, but wanted to point that out for the virtiofs folks
and wonder what the style should be for things like this going forward.

Other growing pains with "cleanup.h" and ERR_PTR is happening over here:

http://lore.kernel.org/r/[email protected]

..and that arrived at a similar style as is being used in this patch.
In both cases the cleanup function is called on exit with a NULL
argument.

2024-02-12 22:09:14

by Linus Torvalds

[permalink] [raw]
Subject: Re: [PATCH v5 5/8] virtio: Treat alloc_dax() -EOPNOTSUPP failure as non-fatal

On Mon, 12 Feb 2024 at 14:04, Dan Williams <[email protected]> wrote:
>
> This works because the internals of virtio_fs_cleanup_dax(), "kill_dax()
> and put_dax()", know how to handle a NULL @dax_dev. It is still early
> days with the "cleanup" helpers, but I wonder if anyone else cares that
> the DEFINE_FREE() above does not check for NULL?

Well, the main reason for DEFINE_FREE() to check for NULL is not
correctness, but code generation. See the comment about kfree() in
<linux/cleanup.h>:

* NOTE: the DEFINE_FREE()'s @free expression includes a NULL test even though
* kfree() is fine to be called with a NULL value. This is on purpose. This way
* the compiler sees the end of our alloc_obj() function as [...]

with the full explanation there.

Now, whether the code wants to actually use the cleanup() helpers for
a single use-case is debatable.

But yes, if it does, I suspect it should use !IS_ERR_OR_NULL(ptr).

Linus

2024-02-12 23:03:40

by Dan Williams

[permalink] [raw]
Subject: Re: [PATCH v5 5/8] virtio: Treat alloc_dax() -EOPNOTSUPP failure as non-fatal

[ add Lukas ]

Linus Torvalds wrote:
> On Mon, 12 Feb 2024 at 14:04, Dan Williams <[email protected]> wrote:
> >
> > This works because the internals of virtio_fs_cleanup_dax(), "kill_dax()
> > and put_dax()", know how to handle a NULL @dax_dev. It is still early
> > days with the "cleanup" helpers, but I wonder if anyone else cares that
> > the DEFINE_FREE() above does not check for NULL?
>
> Well, the main reason for DEFINE_FREE() to check for NULL is not
> correctness, but code generation. See the comment about kfree() in
> <linux/cleanup.h>:
>
> * NOTE: the DEFINE_FREE()'s @free expression includes a NULL test even though
> * kfree() is fine to be called with a NULL value. This is on purpose. This way
> * the compiler sees the end of our alloc_obj() function as [...]
>
> with the full explanation there.
>
> Now, whether the code wants to actually use the cleanup() helpers for
> a single use-case is debatable.
>
> But yes, if it does, I suspect it should use !IS_ERR_OR_NULL(ptr).o

I am trying to arrive at a common recommendation given Lukas found that
IS_ERR_OR_NULL() resulted in unwanted NULL checks emitted in the
assembly [1].

He is doing something similar:

http://lore.kernel.org/r/4143b15418c4ecf87ddeceb36813943c3ede17aa.1707734526.git.lukas@wunner.de

..and introduced an assume() helper.

However, Lukas, I think Linus is right, your DEFINE_FREE() should use
IS_ERR_OR_NULL(). I.e. the problem is trying to use
__free(x509_free_certificate) in x509_cert_parse().

> --- a/crypto/asymmetric_keys/x509_cert_parser.c
> +++ b/crypto/asymmetric_keys/x509_cert_parser.c
> @@ -60,24 +60,24 @@ void x509_free_certificate(struct x509_certificate *cert)
> */
> struct x509_certificate *x509_cert_parse(const void *data, size_t datalen)
> {
> - struct x509_certificate *cert;
> - struct x509_parse_context *ctx;
> + struct x509_certificate *cert __free(x509_free_certificate);

..make this:

struct x509_certificate *cert __free(kfree);

..and Mathieu, this should be IS_ERR_OR_NULL() to skip an unnecessary
call to virtio_fs_cleanup_dax() at function exit that the compiler
should elide.

2024-02-13 06:26:22

by Lukas Wunner

[permalink] [raw]
Subject: Re: [PATCH v5 5/8] virtio: Treat alloc_dax() -EOPNOTSUPP failure as non-fatal

On Mon, Feb 12, 2024 at 11:30:58AM -0500, Mathieu Desnoyers wrote:
> In preparation for checking whether the architecture has data cache
> aliasing within alloc_dax(), modify the error handling of virtio
> virtio_fs_setup_dax() to treat alloc_dax() -EOPNOTSUPP failure as
> non-fatal.
>
> Co-developed-by: Dan Williams <[email protected]>
> Signed-off-by: Dan Williams <[email protected]>
> Fixes: d92576f1167c ("dax: does not work correctly with virtual aliasing caches")

That's a v4.0 commit, yet this patch uses DEFINE_FREE() which is
only available in v6.6 but not any earlier stable kernels.

So the Fixes tag feels a bit weird.

Apart from that,
Reviewed-by: Lukas Wunner <[email protected]>

2024-02-13 11:19:42

by Heiko Carstens

[permalink] [raw]
Subject: Re: [PATCH v5 4/8] dcssblk: Handle alloc_dax() -EOPNOTSUPP failure

On Mon, Feb 12, 2024 at 11:30:57AM -0500, Mathieu Desnoyers wrote:
> In preparation for checking whether the architecture has data cache
> aliasing within alloc_dax(), modify the error handling of dcssblk
> dcssblk_add_store() to handle alloc_dax() -EOPNOTSUPP failures.
>
> Considering that s390 is not a data cache aliasing architecture,
> and considering that DCSSBLK selects DAX, a return value of -EOPNOTSUPP
> from alloc_dax() should make dcssblk_add_store() fail.
>
> Fixes: d92576f1167c ("dax: does not work correctly with virtual aliasing caches")
> Signed-off-by: Mathieu Desnoyers <[email protected]>
> Reviewed-by: Dan Williams <[email protected]>
..
> ---
> drivers/s390/block/dcssblk.c | 11 ++++++-----
> 1 file changed, 6 insertions(+), 5 deletions(-)

Acked-by: Heiko Carstens <[email protected]>

2024-02-13 18:57:32

by Dan Williams

[permalink] [raw]
Subject: Re: [PATCH v5 5/8] virtio: Treat alloc_dax() -EOPNOTSUPP failure as non-fatal

Lukas Wunner wrote:
> On Mon, Feb 12, 2024 at 03:02:46PM -0800, Dan Williams wrote:
> > However, Lukas, I think Linus is right, your DEFINE_FREE() should use
> > IS_ERR_OR_NULL().
>
> Uh... that's a negative, sir. ;)
>
> IS_ERR_OR_NULL() results in...
> * a superfluous NULL pointer check in x509_key_preparse() and
> * a superfluous IS_ERR check in x509_cert_parse().
>
> IS_ERR() results *only* in...
> * a superfluous IS_ERR check in x509_cert_parse().
>
> I can get rid of the IS_ERR() check by using assume().
>
> I can *not* get rid of the NULL pointer check because the compiler
> is compiled with -fno-delete-null-pointer-checks. (The compiler
> seems to ignore __attribute__((returns_nonnull)) due to that.)
>
>
> > I.e. the problem is trying to use
> > __free(x509_free_certificate) in x509_cert_parse().
> >
> > > --- a/crypto/asymmetric_keys/x509_cert_parser.c
> > > +++ b/crypto/asymmetric_keys/x509_cert_parser.c
> > > @@ -60,24 +60,24 @@ void x509_free_certificate(struct x509_certificate *cert)
> > > */
> > > struct x509_certificate *x509_cert_parse(const void *data, size_t datalen)
> > > {
> > > - struct x509_certificate *cert;
> > > - struct x509_parse_context *ctx;
> > > + struct x509_certificate *cert __free(x509_free_certificate);
> >
> > ...make this:
> >
> > struct x509_certificate *cert __free(kfree);
>
> That doesn't work I'm afraid. x509_cert_parse() needs
> x509_free_certificate() to be called in the error path,
> not kfree(). See the existing code in current mainline:
>
> x509_cert_parse() populates three sub-allocations in
> struct x509_certificate (pub, sig, id) and two
> sub-sub-allocations (pub->key, pub->params).
>
> So I'd have to add five additional local variables which
> get freed by __cleanup(). One of them (pub->key) requires
> kfree_sensitive() instead of kfree(), so I'd need an extra
> DEFINE_FREE() for that.
>
> I haven't tried it but I suspect the result would look
> terrible and David Howells wouldn't like it.

Ugh, that's what I was afraid of, so these cases are different.

> > ...and Mathieu, this should be IS_ERR_OR_NULL() to skip an unnecessary
> > call to virtio_fs_cleanup_dax() at function exit that the compiler
> > should elide.
>
> My recommendation is to check for !IS_ERR() in the DEFINE_FREE() clause
> and amend virtio_fs_cleanup_dax() with a "if (!dax_dev) return;" for
> defensiveness in case someone calls it with a NULL pointer.

The internal calls (kill_dax(), put_dax()) check for NULL already, so I
don't think that's needed.

> That's the best solution I could come up with for the x509_certificate
> conversion.
>
> Note that even with superfluous checks avoided, __cleanup() causes
> gcc-12 to always generate two return paths. It's very visible in
> the generated code that all the stack unwinding code gets duplicated
> in every function using __cleanup(). The existing Assembler code
> of x509_key_preparse() and x509_cert_parse(), without __cleanup()
> invocation, has only a single return path.

I saw that too, some NULL checks can indeed be elided with a NULL check
in the DEFINE_FREE(), but the multiple exit paths still someimtes result
in __cleanup() using functions being larger than the goto equivalent.

> So __cleanup() bloats the code regardless of superfluous checks,
> but future gcc versions might avoid that. clang-15 generates much
> more compact code (vmlinux is a couple hundred kBytes smaller),
> but does weird things such as inlining x509_free_certificate()
> in x509_cert_parse().
>
> As you may have guessed, I've spent an inordinate amount of time
> down that rabbit hole. ;(

Hey, this is new and interesting stuff, glad we are grappling with it at
this level.

2024-02-13 20:05:39

by Mathieu Desnoyers

[permalink] [raw]
Subject: Re: [PATCH v5 5/8] virtio: Treat alloc_dax() -EOPNOTSUPP failure as non-fatal

On 2024-02-13 01:25, Lukas Wunner wrote:
> On Mon, Feb 12, 2024 at 11:30:58AM -0500, Mathieu Desnoyers wrote:
>> In preparation for checking whether the architecture has data cache
>> aliasing within alloc_dax(), modify the error handling of virtio
>> virtio_fs_setup_dax() to treat alloc_dax() -EOPNOTSUPP failure as
>> non-fatal.
>>
>> Co-developed-by: Dan Williams <[email protected]>
>> Signed-off-by: Dan Williams <[email protected]>
>> Fixes: d92576f1167c ("dax: does not work correctly with virtual aliasing caches")
>
> That's a v4.0 commit, yet this patch uses DEFINE_FREE() which is
> only available in v6.6 but not any earlier stable kernels.

I asked this question to Greg KH before creating this patch, and his
answer was to implement my fix for master, and stable kernels would take
care of backporting all the required dependencies.

Now if I look at latest 6.1, 5.15, 5.10, 5.4, 4.19 stable kernels,
none seem to have include/linux/cleanup.h today. But I suspect that
sooner or later relevant master branch fixes will require stable
kernels to backport cleanup.h, so why not do it now ?

Thanks,

Mathieu


>
> So the Fixes tag feels a bit weird.
>
> Apart from that,
> Reviewed-by: Lukas Wunner <[email protected]>

--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


2024-02-13 20:07:57

by Mathieu Desnoyers

[permalink] [raw]
Subject: Re: [PATCH v5 1/8] dax: alloc_dax() return ERR_PTR(-EOPNOTSUPP) for CONFIG_DAX=n

On 2024-02-13 14:07, Dan Williams wrote:
> Lukas Wunner wrote:
>> On Mon, Feb 12, 2024 at 11:30:54AM -0500, Mathieu Desnoyers wrote:
>>> Change the return value from NULL to PTR_ERR(-EOPNOTSUPP) for
>>> CONFIG_DAX=n to be consistent with the fact that CONFIG_DAX=y
>>> never returns NULL.
>>
>> All the callers of alloc_dax() only check for IS_ERR().
>>
>> Doesn't this result in a change of behavior in all the callers?
>> Previously they'd ignore the NULL return value and continue,
>> now they'll error out.
>>
>> Given that, seems dangerous to add a Fixes tag with a v4.0 commit
>> and thus risk regressing all stable kernels.
>
> Oh, good catch, yes that Fixes tag should be:
>
> 4e4ced93794a dax: Move mandatory ->zero_page_range() check in alloc_dax()
>
> ...as that was the one that updated the alloc_dax() calling convention
> without fixing up the CONFIG_DAX=n case.
>
> This is a pre-requisite for restoring DAX operation on ARM32.

I'll change the Fixes tag in this commit to:

Fixes: 4e4ced93794a ("dax: Move mandatory ->zero_page_range() check in alloc_dax()")

for v6.

Thanks,

Mathieu

--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


2024-02-13 20:25:33

by Mathieu Desnoyers

[permalink] [raw]
Subject: Re: [PATCH v5 5/8] virtio: Treat alloc_dax() -EOPNOTSUPP failure as non-fatal

On 2024-02-12 18:02, Dan Williams wrote:
[...]
> ...and Mathieu, this should be IS_ERR_OR_NULL() to skip an unnecessary
> call to virtio_fs_cleanup_dax() at function exit that the compiler
> should elide.

OK, so I'll go back to the previous approach for v6:

DEFINE_FREE(cleanup_dax, struct dax_dev *, if (!IS_ERR_OR_NULL(_T)) virtio_fs_cleanup_dax(_T))

and define the variable as:

struct dax_device *dax_dev __free(cleanup_dax) = NULL;

Thanks,

Mathieu

--
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com


2024-02-14 07:00:04

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH v5 5/8] virtio: Treat alloc_dax() -EOPNOTSUPP failure as non-fatal

On Tue, Feb 13, 2024 at 02:46:05PM -0500, Mathieu Desnoyers wrote:
> On 2024-02-13 01:25, Lukas Wunner wrote:
> > On Mon, Feb 12, 2024 at 11:30:58AM -0500, Mathieu Desnoyers wrote:
> > > In preparation for checking whether the architecture has data cache
> > > aliasing within alloc_dax(), modify the error handling of virtio
> > > virtio_fs_setup_dax() to treat alloc_dax() -EOPNOTSUPP failure as
> > > non-fatal.
> > >
> > > Co-developed-by: Dan Williams <[email protected]>
> > > Signed-off-by: Dan Williams <[email protected]>
> > > Fixes: d92576f1167c ("dax: does not work correctly with virtual aliasing caches")
> >
> > That's a v4.0 commit, yet this patch uses DEFINE_FREE() which is
> > only available in v6.6 but not any earlier stable kernels.
>
> I asked this question to Greg KH before creating this patch, and his
> answer was to implement my fix for master, and stable kernels would take
> care of backporting all the required dependencies.

That is correct.

> Now if I look at latest 6.1, 5.15, 5.10, 5.4, 4.19 stable kernels,
> none seem to have include/linux/cleanup.h today. But I suspect that
> sooner or later relevant master branch fixes will require stable
> kernels to backport cleanup.h, so why not do it now ?

Yes, eventually we will need to backport cleanup.h to the older kernel
trees, I know of many patches "in flight" that are using it, so it's not
unique to this one at all, so this is fine to have.

Remember, make changes for Linus's tree, don't go through any gyrations
to make things special for stable releases, that's something to only
consider later on, if you want to. stable kernels should have no affect
on developers OTHER than a simple cc: stable tag on stuff that they know
should be backported, unless they really want to do more than that,
that's up to them.

thanks,

greg k-h