2018-12-05 05:50:45

by Nicolas Boichat

[permalink] [raw]
Subject: [PATCH v4 0/3] iommu/io-pgtable-arm-v7s: Use DMA32 zone for page tables

This is a follow-up to the discussion in [1], [2].

IOMMUs using ARMv7 short-descriptor format require page tables
(level 1 and 2) to be allocated within the first 4GB of RAM, even
on 64-bit systems.

For L1 tables that are bigger than a page, we can just use __get_free_pages
with GFP_DMA32 (on arm64 systems only, arm would still use GFP_DMA).

For L2 tables that only take 1KB, it would be a waste to allocate a full
page, so we considered 3 approaches:
1. This series, adding support for GFP_DMA32 slab caches.
2. genalloc, which requires pre-allocating the maximum number of L2 page
tables (4096, so 4MB of memory).
3. page_frag, which is not very memory-efficient as it is unable to reuse
freed fragments until the whole page is freed. [3]

This series is the most memory-efficient approach.

[1] https://lists.linuxfoundation.org/pipermail/iommu/2018-November/030876.html
[2] https://lists.linuxfoundation.org/pipermail/iommu/2018-December/031696.html
[3] https://patchwork.codeaurora.org/patch/671639/

Changes since v1:
- Add support for SLAB_CACHE_DMA32 in slab and slub (patches 1/2)
- iommu/io-pgtable-arm-v7s (patch 3):
- Changed approach to use SLAB_CACHE_DMA32 added by the previous
commit.
- Use DMA or DMA32 depending on the architecture (DMA for arm,
DMA32 for arm64).

Changes since v2:
- Reworded and expanded commit messages
- Added cache_dma32 documentation in PATCH 2/3.

v3 used the page_frag approach, see [3].

Nicolas Boichat (3):
mm: slab/slub: Add check_slab_flags function to check for valid flags
mm: Add support for kmem caches in DMA32 zone
iommu/io-pgtable-arm-v7s: Request DMA32 memory, and improve debugging

Documentation/ABI/testing/sysfs-kernel-slab | 9 ++++++++
drivers/iommu/io-pgtable-arm-v7s.c | 20 +++++++++++++----
include/linux/slab.h | 2 ++
mm/internal.h | 22 +++++++++++++++++--
mm/slab.c | 10 +++------
mm/slab.h | 3 ++-
mm/slab_common.c | 2 +-
mm/slub.c | 24 +++++++++++++++------
8 files changed, 70 insertions(+), 22 deletions(-)

--
2.20.0.rc1.387.gf8505762e3-goog



2018-12-05 05:49:52

by Nicolas Boichat

[permalink] [raw]
Subject: [PATCH v4 2/3] mm: Add support for kmem caches in DMA32 zone

In some cases (e.g. IOMMU ARMv7s page allocator), we need to allocate
data structures smaller than a page with GFP_DMA32 flag.

This change makes it possible to create a custom cache in DMA32 zone
using kmem_cache_create, then allocate memory using kmem_cache_alloc.

We do not create a DMA32 kmalloc cache array, as there are currently
no users of kmalloc(..., GFP_DMA32). The new test in check_slab_flags
ensures that such calls still fail (as they do before this change).

Fixes: ad67f5a6545f ("arm64: replace ZONE_DMA with ZONE_DMA32")
Signed-off-by: Nicolas Boichat <[email protected]>
---

Changes since v2:
- Clarified commit message
- Add entry in sysfs-kernel-slab to document the new sysfs file

(v3 used the page_frag approach)

Documentation/ABI/testing/sysfs-kernel-slab | 9 +++++++++
include/linux/slab.h | 2 ++
mm/internal.h | 8 ++++++--
mm/slab.c | 4 +++-
mm/slab.h | 3 ++-
mm/slab_common.c | 2 +-
mm/slub.c | 18 +++++++++++++++++-
7 files changed, 40 insertions(+), 6 deletions(-)

diff --git a/Documentation/ABI/testing/sysfs-kernel-slab b/Documentation/ABI/testing/sysfs-kernel-slab
index 29601d93a1c2ea..d742c6cfdffbe9 100644
--- a/Documentation/ABI/testing/sysfs-kernel-slab
+++ b/Documentation/ABI/testing/sysfs-kernel-slab
@@ -106,6 +106,15 @@ Description:
are from ZONE_DMA.
Available when CONFIG_ZONE_DMA is enabled.

+What: /sys/kernel/slab/cache/cache_dma32
+Date: December 2018
+KernelVersion: 4.21
+Contact: Nicolas Boichat <[email protected]>
+Description:
+ The cache_dma32 file is read-only and specifies whether objects
+ are from ZONE_DMA32.
+ Available when CONFIG_ZONE_DMA32 is enabled.
+
What: /sys/kernel/slab/cache/cpu_slabs
Date: May 2007
KernelVersion: 2.6.22
diff --git a/include/linux/slab.h b/include/linux/slab.h
index 11b45f7ae4057c..9449b19c5f107a 100644
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -32,6 +32,8 @@
#define SLAB_HWCACHE_ALIGN ((slab_flags_t __force)0x00002000U)
/* Use GFP_DMA memory */
#define SLAB_CACHE_DMA ((slab_flags_t __force)0x00004000U)
+/* Use GFP_DMA32 memory */
+#define SLAB_CACHE_DMA32 ((slab_flags_t __force)0x00008000U)
/* DEBUG: Store the last owner for bug hunting */
#define SLAB_STORE_USER ((slab_flags_t __force)0x00010000U)
/* Panic if kmem_cache_create() fails */
diff --git a/mm/internal.h b/mm/internal.h
index a2ee82a0cd44ae..fd244ad716eaf8 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -14,6 +14,7 @@
#include <linux/fs.h>
#include <linux/mm.h>
#include <linux/pagemap.h>
+#include <linux/slab.h>
#include <linux/tracepoint-defs.h>

/*
@@ -34,9 +35,12 @@
#define GFP_CONSTRAINT_MASK (__GFP_HARDWALL|__GFP_THISNODE)

/* Check for flags that must not be used with a slab allocator */
-static inline gfp_t check_slab_flags(gfp_t flags)
+static inline gfp_t check_slab_flags(gfp_t flags, slab_flags_t slab_flags)
{
- gfp_t bug_mask = __GFP_DMA32 | __GFP_HIGHMEM | ~__GFP_BITS_MASK;
+ gfp_t bug_mask = __GFP_HIGHMEM | ~__GFP_BITS_MASK;
+
+ if (!IS_ENABLED(CONFIG_ZONE_DMA32) || !(slab_flags & SLAB_CACHE_DMA32))
+ bug_mask |= __GFP_DMA32;

if (unlikely(flags & bug_mask)) {
gfp_t invalid_mask = flags & bug_mask;
diff --git a/mm/slab.c b/mm/slab.c
index 65a774f05e7836..2fd3b9a996cbe6 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -2109,6 +2109,8 @@ int __kmem_cache_create(struct kmem_cache *cachep, slab_flags_t flags)
cachep->allocflags = __GFP_COMP;
if (flags & SLAB_CACHE_DMA)
cachep->allocflags |= GFP_DMA;
+ if (flags & SLAB_CACHE_DMA32)
+ cachep->allocflags |= GFP_DMA32;
if (flags & SLAB_RECLAIM_ACCOUNT)
cachep->allocflags |= __GFP_RECLAIMABLE;
cachep->size = size;
@@ -2643,7 +2645,7 @@ static struct page *cache_grow_begin(struct kmem_cache *cachep,
* Be lazy and only check for valid flags here, keeping it out of the
* critical path in kmem_cache_alloc().
*/
- flags = check_slab_flags(flags);
+ flags = check_slab_flags(flags, cachep->flags);
WARN_ON_ONCE(cachep->ctor && (flags & __GFP_ZERO));
local_flags = flags & (GFP_CONSTRAINT_MASK|GFP_RECLAIM_MASK);

diff --git a/mm/slab.h b/mm/slab.h
index 4190c24ef0e9df..fcf717e12f0a86 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -127,7 +127,8 @@ static inline slab_flags_t kmem_cache_flags(unsigned int object_size,


/* Legal flag mask for kmem_cache_create(), for various configurations */
-#define SLAB_CORE_FLAGS (SLAB_HWCACHE_ALIGN | SLAB_CACHE_DMA | SLAB_PANIC | \
+#define SLAB_CORE_FLAGS (SLAB_HWCACHE_ALIGN | SLAB_CACHE_DMA | \
+ SLAB_CACHE_DMA32 | SLAB_PANIC | \
SLAB_TYPESAFE_BY_RCU | SLAB_DEBUG_OBJECTS )

#if defined(CONFIG_DEBUG_SLAB)
diff --git a/mm/slab_common.c b/mm/slab_common.c
index 70b0cc85db67f8..18b7b809c8d064 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -53,7 +53,7 @@ static DECLARE_WORK(slab_caches_to_rcu_destroy_work,
SLAB_FAILSLAB | SLAB_KASAN)

#define SLAB_MERGE_SAME (SLAB_RECLAIM_ACCOUNT | SLAB_CACHE_DMA | \
- SLAB_ACCOUNT)
+ SLAB_CACHE_DMA32 | SLAB_ACCOUNT)

/*
* Merge control. If this is set then no merging of slab caches will occur.
diff --git a/mm/slub.c b/mm/slub.c
index 21a3f6866da472..6d47765a82d150 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1685,7 +1685,7 @@ static struct page *allocate_slab(struct kmem_cache *s, gfp_t flags, int node)

static struct page *new_slab(struct kmem_cache *s, gfp_t flags, int node)
{
- flags = check_slab_flags(flags);
+ flags = check_slab_flags(flags, s->flags);

return allocate_slab(s,
flags & (GFP_RECLAIM_MASK | GFP_CONSTRAINT_MASK), node);
@@ -3577,6 +3577,9 @@ static int calculate_sizes(struct kmem_cache *s, int forced_order)
if (s->flags & SLAB_CACHE_DMA)
s->allocflags |= GFP_DMA;

+ if (s->flags & SLAB_CACHE_DMA32)
+ s->allocflags |= GFP_DMA32;
+
if (s->flags & SLAB_RECLAIM_ACCOUNT)
s->allocflags |= __GFP_RECLAIMABLE;

@@ -5095,6 +5098,14 @@ static ssize_t cache_dma_show(struct kmem_cache *s, char *buf)
SLAB_ATTR_RO(cache_dma);
#endif

+#ifdef CONFIG_ZONE_DMA32
+static ssize_t cache_dma32_show(struct kmem_cache *s, char *buf)
+{
+ return sprintf(buf, "%d\n", !!(s->flags & SLAB_CACHE_DMA32));
+}
+SLAB_ATTR_RO(cache_dma32);
+#endif
+
static ssize_t usersize_show(struct kmem_cache *s, char *buf)
{
return sprintf(buf, "%u\n", s->usersize);
@@ -5435,6 +5446,9 @@ static struct attribute *slab_attrs[] = {
#ifdef CONFIG_ZONE_DMA
&cache_dma_attr.attr,
#endif
+#ifdef CONFIG_ZONE_DMA32
+ &cache_dma32_attr.attr,
+#endif
#ifdef CONFIG_NUMA
&remote_node_defrag_ratio_attr.attr,
#endif
@@ -5665,6 +5679,8 @@ static char *create_unique_id(struct kmem_cache *s)
*/
if (s->flags & SLAB_CACHE_DMA)
*p++ = 'd';
+ if (s->flags & SLAB_CACHE_DMA32)
+ *p++ = 'D';
if (s->flags & SLAB_RECLAIM_ACCOUNT)
*p++ = 'a';
if (s->flags & SLAB_CONSISTENCY_CHECKS)
--
2.20.0.rc1.387.gf8505762e3-goog


2018-12-05 05:50:13

by Nicolas Boichat

[permalink] [raw]
Subject: [PATCH v4 3/3] iommu/io-pgtable-arm-v7s: Request DMA32 memory, and improve debugging

IOMMUs using ARMv7 short-descriptor format require page tables
(level 1 and 2) to be allocated within the first 4GB of RAM, even
on 64-bit systems.

For level 1/2 pages, ensure GFP_DMA32 is used if CONFIG_ZONE_DMA32
is defined (e.g. on arm64 platforms).

For level 2 pages, allocate a slab cache in SLAB_CACHE_DMA32.

Also, print an error when the physical address does not fit in
32-bit, to make debugging easier in the future.

Fixes: ad67f5a6545f ("arm64: replace ZONE_DMA with ZONE_DMA32")
Signed-off-by: Nicolas Boichat <[email protected]>
---

Changes since v2:
- Commit message

(v3 used the page_frag approach)

drivers/iommu/io-pgtable-arm-v7s.c | 20 ++++++++++++++++----
1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/io-pgtable-arm-v7s.c b/drivers/iommu/io-pgtable-arm-v7s.c
index 445c3bde04800c..996f7b6d00b44a 100644
--- a/drivers/iommu/io-pgtable-arm-v7s.c
+++ b/drivers/iommu/io-pgtable-arm-v7s.c
@@ -161,6 +161,14 @@

#define ARM_V7S_TCR_PD1 BIT(5)

+#ifdef CONFIG_ZONE_DMA32
+#define ARM_V7S_TABLE_GFP_DMA GFP_DMA32
+#define ARM_V7S_TABLE_SLAB_CACHE SLAB_CACHE_DMA32
+#else
+#define ARM_V7S_TABLE_GFP_DMA GFP_DMA
+#define ARM_V7S_TABLE_SLAB_CACHE SLAB_CACHE_DMA
+#endif
+
typedef u32 arm_v7s_iopte;

static bool selftest_running;
@@ -198,13 +206,17 @@ static void *__arm_v7s_alloc_table(int lvl, gfp_t gfp,
void *table = NULL;

if (lvl == 1)
- table = (void *)__get_dma_pages(__GFP_ZERO, get_order(size));
+ table = (void *)__get_free_pages(
+ __GFP_ZERO | ARM_V7S_TABLE_GFP_DMA, get_order(size));
else if (lvl == 2)
- table = kmem_cache_zalloc(data->l2_tables, gfp | GFP_DMA);
+ table = kmem_cache_zalloc(data->l2_tables,
+ gfp | ARM_V7S_TABLE_GFP_DMA);
phys = virt_to_phys(table);
- if (phys != (arm_v7s_iopte)phys)
+ if (phys != (arm_v7s_iopte)phys) {
/* Doesn't fit in PTE */
+ dev_err(dev, "Page table does not fit in PTE: %pa", &phys);
goto out_free;
+ }
if (table && !(cfg->quirks & IO_PGTABLE_QUIRK_NO_DMA)) {
dma = dma_map_single(dev, table, size, DMA_TO_DEVICE);
if (dma_mapping_error(dev, dma))
@@ -737,7 +749,7 @@ static struct io_pgtable *arm_v7s_alloc_pgtable(struct io_pgtable_cfg *cfg,
data->l2_tables = kmem_cache_create("io-pgtable_armv7s_l2",
ARM_V7S_TABLE_SIZE(2),
ARM_V7S_TABLE_SIZE(2),
- SLAB_CACHE_DMA, NULL);
+ ARM_V7S_TABLE_SLAB_CACHE, NULL);
if (!data->l2_tables)
goto out_free_data;

--
2.20.0.rc1.387.gf8505762e3-goog


2018-12-05 05:50:51

by Nicolas Boichat

[permalink] [raw]
Subject: [PATCH v4 1/3] mm: slab/slub: Add check_slab_flags function to check for valid flags

Remove duplicated code between slab and slub, and will make it
easier to make the test more complicated in the next commits.

Fixes: ad67f5a6545f ("arm64: replace ZONE_DMA with ZONE_DMA32")
Signed-off-by: Nicolas Boichat <[email protected]>
---
mm/internal.h | 18 ++++++++++++++++--
mm/slab.c | 8 +-------
mm/slub.c | 8 +-------
3 files changed, 18 insertions(+), 16 deletions(-)

diff --git a/mm/internal.h b/mm/internal.h
index f0c9ccde3bdb9e..a2ee82a0cd44ae 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -33,8 +33,22 @@
/* Control allocation cpuset and node placement constraints */
#define GFP_CONSTRAINT_MASK (__GFP_HARDWALL|__GFP_THISNODE)

-/* Do not use these with a slab allocator */
-#define GFP_SLAB_BUG_MASK (__GFP_DMA32|__GFP_HIGHMEM|~__GFP_BITS_MASK)
+/* Check for flags that must not be used with a slab allocator */
+static inline gfp_t check_slab_flags(gfp_t flags)
+{
+ gfp_t bug_mask = __GFP_DMA32 | __GFP_HIGHMEM | ~__GFP_BITS_MASK;
+
+ if (unlikely(flags & bug_mask)) {
+ gfp_t invalid_mask = flags & bug_mask;
+
+ flags &= ~bug_mask;
+ pr_warn("Unexpected gfp: %#x (%pGg). Fixing up to gfp: %#x (%pGg). Fix your code!\n",
+ invalid_mask, &invalid_mask, flags, &flags);
+ dump_stack();
+ }
+
+ return flags;
+}

void page_writeback_init(void);

diff --git a/mm/slab.c b/mm/slab.c
index 73fe23e649c91a..65a774f05e7836 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -2643,13 +2643,7 @@ static struct page *cache_grow_begin(struct kmem_cache *cachep,
* Be lazy and only check for valid flags here, keeping it out of the
* critical path in kmem_cache_alloc().
*/
- if (unlikely(flags & GFP_SLAB_BUG_MASK)) {
- gfp_t invalid_mask = flags & GFP_SLAB_BUG_MASK;
- flags &= ~GFP_SLAB_BUG_MASK;
- pr_warn("Unexpected gfp: %#x (%pGg). Fixing up to gfp: %#x (%pGg). Fix your code!\n",
- invalid_mask, &invalid_mask, flags, &flags);
- dump_stack();
- }
+ flags = check_slab_flags(flags);
WARN_ON_ONCE(cachep->ctor && (flags & __GFP_ZERO));
local_flags = flags & (GFP_CONSTRAINT_MASK|GFP_RECLAIM_MASK);

diff --git a/mm/slub.c b/mm/slub.c
index c229a9b7dd5448..21a3f6866da472 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -1685,13 +1685,7 @@ static struct page *allocate_slab(struct kmem_cache *s, gfp_t flags, int node)

static struct page *new_slab(struct kmem_cache *s, gfp_t flags, int node)
{
- if (unlikely(flags & GFP_SLAB_BUG_MASK)) {
- gfp_t invalid_mask = flags & GFP_SLAB_BUG_MASK;
- flags &= ~GFP_SLAB_BUG_MASK;
- pr_warn("Unexpected gfp: %#x (%pGg). Fixing up to gfp: %#x (%pGg). Fix your code!\n",
- invalid_mask, &invalid_mask, flags, &flags);
- dump_stack();
- }
+ flags = check_slab_flags(flags);

return allocate_slab(s,
flags & (GFP_RECLAIM_MASK | GFP_CONSTRAINT_MASK), node);
--
2.20.0.rc1.387.gf8505762e3-goog


2018-12-05 07:26:37

by Wei Yang

[permalink] [raw]
Subject: Re: [PATCH v4 2/3] mm: Add support for kmem caches in DMA32 zone

On Wed, Dec 05, 2018 at 01:48:27PM +0800, Nicolas Boichat wrote:
>In some cases (e.g. IOMMU ARMv7s page allocator), we need to allocate
>data structures smaller than a page with GFP_DMA32 flag.
>
>This change makes it possible to create a custom cache in DMA32 zone
>using kmem_cache_create, then allocate memory using kmem_cache_alloc.
>
>We do not create a DMA32 kmalloc cache array, as there are currently
>no users of kmalloc(..., GFP_DMA32). The new test in check_slab_flags
>ensures that such calls still fail (as they do before this change).
>
>Fixes: ad67f5a6545f ("arm64: replace ZONE_DMA with ZONE_DMA32")
>Signed-off-by: Nicolas Boichat <[email protected]>
>---
>
>Changes since v2:
> - Clarified commit message
> - Add entry in sysfs-kernel-slab to document the new sysfs file
>
>(v3 used the page_frag approach)
>
>Documentation/ABI/testing/sysfs-kernel-slab | 9 +++++++++
> include/linux/slab.h | 2 ++
> mm/internal.h | 8 ++++++--
> mm/slab.c | 4 +++-
> mm/slab.h | 3 ++-
> mm/slab_common.c | 2 +-
> mm/slub.c | 18 +++++++++++++++++-
> 7 files changed, 40 insertions(+), 6 deletions(-)
>
>diff --git a/Documentation/ABI/testing/sysfs-kernel-slab b/Documentation/ABI/testing/sysfs-kernel-slab
>index 29601d93a1c2ea..d742c6cfdffbe9 100644
>--- a/Documentation/ABI/testing/sysfs-kernel-slab
>+++ b/Documentation/ABI/testing/sysfs-kernel-slab
>@@ -106,6 +106,15 @@ Description:
> are from ZONE_DMA.
> Available when CONFIG_ZONE_DMA is enabled.
>
>+What: /sys/kernel/slab/cache/cache_dma32
>+Date: December 2018
>+KernelVersion: 4.21
>+Contact: Nicolas Boichat <[email protected]>
>+Description:
>+ The cache_dma32 file is read-only and specifies whether objects
>+ are from ZONE_DMA32.
>+ Available when CONFIG_ZONE_DMA32 is enabled.
>+
> What: /sys/kernel/slab/cache/cpu_slabs
> Date: May 2007
> KernelVersion: 2.6.22
>diff --git a/include/linux/slab.h b/include/linux/slab.h
>index 11b45f7ae4057c..9449b19c5f107a 100644
>--- a/include/linux/slab.h
>+++ b/include/linux/slab.h
>@@ -32,6 +32,8 @@
> #define SLAB_HWCACHE_ALIGN ((slab_flags_t __force)0x00002000U)
> /* Use GFP_DMA memory */
> #define SLAB_CACHE_DMA ((slab_flags_t __force)0x00004000U)
>+/* Use GFP_DMA32 memory */
>+#define SLAB_CACHE_DMA32 ((slab_flags_t __force)0x00008000U)
> /* DEBUG: Store the last owner for bug hunting */
> #define SLAB_STORE_USER ((slab_flags_t __force)0x00010000U)
> /* Panic if kmem_cache_create() fails */
>diff --git a/mm/internal.h b/mm/internal.h
>index a2ee82a0cd44ae..fd244ad716eaf8 100644
>--- a/mm/internal.h
>+++ b/mm/internal.h
>@@ -14,6 +14,7 @@
> #include <linux/fs.h>
> #include <linux/mm.h>
> #include <linux/pagemap.h>
>+#include <linux/slab.h>
> #include <linux/tracepoint-defs.h>
>
> /*
>@@ -34,9 +35,12 @@
> #define GFP_CONSTRAINT_MASK (__GFP_HARDWALL|__GFP_THISNODE)
>
> /* Check for flags that must not be used with a slab allocator */
>-static inline gfp_t check_slab_flags(gfp_t flags)
>+static inline gfp_t check_slab_flags(gfp_t flags, slab_flags_t slab_flags)
> {
>- gfp_t bug_mask = __GFP_DMA32 | __GFP_HIGHMEM | ~__GFP_BITS_MASK;
>+ gfp_t bug_mask = __GFP_HIGHMEM | ~__GFP_BITS_MASK;
>+
>+ if (!IS_ENABLED(CONFIG_ZONE_DMA32) || !(slab_flags & SLAB_CACHE_DMA32))
>+ bug_mask |= __GFP_DMA32;

The original version doesn't check CONFIG_ZONE_DMA32.

Do we need to add this condition here?
Could we just decide the bug_mask based on slab_flags?

>
> if (unlikely(flags & bug_mask)) {
> gfp_t invalid_mask = flags & bug_mask;
>diff --git a/mm/slab.c b/mm/slab.c
>index 65a774f05e7836..2fd3b9a996cbe6 100644
>--- a/mm/slab.c
>+++ b/mm/slab.c
>@@ -2109,6 +2109,8 @@ int __kmem_cache_create(struct kmem_cache *cachep, slab_flags_t flags)
> cachep->allocflags = __GFP_COMP;
> if (flags & SLAB_CACHE_DMA)
> cachep->allocflags |= GFP_DMA;
>+ if (flags & SLAB_CACHE_DMA32)
>+ cachep->allocflags |= GFP_DMA32;
> if (flags & SLAB_RECLAIM_ACCOUNT)
> cachep->allocflags |= __GFP_RECLAIMABLE;
> cachep->size = size;
>@@ -2643,7 +2645,7 @@ static struct page *cache_grow_begin(struct kmem_cache *cachep,
> * Be lazy and only check for valid flags here, keeping it out of the
> * critical path in kmem_cache_alloc().
> */
>- flags = check_slab_flags(flags);
>+ flags = check_slab_flags(flags, cachep->flags);
> WARN_ON_ONCE(cachep->ctor && (flags & __GFP_ZERO));
> local_flags = flags & (GFP_CONSTRAINT_MASK|GFP_RECLAIM_MASK);
>
>diff --git a/mm/slab.h b/mm/slab.h
>index 4190c24ef0e9df..fcf717e12f0a86 100644
>--- a/mm/slab.h
>+++ b/mm/slab.h
>@@ -127,7 +127,8 @@ static inline slab_flags_t kmem_cache_flags(unsigned int object_size,
>
>
> /* Legal flag mask for kmem_cache_create(), for various configurations */
>-#define SLAB_CORE_FLAGS (SLAB_HWCACHE_ALIGN | SLAB_CACHE_DMA | SLAB_PANIC | \
>+#define SLAB_CORE_FLAGS (SLAB_HWCACHE_ALIGN | SLAB_CACHE_DMA | \
>+ SLAB_CACHE_DMA32 | SLAB_PANIC | \
> SLAB_TYPESAFE_BY_RCU | SLAB_DEBUG_OBJECTS )
>
> #if defined(CONFIG_DEBUG_SLAB)
>diff --git a/mm/slab_common.c b/mm/slab_common.c
>index 70b0cc85db67f8..18b7b809c8d064 100644
>--- a/mm/slab_common.c
>+++ b/mm/slab_common.c
>@@ -53,7 +53,7 @@ static DECLARE_WORK(slab_caches_to_rcu_destroy_work,
> SLAB_FAILSLAB | SLAB_KASAN)
>
> #define SLAB_MERGE_SAME (SLAB_RECLAIM_ACCOUNT | SLAB_CACHE_DMA | \
>- SLAB_ACCOUNT)
>+ SLAB_CACHE_DMA32 | SLAB_ACCOUNT)
>
> /*
> * Merge control. If this is set then no merging of slab caches will occur.
>diff --git a/mm/slub.c b/mm/slub.c
>index 21a3f6866da472..6d47765a82d150 100644
>--- a/mm/slub.c
>+++ b/mm/slub.c
>@@ -1685,7 +1685,7 @@ static struct page *allocate_slab(struct kmem_cache *s, gfp_t flags, int node)
>
> static struct page *new_slab(struct kmem_cache *s, gfp_t flags, int node)
> {
>- flags = check_slab_flags(flags);
>+ flags = check_slab_flags(flags, s->flags);
>
> return allocate_slab(s,
> flags & (GFP_RECLAIM_MASK | GFP_CONSTRAINT_MASK), node);
>@@ -3577,6 +3577,9 @@ static int calculate_sizes(struct kmem_cache *s, int forced_order)
> if (s->flags & SLAB_CACHE_DMA)
> s->allocflags |= GFP_DMA;
>
>+ if (s->flags & SLAB_CACHE_DMA32)
>+ s->allocflags |= GFP_DMA32;
>+
> if (s->flags & SLAB_RECLAIM_ACCOUNT)
> s->allocflags |= __GFP_RECLAIMABLE;
>
>@@ -5095,6 +5098,14 @@ static ssize_t cache_dma_show(struct kmem_cache *s, char *buf)
> SLAB_ATTR_RO(cache_dma);
> #endif
>
>+#ifdef CONFIG_ZONE_DMA32
>+static ssize_t cache_dma32_show(struct kmem_cache *s, char *buf)
>+{
>+ return sprintf(buf, "%d\n", !!(s->flags & SLAB_CACHE_DMA32));
>+}
>+SLAB_ATTR_RO(cache_dma32);
>+#endif
>+
> static ssize_t usersize_show(struct kmem_cache *s, char *buf)
> {
> return sprintf(buf, "%u\n", s->usersize);
>@@ -5435,6 +5446,9 @@ static struct attribute *slab_attrs[] = {
> #ifdef CONFIG_ZONE_DMA
> &cache_dma_attr.attr,
> #endif
>+#ifdef CONFIG_ZONE_DMA32
>+ &cache_dma32_attr.attr,
>+#endif
> #ifdef CONFIG_NUMA
> &remote_node_defrag_ratio_attr.attr,
> #endif
>@@ -5665,6 +5679,8 @@ static char *create_unique_id(struct kmem_cache *s)
> */
> if (s->flags & SLAB_CACHE_DMA)
> *p++ = 'd';
>+ if (s->flags & SLAB_CACHE_DMA32)
>+ *p++ = 'D';
> if (s->flags & SLAB_RECLAIM_ACCOUNT)
> *p++ = 'a';
> if (s->flags & SLAB_CONSISTENCY_CHECKS)
>--
>2.20.0.rc1.387.gf8505762e3-goog
>
>_______________________________________________
>iommu mailing list
>[email protected]
>https://lists.linuxfoundation.org/mailman/listinfo/iommu

--
Wei Yang
Help you, Help me

2018-12-05 07:42:06

by Nicolas Boichat

[permalink] [raw]
Subject: Re: [PATCH v4 2/3] mm: Add support for kmem caches in DMA32 zone

On Wed, Dec 5, 2018 at 3:25 PM Wei Yang <[email protected]> wrote:
>
> On Wed, Dec 05, 2018 at 01:48:27PM +0800, Nicolas Boichat wrote:
> >In some cases (e.g. IOMMU ARMv7s page allocator), we need to allocate
> >data structures smaller than a page with GFP_DMA32 flag.
> >
> >This change makes it possible to create a custom cache in DMA32 zone
> >using kmem_cache_create, then allocate memory using kmem_cache_alloc.
> >
> >We do not create a DMA32 kmalloc cache array, as there are currently
> >no users of kmalloc(..., GFP_DMA32). The new test in check_slab_flags
> >ensures that such calls still fail (as they do before this change).
> >
> >Fixes: ad67f5a6545f ("arm64: replace ZONE_DMA with ZONE_DMA32")
> >Signed-off-by: Nicolas Boichat <[email protected]>
> >---
> >
> >Changes since v2:
> > - Clarified commit message
> > - Add entry in sysfs-kernel-slab to document the new sysfs file
> >
> >(v3 used the page_frag approach)
> >
> >Documentation/ABI/testing/sysfs-kernel-slab | 9 +++++++++
> > include/linux/slab.h | 2 ++
> > mm/internal.h | 8 ++++++--
> > mm/slab.c | 4 +++-
> > mm/slab.h | 3 ++-
> > mm/slab_common.c | 2 +-
> > mm/slub.c | 18 +++++++++++++++++-
> > 7 files changed, 40 insertions(+), 6 deletions(-)
> >
> >diff --git a/Documentation/ABI/testing/sysfs-kernel-slab b/Documentation/ABI/testing/sysfs-kernel-slab
> >index 29601d93a1c2ea..d742c6cfdffbe9 100644
> >--- a/Documentation/ABI/testing/sysfs-kernel-slab
> >+++ b/Documentation/ABI/testing/sysfs-kernel-slab
> >@@ -106,6 +106,15 @@ Description:
> > are from ZONE_DMA.
> > Available when CONFIG_ZONE_DMA is enabled.
> >
> >+What: /sys/kernel/slab/cache/cache_dma32
> >+Date: December 2018
> >+KernelVersion: 4.21
> >+Contact: Nicolas Boichat <[email protected]>
> >+Description:
> >+ The cache_dma32 file is read-only and specifies whether objects
> >+ are from ZONE_DMA32.
> >+ Available when CONFIG_ZONE_DMA32 is enabled.
> >+
> > What: /sys/kernel/slab/cache/cpu_slabs
> > Date: May 2007
> > KernelVersion: 2.6.22
> >diff --git a/include/linux/slab.h b/include/linux/slab.h
> >index 11b45f7ae4057c..9449b19c5f107a 100644
> >--- a/include/linux/slab.h
> >+++ b/include/linux/slab.h
> >@@ -32,6 +32,8 @@
> > #define SLAB_HWCACHE_ALIGN ((slab_flags_t __force)0x00002000U)
> > /* Use GFP_DMA memory */
> > #define SLAB_CACHE_DMA ((slab_flags_t __force)0x00004000U)
> >+/* Use GFP_DMA32 memory */
> >+#define SLAB_CACHE_DMA32 ((slab_flags_t __force)0x00008000U)
> > /* DEBUG: Store the last owner for bug hunting */
> > #define SLAB_STORE_USER ((slab_flags_t __force)0x00010000U)
> > /* Panic if kmem_cache_create() fails */
> >diff --git a/mm/internal.h b/mm/internal.h
> >index a2ee82a0cd44ae..fd244ad716eaf8 100644
> >--- a/mm/internal.h
> >+++ b/mm/internal.h
> >@@ -14,6 +14,7 @@
> > #include <linux/fs.h>
> > #include <linux/mm.h>
> > #include <linux/pagemap.h>
> >+#include <linux/slab.h>
> > #include <linux/tracepoint-defs.h>
> >
> > /*
> >@@ -34,9 +35,12 @@
> > #define GFP_CONSTRAINT_MASK (__GFP_HARDWALL|__GFP_THISNODE)
> >
> > /* Check for flags that must not be used with a slab allocator */
> >-static inline gfp_t check_slab_flags(gfp_t flags)
> >+static inline gfp_t check_slab_flags(gfp_t flags, slab_flags_t slab_flags)
> > {
> >- gfp_t bug_mask = __GFP_DMA32 | __GFP_HIGHMEM | ~__GFP_BITS_MASK;
> >+ gfp_t bug_mask = __GFP_HIGHMEM | ~__GFP_BITS_MASK;
> >+
> >+ if (!IS_ENABLED(CONFIG_ZONE_DMA32) || !(slab_flags & SLAB_CACHE_DMA32))
> >+ bug_mask |= __GFP_DMA32;
>
> The original version doesn't check CONFIG_ZONE_DMA32.
>
> Do we need to add this condition here?
> Could we just decide the bug_mask based on slab_flags?

We can. The reason I did it this way is that when we don't have
CONFIG_ZONE_DMA32, the compiler should be able to simplify to:

bug_mask = __GFP_HIGHMEM | ~__GFP_BITS_MASK;
if (true || ..) => if (true)
bug_mask |= __GFP_DMA32;

Then just
bug_mask = __GFP_HIGHMEM | ~__GFP_BITS_MASK | __GFP_DMA32;

And since the function is inline, slab_flags would not even need to be
accessed at all.

> >
> > if (unlikely(flags & bug_mask)) {
> > gfp_t invalid_mask = flags & bug_mask;
> >diff --git a/mm/slab.c b/mm/slab.c
> >index 65a774f05e7836..2fd3b9a996cbe6 100644
> >--- a/mm/slab.c
> >+++ b/mm/slab.c
> >@@ -2109,6 +2109,8 @@ int __kmem_cache_create(struct kmem_cache *cachep, slab_flags_t flags)
> > cachep->allocflags = __GFP_COMP;
> > if (flags & SLAB_CACHE_DMA)
> > cachep->allocflags |= GFP_DMA;
> >+ if (flags & SLAB_CACHE_DMA32)
> >+ cachep->allocflags |= GFP_DMA32;
> > if (flags & SLAB_RECLAIM_ACCOUNT)
> > cachep->allocflags |= __GFP_RECLAIMABLE;
> > cachep->size = size;
> >@@ -2643,7 +2645,7 @@ static struct page *cache_grow_begin(struct kmem_cache *cachep,
> > * Be lazy and only check for valid flags here, keeping it out of the
> > * critical path in kmem_cache_alloc().
> > */
> >- flags = check_slab_flags(flags);
> >+ flags = check_slab_flags(flags, cachep->flags);
> > WARN_ON_ONCE(cachep->ctor && (flags & __GFP_ZERO));
> > local_flags = flags & (GFP_CONSTRAINT_MASK|GFP_RECLAIM_MASK);
> >
> >diff --git a/mm/slab.h b/mm/slab.h
> >index 4190c24ef0e9df..fcf717e12f0a86 100644
> >--- a/mm/slab.h
> >+++ b/mm/slab.h
> >@@ -127,7 +127,8 @@ static inline slab_flags_t kmem_cache_flags(unsigned int object_size,
> >
> >
> > /* Legal flag mask for kmem_cache_create(), for various configurations */
> >-#define SLAB_CORE_FLAGS (SLAB_HWCACHE_ALIGN | SLAB_CACHE_DMA | SLAB_PANIC | \
> >+#define SLAB_CORE_FLAGS (SLAB_HWCACHE_ALIGN | SLAB_CACHE_DMA | \
> >+ SLAB_CACHE_DMA32 | SLAB_PANIC | \
> > SLAB_TYPESAFE_BY_RCU | SLAB_DEBUG_OBJECTS )
> >
> > #if defined(CONFIG_DEBUG_SLAB)
> >diff --git a/mm/slab_common.c b/mm/slab_common.c
> >index 70b0cc85db67f8..18b7b809c8d064 100644
> >--- a/mm/slab_common.c
> >+++ b/mm/slab_common.c
> >@@ -53,7 +53,7 @@ static DECLARE_WORK(slab_caches_to_rcu_destroy_work,
> > SLAB_FAILSLAB | SLAB_KASAN)
> >
> > #define SLAB_MERGE_SAME (SLAB_RECLAIM_ACCOUNT | SLAB_CACHE_DMA | \
> >- SLAB_ACCOUNT)
> >+ SLAB_CACHE_DMA32 | SLAB_ACCOUNT)
> >
> > /*
> > * Merge control. If this is set then no merging of slab caches will occur.
> >diff --git a/mm/slub.c b/mm/slub.c
> >index 21a3f6866da472..6d47765a82d150 100644
> >--- a/mm/slub.c
> >+++ b/mm/slub.c
> >@@ -1685,7 +1685,7 @@ static struct page *allocate_slab(struct kmem_cache *s, gfp_t flags, int node)
> >
> > static struct page *new_slab(struct kmem_cache *s, gfp_t flags, int node)
> > {
> >- flags = check_slab_flags(flags);
> >+ flags = check_slab_flags(flags, s->flags);
> >
> > return allocate_slab(s,
> > flags & (GFP_RECLAIM_MASK | GFP_CONSTRAINT_MASK), node);
> >@@ -3577,6 +3577,9 @@ static int calculate_sizes(struct kmem_cache *s, int forced_order)
> > if (s->flags & SLAB_CACHE_DMA)
> > s->allocflags |= GFP_DMA;
> >
> >+ if (s->flags & SLAB_CACHE_DMA32)
> >+ s->allocflags |= GFP_DMA32;
> >+
> > if (s->flags & SLAB_RECLAIM_ACCOUNT)
> > s->allocflags |= __GFP_RECLAIMABLE;
> >
> >@@ -5095,6 +5098,14 @@ static ssize_t cache_dma_show(struct kmem_cache *s, char *buf)
> > SLAB_ATTR_RO(cache_dma);
> > #endif
> >
> >+#ifdef CONFIG_ZONE_DMA32
> >+static ssize_t cache_dma32_show(struct kmem_cache *s, char *buf)
> >+{
> >+ return sprintf(buf, "%d\n", !!(s->flags & SLAB_CACHE_DMA32));
> >+}
> >+SLAB_ATTR_RO(cache_dma32);
> >+#endif
> >+
> > static ssize_t usersize_show(struct kmem_cache *s, char *buf)
> > {
> > return sprintf(buf, "%u\n", s->usersize);
> >@@ -5435,6 +5446,9 @@ static struct attribute *slab_attrs[] = {
> > #ifdef CONFIG_ZONE_DMA
> > &cache_dma_attr.attr,
> > #endif
> >+#ifdef CONFIG_ZONE_DMA32
> >+ &cache_dma32_attr.attr,
> >+#endif
> > #ifdef CONFIG_NUMA
> > &remote_node_defrag_ratio_attr.attr,
> > #endif
> >@@ -5665,6 +5679,8 @@ static char *create_unique_id(struct kmem_cache *s)
> > */
> > if (s->flags & SLAB_CACHE_DMA)
> > *p++ = 'd';
> >+ if (s->flags & SLAB_CACHE_DMA32)
> >+ *p++ = 'D';
> > if (s->flags & SLAB_RECLAIM_ACCOUNT)
> > *p++ = 'a';
> > if (s->flags & SLAB_CONSISTENCY_CHECKS)
> >--
> >2.20.0.rc1.387.gf8505762e3-goog
> >
> >_______________________________________________
> >iommu mailing list
> >[email protected]
> >https://lists.linuxfoundation.org/mailman/listinfo/iommu
>
> --
> Wei Yang
> Help you, Help me

2018-12-05 09:21:23

by Wei Yang

[permalink] [raw]
Subject: Re: [PATCH v4 2/3] mm: Add support for kmem caches in DMA32 zone

On Wed, Dec 05, 2018 at 03:39:51PM +0800, Nicolas Boichat wrote:
>On Wed, Dec 5, 2018 at 3:25 PM Wei Yang <[email protected]> wrote:
>>
>> On Wed, Dec 05, 2018 at 01:48:27PM +0800, Nicolas Boichat wrote:
>> >In some cases (e.g. IOMMU ARMv7s page allocator), we need to allocate
>> >data structures smaller than a page with GFP_DMA32 flag.
>> >
>> >This change makes it possible to create a custom cache in DMA32 zone
>> >using kmem_cache_create, then allocate memory using kmem_cache_alloc.
>> >
>> >We do not create a DMA32 kmalloc cache array, as there are currently
>> >no users of kmalloc(..., GFP_DMA32). The new test in check_slab_flags
>> >ensures that such calls still fail (as they do before this change).
>> >
>> >Fixes: ad67f5a6545f ("arm64: replace ZONE_DMA with ZONE_DMA32")
>> >Signed-off-by: Nicolas Boichat <[email protected]>
>> >---
>> >
>> >Changes since v2:
>> > - Clarified commit message
>> > - Add entry in sysfs-kernel-slab to document the new sysfs file
>> >
>> >(v3 used the page_frag approach)
>> >
>> >Documentation/ABI/testing/sysfs-kernel-slab | 9 +++++++++
>> > include/linux/slab.h | 2 ++
>> > mm/internal.h | 8 ++++++--
>> > mm/slab.c | 4 +++-
>> > mm/slab.h | 3 ++-
>> > mm/slab_common.c | 2 +-
>> > mm/slub.c | 18 +++++++++++++++++-
>> > 7 files changed, 40 insertions(+), 6 deletions(-)
>> >
>> >diff --git a/Documentation/ABI/testing/sysfs-kernel-slab b/Documentation/ABI/testing/sysfs-kernel-slab
>> >index 29601d93a1c2ea..d742c6cfdffbe9 100644
>> >--- a/Documentation/ABI/testing/sysfs-kernel-slab
>> >+++ b/Documentation/ABI/testing/sysfs-kernel-slab
>> >@@ -106,6 +106,15 @@ Description:
>> > are from ZONE_DMA.
>> > Available when CONFIG_ZONE_DMA is enabled.
>> >
>> >+What: /sys/kernel/slab/cache/cache_dma32
>> >+Date: December 2018
>> >+KernelVersion: 4.21
>> >+Contact: Nicolas Boichat <[email protected]>
>> >+Description:
>> >+ The cache_dma32 file is read-only and specifies whether objects
>> >+ are from ZONE_DMA32.
>> >+ Available when CONFIG_ZONE_DMA32 is enabled.
>> >+
>> > What: /sys/kernel/slab/cache/cpu_slabs
>> > Date: May 2007
>> > KernelVersion: 2.6.22
>> >diff --git a/include/linux/slab.h b/include/linux/slab.h
>> >index 11b45f7ae4057c..9449b19c5f107a 100644
>> >--- a/include/linux/slab.h
>> >+++ b/include/linux/slab.h
>> >@@ -32,6 +32,8 @@
>> > #define SLAB_HWCACHE_ALIGN ((slab_flags_t __force)0x00002000U)
>> > /* Use GFP_DMA memory */
>> > #define SLAB_CACHE_DMA ((slab_flags_t __force)0x00004000U)
>> >+/* Use GFP_DMA32 memory */
>> >+#define SLAB_CACHE_DMA32 ((slab_flags_t __force)0x00008000U)
>> > /* DEBUG: Store the last owner for bug hunting */
>> > #define SLAB_STORE_USER ((slab_flags_t __force)0x00010000U)
>> > /* Panic if kmem_cache_create() fails */
>> >diff --git a/mm/internal.h b/mm/internal.h
>> >index a2ee82a0cd44ae..fd244ad716eaf8 100644
>> >--- a/mm/internal.h
>> >+++ b/mm/internal.h
>> >@@ -14,6 +14,7 @@
>> > #include <linux/fs.h>
>> > #include <linux/mm.h>
>> > #include <linux/pagemap.h>
>> >+#include <linux/slab.h>
>> > #include <linux/tracepoint-defs.h>
>> >
>> > /*
>> >@@ -34,9 +35,12 @@
>> > #define GFP_CONSTRAINT_MASK (__GFP_HARDWALL|__GFP_THISNODE)
>> >
>> > /* Check for flags that must not be used with a slab allocator */
>> >-static inline gfp_t check_slab_flags(gfp_t flags)
>> >+static inline gfp_t check_slab_flags(gfp_t flags, slab_flags_t slab_flags)
>> > {
>> >- gfp_t bug_mask = __GFP_DMA32 | __GFP_HIGHMEM | ~__GFP_BITS_MASK;
>> >+ gfp_t bug_mask = __GFP_HIGHMEM | ~__GFP_BITS_MASK;
>> >+
>> >+ if (!IS_ENABLED(CONFIG_ZONE_DMA32) || !(slab_flags & SLAB_CACHE_DMA32))
>> >+ bug_mask |= __GFP_DMA32;
>>
>> The original version doesn't check CONFIG_ZONE_DMA32.
>>
>> Do we need to add this condition here?
>> Could we just decide the bug_mask based on slab_flags?
>
>We can. The reason I did it this way is that when we don't have
>CONFIG_ZONE_DMA32, the compiler should be able to simplify to:
>
>bug_mask = __GFP_HIGHMEM | ~__GFP_BITS_MASK;
>if (true || ..) => if (true)
> bug_mask |= __GFP_DMA32;
>
>Then just
>bug_mask = __GFP_HIGHMEM | ~__GFP_BITS_MASK | __GFP_DMA32;
>
>And since the function is inline, slab_flags would not even need to be
>accessed at all.
>

Thanks for explanation. This make sense to me.

--
Wei Yang
Help you, Help me

2018-12-05 09:59:44

by Michal Hocko

[permalink] [raw]
Subject: Re: [PATCH v4 2/3] mm: Add support for kmem caches in DMA32 zone

On Wed 05-12-18 13:48:27, Nicolas Boichat wrote:
> In some cases (e.g. IOMMU ARMv7s page allocator), we need to allocate
> data structures smaller than a page with GFP_DMA32 flag.
>
> This change makes it possible to create a custom cache in DMA32 zone
> using kmem_cache_create, then allocate memory using kmem_cache_alloc.
>
> We do not create a DMA32 kmalloc cache array, as there are currently
> no users of kmalloc(..., GFP_DMA32). The new test in check_slab_flags
> ensures that such calls still fail (as they do before this change).

The changelog should be much more specific about decisions made here.
First of all it would be nice to mention the usecase.

Secondly, why do we need a new sysfs file? Who is going to consume it?

Then why do we need SLAB_MERGE_SAME to cover GFP_DMA32 as well? I
thought the whole point is to use dedicated slab cache. Who is this
going to merge with?
--
Michal Hocko
SUSE Labs

2018-12-05 11:02:23

by Nicolas Boichat

[permalink] [raw]
Subject: Re: [PATCH v4 2/3] mm: Add support for kmem caches in DMA32 zone

On Wed, Dec 5, 2018 at 5:56 PM Michal Hocko <[email protected]> wrote:
>
> On Wed 05-12-18 13:48:27, Nicolas Boichat wrote:
> > In some cases (e.g. IOMMU ARMv7s page allocator), we need to allocate
> > data structures smaller than a page with GFP_DMA32 flag.
> >
> > This change makes it possible to create a custom cache in DMA32 zone
> > using kmem_cache_create, then allocate memory using kmem_cache_alloc.
> >
> > We do not create a DMA32 kmalloc cache array, as there are currently
> > no users of kmalloc(..., GFP_DMA32). The new test in check_slab_flags
> > ensures that such calls still fail (as they do before this change).
>
> The changelog should be much more specific about decisions made here.
> First of all it would be nice to mention the usecase.

Ok, I'll copy most of the cover letter text here (i.e. the fact that
IOMMU wants physical memory <4GB for L2 page tables, why it's better
than genalloc/page_frag).

> Secondly, why do we need a new sysfs file? Who is going to consume it?

We have cache_dma, so it seems consistent to add cache_dma32.

I wasn't aware of tools/vm/slabinfo.c, so I can add support for
cache_dma32 in a follow-up patch. Any other user I should take care
of?

> Then why do we need SLAB_MERGE_SAME to cover GFP_DMA32 as well?

SLAB_MERGE_SAME tells us which flags _need_ to be the same for the
slabs to be merged. We don't want slab caches with GFP_DMA32 and
~GFP_DMA32 to be merged, so it should be in there.
(https://elixir.bootlin.com/linux/v4.19.6/source/mm/slab_common.c#L342).

> I
> thought the whole point is to use dedicated slab cache. Who is this
> going to merge with?

Well, if there was another SLAB cache requiring 1KB GFP_DMA32
elements, then I don't see why we would not merge the caches. This is
what happens with this IOMMU L2 tables cache pre-CONFIG_ZONE_DMA32 on
arm64 (output on some 3.18 kernel below), and what would happen on
arm32 since we still use GFP_DMA.

/sys/kernel/slab # ls -l | grep dt-0001024
drwxr-xr-x. 2 root root 0 Dec 5 02:25 :dt-0001024
lrwxrwxrwx. 1 root root 0 Dec 5 02:25 dma-kmalloc-1024 -> :dt-0001024
lrwxrwxrwx. 1 root root 0 Dec 5 02:25 io-pgtable_armv7s_l2 -> :dt-0001024

Thanks!

> --
> Michal Hocko
> SUSE Labs

2018-12-05 11:39:43

by Michal Hocko

[permalink] [raw]
Subject: Re: [PATCH v4 2/3] mm: Add support for kmem caches in DMA32 zone

On Wed 05-12-18 19:01:03, Nicolas Boichat wrote:
[...]
> > Secondly, why do we need a new sysfs file? Who is going to consume it?
>
> We have cache_dma, so it seems consistent to add cache_dma32.

I wouldn't copy a pattern unless there is an explicit usecase for it.
We do expose way too much to userspace and that keeps kicking us later.
Not that I am aware of any specific example for cache_dma but seeing
other examples I would rather be more careful.

> I wasn't aware of tools/vm/slabinfo.c, so I can add support for
> cache_dma32 in a follow-up patch. Any other user I should take care
> of?

In general zones are inernal MM implementation details and the less we
export to userspace the better.

> > Then why do we need SLAB_MERGE_SAME to cover GFP_DMA32 as well?
>
> SLAB_MERGE_SAME tells us which flags _need_ to be the same for the
> slabs to be merged. We don't want slab caches with GFP_DMA32 and
> ~GFP_DMA32 to be merged, so it should be in there.
> (https://elixir.bootlin.com/linux/v4.19.6/source/mm/slab_common.c#L342).

Ohh, my bad, I have misread the change. Sure we definitely not want to
allow merging here. My bad.
--
Michal Hocko
SUSE Labs

2018-12-05 12:20:40

by Wei Yang

[permalink] [raw]
Subject: Re: [PATCH v4 2/3] mm: Add support for kmem caches in DMA32 zone

On Wed, Dec 05, 2018 at 03:39:51PM +0800, Nicolas Boichat wrote:
>On Wed, Dec 5, 2018 at 3:25 PM Wei Yang <[email protected]> wrote:
>>
>> On Wed, Dec 05, 2018 at 01:48:27PM +0800, Nicolas Boichat wrote:
>> >In some cases (e.g. IOMMU ARMv7s page allocator), we need to allocate
>> >data structures smaller than a page with GFP_DMA32 flag.
>> >
>> >This change makes it possible to create a custom cache in DMA32 zone
>> >using kmem_cache_create, then allocate memory using kmem_cache_alloc.
>> >
>> >We do not create a DMA32 kmalloc cache array, as there are currently
>> >no users of kmalloc(..., GFP_DMA32). The new test in check_slab_flags
>> >ensures that such calls still fail (as they do before this change).
>> >
>> >Fixes: ad67f5a6545f ("arm64: replace ZONE_DMA with ZONE_DMA32")
>> >Signed-off-by: Nicolas Boichat <[email protected]>
>> >---
>> >
>> >Changes since v2:
>> > - Clarified commit message
>> > - Add entry in sysfs-kernel-slab to document the new sysfs file
>> >
>> >(v3 used the page_frag approach)
>> >
>> >Documentation/ABI/testing/sysfs-kernel-slab | 9 +++++++++
>> > include/linux/slab.h | 2 ++
>> > mm/internal.h | 8 ++++++--
>> > mm/slab.c | 4 +++-
>> > mm/slab.h | 3 ++-
>> > mm/slab_common.c | 2 +-
>> > mm/slub.c | 18 +++++++++++++++++-
>> > 7 files changed, 40 insertions(+), 6 deletions(-)
>> >
>> >diff --git a/Documentation/ABI/testing/sysfs-kernel-slab b/Documentation/ABI/testing/sysfs-kernel-slab
>> >index 29601d93a1c2ea..d742c6cfdffbe9 100644
>> >--- a/Documentation/ABI/testing/sysfs-kernel-slab
>> >+++ b/Documentation/ABI/testing/sysfs-kernel-slab
>> >@@ -106,6 +106,15 @@ Description:
>> > are from ZONE_DMA.
>> > Available when CONFIG_ZONE_DMA is enabled.
>> >
>> >+What: /sys/kernel/slab/cache/cache_dma32
>> >+Date: December 2018
>> >+KernelVersion: 4.21
>> >+Contact: Nicolas Boichat <[email protected]>
>> >+Description:
>> >+ The cache_dma32 file is read-only and specifies whether objects
>> >+ are from ZONE_DMA32.
>> >+ Available when CONFIG_ZONE_DMA32 is enabled.
>> >+
>> > What: /sys/kernel/slab/cache/cpu_slabs
>> > Date: May 2007
>> > KernelVersion: 2.6.22
>> >diff --git a/include/linux/slab.h b/include/linux/slab.h
>> >index 11b45f7ae4057c..9449b19c5f107a 100644
>> >--- a/include/linux/slab.h
>> >+++ b/include/linux/slab.h
>> >@@ -32,6 +32,8 @@
>> > #define SLAB_HWCACHE_ALIGN ((slab_flags_t __force)0x00002000U)
>> > /* Use GFP_DMA memory */
>> > #define SLAB_CACHE_DMA ((slab_flags_t __force)0x00004000U)
>> >+/* Use GFP_DMA32 memory */
>> >+#define SLAB_CACHE_DMA32 ((slab_flags_t __force)0x00008000U)
>> > /* DEBUG: Store the last owner for bug hunting */
>> > #define SLAB_STORE_USER ((slab_flags_t __force)0x00010000U)
>> > /* Panic if kmem_cache_create() fails */
>> >diff --git a/mm/internal.h b/mm/internal.h
>> >index a2ee82a0cd44ae..fd244ad716eaf8 100644
>> >--- a/mm/internal.h
>> >+++ b/mm/internal.h
>> >@@ -14,6 +14,7 @@
>> > #include <linux/fs.h>
>> > #include <linux/mm.h>
>> > #include <linux/pagemap.h>
>> >+#include <linux/slab.h>
>> > #include <linux/tracepoint-defs.h>
>> >
>> > /*
>> >@@ -34,9 +35,12 @@
>> > #define GFP_CONSTRAINT_MASK (__GFP_HARDWALL|__GFP_THISNODE)
>> >
>> > /* Check for flags that must not be used with a slab allocator */
>> >-static inline gfp_t check_slab_flags(gfp_t flags)
>> >+static inline gfp_t check_slab_flags(gfp_t flags, slab_flags_t slab_flags)
>> > {
>> >- gfp_t bug_mask = __GFP_DMA32 | __GFP_HIGHMEM | ~__GFP_BITS_MASK;
>> >+ gfp_t bug_mask = __GFP_HIGHMEM | ~__GFP_BITS_MASK;
>> >+
>> >+ if (!IS_ENABLED(CONFIG_ZONE_DMA32) || !(slab_flags & SLAB_CACHE_DMA32))
>> >+ bug_mask |= __GFP_DMA32;
>>
>> The original version doesn't check CONFIG_ZONE_DMA32.
>>
>> Do we need to add this condition here?
>> Could we just decide the bug_mask based on slab_flags?
>
>We can. The reason I did it this way is that when we don't have
>CONFIG_ZONE_DMA32, the compiler should be able to simplify to:
>
>bug_mask = __GFP_HIGHMEM | ~__GFP_BITS_MASK;
>if (true || ..) => if (true)
> bug_mask |= __GFP_DMA32;
>
>Then just
>bug_mask = __GFP_HIGHMEM | ~__GFP_BITS_MASK | __GFP_DMA32;
>
>And since the function is inline, slab_flags would not even need to be
>accessed at all.
>

Hmm, I get one confusion.

This means if CONFIG_ZONE_DMA32 is not enabled, bug_mask will always
contains __GFP_DMA32. This will check with cachep->flags.

If cachep->flags has GFP_DMA32, this always fail?

Is this possible?

--
Wei Yang
Help you, Help me

2018-12-05 13:38:16

by Vlastimil Babka

[permalink] [raw]
Subject: Re: [PATCH v4 1/3] mm: slab/slub: Add check_slab_flags function to check for valid flags

On 12/5/18 6:48 AM, Nicolas Boichat wrote:
> Remove duplicated code between slab and slub, and will make it
> easier to make the test more complicated in the next commits.
>
> Fixes: ad67f5a6545f ("arm64: replace ZONE_DMA with ZONE_DMA32")

Well, not really. Patch 3 does that and yeah this will be a prerequisity
for a clean stable backport, but we don't tag all prerequisities like
this, I think?

> Signed-off-by: Nicolas Boichat <[email protected]>

Acked-by: Vlastimil Babka <[email protected]>

2018-12-05 13:56:19

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH v4 3/3] iommu/io-pgtable-arm-v7s: Request DMA32 memory, and improve debugging

On Wed, Dec 05, 2018 at 01:48:28PM +0800, Nicolas Boichat wrote:
> IOMMUs using ARMv7 short-descriptor format require page tables
> (level 1 and 2) to be allocated within the first 4GB of RAM, even
> on 64-bit systems.

> +#ifdef CONFIG_ZONE_DMA32
> +#define ARM_V7S_TABLE_GFP_DMA GFP_DMA32
> +#define ARM_V7S_TABLE_SLAB_CACHE SLAB_CACHE_DMA32
> +#else
> +#define ARM_V7S_TABLE_GFP_DMA GFP_DMA
> +#define ARM_V7S_TABLE_SLAB_CACHE SLAB_CACHE_DMA
> +#endif

How does using GFP_DMA make sense based on the above? If the system
has more than 32-bits worth of RAM it should be using GFP_DMA32, else
GFP_KERNEL, not GFP_DMA for an arch defined small addressability pool.

2018-12-05 14:04:43

by Vlastimil Babka

[permalink] [raw]
Subject: Re: [PATCH v4 2/3] mm: Add support for kmem caches in DMA32 zone

On 12/5/18 6:48 AM, Nicolas Boichat wrote:
> In some cases (e.g. IOMMU ARMv7s page allocator), we need to allocate
> data structures smaller than a page with GFP_DMA32 flag.
>
> This change makes it possible to create a custom cache in DMA32 zone
> using kmem_cache_create, then allocate memory using kmem_cache_alloc.
>
> We do not create a DMA32 kmalloc cache array, as there are currently
> no users of kmalloc(..., GFP_DMA32). The new test in check_slab_flags
> ensures that such calls still fail (as they do before this change).
>
> Fixes: ad67f5a6545f ("arm64: replace ZONE_DMA with ZONE_DMA32")

Same as my comment for 1/3.

> Signed-off-by: Nicolas Boichat <[email protected]>

In general,
Acked-by: Vlastimil Babka <[email protected]>

Some comments below:

> ---
>
> Changes since v2:
> - Clarified commit message
> - Add entry in sysfs-kernel-slab to document the new sysfs file
>
> (v3 used the page_frag approach)
>
> Documentation/ABI/testing/sysfs-kernel-slab | 9 +++++++++
> include/linux/slab.h | 2 ++
> mm/internal.h | 8 ++++++--
> mm/slab.c | 4 +++-
> mm/slab.h | 3 ++-
> mm/slab_common.c | 2 +-
> mm/slub.c | 18 +++++++++++++++++-
> 7 files changed, 40 insertions(+), 6 deletions(-)
>
> diff --git a/Documentation/ABI/testing/sysfs-kernel-slab b/Documentation/ABI/testing/sysfs-kernel-slab
> index 29601d93a1c2ea..d742c6cfdffbe9 100644
> --- a/Documentation/ABI/testing/sysfs-kernel-slab
> +++ b/Documentation/ABI/testing/sysfs-kernel-slab
> @@ -106,6 +106,15 @@ Description:
> are from ZONE_DMA.
> Available when CONFIG_ZONE_DMA is enabled.
>
> +What: /sys/kernel/slab/cache/cache_dma32
> +Date: December 2018
> +KernelVersion: 4.21
> +Contact: Nicolas Boichat <[email protected]>
> +Description:
> + The cache_dma32 file is read-only and specifies whether objects
> + are from ZONE_DMA32.
> + Available when CONFIG_ZONE_DMA32 is enabled.

I don't have a strong opinion. It's a new file, yeah, but consistent
with already existing ones. I'd leave the decision with SL*B maintainers.

> What: /sys/kernel/slab/cache/cpu_slabs
> Date: May 2007
> KernelVersion: 2.6.22
> diff --git a/include/linux/slab.h b/include/linux/slab.h
> index 11b45f7ae4057c..9449b19c5f107a 100644
> --- a/include/linux/slab.h
> +++ b/include/linux/slab.h
> @@ -32,6 +32,8 @@
> #define SLAB_HWCACHE_ALIGN ((slab_flags_t __force)0x00002000U)
> /* Use GFP_DMA memory */
> #define SLAB_CACHE_DMA ((slab_flags_t __force)0x00004000U)
> +/* Use GFP_DMA32 memory */
> +#define SLAB_CACHE_DMA32 ((slab_flags_t __force)0x00008000U)
> /* DEBUG: Store the last owner for bug hunting */
> #define SLAB_STORE_USER ((slab_flags_t __force)0x00010000U)
> /* Panic if kmem_cache_create() fails */
> diff --git a/mm/internal.h b/mm/internal.h
> index a2ee82a0cd44ae..fd244ad716eaf8 100644
> --- a/mm/internal.h
> +++ b/mm/internal.h
> @@ -14,6 +14,7 @@
> #include <linux/fs.h>
> #include <linux/mm.h>
> #include <linux/pagemap.h>
> +#include <linux/slab.h>
> #include <linux/tracepoint-defs.h>
>
> /*
> @@ -34,9 +35,12 @@
> #define GFP_CONSTRAINT_MASK (__GFP_HARDWALL|__GFP_THISNODE)
>
> /* Check for flags that must not be used with a slab allocator */
> -static inline gfp_t check_slab_flags(gfp_t flags)
> +static inline gfp_t check_slab_flags(gfp_t flags, slab_flags_t slab_flags)
> {
> - gfp_t bug_mask = __GFP_DMA32 | __GFP_HIGHMEM | ~__GFP_BITS_MASK;
> + gfp_t bug_mask = __GFP_HIGHMEM | ~__GFP_BITS_MASK;
> +
> + if (!IS_ENABLED(CONFIG_ZONE_DMA32) || !(slab_flags & SLAB_CACHE_DMA32))
> + bug_mask |= __GFP_DMA32;

I'll point out that this is not even strictly needed AFAICS, as only
flags passed to kmem_cache_alloc() are checked - the cache->allocflags
derived from SLAB_CACHE_DMA32 are appended only after check_slab_flags()
(in both SLAB and SLUB AFAICS). And for a cache created with
SLAB_CACHE_DMA32, the caller of kmem_cache_alloc() doesn't need to also
include __GFP_DMA32, the allocation will be from ZONE_DMA32 regardless.
So it would be fine even unchanged. The check would anyway need some
more love to catch the same with __GFP_DMA to be consistent and cover
all corner cases.

>
> if (unlikely(flags & bug_mask)) {
> gfp_t invalid_mask = flags & bug_mask;

2018-12-05 14:42:37

by Robin Murphy

[permalink] [raw]
Subject: Re: [PATCH v4 3/3] iommu/io-pgtable-arm-v7s: Request DMA32 memory, and improve debugging

On 05/12/2018 13:54, Christoph Hellwig wrote:
> On Wed, Dec 05, 2018 at 01:48:28PM +0800, Nicolas Boichat wrote:
>> IOMMUs using ARMv7 short-descriptor format require page tables
>> (level 1 and 2) to be allocated within the first 4GB of RAM, even
>> on 64-bit systems.
>
>> +#ifdef CONFIG_ZONE_DMA32
>> +#define ARM_V7S_TABLE_GFP_DMA GFP_DMA32
>> +#define ARM_V7S_TABLE_SLAB_CACHE SLAB_CACHE_DMA32
>> +#else
>> +#define ARM_V7S_TABLE_GFP_DMA GFP_DMA
>> +#define ARM_V7S_TABLE_SLAB_CACHE SLAB_CACHE_DMA
>> +#endif
>
> How does using GFP_DMA make sense based on the above? If the system
> has more than 32-bits worth of RAM it should be using GFP_DMA32, else
> GFP_KERNEL, not GFP_DMA for an arch defined small addressability pool.

32-bit Arm doesn't have ZONE_DMA32, but has (or at least had at the
time) a 2GB ZONE_DMA. Whether we actually need that or not depends on
how this all interacts with LPAE and highmem, but I'm not sure of those
details off-hand.

Robin.

2018-12-05 14:46:37

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH v4 3/3] iommu/io-pgtable-arm-v7s: Request DMA32 memory, and improve debugging

On Wed, Dec 05, 2018 at 02:40:06PM +0000, Robin Murphy wrote:
> 32-bit Arm doesn't have ZONE_DMA32, but has (or at least had at the time) a
> 2GB ZONE_DMA. Whether we actually need that or not depends on how this all
> interacts with LPAE and highmem, but I'm not sure of those details off-hand.

Well, arm32 can't address more than 32-bits in the linear kernel
mapping, so GFP_KERNEL should be perfectly fine there if the limit
really is 32-bits and not 31 or smaller because someone stole a bit
or two somewhere.

2018-12-05 14:48:26

by Will Deacon

[permalink] [raw]
Subject: Re: [PATCH v4 3/3] iommu/io-pgtable-arm-v7s: Request DMA32 memory, and improve debugging

On Wed, Dec 05, 2018 at 06:43:08AM -0800, Christoph Hellwig wrote:
> On Wed, Dec 05, 2018 at 02:40:06PM +0000, Robin Murphy wrote:
> > 32-bit Arm doesn't have ZONE_DMA32, but has (or at least had at the time) a
> > 2GB ZONE_DMA. Whether we actually need that or not depends on how this all
> > interacts with LPAE and highmem, but I'm not sure of those details off-hand.
>
> Well, arm32 can't address more than 32-bits in the linear kernel
> mapping, so GFP_KERNEL should be perfectly fine there if the limit
> really is 32-bits and not 31 or smaller because someone stole a bit
> or two somewhere.

I'm not sure that's necessarily true on the physical side. Wasn't there a
keystone SoC with /all/ the coherent memory above 4GB?

Will

2018-12-05 14:49:07

by Vlastimil Babka

[permalink] [raw]
Subject: Re: [PATCH v4 3/3] iommu/io-pgtable-arm-v7s: Request DMA32 memory, and improve debugging

On 12/5/18 6:48 AM, Nicolas Boichat wrote:
> IOMMUs using ARMv7 short-descriptor format require page tables
> (level 1 and 2) to be allocated within the first 4GB of RAM, even
> on 64-bit systems.
>
> For level 1/2 pages, ensure GFP_DMA32 is used if CONFIG_ZONE_DMA32
> is defined (e.g. on arm64 platforms).
>
> For level 2 pages, allocate a slab cache in SLAB_CACHE_DMA32.
>
> Also, print an error when the physical address does not fit in
> 32-bit, to make debugging easier in the future.
>
> Fixes: ad67f5a6545f ("arm64: replace ZONE_DMA with ZONE_DMA32")
> Signed-off-by: Nicolas Boichat <[email protected]>
> ---
>
> Changes since v2:
> - Commit message
>
> (v3 used the page_frag approach)
>
> drivers/iommu/io-pgtable-arm-v7s.c | 20 ++++++++++++++++----
> 1 file changed, 16 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/iommu/io-pgtable-arm-v7s.c b/drivers/iommu/io-pgtable-arm-v7s.c
> index 445c3bde04800c..996f7b6d00b44a 100644
> --- a/drivers/iommu/io-pgtable-arm-v7s.c
> +++ b/drivers/iommu/io-pgtable-arm-v7s.c
> @@ -161,6 +161,14 @@
>
> #define ARM_V7S_TCR_PD1 BIT(5)
>
> +#ifdef CONFIG_ZONE_DMA32
> +#define ARM_V7S_TABLE_GFP_DMA GFP_DMA32
> +#define ARM_V7S_TABLE_SLAB_CACHE SLAB_CACHE_DMA32
> +#else
> +#define ARM_V7S_TABLE_GFP_DMA GFP_DMA
> +#define ARM_V7S_TABLE_SLAB_CACHE SLAB_CACHE_DMA
> +#endif
> +
> typedef u32 arm_v7s_iopte;
>
> static bool selftest_running;
> @@ -198,13 +206,17 @@ static void *__arm_v7s_alloc_table(int lvl, gfp_t gfp,
> void *table = NULL;
>
> if (lvl == 1)
> - table = (void *)__get_dma_pages(__GFP_ZERO, get_order(size));
> + table = (void *)__get_free_pages(
> + __GFP_ZERO | ARM_V7S_TABLE_GFP_DMA, get_order(size));
> else if (lvl == 2)
> - table = kmem_cache_zalloc(data->l2_tables, gfp | GFP_DMA);
> + table = kmem_cache_zalloc(data->l2_tables,
> + gfp | ARM_V7S_TABLE_GFP_DMA);

So as I've explained in 2/3, you don't need ARM_V7S_TABLE_GFP_DMA here
(and then you don't need to adjust the slab warnings).

> phys = virt_to_phys(table);
> - if (phys != (arm_v7s_iopte)phys)
> + if (phys != (arm_v7s_iopte)phys) {
> /* Doesn't fit in PTE */
> + dev_err(dev, "Page table does not fit in PTE: %pa", &phys);
> goto out_free;
> + }
> if (table && !(cfg->quirks & IO_PGTABLE_QUIRK_NO_DMA)) {
> dma = dma_map_single(dev, table, size, DMA_TO_DEVICE);
> if (dma_mapping_error(dev, dma))
> @@ -737,7 +749,7 @@ static struct io_pgtable *arm_v7s_alloc_pgtable(struct io_pgtable_cfg *cfg,
> data->l2_tables = kmem_cache_create("io-pgtable_armv7s_l2",
> ARM_V7S_TABLE_SIZE(2),
> ARM_V7S_TABLE_SIZE(2),
> - SLAB_CACHE_DMA, NULL);
> + ARM_V7S_TABLE_SLAB_CACHE, NULL);
> if (!data->l2_tables)
> goto out_free_data;
>
>


2018-12-06 00:44:02

by Nicolas Boichat

[permalink] [raw]
Subject: Re: [PATCH v4 2/3] mm: Add support for kmem caches in DMA32 zone

On Wed, Dec 5, 2018 at 8:18 PM Wei Yang <[email protected]> wrote:
>
> On Wed, Dec 05, 2018 at 03:39:51PM +0800, Nicolas Boichat wrote:
> >On Wed, Dec 5, 2018 at 3:25 PM Wei Yang <[email protected]> wrote:
> >>
> >> On Wed, Dec 05, 2018 at 01:48:27PM +0800, Nicolas Boichat wrote:
> >> >In some cases (e.g. IOMMU ARMv7s page allocator), we need to allocate
> >> >data structures smaller than a page with GFP_DMA32 flag.
> >> >
> >> >This change makes it possible to create a custom cache in DMA32 zone
> >> >using kmem_cache_create, then allocate memory using kmem_cache_alloc.
> >> >
> >> >We do not create a DMA32 kmalloc cache array, as there are currently
> >> >no users of kmalloc(..., GFP_DMA32). The new test in check_slab_flags
> >> >ensures that such calls still fail (as they do before this change).
> >> >
> >> >Fixes: ad67f5a6545f ("arm64: replace ZONE_DMA with ZONE_DMA32")
> >> >Signed-off-by: Nicolas Boichat <[email protected]>
> >> >---
> >> >
> >> >Changes since v2:
> >> > - Clarified commit message
> >> > - Add entry in sysfs-kernel-slab to document the new sysfs file
> >> >
> >> >(v3 used the page_frag approach)
> >> >
> >> >Documentation/ABI/testing/sysfs-kernel-slab | 9 +++++++++
> >> > include/linux/slab.h | 2 ++
> >> > mm/internal.h | 8 ++++++--
> >> > mm/slab.c | 4 +++-
> >> > mm/slab.h | 3 ++-
> >> > mm/slab_common.c | 2 +-
> >> > mm/slub.c | 18 +++++++++++++++++-
> >> > 7 files changed, 40 insertions(+), 6 deletions(-)
> >> >
> >> >diff --git a/Documentation/ABI/testing/sysfs-kernel-slab b/Documentation/ABI/testing/sysfs-kernel-slab
> >> >index 29601d93a1c2ea..d742c6cfdffbe9 100644
> >> >--- a/Documentation/ABI/testing/sysfs-kernel-slab
> >> >+++ b/Documentation/ABI/testing/sysfs-kernel-slab
> >> >@@ -106,6 +106,15 @@ Description:
> >> > are from ZONE_DMA.
> >> > Available when CONFIG_ZONE_DMA is enabled.
> >> >
> >> >+What: /sys/kernel/slab/cache/cache_dma32
> >> >+Date: December 2018
> >> >+KernelVersion: 4.21
> >> >+Contact: Nicolas Boichat <[email protected]>
> >> >+Description:
> >> >+ The cache_dma32 file is read-only and specifies whether objects
> >> >+ are from ZONE_DMA32.
> >> >+ Available when CONFIG_ZONE_DMA32 is enabled.
> >> >+
> >> > What: /sys/kernel/slab/cache/cpu_slabs
> >> > Date: May 2007
> >> > KernelVersion: 2.6.22
> >> >diff --git a/include/linux/slab.h b/include/linux/slab.h
> >> >index 11b45f7ae4057c..9449b19c5f107a 100644
> >> >--- a/include/linux/slab.h
> >> >+++ b/include/linux/slab.h
> >> >@@ -32,6 +32,8 @@
> >> > #define SLAB_HWCACHE_ALIGN ((slab_flags_t __force)0x00002000U)
> >> > /* Use GFP_DMA memory */
> >> > #define SLAB_CACHE_DMA ((slab_flags_t __force)0x00004000U)
> >> >+/* Use GFP_DMA32 memory */
> >> >+#define SLAB_CACHE_DMA32 ((slab_flags_t __force)0x00008000U)
> >> > /* DEBUG: Store the last owner for bug hunting */
> >> > #define SLAB_STORE_USER ((slab_flags_t __force)0x00010000U)
> >> > /* Panic if kmem_cache_create() fails */
> >> >diff --git a/mm/internal.h b/mm/internal.h
> >> >index a2ee82a0cd44ae..fd244ad716eaf8 100644
> >> >--- a/mm/internal.h
> >> >+++ b/mm/internal.h
> >> >@@ -14,6 +14,7 @@
> >> > #include <linux/fs.h>
> >> > #include <linux/mm.h>
> >> > #include <linux/pagemap.h>
> >> >+#include <linux/slab.h>
> >> > #include <linux/tracepoint-defs.h>
> >> >
> >> > /*
> >> >@@ -34,9 +35,12 @@
> >> > #define GFP_CONSTRAINT_MASK (__GFP_HARDWALL|__GFP_THISNODE)
> >> >
> >> > /* Check for flags that must not be used with a slab allocator */
> >> >-static inline gfp_t check_slab_flags(gfp_t flags)
> >> >+static inline gfp_t check_slab_flags(gfp_t flags, slab_flags_t slab_flags)
> >> > {
> >> >- gfp_t bug_mask = __GFP_DMA32 | __GFP_HIGHMEM | ~__GFP_BITS_MASK;
> >> >+ gfp_t bug_mask = __GFP_HIGHMEM | ~__GFP_BITS_MASK;
> >> >+
> >> >+ if (!IS_ENABLED(CONFIG_ZONE_DMA32) || !(slab_flags & SLAB_CACHE_DMA32))
> >> >+ bug_mask |= __GFP_DMA32;
> >>
> >> The original version doesn't check CONFIG_ZONE_DMA32.
> >>
> >> Do we need to add this condition here?
> >> Could we just decide the bug_mask based on slab_flags?
> >
> >We can. The reason I did it this way is that when we don't have
> >CONFIG_ZONE_DMA32, the compiler should be able to simplify to:
> >
> >bug_mask = __GFP_HIGHMEM | ~__GFP_BITS_MASK;
> >if (true || ..) => if (true)
> > bug_mask |= __GFP_DMA32;
> >
> >Then just
> >bug_mask = __GFP_HIGHMEM | ~__GFP_BITS_MASK | __GFP_DMA32;
> >
> >And since the function is inline, slab_flags would not even need to be
> >accessed at all.
> >
>
> Hmm, I get one confusion.
>
> This means if CONFIG_ZONE_DMA32 is not enabled, bug_mask will always
> contains __GFP_DMA32. This will check with cachep->flags.
>
> If cachep->flags has GFP_DMA32, this always fail?
>
> Is this possible?

Not fully sure to understand the question, but the code is:
if (!IS_ENABLED(CONFIG_ZONE_DMA32) || !(slab_flags & SLAB_CACHE_DMA32))
bug_mask |= __GFP_DMA32;

IS_ENABLED(CONFIG_ZONE_DMA32) == true:
- (slab_flags & SLAB_CACHE_DMA32) => bug_mask untouched, __GFP_DMA32
is allowed.
- !(slab_flags & SLAB_CACHE_DMA32) => bug_mask |= __GFP_DMA32;,
__GFP_DMA32 triggers warning
IS_ENABLED(CONFIG_ZONE_DMA32) == false:
=> bug_mask |= __GFP_DMA32;, __GFP_DMA32 triggers warning (as
expected, GFP_DMA32 does not make sense if there is no DMA32 zone).

Does that clarify?

>
> --
> Wei Yang
> Help you, Help me

2018-12-06 03:33:53

by Wei Yang

[permalink] [raw]
Subject: Re: [PATCH v4 2/3] mm: Add support for kmem caches in DMA32 zone

On Thu, Dec 06, 2018 at 08:41:36AM +0800, Nicolas Boichat wrote:
>On Wed, Dec 5, 2018 at 8:18 PM Wei Yang <[email protected]> wrote:
>>
>> On Wed, Dec 05, 2018 at 03:39:51PM +0800, Nicolas Boichat wrote:
>> >On Wed, Dec 5, 2018 at 3:25 PM Wei Yang <[email protected]> wrote:
>> >>
>> >> On Wed, Dec 05, 2018 at 01:48:27PM +0800, Nicolas Boichat wrote:
>> >> >In some cases (e.g. IOMMU ARMv7s page allocator), we need to allocate
>> >> >data structures smaller than a page with GFP_DMA32 flag.
>> >> >
>> >> >This change makes it possible to create a custom cache in DMA32 zone
>> >> >using kmem_cache_create, then allocate memory using kmem_cache_alloc.
>> >> >
>> >> >We do not create a DMA32 kmalloc cache array, as there are currently
>> >> >no users of kmalloc(..., GFP_DMA32). The new test in check_slab_flags
>> >> >ensures that such calls still fail (as they do before this change).
>> >> >
>> >> >Fixes: ad67f5a6545f ("arm64: replace ZONE_DMA with ZONE_DMA32")
>> >> >Signed-off-by: Nicolas Boichat <[email protected]>
>> >> >---
>> >> >
>> >> >Changes since v2:
>> >> > - Clarified commit message
>> >> > - Add entry in sysfs-kernel-slab to document the new sysfs file
>> >> >
>> >> >(v3 used the page_frag approach)
>> >> >
>> >> >Documentation/ABI/testing/sysfs-kernel-slab | 9 +++++++++
>> >> > include/linux/slab.h | 2 ++
>> >> > mm/internal.h | 8 ++++++--
>> >> > mm/slab.c | 4 +++-
>> >> > mm/slab.h | 3 ++-
>> >> > mm/slab_common.c | 2 +-
>> >> > mm/slub.c | 18 +++++++++++++++++-
>> >> > 7 files changed, 40 insertions(+), 6 deletions(-)
>> >> >
>> >> >diff --git a/Documentation/ABI/testing/sysfs-kernel-slab b/Documentation/ABI/testing/sysfs-kernel-slab
>> >> >index 29601d93a1c2ea..d742c6cfdffbe9 100644
>> >> >--- a/Documentation/ABI/testing/sysfs-kernel-slab
>> >> >+++ b/Documentation/ABI/testing/sysfs-kernel-slab
>> >> >@@ -106,6 +106,15 @@ Description:
>> >> > are from ZONE_DMA.
>> >> > Available when CONFIG_ZONE_DMA is enabled.
>> >> >
>> >> >+What: /sys/kernel/slab/cache/cache_dma32
>> >> >+Date: December 2018
>> >> >+KernelVersion: 4.21
>> >> >+Contact: Nicolas Boichat <[email protected]>
>> >> >+Description:
>> >> >+ The cache_dma32 file is read-only and specifies whether objects
>> >> >+ are from ZONE_DMA32.
>> >> >+ Available when CONFIG_ZONE_DMA32 is enabled.
>> >> >+
>> >> > What: /sys/kernel/slab/cache/cpu_slabs
>> >> > Date: May 2007
>> >> > KernelVersion: 2.6.22
>> >> >diff --git a/include/linux/slab.h b/include/linux/slab.h
>> >> >index 11b45f7ae4057c..9449b19c5f107a 100644
>> >> >--- a/include/linux/slab.h
>> >> >+++ b/include/linux/slab.h
>> >> >@@ -32,6 +32,8 @@
>> >> > #define SLAB_HWCACHE_ALIGN ((slab_flags_t __force)0x00002000U)
>> >> > /* Use GFP_DMA memory */
>> >> > #define SLAB_CACHE_DMA ((slab_flags_t __force)0x00004000U)
>> >> >+/* Use GFP_DMA32 memory */
>> >> >+#define SLAB_CACHE_DMA32 ((slab_flags_t __force)0x00008000U)
>> >> > /* DEBUG: Store the last owner for bug hunting */
>> >> > #define SLAB_STORE_USER ((slab_flags_t __force)0x00010000U)
>> >> > /* Panic if kmem_cache_create() fails */
>> >> >diff --git a/mm/internal.h b/mm/internal.h
>> >> >index a2ee82a0cd44ae..fd244ad716eaf8 100644
>> >> >--- a/mm/internal.h
>> >> >+++ b/mm/internal.h
>> >> >@@ -14,6 +14,7 @@
>> >> > #include <linux/fs.h>
>> >> > #include <linux/mm.h>
>> >> > #include <linux/pagemap.h>
>> >> >+#include <linux/slab.h>
>> >> > #include <linux/tracepoint-defs.h>
>> >> >
>> >> > /*
>> >> >@@ -34,9 +35,12 @@
>> >> > #define GFP_CONSTRAINT_MASK (__GFP_HARDWALL|__GFP_THISNODE)
>> >> >
>> >> > /* Check for flags that must not be used with a slab allocator */
>> >> >-static inline gfp_t check_slab_flags(gfp_t flags)
>> >> >+static inline gfp_t check_slab_flags(gfp_t flags, slab_flags_t slab_flags)
>> >> > {
>> >> >- gfp_t bug_mask = __GFP_DMA32 | __GFP_HIGHMEM | ~__GFP_BITS_MASK;
>> >> >+ gfp_t bug_mask = __GFP_HIGHMEM | ~__GFP_BITS_MASK;
>> >> >+
>> >> >+ if (!IS_ENABLED(CONFIG_ZONE_DMA32) || !(slab_flags & SLAB_CACHE_DMA32))
>> >> >+ bug_mask |= __GFP_DMA32;
>> >>
>> >> The original version doesn't check CONFIG_ZONE_DMA32.
>> >>
>> >> Do we need to add this condition here?
>> >> Could we just decide the bug_mask based on slab_flags?
>> >
>> >We can. The reason I did it this way is that when we don't have
>> >CONFIG_ZONE_DMA32, the compiler should be able to simplify to:
>> >
>> >bug_mask = __GFP_HIGHMEM | ~__GFP_BITS_MASK;
>> >if (true || ..) => if (true)
>> > bug_mask |= __GFP_DMA32;
>> >
>> >Then just
>> >bug_mask = __GFP_HIGHMEM | ~__GFP_BITS_MASK | __GFP_DMA32;
>> >
>> >And since the function is inline, slab_flags would not even need to be
>> >accessed at all.
>> >
>>
>> Hmm, I get one confusion.
>>
>> This means if CONFIG_ZONE_DMA32 is not enabled, bug_mask will always
>> contains __GFP_DMA32. This will check with cachep->flags.
>>
>> If cachep->flags has GFP_DMA32, this always fail?
>>
>> Is this possible?
>
>Not fully sure to understand the question, but the code is:
>if (!IS_ENABLED(CONFIG_ZONE_DMA32) || !(slab_flags & SLAB_CACHE_DMA32))
> bug_mask |= __GFP_DMA32;
>
>IS_ENABLED(CONFIG_ZONE_DMA32) == true:
> - (slab_flags & SLAB_CACHE_DMA32) => bug_mask untouched, __GFP_DMA32
>is allowed.
> - !(slab_flags & SLAB_CACHE_DMA32) => bug_mask |= __GFP_DMA32;,
>__GFP_DMA32 triggers warning
>IS_ENABLED(CONFIG_ZONE_DMA32) == false:
> => bug_mask |= __GFP_DMA32;, __GFP_DMA32 triggers warning (as
>expected, GFP_DMA32 does not make sense if there is no DMA32 zone).

This is the case I am thinking.

The warning is reasonable since there is no DMA32. While the
kmem_cache_create() user is not easy to change their code.

For example, one writes code and wants to have a kmem_cache with DMA32
capability, so he writes kmem_cache_create(__GFP_DMA32). The code is
there and not easy to change. But one distro builder decides to disable
DMA32. This will leads to all the kmem_cache_create() through warning?

This behavior is what we expect?

>
>Does that clarify?
>
>>
>> --
>> Wei Yang
>> Help you, Help me

--
Wei Yang
Help you, Help me

2018-12-06 03:51:28

by Nicolas Boichat

[permalink] [raw]
Subject: Re: [PATCH v4 2/3] mm: Add support for kmem caches in DMA32 zone

On Wed, Dec 5, 2018 at 10:02 PM Vlastimil Babka <[email protected]> wrote:
>
> On 12/5/18 6:48 AM, Nicolas Boichat wrote:
> > In some cases (e.g. IOMMU ARMv7s page allocator), we need to allocate
> > data structures smaller than a page with GFP_DMA32 flag.
> >
> > This change makes it possible to create a custom cache in DMA32 zone
> > using kmem_cache_create, then allocate memory using kmem_cache_alloc.
> >
> > We do not create a DMA32 kmalloc cache array, as there are currently
> > no users of kmalloc(..., GFP_DMA32). The new test in check_slab_flags
> > ensures that such calls still fail (as they do before this change).
> >
> > Fixes: ad67f5a6545f ("arm64: replace ZONE_DMA with ZONE_DMA32")
>
> Same as my comment for 1/3.

I'll drop.

> > Signed-off-by: Nicolas Boichat <[email protected]>
>
> In general,
> Acked-by: Vlastimil Babka <[email protected]>
>
> Some comments below:
>
> > ---
> >
> > Changes since v2:
> > - Clarified commit message
> > - Add entry in sysfs-kernel-slab to document the new sysfs file
> >
> > (v3 used the page_frag approach)
> >
> > Documentation/ABI/testing/sysfs-kernel-slab | 9 +++++++++
> > include/linux/slab.h | 2 ++
> > mm/internal.h | 8 ++++++--
> > mm/slab.c | 4 +++-
> > mm/slab.h | 3 ++-
> > mm/slab_common.c | 2 +-
> > mm/slub.c | 18 +++++++++++++++++-
> > 7 files changed, 40 insertions(+), 6 deletions(-)
> >
> > diff --git a/Documentation/ABI/testing/sysfs-kernel-slab b/Documentation/ABI/testing/sysfs-kernel-slab
> > index 29601d93a1c2ea..d742c6cfdffbe9 100644
> > --- a/Documentation/ABI/testing/sysfs-kernel-slab
> > +++ b/Documentation/ABI/testing/sysfs-kernel-slab
> > @@ -106,6 +106,15 @@ Description:
> > are from ZONE_DMA.
> > Available when CONFIG_ZONE_DMA is enabled.
> >
> > +What: /sys/kernel/slab/cache/cache_dma32
> > +Date: December 2018
> > +KernelVersion: 4.21
> > +Contact: Nicolas Boichat <[email protected]>
> > +Description:
> > + The cache_dma32 file is read-only and specifies whether objects
> > + are from ZONE_DMA32.
> > + Available when CONFIG_ZONE_DMA32 is enabled.
>
> I don't have a strong opinion. It's a new file, yeah, but consistent
> with already existing ones. I'd leave the decision with SL*B maintainers.
>
> > What: /sys/kernel/slab/cache/cpu_slabs
> > Date: May 2007
> > KernelVersion: 2.6.22
> > diff --git a/include/linux/slab.h b/include/linux/slab.h
> > index 11b45f7ae4057c..9449b19c5f107a 100644
> > --- a/include/linux/slab.h
> > +++ b/include/linux/slab.h
> > @@ -32,6 +32,8 @@
> > #define SLAB_HWCACHE_ALIGN ((slab_flags_t __force)0x00002000U)
> > /* Use GFP_DMA memory */
> > #define SLAB_CACHE_DMA ((slab_flags_t __force)0x00004000U)
> > +/* Use GFP_DMA32 memory */
> > +#define SLAB_CACHE_DMA32 ((slab_flags_t __force)0x00008000U)
> > /* DEBUG: Store the last owner for bug hunting */
> > #define SLAB_STORE_USER ((slab_flags_t __force)0x00010000U)
> > /* Panic if kmem_cache_create() fails */
> > diff --git a/mm/internal.h b/mm/internal.h
> > index a2ee82a0cd44ae..fd244ad716eaf8 100644
> > --- a/mm/internal.h
> > +++ b/mm/internal.h
> > @@ -14,6 +14,7 @@
> > #include <linux/fs.h>
> > #include <linux/mm.h>
> > #include <linux/pagemap.h>
> > +#include <linux/slab.h>
> > #include <linux/tracepoint-defs.h>
> >
> > /*
> > @@ -34,9 +35,12 @@
> > #define GFP_CONSTRAINT_MASK (__GFP_HARDWALL|__GFP_THISNODE)
> >
> > /* Check for flags that must not be used with a slab allocator */
> > -static inline gfp_t check_slab_flags(gfp_t flags)
> > +static inline gfp_t check_slab_flags(gfp_t flags, slab_flags_t slab_flags)
> > {
> > - gfp_t bug_mask = __GFP_DMA32 | __GFP_HIGHMEM | ~__GFP_BITS_MASK;
> > + gfp_t bug_mask = __GFP_HIGHMEM | ~__GFP_BITS_MASK;
> > +
> > + if (!IS_ENABLED(CONFIG_ZONE_DMA32) || !(slab_flags & SLAB_CACHE_DMA32))
> > + bug_mask |= __GFP_DMA32;
>
> I'll point out that this is not even strictly needed AFAICS, as only
> flags passed to kmem_cache_alloc() are checked - the cache->allocflags
> derived from SLAB_CACHE_DMA32 are appended only after check_slab_flags()
> (in both SLAB and SLUB AFAICS). And for a cache created with
> SLAB_CACHE_DMA32, the caller of kmem_cache_alloc() doesn't need to also
> include __GFP_DMA32, the allocation will be from ZONE_DMA32 regardless.

Yes, you're right. I also looked at existing users of SLAB_CACHE_DMA,
and there is one case in drivers/scsi/scsi_lib.c where GFP_DMA is not
be passed (all the other users pass it).

I can drop GFP_DMA32 from my call in io-pgtable-arm-v7s.c.

> So it would be fine even unchanged. The check would anyway need some
> more love to catch the same with __GFP_DMA to be consistent and cover
> all corner cases.

Yes, the test is not complete. If we really wanted this to be
accurate, we'd need to check that GFP_* exactly matches SLAB_CACHE_*.

The only problem with dropping this is test that we should restore
GFP_DMA32 warning/errors somewhere else (as Christopher pointed out
here: https://lkml.org/lkml/2018/11/22/430), especially for kmalloc
case.

Maybe this can be done in kmalloc_slab.

> >
> > if (unlikely(flags & bug_mask)) {
> > gfp_t invalid_mask = flags & bug_mask;

2018-12-06 03:56:21

by Nicolas Boichat

[permalink] [raw]
Subject: Re: [PATCH v4 2/3] mm: Add support for kmem caches in DMA32 zone

On Thu, Dec 6, 2018 at 11:32 AM Wei Yang <[email protected]> wrote:
>
> On Thu, Dec 06, 2018 at 08:41:36AM +0800, Nicolas Boichat wrote:
> >On Wed, Dec 5, 2018 at 8:18 PM Wei Yang <[email protected]> wrote:
> >>
> >> On Wed, Dec 05, 2018 at 03:39:51PM +0800, Nicolas Boichat wrote:
> >> >On Wed, Dec 5, 2018 at 3:25 PM Wei Yang <[email protected]> wrote:
> >> >>
> >> >> On Wed, Dec 05, 2018 at 01:48:27PM +0800, Nicolas Boichat wrote:
> >> >> >In some cases (e.g. IOMMU ARMv7s page allocator), we need to allocate
> >> >> >data structures smaller than a page with GFP_DMA32 flag.
> >> >> >
> >> >> >This change makes it possible to create a custom cache in DMA32 zone
> >> >> >using kmem_cache_create, then allocate memory using kmem_cache_alloc.
> >> >> >
> >> >> >We do not create a DMA32 kmalloc cache array, as there are currently
> >> >> >no users of kmalloc(..., GFP_DMA32). The new test in check_slab_flags
> >> >> >ensures that such calls still fail (as they do before this change).
> >> >> >
> >> >> >Fixes: ad67f5a6545f ("arm64: replace ZONE_DMA with ZONE_DMA32")
> >> >> >Signed-off-by: Nicolas Boichat <[email protected]>
> >> >> >---
> >> >> >
> >> >> >Changes since v2:
> >> >> > - Clarified commit message
> >> >> > - Add entry in sysfs-kernel-slab to document the new sysfs file
> >> >> >
> >> >> >(v3 used the page_frag approach)
> >> >> >
> >> >> >Documentation/ABI/testing/sysfs-kernel-slab | 9 +++++++++
> >> >> > include/linux/slab.h | 2 ++
> >> >> > mm/internal.h | 8 ++++++--
> >> >> > mm/slab.c | 4 +++-
> >> >> > mm/slab.h | 3 ++-
> >> >> > mm/slab_common.c | 2 +-
> >> >> > mm/slub.c | 18 +++++++++++++++++-
> >> >> > 7 files changed, 40 insertions(+), 6 deletions(-)
> >> >> >
> >> >> >diff --git a/Documentation/ABI/testing/sysfs-kernel-slab b/Documentation/ABI/testing/sysfs-kernel-slab
> >> >> >index 29601d93a1c2ea..d742c6cfdffbe9 100644
> >> >> >--- a/Documentation/ABI/testing/sysfs-kernel-slab
> >> >> >+++ b/Documentation/ABI/testing/sysfs-kernel-slab
> >> >> >@@ -106,6 +106,15 @@ Description:
> >> >> > are from ZONE_DMA.
> >> >> > Available when CONFIG_ZONE_DMA is enabled.
> >> >> >
> >> >> >+What: /sys/kernel/slab/cache/cache_dma32
> >> >> >+Date: December 2018
> >> >> >+KernelVersion: 4.21
> >> >> >+Contact: Nicolas Boichat <[email protected]>
> >> >> >+Description:
> >> >> >+ The cache_dma32 file is read-only and specifies whether objects
> >> >> >+ are from ZONE_DMA32.
> >> >> >+ Available when CONFIG_ZONE_DMA32 is enabled.
> >> >> >+
> >> >> > What: /sys/kernel/slab/cache/cpu_slabs
> >> >> > Date: May 2007
> >> >> > KernelVersion: 2.6.22
> >> >> >diff --git a/include/linux/slab.h b/include/linux/slab.h
> >> >> >index 11b45f7ae4057c..9449b19c5f107a 100644
> >> >> >--- a/include/linux/slab.h
> >> >> >+++ b/include/linux/slab.h
> >> >> >@@ -32,6 +32,8 @@
> >> >> > #define SLAB_HWCACHE_ALIGN ((slab_flags_t __force)0x00002000U)
> >> >> > /* Use GFP_DMA memory */
> >> >> > #define SLAB_CACHE_DMA ((slab_flags_t __force)0x00004000U)
> >> >> >+/* Use GFP_DMA32 memory */
> >> >> >+#define SLAB_CACHE_DMA32 ((slab_flags_t __force)0x00008000U)
> >> >> > /* DEBUG: Store the last owner for bug hunting */
> >> >> > #define SLAB_STORE_USER ((slab_flags_t __force)0x00010000U)
> >> >> > /* Panic if kmem_cache_create() fails */
> >> >> >diff --git a/mm/internal.h b/mm/internal.h
> >> >> >index a2ee82a0cd44ae..fd244ad716eaf8 100644
> >> >> >--- a/mm/internal.h
> >> >> >+++ b/mm/internal.h
> >> >> >@@ -14,6 +14,7 @@
> >> >> > #include <linux/fs.h>
> >> >> > #include <linux/mm.h>
> >> >> > #include <linux/pagemap.h>
> >> >> >+#include <linux/slab.h>
> >> >> > #include <linux/tracepoint-defs.h>
> >> >> >
> >> >> > /*
> >> >> >@@ -34,9 +35,12 @@
> >> >> > #define GFP_CONSTRAINT_MASK (__GFP_HARDWALL|__GFP_THISNODE)
> >> >> >
> >> >> > /* Check for flags that must not be used with a slab allocator */
> >> >> >-static inline gfp_t check_slab_flags(gfp_t flags)
> >> >> >+static inline gfp_t check_slab_flags(gfp_t flags, slab_flags_t slab_flags)
> >> >> > {
> >> >> >- gfp_t bug_mask = __GFP_DMA32 | __GFP_HIGHMEM | ~__GFP_BITS_MASK;
> >> >> >+ gfp_t bug_mask = __GFP_HIGHMEM | ~__GFP_BITS_MASK;
> >> >> >+
> >> >> >+ if (!IS_ENABLED(CONFIG_ZONE_DMA32) || !(slab_flags & SLAB_CACHE_DMA32))
> >> >> >+ bug_mask |= __GFP_DMA32;
> >> >>
> >> >> The original version doesn't check CONFIG_ZONE_DMA32.
> >> >>
> >> >> Do we need to add this condition here?
> >> >> Could we just decide the bug_mask based on slab_flags?
> >> >
> >> >We can. The reason I did it this way is that when we don't have
> >> >CONFIG_ZONE_DMA32, the compiler should be able to simplify to:
> >> >
> >> >bug_mask = __GFP_HIGHMEM | ~__GFP_BITS_MASK;
> >> >if (true || ..) => if (true)
> >> > bug_mask |= __GFP_DMA32;
> >> >
> >> >Then just
> >> >bug_mask = __GFP_HIGHMEM | ~__GFP_BITS_MASK | __GFP_DMA32;
> >> >
> >> >And since the function is inline, slab_flags would not even need to be
> >> >accessed at all.
> >> >
> >>
> >> Hmm, I get one confusion.
> >>
> >> This means if CONFIG_ZONE_DMA32 is not enabled, bug_mask will always
> >> contains __GFP_DMA32. This will check with cachep->flags.
> >>
> >> If cachep->flags has GFP_DMA32, this always fail?
> >>
> >> Is this possible?
> >
> >Not fully sure to understand the question, but the code is:
> >if (!IS_ENABLED(CONFIG_ZONE_DMA32) || !(slab_flags & SLAB_CACHE_DMA32))
> > bug_mask |= __GFP_DMA32;
> >
> >IS_ENABLED(CONFIG_ZONE_DMA32) == true:
> > - (slab_flags & SLAB_CACHE_DMA32) => bug_mask untouched, __GFP_DMA32
> >is allowed.
> > - !(slab_flags & SLAB_CACHE_DMA32) => bug_mask |= __GFP_DMA32;,
> >__GFP_DMA32 triggers warning
> >IS_ENABLED(CONFIG_ZONE_DMA32) == false:
> > => bug_mask |= __GFP_DMA32;, __GFP_DMA32 triggers warning (as
> >expected, GFP_DMA32 does not make sense if there is no DMA32 zone).
>
> This is the case I am thinking.
>
> The warning is reasonable since there is no DMA32. While the
> kmem_cache_create() user is not easy to change their code.
>
> For example, one writes code and wants to have a kmem_cache with DMA32
> capability, so he writes kmem_cache_create(__GFP_DMA32). The code is
> there and not easy to change. But one distro builder decides to disable
> DMA32. This will leads to all the kmem_cache_create() through warning?

I don't think CONFIG_ZONE_DMA32 can be enabled/disabled by
distro/user? IIUC this is a property of the architecture, some have it
enabled, some don't.

> This behavior is what we expect?
>
> >
> >Does that clarify?
> >
> >>
> >> --
> >> Wei Yang
> >> Help you, Help me
>
> --
> Wei Yang
> Help you, Help me

2018-12-06 06:30:35

by Wei Yang

[permalink] [raw]
Subject: Re: [PATCH v4 2/3] mm: Add support for kmem caches in DMA32 zone

On Thu, Dec 06, 2018 at 11:55:02AM +0800, Nicolas Boichat wrote:
>On Thu, Dec 6, 2018 at 11:32 AM Wei Yang <[email protected]> wrote:
>>
>> On Thu, Dec 06, 2018 at 08:41:36AM +0800, Nicolas Boichat wrote:
>> >On Wed, Dec 5, 2018 at 8:18 PM Wei Yang <[email protected]> wrote:
>> >>
>> >> On Wed, Dec 05, 2018 at 03:39:51PM +0800, Nicolas Boichat wrote:
>> >> >On Wed, Dec 5, 2018 at 3:25 PM Wei Yang <[email protected]> wrote:
>> >> >>
>> >> >> On Wed, Dec 05, 2018 at 01:48:27PM +0800, Nicolas Boichat wrote:
>> >> >> >In some cases (e.g. IOMMU ARMv7s page allocator), we need to allocate
>> >> >> >data structures smaller than a page with GFP_DMA32 flag.
>> >> >> >
>> >> >> >This change makes it possible to create a custom cache in DMA32 zone
>> >> >> >using kmem_cache_create, then allocate memory using kmem_cache_alloc.
>> >> >> >
>> >> >> >We do not create a DMA32 kmalloc cache array, as there are currently
>> >> >> >no users of kmalloc(..., GFP_DMA32). The new test in check_slab_flags
>> >> >> >ensures that such calls still fail (as they do before this change).
>> >> >> >
>> >> >> >Fixes: ad67f5a6545f ("arm64: replace ZONE_DMA with ZONE_DMA32")
>> >> >> >Signed-off-by: Nicolas Boichat <[email protected]>
>> >> >> >---
>> >> >> >
>> >> >> >Changes since v2:
>> >> >> > - Clarified commit message
>> >> >> > - Add entry in sysfs-kernel-slab to document the new sysfs file
>> >> >> >
>> >> >> >(v3 used the page_frag approach)
>> >> >> >
>> >> >> >Documentation/ABI/testing/sysfs-kernel-slab | 9 +++++++++
>> >> >> > include/linux/slab.h | 2 ++
>> >> >> > mm/internal.h | 8 ++++++--
>> >> >> > mm/slab.c | 4 +++-
>> >> >> > mm/slab.h | 3 ++-
>> >> >> > mm/slab_common.c | 2 +-
>> >> >> > mm/slub.c | 18 +++++++++++++++++-
>> >> >> > 7 files changed, 40 insertions(+), 6 deletions(-)
>> >> >> >
>> >> >> >diff --git a/Documentation/ABI/testing/sysfs-kernel-slab b/Documentation/ABI/testing/sysfs-kernel-slab
>> >> >> >index 29601d93a1c2ea..d742c6cfdffbe9 100644
>> >> >> >--- a/Documentation/ABI/testing/sysfs-kernel-slab
>> >> >> >+++ b/Documentation/ABI/testing/sysfs-kernel-slab
>> >> >> >@@ -106,6 +106,15 @@ Description:
>> >> >> > are from ZONE_DMA.
>> >> >> > Available when CONFIG_ZONE_DMA is enabled.
>> >> >> >
>> >> >> >+What: /sys/kernel/slab/cache/cache_dma32
>> >> >> >+Date: December 2018
>> >> >> >+KernelVersion: 4.21
>> >> >> >+Contact: Nicolas Boichat <[email protected]>
>> >> >> >+Description:
>> >> >> >+ The cache_dma32 file is read-only and specifies whether objects
>> >> >> >+ are from ZONE_DMA32.
>> >> >> >+ Available when CONFIG_ZONE_DMA32 is enabled.
>> >> >> >+
>> >> >> > What: /sys/kernel/slab/cache/cpu_slabs
>> >> >> > Date: May 2007
>> >> >> > KernelVersion: 2.6.22
>> >> >> >diff --git a/include/linux/slab.h b/include/linux/slab.h
>> >> >> >index 11b45f7ae4057c..9449b19c5f107a 100644
>> >> >> >--- a/include/linux/slab.h
>> >> >> >+++ b/include/linux/slab.h
>> >> >> >@@ -32,6 +32,8 @@
>> >> >> > #define SLAB_HWCACHE_ALIGN ((slab_flags_t __force)0x00002000U)
>> >> >> > /* Use GFP_DMA memory */
>> >> >> > #define SLAB_CACHE_DMA ((slab_flags_t __force)0x00004000U)
>> >> >> >+/* Use GFP_DMA32 memory */
>> >> >> >+#define SLAB_CACHE_DMA32 ((slab_flags_t __force)0x00008000U)
>> >> >> > /* DEBUG: Store the last owner for bug hunting */
>> >> >> > #define SLAB_STORE_USER ((slab_flags_t __force)0x00010000U)
>> >> >> > /* Panic if kmem_cache_create() fails */
>> >> >> >diff --git a/mm/internal.h b/mm/internal.h
>> >> >> >index a2ee82a0cd44ae..fd244ad716eaf8 100644
>> >> >> >--- a/mm/internal.h
>> >> >> >+++ b/mm/internal.h
>> >> >> >@@ -14,6 +14,7 @@
>> >> >> > #include <linux/fs.h>
>> >> >> > #include <linux/mm.h>
>> >> >> > #include <linux/pagemap.h>
>> >> >> >+#include <linux/slab.h>
>> >> >> > #include <linux/tracepoint-defs.h>
>> >> >> >
>> >> >> > /*
>> >> >> >@@ -34,9 +35,12 @@
>> >> >> > #define GFP_CONSTRAINT_MASK (__GFP_HARDWALL|__GFP_THISNODE)
>> >> >> >
>> >> >> > /* Check for flags that must not be used with a slab allocator */
>> >> >> >-static inline gfp_t check_slab_flags(gfp_t flags)
>> >> >> >+static inline gfp_t check_slab_flags(gfp_t flags, slab_flags_t slab_flags)
>> >> >> > {
>> >> >> >- gfp_t bug_mask = __GFP_DMA32 | __GFP_HIGHMEM | ~__GFP_BITS_MASK;
>> >> >> >+ gfp_t bug_mask = __GFP_HIGHMEM | ~__GFP_BITS_MASK;
>> >> >> >+
>> >> >> >+ if (!IS_ENABLED(CONFIG_ZONE_DMA32) || !(slab_flags & SLAB_CACHE_DMA32))
>> >> >> >+ bug_mask |= __GFP_DMA32;
>> >> >>
>> >> >> The original version doesn't check CONFIG_ZONE_DMA32.
>> >> >>
>> >> >> Do we need to add this condition here?
>> >> >> Could we just decide the bug_mask based on slab_flags?
>> >> >
>> >> >We can. The reason I did it this way is that when we don't have
>> >> >CONFIG_ZONE_DMA32, the compiler should be able to simplify to:
>> >> >
>> >> >bug_mask = __GFP_HIGHMEM | ~__GFP_BITS_MASK;
>> >> >if (true || ..) => if (true)
>> >> > bug_mask |= __GFP_DMA32;
>> >> >
>> >> >Then just
>> >> >bug_mask = __GFP_HIGHMEM | ~__GFP_BITS_MASK | __GFP_DMA32;
>> >> >
>> >> >And since the function is inline, slab_flags would not even need to be
>> >> >accessed at all.
>> >> >
>> >>
>> >> Hmm, I get one confusion.
>> >>
>> >> This means if CONFIG_ZONE_DMA32 is not enabled, bug_mask will always
>> >> contains __GFP_DMA32. This will check with cachep->flags.
>> >>
>> >> If cachep->flags has GFP_DMA32, this always fail?
>> >>
>> >> Is this possible?
>> >
>> >Not fully sure to understand the question, but the code is:
>> >if (!IS_ENABLED(CONFIG_ZONE_DMA32) || !(slab_flags & SLAB_CACHE_DMA32))
>> > bug_mask |= __GFP_DMA32;
>> >
>> >IS_ENABLED(CONFIG_ZONE_DMA32) == true:
>> > - (slab_flags & SLAB_CACHE_DMA32) => bug_mask untouched, __GFP_DMA32
>> >is allowed.
>> > - !(slab_flags & SLAB_CACHE_DMA32) => bug_mask |= __GFP_DMA32;,
>> >__GFP_DMA32 triggers warning
>> >IS_ENABLED(CONFIG_ZONE_DMA32) == false:
>> > => bug_mask |= __GFP_DMA32;, __GFP_DMA32 triggers warning (as
>> >expected, GFP_DMA32 does not make sense if there is no DMA32 zone).
>>
>> This is the case I am thinking.
>>
>> The warning is reasonable since there is no DMA32. While the
>> kmem_cache_create() user is not easy to change their code.
>>
>> For example, one writes code and wants to have a kmem_cache with DMA32
>> capability, so he writes kmem_cache_create(__GFP_DMA32). The code is
>> there and not easy to change. But one distro builder decides to disable
>> DMA32. This will leads to all the kmem_cache_create() through warning?
>
>I don't think CONFIG_ZONE_DMA32 can be enabled/disabled by
>distro/user? IIUC this is a property of the architecture, some have it
>enabled, some don't.

Ok, thanks.

>
>> This behavior is what we expect?
>>
>> >
>> >Does that clarify?
>> >
>> >>
>> >> --
>> >> Wei Yang
>> >> Help you, Help me
>>
>> --
>> Wei Yang
>> Help you, Help me

--
Wei Yang
Help you, Help me

2018-12-06 09:38:40

by Vlastimil Babka

[permalink] [raw]
Subject: Re: [PATCH v4 2/3] mm: Add support for kmem caches in DMA32 zone

On 12/6/18 4:49 AM, Nicolas Boichat wrote:
>> So it would be fine even unchanged. The check would anyway need some
>> more love to catch the same with __GFP_DMA to be consistent and cover
>> all corner cases.
> Yes, the test is not complete. If we really wanted this to be
> accurate, we'd need to check that GFP_* exactly matches SLAB_CACHE_*.
>
> The only problem with dropping this is test that we should restore
> GFP_DMA32 warning/errors somewhere else (as Christopher pointed out
> here: https://lkml.org/lkml/2018/11/22/430), especially for kmalloc
> case.

I meant just dropping that patch hunk, not the whole test. Then the test
stays as it is and will keep warning anyone calling kmalloc(GFP_DMA32).
It would also warn anyone calling kmem_cache_alloc(GFP_DMA32) on
SLAB_CACHE_DMA32 cache, but since the gfp can be just dropped, and you
as the only user of this so far will do that, it's fine?

> Maybe this can be done in kmalloc_slab.

2018-12-06 10:11:14

by Nicolas Boichat

[permalink] [raw]
Subject: Re: [PATCH v4 2/3] mm: Add support for kmem caches in DMA32 zone

On Thu, Dec 6, 2018 at 5:37 PM Vlastimil Babka <[email protected]> wrote:
>
> On 12/6/18 4:49 AM, Nicolas Boichat wrote:
> >> So it would be fine even unchanged. The check would anyway need some
> >> more love to catch the same with __GFP_DMA to be consistent and cover
> >> all corner cases.
> > Yes, the test is not complete. If we really wanted this to be
> > accurate, we'd need to check that GFP_* exactly matches SLAB_CACHE_*.
> >
> > The only problem with dropping this is test that we should restore
> > GFP_DMA32 warning/errors somewhere else (as Christopher pointed out
> > here: https://lkml.org/lkml/2018/11/22/430), especially for kmalloc
> > case.
>
> I meant just dropping that patch hunk, not the whole test. Then the test
> stays as it is and will keep warning anyone calling kmalloc(GFP_DMA32).
> It would also warn anyone calling kmem_cache_alloc(GFP_DMA32) on
> SLAB_CACHE_DMA32 cache, but since the gfp can be just dropped, and you
> as the only user of this so far will do that, it's fine?

I missed your point, this would work fine indeed.

Thanks.

> > Maybe this can be done in kmalloc_slab.