2019-03-19 21:12:57

by Vlastimil Babka

Subject: [RFC 0/2] guarantee natural alignment for kmalloc()

The recent thread [1] inspired me to look into guaranteeing alignment for
kmalloc() for power-of-two sizes. Turns out it's not difficult, and in most
configurations nothing really changes, as it already happens implicitly. More
details are in the first patch. If we agree we want to do this, I will see
where to update the documentation and perhaps whether there are any
workarounds in the tree that can be converted to plain kmalloc() afterwards.

The second patch is a quick and dirty selftest for the alignment. Suggestions
are welcome on whether and how to include this kind of selftest, which has to
be in-kernel.

[1] https://lore.kernel.org/linux-fsdevel/[email protected]/T/#u

Vlastimil Babka (2):
mm, sl[aou]b: guarantee natural alignment for kmalloc(power-of-two)
mm, sl[aou]b: test whether kmalloc() alignment works as expected

mm/slab_common.c | 30 +++++++++++++++++++++++++++++-
mm/slob.c | 42 +++++++++++++++++++++++++++++++-----------
2 files changed, 60 insertions(+), 12 deletions(-)

--
2.21.0



2019-03-19 21:12:38

by Vlastimil Babka

Subject: [RFC 1/2] mm, sl[aou]b: guarantee natural alignment for kmalloc(power-of-two)

In most configurations, kmalloc() happens to return naturally aligned blocks
for power of two sizes. That means some kmalloc() users might implicitly rely
on that alignment, until stuff breaks when the kernel is built with e.g.
CONFIG_SLUB_DEBUG or CONFIG_SLOB, and blocks stop being aligned. Then
developers have to devise workarounds, such as their own kmem caches with
specified alignment, which is not always practical, as recently evidenced
in [1].

Ideally we should provide mm users with what they need without difficult
workarounds or reimplementations of their own, so let's make the kmalloc()
alignment explicit and guaranteed for power-of-two sizes under all
configurations. What does this mean for the three available allocators?
(A short illustrative sketch of the resulting guarantee follows the list
below.)

* SLAB happens to be OK even before the patch. The implicit alignment could be
compromised with CONFIG_DEBUG_SLAB due to redzoning; however, SLAB disables
redzoning for caches whose alignment is larger than unsigned long long. In
practice, on at least x86, this includes the kmalloc caches, as they use
cache line alignment, which is larger than that. Still, this patch ensures
the alignment on all arches and for all cache sizes.

* SLUB is implicitly OK unless redzoning is enabled through CONFIG_SLUB_DEBUG
or a boot parameter. With this patch, the explicit alignment guarantees it
with redzoning as well. This will result in more memory being wasted, but
that should be acceptable in a debugging scenario.

* SLOB has no implicit alignment so this patch adds it explicitly for
kmalloc(). The downside is increased fragmentation, which is hopefully
acceptable for this relatively rarely used allocator.
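
As an illustration of the guarantee (a minimal sketch, not part of the patch):
a caller that needs a power-of-two buffer not to cross, say, a 512-byte
boundary could then rely on plain kmalloc():

	/* Sketch only: a buffer that must stay within one 512-byte "sector". */
	static void *alloc_sector_buf(gfp_t gfp)
	{
		void *buf = kmalloc(512, gfp);

		/*
		 * 512 is a power of two, so with the guarantee buf is
		 * 512-byte aligned and the whole object fits within a
		 * single 512-byte boundary.
		 */
		WARN_ON(buf && ((unsigned long)buf & 511));
		return buf;
	}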

[1] https://lore.kernel.org/linux-fsdevel/[email protected]/T/#u

Signed-off-by: Vlastimil Babka <[email protected]>
---
mm/slab_common.c | 11 ++++++++++-
mm/slob.c | 42 +++++++++++++++++++++++++++++++-----------
2 files changed, 41 insertions(+), 12 deletions(-)

diff --git a/mm/slab_common.c b/mm/slab_common.c
index 03eeb8b7b4b1..e591d5688558 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -968,10 +968,19 @@ void __init create_boot_cache(struct kmem_cache *s, const char *name,
unsigned int useroffset, unsigned int usersize)
{
int err;
+ unsigned int align = ARCH_KMALLOC_MINALIGN;

s->name = name;
s->size = s->object_size = size;
- s->align = calculate_alignment(flags, ARCH_KMALLOC_MINALIGN, size);
+
+ /*
+ * For power of two sizes, guarantee natural alignment for kmalloc
+ * caches, regardless of SL*B debugging options.
+ */
+ if (is_power_of_2(size))
+ align = max(align, size);
+ s->align = calculate_alignment(flags, align, size);
+
s->useroffset = useroffset;
s->usersize = usersize;

diff --git a/mm/slob.c b/mm/slob.c
index 307c2c9feb44..e100fa09493f 100644
--- a/mm/slob.c
+++ b/mm/slob.c
@@ -215,7 +215,8 @@ static void slob_free_pages(void *b, int order)
/*
* Allocate a slob block within a given slob_page sp.
*/
-static void *slob_page_alloc(struct page *sp, size_t size, int align)
+static void *slob_page_alloc(struct page *sp, size_t size, int align,
+ int align_offset)
{
slob_t *prev, *cur, *aligned = NULL;
int delta = 0, units = SLOB_UNITS(size);
@@ -223,8 +224,17 @@ static void *slob_page_alloc(struct page *sp, size_t size, int align)
for (prev = NULL, cur = sp->freelist; ; prev = cur, cur = slob_next(cur)) {
slobidx_t avail = slob_units(cur);

+ /*
+ * 'aligned' will hold the address of the slob block so that the
+ * address 'aligned'+'align_offset' is aligned according to the
+ * 'align' parameter. This is for kmalloc() which prepends the
+ * allocated block with its size, so that the block itself is
+ * aligned when needed.
+ */
if (align) {
- aligned = (slob_t *)ALIGN((unsigned long)cur, align);
+ aligned = (slob_t *)
+ (ALIGN((unsigned long)cur + align_offset, align)
+ - align_offset);
delta = aligned - cur;
}
if (avail >= units + delta) { /* room enough? */
@@ -266,7 +276,8 @@ static void *slob_page_alloc(struct page *sp, size_t size, int align)
/*
* slob_alloc: entry point into the slob allocator.
*/
-static void *slob_alloc(size_t size, gfp_t gfp, int align, int node)
+static void *slob_alloc(size_t size, gfp_t gfp, int align, int node,
+ int align_offset)
{
struct page *sp;
struct list_head *prev;
@@ -298,7 +309,7 @@ static void *slob_alloc(size_t size, gfp_t gfp, int align, int node)

/* Attempt to alloc */
prev = sp->lru.prev;
- b = slob_page_alloc(sp, size, align);
+ b = slob_page_alloc(sp, size, align, align_offset);
if (!b)
continue;

@@ -326,7 +337,7 @@ static void *slob_alloc(size_t size, gfp_t gfp, int align, int node)
INIT_LIST_HEAD(&sp->lru);
set_slob(b, SLOB_UNITS(PAGE_SIZE), b + SLOB_UNITS(PAGE_SIZE));
set_slob_page_free(sp, slob_list);
- b = slob_page_alloc(sp, size, align);
+ b = slob_page_alloc(sp, size, align, align_offset);
BUG_ON(!b);
spin_unlock_irqrestore(&slob_lock, flags);
}
@@ -428,7 +439,7 @@ static __always_inline void *
__do_kmalloc_node(size_t size, gfp_t gfp, int node, unsigned long caller)
{
unsigned int *m;
- int align = max_t(size_t, ARCH_KMALLOC_MINALIGN, ARCH_SLAB_MINALIGN);
+ int minalign = max_t(size_t, ARCH_KMALLOC_MINALIGN, ARCH_SLAB_MINALIGN);
void *ret;

gfp &= gfp_allowed_mask;
@@ -436,19 +447,28 @@ __do_kmalloc_node(size_t size, gfp_t gfp, int node, unsigned long caller)
fs_reclaim_acquire(gfp);
fs_reclaim_release(gfp);

- if (size < PAGE_SIZE - align) {
+ if (size < PAGE_SIZE - minalign) {
+ int align = minalign;
+
+ /*
+ * For power of two sizes, guarantee natural alignment for
+ * kmalloc()'d objects.
+ */
+ if (is_power_of_2(size))
+ align = max(minalign, (int) size);
+
if (!size)
return ZERO_SIZE_PTR;

- m = slob_alloc(size + align, gfp, align, node);
+ m = slob_alloc(size + minalign, gfp, align, node, minalign);

if (!m)
return NULL;
*m = size;
- ret = (void *)m + align;
+ ret = (void *)m + minalign;

trace_kmalloc_node(caller, ret,
- size, size + align, gfp, node);
+ size, size + minalign, gfp, node);
} else {
unsigned int order = get_order(size);

@@ -544,7 +564,7 @@ static void *slob_alloc_node(struct kmem_cache *c, gfp_t flags, int node)
fs_reclaim_release(flags);

if (c->size < PAGE_SIZE) {
- b = slob_alloc(c->size, flags, c->align, node);
+ b = slob_alloc(c->size, flags, c->align, node, 0);
trace_kmem_cache_alloc_node(_RET_IP_, b, c->object_size,
SLOB_UNITS(c->size) * SLOB_UNIT,
flags, node);
--
2.21.0


2019-03-19 21:14:14

by Vlastimil Babka

Subject: [RFC 2/2] mm, sl[aou]b: test whether kmalloc() alignment works as expected

Quick and dirty init test that kmalloc() alignment works as expected for
power-of-two sizes after the previous patch.

Signed-off-by: Vlastimil Babka <[email protected]>
---
mm/slab_common.c | 19 +++++++++++++++++++
1 file changed, 19 insertions(+)

diff --git a/mm/slab_common.c b/mm/slab_common.c
index e591d5688558..de10ca9640e0 100644
--- a/mm/slab_common.c
+++ b/mm/slab_common.c
@@ -1621,3 +1621,22 @@ int should_failslab(struct kmem_cache *s, gfp_t gfpflags)
return 0;
}
ALLOW_ERROR_INJECTION(should_failslab, ERRNO);
+
+static int __init slab_kmalloc_test(void)
+{
+ int i;
+
+ for (i = KMALLOC_SHIFT_LOW; i <= KMALLOC_SHIFT_HIGH; i++) {
+ unsigned int size = 1 << i;
+ void *obj = kmalloc(size, GFP_KERNEL);
+ unsigned long objaddr = (unsigned long) obj;
+
+ printk("Size %u obj %px alignment: %s\n", size, obj,
+ (((objaddr & (size - 1)) == 0) ? "OK" : "WRONG"));
+ kfree(obj);
+ }
+
+ return 0;
+}
+
+__initcall(slab_kmalloc_test);
--
2.21.0


Subject: Re: [RFC 0/2] guarantee natural alignment for kmalloc()

On Tue, 19 Mar 2019, Vlastimil Babka wrote:

> The recent thread [1] inspired me to look into guaranteeing alignment for
> kmalloc() for power-of-two sizes. Turns out it's not difficult and in most
> configurations nothing really changes as it happens implicitly. More details in
> the first patch. If we agree we want to do this, I will see where to update
> documentation and perhaps if there are any workarounds in the tree that can be
> converted to plain kmalloc() afterwards.

This means that the alignments are no longer uniform for all kmalloc
caches and we get back to code making all sorts of assumptions about
kmalloc alignments.

Currently all kmalloc objects are aligned to KMALLOC_MIN_ALIGN. That will
no longer be the case and alignments will become inconsistent.

I think it's valuable that alignment requirements need to be explicitly
requested.

Let's add an array of power-of-two aligned kmalloc caches if that is really
necessary. Add some GFP_XXX flag to kmalloc to make it ^2 aligned, maybe?


Subject: Re: [RFC 2/2] mm, sl[aou]b: test whether kmalloc() alignment works as expected

On Tue, 19 Mar 2019, Vlastimil Babka wrote:

> Quick and dirty init test that kmalloc() alignment works as expected for
> power-of-two sizes after the previous patch.

There is already an allocator testing function in mm/slub.c. Can you
generalize it, or portions of it, and put them into mm/slab_common.c?

2019-03-20 00:54:44

by David Rientjes

Subject: Re: [RFC 0/2] guarantee natural alignment for kmalloc()

On Wed, 20 Mar 2019, Christopher Lameter wrote:

> > The recent thread [1] inspired me to look into guaranteeing alignment for
> > kmalloc() for power-of-two sizes. Turns out it's not difficult and in most
> > configurations nothing really changes as it happens implicitly. More details in
> > the first patch. If we agree we want to do this, I will see where to update
> > documentation and perhaps if there are any workarounds in the tree that can be
> > converted to plain kmalloc() afterwards.
>
> This means that the alignments are no longer uniform for all kmalloc
> caches and we get back to code making all sorts of assumptions about
> kmalloc alignments.
>
> Currently all kmalloc objects are aligned to KMALLOC_MIN_ALIGN. That will
> no longer be the case and alignments will become inconsistent.
>
> I think it's valuable that alignment requirements need to be explicitly
> requested.
>
> Let's add an array of power-of-two aligned kmalloc caches if that is really
> necessary. Add some GFP_XXX flag to kmalloc to make it ^2 aligned, maybe?
>

No objection, but I think the GFP flags should remain what they are for:
to Get Free Pages. If we are to add additional flags to specify
characteristics of slab objects, can we add a kmalloc_flags() variant that
will take a new set of flags? SLAB_OBJ_ALIGN_POW2?

2019-03-20 08:50:01

by Vlastimil Babka

Subject: Re: [RFC 0/2] guarantee natural alignment for kmalloc()

On 3/20/19 1:43 AM, Christopher Lameter wrote:
> On Tue, 19 Mar 2019, Vlastimil Babka wrote:
>
>> The recent thread [1] inspired me to look into guaranteeing alignment for
>> kmalloc() for power-of-two sizes. Turns out it's not difficult and in most
>> configurations nothing really changes as it happens implicitly. More details in
>> the first patch. If we agree we want to do this, I will see where to update
>> documentation and perhaps if there are any workarounds in the tree that can be
>> converted to plain kmalloc() afterwards.
>
> This means that the alignments are no longer uniform for all kmalloc
> caches and we get back to code making all sorts of assumptions about
> kmalloc alignments.

Natural alignment to size is rather well defined, no? Would anyone ever
assume a larger one, for what reason?
It's right now that some make assumptions (even unknowingly) about natural
alignment.
There are two 'odd' sizes, 96 and 192, which will keep cacheline size
alignment; would anyone really expect more than 64 bytes for those?

> Currently all kmalloc objects are aligned to KMALLOC_MIN_ALIGN. That will
> no longer be the case and alignments will become inconsistent.

KMALLOC_MIN_ALIGN is still the minimum, but in practice it's larger,
which is not a problem.

Also let me stress again that nothing really changes except for SLOB,
and SLUB with debug options. The natural alignment for power-of-two
sizes already happens as SLAB and SLUB both allocate objects starting on
the page boundary. So people make assumptions based on that, and then
things break with SLOB, or with SLUB debug. This patch just prevents that
breakage by guaranteeing those natural assumptions at all times.

> I think it's valuable that alignment requirements need to be explicitly
> requested.

That's still possible for named caches created by kmem_cache_create().

> Let's add an array of power-of-two aligned kmalloc caches if that is really
> necessary. Add some GFP_XXX flag to kmalloc to make it ^2 aligned, maybe?

That's unnecessary and wasteful, as the existing caches are already
aligned in the common configurations. Requiring a flag doesn't help with
the implicit assumptions going wrong. I really don't think it needs to
get more complicated than adjusting the uncommon configuration, as this
patch does.

Subject: Re: [RFC 0/2] guarantee natural alignment for kmalloc()

On Wed, 20 Mar 2019, Vlastimil Babka wrote:

> > This means that the alignments are no longer uniform for all kmalloc
> > caches and we get back to code making all sorts of assumptions about
> > kmalloc alignments.
>
> Natural alignment to size is rather well defined, no? Would anyone ever
> assume a larger one, for what reason?
> It's right now that some make assumptions (even unknowingly) about natural
> alignment.
> There are two 'odd' sizes, 96 and 192, which will keep cacheline size
> alignment; would anyone really expect more than 64 bytes for those?

I think one would expect one set of alignment rules for any kmalloc object.

> > Currently all kmalloc objects are aligned to KMALLOC_MIN_ALIGN. That will
> > no longer be the case and alignments will become inconsistent.
>
> KMALLOC_MIN_ALIGN is still the minimum, but in practice it's larger
> which is not a problem.

"In practice" refers to the current way that slab allocators arrange
objects within the page. They are free to do otherwise if new ideas come
up for object arrangements etc.

The slab allocators may already have to store data in addition to the user
accessible part (e.g. for RCU or a ctor). The "natural alignment" of a
power of 2 cache is no longer what you expect in these cases. Debugging is
not the only case where we extend the object.

> Also let me stress again that nothing really changes except for SLOB,
> and SLUB with debug options. The natural alignment for power-of-two
> sizes already happens as SLAB and SLUB both allocate objects starting on
> the page boundary. So people make assumptions based on that, and then
> break with SLOB, or SLUB with debug. This patch just prevents that
> breakage by guaranteeing those natural assumptions at all times.

As explained before, there is nothing "natural" here. Doing so restricts
future features and creates a mess of exceptions within the allocator for
debugging etc. (see what happened to SLAB). "Natural" is just a
simplistic idea of how a user would arrange power-of-2 objects.
These assumptions should not be made but specified explicitly.

> > I think its valuable that alignment requirements need to be explicitly
> > requested.
>
> That's still possible for named caches created by kmem_cache_create().

So lets leave it as it is now then.

2019-03-20 18:56:01

by Matthew Wilcox

Subject: Re: [RFC 0/2] guarantee natural alignment for kmalloc()

On Wed, Mar 20, 2019 at 09:48:47AM +0100, Vlastimil Babka wrote:
> Natural alignment to size is rather well defined, no? Would anyone ever
> assume a larger one, for what reason?
> It's right now that some make assumptions (even unknowingly) about natural
> alignment.
> There are two 'odd' sizes, 96 and 192, which will keep cacheline size
> alignment; would anyone really expect more than 64 bytes for those?

Presumably 96 will keep being aligned to 32 bytes, as aligning 96 to 64
just results in 128-byte allocations.

2019-03-20 21:49:03

by Vlastimil Babka

Subject: Re: [RFC 0/2] guarantee natural alignment for kmalloc()


On 3/20/2019 7:53 PM, Matthew Wilcox wrote:
> On Wed, Mar 20, 2019 at 09:48:47AM +0100, Vlastimil Babka wrote:
>> Natural alignment to size is rather well defined, no? Would anyone ever
>> assume a larger one, for what reason?
>> It's right now that some make assumptions (even unknowingly) about natural
>> alignment.
>> There are two 'odd' sizes, 96 and 192, which will keep cacheline size
>> alignment; would anyone really expect more than 64 bytes for those?
>
> Presumably 96 will keep being aligned to 32 bytes, as aligning 96 to 64
> just results in 128-byte allocations.

Well, looks like that's what happens. This is with SLAB, but the alignment
calculations should be common:

slabinfo - version: 2.1
# name <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
kmalloc-96 2611 4896 128 32 1 : tunables 120 60 8 : slabdata 153 153 0
kmalloc-128 4798 5536 128 32 1 : tunables 120 60 8 : slabdata 173 173 0

2019-03-21 02:24:54

by Matthew Wilcox

Subject: Re: [RFC 0/2] guarantee natural alignment for kmalloc()

On Wed, Mar 20, 2019 at 10:48:03PM +0100, Vlastimil Babka wrote:
> On 3/20/2019 7:53 PM, Matthew Wilcox wrote:
> > On Wed, Mar 20, 2019 at 09:48:47AM +0100, Vlastimil Babka wrote:
> >> Natural alignment to size is rather well defined, no? Would anyone ever
> >> assume a larger one, for what reason?
> >> It's right now that some make assumptions (even unknowingly) about natural
> >> alignment.
> >> There are two 'odd' sizes, 96 and 192, which will keep cacheline size
> >> alignment; would anyone really expect more than 64 bytes for those?
> >
> > Presumably 96 will keep being aligned to 32 bytes, as aligning 96 to 64
> > just results in 128-byte allocations.
>
> Well, looks like that's what happens. This is with SLAB, but the alignment
> calculations should be common:
>
> slabinfo - version: 2.1
> # name <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
> kmalloc-96 2611 4896 128 32 1 : tunables 120 60 8 : slabdata 153 153 0
> kmalloc-128 4798 5536 128 32 1 : tunables 120 60 8 : slabdata 173 173 0

Hmm. On my laptop, I see:

kmalloc-96 28050 35364 96 42 1 : tunables 0 0 0 : slabdata 842 842 0

That'd take me from 842 * 4k pages to 1105 4k pages -- an extra megabyte of
memory.
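
For reference, the arithmetic behind those figures (assuming one 4 KiB page
per slab, as in the slabinfo lines above):

	35364 objects / 42 per slab  =  842 slabs   (96-byte objects)
	35364 objects / 32 per slab ~= 1105 slabs   (objects padded to 128 bytes)
	(1105 - 842) slabs * 4 KiB   ~=    1 MiB extra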

This is running Debian's 4.19 kernel:

# CONFIG_SLAB is not set
CONFIG_SLUB=y
# CONFIG_SLOB is not set
CONFIG_SLAB_MERGE_DEFAULT=y
CONFIG_SLAB_FREELIST_RANDOM=y
CONFIG_SLAB_FREELIST_HARDENED=y
CONFIG_SLUB_CPU_PARTIAL=y



2019-03-21 07:03:27

by Vlastimil Babka

Subject: Re: [RFC 0/2] guarantee natural alignment for kmalloc()

On 3/21/19 3:23 AM, Matthew Wilcox wrote:
> On Wed, Mar 20, 2019 at 10:48:03PM +0100, Vlastimil Babka wrote:
>>
>> Well, looks like that's what happens. This is with SLAB, but the alignment
>> calculations should be common:
>>
>> slabinfo - version: 2.1
>> # name <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
>> kmalloc-96 2611 4896 128 32 1 : tunables 120 60 8 : slabdata 153 153 0
>> kmalloc-128 4798 5536 128 32 1 : tunables 120 60 8 : slabdata 173 173 0
>
> Hmm. On my laptop, I see:
>
> kmalloc-96 28050 35364 96 42 1 : tunables 0 0 0 : slabdata 842 842 0
>
> That'd take me from 842 * 4k pages to 1105 4k pages -- an extra megabyte of
> memory.
>
> This is running Debian's 4.19 kernel:
>
> # CONFIG_SLAB is not set
> CONFIG_SLUB=y

Ah, you're right. SLAB creates kmalloc caches with:

#ifndef ARCH_KMALLOC_FLAGS
#define ARCH_KMALLOC_FLAGS SLAB_HWCACHE_ALIGN
#endif

create_kmalloc_caches(ARCH_KMALLOC_FLAGS);

While SLUB just:

create_kmalloc_caches(0);

even though it uses SLAB_HWCACHE_ALIGN for kmem_cache_node and
kmem_cache caches.
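
For reference, the SLAB_HWCACHE_ALIGN handling in calculate_alignment() boils
down to roughly this (a paraphrased sketch, not a verbatim quote of
mm/slab_common.c):

	if (flags & SLAB_HWCACHE_ALIGN) {
		unsigned int ralign = cache_line_size();   /* typically 64 */

		/* reduce alignment for objects that fit in half a line or less */
		while (size <= ralign / 2)
			ralign /= 2;
		align = max(align, ralign);
	}

So with the flag, SLAB's kmalloc-96 gets 64-byte alignment and is padded to
128 bytes (matching the slabinfo above), while SLUB's kmalloc-96, created
without the flag, keeps 96-byte objects at 42 per page.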



2019-03-21 07:42:58

by Vlastimil Babka

Subject: Re: [RFC 0/2] guarantee natural alignment for kmalloc()

On 3/20/19 7:20 PM, Christopher Lameter wrote:
>>> Currently all kmalloc objects are aligned to KMALLOC_MIN_ALIGN. That will
>>> no longer be the case and alignments will become inconsistent.
>>
>> KMALLOC_MIN_ALIGN is still the minimum, but in practice it's larger
>> which is not a problem.
>
> "In practice" refers to the current way that slab allocators arrange
> objects within the page. They are free to do otherwise if new ideas come
> up for object arrangements etc.
>
> The slab allocators may already have to store data in addition to the user
> accessible part (e.g. for RCU or a ctor). The "natural alignment" of a
> power of 2 cache is no longer what you expect in these cases. Debugging is
> not the only case where we extend the object.

For plain kmalloc() caches, RCU and ctors don't apply, right.

>> Also let me stress again that nothing really changes except for SLOB,
>> and SLUB with debug options. The natural alignment for power-of-two
>> sizes already happens as SLAB and SLUB both allocate objects starting on
>> the page boundary. So people make assumptions based on that, and then
>> break with SLOB, or SLUB with debug. This patch just prevents that
>> breakage by guaranteeing those natural assumptions at all times.
>
> As explained before there is nothing "natural" here. Doing so restricts
> future features

Well, future features will have to deal with the existing named caches
created with specific alignment.

> and creates a mess of exceptions within the allocator for
> debugging etc. (see what happened to SLAB).

SLAB could be fixed, just nobody cares enough I guess. If I want to
debug wrong SL*B usage I'll use SLUB.

> "Natural" is just a
> simplistic thought of a user how he would arrange power of 2 objects.
> These assumption should not be made but specified explicitly.

Patch 1 does this explicitly for plain kmalloc(). It's unrealistic to
add an 'align' parameter to plain kmalloc(), as that would have to create
caches on demand for 'new' values of the align parameter.

>>> I think it's valuable that alignment requirements need to be explicitly
>>> requested.
>>
>> That's still possible for named caches created by kmem_cache_create().
>
> So lets leave it as it is now then.

That however doesn't work well for the xfs/IO case where block sizes are
not known in advance:

https://lore.kernel.org/linux-fsdevel/[email protected]/T/#ec3a292c358d05a6b29cc4a9ce3ae6b2faf31a23f
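
To make the alternative concrete, per-size caches would look roughly like the
sketch below (hypothetical names; the lazy, racy cache creation is only meant
to illustrate the 'sizes not known in advance' problem, not a real design):

	/* one naturally aligned cache per power-of-two buffer size */
	static struct kmem_cache *blk_buf_cache[PAGE_SHIFT + 1];

	static void *blk_buf_alloc(unsigned int size, gfp_t gfp)
	{
		unsigned int idx = ilog2(size);	/* size assumed power of two */

		if (!blk_buf_cache[idx])
			blk_buf_cache[idx] = kmem_cache_create("blk_buf",
						size, size, 0, NULL);
		if (!blk_buf_cache[idx])
			return NULL;
		return kmem_cache_alloc(blk_buf_cache[idx], gfp);
	}

That is, a cache per size, created on demand, which is exactly the kind of
boilerplate the natural alignment guarantee would avoid.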

Subject: Re: [RFC 0/2] guarantee natural alignment for kmalloc()

On Thu, 21 Mar 2019, Vlastimil Babka wrote:

> That however doesn't work well for the xfs/IO case where block sizes are
> not known in advance:
>
> https://lore.kernel.org/linux-fsdevel/[email protected]/T/#ec3a292c358d05a6b29cc4a9ce3ae6b2faf31a23f

I thought we agreed to use custom slab caches for that?


2019-04-05 17:12:14

by Vlastimil Babka

Subject: Re: [RFC 0/2] guarantee natural alignment for kmalloc()

On 3/22/19 6:52 PM, Christopher Lameter wrote:
> On Thu, 21 Mar 2019, Vlastimil Babka wrote:
>
>> That however doesn't work well for the xfs/IO case where block sizes are
>> not known in advance:
>>
>> https://lore.kernel.org/linux-fsdevel/[email protected]/T/#ec3a292c358d05a6b29cc4a9ce3ae6b2faf31a23f
>
> I thought we agreed to use custom slab caches for that?

Hm, maybe I missed something, but my impression was that the xfs/IO folks
would have to create lots of them for various sizes not known in advance,
that it wasn't practical, and that they would welcome it if kmalloc() just
guaranteed the alignment. But so far they haven't chimed in in this thread,
so I guess I'm wrong.

2019-04-07 08:03:15

by Christoph Hellwig

Subject: Re: [RFC 0/2] guarantee natural alignment for kmalloc()

On Fri, Apr 05, 2019 at 07:11:17PM +0200, Vlastimil Babka wrote:
> On 3/22/19 6:52 PM, Christopher Lameter wrote:
> > On Thu, 21 Mar 2019, Vlastimil Babka wrote:
> >
> >> That however doesn't work well for the xfs/IO case where block sizes are
> >> not known in advance:
> >>
> >> https://lore.kernel.org/linux-fsdevel/[email protected]/T/#ec3a292c358d05a6b29cc4a9ce3ae6b2faf31a23f
> >
> > I thought we agreed to use custom slab caches for that?
>
> Hm maybe I missed something but my impression was that xfs/IO folks would have
> to create lots of them for various sizes not known in advance, and that it
> wasn't practical and would welcome if kmalloc just guaranteed the alignment.
> But so far they haven't chimed in here in this thread, so I guess I'm wrong.

Yes, in XFS we might have quite a few. Never mind all the other
block level consumers that might have similar reasonable expectations
but haven't triggered the problematic drivers yet.

2019-04-09 08:08:39

by Vlastimil Babka

Subject: Re: [RFC 0/2] guarantee natural alignment for kmalloc()

On 4/7/19 10:00 AM, Christoph Hellwig wrote:
> On Fri, Apr 05, 2019 at 07:11:17PM +0200, Vlastimil Babka wrote:
>> On 3/22/19 6:52 PM, Christopher Lameter wrote:
>> > On Thu, 21 Mar 2019, Vlastimil Babka wrote:
>> >
>> >> That however doesn't work well for the xfs/IO case where block sizes are
>> >> not known in advance:
>> >>
>> >> https://lore.kernel.org/linux-fsdevel/[email protected]/T/#ec3a292c358d05a6b29cc4a9ce3ae6b2faf31a23f
>> >
>> > I thought we agreed to use custom slab caches for that?
>>
>> Hm maybe I missed something but my impression was that xfs/IO folks would have
>> to create lots of them for various sizes not known in advance, and that it
>> wasn't practical and would welcome if kmalloc just guaranteed the alignment.
>> But so far they haven't chimed in here in this thread, so I guess I'm wrong.
>
> Yes, in XFS we might have quite a few. Never mind all the other
> block level consumers that might have similar reasonable expectations
> but haven't triggered the problematic drivers yet.

What about an LSF session/BoF to sort this out, then? It would need to have
people from all three MM+FS+IO groups, I suppose.

2019-04-09 09:21:55

by Michal Hocko

Subject: Re: [RFC 0/2] guarantee natural alignment for kmalloc()

On Tue 09-04-19 10:07:42, Vlastimil Babka wrote:
> On 4/7/19 10:00 AM, Christoph Hellwig wrote:
> > On Fri, Apr 05, 2019 at 07:11:17PM +0200, Vlastimil Babka wrote:
> >> On 3/22/19 6:52 PM, Christopher Lameter wrote:
> >> > On Thu, 21 Mar 2019, Vlastimil Babka wrote:
> >> >
> >> >> That however doesn't work well for the xfs/IO case where block sizes are
> >> >> not known in advance:
> >> >>
> >> >> https://lore.kernel.org/linux-fsdevel/[email protected]/T/#ec3a292c358d05a6b29cc4a9ce3ae6b2faf31a23f
> >> >
> >> > I thought we agreed to use custom slab caches for that?
> >>
> >> Hm maybe I missed something but my impression was that xfs/IO folks would have
> >> to create lots of them for various sizes not known in advance, and that it
> >> wasn't practical and would welcome if kmalloc just guaranteed the alignment.
> >> But so far they haven't chimed in here in this thread, so I guess I'm wrong.
> >
> > Yes, in XFS we might have quite a few. Never mind all the other
> > block level consumers that might have similar reasonable expectations
> > but haven't triggered the problematic drivers yet.
>
> What about a LSF session/BoF to sort this out, then? Would need to have people
> from all three MM+FS+IO groups, I suppose.

Sounds like a good plan. Care to send an email to the lsf-pc mailing list so
that it doesn't fall through the cracks, please?

--
Michal Hocko
SUSE Labs