When running LTP's oom test with kmemleak enabled, the warning below was
triggered because the kernel detected that __GFP_NOFAIL & ~__GFP_DIRECT_RECLAIM
was passed in:
WARNING: CPU: 105 PID: 2138 at mm/page_alloc.c:4608 __alloc_pages_nodemask+0x1c31/0x1d50
Modules linked in: loop dax_pmem dax_pmem_core
ip_tables x_tables xfs virtio_net net_failover virtio_blk failover
ata_generic virtio_pci virtio_ring virtio libata
CPU: 105 PID: 2138 Comm: oom01 Not tainted 5.2.0-next-20190710+ #7
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
RIP: 0010:__alloc_pages_nodemask+0x1c31/0x1d50
...
kmemleak_alloc+0x4e/0xb0
kmem_cache_alloc+0x2a7/0x3e0
? __kmalloc+0x1d6/0x470
? ___might_sleep+0x9c/0x170
? mempool_alloc+0x2b0/0x2b0
mempool_alloc_slab+0x2d/0x40
mempool_alloc+0x118/0x2b0
? __kasan_check_read+0x11/0x20
? mempool_resize+0x390/0x390
? lock_downgrade+0x3c0/0x3c0
bio_alloc_bioset+0x19d/0x350
? __swap_duplicate+0x161/0x240
? bvec_alloc+0x1b0/0x1b0
? do_raw_spin_unlock+0xa8/0x140
? _raw_spin_unlock+0x27/0x40
get_swap_bio+0x80/0x230
? __x64_sys_madvise+0x50/0x50
? end_swap_bio_read+0x310/0x310
? __kasan_check_read+0x11/0x20
? check_chain_key+0x24e/0x300
? bdev_write_page+0x55/0x130
__swap_writepage+0x5ff/0xb20
mempool_alloc_slab() clears __GFP_DIRECT_RECLAIM, and kmemleak has
__GFP_NOFAIL set all the time due to commit
d9570ee3bd1d4f20ce63485f5ef05663866fe6c0 ("kmemleak: allow to coexist
with fault injection").
Fault injection does not try to fail a slab or page allocation if
__GFP_NOFAIL is used, and that commit relies on this to turn off fault
injection for kmemleak allocations. Although __GFP_NOFAIL doesn't
guarantee no failure in all cases (e.g. a non-blockable allocation may
fail), it still makes sense for most cases. Kmemleak is also a debugging
tool, so it does not seem worth changing the behavior.
It also makes sense to keep the warning, so just document the special
case in the comment.
Cc: Michal Hocko <[email protected]>
Cc: Dmitry Vyukov <[email protected]>
Cc: Catalin Marinas <[email protected]>
Signed-off-by: Yang Shi <[email protected]>
---
mm/page_alloc.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index d66bc8a..cac6efb 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4531,8 +4531,14 @@ bool gfp_pfmemalloc_allowed(gfp_t gfp_mask)
*/
if (gfp_mask & __GFP_NOFAIL) {
/*
- * All existing users of the __GFP_NOFAIL are blockable, so warn
- * of any new users that actually require GFP_NOWAIT
+ * The users of the __GFP_NOFAIL are expected be blockable,
+ * and this is true for the most cases except for kmemleak.
+ * The kmemleak pass in __GFP_NOFAIL to skip fault injection,
+ * however kmemleak may allocate object at some non-blockable
+ * context to trigger this warning.
+ *
+ * Keep this warning since it is still useful for the most
+ * normal cases.
*/
if (WARN_ON_ONCE(!can_direct_reclaim))
goto fail;
--
1.8.3.1
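For readers who want to see the flag combination in isolation, here is a minimal userspace sketch of the check that the comment above documents. The DEMO_* flag values, the enum, and demo_nofail_fallback() are hypothetical stand-ins for illustration only; they are not the kernel's definitions, and the sketch models only the decision taken once an allocation attempt has already failed.

```c
#include <stdbool.h>

/* Hypothetical flag bits for illustration; not the kernel's values. */
typedef unsigned int demo_gfp_t;
#define DEMO_GFP_DIRECT_RECLAIM 0x1u
#define DEMO_GFP_NOFAIL         0x2u

enum demo_outcome {
	DEMO_KEEP_RETRYING,	/* blockable __GFP_NOFAIL: never give up */
	DEMO_FAIL_QUIETLY,	/* ordinary allocation: return failure */
	DEMO_FAIL_WITH_WARNING	/* NOFAIL without direct reclaim */
};

/* Models only the last-resort decision after an attempt has failed. */
static enum demo_outcome demo_nofail_fallback(demo_gfp_t gfp_mask)
{
	bool can_direct_reclaim = gfp_mask & DEMO_GFP_DIRECT_RECLAIM;

	if (gfp_mask & DEMO_GFP_NOFAIL) {
		/* Stands in for WARN_ON_ONCE(!can_direct_reclaim). */
		if (!can_direct_reclaim)
			return DEMO_FAIL_WITH_WARNING;
		return DEMO_KEEP_RETRYING;
	}
	return DEMO_FAIL_QUIETLY;
}
```

Under these assumptions, the kmemleak case in the trace corresponds to the DEMO_FAIL_WITH_WARNING branch: the request can still fail, and it also warns.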
On Sat, 13 Jul 2019, Yang Shi wrote:
> When running ltp's oom test with kmemleak enabled, the below warning was
> triggerred since kernel detects __GFP_NOFAIL & ~__GFP_DIRECT_RECLAIM is
> passed in:
>
> WARNING: CPU: 105 PID: 2138 at mm/page_alloc.c:4608 __alloc_pages_nodemask+0x1c31/0x1d50
> Modules linked in: loop dax_pmem dax_pmem_core
> ip_tables x_tables xfs virtio_net net_failover virtio_blk failover
> ata_generic virtio_pci virtio_ring virtio libata
> CPU: 105 PID: 2138 Comm: oom01 Not tainted 5.2.0-next-20190710+ #7
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
> RIP: 0010:__alloc_pages_nodemask+0x1c31/0x1d50
> ...
> kmemleak_alloc+0x4e/0xb0
> kmem_cache_alloc+0x2a7/0x3e0
> ? __kmalloc+0x1d6/0x470
> ? ___might_sleep+0x9c/0x170
> ? mempool_alloc+0x2b0/0x2b0
> mempool_alloc_slab+0x2d/0x40
> mempool_alloc+0x118/0x2b0
> ? __kasan_check_read+0x11/0x20
> ? mempool_resize+0x390/0x390
> ? lock_downgrade+0x3c0/0x3c0
> bio_alloc_bioset+0x19d/0x350
> ? __swap_duplicate+0x161/0x240
> ? bvec_alloc+0x1b0/0x1b0
> ? do_raw_spin_unlock+0xa8/0x140
> ? _raw_spin_unlock+0x27/0x40
> get_swap_bio+0x80/0x230
> ? __x64_sys_madvise+0x50/0x50
> ? end_swap_bio_read+0x310/0x310
> ? __kasan_check_read+0x11/0x20
> ? check_chain_key+0x24e/0x300
> ? bdev_write_page+0x55/0x130
> __swap_writepage+0x5ff/0xb20
>
> The mempool_alloc_slab() clears __GFP_DIRECT_RECLAIM, kmemleak has
> __GFP_NOFAIL set all the time due to commit
> d9570ee3bd1d4f20ce63485f5ef05663866fe6c0 ("kmemleak: allow to coexist
> with fault injection").
>
It only clears __GFP_DIRECT_RECLAIM provisionally to see if the allocation
would immediately succeed before falling back to the elements in the
mempool. If that fails, and the mempool is empty, mempool_alloc()
attempts the allocation with __GFP_DIRECT_RECLAIM. So for the problem
described here, I think what we really want is this:
diff --git a/mm/mempool.c b/mm/mempool.c
--- a/mm/mempool.c
+++ b/mm/mempool.c
@@ -386,7 +386,7 @@ void *mempool_alloc(mempool_t *pool, gfp_t gfp_mask)
gfp_mask |= __GFP_NORETRY; /* don't loop in __alloc_pages */
gfp_mask |= __GFP_NOWARN; /* failures are OK */
- gfp_temp = gfp_mask & ~(__GFP_DIRECT_RECLAIM|__GFP_IO);
+ gfp_temp = gfp_mask & ~(__GFP_DIRECT_RECLAIM|__GFP_IO|__GFP_NOFAIL);
repeat_alloc:
But bio_alloc_bioset() plays with gfp_mask itself: are we sure that it
isn't the one clearing __GFP_DIRECT_RECLAIM itself before falling back to
saved_gfp?
In other words do we also want this?
diff --git a/block/bio.c b/block/bio.c
--- a/block/bio.c
+++ b/block/bio.c
@@ -462,16 +462,16 @@ struct bio *bio_alloc_bioset(gfp_t gfp_mask, unsigned int nr_iovecs,
* We solve this, and guarantee forward progress, with a rescuer
* workqueue per bio_set. If we go to allocate and there are
* bios on current->bio_list, we first try the allocation
- * without __GFP_DIRECT_RECLAIM; if that fails, we punt those
- * bios we would be blocking to the rescuer workqueue before
- * we retry with the original gfp_flags.
+ * without __GFP_DIRECT_RECLAIM or __GFP_NOFAIL; if that fails,
+ * we punt those bios we would be blocking to the rescuer
+ * workqueue before we retry with the original gfp_flags.
*/
-
if (current->bio_list &&
(!bio_list_empty(&current->bio_list[0]) ||
!bio_list_empty(&current->bio_list[1])) &&
bs->rescue_workqueue)
- gfp_mask &= ~__GFP_DIRECT_RECLAIM;
+ gfp_mask &= ~(__GFP_DIRECT_RECLAIM |
+ __GFP_NOFAIL);
p = mempool_alloc(&bs->bio_pool, gfp_mask);
if (!p && gfp_mask != saved_gfp) {
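The mask arithmetic behind the provisional first attempt in mempool_alloc() can be sketched in userspace as follows. This is only a model under stated assumptions: the DEMO_* bits and demo_first_attempt_mask() are hypothetical names, and the strip_nofail switch merely selects between the existing expression and the one proposed in the mm/mempool.c hunk above.

```c
#include <stdbool.h>

/* Hypothetical flag bits for illustration; not the kernel's values. */
typedef unsigned int demo_gfp_t;
#define DEMO_GFP_DIRECT_RECLAIM 0x1u
#define DEMO_GFP_IO             0x2u
#define DEMO_GFP_NOFAIL         0x4u

/*
 * Mask used for the first, opportunistic attempt. If that attempt fails
 * and the pool is empty, the caller's full mask is used for the retry.
 */
static demo_gfp_t demo_first_attempt_mask(demo_gfp_t gfp_mask, bool strip_nofail)
{
	demo_gfp_t strip = DEMO_GFP_DIRECT_RECLAIM | DEMO_GFP_IO;

	if (strip_nofail)
		strip |= DEMO_GFP_NOFAIL;	/* the proposed variant */

	return gfp_mask & ~strip;
}
```

With the existing expression, a caller mask that carries the NOFAIL bit leaves that bit set while direct reclaim is cleared, which is the combination the page allocator warns about. The proposed variant clears both for the first attempt.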
On Sat, Jul 13, 2019 at 04:49:04AM +0800, Yang Shi wrote:
> When running ltp's oom test with kmemleak enabled, the below warning was
> triggerred since kernel detects __GFP_NOFAIL & ~__GFP_DIRECT_RECLAIM is
> passed in:
There are lots of places where kmemleak will call kmalloc with
__GFP_NOFAIL and ~__GFP_DIRECT_RECLAIM (including the XArray code, which
is how I know about it). It needs to be fixed to allow its internal
allocations to fail and return failure of the original allocation as
a consequence.
On 7/13/19 2:25 PM, Matthew Wilcox wrote:
> On Sat, Jul 13, 2019 at 04:49:04AM +0800, Yang Shi wrote:
>> When running ltp's oom test with kmemleak enabled, the below warning was
>> triggerred since kernel detects __GFP_NOFAIL & ~__GFP_DIRECT_RECLAIM is
>> passed in:
> There are lots of places where kmemleak will call kmalloc with
> __GFP_NOFAIL and ~__GFP_DIRECT_RECLAIM (including the XArray code, which
> is how I know about it). It needs to be fixed to allow its internal
> allocations to fail and return failure of the original allocation as
> a consequence.
Do you mean kmemleak internal allocation? It would fail even though
__GFP_NOFAIL is passed in if GFP_NOWAIT is specified. Currently buddy
allocator will not retry if the allocation is non-blockable.
On 7/13/19 12:39 PM, David Rientjes wrote:
> On Sat, 13 Jul 2019, Yang Shi wrote:
>
>> When running ltp's oom test with kmemleak enabled, the below warning was
>> triggerred since kernel detects __GFP_NOFAIL & ~__GFP_DIRECT_RECLAIM is
>> passed in:
>>
>> WARNING: CPU: 105 PID: 2138 at mm/page_alloc.c:4608 __alloc_pages_nodemask+0x1c31/0x1d50
>> Modules linked in: loop dax_pmem dax_pmem_core
>> ip_tables x_tables xfs virtio_net net_failover virtio_blk failover
>> ata_generic virtio_pci virtio_ring virtio libata
>> CPU: 105 PID: 2138 Comm: oom01 Not tainted 5.2.0-next-20190710+ #7
>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
>> RIP: 0010:__alloc_pages_nodemask+0x1c31/0x1d50
>> ...
>> kmemleak_alloc+0x4e/0xb0
>> kmem_cache_alloc+0x2a7/0x3e0
>> ? __kmalloc+0x1d6/0x470
>> ? ___might_sleep+0x9c/0x170
>> ? mempool_alloc+0x2b0/0x2b0
>> mempool_alloc_slab+0x2d/0x40
>> mempool_alloc+0x118/0x2b0
>> ? __kasan_check_read+0x11/0x20
>> ? mempool_resize+0x390/0x390
>> ? lock_downgrade+0x3c0/0x3c0
>> bio_alloc_bioset+0x19d/0x350
>> ? __swap_duplicate+0x161/0x240
>> ? bvec_alloc+0x1b0/0x1b0
>> ? do_raw_spin_unlock+0xa8/0x140
>> ? _raw_spin_unlock+0x27/0x40
>> get_swap_bio+0x80/0x230
>> ? __x64_sys_madvise+0x50/0x50
>> ? end_swap_bio_read+0x310/0x310
>> ? __kasan_check_read+0x11/0x20
>> ? check_chain_key+0x24e/0x300
>> ? bdev_write_page+0x55/0x130
>> __swap_writepage+0x5ff/0xb20
>>
>> The mempool_alloc_slab() clears __GFP_DIRECT_RECLAIM, kmemleak has
>> __GFP_NOFAIL set all the time due to commit
>> d9570ee3bd1d4f20ce63485f5ef05663866fe6c0 ("kmemleak: allow to coexist
>> with fault injection").
>>
> It only clears __GFP_DIRECT_RECLAIM provisionally to see if the allocation
> would immediately succeed before falling back to the elements in the
> mempool. If that fails, and the mempool is empty, mempool_alloc()
> attempts the allocation with __GFP_DIRECT_RECLAIM. So for the problem
> described here, I think what we really want is this:
>
> diff --git a/mm/mempool.c b/mm/mempool.c
> --- a/mm/mempool.c
> +++ b/mm/mempool.c
> @@ -386,7 +386,7 @@ void *mempool_alloc(mempool_t *pool, gfp_t gfp_mask)
> gfp_mask |= __GFP_NORETRY; /* don't loop in __alloc_pages */
> gfp_mask |= __GFP_NOWARN; /* failures are OK */
>
> - gfp_temp = gfp_mask & ~(__GFP_DIRECT_RECLAIM|__GFP_IO);
> + gfp_temp = gfp_mask & ~(__GFP_DIRECT_RECLAIM|__GFP_IO|__GFP_NOFAIL);
>
> repeat_alloc:
>
> But bio_alloc_bioset() plays with gfp_mask itself: are we sure that it
> isn't the one clearing __GFP_DIRECT_RECLAIM itself before falling back to
> saved_gfp?
>
> In other words do we also want this?
>
> diff --git a/block/bio.c b/block/bio.c
> --- a/block/bio.c
> +++ b/block/bio.c
> @@ -462,16 +462,16 @@ struct bio *bio_alloc_bioset(gfp_t gfp_mask, unsigned int nr_iovecs,
> * We solve this, and guarantee forward progress, with a rescuer
> * workqueue per bio_set. If we go to allocate and there are
> * bios on current->bio_list, we first try the allocation
> - * without __GFP_DIRECT_RECLAIM; if that fails, we punt those
> - * bios we would be blocking to the rescuer workqueue before
> - * we retry with the original gfp_flags.
> + * without __GFP_DIRECT_RECLAIM or __GFP_NOFAIL; if that fails,
> + * we punt those bios we would be blocking to the rescuer
> + * workqueue before we retry with the original gfp_flags.
> */
> -
> if (current->bio_list &&
> (!bio_list_empty(&current->bio_list[0]) ||
> !bio_list_empty(&current->bio_list[1])) &&
> bs->rescue_workqueue)
> - gfp_mask &= ~__GFP_DIRECT_RECLAIM;
> + gfp_mask &= ~(__GFP_DIRECT_RECLAIM |
> + __GFP_NOFAIL);
>
> p = mempool_alloc(&bs->bio_pool, gfp_mask);
> if (!p && gfp_mask != saved_gfp) {
I don't think removing __GFP_NOFAIL outside kmemleak will make any
difference. The problem is that commit
d9570ee3bd1d4f20ce63485f5ef05663866fe6c0 ("kmemleak: allow to coexist
with fault injection") always sets __GFP_NOFAIL for kmemleak in
order to turn off fault injection for kmemleak.
As long as kmemleak is called in a ~__GFP_DIRECT_RECLAIM path, the warning
might be hit.
And since kmemleak is just a debugging tool, IMHO this is not worth
fixing, so I came up with the patch to document it.
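To illustrate the point that stripping the flag in callers does not help, here is a simplified userspace model. It assumes, as a simplification of the commit referenced above, that kmemleak keeps the caller's reclaim-related bits and unconditionally ORs in NOFAIL for its own metadata allocation. The DEMO_* bits and both helper functions are hypothetical names.

```c
#include <stdbool.h>

/* Hypothetical flag bits for illustration; not the kernel's values. */
typedef unsigned int demo_gfp_t;
#define DEMO_GFP_DIRECT_RECLAIM 0x1u
#define DEMO_GFP_IO             0x2u
#define DEMO_GFP_NOFAIL         0x4u

/* Simplified stand-in for the mask kmemleak derives from its caller. */
static demo_gfp_t demo_kmemleak_mask(demo_gfp_t caller_gfp)
{
	return (caller_gfp & (DEMO_GFP_DIRECT_RECLAIM | DEMO_GFP_IO)) |
	       DEMO_GFP_NOFAIL;
}

/* The combination the page allocator warns about. */
static bool demo_would_warn(demo_gfp_t gfp_mask)
{
	return (gfp_mask & DEMO_GFP_NOFAIL) &&
	       !(gfp_mask & DEMO_GFP_DIRECT_RECLAIM);
}
```

In this model, a caller mask with every bit cleared still yields a metadata mask that warns, because the NOFAIL bit is added back inside kmemleak itself.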
On Sun, Jul 14, 2019 at 08:47:07PM -0700, Yang Shi wrote:
>
>
> On 7/13/19 2:25 PM, Matthew Wilcox wrote:
> > On Sat, Jul 13, 2019 at 04:49:04AM +0800, Yang Shi wrote:
> > > When running ltp's oom test with kmemleak enabled, the below warning was
> > > triggerred since kernel detects __GFP_NOFAIL & ~__GFP_DIRECT_RECLAIM is
> > > passed in:
> > There are lots of places where kmemleak will call kmalloc with
> > __GFP_NOFAIL and ~__GFP_DIRECT_RECLAIM (including the XArray code, which
> > is how I know about it). It needs to be fixed to allow its internal
> > allocations to fail and return failure of the original allocation as
> > a consequence.
>
> Do you mean kmemleak internal allocation? It would fail even though
> __GFP_NOFAIL is passed in if GFP_NOWAIT is specified. Currently buddy
> allocator will not retry if the allocation is non-blockable.
Actually it sets off a warning. Which is the right response from the
core mm code because specifying __GFP_NOFAIL and GFP_NOWAIT makes no
sense.
On Sat 13-07-19 04:49:04, Yang Shi wrote:
> When running ltp's oom test with kmemleak enabled, the below warning was
> triggerred since kernel detects __GFP_NOFAIL & ~__GFP_DIRECT_RECLAIM is
> passed in:
kmemleak is broken and this is a long-term issue. I thought that
Catalin had something to address this.
While this patch only adds a comment and discourages future changes of
the warning, which is fine and probably something that we should do,
kmemleak really should be fixed sooner rather than later.
--
Michal Hocko
SUSE Labs
On Sat 13-07-19 12:39:16, David Rientjes wrote:
> On Sat, 13 Jul 2019, Yang Shi wrote:
>
> > When running ltp's oom test with kmemleak enabled, the below warning was
> > triggerred since kernel detects __GFP_NOFAIL & ~__GFP_DIRECT_RECLAIM is
> > passed in:
> >
> > WARNING: CPU: 105 PID: 2138 at mm/page_alloc.c:4608 __alloc_pages_nodemask+0x1c31/0x1d50
> > Modules linked in: loop dax_pmem dax_pmem_core
> > ip_tables x_tables xfs virtio_net net_failover virtio_blk failover
> > ata_generic virtio_pci virtio_ring virtio libata
> > CPU: 105 PID: 2138 Comm: oom01 Not tainted 5.2.0-next-20190710+ #7
> > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.10.2-0-g5f4c7b1-prebuilt.qemu-project.org 04/01/2014
> > RIP: 0010:__alloc_pages_nodemask+0x1c31/0x1d50
> > ...
> > kmemleak_alloc+0x4e/0xb0
> > kmem_cache_alloc+0x2a7/0x3e0
> > ? __kmalloc+0x1d6/0x470
> > ? ___might_sleep+0x9c/0x170
> > ? mempool_alloc+0x2b0/0x2b0
> > mempool_alloc_slab+0x2d/0x40
> > mempool_alloc+0x118/0x2b0
> > ? __kasan_check_read+0x11/0x20
> > ? mempool_resize+0x390/0x390
> > ? lock_downgrade+0x3c0/0x3c0
> > bio_alloc_bioset+0x19d/0x350
> > ? __swap_duplicate+0x161/0x240
> > ? bvec_alloc+0x1b0/0x1b0
> > ? do_raw_spin_unlock+0xa8/0x140
> > ? _raw_spin_unlock+0x27/0x40
> > get_swap_bio+0x80/0x230
> > ? __x64_sys_madvise+0x50/0x50
> > ? end_swap_bio_read+0x310/0x310
> > ? __kasan_check_read+0x11/0x20
> > ? check_chain_key+0x24e/0x300
> > ? bdev_write_page+0x55/0x130
> > __swap_writepage+0x5ff/0xb20
> >
> > The mempool_alloc_slab() clears __GFP_DIRECT_RECLAIM, kmemleak has
> > __GFP_NOFAIL set all the time due to commit
> > d9570ee3bd1d4f20ce63485f5ef05663866fe6c0 ("kmemleak: allow to coexist
> > with fault injection").
> >
>
> It only clears __GFP_DIRECT_RECLAIM provisionally to see if the allocation
> would immediately succeed before falling back to the elements in the
> mempool. If that fails, and the mempool is empty, mempool_alloc()
> attempts the allocation with __GFP_DIRECT_RECLAIM. So for the problem
> described here, I think what we really want is this:
>
> diff --git a/mm/mempool.c b/mm/mempool.c
> --- a/mm/mempool.c
> +++ b/mm/mempool.c
> @@ -386,7 +386,7 @@ void *mempool_alloc(mempool_t *pool, gfp_t gfp_mask)
> gfp_mask |= __GFP_NORETRY; /* don't loop in __alloc_pages */
> gfp_mask |= __GFP_NOWARN; /* failures are OK */
>
> - gfp_temp = gfp_mask & ~(__GFP_DIRECT_RECLAIM|__GFP_IO);
> + gfp_temp = gfp_mask & ~(__GFP_DIRECT_RECLAIM|__GFP_IO|__GFP_NOFAIL);
>
> repeat_alloc:
No, I do not think we should make the mempool allocator more complex for
something that is an implementation problem in kmemleak.
--
Michal Hocko
SUSE Labs
On 15 Jul 2019, at 08:17, Michal Hocko <[email protected]> wrote:
> On Sat 13-07-19 04:49:04, Yang Shi wrote:
>> When running ltp's oom test with kmemleak enabled, the below warning was
>> triggerred since kernel detects __GFP_NOFAIL & ~__GFP_DIRECT_RECLAIM is
>> passed in:
>
> kmemleak is broken and this is a long term issue. I thought that
> Catalin had something to address this.
What needs to be done in the short term is to revert commit d9570ee3bd1d4f20ce63485f5ef05663866fe6c0. Longer term, the solution is to embed kmemleak metadata into the slab so that we don’t have the situation where the primary slab allocation succeeds but the kmemleak metadata allocation fails.
I’m on holiday for one more week with just a phone to reply from, but feel free to revert the above commit. I’ll follow up with a better solution.
Catalin
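The idea of embedding the metadata can be sketched in plain userspace C. This is not the kmemleak or slab implementation: struct demo_meta and the demo_* functions are hypothetical, and malloc() stands in for the slab allocator. It shows only the property described above, namely that one allocation either provides both the object and its tracking record or provides neither.

```c
#include <stdalign.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical tracking record; real kmemleak metadata is richer. */
struct demo_meta {
	/* Aligning the header keeps the object after it suitably aligned. */
	alignas(max_align_t) size_t size;
	unsigned long flags;
};

/* One allocation holds the header followed by the caller's object. */
static void *demo_tracked_alloc(size_t size)
{
	struct demo_meta *m = malloc(sizeof(*m) + size);

	if (!m)
		return NULL;	/* object and metadata fail together */
	m->size = size;
	m->flags = 0;
	return m + 1;
}

/* Recover the header from an object pointer. */
static struct demo_meta *demo_meta_of(void *obj)
{
	return (struct demo_meta *)obj - 1;
}

static void demo_tracked_free(void *obj)
{
	if (obj)
		free(demo_meta_of(obj));
}
```

The cost of this layout is extra space in every tracked object, in exchange for removing the separate metadata allocation that can fail on its own.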
On Mon, 2019-07-15 at 10:01 -0500, Catalin Marinas wrote:
> On 15 Jul 2019, at 08:17, Michal Hocko <[email protected]> wrote:
> > On Sat 13-07-19 04:49:04, Yang Shi wrote:
> > > When running ltp's oom test with kmemleak enabled, the below warning was
> > > triggerred since kernel detects __GFP_NOFAIL & ~__GFP_DIRECT_RECLAIM is
> > > passed in:
> >
> > kmemleak is broken and this is a long term issue. I thought that
> > Catalin had something to address this.
>
> What needs to be done in the short term is revert commit
> d9570ee3bd1d4f20ce63485f5ef05663866fe6c0. Longer term the solution is to embed
> kmemleak metadata into the slab so that we don’t have the situation where the
> primary slab allocation success but the kmemleak metadata fails.
>
> I’m on holiday for one more week with just a phone to reply from but feel free
> to revert the above commit. I’ll follow up with a better solution.
Well, reverting will only make the situation worse for kmemleak under
memory pressure. In the meantime, if someone wants to push for the mempool
solution with tunable pool sizes along with the revert, that could be an
improvement.
https://lore.kernel.org/linux-mm/[email protected]/
On 7/15/19 8:18 AM, Qian Cai wrote:
> On Mon, 2019-07-15 at 10:01 -0500, Catalin Marinas wrote:
>> On 15 Jul 2019, at 08:17, Michal Hocko <[email protected]> wrote:
>>> On Sat 13-07-19 04:49:04, Yang Shi wrote:
>>>> When running ltp's oom test with kmemleak enabled, the below warning was
>>>> triggerred since kernel detects __GFP_NOFAIL & ~__GFP_DIRECT_RECLAIM is
>>>> passed in:
>>> kmemleak is broken and this is a long term issue. I thought that
>>> Catalin had something to address this.
>> What needs to be done in the short term is revert commit
>> d9570ee3bd1d4f20ce63485f5ef05663866fe6c0. Longer term the solution is to embed
>> kmemleak metadata into the slab so that we don’t have the situation where the
>> primary slab allocation success but the kmemleak metadata fails.
>>
>> I’m on holiday for one more week with just a phone to reply from but feel free
>> to revert the above commit. I’ll follow up with a better solution.
> Well, the reverting will only make the situation worst for the kmemleak under
> memory pressure. In the meantime, if someone wants to push for the mempool
I think this is expected when reverting that commit, since the kmemleak
metadata allocation could fail. But it could also fail even without the
revert if the context is non-blockable.
> solution with tunable pool sizes along with the reverting, that could be an
> improvement.
>
> https://lore.kernel.org/linux-mm/[email protected]/
On 7/15/19 6:06 AM, Matthew Wilcox wrote:
> On Sun, Jul 14, 2019 at 08:47:07PM -0700, Yang Shi wrote:
>>
>> On 7/13/19 2:25 PM, Matthew Wilcox wrote:
>>> On Sat, Jul 13, 2019 at 04:49:04AM +0800, Yang Shi wrote:
>>>> When running ltp's oom test with kmemleak enabled, the below warning was
>>>> triggerred since kernel detects __GFP_NOFAIL & ~__GFP_DIRECT_RECLAIM is
>>>> passed in:
>>> There are lots of places where kmemleak will call kmalloc with
>>> __GFP_NOFAIL and ~__GFP_DIRECT_RECLAIM (including the XArray code, which
>>> is how I know about it). It needs to be fixed to allow its internal
>>> allocations to fail and return failure of the original allocation as
>>> a consequence.
>> Do you mean kmemleak internal allocation? It would fail even though
>> __GFP_NOFAIL is passed in if GFP_NOWAIT is specified. Currently buddy
>> allocator will not retry if the allocation is non-blockable.
> Actually it sets off a warning. Which is the right response from the
> core mm code because specifying __GFP_NOFAIL and __GFP_NOWAIT makes no
> sense.
Yes, this is what I meant. Kmemleak plays a trick to fool fault injection
by passing in __GFP_NOFAIL, but that doesn't make sense for non-blockable
allocations.
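The trick mentioned here relies on fault injection declining to fail any request marked NOFAIL. A minimal userspace model of that rule is below. The DEMO_* bit and demo_should_fail_alloc() are hypothetical names, and the injector_armed parameter stands in for whatever probability and filter settings would otherwise select the request for failure.

```c
#include <stdbool.h>

/* Hypothetical flag bit for illustration; not the kernel's value. */
typedef unsigned int demo_gfp_t;
#define DEMO_GFP_NOFAIL 0x4u

/*
 * Decide whether an injected failure applies to this request.
 * Requests marked NOFAIL are always exempt.
 */
static bool demo_should_fail_alloc(demo_gfp_t gfp_mask, bool injector_armed)
{
	if (gfp_mask & DEMO_GFP_NOFAIL)
		return false;
	return injector_armed;
}
```

The exemption is why setting the flag shields kmemleak's metadata allocations from injected failures. It says nothing about whether the request can block, which is where the conflict with non-blockable contexts comes from.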
On 7/15/19 8:01 AM, Catalin Marinas wrote:
> On 15 Jul 2019, at 08:17, Michal Hocko <[email protected]> wrote:
>> On Sat 13-07-19 04:49:04, Yang Shi wrote:
>>> When running ltp's oom test with kmemleak enabled, the below warning was
>>> triggerred since kernel detects __GFP_NOFAIL & ~__GFP_DIRECT_RECLAIM is
>>> passed in:
>> kmemleak is broken and this is a long term issue. I thought that
>> Catalin had something to address this.
> What needs to be done in the short term is revert commit d9570ee3bd1d4f20ce63485f5ef05663866fe6c0. Longer term the solution is to embed kmemleak metadata into the slab so that we don’t have the situation where the primary slab allocation success but the kmemleak metadata fails.
>
> I’m on holiday for one more week with just a phone to reply from but feel free to revert the above commit. I’ll follow up with a better solution.
Thanks, I'm going to submit a new patch to revert that commit.
Yang
>
> Catalin
On Sat, 13 Jul 2019 04:49:04 +0800 Yang Shi <[email protected]> wrote:
> When running ltp's oom test with kmemleak enabled, the below warning was
> triggerred since kernel detects __GFP_NOFAIL & ~__GFP_DIRECT_RECLAIM is
> passed in:
>
> ...
>
> The mempool_alloc_slab() clears __GFP_DIRECT_RECLAIM, kmemleak has
> __GFP_NOFAIL set all the time due to commit
> d9570ee3bd1d4f20ce63485f5ef05663866fe6c0 ("kmemleak: allow to coexist
> with fault injection").
>
> The fault-injection would not try to fail slab or page allocation if
> __GFP_NOFAIL is used and that commit tries to turn off fault injection
> for kmemleak allocation. Although __GFP_NOFAIL doesn't guarantee no
> failure for all the cases (i.e. non-blockable allocation may fail), it
> still makes sense to the most cases. Kmemleak is also a debugging tool,
> so it sounds not worth changing the behavior.
>
> It also meaks sense to keep the warning, so just document the special
> case in the comment.
>
> ...
>
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -4531,8 +4531,14 @@ bool gfp_pfmemalloc_allowed(gfp_t gfp_mask)
> */
> if (gfp_mask & __GFP_NOFAIL) {
> /*
> - * All existing users of the __GFP_NOFAIL are blockable, so warn
> - * of any new users that actually require GFP_NOWAIT
> + * The users of the __GFP_NOFAIL are expected be blockable,
> + * and this is true for the most cases except for kmemleak.
> + * The kmemleak pass in __GFP_NOFAIL to skip fault injection,
> + * however kmemleak may allocate object at some non-blockable
> + * context to trigger this warning.
> + *
> + * Keep this warning since it is still useful for the most
> + * normal cases.
> */
Comment has rather a lot of typos. I'd normally fix them but I think
I'll duck this patch until the kmemleak situation is addressed, so we
can add a kmemleakless long-term comment, if desired.
On 7/24/19 7:48 PM, Andrew Morton wrote:
> On Sat, 13 Jul 2019 04:49:04 +0800 Yang Shi <[email protected]> wrote:
>
>> When running ltp's oom test with kmemleak enabled, the below warning was
>> triggerred since kernel detects __GFP_NOFAIL & ~__GFP_DIRECT_RECLAIM is
>> passed in:
>>
>> ...
>>
>> The mempool_alloc_slab() clears __GFP_DIRECT_RECLAIM, kmemleak has
>> __GFP_NOFAIL set all the time due to commit
>> d9570ee3bd1d4f20ce63485f5ef05663866fe6c0 ("kmemleak: allow to coexist
>> with fault injection").
>>
>> The fault-injection would not try to fail slab or page allocation if
>> __GFP_NOFAIL is used and that commit tries to turn off fault injection
>> for kmemleak allocation. Although __GFP_NOFAIL doesn't guarantee no
>> failure for all the cases (i.e. non-blockable allocation may fail), it
>> still makes sense to the most cases. Kmemleak is also a debugging tool,
>> so it sounds not worth changing the behavior.
>>
>> It also meaks sense to keep the warning, so just document the special
>> case in the comment.
>>
>> ...
>>
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -4531,8 +4531,14 @@ bool gfp_pfmemalloc_allowed(gfp_t gfp_mask)
>> */
>> if (gfp_mask & __GFP_NOFAIL) {
>> /*
>> - * All existing users of the __GFP_NOFAIL are blockable, so warn
>> - * of any new users that actually require GFP_NOWAIT
>> + * The users of the __GFP_NOFAIL are expected be blockable,
>> + * and this is true for the most cases except for kmemleak.
>> + * The kmemleak pass in __GFP_NOFAIL to skip fault injection,
>> + * however kmemleak may allocate object at some non-blockable
>> + * context to trigger this warning.
>> + *
>> + * Keep this warning since it is still useful for the most
>> + * normal cases.
>> */
> Comment has rather a lot of typos. I'd normally fix them but I think
> I'll duck this patch until the kmemleak situation is addressed, so we
> can add a kmemleakless long-term comment, if desired.
Actually, this has been replaced by a revert of the problematic commit,
and that patch is already in the -mm tree. Please see:
revert-kmemleak-allow-to-coexist-with-fault-injection.patch
I think we would like to have this merged in 5.3-rc1 or rc2?