2022-11-18 01:14:44

by Zqiang

Subject: [PATCH v3] mm: Make vmalloc_dump_obj() call in clean context

Currently, mem_dump_obj() is invoked from call_rcu(), and call_rcu()
may be invoked from a non-preemptible code section. For an object
allocated by vmalloc(), the following scenario may occur:

        CPU 0
tasks context
  spin_lock(&vmap_area_lock)
      Interrupt context
        call_rcu()
          mem_dump_obj
            vmalloc_dump_obj
              spin_lock(&vmap_area_lock) <--deadlock

In addition, on a PREEMPT_RT kernel spinlocks are converted to sleepable
locks, so vmap_area_lock must not be acquired in a non-preemptible code
section. Therefore, this commit makes vmalloc_dump_obj() do its work only
when it is called from a clean context.

Signed-off-by: Zqiang <[email protected]>
---
v1->v2:
 Add the IS_ENABLED(CONFIG_PREEMPT_RT) check.
v2->v3:
 Change the commit message and add some comments.

 mm/util.c    |  4 +++-
 mm/vmalloc.c | 25 +++++++++++++++++++++++++
 2 files changed, 28 insertions(+), 1 deletion(-)

diff --git a/mm/util.c b/mm/util.c
index 12984e76767e..2b0222a728cc 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -1128,7 +1128,9 @@ void mem_dump_obj(void *object)
 		return;
 
 	if (virt_addr_valid(object))
-		type = "non-slab/vmalloc memory";
+		type = "non-slab memory";
+	else if (is_vmalloc_addr(object))
+		type = "vmalloc memory";
 	else if (object == NULL)
 		type = "NULL pointer";
 	else if (object == ZERO_SIZE_PTR)
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index ccaa461998f3..4351eafbe7ab 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -4034,6 +4034,31 @@ bool vmalloc_dump_obj(void *object)
 	struct vm_struct *vm;
 	void *objp = (void *)PAGE_ALIGN((unsigned long)object);
 
+	/* for non-vmalloc addr, return directly */
+	if (!is_vmalloc_addr(objp))
+		return false;
+
+	/**
+	 * for non-Preempt-RT kernel, return directly. otherwise not
+	 * only needs to determine whether it is in the interrupt context
+	 * (in_interrupt())to avoid deadlock, but also to avoid acquire
+	 * vmap_area_lock spinlock in disables interrupts or preempts
+	 * critical sections, because the vmap_area_lock spinlock convert
+	 * to sleepable lock
+	 */
+	if (IS_ENABLED(CONFIG_PREEMPT_RT) && !preemptible())
+		return false;
+
+	/**
+	 * get here, for Preempt-RT kernel, it means that we are in
+	 * preemptible context(preemptible() is true), it also means
+	 * that the in_interrupt() will return false.
+	 * for non-Preempt-RT kernel, only needs to determine whether
+	 * it is in the interrupt context(in_interrupt()) to avoid deadlock
+	 */
+	if (in_interrupt())
+		return false;
+
 	vm = find_vm_area(objp);
 	if (!vm)
 		return false;
--
2.25.1
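
The decision encoded by the two checks added to vmalloc_dump_obj() can be summarised with a small userspace sketch. This is not kernel code: the function name may_dump_vmalloc_details() and its three boolean parameters are hypothetical stand-ins for IS_ENABLED(CONFIG_PREEMPT_RT), preemptible() and in_interrupt(), chosen only so the decision table can be checked outside the kernel.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Userspace model of the guard this patch adds to vmalloc_dump_obj().
 * The parameters are illustrative stand-ins (assumptions, not kernel API):
 *   preempt_rt       ~ IS_ENABLED(CONFIG_PREEMPT_RT)
 *   preemptible_ctx  ~ preemptible()
 *   in_interrupt_ctx ~ in_interrupt()
 * Returns true when the function would go on to take vmap_area_lock.
 */
static bool may_dump_vmalloc_details(bool preempt_rt, bool preemptible_ctx,
				     bool in_interrupt_ctx)
{
	/* A preemptible context is never an interrupt context. */
	assert(!(preemptible_ctx && in_interrupt_ctx));

	/* On PREEMPT_RT the lock may sleep, so require a preemptible context. */
	if (preempt_rt && !preemptible_ctx)
		return false;

	/* On any kernel, never take the lock from interrupt context. */
	if (in_interrupt_ctx)
		return false;

	return true;
}
```

In this model, a task-context caller on a non-RT kernel passes both checks even with preemption disabled; only interrupt context is rejected there, while PREEMPT_RT additionally rejects every non-preemptible caller.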



2022-11-22 23:43:27

by Zqiang

Subject: RE: [PATCH v3] mm: Make vmalloc_dump_obj() call in clean context


Gentle ping.

Thanks
Zqiang

>Currently, the mem_dump_obj() is invoked in call_rcu(), the
>call_rcu() is maybe invoked in non-preemptive code segment,
>for object allocated from vmalloc(), the following scenarios
>may occur:
>
> CPU 0
>tasks context
> spin_lock(&vmap_area_lock)
> Interrupt context
> call_rcu()
> mem_dump_obj
> vmalloc_dump_obj
> spin_lock(&vmap_area_lock) <--deadlock
>
>and for PREEMPT-RT kernel, the spinlock will convert to sleepable
>lock, so the vmap_area_lock spinlock not allowed to get in non-preemptive
>code segment. therefore, this commit make the vmalloc_dump_obj() call in
>a clean context.
>
>Signed-off-by: Zqiang <[email protected]>
>---
>v1->v2:
> add IS_ENABLED(CONFIG_PREEMPT_RT) check.
> v2->v3:
> change commit message and add some comment.
>
> mm/util.c | 4 +++-
> mm/vmalloc.c | 25 +++++++++++++++++++++++++
> 2 files changed, 28 insertions(+), 1 deletion(-)
>
>diff --git a/mm/util.c b/mm/util.c
>index 12984e76767e..2b0222a728cc 100644
>--- a/mm/util.c
>+++ b/mm/util.c
>@@ -1128,7 +1128,9 @@ void mem_dump_obj(void *object)
> return;
>
> if (virt_addr_valid(object))
>- type = "non-slab/vmalloc memory";
>+ type = "non-slab memory";
>+ else if (is_vmalloc_addr(object))
>+ type = "vmalloc memory";
> else if (object == NULL)
> type = "NULL pointer";
> else if (object == ZERO_SIZE_PTR)
>diff --git a/mm/vmalloc.c b/mm/vmalloc.c
>index ccaa461998f3..4351eafbe7ab 100644
>--- a/mm/vmalloc.c
>+++ b/mm/vmalloc.c
>@@ -4034,6 +4034,31 @@ bool vmalloc_dump_obj(void *object)
> struct vm_struct *vm;
> void *objp = (void *)PAGE_ALIGN((unsigned long)object);
>
>+ /* for non-vmalloc addr, return directly */
>+ if (!is_vmalloc_addr(objp))
>+ return false;
>+
>+ /**
>+ * for non-Preempt-RT kernel, return directly. otherwise not
>+ * only needs to determine whether it is in the interrupt context
>+ * (in_interrupt())to avoid deadlock, but also to avoid acquire
>+ * vmap_area_lock spinlock in disables interrupts or preempts
>+ * critical sections, because the vmap_area_lock spinlock convert
>+ * to sleepable lock
>+ */
>+ if (IS_ENABLED(CONFIG_PREEMPT_RT) && !preemptible())
>+ return false;
>+
>+ /**
>+ * get here, for Preempt-RT kernel, it means that we are in
>+ * preemptible context(preemptible() is true), it also means
>+ * that the in_interrupt() will return false.
>+ * for non-Preempt-RT kernel, only needs to determine whether
>+ * it is in the interrupt context(in_interrupt()) to avoid deadlock
>+ */
>+ if (in_interrupt())
>+ return false;
>+
> vm = find_vm_area(objp);
> if (!vm)
> return false;
>--
>2.25.1

2022-11-28 08:22:41

by Zhen Lei

Subject: Re: [PATCH v3] mm: Make vmalloc_dump_obj() call in clean context



On 2022/11/23 7:05, Zhang, Qiang1 wrote:
>
> Gently ping ????
>
> Thanks
> Zqiang
>
>> Currently, the mem_dump_obj() is invoked in call_rcu(), the
>> call_rcu() is maybe invoked in non-preemptive code segment,
>> for object allocated from vmalloc(), the following scenarios
>> may occur:
>>
>> CPU 0
>> tasks context
>> spin_lock(&vmap_area_lock)
>> Interrupt context
>> call_rcu()
>> mem_dump_obj
>> vmalloc_dump_obj
>> spin_lock(&vmap_area_lock) <--deadlock
>>
>> and for PREEMPT-RT kernel, the spinlock will convert to sleepable
>> lock, so the vmap_area_lock spinlock not allowed to get in non-preemptive
>> code segment. therefore, this commit make the vmalloc_dump_obj() call in
>> a clean context.
>>
>> Signed-off-by: Zqiang <[email protected]>
>> ---
>> v1->v2:
>> add IS_ENABLED(CONFIG_PREEMPT_RT) check.
>> v2->v3:
>> change commit message and add some comment.
>>
>> mm/util.c | 4 +++-
>> mm/vmalloc.c | 25 +++++++++++++++++++++++++
>> 2 files changed, 28 insertions(+), 1 deletion(-)
>>
>> diff --git a/mm/util.c b/mm/util.c
>> index 12984e76767e..2b0222a728cc 100644
>> --- a/mm/util.c
>> +++ b/mm/util.c
>> @@ -1128,7 +1128,9 @@ void mem_dump_obj(void *object)
>> return;
>>
>> if (virt_addr_valid(object))
>> - type = "non-slab/vmalloc memory";
>> + type = "non-slab memory";
>> + else if (is_vmalloc_addr(object))
>> + type = "vmalloc memory";
>> else if (object == NULL)
>> type = "NULL pointer";
>> else if (object == ZERO_SIZE_PTR)
>> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
>> index ccaa461998f3..4351eafbe7ab 100644
>> --- a/mm/vmalloc.c
>> +++ b/mm/vmalloc.c
>> @@ -4034,6 +4034,31 @@ bool vmalloc_dump_obj(void *object)
>> struct vm_struct *vm;
>> void *objp = (void *)PAGE_ALIGN((unsigned long)object);
>>
>> + /* for non-vmalloc addr, return directly */
>> + if (!is_vmalloc_addr(objp))
>> + return false;
>> +
>> + /**
>> + * for non-Preempt-RT kernel, return directly. otherwise not
>> + * only needs to determine whether it is in the interrupt context
>> + * (in_interrupt())to avoid deadlock, but also to avoid acquire
>> + * vmap_area_lock spinlock in disables interrupts or preempts
>> + * critical sections, because the vmap_area_lock spinlock convert
>> + * to sleepable lock
>> + */
>> + if (IS_ENABLED(CONFIG_PREEMPT_RT) && !preemptible())
>> + return false;
>> +
>> + /**
>> + * get here, for Preempt-RT kernel, it means that we are in
>> + * preemptible context(preemptible() is true), it also means
>> + * that the in_interrupt() will return false.
>> + * for non-Preempt-RT kernel, only needs to determine whether
>> + * it is in the interrupt context(in_interrupt()) to avoid deadlock
>> + */
>> + if (in_interrupt())
>> + return false;

We want mem_dump_obj() to work properly in interrupt context, but with
this if statement it can no longer do so.

Here's my test case:
void *tst_p;

void my_irqwork_handler(struct irq_work *work)
{
	void *p = tst_p;

	printk("enter my_irqwork_handler: CPU=%d, locked=%d\n", smp_processor_id(), tst_is_locked());
	mem_dump_obj(p);
	vfree(p);
}

static void test_mem_dump(void)
{
	struct irq_work work = IRQ_WORK_INIT_HARD(my_irqwork_handler);

	tst_p = vmalloc(PAGE_SIZE);
	if (!tst_p) {
		printk("vmalloc failed\n");
		return;
	}
	printk("enter test_mem_dump: CPU=%d\n", smp_processor_id());

	//tst_lock();
	irq_work_queue(&work);
	//tst_unlock();

	printk("leave test_mem_dump: CPU=%d\n", smp_processor_id());
}

Test result:
[ 45.212941] enter test_mem_dump: CPU=0
[ 45.213280] enter my_irqwork_handler: CPU=0, locked=0
[ 45.213546] vmalloc memory
[ 45.213996] leave test_mem_dump: CPU=0

>> +
>> vm = find_vm_area(objp);
>> if (!vm)
>> return false;
>> --
>> 2.25.1
>

--
Regards,
Zhen Lei

2022-11-28 08:40:49

by Zqiang

Subject: RE: [PATCH v3] mm: Make vmalloc_dump_obj() call in clean context

On 2022/11/23 7:05, Zhang, Qiang1 wrote:
>
> Gently ping ????
>
> Thanks
> Zqiang
>
>> Currently, the mem_dump_obj() is invoked in call_rcu(), the
>> call_rcu() is maybe invoked in non-preemptive code segment, for
>> object allocated from vmalloc(), the following scenarios may occur:
>>
>> CPU 0
>> tasks context
>> spin_lock(&vmap_area_lock)
>> Interrupt context
>> call_rcu()
>> mem_dump_obj
>> vmalloc_dump_obj
>> spin_lock(&vmap_area_lock) <--deadlock
>>
>> and for PREEMPT-RT kernel, the spinlock will convert to sleepable
>> lock, so the vmap_area_lock spinlock not allowed to get in
>> non-preemptive code segment. therefore, this commit make the
>> vmalloc_dump_obj() call in a clean context.
>>
>> Signed-off-by: Zqiang <[email protected]>
>> ---
>> v1->v2:
>> add IS_ENABLED(CONFIG_PREEMPT_RT) check.
>> v2->v3:
>> change commit message and add some comment.
>>
>> mm/util.c | 4 +++-
>> mm/vmalloc.c | 25 +++++++++++++++++++++++++
>> 2 files changed, 28 insertions(+), 1 deletion(-)
>>
>> diff --git a/mm/util.c b/mm/util.c
>> index 12984e76767e..2b0222a728cc 100644
>> --- a/mm/util.c
>> +++ b/mm/util.c
>> @@ -1128,7 +1128,9 @@ void mem_dump_obj(void *object)
>> return;
>>
>> if (virt_addr_valid(object))
>> - type = "non-slab/vmalloc memory";
>> + type = "non-slab memory";
>> + else if (is_vmalloc_addr(object))
>> + type = "vmalloc memory";
>> else if (object == NULL)
>> type = "NULL pointer";
>> else if (object == ZERO_SIZE_PTR)
>> diff --git a/mm/vmalloc.c b/mm/vmalloc.c index
>> ccaa461998f3..4351eafbe7ab 100644
>> --- a/mm/vmalloc.c
>> +++ b/mm/vmalloc.c
>> @@ -4034,6 +4034,31 @@ bool vmalloc_dump_obj(void *object)
>> struct vm_struct *vm;
>> void *objp = (void *)PAGE_ALIGN((unsigned long)object);
>>
>> + /* for non-vmalloc addr, return directly */
>> + if (!is_vmalloc_addr(objp))
>> + return false;
>> +
>> + /**
>> + * for non-Preempt-RT kernel, return directly. otherwise not
>> + * only needs to determine whether it is in the interrupt context
>> + * (in_interrupt())to avoid deadlock, but also to avoid acquire
>> + * vmap_area_lock spinlock in disables interrupts or preempts
>> + * critical sections, because the vmap_area_lock spinlock convert
>> + * to sleepable lock
>> + */
>> + if (IS_ENABLED(CONFIG_PREEMPT_RT) && !preemptible())
>> + return false;
>> +
>> + /**
>> + * get here, for Preempt-RT kernel, it means that we are in
>> + * preemptible context(preemptible() is true), it also means
>> + * that the in_interrupt() will return false.
>> + * for non-Preempt-RT kernel, only needs to determine whether
>> + * it is in the interrupt context(in_interrupt()) to avoid deadlock
>> + */
>> + if (in_interrupt())
>> + return false;
>
>
>We want mem_dump_obj() to work properly in the interrupt context. But with this if statement, it's impossible to work properly.

This check is there to avoid the following scenario. Because call_rcu() can be invoked in
hard-irq or softirq context, mem_dump_obj() does not dump the detailed vmalloc info there.

        CPU 0
tasks context
  spin_lock(&vmap_area_lock)
      Interrupt or softirq context
        call_rcu()
          mem_dump_obj
            vmalloc_dump_obj
              spin_lock(&vmap_area_lock) <--deadlock

Because mem_dump_obj() is only used by RCU, I'm not sure whether this modification is
appropriate; I would like to hear from Paul.
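
The same-CPU scenario described here can be reproduced in miniature in userspace. The sketch below is only a model under stated assumptions: toy_vmap_area_lock is a hypothetical C11 atomic_flag standing in for vmap_area_lock, the "interrupt" is an ordinary function call made while the lock is held, and a trylock is used so the program reports the failure instead of spinning forever as a real spin_lock() would.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Toy non-recursive spinlock standing in for vmap_area_lock (hypothetical). */
static atomic_flag toy_vmap_area_lock = ATOMIC_FLAG_INIT;

static bool toy_trylock(void)
{
	/* test_and_set returns the previous value: false means we got the lock. */
	return !atomic_flag_test_and_set(&toy_vmap_area_lock);
}

static void toy_unlock(void)
{
	atomic_flag_clear(&toy_vmap_area_lock);
}

/*
 * Stand-in for the interrupt path call_rcu() -> mem_dump_obj() ->
 * vmalloc_dump_obj(). Returns true if the lock could be taken; a real
 * spin_lock() would spin here instead of returning false.
 */
static bool toy_interrupt_handler(void)
{
	if (!toy_trylock())
		return false;
	toy_unlock();
	return true;
}

/* Task context holds the lock when the "interrupt" arrives on the same CPU. */
static bool interrupt_while_task_holds_lock(void)
{
	bool handler_got_lock;

	(void)toy_trylock();				/* task context: lock */
	handler_got_lock = toy_interrupt_handler();	/* interrupt fires */
	toy_unlock();					/* task context: unlock */
	return handler_got_lock;
}
```

Here interrupt_while_task_holds_lock() reports that the handler could not take the lock. With a blocking lock, the task cannot run again to release it until the handler returns, which is why the handler must not wait for it.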

Thanks
Zqiang


>
>Here's my test case:
>void *tst_p;
>
>void my_irqwork_handler(struct irq_work *work) {
> void *p = tst_p;
>
> printk("enter my_irqwork_handler: CPU=%d, locked=%d\n", smp_processor_id(), tst_is_locked());
> mem_dump_obj(p);
> vfree(p);
>}
>
>static void test_mem_dump(void)
>{
> struct irq_work work = IRQ_WORK_INIT_HARD(my_irqwork_handler);
>
> tst_p = vmalloc(PAGE_SIZE);
> if (!tst_p) {
> printk("vmalloc failed\n");
> return;
> }
> printk("enter test_mem_dump: CPU=%d\n", smp_processor_id());
>
> //tst_lock();
> irq_work_queue(&work);
> //tst_unlock();
>
> printk("leave test_mem_dump: CPU=%d\n", smp_processor_id()); }
>
>Test result:
>[ 45.212941] enter test_mem_dump: CPU=0
>[ 45.213280] enter my_irqwork_handler: CPU=0, locked=0
>[ 45.213546] vmalloc memory
>[ 45.213996] leave test_mem_dump: CPU=0
>
>> +
>> vm = find_vm_area(objp);
>> if (!vm)
>> return false;
>> --
>> 2.25.1
>
>
>--
>Regards,
> Zhen Lei

2022-11-28 09:27:19

by Zhen Lei

Subject: Re: [PATCH v3] mm: Make vmalloc_dump_obj() call in clean context



On 2022/11/28 16:33, Zhang, Qiang1 wrote:
> On 2022/11/23 7:05, Zhang, Qiang1 wrote:
>>
>> Gently ping ????
>>
>> Thanks
>> Zqiang
>>
>>> Currently, the mem_dump_obj() is invoked in call_rcu(), the
>>> call_rcu() is maybe invoked in non-preemptive code segment, for
>>> object allocated from vmalloc(), the following scenarios may occur:
>>>
>>> CPU 0
>>> tasks context
>>> spin_lock(&vmap_area_lock)
>>> Interrupt context
>>> call_rcu()
>>> mem_dump_obj
>>> vmalloc_dump_obj
>>> spin_lock(&vmap_area_lock) <--deadlock
>>>
>>> and for PREEMPT-RT kernel, the spinlock will convert to sleepable
>>> lock, so the vmap_area_lock spinlock not allowed to get in
>>> non-preemptive code segment. therefore, this commit make the
>>> vmalloc_dump_obj() call in a clean context.
>>>
>>> Signed-off-by: Zqiang <[email protected]>
>>> ---
>>> v1->v2:
>>> add IS_ENABLED(CONFIG_PREEMPT_RT) check.
>>> v2->v3:
>>> change commit message and add some comment.
>>>
>>> mm/util.c | 4 +++-
>>> mm/vmalloc.c | 25 +++++++++++++++++++++++++
>>> 2 files changed, 28 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/mm/util.c b/mm/util.c
>>> index 12984e76767e..2b0222a728cc 100644
>>> --- a/mm/util.c
>>> +++ b/mm/util.c
>>> @@ -1128,7 +1128,9 @@ void mem_dump_obj(void *object)
>>> return;
>>>
>>> if (virt_addr_valid(object))
>>> - type = "non-slab/vmalloc memory";
>>> + type = "non-slab memory";
>>> + else if (is_vmalloc_addr(object))
>>> + type = "vmalloc memory";
>>> else if (object == NULL)
>>> type = "NULL pointer";
>>> else if (object == ZERO_SIZE_PTR)
>>> diff --git a/mm/vmalloc.c b/mm/vmalloc.c index
>>> ccaa461998f3..4351eafbe7ab 100644
>>> --- a/mm/vmalloc.c
>>> +++ b/mm/vmalloc.c
>>> @@ -4034,6 +4034,31 @@ bool vmalloc_dump_obj(void *object)
>>> struct vm_struct *vm;
>>> void *objp = (void *)PAGE_ALIGN((unsigned long)object);
>>>
>>> + /* for non-vmalloc addr, return directly */
>>> + if (!is_vmalloc_addr(objp))
>>> + return false;
>>> +
>>> + /**
>>> + * for non-Preempt-RT kernel, return directly. otherwise not
>>> + * only needs to determine whether it is in the interrupt context
>>> + * (in_interrupt())to avoid deadlock, but also to avoid acquire
>>> + * vmap_area_lock spinlock in disables interrupts or preempts
>>> + * critical sections, because the vmap_area_lock spinlock convert
>>> + * to sleepable lock
>>> + */
>>> + if (IS_ENABLED(CONFIG_PREEMPT_RT) && !preemptible())
>>> + return false;
>>> +
>>> + /**
>>> + * get here, for Preempt-RT kernel, it means that we are in
>>> + * preemptible context(preemptible() is true), it also means
>>> + * that the in_interrupt() will return false.
>>> + * for non-Preempt-RT kernel, only needs to determine whether
>>> + * it is in the interrupt context(in_interrupt()) to avoid deadlock
>>> + */
>>> + if (in_interrupt())
>>> + return false;
>>
>>
>> We want mem_dump_obj() to work properly in the interrupt context. But with this if statement, it's impossible to work properly.
>
> This is to avoid the following scenarios, because, call_rcu() can be invoked in hard irq or
> softirq context, so mem_dump_obj() not dump some details info.

OK. Sorry, I confused your issue with what I'm working on right now.

https://lkml.org/lkml/2022/11/16/913

What I need is "if (in_interrupt() && spin_is_locked(&vmap_area_lock))", so
that mem_dump_obj() still works in interrupt context, except when the
interrupted task was inside the vmap_area_lock critical section.
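
The difference between the two guards under discussion can be written down as a small userspace truth-table sketch. The function names v3_skips_dump() and suggested_skips_dump() are hypothetical, and the boolean parameters merely stand in for in_interrupt() and spin_is_locked(&vmap_area_lock); this is an illustration of the two conditions on a non-RT kernel, not code from either patch.

```c
#include <stdbool.h>

/* Guard from the v3 patch on a non-RT kernel: skip in any interrupt context. */
static bool v3_skips_dump(bool in_interrupt_ctx, bool lock_is_held)
{
	(void)lock_is_held;	/* the v3 check does not look at the lock state */
	return in_interrupt_ctx;
}

/* Guard suggested in this reply: skip only if the lock is also held. */
static bool suggested_skips_dump(bool in_interrupt_ctx, bool lock_is_held)
{
	return in_interrupt_ctx && lock_is_held;
}
```

The two guards differ only for an interrupt that arrives while the lock is free: the v3 check skips the dump, the suggested check lets it proceed. On SMP the lock may be held by another CPU rather than by the interrupted task, in which case skipping is merely conservative.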


>
> CPU 0
> tasks context
> spin_lock(&vmap_area_lock)
> Interrupt or softirq context
> call_rcu()
> mem_dump_obj
> vmalloc_dump_obj
> spin_lock(&vmap_area_lock) <--deadlock
>
> because mem_dump_obj() only used by RCU, I'm not sure if this modification is appropriate,
> need to hear from Paul.
>
> Thanks
> Zqiang
>
>
>>
>> Here's my test case:
>> void *tst_p;
>>
>> void my_irqwork_handler(struct irq_work *work) {
>> void *p = tst_p;
>>
>> printk("enter my_irqwork_handler: CPU=%d, locked=%d\n", smp_processor_id(), tst_is_locked());
>> mem_dump_obj(p);
>> vfree(p);
>> }
>>
>> static void test_mem_dump(void)
>> {
>> struct irq_work work = IRQ_WORK_INIT_HARD(my_irqwork_handler);
>>
>> tst_p = vmalloc(PAGE_SIZE);
>> if (!tst_p) {
>> printk("vmalloc failed\n");
>> return;
>> }
>> printk("enter test_mem_dump: CPU=%d\n", smp_processor_id());
>>
>> //tst_lock();
>> irq_work_queue(&work);
>> //tst_unlock();
>>
>> printk("leave test_mem_dump: CPU=%d\n", smp_processor_id()); }
>>
>> Test result:
>> [ 45.212941] enter test_mem_dump: CPU=0
>> [ 45.213280] enter my_irqwork_handler: CPU=0, locked=0
>> [ 45.213546] vmalloc memory
>> [ 45.213996] leave test_mem_dump: CPU=0
>>
>>> +
>>> vm = find_vm_area(objp);
>>> if (!vm)
>>> return false;
>>> --
>>> 2.25.1
>>
>>
>> --
>> Regards,
>> Zhen Lei

--
Regards,
Zhen Lei