Subject: [PATCH 1/2] btrfs: drop trace_btrfs_all_work_done() from normal_work_helper()

For btrfs_scrubparity_helper() the ->func() is set to
scrub_parity_bio_endio_worker(). This function invokes
scrub_free_parity(), which kfree()s the `work' object. All is good as
long as trace events are not enabled; once they are enabled we blow up
with a backtrace like this:
| Workqueue: btrfs-endio btrfs_endio_helper
| RIP: 0010:[<ffffffff812f81ae>] [<ffffffff812f81ae>] trace_event_raw_event_btrfs__work__done+0x4e/0xa0
| Call Trace:
| [<ffffffff8136497d>] btrfs_scrubparity_helper+0x59d/0x780
| [<ffffffff81364c49>] btrfs_endio_helper+0x9/0x10
| [<ffffffff8108af8e>] process_one_work+0x26e/0x7b0
| [<ffffffff8108b516>] worker_thread+0x46/0x560
| [<ffffffff81091c4e>] kthread+0xee/0x110
| [<ffffffff818e166a>] ret_from_fork+0x2a/0x40

So in order to avoid this, I remove the trace point.
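
For reference, the ordering boils down to this (a simplified sketch of
normal_work_helper(), not the exact code):

static void normal_work_helper(struct btrfs_work *work)
{
        int need_order = work->ordered_func ? 1 : 0;

        /*
         * For the scrubparity helper this ends up in
         * scrub_parity_bio_endio_worker() -> scrub_free_parity(), which
         * kfree()s `work'.
         */
        work->func(work);

        if (!need_order)
                /* touches the already freed `work' */
                trace_btrfs_all_work_done(work);
}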

Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
---
fs/btrfs/async-thread.c | 2 --
1 file changed, 2 deletions(-)

diff --git a/fs/btrfs/async-thread.c b/fs/btrfs/async-thread.c
index e0f071f6b5a7..d0dfc3d2e199 100644
--- a/fs/btrfs/async-thread.c
+++ b/fs/btrfs/async-thread.c
@@ -318,8 +318,6 @@ static void normal_work_helper(struct btrfs_work *work)
set_bit(WORK_DONE_BIT, &work->flags);
run_ordered_work(wq);
}
- if (!need_order)
- trace_btrfs_all_work_done(work);
}

void btrfs_init_work(struct btrfs_work *work, btrfs_work_func_t uniq_func,
--
2.11.0


Subject: [PATCH 2/2] btrfs: swap free() and trace point in run_ordered_work()

The previous patch removed a trace point due to a use-after-free problem
with tracing enabled. While looking at the backtrace it took me a while
to find the right spot. While doing so I noticed that this trace point
could fire after one of two clean-up functions had been invoked:
- run_one_async_free()
- async_cow_free()

Both of them free the `work' item, so using it in the tracepoint
afterwards is not allowed.
This patch swaps the order so that the trace point comes first and the
struct is freed afterwards.

Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
---
fs/btrfs/async-thread.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/btrfs/async-thread.c b/fs/btrfs/async-thread.c
index d0dfc3d2e199..6f4631bf74f8 100644
--- a/fs/btrfs/async-thread.c
+++ b/fs/btrfs/async-thread.c
@@ -288,8 +288,8 @@ static void run_ordered_work(struct __btrfs_workqueue *wq)
* we don't want to call the ordered free functions
* with the lock held though
*/
- work->ordered_free(work);
trace_btrfs_all_work_done(work);
+ work->ordered_free(work);
}
spin_unlock_irqrestore(lock, flags);
}
--
2.11.0

Subject: [RFC] btrfs: lockdep says "possible recursive locking detected" in btrfs_clear_lock_blocking_rw()

With lockdep enabled I managed to trigger the following lockdep splat:
| =============================================
| [ INFO: possible recursive locking detected ]
| 4.9.0-rt0 #804 Tainted: G W
| ---------------------------------------------
| kworker/u16:4/154 is trying to acquire lock:
| (btrfs-fs-00){+.+...}, at: [<ffffffff8136f0e1>] btrfs_clear_lock_blocking_rw+0x71/0x120
|
| but task is already holding lock:
| (btrfs-fs-00){+.+...}, at: [<ffffffff8136f0e1>] btrfs_clear_lock_blocking_rw+0x71/0x120
|
| other info that might help us debug this:
| Possible unsafe locking scenario:
|
| CPU0
| ----
| lock(btrfs-fs-00);
| lock(btrfs-fs-00);
|
| *** DEADLOCK ***
|
| May be due to missing lock nesting notation
|
| 6 locks held by kworker/u16:4/154:
| #0: ("%s-%s""btrfs", name){.+.+.+}, at: [<ffffffff8108af13>] process_one_work+0x1f3/0x7b0
| #1: ((&work->normal_work)){+.+.+.}, at: [<ffffffff8108af13>] process_one_work+0x1f3/0x7b0
| #2: (sb_internal){.+.+..}, at: [<ffffffff8132f7a1>] start_transaction+0x2f1/0x590
| #3: (btrfs-fs-02){+.+...}, at: [<ffffffff8136f0e1>] btrfs_clear_lock_blocking_rw+0x71/0x120
| #4: (btrfs-fs-01){+.+...}, at: [<ffffffff8136f0e1>] btrfs_clear_lock_blocking_rw+0x71/0x120
| #5: (btrfs-fs-00){+.+...}, at: [<ffffffff8136f0e1>] btrfs_clear_lock_blocking_rw+0x71/0x120
|
| stack backtrace:
| CPU: 1 PID: 154 Comm: kworker/u16:4 Tainted: G W 4.9.0-rt1+ #804
| Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.9.3-20161025_171302-gandalf 04/01/2014
| Workqueue: btrfs-delalloc btrfs_delalloc_helper
| ffffc9000123b7d0 ffffffff8141a2a5 ffffffff829d6db0 ffffffff829d6db0
| ffffc9000123b890 ffffffff810c19dd 00000000000002fe 0000000000000006
| 000000003c272f80 ffffffff82308200 ce68145f590e60eb ffff880039c108c0
| Call Trace:
| [<ffffffff8141a2a5>] dump_stack+0x86/0xc1
| [<ffffffff810c19dd>] __lock_acquire+0x6dd/0x11d0
| [<ffffffff810c2956>] lock_acquire+0x116/0x240
| [<ffffffff818e0ed5>] rt_read_lock+0x45/0x60
| [<ffffffff8136f0e1>] btrfs_clear_lock_blocking_rw+0x71/0x120
| [<ffffffff812fec14>] btrfs_clear_path_blocking+0x94/0xb0
| [<ffffffff8130706f>] btrfs_next_old_leaf+0x3df/0x420
| [<ffffffff813070bb>] btrfs_next_leaf+0xb/0x10
| [<ffffffff81344eeb>] __btrfs_drop_extents+0x1cb/0xd50
| [<ffffffff81336161>] cow_file_range_inline+0x191/0x6c0
| [<ffffffff81338364>] compress_file_range.constprop.68+0x314/0x710
| [<ffffffff81338790>] async_cow_start+0x30/0x50
| [<ffffffff813644dd>] btrfs_scrubparity_helper+0xfd/0x620
| [<ffffffff81364b39>] btrfs_delalloc_helper+0x9/0x10
| [<ffffffff8108af8e>] process_one_work+0x26e/0x7b0
| [<ffffffff8108b516>] worker_thread+0x46/0x560
| [<ffffffff81091c4e>] kthread+0xee/0x110
| [<ffffffff818e166a>] ret_from_fork+0x2a/0x40

I can trigger it on -RT but it won't show up on a vanilla kernel. I
don't see an obvious difference here (between RT and !RT). We do have
more preemption points, and on RT a spin_lock() does not disable
preemption (so any code that assumes spin_lock() disables preemption
will break).
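
The kind of assumption I mean is, generically, something like this (a
made-up example, struct item and process_list() are invented, this is
not something I spotted in btrfs):

/* CPU-local data, normally protected by disabled preemption
 * (initialisation omitted) */
static DEFINE_PER_CPU(struct list_head, local_list);

static void producer(struct item *it)
{
        preempt_disable();
        list_add(&it->list, this_cpu_ptr(&local_list));
        preempt_enable();
}

static DEFINE_SPINLOCK(some_lock);

static void consumer(void)
{
        spin_lock(&some_lock);
        /*
         * Assumes spin_lock() disabled preemption and therefore excludes
         * producer() on this CPU. True on !RT, not true on RT where
         * spinlocks are sleeping locks and producer() can preempt us
         * right here.
         */
        process_list(this_cpu_ptr(&local_list));
        spin_unlock(&some_lock);
}
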
With all btrfs events enabled, this did not trigger. With the following
patch

--- a/fs/btrfs/locking.c
+++ b/fs/btrfs/locking.c
@@ -41,6 +41,7 @@ void btrfs_set_lock_blocking_rw(struct extent_buffer *eb, int rw)
*/
if (eb->lock_nested && current->pid == eb->lock_owner)
return;
+ trace_printk("eb %p rw %d\n", eb, rw);
if (rw == BTRFS_WRITE_LOCK) {
if (atomic_read(&eb->blocking_writers) == 0) {
WARN_ON(atomic_read(&eb->spinning_writers) != 1);
@@ -73,6 +74,7 @@ void btrfs_clear_lock_blocking_rw(struct extent_buffer *eb, int rw)
if (eb->lock_nested && current->pid == eb->lock_owner)
return;

+ trace_printk("eb %p rw %d\n", eb, rw);
if (rw == BTRFS_WRITE_LOCK_BLOCKING) {
BUG_ON(atomic_read(&eb->blocking_writers) != 1);
write_lock(&eb->lock);

I managed to collect this (the last few lines from the kworker):

# _-----=> irqs-off
# / _----=> need-resched
# |/ _-----=> need-resched_lazy
# || / _---=> hardirq/softirq
# ||| / _--=> preempt-depth
# |||| / _-=> preempt-lazy-depth
# ||||| / _-=> migrate-disable
# |||||| / delay
# TASK-PID CPU# ||||||| TIMESTAMP FUNCTION
# | | | ||||||| | |
kworker/u16:4-154 [001] .....11 60.632361: btrfs_set_lock_blocking_rw: eb ffff880039ebac00 rw 1
kworker/u16:4-154 [001] ....... 60.632362: btrfs_clear_lock_blocking_rw: eb ffff880039ebac00 rw 3
kworker/u16:4-154 [001] .....11 60.632366: btrfs_set_lock_blocking_rw: eb ffff880039ebac00 rw 1
kworker/u16:4-154 [001] ....... 60.632367: btrfs_clear_lock_blocking_rw: eb ffff880039ebac00 rw 3
kworker/u16:4-154 [001] .....11 60.632367: btrfs_set_lock_blocking_rw: eb ffff880039ebac00 rw 1
kworker/u16:4-154 [001] ....... 60.632368: btrfs_set_lock_blocking_rw: eb ffff880039ebac00 rw 3
kworker/u16:4-154 [001] ....... 60.632369: btrfs_clear_lock_blocking_rw: eb ffff880039ebac00 rw 3
kworker/u16:4-154 [001] .....12 60.632371: btrfs_set_lock_blocking_rw: eb ffff880039ebb000 rw 1
kworker/u16:4-154 [001] .....11 60.632372: btrfs_set_lock_blocking_rw: eb ffff880039ebac00 rw 1
kworker/u16:4-154 [001] ....... 60.632372: btrfs_clear_lock_blocking_rw: eb ffff880039ebac00 rw 3
kworker/u16:4-154 [001] .....11 60.632373: btrfs_clear_lock_blocking_rw: eb ffff880039ebb000 rw 3
kworker/u16:4-154 [001] .....12 60.632376: btrfs_set_lock_blocking_rw: eb ffff880039104000 rw 1
kworker/u16:4-154 [001] .....11 60.632377: btrfs_set_lock_blocking_rw: eb ffff880039ebb000 rw 1
kworker/u16:4-154 [001] ....... 60.632377: btrfs_clear_lock_blocking_rw: eb ffff880039ebb000 rw 3
kworker/u16:4-154 [001] .....11 60.632378: btrfs_clear_lock_blocking_rw: eb ffff880039104000 rw 3
kworker/u16:4-154 [001] .....12 60.632379: btrfs_set_lock_blocking_rw: eb ffff880039104000 rw 1
kworker/u16:4-154 [001] .....11 60.632380: btrfs_set_lock_blocking_rw: eb ffff880039ebb000 rw 1
kworker/u16:4-154 [001] .....11 60.632384: btrfs_set_lock_blocking_rw: eb ffff880039ebac00 rw 2
kworker/u16:4-154 [001] ....... 60.632384: btrfs_clear_lock_blocking_rw: eb ffff880039ebac00 rw 4
kworker/u16:4-154 [001] .....12 60.632386: btrfs_set_lock_blocking_rw: eb ffff880039ebb000 rw 2
kworker/u16:4-154 [001] .....11 60.632387: btrfs_set_lock_blocking_rw: eb ffff880039ebac00 rw 2
kworker/u16:4-154 [001] ....... 60.632392: btrfs_clear_lock_blocking_rw: eb ffff880039ebac00 rw 4
kworker/u16:4-154 [001] .....11 60.632392: btrfs_clear_lock_blocking_rw: eb ffff880039ebb000 rw 4
kworker/u16:4-154 [001] .....13 60.632395: btrfs_set_lock_blocking_rw: eb ffff880039104000 rw 2
kworker/u16:4-154 [001] .....12 60.632395: btrfs_set_lock_blocking_rw: eb ffff880039ebb000 rw 2
kworker/u16:4-154 [001] .....11 60.632396: btrfs_set_lock_blocking_rw: eb ffff880039ebac00 rw 2
kworker/u16:4-154 [001] ....... 60.632399: btrfs_clear_lock_blocking_rw: eb ffff880039ebac00 rw 4
kworker/u16:4-154 [001] .....11 60.632404: btrfs_clear_lock_blocking_rw: eb ffff880039ebb000 rw 4
kworker/u16:4-154 [001] .....12 60.632404: btrfs_clear_lock_blocking_rw: eb ffff880039104000 rw 4
kworker/u16:4-154 [001] .....13 60.632406: btrfs_set_lock_blocking_rw: eb ffff880039104000 rw 2
kworker/u16:4-154 [001] .....12 60.632407: btrfs_set_lock_blocking_rw: eb ffff880039ebb000 rw 2
kworker/u16:4-154 [001] .....11 60.632407: btrfs_set_lock_blocking_rw: eb ffff880039ebac00 rw 2
kworker/u16:4-154 [001] .....11 60.632601: btrfs_set_lock_blocking_rw: eb ffff880039763c00 rw 2
kworker/u16:4-154 [001] ....... 60.632601: btrfs_set_lock_blocking_rw: eb ffff880039104000 rw 4
kworker/u16:4-154 [001] ....... 60.632602: btrfs_set_lock_blocking_rw: eb ffff880039ebb000 rw 4
kworker/u16:4-154 [001] ....... 60.632602: btrfs_set_lock_blocking_rw: eb ffff880039ebac00 rw 4
kworker/u16:4-154 [001] ....... 60.632602: btrfs_clear_lock_blocking_rw: eb ffff880039ebac00 rw 4
kworker/u16:4-154 [001] .....11 60.632610: btrfs_clear_lock_blocking_rw: eb ffff880039ebb000 rw 4
kworker/u16:4-154 [001] .....12 60.632611: btrfs_clear_lock_blocking_rw: eb ffff880039104000 rw 4
kworker/u16:4-154 [001] .....13 60.632612: btrfs_clear_lock_blocking_rw: eb ffff880039763c00 rw 4

Could this be a wrong / missing lockdep annotation?
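
For reference, this is what I understand lockdep means by "nesting
notation", illustrated with mutexes (just a generic example; eb->lock is
an rwlock and btrfs uses per-level lock classes instead):

#include <linux/mutex.h>
#include <linux/lockdep.h>

struct node {
        struct mutex lock;
        struct node *child;
};

static void lock_parent_and_child(struct node *parent)
{
        mutex_lock(&parent->lock);
        /*
         * Second lock of the same class: annotate it as nested so that
         * lockdep does not report "possible recursive locking".
         */
        mutex_lock_nested(&parent->child->lock, SINGLE_DEPTH_NESTING);

        /* work on both nodes */

        mutex_unlock(&parent->child->lock);
        mutex_unlock(&parent->lock);
}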

Sebastian

2016-12-20 17:26:32

by David Sterba

Subject: Re: [PATCH 1/2] btrfs: drop trace_btrfs_all_work_done() from normal_work_helper()

Adding Qu to CC,

On Wed, Dec 14, 2016 at 03:05:29PM +0100, Sebastian Andrzej Siewior wrote:
> For btrfs_scrubparity_helper() the ->func() is set to
> scrub_parity_bio_endio_worker(). This function invokes
> scrub_free_parity(), which kfree()s the `work' object. All is good as
> long as trace events are not enabled; once they are enabled we blow up
> with a backtrace like this:
> | Workqueue: btrfs-endio btrfs_endio_helper
> | RIP: 0010:[<ffffffff812f81ae>] [<ffffffff812f81ae>] trace_event_raw_event_btrfs__work__done+0x4e/0xa0
> | Call Trace:
> | [<ffffffff8136497d>] btrfs_scrubparity_helper+0x59d/0x780
> | [<ffffffff81364c49>] btrfs_endio_helper+0x9/0x10
> | [<ffffffff8108af8e>] process_one_work+0x26e/0x7b0
> | [<ffffffff8108b516>] worker_thread+0x46/0x560
> | [<ffffffff81091c4e>] kthread+0xee/0x110
> | [<ffffffff818e166a>] ret_from_fork+0x2a/0x40
>
> So in order to avoid this, I remove the trace point.
>
> Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
> ---
> fs/btrfs/async-thread.c | 2 --
> 1 file changed, 2 deletions(-)
>
> diff --git a/fs/btrfs/async-thread.c b/fs/btrfs/async-thread.c
> index e0f071f6b5a7..d0dfc3d2e199 100644
> --- a/fs/btrfs/async-thread.c
> +++ b/fs/btrfs/async-thread.c
> @@ -318,8 +318,6 @@ static void normal_work_helper(struct btrfs_work *work)
> set_bit(WORK_DONE_BIT, &work->flags);
> run_ordered_work(wq);
> }
> - if (!need_order)
> - trace_btrfs_all_work_done(work);

The comment in the function says we can't touch 'work' after the
callbacks. I don't see any way to use it in a tracepoint here. The
"all_work_done" pairs with a preceding trace_btrfs_work_sched in the
same function or from within run_ordered_work, also called after the
free callback.

So I think we should either remove the tracepoint completely or change
the arguments to take something else than a potentially freed 'work'.

I'm a bit puzzled by the comment in trace/events/btrfs.h

http://lxr.free-electrons.com/source/include/trace/events/btrfs.h#L1165

/* For situiations that the work is freed */
DECLARE_EVENT_CLASS(btrfs__work__done,

so we're expecting a freed pointer anyway? That sounds wrong.

I'll queue the patch for 4.10 as it fixes a crash.

2016-12-21 00:33:20

by Qu Wenruo

Subject: Re: [PATCH 1/2] btrfs: drop trace_btrfs_all_work_done() from normal_work_helper()



At 12/21/2016 01:26 AM, David Sterba wrote:
> Adding Qu to CC,
>
> On Wed, Dec 14, 2016 at 03:05:29PM +0100, Sebastian Andrzej Siewior wrote:
>> For btrfs_scrubparity_helper() the ->func() is set to
>> scrub_parity_bio_endio_worker(). This function invokes
>> scrub_free_parity(), which kfree()s the `work' object. All is good as
>> long as trace events are not enabled; once they are enabled we blow up
>> with a backtrace like this:
>> | Workqueue: btrfs-endio btrfs_endio_helper
>> | RIP: 0010:[<ffffffff812f81ae>] [<ffffffff812f81ae>] trace_event_raw_event_btrfs__work__done+0x4e/0xa0
>> | Call Trace:
>> | [<ffffffff8136497d>] btrfs_scrubparity_helper+0x59d/0x780
>> | [<ffffffff81364c49>] btrfs_endio_helper+0x9/0x10
>> | [<ffffffff8108af8e>] process_one_work+0x26e/0x7b0
>> | [<ffffffff8108b516>] worker_thread+0x46/0x560
>> | [<ffffffff81091c4e>] kthread+0xee/0x110
>> | [<ffffffff818e166a>] ret_from_fork+0x2a/0x40
>>
>> So in order to avoid this, I remove the trace point.
>>
>> Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
>> ---
>> fs/btrfs/async-thread.c | 2 --
>> 1 file changed, 2 deletions(-)
>>
>> diff --git a/fs/btrfs/async-thread.c b/fs/btrfs/async-thread.c
>> index e0f071f6b5a7..d0dfc3d2e199 100644
>> --- a/fs/btrfs/async-thread.c
>> +++ b/fs/btrfs/async-thread.c
>> @@ -318,8 +318,6 @@ static void normal_work_helper(struct btrfs_work *work)
>> set_bit(WORK_DONE_BIT, &work->flags);
>> run_ordered_work(wq);
>> }
>> - if (!need_order)
>> - trace_btrfs_all_work_done(work);
>
> The comment in the function says we can't touch 'work' after the
> callbacks. I don't see any way to use it in a tracepoint here. The
> "all_work_done" pairs with a preceding trace_btrfs_work_sched in the
> same function or from within run_ordered_work, also called after the
> free callback.

The trace point only uses the pointer, and this helps us to pair with
btrfs_work_queued/sched.

But I still don't understand why the backtrace is triggered, since we're
just recording a pointer, not touching it.

Would you please explain in more detail how it triggers the problem?

>
> So I think we should either remove the tracepoint completely or change
> the arguments to take something else than a potentially freed 'work'.

I'm mostly OK with removing the tracepoint, but such an all_work_done()
trace should still help to determine whether a workqueue has stalled.

Thanks,
Qu

>
> I'm a bit puzzled by the comment in trace/events/btrfs.h
>
> http://lxr.free-electrons.com/source/include/trace/events/btrfs.h#L1165
>
> /* For situiations that the work is freed */
> DECLARE_EVENT_CLASS(btrfs__work__done,
>
> so we're expecting a freed pointer anyway? That sounds wrong.
>
> I'll queue the patch for 4.10 as it fixes a crash.
>
>


Subject: Re: [PATCH 1/2] btrfs: drop trace_btrfs_all_work_done() from normal_work_helper()

On 2016-12-21 08:33:03 [+0800], Qu Wenruo wrote:
> The trace point only uses the pointer, and this helps us to pair with
> btrfs_work_queued/sched.

| /* For situiations that the work is freed */
| DECLARE_EVENT_CLASS(btrfs__work__done,
|
| TP_PROTO(struct btrfs_work *work),
|
| TP_ARGS(work),
|
| TP_STRUCT__entry_btrfs(
| __field( void *, work )
| ),
|
| TP_fast_assign_btrfs(btrfs_work_owner(work),
| __entry->work = work;
| ),
|
| TP_printk_btrfs("work->%p", __entry->work)
| );

and btrfs_work_owner() expands to:

| struct btrfs_fs_info *
| btrfs_work_owner(struct btrfs_work *work)
| {
| return work->wq->fs_info;
| }

voilà

> But I still don't understand why the backtrace is triggered, since we're
> just recording a pointer, not touching it.
>
> Would you please explain in more detail how it triggers the problem?

I enabled all events and played with the fs; it was just an upgrade plus a
git tree sync + checkout, so nothing special.

> >
> > So I think we should either remove the tracepoint completely or change
> > the arguments to take something else than a potentially freed 'work'.
>
> I'm mostly OK with removing the tracepoint, but such an all_work_done()
> trace should still help to determine whether a workqueue has stalled.
>
> Thanks,
> Qu

Sebastian

2016-12-21 08:45:29

by Qu Wenruo

Subject: Re: [PATCH 1/2] btrfs: drop trace_btrfs_all_work_done() from normal_work_helper()



At 12/21/2016 04:28 PM, Sebastian Andrzej Siewior wrote:
> On 2016-12-21 08:33:03 [+0800], Qu Wenruo wrote:
>> The trace point only uses the pointer, and this helps us to pair with
>> btrfs_work_queued/sched.
>
> | /* For situiations that the work is freed */
> | DECLARE_EVENT_CLASS(btrfs__work__done,
> |
> | TP_PROTO(struct btrfs_work *work),
> |
> | TP_ARGS(work),
> |
> | TP_STRUCT__entry_btrfs(
> | __field( void *, work )
> | ),
> |
> | TP_fast_assign_btrfs(btrfs_work_owner(work),
> | __entry->work = work;
> | ),
> |
> | TP_printk_btrfs("work->%p", __entry->work)
> | );
>
> and btrfs_work_owner() expands to:
>
> | struct btrfs_fs_info *
> | btrfs_work_owner(struct btrfs_work *work)
> | {
> | return work->wq->fs_info;
> | }
>
> voilà

Oh I got it, thanks very much.

The btrfs_work_owner() is newly introduced, no wonder I didn't know that.


I think we can fix it by extracting the fs_info pointer before running the
work, and using the extracted one in the trace point.
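
Something like this (just a sketch, not even compile tested; the
btrfs_all_work_done tracepoint would have to be changed to take the
fs_info and a plain pointer instead of the work item):

static void normal_work_helper(struct btrfs_work *work)
{
        struct __btrfs_workqueue *wq = work->wq;
        /*
         * Grab everything the tracepoint needs before the callbacks run,
         * since they may free `work'.
         */
        struct btrfs_fs_info *fs_info = wq->fs_info;
        void *wtag = work;
        int need_order = 0;

        if (work->ordered_func)
                need_order = 1;

        trace_btrfs_work_sched(work);
        thresh_exec_hook(wq);
        work->func(work);
        if (need_order) {
                set_bit(WORK_DONE_BIT, &work->flags);
                run_ordered_work(wq);
        }
        if (!need_order)
                trace_btrfs_all_work_done(fs_info, wtag);
}

run_ordered_work() would need the same treatment for the ordered_free()
case.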

Thanks,
Qu


>
>> But I still don't understand why the backtrace is triggered, since we're
>> just recording a pointer, not touching it.
>>
>> Would you please explain in more detail how it triggers the problem?
>
> I enabled all events and played with the fs; it was just an upgrade plus a
> git tree sync + checkout, so nothing special.
>
>>>
>>> So I think we should either remove the tracepoint completely or change
>>> the arguments to take something else than a potentially freed 'work'.
>>
>> I'm mostly OK with removing the tracepoint, but such an all_work_done()
>> trace should still help to determine whether a workqueue has stalled.
>>
>> Thanks,
>> Qu
>
> Sebastian


Subject: Re: [PATCH 1/2] btrfs: drop trace_btrfs_all_work_done() from normal_work_helper()

On 2016-12-21 16:45:06 [+0800], Qu Wenruo wrote:
> > | DECLARE_EVENT_CLASS(btrfs__work__done,
> > |
> > | TP_PROTO(struct btrfs_work *work),
> > |
> > | TP_ARGS(work),
> > |
> > | TP_STRUCT__entry_btrfs(
> > | __field( void *, work )
> > | ),
> > |
> > | TP_fast_assign_btrfs(btrfs_work_owner(work),
> > | __entry->work = work;
> > | ),
> > |
> > | TP_printk_btrfs("work->%p", __entry->work)
> > | );
> >
> > and btrfs_work_owner exapnds to:
> >
> > | struct btrfs_fs_info *
> > | btrfs_work_owner(struct btrfs_work *work)
> > | {
> > | return work->wq->fs_info;
> > | }
> >
> > voilà
>
> Oh I got it, thanks very much.
>
> The btrfs_work_owner() is newly introduced, no wonder I didn't know that.

It was introduced in cb001095ca70 ("btrfs: plumb fs_info into
btrfs_work") which is v4.8-rc1. The usage in trace points started in
bc074524e123 ("btrfs: prefix fsid to all trace events") which is also
v4.8-rc1. So whatever is done to get it fixed, it should be pushed to
stable for v4.8+.

> Thanks,
> Qu

Sebastian

2016-12-21 14:09:07

by David Sterba

Subject: Re: [PATCH 1/2] btrfs: drop trace_btrfs_all_work_done() from normal_work_helper()

On Wed, Dec 21, 2016 at 04:45:06PM +0800, Qu Wenruo wrote:
>
>
> At 12/21/2016 04:28 PM, Sebastian Andrzej Siewior wrote:
> > On 2016-12-21 08:33:03 [+0800], Qu Wenruo wrote:
> >> The trace point only uses the pointer, and this helps us to pair with
> >> btrfs_work_queued/sched.
> >
> > | /* For situiations that the work is freed */
> > | DECLARE_EVENT_CLASS(btrfs__work__done,
> > |
> > | TP_PROTO(struct btrfs_work *work),
> > |
> > | TP_ARGS(work),
> > |
> > | TP_STRUCT__entry_btrfs(
> > | __field( void *, work )
> > | ),
> > |
> > | TP_fast_assign_btrfs(btrfs_work_owner(work),
> > | __entry->work = work;
> > | ),
> > |
> > | TP_printk_btrfs("work->%p", __entry->work)
> > | );
> >
> > and btrfs_work_owner() expands to:
> >
> > | struct btrfs_fs_info *
> > | btrfs_work_owner(struct btrfs_work *work)
> > | {
> > | return work->wq->fs_info;
> > | }
> >
> > voilà
>
> Oh I got it, thanks very much.
>
> The btrfs_work_owner() is newly introduced, no wonder I didn't know that.
>
>
> I think we can fix it by extracting the fs_info pointer before running the
> work, and using the extracted one in the trace point.

That sounds like an easy fix to preserve the tracepoint.