I ran into the stale-block-directory problem while using blktrace,
and while chasing it noticed the open-coded cmpxchg.
The missing unlink on final close is a longstanding bug that's triggerable
by the standard blktrace(1) utility (it races child threads against
the parent), so that fix should go to -stable. The open-coded cmpxchg is
a startup race that may be easy to trigger maliciously from userspace, but
is unlikely to be triggered accidentally. (Famous last words...)
Andy Isaacson (2):
blktrace: use cmpxchg
blktrace: unlink blk directory on final trace close
kernel/trace/blktrace.c | 18 +++++++++---------
1 files changed, 9 insertions(+), 9 deletions(-)
Replace open-coded racy implementation of cmpxchg with the real thing.
This bug is probably easy to trigger maliciously from userspace, and
I think it will result in memory corruption, but the race window is
small, so it's unlikely to be triggered accidentally.
Signed-off-by: Andy Isaacson <[email protected]>
---
kernel/trace/blktrace.c | 14 +++++---------
1 files changed, 5 insertions(+), 9 deletions(-)
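For illustration only, here is a userspace model of the race, using C11
atomics in place of the kernel's xchg()/cmpxchg(). It is a sketch, not the
kernel code: `slot` stands in for q->blk_trace, teardown is assumed to grab
the slot with an unconditional exchange, and every function name below is
made up for the example. The bad interleaving is replayed sequentially so
that the outcome is deterministic.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

struct trace { int id; };

/* Stand-in for q->blk_trace (hypothetical, userspace only). */
static _Atomic(struct trace *) slot;

/* Old scheme, step 1: unconditionally swap the new trace in. */
struct trace *racy_install_step1(struct trace *bt)
{
	return atomic_exchange(&slot, bt);
}

/* Old scheme, step 2: the slot was busy, so put the previous owner back. */
void racy_install_step2(struct trace *old_bt)
{
	(void)atomic_exchange(&slot, old_bt);
}

/* Teardown as modeled here: take whatever currently sits in the slot. */
struct trace *teardown(void)
{
	return atomic_exchange(&slot, NULL);
}

/*
 * New scheme: install only if the slot is empty. Returns the previous
 * occupant, i.e. NULL on success, mirroring the kernel's cmpxchg().
 */
struct trace *cmpxchg_install(struct trace *bt)
{
	struct trace *expected = NULL;

	atomic_compare_exchange_strong(&slot, &expected, bt);
	return expected;
}

/*
 * Replay one bad interleaving of the old scheme. Returns 1 if teardown
 * ended up holding the *new* trace while the old one is back in the slot.
 */
int replay_racy_interleaving(struct trace *a, struct trace *b)
{
	struct trace *old, *torn;

	atomic_store(&slot, a);          /* A is already installed      */
	old = racy_install_step1(b);     /* setup: slot briefly holds B */
	torn = teardown();               /* teardown grabs B, not A     */
	racy_install_step2(old);         /* setup: A goes back in       */
	return torn == b && atomic_load(&slot) == a;
}

/* Same sequence with cmpxchg: the failed install never touches the slot. */
int replay_cmpxchg(struct trace *a, struct trace *b)
{
	struct trace *old, *torn;

	atomic_store(&slot, a);
	old = cmpxchg_install(b);        /* fails, slot still holds A   */
	torn = teardown();               /* teardown correctly gets A   */
	return old == a && torn == a && atomic_load(&slot) == NULL;
}
```

In the replayed interleaving the setup path would go on to free B on its
error path while teardown also believes it owns B, which is the kind of
corruption the commit message worries about. With the compare-and-exchange
variant there is no window in which another thread can observe B.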
diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c
index 638711c..347fe8e 100644
--- a/kernel/trace/blktrace.c
+++ b/kernel/trace/blktrace.c
@@ -511,11 +511,9 @@ int do_blk_trace_setup(struct request_queue *q, char *name, dev_t dev,
bt->trace_state = Blktrace_setup;
ret = -EBUSY;
- old_bt = xchg(&q->blk_trace, bt);
- if (old_bt) {
- (void) xchg(&q->blk_trace, old_bt);
+ old_bt = cmpxchg(&q->blk_trace, NULL, bt);
+ if (old_bt)
goto err;
- }
if (atomic_inc_return(&blk_probes_ref) == 1)
blk_register_tracepoints();
@@ -1464,12 +1462,10 @@ static int blk_trace_setup_queue(struct request_queue *q,
blk_trace_setup_lba(bt, bdev);
- old_bt = xchg(&q->blk_trace, bt);
- if (old_bt != NULL) {
- (void)xchg(&q->blk_trace, old_bt);
- ret = -EBUSY;
+ ret = -EBUSY;
+ old_bt = cmpxchg(&q->blk_trace, NULL, bt);
+ if (old_bt)
goto free_bt;
- }
if (atomic_inc_return(&blk_probes_ref) == 1)
blk_register_tracepoints();
--
1.7.1
blktrace fails to completely clean up after itself if BLKTRACESTOP is
called before the final trace file is closed:
% blktrace /dev/sdb &
% exec 6</sys/kernel/debug/block/sdb/trace0
% kill %1
% find /sys/kernel/debug/block/
/sys/kernel/debug/block/
/sys/kernel/debug/block/sdb
/sys/kernel/debug/block/sdb/trace0
% exec 6<&-
% find /sys/kernel/debug/block/
/sys/kernel/debug/block/
/sys/kernel/debug/block/sdb
The proper fix is to move cleanup from BLKTRACESTOP to the close
routine, but that's nontrivial, so here's a simple fix to remove
debug/block/<dev> when the last trace%d file is closed.
Cc: [email protected]
Signed-off-by: Andy Isaacson <[email protected]>
---
kernel/trace/blktrace.c | 4 ++++
1 files changed, 4 insertions(+), 0 deletions(-)
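The "remove the parent once the last child is gone" idea can be sketched
in userspace with plain POSIX calls. This is only an analogy, not the
debugfs code: rmdir() refusing a non-empty directory plays the role of the
simple_empty() check, and the function names and the /tmp path are
assumptions made for the example.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <libgen.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Remove `path`, then remove its parent directory if that left it empty.
 * Returns 0 on success, -1 if the unlink itself failed.
 */
int remove_file_and_empty_parent(const char *path)
{
	char copy[PATH_MAX];

	if (unlink(path) != 0)
		return -1;

	/* dirname() may modify its argument, so work on a copy. */
	snprintf(copy, sizeof(copy), "%s", path);

	/* rmdir() fails on a non-empty directory; that failure is expected. */
	(void)rmdir(dirname(copy));
	return 0;
}

/*
 * Demo helper: create `nfiles` files named trace0, trace1, ... in a fresh
 * temporary directory, remove the first `nremoved` of them, and report
 * whether the directory has disappeared (1), is still there (0), or the
 * demo itself failed (-1). Leftover files are cleaned up before returning.
 */
int dir_gone_after(int nfiles, int nremoved)
{
	char dir[] = "/tmp/blkdemoXXXXXX";
	char path[PATH_MAX];
	struct stat st;
	int i, fd, gone;

	if (mkdtemp(dir) == NULL)
		return -1;

	for (i = 0; i < nfiles; i++) {
		snprintf(path, sizeof(path), "%s/trace%d", dir, i);
		fd = open(path, O_CREAT | O_WRONLY, 0600);
		if (fd < 0)
			return -1;
		close(fd);
	}

	for (i = 0; i < nremoved; i++) {
		snprintf(path, sizeof(path), "%s/trace%d", dir, i);
		if (remove_file_and_empty_parent(path) != 0)
			return -1;
	}

	gone = (stat(dir, &st) != 0 && errno == ENOENT);

	/* Clean up whatever the demo left behind. */
	for (i = nremoved; i < nfiles; i++) {
		snprintf(path, sizeof(path), "%s/trace%d", dir, i);
		remove_file_and_empty_parent(path);
	}
	return gone;
}
```

With two files, removing one leaves the directory in place and removing
both takes the directory with it, regardless of which removal happens to
come last. That is the property the callback change relies on.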
diff --git a/kernel/trace/blktrace.c b/kernel/trace/blktrace.c
index 347fe8e..9b78930 100644
--- a/kernel/trace/blktrace.c
+++ b/kernel/trace/blktrace.c
@@ -383,8 +383,12 @@ static int blk_subbuf_start_callback(struct rchan_buf *buf, void *subbuf,
static int blk_remove_buf_file_callback(struct dentry *dentry)
{
+ struct dentry *parent = dentry->d_parent;
debugfs_remove(dentry);
+ if (simple_empty(parent))
+ debugfs_remove(parent);
+
return 0;
}
--
1.7.1