2007-11-19 07:23:47

by Torsten Kaiser

Subject: 2.6.24-rc2-mm1: kcryptd vs lockdep

While testing the latest NFSv4 patch (that patch is only the reason I
had lockdep enabled), I got this:
[ 64.550203]
[ 64.550205] =========================
[ 64.552213] [ BUG: held lock freed! ]
[ 64.553633] -------------------------
[ 64.555055] kcryptd/1022 is freeing memory FFFF81011EBEFB00-FFFF81011EBEFB3F, with a lock still held there!
[ 64.558809] (kcryptd){--..}, at: [<ffffffff80247dd9>] run_workqueue+0x129/0x210
[ 64.561743] 2 locks held by kcryptd/1022:
[ 64.563296] #0: (kcryptd){--..}, at: [<ffffffff80247dd9>] run_workqueue+0x129/0x210
[ 64.566409] #1: (&io->work#2){--..}, at: [<ffffffff80247dd9>] run_workqueue+0x129/0x210
[ 64.569672]
[ 64.569672] stack backtrace:
[ 64.571375]
[ 64.571375] Call Trace:
[ 64.572913] [<ffffffff8025a5f0>] debug_check_no_locks_freed+0x190/0x1b0
[ 64.575764] [<ffffffff8026f192>] mempool_free_slab+0x12/0x20
[ 64.577986] [<ffffffff80296bb9>] kmem_cache_free+0x79/0xe0
[ 64.580140] [<ffffffff8026f192>] mempool_free_slab+0x12/0x20
[ 64.582362] [<ffffffff8026f22a>] mempool_free+0x8a/0xa0
[ 64.584415] [<ffffffff802c76af>] bio_free+0x2f/0x50
[ 64.586337] [<ffffffff802c76e0>] bio_fs_destructor+0x10/0x20
[ 64.588558] [<ffffffff802c7436>] bio_put+0x26/0x30
[ 64.590446] [<ffffffff803834d9>] xfs_buf_bio_end_io+0x99/0x120
[ 64.592734] [<ffffffff802c72a9>] bio_endio+0x19/0x40
[ 64.594687] [<ffffffff804e3827>] dec_pending+0x107/0x210
[ 64.596775] [<ffffffff804e3ae0>] clone_endio+0x70/0xb0
[ 64.598793] [<ffffffff804eb880>] kcryptd_do_crypt+0x0/0x290
[ 64.600978] [<ffffffff802c72a9>] bio_endio+0x19/0x40
[ 64.602931] [<ffffffff804eb382>] crypt_dec_pending+0x32/0x50
[ 64.605149] [<ffffffff804eb8e4>] kcryptd_do_crypt+0x64/0x290
[ 64.607368] [<ffffffff804eb880>] kcryptd_do_crypt+0x0/0x290
[ 64.609553] [<ffffffff804eb880>] kcryptd_do_crypt+0x0/0x290
[ 64.611739] [<ffffffff80247e25>] run_workqueue+0x175/0x210
[ 64.613892] [<ffffffff80248af1>] worker_thread+0x71/0xb0
[ 64.615981] [<ffffffff8024c830>] autoremove_wake_function+0x0/0x40
[ 64.618402] [<ffffffff80248a80>] worker_thread+0x0/0xb0
[ 64.620454] [<ffffffff8024c43d>] kthread+0x4d/0x80
[ 64.622340] [<ffffffff8020cbc8>] child_rip+0xa/0x12
[ 64.624262] [<ffffffff8020c2df>] restore_args+0x0/0x30
[ 64.626281] [<ffffffff8024c3f0>] kthread+0x0/0x80
[ 64.628134] [<ffffffff8020cbbe>] child_rip+0x0/0x12
[ 64.630052]
[ 64.630637] INFO: lockdep is turned off.

I have seen this only once; when I booted the same kernel build a
second time, it did not happen again.
I also got two other oopses when trying to shut the system down after
the above happened. So it is possible that kcryptd was only the
victim of some other corruption, but then I don't know which subsystem
is to blame.

The other oopses:
[ 108.613851] Unable to handle kernel paging request at 00000a6425203a72 RIP:
[ 108.618485] [<00000a6425203a72>]
[ 108.624339] PGD 0
[ 108.626416] Oops: 0010 [1] SMP
[ 108.629657] last sysfs file:
/sys/devices/pci0000:00/0000:00:0f.0/0000:01:00.1/resource
[ 108.637675] CPU 3
[ 108.639749] Modules linked in: radeon drm nfsd exportfs ipv6 tuner
tea5767 tda8290 tuner_xc2028 tda9887 tuner_simple mt20xx tea5761
tvaudio msp3400 bttv ir_common compat_ioctl32 videobuf_dma_sg
videobuf_core btcx_risc tveeprom videodev usbhid v4l2_common
v4l1_compat hid pata_amd sg
[ 108.665913] Pid: 8715, comm: reboot Not tainted 2.6.24-rc2-mm1 #14
[ 108.672103] RIP: 0010:[<00000a6425203a72>] [<00000a6425203a72>]
[ 108.678164] RSP: 0018:ffff81011d4a1e10 EFLAGS: 00010206
[ 108.683491] RAX: 00000a6425203a72 RBX: ffffffff8077f9a0 RCX: 000000000000000a
[ 108.690635] RDX: ffffffff8077fb40 RSI: 000000000000000a RDI: ffff81011ff0f870
[ 108.697779] RBP: ffff81011d4a1e28 R08: 00000000000007d0 R09: 0000000000000001
[ 108.704922] R10: ffffffff804bf40a R11: 2222222222222222 R12: 00000000fee1dead
[ 108.712066] R13: 0000000001234567 R14: 0000000000000000 R15: 0000000000000001
[ 108.719210] FS: 00007f217607f6f0(0000) GS:ffff81011ff11780(0000)
knlGS:0000000000000000
[ 108.727312] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 108.733071] CR2: 00000a6425203a72 CR3: 000000011e182000 CR4: 00000000000006e0
[ 108.740215] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 108.747358] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 108.754501] Process reboot (pid: 8715, threadinfo FFFF81011D4A0000,
task FFFF81011EC20EC0)
[ 108.762776] Stack: ffffffff8042c4ac ffff81011d4a1e28
0000000000000000 ffff81011d4a1e48
[ 108.770993] ffffffff802451ff 0000000028121969 0000000028121969
ffff81011d4a1f78
[ 108.778561] ffffffff802453d5 ffff81011f9ec078 ffff81011f97e780
ffff81011d4a1e88
[ 108.785911] Call Trace:
[ 108.788601] [<ffffffff8042c4ac>] device_shutdown+0x4c/0xa0
[ 108.794192] [<ffffffff802451ff>] kernel_restart+0x2f/0x70
[ 108.799696] [<ffffffff802453d5>] sys_reboot+0x185/0x1d0
[ 108.805028] [<ffffffff802b0289>] d_free+0x49/0x50
[ 108.809832] [<ffffffff802b02e0>] d_kill+0x50/0x70
[ 108.814641] [<ffffffff802b6880>] mntput_no_expire+0x20/0xe0
[ 108.820314] [<ffffffff8029e2bd>] __fput+0x17d/0x230
[ 108.825295] [<ffffffff8029e726>] fput+0x16/0x20
[ 108.829933] [<ffffffff805cfe43>] trace_hardirqs_on_thunk+0x35/0x3a
[ 108.836219] [<ffffffff8020bc8e>] system_call+0x7e/0x83
[ 108.841459]
[ 108.842979] INFO: lockdep is turned off.
[ 108.846922]
[ 108.846922] Code: Bad RIP value.
[ 108.851852] RIP [<00000a6425203a72>]
[ 108.855570] RSP <ffff81011d4a1e10>
[ 108.859081] CR2: 00000a6425203a72
[ 110.859331] md: stopping all md devices.
[ 110.863297] md: md1 still in use.
[ 111.865229] sd 8:0:1:0: [sdd] Synchronizing SCSI cache
[ 112.064924] sd 2:0:0:0: [sdc] Synchronizing SCSI cache
[ 112.066951] sd 1:0:0:0: [sdb] Synchronizing SCSI cache
[ 112.068961] sd 0:0:0:0: [sda] Synchronizing SCSI cache
[ 112.071007] Unable to handle kernel paging request at 00000a6425203a72 RIP:
[ 112.072794] [<00000a6425203a72>]
[ 112.075053] PGD 0
[ 112.075857] Oops: 0010 [2] SMP
[ 112.077111] last sysfs file:
/sys/devices/pci0000:00/0000:00:0f.0/0000:01:00.1/resource
[ 112.080199] CPU 2
[ 112.081003] Modules linked in: radeon drm nfsd exportfs ipv6 tuner
tea5767 tda8290 tuner_xc2028 tda9887 tuner_simple mt20xx tea5761
tvaudio msp3400 bttv ir_common compat_ioctl32 videobuf_dma_sg
videobuf_core btcx_risc tveeprom videodev usbhid v4l2_common
v4l1_compat hid pata_amd sg
[ 112.091136] Pid: 8716, comm: reboot Tainted: G D 2.6.24-rc2-mm1 #14
[ 112.093723] RIP: 0010:[<00000a6425203a72>] [<00000a6425203a72>]
[ 112.096060] RSP: 0018:ffff81011dc71e10 EFLAGS: 00010206
[ 112.098113] RAX: 00000a6425203a72 RBX: ffffffff8077f9a0 RCX: 000000000000000a
[ 112.100867] RDX: ffffffff8077fb40 RSI: 000000000000000a RDI: ffff81011ff0f870
[ 112.103621] RBP: ffff81011dc71e28 R08: 0000000000000001 R09: 0000000000000001
[ 112.106375] R10: ffffffff804bf40a R11: 2222222222222222 R12: 00000000fee1dead
[ 112.109126] R13: 0000000001234567 R14: 0000000000000000 R15: 0000000000000000
[ 112.111881] FS: 00007fe4ba0c96f0(0000) GS:ffff81011ff11300(0000)
knlGS:0000000000000000
[ 112.115000] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 112.117218] CR2: 00000a6425203a72 CR3: 000000011e405000 CR4: 00000000000006e0
[ 112.119972] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 112.122727] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 112.125481] Process reboot (pid: 8716, threadinfo FFFF81011DC70000,
task FFFF81011EC20000)
[ 112.128666] Stack: ffffffff8042c4ac ffff81011dc71e28
0000000000000000 ffff81011dc71e48
[ 112.131846] ffffffff802451ff 0000000028121969 0000000028121969
ffff81011dc71f78
[ 112.134774] ffffffff802453d5 ffff81011dc71f48 0000000000000001
ffff81011dc71f28
[ 112.137614] Call Trace:
[ 112.138657] [<ffffffff8042c4ac>] device_shutdown+0x4c/0xa0
[ 112.140814] [<ffffffff802451ff>] kernel_restart+0x2f/0x70
[ 112.142934] [<ffffffff802453d5>] sys_reboot+0x185/0x1d0
[ 112.144990] [<ffffffff8024fcf2>] hrtimer_nanosleep+0x72/0x130
[ 112.147244] [<ffffffff8024f470>] hrtimer_wakeup+0x0/0x30
[ 112.149331] [<ffffffff805cf137>] do_nanosleep+0x57/0x90
[ 112.151386] [<ffffffff80240e36>] sigprocmask+0x86/0xf0
[ 112.153409] [<ffffffff805cfe43>] trace_hardirqs_on_thunk+0x35/0x3a
[ 112.155830] [<ffffffff8024f97c>] lock_hrtimer_base+0x2c/0x60
[ 112.158052] [<ffffffff8020bc8e>] system_call+0x7e/0x83
[ 112.160074]
[ 112.160662] INFO: lockdep is turned off.
[ 112.162181]
[ 112.162182] Code: Bad RIP value.
[ 112.164090] RIP [<00000a6425203a72>]
[ 112.165528] RSP <ffff81011dc71e10>
[ 112.166881] CR2: 00000a6425203a72
[ 128.836239] SysRq : Resetting

I had never seen this oops at shutdown before...
Torsten


2007-11-19 07:56:51

by Ingo Molnar

Subject: Re: 2.6.24-rc2-mm1: kcryptd vs lockdep


* Torsten Kaiser <[email protected]> wrote:

> Trying the last NFSv4 patch (but that patch is only the cause, why I
> had lockdep enabled) I got this:
> [ 64.550203]
> [ 64.550205] =========================
> [ 64.552213] [ BUG: held lock freed! ]
> [ 64.553633] -------------------------
> [ 64.555055] kcryptd/1022 is freeing memory
> FFFF81011EBEFB00-FFFF81011EBEFB3F, with a lock still held there!

so kcryptd frees a live, still-in-use bio? That could be a recipe for
data corruption. Does SLUB_DEBUG (or SLAB_DEBUG) catch anything?

Ingo

2007-11-19 19:34:45

by Torsten Kaiser

Subject: Re: 2.6.24-rc2-mm1: kcryptd vs lockdep

On Nov 19, 2007 8:56 AM, Ingo Molnar <[email protected]> wrote:
>
> * Torsten Kaiser <[email protected]> wrote:
>
> > Trying the last NFSv4 patch (but that patch is only the cause, why I
> > had lockdep enabled) I got this:
> > [ 64.550203]
> > [ 64.550205] =========================
> > [ 64.552213] [ BUG: held lock freed! ]
> > [ 64.553633] -------------------------
> > [ 64.555055] kcryptd/1022 is freeing memory
> > FFFF81011EBEFB00-FFFF81011EBEFB3F, with a lock still held there!
>
> so kcryptd frees a live, still in use bio? That could be a receipe for
> data corruption. Does SLUB_DEBUG (or SLAB_DEBUG) catch anything?

I have SLUB_DEBUG=y, but not SLUB_DEBUG_ON.
Apart from this message I did not see anything in the syslog.
It does not seem to be a one-time event, as it happened again on the
third boot. The stack trace was identical.
Sadly, 3 boots with slub_debug=FZP and another one with only F
did not trigger it.

But I don't think kcryptd is freeing a bio at that point.
The message said about the freed lock: (kcryptd){--..}, at: [<ffffffff80247dd9>]
(gdb) list *0xffffffff80247dd9
0xffffffff80247dd9 is in run_workqueue (include/asm/bitops_64.h:69).
64       * you should call smp_mb__before_clear_bit() and/or smp_mb__after_clear_bit()
65       * in order to ensure changes are visible on other processors.
66       */
67      static inline void clear_bit(int nr, volatile void *addr)
68      {
69              __asm__ __volatile__( LOCK_PREFIX
70                      "btrl %1,%0"
71                      :ADDR
72                      :"dIr" (nr));
73      }
increasing the addr a little bit shows:
(gdb) list *0xffffffff80247ddf
0xffffffff80247ddf is in run_workqueue (kernel/workqueue.c:275).
270             list_del_init(cwq->worklist.next);
271             spin_unlock_irq(&cwq->lock);
272
273             BUG_ON(get_wq_data(work) != cwq);
274             work_clear_pending(work);
275             lock_acquire(&cwq->wq->lockdep_map, 0, 0, 0, 2, _THIS_IP_);
276             lock_acquire(&lockdep_map, 0, 0, 0, 2, _THIS_IP_);
277             f(work);
278             lock_release(&lockdep_map, 1, _THIS_IP_);
279             lock_release(&cwq->wq->lockdep_map, 1, _THIS_IP_);


Above this acquire/release sequence is the following comment:
#ifdef CONFIG_LOCKDEP
/*
* It is permissible to free the struct work_struct
* from inside the function that is called from it,
* this we need to take into account for lockdep too.
* To avoid bogus "held lock freed" warnings as well
* as problems when looking into work->lockdep_map,
* make a copy and use that here.
*/
struct lockdep_map lockdep_map = work->lockdep_map;
#endif
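
The pattern that comment describes can be sketched in userspace C. This is only a model under assumptions: struct map_model, struct work_model, run_one() and make_work() are hypothetical stand-ins, not the kernel's definitions. It shows a runner that copies the tracking data to its own stack before calling a handler that frees the work item, and then touches only the copy.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
struct map_model {
    const char *name;
};

struct work_model {
    void (*func)(struct work_model *);
    struct map_model map;
};

/* Name of the last map "released" by run_one(), kept for inspection. */
static const char *last_released;

/* Handler that frees its own work item, which the comment says is allowed. */
static void free_self(struct work_model *w)
{
    free(w);
}

/* Model of the run_workqueue() step: copy the map, call, release the copy. */
static void run_one(struct work_model *w)
{
    struct map_model copy = w->map;            /* taken before the handler runs */
    void (*f)(struct work_model *) = w->func;  /* read before w can disappear */

    f(w);                                      /* w may be freed after this call */
    last_released = copy.name;                 /* only the stack copy is touched */
}

/* Allocate a work item whose handler frees it. */
static struct work_model *make_work(const char *name)
{
    struct work_model *w = malloc(sizeof(*w));

    assert(w != NULL);
    w->func = free_self;
    w->map.name = name;
    return w;
}
```

Calling run_one(make_work("io->work")) never dereferences the work item after the handler returns, so in this model the free inside the handler is harmless.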

Did something trigger this anyway?

Anything I could try, apart from more boots with slub_debug=F?

Torsten

2007-11-19 21:01:42

by Milan Broz

Subject: Re: 2.6.24-rc2-mm1: kcryptd vs lockdep

Torsten Kaiser wrote:
> On Nov 19, 2007 8:56 AM, Ingo Molnar <[email protected]> wrote:
>> * Torsten Kaiser <[email protected]> wrote:
...
> Above this acquire/release sequence is the following comment:
> #ifdef CONFIG_LOCKDEP
> /*
> * It is permissible to free the struct work_struct
> * from inside the function that is called from it,
> * this we need to take into account for lockdep too.
> * To avoid bogus "held lock freed" warnings as well
> * as problems when looking into work->lockdep_map,
> * make a copy and use that here.
> */
> struct lockdep_map lockdep_map = work->lockdep_map;
> #endif
>
> Did something trigger this anyway?
>
> Anything I could try, apart from more boots with slub_debug=F?

Could you please test which patch from the dm-crypt series causes this?
(agk-dm-dm-crypt* names.)

I suspect agk-dm-dm-crypt-move-bio-submission-to-thread.patch because
there is one work struct used subsequently in two threads...
(io thread already started while crypt thread is processing lockdep_map
after calling f(work)...)

(btw these patches prepare dm-crypt for next patchset introducing
async cryptoapi, so there should be no functional changes yet.)

Milan
--
[email protected]


2007-11-20 06:56:35

by Torsten Kaiser

Subject: Re: 2.6.24-rc2-mm1: kcryptd vs lockdep

On Nov 19, 2007 10:00 PM, Milan Broz <[email protected]> wrote:
> Torsten Kaiser wrote:
> > Anything I could try, apart from more boots with slub_debug=F?

One time it triggered with slub_debug=F, but no additional output.
With slub_debug=FP I have not seen it again, so I can't say if that
would yield more info.

> Please could you try which patch from the dm-crypt series cause this ?
> (agk-dm-dm-crypt* names.)
>
> I suspect agk-dm-dm-crypt-move-bio-submission-to-thread.patch because
> there is one work struct used subsequently in two threads...
> (io thread already started while crypt thread is processing lockdep_map
> after calling f(work)...)

After reverting only
agk-dm-dm-crypt-move-bio-submission-to-thread.patch I also have not
seen the 'held lock freed' message again.

If it happens again with this revert, I will post that output.

Thanks for the hint.

Torsten

2007-11-20 14:41:11

by Milan Broz

Subject: Re: 2.6.24-rc2-mm1: kcryptd vs lockdep

Torsten Kaiser wrote:
> On Nov 19, 2007 10:00 PM, Milan Broz <[email protected]> wrote:
>> Torsten Kaiser wrote:
>>> Anything I could try, apart from more boots with slub_debug=F?
>
> One time it triggered with slub_debug=F, but no additional output.
> With slub_debug=FP I have not seen it again, so I can't say if that
> would yield more info.
>
>> Please could you try which patch from the dm-crypt series cause this ?
>> (agk-dm-dm-crypt* names.)
>>
>> I suspect agk-dm-dm-crypt-move-bio-submission-to-thread.patch because
>> there is one work struct used subsequently in two threads...
>> (io thread already started while crypt thread is processing lockdep_map
>> after calling f(work)...)
>
> After reverting only
> agk-dm-dm-crypt-move-bio-submission-to-thread.patch I also have not
> seen the 'held lock freed' message again.

OK, then I have a question: is the following pseudocode correct
(so the problem is in the lock validation, which checks something
already initialized for another queue), or is reusing a work_struct
from inside the called work function not permitted?

(Note comment in code "It is permissible to free the struct
work_struct from inside the function that is called from it".)

struct work_struct work;
struct workqueue_struct *a, *b;

do_b(*work)
{
/* do something else */
}

do_a(*work)
{
/* do something */
INIT_WORK(&work, do_b);
queue_work(b, &work);
}


INIT_WORK(&work, do_a);
queue_work(a, &work);

Milan
--
[email protected]

2007-11-20 23:36:57

by Alasdair G Kergon

Subject: Re: 2.6.24-rc2-mm1: kcryptd vs lockdep

On Tue, Nov 20, 2007 at 03:40:30PM +0100, Milan Broz wrote:
> (Note comment in code "It is permissible to free the struct
> work_struct from inside the function that is called from it".)

I don't understand yet how lockdep behaves if the work struct gets
reused and the reused one finishes first.

I renamed the kcryptd functions today in an attempt to disentangle this
code a bit more.

- io->pending reference counting looks correct (though used
inconsistently when comparing READ with WRITE)

- But what happens if kcryptd_crypt_write_convert_loop() calls
INIT_WORK/queue_work twice?

Alasdair
--
[email protected]

2007-11-21 15:59:20

by Oleg Nesterov

Subject: Re: 2.6.24-rc2-mm1: kcryptd vs lockdep

Alasdair G Kergon wrote:
>
> - But what happens if kcryptd_crypt_write_convert_loop() calls
> INIT_WORK/queue_work twice?

Can't find this function. But "INIT_WORK + queue_work" twice is very
wrong of course.

Milan Broz wrote:
>
> Ok, then I have question: Is the following pseudocode correct
> (and problem is in lock validation which checks something
> already initialized for another queue) or reusing work_struct
> is not permitted from inside called work function ?
>
> (Note comment in code "It is permissible to free the struct
> work_struct from inside the function that is called from it".)
>
> struct work_struct work;
> struct workqueue_struct *a, *b;
>
> do_b(*work)
> {
> /* do something else */
> }
>
> do_a(*work)
> {
> /* do something */
> INIT_WORK(&work, do_b);
> queue_work(b, &work);
> }
>
>
> INIT_WORK(&work, do_a);
> queue_work(a, &work);

(just in case, in that particular case PREPARE_WORK() should be used)

INIT_WORK(w) can be used if we know that "w" is not pending, and nobody
else can write to this work (say, queue_work(w) or cancel_work_sync(w)).
So currently the code above should work correctly.

However, I'd say it is not correct, INIT_WORK() can throw out some debug
info for example, or the implementation could be changed.

I'm not sure about CONFIG_LOCKDEP (Johannes cc'ed). INIT_WORK() does
lockdep_init_map(->lockdep_map) but run_workqueue() has a local copy,
looks ok.
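
To make the INIT_WORK()/PREPARE_WORK() distinction concrete, here is a minimal userspace model in C. It is a sketch under assumptions, not the kernel code: work_model, init_work_model(), prepare_work_model() and queue_work_model() are hypothetical names, and the model assumes that INIT_WORK() resets the pending state while PREPARE_WORK() only swaps the function pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a work item; not the kernel's struct work_struct. */
struct work_model {
    void (*func)(struct work_model *);
    bool pending;   /* stands in for the pending bit */
    int queued;     /* how many times the item was put on a queue */
};

/* Model of PREPARE_WORK(): only the function pointer changes. */
static void prepare_work_model(struct work_model *w,
                               void (*f)(struct work_model *))
{
    w->func = f;
}

/* Model of INIT_WORK(): resets the pending state, then sets the function. */
static void init_work_model(struct work_model *w,
                            void (*f)(struct work_model *))
{
    w->pending = false;
    prepare_work_model(w, f);
}

/* Model of queue_work(): refuses an item that is already pending. */
static bool queue_work_model(struct work_model *w)
{
    assert(w->func != NULL);
    if (w->pending)
        return false;
    w->pending = true;
    w->queued++;
    return true;
}

/* Trivial handler used only to have something to queue. */
static void noop(struct work_model *w)
{
    (void)w;
}
```

In this model, re-running init_work_model() on an item that is still pending lets queue_work_model() accept it a second time, which is the double-queue hazard; prepare_work_model() leaves the pending state alone.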

Oleg.

2007-11-21 16:05:51

by Johannes Berg

Subject: Re: 2.6.24-rc2-mm1: kcryptd vs lockdep

Hi,

> > Ok, then I have question: Is the following pseudocode correct
> > (and problem is in lock validation which checks something
> > already initialized for another queue) or reusing work_struct
> > is not permitted from inside called work function ?
> >
> > (Note comment in code "It is permissible to free the struct
> > work_struct from inside the function that is called from it".)
> >
> > struct work_struct work;
> > struct workqueue_struct *a, *b;
> >
> > do_b(*work)
> > {
> > /* do something else */
> > }
> >
> > do_a(*work)
> > {
> > /* do something */
> > INIT_WORK(&work, do_b);
> > queue_work(b, &work);
> > }
> >
> >
> > INIT_WORK(&work, do_a);
> > queue_work(a, &work);
>
> (just in case, in that particular case PREPARE_WORK() should be used)
>
> INIT_WORK(w) can be used if we know that "w" is not pending, and nobody
> else can write to this work (say, queue_work(w) or cancel_work_sync(w)).
> So currently the code above should work correctly.
>
> However, I'd say it is not correct, INIT_WORK() can throw out some debug
> info for example, or the implementation could be changed.
>
> I'm not sure about CONFIG_LOCKDEP (Johannes cc'ed). INIT_WORK() does
> lockdep_init_map(->lockdep_map) but run_workqueue() has a local copy,
> looks ok.

We explicitly need to use a copy of the lockdep_map for "locking" the
work struct, as per the quoted comment. So as far as I can tell,
INIT_WORK() is changing a copy of the lockdep map that is unused at
that point, so I think it should be fine. I'm not sure about the other
fine points, nor why you'd want this, though :)

johannes



2007-11-23 10:22:19

by Torsten Kaiser

Subject: Re: 2.6.24-rc2-mm1: kcryptd vs lockdep

On Nov 20, 2007 7:55 AM, Torsten Kaiser <[email protected]> wrote:
> On Nov 19, 2007 10:00 PM, Milan Broz <[email protected]> wrote:
> > Please could you try which patch from the dm-crypt series cause this ?
> > (agk-dm-dm-crypt* names.)
> >
> > I suspect agk-dm-dm-crypt-move-bio-submission-to-thread.patch because
> > there is one work struct used subsequently in two threads...
> > (io thread already started while crypt thread is processing lockdep_map
> > after calling f(work)...)
>
> After reverting only
> agk-dm-dm-crypt-move-bio-submission-to-thread.patch I also have not
> seen the 'held lock freed' message again.
>
> If it happens again with this revert, I will post that output.

It happened again; here is the output:
Nov 23 10:56:17 treogen [ 58.364441] XFS mounting filesystem dm-0
Nov 23 10:56:17 treogen [ 58.519648] Ending clean XFS mount for filesystem: dm-0
Nov 23 10:56:17 treogen [ 58.858098]
Nov 23 10:56:17 treogen [ 58.858104] =========================
Nov 23 10:56:17 treogen [ 58.863316] [ BUG: held lock freed! ]
Nov 23 10:56:17 treogen [ 58.866998] -------------------------
Nov 23 10:56:17 treogen [ 58.870685] kcryptd/1022 is freeing memory FFFF81011EAD4B00-FFFF81011EAD4B3F, with a lock still held there!
Nov 23 10:56:17 treogen [ 58.880430] (kcryptd){--..}, at: [<ffffffff80247dd9>] run_workqueue+0x129/0x210
Nov 23 10:56:17 treogen [ 58.888014] 2 locks held by kcryptd/1022:
Nov 23 10:56:17 treogen [ 58.892045] #0: (kcryptd){--..}, at: [<ffffffff80247dd9>] run_workqueue+0x129/0x210
Nov 23 10:56:17 treogen [ 58.900095] #1: (&io->work#2){--..}, at: [<ffffffff80247dd9>] run_workqueue+0x129/0x210
Nov 23 10:56:17 treogen [ 58.908535]
Nov 23 10:56:17 treogen [ 58.908535] stack backtrace:
Nov 23 10:56:17 treogen [ 58.912954]
Nov 23 10:56:17 treogen [ 58.912955] Call Trace:
Nov 23 10:56:17 treogen [ 58.916944] [<ffffffff8025a5f0>] debug_check_no_locks_freed+0x190/0x1b0
Nov 23 10:56:17 treogen [ 58.924313] [<ffffffff8026f192>] mempool_free_slab+0x12/0x20
Nov 23 10:56:17 treogen [ 58.930073] [<ffffffff80296bb9>] kmem_cache_free+0x79/0xe0
Nov 23 10:56:17 treogen [ 58.935665] [<ffffffff8026f192>] mempool_free_slab+0x12/0x20
Nov 23 10:56:17 treogen [ 58.941424] [<ffffffff8026f22a>] mempool_free+0x8a/0xa0
Nov 23 10:56:17 treogen [ 58.946755] [<ffffffff802c76af>] bio_free+0x2f/0x50
Nov 23 10:56:17 treogen [ 58.951736] [<ffffffff804e36fd>] dm_bio_destructor+0xd/0x10
Nov 23 10:56:17 treogen [ 58.957414] [<ffffffff802c7436>] bio_put+0x26/0x30
Nov 23 10:56:17 treogen [ 58.962311] [<ffffffff804e3af3>] clone_endio+0x83/0xb0
Nov 23 10:56:17 treogen [ 58.967553] [<ffffffff804eb860>] kcryptd_do_crypt+0x0/0x290
Nov 23 10:56:17 treogen [ 58.973224] [<ffffffff802c72a9>] bio_endio+0x19/0x40
Nov 23 10:56:17 treogen [ 58.978291] [<ffffffff804eb372>] crypt_dec_pending+0x32/0x50
Nov 23 10:56:17 treogen [ 58.984050] [<ffffffff804eb8c4>] kcryptd_do_crypt+0x64/0x290
Nov 23 10:56:17 treogen [ 58.989810] [<ffffffff804eb860>] kcryptd_do_crypt+0x0/0x290
Nov 23 10:56:17 treogen [ 58.995483] [<ffffffff804eb860>] kcryptd_do_crypt+0x0/0x290
Nov 23 10:56:17 treogen [ 59.001158] [<ffffffff80247e25>] run_workqueue+0x175/0x210
Nov 23 10:56:17 treogen [ 59.006746] [<ffffffff80248af1>] worker_thread+0x71/0xb0
Nov 23 10:56:17 treogen [ 59.012158] [<ffffffff8024c830>] autoremove_wake_function+0x0/0x40
Nov 23 10:56:17 treogen [ 59.018435] [<ffffffff80248a80>] worker_thread+0x0/0xb0
Nov 23 10:56:17 treogen [ 59.023764] [<ffffffff8024c43d>] kthread+0x4d/0x80
Nov 23 10:56:17 treogen [ 59.028660] [<ffffffff8020cbc8>] child_rip+0xa/0x12
Nov 23 10:56:17 treogen [ 59.033640] [<ffffffff8020c2df>] restore_args+0x0/0x30
Nov 23 10:56:17 treogen [ 59.038880] [<ffffffff8024c3f0>] kthread+0x0/0x80
Nov 23 10:56:17 treogen [ 59.043689] [<ffffffff8020cbbe>] child_rip+0x0/0x12
Nov 23 10:56:17 treogen [ 59.048670]
Nov 23 10:56:17 treogen [ 59.050190] INFO: lockdep is turned off.
Nov 23 10:56:17 treogen [ 59.919020] pata_amd 0000:00:04.0: version 0.3.10

From what I can see, the only difference between the other stack traces
and this one is the following part:

old traces with agk-dm-dm-crypt-move-bio-submission-to-thread.patch applied:
[ 64.584415] [<ffffffff802c76af>] bio_free+0x2f/0x50
[ 64.586337] [<ffffffff802c76e0>] bio_fs_destructor+0x10/0x20
[ 64.588558] [<ffffffff802c7436>] bio_put+0x26/0x30
[ 64.590446] [<ffffffff803834d9>] xfs_buf_bio_end_io+0x99/0x120
[ 64.592734] [<ffffffff802c72a9>] bio_endio+0x19/0x40
[ 64.594687] [<ffffffff804e3827>] dec_pending+0x107/0x210
[ 64.596775] [<ffffffff804e3ae0>] clone_endio+0x70/0xb0

new trace with agk-dm-dm-crypt-move-bio-submission-to-thread.patch reverted:
[ 58.946755] [<ffffffff802c76af>] bio_free+0x2f/0x50
[ 58.951736] [<ffffffff804e36fd>] dm_bio_destructor+0xd/0x10
[ 58.957414] [<ffffffff802c7436>] bio_put+0x26/0x30
[ 58.962311] [<ffffffff804e3af3>] clone_endio+0x83/0xb0

(gdb) list *0xffffffff804e3ae0
0xffffffff804e3ae0 is in clone_endio (drivers/md/dm.c:539).
534             dec_pending(tio->io, error);
535
536             /*
537              * Store md for cleanup instead of tio which is about to get freed.
538              */
539             bio->bi_private = md->bs;
540
541             bio_put(bio);
542             free_tio(md, tio);
543     }

(gdb) list *0xffffffff804e3af3
0xffffffff804e3af3 is in clone_endio (drivers/md/dm.c:542).
537              * Store md for cleanup instead of tio which is about to get freed.
538              */
539             bio->bi_private = md->bs;
540
541             bio_put(bio);
542             free_tio(md, tio);
543     }
544
545     static sector_t max_io_len(struct mapped_device *md,
546                                sector_t sector, struct dm_target *ti)

Torsten

2007-11-23 22:42:48

by Torsten Kaiser

Subject: Re: 2.6.24-rc2-mm1: kcryptd vs lockdep

On Nov 19, 2007 10:00 PM, Milan Broz <[email protected]> wrote:
> Torsten Kaiser wrote:
> > On Nov 19, 2007 8:56 AM, Ingo Molnar <[email protected]> wrote:
> >> * Torsten Kaiser <[email protected]> wrote:
> ...
> > Above this acquire/release sequence is the following comment:
> > #ifdef CONFIG_LOCKDEP
> > /*
> > * It is permissible to free the struct work_struct
> > * from inside the function that is called from it,
> > * this we need to take into account for lockdep too.
> > * To avoid bogus "held lock freed" warnings as well
> > * as problems when looking into work->lockdep_map,
> > * make a copy and use that here.
> > */
> > struct lockdep_map lockdep_map = work->lockdep_map;
> > #endif
> >
> > Did something trigger this anyway?
> >
> > Anything I could try, apart from more boots with slub_debug=F?
>
> Please could you try which patch from the dm-crypt series cause this ?
> (agk-dm-dm-crypt* names.)
>
> I suspect agk-dm-dm-crypt-move-bio-submission-to-thread.patch because
> there is one work struct used subsequently in two threads...
> (io thread already started while crypt thread is processing lockdep_map
> after calling f(work)...)
>
> (btw these patches prepare dm-crypt for next patchset introducing
> async cryptoapi, so there should be no functional changes yet.)

I looked at all of these agk-* patches, because the error is not
bisectable: it triggers too unreliably.
The one that looks suspicious is agk-dm-dm-crypt-tidy-io-ref-counting.patch

This one makes a functional change, as there is now an additional ref
on io->pending. Instead of only increasing io->pending if there really
is more than one clone bio, it now takes an additional ref in
crypt_write_io_process().

I certainly agree with the cleanup, but this introduces the following change:

Before the cleanup, *all* calls to crypt_dec_pending() were made via crypt_endio().
Now there is an additional call to crypt_dec_pending() to balance the
additional ref taken in crypt_write_io_process(). And that one is
not called from whatever context/thread cleans up after
generic_make_request(), but directly in the context/thread of the caller
of crypt_write_io_process(), and that is kcryptd.

So now it is possible (if all requests finish before
crypt_write_io_process() returns) that kcryptd itself will release the
bio, but the workqueue infrastructure still seems to have a lock on
that.
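
A small userspace model of that reference counting scheme, in C. It is only a sketch under assumptions: io_model, io_put() and write_model() are hypothetical names, and completions are simulated in the same thread. It shows that once the submitter holds its own reference, the context that drops the last reference depends on whether the clones complete before or after the submitter's final put.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Which context dropped the last reference in the model. */
enum ctx { CTX_NONE, CTX_ENDIO, CTX_KCRYPTD };

/* Hypothetical stand-in for the dm-crypt io structure. */
struct io_model {
    atomic_int pending;
    enum ctx freed_by;
};

/* Drop one reference; whoever sees the old value 1 dropped the last one. */
static void io_put(struct io_model *io, enum ctx who)
{
    if (atomic_fetch_sub(&io->pending, 1) == 1)
        io->freed_by = who;
}

/*
 * Model of the write path: the submitter holds one reference of its own,
 * takes one more per clone, and drops its own reference at the end.
 * complete_early simulates every clone finishing before that final put.
 */
static enum ctx write_model(int clones, bool complete_early)
{
    struct io_model io = { .freed_by = CTX_NONE };
    int i;

    assert(clones >= 0);
    atomic_init(&io.pending, 1);               /* submitter's own reference */

    for (i = 0; i < clones; i++) {
        atomic_fetch_add(&io.pending, 1);      /* one reference per clone */
        if (complete_early)
            io_put(&io, CTX_ENDIO);            /* clone already completed */
    }

    io_put(&io, CTX_KCRYPTD);                  /* balance the submitter's ref */

    if (!complete_early)
        for (i = 0; i < clones; i++)
            io_put(&io, CTX_ENDIO);            /* clones complete later */

    return io.freed_by;
}
```

With early completion the final put comes from the submitter (kcryptd in the description above); with late completion it comes from the completion path.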

But as the comment in run_workqueue() says, this should be legal, and I
can't figure out what would make the lockdep copy mechanism fail.
Especially if the trigger really was a WRITE request: with
agk-dm-dm-crypt-move-bio-submission-to-thread.patch reverted, this
should never use the kcryptd_io workqueue, so there should not even be
the problem of using INIT_WORK twice on the same work_struct.

... or I just don't see the bug.

Torsten

2007-11-24 03:49:37

by Alasdair G Kergon

Subject: Re: 2.6.24-rc2-mm1: kcryptd vs lockdep

On Fri, Nov 23, 2007 at 11:42:36PM +0100, Torsten Kaiser wrote:
> ... or I just don't see the bug.

See my earlier post in this thread: there's a race in the write loop
where a work struct could be used twice on the same queue.
(Needs data structure change to fix that, which nobody has attempted
to do yet.)

BTW To eliminate any internal lockdep concerns (and people say there
should be no problem) temporarily add a second struct instead of reusing
one on two queues.

Alasdair
--
[email protected]

2007-11-24 04:03:41

by Alasdair G Kergon

Subject: Re: 2.6.24-rc2-mm1: kcryptd vs lockdep

Also io->pending may need better protection - atomic, but missing memory
barriers? (May be getting away without sometimes due to side-effects of
other function calls, but needs doing properly.)

[BTW Other device-mapper atomic_t usage also needs reviewing.]

Alasdair
--
[email protected]

2007-11-24 04:13:52

by Alasdair G Kergon

Subject: Re: 2.6.24-rc2-mm1: kcryptd vs lockdep

On Fri, Nov 23, 2007 at 11:42:36PM +0100, Torsten Kaiser wrote:
> Before the cleanup *all* calls to crypt_dec_pending() was via crypt_endio().
> Now there is an additional call to crypt_dec_pending() to balance the
> additional ref placed into crypt_write_io_process(). And that one is
> not called from whatever context/thread cleans up after
> make_generic_request, but directly in the context/thread of the caller
> of crypt_write_io_process(), and that is kcryptd.

Please do look at the latest patches (always at
http://www.kernel.org/pub/linux/kernel/people/agk/patches/2.6/editing/series.html )
where you'll see I've already disentangled the mess of functions
and given them more understandable names, so at least following the program
flow is easier.

Read and write do the ref counting differently (but correctly AFAICT) - I want
that changing, but held back from doing it without first checking whether the
later patches (not yet reviewed) provide a reason to prefer one method
over the other.

Alasdair
--
[email protected]

2007-11-24 04:57:30

by Torsten Kaiser

Subject: Re: 2.6.24-rc2-mm1: kcryptd vs lockdep

On Nov 24, 2007 4:49 AM, Alasdair G Kergon <[email protected]> wrote:
> On Fri, Nov 23, 2007 at 11:42:36PM +0100, Torsten Kaiser wrote:
> > ... or I just don't see the bug.
>
> See my earlier post in this thread: there's a race in the write loop
> where a work struct could be used twice on the same queue.
> (Needs data structure change to fix that, which nobody has attempted
> to do yet.)

As I wrote in an earlier post:
I did see this lockdep message even with
agk-dm-dm-crypt-move-bio-submission-to-thread.patch reverted, so the
work struct is not used in the write loop.

> BTW To eliminate any internal lockdep concerns (and people say there
> should be no problem) temporarily add a second struct instead of reusing
> one on two queues.

I think this might really be a lockdep bug, but as I'm not fluent
enough in C, please check whether my logic is correct:

The freed-locked-lock-test is the only function that uses this in lockdep.c:
static inline int in_range(const void *start, const void *addr, const void *end)
{
return addr >= start && addr <= end;
}
This will return true, if addr is in the range of start (including)
to end (including).

But debug_check_no_locks_freed() seems does:
const void *mem_to = mem_from + mem_len
-> mem_to is the last byte of the freed range, that fits in_range
lock_from = (void *)hlock->instance;
-> first byte of the lock
lock_to = (void *)(hlock->instance + 1);
-> first byte of the next lock, not last byte of the lock that is being checked!
(Or am I reading this wrong?)

The test is:
if (!in_range(mem_from, lock_from, mem_to) &&
!in_range(mem_from, lock_to, mem_to))
continue;
So it tests, if the first byte of the lock is in the range that is freed ->OK
And if the first byte of the *next* lock is in the range that is freed
-> Not OK.
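
The boundary case described above can be sketched in userspace C. This is only an illustration, not the kernel code: `old_check_flags()` is a hypothetical helper that mirrors the quoted test on plain integer addresses, and the addresses and sizes in the usage note are made up. Since `mem_from + mem_len` and `lock_from + lock_len` are each one past the last byte of their object, the inclusive comparison also matches objects that merely touch the freed range.

```c
#include <stdint.h>

/* Same inclusive comparison as the quoted in_range(), on integer addresses. */
static int in_range(uintptr_t start, uintptr_t addr, uintptr_t end)
{
    return addr >= start && addr <= end;
}

/* Hypothetical helper: returns 1 if the quoted test would report the lock. */
static int old_check_flags(uintptr_t mem_from, uintptr_t mem_len,
                           uintptr_t lock_from, uintptr_t lock_len)
{
    uintptr_t mem_to = mem_from + mem_len;    /* one past the last freed byte */
    uintptr_t lock_to = lock_from + lock_len; /* one past the last lock byte */

    /* Negation of the "continue" condition in the quoted loop. */
    return in_range(mem_from, lock_from, mem_to) ||
           in_range(mem_from, lock_to, mem_to);
}
```

With 0x40 bytes freed at 0x1000, `old_check_flags(0x1000, 0x40, 0x1040, 0x40)` and `old_check_flags(0x1000, 0x40, 0x0fc0, 0x40)` both return 1, although neither lock shares a byte with the freed range; a lock at 0x2000 is correctly ignored.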

That would also explain the rather strange output:
=========================
[ BUG: held lock freed! ]
-------------------------
kcryptd/1022 is freeing memory
FFFF81011EBEFB00-FFFF81011EBEFB3F, with a lock still held there!
(kcryptd){--..}, at: [<ffffffff80247dd9>] run_workqueue+0x129/0x210
2 locks held by kcryptd/1022:
#0: (kcryptd){--..}, at: [<ffffffff80247dd9>] run_workqueue+0x129/0x210
#1: (&io->work#2){--..}, at: [<ffffffff80247dd9>] run_workqueue+0x129/0x210

That claims that the lock of the *workqueue* struct, not the work
struct, is getting freed!
But I'm still happily using the dm-crypt device, even 19 hours after
that message.

So my current best guess as to the source of this message is that, with
the change in the ref counting, it is now possible that the work struct
really is freed before the workqueue function returns. But as
the comment in run_workqueue() says, that is still legal.
But now the first byte of the next lock is part of the freed memory,
and so the wrong "held lock freed" is triggered.

Torsten

2007-11-24 06:38:52

by Herbert Xu

Subject: Re: 2.6.24-rc2-mm1: kcryptd vs lockdep

Alasdair G Kergon <[email protected]> wrote:
> Also io->pending may need better protection - atomic, but missing memory
> barriers? (May be getting away without sometimes due to side-effects of
> other function calls, but needs doing properly.)

If it's using atomic_dec_and_test then that comes with an implicit
memory barrier.
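
As a rough userspace analogue of that point (an assumption-laden sketch using C11 `<stdatomic.h>`, not the kernel's `atomic_dec_and_test()`), a value-returning atomic decrement with the default sequentially consistent ordering needs no extra barrier around it. `struct io` and `io_put()` here are hypothetical names for illustration and do not come from dm-crypt.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for a per-request structure with a pending count. */
struct io {
    atomic_int pending;
    int error;          /* plain field, written before the final put */
};

/*
 * Drop one reference; returns 1 if this call released the last one.
 * atomic_fetch_sub() defaults to memory_order_seq_cst, so writes this
 * thread made to io->error beforehand are ordered before the decrement,
 * and the caller that sees the count reach zero also sees the writes
 * other threads made before their own decrements.
 */
static int io_put(struct io *io)
{
    return atomic_fetch_sub(&io->pending, 1) == 1;
}
```

Only the caller for which `io_put()` returns 1 may go on to complete or free the request.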

Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

2007-11-24 10:53:56

by Oleg Nesterov

Subject: [PATCH] debug_check_no_locks_freed: fix in_range() checks

Torsten, could you ack/nack this patch?

Torsten Kaiser wrote:
>
> static inline int in_range(const void *start, const void *addr, const void *end)
> {
> return addr >= start && addr <= end;
> }
> This returns true if addr lies in the range from start (inclusive)
> to end (inclusive).
>
> But debug_check_no_locks_freed() seems to do:
> const void *mem_to = mem_from + mem_len
> -> mem_to is the last byte of the freed range, that fits in_range
> lock_from = (void *)hlock->instance;
> -> first byte of the lock
> lock_to = (void *)(hlock->instance + 1);
> -> first byte of the next lock, not last byte of the lock that is being checked!
>
> The test is:
> if (!in_range(mem_from, lock_from, mem_to) &&
> !in_range(mem_from, lock_to, mem_to))
> continue;
> So it tests, if the first byte of the lock is in the range that is freed ->OK
> And if the first byte of the *next* lock is in the range that is freed
> -> Not OK.

We can also simplify the in_range checks: we need only 2 comparisons, not 4.
If the lock is not in the memory range, it must lie either to the left of
the range or to the right.

Signed-off-by: Oleg Nesterov <[email protected]>

--- 24/kernel/lockdep.c~ 2007-11-09 12:57:31.000000000 +0300
+++ 24/kernel/lockdep.c 2007-11-24 13:32:52.000000000 +0300
@@ -3054,11 +3054,6 @@ void __init lockdep_info(void)
#endif
}

-static inline int in_range(const void *start, const void *addr, const void *end)
-{
- return addr >= start && addr <= end;
-}
-
static void
print_freed_lock_bug(struct task_struct *curr, const void *mem_from,
const void *mem_to, struct held_lock *hlock)
@@ -3080,6 +3075,13 @@ print_freed_lock_bug(struct task_struct
dump_stack();
}

+static inline int not_in_range(const void* mem_from, unsigned long mem_len,
+ const void* lock_from, unsigned long lock_len)
+{
+ return lock_from + lock_len <= mem_from ||
+ mem_from + mem_len <= lock_from;
+}
+
/*
* Called when kernel memory is freed (or unmapped), or if a lock
* is destroyed or reinitialized - this code checks whether there is
@@ -3087,7 +3089,6 @@ print_freed_lock_bug(struct task_struct
*/
void debug_check_no_locks_freed(const void *mem_from, unsigned long mem_len)
{
- const void *mem_to = mem_from + mem_len, *lock_from, *lock_to;
struct task_struct *curr = current;
struct held_lock *hlock;
unsigned long flags;
@@ -3100,14 +3101,11 @@ void debug_check_no_locks_freed(const vo
for (i = 0; i < curr->lockdep_depth; i++) {
hlock = curr->held_locks + i;

- lock_from = (void *)hlock->instance;
- lock_to = (void *)(hlock->instance + 1);
-
- if (!in_range(mem_from, lock_from, mem_to) &&
- !in_range(mem_from, lock_to, mem_to))
+ if (not_in_range(mem_from, mem_len, hlock->instance,
+ sizeof(*hlock->instance)))
continue;

- print_freed_lock_bug(curr, mem_from, mem_to, hlock);
+ print_freed_lock_bug(curr, mem_from, mem_from + mem_len, hlock);
break;
}
local_irq_restore(flags);
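
The two-comparison test in the patch treats both objects as half-open intervals. A small userspace sketch of the same comparisons on integer addresses follows; it is only an illustration, and `shares_a_byte()` is a hypothetical brute-force reference added here for checking, not part of the patch.

```c
#include <stdint.h>

/*
 * Same two comparisons as not_in_range() in the patch, on integer addresses:
 * [lock_from, lock_from + lock_len) and [mem_from, mem_from + mem_len)
 * are disjoint exactly when one ends at or before the start of the other.
 */
static int not_in_range(uintptr_t mem_from, uintptr_t mem_len,
                        uintptr_t lock_from, uintptr_t lock_len)
{
    return lock_from + lock_len <= mem_from ||
           mem_from + mem_len <= lock_from;
}

/* Hypothetical brute-force reference: does any byte belong to both ranges? */
static int shares_a_byte(uintptr_t mem_from, uintptr_t mem_len,
                         uintptr_t lock_from, uintptr_t lock_len)
{
    for (uintptr_t m = mem_from; m < mem_from + mem_len; m++)
        for (uintptr_t l = lock_from; l < lock_from + lock_len; l++)
            if (m == l)
                return 1;
    return 0;
}
```

For non-empty ranges, `not_in_range()` agrees with `!shares_a_byte()`: a lock that only touches either end of the freed range is skipped, while a lock that is larger than the freed range and contains it is still reported.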

2007-11-24 12:18:46

by Torsten Kaiser

Subject: Re: [PATCH] debug_check_no_locks_freed: fix in_range() checks

On Nov 24, 2007 11:53 AM, Oleg Nesterov <[email protected]> wrote:
> Torsten, could you ack/nack this patch?

From looking at the code I would ack it.
I will reapply agk-dm-dm-crypt-move-bio-submission-to-thread.patch and
this patch and boot several times, but as the message was not
triggered on every boot, this can't prove anything.

But if it happens again, I will notify you.

Torsten


2007-11-24 12:23:38

by Ingo Molnar

Subject: Re: [PATCH] debug_check_no_locks_freed: fix in_range() checks


* Oleg Nesterov <[email protected]> wrote:

> > But debug_check_no_locks_freed() seems to do:
> > const void *mem_to = mem_from + mem_len
> > -> mem_to is the last byte of the freed range, that fits in_range
> > lock_from = (void *)hlock->instance;
> > -> first byte of the lock
> > lock_to = (void *)(hlock->instance + 1);
> > -> first byte of the next lock, not last byte of the lock that is being checked!
> >
> > The test is:
> > if (!in_range(mem_from, lock_from, mem_to) &&
> > !in_range(mem_from, lock_to, mem_to))
> > continue;
> > So it tests, if the first byte of the lock is in the range that is freed ->OK
> > And if the first byte of the *next* lock is in the range that is freed
> > -> Not OK.

thanks, applied.

Ingo

2007-11-24 12:25:25

by Oleg Nesterov

Subject: Re: [PATCH] debug_check_no_locks_freed: fix in_range() checks

On 11/24, Torsten Kaiser wrote:
>
> On Nov 24, 2007 11:53 AM, Oleg Nesterov <[email protected]> wrote:
> > Torsten, could you ack/nack this patch?
>
> From looking at the code I would ack it.

Great.

> I will reapply agk-dm-dm-crypt-move-bio-submission-to-thread.patch and
> this patch and boot several times, but as the message was not
> triggered on every boot, this can't prove anything.

Regardless of any other possible problems, I think you found a real bug
in debug_check_no_locks_freed() which should be fixed.

Oleg.

2007-11-24 12:35:52

by Alasdair G Kergon

Subject: Re: [PATCH] debug_check_no_locks_freed: fix in_range() checks

On Sat, Nov 24, 2007 at 01:18:35PM +0100, Torsten Kaiser wrote:
> I will reapply agk-dm-dm-crypt-move-bio-submission-to-thread.patch and
> this patch and boot several times

OK for a test system, but until the write loop problem is addressed I believe
there's a risk of data corruption under low memory conditions. (I dropped it
from the export to -mm a few days ago for this reason.)

Alasdair
--
[email protected]