Doing ipsec produces a spinlock recursion warning.
This is due to not disabling BH during crypto completion function.
Fixes: 59ca6c93387d3 ("virtio-crypto: implement RSA algorithm")
Reported-by: Halil Pasic <[email protected]>
Signed-off-by: Gonglei <[email protected]>
---
drivers/crypto/virtio/virtio_crypto_akcipher_algs.c | 3 ++-
drivers/crypto/virtio/virtio_crypto_skcipher_algs.c | 3 ++-
2 files changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/crypto/virtio/virtio_crypto_akcipher_algs.c b/drivers/crypto/virtio/virtio_crypto_akcipher_algs.c
index 2621ff8a9376..19e2898977d3 100644
--- a/drivers/crypto/virtio/virtio_crypto_akcipher_algs.c
+++ b/drivers/crypto/virtio/virtio_crypto_akcipher_algs.c
@@ -61,8 +61,9 @@ static void virtio_crypto_akcipher_finalize_req(
vc_akcipher_req->src_buf = NULL;
vc_akcipher_req->dst_buf = NULL;
virtcrypto_clear_request(&vc_akcipher_req->base);
-
+ local_bh_disable();
crypto_finalize_akcipher_request(vc_akcipher_req->base.dataq->engine, req, err);
+ local_bh_enable();
}
static void virtio_crypto_dataq_akcipher_callback(struct virtio_crypto_request *vc_req, int len)
diff --git a/drivers/crypto/virtio/virtio_crypto_skcipher_algs.c b/drivers/crypto/virtio/virtio_crypto_skcipher_algs.c
index 23c41d87d835..661c1102583e 100644
--- a/drivers/crypto/virtio/virtio_crypto_skcipher_algs.c
+++ b/drivers/crypto/virtio/virtio_crypto_skcipher_algs.c
@@ -566,9 +566,10 @@ static void virtio_crypto_skcipher_finalize_req(
AES_BLOCK_SIZE, 0);
kfree_sensitive(vc_sym_req->iv);
virtcrypto_clear_request(&vc_sym_req->base);
-
+ local_bh_disable();
crypto_finalize_skcipher_request(vc_sym_req->base.dataq->engine,
req, err);
+ local_bh_enable();
}
static struct virtio_crypto_algo virtio_crypto_algs[] = { {
--
2.23.0
[..]
> --- a/drivers/crypto/virtio/virtio_crypto_akcipher_algs.c
> +++ b/drivers/crypto/virtio/virtio_crypto_akcipher_algs.c
> @@ -61,8 +61,9 @@ static void virtio_crypto_akcipher_finalize_req(
> vc_akcipher_req->src_buf = NULL;
> vc_akcipher_req->dst_buf = NULL;
> virtcrypto_clear_request(&vc_akcipher_req->base);
> -
> + local_bh_disable();
> crypto_finalize_akcipher_request(vc_akcipher_req->base.dataq->engine, req, err);
> + local_bh_enable();
Thanks Gonglei!
I did this a quick spin, and it does not seem to be sufficient on s390x.
Which does not come as a surprise to me, because
#define lockdep_assert_in_softirq() \
do { \
WARN_ON_ONCE(__lockdep_enabled && \
(!in_softirq() || in_irq() || in_nmi())); \
} while (0)
will still warn because in_irq() still evaluates to true (your patch
addresses the !in_softirq() part).
I don't have any results on x86 yet. My current understanding is that the
virtio-pci transport code disables interrupts locally somewhere in the
call chain (actually in vp_vring_interrupt() via spin_lock_irqsave())
and then x86 would be fine. But I will get that verified.
On the other hand virtio_airq_handler() calls vring_interrupt() with
interrupts enabled. (While vring_interrupt() is called in a (read)
critical section in virtio_airq_handler() we use read_lock() and
not read_lock_irqsave() to grab the lock. Whether that is correct in
it self (i.e. disregarding the crypto problem) or not I'm not sure right
now. Will think some more about it tomorrow.) If the way to go forward
is disabling interrupts in virtio-ccw before vring_interrupt() is
called, I would be glad to spin a patch for that.
Copying Conny, as she may have an opinion on this (if I'm not wrong she
authored that code).
Regards,
Halil
> -----Original Message-----
> From: Michael S. Tsirkin [mailto:[email protected]]
> Sent: Wednesday, September 27, 2023 1:14 AM
> To: Halil Pasic <[email protected]>
> Cc: Gonglei (Arei) <[email protected]>; Herbert Xu
> <[email protected]>; [email protected]; Marc
> Hartmayer <[email protected]>; Jason Wang
> <[email protected]>; [email protected];
> [email protected]; [email protected]; Cornelia Huck
> <[email protected]>
> Subject: Re: [PATCH] crypto: virtio-crypto: call finalize with bh disabled
>
> On Tue, Sep 26, 2023 at 06:41:58PM +0200, Halil Pasic wrote:
> > [..]
> > > --- a/drivers/crypto/virtio/virtio_crypto_akcipher_algs.c
> > > +++ b/drivers/crypto/virtio/virtio_crypto_akcipher_algs.c
> > > @@ -61,8 +61,9 @@ static void virtio_crypto_akcipher_finalize_req(
> > > vc_akcipher_req->src_buf = NULL;
> > > vc_akcipher_req->dst_buf = NULL;
> > > virtcrypto_clear_request(&vc_akcipher_req->base);
> > > -
> > > + local_bh_disable();
> > >
> > > crypto_finalize_akcipher_request(vc_akcipher_req->base.dataq->engine
> > > , req, err);
> > > + local_bh_enable();
> >
> > Thanks Gonglei!
> >
> > I did this a quick spin, and it does not seem to be sufficient on s390x.
> > Which does not come as a surprise to me, because
> >
> > #define lockdep_assert_in_softirq()
> \
> > do
> {
> \
> > WARN_ON_ONCE(__lockdep_enabled &&
> \
> > (!in_softirq() || in_irq() || in_nmi())); \
> > } while (0)
> >
> > will still warn because in_irq() still evaluates to true (your patch
> > addresses the !in_softirq() part).
> >
> > I don't have any results on x86 yet. My current understanding is that
> > the virtio-pci transport code disables interrupts locally somewhere in
> > the call chain (actually in vp_vring_interrupt() via
> > spin_lock_irqsave()) and then x86 would be fine. But I will get that verified.
> >
> > On the other hand virtio_airq_handler() calls vring_interrupt() with
> > interrupts enabled. (While vring_interrupt() is called in a (read)
> > critical section in virtio_airq_handler() we use read_lock() and not
> > read_lock_irqsave() to grab the lock. Whether that is correct in it
> > self (i.e. disregarding the crypto problem) or not I'm not sure right
> > now. Will think some more about it tomorrow.) If the way to go forward
> > is disabling interrupts in virtio-ccw before vring_interrupt() is
> > called, I would be glad to spin a patch for that.
> >
> > Copying Conny, as she may have an opinion on this (if I'm not wrong
> > she authored that code).
> >
> > Regards,
> > Halil
>
> On a related note, config change callback is also handled incorrectly in this
> driver, it takes a mutex from interrupt context.
Good catch. Will fix it.
Thanks.
-Gonglei
On Wed, 27 Sep 2023 12:08:43 +0200
Cornelia Huck <[email protected]> wrote:
> > On the other hand virtio_airq_handler() calls vring_interrupt() with
> > interrupts enabled. (While vring_interrupt() is called in a (read)
> > critical section in virtio_airq_handler() we use read_lock() and
> > not read_lock_irqsave() to grab the lock. Whether that is correct in
> > it self (i.e. disregarding the crypto problem) or not I'm not sure right
> > now. Will think some more about it tomorrow.) If the way to go forward
> > is disabling interrupts in virtio-ccw before vring_interrupt() is
> > called, I would be glad to spin a patch for that.
>
> virtio_airq_handler() is supposed to be an interrupt handler for an
> adapter interrupt -- as such I would expect it to always run with
> interrupts disabled (and I'd expect vring_interrupt() to be called
> with interrupts disabled as well; if that's not the case, I think it
> would need to run asynchronously.) At least that was my understanding at
> the time I wrote the code.
Thanks Connie! I don't quite understand what do you mean by "run with
interrupts disabled" in this context.
Do you mean that if I were to add the following warning:
diff --git a/drivers/s390/virtio/virtio_ccw.c b/drivers/s390/virtio/virtio_ccw.c
index ac67576301bf..2a9c73f5964f 100644
--- a/drivers/s390/virtio/virtio_ccw.c
+++ b/drivers/s390/virtio/virtio_ccw.c
@@ -211,6 +211,8 @@ static void virtio_airq_handler(struct airq_struct *airq,
struct airq_info *info = container_of(airq, struct airq_info, airq);
unsigned long ai;
+ WARN_ONCE(in_irq(), "irqs are ought to be disabled but are not\n");
+
inc_irq_stat(IRQIO_VAI);
it would/should never trigger, or do you mean something different?
If yes, does that mean that you would expect the common airq code (i.e. something
like do_airq_interrupt()) to disable interrupts, or call virtio_airq_handler()?
asynchronously sort of as a bottom half (my understanding of bottom halves is currently
not complete).
If no what do you actually mean?
Regards,
Halil
Regards,
Halil
On Tue, 26 Sep 2023 13:13:51 -0400
"Michael S. Tsirkin" <[email protected]> wrote:
> > On the other hand virtio_airq_handler() calls vring_interrupt() with
> > interrupts enabled. (While vring_interrupt() is called in a (read)
> > critical section in virtio_airq_handler() we use read_lock() and
> > not read_lock_irqsave() to grab the lock. Whether that is correct in
> > it self (i.e. disregarding the crypto problem) or not I'm not sure right
> > now. Will think some more about it tomorrow.) If the way to go forward
> > is disabling interrupts in virtio-ccw before vring_interrupt() is
> > called, I would be glad to spin a patch for that.
> >
> > Copying Conny, as she may have an opinion on this (if I'm not wrong she
> > authored that code).
> >
> > Regards,
> > Halil
>
> On a related note, config change callback is also handled incorrectly
> in this driver, it takes a mutex from interrupt context.
Thanks Michael! I intend to give Connie a little more time to chime in.
Assumed no controversies emerge I intend to cook up two patches for
the two issues later during the day.
Regards,
Halil
On Wed, 27 Sep 2023 09:24:09 +0000
"Gonglei (Arei)" <[email protected]> wrote:
> > On a related note, config change callback is also handled incorrectly in this
> > driver, it takes a mutex from interrupt context.
>
> Good catch. Will fix it.
Thanks Gonglei! Sorry I first misunderstood this as a problem within the
virtio-ccw driver, but it is actually about virtio-crypto. Thanks for
fixing this!
Regards,
Halil
On Tue, 26 Sep 2023 18:41:58 +0200
Halil Pasic <[email protected]> wrote:
> > + local_bh_disable();
> > crypto_finalize_akcipher_request(vc_akcipher_req->base.dataq->engine, req, err);
> > + local_bh_enable();
>
> Thanks Gonglei!
>
> I did this a quick spin, and it does not seem to be sufficient on s390x.
> Which does not come as a surprise to me, because
>
> #define lockdep_assert_in_softirq() \
> do { \
> WARN_ON_ONCE(__lockdep_enabled && \
> (!in_softirq() || in_irq() || in_nmi())); \
> } while (0)
>
> will still warn because in_irq() still evaluates to true (your patch
> addresses the !in_softirq() part).
>
> I don't have any results on x86 yet. My current understanding is that the
> virtio-pci transport code disables interrupts locally somewhere in the
> call chain (actually in vp_vring_interrupt() via spin_lock_irqsave())
> and then x86 would be fine. But I will get that verified.
[ 35.177962][ C0] WARNING: CPU: 0 PID: 152 at kernel/softirq.c:306 __local_bh_disable_ip (kernel/softirq.c:306 (discriminator 1))
[ 35.178551][ C0] Modules linked in: vmw_vsock_virtio_transport(+) vmw_vsock_virtio_transport_common virtio_crypto(+) crypto_engine vsock
[ 35.179930][ C0] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc38 04/01/2014
[ 35.180548][ C0] RIP: 0010:__local_bh_disable_ip (kernel/softirq.c:306 (discriminator 1))
[ 35.180936][ C0] Code: eb 7d 65 8b 05 ef 90 eb 7d 31 f0 f6 c4 ff 74 13 9c 58 f6 c4 02 75 17 80 e7 02 74 01 fb 5b c3 cc cc cc cc e8 48 2f 15 00 eb e6 <0f> 0b eb ca e8 2d 88 03 03 eb e2 66 66 2e 0f 1f 84 00 00 00 00 00
All code
========
0: eb 7d jmp 0x7f
2: 65 8b 05 ef 90 eb 7d mov %gs:0x7deb90ef(%rip),%eax # 0x7deb90f8
9: 31 f0 xor %esi,%eax
b: f6 c4 ff test $0xff,%ah
e: 74 13 je 0x23
10: 9c pushf
11: 58 pop %rax
12: f6 c4 02 test $0x2,%ah
15: 75 17 jne 0x2e
17: 80 e7 02 and $0x2,%bh
1a: 74 01 je 0x1d
1c: fb sti
1d: 5b pop %rbx
1e: c3 ret
1f: cc int3
20: cc int3
21: cc int3
22: cc int3
23: e8 48 2f 15 00 call 0x152f70
28: eb e6 jmp 0x10
2a:* 0f 0b ud2 <-- trapping instruction
2c: eb ca jmp 0xfffffffffffffff8
2e: e8 2d 88 03 03 call 0x3038860
33: eb e2 jmp 0x17
35: 66 66 2e 0f 1f 84 00 data16 cs nopw 0x0(%rax,%rax,1)
3c: 00 00 00 00
Code starting with the faulting instruction
===========================================
0: 0f 0b ud2
2: eb ca jmp 0xffffffffffffffce
4: e8 2d 88 03 03 call 0x3038836
9: eb e2 jmp 0xffffffffffffffed
b: 66 66 2e 0f 1f 84 00 data16 cs nopw 0x0(%rax,%rax,1)
12: 00 00 00 00
[ 35.182237][ C0] RSP: 0018:ffffc90000007d88 EFLAGS: 00010006
[ 35.182637][ C0] RAX: 0000000080010003 RBX: ffff888108308538 RCX: ffffc90000007d50
[ 35.183186][ C0] RDX: ffff88811ae36300 RSI: 0000000000000200 RDI: ffffffffc02b16cc
[ 35.183700][ C0] RBP: ffff8881083084e8 R08: 0000000000000000 R09: fffffbfff0d04f04
[ 35.184216][ C0] R10: ffffffff86827823 R11: ffffffff852013e6 R12: 0000000000000001
[ 35.184730][ C0] R13: 0000000000000000 R14: ffff888108308538 R15: dffffc0000000000
[ 35.185248][ C0] FS: 00007f06cb551800(0000) GS:ffff88811ae00000(0000) knlGS:0000000000000000
[ 35.185831][ C0] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 35.186271][ C0] CR2: 000055dc93010628 CR3: 0000000116b28000 CR4: 00000000000006f0
[ 35.186789][ C0] Call Trace:
[ 35.187010][ C0] <IRQ>
[ 35.187204][ C0] ? __warn (kernel/panic.c:673)
[ 35.187505][ C0] ? __local_bh_disable_ip (kernel/softirq.c:306 (discriminator 1))
[ 35.187857][ C0] ? report_bug (lib/bug.c:180 lib/bug.c:219)
[ 35.188197][ C0] ? handle_bug (arch/x86/kernel/traps.c:237 (discriminator 1))
[ 35.188483][ C0] ? exc_invalid_op (arch/x86/kernel/traps.c:258 (discriminator 1))
[ 35.188790][ C0] ? asm_exc_invalid_op (./arch/x86/include/asm/idtentry.h:568)
[ 35.189120][ C0] ? asm_common_interrupt (./arch/x86/include/asm/idtentry.h:636)
[ 35.189466][ C0] ? virtio_crypto_dataq_sym_callback (drivers/crypto/virtio/virtio_crypto_skcipher_algs.c:567 drivers/crypto/virtio/virtio_crypto_skcipher_algs.c:81 drivers/crypto/virtio/virtio_crypto_skcipher_algs.c:55) virtio_crypto
[ 35.189983][ C0] ? __local_bh_disable_ip (kernel/softirq.c:306 (discriminator 1))
[ 35.190336][ C0] virtio_crypto_dataq_sym_callback (drivers/crypto/virtio/virtio_crypto_skcipher_algs.c:570 drivers/crypto/virtio/virtio_crypto_skcipher_algs.c:81 drivers/crypto/virtio/virtio_crypto_skcipher_algs.c:55) virtio_crypto
[ 35.190837][ C0] virtcrypto_dataq_callback (drivers/crypto/virtio/virtio_crypto_core.c:91) virtio_crypto
[ 35.191304][ C0] ? __pfx_virtcrypto_dataq_callback (drivers/crypto/virtio/virtio_crypto_core.c:76) virtio_crypto
[ 35.191796][ C0] ? __pfx_do_raw_spin_lock (kernel/locking/spinlock_debug.c:113)
[ 35.192154][ C0] vring_interrupt (drivers/virtio/virtio_ring.c:2598)
[ 35.192536][ C0] vp_vring_interrupt (drivers/virtio/virtio_pci_common.c:67 (discriminator 2))
[ 35.193064][ C0] ? __pfx_vp_vring_interrupt (drivers/virtio/virtio_pci_common.c:60)
[ 35.193793][ C0] __handle_irq_event_percpu (kernel/irq/handle.c:158)
[ 35.194272][ C0] handle_irq_event (kernel/irq/handle.c:195 kernel/irq/handle.c:210)
[ 35.194587][ C0] handle_edge_irq (kernel/irq/chip.c:833)
[ 35.194903][ C0] __common_interrupt (arch/x86/kernel/irq.c:271)
[ 35.195232][ C0] common_interrupt (arch/x86/kernel/irq.c:247 (discriminator 47))
So I was wrong, this patch is not sufficient, not on x86 nor on s390x.
And the problem is that we are in hardirq context.
For some reason, I was under the impression that disabling interrupts in
a hardirq context somehow takes you out of hardirq context and makes
in_irq() return false. Silly me! (I was assuming the fix works on x86 and
hallucinated based on that assumption and any differences I have found
between virtio-ccw and virtio-pci.)
Currently I don't see a need to fix anything in virtio-ccw.
Regards,
Halil
Hi Michael & Lei,
I volunteer to fix this by workqueue.
I also notice that device drivers use workqueue to handle config-changed
again and again, what about re-implement __virtio_config_changed() by
kicking workqueue instead?
By the way, balloon dirvers uses
spin_lock_irqsave/spin_unlock_irqrestore in config-changed callback, do
it handle correctly?
On 9/27/23 21:25, Halil Pasic wrote:
> On Wed, 27 Sep 2023 09:24:09 +0000
> "Gonglei (Arei)" <[email protected]> wrote:
>
>>> On a related note, config change callback is also handled incorrectly in this
>>> driver, it takes a mutex from interrupt context.
>>
>> Good catch. Will fix it.
>
> Thanks Gonglei! Sorry I first misunderstood this as a problem within the
> virtio-ccw driver, but it is actually about virtio-crypto. Thanks for
> fixing this!
>
> Regards,
> Halil
--
zhenwei pi
> -----Original Message-----
> From: zhenwei pi [mailto:[email protected]]
> Sent: Thursday, September 28, 2023 9:24 AM
> To: Michael S. Tsirkin <[email protected]>; Gonglei (Arei)
> <[email protected]>
> Cc: Halil Pasic <[email protected]>; Herbert Xu
> <[email protected]>; [email protected]; Marc
> Hartmayer <[email protected]>; Jason Wang
> <[email protected]>; [email protected];
> [email protected]; Cornelia Huck <[email protected]>
> Subject: Re: Re: [PATCH] crypto: virtio-crypto: call finalize with bh disabled
>
> Hi Michael & Lei,
>
> I volunteer to fix this by workqueue.
>
Thanks, patches are always welcome.
> I also notice that device drivers use workqueue to handle config-changed again
> and again, what about re-implement __virtio_config_changed() by kicking
> workqueue instead?
>
Personally, I prefer to implement it in the device driver case by case. some devices
want to work in the upper half of the interrupt context, such as virtio-mem.
> By the way, balloon dirvers uses
> spin_lock_irqsave/spin_unlock_irqrestore in config-changed callback, do it
> handle correctly?
>
It's ok. The critical resource protected is global system_freezable_wq.
Regards,
-Gonglei
> On 9/27/23 21:25, Halil Pasic wrote:
> > On Wed, 27 Sep 2023 09:24:09 +0000
> > "Gonglei (Arei)" <[email protected]> wrote:
> >
> >>> On a related note, config change callback is also handled
> >>> incorrectly in this driver, it takes a mutex from interrupt context.
> >>
> >> Good catch. Will fix it.
> >
> > Thanks Gonglei! Sorry I first misunderstood this as a problem within
> > the virtio-ccw driver, but it is actually about virtio-crypto. Thanks
> > for fixing this!
> >
> > Regards,
> > Halil
>
> --
> zhenwei pi
Ping Herbert.
Thanks.
> -----Original Message-----
> From: Gonglei (Arei)
> Sent: Wednesday, September 27, 2023 5:18 PM
> To: 'Halil Pasic' <[email protected]>
> Cc: Herbert Xu <[email protected]>; [email protected];
> Marc Hartmayer <[email protected]>; Michael S. Tsirkin
> <[email protected]>; Jason Wang <[email protected]>;
> [email protected]; [email protected];
> [email protected]; Cornelia Huck <[email protected]>
> Subject: RE: [PATCH] crypto: virtio-crypto: call finalize with bh disabled
>
>
>
> > -----Original Message-----
> > From: Halil Pasic [mailto:[email protected]]
> > Sent: Wednesday, September 27, 2023 12:42 AM
> > To: Gonglei (Arei) <[email protected]>
> > Cc: Herbert Xu <[email protected]>;
> > [email protected]; Marc Hartmayer <[email protected]>;
> > Michael S. Tsirkin <[email protected]>; Jason Wang
> <[email protected]>;
> > [email protected];
> > [email protected]; [email protected]; Halil Pasic
> > <[email protected]>; Cornelia Huck <[email protected]>
> > Subject: Re: [PATCH] crypto: virtio-crypto: call finalize with bh
> > disabled
> >
> > [..]
> > > --- a/drivers/crypto/virtio/virtio_crypto_akcipher_algs.c
> > > +++ b/drivers/crypto/virtio/virtio_crypto_akcipher_algs.c
> > > @@ -61,8 +61,9 @@ static void virtio_crypto_akcipher_finalize_req(
> > > vc_akcipher_req->src_buf = NULL;
> > > vc_akcipher_req->dst_buf = NULL;
> > > virtcrypto_clear_request(&vc_akcipher_req->base);
> > > -
> > > + local_bh_disable();
> > >
> > > crypto_finalize_akcipher_request(vc_akcipher_req->base.dataq->engine
> > > ,
> > > req, err);
> > > + local_bh_enable();
> >
> > Thanks Gonglei!
> >
> > I did this a quick spin, and it does not seem to be sufficient on s390x.
> > Which does not come as a surprise to me, because
> >
> > #define lockdep_assert_in_softirq()
> > \
> > do
> > {
> > \
> > WARN_ON_ONCE(__lockdep_enabled &&
> > \
> > (!in_softirq() || in_irq() || in_nmi())); \
> > } while (0)
> >
> > will still warn because in_irq() still evaluates to true (your patch
> > addresses the !in_softirq() part).
> >
> You are right.
>
> So I think the core of this question is: Can we call crypto_finalize_request() in
> the upper half of the interrupt?
> If so, maybe we should introduce a new function, such as
> lockdep_assert_in_interrupt().
>
> #define lockdep_assert_in_interrupt() \
> do { \
> WARN_ON_ONCE(__lockdep_enabled && !in_interrupt()); \
> } while (0)
>
> If not, why?
>
> Herbert, do you have any suggestions? Thanks.
>
>
> Regards,
> -Gonglei
>