2005-05-23 23:35:57

by Andrew Morton

[permalink] [raw]
Subject: Re: [CRYPTO]: Only reschedule if !in_atomic()

Linux Kernel Mailing List <[email protected]> wrote:
>
> The netlink gfp_any() problem made me double-check the uses of in_softirq()
> in crypto/*. It seems to me that we should be checking in_atomic() instead
> of in_softirq() in crypto_yield. Otherwise people calling the crypto ops
> with spin locks held or preemption disabled will get burnt, right?
>

Sort-of, but the code is still wrong.

>
> crypto/internal.h | 2 +-
> 1 files changed, 1 insertion(+), 1 deletion(-)
>
> Index: crypto/internal.h
> ===================================================================
> --- dade029a8df8b249d14282d8f8023a0de0f6c1e7/crypto/internal.h (mode:100644 sha1:e68e43886d3cc23439f30210e88b517911bf395e)
> +++ c48106158bce4c7af328c486b7f33ad2133459ee/crypto/internal.h (mode:100644 sha1:964b9a60ca24413f07b1fe8410f7ac3198642135)
> @@ -38,7 +38,7 @@ static inline void crypto_kunmap(void *v
>
> static inline void crypto_yield(struct crypto_tfm *tfm)
> {
> - if (!in_softirq())
> + if (!in_atomic())
> cond_resched();
> }

This code can cause deadlocks on CONFIG_SMP && !CONFIG_PREEMPT kernels.

Please see http://lkml.org/lkml/2005/3/10/358

You (the programmer) *have* to know what context you're running in before
doing a voluntary yield. There is simply no way to work this out at
runtime.


2005-05-24 00:03:15

by Benjamin Herrenschmidt

[permalink] [raw]
Subject: Re: [CRYPTO]: Only reschedule if !in_atomic()

On Mon, 2005-05-23 at 16:28 -0700, Andrew Morton wrote:

> This code can cause deadlocks on CONFIG_SMP && !CONFIG_PREEMPT kernels.
>
> Please see http://lkml.org/lkml/2005/3/10/358
>
> You (the programmer) *have* to know what context you're running in before
> doing a voluntary yield. There is simply no way to work this out at
> runtime.

Hrm... Linus just merged it though...

Ben.


2005-05-24 00:23:54

by Andrew Morton

[permalink] [raw]
Subject: Re: [CRYPTO]: Only reschedule if !in_atomic()

Benjamin Herrenschmidt <[email protected]> wrote:
>
> On Mon, 2005-05-23 at 16:28 -0700, Andrew Morton wrote:
>
> > This code can cause deadlocks on CONFIG_SMP && !CONFIG_PREEMPT kernels.
> >
> > Please see http://lkml.org/lkml/2005/3/10/358
> >
> > You (the programmer) *have* to know what context you're running in before
> > doing a voluntary yield. There is simply no way to work this out at
> > runtime.
>
> Hrm... Linus just merged it though...
>

The old version was:

if (!in_softirq())
cond_resched();

Which I guess is OK if the programmer knows that this code is only ever called

a) from softirq context or

b) from process context with no locks/smp_processor_id/etc held.

The new version is:

if (!in_atomic())
cond_resched();

which happens to still be correct as long as a) and b) still hold, which I
assume they do.

Both versions are deadlocky if b) is violated.

So. It sucks before and it sucks after, but we might not be deadlocky.
Problem is, !CONFIG_PREEMPT also disable the beancounting which
might_sleep() depends upon, so it's harder to tell whether all callers are
correct.

2005-05-24 02:21:27

by Herbert Xu

[permalink] [raw]
Subject: Re: [CRYPTO]: Only reschedule if !in_atomic()

On Mon, May 23, 2005 at 11:28:06PM +0000, Andrew Morton wrote:
>
> This code can cause deadlocks on CONFIG_SMP && !CONFIG_PREEMPT kernels.
>
> Please see http://lkml.org/lkml/2005/3/10/358
>
> You (the programmer) *have* to know what context you're running in before
> doing a voluntary yield. There is simply no way to work this out at
> runtime.

You're right.

Perhaps we should code this into the crypto API instead? For instance,
we can have a tfm flag that says whether we can sleep or not.

Dave & James, What do you think?

Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

2005-05-24 02:32:33

by Andrew Morton

[permalink] [raw]
Subject: Re: [CRYPTO]: Only reschedule if !in_atomic()

Herbert Xu <[email protected]> wrote:
>
> Perhaps we should code this into the crypto API instead? For instance,
> we can have a tfm flag that says whether we can sleep or not.

Are you sure it's actually needed? Have significant scheduling latencies
actually been observed?

Bear in mind that anyone who cares a lot about latency will be running
CONFIG_PREEMPT kernels, in which case the whole thing is redundant anyway.
I generally take the position that if we're going to put a scheduling point
into a non-premept kernel then it'd better be for a pretty bad latency
point - more than 10 milliseconds, say.

2005-05-24 02:43:34

by Herbert Xu

[permalink] [raw]
Subject: Re: [CRYPTO]: Only reschedule if !in_atomic()

On Mon, May 23, 2005 at 07:31:16PM -0700, Andrew Morton wrote:
>
> Are you sure it's actually needed? Have significant scheduling latencies
> actually been observed?

I certainly don't have any problems with removing the yield altogether.

> Bear in mind that anyone who cares a lot about latency will be running
> CONFIG_PREEMPT kernels, in which case the whole thing is redundant anyway.
> I generally take the position that if we're going to put a scheduling point
> into a non-premept kernel then it'd better be for a pretty bad latency
> point - more than 10 milliseconds, say.

The crypt() function can easily take more than 10 milliseconds with
a large enough buffer.

James & Dave, do you have any opinions on this?

Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

2005-05-24 03:20:51

by James Morris

[permalink] [raw]
Subject: Re: [CRYPTO]: Only reschedule if !in_atomic()

On Tue, 24 May 2005, Herbert Xu wrote:

> On Mon, May 23, 2005 at 07:31:16PM -0700, Andrew Morton wrote:
> >
> > Are you sure it's actually needed? Have significant scheduling latencies
> > actually been observed?
>
> I certainly don't have any problems with removing the yield altogether.
>
> > Bear in mind that anyone who cares a lot about latency will be running
> > CONFIG_PREEMPT kernels, in which case the whole thing is redundant anyway.
> > I generally take the position that if we're going to put a scheduling point
> > into a non-premept kernel then it'd better be for a pretty bad latency
> > point - more than 10 milliseconds, say.
>
> The crypt() function can easily take more than 10 milliseconds with
> a large enough buffer.
>
> James & Dave, do you have any opinions on this?

a) remove the scheudling point and see if anyone complains
b) if so, add a flag



- James
--
James Morris
<[email protected]>


2005-05-24 03:51:36

by Herbert Xu

[permalink] [raw]
Subject: Re: [CRYPTO]: Only reschedule if !in_atomic()

On Mon, May 23, 2005 at 11:20:26PM -0400, James Morris wrote:
>
> a) remove the scheudling point and see if anyone complains
> b) if so, add a flag

OK. I think we should go with the flag though since it could
also be used for memory allocation.

I've just added some code which allocates a scratch space for
unaligned input to the VIA Padlock (IPv4 ESP traffic is normally
unaligned due to the 20-byte IP header). It could use this flag
to determine whether it should do GFP_KERNEL or GFP_ATOMIC.

Actually, has anyone considered using a 4-byte IP option padding?
It's legal per RFC-791 but it'd be interesting to know how well
it works in the field.

Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

2005-05-24 13:34:42

by Bill Davidsen

[permalink] [raw]
Subject: Re: [CRYPTO]: Only reschedule if !in_atomic()

Andrew Morton wrote:
> Herbert Xu <[email protected]> wrote:
>
>>Perhaps we should code this into the crypto API instead? For instance,
>> we can have a tfm flag that says whether we can sleep or not.
>
>
> Are you sure it's actually needed? Have significant scheduling latencies
> actually been observed?
>
> Bear in mind that anyone who cares a lot about latency will be running
> CONFIG_PREEMPT kernels, in which case the whole thing is redundant anyway.
> I generally take the position that if we're going to put a scheduling point
> into a non-premept kernel then it'd better be for a pretty bad latency
> point - more than 10 milliseconds, say.
>
People do run crypto on old slow machines, and also laptops configured
to use as little power as possible. I wouldn't be surprised if latencies
got in the >10ms range pretty regularly on some systems which are pretty
mainstream.

Just my read on it, if a flag will prevent deadlock without relying on
callers doing the right thing, that's probably a desirable change WRT
future stability.