2008-12-07 15:55:19

by Herbert Xu

[permalink] [raw]
Subject: Re: Runaway loop with the current git.

On Sun, Dec 07, 2008 at 10:49:27PM +0800, Herbert Xu wrote:
>
> In any case the loop itself does not involve any crypto components
> so I don't think making changes in the crypto layer is going to
> make this go away forever as anyone calling request_module early
> enough will get into this loop.

Having said that I think this patch should remove this particular
trigger. Unfortunately I couldn't find a more succinct way of
expressing this relationship: if ALGAPI is built-in because it's
selected by an algorithm, and CRYPTO_MANAGER is enabled then we
require CRYPTO_MANAGER to be built-in as well.

Evgeniy, please let me know whether this fixes the problem.

diff --git a/crypto/Kconfig b/crypto/Kconfig
index 6593b5a..3f88a52 100644
--- a/crypto/Kconfig
+++ b/crypto/Kconfig
@@ -31,35 +31,63 @@ config CRYPTO_FIPS

config CRYPTO_ALGAPI
tristate
+ select CRYPTO_ALGAPI2
help
This option provides the API for cryptographic algorithms.

+config CRYPTO_ALGAPI2
+ tristate
+
config CRYPTO_AEAD
tristate
+ select CRYPTO_AEAD2
select CRYPTO_ALGAPI

+config CRYPTO_AEAD2
+ tristate
+ select CRYPTO_ALGAPI2
+
config CRYPTO_BLKCIPHER
tristate
+ select CRYPTO_BLKCIPHER2
select CRYPTO_ALGAPI
- select CRYPTO_RNG
+
+config CRYPTO_BLKCIPHER2
+ tristate
+ select CRYPTO_ALGAPI2
+ select CRYPTO_RNG2

config CRYPTO_HASH
tristate
+ select CRYPTO_HASH2
select CRYPTO_ALGAPI

+config CRYPTO_HASH2
+ tristate
+ select CRYPTO_ALGAPI2
+
config CRYPTO_RNG
tristate
+ select CRYPTO_RNG2
select CRYPTO_ALGAPI

+config CRYPTO_RNG2
+ tristate
+ select CRYPTO_ALGAPI2
+
config CRYPTO_MANAGER
tristate "Cryptographic algorithm manager"
- select CRYPTO_AEAD
- select CRYPTO_HASH
- select CRYPTO_BLKCIPHER
+ select CRYPTO_MANAGER2
help
Create default cryptographic template instantiations such as
cbc(aes).

+config CRYPTO_MANAGER2
+ def_tristate CRYPTO_MANAGER || (CRYPTO_MANAGER!=n && CRYPTO_ALGAPI=y)
+ select CRYPTO_AEAD2
+ select CRYPTO_HASH2
+ select CRYPTO_BLKCIPHER2
+
config CRYPTO_GF128MUL
tristate "GF(2^128) multiplication functions (EXPERIMENTAL)"
depends on EXPERIMENTAL
diff --git a/crypto/Makefile b/crypto/Makefile
index 0e93f32..73e7c30 100644
--- a/crypto/Makefile
+++ b/crypto/Makefile
@@ -9,25 +9,25 @@ obj-$(CONFIG_CRYPTO_FIPS) += fips.o

crypto_algapi-$(CONFIG_PROC_FS) += proc.o
crypto_algapi-objs := algapi.o scatterwalk.o $(crypto_algapi-y)
-obj-$(CONFIG_CRYPTO_ALGAPI) += crypto_algapi.o
+obj-$(CONFIG_CRYPTO_ALGAPI2) += crypto_algapi.o

-obj-$(CONFIG_CRYPTO_AEAD) += aead.o
+obj-$(CONFIG_CRYPTO_AEAD2) += aead.o

crypto_blkcipher-objs := ablkcipher.o
crypto_blkcipher-objs += blkcipher.o
-obj-$(CONFIG_CRYPTO_BLKCIPHER) += crypto_blkcipher.o
-obj-$(CONFIG_CRYPTO_BLKCIPHER) += chainiv.o
-obj-$(CONFIG_CRYPTO_BLKCIPHER) += eseqiv.o
+obj-$(CONFIG_CRYPTO_BLKCIPHER2) += crypto_blkcipher.o
+obj-$(CONFIG_CRYPTO_BLKCIPHER2) += chainiv.o
+obj-$(CONFIG_CRYPTO_BLKCIPHER2) += eseqiv.o
obj-$(CONFIG_CRYPTO_SEQIV) += seqiv.o

crypto_hash-objs := hash.o
crypto_hash-objs += ahash.o
crypto_hash-objs += shash.o
-obj-$(CONFIG_CRYPTO_HASH) += crypto_hash.o
+obj-$(CONFIG_CRYPTO_HASH2) += crypto_hash.o

cryptomgr-objs := algboss.o testmgr.o

-obj-$(CONFIG_CRYPTO_MANAGER) += cryptomgr.o
+obj-$(CONFIG_CRYPTO_MANAGER2) += cryptomgr.o
obj-$(CONFIG_CRYPTO_HMAC) += hmac.o
obj-$(CONFIG_CRYPTO_XCBC) += xcbc.o
obj-$(CONFIG_CRYPTO_NULL) += crypto_null.o

Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


2008-12-07 16:03:19

by Kay Sievers

[permalink] [raw]
Subject: Re: Runaway loop with the current git.

On Sun, Dec 7, 2008 at 16:55, Herbert Xu <[email protected]> wrote:
> On Sun, Dec 07, 2008 at 10:49:27PM +0800, Herbert Xu wrote:
>>
>> In any case the loop itself does not involve any crypto components
>> so I don't think making changes in the crypto layer is going to
>> make this go away forever as anyone calling request_module early
>> enough will get into this loop.
>
> Having said that I think this patch should remove this particular
> trigger. Unfortunately I couldn't find a more succinct way of
> expressing this relationship: if ALGAPI is built-in because it's
> selected by an algorithm, and CRYPTO_MANAGER is enabled then we
> require CRYPTO_MANAGER to be built-in as well.
>
> Evgeniy, please let me know whether this fixes the problem.

Even when this works around it, the problem that the kernel triggers
module loading on /dev/console access stays and needs to be fixed
instead.

It's a pure kernel bug, regardless of how many times Alan likes to
establish non-sensical hotplug rules to prevent this. The next
innocent driver change, like yours is, will run into the same problem.

Thanks,
Kay

2008-12-07 16:11:19

by Alan

[permalink] [raw]
Subject: Re: Runaway loop with the current git.

> Even when this works around it, the problem that the kernel triggers
> module loading on /dev/console access stays and needs to be fixed
> instead.

It isn't a problem. It is trying to have hotplug load a suitable driver.
This is what is supposed to happen.

Alan

2008-12-07 16:21:03

by Kay Sievers

[permalink] [raw]
Subject: Re: Runaway loop with the current git.

On Sun, Dec 7, 2008 at 17:09, Alan Cox <[email protected]> wrote:
>> Even when this works around it, the problem that the kernel triggers
>> module loading on /dev/console access stays and needs to be fixed
>> instead.
>
> It isn't a problem. It is trying to have hotplug load a suitable driver.
> This is what is supposed to happen.

No, it's not. 5:1 is _in_ the kernel, and must not be tried to be
loaded by the kernel. We need to make /dev/console access return
-ENODEV if not available, not try to load a module for it.

It is a problem. It's broken behavior, and it breaks shipped products
with any innocent future driver change like Herbert's.

Please start to think about the problem, and don't repeat things that
do not make sense, and never made any sense.

Thanks,
Kay

2008-12-07 16:31:53

by Evgeniy Polyakov

[permalink] [raw]
Subject: Re: Runaway loop with the current git.

On Sun, Dec 07, 2008 at 04:09:21PM +0000, Alan Cox ([email protected]) wrote:
> > Even when this works around it, the problem that the kernel triggers
> > module loading on /dev/console access stays and needs to be fixed
> > instead.
>
> It isn't a problem. It is trying to have hotplug load a suitable driver.
> This is what is supposed to happen.

Wrong. First at least because tty/console stuff is built in.
But we alredy got, that you decided that Debian's initramfs-tools (0.92e)
are allowed not to boot with some kernel configs.

--
Evgeniy Polyakov

2008-12-07 16:33:25

by Evgeniy Polyakov

[permalink] [raw]
Subject: Re: Runaway loop with the current git.

On Sun, Dec 07, 2008 at 11:55:07PM +0800, Herbert Xu ([email protected]) wrote:
> Having said that I think this patch should remove this particular
> trigger. Unfortunately I couldn't find a more succinct way of
> expressing this relationship: if ALGAPI is built-in because it's
> selected by an algorithm, and CRYPTO_MANAGER is enabled then we
> require CRYPTO_MANAGER to be built-in as well.
>
> Evgeniy, please let me know whether this fixes the problem.

Holy shit... This looks 'excellent' :)

I will try it tomorrow and report back the results.

--
Evgeniy Polyakov

2008-12-07 16:57:34

by Alan

[permalink] [raw]
Subject: Re: Runaway loop with the current git.

> > It isn't a problem. It is trying to have hotplug load a suitable driver.
> > This is what is supposed to happen.
>
> No, it's not. 5:1 is _in_ the kernel, and must not be tried to be
> loaded by the kernel. We need to make /dev/console access return
> -ENODEV if not available, not try to load a module for it.

Hello earth calling, wake up.

The userspace opens major 5 minor 1. The kernel has no driver mapped to
that so the kernel asks user space to load a module of its choice for
major 5 minor 1. What user space does with that is and has always been a
problem for user space.

User space is quite at liberty to go .. 5,1 and I have a serial console
I want to load 8250_pci please.

What it must not do is try and re-open it again and again and again.

You may not assume that a given device load mapping is solely used to
create the most basic device node involved. It isn't that simple. In many
cases a device node translates to a series of module loads of otherwise
apparently unrelated devices.


2008-12-07 17:00:59

by Alan

[permalink] [raw]
Subject: Re: Runaway loop with the current git.

> Wrong. First at least because tty/console stuff is built in.

Are you two people or just two names and email addresses for
the same ?

/dev/console is a logical mapping to a device which may well be
different, loaded after PCI is initialised and dependant on PCI.

> But we alredy got, that you decided that Debian's initramfs-tools (0.92e)
> are allowed not to boot with some kernel configs.

Yes. They won't boot if you don't include any disk drivers. They won't
boot if you don't include any file systems. They wont boot in a million
other cases. That is the user space not the kernel at fault.

The kernel is doing precisely what it is supposed to. It is even logging
the user space bug and stopping trying to get stuck in a loop loading a
module in order to attempt to cope with the user space bug.

If you want to complain then file a debian bug, go fix the user space.

Alan

2008-12-07 17:03:51

by Evgeniy Polyakov

[permalink] [raw]
Subject: Re: Runaway loop with the current git.

On Sun, Dec 07, 2008 at 04:57:40PM +0000, Alan Cox ([email protected]) wrote:
> The userspace opens major 5 minor 1. The kernel has no driver mapped to
> that so the kernel asks user space to load a module of its choice for

It has. It really has it. It was not yet initialized though because,
because, because... Hmm, userspace code started and built-in kernel
subsystem was not yet initialized.

> major 5 minor 1. What user space does with that is and has always been a
> problem for user space.
>
> User space is quite at liberty to go .. 5,1 and I have a serial console
> I want to load 8250_pci please.
>
> What it must not do is try and re-open it again and again and again.

Heh, init is not allowed to write something to the console or any other
tty?

--
Evgeniy Polyakov

2008-12-07 17:13:58

by Evgeniy Polyakov

[permalink] [raw]
Subject: Re: Runaway loop with the current git.

On Sun, Dec 07, 2008 at 05:01:08PM +0000, Alan Cox ([email protected]) wrote:
> > Wrong. First at least because tty/console stuff is built in.
>
> Are you two people or just two names and email addresses for
> the same ?

Depending on who you are asking for.

> /dev/console is a logical mapping to a device which may well be
> different, loaded after PCI is initialised and dependant on PCI.

Yup. It may. But in showed case it is not. In showed case it showed a
bug. Which you call a feature.

> > But we alredy got, that you decided that Debian's initramfs-tools (0.92e)
> > are allowed not to boot with some kernel configs.
>
> Yes. They won't boot if you don't include any disk drivers. They won't
> boot if you don't include any file systems. They wont boot in a million
> other cases. That is the user space not the kernel at fault.

But right now _everything_ is presented. All things are in places.
Disk drivers, filesystem, and even that stupid console/tty driver.
It _IS_ in the kernel.

> The kernel is doing precisely what it is supposed to. It is even logging
> the user space bug and stopping trying to get stuck in a loop loading a
> module in order to attempt to cope with the user space bug.
>
> If you want to complain then file a debian bug, go fix the user space.

You introduce so naive rules for the initramfs userspace... It should
not use console, it should not load modules, which may load another
one... What next, not to use syscalls? Sigh, instead of thinking on how
to fix the weak boot process so that next time similar problem arises,
we ended up with pointing a finger one to another... Hope we will not
get another similar cases though.

--
Evgeniy Polyakov

2008-12-07 17:17:48

by Kay Sievers

[permalink] [raw]
Subject: Re: Runaway loop with the current git.

On Sun, Dec 7, 2008 at 18:01, Alan Cox <[email protected]> wrote:
>> Wrong. First at least because tty/console stuff is built in.
>
> Are you two people or just two names and email addresses for
> the same ?

Damn, now you found that out.

I suggest you get a few more mail addresses too, to back your
non-sensical claims, and try to establish silly userspace rules to
work around an obvious kernel bug. :)

Good luck, looking forward to your creativity,
Kay

2008-12-07 17:22:31

by Kay Sievers

[permalink] [raw]
Subject: Re: Runaway loop with the current git.

On Sun, Dec 7, 2008 at 18:01, Alan Cox <[email protected]> wrote:
>> Wrong. First at least because tty/console stuff is built in.
>
> Are you two people or just two names and email addresses for
> the same ?
>
> /dev/console is a logical mapping to a device which may well be
> different, loaded after PCI is initialised and dependant on PCI.

So wrong. If no driver is associated, like early, in that case, we
must return -ENODEV, instead of calling modprobe in a loop. It's a
built-in device, and it's easy to fix.

Again, please start to think about, it's all contained in the kernel,
and the kernel is wrong. Claiming userspace is guilty to triggering
this kernel bug does not help anything.

Kay

2008-12-07 17:24:53

by Alan

[permalink] [raw]
Subject: Re: Runaway loop with the current git.

> Heh, init is not allowed to write something to the console or any other
> tty?

init is not run as the helper of a modprobe.

2008-12-07 17:28:51

by Alan

[permalink] [raw]
Subject: Re: Runaway loop with the current git.

> > /dev/console is a logical mapping to a device which may well be
> > different, loaded after PCI is initialised and dependant on PCI.
>
> So wrong. If no driver is associated, like early, in that case, we
> must return -ENODEV, instead of calling modprobe in a loop. It's a
> built-in device, and it's easy to fix.

You've clearly no idea how initrd even works have you ? If it just
returned -ENODEV you wouldn't be able to open the console and you
wouldn't trigger the loading of the module to get the console running. So
you've now completely buggered the boot process.

The correct sequence is

Open device
Kernel issues hotplug message
Hotplug script loads drivers to policy


The problem case you have due to initrd bugs is

Open device
Kernel issues hotplug message
Hotplug script opens same device (BUG)
Kernel issues hotplug message
.....
Kernel detects this is stuck
Kernel replies with -ENODEV/-ENXIO to try
and rescue itself from buggy initrd scripts

That is how it has worked since we first had script based module
requesting which is some years now.


2008-12-07 17:29:06

by Evgeniy Polyakov

[permalink] [raw]
Subject: Re: Runaway loop with the current git.

On Sun, Dec 07, 2008 at 05:24:47PM +0000, Alan Cox ([email protected]) wrote:
> > Heh, init is not allowed to write something to the console or any other
> > tty?
>
> init is not run as the helper of a modprobe.

I hope it is not another limitation for the early userspace? :)
I worked with systems where init made everything. Not showing that as an
example of particulary good design, but still there may be situations
when userspace code does something, which kernel is not ready for. And
it is a kernel problem that it is not ready for some usespace behaviour.

Btw, modprobe can not print error message to the console, that's the
rule ;)

--
Evgeniy Polyakov

2008-12-07 17:39:50

by Kay Sievers

[permalink] [raw]
Subject: Re: Runaway loop with the current git.

On Sun, Dec 7, 2008 at 18:28, Alan Cox <[email protected]> wrote:
>> > /dev/console is a logical mapping to a device which may well be
>> > different, loaded after PCI is initialised and dependant on PCI.
>>
>> So wrong. If no driver is associated, like early, in that case, we
>> must return -ENODEV, instead of calling modprobe in a loop. It's a
>> built-in device, and it's easy to fix.
>
> You've clearly no idea how initrd even works have you ?

Not sure, if you understand the real problem. A kernel forked binary
is allowed to access /dev/console, but it triggers a kernel bug.

> If it just
> returned -ENODEV you wouldn't be able to open the console and you
> wouldn't trigger the loading of the module to get the console running. So
> you've now completely buggered the boot process.

Utter nonsense. Exactly when the driver is available shortly later,
the console will work. If it's not backed by a driver, it should not
try to load it, it will never get any driver loaded by opening it. The
kernel must handle that.

> The correct sequence is
>
> Open device
> Kernel issues hotplug message
> Hotplug script loads drivers to policy

Nonsense. The kernel calls /sbin/modprobe directly, no hotplug involved.

> The problem case you have due to initrd bugs is
>
> Open device
> Kernel issues hotplug message
> Hotplug script opens same device (BUG)

The kernel calls modprobe for something, modprobe tries to log an
error, and the kernel calls modprobe again. Bug! No hotplug involved.

> Kernel issues hotplug message
> .....
> Kernel detects this is stuck
> Kernel replies with -ENODEV/-ENXIO to try
> and rescue itself from buggy initrd scripts

Totally wrong, It never was that way.

> That is how it has worked since we first had script based module
> requesting which is some years now.

Please update your idea of hotplug and the kernel module loader, you
are on the totally wrong track.

Thanks,
Kay

2008-12-07 17:44:37

by Evgeniy Polyakov

[permalink] [raw]
Subject: Re: Runaway loop with the current git.

On Sun, Dec 07, 2008 at 05:28:55PM +0000, Alan Cox ([email protected]) wrote:
> > > /dev/console is a logical mapping to a device which may well be
> > > different, loaded after PCI is initialised and dependant on PCI.
> >
> > So wrong. If no driver is associated, like early, in that case, we
> > must return -ENODEV, instead of calling modprobe in a loop. It's a
> > built-in device, and it's easy to fix.
>
> You've clearly no idea how initrd even works have you ? If it just
> returned -ENODEV you wouldn't be able to open the console and you
> wouldn't trigger the loading of the module to get the console running. So
> you've now completely buggered the boot process.
>
> The correct sequence is
>
> Open device
> Kernel issues hotplug message
> Hotplug script loads drivers to policy
>
>
> The problem case you have due to initrd bugs is
>
> Open device
> Kernel issues hotplug message
> Hotplug script opens same device (BUG)

Everyone understands that, what you do not want to get, is that this
case can be handled by the kernel so that there would be no recursion.
And instead of thinking how to fix it, you just try to shut it up.

There may be another case, when the same happens inside the kernel, i.e.
module being loaded requires console and the same happens again. Similar
problem exists for network console, when there is no underlying device
yet, but it is handled.

Fortunately console is the most common example (maybe even the only
one), so this case can be fixed easily. Moreover, because of subtle
tty ordering, when everything is in kernel, it still may be triggered,
as was shown previously.

And while having dumb console device sounds awful for you, it is a
bulletprof solution even for non-expected userspace behaviout. And so
far the only objection was that it may break something, which you
believe may happen not even being tested.

Alan, let's make some progress on this fingerpointing. If Herbert's
patch fixes the crypto loading problem, it will find its way upstream
for the current tree, and in the merge window Kay's patch may be applied
and widely tested. Thoughts?

--
Evgeniy Polyakov

2008-12-07 17:51:11

by Alan

[permalink] [raw]
Subject: Re: Runaway loop with the current git.

On Sun, 7 Dec 2008 18:39:48 +0100
"Kay Sievers" <[email protected]> wrote:

> On Sun, Dec 7, 2008 at 18:28, Alan Cox <[email protected]> wrote:
> >> > /dev/console is a logical mapping to a device which may well be
> >> > different, loaded after PCI is initialised and dependant on PCI.
> >>
> >> So wrong. If no driver is associated, like early, in that case, we
> >> must return -ENODEV, instead of calling modprobe in a loop. It's a
> >> built-in device, and it's easy to fix.
> >
> > You've clearly no idea how initrd even works have you ?
>
> Not sure, if you understand the real problem. A kernel forked binary
> is allowed to access /dev/console, but it triggers a kernel bug.

If there is a hotplug load for the console device then it cannot. Just as
a request for the driver for /dev/hda cannot open /dev/hda* again.

> Nonsense. The kernel calls /sbin/modprobe directly, no hotplug involved.

Its up to you if the kernel calls modprobe and what your modprobe is

> The kernel calls modprobe for something, modprobe tries to log an
> error, and the kernel calls modprobe again. Bug! No hotplug involved.

A modprobe to load the console device shouln't open /dev/console. Very
simple and always been true.

> > Kernel issues hotplug message
> > .....
> > Kernel detects this is stuck
> > Kernel replies with -ENODEV/-ENXIO to try
> > and rescue itself from buggy initrd scripts
>
> Totally wrong, It never was that way

Funny but its been that way for many many years. Just as it cannot try and
syslog an error when loading the AF_UNIX socket family.

Alan

2008-12-07 17:52:36

by Alan

[permalink] [raw]
Subject: Re: Runaway loop with the current git.

> Alan, let's make some progress on this fingerpointing. If Herbert's
> patch fixes the crypto loading problem, it will find its way upstream
> for the current tree, and in the merge window Kay's patch may be applied
> and widely tested. Thoughts?

I have no intention of applying Kay's patch because it is wrong and it
will only break things not fix them.

Alan

2008-12-07 17:54:40

by Evgeniy Polyakov

[permalink] [raw]
Subject: Re: Runaway loop with the current git.

On Sun, Dec 07, 2008 at 05:52:45PM +0000, Alan Cox ([email protected]) wrote:
> > Alan, let's make some progress on this fingerpointing. If Herbert's
> > patch fixes the crypto loading problem, it will find its way upstream
> > for the current tree, and in the merge window Kay's patch may be applied
> > and widely tested. Thoughts?
>
> I have no intention of applying Kay's patch because it is wrong and it
> will only break things not fix them.

And you are sure it breaks something based on what?

--
Evgeniy Polyakov

2008-12-07 18:02:56

by Alan

[permalink] [raw]
Subject: Re: Runaway loop with the current git.

On Sun, 7 Dec 2008 20:54:38 +0300
Evgeniy Polyakov <[email protected]> wrote:

> On Sun, Dec 07, 2008 at 05:52:45PM +0000, Alan Cox ([email protected]) wrote:
> > > Alan, let's make some progress on this fingerpointing. If Herbert's
> > > patch fixes the crypto loading problem, it will find its way upstream
> > > for the current tree, and in the merge window Kay's patch may be applied
> > > and widely tested. Thoughts?
> >
> > I have no intention of applying Kay's patch because it is wrong and it
> > will only break things not fix them.
>
> And you are sure it breaks something based on what?


Firstly:

You propose to implement

modprobe fails (due to crypto requirements)
open /dev/console
-ENODEV
log error to nowhere


Why is this useful - you now get failing module loads producing no
diagnostics and in many case the setup just dying silently. Previously
you got an attempt to recover and diagnostics which allowed the problem
to be found (as Herbert did)

Secondly:

If I have a /dev/console on a PCI device and I have modprobe set to load
8250_pci or a framebuffer driver when the console is opened you will
break the current working behaviour.


2008-12-07 18:13:15

by Kay Sievers

[permalink] [raw]
Subject: Re: Runaway loop with the current git.

On Sun, Dec 7, 2008 at 19:03, Alan Cox <[email protected]> wrote:
> On Sun, 7 Dec 2008 20:54:38 +0300
> Evgeniy Polyakov <[email protected]> wrote:
>
>> On Sun, Dec 07, 2008 at 05:52:45PM +0000, Alan Cox ([email protected]) wrote:
>> > > Alan, let's make some progress on this fingerpointing. If Herbert's
>> > > patch fixes the crypto loading problem, it will find its way upstream
>> > > for the current tree, and in the merge window Kay's patch may be applied
>> > > and widely tested. Thoughts?
>> >
>> > I have no intention of applying Kay's patch because it is wrong and it
>> > will only break things not fix them.
>>
>> And you are sure it breaks something based on what?
>
>
> Firstly:
>
> You propose to implement
>
> modprobe fails (due to crypto requirements)
> open /dev/console
> -ENODEV
> log error to nowhere

Yes, log nowhere instead of running in a loop would be much better
than loading a 5:1 driver which will never exist as a module.

> Why is this useful - you now get failing module loads producing no
> diagnostics and in many case the setup just dying silently.

It's obviously more useful than not to boot up.

> Previously
> you got an attempt to recover and diagnostics which allowed the problem
> to be found (as Herbert did)
>
> Secondly:
>
> If I have a /dev/console on a PCI device and I have modprobe set to load
> 8250_pci or a framebuffer driver when the console is opened you will
> break the current working behaviour.

No, the pci driver will never get loaded by modprobe 5:1, that's all
what this bug is about. It connects to the existing console device
when the driver is loaded, and at that moment /dev/console gets
working.

Kay

2008-12-07 18:15:51

by Alan

[permalink] [raw]
Subject: Re: Runaway loop with the current git.

> Yes, log nowhere instead of running in a loop would be much better
> than loading a 5:1 driver which will never exist as a module.

The loop is detected and terminated.
>
> > Why is this useful - you now get failing module loads producing no
> > diagnostics and in many case the setup just dying silently.
>
> It's obviously more useful than not to boot up.

What makes you think it will now boot up. The loop is already detected
and terminated. What will you do if it doesn't and you get no
diagnostics. How will distributions debug those reports in bugzilla.

> No, the pci driver will never get loaded by modprobe 5:1,

Why not ? You have no idea how the other millions of Linux users have
their module loading rules configured. A change which breaks this
behaviour is a regression.

Alan

2008-12-07 18:21:48

by Kay Sievers

[permalink] [raw]
Subject: Re: Runaway loop with the current git.

On Sun, Dec 7, 2008 at 19:15, Alan Cox <[email protected]> wrote:
>> Yes, log nowhere instead of running in a loop would be much better
>> than loading a 5:1 driver which will never exist as a module.
>
> The loop is detected and terminated.

No. Please back up what you are trying to talk about.

>> > Why is this useful - you now get failing module loads producing no
>> > diagnostics and in many case the setup just dying silently.
>>
>> It's obviously more useful than not to boot up.
>
> What makes you think it will now boot up. The loop is already detected
> and terminated. What will you do if it doesn't and you get no
> diagnostics. How will distributions debug those reports in bugzilla.

The boxes of the reporters hang! Read the bug! Please!

>> No, the pci driver will never get loaded by modprobe 5:1,
>
> Why not ? You have no idea how the other millions of Linux users have
> their module loading rules configured. A change which breaks this
> behaviour is a regression.

There is no cheap way out of the problem, it's a kernel bug, and we
will fix it - you may just delay it with your zero arguments.

Kay

2008-12-07 18:22:44

by Kay Sievers

[permalink] [raw]
Subject: Re: Runaway loop with the current git.

On Sun, Dec 7, 2008 at 18:51, Alan Cox <[email protected]> wrote:
> On Sun, 7 Dec 2008 18:39:48 +0100
> "Kay Sievers" <[email protected]> wrote:
>
>> On Sun, Dec 7, 2008 at 18:28, Alan Cox <[email protected]> wrote:
>> >> > /dev/console is a logical mapping to a device which may well be
>> >> > different, loaded after PCI is initialised and dependant on PCI.
>> >>
>> >> So wrong. If no driver is associated, like early, in that case, we
>> >> must return -ENODEV, instead of calling modprobe in a loop. It's a
>> >> built-in device, and it's easy to fix.
>> >
>> > You've clearly no idea how initrd even works have you ?
>>
>> Not sure, if you understand the real problem. A kernel forked binary
>> is allowed to access /dev/console, but it triggers a kernel bug.
>
> If there is a hotplug load for the console device then it cannot. Just as
> a request for the driver for /dev/hda cannot open /dev/hda* again.

So wrong. Modprobe 5:1 will never load anything, but kill the box in
such case at that time of bootup.

>> Nonsense. The kernel calls /sbin/modprobe directly, no hotplug involved.
>
> Its up to you if the kernel calls modprobe and what your modprobe is

What are you saying? Your picture of hotplug is just wrong, regardless
of "it's up to you". I talk about what people use, and what is
obviously broken currently.

>> The kernel calls modprobe for something, modprobe tries to log an
>> error, and the kernel calls modprobe again. Bug! No hotplug involved.
>
> A modprobe to load the console device shouln't open /dev/console. Very
> simple and always been true.

We are _not_ loading a console driver that way, we try to load the
console device itself, not the driver. There is no driver for 5:1 in
any module.

>> > Kernel issues hotplug message
>> > .....
>> > Kernel detects this is stuck
>> > Kernel replies with -ENODEV/-ENXIO to try
>> > and rescue itself from buggy initrd scripts
>>
>> Totally wrong, It never was that way
>
> Funny but its been that way for many many years. Just as it cannot try and
> syslog an error when loading the AF_UNIX socket family.

And? Keep on the subject please. It's still a bug to run in a loop
trying something that will _never_ exist.

Kay

2008-12-07 18:31:15

by Alan

[permalink] [raw]
Subject: Re: Runaway loop with the current git.

> > The loop is detected and terminated.
>
> No. Please back up what you are trying to talk about.

Let me introduce you to.. drum roll.. the source code. Its a useful
resource, why don't you use it for once.


max_modprobes = min(max_threads/2, MAX_KMOD_CONCURRENT);
atomic_inc(&kmod_concurrent);
if (atomic_read(&kmod_concurrent) > max_modprobes) {
/* We may be blaming an innocent here, but unlikely */
if (kmod_loop_msg++ < 5)
printk(KERN_ERR
"request_module: runaway loop modprobe
%s\n", module_name);
atomic_dec(&kmod_concurrent);
return -ENOMEM;
}

Happy now. Print it out, share it with friends, find someone who can read
C if you are stuck.

> The boxes of the reporters hang! Read the bug! Please!

They would still hang. As I repeatedly said for the benefit of two people
who don't seem to be able to read source code, the loop is detected and
terminated. So it already fails the open when it sees it has gotten five
layers deep.

> There is no cheap way out of the problem, it's a kernel bug, and we
> will fix it - you may just delay it with your zero arguments.

Oh I see. Allow me to explain your position in the words of some small
children I know

"ME!! ME!! ME!! ME!! ME!!"

I don't care about your obscure corner-case non bug that in fact was a
crypto bug combined with a modprobe bug and where the crypto bug is
now fixed. I do care about not breaking existing users systems. The fact
we do this is why Linux doesn't suck.

Alan

2008-12-07 19:02:15

by Kay Sievers

[permalink] [raw]
Subject: Re: Runaway loop with the current git.

On Sun, Dec 7, 2008 at 19:31, Alan Cox <[email protected]> wrote:
>> > The loop is detected and terminated.
>>
>> No. Please back up what you are trying to talk about.
>
> Let me introduce you to.. drum roll.. the source code. Its a useful
> resource, why don't you use it for once.
>
>
> max_modprobes = min(max_threads/2, MAX_KMOD_CONCURRENT);
> atomic_inc(&kmod_concurrent);
> if (atomic_read(&kmod_concurrent) > max_modprobes) {
> /* We may be blaming an innocent here, but unlikely */
> if (kmod_loop_msg++ < 5)
> printk(KERN_ERR
> "request_module: runaway loop modprobe
> %s\n", module_name);
> atomic_dec(&kmod_concurrent);
> return -ENOMEM;
> }
>
> Happy now. Print it out, share it with friends, find someone who can read
> C if you are stuck.

It does not work, that's all. Reproduce the bug and look at it for yourself.

It's still a bug, regardless of all the childish stuff you wrap around it.

>> The boxes of the reporters hang! Read the bug! Please!
>
> They would still hang. As I repeatedly said for the benefit of two people
> who don't seem to be able to read source code, the loop is detected and
> terminated. So it already fails the open when it sees it has gotten five
> layers deep.
>
>> There is no cheap way out of the problem, it's a kernel bug, and we
>> will fix it - you may just delay it with your zero arguments.
>
> Oh I see. Allow me to explain your position in the words of some small
> children I know
>
> "ME!! ME!! ME!! ME!! ME!!"
>
> I don't care about your obscure corner-case non bug that in fact was a
> crypto bug combined with a modprobe bug and where the crypto bug is
> now fixed. I do care about not breaking existing users systems. The fact
> we do this is why Linux doesn't suck.

It's not obscure, it's obviously broken. And it currently sucks.

I have no idea why are you repeatedly, with completely wrong arguments
trying, to explain "hotplug" stuff to me. I maintain it by the way,
and I didn't see you involved here in the last years, so I guess you
miss the background to understand the problem.

Kay

2008-12-07 20:00:09

by Alan

[permalink] [raw]
Subject: Re: Runaway loop with the current git.

> > max_modprobes = min(max_threads/2, MAX_KMOD_CONCURRENT);
> > atomic_inc(&kmod_concurrent);
> > if (atomic_read(&kmod_concurrent) > max_modprobes) {
> > /* We may be blaming an innocent here, but unlikely */
> > if (kmod_loop_msg++ < 5)
> > printk(KERN_ERR
> > "request_module: runaway loop modprobe
> > %s\n", module_name);
> > atomic_dec(&kmod_concurrent);
> > return -ENOMEM;
> > }
> >
> > Happy now. Print it out, share it with friends, find someone who can read
> > C if you are stuck.
>
> It does not work, that's all.

It works for me.

> Reproduce the bug and look at it for yourself.

Well since you've got a reproducer and this code works for me (I've tested
it just fine), why don't you go and reproduce the problem then post a fix
to that code I quoted instead of all this reordering rubbish. If you fix
this code not only won't you risk all the mess from re-ordering
initialisations around the kernel but you'll fix non console related
looping which you imply is also broken as you claim that code doesn't
work for you.

If I deliberately break my module utils I see a sequence of modprobes
which then hits kmod_concurrent limit then causes a -ENOMEM back to
userspace which then fails the file open. The bug report also shows the
printk is displayed so the runaway loop *was* detected and the code paths
taken which stopped the loop.

I get open -> modprobe -> open -> modprobe -> open -> modprobe ... ->
open fail, then open fail, open fail, open fail, open fail back to the
first modprobe exiting.

Your proposal to keep the current recent modprobe parameter strings would
shorten the amount of recursion but it wouldn't change the result that I
can see. If I open /dev/console early and wrongly from a modprobe then I
ultimately get a failing open just as I should do.

Alan

2008-12-07 22:26:46

by Kay Sievers

[permalink] [raw]
Subject: Re: Runaway loop with the current git.

On Sun, Dec 7, 2008 at 21:00, Alan Cox <[email protected]> wrote:
>> > max_modprobes = min(max_threads/2, MAX_KMOD_CONCURRENT);
>> > atomic_inc(&kmod_concurrent);
>> > if (atomic_read(&kmod_concurrent) > max_modprobes) {
>> > /* We may be blaming an innocent here, but unlikely */
>> > if (kmod_loop_msg++ < 5)
>> > printk(KERN_ERR
>> > "request_module: runaway loop modprobe
>> > %s\n", module_name);
>> > atomic_dec(&kmod_concurrent);
>> > return -ENOMEM;
>> > }
>> >
>> > Happy now. Print it out, share it with friends, find someone who can read
>> > C if you are stuck.
>>
>> It does not work, that's all.
>
> It works for me.

It runs in a loop here, just like the $subject says, and bko#12153 says.

>> Reproduce the bug and look at it for yourself.
>
> Well since you've got a reproducer and this code works for me (I've tested
> it just fine), why don't you go and reproduce the problem then post a fix
> to that code I quoted instead of all this reordering rubbish. If you fix
> this code not only won't you risk all the mess from re-ordering
> initialisations around the kernel but you'll fix non console related
> looping which you imply is also broken as you claim that code doesn't
> work for you.
>
> If I deliberately break my module utils I see a sequence of modprobes
> which then hits kmod_concurrent limit then causes a -ENOMEM back to
> userspace which then fails the file open. The bug report also shows the
> printk is displayed so the runaway loop *was* detected and the code paths
> taken which stopped the loop.

Sure, I can try to find out why the limiter does does not work here.

> I get open -> modprobe -> open -> modprobe -> open -> modprobe ... ->
> open fail, then open fail, open fail, open fail, open fail back to the
> first modprobe exiting.
>
> Your proposal to keep the current recent modprobe parameter strings would
> shorten the amount of recursion but it wouldn't change the result that I
> can see. If I open /dev/console early and wrongly from a modprobe then I
> ultimately get a failing open just as I should do.

No, my proposal would prevent the kernel to call any modprobe for the
special built-in device 5:1, it shortens the recursion to zero, which
sounds like the right fix. It's really weird to fix the symptoms,
instead of the real problem.

The kernel forks a binary with a broken environment. Even the
in-kernel-tree cpio generator creates /dev/console, and accessing it
from a kernel forked binary makes it crash. The kernel must provide a
sane environment for stuff that it calls, I think, that is pretty
obvious.

Device 5:1 is a core device, which never makes sense to call modprobe
for it. No other later driver will ever register that dev_t, so we
should do it before calling out to other random userspace stuff which
triggers the kernel to go crazy with its own devices.

My proposal was to connect 5:1 to the kobject map at the same time as
we register the whole tty class. We allocate the class, create sysfs
entries, run /sbin/modprobe at that time, why shouldn't we just
register the dev_t that time too?

It properly prevents the needless userspace "driver searching" for a
well known, already created, but not properly registered core device.

It makes the process environment sane, and prevents the root cause of
the bug, and does not just limit the damage.

Thanks,
Kay

2008-12-08 01:19:04

by Theodore Ts'o

[permalink] [raw]
Subject: Re: Runaway loop with the current git.

Let's take a step back, shall we? Fundamentally, what's going on here
is that a particular distribution's initrd (Debian's, to be precise)
is running into an error in response to a modprobe request for
char-major-5-1, and it is attempting to write to the console, which is
resulting in another modprobe request.... ad infinitum.

There is a dispute about whether it is looping forever, or whether it
should be getting caught by kernel/kmod.c's modprobe recursion
detector. Alan has checked the recursion detector and reports that it
works just fine; Evgeniy and Kay are claiming that it in fact loops
forever, and the recursion detector is not working. I'm going to
guess that Alan tested on Fedora, where it did work just fine, and
reason why people using a Debian-derived initrd is seeing a recursive
loop is because the recursive loop detector works by detecting up to
five concurrent calls to modprobe. That is, while the userspace
helper process is running, another userspace helper is invoked, and so
on, so that there are five userspace helpers piled up on one another,
this will trigger the automatic recursion detector. I'm guessing why
it isn't working given Debian's initrd setup is that whatever is
ultimately opening /dev/console isn't being called until after the
helper script has exited.

Given that the general purpose recursion detector is apparently not
working at least in this case, Kay has proposed that we special case a
kludge wihch prevents the userspace helper be called in the case of
5:1. His argument in favor of doing this is that /dev/console is
never a module, so requesting char-major-5-1 will never be helpful,
and this error can only happen in early userspace, when the tty
subsystem hasn't been initialized yet. Alan claims this could also
happen if the appropriate low-level console driver hasn't been loaded,
and so perhaps the right thing in response to the request for
char-major-5-1 is to load 8250_pci. Here, I think Alan is wrong, and
Kay is right. From looking at the source, if there is no low-level
console driver loaded, there is no call to request_module(); the only
time this can happen is when tty driver hasn't been initialized in
early startup.

On the other hand, Alan is right that in general it is the usermode
helper and initrd's responsibility not create a recursive dependency.
This is in general true, not just for /dev/console. So based on that,
it can be argued that the recursion kludge checking for 5:1 should
just as much be put in userspace. In addition, the fact that
recursion detection isn't working also seems to indicate that initrd
in question is also doing something very wrong.

So I would think the best thing to do is to figure out what Debian's
initrd is doing that is evading the recursion detection. Fixing that
is going to make things much more robust.

Regards,

- Ted

2008-12-08 03:24:27

by Valdis Klētnieks

[permalink] [raw]
Subject: Re: Runaway loop with the current git.

On Sun, 07 Dec 2008 19:22:41 +0100, Kay Sievers said:

> We are _not_ loading a console driver that way, we try to load the
> console device itself, not the driver. There is no driver for 5:1 in
> any module.

Semantic quibbling, trying to paper over stupidity.

It doesn't really matter if it's a "console driver" or "the console device
itself". If you're in modprobe loading *any* piece of "all the stuff needed
to make open("/dev/console") work", the last thing you want to be doing
is opening /dev/console to complain about something not working.

As Alan said several times now, there really *is* a requirement that early
userspace has its act together.


Attachments:
(No filename) (226.00 B)

2008-12-08 03:35:23

by Kay Sievers

[permalink] [raw]
Subject: Re: Runaway loop with the current git.

On Mon, Dec 8, 2008 at 02:18, Theodore Tso <[email protected]> wrote:
> Let's take a step back, shall we? Fundamentally, what's going on here
> is that a particular distribution's initrd (Debian's, to be precise)
> is running into an error in response to a modprobe request for
> char-major-5-1, and it is attempting to write to the console, which is
> resulting in another modprobe request.... ad infinitum.
>
> There is a dispute about whether it is looping forever, or whether it
> should be getting caught by kernel/kmod.c's modprobe recursion
> detector. Alan has checked the recursion detector and reports that it
> works just fine; Evgeniy and Kay are claiming that it in fact loops
> forever, and the recursion detector is not working. I'm going to
> guess that Alan tested on Fedora, where it did work just fine, and
> reason why people using a Debian-derived initrd is seeing a recursive
> loop is because the recursive loop detector works by detecting up to
> five concurrent calls to modprobe. That is, while the userspace
> helper process is running, another userspace helper is invoked, and so
> on, so that there are five userspace helpers piled up on one another,
> this will trigger the automatic recursion detector. I'm guessing why
> it isn't working given Debian's initrd setup is that whatever is
> ultimately opening /dev/console isn't being called until after the
> helper script has exited.
>
> Given that the general purpose recursion detector is apparently not
> working at least in this case, Kay has proposed that we special case a
> kludge wihch prevents the userspace helper be called in the case of
> 5:1. His argument in favor of doing this is that /dev/console is
> never a module, so requesting char-major-5-1 will never be helpful,
> and this error can only happen in early userspace, when the tty
> subsystem hasn't been initialized yet. Alan claims this could also
> happen if the appropriate low-level console driver hasn't been loaded,
> and so perhaps the right thing in response to the request for
> char-major-5-1 is to load 8250_pci. Here, I think Alan is wrong, and
> Kay is right. From looking at the source, if there is no low-level
> console driver loaded, there is no call to request_module(); the only
> time this can happen is when tty driver hasn't been initialized in
> early startup.
>
> On the other hand, Alan is right that in general it is the usermode
> helper and initrd's responsibility not create a recursive dependency.
> This is in general true, not just for /dev/console. So based on that,
> it can be argued that the recursion kludge checking for 5:1 should
> just as much be put in userspace. In addition, the fact that
> recursion detection isn't working also seems to indicate that initrd
> in question is also doing something very wrong.
>
> So I would think the best thing to do is to figure out what Debian's
> initrd is doing that is evading the recursion detection. Fixing that
> is going to make things much more robust.

I tested it with an initramfs image with /sbin/modprobe being a shell
script that writes a few bytes to /dev/console, and does nothing else.
It did run for minutes here, until I stopped it.

Sure, if we can make userspace behave nicely, I'm all for doing it. On
the other hand, I think it's a good thing to provide a sane
environment by the kernel for any "non-initramfs-optimized" helper.
Writing to /dev/console is a usual behavior for tools used in
initramfs, to be able to debug bootup problems. In many cases it is
just glibc's fallback LOG_CONS, if syslog is not available.

Modprobe has syslog code, so it might very well be, that the Debian
initramfs isn't doing anything obscure. The current behavior, is very
easy to trigger bug, and a pretty hard to debug setup, if you do not
add debugging code to the kernel code itself.

That's why I still think it's a good thing, to connect the core tty
devices to their dev_t handler internally, before we init all the
other drivers and run userspace. It will prevent any further needless
searches by the kernel, for a driver for the already created but just
not registered tty devices. We will do exactly the same dev_t
<->device connection (register_chrdev_region) anyway, directly after
loading all the other drivers.

Thanks,
Kay

2008-12-08 03:56:12

by Kay Sievers

[permalink] [raw]
Subject: Re: Runaway loop with the current git.

On Mon, Dec 8, 2008 at 04:23, <[email protected]> wrote:
> On Sun, 07 Dec 2008 19:22:41 +0100, Kay Sievers said:
>
>> We are _not_ loading a console driver that way, we try to load the
>> console device itself, not the driver. There is no driver for 5:1 in
>> any module.

[childish blurb removed]

> It doesn't really matter if it's a "console driver" or "the console device
> itself". If you're in modprobe loading *any* piece of "all the stuff needed
> to make open("/dev/console") work", the last thing you want to be doing
> is opening /dev/console to complain about something not working.

Nothing in initramfs or userspace tries to load "all the stuff needed
to make open("/dev/console") work". It's the kernel itself, that tries
to resolve its own requirements.

The kernel forked binary _writes_ to /dev/console, if we like it or
not, it seems to do that, and it's arguable why it should not, or how
it should detect that it is not allowed to do that. You may just
depend on the logging of binaries to find other bugs.

The helper was called in the first place for something else, in this
case the "cryptomgr". The loop is caused entirely by the kernel
itself, if /dev/console is just accessed. There is no intentional
loading of any console driver from userspace happening here.

Kay

2008-12-08 13:06:50

by Evgeniy Polyakov

[permalink] [raw]
Subject: Re: Runaway loop with the current git.

On Sun, Dec 07, 2008 at 11:55:07PM +0800, Herbert Xu ([email protected]) wrote:
> Having said that I think this patch should remove this particular
> trigger. Unfortunately I couldn't find a more succinct way of
> expressing this relationship: if ALGAPI is built-in because it's
> selected by an algorithm, and CRYPTO_MANAGER is enabled then we
> require CRYPTO_MANAGER to be built-in as well.
>
> Evgeniy, please let me know whether this fixes the problem.

Your patch does not apply cleanly to the latest git and missed
rng dependency to rng2.
Attached patch fixes problem with module loading for me.

diff --git a/crypto/Kconfig b/crypto/Kconfig
index 39dbd8e..dc20a34 100644
--- a/crypto/Kconfig
+++ b/crypto/Kconfig
@@ -31,35 +31,63 @@ config CRYPTO_FIPS

config CRYPTO_ALGAPI
tristate
+ select CRYPTO_ALGAPI2
help
This option provides the API for cryptographic algorithms.

+config CRYPTO_ALGAPI2
+ tristate
+
config CRYPTO_AEAD
tristate
+ select CRYPTO_AEAD2
select CRYPTO_ALGAPI

+config CRYPTO_AEAD2
+ tristate
+ select CRYPTO_ALGAPI2
+
config CRYPTO_BLKCIPHER
tristate
+ select CRYPTO_BLKCIPHER2
select CRYPTO_ALGAPI
- select CRYPTO_RNG
+
+config CRYPTO_BLKCIPHER2
+ tristate
+ select CRYPTO_ALGAPI2
+ select CRYPTO_RNG2

config CRYPTO_HASH
tristate
+ select CRYPTO_HASH2
select CRYPTO_ALGAPI

+config CRYPTO_HASH2
+ tristate
+ select CRYPTO_ALGAPI2
+
config CRYPTO_RNG
tristate
+ select CRYPTO_RNG2
select CRYPTO_ALGAPI

+config CRYPTO_RNG2
+ tristate
+ select CRYPTO_ALGAPI2
+
config CRYPTO_MANAGER
tristate "Cryptographic algorithm manager"
- select CRYPTO_AEAD
- select CRYPTO_HASH
- select CRYPTO_BLKCIPHER
+ select CRYPTO_MANAGER2
help
Create default cryptographic template instantiations such as
cbc(aes).

+config CRYPTO_MANAGER2
+ def_tristate CRYPTO_MANAGER || (CRYPTO_MANAGER!=n && CRYPTO_ALGAPI=y)
+ select CRYPTO_AEAD2
+ select CRYPTO_HASH2
+ select CRYPTO_BLKCIPHER2
+
config CRYPTO_GF128MUL
tristate "GF(2^128) multiplication functions (EXPERIMENTAL)"
depends on EXPERIMENTAL
diff --git a/crypto/Makefile b/crypto/Makefile
index 5862b80..cd4a4ed 100644
--- a/crypto/Makefile
+++ b/crypto/Makefile
@@ -9,24 +9,24 @@ obj-$(CONFIG_CRYPTO_FIPS) += fips.o

crypto_algapi-$(CONFIG_PROC_FS) += proc.o
crypto_algapi-objs := algapi.o scatterwalk.o $(crypto_algapi-y)
-obj-$(CONFIG_CRYPTO_ALGAPI) += crypto_algapi.o
+obj-$(CONFIG_CRYPTO_ALGAPI2) += crypto_algapi.o

-obj-$(CONFIG_CRYPTO_AEAD) += aead.o
+obj-$(CONFIG_CRYPTO_AEAD2) += aead.o

crypto_blkcipher-objs := ablkcipher.o
crypto_blkcipher-objs += blkcipher.o
-obj-$(CONFIG_CRYPTO_BLKCIPHER) += crypto_blkcipher.o
-obj-$(CONFIG_CRYPTO_BLKCIPHER) += chainiv.o
-obj-$(CONFIG_CRYPTO_BLKCIPHER) += eseqiv.o
+obj-$(CONFIG_CRYPTO_BLKCIPHER2) += crypto_blkcipher.o
+obj-$(CONFIG_CRYPTO_BLKCIPHER2) += chainiv.o
+obj-$(CONFIG_CRYPTO_BLKCIPHER2) += eseqiv.o
obj-$(CONFIG_CRYPTO_SEQIV) += seqiv.o

crypto_hash-objs := hash.o
crypto_hash-objs += ahash.o
-obj-$(CONFIG_CRYPTO_HASH) += crypto_hash.o
+obj-$(CONFIG_CRYPTO_HASH2) += crypto_hash.o

cryptomgr-objs := algboss.o testmgr.o

-obj-$(CONFIG_CRYPTO_MANAGER) += cryptomgr.o
+obj-$(CONFIG_CRYPTO_MANAGER2) += cryptomgr.o
obj-$(CONFIG_CRYPTO_HMAC) += hmac.o
obj-$(CONFIG_CRYPTO_XCBC) += xcbc.o
obj-$(CONFIG_CRYPTO_NULL) += crypto_null.o
@@ -73,8 +73,8 @@ obj-$(CONFIG_CRYPTO_MICHAEL_MIC) += michael_mic.o
obj-$(CONFIG_CRYPTO_CRC32C) += crc32c.o
obj-$(CONFIG_CRYPTO_AUTHENC) += authenc.o
obj-$(CONFIG_CRYPTO_LZO) += lzo.o
-obj-$(CONFIG_CRYPTO_RNG) += rng.o
-obj-$(CONFIG_CRYPTO_RNG) += krng.o
+obj-$(CONFIG_CRYPTO_RNG2) += rng.o
+obj-$(CONFIG_CRYPTO_RNG2) += krng.o
obj-$(CONFIG_CRYPTO_ANSI_CPRNG) += ansi_cprng.o
obj-$(CONFIG_CRYPTO_TEST) += tcrypt.o

--
Evgeniy Polyakov

2008-12-09 00:42:38

by Herbert Xu

[permalink] [raw]
Subject: Re: Runaway loop with the current git.

On Mon, Dec 08, 2008 at 04:06:46PM +0300, Evgeniy Polyakov wrote:
>
> Your patch does not apply cleanly to the latest git and missed
> rng dependency to rng2.

Right it was against cryptodev-2.6 actually. Thanks for fixing
it up and confirming. I'll submit this to Linus.

Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

2008-12-09 01:09:49

by Theodore Ts'o

[permalink] [raw]
Subject: Re: Runaway loop with the current git.

On Mon, Dec 08, 2008 at 04:35:20AM +0100, Kay Sievers wrote:
> Sure, if we can make userspace behave nicely, I'm all for doing it. On
> the other hand, I think it's a good thing to provide a sane
> environment by the kernel for any "non-initramfs-optimized" helper.
> Writing to /dev/console is a usual behavior for tools used in
> initramfs, to be able to debug bootup problems. In many cases it is
> just glibc's fallback LOG_CONS, if syslog is not available.

Well, can we agree that (a) there is a generalized modprobe recursion
detector already in the kernel, and (b) for some reason, Debian isn't
triggering it? That seems like a bug, and while it may not be 100%
clear whether the bug is on the userspace side or the kernel side, it
would seem that finding and fixing *that* would be a Good Thing, and
it could be argued that a specific don't request char-major-5-1 would
be papering over this bug (whether it is in Debian's initrd or in the
kernel side code). It would be good to make sure we understand what
the root causes for while the modprobe recursion detector is
apparently not triggering, since it could be that Debian's initrd
might cause some other uncaught recursion loop if we don't drive this
problem determination to root cause.

> That's why I still think it's a good thing, to connect the core tty
> devices to their dev_t handler internally, before we init all the
> other drivers and run userspace.

Um, how early? (/me searches lkml.org for the original patch). Ah,
OK, you want to do it postcore.... There may actually be a problem
with that, because it looks like vtconsole_class_init (which is
currently run as a postcore initcall) really wants to happen before
the tty layer initializes itself. So moving tty into postcore could
potentially run into problems, depending on whether vt.c's or
tty_io.c's initcalls are run first.

- Ted

2008-12-09 02:00:03

by Kay Sievers

[permalink] [raw]
Subject: Re: Runaway loop with the current git.

On Tue, Dec 9, 2008 at 02:09, Theodore Tso <[email protected]> wrote:
> On Mon, Dec 08, 2008 at 04:35:20AM +0100, Kay Sievers wrote:
>> Sure, if we can make userspace behave nicely, I'm all for doing it. On
>> the other hand, I think it's a good thing to provide a sane
>> environment by the kernel for any "non-initramfs-optimized" helper.
>> Writing to /dev/console is a usual behavior for tools used in
>> initramfs, to be able to debug bootup problems. In many cases it is
>> just glibc's fallback LOG_CONS, if syslog is not available.
>
> Well, can we agree that (a) there is a generalized modprobe recursion
> detector already in the kernel, and (b) for some reason, Debian isn't
> triggering it? That seems like a bug, and while it may not be 100%
> clear whether the bug is on the userspace side or the kernel side, it
> would seem that finding and fixing *that* would be a Good Thing, and
> it could be argued that a specific don't request char-major-5-1 would
> be papering over this bug (whether it is in Debian's initrd or in the
> kernel side code). It would be good to make sure we understand what
> the root causes for while the modprobe recursion detector is
> apparently not triggering, since it could be that Debian's initrd
> might cause some other uncaught recursion loop if we don't drive this
> problem determination to root cause.

Right, it would be good to know for sure what's going on.

In my test script it was just a simple access to /dev/console, so it
might just be that modprobe wants to log something on Debian. At least
that would match the behavior I've seen here.

And we really rely on these tools to be able to print debug messages
sometimes to find other bugs (at least at the time it's theoretically
possible to log). And I don't see how modprobe could find out, that it
is not allowed to access /dev/console.

As an ideal solution, I would like to have /dev/console working before
_any_ userspace is called, and it would put all the stuff that is
written to it the kernel "dmesg buffer" as long as the console isn't
working, just to be able to see what's going on, if needed. As the
second choice, it would return -ENODEV. The current looping is
definitely not what we want. :)

>> That's why I still think it's a good thing, to connect the core tty
>> devices to their dev_t handler internally, before we init all the
>> other drivers and run userspace.
>
> Um, how early? (/me searches lkml.org for the original patch). Ah,
> OK, you want to do it postcore.... There may actually be a problem
> with that, because it looks like vtconsole_class_init (which is
> currently run as a postcore initcall) really wants to happen before
> the tty layer initializes itself. So moving tty into postcore could
> potentially run into problems, depending on whether vt.c's or
> tty_io.c's initcalls are run first.

Sure, we might need to change that, if it is a problem. It was mainly
to find out what was going on, because people thought that some serial
driver is involved in the loop, which I didn't believe.

To be safer, maybe we just want to move the cdev_add(),
register_chrdev_region() call, which should prevent the "userspace
driver search".

Thanks,
Kay

2008-12-09 10:14:00

by Alan

[permalink] [raw]
Subject: Re: Runaway loop with the current git.

> apparently not triggering, since it could be that Debian's initrd
> might cause some other uncaught recursion loop if we don't drive this
> problem determination to root cause.

Agreed - and there are all sorts of obscure and wonderful other ways to
trip this up which need the detector (AF_UNIX modular and doing a syslog
for example).

> > That's why I still think it's a good thing, to connect the core tty
> > devices to their dev_t handler internally, before we init all the
> > other drivers and run userspace.
>
> Um, how early? (/me searches lkml.org for the original patch). Ah,
> OK, you want to do it postcore.... There may actually be a problem
> with that, because it looks like vtconsole_class_init (which is
> currently run as a postcore initcall) really wants to happen before
> the tty layer initializes itself. So moving tty into postcore could
> potentially run into problems, depending on whether vt.c's or
> tty_io.c's initcalls are run first.

Its playing with fire and doing so to paper over a completely different
problem. The recursion detector is generic and should be sufficient. If
it isn't then it needs improving to be generic, sufficient and working.