2002-03-14 02:26:53

by Andrea Arcangeli

[permalink] [raw]
Subject: 2.4.19pre3aa2

URL:

http://www.us.kernel.org/pub/linux/kernel/people/andrea/kernels/v2.4/2.4.19pre3aa2.gz
http://www.us.kernel.org/pub/linux/kernel/people/andrea/kernels/v2.4/2.4.19pre3aa2/

VM alone:

http://www.us.kernel.org/pub/linux/kernel/people/andrea/patches/v2.4/2.4.19pre3/vm-32

Changelog:

Only in 2.4.19pre3aa1: 00_lcall_trace-1

Obsoleted by mainline (thanks to Christoph Hellwig for
the reminder).

Only in 2.4.19pre3aa2: 00_loop-IV-API-hvr-1

Make the IV API not to be in function of the blkdev
of the underlying fs, so you can copy your cryptoloop
around without risking to lose data. This breaks the
on-disk format of some encrypted transfer module though,
if you don't like it please discuss it here in CC with
Herbert Valerio Riedel <[email protected]>, the patch
is from him. I think writing a converter in place of
the loop data would be the preferred solution if needed. It could
be done in a way to link transparently with the source of
the kernel modules.

Only in 2.4.19pre3aa2: 00_multipath-routing-smp-1

SMP race fix.

Only in 2.4.19pre3aa2: 00_zlib-1

the zlib fix.

Only in 2.4.19pre3aa1: 10_nfs-o_direct-4
Only in 2.4.19pre3aa2: 10_nfs-o_direct-5
Only in 2.4.19pre3aa1: 10_reiserfs-o_direct-1

Merged together for consistency, suggested by
Christoph Hellwig.

Only in 2.4.19pre3aa1: 10_vm-31
Only in 2.4.19pre3aa2: 10_vm-32

Fixed ext3 deadlock and "theorical" mainline SMP race for
some arch.

Only in 2.4.19pre3aa2: 21_pte-highmem-f00f-1

vmalloc called before smp_init was an hack, right way
is to use fixmap. CONFIG_M686 doesn't mean much these
days, but it's ok and probably most vendors will use it
for the smp kernels, so it will save 4096 of the vmalloc space.
I just didn't wanted to clobber the code with || CONFIG_K7 ||
CONFIG_... | ... given all the other f00f stuff is also
conditional only to M686 and probably nobody bothered to compile
it out for my same reason (if somebody worries about the few
kbytes of bytecode and the 4096 bytes of vmalloc virtual address
space feel free to send me a patch :).

This allows booting P5 SMP with pte-highmem.

Only in 2.4.19pre3aa1: 70_xfs-6.gz
Only in 2.4.19pre3aa2: 70_xfs-7.gz

Rediffed for the vm deadlock fix, also clear PG_Launder _before_
unlocking the buffer in write_buffer_locked().

Andrea


2002-03-14 12:32:53

by Dave Jones

[permalink] [raw]
Subject: Re: 2.4.19pre3aa2

On Thu, Mar 14, 2002 at 03:28:01AM +0100, Andrea Arcangeli wrote:
> Only in 2.4.19pre3aa2: 21_pte-highmem-f00f-1
>
> vmalloc called before smp_init was an hack, right way
> is to use fixmap. CONFIG_M686 doesn't mean much these
> days, but it's ok and probably most vendors will use it
> for the smp kernels, so it will save 4096 of the vmalloc space.
> I just didn't wanted to clobber the code with || CONFIG_K7 ||
> CONFIG_... | ... given all the other f00f stuff is also
> conditional only to M686 and probably nobody bothered to compile
> it out for my same reason

Brian Gerst had a patch a few months back to introduce a CONFIG_F00F
if a relevant CONFIG_Mxxx was chosen[1]. It never got applied anywhere, but makes
more sense than the CONFIG_M686 we currently use.

[1] 386/486/586. With addition of my Vendor choice menu, we could even further
narrow it down to Intel only.

--
| Dave Jones. http://www.codemonkey.org.uk
| SuSE Labs

2002-03-14 12:36:03

by Andrea Arcangeli

[permalink] [raw]
Subject: Re: 2.4.19pre3aa2

On Thu, Mar 14, 2002 at 01:32:23PM +0100, Dave Jones wrote:
> On Thu, Mar 14, 2002 at 03:28:01AM +0100, Andrea Arcangeli wrote:
> > Only in 2.4.19pre3aa2: 21_pte-highmem-f00f-1
> >
> > vmalloc called before smp_init was an hack, right way
> > is to use fixmap. CONFIG_M686 doesn't mean much these
> > days, but it's ok and probably most vendors will use it
> > for the smp kernels, so it will save 4096 of the vmalloc space.
> > I just didn't wanted to clobber the code with || CONFIG_K7 ||
> > CONFIG_... | ... given all the other f00f stuff is also
> > conditional only to M686 and probably nobody bothered to compile
> > it out for my same reason
>
> Brian Gerst had a patch a few months back to introduce a CONFIG_F00F
> if a relevant CONFIG_Mxxx was chosen[1]. It never got applied anywhere, but makes
> more sense than the CONFIG_M686 we currently use.

indeed.

Andrea

2002-03-14 12:46:33

by Jeff Garzik

[permalink] [raw]
Subject: Re: 2.4.19pre3aa2

Dave Jones wrote:

>
>[1] 386/486/586. With addition of my Vendor choice menu, we could even further
> narrow it down to Intel only.
>


Could you be convinced to post your vendor choice patch to linux-kernel
perhaps? :)

(yes, I know I already have it, I would like others to see it)

Jeff



2002-03-14 12:59:54

by Dave Jones

[permalink] [raw]
Subject: Re: 2.4.19pre3aa2

On Thu, Mar 14, 2002 at 07:46:01AM -0500, Jeff Garzik wrote:

> Could you be convinced to post your vendor choice patch to linux-kernel
> perhaps? :)
> (yes, I know I already have it, I would like others to see it)

Sure. it's against something ancient, and I've not had time to rediff/update it.
If there's sufficient interest, I'll do so soon..

http://www.codemonkey.org.uk/patches/old/cpu-choice-2.diff

--
| Dave Jones. http://www.codemonkey.org.uk
| SuSE Labs

2002-03-14 15:55:05

by Bill Davidsen

[permalink] [raw]
Subject: Re: 2.4.19pre3aa2

On Thu, 14 Mar 2002, Dave Jones wrote:

> On Thu, Mar 14, 2002 at 03:28:01AM +0100, Andrea Arcangeli wrote:
> > Only in 2.4.19pre3aa2: 21_pte-highmem-f00f-1
> >
> > vmalloc called before smp_init was an hack, right way
> > is to use fixmap. CONFIG_M686 doesn't mean much these
> > days, but it's ok and probably most vendors will use it
> > for the smp kernels, so it will save 4096 of the vmalloc space.
> > I just didn't wanted to clobber the code with || CONFIG_K7 ||
> > CONFIG_... | ... given all the other f00f stuff is also
> > conditional only to M686 and probably nobody bothered to compile
> > it out for my same reason
>
> Brian Gerst had a patch a few months back to introduce a CONFIG_F00F
> if a relevant CONFIG_Mxxx was chosen[1]. It never got applied anywhere, but makes
> more sense than the CONFIG_M686 we currently use.
>
> [1] 386/486/586. With addition of my Vendor choice menu, we could even further
> narrow it down to Intel only.

Since vendors (and consultants) like to build a single kernel for use on
multiple machines, it would be nice if this could be done by some init
code (released) and a module. I don't know what the overhead would be,
perhaps the runtime code is so small it's not worth doing. Does that mean
it's not worth doing the option either? It certainly would seen desirable
to check for the F00F bug and if the code to handle it was not present
refuse to boot right away.

The code actually looks so small as to be unworthy of an option, given
that many people would set it off not knowing was it was much less whether
they needed it. This is not like a missing FPU where you can do a graceful
reject of the instructions, if you have the bug and not the fix you are
vulnerable to sudden total failures, correct?

--
bill davidsen <[email protected]>
CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.

2002-03-14 16:14:17

by Andrea Arcangeli

[permalink] [raw]
Subject: Re: 2.4.19pre3aa2

On Thu, Mar 14, 2002 at 10:53:01AM -0500, Bill Davidsen wrote:
> On Thu, 14 Mar 2002, Dave Jones wrote:
>
> > On Thu, Mar 14, 2002 at 03:28:01AM +0100, Andrea Arcangeli wrote:
> > > Only in 2.4.19pre3aa2: 21_pte-highmem-f00f-1
> > >
> > > vmalloc called before smp_init was an hack, right way
> > > is to use fixmap. CONFIG_M686 doesn't mean much these
> > > days, but it's ok and probably most vendors will use it
> > > for the smp kernels, so it will save 4096 of the vmalloc space.
> > > I just didn't wanted to clobber the code with || CONFIG_K7 ||
> > > CONFIG_... | ... given all the other f00f stuff is also
> > > conditional only to M686 and probably nobody bothered to compile
> > > it out for my same reason
> >
> > Brian Gerst had a patch a few months back to introduce a CONFIG_F00F
> > if a relevant CONFIG_Mxxx was chosen[1]. It never got applied anywhere, but makes
> > more sense than the CONFIG_M686 we currently use.
> >
> > [1] 386/486/586. With addition of my Vendor choice menu, we could even further
> > narrow it down to Intel only.
>
> Since vendors (and consultants) like to build a single kernel for use on
> multiple machines, it would be nice if this could be done by some init
> code (released) and a module. I don't know what the overhead would be,
> perhaps the runtime code is so small it's not worth doing. Does that mean

Correct. I think the CONFIG option isn't worthwhile in the first place
and this is why I only left the CONFIG_M686 knowing most smp kernels are
compiled that way. 4096bytes of virtual vmallc space and some houndred
bytes of bytecode doesn't worth the config option. If something the
CONFIG_F00F would be more a documentation effort 8). But nevertheless if
somebody really cares, that still make sense and it doesn't hurt. At the
very least it is better than the current halfway broken CONFIG_M686.
But personally I'm not going to implement it and if I would really be
bothered by the halfway broken CONFIG_M686 I would drop it instead.

> it's not worth doing the option either? It certainly would seen desirable
> to check for the F00F bug and if the code to handle it was not present
> refuse to boot right away.
>
> The code actually looks so small as to be unworthy of an option, given
> that many people would set it off not knowing was it was much less whether
> they needed it. This is not like a missing FPU where you can do a graceful
> reject of the instructions, if you have the bug and not the fix you are
> vulnerable to sudden total failures, correct?
>
> --
> bill davidsen <[email protected]>
> CTO, TMR Associates, Inc
> Doing interesting things with little computers since 1979.


Andrea

2002-03-14 16:16:47

by Dave Jones

[permalink] [raw]
Subject: Re: 2.4.19pre3aa2

On Thu, Mar 14, 2002 at 05:12:59PM +0100, Andrea Arcangeli wrote:
> Correct. I think the CONFIG option isn't worthwhile in the first place
> and this is why I only left the CONFIG_M686 knowing most smp kernels are
> compiled that way. 4096bytes of virtual vmallc space and some houndred
> bytes of bytecode doesn't worth the config option. If something the
> CONFIG_F00F would be more a documentation effort 8).

nononono! CONFIG_FOOF is a derived symbol from whatever CONFIG_Mx8x
is set. Much in the way we derive CONFIG_X86_WP_WORKS_OK, CONFIG_X86_PPRO_FENCE
and freinds..

--
| Dave Jones. http://www.codemonkey.org.uk
| SuSE Labs

2002-03-14 16:18:27

by Dave Jones

[permalink] [raw]
Subject: Re: 2.4.19pre3aa2

On Thu, Mar 14, 2002 at 10:53:01AM -0500, Bill Davidsen wrote:

> Since vendors (and consultants) like to build a single kernel for use on
> multiple machines, it would be nice if this could be done by some init
> code (released) and a module.

The relevant code is (where possible) marked as __init already.
So the init code gets thrown away whether needed or not.

> The code actually looks so small as to be unworthy of an option

It's not a matter of codesize, it's a correctness issue in the source.
#ifndef CONFIG_M686 is wrong. It assumes a P6 is the only CPU family
in existence without the bug, despite the fact there are probably close
to a dozen others.

> that many people would set it off not knowing was it was much less whether
> they needed it. This is not like a missing FPU where you can do a graceful
> reject of the instructions, if you have the bug and not the fix you are
> vulnerable to sudden total failures, correct?

No. You at worse vulnerable to a malicious user running hand-crafted code
(no compiler generates this code-sequence) bringing down the machine.

The proposal however was not to remove anything that we currently have.
Every kernel that is possible to be run on an affected box (i386/i486/i586)
would still have the workaround present. We just won't generate it in
Cyrix III, Athlon, Pentium 4, etc kernels..


--
| Dave Jones. http://www.codemonkey.org.uk
| SuSE Labs

2002-03-14 16:32:47

by Andrea Arcangeli

[permalink] [raw]
Subject: Re: 2.4.19pre3aa2

On Thu, Mar 14, 2002 at 05:16:20PM +0100, Dave Jones wrote:
> On Thu, Mar 14, 2002 at 05:12:59PM +0100, Andrea Arcangeli wrote:
> > Correct. I think the CONFIG option isn't worthwhile in the first place
> > and this is why I only left the CONFIG_M686 knowing most smp kernels are
> > compiled that way. 4096bytes of virtual vmallc space and some houndred
> > bytes of bytecode doesn't worth the config option. If something the
> > CONFIG_F00F would be more a documentation effort 8).
>
> nononono! CONFIG_FOOF is a derived symbol from whatever CONFIG_Mx8x
> is set. Much in the way we derive CONFIG_X86_WP_WORKS_OK, CONFIG_X86_PPRO_FENCE
> and freinds..

Of course I perfectly understood what you meant. I designed the current
way of doing the CONFIG_X86 some years back so I know what you mean.

Just re-read the previous email with that in mind. What I meant is that
4096bytes of virtual vmalloc space doesn't worth a CONFIG_F00F IMHO.
While CONFIG_F00F isn't wasted user time because the user won't see it,
it still clobbers the source code, but nevertheless it is better than
the current halfway broken #ifdef CONFIG_M686 and that's why I said if
somebody bothers to verymicrooptimize I'm ok, I just won't microoptimize
myself and I'd drop CONFIG_M686 instead.

Andrea

2002-03-14 17:18:57

by Bill Davidsen

[permalink] [raw]
Subject: Re: 2.4.19pre3aa2

On Thu, 14 Mar 2002, Dave Jones wrote:

> On Thu, Mar 14, 2002 at 10:53:01AM -0500, Bill Davidsen wrote:

> It's not a matter of codesize, it's a correctness issue in the source.
> #ifndef CONFIG_M686 is wrong. It assumes a P6 is the only CPU family
> in existence without the bug, despite the fact there are probably close
> to a dozen others.

No problem with that, it obviously should be done for all chips which
could have the bug, both Intel and faithful clones :-(

> > that many people would set it off not knowing was it was much less whether
> > they needed it. This is not like a missing FPU where you can do a graceful
> > reject of the instructions, if you have the bug and not the fix you are
> > vulnerable to sudden total failures, correct?
>
> No. You at worse vulnerable to a malicious user running hand-crafted code
> (no compiler generates this code-sequence) bringing down the machine.

Having the machine lock suddenly when user mode code is executed fits MY
definition of sudden total failure! Malicious user meaning anyone who d/l
something and runs it. Doesn't take malicious, just dumb. Somone wanting
to try the new screen saver or some such.

> The proposal however was not to remove anything that we currently have.
> Every kernel that is possible to be run on an affected box (i386/i486/i586)
> would still have the workaround present. We just won't generate it in
> Cyrix III, Athlon, Pentium 4, etc kernels..

As long as it's not something the user can easily set wrong. If they
build a kernel for K6 and they have a P-II "stupidity, like virtue, is its
own reward."

--
bill davidsen <[email protected]>
CTO, TMR Associates, Inc
Doing interesting things with little computers since 1979.

2002-03-14 18:02:17

by Andrew Morton

[permalink] [raw]
Subject: Re: 2.4.19pre3aa2

Andrea Arcangeli wrote:
>
> ...
> Only in 2.4.19pre3aa2: 10_vm-32
>
> Fixed ext3 deadlock and "theorical" mainline SMP race for
> some arch.
>

That works fine. ext3 is happy again.

-

2002-03-14 21:17:48

by H. Peter Anvin

[permalink] [raw]
Subject: Re: 2.4.19pre3aa2

Followup to: <[email protected]>
By author: Dave Jones <[email protected]>
In newsgroup: linux.dev.kernel
>
> On Thu, Mar 14, 2002 at 03:28:01AM +0100, Andrea Arcangeli wrote:
> > Only in 2.4.19pre3aa2: 21_pte-highmem-f00f-1
> >
> > vmalloc called before smp_init was an hack, right way
> > is to use fixmap. CONFIG_M686 doesn't mean much these
> > days, but it's ok and probably most vendors will use it
> > for the smp kernels, so it will save 4096 of the vmalloc space.
> > I just didn't wanted to clobber the code with || CONFIG_K7 ||
> > CONFIG_... | ... given all the other f00f stuff is also
> > conditional only to M686 and probably nobody bothered to compile
> > it out for my same reason
>
> Brian Gerst had a patch a few months back to introduce a CONFIG_F00F
> if a relevant CONFIG_Mxxx was chosen[1]. It never got applied anywhere, but makes
> more sense than the CONFIG_M686 we currently use.
>
> [1] 386/486/586. With addition of my Vendor choice menu, we could even further
> narrow it down to Intel only.
>

One thing: I would again like to see "compatibility" and
"optimization" be separated out. The current CPU type menu is a bit
of both.

It's actually quite useful these days to compile for 386 or 486, but
optimize for 686.

-hpa
--
<[email protected]> at work, <[email protected]> in private!
"Unix gives you enough rope to shoot yourself in the foot."
http://www.zytor.com/~hpa/puzzle.txt <[email protected]>

2002-03-14 22:57:59

by Jari Ruusu

[permalink] [raw]
Subject: Re: 2.4.19pre3aa2

Andrea Arcangeli wrote:
> Only in 2.4.19pre3aa2: 00_loop-IV-API-hvr-1
>
> Make the IV API not to be in function of the blkdev
> of the underlying fs, so you can copy your cryptoloop
> around without risking to lose data. This breaks the
> on-disk format of some encrypted transfer module though,
> if you don't like it please discuss it here in CC with
> Herbert Valerio Riedel <[email protected]>, the patch
> is from him. I think writing a converter in place of
> the loop data would be the preferred solution if needed. It could
> be done in a way to link transparently with the source of
> the kernel modules.

Since you are finally fixing loop bugs, attached is a patch of loop fixes
from loop-AES package (cipher code has been removed of course). These fixes
are well tested and are basically unchanged since September 29 2001. Biggest
fix is to make device backed loop work with swap so that encrypted swap is
possible:

- IV computed in 512 byte units.
- Make device backed loop work with swap by pre-allocating pages.
- External encryption module locking bug fixed (from Ingo Rohloff).
- Get rid of the loop_get_bs() crap.
- grab_cache_page() return value handled properly, avoids Oops.
- No more illegal messing with BH_Dirty flag.
- No more illegal sleeping in generic_make_request().
- Loops can be set-up properly when root partition is still mounted ro.
- Default soft block size is set properly for file backed loops.
- kmalloc() error case handled properly.
- loop_set_fd() 'goto out_putf' error case handled properly.

Regards,
Jari Ruusu <[email protected]>


Attachments:
loop-fixes-2.4.19-pre2.diff.gz (7.05 kB)

2002-03-15 10:57:33

by Jens Axboe

[permalink] [raw]
Subject: Re: 2.4.19pre3aa2

On Fri, Mar 15 2002, Jari Ruusu wrote:
> - No more illegal sleeping in generic_make_request().

I've told you this before -- sleeping in make_request is not illegal,
heck it happens _all the time_. Safely sleeping requires a reserved pool
of the units you wish to allocate, of course. In fact I think that would
be much nicer than the path you are following here by delaying
allocations to the loop thread (and still not using a reserved pool).

Briefly looking through the patch.

- loop_put_buffer(), it looks racy to check waitqueue_active there.

- if(!bh) return((struct buffer_head *)0);

eww!

- Also, please adher to the style. VaRiAbLe names can hurt the eyes, and
stuff like

if (something) break;

return(val);

etc don't belong too. Could you fix that up?

That said, thanks for fixing it!

--
Jens Axboe

2002-03-15 11:10:03

by Jens Axboe

[permalink] [raw]
Subject: Re: 2.4.19pre3aa2

On Fri, Mar 15 2002, Jens Axboe wrote:
> On Fri, Mar 15 2002, Jari Ruusu wrote:
> > - No more illegal sleeping in generic_make_request().

BTW, it looks like you are killing LO_FLAGS_BH_REMAP?! Why? This is a
very worthwhile optimization.

--
Jens Axboe

2002-03-15 12:20:30

by Herbert Valerio Riedel

[permalink] [raw]
Subject: Re: 2.4.19pre3aa2

hello!

On Thu, 2002-03-14 at 23:57, Jari Ruusu wrote:
> Andrea Arcangeli wrote:
> > Only in 2.4.19pre3aa2: 00_loop-IV-API-hvr-1
> >
> > Make the IV API not to be in function of the blkdev
> > of the underlying fs, so you can copy your cryptoloop
> > around without risking to lose data. This breaks the
> > on-disk format of some encrypted transfer module though,
> > if you don't like it please discuss it here in CC with
> > Herbert Valerio Riedel <[email protected]>, the patch
> > is from him. I think writing a converter in place of
> > the loop data would be the preferred solution if needed. It could
> > be done in a way to link transparently with the source of
> > the kernel modules.

just as a side note to compatibility to the old IV metric;

the patch I sent to andrea is quite minimal, it only changes the IV
metric, and adds a few #define's to loop.h in order to recognize the IV
metric when compiling filter modules against it; as to jari's patch, if
the following def's were added, it can be used instead of my minimal
patch...

/* definitions for IV metric */
#define LOOP_IV_SECTOR_BITS 9
#define LOOP_IV_SECTOR_SIZE (1 << LOOP_IV_SECTOR_BITS)

typedef int loop_iv_t;

...except maybe for when backward compatibility is needed. As it is a
major concern to some of us to be able to convert their old iv-metric
encrypted volumes to the new "atomic"-IV-metric, it can be usefull to be
able to have both IV metrics available on the same system...

one approach could be to losetup the concerned volume w/ an old loop.o v
ersion and decipher the content on-the-fly by doing something like

# set up the loop dev
losetup -e cipher /dev/loop0 /dev/partition

# make sure we got the right passphrase etc...; and also make sure the
# kernel fs code set the right transfer-block-size...
mount /dev/loop0 /mnt/tmp; umount /dev/loop0

# decrypt on-the-fly...
dd if=/dev/loop0 of=/dev/partition

...now in theory we should have the unencrypted data lying in
/dev/partition


now we could replace the loop.o module by one feature the new IV metric,
and just reverse the process..

so far ok... but some of us might want to decrypt and encrypt directly,
w/o having to have some unencrypted (thus unprotected) data lying around
for the meantime; so we might want to have 2 IV metrics available at the
same time... this can either be done by cooperation from the loop.o
module, i.e. some option for setting the IV size to calculate... _or_ it
can be calculated by the filter module, by finding out what IV blocksize
to assume, and then dividing the 512-IV-metric by some integral factor
to reflect the old metric...

the latter approach is possible w/ my patch, and has been implemented
cryptoapi's cryptoloop filter through ioctl()...

this relays on the IV-fix not modifying the transfer block size (if the
blocks passed to the transfer filter should become smaller than the old
metric's IV block size, we simply can't emulate the old metric...), but
only the passed IV value...

ps: just to make one thing clear (again), I don't care too much whether
my loop-fix or jari's goes in; I primarily care for a fixed IV situation
(including the above mentioned #define's/typedef) and if possible anyhow
to allow for limited compatibility to the old metric...

regards,
--
Herbert Valerio Riedel / Phone: (EUROPE) +43-1-58801-18840
Email: [email protected] / Finger [email protected] for GnuPG Public Key
GnuPG Key Fingerprint: 7BB9 2D6C D485 CE64 4748 5F65 4981 E064 883F
4142


Attachments:
signature.asc (232.00 B)
This is a digitally signed message part

2002-03-15 17:35:43

by Jari Ruusu

[permalink] [raw]
Subject: Re: 2.4.19pre3aa2

Jens Axboe wrote:
> On Fri, Mar 15 2002, Jari Ruusu wrote:
> > - No more illegal sleeping in generic_make_request().
>
> I've told you this before -- sleeping in make_request is not illegal,
> heck it happens _all the time_. Safely sleeping requires a reserved pool
> of the units you wish to allocate, of course. In fact I think that would
> be much nicer than the path you are following here by delaying
> allocations to the loop thread (and still not using a reserved pool).

Yes, I know you have told me that before, but I'm being overcareful. See:

<quote> from device drivers book by Alessandro Rubini, chapter 12, page 331
The request function has one very important constraint: it must be atomic.
request is not usually called in direct response to user requests, and it is
not running in the context of any particular process. It can be called at
interrupt time, from tasklets, or from any number of other places. Thus, it
must not sleep while carrying out its tasks.
</quote>

> - loop_put_buffer(), it looks racy to check waitqueue_active there.

No race there. All that loop_put_buffer() cares is that helper thread wakes
up. If helper thread woke up earlier and completed its job, fine. If helper
thread wakes up later, that is fine too. If helper thread wakes up
unnecessarily, it will just go back to sleep after noticing that that there
is no work to do.

> - if(!bh) return((struct buffer_head *)0);
>
> eww!
>
> - Also, please adher to the style. VaRiAbLe names can hurt the eyes, and
> stuff like
>
> if (something) break;
>
> return(val);
>
> etc don't belong too. Could you fix that up?
>
> That said, thanks for fixing it!

If there is any chance of being merged to mainline kernel, I will fix these
"hurt the eyes" formatting issues.

> BTW, it looks like you are killing LO_FLAGS_BH_REMAP?! Why? This is a
> very worthwhile optimization.

Removing it simplified the code a lot. Doing remap direcly from
loop_make_request() would probably be more effective. Just remap and return
1 from loop_make_request() like LVM code does.

Regards,
Jari Ruusu <[email protected]>

2002-03-15 17:37:17

by Jari Ruusu

[permalink] [raw]
Subject: Re: 2.4.19pre3aa2

Herbert Valerio Riedel wrote:
> the patch I sent to andrea is quite minimal, it only changes the IV
> metric, and adds a few #define's to loop.h in order to recognize the IV
> metric when compiling filter modules against it; as to jari's patch, if
> the following def's were added, it can be used instead of my minimal
> patch...
>
> /* definitions for IV metric */
> #define LOOP_IV_SECTOR_BITS 9
> #define LOOP_IV_SECTOR_SIZE (1 << LOOP_IV_SECTOR_BITS)
>
> typedef int loop_iv_t;
>
> ...except maybe for when backward compatibility is needed. As it is a
> major concern to some of us to be able to convert their old iv-metric
> encrypted volumes to the new "atomic"-IV-metric, it can be usefull to be
> able to have both IV metrics available on the same system...

Not changing IV parameter type in 2.4 kernels is important. Break that in
2.5/2.6 kernels, but not in stable 2.4, ok? Older 2.4 kernels dont't have
loop_iv_t, and being able to compile _existing_ modules for them is
important. 512 byte IV is nothing new, you know that, and all sane systems
have used 512 byte IV for a long time already. So the 'block size IV' change
to '512 byte IV' is nothing new, but changing the parameter type is evil and
should be avoided for compatibility sake.

> ps: just to make one thing clear (again), I don't care too much whether
> my loop-fix or jari's goes in; I primarily care for a fixed IV situation
> (including the above mentioned #define's/typedef) and if possible anyhow
> to allow for limited compatibility to the old metric...

So the choice here is either break (or at least cause need to modify) all
other implementations or cryptoapi implementation.

Herbert, if this loop_iv_t type goes into mainline kernel, I will have to
reverse that on loop-AES patches for backward compatibility.

Dependency on above mentioned #define's/typedef on kernel include files is
silly, as cryptoapi can define them on any of its own include files.

Regards,
Jari Ruusu <[email protected]>

2002-03-15 17:57:37

by Andrea Arcangeli

[permalink] [raw]
Subject: Re: 2.4.19pre3aa2

On Fri, Mar 15, 2002 at 07:35:02PM +0200, Jari Ruusu wrote:
> Jens Axboe wrote:
> > On Fri, Mar 15 2002, Jari Ruusu wrote:
> > > - No more illegal sleeping in generic_make_request().
> >
> > I've told you this before -- sleeping in make_request is not illegal,
> > heck it happens _all the time_. Safely sleeping requires a reserved pool
> > of the units you wish to allocate, of course. In fact I think that would
> > be much nicer than the path you are following here by delaying
> > allocations to the loop thread (and still not using a reserved pool).
>
> Yes, I know you have told me that before, but I'm being overcareful. See:
>
> <quote> from device drivers book by Alessandro Rubini, chapter 12, page 331
> The request function has one very important constraint: it must be atomic.
> request is not usually called in direct response to user requests, and it is
> not running in the context of any particular process. It can be called at
> interrupt time, from tasklets, or from any number of other places. Thus, it
> must not sleep while carrying out its tasks.
> </quote>

loop isn't implement via ->request_fn anymore. Loop since 2.4 is only
driven by the ->make_request_fn, that for the other more normal devices
just means the old legacy __make_request. request_fn is subject to the
rules pointed out by Alessandro, but ->make_request_fn can sleep just
like ll_rw_block can sleep. ->make_request_fn and in turn
loop_make_request can only run in normal context with irq enabled and
they're both allowed to sleep just like ll_rw_block and submit_bh.

Nevertheless as Jens said the infinite-loop-allocation in the
->make_request_fn path are deadlock prone at the moment, not because
they sleeps but because they need a reserved mempool to guarantee
operations can go ahead slowly without deadlocks even if dynamic
allocation fails, but this is not a very pratical problem, it's very
unlikely to deadlock there (it's not worse than the other infinite loop
in getblk() that affects not just loop).

Andrea

2002-03-15 18:37:10

by Herbert Valerio Riedel

[permalink] [raw]
Subject: Re: 2.4.19pre3aa2

hello!

On Fri, 2002-03-15 at 18:36, Jari Ruusu wrote:
> > /* definitions for IV metric */
> > #define LOOP_IV_SECTOR_BITS 9
> > #define LOOP_IV_SECTOR_SIZE (1 << LOOP_IV_SECTOR_BITS)
> >
> > typedef int loop_iv_t;

> Not changing IV parameter type in 2.4 kernels is important. Break that in
> 2.5/2.6 kernels, but not in stable 2.4, ok?

if you look closely at the typedef above, you'll notice that it doesn't
introduce a _new_ type, but only a synonym for the 'int' type, which
happens to be the type used in 2.4 kernels...

typedef int (* transfer_proc_t)(struct loop_device *, int cmd,
char *raw_buf, char *loop_buf, int size,
int real_block);

what I'm trying to do by the introduction of this _aesthetic_ syntactic
element, is to prepare the way to smoothly upgrade the type used for the
IV value; and more important to keep filter modules from having to add
ugly #ifdef LINUX_VERSION_CODE <= KERNEL(2,X,Y)... just for changes
which could be catched by kernel headers...

have you ever wondered, why even the C standard defines some typedefs as
size_t instead of using just a 'long'?

> Older 2.4 kernels dont't have
> loop_iv_t, and being able to compile _existing_ modules for them is
> important.
as older kernels will likely have the variable-IV-metric, they'll have
to be patched anyway (loop.[ch]), and after having patched them, the
problem vanishes...


>> ps: just to make one thing clear (again), I don't care too much whether
>> my loop-fix or jari's goes in; I primarily care for a fixed IV situation
>> (including the above mentioned #define's/typedef) and if possible anyhow
>> to allow for limited compatibility to the old metric...
> So the choice here is either break (or at least cause need to modify) all
> other implementations or cryptoapi implementation.
actually it's just -- either make my life harder w/ cryptoapi or not, as
it doesn't affect other implementations; but other implementations may
start to use those typedefs in the future, so when 2.6 (or whatever
comes out) the 'current' 2.6 has them already, and the 'ultra-stable'
2.4's will have them as well... and maybe everybody is happy, that he
can compile the same c source against both, 2.4 and 2.6, loop.h and
still get a correct filter module...

> Herbert, if this loop_iv_t type goes into mainline kernel, I will have to
> reverse that on loop-AES patches for backward compatibility.
I don't see why... loop_iv_t is just a 'reviced-declarator-type-list
int', i.e. it is is type compatible w/ an 'int'... so you won't even
notice the difference when compiling loop-AES against it...

> Dependency on above mentioned #define's/typedef on kernel include files is
> silly, as cryptoapi can define them on any of its own include files.
yeah, sure... at the expense of having to redundantly keep track (ugly
#ifdef's) of which kernel version is being used...

another reason I haven't mentioned so far is about the #define's;

#define LOOP_IV_SECTOR_BITS 9
#define LOOP_IV_SECTOR_SIZE (1 << LOOP_IV_SECTOR_BITS)

...if you don't see any reason for those 2 innocent #define then you
won't see any reason for other #define in the kernel headers as well;
e.g.:

fs.h:
#define BLOCK_SIZE_BITS 10
#define BLOCK_SIZE (1<<BLOCK_SIZE_BITS)

you may ask why those are used, since they are hardcoded anyway... well,
lemme show you an example where this would considerably help:

loop.c:
static unsigned long compute_loop_size(struct loop_device *lo, struct
dentry * lo_dentry, kdev_t lodev)
{
if (S_ISREG(lo_dentry->d_inode->i_mode))
return (lo_dentry->d_inode->i_size - lo->lo_offset) >> BLOCK_SIZE_BITS;
if (blk_size[MAJOR(lodev)])
return blk_size[MAJOR(lodev)][MINOR(lodev)] -
(lo->lo_offset >> BLOCK_SIZE_BITS);
return MAX_DISK_SIZE;
}

and now imagine the same source fragment, with the #define's
evaluated... wouldn't that be less self-explaining?

well, I hope I could make clear to you, why using symbolic constants and
the use of typedefs in C are good practice in general wrt software
engineering...

regards,
--
Herbert Valerio Riedel / Phone: (EUROPE) +43-1-58801-18840
Email: [email protected] / Finger [email protected] for GnuPG Public Key
GnuPG Key Fingerprint: 7BB9 2D6C D485 CE64 4748 5F65 4981 E064 883F
4142


Attachments:
signature.asc (232.00 B)
This is a digitally signed message part

2002-03-16 01:25:37

by Mike Fedyk

[permalink] [raw]
Subject: Re: 2.4.19pre3aa2

On Thu, Mar 14, 2002 at 05:12:59PM +0100, Andrea Arcangeli wrote:
> On Thu, Mar 14, 2002 at 10:53:01AM -0500, Bill Davidsen wrote:
> > On Thu, 14 Mar 2002, Dave Jones wrote:
> >
> > > On Thu, Mar 14, 2002 at 03:28:01AM +0100, Andrea Arcangeli wrote:
> > > > Only in 2.4.19pre3aa2: 21_pte-highmem-f00f-1
> > > >
> > > > vmalloc called before smp_init was an hack, right way
> > > > is to use fixmap. CONFIG_M686 doesn't mean much these
> > > > days, but it's ok and probably most vendors will use it
> > > > for the smp kernels, so it will save 4096 of the vmalloc space.
> > > > I just didn't wanted to clobber the code with || CONFIG_K7 ||
> > > > CONFIG_... | ... given all the other f00f stuff is also
> > > > conditional only to M686 and probably nobody bothered to compile
> > > > it out for my same reason
> > >
> > > Brian Gerst had a patch a few months back to introduce a CONFIG_F00F
> > > if a relevant CONFIG_Mxxx was chosen[1]. It never got applied anywhere, but makes
> > > more sense than the CONFIG_M686 we currently use.
> > >
> > > [1] 386/486/586. With addition of my Vendor choice menu, we could even further
> > > narrow it down to Intel only.
> >
> > Since vendors (and consultants) like to build a single kernel for use on
> > multiple machines, it would be nice if this could be done by some init
> > code (released) and a module. I don't know what the overhead would be,
> > perhaps the runtime code is so small it's not worth doing. Does that mean
>
> Correct. I think the CONFIG option isn't worthwhile in the first place
> and this is why I only left the CONFIG_M686 knowing most smp kernels are
> compiled that way. 4096bytes of virtual vmallc space and some houndred

Does that mean kernels compiled for 486 and smp won't be protected from F00F?

2002-03-16 12:13:07

by Jari Ruusu

[permalink] [raw]
Subject: Re: 2.4.19pre3aa2

Herbert Valerio Riedel wrote:
> On Fri, 2002-03-15 at 18:36, Jari Ruusu wrote:
> > Not changing IV parameter type in 2.4 kernels is important. Break that in
> > 2.5/2.6 kernels, but not in stable 2.4, ok?
>
> if you look closely at the typedef above, you'll notice that it doesn't
> introduce a _new_ type, but only a synonym for the 'int' type, which
> happens to be the type used in 2.4 kernels...

A synonym that does not exist on older kernels.

> typedef int (* transfer_proc_t)(struct loop_device *, int cmd,
> char *raw_buf, char *loop_buf, int size,
> int real_block);
^^^
That function prototype _must_ _not_ change in 2.4 kernels. That 'int'
better be there on newer 2.4 kernels.

> > Older 2.4 kernels dont't have
> > loop_iv_t, and being able to compile _existing_ modules for them is
> > important.
> as older kernels will likely have the variable-IV-metric, they'll have
> to be patched anyway (loop.[ch]), and after having patched them, the
> problem vanishes...

If you haven't noticed, loop-AES compiles and works fine on older kernels
without changing anything in kernel sources.

> > So the choice here is either break (or at least cause need to modify) all
> > other implementations or cryptoapi implementation.
> actually it's just -- either make my life harder w/ cryptoapi or not, as
> it doesn't affect other implementations;

Transfer function prototype change does affect other implementations. I
would have to add workaround to compile loop-AES on older kernels. If
cryptoapi got one parameter wrong, I don't see why other implementations
should bend over because of that.

Herbert, if you need some special types and/or macros in cryptoapi, just
define them in cryptoapi include files. Please don't mess with code that
other implementations depend on.

> > Herbert, if this loop_iv_t type goes into mainline kernel, I will have to
> > reverse that on loop-AES patches for backward compatibility.
> I don't see why... loop_iv_t is just a 'reviced-declarator-type-list
> int', i.e. it is is type compatible w/ an 'int'... so you won't even
> notice the difference when compiling loop-AES against it...

What I do notice is having to add workaround. See, loop-AES only creates
patched-loop.c (but not patched-loop.h), and under some circumstances
patched-loop.c may come from newer code than loop.h. So it needs a
workaround if transfer function prototype is changed.

> > Dependency on above mentioned #define's/typedef on kernel include files is
> > silly, as cryptoapi can define them on any of its own include files.
> yeah, sure... at the expense of having to redundantly keep track (ugly
> #ifdef's) of which kernel version is being used...

If IV parameter type is changed from 32 bits to 64 bits, you will need new
code and #ifdef's to handle that anyway. And that is 2.5/2.6 thing anyway.

Regards,
Jari Ruusu <[email protected]>

2002-03-16 12:11:27

by Jari Ruusu

[permalink] [raw]
Subject: Re: 2.4.19pre3aa2

Andrea Arcangeli wrote:
> Nevertheless as Jens said the infinite-loop-allocation in the
> ->make_request_fn path are deadlock prone at the moment, not because
> they sleeps but because they need a reserved mempool to guarantee
> operations can go ahead slowly without deadlocks even if dynamic
> allocation fails, but this is not a very pratical problem, it's very
> unlikely to deadlock there (it's not worse than the other infinite loop
> in getblk() that affects not just loop).

Unlikely to deadlock for normal filesystems, but swap on encrypted loop is a
different case.

Deadlock free operation is exactly why my prealloc loop patch is needed.
Beyond initial preallocating that is done at losetup time, it does not
allocate anything from kernel memory pools, but effectively recycles its
private per loop device preallocated memory. Encrypted swap needs that, and
normal device backed loop file systems are also happy with deadlock free and
VM stress free operation.

Regards,
Jari Ruusu <[email protected]>

2002-03-16 13:54:06

by Andrea Arcangeli

[permalink] [raw]
Subject: Re: 2.4.19pre3aa2

On Sat, Mar 16, 2002 at 02:10:27PM +0200, Jari Ruusu wrote:
> Andrea Arcangeli wrote:
> > Nevertheless as Jens said the infinite-loop-allocation in the
> > ->make_request_fn path are deadlock prone at the moment, not because
> > they sleeps but because they need a reserved mempool to guarantee
> > operations can go ahead slowly without deadlocks even if dynamic
> > allocation fails, but this is not a very pratical problem, it's very
> > unlikely to deadlock there (it's not worse than the other infinite loop
> > in getblk() that affects not just loop).
>
> Unlikely to deadlock for normal filesystems, but swap on encrypted loop is a
> different case.

Not really much different, it's no different than other paths like
getblk (when you swap on top of the fs) and brw_page (OTOH we have a
little reserved shared pool for the async bh needed by brw_page but it's not
math accurate first of all because it's shared for the other pagecache
I/O too), it just increases a little the mem pressure when there's
a transfer function involved and the translation cannot be done in
place, but I obviously agree that such loop deadlock needs fixing
reagardless if it's easy to reproduce it or not :). Infact I think I
mentioned such deadlock in the past too.

> Deadlock free operation is exactly why my prealloc loop patch is needed.
> Beyond initial preallocating that is done at losetup time, it does not
> allocate anything from kernel memory pools, but effectively recycles its
> private per loop device preallocated memory. Encrypted swap needs that, and
> normal device backed loop file systems are also happy with deadlock free and
> VM stress free operation.

Yes, thanks.

Andrea

2002-03-18 19:14:52

by Jens Axboe

[permalink] [raw]
Subject: Re: 2.4.19pre3aa2

On Fri, Mar 15 2002, Jari Ruusu wrote:
> Jens Axboe wrote:
> > On Fri, Mar 15 2002, Jari Ruusu wrote:
> > > - No more illegal sleeping in generic_make_request().
> >
> > I've told you this before -- sleeping in make_request is not illegal,
> > heck it happens _all the time_. Safely sleeping requires a reserved pool
> > of the units you wish to allocate, of course. In fact I think that would
> > be much nicer than the path you are following here by delaying
> > allocations to the loop thread (and still not using a reserved pool).
>
> Yes, I know you have told me that before, but I'm being overcareful. See:
>
> <quote> from device drivers book by Alessandro Rubini, chapter 12, page 331
> The request function has one very important constraint: it must be atomic.
> request is not usually called in direct response to user requests, and it is
> not running in the context of any particular process. It can be called at
> interrupt time, from tasklets, or from any number of other places. Thus, it
> must not sleep while carrying out its tasks.
> </quote>

That's talking about the request_fn funtion, not related to
make_request_fn that I rewrote loop to use. So that's not a valid point.

> > - if(!bh) return((struct buffer_head *)0);
> >
> > eww!
> >
> > - Also, please adher to the style. VaRiAbLe names can hurt the eyes, and
> > stuff like
> >
> > if (something) break;
> >
> > return(val);
> >
> > etc don't belong too. Could you fix that up?
> >
> > That said, thanks for fixing it!
>
> If there is any chance of being merged to mainline kernel, I will fix these
> "hurt the eyes" formatting issues.

I think there is. At least I can safely say there's no chance it will be
merged if these things aren't fixed. So take your pick :-)

> > BTW, it looks like you are killing LO_FLAGS_BH_REMAP?! Why? This is a
> > very worthwhile optimization.
>
> Removing it simplified the code a lot. Doing remap direcly from
> loop_make_request() would probably be more effective. Just remap and return

LOTS more effective. Please don't kill this functionality. I don't buy
the simplification argument.

> 1 from loop_make_request() like LVM code does.

And like loop currently does...

--
Jens Axboe

2002-03-19 23:28:37

by Jari Ruusu

[permalink] [raw]
Subject: Re: 2.4.19pre3aa2

Jens Axboe wrote:
> On Fri, Mar 15 2002, Jari Ruusu wrote:
> > If there is any chance of being merged to mainline kernel, I will fix these
> > "hurt the eyes" formatting issues.
>
> I think there is. At least I can safely say there's no chance it will be
> merged if these things aren't fixed. So take your pick :-)

OK, I have fixed above mentioned formatting issues. A patch is attached.

And, about sleeping in loop_make_request(), I have also changed the code in
such way that it defaults to code that may sleep in loop_make_request(). But
non-sleeping code is still present (but not currently used), like this:

#if 1
may-sleep-in-loop_make_request-code-here
#else
non-sleeping-loop_make_request-code-here
#endif

> > > BTW, it looks like you are killing LO_FLAGS_BH_REMAP?! Why? This is a
> > > very worthwhile optimization.
> >
> > Removing it simplified the code a lot. Doing remap direcly from
> > loop_make_request() would probably be more effective. Just remap and return
>
> LOTS more effective. Please don't kill this functionality. I don't buy
> the simplification argument.

This new patch does not remove LO_FLAGS_BH_REMAP optimization.

> > 1 from loop_make_request() like LVM code does.
>
> And like loop currently does...

2.4.19-pre3 loop code wanders to loop_get_buffer() and transfer function in
LO_FLAGS_BH_REMAP optimization case.

My version is simpler than yours:

if (lo->lo_flags & LO_FLAGS_BH_REMAP) {
rbh->b_rsector += (lo->lo_offset >> 9);
rbh->b_rdev = lo->lo_device;
return 1;
}

Regards,
Jari Ruusu <[email protected]>


Attachments:
loop-fixes-2.4.19-pre3.diff.gz (7.00 kB)

2002-03-20 07:55:25

by Jens Axboe

[permalink] [raw]
Subject: Re: 2.4.19pre3aa2

On Wed, Mar 20 2002, Jari Ruusu wrote:
> Jens Axboe wrote:
> > On Fri, Mar 15 2002, Jari Ruusu wrote:
> > > If there is any chance of being merged to mainline kernel, I will fix these
> > > "hurt the eyes" formatting issues.
> >
> > I think there is. At least I can safely say there's no chance it will be
> > merged if these things aren't fixed. So take your pick :-)
>
> OK, I have fixed above mentioned formatting issues. A patch is attached.

Super!

> And, about sleeping in loop_make_request(), I have also changed the code in
> such way that it defaults to code that may sleep in loop_make_request(). But
> non-sleeping code is still present (but not currently used), like this:
>
> #if 1
> may-sleep-in-loop_make_request-code-here
> #else
> non-sleeping-loop_make_request-code-here
> #endif

The solution I prefer is to have loop_make_request() block when we run
out of pre-allocated buffers, ie similar to "normal" block drivers when
they run out of request slots. This provides a nice throttling mechanism
and makes sure that loop doesn't eat into the memory polls too heavily.

In any way, do it one way and remove the other.

> > > > BTW, it looks like you are killing LO_FLAGS_BH_REMAP?! Why? This is a
> > > > very worthwhile optimization.
> > >
> > > Removing it simplified the code a lot. Doing remap direcly from
> > > loop_make_request() would probably be more effective. Just remap and return
> >
> > LOTS more effective. Please don't kill this functionality. I don't buy
> > the simplification argument.
>
> This new patch does not remove LO_FLAGS_BH_REMAP optimization.

Good.

> > > 1 from loop_make_request() like LVM code does.
> >
> > And like loop currently does...
>
> 2.4.19-pre3 loop code wanders to loop_get_buffer() and transfer function in
> LO_FLAGS_BH_REMAP optimization case.
>
> My version is simpler than yours:
>
> if (lo->lo_flags & LO_FLAGS_BH_REMAP) {
> rbh->b_rsector += (lo->lo_offset >> 9);
> rbh->b_rdev = lo->lo_device;
> return 1;
> }

So the 'new' version does exactly the same, but doesn't do it in
loop_get_buffer().

--
Jens Axboe

2002-03-20 14:22:15

by Herbert Valerio Riedel

[permalink] [raw]
Subject: Re: 2.4.19pre3aa2

On Wed, 2002-03-20 at 00:26, Jari Ruusu wrote:
> > > If there is any chance of being merged to mainline kernel, I will fix these
> > > "hurt the eyes" formatting issues.
> > I think there is. At least I can safely say there's no chance it will be
> > merged if these things aren't fixed. So take your pick :-)
> OK, I have fixed above mentioned formatting issues. A patch is attached.

...well, you still haven't honored my modest wish of the innocent 2
#define's and 1 typedef...

regards,
--
Herbert Valerio Riedel / Phone: (EUROPE) +43-1-58801-18840
Email: [email protected] / Finger [email protected] for GnuPG Public Key
GnuPG Key Fingerprint: 7BB9 2D6C D485 CE64 4748 5F65 4981 E064 883F
4142


Attachments:
signature.asc (232.00 B)
This is a digitally signed message part

2002-03-20 18:17:37

by Jari Ruusu

[permalink] [raw]
Subject: Re: 2.4.19pre3aa2

Jens Axboe wrote:
> On Wed, Mar 20 2002, Jari Ruusu wrote:
> > And, about sleeping in loop_make_request(), I have also changed the code in
> > such way that it defaults to code that may sleep in loop_make_request(). But
> > non-sleeping code is still present (but not currently used), like this:
> >
> > #if 1
> > may-sleep-in-loop_make_request-code-here
> > #else
> > non-sleeping-loop_make_request-code-here
> > #endif
>
> The solution I prefer is to have loop_make_request() block when we run
> out of pre-allocated buffers, ie similar to "normal" block drivers when
> they run out of request slots. This provides a nice throttling mechanism
> and makes sure that loop doesn't eat into the memory polls too heavily.
>
> In any way, do it one way and remove the other.

OK, non-sleeping code is gone.

> > 2.4.19-pre3 loop code wanders to loop_get_buffer() and transfer function in
> > LO_FLAGS_BH_REMAP optimization case.
>
> So the 'new' version does exactly the same, but doesn't do it in
> loop_get_buffer().

I found a new bug in stock loop driver: LO_FLAGS_BH_REMAP optimization is
never enabled because init hook for type 0 transfer is never executed. That
bug is not dangerous, but is not fixed by patch that I posted here
yesterday... so here is new one that fixes it.


Herbert Valerio Riedel wrote:
> ...well, you still haven't honored my modest wish of the innocent 2
> #define's and 1 typedef...

Your cryptoapi specific defines and typedef are now included.

Regards,
Jari Ruusu <[email protected]>


Attachments:
loop-fixes-2.4.19-pre3-v2.diff.gz (7.18 kB)