2003-07-02 15:07:25

by Andries E. Brouwer

[permalink] [raw]
Subject: [PATCH] cryptoloop

This is part 3 in the loop series.

It adds a module cryptoloop, with source file drivers/block/cryptoloop.c
and config option BLK_DEV_CRYPTOLOOP. It is the missing link between
the kernel crypto subdirectory and the loop device.
This means that one no longer needs kernel patches to use encryption
via the loop device. This goes together with mount and losetup from
util-linux 2.12 which will be released as soon as this goes in.

The configuration menu texts were shortened a bit. On the one hand
because the URLs were outdated. On the other hand, because such a
discussion is more appropriate for Documentation/loop.txt than for
a configuration menu.

There was an unfortunate clash between software that used lo_name
for the (possibly truncated) filename of the loop device -- of course
for informational purposes only --, and the software that used it
for the name of the crypto algorithm. I separated both roles.

The cryptoloop code is basically that from [email protected] and
[email protected].

The unused char key_reserved[48] was removed.

Andries

diff -u --recursive --new-file -X /linux/dontdiff a/drivers/block/Kconfig b/drivers/block/Kconfig
--- a/drivers/block/Kconfig Thu May 22 13:16:20 2003
+++ b/drivers/block/Kconfig Wed Jul 2 16:45:41 2003
@@ -237,29 +237,20 @@
root file system inside a DOS FAT file using this loop device
driver.

- The loop device driver can also be used to "hide" a file system in a
- disk partition, floppy, or regular file, either using encryption
+ To use the loop device, you need the losetup utility, found in the
+ util-linux package, see
+ <ftp://ftp.kernel.org/pub/linux/utils/util-linux/>.
+
+ The loop device driver can also be used to "hide" a file system in
+ a disk partition, floppy, or regular file, either using encryption
(scrambling the data) or steganography (hiding the data in the low
bits of, say, a sound file). This is also safe if the file resides
- on a remote file server. If you want to do this, you will first have
- to acquire and install a kernel patch from
- <ftp://ftp.kerneli.org/pub/kerneli/>, and then you need to
- say Y to this option.
-
- Note that alternative ways to use encrypted file systems are
- provided by the cfs package, which can be gotten from
- <ftp://ftp.kerneli.org/pub/kerneli/net-source/>, and the newer tcfs
- package, available at <http://tcfs.dia.unisa.it/>. You do not need
- to say Y here if you want to use one of these. However, using cfs
- requires saying Y to "NFS file system support" below while using
- tcfs requires applying a kernel patch. An alternative steganography
- solution is provided by StegFS, also available from
- <ftp://ftp.kerneli.org/pub/kerneli/net-source/>.
-
- To use the loop device, you need the losetup utility and a recent
- version of the mount program, both contained in the util-linux
- package. The location and current version number of util-linux is
- contained in the file <file:Documentation/Changes>.
+ on a remote file server.
+
+ There are several ways of doing this. Some of these require kernel
+ patches. The vanilla kernel offers the cryptoloop option. If you
+ want to use that, say Y to both LOOP and CRYPTOLOOP, and make sure
+ you have a recent (version 2.12 or later) version of util-linux.

Note that this loop device has nothing to do with the loopback
device used for network connections from the machine to itself.
@@ -271,6 +262,14 @@

Most users will answer N here.

+config BLK_DEV_CRYPTOLOOP
+ tristate "Cryptoloop Support"
+ depends on BLK_DEV_LOOP
+ ---help---
+ Say Y here if you want to be able to use the ciphers that are
+ provided by the CryptoAPI as loop transformation. This might be
+ used as hard disk encryption.
+
config BLK_DEV_NBD
tristate "Network block device support"
depends on NET
diff -u --recursive --new-file -X /linux/dontdiff a/drivers/block/Makefile b/drivers/block/Makefile
--- a/drivers/block/Makefile Thu May 22 13:16:20 2003
+++ b/drivers/block/Makefile Wed Jul 2 17:38:52 2003
@@ -30,3 +30,4 @@

obj-$(CONFIG_BLK_DEV_UMEM) += umem.o
obj-$(CONFIG_BLK_DEV_NBD) += nbd.o
+obj-$(CONFIG_BLK_DEV_CRYPTOLOOP) += cryptoloop.o
diff -u --recursive --new-file -X /linux/dontdiff a/drivers/block/cryptoloop.c b/drivers/block/cryptoloop.c
--- a/drivers/block/cryptoloop.c Thu Jan 1 01:00:00 1970
+++ b/drivers/block/cryptoloop.c Wed Jul 2 18:17:51 2003
@@ -0,0 +1,191 @@
+/*
+ Linux loop encryption enabling module
+
+ Copyright (C) 2002 Herbert Valerio Riedel <[email protected]>
+ Copyright (C) 2003 Fruhwirth Clemens <[email protected]>
+
+ This module is free software; you can redistribute it and/or modify
+ it under the terms of the GNU General Public License as published by
+ the Free Software Foundation; either version 2 of the License, or
+ (at your option) any later version.
+
+ This module is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ GNU General Public License for more details.
+
+ You should have received a copy of the GNU General Public License
+ along with this module; if not, write to the Free Software
+ Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ */
+
+#include <linux/module.h>
+#include <linux/version.h>
+
+#include <linux/init.h>
+#include <linux/config.h>
+#include <linux/string.h>
+#include <linux/crypto.h>
+#include <linux/blkdev.h>
+#include <linux/loop.h>
+#include <linux/blk.h>
+#include <asm/semaphore.h>
+#include <asm/uaccess.h>
+
+MODULE_LICENSE("GPL");
+MODULE_DESCRIPTION("loop blockdevice transferfunction adaptor / CryptoAPI");
+MODULE_AUTHOR("Herbert Valerio Riedel <[email protected]>");
+
+#define LOOP_IV_SECTOR_BITS 9
+#define LOOP_IV_SECTOR_SIZE (1 << LOOP_IV_SECTOR_BITS)
+
+static int
+cryptoloop_init(struct loop_device *lo, const struct loop_info64 *info)
+{
+ int err = -EINVAL;
+ char cms[LO_NAME_SIZE]; /* cipher-mode string */
+ char *cipher;
+ char *mode;
+ char *cmsp = cms; /* c-m string pointer */
+ struct crypto_tfm *tfm = NULL;
+
+ /* encryption breaks for non sector aligned offsets */
+
+ if (info->lo_offset % LOOP_IV_SECTOR_SIZE)
+ goto out;
+
+ strncpy(cms, info->lo_crypt_name, LO_NAME_SIZE);
+ cms[LO_NAME_SIZE - 1] = 0;
+ cipher = strsep(&cmsp, "-");
+ mode = strsep(&cmsp, "-");
+
+ if (mode == NULL || strcmp(mode, "cbc") == 0)
+ tfm = crypto_alloc_tfm(cipher, CRYPTO_TFM_MODE_CBC);
+ else if (strcmp(mode, "ecb") == 0)
+ tfm = crypto_alloc_tfm(cipher, CRYPTO_TFM_MODE_ECB);
+ if (tfm == NULL)
+ return -EINVAL;
+
+ err = tfm->crt_u.cipher.cit_setkey(tfm, info->lo_encrypt_key,
+ info->lo_encrypt_key_size);
+
+ if (err != 0)
+ goto out_free_tfm;
+
+ lo->key_data = tfm;
+ return 0;
+
+ out_free_tfm:
+ crypto_free_tfm(tfm);
+
+ out:
+ return err;
+}
+
+typedef int (*encdec_t)(struct crypto_tfm *tfm,
+ struct scatterlist *sg_out,
+ struct scatterlist *sg_in,
+ unsigned int nsg, u8 *iv);
+
+static int
+cryptoloop_transfer(struct loop_device *lo, int cmd, char *raw_buf,
+ char *loop_buf, int size, sector_t IV)
+{
+ struct crypto_tfm *tfm = (struct crypto_tfm *) lo->key_data;
+ struct scatterlist sg_out = { 0, };
+ struct scatterlist sg_in = { 0, };
+
+ encdec_t encdecfunc;
+ char const *in;
+ char *out;
+
+ if (cmd == READ) {
+ in = raw_buf;
+ out = loop_buf;
+ encdecfunc = tfm->crt_u.cipher.cit_decrypt_iv;
+ } else {
+ in = loop_buf;
+ out = raw_buf;
+ encdecfunc = tfm->crt_u.cipher.cit_encrypt_iv;
+ }
+
+ while (size > 0) {
+ const int sz = min(size, LOOP_IV_SECTOR_SIZE);
+ u32 iv[4] = { 0, };
+ iv[0] = cpu_to_le32(IV & 0xffffffff);
+
+ sg_in.page = virt_to_page(in);
+ sg_in.offset = (unsigned long)in & ~PAGE_MASK;
+ sg_in.length = sz;
+
+ sg_out.page = virt_to_page(out);
+ sg_out.offset = (unsigned long)out & ~PAGE_MASK;
+ sg_out.length = sz;
+
+ encdecfunc(tfm, &sg_out, &sg_in, sz, (u8 *)iv);
+
+ IV++;
+ size -= sz;
+ in += sz;
+ out += sz;
+ }
+
+ return 0;
+}
+
+
+static int
+cryptoloop_ioctl(struct loop_device *lo, int cmd, unsigned long arg)
+{
+ return -EINVAL;
+}
+
+static int
+cryptoloop_release(struct loop_device *lo)
+{
+ struct crypto_tfm *tfm = (struct crypto_tfm *) lo->key_data;
+ if (tfm != NULL) {
+ crypto_free_tfm(tfm);
+ lo->key_data = NULL;
+ return 0;
+ }
+ printk(KERN_ERR "cryptoloop_release(): tfm == NULL?\n");
+ return -EINVAL;
+}
+
+static struct loop_func_table cryptoloop_funcs = {
+ number: LO_CRYPT_CRYPTOAPI,
+ init: cryptoloop_init,
+ ioctl: cryptoloop_ioctl,
+ transfer: cryptoloop_transfer,
+ release: cryptoloop_release,
+ owner: THIS_MODULE
+};
+
+static int __init
+init_cryptoloop(void)
+{
+ int rc;
+
+ if ((rc = loop_register_transfer(&cryptoloop_funcs))) {
+ printk(KERN_ERR
+ "cryptoloop: register loop transfer function failed\n");
+ return rc;
+ }
+
+ printk(KERN_INFO "cryptoloop: loaded\n");
+ return 0;
+}
+
+static void __exit
+cleanup_cryptoloop(void)
+{
+ if (loop_unregister_transfer(LO_CRYPT_CRYPTOAPI))
+ printk(KERN_ERR
+ "cryptoloop: unregistering transfer funcs failed\n");
+
+ printk(KERN_INFO "cryptoloop: unloaded\n");
+}
+
+module_init(init_cryptoloop);
+module_exit(cleanup_cryptoloop);
diff -u --recursive --new-file -X /linux/dontdiff a/drivers/block/loop.c b/drivers/block/loop.c
--- a/drivers/block/loop.c Wed Jul 2 16:22:55 2003
+++ b/drivers/block/loop.c Wed Jul 2 16:45:41 2003
@@ -840,7 +840,8 @@
lo->lo_flags = 0;
lo->lo_queue.queuedata = NULL;
memset(lo->lo_encrypt_key, 0, LO_KEY_SIZE);
- memset(lo->lo_name, 0, LO_NAME_SIZE);
+ memset(lo->lo_crypt_name, 0, LO_NAME_SIZE);
+ memset(lo->lo_file_name, 0, LO_NAME_SIZE);
invalidate_bdev(bdev, 0);
set_capacity(disks[lo->lo_number], 0);
filp->f_dentry->d_inode->i_mapping->gfp_mask = gfp;
@@ -892,7 +893,10 @@
return -EFBIG;
}

- strlcpy(lo->lo_name, info->lo_name, LO_NAME_SIZE);
+ memcpy(lo->lo_file_name, info->lo_file_name, LO_NAME_SIZE);
+ memcpy(lo->lo_crypt_name, info->lo_crypt_name, LO_NAME_SIZE);
+ lo->lo_file_name[LO_NAME_SIZE-1] = 0;
+ lo->lo_crypt_name[LO_NAME_SIZE-1] = 0;

if (!xfer)
xfer = &none_funcs;
@@ -931,7 +935,8 @@
info->lo_offset = lo->lo_offset;
info->lo_sizelimit = lo->lo_sizelimit;
info->lo_flags = lo->lo_flags;
- strlcpy(info->lo_name, lo->lo_name, LO_NAME_SIZE);
+ memcpy(info->lo_file_name, lo->lo_file_name, LO_NAME_SIZE);
+ memcpy(info->lo_crypt_name, lo->lo_crypt_name, LO_NAME_SIZE);
info->lo_encrypt_type =
lo->lo_encryption ? lo->lo_encryption->number : 0;
if (lo->lo_encrypt_key_size && capable(CAP_SYS_ADMIN)) {
@@ -957,7 +962,10 @@
info64->lo_flags = info->lo_flags;
info64->lo_init[0] = info->lo_init[0];
info64->lo_init[1] = info->lo_init[1];
- memcpy(info64->lo_name, info->lo_name, LO_NAME_SIZE);
+ if (info->lo_encrypt_type == LO_CRYPT_CRYPTOAPI)
+ memcpy(info64->lo_crypt_name, info->lo_name, LO_NAME_SIZE);
+ else
+ memcpy(info64->lo_file_name, info->lo_name, LO_NAME_SIZE);
memcpy(info64->lo_encrypt_key, info->lo_encrypt_key, LO_KEY_SIZE);
}

@@ -975,8 +983,11 @@
info->lo_flags = info64->lo_flags;
info->lo_init[0] = info64->lo_init[0];
info->lo_init[1] = info64->lo_init[1];
- memcpy(info->lo_name, info64->lo_name, LO_NAME_SIZE);
- memcpy(info->lo_encrypt_key,info64->lo_encrypt_key,LO_KEY_SIZE);
+ if (info->lo_encrypt_type == LO_CRYPT_CRYPTOAPI)
+ memcpy(info->lo_name, info64->lo_crypt_name, LO_NAME_SIZE);
+ else
+ memcpy(info->lo_name, info64->lo_file_name, LO_NAME_SIZE);
+ memcpy(info->lo_encrypt_key, info64->lo_encrypt_key, LO_KEY_SIZE);

/* error in case values were truncated */
if (info->lo_device != info64->lo_device ||
diff -u --recursive --new-file -X /linux/dontdiff a/include/linux/loop.h b/include/linux/loop.h
--- a/include/linux/loop.h Wed Jul 2 16:23:00 2003
+++ b/include/linux/loop.h Wed Jul 2 16:45:41 2003
@@ -36,7 +36,8 @@
int (*transfer)(struct loop_device *, int cmd,
char *raw_buf, char *loop_buf, int size,
sector_t real_block);
- char lo_name[LO_NAME_SIZE];
+ char lo_file_name[LO_NAME_SIZE];
+ char lo_crypt_name[LO_NAME_SIZE];
char lo_encrypt_key[LO_KEY_SIZE];
int lo_encrypt_key_size;
struct loop_func_table *lo_encryption;
@@ -49,7 +50,6 @@
struct block_device *lo_device;
unsigned lo_blocksize;
void *key_data;
- char key_reserved[48]; /* for use by the filter modules */

int old_gfp_mask;

@@ -102,7 +102,8 @@
__u32 lo_encrypt_type;
__u32 lo_encrypt_key_size; /* ioctl w/o */
__u32 lo_flags; /* ioctl r/o */
- __u8 lo_name[LO_NAME_SIZE];
+ __u8 lo_file_name[LO_NAME_SIZE];
+ __u8 lo_crypt_name[LO_NAME_SIZE];
__u8 lo_encrypt_key[LO_KEY_SIZE]; /* ioctl w/o */
__u64 lo_init[2];
};


2003-07-02 17:01:32

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH] cryptoloop

[email protected] wrote:
>
> +static int
> +cryptoloop_transfer(struct loop_device *lo, int cmd, char *raw_buf,
> + char *loop_buf, int size, sector_t IV)

You'll note that loop.c goes from (page/offset/len) to (addr/len), and this
transfer function then immediately goes from (addr,len) to
(page/offset/len).

That's rather silly, and it forces the loop driver to play games pushing
pages into lowmem.

Can we keep everything using (page/offset/len) end-to-end?


2003-07-02 18:29:58

by Andries E. Brouwer

[permalink] [raw]
Subject: Re: [PATCH] cryptoloop

akpm:

> You'll note that loop.c goes from (page/offset/len) to (addr/len),
> and this transfer function then immediately goes from (addr,len)
> to (page/offset/len). That's rather silly ..

Changing that would kill all existing modules that use the loop device.

Maybe nobody cares. Then we can do so in a subsequent patch.

Andries

2003-07-02 18:54:08

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH] cryptoloop

[email protected] wrote:
>
> akpm:
>
> > You'll note that loop.c goes from (page/offset/len) to (addr/len),
> > and this transfer function then immediately goes from (addr,len)
> > to (page/offset/len). That's rather silly ..
>
> Changing that would kill all existing modules that use the loop device.

That's OK - if they don't want to properly adapt to the new API they can
just kmap the page and call into the old-style code. A five-line wrapper.

> Maybe nobody cares. Then we can do so in a subsequent patch.

Splitting these changes into two almost doubles the testing effort, or
halves the coverage. We really should aim to get both these changes in
place at the same time. I know that I wouldn't bother testing it if there
are more large changes pending...

Could we pleeeze have a little cryptoloop.txt which just gives the basics
on where to obtain the tools and how to get the thing up and running? It's
a right pain having to go scrabbling all over the internet working out how
to set stuff up if you just want to do a bit of testing every few months.

Thanks.

(Where are the first and second patches btw? Merged? Is a fourth
anticipated?)


2003-07-02 19:02:36

by Linus Torvalds

[permalink] [raw]
Subject: Re: [PATCH] cryptoloop


On Wed, 2 Jul 2003, Andrew Morton wrote:
>
> (Where are the first and second patches btw? Merged? Is a fourth
> anticipated?)

The two first ones are merged.

Linus

PS. I'm not readign the kernel mailing list very actively right now: I'm
going to move my mailing list reading around, so please keep me cc'd on
anything I need to know.

2003-07-02 19:11:08

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH] cryptoloop

Linus Torvalds <[email protected]> wrote:
>
>
> On Wed, 2 Jul 2003, Andrew Morton wrote:
> >
> > (Where are the first and second patches btw? Merged? Is a fourth
> > anticipated?)
>
> The two first ones are merged.

Well please don't let me stand in the way of #3. But we shouldn't lose
sight of the need to fix up these shortcomings. pagecache&BIO use
pageframes and crypto uses pageframes. Connecting them together via
virtual addresses is daft.


2003-07-02 19:17:02

by Linus Torvalds

[permalink] [raw]
Subject: Re: [PATCH] cryptoloop


On Wed, 2 Jul 2003, Andrew Morton wrote:
>
> Well please don't let me stand in the way of #3. But we shouldn't lose
> sight of the need to fix up these shortcomings. pagecache&BIO use
> pageframes and crypto uses pageframes. Connecting them together via
> virtual addresses is daft.

I do agree. Might as well fix that, if patch#3 is working in that area
anyway.

Linus

2003-07-02 19:27:40

by Andries E. Brouwer

[permalink] [raw]
Subject: Re: [PATCH] cryptoloop

* akpm> Where are the first and second patches btw? Merged? Is a fourth
* akpm> anticipated?)

Yes, merged.

(Usually I plan a track and submit individual steps.
When they get applied I continue.
If not, there is not need to waste time on the rest.)

Here a series of about half a dozen steps was announced.


* akpm> Splitting these changes into two almost doubles the testing effort,

One should always go in many small steps.
Each step so that it is clear what happens.
Preferably with different steps separated by a kernel release,
so that if things break, people can tell at what kernel version.


* akpm> Could we pleeeze have a little cryptoloop.txt which just gives
* akpm> the basics on where to obtain the tools and how to get the thing
* akpm> up and running? It's a right pain having to go scrabbling
* akpm> all over the internet working out how to set stuff up
* akpm> if you just want to do a bit of testing every few months.

Please read the help message in the config menu.
Then read the text accompanying my patch.
When still needed, read losetup(8).

Andries



[Let me say the same things in different words. You want to test.
I think you are too early, since I have not released util-linux 2.12,
and will wait until this, or a similar, patch has been applied.
You can test that no old things are broken by using mount (-o loop)
or losetup. If you are really very eager I can prepare a tar file
for you and put it on some ftp site.]

2003-07-02 19:49:17

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH] cryptoloop

[email protected] wrote:
>
> (Usually I plan a track and submit individual steps.
> When they get applied I continue.
> If not, there is not need to waste time on the rest.)

As someone who is on the receiving end of this process I must say that it
is not comfortable.

It really helps to be able to see the whole thing laid out all at the same
time. So we can see that it works, so that we can see that the end result
is the right one, etc.

Whereas receiving a long drawn out trickle of patches is quite confusing.
One doesn't know where it is leading, nor where it will end, nor when to
get down and actually start testing it, etc. Nor whether this ongoing
churn is stomping on someone else's development effort.

And there is the risk that you get halfway through and then suddenly "no
way, we don't want to do that". Then what? Argue? Revert?


So for everyone except the guy who's writing the code it is best to have
all the work in place and reviewable at the same time.

For the author, yes, there is a risk that more code than necessary will be
tossed away. We can minimise that by discussing things beforehand, getting
understanding and agreement from the relevant people on the intended
direction. I think we have done that for cryptoloop. We still have not
really done it for 64-bit dev_t.


2003-07-02 20:45:46

by Andries E. Brouwer

[permalink] [raw]
Subject: Re: [PATCH] cryptoloop

From [email protected] Wed Jul 2 22:03:49 2003

> (Usually I plan a track and submit individual steps.
> When they get applied I continue.
> If not, there is no need to waste time on the rest.)

As someone who is on the receiving end of this process I must say that it
is not comfortable.

It really helps to be able to see the whole thing laid out all at the same
time. So we can see that it works, so that we can see that the end result
is the right one, etc.

Whereas receiving a long drawn out trickle of patches is quite confusing.
One doesn't know where it is leading, nor where it will end, nor when to
get down and actually start testing it, etc. Nor whether this ongoing
churn is stomping on someone else's development effort.

And there is the risk that you get halfway through and then suddenly "no
way, we don't want to do that". Then what? Argue? Revert?

No, the point of such a series is that each patch does something
clearly defined, is an improvement even when the author dies the
next day so that all further work is lost.

You should never accept a patch that makes things worse and is only
justified by a future one.

So for everyone except the guy who's writing the code it is best to have
all the work in place and reviewable at the same time.

No. Some changes are too large for that.

For the author, yes, there is a risk that more code than necessary will be
tossed away. We can minimise that by discussing things beforehand, getting
understanding and agreement from the relevant people on the intended
direction. I think we have done that for cryptoloop. We still have not
really done it for 64-bit dev_t.

Yes, now that you remind me, that is also an interesting topic.
About discussing beforehand - search the archives and you'll see
immense amounts of discussion on large dev_t.
You'll always find me willing to discuss details.

Now suppose one wants a large dev_t. Some people do.
Then several steps are needed. One of these steps
is the addition of the mknod64 system call.
That is a nice small isolated step - part of the necessary
user space interface. It can be done independently of any
other steps. It was submitted, but is not in the present
kernel. Why not? I do not recall anybody pointing out problems.

I think some of these large dev_t steps were somewhat urgent, because
they were prerequisites for glibc changes. But they didnt happen.
Linus took a bit from me, and you submitted a few steps from me,
and then nothing. OK.

Dave compared the patch submission process to TCP/IP.
I agree, and go into exponential backoff. Try again after two weeks,
three months, a year and a half.
But if you want we can restart that particular series of patches.
Or discuss, if there are things to discuss.

Andries

2003-07-02 20:52:10

by Greg KH

[permalink] [raw]
Subject: Re: [PATCH] cryptoloop

On Wed, Jul 02, 2003 at 11:00:06PM +0200, [email protected] wrote:
>
> Now suppose one wants a large dev_t. Some people do.
> Then several steps are needed. One of these steps
> is the addition of the mknod64 system call.
> That is a nice small isolated step - part of the necessary
> user space interface. It can be done independently of any
> other steps. It was submitted, but is not in the present
> kernel. Why not? I do not recall anybody pointing out problems.

I was wondering what had happened here with this. What is left to do to
finish full support?

thanks,

greg k-h

2003-07-02 21:22:29

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH] cryptoloop

[email protected] wrote:
>
> Now suppose one wants a large dev_t. Some people do.

<thumps table> 2.6 will support a large dev_t.

We need to make this happen.

> Then several steps are needed. One of these steps
> is the addition of the mknod64 system call.
> That is a nice small isolated step - part of the necessary
> user space interface. It can be done independently of any
> other steps. It was submitted, but is not in the present
> kernel. Why not? I do not recall anybody pointing out problems.

This precisely illustrates my point.

mknod64() and several other dev_t patches (ext2 and ext3 support,
especially) have stalled in -mm for months. The reason why I have not
moved ahead with them is that I am waiting to see the rest of the work.

Because it could be that Al has different ideas, or that someone else gets
down and completes the work and wants to do it differently. I simply do
not know.

Or it could be that nobody ends up doing the dev_t work at all, in which
case we end up with a pointless syscall sitting there.

> I think some of these large dev_t steps were somewhat urgent, because
> they were prerequisites for glibc changes. But they didnt happen.
> Linus took a bit from me, and you submitted a few steps from me,
> and then nothing. OK.

The patches stopped!

Please, if you want to finish off the dev_t work, send the rest of the
patches. We'll shake them down, allow Al to be rude about them, break
people's builds with them, crash their kernels, etc. When it is all settled
down and everyone is sufficiently happy, shoot them across to Linus.

That's why I run -mm, so people can make these large and evolving changes,
have them reviewed and understood without breaking Linus's tree and thereby
impacting everyone.


> Dave compared the patch submission process to TCP/IP.
> I agree, and go into exponential backoff. Try again after two weeks,
> three months, a year and a half.
> But if you want we can restart that particular series of patches.
> Or discuss, if there are things to discuss.

Sounds great, let's do it. Send patches against the current -mm, copy
Greg, Christoph and Al on them and let's wrap this thing up.

The only problem of which I'm aware is that they seem to still break
devicemapper a bit, but the fixes for that seem to be just a few days away.

2003-07-02 22:12:42

by Andries E. Brouwer

[permalink] [raw]
Subject: Re: [PATCH] cryptoloop

[on large dev_t]:

> I was wondering what had happened here with this.
> What is left to do to finish full support?

It is not quite clear what "full support" means.
(And not clear what "left" means. Roughly speaking I did
everything, but today part of that would have to be redone.)


Suppose that one regards stat/mknod as communication
userspace <-> filesystem.
Then one wants stat() to be able to return 64-bit dev_t,
and mknod() to be able to bring 64 bits to the filesystem.

Several filesystems have room for only 16 or 32 bits, while some
have room for 128 bits. So details will be filesystem-dependent.

I seem to recall that among the submitted patches was one
that added ext2 support for 64-bit dev_t.
For some other filesystems things were settled. E.g. for NFS.
And there was mknod64.


That is one part. But there is more than stat/mknod.
Every system call and ioctl that uses a dev_t parameter or
a dev_t field in some parameter struct must be looked at.
Do we want to deprecate the thing that has not been used
the past nine years? Or do we need a foo64 version?
Maybe I have patches for all cases.
Don't recall whether I finished the audit. Anyhow,
I would have to do it again - last time was several months ago,
new calls may have been added since.


That is the interface with user space.
Strange enough people are not satisfied with nice large numbers
in special device nodes on the filesystem. They would like to
see something meaningful happen when they open() such a node.

This means that the kernel must be able to find a device driver
or so given a large number. That can be done using a hash table
lookup. But what one really wants is that a device driver can
register an interval of device numbers. Then hash lookup does
not work so well, and one needs a tree lookup. Maybe Al did this
wrong at first in the block device code. I have not checked whether
that has improved recently, but I saw that he did work on the
character device part. Must read that in case this series of
patches is restarted.


So, coming back to your question: what is needed?
Maybe I already had all needed fragments, and earlier my
uncertainty was mostly about the possible existence of
some obscure ioctl on some obscure architecture (not m68, Geert)
that I might have overlooked.

I think the -mm kernels have had (still have?) most of these
large dev_t patches, and basically these did not cause problems.

So what is needed today is looking at the present kernel.
Redoing the syscall and ioctl audit.
Getting the patches into the tree.
Getting glibc to update <sys/sysmacros.h> and stat/mknod/ustat.


And all of this is only the use of the numbers.
Good support for large numbers of disks is something entirely
different. I recall that there were scaling problems, e.g.
with sysfs.

Andries

2003-07-02 22:43:34

by Andries E. Brouwer

[permalink] [raw]
Subject: Re: [PATCH] cryptoloop

From [email protected] Wed Jul 2 23:36:40 2003

> Now suppose one wants a large dev_t. Some people do.

<thumps table> 2.6 will support a large dev_t.

We need to make this happen.

Very well.

> Then several steps are needed. One of these steps
> is the addition of the mknod64 system call.
> That is a nice small isolated step - part of the necessary
> user space interface. It can be done independently of any
> other steps. It was submitted, but is not in the present
> kernel. Why not? I do not recall anybody pointing out problems.

This precisely illustrates my point.

mknod64() and several other dev_t patches (ext2 and ext3 support,
especially) have stalled in -mm for months. The reason why I have not
moved ahead with them is that I am waiting to see the rest of the work.

That is what they call a deadlock.
You asked a few times what parts could be submitted and I answered.

Because it could be that Al has different ideas, or that someone else gets
down and completes the work and wants to do it differently. I simply do
not know.

Neither do I. I just wait and see.

Andries

2003-07-03 11:07:00

by Jari Ruusu

[permalink] [raw]
Subject: Re: [PATCH] cryptoloop

[email protected] wrote:
> akpm:
> > You'll note that loop.c goes from (page/offset/len) to (addr/len),
> > and this transfer function then immediately goes from (addr,len)
> > to (page/offset/len). That's rather silly ..
>
> Changing that would kill all existing modules that use the loop device.
>
> Maybe nobody cares. Then we can do so in a subsequent patch.

I care. Please don't break the transfer function prototype.

I don't know if you guys have realized it or not, but cryptoloop+cryptoapi
is the slowest possible loop crypto implementation on the planet. Before you
guys sacrifice loop performance with cryptoloop only stuff, you may want to
do google search for "loop-AES" (twice as fast on most modern boxes) and
choose to preserve fast interfaces that other implementations depend on.

Regards,
Jari Ruusu <[email protected]>

2003-07-03 15:05:42

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH] cryptoloop

Jari Ruusu <[email protected]> wrote:
>
> [email protected] wrote:
> > akpm:
> > > You'll note that loop.c goes from (page/offset/len) to (addr/len),
> > > and this transfer function then immediately goes from (addr,len)
> > > to (page/offset/len). That's rather silly ..
> >
> > Changing that would kill all existing modules that use the loop device.
> >
> > Maybe nobody cares. Then we can do so in a subsequent patch.
>
> I care. Please don't break the transfer function prototype.

Why?

> I don't know if you guys have realized it or not, but cryptoloop+cryptoapi
> is the slowest possible loop crypto implementation on the planet. Before you
> guys sacrifice loop performance with cryptoloop only stuff, you may want to
> do google search for "loop-AES" (twice as fast on most modern boxes) and
> choose to preserve fast interfaces that other implementations depend on.

It's not clear what point you're actually trying to make here. But it's
worth pointing out that the current transfer function interface can be sped
up (a bit) by going to a pageframe-based one.

2003-07-03 15:30:32

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH] cryptoloop

A general comment to the design of this code: imho we should rip
out the whole concept of transfer functions and go directly to
the cryptoapi algorithms - one level of unneeded abstractions
gone (the transfer funcs were never used for anything but X0R in
mainline and the interface is quite wrong as we could really
pass the page+offsets through). If we finally decide to get rid
of this legacy junk I even volunteer to implemente a X0R cryptoapi
module :)

Some more nitpicking:

> +#include <linux/version.h>

not needed.

> +#include <linux/config.h>

dito.

> +static struct loop_func_table cryptoloop_funcs = {
> + number: LO_CRYPT_CRYPTOAPI,
> + init: cryptoloop_init,
> + ioctl: cryptoloop_ioctl,
> + transfer: cryptoloop_transfer,
> + release: cryptoloop_release,
> + owner: THIS_MODULE
> +};

please use C99-initializers.

> + printk(KERN_INFO "cryptoloop: loaded\n");

isn't this a bit verbose?

> + if (loop_unregister_transfer(LO_CRYPT_CRYPTOAPI))

how could this fail?

> + printk(KERN_ERR
> + "cryptoloop: unregistering transfer funcs failed\n");
> +
> + printk(KERN_INFO "cryptoloop: unloaded\n");

dito.

> + if (info->lo_encrypt_type == LO_CRYPT_CRYPTOAPI)
> + memcpy(info64->lo_crypt_name, info->lo_name, LO_NAME_SIZE);
> + else
> + memcpy(info64->lo_file_name, info->lo_name, LO_NAME_SIZE);

this special-casing sounds like a bad idea. I'd say anyone who wants
crypto support needs to use a losetup that uses the newstyle loop_info.
(Or did I get the context wrong? diff -p helps a lot if you don't have
a recent kernel tree nearby..)

2003-07-03 16:13:29

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH] cryptoloop

On Wed, Jul 02, 2003 at 11:00:06PM +0200, [email protected] wrote:
> No, the point of such a series is that each patch does something
> clearly defined, is an improvement even when the author dies the
> next day so that all further work is lost.

*nod*

> You should never accept a patch that makes things worse and is only
> justified by a future one.

well, I almost agree. If the other patch is posted at the same time
or the regression is for less important code (say a legacy driver)
this can be ok.

> So for everyone except the guy who's writing the code it is best to have
> all the work in place and reviewable at the same time.
>
> No. Some changes are too large for that.

right, there's changes that are too large. But it really helps a lot
to send a series of patches to give a broader view.

2003-07-03 16:16:34

by Andries E. Brouwer

[permalink] [raw]
Subject: Re: [PATCH] cryptoloop

hch:

> Some nitpicking:

Thanks. In case Linus applied the patch this is unimportant.
In case he didnt I'll probably resubmit when 2.5.75 comes out
and apply your polishing.

> + if (info->lo_encrypt_type == LO_CRYPT_CRYPTOAPI)
>
> this special-casing sounds like a bad idea.

True. But it is only in the compatibility code.
Once you rip that out the special casing disappears automatically.
It describes current reality.

> imho we should rip out the whole concept of transfer functions

They are used much more frequently than cryptoapi is.
People tell me jari-loop is much faster at present.
If this is true, your move would not be very popular.

Anyway, I am not doing a redesign. Just a cleanup.

Andries

2003-07-03 16:23:46

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH] cryptoloop

On Thu, Jul 03, 2003 at 06:25:59PM +0200, [email protected] wrote:
> They are used much more frequently than cryptoapi is.
> People tell me jari-loop is much faster at present.
> If this is true, your move would not be very popular.

Personally I care far less about one cryptoloop implementation
beeing faster than another one, but about proper design of whatever
gets into mainline. Think about this like the freeswan vs kernel
ipsec thing - of course klips had more feature and more mature code
initially (and maybe still has in some areas) but the kernel ipsec
is the much better design.

I'd be very interested in seeing some backing and explanation of his
claims so we can incorporate it into the generic cryptoapi / loop code.

The only thing I could imagine is that he has assembly implementations
of many ciphers that are faster than the C ones in cryptoapi - something
that's on the cryptoapi todo list anyway.

> Anyway, I am not doing a redesign. Just a cleanup.

Umm, no. You add a feature. That's something very different from
a cleanup.

2003-07-03 16:34:25

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH] cryptoloop

On Wed, Jul 02, 2003 at 08:44:18PM +0200, [email protected] wrote:
> > and this transfer function then immediately goes from (addr,len)
> > to (page/offset/len). That's rather silly ..
>
> Changing that would kill all existing modules that use the loop device.

I don't think that matters. This is 2.5 and APIs tend to change
in unstable series :) Do you know of any users except the various
cryptoloop implementations?

2003-07-03 17:15:15

by Jari Ruusu

[permalink] [raw]
Subject: Re: [PATCH] cryptoloop

Andrew Morton wrote:
> Jari Ruusu <[email protected]> wrote:
> > [email protected] wrote:
> > > Changing that would kill all existing modules that use the loop device.
> > >
> > > Maybe nobody cares. Then we can do so in a subsequent patch.
> >
> > I care. Please don't break the transfer function prototype.
>
> Why?

Because loop-AES attempts to be compatible with structures in loop.h by not
modifying loop.h at all. This is what the "no kernel sources patched or
replaced" means. Breakage in loop.h breaks loop-AES, and I have to clean the
mess.

Regards,
Jari Ruusu <[email protected]>

2003-07-03 17:25:02

by Chris Friesen

[permalink] [raw]
Subject: Re: [PATCH] cryptoloop

Jari Ruusu wrote:

> Because loop-AES attempts to be compatible with structures in loop.h by not
> modifying loop.h at all. This is what the "no kernel sources patched or
> replaced" means. Breakage in loop.h breaks loop-AES, and I have to clean the
> mess.

We're in a development stream. It is kind of expected that in-kernel
APIs may change if the developers feel it will lead to some improvement.

This sucks for people that are trying to track those APIs with
out-of-kernel patches, but its a fact of life.

Chris


--
Chris Friesen | MailStop: 043/33/F10
Nortel Networks | work: (613) 765-0557
3500 Carling Avenue | fax: (613) 765-2986
Nepean, ON K2H 8E9 Canada | email: [email protected]

2003-07-04 07:28:48

by Jari Ruusu

[permalink] [raw]
Subject: Re: [PATCH] cryptoloop

Chris Friesen wrote:
> Jari Ruusu wrote:
> > Because loop-AES attempts to be compatible with structures in loop.h by not
> > modifying loop.h at all. This is what the "no kernel sources patched or
> > replaced" means. Breakage in loop.h breaks loop-AES, and I have to clean the
> > mess.
>
> We're in a development stream. It is kind of expected that in-kernel
> APIs may change if the developers feel it will lead to some improvement.
>
> This sucks for people that are trying to track those APIs with
> out-of-kernel patches, but its a fact of life.

I know. I already have to deal with API breakages.

Changing transfer function prototype may be a tiny speed improvement for one
implementation that happens to use unoptimal API, but at same time be tiny
speed degration to other implementations that use more saner APIs. I am
unhappy with that change, because I happen to maintain four such transfers
that would be subject to tiny speed degration.

Regards,
Jari Ruusu <[email protected]>

2003-07-04 08:29:39

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH] cryptoloop

Jari Ruusu <[email protected]> wrote:
>
> Changing transfer function prototype may be a tiny speed improvement for one
> implementation that happens to use unoptimal API, but at same time be tiny
> speed degration to other implementations that use more saner APIs. I am
> unhappy with that change, because I happen to maintain four such transfers
> that would be subject to tiny speed degration.

Both the source and dest pages are being mapped with a sleeping kmap() at
present. This way we can stop doing that, which is a significant saving on
highmem and a more significant saving on SMP highmem. (crypto uses kmap_atomic).

Why is the loop driver taking a copy of all the pages btw? Looks to me
that we can just clone the bio and save an entire copy?

2003-07-04 09:25:21

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH] cryptoloop

On Thu, Jul 03, 2003 at 08:29:49PM +0300, Jari Ruusu wrote:
> Because loop-AES attempts to be compatible with structures in loop.h by not
> modifying loop.h at all. This is what the "no kernel sources patched or
> replaced" means. Breakage in loop.h breaks loop-AES, and I have to clean the
> mess.

So get your code merged. Moaning about breaking out of tree code beeing
broken by changes when an in-kernel alternative eists doesn't help.

Either try to help improving what's in the tree or shut up.

2003-07-04 09:27:55

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH] cryptoloop

On Fri, Jul 04, 2003 at 10:43:10AM +0300, Jari Ruusu wrote:
> Changing transfer function prototype may be a tiny speed improvement for one
> implementation that happens to use unoptimal API, but at same time be tiny
> speed degration to other implementations that use more saner APIs. I am
> unhappy with that change, because I happen to maintain four such transfers
> that would be subject to tiny speed degration.

You've so far only made ubacked claims in this thread. Show the
numbers and tell us why your implementation is faster and show the
numbers and explain why this change should make your module slower.

If you can't benefit from using the page frame + offset the worst
thing you'd have to do is doing the kmap yourself instead of loop.c
doing it. And no, kmap doesn't get magically slower when called from
a different module.

2003-07-04 10:54:10

by Andries E. Brouwer

[permalink] [raw]
Subject: Re: [PATCH] cryptoloop

hch to jari:

> So get your code merged. Moaning about breaking out of tree code beeing
> broken by changes when an in-kernel alternative eists doesn't help.
>
> Either try to help improving what's in the tree or shut up.


Oh, Christoph - can't you just be a tiny bit more civil.

Here is an ungoing process of merging crypto/loop code.
You are perfectly aware of that - you complained about
every single stage - the rfc patch at the start was
too large, the whitespace in the next patch was distributed
incorrectly, also the third part had terrible bugs - I forget,
maybe there was a superfluous #include.

Now that you are very aware of this ongoing effort
of merging the loop stuff that so far lived as separate
patches outside the kernel tree, how can you say
"get your code merged"? That is precisely what we are
doing right now. Slowly. Step by step.


Andrew on the other hand apparently didnt know, and commented
on something that can be improved. Excellent. For this patch #3
that comment was not relevant, but no doubt we must try and
follow his good advice in some subsequent patch. Preferably
without breaking existing user space.


Try to follow Andrew's example - nice, friendly, constructive.
It is possible to be a good coder without being abusive.

Andries

2003-07-04 11:59:21

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH] cryptoloop

On Fri, Jul 04, 2003 at 01:08:32PM +0200, [email protected] wrote:
> hch to jari:
>
> > So get your code merged. Moaning about breaking out of tree code beeing
> > broken by changes when an in-kernel alternative eists doesn't help.
> >
> > Either try to help improving what's in the tree or shut up.
>
>
> Oh, Christoph - can't you just be a tiny bit more civil.

Maybe I could. But Jari has only moaned about the implementation
you posted here without explaining why or even showing numbers.

That's really, really anoying. If he wants to improve the situation
he should post patches or at least say what part of the design or
implementation in particular he doesn't like.

I'm l33t and my code is better than your is crap and I refuse to
take make a single conpromise for code like that. Put up or
shut up.

> Here is an ungoing process of merging crypto/loop code.
> You are perfectly aware of that - you complained about
> every single stage - the rfc patch at the start was
> too large, the whitespace in the next patch was distributed
> incorrectly, also the third part had terrible bugs - I forget,
> maybe there was a superfluous #include.

Maybe I sounded too harsh all the time, but that's not complaining
but showing problems - really tiny ones like superflous includes
or bigger ones likes a wrong layer of abstraction. If you argue
with me that these aren't problem because of whatever (like you
did for the transfer function thing) that's fine, but saying I
shouldn't "complain" about patches is crap. If you don't want
your code (or mine or whatever) to get better don't post it here.

> Now that you are very aware of this ongoing effort
> of merging the loop stuff that so far lived as separate
> patches outside the kernel tree, how can you say
> "get your code merged"? That is precisely what we are
> doing right now. Slowly. Step by step.

You submitted the crypto-api based loop code, Jari has a different
implementation of which he thinks it's fairly superior and his
comments indicate he wants to maintain it out of tree. And given
that we're in the process of merging some cryptoloop implementation
this is the worst possible attitude. He doesn't tell us what part
of the patch you posted is wrong, he doesn't even say "stop! here's
my patch instead, it's better because a, b, c), no he says please
keep the broken hooks so my l33t out of tree implementation can
be used instead because you suck. wonderfull.

2003-07-04 13:06:51

by Andries E. Brouwer

[permalink] [raw]
Subject: Re: [PATCH] cryptoloop

>> Oh, Christoph - can't you just be a tiny bit more civil.

> Maybe I could.

Good!


Andries


BTW - So far, Clemens and Jari and I have been cooperating and
discussing every step. Most of your statements seem unfounded to me.

2003-07-04 13:13:54

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH] cryptoloop

On Fri, Jul 04, 2003 at 03:21:15PM +0200, [email protected] wrote:
> BTW - So far, Clemens and Jari and I have been cooperating and
> discussing every step. Most of your statements seem unfounded to me.

I'd be happy to take everything back :) But so far I've only heard
him moaning that

a) the cryptoloop implementation you submitted is the worst possible
one (without any backing)
b) the changes proposed by akpm and me would break external modules
(which they would, but that's life and we've never cared much
about outside drivers) und would make them slower which is
utterly bogus.

But hey, let's give him time to defend himself..

2003-07-05 08:27:15

by Jari Ruusu

[permalink] [raw]
Subject: Re: [PATCH] cryptoloop

Christoph Hellwig wrote:
> On Fri, Jul 04, 2003 at 10:43:10AM +0300, Jari Ruusu wrote:
> > Changing transfer function prototype may be a tiny speed improvement for one
> > implementation that happens to use unoptimal API, but at same time be tiny
> > speed degration to other implementations that use more saner APIs. I am
> > unhappy with that change, because I happen to maintain four such transfers
> > that would be subject to tiny speed degration.
>
> You've so far only made ubacked claims in this thread. Show the
> numbers and tell us why your implementation is faster and show the
> numbers and explain why this change should make your module slower.

I haven't seen the modified transfer function yet, so no test data on speed
difference yet. Notice that I said "tiny speed degration". Tiny in this
context may well mean unmeasurable small. However, it will add mode code to
optimized implementation. More code == tiny bit slower.

Loop code in loop-AES does not have any significant speed advantage over
mainline loop. Most significant advantage that loop-AES' loop code has over
mainline, is that it includes a ton of bug fixes still missing from mainline
loop. Loop-AES' speed advantage comes from optimized AES-only transfer
function that does the CBC stuff and directly calls highly optimized AES
implementation without any CryptoAPI overhead. IOW, it does loop -> transfer
-> AES with all unnecessary crap removed. I am complaining because you guys
are about to add unnecessary crap to loop -> transfer interface.

Following tests were run in userspace on my 300 MHz Pentium-2 test box, and
were compiled using Debian woody gcc "version 2.95.4 20011002 (Debian
prerelease)" with these compiler flags: -O2 -Wall -march=i686
-mpreferred-stack-boundary=2 -fomit-frame-pointer

This tests only low level cipher functions aes_encrypt() and aes_decrypt()
from linux-2.5.74/crypto/aes.c with all CryptoAPI overhead removed. In real
use, including CryptoAPI overhead, these numbers should be a little bit
smaller.

key length 128 bits, encrypt speed 68.5 Mbits/sec
key length 128 bits, decrypt speed 58.9 Mbits/sec
key length 192 bits, encrypt speed 58.3 Mbits/sec
key length 192 bits, decrypt speed 50.3 Mbits/sec
key length 256 bits, encrypt speed 51.0 Mbits/sec
key length 256 bits, decrypt speed 43.8 Mbits/sec

This tests aes_encrypt() and aes_decrypt() from loop-AES-v1.7d/aes-i586.S
Most users running loop-AES on modern x86 boxes get to use this code
automatically.

key length 128 bits, encrypt speed 129.6 Mbits/sec
key length 128 bits, decrypt speed 131.3 Mbits/sec
key length 192 bits, encrypt speed 113.0 Mbits/sec
key length 192 bits, decrypt speed 111.7 Mbits/sec
key length 256 bits, encrypt speed 96.2 Mbits/sec
key length 256 bits, decrypt speed 97.5 Mbits/sec

This tests aes_encrypt() and aes_decrypt() from loop-AES-v1.7d/aes.c
Loop-AES users running non-x86 kernels or x86 configured for i386/i486 will
run this version.

key length 128 bits, encrypt speed 81.2 Mbits/sec
key length 128 bits, decrypt speed 83.4 Mbits/sec
key length 192 bits, encrypt speed 68.5 Mbits/sec
key length 192 bits, decrypt speed 70.6 Mbits/sec
key length 256 bits, encrypt speed 58.9 Mbits/sec
key length 256 bits, decrypt speed 60.9 Mbits/sec

> Either try to help improving what's in the tree or shut up.

I have posted patches to be included in mainline. Fixes are available, and
if they are not merged, then so be it. If fixes are not merged to mainline,
they will be maintained outside of mainline so people who need them can
actually use them. It is not really my failt that mainline people seem to
prefer buggy loop and slow loop crypto.

Regards,
Jari Ruusu <[email protected]>

2003-07-05 08:42:54

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH] cryptoloop

Jari Ruusu <[email protected]> wrote:
>
> I have posted patches to be included in mainline.

Have you? I only recall a single humongous patch from you which was far
too large and did too many things at once to be reviewed and generally
understood by other kernel developers.

If you have a sequence of one-concept-per-patch documented changes then
please tell us where to find it.

2003-07-05 08:50:53

by Andre Hedrick

[permalink] [raw]
Subject: Re: [PATCH] cryptoloop


Andrew,

Hold up! Recall some changes can not be micro sliced. If this the case
also, you need to evaluate assigning to a selected committee to be formed
to review and parse.

If you do not understand the scope of the change set, then find somebody
(pural if needed) to parse and report.

Cheers,

Andre Hedrick
LAD Storage Consulting Group

On Sat, 5 Jul 2003, Andrew Morton wrote:

> Jari Ruusu <[email protected]> wrote:
> >
> > I have posted patches to be included in mainline.
>
> Have you? I only recall a single humongous patch from you which was far
> too large and did too many things at once to be reviewed and generally
> understood by other kernel developers.
>
> If you have a sequence of one-concept-per-patch documented changes then
> please tell us where to find it.
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>

2003-07-05 09:00:09

by Andre Hedrick

[permalink] [raw]
Subject: Re: [PATCH] cryptoloop


Jari,

Word of advice from somebody who has been down this road.
Repatch and add in a dual path select, this will provide a security
blanket for the folks who do not understand the entire scope.

Cheers,

Andre Hedrick
LAD Storage Consulting Group

On Sat, 5 Jul 2003, Jari Ruusu wrote:

> Christoph Hellwig wrote:
> > On Fri, Jul 04, 2003 at 10:43:10AM +0300, Jari Ruusu wrote:
> > > Changing transfer function prototype may be a tiny speed improvement for one
> > > implementation that happens to use unoptimal API, but at same time be tiny
> > > speed degration to other implementations that use more saner APIs. I am
> > > unhappy with that change, because I happen to maintain four such transfers
> > > that would be subject to tiny speed degration.
> >
> > You've so far only made ubacked claims in this thread. Show the
> > numbers and tell us why your implementation is faster and show the
> > numbers and explain why this change should make your module slower.
>
> I haven't seen the modified transfer function yet, so no test data on speed
> difference yet. Notice that I said "tiny speed degration". Tiny in this
> context may well mean unmeasurable small. However, it will add mode code to
> optimized implementation. More code == tiny bit slower.
>
> Loop code in loop-AES does not have any significant speed advantage over
> mainline loop. Most significant advantage that loop-AES' loop code has over
> mainline, is that it includes a ton of bug fixes still missing from mainline
> loop. Loop-AES' speed advantage comes from optimized AES-only transfer
> function that does the CBC stuff and directly calls highly optimized AES
> implementation without any CryptoAPI overhead. IOW, it does loop -> transfer
> -> AES with all unnecessary crap removed. I am complaining because you guys
> are about to add unnecessary crap to loop -> transfer interface.
>
> Following tests were run in userspace on my 300 MHz Pentium-2 test box, and
> were compiled using Debian woody gcc "version 2.95.4 20011002 (Debian
> prerelease)" with these compiler flags: -O2 -Wall -march=i686
> -mpreferred-stack-boundary=2 -fomit-frame-pointer
>
> This tests only low level cipher functions aes_encrypt() and aes_decrypt()
> from linux-2.5.74/crypto/aes.c with all CryptoAPI overhead removed. In real
> use, including CryptoAPI overhead, these numbers should be a little bit
> smaller.
>
> key length 128 bits, encrypt speed 68.5 Mbits/sec
> key length 128 bits, decrypt speed 58.9 Mbits/sec
> key length 192 bits, encrypt speed 58.3 Mbits/sec
> key length 192 bits, decrypt speed 50.3 Mbits/sec
> key length 256 bits, encrypt speed 51.0 Mbits/sec
> key length 256 bits, decrypt speed 43.8 Mbits/sec
>
> This tests aes_encrypt() and aes_decrypt() from loop-AES-v1.7d/aes-i586.S
> Most users running loop-AES on modern x86 boxes get to use this code
> automatically.
>
> key length 128 bits, encrypt speed 129.6 Mbits/sec
> key length 128 bits, decrypt speed 131.3 Mbits/sec
> key length 192 bits, encrypt speed 113.0 Mbits/sec
> key length 192 bits, decrypt speed 111.7 Mbits/sec
> key length 256 bits, encrypt speed 96.2 Mbits/sec
> key length 256 bits, decrypt speed 97.5 Mbits/sec
>
> This tests aes_encrypt() and aes_decrypt() from loop-AES-v1.7d/aes.c
> Loop-AES users running non-x86 kernels or x86 configured for i386/i486 will
> run this version.
>
> key length 128 bits, encrypt speed 81.2 Mbits/sec
> key length 128 bits, decrypt speed 83.4 Mbits/sec
> key length 192 bits, encrypt speed 68.5 Mbits/sec
> key length 192 bits, decrypt speed 70.6 Mbits/sec
> key length 256 bits, encrypt speed 58.9 Mbits/sec
> key length 256 bits, decrypt speed 60.9 Mbits/sec
>
> > Either try to help improving what's in the tree or shut up.
>
> I have posted patches to be included in mainline. Fixes are available, and
> if they are not merged, then so be it. If fixes are not merged to mainline,
> they will be maintained outside of mainline so people who need them can
> actually use them. It is not really my failt that mainline people seem to
> prefer buggy loop and slow loop crypto.
>
> Regards,
> Jari Ruusu <[email protected]>
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>

2003-07-05 17:02:37

by James Morris

[permalink] [raw]
Subject: Re: [PATCH] cryptoloop

On Sat, 5 Jul 2003, Jari Ruusu wrote:

> This tests only low level cipher functions aes_encrypt() and aes_decrypt()
> from linux-2.5.74/crypto/aes.c with all CryptoAPI overhead removed. In real
> use, including CryptoAPI overhead, these numbers should be a little bit
> smaller.
>
> key length 128 bits, encrypt speed 68.5 Mbits/sec
> key length 128 bits, decrypt speed 58.9 Mbits/sec
> key length 192 bits, encrypt speed 58.3 Mbits/sec
> key length 192 bits, decrypt speed 50.3 Mbits/sec
> key length 256 bits, encrypt speed 51.0 Mbits/sec
> key length 256 bits, decrypt speed 43.8 Mbits/sec

[snip]

> This tests aes_encrypt() and aes_decrypt() from loop-AES-v1.7d/aes.c
> Loop-AES users running non-x86 kernels or x86 configured for i386/i486 will
> run this version.
>
> key length 128 bits, encrypt speed 81.2 Mbits/sec
> key length 128 bits, decrypt speed 83.4 Mbits/sec
> key length 192 bits, encrypt speed 68.5 Mbits/sec
> key length 192 bits, decrypt speed 70.6 Mbits/sec
> key length 256 bits, encrypt speed 58.9 Mbits/sec
> key length 256 bits, decrypt speed 60.9 Mbits/sec

These results are interesting, if they represent a pure comparison of the
crypto algorithm implementations -- both the mainline kernel version and
yours are based on the same Gladman code.


- James
--
James Morris
<[email protected]>

2003-07-05 17:07:18

by Linus Torvalds

[permalink] [raw]
Subject: Re: [PATCH] cryptoloop


On Sat, 5 Jul 2003, Jari Ruusu wrote:
>
> This tests only low level cipher functions aes_encrypt() and aes_decrypt()
> from linux-2.5.74/crypto/aes.c with all CryptoAPI overhead removed. In real
> use, including CryptoAPI overhead, these numbers should be a little bit
> smaller.
>
> key length 128 bits, encrypt speed 68.5 Mbits/sec
...

Note that the issue that started the discussion is somewhat different: the
performance impact of having to map the pages with sleeping kmap's because
of the old virtual-address-pointer interfaces.

That won't even show up on most setups, but it can be rather nasty on big
boxes with lots of ram and several CPU's. Do we care? Possibly not, but I
think from a design standpoint it would be better to not have the problem.

Linus

2003-07-08 12:30:10

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH] cryptoloop

On Sat, Jul 05, 2003 at 11:41:45AM +0300, Jari Ruusu wrote:
> I haven't seen the modified transfer function yet, so no test data on speed
> difference yet. Notice that I said "tiny speed degration". Tiny in this
> context may well mean unmeasurable small. However, it will add mode code to
> optimized implementation. More code == tiny bit slower.

Then I'd suggest reading the suggestion carefully :) It suggest to:

a) not perform that kmap/kunmap in loop.c before calling the transfer
function and
b) pass the page frame + offset to the transfer function instead of the
kernel virtual address

That means for the module implementing the transfer function either

1) a superflous kmap that it doesn't need is gone (cryptoapi case)
2) it needs to perform the kmap previously done by itself. (= no change)
3) it needs to perform the kmap itself but can use kmap_atomic now


> Loop code in loop-AES does not have any significant speed advantage over
> mainline loop. Most significant advantage that loop-AES' loop code has over
> mainline, is that it includes a ton of bug fixes still missing from mainline
> loop.

Well, we already had a discussion about that, and you haven't tried to
split your patch up so it's unacceptable for mainline. I'm in fact pretty
sure some parts like the bio reservations are buggy but it's hard to tell
excatly with a that big patch.

> Loop-AES' speed advantage comes from optimized AES-only transfer
> function that does the CBC stuff and directly calls highly optimized AES
> implementation without any CryptoAPI overhead.

That's the good old VFS argument. Of course we could rip out the
VFS and go directly to ext2 but in many cases abstraction have more
advantage than the tiny speed loss they impare.

> This tests only low level cipher functions aes_encrypt() and aes_decrypt()
> from linux-2.5.74/crypto/aes.c with all CryptoAPI overhead removed. In real
> use, including CryptoAPI overhead, these numbers should be a little bit
> smaller.

<snip>

The number is interesting. It might be worthwile to try embedding your
aes implementations into the CryptoAPI. Given your negative comments
I suppose you are not interested in something like that?

> I have posted patches to be included in mainline. Fixes are available, and
> if they are not merged, then so be it. If fixes are not merged to mainline,
> they will be maintained outside of mainline so people who need them can
> actually use them. It is not really my failt that mainline people seem to
> prefer buggy loop and slow loop crypto.

So far all loop bugs that have been reported during 2.5 have been fixed.
As for slow crypto: yes, we prefer the cryptoapi abstraction over
directly going to a single algorithm. And no one here said he diskliked
your AES implementation, they just haven't been offered yet. I'm sure
your assembly implementation will be merged once the framework for
optimized crypto algorithms is in place and I'm also pretty sure
we'll find out why your C implementation is faster and either switch the
current implementation for yours or improve it. It's just that all this
would be a lot easier if you just offered them in a form of a nicely
described do one thing patch instead of moaning from the backside..