2012-11-15 09:03:46

by Jorgen Lundman

[permalink] [raw]
Subject: Crypto causes panic in scatterwalk_done with large/multiple buffers


I have a situation where I setup scatterlists as:

input scatterlist of 1, address ffffc90003627000 len 0x20000.

output scatterlist of 2, address 0 ffffc90002d45000 len 0x20000
address 1 ffff88003b079d98 len 0x000c

When I call crypto_aead_encrypt(req); it will die with:

kernel: [ 925.151113] BUG: unable to handle kernel paging request at
ffffeb04000b5140
kernel: [ 925.151253] IP: [<ffffffff812f4880>] scatterwalk_done+0x50/0x60
kernel: [ 925.151325] PGD 0
kernel: [ 925.151381] Oops: 0000 [#1] SMP
kernel: [ 925.151442] CPU 1
kernel: [ 925.154255] [<ffffffff812f7640>] blkcipher_walk_done+0xb0/0x230
kernel: [ 925.154255] [<ffffffffa02e9169>] crypto_ctr_crypt+0x129/0x2b0 [ctr]
kernel: [ 925.154255] [<ffffffff812fe580>] ? crypto_aes_set_key+0x40/0x40
kernel: [ 925.154255] [<ffffffff812f6cbd>] async_encrypt+0x3d/0x40
kernel: [ 925.154255] [<ffffffffa0149326>] crypto_ccm_encrypt+0x246/0x290
[ccm]
kernel: [ 925.154255] [<ffffffffa01633bd>] crypto_encrypt+0x26d/0x2d0



What is interesting about that is, if I allocate a linear buffer instead:

dst = kmalloc(cryptlen, GFP_KERNEL); // 0x20000 + 0x000c
sg_init_table(sg, 1 );
sg_set_buf(&sg[0], dst, cryptlen);

crypto_aead_encrypt(req);

will no longer panic. However, when I try to copy the linear buffer back to
scatterlist;


scatterwalk_map_and_copy(dst, sg, 0, cryptlen, 1);


then it will panic there instead.


However, if I replace it with the call:

sg_copy_from_buffer(sg, sg_nents(sg),
dst, cryptlen);

everything works! <-

So, what am I doing wrong that makes scatterwalk_map_and_copy() fail, and
sg_copy_from_buffer() work fine? It would be nice if I could fix it, so I
did not need to copy to a temporary buffer.

Lund

--
Jorgen Lundman | <[email protected]>
Unix Administrator | +81 (0)3 -5456-2687 ext 1017 (work)
Shibuya-ku, Tokyo | +81 (0)90-5578-8500 (cell)
Japan | +81 (0)3 -3375-1767 (home)


2012-11-17 00:42:24

by Jussi Kivilinna

[permalink] [raw]
Subject: Re: Crypto causes panic in scatterwalk_done with large/multiple buffers

Hello,

I managed to reproduce something similiar with small buffers...

Does attached patch help in your case?

-Jussi

Quoting Jorgen Lundman <[email protected]>:

>
> I have a situation where I setup scatterlists as:
>
> input scatterlist of 1, address ffffc90003627000 len 0x20000.
>
> output scatterlist of 2, address 0 ffffc90002d45000 len 0x20000
> address 1 ffff88003b079d98 len 0x000c
>
> When I call crypto_aead_encrypt(req); it will die with:
>
> kernel: [ 925.151113] BUG: unable to handle kernel paging request at
> ffffeb04000b5140
> kernel: [ 925.151253] IP: [<ffffffff812f4880>] scatterwalk_done+0x50/0x60
> kernel: [ 925.151325] PGD 0
> kernel: [ 925.151381] Oops: 0000 [#1] SMP
> kernel: [ 925.151442] CPU 1
> kernel: [ 925.154255] [<ffffffff812f7640>] blkcipher_walk_done+0xb0/0x230
> kernel: [ 925.154255] [<ffffffffa02e9169>]
> crypto_ctr_crypt+0x129/0x2b0 [ctr]
> kernel: [ 925.154255] [<ffffffff812fe580>] ? crypto_aes_set_key+0x40/0x40
> kernel: [ 925.154255] [<ffffffff812f6cbd>] async_encrypt+0x3d/0x40
> kernel: [ 925.154255] [<ffffffffa0149326>] crypto_ccm_encrypt+0x246/0x290
> [ccm]
> kernel: [ 925.154255] [<ffffffffa01633bd>] crypto_encrypt+0x26d/0x2d0
>
>
>
> What is interesting about that is, if I allocate a linear buffer instead:
>
> dst = kmalloc(cryptlen, GFP_KERNEL); // 0x20000 + 0x000c
> sg_init_table(sg, 1 );
> sg_set_buf(&sg[0], dst, cryptlen);
>
> crypto_aead_encrypt(req);
>
> will no longer panic. However, when I try to copy the linear buffer back to
> scatterlist;
>
>
> scatterwalk_map_and_copy(dst, sg, 0, cryptlen, 1);
>
>
> then it will panic there instead.
>
>
> However, if I replace it with the call:
>
> sg_copy_from_buffer(sg, sg_nents(sg),
> dst, cryptlen);
>
> everything works! <-
>
> So, what am I doing wrong that makes scatterwalk_map_and_copy() fail, and
> sg_copy_from_buffer() work fine? It would be nice if I could fix it, so I
> did not need to copy to a temporary buffer.
>
> Lund
>
> --
> Jorgen Lundman | <[email protected]>
> Unix Administrator | +81 (0)3 -5456-2687 ext 1017 (work)
> Shibuya-ku, Tokyo | +81 (0)90-5578-8500 (cell)
> Japan | +81 (0)3 -3375-1767 (home)
> --
> To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
>



Attachments:
(No filename) (2.43 kB)
05-fix-broken-scatterwalk_sg_chain.patch (1.24 kB)
Download all attachments

2012-11-17 08:39:04

by Jussi Kivilinna

[permalink] [raw]
Subject: Re: Crypto causes panic in scatterwalk_done with large/multiple buffers

Quoting Jussi Kivilinna <[email protected]>:

> Hello,
>
> I managed to reproduce something similiar with small buffers...
>
> Does attached patch help in your case?
>

Appearently this patch only fixed my debug printk loop that used
sg_next from scatterlist API instead of scatterwalk_sg_next from
scatterwalk API. Sorry for the noise.

-Jussi

2012-11-18 00:29:09

by Jorgen Lundman

[permalink] [raw]
Subject: Re: Crypto causes panic in scatterwalk_done with large/multiple buffers

>
> Appearently this patch only fixed my debug printk loop that used sg_next
> from scatterlist API instead of scatterwalk_sg_next from scatterwalk API.
> Sorry for the noise.
>

Thanks for looking at this. I think I am dealing with 2 problems, one is
that occasionally my buffers are from vmalloc, and needs to have some logic
using vmalloc_to_page(). But I don't know if ciphers should handle that
internally, blkcipher.c certainly seems to have several modes, although I
do not see how to *set* them.

Second problem is most likely what you were looking at. It is quite easy to
make the crypto code die.

For example, if I use "ccm(aes)" which can take the dst buffer, plus a hmac
buffer;

cipher = kmalloc( ciphersize, ...
hmac = kmalloc( 16, ...
sg_set_buf( &sg[0], cipher, ciphersize);
sg_set_buf( &sg[1], hmac, 16);
aead_encrypt()...

and all is well, but if you shift hmac address away from PAGE boundary, like:

hmac = kmalloc( 16 + 32, ...
hmac += 32;
sg_set_buf( &sg[1], hmac, 16);

ie, allocate a larger buffer, and put the pointer into the page a bit. And
it will die in scatterwalk very often. +32 isnt magical, any non-zero
number works.

Lund



--
Jorgen Lundman | <[email protected]>
Unix Administrator | +81 (0)3 -5456-2687 ext 1017 (work)
Shibuya-ku, Tokyo | +81 (0)90-5578-8500 (cell)
Japan | +81 (0)3 -3375-1767 (home)

2012-11-18 02:10:44

by Jorgen Lundman

[permalink] [raw]
Subject: Re: Crypto causes panic in scatterwalk_done with large/multiple buffers


Specifically, the crash I get is finally in here:


static void Xscatterwalk_pagedone(struct scatter_walk *walk, int out,
unsigned int more)
{
if (out) {
struct page *page;

page = sg_page(walk->sg) + ((walk->offset - 1) >> PAGE_SHIFT);

printk("**** PageSlab: page %p offset 0x%04lx\n",
page, walk->offset);

if (!PageSlab(page)) { // <===== here

kernel: [ 62.650590] **** PageSlab: page ffffeb04000b7e40 offset 0x1000


Commenting out the call to PageSlab and flush_dcache_page(page);
surprisingly makes it work a little better. But presumably it comes with
a cost that I can not yet see.

2012-11-18 17:40:37

by Jussi Kivilinna

[permalink] [raw]
Subject: Re: Crypto causes panic in scatterwalk_done with large/multiple buffers

Quoting Jorgen Lundman <[email protected]>:

>>
>> Appearently this patch only fixed my debug printk loop that used sg_next
>> from scatterlist API instead of scatterwalk_sg_next from scatterwalk API.
>> Sorry for the noise.
>>
>
> Thanks for looking at this. I think I am dealing with 2 problems, one is
> that occasionally my buffers are from vmalloc, and needs to have some logic
> using vmalloc_to_page(). But I don't know if ciphers should handle that
> internally, blkcipher.c certainly seems to have several modes, although I
> do not see how to *set* them.

From what I now researched, you must not pass vmalloc'd memory to
sg_set_buf() as it internally uses virt_to_page() to get page of
buffer address. You most likely need to walk through your vmalloc'd
buffer and pass all individual pages to scatterlist with sg_set_page().

>
> Second problem is most likely what you were looking at. It is quite easy to
> make the crypto code die.
>
> For example, if I use "ccm(aes)" which can take the dst buffer, plus a hmac
> buffer;
>
> cipher = kmalloc( ciphersize, ...
> hmac = kmalloc( 16, ...
> sg_set_buf( &sg[0], cipher, ciphersize);
> sg_set_buf( &sg[1], hmac, 16);
> aead_encrypt()...
>
> and all is well, but if you shift hmac address away from PAGE boundary, like:
>
> hmac = kmalloc( 16 + 32, ...
> hmac += 32;
> sg_set_buf( &sg[1], hmac, 16);
>
> ie, allocate a larger buffer, and put the pointer into the page a bit. And
> it will die in scatterwalk very often. +32 isnt magical, any non-zero
> number works.

This is strange as crypto subsystem's internal test mechanism uses
such offsetted buffers.

-Jussi

2012-11-19 00:29:26

by Jorgen Lundman

[permalink] [raw]
Subject: Re: Crypto causes panic in scatterwalk_done with large/multiple buffers

>
> From what I now researched, you must not pass vmalloc'd memory to
> sg_set_buf() as it internally uses virt_to_page() to get page of buffer
> address. You most likely need to walk through your vmalloc'd buffer and
> pass all individual pages to scatterlist with sg_set_page().

I was told the same, so I wrote something that called vmalloc_to_page() for
each PAGE_SIZE of the vmalloced memory, with offsets, and created large
scatterlists (well, around ~120 or so).

Made no difference at all, still dies in scatterwalk_done.

Looking at blkcipher (ccm->ctr->blkcipher) and I see it has methods for
_phys(), _virt(), _virt_block(), _fast(), _slow() and _copy(). I was quite
interested in the copy() method, since that is what I am doing at the top.

No idea how I affect those types of use. phys and virt seem to be handled
internally, and virt_block() is what ctr.c uses.

This made me think the vmalloc thing isn't my problem.

Then I copied ccm.c, ctr.c and blkcipher.c to my own module, just so that I
could commend out:


static void Xscatterwalk_pagedone(struct scatter_walk *walk, int out,
unsigned int more)
{

if (out) {
struct page *page;

page = sg_page(walk->sg) + ((walk->offset - 1) >> PAGE_SHIFT);

#if 0
/*
* This call causes panic, so all these sources are just to
* avoid this call.
*
* Once this issue has been fixed, we can remove the entire
* "sun-ccm(aes)" module.
*
*/
if (!PageSlab(page))
flush_dcache_page(page);
#endif
}

if (more) {


And making the two functions, Xscatterwalk_done() and
Xscatterwalk_copychunks(), to call my version instead.

Wouldn't you know, all panics are gone. The misaligned buffer panic, gone.
Reverted back to just passing vmalloc() memory, no problems.

I have been running multi-gig bonnie++ for 24 hours (file system related
crypto). I am trusting that it is working now.

Tested with 3.2.0-0.bpo.3-amd64 and 3.5.0-17-generic.

> This is strange as crypto subsystem's internal test mechanism uses such
> offsetted buffers.
>

I'm sure it does. I run my tests about 100 times and it usually dies around
~40th.


I appreciate your help,

Lund

--
Jorgen Lundman | <[email protected]>
Unix Administrator | +81 (0)3 -5456-2687 ext 1017 (work)
Shibuya-ku, Tokyo | +81 (0)90-5578-8500 (cell)
Japan | +81 (0)3 -3375-1767 (home)