The network layer tries to allocate high order pages for skb_buff
fragments, this leads to problems if we pass such a buffer to
crypto because crypto assumes to have always order null pages
in the scatterlists.
This was not a problem so far, because the network stack linearized
all buffers before passing them to crypto. If we try to avoid the
linearization with skb_cow_data in IPsec esp4/esp6 this incompatibility
becomes visible.
Signed-off-by: Steffen Klassert <[email protected]>
---
Herbert, I could not find out why this PAGE_SIZE limit is in place.
So not sure if this is the right fix. Also, would it be ok to merge
this, or whatever is the right fix through the IPsec tree? We need
this before we can change esp to avoid linearization.
The full IPsec patchset can be found here:
https://git.kernel.org/cgit/linux/kernel/git/klassert/linux-stk.git/log/?h=net-next-ipsec-offload-work
crypto/ahash.c | 5 +----
1 file changed, 1 insertion(+), 4 deletions(-)
diff --git a/crypto/ahash.c b/crypto/ahash.c
index 5fc1f17..ca92783 100644
--- a/crypto/ahash.c
+++ b/crypto/ahash.c
@@ -44,8 +44,7 @@ static int hash_walk_next(struct crypto_hash_walk *walk)
{
unsigned int alignmask = walk->alignmask;
unsigned int offset = walk->offset;
- unsigned int nbytes = min(walk->entrylen,
- ((unsigned int)(PAGE_SIZE)) - offset);
+ unsigned int nbytes = walk->entrylen;
if (walk->flags & CRYPTO_ALG_ASYNC)
walk->data = kmap(walk->pg);
@@ -91,8 +90,6 @@ int crypto_hash_walk_done(struct crypto_hash_walk *walk, int err)
walk->offset = ALIGN(walk->offset, alignmask + 1);
walk->data += walk->offset;
- nbytes = min(nbytes,
- ((unsigned int)(PAGE_SIZE)) - walk->offset);
walk->entrylen -= nbytes;
return nbytes;
--
1.9.1
On Thu, Apr 21, 2016 at 09:14:51AM +0200, Steffen Klassert wrote:
> The network layer tries to allocate high order pages for skb_buff
> fragments, this leads to problems if we pass such a buffer to
> crypto because crypto assumes to have always order null pages
> in the scatterlists.
I don't understand. AFAICS the crypto API assumes no such thing.
Of course there might be a bug there since we probably don't get
too many superpages coming in normally.
> Herbert, I could not find out why this PAGE_SIZE limit is in place.
> So not sure if this is the right fix. Also, would it be ok to merge
> this, or whatever is the right fix through the IPsec tree? We need
> this before we can change esp to avoid linearization.
Your patch makes no sense. When you do a kmap you can only do
one page at a time. So if you have a "superpage" (an SG list
entry with multiple contiguous pages) then you must walk it one
page at a time.
That's why we cap it at PAGE_SIZE.
Is it not walking the superpage properly?
Cheers,
--
Email: Herbert Xu <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
On Mon, Apr 25, 2016 at 06:05:27PM +0800, Herbert Xu wrote:
> On Thu, Apr 21, 2016 at 09:14:51AM +0200, Steffen Klassert wrote:
> > The network layer tries to allocate high order pages for skb_buff
> > fragments, this leads to problems if we pass such a buffer to
> > crypto because crypto assumes to have always order null pages
> > in the scatterlists.
>
> I don't understand. AFAICS the crypto API assumes no such thing.
> Of course there might be a bug there since we probably don't get
> too many superpages coming in normally.
Maybe I misinterpreted the things I observed.
>
> > Herbert, I could not find out why this PAGE_SIZE limit is in place.
> > So not sure if this is the right fix. Also, would it be ok to merge
> > this, or whatever is the right fix through the IPsec tree? We need
> > this before we can change esp to avoid linearization.
>
> Your patch makes no sense.
That's possible :)
> When you do a kmap you can only do
> one page at a time. So if you have a "superpage" (an SG list
> entry with multiple contiguous pages) then you must walk it one
> page at a time.
>
> That's why we cap it at PAGE_SIZE.
>
> Is it not walking the superpage properly?
The problem was that if offset (in a superpage) equals
PAGE_SIZE in hash_walk_next(), nbytes becomes zero. So
we map the page, but we don't hash and unmap because we
exit the loop in shash_ahash_update() in this case.
On Thu, Apr 28, 2016 at 10:27:43AM +0200, Steffen Klassert wrote:
>
> The problem was that if offset (in a superpage) equals
> PAGE_SIZE in hash_walk_next(), nbytes becomes zero. So
> we map the page, but we don't hash and unmap because we
> exit the loop in shash_ahash_update() in this case.
I see. Does this patch help?
---8<---
The length clamping in the crypto hash walk code is broken if
supplied with an offset greater than or equal to PAGE_SIZE. This
patch fixes it by borrowing the code from scatterwalk.
Cc: <[email protected]>
Reported-by: Steffen Klassert <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>
diff --git a/crypto/ahash.c b/crypto/ahash.c
index 5fc1f17..2d6c4f1 100644
--- a/crypto/ahash.c
+++ b/crypto/ahash.c
@@ -44,8 +44,8 @@ static int hash_walk_next(struct crypto_hash_walk *walk)
{
unsigned int alignmask = walk->alignmask;
unsigned int offset = walk->offset;
- unsigned int nbytes = min(walk->entrylen,
- ((unsigned int)(PAGE_SIZE)) - offset);
+ unsigned int nbytes = min_t(unsigned int, walk->entrylen,
+ offset_in_page(~offset) + 1);
if (walk->flags & CRYPTO_ALG_ASYNC)
walk->data = kmap(walk->pg);
@@ -91,8 +91,8 @@ int crypto_hash_walk_done(struct crypto_hash_walk *walk, int err)
walk->offset = ALIGN(walk->offset, alignmask + 1);
walk->data += walk->offset;
- nbytes = min(nbytes,
- ((unsigned int)(PAGE_SIZE)) - walk->offset);
+ nbytes = min_t(unsigned int, nbytes,
+ offset_in_page(~walk->offset) + 1);
walk->entrylen -= nbytes;
return nbytes;
--
Email: Herbert Xu <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
On Tue, May 03, 2016 at 05:55:31PM +0800, Herbert Xu wrote:
> On Thu, Apr 28, 2016 at 10:27:43AM +0200, Steffen Klassert wrote:
> >
> > The problem was that if offset (in a superpage) equals
> > PAGE_SIZE in hash_walk_next(), nbytes becomes zero. So
> > we map the page, but we don't hash and unmap because we
> > exit the loop in shash_ahash_update() in this case.
>
> I see. Does this patch help?
Hmm, the 'sleeping while atomic' because of not unmapping
the page goes away, but now I see a lot of IPsec ICV fails
on the receive side. I'll try to find out what's going on.
Sowmini, could you please doublecheck with your test setup?
On Wed, May 04, 2016 at 11:34:20AM +0200, Steffen Klassert wrote:
>
> Hmm, the 'sleeping while atomic' because of not unmapping
> the page goes away, but now I see a lot of IPsec ICV fails
> on the receive side. I'll try to find out what's going on.
>
> Sowmini, could you please doublecheck with your test setup?
Don't bother, my patch was crap. Here's one that's more likely
to work:
---8<---
The crypto hash walk code is broken when supplied with an offset
greater than or equal to PAGE_SIZE. This patch fixes it by adjusting
walk->pg and walk->offset when this happens.
Cc: <[email protected]>
Reported-by: Steffen Klassert <[email protected]>
Signed-off-by: Herbert Xu <[email protected]>
diff --git a/crypto/ahash.c b/crypto/ahash.c
index 5fc1f17..3887a98 100644
--- a/crypto/ahash.c
+++ b/crypto/ahash.c
@@ -69,8 +69,9 @@ static int hash_walk_new_entry(struct crypto_hash_walk *walk)
struct scatterlist *sg;
sg = walk->sg;
- walk->pg = sg_page(sg);
walk->offset = sg->offset;
+ walk->pg = sg_page(walk->sg) + (walk->offset >> PAGE_SHIFT);
+ walk->offset = offset_in_page(walk->offset);
walk->entrylen = sg->length;
if (walk->entrylen > walk->total)
--
Email: Herbert Xu <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
On Wed, May 04, 2016 at 05:52:56PM +0800, Herbert Xu wrote:
> On Wed, May 04, 2016 at 11:34:20AM +0200, Steffen Klassert wrote:
> >
> > Hmm, the 'sleeping while atomic' because of not unmapping
> > the page goes away, but now I see a lot of IPsec ICV fails
> > on the receive side. I'll try to find out what's going on.
> >
> > Sowmini, could you please doublecheck with your test setup?
>
> Don't bother, my patch was crap. Here's one that's more likely
> to work:
This one is much better, works here :)
On (05/04/16 12:08), Steffen Klassert wrote:
> > > Sowmini, could you please doublecheck with your test setup?
> >
> > Don't bother, my patch was crap. Here's one that's more likely
> > to work:
>
> This one is much better, works here :)
The patch seems to work, but the caveat is that the original
bug (iperf segv) was not always reproducible on demand - it depended
on memory and other config.
But looks like Steffen has a reliable way to reproduce the original
problem, so lets go with this patch for now.
--Sowmini