2012-05-09 22:36:04

by Benjamin Poirier

[permalink] [raw]
Subject: [PATCH] xfrm: take iphdr size into account for esp payload size calculation

Corrects the function that determines the esp payload size.
The calculations done in esp4_get_mtu lead to overlength frames in transport
mode for certain mtu values (eg. 1499 leads to FRAGFAILS) and suboptimal
frames for others (eg. 1500, where the addition of padding in the esp header
can be avoided).

According to what is done, mainly in esp_output(), net_header_len aka
sizeof(struct iphdr) must be taken into account before doing the alignment
calculation.

Signed-off-by: Benjamin Poirier <[email protected]>

---
Tested with
* transport mode E
* transport mode EA
* transport mode E + ah
* tunnel mode E

Not tested with BEET, but it should be the same as transport mode
draft-nikander-esp-beet-mode-03.txt Section 5.2:
"The wire packet format is identical to the ESP transport mode"
---
net/ipv4/esp4.c | 24 +++---------------------
1 files changed, 3 insertions(+), 21 deletions(-)

diff --git a/net/ipv4/esp4.c b/net/ipv4/esp4.c
index 89a47b3..60f2738 100644
--- a/net/ipv4/esp4.c
+++ b/net/ipv4/esp4.c
@@ -459,28 +459,10 @@ static u32 esp4_get_mtu(struct xfrm_state *x, int mtu)
struct esp_data *esp = x->data;
u32 blksize = ALIGN(crypto_aead_blocksize(esp->aead), 4);
u32 align = max_t(u32, blksize, esp->padlen);
- u32 rem;

- mtu -= x->props.header_len + crypto_aead_authsize(esp->aead);
- rem = mtu & (align - 1);
- mtu &= ~(align - 1);
-
- switch (x->props.mode) {
- case XFRM_MODE_TUNNEL:
- break;
- default:
- case XFRM_MODE_TRANSPORT:
- /* The worst case */
- mtu -= blksize - 4;
- mtu += min_t(u32, blksize - 4, rem);
- break;
- case XFRM_MODE_BEET:
- /* The worst case. */
- mtu += min_t(u32, IPV4_BEET_PHMAXLEN, rem);
- break;
- }
-
- return mtu - 2;
+ return ((mtu - x->props.header_len - crypto_aead_authsize(esp->aead) -
+ sizeof(struct iphdr)) & ~(align - 1)) + (sizeof(struct
+ iphdr) - 2);
}

static void esp4_err(struct sk_buff *skb, u32 info)
--
1.7.7


2012-05-10 12:44:27

by Steffen Klassert

[permalink] [raw]
Subject: Re: [PATCH] xfrm: take iphdr size into account for esp payload size calculation

On Wed, May 09, 2012 at 06:35:52PM -0400, Benjamin Poirier wrote:
>
> According to what is done, mainly in esp_output(), net_header_len aka
> sizeof(struct iphdr) must be taken into account before doing the alignment
> calculation.

Why do you need to take the ip header into account here? Your patch breaks
pmtu discovery, at least on tunnel mode with aes-sha1 (aes blocksize 16 bytes).

With your patch applied:

tracepath -n 192.168.1.2
1?: [LOCALHOST] pmtu 1442
1: send failed
1: send failed
Resume: pmtu 1442

Without your patch:

tracepath -n 192.168.1.2
1?: [LOCALHOST] pmtu 1438
1: 192.168.1.2 0.736ms reached
1: 192.168.1.2 0.390ms reached
Resume: pmtu 1438 hops 1 back 64

Your patch increases the mtu by 4 bytes. Be aware that adding
one byte of payload may increase the packet size up to 16 bytes
in the case of aes, as we have to pad the encryption payload
always to a multiple of the cipher blocksize.

> -
> - switch (x->props.mode) {
> - case XFRM_MODE_TUNNEL:
> - break;
> - default:
> - case XFRM_MODE_TRANSPORT:
> - /* The worst case */
> - mtu -= blksize - 4;
> - mtu += min_t(u32, blksize - 4, rem);
> - break;

Btw. why we are doing the calculation above for transport mode?

2012-05-11 01:03:04

by Benjamin Poirier

[permalink] [raw]
Subject: Re: [PATCH] xfrm: take iphdr size into account for esp payload size calculation

On 2012/05/10 14:18, Steffen Klassert wrote:
> On Wed, May 09, 2012 at 06:35:52PM -0400, Benjamin Poirier wrote:
> >
> > According to what is done, mainly in esp_output(), net_header_len aka
> > sizeof(struct iphdr) must be taken into account before doing the alignment
> > calculation.
>
> Why do you need to take the ip header into account here?

The value returned by this function is tuned for tcp segment size:
1) from tcp_mtu_to_mss()
mss = pmtu - tcp_hlen - net_hlen
2) frame structure for transport mode
mtu = mss + tcp_hlen + esp_header_len(esp_payload_len) + ah_len + net_hlen

The "mtu" parameter of esp4_get_mtu is in fact mtu - ah_len.
The return value of esp4_get_mtu is put into pmtu.

If we put 1 and 2 together we have:
pmtu = mtu - ah_len - esp_header_len(esp_payload_len)
with esp_payload_len = mss + tcp_hlen

This formula expands to:
pmtu = mtu - ah_len - (header_len + align(align(pmtu - net_hlen + 2, blksize),
esp->padlen) - (pmtu - net_hlen) + alen)

and simplifies to:
pmtu = (mtu - ah_len - net_hlen - header_len - alen) & ~(max(blksize,
esp->padlen) - 1) + (net_hlen - 2)

which, in the context of esp4_get_mtu, becomes:
((mtu - x->props.header_len - crypto_aead_authsize(esp->aead) - sizeof(struct
iphdr)) & ~(align - 1)) + (sizeof(struct iphdr) - 2)

This is the same formula as before, except for sizeof(struct iphdr) which was
missing.

> Your patch breaks
> pmtu discovery, at least on tunnel mode with aes-sha1 (aes blocksize 16 bytes).
>
> With your patch applied:
>
> tracepath -n 192.168.1.2
> 1?: [LOCALHOST] pmtu 1442
> 1: send failed
> 1: send failed
> Resume: pmtu 1442
>
> Without your patch:
>
> tracepath -n 192.168.1.2
> 1?: [LOCALHOST] pmtu 1438
> 1: 192.168.1.2 0.736ms reached
> 1: 192.168.1.2 0.390ms reached
> Resume: pmtu 1438 hops 1 back 64
>
> Your patch increases the mtu by 4 bytes. Be aware that adding
> one byte of payload may increase the packet size up to 16 bytes
> in the case of aes, as we have to pad the encryption payload
> always to a multiple of the cipher blocksize.

Thanks for testing. My own testing had been limited to rsync'ing large
files.

The problem with tunnel mode in v1 of the patch was twofold.
First, we don't play games with mss and tcp_hlen. esp_payload_len = pmtu
directly, instead of esp_payload_len = pmtu - net_hlen.
Second, in the tunnel case, header_len already includes net_hlen, see
esp_init_state().
The net result is that the formula for tunnel mode was already correct.

>
> > -
> > - switch (x->props.mode) {
> > - case XFRM_MODE_TUNNEL:
> > - break;
> > - default:
> > - case XFRM_MODE_TRANSPORT:
> > - /* The worst case */
> > - mtu -= blksize - 4;
> > - mtu += min_t(u32, blksize - 4, rem);
> > - break;
>
> Btw. why we are doing the calculation above for transport mode?

2012-05-11 01:07:40

by Benjamin Poirier

[permalink] [raw]
Subject: [PATCH v2] xfrm: take iphdr size into account for esp payload size calculation

Corrects the function that determines the esp payload size.
The calculations done in esp4_get_mtu lead to overlength frames in transport
mode for certain mtu values and suboptimal frames for others.

According to what is done, mainly in esp_output(), net_header_len aka
sizeof(struct iphdr) must be taken into account before doing the alignment
calculation.

Signed-off-by: Benjamin Poirier <[email protected]>

---
Changes since v1:
* introduce l3_adj to preserve the same returned value as before for tunnel
mode

For example, with md5 AH and 3des ESP (transport mode):
mtu = 1499 leads to FRAGFAILS
mtu = 1500 the addition of padding in the esp header could be avoided

Tested with
* transport mode E
* transport mode EA
* transport mode E + ah
* tunnel mode E

Not tested with BEET, but it should be the same as transport mode
draft-nikander-esp-beet-mode-03.txt Section 5.2:
"The wire packet format is identical to the ESP transport mode"
---
net/ipv4/esp4.c | 24 +++++++++---------------
1 files changed, 9 insertions(+), 15 deletions(-)

diff --git a/net/ipv4/esp4.c b/net/ipv4/esp4.c
index 89a47b3..b2503f6 100644
--- a/net/ipv4/esp4.c
+++ b/net/ipv4/esp4.c
@@ -459,28 +459,22 @@ static u32 esp4_get_mtu(struct xfrm_state *x, int mtu)
struct esp_data *esp = x->data;
u32 blksize = ALIGN(crypto_aead_blocksize(esp->aead), 4);
u32 align = max_t(u32, blksize, esp->padlen);
- u32 rem;
-
- mtu -= x->props.header_len + crypto_aead_authsize(esp->aead);
- rem = mtu & (align - 1);
- mtu &= ~(align - 1);
+ unsigned int l3_adj;

switch (x->props.mode) {
- case XFRM_MODE_TUNNEL:
- break;
- default:
case XFRM_MODE_TRANSPORT:
- /* The worst case */
- mtu -= blksize - 4;
- mtu += min_t(u32, blksize - 4, rem);
- break;
case XFRM_MODE_BEET:
- /* The worst case. */
- mtu += min_t(u32, IPV4_BEET_PHMAXLEN, rem);
+ l3_adj = sizeof(struct iphdr);
break;
+ case XFRM_MODE_TUNNEL:
+ l3_adj = 0;
+ break;
+ default:
+ BUG();
}

- return mtu - 2;
+ return ((mtu - x->props.header_len - crypto_aead_authsize(esp->aead) -
+ l3_adj) & ~(align - 1)) + (l3_adj - 2);
}

static void esp4_err(struct sk_buff *skb, u32 info)
--
1.7.7

2012-05-11 10:39:41

by Steffen Klassert

[permalink] [raw]
Subject: Re: [PATCH] xfrm: take iphdr size into account for esp payload size calculation

On Thu, May 10, 2012 at 09:02:49PM -0400, Benjamin Poirier wrote:
>
> The value returned by this function is tuned for tcp segment size:
> 1) from tcp_mtu_to_mss()
> mss = pmtu - tcp_hlen - net_hlen
> 2) frame structure for transport mode
> mtu = mss + tcp_hlen + esp_header_len(esp_payload_len) + ah_len + net_hlen

I think you can simplify the calculations here, this
calculation should not depend on any special layer 4
protocol.

>
> The "mtu" parameter of esp4_get_mtu is in fact mtu - ah_len.
> The return value of esp4_get_mtu is put into pmtu.
>
> If we put 1 and 2 together we have:
> pmtu = mtu - ah_len - esp_header_len(esp_payload_len)
> with esp_payload_len = mss + tcp_hlen
>
> This formula expands to:
> pmtu = mtu - ah_len - (header_len + align(align(pmtu - net_hlen + 2, blksize),
> esp->padlen) - (pmtu - net_hlen) + alen)
>
> and simplifies to:
> pmtu = (mtu - ah_len - net_hlen - header_len - alen) & ~(max(blksize,
> esp->padlen) - 1) + (net_hlen - 2)
>
> which, in the context of esp4_get_mtu, becomes:
> ((mtu - x->props.header_len - crypto_aead_authsize(esp->aead) - sizeof(struct
> iphdr)) & ~(align - 1)) + (sizeof(struct iphdr) - 2)
>
> This is the same formula as before, except for sizeof(struct iphdr) which was
> missing.
>

Well, makes sense. I use transport mode very rarely, so I never noticed this.
But I was sure that it worked correct in tunnel mode.

Thanks.

2012-05-14 22:42:49

by David Miller

[permalink] [raw]
Subject: Re: [PATCH v2] xfrm: take iphdr size into account for esp payload size calculation

From: Benjamin Poirier <[email protected]>
Date: Thu, 10 May 2012 21:07:22 -0400

> - return mtu - 2;
> + return ((mtu - x->props.header_len - crypto_aead_authsize(esp->aead) -
> + l3_adj) & ~(align - 1)) + (l3_adj - 2);
> }

The formatting of this expression is completely wrong, you need
to make the "l3_adj" in the second line be aligned with the openning
parenthesis on the previous line at a depth determined based upon
how deeply in the openning parenthesis context this expression sits.

Just using TABS is ugly, and not allowed.

Use something like emacs's "C mode" with Linux coding style enabled
to assist you if you can't figure it out yourself.

2012-05-16 19:37:07

by Benjamin Poirier

[permalink] [raw]
Subject: [PATCH v3] xfrm: take iphdr size into account for esp payload size calculation

Corrects the function that determines the esp payload size.
The calculations done in esp4_get_mtu lead to overlength frames in transport
mode for certain mtu values and suboptimal frames for others.

According to what is done, mainly in esp_output(), net_header_len aka
sizeof(struct iphdr) must be taken into account before doing the alignment
calculation.

Signed-off-by: Benjamin Poirier <[email protected]>

---
Changes since v2:
* rename l3_adj to net_adj
* fix indentation

Changes since v1:
* introduce l3_adj to preserve the same returned value as before for tunnel
mode

For example, with md5 AH and 3des ESP (transport mode):
mtu = 1499 leads to FRAGFAILS
mtu = 1500 the addition of padding in the esp header could be avoided

Tested with
* transport mode E
* transport mode EA
* transport mode E + ah
* tunnel mode E

Not tested with BEET, but it should be the same as transport mode
draft-nikander-esp-beet-mode-03.txt Section 5.2:
"The wire packet format is identical to the ESP transport mode"
---
net/ipv4/esp4.c | 24 +++++++++---------------
1 files changed, 9 insertions(+), 15 deletions(-)

diff --git a/net/ipv4/esp4.c b/net/ipv4/esp4.c
index 89a47b3..cb982a6 100644
--- a/net/ipv4/esp4.c
+++ b/net/ipv4/esp4.c
@@ -459,28 +459,22 @@ static u32 esp4_get_mtu(struct xfrm_state *x, int mtu)
struct esp_data *esp = x->data;
u32 blksize = ALIGN(crypto_aead_blocksize(esp->aead), 4);
u32 align = max_t(u32, blksize, esp->padlen);
- u32 rem;
-
- mtu -= x->props.header_len + crypto_aead_authsize(esp->aead);
- rem = mtu & (align - 1);
- mtu &= ~(align - 1);
+ unsigned int net_adj;

switch (x->props.mode) {
- case XFRM_MODE_TUNNEL:
- break;
- default:
case XFRM_MODE_TRANSPORT:
- /* The worst case */
- mtu -= blksize - 4;
- mtu += min_t(u32, blksize - 4, rem);
- break;
case XFRM_MODE_BEET:
- /* The worst case. */
- mtu += min_t(u32, IPV4_BEET_PHMAXLEN, rem);
+ net_adj = sizeof(struct iphdr);
break;
+ case XFRM_MODE_TUNNEL:
+ net_adj = 0;
+ break;
+ default:
+ BUG();
}

- return mtu - 2;
+ return ((mtu - x->props.header_len - crypto_aead_authsize(esp->aead) -
+ net_adj) & ~(align - 1)) + (net_adj - 2);
}

static void esp4_err(struct sk_buff *skb, u32 info)
--
1.7.7

2012-05-18 00:07:56

by David Miller

[permalink] [raw]
Subject: Re: [PATCH v3] xfrm: take iphdr size into account for esp payload size calculation

From: Benjamin Poirier <[email protected]>
Date: Wed, 16 May 2012 15:35:25 -0400

> Corrects the function that determines the esp payload size.
> The calculations done in esp4_get_mtu lead to overlength frames in transport
> mode for certain mtu values and suboptimal frames for others.
>
> According to what is done, mainly in esp_output(), net_header_len aka
> sizeof(struct iphdr) must be taken into account before doing the alignment
> calculation.
>
> Signed-off-by: Benjamin Poirier <[email protected]>

This looks great.

Could you please fix net/ipv6/esp6.c too, it seems to have the same
exact bug.

Once you respin this patch with both ipv4 and ipv6 fixed, I'll apply
it.

Thank you.

2012-05-24 21:33:38

by Benjamin Poirier

[permalink] [raw]
Subject: [PATCH v4] xfrm: take net hdr len into account for esp payload size calculation

Corrects the function that determines the esp payload size. The calculations
done in esp{4,6}_get_mtu() lead to overlength frames in transport mode for
certain mtu values and suboptimal frames for others.

According to what is done, mainly in esp{,6}_output() and tcp_mtu_to_mss(),
net_header_len must be taken into account before doing the alignment
calculation.

Signed-off-by: Benjamin Poirier <[email protected]>

---
Changes since v3:
* also fix ipv6

Changes since v2:
* rename l3_adj to net_adj
* fix indentation

Changes since v1:
* introduce l3_adj to preserve the same returned value as before for tunnel
mode

For example:
* on ipv4 with md5 AH and 3des ESP (transport mode):
mtu = 1499 leads to FRAGFAILS
mtu = 1500 the addition of padding in the esp header could be avoided
* on ipv6 with md5 AH and twofish-sha1 ESP (transport mode):
mtu = 1491 leads to Ip6FragFails
mtu = 1499 padding can be avoided

For details on how the formula is established, see
https://lkml.org/lkml/2012/5/10/597

Tested with
* transport mode E
* transport mode EA
* transport mode E + ah
* tunnel mode E

Not tested with BEET, but it should be the same as transport mode
draft-nikander-esp-beet-mode-03.txt Section 5.2:
"The wire packet format is identical to the ESP transport mode"
---
net/ipv4/esp4.c | 24 +++++++++---------------
net/ipv6/esp6.c | 18 +++++++-----------
2 files changed, 16 insertions(+), 26 deletions(-)

diff --git a/net/ipv4/esp4.c b/net/ipv4/esp4.c
index 89a47b3..cb982a6 100644
--- a/net/ipv4/esp4.c
+++ b/net/ipv4/esp4.c
@@ -459,28 +459,22 @@ static u32 esp4_get_mtu(struct xfrm_state *x, int mtu)
struct esp_data *esp = x->data;
u32 blksize = ALIGN(crypto_aead_blocksize(esp->aead), 4);
u32 align = max_t(u32, blksize, esp->padlen);
- u32 rem;
-
- mtu -= x->props.header_len + crypto_aead_authsize(esp->aead);
- rem = mtu & (align - 1);
- mtu &= ~(align - 1);
+ unsigned int net_adj;

switch (x->props.mode) {
- case XFRM_MODE_TUNNEL:
- break;
- default:
case XFRM_MODE_TRANSPORT:
- /* The worst case */
- mtu -= blksize - 4;
- mtu += min_t(u32, blksize - 4, rem);
- break;
case XFRM_MODE_BEET:
- /* The worst case. */
- mtu += min_t(u32, IPV4_BEET_PHMAXLEN, rem);
+ net_adj = sizeof(struct iphdr);
break;
+ case XFRM_MODE_TUNNEL:
+ net_adj = 0;
+ break;
+ default:
+ BUG();
}

- return mtu - 2;
+ return ((mtu - x->props.header_len - crypto_aead_authsize(esp->aead) -
+ net_adj) & ~(align - 1)) + (net_adj - 2);
}

static void esp4_err(struct sk_buff *skb, u32 info)
diff --git a/net/ipv6/esp6.c b/net/ipv6/esp6.c
index 1e62b75..db1521f 100644
--- a/net/ipv6/esp6.c
+++ b/net/ipv6/esp6.c
@@ -413,19 +413,15 @@ static u32 esp6_get_mtu(struct xfrm_state *x, int mtu)
struct esp_data *esp = x->data;
u32 blksize = ALIGN(crypto_aead_blocksize(esp->aead), 4);
u32 align = max_t(u32, blksize, esp->padlen);
- u32 rem;
+ unsigned int net_adj;

- mtu -= x->props.header_len + crypto_aead_authsize(esp->aead);
- rem = mtu & (align - 1);
- mtu &= ~(align - 1);
-
- if (x->props.mode != XFRM_MODE_TUNNEL) {
- u32 padsize = ((blksize - 1) & 7) + 1;
- mtu -= blksize - padsize;
- mtu += min_t(u32, blksize - padsize, rem);
- }
+ if (x->props.mode != XFRM_MODE_TUNNEL)
+ net_adj = sizeof(struct ipv6hdr);
+ else
+ net_adj = 0;

- return mtu - 2;
+ return ((mtu - x->props.header_len - crypto_aead_authsize(esp->aead) -
+ net_adj) & ~(align - 1)) + (net_adj - 2);
}

static void esp6_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
--
1.7.7

2012-05-27 05:12:25

by David Miller

[permalink] [raw]
Subject: Re: [PATCH v4] xfrm: take net hdr len into account for esp payload size calculation

From: Benjamin Poirier <[email protected]>
Date: Thu, 24 May 2012 17:32:38 -0400

> Corrects the function that determines the esp payload size. The calculations
> done in esp{4,6}_get_mtu() lead to overlength frames in transport mode for
> certain mtu values and suboptimal frames for others.
>
> According to what is done, mainly in esp{,6}_output() and tcp_mtu_to_mss(),
> net_header_len must be taken into account before doing the alignment
> calculation.
>
> Signed-off-by: Benjamin Poirier <[email protected]>

Applied and queued up for -stable, thanks.