__ip_append_data() can get into an infinite loop when asked to splice into
a partially-built UDP message that has more than the frag-limit data and up
to the MTU limit. Something like:

	pipe(pfd);
	sfd = socket(AF_INET, SOCK_DGRAM, 0);
	connect(sfd, ...);
	send(sfd, buffer, 8161, MSG_CONFIRM|MSG_MORE);
	write(pfd[1], buffer, 8);
	splice(pfd[0], 0, sfd, 0, 0x4ffe0ul, 0);

where the amount of data given to send() is dependent on the MTU size (in
this instance an interface with an MTU of 8192).
The problem is that the calculation of the amount to copy in
__ip_append_data() goes negative in two places, and, in the second place,
this gets subtracted from the length remaining, thereby increasing it.
This happens when pagedlen > 0 (which happens for MSG_ZEROCOPY and
MSG_SPLICE_PAGES), because the terms in:

	copy = datalen - transhdrlen - fraggap - pagedlen;

mostly cancel once pagedlen is substituted, leaving just -fraggap.
This causes:

	length -= copy + transhdrlen;

to increase the length beyond the amount of data in msg->msg_iter, so
skb_splice_from_iter() cannot fill the request and returns less than
'copy'. As a result, length never reaches 0 and we never exit the loop.
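
The cancellation described above can be checked with a small userspace
model. This is only a sketch, not kernel code: the struct and function
names are hypothetical, the substitution pagedlen = datalen - transhdrlen
is an assumption about the paged path, and the bounded loop is a simplified
stand-in for the real append loop.

```c
#include <assert.h>

/* Hypothetical userspace model of two statements in __ip_append_data().
 * It only reproduces the arithmetic; it is not kernel code.
 */
struct frag_model {
	int datalen;
	int transhdrlen;
	int fraggap;
	int pagedlen;
};

/* Assumption: on the paged path, pagedlen = datalen - transhdrlen. */
static int model_paged_len(int datalen, int transhdrlen)
{
	assert(datalen >= transhdrlen);
	return datalen - transhdrlen;
}

/* copy = datalen - transhdrlen - fraggap - pagedlen; */
static int model_copy(const struct frag_model *m)
{
	return m->datalen - m->transhdrlen - m->fraggap - m->pagedlen;
}

/* length -= copy + transhdrlen; returns the new length. */
static int model_length_after(int length, int copy, int transhdrlen)
{
	return length - (copy + transhdrlen);
}

/* Bounded stand-in for the append loop: each round splices at most what
 * the iterator still holds. Returns the length left after max_rounds.
 */
static int model_remaining(int length, int iter_count, int max_rounds)
{
	for (int i = 0; i < max_rounds && length > 0; i++) {
		int spliced = length < iter_count ? length : iter_count;

		iter_count -= spliced;
		length -= spliced;
	}
	return length;
}
```

With illustrative values datalen = 8168, transhdrlen = 0, fraggap = 1 and
pagedlen taken from model_paged_len(), model_copy() yields -1, so
model_length_after(8, -1, 0) grows the length from 8 to 9. An iterator
holding 8 bytes can then never satisfy it: model_remaining(9, 8, 1000)
still reports 1 byte outstanding.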
Fix this by:

 (1) Inserting a note about the dodgy calculation of 'copy'.

 (2) For MSG_SPLICE_PAGES, clearing copy if it is negative from the above
     equation, so that 'offset' isn't regressed and 'length' isn't
     increased, which will mean that length and thus copy should match the
     amount left in the iterator.

 (3) When handling MSG_SPLICE_PAGES, giving a warning and returning -EIO if
     we're asked to splice more than is in the iterator. It might be
     better to not give the warning or even just give a 'short' write.
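
Steps (2) and (3) can be modelled in userspace as follows. This is a hedged
sketch of the intent described in the list, not the patch itself: both
helper names are hypothetical, and the boolean argument stands in for
testing MSG_SPLICE_PAGES in the flags.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Step (2): with splice pages, a negative copy is cleared so that
 * neither 'offset' nor 'length' is moved in the wrong direction.
 * Hypothetical helper; splice_pages models (flags & MSG_SPLICE_PAGES).
 */
static int model_clamp_copy(int copy, bool splice_pages)
{
	if (splice_pages && copy < 0)
		return 0;
	return copy;
}

/* Step (3): refuse a splice request larger than what the iterator
 * holds. Returns 0 if the request fits, -EIO otherwise. The kernel
 * patch also warns once; this model omits the warning.
 */
static int model_check_splice(size_t copy, size_t iter_count)
{
	return copy > iter_count ? -EIO : 0;
}
```

With the clamp in place, a value such as -1 becomes 0 on the splice path
and is left alone otherwise, and a request for 9 bytes against an iterator
holding 8 is rejected with -EIO instead of looping.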
[!] Note that this ought to also affect MSG_ZEROCOPY, but MSG_ZEROCOPY
avoids the problem by simply assuming that everything asked for got copied,
not just the amount that was in the iterator. This is a potential bug for
the future.
Fixes: 7ac7c987850c ("udp: Convert udp_sendpage() to use MSG_SPLICE_PAGES")
Reported-by: [email protected]
Link: https://lore.kernel.org/r/[email protected]/
Signed-off-by: David Howells <[email protected]>
cc: Willem de Bruijn <[email protected]>
cc: "David S. Miller" <[email protected]>
cc: Eric Dumazet <[email protected]>
cc: Jakub Kicinski <[email protected]>
cc: Paolo Abeni <[email protected]>
cc: David Ahern <[email protected]>
cc: Jens Axboe <[email protected]>
cc: [email protected]
---
net/ipv4/ip_output.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index 6e70839257f7..91715603cf6e 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -1158,10 +1158,15 @@ static int __ip_append_data(struct sock *sk,
}
copy = datalen - transhdrlen - fraggap - pagedlen;
+ /* [!] NOTE: copy will be negative if pagedlen>0
+ * because then the equation reduces to -fraggap.
+ */
if (copy > 0 && getfrag(from, data + transhdrlen, offset, copy, fraggap, skb) < 0) {
err = -EFAULT;
kfree_skb(skb);
goto error;
+ } else if (flags & MSG_SPLICE_PAGES) {
+ copy = 0;
}
offset += copy;
@@ -1209,6 +1214,10 @@ static int __ip_append_data(struct sock *sk,
} else if (flags & MSG_SPLICE_PAGES) {
struct msghdr *msg = from;
+ err = -EIO;
+ if (WARN_ON_ONCE(copy > msg->msg_iter.count))
+ goto error;
+
err = skb_splice_from_iter(skb, &msg->msg_iter, copy,
sk->sk_allocation);
if (err < 0)
David Howells wrote:
> [...]
Reviewed-by: Willem de Bruijn <[email protected]>
I noticed that this is still open in patchwork. No need to resend.
Hello:
This patch was applied to netdev/net.git (main)
by Jakub Kicinski <[email protected]>:
On Tue, 01 Aug 2023 16:48:53 +0100 you wrote:
> __ip_append_data() can get into an infinite loop when asked to splice into
> a partially-built UDP message that has more than the frag-limit data and up
> to the MTU limit. Something like:
>
> pipe(pfd);
> sfd = socket(AF_INET, SOCK_DGRAM, 0);
> connect(sfd, ...);
> send(sfd, buffer, 8161, MSG_CONFIRM|MSG_MORE);
> write(pfd[1], buffer, 8);
> splice(pfd[0], 0, sfd, 0, 0x4ffe0ul, 0);
>
> [...]
Here is the summary with links:
- [net] udp: Fix __ip_append_data()'s handling of MSG_SPLICE_PAGES
https://git.kernel.org/netdev/net/c/0f71c9caf267
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html