2023-06-26 22:12:13

by David Howells

[permalink] [raw]
Subject: Is ->sendmsg() allowed to change the msghdr struct it is given?

Hi Jakub,

Do you know if ->sendmsg() might alter the msghdr struct it is passed as an
argument? Certainly it can alter msg_iter, but can it also modify, say,
msg_flags?

Thanks,
David



2023-06-27 13:13:17

by David Howells

[permalink] [raw]
Subject: Re: Is ->sendmsg() allowed to change the msghdr struct it is given?

Paolo Abeni <[email protected]> wrote:

> udp_sendmsg() can set the MSG_TRUNC bit in msg->msg_flags, so I guess
> that kind of actions are sort of allowed. Still, AFAICS, the kernel
> based msghdr is not copied back to the user-space, so such change
> should be almost a no-op in practice.
>
> @David: which would be the end goal for such action?

Various places in the kernel use sock_sendmsg() - and I've added a bunch more
with the MSG_SPLICE_PAGES patches. For some of the things I've added, there's
a loop which used to call ->sendpage() and now calls sock_sendmsg(). In most
of those places, msghdr will get reset each time round the loop - but not in
all cases.

Of particular immediate interest is net/ceph/messenger_v2.c. If you go to:

https://lore.kernel.org/r/[email protected]/

and look at the resultant code:

static int do_sendmsg(struct socket *sock, struct iov_iter *it)
{
struct msghdr msg = { .msg_flags = CEPH_MSG_FLAGS };
int ret;

msg.msg_iter = *it;
while (iov_iter_count(it)) {
ret = sock_sendmsg(sock, &msg);
if (ret <= 0) {
if (ret == -EAGAIN)
ret = 0;
return ret;
}

iov_iter_advance(it, ret);
}

WARN_ON(msg_data_left(&msg));
return 1;
}

for example. It could/would malfunction if sendmsg() is allowed to modify
msghdr - or if it doesn't update msg_iter. Likewise:

static int do_try_sendpage(struct socket *sock, struct iov_iter *it)
{
struct msghdr msg = { .msg_flags = CEPH_MSG_FLAGS };
struct bio_vec bv;
int ret;

if (WARN_ON(!iov_iter_is_bvec(it)))
return -EINVAL;

while (iov_iter_count(it)) {
/* iov_iter_iovec() for ITER_BVEC */
bvec_set_page(&bv, it->bvec->bv_page,
min(iov_iter_count(it),
it->bvec->bv_len - it->iov_offset),
it->bvec->bv_offset + it->iov_offset);

/*
* MSG_SPLICE_PAGES cannot properly handle pages with
* page_count == 0, we need to fall back to sendmsg if
* that's the case.
*
* Same goes for slab pages: skb_can_coalesce() allows
* coalescing neighboring slab objects into a single frag
* which triggers one of hardened usercopy checks.
*/
if (sendpage_ok(bv.bv_page))
msg.msg_flags |= MSG_SPLICE_PAGES;
else
msg.msg_flags &= ~MSG_SPLICE_PAGES;

iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bv, 1, bv.bv_len);
ret = sock_sendmsg(sock, &msg);
if (ret <= 0) {
if (ret == -EAGAIN)
ret = 0;
return ret;
}

iov_iter_advance(it, ret);
}

return 1;
}

could be similarly affected if ->sendmsg() mucks about with msg_flags.

David


2023-06-27 13:24:14

by Paolo Abeni

[permalink] [raw]
Subject: Re: Is ->sendmsg() allowed to change the msghdr struct it is given?

On Tue, 2023-06-27 at 14:51 +0200, Paolo Abeni wrote:
> On Mon, 2023-06-26 at 14:22 -0700, Jakub Kicinski wrote:
> > On Mon, 26 Jun 2023 22:14:41 +0100 David Howells wrote:
> > > Do you know if ->sendmsg() might alter the msghdr struct it is passed as an
> > > argument? Certainly it can alter msg_iter, but can it also modify,
> > > say, msg_flags?
> >
> > I'm not aware of a precedent either way.
> > Eric or Paolo would know better than me, tho.
>
> udp_sendmsg() can set the MSG_TRUNC bit in msg->msg_flags, so I guess
> that kind of actions are sort of allowed.

Sorry, ENOCOFFEE here. It's actually udp_recvmsg() updating msg-
>msg_flags.

> Still, AFAICS, the kernel
> based msghdr is not copied back to the user-space, so such change
> should be almost a no-op in practice.

This part should be correct.

> @David: which would be the end goal for such action?

Sorry for the noisy reply,

Paolo


2023-06-27 14:08:21

by Paolo Abeni

[permalink] [raw]
Subject: Re: Is ->sendmsg() allowed to change the msghdr struct it is given?

On Tue, 2023-06-27 at 14:09 +0100, David Howells wrote:
> Paolo Abeni <[email protected]> wrote:
>
> > udp_sendmsg() can set the MSG_TRUNC bit in msg->msg_flags, so I guess
> > that kind of actions are sort of allowed. Still, AFAICS, the kernel
> > based msghdr is not copied back to the user-space, so such change
> > should be almost a no-op in practice.
> >
> > @David: which would be the end goal for such action?
>
> Various places in the kernel use sock_sendmsg() - and I've added a bunch more
> with the MSG_SPLICE_PAGES patches. For some of the things I've added, there's
> a loop which used to call ->sendpage() and now calls sock_sendmsg(). In most
> of those places, msghdr will get reset each time round the loop - but not in
> all cases.
>
> Of particular immediate interest is net/ceph/messenger_v2.c. If you go to:
>
> https://lore.kernel.org/r/[email protected]/
>
> and look at the resultant code:
>
> static int do_sendmsg(struct socket *sock, struct iov_iter *it)
> {
> struct msghdr msg = { .msg_flags = CEPH_MSG_FLAGS };
> int ret;
>
> msg.msg_iter = *it;
> while (iov_iter_count(it)) {
> ret = sock_sendmsg(sock, &msg);
> if (ret <= 0) {
> if (ret == -EAGAIN)
> ret = 0;
> return ret;
> }
>
> iov_iter_advance(it, ret);
> }
>
> WARN_ON(msg_data_left(&msg));
> return 1;
> }
>
> for example. It could/would malfunction if sendmsg() is allowed to modify
> msghdr - or if it doesn't update msg_iter. Likewise:
>
> static int do_try_sendpage(struct socket *sock, struct iov_iter *it)
> {
> struct msghdr msg = { .msg_flags = CEPH_MSG_FLAGS };
> struct bio_vec bv;
> int ret;
>
> if (WARN_ON(!iov_iter_is_bvec(it)))
> return -EINVAL;
>
> while (iov_iter_count(it)) {
> /* iov_iter_iovec() for ITER_BVEC */
> bvec_set_page(&bv, it->bvec->bv_page,
> min(iov_iter_count(it),
> it->bvec->bv_len - it->iov_offset),
> it->bvec->bv_offset + it->iov_offset);
>
> /*
> * MSG_SPLICE_PAGES cannot properly handle pages with
> * page_count == 0, we need to fall back to sendmsg if
> * that's the case.
> *
> * Same goes for slab pages: skb_can_coalesce() allows
> * coalescing neighboring slab objects into a single frag
> * which triggers one of hardened usercopy checks.
> */
> if (sendpage_ok(bv.bv_page))
> msg.msg_flags |= MSG_SPLICE_PAGES;
> else
> msg.msg_flags &= ~MSG_SPLICE_PAGES;
>
> iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bv, 1, bv.bv_len);
> ret = sock_sendmsg(sock, &msg);
> if (ret <= 0) {
> if (ret == -EAGAIN)
> ret = 0;
> return ret;
> }
>
> iov_iter_advance(it, ret);
> }
>
> return 1;
> }
>
> could be similarly affected if ->sendmsg() mucks about with msg_flags.

With some help from the compiler - locally changing the proto_ops and
proto sendmsg definition and handling the fallout - I found the
following:

- mptcp_sendmsg() is clearing unsupported msg_flags
(I should have recalled this one even without much testing ;)

- udpv4_sendmsg() is setting msg_name/msg_namelen
- tls_device_sendmsg() is clearing MSG_SPLICE_PAGE when zerocopy is not
supported
- unix_seqpacket_sendmsg() is clearing msg_namelen

I could have missed something, but the above looks safe for the use-
case you mentioned.

Cheers,

Paolo