2020-03-26 02:52:03

by Yafang Shao

Subject: [PATCH] kernel/taskstats: fix wrong nla type for {cgroup,task}stats policy

After our server was upgraded to a newer kernel, we found that it
continuously prints a warning in the kernel log. The warning is,
[832984.946322] netlink: 'irmas.lc': attribute type 1 has an invalid length.

irmas.lc is one of our container monitor daemons, and it uses
CGROUPSTATS_CMD_GET to get the cgroupstats, similar to
tools/accounting/getdelays.c. We can also reproduce this warning with
getdelays. For example, after running the command below
$ ./getdelays -C /sys/fs/cgroup/memory
you can find a warning in dmesg,
[61607.229318] netlink: 'getdelays': attribute type 1 has an invalid length.

This warning was introduced in commit 6e237d099fac ("netlink: Relax attr
validation for fixed length types"), which checks whether
attributes using types NLA_U* and NLA_S* have an exact length.

Regarding this issue, the root cause is that cgroupstats_cmd_get_policy
defines a wrong type, NLA_U32, while it should be NLA_NESTED, and its minimal
length is NLA_HDRLEN. The same applies to taskstats_cmd_get_policy.

As this behavior change really breaks our application, we'd better
cc stable as well.

Signed-off-by: Yafang Shao <[email protected]>
Cc: [email protected]
---
kernel/taskstats.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/kernel/taskstats.c b/kernel/taskstats.c
index e2ac0e3..b90a520 100644
--- a/kernel/taskstats.c
+++ b/kernel/taskstats.c
@@ -35,8 +35,8 @@
static struct genl_family family;

static const struct nla_policy taskstats_cmd_get_policy[TASKSTATS_CMD_ATTR_MAX+1] = {
- [TASKSTATS_CMD_ATTR_PID] = { .type = NLA_U32 },
- [TASKSTATS_CMD_ATTR_TGID] = { .type = NLA_U32 },
+ [TASKSTATS_CMD_ATTR_PID] = { .type = NLA_NESTED },
+ [TASKSTATS_CMD_ATTR_TGID] = { .type = NLA_NESTED },
[TASKSTATS_CMD_ATTR_REGISTER_CPUMASK] = { .type = NLA_STRING },
[TASKSTATS_CMD_ATTR_DEREGISTER_CPUMASK] = { .type = NLA_STRING },};

@@ -45,7 +45,7 @@
* Make sure they are always aligned.
*/
static const struct nla_policy cgroupstats_cmd_get_policy[TASKSTATS_CMD_ATTR_MAX+1] = {
- [CGROUPSTATS_CMD_ATTR_FD] = { .type = NLA_U32 },
+ [CGROUPSTATS_CMD_ATTR_FD] = { .type = NLA_NESTED },
};

struct listener {
--
1.8.3.1


2020-03-26 20:19:04

by Johannes Berg

Subject: Re: [PATCH] kernel/taskstats: fix wrong nla type for {cgroup,task}stats policy

On Thu, 2020-03-26 at 13:08 -0700, Andrew Morton wrote:

> > After our server was upgraded to a newer kernel, we found that it
> > continuously prints a warning in the kernel log. The warning is,
> > [832984.946322] netlink: 'irmas.lc': attribute type 1 has an invalid length.
> >
> > irmas.lc is one of our container monitor daemons, and it uses
> > CGROUPSTATS_CMD_GET to get the cgroupstats, similar to
> > tools/accounting/getdelays.c. We can also reproduce this warning with
> > getdelays. For example, after running the command below
> > $ ./getdelays -C /sys/fs/cgroup/memory
> > you can find a warning in dmesg,
> > [61607.229318] netlink: 'getdelays': attribute type 1 has an invalid length.
> >
> > This warning was introduced in commit 6e237d099fac ("netlink: Relax attr
> > validation for fixed length types"), which checks whether
> > attributes using types NLA_U* and NLA_S* have an exact length.
> >
> > Regarding this issue, the root cause is that cgroupstats_cmd_get_policy
> > defines a wrong type, NLA_U32, while it should be NLA_NESTED, and its minimal
> > length is NLA_HDRLEN. The same applies to taskstats_cmd_get_policy.
> >
> > As this behavior change really breaks our application, we'd better
> > cc stable as well.

Can you explain how it breaks the application? I mean, it's really only
printing a message to the kernel log in this case? At least that's what
you're describing.

I think you may be describing it wrong, because an NLA_NESTED is allowed
to be *empty* (but otherwise must have at least 4 bytes just like an
NLA_U32).

That said, I'm not even sure I agree that this fix is right? See below.

> Is it correct to say that although the code has always been incorrect,
> only kernels after 6e237d099fac need this change? If so, I'll add
> Fixes:6e237d099fac to guide the -stable backporting.

That doesn't really seem right - 6e237d099fac *relaxed* the checks. If
anything then it ought to point to 28033ae4e0f5 which may have actually
returned an error; but again, need to understand better what really the
issue is.

> > diff --git a/kernel/taskstats.c b/kernel/taskstats.c
> > index e2ac0e3..b90a520 100644
> > --- a/kernel/taskstats.c
> > +++ b/kernel/taskstats.c
> > @@ -35,8 +35,8 @@
> > static struct genl_family family;
> >
> > static const struct nla_policy taskstats_cmd_get_policy[TASKSTATS_CMD_ATTR_MAX+1] = {
> > - [TASKSTATS_CMD_ATTR_PID] = { .type = NLA_U32 },
> > - [TASKSTATS_CMD_ATTR_TGID] = { .type = NLA_U32 },
> > + [TASKSTATS_CMD_ATTR_PID] = { .type = NLA_NESTED },
> > + [TASKSTATS_CMD_ATTR_TGID] = { .type = NLA_NESTED },


I'm not sure where this is coming from - the kernel evidently uses them
as nested attributes in *outgoing* data (see mk_reply()), but as NLA_U32
in *incoming* data, (see cmd_attr_pid() and cmd_attr_tgid()).

I would generally recommend not doing such a thing as it's messy, but we
do have quite a few such cases. In all those cases the policy must
list the incoming attribute types, since that's what the kernel uses to
validate the attributes.

IOW, this part of the change seems _wrong_.


> > * Make sure they are always aligned.
> > */
> > static const struct nla_policy cgroupstats_cmd_get_policy[TASKSTATS_CMD_ATTR_MAX+1] = {
> > - [CGROUPSTATS_CMD_ATTR_FD] = { .type = NLA_U32 },
> > + [CGROUPSTATS_CMD_ATTR_FD] = { .type = NLA_NESTED },
> > };

And same here, actually.

johannes
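
To make the length rules discussed above concrete, here is a small user-space model. It is only a sketch: model_validate() and the MODEL_* names are made up for illustration and are not kernel code. It assumes that an NLA_U32 with a payload longer than 4 bytes only triggers the warning (as reported in this thread), and that an NLA_NESTED may be empty but otherwise needs at least 4 bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative stand-ins only; the real policy types are the kernel's NLA_*. */
enum model_type { MODEL_U32, MODEL_NESTED };
enum model_result { MODEL_OK, MODEL_WARN, MODEL_REJECT };

#define MODEL_HDRLEN 4	/* same value as NLA_HDRLEN */

/* nla_len is the raw length field of the attribute (header + payload). */
static enum model_result model_validate(enum model_type type, int nla_len)
{
	int payload;

	assert(nla_len >= MODEL_HDRLEN);
	payload = nla_len - MODEL_HDRLEN;

	switch (type) {
	case MODEL_U32:
		if (payload < 4)
			return MODEL_REJECT;	/* too short for a u32 */
		if (payload != 4)
			return MODEL_WARN;	/* "has an invalid length" */
		return MODEL_OK;
	case MODEL_NESTED:
		if (payload == 0)
			return MODEL_OK;	/* an empty nest is allowed */
		return payload >= MODEL_HDRLEN ? MODEL_OK : MODEL_REJECT;
	}
	return MODEL_REJECT;
}
```

In this model, an attribute with a length field of 9 (a 5-byte payload) warns under MODEL_U32 but passes silently under MODEL_NESTED, which would explain why changing the policy type hides the warning without the sender being any more correct.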

2020-03-26 20:30:05

by Johannes Berg

Subject: Re: [PATCH] kernel/taskstats: fix wrong nla type for {cgroup,task}stats policy

On Thu, 2020-03-26 at 13:08 -0700, Andrew Morton wrote:
> (cc's added)
>
> On Wed, 25 Mar 2020 22:50:42 -0400 Yafang Shao <[email protected]> wrote:
>
> > After our server was upgraded to a newer kernel, we found that it
> > continuously prints a warning in the kernel log. The warning is,
> > [832984.946322] netlink: 'irmas.lc': attribute type 1 has an invalid length.
> >
> > irmas.lc is one of our container monitor daemons, and it uses
> > CGROUPSTATS_CMD_GET to get the cgroupstats, similar to
> > tools/accounting/getdelays.c. We can also reproduce this warning with
> > getdelays. For example, after running the command below
> > $ ./getdelays -C /sys/fs/cgroup/memory
> > you can find a warning in dmesg,
> > [61607.229318] netlink: 'getdelays': attribute type 1 has an invalid length.

And looking at this ... well, that code is completely wrong?

E.g.

rc = send_cmd(nl_sd, id, mypid, TASKSTATS_CMD_GET,
cmd_type, &tid, sizeof(__u32));

(cmd_type is one of TASKSTATS_CMD_ATTR_TGID, TASKSTATS_CMD_ATTR_PID)

or it might do

rc = send_cmd(nl_sd, id, mypid, CGROUPSTATS_CMD_GET,
CGROUPSTATS_CMD_ATTR_FD, &cfd, sizeof(__u32));

so clearly it wants to produce a u32 attribute.

But then

static int send_cmd(int sd, __u16 nlmsg_type, __u32 nlmsg_pid,
__u8 genl_cmd, __u16 nla_type,
void *nla_data, int nla_len)
{
...

na = (struct nlattr *) GENLMSG_DATA(&msg);

// this is still fine

na->nla_type = nla_type;

// this is also fine

na->nla_len = nla_len + 1 + NLA_HDRLEN;

// but this??? the nla_len of a netlink attribute should just be
// the len ... what's NLA_HDRLEN doing here? this isn't nested
// here we end up just reserving 1+NLA_HDRLEN too much space

memcpy(NLA_DATA(na), nla_data, nla_len);

// but then it anyway only fills the first nla_len bytes, which
// is just like a regular attribute.

msg.n.nlmsg_len += NLMSG_ALIGN(na->nla_len);
// note that this is also wrong - it should be
// += NLA_ALIGN(NLA_HDRLEN + nla_len)



So really I think what happened here is precisely what we wanted -
David's kernel patch caught the broken userspace tool.

johannes

2020-03-26 21:12:33

by David Ahern

Subject: Re: [PATCH] kernel/taskstats: fix wrong nla type for {cgroup,task}stats policy

On 3/26/20 2:28 PM, Johannes Berg wrote:
>
> And looking at this ... well, that code is completely wrong?
>
> E.g.
>
> rc = send_cmd(nl_sd, id, mypid, TASKSTATS_CMD_GET,
> cmd_type, &tid, sizeof(__u32));
>
> (cmd_type is one of TASKSTATS_CMD_ATTR_TGID, TASKSTATS_CMD_ATTR_PID)
>
> or it might do
>
> rc = send_cmd(nl_sd, id, mypid, CGROUPSTATS_CMD_GET,
> CGROUPSTATS_CMD_ATTR_FD, &cfd, sizeof(__u32));
>
> so clearly it wants to produce a u32 attribute.
>
> But then
>
> static int send_cmd(int sd, __u16 nlmsg_type, __u32 nlmsg_pid,
> __u8 genl_cmd, __u16 nla_type,
> void *nla_data, int nla_len)
> {
> ...
>
> na = (struct nlattr *) GENLMSG_DATA(&msg);
>
> // this is still fine
>
> na->nla_type = nla_type;
>
> // this is also fine
>
> na->nla_len = nla_len + 1 + NLA_HDRLEN;
>
> // but this??? the nla_len of a netlink attribute should just be
> // the len ... what's NLA_HDRLEN doing here? this isn't nested
> // here we end up just reserving 1+NLA_HDRLEN too much space
>
> memcpy(NLA_DATA(na), nla_data, nla_len);
>
> // but then it anyway only fills the first nla_len bytes, which
> // is just like a regular attribute.
>
> msg.n.nlmsg_len += NLMSG_ALIGN(na->nla_len);
> // note that this is also wrong - it should be
> // += NLA_ALIGN(NLA_HDRLEN + nla_len)
>
>
>
> So really I think what happened here is precisely what we wanted -
> David's kernel patch caught the broken userspace tool.

agreed. The tool needs to be fixed, not the kernel policy.

I do not get the error message with this change as Johannes points out
above:

diff --git a/tools/accounting/getdelays.c b/tools/accounting/getdelays.c
index 8cb504d30384..e90fd133df0e 100644
--- a/tools/accounting/getdelays.c
+++ b/tools/accounting/getdelays.c
@@ -136,7 +136,7 @@ static int send_cmd(int sd, __u16 nlmsg_type, __u32 nlmsg_pid,
msg.g.version = 0x1;
na = (struct nlattr *) GENLMSG_DATA(&msg);
na->nla_type = nla_type;
- na->nla_len = nla_len + 1 + NLA_HDRLEN;
+ na->nla_len = nla_len + NLA_HDRLEN;
memcpy(NLA_DATA(na), nla_data, nla_len);
msg.n.nlmsg_len += NLMSG_ALIGN(na->nla_len);

2020-03-26 21:14:30

by Johannes Berg

Subject: Re: [PATCH] kernel/taskstats: fix wrong nla type for {cgroup,task}stats policy

On Thu, 2020-03-26 at 15:11 -0600, David Ahern wrote:
>
> > na->nla_len = nla_len + 1 + NLA_HDRLEN;
> >
> > // but this??? the nla_len of a netlink attribute should just be
> > // the len ... what's NLA_HDRLEN doing here? this isn't nested
> > // here we end up just reserving 1+NLA_HDRLEN too much space

[...]

> I do not get the error message with this change as Johannes points out
> above:
>
> diff --git a/tools/accounting/getdelays.c b/tools/accounting/getdelays.c
> index 8cb504d30384..e90fd133df0e 100644
> --- a/tools/accounting/getdelays.c
> +++ b/tools/accounting/getdelays.c
> @@ -136,7 +136,7 @@ static int send_cmd(int sd, __u16 nlmsg_type, __u32 nlmsg_pid,
> msg.g.version = 0x1;
> na = (struct nlattr *) GENLMSG_DATA(&msg);
> na->nla_type = nla_type;
> - na->nla_len = nla_len + 1 + NLA_HDRLEN;
> + na->nla_len = nla_len + NLA_HDRLEN;

Oops, thanks for the correction - indeed NLA_HDRLEN is included, I was
wrong above.

johannes
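
As a rough illustration of how the extra byte propagates into the total message length, the sketch below computes nlmsg_len for a message that is assumed to consist of a netlink header, a generic netlink header and exactly one attribute. genl_msg_len() is a made-up helper for this example; the macros come from <linux/netlink.h> and <linux/genetlink.h>.

```c
#include <assert.h>
#include <linux/netlink.h>
#include <linux/genetlink.h>

/*
 * Hypothetical helper: total message length for one attribute whose
 * nla_len field (header + payload, unpadded) is nla_len_field.
 */
static unsigned int genl_msg_len(int nla_len_field)
{
	assert(nla_len_field >= NLA_HDRLEN);
	return NLMSG_HDRLEN + GENL_HDRLEN + NLA_ALIGN(nla_len_field);
}
```

Under these assumptions, a correct u32 attribute (nla_len of 8) gives a 28-byte message, while the old value of 9 is padded up to 12 and gives 32 bytes.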

2020-03-27 00:40:34

by Yafang Shao

Subject: Re: [PATCH] kernel/taskstats: fix wrong nla type for {cgroup,task}stats policy

On Fri, Mar 27, 2020 at 5:11 AM David Ahern <[email protected]> wrote:
>
> On 3/26/20 2:28 PM, Johannes Berg wrote:
> >
> > And looking at this ... well, that code is completely wrong?
> >
> > E.g.
> >
> > rc = send_cmd(nl_sd, id, mypid, TASKSTATS_CMD_GET,
> > cmd_type, &tid, sizeof(__u32));
> >
> > (cmd_type is one of TASKSTATS_CMD_ATTR_TGID, TASKSTATS_CMD_ATTR_PID)
> >
> > or it might do
> >
> > rc = send_cmd(nl_sd, id, mypid, CGROUPSTATS_CMD_GET,
> > CGROUPSTATS_CMD_ATTR_FD, &cfd, sizeof(__u32));
> >
> > so clearly it wants to produce a u32 attribute.
> >
> > But then
> >
> > static int send_cmd(int sd, __u16 nlmsg_type, __u32 nlmsg_pid,
> > __u8 genl_cmd, __u16 nla_type,
> > void *nla_data, int nla_len)
> > {
> > ...
> >
> > na = (struct nlattr *) GENLMSG_DATA(&msg);
> >
> > // this is still fine
> >
> > na->nla_type = nla_type;
> >
> > // this is also fine
> >
> > na->nla_len = nla_len + 1 + NLA_HDRLEN;
> >
> > // but this??? the nla_len of a netlink attribute should just be
> > // the len ... what's NLA_HDRLEN doing here? this isn't nested
> > // here we end up just reserving 1+NLA_HDRLEN too much space
> >
> > memcpy(NLA_DATA(na), nla_data, nla_len);
> >
> > // but then it anyway only fills the first nla_len bytes, which
> > // is just like a regular attribute.
> >
> > msg.n.nlmsg_len += NLMSG_ALIGN(na->nla_len);
> > // note that this is also wrong - it should be
> > // += NLA_ALIGN(NLA_HDRLEN + nla_len)
> >
> >
> >
> > So really I think what happened here is precisely what we wanted -
> > David's kernel patch caught the broken userspace tool.
>
> agreed. The tool needs to be fixed, not the kernel policy.
>
> I do not get the error message with this change as Johannes points out
> above:
>
> diff --git a/tools/accounting/getdelays.c b/tools/accounting/getdelays.c
> index 8cb504d30384..e90fd133df0e 100644
> --- a/tools/accounting/getdelays.c
> +++ b/tools/accounting/getdelays.c
> @@ -136,7 +136,7 @@ static int send_cmd(int sd, __u16 nlmsg_type, __u32 nlmsg_pid,
> msg.g.version = 0x1;
> na = (struct nlattr *) GENLMSG_DATA(&msg);
> na->nla_type = nla_type;
> - na->nla_len = nla_len + 1 + NLA_HDRLEN;
> + na->nla_len = nla_len + NLA_HDRLEN;
> memcpy(NLA_DATA(na), nla_data, nla_len);
> msg.n.nlmsg_len += NLMSG_ALIGN(na->nla_len);
>

Right. This is the right thing to do.
I missed that nla_len() subtracts NLA_HDRLEN.

Would you please submit a patch?

Feel free to add:
Tested-by: Yafang Shao <[email protected]>

Thanks
Yafang

2020-03-27 00:44:37

by Yafang Shao

Subject: Re: [PATCH] kernel/taskstats: fix wrong nla type for {cgroup,task}stats policy

On Fri, Mar 27, 2020 at 4:18 AM Johannes Berg <[email protected]> wrote:
>
> On Thu, 2020-03-26 at 13:08 -0700, Andrew Morton wrote:
>
> > > > After our server was upgraded to a newer kernel, we found that it
> > > > continuously prints a warning in the kernel log. The warning is,
> > > > [832984.946322] netlink: 'irmas.lc': attribute type 1 has an invalid length.
> > > >
> > > > irmas.lc is one of our container monitor daemons, and it uses
> > > > CGROUPSTATS_CMD_GET to get the cgroupstats, similar to
> > > > tools/accounting/getdelays.c. We can also reproduce this warning with
> > > > getdelays. For example, after running the command below
> > > > $ ./getdelays -C /sys/fs/cgroup/memory
> > > > you can find a warning in dmesg,
> > > > [61607.229318] netlink: 'getdelays': attribute type 1 has an invalid length.
> > > >
> > > > This warning was introduced in commit 6e237d099fac ("netlink: Relax attr
> > > > validation for fixed length types"), which checks whether
> > > > attributes using types NLA_U* and NLA_S* have an exact length.
> > > >
> > > > Regarding this issue, the root cause is that cgroupstats_cmd_get_policy
> > > > defines a wrong type, NLA_U32, while it should be NLA_NESTED, and its minimal
> > > > length is NLA_HDRLEN. The same applies to taskstats_cmd_get_policy.
> > >
> > > As this behavior change really breaks our application, we'd better
> > > cc stable as well.
>
> Can you explain how it breaks the application? I mean, it's really only
> printing a message to the kernel log in this case? At least that's what
> you're describing.
>
> I think you may be describing it wrong, because an NLA_NESTED is allowed
> to be *empty* (but otherwise must have at least 4 bytes just like an
> NLA_U32).
>
> That said, I'm not even sure I agree that this fix is right? See below.
>
> > > Is it correct to say that although the code has always been incorrect,
> > > only kernels after 6e237d099fac need this change? If so, I'll add
> > > Fixes:6e237d099fac to guide the -stable backporting.
>
> That doesn't really seem right - 6e237d099fac *relaxed* the checks. If
> anything then it ought to point to 28033ae4e0f5 which may have actually
> returned an error; but again, need to understand better what really the
> issue is.
>
> > > diff --git a/kernel/taskstats.c b/kernel/taskstats.c
> > > index e2ac0e3..b90a520 100644
> > > --- a/kernel/taskstats.c
> > > +++ b/kernel/taskstats.c
> > > @@ -35,8 +35,8 @@
> > > static struct genl_family family;
> > >
> > > static const struct nla_policy taskstats_cmd_get_policy[TASKSTATS_CMD_ATTR_MAX+1] = {
> > > - [TASKSTATS_CMD_ATTR_PID] = { .type = NLA_U32 },
> > > - [TASKSTATS_CMD_ATTR_TGID] = { .type = NLA_U32 },
> > > + [TASKSTATS_CMD_ATTR_PID] = { .type = NLA_NESTED },
> > > + [TASKSTATS_CMD_ATTR_TGID] = { .type = NLA_NESTED },
>
>
> I'm not sure where this is coming from - the kernel evidently uses them
> as nested attributes in *outgoing* data (see mk_reply()), but as NLA_U32
> in *incoming* data, (see cmd_attr_pid() and cmd_attr_tgid()).
>

Thanks for the explanation.
Nested attributes are only used in *outgoing* data, not in
*incoming* data.

> I would generally recommend not doing such a thing as it's messy, but we
> do have quite a few such cases. In all those cases the policy must
> list the incoming attribute types, since that's what the kernel uses to
> validate the attributes.
>
> IOW, this part of the change seems _wrong_.
>
>
> > > * Make sure they are always aligned.
> > > */
> > > static const struct nla_policy cgroupstats_cmd_get_policy[TASKSTATS_CMD_ATTR_MAX+1] = {
> > > - [CGROUPSTATS_CMD_ATTR_FD] = { .type = NLA_U32 },
> > > + [CGROUPSTATS_CMD_ATTR_FD] = { .type = NLA_NESTED },
> > > };
>
> And same here, actually.
>
> johannes
>


Thanks
Yafang

2020-03-27 15:06:09

by Sasha Levin

Subject: Re: [PATCH] kernel/taskstats: fix wrong nla type for {cgroup,task}stats policy

Hi

[This is an automated email]

This commit has been processed because it contains a -stable tag.
The stable tag indicates that it's relevant for the following trees: all

The bot has tested the following trees: v5.5.11, v5.4.27, v4.19.112, v4.14.174, v4.9.217, v4.4.217.

v5.5.11: Build OK!
v5.4.27: Build OK!
v4.19.112: Build OK!
v4.14.174: Build OK!
v4.9.217: Build OK!
v4.4.217: Failed to apply! Possible dependencies:
243d52126184 ("taskstats: fix the length of cgroupstats_cmd_get_policy")


NOTE: The patch will not be queued to stable trees until it is upstream.

How should we proceed with this patch?

--
Thanks
Sasha