2022-01-10 15:56:53

by Chuck Lever

[permalink] [raw]
Subject: [PATCH v2 0/2] Fix sockaddr handling in NFSD trace points

The patches in this series address a simple buffer over-read bug in
the Linux NFS server.

However I was thinking it would be nice to have trace helpers to
deal safely with generic socket addresses. I'd like to be able to
treat them the same way we currently treat strings. So for example:


#define field_sockaddr(field, len) __dynamic_array(u8, field, len)
#define assign_sockaddr(dest, src, len) memcpy(__get_dynamic_array(dest), src, len)
#define __get_sockaddr(field) ((struct sockaddr *)__get_dynamic_array(field))

TRACE_EVENT(sockaddr_example,
TP_PROTO(
const struct sockaddr *sap,
size_t salen
),
TP_ARGS(sap, salen),
TP_STRUCT__entry(
__field_sockaddr(addr, salen)
),
TP_fast_assign(
__assign_sockaddr(addr, sap, salen);
),
TP_printk("addr=%pIS", __get_sockaddr(addr))
);


should be able to store any address family in a dynamically-sized
array field (addr).

I haven't quite been able to figure out how to handle the
TP_printk() part of this equation. `trace-cmd report` displays
something like "addr=ARG TYPE NOT FIELD BUT 7".

Thoughts or advice appreciated.

---

Chuck Lever (2):
SUNRPC: Fix sockaddr handling in the svc_xprt_create_error trace point
SUNRPC: Fix sockaddr handling in svcsock_accept_class trace points


include/trace/events/sunrpc.h | 13 +++++++------
net/sunrpc/svc_xprt.c | 2 +-
2 files changed, 8 insertions(+), 7 deletions(-)

--
Chuck Lever


2022-01-10 15:57:05

by Chuck Lever

[permalink] [raw]
Subject: [PATCH v2 1/2] SUNRPC: Fix sockaddr handling in the svc_xprt_create_error trace point

While testing, I got an unexpected KASAN splat:

Jan 08 13:50:27 oracle-102.nfsv4.dev kernel: BUG: KASAN: stack-out-of-bounds in trace_event_raw_event_svc_xprt_create_err+0x190/0x210 [sunrpc]
Jan 08 13:50:27 oracle-102.nfsv4.dev kernel: Read of size 28 at addr ffffc9000008f728 by task mount.nfs/4628

The memcpy() in the TP_fast_assign section of this trace point
copies the size of the destination buffer in order that the buffer
won't be overrun.

In other similar trace points, the source buffer for this memcpy is
a "struct sockaddr_storage" so the actual length of the source
buffer is always long enough to prevent the memcpy from reading
uninitialized or unallocated memory.

However, for this trace point, the source buffer can be as small as
a "struct sockaddr_in". For AF_INET sockaddrs, the memcpy() reads
memory that follows the source buffer, which is not always valid
memory.

To avoid copying past the end of the passed-in sockaddr, make the
source address's length available to the memcpy(). It would be a
little nicer if the tracing infrastructure was more friendly about
storing socket addresses that are not AF_INET, but I could not find
a way to make printk("%pIS") work with a dynamic array.

Reported-by: KASAN
Fixes: 4b8f380e46e4 ("SUNRPC: Tracepoint to record errors in svc_xpo_create()")
Signed-off-by: Chuck Lever <[email protected]>
---
include/trace/events/sunrpc.h | 5 +++--
net/sunrpc/svc_xprt.c | 2 +-
2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/include/trace/events/sunrpc.h b/include/trace/events/sunrpc.h
index 684cc0e322fa..8ee03c9fdfdf 100644
--- a/include/trace/events/sunrpc.h
+++ b/include/trace/events/sunrpc.h
@@ -1744,10 +1744,11 @@ TRACE_EVENT(svc_xprt_create_err,
const char *program,
const char *protocol,
struct sockaddr *sap,
+ size_t saplen,
const struct svc_xprt *xprt
),

- TP_ARGS(program, protocol, sap, xprt),
+ TP_ARGS(program, protocol, sap, saplen, xprt),

TP_STRUCT__entry(
__field(long, error)
@@ -1760,7 +1761,7 @@ TRACE_EVENT(svc_xprt_create_err,
__entry->error = PTR_ERR(xprt);
__assign_str(program, program);
__assign_str(protocol, protocol);
- memcpy(__entry->addr, sap, sizeof(__entry->addr));
+ memcpy(__entry->addr, sap, min(saplen, sizeof(__entry->addr)));
),

TP_printk("addr=%pISpc program=%s protocol=%s error=%ld",
diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
index b1744432489e..1d8fc9d8da09 100644
--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -243,7 +243,7 @@ static struct svc_xprt *__svc_xpo_create(struct svc_xprt_class *xcl,
xprt = xcl->xcl_ops->xpo_create(serv, net, sap, len, flags);
if (IS_ERR(xprt))
trace_svc_xprt_create_err(serv->sv_program->pg_name,
- xcl->xcl_name, sap, xprt);
+ xcl->xcl_name, sap, len, xprt);
return xprt;
}



2022-01-10 15:57:07

by Chuck Lever

[permalink] [raw]
Subject: [PATCH v2 2/2] SUNRPC: Fix sockaddr handling in svcsock_accept_class trace points

Avoid potentially hazardous memory copying and the needless use of
"%pIS" -- in the kernel, an RPC service listener is always bound to
ANYADDR. Having the network namespace is helpful when recording
errors, though.

Fixes: a0469f46faab ("SUNRPC: Replace dprintk call sites in TCP state change callouts")
Signed-off-by: Chuck Lever <[email protected]>
---
include/trace/events/sunrpc.h | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/trace/events/sunrpc.h b/include/trace/events/sunrpc.h
index 8ee03c9fdfdf..81fcc662f80b 100644
--- a/include/trace/events/sunrpc.h
+++ b/include/trace/events/sunrpc.h
@@ -2125,17 +2125,17 @@ DECLARE_EVENT_CLASS(svcsock_accept_class,
TP_STRUCT__entry(
__field(long, status)
__string(service, service)
- __array(unsigned char, addr, sizeof(struct sockaddr_in6))
+ __field(unsigned int, netns_ino)
),

TP_fast_assign(
__entry->status = status;
__assign_str(service, service);
- memcpy(__entry->addr, &xprt->xpt_local, sizeof(__entry->addr));
+ __entry->netns_ino = xprt->xpt_net->ns.inum;
),

- TP_printk("listener=%pISpc service=%s status=%ld",
- __entry->addr, __get_str(service), __entry->status
+ TP_printk("addr=listener service=%s status=%ld",
+ __get_str(service), __entry->status
)
);



2022-01-10 16:05:41

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH v2 0/2] Fix sockaddr handling in NFSD trace points

On Mon, 10 Jan 2022 10:56:49 -0500
Chuck Lever <[email protected]> wrote:

> The patches in this series address a simple buffer over-read bug in
> the Linux NFS server.
>
> However I was thinking it would be nice to have trace helpers to
> deal safely with generic socket addresses. I'd like to be able to
> treat them the same way we currently treat strings. So for example:
>
>
> #define field_sockaddr(field, len) __dynamic_array(u8, field, len)
> #define assign_sockaddr(dest, src, len) memcpy(__get_dynamic_array(dest), src, len)
> #define __get_sockaddr(field) ((struct sockaddr *)__get_dynamic_array(field))
>
> TRACE_EVENT(sockaddr_example,
> TP_PROTO(
> const struct sockaddr *sap,
> size_t salen
> ),
> TP_ARGS(sap, salen),
> TP_STRUCT__entry(
> __field_sockaddr(addr, salen)
> ),
> TP_fast_assign(
> __assign_sockaddr(addr, sap, salen);
> ),
> TP_printk("addr=%pIS", __get_sockaddr(addr))
> );
>
>
> should be able to store any address family in a dynamically-sized
> array field (addr).
>
> I haven't quite been able to figure out how to handle the
> TP_printk() part of this equation. `trace-cmd report` displays
> something like "addr=ARG TYPE NOT FIELD BUT 7".
>
> Thoughts or advice appreciated.

I'll take a look into it.

Thanks,

-- Steve

2022-01-10 23:31:39

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH v2 0/2] Fix sockaddr handling in NFSD trace points

On Mon, 10 Jan 2022 11:05:35 -0500
Steven Rostedt <[email protected]> wrote:

> > I haven't quite been able to figure out how to handle the
> > TP_printk() part of this equation. `trace-cmd report` displays
> > something like "addr=ARG TYPE NOT FIELD BUT 7".
> >
> > Thoughts or advice appreciated.
>
> I'll take a look into it.

If you add this to libtraceevent, it will work:

diff --git a/src/event-parse.c b/src/event-parse.c
index 9bd605d..033529c 100644
--- a/src/event-parse.c
+++ b/src/event-parse.c
@@ -5127,6 +5127,8 @@ static int print_ipsa_arg(struct trace_seq *s, const char *ptr, char i,
unsigned char *buf;
struct sockaddr_storage *sa;
bool reverse = false;
+ unsigned int offset;
+ unsigned int len;
int rc = 0;
int ret;

@@ -5152,27 +5154,42 @@ static int print_ipsa_arg(struct trace_seq *s, const char *ptr, char i,
return rc;
}

- if (arg->type != TEP_PRINT_FIELD) {
- trace_seq_printf(s, "ARG TYPE NOT FIELD BUT %d", arg->type);
- return rc;
- }
+ /* evaluate if the arg has a typecast */
+ while (arg->type == TEP_PRINT_TYPE)
+ arg = arg->typecast.item;
+
+ if (arg->type == TEP_PRINT_FIELD) {

- if (!arg->field.field) {
- arg->field.field =
- tep_find_any_field(event, arg->field.name);
if (!arg->field.field) {
- do_warning("%s: field %s not found",
- __func__, arg->field.name);
- return rc;
+ arg->field.field =
+ tep_find_any_field(event, arg->field.name);
+ if (!arg->field.field) {
+ do_warning("%s: field %s not found",
+ __func__, arg->field.name);
+ return rc;
+ }
}
+
+ offset = arg->field.field->offset;
+ len = arg->field.field->size;
+
+ } else if (arg->type == TEP_PRINT_DYNAMIC_ARRAY) {
+
+ dynamic_offset_field(event->tep, arg->dynarray.field, data,
+ size, &offset, &len);
+
+ } else {
+ trace_seq_printf(s, "ARG NOT FIELD NOR DYNAMIC ARRAY BUT TYPE %d",
+ arg->type);
+ return rc;
}

- sa = (struct sockaddr_storage *) (data + arg->field.field->offset);
+ sa = (struct sockaddr_storage *)(data + offset);

if (sa->ss_family == AF_INET) {
struct sockaddr_in *sa4 = (struct sockaddr_in *) sa;

- if (arg->field.field->size < sizeof(struct sockaddr_in)) {
+ if (len < sizeof(struct sockaddr_in)) {
trace_seq_printf(s, "INVALIDIPv4");
return rc;
}
@@ -5185,7 +5202,7 @@ static int print_ipsa_arg(struct trace_seq *s, const char *ptr, char i,
} else if (sa->ss_family == AF_INET6) {
struct sockaddr_in6 *sa6 = (struct sockaddr_in6 *) sa;

- if (arg->field.field->size < sizeof(struct sockaddr_in6)) {
+ if (len < sizeof(struct sockaddr_in6)) {
trace_seq_printf(s, "INVALIDIPv6");
return rc;
}

2022-01-11 15:41:04

by Chuck Lever

[permalink] [raw]
Subject: Re: [PATCH v2 0/2] Fix sockaddr handling in NFSD trace points



> On Jan 10, 2022, at 6:31 PM, Steven Rostedt <[email protected]> wrote:
>
> On Mon, 10 Jan 2022 11:05:35 -0500
> Steven Rostedt <[email protected]> wrote:
>
>>> I haven't quite been able to figure out how to handle the
>>> TP_printk() part of this equation. `trace-cmd report` displays
>>> something like "addr=ARG TYPE NOT FIELD BUT 7".
>>>
>>> Thoughts or advice appreciated.
>>
>> I'll take a look into it.
>
> If you add this to libtraceevent, it will work:

Thank you Steven! I will set up my current test system with
a locally-built trace-cmd and try this out. I can send a
proper patch that introduces the helper macros in my cover
letter's pseudo-code example today or tomorrow.


> diff --git a/src/event-parse.c b/src/event-parse.c
> index 9bd605d..033529c 100644
> --- a/src/event-parse.c
> +++ b/src/event-parse.c
> @@ -5127,6 +5127,8 @@ static int print_ipsa_arg(struct trace_seq *s, const char *ptr, char i,
> unsigned char *buf;
> struct sockaddr_storage *sa;
> bool reverse = false;
> + unsigned int offset;
> + unsigned int len;
> int rc = 0;
> int ret;
>
> @@ -5152,27 +5154,42 @@ static int print_ipsa_arg(struct trace_seq *s, const char *ptr, char i,
> return rc;
> }
>
> - if (arg->type != TEP_PRINT_FIELD) {
> - trace_seq_printf(s, "ARG TYPE NOT FIELD BUT %d", arg->type);
> - return rc;
> - }
> + /* evaluate if the arg has a typecast */
> + while (arg->type == TEP_PRINT_TYPE)
> + arg = arg->typecast.item;
> +
> + if (arg->type == TEP_PRINT_FIELD) {
>
> - if (!arg->field.field) {
> - arg->field.field =
> - tep_find_any_field(event, arg->field.name);
> if (!arg->field.field) {
> - do_warning("%s: field %s not found",
> - __func__, arg->field.name);
> - return rc;
> + arg->field.field =
> + tep_find_any_field(event, arg->field.name);
> + if (!arg->field.field) {
> + do_warning("%s: field %s not found",
> + __func__, arg->field.name);
> + return rc;
> + }
> }
> +
> + offset = arg->field.field->offset;
> + len = arg->field.field->size;
> +
> + } else if (arg->type == TEP_PRINT_DYNAMIC_ARRAY) {
> +
> + dynamic_offset_field(event->tep, arg->dynarray.field, data,
> + size, &offset, &len);
> +
> + } else {
> + trace_seq_printf(s, "ARG NOT FIELD NOR DYNAMIC ARRAY BUT TYPE %d",
> + arg->type);
> + return rc;
> }
>
> - sa = (struct sockaddr_storage *) (data + arg->field.field->offset);
> + sa = (struct sockaddr_storage *)(data + offset);
>
> if (sa->ss_family == AF_INET) {
> struct sockaddr_in *sa4 = (struct sockaddr_in *) sa;
>
> - if (arg->field.field->size < sizeof(struct sockaddr_in)) {
> + if (len < sizeof(struct sockaddr_in)) {
> trace_seq_printf(s, "INVALIDIPv4");
> return rc;
> }
> @@ -5185,7 +5202,7 @@ static int print_ipsa_arg(struct trace_seq *s, const char *ptr, char i,
> } else if (sa->ss_family == AF_INET6) {
> struct sockaddr_in6 *sa6 = (struct sockaddr_in6 *) sa;
>
> - if (arg->field.field->size < sizeof(struct sockaddr_in6)) {
> + if (len < sizeof(struct sockaddr_in6)) {
> trace_seq_printf(s, "INVALIDIPv6");
> return rc;
> }

--
Chuck Lever