From: Menglong Dong <[email protected]>
In the commit c504e5c2f964 ("net: skb: introduce kfree_skb_reason()"),
we added the support of reporting the reasons of skb drops to kfree_skb
tracepoint. And in this series patches, reasons for skb drops are added
to gre packet receiving path.
gre_rcv() in gre_demux.c is handled in the 1th patch.
In order to report skb drop reasons, we make erspan_rcv() return
PACKET_NEXT when no tunnel device found in the 2th patch. This may don't
correspond to 'PACKET_NEXT', but don't matter.
And gre_rcv() in ip_gre.c is handled in the 3th patch.
Following drop reasons are added(what they mean can be see in the
document for them):
SKB_DROP_REASON_GRE_VERSION
SKB_DROP_REASON_GRE_NOHANDLER
SKB_DROP_REASON_GRE_CSUM
SKB_DROP_REASON_GRE_NOTUNNEL
Maybe SKB_DROP_REASON_GRE_NOHANDLER can be replaced with
SKB_DROP_REASON_GRE_VERSION? As no gre_protocol found means that gre
version not supported.
PS: This set is parallel with the set "net: icmp: add skb drop reasons
to icmp", please don't mind :)
Menglong Dong (3):
net: gre_demux: add skb drop reasons to gre_rcv()
net: ipgre: make erspan_rcv() return PACKET_NEXT
net: ipgre: add skb drop reasons to gre_rcv()
include/linux/skbuff.h | 6 ++++++
include/trace/events/skb.h | 4 ++++
net/ipv4/gre_demux.c | 12 +++++++++---
net/ipv4/ip_gre.c | 30 +++++++++++++++++++-----------
4 files changed, 38 insertions(+), 14 deletions(-)
--
2.35.1
From: Menglong Dong <[email protected]>
Replace kfree_skb() used in gre_rcv() with kfree_skb_reason(). Following
new drop reasons are added:
SKB_DROP_REASON_GRE_VERSION
SKB_DROP_REASON_GRE_NOHANDLER
Reviewed-by: Hao Peng <[email protected]>
Reviewed-by: Biao Jiang <[email protected]>
Signed-off-by: Menglong Dong <[email protected]>
---
include/linux/skbuff.h | 4 ++++
include/trace/events/skb.h | 2 ++
net/ipv4/gre_demux.c | 12 +++++++++---
3 files changed, 15 insertions(+), 3 deletions(-)
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 26538ceb4b01..5edb704af5bb 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -444,6 +444,10 @@ enum skb_drop_reason {
SKB_DROP_REASON_TAP_TXFILTER, /* dropped by tx filter implemented
* at tun/tap, e.g., check_filter()
*/
+ SKB_DROP_REASON_GRE_VERSION, /* GRE version not supported */
+ SKB_DROP_REASON_GRE_NOHANDLER, /* no handler found (version not
+ * supported?)
+ */
SKB_DROP_REASON_MAX,
};
diff --git a/include/trace/events/skb.h b/include/trace/events/skb.h
index e1670e1e4934..f2bcffdc4bae 100644
--- a/include/trace/events/skb.h
+++ b/include/trace/events/skb.h
@@ -61,6 +61,8 @@
EM(SKB_DROP_REASON_HDR_TRUNC, HDR_TRUNC) \
EM(SKB_DROP_REASON_TAP_FILTER, TAP_FILTER) \
EM(SKB_DROP_REASON_TAP_TXFILTER, TAP_TXFILTER) \
+ EM(SKB_DROP_REASON_GRE_VERSION, GRE_VERSION) \
+ EM(SKB_DROP_REASON_GRE_NOHANDLER, GRE_NOHANDLER) \
EMe(SKB_DROP_REASON_MAX, MAX)
#undef EM
diff --git a/net/ipv4/gre_demux.c b/net/ipv4/gre_demux.c
index cbb2b4bb0dfa..066cbaadc52a 100644
--- a/net/ipv4/gre_demux.c
+++ b/net/ipv4/gre_demux.c
@@ -146,20 +146,26 @@ EXPORT_SYMBOL(gre_parse_header);
static int gre_rcv(struct sk_buff *skb)
{
const struct gre_protocol *proto;
+ enum skb_drop_reason reason;
u8 ver;
int ret;
+ reason = SKB_DROP_REASON_NOT_SPECIFIED;
if (!pskb_may_pull(skb, 12))
goto drop;
ver = skb->data[1]&0x7f;
- if (ver >= GREPROTO_MAX)
+ if (ver >= GREPROTO_MAX) {
+ reason = SKB_DROP_REASON_GRE_VERSION;
goto drop;
+ }
rcu_read_lock();
proto = rcu_dereference(gre_proto[ver]);
- if (!proto || !proto->handler)
+ if (!proto || !proto->handler) {
+ reason = SKB_DROP_REASON_GRE_NOHANDLER;
goto drop_unlock;
+ }
ret = proto->handler(skb);
rcu_read_unlock();
return ret;
@@ -167,7 +173,7 @@ static int gre_rcv(struct sk_buff *skb)
drop_unlock:
rcu_read_unlock();
drop:
- kfree_skb(skb);
+ kfree_skb_reason(skb, reason);
return NET_RX_DROP;
}
--
2.35.1
On Wed, Mar 16, 2022 at 11:08 AM Jakub Kicinski <[email protected]> wrote:
>
> On Mon, 14 Mar 2022 21:33:10 +0800 [email protected] wrote:
> > + reason = SKB_DROP_REASON_NOT_SPECIFIED;
> > if (!pskb_may_pull(skb, 12))
> > goto drop;
>
> REASON_HDR_TRUNC ?
I'm still not sure about such a 'pskb_pull' failure, whose reasons may be
complex, such as no memory or packet length too small. I see somewhere
return a '-NOMEM' when skb pull fails.
So maybe such cases can be ignored? In my opinion, not all skb drops
need a reason.
>
> > ver = skb->data[1]&0x7f;
> > - if (ver >= GREPROTO_MAX)
> > + if (ver >= GREPROTO_MAX) {
> > + reason = SKB_DROP_REASON_GRE_VERSION;
>
> TBH I'm still not sure what level of granularity we should be shooting
> for with the reasons. I'd throw all unexpected header values into one
> bucket, not go for a reason per field, per protocol. But as I'm said
> I'm not sure myself, so we can keep what you have..
>
> > goto drop;
> > + }
> >
> > rcu_read_lock();
> > proto = rcu_dereference(gre_proto[ver]);
> > - if (!proto || !proto->handler)
> > + if (!proto || !proto->handler) {
> > + reason = SKB_DROP_REASON_GRE_NOHANDLER;
>
> I think the ->handler check is defensive programming, there's no
> protocol upstream which would leave handler NULL.
>
> This is akin to SKB_DROP_REASON_PTYPE_ABSENT, we can reuse that or add
> a new reason, but I'd think the phrasing should be kept similar.
With the handler not NULL, does it mean the gre version is not supported here,
and this 'SKB_DROP_REASON_GRE_NOHANDLER' can be replaced with
SKB_DROP_REASON_GRE_VERSION above?
>
> > goto drop_unlock;
> > + }
> > ret = proto->handler(skb);
On 3/15/22 10:55 PM, Jakub Kicinski wrote:
> On Tue, 15 Mar 2022 21:49:01 -0600 David Ahern wrote:
>>>> ver = skb->data[1]&0x7f;
>>>> - if (ver >= GREPROTO_MAX)
>>>> + if (ver >= GREPROTO_MAX) {
>>>> + reason = SKB_DROP_REASON_GRE_VERSION;
>>>
>>> TBH I'm still not sure what level of granularity we should be shooting
>>> for with the reasons. I'd throw all unexpected header values into one
>>> bucket, not go for a reason per field, per protocol. But as I'm said
>>> I'm not sure myself, so we can keep what you have..
>>
>> I have stated before I do not believe every single drop point in the
>> kernel needs a unique reason code. This is overkill. The reason augments
>> information we already have -- the IP from kfree_skb tracepoint.
>
> That's certainly true. I wonder if there is a systematic way of
> approaching these additions that'd help us picking the points were
> we add reasons less of a judgment call.
In my head it's split between OS housekeeping and user visible data.
Housekeeping side of it is more the technical failure points like skb
manipulations - maybe interesting to a user collecting stats about how a
node is performing, but more than likely not. IMHO, those are ignored
for now (NOT_SPECIFIED).
The immediate big win is for packets from a network where an analysis
can show code location (instruction pointer), user focused reason (csum
failure, 'otherhost', no socket open, no socket buffer space, ...) and
traceable to a specific host (headers in skb data).
On Wed, 16 Mar 2022 12:41:45 +0800 Menglong Dong wrote:
> > On Mon, 14 Mar 2022 21:33:10 +0800 [email protected] wrote:
> > > + reason = SKB_DROP_REASON_NOT_SPECIFIED;
> > > if (!pskb_may_pull(skb, 12))
> > > goto drop;
> >
> > REASON_HDR_TRUNC ?
>
> I'm still not sure about such a 'pskb_pull' failure, whose reasons may be
> complex, such as no memory or packet length too small. I see somewhere
> return a '-NOMEM' when skb pull fails.
>
> So maybe such cases can be ignored? In my opinion, not all skb drops
> need a reason.
Ah, okay, I saw Dave's email as well. I wasn't sure if this omission
was intentional. But if it is, that's fine.
On Mon, 14 Mar 2022 21:33:10 +0800 [email protected] wrote:
> + reason = SKB_DROP_REASON_NOT_SPECIFIED;
> if (!pskb_may_pull(skb, 12))
> goto drop;
REASON_HDR_TRUNC ?
> ver = skb->data[1]&0x7f;
> - if (ver >= GREPROTO_MAX)
> + if (ver >= GREPROTO_MAX) {
> + reason = SKB_DROP_REASON_GRE_VERSION;
TBH I'm still not sure what level of granularity we should be shooting
for with the reasons. I'd throw all unexpected header values into one
bucket, not go for a reason per field, per protocol. But as I'm said
I'm not sure myself, so we can keep what you have..
> goto drop;
> + }
>
> rcu_read_lock();
> proto = rcu_dereference(gre_proto[ver]);
> - if (!proto || !proto->handler)
> + if (!proto || !proto->handler) {
> + reason = SKB_DROP_REASON_GRE_NOHANDLER;
I think the ->handler check is defensive programming, there's no
protocol upstream which would leave handler NULL.
This is akin to SKB_DROP_REASON_PTYPE_ABSENT, we can reuse that or add
a new reason, but I'd think the phrasing should be kept similar.
> goto drop_unlock;
> + }
> ret = proto->handler(skb);
On Wed, 16 Mar 2022 10:12:00 +0000 David Laight wrote:
> Is it worth considering splitting the 'reason' into two parts?
> Eg x << 16 | y
> One part being the overall reason - and probably a define.
> The other qualifying the actual failing test and probably just
> being a number.
>
> Then you get an overall view of the fails (which might even
> be counted) while still being able to locate the actual
> failing test.
That popped to my mind, but other than the fact that it "seems fine"
I can't really convince myself that (a) 2 levels are enough, why not 3;
(b) I personally don't often look at the drops, so IDK what'd fit the
needs of the consumer of this API. TCP bad csum drops are the only ones
I have experience with, and we have those already covered.
On Tue, 15 Mar 2022 21:49:01 -0600 David Ahern wrote:
> >> ver = skb->data[1]&0x7f;
> >> - if (ver >= GREPROTO_MAX)
> >> + if (ver >= GREPROTO_MAX) {
> >> + reason = SKB_DROP_REASON_GRE_VERSION;
> >
> > TBH I'm still not sure what level of granularity we should be shooting
> > for with the reasons. I'd throw all unexpected header values into one
> > bucket, not go for a reason per field, per protocol. But as I'm said
> > I'm not sure myself, so we can keep what you have..
>
> I have stated before I do not believe every single drop point in the
> kernel needs a unique reason code. This is overkill. The reason augments
> information we already have -- the IP from kfree_skb tracepoint.
That's certainly true. I wonder if there is a systematic way of
approaching these additions that'd help us picking the points were
we add reasons less of a judgment call.
On 3/15/22 9:08 PM, Jakub Kicinski wrote:
> On Mon, 14 Mar 2022 21:33:10 +0800 [email protected] wrote:
>> + reason = SKB_DROP_REASON_NOT_SPECIFIED;
>> if (!pskb_may_pull(skb, 12))
>> goto drop;
>
> REASON_HDR_TRUNC ?
>
>> ver = skb->data[1]&0x7f;
>> - if (ver >= GREPROTO_MAX)
>> + if (ver >= GREPROTO_MAX) {
>> + reason = SKB_DROP_REASON_GRE_VERSION;
>
> TBH I'm still not sure what level of granularity we should be shooting
> for with the reasons. I'd throw all unexpected header values into one
> bucket, not go for a reason per field, per protocol. But as I'm said
> I'm not sure myself, so we can keep what you have..
I have stated before I do not believe every single drop point in the
kernel needs a unique reason code. This is overkill. The reason augments
information we already have -- the IP from kfree_skb tracepoint.
>
>> goto drop;
>> + }
>>
>> rcu_read_lock();
>> proto = rcu_dereference(gre_proto[ver]);
>> - if (!proto || !proto->handler)
>> + if (!proto || !proto->handler) {
>> + reason = SKB_DROP_REASON_GRE_NOHANDLER;
>
> I think the ->handler check is defensive programming, there's no
> protocol upstream which would leave handler NULL.
>
> This is akin to SKB_DROP_REASON_PTYPE_ABSENT, we can reuse that or add
> a new reason, but I'd think the phrasing should be kept similar.
>
>> goto drop_unlock;
>> + }
>> ret = proto->handler(skb);
From: Jakub Kicinski
> Sent: 16 March 2022 04:56
>
> On Tue, 15 Mar 2022 21:49:01 -0600 David Ahern wrote:
> > >> ver = skb->data[1]&0x7f;
> > >> - if (ver >= GREPROTO_MAX)
> > >> + if (ver >= GREPROTO_MAX) {
> > >> + reason = SKB_DROP_REASON_GRE_VERSION;
> > >
> > > TBH I'm still not sure what level of granularity we should be shooting
> > > for with the reasons. I'd throw all unexpected header values into one
> > > bucket, not go for a reason per field, per protocol. But as I'm said
> > > I'm not sure myself, so we can keep what you have..
> >
> > I have stated before I do not believe every single drop point in the
> > kernel needs a unique reason code. This is overkill. The reason augments
> > information we already have -- the IP from kfree_skb tracepoint.
>
> That's certainly true. I wonder if there is a systematic way of
> approaching these additions that'd help us picking the points were
> we add reasons less of a judgment call.
Is it worth considering splitting the 'reason' into two parts?
Eg x << 16 | y
One part being the overall reason - and probably a define.
The other qualifying the actual failing test and probably just
being a number.
Then you get an overall view of the fails (which might even
be counted) while still being able to locate the actual
failing test.
David
-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
On Wed, Mar 16, 2022 at 11:49 AM David Ahern <[email protected]> wrote:
>
> On 3/15/22 9:08 PM, Jakub Kicinski wrote:
> > On Mon, 14 Mar 2022 21:33:10 +0800 [email protected] wrote:
> >> + reason = SKB_DROP_REASON_NOT_SPECIFIED;
> >> if (!pskb_may_pull(skb, 12))
> >> goto drop;
> >
> > REASON_HDR_TRUNC ?
> >
> >> ver = skb->data[1]&0x7f;
> >> - if (ver >= GREPROTO_MAX)
> >> + if (ver >= GREPROTO_MAX) {
> >> + reason = SKB_DROP_REASON_GRE_VERSION;
> >
> > TBH I'm still not sure what level of granularity we should be shooting
> > for with the reasons. I'd throw all unexpected header values into one
> > bucket, not go for a reason per field, per protocol. But as I'm said
> > I'm not sure myself, so we can keep what you have..
>
> I have stated before I do not believe every single drop point in the
> kernel needs a unique reason code. This is overkill. The reason augments
> information we already have -- the IP from kfree_skb tracepoint.
Is this reason unnecessary? I'm not sure if the GRE version problem should
be reported. With versions not supported by the kernel, it seems we
can't get the
drop reason from the packet data, as they are fine. And previous seems not
suitable here, as it is a L4 problem.
I'll remove the reason here if there is no positive reply.
Thanks!
Menglong Dong
>
> >
> >> goto drop;
> >> + }
> >>
> >> rcu_read_lock();
> >> proto = rcu_dereference(gre_proto[ver]);
> >> - if (!proto || !proto->handler)
> >> + if (!proto || !proto->handler) {
> >> + reason = SKB_DROP_REASON_GRE_NOHANDLER;
> >
> > I think the ->handler check is defensive programming, there's no
> > protocol upstream which would leave handler NULL.
> >
> > This is akin to SKB_DROP_REASON_PTYPE_ABSENT, we can reuse that or add
> > a new reason, but I'd think the phrasing should be kept similar.
> >
> >> goto drop_unlock;
> >> + }
> >> ret = proto->handler(skb);
>
On 3/16/22 12:57 PM, Jakub Kicinski wrote:
> On Wed, 16 Mar 2022 08:56:14 -0600 David Ahern wrote:
>>> That's certainly true. I wonder if there is a systematic way of
>>> approaching these additions that'd help us picking the points were
>>> we add reasons less of a judgment call.
>>
>> In my head it's split between OS housekeeping and user visible data.
>> Housekeeping side of it is more the technical failure points like skb
>> manipulations - maybe interesting to a user collecting stats about how a
>> node is performing, but more than likely not. IMHO, those are ignored
>> for now (NOT_SPECIFIED).
>>
>> The immediate big win is for packets from a network where an analysis
>> can show code location (instruction pointer), user focused reason (csum
>> failure, 'otherhost', no socket open, no socket buffer space, ...) and
>> traceable to a specific host (headers in skb data).
>
> Maybe I'm oversimplifying but would that mean first order of business
> is to have drop codes for where we already bump MIB exception stats?
That was the original motivation and it has since spun out of control -
walking the code base and assigning unique drop reasons for every single
call to kfree_skb.
On 3/15/22 10:53 PM, Menglong Dong wrote:
> On Wed, Mar 16, 2022 at 11:49 AM David Ahern <[email protected]> wrote:
>>
>> On 3/15/22 9:08 PM, Jakub Kicinski wrote:
>>> On Mon, 14 Mar 2022 21:33:10 +0800 [email protected] wrote:
>>>> + reason = SKB_DROP_REASON_NOT_SPECIFIED;
>>>> if (!pskb_may_pull(skb, 12))
>>>> goto drop;
>>>
>>> REASON_HDR_TRUNC ?
>>>
>>>> ver = skb->data[1]&0x7f;
>>>> - if (ver >= GREPROTO_MAX)
>>>> + if (ver >= GREPROTO_MAX) {
>>>> + reason = SKB_DROP_REASON_GRE_VERSION;
>>>
>>> TBH I'm still not sure what level of granularity we should be shooting
>>> for with the reasons. I'd throw all unexpected header values into one
>>> bucket, not go for a reason per field, per protocol. But as I'm said
>>> I'm not sure myself, so we can keep what you have..
>>
>> I have stated before I do not believe every single drop point in the
>> kernel needs a unique reason code. This is overkill. The reason augments
>> information we already have -- the IP from kfree_skb tracepoint.
>
> Is this reason unnecessary? I'm not sure if the GRE version problem should
> be reported. With versions not supported by the kernel, it seems we
> can't get the
> drop reason from the packet data, as they are fine. And previous seems not
> suitable here, as it is a L4 problem.
>
Generically, it is "no support for a protocol version". That kind of
reason code + the IP tells you GRE is processing a packet with an
unsupported protocol version.
On Wed, 16 Mar 2022 08:56:14 -0600 David Ahern wrote:
> > That's certainly true. I wonder if there is a systematic way of
> > approaching these additions that'd help us picking the points were
> > we add reasons less of a judgment call.
>
> In my head it's split between OS housekeeping and user visible data.
> Housekeeping side of it is more the technical failure points like skb
> manipulations - maybe interesting to a user collecting stats about how a
> node is performing, but more than likely not. IMHO, those are ignored
> for now (NOT_SPECIFIED).
>
> The immediate big win is for packets from a network where an analysis
> can show code location (instruction pointer), user focused reason (csum
> failure, 'otherhost', no socket open, no socket buffer space, ...) and
> traceable to a specific host (headers in skb data).
Maybe I'm oversimplifying but would that mean first order of business
is to have drop codes for where we already bump MIB exception stats?