2022-01-22 19:59:02

by Jeffrey Ji

[permalink] [raw]
Subject: [PATCH net-next] net-core: add InMacErrors counter

From: jeffreyji <[email protected]>

Increment InMacErrors counter when packet dropped due to incorrect dest
MAC addr.

example output from nstat:
\~# nstat -z "*InMac*"
\#kernel
Ip6InMacErrors 0 0.0
IpExtInMacErrors 1 0.0

Tested: Created 2 netns, sent 1 packet using trafgen from 1 to the other
with "{eth(daddr=$INCORRECT_MAC...}", verified that nstat showed the
counter was incremented.

Signed-off-by: jeffreyji <[email protected]>
---
include/uapi/linux/snmp.h | 1 +
net/ipv4/ip_input.c | 4 +++-
net/ipv4/proc.c | 1 +
net/ipv6/ip6_input.c | 1 +
net/ipv6/proc.c | 1 +
5 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/include/uapi/linux/snmp.h b/include/uapi/linux/snmp.h
index 904909d020e2..ac2fac12dd7d 100644
--- a/include/uapi/linux/snmp.h
+++ b/include/uapi/linux/snmp.h
@@ -57,6 +57,7 @@ enum
IPSTATS_MIB_ECT0PKTS, /* InECT0Pkts */
IPSTATS_MIB_CEPKTS, /* InCEPkts */
IPSTATS_MIB_REASM_OVERLAPS, /* ReasmOverlaps */
+ IPSTATS_MIB_INMACERRORS, /* InMacErrors */
__IPSTATS_MIB_MAX
};

diff --git a/net/ipv4/ip_input.c b/net/ipv4/ip_input.c
index 3a025c011971..0961585f7c02 100644
--- a/net/ipv4/ip_input.c
+++ b/net/ipv4/ip_input.c
@@ -441,8 +441,10 @@ static struct sk_buff *ip_rcv_core(struct sk_buff *skb, struct net *net)
/* When the interface is in promisc. mode, drop all the crap
* that it receives, do not try to analyse it.
*/
- if (skb->pkt_type == PACKET_OTHERHOST)
+ if (skb->pkt_type == PACKET_OTHERHOST) {
+ __IP_INC_STATS(net, IPSTATS_MIB_INMACERRORS);
goto drop;
+ }

__IP_UPD_PO_STATS(net, IPSTATS_MIB_IN, skb->len);

diff --git a/net/ipv4/proc.c b/net/ipv4/proc.c
index f30273afb539..dfe0a1dbf8e9 100644
--- a/net/ipv4/proc.c
+++ b/net/ipv4/proc.c
@@ -117,6 +117,7 @@ static const struct snmp_mib snmp4_ipextstats_list[] = {
SNMP_MIB_ITEM("InECT0Pkts", IPSTATS_MIB_ECT0PKTS),
SNMP_MIB_ITEM("InCEPkts", IPSTATS_MIB_CEPKTS),
SNMP_MIB_ITEM("ReasmOverlaps", IPSTATS_MIB_REASM_OVERLAPS),
+ SNMP_MIB_ITEM("InMacErrors", IPSTATS_MIB_INMACERRORS),
SNMP_MIB_SENTINEL
};

diff --git a/net/ipv6/ip6_input.c b/net/ipv6/ip6_input.c
index 80256717868e..2903869274ca 100644
--- a/net/ipv6/ip6_input.c
+++ b/net/ipv6/ip6_input.c
@@ -150,6 +150,7 @@ static struct sk_buff *ip6_rcv_core(struct sk_buff *skb, struct net_device *dev,
struct inet6_dev *idev;

if (skb->pkt_type == PACKET_OTHERHOST) {
+ __IP6_INC_STATS(net, idev, IPSTATS_MIB_INMACERRORS);
kfree_skb(skb);
return NULL;
}
diff --git a/net/ipv6/proc.c b/net/ipv6/proc.c
index d6306aa46bb1..76e6119ba558 100644
--- a/net/ipv6/proc.c
+++ b/net/ipv6/proc.c
@@ -84,6 +84,7 @@ static const struct snmp_mib snmp6_ipstats_list[] = {
SNMP_MIB_ITEM("Ip6InECT1Pkts", IPSTATS_MIB_ECT1PKTS),
SNMP_MIB_ITEM("Ip6InECT0Pkts", IPSTATS_MIB_ECT0PKTS),
SNMP_MIB_ITEM("Ip6InCEPkts", IPSTATS_MIB_CEPKTS),
+ SNMP_MIB_ITEM("Ip6InMacErrors", IPSTATS_MIB_INMACERRORS),
SNMP_MIB_SENTINEL
};

--
2.35.0.rc0.227.g00780c9af4-goog


2022-01-23 00:14:40

by Jakub Kicinski

[permalink] [raw]
Subject: Re: [PATCH net-next] net-core: add InMacErrors counter

On Sat, 22 Jan 2022 00:03:01 +0000 Jeffrey Ji wrote:
> From: jeffreyji <[email protected]>
>
> Increment InMacErrors counter when packet dropped due to incorrect dest
> MAC addr.
>
> example output from nstat:
> \~# nstat -z "*InMac*"
> \#kernel
> Ip6InMacErrors 0 0.0
> IpExtInMacErrors 1 0.0
>
> Tested: Created 2 netns, sent 1 packet using trafgen from 1 to the other
> with "{eth(daddr=$INCORRECT_MAC...}", verified that nstat showed the
> counter was incremented.
>
> Signed-off-by: jeffreyji <[email protected]>

How about we use the new kfree_skb_reason() instead to avoid allocating
per-netns memory the stats?

2022-01-24 19:40:45

by Brian Vazquez

[permalink] [raw]
Subject: Re: [PATCH net-next] net-core: add InMacErrors counter

On Fri, Jan 21, 2022 at 7:41 PM Jakub Kicinski <[email protected]> wrote:
>
> On Sat, 22 Jan 2022 00:03:01 +0000 Jeffrey Ji wrote:
> > From: jeffreyji <[email protected]>
> >
> > Increment InMacErrors counter when packet dropped due to incorrect dest
> > MAC addr.
> >
> > example output from nstat:
> > \~# nstat -z "*InMac*"
> > \#kernel
> > Ip6InMacErrors 0 0.0
> > IpExtInMacErrors 1 0.0
> >
> > Tested: Created 2 netns, sent 1 packet using trafgen from 1 to the other
> > with "{eth(daddr=$INCORRECT_MAC...}", verified that nstat showed the
> > counter was incremented.
> >
> > Signed-off-by: jeffreyji <[email protected]>
>
> How about we use the new kfree_skb_reason() instead to avoid allocating
> per-netns memory the stats?

I'm not too familiar with the new kfree_skb_reason , but my
understanding is that it needs either the drop_monitor or ebpf to get
the reason from the tracepoint, right? This is not too different from
using perf tool to find where the pkt is being dropped.

The idea here was to have a high level metric that is easier to find
for users that have less expertise on using more advance tools.

2022-01-24 19:42:37

by Jakub Kicinski

[permalink] [raw]
Subject: Re: [PATCH net-next] net-core: add InMacErrors counter

On Mon, 24 Jan 2022 09:13:12 -0800 Brian Vazquez wrote:
> On Fri, Jan 21, 2022 at 7:41 PM Jakub Kicinski <[email protected]> wrote:
> > On Sat, 22 Jan 2022 00:03:01 +0000 Jeffrey Ji wrote:
> > > From: jeffreyji <[email protected]>
> > >
> > > Increment InMacErrors counter when packet dropped due to incorrect dest
> > > MAC addr.
> > >
> > > example output from nstat:
> > > \~# nstat -z "*InMac*"
> > > \#kernel
> > > Ip6InMacErrors 0 0.0
> > > IpExtInMacErrors 1 0.0
> > >
> > > Tested: Created 2 netns, sent 1 packet using trafgen from 1 to the other
> > > with "{eth(daddr=$INCORRECT_MAC...}", verified that nstat showed the
> > > counter was incremented.
> > >
> > > Signed-off-by: jeffreyji <[email protected]>
> >
> > How about we use the new kfree_skb_reason() instead to avoid allocating
> > per-netns memory the stats?
>
> I'm not too familiar with the new kfree_skb_reason , but my
> understanding is that it needs either the drop_monitor or ebpf to get
> the reason from the tracepoint, right? This is not too different from
> using perf tool to find where the pkt is being dropped.
>
> The idea here was to have a high level metric that is easier to find
> for users that have less expertise on using more advance tools.

That much it's understood, but it's a trade off. This drop point
existed for 20 years, why do we need to consume extra memory now?

2022-01-24 19:44:21

by Eric Dumazet

[permalink] [raw]
Subject: Re: [PATCH net-next] net-core: add InMacErrors counter

On Mon, Jan 24, 2022 at 9:29 AM Jakub Kicinski <[email protected]> wrote:

> That much it's understood, but it's a trade off. This drop point
> existed for 20 years, why do we need to consume extra memory now?

Being able to tell after the facts, that such a drop reason had
occured can help investigations.

nstat -a | grep InMac

Jeffrey, what about _also_ adding the kfree_skb_reason(), since this
seems to also help ?

2022-01-24 19:44:48

by Jakub Kicinski

[permalink] [raw]
Subject: Re: [PATCH net-next] net-core: add InMacErrors counter

On Mon, 24 Jan 2022 09:39:22 -0800 Eric Dumazet wrote:
> On Mon, Jan 24, 2022 at 9:29 AM Jakub Kicinski <[email protected]> wrote:
>
> > That much it's understood, but it's a trade off. This drop point
> > existed for 20 years, why do we need to consume extra memory now?
>
> Being able to tell after the facts, that such a drop reason had
> occured can help investigations.
>
> nstat -a | grep InMac
>
> Jeffrey, what about _also_ adding the kfree_skb_reason(), since this
> seems to also help ?

Yes, and please add a proper justification to the commit message,
with real life examples of cases where this drop point has proven
to be the culprit. Right now the commit message says gives the example
of trafgen :(

2022-01-24 22:33:07

by Cong Wang

[permalink] [raw]
Subject: Re: [PATCH net-next] net-core: add InMacErrors counter

On Mon, Jan 24, 2022 at 09:29:24AM -0800, Jakub Kicinski wrote:
> On Mon, 24 Jan 2022 09:13:12 -0800 Brian Vazquez wrote:
> > On Fri, Jan 21, 2022 at 7:41 PM Jakub Kicinski <[email protected]> wrote:
> > > On Sat, 22 Jan 2022 00:03:01 +0000 Jeffrey Ji wrote:
> > > > From: jeffreyji <[email protected]>
> > > >
> > > > Increment InMacErrors counter when packet dropped due to incorrect dest
> > > > MAC addr.
> > > >
> > > > example output from nstat:
> > > > \~# nstat -z "*InMac*"
> > > > \#kernel
> > > > Ip6InMacErrors 0 0.0
> > > > IpExtInMacErrors 1 0.0
> > > >
> > > > Tested: Created 2 netns, sent 1 packet using trafgen from 1 to the other
> > > > with "{eth(daddr=$INCORRECT_MAC...}", verified that nstat showed the
> > > > counter was incremented.
> > > >
> > > > Signed-off-by: jeffreyji <[email protected]>
> > >
> > > How about we use the new kfree_skb_reason() instead to avoid allocating
> > > per-netns memory the stats?
> >
> > I'm not too familiar with the new kfree_skb_reason , but my
> > understanding is that it needs either the drop_monitor or ebpf to get
> > the reason from the tracepoint, right? This is not too different from
> > using perf tool to find where the pkt is being dropped.
> >
> > The idea here was to have a high level metric that is easier to find
> > for users that have less expertise on using more advance tools.
>
> That much it's understood, but it's a trade off. This drop point
> existed for 20 years, why do we need to consume extra memory now?

kfree_skb_reason() is for tracing and tracing has overhead in
production, which is higher than just a percpu counter.

What memory overhead are you talking about? We have ~37 IP related
SNMP counters, this patch merely adds 1/37 memory overhead. So, what's the
point? :-/

Thanks.

2022-01-24 23:28:56

by kernel test robot

[permalink] [raw]
Subject: Re: [PATCH net-next] net-core: add InMacErrors counter

Hi Jeffrey,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on net-next/master]

url: https://github.com/0day-ci/linux/commits/Jeffrey-Ji/net-core-add-InMacErrors-counter/20220122-080455
base: https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git 8aaaf2f3af2ae212428f4db1af34214225f5cec3
config: x86_64-randconfig-a001-20220117 (https://download.01.org/0day-ci/archive/20220125/[email protected]/config)
compiler: clang version 14.0.0 (https://github.com/llvm/llvm-project 7b3d30728816403d1fd73cc5082e9fb761262bce)
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/0day-ci/linux/commit/f8ea346d278c116f830459bae2a910fdc5e96a35
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Jeffrey-Ji/net-core-add-InMacErrors-counter/20220122-080455
git checkout f8ea346d278c116f830459bae2a910fdc5e96a35
# save the config file to linux build tree
mkdir build_dir
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=x86_64 SHELL=/bin/bash net/ipv6/

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <[email protected]>

All warnings (new ones prefixed by >>):

>> net/ipv6/ip6_input.c:153:24: warning: variable 'idev' is uninitialized when used here [-Wuninitialized]
__IP6_INC_STATS(net, idev, IPSTATS_MIB_INMACERRORS);
^~~~
include/net/ipv6.h:256:26: note: expanded from macro '__IP6_INC_STATS'
_DEVINC(net, ipv6, __, idev, field)
^~~~
include/net/ipv6.h:211:29: note: expanded from macro '_DEVINC'
struct inet6_dev *_idev = (idev); \
^~~~
net/ipv6/ip6_input.c:150:24: note: initialize the variable 'idev' to silence this warning
struct inet6_dev *idev;
^
= NULL
1 warning generated.


vim +/idev +153 net/ipv6/ip6_input.c

144
145 static struct sk_buff *ip6_rcv_core(struct sk_buff *skb, struct net_device *dev,
146 struct net *net)
147 {
148 const struct ipv6hdr *hdr;
149 u32 pkt_len;
150 struct inet6_dev *idev;
151
152 if (skb->pkt_type == PACKET_OTHERHOST) {
> 153 __IP6_INC_STATS(net, idev, IPSTATS_MIB_INMACERRORS);
154 kfree_skb(skb);
155 return NULL;
156 }
157
158 rcu_read_lock();
159
160 idev = __in6_dev_get(skb->dev);
161
162 __IP6_UPD_PO_STATS(net, idev, IPSTATS_MIB_IN, skb->len);
163
164 if ((skb = skb_share_check(skb, GFP_ATOMIC)) == NULL ||
165 !idev || unlikely(idev->cnf.disable_ipv6)) {
166 __IP6_INC_STATS(net, idev, IPSTATS_MIB_INDISCARDS);
167 goto drop;
168 }
169
170 memset(IP6CB(skb), 0, sizeof(struct inet6_skb_parm));
171
172 /*
173 * Store incoming device index. When the packet will
174 * be queued, we cannot refer to skb->dev anymore.
175 *
176 * BTW, when we send a packet for our own local address on a
177 * non-loopback interface (e.g. ethX), it is being delivered
178 * via the loopback interface (lo) here; skb->dev = loopback_dev.
179 * It, however, should be considered as if it is being
180 * arrived via the sending interface (ethX), because of the
181 * nature of scoping architecture. --yoshfuji
182 */
183 IP6CB(skb)->iif = skb_valid_dst(skb) ? ip6_dst_idev(skb_dst(skb))->dev->ifindex : dev->ifindex;
184
185 if (unlikely(!pskb_may_pull(skb, sizeof(*hdr))))
186 goto err;
187
188 hdr = ipv6_hdr(skb);
189
190 if (hdr->version != 6)
191 goto err;
192
193 __IP6_ADD_STATS(net, idev,
194 IPSTATS_MIB_NOECTPKTS +
195 (ipv6_get_dsfield(hdr) & INET_ECN_MASK),
196 max_t(unsigned short, 1, skb_shinfo(skb)->gso_segs));
197 /*
198 * RFC4291 2.5.3
199 * The loopback address must not be used as the source address in IPv6
200 * packets that are sent outside of a single node. [..]
201 * A packet received on an interface with a destination address
202 * of loopback must be dropped.
203 */
204 if ((ipv6_addr_loopback(&hdr->saddr) ||
205 ipv6_addr_loopback(&hdr->daddr)) &&
206 !(dev->flags & IFF_LOOPBACK) &&
207 !netif_is_l3_master(dev))
208 goto err;
209
210 /* RFC4291 Errata ID: 3480
211 * Interface-Local scope spans only a single interface on a
212 * node and is useful only for loopback transmission of
213 * multicast. Packets with interface-local scope received
214 * from another node must be discarded.
215 */
216 if (!(skb->pkt_type == PACKET_LOOPBACK ||
217 dev->flags & IFF_LOOPBACK) &&
218 ipv6_addr_is_multicast(&hdr->daddr) &&
219 IPV6_ADDR_MC_SCOPE(&hdr->daddr) == 1)
220 goto err;
221
222 /* If enabled, drop unicast packets that were encapsulated in link-layer
223 * multicast or broadcast to protected against the so-called "hole-196"
224 * attack in 802.11 wireless.
225 */
226 if (!ipv6_addr_is_multicast(&hdr->daddr) &&
227 (skb->pkt_type == PACKET_BROADCAST ||
228 skb->pkt_type == PACKET_MULTICAST) &&
229 idev->cnf.drop_unicast_in_l2_multicast)
230 goto err;
231
232 /* RFC4291 2.7
233 * Nodes must not originate a packet to a multicast address whose scope
234 * field contains the reserved value 0; if such a packet is received, it
235 * must be silently dropped.
236 */
237 if (ipv6_addr_is_multicast(&hdr->daddr) &&
238 IPV6_ADDR_MC_SCOPE(&hdr->daddr) == 0)
239 goto err;
240
241 /*
242 * RFC4291 2.7
243 * Multicast addresses must not be used as source addresses in IPv6
244 * packets or appear in any Routing header.
245 */
246 if (ipv6_addr_is_multicast(&hdr->saddr))
247 goto err;
248
249 skb->transport_header = skb->network_header + sizeof(*hdr);
250 IP6CB(skb)->nhoff = offsetof(struct ipv6hdr, nexthdr);
251
252 pkt_len = ntohs(hdr->payload_len);
253
254 /* pkt_len may be zero if Jumbo payload option is present */
255 if (pkt_len || hdr->nexthdr != NEXTHDR_HOP) {
256 if (pkt_len + sizeof(struct ipv6hdr) > skb->len) {
257 __IP6_INC_STATS(net,
258 idev, IPSTATS_MIB_INTRUNCATEDPKTS);
259 goto drop;
260 }
261 if (pskb_trim_rcsum(skb, pkt_len + sizeof(struct ipv6hdr))) {
262 __IP6_INC_STATS(net, idev, IPSTATS_MIB_INHDRERRORS);
263 goto drop;
264 }
265 hdr = ipv6_hdr(skb);
266 }
267
268 if (hdr->nexthdr == NEXTHDR_HOP) {
269 if (ipv6_parse_hopopts(skb) < 0) {
270 __IP6_INC_STATS(net, idev, IPSTATS_MIB_INHDRERRORS);
271 rcu_read_unlock();
272 return NULL;
273 }
274 }
275
276 rcu_read_unlock();
277
278 /* Must drop socket now because of tproxy. */
279 if (!skb_sk_is_prefetched(skb))
280 skb_orphan(skb);
281
282 return skb;
283 err:
284 __IP6_INC_STATS(net, idev, IPSTATS_MIB_INHDRERRORS);
285 drop:
286 rcu_read_unlock();
287 kfree_skb(skb);
288 return NULL;
289 }
290

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/[email protected]

2022-01-24 23:29:16

by kernel test robot

[permalink] [raw]
Subject: Re: [PATCH net-next] net-core: add InMacErrors counter

Hi Jeffrey,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on net-next/master]

url: https://github.com/0day-ci/linux/commits/Jeffrey-Ji/net-core-add-InMacErrors-counter/20220122-080455
base: https://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git 8aaaf2f3af2ae212428f4db1af34214225f5cec3
config: mips-maltaaprp_defconfig (https://download.01.org/0day-ci/archive/20220125/[email protected]/config)
compiler: clang version 14.0.0 (https://github.com/llvm/llvm-project 7b3d30728816403d1fd73cc5082e9fb761262bce)
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# install mips cross compiling tool for clang build
# apt-get install binutils-mips-linux-gnu
# https://github.com/0day-ci/linux/commit/f8ea346d278c116f830459bae2a910fdc5e96a35
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Jeffrey-Ji/net-core-add-InMacErrors-counter/20220122-080455
git checkout f8ea346d278c116f830459bae2a910fdc5e96a35
# save the config file to linux build tree
mkdir build_dir
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=mips SHELL=/bin/bash net/ipv6/

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <[email protected]>

All warnings (new ones prefixed by >>):

>> net/ipv6/ip6_input.c:153:24: warning: variable 'idev' is uninitialized when used here [-Wuninitialized]
__IP6_INC_STATS(net, idev, IPSTATS_MIB_INMACERRORS);
^~~~
include/net/ipv6.h:256:26: note: expanded from macro '__IP6_INC_STATS'
_DEVINC(net, ipv6, __, idev, field)
^~~~
include/net/ipv6.h:211:29: note: expanded from macro '_DEVINC'
struct inet6_dev *_idev = (idev); \
^~~~
net/ipv6/ip6_input.c:150:24: note: initialize the variable 'idev' to silence this warning
struct inet6_dev *idev;
^
= NULL
1 warning generated.


vim +/idev +153 net/ipv6/ip6_input.c

144
145 static struct sk_buff *ip6_rcv_core(struct sk_buff *skb, struct net_device *dev,
146 struct net *net)
147 {
148 const struct ipv6hdr *hdr;
149 u32 pkt_len;
150 struct inet6_dev *idev;
151
152 if (skb->pkt_type == PACKET_OTHERHOST) {
> 153 __IP6_INC_STATS(net, idev, IPSTATS_MIB_INMACERRORS);
154 kfree_skb(skb);
155 return NULL;
156 }
157
158 rcu_read_lock();
159
160 idev = __in6_dev_get(skb->dev);
161
162 __IP6_UPD_PO_STATS(net, idev, IPSTATS_MIB_IN, skb->len);
163
164 if ((skb = skb_share_check(skb, GFP_ATOMIC)) == NULL ||
165 !idev || unlikely(idev->cnf.disable_ipv6)) {
166 __IP6_INC_STATS(net, idev, IPSTATS_MIB_INDISCARDS);
167 goto drop;
168 }
169
170 memset(IP6CB(skb), 0, sizeof(struct inet6_skb_parm));
171
172 /*
173 * Store incoming device index. When the packet will
174 * be queued, we cannot refer to skb->dev anymore.
175 *
176 * BTW, when we send a packet for our own local address on a
177 * non-loopback interface (e.g. ethX), it is being delivered
178 * via the loopback interface (lo) here; skb->dev = loopback_dev.
179 * It, however, should be considered as if it is being
180 * arrived via the sending interface (ethX), because of the
181 * nature of scoping architecture. --yoshfuji
182 */
183 IP6CB(skb)->iif = skb_valid_dst(skb) ? ip6_dst_idev(skb_dst(skb))->dev->ifindex : dev->ifindex;
184
185 if (unlikely(!pskb_may_pull(skb, sizeof(*hdr))))
186 goto err;
187
188 hdr = ipv6_hdr(skb);
189
190 if (hdr->version != 6)
191 goto err;
192
193 __IP6_ADD_STATS(net, idev,
194 IPSTATS_MIB_NOECTPKTS +
195 (ipv6_get_dsfield(hdr) & INET_ECN_MASK),
196 max_t(unsigned short, 1, skb_shinfo(skb)->gso_segs));
197 /*
198 * RFC4291 2.5.3
199 * The loopback address must not be used as the source address in IPv6
200 * packets that are sent outside of a single node. [..]
201 * A packet received on an interface with a destination address
202 * of loopback must be dropped.
203 */
204 if ((ipv6_addr_loopback(&hdr->saddr) ||
205 ipv6_addr_loopback(&hdr->daddr)) &&
206 !(dev->flags & IFF_LOOPBACK) &&
207 !netif_is_l3_master(dev))
208 goto err;
209
210 /* RFC4291 Errata ID: 3480
211 * Interface-Local scope spans only a single interface on a
212 * node and is useful only for loopback transmission of
213 * multicast. Packets with interface-local scope received
214 * from another node must be discarded.
215 */
216 if (!(skb->pkt_type == PACKET_LOOPBACK ||
217 dev->flags & IFF_LOOPBACK) &&
218 ipv6_addr_is_multicast(&hdr->daddr) &&
219 IPV6_ADDR_MC_SCOPE(&hdr->daddr) == 1)
220 goto err;
221
222 /* If enabled, drop unicast packets that were encapsulated in link-layer
223 * multicast or broadcast to protected against the so-called "hole-196"
224 * attack in 802.11 wireless.
225 */
226 if (!ipv6_addr_is_multicast(&hdr->daddr) &&
227 (skb->pkt_type == PACKET_BROADCAST ||
228 skb->pkt_type == PACKET_MULTICAST) &&
229 idev->cnf.drop_unicast_in_l2_multicast)
230 goto err;
231
232 /* RFC4291 2.7
233 * Nodes must not originate a packet to a multicast address whose scope
234 * field contains the reserved value 0; if such a packet is received, it
235 * must be silently dropped.
236 */
237 if (ipv6_addr_is_multicast(&hdr->daddr) &&
238 IPV6_ADDR_MC_SCOPE(&hdr->daddr) == 0)
239 goto err;
240
241 /*
242 * RFC4291 2.7
243 * Multicast addresses must not be used as source addresses in IPv6
244 * packets or appear in any Routing header.
245 */
246 if (ipv6_addr_is_multicast(&hdr->saddr))
247 goto err;
248
249 skb->transport_header = skb->network_header + sizeof(*hdr);
250 IP6CB(skb)->nhoff = offsetof(struct ipv6hdr, nexthdr);
251
252 pkt_len = ntohs(hdr->payload_len);
253
254 /* pkt_len may be zero if Jumbo payload option is present */
255 if (pkt_len || hdr->nexthdr != NEXTHDR_HOP) {
256 if (pkt_len + sizeof(struct ipv6hdr) > skb->len) {
257 __IP6_INC_STATS(net,
258 idev, IPSTATS_MIB_INTRUNCATEDPKTS);
259 goto drop;
260 }
261 if (pskb_trim_rcsum(skb, pkt_len + sizeof(struct ipv6hdr))) {
262 __IP6_INC_STATS(net, idev, IPSTATS_MIB_INHDRERRORS);
263 goto drop;
264 }
265 hdr = ipv6_hdr(skb);
266 }
267
268 if (hdr->nexthdr == NEXTHDR_HOP) {
269 if (ipv6_parse_hopopts(skb) < 0) {
270 __IP6_INC_STATS(net, idev, IPSTATS_MIB_INHDRERRORS);
271 rcu_read_unlock();
272 return NULL;
273 }
274 }
275
276 rcu_read_unlock();
277
278 /* Must drop socket now because of tproxy. */
279 if (!skb_sk_is_prefetched(skb))
280 skb_orphan(skb);
281
282 return skb;
283 err:
284 __IP6_INC_STATS(net, idev, IPSTATS_MIB_INHDRERRORS);
285 drop:
286 rcu_read_unlock();
287 kfree_skb(skb);
288 return NULL;
289 }
290

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/[email protected]

2022-01-25 00:33:41

by Eric Dumazet

[permalink] [raw]
Subject: Re: [PATCH net-next] net-core: add InMacErrors counter

On Mon, Jan 24, 2022 at 12:20 PM Cong Wang <[email protected]> wrote:
>
> On Mon, Jan 24, 2022 at 09:29:24AM -0800, Jakub Kicinski wrote:
> > On Mon, 24 Jan 2022 09:13:12 -0800 Brian Vazquez wrote:
> > > On Fri, Jan 21, 2022 at 7:41 PM Jakub Kicinski <[email protected]> wrote:
> > > > On Sat, 22 Jan 2022 00:03:01 +0000 Jeffrey Ji wrote:
> > > > > From: jeffreyji <[email protected]>
> > > > >
> > > > > Increment InMacErrors counter when packet dropped due to incorrect dest
> > > > > MAC addr.
> > > > >
> > > > > example output from nstat:
> > > > > \~# nstat -z "*InMac*"
> > > > > \#kernel
> > > > > Ip6InMacErrors 0 0.0
> > > > > IpExtInMacErrors 1 0.0
> > > > >
> > > > > Tested: Created 2 netns, sent 1 packet using trafgen from 1 to the other
> > > > > with "{eth(daddr=$INCORRECT_MAC...}", verified that nstat showed the
> > > > > counter was incremented.
> > > > >
> > > > > Signed-off-by: jeffreyji <[email protected]>
> > > >
> > > > How about we use the new kfree_skb_reason() instead to avoid allocating
> > > > per-netns memory the stats?
> > >
> > > I'm not too familiar with the new kfree_skb_reason , but my
> > > understanding is that it needs either the drop_monitor or ebpf to get
> > > the reason from the tracepoint, right? This is not too different from
> > > using perf tool to find where the pkt is being dropped.
> > >
> > > The idea here was to have a high level metric that is easier to find
> > > for users that have less expertise on using more advance tools.
> >
> > That much it's understood, but it's a trade off. This drop point
> > existed for 20 years, why do we need to consume extra memory now?
>
> kfree_skb_reason() is for tracing and tracing has overhead in
> production, which is higher than just a percpu counter.
>
> What memory overhead are you talking about? We have ~37 IP related
> SNMP counters, this patch merely adds 1/37 memory overhead. So, what's the
> point? :-/
>

BTW I just saw that kfree_skb_reason() changes have been proposed
already by Menglong Dong this morning.