Currently, netpoll only supports interfaces with an ethernet-compatible
link layer. Certain interfaces like SLIP do not have a link layer
on the network interface level at all and expect raw IP packets,
and could benefit from being supported by netpoll.
This commit adds support for such interfaces by using the network device's
`hard_header_len` field as an indication that no link layer is present.
If that is the case we simply skip adding the ethernet header, causing
a raw IP packet to be sent over the interface. This has been confirmed
to add netconsole support to at least SLIP and WireGuard interfaces.
Signed-off-by: Mark Cilissen <[email protected]>
---
Documentation/networking/netconsole.rst | 3 ++-
net/core/netpoll.c | 30 ++++++++++++++++---------
2 files changed, 22 insertions(+), 11 deletions(-)
diff --git a/Documentation/networking/netconsole.rst b/Documentation/networking/netconsole.rst
index d55c2a22ec7a..434ce0366027 100644
--- a/Documentation/networking/netconsole.rst
+++ b/Documentation/networking/netconsole.rst
@@ -327,4 +327,5 @@ enable the logging of even the most critical kernel bugs. It works
from IRQ contexts as well, and does not enable interrupts while
sending packets. Due to these unique needs, configuration cannot
be more automatic, and some fundamental limitations will remain:
-only IP networks, UDP packets and ethernet devices are supported.
+only UDP packets over IP networks, over ethernet links if a
+hardware layer is required, are supported.
diff --git a/net/core/netpoll.c b/net/core/netpoll.c
index 543007f159f9..0299fb71b456 100644
--- a/net/core/netpoll.c
+++ b/net/core/netpoll.c
@@ -399,7 +399,7 @@ EXPORT_SYMBOL(netpoll_send_skb);
void netpoll_send_udp(struct netpoll *np, const char *msg, int len)
{
- int total_len, ip_len, udp_len;
+ int total_len, ip_len, udp_len, link_len;
struct sk_buff *skb;
struct udphdr *udph;
struct iphdr *iph;
@@ -416,7 +416,10 @@ void netpoll_send_udp(struct netpoll *np, const char *msg, int len)
else
ip_len = udp_len + sizeof(*iph);
- total_len = ip_len + LL_RESERVED_SPACE(np->dev);
+ /* if there's a hardware header assume ethernet, else raw IP */
+ eth = NULL;
+ link_len = np->dev->hard_header_len ? LL_RESERVED_SPACE(np->dev) : 0;
+ total_len = ip_len + link_len;
skb = find_skb(np, total_len + np->dev->needed_tailroom,
total_len - len);
@@ -458,9 +461,11 @@ void netpoll_send_udp(struct netpoll *np, const char *msg, int len)
ip6h->saddr = np->local_ip.in6;
ip6h->daddr = np->remote_ip.in6;
- eth = skb_push(skb, ETH_HLEN);
- skb_reset_mac_header(skb);
- skb->protocol = eth->h_proto = htons(ETH_P_IPV6);
+ skb->protocol = htons(ETH_P_IPV6);
+ if (link_len) {
+ eth = skb_push(skb, ETH_HLEN);
+ skb_reset_mac_header(skb);
+ }
} else {
udph->check = 0;
udph->check = csum_tcpudp_magic(np->local_ip.ip,
@@ -487,13 +492,18 @@ void netpoll_send_udp(struct netpoll *np, const char *msg, int len)
put_unaligned(np->remote_ip.ip, &(iph->daddr));
iph->check = ip_fast_csum((unsigned char *)iph, iph->ihl);
- eth = skb_push(skb, ETH_HLEN);
- skb_reset_mac_header(skb);
- skb->protocol = eth->h_proto = htons(ETH_P_IP);
+ skb->protocol = htons(ETH_P_IP);
+ if (link_len) {
+ eth = skb_push(skb, ETH_HLEN);
+ skb_reset_mac_header(skb);
+ }
}
- ether_addr_copy(eth->h_source, np->dev->dev_addr);
- ether_addr_copy(eth->h_dest, np->remote_mac);
+ if (eth) {
+ eth->h_proto = skb->protocol;
+ ether_addr_copy(eth->h_source, np->dev->dev_addr);
+ ether_addr_copy(eth->h_dest, np->remote_mac);
+ }
skb->dev = np->dev;
--
2.43.0
On 2024-03-13 at 18:16:13, Mark Cilissen ([email protected]) wrote:
> Currently, netpoll only supports interfaces with an ethernet-compatible
> link layer. Certain interfaces like SLIP do not have a link layer
> on the network interface level at all and expect raw IP packets,
> and could benefit from being supported by netpoll.
>
> This commit adds support for such interfaces by using the network device's
> `hard_header_len` field as an indication that no link layer is present.
> If that is the case we simply skip adding the ethernet header, causing
> a raw IP packet to be sent over the interface. This has been confirmed
> to add netconsole support to at least SLIP and WireGuard interfaces.
>
> Signed-off-by: Mark Cilissen <[email protected]>
> ---
> Documentation/networking/netconsole.rst | 3 ++-
> net/core/netpoll.c | 30 ++++++++++++++++---------
> 2 files changed, 22 insertions(+), 11 deletions(-)
>
> diff --git a/Documentation/networking/netconsole.rst b/Documentation/networking/netconsole.rst
> index d55c2a22ec7a..434ce0366027 100644
> --- a/Documentation/networking/netconsole.rst
> +++ b/Documentation/networking/netconsole.rst
> @@ -327,4 +327,5 @@ enable the logging of even the most critical kernel bugs. It works
> from IRQ contexts as well, and does not enable interrupts while
> sending packets. Due to these unique needs, configuration cannot
> be more automatic, and some fundamental limitations will remain:
> -only IP networks, UDP packets and ethernet devices are supported.
> +only UDP packets over IP networks, over ethernet links if a
> +hardware layer is required, are supported.
> diff --git a/net/core/netpoll.c b/net/core/netpoll.c
> index 543007f159f9..0299fb71b456 100644
> --- a/net/core/netpoll.c
> +++ b/net/core/netpoll.c
> @@ -399,7 +399,7 @@ EXPORT_SYMBOL(netpoll_send_skb);
>
> void netpoll_send_udp(struct netpoll *np, const char *msg, int len)
> {
> - int total_len, ip_len, udp_len;
> + int total_len, ip_len, udp_len, link_len;
> struct sk_buff *skb;
> struct udphdr *udph;
> struct iphdr *iph;
> @@ -416,7 +416,10 @@ void netpoll_send_udp(struct netpoll *np, const char *msg, int len)
> else
> ip_len = udp_len + sizeof(*iph);
>
> - total_len = ip_len + LL_RESERVED_SPACE(np->dev);
> + /* if there's a hardware header assume ethernet, else raw IP */
Taking an assumption based on dev's lower layer does not look to be good.
why not transmit packet from skb_network_header() in your driver (by making
changes in your driver)
> + eth = NULL;
> + link_len = np->dev->hard_header_len ? LL_RESERVED_SPACE(np->dev) : 0;
> + total_len = ip_len + link_len;
>
> skb = find_skb(np, total_len + np->dev->needed_tailroom,
> total_len - len);
> @@ -458,9 +461,11 @@ void netpoll_send_udp(struct netpoll *np, const char *msg, int len)
> ip6h->saddr = np->local_ip.in6;
> ip6h->daddr = np->remote_ip.in6;
>
> - eth = skb_push(skb, ETH_HLEN);
> - skb_reset_mac_header(skb);
> - skb->protocol = eth->h_proto = htons(ETH_P_IPV6);
> + skb->protocol = htons(ETH_P_IPV6);
> + if (link_len) {
> + eth = skb_push(skb, ETH_HLEN);
> + skb_reset_mac_header(skb);
> + }
> } else {
> udph->check = 0;
> udph->check = csum_tcpudp_magic(np->local_ip.ip,
> @@ -487,13 +492,18 @@ void netpoll_send_udp(struct netpoll *np, const char *msg, int len)
> put_unaligned(np->remote_ip.ip, &(iph->daddr));
> iph->check = ip_fast_csum((unsigned char *)iph, iph->ihl);
>
> - eth = skb_push(skb, ETH_HLEN);
> - skb_reset_mac_header(skb);
> - skb->protocol = eth->h_proto = htons(ETH_P_IP);
> + skb->protocol = htons(ETH_P_IP);
> + if (link_len) {
> + eth = skb_push(skb, ETH_HLEN);
> + skb_reset_mac_header(skb);
> + }
> }
>
> - ether_addr_copy(eth->h_source, np->dev->dev_addr);
> - ether_addr_copy(eth->h_dest, np->remote_mac);
> + if (eth) {
> + eth->h_proto = skb->protocol;
> + ether_addr_copy(eth->h_source, np->dev->dev_addr);
> + ether_addr_copy(eth->h_dest, np->remote_mac);
> + }
>
> skb->dev = np->dev;
>
> --
> 2.43.0
>
Hi Ratheesh,
> Op 13 mrt 6 Reiwa, om 14:36 heeft Ratheesh Kannoth <[email protected]> het volgende geschreven:
>
> On 2024-03-13 at 18:16:13, Mark Cilissen ([email protected]) wrote:
>> […]
> Taking an assumption based on dev’s lower layer does not look to be good.
> why not transmit packet from skb_network_header() in your driver (by making
> changes in your driver)
There’s two assumptions at play here:
- The lower layer is ethernet: this has always been present in netpoll, and is even
documented in netconsole.rst. This comment just mentions it because we add a way
to bypass the assumption; it is not an assumption this patch adds to the code.
- hard_header_len==0 means that there is no exposed link layer: this is a rather
conservative assumption in my opinion, and is also mentioned in the definition
of LL_RESERVED_SPACE:
> * Alternative is:
> * dev->hard_header_len ? (dev->hard_header_len +
> * (HH_DATA_MOD - 1)) & ~(HH_DATA_MOD - 1) : 0
The same assumption is also made in more places in the core network code, like af_packet:
> - If the device has no dev->header_ops->create, there is no LL header
> visible above the device. In this case, its hard_header_len should be 0.
> The device may prepend its own header internally. In this case, its
> needed_headroom should be set to the space needed for it to add its
> internal header.
I could change it to, like af_packet, check `dev->header_ops` instead if that is preferred,
but I don’t think that patching every single raw IP driver to deal with skbs that managed
to somehow have link layer data would be preferred here, especially since netpoll is kind
of a special case to begin with. I am open to suggestions and ideas, though.
> […]
Thanks and regards,
Mark
> From: Mark <[email protected]>
> Sent: Wednesday, March 13, 2024 7:23 PM
> To: Ratheesh Kannoth <[email protected]>
> > On 2024-03-13 at 18:16:13, Mark Cilissen ([email protected]) wrote:
> >> […]
> > Taking an assumption based on dev’s lower layer does not look to be good.
> > why not transmit packet from skb_network_header() in your driver (by
> > making changes in your driver)
>
> There’s two assumptions at play here:
> - The lower layer is ethernet: this has always been present in netpoll, and is
> even
> documented in netconsole.rst. This comment just mentions it because we
> add a way
> to bypass the assumption; it is not an assumption this patch adds to the
> code.
> - hard_header_len==0 means that there is no exposed link layer: this is a rather
> conservative assumption in my opinion, and is also mentioned in the
> definition
Hmm. That is not my question. Let me explain it in detail. Netconsole is using netpoll_send_udp() to encapsulate the msg over
UDP/IP/ MAC headers. Job well done. Now it calls netdev->ops->ndo_start_xmit(skb, dev). If your driver is well aware that you can
Transmit only from network header, why don’t you dma map from network header ?
> of LL_RESERVED_SPACE:
>
> > * Alternative is:
> > * dev->hard_header_len ? (dev->hard_header_len +
> > * (HH_DATA_MOD - 1)) & ~(HH_DATA_MOD - 1) : 0
>
> The same assumption is also made in more places in the core network code,
> like af_packet:
>
> > - If the device has no dev->header_ops->create, there is no LL header
> > visible above the device. In this case, its hard_header_len should be 0.
> > The device may prepend its own header internally. In this case, its
> > needed_headroom should be set to the space needed for it to add its
> > internal header.
>
ACK.
On Wed, 2024-03-13 at 13:46 +0100, Mark Cilissen wrote:
> Currently, netpoll only supports interfaces with an ethernet-compatible
> link layer. Certain interfaces like SLIP do not have a link layer
> on the network interface level at all and expect raw IP packets,
> and could benefit from being supported by netpoll.
>
> This commit adds support for such interfaces by using the network device's
> `hard_header_len` field as an indication that no link layer is present.
> If that is the case we simply skip adding the ethernet header, causing
> a raw IP packet to be sent over the interface. This has been confirmed
> to add netconsole support to at least SLIP and WireGuard interfaces.
>
> Signed-off-by: Mark Cilissen <[email protected]>
## Form letter - net-next-closed
The merge window for v6.9 has begun and we have already posted our pull
request. Therefore net-next is closed for new drivers, features, code
refactoring and optimizations. We are currently accepting bug fixes
only.
Please repost when net-next reopens after March 25th.
RFC patches sent for review only are obviously welcome at any time.
See:
https://www.kernel.org/doc/html/next/process/maintainer-netdev.html#development-cycle
--
pw-bot: defer
On Wed, 13 Mar 2024 13:46:13 +0100 Mark Cilissen wrote:
> Currently, netpoll only supports interfaces with an ethernet-compatible
> link layer. Certain interfaces like SLIP do not have a link layer
> on the network interface level at all and expect raw IP packets,
> and could benefit from being supported by netpoll.
>
> This commit adds support for such interfaces by using the network device's
> `hard_header_len` field as an indication that no link layer is present.
> If that is the case we simply skip adding the ethernet header, causing
> a raw IP packet to be sent over the interface. This has been confirmed
> to add netconsole support to at least SLIP and WireGuard interfaces.
Would be great if this could come with a simple selftest under
tools/testing/selftests/net. Preferably using some simple tunnel
device, rather than wg to limit tooling dependencies ;)
Hi Ratheesh,
> Op 14 mrt 6 Reiwa, om 03:46 heeft Ratheesh Kannoth <[email protected]> het volgende geschreven:
>
>> From: Mark <[email protected]>
>> […]
>
> Hmm. That is not my question. Let me explain it in detail. Netconsole is using netpoll_send_udp() to encapsulate the msg over
> UDP/IP/ MAC headers. Job well done. Now it calls netdev->ops->ndo_start_xmit(skb, dev). If your driver is well aware that you can
> Transmit only from network header, why don’t you dma map from network header ?
The rest of the network subsystem seems to not add a header to skbs submitted
to netdev->ops->ndo_start_xmit() at all, which makes sense considering
netdev->header_ops is either NULL or no-op for these devices.
Following this line of reasoning, from API perspective it made more sense
to me for netpoll to not submit ‘bogus’ skbs that are out-of-line with what
the rest of the network subsystem does to ndo_start_xmit() to begin with.
It really depends on the API guarantees we want to have for netdev,
but personally I'm wary of introducing an allowance for bogus headers.
Additionally from a practical perspective, this would require changing almost
every, if not every, IP interface driver. I took a look at the WireGuard
driver to see what it would entail, and from my limited experience with the
networking code it seems like there's some quite annoying interactions with
e.g. GSO which would make driver-side handling of such packets quite a bit
more complex.
So from my perspective, fixing this in netpoll is both the more API-correct
change and introduces the least amount of additional complexity.
> […]
Thanks and regards,
Mark
Hi Jakub,
> Op 14 mrt 6 Reiwa, om 19:34 heeft Jakub Kicinski <[email protected]> het volgende geschreven:
>
> On Wed, 13 Mar 2024 13:46:13 +0100 Mark Cilissen wrote:
>> Currently, netpoll only supports interfaces with an ethernet-compatible
>> link layer. Certain interfaces like SLIP do not have a link layer
>> on the network interface level at all and expect raw IP packets,
>> and could benefit from being supported by netpoll.
>>
>> This commit adds support for such interfaces by using the network device's
>> `hard_header_len` field as an indication that no link layer is present.
>> If that is the case we simply skip adding the ethernet header, causing
>> a raw IP packet to be sent over the interface. This has been confirmed
>> to add netconsole support to at least SLIP and WireGuard interfaces.
>
> Would be great if this could come with a simple selftest under
> tools/testing/selftests/net. Preferably using some simple tunnel
> device, rather than wg to limit tooling dependencies ;)
Yes, that makes a lot of sense. I wrote a selftest that tests netconsole
using IPIP and GRE tunnels, and they too seem to work correctly and pass (and
more importantly, fail without the patch ;-). Would you prefer me to submit
a v2 with that test now, or when net-next reopens in a week?
Thanks and regards,
Mark
> From: Mark <[email protected]>
> Sent: Monday, March 18, 2024 5:13 PM
> To: Ratheesh Kannoth <[email protected]>
> Cc: [email protected]; Hans de Goede <[email protected]>; Eric
> Dumazet <[email protected]>; Jakub Kicinski <[email protected]>; Breno
> Leitao <[email protected]>; Ingo Molnar <[email protected]>; David S.
> Miller <[email protected]>; Paolo Abeni <[email protected]>; linux-
> [email protected]
> Subject: Re: [EXTERNAL] [PATCH] netpoll: support sending over raw IP
> interfaces
>
>
> Hi Ratheesh,
>
> > Op 14 mrt 6 Reiwa, om 03:46 heeft Ratheesh Kannoth
> <[email protected]> het volgende geschreven:
> >
> >> From: Mark <[email protected]>
> >> […]
> >
> > Hmm. That is not my question. Let me explain it in detail. Netconsole is
> using netpoll_send_udp() to encapsulate the msg over
> > UDP/IP/ MAC headers. Job well done. Now it calls
> > netdev->ops->ndo_start_xmit(skb, dev). If your driver is well aware that
> you can Transmit only from network header, why don’t you dma map from
> network header ?
>
> The rest of the network subsystem seems to not add a header to skbs
> submitted to netdev->ops->ndo_start_xmit() at all, which makes sense
> considering
> netdev->header_ops is either NULL or no-op for these devices.
>
> Following this line of reasoning, from API perspective it made more sense to
> me for netpoll to not submit ‘bogus’ skbs that are out-of-line with what the
> rest of the network subsystem does to ndo_start_xmit() to begin with.
> It really depends on the API guarantees we want to have for netdev, but
> personally I'm wary of introducing an allowance for bogus headers.
>
Is below network topology possible ?
Netpoll()- ------> netdev A ----> raw interface
Where netdev A's netdev->header_ops != NULL
> Additionally from a practical perspective, this would require changing almost
> every, if not every, IP interface driver. I took a look at the WireGuard driver to
> see what it would entail, and from my limited experience with the networking
> code it seems like there's some quite annoying interactions with e.g. GSO
> which would make driver-side handling of such packets quite a bit more
> complex.
ACK.
>
> So from my perspective, fixing this in netpoll is both the more API-correct
> change and introduces the least amount of additional complexity.
ACK.
On Mon, 18 Mar 2024 12:47:46 +0100 Mark wrote:
> > Would be great if this could come with a simple selftest under
> > tools/testing/selftests/net. Preferably using some simple tunnel
> > device, rather than wg to limit tooling dependencies ;)
>
> Yes, that makes a lot of sense. I wrote a selftest that tests netconsole
> using IPIP and GRE tunnels, and they too seem to work correctly and pass (and
> more importantly, fail without the patch ;-). Would you prefer me to submit
> a v2 with that test now, or when net-next reopens in a week?
Feel free to post as [RFC net-next v2] now. I can't guarantee people
will find the time to review, but it shouldn't hurt to have it on
the list :)
Hi Ratheesh,
> Op 18 mrt 6 Reiwa, om 15:06 heeft Ratheesh Kannoth <[email protected]> het volgende geschreven:
>
>> […]
> Is below network topology possible ?
> Netpoll()- ------> netdev A ----> raw interface
> Where netdev A's netdev->header_ops != NULL
I believe so, this is not uncommon in tunnel devices like gretap.
However, those fully encapsulate the link layer header in the packet
to the lower interface. I am not aware of a interface driver that removes
a header upon xmit, so to speak. However, I have just posted a v2 that
instead uses the documented `dev_has_header()` API, which seems to fit the
check exactly, here:
https://lore.kernel.org/netdev/[email protected]/T/
I hope this change manages to mitigate your concerns. :-)
> […]
Thanks and regards,
Mark
> From: Mark <[email protected]>
> Sent: Thursday, March 21, 2024 6:03 PM
> To: Ratheesh Kannoth <[email protected]>
> Cc: [email protected]; Hans de Goede <[email protected]>; Eric
> Dumazet <[email protected]>; Jakub Kicinski <[email protected]>; Breno
> Leitao <[email protected]>; Ingo Molnar <[email protected]>; David S.
> Miller <[email protected]>; Paolo Abeni <[email protected]>; linux-
> [email protected]
> Subject: Re: [EXTERNAL] [PATCH] netpoll: support sending over raw IP
> interfaces
>
> Hi Ratheesh,
>
> > Op 18 mrt 6 Reiwa, om 15:06 heeft Ratheesh Kannoth
> <[email protected]> het volgende geschreven:
> >
> >> […]
> > Is below network topology possible ?
> > Netpoll()- ------> netdev A ----> raw interface Where netdev A's
> > netdev->header_ops != NULL
>
> I believe so, this is not uncommon in tunnel devices like gretap.
> However, those fully encapsulate the link layer header in the packet to the
> lower interface. I am not aware of a interface driver that removes a header
> upon xmit, so to speak. However, I have just posted a v2 that instead uses the
> documented `dev_has_header()` API, which seems to fit the check exactly,
> here:
ACK. I understand the complexity to implement it in driver than a easy fix in netpoll.