2020-02-15 05:25:59

by Dexuan Cui

Subject: Is it safe for a NIC driver to use all the 48 bytes of skb->cb?

Hi,
It looks like every layer in the network stack can use the 48-byte
skb->cb array. Is there any rule for how the layers should coordinate with each other?

I noticed the last 16 bytes are used by struct skb_gso_cb:

include/linux/skbuff.h:
struct skb_gso_cb {
	union {
		int	mac_offset;
		int	data_offset;
	};
	int	encap_level;
	__wsum	csum;
	__u16	csum_start;
};
#define SKB_SGO_CB_OFFSET 32
#define SKB_GSO_CB(skb) ((struct skb_gso_cb *)((skb)->cb + SKB_SGO_CB_OFFSET))

Does this mean a low-level NIC driver (e.g. hv_netvsc) should only use
the first 32 bytes? What if the upper layers of the network stack start to
take up more space in the future?

Currently hv_netvsc assumes it can use all 48 bytes, though it uses only
20 bytes. In case struct hv_netvsc_packet grows beyond 32 bytes in the
future, should we change the BUILD_BUG_ON() in netvsc_start_xmit() to
BUILD_BUG_ON(sizeof(struct hv_netvsc_packet) > SKB_SGO_CB_OFFSET);
?

struct hv_netvsc_packet {
	/* Bookkeeping stuff */
	u8 cp_partial; /* partial copy into send buffer */

	u8 rmsg_size; /* RNDIS header and PPI size */
	u8 rmsg_pgcnt; /* page count of RNDIS header and PPI */
	u8 page_buf_cnt;

	u16 q_idx;
	u16 total_packets;

	u32 total_bytes;
	u32 send_buf_index;
	u32 total_data_buflen;
};

static int netvsc_start_xmit(struct sk_buff *skb, struct net_device *net)
{
	...
	/*
	 * Place the rndis header in the skb head room and
	 * the skb->cb will be used for hv_netvsc_packet
	 * structure.
	 */
	ret = skb_cow_head(skb, RNDIS_AND_PPI_SIZE);
	if (ret)
		goto no_memory;

	/* Use the skb control buffer for building up the packet */
	BUILD_BUG_ON(sizeof(struct hv_netvsc_packet) >
		     FIELD_SIZEOF(struct sk_buff, cb));
	packet = (struct hv_netvsc_packet *)skb->cb;
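
To make the two candidate limits concrete, here is a minimal user-space
sketch (not kernel code). The *_mock struct names, CB_SIZE and the two
helper functions are made up for illustration. The field layouts are copied
from the excerpts above, with fixed-width types standing in for u8/u16/u32,
and __wsum is assumed to be 32 bits wide.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CB_SIZE            48	/* size of skb->cb, per the question */
#define SKB_SGO_CB_OFFSET  32	/* spelling as in the skbuff.h excerpt */

/* User-space mirror of struct hv_netvsc_packet (illustrative only). */
struct hv_netvsc_packet_mock {
	uint8_t  cp_partial;
	uint8_t  rmsg_size;
	uint8_t  rmsg_pgcnt;
	uint8_t  page_buf_cnt;
	uint16_t q_idx;
	uint16_t total_packets;
	uint32_t total_bytes;
	uint32_t send_buf_index;
	uint32_t total_data_buflen;
};

/* User-space mirror of struct skb_gso_cb; __wsum assumed to be 32 bits. */
struct skb_gso_cb_mock {
	union {
		int mac_offset;
		int data_offset;
	};
	int      encap_level;
	uint32_t csum;
	uint16_t csum_start;
};

/* The check the driver has today: the struct must fit in the whole cb. */
_Static_assert(sizeof(struct hv_netvsc_packet_mock) <= CB_SIZE,
	       "packet must fit in cb");

/* The stricter check proposed in the question: stay below the GSO area. */
_Static_assert(sizeof(struct hv_netvsc_packet_mock) <= SKB_SGO_CB_OFFSET,
	       "packet must not overlap the GSO area");

/* The GSO control block itself has to fit in the bytes after the offset. */
_Static_assert(SKB_SGO_CB_OFFSET + sizeof(struct skb_gso_cb_mock) <= CB_SIZE,
	       "GSO cb must fit in the tail of cb");

size_t netvsc_packet_size(void)
{
	return sizeof(struct hv_netvsc_packet_mock);
}

size_t gso_cb_size(void)
{
	return sizeof(struct skb_gso_cb_mock);
}
```

If all three assertions compile, the driver's struct satisfies either limit
today, so the choice between them only matters if the struct grows.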

Thanks,
-- Dexuan


2020-02-15 15:21:30

by Haiyang Zhang

Subject: RE: Is it safe for a NIC driver to use all the 48 bytes of skb->cb?



> -----Original Message-----
> From: Dexuan Cui <[email protected]>
> Sent: Saturday, February 15, 2020 12:24 AM
> To: Haiyang Zhang <[email protected]>; Stephen Hemminger
> <[email protected]>; David S. Miller <[email protected]>;
> [email protected]; KY Srinivasan <[email protected]>; linux-
> [email protected]
> Cc: [email protected]
> Subject: Is it safe for a NIC driver to use all the 48 bytes of skb->cb?
>
> Hi,
> It looks all the layers of drivers among the network stack can use the 48-byte
> skb->cb array. Is there any rule how they should coordinate with each other?
>
> I noticed the last 16 bytes are used by struct skb_gso_cb:
>
> include/linux/skbuff.h:
> struct skb_gso_cb {
> 	union {
> 		int	mac_offset;
> 		int	data_offset;
> 	};
> 	int	encap_level;
> 	__wsum	csum;
> 	__u16	csum_start;
> };
> #define SKB_SGO_CB_OFFSET 32
> #define SKB_GSO_CB(skb) ((struct skb_gso_cb *)((skb)->cb + SKB_SGO_CB_OFFSET))
>
> Does this mean a low level NIC driver (e.g. hv_netvsc) should only use the first
> 32 bytes? What if the upper layer network stack starts to take up more space in
> the future?

According to the comment in skbuff.h below, it is the responsibility of the owning
layer to clone the skb if it wants to keep its data across layers. So every layer
can still use all 48 bytes.

	/*
	 * This is the control buffer. It is free to use for every
	 * layer. Please put your private variables there. If you
	 * want to keep them across layers you have to do a skb_clone()
	 * first. This is owned by whoever has the skb queued ATM.
	 */
	char cb[48] __aligned(8);

> Now hv_netvsc assumes it can use all of the 48-bytes, though it uses only
> 20 bytes, but just in case the struct hv_netvsc_packet grows to >32 bytes in the
> future, should we change the BUILD_BUG_ON() in netvsc_start_xmit() to
> BUILD_BUG_ON(sizeof(struct hv_netvsc_packet) > SKB_SGO_CB_OFFSET); ?

Based on the explanation above, the existing hv_netvsc code is correct.

Thanks,
- Haiyang

2020-02-15 18:04:42

by Dexuan Cui

Subject: RE: Is it safe for a NIC driver to use all the 48 bytes of skb->cb?

> From: Haiyang Zhang <[email protected]>
> Sent: Saturday, February 15, 2020 7:20 AM
> To: Dexuan Cui <[email protected]>; Stephen Hemminger
>
> According to the comments in skbuff.h below, it is the responsibility of the
> owning layer to make a SKB clone, if it wants to keep the data across layers.
> So, every layer can still use all of the 48 bytes.
>
> /*
> * This is the control buffer. It is free to use for every
> * layer. Please put your private variables there. If you
> * want to keep them across layers you have to do a skb_clone()
> * first. This is owned by whoever has the skb queued ATM.
> */
> char cb[48] __aligned(8);
>
> > Now hv_netvsc assumes it can use all of the 48-bytes, though it uses only
> > 20 bytes, but just in case the struct hv_netvsc_packet grows to >32 bytes in
> > the future, should we change the BUILD_BUG_ON() in netvsc_start_xmit() to
> > BUILD_BUG_ON(sizeof(struct hv_netvsc_packet) > SKB_SGO_CB_OFFSET); ?
>
> Based on the explanation above, the existing hv_netvsc code is correct.
>
> Thanks,
> - Haiyang

Got it. So if an upper layer saves something in the cb and wants to keep it, it
must do a skb_clone() and pass the new skb to hv_netvsc. hv_netvsc is the lowest
layer in the network stack, so it can use all 48 bytes without calling skb_clone().
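
The ownership rule can be sketched in user space. Everything in the block
below is hypothetical (mock_skb, upper_cb, driver_cb, mock_clone and the two
survives_* helpers). mock_clone only copies the struct so that the clone gets
its own cb, which is the one property of skb_clone() that matters for this
question. memcpy is used instead of pointer casts to keep the sketch valid
under user-space strict-aliasing rules.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct sk_buff: only the control buffer. */
struct mock_skb {
	_Alignas(8) char cb[48];
};

struct upper_cb  { uint32_t seq; };		/* hypothetical upper-layer state */
struct driver_cb { uint32_t send_buf_index; };	/* hypothetical driver state */

static void upper_store(struct mock_skb *skb, uint32_t seq)
{
	struct upper_cb u = { .seq = seq };

	memcpy(skb->cb, &u, sizeof(u));
}

static uint32_t upper_load(const struct mock_skb *skb)
{
	struct upper_cb u;

	memcpy(&u, skb->cb, sizeof(u));
	return u.seq;
}

/* The lowest layer owns the skb it was handed and may use all 48 bytes. */
static void driver_xmit(struct mock_skb *skb)
{
	struct driver_cb d = { .send_buf_index = 0xffffffffu };

	memset(skb->cb, 0, sizeof(skb->cb));
	memcpy(skb->cb, &d, sizeof(d));
}

/* Models only one effect of skb_clone(): the clone has a separate cb. */
static struct mock_skb mock_clone(const struct mock_skb *skb)
{
	return *skb;
}

/* Upper layer keeps the original and hands a clone to the driver. */
int survives_with_clone(void)
{
	struct mock_skb orig, clone;

	upper_store(&orig, 42);
	clone = mock_clone(&orig);
	driver_xmit(&clone);
	return upper_load(&orig) == 42;
}

/* Upper layer hands the same skb down and expects its cb data to remain. */
int survives_without_clone(void)
{
	struct mock_skb skb;

	upper_store(&skb, 42);
	driver_xmit(&skb);
	return upper_load(&skb) == 42;
}
```

In this sketch the upper layer's value is intact only in the clone case,
which is the reason the lower layer does not need to reserve any part of the
cb for the layers above it.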

BTW, I now have a different question: in netvsc_probe() we have
net->needed_headroom = RNDIS_AND_PPI_SIZE;
I think this means that when the network stack (ARP, IP, ICMP, TCP, UDP, etc.)
passes an skb to hv_netvsc, the skb's headroom has already been increased by
net->needed_headroom, right? Then in netvsc_xmit(), why do we still need to
call skb_cow_head(skb, RNDIS_AND_PPI_SIZE)? It looks unnecessary to me.

PS, what does the "cow" here mean? Copy On Write? It looks like skb_cow_head()
just copies the data (if necessary) and has nothing to do with the
write-protection in the MMU code.

Thanks,
Dexuan

2020-02-17 22:31:45

by Haiyang Zhang

Subject: RE: Is it safe for a NIC driver to use all the 48 bytes of skb->cb?



> -----Original Message-----
> From: Dexuan Cui <[email protected]>
> Sent: Saturday, February 15, 2020 1:04 PM
> To: Haiyang Zhang <[email protected]>; Stephen Hemminger
> <[email protected]>; David S. Miller <[email protected]>;
> [email protected]; KY Srinivasan <[email protected]>; linux-
> [email protected]
> Cc: [email protected]
> Subject: RE: Is it safe for a NIC driver to use all the 48 bytes of skb->cb?
>
> > From: Haiyang Zhang <[email protected]>
> > Sent: Saturday, February 15, 2020 7:20 AM
> > To: Dexuan Cui <[email protected]>; Stephen Hemminger
> >
> > According to the comments in skbuff.h below, it is the responsibility of the
> > owning layer to make a SKB clone, if it wants to keep the data across layers.
> > So, every layer can still use all of the 48 bytes.
> >
> > /*
> > * This is the control buffer. It is free to use for every
> > * layer. Please put your private variables there. If you
> > * want to keep them across layers you have to do a skb_clone()
> > * first. This is owned by whoever has the skb queued ATM.
> > */
> > char cb[48] __aligned(8);
> >
> > > Now hv_netvsc assumes it can use all of the 48-bytes, though it uses only
> > > 20 bytes, but just in case the struct hv_netvsc_packet grows to >32 bytes in
> > > the future, should we change the BUILD_BUG_ON() in netvsc_start_xmit() to
> > > BUILD_BUG_ON(sizeof(struct hv_netvsc_packet) > SKB_SGO_CB_OFFSET); ?
> >
> > Based on the explanation above, the existing hv_netvsc code is correct.
> >
> > Thanks,
> > - Haiyang
>
> Got it. So if the upper layer saves something in the cb, it must do a skb_clone()
> and pass the new skb to hv_netvsc. hv_netvsc is the lowest layer in the network
> stack, so it can use all the 48 bytes without calling skb_clone().
>
> BTW, now I happen to have a different question: in netvsc_probe() we have
> net->needed_headroom = RNDIS_AND_PPI_SIZE;
> I think this means when the network stack (ARP, IP, ICMP, TCP, UDP,etc) passes
> a skb to hv_netvsc, the skb's headroom is increased by an extra size of
> net->needed_headroom, right? Then in netvsc_xmit(), why do we still need to
> call skb_cow_head(skb, RNDIS_AND_PPI_SIZE)? -- this looks unnecessary to me?

skb_cow_head() expands the headroom only if there is not enough of it, which
covers the case where some upper-layer path didn't reserve enough.
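
The "expand only if needed" behaviour can be illustrated with a small
user-space mock. mock_buf, mock_buf_init, mock_cow_head and count_copies are
hypothetical names, and the real helper works on struct sk_buff and, as far
as I understand it, also copies when the header is shared with a clone, which
this sketch leaves out.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical packet buffer: [ headroom | payload ] in one allocation. */
struct mock_buf {
	unsigned char *head;	/* start of the allocation */
	size_t headroom;	/* free bytes in front of the payload */
	size_t len;		/* payload length, assumed nonzero here */
	int copies;		/* how many times the payload was copied */
};

int mock_buf_init(struct mock_buf *b, size_t headroom, size_t len)
{
	b->head = calloc(headroom + len, 1);
	if (!b->head)
		return -1;
	b->headroom = headroom;
	b->len = len;
	b->copies = 0;
	return 0;
}

/* Make sure at least 'needed' bytes of headroom exist; copy only if not. */
int mock_cow_head(struct mock_buf *b, size_t needed)
{
	size_t delta;
	unsigned char *nhead;

	if (needed <= b->headroom)
		return 0;	/* fast path: enough room, nothing is copied */

	delta = needed - b->headroom;
	nhead = malloc(b->headroom + delta + b->len);
	if (!nhead)
		return -1;	/* the old buffer is left untouched */

	/* Move the payload so that it sits behind the larger headroom. */
	memcpy(nhead + b->headroom + delta, b->head + b->headroom, b->len);
	free(b->head);
	b->head = nhead;
	b->headroom += delta;
	b->copies++;
	return 0;
}

/* Returns the number of copies made, or -1 on allocation failure. */
int count_copies(size_t initial_headroom, size_t needed)
{
	struct mock_buf b;
	int copies = -1;

	if (mock_buf_init(&b, initial_headroom, 16) != 0)
		return -1;
	if (mock_cow_head(&b, needed) == 0 && b.headroom >= needed)
		copies = b.copies;
	free(b.head);
	return copies;
}
```

When the upper layers honoured the requested headroom, the call in this mock
is just a comparison. The copy happens only for buffers that arrive with too
little room in front of the payload.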

> PS, what does the "cow" here mean? Copy On Write? It looks skb_cow_head()
> just copies the data (if necessary) and it has nothing to do with the
> write-protection in the MMU code.

It is unrelated to the MMU. It just copies some data, when needed, to make room for writing.

Thanks,
- Haiyang