2013-06-12 09:05:20

by Dave Wiltshire

[permalink] [raw]
Subject: [PATCH 0/3] skbuff: pskb_expand_head changes

This patchset applies against David Miller's net-next tree.

This is my first patch, so please let me know if I've done something
wrong or could do something better (e.g. splitting the patch in a
different way).

I think there is a problem with memory accounting in pskb_expand_head.
Some call sites update truesize while others don't. This means that the
skbuff can change in size without truesize being updated. This seems
incorrect to me. So I changed it. Perhaps I don't understand something,
but I thought it best to generate the change and then ask. So is this
correct?

The other patches are from looking at all the call sites to
pskb_expand_head and since I was doing that I cleaned some of them up.

Dave Wiltshire (3):
skbuff: Update truesize in pskb_expand_head
skbuff: skb_cow_head not used in some drivers.
skbuff: Added new helper function skb_cow_clone_head.

drivers/net/ethernet/atheros/atl1c/atl1c_main.c | 8 ++------
drivers/net/ethernet/atheros/atl1e/atl1e_main.c | 8 ++------
drivers/net/ethernet/atheros/atlx/atl1.c | 8 ++------
drivers/net/ethernet/broadcom/tg3.c | 3 +--
drivers/net/ethernet/brocade/bna/bnad.c | 11 +++--------
drivers/net/ethernet/intel/e1000/e1000_main.c | 8 ++------
drivers/net/ethernet/intel/e1000e/netdev.c | 8 ++------
drivers/net/ethernet/intel/igb/igb_main.c | 7 ++-----
drivers/net/ethernet/intel/igbvf/netdev.c | 11 +++--------
drivers/net/ethernet/intel/ixgb/ixgb_main.c | 8 ++------
drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 10 +++-------
drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c | 7 ++-----
drivers/net/ethernet/jme.c | 3 +--
drivers/net/ethernet/qlogic/qlge/qlge_main.c | 8 ++------
drivers/net/wimax/i2400m/netdev.c | 3 +--
drivers/net/wireless/mwl8k.c | 1 -
include/linux/skbuff.h | 14 +++++++++++++
kernel/audit.c | 2 --
net/core/dev.c | 8 ++------
net/core/skbuff.c | 1 +
net/netlink/af_netlink.c | 3 +--
net/openvswitch/actions.c | 22 +++++++--------------
net/sched/act_csum.c | 8 ++------
net/sched/act_nat.c | 18 +++++------------
net/wireless/util.c | 2 --
25 files changed, 62 insertions(+), 128 deletions(-)

--
1.7.10.4


2013-06-12 09:06:24

by Dave Wiltshire

[permalink] [raw]
Subject: [PATCH 1/3] skbuff: Update truesize in pskb_expand_head

Some call sites to pskb_expand_head subsequently update the skb truesize
and others don't (even with non-zero arguments). This is likely a memory
audit leak. Fixed this up by moving the memory accounting to the
skbuff.c file and removing it from the calling sites.

Signed-off-by: Dave Wiltshire <[email protected]>
---
drivers/net/wireless/mwl8k.c | 1 -
kernel/audit.c | 2 --
net/core/skbuff.c | 1 +
net/netlink/af_netlink.c | 3 +--
net/wireless/util.c | 2 --
5 files changed, 2 insertions(+), 7 deletions(-)

diff --git a/drivers/net/wireless/mwl8k.c b/drivers/net/wireless/mwl8k.c
index 6820fce..802c8d7 100644
--- a/drivers/net/wireless/mwl8k.c
+++ b/drivers/net/wireless/mwl8k.c
@@ -845,7 +845,6 @@ mwl8k_add_dma_header(struct mwl8k_priv *priv, struct sk_buff *skb,
"Failed to reallocate TX buffer\n");
return;
}
- skb->truesize += REDUCED_TX_HEADROOM;
}

reqd_hdrlen = sizeof(*tr) + head_pad;
diff --git a/kernel/audit.c b/kernel/audit.c
index 21c7fa6..e05b57b 100644
--- a/kernel/audit.c
+++ b/kernel/audit.c
@@ -1157,7 +1157,6 @@ struct audit_buffer *audit_log_start(struct audit_context *ctx, gfp_t gfp_mask,
static inline int audit_expand(struct audit_buffer *ab, int extra)
{
struct sk_buff *skb = ab->skb;
- int oldtail = skb_tailroom(skb);
int ret = pskb_expand_head(skb, 0, extra, ab->gfp_mask);
int newtail = skb_tailroom(skb);

@@ -1166,7 +1165,6 @@ static inline int audit_expand(struct audit_buffer *ab, int extra)
return 0;
}

- skb->truesize += newtail - oldtail;
return newtail;
}

diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index edf3757..125bb7e 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -1061,6 +1061,7 @@ int pskb_expand_head(struct sk_buff *skb, int nhead, int ntail,
if (!data)
goto nodata;
size = SKB_WITH_OVERHEAD(ksize(data));
+ skb->truesize += size - skb_end_offset(skb);

/* Copy only real data... and, alas, header. This should be
* optimized for the cases when header is void.
diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index 9b6b115..77fd986 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -1565,8 +1565,7 @@ static struct sk_buff *netlink_trim(struct sk_buff *skb, gfp_t allocation)
skb = nskb;
}

- if (!pskb_expand_head(skb, 0, -delta, allocation))
- skb->truesize -= delta;
+ pskb_expand_head(skb, 0, -delta, allocation);

return skb;
}
diff --git a/net/wireless/util.c b/net/wireless/util.c
index f5ad4d9..5710aa2 100644
--- a/net/wireless/util.c
+++ b/net/wireless/util.c
@@ -533,8 +533,6 @@ int ieee80211_data_from_8023(struct sk_buff *skb, const u8 *addr,

if (pskb_expand_head(skb, head_need, 0, GFP_ATOMIC))
return -ENOMEM;
-
- skb->truesize += head_need;
}

if (encaps_data) {
--
1.7.10.4

2013-06-12 09:17:08

by Eric Dumazet

[permalink] [raw]
Subject: Re: [PATCH 1/3] skbuff: Update truesize in pskb_expand_head

On Wed, 2013-06-12 at 19:05 +1000, Dave Wiltshire wrote:
> Some call sites to pskb_expand_head subsequently update the skb truesize
> and others don't (even with non-zero arguments). This is likely a memory
> audit leak. Fixed this up by moving the memory accounting to the
> skbuff.c file and removing it from the calling sites.
>
> Signed-off-by: Dave Wiltshire <[email protected]>
> ---
> drivers/net/wireless/mwl8k.c | 1 -
> kernel/audit.c | 2 --
> net/core/skbuff.c | 1 +
> net/netlink/af_netlink.c | 3 +--
> net/wireless/util.c | 2 --
> 5 files changed, 2 insertions(+), 7 deletions(-)

Ouch.

Sorry, you cannot do that.

skb->truesize is really complex, because there is a strong relation
between skb->truesize and memory accounting on sockets.

So pskb_expand_head() should not touch skb->truesize.

Only callers can do that when needed, and if possible.

An example of very careful truesize manipulation can be found in
tcp_tso_segment()


2013-06-12 23:34:23

by Dave Wiltshire

[permalink] [raw]
Subject: Re: [PATCH 1/3] skbuff: Update truesize in pskb_expand_head

On Wed, Jun 12, 2013 at 02:16:58AM -0700, Eric Dumazet wrote:
> On Wed, 2013-06-12 at 19:05 +1000, Dave Wiltshire wrote:
> > Some call sites to pskb_expand_head subsequently update the skb truesize
> > and others don't (even with non-zero arguments). This is likely a memory
> > audit leak. Fixed this up by moving the memory accounting to the
> > skbuff.c file and removing it from the calling sites.
>
> Ouch.
>
> Sorry, you cannot do that.
>
> skb->truesize is really complex, because there is a strong relation
> between skb->truesize and memory accounting on sockets.
>

Firstly, from my cover letter: "Perhaps I don't understand something,
but I thought it best to generate the change and then ask. So is this
correct?". But secondly, I understand that the only reason for truesize
is for memory accounting on sockets. Indeed that's why I thought this
was incorrect. Something being complex is not a good reason not to do
it.

> So pskb_expand_head() should not touch skb->truesize.
>
> Only callers can do that when needed, and if possible.
>
> An example of very careful truesize manipulation can be found in
> tcp_tso_segment()
>

Perhaps I'm still missing something but I don't think tcp_tso_segment is
a very good example of truesize in skbuffs. That function is reassigning
already allocated memory between different skbuffs, and also it doesn't
touch pskb_expand_head. I don't see how that is similar to calling
pskb_expand_head with non-zero parameters (thus increasing the size of a
skbuff) and _not_ updating truesize as occurs, for instance, in
drivers/atm/solos-pci.c in the function psend.

Now this is a little used driver so perhaps it doesn't matter. But I'm
not sure if this is happening in other places thus meaning that memory
accounting on sockets isn't being performed correctly. Which is the
reason I suggested this as a fix.

Dave W

2013-06-13 05:39:03

by Eric Dumazet

[permalink] [raw]
Subject: Re: [PATCH 1/3] skbuff: Update truesize in pskb_expand_head

On Thu, 2013-06-13 at 09:35 +1000, Dave Wiltshire wrote:

> Firstly, from my cover letter: "Perhaps I don't understand something,
> but I thought it best to generate the change and then ask. So is this
> correct?".

Sure I have no problems with that.

> But secondly, I understand that the only reason for truesize
> is for memory accounting on sockets. Indeed that's why I thought this
> was incorrect. Something being complex is not a good reason not to do
> it.

OK but right now your patch adds many regressions, that will take weeks
for other dev to discover, understand and fix.

If you change skb->truesize not properly while skb is owned by a socket,
we can hold references forever and prevent sockets from being
dismantled/freed, or worse we'll free sockets while they are still in
use and panic the machine.

Some callers of pskb_expand_head() do not want skb->truesize to be
modified, because skb might be orphaned or not.

There are hundred of call sites.

Really, your patch is way too risky and will consume too much time for
very little gain. We cannot change conventions without a full audit.

There are some cases where truesize can not be exactly tracked.
For example, when we pull all data from a frag into skb->head, and the
frag can be release (put_page() on it), we do not know its original size
and can not reduce skb->truesize by this amount.
Thats fine, most of the time, because we pull network headers and they
are limited in size.

Your changelog used : "This is likely a memory audit leak", which is
weak considering the amount of time needed for us to validate such a big
change.