This series replaces atomic_t reference counters in core network subsystem
components with the new refcount_t type and API (see include/linux/refcount.h).
By doing this we prevent intentional or accidental
underflows or overflows that can lead to use-after-free vulnerabilities.
These patches contain only generic net pieces. Other changes will be sent separately.
The patches are fully independent and can be cherry-picked separately.
Since we are converting all kernel subsystems in the same fashion, resulting
in about 300 patches, we have to group them into series to keep the
submission manageable. Please excuse the long cc list.
If there are no objections to the patches, please merge them via respective trees.
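For reference, every patch in the series applies the same mechanical pattern.
A minimal, self-contained sketch of the before/after API usage follows; it is
illustrative only, and the struct and helpers below are hypothetical, not
taken from any patch in this series:

	#include <linux/refcount.h>
	#include <linux/slab.h>

	/* Hypothetical refcounted object, for illustration only. */
	struct foo {
		refcount_t refcnt;		/* was: atomic_t refcnt; */
	};

	static struct foo *foo_alloc(void)
	{
		struct foo *f = kzalloc(sizeof(*f), GFP_KERNEL);

		if (f)
			refcount_set(&f->refcnt, 1);	/* was: atomic_set(&f->refcnt, 1) */
		return f;
	}

	static void foo_hold(struct foo *f)
	{
		refcount_inc(&f->refcnt);		/* was: atomic_inc() */
	}

	static void foo_put(struct foo *f)
	{
		/* was: atomic_dec_and_test(); refcount_t saturates instead of wrapping */
		if (refcount_dec_and_test(&f->refcnt))
			kfree(f);
	}

	/* Lookup-style helper: take a reference only if the object is still live. */
	static struct foo *foo_get_if_live(struct foo *f)
	{
		/* was: atomic_inc_not_zero() */
		if (f && !refcount_inc_not_zero(&f->refcnt))
			return NULL;
		return f;
	}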
Elena Reshetova (17):
net: convert neighbour.refcnt from atomic_t to refcount_t
net: convert neigh_params.refcnt from atomic_t to refcount_t
net: convert nf_bridge_info.use from atomic_t to refcount_t
net: convert sk_buff.users from atomic_t to refcount_t
net: convert sk_buff_fclones.fclone_ref from atomic_t to refcount_t
net: convert sock.sk_wmem_alloc from atomic_t to refcount_t
net: convert sock.sk_refcnt from atomic_t to refcount_t
net: convert sk_filter.refcnt from atomic_t to refcount_t
net: convert ip_mc_list.refcnt from atomic_t to refcount_t
net: convert in_device.refcnt from atomic_t to refcount_t
net: convert netpoll_info.refcnt from atomic_t to refcount_t
net: convert unix_address.refcnt from atomic_t to refcount_t
net: convert fib_rule.refcnt from atomic_t to refcount_t
net: convert inet_frag_queue.refcnt from atomic_t to refcount_t
net: convert net.passive from atomic_t to refcount_t
net: convert netlbl_lsm_cache.refcount from atomic_t to refcount_t
net: convert packet_fanout.sk_ref from atomic_t to refcount_t
crypto/algif_aead.c | 2 +-
drivers/atm/fore200e.c | 12 +-----------
drivers/atm/he.c | 2 +-
drivers/atm/idt77252.c | 4 ++--
drivers/infiniband/hw/nes/nes_cm.c | 4 ++--
drivers/isdn/mISDN/socket.c | 2 +-
drivers/net/rionet.c | 2 +-
drivers/s390/net/ctcm_main.c | 26 ++++++++++++------------
drivers/s390/net/netiucv.c | 10 +++++-----
drivers/s390/net/qeth_core_main.c | 4 ++--
drivers/scsi/cxgbi/libcxgbi.h | 2 +-
include/linux/atmdev.h | 2 +-
include/linux/filter.h | 3 ++-
include/linux/igmp.h | 3 ++-
include/linux/inetdevice.h | 11 ++++++-----
include/linux/netpoll.h | 3 ++-
include/linux/skbuff.h | 16 +++++++--------
include/net/af_unix.h | 3 ++-
include/net/arp.h | 2 +-
include/net/fib_rules.h | 7 ++++---
include/net/inet_frag.h | 4 ++--
include/net/inet_hashtables.h | 4 ++--
include/net/ndisc.h | 2 +-
include/net/neighbour.h | 15 +++++++-------
include/net/net_namespace.h | 3 ++-
include/net/netfilter/br_netfilter.h | 2 +-
include/net/netlabel.h | 8 ++++----
include/net/request_sock.h | 9 +++++----
include/net/sock.h | 25 ++++++++++++------------
net/atm/br2684.c | 2 +-
net/atm/clip.c | 8 ++++----
net/atm/common.c | 10 +++++-----
net/atm/lec.c | 4 ++--
net/atm/mpc.c | 4 ++--
net/atm/pppoatm.c | 2 +-
net/atm/proc.c | 2 +-
net/atm/raw.c | 2 +-
net/atm/signaling.c | 2 +-
net/bluetooth/af_bluetooth.c | 2 +-
net/bluetooth/rfcomm/sock.c | 2 +-
net/bridge/br_netfilter_hooks.c | 4 ++--
net/caif/caif_socket.c | 2 +-
net/core/datagram.c | 10 +++++-----
net/core/dev.c | 10 +++++-----
net/core/fib_rules.c | 4 ++--
net/core/filter.c | 7 ++++---
net/core/neighbour.c | 22 ++++++++++-----------
net/core/net-sysfs.c | 2 +-
net/core/net_namespace.c | 4 ++--
net/core/netpoll.c | 10 +++++-----
net/core/pktgen.c | 16 +++++++--------
net/core/rtnetlink.c | 2 +-
net/core/skbuff.c | 38 ++++++++++++++++++------------------
net/core/sock.c | 30 ++++++++++++++--------------
net/dccp/ipv6.c | 2 +-
net/decnet/dn_neigh.c | 2 +-
net/ipv4/af_inet.c | 2 +-
net/ipv4/cipso_ipv4.c | 4 ++--
net/ipv4/devinet.c | 2 +-
net/ipv4/esp4.c | 2 +-
net/ipv4/igmp.c | 10 +++++-----
net/ipv4/inet_connection_sock.c | 2 +-
net/ipv4/inet_fragment.c | 14 ++++++-------
net/ipv4/inet_hashtables.c | 4 ++--
net/ipv4/inet_timewait_sock.c | 8 ++++----
net/ipv4/ip_fragment.c | 2 +-
net/ipv4/ip_output.c | 6 +++---
net/ipv4/ping.c | 4 ++--
net/ipv4/raw.c | 2 +-
net/ipv4/syncookies.c | 2 +-
net/ipv4/tcp.c | 4 ++--
net/ipv4/tcp_fastopen.c | 2 +-
net/ipv4/tcp_ipv4.c | 4 ++--
net/ipv4/tcp_offload.c | 2 +-
net/ipv4/tcp_output.c | 13 ++++++------
net/ipv4/udp.c | 6 +++---
net/ipv4/udp_diag.c | 4 ++--
net/ipv6/calipso.c | 4 ++--
net/ipv6/datagram.c | 2 +-
net/ipv6/esp6.c | 2 +-
net/ipv6/inet6_hashtables.c | 4 ++--
net/ipv6/ip6_output.c | 4 ++--
net/ipv6/syncookies.c | 2 +-
net/ipv6/tcp_ipv6.c | 6 +++---
net/ipv6/udp.c | 2 +-
net/kcm/kcmproc.c | 2 +-
net/key/af_key.c | 8 ++++----
net/l2tp/l2tp_debugfs.c | 3 +--
net/llc/llc_conn.c | 8 ++++----
net/llc/llc_sap.c | 2 +-
net/netfilter/xt_TPROXY.c | 4 ++--
net/netlink/af_netlink.c | 14 ++++++-------
net/packet/af_packet.c | 14 ++++++-------
net/packet/internal.h | 4 +++-
net/phonet/socket.c | 4 ++--
net/rds/tcp_send.c | 2 +-
net/rxrpc/af_rxrpc.c | 6 +++---
net/rxrpc/skbuff.c | 12 ++++++------
net/sched/em_meta.c | 2 +-
net/sched/sch_atm.c | 2 +-
net/sctp/output.c | 2 +-
net/sctp/outqueue.c | 2 +-
net/sctp/proc.c | 2 +-
net/sctp/socket.c | 6 +++---
net/tipc/socket.c | 2 +-
net/unix/af_unix.c | 16 +++++++--------
106 files changed, 320 insertions(+), 319 deletions(-)
--
2.7.4
The refcount_t type and corresponding API should be used
instead of atomic_t when the variable is used as a
reference counter. This avoids accidental refcounter
overflows that might lead to use-after-free situations.
Signed-off-by: Elena Reshetova <[email protected]>
Signed-off-by: Hans Liljestrand <[email protected]>
Signed-off-by: Kees Cook <[email protected]>
Signed-off-by: David Windsor <[email protected]>
---
include/net/neighbour.h | 6 +++---
net/core/neighbour.c | 8 ++++----
2 files changed, 7 insertions(+), 7 deletions(-)
diff --git a/include/net/neighbour.h b/include/net/neighbour.h
index 9a66cfc9..d2eab49 100644
--- a/include/net/neighbour.h
+++ b/include/net/neighbour.h
@@ -77,7 +77,7 @@ struct neigh_parms {
void *sysctl_table;
int dead;
- atomic_t refcnt;
+ refcount_t refcnt;
struct rcu_head rcu_head;
int reachable_time;
@@ -394,12 +394,12 @@ void neigh_sysctl_unregister(struct neigh_parms *p);
static inline void __neigh_parms_put(struct neigh_parms *parms)
{
- atomic_dec(&parms->refcnt);
+ refcount_dec(&parms->refcnt);
}
static inline struct neigh_parms *neigh_parms_clone(struct neigh_parms *parms)
{
- atomic_inc(&parms->refcnt);
+ refcount_inc(&parms->refcnt);
return parms;
}
diff --git a/net/core/neighbour.c b/net/core/neighbour.c
index 36f8008..9b38acd 100644
--- a/net/core/neighbour.c
+++ b/net/core/neighbour.c
@@ -673,7 +673,7 @@ static void neigh_parms_destroy(struct neigh_parms *parms);
static inline void neigh_parms_put(struct neigh_parms *parms)
{
- if (atomic_dec_and_test(&parms->refcnt))
+ if (refcount_dec_and_test(&parms->refcnt))
neigh_parms_destroy(parms);
}
@@ -1436,7 +1436,7 @@ struct neigh_parms *neigh_parms_alloc(struct net_device *dev,
p = kmemdup(&tbl->parms, sizeof(*p), GFP_KERNEL);
if (p) {
p->tbl = tbl;
- atomic_set(&p->refcnt, 1);
+ refcount_set(&p->refcnt, 1);
p->reachable_time =
neigh_rand_reach_time(NEIGH_VAR(p, BASE_REACHABLE_TIME));
dev_hold(dev);
@@ -1499,7 +1499,7 @@ void neigh_table_init(int index, struct neigh_table *tbl)
INIT_LIST_HEAD(&tbl->parms_list);
list_add(&tbl->parms.list, &tbl->parms_list);
write_pnet(&tbl->parms.net, &init_net);
- atomic_set(&tbl->parms.refcnt, 1);
+ refcount_set(&tbl->parms.refcnt, 1);
tbl->parms.reachable_time =
neigh_rand_reach_time(NEIGH_VAR(&tbl->parms, BASE_REACHABLE_TIME));
@@ -1746,7 +1746,7 @@ static int neightbl_fill_parms(struct sk_buff *skb, struct neigh_parms *parms)
if ((parms->dev &&
nla_put_u32(skb, NDTPA_IFINDEX, parms->dev->ifindex)) ||
- nla_put_u32(skb, NDTPA_REFCNT, atomic_read(&parms->refcnt)) ||
+ nla_put_u32(skb, NDTPA_REFCNT, refcount_read(&parms->refcnt)) ||
nla_put_u32(skb, NDTPA_QUEUE_LENBYTES,
NEIGH_VAR(parms, QUEUE_LEN_BYTES)) ||
/* approximative value for deprecated QUEUE_LEN (in packets) */
--
2.7.4
The refcount_t type and corresponding API should be used
instead of atomic_t when the variable is used as a
reference counter. This avoids accidental refcounter
overflows that might lead to use-after-free situations.
Signed-off-by: Elena Reshetova <[email protected]>
Signed-off-by: Hans Liljestrand <[email protected]>
Signed-off-by: Kees Cook <[email protected]>
Signed-off-by: David Windsor <[email protected]>
---
include/linux/skbuff.h | 6 +++---
include/net/netfilter/br_netfilter.h | 2 +-
net/bridge/br_netfilter_hooks.c | 4 ++--
3 files changed, 6 insertions(+), 6 deletions(-)
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 4bcb75f..957a2b0 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -250,7 +250,7 @@ struct nf_conntrack {
#if IS_ENABLED(CONFIG_BRIDGE_NETFILTER)
struct nf_bridge_info {
- atomic_t use;
+ refcount_t use;
enum {
BRNF_PROTO_UNCHANGED,
BRNF_PROTO_8021Q,
@@ -3589,13 +3589,13 @@ static inline void nf_conntrack_get(struct nf_conntrack *nfct)
#if IS_ENABLED(CONFIG_BRIDGE_NETFILTER)
static inline void nf_bridge_put(struct nf_bridge_info *nf_bridge)
{
- if (nf_bridge && atomic_dec_and_test(&nf_bridge->use))
+ if (nf_bridge && refcount_dec_and_test(&nf_bridge->use))
kfree(nf_bridge);
}
static inline void nf_bridge_get(struct nf_bridge_info *nf_bridge)
{
if (nf_bridge)
- atomic_inc(&nf_bridge->use);
+ refcount_inc(&nf_bridge->use);
}
#endif /* CONFIG_BRIDGE_NETFILTER */
static inline void nf_reset(struct sk_buff *skb)
diff --git a/include/net/netfilter/br_netfilter.h b/include/net/netfilter/br_netfilter.h
index 0b0c35c..925524e 100644
--- a/include/net/netfilter/br_netfilter.h
+++ b/include/net/netfilter/br_netfilter.h
@@ -8,7 +8,7 @@ static inline struct nf_bridge_info *nf_bridge_alloc(struct sk_buff *skb)
skb->nf_bridge = kzalloc(sizeof(struct nf_bridge_info), GFP_ATOMIC);
if (likely(skb->nf_bridge))
- atomic_set(&(skb->nf_bridge->use), 1);
+ refcount_set(&(skb->nf_bridge->use), 1);
return skb->nf_bridge;
}
diff --git a/net/bridge/br_netfilter_hooks.c b/net/bridge/br_netfilter_hooks.c
index 52739e6..ac04135 100644
--- a/net/bridge/br_netfilter_hooks.c
+++ b/net/bridge/br_netfilter_hooks.c
@@ -149,12 +149,12 @@ static inline struct nf_bridge_info *nf_bridge_unshare(struct sk_buff *skb)
{
struct nf_bridge_info *nf_bridge = skb->nf_bridge;
- if (atomic_read(&nf_bridge->use) > 1) {
+ if (refcount_read(&nf_bridge->use) > 1) {
struct nf_bridge_info *tmp = nf_bridge_alloc(skb);
if (tmp) {
memcpy(tmp, nf_bridge, sizeof(struct nf_bridge_info));
- atomic_set(&tmp->use, 1);
+ refcount_set(&tmp->use, 1);
}
nf_bridge_put(nf_bridge);
nf_bridge = tmp;
--
2.7.4
The refcount_t type and corresponding API should be used
instead of atomic_t when the variable is used as a
reference counter. This avoids accidental refcounter
overflows that might lead to use-after-free situations.
Signed-off-by: Elena Reshetova <[email protected]>
Signed-off-by: Hans Liljestrand <[email protected]>
Signed-off-by: Kees Cook <[email protected]>
Signed-off-by: David Windsor <[email protected]>
---
include/linux/skbuff.h | 4 ++--
net/core/skbuff.c | 10 +++++-----
2 files changed, 7 insertions(+), 7 deletions(-)
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 0bcaec4..63ce21f 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -945,7 +945,7 @@ struct sk_buff_fclones {
struct sk_buff skb2;
- atomic_t fclone_ref;
+ refcount_t fclone_ref;
};
/**
@@ -965,7 +965,7 @@ static inline bool skb_fclone_busy(const struct sock *sk,
fclones = container_of(skb, struct sk_buff_fclones, skb1);
return skb->fclone == SKB_FCLONE_ORIG &&
- atomic_read(&fclones->fclone_ref) > 1 &&
+ refcount_read(&fclones->fclone_ref) > 1 &&
fclones->skb2.sk == sk;
}
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 94e961f..6911269 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -268,7 +268,7 @@ struct sk_buff *__alloc_skb(unsigned int size, gfp_t gfp_mask,
kmemcheck_annotate_bitfield(&fclones->skb2, flags1);
skb->fclone = SKB_FCLONE_ORIG;
- atomic_set(&fclones->fclone_ref, 1);
+ refcount_set(&fclones->fclone_ref, 1);
fclones->skb2.fclone = SKB_FCLONE_CLONE;
}
@@ -629,7 +629,7 @@ static void kfree_skbmem(struct sk_buff *skb)
* This test would have no chance to be true for the clone,
* while here, branch prediction will be good.
*/
- if (atomic_read(&fclones->fclone_ref) == 1)
+ if (refcount_read(&fclones->fclone_ref) == 1)
goto fastpath;
break;
@@ -637,7 +637,7 @@ static void kfree_skbmem(struct sk_buff *skb)
fclones = container_of(skb, struct sk_buff_fclones, skb2);
break;
}
- if (!atomic_dec_and_test(&fclones->fclone_ref))
+ if (!refcount_dec_and_test(&fclones->fclone_ref))
return;
fastpath:
kmem_cache_free(skbuff_fclone_cache, fclones);
@@ -1018,9 +1018,9 @@ struct sk_buff *skb_clone(struct sk_buff *skb, gfp_t gfp_mask)
return NULL;
if (skb->fclone == SKB_FCLONE_ORIG &&
- atomic_read(&fclones->fclone_ref) == 1) {
+ refcount_read(&fclones->fclone_ref) == 1) {
n = &fclones->skb2;
- atomic_set(&fclones->fclone_ref, 2);
+ refcount_set(&fclones->fclone_ref, 2);
} else {
if (skb_pfmemalloc(skb))
gfp_mask |= __GFP_MEMALLOC;
--
2.7.4
The refcount_t type and corresponding API should be used
instead of atomic_t when the variable is used as a
reference counter. This avoids accidental refcounter
overflows that might lead to use-after-free situations.
Signed-off-by: Elena Reshetova <[email protected]>
Signed-off-by: Hans Liljestrand <[email protected]>
Signed-off-by: Kees Cook <[email protected]>
Signed-off-by: David Windsor <[email protected]>
---
include/net/af_unix.h | 3 ++-
net/unix/af_unix.c | 8 ++++----
2 files changed, 6 insertions(+), 5 deletions(-)
diff --git a/include/net/af_unix.h b/include/net/af_unix.h
index fd60ecc..3a385e4 100644
--- a/include/net/af_unix.h
+++ b/include/net/af_unix.h
@@ -4,6 +4,7 @@
#include <linux/socket.h>
#include <linux/un.h>
#include <linux/mutex.h>
+#include <linux/refcount.h>
#include <net/sock.h>
void unix_inflight(struct user_struct *user, struct file *fp);
@@ -21,7 +22,7 @@ extern spinlock_t unix_table_lock;
extern struct hlist_head unix_socket_table[2 * UNIX_HASH_SIZE];
struct unix_address {
- atomic_t refcnt;
+ refcount_t refcnt;
int len;
unsigned int hash;
struct sockaddr_un name[0];
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index e9f8102..5e2a402 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -212,7 +212,7 @@ EXPORT_SYMBOL_GPL(unix_peer_get);
static inline void unix_release_addr(struct unix_address *addr)
{
- if (atomic_dec_and_test(&addr->refcnt))
+ if (refcount_dec_and_test(&addr->refcnt))
kfree(addr);
}
@@ -861,7 +861,7 @@ static int unix_autobind(struct socket *sock)
goto out;
addr->name->sun_family = AF_UNIX;
- atomic_set(&addr->refcnt, 1);
+ refcount_set(&addr->refcnt, 1);
retry:
addr->len = sprintf(addr->name->sun_path+1, "%05x", ordernum) + 1 + sizeof(short);
@@ -1036,7 +1036,7 @@ static int unix_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len)
memcpy(addr->name, sunaddr, addr_len);
addr->len = addr_len;
addr->hash = hash ^ sk->sk_type;
- atomic_set(&addr->refcnt, 1);
+ refcount_set(&addr->refcnt, 1);
if (sun_path[0]) {
addr->hash = UNIX_HASH_SIZE;
@@ -1327,7 +1327,7 @@ static int unix_stream_connect(struct socket *sock, struct sockaddr *uaddr,
/* copy address information from listening to new sock*/
if (otheru->addr) {
- atomic_inc(&otheru->addr->refcnt);
+ refcount_inc(&otheru->addr->refcnt);
newu->addr = otheru->addr;
}
if (otheru->path.dentry) {
--
2.7.4
The refcount_t type and corresponding API should be used
instead of atomic_t when the variable is used as a
reference counter. This avoids accidental refcounter
overflows that might lead to use-after-free situations.
Signed-off-by: Elena Reshetova <[email protected]>
Signed-off-by: Hans Liljestrand <[email protected]>
Signed-off-by: Kees Cook <[email protected]>
Signed-off-by: David Windsor <[email protected]>
---
include/net/fib_rules.h | 7 ++++---
net/core/fib_rules.c | 4 ++--
2 files changed, 6 insertions(+), 5 deletions(-)
diff --git a/include/net/fib_rules.h b/include/net/fib_rules.h
index 8dbfdf7..26ae0ad 100644
--- a/include/net/fib_rules.h
+++ b/include/net/fib_rules.h
@@ -5,6 +5,7 @@
#include <linux/slab.h>
#include <linux/netdevice.h>
#include <linux/fib_rules.h>
+#include <linux/refcount.h>
#include <net/flow.h>
#include <net/rtnetlink.h>
@@ -29,7 +30,7 @@ struct fib_rule {
struct fib_rule __rcu *ctarget;
struct net *fr_net;
- atomic_t refcnt;
+ refcount_t refcnt;
u32 pref;
int suppress_ifgroup;
int suppress_prefixlen;
@@ -103,12 +104,12 @@ struct fib_rules_ops {
static inline void fib_rule_get(struct fib_rule *rule)
{
- atomic_inc(&rule->refcnt);
+ refcount_inc(&rule->refcnt);
}
static inline void fib_rule_put(struct fib_rule *rule)
{
- if (atomic_dec_and_test(&rule->refcnt))
+ if (refcount_dec_and_test(&rule->refcnt))
kfree_rcu(rule, rcu);
}
diff --git a/net/core/fib_rules.c b/net/core/fib_rules.c
index b6791d9..53d55d5 100644
--- a/net/core/fib_rules.c
+++ b/net/core/fib_rules.c
@@ -32,7 +32,7 @@ int fib_default_rule_add(struct fib_rules_ops *ops,
if (r == NULL)
return -ENOMEM;
- atomic_set(&r->refcnt, 1);
+ refcount_set(&r->refcnt, 1);
r->action = FR_ACT_TO_TBL;
r->pref = pref;
r->table = table;
@@ -269,7 +269,7 @@ int fib_rules_lookup(struct fib_rules_ops *ops, struct flowi *fl,
if (err != -EAGAIN) {
if ((arg->flags & FIB_LOOKUP_NOREF) ||
- likely(atomic_inc_not_zero(&rule->refcnt))) {
+ likely(refcount_inc_not_zero(&rule->refcnt))) {
arg->rule = rule;
goto out;
}
--
2.7.4
The refcount_t type and corresponding API should be used
instead of atomic_t when the variable is used as a
reference counter. This avoids accidental refcounter
overflows that might lead to use-after-free situations.
Signed-off-by: Elena Reshetova <[email protected]>
Signed-off-by: Hans Liljestrand <[email protected]>
Signed-off-by: Kees Cook <[email protected]>
Signed-off-by: David Windsor <[email protected]>
---
include/net/inet_frag.h | 4 ++--
net/ipv4/inet_fragment.c | 14 +++++++-------
net/ipv4/ip_fragment.c | 2 +-
3 files changed, 10 insertions(+), 10 deletions(-)
diff --git a/include/net/inet_frag.h b/include/net/inet_frag.h
index 5894730..5a334bf 100644
--- a/include/net/inet_frag.h
+++ b/include/net/inet_frag.h
@@ -50,7 +50,7 @@ struct inet_frag_queue {
spinlock_t lock;
struct timer_list timer;
struct hlist_node list;
- atomic_t refcnt;
+ refcount_t refcnt;
struct sk_buff *fragments;
struct sk_buff *fragments_tail;
ktime_t stamp;
@@ -129,7 +129,7 @@ void inet_frag_maybe_warn_overflow(struct inet_frag_queue *q,
static inline void inet_frag_put(struct inet_frag_queue *q, struct inet_frags *f)
{
- if (atomic_dec_and_test(&q->refcnt))
+ if (refcount_dec_and_test(&q->refcnt))
inet_frag_destroy(q, f);
}
diff --git a/net/ipv4/inet_fragment.c b/net/ipv4/inet_fragment.c
index b5e9317..96e95e8 100644
--- a/net/ipv4/inet_fragment.c
+++ b/net/ipv4/inet_fragment.c
@@ -276,11 +276,11 @@ static inline void fq_unlink(struct inet_frag_queue *fq, struct inet_frags *f)
void inet_frag_kill(struct inet_frag_queue *fq, struct inet_frags *f)
{
if (del_timer(&fq->timer))
- atomic_dec(&fq->refcnt);
+ refcount_dec(&fq->refcnt);
if (!(fq->flags & INET_FRAG_COMPLETE)) {
fq_unlink(fq, f);
- atomic_dec(&fq->refcnt);
+ refcount_dec(&fq->refcnt);
}
}
EXPORT_SYMBOL(inet_frag_kill);
@@ -329,7 +329,7 @@ static struct inet_frag_queue *inet_frag_intern(struct netns_frags *nf,
*/
hlist_for_each_entry(qp, &hb->chain, list) {
if (qp->net == nf && f->match(qp, arg)) {
- atomic_inc(&qp->refcnt);
+ refcount_inc(&qp->refcnt);
spin_unlock(&hb->chain_lock);
qp_in->flags |= INET_FRAG_COMPLETE;
inet_frag_put(qp_in, f);
@@ -339,9 +339,9 @@ static struct inet_frag_queue *inet_frag_intern(struct netns_frags *nf,
#endif
qp = qp_in;
if (!mod_timer(&qp->timer, jiffies + nf->timeout))
- atomic_inc(&qp->refcnt);
+ refcount_inc(&qp->refcnt);
- atomic_inc(&qp->refcnt);
+ refcount_inc(&qp->refcnt);
hlist_add_head(&qp->list, &hb->chain);
spin_unlock(&hb->chain_lock);
@@ -370,7 +370,7 @@ static struct inet_frag_queue *inet_frag_alloc(struct netns_frags *nf,
setup_timer(&q->timer, f->frag_expire, (unsigned long)q);
spin_lock_init(&q->lock);
- atomic_set(&q->refcnt, 1);
+ refcount_set(&q->refcnt, 1);
return q;
}
@@ -405,7 +405,7 @@ struct inet_frag_queue *inet_frag_find(struct netns_frags *nf,
spin_lock(&hb->chain_lock);
hlist_for_each_entry(q, &hb->chain, list) {
if (q->net == nf && f->match(q, key)) {
- atomic_inc(&q->refcnt);
+ refcount_inc(&q->refcnt);
spin_unlock(&hb->chain_lock);
return q;
}
diff --git a/net/ipv4/ip_fragment.c b/net/ipv4/ip_fragment.c
index bbe7f72..4fed6f6 100644
--- a/net/ipv4/ip_fragment.c
+++ b/net/ipv4/ip_fragment.c
@@ -303,7 +303,7 @@ static int ip_frag_reinit(struct ipq *qp)
unsigned int sum_truesize = 0;
if (!mod_timer(&qp->q.timer, jiffies + qp->q.net->timeout)) {
- atomic_inc(&qp->q.refcnt);
+ refcount_inc(&qp->q.refcnt);
return -ETIMEDOUT;
}
--
2.7.4
The refcount_t type and corresponding API should be used
instead of atomic_t when the variable is used as a
reference counter. This avoids accidental refcounter
overflows that might lead to use-after-free situations.
Signed-off-by: Elena Reshetova <[email protected]>
Signed-off-by: Hans Liljestrand <[email protected]>
Signed-off-by: Kees Cook <[email protected]>
Signed-off-by: David Windsor <[email protected]>
---
include/linux/filter.h | 3 ++-
net/core/filter.c | 7 ++++---
2 files changed, 6 insertions(+), 4 deletions(-)
diff --git a/include/linux/filter.h b/include/linux/filter.h
index 8053c38..20247e7 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -7,6 +7,7 @@
#include <stdarg.h>
#include <linux/atomic.h>
+#include <linux/refcount.h>
#include <linux/compat.h>
#include <linux/skbuff.h>
#include <linux/linkage.h>
@@ -431,7 +432,7 @@ struct bpf_prog {
};
struct sk_filter {
- atomic_t refcnt;
+ refcount_t refcnt;
struct rcu_head rcu;
struct bpf_prog *prog;
};
diff --git a/net/core/filter.c b/net/core/filter.c
index ebaeaf2..62267e2 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -928,7 +928,7 @@ static void sk_filter_release_rcu(struct rcu_head *rcu)
*/
static void sk_filter_release(struct sk_filter *fp)
{
- if (atomic_dec_and_test(&fp->refcnt))
+ if (refcount_dec_and_test(&fp->refcnt))
call_rcu(&fp->rcu, sk_filter_release_rcu);
}
@@ -950,7 +950,7 @@ bool sk_filter_charge(struct sock *sk, struct sk_filter *fp)
/* same check as in sock_kmalloc() */
if (filter_size <= sysctl_optmem_max &&
atomic_read(&sk->sk_omem_alloc) + filter_size < sysctl_optmem_max) {
- atomic_inc(&fp->refcnt);
+ refcount_inc(&fp->refcnt);
atomic_add(filter_size, &sk->sk_omem_alloc);
return true;
}
@@ -1179,12 +1179,13 @@ static int __sk_attach_prog(struct bpf_prog *prog, struct sock *sk)
return -ENOMEM;
fp->prog = prog;
- atomic_set(&fp->refcnt, 0);
+ refcount_set(&fp->refcnt, 1);
if (!sk_filter_charge(sk, fp)) {
kfree(fp);
return -ENOMEM;
}
+ refcount_set(&fp->refcnt, 1);
old_fp = rcu_dereference_protected(sk->sk_filter,
lockdep_sock_is_held(sk));
--
2.7.4
The refcount_t type and corresponding API should be used
instead of atomic_t when the variable is used as a
reference counter. This avoids accidental refcounter
overflows that might lead to use-after-free situations.
Signed-off-by: Elena Reshetova <[email protected]>
Signed-off-by: Hans Liljestrand <[email protected]>
Signed-off-by: Kees Cook <[email protected]>
Signed-off-by: David Windsor <[email protected]>
---
include/linux/netpoll.h | 3 ++-
net/core/netpoll.c | 6 +++---
2 files changed, 5 insertions(+), 4 deletions(-)
diff --git a/include/linux/netpoll.h b/include/linux/netpoll.h
index 1828900..27c0aaa 100644
--- a/include/linux/netpoll.h
+++ b/include/linux/netpoll.h
@@ -11,6 +11,7 @@
#include <linux/interrupt.h>
#include <linux/rcupdate.h>
#include <linux/list.h>
+#include <linux/refcount.h>
union inet_addr {
__u32 all[4];
@@ -34,7 +35,7 @@ struct netpoll {
};
struct netpoll_info {
- atomic_t refcnt;
+ refcount_t refcnt;
struct semaphore dev_lock;
diff --git a/net/core/netpoll.c b/net/core/netpoll.c
index 891d88e..8b9083d 100644
--- a/net/core/netpoll.c
+++ b/net/core/netpoll.c
@@ -626,7 +626,7 @@ int __netpoll_setup(struct netpoll *np, struct net_device *ndev)
skb_queue_head_init(&npinfo->txq);
INIT_DELAYED_WORK(&npinfo->tx_work, queue_process);
- atomic_set(&npinfo->refcnt, 1);
+ refcount_set(&npinfo->refcnt, 1);
ops = np->dev->netdev_ops;
if (ops->ndo_netpoll_setup) {
@@ -636,7 +636,7 @@ int __netpoll_setup(struct netpoll *np, struct net_device *ndev)
}
} else {
npinfo = rtnl_dereference(ndev->npinfo);
- atomic_inc(&npinfo->refcnt);
+ refcount_inc(&npinfo->refcnt);
}
npinfo->netpoll = np;
@@ -815,7 +815,7 @@ void __netpoll_cleanup(struct netpoll *np)
synchronize_srcu(&netpoll_srcu);
- if (atomic_dec_and_test(&npinfo->refcnt)) {
+ if (refcount_dec_and_test(&npinfo->refcnt)) {
const struct net_device_ops *ops;
ops = np->dev->netdev_ops;
--
2.7.4
The refcount_t type and corresponding API should be used
instead of atomic_t when the variable is used as a
reference counter. This avoids accidental refcounter
overflows that might lead to use-after-free situations.
Signed-off-by: Elena Reshetova <[email protected]>
Signed-off-by: Hans Liljestrand <[email protected]>
Signed-off-by: Kees Cook <[email protected]>
Signed-off-by: David Windsor <[email protected]>
---
include/linux/inetdevice.h | 11 ++++++-----
net/ipv4/devinet.c | 2 +-
2 files changed, 7 insertions(+), 6 deletions(-)
diff --git a/include/linux/inetdevice.h b/include/linux/inetdevice.h
index ee971f3..5cd9671 100644
--- a/include/linux/inetdevice.h
+++ b/include/linux/inetdevice.h
@@ -11,6 +11,7 @@
#include <linux/timer.h>
#include <linux/sysctl.h>
#include <linux/rtnetlink.h>
+#include <linux/refcount.h>
struct ipv4_devconf {
void *sysctl;
@@ -22,7 +23,7 @@ struct ipv4_devconf {
struct in_device {
struct net_device *dev;
- atomic_t refcnt;
+ refcount_t refcnt;
int dead;
struct in_ifaddr *ifa_list; /* IP ifaddr chain */
@@ -212,7 +213,7 @@ static inline struct in_device *in_dev_get(const struct net_device *dev)
rcu_read_lock();
in_dev = __in_dev_get_rcu(dev);
if (in_dev)
- atomic_inc(&in_dev->refcnt);
+ refcount_inc(&in_dev->refcnt);
rcu_read_unlock();
return in_dev;
}
@@ -233,12 +234,12 @@ void in_dev_finish_destroy(struct in_device *idev);
static inline void in_dev_put(struct in_device *idev)
{
- if (atomic_dec_and_test(&idev->refcnt))
+ if (refcount_dec_and_test(&idev->refcnt))
in_dev_finish_destroy(idev);
}
-#define __in_dev_put(idev) atomic_dec(&(idev)->refcnt)
-#define in_dev_hold(idev) atomic_inc(&(idev)->refcnt)
+#define __in_dev_put(idev) refcount_dec(&(idev)->refcnt)
+#define in_dev_hold(idev) refcount_inc(&(idev)->refcnt)
#endif /* __KERNEL__ */
diff --git a/net/ipv4/devinet.c b/net/ipv4/devinet.c
index cebedd5..1527146 100644
--- a/net/ipv4/devinet.c
+++ b/net/ipv4/devinet.c
@@ -251,7 +251,7 @@ static struct in_device *inetdev_init(struct net_device *dev)
/* Reference in_dev->dev */
dev_hold(dev);
/* Account for reference dev->ip_ptr (below) */
- in_dev_hold(in_dev);
+ refcount_set(&in_dev->refcnt, 1);
err = devinet_sysctl_register(in_dev);
if (err) {
--
2.7.4
The refcount_t type and corresponding API should be used
instead of atomic_t when the variable is used as a
reference counter. This avoids accidental refcounter
overflows that might lead to use-after-free situations.
Signed-off-by: Elena Reshetova <[email protected]>
Signed-off-by: Hans Liljestrand <[email protected]>
Signed-off-by: Kees Cook <[email protected]>
Signed-off-by: David Windsor <[email protected]>
---
include/linux/igmp.h | 3 ++-
net/ipv4/igmp.c | 10 +++++-----
2 files changed, 7 insertions(+), 6 deletions(-)
diff --git a/include/linux/igmp.h b/include/linux/igmp.h
index 12f6fba..97caf18 100644
--- a/include/linux/igmp.h
+++ b/include/linux/igmp.h
@@ -18,6 +18,7 @@
#include <linux/skbuff.h>
#include <linux/timer.h>
#include <linux/in.h>
+#include <linux/refcount.h>
#include <uapi/linux/igmp.h>
static inline struct igmphdr *igmp_hdr(const struct sk_buff *skb)
@@ -84,7 +85,7 @@ struct ip_mc_list {
struct ip_mc_list __rcu *next_hash;
struct timer_list timer;
int users;
- atomic_t refcnt;
+ refcount_t refcnt;
spinlock_t lock;
char tm_running;
char reporter;
diff --git a/net/ipv4/igmp.c b/net/ipv4/igmp.c
index 44fd86d..0ed4f7b 100644
--- a/net/ipv4/igmp.c
+++ b/net/ipv4/igmp.c
@@ -173,7 +173,7 @@ static int ip_mc_add_src(struct in_device *in_dev, __be32 *pmca, int sfmode,
static void ip_ma_put(struct ip_mc_list *im)
{
- if (atomic_dec_and_test(&im->refcnt)) {
+ if (refcount_dec_and_test(&im->refcnt)) {
in_dev_put(im->interface);
kfree_rcu(im, rcu);
}
@@ -199,7 +199,7 @@ static void igmp_stop_timer(struct ip_mc_list *im)
{
spin_lock_bh(&im->lock);
if (del_timer(&im->timer))
- atomic_dec(&im->refcnt);
+ refcount_dec(&im->refcnt);
im->tm_running = 0;
im->reporter = 0;
im->unsolicit_count = 0;
@@ -213,7 +213,7 @@ static void igmp_start_timer(struct ip_mc_list *im, int max_delay)
im->tm_running = 1;
if (!mod_timer(&im->timer, jiffies+tv+2))
- atomic_inc(&im->refcnt);
+ refcount_inc(&im->refcnt);
}
static void igmp_gq_start_timer(struct in_device *in_dev)
@@ -249,7 +249,7 @@ static void igmp_mod_timer(struct ip_mc_list *im, int max_delay)
spin_unlock_bh(&im->lock);
return;
}
- atomic_dec(&im->refcnt);
+ refcount_dec(&im->refcnt);
}
igmp_start_timer(im, max_delay);
spin_unlock_bh(&im->lock);
@@ -1373,7 +1373,7 @@ void ip_mc_inc_group(struct in_device *in_dev, __be32 addr)
/* initial mode is (EX, empty) */
im->sfmode = MCAST_EXCLUDE;
im->sfcount[MCAST_EXCLUDE] = 1;
- atomic_set(&im->refcnt, 1);
+ refcount_set(&im->refcnt, 1);
spin_lock_init(&im->lock);
#ifdef CONFIG_IP_MULTICAST
setup_timer(&im->timer, igmp_timer_expire, (unsigned long)im);
--
2.7.4
The refcount_t type and corresponding API should be used
instead of atomic_t when the variable is used as a
reference counter. This avoids accidental refcounter
overflows that might lead to use-after-free situations.
Signed-off-by: Elena Reshetova <[email protected]>
Signed-off-by: Hans Liljestrand <[email protected]>
Signed-off-by: Kees Cook <[email protected]>
Signed-off-by: David Windsor <[email protected]>
---
crypto/algif_aead.c | 2 +-
include/net/inet_hashtables.h | 4 ++--
include/net/request_sock.h | 9 +++++----
include/net/sock.h | 17 +++++++++--------
net/atm/proc.c | 2 +-
net/bluetooth/af_bluetooth.c | 2 +-
net/bluetooth/rfcomm/sock.c | 2 +-
net/core/skbuff.c | 6 +++---
net/core/sock.c | 4 ++--
net/ipv4/inet_connection_sock.c | 2 +-
net/ipv4/inet_hashtables.c | 4 ++--
net/ipv4/inet_timewait_sock.c | 8 ++++----
net/ipv4/ping.c | 4 ++--
net/ipv4/raw.c | 2 +-
net/ipv4/syncookies.c | 2 +-
net/ipv4/tcp_fastopen.c | 2 +-
net/ipv4/tcp_ipv4.c | 4 ++--
net/ipv4/udp.c | 6 +++---
net/ipv4/udp_diag.c | 4 ++--
net/ipv6/datagram.c | 2 +-
net/ipv6/inet6_hashtables.c | 4 ++--
net/ipv6/tcp_ipv6.c | 4 ++--
net/ipv6/udp.c | 2 +-
net/key/af_key.c | 2 +-
net/l2tp/l2tp_debugfs.c | 3 +--
net/llc/llc_conn.c | 8 ++++----
net/llc/llc_sap.c | 2 +-
net/netfilter/xt_TPROXY.c | 4 ++--
net/netlink/af_netlink.c | 6 +++---
net/packet/af_packet.c | 2 +-
net/phonet/socket.c | 2 +-
net/rxrpc/af_rxrpc.c | 2 +-
net/sched/em_meta.c | 2 +-
net/tipc/socket.c | 2 +-
net/unix/af_unix.c | 2 +-
35 files changed, 68 insertions(+), 67 deletions(-)
diff --git a/crypto/algif_aead.c b/crypto/algif_aead.c
index 5a80537..607380d 100644
--- a/crypto/algif_aead.c
+++ b/crypto/algif_aead.c
@@ -750,7 +750,7 @@ static void aead_sock_destruct(struct sock *sk)
unsigned int ivlen = crypto_aead_ivsize(
crypto_aead_reqtfm(&ctx->aead_req));
- WARN_ON(atomic_read(&sk->sk_refcnt) != 0);
+ WARN_ON(refcount_read(&sk->sk_refcnt) != 0);
aead_put_sgl(sk);
sock_kzfree_s(sk, ctx->iv, ivlen);
sock_kfree_s(sk, ctx, ctx->len);
diff --git a/include/net/inet_hashtables.h b/include/net/inet_hashtables.h
index 1178931..b9e6e0e 100644
--- a/include/net/inet_hashtables.h
+++ b/include/net/inet_hashtables.h
@@ -32,7 +32,7 @@
#include <net/tcp_states.h>
#include <net/netns/hash.h>
-#include <linux/atomic.h>
+#include <linux/refcount.h>
#include <asm/byteorder.h>
/* This is for all connections with a full identity, no wildcards.
@@ -334,7 +334,7 @@ static inline struct sock *inet_lookup(struct net *net,
sk = __inet_lookup(net, hashinfo, skb, doff, saddr, sport, daddr,
dport, dif, &refcounted);
- if (sk && !refcounted && !atomic_inc_not_zero(&sk->sk_refcnt))
+ if (sk && !refcounted && !refcount_inc_not_zero(&sk->sk_refcnt))
sk = NULL;
return sk;
}
diff --git a/include/net/request_sock.h b/include/net/request_sock.h
index a12a5d2..e76e8c2 100644
--- a/include/net/request_sock.h
+++ b/include/net/request_sock.h
@@ -19,6 +19,7 @@
#include <linux/spinlock.h>
#include <linux/types.h>
#include <linux/bug.h>
+#include <linux/refcount.h>
#include <net/sock.h>
@@ -89,7 +90,7 @@ reqsk_alloc(const struct request_sock_ops *ops, struct sock *sk_listener,
return NULL;
req->rsk_listener = NULL;
if (attach_listener) {
- if (unlikely(!atomic_inc_not_zero(&sk_listener->sk_refcnt))) {
+ if (unlikely(!refcount_inc_not_zero(&sk_listener->sk_refcnt))) {
kmem_cache_free(ops->slab, req);
return NULL;
}
@@ -100,7 +101,7 @@ reqsk_alloc(const struct request_sock_ops *ops, struct sock *sk_listener,
sk_node_init(&req_to_sk(req)->sk_node);
sk_tx_queue_clear(req_to_sk(req));
req->saved_syn = NULL;
- atomic_set(&req->rsk_refcnt, 0);
+ refcount_set(&req->rsk_refcnt, 0);
return req;
}
@@ -108,7 +109,7 @@ reqsk_alloc(const struct request_sock_ops *ops, struct sock *sk_listener,
static inline void reqsk_free(struct request_sock *req)
{
/* temporary debugging */
- WARN_ON_ONCE(atomic_read(&req->rsk_refcnt) != 0);
+ WARN_ON_ONCE(refcount_read(&req->rsk_refcnt) != 0);
req->rsk_ops->destructor(req);
if (req->rsk_listener)
@@ -119,7 +120,7 @@ static inline void reqsk_free(struct request_sock *req)
static inline void reqsk_put(struct request_sock *req)
{
- if (atomic_dec_and_test(&req->rsk_refcnt))
+ if (refcount_dec_and_test(&req->rsk_refcnt))
reqsk_free(req);
}
diff --git a/include/net/sock.h b/include/net/sock.h
index 24dcdba..c4f2b6c 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -66,6 +66,7 @@
#include <linux/poll.h>
#include <linux/atomic.h>
+#include <linux/refcount.h>
#include <net/dst.h>
#include <net/checksum.h>
#include <net/tcp_states.h>
@@ -219,7 +220,7 @@ struct sock_common {
u32 skc_tw_rcv_nxt; /* struct tcp_timewait_sock */
};
- atomic_t skc_refcnt;
+ refcount_t skc_refcnt;
/* private: */
int skc_dontcopy_end[0];
union {
@@ -602,7 +603,7 @@ static inline bool __sk_del_node_init(struct sock *sk)
static __always_inline void sock_hold(struct sock *sk)
{
- atomic_inc(&sk->sk_refcnt);
+ refcount_inc(&sk->sk_refcnt);
}
/* Ungrab socket in the context, which assumes that socket refcnt
@@ -610,7 +611,7 @@ static __always_inline void sock_hold(struct sock *sk)
*/
static __always_inline void __sock_put(struct sock *sk)
{
- atomic_dec(&sk->sk_refcnt);
+ refcount_dec(&sk->sk_refcnt);
}
static inline bool sk_del_node_init(struct sock *sk)
@@ -619,7 +620,7 @@ static inline bool sk_del_node_init(struct sock *sk)
if (rc) {
/* paranoid for a while -acme */
- WARN_ON(atomic_read(&sk->sk_refcnt) == 1);
+ WARN_ON(refcount_read(&sk->sk_refcnt) == 1);
__sock_put(sk);
}
return rc;
@@ -641,7 +642,7 @@ static inline bool sk_nulls_del_node_init_rcu(struct sock *sk)
if (rc) {
/* paranoid for a while -acme */
- WARN_ON(atomic_read(&sk->sk_refcnt) == 1);
+ WARN_ON(refcount_read(&sk->sk_refcnt) == 1);
__sock_put(sk);
}
return rc;
@@ -1130,9 +1131,9 @@ static inline void sk_refcnt_debug_dec(struct sock *sk)
static inline void sk_refcnt_debug_release(const struct sock *sk)
{
- if (atomic_read(&sk->sk_refcnt) != 1)
+ if (refcount_read(&sk->sk_refcnt) != 1)
printk(KERN_DEBUG "Destruction of the %s socket %p delayed, refcnt=%d\n",
- sk->sk_prot->name, sk, atomic_read(&sk->sk_refcnt));
+ sk->sk_prot->name, sk, refcount_read(&sk->sk_refcnt));
}
#else /* SOCK_REFCNT_DEBUG */
#define sk_refcnt_debug_inc(sk) do { } while (0)
@@ -1641,7 +1642,7 @@ void sock_init_data(struct socket *sock, struct sock *sk);
/* Ungrab socket and destroy it, if it was the last reference. */
static inline void sock_put(struct sock *sk)
{
- if (atomic_dec_and_test(&sk->sk_refcnt))
+ if (refcount_dec_and_test(&sk->sk_refcnt))
sk_free(sk);
}
/* Generic version of sock_put(), dealing with all sockets
diff --git a/net/atm/proc.c b/net/atm/proc.c
index bbb6461..27c9c01 100644
--- a/net/atm/proc.c
+++ b/net/atm/proc.c
@@ -211,7 +211,7 @@ static void vcc_info(struct seq_file *seq, struct atm_vcc *vcc)
vcc->flags, sk->sk_err,
sk_wmem_alloc_get(sk), sk->sk_sndbuf,
sk_rmem_alloc_get(sk), sk->sk_rcvbuf,
- atomic_read(&sk->sk_refcnt));
+ refcount_read(&sk->sk_refcnt));
}
static void svc_info(struct seq_file *seq, struct atm_vcc *vcc)
diff --git a/net/bluetooth/af_bluetooth.c b/net/bluetooth/af_bluetooth.c
index 69e1f7d..dbc00c2 100644
--- a/net/bluetooth/af_bluetooth.c
+++ b/net/bluetooth/af_bluetooth.c
@@ -631,7 +631,7 @@ static int bt_seq_show(struct seq_file *seq, void *v)
seq_printf(seq,
"%pK %-6d %-6u %-6u %-6u %-6lu %-6lu",
sk,
- atomic_read(&sk->sk_refcnt),
+ refcount_read(&sk->sk_refcnt),
sk_rmem_alloc_get(sk),
sk_wmem_alloc_get(sk),
from_kuid(seq_user_ns(seq), sock_i_uid(sk)),
diff --git a/net/bluetooth/rfcomm/sock.c b/net/bluetooth/rfcomm/sock.c
index aa1a814..951be13 100644
--- a/net/bluetooth/rfcomm/sock.c
+++ b/net/bluetooth/rfcomm/sock.c
@@ -197,7 +197,7 @@ static void rfcomm_sock_kill(struct sock *sk)
if (!sock_flag(sk, SOCK_ZAPPED) || sk->sk_socket)
return;
- BT_DBG("sk %p state %d refcnt %d", sk, sk->sk_state, atomic_read(&sk->sk_refcnt));
+ BT_DBG("sk %p state %d refcnt %d", sk, sk->sk_state, refcount_read(&sk->sk_refcnt));
/* Kill poor orphan */
bt_sock_unlink(&rfcomm_sk_list, sk);
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index e81a27f..7927fa0 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -3763,7 +3763,7 @@ struct sk_buff *skb_clone_sk(struct sk_buff *skb)
struct sock *sk = skb->sk;
struct sk_buff *clone;
- if (!sk || !atomic_inc_not_zero(&sk->sk_refcnt))
+ if (!sk || !refcount_inc_not_zero(&sk->sk_refcnt))
return NULL;
clone = skb_clone(skb, GFP_ATOMIC);
@@ -3829,7 +3829,7 @@ void skb_complete_tx_timestamp(struct sk_buff *skb,
/* Take a reference to prevent skb_orphan() from freeing the socket,
* but only if the socket refcount is not zero.
*/
- if (likely(atomic_inc_not_zero(&sk->sk_refcnt))) {
+ if (likely(refcount_inc_not_zero(&sk->sk_refcnt))) {
*skb_hwtstamps(skb) = *hwtstamps;
__skb_complete_tx_timestamp(skb, sk, SCM_TSTAMP_SND);
sock_put(sk);
@@ -3905,7 +3905,7 @@ void skb_complete_wifi_ack(struct sk_buff *skb, bool acked)
/* Take a reference to prevent skb_orphan() from freeing the socket,
* but only if the socket refcount is not zero.
*/
- if (likely(atomic_inc_not_zero(&sk->sk_refcnt))) {
+ if (likely(refcount_inc_not_zero(&sk->sk_refcnt))) {
err = sock_queue_err_skb(sk, skb);
sock_put(sk);
}
diff --git a/net/core/sock.c b/net/core/sock.c
index e830ddc..e4c52af 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -1559,7 +1559,7 @@ struct sock *sk_clone_lock(const struct sock *sk, const gfp_t priority)
* (Documentation/RCU/rculist_nulls.txt for details)
*/
smp_wmb();
- atomic_set(&newsk->sk_refcnt, 2);
+ refcount_set(&newsk->sk_refcnt, 2);
/*
* Increment the counter in the same struct proto as the master
@@ -2517,7 +2517,7 @@ void sock_init_data(struct socket *sock, struct sock *sk)
* (Documentation/RCU/rculist_nulls.txt for details)
*/
smp_wmb();
- atomic_set(&sk->sk_refcnt, 1);
+ refcount_set(&sk->sk_refcnt, 1);
atomic_set(&sk->sk_drops, 0);
}
EXPORT_SYMBOL(sock_init_data);
diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
index b4d5980..f01adaf 100644
--- a/net/ipv4/inet_connection_sock.c
+++ b/net/ipv4/inet_connection_sock.c
@@ -755,7 +755,7 @@ static void reqsk_queue_hash_req(struct request_sock *req,
* are committed to memory and refcnt initialized.
*/
smp_wmb();
- atomic_set(&req->rsk_refcnt, 2 + 1);
+ refcount_set(&req->rsk_refcnt, 2 + 1);
}
void inet_csk_reqsk_queue_hash_add(struct sock *sk, struct request_sock *req,
diff --git a/net/ipv4/inet_hashtables.c b/net/ipv4/inet_hashtables.c
index e9a59d2..a4be2c1 100644
--- a/net/ipv4/inet_hashtables.c
+++ b/net/ipv4/inet_hashtables.c
@@ -246,7 +246,7 @@ EXPORT_SYMBOL_GPL(__inet_lookup_listener);
/* All sockets share common refcount, but have different destructors */
void sock_gen_put(struct sock *sk)
{
- if (!atomic_dec_and_test(&sk->sk_refcnt))
+ if (!refcount_dec_and_test(&sk->sk_refcnt))
return;
if (sk->sk_state == TCP_TIME_WAIT)
@@ -287,7 +287,7 @@ struct sock *__inet_lookup_established(struct net *net,
continue;
if (likely(INET_MATCH(sk, net, acookie,
saddr, daddr, ports, dif))) {
- if (unlikely(!atomic_inc_not_zero(&sk->sk_refcnt)))
+ if (unlikely(!refcount_inc_not_zero(&sk->sk_refcnt)))
goto out;
if (unlikely(!INET_MATCH(sk, net, acookie,
saddr, daddr, ports, dif))) {
diff --git a/net/ipv4/inet_timewait_sock.c b/net/ipv4/inet_timewait_sock.c
index f8aff2c..5b03915 100644
--- a/net/ipv4/inet_timewait_sock.c
+++ b/net/ipv4/inet_timewait_sock.c
@@ -76,7 +76,7 @@ void inet_twsk_free(struct inet_timewait_sock *tw)
void inet_twsk_put(struct inet_timewait_sock *tw)
{
- if (atomic_dec_and_test(&tw->tw_refcnt))
+ if (refcount_dec_and_test(&tw->tw_refcnt))
inet_twsk_free(tw);
}
EXPORT_SYMBOL_GPL(inet_twsk_put);
@@ -131,7 +131,7 @@ void __inet_twsk_hashdance(struct inet_timewait_sock *tw, struct sock *sk,
* We can use atomic_set() because prior spin_lock()/spin_unlock()
* committed into memory all tw fields.
*/
- atomic_set(&tw->tw_refcnt, 4);
+ refcount_set(&tw->tw_refcnt, 4);
inet_twsk_add_node_rcu(tw, &ehead->chain);
/* Step 3: Remove SK from hash chain */
@@ -195,7 +195,7 @@ struct inet_timewait_sock *inet_twsk_alloc(const struct sock *sk,
* to a non null value before everything is setup for this
* timewait socket.
*/
- atomic_set(&tw->tw_refcnt, 0);
+ refcount_set(&tw->tw_refcnt, 0);
__module_get(tw->tw_prot->owner);
}
@@ -278,7 +278,7 @@ void inet_twsk_purge(struct inet_hashinfo *hashinfo, int family)
atomic_read(&twsk_net(tw)->count))
continue;
- if (unlikely(!atomic_inc_not_zero(&tw->tw_refcnt)))
+ if (unlikely(!refcount_inc_not_zero(&tw->tw_refcnt)))
continue;
if (unlikely((tw->tw_family != family) ||
diff --git a/net/ipv4/ping.c b/net/ipv4/ping.c
index 2af6244..c5434b9 100644
--- a/net/ipv4/ping.c
+++ b/net/ipv4/ping.c
@@ -289,7 +289,7 @@ void ping_close(struct sock *sk, long timeout)
{
pr_debug("ping_close(sk=%p,sk->num=%u)\n",
inet_sk(sk), inet_sk(sk)->inet_num);
- pr_debug("isk->refcnt = %d\n", sk->sk_refcnt.counter);
+ pr_debug("isk->refcnt = %d\n", refcount_read(&sk->sk_refcnt));
sk_common_release(sk);
}
@@ -1126,7 +1126,7 @@ static void ping_v4_format_sock(struct sock *sp, struct seq_file *f,
0, 0L, 0,
from_kuid_munged(seq_user_ns(f), sock_i_uid(sp)),
0, sock_i_ino(sp),
- atomic_read(&sp->sk_refcnt), sp,
+ refcount_read(&sp->sk_refcnt), sp,
atomic_read(&sp->sk_drops));
}
diff --git a/net/ipv4/raw.c b/net/ipv4/raw.c
index 8119e1f..7ab99f0 100644
--- a/net/ipv4/raw.c
+++ b/net/ipv4/raw.c
@@ -1058,7 +1058,7 @@ static void raw_sock_seq_show(struct seq_file *seq, struct sock *sp, int i)
0, 0L, 0,
from_kuid_munged(seq_user_ns(seq), sock_i_uid(sp)),
0, sock_i_ino(sp),
- atomic_read(&sp->sk_refcnt), sp, atomic_read(&sp->sk_drops));
+ refcount_read(&sp->sk_refcnt), sp, atomic_read(&sp->sk_drops));
}
static int raw_seq_show(struct seq_file *seq, void *v)
diff --git a/net/ipv4/syncookies.c b/net/ipv4/syncookies.c
index 496b97e..42d2426 100644
--- a/net/ipv4/syncookies.c
+++ b/net/ipv4/syncookies.c
@@ -212,7 +212,7 @@ struct sock *tcp_get_cookie_sock(struct sock *sk, struct sk_buff *skb,
child = icsk->icsk_af_ops->syn_recv_sock(sk, skb, req, dst,
NULL, &own_req);
if (child) {
- atomic_set(&req->rsk_refcnt, 1);
+ refcount_set(&req->rsk_refcnt, 1);
sock_rps_save_rxhash(child, skb);
inet_csk_reqsk_queue_add(sk, req, child);
} else {
diff --git a/net/ipv4/tcp_fastopen.c b/net/ipv4/tcp_fastopen.c
index 8ea4e97..d527795 100644
--- a/net/ipv4/tcp_fastopen.c
+++ b/net/ipv4/tcp_fastopen.c
@@ -214,7 +214,7 @@ static struct sock *tcp_fastopen_create_child(struct sock *sk,
inet_csk_reset_xmit_timer(child, ICSK_TIME_RETRANS,
TCP_TIMEOUT_INIT, TCP_RTO_MAX);
- atomic_set(&req->rsk_refcnt, 2);
+ refcount_set(&req->rsk_refcnt, 2);
/* Now finish processing the fastopen child socket. */
inet_csk(child)->icsk_af_ops->rebuild_header(child);
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c
index 69f2d20..bb5e4c4 100644
--- a/net/ipv4/tcp_ipv4.c
+++ b/net/ipv4/tcp_ipv4.c
@@ -2269,7 +2269,7 @@ static void get_tcp4_sock(struct sock *sk, struct seq_file *f, int i)
from_kuid_munged(seq_user_ns(f), sock_i_uid(sk)),
icsk->icsk_probes_out,
sock_i_ino(sk),
- atomic_read(&sk->sk_refcnt), sk,
+ refcount_read(&sk->sk_refcnt), sk,
jiffies_to_clock_t(icsk->icsk_rto),
jiffies_to_clock_t(icsk->icsk_ack.ato),
(icsk->icsk_ack.quick << 1) | icsk->icsk_ack.pingpong,
@@ -2295,7 +2295,7 @@ static void get_timewait4_sock(const struct inet_timewait_sock *tw,
" %02X %08X:%08X %02X:%08lX %08X %5d %8d %d %d %pK",
i, src, srcp, dest, destp, tw->tw_substate, 0, 0,
3, jiffies_delta_to_clock_t(delta), 0, 0, 0, 0,
- atomic_read(&tw->tw_refcnt), tw);
+ refcount_read(&tw->tw_refcnt), tw);
}
#define TMPSZ 150
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index ea6e4cf..c241d6df 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -577,7 +577,7 @@ struct sock *udp4_lib_lookup(struct net *net, __be32 saddr, __be16 sport,
sk = __udp4_lib_lookup(net, saddr, sport, daddr, dport,
dif, &udp_table, NULL);
- if (sk && !atomic_inc_not_zero(&sk->sk_refcnt))
+ if (sk && !refcount_inc_not_zero(&sk->sk_refcnt))
sk = NULL;
return sk;
}
@@ -2082,7 +2082,7 @@ void udp_v4_early_demux(struct sk_buff *skb)
uh->source, iph->saddr, dif);
}
- if (!sk || !atomic_inc_not_zero_hint(&sk->sk_refcnt, 2))
+ if (!sk || !refcount_inc_not_zero(&sk->sk_refcnt))
return;
skb->sk = sk;
@@ -2530,7 +2530,7 @@ static void udp4_format_sock(struct sock *sp, struct seq_file *f,
0, 0L, 0,
from_kuid_munged(seq_user_ns(f), sock_i_uid(sp)),
0, sock_i_ino(sp),
- atomic_read(&sp->sk_refcnt), sp,
+ refcount_read(&sp->sk_refcnt), sp,
atomic_read(&sp->sk_drops));
}
diff --git a/net/ipv4/udp_diag.c b/net/ipv4/udp_diag.c
index 9a89c10..4515836 100644
--- a/net/ipv4/udp_diag.c
+++ b/net/ipv4/udp_diag.c
@@ -55,7 +55,7 @@ static int udp_dump_one(struct udp_table *tbl, struct sk_buff *in_skb,
req->id.idiag_dport,
req->id.idiag_if, tbl, NULL);
#endif
- if (sk && !atomic_inc_not_zero(&sk->sk_refcnt))
+ if (sk && !refcount_inc_not_zero(&sk->sk_refcnt))
sk = NULL;
rcu_read_unlock();
err = -ENOENT;
@@ -206,7 +206,7 @@ static int __udp_diag_destroy(struct sk_buff *in_skb,
return -EINVAL;
}
- if (sk && !atomic_inc_not_zero(&sk->sk_refcnt))
+ if (sk && !refcount_inc_not_zero(&sk->sk_refcnt))
sk = NULL;
rcu_read_unlock();
diff --git a/net/ipv6/datagram.c b/net/ipv6/datagram.c
index eec27f8..a15c9a6 100644
--- a/net/ipv6/datagram.c
+++ b/net/ipv6/datagram.c
@@ -1043,6 +1043,6 @@ void ip6_dgram_sock_seq_show(struct seq_file *seq, struct sock *sp,
from_kuid_munged(seq_user_ns(seq), sock_i_uid(sp)),
0,
sock_i_ino(sp),
- atomic_read(&sp->sk_refcnt), sp,
+ refcount_read(&sp->sk_refcnt), sp,
atomic_read(&sp->sk_drops));
}
diff --git a/net/ipv6/inet6_hashtables.c b/net/ipv6/inet6_hashtables.c
index d090091..b13b8f9 100644
--- a/net/ipv6/inet6_hashtables.c
+++ b/net/ipv6/inet6_hashtables.c
@@ -75,7 +75,7 @@ struct sock *__inet6_lookup_established(struct net *net,
continue;
if (!INET6_MATCH(sk, net, saddr, daddr, ports, dif))
continue;
- if (unlikely(!atomic_inc_not_zero(&sk->sk_refcnt)))
+ if (unlikely(!refcount_inc_not_zero(&sk->sk_refcnt)))
goto out;
if (unlikely(!INET6_MATCH(sk, net, saddr, daddr, ports, dif))) {
@@ -172,7 +172,7 @@ struct sock *inet6_lookup(struct net *net, struct inet_hashinfo *hashinfo,
sk = __inet6_lookup(net, hashinfo, skb, doff, saddr, sport, daddr,
ntohs(dport), dif, &refcounted);
- if (sk && !refcounted && !atomic_inc_not_zero(&sk->sk_refcnt))
+ if (sk && !refcounted && !refcount_inc_not_zero(&sk->sk_refcnt))
sk = NULL;
return sk;
}
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 21bb2fc..1a55d49 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -1796,7 +1796,7 @@ static void get_tcp6_sock(struct seq_file *seq, struct sock *sp, int i)
from_kuid_munged(seq_user_ns(seq), sock_i_uid(sp)),
icsk->icsk_probes_out,
sock_i_ino(sp),
- atomic_read(&sp->sk_refcnt), sp,
+ refcount_read(&sp->sk_refcnt), sp,
jiffies_to_clock_t(icsk->icsk_rto),
jiffies_to_clock_t(icsk->icsk_ack.ato),
(icsk->icsk_ack.quick << 1) | icsk->icsk_ack.pingpong,
@@ -1829,7 +1829,7 @@ static void get_timewait6_sock(struct seq_file *seq,
dest->s6_addr32[2], dest->s6_addr32[3], destp,
tw->tw_substate, 0, 0,
3, jiffies_delta_to_clock_t(delta), 0, 0, 0, 0,
- atomic_read(&tw->tw_refcnt), tw);
+ refcount_read(&tw->tw_refcnt), tw);
}
static int tcp6_seq_show(struct seq_file *seq, void *v)
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index 4e4c401..deff3a6 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -324,7 +324,7 @@ struct sock *udp6_lib_lookup(struct net *net, const struct in6_addr *saddr, __be
sk = __udp6_lib_lookup(net, saddr, sport, daddr, dport,
dif, &udp_table, NULL);
- if (sk && !atomic_inc_not_zero(&sk->sk_refcnt))
+ if (sk && !refcount_inc_not_zero(&sk->sk_refcnt))
sk = NULL;
return sk;
}
diff --git a/net/key/af_key.c b/net/key/af_key.c
index ba00bd3..30ec662 100644
--- a/net/key/af_key.c
+++ b/net/key/af_key.c
@@ -3711,7 +3711,7 @@ static int pfkey_seq_show(struct seq_file *f, void *v)
else
seq_printf(f, "%pK %-6d %-6u %-6u %-6u %-6lu\n",
s,
- atomic_read(&s->sk_refcnt),
+ refcount_read(&s->sk_refcnt),
sk_rmem_alloc_get(s),
sk_wmem_alloc_get(s),
from_kuid_munged(seq_user_ns(f), sock_i_uid(s)),
diff --git a/net/l2tp/l2tp_debugfs.c b/net/l2tp/l2tp_debugfs.c
index 2d6760a..c87689d 100644
--- a/net/l2tp/l2tp_debugfs.c
+++ b/net/l2tp/l2tp_debugfs.c
@@ -144,9 +144,8 @@ static void l2tp_dfs_seq_tunnel_show(struct seq_file *m, void *v)
tunnel->encap == L2TP_ENCAPTYPE_IP ? "IP" :
"");
seq_printf(m, " %d sessions, refcnt %d/%d\n", session_count,
- tunnel->sock ? atomic_read(&tunnel->sock->sk_refcnt) : 0,
+ tunnel->sock ? refcount_read(&tunnel->sock->sk_refcnt) : 0,
atomic_read(&tunnel->ref_count));
-
seq_printf(m, " %08x rx %ld/%ld/%ld rx %ld/%ld/%ld\n",
tunnel->debug,
atomic_long_read(&tunnel->stats.tx_packets),
diff --git a/net/llc/llc_conn.c b/net/llc/llc_conn.c
index 9b02c13..5e91b47 100644
--- a/net/llc/llc_conn.c
+++ b/net/llc/llc_conn.c
@@ -507,7 +507,7 @@ static struct sock *__llc_lookup_established(struct llc_sap *sap,
sk_nulls_for_each_rcu(rc, node, laddr_hb) {
if (llc_estab_match(sap, daddr, laddr, rc)) {
/* Extra checks required by SLAB_TYPESAFE_BY_RCU */
- if (unlikely(!atomic_inc_not_zero(&rc->sk_refcnt)))
+ if (unlikely(!refcount_inc_not_zero(&rc->sk_refcnt)))
goto again;
if (unlikely(llc_sk(rc)->sap != sap ||
!llc_estab_match(sap, daddr, laddr, rc))) {
@@ -566,7 +566,7 @@ static struct sock *__llc_lookup_listener(struct llc_sap *sap,
sk_nulls_for_each_rcu(rc, node, laddr_hb) {
if (llc_listener_match(sap, laddr, rc)) {
/* Extra checks required by SLAB_TYPESAFE_BY_RCU */
- if (unlikely(!atomic_inc_not_zero(&rc->sk_refcnt)))
+ if (unlikely(!refcount_inc_not_zero(&rc->sk_refcnt)))
goto again;
if (unlikely(llc_sk(rc)->sap != sap ||
!llc_listener_match(sap, laddr, rc))) {
@@ -973,9 +973,9 @@ void llc_sk_free(struct sock *sk)
skb_queue_purge(&sk->sk_write_queue);
skb_queue_purge(&llc->pdu_unack_q);
#ifdef LLC_REFCNT_DEBUG
- if (atomic_read(&sk->sk_refcnt) != 1) {
+ if (refcount_read(&sk->sk_refcnt) != 1) {
printk(KERN_DEBUG "Destruction of LLC sock %p delayed in %s, cnt=%d\n",
- sk, __func__, atomic_read(&sk->sk_refcnt));
+ sk, __func__, refcount_read(&sk->sk_refcnt));
printk(KERN_DEBUG "%d LLC sockets are still alive\n",
atomic_read(&llc_sock_nr));
} else {
diff --git a/net/llc/llc_sap.c b/net/llc/llc_sap.c
index 63b6ab0..d90928f 100644
--- a/net/llc/llc_sap.c
+++ b/net/llc/llc_sap.c
@@ -329,7 +329,7 @@ static struct sock *llc_lookup_dgram(struct llc_sap *sap,
sk_nulls_for_each_rcu(rc, node, laddr_hb) {
if (llc_dgram_match(sap, laddr, rc)) {
/* Extra checks required by SLAB_TYPESAFE_BY_RCU */
- if (unlikely(!atomic_inc_not_zero(&rc->sk_refcnt)))
+ if (unlikely(!refcount_inc_not_zero(&rc->sk_refcnt)))
goto again;
if (unlikely(llc_sk(rc)->sap != sap ||
!llc_dgram_match(sap, laddr, rc))) {
diff --git a/net/netfilter/xt_TPROXY.c b/net/netfilter/xt_TPROXY.c
index 80cb7ba..d51f1e8 100644
--- a/net/netfilter/xt_TPROXY.c
+++ b/net/netfilter/xt_TPROXY.c
@@ -127,7 +127,7 @@ nf_tproxy_get_sock_v4(struct net *net, struct sk_buff *skb, void *hp,
daddr, dport,
in->ifindex);
- if (sk && !atomic_inc_not_zero(&sk->sk_refcnt))
+ if (sk && !refcount_inc_not_zero(&sk->sk_refcnt))
sk = NULL;
/* NOTE: we return listeners even if bound to
* 0.0.0.0, those are filtered out in
@@ -197,7 +197,7 @@ nf_tproxy_get_sock_v6(struct net *net, struct sk_buff *skb, int thoff, void *hp,
daddr, ntohs(dport),
in->ifindex);
- if (sk && !atomic_inc_not_zero(&sk->sk_refcnt))
+ if (sk && !refcount_inc_not_zero(&sk->sk_refcnt))
sk = NULL;
/* NOTE: we return listeners even if bound to
* 0.0.0.0, those are filtered out in
diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index 9332b24..e3c9a6a 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -544,7 +544,7 @@ static void netlink_remove(struct sock *sk)
table = &nl_table[sk->sk_protocol];
if (!rhashtable_remove_fast(&table->hash, &nlk_sk(sk)->node,
netlink_rhashtable_params)) {
- WARN_ON(atomic_read(&sk->sk_refcnt) == 1);
+ WARN_ON(refcount_read(&sk->sk_refcnt) == 1);
__sock_put(sk);
}
@@ -657,7 +657,7 @@ static void deferred_put_nlk_sk(struct rcu_head *head)
struct netlink_sock *nlk = container_of(head, struct netlink_sock, rcu);
struct sock *sk = &nlk->sk;
- if (!atomic_dec_and_test(&sk->sk_refcnt))
+ if (!refcount_dec_and_test(&sk->sk_refcnt))
return;
if (nlk->cb_running && nlk->cb.done) {
@@ -2469,7 +2469,7 @@ static int netlink_seq_show(struct seq_file *seq, void *v)
sk_rmem_alloc_get(s),
sk_wmem_alloc_get(s),
nlk->cb_running,
- atomic_read(&s->sk_refcnt),
+ refcount_read(&s->sk_refcnt),
atomic_read(&s->sk_drops),
sock_i_ino(s)
);
diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index 82eb052..ad5e5dc 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -4448,7 +4448,7 @@ static int packet_seq_show(struct seq_file *seq, void *v)
seq_printf(seq,
"%pK %-6d %-4d %04x %-5d %1d %-6u %-6u %-6lu\n",
s,
- atomic_read(&s->sk_refcnt),
+ refcount_read(&s->sk_refcnt),
s->sk_type,
ntohs(po->num),
po->ifindex,
diff --git a/net/phonet/socket.c b/net/phonet/socket.c
index 27b0b13..6b7aa30 100644
--- a/net/phonet/socket.c
+++ b/net/phonet/socket.c
@@ -614,7 +614,7 @@ static int pn_sock_seq_show(struct seq_file *seq, void *v)
sk_wmem_alloc_get(sk), sk_rmem_alloc_get(sk),
from_kuid_munged(seq_user_ns(seq), sock_i_uid(sk)),
sock_i_ino(sk),
- atomic_read(&sk->sk_refcnt), sk,
+ refcount_read(&sk->sk_refcnt), sk,
atomic_read(&sk->sk_drops));
}
seq_pad(seq, '\n');
diff --git a/net/rxrpc/af_rxrpc.c b/net/rxrpc/af_rxrpc.c
index b473ac2..e33e007 100644
--- a/net/rxrpc/af_rxrpc.c
+++ b/net/rxrpc/af_rxrpc.c
@@ -682,7 +682,7 @@ static int rxrpc_release_sock(struct sock *sk)
{
struct rxrpc_sock *rx = rxrpc_sk(sk);
- _enter("%p{%d,%d}", sk, sk->sk_state, atomic_read(&sk->sk_refcnt));
+ _enter("%p{%d,%d}", sk, sk->sk_state, refcount_read(&sk->sk_refcnt));
/* declare the socket closed for business */
sock_orphan(sk);
diff --git a/net/sched/em_meta.c b/net/sched/em_meta.c
index ae7e4f5..b1e7838 100644
--- a/net/sched/em_meta.c
+++ b/net/sched/em_meta.c
@@ -340,7 +340,7 @@ META_COLLECTOR(int_sk_refcnt)
*err = -1;
return;
}
- dst->value = atomic_read(&skb->sk->sk_refcnt);
+ dst->value = refcount_read(&skb->sk->sk_refcnt);
}
META_COLLECTOR(int_sk_rcvbuf)
diff --git a/net/tipc/socket.c b/net/tipc/socket.c
index 43e4045..6991dbe 100644
--- a/net/tipc/socket.c
+++ b/net/tipc/socket.c
@@ -2307,7 +2307,7 @@ static void tipc_sk_remove(struct tipc_sock *tsk)
struct tipc_net *tn = net_generic(sock_net(sk), tipc_net_id);
if (!rhashtable_remove_fast(&tn->sk_rht, &tsk->node, tsk_rht_params)) {
- WARN_ON(atomic_read(&sk->sk_refcnt) == 1);
+ WARN_ON(refcount_read(&sk->sk_refcnt) == 1);
__sock_put(sk);
}
}
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index a74339e..e9f8102 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -2750,7 +2750,7 @@ static int unix_seq_show(struct seq_file *seq, void *v)
seq_printf(seq, "%pK: %08X %08X %08X %04X %02X %5lu",
s,
- atomic_read(&s->sk_refcnt),
+ refcount_read(&s->sk_refcnt),
0,
s->sk_state == TCP_LISTEN ? __SO_ACCEPTCON : 0,
s->sk_type,
--
2.7.4
The refcount_t type and the corresponding API should be used instead of
atomic_t when the variable is used as a reference counter. This helps
avoid accidental refcounter overflows that might lead to use-after-free
situations.
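For illustration only (not part of the patch): a minimal sketch of the
sk_wmem_alloc pattern after conversion, assuming the stock
<linux/refcount.h> helpers. The demo_* names are hypothetical; the counter
keeps one "anchor" reference plus the truesize of every queued tx skb, and
the refcount_* operations saturate instead of silently wrapping.
#include <linux/refcount.h>
struct demo_sock {
	refcount_t wmem_alloc;	/* 1 anchor ref + bytes of queued tx data */
};
static void demo_sock_init(struct demo_sock *dsk)
{
	refcount_set(&dsk->wmem_alloc, 1);	/* dropped only at final free */
}
static void demo_charge(struct demo_sock *dsk, unsigned int truesize)
{
	refcount_add(truesize, &dsk->wmem_alloc);	/* saturates, never wraps */
}
/* Returns true only when the anchor reference itself has been dropped. */
static bool demo_uncharge(struct demo_sock *dsk, unsigned int truesize)
{
	return refcount_sub_and_test(truesize, &dsk->wmem_alloc);
}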
Signed-off-by: Elena Reshetova <[email protected]>
Signed-off-by: Hans Liljestrand <[email protected]>
Signed-off-by: Kees Cook <[email protected]>
Signed-off-by: David Windsor <[email protected]>
---
drivers/atm/fore200e.c | 12 +-----------
drivers/atm/he.c | 2 +-
drivers/atm/idt77252.c | 4 ++--
include/linux/atmdev.h | 2 +-
include/net/sock.h | 8 ++++----
net/atm/br2684.c | 2 +-
net/atm/clip.c | 2 +-
net/atm/common.c | 10 +++++-----
net/atm/lec.c | 4 ++--
net/atm/mpc.c | 4 ++--
net/atm/pppoatm.c | 2 +-
net/atm/raw.c | 2 +-
net/atm/signaling.c | 2 +-
net/caif/caif_socket.c | 2 +-
net/core/datagram.c | 2 +-
net/core/skbuff.c | 2 +-
net/core/sock.c | 26 +++++++++++++-------------
net/ipv4/af_inet.c | 2 +-
net/ipv4/esp4.c | 2 +-
net/ipv4/ip_output.c | 6 +++---
net/ipv4/tcp.c | 4 ++--
net/ipv4/tcp_offload.c | 2 +-
net/ipv4/tcp_output.c | 13 ++++++-------
net/ipv6/esp6.c | 2 +-
net/ipv6/ip6_output.c | 4 ++--
net/kcm/kcmproc.c | 2 +-
net/key/af_key.c | 2 +-
net/netlink/af_netlink.c | 2 +-
net/packet/af_packet.c | 4 ++--
net/phonet/socket.c | 2 +-
net/rds/tcp_send.c | 2 +-
net/rxrpc/af_rxrpc.c | 4 ++--
net/sched/sch_atm.c | 2 +-
net/sctp/output.c | 2 +-
net/sctp/proc.c | 2 +-
net/sctp/socket.c | 4 ++--
net/unix/af_unix.c | 6 +++---
37 files changed, 73 insertions(+), 84 deletions(-)
diff --git a/drivers/atm/fore200e.c b/drivers/atm/fore200e.c
index 637c3e6..b770d18 100644
--- a/drivers/atm/fore200e.c
+++ b/drivers/atm/fore200e.c
@@ -924,12 +924,7 @@ fore200e_tx_irq(struct fore200e* fore200e)
else {
dev_kfree_skb_any(entry->skb);
}
-#if 1
- /* race fixed by the above incarnation mechanism, but... */
- if (atomic_read(&sk_atm(vcc)->sk_wmem_alloc) < 0) {
- atomic_set(&sk_atm(vcc)->sk_wmem_alloc, 0);
- }
-#endif
+
/* check error condition */
if (*entry->status & STATUS_ERROR)
atomic_inc(&vcc->stats->tx_err);
@@ -1130,13 +1125,9 @@ fore200e_push_rpd(struct fore200e* fore200e, struct atm_vcc* vcc, struct rpd* rp
return -ENOMEM;
}
- ASSERT(atomic_read(&sk_atm(vcc)->sk_wmem_alloc) >= 0);
-
vcc->push(vcc, skb);
atomic_inc(&vcc->stats->rx);
- ASSERT(atomic_read(&sk_atm(vcc)->sk_wmem_alloc) >= 0);
-
return 0;
}
@@ -1572,7 +1563,6 @@ fore200e_send(struct atm_vcc *vcc, struct sk_buff *skb)
unsigned long flags;
ASSERT(vcc);
- ASSERT(atomic_read(&sk_atm(vcc)->sk_wmem_alloc) >= 0);
ASSERT(fore200e);
ASSERT(fore200e_vcc);
diff --git a/drivers/atm/he.c b/drivers/atm/he.c
index 3617659..fc1bbdb 100644
--- a/drivers/atm/he.c
+++ b/drivers/atm/he.c
@@ -2395,7 +2395,7 @@ he_close(struct atm_vcc *vcc)
* TBRQ, the host issues the close command to the adapter.
*/
- while (((tx_inuse = atomic_read(&sk_atm(vcc)->sk_wmem_alloc)) > 1) &&
+ while (((tx_inuse = refcount_read(&sk_atm(vcc)->sk_wmem_alloc)) > 1) &&
(retry < MAX_RETRY)) {
msleep(sleep);
if (sleep < 250)
diff --git a/drivers/atm/idt77252.c b/drivers/atm/idt77252.c
index 5ec1095..20eda87 100644
--- a/drivers/atm/idt77252.c
+++ b/drivers/atm/idt77252.c
@@ -724,7 +724,7 @@ push_on_scq(struct idt77252_dev *card, struct vc_map *vc, struct sk_buff *skb)
struct sock *sk = sk_atm(vcc);
vc->estimator->cells += (skb->len + 47) / 48;
- if (atomic_read(&sk->sk_wmem_alloc) >
+ if (refcount_read(&sk->sk_wmem_alloc) >
(sk->sk_sndbuf >> 1)) {
u32 cps = vc->estimator->maxcps;
@@ -2012,7 +2012,7 @@ idt77252_send_oam(struct atm_vcc *vcc, void *cell, int flags)
atomic_inc(&vcc->stats->tx_err);
return -ENOMEM;
}
- atomic_add(skb->truesize, &sk_atm(vcc)->sk_wmem_alloc);
+ refcount_add(skb->truesize, &sk_atm(vcc)->sk_wmem_alloc);
memcpy(skb_put(skb, 52), cell, 52);
diff --git a/include/linux/atmdev.h b/include/linux/atmdev.h
index c1da539..4d97a89 100644
--- a/include/linux/atmdev.h
+++ b/include/linux/atmdev.h
@@ -254,7 +254,7 @@ static inline void atm_return(struct atm_vcc *vcc,int truesize)
static inline int atm_may_send(struct atm_vcc *vcc,unsigned int size)
{
- return (size + atomic_read(&sk_atm(vcc)->sk_wmem_alloc)) <
+ return (size + refcount_read(&sk_atm(vcc)->sk_wmem_alloc)) <
sk_atm(vcc)->sk_sndbuf;
}
diff --git a/include/net/sock.h b/include/net/sock.h
index c519de7..24dcdba 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -388,7 +388,7 @@ struct sock {
/* ===== cache line for TX ===== */
int sk_wmem_queued;
- atomic_t sk_wmem_alloc;
+ refcount_t sk_wmem_alloc;
unsigned long sk_tsq_flags;
struct sk_buff *sk_send_head;
struct sk_buff_head sk_write_queue;
@@ -1916,7 +1916,7 @@ static inline int skb_copy_to_page_nocache(struct sock *sk, struct iov_iter *fro
*/
static inline int sk_wmem_alloc_get(const struct sock *sk)
{
- return atomic_read(&sk->sk_wmem_alloc) - 1;
+ return refcount_read(&sk->sk_wmem_alloc) - 1;
}
/**
@@ -2060,7 +2060,7 @@ static inline unsigned long sock_wspace(struct sock *sk)
int amt = 0;
if (!(sk->sk_shutdown & SEND_SHUTDOWN)) {
- amt = sk->sk_sndbuf - atomic_read(&sk->sk_wmem_alloc);
+ amt = sk->sk_sndbuf - refcount_read(&sk->sk_wmem_alloc);
if (amt < 0)
amt = 0;
}
@@ -2141,7 +2141,7 @@ bool sk_page_frag_refill(struct sock *sk, struct page_frag *pfrag);
*/
static inline bool sock_writeable(const struct sock *sk)
{
- return atomic_read(&sk->sk_wmem_alloc) < (sk->sk_sndbuf >> 1);
+ return refcount_read(&sk->sk_wmem_alloc) < (sk->sk_sndbuf >> 1);
}
static inline gfp_t gfp_any(void)
diff --git a/net/atm/br2684.c b/net/atm/br2684.c
index fca84e1..4e11119 100644
--- a/net/atm/br2684.c
+++ b/net/atm/br2684.c
@@ -252,7 +252,7 @@ static int br2684_xmit_vcc(struct sk_buff *skb, struct net_device *dev,
ATM_SKB(skb)->vcc = atmvcc = brvcc->atmvcc;
pr_debug("atm_skb(%p)->vcc(%p)->dev(%p)\n", skb, atmvcc, atmvcc->dev);
- atomic_add(skb->truesize, &sk_atm(atmvcc)->sk_wmem_alloc);
+ refcount_add(skb->truesize, &sk_atm(atmvcc)->sk_wmem_alloc);
ATM_SKB(skb)->atm_options = atmvcc->atm_options;
dev->stats.tx_packets++;
dev->stats.tx_bytes += skb->len;
diff --git a/net/atm/clip.c b/net/atm/clip.c
index 33e0940..e2e1318 100644
--- a/net/atm/clip.c
+++ b/net/atm/clip.c
@@ -381,7 +381,7 @@ static netdev_tx_t clip_start_xmit(struct sk_buff *skb,
memcpy(here, llc_oui, sizeof(llc_oui));
((__be16 *) here)[3] = skb->protocol;
}
- atomic_add(skb->truesize, &sk_atm(vcc)->sk_wmem_alloc);
+ refcount_add(skb->truesize, &sk_atm(vcc)->sk_wmem_alloc);
ATM_SKB(skb)->atm_options = vcc->atm_options;
entry->vccs->last_use = jiffies;
pr_debug("atm_skb(%p)->vcc(%p)->dev(%p)\n", skb, vcc, vcc->dev);
diff --git a/net/atm/common.c b/net/atm/common.c
index 9613381..e58d938 100644
--- a/net/atm/common.c
+++ b/net/atm/common.c
@@ -75,7 +75,7 @@ static struct sk_buff *alloc_tx(struct atm_vcc *vcc, unsigned int size)
while (!(skb = alloc_skb(size, GFP_KERNEL)))
schedule();
pr_debug("%d += %d\n", sk_wmem_alloc_get(sk), skb->truesize);
- atomic_add(skb->truesize, &sk->sk_wmem_alloc);
+ refcount_add(skb->truesize, &sk->sk_wmem_alloc);
return skb;
}
@@ -85,9 +85,9 @@ static void vcc_sock_destruct(struct sock *sk)
printk(KERN_DEBUG "%s: rmem leakage (%d bytes) detected.\n",
__func__, atomic_read(&sk->sk_rmem_alloc));
- if (atomic_read(&sk->sk_wmem_alloc))
+ if (refcount_read(&sk->sk_wmem_alloc))
printk(KERN_DEBUG "%s: wmem leakage (%d bytes) detected.\n",
- __func__, atomic_read(&sk->sk_wmem_alloc));
+ __func__, refcount_read(&sk->sk_wmem_alloc));
}
static void vcc_def_wakeup(struct sock *sk)
@@ -106,7 +106,7 @@ static inline int vcc_writable(struct sock *sk)
struct atm_vcc *vcc = atm_sk(sk);
return (vcc->qos.txtp.max_sdu +
- atomic_read(&sk->sk_wmem_alloc)) <= sk->sk_sndbuf;
+ refcount_read(&sk->sk_wmem_alloc)) <= sk->sk_sndbuf;
}
static void vcc_write_space(struct sock *sk)
@@ -161,7 +161,7 @@ int vcc_create(struct net *net, struct socket *sock, int protocol, int family, i
memset(&vcc->local, 0, sizeof(struct sockaddr_atmsvc));
memset(&vcc->remote, 0, sizeof(struct sockaddr_atmsvc));
vcc->qos.txtp.max_sdu = 1 << 16; /* for meta VCs */
- atomic_set(&sk->sk_wmem_alloc, 1);
+ refcount_set(&sk->sk_wmem_alloc, 1);
atomic_set(&sk->sk_rmem_alloc, 0);
vcc->push = NULL;
vcc->pop = NULL;
diff --git a/net/atm/lec.c b/net/atm/lec.c
index 09cfe87..7554571 100644
--- a/net/atm/lec.c
+++ b/net/atm/lec.c
@@ -181,7 +181,7 @@ lec_send(struct atm_vcc *vcc, struct sk_buff *skb)
ATM_SKB(skb)->vcc = vcc;
ATM_SKB(skb)->atm_options = vcc->atm_options;
- atomic_add(skb->truesize, &sk_atm(vcc)->sk_wmem_alloc);
+ refcount_add(skb->truesize, &sk_atm(vcc)->sk_wmem_alloc);
if (vcc->send(vcc, skb) < 0) {
dev->stats.tx_dropped++;
return;
@@ -345,7 +345,7 @@ static int lec_atm_send(struct atm_vcc *vcc, struct sk_buff *skb)
int i;
char *tmp; /* FIXME */
- atomic_sub(skb->truesize, &sk_atm(vcc)->sk_wmem_alloc);
+ WARN_ON(refcount_sub_and_test(skb->truesize, &sk_atm(vcc)->sk_wmem_alloc));
mesg = (struct atmlec_msg *)skb->data;
tmp = skb->data;
tmp += sizeof(struct atmlec_msg);
diff --git a/net/atm/mpc.c b/net/atm/mpc.c
index a190800..680a4b9 100644
--- a/net/atm/mpc.c
+++ b/net/atm/mpc.c
@@ -555,7 +555,7 @@ static int send_via_shortcut(struct sk_buff *skb, struct mpoa_client *mpc)
sizeof(struct llc_snap_hdr));
}
- atomic_add(skb->truesize, &sk_atm(entry->shortcut)->sk_wmem_alloc);
+ refcount_add(skb->truesize, &sk_atm(entry->shortcut)->sk_wmem_alloc);
ATM_SKB(skb)->atm_options = entry->shortcut->atm_options;
entry->shortcut->send(entry->shortcut, skb);
entry->packets_fwded++;
@@ -911,7 +911,7 @@ static int msg_from_mpoad(struct atm_vcc *vcc, struct sk_buff *skb)
struct mpoa_client *mpc = find_mpc_by_vcc(vcc);
struct k_message *mesg = (struct k_message *)skb->data;
- atomic_sub(skb->truesize, &sk_atm(vcc)->sk_wmem_alloc);
+ WARN_ON(refcount_sub_and_test(skb->truesize, &sk_atm(vcc)->sk_wmem_alloc));
if (mpc == NULL) {
pr_info("no mpc found\n");
diff --git a/net/atm/pppoatm.c b/net/atm/pppoatm.c
index c4e0984..21d9d34 100644
--- a/net/atm/pppoatm.c
+++ b/net/atm/pppoatm.c
@@ -350,7 +350,7 @@ static int pppoatm_send(struct ppp_channel *chan, struct sk_buff *skb)
return 1;
}
- atomic_add(skb->truesize, &sk_atm(ATM_SKB(skb)->vcc)->sk_wmem_alloc);
+ refcount_add(skb->truesize, &sk_atm(ATM_SKB(skb)->vcc)->sk_wmem_alloc);
ATM_SKB(skb)->atm_options = ATM_SKB(skb)->vcc->atm_options;
pr_debug("atm_skb(%p)->vcc(%p)->dev(%p)\n",
skb, ATM_SKB(skb)->vcc, ATM_SKB(skb)->vcc->dev);
diff --git a/net/atm/raw.c b/net/atm/raw.c
index 2e17e97..821c079 100644
--- a/net/atm/raw.c
+++ b/net/atm/raw.c
@@ -35,7 +35,7 @@ static void atm_pop_raw(struct atm_vcc *vcc, struct sk_buff *skb)
pr_debug("(%d) %d -= %d\n",
vcc->vci, sk_wmem_alloc_get(sk), skb->truesize);
- atomic_sub(skb->truesize, &sk->sk_wmem_alloc);
+ WARN_ON(refcount_sub_and_test(skb->truesize, &sk->sk_wmem_alloc));
dev_kfree_skb_any(skb);
sk->sk_write_space(sk);
}
diff --git a/net/atm/signaling.c b/net/atm/signaling.c
index adb6e3d..ca59496 100644
--- a/net/atm/signaling.c
+++ b/net/atm/signaling.c
@@ -67,7 +67,7 @@ static int sigd_send(struct atm_vcc *vcc, struct sk_buff *skb)
struct sock *sk;
msg = (struct atmsvc_msg *) skb->data;
- atomic_sub(skb->truesize, &sk_atm(vcc)->sk_wmem_alloc);
+ WARN_ON(refcount_sub_and_test(skb->truesize, &sk_atm(vcc)->sk_wmem_alloc));
vcc = *(struct atm_vcc **) &msg->vcc;
pr_debug("%d (0x%lx)\n", (int)msg->type, (unsigned long)vcc);
sk = sk_atm(vcc);
diff --git a/net/caif/caif_socket.c b/net/caif/caif_socket.c
index adcad34..0ea2616 100644
--- a/net/caif/caif_socket.c
+++ b/net/caif/caif_socket.c
@@ -1009,7 +1009,7 @@ static const struct proto_ops caif_stream_ops = {
static void caif_sock_destructor(struct sock *sk)
{
struct caifsock *cf_sk = container_of(sk, struct caifsock, sk);
- caif_assert(!atomic_read(&sk->sk_wmem_alloc));
+ caif_assert(!refcount_read(&sk->sk_wmem_alloc));
caif_assert(sk_unhashed(sk));
caif_assert(!sk->sk_socket);
if (!sock_flag(sk, SOCK_DEAD)) {
diff --git a/net/core/datagram.c b/net/core/datagram.c
index 281e5d6..d0702d0 100644
--- a/net/core/datagram.c
+++ b/net/core/datagram.c
@@ -595,7 +595,7 @@ int zerocopy_sg_from_iter(struct sk_buff *skb, struct iov_iter *from)
skb->data_len += copied;
skb->len += copied;
skb->truesize += truesize;
- atomic_add(truesize, &skb->sk->sk_wmem_alloc);
+ refcount_add(truesize, &skb->sk->sk_wmem_alloc);
while (copied) {
int size = min_t(int, copied, PAGE_SIZE - start);
skb_fill_page_desc(skb, frag++, pages[n], start, size);
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 6911269..e81a27f 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -2985,7 +2985,7 @@ int skb_append_datato_frags(struct sock *sk, struct sk_buff *skb,
get_page(pfrag->page);
skb->truesize += copy;
- atomic_add(copy, &sk->sk_wmem_alloc);
+ refcount_add(copy, &sk->sk_wmem_alloc);
skb->len += copy;
skb->data_len += copy;
offset += copy;
diff --git a/net/core/sock.c b/net/core/sock.c
index f6fd79f..e830ddc 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -1404,7 +1404,7 @@ struct sock *sk_alloc(struct net *net, int family, gfp_t priority,
if (likely(sk->sk_net_refcnt))
get_net(net);
sock_net_set(sk, net);
- atomic_set(&sk->sk_wmem_alloc, 1);
+ refcount_set(&sk->sk_wmem_alloc, 1);
mem_cgroup_sk_alloc(sk);
cgroup_sk_alloc(&sk->sk_cgrp_data);
@@ -1428,7 +1428,7 @@ static void __sk_destruct(struct rcu_head *head)
sk->sk_destruct(sk);
filter = rcu_dereference_check(sk->sk_filter,
- atomic_read(&sk->sk_wmem_alloc) == 0);
+ refcount_read(&sk->sk_wmem_alloc) == 0);
if (filter) {
sk_filter_uncharge(sk, filter);
RCU_INIT_POINTER(sk->sk_filter, NULL);
@@ -1473,7 +1473,7 @@ void sk_free(struct sock *sk)
* some packets are still in some tx queue.
* If not null, sock_wfree() will call __sk_free(sk) later
*/
- if (atomic_dec_and_test(&sk->sk_wmem_alloc))
+ if (refcount_dec_and_test(&sk->sk_wmem_alloc))
__sk_free(sk);
}
EXPORT_SYMBOL(sk_free);
@@ -1509,7 +1509,7 @@ struct sock *sk_clone_lock(const struct sock *sk, const gfp_t priority)
/*
* sk_wmem_alloc set to one (see sk_free() and sock_wfree())
*/
- atomic_set(&newsk->sk_wmem_alloc, 1);
+ refcount_set(&newsk->sk_wmem_alloc, 1);
atomic_set(&newsk->sk_omem_alloc, 0);
skb_queue_head_init(&newsk->sk_receive_queue);
skb_queue_head_init(&newsk->sk_write_queue);
@@ -1638,7 +1638,7 @@ void sock_wfree(struct sk_buff *skb)
* Keep a reference on sk_wmem_alloc, this will be released
* after sk_write_space() call
*/
- atomic_sub(len - 1, &sk->sk_wmem_alloc);
+ WARN_ON(refcount_sub_and_test(len - 1, &sk->sk_wmem_alloc));
sk->sk_write_space(sk);
len = 1;
}
@@ -1646,7 +1646,7 @@ void sock_wfree(struct sk_buff *skb)
* if sk_wmem_alloc reaches 0, we must finish what sk_free()
* could not do because of in-flight packets
*/
- if (atomic_sub_and_test(len, &sk->sk_wmem_alloc))
+ if (refcount_sub_and_test(len, &sk->sk_wmem_alloc))
__sk_free(sk);
}
EXPORT_SYMBOL(sock_wfree);
@@ -1658,7 +1658,7 @@ void __sock_wfree(struct sk_buff *skb)
{
struct sock *sk = skb->sk;
- if (atomic_sub_and_test(skb->truesize, &sk->sk_wmem_alloc))
+ if (refcount_sub_and_test(skb->truesize, &sk->sk_wmem_alloc))
__sk_free(sk);
}
@@ -1680,7 +1680,7 @@ void skb_set_owner_w(struct sk_buff *skb, struct sock *sk)
* is enough to guarantee sk_free() wont free this sock until
* all in-flight packets are completed
*/
- atomic_add(skb->truesize, &sk->sk_wmem_alloc);
+ refcount_add(skb->truesize, &sk->sk_wmem_alloc);
}
EXPORT_SYMBOL(skb_set_owner_w);
@@ -1708,7 +1708,7 @@ void skb_orphan_partial(struct sk_buff *skb)
|| skb->destructor == tcp_wfree
#endif
) {
- atomic_sub(skb->truesize - 1, &skb->sk->sk_wmem_alloc);
+ WARN_ON(refcount_sub_and_test(skb->truesize - 1, &skb->sk->sk_wmem_alloc));
skb->truesize = 1;
} else {
skb_orphan(skb);
@@ -1767,7 +1767,7 @@ EXPORT_SYMBOL(sock_i_ino);
struct sk_buff *sock_wmalloc(struct sock *sk, unsigned long size, int force,
gfp_t priority)
{
- if (force || atomic_read(&sk->sk_wmem_alloc) < sk->sk_sndbuf) {
+ if (force || refcount_read(&sk->sk_wmem_alloc) < sk->sk_sndbuf) {
struct sk_buff *skb = alloc_skb(size, priority);
if (skb) {
skb_set_owner_w(skb, sk);
@@ -1842,7 +1842,7 @@ static long sock_wait_for_wmem(struct sock *sk, long timeo)
break;
set_bit(SOCK_NOSPACE, &sk->sk_socket->flags);
prepare_to_wait(sk_sleep(sk), &wait, TASK_INTERRUPTIBLE);
- if (atomic_read(&sk->sk_wmem_alloc) < sk->sk_sndbuf)
+ if (refcount_read(&sk->sk_wmem_alloc) < sk->sk_sndbuf)
break;
if (sk->sk_shutdown & SEND_SHUTDOWN)
break;
@@ -2145,7 +2145,7 @@ int __sk_mem_raise_allocated(struct sock *sk, int size, int amt, int kind)
if (sk->sk_type == SOCK_STREAM) {
if (sk->sk_wmem_queued < prot->sysctl_wmem[0])
return 1;
- } else if (atomic_read(&sk->sk_wmem_alloc) <
+ } else if (refcount_read(&sk->sk_wmem_alloc) <
prot->sysctl_wmem[0])
return 1;
}
@@ -2411,7 +2411,7 @@ static void sock_def_write_space(struct sock *sk)
/* Do not wake up a writer until he can make "significant"
* progress. --DaveM
*/
- if ((atomic_read(&sk->sk_wmem_alloc) << 1) <= sk->sk_sndbuf) {
+ if ((refcount_read(&sk->sk_wmem_alloc) << 1) <= sk->sk_sndbuf) {
wq = rcu_dereference(sk->sk_wq);
if (skwq_has_sleeper(wq))
wake_up_interruptible_sync_poll(&wq->wait, POLLOUT |
diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index 5091f46..4d89969 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -150,7 +150,7 @@ void inet_sock_destruct(struct sock *sk)
}
WARN_ON(atomic_read(&sk->sk_rmem_alloc));
- WARN_ON(atomic_read(&sk->sk_wmem_alloc));
+ WARN_ON(refcount_read(&sk->sk_wmem_alloc));
WARN_ON(sk->sk_wmem_queued);
WARN_ON(sk->sk_forward_alloc);
diff --git a/net/ipv4/esp4.c b/net/ipv4/esp4.c
index b1e2444..d02afc2 100644
--- a/net/ipv4/esp4.c
+++ b/net/ipv4/esp4.c
@@ -337,7 +337,7 @@ static int esp_output(struct xfrm_state *x, struct sk_buff *skb)
skb->data_len += tailen;
skb->truesize += tailen;
if (sk)
- atomic_add(tailen, &sk->sk_wmem_alloc);
+ refcount_add(tailen, &sk->sk_wmem_alloc);
skb_push(skb, -skb_network_offset(skb));
diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index 737ce82..a2ea706 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -1036,7 +1036,7 @@ static int __ip_append_data(struct sock *sk,
(flags & MSG_DONTWAIT), &err);
} else {
skb = NULL;
- if (atomic_read(&sk->sk_wmem_alloc) <=
+ if (refcount_read(&sk->sk_wmem_alloc) <=
2 * sk->sk_sndbuf)
skb = sock_wmalloc(sk,
alloclen + hh_len + 15, 1,
@@ -1144,7 +1144,7 @@ static int __ip_append_data(struct sock *sk,
skb->len += copy;
skb->data_len += copy;
skb->truesize += copy;
- atomic_add(copy, &sk->sk_wmem_alloc);
+ refcount_add(copy, &sk->sk_wmem_alloc);
}
offset += copy;
length -= copy;
@@ -1368,7 +1368,7 @@ ssize_t ip_append_page(struct sock *sk, struct flowi4 *fl4, struct page *page,
skb->len += len;
skb->data_len += len;
skb->truesize += len;
- atomic_add(len, &sk->sk_wmem_alloc);
+ refcount_add(len, &sk->sk_wmem_alloc);
offset += len;
size -= len;
}
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 3354a61..1f82de5 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -644,7 +644,7 @@ static bool tcp_should_autocork(struct sock *sk, struct sk_buff *skb,
return skb->len < size_goal &&
sysctl_tcp_autocorking &&
skb != tcp_write_queue_head(sk) &&
- atomic_read(&sk->sk_wmem_alloc) > skb->truesize;
+ refcount_read(&sk->sk_wmem_alloc) > skb->truesize;
}
static void tcp_push(struct sock *sk, int flags, int mss_now,
@@ -672,7 +672,7 @@ static void tcp_push(struct sock *sk, int flags, int mss_now,
/* It is possible TX completion already happened
* before we set TSQ_THROTTLED.
*/
- if (atomic_read(&sk->sk_wmem_alloc) > skb->truesize)
+ if (refcount_read(&sk->sk_wmem_alloc) > skb->truesize)
return;
}
diff --git a/net/ipv4/tcp_offload.c b/net/ipv4/tcp_offload.c
index bc68da3..11f69bb 100644
--- a/net/ipv4/tcp_offload.c
+++ b/net/ipv4/tcp_offload.c
@@ -152,7 +152,7 @@ struct sk_buff *tcp_gso_segment(struct sk_buff *skb,
swap(gso_skb->sk, skb->sk);
swap(gso_skb->destructor, skb->destructor);
sum_truesize += skb->truesize;
- atomic_add(sum_truesize - gso_skb->truesize,
+ refcount_add(sum_truesize - gso_skb->truesize,
&skb->sk->sk_wmem_alloc);
}
diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c
index 22548b5..7cd1283 100644
--- a/net/ipv4/tcp_output.c
+++ b/net/ipv4/tcp_output.c
@@ -863,12 +863,11 @@ void tcp_wfree(struct sk_buff *skb)
struct sock *sk = skb->sk;
struct tcp_sock *tp = tcp_sk(sk);
unsigned long flags, nval, oval;
- int wmem;
/* Keep one reference on sk_wmem_alloc.
* Will be released by sk_free() from here or tcp_tasklet_func()
*/
- wmem = atomic_sub_return(skb->truesize - 1, &sk->sk_wmem_alloc);
+ WARN_ON(refcount_sub_and_test(skb->truesize - 1, &sk->sk_wmem_alloc));
/* If this softirq is serviced by ksoftirqd, we are likely under stress.
* Wait until our queues (qdisc + devices) are drained.
@@ -877,7 +876,7 @@ void tcp_wfree(struct sk_buff *skb)
* - chance for incoming ACK (processed by another cpu maybe)
* to migrate this flow (skb->ooo_okay will be eventually set)
*/
- if (wmem >= SKB_TRUESIZE(1) && this_cpu_ksoftirqd() == current)
+ if (refcount_read(&sk->sk_wmem_alloc) >= SKB_TRUESIZE(1) && this_cpu_ksoftirqd() == current)
goto out;
for (oval = READ_ONCE(sk->sk_tsq_flags);; oval = nval) {
@@ -981,7 +980,7 @@ static int tcp_transmit_skb(struct sock *sk, struct sk_buff *skb, int clone_it,
skb->sk = sk;
skb->destructor = skb_is_tcp_pure_ack(skb) ? __sock_wfree : tcp_wfree;
skb_set_hash_from_sk(skb, sk);
- atomic_add(skb->truesize, &sk->sk_wmem_alloc);
+ refcount_add(skb->truesize, &sk->sk_wmem_alloc);
skb_set_dst_pending_confirm(skb, sk->sk_dst_pending_confirm);
@@ -2101,7 +2100,7 @@ static bool tcp_small_queue_check(struct sock *sk, const struct sk_buff *skb,
limit = min_t(u32, limit, sysctl_tcp_limit_output_bytes);
limit <<= factor;
- if (atomic_read(&sk->sk_wmem_alloc) > limit) {
+ if (refcount_read(&sk->sk_wmem_alloc) > limit) {
/* Always send the 1st or 2nd skb in write queue.
* No need to wait for TX completion to call us back,
* after softirq/tasklet schedule.
@@ -2117,7 +2116,7 @@ static bool tcp_small_queue_check(struct sock *sk, const struct sk_buff *skb,
* test again the condition.
*/
smp_mb__after_atomic();
- if (atomic_read(&sk->sk_wmem_alloc) > limit)
+ if (refcount_read(&sk->sk_wmem_alloc) > limit)
return true;
}
return false;
@@ -2735,7 +2734,7 @@ int __tcp_retransmit_skb(struct sock *sk, struct sk_buff *skb, int segs)
/* Do not sent more than we queued. 1/4 is reserved for possible
* copying overhead: fragmentation, tunneling, mangling etc.
*/
- if (atomic_read(&sk->sk_wmem_alloc) >
+ if (refcount_read(&sk->sk_wmem_alloc) >
min_t(u32, sk->sk_wmem_queued + (sk->sk_wmem_queued >> 2),
sk->sk_sndbuf))
return -EAGAIN;
diff --git a/net/ipv6/esp6.c b/net/ipv6/esp6.c
index ff54faa..b8f127e 100644
--- a/net/ipv6/esp6.c
+++ b/net/ipv6/esp6.c
@@ -317,7 +317,7 @@ static int esp6_output(struct xfrm_state *x, struct sk_buff *skb)
skb->data_len += tailen;
skb->truesize += tailen;
if (sk)
- atomic_add(tailen, &sk->sk_wmem_alloc);
+ refcount_add(tailen, &sk->sk_wmem_alloc);
skb_push(skb, -skb_network_offset(skb));
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index 528b3c1..42a2f73 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -1464,7 +1464,7 @@ static int __ip6_append_data(struct sock *sk,
(flags & MSG_DONTWAIT), &err);
} else {
skb = NULL;
- if (atomic_read(&sk->sk_wmem_alloc) <=
+ if (refcount_read(&sk->sk_wmem_alloc) <=
2 * sk->sk_sndbuf)
skb = sock_wmalloc(sk,
alloclen + hh_len, 1,
@@ -1577,7 +1577,7 @@ static int __ip6_append_data(struct sock *sk,
skb->len += copy;
skb->data_len += copy;
skb->truesize += copy;
- atomic_add(copy, &sk->sk_wmem_alloc);
+ refcount_add(copy, &sk->sk_wmem_alloc);
}
offset += copy;
length -= copy;
diff --git a/net/kcm/kcmproc.c b/net/kcm/kcmproc.c
index bf75c92..c343ac6 100644
--- a/net/kcm/kcmproc.c
+++ b/net/kcm/kcmproc.c
@@ -162,7 +162,7 @@ static void kcm_format_psock(struct kcm_psock *psock, struct seq_file *seq,
psock->sk->sk_receive_queue.qlen,
atomic_read(&psock->sk->sk_rmem_alloc),
psock->sk->sk_write_queue.qlen,
- atomic_read(&psock->sk->sk_wmem_alloc));
+ refcount_read(&psock->sk->sk_wmem_alloc));
if (psock->done)
seq_puts(seq, "Done ");
diff --git a/net/key/af_key.c b/net/key/af_key.c
index 9d96b82..ba00bd3 100644
--- a/net/key/af_key.c
+++ b/net/key/af_key.c
@@ -104,7 +104,7 @@ static void pfkey_sock_destruct(struct sock *sk)
}
WARN_ON(atomic_read(&sk->sk_rmem_alloc));
- WARN_ON(atomic_read(&sk->sk_wmem_alloc));
+ WARN_ON(refcount_read(&sk->sk_wmem_alloc));
atomic_dec(&net_pfkey->socks_nr);
}
diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index 7dac4a9..9332b24 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -341,7 +341,7 @@ static void netlink_sock_destruct(struct sock *sk)
}
WARN_ON(atomic_read(&sk->sk_rmem_alloc));
- WARN_ON(atomic_read(&sk->sk_wmem_alloc));
+ WARN_ON(refcount_read(&sk->sk_wmem_alloc));
WARN_ON(nlk_sk(sk)->groups);
}
diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index a0dbe7c..82eb052 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -1320,7 +1320,7 @@ static void packet_sock_destruct(struct sock *sk)
skb_queue_purge(&sk->sk_error_queue);
WARN_ON(atomic_read(&sk->sk_rmem_alloc));
- WARN_ON(atomic_read(&sk->sk_wmem_alloc));
+ WARN_ON(refcount_read(&sk->sk_wmem_alloc));
if (!sock_flag(sk, SOCK_DEAD)) {
pr_err("Attempt to release alive packet socket: %p\n", sk);
@@ -2482,7 +2482,7 @@ static int tpacket_fill_skb(struct packet_sock *po, struct sk_buff *skb,
skb->data_len = to_write;
skb->len += to_write;
skb->truesize += to_write;
- atomic_add(to_write, &po->sk.sk_wmem_alloc);
+ refcount_add(to_write, &po->sk.sk_wmem_alloc);
while (likely(to_write)) {
nr_frags = skb_shinfo(skb)->nr_frags;
diff --git a/net/phonet/socket.c b/net/phonet/socket.c
index a6c8da3..27b0b13 100644
--- a/net/phonet/socket.c
+++ b/net/phonet/socket.c
@@ -360,7 +360,7 @@ static unsigned int pn_socket_poll(struct file *file, struct socket *sock,
return POLLHUP;
if (sk->sk_state == TCP_ESTABLISHED &&
- atomic_read(&sk->sk_wmem_alloc) < sk->sk_sndbuf &&
+ refcount_read(&sk->sk_wmem_alloc) < sk->sk_sndbuf &&
atomic_read(&pn->tx_credits))
mask |= POLLOUT | POLLWRNORM | POLLWRBAND;
diff --git a/net/rds/tcp_send.c b/net/rds/tcp_send.c
index dcf4742..592e68b 100644
--- a/net/rds/tcp_send.c
+++ b/net/rds/tcp_send.c
@@ -208,7 +208,7 @@ void rds_tcp_write_space(struct sock *sk)
tc->t_last_seen_una = rds_tcp_snd_una(tc);
rds_send_path_drop_acked(cp, rds_tcp_snd_una(tc), rds_tcp_is_acked);
- if ((atomic_read(&sk->sk_wmem_alloc) << 1) <= sk->sk_sndbuf)
+ if ((refcount_read(&sk->sk_wmem_alloc) << 1) <= sk->sk_sndbuf)
queue_delayed_work(rds_wq, &cp->cp_send_w, 0);
out:
diff --git a/net/rxrpc/af_rxrpc.c b/net/rxrpc/af_rxrpc.c
index 7fb59c3..b473ac2 100644
--- a/net/rxrpc/af_rxrpc.c
+++ b/net/rxrpc/af_rxrpc.c
@@ -56,7 +56,7 @@ static void rxrpc_sock_destructor(struct sock *);
*/
static inline int rxrpc_writable(struct sock *sk)
{
- return atomic_read(&sk->sk_wmem_alloc) < (size_t) sk->sk_sndbuf;
+ return refcount_read(&sk->sk_wmem_alloc) < (size_t) sk->sk_sndbuf;
}
/*
@@ -665,7 +665,7 @@ static void rxrpc_sock_destructor(struct sock *sk)
rxrpc_purge_queue(&sk->sk_receive_queue);
- WARN_ON(atomic_read(&sk->sk_wmem_alloc));
+ WARN_ON(refcount_read(&sk->sk_wmem_alloc));
WARN_ON(!sk_unhashed(sk));
WARN_ON(sk->sk_socket);
diff --git a/net/sched/sch_atm.c b/net/sched/sch_atm.c
index 2209c2d..2b0778b 100644
--- a/net/sched/sch_atm.c
+++ b/net/sched/sch_atm.c
@@ -491,7 +491,7 @@ static void sch_atm_dequeue(unsigned long data)
ATM_SKB(skb)->vcc = flow->vcc;
memcpy(skb_push(skb, flow->hdr_len), flow->hdr,
flow->hdr_len);
- atomic_add(skb->truesize,
+ refcount_add(skb->truesize,
&sk_atm(flow->vcc)->sk_wmem_alloc);
/* atm.atm_options are already set by atm_tc_enqueue */
flow->vcc->send(flow->vcc, skb);
diff --git a/net/sctp/output.c b/net/sctp/output.c
index 71ce6b9..6574281 100644
--- a/net/sctp/output.c
+++ b/net/sctp/output.c
@@ -392,7 +392,7 @@ static void sctp_packet_set_owner_w(struct sk_buff *skb, struct sock *sk)
* therefore only reserve a single byte to keep socket around until
* the packet has been transmitted.
*/
- atomic_inc(&sk->sk_wmem_alloc);
+ refcount_inc(&sk->sk_wmem_alloc);
}
static int sctp_packet_pack(struct sctp_packet *packet,
diff --git a/net/sctp/proc.c b/net/sctp/proc.c
index 206377f..25cd840 100644
--- a/net/sctp/proc.c
+++ b/net/sctp/proc.c
@@ -365,7 +365,7 @@ static int sctp_assocs_seq_show(struct seq_file *seq, void *v)
assoc->c.sinit_num_ostreams, assoc->max_retrans,
assoc->init_retries, assoc->shutdown_retries,
assoc->rtx_data_chunks,
- atomic_read(&sk->sk_wmem_alloc),
+ refcount_read(&sk->sk_wmem_alloc),
sk->sk_wmem_queued,
sk->sk_sndbuf,
sk->sk_rcvbuf);
diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index c1120d5..67dfec1 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -164,7 +164,7 @@ static inline void sctp_set_owner_w(struct sctp_chunk *chunk)
sizeof(struct sk_buff) +
sizeof(struct sctp_chunk);
- atomic_add(sizeof(struct sctp_chunk), &sk->sk_wmem_alloc);
+ refcount_add(sizeof(struct sctp_chunk), &sk->sk_wmem_alloc);
sk->sk_wmem_queued += chunk->skb->truesize;
sk_mem_charge(sk, chunk->skb->truesize);
}
@@ -7539,7 +7539,7 @@ static void sctp_wfree(struct sk_buff *skb)
sizeof(struct sk_buff) +
sizeof(struct sctp_chunk);
- atomic_sub(sizeof(struct sctp_chunk), &sk->sk_wmem_alloc);
+ WARN_ON(refcount_sub_and_test(sizeof(struct sctp_chunk), &sk->sk_wmem_alloc));
/*
* This undoes what is done via sctp_set_owner_w and sk_mem_charge
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index a48d403..a74339e 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -442,7 +442,7 @@ static int unix_dgram_peer_wake_me(struct sock *sk, struct sock *other)
static int unix_writable(const struct sock *sk)
{
return sk->sk_state != TCP_LISTEN &&
- (atomic_read(&sk->sk_wmem_alloc) << 2) <= sk->sk_sndbuf;
+ (refcount_read(&sk->sk_wmem_alloc) << 2) <= sk->sk_sndbuf;
}
static void unix_write_space(struct sock *sk)
@@ -487,7 +487,7 @@ static void unix_sock_destructor(struct sock *sk)
skb_queue_purge(&sk->sk_receive_queue);
- WARN_ON(atomic_read(&sk->sk_wmem_alloc));
+ WARN_ON(refcount_read(&sk->sk_wmem_alloc));
WARN_ON(!sk_unhashed(sk));
WARN_ON(sk->sk_socket);
if (!sock_flag(sk, SOCK_DEAD)) {
@@ -2024,7 +2024,7 @@ static ssize_t unix_stream_sendpage(struct socket *socket, struct page *page,
skb->len += size;
skb->data_len += size;
skb->truesize += size;
- atomic_add(size, &sk->sk_wmem_alloc);
+ refcount_add(size, &sk->sk_wmem_alloc);
if (newskb) {
err = unix_scm_to_skb(&scm, skb, false);
--
2.7.4
The refcount_t type and the corresponding API should be used instead of
atomic_t when the variable is used as a reference counter. This helps
avoid accidental refcounter overflows that might lead to use-after-free
situations.
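As an illustration of the get/put pattern used for netlbl_lsm_cache above
(a sketch, not part of the patch; the demo_* names are made up),
refcount_inc() warns and saturates on overflow instead of wrapping:
#include <linux/refcount.h>
#include <linux/slab.h>
struct demo_cache_entry {
	refcount_t refcount;
	void *data;
};
static struct demo_cache_entry *demo_cache_get(struct demo_cache_entry *e)
{
	refcount_inc(&e->refcount);	/* WARNs and saturates on overflow */
	return e;
}
static void demo_cache_put(struct demo_cache_entry *e)
{
	if (!refcount_dec_and_test(&e->refcount))
		return;
	kfree(e);	/* last reference gone */
}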
Signed-off-by: Elena Reshetova <[email protected]>
Signed-off-by: Hans Liljestrand <[email protected]>
Signed-off-by: Kees Cook <[email protected]>
Signed-off-by: David Windsor <[email protected]>
---
include/net/netlabel.h | 8 ++++----
net/ipv4/cipso_ipv4.c | 4 ++--
net/ipv6/calipso.c | 4 ++--
3 files changed, 8 insertions(+), 8 deletions(-)
diff --git a/include/net/netlabel.h b/include/net/netlabel.h
index efe9806..72d6435 100644
--- a/include/net/netlabel.h
+++ b/include/net/netlabel.h
@@ -37,7 +37,7 @@
#include <linux/in6.h>
#include <net/netlink.h>
#include <net/request_sock.h>
-#include <linux/atomic.h>
+#include <linux/refcount.h>
struct cipso_v4_doi;
struct calipso_doi;
@@ -136,7 +136,7 @@ struct netlbl_audit {
*
*/
struct netlbl_lsm_cache {
- atomic_t refcount;
+ refcount_t refcount;
void (*free) (const void *data);
void *data;
};
@@ -295,7 +295,7 @@ static inline struct netlbl_lsm_cache *netlbl_secattr_cache_alloc(gfp_t flags)
cache = kzalloc(sizeof(*cache), flags);
if (cache)
- atomic_set(&cache->refcount, 1);
+ refcount_set(&cache->refcount, 1);
return cache;
}
@@ -309,7 +309,7 @@ static inline struct netlbl_lsm_cache *netlbl_secattr_cache_alloc(gfp_t flags)
*/
static inline void netlbl_secattr_cache_free(struct netlbl_lsm_cache *cache)
{
- if (!atomic_dec_and_test(&cache->refcount))
+ if (!refcount_dec_and_test(&cache->refcount))
return;
if (cache->free)
diff --git a/net/ipv4/cipso_ipv4.c b/net/ipv4/cipso_ipv4.c
index ae20616..c204477 100644
--- a/net/ipv4/cipso_ipv4.c
+++ b/net/ipv4/cipso_ipv4.c
@@ -265,7 +265,7 @@ static int cipso_v4_cache_check(const unsigned char *key,
entry->key_len == key_len &&
memcmp(entry->key, key, key_len) == 0) {
entry->activity += 1;
- atomic_inc(&entry->lsm_data->refcount);
+ refcount_inc(&entry->lsm_data->refcount);
secattr->cache = entry->lsm_data;
secattr->flags |= NETLBL_SECATTR_CACHE;
secattr->type = NETLBL_NLTYPE_CIPSOV4;
@@ -332,7 +332,7 @@ int cipso_v4_cache_add(const unsigned char *cipso_ptr,
}
entry->key_len = cipso_ptr_len;
entry->hash = cipso_v4_map_cache_hash(cipso_ptr, cipso_ptr_len);
- atomic_inc(&secattr->cache->refcount);
+ refcount_inc(&secattr->cache->refcount);
entry->lsm_data = secattr->cache;
bkt = entry->hash & (CIPSO_V4_CACHE_BUCKETS - 1);
diff --git a/net/ipv6/calipso.c b/net/ipv6/calipso.c
index 37ac9de..6c6ecf1 100644
--- a/net/ipv6/calipso.c
+++ b/net/ipv6/calipso.c
@@ -227,7 +227,7 @@ static int calipso_cache_check(const unsigned char *key,
entry->key_len == key_len &&
memcmp(entry->key, key, key_len) == 0) {
entry->activity += 1;
- atomic_inc(&entry->lsm_data->refcount);
+ refcount_inc(&entry->lsm_data->refcount);
secattr->cache = entry->lsm_data;
secattr->flags |= NETLBL_SECATTR_CACHE;
secattr->type = NETLBL_NLTYPE_CALIPSO;
@@ -296,7 +296,7 @@ static int calipso_cache_add(const unsigned char *calipso_ptr,
}
entry->key_len = calipso_ptr_len;
entry->hash = calipso_map_cache_hash(calipso_ptr, calipso_ptr_len);
- atomic_inc(&secattr->cache->refcount);
+ refcount_inc(&secattr->cache->refcount);
entry->lsm_data = secattr->cache;
bkt = entry->hash & (CALIPSO_CACHE_BUCKETS - 1);
--
2.7.4
The refcount_t type and the corresponding API should be used instead of
atomic_t when the variable is used as a reference counter. This helps
avoid accidental refcounter overflows that might lead to use-after-free
situations.
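One detail worth illustrating: packet_fanout.sk_ref legitimately starts at
0 and is only manipulated under fanout_mutex, so the patch uses a locked
read+set rather than refcount_inc(), which would warn on the 0->1
transition. A hedged sketch of that idiom (demo_* names are hypothetical):
#include <linux/refcount.h>
#include <linux/mutex.h>
static DEFINE_MUTEX(demo_lock);
static refcount_t demo_sk_ref = REFCOUNT_INIT(0);
static void demo_take_ref(void)
{
	mutex_lock(&demo_lock);
	/* refcount_inc() would WARN going 0 -> 1, so do a locked read + set. */
	refcount_set(&demo_sk_ref, refcount_read(&demo_sk_ref) + 1);
	mutex_unlock(&demo_lock);
}
static bool demo_drop_ref(void)
{
	return refcount_dec_and_test(&demo_sk_ref);	/* true when it reaches 0 */
}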
Signed-off-by: Elena Reshetova <[email protected]>
Signed-off-by: Hans Liljestrand <[email protected]>
Signed-off-by: Kees Cook <[email protected]>
Signed-off-by: David Windsor <[email protected]>
---
net/packet/af_packet.c | 8 ++++----
net/packet/internal.h | 4 +++-
2 files changed, 7 insertions(+), 5 deletions(-)
diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index ad5e5dc..ef868a7 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -1698,7 +1698,7 @@ static int fanout_add(struct sock *sk, u16 id, u16 type_flags)
match->flags = flags;
INIT_LIST_HEAD(&match->list);
spin_lock_init(&match->lock);
- atomic_set(&match->sk_ref, 0);
+ refcount_set(&match->sk_ref, 0);
fanout_init_data(match);
match->prot_hook.type = po->prot_hook.type;
match->prot_hook.dev = po->prot_hook.dev;
@@ -1712,10 +1712,10 @@ static int fanout_add(struct sock *sk, u16 id, u16 type_flags)
match->prot_hook.type == po->prot_hook.type &&
match->prot_hook.dev == po->prot_hook.dev) {
err = -ENOSPC;
- if (atomic_read(&match->sk_ref) < PACKET_FANOUT_MAX) {
+ if (refcount_read(&match->sk_ref) < PACKET_FANOUT_MAX) {
__dev_remove_pack(&po->prot_hook);
po->fanout = match;
- atomic_inc(&match->sk_ref);
+ refcount_set(&match->sk_ref, refcount_read(&match->sk_ref) + 1);
__fanout_link(sk, po);
err = 0;
}
@@ -1744,7 +1744,7 @@ static struct packet_fanout *fanout_release(struct sock *sk)
if (f) {
po->fanout = NULL;
- if (atomic_dec_and_test(&f->sk_ref))
+ if (refcount_dec_and_test(&f->sk_ref))
list_del(&f->list);
else
f = NULL;
diff --git a/net/packet/internal.h b/net/packet/internal.h
index 9ee4631..94d1d40 100644
--- a/net/packet/internal.h
+++ b/net/packet/internal.h
@@ -1,6 +1,8 @@
#ifndef __PACKET_INTERNAL_H__
#define __PACKET_INTERNAL_H__
+#include <linux/refcount.h>
+
struct packet_mclist {
struct packet_mclist *next;
int ifindex;
@@ -86,7 +88,7 @@ struct packet_fanout {
struct list_head list;
struct sock *arr[PACKET_FANOUT_MAX];
spinlock_t lock;
- atomic_t sk_ref;
+ refcount_t sk_ref;
struct packet_type prot_hook ____cacheline_aligned_in_smp;
};
--
2.7.4
The refcount_t type and the corresponding API should be used instead of
atomic_t when the variable is used as a reference counter. This helps
avoid accidental refcounter overflows that might lead to use-after-free
situations.
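A brief sketch of the passive-reference pattern this patch converts
(illustrative only, assuming <linux/refcount.h>; demo_ns stands in for
struct net and kfree() for net_free()):
#include <linux/refcount.h>
#include <linux/slab.h>
struct demo_ns {
	refcount_t passive;
};
static void demo_ns_setup(struct demo_ns *ns)
{
	refcount_set(&ns->passive, 1);		/* as in setup_net() */
}
static void demo_ns_hold(struct demo_ns *ns)
{
	refcount_inc(&ns->passive);		/* as in net_grab_current_ns() */
}
static void demo_ns_drop(struct demo_ns *ns)
{
	if (ns && refcount_dec_and_test(&ns->passive))
		kfree(ns);			/* stand-in for net_free() */
}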
Signed-off-by: Elena Reshetova <[email protected]>
Signed-off-by: Hans Liljestrand <[email protected]>
Signed-off-by: Kees Cook <[email protected]>
Signed-off-by: David Windsor <[email protected]>
---
include/net/net_namespace.h | 3 ++-
net/core/net-sysfs.c | 2 +-
net/core/net_namespace.c | 4 ++--
3 files changed, 5 insertions(+), 4 deletions(-)
diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h
index af8fe8a..ec6dcaf 100644
--- a/include/net/net_namespace.h
+++ b/include/net/net_namespace.h
@@ -5,6 +5,7 @@
#define __NET_NET_NAMESPACE_H
#include <linux/atomic.h>
+#include <linux/refcount.h>
#include <linux/workqueue.h>
#include <linux/list.h>
#include <linux/sysctl.h>
@@ -45,7 +46,7 @@ struct netns_ipvs;
#define NETDEV_HASHENTRIES (1 << NETDEV_HASHBITS)
struct net {
- atomic_t passive; /* To decided when the network
+ refcount_t passive; /* To decided when the network
* namespace should be freed.
*/
atomic_t count; /* To decided when the network
diff --git a/net/core/net-sysfs.c b/net/core/net-sysfs.c
index 3945821..d62caad 100644
--- a/net/core/net-sysfs.c
+++ b/net/core/net-sysfs.c
@@ -1444,7 +1444,7 @@ static void *net_grab_current_ns(void)
struct net *ns = current->nsproxy->net_ns;
#ifdef CONFIG_NET_NS
if (ns)
- atomic_inc(&ns->passive);
+ refcount_inc(&ns->passive);
#endif
return ns;
}
diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
index 652468f..cb981aa 100644
--- a/net/core/net_namespace.c
+++ b/net/core/net_namespace.c
@@ -283,7 +283,7 @@ static __net_init int setup_net(struct net *net, struct user_namespace *user_ns)
LIST_HEAD(net_exit_list);
atomic_set(&net->count, 1);
- atomic_set(&net->passive, 1);
+ refcount_set(&net->passive, 1);
net->dev_base_seq = 1;
net->user_ns = user_ns;
idr_init(&net->netns_ids);
@@ -360,7 +360,7 @@ static void net_free(struct net *net)
void net_drop_ns(void *p)
{
struct net *ns = p;
- if (ns && atomic_dec_and_test(&ns->passive))
+ if (ns && refcount_dec_and_test(&ns->passive))
net_free(ns);
}
--
2.7.4
The refcount_t type and the corresponding API should be used instead of
atomic_t when the variable is used as a reference counter. This helps
avoid accidental refcounter overflows that might lead to use-after-free
situations.
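For context, a sketch under assumptions (not the patch itself; demo_buf is
hypothetical) of how kfree_skb()/consume_skb() keep their lock-free fast
path, only swapping atomic_* reads and decrements for the refcount_*
equivalents:
#include <linux/refcount.h>
#include <linux/slab.h>
struct demo_buf {
	refcount_t users;
};
static void demo_buf_free(struct demo_buf *b)
{
	if (!b)
		return;
	/*
	 * Mirrors the kfree_skb() fast path: a sole owner (count == 1) can
	 * free immediately without an atomic read-modify-write; otherwise
	 * drop one reference and free only if it was the last.
	 */
	if (refcount_read(&b->users) != 1 &&
	    !refcount_dec_and_test(&b->users))
		return;
	kfree(b);
}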
Signed-off-by: Elena Reshetova <[email protected]>
Signed-off-by: Hans Liljestrand <[email protected]>
Signed-off-by: Kees Cook <[email protected]>
Signed-off-by: David Windsor <[email protected]>
---
drivers/infiniband/hw/nes/nes_cm.c | 4 ++--
drivers/isdn/mISDN/socket.c | 2 +-
drivers/net/rionet.c | 2 +-
drivers/s390/net/ctcm_main.c | 26 +++++++++++++-------------
drivers/s390/net/netiucv.c | 10 +++++-----
drivers/s390/net/qeth_core_main.c | 4 ++--
drivers/scsi/cxgbi/libcxgbi.h | 2 +-
include/linux/skbuff.h | 6 +++---
net/core/datagram.c | 8 ++++----
net/core/dev.c | 10 +++++-----
net/core/netpoll.c | 4 ++--
net/core/pktgen.c | 16 ++++++++--------
net/core/rtnetlink.c | 2 +-
net/core/skbuff.c | 20 ++++++++++----------
net/dccp/ipv6.c | 2 +-
net/ipv6/syncookies.c | 2 +-
net/ipv6/tcp_ipv6.c | 2 +-
net/key/af_key.c | 4 ++--
net/netlink/af_netlink.c | 6 +++---
net/rxrpc/skbuff.c | 12 ++++++------
net/sctp/outqueue.c | 2 +-
net/sctp/socket.c | 2 +-
22 files changed, 74 insertions(+), 74 deletions(-)
diff --git a/drivers/infiniband/hw/nes/nes_cm.c b/drivers/infiniband/hw/nes/nes_cm.c
index fb983df..08dac2e 100644
--- a/drivers/infiniband/hw/nes/nes_cm.c
+++ b/drivers/infiniband/hw/nes/nes_cm.c
@@ -743,7 +743,7 @@ int schedule_nes_timer(struct nes_cm_node *cm_node, struct sk_buff *skb,
if (type == NES_TIMER_TYPE_SEND) {
new_send->seq_num = ntohl(tcp_hdr(skb)->seq);
- atomic_inc(&new_send->skb->users);
+ refcount_inc(&new_send->skb->users);
spin_lock_irqsave(&cm_node->retrans_list_lock, flags);
cm_node->send_entry = new_send;
add_ref_cm_node(cm_node);
@@ -925,7 +925,7 @@ static void nes_cm_timer_tick(unsigned long pass)
flags);
break;
}
- atomic_inc(&send_entry->skb->users);
+ refcount_inc(&send_entry->skb->users);
cm_packets_retrans++;
nes_debug(NES_DBG_CM, "Retransmitting send_entry %p "
"for node %p, jiffies = %lu, time to send = "
diff --git a/drivers/isdn/mISDN/socket.c b/drivers/isdn/mISDN/socket.c
index 99e5f97..c5603d1 100644
--- a/drivers/isdn/mISDN/socket.c
+++ b/drivers/isdn/mISDN/socket.c
@@ -155,7 +155,7 @@ mISDN_sock_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
copied = skb->len + MISDN_HEADER_LEN;
if (len < copied) {
if (flags & MSG_PEEK)
- atomic_dec(&skb->users);
+ refcount_dec(&skb->users);
else
skb_queue_head(&sk->sk_receive_queue, skb);
return -ENOSPC;
diff --git a/drivers/net/rionet.c b/drivers/net/rionet.c
index 300bb14..e9f101c 100644
--- a/drivers/net/rionet.c
+++ b/drivers/net/rionet.c
@@ -201,7 +201,7 @@ static int rionet_start_xmit(struct sk_buff *skb, struct net_device *ndev)
rionet_queue_tx_msg(skb, ndev,
nets[rnet->mport->id].active[i]);
if (count)
- atomic_inc(&skb->users);
+ refcount_inc(&skb->users);
count++;
}
} else if (RIONET_MAC_MATCH(eth->h_dest)) {
diff --git a/drivers/s390/net/ctcm_main.c b/drivers/s390/net/ctcm_main.c
index ac65f12..d079d02 100644
--- a/drivers/s390/net/ctcm_main.c
+++ b/drivers/s390/net/ctcm_main.c
@@ -483,7 +483,7 @@ static int ctcm_transmit_skb(struct channel *ch, struct sk_buff *skb)
spin_unlock_irqrestore(&ch->collect_lock, saveflags);
return -EBUSY;
} else {
- atomic_inc(&skb->users);
+ refcount_inc(&skb->users);
header.length = l;
header.type = skb->protocol;
header.unused = 0;
@@ -500,7 +500,7 @@ static int ctcm_transmit_skb(struct channel *ch, struct sk_buff *skb)
* Protect skb against beeing free'd by upper
* layers.
*/
- atomic_inc(&skb->users);
+ refcount_inc(&skb->users);
ch->prof.txlen += skb->len;
header.length = skb->len + LL_HEADER_LENGTH;
header.type = skb->protocol;
@@ -517,14 +517,14 @@ static int ctcm_transmit_skb(struct channel *ch, struct sk_buff *skb)
if (hi) {
nskb = alloc_skb(skb->len, GFP_ATOMIC | GFP_DMA);
if (!nskb) {
- atomic_dec(&skb->users);
+ refcount_dec(&skb->users);
skb_pull(skb, LL_HEADER_LENGTH + 2);
ctcm_clear_busy(ch->netdev);
return -ENOMEM;
} else {
memcpy(skb_put(nskb, skb->len), skb->data, skb->len);
- atomic_inc(&nskb->users);
- atomic_dec(&skb->users);
+ refcount_inc(&nskb->users);
+ refcount_dec(&skb->users);
dev_kfree_skb_irq(skb);
skb = nskb;
}
@@ -542,7 +542,7 @@ static int ctcm_transmit_skb(struct channel *ch, struct sk_buff *skb)
* Remove our header. It gets added
* again on retransmit.
*/
- atomic_dec(&skb->users);
+ refcount_dec(&skb->users);
skb_pull(skb, LL_HEADER_LENGTH + 2);
ctcm_clear_busy(ch->netdev);
return -ENOMEM;
@@ -553,7 +553,7 @@ static int ctcm_transmit_skb(struct channel *ch, struct sk_buff *skb)
ch->ccw[1].count = skb->len;
skb_copy_from_linear_data(skb,
skb_put(ch->trans_skb, skb->len), skb->len);
- atomic_dec(&skb->users);
+ refcount_dec(&skb->users);
dev_kfree_skb_irq(skb);
ccw_idx = 0;
} else {
@@ -679,7 +679,7 @@ static int ctcmpc_transmit_skb(struct channel *ch, struct sk_buff *skb)
if ((fsm_getstate(ch->fsm) != CTC_STATE_TXIDLE) || grp->in_sweep) {
spin_lock_irqsave(&ch->collect_lock, saveflags);
- atomic_inc(&skb->users);
+ refcount_inc(&skb->users);
p_header = kmalloc(PDU_HEADER_LENGTH, gfp_type());
if (!p_header) {
@@ -716,7 +716,7 @@ static int ctcmpc_transmit_skb(struct channel *ch, struct sk_buff *skb)
* Protect skb against beeing free'd by upper
* layers.
*/
- atomic_inc(&skb->users);
+ refcount_inc(&skb->users);
/*
* IDAL support in CTCM is broken, so we have to
@@ -729,8 +729,8 @@ static int ctcmpc_transmit_skb(struct channel *ch, struct sk_buff *skb)
goto nomem_exit;
} else {
memcpy(skb_put(nskb, skb->len), skb->data, skb->len);
- atomic_inc(&nskb->users);
- atomic_dec(&skb->users);
+ refcount_inc(&nskb->users);
+ refcount_dec(&skb->users);
dev_kfree_skb_irq(skb);
skb = nskb;
}
@@ -810,7 +810,7 @@ static int ctcmpc_transmit_skb(struct channel *ch, struct sk_buff *skb)
ch->trans_skb->len = 0;
ch->ccw[1].count = skb->len;
memcpy(skb_put(ch->trans_skb, skb->len), skb->data, skb->len);
- atomic_dec(&skb->users);
+ refcount_dec(&skb->users);
dev_kfree_skb_irq(skb);
ccw_idx = 0;
CTCM_PR_DBGDATA("%s(%s): trans_skb len: %04x\n"
@@ -855,7 +855,7 @@ static int ctcmpc_transmit_skb(struct channel *ch, struct sk_buff *skb)
"%s(%s): MEMORY allocation ERROR\n",
CTCM_FUNTAIL, ch->id);
rc = -ENOMEM;
- atomic_dec(&skb->users);
+ refcount_dec(&skb->users);
dev_kfree_skb_any(skb);
fsm_event(priv->mpcg->fsm, MPCG_EVENT_INOP, dev);
done:
diff --git a/drivers/s390/net/netiucv.c b/drivers/s390/net/netiucv.c
index 3f85b97..44fd71c 100644
--- a/drivers/s390/net/netiucv.c
+++ b/drivers/s390/net/netiucv.c
@@ -743,7 +743,7 @@ static void conn_action_txdone(fsm_instance *fi, int event, void *arg)
conn->prof.tx_pending--;
if (single_flag) {
if ((skb = skb_dequeue(&conn->commit_queue))) {
- atomic_dec(&skb->users);
+ refcount_dec(&skb->users);
if (privptr) {
privptr->stats.tx_packets++;
privptr->stats.tx_bytes +=
@@ -767,7 +767,7 @@ static void conn_action_txdone(fsm_instance *fi, int event, void *arg)
txbytes += skb->len;
txpackets++;
stat_maxcq++;
- atomic_dec(&skb->users);
+ refcount_dec(&skb->users);
dev_kfree_skb_any(skb);
}
if (conn->collect_len > conn->prof.maxmulti)
@@ -959,7 +959,7 @@ static void netiucv_purge_skb_queue(struct sk_buff_head *q)
struct sk_buff *skb;
while ((skb = skb_dequeue(q))) {
- atomic_dec(&skb->users);
+ refcount_dec(&skb->users);
dev_kfree_skb_any(skb);
}
}
@@ -1177,7 +1177,7 @@ static int netiucv_transmit_skb(struct iucv_connection *conn,
IUCV_DBF_TEXT(data, 2,
"EBUSY from netiucv_transmit_skb\n");
} else {
- atomic_inc(&skb->users);
+ refcount_inc(&skb->users);
skb_queue_tail(&conn->collect_queue, skb);
conn->collect_len += l;
rc = 0;
@@ -1247,7 +1247,7 @@ static int netiucv_transmit_skb(struct iucv_connection *conn,
} else {
if (copied)
dev_kfree_skb(skb);
- atomic_inc(&nskb->users);
+ refcount_inc(&nskb->users);
skb_queue_tail(&conn->commit_queue, nskb);
}
}
diff --git a/drivers/s390/net/qeth_core_main.c b/drivers/s390/net/qeth_core_main.c
index 315d8a2..5103a02 100644
--- a/drivers/s390/net/qeth_core_main.c
+++ b/drivers/s390/net/qeth_core_main.c
@@ -1239,7 +1239,7 @@ static void qeth_release_skbs(struct qeth_qdio_out_buffer *buf)
iucv->sk_txnotify(skb, TX_NOTIFY_GENERALERROR);
}
}
- atomic_dec(&skb->users);
+ refcount_dec(&skb->users);
dev_kfree_skb_any(skb);
skb = skb_dequeue(&buf->skb_list);
}
@@ -3969,7 +3969,7 @@ static inline int qeth_fill_buffer(struct qeth_qdio_out_q *queue,
int flush_cnt = 0, hdr_len, large_send = 0;
buffer = buf->buffer;
- atomic_inc(&skb->users);
+ refcount_inc(&skb->users);
skb_queue_tail(&buf->skb_list, skb);
/*check first on TSO ....*/
diff --git a/drivers/scsi/cxgbi/libcxgbi.h b/drivers/scsi/cxgbi/libcxgbi.h
index 18e0ea8..9584b062 100644
--- a/drivers/scsi/cxgbi/libcxgbi.h
+++ b/drivers/scsi/cxgbi/libcxgbi.h
@@ -378,7 +378,7 @@ static inline void cxgbi_sock_enqueue_wr(struct cxgbi_sock *csk,
* just one user currently so we use atomic_set rather than skb_get
* to avoid the atomic op.
*/
- atomic_set(&skb->users, 2);
+ refcount_set(&skb->users, 2);
if (!csk->wr_pending_head)
csk->wr_pending_head = skb;
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 957a2b0..0bcaec4 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -815,7 +815,7 @@ struct sk_buff {
unsigned char *head,
*data;
unsigned int truesize;
- atomic_t users;
+ refcount_t users;
};
#ifdef __KERNEL__
@@ -1313,7 +1313,7 @@ static inline struct sk_buff *skb_queue_prev(const struct sk_buff_head *list,
*/
static inline struct sk_buff *skb_get(struct sk_buff *skb)
{
- atomic_inc(&skb->users);
+ refcount_inc(&skb->users);
return skb;
}
@@ -1414,7 +1414,7 @@ static inline void __skb_header_release(struct sk_buff *skb)
*/
static inline int skb_shared(const struct sk_buff *skb)
{
- return atomic_read(&skb->users) != 1;
+ return refcount_read(&skb->users) != 1;
}
/**
diff --git a/net/core/datagram.c b/net/core/datagram.c
index ea63334..281e5d6 100644
--- a/net/core/datagram.c
+++ b/net/core/datagram.c
@@ -244,7 +244,7 @@ struct sk_buff *__skb_try_recv_datagram(struct sock *sk, unsigned int flags,
}
}
*peeked = 1;
- atomic_inc(&skb->users);
+ refcount_inc(&skb->users);
} else {
__skb_unlink(skb, queue);
if (destructor)
@@ -313,9 +313,9 @@ void __skb_free_datagram_locked(struct sock *sk, struct sk_buff *skb, int len)
{
bool slow;
- if (likely(atomic_read(&skb->users) == 1))
+ if (likely(refcount_read(&skb->users) == 1))
smp_rmb();
- else if (likely(!atomic_dec_and_test(&skb->users))) {
+ else if (likely(!refcount_dec_and_test(&skb->users))) {
sk_peek_offset_bwd(sk, len);
return;
}
@@ -343,7 +343,7 @@ int __sk_queue_drop_skb(struct sock *sk, struct sk_buff *skb,
spin_lock_bh(&sk->sk_receive_queue.lock);
if (skb == skb_peek(&sk->sk_receive_queue)) {
__skb_unlink(skb, &sk->sk_receive_queue);
- atomic_dec(&skb->users);
+ refcount_dec(&skb->users);
if (destructor)
destructor(sk, skb);
err = 0;
diff --git a/net/core/dev.c b/net/core/dev.c
index d947308..eeb6338 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1830,7 +1830,7 @@ static inline int deliver_skb(struct sk_buff *skb,
{
if (unlikely(skb_orphan_frags(skb, GFP_ATOMIC)))
return -ENOMEM;
- atomic_inc(&skb->users);
+ refcount_inc(&skb->users);
return pt_prev->func(skb, skb->dev, pt_prev, orig_dev);
}
@@ -2449,10 +2449,10 @@ void __dev_kfree_skb_irq(struct sk_buff *skb, enum skb_free_reason reason)
{
unsigned long flags;
- if (likely(atomic_read(&skb->users) == 1)) {
+ if (likely(refcount_read(&skb->users) == 1)) {
smp_rmb();
- atomic_set(&skb->users, 0);
- } else if (likely(!atomic_dec_and_test(&skb->users))) {
+ refcount_set(&skb->users, 0);
+ } else if (likely(!refcount_dec_and_test(&skb->users))) {
return;
}
get_kfree_skb_cb(skb)->reason = reason;
@@ -3864,7 +3864,7 @@ static __latent_entropy void net_tx_action(struct softirq_action *h)
clist = clist->next;
- WARN_ON(atomic_read(&skb->users));
+ WARN_ON(refcount_read(&skb->users));
if (likely(get_kfree_skb_cb(skb)->reason == SKB_REASON_CONSUMED))
trace_consume_skb(skb);
else
diff --git a/net/core/netpoll.c b/net/core/netpoll.c
index 9424673..891d88e 100644
--- a/net/core/netpoll.c
+++ b/net/core/netpoll.c
@@ -271,7 +271,7 @@ static void zap_completion_queue(void)
struct sk_buff *skb = clist;
clist = clist->next;
if (!skb_irq_freeable(skb)) {
- atomic_inc(&skb->users);
+ refcount_inc(&skb->users);
dev_kfree_skb_any(skb); /* put this one back */
} else {
__kfree_skb(skb);
@@ -303,7 +303,7 @@ static struct sk_buff *find_skb(struct netpoll *np, int len, int reserve)
return NULL;
}
- atomic_set(&skb->users, 1);
+ refcount_set(&skb->users, 1);
skb_reserve(skb, reserve);
return skb;
}
diff --git a/net/core/pktgen.c b/net/core/pktgen.c
index 96947f5..598355f 100644
--- a/net/core/pktgen.c
+++ b/net/core/pktgen.c
@@ -3361,7 +3361,7 @@ static void pktgen_wait_for_skb(struct pktgen_dev *pkt_dev)
{
ktime_t idle_start = ktime_get();
- while (atomic_read(&(pkt_dev->skb->users)) != 1) {
+ while (refcount_read(&(pkt_dev->skb->users)) != 1) {
if (signal_pending(current))
break;
@@ -3418,7 +3418,7 @@ static void pktgen_xmit(struct pktgen_dev *pkt_dev)
if (pkt_dev->xmit_mode == M_NETIF_RECEIVE) {
skb = pkt_dev->skb;
skb->protocol = eth_type_trans(skb, skb->dev);
- atomic_add(burst, &skb->users);
+ refcount_add(burst, &skb->users);
local_bh_disable();
do {
ret = netif_receive_skb(skb);
@@ -3426,11 +3426,11 @@ static void pktgen_xmit(struct pktgen_dev *pkt_dev)
pkt_dev->errors++;
pkt_dev->sofar++;
pkt_dev->seq_num++;
- if (atomic_read(&skb->users) != burst) {
+ if (refcount_read(&skb->users) != burst) {
/* skb was queued by rps/rfs or taps,
* so cannot reuse this skb
*/
- atomic_sub(burst - 1, &skb->users);
+ WARN_ON(refcount_sub_and_test(burst - 1, &skb->users));
/* get out of the loop and wait
* until skb is consumed
*/
@@ -3444,7 +3444,7 @@ static void pktgen_xmit(struct pktgen_dev *pkt_dev)
goto out; /* Skips xmit_mode M_START_XMIT */
} else if (pkt_dev->xmit_mode == M_QUEUE_XMIT) {
local_bh_disable();
- atomic_inc(&pkt_dev->skb->users);
+ refcount_inc(&pkt_dev->skb->users);
ret = dev_queue_xmit(pkt_dev->skb);
switch (ret) {
@@ -3485,7 +3485,7 @@ static void pktgen_xmit(struct pktgen_dev *pkt_dev)
pkt_dev->last_ok = 0;
goto unlock;
}
- atomic_add(burst, &pkt_dev->skb->users);
+ refcount_add(burst, &pkt_dev->skb->users);
xmit_more:
ret = netdev_start_xmit(pkt_dev->skb, odev, txq, --burst > 0);
@@ -3511,11 +3511,11 @@ static void pktgen_xmit(struct pktgen_dev *pkt_dev)
/* fallthru */
case NETDEV_TX_BUSY:
/* Retry it next time */
- atomic_dec(&(pkt_dev->skb->users));
+ refcount_dec(&(pkt_dev->skb->users));
pkt_dev->last_ok = 0;
}
if (unlikely(burst))
- atomic_sub(burst, &pkt_dev->skb->users);
+ WARN_ON(refcount_sub_and_test(burst, &pkt_dev->skb->users));
unlock:
HARD_TX_UNLOCK(odev, txq);
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index c4e84c5..8044cf3 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -647,7 +647,7 @@ int rtnetlink_send(struct sk_buff *skb, struct net *net, u32 pid, unsigned int g
NETLINK_CB(skb).dst_group = group;
if (echo)
- atomic_inc(&skb->users);
+ refcount_inc(&skb->users);
netlink_broadcast(rtnl, skb, pid, group, GFP_KERNEL);
if (echo)
err = netlink_unicast(rtnl, skb, pid, MSG_DONTWAIT);
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 89f0d4e..94e961f 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -176,7 +176,7 @@ struct sk_buff *__alloc_skb_head(gfp_t gfp_mask, int node)
memset(skb, 0, offsetof(struct sk_buff, tail));
skb->head = NULL;
skb->truesize = sizeof(struct sk_buff);
- atomic_set(&skb->users, 1);
+ refcount_set(&skb->users, 1);
skb->mac_header = (typeof(skb->mac_header))~0U;
out:
@@ -247,7 +247,7 @@ struct sk_buff *__alloc_skb(unsigned int size, gfp_t gfp_mask,
/* Account for allocated memory : skb + skb->head */
skb->truesize = SKB_TRUESIZE(size);
skb->pfmemalloc = pfmemalloc;
- atomic_set(&skb->users, 1);
+ refcount_set(&skb->users, 1);
skb->head = data;
skb->data = data;
skb_reset_tail_pointer(skb);
@@ -314,7 +314,7 @@ struct sk_buff *__build_skb(void *data, unsigned int frag_size)
memset(skb, 0, offsetof(struct sk_buff, tail));
skb->truesize = SKB_TRUESIZE(size);
- atomic_set(&skb->users, 1);
+ refcount_set(&skb->users, 1);
skb->head = data;
skb->data = data;
skb_reset_tail_pointer(skb);
@@ -696,9 +696,9 @@ void kfree_skb(struct sk_buff *skb)
{
if (unlikely(!skb))
return;
- if (likely(atomic_read(&skb->users) == 1))
+ if (likely(refcount_read(&skb->users) == 1))
smp_rmb();
- else if (likely(!atomic_dec_and_test(&skb->users)))
+ else if (likely(!refcount_dec_and_test(&skb->users)))
return;
trace_kfree_skb(skb, __builtin_return_address(0));
__kfree_skb(skb);
@@ -748,9 +748,9 @@ void consume_skb(struct sk_buff *skb)
{
if (unlikely(!skb))
return;
- if (likely(atomic_read(&skb->users) == 1))
+ if (likely(refcount_read(&skb->users) == 1))
smp_rmb();
- else if (likely(!atomic_dec_and_test(&skb->users)))
+ else if (likely(!refcount_dec_and_test(&skb->users)))
return;
trace_consume_skb(skb);
__kfree_skb(skb);
@@ -807,9 +807,9 @@ void napi_consume_skb(struct sk_buff *skb, int budget)
return;
}
- if (likely(atomic_read(&skb->users) == 1))
+ if (likely(refcount_read(&skb->users) == 1))
smp_rmb();
- else if (likely(!atomic_dec_and_test(&skb->users)))
+ else if (likely(!refcount_dec_and_test(&skb->users)))
return;
/* if reaching here SKB is ready to free */
trace_consume_skb(skb);
@@ -906,7 +906,7 @@ static struct sk_buff *__skb_clone(struct sk_buff *n, struct sk_buff *skb)
C(head_frag);
C(data);
C(truesize);
- atomic_set(&n->users, 1);
+ refcount_set(&n->users, 1);
atomic_inc(&(skb_shinfo(skb)->dataref));
skb->cloned = 1;
diff --git a/net/dccp/ipv6.c b/net/dccp/ipv6.c
index b4019a5..c734eea 100644
--- a/net/dccp/ipv6.c
+++ b/net/dccp/ipv6.c
@@ -351,7 +351,7 @@ static int dccp_v6_conn_request(struct sock *sk, struct sk_buff *skb)
if (ipv6_opt_accepted(sk, skb, IP6CB(skb)) ||
np->rxopt.bits.rxinfo || np->rxopt.bits.rxoinfo ||
np->rxopt.bits.rxhlim || np->rxopt.bits.rxohlim) {
- atomic_inc(&skb->users);
+ refcount_inc(&skb->users);
ireq->pktopts = skb;
}
ireq->ir_iif = sk->sk_bound_dev_if;
diff --git a/net/ipv6/syncookies.c b/net/ipv6/syncookies.c
index 895ff65..2626a1d 100644
--- a/net/ipv6/syncookies.c
+++ b/net/ipv6/syncookies.c
@@ -185,7 +185,7 @@ struct sock *cookie_v6_check(struct sock *sk, struct sk_buff *skb)
if (ipv6_opt_accepted(sk, skb, &TCP_SKB_CB(skb)->header.h6) ||
np->rxopt.bits.rxinfo || np->rxopt.bits.rxoinfo ||
np->rxopt.bits.rxhlim || np->rxopt.bits.rxohlim) {
- atomic_inc(&skb->users);
+ refcount_inc(&skb->users);
ireq->pktopts = skb;
}
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index bdbc432..21bb2fc 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -718,7 +718,7 @@ static void tcp_v6_init_req(struct request_sock *req,
np->rxopt.bits.rxinfo ||
np->rxopt.bits.rxoinfo || np->rxopt.bits.rxhlim ||
np->rxopt.bits.rxohlim || np->repflow)) {
- atomic_inc(&skb->users);
+ refcount_inc(&skb->users);
ireq->pktopts = skb;
}
}
diff --git a/net/key/af_key.c b/net/key/af_key.c
index c6252ed..9d96b82 100644
--- a/net/key/af_key.c
+++ b/net/key/af_key.c
@@ -194,11 +194,11 @@ static int pfkey_broadcast_one(struct sk_buff *skb, struct sk_buff **skb2,
sock_hold(sk);
if (*skb2 == NULL) {
- if (atomic_read(&skb->users) != 1) {
+ if (refcount_read(&skb->users) != 1) {
*skb2 = skb_clone(skb, allocation);
} else {
*skb2 = skb;
- atomic_inc(&skb->users);
+ refcount_inc(&skb->users);
}
}
if (*skb2 != NULL) {
diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index 7b73c7c..7dac4a9 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -1797,7 +1797,7 @@ static int netlink_sendmsg(struct socket *sock, struct msghdr *msg, size_t len)
}
if (dst_group) {
- atomic_inc(&skb->users);
+ refcount_inc(&skb->users);
netlink_broadcast(sk, skb, dst_portid, dst_group, GFP_KERNEL);
}
err = netlink_unicast(sk, skb, dst_portid, msg->msg_flags&MSG_DONTWAIT);
@@ -2175,7 +2175,7 @@ int __netlink_dump_start(struct sock *ssk, struct sk_buff *skb,
struct netlink_sock *nlk;
int ret;
- atomic_inc(&skb->users);
+ refcount_inc(&skb->users);
sk = netlink_lookup(sock_net(ssk), ssk->sk_protocol, NETLINK_CB(skb).portid);
if (sk == NULL) {
@@ -2332,7 +2332,7 @@ int nlmsg_notify(struct sock *sk, struct sk_buff *skb, u32 portid,
int exclude_portid = 0;
if (report) {
- atomic_inc(&skb->users);
+ refcount_inc(&skb->users);
exclude_portid = portid;
}
diff --git a/net/rxrpc/skbuff.c b/net/rxrpc/skbuff.c
index 67b02c4..b8985d0 100644
--- a/net/rxrpc/skbuff.c
+++ b/net/rxrpc/skbuff.c
@@ -27,7 +27,7 @@ void rxrpc_new_skb(struct sk_buff *skb, enum rxrpc_skb_trace op)
{
const void *here = __builtin_return_address(0);
int n = atomic_inc_return(select_skb_count(op));
- trace_rxrpc_skb(skb, op, atomic_read(&skb->users), n, here);
+ trace_rxrpc_skb(skb, op, refcount_read(&skb->users), n, here);
}
/*
@@ -38,7 +38,7 @@ void rxrpc_see_skb(struct sk_buff *skb, enum rxrpc_skb_trace op)
const void *here = __builtin_return_address(0);
if (skb) {
int n = atomic_read(select_skb_count(op));
- trace_rxrpc_skb(skb, op, atomic_read(&skb->users), n, here);
+ trace_rxrpc_skb(skb, op, refcount_read(&skb->users), n, here);
}
}
@@ -49,7 +49,7 @@ void rxrpc_get_skb(struct sk_buff *skb, enum rxrpc_skb_trace op)
{
const void *here = __builtin_return_address(0);
int n = atomic_inc_return(select_skb_count(op));
- trace_rxrpc_skb(skb, op, atomic_read(&skb->users), n, here);
+ trace_rxrpc_skb(skb, op, refcount_read(&skb->users), n, here);
skb_get(skb);
}
@@ -63,7 +63,7 @@ void rxrpc_free_skb(struct sk_buff *skb, enum rxrpc_skb_trace op)
int n;
CHECK_SLAB_OKAY(&skb->users);
n = atomic_dec_return(select_skb_count(op));
- trace_rxrpc_skb(skb, op, atomic_read(&skb->users), n, here);
+ trace_rxrpc_skb(skb, op, refcount_read(&skb->users), n, here);
kfree_skb(skb);
}
}
@@ -78,7 +78,7 @@ void rxrpc_lose_skb(struct sk_buff *skb, enum rxrpc_skb_trace op)
int n;
CHECK_SLAB_OKAY(&skb->users);
n = atomic_dec_return(select_skb_count(op));
- trace_rxrpc_skb(skb, op, atomic_read(&skb->users), n, here);
+ trace_rxrpc_skb(skb, op, refcount_read(&skb->users), n, here);
kfree_skb(skb);
}
}
@@ -93,7 +93,7 @@ void rxrpc_purge_queue(struct sk_buff_head *list)
while ((skb = skb_dequeue((list))) != NULL) {
int n = atomic_dec_return(select_skb_count(rxrpc_skb_rx_purged));
trace_rxrpc_skb(skb, rxrpc_skb_rx_purged,
- atomic_read(&skb->users), n, here);
+ refcount_read(&skb->users), n, here);
kfree_skb(skb);
}
}
diff --git a/net/sctp/outqueue.c b/net/sctp/outqueue.c
index db352e5..1d6dae9 100644
--- a/net/sctp/outqueue.c
+++ b/net/sctp/outqueue.c
@@ -1094,7 +1094,7 @@ static void sctp_outq_flush(struct sctp_outq *q, int rtx_timeout, gfp_t gfp)
sctp_cname(SCTP_ST_CHUNK(chunk->chunk_hdr->type)) :
"illegal chunk", ntohl(chunk->subh.data_hdr->tsn),
chunk->skb ? chunk->skb->head : NULL, chunk->skb ?
- atomic_read(&chunk->skb->users) : -1);
+ refcount_read(&chunk->skb->users) : -1);
/* Add the chunk to the packet. */
status = sctp_packet_transmit_chunk(packet, chunk, 0, gfp);
diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index 6f0a9be..c1120d5 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -7421,7 +7421,7 @@ struct sk_buff *sctp_skb_recv_datagram(struct sock *sk, int flags,
if (flags & MSG_PEEK) {
skb = skb_peek(&sk->sk_receive_queue);
if (skb)
- atomic_inc(&skb->users);
+ refcount_inc(&skb->users);
} else {
skb = __skb_dequeue(&sk->sk_receive_queue);
}
--
2.7.4
refcount_t type and corresponding API should be
used instead of atomic_t when the variable is used as
a reference counter. This allows us to avoid accidental
refcounter overflows that might lead to use-after-free
situations.
Signed-off-by: Elena Reshetova <[email protected]>
Signed-off-by: Hans Liljestrand <[email protected]>
Signed-off-by: Kees Cook <[email protected]>
Signed-off-by: David Windsor <[email protected]>
---
include/net/arp.h | 2 +-
include/net/ndisc.h | 2 +-
include/net/neighbour.h | 9 +++++----
net/atm/clip.c | 6 +++---
net/core/neighbour.c | 14 +++++++-------
net/decnet/dn_neigh.c | 2 +-
6 files changed, 18 insertions(+), 17 deletions(-)
diff --git a/include/net/arp.h b/include/net/arp.h
index 65619a2..17d90e4 100644
--- a/include/net/arp.h
+++ b/include/net/arp.h
@@ -28,7 +28,7 @@ static inline struct neighbour *__ipv4_neigh_lookup(struct net_device *dev, u32
rcu_read_lock_bh();
n = __ipv4_neigh_lookup_noref(dev, key);
- if (n && !atomic_inc_not_zero(&n->refcnt))
+ if (n && !refcount_inc_not_zero(&n->refcnt))
n = NULL;
rcu_read_unlock_bh();
diff --git a/include/net/ndisc.h b/include/net/ndisc.h
index 8a02146..54062c1 100644
--- a/include/net/ndisc.h
+++ b/include/net/ndisc.h
@@ -384,7 +384,7 @@ static inline struct neighbour *__ipv6_neigh_lookup(struct net_device *dev, cons
rcu_read_lock_bh();
n = __ipv6_neigh_lookup_noref(dev, pkey);
- if (n && !atomic_inc_not_zero(&n->refcnt))
+ if (n && !refcount_inc_not_zero(&n->refcnt))
n = NULL;
rcu_read_unlock_bh();
diff --git a/include/net/neighbour.h b/include/net/neighbour.h
index 5ebf694..9a66cfc9 100644
--- a/include/net/neighbour.h
+++ b/include/net/neighbour.h
@@ -17,6 +17,7 @@
*/
#include <linux/atomic.h>
+#include <linux/refcount.h>
#include <linux/netdevice.h>
#include <linux/skbuff.h>
#include <linux/rcupdate.h>
@@ -137,7 +138,7 @@ struct neighbour {
unsigned long confirmed;
unsigned long updated;
rwlock_t lock;
- atomic_t refcnt;
+ refcount_t refcnt;
struct sk_buff_head arp_queue;
unsigned int arp_queue_len_bytes;
struct timer_list timer;
@@ -408,18 +409,18 @@ static inline struct neigh_parms *neigh_parms_clone(struct neigh_parms *parms)
static inline void neigh_release(struct neighbour *neigh)
{
- if (atomic_dec_and_test(&neigh->refcnt))
+ if (refcount_dec_and_test(&neigh->refcnt))
neigh_destroy(neigh);
}
static inline struct neighbour * neigh_clone(struct neighbour *neigh)
{
if (neigh)
- atomic_inc(&neigh->refcnt);
+ refcount_inc(&neigh->refcnt);
return neigh;
}
-#define neigh_hold(n) atomic_inc(&(n)->refcnt)
+#define neigh_hold(n) refcount_inc(&(n)->refcnt)
static inline int neigh_event_send(struct neighbour *neigh, struct sk_buff *skb)
{
diff --git a/net/atm/clip.c b/net/atm/clip.c
index 53b4ac0..33e0940 100644
--- a/net/atm/clip.c
+++ b/net/atm/clip.c
@@ -137,11 +137,11 @@ static int neigh_check_cb(struct neighbour *n)
if (entry->vccs || time_before(jiffies, entry->expires))
return 0;
- if (atomic_read(&n->refcnt) > 1) {
+ if (refcount_read(&n->refcnt) > 1) {
struct sk_buff *skb;
pr_debug("destruction postponed with ref %d\n",
- atomic_read(&n->refcnt));
+ refcount_read(&n->refcnt));
while ((skb = skb_dequeue(&n->arp_queue)) != NULL)
dev_kfree_skb(skb);
@@ -767,7 +767,7 @@ static void atmarp_info(struct seq_file *seq, struct neighbour *n,
seq_printf(seq, "(resolving)\n");
else
seq_printf(seq, "(expired, ref %d)\n",
- atomic_read(&entry->neigh->refcnt));
+ refcount_read(&entry->neigh->refcnt));
} else if (!svc) {
seq_printf(seq, "%d.%d.%d\n",
clip_vcc->vcc->dev->number,
diff --git a/net/core/neighbour.c b/net/core/neighbour.c
index e7c12ca..36f8008 100644
--- a/net/core/neighbour.c
+++ b/net/core/neighbour.c
@@ -140,7 +140,7 @@ static int neigh_forced_gc(struct neigh_table *tbl)
* - it is not permanent
*/
write_lock(&n->lock);
- if (atomic_read(&n->refcnt) == 1 &&
+ if (refcount_read(&n->refcnt) == 1 &&
!(n->nud_state & NUD_PERMANENT)) {
rcu_assign_pointer(*np,
rcu_dereference_protected(n->next,
@@ -218,7 +218,7 @@ static void neigh_flush_dev(struct neigh_table *tbl, struct net_device *dev)
neigh_del_timer(n);
n->dead = 1;
- if (atomic_read(&n->refcnt) != 1) {
+ if (refcount_read(&n->refcnt) != 1) {
/* The most unpleasant situation.
We must destroy neighbour entry,
but someone still uses it.
@@ -299,7 +299,7 @@ static struct neighbour *neigh_alloc(struct neigh_table *tbl, struct net_device
NEIGH_CACHE_STAT_INC(tbl, allocs);
n->tbl = tbl;
- atomic_set(&n->refcnt, 1);
+ refcount_set(&n->refcnt, 1);
n->dead = 1;
out:
return n;
@@ -408,7 +408,7 @@ struct neighbour *neigh_lookup(struct neigh_table *tbl, const void *pkey,
rcu_read_lock_bh();
n = __neigh_lookup_noref(tbl, pkey, dev);
if (n) {
- if (!atomic_inc_not_zero(&n->refcnt))
+ if (!refcount_inc_not_zero(&n->refcnt))
n = NULL;
NEIGH_CACHE_STAT_INC(tbl, hits);
}
@@ -437,7 +437,7 @@ struct neighbour *neigh_lookup_nodev(struct neigh_table *tbl, struct net *net,
n = rcu_dereference_bh(n->next)) {
if (!memcmp(n->primary_key, pkey, key_len) &&
net_eq(dev_net(n->dev), net)) {
- if (!atomic_inc_not_zero(&n->refcnt))
+ if (!refcount_inc_not_zero(&n->refcnt))
n = NULL;
NEIGH_CACHE_STAT_INC(tbl, hits);
break;
@@ -785,7 +785,7 @@ static void neigh_periodic_work(struct work_struct *work)
if (time_before(n->used, n->confirmed))
n->used = n->confirmed;
- if (atomic_read(&n->refcnt) == 1 &&
+ if (refcount_read(&n->refcnt) == 1 &&
(state == NUD_FAILED ||
time_after(jiffies, n->used + NEIGH_VAR(n->parms, GC_STALETIME)))) {
*np = n->next;
@@ -2183,7 +2183,7 @@ static int neigh_fill_info(struct sk_buff *skb, struct neighbour *neigh,
ci.ndm_used = jiffies_to_clock_t(now - neigh->used);
ci.ndm_confirmed = jiffies_to_clock_t(now - neigh->confirmed);
ci.ndm_updated = jiffies_to_clock_t(now - neigh->updated);
- ci.ndm_refcnt = atomic_read(&neigh->refcnt) - 1;
+ ci.ndm_refcnt = refcount_read(&neigh->refcnt) - 1;
read_unlock_bh(&neigh->lock);
if (nla_put_u32(skb, NDA_PROBES, atomic_read(&neigh->probes)) ||
diff --git a/net/decnet/dn_neigh.c b/net/decnet/dn_neigh.c
index 482730c..d8f7b6d 100644
--- a/net/decnet/dn_neigh.c
+++ b/net/decnet/dn_neigh.c
@@ -559,7 +559,7 @@ static inline void dn_neigh_format_entry(struct seq_file *seq,
(dn->flags&DN_NDFLAG_R2) ? "2" : "-",
(dn->flags&DN_NDFLAG_P3) ? "3" : "-",
dn->n.nud_state,
- atomic_read(&dn->n.refcnt),
+ refcount_read(&dn->n.refcnt),
dn->blksize,
(dn->n.dev) ? dn->n.dev->name : "?");
read_unlock(&n->lock);
--
2.7.4
On 03/16/2017 04:28 PM, Elena Reshetova wrote:
> refcount_t type and corresponding API should be
> used instead of atomic_t when the variable is used as
> a reference counter. This allows to avoid accidental
> refcounter overflows that might lead to use-after-free
> situations.
>
> Signed-off-by: Elena Reshetova <[email protected]>
> Signed-off-by: Hans Liljestrand <[email protected]>
> Signed-off-by: Kees Cook <[email protected]>
> Signed-off-by: David Windsor <[email protected]>
> ---
> include/linux/filter.h | 3 ++-
> net/core/filter.c | 7 ++++---
> 2 files changed, 6 insertions(+), 4 deletions(-)
>
> diff --git a/include/linux/filter.h b/include/linux/filter.h
> index 8053c38..20247e7 100644
> --- a/include/linux/filter.h
> +++ b/include/linux/filter.h
> @@ -7,6 +7,7 @@
> #include <stdarg.h>
>
> #include <linux/atomic.h>
> +#include <linux/refcount.h>
> #include <linux/compat.h>
> #include <linux/skbuff.h>
> #include <linux/linkage.h>
> @@ -431,7 +432,7 @@ struct bpf_prog {
> };
>
> struct sk_filter {
> - atomic_t refcnt;
> + refcount_t refcnt;
> struct rcu_head rcu;
> struct bpf_prog *prog;
> };
> diff --git a/net/core/filter.c b/net/core/filter.c
> index ebaeaf2..62267e2 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -928,7 +928,7 @@ static void sk_filter_release_rcu(struct rcu_head *rcu)
> */
> static void sk_filter_release(struct sk_filter *fp)
> {
> - if (atomic_dec_and_test(&fp->refcnt))
> + if (refcount_dec_and_test(&fp->refcnt))
> call_rcu(&fp->rcu, sk_filter_release_rcu);
> }
>
> @@ -950,7 +950,7 @@ bool sk_filter_charge(struct sock *sk, struct sk_filter *fp)
> /* same check as in sock_kmalloc() */
> if (filter_size <= sysctl_optmem_max &&
> atomic_read(&sk->sk_omem_alloc) + filter_size < sysctl_optmem_max) {
> - atomic_inc(&fp->refcnt);
> + refcount_inc(&fp->refcnt);
> atomic_add(filter_size, &sk->sk_omem_alloc);
> return true;
> }
> @@ -1179,12 +1179,13 @@ static int __sk_attach_prog(struct bpf_prog *prog, struct sock *sk)
> return -ENOMEM;
>
> fp->prog = prog;
> - atomic_set(&fp->refcnt, 0);
> + refcount_set(&fp->refcnt, 1);
>
> if (!sk_filter_charge(sk, fp)) {
> kfree(fp);
> return -ENOMEM;
> }
> + refcount_set(&fp->refcnt, 1);
Regarding the two subsequent refcount_set(, 1) that look a bit strange
due to sk_filter_charge() having a refcount_inc(), I presume ... can't
the refcount API handle such a corner case? Or alternatively, let
sk_filter_charge() handle it, for example:
bool __sk_filter_charge(struct sock *sk, struct sk_filter *fp)
{
u32 filter_size = bpf_prog_size(fp->prog->len);
/* same check as in sock_kmalloc() */
if (filter_size <= sysctl_optmem_max &&
atomic_read(&sk->sk_omem_alloc) + filter_size < sysctl_optmem_max) {
atomic_add(filter_size, &sk->sk_omem_alloc);
return true;
}
return false;
}
And this goes to filter.h:
bool __sk_filter_charge(struct sock *sk, struct sk_filter *fp);
bool sk_filter_charge(struct sock *sk, struct sk_filter *fp)
{
bool ret = __sk_filter_charge(sk, fp);
if (ret)
refcount_inc(&fp->refcnt);
return ret;
}
... and let __sk_attach_prog() call __sk_filter_charge() and only do
the second refcount_set()?
> old_fp = rcu_dereference_protected(sk->sk_filter,
> lockdep_sock_is_held(sk));
>
On Thu, 2017-03-16 at 17:28 +0200, Elena Reshetova wrote:
> refcount_t type and corresponding API should be
> used instead of atomic_t when the variable is used as
> a reference counter. This allows to avoid accidental
> refcounter overflows that might lead to use-after-free
> situations.
...
> static __always_inline void sock_hold(struct sock *sk)
> {
> - atomic_inc(&sk->sk_refcnt);
> + refcount_inc(&sk->sk_refcnt);
> }
>
While I certainly see the value of these refcount_t, we have a very
different behavior on these atomic_inc() which were doing a single
inlined LOCK RMW on x86.
We now call an external function performing an
atomic_read(), various ops/tests, then atomic_cmpxchg_relaxed(), in a
loop, losing the nice ability for x86 of preventing live locks.
This looks like a lot of bloat, just to be able to chase hypothetical bugs in the
kernel.
I would love to have a way to enable extra debugging when I want a debug
kernel, like LOCKDEP or KASAN.
By adding all this bloat, we assert the Linux kernel is terminally buggy and
every atomic_inc() we did was suspicious, and needs to always be
instrumented/validated.
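For reference, the out-of-line path being described looks roughly like the
following — a simplified sketch of the generic refcount_inc_not_zero() /
refcount_inc() logic, not a verbatim copy of lib/refcount.c:

bool refcount_inc_not_zero(refcount_t *r)
{
	unsigned int old, new, val = atomic_read(&r->refs);

	for (;;) {
		new = val + 1;
		if (!val)		/* inc on 0: object already freed, refuse */
			return false;
		if (unlikely(!new))	/* would wrap: stay saturated at UINT_MAX */
			return true;
		old = atomic_cmpxchg_relaxed(&r->refs, val, new);
		if (old == val)
			break;
		val = old;		/* lost the race, retry with the observed value */
	}

	WARN_ONCE(new == UINT_MAX, "refcount_t: saturated; leaking memory.\n");
	return true;
}

void refcount_inc(refcount_t *r)
{
	WARN_ONCE(!refcount_inc_not_zero(r),
		  "refcount_t: increment on 0; use-after-free.\n");
}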
On Thu, Mar 16, 2017 at 10:58 AM, Eric Dumazet <[email protected]> wrote:
> On Thu, 2017-03-16 at 17:28 +0200, Elena Reshetova wrote:
>> refcount_t type and corresponding API should be
>> used instead of atomic_t when the variable is used as
>> a reference counter. This allows to avoid accidental
>> refcounter overflows that might lead to use-after-free
>> situations.
>
>
> ...
>
>> static __always_inline void sock_hold(struct sock *sk)
>> {
>> - atomic_inc(&sk->sk_refcnt);
>> + refcount_inc(&sk->sk_refcnt);
>> }
>>
>
> While I certainly see the value of these refcount_t, we have a very
> different behavior on these atomic_inc() which were doing a single
> inlined LOCK RMW on x86.
I think we can certainly investigate arch-specific ways to improve the
performance, but the consensus seemed to be that getting the
infrastructure in and doing the migration was the first set of steps.
> We now call an external function performing a
> atomic_read(), various ops/tests, then atomic_cmpxchg_relaxed(), in a
> loop, losing the nice ability for x86 of preventing live locks.
>
> Looks a lot of bloat, just to be able to chase hypothetical bugs in the
> kernel.
>
> I would love to have a way to enable extra debugging when I want a debug
> kernel, like LOCKDEP or KASAN.
>
> By adding all this bloat, we assert linux kernel is terminally buggy and
> every atomic_inc() we did was suspicious, and need to be always
> instrumented/validated.
This IS the assertion, unfortunately. With average 5 year lifetimes on
security flaws[1], and many of the last couple years' public exploits
being refcount flaws[2], this is something we have to get done. We
need the default kernel to be much more self-protective, and this is
one of many places to make it happen.
I am, of course, biased, but I think the evidence of actual
refcounting attacks outweighs the theoretical performance cost of
these changes. If there is a realistic workflow that shows actual
problems, let's examine it and find a solution for that as a separate
part of this work without blocking this migration.
-Kees
[1] https://outflux.net/blog/archives/2016/10/18/security-bug-lifetime/
[2] http://kernsec.org/wiki/index.php/Bug_Classes/Integer_overflow
--
Kees Cook
Pixel Security
From: Kees Cook <[email protected]>
Date: Thu, 16 Mar 2017 11:38:25 -0600
> I am, of course, biased, but I think the evidence of actual
> refcounting attacks outweighs the theoretical performance cost of
> these changes.
This is not theoretical at all.
We count the nanoseconds that every packet takes to get processed and
you are adding quite a bit.
I understand your point of view, but this is knowingly going to add
performance regressions to the networking code.
> From: Kees Cook <[email protected]>
> Date: Thu, 16 Mar 2017 11:38:25 -0600
>
> > I am, of course, biased, but I think the evidence of actual
> > refcounting attacks outweighs the theoretical performance cost of
> > these changes.
>
> This is not theoretical at all.
>
> We count the nanoseconds that every packet takes to get processed and
> you are adding quite a bit.
>
> I understand your point of view, but this is knowingly going to add
> performance regressions to the networking code.
Should we then first measure the actual numbers to understand what we are talking about here?
I would be glad to do it if you suggest the correct way to do measurements here so that they actually reflect the real-life use cases.
Best Regards,
Elena.
> On 03/16/2017 04:28 PM, Elena Reshetova wrote:
> > refcount_t type and corresponding API should be
> > used instead of atomic_t when the variable is used as
> > a reference counter. This allows to avoid accidental
> > refcounter overflows that might lead to use-after-free
> > situations.
> >
> > Signed-off-by: Elena Reshetova <[email protected]>
> > Signed-off-by: Hans Liljestrand <[email protected]>
> > Signed-off-by: Kees Cook <[email protected]>
> > Signed-off-by: David Windsor <[email protected]>
> > ---
> > include/linux/filter.h | 3 ++-
> > net/core/filter.c | 7 ++++---
> > 2 files changed, 6 insertions(+), 4 deletions(-)
> >
> > diff --git a/include/linux/filter.h b/include/linux/filter.h
> > index 8053c38..20247e7 100644
> > --- a/include/linux/filter.h
> > +++ b/include/linux/filter.h
> > @@ -7,6 +7,7 @@
> > #include <stdarg.h>
> >
> > #include <linux/atomic.h>
> > +#include <linux/refcount.h>
> > #include <linux/compat.h>
> > #include <linux/skbuff.h>
> > #include <linux/linkage.h>
> > @@ -431,7 +432,7 @@ struct bpf_prog {
> > };
> >
> > struct sk_filter {
> > - atomic_t refcnt;
> > + refcount_t refcnt;
> > struct rcu_head rcu;
> > struct bpf_prog *prog;
> > };
> > diff --git a/net/core/filter.c b/net/core/filter.c
> > index ebaeaf2..62267e2 100644
> > --- a/net/core/filter.c
> > +++ b/net/core/filter.c
> > @@ -928,7 +928,7 @@ static void sk_filter_release_rcu(struct rcu_head *rcu)
> > */
> > static void sk_filter_release(struct sk_filter *fp)
> > {
> > - if (atomic_dec_and_test(&fp->refcnt))
> > + if (refcount_dec_and_test(&fp->refcnt))
> > call_rcu(&fp->rcu, sk_filter_release_rcu);
> > }
> >
> > @@ -950,7 +950,7 @@ bool sk_filter_charge(struct sock *sk, struct sk_filter *fp)
> > /* same check as in sock_kmalloc() */
> > if (filter_size <= sysctl_optmem_max &&
> > atomic_read(&sk->sk_omem_alloc) + filter_size <
> sysctl_optmem_max) {
> > - atomic_inc(&fp->refcnt);
> > + refcount_inc(&fp->refcnt);
> > atomic_add(filter_size, &sk->sk_omem_alloc);
> > return true;
> > }
> > @@ -1179,12 +1179,13 @@ static int __sk_attach_prog(struct bpf_prog *prog,
> struct sock *sk)
> > return -ENOMEM;
> >
> > fp->prog = prog;
> > - atomic_set(&fp->refcnt, 0);
> > + refcount_set(&fp->refcnt, 1);
> >
> > if (!sk_filter_charge(sk, fp)) {
> > kfree(fp);
> > return -ENOMEM;
> > }
> > + refcount_set(&fp->refcnt, 1);
>
> Regarding the two subsequent refcount_set(, 1) that look a bit strange
> due to the sk_filter_charge() having refcount_inc() I presume ... can't
> the refcount API handle such corner case?
Yes, it was exactly because of the refcount_inc() from zero in sk_filter_charge().
refcount_inc() would refuse to do an inc from zero for security reasons. At some
point in the past we discussed refcount_inc_not_one(), but it was decided to be too special a case
to support (we really have very few such cases).
> Or alternatively the let the
> sk_filter_charge() handle it, for example:
>
> bool __sk_filter_charge(struct sock *sk, struct sk_filter *fp)
> {
> u32 filter_size = bpf_prog_size(fp->prog->len);
>
> /* same check as in sock_kmalloc() */
> if (filter_size <= sysctl_optmem_max &&
> atomic_read(&sk->sk_omem_alloc) + filter_size <
> sysctl_optmem_max) {
> atomic_add(filter_size, &sk->sk_omem_alloc);
> return true;
> }
> return false;
> }
>
> And this goes to filter.h:
>
> bool __sk_filter_charge(struct sock *sk, struct sk_filter *fp);
>
> bool sk_filter_charge(struct sock *sk, struct sk_filter *fp)
> {
> bool ret = __sk_filter_charge(sk, fp);
> if (ret)
> refcount_inc(&fp->refcnt);
> return ret;
> }
>
> ... and let __sk_attach_prog() call __sk_filter_charge() and only do
> the second refcount_set()?
>
> > old_fp = rcu_dereference_protected(sk->sk_filter,
> >
> lockdep_sock_is_held(sk));
> >
Oh, yes, this would make it look less awkward. Thank you for the suggestion, Daniel!
I guess we tried to be less invasive with the code changes overall, maybe even too careful...
I will update the patch and send a new version.
Best Regards,
Elena.
On Fri, 2017-03-17 at 07:42 +0000, Reshetova, Elena wrote:
> Should we then first measure the actual numbers to understand what we
> are talking here about?
> I would be glad to do it if you suggest what is the correct way to do
> measurements here to actually reflect the real life use cases.
How have these patches been tested in real life exactly ?
Can you quantify the number of added cycles per TCP packet, where I expect
we have maybe 20 atomic operations in all layers ...
(sk refcnt, skb->users, page refcounts, sk->sk_wmem_alloc,
sk->sk_rmem_alloc, qdisc ...)
Once we 'protect' all of them, cost will be quite high.
This translates to more fossil fuel being burnt.
One atomic_inc() used to be a single x86 instruction.
Rough estimate of refcount_inc() :
0000000000000140 <refcount_inc>:
140: 55 push %rbp
141: 48 89 e5 mov %rsp,%rbp
144: e8 00 00 00 00 callq refcount_inc_not_zero
149: 84 c0 test %al,%al
14b: 74 02 je 14f <refcount_inc+0xf>
14d: 5d pop %rbp
14e: c3 retq
00000000000000e0 <refcount_inc_not_zero>:
e0: 8b 17 mov (%rdi),%edx
e2: eb 10 jmp f4 <refcount_inc_not_zero+0x14>
e4: 85 c9 test %ecx,%ecx
e6: 74 1b je 103 <refcount_inc_not_zero+0x23>
e8: 89 d0 mov %edx,%eax
ea: f0 0f b1 0f lock cmpxchg %ecx,(%rdi)
ee: 39 d0 cmp %edx,%eax
f0: 74 0c je fe <refcount_inc_not_zero+0x1e>
f2: 89 c2 mov %eax,%edx
f4: 85 d2 test %edx,%edx
f6: 8d 4a 01 lea 0x1(%rdx),%ecx
f9: 75 e9 jne e4 <refcount_inc_not_zero+0x4>
fb: 31 c0 xor %eax,%eax
fd: c3 retq
fe: 83 f9 ff cmp $0xffffffff,%ecx
101: 74 06 je 109 <refcount_inc_not_zero+0x29>
103: b8 01 00 00 00 mov $0x1,%eax
108: c3 retq
This is simply bloat for most cases.
Again, I believe this infrastructure makes sense for debugging kernels.
If some vendors are willing to run fully enabled debug kernels,
that is their choice. Probably many devices won't show any difference.
Have we forced KASAN to be enabled in the Linux kernel, just because
it found ~400 bugs so far ?
I believe refcount_t infra is not mature enough to be widely used
right now.
Maybe in a few months, when we have more flexibility, like the existing
debugging facilities (CONFIG_DEBUG_PAGEALLOC, CONFIG_DEBUG_PAGE_REF,
LOCKDEP, KMEMLEAK, KASAN, ...)
Eric Dumazet <[email protected]> wrote:
> On Fri, 2017-03-17 at 07:42 +0000, Reshetova, Elena wrote:
>
>> Should we then first measure the actual numbers to understand what we
>> are talking here about?
>> I would be glad to do it if you suggest what is the correct way to do
>> measurements here to actually reflect the real life use cases.
>
> How have these patches been tested in real life exactly ?
>
> Can you quantify number of added cycles per TCP packet, where I expect
> we have maybe 20 atomic operations in all layers ...
I completely agree. I think this thing needs to default to the
existing atomic_t behaviour.
Thanks,
--
Email: Herbert Xu <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
From: Herbert Xu <[email protected]>
Date: Sun, 19 Mar 2017 00:47:59 +0800
> Eric Dumazet <[email protected]> wrote:
>> On Fri, 2017-03-17 at 07:42 +0000, Reshetova, Elena wrote:
>>
>>> Should we then first measure the actual numbers to understand what we
>>> are talking here about?
>>> I would be glad to do it if you suggest what is the correct way to do
>>> measurements here to actually reflect the real life use cases.
>>
>> How have these patches been tested in real life exactly ?
>>
>> Can you quantify number of added cycles per TCP packet, where I expect
>> we have maybe 20 atomic operations in all layers ...
>
> I completely agree. I think this thing needs to default to the
> existing atomic_t behaviour.
I totally agree as well, the refcount_t facility as-is is unacceptable
for networking.
On Sat, Mar 18, 2017 at 06:21:21PM -0700, David Miller wrote:
> From: Herbert Xu <[email protected]>
> Date: Sun, 19 Mar 2017 00:47:59 +0800
>
> > Eric Dumazet <[email protected]> wrote:
> >> On Fri, 2017-03-17 at 07:42 +0000, Reshetova, Elena wrote:
> >>
> >>> Should we then first measure the actual numbers to understand what we
> >>> are talking here about?
> >>> I would be glad to do it if you suggest what is the correct way to do
> >>> measurements here to actually reflect the real life use cases.
> >>
> >> How have these patches been tested in real life exactly ?
> >>
> >> Can you quantify number of added cycles per TCP packet, where I expect
> >> we have maybe 20 atomic operations in all layers ...
> >
> > I completely agree. I think this thing needs to default to the
> > existing atomic_t behaviour.
>
> I totally agree as well, the refcount_t facility as-is is unacceptable
> for networking.
Can we at least give a benchmark and have someone run numbers? We should
be able to quantify these things.
On Mon, Mar 20, 2017 at 09:16:29PM +0800, Herbert Xu wrote:
> On Mon, Mar 20, 2017 at 11:39:37AM +0100, Peter Zijlstra wrote:
> >
> > Can we at least give a benchmark and have someone run numbers? We should
> > be able to quantify these things.
>
> Do you realise how many times this thing gets hit at 10Gb/s or
> higher? Anyway, since you're proposing this change you should
> demonstrate that it does not cause a performance regression.
So what bench/setup do you want ran?
On Mon, Mar 20, 2017 at 09:27:13PM +0800, Herbert Xu wrote:
> On Mon, Mar 20, 2017 at 02:23:57PM +0100, Peter Zijlstra wrote:
> >
> > So what bench/setup do you want ran?
>
> You can start by counting how many cycles an atomic op takes
> vs. how many cycles this new code takes.
On what uarch?
I think I tested a hand-coded asm version and it ended up at about double the
cycles for a cmpxchg loop vs the direct instruction on an IVB-EX (until
the memory bus saturated, at which point they took the same). Newer
parts will of course have different numbers.
Can't we run some iperf on a 40gbe fiber loop or something? It would be
very useful to have an actual workload we can run.
On Mon, Mar 20, 2017 at 11:39:37AM +0100, Peter Zijlstra wrote:
>
> Can we at least give a benchmark and have someone run numbers? We should
> be able to quantify these things.
Do you realise how many times this thing gets hit at 10Gb/s or
higher? Anyway, since you're proposing this change you should
demonstrate that it does not cause a performance regression.
Cheers,
--
Email: Herbert Xu <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
From: Herbert Xu
> Sent: 20 March 2017 13:16
> On Mon, Mar 20, 2017 at 11:39:37AM +0100, Peter Zijlstra wrote:
> >
> > Can we at least give a benchmark and have someone run numbers? We should
> > be able to quantify these things.
>
> Do you realise how many times this thing gets hit at 10Gb/s or
> higher? Anyway, since you're proposing this change you should
> demonstrate that it does not cause a performance regression.
What checks does refcount_t actually do?
An extra decrement is hard to detect since the item gets freed early.
I guess making the main 'allocate/free' code hold (say) 64k references
would give some leeway for extra decrements.
An extra increment will be detected when the count eventually wraps.
Unless the error is in a very common path, that won't happen for a long time.
On x86 the cpu flags from the 'lock inc/dec' could be used to reasonably
cheaply detect errors - provided you actually generate a forwards branch.
Otherwise having a common, but not every packet, code path verify that the
reference count is 'sane' would give reasonable coverage.
David
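To the question of what checks are actually performed, the release side looks
roughly like this — again a simplified sketch, not the exact lib/refcount.c
code. Note it does not contradict the point above: the first stray decrement
still frees the object early; only operations on a count that has already hit
zero (or would wrap) are caught.

bool refcount_dec_and_test(refcount_t *r)
{
	unsigned int old, new, val = atomic_read(&r->refs);

	for (;;) {
		if (unlikely(val == UINT_MAX))	/* saturated: never free */
			return false;

		if (!val) {
			/* decrement of an already-zero count: warn, don't wrap */
			WARN_ONCE(1, "refcount_t: underflow; use-after-free.\n");
			return false;
		}

		new = val - 1;
		old = atomic_cmpxchg_release(&r->refs, val, new);
		if (old == val)
			break;
		val = old;
	}

	return !new;	/* true only for the 1 -> 0 transition */
}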
On Mon, Mar 20, 2017 at 02:10:24PM +0000, David Laight wrote:
> On x86 the cpu flags from the 'lock inc/dec' could be used to reasonably
> cheaply detect errors - provided you actually generate a forwards branch.
Note that currently there is no arch specific implementation. We could
of course cure this.
But note that the thing you propose, using the overflow flag, can only
reasonably be done on PREEMPT=n kernels, otherwise we have an incredible
number of contexts that can nest.
Sure; getting all the stars aligned to double overflow is incredibly rare,
but I don't want to be the one to have to debug that.
On Mon, Mar 20, 2017 at 02:23:57PM +0100, Peter Zijlstra wrote:
>
> So what bench/setup do you want ran?
You can start by counting how many cycles an atomic op takes
vs. how many cycles this new code takes.
Cheers,
--
Email: Herbert Xu <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
On Mon, 2017-03-20 at 07:51 -0700, Eric Dumazet wrote:
> atomic_cmpxchg() on PowerPC is horribly more expensive because of the
> added two SYNC instructions.
Although I just saw that refcount was using atomic_cmpxchg_relaxed()
Time to find some documentation (probably missing) or get some specs for
this thing.
From: Peter Zijlstra
> Sent: 20 March 2017 14:28
> On Mon, Mar 20, 2017 at 02:10:24PM +0000, David Laight wrote:
> > On x86 the cpu flags from the 'lock inc/dec' could be used to reasonably
> > cheaply detect errors - provided you actually generate a forwards branch.
>
> Note that currently there is no arch specific implementation. We could
> of course cure this.
>
> But note that the thing you propose; using the overflow flag, can only
> reasonably be done on PREEMPT=n kernels, otherwise we have an incredible
> number of contexts that can nest.
>
> Sure; getting all starts aligned to double overflow is incredibly rare,
> but I don't want to be the one to have to debug that.
One overflow would set the overflow flag, you don't need both to fail.
In any case you can use the sign flag.
Say valid count values are -64k to -256 and 0 to MAXINT.
The count will normally be +ve unless the 'main free path'
has released the 64k references it holds.
If the sign bit is set after an inc/dec, the value is checked;
it might be valid, an error, or require the item to be freed.
Ok assuming the items have reasonable lifetimes and have a nominal
'delete' function.
David
On Mon, 2017-03-20 at 14:40 +0100, Peter Zijlstra wrote:
> On Mon, Mar 20, 2017 at 09:27:13PM +0800, Herbert Xu wrote:
> > On Mon, Mar 20, 2017 at 02:23:57PM +0100, Peter Zijlstra wrote:
> > >
> > > So what bench/setup do you want ran?
> >
> > You can start by counting how many cycles an atomic op takes
> > vs. how many cycles this new code takes.
>
> On what uarch?
>
> I think I tested hand coded asm version and it ended up about double the
> cycles for a cmpxchg loop vs the direct instruction on an IVB-EX (until
> the memory bus saturated, at which point they took the same). Newer
> parts will of course have different numbers,
>
> Can't we run some iperf on a 40gbe fiber loop or something? It would be
> very useful to have an actual workload we can run.
If atomic ops are converted one by one, it is likely that the results will
be noise.
We cannot start a global conversion without having a way to enable
selective debugging, can we ?
Then, adopting this fine infra would really not be a problem.
Some arches have an efficient atomic_inc() ( no full barriers ) while a "load
+ test + atomic_cmpxchg() + test + loop" is more expensive.
PowerPC has no efficient atomic_inc() and this definitely shows on
network intensive workloads involving concurrent cores/threads.
atomic_cmpxchg() on PowerPC is horribly more expensive because of the
added two SYNC instructions.
Networking performance is quite poor on PowerPC as of today.
On Mon, Mar 20, 2017 at 07:51:01AM -0700, Eric Dumazet wrote:
> PowerPC has no efficient atomic_inc() and this definitely shows on
> network intensive workloads involving concurrent cores/threads.
Correct, PPC LL/SC are dreadfully expensive.
> atomic_cmpxchg() on PowerPC is horribly more expensive because of the
> added two SYNC instructions.
Note that refcount_t uses atomic_cmpxchg_release() and
atomic_cmpxchg_relaxed() which avoid most of the painful barriers.
On Mon, 2017-03-20 at 07:59 -0700, Eric Dumazet wrote:
> On Mon, 2017-03-20 at 07:51 -0700, Eric Dumazet wrote:
>
> > atomic_cmpxchg() on PowerPC is horribly more expensive because of the
> > added two SYNC instructions.
>
> Although I just saw that refcount was using atomic_cmpxchg_relaxed()
>
> Time to find some documentation (probably missing) or get some specs for
> this thing.
Interesting.
UDP ipv4 xmit path gets a ~25 % improvement on PPC with this patch.
( 20 concurrent netperf -t UDP_STREAM : 2.45 Mpps -> 3.07 Mpps )
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 8471dd116771462d149e1da2807e446b69b74bcc..9f14aebf0ae1f5f366cfff0fbf58c48603916bc7 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -497,14 +497,14 @@ u32 ip_idents_reserve(u32 hash, int segs)
u32 now = (u32)jiffies;
u32 new, delta = 0;
- if (old != now && cmpxchg(p_tstamp, old, now) == old)
+ if (old != now && cmpxchg_relaxed(p_tstamp, old, now) == old)
delta = prandom_u32_max(now - old);
/* Do not use atomic_add_return() as it makes UBSAN unhappy */
do {
old = (u32)atomic_read(p_id);
new = old + delta + segs;
- } while (atomic_cmpxchg(p_id, old, new) != old);
+ } while (atomic_cmpxchg_relaxed(p_id, old, new) != old);
return new - segs;
}
On Mon, 2017-03-20 at 09:18 -0700, Eric Dumazet wrote:
> Interesting.
>
> UDP ipv4 xmit path gets a ~25 % improvement on PPC with this patch.
>
> ( 20 concurrent netperf -t UDP_STREAM : 2.45 Mpps -> 3.07 Mpps )
Well, there _is_ a difference, but not 25 % (this was probably caused by
different queues on TX or RX between my reboots).
I added a sysctl hack to be able to toggle the change dynamically on a given
workload, and we hit other bottlenecks (mainly qdisc locks and driver tx
locks) anyway.
On Mon, Mar 20, 2017 at 6:40 AM, Peter Zijlstra <[email protected]> wrote:
> On Mon, Mar 20, 2017 at 09:27:13PM +0800, Herbert Xu wrote:
>> On Mon, Mar 20, 2017 at 02:23:57PM +0100, Peter Zijlstra wrote:
>> >
>> > So what bench/setup do you want ran?
>>
>> You can start by counting how many cycles an atomic op takes
>> vs. how many cycles this new code takes.
>
> On what uarch?
>
> I think I tested hand coded asm version and it ended up about double the
> cycles for a cmpxchg loop vs the direct instruction on an IVB-EX (until
> the memory bus saturated, at which point they took the same). Newer
> parts will of course have different numbers,
>
> Can't we run some iperf on a 40gbe fiber loop or something? It would be
> very useful to have an actual workload we can run.
Yeah, this is exactly what I'd like to find as well. Just comparing
cycles between refcount implementations, while interesting, doesn't
show us real-world performance changes, which is what we need to
measure.
Is Eric's "20 concurrent 'netperf -t UDP_STREAM'" example (from
elsewhere in this email thread) real-world meaningful enough?
-Kees
--
Kees Cook
Pixel Security
On Tue, 2017-03-21 at 13:49 -0700, Kees Cook wrote:
> Yeah, this is exactly what I'd like to find as well. Just comparing
> cycles between refcount implementations, while interesting, doesn't
> show us real-world performance changes, which is what we need to
> measure.
>
> Is Eric's "20 concurrent 'netperf -t UDP_STREAM'" example (from
> elsewhere in this email thread) real-world meaningful enough?
Not at all ;)
This was targeting the specific change I had in mind for
ip_idents_reserve(), which is not used by TCP flows.
Unfortunately there is no good test simulating real-world workloads,
which are mostly using TCP flows.
Most synthetic tools you can find are not using epoll(), and very often
hit bottlenecks in other layers.
It looks like our suggestion to get kernel builds with atomic_inc()
being exactly an atomic_inc() is not even discussed or implemented.
Coding this would require less time than running a typical Google kernel
qualification (roughly one month, thousands of hosts..., days of SWE).
From: Eric Dumazet <[email protected]>
Date: Tue, 21 Mar 2017 14:23:09 -0700
> It looks like our suggestion to get kernel builds with atomic_inc()
> being exactly an atomic_inc() is not even discussed or implemented.
>
> Coding this would require less time than running a typical Google kernel
> qualification (roughly one month, thousands of hosts..., days of SWE).
+1
On Tue, Mar 21, 2017 at 2:23 PM, Eric Dumazet <[email protected]> wrote:
> On Tue, 2017-03-21 at 13:49 -0700, Kees Cook wrote:
>
>> Yeah, this is exactly what I'd like to find as well. Just comparing
>> cycles between refcount implementations, while interesting, doesn't
>> show us real-world performance changes, which is what we need to
>> measure.
>>
>> Is Eric's "20 concurrent 'netperf -t UDP_STREAM'" example (from
>> elsewhere in this email thread) real-world meaningful enough?
>
> Not at all ;)
>
> This was targeting the specific change I had in mind for
> ip_idents_reserve(), which is not used by TCP flows.
Okay, I just wanted to check. I didn't think so, but it was the only
example in the thread.
> Unfortunately there is no good test simulating real-world workloads,
> which are mostly using TCP flows.
Sure, but there has to be _something_ that can be used to test to
measure the effects. Without a meaningful test, it's weird to reject a
change for performance reasons.
> Most synthetic tools you can find are not using epoll(), and very often
> hit bottlenecks in other layers.
>
>
> It looks like our suggestion to get kernel builds with atomic_inc()
> being exactly an atomic_inc() is not even discussed or implemented.
So, FWIW, I originally tried to make this a CONFIG in the first couple
passes at getting a refcount defense. I would be fine with this, but I
was not able to convince Peter. :) However, things have evolved a lot
since then, so perhaps there are things to be done here.
> Coding this would require less time than running a typical Google kernel
> qualification (roughly one month, thousands of hosts..., days of SWE).
It wasn't the issue of coding time; just that it had been specifically
not wanted. :)
Am I understanding you correctly that you'd want something like:
refcount.h:
#ifdef UNPROTECTED_REFCOUNT
#define refcount_inc(x) atomic_inc(x)
...
#else
void refcount_inc(...
...
#endif
some/net.c:
#define UNPROTECTED_REFCOUNT
#include <refcount.h>
or similar?
-Kees
--
Kees Cook
Pixel Security
On Tue, 2017-03-21 at 16:51 -0700, Kees Cook wrote:
> Am I understanding you correctly that you'd want something like:
>
> refcount.h:
> #ifdef UNPROTECTED_REFCOUNT
> #define refcount_inc(x) atomic_inc(x)
> ...
> #else
> void refcount_inc(...
> ...
> #endif
>
> some/net.c:
> #define UNPROTECTED_REFCOUNT
> #include <refcount.h>
>
> or similar?
At first, it could be something simple like that yes.
Note that we might define two refcount_inc() : One that does whole
tests, and refcount_inc_relaxed() that might translate to atomic_inc()
on non debug kernels.
Then later, maybe provide a dynamic infrastructure so that we can
dynamically force the full checks even for refcount_inc_relaxed() on say
1% of the hosts, to get better debug coverage ?
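One possible shape for that split, purely as a sketch: the Kconfig symbol below
is made up for illustration, and refcount_inc_relaxed() does not exist at this
point in the discussion.

#ifdef CONFIG_REFCOUNT_FULL_CHECKS		/* hypothetical option */
static inline void refcount_inc_relaxed(refcount_t *r)
{
	refcount_inc(r);			/* full checking, out-of-line path */
}
#else
static inline void refcount_inc_relaxed(refcount_t *r)
{
	atomic_inc(&r->refs);			/* plain LOCK INC on x86 */
}
#endif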
On Tue, Mar 21, 2017 at 07:03:19PM -0700, Eric Dumazet wrote:
> Note that we might define two refcount_inc() : One that does whole
> tests, and refcount_inc_relaxed() that might translate to atomic_inc()
> on non debug kernels.
So you'd want a duplicate interface, such that most code, which doesn't
care about refcount performance much, can still have all the tests
enabled.
But the code that cares about it (and preferably can prove it with
numbers) can use the other.
I'm also somewhat hesitant to use _relaxed for this distinction, as it
has a clear meaning in atomics, maybe _nocheck?
Also; what operations do you want _nocheck variants of, only
refcount_inc() ?
That said; I'm really loath to provide these without actual measurements
that prove they make a difference.
> Then later, maybe provide a dynamic infrastructure so that we can
> dynamically force the full checks even for refcount_inc_relaxed() on say
> 1% of the hosts, to get better debug coverage ?
Shouldn't be too hard to do in arch specific code using alternative
stuff. Generic code could use jump labels I suppose, but that would
result in bigger code.
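For the generic-code side, a jump-label based toggle could look roughly like
the sketch below; the static key and the _nocheck helper are hypothetical
names, not existing API.

DEFINE_STATIC_KEY_FALSE(refcount_full_checks);

static __always_inline void refcount_inc_nocheck(refcount_t *r)
{
	if (static_branch_unlikely(&refcount_full_checks))
		refcount_inc(r);		/* checked, out-of-line path */
	else
		atomic_inc(&r->refs);		/* single inlined atomic op */
}

/* e.g. on ~1% of the hosts, enabled at boot or via a sysctl:
 *	static_branch_enable(&refcount_full_checks);
 */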
On Wed, 2017-03-22 at 13:25 +0100, Peter Zijlstra wrote:
> On Tue, Mar 21, 2017 at 07:03:19PM -0700, Eric Dumazet wrote:
>
> > Note that we might define two refcount_inc() : One that does whole
> > tests, and refcount_inc_relaxed() that might translate to atomic_inc()
> > on non debug kernels.
>
> So you'd want a duplicate interface, such that most code, which doesn't
> care about refcount performance much, can still have all the tests
> enabled.
>
> But the code that cares about it (and preferably can prove it with
> numbers) can use the other.
>
> I'm also somewhat hesitant to use _relaxed for this distinction, as it
> has a clear meaning in atomics, maybe _nocheck?
>
> Also; what operations do you want _nocheck variants of, only
> refcount_inc() ?
I was mostly thinking of points where we were already checking the value
either before or after the atomic_inc(), using some lazy check (a la
WARN_ON(atomic_read(p) == 0) or something like that).
But admittedly we can replace all these by standard refcount_inc() and
simply provide a CONFIG option to turn off the checks, and let brave
people enable this option.
On Tue, Mar 21, 2017 at 04:51:13PM -0700, Kees Cook wrote:
> On Tue, Mar 21, 2017 at 2:23 PM, Eric Dumazet <[email protected]> wrote:
> > Unfortunately there is no good test simulating real-world workloads,
> > which are mostly using TCP flows.
>
> Sure, but there has to be _something_ that can be used to test to
> measure the effects. Without a meaningful test, it's weird to reject a
> change for performance reasons.
This. How can you optimize if there's no way to actually measure
something?
> > Most synthetic tools you can find are not using epoll(), and very often
> > hit bottlenecks in other layers.
> >
> >
> > It looks like our suggestion to get kernel builds with atomic_inc()
> > being exactly an atomic_inc() is not even discussed or implemented.
>
> So, FWIW, I originally tried to make this a CONFIG in the first couple
> passes at getting a refcount defense. I would be fine with this, but I
> was not able to convince Peter. :) However, things have evolved a lot
> since then, so perhaps there are things to be done here.
Well, the argument was that unless there's a benchmark that shows it
cares, it's all premature optimization.
Similarly, you wanted this enabled at all times because hardening.
On Wed, Mar 22, 2017 at 06:22:16AM -0700, Eric Dumazet wrote:
> But admittedly we can replace all these by standard refcount_inc() and
> simply provide a CONFIG option to turn off the checks, and let brave
> people enable this option.
Still brings us back to lacking a real reason to provide that CONFIG
option. Not to mention that this CONFIG knob will kill the warnings for
everything, even the code that might not be as heavily audited as
network and which doesn't really care much about the performance of
refcount operations.
So I'm actually in favour of _nocheck variants, if we can show the need
for them. And I like your idea of being able to dynamically switch them
back to full debug as well.
But I would feel a whole lot better about the entire thing if we could
measure their impact. It would also give us good precedent to whack
other potential users of _nocheck over the head with -- show numbers.
On Wed, 2017-03-22 at 15:33 +0100, Peter Zijlstra wrote:
>
> But I would feel a whole lot better about the entire thing if we could
> measure their impact. It would also give us good precedent to whack
> other potential users of _nocheck over the head with -- show numbers.
I won't be able to measure the impact on real workloads; our production
kernels are based on 4.3 at this moment.
I guess someone could code a lib/test_refcount.c launching X threads
using either atomic_inc or refcount_inc() in a loop.
That would give a rough estimate of the refcount_t overhead among
various platforms.
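A rough sketch of what such a lib/test_refcount.c could look like follows.
This is a hypothetical module, not part of this series; it only measures the
raw inc/dec cost per platform, not the full network path.

#include <linux/module.h>
#include <linux/kthread.h>
#include <linux/atomic.h>
#include <linux/refcount.h>
#include <linux/ktime.h>

static atomic_t test_atomic = ATOMIC_INIT(1);
static refcount_t test_refcount = REFCOUNT_INIT(1);

static int nr_threads = 4;
module_param(nr_threads, int, 0444);

#define LOOPS	(10 * 1000 * 1000)

static int bench_thread(void *data)
{
	bool use_refcount = data != NULL;
	ktime_t t0 = ktime_get();
	long i;

	for (i = 0; i < LOOPS; i++) {
		if (use_refcount) {
			refcount_inc(&test_refcount);
			refcount_dec(&test_refcount);
		} else {
			atomic_inc(&test_atomic);
			atomic_dec(&test_atomic);
		}
	}

	pr_info("%s: %lld ns for %d inc/dec pairs\n",
		use_refcount ? "refcount_t" : "atomic_t",
		ktime_to_ns(ktime_sub(ktime_get(), t0)), LOOPS);
	return 0;
}

static int __init test_refcount_init(void)
{
	int i;

	/* threads print their own result and exit; no cleanup in this sketch */
	for (i = 0; i < nr_threads; i++) {
		kthread_run(bench_thread, NULL, "bench-atomic-%d", i);
		kthread_run(bench_thread, (void *)1, "bench-refcount-%d", i);
	}
	return 0;
}
module_init(test_refcount_init);

static void __exit test_refcount_exit(void)
{
}
module_exit(test_refcount_exit);

MODULE_LICENSE("GPL");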
On Wed, Mar 22, 2017 at 07:54:04AM -0700, Eric Dumazet wrote:
> On Wed, 2017-03-22 at 15:33 +0100, Peter Zijlstra wrote:
>
> >
> > But I would feel a whole lot better about the entire thing if we could
> > measure their impact. It would also give us good precedent to whack
> > other potential users of _nocheck over the head with -- show numbers.
>
> I wont be able to measure the impact on real workloads, our productions
> kernels are based on 4.3 at this moment.
Is there really no micro bench that exercises the relevant network
paths? Do you really fully rely on Google production workloads?
> I guess someone could code a lib/test_refcount.c launching X threads
> using either atomic_inc or refcount_inc() in a loop.
>
> That would give a rough estimate of the refcount_t overhead among
> various platforms.
It's also a fairly meaningless number. It doesn't include any of the
other work the network path does.
On Wed, 2017-03-22 at 16:08 +0100, Peter Zijlstra wrote:
> On Wed, Mar 22, 2017 at 07:54:04AM -0700, Eric Dumazet wrote:
> > On Wed, 2017-03-22 at 15:33 +0100, Peter Zijlstra wrote:
> >
> > >
> > > But I would feel a whole lot better about the entire thing if we could
> > > measure their impact. It would also give us good precedent to whack
> > > other potential users of _nocheck over the head with -- show numbers.
> >
> > I wont be able to measure the impact on real workloads, our productions
> > kernels are based on 4.3 at this moment.
>
> Is there really no micro bench that exercises the relevant network
> paths? Do you really fully rely on Google production workloads?
You could run a synflood test, with ~10 Mpps.
sock_hold() is definitely used in SYN handling.
The latest upstream kernels do not work on my lab hosts, for whatever reason.
On Wed, Mar 22, 2017 at 07:54:04AM -0700, Eric Dumazet wrote:
>
> I guess someone could code a lib/test_refcount.c launching X threads
> using either atomic_inc or refcount_inc() in a loop.
>
> That would give a rough estimate of the refcount_t overhead among
> various platforms.
Cycles spent on uncontended ops:

                                   SKL     SNB     IVB-EP

atomic:       lock incl            ~15     ~13     ~10
atomic-ref:   call refcount_inc    ~31     ~37     ~31
atomic-ref2:  $inlined             ~23     ~22     ~21
Contended numbers (E3-1245 v5):
root@skl:~/spinlocks# LOCK=./atomic ./test1.sh
1: 14.797240
2: 87.451230
4: 100.747790
8: 118.234010
root@skl:~/spinlocks# LOCK=./atomic-ref ./test1.sh
1: 30.627320
2: 91.866730
4: 111.029560
8: 141.922420
root@skl:~/spinlocks# LOCK=./atomic-ref2 ./test1.sh
1: 23.243930
2: 98.620250
4: 119.604240
8: 124.864380
The code includes the patches found here:
https://lkml.kernel.org/r/[email protected]
and effectively does:
#define REFCOUNT_WARN(cond, str) WARN_ON_ONCE(cond)
s/WARN_ONCE/REFCOUNT_WARN/
on lib/refcount.c
Find the tarball of the userspace code used attached (it's a bit of a
mess; it's grown over time and needs a cleanup).
I used: gcc (Debian 6.3.0-6) 6.3.0 20170205
So while it's about ~20 cycles worse, reducing contention is far more
effective than removing straight line instruction count (which too is
entirely possible, because GCC generates absolute shite in places).
On Tue, Mar 21, 2017 at 7:03 PM, Eric Dumazet <[email protected]> wrote:
> On Tue, 2017-03-21 at 16:51 -0700, Kees Cook wrote:
>
>> Am I understanding you correctly that you'd want something like:
>>
>> refcount.h:
>> #ifdef UNPROTECTED_REFCOUNT
>> #define refcount_inc(x) atomic_inc(x)
>> ...
>> #else
>> void refcount_inc(...
>> ...
>> #endif
>>
>> some/net.c:
>> #define UNPROTECTED_REFCOUNT
>> #include <refcount.h>
>>
>> or similar?
>
> At first, it could be something simple like that yes.
>
> Note that we might define two refcount_inc() : One that does whole
> tests, and refcount_inc_relaxed() that might translate to atomic_inc()
> on non debug kernels.
>
> Then later, maybe provide a dynamic infrastructure so that we can
> dynamically force the full checks even for refcount_inc_relaxed() on say
> 1% of the hosts, to get better debug coverage ?
Well, this isn't about finding bugs in normal workflows. This is about
catching bugs that attackers have found and started exploiting to gain a
use-after-free primitive. The intention is for it to be always
enabled.
-Kees
--
Kees Cook
Pixel Security