2015-02-26 19:06:57

by Eyal Birger

[permalink] [raw]
Subject: [PATCH net-next v2 0/7] net: move skb->dropcount to skb->cb[]

Commit 977750076d98 ("af_packet: add interframe drop cmsg (v6)")
unionized skb->mark and skb->dropcount in order to allow recording
of the socket drop count while maintaining struct sk_buff size.

skb->dropcount was introduced since there was no available room
in skb->cb[] in packet sockets. However, its introduction led to
the inability to export skb->mark to userspace.

It was considered to alias skb->priority instead of skb->mark.
However, that would lead to the inabilty to export skb->priority
to userspace if desired. Such change may also lead to hard-to-find
issues as skb->priority is assumed to be alias free, and, as noted
by Shmulik Ladkani, is not 'naturally orthogonal' with other skb
fields.

This patch series follows the suggestions made by Eric Dumazet moving
the dropcount metric to skb->cb[], eliminating this problem
at the expense of 4 bytes less in skb->cb[] for protocol families
using it.

The patch series include compactization of bluetooth and packet
use of skb->cb[] as well as the infrastructure for placing dropcount
in skb->cb[].

---
Changes in v2:
- Rebase
- Receive const struct sock * in sock_skb_set_dropcount()
per Eric Dumazet's suggestion
- struct bt_skb_cb compactization code improvements following
suggestions from Shmulik Landani and David Laight
- Fix incorrect asignment to skb->dev in packet_rcv()
---

Eyal Birger (7):
net: bluetooth: compact struct bt_skb_cb by inlining struct
hci_req_ctrl
net: bluetooth: compact struct bt_skb_cb by converting boolean fields
to bit fields
net: rxrpc: change call to sock_recv_ts_and_drops() on rxrpc recvmsg
to sock_recv_timestamp()
net: packet: use skb->dev as storage for skb orig len instead of
skb->cb[]
net: use common macro for assering skb->cb[] available size in
protocol families
net: add common accessor for setting dropcount on packets
net: move skb->dropcount to skb->cb[]

include/linux/skbuff.h | 2 --
include/net/bluetooth/bluetooth.h | 14 +++++---------
include/net/sock.h | 23 +++++++++++++++++++++++
net/bluetooth/af_bluetooth.c | 3 +--
net/bluetooth/hci_core.c | 12 ++++++------
net/bluetooth/hci_event.c | 4 ++--
net/bluetooth/hci_request.c | 6 +++---
net/bluetooth/hci_sock.c | 2 +-
net/can/bcm.c | 2 +-
net/can/raw.c | 6 +++---
net/core/sock.c | 2 +-
net/ipv4/af_inet.c | 2 +-
net/ipv4/tcp.c | 3 +--
net/ipv6/af_inet6.c | 2 +-
net/packet/af_packet.c | 15 +++++++--------
net/rxrpc/ar-recvmsg.c | 2 +-
net/sctp/protocol.c | 3 +--
net/socket.c | 4 ++--
18 files changed, 60 insertions(+), 47 deletions(-)

--
2.1.4


2015-02-28 21:00:47

by David Miller

[permalink] [raw]
Subject: Re: [PATCH net-next v2 4/7] net: packet: use skb->dev as storage for skb orig len instead of skb->cb[]

From: Eyal Birger <[email protected]>
Date: Sat, 28 Feb 2015 22:38:04 +0200

> My concern is that any value I pick based on the existing implementations
> would need to be adjusted come a protocol with a larger address length.

Then we need a method that requires protocols to register their
address length in a manner that will allow us to validate that
limit at compile time.

This is not rocket science.

Right now we're proposing to do utterly stupid shit like encoding
integers in device pointers in the sk_buff, when that is absolutely
not necessary at all.

2015-02-28 20:38:04

by Eyal Birger

[permalink] [raw]
Subject: Re: [PATCH net-next v2 4/7] net: packet: use skb->dev as storage for skb orig len instead of skb->cb[]

On Sat, Feb 28, 2015 at 10:08 PM, David Miller <[email protected]> wrote:
> From: Eyal Birger <[email protected]>
> Date: Sat, 28 Feb 2015 21:39:34 +0200
>
>> On Sat, Feb 28, 2015 at 9:21 PM, David Miller <[email protected]> wrote:
>>> From: Eyal Birger <[email protected]>
>>> Date: Thu, 26 Feb 2015 21:07:01 +0200
>>>
>>>> As part of an effort to move skb->dropcount to skb->cb[], 4 bytes
>>>> of additional room are needed in skb->cb[] in packet sockets.
>>>>
>>>> Store the skb original length in skb->dev instead of skb->cb[] for
>>>> this purpose.
>>>>
>>>> Signed-off-by: Eyal Birger <[email protected]>
>>>
>>> I'm a little confused, why is this even needed?
>>>
>>> packet_skb_cb is 24 bytes by my calculations, which is much
>>> smaller than the cb[] size which is 48 bytes.
>>
>> Note the BUILD_BUG_ON in packet_rcv().
>>
>> packet_skb_cb may contain an address as large as MAX_ADDR_LEN (32)
>> Therefore the required space is sizeof(packet_skb_cb) + MAX_ADDR_LEN - 8
>> which is 48 bytes before this change.
>
> So let's take a step back.
>
> What link layer we support has 32-byte hardware addresses?

The largest one I've been hinted of is INFINIBAND_ALEN which is 20 bytes.

> If the answer is lower than 32, we should use that value instead.

Won't that value become the 'real' MAX_ADDR_LEN in that case?

My concern is that any value I pick based on the existing implementations
would need to be adjusted come a protocol with a larger address length.

Also, I would have to assert the available amount of space at run-time as
the address length is provided by a call to dev_parse_header() instead of
using a build-time assert.

Regards,
Eyal.

2015-02-28 20:08:30

by David Miller

[permalink] [raw]
Subject: Re: [PATCH net-next v2 4/7] net: packet: use skb->dev as storage for skb orig len instead of skb->cb[]

From: Eyal Birger <[email protected]>
Date: Sat, 28 Feb 2015 21:39:34 +0200

> On Sat, Feb 28, 2015 at 9:21 PM, David Miller <[email protected]> wrote:
>> From: Eyal Birger <[email protected]>
>> Date: Thu, 26 Feb 2015 21:07:01 +0200
>>
>>> As part of an effort to move skb->dropcount to skb->cb[], 4 bytes
>>> of additional room are needed in skb->cb[] in packet sockets.
>>>
>>> Store the skb original length in skb->dev instead of skb->cb[] for
>>> this purpose.
>>>
>>> Signed-off-by: Eyal Birger <[email protected]>
>>
>> I'm a little confused, why is this even needed?
>>
>> packet_skb_cb is 24 bytes by my calculations, which is much
>> smaller than the cb[] size which is 48 bytes.
>
> Note the BUILD_BUG_ON in packet_rcv().
>
> packet_skb_cb may contain an address as large as MAX_ADDR_LEN (32)
> Therefore the required space is sizeof(packet_skb_cb) + MAX_ADDR_LEN - 8
> which is 48 bytes before this change.

So let's take a step back.

What link layer we support has 32-byte hardware addresses?

If the answer is lower than 32, we should use that value instead.

2015-02-28 19:39:34

by Eyal Birger

[permalink] [raw]
Subject: Re: [PATCH net-next v2 4/7] net: packet: use skb->dev as storage for skb orig len instead of skb->cb[]

On Sat, Feb 28, 2015 at 9:21 PM, David Miller <[email protected]> wrote:
> From: Eyal Birger <[email protected]>
> Date: Thu, 26 Feb 2015 21:07:01 +0200
>
>> As part of an effort to move skb->dropcount to skb->cb[], 4 bytes
>> of additional room are needed in skb->cb[] in packet sockets.
>>
>> Store the skb original length in skb->dev instead of skb->cb[] for
>> this purpose.
>>
>> Signed-off-by: Eyal Birger <[email protected]>
>
> I'm a little confused, why is this even needed?
>
> packet_skb_cb is 24 bytes by my calculations, which is much
> smaller than the cb[] size which is 48 bytes.

Note the BUILD_BUG_ON in packet_rcv().

packet_skb_cb may contain an address as large as MAX_ADDR_LEN (32)
Therefore the required space is sizeof(packet_skb_cb) + MAX_ADDR_LEN - 8
which is 48 bytes before this change.

Regards,
Eyal.

2015-02-28 19:21:35

by David Miller

[permalink] [raw]
Subject: Re: [PATCH net-next v2 4/7] net: packet: use skb->dev as storage for skb orig len instead of skb->cb[]

From: Eyal Birger <[email protected]>
Date: Thu, 26 Feb 2015 21:07:01 +0200

> As part of an effort to move skb->dropcount to skb->cb[], 4 bytes
> of additional room are needed in skb->cb[] in packet sockets.
>
> Store the skb original length in skb->dev instead of skb->cb[] for
> this purpose.
>
> Signed-off-by: Eyal Birger <[email protected]>

I'm a little confused, why is this even needed?

packet_skb_cb is 24 bytes by my calculations, which is much
smaller than the cb[] size which is 48 bytes.

2015-02-26 19:07:04

by Eyal Birger

[permalink] [raw]
Subject: [PATCH net-next v2 7/7] net: move skb->dropcount to skb->cb[]

Commit 977750076d98 ("af_packet: add interframe drop cmsg (v6)")
unionized skb->mark and skb->dropcount in order to allow recording
of the socket drop count while maintaining struct sk_buff size.

skb->dropcount was introduced since there was no available room
in skb->cb[] in packet sockets. However, its introduction led to
the inability to export skb->mark, or any other aliased field to
userspace if so desired.

Moving the dropcount metric to skb->cb[] eliminates this problem
at the expense of 4 bytes less in skb->cb[] for protocol families
using it.

Signed-off-by: Eyal Birger <[email protected]>
---
include/linux/skbuff.h | 2 --
include/net/sock.h | 18 ++++++++++++++++--
net/socket.c | 4 ++--
3 files changed, 18 insertions(+), 6 deletions(-)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index d898b32..bba1330 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -492,7 +492,6 @@ static inline u32 skb_mstamp_us_delta(const struct skb_mstamp *t1,
* @napi_id: id of the NAPI struct this skb came from
* @secmark: security marking
* @mark: Generic packet mark
- * @dropcount: total number of sk_receive_queue overflows
* @vlan_proto: vlan encapsulation protocol
* @vlan_tci: vlan tag control information
* @inner_protocol: Protocol (encapsulation)
@@ -641,7 +640,6 @@ struct sk_buff {
#endif
union {
__u32 mark;
- __u32 dropcount;
__u32 reserved_tailroom;
};

diff --git a/include/net/sock.h b/include/net/sock.h
index 0996fe4..38369d3 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -2078,13 +2078,27 @@ static inline int sock_intr_errno(long timeo)
return timeo == MAX_SCHEDULE_TIMEOUT ? -ERESTARTSYS : -EINTR;
}

+struct sock_skb_cb {
+ u32 dropcount;
+};
+
+/* Store sock_skb_cb at the end of skb->cb[] so protocol families
+ * using skb->cb[] would keep using it directly and utilize its
+ * alignement guarantee.
+ */
+#define SOCK_SKB_CB_OFFSET ((FIELD_SIZEOF(struct sk_buff, cb) - \
+ sizeof(struct sock_skb_cb)))
+
+#define SOCK_SKB_CB(__skb) ((struct sock_skb_cb *)((__skb)->cb + \
+ SOCK_SKB_CB_OFFSET))
+
#define sock_skb_cb_check_size(size) \
- BUILD_BUG_ON((size) > FIELD_SIZEOF(struct sk_buff, cb))
+ BUILD_BUG_ON((size) > SOCK_SKB_CB_OFFSET)

static inline void
sock_skb_set_dropcount(const struct sock *sk, struct sk_buff *skb)
{
- skb->dropcount = atomic_read(&sk->sk_drops);
+ SOCK_SKB_CB(skb)->dropcount = atomic_read(&sk->sk_drops);
}

void __sock_recv_timestamp(struct msghdr *msg, struct sock *sk,
diff --git a/net/socket.c b/net/socket.c
index bbedbfc..b78cf60 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -731,9 +731,9 @@ EXPORT_SYMBOL_GPL(__sock_recv_wifi_status);
static inline void sock_recv_drops(struct msghdr *msg, struct sock *sk,
struct sk_buff *skb)
{
- if (sock_flag(sk, SOCK_RXQ_OVFL) && skb && skb->dropcount)
+ if (sock_flag(sk, SOCK_RXQ_OVFL) && skb && SOCK_SKB_CB(skb)->dropcount)
put_cmsg(msg, SOL_SOCKET, SO_RXQ_OVFL,
- sizeof(__u32), &skb->dropcount);
+ sizeof(__u32), &SOCK_SKB_CB(skb)->dropcount);
}

void __sock_recv_ts_and_drops(struct msghdr *msg, struct sock *sk,
--
2.1.4

2015-02-26 19:07:03

by Eyal Birger

[permalink] [raw]
Subject: [PATCH net-next v2 6/7] net: add common accessor for setting dropcount on packets

As part of an effort to move skb->dropcount to skb->cb[], use
a common function in order to set dropcount in struct sk_buff.

Signed-off-by: Eyal Birger <[email protected]>
---
include/net/sock.h | 6 ++++++
net/core/sock.c | 2 +-
net/packet/af_packet.c | 2 +-
3 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/include/net/sock.h b/include/net/sock.h
index a2502d2..0996fe4 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -2081,6 +2081,12 @@ static inline int sock_intr_errno(long timeo)
#define sock_skb_cb_check_size(size) \
BUILD_BUG_ON((size) > FIELD_SIZEOF(struct sk_buff, cb))

+static inline void
+sock_skb_set_dropcount(const struct sock *sk, struct sk_buff *skb)
+{
+ skb->dropcount = atomic_read(&sk->sk_drops);
+}
+
void __sock_recv_timestamp(struct msghdr *msg, struct sock *sk,
struct sk_buff *skb);
void __sock_recv_wifi_status(struct msghdr *msg, struct sock *sk,
diff --git a/net/core/sock.c b/net/core/sock.c
index 93c8b20..9c74fc8 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -466,7 +466,7 @@ int sock_queue_rcv_skb(struct sock *sk, struct sk_buff *skb)
skb_dst_force(skb);

spin_lock_irqsave(&list->lock, flags);
- skb->dropcount = atomic_read(&sk->sk_drops);
+ sock_skb_set_dropcount(sk, skb);
__skb_queue_tail(list, skb);
spin_unlock_irqrestore(&list->lock, flags);

diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index 120043f..6535a29 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -1838,7 +1838,7 @@ static int packet_rcv(struct sk_buff *skb, struct net_device *dev,

spin_lock(&sk->sk_receive_queue.lock);
po->stats.stats1.tp_packets++;
- skb->dropcount = atomic_read(&sk->sk_drops);
+ sock_skb_set_dropcount(sk, skb);
__skb_queue_tail(&sk->sk_receive_queue, skb);
spin_unlock(&sk->sk_receive_queue.lock);
sk->sk_data_ready(sk);
--
2.1.4

2015-02-26 19:07:02

by Eyal Birger

[permalink] [raw]
Subject: [PATCH net-next v2 5/7] net: use common macro for assering skb->cb[] available size in protocol families

As part of an effort to move skb->dropcount to skb->cb[] use a common
macro in protocol families using skb->cb[] for ancillary data to
validate available room in skb->cb[].

Signed-off-by: Eyal Birger <[email protected]>
---
include/net/sock.h | 3 +++
net/bluetooth/af_bluetooth.c | 3 +--
net/can/bcm.c | 2 +-
net/can/raw.c | 6 +++---
net/ipv4/af_inet.c | 2 +-
net/ipv4/tcp.c | 3 +--
net/ipv6/af_inet6.c | 2 +-
net/packet/af_packet.c | 3 +--
net/sctp/protocol.c | 3 +--
9 files changed, 13 insertions(+), 14 deletions(-)

diff --git a/include/net/sock.h b/include/net/sock.h
index ab186b1..a2502d2 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -2078,6 +2078,9 @@ static inline int sock_intr_errno(long timeo)
return timeo == MAX_SCHEDULE_TIMEOUT ? -ERESTARTSYS : -EINTR;
}

+#define sock_skb_cb_check_size(size) \
+ BUILD_BUG_ON((size) > FIELD_SIZEOF(struct sk_buff, cb))
+
void __sock_recv_timestamp(struct msghdr *msg, struct sock *sk,
struct sk_buff *skb);
void __sock_recv_wifi_status(struct msghdr *msg, struct sock *sk,
diff --git a/net/bluetooth/af_bluetooth.c b/net/bluetooth/af_bluetooth.c
index ce22e0c..4b904c9 100644
--- a/net/bluetooth/af_bluetooth.c
+++ b/net/bluetooth/af_bluetooth.c
@@ -711,10 +711,9 @@ EXPORT_SYMBOL_GPL(bt_debugfs);

static int __init bt_init(void)
{
- struct sk_buff *skb;
int err;

- BUILD_BUG_ON(sizeof(struct bt_skb_cb) > sizeof(skb->cb));
+ sock_skb_cb_check_size(sizeof(struct bt_skb_cb));

BT_INFO("Core ver %s", VERSION);

diff --git a/net/can/bcm.c b/net/can/bcm.c
index ee9ffd9..d559f92 100644
--- a/net/can/bcm.c
+++ b/net/can/bcm.c
@@ -328,7 +328,7 @@ static void bcm_send_to_user(struct bcm_op *op, struct bcm_msg_head *head,
* containing the interface index.
*/

- BUILD_BUG_ON(sizeof(skb->cb) < sizeof(struct sockaddr_can));
+ sock_skb_cb_check_size(sizeof(struct sockaddr_can));
addr = (struct sockaddr_can *)skb->cb;
memset(addr, 0, sizeof(*addr));
addr->can_family = AF_CAN;
diff --git a/net/can/raw.c b/net/can/raw.c
index 00c13ef..94601b7 100644
--- a/net/can/raw.c
+++ b/net/can/raw.c
@@ -95,8 +95,8 @@ struct raw_sock {
*/
static inline unsigned int *raw_flags(struct sk_buff *skb)
{
- BUILD_BUG_ON(sizeof(skb->cb) <= (sizeof(struct sockaddr_can) +
- sizeof(unsigned int)));
+ sock_skb_cb_check_size(sizeof(struct sockaddr_can) +
+ sizeof(unsigned int));

/* return pointer after struct sockaddr_can */
return (unsigned int *)(&((struct sockaddr_can *)skb->cb)[1]);
@@ -135,7 +135,7 @@ static void raw_rcv(struct sk_buff *oskb, void *data)
* containing the interface index.
*/

- BUILD_BUG_ON(sizeof(skb->cb) < sizeof(struct sockaddr_can));
+ sock_skb_cb_check_size(sizeof(struct sockaddr_can));
addr = (struct sockaddr_can *)skb->cb;
memset(addr, 0, sizeof(*addr));
addr->can_family = AF_CAN;
diff --git a/net/ipv4/af_inet.c b/net/ipv4/af_inet.c
index d2e49ba..4ce954c 100644
--- a/net/ipv4/af_inet.c
+++ b/net/ipv4/af_inet.c
@@ -1675,7 +1675,7 @@ static int __init inet_init(void)
struct list_head *r;
int rc = -EINVAL;

- BUILD_BUG_ON(sizeof(struct inet_skb_parm) > FIELD_SIZEOF(struct sk_buff, cb));
+ sock_skb_cb_check_size(sizeof(struct inet_skb_parm));

rc = proto_register(&tcp_prot, 1);
if (rc)
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 9d72a0f..4b57ea8 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -3005,12 +3005,11 @@ static void __init tcp_init_mem(void)

void __init tcp_init(void)
{
- struct sk_buff *skb = NULL;
unsigned long limit;
int max_rshare, max_wshare, cnt;
unsigned int i;

- BUILD_BUG_ON(sizeof(struct tcp_skb_cb) > sizeof(skb->cb));
+ sock_skb_cb_check_size(sizeof(struct tcp_skb_cb));

percpu_counter_init(&tcp_sockets_allocated, 0, GFP_KERNEL);
percpu_counter_init(&tcp_orphan_count, 0, GFP_KERNEL);
diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c
index e8c4400..6bafcc2 100644
--- a/net/ipv6/af_inet6.c
+++ b/net/ipv6/af_inet6.c
@@ -824,7 +824,7 @@ static int __init inet6_init(void)
struct list_head *r;
int err = 0;

- BUILD_BUG_ON(sizeof(struct inet6_skb_parm) > FIELD_SIZEOF(struct sk_buff, cb));
+ sock_skb_cb_check_size(sizeof(struct inet6_skb_parm));

/* Register the socket-side information for inet6_create. */
for (r = &inetsw6[0]; r < &inetsw6[SOCK_MAX]; ++r)
diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index 9d571bc..120043f 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -1810,8 +1810,7 @@ static int packet_rcv(struct sk_buff *skb, struct net_device *dev,
skb = nskb;
}

- BUILD_BUG_ON(sizeof(*PACKET_SKB_CB(skb)) + MAX_ADDR_LEN - 8 >
- sizeof(skb->cb));
+ sock_skb_cb_check_size(sizeof(*PACKET_SKB_CB(skb)) + MAX_ADDR_LEN - 8);

sll = &PACKET_SKB_CB(skb)->sa.ll;
sll->sll_family = AF_PACKET;
diff --git a/net/sctp/protocol.c b/net/sctp/protocol.c
index 8f34b27..53b7acd 100644
--- a/net/sctp/protocol.c
+++ b/net/sctp/protocol.c
@@ -1322,8 +1322,7 @@ static __init int sctp_init(void)
int max_share;
int order;

- BUILD_BUG_ON(sizeof(struct sctp_ulpevent) >
- sizeof(((struct sk_buff *) 0)->cb));
+ sock_skb_cb_check_size(sizeof(struct sctp_ulpevent));

/* Allocate bind_bucket and chunk caches. */
status = -ENOBUFS;
--
2.1.4

2015-02-26 19:07:01

by Eyal Birger

[permalink] [raw]
Subject: [PATCH net-next v2 4/7] net: packet: use skb->dev as storage for skb orig len instead of skb->cb[]

As part of an effort to move skb->dropcount to skb->cb[], 4 bytes
of additional room are needed in skb->cb[] in packet sockets.

Store the skb original length in skb->dev instead of skb->cb[] for
this purpose.

Signed-off-by: Eyal Birger <[email protected]>
---
net/packet/af_packet.c | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/net/packet/af_packet.c b/net/packet/af_packet.c
index 9c28cec..9d571bc 100644
--- a/net/packet/af_packet.c
+++ b/net/packet/af_packet.c
@@ -216,7 +216,6 @@ static void prb_fill_vlan_info(struct tpacket_kbdq_core *,
static void packet_flush_mclist(struct sock *sk);

struct packet_skb_cb {
- unsigned int origlen;
union {
struct sockaddr_pkt pkt;
struct sockaddr_ll ll;
@@ -224,6 +223,7 @@ struct packet_skb_cb {
};

#define PACKET_SKB_CB(__skb) ((struct packet_skb_cb *)((__skb)->cb))
+#define PACKET_SKB_ORIGLEN(__skb) ((unsigned long *)&(__skb)->dev)

#define GET_PBDQC_FROM_RB(x) ((struct tpacket_kbdq_core *)(&(x)->prb_bdqc))
#define GET_PBLOCK_DESC(x, bid) \
@@ -1757,7 +1757,7 @@ static int packet_rcv(struct sk_buff *skb, struct net_device *dev,
struct packet_sock *po;
u8 *skb_head = skb->data;
int skb_len = skb->len;
- unsigned int snaplen, res;
+ unsigned int snaplen, origlen, res;

if (skb->pkt_type == PACKET_LOOPBACK)
goto drop;
@@ -1825,13 +1825,13 @@ static int packet_rcv(struct sk_buff *skb, struct net_device *dev,

sll->sll_halen = dev_parse_header(skb, sll->sll_addr);

- PACKET_SKB_CB(skb)->origlen = skb->len;
+ origlen = skb->len;

if (pskb_trim(skb, snaplen))
goto drop_n_acct;

skb_set_owner_r(skb, sk);
- skb->dev = NULL;
+ *PACKET_SKB_ORIGLEN(skb) = origlen;
skb_dst_drop(skb);

/* drop conntrack reference */
@@ -3006,7 +3006,7 @@ static int packet_recvmsg(struct kiocb *iocb, struct socket *sock,
aux.tp_status = TP_STATUS_USER;
if (skb->ip_summed == CHECKSUM_PARTIAL)
aux.tp_status |= TP_STATUS_CSUMNOTREADY;
- aux.tp_len = PACKET_SKB_CB(skb)->origlen;
+ aux.tp_len = *PACKET_SKB_ORIGLEN(skb);
aux.tp_snaplen = skb->len;
aux.tp_mac = 0;
aux.tp_net = skb_network_offset(skb);
--
2.1.4

2015-02-26 19:07:00

by Eyal Birger

[permalink] [raw]
Subject: [PATCH net-next v2 3/7] net: rxrpc: change call to sock_recv_ts_and_drops() on rxrpc recvmsg to sock_recv_timestamp()

Commit 3b885787ea4112 ("net: Generalize socket rx gap / receive queue overflow cmsg")
allowed receiving packet dropcount information as a socket level option.
RXRPC sockets recvmsg function was changed to support this by calling
sock_recv_ts_and_drops() instead of sock_recv_timestamp().

However, protocol families wishing to receive dropcount should call
sock_queue_rcv_skb() or set the dropcount specifically (as done
in packet_rcv()). This was not done for rxrpc and thus this feature
never worked on these sockets.

Formalizing this by not calling sock_recv_ts_and_drops() in rxrpc as
part of an effort to move skb->dropcount into skb->cb[]

Signed-off-by: Eyal Birger <[email protected]>
---
net/rxrpc/ar-recvmsg.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/rxrpc/ar-recvmsg.c b/net/rxrpc/ar-recvmsg.c
index 4575485..d58ba70 100644
--- a/net/rxrpc/ar-recvmsg.c
+++ b/net/rxrpc/ar-recvmsg.c
@@ -150,7 +150,7 @@ int rxrpc_recvmsg(struct kiocb *iocb, struct socket *sock,
&call->conn->trans->peer->srx, len);
msg->msg_namelen = len;
}
- sock_recv_ts_and_drops(msg, &rx->sk, skb);
+ sock_recv_timestamp(msg, &rx->sk, skb);
}

/* receive the message */
--
2.1.4

2015-02-26 19:06:59

by Eyal Birger

[permalink] [raw]
Subject: [PATCH net-next v2 2/7] net: bluetooth: compact struct bt_skb_cb by converting boolean fields to bit fields

Convert boolean fields incoming and req_start to bit fields and move
force_active in order save space in bt_skb_cb in an effort to use
a portion of skb->cb[] for storing skb->dropcount.

Signed-off-by: Eyal Birger <[email protected]>
---
include/net/bluetooth/bluetooth.h | 6 +++---
net/bluetooth/hci_core.c | 2 +-
net/bluetooth/hci_request.c | 2 +-
net/bluetooth/hci_sock.c | 2 +-
4 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/include/net/bluetooth/bluetooth.h b/include/net/bluetooth/bluetooth.h
index 0989366..4500bf8 100644
--- a/include/net/bluetooth/bluetooth.h
+++ b/include/net/bluetooth/bluetooth.h
@@ -277,11 +277,11 @@ typedef void (*hci_req_complete_t)(struct hci_dev *hdev, u8 status, u16 opcode);

struct bt_skb_cb {
__u8 pkt_type;
- __u8 incoming;
+ __u8 force_active;
__u16 opcode;
__u16 expect;
- __u8 force_active;
- bool req_start;
+ __u8 incoming:1;
+ __u8 req_start:1;
u8 req_event;
hci_req_complete_t req_complete;
struct l2cap_chan *chan;
diff --git a/net/bluetooth/hci_core.c b/net/bluetooth/hci_core.c
index 85a0655..80f40e8 100644
--- a/net/bluetooth/hci_core.c
+++ b/net/bluetooth/hci_core.c
@@ -3517,7 +3517,7 @@ int hci_send_cmd(struct hci_dev *hdev, __u16 opcode, __u32 plen,
/* Stand-alone HCI commands must be flagged as
* single-command requests.
*/
- bt_cb(skb)->req_start = true;
+ bt_cb(skb)->req_start = 1;

skb_queue_tail(&hdev->cmd_q, skb);
queue_work(hdev->workqueue, &hdev->cmd_work);
diff --git a/net/bluetooth/hci_request.c b/net/bluetooth/hci_request.c
index db2f45a..f857e76 100644
--- a/net/bluetooth/hci_request.c
+++ b/net/bluetooth/hci_request.c
@@ -116,7 +116,7 @@ void hci_req_add_ev(struct hci_request *req, u16 opcode, u32 plen,
}

if (skb_queue_empty(&req->cmd_q))
- bt_cb(skb)->req_start = true;
+ bt_cb(skb)->req_start = 1;

bt_cb(skb)->req_event = event;

diff --git a/net/bluetooth/hci_sock.c b/net/bluetooth/hci_sock.c
index f003818..37198fb 100644
--- a/net/bluetooth/hci_sock.c
+++ b/net/bluetooth/hci_sock.c
@@ -965,7 +965,7 @@ static int hci_sock_sendmsg(struct kiocb *iocb, struct socket *sock,
/* Stand-alone HCI commands must be flagged as
* single-command requests.
*/
- bt_cb(skb)->req_start = true;
+ bt_cb(skb)->req_start = 1;

skb_queue_tail(&hdev->cmd_q, skb);
queue_work(hdev->workqueue, &hdev->cmd_work);
--
2.1.4

2015-02-26 19:06:58

by Eyal Birger

[permalink] [raw]
Subject: [PATCH net-next v2 1/7] net: bluetooth: compact struct bt_skb_cb by inlining struct hci_req_ctrl

struct hci_req_ctrl is never used outside of struct bt_skb_cb;
Inlining it frees 8 bytes on a 64 bit system in skb->cb[] allowing
the addition of more ancillary data.

Signed-off-by: Eyal Birger <[email protected]>
Reviewed-by: Shmulik Ladkani <[email protected]>
---
include/net/bluetooth/bluetooth.h | 10 +++-------
net/bluetooth/hci_core.c | 12 ++++++------
net/bluetooth/hci_event.c | 4 ++--
net/bluetooth/hci_request.c | 6 +++---
net/bluetooth/hci_sock.c | 2 +-
5 files changed, 15 insertions(+), 19 deletions(-)

diff --git a/include/net/bluetooth/bluetooth.h b/include/net/bluetooth/bluetooth.h
index e00455a..0989366 100644
--- a/include/net/bluetooth/bluetooth.h
+++ b/include/net/bluetooth/bluetooth.h
@@ -275,21 +275,17 @@ struct hci_dev;

typedef void (*hci_req_complete_t)(struct hci_dev *hdev, u8 status, u16 opcode);

-struct hci_req_ctrl {
- bool start;
- u8 event;
- hci_req_complete_t complete;
-};
-
struct bt_skb_cb {
__u8 pkt_type;
__u8 incoming;
__u16 opcode;
__u16 expect;
__u8 force_active;
+ bool req_start;
+ u8 req_event;
+ hci_req_complete_t req_complete;
struct l2cap_chan *chan;
struct l2cap_ctrl control;
- struct hci_req_ctrl req;
bdaddr_t bdaddr;
__le16 psm;
};
diff --git a/net/bluetooth/hci_core.c b/net/bluetooth/hci_core.c
index 3322d3f..85a0655 100644
--- a/net/bluetooth/hci_core.c
+++ b/net/bluetooth/hci_core.c
@@ -3517,7 +3517,7 @@ int hci_send_cmd(struct hci_dev *hdev, __u16 opcode, __u32 plen,
/* Stand-alone HCI commands must be flagged as
* single-command requests.
*/
- bt_cb(skb)->req.start = true;
+ bt_cb(skb)->req_start = true;

skb_queue_tail(&hdev->cmd_q, skb);
queue_work(hdev->workqueue, &hdev->cmd_work);
@@ -4195,7 +4195,7 @@ static bool hci_req_is_complete(struct hci_dev *hdev)
if (!skb)
return true;

- return bt_cb(skb)->req.start;
+ return bt_cb(skb)->req_start;
}

static void hci_resend_last(struct hci_dev *hdev)
@@ -4255,14 +4255,14 @@ void hci_req_cmd_complete(struct hci_dev *hdev, u16 opcode, u8 status)
* command queue (hdev->cmd_q).
*/
if (hdev->sent_cmd) {
- req_complete = bt_cb(hdev->sent_cmd)->req.complete;
+ req_complete = bt_cb(hdev->sent_cmd)->req_complete;

if (req_complete) {
/* We must set the complete callback to NULL to
* avoid calling the callback more than once if
* this function gets called again.
*/
- bt_cb(hdev->sent_cmd)->req.complete = NULL;
+ bt_cb(hdev->sent_cmd)->req_complete = NULL;

goto call_complete;
}
@@ -4271,12 +4271,12 @@ void hci_req_cmd_complete(struct hci_dev *hdev, u16 opcode, u8 status)
/* Remove all pending commands belonging to this request */
spin_lock_irqsave(&hdev->cmd_q.lock, flags);
while ((skb = __skb_dequeue(&hdev->cmd_q))) {
- if (bt_cb(skb)->req.start) {
+ if (bt_cb(skb)->req_start) {
__skb_queue_head(&hdev->cmd_q, skb);
break;
}

- req_complete = bt_cb(skb)->req.complete;
+ req_complete = bt_cb(skb)->req_complete;
kfree_skb(skb);
}
spin_unlock_irqrestore(&hdev->cmd_q.lock, flags);
diff --git a/net/bluetooth/hci_event.c b/net/bluetooth/hci_event.c
index a3fb094..8e8c433 100644
--- a/net/bluetooth/hci_event.c
+++ b/net/bluetooth/hci_event.c
@@ -3106,7 +3106,7 @@ static void hci_cmd_status_evt(struct hci_dev *hdev, struct sk_buff *skb)
cancel_delayed_work(&hdev->cmd_timer);

if (ev->status ||
- (hdev->sent_cmd && !bt_cb(hdev->sent_cmd)->req.event))
+ (hdev->sent_cmd && !bt_cb(hdev->sent_cmd)->req_event))
hci_req_cmd_complete(hdev, opcode, ev->status);

if (ev->ncmd && !test_bit(HCI_RESET, &hdev->flags)) {
@@ -5039,7 +5039,7 @@ void hci_event_packet(struct hci_dev *hdev, struct sk_buff *skb)

skb_pull(skb, HCI_EVENT_HDR_SIZE);

- if (hdev->sent_cmd && bt_cb(hdev->sent_cmd)->req.event == event) {
+ if (hdev->sent_cmd && bt_cb(hdev->sent_cmd)->req_event == event) {
struct hci_command_hdr *cmd_hdr = (void *) hdev->sent_cmd->data;
u16 opcode = __le16_to_cpu(cmd_hdr->opcode);

diff --git a/net/bluetooth/hci_request.c b/net/bluetooth/hci_request.c
index b59f92c..db2f45a 100644
--- a/net/bluetooth/hci_request.c
+++ b/net/bluetooth/hci_request.c
@@ -55,7 +55,7 @@ int hci_req_run(struct hci_request *req, hci_req_complete_t complete)
return -ENODATA;

skb = skb_peek_tail(&req->cmd_q);
- bt_cb(skb)->req.complete = complete;
+ bt_cb(skb)->req_complete = complete;

spin_lock_irqsave(&hdev->cmd_q.lock, flags);
skb_queue_splice_tail(&req->cmd_q, &hdev->cmd_q);
@@ -116,9 +116,9 @@ void hci_req_add_ev(struct hci_request *req, u16 opcode, u32 plen,
}

if (skb_queue_empty(&req->cmd_q))
- bt_cb(skb)->req.start = true;
+ bt_cb(skb)->req_start = true;

- bt_cb(skb)->req.event = event;
+ bt_cb(skb)->req_event = event;

skb_queue_tail(&req->cmd_q, skb);
}
diff --git a/net/bluetooth/hci_sock.c b/net/bluetooth/hci_sock.c
index 1d65c5b..f003818 100644
--- a/net/bluetooth/hci_sock.c
+++ b/net/bluetooth/hci_sock.c
@@ -965,7 +965,7 @@ static int hci_sock_sendmsg(struct kiocb *iocb, struct socket *sock,
/* Stand-alone HCI commands must be flagged as
* single-command requests.
*/
- bt_cb(skb)->req.start = true;
+ bt_cb(skb)->req_start = true;

skb_queue_tail(&hdev->cmd_q, skb);
queue_work(hdev->workqueue, &hdev->cmd_work);
--
2.1.4

2015-03-01 05:35:26

by Eyal Birger

[permalink] [raw]
Subject: Re: [PATCH net-next v2 4/7] net: packet: use skb->dev as storage for skb orig len instead of skb->cb[]

On Sun, Mar 1, 2015 at 7:31 AM, David Miller <[email protected]> wrote:
> From: Eyal Birger <[email protected]>
> Date: Sun, 1 Mar 2015 07:09:54 +0200
>
>> Ok. Another suggestion I received was to delay the preparation of the full
>> sockaddr_ll until it is needed, and store the skb original length in the first
>> two fields (sll_protocol and sll_family) as they can be derived later on from
>> the skb.
>>
>> IMHO, It would still be somewhat of a hack though.
>>
>> Would that approach be considered?
>
> I think it's better than mangling skb->dev

Ok. Will submit a v3.

2015-03-01 05:31:43

by David Miller

[permalink] [raw]
Subject: Re: [PATCH net-next v2 4/7] net: packet: use skb->dev as storage for skb orig len instead of skb->cb[]

From: Eyal Birger <[email protected]>
Date: Sun, 1 Mar 2015 07:09:54 +0200

> Ok. Another suggestion I received was to delay the preparation of the full
> sockaddr_ll until it is needed, and store the skb original length in the first
> two fields (sll_protocol and sll_family) as they can be derived later on from
> the skb.
>
> IMHO, It would still be somewhat of a hack though.
>
> Would that approach be considered?

I think it's better than mangling skb->dev

2015-03-01 05:09:54

by Eyal Birger

[permalink] [raw]
Subject: Re: [PATCH net-next v2 4/7] net: packet: use skb->dev as storage for skb orig len instead of skb->cb[]

On Sat, Feb 28, 2015 at 11:00 PM, David Miller <[email protected]> wrote:
> From: Eyal Birger <[email protected]>
> Date: Sat, 28 Feb 2015 22:38:04 +0200
>
>> My concern is that any value I pick based on the existing implementations
>> would need to be adjusted come a protocol with a larger address length.
>
> Then we need a method that requires protocols to register their
> address length in a manner that will allow us to validate that
> limit at compile time.
>

Sorry to reiterate this, but such validation will inherently become the
actual limit for hardware addresses.
So, it would be equivalent to changing the in-kernel definition of
MAX_ADDR_SIZE.

Is this something to be considered?

> This is not rocket science.
>
> Right now we're proposing to do utterly stupid shit like encoding
> integers in device pointers in the sk_buff, when that is absolutely
> not necessary at all.

Ok. Another suggestion I received was to delay the preparation of the full
sockaddr_ll until it is needed, and store the skb original length in the first
two fields (sll_protocol and sll_family) as they can be derived later on from
the skb.

IMHO, It would still be somewhat of a hack though.

Would that approach be considered?

Regards,
Eyal.