2024-04-24 14:14:24

by Adrián Moreno

Subject: [PATCH net-next 0/8] net: openvswitch: Add sample multicasting.

** Background **
Currently, OVS supports several packet sampling mechanisms (sFlow,
per-bridge IPFIX, per-flow IPFIX). These end up being translated into a
userspace action that needs to be handled by ovs-vswitchd's handler
threads, only to be forwarded to some third-party application that
will somehow process the sample and provide observability on the
datapath.

A particularly interesting use case is controller-driven
per-flow IPFIX sampling, where the OpenFlow controller can add metadata
to samples (via two 32-bit integers) and this metadata is then available
to the sample-collecting system for correlation.

** Problem **
The fact that sampled traffic shares netlink sockets and handler thread
time with upcalls, apart from being a performance bottleneck in the
sample extraction itself, can severely compromise the datapath,
rendering this solution unfit for highly loaded production systems.

Users are left with few options other than guessing what sampling
rate will be OK for their traffic pattern and system load, and dealing
with the lost accuracy.

Looking at the available infrastructure, an obvious candidate would be
psample. However, its current state does not help with the
use case at stake because sampled packets do not contain user-defined
metadata.

** Proposal **
This series is an attempt to fix this situation by extending the
existing psample infrastructure to carry a variable-length
user-defined cookie.

The main existing user of psample is tc's act_sample. It is also
extended to forward the action's cookie to psample.

Finally, the OVS sample action is extended with a couple of attributes
(OVS_SAMPLE_ATTR_PSAMPLE_{GROUP,COOKIE}) that contain a 32-bit group_id
and a variable-length cookie. When provided, OVS sends the packet
to psample for observability.
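
As an illustration of how a variable-length cookie could carry the two
32-bit metadata values mentioned in the background, here is a minimal
userspace sketch. The field names (obs_domain_id, obs_point_id), the
8-byte size and the big-endian layout are assumptions made for this
example; the series itself treats the cookie as an opaque byte string.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cookie layout: two 32-bit values, big-endian.
 * Nothing in the series mandates this layout. */
#define SAMPLE_COOKIE_LEN 8

static void put_be32(uint8_t *p, uint32_t v)
{
	p[0] = (uint8_t)(v >> 24);
	p[1] = (uint8_t)(v >> 16);
	p[2] = (uint8_t)(v >> 8);
	p[3] = (uint8_t)v;
}

static uint32_t get_be32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Pack both values into buf; returns the number of bytes written. */
static size_t sample_cookie_pack(uint8_t *buf, size_t buf_len,
				 uint32_t obs_domain_id,
				 uint32_t obs_point_id)
{
	assert(buf_len >= SAMPLE_COOKIE_LEN);
	put_be32(buf, obs_domain_id);
	put_be32(buf + 4, obs_point_id);
	return SAMPLE_COOKIE_LEN;
}

/* Collector side: returns 0 on success, -1 if the cookie is too short. */
static int sample_cookie_unpack(const uint8_t *buf, size_t len,
				uint32_t *obs_domain_id,
				uint32_t *obs_point_id)
{
	if (len < SAMPLE_COOKIE_LEN)
		return -1;
	*obs_domain_id = get_be32(buf);
	*obs_point_id = get_be32(buf + 4);
	return 0;
}
```

A fixed byte order keeps the cookie readable by a collector running on
a different architecture than the controller that built it.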

In order to make it easier for users to receive samples coming from
a specific source, group_id filtering is added to psample as well
as a tracepoint for troubleshooting.

--
rfc_v2 -> v1:
- Accommodate Ilya's comments.
- Split OVS's attribute in two attributes and simplify internal
handling of psample arguments.
- Extend psample and tc with a user-defined cookie.
- Add a tracepoint to psample to facilitate troubleshooting.

rfc_v1 -> rfc_v2:
- Use psample instead of a new OVS-only multicast group.
- Extend psample and tc with a user-defined cookie.

Adrian Moreno (8):
  net: netlink: export genl private pointer getters
  net: psample: add multicast filtering on group_id
  net: psample: add user cookie
  net: psample: add tracepoint
  net: sched: act_sample: add action cookie to sample
  net:openvswitch: add psample support
  selftests: openvswitch: add sample action.
  selftests: openvswitch: add psample test

 Documentation/netlink/specs/ovs_flow.yaml     |   6 +
 include/net/psample.h                         |   2 +
 include/uapi/linux/openvswitch.h              |  49 ++++-
 include/uapi/linux/psample.h                  |   2 +
 net/netlink/genetlink.c                       |   2 +
 net/openvswitch/actions.c                     |  51 ++++-
 net/openvswitch/flow_netlink.c                |  80 +++++--
 net/psample/psample.c                         | 131 ++++++++++-
 net/psample/trace.h                           |  62 ++++++
 net/sched/act_sample.c                        |  12 +
 .../selftests/net/openvswitch/openvswitch.sh  |  97 +++++++-
 .../selftests/net/openvswitch/ovs-dpctl.py    | 207 +++++++++++++++++-
 12 files changed, 655 insertions(+), 46 deletions(-)
 create mode 100644 net/psample/trace.h

--
2.44.0



2024-04-24 14:15:28

by Adrián Moreno

Subject: [PATCH net-next 2/8] net: psample: add multicast filtering on group_id

Packet samples can come from several places (e.g., different tc sample
actions), typically using the sample group (PSAMPLE_ATTR_SAMPLE_GROUP)
to differentiate them.

Likewise, sample consumers that listen on the multicast group may only
be interested in a single group. However, they are currently forced to
receive all samples and discard the ones that are not relevant, causing
unnecessary overhead.

Allow users to filter on the desired group_id by adding a new command,
PSAMPLE_CMD_SET_FILTER, that can be used to pass the desired group id.
Store this filter in the per-socket private pointer and use it for
filtering multicast samples.
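
The delivery decision described above can be modelled in plain
userspace C. This is only a sketch of the semantics, not the kernel
code: struct group_filter and sample_filter_skip() are hypothetical
names, and the "nonzero means skip this socket" return convention is
taken from how the filter callback in this patch computes its result.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace model of the per-socket filter (hypothetical type). */
struct group_filter {
	bool active;        /* false: no filter set, deliver everything */
	uint32_t group_num; /* only meaningful when active is true */
};

/*
 * Return nonzero when the sample must NOT be delivered to the socket
 * owning this filter, and 0 when it should be delivered.
 */
static int sample_filter_skip(const struct group_filter *filter,
			      uint32_t sample_group)
{
	if (!filter || !filter->active)
		return 0;
	return filter->group_num != sample_group;
}

/* Small self-check showing the expected behaviour. */
static void sample_filter_selfcheck(void)
{
	struct group_filter only7 = { .active = true, .group_num = 7 };

	assert(sample_filter_skip(&only7, 7) == 0); /* match: deliver */
	assert(sample_filter_skip(&only7, 8) != 0); /* mismatch: skip */
	assert(sample_filter_skip(NULL, 8) == 0);   /* no filter: deliver */
}
```

A socket that never sets a filter keeps receiving every sample, so
existing listeners are unaffected in this model.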

Signed-off-by: Adrian Moreno <[email protected]>
---
 include/uapi/linux/psample.h |   1 +
 net/psample/psample.c        | 110 +++++++++++++++++++++++++++++++++--
 2 files changed, 105 insertions(+), 6 deletions(-)

diff --git a/include/uapi/linux/psample.h b/include/uapi/linux/psample.h
index e585db5bf2d2..9d62983af0a4 100644
--- a/include/uapi/linux/psample.h
+++ b/include/uapi/linux/psample.h
@@ -28,6 +28,7 @@ enum psample_command {
 	PSAMPLE_CMD_GET_GROUP,
 	PSAMPLE_CMD_NEW_GROUP,
 	PSAMPLE_CMD_DEL_GROUP,
+	PSAMPLE_CMD_SET_FILTER,
 };

 enum psample_tunnel_key_attr {
diff --git a/net/psample/psample.c b/net/psample/psample.c
index a5d9b8446f77..f5f77515b969 100644
--- a/net/psample/psample.c
+++ b/net/psample/psample.c
@@ -98,13 +98,77 @@ static int psample_nl_cmd_get_group_dumpit(struct sk_buff *msg,
 	return msg->len;
 }

-static const struct genl_small_ops psample_nl_ops[] = {
+struct psample_obj_desc {
+	struct rcu_head rcu;
+	u32 group_num;
+};
+
+struct psample_nl_sock_priv {
+	struct psample_obj_desc __rcu *filter;
+	spinlock_t filter_lock; /* Protects filter. */
+};
+
+static void psample_nl_sock_priv_init(void *priv)
+{
+	struct psample_nl_sock_priv *sk_priv = priv;
+
+	spin_lock_init(&sk_priv->filter_lock);
+}
+
+static void psample_nl_sock_priv_destroy(void *priv)
+{
+	struct psample_nl_sock_priv *sk_priv = priv;
+	struct psample_obj_desc *filter;
+
+	filter = rcu_dereference_protected(sk_priv->filter, true);
+	kfree_rcu(filter, rcu);
+}
+
+static int psample_nl_set_filter_doit(struct sk_buff *skb,
+				      struct genl_info *info)
+{
+	struct psample_obj_desc *filter = NULL;
+	struct psample_nl_sock_priv *sk_priv;
+	struct nlattr **attrs = info->attrs;
+
+	if (attrs[PSAMPLE_ATTR_SAMPLE_GROUP]) {
+		filter = kzalloc(sizeof(*filter), GFP_KERNEL);
+		filter->group_num =
+			nla_get_u32(attrs[PSAMPLE_ATTR_SAMPLE_GROUP]);
+	}
+
+	sk_priv = genl_sk_priv_get(&psample_nl_family, NETLINK_CB(skb).sk);
+	if (IS_ERR(sk_priv)) {
+		kfree(filter);
+		return PTR_ERR(sk_priv);
+	}
+
+	spin_lock(&sk_priv->filter_lock);
+	filter = rcu_replace_pointer(sk_priv->filter, filter,
+				     lockdep_is_held(&sk_priv->filter_lock));
+	spin_unlock(&sk_priv->filter_lock);
+	kfree_rcu(filter, rcu);
+	return 0;
+}
+
+static const struct nla_policy
+psample_set_filter_policy[PSAMPLE_ATTR_SAMPLE_GROUP + 1] = {
+	[PSAMPLE_ATTR_SAMPLE_GROUP] = { .type = NLA_U32, },
+};
+
+static const struct genl_ops psample_nl_ops[] = {
 	{
 		.cmd = PSAMPLE_CMD_GET_GROUP,
 		.validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP,
 		.dumpit = psample_nl_cmd_get_group_dumpit,
 		/* can be retrieved by unprivileged users */
-	}
+	},
+	{
+		.cmd = PSAMPLE_CMD_SET_FILTER,
+		.doit = psample_nl_set_filter_doit,
+		.policy = psample_set_filter_policy,
+		.flags = 0,
+	},
 };

static struct genl_family psample_nl_family __ro_after_init = {
@@ -114,10 +178,13 @@ static struct genl_family psample_nl_family __ro_after_init = {
 	.netnsok = true,
 	.module = THIS_MODULE,
 	.mcgrps = psample_nl_mcgrps,
-	.small_ops = psample_nl_ops,
-	.n_small_ops = ARRAY_SIZE(psample_nl_ops),
+	.ops = psample_nl_ops,
+	.n_ops = ARRAY_SIZE(psample_nl_ops),
 	.resv_start_op = PSAMPLE_CMD_GET_GROUP + 1,
 	.n_mcgrps = ARRAY_SIZE(psample_nl_mcgrps),
+	.sock_priv_size = sizeof(struct psample_nl_sock_priv),
+	.sock_priv_init = psample_nl_sock_priv_init,
+	.sock_priv_destroy = psample_nl_sock_priv_destroy,
 };

 static void psample_group_notify(struct psample_group *group,
@@ -360,6 +427,32 @@ static int psample_tunnel_meta_len(struct ip_tunnel_info *tun_info)
 }
 #endif

+static inline void psample_nl_obj_desc_init(struct psample_obj_desc *desc,
+					    u32 group_num)
+{
+	memset(desc, 0, sizeof(*desc));
+	desc->group_num = group_num;
+}
+
+static int psample_nl_sample_filter(struct sock *dsk, struct sk_buff *skb,
+				    void *data)
+{
+	struct psample_obj_desc *desc = data;
+	struct psample_nl_sock_priv *sk_priv;
+	struct psample_obj_desc *filter;
+	int ret = 0;
+
+	rcu_read_lock();
+	sk_priv = __genl_sk_priv_get(&psample_nl_family, dsk);
+	if (!IS_ERR_OR_NULL(sk_priv)) {
+		filter = rcu_dereference(sk_priv->filter);
+		if (filter && desc)
+			ret = (filter->group_num != desc->group_num);
+	}
+	rcu_read_unlock();
+	return ret;
+}
+
 void psample_sample_packet(struct psample_group *group, struct sk_buff *skb,
 			   u32 sample_rate, const struct psample_metadata *md)
 {
@@ -370,6 +463,7 @@ void psample_sample_packet(struct psample_group *group, struct sk_buff *skb,
 #ifdef CONFIG_INET
 	struct ip_tunnel_info *tun_info;
 #endif
+	struct psample_obj_desc desc;
 	struct sk_buff *nl_skb;
 	int data_len;
 	int meta_len;
@@ -487,8 +581,12 @@ void psample_sample_packet(struct psample_group *group, struct sk_buff *skb,
 #endif

 	genlmsg_end(nl_skb, data);
-	genlmsg_multicast_netns(&psample_nl_family, group->net, nl_skb, 0,
-				PSAMPLE_NL_MCGRP_SAMPLE, GFP_ATOMIC);
+	psample_nl_obj_desc_init(&desc, group->group_num);
+	genlmsg_multicast_netns_filtered(&psample_nl_family,
+					 group->net, nl_skb, 0,
+					 PSAMPLE_NL_MCGRP_SAMPLE,
+					 GFP_ATOMIC, psample_nl_sample_filter,
+					 &desc);

 	return;
 error:
--
2.44.0


2024-04-24 14:16:40

by Adrián Moreno

Subject: [PATCH net-next 5/8] net: sched: act_sample: add action cookie to sample

If the action has a user_cookie, pass it along to the sample so it can
be easily identified.
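
The copy-then-reference pattern used here can be sketched in userspace
C. The idea, as far as the diff shows, is to copy the shared cookie
into a local buffer while it is still protected, so the metadata can
keep pointing at valid bytes afterwards. struct cookie,
struct sample_md and sample_md_set_cookie() are hypothetical stand-ins,
and COOKIE_MAX_SIZE is assumed to equal TC_COOKIE_MAX_SIZE. Unlike the
patch, this sketch rejects oversized cookies explicitly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed to match TC_COOKIE_MAX_SIZE from the tc UAPI headers. */
#define COOKIE_MAX_SIZE 16

/* Hypothetical stand-ins for struct tc_cookie and psample metadata. */
struct cookie {
	const uint8_t *data;
	size_t len;
};

struct sample_md {
	const uint8_t *user_cookie;
	size_t user_cookie_len;
};

/*
 * Copy the shared cookie into caller-owned scratch space and point the
 * metadata at the copy. Returns 0 on success or when there is no
 * cookie, -1 if the cookie does not fit in the scratch buffer.
 */
static int sample_md_set_cookie(struct sample_md *md,
				uint8_t scratch[COOKIE_MAX_SIZE],
				const struct cookie *shared)
{
	assert(md && scratch);
	if (!shared || shared->len == 0)
		return 0;
	if (shared->len > COOKIE_MAX_SIZE)
		return -1;
	memcpy(scratch, shared->data, shared->len);
	md->user_cookie = scratch;
	md->user_cookie_len = shared->len;
	return 0;
}
```

Because the metadata references the scratch copy, later changes to the
shared cookie do not affect a sample that is already being built.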

Signed-off-by: Adrian Moreno <[email protected]>
---
 net/sched/act_sample.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/net/sched/act_sample.c b/net/sched/act_sample.c
index a69b53d54039..5c3f86ec964a 100644
--- a/net/sched/act_sample.c
+++ b/net/sched/act_sample.c
@@ -165,9 +165,11 @@ TC_INDIRECT_SCOPE int tcf_sample_act(struct sk_buff *skb,
 					const struct tc_action *a,
 					struct tcf_result *res)
 {
+	u8 cookie_data[TC_COOKIE_MAX_SIZE] = {};
 	struct tcf_sample *s = to_sample(a);
 	struct psample_group *psample_group;
 	struct psample_metadata md = {};
+	struct tc_cookie *user_cookie;
 	int retval;

 	tcf_lastuse_update(&s->tcf_tm);
@@ -189,6 +191,16 @@ TC_INDIRECT_SCOPE int tcf_sample_act(struct sk_buff *skb,
 	if (skb_at_tc_ingress(skb) && tcf_sample_dev_ok_push(skb->dev))
 		skb_push(skb, skb->mac_len);

+	rcu_read_lock();
+	user_cookie = rcu_dereference(a->user_cookie);
+	if (user_cookie) {
+		memcpy(cookie_data, user_cookie->data,
+		       user_cookie->len);
+		md.user_cookie = cookie_data;
+		md.user_cookie_len = user_cookie->len;
+	}
+	rcu_read_unlock();
+
 	md.trunc_size = s->truncate ? s->trunc_size : skb->len;
 	psample_sample_packet(psample_group, skb, s->rate, &md);


--
2.44.0


2024-04-24 19:52:49

by Jiri Pirko

Subject: Re: [PATCH net-next 2/8] net: psample: add multicast filtering on group_id

Wed, Apr 24, 2024 at 03:50:49PM CEST, [email protected] wrote:
>Packet samples can come from several places (e.g: different tc sample
>actions), typically using the sample group (PSAMPLE_ATTR_SAMPLE_GROUP)
>to differentiate them.
>
>Likewise, sample consumers that listen on the multicast group may only
>be interested on a single group. However, they are currently forced to
>receive all samples and discard the ones that are not relevant, causing
>unnecessary overhead.
>
>Allow users to filter on the desired group_id by adding a new command
>PSAMPLE_SET_FILTER that can be used to pass the desired group id.
>Store this filter on the per-socket private pointer and use it for
>filtering multicasted samples.
>
>Signed-off-by: Adrian Moreno <[email protected]>
>---
> include/uapi/linux/psample.h | 1 +
> net/psample/psample.c | 110 +++++++++++++++++++++++++++++++++--
> 2 files changed, 105 insertions(+), 6 deletions(-)
>
>diff --git a/include/uapi/linux/psample.h b/include/uapi/linux/psample.h
>index e585db5bf2d2..9d62983af0a4 100644
>--- a/include/uapi/linux/psample.h
>+++ b/include/uapi/linux/psample.h
>@@ -28,6 +28,7 @@ enum psample_command {
> PSAMPLE_CMD_GET_GROUP,
> PSAMPLE_CMD_NEW_GROUP,
> PSAMPLE_CMD_DEL_GROUP,
>+ PSAMPLE_CMD_SET_FILTER,
> };
>
> enum psample_tunnel_key_attr {
>diff --git a/net/psample/psample.c b/net/psample/psample.c
>index a5d9b8446f77..f5f77515b969 100644
>--- a/net/psample/psample.c
>+++ b/net/psample/psample.c
>@@ -98,13 +98,77 @@ static int psample_nl_cmd_get_group_dumpit(struct sk_buff *msg,
> return msg->len;
> }
>
>-static const struct genl_small_ops psample_nl_ops[] = {
>+struct psample_obj_desc {
>+ struct rcu_head rcu;
>+ u32 group_num;
>+};
>+
>+struct psample_nl_sock_priv {
>+ struct psample_obj_desc __rcu *filter;
>+ spinlock_t filter_lock; /* Protects filter. */
>+};
>+
>+static void psample_nl_sock_priv_init(void *priv)
>+{
>+ struct psample_nl_sock_priv *sk_priv = priv;
>+
>+ spin_lock_init(&sk_priv->filter_lock);
>+}
>+
>+static void psample_nl_sock_priv_destroy(void *priv)
>+{
>+ struct psample_nl_sock_priv *sk_priv = priv;
>+ struct psample_obj_desc *filter;
>+
>+ filter = rcu_dereference_protected(sk_priv->filter, true);
>+ kfree_rcu(filter, rcu);
>+}
>+
>+static int psample_nl_set_filter_doit(struct sk_buff *skb,
>+ struct genl_info *info)
>+{
>+ struct psample_obj_desc *filter = NULL;
>+ struct psample_nl_sock_priv *sk_priv;
>+ struct nlattr **attrs = info->attrs;
>+
>+ if (attrs[PSAMPLE_ATTR_SAMPLE_GROUP]) {
>+ filter = kzalloc(sizeof(*filter), GFP_KERNEL);
>+ filter->group_num =
>+ nla_get_u32(attrs[PSAMPLE_ATTR_SAMPLE_GROUP]);
>+ }
>+
>+ sk_priv = genl_sk_priv_get(&psample_nl_family, NETLINK_CB(skb).sk);
>+ if (IS_ERR(sk_priv)) {
>+ kfree(filter);
>+ return PTR_ERR(sk_priv);
>+ }
>+
>+ spin_lock(&sk_priv->filter_lock);
>+ filter = rcu_replace_pointer(sk_priv->filter, filter,
>+ lockdep_is_held(&sk_priv->filter_lock));
>+ spin_unlock(&sk_priv->filter_lock);
>+ kfree_rcu(filter, rcu);
>+ return 0;
>+}
>+
>+static const struct nla_policy
>+psample_set_filter_policy[PSAMPLE_ATTR_SAMPLE_GROUP + 1] = {
>+ [PSAMPLE_ATTR_SAMPLE_GROUP] = { .type = NLA_U32, },
>+};
>+
>+static const struct genl_ops psample_nl_ops[] = {
> {
> .cmd = PSAMPLE_CMD_GET_GROUP,
> .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP,
> .dumpit = psample_nl_cmd_get_group_dumpit,
> /* can be retrieved by unprivileged users */
>- }
>+ },
>+ {
>+ .cmd = PSAMPLE_CMD_SET_FILTER,
>+ .doit = psample_nl_set_filter_doit,
>+ .policy = psample_set_filter_policy,
>+ .flags = 0,
>+ },

Side note:
Did you think about converting psample to split ops and introducing a
ynl spec file for it?


> };
>
> static struct genl_family psample_nl_family __ro_after_init = {
>@@ -114,10 +178,13 @@ static struct genl_family psample_nl_family __ro_after_init = {
> .netnsok = true,
> .module = THIS_MODULE,
> .mcgrps = psample_nl_mcgrps,
>- .small_ops = psample_nl_ops,
>- .n_small_ops = ARRAY_SIZE(psample_nl_ops),
>+ .ops = psample_nl_ops,
>+ .n_ops = ARRAY_SIZE(psample_nl_ops),
> .resv_start_op = PSAMPLE_CMD_GET_GROUP + 1,
> .n_mcgrps = ARRAY_SIZE(psample_nl_mcgrps),
>+ .sock_priv_size = sizeof(struct psample_nl_sock_priv),
>+ .sock_priv_init = psample_nl_sock_priv_init,
>+ .sock_priv_destroy = psample_nl_sock_priv_destroy,
> };
>
> static void psample_group_notify(struct psample_group *group,
>@@ -360,6 +427,32 @@ static int psample_tunnel_meta_len(struct ip_tunnel_info *tun_info)
> }
> #endif
>
>+static inline void psample_nl_obj_desc_init(struct psample_obj_desc *desc,
>+ u32 group_num)
>+{
>+ memset(desc, 0, sizeof(*desc));
>+ desc->group_num = group_num;
>+}
>+
>+static int psample_nl_sample_filter(struct sock *dsk, struct sk_buff *skb,
>+ void *data)
>+{
>+ struct psample_obj_desc *desc = data;
>+ struct psample_nl_sock_priv *sk_priv;
>+ struct psample_obj_desc *filter;
>+ int ret = 0;
>+
>+ rcu_read_lock();
>+ sk_priv = __genl_sk_priv_get(&psample_nl_family, dsk);
>+ if (!IS_ERR_OR_NULL(sk_priv)) {
>+ filter = rcu_dereference(sk_priv->filter);
>+ if (filter && desc)
>+ ret = (filter->group_num != desc->group_num);
>+ }
>+ rcu_read_unlock();
>+ return ret;
>+}
>+
> void psample_sample_packet(struct psample_group *group, struct sk_buff *skb,
> u32 sample_rate, const struct psample_metadata *md)
> {
>@@ -370,6 +463,7 @@ void psample_sample_packet(struct psample_group *group, struct sk_buff *skb,
> #ifdef CONFIG_INET
> struct ip_tunnel_info *tun_info;
> #endif
>+ struct psample_obj_desc desc;
> struct sk_buff *nl_skb;
> int data_len;
> int meta_len;
>@@ -487,8 +581,12 @@ void psample_sample_packet(struct psample_group *group, struct sk_buff *skb,
> #endif
>
> genlmsg_end(nl_skb, data);
>- genlmsg_multicast_netns(&psample_nl_family, group->net, nl_skb, 0,
>- PSAMPLE_NL_MCGRP_SAMPLE, GFP_ATOMIC);
>+ psample_nl_obj_desc_init(&desc, group->group_num);
>+ genlmsg_multicast_netns_filtered(&psample_nl_family,
>+ group->net, nl_skb, 0,
>+ PSAMPLE_NL_MCGRP_SAMPLE,
>+ GFP_ATOMIC, psample_nl_sample_filter,
>+ &desc);
>
> return;
> error:
>--
>2.44.0
>
>

2024-04-25 07:24:27

by Adrián Moreno

Subject: Re: [PATCH net-next 2/8] net: psample: add multicast filtering on group_id



On 4/24/24 16:54, Jiri Pirko wrote:
> Wed, Apr 24, 2024 at 03:50:49PM CEST, [email protected] wrote:
>> Packet samples can come from several places (e.g: different tc sample
>> actions), typically using the sample group (PSAMPLE_ATTR_SAMPLE_GROUP)
>> to differentiate them.
>>
>> Likewise, sample consumers that listen on the multicast group may only
>> be interested on a single group. However, they are currently forced to
>> receive all samples and discard the ones that are not relevant, causing
>> unnecessary overhead.
>>
>> Allow users to filter on the desired group_id by adding a new command
>> PSAMPLE_SET_FILTER that can be used to pass the desired group id.
>> Store this filter on the per-socket private pointer and use it for
>> filtering multicasted samples.
>>
>> Signed-off-by: Adrian Moreno <[email protected]>
>> ---
>> include/uapi/linux/psample.h | 1 +
>> net/psample/psample.c | 110 +++++++++++++++++++++++++++++++++--
>> 2 files changed, 105 insertions(+), 6 deletions(-)
>>
>> diff --git a/include/uapi/linux/psample.h b/include/uapi/linux/psample.h
>> index e585db5bf2d2..9d62983af0a4 100644
>> --- a/include/uapi/linux/psample.h
>> +++ b/include/uapi/linux/psample.h
>> @@ -28,6 +28,7 @@ enum psample_command {
>> PSAMPLE_CMD_GET_GROUP,
>> PSAMPLE_CMD_NEW_GROUP,
>> PSAMPLE_CMD_DEL_GROUP,
>> + PSAMPLE_CMD_SET_FILTER,
>> };
>>
>> enum psample_tunnel_key_attr {
>> diff --git a/net/psample/psample.c b/net/psample/psample.c
>> index a5d9b8446f77..f5f77515b969 100644
>> --- a/net/psample/psample.c
>> +++ b/net/psample/psample.c
>> @@ -98,13 +98,77 @@ static int psample_nl_cmd_get_group_dumpit(struct sk_buff *msg,
>> return msg->len;
>> }
>>
>> -static const struct genl_small_ops psample_nl_ops[] = {
>> +struct psample_obj_desc {
>> + struct rcu_head rcu;
>> + u32 group_num;
>> +};
>> +
>> +struct psample_nl_sock_priv {
>> + struct psample_obj_desc __rcu *filter;
>> + spinlock_t filter_lock; /* Protects filter. */
>> +};
>> +
>> +static void psample_nl_sock_priv_init(void *priv)
>> +{
>> + struct psample_nl_sock_priv *sk_priv = priv;
>> +
>> + spin_lock_init(&sk_priv->filter_lock);
>> +}
>> +
>> +static void psample_nl_sock_priv_destroy(void *priv)
>> +{
>> + struct psample_nl_sock_priv *sk_priv = priv;
>> + struct psample_obj_desc *filter;
>> +
>> + filter = rcu_dereference_protected(sk_priv->filter, true);
>> + kfree_rcu(filter, rcu);
>> +}
>> +
>> +static int psample_nl_set_filter_doit(struct sk_buff *skb,
>> + struct genl_info *info)
>> +{
>> + struct psample_obj_desc *filter = NULL;
>> + struct psample_nl_sock_priv *sk_priv;
>> + struct nlattr **attrs = info->attrs;
>> +
>> + if (attrs[PSAMPLE_ATTR_SAMPLE_GROUP]) {
>> + filter = kzalloc(sizeof(*filter), GFP_KERNEL);
>> + filter->group_num =
>> + nla_get_u32(attrs[PSAMPLE_ATTR_SAMPLE_GROUP]);
>> + }
>> +
>> + sk_priv = genl_sk_priv_get(&psample_nl_family, NETLINK_CB(skb).sk);
>> + if (IS_ERR(sk_priv)) {
>> + kfree(filter);
>> + return PTR_ERR(sk_priv);
>> + }
>> +
>> + spin_lock(&sk_priv->filter_lock);
>> + filter = rcu_replace_pointer(sk_priv->filter, filter,
>> + lockdep_is_held(&sk_priv->filter_lock));
>> + spin_unlock(&sk_priv->filter_lock);
>> + kfree_rcu(filter, rcu);
>> + return 0;
>> +}
>> +
>> +static const struct nla_policy
>> +psample_set_filter_policy[PSAMPLE_ATTR_SAMPLE_GROUP + 1] = {
>> + [PSAMPLE_ATTR_SAMPLE_GROUP] = { .type = NLA_U32, },
>> +};
>> +
>> +static const struct genl_ops psample_nl_ops[] = {
>> {
>> .cmd = PSAMPLE_CMD_GET_GROUP,
>> .validate = GENL_DONT_VALIDATE_STRICT | GENL_DONT_VALIDATE_DUMP,
>> .dumpit = psample_nl_cmd_get_group_dumpit,
>> /* can be retrieved by unprivileged users */
>> - }
>> + },
>> + {
>> + .cmd = PSAMPLE_CMD_SET_FILTER,
>> + .doit = psample_nl_set_filter_doit,
>> + .policy = psample_set_filter_policy,
>> + .flags = 0,
>> + },
>
> Side note:
> Did you think about converting psample to split ops and introducing a
> ynl spec file for it?
>

If split ops are preferred then sure, I can do that.

Thanks.

>> };
>>
>> static struct genl_family psample_nl_family __ro_after_init = {
>> @@ -114,10 +178,13 @@ static struct genl_family psample_nl_family __ro_after_init = {
>> .netnsok = true,
>> .module = THIS_MODULE,
>> .mcgrps = psample_nl_mcgrps,
>> - .small_ops = psample_nl_ops,
>> - .n_small_ops = ARRAY_SIZE(psample_nl_ops),
>> + .ops = psample_nl_ops,
>> + .n_ops = ARRAY_SIZE(psample_nl_ops),
>> .resv_start_op = PSAMPLE_CMD_GET_GROUP + 1,
>> .n_mcgrps = ARRAY_SIZE(psample_nl_mcgrps),
>> + .sock_priv_size = sizeof(struct psample_nl_sock_priv),
>> + .sock_priv_init = psample_nl_sock_priv_init,
>> + .sock_priv_destroy = psample_nl_sock_priv_destroy,
>> };
>>
>> static void psample_group_notify(struct psample_group *group,
>> @@ -360,6 +427,32 @@ static int psample_tunnel_meta_len(struct ip_tunnel_info *tun_info)
>> }
>> #endif
>>
>> +static inline void psample_nl_obj_desc_init(struct psample_obj_desc *desc,
>> + u32 group_num)
>> +{
>> + memset(desc, 0, sizeof(*desc));
>> + desc->group_num = group_num;
>> +}
>> +
>> +static int psample_nl_sample_filter(struct sock *dsk, struct sk_buff *skb,
>> + void *data)
>> +{
>> + struct psample_obj_desc *desc = data;
>> + struct psample_nl_sock_priv *sk_priv;
>> + struct psample_obj_desc *filter;
>> + int ret = 0;
>> +
>> + rcu_read_lock();
>> + sk_priv = __genl_sk_priv_get(&psample_nl_family, dsk);
>> + if (!IS_ERR_OR_NULL(sk_priv)) {
>> + filter = rcu_dereference(sk_priv->filter);
>> + if (filter && desc)
>> + ret = (filter->group_num != desc->group_num);
>> + }
>> + rcu_read_unlock();
>> + return ret;
>> +}
>> +
>> void psample_sample_packet(struct psample_group *group, struct sk_buff *skb,
>> u32 sample_rate, const struct psample_metadata *md)
>> {
>> @@ -370,6 +463,7 @@ void psample_sample_packet(struct psample_group *group, struct sk_buff *skb,
>> #ifdef CONFIG_INET
>> struct ip_tunnel_info *tun_info;
>> #endif
>> + struct psample_obj_desc desc;
>> struct sk_buff *nl_skb;
>> int data_len;
>> int meta_len;
>> @@ -487,8 +581,12 @@ void psample_sample_packet(struct psample_group *group, struct sk_buff *skb,
>> #endif
>>
>> genlmsg_end(nl_skb, data);
>> - genlmsg_multicast_netns(&psample_nl_family, group->net, nl_skb, 0,
>> - PSAMPLE_NL_MCGRP_SAMPLE, GFP_ATOMIC);
>> + psample_nl_obj_desc_init(&desc, group->group_num);
>> + genlmsg_multicast_netns_filtered(&psample_nl_family,
>> + group->net, nl_skb, 0,
>> + PSAMPLE_NL_MCGRP_SAMPLE,
>> + GFP_ATOMIC, psample_nl_sample_filter,
>> + &desc);
>>
>> return;
>> error:
>> --
>> 2.44.0
>>
>>
>


2024-04-26 02:52:41

by Jamal Hadi Salim

Subject: Re: [PATCH net-next 5/8] net: sched: act_sample: add action cookie to sample

On Wed, Apr 24, 2024 at 9:54 AM Adrian Moreno <[email protected]> wrote:
>
> If the action has a user_cookie, pass it along to the sample so it can
> be easily identified.
>
> Signed-off-by: Adrian Moreno <[email protected]>
Reviewed-by: Jamal Hadi Salim <[email protected]>

cheers,
jamal

> ---
> net/sched/act_sample.c | 12 ++++++++++++
> 1 file changed, 12 insertions(+)
>
> diff --git a/net/sched/act_sample.c b/net/sched/act_sample.c
> index a69b53d54039..5c3f86ec964a 100644
> --- a/net/sched/act_sample.c
> +++ b/net/sched/act_sample.c
> @@ -165,9 +165,11 @@ TC_INDIRECT_SCOPE int tcf_sample_act(struct sk_buff *skb,
> const struct tc_action *a,
> struct tcf_result *res)
> {
> + u8 cookie_data[TC_COOKIE_MAX_SIZE] = {};
> struct tcf_sample *s = to_sample(a);
> struct psample_group *psample_group;
> struct psample_metadata md = {};
> + struct tc_cookie *user_cookie;
> int retval;
>
> tcf_lastuse_update(&s->tcf_tm);
> @@ -189,6 +191,16 @@ TC_INDIRECT_SCOPE int tcf_sample_act(struct sk_buff *skb,
> if (skb_at_tc_ingress(skb) && tcf_sample_dev_ok_push(skb->dev))
> skb_push(skb, skb->mac_len);
>
> + rcu_read_lock();
> + user_cookie = rcu_dereference(a->user_cookie);
> + if (user_cookie) {
> + memcpy(cookie_data, user_cookie->data,
> + user_cookie->len);
> + md.user_cookie = cookie_data;
> + md.user_cookie_len = user_cookie->len;
> + }
> + rcu_read_unlock();
> +
> md.trunc_size = s->truncate ? s->trunc_size : skb->len;
> psample_sample_packet(psample_group, skb, s->rate, &md);
>
> --
> 2.44.0
>