2023-02-23 12:22:04

by 缘 陶

Subject: [PATCH net-next v2 1/1] net: openvswitch: Use on stack sw_flow_key in ovs_packet_cmd_execute

Use on stack sw_flow_key in ovs_packet_cmd_execute

Reason: As key functions in the slow path, ovs_packet_cmd_execute and
ovs_flow_cmd_new allocate a transient sw_flow and free it at the end
of the function.
This procedure is inefficient in two aspects:
1. only the sw_flow_key is actually needed by the function
2. the alloc/free pair goes through kmem_cache operations, which is
   costly when the system handles frequent slow-path traffic

The existing code in ovs_flow_cmd_new/set/get already deliberately
uses the stack to store sw_flow_mask and sw_flow_key.

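For reference, a minimal sketch of that on-stack pattern, paraphrased
from ovs_flow_cmd_get/del (simplified; the exact signatures of
ovs_match_init()/ovs_nla_get_match() are assumed from memory here, not
copied verbatim):

        struct sw_flow_key key;
        struct sw_flow_match match;

        /* The key lives on the stack; no kmem_cache allocation needed. */
        ovs_match_init(&match, &key, true, NULL);
        err = ovs_nla_get_match(net, &match, a[OVS_FLOW_ATTR_KEY],
                                NULL, log);

This patch applies the same idea to ovs_packet_cmd_execute.
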
Performance benefit:
ovs_packet_cmd_execute efficiency improved:
 - avoids 2 calls to kmem_cache alloc (see the sketch below)
 - avoids a memzero of 200 bytes
 - 6% fewer instructions
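
As a rough illustration of where the two allocations come from: the old
path called ovs_flow_alloc(), which does roughly the following
(paraphrased from memory and simplified; cache names, flags and the
NUMA node argument are approximate, not verbatim kernel source):

        struct sw_flow *flow;
        struct sw_flow_stats *stats;

        /* allocation #1: the sw_flow itself, zeroed */
        flow = kmem_cache_zalloc(flow_cache, GFP_KERNEL);
        if (!flow)
                return ERR_PTR(-ENOMEM);

        /* allocation #2: the initial per-flow stats node, zeroed */
        stats = kmem_cache_alloc_node(flow_stats_cache,
                                      GFP_KERNEL | __GFP_ZERO, 0);

With an on-stack sw_flow_key, both allocations and the matching frees
in ovs_flow_free() disappear from this path.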

Testing topology:

                  +-----+
        nic1 ---  |     |  --- nic1
        nic2 ---  |     |  --- nic2
  VM1 (16 cpus)   | ovs |   VM2 (16 cpus)
        nic3 ---  |4cpus|  --- nic3
        nic4 ---  |     |  --- nic4
                  +-----+
2 threads on each vnic, with affinity set on the client side.

netperf -H $peer -p $((port+$i)) -t UDP_RR -l 60 -- -R 1 -r 8K,8K
netperf -H $peer -p $((port+$i)) -t TCP_RR -l 60 -- -R 1 -r 120,240
netperf -H $peer -p $((port+$i)) -t TCP_CRR -l 60 -- -R 1 -r 120,240

Before the fix
Mode        Iterations  Variance  Average
UDP_RR      10          1.31%     46724
TCP_RR      10          6.26%     77188
TCP_CRR     10          0.10%     20505
UDP_STREAM  10          4.55%     19907
TCP_STREAM  10          9.93%     28942

After the fix
Mode        Iterations  Variance  Average
UDP_RR      10          1.51%     49097
TCP_RR      10          5.58%     78540
TCP_CRR     10          0.14%     20542
UDP_STREAM  10          11.17%    22532
TCP_STREAM  10          11.14%    28579

Signed-off-by: Eddy Tao <[email protected]>
---
V1 -> V2: Further reduce memory usage by using sw_flow_key instead
of sw_flow; revise the change description and provide test data

net/openvswitch/datapath.c | 30 +++++++++++-------------------
1 file changed, 11 insertions(+), 19 deletions(-)

diff --git a/net/openvswitch/datapath.c b/net/openvswitch/datapath.c
index fcee6012293b..ae3146d51079 100644
--- a/net/openvswitch/datapath.c
+++ b/net/openvswitch/datapath.c
@@ -596,8 +596,7 @@ static int ovs_packet_cmd_execute(struct sk_buff *skb, struct genl_info *info)
struct nlattr **a = info->attrs;
struct sw_flow_actions *acts;
struct sk_buff *packet;
- struct sw_flow *flow;
- struct sw_flow_actions *sf_acts;
+ struct sw_flow_key key;
struct datapath *dp;
struct vport *input_vport;
u16 mru = 0;
@@ -636,24 +635,20 @@ static int ovs_packet_cmd_execute(struct sk_buff *skb, struct genl_info *info)
}

/* Build an sw_flow for sending this packet. */
- flow = ovs_flow_alloc();
- err = PTR_ERR(flow);
- if (IS_ERR(flow))
- goto err_kfree_skb;
+ memset(&key, 0, sizeof(key));

err = ovs_flow_key_extract_userspace(net, a[OVS_PACKET_ATTR_KEY],
- packet, &flow->key, log);
+ packet, &key, log);
if (err)
- goto err_flow_free;
+ goto err_kfree_skb;

err = ovs_nla_copy_actions(net, a[OVS_PACKET_ATTR_ACTIONS],
- &flow->key, &acts, log);
+ &key, &acts, log);
if (err)
- goto err_flow_free;
+ goto err_kfree_skb;

- rcu_assign_pointer(flow->sf_acts, acts);
- packet->priority = flow->key.phy.priority;
- packet->mark = flow->key.phy.skb_mark;
+ packet->priority = key.phy.priority;
+ packet->mark = key.phy.skb_mark;

rcu_read_lock();
dp = get_dp_rcu(net, ovs_header->dp_ifindex);
@@ -661,7 +656,7 @@ static int ovs_packet_cmd_execute(struct sk_buff *skb, struct genl_info *info)
if (!dp)
goto err_unlock;

- input_vport = ovs_vport_rcu(dp, flow->key.phy.in_port);
+ input_vport = ovs_vport_rcu(dp, key.phy.in_port);
if (!input_vport)
input_vport = ovs_vport_rcu(dp, OVSP_LOCAL);

@@ -670,20 +665,17 @@ static int ovs_packet_cmd_execute(struct sk_buff *skb, struct genl_info *info)

packet->dev = input_vport->dev;
OVS_CB(packet)->input_vport = input_vport;
- sf_acts = rcu_dereference(flow->sf_acts);

local_bh_disable();
- err = ovs_execute_actions(dp, packet, sf_acts, &flow->key);
+ err = ovs_execute_actions(dp, packet, acts, &key);
local_bh_enable();
rcu_read_unlock();

- ovs_flow_free(flow, false);
+ ovs_nla_free_flow_actions(acts);
return err;

err_unlock:
rcu_read_unlock();
-err_flow_free:
- ovs_flow_free(flow, false);
err_kfree_skb:
kfree_skb(packet);
err:
--
2.27.0



2023-02-23 12:25:08

by 缘 陶

Subject: Re: [PATCH net-next v2 1/1] net: openvswitch: Use on stack sw_flow_key in ovs_packet_cmd_execute

Sorry, there is a typo in the mail; I will resend shortly, please ignore
it for now.

On 2023/2/23 20:21, Eddy Tao wrote:
> Use on stack sw_flow_key in ovs_packet_cmd_execute

2023-02-23 12:39:28

by Simon Horman

Subject: Re: [PATCH net-next v2 1/1] net: openvswitch: Use on stack sw_flow_key in ovs_packet_cmd_execute

On Thu, Feb 23, 2023 at 08:24:50PM +0800, Eddy Tao wrote:
> Sorry, there is a typo in the mail; I will resend shortly, please ignore
> it for now.


net-next is now closed.

You'll need to repost this patch after v6.3-rc1 has been tagged.
Or post it as an RFC.

Ref: https://www.kernel.org/doc/html/latest/process/maintainer-netdev.html

2023-02-23 12:42:48

by Paolo Abeni

Subject: Re: [PATCH net-next v2 1/1] net: openvswitch: Use on stack sw_flow_key in ovs_packet_cmd_execute

On Thu, 2023-02-23 at 20:24 +0800, Eddy Tao wrote:
> Sorry, there is a typo in the mail; I will resend shortly,

please, don't do that.

# Form letter - net-next is closed

The merge window for v6.3 has begun and therefore net-next is closed
for new drivers, features, code refactoring and optimizations.
We are currently accepting bug fixes only.

Please repost when net-next reopens after Mar 6th.

RFC patches sent for review only are obviously welcome at any time.


2023-02-23 14:10:15

by 缘 陶

Subject: Re: [PATCH net-next v2 1/1] net: openvswitch: Use on stack sw_flow_key in ovs_packet_cmd_execute

Sure, will redo the post when the window reopens.
Have a great day
eddy
