In rds_send_xmit() there is logic to batch the sends. However, if
another thread has acquired the lock, it is considered a race and we
yield. The code incrementing the s_send_lock_queue_raced statistics
counter did not count this event correctly.
This commit closes a small window in the race detection and
increments the statistics counter correctly.
Signed-off-by: Håkon Bugge <[email protected]>
Reviewed-by: Knut Omang <[email protected]>
---
net/rds/send.c | 16 +++++++++++++---
1 file changed, 13 insertions(+), 3 deletions(-)
diff --git a/net/rds/send.c b/net/rds/send.c
index 058a407..ecfe0b5 100644
--- a/net/rds/send.c
+++ b/net/rds/send.c
@@ -101,6 +101,11 @@ void rds_send_path_reset(struct rds_conn_path *cp)
 }
 EXPORT_SYMBOL_GPL(rds_send_path_reset);
 
+static bool someone_in_xmit(struct rds_conn_path *cp)
+{
+	return test_bit(RDS_IN_XMIT, &cp->cp_flags);
+}
+
 static int acquire_in_xmit(struct rds_conn_path *cp)
 {
 	return test_and_set_bit(RDS_IN_XMIT, &cp->cp_flags) == 0;
@@ -428,14 +433,19 @@ int rds_send_xmit(struct rds_conn_path *cp)
 	 * some work and we will skip our goto
 	 */
 	if (ret == 0) {
+		bool raced;
+
 		smp_mb();
+		raced = someone_in_xmit(cp) ||
+			send_gen != READ_ONCE(cp->cp_send_gen);
+
 		if ((test_bit(0, &conn->c_map_queued) ||
-		     !list_empty(&cp->cp_send_queue)) &&
-		    send_gen == READ_ONCE(cp->cp_send_gen)) {
-			rds_stats_inc(s_send_lock_queue_raced);
+		     !list_empty(&cp->cp_send_queue)) && !raced) {
 			if (batch_count < send_batch_count)
 				goto restart;
 			queue_delayed_work(rds_wq, &cp->cp_send_w, 1);
+		} else if (raced) {
+			rds_stats_inc(s_send_lock_queue_raced);
 		}
 	}
 out:
--
2.9.3
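
As a side-by-side summary of the second hunk above (a paraphrase for readability, not a copy of the kernel source; cp and conn are the usual rds_conn_path and rds_connection pointers):

/* Before: the counter was bumped inside the not-raced branch. */
if ((test_bit(0, &conn->c_map_queued) ||
     !list_empty(&cp->cp_send_queue)) &&
    send_gen == READ_ONCE(cp->cp_send_gen)) {
	rds_stats_inc(s_send_lock_queue_raced);
	/* restart the batch or defer to the workqueue */
}

/* After: "raced" means another thread is in xmit or the generation
 * moved on, and the counter is bumped only when that is true.
 */
raced = someone_in_xmit(cp) || send_gen != READ_ONCE(cp->cp_send_gen);
if ((test_bit(0, &conn->c_map_queued) ||
     !list_empty(&cp->cp_send_queue)) && !raced) {
	/* restart the batch or defer to the workqueue */
} else if (raced) {
	rds_stats_inc(s_send_lock_queue_raced);
}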
On 9/6/2017 8:29 AM, Håkon Bugge wrote:
> In rds_send_xmit() there is logic to batch the sends. However, if
> another thread has acquired the lock, it is considered a race and we
> yield. The code incrementing the s_send_lock_queue_raced statistics
> counter did not count this event correctly.
>
> This commit closes a small window in the race detection and
> increments the statistics counter correctly.
>
> Signed-off-by: Håkon Bugge <[email protected]>
> Reviewed-by: Knut Omang <[email protected]>
> ---
> net/rds/send.c | 16 +++++++++++++---
> 1 file changed, 13 insertions(+), 3 deletions(-)
>
Those counters are not really meant to be that accurate, so
I am not very keen to add additional cycles and code in the
send paths. Have you seen any real issue, or is this just an
observation? The s_send_lock_queue_raced counter is never
used to check for small increments, hence the question.
Regards,
Santosh
> On 6 Sep 2017, at 17:58, Santosh Shilimkar <[email protected]> wrote:
>
> On 9/6/2017 8:29 AM, Håkon Bugge wrote:
>> In rds_send_xmit() there is logic to batch the sends. However, if
>> another thread has acquired the lock, it is considered a race and we
>> yield. The code incrementing the s_send_lock_queue_raced statistics
>> counter did not count this event correctly.
>> This commit closes a small window in the race detection and
>> increments the statistics counter correctly.
>> Signed-off-by: Håkon Bugge <[email protected]>
>> Reviewed-by: Knut Omang <[email protected]>
>> ---
>> net/rds/send.c | 16 +++++++++++++---
>> 1 file changed, 13 insertions(+), 3 deletions(-)
> Those counters are not really meant to be that accurate, so
> I am not very keen to add additional cycles and code in the
> send paths. Have you seen any real issue, or is this just an
> observation? The s_send_lock_queue_raced counter is never
> used to check for small increments, hence the question.
Hi Santosh,
Yes, I agree about the accuracy of s_send_lock_queue_raced. But the main point is that the existing code increments the counter for some subset of the cases where it is _not_ raced.
So, in the critical path, my patch adds one test_bit(), which hits the local CPU cache, in the not-raced case. If raced, some other thread is in control anyway, so I do not think the added cycles make any big difference.
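
To make the counting argument concrete, here is a small stand-alone model of the two conditions (plain user-space C, for illustration only; work, gen_ok and in_xmit are hypothetical names standing in for the queue/map check, the send_gen comparison and the RDS_IN_XMIT bit, not RDS code):

#include <stdbool.h>
#include <stdio.h>

int main(void)
{
	for (int mask = 0; mask < 8; mask++) {
		bool work    = mask & 1;	/* map queued or send queue non-empty */
		bool gen_ok  = mask & 2;	/* send_gen still matches cp_send_gen */
		bool in_xmit = mask & 4;	/* another thread holds RDS_IN_XMIT */

		/* Old code: counter bumped only when there is work pending
		 * and the generation has NOT changed, i.e. in a subset of
		 * the not-raced cases.
		 */
		bool old_counts = work && gen_ok;

		/* Patched code: counter bumped exactly when we lost the race,
		 * i.e. someone else is in xmit or the generation moved on.
		 */
		bool new_counts = in_xmit || !gen_ok;

		printf("work=%d gen_ok=%d in_xmit=%d  old=%d new=%d\n",
		       (int)work, (int)gen_ok, (int)in_xmit,
		       (int)old_counts, (int)new_counts);
	}
	return 0;
}

Built with e.g. cc -std=c99, the output shows the old condition firing only on combinations where the generation did not change, which is the partial share of not-raced events I mean above.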
I can send a v2 where the race tightening is removed if you like.
Thxs, Håkon
On 9/6/2017 9:12 AM, Håkon Bugge wrote:
>
[...]
>
> Hi Santosh,
>
>
> Yes, I agree about the accuracy of s_send_lock_queue_raced. But the main point is that the existing code increments the counter for some subset of the cases where it is _not_ raced.
>
> So, in the critical path, my patch adds one test_bit(), which hits the local CPU cache, in the not-raced case. If raced, some other thread is in control anyway, so I do not think the added cycles make any big difference.
>
Cycles added for no good reason is the point.
> I can send a v2 where the race tightening is removed if you like.
>
Yes please.
Regards,
Santosh