2015-05-30 23:28:24

by Krishna Chaitanya

[permalink] [raw]
Subject: [PATCH v2] cfg80211: Don't re-use the skb for larger NL messages.

cfg80211 reuses the skb before asking for a fresh on from genl framework,
this works efficiently for smaller messages but NLM_F_DUMP is normally
used to transport larger data normally > PAGE_SIZE, so if the message
occupies more than GOODSIZE its better to ask for a new, saves couple
of hanshakes with the driver.

This improves the time to get the DUMP response across to user space.
Signed-off-by: Chaitanya T K <[email protected]>
---
V2: no new line, so commit log was cut off. Updated now.
---
net/wireless/nl80211.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/net/wireless/nl80211.c b/net/wireless/nl80211.c
index c264eff..152bd0c 100644
--- a/net/wireless/nl80211.c
+++ b/net/wireless/nl80211.c
@@ -7636,6 +7636,10 @@ static int nl80211_testmode_dump(struct sk_buff *skb,
}

genlmsg_end(skb, hdr);
+
+ /* Don't re-use skb, when we know nla_put fails*/
+ if (skb->len > NLMSG_GOODSIZE / 2)
+ break;
}

err = skb->len;


2015-05-31 10:01:48

by Krishna Chaitanya

[permalink] [raw]
Subject: Re: [PATCH v2] cfg80211: Don't re-use the skb for larger NL messages.

On Sun, May 31, 2015 at 2:01 PM, Johannes Berg
<[email protected]> wrote:
> On Sun, 2015-05-31 at 04:58 +0530, Chaitanya T K wrote:
>> cfg80211 reuses the skb before asking for a fresh on from genl framework,
>> this works efficiently for smaller messages but NLM_F_DUMP is normally
>> used to transport larger data normally > PAGE_SIZE, so if the message
>> occupies more than GOODSIZE its better to ask for a new, saves couple
>> of hanshakes with the driver.
>>
>> This improves the time to get the DUMP response across to user space.
>
> This doesn't make any sense. If the driver is slow to actually create
> the data, it should implement this logic, but realistically the driver
> should just check if there's enough space and only try to create data to
> put into the skb if it's sufficient?
>
> It sounds to me like you're actually interacting with the hardware at
> this point (otherwise it wouldn't be slow!) which pretty much seems
> wrong anyway.

In our case the data size is 411KB, and each message carries around 3072
bytes, so total messages would be 138. The first message we get, we
retrieve the data, and for subsequent messages we just fill the skb,
we are not interacting with the hardware.

Without this patch, heres the sequence:
genl->cfg80211->retrieve the dump from HW, skb_put: 3072
cfg80211->skb_put, fail
genl->cfg80211->skb_put: 3072.

With this patch:

genl->cfg80211->retrive the dump from HW, skb_put: 3072
genl->cfg80211->skb_put: 3072.

2015-05-31 10:42:29

by Johannes Berg

[permalink] [raw]
Subject: Re: [PATCH v2] cfg80211: Don't re-use the skb for larger NL messages.

On Sun, 2015-05-31 at 15:31 +0530, Krishna Chaitanya wrote:

> In our case the data size is 411KB, and each message carries around 3072
> bytes, so total messages would be 138. The first message we get, we
> retrieve the data, and for subsequent messages we just fill the skb,
> we are not interacting with the hardware.
>
> Without this patch, heres the sequence:
> genl->cfg80211->retrieve the dump from HW, skb_put: 3072
> cfg80211->skb_put, fail
> genl->cfg80211->skb_put: 3072.
>
> With this patch:
>
> genl->cfg80211->retrive the dump from HW, skb_put: 3072
> genl->cfg80211->skb_put: 3072.

You've failed to explain why this is so much of a problem that you're
trying to "fix" it. I suspect the "fail" part inside your driver is
needlessly expensive, I don't think the function call can be expensive
enough to worry anyone.

Note # of "messages" as you say is actually irrelevant - you should look
at how often the kernel/user boundary is crossed, that's really far more
interesting, and your patch makes that MUCH worse when the put size is
small (say 100 bytes) because then you're practically doing that twice
as often.

Anyway - I stand by your patch being a terrible idea. End of story.

johannes


2015-05-31 06:58:50

by Arend Van Spriel

[permalink] [raw]
Subject: Re: [PATCH v2] cfg80211: Don't re-use the skb for larger NL messages.

On 31-05-15 01:28, Chaitanya T K wrote:
> cfg80211 reuses the skb before asking for a fresh on from genl framework,
> this works efficiently for smaller messages but NLM_F_DUMP is normally
> used to transport larger data normally > PAGE_SIZE, so if the message
> occupies more than GOODSIZE its better to ask for a new, saves couple
> of hanshakes with the driver.
>
> This improves the time to get the DUMP response across to user space.
> Signed-off-by: Chaitanya T K <[email protected]>
> ---
> V2: no new line, so commit log was cut off. Updated now.
> ---
> net/wireless/nl80211.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/net/wireless/nl80211.c b/net/wireless/nl80211.c
> index c264eff..152bd0c 100644
> --- a/net/wireless/nl80211.c
> +++ b/net/wireless/nl80211.c
> @@ -7636,6 +7636,10 @@ static int nl80211_testmode_dump(struct sk_buff *skb,
> }
>
> genlmsg_end(skb, hdr);
> +
> + /* Don't re-use skb, when we know nla_put fails*/

After reading the commit message this comment is not clear to me. Maybe
it is just me ;-)

> + if (skb->len > NLMSG_GOODSIZE / 2)

This definitely does not match the description in the commit message:
'message occupies more than GOODSIZE' != (skb->len > NLMSG_GOODSIZE / 2)

Regards,
Arend

> + break;
> }
>
> err = skb->len;
> --
> To unsubscribe from this list: send the line "unsubscribe linux-wireless" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>

2015-05-31 08:31:34

by Johannes Berg

[permalink] [raw]
Subject: Re: [PATCH v2] cfg80211: Don't re-use the skb for larger NL messages.

On Sun, 2015-05-31 at 04:58 +0530, Chaitanya T K wrote:
> cfg80211 reuses the skb before asking for a fresh on from genl framework,
> this works efficiently for smaller messages but NLM_F_DUMP is normally
> used to transport larger data normally > PAGE_SIZE, so if the message
> occupies more than GOODSIZE its better to ask for a new, saves couple
> of hanshakes with the driver.
>
> This improves the time to get the DUMP response across to user space.

This doesn't make any sense. If the driver is slow to actually create
the data, it should implement this logic, but realistically the driver
should just check if there's enough space and only try to create data to
put into the skb if it's sufficient?

It sounds to me like you're actually interacting with the hardware at
this point (otherwise it wouldn't be slow!) which pretty much seems
wrong anyway.

johannes


2015-05-31 11:21:47

by Johannes Berg

[permalink] [raw]
Subject: Re: [PATCH v2] cfg80211: Don't re-use the skb for larger NL messages.

On Sun, 2015-05-31 at 16:47 +0530, Krishna Chaitanya wrote:

> > Note # of "messages" as you say is actually irrelevant - you should look
>
> Well with 138 messages the function and each message 3072 bytes
> calls b/w cfg80211 and driver
> without the patch: would be 276 calls
> with this patch: would be 138 calls
> Thats a lot of function calls, don't you think?

No, I don't think so. That really should be within the noise, it's all
in the icache already after the first round.

> > at how often the kernel/user boundary is crossed, that's really far more
> > interesting, and your patch makes that MUCH worse when the put size is
> > small (say 100 bytes) because then you're practically doing that twice
> > as often.
>
> My patch doesn't deteriorate the situation, and not change the kernel to
> user boundary. With/Without the patch 3072 bytes are transported in a
> single message from kernel to user.

*in your case*

In the case that somebody is creating smaller messages it makes things
MUCH worse by allowing only half the data to be carried across the
kernel/userspace boundary each time any data crosses it, so it will
result in many more syscalls in that case. If you're worried about the
overhead of a simple function (pointer) call in the kernel, then surely
you should be far more worried about this.

johannes


2015-05-31 11:39:40

by Krishna Chaitanya

[permalink] [raw]
Subject: Re: [PATCH v2] cfg80211: Don't re-use the skb for larger NL messages.

On Sun, May 31, 2015 at 4:51 PM, Johannes Berg
<[email protected]> wrote:
> On Sun, 2015-05-31 at 16:47 +0530, Krishna Chaitanya wrote:
>
>> > Note # of "messages" as you say is actually irrelevant - you should look
>>
>> Well with 138 messages the function and each message 3072 bytes
>> calls b/w cfg80211 and driver
>> without the patch: would be 276 calls
>> with this patch: would be 138 calls
>> Thats a lot of function calls, don't you think?
>
> No, I don't think so. That really should be within the noise, it's all
> in the icache already after the first round.

Agree, but there's a scope for improvement (at least functionally), so
i took a shot.

>
>> > at how often the kernel/user boundary is crossed, that's really far more
>> > interesting, and your patch makes that MUCH worse when the put size is
>> > small (say 100 bytes) because then you're practically doing that twice
>> > as often.
>>
>> My patch doesn't deteriorate the situation, and not change the kernel to
>> user boundary. With/Without the patch 3072 bytes are transported in a
>> single message from kernel to user.
>
> *in your case*
>
> In the case that somebody is creating smaller messages it makes things
> MUCH worse by allowing only half the data to be carried across the
> kernel/userspace boundary each time any data crosses it, so it will
> result in many more syscalls in that case. If you're worried about the
> overhead of a simple function (pointer) call in the kernel, then surely
> you should be far more worried about this.

Thats why we have the threshold so that smaller messages (<2048)
do not suffer.

2015-05-31 09:56:31

by Krishna Chaitanya

[permalink] [raw]
Subject: Re: [PATCH v2] cfg80211: Don't re-use the skb for larger NL messages.

On Sun, May 31, 2015 at 12:28 PM, Arend van Spriel <[email protected]> wrote:
> On 31-05-15 01:28, Chaitanya T K wrote:
>> cfg80211 reuses the skb before asking for a fresh on from genl framework,
>> this works efficiently for smaller messages but NLM_F_DUMP is normally
>> used to transport larger data normally > PAGE_SIZE, so if the message
>> occupies more than GOODSIZE its better to ask for a new, saves couple
>> of hanshakes with the driver.
>>
>> This improves the time to get the DUMP response across to user space.
>> Signed-off-by: Chaitanya T K <[email protected]>
>> ---
>> V2: no new line, so commit log was cut off. Updated now.
>> ---
>> net/wireless/nl80211.c | 4 ++++
>> 1 file changed, 4 insertions(+)
>>
>> diff --git a/net/wireless/nl80211.c b/net/wireless/nl80211.c
>> index c264eff..152bd0c 100644
>> --- a/net/wireless/nl80211.c
>> +++ b/net/wireless/nl80211.c
>> @@ -7636,6 +7636,10 @@ static int nl80211_testmode_dump(struct sk_buff *skb,
>> }
>>
>> genlmsg_end(skb, hdr);
>> +
>> + /* Don't re-use skb, when we know nla_put fails*/
>
> After reading the commit message this comment is not clear to me. Maybe
> it is just me ;-)
>
>> + if (skb->len > NLMSG_GOODSIZE / 2)
>
> This definitely does not match the description in the commit message:
> 'message occupies more than GOODSIZE' != (skb->len > NLMSG_GOODSIZE / 2)
We need to find a threshold where smaller messages still get
aggregated in the same
skb where as larger messages save an extra handshake and request for
another skb.

So this threshold would be NLMSG_GOODSIZE/2. Assuming messages are equal size
except for the last one and if one message is > NLMSG_GOODSIE/2, we
cannot fit another
message in the same skb.



--
Thanks,
Regards,
Chaitanya T K.

2015-05-31 11:17:50

by Krishna Chaitanya

[permalink] [raw]
Subject: Re: [PATCH v2] cfg80211: Don't re-use the skb for larger NL messages.

On Sun, May 31, 2015 at 4:12 PM, Johannes Berg
<[email protected]> wrote:
> On Sun, 2015-05-31 at 15:31 +0530, Krishna Chaitanya wrote:
>
>> In our case the data size is 411KB, and each message carries around 3072
>> bytes, so total messages would be 138. The first message we get, we
>> retrieve the data, and for subsequent messages we just fill the skb,
>> we are not interacting with the hardware.
>>
>> Without this patch, heres the sequence:
>> genl->cfg80211->retrieve the dump from HW, skb_put: 3072
>> cfg80211->skb_put, fail
>> genl->cfg80211->skb_put: 3072.
>>
>> With this patch:
>>
>> genl->cfg80211->retrive the dump from HW, skb_put: 3072
>> genl->cfg80211->skb_put: 3072.
>
> You've failed to explain why this is so much of a problem that you're
> trying to "fix" it. I suspect the "fail" part inside your driver is
> needlessly expensive, I don't think the function call can be expensive
> enough to worry anyone.

"Fail" is a simple nla_put, its not at all expensive. We don't interact with
the hardware at that time. We in our tests we saw that this improved
the time to transport the dump (collection not included) to user space.

> Note # of "messages" as you say is actually irrelevant - you should look

Well with 138 messages the function and each message 3072 bytes
calls b/w cfg80211 and driver
without the patch: would be 276 calls
with this patch: would be 138 calls
Thats a lot of function calls, don't you think?

> at how often the kernel/user boundary is crossed, that's really far more
> interesting, and your patch makes that MUCH worse when the put size is
> small (say 100 bytes) because then you're practically doing that twice
> as often.

My patch doesn't deteriorate the situation, and not change the kernel to
user boundary. With/Without the patch 3072 bytes are transported in a
single message from kernel to user.

My patch only affects cfg80211 <--> Driver interaction as explained above.

> Anyway - I stand by your patch being a terrible idea. End of story.





--
Thanks,
Regards,
Chaitanya T K.