2012-11-13 01:52:22

by Shan Wei

[permalink] [raw]
Subject: [PATCH v4 3/9] net: xfrm: use __this_cpu_read per-cpu helper

From: Shan Wei <[email protected]>


Signed-off-by: Shan Wei <[email protected]>
---
v4:
derefrence pointer before reading to avoid compile warning.
---
net/xfrm/xfrm_ipcomp.c | 8 +++-----
1 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/net/xfrm/xfrm_ipcomp.c b/net/xfrm/xfrm_ipcomp.c
index e5246fb..2906d52 100644
--- a/net/xfrm/xfrm_ipcomp.c
+++ b/net/xfrm/xfrm_ipcomp.c
@@ -276,18 +276,16 @@ static struct crypto_comp * __percpu *ipcomp_alloc_tfms(const char *alg_name)
struct crypto_comp * __percpu *tfms;
int cpu;

- /* This can be any valid CPU ID so we don't need locking. */
- cpu = raw_smp_processor_id();

list_for_each_entry(pos, &ipcomp_tfms_list, list) {
struct crypto_comp *tfm;

- tfms = pos->tfms;
- tfm = *per_cpu_ptr(tfms, cpu);
+ /* This can be any valid CPU ID so we don't need locking. */
+ tfm = __this_cpu_read(*pos->tfms);

if (!strcmp(crypto_comp_name(tfm), alg_name)) {
pos->users++;
- return tfms;
+ return pos->tfms;
}
}

--
1.7.1


2012-11-13 07:21:11

by Steffen Klassert

[permalink] [raw]
Subject: Re: [PATCH v4 3/9] net: xfrm: use __this_cpu_read per-cpu helper

On Tue, Nov 13, 2012 at 09:52:09AM +0800, Shan Wei wrote:
> From: Shan Wei <[email protected]>
>

Please add a proper commit message, explaining why you do this change.

>
> Signed-off-by: Shan Wei <[email protected]>
> ---
> v4:
> derefrence pointer before reading to avoid compile warning.
> ---
> net/xfrm/xfrm_ipcomp.c | 8 +++-----
> 1 files changed, 3 insertions(+), 5 deletions(-)
>
> diff --git a/net/xfrm/xfrm_ipcomp.c b/net/xfrm/xfrm_ipcomp.c
> index e5246fb..2906d52 100644
> --- a/net/xfrm/xfrm_ipcomp.c
> +++ b/net/xfrm/xfrm_ipcomp.c
> @@ -276,18 +276,16 @@ static struct crypto_comp * __percpu *ipcomp_alloc_tfms(const char *alg_name)
> struct crypto_comp * __percpu *tfms;
> int cpu;
>
> - /* This can be any valid CPU ID so we don't need locking. */
> - cpu = raw_smp_processor_id();
>
> list_for_each_entry(pos, &ipcomp_tfms_list, list) {
> struct crypto_comp *tfm;
>
> - tfms = pos->tfms;
> - tfm = *per_cpu_ptr(tfms, cpu);
> + /* This can be any valid CPU ID so we don't need locking. */
> + tfm = __this_cpu_read(*pos->tfms);

This should just fetch the tfm pointer, so why exactly __this_cpu_read
is better than __this_cpu_ptr? Please keep in mind that performance is
not the most important thing here. It's much more important that it
works in any case.

2012-11-13 09:33:33

by Shan Wei

[permalink] [raw]
Subject: Re: [PATCH v4 3/9] net: xfrm: use __this_cpu_read per-cpu helper

Steffen Klassert said, at 2012/11/13 15:21:
>
> This should just fetch the tfm pointer, so why exactly __this_cpu_read
> is better than __this_cpu_ptr? Please keep in mind that performance is
> not the most important thing here. It's much more important that it
> works in any case.

[0/9] describes why we should submit this series patches.
you are not included in original mail.

this_cpu_ptr relocates and address. this_cpu_read() relocates the address
and performs the fetch. If you want to operate on rda(defined as per_cpu)
then you can only use this_cpu_ptr. this_cpu_read() saves you more instructions
since it can do the relocation and the fetch in one instruction.

More info, please refer to http://www.spinics.net/lists/kernel/msg1435153.html

2012-11-13 10:48:50

by Steffen Klassert

[permalink] [raw]
Subject: Re: [PATCH v4 3/9] net: xfrm: use __this_cpu_read per-cpu helper

On Tue, Nov 13, 2012 at 05:33:19PM +0800, Shan Wei wrote:
> Steffen Klassert said, at 2012/11/13 15:21:
> >
> > This should just fetch the tfm pointer, so why exactly __this_cpu_read
> > is better than __this_cpu_ptr? Please keep in mind that performance is
> > not the most important thing here. It's much more important that it
> > works in any case.
>
> [0/9] describes why we should submit this series patches.
> you are not included in original mail.
>
> this_cpu_ptr relocates and address. this_cpu_read() relocates the address
> and performs the fetch. If you want to operate on rda(defined as per_cpu)
> then you can only use this_cpu_ptr. this_cpu_read() saves you more instructions
> since it can do the relocation and the fetch in one instruction.
>

Ok, so please add a commit message to describe your changes.

Thanks.

2012-11-13 12:36:14

by Shan Wei

[permalink] [raw]
Subject: Re: [PATCH v5 3/9] net: xfrm: use __this_cpu_read per-cpu helper

Steffen Klassert said, at 2012/11/13 18:48:
>
> Ok, so please add a commit message to describe your changes.
>
> Thanks.
>

[PATCH v5] net: xfrm: use __this_cpu_read per-cpu helper

this_cpu_ptr/this_cpu_read is faster than per_cpu_ptr(p, smp_processor_id())
and can reduce memory accesses.
The latter helper needs to find the offset for current cpu,
and needs more assembler instructions which objdump shows in following.

this_cpu_ptr relocates and address. this_cpu_read() relocates the address
and performs the fetch. this_cpu_read() saves you more instructions
since it can do the relocation and the fetch in one instruction.

per_cpu_ptr(p, smp_processor_id()):
1e: 65 8b 04 25 00 00 00 00 mov %gs:0x0,%eax
26: 48 98 cltq
28: 31 f6 xor %esi,%esi
2a: 48 c7 c7 00 00 00 00 mov $0x0,%rdi
31: 48 8b 04 c5 00 00 00 00 mov 0x0(,%rax,8),%rax
39: c7 44 10 04 14 00 00 00 movl $0x14,0x4(%rax,%rdx,1)

this_cpu_ptr(p)
1e: 65 48 03 14 25 00 00 00 00 add %gs:0x0,%rdx
27: 31 f6 xor %esi,%esi
29: c7 42 04 14 00 00 00 movl $0x14,0x4(%rdx)
30: 48 c7 c7 00 00 00 00 mov $0x0,%rdi

Signed-off-by: Shan Wei <[email protected]>
---
net/xfrm/xfrm_ipcomp.c | 8 +++-----
1 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/net/xfrm/xfrm_ipcomp.c b/net/xfrm/xfrm_ipcomp.c
index e5246fb..2906d52 100644
--- a/net/xfrm/xfrm_ipcomp.c
+++ b/net/xfrm/xfrm_ipcomp.c
@@ -276,18 +276,16 @@ static struct crypto_comp * __percpu *ipcomp_alloc_tfms(const char *alg_name)
struct crypto_comp * __percpu *tfms;
int cpu;

- /* This can be any valid CPU ID so we don't need locking. */
- cpu = raw_smp_processor_id();

list_for_each_entry(pos, &ipcomp_tfms_list, list) {
struct crypto_comp *tfm;

- tfms = pos->tfms;
- tfm = *per_cpu_ptr(tfms, cpu);
+ /* This can be any valid CPU ID so we don't need locking. */
+ tfm = __this_cpu_read(*pos->tfms);

if (!strcmp(crypto_comp_name(tfm), alg_name)) {
pos->users++;
- return tfms;
+ return pos->tfms;
}
}

--
1.7.1

2012-11-14 08:44:05

by Steffen Klassert

[permalink] [raw]
Subject: Re: [PATCH v5 3/9] net: xfrm: use __this_cpu_read per-cpu helper

On Tue, Nov 13, 2012 at 08:36:00PM +0800, Shan Wei wrote:
> Steffen Klassert said, at 2012/11/13 18:48:
> >
> > Ok, so please add a commit message to describe your changes.
> >
> > Thanks.
> >
>
> [PATCH v5] net: xfrm: use __this_cpu_read per-cpu helper
>
> this_cpu_ptr/this_cpu_read is faster than per_cpu_ptr(p, smp_processor_id())
> and can reduce memory accesses.
> The latter helper needs to find the offset for current cpu,
> and needs more assembler instructions which objdump shows in following.
>
> this_cpu_ptr relocates and address. this_cpu_read() relocates the address
> and performs the fetch. this_cpu_read() saves you more instructions
> since it can do the relocation and the fetch in one instruction.
>
> per_cpu_ptr(p, smp_processor_id()):
> 1e: 65 8b 04 25 00 00 00 00 mov %gs:0x0,%eax
> 26: 48 98 cltq
> 28: 31 f6 xor %esi,%esi
> 2a: 48 c7 c7 00 00 00 00 mov $0x0,%rdi
> 31: 48 8b 04 c5 00 00 00 00 mov 0x0(,%rax,8),%rax
> 39: c7 44 10 04 14 00 00 00 movl $0x14,0x4(%rax,%rdx,1)
>
> this_cpu_ptr(p)
> 1e: 65 48 03 14 25 00 00 00 00 add %gs:0x0,%rdx
> 27: 31 f6 xor %esi,%esi
> 29: c7 42 04 14 00 00 00 movl $0x14,0x4(%rdx)
> 30: 48 c7 c7 00 00 00 00 mov $0x0,%rdi
>
> Signed-off-by: Shan Wei <[email protected]>

Applied to ipsec-next, thanks!