Recently lockless_dereference() was added which can be used in place of
hard-coding smp_read_barrier_depends().
http://lkml.iu.edu/hypermail/linux/kernel/1410.3/04561.html
The following series tries to do this.
There are still some hard-coded locations which I was not sure how to
replace; I will send separate patches/questions regarding them.
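For context, lockless_dereference() bundles the pointer load and the
dependency barrier into a single primitive. A sketch of its definition
(assuming the 3.18-era macro in include/linux/rcupdate.h):

	#define lockless_dereference(p) \
	({ \
		typeof(p) _________p1 = ACCESS_ONCE(p); \
		smp_read_barrier_depends(); /* Dependency order vs. p above. */ \
		(_________p1); \
	})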
Pranith Kumar (9):
doc: memory-barriers.txt: Document use of lockless_dereference()
drivers: dma: Replace smp_read_barrier_depends() with
lockless_dereference()
dcache: Replace smp_read_barrier_depends() with lockless_dereference()
hyperv: Replace smp_read_barrier_depends() with lockless_dereference()
percpu: Replace smp_read_barrier_depends() with lockless_dereference()
perf: Replace smp_read_barrier_depends() with lockless_dereference()
seccomp: Replace smp_read_barrier_depends() with
lockless_dereference()
task_work: Replace smp_read_barrier_depends() with
lockless_dereference()
netfilter: Replace smp_read_barrier_depends() with
lockless_dereference()
Documentation/memory-barriers.txt | 4 ++--
drivers/dma/ioat/dma_v2.c | 3 +--
drivers/dma/ioat/dma_v3.c | 3 +--
fs/dcache.c | 7 ++-----
include/linux/hyperv.h | 9 ++++-----
include/linux/percpu-refcount.h | 4 +---
kernel/events/core.c | 3 +--
kernel/events/uprobes.c | 8 ++++----
kernel/seccomp.c | 7 +++----
kernel/task_work.c | 3 +--
net/ipv4/netfilter/arp_tables.c | 3 +--
net/ipv4/netfilter/ip_tables.c | 3 +--
net/ipv6/netfilter/ip6_tables.c | 3 +--
13 files changed, 23 insertions(+), 37 deletions(-)
--
1.9.1
Recently lockless_dereference() was added which can be used in place of
hard-coding smp_read_barrier_depends(). The following PATCH makes the change.
Signed-off-by: Pranith Kumar <[email protected]>
---
Documentation/memory-barriers.txt | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
index 7ee2ae6..d33aab3 100644
--- a/Documentation/memory-barriers.txt
+++ b/Documentation/memory-barriers.txt
@@ -203,8 +203,8 @@ There are some minimal guarantees that may be expected of a CPU:
and always in that order. On most systems, smp_read_barrier_depends()
does nothing, but it is required for DEC Alpha. The ACCESS_ONCE()
is required to prevent compiler mischief. Please note that you
- should normally use something like rcu_dereference() instead of
- open-coding smp_read_barrier_depends().
+ should normally use something like rcu_dereference() or
+ lockless_dereference() instead of open-coding smp_read_barrier_depends().
(*) Overlapping loads and stores within a particular CPU will appear to be
ordered within that CPU. This means that for:
--
1.9.1
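To make the conversion concrete, the pattern replaced throughout this
series looks roughly like the following (illustrative names only, not
taken from any one file):

	/* Before: open-coded load plus dependency barrier */
	p = ACCESS_ONCE(gp);
	smp_read_barrier_depends();
	val = p->field;

	/* After: the helper performs the load and the barrier together */
	p = lockless_dereference(gp);
	val = p->field;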
Recently lockless_dereference() was added which can be used in place of
hard-coding smp_read_barrier_depends(). The following PATCH makes the change.
Signed-off-by: Pranith Kumar <[email protected]>
---
drivers/dma/ioat/dma_v2.c | 3 +--
drivers/dma/ioat/dma_v3.c | 3 +--
2 files changed, 2 insertions(+), 4 deletions(-)
diff --git a/drivers/dma/ioat/dma_v2.c b/drivers/dma/ioat/dma_v2.c
index 695483e..0f94d72 100644
--- a/drivers/dma/ioat/dma_v2.c
+++ b/drivers/dma/ioat/dma_v2.c
@@ -142,10 +142,9 @@ static void __cleanup(struct ioat2_dma_chan *ioat, dma_addr_t phys_complete)
active = ioat2_ring_active(ioat);
for (i = 0; i < active && !seen_current; i++) {
- smp_read_barrier_depends();
prefetch(ioat2_get_ring_ent(ioat, idx + i + 1));
desc = ioat2_get_ring_ent(ioat, idx + i);
- tx = &desc->txd;
+ tx = &lockless_dereference(desc)->txd;
dump_desc_dbg(ioat, desc);
if (tx->cookie) {
dma_descriptor_unmap(tx);
diff --git a/drivers/dma/ioat/dma_v3.c b/drivers/dma/ioat/dma_v3.c
index 895f869..cbd0537 100644
--- a/drivers/dma/ioat/dma_v3.c
+++ b/drivers/dma/ioat/dma_v3.c
@@ -389,7 +389,6 @@ static void __cleanup(struct ioat2_dma_chan *ioat, dma_addr_t phys_complete)
for (i = 0; i < active && !seen_current; i++) {
struct dma_async_tx_descriptor *tx;
- smp_read_barrier_depends();
prefetch(ioat2_get_ring_ent(ioat, idx + i + 1));
desc = ioat2_get_ring_ent(ioat, idx + i);
dump_desc_dbg(ioat, desc);
@@ -398,7 +397,7 @@ static void __cleanup(struct ioat2_dma_chan *ioat, dma_addr_t phys_complete)
if (device->cap & IOAT_CAP_DWBES)
desc_get_errstat(ioat, desc);
- tx = &desc->txd;
+ tx = &lockless_dereference(desc)->txd;
if (tx->cookie) {
dma_cookie_complete(tx);
dma_descriptor_unmap(tx);
--
1.9.1
Recently lockless_dereference() was added which can be used in place of
hard-coding smp_read_barrier_depends(). The following PATCH makes the change.
Signed-off-by: Pranith Kumar <[email protected]>
---
fs/dcache.c | 7 ++-----
1 file changed, 2 insertions(+), 5 deletions(-)
diff --git a/fs/dcache.c b/fs/dcache.c
index a6c5d7e..27b8b5b 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -230,8 +230,7 @@ static inline int dentry_cmp(const struct dentry *dentry, const unsigned char *c
* early because the data cannot match (there can
* be no NUL in the ct/tcount data)
*/
- cs = ACCESS_ONCE(dentry->d_name.name);
- smp_read_barrier_depends();
+ cs = lockless_dereference(dentry->d_name.name);
return dentry_string_cmp(cs, ct, tcount);
}
@@ -2714,10 +2713,8 @@ static int prepend(char **buffer, int *buflen, const char *str, int namelen)
static int prepend_name(char **buffer, int *buflen, struct qstr *name)
{
const char *dname = ACCESS_ONCE(name->name);
- u32 dlen = ACCESS_ONCE(name->len);
char *p;
-
- smp_read_barrier_depends();
+ u32 dlen = lockless_dereference(name->len);
*buflen -= dlen + 1;
if (*buflen < 0)
--
1.9.1
Recently lockless_dereference() was added which can be used in place of
hard-coding smp_read_barrier_depends(). The following PATCH makes the change.
Signed-off-by: Pranith Kumar <[email protected]>
---
kernel/seccomp.c | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/kernel/seccomp.c b/kernel/seccomp.c
index 4ef9687..3729b06 100644
--- a/kernel/seccomp.c
+++ b/kernel/seccomp.c
@@ -175,17 +175,16 @@ static int seccomp_check_filter(struct sock_filter *filter, unsigned int flen)
*/
static u32 seccomp_run_filters(struct seccomp_data *sd)
{
- struct seccomp_filter *f = ACCESS_ONCE(current->seccomp.filter);
struct seccomp_data sd_local;
u32 ret = SECCOMP_RET_ALLOW;
+ /* Make sure cross-thread synced filter points somewhere sane. */
+ struct seccomp_filter *f =
+ lockless_dereference(current->seccomp.filter);
/* Ensure unexpected behavior doesn't result in failing open. */
if (unlikely(WARN_ON(f == NULL)))
return SECCOMP_RET_KILL;
- /* Make sure cross-thread synced filter points somewhere sane. */
- smp_read_barrier_depends();
-
if (!sd) {
populate_seccomp_data(&sd_local);
sd = &sd_local;
--
1.9.1
Recently lockless_dereference() was added which can be used in place of
hard-coding smp_read_barrier_depends(). The following PATCH makes the change.
Signed-off-by: Pranith Kumar <[email protected]>
---
net/ipv4/netfilter/arp_tables.c | 3 +--
net/ipv4/netfilter/ip_tables.c | 3 +--
net/ipv6/netfilter/ip6_tables.c | 3 +--
3 files changed, 3 insertions(+), 6 deletions(-)
diff --git a/net/ipv4/netfilter/arp_tables.c b/net/ipv4/netfilter/arp_tables.c
index f95b6f9..fc7533d 100644
--- a/net/ipv4/netfilter/arp_tables.c
+++ b/net/ipv4/netfilter/arp_tables.c
@@ -270,12 +270,11 @@ unsigned int arpt_do_table(struct sk_buff *skb,
local_bh_disable();
addend = xt_write_recseq_begin();
- private = table->private;
/*
* Ensure we load private-> members after we've fetched the base
* pointer.
*/
- smp_read_barrier_depends();
+ private = lockless_dereference(table->private);
table_base = private->entries[smp_processor_id()];
e = get_entry(table_base, private->hook_entry[hook]);
diff --git a/net/ipv4/netfilter/ip_tables.c b/net/ipv4/netfilter/ip_tables.c
index 99e810f..e0fd044 100644
--- a/net/ipv4/netfilter/ip_tables.c
+++ b/net/ipv4/netfilter/ip_tables.c
@@ -325,13 +325,12 @@ ipt_do_table(struct sk_buff *skb,
IP_NF_ASSERT(table->valid_hooks & (1 << hook));
local_bh_disable();
addend = xt_write_recseq_begin();
- private = table->private;
cpu = smp_processor_id();
/*
* Ensure we load private-> members after we've fetched the base
* pointer.
*/
- smp_read_barrier_depends();
+ private = lockless_dereference(table->private);
table_base = private->entries[cpu];
jumpstack = (struct ipt_entry **)private->jumpstack[cpu];
stackptr = per_cpu_ptr(private->stackptr, cpu);
diff --git a/net/ipv6/netfilter/ip6_tables.c b/net/ipv6/netfilter/ip6_tables.c
index e080fbb..0459d6a 100644
--- a/net/ipv6/netfilter/ip6_tables.c
+++ b/net/ipv6/netfilter/ip6_tables.c
@@ -348,12 +348,11 @@ ip6t_do_table(struct sk_buff *skb,
local_bh_disable();
addend = xt_write_recseq_begin();
- private = table->private;
/*
* Ensure we load private-> members after we've fetched the base
* pointer.
*/
- smp_read_barrier_depends();
+ private = lockless_dereference(table->private);
cpu = smp_processor_id();
table_base = private->entries[cpu];
jumpstack = (struct ip6t_entry **)private->jumpstack[cpu];
--
1.9.1
Recently lockless_dereference() was added which can be used in place of
hard-coding smp_read_barrier_depends(). The following PATCH makes the change.
Signed-off-by: Pranith Kumar <[email protected]>
---
include/linux/hyperv.h | 9 ++++-----
1 file changed, 4 insertions(+), 5 deletions(-)
diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h
index 08cfaff..06418b1 100644
--- a/include/linux/hyperv.h
+++ b/include/linux/hyperv.h
@@ -127,13 +127,12 @@ hv_get_ringbuffer_availbytes(struct hv_ring_buffer_info *rbi,
u32 *read, u32 *write)
{
u32 read_loc, write_loc, dsize;
-
- smp_read_barrier_depends();
+ struct hv_ring_buffer_info *rbi_p = lockless_dereference(rbi);
/* Capture the read/write indices before they changed */
- read_loc = rbi->ring_buffer->read_index;
- write_loc = rbi->ring_buffer->write_index;
- dsize = rbi->ring_datasize;
+ read_loc = rbi_p->ring_buffer->read_index;
+ write_loc = rbi_p->ring_buffer->write_index;
+ dsize = rbi_p->ring_datasize;
*write = write_loc >= read_loc ? dsize - (write_loc - read_loc) :
read_loc - write_loc;
--
1.9.1
Recently lockless_dereference() was added which can be used in place of
hard-coding smp_read_barrier_depends(). The following PATCH makes the change.
Signed-off-by: Pranith Kumar <[email protected]>
---
kernel/task_work.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/kernel/task_work.c b/kernel/task_work.c
index 8727032..b5599ce 100644
--- a/kernel/task_work.c
+++ b/kernel/task_work.c
@@ -62,8 +62,7 @@ task_work_cancel(struct task_struct *task, task_work_func_t func)
* we raced with task_work_run(), *pprev == NULL/exited.
*/
raw_spin_lock_irqsave(&task->pi_lock, flags);
- while ((work = ACCESS_ONCE(*pprev))) {
- smp_read_barrier_depends();
+ while ((work = lockless_dereference(*pprev))) {
if (work->func != func)
pprev = &work->next;
else if (cmpxchg(pprev, work, work->next) == work)
--
1.9.1
Recently lockless_dereference() was added which can be used in place of
hard-coding smp_read_barrier_depends(). The following PATCH makes the change.
Signed-off-by: Pranith Kumar <[email protected]>
---
include/linux/percpu-refcount.h | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/include/linux/percpu-refcount.h b/include/linux/percpu-refcount.h
index 494ab05..3cf6759 100644
--- a/include/linux/percpu-refcount.h
+++ b/include/linux/percpu-refcount.h
@@ -128,10 +128,8 @@ static inline void percpu_ref_kill(struct percpu_ref *ref)
static inline bool __ref_is_percpu(struct percpu_ref *ref,
unsigned long __percpu **percpu_countp)
{
- unsigned long percpu_ptr = ACCESS_ONCE(ref->percpu_count_ptr);
-
/* paired with smp_store_release() in percpu_ref_reinit() */
- smp_read_barrier_depends();
+ unsigned long percpu_ptr = lockless_dereference(ref->percpu_count_ptr);
if (unlikely(percpu_ptr & __PERCPU_REF_ATOMIC))
return false;
--
1.9.1
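As the comment in __ref_is_percpu() notes, the read side pairs with a
release store. A generic sketch of that pairing (hypothetical gp/p
names, not the exact percpu-refcount code):

	/* writer */
	init_payload(p);		/* initialize the data first */
	smp_store_release(&gp, p);	/* then publish the pointer */

	/* reader */
	p = lockless_dereference(gp);	/* pointer load + dependency barrier */
	val = p->payload;		/* dependent load sees initialized data */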
Recently lockless_dereference() was added which can be used in place of
hard-coding smp_read_barrier_depends(). The following PATCH makes the change.
Signed-off-by: Pranith Kumar <[email protected]>
---
kernel/events/core.c | 3 +--
kernel/events/uprobes.c | 8 ++++----
2 files changed, 5 insertions(+), 6 deletions(-)
diff --git a/kernel/events/core.c b/kernel/events/core.c
index d59fdc0..9dd5920 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -3400,14 +3400,13 @@ static void perf_remove_from_owner(struct perf_event *event)
struct task_struct *owner;
rcu_read_lock();
- owner = ACCESS_ONCE(event->owner);
/*
* Matches the smp_wmb() in perf_event_exit_task(). If we observe
* !owner it means the list deletion is complete and we can indeed
* free this event, otherwise we need to serialize on
* owner->perf_event_mutex.
*/
- smp_read_barrier_depends();
+ owner = lockless_dereference(event->owner);
if (owner) {
/*
* Since delayed_put_task_struct() also drops the last
diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index 6158a64b..c070949 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -1212,8 +1212,8 @@ static struct xol_area *get_xol_area(void)
if (!mm->uprobes_state.xol_area)
__create_xol_area(0);
- area = mm->uprobes_state.xol_area;
- smp_read_barrier_depends(); /* pairs with wmb in xol_add_vma() */
+ /* pairs with wmb in xol_add_vma() */
+ area = lockless_dereference(mm->uprobes_state.xol_area);
return area;
}
@@ -1507,8 +1507,8 @@ static unsigned long get_trampoline_vaddr(void)
struct xol_area *area;
unsigned long trampoline_vaddr = -1;
- area = current->mm->uprobes_state.xol_area;
- smp_read_barrier_depends();
+ area = lockless_dereference(current->mm->uprobes_state.xol_area);
+
if (area)
trampoline_vaddr = area->vaddr;
--
1.9.1
On Fri, 2014-11-21 at 10:06 -0500, Pranith Kumar wrote:
> Recently lockless_dereference() was added which can be used in place of
> hard-coding smp_read_barrier_depends(). The following PATCH makes the change.
>
> Signed-off-by: Pranith Kumar <[email protected]>
> ---
> net/ipv4/netfilter/arp_tables.c | 3 +--
> net/ipv4/netfilter/ip_tables.c | 3 +--
> net/ipv6/netfilter/ip6_tables.c | 3 +--
> 3 files changed, 3 insertions(+), 6 deletions(-)
>
> diff --git a/net/ipv4/netfilter/arp_tables.c b/net/ipv4/netfilter/arp_tables.c
> index f95b6f9..fc7533d 100644
> --- a/net/ipv4/netfilter/arp_tables.c
> +++ b/net/ipv4/netfilter/arp_tables.c
> @@ -270,12 +270,11 @@ unsigned int arpt_do_table(struct sk_buff *skb,
>
> local_bh_disable();
> addend = xt_write_recseq_begin();
> - private = table->private;
> /*
> * Ensure we load private-> members after we've fetched the base
> * pointer.
> */
> - smp_read_barrier_depends();
> + private = lockless_dereference(table->private);
> table_base = private->entries[smp_processor_id()];
>
Please carefully read the code, before and after your change, then
you'll see this change broke the code.
Problem is that a bug like that can be really hard to diagnose and fix
later, so really you have to be very careful doing these mechanical
changes.
IMO, the current code+comment is better than with this
lockless_dereference(), which in this particular case obfuscates the
code more than anything.
In this case we do have a lock (sort of), so lockless_dereference() is
quite misleading.
On Fri, Nov 21, 2014 at 7:06 AM, Pranith Kumar <[email protected]> wrote:
> Recently lockless_dereference() was added which can be used in place of
> hard-coding smp_read_barrier_depends(). The following PATCH makes the change.
>
> Signed-off-by: Pranith Kumar <[email protected]>
> ---
> kernel/seccomp.c | 7 +++----
> 1 file changed, 3 insertions(+), 4 deletions(-)
Thanks!
Acked-by: Kees Cook <[email protected]>
Do you need me to carry this patch in the seccomp tree, or will
someone else be taking your entire series?
-Kees
>
> diff --git a/kernel/seccomp.c b/kernel/seccomp.c
> index 4ef9687..3729b06 100644
> --- a/kernel/seccomp.c
> +++ b/kernel/seccomp.c
> @@ -175,17 +175,16 @@ static int seccomp_check_filter(struct sock_filter *filter, unsigned int flen)
> */
> static u32 seccomp_run_filters(struct seccomp_data *sd)
> {
> - struct seccomp_filter *f = ACCESS_ONCE(current->seccomp.filter);
> struct seccomp_data sd_local;
> u32 ret = SECCOMP_RET_ALLOW;
> + /* Make sure cross-thread synced filter points somewhere sane. */
> + struct seccomp_filter *f =
> + lockless_dereference(current->seccomp.filter);
>
> /* Ensure unexpected behavior doesn't result in failing open. */
> if (unlikely(WARN_ON(f == NULL)))
> return SECCOMP_RET_KILL;
>
> - /* Make sure cross-thread synced filter points somewhere sane. */
> - smp_read_barrier_depends();
> -
> if (!sd) {
> populate_seccomp_data(&sd_local);
> sd = &sd_local;
> --
> 1.9.1
>
--
Kees Cook
Chrome OS Security
On 11/21/2014 11:33 AM, Kees Cook wrote:
> On Fri, Nov 21, 2014 at 7:06 AM, Pranith Kumar <[email protected]> wrote:
>> Recently lockless_dereference() was added which can be used in place of
>> hard-coding smp_read_barrier_depends(). The following PATCH makes the change.
>>
>> Signed-off-by: Pranith Kumar <[email protected]>
>> ---
>> kernel/seccomp.c | 7 +++----
>> 1 file changed, 3 insertions(+), 4 deletions(-)
> Thanks!
>
> Acked-by: Kees Cook <[email protected]>
>
> Do you need me to carry this patch in the seccomp tree, or will
> someone else be taking your entire series?
>
> -Kees
Please take this patch individually into your tree. There are ongoing discussions about the other patches; I will drop the accepted ones and iterate for the next version.
Thanks!
>> diff --git a/kernel/seccomp.c b/kernel/seccomp.c
>> index 4ef9687..3729b06 100644
>> --- a/kernel/seccomp.c
>> +++ b/kernel/seccomp.c
>> @@ -175,17 +175,16 @@ static int seccomp_check_filter(struct sock_filter *filter, unsigned int flen)
>> */
>> static u32 seccomp_run_filters(struct seccomp_data *sd)
>> {
>> - struct seccomp_filter *f = ACCESS_ONCE(current->seccomp.filter);
>> struct seccomp_data sd_local;
>> u32 ret = SECCOMP_RET_ALLOW;
>> + /* Make sure cross-thread synced filter points somewhere sane. */
>> + struct seccomp_filter *f =
>> + lockless_dereference(current->seccomp.filter);
>>
>> /* Ensure unexpected behavior doesn't result in failing open. */
>> if (unlikely(WARN_ON(f == NULL)))
>> return SECCOMP_RET_KILL;
>>
>> - /* Make sure cross-thread synced filter points somewhere sane. */
>> - smp_read_barrier_depends();
>> -
>> if (!sd) {
>> populate_seccomp_data(&sd_local);
>> sd = &sd_local;
>> --
>> 1.9.1
>>
>
>
On Fri, Nov 21, 2014 at 11:12 AM, Eric Dumazet <[email protected]> wrote:
> On Fri, 2014-11-21 at 10:06 -0500, Pranith Kumar wrote:
>> Recently lockless_dereference() was added which can be used in place of
>> hard-coding smp_read_barrier_depends(). The following PATCH makes the change.
>>
>> Signed-off-by: Pranith Kumar <[email protected]>
>> ---
>> net/ipv4/netfilter/arp_tables.c | 3 +--
>> net/ipv4/netfilter/ip_tables.c | 3 +--
>> net/ipv6/netfilter/ip6_tables.c | 3 +--
>> 3 files changed, 3 insertions(+), 6 deletions(-)
>>
>> diff --git a/net/ipv4/netfilter/arp_tables.c b/net/ipv4/netfilter/arp_tables.c
>> index f95b6f9..fc7533d 100644
>> --- a/net/ipv4/netfilter/arp_tables.c
>> +++ b/net/ipv4/netfilter/arp_tables.c
>> @@ -270,12 +270,11 @@ unsigned int arpt_do_table(struct sk_buff *skb,
>>
>> local_bh_disable();
>> addend = xt_write_recseq_begin();
>> - private = table->private;
>> /*
>> * Ensure we load private-> members after we've fetched the base
>> * pointer.
>> */
>> - smp_read_barrier_depends();
>> + private = lockless_dereference(table->private);
>> table_base = private->entries[smp_processor_id()];
>>
>
>
> Please carefully read the code, before and after your change, then
> you'll see this change broke the code.
>
> Problem is that a bug like that can be really hard to diagnose and fix
> later, so really you have to be very careful doing these mechanical
> changes.
>
> IMO, the current code+comment is better than with this
> lockless_dereference(), which in this particular case obfuscates the
> code more than anything.
>
> In this case we do have a lock (sort of), so lockless_dereference() is
> quite misleading.
>
Hi Eric,
Thanks for looking at this patch.
I've been scratching my head since morning trying to find out what was
so obviously wrong with this patch. Alas, I don't see what you do.
Could you point it out and show me how incompetent I am, please?
Thanks!
--
Pranith
On Fri, 2014-11-21 at 16:57 -0500, Pranith Kumar wrote:
> Hi Eric,
>
> Thanks for looking at this patch.
>
> I've been scratching my head since morning trying to find out what was
> so obviously wrong with this patch. Alas, I don't see what you do.
>
> Could you point it out and show me how incompetent I am, please?
>
> Thanks!
Well, even if the code is _not_ broken, I don't see any value in this
patch.
If I use git blame on the current code, the line containing
smp_read_barrier_depends() points exactly to the relevant commit [1].
After your change, it will point to some cleanup, which makes little
sense to me, considering you did not change the smp_wmb() in
xt_replace_table().
I, as a netfilter contributor, would like to keep the current code as is,
because that is how I feel safe with it.
We have a proliferation of interfaces, but this does not help with
understanding the issues or with code maintenance.
smp_read_barrier_depends() better documents the read barrier than
lockless_dereference().
The point of having a lock or not is irrelevant here.
[1]
http://git.kernel.org/cgit/linux/kernel/git/davem/net.git/commit/?id=b416c144f46af1a30ddfa4e4319a8f077381ad63
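For reference, the write side being discussed, as described in this
thread (a sketch only; see xt_replace_table() for the real code):

	/* xt_replace_table(), per the discussion above */
	setup(newinfo);			/* fill in entries, jumpstack, etc. */
	smp_wmb();			/* publish the contents before the pointer */
	table->private = newinfo;	/* readers order dependent loads after this */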
Hi,
On 2014-11-21 16:57:00 -0500, Pranith Kumar wrote:
> On Fri, Nov 21, 2014 at 11:12 AM, Eric Dumazet <[email protected]> wrote:
> > On Fri, 2014-11-21 at 10:06 -0500, Pranith Kumar wrote:
> >> Recently lockless_dereference() was added which can be used in place of
> >> hard-coding smp_read_barrier_depends(). The following PATCH makes the change.
> >>
> >> Signed-off-by: Pranith Kumar <[email protected]>
> >> ---
> >> net/ipv4/netfilter/arp_tables.c | 3 +--
> >> net/ipv4/netfilter/ip_tables.c | 3 +--
> >> net/ipv6/netfilter/ip6_tables.c | 3 +--
> >> 3 files changed, 3 insertions(+), 6 deletions(-)
> >>
> >> diff --git a/net/ipv4/netfilter/arp_tables.c b/net/ipv4/netfilter/arp_tables.c
> >> index f95b6f9..fc7533d 100644
> >> --- a/net/ipv4/netfilter/arp_tables.c
> >> +++ b/net/ipv4/netfilter/arp_tables.c
> >> @@ -270,12 +270,11 @@ unsigned int arpt_do_table(struct sk_buff *skb,
> >>
> >> local_bh_disable();
> >> addend = xt_write_recseq_begin();
> >> - private = table->private;
> >> /*
> >> * Ensure we load private-> members after we've fetched the base
> >> * pointer.
> >> */
> >> - smp_read_barrier_depends();
> >> + private = lockless_dereference(table->private);
> >> table_base = private->entries[smp_processor_id()];
> >>
> >
> >
> > Please carefully read the code, before and after your change, then
> > you'll see this change broke the code.
> >
> > Problem is that a bug like that can be really hard to diagnose and fix
> > later, so really you have to be very careful doing these mechanical
> > changes.
> >
> > IMO, the current code+comment is better than with this
> > lockless_dereference(), which in this particular case obfuscates the
> > code more than anything.
> >
> > In this case we do have a lock (sort of), so lockless_dereference() is
> > quite misleading.
> >
>
> Hi Eric,
>
> Thanks for looking at this patch.
>
> I've been scratching my head since morning trying to find out what was
> so obviously wrong with this patch. Alas, I don't see what you do.
AFAICS, the read_barrier_depends() protected the load from private->entries[x]
earlier, not the load of table->private itself.
Greetings,
Andres Freund
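For the case being debated, here are the two forms side by side
(a sketch using the arp_tables variables):

	/* open-coded (original) */
	private = table->private;		/* load the base pointer */
	smp_read_barrier_depends();		/* order dependent loads below */
	table_base = private->entries[smp_processor_id()];

	/* with the helper (this patch) */
	private = lockless_dereference(table->private);
	table_base = private->entries[smp_processor_id()];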
On Fri, Nov 21, 2014 at 7:05 PM, Eric Dumazet <[email protected]> wrote:
>
> On Fri, 2014-11-21 at 16:57 -0500, Pranith Kumar wrote:
>
>> Hi Eric,
>>
>> Thanks for looking at this patch.
>>
>> I've been scratching my head since morning trying to find out what was
>> so obviously wrong with this patch. Alas, I don't see what you do.
>>
>> Could you point it out and show me how incompetent I am, please?
>>
>> Thanks!
>
> Well, even if the code is _not_ broken, I don't see any value in this
> patch.
Phew. Not being broken itself is a win :)
>
> If I use git blame on the current code, the line containing
> smp_read_barrier_depends() points exactly to the relevant commit [1].
And that is an opinion I will respect. I don't want to muck up the git
history where it is significant.
This effort is to eventually replace the uses of
smp_read_barrier_depends() with either rcu_dereference() or
lockless_dereference(), as documented in memory-barriers.txt.
>
> After your change, it will point to some cleanup, which makes little
> sense to me, considering you did not change the smp_wmb() in
> xt_replace_table().
That does not need to change; it is fine as is. It still pairs
with the smp_read_barrier_depends() in lockless_dereference().
>
> I, as a netfilter contributor, would like to keep the current code as is,
> because that is how I feel safe with it.
>
> We have a proliferation of interfaces, but this does not help with
> understanding the issues or with code maintenance.
>
> smp_read_barrier_depends() better documents the read barrier than
> lockless_dereference().
I think this is a matter of opinion. But in the current effort I've
seen cases where it is not clear what the barrier is actually
guaranteeing. I am glad that the current code is not one of those and
it has reasonable comments.
lockless_dereference(), on the other hand, makes the dependency explicit.
>
> The point of having a lock or not is irrelevant here.
>
> [1]
> http://git.kernel.org/cgit/linux/kernel/git/davem/net.git/commit/?id=b416c144f46af1a30ddfa4e4319a8f077381ad63
>
>
>
>
Thanks!
--
Pranith
On Fri, Nov 21, 2014 at 10:05:59AM -0500, Pranith Kumar wrote:
> Recently lockless_dereference() was added which can be used in place of
> hard-coding smp_read_barrier_depends(). The following PATCH makes the change.
>
> Signed-off-by: Pranith Kumar <[email protected]>
Applied to percpu/for-3.19.
Thanks.
--
tejun