2014-11-13 19:25:03

by Pranith Kumar

[permalink] [raw]
Subject: [RFC PATCH 00/16] Replace smp_read_barrier_depends() with lockless_derefrence()

Recently lockless_dereference() was added which can be used in place of
hard-coding smp_read_barrier_depends().

http://lkml.iu.edu/hypermail/linux/kernel/1410.3/04561.html

The following series tries to do this.

There are still some hard-coded locations which I was not sure how to replace
with. I will send in separate patches/questions regarding them.

Pranith Kumar (16):
crypto: caam - Remove unnecessary smp_read_barrier_depends()
doc: memory-barriers.txt: Document use of lockless_dereference()
drivers: dma: Replace smp_read_barrier_depends() with
lockless_dereference()
dcache: Replace smp_read_barrier_depends() with lockless_dereference()
overlayfs: Replace smp_read_barrier_depends() with
lockless_dereference()
assoc_array: Replace smp_read_barrier_depends() with
lockless_dereference()
hyperv: Replace smp_read_barrier_depends() with lockless_dereference()
rcupdate: Replace smp_read_barrier_depends() with
lockless_dereference()
percpu: Replace smp_read_barrier_depends() with lockless_dereference()
perf: Replace smp_read_barrier_depends() with lockless_dereference()
seccomp: Replace smp_read_barrier_depends() with
lockless_dereference()
task_work: Replace smp_read_barrier_depends() with
lockless_dereference()
ksm: Replace smp_read_barrier_depends() with lockless_dereference()
slab: Replace smp_read_barrier_depends() with lockless_dereference()
netfilter: Replace smp_read_barrier_depends() with
lockless_dereference()
rxrpc: Replace smp_read_barrier_depends() with lockless_dereference()

Documentation/memory-barriers.txt | 2 +-
drivers/crypto/caam/jr.c | 3 ---
drivers/dma/ioat/dma_v2.c | 3 +--
drivers/dma/ioat/dma_v3.c | 3 +--
fs/dcache.c | 7 ++-----
fs/overlayfs/super.c | 4 +---
include/linux/assoc_array_priv.h | 11 +++++++----
include/linux/hyperv.h | 9 ++++-----
include/linux/percpu-refcount.h | 4 +---
include/linux/rcupdate.h | 10 +++++-----
kernel/events/core.c | 3 +--
kernel/events/uprobes.c | 8 ++++----
kernel/seccomp.c | 7 +++----
kernel/task_work.c | 3 +--
lib/assoc_array.c | 7 -------
mm/ksm.c | 7 +++----
mm/slab.h | 6 +++---
net/ipv4/netfilter/arp_tables.c | 3 +--
net/ipv4/netfilter/ip_tables.c | 3 +--
net/ipv6/netfilter/ip6_tables.c | 3 +--
net/rxrpc/ar-ack.c | 22 +++++++++-------------
security/keys/keyring.c | 6 ------
22 files changed, 50 insertions(+), 84 deletions(-)

--
1.9.1


2014-11-13 19:25:20

by Pranith Kumar

[permalink] [raw]
Subject: [PATCH 04/16] dcache: Replace smp_read_barrier_depends() with lockless_dereference()

Recently lockless_dereference() was added which can be used in place of
hard-coding smp_read_barrier_depends(). The following PATCH makes the change.

Signed-off-by: Pranith Kumar <[email protected]>
---
fs/dcache.c | 7 ++-----
1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/fs/dcache.c b/fs/dcache.c
index a6c5d7e..27b8b5b 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -230,8 +230,7 @@ static inline int dentry_cmp(const struct dentry *dentry, const unsigned char *c
* early because the data cannot match (there can
* be no NUL in the ct/tcount data)
*/
- cs = ACCESS_ONCE(dentry->d_name.name);
- smp_read_barrier_depends();
+ cs = lockless_dereference(dentry->d_name.name);
return dentry_string_cmp(cs, ct, tcount);
}

@@ -2714,10 +2713,8 @@ static int prepend(char **buffer, int *buflen, const char *str, int namelen)
static int prepend_name(char **buffer, int *buflen, struct qstr *name)
{
const char *dname = ACCESS_ONCE(name->name);
- u32 dlen = ACCESS_ONCE(name->len);
char *p;
-
- smp_read_barrier_depends();
+ u32 dlen = lockless_dereference(name->len);

*buflen -= dlen + 1;
if (*buflen < 0)
--
1.9.1

2014-11-13 19:25:30

by Pranith Kumar

[permalink] [raw]
Subject: [PATCH 10/16] perf: Replace smp_read_barrier_depends() with lockless_dereference()

Recently lockless_dereference() was added which can be used in place of
hard-coding smp_read_barrier_depends(). The following PATCH makes the change.

Signed-off-by: Pranith Kumar <[email protected]>
---
kernel/events/core.c | 3 +--
kernel/events/uprobes.c | 8 ++++----
2 files changed, 5 insertions(+), 6 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index d59fdc0..9dd5920 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -3400,14 +3400,13 @@ static void perf_remove_from_owner(struct perf_event *event)
struct task_struct *owner;

rcu_read_lock();
- owner = ACCESS_ONCE(event->owner);
/*
* Matches the smp_wmb() in perf_event_exit_task(). If we observe
* !owner it means the list deletion is complete and we can indeed
* free this event, otherwise we need to serialize on
* owner->perf_event_mutex.
*/
- smp_read_barrier_depends();
+ owner = lockless_dereference(event->owner);
if (owner) {
/*
* Since delayed_put_task_struct() also drops the last
diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index 6158a64b..c070949 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -1212,8 +1212,8 @@ static struct xol_area *get_xol_area(void)
if (!mm->uprobes_state.xol_area)
__create_xol_area(0);

- area = mm->uprobes_state.xol_area;
- smp_read_barrier_depends(); /* pairs with wmb in xol_add_vma() */
+ /* pairs with wmb in xol_add_vma() */
+ area = lockless_dereference(mm->uprobes_state.xol_area);
return area;
}

@@ -1507,8 +1507,8 @@ static unsigned long get_trampoline_vaddr(void)
struct xol_area *area;
unsigned long trampoline_vaddr = -1;

- area = current->mm->uprobes_state.xol_area;
- smp_read_barrier_depends();
+ area = lockless_dereference(current->mm->uprobes_state.xol_area);
+
if (area)
trampoline_vaddr = area->vaddr;

--
1.9.1

2014-11-13 19:25:38

by Pranith Kumar

[permalink] [raw]
Subject: [PATCH 14/16] slab: Replace smp_read_barrier_depends() with lockless_dereference()

Recently lockless_dereference() was added which can be used in place of
hard-coding smp_read_barrier_depends(). The following PATCH makes the change.

Signed-off-by: Pranith Kumar <[email protected]>
---
mm/slab.h | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/mm/slab.h b/mm/slab.h
index 3347fd7..1cf40054 100644
--- a/mm/slab.h
+++ b/mm/slab.h
@@ -209,15 +209,15 @@ cache_from_memcg_idx(struct kmem_cache *s, int idx)

rcu_read_lock();
params = rcu_dereference(s->memcg_params);
- cachep = params->memcg_caches[idx];
- rcu_read_unlock();

/*
* Make sure we will access the up-to-date value. The code updating
* memcg_caches issues a write barrier to match this (see
* memcg_register_cache()).
*/
- smp_read_barrier_depends();
+ cachep = lockless_dereference(params->memcg_caches[idx]);
+ rcu_read_unlock();
+
return cachep;
}

--
1.9.1

2014-11-13 19:25:52

by Pranith Kumar

[permalink] [raw]
Subject: [PATCH 15/16] netfilter: Replace smp_read_barrier_depends() with lockless_dereference()

Recently lockless_dereference() was added which can be used in place of
hard-coding smp_read_barrier_depends(). The following PATCH makes the change.

Signed-off-by: Pranith Kumar <[email protected]>
---
net/ipv4/netfilter/arp_tables.c | 3 +--
net/ipv4/netfilter/ip_tables.c | 3 +--
net/ipv6/netfilter/ip6_tables.c | 3 +--
3 files changed, 3 insertions(+), 6 deletions(-)

diff --git a/net/ipv4/netfilter/arp_tables.c b/net/ipv4/netfilter/arp_tables.c
index f95b6f9..fc7533d 100644
--- a/net/ipv4/netfilter/arp_tables.c
+++ b/net/ipv4/netfilter/arp_tables.c
@@ -270,12 +270,11 @@ unsigned int arpt_do_table(struct sk_buff *skb,

local_bh_disable();
addend = xt_write_recseq_begin();
- private = table->private;
/*
* Ensure we load private-> members after we've fetched the base
* pointer.
*/
- smp_read_barrier_depends();
+ private = lockless_dereference(table->private);
table_base = private->entries[smp_processor_id()];

e = get_entry(table_base, private->hook_entry[hook]);
diff --git a/net/ipv4/netfilter/ip_tables.c b/net/ipv4/netfilter/ip_tables.c
index 99e810f..e0fd044 100644
--- a/net/ipv4/netfilter/ip_tables.c
+++ b/net/ipv4/netfilter/ip_tables.c
@@ -325,13 +325,12 @@ ipt_do_table(struct sk_buff *skb,
IP_NF_ASSERT(table->valid_hooks & (1 << hook));
local_bh_disable();
addend = xt_write_recseq_begin();
- private = table->private;
cpu = smp_processor_id();
/*
* Ensure we load private-> members after we've fetched the base
* pointer.
*/
- smp_read_barrier_depends();
+ private = lockless_dereference(table->private);
table_base = private->entries[cpu];
jumpstack = (struct ipt_entry **)private->jumpstack[cpu];
stackptr = per_cpu_ptr(private->stackptr, cpu);
diff --git a/net/ipv6/netfilter/ip6_tables.c b/net/ipv6/netfilter/ip6_tables.c
index e080fbb..0459d6a 100644
--- a/net/ipv6/netfilter/ip6_tables.c
+++ b/net/ipv6/netfilter/ip6_tables.c
@@ -348,12 +348,11 @@ ip6t_do_table(struct sk_buff *skb,

local_bh_disable();
addend = xt_write_recseq_begin();
- private = table->private;
/*
* Ensure we load private-> members after we've fetched the base
* pointer.
*/
- smp_read_barrier_depends();
+ private = lockless_dereference(table->private);
cpu = smp_processor_id();
table_base = private->entries[cpu];
jumpstack = (struct ip6t_entry **)private->jumpstack[cpu];
--
1.9.1

2014-11-13 19:26:09

by Pranith Kumar

[permalink] [raw]
Subject: [PATCH 16/16] rxrpc: Replace smp_read_barrier_depends() with lockless_dereference()

Recently lockless_dereference() was added which can be used in place of
hard-coding smp_read_barrier_depends(). The following PATCH makes the change.

Signed-off-by: Pranith Kumar <[email protected]>
---
net/rxrpc/ar-ack.c | 22 +++++++++-------------
1 file changed, 9 insertions(+), 13 deletions(-)

diff --git a/net/rxrpc/ar-ack.c b/net/rxrpc/ar-ack.c
index c6be17a..9237448 100644
--- a/net/rxrpc/ar-ack.c
+++ b/net/rxrpc/ar-ack.c
@@ -234,8 +234,7 @@ static void rxrpc_resend(struct rxrpc_call *call)
loop != call->acks_head || stop;
loop = (loop + 1) & (call->acks_winsz - 1)
) {
- p_txb = call->acks_window + loop;
- smp_read_barrier_depends();
+ p_txb = lockless_dereference(call)->acks_window + loop;
if (*p_txb & 1)
continue;

@@ -303,8 +302,7 @@ static void rxrpc_resend_timer(struct rxrpc_call *call)
loop != call->acks_head;
loop = (loop + 1) & (call->acks_winsz - 1)
) {
- p_txb = call->acks_window + loop;
- smp_read_barrier_depends();
+ p_txb = lockless_dereference(call)->acks_window + loop;
txb = (struct sk_buff *) (*p_txb & ~1);
sp = rxrpc_skb(txb);

@@ -354,9 +352,10 @@ static int rxrpc_process_soft_ACKs(struct rxrpc_call *call,
resend = 0;
resend_at = 0;
for (loop = 0; loop < ack->nAcks; loop++) {
- p_txb = call->acks_window;
- p_txb += (call->acks_tail + loop) & (call->acks_winsz - 1);
- smp_read_barrier_depends();
+ struct rxrpc_call *callp = lockless_dereference(call);
+
+ p_txb = callp->acks_window;
+ p_txb += (callp->acks_tail + loop) & (callp->acks_winsz - 1);
txb = (struct sk_buff *) (*p_txb & ~1);
sp = rxrpc_skb(txb);

@@ -385,8 +384,7 @@ static int rxrpc_process_soft_ACKs(struct rxrpc_call *call,
loop != call->acks_head;
loop = (loop + 1) & (call->acks_winsz - 1)
) {
- p_txb = call->acks_window + loop;
- smp_read_barrier_depends();
+ p_txb = lockless_dereference(call)->acks_window + loop;
txb = (struct sk_buff *) (*p_txb & ~1);
sp = rxrpc_skb(txb);

@@ -432,8 +430,7 @@ static void rxrpc_rotate_tx_window(struct rxrpc_call *call, u32 hard)
ASSERTCMP(hard - call->acks_hard, <=, win);

while (call->acks_hard < hard) {
- smp_read_barrier_depends();
- _skb = call->acks_window[tail] & ~1;
+ _skb = lockless_dereference(call)->acks_window[tail] & ~1;
rxrpc_free_skb((struct sk_buff *) _skb);
old_tail = tail;
tail = (tail + 1) & (call->acks_winsz - 1);
@@ -577,8 +574,7 @@ static void rxrpc_zap_tx_window(struct rxrpc_call *call)
call->acks_window = NULL;

while (CIRC_CNT(call->acks_head, call->acks_tail, winsz) > 0) {
- tail = call->acks_tail;
- smp_read_barrier_depends();
+ tail = lockless_dereference(call)->acks_tail;
_skb = acks_window[tail] & ~1;
smp_mb();
call->acks_tail = (call->acks_tail + 1) & (winsz - 1);
--
1.9.1

2014-11-13 19:25:36

by Pranith Kumar

[permalink] [raw]
Subject: [PATCH 13/16] ksm: Replace smp_read_barrier_depends() with lockless_dereference()

Recently lockless_dereference() was added which can be used in place of
hard-coding smp_read_barrier_depends(). The following PATCH makes the change.

Signed-off-by: Pranith Kumar <[email protected]>
---
mm/ksm.c | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/mm/ksm.c b/mm/ksm.c
index d247efa..a67de79 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -542,15 +542,14 @@ static struct page *get_ksm_page(struct stable_node *stable_node, bool lock_it)
expected_mapping = (void *)stable_node +
(PAGE_MAPPING_ANON | PAGE_MAPPING_KSM);
again:
- kpfn = ACCESS_ONCE(stable_node->kpfn);
- page = pfn_to_page(kpfn);
-
/*
* page is computed from kpfn, so on most architectures reading
* page->mapping is naturally ordered after reading node->kpfn,
* but on Alpha we need to be more careful.
*/
- smp_read_barrier_depends();
+ kpfn = lockless_dereference(stable_node->kpfn);
+ page = pfn_to_page(kpfn);
+
if (ACCESS_ONCE(page->mapping) != expected_mapping)
goto stale;

--
1.9.1

2014-11-13 19:27:13

by Pranith Kumar

[permalink] [raw]
Subject: [PATCH 11/16] seccomp: Replace smp_read_barrier_depends() with lockless_dereference()

Recently lockless_dereference() was added which can be used in place of
hard-coding smp_read_barrier_depends(). The following PATCH makes the change.

Signed-off-by: Pranith Kumar <[email protected]>
---
kernel/seccomp.c | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/kernel/seccomp.c b/kernel/seccomp.c
index 4ef9687..3729b06 100644
--- a/kernel/seccomp.c
+++ b/kernel/seccomp.c
@@ -175,17 +175,16 @@ static int seccomp_check_filter(struct sock_filter *filter, unsigned int flen)
*/
static u32 seccomp_run_filters(struct seccomp_data *sd)
{
- struct seccomp_filter *f = ACCESS_ONCE(current->seccomp.filter);
struct seccomp_data sd_local;
u32 ret = SECCOMP_RET_ALLOW;
+ /* Make sure cross-thread synced filter points somewhere sane. */
+ struct seccomp_filter *f =
+ lockless_dereference(current->seccomp.filter);

/* Ensure unexpected behavior doesn't result in failing open. */
if (unlikely(WARN_ON(f == NULL)))
return SECCOMP_RET_KILL;

- /* Make sure cross-thread synced filter points somewhere sane. */
- smp_read_barrier_depends();
-
if (!sd) {
populate_seccomp_data(&sd_local);
sd = &sd_local;
--
1.9.1

2014-11-13 19:27:11

by Pranith Kumar

[permalink] [raw]
Subject: [PATCH 12/16] task_work: Replace smp_read_barrier_depends() with lockless_dereference()

Recently lockless_dereference() was added which can be used in place of
hard-coding smp_read_barrier_depends(). The following PATCH makes the change.

Signed-off-by: Pranith Kumar <[email protected]>
---
kernel/task_work.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/kernel/task_work.c b/kernel/task_work.c
index 8727032..b5599ce 100644
--- a/kernel/task_work.c
+++ b/kernel/task_work.c
@@ -62,8 +62,7 @@ task_work_cancel(struct task_struct *task, task_work_func_t func)
* we raced with task_work_run(), *pprev == NULL/exited.
*/
raw_spin_lock_irqsave(&task->pi_lock, flags);
- while ((work = ACCESS_ONCE(*pprev))) {
- smp_read_barrier_depends();
+ while ((work = lockless_dereference(*pprev))) {
if (work->func != func)
pprev = &work->next;
else if (cmpxchg(pprev, work, work->next) == work)
--
1.9.1

2014-11-13 19:25:27

by Pranith Kumar

[permalink] [raw]
Subject: [PATCH 07/16] hyperv: Replace smp_read_barrier_depends() with lockless_dereference()

Recently lockless_dereference() was added which can be used in place of
hard-coding smp_read_barrier_depends(). The following PATCH makes the change.

Signed-off-by: Pranith Kumar <[email protected]>
---
include/linux/hyperv.h | 9 ++++-----
1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h
index 08cfaff..06418b1 100644
--- a/include/linux/hyperv.h
+++ b/include/linux/hyperv.h
@@ -127,13 +127,12 @@ hv_get_ringbuffer_availbytes(struct hv_ring_buffer_info *rbi,
u32 *read, u32 *write)
{
u32 read_loc, write_loc, dsize;
-
- smp_read_barrier_depends();
+ struct hv_ring_buffer_info *rbi_p = lockless_dereference(rbi);

/* Capture the read/write indices before they changed */
- read_loc = rbi->ring_buffer->read_index;
- write_loc = rbi->ring_buffer->write_index;
- dsize = rbi->ring_datasize;
+ read_loc = rbi_p->ring_buffer->read_index;
+ write_loc = rbi_p->ring_buffer->write_index;
+ dsize = rbi_p->ring_datasize;

*write = write_loc >= read_loc ? dsize - (write_loc - read_loc) :
read_loc - write_loc;
--
1.9.1

2014-11-13 19:25:24

by Pranith Kumar

[permalink] [raw]
Subject: [PATCH 06/16] assoc_array: Replace smp_read_barrier_depends() with lockless_dereference()

Recently lockless_dereference() was added which can be used in place of
hard-coding smp_read_barrier_depends(). The following PATCH makes the change.

I replaced the inline functions dereferencing pointer 'x' to use
lockless_dereference() because of which we do not need to litter the code with
smp_read_barrier_depends().

Signed-off-by: Pranith Kumar <[email protected]>
---
include/linux/assoc_array_priv.h | 11 +++++++----
lib/assoc_array.c | 7 -------
security/keys/keyring.c | 6 ------
3 files changed, 7 insertions(+), 17 deletions(-)

diff --git a/include/linux/assoc_array_priv.h b/include/linux/assoc_array_priv.h
index 711275e..96449c3 100644
--- a/include/linux/assoc_array_priv.h
+++ b/include/linux/assoc_array_priv.h
@@ -118,7 +118,8 @@ struct assoc_array_edit {

static inline bool assoc_array_ptr_is_meta(const struct assoc_array_ptr *x)
{
- return (unsigned long)x & ASSOC_ARRAY_PTR_TYPE_MASK;
+ return (unsigned long)lockless_dereference(x) &
+ ASSOC_ARRAY_PTR_TYPE_MASK;
}
static inline bool assoc_array_ptr_is_leaf(const struct assoc_array_ptr *x)
{
@@ -126,7 +127,8 @@ static inline bool assoc_array_ptr_is_leaf(const struct assoc_array_ptr *x)
}
static inline bool assoc_array_ptr_is_shortcut(const struct assoc_array_ptr *x)
{
- return (unsigned long)x & ASSOC_ARRAY_PTR_SUBTYPE_MASK;
+ return (unsigned long)lockless_dereference(x) &
+ ASSOC_ARRAY_PTR_SUBTYPE_MASK;
}
static inline bool assoc_array_ptr_is_node(const struct assoc_array_ptr *x)
{
@@ -135,13 +137,14 @@ static inline bool assoc_array_ptr_is_node(const struct assoc_array_ptr *x)

static inline void *assoc_array_ptr_to_leaf(const struct assoc_array_ptr *x)
{
- return (void *)((unsigned long)x & ~ASSOC_ARRAY_PTR_TYPE_MASK);
+ return (void *)((unsigned long)lockless_dereference(x) &
+ ~ASSOC_ARRAY_PTR_TYPE_MASK);
}

static inline
unsigned long __assoc_array_ptr_to_meta(const struct assoc_array_ptr *x)
{
- return (unsigned long)x &
+ return (unsigned long)lockless_dereference(x) &
~(ASSOC_ARRAY_PTR_SUBTYPE_MASK | ASSOC_ARRAY_PTR_TYPE_MASK);
}
static inline
diff --git a/lib/assoc_array.c b/lib/assoc_array.c
index 2404d03..5b62033 100644
--- a/lib/assoc_array.c
+++ b/lib/assoc_array.c
@@ -37,12 +37,10 @@ begin_node:
if (assoc_array_ptr_is_shortcut(cursor)) {
/* Descend through a shortcut */
shortcut = assoc_array_ptr_to_shortcut(cursor);
- smp_read_barrier_depends();
cursor = ACCESS_ONCE(shortcut->next_node);
}

node = assoc_array_ptr_to_node(cursor);
- smp_read_barrier_depends();
slot = 0;

/* We perform two passes of each node.
@@ -85,7 +83,6 @@ begin_node:

continue_node:
node = assoc_array_ptr_to_node(cursor);
- smp_read_barrier_depends();

for (; slot < ASSOC_ARRAY_FAN_OUT; slot++) {
ptr = ACCESS_ONCE(node->slots[slot]);
@@ -104,7 +101,6 @@ finished_node:

if (assoc_array_ptr_is_shortcut(parent)) {
shortcut = assoc_array_ptr_to_shortcut(parent);
- smp_read_barrier_depends();
cursor = parent;
parent = ACCESS_ONCE(shortcut->back_pointer);
slot = shortcut->parent_slot;
@@ -215,7 +211,6 @@ jumped:

consider_node:
node = assoc_array_ptr_to_node(cursor);
- smp_read_barrier_depends();

slot = segments >> (level & ASSOC_ARRAY_KEY_CHUNK_MASK);
slot &= ASSOC_ARRAY_FAN_MASK;
@@ -253,7 +248,6 @@ consider_node:
cursor = ptr;
follow_shortcut:
shortcut = assoc_array_ptr_to_shortcut(cursor);
- smp_read_barrier_depends();
pr_devel("shortcut to %d\n", shortcut->skip_to_level);
sc_level = level + ASSOC_ARRAY_LEVEL_STEP;
BUG_ON(sc_level > shortcut->skip_to_level);
@@ -343,7 +337,6 @@ void *assoc_array_find(const struct assoc_array *array,
* actually going to dereference it.
*/
leaf = assoc_array_ptr_to_leaf(ptr);
- smp_read_barrier_depends();
if (ops->compare_object(leaf, index_key))
return (void *)leaf;
}
diff --git a/security/keys/keyring.c b/security/keys/keyring.c
index 8177010..48d3464 100644
--- a/security/keys/keyring.c
+++ b/security/keys/keyring.c
@@ -683,7 +683,6 @@ descend_to_keyring:
* doesn't contain any keyring pointers.
*/
shortcut = assoc_array_ptr_to_shortcut(ptr);
- smp_read_barrier_depends();
if ((shortcut->index_key[0] & ASSOC_ARRAY_FAN_MASK) != 0)
goto not_this_keyring;

@@ -693,7 +692,6 @@ descend_to_keyring:
}

node = assoc_array_ptr_to_node(ptr);
- smp_read_barrier_depends();

ptr = node->slots[0];
if (!assoc_array_ptr_is_meta(ptr))
@@ -706,7 +704,6 @@ descend_to_node:
kdebug("descend");
if (assoc_array_ptr_is_shortcut(ptr)) {
shortcut = assoc_array_ptr_to_shortcut(ptr);
- smp_read_barrier_depends();
ptr = ACCESS_ONCE(shortcut->next_node);
BUG_ON(!assoc_array_ptr_is_node(ptr));
}
@@ -714,7 +711,6 @@ descend_to_node:

begin_node:
kdebug("begin_node");
- smp_read_barrier_depends();
slot = 0;
ascend_to_node:
/* Go through the slots in a node */
@@ -762,14 +758,12 @@ ascend_to_node:

if (ptr && assoc_array_ptr_is_shortcut(ptr)) {
shortcut = assoc_array_ptr_to_shortcut(ptr);
- smp_read_barrier_depends();
ptr = ACCESS_ONCE(shortcut->back_pointer);
slot = shortcut->parent_slot;
}
if (!ptr)
goto not_this_keyring;
node = assoc_array_ptr_to_node(ptr);
- smp_read_barrier_depends();
slot++;

/* If we've ascended to the root (zero backpointer), we must have just
--
1.9.1

2014-11-13 19:28:03

by Pranith Kumar

[permalink] [raw]
Subject: [PATCH 09/16] percpu: Replace smp_read_barrier_depends() with lockless_dereference()

Recently lockless_dereference() was added which can be used in place of
hard-coding smp_read_barrier_depends(). The following PATCH makes the change.

Signed-off-by: Pranith Kumar <[email protected]>
---
include/linux/percpu-refcount.h | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/include/linux/percpu-refcount.h b/include/linux/percpu-refcount.h
index 494ab05..3cf6759 100644
--- a/include/linux/percpu-refcount.h
+++ b/include/linux/percpu-refcount.h
@@ -128,10 +128,8 @@ static inline void percpu_ref_kill(struct percpu_ref *ref)
static inline bool __ref_is_percpu(struct percpu_ref *ref,
unsigned long __percpu **percpu_countp)
{
- unsigned long percpu_ptr = ACCESS_ONCE(ref->percpu_count_ptr);
-
/* paired with smp_store_release() in percpu_ref_reinit() */
- smp_read_barrier_depends();
+ unsigned long percpu_ptr = lockless_dereference(ref->percpu_count_ptr);

if (unlikely(percpu_ptr & __PERCPU_REF_ATOMIC))
return false;
--
1.9.1

2014-11-13 19:28:24

by Pranith Kumar

[permalink] [raw]
Subject: [PATCH 08/16] rcupdate: Replace smp_read_barrier_depends() with lockless_dereference()

Recently lockless_dereference() was added which can be used in place of
hard-coding smp_read_barrier_depends(). The following PATCH makes the change.

Signed-off-by: Pranith Kumar <[email protected]>
---
include/linux/rcupdate.h | 10 +++++-----
1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index ed4f593..386ba28 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -582,11 +582,11 @@ static inline void rcu_preempt_sleep_check(void)
})
#define __rcu_dereference_check(p, c, space) \
({ \
- typeof(*p) *_________p1 = (typeof(*p) *__force)ACCESS_ONCE(p); \
+ /* Dependency order vs. p above. */ \
+ typeof(*p) *________p1 = (typeof(*p) *__force)lockless_dereference(p); \
rcu_lockdep_assert(c, "suspicious rcu_dereference_check() usage"); \
rcu_dereference_sparse(p, space); \
- smp_read_barrier_depends(); /* Dependency order vs. p above. */ \
- ((typeof(*p) __force __kernel *)(_________p1)); \
+ ((typeof(*p) __force __kernel *)(________p1)); \
})
#define __rcu_dereference_protected(p, c, space) \
({ \
@@ -603,10 +603,10 @@ static inline void rcu_preempt_sleep_check(void)
})
#define __rcu_dereference_index_check(p, c) \
({ \
- typeof(p) _________p1 = ACCESS_ONCE(p); \
+ /* Dependency order vs. p above. */ \
+ typeof(p) _________p1 = lockless_dereference(p); \
rcu_lockdep_assert(c, \
"suspicious rcu_dereference_index_check() usage"); \
- smp_read_barrier_depends(); /* Dependency order vs. p above. */ \
(_________p1); \
})

--
1.9.1

2014-11-13 19:25:16

by Pranith Kumar

[permalink] [raw]
Subject: [PATCH 03/16] drivers: dma: Replace smp_read_barrier_depends() with lockless_dereference()

Recently lockless_dereference() was added which can be used in place of
hard-coding smp_read_barrier_depends(). The following PATCH makes the change.

Signed-off-by: Pranith Kumar <[email protected]>
---
drivers/dma/ioat/dma_v2.c | 3 +--
drivers/dma/ioat/dma_v3.c | 3 +--
2 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/dma/ioat/dma_v2.c b/drivers/dma/ioat/dma_v2.c
index 695483e..0f94d72 100644
--- a/drivers/dma/ioat/dma_v2.c
+++ b/drivers/dma/ioat/dma_v2.c
@@ -142,10 +142,9 @@ static void __cleanup(struct ioat2_dma_chan *ioat, dma_addr_t phys_complete)

active = ioat2_ring_active(ioat);
for (i = 0; i < active && !seen_current; i++) {
- smp_read_barrier_depends();
prefetch(ioat2_get_ring_ent(ioat, idx + i + 1));
desc = ioat2_get_ring_ent(ioat, idx + i);
- tx = &desc->txd;
+ tx = &lockless_dereference(desc)->txd;
dump_desc_dbg(ioat, desc);
if (tx->cookie) {
dma_descriptor_unmap(tx);
diff --git a/drivers/dma/ioat/dma_v3.c b/drivers/dma/ioat/dma_v3.c
index 895f869..cbd0537 100644
--- a/drivers/dma/ioat/dma_v3.c
+++ b/drivers/dma/ioat/dma_v3.c
@@ -389,7 +389,6 @@ static void __cleanup(struct ioat2_dma_chan *ioat, dma_addr_t phys_complete)
for (i = 0; i < active && !seen_current; i++) {
struct dma_async_tx_descriptor *tx;

- smp_read_barrier_depends();
prefetch(ioat2_get_ring_ent(ioat, idx + i + 1));
desc = ioat2_get_ring_ent(ioat, idx + i);
dump_desc_dbg(ioat, desc);
@@ -398,7 +397,7 @@ static void __cleanup(struct ioat2_dma_chan *ioat, dma_addr_t phys_complete)
if (device->cap & IOAT_CAP_DWBES)
desc_get_errstat(ioat, desc);

- tx = &desc->txd;
+ tx = &lockless_dereference(desc)->txd;
if (tx->cookie) {
dma_cookie_complete(tx);
dma_descriptor_unmap(tx);
--
1.9.1

2014-11-13 19:30:24

by Pranith Kumar

[permalink] [raw]
Subject: [PATCH 05/16] overlayfs: Replace smp_read_barrier_depends() with lockless_dereference()

Recently lockless_dereference() was added which can be used in place of
hard-coding smp_read_barrier_depends(). The following PATCH makes the change.

Signed-off-by: Pranith Kumar <[email protected]>
---
fs/overlayfs/super.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index 08b704c..b0f050e 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -84,12 +84,10 @@ enum ovl_path_type ovl_path_type(struct dentry *dentry)

static struct dentry *ovl_upperdentry_dereference(struct ovl_entry *oe)
{
- struct dentry *upperdentry = ACCESS_ONCE(oe->__upperdentry);
/*
* Make sure to order reads to upperdentry wrt ovl_dentry_update()
*/
- smp_read_barrier_depends();
- return upperdentry;
+ return lockless_dereference(oe->__upperdentry);
}

void ovl_path_upper(struct dentry *dentry, struct path *path)
--
1.9.1

2014-11-13 19:30:57

by Pranith Kumar

[permalink] [raw]
Subject: [PATCH 02/16] doc: memory-barriers.txt: Document use of lockless_dereference()

Recently lockless_dereference() was added which can be used in place of
hard-coding smp_read_barrier_depends(). The following PATCH makes the change.

Signed-off-by: Pranith Kumar <[email protected]>
---
Documentation/memory-barriers.txt | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
index 3d5f49b..841ac36 100644
--- a/Documentation/memory-barriers.txt
+++ b/Documentation/memory-barriers.txt
@@ -203,7 +203,7 @@ There are some minimal guarantees that may be expected of a CPU:
and always in that order. On most systems, smp_read_barrier_depends()
does nothing, but it is required for DEC Alpha. The ACCESS_ONCE()
is required to prevent compiler mischief. Please note that you
- should normally use something like rcu_dereference() instead of
+ should normally use something like lockless_dereference() instead of
open-coding smp_read_barrier_depends().

(*) Overlapping loads and stores within a particular CPU will appear to be
--
1.9.1

2014-11-13 19:31:27

by Pranith Kumar

[permalink] [raw]
Subject: [PATCH 01/16] crypto: caam - Remove unnecessary smp_read_barrier_depends()

Recently lockless_dereference() was added which can be used in place of
hard-coding smp_read_barrier_depends(). The following PATCH makes the change.

Signed-off-by: Pranith Kumar <[email protected]>
---
drivers/crypto/caam/jr.c | 3 ---
1 file changed, 3 deletions(-)

diff --git a/drivers/crypto/caam/jr.c b/drivers/crypto/caam/jr.c
index bae20d8..9b3ef1bc 100644
--- a/drivers/crypto/caam/jr.c
+++ b/drivers/crypto/caam/jr.c
@@ -181,8 +181,6 @@ static void caam_jr_dequeue(unsigned long devarg)
for (i = 0; CIRC_CNT(head, tail + i, JOBR_DEPTH) >= 1; i++) {
sw_idx = (tail + i) & (JOBR_DEPTH - 1);

- smp_read_barrier_depends();
-
if (jrp->outring[hw_idx].desc ==
jrp->entinfo[sw_idx].desc_addr_dma)
break; /* found */
@@ -218,7 +216,6 @@ static void caam_jr_dequeue(unsigned long devarg)
if (sw_idx == tail) {
do {
tail = (tail + 1) & (JOBR_DEPTH - 1);
- smp_read_barrier_depends();
} while (CIRC_CNT(head, tail, JOBR_DEPTH) >= 1 &&
jrp->entinfo[tail].desc_addr_dma == 0);

--
1.9.1

2014-11-13 20:07:17

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [RFC PATCH 00/16] Replace smp_read_barrier_depends() with lockless_derefrence()

On Thu, Nov 13, 2014 at 02:24:06PM -0500, Pranith Kumar wrote:
> Recently lockless_dereference() was added which can be used in place of
> hard-coding smp_read_barrier_depends().
>
> http://lkml.iu.edu/hypermail/linux/kernel/1410.3/04561.html
>
> The following series tries to do this.
>
> There are still some hard-coded locations which I was not sure how to replace
> with. I will send in separate patches/questions regarding them.

Thank you for taking this on! Some questions and comments in response
to the individual patches.

Thanx, Paul

> Pranith Kumar (16):
> crypto: caam - Remove unnecessary smp_read_barrier_depends()
> doc: memory-barriers.txt: Document use of lockless_dereference()
> drivers: dma: Replace smp_read_barrier_depends() with
> lockless_dereference()
> dcache: Replace smp_read_barrier_depends() with lockless_dereference()
> overlayfs: Replace smp_read_barrier_depends() with
> lockless_dereference()
> assoc_array: Replace smp_read_barrier_depends() with
> lockless_dereference()
> hyperv: Replace smp_read_barrier_depends() with lockless_dereference()
> rcupdate: Replace smp_read_barrier_depends() with
> lockless_dereference()
> percpu: Replace smp_read_barrier_depends() with lockless_dereference()
> perf: Replace smp_read_barrier_depends() with lockless_dereference()
> seccomp: Replace smp_read_barrier_depends() with
> lockless_dereference()
> task_work: Replace smp_read_barrier_depends() with
> lockless_dereference()
> ksm: Replace smp_read_barrier_depends() with lockless_dereference()
> slab: Replace smp_read_barrier_depends() with lockless_dereference()
> netfilter: Replace smp_read_barrier_depends() with
> lockless_dereference()
> rxrpc: Replace smp_read_barrier_depends() with lockless_dereference()
>
> Documentation/memory-barriers.txt | 2 +-
> drivers/crypto/caam/jr.c | 3 ---
> drivers/dma/ioat/dma_v2.c | 3 +--
> drivers/dma/ioat/dma_v3.c | 3 +--
> fs/dcache.c | 7 ++-----
> fs/overlayfs/super.c | 4 +---
> include/linux/assoc_array_priv.h | 11 +++++++----
> include/linux/hyperv.h | 9 ++++-----
> include/linux/percpu-refcount.h | 4 +---
> include/linux/rcupdate.h | 10 +++++-----
> kernel/events/core.c | 3 +--
> kernel/events/uprobes.c | 8 ++++----
> kernel/seccomp.c | 7 +++----
> kernel/task_work.c | 3 +--
> lib/assoc_array.c | 7 -------
> mm/ksm.c | 7 +++----
> mm/slab.h | 6 +++---
> net/ipv4/netfilter/arp_tables.c | 3 +--
> net/ipv4/netfilter/ip_tables.c | 3 +--
> net/ipv6/netfilter/ip6_tables.c | 3 +--
> net/rxrpc/ar-ack.c | 22 +++++++++-------------
> security/keys/keyring.c | 6 ------
> 22 files changed, 50 insertions(+), 84 deletions(-)
>
> --
> 1.9.1
>

2014-11-13 20:10:09

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [PATCH 01/16] crypto: caam - Remove unnecessary smp_read_barrier_depends()

On Thu, Nov 13, 2014 at 02:24:07PM -0500, Pranith Kumar wrote:
> Recently lockless_dereference() was added which can be used in place of
> hard-coding smp_read_barrier_depends(). The following PATCH makes the change.
>
> Signed-off-by: Pranith Kumar <[email protected]>
> ---
> drivers/crypto/caam/jr.c | 3 ---
> 1 file changed, 3 deletions(-)
>
> diff --git a/drivers/crypto/caam/jr.c b/drivers/crypto/caam/jr.c
> index bae20d8..9b3ef1bc 100644
> --- a/drivers/crypto/caam/jr.c
> +++ b/drivers/crypto/caam/jr.c
> @@ -181,8 +181,6 @@ static void caam_jr_dequeue(unsigned long devarg)
> for (i = 0; CIRC_CNT(head, tail + i, JOBR_DEPTH) >= 1; i++) {
> sw_idx = (tail + i) & (JOBR_DEPTH - 1);
>
> - smp_read_barrier_depends();
> -

Did you mean to add a lockless_dereference() somewhere? I would guess
that one is required on the load into sw_idx before the "for" loop,
but I cannot claim to be familiar with this code.

Thanx, Paul

> if (jrp->outring[hw_idx].desc ==
> jrp->entinfo[sw_idx].desc_addr_dma)
> break; /* found */
> @@ -218,7 +216,6 @@ static void caam_jr_dequeue(unsigned long devarg)
> if (sw_idx == tail) {
> do {
> tail = (tail + 1) & (JOBR_DEPTH - 1);
> - smp_read_barrier_depends();
> } while (CIRC_CNT(head, tail, JOBR_DEPTH) >= 1 &&
> jrp->entinfo[tail].desc_addr_dma == 0);
>
> --
> 1.9.1
>

2014-11-13 20:11:23

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [PATCH 02/16] doc: memory-barriers.txt: Document use of lockless_dereference()

On Thu, Nov 13, 2014 at 02:24:08PM -0500, Pranith Kumar wrote:
> Recently lockless_dereference() was added which can be used in place of
> hard-coding smp_read_barrier_depends(). The following PATCH makes the change.
>
> Signed-off-by: Pranith Kumar <[email protected]>
> ---
> Documentation/memory-barriers.txt | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
> index 3d5f49b..841ac36 100644
> --- a/Documentation/memory-barriers.txt
> +++ b/Documentation/memory-barriers.txt
> @@ -203,7 +203,7 @@ There are some minimal guarantees that may be expected of a CPU:
> and always in that order. On most systems, smp_read_barrier_depends()
> does nothing, but it is required for DEC Alpha. The ACCESS_ONCE()
> is required to prevent compiler mischief. Please note that you
> - should normally use something like rcu_dereference() instead of
> + should normally use something like lockless_dereference() instead of

Good catch, but please keep both possibilities, something like
"... like rcu_dereference() or lockless_dereference() instead of ..."

Thanx, Paul

> open-coding smp_read_barrier_depends().
>
> (*) Overlapping loads and stores within a particular CPU will appear to be
> --
> 1.9.1
>

2014-11-13 20:11:36

by David Howells

[permalink] [raw]
Subject: Re: [PATCH 06/16] assoc_array: Replace smp_read_barrier_depends() with lockless_dereference()

Pranith Kumar <[email protected]> wrote:

> static inline bool assoc_array_ptr_is_meta(const struct assoc_array_ptr *x)
> {
> - return (unsigned long)x & ASSOC_ARRAY_PTR_TYPE_MASK;
> + return (unsigned long)lockless_dereference(x) &
> + ASSOC_ARRAY_PTR_TYPE_MASK;
> }

Nack. x is not being dereferenced here, nor in some of the other places
you've added this.

David

2014-11-13 20:17:52

by David Howells

[permalink] [raw]
Subject: Re: [PATCH 16/16] rxrpc: Replace smp_read_barrier_depends() with lockless_dereference()

Pranith Kumar <[email protected]> wrote:

> loop != call->acks_head || stop;
> loop = (loop + 1) & (call->acks_winsz - 1)
> ) {
> - p_txb = call->acks_window + loop;
> - smp_read_barrier_depends();
> + p_txb = lockless_dereference(call)->acks_window + loop;

Nack. You've stuck an implicit barrier on a dereference that doesn't matter.
And similar for other hunks of this patch.

David

2014-11-13 20:47:39

by David Howells

[permalink] [raw]
Subject: Re: [PATCH 16/16] rxrpc: Replace smp_read_barrier_depends() with lockless_dereference()

Pranith Kumar <[email protected]> wrote:

> Recently lockless_dereference() was added which can be used in place of
> hard-coding smp_read_barrier_depends(). The following PATCH makes the change.

Actually, the use of smp_read_barrier_depends() is wrong in circular
buffering. See Documentation/circular-buffers.txt

David

2014-11-13 21:55:15

by Pranith Kumar

[permalink] [raw]
Subject: Re: [PATCH 16/16] rxrpc: Replace smp_read_barrier_depends() with lockless_dereference()

On 11/13/2014 03:47 PM, David Howells wrote:
> Pranith Kumar <[email protected]> wrote:
>
>> Recently lockless_dereference() was added which can be used in place of
>> hard-coding smp_read_barrier_depends(). The following PATCH makes the change.
>
> Actually, the use of smp_read_barrier_depends() is wrong in circular
> buffering. See Documentation/circular-buffers.txt
>

OK. Should I send in a patch removing these barriers then?

--
Pranith

2014-11-13 23:07:41

by David Howells

[permalink] [raw]
Subject: Re: [PATCH 16/16] rxrpc: Replace smp_read_barrier_depends() with lockless_dereference()

Pranith Kumar <[email protected]> wrote:

> OK. Should I send in a patch removing these barriers then?

No. There need to be stronger barriers, at least in some of the cases.
circular-buffers.txt details what is required, but not all of the cases match
the pattern there, so it needs a bit more consideration.

David

2014-11-14 01:03:44

by Kim Phillips

[permalink] [raw]
Subject: Re: [PATCH 01/16] crypto: caam - Remove unnecessary smp_read_barrier_depends()

On Thu, 13 Nov 2014 16:51:12 -0500
Pranith Kumar <[email protected]> wrote:

> On 11/13/2014 03:10 PM, Paul E. McKenney wrote:
> > On Thu, Nov 13, 2014 at 02:24:07PM -0500, Pranith Kumar wrote:
> >> Recently lockless_dereference() was added which can be used in place of
> >> hard-coding smp_read_barrier_depends(). The following PATCH makes the change.
> >>
> >> Signed-off-by: Pranith Kumar <[email protected]>
> >> ---
> >> drivers/crypto/caam/jr.c | 3 ---
> >> 1 file changed, 3 deletions(-)
> >>
> >> diff --git a/drivers/crypto/caam/jr.c b/drivers/crypto/caam/jr.c
> >> index bae20d8..9b3ef1bc 100644
> >> --- a/drivers/crypto/caam/jr.c
> >> +++ b/drivers/crypto/caam/jr.c
> >> @@ -181,8 +181,6 @@ static void caam_jr_dequeue(unsigned long devarg)
> >> for (i = 0; CIRC_CNT(head, tail + i, JOBR_DEPTH) >= 1; i++) {
> >> sw_idx = (tail + i) & (JOBR_DEPTH - 1);
> >>
> >> - smp_read_barrier_depends();
> >> -
> >
> > Did you mean to add a lockless_dereference() somewhere? I would guess
> > that one is required on the load into sw_idx before the "for" loop,
> > but I cannot claim to be familiar with this code.
>
> No, I could not understand why the barrier was here in the first place. The change log does not reflect the subject line.
>
> sw_idx is a local variable and is used to index the entinfo array. It does not seem like the barrier is needed. If not a comment saying why could be added I guess.

I likely misinterpreted the "read index before reading contents at
that index" instructions in early circular buffer documentation, and
left it in for the benefit of the doubt. Now it looks to me it's not
necessary, given both sw_idx and tail are just being computed within
a lock, and removing both barriers doesn't affect the compiler
output, so:

Reviewed-by: Kim Phillips <[email protected]>

Thanks,

Kim

2014-11-14 13:14:53

by Tejun Heo

[permalink] [raw]
Subject: Re: [PATCH 09/16] percpu: Replace smp_read_barrier_depends() with lockless_dereference()

On Thu, Nov 13, 2014 at 02:24:15PM -0500, Pranith Kumar wrote:
> Recently lockless_dereference() was added which can be used in place of
> hard-coding smp_read_barrier_depends(). The following PATCH makes the change.
>
> Signed-off-by: Pranith Kumar <[email protected]>

How should this be routed? There's a pending change on the same
region of code and if this patch is routed outside percpu tree it'd
cause a conflict and I can't route this as lockless_dereference()
isn't in the mainline yet. Maybe expose a git branch which contaisn
lockless_dereference() so that percpu can pull it in?

Thanks.

--
tejun

2014-11-14 16:02:44

by Pranith Kumar

[permalink] [raw]
Subject: Re: [PATCH 09/16] percpu: Replace smp_read_barrier_depends() with lockless_dereference()

On Fri, Nov 14, 2014 at 8:14 AM, Tejun Heo <[email protected]> wrote:
>
> How should this be routed? There's a pending change on the same
> region of code and if this patch is routed outside percpu tree it'd
> cause a conflict and I can't route this as lockless_dereference()
> isn't in the mainline yet. Maybe expose a git branch which contaisn
> lockless_dereference() so that percpu can pull it in?
>

I do not have access to git hosting (except github, but I think that
is not acceptable). May be once the patches are final, I can send in a
version which you can put in a branch on kernel.org?

--
Pranith

2014-11-17 04:52:22

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [PATCH 09/16] percpu: Replace smp_read_barrier_depends() with lockless_dereference()

On Fri, Nov 14, 2014 at 08:14:48AM -0500, Tejun Heo wrote:
> On Thu, Nov 13, 2014 at 02:24:15PM -0500, Pranith Kumar wrote:
> > Recently lockless_dereference() was added which can be used in place of
> > hard-coding smp_read_barrier_depends(). The following PATCH makes the change.
> >
> > Signed-off-by: Pranith Kumar <[email protected]>
>
> How should this be routed? There's a pending change on the same
> region of code and if this patch is routed outside percpu tree it'd
> cause a conflict and I can't route this as lockless_dereference()
> isn't in the mainline yet. Maybe expose a git branch which contaisn
> lockless_dereference() so that percpu can pull it in?

Actually, lockless_dereference() is now in mainline, if that makes it
easier.

Thanx, Paul

2014-11-17 10:35:28

by Miklos Szeredi

[permalink] [raw]
Subject: Re: [PATCH 05/16] overlayfs: Replace smp_read_barrier_depends() with lockless_dereference()

On Thu, Nov 13, 2014 at 8:24 PM, Pranith Kumar <[email protected]> wrote:
> Recently lockless_dereference() was added which can be used in place of
> hard-coding smp_read_barrier_depends(). The following PATCH makes the change.
>
> Signed-off-by: Pranith Kumar <[email protected]>
> ---
> fs/overlayfs/super.c | 4 +---
> 1 file changed, 1 insertion(+), 3 deletions(-)
>
> diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
> index 08b704c..b0f050e 100644
> --- a/fs/overlayfs/super.c
> +++ b/fs/overlayfs/super.c
> @@ -84,12 +84,10 @@ enum ovl_path_type ovl_path_type(struct dentry *dentry)
>
> static struct dentry *ovl_upperdentry_dereference(struct ovl_entry *oe)
> {
> - struct dentry *upperdentry = ACCESS_ONCE(oe->__upperdentry);
> /*
> * Make sure to order reads to upperdentry wrt ovl_dentry_update()
> */
> - smp_read_barrier_depends();
> - return upperdentry;
> + return lockless_dereference(oe->__upperdentry);
> }
>
> void ovl_path_upper(struct dentry *dentry, struct path *path)
> --
> 1.9.1
>

Thanks, already in overlayfs-next branch of

git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs.git overlayfs-next

Thanks,
Miklos

2014-11-18 19:33:10

by Paul E. McKenney

[permalink] [raw]
Subject: Re: [PATCH 08/16] rcupdate: Replace smp_read_barrier_depends() with lockless_dereference()

On Thu, Nov 13, 2014 at 02:24:14PM -0500, Pranith Kumar wrote:
> Recently lockless_dereference() was added which can be used in place of
> hard-coding smp_read_barrier_depends(). The following PATCH makes the change.
>
> Signed-off-by: Pranith Kumar <[email protected]>

Queued for 3.20, thank you! I was wondering about your removing the
"_" from "_________p1", but see that this allowed you to avoid the
checkpatch.pl warning. ;-)

Thanx, Paul

> ---
> include/linux/rcupdate.h | 10 +++++-----
> 1 file changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
> index ed4f593..386ba28 100644
> --- a/include/linux/rcupdate.h
> +++ b/include/linux/rcupdate.h
> @@ -582,11 +582,11 @@ static inline void rcu_preempt_sleep_check(void)
> })
> #define __rcu_dereference_check(p, c, space) \
> ({ \
> - typeof(*p) *_________p1 = (typeof(*p) *__force)ACCESS_ONCE(p); \
> + /* Dependency order vs. p above. */ \
> + typeof(*p) *________p1 = (typeof(*p) *__force)lockless_dereference(p); \
> rcu_lockdep_assert(c, "suspicious rcu_dereference_check() usage"); \
> rcu_dereference_sparse(p, space); \
> - smp_read_barrier_depends(); /* Dependency order vs. p above. */ \
> - ((typeof(*p) __force __kernel *)(_________p1)); \
> + ((typeof(*p) __force __kernel *)(________p1)); \
> })
> #define __rcu_dereference_protected(p, c, space) \
> ({ \
> @@ -603,10 +603,10 @@ static inline void rcu_preempt_sleep_check(void)
> })
> #define __rcu_dereference_index_check(p, c) \
> ({ \
> - typeof(p) _________p1 = ACCESS_ONCE(p); \
> + /* Dependency order vs. p above. */ \
> + typeof(p) _________p1 = lockless_dereference(p); \
> rcu_lockdep_assert(c, \
> "suspicious rcu_dereference_index_check() usage"); \
> - smp_read_barrier_depends(); /* Dependency order vs. p above. */ \
> (_________p1); \
> })
>
> --
> 1.9.1
>

2014-11-18 19:38:40

by Pranith Kumar

[permalink] [raw]
Subject: Re: [PATCH 08/16] rcupdate: Replace smp_read_barrier_depends() with lockless_dereference()


On 11/18/2014 12:16 PM, Paul E. McKenney wrote:
> On Thu, Nov 13, 2014 at 02:24:14PM -0500, Pranith Kumar wrote:
>> Recently lockless_dereference() was added which can be used in place of
>> hard-coding smp_read_barrier_depends(). The following PATCH makes the change.
>>
>> Signed-off-by: Pranith Kumar <[email protected]>
> Queued for 3.20, thank you! I was wondering about your removing the
> "_" from "_________p1", but see that this allowed you to avoid the
> checkpatch.pl warning. ;-)
>
Yes, that was the easiest way I could think of to avoid the warning with adding more lines.

I will send the rest of the patches updated with comments.

Thanks,
Pranith

2014-11-30 22:35:03

by Hugh Dickins

[permalink] [raw]
Subject: Re: [PATCH 13/16] ksm: Replace smp_read_barrier_depends() with lockless_dereference()

On Thu, 13 Nov 2014, Pranith Kumar wrote:

> Recently lockless_dereference() was added which can be used in place of
> hard-coding smp_read_barrier_depends(). The following PATCH makes the change.
>
> Signed-off-by: Pranith Kumar <[email protected]>

Sorry, I don't think your patch is buggy, but I do think it
makes this tricky piece of code harder to follow, not easier.

It is certainly not a standard use of lockless_dereference() (kpfn
is not a pointer), and it both hides and moves where the barrier is.
And then at the end of the function, there's still explicit barriers
and ACCESS_ONCE comparison with kpfn, which this makes more obscure.

Unless you are actually fixing a bug (I don't pretend to have
tested this on Alpha, and I can get barriers wrong as we all do),
or smp_read_barrier_depends() is about to be withdrawn from use,
I'd rather say NAK to this patch.

Hugh

> ---
> mm/ksm.c | 7 +++----
> 1 file changed, 3 insertions(+), 4 deletions(-)
>
> diff --git a/mm/ksm.c b/mm/ksm.c
> index d247efa..a67de79 100644
> --- a/mm/ksm.c
> +++ b/mm/ksm.c
> @@ -542,15 +542,14 @@ static struct page *get_ksm_page(struct stable_node *stable_node, bool lock_it)
> expected_mapping = (void *)stable_node +
> (PAGE_MAPPING_ANON | PAGE_MAPPING_KSM);
> again:
> - kpfn = ACCESS_ONCE(stable_node->kpfn);
> - page = pfn_to_page(kpfn);
> -
> /*
> * page is computed from kpfn, so on most architectures reading
> * page->mapping is naturally ordered after reading node->kpfn,
> * but on Alpha we need to be more careful.
> */
> - smp_read_barrier_depends();
> + kpfn = lockless_dereference(stable_node->kpfn);
> + page = pfn_to_page(kpfn);
> +
> if (ACCESS_ONCE(page->mapping) != expected_mapping)
> goto stale;
>
> --
> 1.9.1