Hi Kent,
I am resending the series without a couple of the original patches.
As I understand it, you have not been able to find this series
anywhere, so please note your ACKs on the first two ;)
Thanks!
Cc: Kent Overstreet <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Jens Axboe <[email protected]>
Cc: "Nicholas A. Bellinger" <[email protected]>
Alexander Gordeev (3):
percpu_ida: Fix data race on cpus_have_tags cpumask
percpu_ida: Move waking up waiters out of atomic contexts
percpu_ida: Sanity check initialization parameters
lib/percpu_ida.c | 26 +++++++++++++++++++++-----
1 files changed, 21 insertions(+), 5 deletions(-)
--
1.7.7.6
Signed-off-by: Alexander Gordeev <[email protected]>
Cc: Kent Overstreet <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Jens Axboe <[email protected]>
Cc: "Nicholas A. Bellinger" <[email protected]>
---
lib/percpu_ida.c | 5 +++++
1 files changed, 5 insertions(+), 0 deletions(-)
diff --git a/lib/percpu_ida.c b/lib/percpu_ida.c
index c2fa3dc..81b5ae9 100644
--- a/lib/percpu_ida.c
+++ b/lib/percpu_ida.c
@@ -305,6 +305,11 @@ int __percpu_ida_init(struct percpu_ida *pool, unsigned long nr_tags,
{
unsigned i, cpu, order;
+ if (batch_size > max_size)
+ return -ERANGE;
+ if (!batch_size)
+ return -EINVAL;
+
memset(pool, 0, sizeof(*pool));
init_waitqueue_head(&pool->wait);
--
1.7.7.6
Function steal_tags() might miss a bit in cpus_have_tags due to
unsynchronized access from percpu_ida_free(). As a result,
percpu_ida_alloc() might enter an unwakable sleep. This update adds
memory barriers to prevent the described scenario.
In fact, accesses to cpus_have_tags are currently fenced by the
prepare_to_wait() and wake_up() calls, and the sequence described
above does not appear to be able to hit. Nevertheless, explicit
memory barriers still seem justifiable.
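For reference, a minimal, generic sketch of the smp_wmb()/smp_rmb()
pairing this relies on follows; freed_tag and tag_published are
illustrative stand-ins for a per-cpu freelist slot and a
cpus_have_tags bit, not the percpu_ida code itself:
#include <asm/barrier.h>
#include <linux/types.h>
static unsigned freed_tag;	/* stands in for a per-cpu freelist slot */
static bool tag_published;	/* stands in for a cpus_have_tags bit */
/* freeing side: make the data visible before advertising it */
static void publish_tag(unsigned tag)
{
	freed_tag = tag;
	smp_wmb();		/* order the data store before the flag store */
	tag_published = true;
}
/* stealing side: only trust the data after observing the flag */
static bool observe_tag(unsigned *tag)
{
	if (!tag_published)
		return false;
	smp_rmb();		/* pairs with smp_wmb() in publish_tag() */
	*tag = freed_tag;
	return true;
}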
Signed-off-by: Alexander Gordeev <[email protected]>
Cc: Kent Overstreet <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Jens Axboe <[email protected]>
Cc: "Nicholas A. Bellinger" <[email protected]>
Acked-by: Kent Overstreet <[email protected]>
---
lib/percpu_ida.c | 12 ++++++++++--
1 files changed, 10 insertions(+), 2 deletions(-)
diff --git a/lib/percpu_ida.c b/lib/percpu_ida.c
index 7be235f..fccfb28 100644
--- a/lib/percpu_ida.c
+++ b/lib/percpu_ida.c
@@ -68,6 +68,11 @@ static inline void steal_tags(struct percpu_ida *pool,
unsigned cpus_have_tags, cpu = pool->cpu_last_stolen;
struct percpu_ida_cpu *remote;
+ /*
+ * Pairs with smp_wmb() in percpu_ida_free()
+ */
+ smp_rmb();
+
for (cpus_have_tags = cpumask_weight(&pool->cpus_have_tags);
cpus_have_tags * pool->percpu_max_size > pool->nr_tags / 2;
cpus_have_tags--) {
@@ -237,8 +242,11 @@ void percpu_ida_free(struct percpu_ida *pool, unsigned tag)
spin_unlock(&tags->lock);
if (nr_free == 1) {
- cpumask_set_cpu(smp_processor_id(),
- &pool->cpus_have_tags);
+ cpumask_set_cpu(smp_processor_id(), &pool->cpus_have_tags);
+ /*
+ * Pairs with smp_rmb() in steal_tags()
+ */
+ smp_wmb();
wake_up(&pool->wait);
}
--
1.7.7.6
Currently percpu_ida_free() always wakes up waiters with local
interrupts disabled and sometimes with pool->lock held. Yet, there
does not appear to be any reason why this could not be done outside
of these atomic contexts.
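A minimal sketch of the deferred-wakeup idiom being applied here
follows; struct example_pool, example_free() and make_tag_available()
are illustrative placeholders rather than percpu_ida internals:
#include <linux/spinlock.h>
#include <linux/wait.h>
struct example_pool {
	spinlock_t		lock;
	wait_queue_head_t	wait;
	unsigned		nr_free;
};
/* hypothetical helper: returns true when a waiter could now make progress */
static bool make_tag_available(struct example_pool *pool)
{
	return ++pool->nr_free == 1;
}
static void example_free(struct example_pool *pool)
{
	unsigned long flags;
	bool wake = false;
	local_irq_save(flags);
	spin_lock(&pool->lock);
	if (make_tag_available(pool))
		wake = true;		/* only record the decision here */
	spin_unlock(&pool->lock);
	local_irq_restore(flags);
	/* wake with interrupts enabled and no locks held */
	if (wake)
		wake_up(&pool->wait);
}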
Signed-off-by: Alexander Gordeev <[email protected]>
Cc: Kent Overstreet <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Jens Axboe <[email protected]>
Cc: "Nicholas A. Bellinger" <[email protected]>
Acked-by: Kent Overstreet <[email protected]>
---
lib/percpu_ida.c | 9 ++++++---
1 files changed, 6 insertions(+), 3 deletions(-)
diff --git a/lib/percpu_ida.c b/lib/percpu_ida.c
index fccfb28..c2fa3dc 100644
--- a/lib/percpu_ida.c
+++ b/lib/percpu_ida.c
@@ -229,6 +229,7 @@ void percpu_ida_free(struct percpu_ida *pool, unsigned tag)
struct percpu_ida_cpu *tags;
unsigned long flags;
unsigned nr_free;
+ bool wake_up = false;
BUG_ON(tag >= pool->nr_tags);
@@ -247,7 +248,7 @@ void percpu_ida_free(struct percpu_ida *pool, unsigned tag)
* Pairs with smp_rmb() in steal_tags()
*/
smp_wmb();
- wake_up(&pool->wait);
+ wake_up = true;
}
if (nr_free == pool->percpu_max_size) {
@@ -261,13 +262,15 @@ void percpu_ida_free(struct percpu_ida *pool, unsigned tag)
move_tags(pool->freelist, &pool->nr_free,
tags->freelist, &tags->nr_free,
pool->percpu_batch_size);
-
- wake_up(&pool->wait);
+ wake_up = true;
}
spin_unlock(&pool->lock);
}
local_irq_restore(flags);
+
+ if (wake_up)
+ wake_up(&pool->wait);
}
EXPORT_SYMBOL_GPL(percpu_ida_free);
--
1.7.7.6
When stealing from a remote CPU, all available tags are moved from
the remote CPU's cache to the stealing CPU's. Since no attempt is
made to select the best CPU to steal from, the victim might be
actively performing IO and (as a result of losing all of its local
tags) a further cycle of cache bouncing and/or stealing could be
provoked.
This update is an attempt to soften the described scenario by
limiting the number of tags stolen at once to the value of
percpu_batch_size.
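A minimal sketch of the clamped transfer follows; it mirrors what the
hunk below achieves via move_tags() and min(), and is illustrative
rather than the kernel code verbatim:
#include <linux/kernel.h>	/* min() */
#include <linux/string.h>	/* memcpy() */
static unsigned steal_at_most_one_batch(unsigned *dst, unsigned *dst_nr,
					unsigned *src, unsigned *src_nr,
					unsigned batch_size)
{
	/* take at most one batch instead of draining the victim completely */
	unsigned nr = min(*src_nr, batch_size);
	*src_nr -= nr;
	memcpy(dst + *dst_nr, src + *src_nr, nr * sizeof(unsigned));
	*dst_nr += nr;
	/* caller clears the victim's cpus_have_tags bit only if *src_nr is now 0 */
	return nr;
}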
Signed-off-by: Alexander Gordeev <[email protected]>
Cc: Kent Overstreet <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Jens Axboe <[email protected]>
Cc: "Nicholas A. Bellinger" <[email protected]>
---
lib/percpu_ida.c | 18 ++++++++++--------
1 files changed, 10 insertions(+), 8 deletions(-)
diff --git a/lib/percpu_ida.c b/lib/percpu_ida.c
index fad029c..b4c4cc7 100644
--- a/lib/percpu_ida.c
+++ b/lib/percpu_ida.c
@@ -84,20 +84,22 @@ static inline void steal_tags(struct percpu_ida *pool,
pool->cpu_last_stolen = cpu;
remote = per_cpu_ptr(pool->tag_cpu, cpu);
- cpumask_clear_cpu(cpu, &pool->cpus_have_tags);
-
- if (remote == tags)
+ if (remote == tags) {
+ cpumask_clear_cpu(cpu, &pool->cpus_have_tags);
continue;
+ }
spin_lock(&remote->lock);
if (remote->nr_free) {
- memcpy(tags->freelist,
- remote->freelist,
- sizeof(unsigned) * remote->nr_free);
+ const struct percpu_ida *p = pool;
+
+ move_tags(tags->freelist, &tags->nr_free,
+ remote->freelist, &remote->nr_free,
+ min(remote->nr_free, p->percpu_batch_size));
- tags->nr_free = remote->nr_free;
- remote->nr_free = 0;
+ if (!remote->nr_free)
+ cpumask_clear_cpu(cpu, &pool->cpus_have_tags);
}
spin_unlock(&remote->lock);
--
1.7.7.6
--
Regards,
Alexander Gordeev
[email protected]
Hi Jens,
Any feedback on these?
--
Regards,
Alexander Gordeev
[email protected]
On Wed, Feb 26, 2014 at 10:11:56PM +0100, Alexander Gordeev wrote:
> Hi Jens,
>
> Any feedback on these?
Sorry, I dropped the ball last time...
On Thu, Feb 06, 2014 at 01:24:53PM +0100, Alexander Gordeev wrote:
> Function steal_tags() might miss a bit in cpus_have_tags due to
> unsynchronized access from percpu_ida_free(). As a result,
> percpu_ida_alloc() might enter an unwakable sleep. This update adds
> memory barriers to prevent the described scenario.
>
> In fact, accesses to cpus_have_tags are currently fenced by the
> prepare_to_wait() and wake_up() calls, and the sequence described
> above does not appear to be able to hit. Nevertheless, explicit
> memory barriers still seem justifiable.
>
> Signed-off-by: Alexander Gordeev <[email protected]>
> Cc: Kent Overstreet <[email protected]>
> Cc: Peter Zijlstra <[email protected]>
> Cc: Jens Axboe <[email protected]>
> Cc: "Nicholas A. Bellinger" <[email protected]>
> Acked-by: Kent Overstreet <[email protected]>
> ---
> lib/percpu_ida.c | 12 ++++++++++--
> 1 files changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/lib/percpu_ida.c b/lib/percpu_ida.c
> index 7be235f..fccfb28 100644
> --- a/lib/percpu_ida.c
> +++ b/lib/percpu_ida.c
> @@ -68,6 +68,11 @@ static inline void steal_tags(struct percpu_ida *pool,
> unsigned cpus_have_tags, cpu = pool->cpu_last_stolen;
> struct percpu_ida_cpu *remote;
>
> + /*
> + * Pairs with smp_wmb() in percpu_ida_free()
> + */
> + smp_rmb();
> +
> for (cpus_have_tags = cpumask_weight(&pool->cpus_have_tags);
> cpus_have_tags * pool->percpu_max_size > pool->nr_tags / 2;
> cpus_have_tags--) {
> @@ -237,8 +242,11 @@ void percpu_ida_free(struct percpu_ida *pool, unsigned tag)
> spin_unlock(&tags->lock);
>
> if (nr_free == 1) {
> - cpumask_set_cpu(smp_processor_id(),
> - &pool->cpus_have_tags);
> + cpumask_set_cpu(smp_processor_id(), &pool->cpus_have_tags);
> + /*
> + * Pairs with smp_rmb() in steal_tags()
> + */
> + smp_wmb();
> wake_up(&pool->wait);
I think I'm nacking this - there's a lot of code in the kernel that
relies on the fact that prepare_to_wait()/wake_up() do the
appropriate fences; we really shouldn't be adding to the barriers
those do.
If you can come up with some other reason we need the barriers, I'll
reconsider.
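For reference, the pattern in question looks roughly like the sketch
below; cond, q and the function names are placeholders.
prepare_to_wait() orders the task-state change against the subsequent
condition check, and wake_up() provides the matching ordering on the
waker side, which is why open-coded barriers are normally unnecessary
around such wait/wake pairs:
#include <linux/sched.h>
#include <linux/wait.h>
static DECLARE_WAIT_QUEUE_HEAD(q);
static bool cond;
/* sleeper: the canonical prepare_to_wait()/schedule() loop */
static void wait_for_cond(void)
{
	DEFINE_WAIT(wait);
	for (;;) {
		prepare_to_wait(&q, &wait, TASK_UNINTERRUPTIBLE);
		if (cond)	/* re-checked after prepare_to_wait()'s implied ordering */
			break;
		schedule();
	}
	finish_wait(&q, &wait);
}
/* waker: set the condition, then wake */
static void signal_cond(void)
{
	cond = true;
	wake_up(&q);	/* supplies the ordering against the sleeper's check */
}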
On Thu, Feb 06, 2014 at 01:24:55PM +0100, Alexander Gordeev wrote:
> Signed-off-by: Alexander Gordeev <[email protected]>
> Cc: Kent Overstreet <[email protected]>
> Cc: Peter Zijlstra <[email protected]>
> Cc: Jens Axboe <[email protected]>
> Cc: "Nicholas A. Bellinger" <[email protected]>
> ---
> lib/percpu_ida.c | 5 +++++
> 1 files changed, 5 insertions(+), 0 deletions(-)
>
> diff --git a/lib/percpu_ida.c b/lib/percpu_ida.c
> index c2fa3dc..81b5ae9 100644
> --- a/lib/percpu_ida.c
> +++ b/lib/percpu_ida.c
> @@ -305,6 +305,11 @@ int __percpu_ida_init(struct percpu_ida *pool, unsigned long nr_tags,
> {
> unsigned i, cpu, order;
>
> + if (batch_size > max_size)
> + return -ERANGE;
> + if (!batch_size)
> + return -EINVAL;
> +
> memset(pool, 0, sizeof(*pool));
>
> init_waitqueue_head(&pool->wait);
Acked-by: Kent Overstreet <[email protected]>
On Sun, Feb 16, 2014 at 12:25:29PM +0100, Alexander Gordeev wrote:
> When stealing from a remote CPU, all available tags are moved from
> the remote CPU's cache to the stealing CPU's. Since no attempt is
> made to select the best CPU to steal from, the victim might be
> actively performing IO and (as a result of losing all of its local
> tags) a further cycle of cache bouncing and/or stealing could be
> provoked.
>
> This update is an attempt to soften the described scenario by
> limiting the number of tags stolen at once to the value of
> percpu_batch_size.
I don't want this patch without a benchmark justifying the change and more
analysis; this could easily lead to more stealing and cacheline bouncing.
>
> Signed-off-by: Alexander Gordeev <[email protected]>
> Cc: Kent Overstreet <[email protected]>
> Cc: Peter Zijlstra <[email protected]>
> Cc: Jens Axboe <[email protected]>
> Cc: "Nicholas A. Bellinger" <[email protected]>
> ---
> lib/percpu_ida.c | 18 ++++++++++--------
> 1 files changed, 10 insertions(+), 8 deletions(-)
>
> diff --git a/lib/percpu_ida.c b/lib/percpu_ida.c
> index fad029c..b4c4cc7 100644
> --- a/lib/percpu_ida.c
> +++ b/lib/percpu_ida.c
> @@ -84,20 +84,22 @@ static inline void steal_tags(struct percpu_ida *pool,
> pool->cpu_last_stolen = cpu;
> remote = per_cpu_ptr(pool->tag_cpu, cpu);
>
> - cpumask_clear_cpu(cpu, &pool->cpus_have_tags);
> -
> - if (remote == tags)
> + if (remote == tags) {
> + cpumask_clear_cpu(cpu, &pool->cpus_have_tags);
> continue;
> + }
>
> spin_lock(&remote->lock);
>
> if (remote->nr_free) {
> - memcpy(tags->freelist,
> - remote->freelist,
> - sizeof(unsigned) * remote->nr_free);
> + const struct percpu_ida *p = pool;
> +
> + move_tags(tags->freelist, &tags->nr_free,
> + remote->freelist, &remote->nr_free,
> + min(remote->nr_free, p->percpu_batch_size));
>
> - tags->nr_free = remote->nr_free;
> - remote->nr_free = 0;
> + if (!remote->nr_free)
> + cpumask_clear_cpu(cpu, &pool->cpus_have_tags);
> }
>
> spin_unlock(&remote->lock);
> --
> 1.7.7.6
>
> --
> Regards,
> Alexander Gordeev
> [email protected]