Two small changes.
* Unlike most init functions, percpu_ref_init() allocates memory and
may fail. Let's mark it with __must_check in case the caller
forgets.
* percpu_ref_kill_rcu() is unnecessarily using ACCESS_ONCE() to
dereference @ref->pcpu_count, which can be misleading. The pointer
is guaranteed to be valid and visible and can't change underneath
the function. Drop ACCESS_ONCE().
Signed-off-by: Tejun Heo <[email protected]>
---
include/linux/percpu-refcount.h | 3 ++-
lib/percpu-refcount.c | 4 +---
2 files changed, 3 insertions(+), 4 deletions(-)
--- a/include/linux/percpu-refcount.h
+++ b/include/linux/percpu-refcount.h
@@ -67,7 +67,8 @@ struct percpu_ref {
struct rcu_head rcu;
};
-int percpu_ref_init(struct percpu_ref *ref, percpu_ref_func_t *release);
+int __must_check percpu_ref_init(struct percpu_ref *ref,
+ percpu_ref_func_t *release);
void percpu_ref_kill_and_confirm(struct percpu_ref *ref,
percpu_ref_func_t *confirm_kill);
--- a/lib/percpu-refcount.c
+++ b/lib/percpu-refcount.c
@@ -57,12 +57,10 @@ int percpu_ref_init(struct percpu_ref *r
static void percpu_ref_kill_rcu(struct rcu_head *rcu)
{
struct percpu_ref *ref = container_of(rcu, struct percpu_ref, rcu);
- unsigned __percpu *pcpu_count;
+ unsigned __percpu *pcpu_count = ref->pcpu_count;
unsigned count = 0;
int cpu;
- pcpu_count = ACCESS_ONCE(ref->pcpu_count);
-
/* Mask out PCPU_REF_DEAD */
pcpu_count = (unsigned __percpu *)
(((unsigned long) pcpu_count) & ~PCPU_STATUS_MASK);
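For readers outside the kernel tree, the effect of the annotation can be sketched in userspace C. This is only an illustration: __must_check is spelled out here as the GCC/Clang warn_unused_result attribute, and demo_ref, demo_ref_init(), demo_ref_exit() and demo_setup() are hypothetical names, not part of percpu-refcount.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Userspace stand-in for the kernel's __must_check annotation. */
#define must_check __attribute__((warn_unused_result))

/* Hypothetical object whose init allocates memory and can therefore fail. */
struct demo_ref {
	unsigned *slots;
};

/* Returns 0 on success or -ENOMEM if the allocation fails. */
static must_check int demo_ref_init(struct demo_ref *ref, size_t nr_slots)
{
	ref->slots = calloc(nr_slots, sizeof(*ref->slots));
	return ref->slots ? 0 : -ENOMEM;
}

static void demo_ref_exit(struct demo_ref *ref)
{
	free(ref->slots);
	ref->slots = NULL;
}

/* A caller that honors the annotation by propagating the error. */
static int demo_setup(struct demo_ref *ref)
{
	int ret = demo_ref_init(ref, 4);

	if (ret)
		return ret;	/* nothing else to undo at this point */
	return 0;
}
```

Calling demo_ref_init(&ref, 4) as a bare statement would draw a compiler warning, which is the nudge the patch is after.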
Normally, percpu_ref_init() initializes and percpu_ref_kill*()
initiates destruction which completes asynchronously. The
asynchronous destruction can be problematic in an init failure path where
the caller wants to destroy a half-constructed object - distinguishing
half-constructed objects from the usual release method can be painful
for complex objects.
This patch implements percpu_ref_cancel_init() which synchronously
destroys the percpu_ref without invoking release. To avoid
unintentional misuses, the function requires the ref to have finished
percpu_ref_init() but never been used, and triggers WARN otherwise.
Signed-off-by: Tejun Heo <[email protected]>
---
include/linux/percpu-refcount.h | 1 +
lib/percpu-refcount.c | 27 +++++++++++++++++++++++++++
2 files changed, 28 insertions(+)
--- a/include/linux/percpu-refcount.h
+++ b/include/linux/percpu-refcount.h
@@ -69,6 +69,7 @@ struct percpu_ref {
int __must_check percpu_ref_init(struct percpu_ref *ref,
percpu_ref_func_t *release);
+void percpu_ref_cancel_init(struct percpu_ref *ref);
void percpu_ref_kill_and_confirm(struct percpu_ref *ref,
percpu_ref_func_t *confirm_kill);
--- a/lib/percpu-refcount.c
+++ b/lib/percpu-refcount.c
@@ -54,6 +54,33 @@ int percpu_ref_init(struct percpu_ref *r
return 0;
}
+/**
+ * percpu_ref_cancel_init - cancel percpu_ref_init()
+ * @ref: percpu_ref to cancel init for
+ *
+ * Once a percpu_ref is initialized, its destruction is initiated by
+ * percpu_ref_kill*() and completes asynchronously, which can be painful to
+ * do when destroying a half-constructed object in init failure path.
+ *
+ * This function destroys @ref without invoking @ref->release and the
+ * memory area containing it can be freed immediately on return. To
+ * prevent accidental misuse, it's required that @ref has finished
+ * percpu_ref_init(), whether successful or not, but never used.
+ */
+void percpu_ref_cancel_init(struct percpu_ref *ref)
+{
+ unsigned __percpu *pcpu_count = ref->pcpu_count;
+ int cpu;
+
+ WARN_ON_ONCE(atomic_read(&ref->count) != 1 + PCPU_COUNT_BIAS);
+
+ if (pcpu_count) {
+ for_each_possible_cpu(cpu)
+ WARN_ON_ONCE(*per_cpu_ptr(pcpu_count, cpu));
+ free_percpu(ref->pcpu_count);
+ }
+}
+
static void percpu_ref_kill_rcu(struct rcu_head *rcu)
{
struct percpu_ref *ref = container_of(rcu, struct percpu_ref, rcu);
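The intended calling pattern, an init failure path that tears down a ref nobody has used yet, can be sketched with a userspace mock. Everything here is an assumption made for illustration: mock_ref, mock_obj and their helpers are hypothetical, a plain long stands in for the atomic count, and a calloc()ed array stands in for the per-cpu counters.

```c
#include <assert.h>
#include <stdlib.h>

#define MOCK_NR_CPUS 4	/* stands in for the number of possible CPUs */

/* Hypothetical userspace mock of the ref, not the kernel structure. */
struct mock_ref {
	unsigned *pcpu_count;	/* stands in for the per-cpu counters */
	long count;		/* stands in for the atomic count */
};

static int mock_ref_init(struct mock_ref *ref)
{
	ref->count = 1;
	ref->pcpu_count = calloc(MOCK_NR_CPUS, sizeof(*ref->pcpu_count));
	return ref->pcpu_count ? 0 : -1;
}

/* Synchronous teardown of a ref that was initialized but never used. */
static void mock_ref_cancel_init(struct mock_ref *ref)
{
	assert(ref->count == 1);	/* mirrors the WARN on a used ref */
	free(ref->pcpu_count);		/* free(NULL) is a no-op */
	ref->pcpu_count = NULL;
}

struct mock_obj {
	struct mock_ref ref;
	char *buf;
};

/* fail_second_step lets the caller force the error path. */
static struct mock_obj *mock_obj_create(int fail_second_step)
{
	struct mock_obj *obj = calloc(1, sizeof(*obj));

	if (!obj)
		return NULL;
	if (mock_ref_init(&obj->ref))
		goto err_free_obj;
	obj->buf = fail_second_step ? NULL : malloc(64);
	if (!obj->buf)
		goto err_cancel_ref;
	return obj;

err_cancel_ref:
	/* No release callback runs; obj can be freed right away. */
	mock_ref_cancel_init(&obj->ref);
err_free_obj:
	free(obj);
	return NULL;
}
```

The point of the sketch is the err_cancel_ref label: the object is freed immediately after the cancel call, with no asynchronous release to wait for.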
On Wed, Jun 12, 2013 at 08:52:35PM -0700, Tejun Heo wrote:
> Normally, percpu_ref_init() initializes and percpu_ref_kill*()
> initiates destruction which completes asynchronously. The
> asynchronous destruction can be problematic in init failure path where
> the caller wants to destroy half-constructed object - distinguishing
> half-constructed objects from the usual release method can be painful
> for complex objects.
>
> This patch implements percpu_ref_cancel_init() which synchronously
> destroys the percpu_ref without invoking release. To avoid
> unintentional misuses, the function requires the ref to have finished
> percpu_ref_init() but never used and triggers WARN otherwise.
That's a good idea, I should've implemented that for aio.
I probably would've just gone with percpu_ref_free() (if caller knows
it's safe, they can do whatever they want) but I suppose I can live with
percpu_ref_cancel_init().
Acked-by: Kent Overstreet <[email protected]>
On Wed, Jun 12, 2013 at 08:56:36PM -0700, Kent Overstreet wrote:
> On Wed, Jun 12, 2013 at 08:52:35PM -0700, Tejun Heo wrote:
> > Normally, percpu_ref_init() initializes and percpu_ref_kill*()
> > initiates destruction which completes asynchronously. The
> > asynchronous destruction can be problematic in init failure path where
> > the caller wants to destroy half-constructed object - distinguishing
> > half-constructed objects from the usual release method can be painful
> > for complex objects.
> >
> > This patch implements percpu_ref_cancel_init() which synchronously
> > destroys the percpu_ref without invoking release. To avoid
> > unintentional misuses, the function requires the ref to have finished
> > percpu_ref_init() but never used and triggers WARN otherwise.
>
> That's a good idea, I should've implemented that for aio.
>
> I probably would've just gone with percpu_ref_free() (if caller knows
> it's safe, they can do whatever they want) but I suppose I can live with
> percpu_ref_cancel_init().
At first I named it percpu_ref_free() but it looked too symmetric to
init, more so than kill, so I was worried that people might get
confused and think this is the normal interface to shut down a percpu
refcnt, hence the weird cancel_init name and further restriction on its
usage.
Thanks.
--
tejun
On Wed, Jun 12, 2013 at 08:58:31PM -0700, Tejun Heo wrote:
> At first I named it percpu_ref_free() but it looked too symmetric to
> init, more so than kill, so I was worried that people might get
> confused that this is the normal interface to shutdown a percpu
> refcnt, so the weird cancel_init name and further restriction on its
> usage.
...Yeah, confusion with _kill() is a good point. Ok, cancel_init() it
is.
On Wed, Jun 12, 2013 at 09:00:19PM -0700, Kent Overstreet wrote:
> On Wed, Jun 12, 2013 at 08:58:31PM -0700, Tejun Heo wrote:
> > At first I named it percpu_ref_free() but it looked too symmetric to
> > init, more so than kill, so I was worried that people might get
> > confused that this is the normal interface to shutdown a percpu
> > refcnt, so the weird cancel_init name and further restriction on its
> > usage.
>
> ...Yeah, confusion with _kill() is a good point. Ok, cancel_init() it
> is.
Applied both patches to percpu/for-3.11. Added a paragraph explaining
the weird naming and usage restriction just in case.
Thanks.
--
tejun
Tejun Heo <[email protected]> writes:
> Two small changes.
>
> * Unlike most init functions, percpu_ref_init() allocates memory and
> may fail. Let's mark it with __must_check in case the caller
> forgets.
But it's quite OK to ignore OOM errors in builtin init functions.
It would be neatest to have it fail into slow mode, of course, but it's
probably not worth the pain.
Cheers,
Rusty.
On Wed, Jun 19, 2013 at 12:25:14PM +0930, Rusty Russell wrote:
> But it's quite OK to ignore OOM errors in builtin init functions.
I think it'd be cleaner to let those use cases use BUG_ON() around it.
We really want most users to be checking its return value.
> It would be neatest to have it fail into slow mode, of course, but it's
> probably not worth the pain.
percpu allocation is always GFP_KERNEL, so it can't get any slower
without deadlocking.
Thanks.
--
tejun
Tejun Heo <[email protected]> writes:
> On Wed, Jun 19, 2013 at 12:25:14PM +0930, Rusty Russell wrote:
>> But it's quite OK to ignore OOM errors in builtin init functions.
>
> I think it'd be cleaner to let those use cases use BUG_ON() around it.
> We really want most users to be checking its return value.
Yeah, but it's an admission of API design failure.
__attribute__((warn_unused_result)) is a bad implementation of a poorly
conceived idea. It was originally designed to catch realloc misuse,
which is presumably why casting to (void) doesn't suppress it.
Protecting realloc properly would mean GCC understanding that the
pointer arg handed to realloc was no longer valid, which would catch many
more cases, but compilers are hard, so we got the hacky attribute.
Now it seems to get abused by lazy coders who blame users for their own
broken APIs. And Ubuntu, who turn it on by default in their gcc when
optimizing. Yeah, it's a sore point :)
So I end up writing code like this (to quote from ccan):
/* Gcc's warn_unused_result is fascist bullshit. */
#define doesnt_matter()
...
if (system(command))
doesnt_matter();
>> It would be neatest to have it fail into slow mode, of course, but it's
>> probably not worth the pain.
>
> percpu allocation is always GFP_KERNEL, so it can't get any slower
> without deadlocking.
Sorry, I was unclear. If you fail the percpu allocation, you have a
counter which is always in atomic mode.
This saves everyone a headache. init doesn't fail, no poorly-tested
failure paths, no whining Rusty.
Rant over,
Rusty.
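The fallback described above (an init that cannot fail because a failed per-cpu allocation just leaves the counter in atomic mode) might look roughly like this in userspace C. It is a sketch of the idea only, not of any code in lib/percpu-refcount.c: sketch_ref and its helpers are hypothetical, the slot array stands in for per-cpu counters, and the fast path is not thread safe as written.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

#define SKETCH_NR_SLOTS 4	/* stands in for the number of possible CPUs */

struct sketch_ref {
	unsigned *slots;	/* fast path; NULL means permanent atomic mode */
	atomic_long count;	/* shared counter, always valid */
};

/* Never fails: a failed slot allocation only disables the fast path. */
static void sketch_ref_init(struct sketch_ref *ref, int simulate_oom)
{
	atomic_init(&ref->count, 1);
	ref->slots = simulate_oom ? NULL
		: calloc(SKETCH_NR_SLOTS, sizeof(*ref->slots));
}

static void sketch_ref_get(struct sketch_ref *ref, unsigned slot)
{
	if (ref->slots)
		ref->slots[slot % SKETCH_NR_SLOTS]++;	/* single-threaded demo only */
	else
		atomic_fetch_add(&ref->count, 1);	/* slow but always available */
}

/* Total references: the shared count plus whatever sits in the slots. */
static long sketch_ref_total(struct sketch_ref *ref)
{
	long total = atomic_load(&ref->count);
	unsigned i;

	for (i = 0; ref->slots && i < SKETCH_NR_SLOTS; i++)
		total += ref->slots[i];
	return total;
}

static void sketch_ref_exit(struct sketch_ref *ref)
{
	free(ref->slots);
	ref->slots = NULL;
}
```

Both variants report the same totals; the only observable difference is that the fallback pays for an atomic operation on every get, with nothing telling the caller that it is in that mode.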
Hello,
On Thu, Jun 20, 2013 at 10:29:51AM +0930, Rusty Russell wrote:
> Now seems to get abused by lazy coders who blame users for their own
> broken APIs. And Ubuntu, who turn it on by default in their gcc when
> optimizing. Yeah, it's a sore point :)
How is the API broken? It is a function which either succeeds or
fails and, if it fails during boot, like any other allocation failure
during boot, the boot fails. It's no different from kmalloc()
returning NULL on failure.
> if (system(command))
> doesnt_matter();
It *does* matter.
> Sorry, I was unclear. If you fail the percpu allocation, you have a
> counter which is always in atomic mode.
>
> This saves everyone a headache. init doesn't fail, no poorly-tested
> failure paths, no whining Rusty.
How does that save a headache? It *should* fail if allocation fails.
Having an untraceable persistent slowdown after heavy memory pressure is
no fun to track down. If you're worried about not being able to
detect bugs in the error path, the right thing to do would be to induce
allocation failures regularly so that those paths can be tested, which
we already do.
Thanks.
--
tejun
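The idea of inducing allocation failures so that error paths get exercised can be illustrated with a small userspace sketch. This is not the kernel's fault-injection facility; fail_countdown, test_alloc() and the pair structure are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical test hook: fail the Nth allocation from now, 0 disables. */
static unsigned long fail_countdown;

static void *test_alloc(size_t size)
{
	if (fail_countdown && --fail_countdown == 0)
		return NULL;	/* injected failure */
	return malloc(size);
}

struct pair {
	int *a;
	int *b;
};

/* Two-step init whose second step needs unwinding on failure. */
static int pair_init(struct pair *p)
{
	p->a = test_alloc(sizeof(*p->a));
	if (!p->a)
		return -1;
	p->b = test_alloc(sizeof(*p->b));
	if (!p->b) {
		free(p->a);	/* undo the first step */
		p->a = NULL;
		return -1;
	}
	return 0;
}
```

Setting fail_countdown to 1 and then to 2 before calling pair_init() walks both failure branches, so the unwinding code runs under test instead of only under real memory pressure.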