Subject: [PATCH 0/6] use irqsave variant of refcount_dec_and_lock() / atomic_dec_and_lock()

The irqsave variants of refcount_dec_and_lock() and atomic_dec_and_lock()
made it into v4.18-rc2. This is just a repost of the users of the new
functions so that the patches can be routed through the individual
subsystems.

Sebastian

Cc: Andrew Morton <[email protected]>
Cc: [email protected]
Cc: Shaohua Li <[email protected]>
Cc: [email protected]



Subject: [PATCH 4/6] bdi: Use irqsave variant of refcount_dec_and_lock()

From: Anna-Maria Gleixner <[email protected]>

The irqsave variant of refcount_dec_and_lock() handles the irqsave/restore
itself when taking/releasing the spin lock. With this variant the explicit
calls to local_irq_save()/local_irq_restore() are no longer required.

Cc: [email protected]
Cc: Andrew Morton <[email protected]>
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Signed-off-by: Anna-Maria Gleixner <[email protected]>
[bigeasy: s@atomic_dec_and_lock@refcount_dec_and_lock@g ]
Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
---
mm/backing-dev.c | 5 +----
1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/mm/backing-dev.c b/mm/backing-dev.c
index 55a233d75f39..f5981e9d6ae2 100644
--- a/mm/backing-dev.c
+++ b/mm/backing-dev.c
@@ -473,11 +473,8 @@ void wb_congested_put(struct bdi_writeback_congested *congested)
{
unsigned long flags;

- local_irq_save(flags);
- if (!refcount_dec_and_lock(&congested->refcnt, &cgwb_lock)) {
- local_irq_restore(flags);
+ if (!refcount_dec_and_lock_irqsave(&congested->refcnt, &cgwb_lock, &flags))
return;
- }

/* bdi might already have been destroyed leaving @congested unlinked */
if (congested->__bdi) {
--
2.18.0


Subject: [PATCH 6/6] userns: Use irqsave variant of refcount_dec_and_lock()

From: Anna-Maria Gleixner <[email protected]>

The irqsave variant of refcount_dec_and_lock() handles the irqsave/restore
itself when taking/releasing the spin lock. With this variant the explicit
calls to local_irq_save()/local_irq_restore() are no longer required.

Cc: Andrew Morton <[email protected]>
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Signed-off-by: Anna-Maria Gleixner <[email protected]>
[bigeasy: s@atomic_dec_and_lock@refcount_dec_and_lock@g ]
Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
---
kernel/user.c | 5 +----
1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/kernel/user.c b/kernel/user.c
index 5f65ef195259..0df9b1640b2a 100644
--- a/kernel/user.c
+++ b/kernel/user.c
@@ -169,11 +169,8 @@ void free_uid(struct user_struct *up)
if (!up)
return;

- local_irq_save(flags);
- if (refcount_dec_and_lock(&up->__count, &uidhash_lock))
+ if (refcount_dec_and_lock_irqsave(&up->__count, &uidhash_lock, &flags))
free_user(up, flags);
- else
- local_irq_restore(flags);
}

struct user_struct *alloc_uid(kuid_t uid)
--
2.18.0


Subject: [PATCH 1/6] drivers/md/raid5: Use irqsave variant of atomic_dec_and_lock()

From: Anna-Maria Gleixner <[email protected]>

The irqsave variant of atomic_dec_and_lock() handles the irqsave/restore
itself when taking/releasing the spin lock. With this variant the explicit
call to local_irq_save() is no longer required.

Cc: Shaohua Li <[email protected]>
Cc: [email protected]
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Signed-off-by: Anna-Maria Gleixner <[email protected]>
Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
---
drivers/md/raid5.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 2031506a0ecd..e933bff9459e 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -409,16 +409,15 @@ void raid5_release_stripe(struct stripe_head *sh)
md_wakeup_thread(conf->mddev->thread);
return;
slow_path:
- local_irq_save(flags);
/* we are ok here if STRIPE_ON_RELEASE_LIST is set or not */
- if (atomic_dec_and_lock(&sh->count, &conf->device_lock)) {
+ if (atomic_dec_and_lock_irqsave(&sh->count, &conf->device_lock, flags)) {
INIT_LIST_HEAD(&list);
hash = sh->hash_lock_index;
do_release_stripe(conf, sh, &list);
spin_unlock(&conf->device_lock);
release_inactive_stripe_list(conf, &list, hash);
+ local_irq_restore(flags);
}
- local_irq_restore(flags);
}

static inline void remove_hash(struct stripe_head *sh)
--
2.18.0


Subject: [PATCH 5/6] userns: Use refcount_t for reference counting instead of atomic_t

The refcount_t type and its corresponding API should be used instead of
atomic_t when the variable is used as a reference counter. This helps to
avoid accidental reference counter overflows that might lead to
use-after-free situations.

Cc: Andrew Morton <[email protected]>
Suggested-by: Peter Zijlstra <[email protected]>
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
---
include/linux/sched/user.h | 5 +++--
kernel/user.c | 8 ++++----
2 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/include/linux/sched/user.h b/include/linux/sched/user.h
index 96fe289c4c6e..39ad98c09c58 100644
--- a/include/linux/sched/user.h
+++ b/include/linux/sched/user.h
@@ -4,6 +4,7 @@

#include <linux/uidgid.h>
#include <linux/atomic.h>
+#include <linux/refcount.h>
#include <linux/ratelimit.h>

struct key;
@@ -12,7 +13,7 @@ struct key;
* Some day this will be a full-fledged user tracking system..
*/
struct user_struct {
- atomic_t __count; /* reference count */
+ refcount_t __count; /* reference count */
atomic_t processes; /* How many processes does this user have? */
atomic_t sigpending; /* How many pending signals does this user have? */
#ifdef CONFIG_FANOTIFY
@@ -59,7 +60,7 @@ extern struct user_struct root_user;
extern struct user_struct * alloc_uid(kuid_t);
static inline struct user_struct *get_uid(struct user_struct *u)
{
- atomic_inc(&u->__count);
+ refcount_inc(&u->__count);
return u;
}
extern void free_uid(struct user_struct *);
diff --git a/kernel/user.c b/kernel/user.c
index 36288d840675..5f65ef195259 100644
--- a/kernel/user.c
+++ b/kernel/user.c
@@ -96,7 +96,7 @@ static DEFINE_SPINLOCK(uidhash_lock);

/* root_user.__count is 1, for init task cred */
struct user_struct root_user = {
- .__count = ATOMIC_INIT(1),
+ .__count = REFCOUNT_INIT(1),
.processes = ATOMIC_INIT(1),
.sigpending = ATOMIC_INIT(0),
.locked_shm = 0,
@@ -123,7 +123,7 @@ static struct user_struct *uid_hash_find(kuid_t uid, struct hlist_head *hashent)

hlist_for_each_entry(user, hashent, uidhash_node) {
if (uid_eq(user->uid, uid)) {
- atomic_inc(&user->__count);
+ refcount_inc(&user->__count);
return user;
}
}
@@ -170,7 +170,7 @@ void free_uid(struct user_struct *up)
return;

local_irq_save(flags);
- if (atomic_dec_and_lock(&up->__count, &uidhash_lock))
+ if (refcount_dec_and_lock(&up->__count, &uidhash_lock))
free_user(up, flags);
else
local_irq_restore(flags);
@@ -191,7 +191,7 @@ struct user_struct *alloc_uid(kuid_t uid)
goto out_unlock;

new->uid = uid;
- atomic_set(&new->__count, 1);
+ refcount_set(&new->__count, 1);
ratelimit_state_init(&new->ratelimit, HZ, 100);
ratelimit_set_flags(&new->ratelimit, RATELIMIT_MSG_ON_RELEASE);

--
2.18.0


Subject: [PATCH 2/6] drivers/md/raid5: Do not disable irq on release_inactive_stripe_list() call

From: Anna-Maria Gleixner <[email protected]>

There is no need to invoke release_inactive_stripe_list() with interrupts
disabled. All call sites, except raid5_release_stripe(), unlock
->device_lock and enable interrupts before invoking the function.

Make it consistent.

Cc: Shaohua Li <[email protected]>
Cc: [email protected]
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Signed-off-by: Anna-Maria Gleixner <[email protected]>
Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
---
drivers/md/raid5.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index e933bff9459e..ca1dd0cb04c5 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -414,9 +414,8 @@ void raid5_release_stripe(struct stripe_head *sh)
INIT_LIST_HEAD(&list);
hash = sh->hash_lock_index;
do_release_stripe(conf, sh, &list);
- spin_unlock(&conf->device_lock);
+ spin_unlock_irqrestore(&conf->device_lock, flags);
release_inactive_stripe_list(conf, &list, hash);
- local_irq_restore(flags);
}
}

--
2.18.0


Subject: [PATCH 3/6] bdi: Use refcount_t for reference counting instead of atomic_t

The refcount_t type and its corresponding API should be used instead of
atomic_t when the variable is used as a reference counter. This helps to
avoid accidental reference counter overflows that might lead to
use-after-free situations.

Cc: [email protected]
Cc: Andrew Morton <[email protected]>
Suggested-by: Peter Zijlstra <[email protected]>
Acked-by: Peter Zijlstra (Intel) <[email protected]>
Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
---
include/linux/backing-dev-defs.h | 3 ++-
include/linux/backing-dev.h | 4 ++--
mm/backing-dev.c | 12 ++++++------
3 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/include/linux/backing-dev-defs.h b/include/linux/backing-dev-defs.h
index 24251762c20c..9a6bc0951cfa 100644
--- a/include/linux/backing-dev-defs.h
+++ b/include/linux/backing-dev-defs.h
@@ -12,6 +12,7 @@
#include <linux/timer.h>
#include <linux/workqueue.h>
#include <linux/kref.h>
+#include <linux/refcount.h>

struct page;
struct device;
@@ -75,7 +76,7 @@ enum wb_reason {
*/
struct bdi_writeback_congested {
unsigned long state; /* WB_[a]sync_congested flags */
- atomic_t refcnt; /* nr of attached wb's and blkg */
+ refcount_t refcnt; /* nr of attached wb's and blkg */

#ifdef CONFIG_CGROUP_WRITEBACK
struct backing_dev_info *__bdi; /* the associated bdi, set to NULL
diff --git a/include/linux/backing-dev.h b/include/linux/backing-dev.h
index 72ca0f3d39f3..c28a47cbe355 100644
--- a/include/linux/backing-dev.h
+++ b/include/linux/backing-dev.h
@@ -404,13 +404,13 @@ static inline bool inode_cgwb_enabled(struct inode *inode)
static inline struct bdi_writeback_congested *
wb_congested_get_create(struct backing_dev_info *bdi, int blkcg_id, gfp_t gfp)
{
- atomic_inc(&bdi->wb_congested->refcnt);
+ refcount_inc(&bdi->wb_congested->refcnt);
return bdi->wb_congested;
}

static inline void wb_congested_put(struct bdi_writeback_congested *congested)
{
- if (atomic_dec_and_test(&congested->refcnt))
+ if (refcount_dec_and_test(&congested->refcnt))
kfree(congested);
}

diff --git a/mm/backing-dev.c b/mm/backing-dev.c
index 2e5d3df0853d..55a233d75f39 100644
--- a/mm/backing-dev.c
+++ b/mm/backing-dev.c
@@ -438,10 +438,10 @@ wb_congested_get_create(struct backing_dev_info *bdi, int blkcg_id, gfp_t gfp)
if (new_congested) {
/* !found and storage for new one already allocated, insert */
congested = new_congested;
- new_congested = NULL;
rb_link_node(&congested->rb_node, parent, node);
rb_insert_color(&congested->rb_node, &bdi->cgwb_congested_tree);
- goto found;
+ spin_unlock_irqrestore(&cgwb_lock, flags);
+ return congested;
}

spin_unlock_irqrestore(&cgwb_lock, flags);
@@ -451,13 +451,13 @@ wb_congested_get_create(struct backing_dev_info *bdi, int blkcg_id, gfp_t gfp)
if (!new_congested)
return NULL;

- atomic_set(&new_congested->refcnt, 0);
+ refcount_set(&new_congested->refcnt, 1);
new_congested->__bdi = bdi;
new_congested->blkcg_id = blkcg_id;
goto retry;

found:
- atomic_inc(&congested->refcnt);
+ refcount_inc(&congested->refcnt);
spin_unlock_irqrestore(&cgwb_lock, flags);
kfree(new_congested);
return congested;
@@ -474,7 +474,7 @@ void wb_congested_put(struct bdi_writeback_congested *congested)
unsigned long flags;

local_irq_save(flags);
- if (!atomic_dec_and_lock(&congested->refcnt, &cgwb_lock)) {
+ if (!refcount_dec_and_lock(&congested->refcnt, &cgwb_lock)) {
local_irq_restore(flags);
return;
}
@@ -804,7 +804,7 @@ static int cgwb_bdi_init(struct backing_dev_info *bdi)
if (!bdi->wb_congested)
return -ENOMEM;

- atomic_set(&bdi->wb_congested->refcnt, 1);
+ refcount_set(&bdi->wb_congested->refcnt, 1);

err = wb_init(&bdi->wb, bdi, 1, GFP_KERNEL);
if (err) {
--
2.18.0


Subject: Re: [PATCH 1/6] drivers/md/raid5: Use irqsave variant of atomic_dec_and_lock()

On 2018-07-03 22:01:36 [+0200], To [email protected] wrote:
> From: Anna-Maria Gleixner <[email protected]>
>
> The irqsave variant of atomic_dec_and_lock() handles the irqsave/restore
> itself when taking/releasing the spin lock. With this variant the explicit
> call to local_irq_save() is no longer required.

Shaohua, are you okay with this?

> Cc: Shaohua Li <[email protected]>
> Cc: [email protected]
> Acked-by: Peter Zijlstra (Intel) <[email protected]>
> Signed-off-by: Anna-Maria Gleixner <[email protected]>
> Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
> ---
> drivers/md/raid5.c | 5 ++---
> 1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> index 2031506a0ecd..e933bff9459e 100644
> --- a/drivers/md/raid5.c
> +++ b/drivers/md/raid5.c
> @@ -409,16 +409,15 @@ void raid5_release_stripe(struct stripe_head *sh)
> md_wakeup_thread(conf->mddev->thread);
> return;
> slow_path:
> - local_irq_save(flags);
> /* we are ok here if STRIPE_ON_RELEASE_LIST is set or not */
> - if (atomic_dec_and_lock(&sh->count, &conf->device_lock)) {
> + if (atomic_dec_and_lock_irqsave(&sh->count, &conf->device_lock, flags)) {
> INIT_LIST_HEAD(&list);
> hash = sh->hash_lock_index;
> do_release_stripe(conf, sh, &list);
> spin_unlock(&conf->device_lock);
> release_inactive_stripe_list(conf, &list, hash);
> + local_irq_restore(flags);
> }
> - local_irq_restore(flags);
> }
>
> static inline void remove_hash(struct stripe_head *sh)
> --
> 2.18.0
>

Subject: Re: [PATCH 3/6] bdi: Use refcount_t for reference counting instead of atomic_t

On 2018-07-03 22:01:38 [+0200], To [email protected] wrote:
> The refcount_t type and its corresponding API should be used instead of
> atomic_t when the variable is used as a reference counter. This helps to
> avoid accidental reference counter overflows that might lead to
> use-after-free situations.

Andrew, is it okay for you to collect this one (and 4/6 of this series,
both bdi)? The prerequisites are already in Linus' tree.

> Cc: [email protected]
> Cc: Andrew Morton <[email protected]>
> Suggested-by: Peter Zijlstra <[email protected]>
> Acked-by: Peter Zijlstra (Intel) <[email protected]>
> Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
> ---
> include/linux/backing-dev-defs.h | 3 ++-
> include/linux/backing-dev.h | 4 ++--
> mm/backing-dev.c | 12 ++++++------
> 3 files changed, 10 insertions(+), 9 deletions(-)
>
> diff --git a/include/linux/backing-dev-defs.h b/include/linux/backing-dev-defs.h
> index 24251762c20c..9a6bc0951cfa 100644
> --- a/include/linux/backing-dev-defs.h
> +++ b/include/linux/backing-dev-defs.h
> @@ -12,6 +12,7 @@
> #include <linux/timer.h>
> #include <linux/workqueue.h>
> #include <linux/kref.h>
> +#include <linux/refcount.h>
>
> struct page;
> struct device;
> @@ -75,7 +76,7 @@ enum wb_reason {
> */
> struct bdi_writeback_congested {
> unsigned long state; /* WB_[a]sync_congested flags */
> - atomic_t refcnt; /* nr of attached wb's and blkg */
> + refcount_t refcnt; /* nr of attached wb's and blkg */
>
> #ifdef CONFIG_CGROUP_WRITEBACK
> struct backing_dev_info *__bdi; /* the associated bdi, set to NULL
> diff --git a/include/linux/backing-dev.h b/include/linux/backing-dev.h
> index 72ca0f3d39f3..c28a47cbe355 100644
> --- a/include/linux/backing-dev.h
> +++ b/include/linux/backing-dev.h
> @@ -404,13 +404,13 @@ static inline bool inode_cgwb_enabled(struct inode *inode)
> static inline struct bdi_writeback_congested *
> wb_congested_get_create(struct backing_dev_info *bdi, int blkcg_id, gfp_t gfp)
> {
> - atomic_inc(&bdi->wb_congested->refcnt);
> + refcount_inc(&bdi->wb_congested->refcnt);
> return bdi->wb_congested;
> }
>
> static inline void wb_congested_put(struct bdi_writeback_congested *congested)
> {
> - if (atomic_dec_and_test(&congested->refcnt))
> + if (refcount_dec_and_test(&congested->refcnt))
> kfree(congested);
> }
>
> diff --git a/mm/backing-dev.c b/mm/backing-dev.c
> index 2e5d3df0853d..55a233d75f39 100644
> --- a/mm/backing-dev.c
> +++ b/mm/backing-dev.c
> @@ -438,10 +438,10 @@ wb_congested_get_create(struct backing_dev_info *bdi, int blkcg_id, gfp_t gfp)
> if (new_congested) {
> /* !found and storage for new one already allocated, insert */
> congested = new_congested;
> - new_congested = NULL;
> rb_link_node(&congested->rb_node, parent, node);
> rb_insert_color(&congested->rb_node, &bdi->cgwb_congested_tree);
> - goto found;
> + spin_unlock_irqrestore(&cgwb_lock, flags);
> + return congested;
> }
>
> spin_unlock_irqrestore(&cgwb_lock, flags);
> @@ -451,13 +451,13 @@ wb_congested_get_create(struct backing_dev_info *bdi, int blkcg_id, gfp_t gfp)
> if (!new_congested)
> return NULL;
>
> - atomic_set(&new_congested->refcnt, 0);
> + refcount_set(&new_congested->refcnt, 1);
> new_congested->__bdi = bdi;
> new_congested->blkcg_id = blkcg_id;
> goto retry;
>
> found:
> - atomic_inc(&congested->refcnt);
> + refcount_inc(&congested->refcnt);
> spin_unlock_irqrestore(&cgwb_lock, flags);
> kfree(new_congested);
> return congested;
> @@ -474,7 +474,7 @@ void wb_congested_put(struct bdi_writeback_congested *congested)
> unsigned long flags;
>
> local_irq_save(flags);
> - if (!atomic_dec_and_lock(&congested->refcnt, &cgwb_lock)) {
> + if (!refcount_dec_and_lock(&congested->refcnt, &cgwb_lock)) {
> local_irq_restore(flags);
> return;
> }
> @@ -804,7 +804,7 @@ static int cgwb_bdi_init(struct backing_dev_info *bdi)
> if (!bdi->wb_congested)
> return -ENOMEM;
>
> - atomic_set(&bdi->wb_congested->refcnt, 1);
> + refcount_set(&bdi->wb_congested->refcnt, 1);
>
> err = wb_init(&bdi->wb, bdi, 1, GFP_KERNEL);
> if (err) {
> --
> 2.18.0

Sebastian

Subject: Re: [PATCH 5/6] userns: Use refcount_t for reference counting instead of atomic_t

On 2018-07-03 22:01:40 [+0200], To [email protected] wrote:
> The refcount_t type and its corresponding API should be used instead of
> atomic_t when the variable is used as a reference counter. This helps to
> avoid accidental reference counter overflows that might lead to
> use-after-free situations.

Ingo, Andrew: which of you two feels most comfortable applying this one
(and 6/6 of this series, both `userns')? If neither, whom would you suggest?

> Cc: Andrew Morton <[email protected]>
> Suggested-by: Peter Zijlstra <[email protected]>
> Acked-by: Peter Zijlstra (Intel) <[email protected]>
> Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
> ---
> include/linux/sched/user.h | 5 +++--
> kernel/user.c | 8 ++++----
> 2 files changed, 7 insertions(+), 6 deletions(-)
>
> diff --git a/include/linux/sched/user.h b/include/linux/sched/user.h
> index 96fe289c4c6e..39ad98c09c58 100644
> --- a/include/linux/sched/user.h
> +++ b/include/linux/sched/user.h
> @@ -4,6 +4,7 @@
>
> #include <linux/uidgid.h>
> #include <linux/atomic.h>
> +#include <linux/refcount.h>
> #include <linux/ratelimit.h>
>
> struct key;
> @@ -12,7 +13,7 @@ struct key;
> * Some day this will be a full-fledged user tracking system..
> */
> struct user_struct {
> - atomic_t __count; /* reference count */
> + refcount_t __count; /* reference count */
> atomic_t processes; /* How many processes does this user have? */
> atomic_t sigpending; /* How many pending signals does this user have? */
> #ifdef CONFIG_FANOTIFY
> @@ -59,7 +60,7 @@ extern struct user_struct root_user;
> extern struct user_struct * alloc_uid(kuid_t);
> static inline struct user_struct *get_uid(struct user_struct *u)
> {
> - atomic_inc(&u->__count);
> + refcount_inc(&u->__count);
> return u;
> }
> extern void free_uid(struct user_struct *);
> diff --git a/kernel/user.c b/kernel/user.c
> index 36288d840675..5f65ef195259 100644
> --- a/kernel/user.c
> +++ b/kernel/user.c
> @@ -96,7 +96,7 @@ static DEFINE_SPINLOCK(uidhash_lock);
>
> /* root_user.__count is 1, for init task cred */
> struct user_struct root_user = {
> - .__count = ATOMIC_INIT(1),
> + .__count = REFCOUNT_INIT(1),
> .processes = ATOMIC_INIT(1),
> .sigpending = ATOMIC_INIT(0),
> .locked_shm = 0,
> @@ -123,7 +123,7 @@ static struct user_struct *uid_hash_find(kuid_t uid, struct hlist_head *hashent)
>
> hlist_for_each_entry(user, hashent, uidhash_node) {
> if (uid_eq(user->uid, uid)) {
> - atomic_inc(&user->__count);
> + refcount_inc(&user->__count);
> return user;
> }
> }
> @@ -170,7 +170,7 @@ void free_uid(struct user_struct *up)
> return;
>
> local_irq_save(flags);
> - if (atomic_dec_and_lock(&up->__count, &uidhash_lock))
> + if (refcount_dec_and_lock(&up->__count, &uidhash_lock))
> free_user(up, flags);
> else
> local_irq_restore(flags);
> @@ -191,7 +191,7 @@ struct user_struct *alloc_uid(kuid_t uid)
> goto out_unlock;
>
> new->uid = uid;
> - atomic_set(&new->__count, 1);
> + refcount_set(&new->__count, 1);
> ratelimit_state_init(&new->ratelimit, HZ, 100);
> ratelimit_set_flags(&new->ratelimit, RATELIMIT_MSG_ON_RELEASE);
>
> --
> 2.18.0

Sebastian

2018-07-16 22:58:44

by Andrew Morton

Subject: Re: [PATCH 3/6] bdi: Use refcount_t for reference counting instead of atomic_t

On Tue, 3 Jul 2018 22:01:38 +0200 Sebastian Andrzej Siewior <[email protected]> wrote:

> The refcount_t type and its corresponding API should be used instead of
> atomic_t when the variable is used as a reference counter. This helps to
> avoid accidental reference counter overflows that might lead to
> use-after-free situations.
>
> ...
>
> --- a/mm/backing-dev.c
> +++ b/mm/backing-dev.c
> @@ -438,10 +438,10 @@ wb_congested_get_create(struct backing_dev_info *bdi, int blkcg_id, gfp_t gfp)
> if (new_congested) {
> /* !found and storage for new one already allocated, insert */
> congested = new_congested;
> - new_congested = NULL;
> rb_link_node(&congested->rb_node, parent, node);
> rb_insert_color(&congested->rb_node, &bdi->cgwb_congested_tree);
> - goto found;
> + spin_unlock_irqrestore(&cgwb_lock, flags);
> + return congested;
> }
>
> spin_unlock_irqrestore(&cgwb_lock, flags);
> @@ -451,13 +451,13 @@ wb_congested_get_create(struct backing_dev_info *bdi, int blkcg_id, gfp_t gfp)
> if (!new_congested)
> return NULL;
>
> - atomic_set(&new_congested->refcnt, 0);
> + refcount_set(&new_congested->refcnt, 1);
> new_congested->__bdi = bdi;
> new_congested->blkcg_id = blkcg_id;
> goto retry;
>
> found:
> - atomic_inc(&congested->refcnt);
> + refcount_inc(&congested->refcnt);
> spin_unlock_irqrestore(&cgwb_lock, flags);
> kfree(new_congested);
> return congested;
>
> ...
>

I'm not sure that the restructuring of wb_congested_get_create() was
desirable and it does make the patch harder to review. But it looks
OK to me.

2018-07-17 01:25:22

by Shaohua Li

Subject: Re: [PATCH 1/6] drivers/md/raid5: Use irqsave variant of atomic_dec_and_lock()

On Mon, Jul 16, 2018 at 02:27:40PM +0200, Sebastian Andrzej Siewior wrote:
> On 2018-07-03 22:01:36 [+0200], To [email protected] wrote:
> > From: Anna-Maria Gleixner <[email protected]>
> >
> > The irqsave variant of atomic_dec_and_lock() handles the irqsave/restore
> > itself when taking/releasing the spin lock. With this variant the explicit
> > call to local_irq_save() is no longer required.
>
> Shaohua, are you okay with this?

Acked-by: Shaohua Li <[email protected]>
> > Cc: Shaohua Li <[email protected]>
> > Cc: [email protected]
> > Acked-by: Peter Zijlstra (Intel) <[email protected]>
> > Signed-off-by: Anna-Maria Gleixner <[email protected]>
> > Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
> > ---
> > drivers/md/raid5.c | 5 ++---
> > 1 file changed, 2 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> > index 2031506a0ecd..e933bff9459e 100644
> > --- a/drivers/md/raid5.c
> > +++ b/drivers/md/raid5.c
> > @@ -409,16 +409,15 @@ void raid5_release_stripe(struct stripe_head *sh)
> > md_wakeup_thread(conf->mddev->thread);
> > return;
> > slow_path:
> > - local_irq_save(flags);
> > /* we are ok here if STRIPE_ON_RELEASE_LIST is set or not */
> > - if (atomic_dec_and_lock(&sh->count, &conf->device_lock)) {
> > + if (atomic_dec_and_lock_irqsave(&sh->count, &conf->device_lock, flags)) {
> > INIT_LIST_HEAD(&list);
> > hash = sh->hash_lock_index;
> > do_release_stripe(conf, sh, &list);
> > spin_unlock(&conf->device_lock);
> > release_inactive_stripe_list(conf, &list, hash);
> > + local_irq_restore(flags);
> > }
> > - local_irq_restore(flags);
> > }
> >
> > static inline void remove_hash(struct stripe_head *sh)
> > --
> > 2.18.0
> >

Subject: Re: [PATCH 3/6] bdi: Use refcount_t for reference counting instead of atomic_t

On 2018-07-16 15:57:16 [-0700], Andrew Morton wrote:
> > --- a/mm/backing-dev.c
> > +++ b/mm/backing-dev.c
> > @@ -438,10 +438,10 @@ wb_congested_get_create(struct backing_dev_info *bdi, int blkcg_id, gfp_t gfp)
> > if (new_congested) {
> > /* !found and storage for new one already allocated, insert */
> > congested = new_congested;
> > - new_congested = NULL;
> > rb_link_node(&congested->rb_node, parent, node);
> > rb_insert_color(&congested->rb_node, &bdi->cgwb_congested_tree);
> > - goto found;
> > + spin_unlock_irqrestore(&cgwb_lock, flags);
> > + return congested;
> > }
> >
> > spin_unlock_irqrestore(&cgwb_lock, flags);
> > @@ -451,13 +451,13 @@ wb_congested_get_create(struct backing_dev_info *bdi, int blkcg_id, gfp_t gfp)
> > if (!new_congested)
> > return NULL;
> >
> > - atomic_set(&new_congested->refcnt, 0);
> > + refcount_set(&new_congested->refcnt, 1);
> > new_congested->__bdi = bdi;
> > new_congested->blkcg_id = blkcg_id;
> > goto retry;
> >
> > found:
> > - atomic_inc(&congested->refcnt);
> > + refcount_inc(&congested->refcnt);
> > spin_unlock_irqrestore(&cgwb_lock, flags);
> > kfree(new_congested);
> > return congested;
> >
> > ...
> >
>
> I'm not sure that the restructuring of wb_congested_get_create() was
> desirable and it does make the patch harder to review. But it looks
> OK to me.

By `restructuring' you mean the addition of a return statement instead of
using the goto label in the first hunk? If so, then you would have had

refcount_set(&new_congested->refcnt, 0);
refcount_inc(&congested->refcnt);

which is a 0 -> 1 transition and is forbidden by refcount_t. So I had to
avoid this one.

Thank you for applying the patches!

You applied the bdi and userns switch from atomic_t to refcount_t.
There were also the patches
[PATCH 4/6] bdi: Use irqsave variant of refcount_dec_and_lock()
[PATCH 6/6] userns: Use irqsave variant of refcount_dec_and_lock()

in the series which make use of the irqsave variant of
refcount_dec_and_lock(). Did you miss them by chance or skip them on
purpose?

Sebastian

Subject: Re: [PATCH 1/6] drivers/md/raid5: Use irqsave variant of atomic_dec_and_lock()

On 2018-07-16 17:37:27 [-0700], Shaohua Li wrote:
> On Mon, Jul 16, 2018 at 02:27:40PM +0200, Sebastian Andrzej Siewior wrote:
> > On 2018-07-03 22:01:36 [+0200], To [email protected] wrote:
> > > From: Anna-Maria Gleixner <[email protected]>
> > >
> > > The irqsave variant of atomic_dec_and_lock() handles the irqsave/restore
> > > itself when taking/releasing the spin lock. With this variant the explicit
> > > call to local_irq_save() is no longer required.
> >
> > Shaohua, are you okay with this?
>
> Acked-by: Shaohua Li <[email protected]>

Thank you.
Is there a reason why you did not apply it? Or are you too busy right now
and will do it later (with the ack just signalling that you are fine with it)?

Sebastian

2018-07-19 02:44:57

by Shaohua Li

Subject: Re: [PATCH 1/6] drivers/md/raid5: Use irqsave variant of atomic_dec_and_lock()

On Wed, Jul 18, 2018 at 12:57:21PM +0200, Sebastian Andrzej Siewior wrote:
> On 2018-07-16 17:37:27 [-0700], Shaohua Li wrote:
> > On Mon, Jul 16, 2018 at 02:27:40PM +0200, Sebastian Andrzej Siewior wrote:
> > > On 2018-07-03 22:01:36 [+0200], To [email protected] wrote:
> > > > From: Anna-Maria Gleixner <[email protected]>
> > > >
> > > > The irqsave variant of atomic_dec_and_lock() handles the irqsave/restore
> > > > itself when taking/releasing the spin lock. With this variant the explicit
> > > > call to local_irq_save() is no longer required.
> > >
> > > Shaohua, are you okay with this?
> >
> > Acked-by: Shaohua Li <[email protected]>
>
> Thank you.
> Is there a reason why you did not apply it or are you too busy now and
> do it later (and just signal the ack that you are fine with it)?

Since you sent a series, I suppose you want someone else to take it. But I can
take it for sure, will do soon.

Thanks,
Shaohua

Subject: Re: [PATCH 1/6] drivers/md/raid5: Use irqsave variant of atomic_dec_and_lock()

On 2018-07-18 19:44:05 [-0700], Shaohua Li wrote:
> Since you sent a series, I suppose you want someone else to take it. But I can
> take it for sure, will do soon.

I'm sorry for the misunderstanding. I tried to explain this in the cover
letter: the prerequisites are already merged, so the actual changes can now
be merged via the individual maintainers.
Thank you Shaohua.

> Thanks,
> Shaohua

Sebastian

Subject: Re: [PATCH 2/6] drivers/md/raid5: Do not disable irq on release_inactive_stripe_list() call

On 2018-07-03 22:01:37 [+0200], To [email protected] wrote:
> From: Anna-Maria Gleixner <[email protected]>
>
> There is no need to invoke release_inactive_stripe_list() with interrupts
> disabled. All call sites, except raid5_release_stripe(), unlock
> ->device_lock and enable interrupts before invoking the function.
>
> Make it consistent.

Shaohua, I've seen that you already applied 1/6. Could you also please
apply this one? This is the only remaining raid5 patch :)

> Cc: Shaohua Li <[email protected]>
> Cc: [email protected]
> Acked-by: Peter Zijlstra (Intel) <[email protected]>
> Signed-off-by: Anna-Maria Gleixner <[email protected]>
> Signed-off-by: Sebastian Andrzej Siewior <[email protected]>
> ---
> drivers/md/raid5.c | 3 +--
> 1 file changed, 1 insertion(+), 2 deletions(-)
>
> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> index e933bff9459e..ca1dd0cb04c5 100644
> --- a/drivers/md/raid5.c
> +++ b/drivers/md/raid5.c
> @@ -414,9 +414,8 @@ void raid5_release_stripe(struct stripe_head *sh)
> INIT_LIST_HEAD(&list);
> hash = sh->hash_lock_index;
> do_release_stripe(conf, sh, &list);
> - spin_unlock(&conf->device_lock);
> + spin_unlock_irqrestore(&conf->device_lock, flags);
> release_inactive_stripe_list(conf, &list, hash);
> - local_irq_restore(flags);
> }
> }
>

Sebastian