2024-04-07 06:55:16

by Huang, Ying

Subject: [PATCH] mm,swap: add document about RCU read lock and swapoff interaction

While reviewing a patch that fixes the race condition between
free_swap_and_cache() and swapoff() [1], it was found that the
documentation on how to prevent races with swapoff isn't clear enough.
In particular, it isn't clear that holding the RCU read lock prevents
swapoff from freeing data structures. So, add that documentation as
comments.

[1] https://lore.kernel.org/linux-mm/[email protected]/

Signed-off-by: "Huang, Ying" <[email protected]>
Cc: Ryan Roberts <[email protected]>
Cc: David Hildenbrand <[email protected]>
Cc: Miaohe Lin <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Minchan Kim <[email protected]>
---
 mm/swapfile.c | 26 +++++++++++++-------------
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/mm/swapfile.c b/mm/swapfile.c
index 4919423cce76..6925462406fa 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -1226,16 +1226,15 @@ static unsigned char __swap_entry_free_locked(struct swap_info_struct *p,

 /*
  * When we get a swap entry, if there aren't some other ways to
- * prevent swapoff, such as the folio in swap cache is locked, page
- * table lock is held, etc., the swap entry may become invalid because
- * of swapoff. Then, we need to enclose all swap related functions
- * with get_swap_device() and put_swap_device(), unless the swap
- * functions call get/put_swap_device() by themselves.
+ * prevent swapoff, such as the folio in swap cache is locked, RCU
+ * reader side is locked, etc., the swap entry may become invalid
+ * because of swapoff. Then, we need to enclose all swap related
+ * functions with get_swap_device() and put_swap_device(), unless the
+ * swap functions call get/put_swap_device() by themselves.
  *
- * Note that when only holding the PTL, swapoff might succeed immediately
- * after freeing a swap entry. Therefore, immediately after
- * __swap_entry_free(), the swap info might become stale and should not
- * be touched without a prior get_swap_device().
+ * RCU reader side lock (including any spinlock) is sufficient to
+ * prevent swapoff, because synchronize_rcu() is called in swapoff()
+ * before freeing data structures.
  *
  * Check whether swap entry is valid in the swap device. If so,
  * return pointer to swap_info_struct, and keep the swap entry valid
@@ -2495,10 +2494,11 @@ SYSCALL_DEFINE1(swapoff, const char __user *, specialfile)

 	/*
 	 * Wait for swap operations protected by get/put_swap_device()
-	 * to complete.
-	 *
-	 * We need synchronize_rcu() here to protect the accessing to
-	 * the swap cache data structure.
+	 * to complete. Because of synchronize_rcu() here, all swap
+	 * operations protected by RCU reader side lock (including any
+	 * spinlock) will be waited too. This makes it easy to
+	 * prevent folio_test_swapcache() and the following swap cache
+	 * operations from racing with swapoff.
 	 */
 	percpu_ref_kill(&p->users);
 	synchronize_rcu();
--
2.39.2
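
The rule described in the two comments can be modelled in user space.
What follows is only a rough, single-threaded sketch under loud
assumptions: every toy_* name is hypothetical and none of this is
kernel code. The "alive" flag and "users" counter stand in for the
percpu_ref behind get_swap_device()/put_swap_device(), and the
"rcu_readers" counter stands in for RCU read-side critical sections.
Real RCU is global rather than per device, synchronize_rcu() only
waits for readers that started before it was called, and the real
swapoff() blocks instead of returning false.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for swap_info_struct; all names are hypothetical. */
struct toy_swap_dev {
	bool alive;      /* cleared when swapoff starts, like percpu_ref_kill() */
	long users;      /* references taken through toy_get_dev() */
	int rcu_readers; /* readers currently inside a read-side section */
	bool freed;      /* set once the data structures are released */
};

/* Models get_swap_device(): fails once swapoff has started. */
static struct toy_swap_dev *toy_get_dev(struct toy_swap_dev *d)
{
	if (!d->alive)
		return NULL;
	d->users++;
	return d;
}

/* Models put_swap_device(). */
static void toy_put_dev(struct toy_swap_dev *d)
{
	assert(d->users > 0);
	d->users--;
}

/* Models entering a read-side section (rcu_read_lock() or a spinlock). */
static void toy_rcu_read_lock(struct toy_swap_dev *d)
{
	d->rcu_readers++;
}

/* Models leaving the read-side section. */
static void toy_rcu_read_unlock(struct toy_swap_dev *d)
{
	assert(d->rcu_readers > 0);
	d->rcu_readers--;
}

/* First step of the modelled swapoff: refuse new references. */
static void toy_swapoff_begin(struct toy_swap_dev *d)
{
	d->alive = false;
}

/*
 * Second step: free only when no reference holder and no read-side
 * section remains. Returns true if the structures were freed.
 */
static bool toy_swapoff_try_free(struct toy_swap_dev *d)
{
	if (d->alive || d->users > 0 || d->rcu_readers > 0)
		return false;
	d->freed = true;
	return true;
}
```

In this model, a caller that holds either a reference from
toy_get_dev() or an open read-side section keeps
toy_swapoff_try_free() from succeeding, which is the property the new
comments describe for get_swap_device() and the RCU reader side lock.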



2024-04-08 07:23:48

by Ryan Roberts

Subject: Re: [PATCH] mm,swap: add document about RCU read lock and swapoff interaction

On 07/04/2024 07:54, Huang Ying wrote:
> While reviewing a patch that fixes the race condition between
> free_swap_and_cache() and swapoff() [1], it was found that the
> documentation on how to prevent races with swapoff isn't clear enough.
> In particular, it isn't clear that holding the RCU read lock prevents
> swapoff from freeing data structures. So, add that documentation as
> comments.
>
> [1] https://lore.kernel.org/linux-mm/[email protected]/
>
> Signed-off-by: "Huang, Ying" <[email protected]>
> Cc: Ryan Roberts <[email protected]>
> Cc: David Hildenbrand <[email protected]>
> Cc: Miaohe Lin <[email protected]>
> Cc: Hugh Dickins <[email protected]>
> Cc: Minchan Kim <[email protected]>

LGTM!

Reviewed-by: Ryan Roberts <[email protected]>




2024-04-08 07:45:25

by David Hildenbrand

Subject: Re: [PATCH] mm,swap: add document about RCU read lock and swapoff interaction

On 07.04.24 08:54, Huang Ying wrote:
> While reviewing a patch that fixes the race condition between
> free_swap_and_cache() and swapoff() [1], it was found that the
> documentation on how to prevent races with swapoff isn't clear enough.
> In particular, it isn't clear that holding the RCU read lock prevents
> swapoff from freeing data structures. So, add that documentation as
> comments.
>
> [1] https://lore.kernel.org/linux-mm/[email protected]/
>
> Signed-off-by: "Huang, Ying" <[email protected]>
> Cc: Ryan Roberts <[email protected]>
> Cc: David Hildenbrand <[email protected]>
> Cc: Miaohe Lin <[email protected]>
> Cc: Hugh Dickins <[email protected]>
> Cc: Minchan Kim <[email protected]>

Reviewed-by: David Hildenbrand <[email protected]>

--
Cheers,

David / dhildenb


2024-04-10 07:59:06

by Miaohe Lin

Subject: Re: [PATCH] mm,swap: add document about RCU read lock and swapoff interaction

On 2024/4/7 14:54, Huang Ying wrote:
> While reviewing a patch that fixes the race condition between
> free_swap_and_cache() and swapoff() [1], it was found that the
> documentation on how to prevent races with swapoff isn't clear enough.
> In particular, it isn't clear that holding the RCU read lock prevents
> swapoff from freeing data structures. So, add that documentation as
> comments.
>
> [1] https://lore.kernel.org/linux-mm/[email protected]/
>
> Signed-off-by: "Huang, Ying" <[email protected]>
> Cc: Ryan Roberts <[email protected]>
> Cc: David Hildenbrand <[email protected]>
> Cc: Miaohe Lin <[email protected]>
> Cc: Hugh Dickins <[email protected]>
> Cc: Minchan Kim <[email protected]>

Thanks for your work.

Reviewed-by: Miaohe Lin <[email protected]>
