Hi everyone,
This series contains a cleanup patch to remove the unneeded swap_cache_info
statistics, and a bugfix patch to avoid data races on inuse_pages. More
details can be found in the respective changelogs. Thanks!
---
v3:
  rebase on linux-next-20220624
  drop patch 1/3 per Huang, Ying
  collect Reviewed-by tags per David, Muchun and Acked-by tag per Huang, Ying
  use WRITE_ONCE to pair with READ_ONCE in patch 2/3
v2:
  collect Reviewed-by tag per David
  drop patch "mm/swapfile: avoid confusing swap cache statistics"
  add a new patch to remove swap_cache_info statistics per David
  Many thanks to David for the review and comments.
---
Miaohe Lin (2):
  mm/swapfile: fix possible data races of inuse_pages
  mm/swap: remove swap_cache_info statistics

 mm/swap_state.c | 17 -----------------
 mm/swapfile.c   |  8 ++++----
 2 files changed, 4 insertions(+), 21 deletions(-)
--
2.23.0
si->inuse_pages can still be accessed concurrently. swap_show() and
si_swapinfo() read it with plain loads outside the si->lock critical
section, which results in data races. Use READ_ONCE() and WRITE_ONCE()
to fix these data races. Note that the races should be harmless, because
the value is only used for showing swap info.
Signed-off-by: Miaohe Lin <[email protected]>
Reviewed-by: David Hildenbrand <[email protected]>
---
 mm/swapfile.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/mm/swapfile.c b/mm/swapfile.c
index edc3420d30e7..5c8681a3f1d9 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -695,7 +695,7 @@ static void swap_range_alloc(struct swap_info_struct *si, unsigned long offset,
 		si->lowest_bit += nr_entries;
 	if (end == si->highest_bit)
 		WRITE_ONCE(si->highest_bit, si->highest_bit - nr_entries);
-	si->inuse_pages += nr_entries;
+	WRITE_ONCE(si->inuse_pages, si->inuse_pages + nr_entries);
 	if (si->inuse_pages == si->pages) {
 		si->lowest_bit = si->max;
 		si->highest_bit = 0;
@@ -732,7 +732,7 @@ static void swap_range_free(struct swap_info_struct *si, unsigned long offset,
 			add_to_avail_list(si);
 	}
 	atomic_long_add(nr_entries, &nr_swap_pages);
-	si->inuse_pages -= nr_entries;
+	WRITE_ONCE(si->inuse_pages, si->inuse_pages - nr_entries);
 	if (si->flags & SWP_BLKDEV)
 		swap_slot_free_notify =
 			si->bdev->bd_disk->fops->swap_slot_free_notify;
@@ -2641,7 +2641,7 @@ static int swap_show(struct seq_file *swap, void *v)
 	}

 	bytes = si->pages << (PAGE_SHIFT - 10);
-	inuse = si->inuse_pages << (PAGE_SHIFT - 10);
+	inuse = READ_ONCE(si->inuse_pages) << (PAGE_SHIFT - 10);

 	file = si->swap_file;
 	len = seq_file_path(swap, file, " \t\n\\");
@@ -3260,7 +3260,7 @@ void si_swapinfo(struct sysinfo *val)
 		struct swap_info_struct *si = swap_info[type];

 		if ((si->flags & SWP_USED) && !(si->flags & SWP_WRITEOK))
-			nr_to_be_unused += si->inuse_pages;
+			nr_to_be_unused += READ_ONCE(si->inuse_pages);
 	}
 	val->freeswap = atomic_long_read(&nr_swap_pages) + nr_to_be_unused;
 	val->totalswap = total_swap_pages + nr_to_be_unused;
--
2.23.0
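
For readers less familiar with the pattern in the patch above, here is a
minimal userspace sketch of it: writers update a counter under a lock and
mark the store with WRITE_ONCE(), while a reporting path reads it without
the lock through READ_ONCE(). Everything in it is an assumption made for
illustration and not kernel code: the READ_ONCE()/WRITE_ONCE() macros are
simplified volatile-cast stand-ins for the kernel's, struct
swap_info_sketch and the sketch_*() helpers are hypothetical names, a
pthread mutex plays the role of si->lock, and PAGE_SHIFT is assumed to be
12 (4 KiB pages).

```c
#include <assert.h>
#include <pthread.h>

/* Simplified stand-ins for the kernel macros (assumption, not the real ones). */
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

#define PAGE_SHIFT 12	/* assumed: 4 KiB pages */

/* Hypothetical, cut-down analogue of struct swap_info_struct. */
struct swap_info_sketch {
	pthread_mutex_t lock;		/* plays the role of si->lock */
	unsigned int pages;		/* total usable pages */
	unsigned int inuse_pages;	/* written under lock, read locklessly */
};

static void sketch_init(struct swap_info_sketch *si, unsigned int pages)
{
	pthread_mutex_init(&si->lock, NULL);
	si->pages = pages;
	si->inuse_pages = 0;
}

/* Writer side: the update is serialized by the lock and marked for readers. */
static void sketch_range_alloc(struct swap_info_sketch *si, unsigned int nr_entries)
{
	pthread_mutex_lock(&si->lock);
	assert(si->inuse_pages + nr_entries <= si->pages);
	/* The plain read is fine here: all writers hold the lock. */
	WRITE_ONCE(si->inuse_pages, si->inuse_pages + nr_entries);
	pthread_mutex_unlock(&si->lock);
}

static void sketch_range_free(struct swap_info_sketch *si, unsigned int nr_entries)
{
	pthread_mutex_lock(&si->lock);
	assert(nr_entries <= si->inuse_pages);
	WRITE_ONCE(si->inuse_pages, si->inuse_pages - nr_entries);
	pthread_mutex_unlock(&si->lock);
}

/* Reader side: no lock taken, so the load is marked with READ_ONCE(). */
static unsigned long sketch_inuse_kib(struct swap_info_sketch *si)
{
	/* Pages to KiB, mirroring the shift by (PAGE_SHIFT - 10) in swap_show(). */
	return (unsigned long)READ_ONCE(si->inuse_pages) << (PAGE_SHIFT - 10);
}
```

A lockless reader may still observe a slightly stale value, which matches
the commit message's point that the number is only used for reporting; the
marked accesses just keep the compiler from tearing or refetching the load
and store.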
Miaohe Lin <[email protected]> writes:
> si->inuse_pages can still be accessed concurrently. swap_show() and
> si_swapinfo() read it with plain loads outside the si->lock critical
> section, which results in data races. Use READ_ONCE() and WRITE_ONCE()
> to fix these data races. Note that the races should be harmless, because
> the value is only used for showing swap info.
>
> Signed-off-by: Miaohe Lin <[email protected]>
> Reviewed-by: David Hildenbrand <[email protected]>
Reviewed-by: "Huang, Ying" <[email protected]>
Thanks!
Best Regards,
Huang, Ying
> ---
>  mm/swapfile.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/mm/swapfile.c b/mm/swapfile.c
> index edc3420d30e7..5c8681a3f1d9 100644
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -695,7 +695,7 @@ static void swap_range_alloc(struct swap_info_struct *si, unsigned long offset,
>  		si->lowest_bit += nr_entries;
>  	if (end == si->highest_bit)
>  		WRITE_ONCE(si->highest_bit, si->highest_bit - nr_entries);
> -	si->inuse_pages += nr_entries;
> +	WRITE_ONCE(si->inuse_pages, si->inuse_pages + nr_entries);
>  	if (si->inuse_pages == si->pages) {
>  		si->lowest_bit = si->max;
>  		si->highest_bit = 0;
> @@ -732,7 +732,7 @@ static void swap_range_free(struct swap_info_struct *si, unsigned long offset,
>  			add_to_avail_list(si);
>  	}
>  	atomic_long_add(nr_entries, &nr_swap_pages);
> -	si->inuse_pages -= nr_entries;
> +	WRITE_ONCE(si->inuse_pages, si->inuse_pages - nr_entries);
>  	if (si->flags & SWP_BLKDEV)
>  		swap_slot_free_notify =
>  			si->bdev->bd_disk->fops->swap_slot_free_notify;
> @@ -2641,7 +2641,7 @@ static int swap_show(struct seq_file *swap, void *v)
>  	}
>
>  	bytes = si->pages << (PAGE_SHIFT - 10);
> -	inuse = si->inuse_pages << (PAGE_SHIFT - 10);
> +	inuse = READ_ONCE(si->inuse_pages) << (PAGE_SHIFT - 10);
>
>  	file = si->swap_file;
>  	len = seq_file_path(swap, file, " \t\n\\");
> @@ -3260,7 +3260,7 @@ void si_swapinfo(struct sysinfo *val)
>  		struct swap_info_struct *si = swap_info[type];
>
>  		if ((si->flags & SWP_USED) && !(si->flags & SWP_WRITEOK))
> -			nr_to_be_unused += si->inuse_pages;
> +			nr_to_be_unused += READ_ONCE(si->inuse_pages);
>  	}
>  	val->freeswap = atomic_long_read(&nr_swap_pages) + nr_to_be_unused;
>  	val->totalswap = total_swap_pages + nr_to_be_unused;
On Sat, Jun 25, 2022 at 05:33:45PM +0800, Miaohe Lin wrote:
> si->inuse_pages can still be accessed concurrently. swap_show() and
> si_swapinfo() read it with plain loads outside the si->lock critical
> section, which results in data races. Use READ_ONCE() and WRITE_ONCE()
> to fix these data races. Note that the races should be harmless, because
> the value is only used for showing swap info.
Was this found by KCSAN? If so, it would be useful to record the exact
KCSAN report in the commit message.
On Mon, Jun 27, 2022 at 09:27:43PM +0800, Miaohe Lin wrote:
> On 2022/6/27 20:43, Qian Cai wrote:
> > On Sat, Jun 25, 2022 at 05:33:45PM +0800, Miaohe Lin wrote:
> >> si->inuse_pages can still be accessed concurrently. swap_show() and
> >> si_swapinfo() read it with plain loads outside the si->lock critical
> >> section, which results in data races. Use READ_ONCE() and WRITE_ONCE()
> >> to fix these data races. Note that the races should be harmless, because
> >> the value is only used for showing swap info.
> >
> > Was this found by KCSAN? If so, it would be useful to record the exact
> > KCSAN report in the commit message.
>
> Sorry, it was found via code inspection.
Well, if we are going to use WRITE_ONCE() in those places just for
documentation purposes, I think we will need to fix all such places in the
mm subsystem to be consistent.
On 2022/6/27 20:43, Qian Cai wrote:
> On Sat, Jun 25, 2022 at 05:33:45PM +0800, Miaohe Lin wrote:
>> si->inuse_pages can still be accessed concurrently. swap_show() and
>> si_swapinfo() read it with plain loads outside the si->lock critical
>> section, which results in data races. Use READ_ONCE() and WRITE_ONCE()
>> to fix these data races. Note that the races should be harmless, because
>> the value is only used for showing swap info.
>
> Was this found by KCSAN? If so, it would be useful to record the exact
> KCSAN report in the commit message.
Sorry, it was found via code inspection.
Thanks.
Qian Cai <[email protected]> writes:
> On Mon, Jun 27, 2022 at 09:27:43PM +0800, Miaohe Lin wrote:
>> On 2022/6/27 20:43, Qian Cai wrote:
>> > On Sat, Jun 25, 2022 at 05:33:45PM +0800, Miaohe Lin wrote:
>> >> si->inuse_pages can still be accessed concurrently. swap_show() and
>> >> si_swapinfo() read it with plain loads outside the si->lock critical
>> >> section, which results in data races. Use READ_ONCE() and WRITE_ONCE()
>> >> to fix these data races. Note that the races should be harmless, because
>> >> the value is only used for showing swap info.
>> >
>> > Was this found by KCSAN? If so, it would be useful to record the exact
>> > KCSAN report in the commit message.
>>
>> Sorry, it was found via code inspection.
>
> Well, if we are going to use WRITE_ONCE() in those places just for
> documentation purposes, I think we will need to fix all such places in the
> mm subsystem to be consistent.
We have already done this in swapfile.c; please search for "WRITE_ONCE"
in that file.
Best Regards,
Huang, Ying