2020-05-01 01:55:04

by Wei Yang

[permalink] [raw]
Subject: [PATCH 1/3] mm/swapfile.c: classify SWAP_MAP_XXX to make it more readable

swap_info_struct->swap_map[] encodes some flag and count. And to
do some condition check, it also introduces some special values.

Currently those macros are defined with some magic order, which makes
audience hard to understand the exact meaning.

This patch split those macros into three categories:

flag
special value for first swap_map
special value for continued swap_map

May this help audiences a little.

Signed-off-by: Wei Yang <[email protected]>
---
include/linux/swap.h | 15 ++++++++++-----
1 file changed, 10 insertions(+), 5 deletions(-)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index 1e99f7ac1d7e..ec507c67529b 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -183,12 +183,17 @@ enum {
#define SWAP_CLUSTER_MAX 32UL
#define COMPACT_CLUSTER_MAX SWAP_CLUSTER_MAX

-#define SWAP_MAP_MAX 0x3e /* Max duplication count, in first swap_map */
-#define SWAP_MAP_BAD 0x3f /* Note pageblock is bad, in first swap_map */
+/* Bit flag in swap_map */
#define SWAP_HAS_CACHE 0x40 /* Flag page is cached, in first swap_map */
-#define SWAP_CONT_MAX 0x7f /* Max count, in each swap_map continuation */
-#define COUNT_CONTINUED 0x80 /* See swap_map continuation for full count */
-#define SWAP_MAP_SHMEM 0xbf /* Owned by shmem/tmpfs, in first swap_map */
+#define COUNT_CONTINUED 0x80 /* Flag swap_map continuation for full count */
+
+/* Special Value in first swap_map */
+#define SWAP_MAP_MAX 0x3e /* Max count */
+#define SWAP_MAP_BAD 0x3f /* Note page is bad */
+#define SWAP_MAP_SHMEM 0xbf /* Owned by shmem/tmpfs */
+
+/* Special Value in each swap_map continuation */
+#define SWAP_CONT_MAX 0x7f /* Max count */

/*
* We use this to track usage of a cluster. A cluster is a block of swap disk
--
2.23.0


2020-05-01 01:55:04

by Wei Yang

[permalink] [raw]
Subject: [PATCH 2/3] mm/swapfile.c: __swap_entry_free() always free 1 entry

__swap_entry_free() always free 1 entry, let's remove the usage.

Signed-off-by: Wei Yang <[email protected]>
---
mm/swapfile.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/mm/swapfile.c b/mm/swapfile.c
index f73e0c11fab9..1a877d1d40e3 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -1251,13 +1251,14 @@ struct swap_info_struct *get_swap_device(swp_entry_t entry)
}

static unsigned char __swap_entry_free(struct swap_info_struct *p,
- swp_entry_t entry, unsigned char usage)
+ swp_entry_t entry)
{
struct swap_cluster_info *ci;
unsigned long offset = swp_offset(entry);
+ unsigned char usage;

ci = lock_cluster_or_swap_info(p, offset);
- usage = __swap_entry_free_locked(p, offset, usage);
+ usage = __swap_entry_free_locked(p, offset, 1);
unlock_cluster_or_swap_info(p, ci);
if (!usage)
free_swap_slot(entry);
@@ -1292,7 +1293,7 @@ void swap_free(swp_entry_t entry)

p = _swap_info_get(entry);
if (p)
- __swap_entry_free(p, entry, 1);
+ __swap_entry_free(p, entry);
}

/*
@@ -1715,7 +1716,7 @@ int free_swap_and_cache(swp_entry_t entry)

p = _swap_info_get(entry);
if (p) {
- count = __swap_entry_free(p, entry, 1);
+ count = __swap_entry_free(p, entry);
if (count == SWAP_HAS_CACHE &&
!swap_page_trans_huge_swapped(p, entry))
__try_to_reclaim_swap(p, swp_offset(entry),
--
2.23.0

2020-05-01 01:57:57

by Wei Yang

[permalink] [raw]
Subject: [PATCH 3/3] mm/swapfile.c: count won't be bigger than SWAP_MAP_MAX

When the condition is true, there are two possibilities:

1. count == SWAP_MAP_BAD
2. count == (SWAP_MAP_MAX & COUNT_CONTINUED) == SWAP_MAP_SHMEM

The first case would be filtered by the first if in __swap_duplicate().

And the second case means this swap entry is for shmem. Since we never
do another duplication for shmem swap entry. This won't happen neither.

Signed-off-by: Wei Yang <[email protected]>
---
mm/swapfile.c | 2 --
1 file changed, 2 deletions(-)

diff --git a/mm/swapfile.c b/mm/swapfile.c
index 1a877d1d40e3..88dd2ad34aad 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -3404,8 +3404,6 @@ static int __swap_duplicate(swp_entry_t entry, unsigned char usage)

if ((count & ~COUNT_CONTINUED) < SWAP_MAP_MAX)
count += usage;
- else if ((count & ~COUNT_CONTINUED) > SWAP_MAP_MAX)
- err = -EINVAL;
else if (swap_count_continued(p, offset, count))
count = COUNT_CONTINUED;
else
--
2.23.0

2020-05-01 22:50:44

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH 3/3] mm/swapfile.c: count won't be bigger than SWAP_MAP_MAX

On Fri, 1 May 2020 01:52:59 +0000 Wei Yang <[email protected]> wrote:

> When the condition is true, there are two possibilities:

I'm struggling with this one.

> 1. count == SWAP_MAP_BAD
> 2. count == (SWAP_MAP_MAX & COUNT_CONTINUED) == SWAP_MAP_SHMEM

I'm not sure what 2. is trying to say. For a start, (SWAP_MAP_MAX &
COUNT_CONTINUED) is zero. I guess it meant "|"?

Also, the return value documentation says we return EINVAL for migration
entries. Where's that happening, or is the comment out of date?

> The first case would be filtered by the first if in __swap_duplicate().
>
> And the second case means this swap entry is for shmem. Since we never
> do another duplication for shmem swap entry. This won't happen neither.


2020-05-02 13:31:00

by Wei Yang

[permalink] [raw]
Subject: Re: [PATCH 3/3] mm/swapfile.c: count won't be bigger than SWAP_MAP_MAX

On Fri, May 01, 2020 at 03:48:53PM -0700, Andrew Morton wrote:
>On Fri, 1 May 2020 01:52:59 +0000 Wei Yang <[email protected]> wrote:
>
>> When the condition is true, there are two possibilities:
>
>I'm struggling with this one.
>
>> 1. count == SWAP_MAP_BAD
>> 2. count == (SWAP_MAP_MAX & COUNT_CONTINUED) == SWAP_MAP_SHMEM
>
>I'm not sure what 2. is trying to say. For a start, (SWAP_MAP_MAX &
>COUNT_CONTINUED) is zero. I guess it meant "|"?

Oops, you are right. It should be (SWAP_MAP_MAX | COUNT_CONTINUED).

Sorry for the confusion.

>
>Also, the return value documentation says we return EINVAL for migration
>entries. Where's that happening, or is the comment out of date?
>

Not paid attention to this.

Take look into the code, I don't find a relationship between the swap count
and migration. Seems we just make a migration entry but not duplicate it.
If my understanding is correct.

>> The first case would be filtered by the first if in __swap_duplicate().
>>
>> And the second case means this swap entry is for shmem. Since we never
>> do another duplication for shmem swap entry. This won't happen neither.
>

--
Wei Yang
Help you, Help me

2020-05-02 13:43:34

by Wei Yang

[permalink] [raw]
Subject: Re: [PATCH 3/3] mm/swapfile.c: count won't be bigger than SWAP_MAP_MAX

On Sat, May 02, 2020 at 01:29:11PM +0000, Wei Yang wrote:
>On Fri, May 01, 2020 at 03:48:53PM -0700, Andrew Morton wrote:
>>On Fri, 1 May 2020 01:52:59 +0000 Wei Yang <[email protected]> wrote:
>>
>>> When the condition is true, there are two possibilities:
>>
>>I'm struggling with this one.
>>
>>> 1. count == SWAP_MAP_BAD
>>> 2. count == (SWAP_MAP_MAX & COUNT_CONTINUED) == SWAP_MAP_SHMEM
>>
>>I'm not sure what 2. is trying to say. For a start, (SWAP_MAP_MAX &
>>COUNT_CONTINUED) is zero. I guess it meant "|"?
>
>Oops, you are right. It should be (SWAP_MAP_MAX | COUNT_CONTINUED).
>
>Sorry for the confusion.
>

Hmm... I made a mistake again, the two cases should be

* SWAP_MAP_BAD
* (SWAP_MAP_BAD | COUNT_CONTINUED) == SWAP_MAP_SHMEM

What a shame.

>>
>>Also, the return value documentation says we return EINVAL for migration
>>entries. Where's that happening, or is the comment out of date?
>>
>
>Not paid attention to this.
>
>Take look into the code, I don't find a relationship between the swap count
>and migration. Seems we just make a migration entry but not duplicate it.
>If my understanding is correct.
>
>>> The first case would be filtered by the first if in __swap_duplicate().
>>>
>>> And the second case means this swap entry is for shmem. Since we never
>>> do another duplication for shmem swap entry. This won't happen neither.
>>
>
>--
>Wei Yang
>Help you, Help me

--
Wei Yang
Help you, Help me

2020-05-06 08:24:50

by Huang Ying

[permalink] [raw]
Subject: Re: [PATCH 3/3] mm/swapfile.c: count won't be bigger than SWAP_MAP_MAX

Wei Yang <[email protected]> writes:

> On Fri, May 01, 2020 at 03:48:53PM -0700, Andrew Morton wrote:
>>On Fri, 1 May 2020 01:52:59 +0000 Wei Yang <[email protected]> wrote:
>>
>>> When the condition is true, there are two possibilities:
>>
>>I'm struggling with this one.
>>
>>> 1. count == SWAP_MAP_BAD
>>> 2. count == (SWAP_MAP_MAX & COUNT_CONTINUED) == SWAP_MAP_SHMEM
>>
>>I'm not sure what 2. is trying to say. For a start, (SWAP_MAP_MAX &
>>COUNT_CONTINUED) is zero. I guess it meant "|"?
>
> Oops, you are right. It should be (SWAP_MAP_MAX | COUNT_CONTINUED).
>
> Sorry for the confusion.
>
>>
>>Also, the return value documentation says we return EINVAL for migration
>>entries. Where's that happening, or is the comment out of date?
>>
>
> Not paid attention to this.
>
> Take look into the code, I don't find a relationship between the swap count
> and migration. Seems we just make a migration entry but not duplicate it.
> If my understanding is correct.

Per my understanding, one functionality of the error path is to catch
the behavior that shouldn't happen at all. For example, if
__swap_duplicate() is called for the migration entry because of some
race condition.

Best Regards,
Huang, Ying

2020-05-07 22:23:43

by Wei Yang

[permalink] [raw]
Subject: Re: [PATCH 3/3] mm/swapfile.c: count won't be bigger than SWAP_MAP_MAX

On Wed, May 06, 2020 at 04:22:54PM +0800, Huang, Ying wrote:
>Wei Yang <[email protected]> writes:
>
>> On Fri, May 01, 2020 at 03:48:53PM -0700, Andrew Morton wrote:
>>>On Fri, 1 May 2020 01:52:59 +0000 Wei Yang <[email protected]> wrote:
>>>
>>>> When the condition is true, there are two possibilities:
>>>
>>>I'm struggling with this one.
>>>
>>>> 1. count == SWAP_MAP_BAD
>>>> 2. count == (SWAP_MAP_MAX & COUNT_CONTINUED) == SWAP_MAP_SHMEM
>>>
>>>I'm not sure what 2. is trying to say. For a start, (SWAP_MAP_MAX &
>>>COUNT_CONTINUED) is zero. I guess it meant "|"?
>>
>> Oops, you are right. It should be (SWAP_MAP_MAX | COUNT_CONTINUED).
>>
>> Sorry for the confusion.
>>
>>>
>>>Also, the return value documentation says we return EINVAL for migration
>>>entries. Where's that happening, or is the comment out of date?
>>>
>>
>> Not paid attention to this.
>>
>> Take look into the code, I don't find a relationship between the swap count
>> and migration. Seems we just make a migration entry but not duplicate it.
>> If my understanding is correct.
>
>Per my understanding, one functionality of the error path is to catch
>the behavior that shouldn't happen at all. For example, if
>__swap_duplicate() is called for the migration entry because of some
>race condition.
>

If __swap_duplicate() run for a migration entry, it returns since
get_swap_entry() couldn't find a swap_info_struct. So the return value is
-EINVAL.

While when this situation would happen? And the race condition you mean is?

>Best Regards,
>Huang, Ying

--
Wei Yang
Help you, Help me

2020-05-07 23:51:55

by Huang Ying

[permalink] [raw]
Subject: Re: [PATCH 3/3] mm/swapfile.c: count won't be bigger than SWAP_MAP_MAX

Wei Yang <[email protected]> writes:

> On Wed, May 06, 2020 at 04:22:54PM +0800, Huang, Ying wrote:
>>Wei Yang <[email protected]> writes:
>>
>>> On Fri, May 01, 2020 at 03:48:53PM -0700, Andrew Morton wrote:
>>>>On Fri, 1 May 2020 01:52:59 +0000 Wei Yang <[email protected]> wrote:
>>>>
>>>>> When the condition is true, there are two possibilities:
>>>>
>>>>I'm struggling with this one.
>>>>
>>>>> 1. count == SWAP_MAP_BAD
>>>>> 2. count == (SWAP_MAP_MAX & COUNT_CONTINUED) == SWAP_MAP_SHMEM
>>>>
>>>>I'm not sure what 2. is trying to say. For a start, (SWAP_MAP_MAX &
>>>>COUNT_CONTINUED) is zero. I guess it meant "|"?
>>>
>>> Oops, you are right. It should be (SWAP_MAP_MAX | COUNT_CONTINUED).
>>>
>>> Sorry for the confusion.
>>>
>>>>
>>>>Also, the return value documentation says we return EINVAL for migration
>>>>entries. Where's that happening, or is the comment out of date?
>>>>
>>>
>>> Not paid attention to this.
>>>
>>> Take look into the code, I don't find a relationship between the swap count
>>> and migration. Seems we just make a migration entry but not duplicate it.
>>> If my understanding is correct.
>>
>>Per my understanding, one functionality of the error path is to catch
>>the behavior that shouldn't happen at all. For example, if
>>__swap_duplicate() is called for the migration entry because of some
>>race condition.
>>
>
> If __swap_duplicate() run for a migration entry, it returns since
> get_swap_entry() couldn't find a swap_info_struct. So the return value is
> -EINVAL.
>
> While when this situation would happen? And the race condition you mean is?

Sorry for confusing. I don't mean there are some known race conditions
in current kernel that will trigger the error code path. I mean we may
use the error path to identify some race conditions in the future.

I remember that Matthew thought that the swap code should work
reasonably even for garbage PTE.

Best Regards,
Huang, Ying

>>Best Regards,
>>Huang, Ying

2020-05-08 21:21:54

by Wei Yang

[permalink] [raw]
Subject: Re: [PATCH 3/3] mm/swapfile.c: count won't be bigger than SWAP_MAP_MAX

On Fri, May 08, 2020 at 07:48:01AM +0800, Huang, Ying wrote:
>Wei Yang <[email protected]> writes:
>
>> On Wed, May 06, 2020 at 04:22:54PM +0800, Huang, Ying wrote:
>>>Wei Yang <[email protected]> writes:
>>>
>>>> On Fri, May 01, 2020 at 03:48:53PM -0700, Andrew Morton wrote:
>>>>>On Fri, 1 May 2020 01:52:59 +0000 Wei Yang <[email protected]> wrote:
>>>>>
>>>>>> When the condition is true, there are two possibilities:
>>>>>
>>>>>I'm struggling with this one.
>>>>>
>>>>>> 1. count == SWAP_MAP_BAD
>>>>>> 2. count == (SWAP_MAP_MAX & COUNT_CONTINUED) == SWAP_MAP_SHMEM
>>>>>
>>>>>I'm not sure what 2. is trying to say. For a start, (SWAP_MAP_MAX &
>>>>>COUNT_CONTINUED) is zero. I guess it meant "|"?
>>>>
>>>> Oops, you are right. It should be (SWAP_MAP_MAX | COUNT_CONTINUED).
>>>>
>>>> Sorry for the confusion.
>>>>
>>>>>
>>>>>Also, the return value documentation says we return EINVAL for migration
>>>>>entries. Where's that happening, or is the comment out of date?
>>>>>
>>>>
>>>> Not paid attention to this.
>>>>
>>>> Take look into the code, I don't find a relationship between the swap count
>>>> and migration. Seems we just make a migration entry but not duplicate it.
>>>> If my understanding is correct.
>>>
>>>Per my understanding, one functionality of the error path is to catch
>>>the behavior that shouldn't happen at all. For example, if
>>>__swap_duplicate() is called for the migration entry because of some
>>>race condition.
>>>
>>
>> If __swap_duplicate() run for a migration entry, it returns since
>> get_swap_entry() couldn't find a swap_info_struct. So the return value is
>> -EINVAL.
>>
>> While when this situation would happen? And the race condition you mean is?
>
>Sorry for confusing. I don't mean there are some known race conditions
>in current kernel that will trigger the error code path. I mean we may
>use the error path to identify some race conditions in the future.
>

Yep, NP.

For the code itself, do you have some comment?

>I remember that Matthew thought that the swap code should work
>reasonably even for garbage PTE.
>
>Best Regards,
>Huang, Ying
>
>>>Best Regards,
>>>Huang, Ying

--
Wei Yang
Help you, Help me