2014-04-07 18:38:01

by John Stultz

Subject: Re: [PATCH 2/5] vrange: Add purged page detection on setting memory non-volatile

On 03/23/2014 10:42 AM, KOSAKI Motohiro wrote:
> On Fri, Mar 21, 2014 at 2:17 PM, John Stultz <[email protected]> wrote:
>> Users of volatile ranges will need to know if memory was discarded.
>> This patch adds the purged state tracking required to inform userland
>> when it marks memory as non-volatile that some memory in that range
>> was purged and needs to be regenerated.
>>
>> This simplified implementation uses some of the logic from
>> Minchan's earlier efforts, so credit to Minchan for his work.
>>
>> Cc: Andrew Morton <[email protected]>
>> Cc: Android Kernel Team <[email protected]>
>> Cc: Johannes Weiner <[email protected]>
>> Cc: Robert Love <[email protected]>
>> Cc: Mel Gorman <[email protected]>
>> Cc: Hugh Dickins <[email protected]>
>> Cc: Dave Hansen <[email protected]>
>> Cc: Rik van Riel <[email protected]>
>> Cc: Dmitry Adamushko <[email protected]>
>> Cc: Neil Brown <[email protected]>
>> Cc: Andrea Arcangeli <[email protected]>
>> Cc: Mike Hommey <[email protected]>
>> Cc: Taras Glek <[email protected]>
>> Cc: Jan Kara <[email protected]>
>> Cc: KOSAKI Motohiro <[email protected]>
>> Cc: Michel Lespinasse <[email protected]>
>> Cc: Minchan Kim <[email protected]>
>> Cc: [email protected] <[email protected]>
>> Signed-off-by: John Stultz <[email protected]>
>> ---
>> include/linux/swap.h | 15 ++++++++--
>> include/linux/swapops.h | 10 +++++++
>> include/linux/vrange.h | 3 ++
>> mm/vrange.c | 75 +++++++++++++++++++++++++++++++++++++++++++++++++
>> 4 files changed, 101 insertions(+), 2 deletions(-)
>>
>> diff --git a/include/linux/swap.h b/include/linux/swap.h
>> index 46ba0c6..18c12f9 100644
>> --- a/include/linux/swap.h
>> +++ b/include/linux/swap.h
>> @@ -70,8 +70,19 @@ static inline int current_is_kswapd(void)
>> #define SWP_HWPOISON_NUM 0
>> #endif
>>
>> -#define MAX_SWAPFILES \
>> - ((1 << MAX_SWAPFILES_SHIFT) - SWP_MIGRATION_NUM - SWP_HWPOISON_NUM)
>> +
>> +/*
>> + * Purged volatile range pages
>> + */
>> +#define SWP_VRANGE_PURGED_NUM 1
>> +#define SWP_VRANGE_PURGED (MAX_SWAPFILES + SWP_HWPOISON_NUM + SWP_MIGRATION_NUM)
>> +
>> +
>> +#define MAX_SWAPFILES ((1 << MAX_SWAPFILES_SHIFT) \
>> + - SWP_MIGRATION_NUM \
>> + - SWP_HWPOISON_NUM \
>> + - SWP_VRANGE_PURGED_NUM \
>> + )
> This changes the hwpoison and migration tag numbers. Maybe OK, maybe not.

Though depending on config can't these tag numbers change anyway?
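
To make the config dependence concrete, here is a minimal userspace sketch of the existing macro arithmetic. It assumes MAX_SWAPFILES_SHIFT is 5, and the names swp_tags, swp_tags_for and swp_tags_selfcheck are made up for illustration; none of this is kernel code. The two config options are passed as runtime flags instead of #ifdefs so the combinations can be compared side by side.

```c
#include <assert.h>

/* Assumption: MAX_SWAPFILES_SHIFT is 5, giving 32 swap type slots. */
#define MAX_SWAPFILES_SHIFT 5

/* Hypothetical helper type: tag values for one config combination.
 * A value of -1 means the tag is not defined in that config. */
struct swp_tags {
	int max_swapfiles;
	int hwpoison;
	int migration_read;
	int migration_write;
};

/* Mirrors the current swap.h macro arithmetic, with the two config
 * options passed as runtime flags instead of #ifdefs. */
static struct swp_tags swp_tags_for(int memory_failure, int migration)
{
	int hwpoison_num = memory_failure ? 1 : 0;
	int migration_num = migration ? 2 : 0;
	struct swp_tags t;

	t.max_swapfiles = (1 << MAX_SWAPFILES_SHIFT) - migration_num - hwpoison_num;
	t.hwpoison = memory_failure ? t.max_swapfiles : -1;
	t.migration_read = migration ? t.max_swapfiles + hwpoison_num : -1;
	t.migration_write = migration ? t.max_swapfiles + hwpoison_num + 1 : -1;
	return t;
}

/* Self-check: the hwpoison tag moves when migration is toggled,
 * while the migration tags keep the top two slots either way. */
static void swp_tags_selfcheck(void)
{
	assert(swp_tags_for(1, 1).hwpoison == 29);
	assert(swp_tags_for(1, 0).hwpoison == 31);
	assert(swp_tags_for(1, 1).migration_read ==
	       swp_tags_for(0, 1).migration_read);
}
```

Under these assumptions the hwpoison tag already differs between a kernel built with CONFIG_MIGRATION and one built without it, while the migration tags stay put because they sit at the very top of the range.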


> I'd suggest using a lower number than hwpoison.
> (That's why hwpoison uses a lower number than migration.)

So I can, but the way these are defined makes the results seem pretty
terrible:

#define SWP_MIGRATION_WRITE (MAX_SWAPFILES + SWP_HWPOISON_NUM \
+ SWP_MVOLATILE_PURGED_NUM + 1)

Particularly when:
#define MAX_SWAPFILES ((1 << MAX_SWAPFILES_SHIFT) \
- SWP_MIGRATION_NUM \
- SWP_HWPOISON_NUM \
- SWP_MVOLATILE_PURGED_NUM \
)

It's a lot of unnecessary mental gymnastics. Yuck.

Would a general cleanup like the following be ok to try to make this
more extensible?

thanks
-john

diff --git a/include/linux/swap.h b/include/linux/swap.h
index 3507115..21387df 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -49,29 +49,38 @@ static inline int current_is_kswapd(void)
* actions on faults.
*/

+enum {
+ /*
+ * NOTE: We use the high bits here (subtracting from
+ * 1<<MAX_SWAPFILES_SHIFT), so to preserve the values insert
+ * new entries here at the top of the enum, not at the bottom
+ */
+#ifdef CONFIG_MEMORY_FAILURE
+ SWP_HWPOISON_NR,
+#endif
+#ifdef CONFIG_MIGRATION
+ SWP_MIGRATION_READ_NR,
+ SWP_MIGRATION_WRITE_NR,
+#endif
+ SWP_MAX_NR,
+};
+#define MAX_SWAPFILES ((1 << MAX_SWAPFILES_SHIFT) - SWP_MAX_NR)
+
/*
* NUMA node memory migration support
*/
#ifdef CONFIG_MIGRATION
-#define SWP_MIGRATION_NUM 2
-#define SWP_MIGRATION_READ (MAX_SWAPFILES + SWP_HWPOISON_NUM)
-#define SWP_MIGRATION_WRITE (MAX_SWAPFILES + SWP_HWPOISON_NUM + 1)
-#else
-#define SWP_MIGRATION_NUM 0
+#define SWP_MIGRATION_READ (MAX_SWAPFILES + SWP_MIGRATION_READ_NR)
+#define SWP_MIGRATION_WRITE (MAX_SWAPFILES + SWP_MIGRATION_WRITE_NR)
#endif

/*
* Handling of hardware poisoned pages with memory corruption.
*/
#ifdef CONFIG_MEMORY_FAILURE
-#define SWP_HWPOISON_NUM 1
-#define SWP_HWPOISON MAX_SWAPFILES
-#else
-#define SWP_HWPOISON_NUM 0
+#define SWP_HWPOISON (MAX_SWAPFILES + SWP_HWPOISON_NR)
#endif

-#define MAX_SWAPFILES \
- ((1 << MAX_SWAPFILES_SHIFT) - SWP_MIGRATION_NUM - SWP_HWPOISON_NUM)

/*
* Magic header for a swap area. The first part of the union is
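
As a sanity check on the cleanup above, the following standalone sketch copies the enum-based definitions into a userspace file and verifies at compile time that they yield the same values as the old macros. It assumes both CONFIG_MEMORY_FAILURE and CONFIG_MIGRATION are enabled and that MAX_SWAPFILES_SHIFT is 5; those three defines are stand-ins for this sketch, not taken from a real kernel config.

```c
#include <assert.h>

/* Assumptions for this sketch: both options enabled, shift of 5. */
#define CONFIG_MEMORY_FAILURE 1
#define CONFIG_MIGRATION 1
#define MAX_SWAPFILES_SHIFT 5

enum {
#ifdef CONFIG_MEMORY_FAILURE
	SWP_HWPOISON_NR,
#endif
#ifdef CONFIG_MIGRATION
	SWP_MIGRATION_READ_NR,
	SWP_MIGRATION_WRITE_NR,
#endif
	SWP_MAX_NR,
};
#define MAX_SWAPFILES ((1 << MAX_SWAPFILES_SHIFT) - SWP_MAX_NR)

#define SWP_HWPOISON        (MAX_SWAPFILES + SWP_HWPOISON_NR)
#define SWP_MIGRATION_READ  (MAX_SWAPFILES + SWP_MIGRATION_READ_NR)
#define SWP_MIGRATION_WRITE (MAX_SWAPFILES + SWP_MIGRATION_WRITE_NR)

/* The old macros give MAX_SWAPFILES = 32 - 2 - 1 = 29, with
 * hwpoison at 29 and the migration tags at 30 and 31. */
static_assert(MAX_SWAPFILES == 29, "MAX_SWAPFILES changed");
static_assert(SWP_HWPOISON == 29, "hwpoison tag changed");
static_assert(SWP_MIGRATION_READ == 30, "migration read tag changed");
static_assert(SWP_MIGRATION_WRITE == 31, "migration write tag changed");
```

If any of the assertions failed, the file would not compile, so a clean build is the whole test here.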


2014-04-07 22:14:23

by KOSAKI Motohiro

Subject: Re: [PATCH 2/5] vrange: Add purged page detection on setting memory non-volatile

>> This change hwpoison and migration tag number. maybe ok, maybe not.
>
> Though depending on config can't these tag numbers change anyway?

I don't think any distro disables either of these.


>> I'd suggest to use younger number than hwpoison.
>> (That's why hwpoison uses younger number than migration)
>
> So I can, but the way these are defined makes the results seem pretty
> terrible:
>
> #define SWP_MIGRATION_WRITE (MAX_SWAPFILES + SWP_HWPOISON_NUM \
> + SWP_MVOLATILE_PURGED_NUM + 1)
>
> Particularly when:
> #define MAX_SWAPFILES ((1 << MAX_SWAPFILES_SHIFT) \
> - SWP_MIGRATION_NUM \
> - SWP_HWPOISON_NUM \
> - SWP_MVOLATILE_PURGED_NUM \
> )
>
> Its a lot of unnecessary mental gymnastics. Yuck.
>
> Would a general cleanup like the following be ok to try to make this
> more extensible?
>
> thanks
> -john
>
> diff --git a/include/linux/swap.h b/include/linux/swap.h
> index 3507115..21387df 100644
> --- a/include/linux/swap.h
> +++ b/include/linux/swap.h
> @@ -49,29 +49,38 @@ static inline int current_is_kswapd(void)
> * actions on faults.
> */
>
> +enum {
> + /*
> + * NOTE: We use the high bits here (subtracting from
> + * 1<<MAX_SWPFILES_SHIFT), so to preserve the values insert
> + * new entries here at the top of the enum, not at the bottom
> + */
> +#ifdef CONFIG_MEMORY_FAILURE
> + SWP_HWPOISON_NR,
> +#endif
> +#ifdef CONFIG_MIGRATION
> + SWP_MIGRATION_READ_NR,
> + SWP_MIGRATION_WRITE_NR,
> +#endif
> + SWP_MAX_NR,
> +};
> +#define MAX_SWAPFILES ((1 << MAX_SWAPFILES_SHIFT) - SWP_MAX_NR)
> +

I don't see any benefit in this code. At the very least, SWP_MAX_NR is a
bad name: it doesn't match what the value actually means.

2014-04-08 03:09:44

by John Stultz

Subject: Re: [PATCH 2/5] vrange: Add purged page detection on setting memory non-volatile

On 04/07/2014 03:14 PM, KOSAKI Motohiro wrote:
>>> This change hwpoison and migration tag number. maybe ok, maybe not.
>> Though depending on config can't these tag numbers change anyway?
> I don't think distro disable any of these.

Well, it still shouldn't break if the config options are turned off.
This isn't some subtle userspace visible ABI, is it?
I'm fine with keeping the values the same, but it just seems worrying if
this logic is so fragile.


>>> I'd suggest to use younger number than hwpoison.
>>> (That's why hwpoison uses younger number than migration)
>> So I can, but the way these are defined makes the results seem pretty
>> terrible:
>>
>> #define SWP_MIGRATION_WRITE (MAX_SWAPFILES + SWP_HWPOISON_NUM \
>> + SWP_MVOLATILE_PURGED_NUM + 1)
>>
>> Particularly when:
>> #define MAX_SWAPFILES ((1 << MAX_SWAPFILES_SHIFT) \
>> - SWP_MIGRATION_NUM \
>> - SWP_HWPOISON_NUM \
>> - SWP_MVOLATILE_PURGED_NUM \
>> )
>>
>> Its a lot of unnecessary mental gymnastics. Yuck.
>>
>> Would a general cleanup like the following be ok to try to make this
>> more extensible?
>>
>> thanks
>> -john
>>
>> diff --git a/include/linux/swap.h b/include/linux/swap.h
>> index 3507115..21387df 100644
>> --- a/include/linux/swap.h
>> +++ b/include/linux/swap.h
>> @@ -49,29 +49,38 @@ static inline int current_is_kswapd(void)
>> * actions on faults.
>> */
>>
>> +enum {
>> + /*
>> + * NOTE: We use the high bits here (subtracting from
>> + * 1<<MAX_SWPFILES_SHIFT), so to preserve the values insert
>> + * new entries here at the top of the enum, not at the bottom
>> + */
>> +#ifdef CONFIG_MEMORY_FAILURE
>> + SWP_HWPOISON_NR,
>> +#endif
>> +#ifdef CONFIG_MIGRATION
>> + SWP_MIGRATION_READ_NR,
>> + SWP_MIGRATION_WRITE_NR,
>> +#endif
>> + SWP_MAX_NR,
>> +};
>> +#define MAX_SWAPFILES ((1 << MAX_SWAPFILES_SHIFT) - SWP_MAX_NR)
>> +
> I don't see any benefit of this code. At least, SWP_MAX_NR is suck.


So it makes adding new special swap types (like SWP_MVOLATILE_PURGED)
much cleaner. If we need to preserve the actual values for SWP_HWPOISON
and SWP_MIGRATION_* as you suggested earlier, the cleanup above makes
doing so when adding a new type much easier.

For example, adding the MVOLATILE_PURGED value (without affecting the
values of HWPOISON or MIGRATION_*) is only:

@@ -55,6 +55,7 @@ enum {
* 1<<MAX_SWAPFILES_SHIFT), so to preserve the values insert
* new entries here at the top of the enum, not at the bottom
*/
+ SWP_MVOLATILE_PURGED_NR,
#ifdef CONFIG_MEMORY_FAILURE
SWP_HWPOISON_NR,
#endif
@@ -81,6 +82,10 @@ enum {
#define SWP_HWPOISON (MAX_SWAPFILES + SWP_HWPOISON_NR)
#endif

+/*
+ * Purged volatile range pages
+ */
+#define SWP_MVOLATILE_PURGED (MAX_SWAPFILES + SWP_MVOLATILE_PURGED_NR)


That's *much* nicer when compared with modifying every value to subtract the extra entry, as it was done before.
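
To illustrate why the NOTE asks for new entries at the top, here is a small userspace sketch that compares the two insertion points. It assumes MAX_SWAPFILES_SHIFT is 5 and both config options are enabled; the TOP_/BOT_ prefixed names, swp_type() and insert_position_selfcheck() are hypothetical and exist only so both variants fit in one file.

```c
#include <assert.h>

#define MAX_SWAPFILES_SHIFT 5	/* assumption for this sketch */

/* New entry added at the top, as the NOTE in the cleanup asks. */
enum top_insert {
	TOP_MVOLATILE_PURGED_NR,
	TOP_HWPOISON_NR,
	TOP_MIGRATION_READ_NR,
	TOP_MIGRATION_WRITE_NR,
	TOP_MAX_NR,
};

/* Same entry added at the bottom instead, for contrast. */
enum bottom_insert {
	BOT_HWPOISON_NR,
	BOT_MIGRATION_READ_NR,
	BOT_MIGRATION_WRITE_NR,
	BOT_MVOLATILE_PURGED_NR,
	BOT_MAX_NR,
};

/* Hypothetical helper: the swap type for entry nr when the enum holds
 * max_nr special entries, i.e. MAX_SWAPFILES + nr. */
static int swp_type(int nr, int max_nr)
{
	return (1 << MAX_SWAPFILES_SHIFT) - max_nr + nr;
}

static void insert_position_selfcheck(void)
{
	/* Top insert: existing tags keep 29, 30, 31; the new tag gets 28. */
	assert(swp_type(TOP_MVOLATILE_PURGED_NR, TOP_MAX_NR) == 28);
	assert(swp_type(TOP_HWPOISON_NR, TOP_MAX_NR) == 29);
	assert(swp_type(TOP_MIGRATION_READ_NR, TOP_MAX_NR) == 30);
	assert(swp_type(TOP_MIGRATION_WRITE_NR, TOP_MAX_NR) == 31);

	/* Bottom insert: every existing tag shifts down by one. */
	assert(swp_type(BOT_HWPOISON_NR, BOT_MAX_NR) == 28);
	assert(swp_type(BOT_MIGRATION_READ_NR, BOT_MAX_NR) == 29);
	assert(swp_type(BOT_MIGRATION_WRITE_NR, BOT_MAX_NR) == 30);
	assert(swp_type(BOT_MVOLATILE_PURGED_NR, BOT_MAX_NR) == 31);
}
```

Since the special types are carved out of the high end of the range, growing the enum lowers MAX_SWAPFILES by one, and only an entry placed at the top absorbs that shift without moving the others.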


> The name doesn't match the actual meanings.
Would SWP_MAX_SPECIAL_TYPE_NR be a better name? Do you have other
suggestions?

thanks
-john