2014-04-03 19:09:57

by Cyrill Gorcunov

[permalink] [raw]
Subject: [rfc 0/3] Cleaning up soft-dirty bit usage

Hi! I've been trying to clean up soft-dirty bit usage. I can't cleanup
"ridiculous macros in pgtable-2level.h" completely because I need to
define _PAGE_FILE,_PAGE_PROTNONE,_PAGE_NUMA bits in sequence manner
like

#define _PAGE_BIT_FILE (_PAGE_BIT_PRESENT + 1) /* _PAGE_BIT_RW */
#define _PAGE_BIT_NUMA (_PAGE_BIT_PRESENT + 2) /* _PAGE_BIT_USER */
#define _PAGE_BIT_PROTNONE (_PAGE_BIT_PRESENT + 3) /* _PAGE_BIT_PWT */

which can't be done right now because numa code needs to save original
pte bits for example in __split_huge_page_map, if I'm not missing something
obvious.

Also if we ever redefine the bits above we will need to update PAT code
which uses _PAGE_GLOBAL + _PAGE_PRESENT to make pte_present return true
or false.

Another weird thing I found is the following sequence:

mprotect_fixup
change_protection (passes @prot_numa = 0 which finally ends up in)
...
change_pte_range(..., prot_numa)

if (!prot_numa) {
...
} else {
... this seems to be dead code branch ...
}

is it intentional, and @prot_numa argument is supposed to be passed
with prot_numa = 1 one day, or it's leftover from old times?

Note I've not yet tested the series building it now, hopefully finish
testing in a couple of hours.

Linus, by saying "define the bits we use when PAGE_PRESENT==0 separately
and explicitly" you meant complete rework of the bits, right? Not simply
group them in once place in a header?

Cyrill


2014-04-07 13:07:26

by Kirill A. Shutemov

[permalink] [raw]
Subject: Re: [rfc 0/3] Cleaning up soft-dirty bit usage

On Thu, Apr 03, 2014 at 10:48:44PM +0400, Cyrill Gorcunov wrote:
> Hi! I've been trying to clean up soft-dirty bit usage. I can't cleanup
> "ridiculous macros in pgtable-2level.h" completely because I need to
> define _PAGE_FILE,_PAGE_PROTNONE,_PAGE_NUMA bits in sequence manner
> like
>
> #define _PAGE_BIT_FILE (_PAGE_BIT_PRESENT + 1) /* _PAGE_BIT_RW */
> #define _PAGE_BIT_NUMA (_PAGE_BIT_PRESENT + 2) /* _PAGE_BIT_USER */
> #define _PAGE_BIT_PROTNONE (_PAGE_BIT_PRESENT + 3) /* _PAGE_BIT_PWT */
>
> which can't be done right now because numa code needs to save original
> pte bits for example in __split_huge_page_map, if I'm not missing something
> obvious.

Sorry, I didn't get this. How __split_huge_page_map() does depend on pte
bits order?

>
> Also if we ever redefine the bits above we will need to update PAT code
> which uses _PAGE_GLOBAL + _PAGE_PRESENT to make pte_present return true
> or false.
>
> Another weird thing I found is the following sequence:
>
> mprotect_fixup
> change_protection (passes @prot_numa = 0 which finally ends up in)
> ...
> change_pte_range(..., prot_numa)
>
> if (!prot_numa) {
> ...
> } else {
> ... this seems to be dead code branch ...
> }
>
> is it intentional, and @prot_numa argument is supposed to be passed
> with prot_numa = 1 one day, or it's leftover from old times?

I see one more user of change_protection() -- change_prot_numa(), which
has .prot_numa == 1.

--
Kirill A. Shutemov

2014-04-07 13:24:42

by Cyrill Gorcunov

[permalink] [raw]
Subject: Re: [rfc 0/3] Cleaning up soft-dirty bit usage

On Mon, Apr 07, 2014 at 04:07:01PM +0300, Kirill A. Shutemov wrote:
> On Thu, Apr 03, 2014 at 10:48:44PM +0400, Cyrill Gorcunov wrote:
> > Hi! I've been trying to clean up soft-dirty bit usage. I can't cleanup
> > "ridiculous macros in pgtable-2level.h" completely because I need to
> > define _PAGE_FILE,_PAGE_PROTNONE,_PAGE_NUMA bits in sequence manner
> > like
> >
> > #define _PAGE_BIT_FILE (_PAGE_BIT_PRESENT + 1) /* _PAGE_BIT_RW */
> > #define _PAGE_BIT_NUMA (_PAGE_BIT_PRESENT + 2) /* _PAGE_BIT_USER */
> > #define _PAGE_BIT_PROTNONE (_PAGE_BIT_PRESENT + 3) /* _PAGE_BIT_PWT */
> >
> > which can't be done right now because numa code needs to save original
> > pte bits for example in __split_huge_page_map, if I'm not missing something
> > obvious.
>
> Sorry, I didn't get this. How __split_huge_page_map() does depend on pte
> bits order?

__split_huge_page_map
...
for (i = 0; i < HPAGE_PMD_NR; i++, haddr += PAGE_SIZE) {
...
here we modify with pte bits
entry = pte_mknuma(entry); --> clean _PAGE_PRESENT and set _PAGE_NUMA

pte bits must remain valid and meaningful, for example we might
have set _PAGE_RW here

> > is it intentional, and @prot_numa argument is supposed to be passed
> > with prot_numa = 1 one day, or it's leftover from old times?
>
> I see one more user of change_protection() -- change_prot_numa(), which
> has .prot_numa == 1.

Yeah, thanks, managed to miss this.