2002-10-02 14:52:39

by Dave McCracken

Subject: [PATCH] Snapshot of shared page tables


Ok, here it is. This patch works for my simple tests, both under UP and
SMP, including under memory pressure. I'd appreciate anyone who'd like to
take it and beat on it. Please let me know of any problems you find.

The patch is against this morning's 2.5 BK tree.

Dave McCracken

======================================================================
Dave McCracken IBM Linux Base Kernel Team 1-512-838-3059
[email protected] T/L 678-3059


Attachments:
shpte-2.5.40-1.diff (36.39 kB)

2002-10-02 16:45:59

by Daniel Phillips

Subject: Re: [PATCH] Snapshot of shared page tables

On Wednesday 02 October 2002 16:57, Dave McCracken wrote:
>
> Ok, here it is. This patch works for my simple tests, both under UP and
> SMP, including under memory pressure. I'd appreciate anyone who'd like to
> take it and beat on it. Please let me know of any problems you find.
>
> The patch is against this morning's 2.5 BK tree.

Interesting, you substituted pte_page_lock(ptepage) for mm->page_table_lock.
Could you wax poetic about that, please?

--
Daniel

2002-10-02 16:52:24

by Dave McCracken

Subject: Re: [PATCH] Snapshot of shared page tables


--On Wednesday, October 02, 2002 18:51:41 +0200 Daniel Phillips
<[email protected]> wrote:

> Interesting, you substituted pte_page_lock(ptepage) for
> mm->page_table_lock. Could you wax poetic about that, please?

Sure. If a pte page is shared, the mm->page_table_lock is not sufficient
to protect the rest of the page fault. Therefore we need a lock at the pte
page level. The mm->page_table_lock is held during the page fault until we
have a valid and locked pte page we're working on, then it's dropped for
the rest of the fault.

Feel free to poke holes in my logic, but I think it's the right locking
model for shared pte pages.
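
[Illustration, not part of the patch] The lock handoff described above can be sketched in user space. This is a minimal sketch, assuming pthread mutexes in place of the kernel's spinlocks. The struct layouts, handle_fault(), fault_loop() and run_shared_fault_demo() are hypothetical names invented for the example; only the lock order (take mm->page_table_lock, lock the pte page, drop mm->page_table_lock) comes from the description.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define FAULTS_PER_THREAD 100000UL

/* Hypothetical stand-in for a pte page that several mms may share. */
struct pte_page {
    pthread_mutex_t lock;      /* plays the role of pte_page_lock(ptepage) */
    unsigned long touched;     /* stands in for pte updates made during faults */
};

/* Hypothetical stand-in for an address space with a single pmd slot. */
struct mm {
    pthread_mutex_t page_table_lock;  /* plays the role of mm->page_table_lock */
    struct pte_page *ptepage;
};

static void handle_fault(struct mm *mm)
{
    /* The per-mm lock covers the lookup of the pte page only. */
    pthread_mutex_lock(&mm->page_table_lock);
    struct pte_page *pp = mm->ptepage;
    assert(pp != NULL);

    /* Lock the pte page before dropping the per-mm lock (handoff). */
    pthread_mutex_lock(&pp->lock);
    pthread_mutex_unlock(&mm->page_table_lock);

    /* Rest of the fault: protected by the pte page lock alone. */
    pp->touched++;
    pthread_mutex_unlock(&pp->lock);
}

static void *fault_loop(void *arg)
{
    struct mm *mm = arg;
    for (unsigned long i = 0; i < FAULTS_PER_THREAD; i++)
        handle_fault(mm);
    return NULL;
}

/* Two address spaces share one pte page and fault on it concurrently.
 * Returns the number of updates recorded on the shared page, or 0 if a
 * thread could not be started. */
unsigned long run_shared_fault_demo(void)
{
    struct pte_page shared = { .touched = 0 };
    struct mm a = { .ptepage = &shared };
    struct mm b = { .ptepage = &shared };
    pthread_t ta, tb;

    pthread_mutex_init(&shared.lock, NULL);
    pthread_mutex_init(&a.page_table_lock, NULL);
    pthread_mutex_init(&b.page_table_lock, NULL);

    if (pthread_create(&ta, NULL, fault_loop, &a) != 0)
        return 0;
    if (pthread_create(&tb, NULL, fault_loop, &b) != 0) {
        pthread_join(ta, NULL);
        return 0;
    }
    pthread_join(ta, NULL);
    pthread_join(tb, NULL);
    return shared.touched;
}
```

The two threads hold different page_table_lock instances, so those locks alone would not serialize the updates to the shared page. In this sketch it is the pte page lock that keeps the count consistent.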

Dave McCracken


2002-10-02 16:54:44

by Daniel Phillips

Subject: Re: [PATCH] Snapshot of shared page tables

On Wednesday 02 October 2002 18:51, Daniel Phillips wrote:
> On Wednesday 02 October 2002 16:57, Dave McCracken wrote:
> >
> > Ok, here it is. This patch works for my simple tests, both under UP and
> > SMP, including under memory pressure. I'd appreciate anyone who'd like to
> > take it and beat on it. Please let me know of any problems you find.
> >
> > The patch is against this morning's 2.5 BK tree.
>
> Interesting, you substituted pte_page_lock(ptepage) for mm->page_table_lock.
> Could you wax poetic about that, please?

Never mind, I see the logic. This reflects the fact that page_table_lock
is insufficient protection when pte pages are shared. So you solved that
problem and at the same time improved the scalability for the general case
immensely, without adding any new overhead. Very nice!

--
Daniel

2002-10-02 22:36:27

by Paul Mackerras

Subject: Re: [PATCH] Snapshot of shared page tables

Dave McCracken writes:

> Ok, here it is. This patch works for my simple tests, both under UP and
> SMP, including under memory pressure. I'd appreciate anyone who'd like to
> take it and beat on it. Please let me know of any problems you find.

Interesting. I notice that you are using the _PAGE_RW bit in the
PMDs. Are you relying on the hardware to do anything with that bit,
or is it only used by software?

(If you are relying on the hardware to do something different when
_PAGE_RW is clear in the PMD, then your approach isn't portable.)

Paul.

2002-10-02 22:42:45

by Dave McCracken

Subject: Re: [PATCH] Snapshot of shared page tables


--On Thursday, October 03, 2002 08:39:20 +1000 Paul Mackerras
<[email protected]> wrote:

> Interesting. I notice that you are using the _PAGE_RW bit in the
> PMDs. Are you relying on the hardware to do anything with that bit,
> or is it only used by software?
>
> (If you are relying on the hardware to do something different when
> _PAGE_RW is clear in the PMD, then your approach isn't portable.)

Yes, I am relying on the hardware. I was under the impression that it was
pretty much universal that making the pmd read-only would make the hardware
treat all ptes under it as read-only. This came out of a discussion on
lkml last winter where this assertion was made.

Do you know of a page table-based architecture that doesn't have and honor
read-only protections at the pmd level?
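
[Illustration, not part of the patch] The hardware behaviour being relied on can be modelled as a small predicate. This is a sketch under stated assumptions: x86-style semantics for a user-mode write, where the R/W bit must be set at every level of the walk. PAGE_PRESENT and PAGE_RW mirror the i386 bit values of _PAGE_PRESENT and _PAGE_RW. Both function names are hypothetical, and the second function models an imagined MMU that ignores the upper-level bit, which is the case Paul's portability question is about.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_PRESENT 0x001u   /* same bit value as i386 _PAGE_PRESENT */
#define PAGE_RW      0x002u   /* same bit value as i386 _PAGE_RW */

/* x86-style walk (assumption): a user-mode write succeeds only if both
 * entries are present and RW is set in the pmd entry AND the pte. */
static bool write_allowed_hier(uint32_t pmd_entry, uint32_t pte_entry)
{
    if (!(pmd_entry & PAGE_PRESENT) || !(pte_entry & PAGE_PRESENT))
        return false;
    return (pmd_entry & PAGE_RW) && (pte_entry & PAGE_RW);
}

/* Hypothetical MMU that checks write permission in the pte only.
 * Clearing RW in the pmd entry protects nothing on such hardware. */
static bool write_allowed_pte_only(uint32_t pmd_entry, uint32_t pte_entry)
{
    if (!(pmd_entry & PAGE_PRESENT) || !(pte_entry & PAGE_PRESENT))
        return false;
    return (pte_entry & PAGE_RW) != 0;
}
```

With a read-only pmd entry over a writable pte, the first model refuses the write and the second allows it. On hardware of the second kind, a write to a shared pte page would not fault, so the sharing could not be detected through the pmd bit.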

Dave McCracken
