2007-12-14 00:03:25

by Pallipadi, Venkatesh

Subject: [RFC PATCH 00/12] PAT 64b: PAT support for X86_64

Yes. It is that wonderful time of the year again.
No, no. We are not talking about holiday season or new year here.

We are talking about yet another rehash of the "why do we not support PAT in x86"
question and a series of patches that implement some PAT support before going
into hibernation again. The only difference is that we hope to take this a little
further this time and maybe really get this support into the
upstream kernel soon.


This series is heavily derived from the PAT patchset by Eric Biederman and
Andi Kleen.
http://www.firstfloor.org/pub/ak/x86_64/pat/

We have forward ported these patches to the latest kernel, addressed some of the
race conditions, and cut a lot more corners to get this into a working patchset
that can be seen as an RFC. Specifically, the changes we added include:

* Various bug fixes over original patchset above.
* Change x86_64 identity map to only map non-reserved memory. This helps
to handle UC/WC mapping of reserved regions in a much simpler manner
(we don't have to do cpa any more, and as such do not have to keep track of
the actual reference counts. We still track all the usages to keep the mappings
consistent. We just avoid the headache of splitting mattr regions for
managing ref counts for every individual usage of the reserved area).
* Modify reserve_mattr and free_mattr to handle various mappings of reserved
regions cleanly.
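
As a rough illustration of "track all the usages to keep the mappings
consistent", here is a minimal user-space sketch. It assumes a plain linked
list of physical ranges. The names toy_reserve_mattr and toy_free_mattr, the
struct layout and the return codes are hypothetical and are not the signatures
used in the patchset; the sketch only shows the idea of refusing an overlapping
request that asks for a different memory type.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical memory types; the values are arbitrary labels for this sketch. */
enum toy_mtype { TOY_WB, TOY_WC, TOY_UC };

/* One tracked usage of the physical range [start, end). */
struct toy_mattr {
    uint64_t start, end;
    enum toy_mtype type;
    struct toy_mattr *next;
};

struct toy_mattr *toy_list;

/* Record a usage; return -1 if it overlaps a tracked range of another type. */
int toy_reserve_mattr(uint64_t start, uint64_t end, enum toy_mtype type)
{
    struct toy_mattr *m;

    if (start >= end)
        return -1;
    for (m = toy_list; m; m = m->next)
        if (start < m->end && m->start < end && m->type != type)
            return -1;

    m = malloc(sizeof(*m));
    if (!m)
        return -1;
    m->start = start;
    m->end = end;
    m->type = type;
    m->next = toy_list;
    toy_list = m;
    return 0;
}

/* Drop one usage that matches exactly; return -1 if none was recorded. */
int toy_free_mattr(uint64_t start, uint64_t end)
{
    struct toy_mattr **pp, *m;

    for (pp = &toy_list; (m = *pp) != NULL; pp = &m->next) {
        if (m->start == start && m->end == end) {
            *pp = m->next;
            free(m);
            return 0;
        }
    }
    return -1;
}
```

With this toy, reserving 0xd0000000-0xd0100000 as WC succeeds, a later UC
request for a sub-range of it is refused, and the same UC request succeeds once
the WC usage has been freed. Every usage is its own list entry, so no region
ever has to be split to maintain a per-usage reference count.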


There are many rough edges in the patchset. The TBD list below covers
the open issues that we have thought through during this process.

TBD:
* Do we need to allow RAM pages to be mapped as WC? If not, then
we don't need to follow the TLB flush mechanism (make pte not present,
flush, and set pte with new mapping) mentioned in section 10.12.4 of SDM
Vol3a.
* If the above can be assumed, then for a complete solution, handle RAM
pages with UC and /dev/mem mapping conflicts.
Can we use the existing page struct to keep track of the /dev/mem
mappings (through the page ref count) and not allow
the page to be freed while the /dev/mem mappings are active? And
allow /dev/mem to map only those pages which are marked reserved (which
the driver does before doing iomap)?
* For X and others, do we need the ioctl interface to sysfs, or get the type
attribute through a different sysfs file?
* Clean up early table space allocation, avoiding overallocation there.
* Avoid mapping 0 - 1M physical addresses in kernel text mapping.
* Read reserved regions in /dev/mem read() as 0xffff or something, and continue
reading across holes, till we reach the high_memory (end of memory).
* For fork(), for every /dev/mem mapping, we have to keep track of the usage
by doing reserve_mattr().
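
The three-step sequence mentioned in the first TBD item (make pte not present,
flush, set pte with new mapping) can be pictured with a toy model. This is only
a sketch: the single-entry "TLB", the struct names and the toy_* helpers are
all hypothetical stand-ins, not kernel interfaces, and a single CPU is assumed.

```c
#include <stdbool.h>

/* Hypothetical memory types for this sketch. */
enum toy_type { TOY_WB, TOY_WC };

struct toy_pte { bool present; enum toy_type type; };

/* Single-entry stand-in for a TLB: caches the type seen at the last walk. */
struct toy_tlb { bool valid; enum toy_type type; };

/* Access through the mapping: returns the memory type used, or -1 on fault. */
int toy_access(struct toy_tlb *tlb, const struct toy_pte *pte)
{
    if (tlb->valid)
        return tlb->type;       /* a cached entry wins, even if it is stale */
    if (!pte->present)
        return -1;
    tlb->valid = true;
    tlb->type = pte->type;
    return tlb->type;
}

void toy_flush(struct toy_tlb *tlb)
{
    tlb->valid = false;
}

/* The three steps from the TBD item, in order. */
void toy_change_type(struct toy_tlb *tlb, struct toy_pte *pte,
                     enum toy_type new_type)
{
    pte->present = false;       /* 1. make pte not present */
    toy_flush(tlb);             /* 2. flush */
    pte->type = new_type;       /* 3. set pte with new mapping */
    pte->present = true;
}
```

In this model, writing a new type straight into a pte that has already been
accessed leaves toy_access() returning the old cached type, while
toy_change_type() does not. That stale window is what the sequence is meant
to close.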

There are also many edges completely missing. There are a lot of things we did
not look at at all for this first cut. Specifically:

* Only supports x86_64 for now. i386 may not even compile with this patchset.
* We did not look at implications of PAT on Suspend-Resume.
* We did not look at implications of PAT on KEXEC.
* Coding style details.

We expect this can be done easily once we have discussed/resolved the
basic PAT problems with this RFC.


Fire away with all comments, complaints, concerns and things we may break while
we do this.

Tested with 2.6.24-rc4 and X86_64.

Signed-off-by: Venkatesh Pallipadi <[email protected]>
Signed-off-by: Suresh Siddha <[email protected]>
---

--


2007-12-14 00:28:59

by Dave Airlie

Subject: Re: [RFC PATCH 00/12] PAT 64b: PAT support for X86_64


> Yes. It is that wonderful time of the year again.
> No, no. We are not talking about holiday season or new year here.
>
> We are talking about one another rehash of "why we do not support PAT in x86"
> question and series of patches that implement some PAT support before going
> into hibernation again. Only difference is that we hope to take this little
> further this time and may be really get this support into
> upstream kernel soon.

Woot, PAT support: this time we mean it!!

I'll just give one comment after reading your todo..

> * Do we need to allow RAM pages to be mapped as WC? If not, then
> we don't need to follow the TLB flush mechanism (make pte not present,
> flush, and set pte with new mapping) mentioned in section 10.12.4 of SDM
> Vol3a.

Yes, the main use for GPUs is to have RAM pages mapped WC and placed into
a GART on the GPU side. Currently for Intel IGD we are okay as the CPU can
access the GPU GART aperture, but other chips exist where this isn't
possible, I think poulsbo and possibly some of the AMD IGPs..

Dave.

2007-12-14 22:01:36

by Suresh Siddha

Subject: Re: [RFC PATCH 00/12] PAT 64b: PAT support for X86_64

On Fri, Dec 14, 2007 at 12:28:25AM +0000, Dave Airlie wrote:
> Yes, the main use for GPUs is to have RAM pages mapped WC, and placed into
> a GART on the GPU side, currently for Intel IGD we are okay as the CPU can
> access the GPU GART aperture, but other chips exist where this isn't
> possible, I think poulsbo and possible some of the AMD IGPs..

OK. So how is it working today on these platforms with no PAT support?
Do open source drivers use UC or WB on these platforms? As this RAM is not
contiguous, one can't use MTRRs to set WC. Right?

Well, if WC is needed for RAM, then we have to address it too.
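
For reference, the reason PAT helps with scattered RAM pages is that the memory
type is picked per page: three bits in the pte (PWT, PCD, PAT) index one of
eight entries in the PAT MSR. The sketch below decodes that for 4K pages. The
bit positions, the type encodings and the power-on MSR value are the documented
architectural ones; the helper names are made up for this example.

```c
#include <stdint.h>

/* Page-level cache control bits in a 4K pte. */
#define PTE_PWT (1ULL << 3)
#define PTE_PCD (1ULL << 4)
#define PTE_PAT (1ULL << 7)

/* Power-on value of the PAT MSR: WB, WT, UC-, UC, repeated twice. */
#define PAT_RESET_VALUE 0x0007040600070406ULL

/* Which of the eight PAT entries a pte selects. */
unsigned pat_index(uint64_t pte)
{
    return (unsigned)((!!(pte & PTE_PAT) << 2) |
                      (!!(pte & PTE_PCD) << 1) |
                       !!(pte & PTE_PWT));
}

/* Memory type encoding stored in entry 'index' of a PAT MSR value. */
unsigned pat_entry(uint64_t pat_msr, unsigned index)
{
    return (unsigned)((pat_msr >> (index * 8)) & 0x7);
}

/* Human-readable name for a PAT memory type encoding. */
const char *pat_type_name(unsigned enc)
{
    switch (enc) {
    case 0: return "UC";
    case 1: return "WC";
    case 4: return "WT";
    case 5: return "WP";
    case 6: return "WB";
    case 7: return "UC-";
    default: return "reserved";
    }
}
```

With the power-on value none of the eight entries is WC, so an entry has to be
reprogrammed before any pte can select it. Which slot to use is a policy
choice; for example, if entry 1 were changed from WT to WC (MSR value
0x0007040600070106, an assumption made only for this illustration), a pte with
just PWT set would map its page WC no matter where that page sits in physical
memory.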

thanks,
suresh

2007-12-14 22:27:53

by Dave Airlie

Subject: Re: [RFC PATCH 00/12] PAT 64b: PAT support for X86_64

On Dec 15, 2007 8:00 AM, Siddha, Suresh B <[email protected]> wrote:
> On Fri, Dec 14, 2007 at 12:28:25AM +0000, Dave Airlie wrote:
> > Yes, the main use for GPUs is to have RAM pages mapped WC, and placed into
> > a GART on the GPU side, currently for Intel IGD we are okay as the CPU can
> > access the GPU GART aperture, but other chips exist where this isn't
> > possible, I think poulsbo and possible some of the AMD IGPs..
>
> Ok. So how is it working today on these platforms with no PAT support.
> Open source drivers use UC or WB on these platforms? As this RAM is not
> contiguous, one can't use MTRRs to set WC. Right?
>
> Well, if WC is needed for RAM, then we have to address it too.
>

It doesn't work really, which is mostly the problem :)

We mostly use UC on these pages, or WB within cache coherent domains.
MTRRs are totally useless in this situation.

Dave.

2007-12-14 22:40:36

by H. Peter Anvin

Subject: Re: [RFC PATCH 00/12] PAT 64b: PAT support for X86_64

Dave Airlie wrote:
> On Dec 15, 2007 8:00 AM, Siddha, Suresh B <[email protected]> wrote:
>> On Fri, Dec 14, 2007 at 12:28:25AM +0000, Dave Airlie wrote:
>>> Yes, the main use for GPUs is to have RAM pages mapped WC, and placed into
>>> a GART on the GPU side, currently for Intel IGD we are okay as the CPU can
>>> access the GPU GART aperture, but other chips exist where this isn't
>>> possible, I think poulsbo and possible some of the AMD IGPs..
>> Ok. So how is it working today on these platforms with no PAT support.
>> Open source drivers use UC or WB on these platforms? As this RAM is not
>> contiguous, one can't use MTRRs to set WC. Right?
>>
>> Well, if WC is needed for RAM, then we have to address it too.
>>
>
> It doesn't work really, which is mostly the problem :)
>
> We mostly use UC on these pages, or WB within cache coherent domains.
> mtrrs are totally useless in this situation.
>

In what sense does it not work?

-hpa

2007-12-14 22:41:01

by Dave Airlie

Subject: Re: [RFC PATCH 00/12] PAT 64b: PAT support for X86_64

> > It doesn't work really, which is mostly the problem :)
> >
> > We mostly use UC on these pages, or WB within cache coherent domains.
> > mtrrs are totally useless in this situation.
> >
>
> In what sense does it not work?

Oh, I was mostly joking, hence the smiley.. really it just means things
run slower than they need to..

Dave.