2004-03-04 21:06:31

by Chris Friesen

[permalink] [raw]
Subject: problem with cache flush routine for G5?


We're running into issues with the "flush_data_cache" routine on the G5.

For the G5, the L1 dcache is 32K and the L2 cache is 512K. At 128
bytes/cacheline, that's 256 and 4096 cachelines, respectively.

In the existing tree, NUM_CACHE_LINES is set to 128*8, or 1024. Is this
an oversight or am I missing something?

Also, I'm curious why the dcbf instruction is not used for this.

Chris

--
Chris Friesen | MailStop: 043/33/F10
Nortel Networks | work: (613) 765-0557
3500 Carling Avenue | fax: (613) 765-2986
Nepean, ON K2H 8E9 Canada | email: [email protected]


2004-03-04 23:51:26

by Benjamin Herrenschmidt

[permalink] [raw]
Subject: Re: problem with cache flush routine for G5?

On Fri, 2004-03-05 at 08:06, Chris Friesen wrote:
> We're running into issues with the "flush_data_cache" routine on the G5.
>
> For the G5, the L1 dcache is 32K and the L2 cache is 512K. At 128
> bytes/cacheline, that's 256 and 4096 cachelines, respectively.
>
> In the existing tree, NUM_CACHE_LINES is set to 128*8, or 1024. Is this
> an oversight or am I missing something?
>
> Also, I'm curious why the dcbf instruction is not used for this.

First of all, why do you need to flush the cache at all ?

If you are talking about the cache flush in the 32 bits bootloaders,
then yes, this seem to be broken, you should ask Tom Rini who
maintain these things.

The kernel proper definitely doesn't contain such a routine.

Ben.


2004-03-04 23:57:57

by Tom Rini

[permalink] [raw]
Subject: Re: problem with cache flush routine for G5?

On Fri, Mar 05, 2004 at 10:51:08AM +1100, Benjamin Herrenschmidt wrote:
> On Fri, 2004-03-05 at 08:06, Chris Friesen wrote:
> > We're running into issues with the "flush_data_cache" routine on the G5.
> >
> > For the G5, the L1 dcache is 32K and the L2 cache is 512K. At 128
> > bytes/cacheline, that's 256 and 4096 cachelines, respectively.
> >
> > In the existing tree, NUM_CACHE_LINES is set to 128*8, or 1024. Is this
> > an oversight or am I missing something?
> >
> > Also, I'm curious why the dcbf instruction is not used for this.
>
> First of all, why do you need to flush the cache at all ?
>
> If you are talking about the cache flush in the 32 bits bootloaders,
> then yes, this seem to be broken, you should ask Tom Rini who
> maintain these things.

... unless this is a 'G5' that's not in a pmac, it's not my code, and
the openfirmware bootloaders don't, IIRC, do any cache stuff.

--
Tom Rini
http://gate.crashing.org/~trini/

2004-03-05 00:04:46

by Benjamin Herrenschmidt

[permalink] [raw]
Subject: Re: problem with cache flush routine for G5?


>
> ... unless this is a 'G5' that's not in a pmac, it's not my code, and
> the openfirmware bootloaders don't, IIRC, do any cache stuff.

Heh, well, they should. You need to flush the dcache & invalidate the
icache over the kernel image after decompressing it. It's just that
those routines are totally broken as far as I can see, but I don't
think the pmac bootloaders will use them, they probably use a
different implementation that works.

Note that whatever machines are using those routines will probably
have trouble too anyway

Ben.


2004-03-05 00:37:18

by Chris Friesen

[permalink] [raw]
Subject: Re: problem with cache flush routine for G5?

Benjamin Herrenschmidt wrote:
> First of all, why do you need to flush the cache at all ?

In-house OS emulator. For some reason it wants to be able to flush the
cache. I don't know why.

> If you are talking about the cache flush in the 32 bits bootloaders,
> then yes, this seem to be broken, you should ask Tom Rini who
> maintain these things.
>
> The kernel proper definitely doesn't contain such a routine.

It did in 2.4, and we added a syscall to export it to userspace. Now
I'm supposed to figure out what to do for 2.6, and it appears that the
kernel version is gone and the one in boot is screwed.


The only remaining ppc version of flush_data_cache is used by
flush_instruction_cache in arc/ppc/boot/common/util.S

There is also another version of flush_instruction_cache implemented in
arch/ppc/kernel/misc.S.


Here are the suspicious instances of flush_instruction_cache in
arch/ppc/boot:

boot/prep/head.S: bl flush_instruction_cache
boot/common/util.S:_GLOBAL(flush_instruction_cache)
boot/simple/relocate.S: b flush_instruction_cache
boot/simple/relocate.S: b flush_instruction_cache
boot/simple/misc-embedded.c: flush_instruction_cache();






--
Chris Friesen | MailStop: 043/33/F10
Nortel Networks | work: (613) 765-0557
3500 Carling Avenue | fax: (613) 765-2986
Nepean, ON K2H 8E9 Canada | email: [email protected]

2004-03-05 00:44:12

by Chris Friesen

[permalink] [raw]
Subject: Re: problem with cache flush routine for G5?

Benjamin Herrenschmidt wrote:
>>... unless this is a 'G5' that's not in a pmac, it's not my code, and
>>the openfirmware bootloaders don't, IIRC, do any cache stuff.
>>
>
> Heh, well, they should.

Actually I think they do. chrpmain.c and coffmain.c both reference
flush_cache() in misc.S in arch/ppc/boot/pmac.

Chris

--
Chris Friesen | MailStop: 043/33/F10
Nortel Networks | work: (613) 765-0557
3500 Carling Avenue | fax: (613) 765-2986
Nepean, ON K2H 8E9 Canada | email: [email protected]

2004-03-05 02:11:03

by Benjamin Herrenschmidt

[permalink] [raw]
Subject: Re: problem with cache flush routine for G5?


> It did in 2.4, and we added a syscall to export it to userspace. Now
> I'm supposed to figure out what to do for 2.6, and it appears that the
> kernel version is gone and the one in boot is screwed.

Ugh ? No, the kernel doesn't contain a routine that you can
use to flush the entire cache. It contains and use routines
to flush regions of the dcache & icache, and those can prefectly
be used in userland. In fact, none of the cache flush code is
relying on supervisor mode, you don't need to add a syscall for
that, just copy the code you need in userland.

> The only remaining ppc version of flush_data_cache is used by
> flush_instruction_cache in arc/ppc/boot/common/util.S

That's wrong. You should flush the cache over the range
where you need it flushed. Also, there are very few reasons
why one would want to flush the dcache, so it would be interesting
to know what you are really trying to do.

> There is also another version of flush_instruction_cache implemented in
> arch/ppc/kernel/misc.S.

That is only used on some embedded CPUs afaik.

Ben.


2004-03-05 05:39:55

by Chris Friesen

[permalink] [raw]
Subject: Re: problem with cache flush routine for G5?

Benjamin Herrenschmidt wrote:

> In fact, none of the cache flush code is
> relying on supervisor mode, you don't need to add a syscall for
> that, just copy the code you need in userland.

That's useful information.

> That's wrong. You should flush the cache over the range
> where you need it flushed.

Yeah, I know. I'm not sure why they want this.

> Also, there are very few reasons
> why one would want to flush the dcache, so it would be interesting
> to know what you are really trying to do.

I'll look into it further and see what I can find out.

Assuming that I really did want to flush the whole cache, how would I go
about doing that from userspace? Do it like the loop in
arch/ppc/boot/common/util.S? How would this interact with virtual
addresses? Would I need to mmap an appropriately sized area of kernel
memory to get contiguous memory? The app is running as root, so
permissions are not an issue.

Thanks for your help,

Chris




--
Chris Friesen | MailStop: 043/33/F10
Nortel Networks | work: (613) 765-0557
3500 Carling Avenue | fax: (613) 765-2986
Nepean, ON K2H 8E9 Canada | email: [email protected]

2004-03-05 05:47:33

by Benjamin Herrenschmidt

[permalink] [raw]
Subject: Re: problem with cache flush routine for G5?


> Assuming that I really did want to flush the whole cache, how would I go
> about doing that from userspace?

I don't think you can do it reliably, but again, that do not make
sense, so please check with those folks what's exactly going on.



2004-03-05 15:15:49

by Tom Rini

[permalink] [raw]
Subject: Re: problem with cache flush routine for G5?

On Fri, Mar 05, 2004 at 11:04:26AM +1100, Benjamin Herrenschmidt wrote:
>
> >
> > ... unless this is a 'G5' that's not in a pmac, it's not my code, and
> > the openfirmware bootloaders don't, IIRC, do any cache stuff.
>
> Heh, well, they should. You need to flush the dcache & invalidate the
> icache over the kernel image after decompressing it. It's just that
> those routines are totally broken as far as I can see, but I don't
> think the pmac bootloaders will use them, they probably use a
> different implementation that works.

I take that back, the openfirmware stuff does have it's own flush_cache
(of course).

> Note that whatever machines are using those routines will probably
> have trouble too anyway

I hate to disappoint Ben, but nothing is having any trouble with that
code, even if it's not exactly right. :)

Regardless, I'll go and make all of the boot code code the same,
working, cache flushing code. Thanks for finding that, Chris.

--
Tom Rini
http://gate.crashing.org/~trini/

2004-03-05 17:21:44

by Chris Friesen

[permalink] [raw]
Subject: Re: problem with cache flush routine for G5?

Benjamin Herrenschmidt wrote:
>>Assuming that I really did want to flush the whole cache, how would I go
>>about doing that from userspace?
>>
>
> I don't think you can do it reliably, but again, that do not make
> sense, so please check with those folks what's exactly going on.

I've gotten some more information on what's going on.

We have a proprietary OS (originally written for custom hardware) that
is running on a thin emulation layer on top of linux.

This OS allows runtime patching of code. After changing the
instruction(s), it then has to make sure that the icache doesn't contain
stale instructions.

The original code was written for ppc hardware that had the ability to
flush the whole dcache and invalidate the whole icache, all at once, so
that's what they used. The code doesn't track the address/size of what
was changed. For our existing products, we are using the 74xx series,
and they've got hardware cache flush/invalidate as well, so we just kept
using that. For the 970 however, that hardware mechanisms seem to be
absent, which started me on this whole path.

After doing some digging in the 970fx specs, it seems that we may not
need to explicitly force a store of the L1 dcache at all. According to
the docs, the L1 dcache is unconditionally store-through. Thus, for a
brute-force implementation we should be able to just invalidate the
whole icache, do the appropriate sync/isync, and it should pick up the
changed instructions from the L2 cache. Do you see any problems with
this? Do I actually still need the store?

Of course, the proper fix is to change the code in the OS running on the
emulator to track the addresses that got changed and just do the minimal
work required.

Chris

--
Chris Friesen | MailStop: 043/33/F10
Nortel Networks | work: (613) 765-0557
3500 Carling Avenue | fax: (613) 765-2986
Nepean, ON K2H 8E9 Canada | email: [email protected]

2004-03-05 23:52:38

by Benjamin Herrenschmidt

[permalink] [raw]
Subject: Re: problem with cache flush routine for G5?


> I hate to disappoint Ben, but nothing is having any trouble with that
> code, even if it's not exactly right. :)

It's actually completely wrong. If things work, they do so by mere
luck.

> Regardless, I'll go and make all of the boot code code the same,
> working, cache flushing code. Thanks for finding that, Chris.
--
Benjamin Herrenschmidt <[email protected]>

2004-03-05 23:55:06

by Benjamin Herrenschmidt

[permalink] [raw]
Subject: Re: problem with cache flush routine for G5?


> This OS allows runtime patching of code. After changing the
> instruction(s), it then has to make sure that the icache doesn't contain
> stale instructions.
>
> The original code was written for ppc hardware that had the ability to
> flush the whole dcache and invalidate the whole icache, all at once, so
> that's what they used.

That's very inefficient.

> The code doesn't track the address/size of what
> was changed. For our existing products, we are using the 74xx series,
> and they've got hardware cache flush/invalidate as well, so we just kept
> using that.

Ouch... that _VERY_ inefficient... and the HW flush on the 74xx may
be broken on some models afaik =P

> For the 970 however, that hardware mechanisms seem to be
> absent, which started me on this whole path.
>
> After doing some digging in the 970fx specs, it seems that we may not
> need to explicitly force a store of the L1 dcache at all. According to
> the docs, the L1 dcache is unconditionally store-through. Thus, for a
> brute-force implementation we should be able to just invalidate the
> whole icache, do the appropriate sync/isync, and it should pick up the
> changed instructions from the L2 cache. Do you see any problems with
> this? Do I actually still need the store?

It's unclear if stores will get straight to the coherency domain or not,
they may still be stuffed in a store queue... Though a sync would
probably flush it. Still, you should really try to get your code fixes
to just dcb{f,st}/icbi on the right instruction.

> Of course, the proper fix is to change the code in the OS running on the
> emulator to track the addresses that got changed and just do the minimal
> work required.
>
> Chris
--
Benjamin Herrenschmidt <[email protected]>

2004-03-06 02:36:17

by Anton Blanchard

[permalink] [raw]
Subject: Re: problem with cache flush routine for G5?


> After doing some digging in the 970fx specs, it seems that we may not
> need to explicitly force a store of the L1 dcache at all. According to
> the docs, the L1 dcache is unconditionally store-through. Thus, for a
> brute-force implementation we should be able to just invalidate the
> whole icache, do the appropriate sync/isync, and it should pick up the
> changed instructions from the L2 cache. Do you see any problems with
> this? Do I actually still need the store?

Yeah you will get away with it on the 970FX, do the sync before and
isync afterwards. As you suggest, you really should modify it to track
changed regions and flush them explicitly :)

Anton

2004-03-08 07:51:25

by Segher Boessenkool

[permalink] [raw]
Subject: Re: problem with cache flush routine for G5?

> After doing some digging in the 970fx specs, it seems that we may not
> need to explicitly force a store of the L1 dcache at all. According
> to the docs, the L1 dcache is unconditionally store-through. Thus, for
> a brute-force implementation we should be able to just invalidate the
> whole icache, do the appropriate sync/isync, and it should pick up the
> changed instructions from the L2 cache. Do you see any problems with
> this? Do I actually still need the store?

You need a sync instruction before the instruction cache invalidate,
to make sure all stores to L2 have completed.

I don't know which "970fx specs" you mean, but if it's the programming
manual, it should tell you how to invalidate the entire instruction
cache (and how to flush the L2 cache, if you really must. But just
don't).


Segher