2017-07-18 09:45:28

by Ian Molton

[permalink] [raw]
Subject: RFC: brcmfmac bus level addressing issues.

Hi folks,

It's come to my attention that there is a substantial disparity between
the PCIe and SDIO variants of the driver when it comes to handling
writes via the backplane.

The SDIO bus code checks, upon every (32-bit) access, whether the
backplane window is in the right place, and only updates it if it has
actually changed.

The PCIe code sets the window *regardless* of whether it's changed, on
*every single* write.

The SDIO code has no explicit selection of the window address based on
the core selected.

The PCIe code uses brcmf_pcie_select_core(), which, ultimately, appears
to be totally redundant, due to the above-mentioned 32-bit access code
setting the window register regardless of its current value.

------------------------------------

Can we standardise how this is supposed to work? It's ugly, and it's going
to cause bugs, ultimately. I suspect it's probably the cause of the code
I mentioned in my recent patch titled "brcmfmac: HACK - stabilise the
value of ->sbwad in use for some xfer routines."

We really *don't* want to call brcmf_pcie_select_core() all over the
place. It's inefficient, traversing a list as it does, when all it does
is return a pointer that never actually changes, to the core structures
that contain addressing info.

I'd propose we do what I've done in my SDIO patch set - we call
brcmf_chip_get_core() *once* after the chip has been probed, and store
the pointer returned.
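
A minimal sketch of the look-up-once idea, assuming the cores live in a
singly linked list. Every name here (struct core, find_core(),
dev_probe_done(), nodes_visited) is a hypothetical stand-in for
illustration, not the driver's real structures and not the real
brcmf_chip_get_core():

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a core descriptor; not the driver's struct. */
struct core {
    uint16_t id;
    uint32_t base;
    struct core *next;
};

/* Instrumentation for this sketch only: counts list nodes inspected. */
static unsigned int nodes_visited;

/* Walk the list to find a core by id. */
static struct core *find_core(struct core *head, uint16_t id)
{
    for (struct core *c = head; c != NULL; c = c->next) {
        nodes_visited++;
        if (c->id == id)
            return c;
    }
    return NULL;
}

/* Per-device state that remembers the core found at probe time. */
struct dev_state {
    struct core *cores;
    struct core *pcie_core; /* cached once, reused afterwards */
};

/* Called once after the chip has been probed; returns 0 on success. */
static int dev_probe_done(struct dev_state *dev, uint16_t pcie_id)
{
    dev->pcie_core = find_core(dev->cores, pcie_id);
    return dev->pcie_core ? 0 : -1;
}

/* Later users read the cached pointer; no list walk happens here. */
static uint32_t dev_pcie_base(const struct dev_state *dev)
{
    return dev->pcie_core->base;
}
```

In this model the list is walked once at probe time, and every later
caller of dev_pcie_base() pays only a pointer dereference.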

The window register setting can be hidden in the read32/write32 buscore
ops, and will never be incorrect from that point, and the code can
simply use a flat address space model. A single if() has got to be less
costly than writing the register on every single read32/write32...
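
As a rough illustration of the flat address space model, here is a sketch
against a simulated device rather than real hardware. The 4 KiB window
size, the struct sim_dev layout and the function names are assumptions
made for the example; they are not taken from the driver:

```c
#include <stdint.h>

#define WINDOW_SIZE     0x1000u             /* assumed window span */
#define WINDOW_MASK     (WINDOW_SIZE - 1u)
#define BACKPLANE_BYTES 0x4000u             /* simulated space: 4 windows */

/* Simulated device: a small backplane plus one window register. */
struct sim_dev {
    uint32_t backplane[BACKPLANE_BYTES / 4u];
    uint32_t hw_window;         /* what the "hardware" currently maps */
    uint32_t cached_window;     /* driver's copy of the last value written */
    int window_valid;           /* 0 until the first window write */
    unsigned int window_writes; /* instrumentation for the sketch */
};

static void set_window(struct sim_dev *d, uint32_t base)
{
    d->hw_window = base;
    d->window_writes++;
}

/*
 * Move the window only if addr falls outside the cached one, then
 * return the offset of addr within the window. This is the single if().
 */
static uint32_t prep_addr(struct sim_dev *d, uint32_t addr)
{
    uint32_t base = addr & ~WINDOW_MASK;

    if (!d->window_valid || d->cached_window != base) {
        set_window(d, base);
        d->cached_window = base;
        d->window_valid = 1;
    }
    return addr & WINDOW_MASK;
}

/* Callers pass flat addresses below BACKPLANE_BYTES; no bounds check here. */
static uint32_t read32(struct sim_dev *d, uint32_t addr)
{
    uint32_t off = prep_addr(d, addr);

    return d->backplane[(d->hw_window + off) / 4u];
}

static void write32(struct sim_dev *d, uint32_t addr, uint32_t val)
{
    uint32_t off = prep_addr(d, addr);

    d->backplane[(d->hw_window + off) / 4u] = val;
}
```

Callers of read32()/write32() never touch the window themselves. The
sketch only holds if nothing else moves the window behind the cache's
back, which is an assumption that would need checking in a real driver.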

Anyhow, whatever we decide to do, can we do the same thing in both bus
drivers?

-Ian


2017-07-18 11:27:52

by Ian Molton

[permalink] [raw]
Subject: Re: brcmfmac bus level addressing issues.

On 18/07/17 11:35, Hante Meuleman wrote:
> Hi Ian,
>
> I've really no idea what you mean.

You should look at the code...

> brcmf_pcie_select_core is redundant?

Essentially yes - there may be a couple of corner cases where the IO
accesses are not performed via brcmf_pcie_buscore_{read,write}32() - but
other than that, brcmf_pcie_buscore_prep_addr() sets the IO window
unconditionally on every access.

> Care to try to boot a device without this function?

I strongly suspect it would work. Perhaps try it? Give me a device and
I'll try it.

> Called all over the place? Hell no, it is default pointing to PCIE2
> and functions which need to map the window to another core will do
> so, temporarily, but move it back to PCIE2, at least that is the
> idea, maybe you found a bug?

brcmf_pcie_select_core() looks up the core structure from the core id.

It then sets BRCMF_PCIE_BAR0_WINDOW according to the core base address.

It even goes to the length of reading the register back and trying again
if it's not set, which is at least a little bit horrifying.
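
The sequence described above (look up, write, read back, retry) can be
modelled roughly as follows. This is a simplified simulation, not the
driver source: struct sim_cfg, cfg_write(), cfg_read(), the array-based
lookup() and the return value are all assumptions made for illustration:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical core descriptor for this sketch. */
struct core {
    uint16_t id;
    uint32_t base;
};

/* Simulated config register that can silently drop some writes. */
struct sim_cfg {
    uint32_t window;
    unsigned int drop_writes;    /* writes to ignore before one sticks */
    unsigned int write_attempts; /* instrumentation for the sketch */
};

static void cfg_write(struct sim_cfg *cfg, uint32_t val)
{
    cfg->write_attempts++;
    if (cfg->drop_writes > 0) {
        cfg->drop_writes--;
        return;
    }
    cfg->window = val;
}

static uint32_t cfg_read(const struct sim_cfg *cfg)
{
    return cfg->window;
}

/* Find a core by id in a plain array. */
static const struct core *lookup(const struct core *cores, size_t n,
                                 uint16_t id)
{
    for (size_t i = 0; i < n; i++)
        if (cores[i].id == id)
            return &cores[i];
    return NULL;
}

/* Returns 0 if the window ends up at the core base, -1 otherwise. */
static int select_core(struct sim_cfg *cfg, const struct core *cores,
                       size_t n, uint16_t id)
{
    const struct core *c = lookup(cores, n, id);

    if (c == NULL)
        return -1;

    cfg_write(cfg, c->base);
    if (cfg_read(cfg) != c->base)
        cfg_write(cfg, c->base); /* one retry after a failed read-back */

    return cfg_read(cfg) == c->base ? 0 : -1;
}
```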

------------

brcmf_pcie_buscore_{read,write}32() both call brcmf_pcie_buscore_prep_addr()

brcmf_pcie_buscore_prep_addr() *unconditionally* programs
BRCMF_PCIE_BAR0_WINDOW on *every single* IO access.

If you want inefficient - it's right there.


The SDIO version of the code is actually considerably more efficient on
this point - at least it only programs the window register when it
changes, not on every single IO access.
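
To make the difference concrete, here is a small sketch that replays an
address trace under both policies and counts window register writes. The
4 KiB window mask, the names and the trace are illustrative assumptions.
It counts register writes only and says nothing about real throughput:

```c
#include <stddef.h>
#include <stdint.h>

#define WINDOW_MASK 0xfffu /* assumed 4 KiB window, for illustration */

struct window_reg {
    uint32_t value;
    unsigned int writes;
};

/* Policy A: program the window before every access. */
static void access_always(struct window_reg *w, uint32_t addr)
{
    w->value = addr & ~WINDOW_MASK;
    w->writes++;
}

/* Policy B: program the window only when the target window differs. */
static void access_if_changed(struct window_reg *w, uint32_t addr)
{
    uint32_t base = addr & ~WINDOW_MASK;

    if (w->value != base) {
        w->value = base;
        w->writes++;
    }
}

/* Replay an address trace and return the window writes it caused. */
static unsigned int replay(void (*policy)(struct window_reg *, uint32_t),
                           const uint32_t *trace, size_t n)
{
    /* 0xffffffff is never a valid aligned base, so it means "unset". */
    struct window_reg w = { 0xffffffffu, 0 };

    for (size_t i = 0; i < n; i++)
        policy(&w, trace[i]);
    return w.writes;
}
```

For a trace that touches one window three times, a second window twice
and then returns to the first, policy A performs six window writes and
policy B performs three.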

> We are
> for sure not going to hide the selecting of the window in the read/write
> routines, that would result in a giant amount of overhead.

Actually it would result in *considerably* less overhead than the
current code, that blindly sets the window on every access.

> Currently PCIE
> devices reach 1.5Gbps, we need to go faster than that in the near future.

I don't need a lesson on writing efficient code, thanks.

> We don't want just change that to make it a bit nicer..... Why do you need
> to see the same in the SDIO and PCIE drivers? SDIO and PCIE differ in many
> aspects. Sure some things can be improved in one or the other, but they sure
> don't need to look alike.

I don't "need" to see the same in both drivers. Not where it isn't
appropriate.

But every part of the drivers that can share code without noticeably
impacting performance *should* do so. You should be justifying to me why
the code has to be different, not the other way around. Are you
seriously arguing that sharing common code is a bad idea?

> It may be ugly, but thus far it has not caused bugs

Oh, I bet you it has... try reading the SDIO version (note the reliance
on the dangling ->sbwad pointer) and tell me again that this hasn't
caused bugs.

Right now, the bulk of the driver code is sat on top of at least two bus
drivers with differing IO models, and is working via good luck alone.

> The concept in pcie bus part is simple.

And differs completely from the SDIO part.

> The main core to select is PCIE2 (once you have
> booted and established initial communication with firmware) and every
> routine which needs to access another core will change the window
> temporarily and set it back once done. Please don't mess with this, it
> works, it is clear and it is fast.

It is anything but fast. Changing the window involves traversing the
list of cores. Every. Single. Time. It doesn't *have* to - but that's what
brcmf_chip_get_core() does, and brcmf_pcie_select_core() calls it.

2017-07-18 10:35:07

by Hante Meuleman

[permalink] [raw]
Subject: RE: brcmfmac bus level addressing issues.

Hi Ian,

I've really no idea what you mean. brcmf_pcie_select_core is redundant?
Care to try to boot a device without this function? Called all over the
place? Hell no, it is default pointing to PCIE2 and functions which need
to map the window to another core will do so, temporarily, but move it
back to PCIE2, at least that is the idea, maybe you found a bug? We are
for sure not going to hide the selecting of the window in the read/write
routines, that would result in a giant amount of overhead. Currently PCIE
devices reach 1.5Gbps, we need to go faster than that in the near future.
We don't want just change that to make it a bit nicer..... Why do you need
to see the same in the SDIO and PCIE drivers? SDIO and PCIE differ in many
aspects. Sure some things can be improved in one or the other, but they sure
don't need to look alike.

It may be ugly, but thus far it has not caused bugs (and there won't be
large changes in the near future where it will cause bugs). The concept in
pcie bus part is simple. The main core to select is PCIE2 (once you have
booted and established initial communication with firmware) and every
routine which needs to access another core will change the window
temporarily and set it back once done. Please don't mess with this, it
works, it is clear and it is fast.
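
For comparison with the per-access approach, the select-and-restore model
described above might look roughly like this against a simulated device.
The core indexes, the struct sim_dev layout and the function names are
hypothetical and chosen only for the example:

```c
#include <stdint.h>

#define WINDOW_SIZE 0x1000u /* assumed window span */
#define NUM_CORES   3u

/* Hypothetical core indexes for the simulation; not real core ids. */
enum core_idx { CORE_PCIE2 = 0, CORE_CHIPCOMMON = 1, CORE_OTHER = 2 };

/* Simulated device: one register bank per core plus a window selector. */
struct sim_dev {
    uint32_t regs[NUM_CORES][WINDOW_SIZE / 4u];
    unsigned int window;        /* index of the currently mapped core */
    unsigned int window_writes; /* instrumentation for the sketch */
};

static void select_core(struct sim_dev *d, unsigned int core)
{
    d->window = core;
    d->window_writes++;
}

/* Direct accessors: no window handling, they trust the current selection. */
static uint32_t read_reg32(const struct sim_dev *d, uint32_t off)
{
    return d->regs[d->window][off / 4u];
}

static void write_reg32(struct sim_dev *d, uint32_t off, uint32_t val)
{
    d->regs[d->window][off / 4u] = val;
}

/* A routine that needs another core: switch, access, switch back. */
static uint32_t read_other_core(struct sim_dev *d, unsigned int core,
                                uint32_t off)
{
    uint32_t val;

    select_core(d, core);
    val = read_reg32(d, off);
    select_core(d, CORE_PCIE2); /* restore the default window */
    return val;
}
```

In this model, accesses to the default core cost no window writes at all,
while each excursion to another core costs two. Correctness depends on
every such routine restoring the window before it returns.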

Regards,
Hante

-----Original Message-----
From: Ian Molton [mailto:[email protected]]
Sent: Tuesday, July 18, 2017 11:45 AM
To: [email protected]
Cc: Arend Van Spriel; Franky Lin; Hante Meuleman
Subject: RFC: brcmfmac bus level addressing issues.

Hi folks,

It's come to my attention that there is a substantial disparity between the
PCIe and SDIO variants of the driver when it comes to handling writes via
the backplane.

The SDIO bus code checks, upon every (32-bit) access, whether the backplane
window is in the right place, and only updates it if it has actually
changed.

The PCIe code sets the window *regardless* of whether it's changed, on
*every single* write.

The SDIO code has no explicit selection of the window address based on the
core selected.

The PCIe code uses brcmf_pcie_select_core(), which, ultimately, appears to
be totally redundant, due to the above-mentioned 32-bit access code
setting the window register regardless of its current value.

------------------------------------

Can we standardise how this is supposed to work? It's ugly, and it's going
to cause bugs, ultimately. I suspect it's probably the cause of the code I
mentioned in my recent patch titled "brcmfmac: HACK - stabilise the value
of ->sbwad in use for some xfer routines."

We really *don't* want to call brcmf_pcie_select_core() all over the place.
It's inefficient, traversing a list as it does, when all it does is return
a pointer that never actually changes, to the core structures that contain
addressing info.

I'd propose we do what I've done in my SDIO patch set - we call
brcmf_chip_get_core() *once* after the chip has been probed, and store the
pointer returned.

The window register setting can be hidden in the read32/write32 buscore
ops, and will never be incorrect from that point, and the code can simply
use a flat address space model. A single if() has got to be less costly
than writing the register on every single read32/write32...

Anyhow, whatever we decide to do, can we do the same thing in both bus
drivers?

-Ian