On Tue, 31 Jul 2007 16:36:30 +0100
David Vrabel <[email protected]> wrote:
> These three patches enhance the support for the SDIO IO_RW_EXTENDED
> command. The block size of functions is managed and the I/O ops
> (sdio_readsb() etc) are extended to handle arbitrary lengths of data
> (by using multiple commands).
>
> I've not yet had a chance to test this stuff as I don't (yet) have
> the time to write a Bluetooth Type-A driver so these are posted as an
> example of the sort of API I'd expect.
>
Thanks. These are some nice improvements. I do have one suggestion
though:
Could we design it so that sdio_io_rw_ext_helper() sets the block size
itself? That way most drivers wouldn't have to care about that detail
and the core would be free to choose optimal values.
I suspect that some transactions might require a certain block size.
But we could satisfy that by stating that any transfer small enough to
fit into one block will not be split up.
(PS. You might want to add [PATCH x/3] to your subjects so that the
order of the patches is crystal clear)
Rgds
--
-- Pierre Ossman
Linux kernel, MMC maintainer http://www.kernel.org
PulseAudio, core developer http://pulseaudio.org
rdesktop, core developer http://www.rdesktop.org
Pierre Ossman wrote:
> On Tue, 31 Jul 2007 16:36:30 +0100
> David Vrabel <[email protected]> wrote:
>
>> These three patches enhance the support for the SDIO IO_RW_EXTENDED
>> command. The block size of functions is managed and the I/O ops
>> (sdio_readsb() etc) are extended to handle arbitrary lengths of data
>> (by using multiple commands).
>>
>> I've not yet had a chance to test this stuff as I don't (yet) have
>> the time to write a Bluetooth Type-A driver so these are posted as an
>> example of the sort of API I'd expect.
>>
>
> Thanks. These are some nice improvements. I do have one suggestion
> though:
>
> Could we design it so that sdio_io_rw_ext_helper() sets the block size
> itself? That way most drivers wouldn't have to care about that detail
> and the core would be free to choose optimal values.
I would expect the block size to be set once per card, and never be
changed and thus it's not logically a per-transfer operation. We
certainly wouldn't want to change the block size willy-nilly as it's an
expensive operation.
The patch I've presented does put the selection of the block size in the
core (bar one thing which I agree should be removed).
> I suspect that some transactions might require a certain block size.
> But we could satisfy that by stating that any transfer small enough to
> fit into one block will not be split up.
I consider it unlikely that any card would want to do anything other
than always use the largest possible block size.
David
--
David Vrabel, Software Engineer, Drivers group Tel: +44 (0)1223 692562
CSR plc, Churchill House, Cambridge Business Park, Cowley Road, CB4 0WZ
.
On Mon, 06 Aug 2007 11:31:19 +0100
David Vrabel <[email protected]> wrote:
>
> I would expect the block size to be set once per card, and never be
> changed and thus it's not logically a per-transfer operation. We
> certainly wouldn't want to change the block size willy-nilly as it's
> an expensive operation.
>
Indeed. It would of course be optimized so that it doesn't change the
size needlessly.
The beauty is that drivers wouldn't have to care. Things just
work<tm>. :)
> > I suspect that some transactions might require a certain block size.
> > But we could satisfy that by stating that any transfer small enough
> > to fit into one block will not be split up.
>
> I consider it unlikely that any card would want to do anything other
> than always use the largest possible block size.
>
I have a counter example. I have here a Marvell wifi card which needs a
firmware upload. And it seems to be rather picky about parameters
during that upload.
I'm still experimenting with a clean way to do things for this card.
I'll get back to you. :)
Rgds
--
-- Pierre Ossman
Linux kernel, MMC maintainer http://www.kernel.org
PulseAudio, core developer http://pulseaudio.org
rdesktop, core developer http://www.rdesktop.org
Pierre Ossman wrote:
> On Mon, 06 Aug 2007 11:31:19 +0100
> David Vrabel <[email protected]> wrote:
>
>> I would expect the block size to be set once per card, and never be
>> changed and thus it's not logically a per-transfer operation. We
>> certainly wouldn't want to change the block size willy-nilly as it's
>> an expensive operation.
>>
>
> Indeed. It would of course be optimized so that it doesn't change the
> size needlessly.
Drivers may care about the block size though so you can't have it
changing behind their backs. e.g., they may need to pad data to a
multiple of the block size.
>>> I suspect that some transactions might require a certain block size.
>>> But we could satisfy that by stating that any transfer small enough
>>> to fit into one block will not be split up.
>> I consider it unlikely that any card would want to do anything other
>> than always use the largest possible block size.
>>
>
> I have a counter example. I have here a Marvell wifi card which needs a
> firmware upload. And it seems to be rather picky about parameters
> during that upload.
>
> I'm still experimenting with a clean way to do things for this card.
> I'll get back to you. :)
sdio_set_block_size(func, 64); /* ew, this card is fussy */
download_firmware();
sdio_set_block_size(func, func->max_blksize); /* Ahhh, back to the
card's default */
If a card is fussy about the block size I don't think there's anything
cleaner than specifying directly in the driver.
David
--
David Vrabel, Software Engineer, Drivers group Tel: +44 (0)1223 692562
CSR plc, Churchill House, Cambridge Business Park, Cowley Road, CB4 0WZ
.
On Mon, 06 Aug 2007 16:32:40 +0100
David Vrabel <[email protected]> wrote:
>
> Drivers may care about the block size though so you can't have it
> changing behind their backs. e.g., they may need to pad data to a
> multiple of the block size.
>
Well, we have to assume that they aren't just padding to pass the time.
I can see a couple of reasons:
1. They are padding because it made their code easier by allowing just
one transfer.
This is what I believe is the common case, and one that will go away if
we allow the core free access to the block size. All the complexity is
in the core and the drivers don't even have to bother with setting a
block size.
2. They must write an exact number of bytes to the card before it
activates.
As far as the core is concerned, padding and "real" data are the same
thing, so this poses no restriction on how the core can fiddle with
block sizes and transfers.
3. The entire transfer must be in one transaction.
Now this is a problem as the core might prefer to do two transfers
(probably one block and one byte).
As I believe this will be a rare case, we should try to make sure we
can handle this in a way that can keep less fussy transactions simple.
So I propose changing your sdio_set_block_size() API to
sdio_force_block_size().
That way the driver can lock the block size when it has particular
needs, yet keep it under the (assumingly more optimal) control of the
core at other times.
One could have a calling convention such as specifying a block size of
zero to turn off the forced block size.
One could also use this as a less complex way of informing the drivers
of host restrictions. If the block size specified isn't possible, we
can return an error code stating such. Without that, every driver that
needs this would need to duplicate code that computes possible block
size and would make our life a pain when we want to add new host
restrictions.
> >
> > I have a counter example. I have here a Marvell wifi card which
> > needs a firmware upload. And it seems to be rather picky about
> > parameters during that upload.
> >
>
> sdio_set_block_size(func, 64); /* ew, this card is fussy */
>
> download_firmware();
>
> sdio_set_block_size(func, func->max_blksize); /* Ahhh, back to the
> card's default */
>
> If a card is fussy about the block size I don't think there's anything
> cleaner than specifying directly in the driver.
>
Well, the driver has to specify the information somehow. As to how,
there are a number of options. My proposed sdio_force_block_size() will
work here as well.
Rgds
--
-- Pierre Ossman
Linux kernel, MMC maintainer http://www.kernel.org
PulseAudio, core developer http://pulseaudio.org
rdesktop, core developer http://www.rdesktop.org
This is essentially what I mean.
--
-- Pierre Ossman
Linux kernel, MMC maintainer http://www.kernel.org
PulseAudio, core developer http://pulseaudio.org
rdesktop, core developer http://www.rdesktop.org
Pierre Ossman wrote:
> This is essentially what I mean.
It does not behave very well, see the specific comments at the end.
When considering optimizing SDIO it is important to always keep in mind
that a command is very expensive. They're some 100-150 clocks long and
there's also overheads in interrupt handling/scheduling, this can add to
up to 10s of us. Therefore, when selecting an optimum block size one
should aim to reduce the number of commands before attempting to
maximize the block size.
There are two cases to consider.
1. Card's max block_size <= 512.
In this case the optimum block size is /always/ the card's maximum.
This means we can do all transfers in two commands (and if the driver is
aware of the block size it may choose to pad transfers so only one
command is done).
2. Card's max block size > 512.
We can't handle arbitrary sized transfers with a block size > 512 (since
512 is maximum length of a byte mode transfer), so if we wish to use a
block size of 1024 (say) some transfers will require three commands (a
block mode, and two byte mode transfers). For a block size of 2048 (the
maximum possible) some transfer may require 5 commands!
We could attempt to set the block size as follows:
transfer size % 1024 == 0 => block size = 1024
transfer size % 2048 == 0 => block size = 2048
otherwise => block size = 512
However, without knowing what sizes of transfers the driver is going to
use we cannot intelligently know when it's best to switch the block
size. A block size change is expensive (two commands).
The change in block size from 512 to 2048 improves performance (for an
individual command) by 2.0% [1]. I don't believe this performance
benefit is worth the possible overhead of frequent block size changes
nor the cost of potentially more than 2 commands per transfer.
[1] The real performance benefit will be much less due to per command
overhead and other non IO_RW_EXTENDED traffic on the bus.
In conclusion, the optimum block size is based solely on the card and
host capabilities and should be limited to at most 512. Hence the block
size can (and should) only be set once during card initialization.
This has the added benefit of making the code simpler and hence easier
to understand and maintain.
> +/*
> + * Internal function. Splits an arbitrarily sized data transfer
> + * into several IO_RW_EXTENDED commands.
> + */
> +static int sdio_io_rw_ext_helper(struct sdio_func *func, int write,
> + unsigned addr, int incr_addr, u8 *buf, unsigned size)
> +{
> + unsigned remainder = size;
> + int ret;
> + unsigned short blksz;
> + struct mmc_host *host = func->card->host;
> +
> + if (func->forced_blksz)
> + blksz = func->forced_blksz;
> + else {
> + if (size <= 512)
> + goto byte_remainder;
The maximum size for a byte mode transfer is the block size set in the card.
> +
> + blksz = size;
> + if (size > 0xffff)
> + blksz = 0xffff;
The largest possible block size is 2048. I'd also not be keen on using
a block size which isn't a power of 2 -- hardware is unlikely to have
been exercised with unusual block sizes.
> + if (blksz > func->blksize)
> + blksz = func->blksize;
> + if (blksz > host->max_blk_size)
> + blksz = host->max_blk_size;
> +
> + /* avoid changing blksize needlessly */
> + if (func->cur_blksz && ((blksz % func->cur_blksz) == 0))
> + blksz = func->cur_blksz;
If func->blksize == 1024 and we perform transfers in the 512 to 1024
range, this would set the block size on every transfer.
> + }
David
--
David Vrabel, Software Engineer, Drivers group Tel: +44 (0)1223 692562
CSR plc, Churchill House, Cambridge Business Park, Cowley Road, CB4 0WZ
.
--
David Vrabel, Software Engineer, Drivers group Tel: +44 (0)1223 692562
CSR plc, Churchill House, Cambridge Business Park, Cowley Road, CB4 0WZ
.
--
David Vrabel, Software Engineer, Drivers group Tel: +44 (0)1223 692562
CSR plc, Churchill House, Cambridge Business Park, Cowley Road, CB4 0WZ
.
--
David Vrabel, Software Engineer, Drivers group Tel: +44 (0)1223 692562
CSR plc, Churchill House, Cambridge Business Park, Cowley Road, CB4 0WZ
.
--
David Vrabel, Software Engineer, Drivers group Tel: +44 (0)1223 692562
CSR plc, Churchill House, Cambridge Business Park, Cowley Road, CB4 0WZ
.
On Tue, 07 Aug 2007 13:51:39 +0100
David Vrabel <[email protected]> wrote:
>
> In conclusion, the optimum block size is based solely on the card and
> host capabilities and should be limited to at most 512. Hence the
> block size can (and should) only be set once during card
> initialization.
>
Well, I can hardly argue with that thorough analysis. :)
I still like the idea of having the control in the core as much as
possible though. It allows people to experiment and figure out new and
clever ways of solving this in a way that can benefit all drivers.
So could we do most of what you propose, where we set the block size to
maximum, yet <= 512. Drivers can tweak this further by calling
sdio_force_block_size() (as need e.g. by my marvell card), and keep the
convention of 0 meaning "back to core default"? That way we can modify
the meaning of "core default" if more clever ways are available.
> > + if (size <= 512)
> > + goto byte_remainder;
>
> The maximum size for a byte mode transfer is the block size set in
> the card.
>
Right, sorry. It was late when I was hacking on this.
> > + /* avoid changing blksize needlessly */
> > + if (func->cur_blksz && ((blksz % func->cur_blksz)
> > == 0))
> > + blksz = func->cur_blksz;
>
> If func->blksize == 1024 and we perform transfers in the 512 to 1024
> range, this would set the block size on every transfer.
>
Quite right. I had no doubts that thing had room for improvement. But I
agree that we might as well select a fixed block size and stick with it
for the general case.
Rgds
--
-- Pierre Ossman
Linux kernel, MMC maintainer http://www.kernel.org
PulseAudio, core developer http://pulseaudio.org
rdesktop, core developer http://www.rdesktop.org
On Tue, 07 Aug 2007 13:54:32 +0100
David Vrabel <[email protected]> wrote:
>
> Index: mmc/drivers/mmc/core/sdio.c
> ===================================================================
> --- mmc.orig/drivers/mmc/core/sdio.c 2007-08-07
> 00:38:33.000000000 +0100 +++ mmc/drivers/mmc/core/sdio.c
> 2007-08-07 07:17:29.000000000 +0100 @@ -53,6 +53,7 @@
> {
> int ret;
> struct sdio_func *func;
> + unsigned block_size;
>
> BUG_ON(fn > SDIO_MAX_FUNCS);
>
> @@ -70,6 +71,15 @@
> if (ret)
> goto fail;
>
> + /*
> + * Set the function's block size to the largest supported by
> + * both the function and the host.
> + */
> + block_size = min(func->max_blksize,
> func->card->host->max_blk_size);
> + ret = sdio_set_block_size(func, block_size);
> + if (ret)
> + goto fail;
> +
> card->sdio_func[fn - 1] = func;
>
> return 0;
This is probably better done in the sdio_enable_func().
We also need to check that the card actually supports block io.
--
-- Pierre Ossman
Linux kernel, MMC maintainer http://www.kernel.org
PulseAudio, core developer http://pulseaudio.org
rdesktop, core developer http://www.rdesktop.org
On Tue, 07 Aug 2007 13:55:12 +0100
David Vrabel <[email protected]> wrote:
> sdio: extend sdio_readsb() and friends to handle any length of buffer
>
> Extend sdio_readsb(), sdio_writesb(), sdio_memcpy_fromio(), and
> sdio_memcpy_toio() to handle any length of buffer by splitting the
> transfer into several IO_RW_EXTENDED commands. Typically, a transfer
> would be split into a single block mode transfer followed by a byte
> mode transfer for the remainder.
>
> For this to work the block size must be limited to the maximum size
> of a byte mode transfer (512 bytes). This limitation could be
> revisited if there are any cards out there that require a block size
> > 512.
>
> host->max_seg_size <= host->max_req_size so there's no need to check
> both when determining the maximum data size for a single command.
>
> Signed-off-by: David Vrabel <[email protected]>
> ---
> Index: mmc/drivers/mmc/core/sdio_io.c
> ===================================================================
> --- mmc.orig/drivers/mmc/core/sdio_io.c 2007-08-07
> 07:27:31.000000000 +0100 +++ mmc/drivers/mmc/core/sdio_io.c
> 2007-08-07 07:35:02.000000000 +0100 @@ -153,6 +153,13 @@
> {
> int ret;
>
> + /* The standard permits block sizes up to 2048 bytes, but
> + * sdio_readsb() etc. can only work with block size <= 512.
> */
> + if (blksz > 512) {
> + blksz = 512;
> + dev_warn(&func->dev, "block size limited to 512
> bytes\n");
> + }
> +
I'm not sure about not returning an error here. What if a driver calls
it and relies on being able to transfer >512 blocks?
> ret = mmc_io_rw_direct(func->card, 1, 0,
> SDIO_FBR_BASE(func->num) + SDIO_FBR_BLKSIZE,
> blksz & 0xff, NULL);
> @@ -227,6 +234,48 @@
>
> EXPORT_SYMBOL_GPL(sdio_writeb);
>
> +/* Split an arbitrarily sized data transfer into several
> + * IO_RW_EXTENDED commands. */
> +static int sdio_io_rw_ext_helper(struct sdio_func *func, int write,
> + unsigned fn, unsigned addr, int incr_addr, u8 *buf, unsigned
> size) +{
> + unsigned remainder = size;
> + unsigned max_blocks;
> + int ret;
> +
We need to check support for block transfers here as well.
Also, the parameter fn is redundant.
> @@ -109,18 +109,22 @@
> cmd.opcode = SD_IO_RW_EXTENDED;
> cmd.arg = write ? 0x80000000 : 0x00000000;
> cmd.arg |= fn << 28;
> - cmd.arg |= bang ? 0x00000000 : 0x04000000;
> + cmd.arg |= incr_addr ? 0x04000000 : 0x00000000;
> cmd.arg |= addr << 9;
> - cmd.arg |= (size == 512) ? 0 : size;
> + if (blocks > 1) {
|| blksz > 512
Rgds
--
-- Pierre Ossman
Linux kernel, MMC maintainer http://www.kernel.org
PulseAudio, core developer http://pulseaudio.org
rdesktop, core developer http://www.rdesktop.org
On Tue, 07 Aug 2007 13:55:59 +0100
David Vrabel <[email protected]> wrote:
> sdio: disable CD resistor
>
> Disable the card detect resistor to ensure all data lines are equally
> loaded. Not doing this can have a negative impact on buses with
> marginal signal quality.
>
> Signed-off-by: David Vrabel <[email protected]>
I'm not sure about this. I've seen several hosts which lack a
mechanical detect switch.
--
-- Pierre Ossman
Linux kernel, MMC maintainer http://www.kernel.org
PulseAudio, core developer http://pulseaudio.org
rdesktop, core developer http://www.rdesktop.org
Pierre Ossman wrote:
> On Tue, 07 Aug 2007 13:55:59 +0100
> David Vrabel <[email protected]> wrote:
>
>> sdio: disable CD resistor
>>
>> Disable the card detect resistor to ensure all data lines are equally
>> loaded. Not doing this can have a negative impact on buses with
>> marginal signal quality.
>>
>> Signed-off-by: David Vrabel <[email protected]>
>
> I'm not sure about this. I've seen several hosts which lack a
> mechanical detect switch.
If host is correctly designed it will have identical pull-ups on all the
DAT lines, and therefore this internal pull-up should be disconnected.
However, given that disabling the pull-up may cause 4 bit transfers to
fail on hardware that omits a host-side pull-up on DAT3 I agree that the
resistor should remain enabled.
Is there an example driver for a host that uses pin 1/DAT3 sensing for
card detection? I'm curious about how removal detections work.
App Note on card detection goes into the use of this resistor in more
detail but I don't believe this document is publicly available :(.
David
--
David Vrabel, Software Engineer, Drivers group Tel: +44 (0)1223 692562
CSR plc, Churchill House, Cambridge Business Park, Cowley Road, CB4 0WZ
.
On Tue, 07 Aug 2007 15:46:36 +0100
David Vrabel <[email protected]> wrote:
>
> Is there an example driver for a host that uses pin 1/DAT3 sensing for
> card detection? I'm curious about how removal detections work.
>
Both wbsd and sdhci have experienced hardware with this. In sdhci this
is hidden completely in hardware, but wbsd has to do some funky logic
to handle the problem.
> App Note on card detection goes into the use of this resistor in more
> detail but I don't believe this document is publicly available :(.
>
SD is a wonderful world. :/
--
-- Pierre Ossman
Linux kernel, MMC maintainer http://www.kernel.org
PulseAudio, core developer http://pulseaudio.org
rdesktop, core developer http://www.rdesktop.org
Pierre Ossman wrote:
> On Tue, 07 Aug 2007 13:54:32 +0100
> David Vrabel <[email protected]> wrote:
>
>>
>> Index: mmc/drivers/mmc/core/sdio.c
>> ===================================================================
>> --- mmc.orig/drivers/mmc/core/sdio.c 2007-08-07
>> 00:38:33.000000000 +0100 +++ mmc/drivers/mmc/core/sdio.c
>> 2007-08-07 07:17:29.000000000 +0100 @@ -53,6 +53,7 @@
>> {
>> int ret;
>> struct sdio_func *func;
>> + unsigned block_size;
>>
>> BUG_ON(fn > SDIO_MAX_FUNCS);
>>
>> @@ -70,6 +71,15 @@
>> if (ret)
>> goto fail;
>>
>> + /*
>> + * Set the function's block size to the largest supported by
>> + * both the function and the host.
>> + */
>> + block_size = min(func->max_blksize,
>> func->card->host->max_blk_size);
>> + ret = sdio_set_block_size(func, block_size);
>> + if (ret)
>> + goto fail;
>> +
>> card->sdio_func[fn - 1] = func;
>>
>> return 0;
>
> This is probably better done in the sdio_enable_func().
I don't think so. A driver might enable/disable a function multiple
times but there's no need to set the block size every time.
It may be best to move setting the block size back to before the probe
so a driver can be sure the block size is something sensible. Consider
a failed probe that called sdio_set_block_size() -- this will currently
mess up drivers probed later.
David
--
David Vrabel, Software Engineer, Drivers group Tel: +44 (0)1223 692562
CSR plc, Churchill House, Cambridge Business Park, Cowley Road, CB4 0WZ
.
On Tue, 07 Aug 2007 18:20:26 +0100
David Vrabel <[email protected]> wrote:
> Pierre Ossman wrote:
> > On Tue, 07 Aug 2007 13:54:32 +0100
> > David Vrabel <[email protected]> wrote:
> >
> >>
> >> Index: mmc/drivers/mmc/core/sdio.c
> >> ===================================================================
> >> --- mmc.orig/drivers/mmc/core/sdio.c 2007-08-07
> >> 00:38:33.000000000 +0100 +++ mmc/drivers/mmc/core/sdio.c
> >> 2007-08-07 07:17:29.000000000 +0100 @@ -53,6 +53,7 @@
> >> {
> >> int ret;
> >> struct sdio_func *func;
> >> + unsigned block_size;
> >>
> >> BUG_ON(fn > SDIO_MAX_FUNCS);
> >>
> >> @@ -70,6 +71,15 @@
> >> if (ret)
> >> goto fail;
> >>
> >> + /*
> >> + * Set the function's block size to the largest supported
> >> by
> >> + * both the function and the host.
> >> + */
> >> + block_size = min(func->max_blksize,
> >> func->card->host->max_blk_size);
> >> + ret = sdio_set_block_size(func, block_size);
> >> + if (ret)
> >> + goto fail;
> >> +
> >> card->sdio_func[fn - 1] = func;
> >>
> >> return 0;
> >
> > This is probably better done in the sdio_enable_func().
>
> I don't think so. A driver might enable/disable a function multiple
> times but there's no need to set the block size every time.
>
Why would it want to do that?
Anyway, as long as cur_blksz is preserved, sdio_set_block_size() can
easily filter out redundant calls. No need to compromise in the rest of
the code for that.
In the patch I sent, the block size is set the first time is needed.
Wouldn't that avoid all problems?
> It may be best to move setting the block size back to before the probe
> so a driver can be sure the block size is something sensible.
> Consider a failed probe that called sdio_set_block_size() -- this
> will currently mess up drivers probed later.
>
Right, or remove the lock in the variant I proposed.
Rgds
--
-- Pierre Ossman
Linux kernel, MMC maintainer http://www.kernel.org
PulseAudio, core developer http://pulseaudio.org
rdesktop, core developer http://www.rdesktop.org
Here's my proposed compromise.
If a driver uses sdio_force_block_size() extensively, it will be like
your original version with sdio_set_block_size(). If it doesn't,
however, then that is a way to indicate that the driver has no specific
requirements (meaning we are free to change things in the future).
You'll also notice that calling sdio_force_block_size() is free. The
actual change comes when it is first used, meaning that drivers don't
have to be as careful when calling it.
The reset of custom/forced block size is moved to the probe function as
you suggested.
diff --git a/drivers/mmc/core/sdio_bus.c b/drivers/mmc/core/sdio_bus.c
index ca1d4b2..6ead36a 100644
--- a/drivers/mmc/core/sdio_bus.c
+++ b/drivers/mmc/core/sdio_bus.c
@@ -133,6 +133,8 @@ static int sdio_bus_probe(struct device *dev)
if (!id)
return -ENODEV;
+ func->forced_blksz = 0;
+
return drv->probe(func, id);
}
diff --git a/drivers/mmc/core/sdio_cis.c b/drivers/mmc/core/sdio_cis.c
index fffbc5a..20fdd1d 100644
--- a/drivers/mmc/core/sdio_cis.c
+++ b/drivers/mmc/core/sdio_cis.c
@@ -145,9 +145,9 @@ static int cistpl_funce_func(struct sdio_func *func,
printk(KERN_DEBUG "%s: TPLFE_CSA_PROPERTY = 0x%02x\n",
sdio_func_id(func), val);
/* TPLFE_MAX_BLK_SIZE */
- func->blksize = buf[12] | (buf[13] << 8);
+ func->max_blksz = buf[12] | (buf[13] << 8);
printk(KERN_DEBUG "%s: TPLFE_MAX_BLK_SIZE = %u\n",
- sdio_func_id(func), (unsigned)func->blksize);
+ sdio_func_id(func), (unsigned)func->max_blksz);
/* TPLFE_OCR */
val = buf[14] | (buf[15] << 8) | (buf[16] << 16) | (buf[17] << 24);
printk(KERN_DEBUG "%s: TPLFE_OCR = 0x%08x\n",
diff --git a/drivers/mmc/core/sdio_io.c b/drivers/mmc/core/sdio_io.c
index 7f08ba5..46ada64 100644
--- a/drivers/mmc/core/sdio_io.c
+++ b/drivers/mmc/core/sdio_io.c
@@ -202,6 +202,115 @@ void sdio_writeb(struct sdio_func *func, unsigned char b, unsigned int addr,
EXPORT_SYMBOL_GPL(sdio_writeb);
+/*
+ * Internal function. Sets the block size of the card.
+ */
+static int sdio_set_block_size(struct sdio_func *func, unsigned short blksz)
+{
+ int ret;
+
+ if (blksz == func->cur_blksz)
+ return 0;
+
+ /*
+ * We start by clearing the current block size as a failure
+ * will leave the size on the card in an unknown state.
+ */
+ func->cur_blksz = 0;
+
+ ret = mmc_io_rw_direct(func->card, 1, 0,
+ func->num * 0x100 + SDIO_FBR_BLKSIZE,
+ blksz & 0xff, NULL);
+ if (ret)
+ return ret;
+
+ ret = mmc_io_rw_direct(func->card, 1, 0,
+ func->num * 0x100 + SDIO_FBR_BLKSIZE + 1,
+ (blksz >> 8) & 0xff, NULL);
+ if (ret)
+ return ret;
+
+ func->cur_blksz = blksz;
+
+ return 0;
+}
+
+/*
+ * Internal function. Splits an arbitrarily sized data transfer
+ * into several IO_RW_EXTENDED commands.
+ */
+static int sdio_io_rw_ext_helper(struct sdio_func *func, int write,
+ unsigned addr, int incr_addr, u8 *buf, unsigned size)
+{
+ unsigned remainder = size;
+ int ret;
+ unsigned short blksz;
+ struct mmc_host *host = func->card->host;
+
+ if (func->forced_blksz)
+ blksz = func->forced_blksz;
+ else {
+ blksz = func->max_blksz;
+ if (blksz > host->max_blk_size)
+ blksz = host->max_blk_size;
+ if (blksz > 512)
+ blksz = 512;
+ }
+
+ WARN_ON(blksz > 512);
+
+ ret = sdio_set_block_size(func, blksz);
+ if (ret)
+ return ret;
+
+ while (remainder >= blksz) {
+ unsigned blocks;
+
+ blocks = remainder / blksz;
+
+ if (blocks > 511)
+ blocks = 511;
+ if (blocks > host->max_blk_count)
+ blocks = host->max_blk_count;
+ if (blocks * blksz > host->max_req_size)
+ blocks = host->max_req_size / blksz;
+ if (blocks * blksz > host->max_seg_size)
+ blocks = host->max_seg_size / blksz;
+
+ size = blocks * blksz;
+
+ ret = mmc_io_rw_extended(func->card, write, func->num,
+ addr, incr_addr, buf, blocks, blksz);
+ if (ret)
+ return ret;
+
+ remainder -= size;
+ buf += size;
+ if (incr_addr)
+ addr += size;
+ }
+
+ /*
+ * Because the block size is smaller than both the card
+ * maximum size and 512, then we know we can fit the remainder
+ * into a byte transfer.
+ */
+ if (remainder) {
+ /*
+ * We don't check host requirements as a block of 512
+ * bytes is the bare minimum support a host must
+ * provide.
+ */
+
+ ret = mmc_io_rw_extended(func->card, write, func->num,
+ addr, incr_addr, buf, 1, remainder);
+ if (ret)
+ return ret;
+ }
+
+ return 0;
+}
+
/**
* sdio_memcpy_fromio - read a chunk of memory from a SDIO function
* @func: SDIO function to access
@@ -209,14 +318,13 @@ EXPORT_SYMBOL_GPL(sdio_writeb);
* @addr: address to begin reading from
* @count: number of bytes to read
*
- * Reads up to 512 bytes from the address space of a given SDIO
- * function. Return value indicates if the transfer succeeded or
- * not.
+ * Reads from the address space of a given SDIO function. Return
+ * value indicates if the transfer succeeded or not.
*/
int sdio_memcpy_fromio(struct sdio_func *func, void *dst,
unsigned int addr, int count)
{
- return mmc_io_rw_extended(func->card, 0, func->num, addr, 0, dst,
+ return sdio_io_rw_ext_helper(func, 0, addr, 1, dst,
count);
}
@@ -229,14 +337,13 @@ EXPORT_SYMBOL_GPL(sdio_memcpy_fromio);
* @src: buffer that contains the data to write
* @count: number of bytes to write
*
- * Writes up to 512 bytes to the address space of a given SDIO
- * function. Return value indicates if the transfer succeeded or
- * not.
+ * Writes to the address space of a given SDIO function. Return
+ * value indicates if the transfer succeeded or not.
*/
int sdio_memcpy_toio(struct sdio_func *func, unsigned int addr,
void *src, int count)
{
- return mmc_io_rw_extended(func->card, 1, func->num, addr, 0, src,
+ return sdio_io_rw_ext_helper(func, 1, addr, 1, src,
count);
}
@@ -247,14 +354,13 @@ int sdio_memcpy_toio(struct sdio_func *func, unsigned int addr,
* @addr: address of (single byte) FIFO
* @count: number of bytes to read
*
- * Reads up to 512 bytes from the specified FIFO of a given SDIO
- * function. Return value indicates if the transfer succeeded or
- * not.
+ * Reads from the specified FIFO of a given SDIO function. Return
+ * value indicates if the transfer succeeded or not.
*/
int sdio_readsb(struct sdio_func *func, void *dst, unsigned int addr,
int count)
{
- return mmc_io_rw_extended(func->card, 0, func->num, addr, 1, dst,
+ return sdio_io_rw_ext_helper(func, 0, addr, 0, dst,
count);
}
@@ -267,14 +373,13 @@ EXPORT_SYMBOL_GPL(sdio_readsb);
* @src: buffer that contains the data to write
* @count: number of bytes to write
*
- * Writes up to 512 bytes to the specified FIFO of a given SDIO
- * function. Return value indicates if the transfer succeeded or
- * not.
+ * Writes to the specified FIFO of a given SDIO function. Return
+ * value indicates if the transfer succeeded or not.
*/
int sdio_writesb(struct sdio_func *func, unsigned int addr, void *src,
int count)
{
- return mmc_io_rw_extended(func->card, 1, func->num, addr, 1, src,
+ return sdio_io_rw_ext_helper(func, 1, addr, 0, src,
count);
}
@@ -393,3 +498,30 @@ void sdio_writel(struct sdio_func *func, unsigned long b, unsigned int addr,
EXPORT_SYMBOL_GPL(sdio_writel);
+/**
+ * sdio_force_block_size - lock a certain block size
+ * @func: SDIO function to set block size for
+ * @size: block size in bytes, or 0 to remove any lock
+ *
+ * Locks the used block size for all multi-block sdio functions
+ * in order to satisfy transactions with precise requirements.
+ * This lock can be removed by specifying 0 (zero) as the block
+ * size. Returns failure if the selected block size isn't
+ * supported.
+ */
+int sdio_force_block_size(struct sdio_func *func, unsigned short size)
+{
+ if (size > func->max_blksz)
+ return -EINVAL;
+ if (size > func->card->host->max_blk_size)
+ return -EINVAL;
+ if (size > 512)
+ return -EINVAL;
+
+ func->forced_blksz = size;
+
+ return 0;
+}
+
+EXPORT_SYMBOL_GPL(sdio_force_block_size);
+
diff --git a/drivers/mmc/core/sdio_ops.c b/drivers/mmc/core/sdio_ops.c
index 277870c..b5bbcfc 100644
--- a/drivers/mmc/core/sdio_ops.c
+++ b/drivers/mmc/core/sdio_ops.c
@@ -88,7 +88,7 @@ int mmc_io_rw_direct(struct mmc_card *card, int write, unsigned fn,
}
int mmc_io_rw_extended(struct mmc_card *card, int write, unsigned fn,
- unsigned addr, int bang, u8 *buf, unsigned size)
+ unsigned addr, int incr_addr, u8 *buf, unsigned blocks, unsigned blksz)
{
struct mmc_request mrq;
struct mmc_command cmd;
@@ -97,7 +97,7 @@ int mmc_io_rw_extended(struct mmc_card *card, int write, unsigned fn,
BUG_ON(!card);
BUG_ON(fn > 7);
- BUG_ON(size > 512);
+ BUG_ON(blocks > 511);
memset(&mrq, 0, sizeof(struct mmc_request));
memset(&cmd, 0, sizeof(struct mmc_command));
@@ -107,20 +107,27 @@ int mmc_io_rw_extended(struct mmc_card *card, int write, unsigned fn,
mrq.data = &data;
cmd.opcode = SD_IO_RW_EXTENDED;
+
cmd.arg = write ? 0x80000000 : 0x00000000;
cmd.arg |= fn << 28;
- cmd.arg |= bang ? 0x00000000 : 0x04000000;
+ cmd.arg |= incr_addr ? 0x04000000 : 0x00000000;
cmd.arg |= addr << 9;
- cmd.arg |= (size == 512) ? 0 : size;
+
+ if ((blocks > 1) || (blksz > 512)) {
+ cmd.arg |= 0x08000000;
+ cmd.arg |= blocks;
+ } else
+ cmd.arg |= (blksz == 512) ? 0 : blksz;
+
cmd.flags = MMC_RSP_R5 | MMC_CMD_ADTC;
- data.blksz = size;
- data.blocks = 1;
+ data.blksz = blksz;
+ data.blocks = blocks;
data.flags = write ? MMC_DATA_WRITE : MMC_DATA_READ;
data.sg = &sg;
data.sg_len = 1;
- sg_init_one(&sg, buf, size);
+ sg_init_one(&sg, buf, blksz * blocks);
mmc_set_data_timeout(&data, card, 0);
diff --git a/drivers/mmc/core/sdio_ops.h b/drivers/mmc/core/sdio_ops.h
index 1d42e4f..e2e74b0 100644
--- a/drivers/mmc/core/sdio_ops.h
+++ b/drivers/mmc/core/sdio_ops.h
@@ -16,7 +16,7 @@ int mmc_send_io_op_cond(struct mmc_host *host, u32 ocr, u32 *rocr);
int mmc_io_rw_direct(struct mmc_card *card, int write, unsigned fn,
unsigned addr, u8 in, u8* out);
int mmc_io_rw_extended(struct mmc_card *card, int write, unsigned fn,
- unsigned addr, int bang, u8 *data, unsigned size);
+ unsigned addr, int incr_addr, u8 *buf, unsigned blocks, unsigned blksz);
#endif
diff --git a/include/linux/mmc/sdio_func.h b/include/linux/mmc/sdio_func.h
index 14d9147..8bdbb3b 100644
--- a/include/linux/mmc/sdio_func.h
+++ b/include/linux/mmc/sdio_func.h
@@ -43,7 +43,9 @@ struct sdio_func {
unsigned short vendor; /* vendor id */
unsigned short device; /* device id */
- unsigned short blksize; /* maximum block size */
+ unsigned short max_blksz; /* maximum block size */
+ unsigned short cur_blksz; /* current block size */
+ unsigned short forced_blksz; /* driver forced block size */
unsigned int state; /* function state */
#define SDIO_STATE_PRESENT (1<<0) /* present in sysfs */
@@ -136,5 +138,7 @@ extern int sdio_memcpy_toio(struct sdio_func *func, unsigned int addr,
extern int sdio_writesb(struct sdio_func *func, unsigned int addr,
void *src, int count);
+extern int sdio_force_block_size(struct sdio_func *func, unsigned short size);
+
#endif
--
-- Pierre Ossman
Linux kernel, MMC maintainer http://www.kernel.org
PulseAudio, core developer http://pulseaudio.org
rdesktop, core developer http://www.rdesktop.org
Pierre Ossman wrote:
>
>> A driver might enable/disable a function multiple
>> times but there's no need to set the block size every time.
>
> Why would it want to do that?
A function enable/disable cycle should act as a per-function reset.
David
--
David Vrabel, Software Engineer, Drivers group Tel: +44 (0)1223 692562
CSR plc, Churchill House, Cambridge Business Park, Cowley Road, CB4 0WZ
.
On Wed, 08 Aug 2007 10:46:24 +0100
David Vrabel <[email protected]> wrote:
> Pierre Ossman wrote:
> >
> >> A driver might enable/disable a function multiple
> >> times but there's no need to set the block size every time.
> >
> > Why would it want to do that?
>
> A function enable/disable cycle should act as a per-function reset.
>
It could be argued that it would reset the other per-function
information as well. :)
--
-- Pierre Ossman
Linux kernel, MMC maintainer http://www.kernel.org
PulseAudio, core developer http://pulseaudio.org
rdesktop, core developer http://www.rdesktop.org
Pierre Ossman wrote:
> On Wed, 08 Aug 2007 10:46:24 +0100
> David Vrabel <[email protected]> wrote:
>
>> Pierre Ossman wrote:
>>>> A driver might enable/disable a function multiple
>>>> times but there's no need to set the block size every time.
>>> Why would it want to do that?
>> A function enable/disable cycle should act as a per-function reset.
>>
>
> It could be argued that it would reset the other per-function
> information as well. :)
I do not think that that's what the SDIO specification requires, nor is
it what hardware I am familiar with does.
David
--
David Vrabel, Software Engineer, Drivers group Tel: +44 (0)1223 692562
CSR plc, Churchill House, Cambridge Business Park, Cowley Road, CB4 0WZ
.
Pierre Ossman wrote:
> +static int sdio_io_rw_ext_helper(struct sdio_func *func, int write,
> + unsigned addr, int incr_addr, u8 *buf, unsigned size)
> +{
> + unsigned remainder = size;
> + int ret;
> + unsigned short blksz;
> + struct mmc_host *host = func->card->host;
> +
> + if (func->forced_blksz)
> + blksz = func->forced_blksz;
> + else {
> + blksz = func->max_blksz;
> + if (blksz > host->max_blk_size)
> + blksz = host->max_blk_size;
> + if (blksz > 512)
> + blksz = 512;
> + }
> +
> + WARN_ON(blksz > 512);
> +
> + ret = sdio_set_block_size(func, blksz);
> + if (ret)
> + return ret;
We need to know the block size in use /before/ the start of the transfer
as we would like drivers to be able to perform transfers with single
commands as this can result in significantly better performance[1]. The
drivers for our (CSR's) WiFi chips should do this. This isn't some (as
you suggested in a previous post) 'rare' requirement.
Therefore, we should not change the block size here.
We can also support block sizes > 512 and have this function do the
correct thing (as you'll see when I post my final patches).
David
[1] Consider a chip with a block size of 64 that regularly does short
transfers of between 64 - 128 bytes. Without padding, this would
require two commands per transfer instead of just one, cutting
performance by 50%! This scenario could be a WiFi chip doing VoIP.
--
David Vrabel, Software Engineer, Drivers group Tel: +44 (0)1223 692562
CSR plc, Churchill House, Cambridge Business Park, Cowley Road, CB4 0WZ
.
On Wed, 08 Aug 2007 11:19:33 +0100
David Vrabel <[email protected]> wrote:
>
> We need to know the block size in use /before/ the start of the
> transfer as we would like drivers to be able to perform transfers
> with single commands as this can result in significantly better
> performance[1]. The drivers for our (CSR's) WiFi chips should do
> this. This isn't some (as you suggested in a previous post) 'rare'
> requirement.
>
Well, there are more ways that can be achieved.
First, the driver could lock down the block size using
sdio_force_block_size(). Then it knows what it gets.
Second, we could try to make it possible for the driver to indicate
"feel free to pad this transfer". Then we could also remove the need
for drivers to mess with buffers and keep such stuff in the core. We
could even magically remove a memcpy() by setting up two sg entries,
one for the data and one for the padding.
>
> [1] Consider a chip with a block size of 64 that regularly does short
> transfers of between 64 - 128 bytes. Without padding, this would
> require two commands per transfer instead of just one, cutting
> performance by 50%! This scenario could be a WiFi chip doing VoIP.
Provided block size < 128. Otherwise it'll jump straight to the byte
transfer.
--
-- Pierre Ossman
Linux kernel, MMC maintainer http://www.kernel.org
PulseAudio, core developer http://pulseaudio.org
rdesktop, core developer http://www.rdesktop.org
Pierre Ossman wrote:
> On Wed, 08 Aug 2007 11:19:33 +0100
> David Vrabel <[email protected]> wrote:
>
>> We need to know the block size in use /before/ the start of the
>> transfer as we would like drivers to be able to perform transfers
>> with single commands as this can result in significantly better
>> performance[1]. The drivers for our (CSR's) WiFi chips should do
>> this. This isn't some (as you suggested in a previous post) 'rare'
>> requirement.
>>
>
> Well, there are more ways that can be achieved.
>
> First, the driver could lock down the block size using
> sdio_force_block_size(). Then it knows what it gets.
>
> Second, we could try to make it possible for the driver to indicate
> "feel free to pad this transfer". Then we could also remove the need
> for drivers to mess with buffers and keep such stuff in the core. We
> could even magically remove a memcpy() by setting up two sg entries,
> one for the data and one for the padding.
Setting the block size in io_rw_ext_helper() has several drawbacks, namely:
1. Reduces the flexibility of drivers to manage what commands are performed.
2. A performance penalty on the first transfer.
3. Greater code complexity.
4. Non-intuitive location for card initialization code.
Sure we could just through hoops and add (much) extra complexity to the
core, to improve item 1 but 2, 3 and 4 are insoluble. Given that setting
block size before the first command has /zero/ benefits[1], why bother?
Your insistence on this stupid idea baffles me, particularly in the
light of your other useful comments.
I would also like to advise that until a larger number of function
drivers become available that the core is kept as simple and as flexible
as possible. Without knowing how different cards operate it is
difficult to know what's common behaviour and what's card specific.
My latest (and hopefully final) patch set follows.
David
[1] dynamic block sizing was (potentially) a useful benefit but I have
comprehensibly shown it's not beneficial. The idea that is somehow
necessary to permit the core to change it's block size selection
algorithm is bogus.
--
David Vrabel, Software Engineer, Drivers group Tel: +44 (0)1223 692562
CSR plc, Churchill House, Cambridge Business Park, Cowley Road, CB4 0WZ
.
--
David Vrabel, Software Engineer, Drivers group Tel: +44 (0)1223 692562
CSR plc, Churchill House, Cambridge Business Park, Cowley Road, CB4 0WZ
.
--
David Vrabel, Software Engineer, Drivers group Tel: +44 (0)1223 692562
CSR plc, Churchill House, Cambridge Business Park, Cowley Road, CB4 0WZ
.
--
David Vrabel, Software Engineer, Drivers group Tel: +44 (0)1223 692562
CSR plc, Churchill House, Cambridge Business Park, Cowley Road, CB4 0WZ
.
On Wed, 08 Aug 2007 14:22:20 +0100
David Vrabel <[email protected]> wrote:
>
> Setting the block size in io_rw_ext_helper() has several drawbacks,
> namely:
>
> 1. Reduces the flexibility of drivers to manage what commands are
> performed. 2. A performance penalty on the first transfer.
> 3. Greater code complexity.
> 4. Non-intuitive location for card initialization code.
>
> Sure we could just through hoops and add (much) extra complexity to
> the core, to improve item 1 but 2, 3 and 4 are insoluble. Given that
> setting block size before the first command has /zero/ benefits[1],
> why bother?
3 and 4 are subjective, and for 2, the performance penalty has to come
some place. The upside, as explained, is that drivers aren't penalized
for setting the block size and then failing to use it. Something that
gives us the flexibility to write cleaner code.
As for benefits, there are other issues. We might want to deal with
things like alignment problems in the future.
>
> Your insistence on this stupid idea baffles me, particularly in the
> light of your other useful comments.
>
> I would also like to advise that until a larger number of function
> drivers become available that the core is kept as simple and as
> flexible as possible. Without knowing how different cards operate it
> is difficult to know what's common behaviour and what's card specific.
>
Agreed, we're doing a bit more guessing than one would like. But I
don't agree in keeping the core simple and flexible. Rather the
opposite. Any guarantee that we give drivers now and need to remove in
the future needs to be tested with all drivers that rely on it (a
difficult task as it often takes some effort to find people with
hardware and the ability to test patches).
As such, the guarantees should be kept at a minimum. Code is easily
changed, API is not.
IMO, the best way to achieve this is with an API that describes _what_
the drivers want to do, not _how_. Hence my insistence on allowing
drivers to specify the difference between "I want this data
transferred" and "I want this data transferred with an exact block size
of 32 bytes".
> My latest (and hopefully final) patch set follows.
>
These looks good. The only part I'm really attached to is the blksz ==
0 case, which you have.
Rgds
--
-- Pierre Ossman
Linux kernel, MMC maintainer http://www.kernel.org
PulseAudio, core developer http://pulseaudio.org
rdesktop, core developer http://www.rdesktop.org