2005-12-17 21:19:05

by Vitaly Wool

[permalink] [raw]
Subject: [PATCH/RFC] SPI core: turn transfers to be linked list

diff -uNr a/drivers/spi/spi.c b/drivers/spi/spi.c
--- a/drivers/spi/spi.c 2005-12-17 22:13:13.947792000 +0300
+++ b/drivers/spi/spi.c 2005-12-17 22:29:07.283863088 +0300
@@ -535,7 +535,9 @@
static DECLARE_MUTEX(lock);

int status;
- struct spi_message message;
+ struct spi_message message = {
+ .transfers = LIST_HEAD_INIT(message.transfers);
+ };
struct spi_transfer x[2];

/* Use preallocated DMA-safe buffer. We can't avoid copying here,
@@ -551,13 +553,13 @@
memcpy(buf, txbuf, n_tx);
x[0].tx_buf = buf;
x[0].len = n_tx;
+ list_add_tail(&x[0].link, &message.transfers);

x[1].rx_buf = buf + n_tx;
x[1].len = n_rx;
+ list_add_tail(&x[1].link, &message.transfers);

/* do the i/o */
- message.transfers = x;
- message.n_transfer = ARRAY_SIZE(x);
status = spi_sync(spi, &message);
if (status == 0) {
memcpy(rxbuf, x[1].rx_buf, n_rx);
diff -uNr a/include/linux/spi/spi.h b/include/linux/spi/spi.h
--- a/include/linux/spi/spi.h 2005-12-17 22:13:13.957790000 +0300
+++ b/include/linux/spi/spi.h 2005-12-17 23:39:32.304562312 +0300
@@ -286,6 +286,8 @@

unsigned cs_change:1;
u16 delay_usecs;
+
+ struct list_head link;
};

/**
@@ -304,8 +306,7 @@
* @state: for use by whichever driver currently owns the message
*/
struct spi_message {
- struct spi_transfer *transfers;
- unsigned n_transfer;
+ struct list_head transfers;

struct spi_device *spi;

@@ -406,16 +407,16 @@
spi_write(struct spi_device *spi, const u8 *buf, size_t len)
{
struct spi_transfer t = {
- .tx_buf = buf,
- .rx_buf = NULL,
- .len = len,
- .cs_change = 0,
- };
+ .tx_buf = buf,
+ .rx_buf = NULL,
+ .len = len,
+ .cs_change = 0,
+ };
struct spi_message m = {
- .transfers = &t,
- .n_transfer = 1,
- };
+ .transfers = LIST_HEAD_INIT(m.transfers);
+ };

+ list_add_tail(&t.link, &m.transfers);
return spi_sync(spi, &m);
}

@@ -432,16 +433,16 @@
spi_read(struct spi_device *spi, u8 *buf, size_t len)
{
struct spi_transfer t = {
- .tx_buf = NULL,
- .rx_buf = buf,
- .len = len,
- .cs_change = 0,
- };
+ .tx_buf = NULL,
+ .rx_buf = buf,
+ .len = len,
+ .cs_change = 0,
+ };
struct spi_message m = {
- .transfers = &t,
- .n_transfer = 1,
- };
+ .transfers = LIST_HEAD_INIT(m.transfers);
+ };

+ list_add_tail(&t.link, &m.transfers);
return spi_sync(spi, &m);
}


Attachments:
spi-list.patch (2.29 kB)

2005-12-18 20:40:49

by David Brownell

[permalink] [raw]
Subject: Re: [PATCH/RFC] SPI core: turn transfers to be linked list

On Saturday 17 December 2005 1:18 pm, Vitaly Wool wrote:
> Greetings,
>
> the patch attached changes the way transfers are chained in the SPI
> core. Namely, they are turned into linked lists instead of array. The
> reason behind is that we'd like to be able to use lightweight memory
> allocation mechanism to use it in interrupt context.

Hmm, color me confused. Is there something preventing a driver from
having its own freelist (or whatever), in cases where kmalloc doesn't
suffice? If not, why should the core change? And what sort of driver
measurements are you doing, to conclude that kmalloc doesn't suffice?


> An example of such
> kind of mechanism can be found in spi-alloc.c file in our core. The
> lightweightness is achieved by the knowledge that all the blocks to be
> allocated are of the same size.

I'd have said that since this increases the per-transfer costs (to set
up and manage the list memberships) you want to increase the weight of
that part of the API, not decrease it. ;)

Note that your current API maps to mine roughly by equating

allocate your spi_msg
allocate my { spi_message + one spi_transfer }

So if you're doing one allocation anyway, you already have the relevant
linked list (in spi_message) and pre-known size. So this patch wouldn't
improve any direct translation of your driver stack.


> We'd like to use this allocation
> technique for both message structure and transfer structure The problem
> with the current code is that transfers are represnted as an array so it
> can be of any size effectively.

Could you elaborate on this problem you perceive? This isn't the only
driver API in Linux to talk in terms of arrays describing transfers,
even potentially large arrays.

Consider how "struct scatterlist" is used, and how USB manages the
descriptors for isochronous transfers. They don't use linked lists
there, and haven't seemed to suffer from it.

- Dave

2005-12-19 07:48:40

by Vitaly Wool

[permalink] [raw]
Subject: Re: [PATCH/RFC] SPI core: turn transfers to be linked list

David Brownell wrote:

>On Saturday 17 December 2005 1:18 pm, Vitaly Wool wrote:
>
>
>>Greetings,
>>
>>the patch attached changes the way transfers are chained in the SPI
>>core. Namely, they are turned into linked lists instead of array. The
>>reason behind is that we'd like to be able to use lightweight memory
>>allocation mechanism to use it in interrupt context.
>>
>>
>
>Hmm, color me confused. Is there something preventing a driver from
>having its own freelist (or whatever), in cases where kmalloc doesn't
>suffice? If not, why should the core change? And what sort of driver
>measurements are you doing, to conclude that kmalloc doesn't suffice?
>
>
Can't get what you're talking about. A freelist of what?
Basically the idea of the custom lightweight allocation is the
following: allocate a page and divide into regions of the same size,
initialize and use these regions as a stack.
Next, why the current model prevents us from doing that. The array may
be of any size thus we can't divide the page into regions of the same
size: we can't predict what the maximum size will be. And this is the
problem of the core.

>
>
>
>>An example of such
>>kind of mechanism can be found in spi-alloc.c file in our core. The
>>lightweightness is achieved by the knowledge that all the blocks to be
>>allocated are of the same size.
>>
>>
>
>I'd have said that since this increases the per-transfer costs (to set
>up and manage the list memberships) you want to increase the weight of
>that part of the API, not decrease it. ;)
>
>
Disagree. Let's look deeper. kmalloc itself is more heavyweght than
setting up list memberships.
The list setting commands are pretty essential and will not add a lot to
the assembly code.

>Note that your current API maps to mine roughly by equating
>
> allocate your spi_msg
> allocate my { spi_message + one spi_transfer }
>
>So if you're doing one allocation anyway, you already have the relevant
>linked list (in spi_message) and pre-known size. So this patch wouldn't
>improve any direct translation of your driver stack.
>
>
This is the patch to your API, so I don't see why you're mentioning it here.
And your understanding is not quite correct. What if I'm going to send a
chain of 5 messages? I'll allocate 5 spi_msg's in my case which all are
gonna be of the same size -- thus the technique described above is
applicable. In case of your core it's not.
And I'm not worrying that much about direct translation, I'm just trying
to make your core acceptabe for us so that we could give up ours and
work with yours.
So... it's just one more step towards convergence you are not likely to
accept.

>>We'd like to use this allocation
>>technique for both message structure and transfer structure The problem
>>with the current code is that transfers are represnted as an array so it
>>can be of any size effectively.
>>
>>
>
>Could you elaborate on this problem you perceive? This isn't the only
>driver API in Linux to talk in terms of arrays describing transfers,
>even potentially large arrays.
>
>
The problem is: we're using real-time enhancements patch developed by
Ingo/Sven/Daniel etc. You cannot call kmalloc from the interrupt
context if you're using this patch. Yeah, yeah -- the interrupt
handlers are in threads by default, but we can't go for that since we
want immediate acknowledgement from the interrupt context, and that
implies spi_message/spi_transfer allocation.

>Consider how "struct scatterlist" is used, and how USB manages the
>descriptors for isochronous transfers. They don't use linked lists
>there, and haven't seemed to suffer from it.
>
>
Not sure if I understand why it's relevant to what we're discussing.

Vitaly

2005-12-19 17:07:04

by Greg KH

[permalink] [raw]
Subject: Re: [PATCH/RFC] SPI core: turn transfers to be linked list

On Mon, Dec 19, 2005 at 10:49:11AM +0300, Vitaly Wool wrote:
> The problem is: we're using real-time enhancements patch developed by
> Ingo/Sven/Daniel etc. You cannot call kmalloc from the interrupt
> context if you're using this patch.

So you can't even call:
kmalloc(sizeof(foo), GFP_ATOMIC);
in an interrupt anymore?
If so, that's going to break mainline pretty bad...

thanks,

greg k-h

2005-12-19 20:03:47

by Vitaly Wool

[permalink] [raw]
Subject: Re: [PATCH/RFC] SPI core: turn transfers to be linked list

Greg KH wrote:

>On Mon, Dec 19, 2005 at 10:49:11AM +0300, Vitaly Wool wrote:
>
>
>>The problem is: we're using real-time enhancements patch developed by
>>Ingo/Sven/Daniel etc. You cannot call kmalloc from the interrupt
>>context if you're using this patch.
>>
>>
>
>So you can't even call:
> kmalloc(sizeof(foo), GFP_ATOMIC);
>in an interrupt anymore?
>If so, that's going to break mainline pretty bad...
>
>
Don't think it will... Since most of the interrupts will work in threads
pretty well... and it's np to alloc memory from threads.

Vitaly

2005-12-20 08:11:59

by David Brownell

[permalink] [raw]
Subject: Re: [PATCH/RFC] SPI core: turn transfers to be linked list


> >Hmm, color me confused. Is there something preventing a driver from
> >having its own freelist (or whatever), in cases where kmalloc doesn't
> >suffice? If not, why should the core change? And what sort of driver
> >measurements are you doing, to conclude that kmalloc doesn't suffice?
> >
> >
> Can't get what you're talking about. A freelist of what?

Of whatever the driver wants. You had pointed at some code that
boiled down to keeping a freelist of some structures in the core.
Such code doesn't _need_ to be in the core at all ...


> Basically the idea of the custom lightweight allocation is the
> following: allocate a page and divide into regions of the same size,
> initialize and use these regions as a stack.

There already a lot of allocators that support that, like kmem_cache_t
or dma_pool or mempool_t ... no more, please! They don't all have that
"stack"/freelist notion though; easy for drivers to do that, IMO.

Drivers that don't want to hit the heap will preallocate everything
they can, and they'll probably have their own freelists. That wins
by eliminating the slab subroutine calls from what are often hot
code paths, along with their fault handling logic ... and removing
the need (and testing) for fault handling wins by improving robustness.


> Next, why the current model prevents us from doing that. The array may
> be of any size thus we can't divide the page into regions of the same
> size: we can't predict what the maximum size will be. And this is the
> problem of the core.

The driver code calling through the core doesn't have that problem.
It knows what size it needs, and if it needed enough of them it'd be
easy to make a kmem_cache_t to suit its needs.


> >I'd have said that since this increases the per-transfer costs (to set
> >up and manage the list memberships) you want to increase the weight of
> >that part of the API, not decrease it. ;)
> >
> >
> Disagree. Let's look deeper. kmalloc itself is more heavyweght than
> setting up list memberships.

But kmalloc was previously optional, yes? And should still be ...


> The list setting commands are pretty essential and will not add a lot to
> the assembly code.

I'm not totally averse to such changes, but you don't seem to be making
the best arguments. Example: they're clearly not "essential" because
transfer queues work today with the lists at the spi_message level.


> >Note that your current API maps to mine roughly by equating
> >
> > allocate your spi_msg
> > allocate my { spi_message + one spi_transfer }
> >
> >So if you're doing one allocation anyway, you already have the relevant
> >linked list (in spi_message) and pre-known size. So this patch wouldn't
> >improve any direct translation of your driver stack.
> >
> >
> This is the patch to your API, so I don't see why you're mentioning it here.

Because interface changes need to be discussed in context, and at this time
you have the only code that seems to suggest a need for this ...


> And your understanding is not quite correct. What if I'm going to send a
> chain of 5 messages? I'll allocate 5 spi_msg's in my case which all are
> gonna be of the same size -- thus the technique described above is
> applicable. In case of your core it's not.

I must have been deceived then by this little utility:

static inline struct spi_message *spi_message_alloc(unsigned ntrans, gfp_t flags)
{
struct spi_message *m;

m = kzalloc(sizeof(struct spi_message)
+ ntrans * sizeof(struct spi_transfer),
flags);
if (m) {
m->transfers = (void *)(m + 1);
m->n_transfer = ntrans;
}
return m;
}

Sure looks to me like spi_message_alloc(5, SLAB_ATOMIC) should do the trick.
One allocation, always the same size. And kzalloc() could easily optimize
that to use the "size-192" slab cache (or whatever), like kmalloc() does;
that'd be a small speedup.


> >Could you elaborate on this problem you perceive? This isn't the only
> >driver API in Linux to talk in terms of arrays describing transfers,
> >even potentially large arrays.
>
> The problem is: we're using real-time enhancements patch developed by
> Ingo/Sven/Daniel etc. You cannot call kmalloc from the interrupt
> context if you're using this patch. Yeah, yeah -- the interrupt
> handlers are in threads by default, but we can't go for that since we
> want immediate acknowledgement from the interrupt context, and that
> implies spi_message/spi_transfer allocation.

Could you elaborate a bit here? You seem to be implying that for some
reason one of your SPI related drivers must use non-threaded hardirqs,
AND (news to me, if true) that such hardirqs can't kmalloc(), AND that
it can't use any of several widely used strategies to avoid hitting
things like the slab allocator. (That last seems hardest to believe...)

I asked earlier what sort of performance measurements you're making
that are leading you to these conclusions. I'm still wondering. :)


> >Consider how "struct scatterlist" is used, and how USB manages the
> >descriptors for isochronous transfers. They don't use linked lists
> >there, and haven't seemed to suffer from it.
> >
> >
> Not sure if I understand why it's relevant to what we're discussing.

Examples of driver interfaces in Linux that work fine using arrays to couple
a group of transfers. If you see some problem that makes a compelling
argument that the SPI must change, it would also affect drivers using
those interfaces too, at least a little bit ... right? Does it?

- Dave

2005-12-20 18:15:17

by Vitaly Wool

[permalink] [raw]
Subject: Re: [PATCH/RFC] SPI core: turn transfers to be linked list

David Brownell wrote:

>>>Hmm, color me confused. Is there something preventing a driver from
>>>having its own freelist (or whatever), in cases where kmalloc doesn't
>>>suffice? If not, why should the core change? And what sort of driver
>>>measurements are you doing, to conclude that kmalloc doesn't suffice?
>>>
>>>
>>>
>>>
>>Can't get what you're talking about. A freelist of what?
>>
>>
>
>Of whatever the driver wants. You had pointed at some code that
>boiled down to keeping a freelist of some structures in the core.
>Such code doesn't _need_ to be in the core at all ...
>
>
Correct.
But you're missing two things: a) it wasn't in the core, it was the
library and b) it doesn't matter.
It's just a simple stack-based memory allocation model which is _very
attractive_ to use and which *can't* be used with transfers as arrays.

>
>
>
>>Basically the idea of the custom lightweight allocation is the
>>following: allocate a page and divide into regions of the same size,
>>initialize and use these regions as a stack.
>>
>>
>
>There already a lot of allocators that support that, like kmem_cache_t
>or dma_pool or mempool_t ... no more, please! They don't all have that
>"stack"/freelist notion though; easy for drivers to do that, IMO.
>
>Drivers that don't want to hit the heap will preallocate everything
>they can, and they'll probably have their own freelists. That wins
>by eliminating the slab subroutine calls from what are often hot
>code paths, along with their fault handling logic ... and removing
>the need (and testing) for fault handling wins by improving robustness.
>
>
See b) above. Doesn't matter where to put an implementation of this
memory allocation model. Matters whether it's usable or not. Currently
it's not, especially if a driver is complicated enuff to have transfers
of arbitrary size.

>>>I'd have said that since this increases the per-transfer costs (to set
>>>up and manage the list memberships) you want to increase the weight of
>>>that part of the API, not decrease it. ;)
>>>
>>>
>>>
>>>
>>Disagree. Let's look deeper. kmalloc itself is more heavyweght than
>>setting up list memberships.
>>
>>
>
>But kmalloc was previously optional, yes? And should still be ...
>
>
Whoops. How can it be optional if we want to have an async transfer?

>
>
>
>>The list setting commands are pretty essential and will not add a lot to
>>the assembly code.
>>
>>
>
>I'm not totally averse to such changes, but you don't seem to be making
>the best arguments. Example: they're clearly not "essential" because
>transfer queues work today with the lists at the spi_message level.
>
>
Not sure if I got you here, sorry.

>
>
>
>>And your understanding is not quite correct. What if I'm going to send a
>>chain of 5 messages? I'll allocate 5 spi_msg's in my case which all are
>>gonna be of the same size -- thus the technique described above is
>>applicable. In case of your core it's not.
>>
>>
>
>I must have been deceived then by this little utility:
>
>static inline struct spi_message *spi_message_alloc(unsigned ntrans, gfp_t flags)
>{
> struct spi_message *m;
>
> m = kzalloc(sizeof(struct spi_message)
> + ntrans * sizeof(struct spi_transfer),
> flags);
> if (m) {
> m->transfers = (void *)(m + 1);
> m->n_transfer = ntrans;
> }
> return m;
>}
>
>Sure looks to me like spi_message_alloc(5, SLAB_ATOMIC) should do the trick.
>One allocation, always the same size. And kzalloc() could easily optimize
>that to use the "size-192" slab cache (or whatever), like kmalloc() does;
>that'd be a small speedup.
>
>
'5' is only an example. What if there gonna be 7, 10, 12? What if the
possible cases for a driver are 1, 2, 5, 7, 12 and 14 transfers per message?
Of course the allocation won't be of the same size always. Moreover, the
size is gonna differ a lot.

>
>
>
>>>Could you elaborate on this problem you perceive? This isn't the only
>>>driver API in Linux to talk in terms of arrays describing transfers,
>>>even potentially large arrays.
>>>
>>>
>>The problem is: we're using real-time enhancements patch developed by
>>Ingo/Sven/Daniel etc. You cannot call kmalloc from the interrupt
>>context if you're using this patch. Yeah, yeah -- the interrupt
>>handlers are in threads by default, but we can't go for that since we
>>want immediate acknowledgement from the interrupt context, and that
>>implies spi_message/spi_transfer allocation.
>>
>>
>
>Could you elaborate a bit here? You seem to be implying that for some
>reason one of your SPI related drivers must use non-threaded hardirqs,
>AND (news to me, if true) that such hardirqs can't kmalloc(), AND that
>it can't use any of several widely used strategies to avoid hitting
>things like the slab allocator. (That last seems hardest to believe...)
>
>I asked earlier what sort of performance measurements you're making
>that are leading you to these conclusions. I'm still wondering. :)
>
>
This is mostly a stability issue, not a performance thing.

>>>Consider how "struct scatterlist" is used, and how USB manages the
>>>descriptors for isochronous transfers. They don't use linked lists
>>>there, and haven't seemed to suffer from it.
>>>
>>>
>>>
>>>
>>Not sure if I understand why it's relevant to what we're discussing.
>>
>>
>
>Examples of driver interfaces in Linux that work fine using arrays to couple
>a group of transfers. If you see some problem that makes a compelling
>argument that the SPI must change, it would also affect drivers using
>those interfaces too, at least a little bit ... right? Does it?
>
>
I'm not concerned much with USB mow :), though I know guys having _a
lot_ of trouble trying USB devices work within the real-time Linux
enhancememts environment I was referring to earlier ;) Probably a lot of
memory allocations within interrupt context... which is a always-avoid
thing for any real-time system BTW...

Vitaly

2005-12-21 13:21:49

by Vitaly Wool

[permalink] [raw]
Subject: Re: [PATCH/RFC] SPI core: turn transfers to be linked list

David Brownell wrote:

>>The list setting commands are pretty essential and will not add a lot to
>>the assembly code.
>>
>>
>
>I'm not totally averse to such changes, but you don't seem to be making
>the best arguments. Example: they're clearly not "essential" because
>transfer queues work today with the lists at the spi_message level.
>
>
One more reason for that that came out only recently: suppore we're
adding transfers to an already configured message (i. e. with some
transfers set up already). This 'chaning' may happen for some kinds of
devices. And in case transfers is an array, we should either be apriory
aware of whether the chaining will take place or allocate an array large
enough to hold additional transfers. Neither of these look good to me,
and having a linked list of transfers will definitely solve this problem.

Vitaly

2005-12-22 17:55:49

by David Brownell

[permalink] [raw]
Subject: Re: [PATCH/RFC] SPI core: turn transfers to be linked list

On Wednesday 21 December 2005 5:22 am, Vitaly Wool wrote:
> David Brownell wrote:
>
> >>The list setting commands are pretty essential and will not add a lot to
> >>the assembly code.
> >
> >I'm not totally averse to such changes, but you don't seem to be making
> >the best arguments. Example: they're clearly not "essential" because
> >transfer queues work today with the lists at the spi_message level.
>
> One more reason for that that came out only recently: suppore we're
> adding transfers to an already configured message (i. e. with some
> transfers set up already). This 'chaning' may happen for some kinds of
> devices.

What would those devices be? Have you looked at how, say, 802.15.4
wireless stacks would need to work? I'm thinking those will be
interesting for some embedded Linux folk. There are probably at least
half a dozen such Zigbee-capable radio chips that hook up through SPI,
and they're not always hooked up to 8-bit CPUs!


By the way, I'd say the framework already chains transfers, and what
you're proposing is that drivers be able to do so more flexibly.


> And in case transfers is an array, we should either be apriory
> aware of whether the chaining will take place or allocate an array large
> enough to hold additional transfers. Neither of these look good to me,
> and having a linked list of transfers will definitely solve this problem.

Well, that's the guts of the good example I was hoping you would share.
I'll be posting a refresh of this code soonish; maybe you can provide
a complete patch, changing all the code over to use list-not-array?

My own reasons for liking the notion of spi_transfer list membership
are to enable such transfer shuffling within the controller drivers,
more than at the spi_device driver level. If both layers will see some
benefits, then it's likely a good change. ;)

- Dave

2005-12-22 22:12:50

by Vitaly Wool

[permalink] [raw]
Subject: Re: [PATCH/RFC] SPI core: turn transfers to be linked list

David Brownell wrote:

>>And in case transfers is an array, we should either be apriory
>>aware of whether the chaining will take place or allocate an array large
>>enough to hold additional transfers. Neither of these look good to me,
>>and having a linked list of transfers will definitely solve this problem.
>>
>>
>
>Well, that's the guts of the good example I was hoping you would share.
>I'll be posting a refresh of this code soonish; maybe you can provide
>a complete patch, changing all the code over to use list-not-array?
>
>
Let's agree upon that I'll proovide the complete patch as soon as you
repost all the patches from the very beginning (with the updates you've
made).
It's a little bit hard to track all that stuff now, I mean patches,
patches to patches, etc. :)

Vitaly

2005-12-23 00:19:19

by David Brownell

[permalink] [raw]
Subject: Re: [PATCH/RFC] SPI core: turn transfers to be linked list


> >Well, that's the guts of the good example I was hoping you would share.
> >I'll be posting a refresh of this code soonish; maybe you can provide
> >a complete patch, changing all the code over to use list-not-array?
> >
> >
> Let's agree upon that I'll proovide the complete patch as soon as you
> repost all the patches from the very beginning (with the updates you've
> made).

Agreed. That's why I just posted them ... clean/merged versions of
everything, with updates folded in so the patches are mostly standalone.

Your turn. ;)

- Dave


> It's a little bit hard to track all that stuff now, I mean patches,
> patches to patches, etc. :)
>
> Vitaly
>