2020-12-11 13:21:33

by Ioana Ciornei

Subject: Re: [PATCH RESEND net-next 1/2] dpaa2-eth: send a scatter-gather FD instead of realloc-ing

On Fri, Dec 11, 2020 at 09:30:43AM +0000, David Laight wrote:
> From: Daniel Thompson
> > Sent: 10 December 2020 17:32
> >
> > On Mon, Jun 29, 2020 at 06:47:11PM +0000, Ioana Ciornei wrote:
> > > Instead of realloc-ing the skb on the Tx path when the provided headroom
> > > is smaller than the HW requirements, create a Scatter/Gather frame
> > > descriptor with only one entry.
>
> Is it worth simplifying the code by permanently allocating (and dma-mapping)
> the extra structure for every ring entry?
> It is (probably) only one page and 1 iommu entry.


That is exactly what I was thinking. At the moment the SGT structure is
pre-allocated but not pre-mapped.

I'll let you know how it goes.

Ioana


2020-12-11 13:29:41

by David Laight

Subject: RE: [PATCH RESEND net-next 1/2] dpaa2-eth: send a scatter-gather FD instead of realloc-ing

From: Ioana Ciornei
> Sent: 11 December 2020 09:39
>
> On Fri, Dec 11, 2020 at 09:30:43AM +0000, David Laight wrote:
> > From: Daniel Thompson
> > > Sent: 10 December 2020 17:32
> > >
> > > On Mon, Jun 29, 2020 at 06:47:11PM +0000, Ioana Ciornei wrote:
> > > > Instead of realloc-ing the skb on the Tx path when the provided headroom
> > > > is smaller than the HW requirements, create a Scatter/Gather frame
> > > > descriptor with only one entry.
> >
> > Is it worth simplifying the code by permanently allocating (and dma-mapping)
> > the extra structure for every ring entry?
> > It is (probably) only one page and 1 iommu entry.
>
>
> That is exactly what I was thinking. At the moment the SGT structure is
> pre-allocated but not pre-mapped.
>
> I'll let you know how it goes.

How much does the dma-map actually cost?
For short fragments it is probably worth copying into a pre-allocated
pre-mapped transmit buffer area.
You'd want to do aligned full-word copies and use separate cache lines
for each frame.
It does make tx setup more error prone - since you need the space in
the tx buffer area as well as in the tx ring.
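
A rough userspace sketch of that copy-versus-map decision, for illustration
only. It is not dpaa2-eth code: the 1024-byte cutoff, the ring and slot
counts, and every name (tx_ring, tx_enqueue, tx_complete) are assumptions,
memcpy() stands in for the aligned full-word copy, and the "map" branch only
records the pointer where a real driver would dma-map it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CACHE_LINE   64u        /* assumed cache-line size */
#define COPYBREAK    1024u      /* assumed cutoff; needs measuring per platform */
#define SLOT_SIZE    COPYBREAK  /* one bounce slot holds one short frame */
#define RING_ENTRIES 8u         /* hypothetical tx ring depth */
#define BOUNCE_SLOTS 4u         /* hypothetical size of the tx buffer area */

/* Whole cache lines per slot, so two frames never share a line. */
static_assert(SLOT_SIZE % CACHE_LINE == 0, "slots must not share cache lines");

enum tx_path { TX_COPIED, TX_MAPPED, TX_BUSY };

struct tx_desc {
    const void *addr;   /* what the hardware would be pointed at */
    uint32_t len;
    int slot;           /* bounce slot index, or -1 if the frame was "mapped" */
};

struct tx_ring {
    /* Stand-in for an area that is allocated and dma-mapped once at setup. */
    _Alignas(CACHE_LINE) uint8_t bounce[BOUNCE_SLOTS][SLOT_SIZE];
    bool slot_busy[BOUNCE_SLOTS];
    struct tx_desc desc[RING_ENTRIES];
    unsigned int head;  /* next descriptor to fill */
    unsigned int used;  /* descriptors in flight */
};

/* Claim a free bounce slot, or return -1 if the buffer area is exhausted. */
static int bounce_get(struct tx_ring *r)
{
    for (unsigned int i = 0; i < BOUNCE_SLOTS; i++) {
        if (!r->slot_busy[i]) {
            r->slot_busy[i] = true;
            return (int)i;
        }
    }
    return -1;
}

/* Queue one frame: needs ring space always, buffer-area space for a copy. */
enum tx_path tx_enqueue(struct tx_ring *r, const void *data, size_t len)
{
    if (r->used == RING_ENTRIES)
        return TX_BUSY;

    struct tx_desc *d = &r->desc[r->head];
    int slot = (len <= COPYBREAK) ? bounce_get(r) : -1;

    if (slot >= 0) {
        memcpy(r->bounce[slot], data, len);   /* short frame: copy */
        d->addr = r->bounce[slot];
    } else {
        d->addr = data;   /* long frame or no slot: would be dma-mapped here */
    }
    d->len = (uint32_t)len;
    d->slot = slot;
    r->head = (r->head + 1) % RING_ENTRIES;
    r->used++;
    return slot >= 0 ? TX_COPIED : TX_MAPPED;
}

/* Complete the oldest frame and release its bounce slot, if it had one. */
void tx_complete(struct tx_ring *r)
{
    if (r->used == 0)
        return;
    unsigned int tail = (r->head + RING_ENTRIES - r->used) % RING_ENTRIES;
    if (r->desc[tail].slot >= 0)
        r->slot_busy[r->desc[tail].slot] = false;
    r->used--;
}
```

In this sketch a short frame that finds no free slot falls back to the map
path instead of failing, which is one way to handle the "space in both
places" problem; stopping the queue would be another.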

For one OS (not Sun's) on a SPARC MBus+SBus system, one of my colleagues
measured a cutoff point of about 1k bytes.

The copy-to-tx-buffer path also helps with pathological skbs that
are 1500 bytes in 1-byte fragments.
(Maybe skbs can't get that bad, but I've seen that on other OSes.)
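
To illustrate the fragmented case: a minimal userspace sketch, assuming a
hypothetical struct frag and tx_linearize() helper (neither is a kernel
API), that gathers every fragment into one pre-mapped slot so the hardware
sees a single contiguous buffer rather than one S/G entry per fragment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for one fragment of a frame. */
struct frag {
    const uint8_t *data;
    size_t len;
};

/*
 * Copy all fragments back to back into dst (capacity cap bytes).
 * Returns the total number of bytes copied, or 0 if the frame does not
 * fit (or is empty), in which case the caller would fall back to mapping.
 */
size_t tx_linearize(uint8_t *dst, size_t cap,
                    const struct frag *frags, size_t nfrags)
{
    size_t off = 0;

    assert(dst != NULL && frags != NULL);
    for (size_t i = 0; i < nfrags; i++) {
        if (frags[i].len > cap - off)   /* off <= cap always holds here */
            return 0;
        memcpy(dst + off, frags[i].data, frags[i].len);
        off += frags[i].len;
    }
    return off;
}
```

With 1500 one-byte fragments this is 1500 tiny copies into one slot, against
what would otherwise be 1500 separate mappings or S/G entries.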

David
