2001-04-19 00:04:06

by kambo

[permalink] [raw]
Subject: CONFIG_PACKET_MMAP help

Hi,

I upgrading an application to the CONFIG_PACKET_MMAP
interface, and was trying to figure out how the api works. I 'RTFS'
But had a few questions:

1. for tp_frame_size, I dont want to truncate any data on ethernet, I
need 1514 bytes, is this the best way to do it and not waste space?

static const int TURBO_FRAME_SIZE=TPACKET_ALIGN(TPACKET_ALIGN(sizeof(tpacket_hdr))+TPACKET_ALIGN(sizeof(struct sockaddr_ll)+ETH_HLEN) + 1500);

2. what is tp_block_nr for? I dont understand it, I just set it to 1
and make tp_block_size big enough for all the frames I need, so its
just one contiguous space, all I need is about a megabyte I think.

3. is this the general approach for the api?

open socket
set ring size
mmap()

h starts at frame[0] of the mmaped area
while(1) {
if (tp->status == 0) poll() for pollin on the socket /* is there a
race here? */
parse/copy out what I want from h + h->tp_mac
set tp->status to 0 when I am done
h = next packet in ring, or wraps
}

4. what does the copy threshold setsockopt tuning accomplish? doesnt it always
have to copy anyway, to the mmaped area?

Thanks,
c.c. me por favor...

K. Lohan
------------------------
kambo at home dawt com



2001-04-19 20:38:49

by Edgar Toernig

[permalink] [raw]
Subject: Re: CONFIG_PACKET_MMAP help

Hi,

[email protected] wrote:
>
> 1. for tp_frame_size, I dont want to truncate any data on ethernet, I
> need 1514 bytes, is this the best way to do it and not waste space?
>
> static const int TURBO_FRAME_SIZE=
> TPACKET_ALIGN(TPACKET_ALIGN(sizeof(tpacket_hdr)) +
> TPACKET_ALIGN(sizeof(struct sockaddr_ll)+ETH_HLEN) + 1500);

Looks OK. Maybe instead of ETH_HLEN min(ETH_HLEN,16)? The framesize
calculation is really strange...

> 2. what is tp_block_nr for? I dont understand it, I just set it to 1
> and make tp_block_size big enough for all the frames I need, so its
> just one contiguous space, all I need is about a megabyte I think.

Better go the other way around - set tb_block_size to PAGE_SIZE and
tb_block_nr appropriate. tb_block_size is the contiguous physical memory
the kernel tries to allocate. Anything above PAGE_SIZE is likely to fail.
For you that would mean only 2 packets per 4k-page. You could try to
start with bigger (power of 2) block sizes and go down to smaller ones if
it fails (ENOMEM). [1]. Btw, there's in implicit limit on tb_block_nr.
The vector to manage the blocks is kmalloc'ed and may not be larger than
128kb giving max 32768 blocks. Hmm... moment... seems there's a similar
limit for tp_frame_nr (max 32768 frames). I'm pretty sure _that_ limit
was not there when I worked with this during 2.3. Not so nice on gigabit
ethernet :-(

> 3. is this the general approach for the api?
> [...]

Looks OK too.

> if (tp->status == 0) poll() for pollin on the socket /* is there a
> race here? */

No race.

> 4. what does the copy threshold setsockopt tuning accomplish? doesnt it always
> have to copy anyway, to the mmaped area?

I haven't used it myself. Reading the sources it does something different.
Afaics when active if there's a packet that has been truncated by the
framesize it is additionally stored in the socket's receive queue to be
fetched by a normal read/recv. It notifies you about this by setting
the TP_STATUS_COPY bit. So it seems to mean: copy to socket if threshold
(framesize) exceeded.

Ciao, ET.


[1] The PACKET_RX_RING sockopt accepts all block sizes that are a multiple
of PAGE_SIZE but always allocates a power of 2 size chunk. So using non
power of 2 sizes will waste locked kernel memory.

2001-04-20 16:05:12

by Alexey Kuznetsov

[permalink] [raw]
Subject: Re: CONFIG_PACKET_MMAP help

Hello!

> 1. for tp_frame_size, I dont want to truncate any data on ethernet, I
> need 1514 bytes, is this the best way to do it and not waste space?

To select small snapsize (obtained from later experiments),
to set PACKET_COPY_THRESH to read larger packets via recvmsg().

> 2. what is tp_block_nr for? I dont understand it, I just set it to 1
> and make tp_block_size big enough for all the frames I need, so its
> just one contiguous space, all I need is about a megabyte I think.

Kernel has problems with allocating large chunks of memory.
If you see problems with allocating large chuns, split them
to less ones.

> while(1) {
> if (tp->status == 0) poll() for pollin on the socket /* is there a
> race here? */

No. poll returns, when new frame appears.


> 4. what does the copy threshold setsockopt tuning accomplish? doesnt it always
> have to copy anyway, to the mmaped area?

see anser to question 1. It has a sense when size of chunk is small enough.
Small packets are copied to ring, large ones (which are truncated) are queued
to socket to be received via recvmsg().

Alexey