Hi,
I am upgrading an application to the CONFIG_PACKET_MMAP
interface and was trying to figure out how the API works. I 'RTFS'
but had a few questions:
1. For tp_frame_size, I don't want to truncate any data on Ethernet; I
need 1514 bytes. Is this the best way to do it without wasting space?
static const int TURBO_FRAME_SIZE =
    TPACKET_ALIGN(TPACKET_ALIGN(sizeof(struct tpacket_hdr)) +
                  TPACKET_ALIGN(sizeof(struct sockaddr_ll) + ETH_HLEN) + 1500);
2. What is tp_block_nr for? I don't understand it. I just set it to 1
and make tp_block_size big enough for all the frames I need, so it's
just one contiguous space. All I need is about a megabyte, I think.
3. Is this the general approach for the API?
open socket
set ring size
mmap()
h starts at frame[0] of the mmaped area
while(1) {
    if (h->tp_status == 0) poll() for POLLIN on the socket /* is there a race here? */
    parse/copy out what I want from h + h->tp_mac
    set h->tp_status to 0 when I am done
    h = next packet in ring, or wraps
}
4. What does the copy threshold setsockopt tuning accomplish? Doesn't it always
have to copy anyway, to the mmaped area?
Thanks,
Please c.c. me...
K. Lohan
------------------------
kambo at home dawt com
Hi,
K. Lohan wrote:
>
> 1. for tp_frame_size, I dont want to truncate any data on ethernet, I
> need 1514 bytes, is this the best way to do it and not waste space?
>
> static const int TURBO_FRAME_SIZE=
> TPACKET_ALIGN(TPACKET_ALIGN(sizeof(tpacket_hdr)) +
> TPACKET_ALIGN(sizeof(struct sockaddr_ll)+ETH_HLEN) + 1500);
Looks OK. Maybe use max(ETH_HLEN, 16) instead of ETH_HLEN? The frame size
calculation is really strange...
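A minimal sketch of the frame size arithmetic discussed above. It assumes (my reading of the kernel's SOCK_RAW receive path, not something confirmed in this thread) that at least 16 bytes are reserved for the link-layer header after TPACKET_HDRLEN. The helper name frame_size_for is hypothetical.

```c
#include <linux/if_ether.h>   /* ETH_HLEN, ETH_DATA_LEN */
#include <linux/if_packet.h>  /* TPACKET_ALIGN, TPACKET_HDRLEN */

/* Hypothetical helper: smallest aligned frame size that holds a link-layer
 * header of maclen bytes plus payload bytes without truncation.
 * Assumption: the kernel reserves max(maclen, 16) bytes for the header. */
static unsigned int frame_size_for(unsigned int maclen, unsigned int payload)
{
    unsigned int reserve = maclen < 16 ? 16 : maclen;
    /* Offset of the network header inside the frame. */
    unsigned int netoff = TPACKET_ALIGN(TPACKET_HDRLEN + reserve);

    return TPACKET_ALIGN(netoff + payload);
}
```

For plain Ethernet, frame_size_for(ETH_HLEN, ETH_DATA_LEN) would be the value to pass as tp_frame_size.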
> 2. what is tp_block_nr for? I dont understand it, I just set it to 1
> and make tp_block_size big enough for all the frames I need, so its
> just one contiguous space, all I need is about a megabyte I think.
Better to go the other way around: set tp_block_size to PAGE_SIZE and
tp_block_nr appropriately. tp_block_size is the contiguous physical memory
the kernel tries to allocate. Anything above PAGE_SIZE is likely to fail.
For you that would mean only 2 packets per 4k page. You could try to
start with bigger (power of 2) block sizes and go down to smaller ones if
it fails (ENOMEM) [1]. Btw, there's an implicit limit on tp_block_nr.
The vector to manage the blocks is kmalloc'ed and may not be larger than
128 KB, giving max 32768 blocks. Hmm... moment... seems there's a similar
limit for tp_frame_nr (max 32768 frames). I'm pretty sure _that_ limit
was not there when I worked with this during 2.3. Not so nice on gigabit
ethernet :-(
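The "start big, fall back on ENOMEM" idea might be sketched like this. The helper names fill_ring_req and setup_rx_ring are hypothetical; the sketch assumes max_block is a power-of-2 multiple of the page size, that frame_size is already TPACKET_ALIGN'ed, and that the kernel wants tp_frame_nr to equal frames-per-block times tp_block_nr.

```c
#include <errno.h>
#include <linux/if_packet.h>  /* struct tpacket_req, PACKET_RX_RING */
#include <sys/socket.h>       /* setsockopt, SOL_PACKET */
#include <unistd.h>           /* sysconf */

/* Hypothetical helper: describe a ring of at least `frames` frames built
 * from blocks of block_size bytes. Frames never span a block boundary. */
static int fill_ring_req(struct tpacket_req *req, unsigned int block_size,
                         unsigned int frame_size, unsigned int frames)
{
    unsigned int per_block = block_size / frame_size;

    if (per_block == 0)
        return -1;                     /* one frame does not fit in a block */
    req->tp_block_size = block_size;
    req->tp_frame_size = frame_size;
    req->tp_block_nr = (frames + per_block - 1) / per_block;
    req->tp_frame_nr = req->tp_block_nr * per_block;
    return 0;
}

/* Hypothetical helper: try max_block first, halve it on ENOMEM, and stop
 * at the page size. Returns 0 with *req holding the accepted layout. */
static int setup_rx_ring(int fd, struct tpacket_req *req, unsigned int max_block,
                         unsigned int frame_size, unsigned int frames)
{
    unsigned int page = (unsigned int)sysconf(_SC_PAGESIZE);

    for (unsigned int bs = max_block; bs >= page; bs /= 2) {
        if (fill_ring_req(req, bs, frame_size, frames) != 0)
            break;
        if (setsockopt(fd, SOL_PACKET, PACKET_RX_RING, req, sizeof(*req)) == 0)
            return 0;
        if (errno != ENOMEM)
            return -1;                 /* some other error: do not retry */
    }
    return -1;
}
```

On success the mmap() length would be req->tp_block_size * req->tp_block_nr.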
> 3. is this the general approach for the api?
> [...]
Looks OK too.
> if (tp->status == 0) poll() for pollin on the socket /* is there a
> race here? */
No race.
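A rough sketch of the consume side of that loop, for illustration only. struct ring_view, frame_at, drain_ready and count_bytes are hypothetical names. The sketch assumes the original struct tpacket_hdr layout and blocks mapped back to back, and it omits the memory barriers a production reader would want around tp_status.

```c
#include <linux/if_packet.h>  /* struct tpacket_hdr, TP_STATUS_* */
#include <stddef.h>

typedef void (*frame_fn)(const unsigned char *data, unsigned int len, void *ctx);

struct ring_view {             /* hypothetical bookkeeping for the mapping */
    unsigned char *base;       /* address returned by mmap() */
    unsigned int block_size, frame_size, frame_nr;
    unsigned int next;         /* index of the next frame to inspect */
};

/* Frames do not span blocks, so locate the block first, then the slot. */
static struct tpacket_hdr *frame_at(const struct ring_view *r, unsigned int i)
{
    unsigned int per_block = r->block_size / r->frame_size;

    return (struct tpacket_hdr *)(r->base +
                                  (size_t)(i / per_block) * r->block_size +
                                  (size_t)(i % per_block) * r->frame_size);
}

/* Handle every frame the kernel has handed over, give each one back, and
 * return the count. When it returns 0, poll() for POLLIN and call again. */
static unsigned int drain_ready(struct ring_view *r, frame_fn fn, void *ctx)
{
    unsigned int handled = 0;

    for (;;) {
        struct tpacket_hdr *h = frame_at(r, r->next);

        if (!(h->tp_status & TP_STATUS_USER))
            break;                              /* still owned by the kernel */
        fn((const unsigned char *)h + h->tp_mac, h->tp_snaplen, ctx);
        h->tp_status = TP_STATUS_KERNEL;        /* return frame to the kernel */
        r->next = (r->next + 1) % r->frame_nr;  /* advance, wrapping at the end */
        handled++;
    }
    return handled;
}

/* Example handler: add up the captured bytes. */
static void count_bytes(const unsigned char *data, unsigned int len, void *ctx)
{
    (void)data;
    *(unsigned long *)ctx += len;
}
```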
> 4. what does the copy threshold setsockopt tuning accomplish? doesnt it always
> have to copy anyway, to the mmaped area?
I haven't used it myself. Reading the sources, it does something different.
AFAICS, when it is active and a packet has been truncated by the
frame size, the packet is additionally stored in the socket's receive queue
to be fetched by a normal read/recv. It notifies you about this by setting
the TP_STATUS_COPY bit. So it seems to mean: copy to socket if threshold
(framesize) exceeded.
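A small sketch of how that could look from user space. The helper names are hypothetical, and the assumption that any nonzero value enables the behaviour comes from my reading of the sources, so treat it as unverified.

```c
#include <linux/if_packet.h>  /* PACKET_COPY_THRESH, TP_STATUS_COPY */
#include <sys/socket.h>       /* setsockopt, SOL_PACKET */

/* Hypothetical helper: turn on the copy threshold for a packet socket.
 * Assumption: a nonzero thresh is enough to enable it. */
static int enable_copy_thresh(int fd, int thresh)
{
    return setsockopt(fd, SOL_PACKET, PACKET_COPY_THRESH, &thresh, sizeof(thresh));
}

/* True if the ring frame holds less than the whole packet. */
static int frame_truncated(const struct tpacket_hdr *h)
{
    return h->tp_snaplen < h->tp_len;
}

/* True if the kernel also queued the packet on the socket, so that
 * it can be fetched with read()/recv(). */
static int full_copy_queued(const struct tpacket_hdr *h)
{
    return (h->tp_status & TP_STATUS_COPY) != 0;
}
```

A reader would check full_copy_queued() on each frame and call recv() on the same socket only when it returns nonzero.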
Ciao, ET.
[1] The PACKET_RX_RING sockopt accepts all block sizes that are a multiple
of PAGE_SIZE but always allocates a power of 2 size chunk. So using non
power of 2 sizes will waste locked kernel memory.
Hello!
> 1. for tp_frame_size, I dont want to truncate any data on ethernet, I
> need 1514 bytes, is this the best way to do it and not waste space?
Select a small snap size (determined by later experiments) and
set PACKET_COPY_THRESH to read the larger packets via recvmsg().
> 2. what is tp_block_nr for? I dont understand it, I just set it to 1
> and make tp_block_size big enough for all the frames I need, so its
> just one contiguous space, all I need is about a megabyte I think.
The kernel has problems allocating large chunks of memory.
If you see problems with allocating large chunks, split them
into smaller ones.
> while(1) {
> if (tp->status == 0) poll() for pollin on the socket /* is there a
> race here? */
No. poll() returns when a new frame appears.
> 4. what does the copy threshold setsockopt tuning accomplish? doesnt it always
> have to copy anyway, to the mmaped area?
See the answer to question 1. It makes sense when the chunk size is small enough.
Small packets are copied to the ring; large ones (which are truncated) are queued
to the socket to be received via recvmsg().
Alexey