2001-11-19 07:13:55

by Alexander Viro

Subject: more fun with procfs (netfilter)

% cat /proc/net/ip_conntrack |od -c
0000000 t c p 6 1 2 1 0 7
[snip]
0005137
% od -c </proc/net/ip_conntrack
0000000
% cat /proc/net/ip_tables_names | od -c -w8
0000000 n a t \n f i l t
0000010 e r \n
0000013
% od -c -w8 </proc/net/ip_tables_names
0000000 n a t \n
0000004

Reason: netfilter procfs files try to fit entire records into the user
buffer. Do a read shorter than record size and you've got zero. And
read() returning 0 means you-know-what...

BTW, from strace output in cpuinfo bug report SuSE bash does reads by 128
bytes. Which means that while read i; do echo $i; done </proc/net/ip_conntrack
will come out empty (lots of lines are longer than 160 characters).

I'll try to see if seq_file is suitable there, but in any case something
needs to be done - read() should return 0 _only_ at EOF and 128 bytes
definitely counts as reasonable buffer size.
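
A toy model of the failure described above, as a minimal sketch. It is not the netfilter code: record_read() and read_all() are hypothetical names, and the handler is assumed to track its position as a record index and to hand out only whole records that fit in the caller's buffer.

```python
def record_read(records, offset, bufsize):
    """Hypothetical handler: return only whole records that fit in bufsize.

    offset is the index of the next record to emit, not a byte position.
    """
    out = b""
    while offset < len(records) and len(out) + len(records[offset]) <= bufsize:
        out += records[offset]
        offset += 1
    return out, offset


def read_all(records, bufsize):
    """Ordinary reader: a 0-byte read is taken to mean end of file."""
    data, offset = b"", 0
    while True:
        chunk, offset = record_read(records, offset, bufsize)
        if not chunk:          # read() returned 0, so stop
            return data
        data += chunk


records = [b"nat\n", b"filter\n"]
print(read_all(records, 4096))  # large buffer: both records arrive
print(read_all(records, 4))     # second record does not fit: reader stops early
print(read_all(records, 1))     # byte-by-byte reader: nothing at all
```

The buffer sizes are picked only to make the toy show the three outcomes; the sizes the real tools used are not all stated in this thread.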


2001-11-19 07:18:26

by Alexander Viro

Subject: Re: more fun with procfs (netfilter)



> BTW, from strace output in cpuinfo bug report SuSE bash does reads by 128
> bytes. Which means that while read i; do echo $i; done </proc/net/ip_conntrack
> will come out empty (lots of lines are longer than 160 characters).

PS: with bash-2.03:
$ while read i; do echo $i; done < /proc/ip_tables_names
$ while read i; do echo $i; done < /proc/ip_conntrack
$
'cause this beast reads byte-by-byte...

2001-11-19 09:17:51

by Herbert Xu

Subject: Re: more fun with procfs (netfilter)

Alexander Viro <[email protected]> wrote:

> PS: with bash-2.03:
> $ while read i; do echo $i; done < /proc/ip_tables_names
> $ while read i; do echo $i; done < /proc/ip_conntrack
> $
> 'cause this beast reads byte-by-byte...

Actually, there is no easy way of implementing a read(1) that doesn't
read byte-by-byte...
--
Debian GNU/Linux 2.2 is out! ( http://www.debian.org/ )
Email: Herbert Xu 许志壬 <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

2001-11-19 09:23:11

by Alexander Viro

Subject: Re: more fun with procfs (netfilter)



On Mon, 19 Nov 2001, Herbert Xu wrote:

> Alexander Viro <[email protected]> wrote:
>
> > PS: with bash-2.03:
> > $ while read i; do echo $i; done < /proc/ip_tables_names
> > $ while read i; do echo $i; done < /proc/ip_conntrack
> > $
> > 'cause this beast reads byte-by-byte...
>
> Actually, there is no easy way of implementing a read(1) that doesn't
> read byte-by-byte...

Some shells (pdksh 5.2.14-1, bash 2.04 as shipped by SuSE) are trying to be
smart if stdin is from regular file - they hope that third argument of
lseek() is in bytes and is consistent with read() return value.
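
A rough sketch of that "smart" approach, assuming a seekable descriptor. read_line_seekable() is a hypothetical helper, not code taken from pdksh or bash: it reads a block, keeps one line, and seeks back over whatever it over-read. That only works when lseek() offsets are byte counts that agree with what read() returned, which a regular file on disk provides.

```python
import os
import tempfile


def read_line_seekable(fd, chunk=128):
    """Read one line with block reads, then seek back over the excess."""
    start = os.lseek(fd, 0, os.SEEK_CUR)      # remember where this line begins
    buf = b""
    while b"\n" not in buf:
        data = os.read(fd, chunk)
        if not data:                          # 0 bytes: treated as EOF
            break
        buf += data
    nl = buf.find(b"\n")
    line = buf if nl < 0 else buf[:nl + 1]
    # Give back the bytes read past the newline so the next reader sees them.
    os.lseek(fd, start + len(line), os.SEEK_SET)
    return line


# Demonstration on an ordinary temporary file, where the assumption holds.
with tempfile.TemporaryFile() as f:
    f.write(b"nat\nfilter\n")
    f.flush()
    fd = f.fileno()
    os.lseek(fd, 0, os.SEEK_SET)
    lines = [read_line_seekable(fd) for _ in range(3)]

print(lines)
```

On a file whose read() returns 0 for a 128-byte request, the same loop would stop at the first call and report an empty line.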

2001-11-19 19:19:57

by H. Peter Anvin

Subject: Re: more fun with procfs (netfilter)

Followup to: <[email protected]>
By author: Alexander Viro <[email protected]>
In newsgroup: linux.dev.kernel
>
> Some shells (pdksh 5.2.14-1, bash 2.04 as shipped by SuSE) are trying to be
> smart if stdin is from regular file - they hope that third argument of
> lseek() is in bytes and is consistent with read() return value.
>

Not just hope... they have a legitimate reason to expect that
guarantee from anything that advertises itself as S_IFREG. I really
think procfs files should advertise themselves as S_IFCHR if they
can't fully obey the semantics of S_IFREG files (including having a
working length in stat()!)

Such S_IFCHR devices can return 0 in st_rdev to signal userspace that
this is a device node keyed by special filesystem semantics rather
than by device number.
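
For illustration, this is roughly what a userspace program can check before trusting lseek() and st_size. It is a sketch only: describe() is a hypothetical helper, and the st_rdev == 0 convention proposed above is a suggestion in this thread, not something the code relies on.

```python
import os
import stat
import tempfile


def describe(path):
    """Return (kind, size) for path as reported by stat()."""
    st = os.stat(path)
    if stat.S_ISREG(st.st_mode):
        kind = "regular"      # caller may expect byte offsets and a real length
    elif stat.S_ISCHR(st.st_mode):
        kind = "char"         # caller should not rely on seeking or st_size
    else:
        kind = "other"
    return kind, st.st_size


# Demonstration on an ordinary temporary file.
with tempfile.NamedTemporaryFile() as f:
    f.write(b"nat\nfilter\n")
    f.flush()
    result = describe(f.name)

print(result)
```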

-hpa
--
<[email protected]> at work, <[email protected]> in private!
"Unix gives you enough rope to shoot yourself in the foot."
http://www.zytor.com/~hpa/puzzle.txt <[email protected]>

2001-11-20 00:49:11

by Rusty Russell

Subject: Re: more fun with procfs (netfilter)

In message <[email protected]> you write:
> Reason: netfilter procfs files try to fit entire records into the user
> buffer. Do a read shorter than record size and you've got zero. And
> read() returning 0 means you-know-what...

Yes. Don't do this.

Hope that helps,
Rusty.
--
Premature optmztion is rt of all evl. --DK

2001-11-20 10:19:30

by Alexander Viro

Subject: Re: more fun with procfs (netfilter)



On Tue, 20 Nov 2001, Rusty Russell wrote:

> In message <[email protected]> you write:
> > Reason: netfilter procfs files try to fit entire records into the user
> > buffer. Do a read shorter than record size and you've got zero. And
> > read() returning 0 means you-know-what...
>
> Yes. Don't do this.
>
> Hope that helps,

That's nice, but...

% awk '/ESTABLISHED/{print $5}' /proc/net/ip_conntrack| wc -l
26
% grep ESTABLISHED /proc/net/ip_conntrack| wc -l
56
%

- IOW, awk (both gawk and mawk) loses everything past the first 4Kb.
And yes, it's a real-world example (there was more than $5 and it was
followed by sed(1), but that doesn't affect the result - lost lines).

So the list of "don't do this" is a bit longer than just reading it
from shell.

2001-11-21 00:17:22

by Jamie Lokier

Subject: Re: more fun with procfs (netfilter)

Alexander Viro wrote:
> - IOW, awk (both gawk and mawk) loses everything past the first 4Kb.
> And yes, it's a real-world example (there was more than $5 and it was
> followed by sed(1), but that doesn't affect the result - lost lines).

Does this break fopen/fscanf as well then? There are programs which use
fscanf to read this info.

-- Jamie

2001-11-21 00:23:31

by Alexander Viro

Subject: Re: more fun with procfs (netfilter)



On Wed, 21 Nov 2001, Jamie Lokier wrote:

> Alexander Viro wrote:
> > - IOW, awk (both gawk and mawk) loses everything past the first 4Kb.
> > And yes, it's a real-world example (there was more than $5 and it was
> > followed by sed(1), but that doesn't affect the result - lost lines).
>
> Does this break fopen/fscanf as well then? There are programs which use
> fscanf to read this info.

I suspect that unless you do something stupid with setvbuf() you should be
OK - glibc uses sufficiently large buffers for stdio and doesn't try to
cram as much as possible into them (that's what kills awk - it ends up
doing read(2) again and again trying to fill the buffer and eventually the
tail of the buffer becomes too small; then it gets 0 from read(2) and
decides that it was an EOF).
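
The difference between the two reading strategies can be sketched with a toy model. Everything here is an assumption made for illustration: the handler is modelled as returning only whole records that fit, the record count (56) and record size (150 bytes) are made up, and fill_buffer_reader() and fresh_buffer_reader() are hypothetical stand-ins, not the awk or glibc code.

```python
def record_read(records, offset, bufsize):
    """Hypothetical handler: return only whole records that fit in bufsize."""
    out = b""
    while offset < len(records) and len(out) + len(records[offset]) <= bufsize:
        out += records[offset]
        offset += 1
    return out, offset


def fill_buffer_reader(records, bufsize=4096):
    """Keeps reading into the unused tail of one buffer until it is full."""
    data, offset = b"", 0
    while True:
        buf = b""
        while len(buf) < bufsize:
            chunk, offset = record_read(records, offset, bufsize - len(buf))
            if not chunk:              # tail too small for a record: looks like EOF
                return data + buf
            buf += chunk
        data += buf


def fresh_buffer_reader(records, bufsize=4096):
    """Offers the whole buffer on every read and never tops up a partial one."""
    data, offset = b"", 0
    while True:
        chunk, offset = record_read(records, offset, bufsize)
        if not chunk:
            return data
        data += chunk


# 56 made-up records of 150 bytes each (149 characters plus a newline).
records = [b"x" * 149 + b"\n" for _ in range(56)]
filled = fill_buffer_reader(records).count(b"\n")
fresh = fresh_buffer_reader(records).count(b"\n")
print(filled, fresh)
```

In this model the buffer-filling reader stops as soon as the remaining space drops below one record, while the reader that offers a full buffer each time sees every record.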