2021-07-20 09:19:09

by Jason Vas Dias

Subject: /proc/net/{udp,tcp}{,6} : ip address format : RFC : need for /proc/net/{udp,tcp}{,6}{{n,h},{le,be}} ?

Good day -

I noticed that /proc/net/{udp,tcp} files (bash expansion) - the IPv4
socket tables - contain IPv4 addresses in hex format like:

0100007F:0035

(Little-Endian IPv4 address 127.0.0.1 , Big Endian port 53)

I would have expected the address and port to be printed EITHER like:
   7F000001:0035 (Both Big-Endian)
OR
   0100007F:3500 (Both Little-Endian).

It is rather idiosyncratic that Linux chooses
to print Little-Endian IPv4 addresses, but not
Little-Endian ports, while the other numbers in those files,
e.g. (tx_queue:rx_queue) and (tr:tm->when), are all
Big-Endian.
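
A minimal Python sketch of the conversion a script needs for one such field. It assumes the address is the 32-bit network-order value printed as a native integer (so it looks byte-swapped on little-endian hosts) and that the port is printed as a plain hex number. The name `decode_ipv4_field` and its `byteorder` parameter are made up for illustration.

```python
import socket
import sys


def decode_ipv4_field(field, byteorder=sys.byteorder):
    """Decode one 'ADDR:PORT' field from /proc/net/{tcp,udp} (IPv4 only)."""
    addr_hex, port_hex = field.split(":")
    # Assumption: the address was printed as a native 32-bit integer, so
    # converting it back with the host's byte order recovers the wire bytes.
    addr_bytes = int(addr_hex, 16).to_bytes(4, byteorder)
    # The port is an ordinary hex number, most significant digit first.
    port = int(port_hex, 16)
    return socket.inet_ntoa(addr_bytes), port


# Field from the message above, decoded as it appears on a little-endian host.
print(decode_ipv4_field("0100007F:0035", byteorder="little"))
```

Leaving `byteorder` at its default uses the byte order of the machine running the script, which is only correct when the file is read on the host that produced it.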

Perhaps a later version of Linux could either:
A) Print ALL IP addresses, ports and numbers in network
   (Big-Endian) byte order, or as IP dotted-quad+port strings
; OR:
B) Provide /proc/net/{udp,tcp}{,6}{n,be,h,le,ip} files
   (use the shell to expand: $ echo /proc/net/{udp,tcp}{,6}{n,be,h,le,ip})
   which print IPv4 addresses & ports in the format indicated by the suffix:
     n:  network: always Big-Endian
     h:  host: native, either Little-Endian (LE) or Big-Endian (BE)
     be: BE - alias for 'n'
     le: LE - alias for 'h' on LE platforms, else LE
     ip: as dotted-decimal-quad+':'decimal-port strings, with numbers in BE.
; OR:
C) Provide /proc/net/{udp,tcp}{,6}bin memory-mappable binary socket
   table files
?
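
To illustrate what the suffixes in option B would produce for the same socket, here is a small Python sketch. Nothing like this exists in the kernel; the function `format_endpoint` and its parameters are hypothetical, and the output shapes are taken from the 7F000001:0035 / 0100007F:3500 examples above.

```python
import ipaddress
import sys


def format_endpoint(ip, port, variant, host_byteorder=sys.byteorder):
    """Render an IPv4 endpoint in one of the proposed n/be/h/le/ip formats."""
    if variant == "ip":
        return f"{ip}:{port}"
    if variant in ("n", "be"):
        order = "big"
    elif variant == "le":
        order = "little"
    elif variant == "h":
        order = host_byteorder  # native order of the machine
    else:
        raise ValueError(f"unknown variant: {variant!r}")
    addr = ipaddress.IPv4Address(ip).packed  # network-order bytes
    port_bytes = port.to_bytes(2, "big")
    if order == "little":
        # Little-endian output is the byte-reversed form of both numbers.
        addr, port_bytes = addr[::-1], port_bytes[::-1]
    return f"{addr.hex().upper()}:{port_bytes.hex().upper()}"


for variant in ("n", "le", "ip"):
    print(variant, format_endpoint("127.0.0.1", 53, variant))
```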

Should I raise a bug on this, rather than leaving users to discover
the mixed byte order by mis-converting IP addresses / ports, as I did at first?

Just a thought / request for comments.

One would definitely want to inform the netstat + lsof + glibc
developers before choosing option A .

Option B allows users to choose which endianness to use (for ALL numbers)
by only adding new files, not changing existing ones.

Option C would obviate the need to choose an endianness file by
just providing one new memory-mappable binary representation
of the sockets table, of size an even multiple of the page-size,
but whose reported size would be (sizeof(some_linux_ip_socket_table_struct_t) *
n_sockets_in_table). It could be provided alongside option B.
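
A rough Python sketch of how a script might consume an option C file. The record layout (`RECORD`) is entirely hypothetical, since no such struct or file exists, and a temporary file stands in for the proposed table. The point is only that the socket count falls out of the reported size divided by the record size.

```python
import mmap
import os
import socket
import struct
import tempfile

# Hypothetical fixed-size record: local addr, local port, remote addr,
# remote port, state; network byte order, no padding (13 bytes).
RECORD = struct.Struct("!4sH4sHB")


def read_socket_table(path):
    """Map a binary table read-only and decode it record by record."""
    with open(path, "rb") as f, \
            mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
        n_sockets = len(m) // RECORD.size  # reported size / record size
        rows = []
        for i in range(n_sockets):
            laddr, lport, raddr, rport, state = RECORD.unpack_from(
                m, i * RECORD.size)
            rows.append((socket.inet_ntoa(laddr), lport,
                         socket.inet_ntoa(raddr), rport, state))
        return rows


# Stand-in for the proposed file: one made-up record in a temporary file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(RECORD.pack(socket.inet_aton("127.0.0.1"), 53,
                          socket.inet_aton("0.0.0.0"), 0, 0x0A))
table = read_socket_table(tmp.name)
os.unlink(tmp.name)
print(table)
```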

I think options B and / or C would be nice to have - I might implement an
extension to the procfs code that prints these socket tables to
do this, maybe enabled by a new experimental
'+rational-ip-socket-tables' boot option -
then at least it would be clear how the numbers in those files are
meant to be read / converted.

All the best,
Jason








2021-07-20 18:41:32

by Jason Vas Dias

Subject: re: /proc/net/{udp,tcp}{,6} : ip address format : RFC : need for /proc/net/{udp,tcp}{,6}{{n,h},{le,be}} ?


RE:
On 20/07/2021, Randy Dunlap <[email protected]> wrote:
> On 7/20/21 2:14 AM, Jason Vas Dias wrote:
> ...
> Hi,
> I suggest sending your email to [email protected]
> g'day.

All the best,
Jason







2021-07-20 19:02:32

by Jason Vas Dias

Subject: re: /proc/net/{udp,tcp}{,6} : ip address format : RFC : need for /proc/net/{udp,tcp}{,6}{{n,h},{le,be}} ?


RE:
On 20/07/2021, Randy Dunlap <[email protected]> wrote:
> On 7/20/21 2:14 AM, Jason Vas Dias wrote:
> ...
> Hi,
> I suggest sending your email to [email protected]
> g'day.
>>> (he meant netdev@)

All the best,
Jason







2021-07-20 22:43:27

by Stephen Hemminger

Subject: Re: /proc/net/{udp,tcp}{,6} : ip address format : RFC : need for /proc/net/{udp,tcp}{,6}{{n,h},{le,be}} ?

On Tue, 20 Jul 2021 19:59:57 +0100
"Jason Vas Dias" <[email protected]> wrote:

> RE:
> On 20/07/2021, Randy Dunlap <[email protected]> wrote:
> > On 7/20/21 2:14 AM, Jason Vas Dias wrote:
> > ...
> > Hi,
> > I suggest sending your email to [email protected]
> > g'day.
> >>> (he meant netdev@)
>
> Good day -
>
> I noticed that /proc/net/{udp,tcp} files (bash expansion) - the IPv4
> socket tables - contain IPv4 addresses in hex format like:
>
> 0100007F:0035
>
> (Little-Endian IPv4 address 127.0.0.1 , Big Endian port 53)
>
> I would have printed / expected the IPv4 address to be printed EITHER
> like:
> 7F000001:0035 (Both Big-Endian)
> OR
> 0100007F:3500 (Both Little-Endian)
> .
>
> It is rather idiosyncratic that Linux chooses
> to print Little-Endian IPv4 addresses, but not
> Little-Endian Ports , and where the other numbers
> eg. (rx:tx) , (tr:tm/when) in those files are all
> Big-Endian.
>
> Perhaps a later version of Linux could either
> A) Print ALL IP addresses and Ports and numbers in network
> (Big Endian) byte order, or as IP dotted-quad+port strings
> ; OR:
> B) Provide /proc/net/{udp,tcp}{,6}{n,be,h,le,ip} files
> ( use shell : $ echo ^^
> to expand
> ) -
> which print IPv4 addresses & Ports in formats indicated by suffix :
> n: network: always Big Endian
> h: host: native either Little-Endian (LE) or Big Endian (BE)
> be: BE - alias for 'n'
> le: LE - alias for 'h' on LE platforms, else LE
> ip: as dotted-decimal-quad+':'decimal-port strings, with numbers in BE.
> ; OR:
> C) Provide /proc/net/{udp,tcp}{,6}bin memory mappable binary socket
> table files
> .

/proc is part of the guaranteed stable ABI in Linux. The format of those
files cannot change like that; it would break several applications.

And adding new files to /proc is actively discouraged, since we have better
interfaces like netlink or sysfs.



> Should I raise a bug on this ?

No

> Rather than currently letting users discover this fact
> by mis-converting IP addresses / ports initially as I did at first.
>
> Just a thought / request for comments.
>
> One would definitely want to inform the netstat + lsof + glibc
> developers before choosing option A .

Netstat is actually part of net-tools and is mostly deprecated in
favor of the iproute2 ss command.

> Option B allows users to choose which endianess to use (for ALL numbers)
> by only adding new files, not changing existing ones.
>
> Option C would obviate the need to choose an endianess file by
> just providing one new memory-mappable binary representation
> of the sockets table, of size an even multiple of the page-size,
> but whose reported size would be (sizeof(some_linux_ip_socket_table_struct_t) *
> n_sockets_in_table). It could be provided alongside option B.
>
> I think options B and / or C would be nice to have - I might implement an
> extension to the procfs code that prints these socket tables to
> do this, maybe enabled by a new experimental
> '+rational-ip-socket-tables' boot option -
> then at least it would be clear how the numbers in those files are
> meant to be read / converted.
>
> All the best,
> Jason

So, yes what you say makes sense but that was not how the early
prehistoric (2.4 or earlier) versions of Linux decided to output addresses
and it can never change.

2021-07-22 10:43:20

by Jason Vas Dias

Subject: Re: /proc/net/{udp,tcp}{,6} : ip address format : RFC : need for /proc/net/{udp,tcp}{,6}{{n,h},{le,be}} ?


RE: On 20 July 2021 at 23:41, Stephen Hemminger wrote:
>> So, yes what you say makes sense but that was not how the early
>> prehistoric (2.4 or earlier) versions of Linux decided to output addresses
>> and it can never change.

I don't like those words: "it can never change" !:-)

How about either or both Options B & C under sysfs then?

i.e. something like /sys/class/net/{udp,tcp}{,6,n,h,ip,bin}
   6:   IPv6
   [optionally:
     n:   hex, network byte order
     h:   hex, host byte order
     ip:  ASCII dotted-quad decimal IPv4 address with ':' <port>
          suffix, and decimal numbers
     ip6: ASCII 32-bit hex words of IPv6 address separated by ':' (or
          '::') with '#' <port> suffix, with decimal numbers
   ]
   [and / or:
     bin: memory-mapped read-only binary table
   ]

I know ip route and netlink can be used. But since Linux is mandated to
print the IP socket and routing tables in ASCII in the
/proc/net/{udp,tcp}* files, which I think is a great idea for
shell / perl / python / java / nodejs / lisp / "script language X" scripts,
it should not be precluded from providing a better attempt in new
files / filesystems - that is all I am suggesting.

It is a much more attractive proposition for scripts to parse some ASCII
text rather than having to call into a native code library or run an
executable like 'ip' (iproute2) to use netlink sockets for this.
Since Linux has to do this job for the /proc filesystem anyway,
why not at least consider the idea of improving & extending this
excellent support for scripts, and make their task simpler and more
efficient? I.e. they could use one number conversion routine for
all numbers in each new file.
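
To make the contrast concrete, here is a minimal Python sketch of what a script has to do with one line of today's /proc/net/tcp, mixing three conversions: a host-order hex address, plain hex numbers, and decimal numbers. The `SAMPLE` line is made up in the file's layout, and the helper names and the `byteorder` parameter are hypothetical.

```python
import socket
import sys

# Illustrative line in the /proc/net/tcp layout; the values are made up.
SAMPLE = ("   0: 0100007F:0035 00000000:0000 0A 00000000:00000000 "
          "00:00000000 00000000     0        0 12345")


def hex_endpoint(field, byteorder=sys.byteorder):
    """Conversion 1: address is a native-order hex integer, port is hex."""
    addr_hex, port_hex = field.split(":")
    addr = socket.inet_ntoa(int(addr_hex, 16).to_bytes(4, byteorder))
    return addr, int(port_hex, 16)


def parse_tcp4_line(line, byteorder=sys.byteorder):
    f = line.split()
    # Conversion 2: state and queue lengths are plain hex numbers.
    tx_queue, rx_queue = (int(x, 16) for x in f[4].split(":"))
    return {
        "local": hex_endpoint(f[1], byteorder),
        "remote": hex_endpoint(f[2], byteorder),
        "state": int(f[3], 16),
        "tx_queue": tx_queue,
        "rx_queue": rx_queue,
        # Conversion 3: uid and inode are decimal.
        "uid": int(f[7]),
        "inode": int(f[9]),
    }


# Parsed as the line would appear on a little-endian host.
print(parse_tcp4_line(SAMPLE, byteorder="little"))
```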

I'd personally find such tables most useful, and might actually develop
a module for them. Especially if they included the netlink IP stats
like 64-bit total counts of rx & tx bytes for each socket as well
as rx & tx queue lengths.

Best Regards,
Jason