2003-03-17 15:20:24

by Sparks, Jamie

Subject: select() stress


Hello,

I'm running some code ported from an sgi running Irix 6.5 on a
redhat 7.1 box: 2.4.7-10, i686. Control hangs on a select()
statement forever. The select is never completed, so I can't
check errno.

Please reply to me personally as I'm not currently subscribed:
[email protected]

on sgi, the call is:

select(getdtablehi(), &socklist, NULL, NULL, NULL);

where socklist is declared as: FD_SET socklist;

on linux, there is no getdtablehi() equivalent, so I use
getdtablesize() in its place. getdtablehi() returns the
number of fd's currently open and getdtablesize() returns
the number of fd's that *can* be open.

here's the code:

for (;;)
{
    printf("CLEARING sockets\n");
    FD_ZERO(&socklist); /* Always clear the structure first. */
    FD_SET(Sockfd[0], &socklist);
    FD_SET(Sockfd[1], &socklist);

    int len = -1;
    bool wasDequeued =false;
    if (!StateManager::getSaveInProgress()&& StateManager::hasQueuedPdus())
    {
        FD_ZERO(&socklist); /* Always clear the structure first. */
        pdu = StateManager::dequeuePostPduProcess();
        len = sizeof(pdu);
        wasDequeued = true;
        printf("PDU DEqueued from StateManager since Save has completed\n");
    }
    else
    {
        printf("prior to select\n");
        // orig sgi if (select(getdtablehi(), &socklist, NULL, NULL, NULL) < 0)

        /* ****************************** */
        /* THIS select() STATEMENT NEVER COMPLETES */
        /* ****************************** */
        if (select(getdtablesize(), &socklist, NULL, NULL, NULL) < 0)
        {
            if (errno != EINTR) perror("WeapTerrain");
            continue;
        }

        printf("after select\n");
    }
    printf("Prior to finding which socket\n");

    for (ii=0;ii<2;ii++)
    {
        len = -1;
        if (FD_ISSET(Sockfd[ii],&socklist))
        {
            if (!ii)
            {
                len = waitForSocketMessage(Sockfd[ii],&pdu,
                                           sizeof(pdu));
            } else if (ii)
            {
                len = 0;
            }
        }

        if (wasDequeued){len = sizeof(pdu);wasDequeued=false;}

        if (len == sizeof(pdu) && StateManager::getSaveInProgress())
        {
            StateManager::queuePostPduProcess(pdu);
            printf("incoming PDU queued in StateManager since SaveInProgress\n");
            continue;
        }

        if (len >= 0)
        {
            printf("LEN?TYPE = %d %d\n",len, pdu.dpdu.detonation_header.type);
        }

.
.
.

thanks,

Jamie


2003-03-17 15:32:54

by Matti Aarnio

Subject: Re: select() stress

On Mon, Mar 17, 2003 at 10:28:59AM -0500, Sparks, Jamie wrote:
> Hello,
>
> I'm running some code ported from an sgi running Irix 6.5 on a
> redhat 7.1 box: 2.4.7-10, i686. Control hangs on a select()
> statement forever. The select is never completed, so I can't
> check errno.

You do set two socket fds for read monitoring, no write-sockets,
nor exceptions, and most definitely, no timeouts.

If, for some reason, whoever is supposed to send something to
those sockets does not do it, e.g. due to some odd buffering
somewhere, you are effectively stuck.
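
A minimal sketch of the point about timeouts, assuming a single
descriptor that is smaller than FD_SETSIZE. The helper name
wait_readable() is hypothetical and not part of the original code; it
shows how passing a struct timeval as the last argument lets select()
return even when the peer never sends anything:

```c
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper: wait until fd is readable or timeout_sec
 * seconds have passed.
 * Returns 1 if fd is readable, 0 on timeout, -1 on error (errno set). */
int wait_readable(int fd, long timeout_sec)
{
    fd_set readfds;
    struct timeval tv;

    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);

    /* select() may modify tv, so it is set up again on every call. */
    tv.tv_sec = timeout_sec;
    tv.tv_usec = 0;

    /* First argument: highest descriptor in the set, plus one. */
    return select(fd + 1, &readfds, NULL, NULL, &tv);
}
```

With something like wait_readable(Sockfd[0], 5), a return value of 0
would mean that nothing arrived within five seconds, which at least
lets the loop print a diagnostic instead of blocking forever.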


> Please reply to me personally as I'm not currently subscribed:
> [email protected]
>
> on sgi, the call is:
>
> select(getdtablehi(), &socklist, NULL, NULL, NULL);
>
> where socklist is declared as: FD_SET socklist;
>
> on linux, there is no getdtablehi() equivalent, so I use
> getdtablesize() in its place. getdtablehi() returns the
> number of fd's currently open and getdtablesize() returns
> the number of fd's that *can* be open.

I would be careful with that, and explicitly add code
to find the current highest fd in the interest set:

> here's the code:
>
> for (;;)
> {
int highfd;

> printf("CLEARING sockets\n");
> FD_ZERO(&socklist); /* Always clear the structure first. */
> FD_SET(Sockfd[0], &socklist);
highfd = Sockfd[0];
> FD_SET(Sockfd[1], &socklist);
if (Sockfd[1] > highfd) highfd = Sockfd[1];
>
> int len = -1;
> bool wasDequeued =false;
> if (!StateManager::getSaveInProgress()&& StateManager::hasQueuedPdus())
> {
> FD_ZERO(&socklist); /* Always clear the structure first. */
> pdu = StateManager::dequeuePostPduProcess();
> len = sizeof(pdu);
> wasDequeued = true;
> printf("PDU DEqueued from StateManager since Save has completed\n");
> }
> else
> {
> printf("prior to select\n");
> // orig sgi if (select(getdtablehi(), &socklist, NULL, NULL, NULL) < 0)

int rc = select(highfd + 1, &socklist, NULL, NULL, NULL);
if (rc < 0) ...

> /* ****************************** */
> /* THIS select() STATEMENT NEVER COMPLETES */
> /* ****************************** */
> if (select(getdtablesize(), &socklist, NULL, NULL, NULL) < 0)
> {
> if (errno != EINTR) perror("WeapTerrain");
> continue;
> }
>
> printf("after select\n");
> }
> printf("Prior to finding which socket\n");
>
> for (ii=0;ii<2;ii++)
> {
> len = -1;
> if (FD_ISSET(Sockfd[ii],&socklist))
> {
> if (!ii)
> {
> len = waitForSocketMessage(Sockfd[ii],&pdu,
> sizeof(pdu));
> } else if (ii)
> {
> len = 0;
> }
> }
>
> if (wasDequeued){len = sizeof(pdu);wasDequeued=false;}
>
> if (len == sizeof(pdu) && StateManager::getSaveInProgress())
> {
> StateManager::queuePostPduProcess(pdu);
> printf("incoming PDU queued in StateManager since SaveInProgress\n");
> continue;
> }
>
> if (len >= 0)
> {
> printf("LEN?TYPE = %d %d\n",len, pdu.dpdu.detonation_header.type);
> }
>
>
> Please reply to me personally as I'm not currently subscribed:
> [email protected]
>
> thanks,
>
> Jamie

2003-03-17 15:48:11

by Richard B. Johnson

Subject: Re: select() stress

On Mon, 17 Mar 2003, Sparks, Jamie wrote:

>
> Hello,
>
> I'm running some code ported from an sgi running Irix 6.5 on a
> redhat 7.1 box: 2.4.7-10, i686. Control hangs on a select()
> statement forever. The select is never completed, so I can't
> check errno.
>
[SNIPPED...]


> /* ****************************** */
> if (select(getdtablesize(), &socklist, NULL, NULL, NULL) < 0)
> {
> if (errno != EINTR) perror("WeapTerrain");
> continue;
> }

select() takes a file-descriptor as its first argument, not the
return-value of some function that returns the number of file-
descriptors. You cannot assume that this number is the same
as the currently open socket. Just use the socket-value. That's
the file-descriptor.


Cheers,
Dick Johnson
Penguin : Linux version 2.4.20 on an i686 machine (797.90 BogoMips).
Why is the government concerned about the lunatic fringe? Think about it.

2003-03-17 17:49:13

by Miquel van Smoorenburg

Subject: Re: select() stress

In article <Pine.LNX.4.53.0303171112090.22652@chaos>,
Richard B. Johnson <[email protected]> wrote:
>select() takes a file-descriptor as its first argument, not the
>return-value of some function that returns the number of file-
>descriptors. You cannot assume that this number is the same
>as the currently open socket. Just use the socket-value. That's
>the file-descriptor.

Duh? "man select".

Mike.
--
Anyone who is capable of getting themselves made President should
on no account be allowed to do the job -- Douglas Adams.

2003-03-17 18:17:19

by Olaf Titz

Subject: Re: select() stress

> select() takes a file-descriptor as its first argument, not the
> return-value of some function that returns the number of file-
> descriptors. You cannot assume that this number is the same
> as the currently open socket. Just use the socket-value. That's
^ plus one

(yes, I made that mistake more than enough times...)

Olaf

2003-03-18 10:24:03

by DervishD

Subject: Re: select() stress

Hi all :)

Richard B. Johnson dixit:
> > /* ****************************** */
> > if (select(getdtablesize(), &socklist, NULL, NULL, NULL) < 0)
> > {
> > if (errno != EINTR) perror("WeapTerrain");
> > continue;
> > }
> select() takes a file-descriptor as its first argument, not the
> return-value of some function that returns the number of file-
> descriptors. You cannot assume that this number is the same
> as the currently open socket. Just use the socket-value. That's
> the file-descriptor.

Not at all. 'select()' takes a *number of file descriptors* as
its first argument, meaning the maximum number of file descriptors to
check (it checks only the first N file descriptors, being 'N' the
first argument). Usually that first argument is FD_SETSIZE, but the
result of any function returning a number is right if you know that
the return value is what you want.

If, for example, FD_SETSIZE is set to UINT_MAX but
getdtablesize() returns 100 ('ulimit' came to mind), it's a good idea
to use the return value of that function. Anyway, IMHO it is better
to use FD_SETSIZE.

See the glibc info for more references.

Bye and happy coding :)
Raúl Núñez de Arenas Coronado

--
Linux Registered User 88736
http://www.pleyades.net & http://www.pleyades.net/~raulnac

2003-03-18 12:53:51

by Richard B. Johnson

Subject: Re: select() stress

On Tue, 18 Mar 2003, DervishD wrote:

> Hi all :)
>
> Richard B. Johnson dixit:
> > > /* ****************************** */
> > > if (select(getdtablesize(), &socklist, NULL, NULL, NULL) < 0)
> > > {
> > > if (errno != EINTR) perror("WeapTerrain");
> > > continue;
> > > }
> > select() takes a file-descriptor as its first argument, not the
> > return-value of some function that returns the number of file-
> > descriptors. You cannot assume that this number is the same
> > as the currently open socket. Just use the socket-value. That's
> > the file-descriptor.
>
> Not at all. 'select()' takes a *number of file descriptors* as
> its first argument, meaning the maximum number of file descriptors to
> check (it checks only the first N file descriptors, being 'N' the
> first argument). Usually that first argument is FD_SETSIZE, but the
> result of any function returning a number is right if you know that
> the return value is what you want.
>
> If, for example, FD_SETSIZE is set to UINT_MAX but
> getdtablesize() returns 100 ('ulimit' came to mind), it's a good idea
> to use the return value of that function. Anyway, IMHO it is better
> to use FD_SETSIZE.
>
> See the glibc info for more references.
>
> Bye and happy coding :)
> Raúl Núñez de Arenas Coronado
>

What I said has been misinterpreted. Select takes the highest
number fd in the set you want to examine plus 1. It therefore
requires some relationship to the fd that you are using: if
you are using a socket whose value is N, you must select on
(at least) N+1, not the return value of a function that gives
the maximum number of fds that you can open. They are not the
same and are not guaranteed to be related although on some
target, they might. It's the same problem as:

write(1, "Hello\n", 6);

Such code is broken. At the very least, one needs to use
STDOUT_FILENO as the fd, and really should not count characters
by hand.
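
A small sketch of that suggestion, assuming a NUL-terminated string.
The helper write_str() is a hypothetical name; it takes the
descriptor as a parameter and lets strlen() do the counting. It does
not retry on short writes:

```c
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: write a NUL-terminated string to fd.
 * Returns whatever write() returns (bytes written, or -1 on error). */
ssize_t write_str(int fd, const char *s)
{
    /* The length is computed rather than counted by hand. */
    return write(fd, s, strlen(s));
}
```

The call from the example above would then read
write_str(STDOUT_FILENO, "Hello\n").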


Cheers,
Dick Johnson
Penguin : Linux version 2.4.20 on an i686 machine (797.90 BogoMips).
Why is the government concerned about the lunatic fringe? Think about it.

2003-03-18 14:33:43

by DervishD

Subject: Re: select() stress

Hi Richard, again :)

In my last message I told you that getdtablesize() is not
reliable for closing all file descriptors, that its return value is
not necessarily related to the file descriptor index. Well, I forgot
to say that getdtablehi() effectively returns the index for the
largest file descriptor available to the process plus one, that is,
perfect for using with 'select()' and for closing all open files:

for(i=0; i<getdtablehi(); i++) close(i);

Is this implemented under Linux? I have a piece of software that
relies on the above (now it's written using getdtablesize(), which is
incorrect, as you noted) for closing all file descriptors...
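
One possible way to approximate getdtablehi() is to probe each
descriptor number with fcntl(fd, F_GETFD), which fails with EBADF for
descriptors that are not open. This is only a sketch: the function
name highest_open_fd_plus_one() and its limit parameter are
assumptions made for illustration, and descriptors at or above the
limit are not seen.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical stand-in for getdtablehi(): probe descriptors
 * 0 .. limit-1 and return the highest open one plus one,
 * or 0 if none of them is open. */
int highest_open_fd_plus_one(int limit)
{
    int fd;
    int hi = 0;

    for (fd = 0; fd < limit; fd++) {
        /* F_GETFD returns -1 (EBADF) when fd is not an open descriptor. */
        if (fcntl(fd, F_GETFD) != -1)
            hi = fd + 1;
    }
    return hi;
}
```

A caller could pass getdtablesize() as the limit, with the caveat
that a descriptor inherited from a parent with a higher limit would
be missed.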

Thanks again for noting this, Richard :)

Raúl Núñez de Arenas Coronado

--
Linux Registered User 88736
http://www.pleyades.net & http://www.pleyades.net/~raulnac

2003-03-18 14:33:41

by DervishD

Subject: Re: select() stress

Hi Richard :)

Richard B. Johnson dixit:
> > > descriptors. You cannot assume that this number is the same
> > > as the currently open socket. Just use the socket-value. That's
> > > the file-descriptor.
> > Not at all. 'select()' takes a *number of file descriptors* as
> > its first argument, meaning the maximum number of file descriptors to
> > check (it checks only the first N file descriptors, being 'N' the
> > first argument). Usually that first argument is FD_SETSIZE, but the
> > result of any function returning a number is right if you know that
> > the return value is what you want.
> What I said has been misinterpreted. Select takes the highest
> number fd in the set you want to examine plus 1.

AFAIK, only if the first argument is 'FD_SETSIZE', but I'm not
sure of this point.

And yes, now I understand what you meant, and you're right. If
you put in the set file descriptor 'N', you *must* put in the first
argument at least N+1, or the file descriptor won't be checked.

Anyway, in the case of 'getdtablesize()', and assuming that it
returns the highest 'openable' file descriptor, it will always return
a number that is higher than any open file descriptor that the
process has (except if it's inherited from the parent and the child
has a lower file descriptor limit, but this involves tweaking with
getdtablesize()...), since the fd numbers start from zero.

> They are not the same and are not guaranteed to be related although
> on some target, they might.

That's what I didn't understand with getdtablesize(). In the man
page I can read that the function returns the size of the descriptor
table for the process, not the highest number for a file descriptor,
so you can't use it for 'select()', because you can have a socket
descriptor with value e.g. 40055 open and getdtablesize() will
return, for example, 1024. That is, you can open 1024 file
descriptors in your process, but the open call can return 40000 :?

This leads me to the following thought: I believed that the code
below was a good way of closing all open file descriptors, but if
the OS can return an arbitrary number higher than the descriptor
table size for a file descriptor, it won't work:

for (i=0; i < getdtablesize(); i++) close(i);

How can this be achieved, knowing that the return value for
getdtablesize() doesn't need to be related with fd numbers (that is,
the kernel can return any arbitrary value for a file descriptor,
given that the limit for OPEN_MAX or getdtablesize() is honored)?

Interesting issue :) Thanks, Richard.

Raúl Núñez de Arenas Coronado

--
Linux Registered User 88736
http://www.pleyades.net & http://www.pleyades.net/~raulnac

2003-03-18 14:39:41

by Sparks, Jamie

Subject: Re: select() stress

This message uses a character set that is not supported by the Internet
Service. To view the original message content, open the attached message.
If the text doesn't display correctly, save the attachment to disk, and then
open it using a viewer that can display the original character set.
<<message.txt>>


Attachments:
message.txt (1.91 kB)

2003-03-18 14:52:29

by Richard B. Johnson

Subject: Re: select() stress

On Tue, 18 Mar 2003, Sparks, Jamie wrote:

> This message uses a character set that is not supported by the Internet
> Service. To view the original message content, open the attached message.
> If the text doesn't display correctly, save the attachment to disk, and then
> open it using a viewer that can display the original character set.
> <<message.txt>>
>

Please don't use that goddam M$ mailer. I can't see what you
wrote without saving to a file, etc. Most use 'pine' or
something compatible with __text__ !


Anyway you advised to do something like:

fd = open("/", O_RDONLY);
close(fd);

fd is now supposed to contain the largest process fd + 1.
I don't think this is correct! You can open thousands
of fds, ultimately more than the max fd value. It will
eventually wrap.
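
For reference, POSIX specifies that open() returns the lowest-numbered
descriptor not currently open in the process, so the open/close probe
reports the lowest free slot rather than the largest fd + 1. A sketch
of the probe, with next_free_fd() as a hypothetical name:

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: report which descriptor the next open() would
 * hand out, or -1 if open() fails. */
int next_free_fd(void)
{
    int fd = open("/", O_RDONLY);

    if (fd >= 0)
        close(fd);
    return fd;
}
```

If descriptors a and b are opened in that order and a is then closed,
next_free_fd() should return a, which is below b even though b is
still open.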


Cheers,
Dick Johnson
Penguin : Linux version 2.4.20 on an i686 machine (797.90 BogoMips).
Why is the government concerned about the lunatic fringe? Think about it.

2003-03-18 14:56:41

by Sparks, Jamie

Subject: Re: select() stress

I'm using pine.

j

On Tue, 18 Mar 2003, Richard B. Johnson wrote:

> On Tue, 18 Mar 2003, Sparks, Jamie wrote:
>
> > This message uses a character set that is not supported by the Internet
> > Service. To view the original message content, open the attached message.
> > If the text doesn't display correctly, save the attachment to disk, and then
> > open it using a viewer that can display the original character set.
> > <<message.txt>>
> >
>
> Please don't use that goddam M$ mailer. I can't see what you
> wrote without saving to a file, etc. Most use 'pine' or
> something compatible with __text__ !
>
>
> Anyway you advised to do something like:
>
> fd = open("/", O_RDONLY);
> close(fd);
>
> fd is now supposed to contain the largest process fd + 1.
> I don't think this is correct! You can open thousands
> of fds, ultimately more than the max fd value. It will
> eventually wrap.
>
>
> Cheers,
> Dick Johnson
> Penguin : Linux version 2.4.20 on an i686 machine (797.90 BogoMips).
> Why is the government concerned about the lunatic fringe? Think about it.
>

2003-03-19 04:08:10

by Chris Friesen

Subject: Re: select() stress

Sparks, Jamie wrote:
> I'm using pine.

Your email made mozilla's mail client barf too. It complains about using an
unsupported character set although the message is actually shown as an
attachment. The headers show


Content-Type: TEXT/PLAIN; charset=X-UNKNOWN

Chris


--
Chris Friesen | MailStop: 043/33/F10
Nortel Networks | work: (613) 765-0557
3500 Carling Avenue | fax: (613) 765-2986
Nepean, ON K2H 8E9 Canada | email: [email protected]