I am using the GARMIN_GPS/usb driver to read a GPS receiver.
While testing my software's ability to recover from various errors, I
tried this: unplug the GPS/USB cable from the USB hub.
Interestingly enough, the thread spins.
select() waits for something to happen, and then reports one channel on
which something interesting happened.
Then I try to find out how many chars are in the read buffer via FIONREAD.
That call fails with an I/O error.
Needless to say, the code resets the select() parameters, and select() is
called again. It again says that something interesting has happened on
that (I/O errored) channel. And we now repeat the FIONREAD.
In this case, what will reset the "something interesting has happened"
report from the select() call? Will it ever be reset in this case?
> i am using the GARMIN_GPS/usb driver to read a gps receiver.
> In testing the ability of my software to recover from various errors, I
> try this: unplug the gps/USB cable from the usb hub.
>
> Interestingly enough the thread spins.
> the SELECT() waits for something to happen, and I get one channel that
> something interesting happened.
> Then i try to find out how many chars are in the read buff via FIONREAD.
> That call errors out with an i/o error.
>
> Needless to day, the code resets the SELECT parameters, and SELECT is
> called again. It again says that something interesting has happened on
> that ( i/o errored ) channel. And we now repeat the FIONREAD.
>
> In this case what, will reset the "something interesting has happened"
> report from the SELECT call? Will it ever be reset in this case?
Nope. An errored connection is always ready for read/write -- there is
nothing to wait for as far as the kernel is concerned. Your code keeps
asking the kernel if something interesting has happened, the kernel keeps
telling it yes, and it refuses to do anything about it.
At minimum, a detection of an error condition that could be persistent
should be followed by a delay. A detection of an error condition that has in
fact persisted and was previously detected should *never* be followed by an
immediate return to 'select'.
You need to *handle* the I/O error. Backoff might be one way, sleeping for
an increasing amount of time after each fatal error, subject to some limit.
And why are you calling FIONREAD? Just 'read' the data -- you're going to
have to eventually anyway.
DS
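For what it's worth, the advice above (just read, treat a failed read as fatal for that descriptor, and back off before retrying) might look roughly like this in C. This is only a minimal sketch, not anyone's actual code from this thread: the names `wait_and_read` and `next_backoff_ms`, the `step_result` enum, and the 100 ms starting delay are all assumptions made for illustration.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/types.h>
#include <unistd.h>

/* Outcome of one wait-and-read step (illustrative names). */
enum step_result { STEP_DATA, STEP_TIMEOUT, STEP_FAILED };

/* Wait up to timeout_ms for fd to be readable, then read it directly
 * instead of asking FIONREAD first. */
static enum step_result wait_and_read(int fd, void *buf, size_t cap,
                                      int timeout_ms, ssize_t *nread)
{
    fd_set rfds;
    struct timeval tv;
    int ready;

    *nread = 0;
    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    ready = select(fd + 1, &rfds, NULL, NULL, &tv);
    if (ready < 0)
        return errno == EINTR ? STEP_TIMEOUT : STEP_FAILED;
    if (ready == 0)
        return STEP_TIMEOUT;

    *nread = read(fd, buf, cap);
    if (*nread > 0)
        return STEP_DATA;
    if (*nread < 0 && (errno == EINTR || errno == EAGAIN))
        return STEP_TIMEOUT;   /* nothing to read after all; try again later */
    return STEP_FAILED;        /* I/O error or end of file: stop selecting on fd */
}

/* Doubling backoff with an upper limit; 100 ms start is an arbitrary choice. */
static unsigned next_backoff_ms(unsigned current_ms, unsigned limit_ms)
{
    if (current_ms == 0)
        return limit_ms < 100 ? limit_ms : 100;
    if (current_ms >= limit_ms / 2)
        return limit_ms;
    return current_ms * 2;
}
```

On `STEP_FAILED` the caller would close the descriptor, sleep for `next_backoff_ms(...)`, and only then try to reopen the device, rather than handing the same errored descriptor straight back to select().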
David Schwartz wrote:
> Nope. An errored connection is always ready for read/write -- there is
> nothing to wait for as far as the kernel is concerned. Your code keeps
> asking the kernel if something interesting has happened, the kernel keeps
> telling it yes, and it refuses to do anything about it.
>
The select() returns because I pulled the USB cable from the hub. Seems
reasonable.
What did the next select() find interesting enough to end the wait
prematurely? As far as I can tell, nothing interesting has
happened since the previous select(). In this case the select() is only
watching for reads.
Uncle George wrote:
> David Schwartz wrote:
>
>> Nope. An errored connection is always ready for read/write -- there is
>> nothing to wait for as far as the kernel is concerned. Your code keeps
>> asking the kernel if something interesting has happened, the kernel keeps
>> telling it yes, and it refuses to do anything about it.
>>
> The select() returns because i pulled the USB cable from hub. Seems
> reasonable.
>
> The next select() found what? to be interesting in order to prematurely
> terminate the select-wait? As far as I can tell, nothing interesting has
> happened since the previous select(). In this case the select() is only
> looking at read()'s.
It's because you haven't done anything to handle the error which is
still persisting. Likely the only thing sane you can do in this case is
close the fd and try to reopen it later.
--
Robert Hancock Saskatoon, SK, Canada
Home Page: http://www.roberthancock.com/
David Schwartz wrote:
>> In this case what, will reset the "something interesting has happened"
>> report from the SELECT call? Will it ever be reset in this case?
>
> Nope. An errored connection is always ready for read/write -- there is
> nothing to wait for as far as the kernel is concerned. Your code keeps
> asking the kernel if something interesting has happened, the kernel keeps
> telling it yes, and it refuses to do anything about it.
Actually, it's somewhat of a misreading.
The first sentence of the man page for select suggests that something
interesting has happened in between select() calls.
Later on it states "more precisely, to see if a read will not block",
which is still different from "if characters become available for reading".
The "precisely" wording fits my issue; the other wording does not.
Uncle George wrote:
> i am using the GARMIN_GPS/usb driver to read a gps receiver.
> In testing the ability of my software to recover from various errors, I
> try this: unplug the gps/USB cable from the usb hub.
>
> Interestingly enough the thread spins.
> the SELECT() waits for something to happen, and I get one channel that
> something interesting happened.
Or, more precisely, that you won't have to wait for an event.
> Needless to day, the code resets the SELECT parameters, and SELECT is
> called again. It again says that something interesting has happened on
> that ( i/o errored ) channel. And we now repeat the FIONREAD.
That's exactly what select promised you'd get.
> David Schwartz wrote:
> > Nope. An errored connection is always ready for read/write -- there is
> > nothing to wait for as far as the kernel is concerned. Your code keeps
> > asking the kernel if something interesting has happened, the
> > kernel keeps
> > telling it yes, and it refuses to do anything about it.
> The select() returns because i pulled the USB cable from hub. Seems
> reasonable.
Good. Then there is nothing further to discuss.
> The next select() found what? to be interesting in order to prematurely
> terminate the select-wait? As far as I can tell, nothing interesting has
> happened since the previous select(). In this case the select() is only
> looking at read()'s.
You have a very serious misunderstanding of what 'select' does. The 'select'
function is level triggered and state based, not edge triggered or event
based. The situation was the same as before, and so the same result is
required.
The kernel assumes that either you handled the error condition or you aren't
going to handle the error condition. In either case, the correct thing to do
is to again inform you of the error.
Suppose the first 'select' comes from code that is just curious how many
sockets are ready but has no intention of handling the events. Not reporting
the error on the next call to 'select' would be disastrous.
DS
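The level-triggered behaviour described above is easy to see with a pipe instead of a GPS. The following is a small sketch under my own assumptions (the helper names `is_readable_now` and `ready_reports_without_reading` are made up for this example): it queues one byte, then asks select() three times without ever reading. Because the state has not changed between calls, every call should report the descriptor as readable.

```c
#include <stddef.h>
#include <sys/select.h>
#include <unistd.h>

/* Poll with a zero timeout: 1 if fd is reported readable, 0 if not, -1 on error. */
static int is_readable_now(int fd)
{
    fd_set rfds;
    struct timeval zero = {0, 0};
    int n;

    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    n = select(fd + 1, &rfds, NULL, NULL, &zero);
    if (n < 0)
        return -1;
    return FD_ISSET(fd, &rfds) ? 1 : 0;
}

/* Queue one byte in a pipe, then call select() three times without reading.
 * Returns how many of those calls reported "readable", or -1 on setup failure. */
static int ready_reports_without_reading(void)
{
    int p[2];
    int hits = 0;

    if (pipe(p) != 0)
        return -1;
    if (write(p[1], "x", 1) != 1) {
        close(p[0]);
        close(p[1]);
        return -1;
    }
    for (int i = 0; i < 3; i++) {
        if (is_readable_now(p[0]) == 1)
            hits++;
    }
    close(p[0]);
    close(p[1]);
    return hits;
}
```

The report only goes away once something changes the state, for example by actually reading the byte. An errored descriptor behaves the same way: until it is closed, the condition is still there to be reported.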
David Schwartz wrote:
>> David Schwartz wrote:
>
>>> Nope. An errored connection is always ready for read/write -- there is
>>> nothing to wait for as far as the kernel is concerned. Your code keeps
>>> asking the kernel if something interesting has happened, the
>>> kernel keeps
>>> telling it yes, and it refuses to do anything about it.
>
>> The select() returns because i pulled the USB cable from hub. Seems
>> reasonable.
>
> Good. Then there is nothing further to discuss.
>
>> The next select() found what? to be interesting in order to prematurely
>> terminate the select-wait? As far as I can tell, nothing interesting has
>> happened since the previous select(). In this case the select() is only
>> looking at read()'s.
>
> You have a very serious misunderstanding of what 'select' does. The 'select'
> function is level triggered and state based, not edge triggered or event
> based. The situation was the same as before, and so the same result is
> required.
The misunderstanding comes from the docs.
select() does not report device errors.
select() will just, "more precisely, ... see if a read will not block".
> The misunderstanding is from the docs.
> The select() does not report device errors.
> Select will just "more precisely, to see if a read will not block".
This is a much slighter misunderstanding. The result of the 'select'
function tells you nothing about what a particular 'read' will or will not
do. It's just a status reporting function. Saying that 'select' tells you 'a
read will not block' is as misleading as saying 'access' tells you if an
'open' will succeed.
All status reporting functions report status. You can phrase this about what
a hypothetical concurrent operation *would* *have* done. But it is
misleading to phrase it as what an actual future operation *will* do. The
kernel does not predict the future.
DS
David Schwartz wrote:
>> The misunderstanding is from the docs.
>> The select() does not report device errors.
>> Select will just "more precisely, to see if a read will not block".
>
> This is a much slighter misunderstanding. The result of the 'select'
> function tells you nothing about what a particular 'read' will or will not
> do.
The docs 'precisely' say that it does. I'm sorry if you cannot trouble
yourself to read the man pages to address your issues with the correct
functionality of the select call.
Maybe you can address your issues and concerns with the documentation
folks.
Thanks again.
> David Schwartz wrote:
> >> The misunderstanding is from the docs.
> >> The select() does not report device errors.
> >> Select will just "more precisely, to see if a read will not block".
> > This is a much slighter misunderstanding. The result of the 'select'
> > function tells you nothing about what a particular 'read' will
> > or will not do.
> The docs 'precisely' says that it does. I'm sorry if you cannot trouble
> yourself to read the man pages to address your issues with the correct
> functionality of the select call.
No, that's not what the docs say. That's what the docs are frequently
misunderstood to say, but that is not what they actually say. When they say,
for example, "see if a read will not block", they mean a hypothetical
concurrent read that does not take place, not a future actual read.
It's bad wording, but it is not incorrect unless you choose to misunderstand
it. For example, one could write that the 'access' function tells you if an
"'open' will succeed", but that doesn't mean an 'open' after an 'access'
*must* succeed.
Like all status-reporting functions, 'select' tells you information that is
accurate at some point in-between when you called it and when it returned
that value to you. But it makes no predictions about the future, and the
documentation does not say that it does. It does imply that it does, and
that's why it's important to correct this misconception whenever it comes
up.
POSIX's documentation is a bit clearer, saying "would block" rather than
"will block". This helps to make it clear that we're talking about a
hypothetical concurrent operation, not an actual future one.
DS
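One practical consequence of "would block" versus "will block": if the program must never stall in read(), a common approach is to put the descriptor in non-blocking mode and treat EAGAIN as "the data was gone by the time I got there". A minimal sketch, assuming helper names of my own (`set_nonblocking`, `read_after_ready`) and an arbitrary -2 return code for the would-block case:

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Put fd into non-blocking mode; returns 0 on success, -1 on failure. */
static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* Read after select() reported readiness on a non-blocking fd.
 * Returns bytes read, 0 on end of file, -1 on a real error, or
 * -2 if the read would have blocked (the earlier report is stale). */
static ssize_t read_after_ready(int fd, void *buf, size_t cap)
{
    ssize_t n;

    do {
        n = read(fd, buf, cap);
    } while (n < 0 && errno == EINTR);

    if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return -2;
    return n;
}
```

With this arrangement the readiness report is used only as a hint about when to try, and the read itself is what tells the program the truth.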
Robert Hancock wrote:
> It's because you haven't done anything to handle the error which is
> still persisting. Likely the only thing sane you can do in this case is
> close the fd and try to reopen it later.
>
This seems to be true, but not for the reason you might think.
It appears that if you plug the USB/serial device back into the USB hub,
the code creates a /dev/ttyUSB1 (if you have not yet closed the
disconnected /dev/ttyUSB0). When you do close /dev/ttyUSB0, that
device is removed from the /dev directory.
Now /dev/ttyUSB1 is the device, and /dev/ttyUSB0 has disappeared. This does
not seem proper, since the program now has no way of knowing which device
to re-open for the GPS.
I have been informed that this was an approved kernel feature. Is this
supposed to happen? Or was it an unintended consequence?
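One possible workaround on the program's side, offered only as a sketch: since the node name can change after a replug, the reopen step could walk a list of candidate paths rather than a single hard-coded one. The function name `open_first_available` and the example paths are hypothetical. Note that blindly scanning ttyUSB names can pick up some other USB serial device, so a stable name (for instance a symlink under /dev/serial/by-id on systems where udev provides one) is likely the safer candidate to list first.

```c
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Try each candidate path in order and return the first descriptor that
 * opens, or -1 if none does. If which is non-NULL, it receives the index
 * of the path that succeeded. */
static int open_first_available(const char *const paths[], size_t count,
                                size_t *which)
{
    for (size_t i = 0; i < count; i++) {
        /* O_NOCTTY: do not let the tty become our controlling terminal.
         * O_NONBLOCK: do not hang in open() waiting for carrier. */
        int fd = open(paths[i], O_RDWR | O_NOCTTY | O_NONBLOCK);
        if (fd >= 0) {
            if (which != NULL)
                *which = i;
            return fd;
        }
    }
    return -1;
}
```

A caller might pass something like `{"/dev/ttyUSB0", "/dev/ttyUSB1"}` (illustrative paths only), and should close the old, disconnected descriptor before retrying so that the stale node can go away.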