2014-11-19 09:41:57

by Dexuan Cui

Subject: [PATCH] tools: hv: ignore ENOBUFS in the KVP daemon

Under high memory pressure and very high KVP R/W test pressure, the netlink
recvfrom() may transiently return ENOBUFS to the daemon -- we found this
during a 2-week stress test.

We'd better not terminate the daemon on this failure, because a typical KVP
user can re-try the R/W and hopefully it will succeed next time.

Cc: K. Y. Srinivasan <[email protected]>
Signed-off-by: Dexuan Cui <[email protected]>
---
tools/hv/hv_kvp_daemon.c | 7 +++++++
1 file changed, 7 insertions(+)

diff --git a/tools/hv/hv_kvp_daemon.c b/tools/hv/hv_kvp_daemon.c
index 22b0764..9f4b303 100644
--- a/tools/hv/hv_kvp_daemon.c
+++ b/tools/hv/hv_kvp_daemon.c
@@ -1559,8 +1559,15 @@ int main(int argc, char *argv[])
 				addr_p, &addr_l);

 		if (len < 0) {
+			int saved_errno = errno;
 			syslog(LOG_ERR, "recvfrom failed; pid:%u error:%d %s",
 					addr.nl_pid, errno, strerror(errno));
+
+			if (saved_errno == ENOBUFS) {
+				syslog(LOG_ERR, "error = ENOBUFS: ignored");
+				continue;
+			}
+
 			close(fd);
 			return -1;
 		}
--
1.9.1
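
Editorial note: the patch copies errno into saved_errno before calling
syslog(), presumably because a later library call may overwrite errno.
A minimal sketch of that pattern outside the daemon follows. The helper
report_failure() is hypothetical and not part of hv_kvp_daemon.c, and the
explicit "errno = 0" only stands in for a library call that clobbers errno.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not part of hv_kvp_daemon.c: report a failed call
 * and return the errno value that was current on entry.
 */
static int report_failure(const char *what)
{
	int saved_errno = errno;	/* copy first: later calls may change errno */

	fprintf(stderr, "%s failed; error:%d %s\n",
		what, saved_errno, strerror(saved_errno));

	errno = 0;	/* stand-in for a library call that overwrites errno */
	return saved_errno;
}
```

The caller then decides what to do based on the returned value rather than
on errno itself, which may no longer describe the original failure.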


2014-11-19 10:50:14

by Vitaly Kuznetsov

Subject: Re: [PATCH] tools: hv: ignore ENOBUFS in the KVP daemon

Dexuan Cui <[email protected]> writes:

> Under high memory pressure and very high KVP R/W test pressure, the netlink
> recvfrom() may transiently return ENOBUFS to the daemon -- we found this
> during a 2-week stress test.
>
> We'd better not terminate the daemon on this failure, because a typical KVP
> user can re-try the R/W and hopefully it will succeed next time.
>
> Cc: K. Y. Srinivasan <[email protected]>
> Signed-off-by: Dexuan Cui <[email protected]>
> ---
> tools/hv/hv_kvp_daemon.c | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/tools/hv/hv_kvp_daemon.c b/tools/hv/hv_kvp_daemon.c
> index 22b0764..9f4b303 100644
> --- a/tools/hv/hv_kvp_daemon.c
> +++ b/tools/hv/hv_kvp_daemon.c
> @@ -1559,8 +1559,15 @@ int main(int argc, char *argv[])
> addr_p, &addr_l);
>
> if (len < 0) {
> + int saved_errno = errno;
> syslog(LOG_ERR, "recvfrom failed; pid:%u error:%d %s",
> addr.nl_pid, errno, strerror(errno));
> +
> + if (saved_errno == ENOBUFS) {

Is it possible to hit EAGAIN (or EWOULDBLOCK) here as well? If so, I'd
suggest we ignore those too. Ignoring ENOMEM here is more doubtful, I
think, but possible.

> + syslog(LOG_ERR, "error = ENOBUFS: ignored");
> + continue;
> + }
> +
> close(fd);
> return -1;
> }

--
Vitaly

2014-11-19 12:31:41

by Dexuan Cui

Subject: RE: [PATCH] tools: hv: ignore ENOBUFS in the KVP daemon

> -----Original Message-----
> From: Vitaly Kuznetsov
> Sent: Wednesday, November 19, 2014 18:50 PM
> To: Dexuan Cui
> Cc: [email protected]; [email protected]; driverdev-
> [email protected]; [email protected]; [email protected];
> [email protected]; Haiyang Zhang
> Subject: Re: [PATCH] tools: hv: ignore ENOBUFS in the KVP daemon
>
> Dexuan Cui writes:
>
> > Under high memory pressure and very high KVP R/W test pressure, the
> netlink
> > recvfrom() may transiently return ENOBUFS to the daemon -- we found
> this
> > during a 2-week stress test.
> >
> > We'd better not terminate the daemon on this failure, because a typical
> KVP
> > user can re-try the R/W and hopefully it will succeed next time.
> >
> > diff --git a/tools/hv/hv_kvp_daemon.c b/tools/hv/hv_kvp_daemon.c
> > index 22b0764..9f4b303 100644
> > --- a/tools/hv/hv_kvp_daemon.c
> > +++ b/tools/hv/hv_kvp_daemon.c
> > @@ -1559,8 +1559,15 @@ int main(int argc, char *argv[])
> > addr_p, &addr_l);
> >
> > if (len < 0) {
> > + int saved_errno = errno;
> > syslog(LOG_ERR, "recvfrom failed; pid:%u
> error:%d %s",
> > addr.nl_pid, errno, strerror(errno));
> > +
> > + if (saved_errno == ENOBUFS) {
>
> is it possible to meet EAGAIN (or EWOULDBLOCK) here as well? I'd suggest
> we ignore these as well in such case. Ignoring ENOMEM here is doubtful,
> I think. But possible.
>
> Vitaly

I don't think EAGAIN is possible because "man recvfrom" says
"If no messages are available at the socket, the receive calls wait for a
message to arrive, unless the socket is nonblocking (see fcntl(2)), in which
case the value -1 is returned and the external variable errno is set to
EAGAIN or EWOULDBLOCK".

The same man page mentions ENOMEM for recvmsg(), but not recvfrom().

-- Dexuan
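
Editorial note: the argument above depends on the socket being in blocking
mode. A minimal sketch of how that could be checked at run time, assuming a
hypothetical helper fd_is_nonblocking() that is not part of the daemon.
Note that recv(2) also documents EAGAIN/EWOULDBLOCK for a blocking socket
when a receive timeout has been set and expires, so this check covers only
the O_NONBLOCK case.

```c
#include <fcntl.h>

/*
 * Hypothetical helper: returns 1 if fd has O_NONBLOCK set, 0 if it is
 * blocking, and -1 if fcntl() fails (errno is then set by fcntl()).
 */
static int fd_is_nonblocking(int fd)
{
	int flags = fcntl(fd, F_GETFL);

	if (flags < 0)
		return -1;
	return (flags & O_NONBLOCK) ? 1 : 0;
}
```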

2014-11-19 12:41:01

by Vitaly Kuznetsov

Subject: Re: [PATCH] tools: hv: ignore ENOBUFS in the KVP daemon

Dexuan Cui <[email protected]> writes:

>> -----Original Message-----
>> From: Vitaly Kuznetsov
>> Sent: Wednesday, November 19, 2014 18:50 PM
>> To: Dexuan Cui
>> Cc: [email protected]; [email protected]; driverdev-
>> [email protected]; [email protected]; [email protected];
>> [email protected]; Haiyang Zhang
>> Subject: Re: [PATCH] tools: hv: ignore ENOBUFS in the KVP daemon
>>
>> Dexuan Cui writes:
>>
>> > Under high memory pressure and very high KVP R/W test pressure, the netlink
>> > recvfrom() may transiently return ENOBUFS to the daemon -- we found this
>> > during a 2-week stress test.
>> >
>> > We'd better not terminate the daemon on this failure, because a typical KVP
>> > user can re-try the R/W and hopefully it will succeed next time.
>> >
>> > diff --git a/tools/hv/hv_kvp_daemon.c b/tools/hv/hv_kvp_daemon.c
>> > index 22b0764..9f4b303 100644
>> > --- a/tools/hv/hv_kvp_daemon.c
>> > +++ b/tools/hv/hv_kvp_daemon.c
>> > @@ -1559,8 +1559,15 @@ int main(int argc, char *argv[])
>> > addr_p, &addr_l);
>> >
>> > if (len < 0) {
>> > + int saved_errno = errno;
>> > syslog(LOG_ERR, "recvfrom failed; pid:%u error:%d %s",
>> > addr.nl_pid, errno, strerror(errno));
>> > +
>> > + if (saved_errno == ENOBUFS) {
>>
>> is it possible to meet EAGAIN (or EWOULDBLOCK) here as well? I'd suggest
>> we ignore these as well in such case. Ignoring ENOMEM here is doubtful,
>> I think. But possible.
>>
>> Vitaly
>
> I don't think EAGAIN is possible because "man recvfrom" says
> "If no messages are available at the socket, the receive calls wait for a
> message to arrive, unless the socket is nonblocking (see fcntl(2)), in which
> case the value -1 is returned and the external variable errno is set to
> EAGAIN or EWOULDBLOCK".
>
> The same man page mentions ENOMEM for recvmsg(), but not recvfrom().

Ah, sorry, I thought your patch touched the other place: the call to
netlink_send(), which does sendmsg() (my EAGAIN/EWOULDBLOCK/ENOMEM
comment was about that). It could also make sense to patch both, as I
think it is possible to hit these errors there as well.

>
> -- Dexuan

--
Vitaly

2014-11-19 13:05:31

by Vitaly Kuznetsov

Subject: Re: [PATCH] tools: hv: ignore ENOBUFS in the KVP daemon

Dexuan Cui <[email protected]> writes:

>> -----Original Message-----
>> From: Vitaly Kuznetsov [mailto:[email protected]]
>> Sent: Wednesday, November 19, 2014 20:41 PM
>> To: Dexuan Cui
>> Cc: [email protected]; [email protected]; driverdev-
>> [email protected]; [email protected]; [email protected];
>> [email protected]; Haiyang Zhang
>> Subject: Re: [PATCH] tools: hv: ignore ENOBUFS in the KVP daemon
>>
>> Dexuan Cui <[email protected]> writes:
>>
>> >> -----Original Message-----
>> >> From: Vitaly Kuznetsov
>> >> Sent: Wednesday, November 19, 2014 18:50 PM
>> >> To: Dexuan Cui
>> >> Cc: [email protected]; [email protected];
>> driverdev-
>> >> [email protected]; [email protected]; [email protected];
>> >> [email protected]; Haiyang Zhang
>> >> Subject: Re: [PATCH] tools: hv: ignore ENOBUFS in the KVP daemon
>> >>
>> >> Dexuan Cui writes:
>> >>
>> >> > Under high memory pressure and very high KVP R/W test pressure,
>> the netlink
>> >> > recvfrom() may transiently return ENOBUFS to the daemon -- we found
>> this
>> >> > during a 2-week stress test.
>> >> >
>> >> > We'd better not terminate the daemon on this failure, because a
>> typical KVP
>> >> > user can re-try the R/W and hopefully it will succeed next time.
>> >> >
>> >> > diff --git a/tools/hv/hv_kvp_daemon.c b/tools/hv/hv_kvp_daemon.c
>> >> > index 22b0764..9f4b303 100644
>> >> > --- a/tools/hv/hv_kvp_daemon.c
>> >> > +++ b/tools/hv/hv_kvp_daemon.c
>> >> > @@ -1559,8 +1559,15 @@ int main(int argc, char *argv[])
>> >> > addr_p, &addr_l);
>> >> >
>> >> > if (len < 0) {
>> >> > + int saved_errno = errno;
>> >> > syslog(LOG_ERR, "recvfrom failed; pid:%u
>> error:%d %s",
>> >> > addr.nl_pid, errno, strerror(errno));
>> >> > +
>> >> > + if (saved_errno == ENOBUFS) {
>> >>
>> >> is it possible to meet EAGAIN (or EWOULDBLOCK) here as well? I'd
>> suggest
>> >> we ignore these as well in such case. Ignoring ENOMEM here is doubtful,
>> >> I think. But possible.
>> >>
>> >> Vitaly
>> >
>> > I don't think EAGAIN is possible because "man recvfrom" says
>> > "If no messages are available at the socket, the receive calls wait for a
>> > message to arrive, unless the socket is nonblocking (see fcntl(2)), in
>> which
>> > case the value -1 is returned and the external variable errno is set to
>> > EAGAIN or EWOULDBLOCK".
>> >
>> > The same man page mentions ENOMEM for recvmsg(), but not recvfrom().
>>
>> Ah, sorry, I thought your patch patches the other place: call to
>> netlink_send() which does sendmsg() (and my
>> EAGAIN/EWOULDBLOCK/ENOMEM
>> comment was about it). It could also make sense to patch them both as I
>> think it is possible to hit these as well.
>>
>> > -- Dexuan
>> --
>> Vitaly
>
> OK, I can add this new check:
> (I'll send out the v2 tomorrow in case people have new comments)
>

Thanks!

> --- a/tools/hv/hv_kvp_daemon.c
> +++ b/tools/hv/hv_kvp_daemon.c
> @@ -1770,8 +1770,15 @@ kvp_done:
>
> len = netlink_send(fd, incoming_cn_msg);
> if (len < 0) {
> + int saved_errno = errno;
> syslog(LOG_ERR, "net_link send failed; error: %d %s", errno,
> strerror(errno));
> +
> + if (saved_errno == ENOMEM || saved_errno == EAGAIN) {

Sorry for being pushy, but it seems ENOBUFS is also possible here (at
least man sendmsg mentions it).

> + syslog(LOG_ERR, "send error: ignored");
> + continue;
> + }
> +
> exit(EXIT_FAILURE);
> }
> }
>
> Thanks,
> -- Dexuan

--
Vitaly

2014-11-19 13:12:38

by Dexuan Cui

Subject: RE: [PATCH] tools: hv: ignore ENOBUFS in the KVP daemon

> -----Original Message-----
> From: Vitaly Kuznetsov [mailto:[email protected]]
> Sent: Wednesday, November 19, 2014 20:41 PM
> To: Dexuan Cui
> Cc: [email protected]; [email protected]; driverdev-
> [email protected]; [email protected]; [email protected];
> [email protected]; Haiyang Zhang
> Subject: Re: [PATCH] tools: hv: ignore ENOBUFS in the KVP daemon
>
> Dexuan Cui <[email protected]> writes:
>
> >> -----Original Message-----
> >> From: Vitaly Kuznetsov
> >> Sent: Wednesday, November 19, 2014 18:50 PM
> >> To: Dexuan Cui
> >> Cc: [email protected]; [email protected];
> driverdev-
> >> [email protected]; [email protected]; [email protected];
> >> [email protected]; Haiyang Zhang
> >> Subject: Re: [PATCH] tools: hv: ignore ENOBUFS in the KVP daemon
> >>
> >> Dexuan Cui writes:
> >>
> >> > Under high memory pressure and very high KVP R/W test pressure,
> the netlink
> >> > recvfrom() may transiently return ENOBUFS to the daemon -- we found
> this
> >> > during a 2-week stress test.
> >> >
> >> > We'd better not terminate the daemon on this failure, because a
> typical KVP
> >> > user can re-try the R/W and hopefully it will succeed next time.
> >> >
> >> > diff --git a/tools/hv/hv_kvp_daemon.c b/tools/hv/hv_kvp_daemon.c
> >> > index 22b0764..9f4b303 100644
> >> > --- a/tools/hv/hv_kvp_daemon.c
> >> > +++ b/tools/hv/hv_kvp_daemon.c
> >> > @@ -1559,8 +1559,15 @@ int main(int argc, char *argv[])
> >> > addr_p, &addr_l);
> >> >
> >> > if (len < 0) {
> >> > + int saved_errno = errno;
> >> > syslog(LOG_ERR, "recvfrom failed; pid:%u
> error:%d %s",
> >> > addr.nl_pid, errno, strerror(errno));
> >> > +
> >> > + if (saved_errno == ENOBUFS) {
> >>
> >> is it possible to meet EAGAIN (or EWOULDBLOCK) here as well? I'd
> suggest
> >> we ignore these as well in such case. Ignoring ENOMEM here is doubtful,
> >> I think. But possible.
> >>
> >> Vitaly
> >
> > I don't think EAGAIN is possible because "man recvfrom" says
> > "If no messages are available at the socket, the receive calls wait for a
> > message to arrive, unless the socket is nonblocking (see fcntl(2)), in
> which
> > case the value -1 is returned and the external variable errno is set to
> > EAGAIN or EWOULDBLOCK".
> >
> > The same man page mentions ENOMEM for recvmsg(), but not recvfrom().
>
> Ah, sorry, I thought your patch patches the other place: call to
> netlink_send() which does sendmsg() (and my
> EAGAIN/EWOULDBLOCK/ENOMEM
> comment was about it). It could also make sense to patch them both as I
> think it is possible to hit these as well.
>
> > -- Dexuan
> --
> Vitaly

OK, I can add this new check:
(I'll send out the v2 tomorrow in case people have new comments)

--- a/tools/hv/hv_kvp_daemon.c
+++ b/tools/hv/hv_kvp_daemon.c
@@ -1770,8 +1770,15 @@ kvp_done:

 		len = netlink_send(fd, incoming_cn_msg);
 		if (len < 0) {
+			int saved_errno = errno;
 			syslog(LOG_ERR, "net_link send failed; error: %d %s", errno,
 					strerror(errno));
+
+			if (saved_errno == ENOMEM || saved_errno == EAGAIN) {
+				syslog(LOG_ERR, "send error: ignored");
+				continue;
+			}
+
 			exit(EXIT_FAILURE);
 		}
 	}

Thanks,
-- Dexuan

2014-11-19 13:14:44

by Dexuan Cui

Subject: RE: [PATCH] tools: hv: ignore ENOBUFS in the KVP daemon

> -----Original Message-----
> From: Vitaly Kuznetsov
> >> --
> >> Vitaly
> >
> > OK, I can add this new check:
> > (I'll send out the v2 tomorrow in case people have new comments)
> >
>
> Thanks!
>
> > --- a/tools/hv/hv_kvp_daemon.c
> > +++ b/tools/hv/hv_kvp_daemon.c
> > @@ -1770,8 +1770,15 @@ kvp_done:
> >
> > len = netlink_send(fd, incoming_cn_msg);
> > if (len < 0) {
> > + int saved_errno = errno;
> > syslog(LOG_ERR, "net_link send failed; error: %d %s", errno,
> > strerror(errno));
> > +
> > + if (saved_errno == ENOMEM || saved_errno == EAGAIN) {
>
> Sorry for being pushy, but it seems ENOBUFS is also possible here (at
> least man sendmsg mentions it).
OK, I'll add this too. :-)

BTW, I realized sendmsg() can't return EAGAIN here, as that only applies to a
non-blocking socket.

Here I simply ignore the error, hoping the other end will re-try.

>
> > + syslog(LOG_ERR, "send error: ignored");
> > + continue;
> > + }
> > +
> > exit(EXIT_FAILURE);
> > }
> > }
> >
> > Thanks,
> > -- Dexuan
>
> Vitaly

-- Dexuan

2014-11-19 14:40:04

by Vitaly Kuznetsov

Subject: Re: [PATCH] tools: hv: ignore ENOBUFS in the KVP daemon

Dexuan Cui <[email protected]> writes:

>> -----Original Message-----
>> From: Vitaly Kuznetsov
>> >> --
>> >> Vitaly
>> >
>> > OK, I can add this new check:
>> > (I'll send out the v2 tomorrow in case people have new comments)
>> >
>>
>> Thanks!
>>
>> > --- a/tools/hv/hv_kvp_daemon.c
>> > +++ b/tools/hv/hv_kvp_daemon.c
>> > @@ -1770,8 +1770,15 @@ kvp_done:
>> >
>> > len = netlink_send(fd, incoming_cn_msg);
>> > if (len < 0) {
>> > + int saved_errno = errno;
>> > syslog(LOG_ERR, "net_link send failed; error: %d %s", errno,
>> > strerror(errno));
>> > +
>> > + if (saved_errno == ENOMEM || saved_errno == EAGAIN) {
>>
>> Sorry for being pushy, but it seems ENOBUFS is also possible here (at
>> least man sendmsg mentions it).
> OK, I'll add this too. :-)
>
> BTW, I realized sendmsg() can't return EAGAIN here as that's for non-blocking
> socket.
>
> Here I simply ignore the error, hoping the other end will re-try.
>

I agree: it's sufficient to ignore ENOBUFS on the receive path and both
ENOMEM and ENOBUFS on the send path.

Thanks!

>>
>> > + syslog(LOG_ERR, "send error: ignored");
>> > + continue;
>> > + }
>> > +
>> > exit(EXIT_FAILURE);
>> > }
>> > }
>> >
>> > Thanks,
>> > -- Dexuan
>>
>> Vitaly
>
> -- Dexuan

--
Vitaly
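
Editorial note: the outcome agreed above can be summarized as two small
predicates, one per path. This is only a sketch of the conclusion reached in
the thread, not the v2 patch itself; the function names are hypothetical.

```c
#include <errno.h>

/*
 * Hypothetical names; the thread only agrees on which errno values the
 * daemon should log and ignore instead of exiting.
 */

/* Receive path (recvfrom): ignore ENOBUFS only. */
static int recv_error_is_transient(int err)
{
	return err == ENOBUFS;
}

/* Send path (netlink_send -> sendmsg): ignore ENOMEM and ENOBUFS. */
static int send_error_is_transient(int err)
{
	return err == ENOMEM || err == ENOBUFS;
}
```

In the daemon's main loop, a true result would correspond to logging the
error and executing "continue", while any other error keeps the existing
fatal handling.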