From: Nicholas Bellinger <[email protected]>
This patch changes iscsit_do_[tx,rx]_data() to fail on short
write/reads when kernel_[send,recv]msg() returns a value different
than the requested transfer length, returning -EPIPE and thus
causing a connection reset to occur.
This avoids a potential bug in the original code where a short
write/read would result in kernel_[send,recv]msg() being called
again with the original iovec base + length.
In practice this has not been an issue because iscsit_do_tx_data()
is only used for transferring 48 byte headers + 4 byte digests,
along with seldom used control payloads from NOPIN + TEXT_RSP +
REJECT with less than 32k of data. Nor has it been occuring with
iscsit_do_rx_data() because MSG_WAITALL won't return to caller
until the requested transfer length is reached, or an error has
occured.
So following Al's audit of iovec consumers, go ahead and fail
the connection on short writes/reads for now, and remove the
bogus logic.
Reported-by: Al Viro <[email protected]>
Cc: David S. Miller <[email protected]>
Signed-off-by: Nicholas Bellinger <[email protected]>
---
drivers/target/iscsi/iscsi_target_util.c | 48 +++++++++++++-------------------
1 file changed, 20 insertions(+), 28 deletions(-)
diff --git a/drivers/target/iscsi/iscsi_target_util.c b/drivers/target/iscsi/iscsi_target_util.c
index ce87ce9..9aa24f1 100644
--- a/drivers/target/iscsi/iscsi_target_util.c
+++ b/drivers/target/iscsi/iscsi_target_util.c
@@ -1326,7 +1326,7 @@ static int iscsit_do_rx_data(
struct iscsi_conn *conn,
struct iscsi_data_count *count)
{
- int data = count->data_length, rx_loop = 0, total_rx = 0, iov_len;
+ int ret, iov_len;
struct kvec *iov_p;
struct msghdr msg;
@@ -1338,35 +1338,31 @@ static int iscsit_do_rx_data(
iov_p = count->iov;
iov_len = count->iov_count;
- while (total_rx < data) {
- rx_loop = kernel_recvmsg(conn->sock, &msg, iov_p, iov_len,
- (data - total_rx), MSG_WAITALL);
- if (rx_loop <= 0) {
- pr_debug("rx_loop: %d total_rx: %d\n",
- rx_loop, total_rx);
- return rx_loop;
- }
- total_rx += rx_loop;
- pr_debug("rx_loop: %d, total_rx: %d, data: %d\n",
- rx_loop, total_rx, data);
+ ret = kernel_recvmsg(conn->sock, &msg, iov_p, iov_len,
+ count->data_length, MSG_WAITALL);
+ if (ret != count->data_length) {
+ pr_err("Unexpected ret: %d recv data: %d\n",
+ ret, count->data_length);
+ return -EPIPE;
}
+ pr_debug("ret: %d, recv data: %d\n", ret, count->data_length);
- return total_rx;
+ return ret;
}
static int iscsit_do_tx_data(
struct iscsi_conn *conn,
struct iscsi_data_count *count)
{
- int data = count->data_length, total_tx = 0, tx_loop = 0, iov_len;
+ int ret, iov_len;
struct kvec *iov_p;
struct msghdr msg;
if (!conn || !conn->sock || !conn->conn_ops)
return -1;
- if (data <= 0) {
- pr_err("Data length is: %d\n", data);
+ if (count->data_length <= 0) {
+ pr_err("Data length is: %d\n", count->data_length);
return -1;
}
@@ -1375,20 +1371,16 @@ static int iscsit_do_tx_data(
iov_p = count->iov;
iov_len = count->iov_count;
- while (total_tx < data) {
- tx_loop = kernel_sendmsg(conn->sock, &msg, iov_p, iov_len,
- (data - total_tx));
- if (tx_loop <= 0) {
- pr_debug("tx_loop: %d total_tx %d\n",
- tx_loop, total_tx);
- return tx_loop;
- }
- total_tx += tx_loop;
- pr_debug("tx_loop: %d, total_tx: %d, data: %d\n",
- tx_loop, total_tx, data);
+ ret = kernel_sendmsg(conn->sock, &msg, iov_p, iov_len,
+ count->data_length);
+ if (ret != count->data_length) {
+ pr_err("Unexpected ret: %d send data %d\n",
+ ret, count->data_length);
+ return -EPIPE;
}
+ pr_debug("ret: %d, sent data: %d\n", ret, count->data_length);
- return total_tx;
+ return ret;
}
int rx_data(
--
1.9.1
On Tue, Dec 16, 2014 at 04:48:58AM +0000, Nicholas A. Bellinger wrote:
> In practice this has not been an issue because iscsit_do_tx_data()
> is only used for transferring 48 byte headers + 4 byte digests,
> along with seldom used control payloads from NOPIN + TEXT_RSP +
> REJECT with less than 32k of data. Nor has it been occuring with
> iscsit_do_rx_data() because MSG_WAITALL won't return to caller
> until the requested transfer length is reached, or an error has
> occured.
>
> So following Al's audit of iovec consumers, go ahead and fail
> the connection on short writes/reads for now, and remove the
> bogus logic.
Umm... This won't apply anymore. For one thing, rscvmsg path in mainline
doesn't use kernel_recvmsg() - not since
commit e5a4b0bb803b39a36478451eae53a880d2663d5b
Author: Al Viro <[email protected]>
Date: Mon Nov 24 18:17:55 2014 -0500
switch memcpy_to_msg() and skb_copy{,_and_csum}_datagram_msg() to primitives
That code has already become
memset(&msg, 0, sizeof(struct msghdr));
iov_iter_kvec(&msg.msg_iter, READ | ITER_KVEC,
count->iov, count->iov_count, data);
while (total_rx < data) {
rx_loop = sock_recvmsg(conn->sock, &msg,
(data - total_rx), MSG_WAITALL);
if (rx_loop <= 0) {
pr_debug("rx_loop: %d total_rx: %d\n",
rx_loop, total_rx);
return rx_loop;
}
total_rx += rx_loop;
pr_debug("rx_loop: %d, total_rx: %d, data: %d\n",
rx_loop, total_rx, data);
}
with short reads dealt with just fine - we set ->msg_iter once and each
call of sock_recvmsg() (which doesn't need set_fs() anymore) advances
it for the amount actually received.
sendmsg() side is trivially dealt with in the same fashion. I haven't
pushed that into vfs#iov_iter-net yet, but as soon as the davem opens
net-next I'll be posting the sendmsg part of the series for review and
this will go there as well. FWIW, right now it looks thus:
iov_iter_kvec(&msg.msg_iter, WRITE | ITER_KVEC,
count->iov, count->iov_count, data);
while (msg_data_left(&msg)) {
tx_loop = sock_sendmsg(conn->sock, &msg);
if (tx_loop <= 0) {
pr_debug("tx_loop: %d total_tx %d\n",
tx_loop, total_tx);
return tx_loop;
}
total_tx += tx_loop;
pr_debug("tx_loop: %d, total_tx: %d, data: %d\n",
tx_loop, total_tx, data);
}
(msg_data_left(msg) == iov_iter_count(&msg->msg_iter) and sock_sendmsg()
has lost the third argument - it was always equal to msg_data_left(msg)).
iovec is never drained, ->msg_iter is always advanced by the amount actually
sent. Makes (ex-)users of kernel_sendmsg()/kernel_recvmsg() much simpler
and trivial way of handling short writes/reads becomes correct...
On Tue, 2014-12-16 at 06:04 +0000, Al Viro wrote:
> On Tue, Dec 16, 2014 at 04:48:58AM +0000, Nicholas A. Bellinger wrote:
>
> > In practice this has not been an issue because iscsit_do_tx_data()
> > is only used for transferring 48 byte headers + 4 byte digests,
> > along with seldom used control payloads from NOPIN + TEXT_RSP +
> > REJECT with less than 32k of data. Nor has it been occuring with
> > iscsit_do_rx_data() because MSG_WAITALL won't return to caller
> > until the requested transfer length is reached, or an error has
> > occured.
> >
> > So following Al's audit of iovec consumers, go ahead and fail
> > the connection on short writes/reads for now, and remove the
> > bogus logic.
>
> Umm... This won't apply anymore. For one thing, rscvmsg path in mainline
> doesn't use kernel_recvmsg() - not since
> commit e5a4b0bb803b39a36478451eae53a880d2663d5b
> Author: Al Viro <[email protected]>
> Date: Mon Nov 24 18:17:55 2014 -0500
>
> switch memcpy_to_msg() and skb_copy{,_and_csum}_datagram_msg() to primitives
>
> That code has already become
> memset(&msg, 0, sizeof(struct msghdr));
> iov_iter_kvec(&msg.msg_iter, READ | ITER_KVEC,
> count->iov, count->iov_count, data);
>
> while (total_rx < data) {
> rx_loop = sock_recvmsg(conn->sock, &msg,
> (data - total_rx), MSG_WAITALL);
> if (rx_loop <= 0) {
> pr_debug("rx_loop: %d total_rx: %d\n",
> rx_loop, total_rx);
> return rx_loop;
> }
> total_rx += rx_loop;
> pr_debug("rx_loop: %d, total_rx: %d, data: %d\n",
> rx_loop, total_rx, data);
> }
>
> with short reads dealt with just fine - we set ->msg_iter once and each
> call of sock_recvmsg() (which doesn't need set_fs() anymore) advances
> it for the amount actually received.
Ah, didn't realize this was already merged. Dropping this part now.
Thanks for getting this addressed in -rc1 code btw. ;)
>
> sendmsg() side is trivially dealt with in the same fashion. I haven't
> pushed that into vfs#iov_iter-net yet, but as soon as the davem opens
> net-next I'll be posting the sendmsg part of the series for review and
> this will go there as well.
Your planning to push the iscsit_do_tx_data() changes post -rc1,
right..?
> FWIW, right now it looks thus:
>
> iov_iter_kvec(&msg.msg_iter, WRITE | ITER_KVEC,
> count->iov, count->iov_count, data);
> while (msg_data_left(&msg)) {
> tx_loop = sock_sendmsg(conn->sock, &msg);
> if (tx_loop <= 0) {
> pr_debug("tx_loop: %d total_tx %d\n",
> tx_loop, total_tx);
> return tx_loop;
> }
> total_tx += tx_loop;
> pr_debug("tx_loop: %d, total_tx: %d, data: %d\n",
> tx_loop, total_tx, data);
> }
>
> (msg_data_left(msg) == iov_iter_count(&msg->msg_iter) and sock_sendmsg()
> has lost the third argument - it was always equal to msg_data_left(msg)).
>
> iovec is never drained, ->msg_iter is always advanced by the amount actually
> sent. Makes (ex-)users of kernel_sendmsg()/kernel_recvmsg() much simpler
> and trivial way of handling short writes/reads becomes correct...
> --
So what do you want to do for the existing (potential) bug in
iscsit_do_tx_data()?
Do you prefer the tx change goes via target-pending for <= -rc1 +
stable, ahead of your proper sendmsg series for -rc2 code..?
--nab