Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753009AbaBJSzy (ORCPT ); Mon, 10 Feb 2014 13:55:54 -0500 Received: from mail.linux-iscsi.org ([67.23.28.174]:44031 "EHLO linux-iscsi.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752253AbaBJSzu (ORCPT ); Mon, 10 Feb 2014 13:55:50 -0500 Message-ID: <1392058718.17867.8.camel@haakon3.risingtidesystems.com> Subject: Re: [PATCH] This extends tx_data and and iscsit_do_tx_data with the additional parameter flags and avoids sending multiple TCP packets in iscsit_fe_sendpage_sg From: "Nicholas A. Bellinger" To: Eric Dumazet Cc: John Ogness , Eric Dumazet , "David S. Miller" , target-devel , Linux Network Development , LKML , thomas@glanzmann.de Date: Mon, 10 Feb 2014 10:58:38 -0800 In-Reply-To: <1391949039.10160.129.camel@edumazet-glaptop2.roam.corp.google.com> References: <1391868816.10160.93.camel@edumazet-glaptop2.roam.corp.google.com> <20140208141905.GG20512@glanzmann.de> <1391869805.10160.97.camel@edumazet-glaptop2.roam.corp.google.com> <20140208150001.GI20512@glanzmann.de> <1391871986.10160.105.camel@edumazet-glaptop2.roam.corp.google.com> <20140208165732.GB22359@glanzmann.de> <1391879318.10160.108.camel@edumazet-glaptop2.roam.corp.google.com> <20140208171531.GA23798@glanzmann.de> <1391886759.10160.114.camel@edumazet-glaptop2.roam.corp.google.com> <20140209074027.GA8105@glanzmann.de> <20140209074227.GA8219@glanzmann.de> <1391949039.10160.129.camel@edumazet-glaptop2.roam.corp.google.com> Content-Type: text/plain; charset="UTF-8" X-Mailer: Evolution 3.4.4-1 Mime-Version: 1.0 Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hi Eric & Thomas, On Sun, 2014-02-09 at 04:30 -0800, Eric Dumazet wrote: > On Sun, 2014-02-09 at 08:42 +0100, Eric Dumazet wrote: > > The new infrastructure is used in iscsit_fe_sendpage_sg to avoid sending three > > TCP packets instead of one by settings the MSG_MORE when calling kernel_sendmsg > > via the wrapper functions tx_data and iscsit_do_tx_data. This reduces the TCP > > overhead by sending the same data in less TCP packets and minimized the TCP RTP > > when TCP auto corking is enabled. When creating a 500 GB VMFS filesystem the > > filesystem is created in 3 seconds instead of 4 seconds. > > > > Signed-off-by: Thomas Glanzmann > > X-tested-by: Thomas Glanzmann > > --- > > Hmm, thanks but this is not how to do this. > > When you submit a patch written by someone else, you should : > > 1) Use your own identity as the sender, not impersonate me. > ( thats standard convention ) > > 2) Put following line as first line of the mail > ( Documentation/SubmittingPatches lines ~565) > > From: Eric Dumazet > > Then I'll add my : > Signed-off-by: Eric Dumazet > > Anyway, patch is not yet complete : We also want to set > MSG_MORE/MSG_SENDPAGE_NOTLAST for all pages but last one in a sg list. > > > This will fix suboptimal traffic : > > 13:32:04.976923 IP 10.101.99.5.3260 > 10.101.0.12.43418: Flags [.], seq 289953:292849, ack 45792, win 795, options [nop,nop,TS val 4294914045 ecr 1577012], length 2896 > 13:32:04.976936 IP 10.101.99.5.3260 > 10.101.0.12.43418: Flags [.], seq 292849:295745, ack 45792, win 795, options [nop,nop,TS val 4294914045 ecr 1577012], length 2896 > 13:32:04.976944 IP 10.101.99.5.3260 > 10.101.0.12.43418: Flags [P.], seq 295745:298193, ack 45792, win 795, options [nop,nop,TS val 4294914045 ecr 1577012], length 2448 > 13:32:04.976952 IP 10.101.99.5.3260 > 10.101.0.12.43418: Flags [.], seq 298193:301089, ack 45792, win 795, options [nop,nop,TS val 4294914045 ecr 1577012], length 2896 > 13:32:04.976960 IP 10.101.99.5.3260 > 10.101.0.12.43418: Flags [.], seq 301089:303985, ack 45792, win 795, options [nop,nop,TS val 4294914045 ecr 1577012], length 2896 > 13:32:04.976998 IP 10.101.99.5.3260 > 10.101.0.12.43418: Flags [P.], seq 303985:306385, ack 45792, win 795, options [nop,nop,TS val 4294914045 ecr 1577012], length 2400 > > Please try following updated patch, thanks ! > > Once tested, we'll submit it formally. > > drivers/target/iscsi/iscsi_target_parameters.c | 2 > drivers/target/iscsi/iscsi_target_util.c | 38 +++++++++------ > drivers/target/iscsi/iscsi_target_util.h | 2 > 3 files changed, 25 insertions(+), 17 deletions(-) > This looks correct to me.. Thomas, once your able to confirm please include your 'Tested-by' and I'll include for the next -rc3 PULL request. Thanks! --nab > diff --git a/drivers/target/iscsi/iscsi_target_parameters.c b/drivers/target/iscsi/iscsi_target_parameters.c > index 4d2e23fc76fd..b80239250a1c 100644 > --- a/drivers/target/iscsi/iscsi_target_parameters.c > +++ b/drivers/target/iscsi/iscsi_target_parameters.c > @@ -79,7 +79,7 @@ int iscsi_login_tx_data( > */ > conn->if_marker += length; > > - tx_sent = tx_data(conn, &iov[0], iov_cnt, length); > + tx_sent = tx_data(conn, &iov[0], iov_cnt, length, 0); > if (tx_sent != length) { > pr_err("tx_data returned %d, expecting %d.\n", > tx_sent, length); > diff --git a/drivers/target/iscsi/iscsi_target_util.c b/drivers/target/iscsi/iscsi_target_util.c > index 0819e688a398..3c529f7c61ce 100644 > --- a/drivers/target/iscsi/iscsi_target_util.c > +++ b/drivers/target/iscsi/iscsi_target_util.c > @@ -1165,7 +1165,7 @@ send_data: > iov_count = cmd->iov_misc_count; > } > > - tx_sent = tx_data(conn, &iov[0], iov_count, tx_size); > + tx_sent = tx_data(conn, &iov[0], iov_count, tx_size, 0); > if (tx_size != tx_sent) { > if (tx_sent == -EAGAIN) { > pr_err("tx_data() returned -EAGAIN\n"); > @@ -1196,7 +1196,8 @@ send_hdr: > iov.iov_base = cmd->pdu; > iov.iov_len = tx_hdr_size; > > - tx_sent = tx_data(conn, &iov, 1, tx_hdr_size); > + tx_sent = tx_data(conn, &iov, 1, tx_hdr_size, > + cmd->tx_size != tx_hdr_size ? MSG_MORE : 0); > if (tx_hdr_size != tx_sent) { > if (tx_sent == -EAGAIN) { > pr_err("tx_data() returned -EAGAIN\n"); > @@ -1225,18 +1226,24 @@ send_hdr: > while (data_len) { > u32 space = (sg->length - offset); > u32 sub_len = min_t(u32, data_len, space); > + int flags = 0; > + > + if ((data_len != sub_len) || cmd->padding || > + conn->conn_ops->DataDigest) > + flags = MSG_SENDPAGE_NOTLAST | MSG_MORE; > + > send_pg: > tx_sent = conn->sock->ops->sendpage(conn->sock, > - sg_page(sg), sg->offset + offset, sub_len, 0); > + sg_page(sg), > + sg->offset + offset, > + sub_len, flags); > if (tx_sent != sub_len) { > if (tx_sent == -EAGAIN) { > - pr_err("tcp_sendpage() returned" > - " -EAGAIN\n"); > + pr_err("tcp_sendpage() returned -EAGAIN\n"); > goto send_pg; > } > > - pr_err("tcp_sendpage() failure: %d\n", > - tx_sent); > + pr_err("tcp_sendpage() failure: %d\n", tx_sent); > return -1; > } > > @@ -1249,7 +1256,8 @@ send_padding: > if (cmd->padding) { > struct kvec *iov_p = &cmd->iov_data[iov_off++]; > > - tx_sent = tx_data(conn, iov_p, 1, cmd->padding); > + tx_sent = tx_data(conn, iov_p, 1, cmd->padding, > + conn->conn_ops->DataDigest ? MSG_MORE : 0); > if (cmd->padding != tx_sent) { > if (tx_sent == -EAGAIN) { > pr_err("tx_data() returned -EAGAIN\n"); > @@ -1263,7 +1271,7 @@ send_datacrc: > if (conn->conn_ops->DataDigest) { > struct kvec *iov_d = &cmd->iov_data[iov_off]; > > - tx_sent = tx_data(conn, iov_d, 1, ISCSI_CRC_LEN); > + tx_sent = tx_data(conn, iov_d, 1, ISCSI_CRC_LEN, 0); > if (ISCSI_CRC_LEN != tx_sent) { > if (tx_sent == -EAGAIN) { > pr_err("tx_data() returned -EAGAIN\n"); > @@ -1349,11 +1357,12 @@ static int iscsit_do_rx_data( > > static int iscsit_do_tx_data( > struct iscsi_conn *conn, > - struct iscsi_data_count *count) > + struct iscsi_data_count *count, > + int flags) > { > int data = count->data_length, total_tx = 0, tx_loop = 0, iov_len; > struct kvec *iov_p; > - struct msghdr msg; > + struct msghdr msg = { .msg_flags = flags }; > > if (!conn || !conn->sock || !conn->conn_ops) > return -1; > @@ -1363,8 +1372,6 @@ static int iscsit_do_tx_data( > return -1; > } > > - memset(&msg, 0, sizeof(struct msghdr)); > - > iov_p = count->iov; > iov_len = count->iov_count; > > @@ -1408,7 +1415,8 @@ int tx_data( > struct iscsi_conn *conn, > struct kvec *iov, > int iov_count, > - int data) > + int data, > + int flags) > { > struct iscsi_data_count c; > > @@ -1421,7 +1429,7 @@ int tx_data( > c.data_length = data; > c.type = ISCSI_TX_DATA; > > - return iscsit_do_tx_data(conn, &c); > + return iscsit_do_tx_data(conn, &c, flags); > } > > void iscsit_collect_login_stats( > diff --git a/drivers/target/iscsi/iscsi_target_util.h b/drivers/target/iscsi/iscsi_target_util.h > index e4fc34a02f57..1b4f06801adc 100644 > --- a/drivers/target/iscsi/iscsi_target_util.h > +++ b/drivers/target/iscsi/iscsi_target_util.h > @@ -54,7 +54,7 @@ extern int iscsit_print_dev_to_proc(char *, char **, off_t, int); > extern int iscsit_print_sessions_to_proc(char *, char **, off_t, int); > extern int iscsit_print_tpg_to_proc(char *, char **, off_t, int); > extern int rx_data(struct iscsi_conn *, struct kvec *, int, int); > -extern int tx_data(struct iscsi_conn *, struct kvec *, int, int); > +extern int tx_data(struct iscsi_conn *, struct kvec *, int, int, int); > extern void iscsit_collect_login_stats(struct iscsi_conn *, u8, u8); > extern struct iscsi_tiqn *iscsit_snmp_get_tiqn(struct iscsi_conn *); > > > > -- > To unsubscribe from this list: send the line "unsubscribe target-devel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/