2017-03-28 13:26:17

by Aviad Yehezkel

Subject: [RFC TLS Offload Support 00/15] cover letter

Overview
========
A kernel TLS Tx-only socket option for TCP sockets.
Similarly to the kernel TLS socket (https://lwn.net/Articles/665602),
only symmetric crypto and TLS record framing are done in the kernel.
The handshake remains in userspace, and the negotiated cipher keys/IV are provided to the TCP socket.
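For illustration, the intended userspace flow is roughly as follows. This is a sketch assuming the TCP_TLS_TX socket option and the UAPI structures added later in this series; error handling is omitted:

#include <string.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <linux/tls.h>	/* added by this series */

/* Sketch: after the handshake completes in userspace, hand the
 * negotiated AES-GCM-128 key material to the kernel. TCP_TLS_TX
 * comes from this series' linux/tcp.h. */
static int attach_ktls_tx(int fd, const unsigned char *key,
			  const unsigned char *iv, const unsigned char *salt)
{
	struct tls_crypto_info_aes_gcm_128 ci = {
		.info.version = TLS_1_2_VERSION,
		.info.cipher_type = TLS_CIPHER_AES_GCM_128,
		.info.state = TLS_STATE_HW,	/* TLS_STATE_SW for the SW path */
	};

	memcpy(ci.key, key, TLS_CIPHER_AES_GCM_128_KEY_SIZE);
	memcpy(ci.iv, iv, TLS_CIPHER_AES_GCM_128_IV_SIZE);
	memcpy(ci.salt, salt, TLS_CIPHER_AES_GCM_128_SALT_SIZE);

	return setsockopt(fd, IPPROTO_TCP, TCP_TLS_TX, &ci, sizeof(ci));
}

After a successful setsockopt, plaintext written with send() is framed and encrypted as TLS application-data records.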

Today, userspace TLS must perform two passes over the data: first it encrypts the data, then the encrypted data is copied to the TCP socket in the kernel.
Kernel TLS avoids one pass over the data by encrypting directly from userspace pages into kernelspace buffers.

Non-application-data TLS records must be encrypted using the latest crypto state available in the kernel. It is possible to get the crypto context from the kernel and encrypt such records in userspace, but we chose to encrypt such TLS records in the kernel: the caller sets the MSG_OOB flag and provides the record type with the data.
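For example, a userspace sketch of encrypting a TLS alert through this path. Per the MSG_OOB handling in patch 15, the first payload byte carries the record type, and no partially sent record may be pending:

#include <sys/socket.h>

/* Sketch: send a two-byte TLS alert as a non-data record. Byte 0 is
 * the record type (21 = alert); the kernel encrypts it with the
 * current crypto state. */
static ssize_t send_tls_alert(int fd, unsigned char level, unsigned char desc)
{
	unsigned char buf[3] = { 21, level, desc };

	return send(fd, buf, sizeof(buf), MSG_OOB);
}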

TLS Tx crypto offload is a new feature of network devices. It enables the kernel TLS socket to skip encryption and authentication operations on the transmit side of the data path, delegating those to the NIC. In turn, the NIC encrypts packets that belong to an offloaded TLS socket on the fly. The NIC does not modify any packet headers: it expects to receive fully framed TCP packets with TLS records as payload, replaces the plaintext with ciphertext, and fills in the authentication tag. The NIC does not hold any state beyond the context needed to encrypt the next expected packet, i.e. the expected TCP sequence number and the crypto state.

There are two flows for TLS Tx offload, a fast path and a slow path.
Fast path: the packet matches the expected TCP sequence number in the context.
Slow path: the packet does not match the expected TCP sequence number in the context, for example a TCP retransmission. For a packet on the slow path, we need to resynchronize the crypto context of the NIC by providing the TLS record data for that packet before it can be encrypted and transmitted by the NIC.
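In terms of the APIs added later in this series, the slow-path lookup a driver would perform is roughly the following sketch; locking of the records list and the device-specific resync command are omitted:

#include <net/tls.h>

/* Sketch: given a TX skb whose TCP sequence number does not match the
 * device's expected sequence number, find the TLS record containing
 * that sequence number so the device crypto context can be
 * resynchronized before the packet is encrypted. */
static struct tls_record_info *resync_lookup(struct sock *sk, u32 tcp_seq)
{
	struct tls_offload_context *ctx = tls_offload_ctx(tls_get_ctx(sk));

	/* caller must hold ctx->lock, which protects records_list */
	return tls_get_record(ctx, tcp_seq);
}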

Motivation
==========
1) Performance: The CPU overhead of encryption in the data path is high, at least 4x for netperf over TLS between two machines connected back-to-back.
Our single-stream performance tests show that using crypto offload for TLS sockets achieves the same throughput as plain TCP traffic while increasing CPU utilization by only 1.4x.

2) Flexibility: The protocol stack is implemented entirely on the host CPU.
Compared to solutions based on TCP offload, this approach offloads only encryption, keeping memory management, congestion control, etc. on the host CPU.

Notes
=====
1) New paths:
o net/tls - TLS layer in kernel
o drivers/net/ethernet/mellanox/accelerator/* - NIC driver support, currently implemented as separate modules.
In the future this code will go into the mlx5 driver. This series includes only the module that integrates with the TLS layer.
The complete NIC sample driver is available at https://github.com/Mellanox/tls-offload/tree/tx_rfc_v5

2) We implemented support for this API in OpenSSL 1.1.0; the code is available at https://github.com/Mellanox/tls-openssl/tree/master

3) TLS crypto offload was presented at netdev 1.2; more details can be found in the presentation and paper:
https://netdevconf.org/1.2/session.html?boris-pismenny

4) These RFC patches are based on kernel 4.9-rc7.

Aviad Yehezkel (5):
tcp: export do_tcp_sendpages function
tcp: export tcp_rate_check_app_limited function
tcp: Add TLS socket options for TCP sockets
tls: tls offload support
mlx/tls: Enable MLX5_CORE_QP_SIM mode for tls

Dave Watson (2):
crypto: Add gcm template for rfc5288
crypto: rfc5288 aesni optimized intel routines

Ilya Lesokhin (8):
tcp: Add clean acked data hook
net: Add TLS offload netdevice and socket support
mlx/mlx5_core: Allow sending multiple packets
mlx/tls: Hardware interface
mlx/tls: Sysfs configuration interface Configure the driver/hardware
interface via sysfs.
mlx/tls: Add mlx_accel offload driver for TLS
mlx/tls: TLS offload driver Add the main module entrypoints and tie
the module into the build system
net/tls: Add software offload

MAINTAINERS | 14 +
arch/x86/crypto/aesni-intel_asm.S | 6 +
arch/x86/crypto/aesni-intel_avx-x86_64.S | 4 +
arch/x86/crypto/aesni-intel_glue.c | 105 ++-
crypto/gcm.c | 122 ++++
crypto/tcrypt.c | 14 +-
crypto/testmgr.c | 16 +
crypto/testmgr.h | 47 ++
drivers/net/ethernet/mellanox/Kconfig | 1 +
drivers/net/ethernet/mellanox/Makefile | 1 +
.../net/ethernet/mellanox/accelerator/tls/Kconfig | 11 +
.../net/ethernet/mellanox/accelerator/tls/Makefile | 4 +
.../net/ethernet/mellanox/accelerator/tls/tls.c | 658 +++++++++++++++++++
.../net/ethernet/mellanox/accelerator/tls/tls.h | 100 +++
.../ethernet/mellanox/accelerator/tls/tls_cmds.h | 112 ++++
.../net/ethernet/mellanox/accelerator/tls/tls_hw.c | 429 ++++++++++++
.../net/ethernet/mellanox/accelerator/tls/tls_hw.h | 49 ++
.../ethernet/mellanox/accelerator/tls/tls_main.c | 77 +++
.../ethernet/mellanox/accelerator/tls/tls_sysfs.c | 196 ++++++
.../ethernet/mellanox/accelerator/tls/tls_sysfs.h | 47 ++
drivers/net/ethernet/mellanox/mlx5/core/en_tx.c | 11 +-
include/linux/netdevice.h | 23 +
include/net/inet_connection_sock.h | 2 +
include/net/tcp.h | 2 +
include/net/tls.h | 228 +++++++
include/uapi/linux/Kbuild | 1 +
include/uapi/linux/tcp.h | 2 +
include/uapi/linux/tls.h | 84 +++
net/Kconfig | 1 +
net/Makefile | 1 +
net/ipv4/tcp.c | 37 +-
net/ipv4/tcp_input.c | 3 +
net/ipv4/tcp_rate.c | 1 +
net/tls/Kconfig | 12 +
net/tls/Makefile | 7 +
net/tls/tls_device.c | 594 +++++++++++++++++
net/tls/tls_main.c | 352 ++++++++++
net/tls/tls_sw.c | 729 +++++++++++++++++++++
38 files changed, 4078 insertions(+), 25 deletions(-)
create mode 100644 drivers/net/ethernet/mellanox/accelerator/tls/Kconfig
create mode 100644 drivers/net/ethernet/mellanox/accelerator/tls/Makefile
create mode 100644 drivers/net/ethernet/mellanox/accelerator/tls/tls.c
create mode 100644 drivers/net/ethernet/mellanox/accelerator/tls/tls.h
create mode 100644 drivers/net/ethernet/mellanox/accelerator/tls/tls_cmds.h
create mode 100644 drivers/net/ethernet/mellanox/accelerator/tls/tls_hw.c
create mode 100644 drivers/net/ethernet/mellanox/accelerator/tls/tls_hw.h
create mode 100644 drivers/net/ethernet/mellanox/accelerator/tls/tls_main.c
create mode 100644 drivers/net/ethernet/mellanox/accelerator/tls/tls_sysfs.c
create mode 100644 drivers/net/ethernet/mellanox/accelerator/tls/tls_sysfs.h
create mode 100644 include/net/tls.h
create mode 100644 include/uapi/linux/tls.h
create mode 100644 net/tls/Kconfig
create mode 100644 net/tls/Makefile
create mode 100644 net/tls/tls_device.c
create mode 100644 net/tls/tls_main.c
create mode 100644 net/tls/tls_sw.c

--
2.7.4


2017-03-28 13:26:50

by Aviad Yehezkel

Subject: [RFC TLS Offload Support 01/15] tcp: Add clean acked data hook

From: Ilya Lesokhin <[email protected]>

Called when a TCP segment is acknowledged.
It can be used by application protocols that hold additional
metadata associated with the stream data.
This is required for TLS offload to release metadata
for acknowledged TLS records.
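As a usage sketch, the TLS layer would install the hook when offload is enabled on a socket; tls_icsk_clean_acked is the callback added later in this series:

#include <net/inet_connection_sock.h>
#include <net/tls.h>

/* Sketch: register the clean-acked hook for a TLS-offloaded socket so
 * that acknowledged TLS record metadata is released from tcp_ack(). */
static void tls_install_clean_acked(struct sock *sk)
{
	inet_csk(sk)->icsk_clean_acked = tls_icsk_clean_acked;
}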

Signed-off-by: Boris Pismenny <[email protected]>
Signed-off-by: Ilya Lesokhin <[email protected]>
Signed-off-by: Aviad Yehezkel <[email protected]>
---
include/net/inet_connection_sock.h | 2 ++
net/ipv4/tcp_input.c | 3 +++
2 files changed, 5 insertions(+)

diff --git a/include/net/inet_connection_sock.h b/include/net/inet_connection_sock.h
index 146054c..0b0aceb 100644
--- a/include/net/inet_connection_sock.h
+++ b/include/net/inet_connection_sock.h
@@ -77,6 +77,7 @@ struct inet_connection_sock_af_ops {
* @icsk_pmtu_cookie Last pmtu seen by socket
* @icsk_ca_ops Pluggable congestion control hook
* @icsk_af_ops Operations which are AF_INET{4,6} specific
+ * @icsk_clean_acked Clean acked data hook
* @icsk_ca_state: Congestion control state
* @icsk_retransmits: Number of unrecovered [RTO] timeouts
* @icsk_pending: Scheduled timer event
@@ -99,6 +100,7 @@ struct inet_connection_sock {
__u32 icsk_pmtu_cookie;
const struct tcp_congestion_ops *icsk_ca_ops;
const struct inet_connection_sock_af_ops *icsk_af_ops;
+ void (*icsk_clean_acked)(struct sock *sk);
unsigned int (*icsk_sync_mss)(struct sock *sk, u32 pmtu);
__u8 icsk_ca_state:6,
icsk_ca_setsockopt:1,
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index fe668c1..c158bec 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -3667,6 +3667,9 @@ static int tcp_ack(struct sock *sk, const struct sk_buff *skb, int flag)
if (!prior_packets)
goto no_queue;

+ if (icsk->icsk_clean_acked)
+ icsk->icsk_clean_acked(sk);
+
/* See if we can take anything off of the retransmit queue. */
flag |= tcp_clean_rtx_queue(sk, prior_fackets, prior_snd_una, &acked,
&sack_state, &now);
--
2.7.4

2017-03-28 13:26:22

by Aviad Yehezkel

Subject: [RFC TLS Offload Support 05/15] tcp: Add TLS socket options for TCP sockets

This patch adds the TCP_TLS_TX and TCP_TLS_RX TCP socket options.

Setting these socket options changes the sk->sk_prot
operations of the TCP socket. The user is responsible for
preventing races between calls to the previous operations
and the new ones. After a successful return, data
sent on this socket will be encapsulated in TLS records.
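For example, userspace can read back the current TX crypto parameters (including the IV, which advances per record) with a sketch like the following, matching tls_sk_query() in patch 06:

#include <sys/socket.h>
#include <netinet/in.h>
#include <linux/tls.h>

/* Sketch: query the crypto state attached via TCP_TLS_TX; the kernel
 * copies back the parameters, including the updated IV. */
static int query_ktls_tx(int fd, struct tls_crypto_info_aes_gcm_128 *ci)
{
	socklen_t len = sizeof(*ci);

	return getsockopt(fd, IPPROTO_TCP, TCP_TLS_TX, ci, &len);
}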

Signed-off-by: Aviad Yehezkel <[email protected]>
Signed-off-by: Boris Pismenny <[email protected]>
Signed-off-by: Ilya Lesokhin <[email protected]>
---
include/uapi/linux/tcp.h | 2 ++
net/ipv4/tcp.c | 32 ++++++++++++++++++++++++++++++++
2 files changed, 34 insertions(+)

diff --git a/include/uapi/linux/tcp.h b/include/uapi/linux/tcp.h
index c53de26..f9f0e29 100644
--- a/include/uapi/linux/tcp.h
+++ b/include/uapi/linux/tcp.h
@@ -116,6 +116,8 @@ enum {
#define TCP_SAVE_SYN 27 /* Record SYN headers for new connections */
#define TCP_SAVED_SYN 28 /* Get SYN headers recorded for connection */
#define TCP_REPAIR_WINDOW 29 /* Get/set window parameters */
+#define TCP_TLS_TX 30
+#define TCP_TLS_RX 31

struct tcp_repair_opt {
__u32 opt_code;
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 302fee9..2d190e3 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -273,6 +273,7 @@
#include <net/icmp.h>
#include <net/inet_common.h>
#include <net/tcp.h>
+#include <net/tls.h>
#include <net/xfrm.h>
#include <net/ip.h>
#include <net/sock.h>
@@ -2676,6 +2677,21 @@ static int do_tcp_setsockopt(struct sock *sk, int level,
tp->notsent_lowat = val;
sk->sk_write_space(sk);
break;
+ case TCP_TLS_TX:
+ case TCP_TLS_RX: {
+ int (*fn)(struct sock *sk, int optname,
+ char __user *optval, unsigned int optlen);
+
+ fn = symbol_get(tls_sk_attach);
+ if (!fn) {
+ err = -EINVAL;
+ break;
+ }
+
+ err = fn(sk, optname, optval, optlen);
+ symbol_put(tls_sk_attach);
+ break;
+ }
default:
err = -ENOPROTOOPT;
break;
@@ -3064,6 +3080,22 @@ static int do_tcp_getsockopt(struct sock *sk, int level,
}
return 0;
}
+ case TCP_TLS_TX:
+ case TCP_TLS_RX: {
+ int err;
+ int (*fn)(struct sock *sk, int optname,
+ char __user *optval, int __user *optlen);
+
+ fn = symbol_get(tls_sk_query);
+ if (!fn) {
+ err = -EINVAL;
+ break;
+ }
+
+ err = fn(sk, optname, optval, optlen);
+ symbol_put(tls_sk_query);
+ return err;
+ }
default:
return -ENOPROTOOPT;
}
--
2.7.4

2017-03-28 13:26:56

by Aviad Yehezkel

Subject: [RFC TLS Offload Support 03/15] tcp: export tcp_rate_check_app_limited function

It will be used by the new TLS code.

Signed-off-by: Aviad Yehezkel <[email protected]>
Signed-off-by: Ilya Lesokhin <[email protected]>
Signed-off-by: Boris Pismenny <[email protected]>
---
net/ipv4/tcp_rate.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/net/ipv4/tcp_rate.c b/net/ipv4/tcp_rate.c
index 9be1581..a226f76 100644
--- a/net/ipv4/tcp_rate.c
+++ b/net/ipv4/tcp_rate.c
@@ -184,3 +184,4 @@ void tcp_rate_check_app_limited(struct sock *sk)
tp->app_limited =
(tp->delivered + tcp_packets_in_flight(tp)) ? : 1;
}
+EXPORT_SYMBOL(tcp_rate_check_app_limited);
--
2.7.4

2017-03-28 13:26:30

by Aviad Yehezkel

Subject: [RFC TLS Offload Support 13/15] crypto: Add gcm template for rfc5288

From: Dave Watson <[email protected]>

The AAD length is 13 bytes and the tag is 16 bytes.
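For reference, the 13 AAD bytes are the TLS 1.2 pseudo-header: the 8-byte record sequence number followed by type, version, and length. A sketch mirroring tls_make_aad() from patch 15:

#include <linux/string.h>

/* Sketch: build the 13-byte TLS AAD per RFC 5246 section 6.2.3.3:
 * seq_num(8) || type(1) || version(2) || length(2). */
static void tls_build_aad(char *buf, const char *seq,
			  unsigned char record_type, size_t payload_len)
{
	memcpy(buf, seq, 8);		/* record sequence number */
	buf[8] = record_type;		/* e.g. 0x17 for application data */
	buf[9] = 0x3;			/* TLS 1.2 version major */
	buf[10] = 0x3;			/* TLS 1.2 version minor */
	buf[11] = payload_len >> 8;	/* record length, big endian */
	buf[12] = payload_len & 0xff;
}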

Signed-off-by: Dave Watson <[email protected]>
---
crypto/gcm.c | 122 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
crypto/tcrypt.c | 14 ++++---
crypto/testmgr.c | 16 ++++++++
crypto/testmgr.h | 47 +++++++++++++++++++++
4 files changed, 194 insertions(+), 5 deletions(-)

diff --git a/crypto/gcm.c b/crypto/gcm.c
index f624ac9..07c2805 100644
--- a/crypto/gcm.c
+++ b/crypto/gcm.c
@@ -1016,6 +1016,120 @@ static struct crypto_template crypto_rfc4106_tmpl = {
.module = THIS_MODULE,
};

+static int crypto_rfc5288_encrypt(struct aead_request *req)
+{
+ if (req->assoclen != 21)
+ return -EINVAL;
+
+ req = crypto_rfc4106_crypt(req);
+
+ return crypto_aead_encrypt(req);
+}
+
+static int crypto_rfc5288_decrypt(struct aead_request *req)
+{
+ if (req->assoclen != 21)
+ return -EINVAL;
+
+ req = crypto_rfc4106_crypt(req);
+
+ return crypto_aead_decrypt(req);
+}
+
+static int crypto_rfc5288_create(struct crypto_template *tmpl,
+ struct rtattr **tb)
+{
+ struct crypto_attr_type *algt;
+ struct aead_instance *inst;
+ struct crypto_aead_spawn *spawn;
+ struct aead_alg *alg;
+ const char *ccm_name;
+ int err;
+
+ algt = crypto_get_attr_type(tb);
+ if (IS_ERR(algt))
+ return PTR_ERR(algt);
+
+ if ((algt->type ^ CRYPTO_ALG_TYPE_AEAD) & algt->mask)
+ return -EINVAL;
+
+ ccm_name = crypto_attr_alg_name(tb[1]);
+ if (IS_ERR(ccm_name))
+ return PTR_ERR(ccm_name);
+
+ inst = kzalloc(sizeof(*inst) + sizeof(*spawn), GFP_KERNEL);
+ if (!inst)
+ return -ENOMEM;
+
+ spawn = aead_instance_ctx(inst);
+ crypto_set_aead_spawn(spawn, aead_crypto_instance(inst));
+ err = crypto_grab_aead(spawn, ccm_name, 0,
+ crypto_requires_sync(algt->type, algt->mask));
+ if (err)
+ goto out_free_inst;
+
+ alg = crypto_spawn_aead_alg(spawn);
+
+ err = -EINVAL;
+
+ /* Underlying IV size must be 12. */
+ if (crypto_aead_alg_ivsize(alg) != 12)
+ goto out_drop_alg;
+
+ /* Not a stream cipher? */
+ if (alg->base.cra_blocksize != 1)
+ goto out_drop_alg;
+
+ err = -ENAMETOOLONG;
+ if (snprintf(inst->alg.base.cra_name, CRYPTO_MAX_ALG_NAME,
+ "rfc5288(%s)", alg->base.cra_name) >=
+ CRYPTO_MAX_ALG_NAME ||
+ snprintf(inst->alg.base.cra_driver_name, CRYPTO_MAX_ALG_NAME,
+ "rfc5288(%s)", alg->base.cra_driver_name) >=
+ CRYPTO_MAX_ALG_NAME)
+ goto out_drop_alg;
+
+ inst->alg.base.cra_flags = alg->base.cra_flags & CRYPTO_ALG_ASYNC;
+ inst->alg.base.cra_priority = alg->base.cra_priority;
+ inst->alg.base.cra_blocksize = 1;
+ inst->alg.base.cra_alignmask = alg->base.cra_alignmask;
+
+ inst->alg.base.cra_ctxsize = sizeof(struct crypto_rfc4106_ctx);
+
+ inst->alg.ivsize = 8;
+ inst->alg.chunksize = crypto_aead_alg_chunksize(alg);
+ inst->alg.maxauthsize = crypto_aead_alg_maxauthsize(alg);
+
+ inst->alg.init = crypto_rfc4106_init_tfm;
+ inst->alg.exit = crypto_rfc4106_exit_tfm;
+
+ inst->alg.setkey = crypto_rfc4106_setkey;
+ inst->alg.setauthsize = crypto_rfc4106_setauthsize;
+ inst->alg.encrypt = crypto_rfc5288_encrypt;
+ inst->alg.decrypt = crypto_rfc5288_decrypt;
+
+ inst->free = crypto_rfc4106_free;
+
+ err = aead_register_instance(tmpl, inst);
+ if (err)
+ goto out_drop_alg;
+
+out:
+ return err;
+
+out_drop_alg:
+ crypto_drop_aead(spawn);
+out_free_inst:
+ kfree(inst);
+ goto out;
+}
+
+static struct crypto_template crypto_rfc5288_tmpl = {
+ .name = "rfc5288",
+ .create = crypto_rfc5288_create,
+ .module = THIS_MODULE,
+};
+
static int crypto_rfc4543_setkey(struct crypto_aead *parent, const u8 *key,
unsigned int keylen)
{
@@ -1284,8 +1398,14 @@ static int __init crypto_gcm_module_init(void)
if (err)
goto out_undo_rfc4106;

+ err = crypto_register_template(&crypto_rfc5288_tmpl);
+ if (err)
+ goto out_undo_rfc4543;
+
return 0;

+out_undo_rfc4543:
+ crypto_unregister_template(&crypto_rfc4543_tmpl);
out_undo_rfc4106:
crypto_unregister_template(&crypto_rfc4106_tmpl);
out_undo_gcm:
@@ -1302,6 +1422,7 @@ static void __exit crypto_gcm_module_exit(void)
kfree(gcm_zeroes);
crypto_unregister_template(&crypto_rfc4543_tmpl);
crypto_unregister_template(&crypto_rfc4106_tmpl);
+ crypto_unregister_template(&crypto_rfc5288_tmpl);
crypto_unregister_template(&crypto_gcm_tmpl);
crypto_unregister_template(&crypto_gcm_base_tmpl);
}
@@ -1315,4 +1436,5 @@ MODULE_AUTHOR("Mikko Herranen <[email protected]>");
MODULE_ALIAS_CRYPTO("gcm_base");
MODULE_ALIAS_CRYPTO("rfc4106");
MODULE_ALIAS_CRYPTO("rfc4543");
+MODULE_ALIAS_CRYPTO("rfc5288");
MODULE_ALIAS_CRYPTO("gcm");
diff --git a/crypto/tcrypt.c b/crypto/tcrypt.c
index ae22f05..22538a7 100644
--- a/crypto/tcrypt.c
+++ b/crypto/tcrypt.c
@@ -1338,26 +1338,30 @@ static int do_test(const char *alg, u32 type, u32 mask, int m)
break;

case 152:
- ret += tcrypt_test("rfc4543(gcm(aes))");
+ ret += tcrypt_test("rfc5288(gcm(aes))");
break;

case 153:
- ret += tcrypt_test("cmac(aes)");
+ ret += tcrypt_test("rfc4543(gcm(aes))");
break;

case 154:
- ret += tcrypt_test("cmac(des3_ede)");
+ ret += tcrypt_test("cmac(aes)");
break;

case 155:
- ret += tcrypt_test("authenc(hmac(sha1),cbc(aes))");
+ ret += tcrypt_test("cmac(des3_ede)");
break;

case 156:
- ret += tcrypt_test("authenc(hmac(md5),ecb(cipher_null))");
+ ret += tcrypt_test("authenc(hmac(sha1),cbc(aes))");
break;

case 157:
+ ret += tcrypt_test("authenc(hmac(md5),ecb(cipher_null))");
+ break;
+
+ case 158:
ret += tcrypt_test("authenc(hmac(sha1),ecb(cipher_null))");
break;
case 181:
diff --git a/crypto/testmgr.c b/crypto/testmgr.c
index 62dffa0..4cae414 100644
--- a/crypto/testmgr.c
+++ b/crypto/testmgr.c
@@ -3748,6 +3748,22 @@ static const struct alg_test_desc alg_test_descs[] = {
}
}
}, {
+ .alg = "rfc5288(gcm(aes))",
+ .test = alg_test_aead,
+ .fips_allowed = 1,
+ .suite = {
+ .aead = {
+ .enc = {
+ .vecs = aes_gcm_rfc5288_enc_tv_template,
+ .count = AES_GCM_5288_ENC_TEST_VECTORS
+ },
+ .dec = {
+ .vecs = aes_gcm_rfc5288_dec_tv_template,
+ .count = AES_GCM_5288_DEC_TEST_VECTORS
+ }
+ }
+ }
+ }, {
.alg = "rfc7539(chacha20,poly1305)",
.test = alg_test_aead,
.suite = {
diff --git a/crypto/testmgr.h b/crypto/testmgr.h
index e64a4ef..65d725a 100644
--- a/crypto/testmgr.h
+++ b/crypto/testmgr.h
@@ -15193,6 +15193,8 @@ static struct cipher_testvec cast6_xts_dec_tv_template[] = {
#define AES_GCM_DEC_TEST_VECTORS 8
#define AES_GCM_4106_ENC_TEST_VECTORS 23
#define AES_GCM_4106_DEC_TEST_VECTORS 23
+#define AES_GCM_5288_ENC_TEST_VECTORS 1
+#define AES_GCM_5288_DEC_TEST_VECTORS 1
#define AES_GCM_4543_ENC_TEST_VECTORS 1
#define AES_GCM_4543_DEC_TEST_VECTORS 2
#define AES_CCM_ENC_TEST_VECTORS 8
@@ -21932,6 +21934,7 @@ static struct aead_testvec aes_gcm_rfc4106_dec_tv_template[] = {
.assoc = "\x01\x01\x01\x01\x01\x01\x01\x01"
"\x00\x00\x00\x00\x00\x00\x00\x00",
.alen = 16,
+
.result = "\x01\x01\x01\x01\x01\x01\x01\x01"
"\x01\x01\x01\x01\x01\x01\x01\x01",
.rlen = 16,
@@ -22485,6 +22488,50 @@ static struct aead_testvec aes_gcm_rfc4106_dec_tv_template[] = {
}
};

+static struct aead_testvec aes_gcm_rfc5288_enc_tv_template[] = {
+ {
+ .key = "\x34\x19\x96\x6e\xc5\x8c\x17\x9c"
+ "\x56\x78\x5e\xbb\x30\x52\x21\x89"
+ "\xea\xbc\x6e\x50",
+ .klen = 20,
+ .iv = "\x00\x00\x00\x00\x00\x00\x00\x01"
+ "\x5f\x73\x65\x73",
+ .assoc = "\x00\x00\x00\x00\x00\x00\x00\x01"
+ "\x17\x03\x03\x00\x10\x00\x00\x00"
+ "\x00\x00\x00\x00\x00",
+ .alen = 21,
+ .input = zeroed_string,
+ .ilen = 16,
+ .result = "\xa5\x2b\x6c\x6e\x2d\x78\x6f\x80"
+ "\x0e\x65\x69\x70\x0a\xe8\x86\xed"
+ "\x6d\x38\x29\x1d\x35\x3f\x62\xcf"
+ "\x46\x9c\x19\x78\x00\x0d\x67\xaa",
+ .rlen = 32,
+ }
+};
+
+static struct aead_testvec aes_gcm_rfc5288_dec_tv_template[] = {
+ {
+ .key = "\x73\xf0\xfa\x44\x76\xf5\xd5\x17"
+ "\x00\x12\x42\x85\xcb\x4f\x92\x1f"
+ "\x7d\x63\x9f\xc6",
+ .klen = 20,
+ .iv = "\x00\x00\x00\x00\x00\x00\x00\x01"
+ "\x74\x61\x73\x6b",
+ .assoc = "\x00\x00\x00\x00\x00\x00\x00\x01"
+ "\x17\x03\x03\x00\x10\x00\x00\x00"
+ "\x00\x00\x00\x00\x00",
+ .alen = 21,
+ .input = "\x05\x56\x46\x23\x1c\x86\x5e\xd0"
+ "\x12\x37\x2a\xa3\x65\x8b\x8c\x90"
+ "\xab\xbd\xca\xda\xae\x6e\xc0\xb2"
+ "\x91\x1b\x9b\x34\xe3\xea\x86\x8f",
+ .ilen = 32,
+ .result = zeroed_string,
+ .rlen = 16,
+ },
+};
+
static struct aead_testvec aes_gcm_rfc4543_enc_tv_template[] = {
{ /* From draft-mcgrew-gcm-test-01 */
.key = "\x4c\x80\xcd\xef\xbb\x5d\x10\xda"
--
2.7.4

2017-03-28 13:27:05

by Aviad Yehezkel

Subject: [RFC TLS Offload Support 15/15] net/tls: Add software offload

From: Ilya Lesokhin <[email protected]>

Signed-off-by: Dave Watson <[email protected]>
Signed-off-by: Ilya Lesokhin <[email protected]>
Signed-off-by: Aviad Yehezkel <[email protected]>
---
MAINTAINERS | 1 +
include/net/tls.h | 44 ++++
net/tls/Makefile | 2 +-
net/tls/tls_main.c | 34 +--
net/tls/tls_sw.c | 729 +++++++++++++++++++++++++++++++++++++++++++++++++++++
5 files changed, 794 insertions(+), 16 deletions(-)
create mode 100644 net/tls/tls_sw.c

diff --git a/MAINTAINERS b/MAINTAINERS
index e3b70c3..413c1d9 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8491,6 +8491,7 @@ M: Ilya Lesokhin <[email protected]>
M: Aviad Yehezkel <[email protected]>
M: Boris Pismenny <[email protected]>
M: Haggai Eran <[email protected]>
+M: Dave Watson <[email protected]>
L: [email protected]
T: git git://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git
T: git git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git
diff --git a/include/net/tls.h b/include/net/tls.h
index f7f0cde..bb1f41e 100644
--- a/include/net/tls.h
+++ b/include/net/tls.h
@@ -48,6 +48,7 @@

#define TLS_CRYPTO_INFO_READY(info) ((info)->cipher_type)
#define TLS_IS_STATE_HW(info) ((info)->state == TLS_STATE_HW)
+#define TLS_IS_STATE_SW(info) ((info)->state == TLS_STATE_SW)

#define TLS_RECORD_TYPE_DATA 0x17

@@ -68,6 +69,37 @@ struct tls_offload_context {
spinlock_t lock; /* protects records list */
};

+#define TLS_DATA_PAGES (TLS_MAX_PAYLOAD_SIZE / PAGE_SIZE)
+/* +1 for aad, +1 for tag, +1 for chaining */
+#define TLS_SG_DATA_SIZE (TLS_DATA_PAGES + 3)
+#define ALG_MAX_PAGES 16 /* for skb_to_sgvec */
+#define TLS_AAD_SPACE_SIZE 21
+#define TLS_AAD_SIZE 13
+#define TLS_TAG_SIZE 16
+
+#define TLS_NONCE_SIZE 8
+#define TLS_PREPEND_SIZE (TLS_HEADER_SIZE + TLS_NONCE_SIZE)
+#define TLS_OVERHEAD (TLS_PREPEND_SIZE + TLS_TAG_SIZE)
+
+struct tls_sw_context {
+ struct sock *sk;
+ void (*sk_write_space)(struct sock *sk);
+ struct crypto_aead *aead_send;
+
+ /* Sending context */
+ struct scatterlist sg_tx_data[TLS_SG_DATA_SIZE];
+ struct scatterlist sg_tx_data2[ALG_MAX_PAGES + 1];
+ char aad_send[TLS_AAD_SPACE_SIZE];
+ char tag_send[TLS_TAG_SIZE];
+ skb_frag_t tx_frag;
+ int wmem_len;
+ int order_npages;
+ struct scatterlist sgaad_send[2];
+ struct scatterlist sgtag_send[2];
+ struct sk_buff_head tx_queue;
+ int unsent;
+};
+
struct tls_context {
union {
struct tls_crypto_info crypto_send;
@@ -102,6 +134,12 @@ int tls_device_sendmsg(struct sock *sk, struct msghdr *msg, size_t size);
int tls_device_sendpage(struct sock *sk, struct page *page,
int offset, size_t size, int flags);

+int tls_set_sw_offload(struct sock *sk, struct tls_context *ctx);
+void tls_clear_sw_offload(struct sock *sk);
+int tls_sw_sendmsg(struct sock *sk, struct msghdr *msg, size_t size);
+int tls_sw_sendpage(struct sock *sk, struct page *page,
+ int offset, size_t size, int flags);
+
struct tls_record_info *tls_get_record(struct tls_offload_context *context,
u32 seq);

@@ -174,6 +212,12 @@ static inline struct tls_context *tls_get_ctx(const struct sock *sk)
return sk->sk_user_data;
}

+static inline struct tls_sw_context *tls_sw_ctx(
+ const struct tls_context *tls_ctx)
+{
+ return (struct tls_sw_context *)tls_ctx->priv_ctx;
+}
+
static inline struct tls_offload_context *tls_offload_ctx(
const struct tls_context *tls_ctx)
{
diff --git a/net/tls/Makefile b/net/tls/Makefile
index 65e5677..61457e0 100644
--- a/net/tls/Makefile
+++ b/net/tls/Makefile
@@ -4,4 +4,4 @@

obj-$(CONFIG_TLS) += tls.o

-tls-y := tls_main.o tls_device.o
+tls-y := tls_main.o tls_device.o tls_sw.o
diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c
index 6a3df25..a4efd02 100644
--- a/net/tls/tls_main.c
+++ b/net/tls/tls_main.c
@@ -46,6 +46,7 @@ MODULE_DESCRIPTION("Transport Layer Security Support");
MODULE_LICENSE("Dual BSD/GPL");

static struct proto tls_device_prot;
+static struct proto tls_sw_prot;

int tls_push_frags(struct sock *sk,
struct tls_context *ctx,
@@ -188,13 +189,10 @@ int tls_sk_query(struct sock *sk, int optname, char __user *optval,
rc = -EINVAL;
goto out;
}
- if (TLS_IS_STATE_HW(crypto_info)) {
- lock_sock(sk);
- memcpy(crypto_info_aes_gcm_128->iv,
- ctx->iv,
- TLS_CIPHER_AES_GCM_128_IV_SIZE);
- release_sock(sk);
- }
+ lock_sock(sk);
+ memcpy(crypto_info_aes_gcm_128->iv, ctx->iv,
+ TLS_CIPHER_AES_GCM_128_IV_SIZE);
+ release_sock(sk);
rc = copy_to_user(optval,
crypto_info_aes_gcm_128,
sizeof(*crypto_info_aes_gcm_128));
@@ -224,6 +222,7 @@ int tls_sk_attach(struct sock *sk, int optname, char __user *optval,
struct tls_context *ctx = tls_get_ctx(sk);
struct tls_crypto_info *crypto_info;
bool allocated_tls_ctx = false;
+ struct proto *prot = NULL;

if (!optval || (optlen < sizeof(*crypto_info))) {
rc = -EINVAL;
@@ -267,12 +266,6 @@ int tls_sk_attach(struct sock *sk, int optname, char __user *optval,
goto err_sk_user_data;
}

- /* currently we support only HW offload */
- if (!TLS_IS_STATE_HW(crypto_info)) {
- rc = -ENOPROTOOPT;
- goto err_crypto_info;
- }
-
/* check version */
if (crypto_info->version != TLS_1_2_VERSION) {
rc = -ENOTSUPP;
@@ -306,6 +299,12 @@ int tls_sk_attach(struct sock *sk, int optname, char __user *optval,

if (TLS_IS_STATE_HW(crypto_info)) {
rc = tls_set_device_offload(sk, ctx);
+ prot = &tls_device_prot;
+ if (rc)
+ goto err_crypto_info;
+ } else if (TLS_IS_STATE_SW(crypto_info)) {
+ rc = tls_set_sw_offload(sk, ctx);
+ prot = &tls_sw_prot;
if (rc)
goto err_crypto_info;
}
@@ -315,8 +314,9 @@ int tls_sk_attach(struct sock *sk, int optname, char __user *optval,
goto err_set_device_offload;
}

- /* TODO: add protection */
- sk->sk_prot = &tls_device_prot;
+ rc = 0;
+
+ sk->sk_prot = prot;
goto out;

err_set_device_offload:
@@ -337,6 +337,10 @@ static int __init tls_init(void)
tls_device_prot.sendmsg = tls_device_sendmsg;
tls_device_prot.sendpage = tls_device_sendpage;

+ tls_sw_prot = tcp_prot;
+ tls_sw_prot.sendmsg = tls_sw_sendmsg;
+ tls_sw_prot.sendpage = tls_sw_sendpage;
+
return 0;
}

diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
new file mode 100644
index 0000000..4698dc7
--- /dev/null
+++ b/net/tls/tls_sw.c
@@ -0,0 +1,729 @@
+/*
+ * af_tls: TLS socket
+ *
+ * Copyright (C) 2016
+ *
+ * Original authors:
+ * Fridolin Pokorny <[email protected]>
+ * Nikos Mavrogiannopoulos <[email protected]>
+ * Dave Watson <[email protected]>
+ * Lance Chao <[email protected]>
+ *
+ * Based on RFC 5288, RFC 6347, RFC 5246, RFC 6655
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation; either version 2 of the
+ * License, or (at your option) any later version.
+ */
+
+#include <linux/module.h>
+#include <net/tcp.h>
+#include <net/inet_common.h>
+#include <linux/highmem.h>
+#include <linux/netdevice.h>
+#include <crypto/aead.h>
+
+#include <net/tls.h>
+
+static int tls_kernel_sendpage(struct sock *sk, int flags);
+
+static inline void tls_make_aad(struct sock *sk,
+ int recv,
+ char *buf,
+ size_t size,
+ char *nonce_explicit,
+ unsigned char record_type)
+{
+ memcpy(buf, nonce_explicit, TLS_NONCE_SIZE);
+
+ buf[8] = record_type;
+ buf[9] = TLS_1_2_VERSION_MAJOR;
+ buf[10] = TLS_1_2_VERSION_MINOR;
+ buf[11] = size >> 8;
+ buf[12] = size & 0xFF;
+}
+
+static int tls_do_encryption(struct sock *sk, struct scatterlist *sgin,
+ struct scatterlist *sgout, size_t data_len,
+ struct sk_buff *skb)
+{
+ struct tls_context *tls_ctx = tls_get_ctx(sk);
+ struct tls_sw_context *ctx = tls_sw_ctx(tls_ctx);
+ int ret;
+ unsigned int req_size = sizeof(struct aead_request) +
+ crypto_aead_reqsize(ctx->aead_send);
+ struct aead_request *aead_req;
+
+ pr_debug("tls_do_encryption %p\n", sk);
+
+ aead_req = kmalloc(req_size, GFP_ATOMIC);
+
+ if (!aead_req)
+ return -ENOMEM;
+
+ aead_request_set_tfm(aead_req, ctx->aead_send);
+ aead_request_set_ad(aead_req, TLS_AAD_SPACE_SIZE);
+ aead_request_set_crypt(aead_req, sgin, sgout, data_len, tls_ctx->iv);
+
+ ret = crypto_aead_encrypt(aead_req);
+
+ kfree(aead_req);
+ if (ret < 0)
+ return ret;
+ tls_kernel_sendpage(sk, MSG_DONTWAIT);
+
+ return ret;
+}
+
+/* Allocates enough pages to hold the decrypted data, as well as
+ * setting ctx->sg_tx_data to the pages
+ */
+static int tls_pre_encrypt(struct sock *sk, size_t data_len)
+{
+ struct tls_context *tls_ctx = tls_get_ctx(sk);
+ struct tls_sw_context *ctx = tls_sw_ctx(tls_ctx);
+ int i;
+ unsigned int npages;
+ size_t aligned_size;
+ size_t encrypt_len;
+ struct scatterlist *sg;
+ int ret = 0;
+ struct page *tx_pages;
+
+ encrypt_len = data_len + TLS_OVERHEAD;
+ npages = encrypt_len / PAGE_SIZE;
+ aligned_size = npages * PAGE_SIZE;
+ if (aligned_size < encrypt_len)
+ npages++;
+
+ ctx->order_npages = order_base_2(npages);
+ WARN_ON(ctx->order_npages < 0 || ctx->order_npages > 3);
+ /* The first entry in sg_tx_data is AAD so skip it */
+ sg_init_table(ctx->sg_tx_data, TLS_SG_DATA_SIZE);
+ sg_set_buf(&ctx->sg_tx_data[0], ctx->aad_send, sizeof(ctx->aad_send));
+ tx_pages = alloc_pages(GFP_KERNEL | __GFP_COMP,
+ ctx->order_npages);
+ if (!tx_pages) {
+ ret = -ENOMEM;
+ return ret;
+ }
+
+ sg = ctx->sg_tx_data + 1;
+ /* For the first page, leave room for prepend. It will be
+ * copied into the page later
+ */
+ sg_set_page(sg, tx_pages, PAGE_SIZE - TLS_PREPEND_SIZE,
+ TLS_PREPEND_SIZE);
+ for (i = 1; i < npages; i++)
+ sg_set_page(sg + i, tx_pages + i, PAGE_SIZE, 0);
+
+ __skb_frag_set_page(&ctx->tx_frag, tx_pages);
+
+ return ret;
+}
+
+static void tls_release_tx_frag(struct sock *sk)
+{
+ struct tls_context *tls_ctx = tls_get_ctx(sk);
+ struct tls_sw_context *ctx = tls_sw_ctx(tls_ctx);
+ struct page *tx_page = skb_frag_page(&ctx->tx_frag);
+
+ if (!tls_is_pending_open_record(tls_ctx) && tx_page) {
+ struct sk_buff *head;
+ /* Successfully sent the whole packet, account for it*/
+
+ head = skb_peek(&ctx->tx_queue);
+ skb_dequeue(&ctx->tx_queue);
+ sk->sk_wmem_queued -= ctx->wmem_len;
+ sk_mem_uncharge(sk, ctx->wmem_len);
+ ctx->wmem_len = 0;
+ kfree_skb(head);
+ ctx->unsent -= skb_frag_size(&ctx->tx_frag) - TLS_OVERHEAD;
+ tls_increment_seqno(tls_ctx->iv, sk);
+ __free_pages(tx_page,
+ ctx->order_npages);
+ __skb_frag_set_page(&ctx->tx_frag, NULL);
+ }
+ ctx->sk_write_space(sk);
+}
+
+static int tls_kernel_sendpage(struct sock *sk, int flags)
+{
+ int ret;
+ struct tls_context *tls_ctx = tls_get_ctx(sk);
+ struct tls_sw_context *ctx = tls_sw_ctx(tls_ctx);
+
+ skb_frag_size_add(&ctx->tx_frag, TLS_OVERHEAD);
+ ret = tls_push_frags(sk, tls_ctx, &ctx->tx_frag, 1, 0, flags);
+ if (ret >= 0)
+ tls_release_tx_frag(sk);
+ else if (ret != -EAGAIN)
+ tls_err_abort(sk);
+
+ return ret;
+}
+
+static int tls_push_zerocopy(struct sock *sk, struct scatterlist *sgin,
+ int pages, int bytes, unsigned char record_type)
+{
+ struct tls_context *tls_ctx = tls_get_ctx(sk);
+ struct tls_sw_context *ctx = tls_sw_ctx(tls_ctx);
+ int ret;
+
+ tls_make_aad(sk, 0, ctx->aad_send, bytes, tls_ctx->iv, record_type);
+
+ sg_chain(ctx->sgaad_send, 2, sgin);
+ //sg_unmark_end(&sgin[pages - 1]);
+ sg_chain(sgin, pages + 1, ctx->sgtag_send);
+ ret = sg_nents_for_len(ctx->sgaad_send, bytes + 13 + 16);
+
+ ret = tls_pre_encrypt(sk, bytes);
+ if (ret < 0)
+ goto out;
+
+ tls_fill_prepend(tls_ctx,
+ page_address(skb_frag_page(&ctx->tx_frag)),
+ bytes, record_type);
+
+ skb_frag_size_set(&ctx->tx_frag, bytes);
+
+ ret = tls_do_encryption(sk,
+ ctx->sgaad_send,
+ ctx->sg_tx_data,
+ bytes, NULL);
+
+ if (ret < 0)
+ goto out;
+
+out:
+ if (ret < 0) {
+ sk->sk_err = EPIPE;
+ return ret;
+ }
+
+ return 0;
+}
+
+static int tls_push(struct sock *sk, unsigned char record_type)
+{
+ struct tls_context *tls_ctx = tls_get_ctx(sk);
+ struct tls_sw_context *ctx = tls_sw_ctx(tls_ctx);
+ int bytes = min_t(int, ctx->unsent, (int)TLS_MAX_PAYLOAD_SIZE);
+ int nsg, ret = 0;
+ struct sk_buff *head = skb_peek(&ctx->tx_queue);
+
+ if (!head)
+ return 0;
+
+ bytes = min_t(int, bytes, head->len);
+
+ sg_init_table(ctx->sg_tx_data2, ARRAY_SIZE(ctx->sg_tx_data2));
+ nsg = skb_to_sgvec(head, &ctx->sg_tx_data2[0], 0, bytes);
+
+ /* The length of sg into decryption must not be over
+ * ALG_MAX_PAGES. The aad takes the first sg, so the payload
+ * must be less than ALG_MAX_PAGES - 1
+ */
+ if (nsg > ALG_MAX_PAGES - 1) {
+ ret = -EBADMSG;
+ goto out;
+ }
+
+ tls_make_aad(sk, 0, ctx->aad_send, bytes, tls_ctx->iv, record_type);
+
+ sg_chain(ctx->sgaad_send, 2, ctx->sg_tx_data2);
+ sg_chain(ctx->sg_tx_data2,
+ nsg + 1,
+ ctx->sgtag_send);
+
+ ret = tls_pre_encrypt(sk, bytes);
+ if (ret < 0)
+ goto out;
+
+ tls_fill_prepend(tls_ctx,
+ page_address(skb_frag_page(&ctx->tx_frag)),
+ bytes, record_type);
+
+ skb_frag_size_set(&ctx->tx_frag, bytes);
+ tls_ctx->pending_offset = 0;
+ head->sk = sk;
+
+ ret = tls_do_encryption(sk,
+ ctx->sgaad_send,
+ ctx->sg_tx_data,
+ bytes, head);
+
+ if (ret < 0)
+ goto out;
+
+out:
+ if (ret < 0) {
+ sk->sk_err = EPIPE;
+ return ret;
+ }
+
+ return 0;
+}
+
+static int zerocopy_from_iter(struct iov_iter *from,
+ struct scatterlist *sg, int *bytes)
+{
+ //int len = iov_iter_count(from);
+ int n = 0;
+
+ if (bytes)
+ *bytes = 0;
+
+ //TODO pass in number of pages
+ while (iov_iter_count(from) && n < MAX_SKB_FRAGS - 1) {
+ struct page *pages[MAX_SKB_FRAGS];
+ size_t start;
+ ssize_t copied;
+ int j = 0;
+
+ if (bytes && *bytes >= TLS_MAX_PAYLOAD_SIZE)
+ break;
+
+ copied = iov_iter_get_pages(from, pages, TLS_MAX_PAYLOAD_SIZE,
+ MAX_SKB_FRAGS - n, &start);
+ if (bytes)
+ *bytes += copied;
+ if (copied < 0)
+ return -EFAULT;
+
+ iov_iter_advance(from, copied);
+
+ while (copied) {
+ int size = min_t(int, copied, PAGE_SIZE - start);
+
+ sg_set_page(&sg[n], pages[j], size, start);
+ start = 0;
+ copied -= size;
+ j++;
+ n++;
+ }
+ }
+ return n;
+}
+
+int tls_sw_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
+{
+ struct tls_context *tls_ctx = tls_get_ctx(sk);
+ struct tls_sw_context *ctx = tls_sw_ctx(tls_ctx);
+ int ret = 0;
+ long timeo = sock_sndtimeo(sk, msg->msg_flags & MSG_DONTWAIT);
+ bool eor = !(msg->msg_flags & MSG_MORE);
+ struct sk_buff *skb = NULL;
+ size_t copy, copied = 0;
+ unsigned char record_type = TLS_RECORD_TYPE_DATA;
+
+ lock_sock(sk);
+
+ if (msg->msg_flags & MSG_OOB) {
+ if (!eor || ctx->unsent) {
+ ret = -EINVAL;
+ goto send_end;
+ }
+
+ ret = copy_from_iter(&record_type, 1, &msg->msg_iter);
+ if (ret != 1) {
+ ret = -EFAULT;
+ goto send_end;
+ }
+ }
+
+ while (msg_data_left(msg)) {
+ bool merge = true;
+ int i;
+ struct page_frag *pfrag;
+
+ if (sk->sk_err)
+ goto send_end;
+ if (!sk_stream_memory_free(sk))
+ goto wait_for_memory;
+
+ skb = skb_peek_tail(&ctx->tx_queue);
+ // Try for zerocopy
+ if (!skb && !skb_frag_page(&ctx->tx_frag) && eor) {
+ int pages;
+ int err;
+ // TODO can send partial pages?
+ int page_count = iov_iter_npages(&msg->msg_iter,
+ ALG_MAX_PAGES);
+ struct scatterlist sgin[ALG_MAX_PAGES + 1];
+ int bytes;
+
+ sg_init_table(sgin, ALG_MAX_PAGES + 1);
+
+ if (page_count >= ALG_MAX_PAGES)
+ goto reg_send;
+
+ // TODO check pages?
+ err = zerocopy_from_iter(&msg->msg_iter, &sgin[0],
+ &bytes);
+ pages = err;
+ ctx->unsent += bytes;
+ if (err < 0)
+ goto send_end;
+
+ // Try to send msg
+ tls_push_zerocopy(sk, sgin, pages, bytes, record_type);
+ for (; pages > 0; pages--)
+ put_page(sg_page(&sgin[pages - 1]));
+ if (err < 0) {
+ tls_err_abort(sk);
+ goto send_end;
+ }
+ continue;
+ }
+
+reg_send:
+ while (!skb) {
+ skb = alloc_skb(0, sk->sk_allocation);
+ if (skb)
+ __skb_queue_tail(&ctx->tx_queue, skb);
+ }
+
+ i = skb_shinfo(skb)->nr_frags;
+ pfrag = sk_page_frag(sk);
+
+ if (!sk_page_frag_refill(sk, pfrag))
+ goto wait_for_memory;
+
+ if (!skb_can_coalesce(skb, i, pfrag->page,
+ pfrag->offset)) {
+ if (i == ALG_MAX_PAGES) {
+ struct sk_buff *tskb;
+
+ tskb = alloc_skb(0, sk->sk_allocation);
+ if (!tskb)
+ goto wait_for_memory;
+
+ if (skb)
+ skb->next = tskb;
+ else
+ __skb_queue_tail(&ctx->tx_queue,
+ tskb);
+
+ skb = tskb;
+ skb->ip_summed = CHECKSUM_UNNECESSARY;
+ continue;
+ }
+ merge = false;
+ }
+
+ copy = min_t(int, msg_data_left(msg),
+ pfrag->size - pfrag->offset);
+ copy = min_t(int, copy, TLS_MAX_PAYLOAD_SIZE - ctx->unsent);
+
+ if (!sk_wmem_schedule(sk, copy))
+ goto wait_for_memory;
+
+ ret = skb_copy_to_page_nocache(sk, &msg->msg_iter, skb,
+ pfrag->page,
+ pfrag->offset,
+ copy);
+ ctx->wmem_len += copy;
+ if (ret)
+ goto send_end;
+
+ /* Update the skb. */
+ if (merge) {
+ skb_frag_size_add(&skb_shinfo(skb)->frags[i - 1], copy);
+ } else {
+ skb_fill_page_desc(skb, i, pfrag->page,
+ pfrag->offset, copy);
+ get_page(pfrag->page);
+ }
+
+ pfrag->offset += copy;
+ copied += copy;
+ ctx->unsent += copy;
+
+ if (ctx->unsent >= TLS_MAX_PAYLOAD_SIZE) {
+ ret = tls_push(sk, record_type);
+ if (ret)
+ goto send_end;
+ }
+
+ continue;
+
+wait_for_memory:
+ ret = tls_push(sk, record_type);
+ if (ret)
+ goto send_end;
+//push_wait:
+ set_bit(SOCK_NOSPACE, &sk->sk_socket->flags);
+ ret = sk_stream_wait_memory(sk, &timeo);
+ if (ret)
+ goto send_end;
+ }
+
+ if (eor)
+ ret = tls_push(sk, record_type);
+
+send_end:
+ ret = sk_stream_error(sk, msg->msg_flags, ret);
+
+ /* make sure we wake any epoll edge trigger waiter */
+ if (unlikely(skb_queue_len(&ctx->tx_queue) == 0 && ret == -EAGAIN))
+ sk->sk_write_space(sk);
+
+ release_sock(sk);
+ return ret < 0 ? ret : size;
+}
+
+void tls_sw_sk_destruct(struct sock *sk)
+{
+ struct tls_context *tls_ctx = tls_get_ctx(sk);
+ struct tls_sw_context *ctx = tls_sw_ctx(tls_ctx);
+ struct page *tx_page = skb_frag_page(&ctx->tx_frag);
+
+ crypto_free_aead(ctx->aead_send);
+
+ if (tx_page)
+ __free_pages(tx_page, ctx->order_npages);
+
+ skb_queue_purge(&ctx->tx_queue);
+ tls_sk_destruct(sk, tls_ctx);
+}
+
+int tls_set_sw_offload(struct sock *sk, struct tls_context *ctx)
+{
+ char keyval[TLS_CIPHER_AES_GCM_128_KEY_SIZE +
+ TLS_CIPHER_AES_GCM_128_SALT_SIZE];
+ struct tls_crypto_info *crypto_info;
+ struct tls_crypto_info_aes_gcm_128 *gcm_128_info;
+ struct tls_sw_context *sw_ctx;
+ u16 nonce_size, tag_size, iv_size;
+ char *iv;
+ int rc = 0;
+
+ if (!ctx) {
+ rc = -EINVAL;
+ goto out;
+ }
+
+ if (ctx->priv_ctx) {
+ rc = -EEXIST;
+ goto out;
+ }
+
+ sw_ctx = kzalloc(sizeof(*sw_ctx), GFP_KERNEL);
+ if (!sw_ctx) {
+ rc = -ENOMEM;
+ goto out;
+ }
+
+ ctx->priv_ctx = (struct tls_offload_context *)sw_ctx;
+
+ crypto_info = &ctx->crypto_send;
+ switch (crypto_info->cipher_type) {
+ case TLS_CIPHER_AES_GCM_128: {
+ nonce_size = TLS_CIPHER_AES_GCM_128_IV_SIZE;
+ tag_size = TLS_CIPHER_AES_GCM_128_TAG_SIZE;
+ iv_size = TLS_CIPHER_AES_GCM_128_IV_SIZE;
+ iv = ((struct tls_crypto_info_aes_gcm_128 *)crypto_info)->iv;
+ gcm_128_info =
+ (struct tls_crypto_info_aes_gcm_128 *)crypto_info;
+ break;
+ }
+ default:
+ rc = -EINVAL;
+ goto out;
+ }
+
+ ctx->prepand_size = TLS_HEADER_SIZE + nonce_size;
+ ctx->tag_size = tag_size;
+ ctx->iv_size = iv_size;
+ ctx->iv = kmalloc(iv_size, GFP_KERNEL);
+ if (!ctx->iv) {
+ rc = -ENOMEM;
+ goto out;
+ }
+ memcpy(ctx->iv, iv, iv_size);
+
+ /* Preallocation for sending
+ * scatterlist: AAD | data | TAG (for crypto API)
+ * vec: HEADER | data | TAG
+ */
+ sg_init_table(sw_ctx->sg_tx_data, TLS_SG_DATA_SIZE);
+ sg_set_buf(&sw_ctx->sg_tx_data[0], sw_ctx->aad_send,
+ sizeof(sw_ctx->aad_send));
+
+ sg_set_buf(sw_ctx->sg_tx_data + TLS_SG_DATA_SIZE - 2,
+ sw_ctx->tag_send, sizeof(sw_ctx->tag_send));
+ sg_mark_end(sw_ctx->sg_tx_data + TLS_SG_DATA_SIZE - 1);
+
+ sg_init_table(sw_ctx->sgaad_send, 2);
+ sg_init_table(sw_ctx->sgtag_send, 2);
+
+ sg_set_buf(&sw_ctx->sgaad_send[0], sw_ctx->aad_send,
+ sizeof(sw_ctx->aad_send));
+ /* chaining to tag is performed on actual data size when sending */
+ sg_set_buf(&sw_ctx->sgtag_send[0], sw_ctx->tag_send,
+ sizeof(sw_ctx->tag_send));
+
+ sg_unmark_end(&sw_ctx->sgaad_send[1]);
+
+ if (!sw_ctx->aead_send) {
+ sw_ctx->aead_send =
+ crypto_alloc_aead("rfc5288(gcm(aes))",
+ CRYPTO_ALG_INTERNAL, 0);
+ if (IS_ERR(sw_ctx->aead_send)) {
+ rc = PTR_ERR(sw_ctx->aead_send);
+ sw_ctx->aead_send = NULL;
+ pr_err("bind fail\n"); // TODO
+ goto out;
+ }
+ }
+
+ sk->sk_destruct = tls_sw_sk_destruct;
+ sw_ctx->sk_write_space = ctx->sk_write_space;
+ ctx->sk_write_space = tls_release_tx_frag;
+
+ skb_queue_head_init(&sw_ctx->tx_queue);
+ sw_ctx->sk = sk;
+
+ memcpy(keyval, gcm_128_info->key, TLS_CIPHER_AES_GCM_128_KEY_SIZE);
+ memcpy(keyval + TLS_CIPHER_AES_GCM_128_KEY_SIZE, gcm_128_info->salt,
+ TLS_CIPHER_AES_GCM_128_SALT_SIZE);
+
+ rc = crypto_aead_setkey(sw_ctx->aead_send, keyval,
+ TLS_CIPHER_AES_GCM_128_KEY_SIZE +
+ TLS_CIPHER_AES_GCM_128_SALT_SIZE);
+ if (rc)
+ goto out;
+
+ rc = crypto_aead_setauthsize(sw_ctx->aead_send, TLS_TAG_SIZE);
+ if (rc)
+ goto out;
+
+out:
+ return rc;
+}
+
+int tls_sw_sendpage(struct sock *sk, struct page *page,
+ int offset, size_t size, int flags)
+{
+ struct tls_context *tls_ctx = tls_get_ctx(sk);
+ struct tls_sw_context *ctx = tls_sw_ctx(tls_ctx);
+ int ret = 0, i;
+ long timeo = sock_sndtimeo(sk, flags & MSG_DONTWAIT);
+ bool eor;
+ struct sk_buff *skb = NULL;
+ size_t queued = 0;
+ unsigned char record_type = TLS_RECORD_TYPE_DATA;
+
+ if (flags & MSG_SENDPAGE_NOTLAST)
+ flags |= MSG_MORE;
+
+ /* No MSG_EOR from splice, only look at MSG_MORE */
+ eor = !(flags & MSG_MORE);
+
+ lock_sock(sk);
+
+ if (flags & MSG_OOB) {
+ ret = -ENOTSUPP;
+ goto sendpage_end;
+ }
+ sk_clear_bit(SOCKWQ_ASYNC_NOSPACE, sk);
+
+ /* Call the sk_stream functions to manage the sndbuf mem. */
+ while (size > 0) {
+ size_t send_size = min(size, TLS_MAX_PAYLOAD_SIZE);
+
+ if (!sk_stream_memory_free(sk) ||
+ (ctx->unsent + send_size > TLS_MAX_PAYLOAD_SIZE)) {
+ ret = tls_push(sk, record_type);
+ if (ret)
+ goto sendpage_end;
+ set_bit(SOCK_NOSPACE, &sk->sk_socket->flags);
+ ret = sk_stream_wait_memory(sk, &timeo);
+ if (ret)
+ goto sendpage_end;
+ }
+
+ if (sk->sk_err)
+ goto sendpage_end;
+
+ skb = skb_peek_tail(&ctx->tx_queue);
+ if (skb) {
+ i = skb_shinfo(skb)->nr_frags;
+
+ if (skb_can_coalesce(skb, i, page, offset)) {
+ skb_frag_size_add(
+ &skb_shinfo(skb)->frags[i - 1],
+ send_size);
+ skb_shinfo(skb)->tx_flags |= SKBTX_SHARED_FRAG;
+ goto coalesced;
+ }
+
+ if (i >= ALG_MAX_PAGES) {
+ struct sk_buff *tskb;
+
+ tskb = alloc_skb(0, sk->sk_allocation);
+ while (!tskb) {
+ ret = tls_push(sk, record_type);
+ if (ret)
+ goto sendpage_end;
+ set_bit(SOCK_NOSPACE,
+ &sk->sk_socket->flags);
+ ret = sk_stream_wait_memory(sk, &timeo);
+ if (ret)
+ goto sendpage_end;
+
+ tskb = alloc_skb(0, sk->sk_allocation);
+ }
+
+ if (skb)
+ skb->next = tskb;
+ else
+ __skb_queue_tail(&ctx->tx_queue,
+ tskb);
+ skb = tskb;
+ i = 0;
+ }
+ } else {
+ skb = alloc_skb(0, sk->sk_allocation);
+ __skb_queue_tail(&ctx->tx_queue, skb);
+ i = 0;
+ }
+
+ get_page(page);
+ skb_fill_page_desc(skb, i, page, offset, send_size);
+ skb_shinfo(skb)->tx_flags |= SKBTX_SHARED_FRAG;
+
+coalesced:
+ skb->len += send_size;
+ skb->data_len += send_size;
+ skb->truesize += send_size;
+ sk->sk_wmem_queued += send_size;
+ ctx->wmem_len += send_size;
+ sk_mem_charge(sk, send_size);
+ ctx->unsent += send_size;
+ queued += send_size;
+ offset += send_size;
+ size -= send_size;
+
+ if (eor || ctx->unsent >= TLS_MAX_PAYLOAD_SIZE) {
+ ret = tls_push(sk, record_type);
+ if (ret)
+ goto sendpage_end;
+ }
+ }
+
+ if (eor || ctx->unsent >= TLS_MAX_PAYLOAD_SIZE)
+ ret = tls_push(sk, record_type);
+
+sendpage_end:
+ ret = sk_stream_error(sk, flags, ret);
+
+ release_sock(sk);
+
+ return ret < 0 ? ret : queued;
+}
--
2.7.4

2017-03-28 13:26:29

by Aviad Yehezkel

Subject: [RFC TLS Offload Support 12/15] mlx/tls: Enable MLX5_CORE_QP_SIM mode for tls

Signed-off-by: Aviad Yehezkel <[email protected]>
Signed-off-by: Ilya Lesokhin <[email protected]>
---
drivers/net/ethernet/mellanox/accelerator/tls/tls.c | 6 ++++++
drivers/net/ethernet/mellanox/accelerator/tls/tls_sysfs.c | 2 ++
drivers/net/ethernet/mellanox/accelerator/tls/tls_sysfs.h | 2 ++
3 files changed, 10 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/accelerator/tls/tls.c b/drivers/net/ethernet/mellanox/accelerator/tls/tls.c
index 07a4b67..3560f784 100644
--- a/drivers/net/ethernet/mellanox/accelerator/tls/tls.c
+++ b/drivers/net/ethernet/mellanox/accelerator/tls/tls.c
@@ -494,9 +494,11 @@ static struct sk_buff *mlx_tls_rx_handler(struct sk_buff *skb, u8 *rawpet,
static void mlx_tls_free(struct mlx_tls_dev *dev)
{
list_del(&dev->accel_dev_list);
+#if IS_ENABLED(CONFIG_MLX5_CORE_FPGA_QP_SIM)
#ifdef MLX_TLS_SADB_RDMA
kobject_put(&dev->kobj);
#endif
+#endif
dev_put(dev->netdev);
kfree(dev);
}
@@ -592,6 +594,7 @@ int mlx_tls_add_one(struct mlx_accel_core_device *accel_device)
goto err_netdev;
}

+#if IS_ENABLED(CONFIG_MLX5_CORE_FPGA_QP_SIM)
#ifdef MLX_TLS_SADB_RDMA
ret = tls_sysfs_init_and_add(&dev->kobj,
mlx_accel_core_kobj(dev->accel_device),
@@ -603,6 +606,7 @@ int mlx_tls_add_one(struct mlx_accel_core_device *accel_device)
goto err_ops_register;
}
#endif
+#endif

mutex_lock(&mlx_tls_mutex);
list_add(&dev->accel_dev_list, &mlx_tls_devs);
@@ -611,10 +615,12 @@ int mlx_tls_add_one(struct mlx_accel_core_device *accel_device)
dev->netdev->tlsdev_ops = &mlx_tls_ops;
goto out;

+#if IS_ENABLED(CONFIG_MLX5_CORE_FPGA_QP_SIM)
#ifdef MLX_TLS_SADB_RDMA
err_ops_register:
mlx_accel_core_client_ops_unregister(accel_device);
#endif
+#endif
err_netdev:
dev_put(netdev);
err_conn:
diff --git a/drivers/net/ethernet/mellanox/accelerator/tls/tls_sysfs.c b/drivers/net/ethernet/mellanox/accelerator/tls/tls_sysfs.c
index 2860fc3..76ba784 100644
--- a/drivers/net/ethernet/mellanox/accelerator/tls/tls_sysfs.c
+++ b/drivers/net/ethernet/mellanox/accelerator/tls/tls_sysfs.c
@@ -36,6 +36,7 @@
#include "tls_sysfs.h"
#include "tls_cmds.h"

+#if IS_ENABLED(CONFIG_MLX5_CORE_FPGA_QP_SIM)
#ifdef MLX_TLS_SADB_RDMA
struct mlx_tls_attribute {
struct attribute attr;
@@ -192,3 +193,4 @@ int tls_sysfs_init_and_add(struct kobject *kobj, struct kobject *parent,
fmt, arg);
}
#endif
+#endif
diff --git a/drivers/net/ethernet/mellanox/accelerator/tls/tls_sysfs.h b/drivers/net/ethernet/mellanox/accelerator/tls/tls_sysfs.h
index bfaa857..d7c3185 100644
--- a/drivers/net/ethernet/mellanox/accelerator/tls/tls_sysfs.h
+++ b/drivers/net/ethernet/mellanox/accelerator/tls/tls_sysfs.h
@@ -37,9 +37,11 @@

#include "tls.h"

+#if IS_ENABLED(CONFIG_MLX5_CORE_FPGA_QP_SIM)
#ifdef MLX_TLS_SADB_RDMA
int tls_sysfs_init_and_add(struct kobject *kobj, struct kobject *parent,
const char *fmt, char *arg);
#endif
+#endif

#endif /* __TLS_SYSFS_H__ */
--
2.7.4

2017-03-28 13:26:23

by Aviad Yehezkel

Subject: [RFC TLS Offload Support 06/15] tls: tls offload support

This patch introduces TX HW offload.

tls_main: contains generic logic that will be shared by both
SW and HW implementations.
tls_device: contains generic HW logic that is shared by all
HW offload implementations.
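
For a NIC driver, the intended TX-path check looks roughly like this sketch, using tls_is_sk_tx_device_offloaded() from the header below:

#include <net/tls.h>

/* Sketch: in the driver xmit path, detect whether the skb belongs to
 * a TX-offloaded TLS socket before applying device crypto handling. */
static bool skb_is_tls_tx_offloaded(const struct sk_buff *skb)
{
	return skb->sk && tls_is_sk_tx_device_offloaded(skb->sk);
}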

Signed-off-by: Boris Pismenny <[email protected]>
Signed-off-by: Ilya Lesokhin <[email protected]>
Signed-off-by: Aviad Yehezkel <[email protected]>
---
MAINTAINERS | 13 +
include/net/tls.h | 184 ++++++++++++++
include/uapi/linux/Kbuild | 1 +
include/uapi/linux/tls.h | 84 +++++++
net/Kconfig | 1 +
net/Makefile | 1 +
net/tls/Kconfig | 12 +
net/tls/Makefile | 7 +
net/tls/tls_device.c | 594 ++++++++++++++++++++++++++++++++++++++++++++++
net/tls/tls_main.c | 348 +++++++++++++++++++++++++++
10 files changed, 1245 insertions(+)
create mode 100644 include/net/tls.h
create mode 100644 include/uapi/linux/tls.h
create mode 100644 net/tls/Kconfig
create mode 100644 net/tls/Makefile
create mode 100644 net/tls/tls_device.c
create mode 100644 net/tls/tls_main.c

diff --git a/MAINTAINERS b/MAINTAINERS
index b340ef6..e3b70c3 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -8486,6 +8486,19 @@ F: net/ipv6/
F: include/net/ip*
F: arch/x86/net/*

+NETWORKING [TLS]
+M: Ilya Lesokhin <[email protected]>
+M: Aviad Yehezkel <[email protected]>
+M: Boris Pismenny <[email protected]>
+M: Haggai Eran <[email protected]>
+L: [email protected]
+T: git git://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git
+T: git git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next.git
+S: Maintained
+F: net/tls/*
+F: include/uapi/linux/tls.h
+F: include/net/tls.h
+
NETWORKING [IPSEC]
M: Steffen Klassert <[email protected]>
M: Herbert Xu <[email protected]>
diff --git a/include/net/tls.h b/include/net/tls.h
new file mode 100644
index 0000000..f7f0cde
--- /dev/null
+++ b/include/net/tls.h
@@ -0,0 +1,184 @@
+/* Copyright (c) 2016-2017, Mellanox Technologies All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * - Neither the name of the Mellanox Technologies nor the
+ * names of its contributors may be used to endorse or promote
+ * products derived from this software without specific prior written
+ * permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
+ * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED.
+ * IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR
+ * ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+ * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
+ * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING
+ * IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE
+ */
+
+#ifndef _TLS_OFFLOAD_H
+#define _TLS_OFFLOAD_H
+
+#include <linux/types.h>
+
+#include <uapi/linux/tls.h>
+
+
+/* Maximum data size carried in a TLS record */
+#define TLS_MAX_PAYLOAD_SIZE ((size_t)1 << 14)
+
+#define TLS_HEADER_SIZE 5
+#define TLS_NONCE_OFFSET TLS_HEADER_SIZE
+
+#define TLS_CRYPTO_INFO_READY(info) ((info)->cipher_type)
+#define TLS_IS_STATE_HW(info) ((info)->state == TLS_STATE_HW)
+
+#define TLS_RECORD_TYPE_DATA 0x17
+
+
+struct tls_record_info {
+ struct list_head list;
+ u32 end_seq;
+ int len;
+ int num_frags;
+ skb_frag_t frags[MAX_SKB_FRAGS];
+};
+
+struct tls_offload_context {
+ struct list_head records_list;
+ struct tls_record_info *open_record;
+ struct tls_record_info *retransmit_hint;
+ u32 expectedSN;
+ spinlock_t lock; /* protects records list */
+};
+
+struct tls_context {
+ union {
+ struct tls_crypto_info crypto_send;
+ struct tls_crypto_info_aes_gcm_128 crypto_send_aes_gcm_128;
+ };
+
+ void *priv_ctx;
+
+ u16 prepand_size;
+ u16 tag_size;
+ u16 iv_size;
+ char *iv;
+
+ /* TODO: change sw code to use below fields and push_frags function */
+ skb_frag_t *pending_frags;
+ u16 num_pending_frags;
+ u16 pending_offset;
+
+ void (*sk_write_space)(struct sock *sk);
+ void (*sk_destruct)(struct sock *sk);
+};
+
+
+int tls_sk_query(struct sock *sk, int optname, char __user *optval,
+ int __user *optlen);
+int tls_sk_attach(struct sock *sk, int optname, char __user *optval,
+ unsigned int optlen);
+
+void tls_clear_device_offload(struct sock *sk, struct tls_context *ctx);
+int tls_set_device_offload(struct sock *sk, struct tls_context *ctx);
+int tls_device_sendmsg(struct sock *sk, struct msghdr *msg, size_t size);
+int tls_device_sendpage(struct sock *sk, struct page *page,
+ int offset, size_t size, int flags);
+
+struct tls_record_info *tls_get_record(struct tls_offload_context *context,
+ u32 seq);
+
+void tls_sk_destruct(struct sock *sk, struct tls_context *ctx);
+void tls_icsk_clean_acked(struct sock *sk);
+
+void tls_device_sk_destruct(struct sock *sk);
+
+
+int tls_push_frags(struct sock *sk, struct tls_context *ctx,
+ skb_frag_t *frag, u16 num_frags, u16 first_offset,
+ int flags);
+int tls_push_paritial_record(struct sock *sk, struct tls_context *ctx,
+ int flags);
+
+static inline bool tls_is_pending_open_record(struct tls_context *ctx)
+{
+ return !!ctx->num_pending_frags;
+}
+
+static inline bool tls_is_sk_tx_device_offloaded(struct sock *sk)
+{
+ return smp_load_acquire(&sk->sk_destruct) ==
+ &tls_device_sk_destruct;
+}
+
+static inline void tls_err_abort(struct sock *sk)
+{
+ xchg(&sk->sk_err, EBADMSG);
+ sk->sk_error_report(sk);
+}
+
+static inline void tls_increment_seqno(unsigned char *seq, struct sock *sk)
+{
+ int i;
+
+ for (i = 7; i >= 0; i--) {
+ ++seq[i];
+ if (seq[i] != 0)
+ break;
+ }
+
+ if (i == -1)
+ tls_err_abort(sk);
+}
+
+static inline void tls_fill_prepend(struct tls_context *ctx,
+ char *buf,
+ size_t plaintext_len,
+ unsigned char record_type)
+{
+ size_t pkt_len, iv_size = ctx->iv_size;
+
+ pkt_len = plaintext_len + iv_size + ctx->tag_size;
+
+ /* we cover the explicit nonce here as well, so buf should be of
+ * size TLS_HEADER_SIZE + the explicit nonce size
+ */
+ buf[0] = record_type;
+ buf[1] = TLS_VERSION_MINOR(ctx->crypto_send.version);
+ buf[2] = TLS_VERSION_MAJOR(ctx->crypto_send.version);
+ /* we can use IV for nonce explicit according to spec */
+ buf[3] = pkt_len >> 8;
+ buf[4] = pkt_len & 0xFF;
+ memcpy(buf + TLS_NONCE_OFFSET, ctx->iv, iv_size);
+}
+
+static inline struct tls_context *tls_get_ctx(const struct sock *sk)
+{
+ return sk->sk_user_data;
+}
+
+static inline struct tls_offload_context *tls_offload_ctx(
+ const struct tls_context *tls_ctx)
+{
+ return (struct tls_offload_context *)tls_ctx->priv_ctx;
+}
+
+
+#endif /* _TLS_OFFLOAD_H */
diff --git a/include/uapi/linux/Kbuild b/include/uapi/linux/Kbuild
index cd2be1c..96ae5ca 100644
--- a/include/uapi/linux/Kbuild
+++ b/include/uapi/linux/Kbuild
@@ -406,6 +406,7 @@ header-y += sysinfo.h
header-y += target_core_user.h
header-y += taskstats.h
header-y += tcp.h
+header-y += tls.h
header-y += tcp_metrics.h
header-y += telephony.h
header-y += termios.h
diff --git a/include/uapi/linux/tls.h b/include/uapi/linux/tls.h
new file mode 100644
index 0000000..464621b
--- /dev/null
+++ b/include/uapi/linux/tls.h
@@ -0,0 +1,84 @@
+/* Copyright (c) 2016-2017, Mellanox Technologies All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * - Neither the name of the Mellanox Technologies nor the
+ * names of its contributors may be used to endorse or promote
+ * products derived from this software without specific prior written
+ * permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
+ * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED.
+ * IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR
+ * ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+ * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
+ * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING
+ * IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE
+ */
+
+#ifndef _UAPI_LINUX_TLS_H
+#define _UAPI_LINUX_TLS_H
+
+#include <linux/types.h>
+#include <asm/byteorder.h>
+#include <linux/socket.h>
+#include <linux/tcp.h>
+
+/* Supported versions */
+#define TLS_VERSION_MINOR(ver) ((ver) & 0xFF)
+#define TLS_VERSION_MAJOR(ver) (((ver) >> 8) & 0xFF)
+
+#define TLS_VERSION_NUMBER(id) ((((id##_VERSION_MAJOR) & 0xFF) << 8) | \
+ ((id##_VERSION_MINOR) & 0xFF))
+
+#define TLS_1_2_VERSION_MAJOR 0x3
+#define TLS_1_2_VERSION_MINOR 0x3
+#define TLS_1_2_VERSION TLS_VERSION_NUMBER(TLS_1_2)
+
+/* Supported ciphers */
+#define TLS_CIPHER_AES_GCM_128 51
+#define TLS_CIPHER_AES_GCM_128_IV_SIZE ((size_t)8)
+#define TLS_CIPHER_AES_GCM_128_KEY_SIZE ((size_t)16)
+#define TLS_CIPHER_AES_GCM_128_SALT_SIZE ((size_t)4)
+#define TLS_CIPHER_AES_GCM_128_TAG_SIZE ((size_t)16)
+
+struct tls_ctrlmsg {
+ unsigned char type;
+ unsigned char data[0];
+} __attribute__((packed));
+
+enum tls_state {
+ TLS_STATE_SW = 0x0,
+ TLS_STATE_HW = 0x1,
+};
+
+struct tls_crypto_info {
+ __u16 version;
+ __u16 cipher_type;
+ __u32 state;
+};
+
+struct tls_crypto_info_aes_gcm_128 {
+ struct tls_crypto_info info;
+ unsigned char iv[TLS_CIPHER_AES_GCM_128_IV_SIZE];
+ unsigned char key[TLS_CIPHER_AES_GCM_128_KEY_SIZE];
+ unsigned char salt[TLS_CIPHER_AES_GCM_128_SALT_SIZE];
+};
+
+#endif /* _UAPI_LINUX_TLS_H */
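For illustration, userspace hands the negotiated keys to the kernel roughly as follows once the handshake completes (a sketch: the TCP_TLS_TX option is introduced later in this series, the SOL_TCP level is an assumption, and error handling is omitted):

#include <sys/socket.h>
#include <netinet/tcp.h>
#include <string.h>
#include <linux/tls.h>

/* fd: connected TCP socket; iv/key/salt: from the userspace handshake */
static int tls_tx_attach(int fd, const unsigned char *iv,
			 const unsigned char *key, const unsigned char *salt)
{
	struct tls_crypto_info_aes_gcm_128 ci = {
		.info = {
			.version = TLS_1_2_VERSION,
			.cipher_type = TLS_CIPHER_AES_GCM_128,
			.state = TLS_STATE_HW,
		},
	};

	memcpy(ci.iv, iv, TLS_CIPHER_AES_GCM_128_IV_SIZE);
	memcpy(ci.key, key, TLS_CIPHER_AES_GCM_128_KEY_SIZE);
	memcpy(ci.salt, salt, TLS_CIPHER_AES_GCM_128_SALT_SIZE);

	/* SOL_TCP level is an assumption; TCP_TLS_TX is added by this series */
	return setsockopt(fd, SOL_TCP, TCP_TLS_TX, &ci, sizeof(ci));
}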
diff --git a/net/Kconfig b/net/Kconfig
index a100500..b50e899 100644
--- a/net/Kconfig
+++ b/net/Kconfig
@@ -55,6 +55,7 @@ menu "Networking options"

source "net/packet/Kconfig"
source "net/unix/Kconfig"
+source "net/tls/Kconfig"
source "net/xfrm/Kconfig"
source "net/iucv/Kconfig"

diff --git a/net/Makefile b/net/Makefile
index 4cafaa2..23da6df 100644
--- a/net/Makefile
+++ b/net/Makefile
@@ -15,6 +15,7 @@ obj-$(CONFIG_LLC) += llc/
obj-$(CONFIG_NET) += ethernet/ 802/ sched/ netlink/
obj-$(CONFIG_NETFILTER) += netfilter/
obj-$(CONFIG_INET) += ipv4/
+obj-$(CONFIG_TLS) += tls/
obj-$(CONFIG_XFRM) += xfrm/
obj-$(CONFIG_UNIX) += unix/
obj-$(CONFIG_NET) += ipv6/
diff --git a/net/tls/Kconfig b/net/tls/Kconfig
new file mode 100644
index 0000000..75bfb43
--- /dev/null
+++ b/net/tls/Kconfig
@@ -0,0 +1,12 @@
+#
+# TLS configuration
+#
+config TLS
+ tristate "Transport Layer Security support"
+ depends on NET
+ default m
+ ---help---
+ Enable kernel support for the TLS protocol. This allows
+ processing of TLS records in the kernel as well as offloading
+ them to capable hardware.
+
+ If unsure, say N.
diff --git a/net/tls/Makefile b/net/tls/Makefile
new file mode 100644
index 0000000..65e5677
--- /dev/null
+++ b/net/tls/Makefile
@@ -0,0 +1,7 @@
+#
+# Makefile for the TLS subsystem.
+#
+
+obj-$(CONFIG_TLS) += tls.o
+
+tls-y := tls_main.o tls_device.o
diff --git a/net/tls/tls_device.c b/net/tls/tls_device.c
new file mode 100644
index 0000000..77a4a59
--- /dev/null
+++ b/net/tls/tls_device.c
@@ -0,0 +1,594 @@
+/* Copyright (c) 2016-2017, Mellanox Technologies All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * - Neither the name of the Mellanox Technologies nor the
+ * names of its contributors may be used to endorse or promote
+ * products derived from this software without specific prior written
+ * permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
+ * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED.
+ * IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR
+ * ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+ * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
+ * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING
+ * IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE
+ */
+
+#include <linux/module.h>
+#include <net/tcp.h>
+#include <net/inet_common.h>
+#include <linux/highmem.h>
+#include <linux/netdevice.h>
+
+#include <net/tls.h>
+
+/* We assume that the socket is already connected */
+static struct net_device *get_netdev_for_sock(struct sock *sk)
+{
+ struct inet_sock *inet = inet_sk(sk);
+ struct net_device *netdev = NULL;
+
+ pr_info("Using output interface 0x%x\n", inet->cork.fl.flowi_oif);
+ netdev = dev_get_by_index(sock_net(sk), inet->cork.fl.flowi_oif);
+
+ return netdev;
+}
+
+static void detach_sock_from_netdev(struct sock *sk, struct tls_context *ctx)
+{
+ struct net_device *netdev;
+
+ netdev = get_netdev_for_sock(sk);
+ if (!netdev) {
+ pr_err("got offloaded socket with no netdev\n");
+ return;
+ }
+
+ if (!netdev->tlsdev_ops) {
+ pr_err("detach_sock_from_netdev: netdev %s with no TLS offload\n",
+ netdev->name);
+ goto out;
+ }
+
+ netdev->tlsdev_ops->tls_dev_del(netdev, sk, TLS_OFFLOAD_CTX_DIR_TX);
+out:
+ dev_put(netdev);
+}
+
+static int attach_sock_to_netdev(struct sock *sk, struct tls_context *ctx)
+{
+ struct net_device *netdev = get_netdev_for_sock(sk);
+ int rc = -EINVAL;
+
+ if (!netdev) {
+ pr_err("attach_sock_to_netdev: netdev not found\n");
+ return rc;
+ }
+
+ if (!netdev->tlsdev_ops) {
+ pr_err("attach_sock_to_netdev: netdev %s with no TLS offload\n",
+ netdev->name);
+ goto out;
+ }
+
+ rc = netdev->tlsdev_ops->tls_dev_add(
+ netdev,
+ sk,
+ TLS_OFFLOAD_CTX_DIR_TX,
+ &ctx->crypto_send,
+ (struct tls_offload_context **)(&ctx->priv_ctx));
+ if (rc) {
+ pr_err("The netdev has refused to offload this socket\n");
+ goto out;
+ }
+
+ sk->sk_bound_dev_if = netdev->ifindex;
+ sk_dst_reset(sk);
+
+ rc = 0;
+out:
+ dev_put(netdev);
+ return rc;
+}
+
+static void destroy_record(struct tls_record_info *record)
+{
+ skb_frag_t *frag;
+ int nr_frags = record->num_frags;
+
+ while (nr_frags > 0) {
+ frag = &record->frags[nr_frags - 1];
+ __skb_frag_unref(frag);
+ --nr_frags;
+ }
+ kfree(record);
+}
+
+static void delete_all_records(struct tls_offload_context *offload_ctx)
+{
+ struct tls_record_info *info, *temp;
+
+ list_for_each_entry_safe(info, temp, &offload_ctx->records_list, list) {
+ list_del(&info->list);
+ destroy_record(info);
+ }
+}
+
+void tls_clear_device_offload(struct sock *sk, struct tls_context *tls_ctx)
+{
+ struct tls_offload_context *ctx = tls_offload_ctx(tls_ctx);
+
+ if (!ctx)
+ return;
+
+ if (ctx->open_record)
+ destroy_record(ctx->open_record);
+
+ delete_all_records(ctx);
+ detach_sock_from_netdev(sk, tls_ctx);
+}
+
+void tls_icsk_clean_acked(struct sock *sk)
+{
+ struct tls_context *tls_ctx = tls_get_ctx(sk);
+ struct tls_offload_context *ctx;
+ struct tcp_sock *tp = tcp_sk(sk);
+ struct tls_record_info *info, *temp;
+ unsigned long flags;
+
+ if (!tls_ctx)
+ return;
+
+ ctx = tls_offload_ctx(tls_ctx);
+
+ spin_lock_irqsave(&ctx->lock, flags);
+ info = ctx->retransmit_hint;
+ if (info && !before(tp->snd_una, info->end_seq)) {
+ ctx->retransmit_hint = NULL;
+ list_del(&info->list);
+ destroy_record(info);
+ }
+
+ list_for_each_entry_safe(info, temp, &ctx->records_list, list) {
+ if (before(tp->snd_una, info->end_seq))
+ break;
+ list_del(&info->list);
+
+ destroy_record(info);
+ }
+
+ spin_unlock_irqrestore(&ctx->lock, flags);
+}
+EXPORT_SYMBOL(tls_icsk_clean_acked);
+
+void tls_device_sk_destruct(struct sock *sk)
+{
+ struct tls_context *ctx = tls_get_ctx(sk);
+
+ tls_clear_device_offload(sk, ctx);
+ tls_sk_destruct(sk, ctx);
+}
+EXPORT_SYMBOL(tls_device_sk_destruct);
+
+static inline void tls_append_frag(struct tls_record_info *record,
+ struct page_frag *pfrag,
+ int size)
+{
+ skb_frag_t *frag;
+
+ frag = &record->frags[record->num_frags - 1];
+ if (frag->page.p == pfrag->page &&
+ frag->page_offset + frag->size == pfrag->offset) {
+ frag->size += size;
+ } else {
+ ++frag;
+ frag->page.p = pfrag->page;
+ frag->page_offset = pfrag->offset;
+ frag->size = size;
+ ++record->num_frags;
+ get_page(pfrag->page);
+ }
+
+ pfrag->offset += size;
+ record->len += size;
+}
+
+static inline int tls_push_record(struct sock *sk,
+ struct tls_context *ctx,
+ struct tls_offload_context *offload_ctx,
+ struct tls_record_info *record,
+ struct page_frag *pfrag,
+ int flags,
+ unsigned char record_type)
+{
+ skb_frag_t *frag;
+ struct tcp_sock *tp = tcp_sk(sk);
+ struct page_frag fallback_frag;
+ struct page_frag *tag_pfrag = pfrag;
+
+ /* fill prepend */
+ frag = &record->frags[0];
+ tls_fill_prepend(ctx,
+ skb_frag_address(frag),
+ record->len - ctx->prepand_size,
+ record_type);
+
+ if (unlikely(!skb_page_frag_refill(
+ ctx->tag_size,
+ pfrag, GFP_KERNEL))) {
+ /* HW doesn't care about the data in the tag, so in case
+ * pfrag has no room for a tag and we can't allocate a new
+ * pfrag, just reuse the page of the first frag rather
+ * than writing complicated fallback code.
+ */
+ tag_pfrag = &fallback_frag;
+ tag_pfrag->page = skb_frag_page(frag);
+ tag_pfrag->offset = 0;
+ }
+
+ tls_append_frag(record, tag_pfrag, ctx->tag_size);
+ record->end_seq = tp->write_seq + record->len;
+ spin_lock_irq(&offload_ctx->lock);
+ list_add_tail(&record->list, &offload_ctx->records_list);
+ spin_unlock_irq(&offload_ctx->lock);
+
+ offload_ctx->open_record = NULL;
+ tls_increment_seqno(ctx->iv, sk);
+
+ /* all ready, send */
+ return tls_push_frags(sk, ctx, record->frags,
+ record->num_frags, 0, flags);
+
+}
+
+static inline int tls_get_new_record(
+ struct tls_offload_context *offload_ctx,
+ struct page_frag *pfrag,
+ size_t prepand_size)
+{
+ skb_frag_t *frag;
+ struct tls_record_info *record;
+
+ /* TODO: do we want to use pfrag to store the record
+ * metadata? The lifetime of the data and metadata is the
+ * same, and we could avoid the kmalloc overhead.
+ */
+ record = kmalloc(sizeof(*record), GFP_KERNEL);
+ if (!record)
+ return -ENOMEM;
+
+ frag = &record->frags[0];
+ __skb_frag_set_page(frag, pfrag->page);
+ frag->page_offset = pfrag->offset;
+ skb_frag_size_set(frag, prepand_size);
+
+ get_page(pfrag->page);
+ pfrag->offset += prepand_size;
+
+ record->num_frags = 1;
+ record->len = prepand_size;
+ offload_ctx->open_record = record;
+ return 0;
+}
+
+static inline int tls_do_allocation(
+ struct sock *sk,
+ struct tls_offload_context *offload_ctx,
+ struct page_frag *pfrag,
+ size_t prepand_size)
+{
+ struct tls_record_info *record;
+
+ if (!sk_page_frag_refill(sk, pfrag))
+ return -ENOMEM;
+
+ record = offload_ctx->open_record;
+ if (!record) {
+ tls_get_new_record(offload_ctx, pfrag, prepand_size);
+ record = offload_ctx->open_record;
+ if (!record)
+ return -ENOMEM;
+ }
+
+ return 0;
+}
+
+static int tls_push_data(struct sock *sk,
+ struct iov_iter *msg_iter,
+ size_t size, int flags,
+ unsigned char record_type)
+{
+ struct tls_context *tls_ctx = tls_get_ctx(sk);
+ struct tls_offload_context *ctx = tls_offload_ctx(tls_ctx);
+ struct tls_record_info *record = ctx->open_record;
+ struct page_frag *pfrag;
+ int copy, rc = 0;
+ size_t orig_size = size;
+ u32 max_open_record_len;
+ long timeo;
+ int more = flags & (MSG_SENDPAGE_NOTLAST | MSG_MORE);
+ int tls_push_record_flags = flags | MSG_SENDPAGE_NOTLAST;
+ bool last = false;
+
+ if (sk->sk_err)
+ return sk->sk_err;
+
+ /* Only one writer at a time is allowed */
+ if (sk->sk_write_pending)
+ return -EBUSY;
+ timeo = sock_sndtimeo(sk, flags & MSG_DONTWAIT);
+ pfrag = sk_page_frag(sk);
+
+ /* TLS_HEADER_SIZE is not counted as part of the TLS record, and
+ * we need to leave room for an authentication tag.
+ */
+ max_open_record_len = TLS_MAX_PAYLOAD_SIZE
+ + TLS_HEADER_SIZE - tls_ctx->tag_size;
+
+ if (tls_is_pending_open_record(tls_ctx)) {
+ rc = tls_push_paritial_record(sk, tls_ctx, flags);
+ if (rc < 0)
+ return rc;
+ }
+
+ do {
+ if (tls_do_allocation(sk, ctx, pfrag,
+ tls_ctx->prepand_size)) {
+ rc = sk_stream_wait_memory(sk, &timeo);
+ if (!rc)
+ continue;
+
+ record = ctx->open_record;
+ if (!record)
+ break;
+handle_error:
+ if (record_type != TLS_RECORD_TYPE_DATA) {
+ /* avoid sending partial
+ * record with type !=
+ * application_data
+ */
+ size = orig_size;
+ destroy_record(record);
+ ctx->open_record = NULL;
+ } else if (record->len > tls_ctx->prepand_size) {
+ goto last_record;
+ }
+
+ break;
+ }
+
+ record = ctx->open_record;
+ copy = min_t(size_t, size, (pfrag->size - pfrag->offset));
+ copy = min_t(size_t, copy, (max_open_record_len - record->len));
+
+ if (copy_from_iter_nocache(
+ page_address(pfrag->page) + pfrag->offset,
+ copy, msg_iter) != copy) {
+ rc = -EFAULT;
+ goto handle_error;
+ }
+ tls_append_frag(record, pfrag, copy);
+
+ size -= copy;
+ if (!size) {
+last_record:
+ tls_push_record_flags = flags;
+ last = true;
+ }
+
+ if ((last && !more) ||
+ (record->len >= max_open_record_len) ||
+ (record->num_frags >= MAX_SKB_FRAGS - 1)) {
+ rc = tls_push_record(sk,
+ tls_ctx,
+ ctx,
+ record,
+ pfrag,
+ tls_push_record_flags,
+ record_type);
+ if (rc < 0)
+ break;
+ }
+ } while (!last);
+
+ if (orig_size - size > 0) {
+ rc = orig_size - size;
+ if (record_type != TLS_RECORD_TYPE_DATA)
+ rc++;
+ }
+
+ return rc;
+}
+
+static inline bool record_is_open(struct sock *sk)
+{
+ struct tls_context *tls_ctx = tls_get_ctx(sk);
+ struct tls_offload_context *ctx = tls_offload_ctx(tls_ctx);
+ struct tls_record_info *record = ctx->open_record;
+
+ return record;
+}
+
+int tls_device_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
+{
+ unsigned char record_type = TLS_RECORD_TYPE_DATA;
+ int rc = 0;
+
+ lock_sock(sk);
+
+ if (unlikely(msg->msg_flags & MSG_OOB)) {
+ if ((msg->msg_flags & MSG_MORE) || record_is_open(sk)) {
+ rc = -EINVAL;
+ goto out;
+ }
+
+ if (copy_from_iter(&record_type, 1, &msg->msg_iter) != 1) {
+ rc = -EFAULT;
+ goto out;
+ }
+
+ --size;
+ msg->msg_flags &= ~MSG_OOB;
+ }
+
+ rc = tls_push_data(sk, &msg->msg_iter, size,
+ msg->msg_flags,
+ record_type);
+
+out:
+ release_sock(sk);
+ return rc;
+}
+
+int tls_device_sendpage(struct sock *sk, struct page *page,
+ int offset, size_t size, int flags)
+{
+ struct iov_iter msg_iter;
+ struct kvec iov;
+ char *kaddr = kmap(page);
+ int rc = 0;
+
+ if (flags & MSG_SENDPAGE_NOTLAST)
+ flags |= MSG_MORE;
+
+ lock_sock(sk);
+
+ if (flags & MSG_OOB) {
+ rc = -ENOTSUPP;
+ goto out;
+ }
+
+ iov.iov_base = kaddr + offset;
+ iov.iov_len = size;
+ iov_iter_kvec(&msg_iter, WRITE | ITER_KVEC, &iov, 1, size);
+ rc = tls_push_data(sk, &msg_iter, size,
+ flags,
+ TLS_RECORD_TYPE_DATA);
+
+out:
+ kunmap(page);
+ release_sock(sk);
+ return rc;
+}
+
+struct tls_record_info *tls_get_record(struct tls_offload_context *context,
+ u32 seq)
+{
+ struct tls_record_info *info;
+
+ info = context->retransmit_hint;
+ if (!info ||
+ before(seq, info->end_seq - info->len))
+ info = list_first_entry(&context->records_list,
+ struct tls_record_info, list);
+
+ list_for_each_entry_from(info, &context->records_list, list) {
+ if (before(seq, info->end_seq)) {
+ if (!context->retransmit_hint ||
+ after(info->end_seq,
+ context->retransmit_hint->end_seq))
+ context->retransmit_hint = info;
+ return info;
+ }
+ }
+
+ return NULL;
+}
+EXPORT_SYMBOL(tls_get_record);
+
+int tls_set_device_offload(struct sock *sk, struct tls_context *ctx)
+{
+ struct tls_crypto_info *crypto_info;
+ struct tls_offload_context *offload_ctx;
+ struct tls_record_info *dummy_record;
+ u16 nonce_size, tag_size, iv_size;
+ char *iv;
+ int rc;
+
+ if (!ctx) {
+ rc = -EINVAL;
+ goto out;
+ }
+
+ if (ctx->priv_ctx) {
+ rc = -EEXIST;
+ goto out;
+ }
+
+ crypto_info = &ctx->crypto_send;
+ switch (crypto_info->cipher_type) {
+ case TLS_CIPHER_AES_GCM_128: {
+ nonce_size = TLS_CIPHER_AES_GCM_128_IV_SIZE;
+ tag_size = TLS_CIPHER_AES_GCM_128_TAG_SIZE;
+ iv_size = TLS_CIPHER_AES_GCM_128_IV_SIZE;
+ iv = ((struct tls_crypto_info_aes_gcm_128 *)crypto_info)->iv;
+ break;
+ }
+ default:
+ rc = -EINVAL;
+ goto out;
+ }
+
+ dummy_record = kmalloc(sizeof(*dummy_record), GFP_KERNEL);
+ if (!dummy_record) {
+ rc = -ENOMEM;
+ goto out;
+ }
+
+ rc = attach_sock_to_netdev(sk, ctx);
+ if (rc)
+ goto err_dummy_record;
+
+ ctx->prepand_size = TLS_HEADER_SIZE + nonce_size;
+ ctx->tag_size = tag_size;
+ ctx->iv_size = iv_size;
+ ctx->iv = kmalloc(iv_size, GFP_KERNEL);
+ if (!ctx->iv) {
+ rc = -ENOMEM;
+ goto detach_sock;
+ }
+ memcpy(ctx->iv, iv, iv_size);
+
+ offload_ctx = ctx->priv_ctx;
+ dummy_record->end_seq = offload_ctx->expectedSN;
+ dummy_record->len = 0;
+ dummy_record->num_frags = 0;
+
+ INIT_LIST_HEAD(&offload_ctx->records_list);
+ list_add_tail(&dummy_record->list, &offload_ctx->records_list);
+ spin_lock_init(&offload_ctx->lock);
+
+ inet_csk(sk)->icsk_clean_acked = &tls_icsk_clean_acked;
+
+ /* After this line the tx_handler might access the offload context */
+ smp_store_release(&sk->sk_destruct,
+ &tls_device_sk_destruct);
+ goto out;
+
+detach_sock:
+ detach_sock_from_netdev(sk, ctx);
+err_dummy_record:
+ kfree(dummy_record);
+out:
+ return rc;
+}
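On the slow path, a NIC driver resolves a retransmitted TCP sequence number to the TLS record covering it via tls_get_record() above, then derives the packet's offset inside that record. A minimal sketch under the context lock (resync_offset is a hypothetical helper name; the sample NIC driver later in this series does the same arithmetic):

/* Sketch: find the plaintext bytes backing a retransmitted segment. */
static int resync_offset(struct tls_offload_context *offload_ctx,
			 u32 tcp_seq, s32 *offset)
{
	struct tls_record_info *rec;
	unsigned long flags;
	int ret = -EINVAL;

	spin_lock_irqsave(&offload_ctx->lock, flags);
	rec = tls_get_record(offload_ctx, tcp_seq);
	if (rec) {
		/* a record spans [end_seq - len, end_seq) in seq space */
		*offset = tcp_seq - (rec->end_seq - rec->len);
		ret = 0;
	}
	spin_unlock_irqrestore(&offload_ctx->lock, flags);
	return ret;
}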
diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c
new file mode 100644
index 0000000..6a3df25
--- /dev/null
+++ b/net/tls/tls_main.c
@@ -0,0 +1,348 @@
+/* Copyright (c) 2016-2017, Mellanox Technologies All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * - Neither the name of the Mellanox Technologies nor the
+ * names of its contributors may be used to endorse or promote
+ * products derived from this software without specific prior written
+ * permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+ * AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO,
+ * THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED.
+ * IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR
+ * ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+ * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+ * SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
+ * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
+ * STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING
+ * IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
+ * POSSIBILITY OF SUCH DAMAGE
+ */
+
+#include <linux/module.h>
+
+#include <net/tcp.h>
+#include <net/inet_common.h>
+#include <linux/highmem.h>
+#include <linux/netdevice.h>
+
+#include <net/tls.h>
+
+MODULE_AUTHOR("Mellanox Technologies");
+MODULE_DESCRIPTION("Transport Layer Security Support");
+MODULE_LICENSE("Dual BSD/GPL");
+
+static struct proto tls_device_prot;
+
+int tls_push_frags(struct sock *sk,
+ struct tls_context *ctx,
+ skb_frag_t *frag,
+ u16 num_frags,
+ u16 first_offset,
+ int flags)
+{
+ int sendpage_flags = flags | MSG_SENDPAGE_NOTLAST;
+ int ret = 0;
+ size_t size;
+ int offset = first_offset;
+
+ size = skb_frag_size(frag) - offset;
+ offset += frag->page_offset;
+
+ while (1) {
+ if (!--num_frags)
+ sendpage_flags = flags;
+
+ /* is sending application-limited? */
+ tcp_rate_check_app_limited(sk);
+retry:
+ ret = do_tcp_sendpages(sk,
+ skb_frag_page(frag),
+ offset,
+ size,
+ sendpage_flags);
+
+ if (ret != size) {
+ if (ret > 0) {
+ offset += ret;
+ size -= ret;
+ goto retry;
+ }
+
+ offset -= frag->page_offset;
+ ctx->pending_offset = offset;
+ ctx->pending_frags = frag;
+ ctx->num_pending_frags = num_frags + 1;
+ return ret;
+ }
+
+ if (!num_frags)
+ break;
+
+ frag++;
+ offset = frag->page_offset;
+ size = skb_frag_size(frag);
+ }
+
+ return 0;
+}
+
+int tls_push_paritial_record(struct sock *sk, struct tls_context *ctx,
+ int flags)
+{
+ skb_frag_t *frag = ctx->pending_frags;
+ u16 offset = ctx->pending_offset;
+ u16 num_frags = ctx->num_pending_frags;
+
+ ctx->num_pending_frags = 0;
+
+ return tls_push_frags(sk, ctx, frag,
+ num_frags, offset, flags);
+}
+
+static void tls_write_space(struct sock *sk)
+{
+ struct tls_context *ctx = tls_get_ctx(sk);
+
+ if (tls_is_pending_open_record(ctx)) {
+ gfp_t sk_allocation = sk->sk_allocation;
+ int rc;
+
+ sk->sk_allocation = GFP_ATOMIC;
+ rc = tls_push_paritial_record(sk, ctx,
+ MSG_DONTWAIT | MSG_NOSIGNAL);
+ sk->sk_allocation = sk_allocation;
+
+ if (rc < 0)
+ return;
+ }
+
+ ctx->sk_write_space(sk);
+}
+
+int tls_sk_query(struct sock *sk, int optname, char __user *optval,
+ int __user *optlen)
+{
+ int rc = 0;
+ struct tls_context *ctx = tls_get_ctx(sk);
+ struct tls_crypto_info *crypto_info;
+ int len;
+
+ if (get_user(len, optlen))
+ return -EFAULT;
+
+ if (!optval || (len < sizeof(*crypto_info))) {
+ rc = -EINVAL;
+ goto out;
+ }
+
+ if (!ctx) {
+ rc = -EBUSY;
+ goto out;
+ }
+
+ /* get user crypto info */
+ switch (optname) {
+ case TCP_TLS_TX: {
+ crypto_info = &ctx->crypto_send;
+ break;
+ }
+ case TCP_TLS_RX:
+ /* fall through: RX offload is not supported yet */
+ default: {
+ rc = -ENOPROTOOPT;
+ goto out;
+ }
+ }
+
+ if (!TLS_CRYPTO_INFO_READY(crypto_info)) {
+ rc = -EBUSY;
+ goto out;
+ }
+
+ if (len == sizeof(*crypto_info)) {
+ rc = copy_to_user(optval, crypto_info, sizeof(*crypto_info));
+ goto out;
+ }
+
+ switch (crypto_info->cipher_type) {
+ case TLS_CIPHER_AES_GCM_128: {
+ struct tls_crypto_info_aes_gcm_128 *crypto_info_aes_gcm_128 =
+ container_of(crypto_info,
+ struct tls_crypto_info_aes_gcm_128,
+ info);
+
+ if (len != sizeof(*crypto_info_aes_gcm_128)) {
+ rc = -EINVAL;
+ goto out;
+ }
+ if (TLS_IS_STATE_HW(crypto_info)) {
+ lock_sock(sk);
+ memcpy(crypto_info_aes_gcm_128->iv,
+ ctx->iv,
+ TLS_CIPHER_AES_GCM_128_IV_SIZE);
+ release_sock(sk);
+ }
+ rc = copy_to_user(optval,
+ crypto_info_aes_gcm_128,
+ sizeof(*crypto_info_aes_gcm_128));
+ break;
+ }
+ default:
+ rc = -EINVAL;
+ }
+
+out:
+ return rc;
+}
+EXPORT_SYMBOL(tls_sk_query);
+
+void tls_sk_destruct(struct sock *sk, struct tls_context *ctx)
+{
+ ctx->sk_destruct(sk);
+ kfree(ctx->iv);
+ kfree(ctx);
+ module_put(THIS_MODULE);
+}
+
+int tls_sk_attach(struct sock *sk, int optname, char __user *optval,
+ unsigned int optlen)
+{
+ int rc = 0;
+ struct tls_context *ctx = tls_get_ctx(sk);
+ struct tls_crypto_info *crypto_info;
+ bool allocated_tls_ctx = false;
+
+ if (!optval || (optlen < sizeof(*crypto_info))) {
+ rc = -EINVAL;
+ goto out;
+ }
+
+ /* allocate tls context */
+ if (!ctx) {
+ ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
+ if (!ctx) {
+ rc = -ENOMEM;
+ goto out;
+ }
+ sk->sk_user_data = ctx;
+ allocated_tls_ctx = true;
+ }
+
+ /* get user crypto info */
+ switch (optname) {
+ case TCP_TLS_TX: {
+ crypto_info = &ctx->crypto_send;
+ break;
+ }
+ case TCP_TLS_RX:
+ /* fall through: RX offload is not supported yet */
+ default: {
+ rc = -ENOPROTOOPT;
+ goto err_sk_user_data;
+ }
+ }
+
+ /* Setting the crypto info more than once is not currently supported */
+ if (TLS_CRYPTO_INFO_READY(crypto_info)) {
+ rc = -EEXIST;
+ goto err_sk_user_data;
+ }
+
+ rc = copy_from_user(crypto_info, optval, sizeof(*crypto_info));
+ if (rc) {
+ rc = -EFAULT;
+ goto err_sk_user_data;
+ }
+
+ /* currently we support only HW offload */
+ if (!TLS_IS_STATE_HW(crypto_info)) {
+ rc = -ENOPROTOOPT;
+ goto err_crypto_info;
+ }
+
+ /* check version */
+ if (crypto_info->version != TLS_1_2_VERSION) {
+ rc = -ENOTSUPP;
+ goto err_crypto_info;
+ }
+
+ switch (crypto_info->cipher_type) {
+ case TLS_CIPHER_AES_GCM_128: {
+ if (optlen != sizeof(struct tls_crypto_info_aes_gcm_128)) {
+ rc = -EINVAL;
+ goto err_crypto_info;
+ }
+ rc = copy_from_user(crypto_info,
+ optval,
+ sizeof(struct tls_crypto_info_aes_gcm_128));
+ break;
+ }
+ default:
+ rc = -EINVAL;
+ goto err_crypto_info;
+ }
+
+ if (rc) {
+ rc = -EFAULT;
+ goto err_crypto_info;
+ }
+
+ ctx->sk_write_space = sk->sk_write_space;
+ ctx->sk_destruct = sk->sk_destruct;
+ sk->sk_write_space = tls_write_space;
+
+ if (TLS_IS_STATE_HW(crypto_info)) {
+ rc = tls_set_device_offload(sk, ctx);
+ if (rc)
+ goto err_crypto_info;
+ }
+
+ if (!try_module_get(THIS_MODULE)) {
+ rc = -EINVAL;
+ goto err_set_device_offload;
+ }
+
+ /* TODO: add protection */
+ sk->sk_prot = &tls_device_prot;
+ goto out;
+
+err_set_device_offload:
+ tls_clear_device_offload(sk, ctx);
+err_crypto_info:
+ memset(crypto_info, 0, sizeof(*crypto_info));
+err_sk_user_data:
+ if (allocated_tls_ctx)
+ kfree(ctx);
+out:
+ return rc;
+}
+EXPORT_SYMBOL(tls_sk_attach);
+
+static int __init tls_init(void)
+{
+ tls_device_prot = tcp_prot;
+ tls_device_prot.sendmsg = tls_device_sendmsg;
+ tls_device_prot.sendpage = tls_device_sendpage;
+
+ return 0;
+}
+
+static void __exit tls_exit(void)
+{
+}
+
+module_init(tls_init);
+module_exit(tls_exit);
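For completeness, a userspace sketch of sending a non-data record through tls_device_sendmsg() above: with MSG_OOB set, the first byte of the buffer is consumed as the TLS record type and the remaining bytes become the record payload (the alert values below are standard TLS wire constants, not defined by this series):

#include <sys/socket.h>

/* close_notify alert: type 21 (alert), level 1 (warning), description 0 */
static int tls_send_close_notify(int fd)
{
	unsigned char rec[3] = { 21, 1, 0 };

	return send(fd, rec, sizeof(rec), MSG_OOB);
}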
--
2.7.4

2017-03-28 13:26:24

by Aviad Yehezkel

[permalink] [raw]
Subject: [RFC TLS Offload Support 07/15] mlx/mlx5_core: Allow sending multiple packets

From: Ilya Lesokhin <[email protected]>

Modify mlx5e_xmit to transmit multiple packets chained
together via skb->next.
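
For illustration, a tx_handler can now hand the driver a chain such as the fragment below (the pattern the TLS driver later in this series relies on; nskb is a synchronization packet built by the handler):

	nskb->next = skb;	/* send the sync packet first, then the original */
	nskb->xmit_more = 1;	/* hint that more frames follow immediately */
	return nskb;		/* mlx5e_xmit() now walks the whole chain */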

Signed-off-by: Ilya Lesokhin <[email protected]>
Signed-off-by: Aviad Yehezkel <[email protected]>
---
drivers/net/ethernet/mellanox/mlx5/core/en_tx.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
index e6ce509..f2d0cc0 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tx.c
@@ -35,7 +35,7 @@
#include "en.h"

#define MLX5E_SQ_NOPS_ROOM MLX5_SEND_WQE_MAX_WQEBBS
-#define MLX5E_SQ_STOP_ROOM (MLX5_SEND_WQE_MAX_WQEBBS +\
+#define MLX5E_SQ_STOP_ROOM (2 * MLX5_SEND_WQE_MAX_WQEBBS +\
MLX5E_SQ_NOPS_ROOM)

void mlx5e_send_nop(struct mlx5e_sq *sq, bool notify_hw)
@@ -405,6 +405,8 @@ netdev_tx_t mlx5e_xmit(struct sk_buff *skb, struct net_device *dev)
struct mlx5e_sq *sq = NULL;
struct mlx5_accel_ops *accel_ops;
struct mlx5_swp_info swp_info = {0};
+ struct sk_buff *next;
+ int rc;

rcu_read_lock();
accel_ops = mlx5_accel_get(priv->mdev);
@@ -417,7 +419,12 @@ netdev_tx_t mlx5e_xmit(struct sk_buff *skb, struct net_device *dev)

sq = priv->txq_to_sq_map[skb_get_queue_mapping(skb)];

- return mlx5e_sq_xmit(sq, skb, &swp_info);
+ do {
+ next = skb->next;
+ rc = mlx5e_sq_xmit(sq, skb, &swp_info);
+ skb = next;
+ } while (next);
+ return rc;
}

bool mlx5e_poll_tx_cq(struct mlx5e_cq *cq, int napi_budget)
--
2.7.4

2017-03-28 13:26:31

by Aviad Yehezkel

[permalink] [raw]
Subject: [RFC TLS Offload Support 14/15] crypto: rfc5288 aesni optimized intel routines

From: Dave Watson <[email protected]>

The assembly routines require the AAD data to be padded out
to the nearest multiple of 4 bytes.
Copy the 13-byte AAD into a spare assoc data area and zero-pad
it when necessary.
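
In C, the rounding added by the `add $3; and $~3` pairs below is equivalent to this sketch:

	/* round the AAD length up to a multiple of 4: 13 -> 16, 21 -> 24 */
	unsigned int padded_assoclen = (assoclen + 3) & ~3;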

Signed-off-by: Dave Watson <[email protected]>
---
arch/x86/crypto/aesni-intel_asm.S | 6 ++
arch/x86/crypto/aesni-intel_avx-x86_64.S | 4 ++
arch/x86/crypto/aesni-intel_glue.c | 105 ++++++++++++++++++++++++++-----
3 files changed, 99 insertions(+), 16 deletions(-)

diff --git a/arch/x86/crypto/aesni-intel_asm.S b/arch/x86/crypto/aesni-intel_asm.S
index 383a6f8..4e80bb8 100644
--- a/arch/x86/crypto/aesni-intel_asm.S
+++ b/arch/x86/crypto/aesni-intel_asm.S
@@ -229,6 +229,9 @@ XMM2 XMM3 XMM4 XMMDst TMP6 TMP7 i i_seq operation
MOVADQ SHUF_MASK(%rip), %xmm14
mov arg7, %r10 # %r10 = AAD
mov arg8, %r12 # %r12 = aadLen
+ add $3, %r12
+ and $~3, %r12
+
mov %r12, %r11
pxor %xmm\i, %xmm\i

@@ -454,6 +457,9 @@ XMM2 XMM3 XMM4 XMMDst TMP6 TMP7 i i_seq operation
MOVADQ SHUF_MASK(%rip), %xmm14
mov arg7, %r10 # %r10 = AAD
mov arg8, %r12 # %r12 = aadLen
+ add $3, %r12
+ and $~3, %r12
+
mov %r12, %r11
pxor %xmm\i, %xmm\i
_get_AAD_loop\num_initial_blocks\operation:
diff --git a/arch/x86/crypto/aesni-intel_avx-x86_64.S b/arch/x86/crypto/aesni-intel_avx-x86_64.S
index 522ab68..0756e4a 100644
--- a/arch/x86/crypto/aesni-intel_avx-x86_64.S
+++ b/arch/x86/crypto/aesni-intel_avx-x86_64.S
@@ -360,6 +360,8 @@ VARIABLE_OFFSET = 16*8

mov arg6, %r10 # r10 = AAD
mov arg7, %r12 # r12 = aadLen
+ add $3, %r12
+ and $~3, %r12


mov %r12, %r11
@@ -1619,6 +1621,8 @@ ENDPROC(aesni_gcm_dec_avx_gen2)

mov arg6, %r10 # r10 = AAD
mov arg7, %r12 # r12 = aadLen
+ add $3, %r12
+ and $~3, %r12


mov %r12, %r11
diff --git a/arch/x86/crypto/aesni-intel_glue.c b/arch/x86/crypto/aesni-intel_glue.c
index 4ff90a7..dcada94 100644
--- a/arch/x86/crypto/aesni-intel_glue.c
+++ b/arch/x86/crypto/aesni-intel_glue.c
@@ -957,6 +957,8 @@ static int helper_rfc4106_encrypt(struct aead_request *req)
{
u8 one_entry_in_sg = 0;
u8 *src, *dst, *assoc;
+ u8 *assocmem = NULL;
+
__be32 counter = cpu_to_be32(1);
struct crypto_aead *tfm = crypto_aead_reqtfm(req);
struct aesni_rfc4106_gcm_ctx *ctx = aesni_rfc4106_gcm_ctx_get(tfm);
@@ -966,12 +968,8 @@ static int helper_rfc4106_encrypt(struct aead_request *req)
struct scatter_walk src_sg_walk;
struct scatter_walk dst_sg_walk = {};
unsigned int i;
-
- /* Assuming we are supporting rfc4106 64-bit extended */
- /* sequence numbers We need to have the AAD length equal */
- /* to 16 or 20 bytes */
- if (unlikely(req->assoclen != 16 && req->assoclen != 20))
- return -EINVAL;
+ unsigned int padded_assoclen = (req->assoclen + 3) & ~3;
+ u8 assocbuf[24];

/* IV below built */
for (i = 0; i < 4; i++)
@@ -996,7 +994,8 @@ static int helper_rfc4106_encrypt(struct aead_request *req)
} else {
/* Allocate memory for src, dst, assoc */
assoc = kmalloc(req->cryptlen + auth_tag_len + req->assoclen,
- GFP_ATOMIC);
+ GFP_ATOMIC);
+ assocmem = assoc;
if (unlikely(!assoc))
return -ENOMEM;
scatterwalk_map_and_copy(assoc, req->src, 0,
@@ -1005,6 +1004,14 @@ static int helper_rfc4106_encrypt(struct aead_request *req)
dst = src;
}

+ if (req->assoclen != padded_assoclen) {
+ scatterwalk_map_and_copy(assocbuf, req->src, 0,
+ req->assoclen, 0);
+ memset(assocbuf + req->assoclen, 0,
+ padded_assoclen - req->assoclen);
+ assoc = assocbuf;
+ }
+
kernel_fpu_begin();
aesni_gcm_enc_tfm(aes_ctx, dst, src, req->cryptlen, iv,
ctx->hash_subkey, assoc, req->assoclen - 8,
@@ -1025,7 +1032,7 @@ static int helper_rfc4106_encrypt(struct aead_request *req)
} else {
scatterwalk_map_and_copy(dst, req->dst, req->assoclen,
req->cryptlen + auth_tag_len, 1);
- kfree(assoc);
+ kfree(assocmem);
}
return 0;
}
@@ -1034,6 +1041,7 @@ static int helper_rfc4106_decrypt(struct aead_request *req)
{
u8 one_entry_in_sg = 0;
u8 *src, *dst, *assoc;
+ u8 *assocmem = NULL;
unsigned long tempCipherLen = 0;
__be32 counter = cpu_to_be32(1);
int retval = 0;
@@ -1043,16 +1051,11 @@ static int helper_rfc4106_decrypt(struct aead_request *req)
unsigned long auth_tag_len = crypto_aead_authsize(tfm);
u8 iv[16] __attribute__ ((__aligned__(AESNI_ALIGN)));
u8 authTag[16];
+ u8 assocbuf[24];
struct scatter_walk src_sg_walk;
struct scatter_walk dst_sg_walk = {};
unsigned int i;
-
- if (unlikely(req->assoclen != 16 && req->assoclen != 20))
- return -EINVAL;
-
- /* Assuming we are supporting rfc4106 64-bit extended */
- /* sequence numbers We need to have the AAD length */
- /* equal to 16 or 20 bytes */
+ unsigned int padded_assoclen = (req->assoclen + 3) & ~3;

tempCipherLen = (unsigned long)(req->cryptlen - auth_tag_len);
/* IV below built */
@@ -1079,6 +1082,7 @@ static int helper_rfc4106_decrypt(struct aead_request *req)
} else {
/* Allocate memory for src, dst, assoc */
assoc = kmalloc(req->cryptlen + req->assoclen, GFP_ATOMIC);
+ assocmem = assoc;
if (!assoc)
return -ENOMEM;
scatterwalk_map_and_copy(assoc, req->src, 0,
@@ -1087,6 +1091,14 @@ static int helper_rfc4106_decrypt(struct aead_request *req)
dst = src;
}

+ if (req->assoclen != padded_assoclen) {
+ scatterwalk_map_and_copy(assocbuf, req->src,
+ 0, req->assoclen, 0);
+ memset(assocbuf + req->assoclen, 0,
+ padded_assoclen - req->assoclen);
+ assoc = assocbuf;
+ }
+
kernel_fpu_begin();
aesni_gcm_dec_tfm(aes_ctx, dst, src, tempCipherLen, iv,
ctx->hash_subkey, assoc, req->assoclen - 8,
@@ -1109,7 +1121,7 @@ static int helper_rfc4106_decrypt(struct aead_request *req)
} else {
scatterwalk_map_and_copy(dst, req->dst, req->assoclen,
tempCipherLen, 1);
- kfree(assoc);
+ kfree(assocmem);
}
return retval;
}
@@ -1151,6 +1163,12 @@ static int rfc4106_encrypt(struct aead_request *req)

aead_request_set_tfm(req, tfm);

+ /* Assuming we are supporting rfc4106 64-bit extended */
+ /* sequence numbers We need to have the AAD length */
+ /* equal to 16 or 20 bytes */
+ if (unlikely(req->assoclen != 16 && req->assoclen != 20))
+ return -EINVAL;
+
return crypto_aead_encrypt(req);
}

@@ -1167,6 +1185,43 @@ static int rfc4106_decrypt(struct aead_request *req)

aead_request_set_tfm(req, tfm);

+ /* Assuming we are supporting rfc4106 64-bit extended */
+ /* sequence numbers We need to have the AAD length */
+ /* equal to 16 or 20 bytes */
+ if (unlikely(req->assoclen != 16 && req->assoclen != 20))
+ return -EINVAL;
+
+ return crypto_aead_decrypt(req);
+}
+
+static int rfc5288_encrypt(struct aead_request *req)
+{
+ struct crypto_aead *tfm = crypto_aead_reqtfm(req);
+ struct cryptd_aead **ctx = crypto_aead_ctx(tfm);
+ struct cryptd_aead *cryptd_tfm = *ctx;
+
+ if (unlikely(req->assoclen != 21))
+ return -EINVAL;
+
+ aead_request_set_tfm(req, irq_fpu_usable() ?
+ cryptd_aead_child(cryptd_tfm) :
+ &cryptd_tfm->base);
+
+ return crypto_aead_encrypt(req);
+}
+
+static int rfc5288_decrypt(struct aead_request *req)
+{
+ struct crypto_aead *tfm = crypto_aead_reqtfm(req);
+ struct cryptd_aead **ctx = crypto_aead_ctx(tfm);
+ struct cryptd_aead *cryptd_tfm = *ctx;
+
+ if (unlikely(req->assoclen != 21))
+ return -EINVAL;
+
+ aead_request_set_tfm(req, irq_fpu_usable() ?
+ cryptd_aead_child(cryptd_tfm) :
+ &cryptd_tfm->base);
return crypto_aead_decrypt(req);
}
#endif
@@ -1506,6 +1561,24 @@ static struct aead_alg aesni_aead_algs[] = { {
.cra_blocksize = 1,
.cra_ctxsize = sizeof(struct aesni_rfc4106_gcm_sync_ctx),
.cra_module = THIS_MODULE,
+ },
+}, {
+ .init = rfc4106_init,
+ .exit = rfc4106_exit,
+ .setkey = rfc4106_set_key,
+ .setauthsize = rfc4106_set_authsize,
+ .encrypt = rfc5288_encrypt,
+ .decrypt = rfc5288_decrypt,
+ .ivsize = 8,
+ .maxauthsize = 16,
+ .base = {
+ .cra_name = "rfc5288(gcm(aes))",
+ .cra_driver_name = "rfc5288-gcm-aesni",
+ .cra_priority = 400,
+ .cra_flags = CRYPTO_ALG_ASYNC,
+ .cra_blocksize = 1,
+ .cra_ctxsize = sizeof(struct cryptd_aead *),
+ .cra_module = THIS_MODULE,
},
} };
#else
--
2.7.4

2017-03-28 13:27:30

by Aviad Yehezkel

[permalink] [raw]
Subject: [RFC TLS Offload Support 02/15] tcp: export do_tcp_sendpages function

We will use it from the new TLS code.
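
For reference, the new TLS record layer pushes each record fragment through the exported function roughly like this (compare tls_push_frags in the net/tls patch of this series):

	ret = do_tcp_sendpages(sk, skb_frag_page(frag), offset, size,
			       flags | MSG_SENDPAGE_NOTLAST);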

Signed-off-by: Aviad Yehezkel <[email protected]>
Signed-off-by: Ilya Lesokhin <[email protected]>
Signed-off-by: Boris Pismenny <[email protected]>
---
include/net/tcp.h | 2 ++
net/ipv4/tcp.c | 5 +++--
2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/include/net/tcp.h b/include/net/tcp.h
index 207147b..3a72d4c 100644
--- a/include/net/tcp.h
+++ b/include/net/tcp.h
@@ -348,6 +348,8 @@ int tcp_v4_tw_remember_stamp(struct inet_timewait_sock *tw);
int tcp_sendmsg(struct sock *sk, struct msghdr *msg, size_t size);
int tcp_sendpage(struct sock *sk, struct page *page, int offset, size_t size,
int flags);
+ssize_t do_tcp_sendpages(struct sock *sk, struct page *page, int offset,
+ size_t size, int flags);
void tcp_release_cb(struct sock *sk);
void tcp_wfree(struct sk_buff *skb);
void tcp_write_timer_handler(struct sock *sk);
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 1149b48..302fee9 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -873,8 +873,8 @@ static int tcp_send_mss(struct sock *sk, int *size_goal, int flags)
return mss_now;
}

-static ssize_t do_tcp_sendpages(struct sock *sk, struct page *page, int offset,
- size_t size, int flags)
+ssize_t do_tcp_sendpages(struct sock *sk, struct page *page, int offset,
+ size_t size, int flags)
{
struct tcp_sock *tp = tcp_sk(sk);
int mss_now, size_goal;
@@ -1003,6 +1003,7 @@ static ssize_t do_tcp_sendpages(struct sock *sk, struct page *page, int offset,
}
return sk_stream_error(sk, flags, err);
}
+EXPORT_SYMBOL(do_tcp_sendpages);

int tcp_sendpage(struct sock *sk, struct page *page, int offset,
size_t size, int flags)
--
2.7.4

2017-03-28 13:27:24

by Aviad Yehezkel

[permalink] [raw]
Subject: [RFC TLS Offload Support 11/15] mlx/tls: TLS offload driver: add the main module entrypoints and tie the module into the build system

From: Ilya Lesokhin <[email protected]>

Signed-off-by: Guy Shapiro <[email protected]>
Signed-off-by: Ilya Lesokhin <[email protected]>
Signed-off-by: Matan Barak <[email protected]>
Signed-off-by: Haggai Eran <[email protected]>
Signed-off-by: Aviad Yehezkel <[email protected]>
---
drivers/net/ethernet/mellanox/Kconfig | 1 +
drivers/net/ethernet/mellanox/Makefile | 1 +
.../net/ethernet/mellanox/accelerator/tls/Kconfig | 11 ++++
.../net/ethernet/mellanox/accelerator/tls/Makefile | 4 ++
.../ethernet/mellanox/accelerator/tls/tls_main.c | 77 ++++++++++++++++++++++
5 files changed, 94 insertions(+)
create mode 100644 drivers/net/ethernet/mellanox/accelerator/tls/Kconfig
create mode 100644 drivers/net/ethernet/mellanox/accelerator/tls/Makefile
create mode 100644 drivers/net/ethernet/mellanox/accelerator/tls/tls_main.c

diff --git a/drivers/net/ethernet/mellanox/Kconfig b/drivers/net/ethernet/mellanox/Kconfig
index 1b3ca6a..f270b76 100644
--- a/drivers/net/ethernet/mellanox/Kconfig
+++ b/drivers/net/ethernet/mellanox/Kconfig
@@ -21,6 +21,7 @@ source "drivers/net/ethernet/mellanox/mlx5/core/Kconfig"
source "drivers/net/ethernet/mellanox/mlxsw/Kconfig"
source "drivers/net/ethernet/mellanox/accelerator/core/Kconfig"
source "drivers/net/ethernet/mellanox/accelerator/ipsec/Kconfig"
+source "drivers/net/ethernet/mellanox/accelerator/tls/Kconfig"
source "drivers/net/ethernet/mellanox/accelerator/tools/Kconfig"

endif # NET_VENDOR_MELLANOX
diff --git a/drivers/net/ethernet/mellanox/Makefile b/drivers/net/ethernet/mellanox/Makefile
index 96a5856..fd8afc0 100644
--- a/drivers/net/ethernet/mellanox/Makefile
+++ b/drivers/net/ethernet/mellanox/Makefile
@@ -7,4 +7,5 @@ obj-$(CONFIG_MLX5_CORE) += mlx5/core/
obj-$(CONFIG_MLXSW_CORE) += mlxsw/
obj-$(CONFIG_MLX_ACCEL_CORE) += accelerator/core/
obj-$(CONFIG_MLX_ACCEL_IPSEC) += accelerator/ipsec/
+obj-$(CONFIG_MLX_ACCEL_TLS) += accelerator/tls/
obj-$(CONFIG_MLX_ACCEL_TOOLS) += accelerator/tools/
diff --git a/drivers/net/ethernet/mellanox/accelerator/tls/Kconfig b/drivers/net/ethernet/mellanox/accelerator/tls/Kconfig
new file mode 100644
index 0000000..d9c0733
--- /dev/null
+++ b/drivers/net/ethernet/mellanox/accelerator/tls/Kconfig
@@ -0,0 +1,11 @@
+#
+# Mellanox tls accelerator driver configuration
+#
+
+config MLX_ACCEL_TLS
+ tristate "Mellanox Technologies TLS accelerator driver"
+ depends on MLX_ACCEL_CORE
+ default n
+ ---help---
+ TLS accelerator driver by Mellanox Technologies.
+
diff --git a/drivers/net/ethernet/mellanox/accelerator/tls/Makefile b/drivers/net/ethernet/mellanox/accelerator/tls/Makefile
new file mode 100644
index 0000000..93a7733
--- /dev/null
+++ b/drivers/net/ethernet/mellanox/accelerator/tls/Makefile
@@ -0,0 +1,4 @@
+obj-$(CONFIG_MLX_ACCEL_TLS) += mlx_tls.o
+
+ccflags-y := -I$(srctree)/
+mlx_tls-y := tls_main.o tls_sysfs.o tls_hw.o tls.o
diff --git a/drivers/net/ethernet/mellanox/accelerator/tls/tls_main.c b/drivers/net/ethernet/mellanox/accelerator/tls/tls_main.c
new file mode 100644
index 0000000..85078f5
--- /dev/null
+++ b/drivers/net/ethernet/mellanox/accelerator/tls/tls_main.c
@@ -0,0 +1,77 @@
+/*
+ * Copyright (c) 2015-2017 Mellanox Technologies. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ *
+ */
+#include <linux/module.h>
+
+#include "tls.h"
+
+MODULE_AUTHOR("Mellanox Technologies Advance Develop Team <[email protected]>");
+MODULE_DESCRIPTION("Mellanox Innova TLS Driver");
+MODULE_LICENSE("Dual BSD/GPL");
+MODULE_VERSION(DRIVER_VERSION);
+
+static struct mlx_accel_core_client mlx_tls_client = {
+ .name = "mlx_tls",
+ .add = mlx_tls_add_one,
+ .remove = mlx_tls_remove_one,
+};
+
+static struct notifier_block mlx_tls_netdev_notifier = {
+ .notifier_call = mlx_tls_netdev_event,
+};
+
+static int __init mlx_tls_init(void)
+{
+ int err = 0;
+
+ err = register_netdevice_notifier(&mlx_tls_netdev_notifier);
+ if (err) {
+ pr_warn("mlx_tls_init error in register_netdevice_notifier %d\n",
+ err);
+ goto out;
+ }
+
+ mlx_accel_core_client_register(&mlx_tls_client);
+
+out:
+ return err;
+}
+
+static void __exit mlx_tls_exit(void)
+{
+ mlx_accel_core_client_unregister(&mlx_tls_client);
+ unregister_netdevice_notifier(&mlx_tls_netdev_notifier);
+}
+
+module_init(mlx_tls_init);
+module_exit(mlx_tls_exit);
+
--
2.7.4

2017-03-28 13:26:27

by Aviad Yehezkel

[permalink] [raw]
Subject: [RFC TLS Offload Support 10/15] mlx/tls: Add mlx_accel offload driver for TLS

From: Ilya Lesokhin <[email protected]>

Implement the transmit and receive callbacks as well as the netdev
operations for adding and removing sockets.

Signed-off-by: Guy Shapiro <[email protected]>
Signed-off-by: Ilya Lesokhin <[email protected]>
Signed-off-by: Matan Barak <[email protected]>
Signed-off-by: Haggai Eran <[email protected]>
Signed-off-by: Aviad Yehezkel <[email protected]>
---
.../net/ethernet/mellanox/accelerator/tls/tls.c | 652 +++++++++++++++++++++
.../net/ethernet/mellanox/accelerator/tls/tls.h | 100 ++++
2 files changed, 752 insertions(+)
create mode 100644 drivers/net/ethernet/mellanox/accelerator/tls/tls.c
create mode 100644 drivers/net/ethernet/mellanox/accelerator/tls/tls.h

diff --git a/drivers/net/ethernet/mellanox/accelerator/tls/tls.c b/drivers/net/ethernet/mellanox/accelerator/tls/tls.c
new file mode 100644
index 0000000..07a4b67
--- /dev/null
+++ b/drivers/net/ethernet/mellanox/accelerator/tls/tls.c
@@ -0,0 +1,652 @@
+/*
+ * Copyright (c) 2015-2017 Mellanox Technologies. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ *
+ */
+#include "tls.h"
+#include "tls_sysfs.h"
+#include "tls_hw.h"
+#include "tls_cmds.h"
+#include <linux/mlx5/driver.h>
+#include <linux/netdevice.h>
+
+static LIST_HEAD(mlx_tls_devs);
+static DEFINE_MUTEX(mlx_tls_mutex);
+
+/* Start of context identifiers range (inclusive) */
+#define SWID_START 5
+/* End of context identifiers range (exclusive) */
+#define SWID_END BIT(24)
+
+static netdev_features_t mlx_tls_feature_chk(struct sk_buff *skb,
+ struct net_device *netdev,
+ netdev_features_t features,
+ bool *done)
+{
+ return features;
+}
+
+int mlx_tls_get_count(struct net_device *netdev)
+{
+ return 0;
+}
+
+int mlx_tls_get_strings(struct net_device *netdev, uint8_t *data)
+{
+ return 0;
+}
+
+int mlx_tls_get_stats(struct net_device *netdev, u64 *data)
+{
+ return 0;
+}
+
+/* must hold mlx_tls_mutex to call this function */
+static struct mlx_tls_dev *find_mlx_tls_dev_by_netdev(
+ struct net_device *netdev)
+{
+ struct mlx_tls_dev *dev;
+
+ list_for_each_entry(dev, &mlx_tls_devs, accel_dev_list) {
+ if (dev->netdev == netdev)
+ return dev;
+ }
+
+ return NULL;
+}
+
+struct mlx_tls_offload_context *get_tls_context(struct sock *sk)
+{
+ struct tls_context *tls_ctx = tls_get_ctx(sk);
+
+ return container_of(tls_offload_ctx(tls_ctx),
+ struct mlx_tls_offload_context,
+ context);
+}
+
+static int mlx_tls_add(struct net_device *netdev,
+ struct sock *sk,
+ enum tls_offload_ctx_dir direction,
+ struct tls_crypto_info *crypto_info,
+ struct tls_offload_context **ctx)
+{
+ struct tls_crypto_info_aes_gcm_128 *crypto_info_aes_gcm_128;
+ struct mlx_tls_offload_context *context;
+ struct mlx_tls_dev *dev;
+ int swid;
+ int ret;
+
+ pr_info("mlx_tls_add called\n");
+
+ if (direction == TLS_OFFLOAD_CTX_DIR_RX) {
+ pr_err("mlx_tls_add(): do not support recv\n");
+ ret = -EINVAL;
+ goto out;
+ }
+
+ if (!crypto_info ||
+ crypto_info->cipher_type != TLS_CIPHER_AES_GCM_128) {
+ pr_err("mlx_tls_add(): support only aes_gcm_128\n");
+ ret = -EINVAL;
+ goto out;
+ }
+ crypto_info_aes_gcm_128 =
+ (struct tls_crypto_info_aes_gcm_128 *)crypto_info;
+
+ dev = mlx_tls_find_dev_by_netdev(netdev);
+ if (!dev) {
+ pr_err("mlx_tls_add(): tls dev not found\n");
+ ret = -EINVAL;
+ goto out;
+ }
+
+ swid = ida_simple_get(&dev->swid_ida, SWID_START, SWID_END,
+ GFP_KERNEL);
+ if (swid < 0) {
+ pr_err("mlx_tls_add(): Failed to allocate swid\n");
+ ret = swid;
+ goto out;
+ }
+
+ context = kzalloc(sizeof(*context), GFP_KERNEL);
+ if (!context) {
+ ret = -ENOMEM;
+ goto release_swid;
+ }
+
+ context->swid = htonl(swid);
+ context->context.expectedSN = tcp_sk(sk)->write_seq;
+
+ ret = mlx_tls_hw_start_cmd(dev,
+ sk,
+ crypto_info_aes_gcm_128,
+ context);
+ if (ret)
+ goto release_context;
+
+ try_module_get(THIS_MODULE);
+ *ctx = &context->context;
+out:
+ return ret;
+
+release_context:
+ kfree(context);
+release_swid:
+ ida_simple_remove(&dev->swid_ida, swid);
+ return ret;
+}
+
+static void mlx_tls_del(struct net_device *netdev,
+ struct sock *sk,
+ enum tls_offload_ctx_dir direction)
+{
+ struct mlx_tls_offload_context *context = NULL;
+
+ if (direction == TLS_OFFLOAD_CTX_DIR_RX) {
+ pr_err("mlx_tls_del(): do not support recv\n");
+ return;
+ }
+
+ context = get_tls_context(sk);
+ if (context)
+ mlx_tls_hw_stop_cmd(netdev, context);
+ else
+ pr_err("delete non-offloaded context\n");
+}
+
+static const struct tlsdev_ops mlx_tls_ops = {
+ .tls_dev_add = mlx_tls_add,
+ .tls_dev_del = mlx_tls_del,
+};
+
+struct mlx_tls_dev *mlx_tls_find_dev_by_netdev(struct net_device *netdev)
+{
+ struct mlx_tls_dev *dev;
+
+ mutex_lock(&mlx_tls_mutex);
+ dev = find_mlx_tls_dev_by_netdev(netdev);
+ mutex_unlock(&mlx_tls_mutex);
+ return dev;
+}
+
+#define SYNDROME_OFFLOAD_REQUIRED 32
+#define SYNDROME_SYNC 33
+#define SYNDROME_BYPASS 34
+
+#define MIN_BYPASS_RECORD_SIZE 29
+#define BYPASS_RECORD_PADDING_SIZE \
+ (MIN_BYPASS_RECORD_SIZE - TLS_HEADER_SIZE)
+
+#define MAX_BYPASS_SIZE ((1 << 15) - BYPASS_RECORD_PADDING_SIZE - 1)
+
+static void create_bypass_record(u8 *buf, u16 len)
+{
+ len += BYPASS_RECORD_PADDING_SIZE;
+ buf[0] = TLS_RECORD_TYPE_DATA;
+ buf[1] = TLS_1_2_VERSION_MAJOR;
+ buf[2] = TLS_1_2_VERSION_MINOR;
+ buf[3] = len >> 8;
+ buf[4] = len & 0xFF;
+ memset(buf + TLS_HEADER_SIZE, 0, BYPASS_RECORD_PADDING_SIZE);
+}
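+
+/* Worked example (editor's illustration; assumes TLS_RECORD_TYPE_DATA is
+ * the standard application_data content type 23 == 0x17): a 100-byte
+ * payload gives len = 100 + BYPASS_RECORD_PADDING_SIZE = 124 = 0x007c,
+ * so buf = { 0x17, 0x03, 0x03, 0x00, 0x7c } followed by 24 zero bytes.
+ */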
+
+struct sync_info {
+ s32 sync_len;
+ int nr_frags;
+ skb_frag_t frags[MAX_SKB_FRAGS];
+};
+
+static int get_sync_data(struct mlx_tls_offload_context *context,
+ u32 tcp_seq, struct sync_info *info)
+{
+ struct tls_record_info *record;
+ unsigned long flags;
+ int remaining;
+ s32 sync_size;
+ int ret = -EINVAL;
+ int i = 0;
+
+ spin_lock_irqsave(&context->context.lock, flags);
+ record = tls_get_record(&context->context, tcp_seq);
+
+ if (unlikely(!record)) {
+ pr_err("record not found for seq %u\n", tcp_seq);
+ goto out;
+ }
+
+ sync_size = tcp_seq - (record->end_seq - record->len);
+ info->sync_len = sync_size;
+ if (unlikely(sync_size < 0)) {
+ if (record->len != 0) {
+ pr_err("Invalid record for seq %u\n", tcp_seq);
+ goto out;
+ }
+ goto done;
+ }
+
+ remaining = sync_size;
+ while (remaining > 0) {
+ info->frags[i] = record->frags[i];
+ __skb_frag_ref(&info->frags[i]);
+ remaining -= skb_frag_size(&info->frags[i]);
+
+ if (remaining < 0) {
+ skb_frag_size_add(
+ &info->frags[i],
+ remaining);
+ }
+
+ i++;
+ }
+ info->nr_frags = i;
+done:
+ ret = 0;
+out:
+ spin_unlock_irqrestore(&context->context.lock, flags);
+ return ret;
+}
+
+static struct sk_buff *complete_sync_skb(
+ struct sk_buff *skb,
+ struct sk_buff *nskb,
+ u32 tcp_seq,
+ int headln,
+ unsigned char syndrome)
+{
+ struct iphdr *iph;
+ struct tcphdr *th;
+ int mss;
+ struct pet *pet;
+ __be16 tcp_seq_low;
+
+ nskb->dev = skb->dev;
+ skb_reset_mac_header(nskb);
+ skb_set_network_header(nskb, skb_network_offset(skb));
+ skb_set_transport_header(nskb, skb_transport_offset(skb));
+ memcpy(nskb->data, skb->data, headln);
+
+ iph = ip_hdr(nskb);
+ iph->tot_len = htons(nskb->len - skb_network_offset(nskb));
+ th = tcp_hdr(nskb);
+ tcp_seq -= nskb->data_len;
+ th->seq = htonl(tcp_seq);
+ tcp_seq_low = htons(tcp_seq);
+
+ mss = nskb->dev->mtu - (headln - skb_network_offset(nskb));
+ skb_shinfo(nskb)->gso_size = 0;
+ if (nskb->data_len > mss) {
+ skb_shinfo(nskb)->gso_size = mss;
+ skb_shinfo(nskb)->gso_segs = DIV_ROUND_UP(nskb->data_len, mss);
+ }
+ skb_shinfo(nskb)->gso_type = skb_shinfo(skb)->gso_type;
+
+ nskb->queue_mapping = skb->queue_mapping;
+
+ pet = (struct pet *)(nskb->data + sizeof(struct ethhdr));
+ pet->syndrome = syndrome;
+ memcpy(pet->content.raw, &tcp_seq_low, sizeof(tcp_seq_low));
+
+ nskb->ip_summed = CHECKSUM_PARTIAL;
+ __skb_pull(nskb, skb_transport_offset(skb));
+ inet_csk(skb->sk)->icsk_af_ops->send_check(skb->sk, nskb);
+ __skb_push(nskb, skb_transport_offset(skb));
+
+ nskb->next = skb;
+ nskb->xmit_more = 1;
+ return nskb;
+}
+
+static void strip_pet(struct sk_buff *skb)
+{
+ struct ethhdr *old_eth;
+ struct ethhdr *new_eth;
+
+ old_eth = (struct ethhdr *)((skb->data) - sizeof(struct ethhdr));
+ new_eth = (struct ethhdr *)((skb_pull_inline(skb, sizeof(struct pet)))
+ - sizeof(struct ethhdr));
+ skb->mac_header += sizeof(struct pet);
+
+ memmove(new_eth, old_eth, 2 * ETH_ALEN);
+ /* Ethertype is already in its new place */
+}
+
+static struct sk_buff *handle_ooo(struct mlx_tls_offload_context *context,
+ struct sk_buff *skb)
+{
+ struct sync_info info;
+ u32 tcp_seq = ntohl(tcp_hdr(skb)->seq);
+ struct sk_buff *nskb;
+ int linear_len = 0;
+ int headln;
+ unsigned char syndrome = SYNDROME_SYNC;
+
+ if (get_sync_data(context, tcp_seq, &info)) {
+ dev_kfree_skb_any(skb);
+ return NULL;
+ }
+
+ headln = skb_transport_offset(skb) + tcp_hdrlen(skb);
+
+ if (unlikely(info.sync_len < 0)) {
+ if (-info.sync_len > MAX_BYPASS_SIZE) {
+ if (skb->len - headln > -info.sync_len) {
+ pr_err("Required bypass record is too big\n");
+ /* can fragment into two large SKBs in SW */
+ return NULL;
+ }
+ skb_push(skb, sizeof(struct ethhdr));
+ strip_pet(skb);
+ skb_pull(skb, sizeof(struct ethhdr));
+ return skb;
+ }
+
+ linear_len = MIN_BYPASS_RECORD_SIZE;
+ }
+
+ linear_len += headln;
+ nskb = alloc_skb(linear_len, GFP_ATOMIC);
+ if (unlikely(!nskb)) {
+ dev_kfree_skb_any(skb);
+ return NULL;
+ }
+
+ skb_put(nskb, linear_len);
+ syndrome = SYNDROME_SYNC;
+ if (likely(info.sync_len >= 0)) {
+ int i;
+
+ for (i = 0; i < info.nr_frags; i++)
+ skb_shinfo(nskb)->frags[i] = info.frags[i];
+
+ skb_shinfo(nskb)->nr_frags = info.nr_frags;
+ nskb->data_len = info.sync_len;
+ nskb->len += info.sync_len;
+ } else {
+ create_bypass_record(nskb->data + headln, -info.sync_len);
+ tcp_seq -= MIN_BYPASS_RECORD_SIZE;
+ syndrome = SYNDROME_BYPASS;
+ }
+
+ return complete_sync_skb(skb, nskb, tcp_seq, headln, syndrome);
+}
+
+static int insert_pet(struct sk_buff *skb)
+{
+ struct ethhdr *eth;
+ struct pet *pet;
+ struct mlx_tls_offload_context *context;
+
+ pr_debug("insert_pet started\n");
+ if (skb_cow_head(skb, sizeof(struct pet)))
+ return -ENOMEM;
+
+ eth = (struct ethhdr *)skb_push(skb, sizeof(struct pet));
+ skb->mac_header -= sizeof(struct pet);
+ pet = (struct pet *)(eth + 1);
+
+ memmove(skb->data, skb->data + sizeof(struct pet), 2 * ETH_ALEN);
+
+ eth->h_proto = cpu_to_be16(MLX5_METADATA_ETHER_TYPE);
+ pet->syndrome = SYNDROME_OFFLOAD_REQUIRED;
+
+ memset(pet->content.raw, 0, sizeof(pet->content.raw));
+ context = get_tls_context(skb->sk);
+ memcpy(pet->content.send.sid, &context->swid,
+ sizeof(pet->content.send.sid));
+
+ return 0;
+}
+
+static struct sk_buff *mlx_tls_tx_handler(struct sk_buff *skb,
+ struct mlx5_swp_info *swp_info)
+{
+ struct mlx_tls_offload_context *context;
+ int datalen;
+ u32 skb_seq;
+
+ pr_debug("mlx_tls_tx_handler started\n");
+
+ if (!skb->sk || !tls_is_sk_tx_device_offloaded(skb->sk))
+ goto out;
+
+ datalen = skb->len - (skb_transport_offset(skb) + tcp_hdrlen(skb));
+ if (!datalen)
+ goto out;
+
+ skb_seq = ntohl(tcp_hdr(skb)->seq);
+
+ context = get_tls_context(skb->sk);
+ pr_debug("mlx_tls_tx_handler: mapping: %u cpu %u size %u with swid %u expectedSN: %u actualSN: %u\n",
+ skb->queue_mapping, smp_processor_id(), skb->len,
+ ntohl(context->swid), context->context.expectedSN, skb_seq);
+
+ insert_pet(skb);
+
+ if (unlikely(context->context.expectedSN != skb_seq)) {
+ skb = handle_ooo(context, skb);
+ if (!skb)
+ goto out;
+
+ pr_info("Sending sync packet\n");
+
+ if (!skb->next)
+ goto out;
+ }
+ context->context.expectedSN = skb_seq + datalen;
+
+out:
+ return skb;
+}
+
+static struct sk_buff *mlx_tls_rx_handler(struct sk_buff *skb, u8 *rawpet,
+ u8 petlen)
+{
+ struct pet *pet = (struct pet *)rawpet;
+
+ if (petlen != sizeof(*pet))
+ goto out;
+
+ dev_dbg(&skb->dev->dev, ">> rx_handler %u bytes\n", skb->len);
+ dev_dbg(&skb->dev->dev, " RX PET: size %lu, etherType %04X, syndrome %02x\n",
+ sizeof(*pet), be16_to_cpu(pet->ethertype), pet->syndrome);
+
+ if (pet->syndrome != 48) {
+ dev_dbg(&skb->dev->dev, "unexpected pet syndrome %d\n",
+ pet->syndrome);
+ goto out;
+ }
+
+out:
+ return skb;
+}
+
+/* Must hold mlx_tls_mutex to call this function.
+ * Assumes that dev->core_ctx is destroyed by the caller
+ */
+static void mlx_tls_free(struct mlx_tls_dev *dev)
+{
+ list_del(&dev->accel_dev_list);
+#ifdef MLX_TLS_SADB_RDMA
+ kobject_put(&dev->kobj);
+#endif
+ dev_put(dev->netdev);
+ kfree(dev);
+}
+
+int mlx_tls_netdev_event(struct notifier_block *this, unsigned long event,
+ void *ptr)
+{
+ struct net_device *netdev = netdev_notifier_info_to_dev(ptr);
+ struct mlx_tls_dev *accel_dev = NULL;
+
+ if (!netdev)
+ goto out;
+
+ pr_debug("mlx_tls_netdev_event: %lu\n", event);
+
+ /* We are interested only in net devices going down */
+ if (event != NETDEV_UNREGISTER)
+ goto out;
+
+ /* Take down all connections using a netdev that is going down */
+ mutex_lock(&mlx_tls_mutex);
+ accel_dev = find_mlx_tls_dev_by_netdev(netdev);
+ if (!accel_dev) {
+ pr_debug("mlx_tls_netdev_event: Failed to find tls device for net device\n");
+ goto unlock;
+ }
+ mlx_tls_free(accel_dev);
+
+unlock:
+ mutex_unlock(&mlx_tls_mutex);
+out:
+ return NOTIFY_DONE;
+}
+
+static struct mlx5_accel_ops mlx_tls_client_ops = {
+ .rx_handler = mlx_tls_rx_handler,
+ .tx_handler = mlx_tls_tx_handler,
+ .feature_chk = mlx_tls_feature_chk,
+ .get_count = mlx_tls_get_count,
+ .get_strings = mlx_tls_get_strings,
+ .get_stats = mlx_tls_get_stats,
+ .mtu_extra = sizeof(struct pet),
+ .features = 0,
+};
+
+int mlx_tls_add_one(struct mlx_accel_core_device *accel_device)
+{
+ int ret = 0;
+ struct mlx_tls_dev *dev = NULL;
+ struct net_device *netdev = NULL;
+#ifdef MLX_TLS_SADB_RDMA
+ struct mlx_accel_core_conn_init_attr init_attr = {0};
+#endif
+ pr_debug("mlx_tls_add_one called for %s\n", accel_device->name);
+
+ dev = kzalloc(sizeof(*dev), GFP_KERNEL);
+ if (!dev) {
+ ret = -ENOMEM;
+ goto out;
+ }
+
+ INIT_LIST_HEAD(&dev->accel_dev_list);
+ dev->accel_device = accel_device;
+ ida_init(&dev->swid_ida);
+
+#ifdef MLX_TLS_SADB_RDMA
+ init_attr.rx_size = 128;
+ init_attr.tx_size = 32;
+ init_attr.recv_cb = mlx_tls_hw_qp_recv_cb;
+ init_attr.cb_arg = dev;
+ dev->conn = mlx_accel_core_conn_create(accel_device, &init_attr);
+ if (IS_ERR(dev->conn)) {
+ ret = PTR_ERR(dev->conn);
+ pr_err("mlx_tls_add_one(): Got error while creating connection %d\n",
+ ret);
+ goto err_dev;
+ }
+#endif
+ netdev = accel_device->ib_dev->get_netdev(accel_device->ib_dev,
+ accel_device->port);
+ if (!netdev) {
+ pr_err("mlx_tls_add_one(): Failed to retrieve net device from ib device\n");
+ ret = -EINVAL;
+ goto err_conn;
+ }
+ dev->netdev = netdev;
+
+ ret = mlx_accel_core_client_ops_register(accel_device,
+ &mlx_tls_client_ops);
+ if (ret) {
+ pr_err("mlx_tls_add_one(): Failed to register client ops %d\n",
+ ret);
+ goto err_netdev;
+ }
+
+#ifdef MLX_TLS_SADB_RDMA
+ ret = tls_sysfs_init_and_add(&dev->kobj,
+ mlx_accel_core_kobj(dev->accel_device),
+ "%s",
+ "accel_dev");
+ if (ret) {
+ pr_err("mlx_tls_add_one(): Got error from kobject_init_and_add %d\n",
+ ret);
+ goto err_ops_register;
+ }
+#endif
+
+ mutex_lock(&mlx_tls_mutex);
+ list_add(&dev->accel_dev_list, &mlx_tls_devs);
+ mutex_unlock(&mlx_tls_mutex);
+
+ dev->netdev->tlsdev_ops = &mlx_tls_ops;
+ goto out;
+
+#ifdef MLX_TLS_SADB_RDMA
+err_ops_register:
+ mlx_accel_core_client_ops_unregister(accel_device);
+#endif
+err_netdev:
+ dev_put(netdev);
+err_conn:
+#ifdef MLX_TLS_SADB_RDMA
+ mlx_accel_core_conn_destroy(dev->conn);
+err_dev:
+#endif
+ kfree(dev);
+out:
+ return ret;
+}
+
+void mlx_tls_remove_one(struct mlx_accel_core_device *accel_device)
+{
+ struct mlx_tls_dev *dev;
+ struct net_device *netdev = NULL;
+
+ pr_debug("mlx_tls_remove_one called for %s\n", accel_device->name);
+
+ mutex_lock(&mlx_tls_mutex);
+
+ list_for_each_entry(dev, &mlx_tls_devs, accel_dev_list) {
+ if (dev->accel_device == accel_device) {
+ netdev = dev->netdev;
+ netdev->tlsdev_ops = NULL;
+ mlx_accel_core_client_ops_unregister(accel_device);
+#ifdef MLX_TLS_SADB_RDMA
+ mlx_accel_core_conn_destroy(dev->conn);
+#endif
+ mlx_tls_free(dev);
+ break;
+ }
+ }
+ mutex_unlock(&mlx_tls_mutex);
+}
diff --git a/drivers/net/ethernet/mellanox/accelerator/tls/tls.h b/drivers/net/ethernet/mellanox/accelerator/tls/tls.h
new file mode 100644
index 0000000..7c7539a
--- /dev/null
+++ b/drivers/net/ethernet/mellanox/accelerator/tls/tls.h
@@ -0,0 +1,100 @@
+/*
+ * Copyright (c) 2015-2017 Mellanox Technologies. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ *
+ */
+#ifndef __TLS_H__
+#define __TLS_H__
+
+#include <linux/types.h>
+#include <linux/kobject.h>
+#include <linux/kfifo.h>
+#include <linux/list.h>
+#include <linux/skbuff.h>
+#include <linux/hashtable.h>
+#include <net/sock.h>
+#include <net/inet_common.h>
+#include <net/tls.h>
+#include <linux/mlx5/accel_sdk.h>
+
+#include "tls_cmds.h"
+
+#define DRIVER_NAME "mlx_tls"
+#define DRIVER_VERSION "0.1"
+#define DRIVER_RELDATE "January 2016"
+
+#define MLX_TLS_DEVICE_NAME "mlx_tls"
+/* TODO: Consider moving this to include/uapi/linux/if_ether.h */
+
+struct send_pet_content {
+ /* The next field is meaningful only for sync packets with LSO
+ * enabled (by the syndrome field)
+ */
+ __be16 first_seq; /* LSBs of the first TCP seq in the packet */
+ unsigned char sid[3];
+} __packed;
+
+/* TODO: move this to HW/cmds header files when added */
+struct pet {
+ unsigned char syndrome;
+ union {
+ unsigned char raw[5];
+ /* from host to FPGA */
+ struct send_pet_content send;
+ } __packed content;
+ /* packet type ID field */
+ __be16 ethertype;
+} __packed;
+
+struct mlx_tls_dev {
+ struct kobject kobj;
+ struct list_head accel_dev_list;
+ struct mlx_accel_core_device *accel_device;
+ struct mlx_accel_core_conn *conn;
+ struct net_device *netdev;
+ struct ida swid_ida;
+};
+
+struct mlx_tls_offload_context {
+ struct tls_offload_context context;
+ struct list_head tls_del_list;
+ struct net_device *netdev;
+ __be32 swid;
+};
+
+int mlx_tls_netdev_event(struct notifier_block *this,
+ unsigned long event, void *ptr);
+
+int mlx_tls_add_one(struct mlx_accel_core_device *accel_device);
+void mlx_tls_remove_one(struct mlx_accel_core_device *accel_device);
+
+struct mlx_tls_dev *mlx_tls_find_dev_by_netdev(struct net_device *netdev);
+
+#endif /* __TLS_H__ */
--
2.7.4

2017-03-28 13:26:26

by Aviad Yehezkel

[permalink] [raw]
Subject: [RFC TLS Offload Support 09/15] mlx/tls: Sysfs configuration interface

From: Ilya Lesokhin <[email protected]>

Configure the driver/hardware interface via sysfs.

Signed-off-by: Guy Shapiro <[email protected]>
Signed-off-by: Ilya Lesokhin <[email protected]>
Signed-off-by: Matan Barak <[email protected]>
Signed-off-by: Aviad Yehezkel <[email protected]>
---
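As a usage illustration (not part of the patch): a minimal userspace
sketch that drives the dqpn attribute added below. The sysfs path is an
assumption; the attribute name and its side effect (writing the
destination QP number triggers mlx_accel_core_connect()) come from the
code in this patch.

	#include <stdio.h>

	int main(void)
	{
		/* hypothetical path; the real parent kobject comes from
		 * mlx_accel_core_kobj(), so the prefix will differ
		 */
		const char *attr =
			"/sys/class/infiniband/mlx5_0/device/accel_dev/dqpn";
		FILE *f = fopen(attr, "w");

		if (!f) {
			perror(attr);
			return 1;
		}
		/* write the FPGA-side QP number to start the connection */
		fprintf(f, "%u\n", 1234);
		return fclose(f) ? 1 : 0;
	}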
.../ethernet/mellanox/accelerator/tls/tls_sysfs.c | 194 +++++++++++++++++++++
.../ethernet/mellanox/accelerator/tls/tls_sysfs.h | 45 +++++
2 files changed, 239 insertions(+)
create mode 100644 drivers/net/ethernet/mellanox/accelerator/tls/tls_sysfs.c
create mode 100644 drivers/net/ethernet/mellanox/accelerator/tls/tls_sysfs.h

diff --git a/drivers/net/ethernet/mellanox/accelerator/tls/tls_sysfs.c b/drivers/net/ethernet/mellanox/accelerator/tls/tls_sysfs.c
new file mode 100644
index 0000000..2860fc3
--- /dev/null
+++ b/drivers/net/ethernet/mellanox/accelerator/tls/tls_sysfs.c
@@ -0,0 +1,194 @@
+/*
+ * Copyright (c) 2015-2017 Mellanox Technologies. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ *
+ */
+
+#include <rdma/ib_verbs.h>
+
+#include "tls_sysfs.h"
+#include "tls_cmds.h"
+
+#ifdef MLX_TLS_SADB_RDMA
+struct mlx_tls_attribute {
+ struct attribute attr;
+ ssize_t (*show)(struct mlx_tls_dev *dev, char *buf);
+ ssize_t (*store)(struct mlx_tls_dev *dev, const char *buf,
+ size_t count);
+};
+
+#define MLX_TLS_ATTR(_name, _mode, _show, _store) \
+ struct mlx_tls_attribute mlx_tls_attr_##_name = { \
+ .attr = {.name = __stringify(_name), .mode = _mode}, \
+ .show = _show, \
+ .store = _store, \
+ }
+#define to_mlx_tls_dev(obj) \
+ container_of(obj, struct mlx_tls_dev, kobj)
+#define to_mlx_tls_attr(_attr) \
+ container_of(_attr, struct mlx_tls_attribute, attr)
+
+static ssize_t mlx_tls_attr_show(struct kobject *kobj, struct attribute *attr,
+ char *buf)
+{
+ struct mlx_tls_dev *dev = to_mlx_tls_dev(kobj);
+ struct mlx_tls_attribute *mlx_tls_attr = to_mlx_tls_attr(attr);
+ ssize_t ret = -EIO;
+
+ if (mlx_tls_attr->show)
+ ret = mlx_tls_attr->show(dev, buf);
+
+ return ret;
+}
+
+static ssize_t mlx_tls_attr_store(struct kobject *kobj, struct attribute *attr,
+ const char *buf, size_t count)
+{
+ struct mlx_tls_dev *dev = to_mlx_tls_dev(kobj);
+ struct mlx_tls_attribute *mlx_tls_attr = to_mlx_tls_attr(attr);
+ ssize_t ret = -EIO;
+
+ if (mlx_tls_attr->store)
+ ret = mlx_tls_attr->store(dev, buf, count);
+
+ return ret;
+}
+
+static ssize_t mlx_tls_sqpn_read(struct mlx_tls_dev *dev, char *buf)
+{
+ return sprintf(buf, "%u\n", dev->conn->qp->qp_num);
+}
+
+static ssize_t mlx_tls_sgid_read(struct mlx_tls_dev *dev, char *buf)
+{
+ union ib_gid *sgid = (union ib_gid *)&dev->conn->fpga_qpc.remote_ip;
+
+ return sprintf(buf, "%04x:%04x:%04x:%04x:%04x:%04x:%04x:%04x\n",
+ be16_to_cpu(((__be16 *)sgid->raw)[0]),
+ be16_to_cpu(((__be16 *)sgid->raw)[1]),
+ be16_to_cpu(((__be16 *)sgid->raw)[2]),
+ be16_to_cpu(((__be16 *)sgid->raw)[3]),
+ be16_to_cpu(((__be16 *)sgid->raw)[4]),
+ be16_to_cpu(((__be16 *)sgid->raw)[5]),
+ be16_to_cpu(((__be16 *)sgid->raw)[6]),
+ be16_to_cpu(((__be16 *)sgid->raw)[7]));
+}
+
+static ssize_t mlx_tls_dqpn_read(struct mlx_tls_dev *dev, char *buf)
+{
+ return sprintf(buf, "%u\n", dev->conn->fpga_qpn);
+}
+
+static ssize_t mlx_tls_dqpn_write(struct mlx_tls_dev *dev, const char *buf,
+ size_t count)
+{
+ int tmp;
+
+ tmp = sscanf(buf, "%u\n", &dev->conn->fpga_qpn);
+ if (tmp != 1)
+ return -EINVAL;
+
+ mlx_accel_core_connect(dev->conn);
+
+ return count;
+}
+
+static ssize_t mlx_tls_dgid_read(struct mlx_tls_dev *dev, char *buf)
+{
+ union ib_gid *dgid = (union ib_gid *)&dev->conn->fpga_qpc.fpga_ip;
+
+ return sprintf(buf, "%04x:%04x:%04x:%04x:%04x:%04x:%04x:%04x\n",
+ be16_to_cpu(((__be16 *)dgid->raw)[0]),
+ be16_to_cpu(((__be16 *)dgid->raw)[1]),
+ be16_to_cpu(((__be16 *)dgid->raw)[2]),
+ be16_to_cpu(((__be16 *)dgid->raw)[3]),
+ be16_to_cpu(((__be16 *)dgid->raw)[4]),
+ be16_to_cpu(((__be16 *)dgid->raw)[5]),
+ be16_to_cpu(((__be16 *)dgid->raw)[6]),
+ be16_to_cpu(((__be16 *)dgid->raw)[7]));
+}
+
+static ssize_t mlx_tls_dgid_write(struct mlx_tls_dev *dev, const char *buf,
+ size_t count)
+{
+ union ib_gid *dgid = (union ib_gid *)&dev->conn->fpga_qpc.fpga_ip;
+ int i = 0;
+ int tmp;
+
+ tmp = sscanf(buf,
+ "%04hx:%04hx:%04hx:%04hx:%04hx:%04hx:%04hx:%04hx\n",
+ &(((__be16 *)dgid->raw)[0]),
+ &(((__be16 *)dgid->raw)[1]),
+ &(((__be16 *)dgid->raw)[2]),
+ &(((__be16 *)dgid->raw)[3]),
+ &(((__be16 *)dgid->raw)[4]),
+ &(((__be16 *)dgid->raw)[5]),
+ &(((__be16 *)dgid->raw)[6]),
+ &(((__be16 *)dgid->raw)[7]));
+ if (tmp != 8)
+ return -EINVAL;
+
+ /* sscanf stored host-order values; convert in place to big endian */
+ for (i = 0; i < 8; i++)
+ ((__be16 *)dgid->raw)[i] = cpu_to_be16(((u16 *)dgid->raw)[i]);
+
+ return count;
+}
+
+static void mlx_tls_dev_release(struct kobject *kobj)
+{
+ /* Empty on purpose: the embedding mlx_tls_dev is freed in
+ * mlx_tls_free() after the final kobject_put().
+ */
+}
+
+static MLX_TLS_ATTR(sqpn, 0444, mlx_tls_sqpn_read, NULL);
+static MLX_TLS_ATTR(sgid, 0444, mlx_tls_sgid_read, NULL);
+static MLX_TLS_ATTR(dqpn, 0666, mlx_tls_dqpn_read, mlx_tls_dqpn_write);
+static MLX_TLS_ATTR(dgid, 0666, mlx_tls_dgid_read, mlx_tls_dgid_write);
+
+struct attribute *mlx_tls_def_attrs[] = {
+ &mlx_tls_attr_sqpn.attr,
+ &mlx_tls_attr_sgid.attr,
+ &mlx_tls_attr_dqpn.attr,
+ &mlx_tls_attr_dgid.attr,
+ NULL,
+};
+
+const struct sysfs_ops mlx_tls_dev_sysfs_ops = {
+ .show = mlx_tls_attr_show,
+ .store = mlx_tls_attr_store,
+};
+
+static struct kobj_type mlx_tls_dev_type = {
+ .release = mlx_tls_dev_release,
+ .sysfs_ops = &mlx_tls_dev_sysfs_ops,
+ .default_attrs = mlx_tls_def_attrs,
+};
+
+int tls_sysfs_init_and_add(struct kobject *kobj, struct kobject *parent,
+ const char *fmt, char *arg)
+{
+ return kobject_init_and_add(kobj, &mlx_tls_dev_type,
+ parent,
+ fmt, arg);
+}
+#endif
diff --git a/drivers/net/ethernet/mellanox/accelerator/tls/tls_sysfs.h b/drivers/net/ethernet/mellanox/accelerator/tls/tls_sysfs.h
new file mode 100644
index 0000000..bfaa857
--- /dev/null
+++ b/drivers/net/ethernet/mellanox/accelerator/tls/tls_sysfs.h
@@ -0,0 +1,45 @@
+/*
+ * Copyright (c) 2015-2017 Mellanox Technologies. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ *
+ */
+#ifndef __TLS_SYSFS_H__
+#define __TLS_SYSFS_H__
+
+#include <linux/sysfs.h>
+
+#include "tls.h"
+
+#ifdef MLX_TLS_SADB_RDMA
+int tls_sysfs_init_and_add(struct kobject *kobj, struct kobject *parent,
+ const char *fmt, char *arg);
+#endif
+
+#endif /* __TLS_SYSFS_H__ */
--
2.7.4

2017-03-28 13:27:12

by Aviad Yehezkel

[permalink] [raw]
Subject: [RFC TLS Offload Support 08/15] mlx/tls: Hardware interface

From: Ilya Lesokhin <[email protected]>

Implement the hardware interface used to set up and tear down TLS
offload streams. Depending on MLX_TLS_SADB_RDMA, the stream context is
either written directly into the FPGA CR-space gateways or sent as a
CMD_SETUP_STREAM command over the accelerator's RDMA queue pair.

Signed-off-by: Guy Shapiro <[email protected]>
Signed-off-by: Ilya Lesokhin <[email protected]>
Signed-off-by: Matan Barak <[email protected]>
Signed-off-by: Haggai Eran <[email protected]>
Signed-off-by: Aviad Yehezkel <[email protected]>
---
.../ethernet/mellanox/accelerator/tls/tls_cmds.h | 112 ++++++
.../net/ethernet/mellanox/accelerator/tls/tls_hw.c | 429 +++++++++++++++++++++
.../net/ethernet/mellanox/accelerator/tls/tls_hw.h | 49 +++
3 files changed, 590 insertions(+)
create mode 100644 drivers/net/ethernet/mellanox/accelerator/tls/tls_cmds.h
create mode 100644 drivers/net/ethernet/mellanox/accelerator/tls/tls_hw.c
create mode 100644 drivers/net/ethernet/mellanox/accelerator/tls/tls_hw.h

diff --git a/drivers/net/ethernet/mellanox/accelerator/tls/tls_cmds.h b/drivers/net/ethernet/mellanox/accelerator/tls/tls_cmds.h
new file mode 100644
index 0000000..8916f00
--- /dev/null
+++ b/drivers/net/ethernet/mellanox/accelerator/tls/tls_cmds.h
@@ -0,0 +1,112 @@
+/*
+ * Copyright (c) 2015-2017 Mellanox Technologies. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ *
+ */
+
+#ifndef MLX_TLS_CMDS_H
+#define MLX_TLS_CMDS_H
+
+#define MLX_TLS_SADB_RDMA
+
+enum fpga_cmds {
+ CMD_SETUP_STREAM = 1,
+ CMD_TEARDOWN_STREAM = 2,
+};
+
+enum fpga_response {
+ EVENT_SETUP_STREAM_RESPONSE = 0x81,
+};
+
+#define TLS_TCP_IP_PROTO BIT(3) /* 0 - UDP; 1 - TCP */
+#define TLS_TCP_INIT BIT(2) /* 1 - Initialized */
+#define TLS_TCP_VALID BIT(1) /* 1 - Valid */
+#define TLS_TCP_IPV6 BIT(0) /* 0 - IPv4; 1 - IPv6 */
+
+struct tls_cntx_tcp {
+ __be32 ip_da[4];
+ __be32 flags;
+ __be16 src_port;
+ __be16 dst_port;
+ u32 pad;
+ __be32 tcp_sn;
+ __be32 ip_sa[4];
+} __packed;
+
+struct tls_cntx_crypto {
+ u8 enc_state[16];
+ u8 enc_key[32];
+} __packed;
+
+struct tls_cntx_record {
+ u8 rcd_sn[8];
+ u16 pad;
+ u8 flags;
+ u8 rcd_type_ver;
+ __be32 rcd_tcp_sn_nxt;
+ __be32 rcd_implicit_iv;
+ u8 rcd_residue[32];
+} __packed;
+
+#define TLS_RCD_ENC_AES_GCM128 (0)
+#define TLS_RCD_ENC_AES_GCM256 (BIT(4))
+#define TLS_RCD_AUTH_AES_GCM128 (0)
+#define TLS_RCD_AUTH_AES_GCM256 (1)
+
+#define TLS_RCD_VER_1_2 (3)
+
+struct tls_cntx {
+ struct tls_cntx_tcp tcp;
+ struct tls_cntx_record rcd;
+ struct tls_cntx_crypto crypto;
+} __packed;
+
+struct setup_stream_cmd {
+ u8 cmd;
+ __be32 stream_id;
+ struct tls_cntx tls;
+} __packed;
+
+struct teardown_stream_cmd {
+ u8 cmd;
+ __be32 stream_id;
+} __packed;
+
+struct generic_event {
+ __be32 opcode;
+ __be32 stream_id;
+};
+
+struct setup_stream_response {
+ __be32 opcode;
+ __be32 stream_id;
+};
+
+#endif /* MLX_TLS_CMDS_H */
diff --git a/drivers/net/ethernet/mellanox/accelerator/tls/tls_hw.c b/drivers/net/ethernet/mellanox/accelerator/tls/tls_hw.c
new file mode 100644
index 0000000..3a02f1e
--- /dev/null
+++ b/drivers/net/ethernet/mellanox/accelerator/tls/tls_hw.c
@@ -0,0 +1,429 @@
+/*
+ * Copyright (c) 2015-2017 Mellanox Technologies. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ *
+ */
+#include "tls_hw.h"
+#include "tls_cmds.h"
+#include <linux/inetdevice.h>
+#include <linux/socket.h>
+
+static void mlx_tls_del_work(struct work_struct *w);
+
+static DEFINE_SPINLOCK(tls_del_lock);
+static DECLARE_WORK(tls_del_work, mlx_tls_del_work);
+static LIST_HEAD(tls_del_list);
+
+static int build_ctx(struct tls_crypto_info_aes_gcm_128 *crypto_info,
+ u32 expectedSN,
+ unsigned short skc_family,
+ struct inet_sock *inet,
+ struct tls_cntx *tls)
+{
+ if (skc_family != PF_INET6) {
+ tls->tcp.ip_sa[3] = inet->inet_rcv_saddr;
+ tls->tcp.ip_da[3] = inet->inet_daddr;
+ } else {
+#if IS_ENABLED(CONFIG_IPV6)
+ memcpy((void *)tls->tcp.ip_sa,
+ inet->pinet6->saddr.in6_u.u6_addr8, 16);
+ memcpy((void *)tls->tcp.ip_da,
+ inet->pinet6->daddr_cache->in6_u.u6_addr8, 16);
+#endif
+ /* The address copies above are preparatory only; IPv6 offload
+ * is not wired up yet, so reject such sockets for now.
+ */
+ pr_err("IPv6 isn't supported yet\n");
+ return -EINVAL;
+ }
+ tls->tcp.flags |= htonl(TLS_TCP_IP_PROTO);
+ tls->tcp.flags |= htonl(TLS_TCP_VALID);
+ tls->tcp.flags |= htonl(TLS_TCP_INIT);
+ tls->tcp.src_port = inet->inet_sport;
+ tls->tcp.dst_port = inet->inet_dport;
+ tls->tcp.tcp_sn = htonl(expectedSN);
+
+ tls->rcd.rcd_tcp_sn_nxt = htonl(expectedSN);
+ /* tls->rcd.enc_auth_mode |= TLS_RCD_AUTH_AES_GCM128; */
+ /* tls->rcd.enc_auth_mode |= TLS_RCD_ENC_AES_GCM128; */
+ tls->rcd.rcd_type_ver |= TLS_RCD_VER_1_2 << 4;
+
+ memcpy(&tls->rcd.rcd_implicit_iv, crypto_info->salt,
+ TLS_CIPHER_AES_GCM_128_SALT_SIZE);
+ memcpy(tls->rcd.rcd_sn, crypto_info->iv,
+ TLS_CIPHER_AES_GCM_128_IV_SIZE);
+ memcpy(tls->crypto.enc_key, crypto_info->key,
+ TLS_CIPHER_AES_GCM_128_KEY_SIZE);
+ memcpy(tls->crypto.enc_key + TLS_CIPHER_AES_GCM_128_KEY_SIZE,
+ crypto_info->key, TLS_CIPHER_AES_GCM_128_KEY_SIZE);
+
+ return 0;
+}
+
+#ifdef MLX_TLS_SADB_RDMA
+static int send_teardown_cmd(struct mlx_tls_dev *dev, __be32 swid)
+{
+ struct mlx_accel_core_dma_buf *buf;
+ struct teardown_stream_cmd *cmd;
+ int size = sizeof(*buf) + sizeof(*cmd);
+
+ buf = kzalloc(size, GFP_KERNEL);
+ if (!buf)
+ return -ENOMEM;
+
+ buf->data = buf + 1;
+ buf->data_size = sizeof(*cmd);
+
+ cmd = (struct teardown_stream_cmd *)buf->data;
+ cmd->cmd = CMD_TEARDOWN_STREAM;
+ cmd->stream_id = swid;
+
+ return mlx_accel_core_sendmsg(dev->conn, buf);
+}
+#endif
+
+static void mlx_tls_del_work(struct work_struct *w)
+{
+ struct mlx_tls_offload_context *context;
+ struct mlx_tls_dev *dev;
+
+ spin_lock_irq(&tls_del_lock);
+ while (true) {
+ context =
+ list_first_entry_or_null(&tls_del_list,
+ struct mlx_tls_offload_context,
+ tls_del_list);
+ spin_unlock_irq(&tls_del_lock);
+ if (!context)
+ break;
+
+ dev = mlx_tls_find_dev_by_netdev(context->netdev);
+ if (!dev) {
+ /* The device is already gone; just drop the context. */
+ module_put(THIS_MODULE);
+ spin_lock_irq(&tls_del_lock);
+ list_del(&context->tls_del_list);
+ kfree(context);
+ continue;
+ }
+
+#ifdef MLX_TLS_SADB_RDMA
+ if (send_teardown_cmd(dev, context->swid)) {
+ /* try again later */
+ schedule_work(w);
+ break;
+ }
+#endif
+
+ ida_simple_remove(&dev->swid_ida, ntohl(context->swid));
+
+ module_put(THIS_MODULE);
+
+ spin_lock_irq(&tls_del_lock);
+ list_del(&context->tls_del_list);
+ kfree(context);
+ }
+}
+
+void mlx_tls_hw_stop_cmd(struct net_device *netdev,
+ struct mlx_tls_offload_context *context)
+{
+ unsigned long flags;
+
+ pr_info("mlx_tls_hw_stop_cmd\n");
+ spin_lock_irqsave(&tls_del_lock, flags);
+ list_add_tail(&context->tls_del_list, &tls_del_list);
+ context->netdev = netdev;
+ schedule_work(&tls_del_work);
+ spin_unlock_irqrestore(&tls_del_lock, flags);
+}
+
+#ifndef MLX_TLS_SADB_RDMA
+#include "../core/accel_core.h"
+
+#define GW_CTRL_RW htonl(BIT(29))
+#define GW_CTRL_BUSY htonl(BIT(30))
+#define GW_CTRL_LOCK htonl(BIT(31))
+
+#define GW_CTRL_ADDR_SHIFT 26
+
+static int mlx_accel_gw_waitfor(struct mlx5_core_dev *dev, u64 addr, u32 mask,
+ u32 value)
+{
+ int ret = 0;
+ u32 gw_value;
+ int try = 0;
+ static const int max_tries = 100;
+
+ while (true) {
+ pr_debug("Waiting for %x/%x. Try %d\n", value, mask, try);
+ ret = mlx5_fpga_access_reg(dev, sizeof(u32), addr,
+ (u8 *)&gw_value, false);
+ if (ret)
+ return ret;
+
+ pr_debug("Value is %x\n", gw_value);
+ if ((gw_value & mask) == value)
+ break; /* the lock is taken automatically if it was 0 */
+ try++;
+ if (try >= max_tries) {
+ pr_debug("Timeout waiting for %x/%x at %llx. Value is %x after %d tries\n",
+ value, mask, addr, gw_value, try);
+ return -EBUSY;
+ }
+ usleep_range(10, 100);
+ }
+ return 0;
+}
+
+static int mlx_accel_gw_lock(struct mlx5_core_dev *dev, u64 addr)
+{
+ return mlx_accel_gw_waitfor(dev, addr, GW_CTRL_LOCK, 0);
+}
+
+static int mlx_accel_gw_unlock(struct mlx5_core_dev *dev, u64 addr)
+{
+ u32 gw_value;
+ int ret;
+
+ pr_debug("Unlocking %llx\n", addr);
+ ret = mlx5_fpga_access_reg(dev, sizeof(u32), addr,
+ (u8 *)&gw_value, false);
+ if (ret)
+ return ret;
+
+ if ((gw_value & GW_CTRL_LOCK) != GW_CTRL_LOCK)
+ pr_warn("Lock expected when unlocking, but not held for device %s addr %llx\n",
+ dev->priv.name, addr);
+
+ pr_debug("Old value %x\n", gw_value);
+ gw_value &= ~GW_CTRL_LOCK;
+ pr_debug("New value %x\n", gw_value);
+ ret = mlx5_fpga_access_reg(dev, sizeof(u32), addr,
+ (u8 *)&gw_value, true);
+ if (ret)
+ return ret;
+ return 0;
+}
+
+static int mlx_accel_gw_op(struct mlx5_core_dev *dev, u64 addr,
+ unsigned int index, bool write)
+{
+ u32 gw_value;
+ int ret;
+
+ if (index >= 8)
+ pr_warn("Trying to access index %u out of range for GW at %llx\n",
+ index, addr);
+
+ pr_debug("Performing op %u at %llx\n", write, addr);
+ ret = mlx5_fpga_access_reg(dev, sizeof(u32), addr,
+ (u8 *)&gw_value, false);
+ if (ret)
+ return ret;
+
+ pr_debug("Old op value is %x\n", gw_value);
+ if ((gw_value & GW_CTRL_LOCK) != GW_CTRL_LOCK)
+ pr_warn("Lock expected for %s, but not held for device %s addr %llx\n",
+ write ? "write" : "read", dev->priv.name, addr);
+
+ gw_value &= htonl(~(7 << GW_CTRL_ADDR_SHIFT));
+ gw_value |= htonl(index << GW_CTRL_ADDR_SHIFT);
+ if (write)
+ gw_value &= ~GW_CTRL_RW;
+ else
+ gw_value |= GW_CTRL_RW;
+
+ gw_value |= GW_CTRL_BUSY;
+
+ pr_debug("New op value is %x\n", gw_value);
+ ret = mlx5_fpga_access_reg(dev, sizeof(u32), addr,
+ (u8 *)&gw_value, true);
+ if (ret)
+ return ret;
+
+ return mlx_accel_gw_waitfor(dev, addr, GW_CTRL_BUSY, 0);
+}
+
+static int mlx_accel_gw_write(struct mlx5_core_dev *dev, u64 addr,
+ unsigned int index)
+{
+ return mlx_accel_gw_op(dev, addr, index, true);
+}
+
+#define CRSPACE_TCP_BASE 0x0
+#define CRSPACE_TCP_OFFSET 0x10
+#define CRSPACE_RECORD_BASE 0x100
+#define CRSPACE_RECORD_OFFSET 0xc
+#define CRSPACE_CRYPTO_BASE 0x180
+#define CRSPACE_CRYPTO_OFFSET 0x10
+
+static void write_context(struct mlx5_core_dev *dev, void *ctx,
+ size_t size, u64 base, u32 offset)
+{
+ mlx_accel_gw_lock(dev, base);
+ mlx5_fpga_access_reg(dev, size, base + offset, ctx, true);
+ mlx_accel_gw_write(dev, base, 0);
+ mlx_accel_gw_unlock(dev, base);
+}
+
+int mlx_tls_hw_start_cmd(struct mlx_tls_dev *dev, struct sock *sk,
+ struct tls_crypto_info_aes_gcm_128 *crypto_info,
+ struct mlx_tls_offload_context *context)
+{
+ struct tls_cntx tls;
+ int ret;
+ struct inet_sock *inet = inet_sk(sk);
+ u32 expectedSN = context->context.expectedSN;
+
+ memset(&tls, 0, sizeof(tls));
+
+ ret = build_ctx(crypto_info,
+ expectedSN,
+ sk->sk_family,
+ inet,
+ &tls);
+ if (ret)
+ return ret;
+
+ write_context(dev->accel_device->hw_dev,
+ &tls.rcd,
+ sizeof(tls.rcd),
+ CRSPACE_RECORD_BASE,
+ CRSPACE_RECORD_OFFSET);
+
+ write_context(dev->accel_device->hw_dev,
+ &tls.crypto,
+ sizeof(tls.crypto),
+ CRSPACE_CRYPTO_BASE,
+ CRSPACE_CRYPTO_OFFSET);
+
+ write_context(dev->accel_device->hw_dev,
+ &tls.tcp,
+ sizeof(tls.tcp),
+ CRSPACE_TCP_BASE,
+ CRSPACE_TCP_OFFSET);
+ return 0;
+}
+
+#else /* MLX_TLS_SADB_RDMA */
+static DEFINE_SPINLOCK(setup_stream_lock);
+static LIST_HEAD(setup_stream_list);
+struct setup_stream_t {
+ struct list_head list;
+ __be32 swid;
+ struct completion x;
+};
+
+static void mlx_accel_core_kfree_complete(struct mlx_accel_core_conn *conn,
+ struct mlx_accel_core_dma_buf *buf,
+ struct ib_wc *wc)
+{
+ kfree(buf);
+}
+
+int mlx_tls_hw_start_cmd(struct mlx_tls_dev *dev,
+ struct sock *sk,
+ struct tls_crypto_info_aes_gcm_128 *crypto_info,
+ struct mlx_tls_offload_context *context)
+{
+ struct mlx_accel_core_dma_buf *buf;
+ struct setup_stream_cmd *cmd;
+ struct inet_sock *inet = inet_sk(sk);
+ u32 expectedSN = context->context.expectedSN;
+ int ret;
+ int size = sizeof(*buf) + sizeof(*cmd);
+ struct setup_stream_t ss;
+
+ buf = kzalloc(size, GFP_KERNEL);
+ if (!buf)
+ return -ENOMEM;
+
+ buf->data = buf + 1;
+ buf->data_size = sizeof(*cmd);
+ buf->complete = mlx_accel_core_kfree_complete;
+
+ cmd = (struct setup_stream_cmd *)buf->data;
+ cmd->cmd = CMD_SETUP_STREAM;
+ cmd->stream_id = context->swid;
+
+ ret = build_ctx(crypto_info,
+ expectedSN,
+ sk->sk_family,
+ inet,
+ &cmd->tls);
+ if (ret) {
+ kfree(buf);
+ return ret;
+ }
+
+ ss.swid = context->swid;
+ init_completion(&ss.x);
+ spin_lock_irq(&setup_stream_lock);
+ list_add_tail(&ss.list, &setup_stream_list);
+ spin_unlock_irq(&setup_stream_lock);
+
+ mlx_accel_core_sendmsg(dev->conn, buf);
+ ret = wait_for_completion_killable(&ss.x);
+ if (ret) {
+ /* Interrupted: remove our entry unless the response handler
+ * already took it off the list (see list_del_init there).
+ */
+ spin_lock_irq(&setup_stream_lock);
+ if (!list_empty(&ss.list))
+ list_del(&ss.list);
+ spin_unlock_irq(&setup_stream_lock);
+ }
+
+ return ret;
+}
+
+static void handle_setup_stream_response(__be32 swid)
+{
+ struct setup_stream_t *ss;
+ unsigned long flags;
+ int found = 0;
+
+ spin_lock_irqsave(&setup_stream_lock, flags);
+ list_for_each_entry(ss, &setup_stream_list, list) {
+ if (ss->swid == swid) {
+ list_del_init(&ss->list);
+ complete(&ss->x);
+ found = 1;
+ break;
+ }
+ }
+ spin_unlock_irqrestore(&setup_stream_lock, flags);
+
+ if (!found)
+ pr_err("Got unexpected setup stream response swid = %u\n",
+ ntohl(swid));
+}
+
+void mlx_tls_hw_qp_recv_cb(void *cb_arg,
+ struct mlx_accel_core_dma_buf *buf)
+{
+ struct generic_event *ev = (struct generic_event *)buf->data;
+
+ switch (ev->opcode) {
+ case htonl(EVENT_SETUP_STREAM_RESPONSE):
+ handle_setup_stream_response(ev->stream_id);
+ break;
+ default:
+ pr_warn("mlx_tls_hw_qp_recv_cb: unexpected event opcode %u\n",
+ ntohl(ev->opcode));
+ }
+}
+
+#endif /* MLX_TLS_SADB_RDMA */
diff --git a/drivers/net/ethernet/mellanox/accelerator/tls/tls_hw.h b/drivers/net/ethernet/mellanox/accelerator/tls/tls_hw.h
new file mode 100644
index 0000000..5a11d30
--- /dev/null
+++ b/drivers/net/ethernet/mellanox/accelerator/tls/tls_hw.h
@@ -0,0 +1,49 @@
+/*
+ * Copyright (c) 2015-2017 Mellanox Technologies. All rights reserved.
+ *
+ * This software is available to you under a choice of one of two
+ * licenses. You may choose to be licensed under the terms of the GNU
+ * General Public License (GPL) Version 2, available from the file
+ * COPYING in the main directory of this source tree, or the
+ * OpenIB.org BSD license below:
+ *
+ * Redistribution and use in source and binary forms, with or
+ * without modification, are permitted provided that the following
+ * conditions are met:
+ *
+ * - Redistributions of source code must retain the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer.
+ *
+ * - Redistributions in binary form must reproduce the above
+ * copyright notice, this list of conditions and the following
+ * disclaimer in the documentation and/or other materials
+ * provided with the distribution.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+ * EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ * MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+ * NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS
+ * BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN
+ * ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN
+ * CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ * SOFTWARE.
+ *
+ */
+#ifndef __TLS_HW_H__
+#define __TLS_HW_H__
+
+#include "tls.h"
+
+int mlx_tls_hw_init(void);
+void mlx_tls_hw_deinit(void);
+
+int mlx_tls_hw_start_cmd(struct mlx_tls_dev *dev, struct sock *sk,
+ struct tls_crypto_info_aes_gcm_128 *crypto_info,
+ struct mlx_tls_offload_context *context);
+void mlx_tls_hw_stop_cmd(struct net_device *netdev,
+ struct mlx_tls_offload_context *context);
+void mlx_tls_hw_qp_recv_cb(void *cb_arg,
+ struct mlx_accel_core_dma_buf *buf);
+
+#endif /* __TLS_HW_H__ */
--
2.7.4

2017-03-28 13:26:21

by Aviad Yehezkel

[permalink] [raw]
Subject: [RFC TLS Offload Support 04/15] net: Add TLS offload netdevice and socket support

From: Ilya Lesokhin <[email protected]>

This patch adds a new NDO for adding and deleting TLS contexts on netdevices.

Signed-off-by: Boris Pismenny <[email protected]>
Signed-off-by: Ilya Lesokhin <[email protected]>
Signed-off-by: Aviad Yehezkel <[email protected]>
---
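For illustration only, a hedged sketch of how a driver might wire up
these ops. The callback signatures come from the patch below and the
mlx_tls_ops name matches the driver patches in this series, but the
bodies here are placeholder assumptions:

/* Hedged sketch: a TX-only driver implementation of the new ops. */
static int mlx_tls_dev_add(struct net_device *netdev, struct sock *sk,
			   enum tls_offload_ctx_dir direction,
			   struct tls_crypto_info *crypto_info,
			   struct tls_offload_context **ctx)
{
	if (direction != TLS_OFFLOAD_CTX_DIR_TX)
		return -EOPNOTSUPP;	/* this RFC offloads TX only */

	/* allocate a context, program the keys and TCP state into the
	 * NIC, and return the context through *ctx
	 */
	return 0;
}

static void mlx_tls_dev_del(struct net_device *netdev, struct sock *sk,
			    enum tls_offload_ctx_dir direction)
{
	/* tear down the hardware stream associated with sk */
}

static const struct tlsdev_ops mlx_tls_ops = {
	.tls_dev_add = mlx_tls_dev_add,
	.tls_dev_del = mlx_tls_dev_del,
};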
include/linux/netdevice.h | 23 +++++++++++++++++++++++
1 file changed, 23 insertions(+)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 51f9336..ce4760c 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -844,6 +844,25 @@ struct xfrmdev_ops {
};
#endif

+#if IS_ENABLED(CONFIG_TLS)
+enum tls_offload_ctx_dir {
+ TLS_OFFLOAD_CTX_DIR_RX,
+ TLS_OFFLOAD_CTX_DIR_TX,
+};
+
+struct tls_crypto_info;
+struct tls_offload_context;
+
+struct tlsdev_ops {
+ int (*tls_dev_add)(struct net_device *netdev, struct sock *sk,
+ enum tls_offload_ctx_dir direction,
+ struct tls_crypto_info *crypto_info,
+ struct tls_offload_context **ctx);
+ void (*tls_dev_del)(struct net_device *netdev, struct sock *sk,
+ enum tls_offload_ctx_dir direction);
+};
+#endif
+
/*
* This structure defines the management hooks for network devices.
* The following hooks can be defined; unless noted otherwise, they are
@@ -1722,6 +1741,10 @@ struct net_device {
const struct xfrmdev_ops *xfrmdev_ops;
#endif

+#if IS_ENABLED(CONFIG_TLS)
+ const struct tlsdev_ops *tlsdev_ops;
+#endif
+
const struct header_ops *header_ops;

unsigned int flags;
--
2.7.4

2017-03-28 14:56:46

by Tom Herbert

[permalink] [raw]
Subject: Re: [RFC TLS Offload Support 05/15] tcp: Add TLS socket options for TCP sockets

On Tue, Mar 28, 2017 at 6:26 AM, Aviad Yehezkel <[email protected]> wrote:
> This patch adds TLS_TX and TLS_RX TCP socket options.
>
> Setting these socket options will change the sk->sk_prot
> operations of the TCP socket. The user is responsible for
> preventing races between calls to the previous operations
> and the new operations. After successful return, data
> sent on this socket will be encapsulated in TLS.
>
> Signed-off-by: Aviad Yehezkel <[email protected]>
> Signed-off-by: Boris Pismenny <[email protected]>
> Signed-off-by: Ilya Lesokhin <[email protected]>
> ---
> include/uapi/linux/tcp.h | 2 ++
> net/ipv4/tcp.c | 32 ++++++++++++++++++++++++++++++++
> 2 files changed, 34 insertions(+)
>
> diff --git a/include/uapi/linux/tcp.h b/include/uapi/linux/tcp.h
> index c53de26..f9f0e29 100644
> --- a/include/uapi/linux/tcp.h
> +++ b/include/uapi/linux/tcp.h
> @@ -116,6 +116,8 @@ enum {
> #define TCP_SAVE_SYN 27 /* Record SYN headers for new connections */
> #define TCP_SAVED_SYN 28 /* Get SYN headers recorded for connection */
> #define TCP_REPAIR_WINDOW 29 /* Get/set window parameters */
> +#define TCP_TLS_TX 30
> +#define TCP_TLS_RX 31
>
> struct tcp_repair_opt {
> __u32 opt_code;
> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> index 302fee9..2d190e3 100644
> --- a/net/ipv4/tcp.c
> +++ b/net/ipv4/tcp.c
> @@ -273,6 +273,7 @@
> #include <net/icmp.h>
> #include <net/inet_common.h>
> #include <net/tcp.h>
> +#include <net/tls.h>
> #include <net/xfrm.h>
> #include <net/ip.h>
> #include <net/sock.h>
> @@ -2676,6 +2677,21 @@ static int do_tcp_setsockopt(struct sock *sk, int level,
> tp->notsent_lowat = val;
> sk->sk_write_space(sk);
> break;
> + case TCP_TLS_TX:
> + case TCP_TLS_RX: {
> + int (*fn)(struct sock *sk, int optname,
> + char __user *optval, unsigned int optlen);
> +
> + fn = symbol_get(tls_sk_attach);
> + if (!fn) {
> + err = -EINVAL;
> + break;
> + }
> +
> + err = fn(sk, optname, optval, optlen);
> + symbol_put(tls_sk_attach);
> + break;
> + }
> default:
> err = -ENOPROTOOPT;
> break;
> @@ -3064,6 +3080,22 @@ static int do_tcp_getsockopt(struct sock *sk, int level,
> }
> return 0;
> }
> + case TCP_TLS_TX:
> + case TCP_TLS_RX: {
> + int err;
> + int (*fn)(struct sock *sk, int optname,
> + char __user *optval, int __user *optlen);
> +
> + fn = symbol_get(tls_sk_query);
> + if (!fn) {
> + err = -EINVAL;
> + break;
> + }
> +
> + err = fn(sk, optname, optval, optlen);
> + symbol_put(tls_sk_query);
> + return err;
> + }

This mechanism should be generalized. If we can do this with TLS then
there will likely be other ULPs that we might want to set on a TCP
socket. Maybe something like TCP_ULP_PUSH, TCP_ULP_POP (borrowing from
STREAMS ever so slightly :-) ). I'd also suggest that the ULPs be
indicated by a text string in the socket option argument, and that each
ULP perform a registration for its service (a rough sketch of such a
registry follows).
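
A minimal sketch of what such a string-keyed registry could look like;
none of these names exist in this patch set, and a real interface would
also need duplicate-name checks and module refcounting:

/* Hedged sketch of a string-keyed ULP registry; all names here are
 * hypothetical and not part of this series.
 */
#include <linux/list.h>
#include <linux/spinlock.h>
#include <linux/string.h>
#include <net/sock.h>

struct tcp_ulp_ops {
	struct list_head list;
	char name[16];			/* e.g. "tls" */
	int (*init)(struct sock *sk);	/* attach the ULP to the socket */
	void (*release)(struct sock *sk);
};

static LIST_HEAD(tcp_ulp_list);
static DEFINE_SPINLOCK(tcp_ulp_lock);

int tcp_register_ulp(struct tcp_ulp_ops *ops)
{
	spin_lock(&tcp_ulp_lock);
	list_add_tail(&ops->list, &tcp_ulp_list);
	spin_unlock(&tcp_ulp_lock);
	return 0;
}

/* Resolve the name passed in the setsockopt() argument to a ULP. */
static struct tcp_ulp_ops *tcp_ulp_find(const char *name)
{
	struct tcp_ulp_ops *ops, *found = NULL;

	spin_lock(&tcp_ulp_lock);
	list_for_each_entry(ops, &tcp_ulp_list, list) {
		if (!strcmp(ops->name, name)) {
			found = ops;
			break;
		}
	}
	spin_unlock(&tcp_ulp_lock);
	return found;
}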


> default:
> return -ENOPROTOOPT;
> }
> --
> 2.7.4
>

2017-03-29 17:41:25

by David Miller

[permalink] [raw]
Subject: Re: [RFC TLS Offload Support 00/15] cover letter

From: Aviad Yehezkel <[email protected]>
Date: Tue, 28 Mar 2017 16:26:17 +0300

> TLS Tx crypto offload is a new feature of network devices. It
> enables the kernel TLS socket to skip encryption and authentication
> operations on the transmit side of the data path, delegating those
> to the NIC. In turn, the NIC encrypts packets that belong to an
> offloaded TLS socket on the fly. The NIC does not modify any packet
> headers. It expects to receive fully framed TCP packets with TLS
> records as payload. The NIC replaces plaintext with ciphertext and
> fills the authentication tag. The NIC does not hold any state beyond
> the context needed to encrypt the next expected packet,
> i.e. expected TCP sequence number and crypto state.

It seems like, since you do the TLS framing in TCP and the card is
expecting to fill in certain aspects, there is a requirement that the
packet contents aren't mangled between the TLS framing code and when
the SKB hits the card.

Is this right?

For example, what happens if netfilter splits a TLS Tx offloaded frame
into two TCP segments?

2017-03-29 18:11:05

by Hannes Frederic Sowa

[permalink] [raw]
Subject: Re: [RFC TLS Offload Support 00/15] cover letter

Hello,

On 29.03.2017 19:41, David Miller wrote:
> From: Aviad Yehezkel <[email protected]>
> Date: Tue, 28 Mar 2017 16:26:17 +0300
>
>> TLS Tx crypto offload is a new feature of network devices. It
>> enables the kernel TLS socket to skip encryption and authentication
>> operations on the transmit side of the data path, delegating those
>> to the NIC. In turn, the NIC encrypts packets that belong to an
>> offloaded TLS socket on the fly. The NIC does not modify any packet
>> headers. It expects to receive fully framed TCP packets with TLS
>> records as payload. The NIC replaces plaintext with ciphertext and
>> fills the authentication tag. The NIC does not hold any state beyond
>> the context needed to encrypt the next expected packet,
>> i.e. expected TCP sequence number and crypto state.
>
> It seems like, since you do the TLS framing in TCP and the card is
> expecting to fill in certain aspects, there is a requirement that the
> packet contents aren't mangled between the TLS framing code and when
> the SKB hits the card.
>
> Is this right?
>
> For example, what happens if netfilter splits a TLS Tx offloaded frame
> into two TCP segments?

Furthermore, it doesn't seem to work with bonding or any other virtual
interface, which could move the skbs to be processed on another NIC, as
the context is put onto the NIC. Even a redirect cannot be processed
anymore (it seems those patches pin the connection to an interface
anyway).

Wouldn't it be possible to keep the state in software and push down a
security context per skb, which gets applied during sending? If that is
not possible in hardware, a slow path could encrypt the packet in
software.

Also, pinning connections to outgoing interfaces might work for TX, but
you can't force the interface where packets come in.

Bye,
Hannes

2017-03-30 06:49:52

by Boris Pismenny

[permalink] [raw]
Subject: RE: [RFC TLS Offload Support 00/15] cover letter

> >> TLS Tx crypto offload is a new feature of network devices. It enables
> >> the kernel TLS socket to skip encryption and authentication
> >> operations on the transmit side of the data path, delegating those to
> >> the NIC. In turn, the NIC encrypts packets that belong to an
> >> offloaded TLS socket on the fly. The NIC does not modify any packet
> >> headers. It expects to receive fully framed TCP packets with TLS
> >> records as payload. The NIC replaces plaintext with ciphertext and
> >> fills the authentication tag. The NIC does not hold any state beyond
> >> the context needed to encrypt the next expected packet, i.e. expected
> >> TCP sequence number and crypto state.
> >
> > It seems like, since you do the TLS framing in TCP and the card is
> > expecting to fill in certain aspects, there is a requirement that the
> > packet contents aren't mangled between the TLS framing code and when
> > the SKB hits the card.
> >
> > Is this right?
> >
> > For example, what happens if netfilter splits a TLS Tx offloaded frame
> > into two TCP segments?
We maintain the crypto context by tracking TCP sequence numbers, so
splitting TCP segments is not a problem. Even if reordering is introduced
anywhere between TCP and the driver, we can identify it from the TCP
sequence number and handle it gracefully; see mlx_tls_tx_handler and the
sketch below.
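
A hedged sketch of that check, modeled on mlx_tls_tx_handler from this
series; handle_ooo is the real function name, but the context layout here
is a simplified stand-in:

/* Simplified fast/slow path decision for the TX handler. */
#include <net/tcp.h>

struct tls_ctx_sketch {
	u32 expected_sn;	/* next TCP sequence the NIC can encrypt */
};

struct sk_buff *handle_ooo(struct tls_ctx_sketch *ctx, struct sk_buff *skb);

static struct sk_buff *tls_tx_seq_check(struct tls_ctx_sketch *ctx,
					struct sk_buff *skb)
{
	u32 skb_seq = ntohl(tcp_hdr(skb)->seq);
	int datalen = skb->len -
		      (skb_transport_offset(skb) + tcp_hdrlen(skb));

	if (unlikely(ctx->expected_sn != skb_seq)) {
		/* Slow path: retransmission or reordering. Resync the
		 * NIC crypto state with the TLS record data first.
		 */
		skb = handle_ooo(ctx, skb);
		if (!skb)
			return NULL;
	}

	/* Fast path: the NIC encrypts with its current state. */
	ctx->expected_sn = skb_seq + datalen;
	return skb;
}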

>
> Furthermore, it doesn't seem to work with bonding or any other virtual
> interface, which could move the skbs to be processed on another NIC, as
> the context is put onto the NIC. Even a redirect cannot be processed
> anymore (it seems those patches pin the connection to an interface anyway).
>
> Wouldn't it be possible to keep the state in software and push down a
> security context per skb, which gets applied during sending? If that is not
> possible in hardware, a slow path could encrypt the packet in software.
We do have all the state required to encrypt a TLS packet in software.
But pushing down the state for each skb is too expensive, because the
state depends on all the data in the TLS record; essentially it would
require resending the record up to that skb. This is what the
“handle_ooo” function in mlx_tls does for OOO packets.

Maybe we could use that functionality to handle bonding, but initially
it would be easier to prevent it.

The slow path you’ve mentioned is tricky, because you need to decide in
advance that a TLS record will use it: once the plaintext TLS record has
been pushed into TCP, it is difficult to fall back to software crypto.
>
> Also, pinning connections to outgoing interfaces might work for TX, but
> you can't force the interface where packets come in.
Right. This RFC handles only TX offload.
>
> Bye,
> Hannes

Thanks,
Boris