2023-10-10 14:19:56

by Sascha Hauer

[permalink] [raw]
Subject: Problem with io_uring splice and KTLS

Hi,

I am working with a webserver using io_uring in conjunction with KTLS. The
webserver basically splices static file data from a pipe to a socket which uses
KTLS for encryption. When splice is done the socket is closed. This works fine
when using software encryption in KTLS. Things go awry though when the software
encryption is replaced with the CAAM driver which replaces the synchronous
encryption with a asynchronous queue/interrupt/completion flow.

So far I have traced it down to tls_push_sg() calling tcp_sendmsg_locked() to
send the completed encrypted messages. tcp_sendmsg_locked() sometimes waits for
more memory on the socket by calling sk_stream_wait_memory(). This in turn
returns -ERESTARTSYS due to:

if (signal_pending(current))
goto do_interrupted;

The current task has the TIF_NOTIFY_SIGNAL set due to:

io_req_normal_work_add()
{
...
/* This interrupts sk_stream_wait_memory() (notify_method == TWA_SIGNAL) */
task_work_add(req->task, &tctx->task_work, ctx->notify_method)))
}

The call stack when sk_stream_wait_memory() fails is as follows:

[ 1385.428816] dump_backtrace+0xa0/0x128
[ 1385.432568] show_stack+0x20/0x38
[ 1385.435878] dump_stack_lvl+0x48/0x60
[ 1385.439539] dump_stack+0x18/0x28
[ 1385.442850] tls_push_sg+0x100/0x238
[ 1385.446424] tls_tx_records+0x118/0x1d8
[ 1385.450257] tls_sw_release_resources_tx+0x74/0x1a0
[ 1385.455135] tls_sk_proto_close+0x2f8/0x3f0
[ 1385.459315] inet_release+0x58/0xb8
[ 1385.462802] inet6_release+0x3c/0x60
[ 1385.466374] __sock_release+0x48/0xc8
[ 1385.470035] sock_close+0x20/0x38
[ 1385.473347] __fput+0xbc/0x280
[ 1385.476399] ____fput+0x18/0x30
[ 1385.479537] task_work_run+0x80/0xe0
[ 1385.483108] io_run_task_work+0x40/0x108
[ 1385.487029] __arm64_sys_io_uring_enter+0x164/0xad8
[ 1385.491907] invoke_syscall+0x50/0x128
[ 1385.495655] el0_svc_common.constprop.0+0x48/0xf0
[ 1385.500359] do_el0_svc_compat+0x24/0x40
[ 1385.504279] el0_svc_compat+0x38/0x108
[ 1385.508026] el0t_32_sync_handler+0x98/0x140
[ 1385.512294] el0t_32_sync+0x194/0x198

So the socket is being closed and KTLS tries to send out the remaining
completed messages. From a splice point of view everything has been sent
successfully, but not everything made it through KTLS to the socket and the
remaining data is sent while closing the socket.

I vaguely understand what's going on here, but I haven't got the slightest idea
what to do about this. Any ideas?

Sascha

--
Pengutronix e.K. | |
Steuerwalder Str. 21 | http://www.pengutronix.de/ |
31137 Hildesheim, Germany | Phone: +49-5121-206917-0 |
Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |


2023-10-10 14:29:04

by Jens Axboe

[permalink] [raw]
Subject: Re: Problem with io_uring splice and KTLS

On 10/10/23 8:19 AM, Sascha Hauer wrote:
> Hi,
>
> I am working with a webserver using io_uring in conjunction with KTLS. The
> webserver basically splices static file data from a pipe to a socket which uses
> KTLS for encryption. When splice is done the socket is closed. This works fine
> when using software encryption in KTLS. Things go awry though when the software
> encryption is replaced with the CAAM driver which replaces the synchronous
> encryption with a asynchronous queue/interrupt/completion flow.
>
> So far I have traced it down to tls_push_sg() calling tcp_sendmsg_locked() to
> send the completed encrypted messages. tcp_sendmsg_locked() sometimes waits for
> more memory on the socket by calling sk_stream_wait_memory(). This in turn
> returns -ERESTARTSYS due to:
>
> if (signal_pending(current))
> goto do_interrupted;
>
> The current task has the TIF_NOTIFY_SIGNAL set due to:
>
> io_req_normal_work_add()
> {
> ...
> /* This interrupts sk_stream_wait_memory() (notify_method == TWA_SIGNAL) */
> task_work_add(req->task, &tctx->task_work, ctx->notify_method)))
> }
>
> The call stack when sk_stream_wait_memory() fails is as follows:
>
> [ 1385.428816] dump_backtrace+0xa0/0x128
> [ 1385.432568] show_stack+0x20/0x38
> [ 1385.435878] dump_stack_lvl+0x48/0x60
> [ 1385.439539] dump_stack+0x18/0x28
> [ 1385.442850] tls_push_sg+0x100/0x238
> [ 1385.446424] tls_tx_records+0x118/0x1d8
> [ 1385.450257] tls_sw_release_resources_tx+0x74/0x1a0
> [ 1385.455135] tls_sk_proto_close+0x2f8/0x3f0
> [ 1385.459315] inet_release+0x58/0xb8
> [ 1385.462802] inet6_release+0x3c/0x60
> [ 1385.466374] __sock_release+0x48/0xc8
> [ 1385.470035] sock_close+0x20/0x38
> [ 1385.473347] __fput+0xbc/0x280
> [ 1385.476399] ____fput+0x18/0x30
> [ 1385.479537] task_work_run+0x80/0xe0
> [ 1385.483108] io_run_task_work+0x40/0x108
> [ 1385.487029] __arm64_sys_io_uring_enter+0x164/0xad8
> [ 1385.491907] invoke_syscall+0x50/0x128
> [ 1385.495655] el0_svc_common.constprop.0+0x48/0xf0
> [ 1385.500359] do_el0_svc_compat+0x24/0x40
> [ 1385.504279] el0_svc_compat+0x38/0x108
> [ 1385.508026] el0t_32_sync_handler+0x98/0x140
> [ 1385.512294] el0t_32_sync+0x194/0x198
>
> So the socket is being closed and KTLS tries to send out the remaining
> completed messages. From a splice point of view everything has been sent
> successfully, but not everything made it through KTLS to the socket and the
> remaining data is sent while closing the socket.
>
> I vaguely understand what's going on here, but I haven't got the
> slightest idea what to do about this. Any ideas?

Two things to try:

1) Depending on how you use the ring, set it up with
IORING_SETUP_SINGLE_ISSUER | IORING_SETUP_DEFER_TASKRUN. The latter will
avoid using signal based task_work notifications, which may be messing
you up here.

2) io_uring will hold a reference to the file/socket. I'm unsure if this
is a problem in the above case, but sometimes it'll prevent the final
flush.

Do you have a reproducer that could be run to test? Sometimes easier to
see what's going on when you can experiment, it'll save some time.

--
Jens Axboe

2023-10-11 11:22:42

by Sascha Hauer

[permalink] [raw]
Subject: Re: Problem with io_uring splice and KTLS

On Tue, Oct 10, 2023 at 08:28:13AM -0600, Jens Axboe wrote:
> On 10/10/23 8:19 AM, Sascha Hauer wrote:
> > Hi,
> >
> > I am working with a webserver using io_uring in conjunction with KTLS. The
> > webserver basically splices static file data from a pipe to a socket which uses
> > KTLS for encryption. When splice is done the socket is closed. This works fine
> > when using software encryption in KTLS. Things go awry though when the software
> > encryption is replaced with the CAAM driver which replaces the synchronous
> > encryption with a asynchronous queue/interrupt/completion flow.
> >
> > So far I have traced it down to tls_push_sg() calling tcp_sendmsg_locked() to
> > send the completed encrypted messages. tcp_sendmsg_locked() sometimes waits for
> > more memory on the socket by calling sk_stream_wait_memory(). This in turn
> > returns -ERESTARTSYS due to:
> >
> > if (signal_pending(current))
> > goto do_interrupted;
> >
> > The current task has the TIF_NOTIFY_SIGNAL set due to:
> >
> > io_req_normal_work_add()
> > {
> > ...
> > /* This interrupts sk_stream_wait_memory() (notify_method == TWA_SIGNAL) */
> > task_work_add(req->task, &tctx->task_work, ctx->notify_method)))
> > }
> >
> > The call stack when sk_stream_wait_memory() fails is as follows:
> >
> > [ 1385.428816] dump_backtrace+0xa0/0x128
> > [ 1385.432568] show_stack+0x20/0x38
> > [ 1385.435878] dump_stack_lvl+0x48/0x60
> > [ 1385.439539] dump_stack+0x18/0x28
> > [ 1385.442850] tls_push_sg+0x100/0x238
> > [ 1385.446424] tls_tx_records+0x118/0x1d8
> > [ 1385.450257] tls_sw_release_resources_tx+0x74/0x1a0
> > [ 1385.455135] tls_sk_proto_close+0x2f8/0x3f0
> > [ 1385.459315] inet_release+0x58/0xb8
> > [ 1385.462802] inet6_release+0x3c/0x60
> > [ 1385.466374] __sock_release+0x48/0xc8
> > [ 1385.470035] sock_close+0x20/0x38
> > [ 1385.473347] __fput+0xbc/0x280
> > [ 1385.476399] ____fput+0x18/0x30
> > [ 1385.479537] task_work_run+0x80/0xe0
> > [ 1385.483108] io_run_task_work+0x40/0x108
> > [ 1385.487029] __arm64_sys_io_uring_enter+0x164/0xad8
> > [ 1385.491907] invoke_syscall+0x50/0x128
> > [ 1385.495655] el0_svc_common.constprop.0+0x48/0xf0
> > [ 1385.500359] do_el0_svc_compat+0x24/0x40
> > [ 1385.504279] el0_svc_compat+0x38/0x108
> > [ 1385.508026] el0t_32_sync_handler+0x98/0x140
> > [ 1385.512294] el0t_32_sync+0x194/0x198
> >
> > So the socket is being closed and KTLS tries to send out the remaining
> > completed messages. From a splice point of view everything has been sent
> > successfully, but not everything made it through KTLS to the socket and the
> > remaining data is sent while closing the socket.
> >
> > I vaguely understand what's going on here, but I haven't got the
> > slightest idea what to do about this. Any ideas?
>
> Two things to try:
>
> 1) Depending on how you use the ring, set it up with
> IORING_SETUP_SINGLE_ISSUER | IORING_SETUP_DEFER_TASKRUN. The latter will
> avoid using signal based task_work notifications, which may be messing
> you up here.

These flags do not make a difference unfortunately.

>
> 2) io_uring will hold a reference to the file/socket. I'm unsure if this
> is a problem in the above case, but sometimes it'll prevent the final
> flush.

Not sure what you want me to test here.

FWIW I tried to do the close() outside of io_uring and just did a
regular close() in userspace. That didn't make a difference either.

>
> Do you have a reproducer that could be run to test? Sometimes easier to
> see what's going on when you can experiment, it'll save some time.

I would love to provide a reproducer, but you'll need a device with an
asynchronous encryption engine providing gcm(aes). I am using the CAAM
engine on a Layerscape board, but some i.MX6/8 based board should do
as well. I don't know what other hardware you might have which supports
that.

Sascha

--
Pengutronix e.K. | |
Steuerwalder Str. 21 | http://www.pengutronix.de/ |
31137 Hildesheim, Germany | Phone: +49-5121-206917-0 |
Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |

2023-10-12 13:34:42

by Sascha Hauer

[permalink] [raw]
Subject: Re: Problem with io_uring splice and KTLS

On Tue, Oct 10, 2023 at 08:28:13AM -0600, Jens Axboe wrote:
> On 10/10/23 8:19 AM, Sascha Hauer wrote:
> > Hi,
> >
> > I am working with a webserver using io_uring in conjunction with KTLS. The
> > webserver basically splices static file data from a pipe to a socket which uses
> > KTLS for encryption. When splice is done the socket is closed. This works fine
> > when using software encryption in KTLS. Things go awry though when the software
> > encryption is replaced with the CAAM driver which replaces the synchronous
> > encryption with a asynchronous queue/interrupt/completion flow.
> >
> > So far I have traced it down to tls_push_sg() calling tcp_sendmsg_locked() to
> > send the completed encrypted messages. tcp_sendmsg_locked() sometimes waits for
> > more memory on the socket by calling sk_stream_wait_memory(). This in turn
> > returns -ERESTARTSYS due to:
> >
> > if (signal_pending(current))
> > goto do_interrupted;
> >
> > The current task has the TIF_NOTIFY_SIGNAL set due to:
> >
> > io_req_normal_work_add()
> > {
> > ...
> > /* This interrupts sk_stream_wait_memory() (notify_method == TWA_SIGNAL) */
> > task_work_add(req->task, &tctx->task_work, ctx->notify_method)))
> > }
> >
> > The call stack when sk_stream_wait_memory() fails is as follows:
> >
> > [ 1385.428816] dump_backtrace+0xa0/0x128
> > [ 1385.432568] show_stack+0x20/0x38
> > [ 1385.435878] dump_stack_lvl+0x48/0x60
> > [ 1385.439539] dump_stack+0x18/0x28
> > [ 1385.442850] tls_push_sg+0x100/0x238
> > [ 1385.446424] tls_tx_records+0x118/0x1d8
> > [ 1385.450257] tls_sw_release_resources_tx+0x74/0x1a0
> > [ 1385.455135] tls_sk_proto_close+0x2f8/0x3f0
> > [ 1385.459315] inet_release+0x58/0xb8
> > [ 1385.462802] inet6_release+0x3c/0x60
> > [ 1385.466374] __sock_release+0x48/0xc8
> > [ 1385.470035] sock_close+0x20/0x38
> > [ 1385.473347] __fput+0xbc/0x280
> > [ 1385.476399] ____fput+0x18/0x30
> > [ 1385.479537] task_work_run+0x80/0xe0
> > [ 1385.483108] io_run_task_work+0x40/0x108
> > [ 1385.487029] __arm64_sys_io_uring_enter+0x164/0xad8
> > [ 1385.491907] invoke_syscall+0x50/0x128
> > [ 1385.495655] el0_svc_common.constprop.0+0x48/0xf0
> > [ 1385.500359] do_el0_svc_compat+0x24/0x40
> > [ 1385.504279] el0_svc_compat+0x38/0x108
> > [ 1385.508026] el0t_32_sync_handler+0x98/0x140
> > [ 1385.512294] el0t_32_sync+0x194/0x198
> >
> > So the socket is being closed and KTLS tries to send out the remaining
> > completed messages. From a splice point of view everything has been sent
> > successfully, but not everything made it through KTLS to the socket and the
> > remaining data is sent while closing the socket.
> >
> > I vaguely understand what's going on here, but I haven't got the
> > slightest idea what to do about this. Any ideas?
>
> Two things to try:
>
> 1) Depending on how you use the ring, set it up with
> IORING_SETUP_SINGLE_ISSUER | IORING_SETUP_DEFER_TASKRUN. The latter will
> avoid using signal based task_work notifications, which may be messing
> you up here.
>
> 2) io_uring will hold a reference to the file/socket. I'm unsure if this
> is a problem in the above case, but sometimes it'll prevent the final
> flush.
>
> Do you have a reproducer that could be run to test? Sometimes easier to
> see what's going on when you can experiment, it'll save some time.

Okay, here is a reproducer:

https://github.com/saschahauer/webserver-uring-test.git

Execute ./prepare.sh in that repository, it will compile the webserver,
generate cert.pem/key.pem and generate some testfile to download. If the
meson build doesn't work for you then you can compile the program by
hand with something like:

gcc -O3 -Wall -o webserver webserver_liburing.c -lcrypto -lssl -luring

When the webserver is started you can get a file from it with:

curl -k https://<ipaddr>:8443/foo -o foo

or:

while true; do curl -k https://<ipaddr>:8443/foo -o foo; if [ $? != 0 ]; then break; fi; done

This should run without problems as by default likely the encryption
requests are running synchronously.

In case you don't have encryption hardware you can create an
asynchronous encryption module using cryptd. Compile a kernel with
CONFIG_CRYPTO_USER_API_AEAD and CONFIG_CRYPTO_CRYPTD and start the
webserver with the '-c' option. /proc/crypto should then contain an
entry with:

name : gcm(aes)
driver : cryptd(gcm_base(ctr(aes-generic),ghash-generic))
module : kernel
priority : 150

Make sure there is no other module providing gcm(aes) with a priority higher
than 150 so that this one is actually used.

With that the while true loop above should break out with a short read
fairly fast. Passing IORING_SETUP_SINGLE_ISSUER | IORING_SETUP_DEFER_TASKRUN
to io_uring_queue_init() makes it harder to reproduce for me. With that
I need multiple shells in parallel running the above loop.

The repository also contains a kernel patch which will provide you a
stack dump when KTLS gets an error from tcp_sendmsg_locked().

Now I hope I haven't done anything silly in the webserver ;)

Sascha


--
Pengutronix e.K. | |
Steuerwalder Str. 21 | http://www.pengutronix.de/ |
31137 Hildesheim, Germany | Phone: +49-5121-206917-0 |
Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |

2023-10-13 01:13:30

by Jens Axboe

[permalink] [raw]
Subject: Re: Problem with io_uring splice and KTLS

On 10/12/23 7:34 AM, Sascha Hauer wrote:
> On Tue, Oct 10, 2023 at 08:28:13AM -0600, Jens Axboe wrote:
>> On 10/10/23 8:19 AM, Sascha Hauer wrote:
>>> Hi,
>>>
>>> I am working with a webserver using io_uring in conjunction with KTLS. The
>>> webserver basically splices static file data from a pipe to a socket which uses
>>> KTLS for encryption. When splice is done the socket is closed. This works fine
>>> when using software encryption in KTLS. Things go awry though when the software
>>> encryption is replaced with the CAAM driver which replaces the synchronous
>>> encryption with a asynchronous queue/interrupt/completion flow.
>>>
>>> So far I have traced it down to tls_push_sg() calling tcp_sendmsg_locked() to
>>> send the completed encrypted messages. tcp_sendmsg_locked() sometimes waits for
>>> more memory on the socket by calling sk_stream_wait_memory(). This in turn
>>> returns -ERESTARTSYS due to:
>>>
>>> if (signal_pending(current))
>>> goto do_interrupted;
>>>
>>> The current task has the TIF_NOTIFY_SIGNAL set due to:
>>>
>>> io_req_normal_work_add()
>>> {
>>> ...
>>> /* This interrupts sk_stream_wait_memory() (notify_method == TWA_SIGNAL) */
>>> task_work_add(req->task, &tctx->task_work, ctx->notify_method)))
>>> }
>>>
>>> The call stack when sk_stream_wait_memory() fails is as follows:
>>>
>>> [ 1385.428816] dump_backtrace+0xa0/0x128
>>> [ 1385.432568] show_stack+0x20/0x38
>>> [ 1385.435878] dump_stack_lvl+0x48/0x60
>>> [ 1385.439539] dump_stack+0x18/0x28
>>> [ 1385.442850] tls_push_sg+0x100/0x238
>>> [ 1385.446424] tls_tx_records+0x118/0x1d8
>>> [ 1385.450257] tls_sw_release_resources_tx+0x74/0x1a0
>>> [ 1385.455135] tls_sk_proto_close+0x2f8/0x3f0
>>> [ 1385.459315] inet_release+0x58/0xb8
>>> [ 1385.462802] inet6_release+0x3c/0x60
>>> [ 1385.466374] __sock_release+0x48/0xc8
>>> [ 1385.470035] sock_close+0x20/0x38
>>> [ 1385.473347] __fput+0xbc/0x280
>>> [ 1385.476399] ____fput+0x18/0x30
>>> [ 1385.479537] task_work_run+0x80/0xe0
>>> [ 1385.483108] io_run_task_work+0x40/0x108
>>> [ 1385.487029] __arm64_sys_io_uring_enter+0x164/0xad8
>>> [ 1385.491907] invoke_syscall+0x50/0x128
>>> [ 1385.495655] el0_svc_common.constprop.0+0x48/0xf0
>>> [ 1385.500359] do_el0_svc_compat+0x24/0x40
>>> [ 1385.504279] el0_svc_compat+0x38/0x108
>>> [ 1385.508026] el0t_32_sync_handler+0x98/0x140
>>> [ 1385.512294] el0t_32_sync+0x194/0x198
>>>
>>> So the socket is being closed and KTLS tries to send out the remaining
>>> completed messages. From a splice point of view everything has been sent
>>> successfully, but not everything made it through KTLS to the socket and the
>>> remaining data is sent while closing the socket.
>>>
>>> I vaguely understand what's going on here, but I haven't got the
>>> slightest idea what to do about this. Any ideas?
>>
>> Two things to try:
>>
>> 1) Depending on how you use the ring, set it up with
>> IORING_SETUP_SINGLE_ISSUER | IORING_SETUP_DEFER_TASKRUN. The latter will
>> avoid using signal based task_work notifications, which may be messing
>> you up here.
>>
>> 2) io_uring will hold a reference to the file/socket. I'm unsure if this
>> is a problem in the above case, but sometimes it'll prevent the final
>> flush.
>>
>> Do you have a reproducer that could be run to test? Sometimes easier to
>> see what's going on when you can experiment, it'll save some time.
>
> Okay, here is a reproducer:
>
> https://github.com/saschahauer/webserver-uring-test.git
>
> Execute ./prepare.sh in that repository, it will compile the webserver,
> generate cert.pem/key.pem and generate some testfile to download. If the
> meson build doesn't work for you then you can compile the program by
> hand with something like:
>
> gcc -O3 -Wall -o webserver webserver_liburing.c -lcrypto -lssl -luring
>
> When the webserver is started you can get a file from it with:
>
> curl -k https://<ipaddr>:8443/foo -o foo
>
> or:
>
> while true; do curl -k https://<ipaddr>:8443/foo -o foo; if [ $? != 0 ]; then break; fi; done
>
> This should run without problems as by default likely the encryption
> requests are running synchronously.
>
> In case you don't have encryption hardware you can create an
> asynchronous encryption module using cryptd. Compile a kernel with
> CONFIG_CRYPTO_USER_API_AEAD and CONFIG_CRYPTO_CRYPTD and start the
> webserver with the '-c' option. /proc/crypto should then contain an
> entry with:
>
> name : gcm(aes)
> driver : cryptd(gcm_base(ctr(aes-generic),ghash-generic))
> module : kernel
> priority : 150
>
> Make sure there is no other module providing gcm(aes) with a priority higher
> than 150 so that this one is actually used.
>
> With that the while true loop above should break out with a short read
> fairly fast. Passing IORING_SETUP_SINGLE_ISSUER | IORING_SETUP_DEFER_TASKRUN
> to io_uring_queue_init() makes it harder to reproduce for me. With that
> I need multiple shells in parallel running the above loop.
>
> The repository also contains a kernel patch which will provide you a
> stack dump when KTLS gets an error from tcp_sendmsg_locked().
>
> Now I hope I haven't done anything silly in the webserver ;)

Perfect! Thanks a lot for preparing all of that. Not sure I'll get to it
tomorrow, but if not, then definitely on Monday.

--
Jens Axboe

2023-10-13 01:45:57

by Jens Axboe

[permalink] [raw]
Subject: Re: Problem with io_uring splice and KTLS

On 10/12/23 7:34 AM, Sascha Hauer wrote:
> In case you don't have encryption hardware you can create an
> asynchronous encryption module using cryptd. Compile a kernel with
> CONFIG_CRYPTO_USER_API_AEAD and CONFIG_CRYPTO_CRYPTD and start the
> webserver with the '-c' option. /proc/crypto should then contain an
> entry with:
>
> name : gcm(aes)
> driver : cryptd(gcm_base(ctr(aes-generic),ghash-generic))
> module : kernel
> priority : 150

I did a bit of prep work to ensure I had everything working for when
there's time to dive into it, but starting it with -c doesn't register
this entry. Turns out the bind() in there returns -1/ENOENT. For the
life of me I can't figure out what I'm missing. I tried this with both
arm64 and x86-64. On the latter there's some native AES that is higher
priority, but I added a small hack in cryptd to ensure it's the highest
one. But I don't even get that far...

--
Jens Axboe

2023-10-13 05:47:43

by Sascha Hauer

[permalink] [raw]
Subject: Re: Problem with io_uring splice and KTLS

On Thu, Oct 12, 2023 at 07:45:07PM -0600, Jens Axboe wrote:
> On 10/12/23 7:34 AM, Sascha Hauer wrote:
> > In case you don't have encryption hardware you can create an
> > asynchronous encryption module using cryptd. Compile a kernel with
> > CONFIG_CRYPTO_USER_API_AEAD and CONFIG_CRYPTO_CRYPTD and start the
> > webserver with the '-c' option. /proc/crypto should then contain an
> > entry with:
> >
> > name : gcm(aes)
> > driver : cryptd(gcm_base(ctr(aes-generic),ghash-generic))
> > module : kernel
> > priority : 150
>
> I did a bit of prep work to ensure I had everything working for when
> there's time to dive into it, but starting it with -c doesn't register
> this entry. Turns out the bind() in there returns -1/ENOENT.

Yes, that happens here as well, that's why I don't check for the error
in the bind call. Nevertheless it has the desired effect that the new
algorithm is registered and used from there on. BTW you only need to
start the webserver once with -c. If you start it repeatedly with -c a
new gcm(aes) instance is registered each time.

I think what I am doing here is not the intended use case of cryptd and
only works by accident.

Sascha

--
Pengutronix e.K. | |
Steuerwalder Str. 21 | http://www.pengutronix.de/ |
31137 Hildesheim, Germany | Phone: +49-5121-206917-0 |
Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |

2023-10-13 13:46:15

by Jens Axboe

[permalink] [raw]
Subject: Re: Problem with io_uring splice and KTLS

On 10/12/23 11:47 PM, Sascha Hauer wrote:
> On Thu, Oct 12, 2023 at 07:45:07PM -0600, Jens Axboe wrote:
>> On 10/12/23 7:34 AM, Sascha Hauer wrote:
>>> In case you don't have encryption hardware you can create an
>>> asynchronous encryption module using cryptd. Compile a kernel with
>>> CONFIG_CRYPTO_USER_API_AEAD and CONFIG_CRYPTO_CRYPTD and start the
>>> webserver with the '-c' option. /proc/crypto should then contain an
>>> entry with:
>>>
>>> name : gcm(aes)
>>> driver : cryptd(gcm_base(ctr(aes-generic),ghash-generic))
>>> module : kernel
>>> priority : 150
>>
>> I did a bit of prep work to ensure I had everything working for when
>> there's time to dive into it, but starting it with -c doesn't register
>> this entry. Turns out the bind() in there returns -1/ENOENT.
>
> Yes, that happens here as well, that's why I don't check for the error
> in the bind call. Nevertheless it has the desired effect that the new
> algorithm is registered and used from there on. BTW you only need to
> start the webserver once with -c. If you start it repeatedly with -c a
> new gcm(aes) instance is registered each time.

Gotcha - I wasn't able to trigger the condition, which is why I thought
perhaps I was missing something.

Can you try the below patch and see if that makes a difference? I'm not
quite sure why it would since you said it triggers with DEFER_TASKRUN as
well, and for that kind of notification, you should never hit the paths
you have detailed in the debug patch.

diff --git a/net/core/stream.c b/net/core/stream.c
index f5c4e47df165..a9a196587254 100644
--- a/net/core/stream.c
+++ b/net/core/stream.c
@@ -67,7 +67,7 @@ int sk_stream_wait_connect(struct sock *sk, long *timeo_p)
return -EPIPE;
if (!*timeo_p)
return -EAGAIN;
- if (signal_pending(tsk))
+ if (task_sigpending(tsk))
return sock_intr_errno(*timeo_p);

add_wait_queue(sk_sleep(sk), &wait);
@@ -103,7 +103,7 @@ void sk_stream_wait_close(struct sock *sk, long timeout)
do {
if (sk_wait_event(sk, &timeout, !sk_stream_closing(sk), &wait))
break;
- } while (!signal_pending(current) && timeout);
+ } while (!task_sigpending(current) && timeout);

remove_wait_queue(sk_sleep(sk), &wait);
}
@@ -134,7 +134,7 @@ int sk_stream_wait_memory(struct sock *sk, long *timeo_p)
goto do_error;
if (!*timeo_p)
goto do_eagain;
- if (signal_pending(current))
+ if (task_sigpending(current))
goto do_interrupted;
sk_clear_bit(SOCKWQ_ASYNC_NOSPACE, sk);
if (sk_stream_memory_free(sk) && !vm_wait)

--
Jens Axboe

2023-10-16 07:27:01

by Sascha Hauer

[permalink] [raw]
Subject: Re: Problem with io_uring splice and KTLS

On Fri, Oct 13, 2023 at 07:45:55AM -0600, Jens Axboe wrote:
> On 10/12/23 11:47 PM, Sascha Hauer wrote:
> > On Thu, Oct 12, 2023 at 07:45:07PM -0600, Jens Axboe wrote:
> >> On 10/12/23 7:34 AM, Sascha Hauer wrote:
> >>> In case you don't have encryption hardware you can create an
> >>> asynchronous encryption module using cryptd. Compile a kernel with
> >>> CONFIG_CRYPTO_USER_API_AEAD and CONFIG_CRYPTO_CRYPTD and start the
> >>> webserver with the '-c' option. /proc/crypto should then contain an
> >>> entry with:
> >>>
> >>> name : gcm(aes)
> >>> driver : cryptd(gcm_base(ctr(aes-generic),ghash-generic))
> >>> module : kernel
> >>> priority : 150
> >>
> >> I did a bit of prep work to ensure I had everything working for when
> >> there's time to dive into it, but starting it with -c doesn't register
> >> this entry. Turns out the bind() in there returns -1/ENOENT.
> >
> > Yes, that happens here as well, that's why I don't check for the error
> > in the bind call. Nevertheless it has the desired effect that the new
> > algorithm is registered and used from there on. BTW you only need to
> > start the webserver once with -c. If you start it repeatedly with -c a
> > new gcm(aes) instance is registered each time.
>
> Gotcha - I wasn't able to trigger the condition, which is why I thought
> perhaps I was missing something.
>
> Can you try the below patch and see if that makes a difference? I'm not
> quite sure why it would since you said it triggers with DEFER_TASKRUN as
> well, and for that kind of notification, you should never hit the paths
> you have detailed in the debug patch.

I can confirm that this patch makes it work for me. I tested with both
software cryptd and also with my original CAAM encryption workload.
IORING_SETUP_SINGLE_ISSUER | IORING_SETUP_DEFER_TASKRUN is not needed.
Both my simple webserver and the original C++ Webserver from our
customer are now working without problems.

Do you think there is a chance getting this change upstream? I'm a bit
afraid the code originally uses signal_pending() instead of
task_sigpending() for a good reason.

Sascha

>
> diff --git a/net/core/stream.c b/net/core/stream.c
> index f5c4e47df165..a9a196587254 100644
> --- a/net/core/stream.c
> +++ b/net/core/stream.c
> @@ -67,7 +67,7 @@ int sk_stream_wait_connect(struct sock *sk, long *timeo_p)
> return -EPIPE;
> if (!*timeo_p)
> return -EAGAIN;
> - if (signal_pending(tsk))
> + if (task_sigpending(tsk))
> return sock_intr_errno(*timeo_p);
>
> add_wait_queue(sk_sleep(sk), &wait);
> @@ -103,7 +103,7 @@ void sk_stream_wait_close(struct sock *sk, long timeout)
> do {
> if (sk_wait_event(sk, &timeout, !sk_stream_closing(sk), &wait))
> break;
> - } while (!signal_pending(current) && timeout);
> + } while (!task_sigpending(current) && timeout);
>
> remove_wait_queue(sk_sleep(sk), &wait);
> }
> @@ -134,7 +134,7 @@ int sk_stream_wait_memory(struct sock *sk, long *timeo_p)
> goto do_error;
> if (!*timeo_p)
> goto do_eagain;
> - if (signal_pending(current))
> + if (task_sigpending(current))
> goto do_interrupted;
> sk_clear_bit(SOCKWQ_ASYNC_NOSPACE, sk);
> if (sk_stream_memory_free(sk) && !vm_wait)
>
> --
> Jens Axboe
>
>

--
Pengutronix e.K. | |
Steuerwalder Str. 21 | http://www.pengutronix.de/ |
31137 Hildesheim, Germany | Phone: +49-5121-206917-0 |
Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |

2023-10-16 13:17:41

by Jens Axboe

[permalink] [raw]
Subject: Re: Problem with io_uring splice and KTLS

On 10/16/23 1:26 AM, Sascha Hauer wrote:
> On Fri, Oct 13, 2023 at 07:45:55AM -0600, Jens Axboe wrote:
>> On 10/12/23 11:47 PM, Sascha Hauer wrote:
>>> On Thu, Oct 12, 2023 at 07:45:07PM -0600, Jens Axboe wrote:
>>>> On 10/12/23 7:34 AM, Sascha Hauer wrote:
>>>>> In case you don't have encryption hardware you can create an
>>>>> asynchronous encryption module using cryptd. Compile a kernel with
>>>>> CONFIG_CRYPTO_USER_API_AEAD and CONFIG_CRYPTO_CRYPTD and start the
>>>>> webserver with the '-c' option. /proc/crypto should then contain an
>>>>> entry with:
>>>>>
>>>>> name : gcm(aes)
>>>>> driver : cryptd(gcm_base(ctr(aes-generic),ghash-generic))
>>>>> module : kernel
>>>>> priority : 150
>>>>
>>>> I did a bit of prep work to ensure I had everything working for when
>>>> there's time to dive into it, but starting it with -c doesn't register
>>>> this entry. Turns out the bind() in there returns -1/ENOENT.
>>>
>>> Yes, that happens here as well, that's why I don't check for the error
>>> in the bind call. Nevertheless it has the desired effect that the new
>>> algorithm is registered and used from there on. BTW you only need to
>>> start the webserver once with -c. If you start it repeatedly with -c a
>>> new gcm(aes) instance is registered each time.
>>
>> Gotcha - I wasn't able to trigger the condition, which is why I thought
>> perhaps I was missing something.
>>
>> Can you try the below patch and see if that makes a difference? I'm not
>> quite sure why it would since you said it triggers with DEFER_TASKRUN as
>> well, and for that kind of notification, you should never hit the paths
>> you have detailed in the debug patch.
>
> I can confirm that this patch makes it work for me. I tested with both
> software cryptd and also with my original CAAM encryption workload.
> IORING_SETUP_SINGLE_ISSUER | IORING_SETUP_DEFER_TASKRUN is not needed.
> Both my simple webserver and the original C++ Webserver from our
> customer are now working without problems.

OK, good to hear. I'm assuming you only change for
sk_stream_wait_memory()? If you can reproduce, would be good to test.
But i general none of them should hurt.

FWIW, the reason why DEFER_TASKRUN wasn't fully solving it is because
we'd also use TIF_NOTIFY_SIGNAL for creating new io-wq workers. So while
task_work would not be the trigger for setting that condition, we'd
still end up doing it via io-wq worker creation.

> Do you think there is a chance getting this change upstream? I'm a bit
> afraid the code originally uses signal_pending() instead of
> task_sigpending() for a good reason.

The distinction between signal_pending() and task_sigpending() was
introduced with TIF_NOTIFY_SIGNAL. This isn't a case of networking
needing to use signal_pending(), just that this is was originally the
only aborting condition and now it's a bit too broad for some cases
(like this one).

--
Jens Axboe

2023-10-17 11:52:09

by Sascha Hauer

[permalink] [raw]
Subject: Re: Problem with io_uring splice and KTLS

On Mon, Oct 16, 2023 at 07:17:23AM -0600, Jens Axboe wrote:
> On 10/16/23 1:26 AM, Sascha Hauer wrote:
> > On Fri, Oct 13, 2023 at 07:45:55AM -0600, Jens Axboe wrote:
> >> On 10/12/23 11:47 PM, Sascha Hauer wrote:
> >>> On Thu, Oct 12, 2023 at 07:45:07PM -0600, Jens Axboe wrote:
> >>>> On 10/12/23 7:34 AM, Sascha Hauer wrote:
> >>>>> In case you don't have encryption hardware you can create an
> >>>>> asynchronous encryption module using cryptd. Compile a kernel with
> >>>>> CONFIG_CRYPTO_USER_API_AEAD and CONFIG_CRYPTO_CRYPTD and start the
> >>>>> webserver with the '-c' option. /proc/crypto should then contain an
> >>>>> entry with:
> >>>>>
> >>>>> name : gcm(aes)
> >>>>> driver : cryptd(gcm_base(ctr(aes-generic),ghash-generic))
> >>>>> module : kernel
> >>>>> priority : 150
> >>>>
> >>>> I did a bit of prep work to ensure I had everything working for when
> >>>> there's time to dive into it, but starting it with -c doesn't register
> >>>> this entry. Turns out the bind() in there returns -1/ENOENT.
> >>>
> >>> Yes, that happens here as well, that's why I don't check for the error
> >>> in the bind call. Nevertheless it has the desired effect that the new
> >>> algorithm is registered and used from there on. BTW you only need to
> >>> start the webserver once with -c. If you start it repeatedly with -c a
> >>> new gcm(aes) instance is registered each time.
> >>
> >> Gotcha - I wasn't able to trigger the condition, which is why I thought
> >> perhaps I was missing something.
> >>
> >> Can you try the below patch and see if that makes a difference? I'm not
> >> quite sure why it would since you said it triggers with DEFER_TASKRUN as
> >> well, and for that kind of notification, you should never hit the paths
> >> you have detailed in the debug patch.
> >
> > I can confirm that this patch makes it work for me. I tested with both
> > software cryptd and also with my original CAAM encryption workload.
> > IORING_SETUP_SINGLE_ISSUER | IORING_SETUP_DEFER_TASKRUN is not needed.
> > Both my simple webserver and the original C++ Webserver from our
> > customer are now working without problems.
>
> OK, good to hear. I'm assuming you only change for
> sk_stream_wait_memory()? If you can reproduce, would be good to test.
> But i general none of them should hurt.

Yes, only the change in sk_stream_wait_memory() is needed for me. The
other two hunks do not change anything for me.

>
> FWIW, the reason why DEFER_TASKRUN wasn't fully solving it is because
> we'd also use TIF_NOTIFY_SIGNAL for creating new io-wq workers. So while
> task_work would not be the trigger for setting that condition, we'd
> still end up doing it via io-wq worker creation.
>
> > Do you think there is a chance getting this change upstream? I'm a bit
> > afraid the code originally uses signal_pending() instead of
> > task_sigpending() for a good reason.
>
> The distinction between signal_pending() and task_sigpending() was
> introduced with TIF_NOTIFY_SIGNAL. This isn't a case of networking
> needing to use signal_pending(), just that this is was originally the
> only aborting condition and now it's a bit too broad for some cases
> (like this one).

Ok. I didn't realize so far that it was you who TIF_NOTIFY_SIGNAL.

Sascha

--
Pengutronix e.K. | |
Steuerwalder Str. 21 | http://www.pengutronix.de/ |
31137 Hildesheim, Germany | Phone: +49-5121-206917-0 |
Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |