2023-01-19 14:38:42

by Ming Lei

Subject: ublk-nbd: ublk-nbd is available

Hi,

ublk-nbd[1] is available now.

It is an nbd client implemented entirely in userspace. With the current
nbd-client in [2], by contrast, the transmission phase is done by the
Linux block nbd driver.

The handshake implementation is borrowed from the nbd project[2], so
ublk-nbd only adds new code for the transmission phase. It can be
thought of as moving the Linux block nbd driver into userspace.
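
For reference, each transmission-phase request on the wire starts with a
fixed 28-byte header in network byte order, as described in the protocol
document of the nbd project[2]. Below is a minimal sketch of encoding one.
The names (NbdRequestBytes, put_be, encode_request) are hypothetical and
this is not the code in nbd/tgt_nbd.cpp:

```cpp
#include <cstddef>
#include <cstdint>

// Constants from the NBD protocol document.
constexpr std::uint32_t kNbdRequestMagic = 0x25609513;
constexpr std::uint16_t kNbdCmdRead = 0;
constexpr std::uint16_t kNbdCmdWrite = 1;

// Hypothetical container for one encoded request header (28 bytes).
struct NbdRequestBytes {
    std::uint8_t b[28];
};

// Store `v` at offset `pos` in big-endian (network) order and return the
// offset just past the written bytes.
template <typename T>
constexpr std::size_t put_be(NbdRequestBytes& out, std::size_t pos, T v) {
    for (std::size_t i = 0; i < sizeof(T); ++i)
        out.b[pos + i] =
            static_cast<std::uint8_t>(v >> (8 * (sizeof(T) - 1 - i)));
    return pos + sizeof(T);
}

// Header layout: magic(4) flags(2) type(2) cookie(8) offset(8) length(4).
constexpr NbdRequestBytes encode_request(std::uint16_t type,
                                         std::uint64_t cookie,
                                         std::uint64_t offset,
                                         std::uint32_t length) {
    NbdRequestBytes out{};
    std::size_t pos = 0;
    pos = put_be(out, pos, kNbdRequestMagic);
    pos = put_be(out, pos, std::uint16_t{0});  // command flags, none set
    pos = put_be(out, pos, type);
    pos = put_be(out, pos, cookie);  // echoed back by the server in its reply
    pos = put_be(out, pos, offset);
    put_be(out, pos, length);
    return out;
}
```

For a write, the payload follows this header on the socket. The cookie is
what lets a client match a reply to the request it belongs to.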

The new code is mostly in nbd/tgt_nbd.cpp. IO handling is based on
liburing[3] and implemented with C++20 coroutines, so everything runs
in a single pthread and is completely lockless. It also turned out to be
pretty easy to design and implement, thanks to the ublk framework,
C++20 coroutines and liburing.

ublk-nbd supports both TCP and unix sockets, and io_uring send zero copy
can be enabled with the command line option '--send_zc'; see the README[4]
for details.

No regressions were found in xfstests when using ublk-nbd as both the
test device and the scratch device, and the builtin test (make test T=nbd)
runs well.

The fio test ("make test T=nbd") shows that ublk-nbd performance is
basically the same as nbd-client/nbd driver when running fio over a real
ethernet link (1g, 10+g). With 512K BS, however, ublk-nbd IOPS is ~40%
higher than nbd-client (nbd driver), because the Linux nbd driver sets
max_sectors_kb to 64KB by default.
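
One way to see why that limit matters at 512K BS: the block layer splits
any I/O larger than max_sectors_kb, so one large I/O reaches the driver as
several smaller requests. Below is a minimal sketch of the arithmetic.
requests_per_io is a hypothetical helper, and the sketch ignores merging
and alignment effects:

```cpp
#include <cstdint>

// Hypothetical helper: how many block-layer requests one I/O of `io_kb`
// kilobytes turns into when the queue limit is `max_sectors_kb` kilobytes.
// Rounds up, since a trailing partial chunk still needs its own request.
constexpr std::uint32_t requests_per_io(std::uint32_t io_kb,
                                        std::uint32_t max_sectors_kb) {
    return (io_kb + max_sectors_kb - 1) / max_sectors_kb;
}

// 512K block size against the 64KB limit quoted above: 8 requests.
constexpr std::uint32_t kSplitsAt64K = requests_per_io(512, 64);
// The same I/O against a limit as large as the I/O itself: 1 request.
constexpr std::uint32_t kSplitsAt512K = requests_per_io(512, 512);
```

This only counts requests. It does not by itself predict the ~40%
difference, which also depends on per-request overhead in each stack.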

When running fio over a local TCP socket, though, ublk-nbd performs
better than nbd-client/nbd driver on my test machine, especially with
2 queues/2 jobs; the gap is 10% ~ 30%, depending on the block size.

Any comments are welcome!

[1] https://github.com/ming1/ubdsrv/blob/master/nbd
[2] https://github.com/NetworkBlockDevice/nbd
[3] https://github.com/axboe/liburing
[4] https://github.com/ming1/ubdsrv/blob/master/nbd/README.rst

Thanks,
Ming


2023-01-19 19:07:06

by Jens Axboe

Subject: Re: ublk-nbd: ublk-nbd is available

On 1/19/23 7:23 AM, Ming Lei wrote:
> Hi,
>
> ublk-nbd[1] is available now.
>
> [...]

This is pretty nice! Just curious, have you tried setting up your
ring with

p.flags |= IORING_SETUP_SINGLE_ISSUER | IORING_SETUP_DEFER_TASKRUN;

and see if that yields any extra performance improvements for you?
Depending on how you do processing, you should not need to do any
further changes there.

A "lighter" version is just setting IORING_SETUP_COOP_TASKRUN.
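
The variants mentioned above can be sketched as follows, so that each one
can be benchmarked separately. This is only an illustration, not ublksrv
code: TaskrunMode, setup_flags and ring_setup are hypothetical names, and
the raw io_uring_setup(2) syscall is used to keep the sketch free of
library dependencies. With liburing, the same flags go into the
io_uring_params passed to io_uring_queue_init_params().

```cpp
#include <linux/io_uring.h>
#include <sys/syscall.h>
#include <unistd.h>

#include <cerrno>
#include <cstring>

// Fallback definitions for older kernel headers; the values are the ones
// used by the upstream uapi header.
#ifndef IORING_SETUP_COOP_TASKRUN
#define IORING_SETUP_COOP_TASKRUN (1U << 8)
#endif
#ifndef IORING_SETUP_SINGLE_ISSUER
#define IORING_SETUP_SINGLE_ISSUER (1U << 12)
#endif
#ifndef IORING_SETUP_DEFER_TASKRUN
#define IORING_SETUP_DEFER_TASKRUN (1U << 13)
#endif

// Hypothetical names for the three setups being compared.
enum class TaskrunMode { Default, Coop, Defer };

// Map a mode to its setup flags. DEFER_TASKRUN is only accepted by the
// kernel together with SINGLE_ISSUER, so the two are always set as a pair.
constexpr unsigned setup_flags(TaskrunMode mode) {
    return mode == TaskrunMode::Defer
               ? (IORING_SETUP_SINGLE_ISSUER | IORING_SETUP_DEFER_TASKRUN)
               : (mode == TaskrunMode::Coop ? IORING_SETUP_COOP_TASKRUN : 0U);
}

// Create a ring with the given mode. Returns the ring fd, or -errno on
// failure; the caller owns the fd and must close() it.
int ring_setup(unsigned entries, TaskrunMode mode) {
    struct io_uring_params p;
    std::memset(&p, 0, sizeof(p));
    p.flags |= setup_flags(mode);
    long fd = syscall(__NR_io_uring_setup, entries, &p);
    return fd < 0 ? -errno : static_cast<int>(fd);
}
```

Kernels that do not know a flag reject the setup with EINVAL, so a caller
would need to handle that case. With DEFER_TASKRUN, completion work runs
only when the submitting task enters the kernel to wait for events, which
is why the processing loop matters.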

--
Jens Axboe


2023-01-26 03:09:34

by Ming Lei

Subject: Re: ublk-nbd: ublk-nbd is available

Hi Jens,

On Thu, Jan 19, 2023 at 11:49:04AM -0700, Jens Axboe wrote:
> This is pretty nice! Just curious, have you tried setting up your
> ring with
>
> p.flags |= IORING_SETUP_SINGLE_ISSUER | IORING_SETUP_DEFER_TASKRUN;
>
> and see if that yields any extra performance improvements for you?
> Depending on how you do processing, you should not need to do any
> further changes there.
>
> A "lighter" version is just setting IORING_SETUP_COOP_TASKRUN.

IORING_SETUP_COOP_TASKRUN is enabled in current ublksrv.

After disabling COOP_TASKRUN and enabling SINGLE_ISSUER & DEFER_TASKRUN,
I don't see any obvious improvement; meanwhile, a regression is observed
on 64k rw.


Thanks,
Ming


2023-01-26 04:09:39

by Willy Tarreau

Subject: Re: ublk-nbd: ublk-nbd is available

Hi,

On Thu, Jan 26, 2023 at 11:08:26AM +0800, Ming Lei wrote:
> IORING_SETUP_COOP_TASKRUN is enabled in current ublksrv.
>
> After disabling COOP_TASKRUN and enabling SINGLE_ISSUER & DEFER_TASKRUN,
> not see obvious improvement, meantime regression is observed on 64k
> rw.

Does it handle network errors better than the default nbd client, i.e.
is it able to seamlessly reconnect afterwards while keeping the same
device, or do you end up with multiple devices? That's one big problem
I faced with the original nbd client: it forces you to unmount and
remount everything after a network outage, for example.

Thanks!
Willy

2023-01-26 11:43:01

by Ming Lei

Subject: Re: ublk-nbd: ublk-nbd is available

On Thu, Jan 26, 2023 at 05:08:22AM +0100, Willy Tarreau wrote:
> Does it handle network errors better than the default nbd client, i.e.
> is it able to seamlessly reconnect after while keeping the same device
> or do you end up with multiple devices ? That's one big trouble I faced
> with the original nbd client, forcing you to unmount and remount
> everything after a network outage for example.

All kinds of ublk disks support such seamless recovery, which is
provided by UBLK_CMD_START_USER_RECOVERY/UBLK_CMD_END_USER_RECOVERY.
During user recovery, the bdev and gendisk instances won't go away,
and they become fully functional again once the recovery (such as a
reconnect) succeeds.

So yes, seamless reconnect is handled.


Thanks,
Ming


2023-01-26 12:55:52

by Willy Tarreau

Subject: Re: ublk-nbd: ublk-nbd is available

On Thu, Jan 26, 2023 at 07:41:56PM +0800, Ming Lei wrote:
> All kinds of ublk disk supports such seamlessly recovery which is
> provided by UBLK_CMD_START_USER_RECOVERY/UBLK_CMD_END_USER_RECOVERY.
> During user recovery, the bdev and gendisk instance won't be gone,
> and will become fully functional after the recovery(such as reconnect)
> is successful.
>
> So yes for this seamlessly reconnect error handling.

Nice, it's tempting to give it a try then ;-)

Willy

2023-02-28 10:04:56

by Pavel Machek

Subject: Re: ublk-nbd: ublk-nbd is available

Hi!

> ublk-nbd[1] is available now.
>
> Basically it is one nbd client, but totally implemented in userspace,
> and wrt. current nbd-client in [2], the transmission phase is done
> by linux block nbd driver.

There is a reason nbd-client needs to be in the kernel, and that reason
is deadlocks in low-memory situations.

Best regards,
Pavel
--
People of Russia, stop Putin before his war on Ukraine escalates.



2023-03-02 03:12:29

by Ming Lei

Subject: Re: ublk-nbd: ublk-nbd is available

On Tue, Feb 28, 2023 at 11:04:41AM +0100, Pavel Machek wrote:
> Hi!
>
> > ublk-nbd[1] is available now.
> >
> > Basically it is one nbd client, but totally implemented in userspace,
> > and wrt. current nbd-client in [2], the transmission phase is done
> > by linux block nbd driver.
>
> There is reason nbd-client needs to be in kernel, and the reason is
> deadlocks during low memory situations.

Last time, the nbd memory deadlock was solved by the approach in [1],
which is used for ublk too.

Actually, ublk can be thought of as replacing the nbd socket communication
with the (much more lightweight & generic) uring_cmd and, for ublk-nbd,
moving the nbd socket communication into userspace. I don't see how this
could cause a memory deadlock.

Also, ublk has a built-in user recovery mechanism: killing the deadlocked
user daemon and recovering it can be the last resort, and the disk node
won't go away during the recovery.

So please provide some analysis or a reproducer; otherwise I may have
to ignore your comments.


[1] https://lore.kernel.org/linux-fsdevel/[email protected]/

Thanks,
Ming


2023-03-11 14:58:47

by Wouter Verhelst

Subject: Re: ublk-nbd: ublk-nbd is available

Hi,

On Thu, Jan 19, 2023 at 10:23:28PM +0800, Ming Lei wrote:
> The handshake implementation is borrowed from nbd project[2], so
> basically ublk-nbd just adds new code for implementing transmission
> phase, and it can be thought as moving linux block nbd driver into
> userspace.
[...]
> Any comments are welcome!

I see you copied nbd-client.c and modified it, but removed all the
author information from it (including mine).

Please don't do that. nbd-client is not public domain, it is GPLv2,
which means you need to keep copyright statements around somewhere. You
can move them into an AUTHORS file or some such if you prefer, but you
can't just remove them blindly.

Thanks.

--
w@uter.{be,co.za}
wouter@{grep.be,fosdem.org,debian.org}

I will have a Tin-Actinium-Potassium mixture, thanks.

2023-03-12 08:31:18

by Ming Lei

Subject: Re: ublk-nbd: ublk-nbd is available

On Sat, Mar 11, 2023 at 9:58 PM Wouter Verhelst <[email protected]> wrote:
>
> Hi,
>
> On Thu, Jan 19, 2023 at 10:23:28PM +0800, Ming Lei wrote:
> > The handshake implementation is borrowed from nbd project[2], so
> > basically ublk-nbd just adds new code for implementing transmission
> > phase, and it can be thought as moving linux block nbd driver into
> > userspace.
> [...]
> > Any comments are welcome!
>
> I see you copied nbd-client.c and modified it, but removed all the
> author information from it (including mine).
>
> Please don't do that. nbd-client is not public domain, it is GPLv2,
> which means you need to keep copyright statements around somewhere. You
> can move them into an AUTHORS file or some such if you prefer, but you
> can't just remove them blindly.

Thanks for finding this; it must have been an accident. I will add the
author info back soon.

thanks,