2009-04-13 11:23:00

by Bart Van Assche

Subject: dm-multipath and write request ordering

Hello,

Several people are using the dm-multipath software as follows:
* Linux server A is using dm-multipath to access data stored on
servers B and C via two iSCSI sessions -- one session between servers
A and B and one session between servers A and C.
* On servers B and C iSCSI target software exports a block device that
is replicated between servers B and C.

Round-robin load balancing will only work correctly in such a setup if
the replication software knows the order in which write requests have
been queued on the dm-multipath device. Since iSCSI uses the TCP/IP
protocol, write requests generated by server A can arrive out-of-order
on servers B and C. My questions are as follows:
- Is it correct that round-robin load balancing can only work
correctly in such a setup with proper support for write barriers in
the device mapper ?
- Did I understand it correctly that the current dm implementation
only supports barriers when remapping a single underlying device ?
- Are there any plans to add barrier support to dm-multipath ?

I obtained my information about the device mapper from the following sources:
[1] Jonathan Corbet, Multipath support in the device mapper, February
2005, http://lwn.net/Articles/124703/.
[2] Andi Kleen, [PATCH] Implement barrier support for single device DM
devices, February 2008, http://lkml.org/lkml/2008/2/15/125
[3] Stefan Bader, [2.6.23 PATCH 11/18] dm: disable barriers, July
2007, http://lkml.org/lkml/2007/7/11/432.
[4] dm source code,
http://git.kernel.org/?p=linux/kernel/git/stable/linux-2.6.29.y.git;a=tree;f=drivers/md;h=e0a3c2abb32bcec01c192217cda3f085e91217f3;hb=HEAD.

Bart.


2009-04-14 07:16:34

by Andi Kleen

Subject: Re: dm-multipath and write request ordering

Bart Van Assche <[email protected]> writes:

> - Is it correct that round-robin load balancing can only work
> correctly in such a setup with proper support for write barriers in
> the device mapper ?

No opinion on that one.

> - Did I understand it correctly that the current dm implementation
> only supports barriers when remapping a single underlying device ?

And also only for dm_linear, so probably it doesn't work on dm_mp (haven't
tested) even if it has only a single device.

Barriers over multiple devices are very difficult, at least as long as
the underlying protocol doesn't support shared barriers and the Linux
barrier concept isn't extended. The problem is that you cannot express
to the underlying device that some of its requests are dependent on
requests going to other devices.

> - Are there any plans to add barrier support to dm-multipath ?

[note I'm not very familiar with the finer details of Linux's dm_mp, so
some details here might be wrong]

I doubt it's really doable with RR. It might be possible if you
do the IO primarily on one device and only on error fall back to another
device. Then during the fallback you could just synchronize all
IO and solve the barrier problem this way and otherwise do barriers
on the single active device only. But with multiple devices you
would have the problem described above.
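
A toy sketch of the failover idea above, not dm_mp code. The class name
FailoverMultipath and its methods are hypothetical, and completion is
modeled as appending to a per-path log. It only illustrates that with a
single active path a barrier reduces to draining that path, and that a
path switch can replay outstanding IO in order before accepting new IO.

```python
class FailoverMultipath:
    """Toy model (hypothetical): one active path, drain on barrier and failover."""

    def __init__(self, n_paths):
        self.logs = [[] for _ in range(n_paths)]  # what each path delivered, in order
        self.active = 0
        self.pending = []                         # submitted but not yet completed

    def submit(self, request):
        self.pending.append(request)

    def drain(self):
        # Complete every pending request, in submission order, on the active path.
        self.logs[self.active].extend(self.pending)
        self.pending.clear()

    def barrier(self):
        # Only one path is active, so draining it orders everything before the barrier.
        self.drain()

    def fail_over(self):
        # Pending requests never completed on the failed path: switch first,
        # then replay them in order before any new IO is accepted.
        self.active = (self.active + 1) % len(self.logs)
        self.drain()


mp = FailoverMultipath(2)
mp.submit("w1")
mp.barrier()      # w1 completes on path 0
mp.submit("w2")
mp.fail_over()    # path 0 fails; w2 is replayed on path 1
mp.submit("w3")
mp.drain()
print(mp.logs)    # [['w1'], ['w2', 'w3']]
```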

-Andi

--
[email protected] -- Speaking for myself only.

2009-04-14 08:18:15

by Bart Van Assche

Subject: Re: dm-multipath and write request ordering

On Tue, Apr 14, 2009 at 9:16 AM, Andi Kleen <[email protected]> wrote:
> Bart Van Assche <[email protected]> writes:
>> - Is it correct that round-robin load balancing can only work
>> correctly in such a setup with proper support for write barriers in
>> the device mapper ?
>
> No opinion on that one.

What I'm worried about is that write requests that have been passed to
dm_mp could arrive in another order at the target because of the
variable transport time of protocols like TCP/IP. This could cause
newer data to be overwritten by older data. Separating those write
requests by a barrier won't help because dm_mp ignores barriers.
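
The scenario can be illustrated with a small simulation. This is only a
sketch under simplifying assumptions: the function final_contents and the
latency values are made up, and the replicated pair B/C is modeled as a
single store in which whichever write arrives last wins.

```python
import heapq


def final_contents(writes, path_latency):
    """Simulate round-robin dispatch of (issue_time, block, data) writes.

    path_latency[i] is the assumed one-way delay of path i.
    Returns the block contents after all writes have arrived.
    """
    arrivals = []
    for seq, (issue_time, block, data) in enumerate(writes):
        path = seq % len(path_latency)            # round-robin path selection
        arrival = issue_time + path_latency[path]
        heapq.heappush(arrivals, (arrival, seq, block, data))
    store = {}
    while arrivals:
        _, _, block, data = heapq.heappop(arrivals)
        store[block] = data                       # the later arrival wins
    return store


# Two writes to the same block, the second issued shortly after the first.
writes = [(0.0, 7, "old"), (0.1, 7, "new")]
print(final_contents(writes, [5.0, 1.0]))  # {7: 'old'}: the slow path delivers last
print(final_contents(writes, [1.0, 1.0]))  # {7: 'new'}: equal delays keep the order
```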

Bart.

2009-04-16 15:38:47

by Vladislav Bolkhovitin

Subject: Re: dm-multipath and write request ordering


Bart Van Assche, on 04/13/2009 03:22 PM wrote:
> Hello,
>
> Several people are using the dm-multipath software as follows:
> * Linux server A is using dm-multipath to access data stored on
> servers B and C via two iSCSI sessions -- one session between servers
> A and B and one session between servers A and C.
> * On servers B and C iSCSI target software exports a block device that
> is replicated between servers B and C.
>
> Round-robin load balancing will only work correctly in such a setup if
> the replication software knows the order in which write requests have
> been queued on the dm-multipath device. Since iSCSI uses the TCP/IP
> protocol, write requests generated by server A can arrive out-of-order
> on servers B and C. My questions are as follows:
> - Is it correct that round-robin load balancing can only work
> correctly in such a setup with proper support for write barriers in
> the device mapper ?

Not necessarily. If replication between B and C is done synchronously,
barriers are not needed. Barriers are necessary only for async commands,
when the next command is sent before the previous one has completed.
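
A minimal sketch of that distinction, under stated assumptions: the
function replay and the latencies are hypothetical, B and C are treated
as one synchronously replicated store, and a command counts as completed
at the moment it arrives at the target.

```python
def replay(writes, path_latency, wait_for_completion):
    """Dispatch (block, data) writes round-robin and return the final store.

    With wait_for_completion=True the next write is issued only after the
    previous one has completed; otherwise all writes are issued at once.
    """
    clock = 0.0
    arrivals = []
    for seq, (block, data) in enumerate(writes):
        path = seq % len(path_latency)
        arrival = clock + path_latency[path]
        arrivals.append((arrival, seq, block, data))
        if wait_for_completion:
            clock = arrival          # hold the next write until this one is done
    store = {}
    for _, _, block, data in sorted(arrivals):
        store[block] = data          # applied in arrival order
    return store


writes = [(7, "old"), (7, "new")]
print(replay(writes, [5.0, 1.0], wait_for_completion=True))   # {7: 'new'}
print(replay(writes, [5.0, 1.0], wait_for_completion=False))  # {7: 'old'}
```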

Vlad