Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1754014AbdLMV4M (ORCPT ); Wed, 13 Dec 2017 16:56:12 -0500
Received: from mga06.intel.com ([134.134.136.31]:12299 "EHLO mga06.intel.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1753980AbdLMV4K (ORCPT ); Wed, 13 Dec 2017 16:56:10 -0500
X-Amp-Result: SKIPPED(no attachment in message)
X-Amp-File-Uploaded: False
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.45,398,1508828400"; d="scan'208";a="1382043"
From: Scott Bauer
To: dm-devel@redhat.com
Cc: snitzer@redhat.com, agk@redhat.com, linux-kernel@vger.kernel.org,
	keith.busch@intel.com, jonathan.derrick@intel.com, Scott Bauer
Subject: [PATCH v3 2/2] dm unstripe: Add documentation for unstripe target
Date: Wed, 13 Dec 2017 14:33:32 -0700
Message-Id: <20171213213332.2914-3-scott.bauer@intel.com>
X-Mailer: git-send-email 2.11.0
In-Reply-To: <20171213213332.2914-1-scott.bauer@intel.com>
References: <20171213213332.2914-1-scott.bauer@intel.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 4834
Lines: 144

Signed-off-by: Scott Bauer
---
 Documentation/device-mapper/dm-unstripe.txt | 130 ++++++++++++++++++++++++++++
 1 file changed, 130 insertions(+)
 create mode 100644 Documentation/device-mapper/dm-unstripe.txt

diff --git a/Documentation/device-mapper/dm-unstripe.txt b/Documentation/device-mapper/dm-unstripe.txt
new file mode 100644
index 000000000000..01d7194b9075
--- /dev/null
+++ b/Documentation/device-mapper/dm-unstripe.txt
@@ -0,0 +1,130 @@
+Device-Mapper Unstripe
+======================
+
+The device-mapper unstripe (dm-unstripe) target provides a transparent
+mechanism to unstripe a device-mapper "striped" target to access the
+underlying disks without having to touch the true backing block device.
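Conceptually, unstriping is pure arithmetic: with N drives and a chunk size of C sectors, sector s of drive i's unstriped view lives at sector ((s / C) * N + i) * C + (s % C) of the striped device. The sketch below is illustrative only (it is not the kernel implementation; the variable and function names are made up for this example):

```shell
#!/bin/bash
# Illustrative sketch of the unstripe sector mapping -- not kernel code.
# NUM drives in the stripe, CHUNK sectors per chunk, DRIVE is the
# 0-indexed drive whose unstriped view we expose.
NUM=4
CHUNK=256
DRIVE=1

# Map sector $1 of the unstriped view onto the underlying striped device.
map_sector() {
    local s=$1
    echo $(( ( (s / CHUNK) * NUM + DRIVE ) * CHUNK + s % CHUNK ))
}

# Drive 1's first chunk sits one chunk into the stripe; its second chunk
# skips past the other drives' chunks in stripe order.
map_sector 0      # -> 256
map_sector 256    # -> 1280
```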
+It can also be used to unstripe a hardware RAID-0, exposing its backing
+disks.
+
+
+Parameters:
+<device path> <drive #> <# of drives> <chunk size>
+
+<device path>
+	The block device you wish to unstripe.
+
+<drive #>
+	The physical drive you wish to expose via this "virtual" device-
+	mapper target. This must be 0-indexed.
+
+<# of drives>
+	The number of drives in the RAID 0.
+
+<chunk size>
+	The number of 512B sectors in the chunk striping, or zero if you
+	wish to use max_hw_sector_size.
+
+
+Why use this module?
+====================
+
+ An example of undoing an existing dm-stripe:
+
+ This small bash script will set up 4 loop devices and use the existing
+ dm-stripe target to combine the 4 devices into one. It will then use
+ the unstripe target on the new combined stripe device to access the
+ individual backing loop devices. We write data to the newly exposed
+ unstriped devices and verify that the data written matches the correct
+ underlying device on the striped array.
+
+ #!/bin/bash
+
+ MEMBER_SIZE=$((128 * 1024 * 1024))
+ NUM=4
+ SEQ_END=$((${NUM}-1))
+ CHUNK=256
+ BS=4096
+
+ RAID_SIZE=$((${MEMBER_SIZE}*${NUM}/512))
+ DM_PARMS="0 ${RAID_SIZE} striped ${NUM} ${CHUNK}"
+ COUNT=$((${MEMBER_SIZE} / ${BS}))
+
+ for i in $(seq 0 ${SEQ_END}); do
+   dd if=/dev/zero of=member-${i} bs=${MEMBER_SIZE} count=1 oflag=direct
+   losetup /dev/loop${i} member-${i}
+   DM_PARMS+=" /dev/loop${i} 0"
+ done
+
+ echo $DM_PARMS | dmsetup create raid0
+ for i in $(seq 0 ${SEQ_END}); do
+   echo "0 1 unstripe /dev/mapper/raid0 ${i} ${NUM} ${CHUNK}" | dmsetup create set-${i}
+ done
+
+ for i in $(seq 0 ${SEQ_END}); do
+   dd if=/dev/urandom of=/dev/mapper/set-${i} bs=${BS} count=${COUNT} oflag=direct
+   diff /dev/mapper/set-${i} member-${i}
+ done
+
+ for i in $(seq 0 ${SEQ_END}); do
+   dmsetup remove set-${i}
+ done
+
+ dmsetup remove raid0
+
+ for i in $(seq 0 ${SEQ_END}); do
+   losetup -d /dev/loop${i}
+   rm -f member-${i}
+ done
+
+==============
+
+
+ Another example:
+
+ Intel NVMe drives contain two cores on the physical device.
+ Each core of the drive has segregated access to its LBA range.
+ The current LBA model has a RAID 0 128k chunk on each core, resulting
+ in a 256k stripe across the two cores:
+
+    Core 0:       Core 1:
+   __________    __________
+   | LBA 512|    | LBA 768|
+   | LBA 0  |    | LBA 256|
+   ⎻⎻⎻⎻⎻⎻⎻⎻⎻⎻    ⎻⎻⎻⎻⎻⎻⎻⎻⎻⎻
+
+ The purpose of this unstriping is to provide better QoS in noisy
+ neighbor environments. When two partitions are created on the
+ aggregate drive without this unstriping, reads on one partition
+ can affect writes on another partition. This is because the partitions
+ are striped across the two cores. When we unstripe this hardware RAID 0
+ and make partitions on each newly exposed device, the two partitions
+ are physically separated.
+
+ With the module we were able to segregate a fio script that has read and
+ write jobs that are independent of each other. Compared to when we run
+ the test on a combined drive with partitions, we were able to get a 92%
+ reduction in five-nines (99.999th percentile) read latency using this
+ device-mapper target.
+
+
+====================
+Example scripts:
+
+
+dmsetup create nvmset1 --table '0 1 unstripe /dev/nvme0n1 1 2 0'
+dmsetup create nvmset0 --table '0 1 unstripe /dev/nvme0n1 0 2 0'
+
+There will now be two mappers:
+/dev/mapper/nvmset1
+/dev/mapper/nvmset0
+
+that expose core 1 and core 0, respectively.
+
+
+# In a dm-stripe with 4 drives of chunk size 128K:
+dmsetup create raid_disk0 --table '0 1 unstripe /dev/mapper/striped 0 4 256'
+dmsetup create raid_disk1 --table '0 1 unstripe /dev/mapper/striped 1 4 256'
+dmsetup create raid_disk2 --table '0 1 unstripe /dev/mapper/striped 2 4 256'
+dmsetup create raid_disk3 --table '0 1 unstripe /dev/mapper/striped 3 4 256'
-- 
2.11.0