X-Mailing-List: linux-kernel@vger.kernel.org
From: Patrick Plenefisch
Date: Tue, 12 Mar 2024 18:54:59 -0400
Subject: Re: LVM-on-LVM: error while submitting device barriers
To: Ming Lei
Cc: Mike Snitzer, Goffredo Baroncelli, linux-kernel@vger.kernel.org,
    Alasdair Kergon, Mikulas Patocka, Chris Mason, Josef Bacik,
    David Sterba, regressions@lists.linux.dev, dm-devel@lists.linux.dev,
    linux-btrfs@vger.kernel.org
Content-Type: text/plain; charset="UTF-8"

On Mon, Mar 11, 2024 at 9:13 AM Ming Lei wrote:
>
> On Sun, Mar 10, 2024 at 02:11:11PM -0400, Patrick Plenefisch wrote:
> > On Sun, Mar 10, 2024 at 11:27 AM Mike Snitzer wrote:
> > >
> > > On Sun, Mar 10 2024 at 7:34P -0400,
> > > Ming Lei wrote:
> > >
> > > > On Sat, Mar 09, 2024 at 03:39:02PM -0500, Patrick Plenefisch wrote:
> > > > > On Wed, Mar 6, 2024 at 11:00 AM Ming Lei wrote:
> > > > > >
> > > > > > #!/usr/bin/bpftrace
> > > > > >
> > > > > > #ifndef BPFTRACE_HAVE_BTF
> > > > > > #include
> > > > > > #endif
> > > > > >
> > > > > > kprobe:submit_bio_noacct,
> > > > > > kprobe:submit_bio
> > > > > > / (((struct bio *)arg0)->bi_opf & (1 << __REQ_PREFLUSH)) != 0/
> > > > > > {
> > > > > >         $bio = (struct bio *)arg0;
> > > > > >         @submit_stack[arg0] = kstack;
> > > > > >         @tracked[arg0] = 1;
> > > > > > }
> > > > > >
> > > > > > kprobe:bio_endio
> > > > > > /@tracked[arg0] != 0/
> > > > > > {
> > > > > >         $bio = (struct bio *)arg0;
> > > > > >
> > > > > >         if (($bio->bi_flags & (1 << BIO_CHAIN)) && $bio->__bi_remaining.counter > 1) {
> > > > > >                 return;
> > > > > >         }
> > > > > >
> > > > > >         if ($bio->bi_status != 0) {
> > > > > >                 printf("dev %s bio failed %d, submitter %s completion %s\n",
> > > > > >                         $bio->bi_bdev->bd_disk->disk_name,
> > > > > >                         $bio->bi_status, @submit_stack[arg0], kstack);
> > > > > >         }
> > > > > >         delete(@submit_stack[arg0]);
> > > > > >         delete(@tracked[arg0]);
> > > > > > }
> > > > > >
> > > > > > END {
> > > > > >         clear(@submit_stack);
> > > > > >         clear(@tracked);
> > > > > > }
> > > > > >
> > > > >
> > > > > Attaching 4 probes...
> > > > > dev dm-77 bio failed 10, submitter
> > > > >         submit_bio_noacct+5
> > > > >         __send_duplicate_bios+358
> > > > >         __send_empty_flush+179
> > > > >         dm_submit_bio+857
> > > > >         __submit_bio+132
> > > > >         submit_bio_noacct_nocheck+345
> > > > >         write_all_supers+1718
> > > > >         btrfs_commit_transaction+2342
> > > > >         transaction_kthread+345
> > > > >         kthread+229
> > > > >         ret_from_fork+49
> > > > >         ret_from_fork_asm+27
> > > > >  completion
> > > > >         bio_endio+5
> > > > >         dm_submit_bio+955
> > > > >         __submit_bio+132
> > > > >         submit_bio_noacct_nocheck+345
> > > > >         write_all_supers+1718
> > > > >         btrfs_commit_transaction+2342
> > > > >         transaction_kthread+345
> > > > >         kthread+229
> > > > >         ret_from_fork+49
> > > > >         ret_from_fork_asm+27
> > > > >
> > > > > dev dm-86 bio failed 10, submitter
> > > > >         submit_bio_noacct+5
> > > > >         write_all_supers+1718
> > > > >         btrfs_commit_transaction+2342
> > > > >         transaction_kthread+345
> > > > >         kthread+229
> > > > >         ret_from_fork+49
> > > > >         ret_from_fork_asm+27
> > > > >  completion
> > > > >         bio_endio+5
> > > > >         clone_endio+295
> > > > >         clone_endio+295
> > > > >         process_one_work+369
> > > > >         worker_thread+635
> > > > >         kthread+229
> > > > >         ret_from_fork+49
> > > > >         ret_from_fork_asm+27
> > > > >
> > > > >
> > > > > For context, dm-86 is /dev/lvm/brokenDisk and dm-77 is /dev/lowerVG/lvmPool
> > > >
> > > > io_status is 10(BLK_STS_IOERR), which is produced in submission code path on
> > > > /dev/dm-77(/dev/lowerVG/lvmPool) first, so looks it is one device mapper issue.
> > > >
> > > > The error should be from the following code only:
> > > >
> > > > static void __map_bio(struct bio *clone)
> > > >
> > > >         ...
> > > >                 if (r == DM_MAPIO_KILL)
> > > >                         dm_io_dec_pending(io, BLK_STS_IOERR);
> > > >                 else
> > > >                         dm_io_dec_pending(io, BLK_STS_DM_REQUEUE);
> > > >                 break;
> > >
> > > I agree that the above bpf stack traces for dm-77 indicate that
> > > dm_submit_bio failed, which would end up in the above branch if the
> > > target's ->map() returned DM_MAPIO_KILL or DM_MAPIO_REQUEUE.
> > >
> > > But such an early failure speaks to the flush bio never being
> > > submitted to the underlying storage. No?
> > >
> > > dm-raid.c:raid_map does return DM_MAPIO_REQUEUE with:
> > >
> > >         /*
> > >          * If we're reshaping to add disk(s)), ti->len and
> > >          * mddev->array_sectors will differ during the process
> > >          * (ti->len > mddev->array_sectors), so we have to requeue
> > >          * bios with addresses > mddev->array_sectors here or
> > >          * there will occur accesses past EOD of the component
> > >          * data images thus erroring the raid set.
> > >          */
> > >         if (unlikely(bio_end_sector(bio) > mddev->array_sectors))
> > >                 return DM_MAPIO_REQUEUE;
> > >
> > > But a flush doesn't have an end_sector (it'd be 0 afaik).. so it seems
> > > weird relative to a flush.
> > >
> > > > Patrick, you mentioned lvmPool is raid1, can you explain how lvmPool is
> > > > built? It is dm-raid1 target or over plain raid1 device which is
> > > > build over /dev/lowerVG?
> >
> > LVM raid1:
> > lvcreate --type raid1 -m 1 ...
>
> OK, that is the reason, as Mike mentioned.
>
> dm-raid.c:raid_map returns DM_MAPIO_REQUEUE, which is translated into
> BLK_STS_IOERR in dm_io_complete().
>
> Empty flush bio is sent from btrfs, both .bi_size and .bi_sector are set
> as zero, but the top dm is linear, which(linear_map()) maps new
> bio->bi_iter.bi_sector, and the mapped bio is sent to dm-raid(raid_map()),
> then DM_MAPIO_REQUEUE is returned.
>
> The one-line patch I sent in last email should solve this issue.
>
> https://lore.kernel.org/dm-devel/a783e5ed-db56-4100-956a-353170b1b7ed@inwind.it/T/#m8fce3ecb2f98370b7d7ce8db6714bbf644af5459

With this patch on a 6.6.13 base, I can modify files and the BTRFS
volume stays RW, while no errors are logged in dmesg!

>
> But DM_MAPIO_REQUEUE misuse needs close look, and I believe Mike is working
> on that bigger problem.
>
> I guess most of dm targets don't deal with empty bio well, at least
> linear & dm-raid, not look into others yet, :-(
>
>
> Thanks,
> Ming
>
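
For anyone following the thread later, the sequence Ming describes (empty
flush, remapped by the linear target, then rejected by the raid_map() range
check) can be modeled roughly in user space. This is only a sketch, not
kernel code: the toy_* names, the struct layout, and the sector numbers are
invented for illustration, and the skip_empty flag is a hypothetical guard
that is not taken from the patch linked above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy user-space model of the path discussed in this thread.
 * All names and numbers are assumptions made for illustration. */
enum toy_map_result { TOY_MAPIO_REMAPPED, TOY_MAPIO_REQUEUE };

struct toy_bio {
        uint64_t sector;        /* stands in for bio->bi_iter.bi_sector */
        uint32_t size;          /* stands in for bio->bi_iter.bi_size (bytes) */
};

/* Models bio_end_sector(): start sector plus length in 512-byte sectors. */
static uint64_t toy_end_sector(const struct toy_bio *bio)
{
        return bio->sector + (bio->size >> 9);
}

/* Models the remap done by a linear target: shift the sector into the
 * lower device, even when the bio carries no data. */
static void toy_linear_map(struct toy_bio *bio, uint64_t target_begin,
                           uint64_t lower_start)
{
        bio->sector = lower_start + (bio->sector - target_begin);
}

/* Models the range check quoted from raid_map(). skip_empty is a
 * hypothetical guard that lets zero-length bios bypass the check. */
static enum toy_map_result toy_raid_map(const struct toy_bio *bio,
                                        uint64_t array_sectors,
                                        bool skip_empty)
{
        if (skip_empty && bio->size == 0)
                return TOY_MAPIO_REMAPPED;
        if (toy_end_sector(bio) > array_sectors)
                return TOY_MAPIO_REQUEUE;
        return TOY_MAPIO_REMAPPED;
}

/* Walks an empty flush through both steps with invented values. */
static void toy_self_check(void)
{
        struct toy_bio flush = { .sector = 0, .size = 0 };

        toy_linear_map(&flush, 0, 2048);        /* sector is now nonzero */
        assert(flush.sector == 2048);

        /* Without the guard, the empty flush trips the range check. */
        assert(toy_raid_map(&flush, 1024, false) == TOY_MAPIO_REQUEUE);
        /* With the hypothetical guard, it passes through. */
        assert(toy_raid_map(&flush, 1024, true) == TOY_MAPIO_REMAPPED);
}
```

Calling toy_self_check() from a small main() exercises both cases. The
point of the model is only that a zero-length bio can still fail a check
based on its end sector once its start sector has been remapped.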