Received: by 2002:a05:6359:c8b:b0:c7:702f:21d4 with SMTP id go11csp1097897rwb; Tue, 4 Oct 2022 15:35:58 -0700 (PDT) X-Google-Smtp-Source: AMsMyM6APCqJWnUoDOH/bzn64fz1lhAu1TrBuQ9zfGfJVSV8djsY216Yi1n13W6TszZNc2VgnLp0 X-Received: by 2002:a05:6402:5246:b0:459:42b1:aed5 with SMTP id t6-20020a056402524600b0045942b1aed5mr6784595edd.398.1664922958270; Tue, 04 Oct 2022 15:35:58 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1664922958; cv=none; d=google.com; s=arc-20160816; b=I/K933vAtLdnFfQqQfp0qbj0BPVCYJ70r8CtY3Y4/V8Ci7zBLQont0EhryHUPOfHdS WzDvsclM0AHvxdTT6Wj799MBIhW/Xh0Jz3OOKZrEAHybwdPZx+u8jAPWJmVXKcx3GrnA LmjRAD2j7E4RXiM4oUk5D/uQ7IYcNepAbslQddkgpfFk/226vphBPhReAF9TcMXXN855 WYCnz5jWVEY1dowu+1pV0sE9QqnWfcL2DC8aDbs+swLpeeQxPXHrNSIYcaIgL03m6H9N rFyvZUfcetic+IgHmKjQ2wXh9jR72JYcjrU/rqildQpUCJiyTu0If9proT40QBOWmTUc MqBg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:in-reply-to:content-disposition:mime-version :references:message-id:subject:cc:to:from:date:dkim-signature; bh=bppVNtMGy4EqWGGPO54EHqJYC+5NgaZ/f7bERvur6dg=; b=YSHE3v8bkYH6Xl5xxoPOv+eV52OOOXnKecuc8/R0/4dLi67Rx0MuaMSMmNb37R5nWN 7adl+DF7FwV96FDdTtul4rrVPjduZ9s8Q7lGuo+Fr5uQ9WWtHQB4UO1LLnEMrzEerfh0 16Y0/uuGVXD7SvNzVcQpDILrEd97nZwKdnr+JM/+ZpXEpXsaWlMwQmKnWqbCXJIIXCpP x6/fpLvxomkWGA/DQOkoqOWC7VnulXe0wnbyETl+xM4jdgmAyP/jTIH26VMu5RZfHWp6 u/li8OWD57YxeGM2NTr/7+47FkHKCBZQlFP3fs/7NLJ+G5TcU78tlIW3vwZ2GC8tyc19 iDag== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=g96s3fHs; spf=pass (google.com: domain of linux-ext4-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-ext4-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id l12-20020a056402254c00b004577ace3f0bsi13442318edb.439.2022.10.04.15.35.20; Tue, 04 Oct 2022 15:35:58 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-ext4-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=g96s3fHs; spf=pass (google.com: domain of linux-ext4-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-ext4-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231618AbiJDWTl (ORCPT + 99 others); Tue, 4 Oct 2022 18:19:41 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:50036 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231303AbiJDWSm (ORCPT ); Tue, 4 Oct 2022 18:18:42 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3FCA369F55; Tue, 4 Oct 2022 15:18:41 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id D029C614B2; Tue, 4 Oct 2022 22:18:40 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 353ABC433D6; Tue, 4 Oct 2022 22:18:40 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1664921920; bh=3OSNGb6eczbbTEHdSyw0AJmcW1Gd0forZ+L7hKDOImU=; h=Date:From:To:Cc:Subject:References:In-Reply-To:From; b=g96s3fHsBQiMBtHOiWUWkcF04ehUH4sAH28zokPRfYfJfiW+vupPchlaH2EO0P704 IYJbqbC86ttuyYpH4dKLCDVmY6NnkGSQ8VGGaz/AzsGnM9YiiH7IaD2o/7baYYCOU3 K8txGUWfwz7mcUr0xbZIBt4qmTdV9w3OlCmOY9zq4hd0p0k/QZQcwYUEP+ECI3gkU8 xnFLs7iR3XPoeKgYfU977ceRc6V/zXQhSaoy6gLyTOV4XvXhD4emLns2oSO/qh5K9e PrV1U+3oMv0o/8ic7bQitPusSsuRG38ajwFhbtGYgpcENrRoPE4Yr1iiK5voX/7WEa Fv2/Kj1Awfg8g== Date: Tue, 4 Oct 2022 15:18:39 -0700 From: "Darrick J. Wong" To: Lukas Czerner Cc: zhanchengbin , Theodore Ts'o , linux-ext4@vger.kernel.org, liuzhiqiang26@huawei.com, linfeilong , kzak@redhat.com, util-linux@vger.kernel.org Subject: Re: [bug report] misc/fsck.c: Processes may kill other processes. Message-ID: References: <4ffe3143-fc53-7178-cf44-f3481eb96ae4@huawei.com> <20220929112813.6aqtktwaff2m7fh2@fedora> <470ea2ee-54fd-3dd5-2500-36fb82665c11@huawei.com> <20220930072042.dwakvbnefctk2jyd@fedora> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20220930072042.dwakvbnefctk2jyd@fedora> X-Spam-Status: No, score=-7.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_HI, SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-ext4@vger.kernel.org [cc util-linux and karel zak] TLDR: util-linux's fsck program has an interesting bug in it where if someone runs "fsck -N", it will set up a fsck_instance context for each filesystem with inst->pid = -1. If someone sends the fsck process a SIGINT/SIGTERM before it finishes enumerating filesystems, it will try to kill all the fsck instances via "kill(inst->pid, ...);" which will terminate every process on the system. On Fri, Sep 30, 2022 at 09:20:42AM +0200, Lukas Czerner wrote: > On Fri, Sep 30, 2022 at 09:42:52AM +0800, zhanchengbin wrote: > > > > > > On 2022/9/29 19:28, Lukas Czerner wrote: > > > Hi, > > > > > > indeed we'd like to avoid killing the instance that was not ran because > > > of noexecute. Can you try the following patch? > > > > > > Thanks! > > > -Lukas > > > > Yes, you're right, I think we can fix it in this way. > > > > diff --git a/misc/fsck.c b/misc/fsck.c > > index 1f6ec7d9..91edbf17 100644 > > --- a/misc/fsck.c > > +++ b/misc/fsck.c > > @@ -547,6 +547,8 @@ static int kill_all(int signum) > > for (inst = instance_list; inst; inst = inst->next) { > > if (inst->flags & FLAG_DONE) > > continue; > > + if (inst->pid == -1) > > + continue; > > Yeah, that works as well although I find the "if (noexecute)" condition > more obvious. We can do both. Also rather than checking for -1 we can > check for <= 0 since anything other than real pid at this point is a bug. > > Feel free to send a proper patch. I was about to ask why we even care about misc/fsck.c because it's clearly a weird old program that has been bitrotting for years and likely replaced by some other code in util-linux. Then I thought I had better check util-linux, and... https://git.kernel.org/pub/scm/utils/util-linux/util-linux.git/tree/disk-utils/fsck.c /* * fsck --- A generic, parallelizing front-end for the fsck program. * It will automatically try to run fsck programs in parallel if the * devices are on separate spindles. It is based on the same ideas as * the generic front end for fsck by David Engel and Fred van Kempen, * but it has been completely rewritten from scratch to support * parallel execution. * * Written by Theodore Ts'o, LOL, it's the same source code, and I think it has the same bug, since "noexecute" mode sets pid = -1 at like 688: /* Fork and execute the correct program. */ if (noexecute) pid = -1; and then sets inst->pid = pid at line 703: inst->pid = pid; and kill_all() passes that to kill() at line 733: for (inst = instance_list; inst; inst = inst->next) { if (inst->flags & FLAG_DONE) continue; kill(inst->pid, signum); n++; } From that I conclude that this is a real bug in util-linux, and we ought to be talking to them about this. Evidently this has been broken since e2fsprogs commit 33922999 in January 1999, though it was only added to util-linux in commit 607c2a72952f in February 2009. --D > Thanks! > -Lukas > > > kill(inst->pid, signum); > > n++; > > } > > > > > > > > > diff --git a/misc/fsck.c b/misc/fsck.c > > > index 1f6ec7d9..8fae7730 100644 > > > --- a/misc/fsck.c > > > +++ b/misc/fsck.c > > > @@ -497,9 +497,10 @@ static int execute(const char *type, const char *device, const char *mntpt, > > > } > > > /* Fork and execute the correct program. */ > > > - if (noexecute) > > > + if (noexecute) { > > > pid = -1; > > > - else if ((pid = fork()) < 0) { > > > + inst->flags |= FLAG_DONE; > > > + } else if ((pid = fork()) < 0) { > > > perror("fork"); > > > free(inst); > > > return errno; > > > @@ -544,6 +545,9 @@ static int kill_all(int signum) > > > struct fsck_instance *inst; > > > int n = 0; > > > + if (noexecute) > > > + return 0; > > > + > > > for (inst = instance_list; inst; inst = inst->next) { > > > if (inst->flags & FLAG_DONE) > > > continue; > > regards, > > Zhan Chengbin > > >