From: qianli zhao
Date: Tue, 23 Mar 2021 11:14:19 +0800
Subject: Re: [PATCH V3] exit: trigger panic when global init has exited
To: Oleg Nesterov
Cc: christian@brauner.io, axboe@kernel.dk, "Eric W. Biederman", Thomas Gleixner, Peter Collingbourne, linux-kernel@vger.kernel.org, Qianli Zhao

Hi Oleg,

> No, there is at least one alive init thread. If they all have exited, we have
> the thread which calls panic() above.

In the current code, PF_EXITING is set (in exit_signals()) before the
panic() call. find_alive_thread() checks PF_EXITING on every thread in the
group, and the thread that is about to panic already has PF_EXITING set,
so find_alive_thread() treats that thread as dead too and returns NULL.
In that case it is possible to reach zap_pid_ns_processes()->BUG().

========
	exit_signals(tsk);  /* sets PF_EXITING */
	...
	group_dead = atomic_dec_and_test(&tsk->signal->live);
	if (group_dead) {
		if (unlikely(is_global_init(tsk)))
			/* PF_EXITING has already been set at this point */
			panic("Attempted to kill init! exitcode=0x%08x\n",
				tsk->signal->group_exit_code ?: (int)code);
========

> Why do you think so? It can affect _any_ code which runs under
> "if (group_dead)". Again, I don't see anything wrong, but I didn't even
> try to audit these code paths.

Yes, every place that checks "signal->live" may be affected. But even
before my change, any code that checks "signal->live" may observe a
different state (group_dead or not) depending on the caller's timing, and
my change does not alter that.
What my patch does change is the order in which "signal->live--" and other
state (such as PF_EXITING) are updated. That can cause abnormal behaviour
in logic that depends on both variables; that is my thinking.
Of course, checking every "signal->live--" path is definitely necessary;
it is just that the case above needs more attention.

Thanks

On Tue, Mar 23, 2021 at 12:37 AM, Oleg Nesterov wrote:
>
> Hi,
>
> It seems that we don't understand each other.
>
> If we move atomic_dec_and_test(signal->live) and do
>
>         if (group_dead && is_global_init)
>                 panic(...);
>
>
> before setting PF_EXITING like your patch does, then zap_pid_ns_processes()
> simply won't be called.
>
> Because:
>
> On 03/21, qianli zhao wrote:
> >
> > Hi,Oleg
> >
> > > How?
> > > Perhaps I missed something again, but I don't think this is possible.
> > >
> > > zap_pid_ns_processes() simply won't be called, find_child_reaper() will
> > > see the !PF_EXITING thread which calls panic().
> > >
> > > So I think this should be documented somehow, at least in the changelog.
> >
> > This problem occurs when both two init threads enter the do_exit,
> > One of the init thread is syscall sys_exit_group,and set SIGNAL_GROUP_EXIT
> > The other init thread perform ret_to_user()->get_signal() and found
> > SIGNAL_GROUP_EXIT is set,then do_group_exit()->do_exit(),since there
> > are no alive init threads it finally goes to
> > zap_pid_ns_processes()
>
> No, there is at least one alive init thread. If they all have exited, we have
> the thread which calls panic() above.
>
> > and BUG().
>
> so we don't need the SIGNAL_GROUP_EXIT check to avoid this BUG().
>
> What have I missed?
>
> Oleg.
>
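
For reference, the two orderings argued about in this thread can be modelled
in userspace. This is only a minimal sketch under stated assumptions, not
kernel code: struct toy_task, TOY_PF_EXITING and both toy_* functions are
hypothetical names made up for illustration, the two threads run one after
the other rather than concurrently, and the loop only imitates the idea of
find_alive_thread() (return the first thread without PF_EXITING).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's PF_EXITING flag. */
#define TOY_PF_EXITING 0x1u

/* Hypothetical stand-in for a thread of the init thread group. */
struct toy_task {
	unsigned int flags;
};

/* Imitates find_alive_thread(): first thread without the exiting flag. */
struct toy_task *toy_find_alive_thread(struct toy_task *threads, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (!(threads[i].flags & TOY_PF_EXITING))
			return &threads[i];
	return NULL;
}

/*
 * Two init threads exit one after the other.
 *   flag_before_check == true:  set the flag, then drop 'live'
 *                               (the ordering described as current)
 *   flag_before_check == false: drop 'live' and test group_dead first
 *                               (the ordering described for the patch)
 * Returns true if, at the point where the last thread sees group_dead,
 * some thread is still visible as alive.
 */
bool toy_alive_thread_visible_at_panic(bool flag_before_check)
{
	struct toy_task threads[2] = { { 0 }, { 0 } };
	int live = 2;

	/* The first thread exits completely. */
	threads[0].flags |= TOY_PF_EXITING;
	live--;

	/* The second (last) thread reaches the group_dead check. */
	if (flag_before_check)
		threads[1].flags |= TOY_PF_EXITING;

	bool group_dead = (--live == 0);

	return group_dead && toy_find_alive_thread(threads, 2) != NULL;
}
```

In this model, toy_alive_thread_visible_at_panic(true) finds no alive thread,
which is the case described at the top of this mail, while
toy_alive_thread_visible_at_panic(false) still sees the thread that is about
to panic, which matches Oleg's point about the reordered check.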