From: Ondrej Mosnacek
Date: Mon, 1 Mar 2021 15:35:21 +0100
Subject: Re: [BUG] Race between policy reload sidtab conversion and live conversion
To: Hillf Danton
Cc: Paul Moore, Tyler Hicks, Stephen Smalley, SElinux list, LKML

On Sat, Feb 27, 2021 at 3:35 AM Hillf Danton wrote:
> On Fri, 26 Feb 2021 12:19:35 +0100 Ondrej Mosnacek wrote:
> > On Fri, Feb 26, 2021 at 5:08 AM Hillf Danton wrote:
> > > On Thu, 25 Feb 2021 20:06:45 -0500 Paul Moore wrote:
> > > > On Wed, Feb 24, 2021 at 4:35 AM Ondrej Mosnacek wrote:
> > > > > After the switch to RCU, we now have:
> > > > > 1. Start live conversion of new entries.
> > > > > 2. Convert existing entries.
> > > > > 3. RCU-assign the new policy pointer to selinux_state.
> > > > >    [!!! Now actually both old and new sidtab may be referenced
> > > > >    by readers, since there is no synchronization barrier
> > > > >    previously provided by the write lock.]
> > > > > 4. Wait for synchronize_rcu() to return.
> > > > > 5. Now only the new sidtab is visible to readers, so the old
> > > > >    one can be destroyed.
> > > > >
> > > > > So the race can happen between 3. and 5., if one thread already
> > > > > sees the new sidtab and adds a new entry there, and a second
> > > > > thread still has a reference to the old sidtab and also tries
> > > > > to add a new entry, live-converting it into the new sidtab,
> > > > > which it doesn't expect to change by itself. Unfortunately I
> > > > > failed to realize this when reviewing the patch :/
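To make the window concrete, here is a minimal sketch of the
publish/read sequence described above (illustrative only -- the names
loosely follow selinux_policy_commit() and security_context_to_sid(),
with error handling and most arguments omitted):

/* Policy reload side (roughly selinux_policy_commit()): */
sidtab_convert(oldpolicy->sidtab, convert_params);  /* steps 1 + 2 */
rcu_assign_pointer(state->policy, newpolicy);       /* step 3 */
/*
 * Window: readers that did rcu_dereference() before the assignment
 * still use oldpolicy->sidtab (live-converting new entries into
 * newpolicy->sidtab), while new readers insert directly into
 * newpolicy->sidtab -- with no common lock between them.
 */
synchronize_rcu();                                  /* step 4 */
sidtab_destroy(oldpolicy->sidtab);                  /* step 5 */

/* Reader side (roughly security_context_to_sid()): */
rcu_read_lock();
policy = rcu_dereference(state->policy);  /* old OR new in the window */
rc = sidtab_context_to_sid(policy->sidtab, context, &sid);
rcu_read_unlock();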
> > > > It is possible I'm not fully understanding the problem and/or
> > > > missing an important detail - it is rather tricky code, and RCU
> > > > can be very hard to reason about at times - but I think we may
> > > > be able to solve this with some lock fixes inside
> > > > sidtab_context_to_sid(). Let me try to explain to see if we are
> > > > on the same page here ...
> > > >
> > > > The problem is when we have two (or more) threads trying to
> > > > add/convert the same context into a sid; the task with
> > > > new_sidtab is looking to add a new sidtab entry, while the task
> > > > with old_sidtab is looking to convert an entry in old_sidtab
> > > > into a new entry in new_sidtab. Boom.
> > > >
> > > > Looking at the code in sidtab_context_to_sid(), when we have two
> > > > sidtabs that are currently active (the old_sidtab->convert
> > > > pointer is valid) and a task with old_sidtab attempts to add a
> > > > new entry to both sidtabs, it first adds the entry to the old
> > > > sidtab and then also adds it to the new sidtab. I believe the
> > > > problem is that in this case, while the task grabs the
> > > > old_sidtab->lock, it never grabs the new_sidtab->lock, which
> > > > allows it to race with tasks that already see only new_sidtab.
> > > > I think adding code to sidtab_context_to_sid() which grabs the
> > > > new_sidtab->lock when adding entries to the new_sidtab *should*
> > > > solve the problem.
> > > >
> > > > Did I miss something important? ;)
> > >
> > > If the convert pointer can be dereferenced without the lock, we
> > > can opt to convert the context after building the sidtab, with
> > > the risk of an AB-BA deadlock cut out. Below is the minimum
> > > change I can think of along your direction.
> >
> > We could fix this a bit more easily by just having a shared
> > spinlock for both (well, *all*) sidtabs. Yes, we'd need to have it
> > all the way
>
> Looking forward to reading how to add another BKL,

This is not even remotely comparable to the Big Kernel Lock... 99.9% of
the time there will be just one sidtab; otherwise there will be just
two for a short time during the policy reload, and those need to be
updated in sync anyway.

> > up in selinux_state and pass it through to sidtab_init(), but IMHO
> > that's less bad than trying to get it right with two locks.
>
> instead of a nearly pure code churn in order to work around the
> deadlock.

I wish it were that simple, but it won't work this way. Consider a
scenario like this:

1. Thread A acquires the old policy pointer, encounters a new context,
   adds it to the old sidtab as X, but doesn't start adding it to the
   new one yet.
2. Thread B acquires the new policy pointer, encounters a different
   new context, and adds it to the new sidtab as X.
   [Now you end up with two different contexts assigned to the same
   SID, depending on the policy.]
3. Thread A now continues to add its context to the new sidtab, but
   since it reuses the count variable from before, it overwrites the
   entry for X.

(And even if you fixed that, re-searched the new sidtab, and added the
new context at the end, you'd end up with the context having different
SIDs under the old vs. new policy, which is again a messy situation.)

You see, moving the sidtab code around can get you in trouble, and
that's why I'm advocating a common lock - it's much easier to prove
that such a change doesn't create new bugs. And as I said above,
merging the locks won't really have performance consequences, so I
still believe it to be the better choice here.
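To make that concrete, here is a rough sketch of the shared-lock idea
(hypothetical and simplified -- the names loosely follow the real
sidtab code, but this illustrates the approach rather than being a
tested patch):

struct sidtab {
        /* ... entry storage, count, convert params, etc. ... */
        spinlock_t *lock;       /* shared; owned by selinux_state */
};

int sidtab_init(struct sidtab *s, spinlock_t *lock)
{
        /* ... existing initialization ... */
        s->lock = lock;         /* every sidtab gets the same lock */
        return 0;
}

int sidtab_context_to_sid(struct sidtab *s, struct context *context,
                          u32 *sid)
{
        unsigned long flags;

        /* ... lockless fast path: look up an existing entry ... */

        spin_lock_irqsave(s->lock, flags);
        /*
         * Slow path: insert into s and, if s->convert is set, into
         * the conversion target as well. Because the target shares
         * this lock, a task that already sees only the new sidtab
         * cannot race with the dual insertion.
         */
        spin_unlock_irqrestore(s->lock, flags);
        return 0;
}

--
Ondrej Mosnacek
Software Engineer, Linux Security - SELinux kernel
Red Hat, Inc.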