Subject: Re: [REVIEW][PATCH 11/11] ipc/sem: Fix semctl(..., GETPID, ...) between pid namespaces
To: Davidlohr Bueso, "Eric W. Biederman"
Cc: Linux Containers, linux-kernel@vger.kernel.org, linux-api@vger.kernel.org, khlebnikov@yandex-team.ru, prakash.sangappa@oracle.com, luto@kernel.org, akpm@linux-foundation.org, oleg@redhat.com, serge.hallyn@ubuntu.com, esyr@redhat.com, jannh@google.com, linux-security-module@vger.kernel.org, Pavel Emelyanov, Nagarathnam Muthusamy
References: <87vadmobdw.fsf_-_@xmission.com> <20180323191614.32489-11-ebiederm@xmission.com> <20180329005209.fnzr3hzvyr4oy3wi@linux-n805> <20180330190951.nfcdwuzp42bl2lfy@linux-n805>
From: Manfred Spraul
Message-ID: <334d0390-88e9-b950-b07e-ad69c0516e5b@colorfullife.com>
Date: Mon, 2 Apr 2018 13:11:54 +0200
In-Reply-To: <20180330190951.nfcdwuzp42bl2lfy@linux-n805>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi,

On 03/30/2018 09:09 PM, Davidlohr Bueso wrote:
> On Wed, 28 Mar 2018, Davidlohr Bueso wrote:
>
>> On Fri, 23 Mar 2018, Eric W. Biederman wrote:
>>
>>> Today the last process to update a semaphore is remembered and
>>> reported in the pid namespace of that process.  If there are processes
>>> in any other pid namespace querying that process id with GETPID the
>>> result will be unusable nonsense as it does not make any
>>> sense in your own pid namespace.
>>
>> Yeah that sounds pretty wrong.
>>
>>>
>>> Due to ipc_update_pid I don't think you will be able to get System V
>>> ipc semaphores into a troublesome cache line ping-pong.  Using struct
>>> pids from separate process are not a problem because they do not share
>>> a cache line.  Using struct pid from different threads of the same
>>> process are unlikely to be a problem as the reference count update
>>> can be avoided.
>>>
>>> Further linux futexes are a much better tool for the job of mutual
>>> exclusion between processes than System V semaphores.  So I expect
>>> programs that are performance limited by their interprocess mutual
>>> exclusion primitive will be using futexes.
>>

The performance of sysv sem and futexes for the contended case is more or less identical; which one is faster depends on the CONFIG_ options.

And this is obvious, since both primitives must do the same tasks:

sleep:
- look up a kernel pointer from a user space reference
- acquire a lock, do some housekeeping, unlock and sleep

wakeup:
- look up a kernel pointer from a user space reference
- acquire a lock, do some housekeeping, especially unlink the to-be-woken-up task, unlock and wake up

The woken up task has nothing to do, it returns immediately to user space.

IIRC for the uncontended case, sysvsem was at ~300 cpu cycles, but that number is a few years old, and I don't know what the impact of spectre is.
The futex code is obviously faster there.
But I don't know which real-world applications do their own optimizations for the uncontended case before using sysvsem.

Thus the only "real" challenge is to minimize cache line thrashing.

>> You would be wrong. There are plenty of real workloads out there
>> that do not use futexes and are care about performance; in the end
>> futexes are only good for the uncontended cases, it can also
>> destroy numa boxes if you consider the global hash table. Experience
>> as shown me that sysvipc sems are quite still used.
>>
>>>
>>> So while it is possible that enhancing the storage of the last
>>> rocess of a System V semaphore from an integer to a struct pid
>>> will cause a performance regression because of the effect
>>> of frequently updating the pid reference count.  I don't expect
>>> that to happen in practice.
>>
>> How's that?
>> Now thanks to ipc_update_pid() for each semop the user
>> passes, perform_atomic_semop() will do two atomic updates for the
>> cases where there are multiple processes updating the sem. This is
>> not uncommon.
>>
>> Could you please provide some numbers.
>
> [...]
> So at least for a large box this patch hurts the cases where there is low
> to medium cpu usage (no more than ~8 processes on a 40 core box) in a non
> trivial way. For more processes it doesn't matter. We can confirm that the
> case for threads is irrelevant. While I'm not happy about the 30% regression
> I guess we can live with this.
>
> Manfred, any thoughts?
>

Bug fixing always has first priority, and a 30% regression in one microbenchmark doesn't seem to be that bad.

Thus I would propose that we fix SEMPID first, and _if_ someone sees a noticeable regression, then we must improve the code.

--
    Manfred
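
The reference count traffic discussed in the quoted text can be illustrated with a toy model. This is only a sketch in Python, not kernel code: `Pid`, `Sem`, `update_pid` and `count_refcount_updates` are hypothetical names, and the assumed behaviour (taken from the quoted description of ipc_update_pid()) is that the stored pid is swapped, with one reference taken and one dropped, only when the updater differs from the previous one.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Pid:
    """Toy stand-in for a reference-counted struct pid (hypothetical model)."""
    nr: int
    refcount: int = 1


class Sem:
    """Toy semaphore that remembers its last updater by reference."""

    def __init__(self) -> None:
        self.sempid: Optional[Pid] = None
        # Number of refcount changes; in a kernel these would be atomic ops.
        self.atomic_ops = 0

    def update_pid(self, pid: Pid) -> None:
        # Same updater as last time (e.g. threads sharing one pid object):
        # nothing changes, so no refcount traffic at all.
        if self.sempid is pid:
            return
        # Different updater: take a reference on the new pid ...
        pid.refcount += 1
        self.atomic_ops += 1
        # ... and drop the reference held on the previous one, if any.
        if self.sempid is not None:
            self.sempid.refcount -= 1
            self.atomic_ops += 1
        self.sempid = pid


def count_refcount_updates(updaters: Iterable[Pid]) -> int:
    """Replay a sequence of semaphore updaters and count refcount changes."""
    sem = Sem()
    for pid in updaters:
        sem.update_pid(pid)
    return sem.atomic_ops


if __name__ == "__main__":
    a, b = Pid(100), Pid(200)
    print("same updater, 6 ops:       ", count_refcount_updates([a] * 6))
    print("alternating updaters, 6 ops:", count_refcount_updates([a, b] * 3))
```

In this model a single updater costs one refcount change in total, while two alternating updaters cost two changes per operation after the first. That matches the "two atomic updates" per semop mentioned for the multi-process case; the model says nothing about cache line effects or real cycle counts.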