From: Thomas Gleixner
To: Pavel Tikhomirov, Frederic Weisbecker
Cc: LKML, Anna-Maria Behnsen, Peter Zijlstra,
    syzbot+5c54bd3eb218bb595aa9@syzkaller.appspotmail.com, Dmitry Vyukov,
    Sebastian Siewior, Michael Kerrisk, Andrei Vagin, Christian Brauner,
    Alexander Mikhalitsyn, Pavel Emelyanov
Subject: Re: [RFD] posix-timers: CRIU woes
In-Reply-To: <009e7658-1377-cc79-7a42-4dda8fec5af0@virtuozzo.com>
Date: Wed, 10 May 2023 10:30:57 +0200
Message-ID: <87wn1gy4e6.ffs@tglx>
X-Mailing-List: linux-kernel@vger.kernel.org

Pavel!

On Wed, May 10 2023 at 12:36, Pavel Tikhomirov wrote:
> On 10.05.2023 05:42, Thomas Gleixner wrote:
>> So because of that half thought out user space ABI we are now up the
>> regression creek without a paddle, unless CRIU can accommodate to a
>> different restore mechanism to lift this restriction from the kernel.
>>
>> Thoughts?
>
> Maybe we can do something similar to /proc/sys/kernel/ns_last_pid?
> Switch to per-(process->signal) idr based approach with idr_set_cursor
> to set next id for next posix timer from new sysctl?

I'm not a fan of such sysctls. We already have too many of them and that
particular one does not buy much.

We can simply let timer_create() or a new syscall create a timer at a
given ID.

That allows CRIU to restore any checkpointed process no matter which
kernel version it came from without doing this insane create/delete
dance.

The downside is that this allows creating stupidly sparse timer IDs
even for the non CRIU case, which increases per process kernel memory
consumption and creates slightly more overhead in the signal delivery
path. The latter is a burden on the process owning the timer and does
not affect expiry, which is a context stealing operation. The memory
part eventually needs some thought vs. accounting.

If the 'explicit at ID' option is not used then the ID mechanism is
optimized for dense IDs by using the first available ID in a bottom up
search, which recovers holes created by a timer_delete() operation.

Thanks,

        tglx
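
For illustration only, the allocation behaviour described above can be
modelled in a few lines of Python. This is a toy sketch, not kernel
code: the class TimerIds, its methods, and restore_by_dance() are
hypothetical names, and the "dance" shown is an assumed simplification
of what a restore tool has to do when it can only create and delete
timers.

```python
class TimerIds:
    """Toy model of a per-process timer ID table (hypothetical, not kernel code)."""

    def __init__(self):
        self.used = set()

    def create(self):
        """Allocate the first free ID found in a bottom-up search."""
        timer_id = 0
        while timer_id in self.used:
            timer_id += 1
        self.used.add(timer_id)
        return timer_id

    def create_at(self, timer_id):
        """Assumed 'explicit at ID' creation; fails if the ID is taken."""
        if timer_id in self.used:
            raise ValueError(f"timer ID {timer_id} already in use")
        self.used.add(timer_id)
        return timer_id

    def delete(self, timer_id):
        """Release an ID so that a later create() can reuse the hole."""
        self.used.remove(timer_id)


def restore_by_dance(ids, wanted):
    """Reach ID `wanted` with create()/delete() only; return the call count."""
    calls = 0
    scratch = []
    while True:
        timer_id = ids.create()
        calls += 1
        if timer_id == wanted:
            break
        if timer_id > wanted:
            raise RuntimeError(f"timer ID {wanted} is already in use")
        scratch.append(timer_id)
    # Drop the placeholder timers that were only created to advance the ID.
    for timer_id in scratch:
        ids.delete(timer_id)
        calls += 1
    return calls


table = TimerIds()
first_three = [table.create() for _ in range(3)]  # dense IDs 0, 1, 2
table.delete(1)                                   # leaves a hole at ID 1
reused = table.create()                           # the hole is recovered: 1

danced = TimerIds()
dance_calls = restore_by_dance(danced, 5)         # 6 creates plus 5 deletes

sparse = TimerIds()
sparse.create_at(1000)                            # one call, but a sparse ID space
```

In this model, restoring a single timer at ID 5 costs 11 calls with the
dance and one call with create_at(). The price is that create_at(1000)
leaves IDs 0 to 999 unused, which is the sparseness concern raised
above.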