Subject: Re: [PATCH 0/2] Don't show PF_IO_WORKER in /proc//task/
To: Stefan Metzmacher, Linus Torvalds, "Eric W.
    Biederman"
Cc: io-uring, Linux Kernel Mailing List, Oleg Nesterov
References: <20210325164343.807498-1-axboe@kernel.dk> <5563d244-52c0-dafb-5839-e84990340765@samba.org>
From: Jens Axboe
Message-ID: <6a2c4fe3-a019-2744-2e17-34b6325967d7@kernel.dk>
Date: Thu, 25 Mar 2021 18:11:34 -0600
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.10.0
MIME-Version: 1.0
In-Reply-To: <5563d244-52c0-dafb-5839-e84990340765@samba.org>
Content-Type: text/plain; charset=utf-8
Content-Language: en-US
Content-Transfer-Encoding: 7bit
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 3/25/21 3:57 PM, Stefan Metzmacher wrote:
>
> On 25.03.21 at 22:44, Jens Axboe wrote:
>> On 3/25/21 2:40 PM, Jens Axboe wrote:
>>> On 3/25/21 2:12 PM, Linus Torvalds wrote:
>>>> On Thu, Mar 25, 2021 at 12:42 PM Linus Torvalds wrote:
>>>>>
>>>>> On Thu, Mar 25, 2021 at 12:38 PM Linus Torvalds wrote:
>>>>>>
>>>>>> I don't know what the gdb logic is, but maybe there's some other
>>>>>> option that makes gdb not react to them?
>>>>>
>>>>> .. maybe we could have a different name for them under the task/
>>>>> subdirectory, for example (not just the pid)? Although that probably
>>>>> messes up 'ps' too..
>>>>
>>>> Actually, maybe the right model is to simply make all the io threads
>>>> take signals, and get rid of all the special cases.
>>>>
>>>> Sure, the signals will never be delivered to user space, but if we
>>>>
>>>>  - just made the thread loop do "get_signal()" when there are pending signals
>>>>
>>>>  - allowed ptrace_attach on them
>>>>
>>>> they'd look pretty much like regular threads that just never do the
>>>> user-space part of signal handling.
>>>>
>>>> The whole "signals are very special for IO threads" thing has caused
>>>> so many problems, that maybe the solution is simply to _not_ make them
>>>> special?
>>>
>>> Just to wrap up the previous one, yes it broke all sorts of things to
>>> make the 'tid' directory different.
>>> They just end up being hidden anyway
>>> through that, for both ps and top.
>>>
>>> Yes, I do think that maybe it's better to just embrace the signals,
>>> and have everything just work by default. It's better than
>>> continually trying to make the threads special. I'll see if there
>>> are some demons lurking down that path.
>>
>> In the spirit of "let's just try it", I ran with the below patch. With
>> that, I can gdb attach just fine to a test case that creates an io_uring
>> and a regular thread with pthread_create(). The regular thread uses
>> the ring, so you end up with two iou-mgr threads. Attach:
>>
>> [root@archlinux ~]# gdb -p 360
>> [snip gdb noise]
>> Attaching to process 360
>> [New LWP 361]
>> [New LWP 362]
>> [New LWP 363]
>>
>> warning: Selected architecture i386:x86-64 is not compatible with reported target architecture i386
>>
>> warning: Architecture rejected target-supplied description
>> Error while reading shared library symbols for /usr/lib/libpthread.so.0:
>> Cannot find user-level thread for LWP 363: generic error
>> 0x00007f7aa526e125 in clock_nanosleep@GLIBC_2.2.5 () from /usr/lib/libc.so.6
>> (gdb) info threads
>>   Id   Target Id                Frame
>> * 1    LWP 360 "io_uring"       0x00007f7aa526e125 in clock_nanosleep@GLIBC_2.2.5 ()
>>    from /usr/lib/libc.so.6
>>   2    LWP 361 "iou-mgr-360"    0x0000000000000000 in ?? ()
>>   3    LWP 362 "io_uring"       0x00007f7aa52a0a9d in syscall () from /usr/lib/libc.so.6
>>   4    LWP 363 "iou-mgr-362"    0x0000000000000000 in ?? ()
>> (gdb) thread 2
>> [Switching to thread 2 (LWP 361)]
>> #0  0x0000000000000000 in ?? ()
>> (gdb) bt
>> #0  0x0000000000000000 in ?? ()
>> Backtrace stopped: Cannot access memory at address 0x0
>> (gdb) cont
>> Continuing.
>> ^C
>> Thread 1 "io_uring" received signal SIGINT, Interrupt.
>> [Switching to LWP 360]
>> 0x00007f7aa526e125 in clock_nanosleep@GLIBC_2.2.5 () from /usr/lib/libc.so.6
>> (gdb) q
>> A debugging session is active.
>>
>> Inferior 1 [process 360] will be detached.
>>
>> Quit anyway? (y or n) y
>> Detaching from program: /root/git/fio/t/io_uring, process 360
>> [Inferior 1 (process 360) detached]
>>
>> The iou-mgr-x threads are stopped just fine, gdb obviously can't get any
>> real info out of them. But it works... Regular test cases work fine too,
>> just a sanity check. Didn't expect them not to.
>
> I guess that's basically what I tried to describe when I said they
> should look like a userspace process that is blocked in a syscall
> forever.

Right, that's almost what they look like; in practice, that is what they
look like.

>> Only thing that I dislike a bit, but I guess that's just a Linuxism, is
>> that you can now kill an io_uring owning task by sending a signal to one
>> of its IO thread workers.
>
> Can't we just only allow SIGSTOP, which will be only delivered to
> the iothread itself? And also SIGKILL should not be allowed from userspace.

I don't think we can sanely block them, and we need to clean up and tear
down normally regardless of who gets the signal (owner or one of the
threads). So I'm not _too_ hung up on the "io thread gets signal goes to
owner" as that is what happens with normal threads too, though I would
prefer if that wasn't the case. But overall I feel better just embracing
the thread model, rather than having something that kinda sorta looks
like a thread, but differs in odd ways.

> And /proc/$iothread/ should be read only and owned by root with
> "cmdline" and "exe" being empty.

I know you brought this one up as part of your series, not sure I get
why you want it owned by root and read-only? cmdline and exe, yeah those
could be hidden, but is there really any point?

Maybe I'm missing something here, if so, do clue me in!

-- 
Jens Axboe