From: Arsen Arsenović
To: Andrew Pinski
Cc: "H. Peter Anvin", David Howells, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 00/45] C++: Convert the kernel to C++
Date: Thu, 11 Jan 2024 22:01:51 +0100
Message-ID: <864jfjr29j.fsf@aarsen.me>
X-Mailing-List: linux-kernel@vger.kernel.org

Andrew Pinski writes:

> On Tue, Jan 9, 2024 at 11:57 AM H. Peter Anvin wrote:
>>
>> Hi all, I'm going to stir the hornet's nest and make what has become the
>> ultimate sacrilege.
>>
>> Andrew Pinski recently made aware of this thread. I realize it was
>> released on April 1, 2018, and either was a joke or might have been
>> taken as one. However, I think there is validity to it, and I'm going to
>> try to motivate my opinion here.
>>
>> Both C and C++ has had a lot of development since 1999, and C++ has in
>> fact, in my personal opinion, finally "grown up" to be a better C for
>> the kind of embedded programming that an OS kernel epitomizes. I'm
>> saying that as the author of a very large number of macro and inline
>> assembly hacks in the kernel.
>>
>> What really makes me say that is that a lot of things we have recently
>> asked for gcc-specific extensions are in fact relatively easy to
>> implement in standard C++ and, in many cases, allows for infrastructure
>> improvement *without* global code changes (see below.)
>>
>> C++14 is in my option the "minimum" version that has reasonable
>> metaprogramming support has most of it without the type hell of earlier
>> versions (C++11 had most of it, but C++14 fills in some key missing pieces).
>>
>> However C++20 is really the main game changer in my opinion; although
>> earlier versions could play a lot of SFINAE hacks they also gave
>> absolutely useless barf as error messages. C++20 adds concepts, which
>> makes it possible to actually get reasonable errors.
>>
>> We do a lot of metaprogramming in the Linux kernel, implemented with
>> some often truly hideous macro hacks. These are also virtually
>> impossible to debug. Consider the uaccess.h type hacks, some of which I
>> designed and wrote. In C++, the various casts and case statements can be
>> unwound into separate template instances, and with some cleverness can
>> also strictly enforce things like user space vs kernel space pointers as
>> well as already-verified versus unverified user space pointers, not to
>> mention easily handle the case of 32-bit user space types in a 64-bit
>> kernel and make endianness conversion enforceable.
>>
>> Now, "why not Rust"?
>> First of all, Rust uses a different (often, in my
>> opinion, gratuitously so) syntax, and not only would all the kernel
>> developers need to become intimately familiar to the level of getting
>> the same kind of "feel" as we have for C, but converting C code to Rust
>> isn't something that can be done piecemeal, whereas with some cleanups
>> the existing C code can be compiled as C++.
>>
>> However, I find that I disagree with some of David's conclusions; in
>> fact I believe David is unnecessarily *pessimistic* at least given
>> modern C++.
>>
>> Note that no one in their sane mind would expect to use all the features
>> of C++. Just like we have "kernel C" (currently a subset of C11 with a
>> relatively large set of allowed compiler-specific extensions) we would
>> have "kernel C++", which I would suggest to be a strictly defined subset
>> of C++20 combined with a similar set of compiler extensions.) I realize
>> C++20 compiler support is still very new for obvious reasons, so at
>> least some of this is forward looking.
>>
>> So, notes on this specific subset based on David's comments.
>>
>> On 4/1/18 13:40, David Howells wrote:
>> >
>> > Here are a series of patches to start converting the kernel to C++. It
>> > requires g++ v8.
>> >
>> > What rocks:
>> >
>> >  (1) Inline template functions, which makes implementation of things like
>> >      cmpxchg() and get_user() much cleaner.
>>
>> Much, much cleaner indeed. But it also allows for introducing things
>> like inline patching of immediates *without* having to change literally
>> every instance of a variable.
>>
>> I wrote, in fact, such a patchset. It probably included the most awful
>> assembly hacks I have ever done, in order to implement the mechanics,
>> but what *really* made me give up on it was the fact that every site
>> where a patchable variable is invoked would have to be changed from, say:
>>
>>     foo = bar + some_init_offset;
>>
>> ... to ...
>>
>>     foo = imm_add(bar, some_init_offset);
>>
>>
>> >  (2) Inline overloaded functions, which makes implementation of things like
>> >      static_branch_likely() cleaner.
>>
>> Basically a subset of the above (it just means that for a specific set
>> of very common cases it isn't necessary to go all the way to using
>> templates, which makes the syntax nicer.)
>>
>> >  (3) Class inheritance. For instance, all those inode wrappers that require
>> >      the base inode struct to be included and that has to be accessed with
>> >      something like:
>> >
>> >         inode->vfs_inode.i_mtime
>> >
>> >      when you could instead do:
>> >
>> >         inode->i_mtime
>>
>> This is nice, but it is fundamentally syntactic sugar. Similar things
>> can be done with anonymous structures, *except* that C doesn't allow
>> another structure to be anonymously included; you have to have an
>> entirely new "struct" statement defining all the fields. Welcome to
>> macro hell.
>>
>> > What I would disallow:
>> >
>> >  (1) new and delete. There's no way to pass GFP_* flags in.
>>
>> Yes, there is.
>>
>> void * operator new (size_t count, gfp_flags_t flags);
>> void operator delete(void *ptr, ...whatever kfree/vfree/etc need, or a
>> suitable flag);
>>
>> >  (2) Constructors and destructors. Nests of implicit code makes the code less
>> >      obvious, and the replacement of static initialisation with constructor
>> >      calls would make the code size larger.
>>
>> Yes and no. It also makes it *way* easier to convert to and from using
>> dedicated slabs; we already use semi-initialized slabs for some kinds of
>> objects, but it requires new code to make use of.
>>
>> We already *do* use constructors and *especially* destructors for a lot
>> of objects, we just call them out.
>>
>> Note that modern C++ also has the ability to construct and destruct
>> objects in-place, so allocation and construction/destruction aren't
>> necessarily related.
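
For illustration only, the two points quoted above (a flag-taking operator new, and construction that is separate from allocation) can be sketched in userspace roughly as follows. Everything in it is an assumption made for the example: gfp_flags_t is a stand-in enum rather than the kernel type, std::malloc/std::free stand in for kmalloc()/kfree(), and inode_like, alloc_inode, free_inode, construct_in_place and last_flags are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical stand-in for the kernel's allocation flags type.
enum class gfp_flags_t : unsigned { kernel = 1, atomic = 2 };

// Records the flags of the most recent allocation so the effect is observable.
static gfp_flags_t last_flags{};

// Allocation function taking flags. std::malloc is an assumed stand-in
// for kmalloc(); noexcept means a null return skips construction.
void *operator new(std::size_t count, gfp_flags_t flags) noexcept
{
    last_flags = flags;
    return std::malloc(count);
}

// Matching deallocation function. The compiler only calls this form
// automatically if a constructor throws during `new (flags) T(...)`.
void operator delete(void *ptr, gfp_flags_t) noexcept
{
    std::free(ptr);
}

// Hypothetical object type with a non-trivial constructor.
struct inode_like {
    int refcount;
    explicit inode_like(int r) : refcount(r) {}
    ~inode_like() {}
};

// Allocate with flags and construct in a single expression.
inline inode_like *alloc_inode(gfp_flags_t flags, int refcount)
{
    return new (flags) inode_like(refcount);
}

// Destroy explicitly, then release the storage with the matching stand-in.
inline void free_inode(inode_like *p)
{
    if (!p)
        return;
    p->~inode_like();
    std::free(p);
}

// Construct in storage the caller already owns (for example a slab object),
// so allocation and construction are separate steps.
inline inode_like *construct_in_place(void *storage, int refcount)
{
    return ::new (storage) inode_like(refcount);
}
```

With these definitions, alloc_inode(gfp_flags_t::atomic, 3) reaches the allocator with the flags attached, and construct_in_place() builds an object in a caller-supplied buffer without allocating at all.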
>>
>> There is no reason you can't do static initialization where possible;
>> even constructors can be evaluated at compile time if they are constexpr.
>>
>> Constructors (and destructors, for modules) in conjunction with gcc's
>> init_priority() extension is also a nice replacement for linker hack
>> tables to invoke intializer functions.
>>
>> >  (3) Exceptions and RTTI. RTTI would bulk the kernel up too much and
>> >      exception handling is limited without it, and since destructors are not
>> >      allowed, you still have to manually clean up after an error.
>>
>> Agreed here, especially since on many platforms exception handling
>> relies on DWARF unwind information.
>
> Let me just add a few things about exceptions and RTTI.
> In the darwin kernel, C++ is used for device drivers and both
> exceptions and RTTI is not used there either. They have been using C++
> for kernel drivers since the early 2000s even.
> You can find out more at https://developer.apple.com/documentation/driverkit .
> There even was a GCC option added an option which would also disable
> RTTI and change the ABI to explicitly for the kernel.
> -fapple-kext/-mkernel (the former is for only loadable modules while
> the latter is for kernel too).
>
> Note even in GCC, we disable exceptions and RTTI while building GCC.
> This is specifically due to not wanting to use them and use other
> methods to do that.
> Note GDB on the other hand used to use setjmp/longjmp for their
> exception handling in C and I think they moved over to using C++
> exceptions which simplified things there. But as far as I know the
> Linux kernel does not use a mechanism like that (I know of copy
> from/to user using HW exceptions/error/interrupt handling but that is
> a special case only).
>
>
>>
>> >  (4) Operator overloading (except in special cases).
>>
>> See the example of inline patching above.
>> But yes, overloading and
>> *especially* operator overloading should be used only with care; this is
>> pretty much true across the board.
>>
>> >  (5) Function overloading (except in special inline cases).
>>
>> I think we might find non-inline cases where it matters, too.
>>
>> >  (6) STL (though some type trait bits are needed to replace __builtins that
>> >      don't exist in g++).
>>
>> Just like there are parts of the C library which is really about the
>> compiler and not part of the library. is part of that for C++.
>
> There is an idea of a free standing C++ library. newer versions of
> GCC/libstdc++ does support that but IIRC can only be configured at
> compile time of GCC.
> type_traits and a few other headers are included in that. I have not
> looked into it fully though.

There is, and it's quite extensive (and I plan on extending it further
in GCC 15, if I get the chance to). The full list of headers libstdc++
exports for freestanding use is a bit larger than the standard one:

https://gcc.gnu.org/cgit/gcc/tree/libstdc++-v3/include/Makefile.am#n28

(note that some are only partially supported; I lack a full list of
which)

Most (actually, nearly all) of the libstdc++ code works for kernel
environments, and it is very mature and well-tested, so it can and
should be used by kernels too.

I haven't fully enabled using it in such a manner yet, but we could
handle the kernel-specific configuration via a multilib or so (so, the
multilib list becomes 32, 64, x32, and a new k64 or so on amd64).
Presumably, something like that could be done for libgcc too?

It is not necessarily only configurable at build-time, but the
libstdc++ configuration augmented by -ffreestanding and the one
generated by a 'proper' freestanding build of libstdc++ currently
differ. Maybe they can be brought together close enough for Linux?
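
As a rough sketch of what code restricted to such headers can look like: the following relies only on <cstddef>, <cstdint> and <type_traits>, all of which are in the standard freestanding set, to make endianness conversion explicit (one of the uses hpa mentions above). The be wrapper and its members are hypothetical names invented for this example, not an existing kernel or libstdc++ interface. <cassert> is pulled in only so the sketch can be exercised on a hosted build; I am not assuming it is available in a freestanding one.

```cpp
#include <cassert>      // hosted-only convenience, used just to exercise the sketch
#include <cstddef>
#include <cstdint>
#include <type_traits>

// Hypothetical wrapper: stores an unsigned integer in big-endian byte order.
// Conversions in both directions have to be spelled out by the caller.
template <typename T>
class be {
    static_assert(std::is_unsigned<T>::value && !std::is_same<T, bool>::value,
                  "be<T> requires an unsigned integer type");

    unsigned char bytes_[sizeof(T)];

public:
    // Explicit: a CPU-order value never silently becomes a big-endian one.
    constexpr explicit be(T value) noexcept : bytes_{}
    {
        for (std::size_t i = 0; i < sizeof(T); ++i)
            bytes_[i] = static_cast<unsigned char>(
                value >> (8 * (sizeof(T) - 1 - i)));
    }

    // The only way back to a CPU-order value.
    constexpr T to_cpu() const noexcept
    {
        T value = 0;
        for (std::size_t i = 0; i < sizeof(T); ++i)
            value = static_cast<T>((value << 8) | bytes_[i]);
        return value;
    }

    // Raw access to the stored representation, most significant byte first.
    constexpr unsigned char byte(std::size_t i) const noexcept
    {
        return bytes_[i];
    }
};

// The type system rejects implicit mixing of the two representations.
static_assert(!std::is_convertible<be<std::uint32_t>, std::uint32_t>::value,
              "big-endian values must be converted explicitly");
static_assert(!std::is_convertible<std::uint32_t, be<std::uint32_t>>::value,
              "CPU-order values must be wrapped explicitly");

// Everything is usable at compile time, so constants cost nothing at run time.
static_assert(be<std::uint32_t>(0x11223344u).byte(0) == 0x11, "MSB is stored first");
static_assert(be<std::uint32_t>(0x11223344u).to_cpu() == 0x11223344u, "round trip");
```

The byte-shifting form is deliberately independent of the host byte order, which keeps the example free of any library call beyond the type traits.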

Managarm, which is the kernel I had in mind when working on getting
more freestanding stuff, has a dedicated kernel build of GCC, however,
so I didn't test this case much. I'd like to, sooner or later,
consolidate it into the normal managarm system GCC, as a multilib, but
I haven't had time to do so yet.

In any case, I strongly prefer configuring toolchains 'properly'.

> Thanks,
> Andrew Pinski
>
>>
>> >  (7) 'class', 'private', 'namespace'.
>> >
>> >  (8) 'virtual'. Don't want virtual base classes, though virtual function
>> >      tables might make operations tables more efficient.
>>
>> Operations tables *are* virtual classes. virtual base classes make sense
>> in a lot of cases, and we de facto use them already.
>>
>> However, Linux also does conversion of polymorphic objects from one type
>> to another -- that is for example how device nodes are implemented.
>> Using this with C++ polymorphism without RTTI does require some
>> compiler-specific hacks, unfortunately.
>>
>> > Issues:
>> >
>> >  (1) Need spaces inserting between strings and symbols.
>>
>> I have to admit I don't really grok this?
>>
>> >  (2) Direct assignment of pointers to/from void* isn't allowed by C++, though
>> >      g++ grudgingly permits it with -fpermissive. I would imagine that a
>> >      compiler option could easily be added to hide the error entirely.
>>
>> Seriously. It should also enforce that it should be a trivial type.
>> Unfortunately it doesn't look like there is a way to create user-defined
>> implicit conversions from one pointer to another (via a helper class),
>> which otherwise would have had some other nice applications.
>>
>> >  (3) Need gcc v8+ to statically initialise an object of any struct that's not
>> >      really simple (e.g. if it's got an embedded union).
>>
>> Worst case: constexpr constructor.
>>
>> >  (4) Symbol length. Really need to extern "C" everything to reduce the size
>> >      of the symbols stored in the kernel image.
>> >      This shouldn't be a problem
>> >      if out-of-line function overloading isn't permitted.
>>
>> This really would lose arguably the absolutely biggest advantage of C++:
>> type-safe linkage. This is the one reason why Linus actually tried to
>> use C++ in one single version of the kernel in the early days (0.99.14,
>> if I remember correctly.) At that time, g++ was nowhere near mature
>> enough, and it got dropped right away.
>>
>>
>> > So far, it gets as far as compiling init/main.c to a .o file.
>>
>> ;)

-- 
Arsen Arsenović