Date: Mon, 20 Nov 2023 21:02:54 -0500
From: "Theodore Ts'o"
To: Linus Torvalds
Cc: Christian Brauner, Gabriel Krisman Bertazi, viro@zeniv.linux.org.uk,
	linux-f2fs-devel@lists.sourceforge.net, ebiggers@kernel.org,
	linux-fsdevel@vger.kernel.org, jaegeuk@kernel.org,
	linux-ext4@vger.kernel.org
Subject: Re: [f2fs-dev] [PATCH v6 0/9] Support negative dentries on
	case-insensitive ext4 and f2fs
Message-ID: <20231121020254.GB291888@mit.edu>
References: <20230816050803.15660-1-krisman@suse.de>
	<20231025-selektiert-leibarzt-5d0070d85d93@brauner>
	<655a9634.630a0220.d50d7.5063SMTPIN_ADDED_BROKEN@mx.google.com>
	<20231120-nihilismus-verehren-f2b932b799e0@brauner>
X-Mailing-List: linux-ext4@vger.kernel.org

On Mon, Nov 20, 2023 at 10:07:51AM -0800, Linus Torvalds wrote:
> Of course, "do it in shared generic code" doesn't tend to really fix
> the braindamage, but at least it's now shared braindamage and not
> spread out all over. I'm looking at things like
> generic_ci_d_compare(), and it hurts to see the mindless "let's do
> lookups and compares one utf8 character at a time". What a disgrace.
> Somebody either *really* didn't care, or was a Unicode person who
> didn't understand the point of UTF-8.

This isn't because of case-folding brain damage, but rather Unicode
brain damage.  We compare one character at a time because it's
possible for some character like é to either be encoded as 0x00E9
(aka "Latin Small Letter E with Acute") OR as 0x0065 0x0301 ("Latin
Small Letter E" plus "Combining Acute Accent").

Typically, we pretend that UTF-8 means that we can just encode é, or
0x00E9, as 0xC3 0xA9 and then call it a day and just use strcmp(3) on
the sucker.  But Unicode is a lot more insane than that.  Technically,
0x65 0xCC 0x81 is the same character as 0xC3 0xA9.

> Oh well. I guess people went "this is going to suck anyway, so let's
> make sure it *really* sucks".

It's more like, "this is going to suck, but if it's going to suck
anyway, let's implement the full Unicode spec in all its
gory^H^H^H^H glory, whether or not it's sane".

					- Ted
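
To make the two byte sequences mentioned above concrete, here is a
minimal userspace sketch in Python.  It is not the kernel's utf8 code
and says nothing about how generic_ci_d_compare() is implemented; the
names PRECOMPOSED, DECOMPOSED, bytes_equal() and canonically_equal()
are made up for illustration.  It only shows that a byte-wise compare
treats the two encodings of é as different, while a compare after
Unicode normalization (NFD is assumed here) treats them as equal.

```python
import unicodedata

# The two UTF-8 encodings of "é" discussed in the message.
PRECOMPOSED = b"\xc3\xa9"   # U+00E9, Latin Small Letter E with Acute
DECOMPOSED = b"e\xcc\x81"   # U+0065 U+0301, i.e. bytes 0x65 0xCC 0x81


def bytes_equal(a: bytes, b: bytes) -> bool:
    """Byte-wise comparison, roughly what strcmp(3) would tell you."""
    return a == b


def canonically_equal(a: bytes, b: bytes) -> bool:
    """Decode UTF-8, normalize both sides to NFD, then compare.

    Hypothetical helper: NFD is an assumption made for this sketch,
    not a statement about which form any filesystem uses.
    """
    na = unicodedata.normalize("NFD", a.decode("utf-8"))
    nb = unicodedata.normalize("NFD", b.decode("utf-8"))
    return na == nb


if __name__ == "__main__":
    print("byte-wise equal:  ", bytes_equal(PRECOMPOSED, DECOMPOSED))
    print("canonically equal:", canonically_equal(PRECOMPOSED, DECOMPOSED))
```

The point of the sketch is only that the comparison has to look at
decoded characters (and their decompositions) rather than raw bytes
once canonical equivalence is part of the requirement.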