Date: Wed, 14 Oct 2020 14:16:16 -0600
From: Jonathan Corbet
To: Nícolas F. R. A. Prado
Cc: Mauro Carvalho Chehab, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, lkcamp@lists.libreplanetbr.org, andrealmeid@collabora.com
Subject: Re: [PATCH v2 2/5] docs: automarkup.py: Fix regexes to solve sphinx 3 warnings
Message-ID: <20201014141616.63082d5d@lwn.net>
Organization: LWN.net
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, 14 Oct 2020 20:09:10 +0000 Nícolas F. R. A.
Prado wrote:

> One I had noted down was:
>
> WARNING: Unparseable C cross-reference: '调用debugfs_rename'
>
> which I believe occurred in the Chinese translation.
>
> I think the problem is that in Chinese there normally isn't space between the
> words, so even if I had made the regexes only match the beginning of the word
> (which I didn't, but I fixed this in this patch with the \b), it would still try
> to cross-reference to that symbol containing Chinese characters, which is
> unparsable to Sphinx.
>
> So since valid identifiers in C are only in ASCII anyway, I used the ASCII flag
> to make \w and \d only match ASCII characters; otherwise they match any Unicode
> character.

OK, this all makes sense, as does your fix.  The one thing I would ask
would be to put that warning into the changelog for future reference.

Thanks,

jon
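
For future reference, the behaviour described in the quoted text can be reproduced with a short Python sketch. The pattern and the helper name `find_function_refs` below are hypothetical, chosen only for illustration; they are not the regexes used in automarkup.py. The sketch shows that `\b` alone does not help when there is no space before the identifier, while `re.ASCII` restricts `\w` (and the word boundary) to ASCII.

```python
import re

# Hypothetical pattern for illustration only (not the automarkup.py regex):
# a word boundary, a run of word characters, then a literal "()".
FUNCTION_REF = r'\b(\w+)\(\)'


def find_function_refs(text, ascii_only):
    """Return the identifiers that the pattern treats as function references."""
    flags = re.ASCII if ascii_only else 0
    return re.findall(FUNCTION_REF, text, flags)


sample = '调用debugfs_rename()'

# By default \w matches any Unicode word character, so the Chinese
# characters are swallowed into the "identifier".
print(find_function_refs(sample, ascii_only=False))

# With re.ASCII, \w matches only [a-zA-Z0-9_], and \b is defined in terms of
# that ASCII \w, so the match starts at the first ASCII character.
print(find_function_refs(sample, ascii_only=True))
```

In the default mode the captured name includes the leading Chinese characters, which is the kind of string Sphinx reports as unparseable; with the ASCII flag only `debugfs_rename` is captured.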