References: <20231205165648.GA391810@dev-arch.thelio-3990X> <20231206012441.840082-1-xujialu@vimux.org>
In-Reply-To: <20231206012441.840082-1-xujialu@vimux.org>
From: Masahiro Yamada
Date: Sun, 10 Dec 2023 14:52:13 +0900
Subject: Re: [PATCH v4] gen_compile_commands.py: fix path resolve with symlinks in it
To: Jialu Xu
Cc: nathan@kernel.org, justinstitt@google.com, linux-kbuild@vger.kernel.org, linux-kernel@vger.kernel.org, llvm@lists.linux.dev, morbo@google.com, ndesaulniers@google.com
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Dec 6, 2023 at 10:26 AM Jialu Xu wrote:
>
> When a path contains relative symbolic links, os.path.abspath() might
> not follow the symlinks and instead return the absolute path with just
> the
> relative paths resolved, resulting in an incorrect path.
>
> 1. Say "drivers/hdf/" has some symlinks:
>
>     # ls -l drivers/hdf/
>     total 364
>     drwxrwxr-x 2 ...   4096 ... evdev
>     lrwxrwxrwx 1 ...     44 ... framework -> ../../../../../../drivers/hdf_core/framework
>     -rw-rw-r-- 1 ... 359010 ... hdf_macro_test.h
>     lrwxrwxrwx 1 ...     55 ... inner_api -> ../../../../../../drivers/hdf_core/interfaces/inner_api
>     lrwxrwxrwx 1 ...     53 ... khdf -> ../../../../../../drivers/hdf_core/adapter/khdf/linux
>     -rw-r--r-- 1 ...     74 ... Makefile
>     drwxrwxr-x 3 ...   4096 ... wifi
>
> 2. One .cmd file records that:
>
>     # head -1 ./framework/core/manager/src/.devmgr_service.o.cmd
>     cmd_drivers/hdf/khdf/manager/../../../../framework/core/manager/src/devmgr_service.o := ... \
>     /path/to/out/drivers/hdf/khdf/manager/../../../../framework/core/manager/src/devmgr_service.c
>
> 3. os.path.abspath returns "/path/to/out/framework/core/manager/src/devmgr_service.c", not correct:
>
>     # ./scripts/clang-tools/gen_compile_commands.py
>     INFO: Could not add line from ./framework/core/manager/src/.devmgr_service.o.cmd: File \
>         /path/to/out/framework/core/manager/src/devmgr_service.c not found
>
> Use pathlib.Path.resolve(), which resolves the symlinks and normalizes
> the paths correctly.
>
>     # cat compile_commands.json
>     ...
>     {
>       "command": ...
>       "directory": ...
>       "file": "/path/to/blabla/drivers/hdf_core/framework/core/manager/src/devmgr_service.c"
>     },
>     ...
>
> Reviewed-by: Nathan Chancellor
> Signed-off-by: Jialu Xu
> ---
>  scripts/clang-tools/gen_compile_commands.py | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/scripts/clang-tools/gen_compile_commands.py b/scripts/clang-tools/gen_compile_commands.py
> index 180952fb91c1b..99e28b7152c19 100755
> --- a/scripts/clang-tools/gen_compile_commands.py
> +++ b/scripts/clang-tools/gen_compile_commands.py
> @@ -11,6 +11,7 @@ import argparse
>  import json
>  import logging
>  import os
> +from pathlib import Path
>  import re
>  import subprocess
>  import sys
> @@ -172,8 +173,9 @@ def process_line(root_directory, command_prefix, file_path):
>      # by Make, so this code replaces the escaped version with '#'.
>      prefix = command_prefix.replace('\#', '#').replace('$(pound)', '#')
>
> -    # Use os.path.abspath() to normalize the path resolving '.' and '..' .
> -    abs_path = os.path.abspath(os.path.join(root_directory, file_path))
> +    # Make the path absolute, resolving all symlinks on the way and also normalizing it.
> +    # Convert Path object to a string because 'PosixPath' is not JSON serializable.
> +    abs_path = str(Path(root_directory, file_path).resolve())
>      if not os.path.exists(abs_path):
>          raise ValueError('File %s not found' % abs_path)
>      return {


Is there any reason why you didn't simply replace
os.path.abspath() with os.path.realpath()?

This patch uses pathlib.Path() just in one place,
leaving many call-sites of os.path.*() functions.

If it is just a matter of your preference, you need to
convert os.path.*() for consistency (as a follow-up patch).

I see one more os.path.abspath()

    return (args.log_level,
            os.path.abspath(args.directory),
            args.output,
            args.ar,
            args.paths if len(args.paths) > 0 else [args.directory])

Does it cause a similar issue for the 'directory' field
with symbolic link jungles?

--
Best Regards
Masahiro Yamada
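
For reference, a minimal sketch of the difference under discussion,
assuming a POSIX system where symlinks can be created without extra
privileges. The directory layout and the compare_normalizers() helper
are made up for illustration; they are not part of
gen_compile_commands.py and only loosely mirror the drivers/hdf example
from the patch description.

```python
import os
import tempfile
from pathlib import Path


def compare_normalizers(base):
    """Create a small symlinked tree under base and normalize one path three ways.

    Hypothetical layout (not taken from the kernel tree):
        base/real/framework/src/file.c      the real source file
        base/real/adapter/linux/manager/    a real directory
        base/out/hdf/khdf -> ../../real/adapter/linux   (relative symlink)
    """
    base = os.path.realpath(base)  # the temp dir itself may sit behind a symlink
    src_dir = os.path.join(base, "real", "framework", "src")
    os.makedirs(src_dir)
    os.makedirs(os.path.join(base, "real", "adapter", "linux", "manager"))
    os.makedirs(os.path.join(base, "out", "hdf"))
    source = os.path.join(src_dir, "file.c")
    Path(source).touch()
    os.symlink(os.path.join("..", "..", "real", "adapter", "linux"),
               os.path.join(base, "out", "hdf", "khdf"))

    # A path that climbs back out with '..' after passing through the symlink.
    recorded = os.path.join(base, "out", "hdf", "khdf", "manager",
                            "..", "..", "..", "framework", "src", "file.c")
    return {
        "expected": source,
        "abspath": os.path.abspath(recorded),        # lexical '..' removal only
        "realpath": os.path.realpath(recorded),      # follows symlinks
        "resolve": str(Path(recorded).resolve()),    # follows symlinks
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for name, value in compare_normalizers(tmp).items():
            print(f"{name:>8}: {value}")
```

On a layout like this, os.path.realpath() and Path.resolve() should
agree with each other, while os.path.abspath() collapses the '..'
components lexically and ends up pointing at a file that does not
exist. The sketch says nothing about whether the 'directory' field is
affected in the same way.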