Received: by 10.223.164.202 with SMTP id h10csp1875154wrb; Mon, 27 Nov 2017 08:37:04 -0800 (PST) X-Google-Smtp-Source: AGs4zMYTRifuRC+biIabNZuUn+gPTOulOxilSSzU48bKfqsPSkhnODR+jGzpkK5+w/PtEelIEH0V X-Received: by 10.101.65.129 with SMTP id a1mr37453047pgq.203.1511800624345; Mon, 27 Nov 2017 08:37:04 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1511800624; cv=none; d=google.com; s=arc-20160816; b=ajF6VHFlsC+B3rc7J0pixc/45LxRMiLdxqudW0Ufp/qoiKPo6E4CH4NsbV6dsfuIse b3xHrqUcV2Q+rFBIx+lzfiPafePTWWqVdUWBTyqKjGEgr1SPH05vGh+JR8HA0VXAWtCA jrUFDvpLI7pV7/91P8iC75qypaWLqTi/v10tasbwVA/e/Kep+ZpWWI9A5foXc5yDffq8 1crqcUg5bNhOqbXQ6QVnIIkCeWTSutSzvtdYBwdK8LrnsxQt8DzCjvL6TGiq+4wjhTfW nLddik2NAJcpY2QOJg2v5XzjpMUJ0BYlAIxDa8expWCS6igEZSpv8mgF9rT0V9vGaJAR rcXg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding :content-language:in-reply-to:mime-version:user-agent:date :message-id:from:references:cc:to:subject:arc-authentication-results; bh=Ca6dDSFzNVTRhK+dmtAHexL8fu9ur5FklNoHeuJV8Sc=; b=cAJ6hYe7Pu1Aj6mD7nFYu4SeheQFwJHz0RxK0CbzzeOnbczWl+MeuviDlhzawakNGd U36BbMWWLFzyRNJ99iJdsJHZ9Q1p6Dm3pPy7UzHgwJpnxC0glPAXB4NgKKydvQTrNWjY S7mz3OYZN00ZkmGI5bKCA1XSFkLUoOCb2d+9hqtqIpfdkbGaa7IIMv9QEyuDJaXY+brT ktINgI+pALTgoUoYHdqm5Omd2xPiVMTE02ZRTzcwn6vSPlFhwoJsMsNXUUOni6QpoOUj P8cjtrKizY7khcaClJ6ol4fMkCGkRHzSFEJLOKqVmdrUhb1OHxy2z4z8oBSD8Mr1owmr 5h1g== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=redhat.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id w69si22281732pfd.289.2017.11.27.08.36.52; Mon, 27 Nov 2017 08:37:04 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753576AbdK0Qec (ORCPT + 78 others); Mon, 27 Nov 2017 11:34:32 -0500 Received: from mx1.redhat.com ([209.132.183.28]:28733 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753533AbdK0Qea (ORCPT ); Mon, 27 Nov 2017 11:34:30 -0500 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 217F3C04AC5A; Mon, 27 Nov 2017 16:34:30 +0000 (UTC) Received: from [10.13.129.233] (dhcp129-233.rdu.redhat.com [10.13.129.233]) by smtp.corp.redhat.com (Postfix) with ESMTP id 3CC735C54A; Mon, 27 Nov 2017 16:34:29 +0000 (UTC) Subject: Re: [PATCH] Use more flexible pattern matching for CPU identification for mapfile.csv To: Ravi Bangoria Cc: Arnaldo Carvalho de Melo , peterz@infradead.org, mingo@redhat.com, mpetlan@redhat.com, alexander.shishkin@linux.intel.com, jolsa@redhat.com, namhyung@kernel.org, sukadev@linux.vnet.ibm.com, maddy@linux.vnet.ibm.com, shriyak@linux.vnet.ibm.com, Linux Kernel Mailing List References: <20171122160849.8400-1-wcohen@redhat.com> <20171122192659.GB2383@kernel.org> <49b4abc7-e1f6-0419-362f-33c8f6818b44@linux.vnet.ibm.com> From: William Cohen Message-ID: <012d38fe-6635-82e8-5273-a4f073f04ffa@redhat.com> Date: Mon, 27 Nov 2017 11:34:28 -0500 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.4.0 MIME-Version: 1.0 In-Reply-To: <49b4abc7-e1f6-0419-362f-33c8f6818b44@linux.vnet.ibm.com> Content-Type: text/plain; charset=utf-8 Content-Language: en-MW Content-Transfer-Encoding: 8bit X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.31]); Mon, 27 Nov 2017 16:34:30 +0000 (UTC) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 11/23/2017 01:06 AM, Ravi Bangoria wrote: > Hi William, > > On 11/23/2017 12:56 AM, Arnaldo Carvalho de Melo wrote: >> Em Wed, Nov 22, 2017 at 11:08:49AM -0500, William Cohen escreveu: >>> The powerpc cpuid information includes chip revision information. >>> Changes between chip revisions are usually minor bug fixes and usually >>> do not affect the operation of the performance monitoring hardware. >>> The original mapfile.csv matching requires enumerating every possible >>> cpuid string.  When a new minor chip revision is produced a new entry >>> has to be added to the mapfile.csv and the code recompiled to allow >>> perf to have the implementation specific perf events for this new >>> minor revision.  For users of various distibutions of Linux having to >>> wait for a new release of the kernel's perf tool to be built with >>> these trivial patches is inconvenient. >>> >>> Using regular expressions rather than exactly string matching of the >>> entire cpuid string allows developers to write mapfile.csv files that >>> do not require patches and recompiles for each of these minor version >>> changes.  If special cases need to be made for some particular >>> versions, they can be placed earlier in the mapfile.csv file before >>> the more general matches. > > This is a good idea. I think the patch can be optimized a bit. > Please see my comments. Hi Ravi, Thanks for the comments. I made a some changes based on the comments and will be sending v2 shortly. > >> Looks ok, but you forgot to add lkml to the CC list, I'm doing that now >> and also adding Ravi Bangoria @ IBM, which sent yet another of those >> lines that would match your regexp. >> >> Ravi & IBM crew, please see if we can proceed with Will's regexp approach, ok? >> >> - Arnaldo >> >>> Signed-off-by: William Cohen >>> --- >>>   tools/perf/pmu-events/arch/powerpc/mapfile.csv | 14 +++------ >>>   tools/perf/pmu-events/arch/x86/mapfile.csv     |  5 +--- >>>   tools/perf/pmu-events/jevents.c                | 39 +++++++++++++++++++++++++- >>>   tools/perf/util/pmu.c                          | 20 ++++++++++++- >>>   4 files changed, 62 insertions(+), 16 deletions(-) >>> >>> diff --git a/tools/perf/pmu-events/arch/powerpc/mapfile.csv b/tools/perf/pmu-events/arch/powerpc/mapfile.csv >>> index a0f3a11ca19f..9ad0469057b8 100644 >>> --- a/tools/perf/pmu-events/arch/powerpc/mapfile.csv >>> +++ b/tools/perf/pmu-events/arch/powerpc/mapfile.csv >>> @@ -13,13 +13,7 @@ >>>   # >>>     # Power8 entries >>> -004b0000,1,power8,core >>> -004b0201,1,power8,core >>> -004c0000,1,power8,core >>> -004d0000,1,power8,core >>> -004d0100,1,power8,core >>> -004d0200,1,power8,core >>> -004c0100,1,power8,core >>> -004e0100,1,power9,core >>> -004e0200,1,power9,core >>> -004e1200,1,power9,core >>> +004b[[:xdigit:]]\+,1,power8,core >>> +004c[[:xdigit:]]\+,1,power8,core >>> +004d[[:xdigit:]]\+,1,power8,core >>> +004e[[:xdigit:]]\+,1,power9,core > > Instead of 3 regex for Power8, can we compact it into only 1: > 004[bcd][[:xdigit:]]..> > Both Power8 and Power9 revision number are 16 bits so, > instead of arbitrary number of xdigit, I think, we can limit > it to 4:   [[:xdigit:]]{4} revised the power mapfile.cvs based on the comment above. > > AFAIK, you _have to_ use REG_EXTENDED in both regcomp() > and regexec() to make it work. The missing REG_EXTENDED was an oversight. Corrrected in v2 patch. > >>> diff --git a/tools/perf/pmu-events/arch/x86/mapfile.csv b/tools/perf/pmu-events/arch/x86/mapfile.csv >>> index fe1a2c47cabf..93656f2fd53a 100644 >>> --- a/tools/perf/pmu-events/arch/x86/mapfile.csv >>> +++ b/tools/perf/pmu-events/arch/x86/mapfile.csv >>> @@ -23,10 +23,7 @@ GenuineIntel-6-1E,v2,nehalemep,core >>>   GenuineIntel-6-1F,v2,nehalemep,core >>>   GenuineIntel-6-1A,v2,nehalemep,core >>>   GenuineIntel-6-2E,v2,nehalemex,core >>> -GenuineIntel-6-4E,v24,skylake,core >>> -GenuineIntel-6-5E,v24,skylake,core >>> -GenuineIntel-6-8E,v24,skylake,core >>> -GenuineIntel-6-9E,v24,skylake,core >>> +GenuineIntel-6-[4589]E,v24,skylake,core > >  Ok, so all entries for powerpc are regex, but its not the case with x86? The change is the Intel mapfile.csv was mainly for testing purposes, so I could check that the regular expression logic was working on the x86 test machine. Changing the x86 mapfile.cvs isn't as big a win as for the power. The version information is separate for the x86 processors. There are some entries that can be combined like the skylake, but many are just a single entry. > >>>   GenuineIntel-6-37,v13,silvermont,core >>>   GenuineIntel-6-4D,v13,silvermont,core >>>   GenuineIntel-6-4C,v13,silvermont,core >>> diff --git a/tools/perf/pmu-events/jevents.c b/tools/perf/pmu-events/jevents.c >>> index 9eb7047bafe4..4102664ac1bf 100644 >>> --- a/tools/perf/pmu-events/jevents.c >>> +++ b/tools/perf/pmu-events/jevents.c >>> @@ -116,6 +116,43 @@ static void fixdesc(char *s) >>>           *e = 0; >>>   } >>>   +/* Add escapes for '\' so they are proper C strings. */ >>> +static char *fixregex(char *s) >>> +{ >>> +    int len = 0; >>> +    int esc_count = 0; >>> +    char *fixed = NULL; >>> +    char *p, *q; >>> + >>> +    /* Count the number of '\' in string */ >>> +    for (p = s; *p; p++) { >>> +        ++len; >>> +        if (*p == '\\') >>> +            ++esc_count; >>> +    } >>> + >>> +    if (esc_count == 0) >>> +        return s; >>> + >>> +    /* allocate space for a new string */ >>> +    fixed = (char *) malloc(len+1); >>> +    if (!fixed) >>> +        return NULL; >>> + >>> +    /* copy over the characters */ >>> +    q = fixed; >>> +    for (p = s; *p; p++) { >>> +        if (*p == '\\') { >>> +            *q = '\\'; >>> +            ++q; >>> +        } >>> +        *q = *p; >>> +        ++q; >>> +    } >>> +    *q = '\0'; >>> +    return fixed; >>> +} >>> + > > AFAICS, you are processing cpuid to add one more '\' if > you find '\' in regex. Can't you directly use '\\' in mapfile? > Something like: > > 004b[[:xdigit:]]\\+,1,power8,core > > Then you don't need this function at all. Initially during development fixregex didn't exist. However, it was added in the interest of making the code easier to use. The regular expressions in the mapfile.csv are plain regular expressions, not regular expressions where one needs to add '\' to each of the '\'. > >>>   static struct msrmap { >>>       const char *num; >>>       const char *pname; >>> @@ -648,7 +685,7 @@ static int process_mapfile(FILE *outfp, char *fpath) >>>           } >>>           line[strlen(line)-1] = '\0'; >>>   -        cpuid = strtok_r(p, ",", &save); >>> +        cpuid = fixregex(strtok_r(p, ",", &save)); >>>           version = strtok_r(NULL, ",", &save); >>>           fname = strtok_r(NULL, ",", &save); >>>           type = strtok_r(NULL, ",", &save); >>> diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c >>> index 07cb2ac041d7..56942b748d87 100644 >>> --- a/tools/perf/util/pmu.c >>> +++ b/tools/perf/util/pmu.c >>> @@ -12,6 +12,7 @@ >>>   #include >>>   #include >>>   #include >>> +#include >>>   #include "util.h" >>>   #include "pmu.h" >>>   #include "parse-events.h" >>> @@ -570,14 +571,31 @@ struct pmu_events_map *perf_pmu__find_map(void) >>>         i = 0; >>>       for (;;) { >>> +        regex_t re; >>> +        regmatch_t pmatch[1]; >>> +        int match; >>> + >>>           map = &pmu_events_map[i++]; >>>           if (!map->table) { >>>               map = NULL; >>>               break; >>>           } >>>   -        if (!strcmp(map->cpuid, cpuid)) >>> +        if (regcomp(&re, map->cpuid, 0) != 0) { >>> +            /* Warn unable to generate match particular string. */ >>> +            pr_info("Invalid regular expression %s\n", map->cpuid); >>>               break; >>> +        } >>> + >>> +        match = !regexec(&re, cpuid, 1, pmatch, REG_EXTENDED); >>> +        regfree(&re); >>> +        if (match) { >>> +            size_t match_len = (pmatch[0].rm_eo - pmatch[0].rm_so); >>> + >>> +            /* Verify the entire string matched. */ > > I think, if you can add '$' at the end of regex, you don't need this > check. The check for entire cpuid being matched is a design decision to avoid the chance of accidentally doing a partial match. This could certainly be the case with an incorrect regex for set of hex digits like power. It seemed preferable to people putting a '^' and '$' on each entry in the mapfile.csv files. > > So, f.e. for powerpc, final content of the file may look like: > > 004[bcd][[:xdigit:]]{4}$,1,power8,core > 004e[[:xdigit:]]{4}$,1,power9,core > > Though, I haven't gone through x86 entries. Please make sure > my comments works for all archs before changing the patch. The revised patch has been tested on power9 and a x86 skylake machine. Currently, those are the only archtectures that use mapfile.csv files. Thanks, -Will > > Thanks, > Ravi > >>> +            if (match_len == strlen(cpuid)) >>> +                break; >>> +        } >>>       } >>>       free(cpuid); >>>       return map; >>> --  >>> 2.13.6 > From 1584835851253454109@xxx Thu Nov 23 06:07:27 +0000 2017 X-GM-THRID: 1584795644195311981 X-Gmail-Labels: Inbox,Category Forums,HistoricalUnread