2016-11-09 11:32:45

by He Chen

[permalink] [raw]
Subject: [PATCH v5 0/3] cpuid: Support AVX512_4VNNIW and AVX512_4FMAPS for KVM guest

This patch series is going to add two new AVX512 features to KVM guest.
Since these two features are defined as scattered features in kernel,
some extra modification in kernel is included.

---
Changes in v5:
* divide the whole patchset into 3 parts.
* refine commit messages.

Changes in v4:
* divide patch into 2 parts, including modification in scattered.c and
support new AVX512 instructions for KVM.
* coding style.
* refine commit message.

Changes in v3:
* add a helper in scattered.c to get scattered leaf.

Changes in v2:
* add new macros for new AVX512 scattered features.
* add a cpuid_count_edx function to processor.h

He Chen (3):
cpuid: cleanup cpuid_regs definitions
cpuid: Add a helper in scattered.c to return cpuid
cpuid: add AVX512_4VNNIW and AVX512_4FMAPS instructions support

arch/x86/events/intel/pt.c | 45 ++++++++++++++-----------------
arch/x86/include/asm/processor.h | 14 ++++++++++
arch/x86/kernel/cpu/scattered.c | 57 ++++++++++++++++++++++++++--------------
arch/x86/kernel/cpuid.c | 4 ---
arch/x86/kvm/cpuid.c | 14 +++++++++-
5 files changed, 84 insertions(+), 50 deletions(-)

--
2.7.4


2016-11-09 11:06:33

by He Chen

[permalink] [raw]
Subject: [PATCH v5 2/3] cpuid: Add a helper in scattered.c to return cpuid

Some sparse CPUID leafs are gathered in a fake leaf to save size of
x86_capability array in current code, but sometimes, kernel or other
modules (e.g. KVM cpuid enumeration) may need actual hardware leaf
information.

This patch adds a helper get_scattered_cpuid_leaf() to rebuild actual
CPUID leaf, and it can be called outside by modules.

Signed-off-by: He Chen <[email protected]>
---
arch/x86/include/asm/processor.h | 3 +++
arch/x86/kernel/cpu/scattered.c | 49 ++++++++++++++++++++++++++++++----------
2 files changed, 40 insertions(+), 12 deletions(-)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 8f6ac5b..e7f8c62 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -189,6 +189,9 @@ extern void identify_secondary_cpu(struct cpuinfo_x86 *);
extern void print_cpu_info(struct cpuinfo_x86 *);
void print_cpu_msr(struct cpuinfo_x86 *);
extern void init_scattered_cpuid_features(struct cpuinfo_x86 *c);
+extern u32 get_scattered_cpuid_leaf(unsigned int level,
+ unsigned int sub_leaf,
+ enum cpuid_regs_idx reg);
extern unsigned int init_intel_cacheinfo(struct cpuinfo_x86 *c);
extern void init_amd_cacheinfo(struct cpuinfo_x86 *c);

diff --git a/arch/x86/kernel/cpu/scattered.c b/arch/x86/kernel/cpu/scattered.c
index 5dbdd0b..d1316f9 100644
--- a/arch/x86/kernel/cpu/scattered.c
+++ b/arch/x86/kernel/cpu/scattered.c
@@ -17,24 +17,25 @@ struct cpuid_bit {
u32 sub_leaf;
};

+/* Please keep the leaf sorted by cpuid_bit.level for faster search. */
+static const struct cpuid_bit cpuid_bits[] = {
+ { X86_FEATURE_APERFMPERF, CPUID_ECX, 0, 0x00000006, 0 },
+ { X86_FEATURE_EPB, CPUID_ECX, 3, 0x00000006, 0 },
+ { X86_FEATURE_INTEL_PT, CPUID_EBX, 25, 0x00000007, 0 },
+ { X86_FEATURE_AVX512_4VNNIW, CPUID_EDX, 2, 0x00000007, 0 },
+ { X86_FEATURE_AVX512_4FMAPS, CPUID_EDX, 3, 0x00000007, 0 },
+ { X86_FEATURE_HW_PSTATE, CPUID_EDX, 7, 0x80000007, 0 },
+ { X86_FEATURE_CPB, CPUID_EDX, 9, 0x80000007, 0 },
+ { X86_FEATURE_PROC_FEEDBACK, CPUID_EDX, 11, 0x80000007, 0 },
+ { 0, 0, 0, 0, 0 }
+};
+
void init_scattered_cpuid_features(struct cpuinfo_x86 *c)
{
u32 max_level;
u32 regs[4];
const struct cpuid_bit *cb;

- static const struct cpuid_bit cpuid_bits[] = {
- { X86_FEATURE_INTEL_PT, CPUID_EBX,25, 0x00000007, 0 },
- { X86_FEATURE_AVX512_4VNNIW, CPUID_EDX, 2, 0x00000007, 0 },
- { X86_FEATURE_AVX512_4FMAPS, CPUID_EDX, 3, 0x00000007, 0 },
- { X86_FEATURE_APERFMPERF, CPUID_ECX, 0, 0x00000006, 0 },
- { X86_FEATURE_EPB, CPUID_ECX, 3, 0x00000006, 0 },
- { X86_FEATURE_HW_PSTATE, CPUID_EDX, 7, 0x80000007, 0 },
- { X86_FEATURE_CPB, CPUID_EDX, 9, 0x80000007, 0 },
- { X86_FEATURE_PROC_FEEDBACK, CPUID_EDX,11, 0x80000007, 0 },
- { 0, 0, 0, 0, 0 }
- };
-
for (cb = cpuid_bits; cb->feature; cb++) {

/* Verify that the level is valid */
@@ -51,3 +52,27 @@ void init_scattered_cpuid_features(struct cpuinfo_x86 *c)
set_cpu_cap(c, cb->feature);
}
}
+
+u32 get_scattered_cpuid_leaf(unsigned int level, unsigned int sub_leaf,
+ enum cpuid_regs_idx reg)
+{
+ const struct cpuid_bit *cb;
+ u32 cpuid_val = 0;
+
+ for (cb = cpuid_bits; cb->feature; cb++) {
+
+ if (level > cb->level)
+ continue;
+
+ if (level < cb->level)
+ break;
+
+ if (reg == cb->reg && sub_leaf == cb->sub_leaf) {
+ if (cpu_has(&boot_cpu_data, cb->feature))
+ cpuid_val |= BIT(cb->bit);
+ }
+ }
+
+ return cpuid_val;
+}
+EXPORT_SYMBOL_GPL(get_scattered_cpuid_leaf);
--
2.7.4

2016-11-09 11:31:10

by He Chen

[permalink] [raw]
Subject: [PATCH v5 3/3] cpuid: add AVX512_4VNNIW and AVX512_4FMAPS instructions support

Add two new AVX512 instructions support for KVM guest.

AVX512_4VNNIW:
Vector instructions for deep learning enhanced word variable precision.

AVX512_4FMAPS:
Vector instructions for deep learning floating-point single precision.

Signed-off-by: Luwei Kang <[email protected]>
Signed-off-by: He Chen <[email protected]>
---
arch/x86/kvm/cpuid.c | 14 +++++++++++++-
1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index afa7bbb..ddcdf7c 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -16,6 +16,7 @@
#include <linux/export.h>
#include <linux/vmalloc.h>
#include <linux/uaccess.h>
+#include <asm/processor.h>
#include <asm/fpu/internal.h> /* For use_eager_fpu. Ugh! */
#include <asm/user.h>
#include <asm/fpu/xstate.h>
@@ -65,6 +66,11 @@ u64 kvm_supported_xcr0(void)

#define F(x) bit(X86_FEATURE_##x)

+/* These are scattered features in cpufeatures.h. */
+#define KVM_CPUID_BIT_AVX512_4VNNIW 2
+#define KVM_CPUID_BIT_AVX512_4FMAPS 3
+#define KF(x) bit(KVM_CPUID_BIT_##x)
+
int kvm_update_cpuid(struct kvm_vcpu *vcpu)
{
struct kvm_cpuid_entry2 *best;
@@ -376,6 +382,10 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
/* cpuid 7.0.ecx*/
const u32 kvm_cpuid_7_0_ecx_x86_features = F(PKU) | 0 /*OSPKE*/;

+ /* cpuid 7.0.edx*/
+ const u32 kvm_cpuid_7_0_edx_x86_features =
+ KF(AVX512_4VNNIW) | KF(AVX512_4FMAPS);
+
/* all calls to cpuid_count() should be made on the same cpu */
get_cpu();

@@ -458,12 +468,14 @@ static inline int __do_cpuid_ent(struct kvm_cpuid_entry2 *entry, u32 function,
/* PKU is not yet implemented for shadow paging. */
if (!tdp_enabled)
entry->ecx &= ~F(PKU);
+ entry->edx &= kvm_cpuid_7_0_edx_x86_features;
+ entry->edx &= get_scattered_cpuid_leaf(7, 0, CPUID_EDX);
} else {
entry->ebx = 0;
entry->ecx = 0;
+ entry->edx = 0;
}
entry->eax = 0;
- entry->edx = 0;
break;
}
case 9:
--
2.7.4

2016-11-09 11:31:43

by He Chen

[permalink] [raw]
Subject: [PATCH v5 1/3] cpuid: cleanup cpuid_regs definitions

make cpuid_regs more clear and avoid potential name clash.

Signed-off-by: He Chen <[email protected]>
---
arch/x86/events/intel/pt.c | 45 +++++++++++++++++-----------------------
arch/x86/include/asm/processor.h | 11 ++++++++++
arch/x86/kernel/cpu/scattered.c | 28 ++++++++++---------------
arch/x86/kernel/cpuid.c | 4 ----
4 files changed, 41 insertions(+), 47 deletions(-)

diff --git a/arch/x86/events/intel/pt.c b/arch/x86/events/intel/pt.c
index c5047b8..1c1b9fe 100644
--- a/arch/x86/events/intel/pt.c
+++ b/arch/x86/events/intel/pt.c
@@ -36,13 +36,6 @@ static DEFINE_PER_CPU(struct pt, pt_ctx);

static struct pt_pmu pt_pmu;

-enum cpuid_regs {
- CR_EAX = 0,
- CR_ECX,
- CR_EDX,
- CR_EBX
-};
-
/*
* Capabilities of Intel PT hardware, such as number of address bits or
* supported output schemes, are cached and exported to userspace as "caps"
@@ -64,21 +57,21 @@ static struct pt_cap_desc {
u8 reg;
u32 mask;
} pt_caps[] = {
- PT_CAP(max_subleaf, 0, CR_EAX, 0xffffffff),
- PT_CAP(cr3_filtering, 0, CR_EBX, BIT(0)),
- PT_CAP(psb_cyc, 0, CR_EBX, BIT(1)),
- PT_CAP(ip_filtering, 0, CR_EBX, BIT(2)),
- PT_CAP(mtc, 0, CR_EBX, BIT(3)),
- PT_CAP(ptwrite, 0, CR_EBX, BIT(4)),
- PT_CAP(power_event_trace, 0, CR_EBX, BIT(5)),
- PT_CAP(topa_output, 0, CR_ECX, BIT(0)),
- PT_CAP(topa_multiple_entries, 0, CR_ECX, BIT(1)),
- PT_CAP(single_range_output, 0, CR_ECX, BIT(2)),
- PT_CAP(payloads_lip, 0, CR_ECX, BIT(31)),
- PT_CAP(num_address_ranges, 1, CR_EAX, 0x3),
- PT_CAP(mtc_periods, 1, CR_EAX, 0xffff0000),
- PT_CAP(cycle_thresholds, 1, CR_EBX, 0xffff),
- PT_CAP(psb_periods, 1, CR_EBX, 0xffff0000),
+ PT_CAP(max_subleaf, 0, CPUID_EAX, 0xffffffff),
+ PT_CAP(cr3_filtering, 0, CPUID_EBX, BIT(0)),
+ PT_CAP(psb_cyc, 0, CPUID_EBX, BIT(1)),
+ PT_CAP(ip_filtering, 0, CPUID_EBX, BIT(2)),
+ PT_CAP(mtc, 0, CPUID_EBX, BIT(3)),
+ PT_CAP(ptwrite, 0, CPUID_EBX, BIT(4)),
+ PT_CAP(power_event_trace, 0, CPUID_EBX, BIT(5)),
+ PT_CAP(topa_output, 0, CPUID_ECX, BIT(0)),
+ PT_CAP(topa_multiple_entries, 0, CPUID_ECX, BIT(1)),
+ PT_CAP(single_range_output, 0, CPUID_ECX, BIT(2)),
+ PT_CAP(payloads_lip, 0, CPUID_ECX, BIT(31)),
+ PT_CAP(num_address_ranges, 1, CPUID_EAX, 0x3),
+ PT_CAP(mtc_periods, 1, CPUID_EAX, 0xffff0000),
+ PT_CAP(cycle_thresholds, 1, CPUID_EBX, 0xffff),
+ PT_CAP(psb_periods, 1, CPUID_EBX, 0xffff0000),
};

static u32 pt_cap_get(enum pt_capabilities cap)
@@ -213,10 +206,10 @@ static int __init pt_pmu_hw_init(void)

for (i = 0; i < PT_CPUID_LEAVES; i++) {
cpuid_count(20, i,
- &pt_pmu.caps[CR_EAX + i*PT_CPUID_REGS_NUM],
- &pt_pmu.caps[CR_EBX + i*PT_CPUID_REGS_NUM],
- &pt_pmu.caps[CR_ECX + i*PT_CPUID_REGS_NUM],
- &pt_pmu.caps[CR_EDX + i*PT_CPUID_REGS_NUM]);
+ &pt_pmu.caps[CPUID_EAX + i*PT_CPUID_REGS_NUM],
+ &pt_pmu.caps[CPUID_EBX + i*PT_CPUID_REGS_NUM],
+ &pt_pmu.caps[CPUID_ECX + i*PT_CPUID_REGS_NUM],
+ &pt_pmu.caps[CPUID_EDX + i*PT_CPUID_REGS_NUM]);
}

ret = -ENOMEM;
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 984a7bf..8f6ac5b 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -137,6 +137,17 @@ struct cpuinfo_x86 {
u32 microcode;
};

+struct cpuid_regs {
+ u32 eax, ebx, ecx, edx;
+};
+
+enum cpuid_regs_idx {
+ CPUID_EAX = 0,
+ CPUID_EBX,
+ CPUID_ECX,
+ CPUID_EDX,
+};
+
#define X86_VENDOR_INTEL 0
#define X86_VENDOR_CYRIX 1
#define X86_VENDOR_AMD 2
diff --git a/arch/x86/kernel/cpu/scattered.c b/arch/x86/kernel/cpu/scattered.c
index 1db8dc4..5dbdd0b 100644
--- a/arch/x86/kernel/cpu/scattered.c
+++ b/arch/x86/kernel/cpu/scattered.c
@@ -17,13 +17,6 @@ struct cpuid_bit {
u32 sub_leaf;
};

-enum cpuid_regs {
- CR_EAX = 0,
- CR_ECX,
- CR_EDX,
- CR_EBX
-};
-
void init_scattered_cpuid_features(struct cpuinfo_x86 *c)
{
u32 max_level;
@@ -31,14 +24,14 @@ void init_scattered_cpuid_features(struct cpuinfo_x86 *c)
const struct cpuid_bit *cb;

static const struct cpuid_bit cpuid_bits[] = {
- { X86_FEATURE_INTEL_PT, CR_EBX,25, 0x00000007, 0 },
- { X86_FEATURE_AVX512_4VNNIW, CR_EDX, 2, 0x00000007, 0 },
- { X86_FEATURE_AVX512_4FMAPS, CR_EDX, 3, 0x00000007, 0 },
- { X86_FEATURE_APERFMPERF, CR_ECX, 0, 0x00000006, 0 },
- { X86_FEATURE_EPB, CR_ECX, 3, 0x00000006, 0 },
- { X86_FEATURE_HW_PSTATE, CR_EDX, 7, 0x80000007, 0 },
- { X86_FEATURE_CPB, CR_EDX, 9, 0x80000007, 0 },
- { X86_FEATURE_PROC_FEEDBACK, CR_EDX,11, 0x80000007, 0 },
+ { X86_FEATURE_INTEL_PT, CPUID_EBX,25, 0x00000007, 0 },
+ { X86_FEATURE_AVX512_4VNNIW, CPUID_EDX, 2, 0x00000007, 0 },
+ { X86_FEATURE_AVX512_4FMAPS, CPUID_EDX, 3, 0x00000007, 0 },
+ { X86_FEATURE_APERFMPERF, CPUID_ECX, 0, 0x00000006, 0 },
+ { X86_FEATURE_EPB, CPUID_ECX, 3, 0x00000006, 0 },
+ { X86_FEATURE_HW_PSTATE, CPUID_EDX, 7, 0x80000007, 0 },
+ { X86_FEATURE_CPB, CPUID_EDX, 9, 0x80000007, 0 },
+ { X86_FEATURE_PROC_FEEDBACK, CPUID_EDX,11, 0x80000007, 0 },
{ 0, 0, 0, 0, 0 }
};

@@ -50,8 +43,9 @@ void init_scattered_cpuid_features(struct cpuinfo_x86 *c)
max_level > (cb->level | 0xffff))
continue;

- cpuid_count(cb->level, cb->sub_leaf, &regs[CR_EAX],
- &regs[CR_EBX], &regs[CR_ECX], &regs[CR_EDX]);
+ cpuid_count(cb->level, cb->sub_leaf, &regs[CPUID_EAX],
+ &regs[CPUID_EBX], &regs[CPUID_ECX],
+ &regs[CPUID_EDX]);

if (regs[cb->reg] & (1 << cb->bit))
set_cpu_cap(c, cb->feature);
diff --git a/arch/x86/kernel/cpuid.c b/arch/x86/kernel/cpuid.c
index 2836de3..9095c80 100644
--- a/arch/x86/kernel/cpuid.c
+++ b/arch/x86/kernel/cpuid.c
@@ -46,10 +46,6 @@

static struct class *cpuid_class;

-struct cpuid_regs {
- u32 eax, ebx, ecx, edx;
-};
-
static void cpuid_smp_cpuid(void *cmd_block)
{
struct cpuid_regs *cmd = (struct cpuid_regs *)cmd_block;
--
2.7.4

2016-11-10 10:45:06

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH v5 1/3] cpuid: cleanup cpuid_regs definitions

On Wed, Nov 09, 2016 at 07:05:26PM +0800, He Chen wrote:
> make cpuid_regs more clear and avoid potential name clash.
>
> Signed-off-by: He Chen <[email protected]>

Btw, please name your subjects like this:

x86/cpuid: Cleanup ...

Also the name should be a complete sentence starting with a capital
letter.

The last patch, the 3/3 should be called something like "x86/kvm: ..."
or so.

...

> @@ -31,14 +24,14 @@ void init_scattered_cpuid_features(struct cpuinfo_x86 *c)
> const struct cpuid_bit *cb;
>
> static const struct cpuid_bit cpuid_bits[] = {
> - { X86_FEATURE_INTEL_PT, CR_EBX,25, 0x00000007, 0 },
> - { X86_FEATURE_AVX512_4VNNIW, CR_EDX, 2, 0x00000007, 0 },
> - { X86_FEATURE_AVX512_4FMAPS, CR_EDX, 3, 0x00000007, 0 },
> - { X86_FEATURE_APERFMPERF, CR_ECX, 0, 0x00000006, 0 },
> - { X86_FEATURE_EPB, CR_ECX, 3, 0x00000006, 0 },
> - { X86_FEATURE_HW_PSTATE, CR_EDX, 7, 0x80000007, 0 },
> - { X86_FEATURE_CPB, CR_EDX, 9, 0x80000007, 0 },
> - { X86_FEATURE_PROC_FEEDBACK, CR_EDX,11, 0x80000007, 0 },
> + { X86_FEATURE_INTEL_PT, CPUID_EBX,25, 0x00000007, 0 },
> + { X86_FEATURE_AVX512_4VNNIW, CPUID_EDX, 2, 0x00000007, 0 },
> + { X86_FEATURE_AVX512_4FMAPS, CPUID_EDX, 3, 0x00000007, 0 },
> + { X86_FEATURE_APERFMPERF, CPUID_ECX, 0, 0x00000006, 0 },
> + { X86_FEATURE_EPB, CPUID_ECX, 3, 0x00000006, 0 },
> + { X86_FEATURE_HW_PSTATE, CPUID_EDX, 7, 0x80000007, 0 },
> + { X86_FEATURE_CPB, CPUID_EDX, 9, 0x80000007, 0 },
> + { X86_FEATURE_PROC_FEEDBACK, CPUID_EDX,11, 0x80000007, 0 },

ERROR: space required after that ',' (ctx:VxV)
#150: FILE: arch/x86/kernel/cpu/scattered.c:27:
+ { X86_FEATURE_INTEL_PT, CPUID_EBX,25, 0x00000007, 0 },
^

ERROR: space required after that ',' (ctx:VxV)
#157: FILE: arch/x86/kernel/cpu/scattered.c:34:
+ { X86_FEATURE_PROC_FEEDBACK, CPUID_EDX,11, 0x80000007, 0 },
^

--
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

2016-11-10 11:00:57

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH v5 1/3] cpuid: cleanup cpuid_regs definitions

On Thu, Nov 10, 2016 at 11:45:01AM +0100, Borislav Petkov wrote:
> ERROR: space required after that ',' (ctx:VxV)
> #150: FILE: arch/x86/kernel/cpu/scattered.c:27:
> + { X86_FEATURE_INTEL_PT, CPUID_EBX,25, 0x00000007, 0 },
> ^
>
> ERROR: space required after that ',' (ctx:VxV)
> #157: FILE: arch/x86/kernel/cpu/scattered.c:34:
> + { X86_FEATURE_PROC_FEEDBACK, CPUID_EDX,11, 0x80000007, 0 },

Ah, ignore that one. I see you're fixing them in the next patch.

--
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

2016-11-10 11:08:46

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH v5 2/3] cpuid: Add a helper in scattered.c to return cpuid

On Wed, Nov 09, 2016 at 07:05:27PM +0800, He Chen wrote:
> Some sparse CPUID leafs are gathered in a fake leaf to save size of
> x86_capability array in current code, but sometimes, kernel or other
> modules (e.g. KVM cpuid enumeration) may need actual hardware leaf

s/cpuid/CPUID/

> information.
>
> This patch adds a helper get_scattered_cpuid_leaf() to rebuild actual
> CPUID leaf, and it can be called outside by modules.
>
> Signed-off-by: He Chen <[email protected]>

Modulo $Subject and typo above, the code looks ok to me:

Reviewed-by: Borislav Petkov <[email protected]>

--
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

2016-11-10 11:19:20

by Borislav Petkov

[permalink] [raw]
Subject: Re: [PATCH v5 3/3] cpuid: add AVX512_4VNNIW and AVX512_4FMAPS instructions support

On Wed, Nov 09, 2016 at 07:05:28PM +0800, He Chen wrote:
> Add two new AVX512 instructions support for KVM guest.

I think you mean "two new AVX512 subfeatures support". An instruction
support would be actual emulation of the instructions or so...

>
> AVX512_4VNNIW:
> Vector instructions for deep learning enhanced word variable precision.
>
> AVX512_4FMAPS:
> Vector instructions for deep learning floating-point single precision.
>
> Signed-off-by: Luwei Kang <[email protected]>
> Signed-off-by: He Chen <[email protected]>
> ---
> arch/x86/kvm/cpuid.c | 14 +++++++++++++-
> 1 file changed, 13 insertions(+), 1 deletion(-)

Rest looks ok:

Reviewed-by: Borislav Petkov <[email protected]>

--
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.