2016-04-21 13:05:30

by Paul Burton

[permalink] [raw]
Subject: [PATCH 00/11] FP emulation fixes

This series fixes up some issues with FPU emulation, ranging from
missing instructions to outright backwards branches to a non-zero zero
register. Some cleanups are made along the way, reducing unnecessary
duplication of code.

There are still issues around the R6 maddf & msubf instructions, which
currently appear to have been formed by duplicating the existing
multiply & addition code then gluing it together & thus don't implement
true fused multiply-addition, but those fixes need more testing so will
come later.

This series applies atop v4.6-rc4.

Paul Burton (11):
MIPS: math-emu: Fix BC1{EQ,NE}Z emulation
MIPS: Fix BC1{EQ,NE}Z return offset calculation
MIPS: inst: Declare fsel_op for sel.fmt instruction
MIPS: math-emu: Emulate MIPSr6 sel.fmt instruction
MIPS: math-emu: Unify ieee754sp_m{add,sub}f
MIPS: math-emu: Unify ieee754dp_m{add,sub}f
MIPS: math-emu: Add z argument macros
MIPS: math-emu: Fix bit-width in ieee754dp_{mul,maddf,msubf} comments
MIPS: math-emu: Fix code indentation
MIPS: math-emu: Fix m{add,sub}.s shifts
MIPS: math-emu: Fix jalr emulation when rd == $0

arch/mips/include/uapi/asm/inst.h | 1 +
arch/mips/kernel/branch.c | 18 +--
arch/mips/math-emu/Makefile | 4 +-
arch/mips/math-emu/cp1emu.c | 45 +++++--
arch/mips/math-emu/dp_maddf.c | 35 +++--
arch/mips/math-emu/dp_msubf.c | 269 --------------------------------------
arch/mips/math-emu/dp_mul.c | 4 +-
arch/mips/math-emu/ieee754dp.h | 1 +
arch/mips/math-emu/ieee754int.h | 10 ++
arch/mips/math-emu/ieee754sp.c | 3 +-
arch/mips/math-emu/ieee754sp.h | 17 ++-
arch/mips/math-emu/sp_add.c | 6 +-
arch/mips/math-emu/sp_maddf.c | 43 ++++--
arch/mips/math-emu/sp_msubf.c | 258 ------------------------------------
arch/mips/math-emu/sp_sub.c | 6 +-
15 files changed, 130 insertions(+), 590 deletions(-)
delete mode 100644 arch/mips/math-emu/dp_msubf.c
delete mode 100644 arch/mips/math-emu/sp_msubf.c

--
2.8.0


2016-04-21 13:05:45

by Paul Burton

[permalink] [raw]
Subject: [PATCH 01/11] MIPS: math-emu: Fix BC1{EQ,NE}Z emulation

The conditions for branching when emulating the BC1EQZ & BC1NEZ
instructions were backwards, leading to each of those instructions being
treated as the other. Fix this by reversing the conditions, and clear up
the code a little for readability & checkpatch.

Fixes: c909ca718e8f ("MIPS: math-emu: Emulate missing BC1{EQ,NE}Z instructions")
Signed-off-by: Paul Burton <[email protected]>
Reviewed-by: James Hogan <[email protected]>
---

arch/mips/math-emu/cp1emu.c | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/arch/mips/math-emu/cp1emu.c b/arch/mips/math-emu/cp1emu.c
index cdfd44f..99977c3 100644
--- a/arch/mips/math-emu/cp1emu.c
+++ b/arch/mips/math-emu/cp1emu.c
@@ -973,9 +973,10 @@ static int cop1Emulate(struct pt_regs *xcp, struct mips_fpu_struct *ctx,
struct mm_decoded_insn dec_insn, void *__user *fault_addr)
{
unsigned long contpc = xcp->cp0_epc + dec_insn.pc_inc;
- unsigned int cond, cbit;
+ unsigned int cond, cbit, bit0;
mips_instruction ir;
int likely, pc_inc;
+ union fpureg *fpr;
u32 __user *wva;
u64 __user *dva;
u32 wval;
@@ -1187,14 +1188,14 @@ emul:
return SIGILL;

cond = likely = 0;
+ fpr = &current->thread.fpu.fpr[MIPSInst_RT(ir)];
+ bit0 = get_fpr32(fpr, 0) & 0x1;
switch (MIPSInst_RS(ir)) {
case bc1eqz_op:
- if (get_fpr32(&current->thread.fpu.fpr[MIPSInst_RT(ir)], 0) & 0x1)
- cond = 1;
+ cond = bit0 == 0;
break;
case bc1nez_op:
- if (!(get_fpr32(&current->thread.fpu.fpr[MIPSInst_RT(ir)], 0) & 0x1))
- cond = 1;
+ cond = bit0 != 0;
break;
}
goto branch_common;
--
2.8.0

2016-04-21 13:05:59

by Paul Burton

[permalink] [raw]
Subject: [PATCH 02/11] MIPS: Fix BC1{EQ,NE}Z return offset calculation

The conditions for branching when emulating the BC1EQZ & BC1NEZ
instructions were backwards, leading to each of those instructions being
treated as the other. Fix this by reversing the conditions, and clear up
the code a little for readability & checkpatch.

Fixes: c8a34581ec09 ("MIPS: Emulate the BC1{EQ,NE}Z FPU instructions")
Signed-off-by: Paul Burton <[email protected]>
Reviewed-by: James Hogan <[email protected]>
---

arch/mips/kernel/branch.c | 18 +++---------------
1 file changed, 3 insertions(+), 15 deletions(-)

diff --git a/arch/mips/kernel/branch.c b/arch/mips/kernel/branch.c
index d8f9b35..ceca6cc 100644
--- a/arch/mips/kernel/branch.c
+++ b/arch/mips/kernel/branch.c
@@ -688,21 +688,9 @@ int __compute_return_epc_for_insn(struct pt_regs *regs,
}
lose_fpu(1); /* Save FPU state for the emulator. */
reg = insn.i_format.rt;
- bit = 0;
- switch (insn.i_format.rs) {
- case bc1eqz_op:
- /* Test bit 0 */
- if (get_fpr32(&current->thread.fpu.fpr[reg], 0)
- & 0x1)
- bit = 1;
- break;
- case bc1nez_op:
- /* Test bit 0 */
- if (!(get_fpr32(&current->thread.fpu.fpr[reg], 0)
- & 0x1))
- bit = 1;
- break;
- }
+ bit = get_fpr32(&current->thread.fpu.fpr[reg], 0) & 0x1;
+ if (insn.i_format.rs == bc1eqz_op)
+ bit = !bit;
own_fpu(1);
if (bit)
epc = epc + 4 +
--
2.8.0

2016-04-21 13:06:14

by Paul Burton

[permalink] [raw]
Subject: [PATCH 03/11] MIPS: inst: Declare fsel_op for sel.fmt instruction

Declare the opcode for the MIPSr6 sel.fmt instruction, as fsel_op in
order to match other FP op names.

Signed-off-by: Paul Burton <[email protected]>
---

arch/mips/include/uapi/asm/inst.h | 1 +
1 file changed, 1 insertion(+)

diff --git a/arch/mips/include/uapi/asm/inst.h b/arch/mips/include/uapi/asm/inst.h
index ddea53e..28f4151 100644
--- a/arch/mips/include/uapi/asm/inst.h
+++ b/arch/mips/include/uapi/asm/inst.h
@@ -167,6 +167,7 @@ enum cop1_sdw_func {
fceill_op = 0x0a, ffloorl_op = 0x0b,
fround_op = 0x0c, ftrunc_op = 0x0d,
fceil_op = 0x0e, ffloor_op = 0x0f,
+ fsel_op = 0x10,
fmovc_op = 0x11, fmovz_op = 0x12,
fmovn_op = 0x13, fseleqz_op = 0x14,
frecip_op = 0x15, frsqrt_op = 0x16,
--
2.8.0

2016-04-21 13:06:29

by Paul Burton

[permalink] [raw]
Subject: [PATCH 04/11] MIPS: math-emu: Emulate MIPSr6 sel.fmt instruction

Add support for emulating the MIPSr6 sel.fmt instruction, which was
previously missing from the FPU emulation code. This instruction selects
its result from 2 possible source registers, based upon bit 0 of the
destination register, and is valid only for S (single) & D (double) data
types.

Signed-off-by: Paul Burton <[email protected]>
---

arch/mips/math-emu/cp1emu.c | 26 ++++++++++++++++++++++++--
1 file changed, 24 insertions(+), 2 deletions(-)

diff --git a/arch/mips/math-emu/cp1emu.c b/arch/mips/math-emu/cp1emu.c
index 99977c3..85dd174 100644
--- a/arch/mips/math-emu/cp1emu.c
+++ b/arch/mips/math-emu/cp1emu.c
@@ -1675,7 +1675,7 @@ static int fpu_emu(struct pt_regs *xcp, struct mips_fpu_struct *ctx,
union ieee754sp(*b) (union ieee754sp, union ieee754sp);
union ieee754sp(*u) (union ieee754sp);
} handler;
- union ieee754sp fs, ft;
+ union ieee754sp fd, fs, ft;

switch (MIPSInst_FUNC(ir)) {
/* binary ops */
@@ -1946,6 +1946,17 @@ copcsr:
rfmt = w_fmt;
goto copcsr;

+ case fsel_op:
+ if (!cpu_has_mips_r6)
+ return SIGILL;
+
+ SPFROMREG(fd, MIPSInst_FD(ir));
+ if (fd.bits & 0x1)
+ SPFROMREG(rv.s, MIPSInst_FT(ir));
+ else
+ SPFROMREG(rv.s, MIPSInst_FS(ir));
+ break;
+
case fcvtl_op:
if (!cpu_has_mips_3_4_5_64_r2_r6)
return SIGILL;
@@ -1994,7 +2005,7 @@ copcsr:
}

case d_fmt: {
- union ieee754dp fs, ft;
+ union ieee754dp fd, fs, ft;
union {
union ieee754dp(*b) (union ieee754dp, union ieee754dp);
union ieee754dp(*u) (union ieee754dp);
@@ -2244,6 +2255,17 @@ dcopuop:
rfmt = w_fmt;
goto copcsr;

+ case fsel_op:
+ if (!cpu_has_mips_r6)
+ return SIGILL;
+
+ DPFROMREG(fd, MIPSInst_FD(ir));
+ if (fd.bits & 0x1)
+ DPFROMREG(rv.d, MIPSInst_FT(ir));
+ else
+ DPFROMREG(rv.d, MIPSInst_FS(ir));
+ break;
+
case fcvtl_op:
if (!cpu_has_mips_3_4_5_64_r2_r6)
return SIGILL;
--
2.8.0

2016-04-21 13:06:44

by Paul Burton

[permalink] [raw]
Subject: [PATCH 05/11] MIPS: math-emu: Unify ieee754sp_m{add,sub}f

The code for emulating MIPSr6 madd.s & msub.s instructions has
previously been implemented as 2 different functions, namely
ieee754sp_maddf & ieee754sp_msubf. The difference in behaviour of these
2 instructions is merely the sign of the product, so we can easily share
the code implementing them. Do this for the single precision variant,
removing the original ieee754sp_msubf in favor of reusing the code from
ieee754sp_maddf.

Signed-off-by: Paul Burton <[email protected]>
---

arch/mips/math-emu/Makefile | 2 +-
arch/mips/math-emu/sp_maddf.c | 22 +++-
arch/mips/math-emu/sp_msubf.c | 258 ------------------------------------------
3 files changed, 21 insertions(+), 261 deletions(-)
delete mode 100644 arch/mips/math-emu/sp_msubf.c

diff --git a/arch/mips/math-emu/Makefile b/arch/mips/math-emu/Makefile
index a19641d..3389aff 100644
--- a/arch/mips/math-emu/Makefile
+++ b/arch/mips/math-emu/Makefile
@@ -6,7 +6,7 @@ obj-y += cp1emu.o ieee754dp.o ieee754sp.o ieee754.o \
dp_div.o dp_mul.o dp_sub.o dp_add.o dp_fsp.o dp_cmp.o dp_simple.o \
dp_tint.o dp_fint.o dp_maddf.o dp_msubf.o dp_2008class.o dp_fmin.o dp_fmax.o \
sp_div.o sp_mul.o sp_sub.o sp_add.o sp_fdp.o sp_cmp.o sp_simple.o \
- sp_tint.o sp_fint.o sp_maddf.o sp_msubf.o sp_2008class.o sp_fmin.o sp_fmax.o \
+ sp_tint.o sp_fint.o sp_maddf.o sp_2008class.o sp_fmin.o sp_fmax.o \
dsemul.o

lib-y += ieee754d.o \
diff --git a/arch/mips/math-emu/sp_maddf.c b/arch/mips/math-emu/sp_maddf.c
index dd1dd83..93b7132 100644
--- a/arch/mips/math-emu/sp_maddf.c
+++ b/arch/mips/math-emu/sp_maddf.c
@@ -14,8 +14,12 @@

#include "ieee754sp.h"

-union ieee754sp ieee754sp_maddf(union ieee754sp z, union ieee754sp x,
- union ieee754sp y)
+enum maddf_flags {
+ maddf_negate_product = 1 << 0,
+};
+
+static union ieee754sp _sp_maddf(union ieee754sp z, union ieee754sp x,
+ union ieee754sp y, enum maddf_flags flags)
{
int re;
int rs;
@@ -154,6 +158,8 @@ union ieee754sp ieee754sp_maddf(union ieee754sp z, union ieee754sp x,

re = xe + ye;
rs = xs ^ ys;
+ if (flags & maddf_negate_product)
+ rs ^= 1;

/* shunt to top of word */
xm <<= 32 - (SP_FBITS + 1);
@@ -253,3 +259,15 @@ union ieee754sp ieee754sp_maddf(union ieee754sp z, union ieee754sp x,
}
return ieee754sp_format(zs, ze, zm);
}
+
+union ieee754sp ieee754sp_maddf(union ieee754sp z, union ieee754sp x,
+ union ieee754sp y)
+{
+ return _sp_maddf(z, x, y, 0);
+}
+
+union ieee754sp ieee754sp_msubf(union ieee754sp z, union ieee754sp x,
+ union ieee754sp y)
+{
+ return _sp_maddf(z, x, y, maddf_negate_product);
+}
diff --git a/arch/mips/math-emu/sp_msubf.c b/arch/mips/math-emu/sp_msubf.c
deleted file mode 100644
index 81c38b980..0000000
--- a/arch/mips/math-emu/sp_msubf.c
+++ /dev/null
@@ -1,258 +0,0 @@
-/*
- * IEEE754 floating point arithmetic
- * single precision: MSUB.f (Fused Multiply Subtract)
- * MSUBF.fmt: FPR[fd] = FPR[fd] - (FPR[fs] x FPR[ft])
- *
- * MIPS floating point support
- * Copyright (C) 2015 Imagination Technologies, Ltd.
- * Author: Markos Chandras <[email protected]>
- *
- * This program is free software; you can distribute it and/or modify it
- * under the terms of the GNU General Public License as published by the
- * Free Software Foundation; version 2 of the License.
- */
-
-#include "ieee754sp.h"
-
-union ieee754sp ieee754sp_msubf(union ieee754sp z, union ieee754sp x,
- union ieee754sp y)
-{
- int re;
- int rs;
- unsigned rm;
- unsigned short lxm;
- unsigned short hxm;
- unsigned short lym;
- unsigned short hym;
- unsigned lrm;
- unsigned hrm;
- unsigned t;
- unsigned at;
- int s;
-
- COMPXSP;
- COMPYSP;
- u32 zm; int ze; int zs __maybe_unused; int zc;
-
- EXPLODEXSP;
- EXPLODEYSP;
- EXPLODESP(z, zc, zs, ze, zm)
-
- FLUSHXSP;
- FLUSHYSP;
- FLUSHSP(z, zc, zs, ze, zm);
-
- ieee754_clearcx();
-
- switch (zc) {
- case IEEE754_CLASS_SNAN:
- ieee754_setcx(IEEE754_INVALID_OPERATION);
- return ieee754sp_nanxcpt(z);
- case IEEE754_CLASS_DNORM:
- SPDNORMx(zm, ze);
- /* QNAN is handled separately below */
- }
-
- switch (CLPAIR(xc, yc)) {
- case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_SNAN):
- case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_SNAN):
- case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_SNAN):
- case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_SNAN):
- case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_SNAN):
- return ieee754sp_nanxcpt(y);
-
- case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_SNAN):
- case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_QNAN):
- case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_ZERO):
- case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_NORM):
- case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_DNORM):
- case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_INF):
- return ieee754sp_nanxcpt(x);
-
- case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_QNAN):
- case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_QNAN):
- case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_QNAN):
- case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_QNAN):
- return y;
-
- case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_QNAN):
- case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_ZERO):
- case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_NORM):
- case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_DNORM):
- case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_INF):
- return x;
-
- /*
- * Infinity handling
- */
- case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_ZERO):
- case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_INF):
- if (zc == IEEE754_CLASS_QNAN)
- return z;
- ieee754_setcx(IEEE754_INVALID_OPERATION);
- return ieee754sp_indef();
-
- case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_INF):
- case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_INF):
- case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_NORM):
- case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_DNORM):
- case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_INF):
- if (zc == IEEE754_CLASS_QNAN)
- return z;
- return ieee754sp_inf(xs ^ ys);
-
- case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_ZERO):
- case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_NORM):
- case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_DNORM):
- case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_ZERO):
- case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_ZERO):
- if (zc == IEEE754_CLASS_INF)
- return ieee754sp_inf(zs);
- /* Multiplication is 0 so just return z */
- return z;
-
- case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_DNORM):
- SPDNORMX;
-
- case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_DNORM):
- if (zc == IEEE754_CLASS_QNAN)
- return z;
- else if (zc == IEEE754_CLASS_INF)
- return ieee754sp_inf(zs);
- SPDNORMY;
- break;
-
- case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_NORM):
- if (zc == IEEE754_CLASS_QNAN)
- return z;
- else if (zc == IEEE754_CLASS_INF)
- return ieee754sp_inf(zs);
- SPDNORMX;
- break;
-
- case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_NORM):
- if (zc == IEEE754_CLASS_QNAN)
- return z;
- else if (zc == IEEE754_CLASS_INF)
- return ieee754sp_inf(zs);
- /* fall through to real compuation */
- }
-
- /* Finally get to do some computation */
-
- /*
- * Do the multiplication bit first
- *
- * rm = xm * ym, re = xe + ye basically
- *
- * At this point xm and ym should have been normalized.
- */
-
- /* rm = xm * ym, re = xe+ye basically */
- assert(xm & SP_HIDDEN_BIT);
- assert(ym & SP_HIDDEN_BIT);
-
- re = xe + ye;
- rs = xs ^ ys;
-
- /* shunt to top of word */
- xm <<= 32 - (SP_FBITS + 1);
- ym <<= 32 - (SP_FBITS + 1);
-
- /*
- * Multiply 32 bits xm, ym to give high 32 bits rm with stickness.
- */
- lxm = xm & 0xffff;
- hxm = xm >> 16;
- lym = ym & 0xffff;
- hym = ym >> 16;
-
- lrm = lxm * lym; /* 16 * 16 => 32 */
- hrm = hxm * hym; /* 16 * 16 => 32 */
-
- t = lxm * hym; /* 16 * 16 => 32 */
- at = lrm + (t << 16);
- hrm += at < lrm;
- lrm = at;
- hrm = hrm + (t >> 16);
-
- t = hxm * lym; /* 16 * 16 => 32 */
- at = lrm + (t << 16);
- hrm += at < lrm;
- lrm = at;
- hrm = hrm + (t >> 16);
-
- rm = hrm | (lrm != 0);
-
- /*
- * Sticky shift down to normal rounding precision.
- */
- if ((int) rm < 0) {
- rm = (rm >> (32 - (SP_FBITS + 1 + 3))) |
- ((rm << (SP_FBITS + 1 + 3)) != 0);
- re++;
- } else {
- rm = (rm >> (32 - (SP_FBITS + 1 + 3 + 1))) |
- ((rm << (SP_FBITS + 1 + 3 + 1)) != 0);
- }
- assert(rm & (SP_HIDDEN_BIT << 3));
-
- /* And now the subtraction */
-
- /* Flip sign of r and handle as add */
- rs ^= 1;
-
- assert(zm & SP_HIDDEN_BIT);
-
- /*
- * Provide guard,round and stick bit space.
- */
- zm <<= 3;
-
- if (ze > re) {
- /*
- * Have to shift y fraction right to align.
- */
- s = ze - re;
- SPXSRSYn(s);
- } else if (re > ze) {
- /*
- * Have to shift x fraction right to align.
- */
- s = re - ze;
- SPXSRSYn(s);
- }
- assert(ze == re);
- assert(ze <= SP_EMAX);
-
- if (zs == rs) {
- /*
- * Generate 28 bit result of adding two 27 bit numbers
- * leaving result in zm, zs and ze.
- */
- zm = zm + rm;
-
- if (zm >> (SP_FBITS + 1 + 3)) { /* carry out */
- SPXSRSX1(); /* shift preserving sticky */
- }
- } else {
- if (zm >= rm) {
- zm = zm - rm;
- } else {
- zm = rm - zm;
- zs = rs;
- }
- if (zm == 0)
- return ieee754sp_zero(ieee754_csr.rm == FPU_CSR_RD);
-
- /*
- * Normalize in extended single precision
- */
- while ((zm >> (SP_MBITS + 3)) == 0) {
- zm <<= 1;
- ze--;
- }
-
- }
- return ieee754sp_format(zs, ze, zm);
-}
--
2.8.0

2016-04-21 13:06:59

by Paul Burton

[permalink] [raw]
Subject: [PATCH 06/11] MIPS: math-emu: Unify ieee754dp_m{add,sub}f

The code for emulating MIPSr6 madd.d & msub.d instructions has
previously been implemented as 2 different functions, namely
ieee754dp_maddf & ieee754dp_msubf. The difference in behaviour of these
2 instructions is merely the sign of the product, so we can easily share
the code implementing them. Do this for the double precision variant,
removing the original ieee754dp_msubf in favor of reusing the code from
ieee754dp_maddf.

Signed-off-by: Paul Burton <[email protected]>
---

arch/mips/math-emu/Makefile | 2 +-
arch/mips/math-emu/dp_maddf.c | 22 +++-
arch/mips/math-emu/dp_msubf.c | 269 ------------------------------------------
3 files changed, 21 insertions(+), 272 deletions(-)
delete mode 100644 arch/mips/math-emu/dp_msubf.c

diff --git a/arch/mips/math-emu/Makefile b/arch/mips/math-emu/Makefile
index 3389aff..e9bbc2a 100644
--- a/arch/mips/math-emu/Makefile
+++ b/arch/mips/math-emu/Makefile
@@ -4,7 +4,7 @@

obj-y += cp1emu.o ieee754dp.o ieee754sp.o ieee754.o \
dp_div.o dp_mul.o dp_sub.o dp_add.o dp_fsp.o dp_cmp.o dp_simple.o \
- dp_tint.o dp_fint.o dp_maddf.o dp_msubf.o dp_2008class.o dp_fmin.o dp_fmax.o \
+ dp_tint.o dp_fint.o dp_maddf.o dp_2008class.o dp_fmin.o dp_fmax.o \
sp_div.o sp_mul.o sp_sub.o sp_add.o sp_fdp.o sp_cmp.o sp_simple.o \
sp_tint.o sp_fint.o sp_maddf.o sp_2008class.o sp_fmin.o sp_fmax.o \
dsemul.o
diff --git a/arch/mips/math-emu/dp_maddf.c b/arch/mips/math-emu/dp_maddf.c
index 119eda9..d5e0fb1 100644
--- a/arch/mips/math-emu/dp_maddf.c
+++ b/arch/mips/math-emu/dp_maddf.c
@@ -14,8 +14,12 @@

#include "ieee754dp.h"

-union ieee754dp ieee754dp_maddf(union ieee754dp z, union ieee754dp x,
- union ieee754dp y)
+enum maddf_flags {
+ maddf_negate_product = 1 << 0,
+};
+
+static union ieee754dp _dp_maddf(union ieee754dp z, union ieee754dp x,
+ union ieee754dp y, enum maddf_flags flags)
{
int re;
int rs;
@@ -154,6 +158,8 @@ union ieee754dp ieee754dp_maddf(union ieee754dp z, union ieee754dp x,

re = xe + ye;
rs = xs ^ ys;
+ if (flags & maddf_negate_product)
+ rs ^= 1;

/* shunt to top of word */
xm <<= 64 - (DP_FBITS + 1);
@@ -263,3 +269,15 @@ union ieee754dp ieee754dp_maddf(union ieee754dp z, union ieee754dp x,

return ieee754dp_format(zs, ze, zm);
}
+
+union ieee754dp ieee754dp_maddf(union ieee754dp z, union ieee754dp x,
+ union ieee754dp y)
+{
+ return _dp_maddf(z, x, y, 0);
+}
+
+union ieee754dp ieee754dp_msubf(union ieee754dp z, union ieee754dp x,
+ union ieee754dp y)
+{
+ return _dp_maddf(z, x, y, maddf_negate_product);
+}
diff --git a/arch/mips/math-emu/dp_msubf.c b/arch/mips/math-emu/dp_msubf.c
deleted file mode 100644
index 1224126..0000000
--- a/arch/mips/math-emu/dp_msubf.c
+++ /dev/null
@@ -1,269 +0,0 @@
-/*
- * IEEE754 floating point arithmetic
- * double precision: MSUB.f (Fused Multiply Subtract)
- * MSUBF.fmt: FPR[fd] = FPR[fd] - (FPR[fs] x FPR[ft])
- *
- * MIPS floating point support
- * Copyright (C) 2015 Imagination Technologies, Ltd.
- * Author: Markos Chandras <[email protected]>
- *
- * This program is free software; you can distribute it and/or modify it
- * under the terms of the GNU General Public License as published by the
- * Free Software Foundation; version 2 of the License.
- */
-
-#include "ieee754dp.h"
-
-union ieee754dp ieee754dp_msubf(union ieee754dp z, union ieee754dp x,
- union ieee754dp y)
-{
- int re;
- int rs;
- u64 rm;
- unsigned lxm;
- unsigned hxm;
- unsigned lym;
- unsigned hym;
- u64 lrm;
- u64 hrm;
- u64 t;
- u64 at;
- int s;
-
- COMPXDP;
- COMPYDP;
-
- u64 zm; int ze; int zs __maybe_unused; int zc;
-
- EXPLODEXDP;
- EXPLODEYDP;
- EXPLODEDP(z, zc, zs, ze, zm)
-
- FLUSHXDP;
- FLUSHYDP;
- FLUSHDP(z, zc, zs, ze, zm);
-
- ieee754_clearcx();
-
- switch (zc) {
- case IEEE754_CLASS_SNAN:
- ieee754_setcx(IEEE754_INVALID_OPERATION);
- return ieee754dp_nanxcpt(z);
- case IEEE754_CLASS_DNORM:
- DPDNORMx(zm, ze);
- /* QNAN is handled separately below */
- }
-
- switch (CLPAIR(xc, yc)) {
- case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_SNAN):
- case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_SNAN):
- case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_SNAN):
- case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_SNAN):
- case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_SNAN):
- return ieee754dp_nanxcpt(y);
-
- case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_SNAN):
- case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_QNAN):
- case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_ZERO):
- case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_NORM):
- case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_DNORM):
- case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_INF):
- return ieee754dp_nanxcpt(x);
-
- case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_QNAN):
- case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_QNAN):
- case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_QNAN):
- case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_QNAN):
- return y;
-
- case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_QNAN):
- case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_ZERO):
- case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_NORM):
- case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_DNORM):
- case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_INF):
- return x;
-
-
- /*
- * Infinity handling
- */
- case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_ZERO):
- case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_INF):
- if (zc == IEEE754_CLASS_QNAN)
- return z;
- ieee754_setcx(IEEE754_INVALID_OPERATION);
- return ieee754dp_indef();
-
- case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_INF):
- case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_INF):
- case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_NORM):
- case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_DNORM):
- case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_INF):
- if (zc == IEEE754_CLASS_QNAN)
- return z;
- return ieee754dp_inf(xs ^ ys);
-
- case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_ZERO):
- case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_NORM):
- case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_DNORM):
- case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_ZERO):
- case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_ZERO):
- if (zc == IEEE754_CLASS_INF)
- return ieee754dp_inf(zs);
- /* Multiplication is 0 so just return z */
- return z;
-
- case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_DNORM):
- DPDNORMX;
-
- case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_DNORM):
- if (zc == IEEE754_CLASS_QNAN)
- return z;
- else if (zc == IEEE754_CLASS_INF)
- return ieee754dp_inf(zs);
- DPDNORMY;
- break;
-
- case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_NORM):
- if (zc == IEEE754_CLASS_QNAN)
- return z;
- else if (zc == IEEE754_CLASS_INF)
- return ieee754dp_inf(zs);
- DPDNORMX;
- break;
-
- case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_NORM):
- if (zc == IEEE754_CLASS_QNAN)
- return z;
- else if (zc == IEEE754_CLASS_INF)
- return ieee754dp_inf(zs);
- /* fall through to real computations */
- }
-
- /* Finally get to do some computation */
-
- /*
- * Do the multiplication bit first
- *
- * rm = xm * ym, re = xe + ye basically
- *
- * At this point xm and ym should have been normalized.
- */
- assert(xm & DP_HIDDEN_BIT);
- assert(ym & DP_HIDDEN_BIT);
-
- re = xe + ye;
- rs = xs ^ ys;
-
- /* shunt to top of word */
- xm <<= 64 - (DP_FBITS + 1);
- ym <<= 64 - (DP_FBITS + 1);
-
- /*
- * Multiply 32 bits xm, ym to give high 32 bits rm with stickness.
- */
-
- /* 32 * 32 => 64 */
-#define DPXMULT(x, y) ((u64)(x) * (u64)y)
-
- lxm = xm;
- hxm = xm >> 32;
- lym = ym;
- hym = ym >> 32;
-
- lrm = DPXMULT(lxm, lym);
- hrm = DPXMULT(hxm, hym);
-
- t = DPXMULT(lxm, hym);
-
- at = lrm + (t << 32);
- hrm += at < lrm;
- lrm = at;
-
- hrm = hrm + (t >> 32);
-
- t = DPXMULT(hxm, lym);
-
- at = lrm + (t << 32);
- hrm += at < lrm;
- lrm = at;
-
- hrm = hrm + (t >> 32);
-
- rm = hrm | (lrm != 0);
-
- /*
- * Sticky shift down to normal rounding precision.
- */
- if ((s64) rm < 0) {
- rm = (rm >> (64 - (DP_FBITS + 1 + 3))) |
- ((rm << (DP_FBITS + 1 + 3)) != 0);
- re++;
- } else {
- rm = (rm >> (64 - (DP_FBITS + 1 + 3 + 1))) |
- ((rm << (DP_FBITS + 1 + 3 + 1)) != 0);
- }
- assert(rm & (DP_HIDDEN_BIT << 3));
-
- /* And now the subtraction */
-
- /* flip sign of r and handle as add */
- rs ^= 1;
-
- assert(zm & DP_HIDDEN_BIT);
-
- /*
- * Provide guard,round and stick bit space.
- */
- zm <<= 3;
-
- if (ze > re) {
- /*
- * Have to shift y fraction right to align.
- */
- s = ze - re;
- rm = XDPSRS(rm, s);
- re += s;
- } else if (re > ze) {
- /*
- * Have to shift x fraction right to align.
- */
- s = re - ze;
- zm = XDPSRS(zm, s);
- ze += s;
- }
- assert(ze == re);
- assert(ze <= DP_EMAX);
-
- if (zs == rs) {
- /*
- * Generate 28 bit result of adding two 27 bit numbers
- * leaving result in xm, xs and xe.
- */
- zm = zm + rm;
-
- if (zm >> (DP_FBITS + 1 + 3)) { /* carry out */
- zm = XDPSRS1(zm);
- ze++;
- }
- } else {
- if (zm >= rm) {
- zm = zm - rm;
- } else {
- zm = rm - zm;
- zs = rs;
- }
- if (zm == 0)
- return ieee754dp_zero(ieee754_csr.rm == FPU_CSR_RD);
-
- /*
- * Normalize to rounding precision.
- */
- while ((zm >> (DP_FBITS + 3)) == 0) {
- zm <<= 1;
- ze--;
- }
- }
-
- return ieee754dp_format(zs, ze, zm);
-}
--
2.8.0

2016-04-21 13:07:13

by Paul Burton

[permalink] [raw]
Subject: [PATCH 07/11] MIPS: math-emu: Add z argument macros

Introduce macros for handling the "z" argument to maddf & msubf, making
its handling consistent with that of the "x" & "y" arguments rather than
open-coding equivalents.

Signed-off-by: Paul Burton <[email protected]>
---

arch/mips/math-emu/dp_maddf.c | 9 ++++-----
arch/mips/math-emu/ieee754dp.h | 1 +
arch/mips/math-emu/ieee754int.h | 10 ++++++++++
arch/mips/math-emu/ieee754sp.h | 1 +
arch/mips/math-emu/sp_maddf.c | 8 ++++----
5 files changed, 20 insertions(+), 9 deletions(-)

diff --git a/arch/mips/math-emu/dp_maddf.c b/arch/mips/math-emu/dp_maddf.c
index d5e0fb1..0e1d4d8 100644
--- a/arch/mips/math-emu/dp_maddf.c
+++ b/arch/mips/math-emu/dp_maddf.c
@@ -36,16 +36,15 @@ static union ieee754dp _dp_maddf(union ieee754dp z, union ieee754dp x,

COMPXDP;
COMPYDP;
-
- u64 zm; int ze; int zs __maybe_unused; int zc;
+ COMPZDP;

EXPLODEXDP;
EXPLODEYDP;
- EXPLODEDP(z, zc, zs, ze, zm)
+ EXPLODEZDP;

FLUSHXDP;
FLUSHYDP;
- FLUSHDP(z, zc, zs, ze, zm);
+ FLUSHZDP;

ieee754_clearcx();

@@ -54,7 +53,7 @@ static union ieee754dp _dp_maddf(union ieee754dp z, union ieee754dp x,
ieee754_setcx(IEEE754_INVALID_OPERATION);
return ieee754dp_nanxcpt(z);
case IEEE754_CLASS_DNORM:
- DPDNORMx(zm, ze);
+ DPDNORMZ;
/* QNAN is handled separately below */
}

diff --git a/arch/mips/math-emu/ieee754dp.h b/arch/mips/math-emu/ieee754dp.h
index e2babd9..9ba0230 100644
--- a/arch/mips/math-emu/ieee754dp.h
+++ b/arch/mips/math-emu/ieee754dp.h
@@ -60,6 +60,7 @@ static inline int ieee754dp_finite(union ieee754dp x)
while ((m >> DP_FBITS) == 0) { m <<= 1; e--; }
#define DPDNORMX DPDNORMx(xm, xe)
#define DPDNORMY DPDNORMx(ym, ye)
+#define DPDNORMZ DPDNORMx(zm, ze)

static inline union ieee754dp builddp(int s, int bx, u64 m)
{
diff --git a/arch/mips/math-emu/ieee754int.h b/arch/mips/math-emu/ieee754int.h
index ed7bb27..8bc2f69 100644
--- a/arch/mips/math-emu/ieee754int.h
+++ b/arch/mips/math-emu/ieee754int.h
@@ -55,6 +55,9 @@ static inline int ieee754_class_nan(int xc)
#define COMPYSP \
unsigned ym; int ye; int ys; int yc

+#define COMPZSP \
+ unsigned zm; int ze; int zs; int zc
+
#define EXPLODESP(v, vc, vs, ve, vm) \
{ \
vs = SPSIGN(v); \
@@ -81,6 +84,7 @@ static inline int ieee754_class_nan(int xc)
}
#define EXPLODEXSP EXPLODESP(x, xc, xs, xe, xm)
#define EXPLODEYSP EXPLODESP(y, yc, ys, ye, ym)
+#define EXPLODEZSP EXPLODESP(z, zc, zs, ze, zm)


#define COMPXDP \
@@ -89,6 +93,9 @@ static inline int ieee754_class_nan(int xc)
#define COMPYDP \
u64 ym; int ye; int ys; int yc

+#define COMPZDP \
+ u64 zm; int ze; int zs; int zc
+
#define EXPLODEDP(v, vc, vs, ve, vm) \
{ \
vm = DPMANT(v); \
@@ -115,6 +122,7 @@ static inline int ieee754_class_nan(int xc)
}
#define EXPLODEXDP EXPLODEDP(x, xc, xs, xe, xm)
#define EXPLODEYDP EXPLODEDP(y, yc, ys, ye, ym)
+#define EXPLODEZDP EXPLODEDP(z, zc, zs, ze, zm)

#define FLUSHDP(v, vc, vs, ve, vm) \
if (vc==IEEE754_CLASS_DNORM) { \
@@ -140,7 +148,9 @@ static inline int ieee754_class_nan(int xc)

#define FLUSHXDP FLUSHDP(x, xc, xs, xe, xm)
#define FLUSHYDP FLUSHDP(y, yc, ys, ye, ym)
+#define FLUSHZDP FLUSHDP(z, zc, zs, ze, zm)
#define FLUSHXSP FLUSHSP(x, xc, xs, xe, xm)
#define FLUSHYSP FLUSHSP(y, yc, ys, ye, ym)
+#define FLUSHZSP FLUSHSP(z, zc, zs, ze, zm)

#endif /* __IEEE754INT_H */
diff --git a/arch/mips/math-emu/ieee754sp.h b/arch/mips/math-emu/ieee754sp.h
index 374a3f0..b24fdff 100644
--- a/arch/mips/math-emu/ieee754sp.h
+++ b/arch/mips/math-emu/ieee754sp.h
@@ -65,6 +65,7 @@ static inline int ieee754sp_finite(union ieee754sp x)
while ((m >> SP_FBITS) == 0) { m <<= 1; e--; }
#define SPDNORMX SPDNORMx(xm, xe)
#define SPDNORMY SPDNORMx(ym, ye)
+#define SPDNORMZ SPDNORMx(zm, ze)

static inline union ieee754sp buildsp(int s, int bx, unsigned m)
{
diff --git a/arch/mips/math-emu/sp_maddf.c b/arch/mips/math-emu/sp_maddf.c
index 93b7132..86e1d0b 100644
--- a/arch/mips/math-emu/sp_maddf.c
+++ b/arch/mips/math-emu/sp_maddf.c
@@ -36,15 +36,15 @@ static union ieee754sp _sp_maddf(union ieee754sp z, union ieee754sp x,

COMPXSP;
COMPYSP;
- u32 zm; int ze; int zs __maybe_unused; int zc;
+ COMPZSP;

EXPLODEXSP;
EXPLODEYSP;
- EXPLODESP(z, zc, zs, ze, zm)
+ EXPLODEZSP;

FLUSHXSP;
FLUSHYSP;
- FLUSHSP(z, zc, zs, ze, zm);
+ FLUSHZSP;

ieee754_clearcx();

@@ -53,7 +53,7 @@ static union ieee754sp _sp_maddf(union ieee754sp z, union ieee754sp x,
ieee754_setcx(IEEE754_INVALID_OPERATION);
return ieee754sp_nanxcpt(z);
case IEEE754_CLASS_DNORM:
- SPDNORMx(zm, ze);
+ SPDNORMZ;
/* QNAN is handled separately below */
}

--
2.8.0

2016-04-21 13:07:29

by Paul Burton

[permalink] [raw]
Subject: [PATCH 08/11] MIPS: math-emu: Fix bit-width in ieee754dp_{mul,maddf,msubf} comments

A comment in ieee754dp_mul indicates that the code is about to perform a
32b x 32b multiplication & keep the high 32b of the result. It appears
this was copied from the single-precision multiplication code, since the
code actually goes on to perform a 64b x 64b multiplication & keep the
high 64b of the result. Fix the comment to indicate 64b.

It appears also that this comment was copied verbatim along with the
rest of the multiplication code into ieee754dp_maddf, which has since
been renamed _dp_maddf. Fix the same issue there.

Signed-off-by: Paul Burton <[email protected]>
---

arch/mips/math-emu/dp_maddf.c | 2 +-
arch/mips/math-emu/dp_mul.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/mips/math-emu/dp_maddf.c b/arch/mips/math-emu/dp_maddf.c
index 0e1d4d8..dc83fdd 100644
--- a/arch/mips/math-emu/dp_maddf.c
+++ b/arch/mips/math-emu/dp_maddf.c
@@ -165,7 +165,7 @@ static union ieee754dp _dp_maddf(union ieee754dp z, union ieee754dp x,
ym <<= 64 - (DP_FBITS + 1);

/*
- * Multiply 32 bits xm, ym to give high 32 bits rm with stickness.
+ * Multiply 64 bits xm, ym to give high 64 bits rm with stickness.
*/

/* 32 * 32 => 64 */
diff --git a/arch/mips/math-emu/dp_mul.c b/arch/mips/math-emu/dp_mul.c
index d0901f0..d6a7573 100644
--- a/arch/mips/math-emu/dp_mul.c
+++ b/arch/mips/math-emu/dp_mul.c
@@ -125,7 +125,7 @@ union ieee754dp ieee754dp_mul(union ieee754dp x, union ieee754dp y)
ym <<= 64 - (DP_FBITS + 1);

/*
- * Multiply 32 bits xm, ym to give high 32 bits rm with stickness.
+ * Multiply 64 bits xm, ym to give high 64 bits rm with stickness.
*/

/* 32 * 32 => 64 */
--
2.8.0

2016-04-21 13:07:43

by Paul Burton

[permalink] [raw]
Subject: [PATCH 09/11] MIPS: math-emu: Fix code indentation

A line incrementing the re variable was indented a level too deep in
ieee754dp_mul, making the code unclear to read. Fix the indentation.
This appears to have been copied verbatim along with the rest of the
multiplication code to ieee754dp_maddf, now _dp_maddf, too so fix the
indentation there too.

Signed-off-by: Paul Burton <[email protected]>
---

arch/mips/math-emu/dp_maddf.c | 2 +-
arch/mips/math-emu/dp_mul.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/mips/math-emu/dp_maddf.c b/arch/mips/math-emu/dp_maddf.c
index dc83fdd..4a2d03c 100644
--- a/arch/mips/math-emu/dp_maddf.c
+++ b/arch/mips/math-emu/dp_maddf.c
@@ -203,7 +203,7 @@ static union ieee754dp _dp_maddf(union ieee754dp z, union ieee754dp x,
if ((s64) rm < 0) {
rm = (rm >> (64 - (DP_FBITS + 1 + 3))) |
((rm << (DP_FBITS + 1 + 3)) != 0);
- re++;
+ re++;
} else {
rm = (rm >> (64 - (DP_FBITS + 1 + 3 + 1))) |
((rm << (DP_FBITS + 1 + 3 + 1)) != 0);
diff --git a/arch/mips/math-emu/dp_mul.c b/arch/mips/math-emu/dp_mul.c
index d6a7573..87d0b44 100644
--- a/arch/mips/math-emu/dp_mul.c
+++ b/arch/mips/math-emu/dp_mul.c
@@ -163,7 +163,7 @@ union ieee754dp ieee754dp_mul(union ieee754dp x, union ieee754dp y)
if ((s64) rm < 0) {
rm = (rm >> (64 - (DP_FBITS + 1 + 3))) |
((rm << (DP_FBITS + 1 + 3)) != 0);
- re++;
+ re++;
} else {
rm = (rm >> (64 - (DP_FBITS + 1 + 3 + 1))) |
((rm << (DP_FBITS + 1 + 3 + 1)) != 0);
--
2.8.0

2016-04-21 13:07:57

by Paul Burton

[permalink] [raw]
Subject: [PATCH 10/11] MIPS: math-emu: Fix m{add,sub}.s shifts

The code in _sp_maddf (formerly ieee754sp_madd) appears to have been
copied verbatim from ieee754sp_add, and although it's adding the
unpacked "r" & "z" floats it kept using macros that operate on "x" &
"y". This led to the addition being carried out incorrectly on some
mismash of the product, accumulator & multiplicand fields. Typically
this would lead to the assertions "ze == re" & "ze <= SP_EMAX" failing
since ze & re hadn't been operated upon.

Signed-off-by: Paul Burton <[email protected]>
Fixes: e24c3bec3e8e ("MIPS: math-emu: Add support for the MIPS R6 MADDF FPU instruction")
---

arch/mips/math-emu/ieee754sp.c | 3 ++-
arch/mips/math-emu/ieee754sp.h | 16 +++++++---------
arch/mips/math-emu/sp_add.c | 6 ++++--
arch/mips/math-emu/sp_maddf.c | 13 ++++++++-----
arch/mips/math-emu/sp_sub.c | 6 ++++--
5 files changed, 25 insertions(+), 19 deletions(-)

diff --git a/arch/mips/math-emu/ieee754sp.c b/arch/mips/math-emu/ieee754sp.c
index e0b2c45..2de0c09 100644
--- a/arch/mips/math-emu/ieee754sp.c
+++ b/arch/mips/math-emu/ieee754sp.c
@@ -138,7 +138,8 @@ union ieee754sp ieee754sp_format(int sn, int xe, unsigned xm)
} else {
/* sticky right shift es bits
*/
- SPXSRSXn(es);
+ xm = XSPSRS(xm, es);
+ xe += es;
assert((xm & (SP_HIDDEN_BIT << 3)) == 0);
assert(xe == SP_EMIN);
}
diff --git a/arch/mips/math-emu/ieee754sp.h b/arch/mips/math-emu/ieee754sp.h
index b24fdff..8476067 100644
--- a/arch/mips/math-emu/ieee754sp.h
+++ b/arch/mips/math-emu/ieee754sp.h
@@ -46,19 +46,17 @@ static inline int ieee754sp_finite(union ieee754sp x)
}

/* 3bit extended single precision sticky right shift */
-#define SPXSRSXn(rs) \
- (xe += rs, \
- xm = (rs > (SP_FBITS+3))?1:((xm) >> (rs)) | ((xm) << (32-(rs)) != 0))
+#define XSPSRS(v, rs) \
+ ((rs > (SP_FBITS+3))?1:((v) >> (rs)) | ((v) << (32-(rs)) != 0))

-#define SPXSRSX1() \
- (xe++, (xm = (xm >> 1) | (xm & 1)))
+#define XSPSRS1(m) \
+ ((m >> 1) | (m & 1))

-#define SPXSRSYn(rs) \
- (ye+=rs, \
- ym = (rs > (SP_FBITS+3))?1:((ym) >> (rs)) | ((ym) << (32-(rs)) != 0))
+#define SPXSRSX1() \
+ (xe++, (xm = XSPSRS1(xm)))

#define SPXSRSY1() \
- (ye++, (ym = (ym >> 1) | (ym & 1)))
+ (ye++, (ym = XSPSRS1(ym)))

/* convert denormal to normalized with extended exponent */
#define SPDNORMx(m,e) \
diff --git a/arch/mips/math-emu/sp_add.c b/arch/mips/math-emu/sp_add.c
index f1c87b0..c55c0c0 100644
--- a/arch/mips/math-emu/sp_add.c
+++ b/arch/mips/math-emu/sp_add.c
@@ -132,13 +132,15 @@ union ieee754sp ieee754sp_add(union ieee754sp x, union ieee754sp y)
* Have to shift y fraction right to align.
*/
s = xe - ye;
- SPXSRSYn(s);
+ ym = XSPSRS(ym, s);
+ ye += s;
} else if (ye > xe) {
/*
* Have to shift x fraction right to align.
*/
s = ye - xe;
- SPXSRSXn(s);
+ xm = XSPSRS(xm, s);
+ xe += s;
}
assert(xe == ye);
assert(xe <= SP_EMAX);
diff --git a/arch/mips/math-emu/sp_maddf.c b/arch/mips/math-emu/sp_maddf.c
index 86e1d0b..a8cd8b4 100644
--- a/arch/mips/math-emu/sp_maddf.c
+++ b/arch/mips/math-emu/sp_maddf.c
@@ -214,16 +214,18 @@ static union ieee754sp _sp_maddf(union ieee754sp z, union ieee754sp x,

if (ze > re) {
/*
- * Have to shift y fraction right to align.
+ * Have to shift r fraction right to align.
*/
s = ze - re;
- SPXSRSYn(s);
+ rm = XSPSRS(rm, s);
+ re += s;
} else if (re > ze) {
/*
- * Have to shift x fraction right to align.
+ * Have to shift z fraction right to align.
*/
s = re - ze;
- SPXSRSYn(s);
+ zm = XSPSRS(zm, s);
+ ze += s;
}
assert(ze == re);
assert(ze <= SP_EMAX);
@@ -236,7 +238,8 @@ static union ieee754sp _sp_maddf(union ieee754sp z, union ieee754sp x,
zm = zm + rm;

if (zm >> (SP_FBITS + 1 + 3)) { /* carry out */
- SPXSRSX1();
+ zm = XSPSRS1(zm);
+ ze++;
}
} else {
if (zm >= rm) {
diff --git a/arch/mips/math-emu/sp_sub.c b/arch/mips/math-emu/sp_sub.c
index ec5f937..dc998ed 100644
--- a/arch/mips/math-emu/sp_sub.c
+++ b/arch/mips/math-emu/sp_sub.c
@@ -134,13 +134,15 @@ union ieee754sp ieee754sp_sub(union ieee754sp x, union ieee754sp y)
* have to shift y fraction right to align
*/
s = xe - ye;
- SPXSRSYn(s);
+ ym = XSPSRS(ym, s);
+ ye += s;
} else if (ye > xe) {
/*
* have to shift x fraction right to align
*/
s = ye - xe;
- SPXSRSXn(s);
+ xm = XSPSRS(xm, s);
+ xe += s;
}
assert(xe == ye);
assert(xe <= SP_EMAX);
--
2.8.0

2016-04-21 13:08:11

by Paul Burton

[permalink] [raw]
Subject: [PATCH 11/11] MIPS: math-emu: Fix jalr emulation when rd == $0

When emulating a jalr instruction with rd == $0, the code in
isBranchInstr was incorrectly writing to GPR $0 which should actually
always remain zeroed. This would lead to any further instructions
emulated which use $0 operating on a bogus value until the task is next
context switched, at which point the value of $0 in the task context
would be restored to the correct zero by a store in SAVE_SOME. Fix this
by not writing to rd if it is $0.

Fixes: 102cedc32a6e ("MIPS: microMIPS: Floating point support.")
Cc: stable <[email protected]> # v3.10
Signed-off-by: Paul Burton <[email protected]>

---

arch/mips/math-emu/cp1emu.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/arch/mips/math-emu/cp1emu.c b/arch/mips/math-emu/cp1emu.c
index 85dd174..d96e912 100644
--- a/arch/mips/math-emu/cp1emu.c
+++ b/arch/mips/math-emu/cp1emu.c
@@ -445,9 +445,11 @@ static int isBranchInstr(struct pt_regs *regs, struct mm_decoded_insn dec_insn,
case spec_op:
switch (insn.r_format.func) {
case jalr_op:
- regs->regs[insn.r_format.rd] =
- regs->cp0_epc + dec_insn.pc_inc +
- dec_insn.next_pc_inc;
+ if (insn.r_format.rd != 0) {
+ regs->regs[insn.r_format.rd] =
+ regs->cp0_epc + dec_insn.pc_inc +
+ dec_insn.next_pc_inc;
+ }
/* Fall through */
case jr_op:
/* For R6, JR already emulated in jalr_op */
--
2.8.0