a MUL pipeline is needed similar to the other pipelines in soc.fu, covering MUL operations. https://git.libre-soc.org/?p=soc.git;a=tree;f=src/soc/fu/mul;hb=HEAD
there are actually two different types of MUL here:

* VA-Form - 3 int in, no carry/overflow
* X-Form - usual style, just like ALU/Logical

my feelings are mixed, as that is a lot of ports if they are combined. still, actually, after some thought, it is the same port allocation (after combining) as Shift.

# Multiply-Add High Doubleword VA-Form

VA-Form

* maddhd RT,RA,RB,RC

    prod[0:127] <- (RA) * (RB)
    sum[0:127] <- prod + EXTS(RC)
    RT <- sum[0:63]

Special Registers Altered:

    None
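a minimal python sketch of the maddhd pseudocode above, for checking expected values by hand. it assumes the Power ISA's MSB0 bit numbering (so sum[0:63] is the *high* half of the 128-bit sum) and that maddhd treats all three operands as signed. the helper names (exts, maddhd) are made up for illustration and are not the simulator's API.

```python
MASK64 = (1 << 64) - 1
MASK128 = (1 << 128) - 1


def exts(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):
        return value - (1 << bits)
    return value


def maddhd(ra, rb, rc):
    """Illustrative model: RT <- high 64 bits of (RA * RB + EXTS(RC))."""
    prod = exts(ra, 64) * exts(rb, 64)       # signed 128-bit product
    total = (prod + exts(rc, 64)) & MASK128  # wrap to a 128-bit pattern
    return (total >> 64) & MASK64            # sum[0:63] in MSB0 numbering
```

three 64-bit inputs and a single 64-bit output, with no carry or overflow flags, is what makes the VA-Form port count differ from the X-Form case.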
https://git.libre-soc.org/?p=soc.git;a=commitdiff;h=a60febdeb1c572a4b85b410c6519383fc581732d

i moved the mul operations over to a MUL Function Unit. the unit test, test_pipe_caller.py, when cookie-cut copied over, should then be changed from:

    fn_unit = yield pdecode2.e.fn_unit
    self.assertEqual(fn_unit, Function.SHIFT_ROT.value)

to:

    fn_unit = yield pdecode2.e.fn_unit
    self.assertEqual(fn_unit, Function.MUL.value)

really, at some point we should look at deriving a class to contain the common code from all these tests, soc.fu.*.test.test_pipe_caller.py
from microwatt: how to set up the inputs to the mul pipeline. this can go in main_stage.py when calling the mul unit:

    if e_in.is_32bit = '1' then
        if e_in.is_signed = '1' then
            -- sign-extend the low 32 bits to the full operand width
            x_to_multiply.data1 <= (others => a_in(31));
            x_to_multiply.data1(31 downto 0) <= a_in(31 downto 0);
            x_to_multiply.data2 <= (others => b_in(31));
            x_to_multiply.data2(31 downto 0) <= b_in(31 downto 0);
        else
            -- zero-extend the low 32 bits
            x_to_multiply.data1 <= '0' & x"00000000" & a_in(31 downto 0);
            x_to_multiply.data2 <= '0' & x"00000000" & b_in(31 downto 0);
        end if;
    else
        if e_in.is_signed = '1' then
            -- 64-bit signed: replicate the sign bit into bit 64
            x_to_multiply.data1 <= a_in(63) & a_in;
            x_to_multiply.data2 <= b_in(63) & b_in;
        else
            -- 64-bit unsigned: bit 64 is zero
            x_to_multiply.data1 <= '0' & a_in;
            x_to_multiply.data2 <= '0' & b_in;
        end if;
    end if;
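the same operand setup as a plain-python model, useful for generating expected values in a unit test. this is only a sketch: it assumes the multiplier inputs are 65 bits wide (which is what the '0' & x"00000000" & a_in(31 downto 0) concatenation implies), and extend_operand is a hypothetical name, not something in soc.fu or microwatt.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def extend_operand(value, is_32bit, is_signed):
    """Return the 65-bit multiplier input as an unsigned bit pattern."""
    if is_32bit:
        low = value & MASK32
        if is_signed and (low >> 31) & 1:
            # replicate bit 31 into bits 32..64 (33 copies of the sign)
            return low | (((1 << 33) - 1) << 32)
        return low  # zero-extended
    low = value & MASK64
    if is_signed and (low >> 63) & 1:
        # replicate bit 63 into bit 64
        return low | (1 << 64)
    return low  # bit 64 is zero
```

making both operands one bit wider than the data means a single signed 65x65 multiplier can serve signed and unsigned, 32-bit and 64-bit cases alike.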
i made a start on this, no multi-stage, just to get at least something moving forward. immediately found an issue with the simulator pseudocode: mulli operands are supposed to be signed, and that alters the output considerably. this will need alteration of the pseudocode, even to the extent of creating a special MULS function.
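to illustrate how much signedness matters for mulli, here is a small sketch comparing a sign-extended immediate with a zero-extended one. it is not the simulator's code and does not claim to reproduce the exact bug: muls here is only a guess at what a signed-multiply helper might look like, and the other names are hypothetical too.

```python
MASK64 = (1 << 64) - 1


def exts(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):
        return value - (1 << bits)
    return value


def muls(a, b, bits=64):
    """Hypothetical signed multiply: returns the 2*bits-wide product pattern."""
    prod = exts(a, bits) * exts(b, bits)
    return prod & ((1 << (2 * bits)) - 1)


def mulli(ra, si16):
    """Model of mulli with a sign-extended 16-bit immediate."""
    imm = exts(si16, 16) & MASK64     # EXTS(SI) as a 64-bit pattern
    return muls(ra, imm) & MASK64     # low 64 bits of the product


def mulli_zero_extended(ra, si16):
    """The wrong variant: immediate treated as unsigned 16-bit."""
    return (ra * (si16 & 0xFFFF)) & MASK64
```

with RA=3 and SI=0xFFFF (i.e. -1), the signed version gives the 64-bit pattern for -3 while the zero-extended one gives 3*65535, which are nowhere near each other.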
commit 512e2d72912ba57913ab1b1297a085d5fae67181 (HEAD -> master)
Author: Luke Kenneth Casson Leighton <lkcl@lkcl.net>
Date:   Thu Jul 9 10:52:46 2020 +0100

    add new stages etc. to get multiply working without xer_ca

removing xer_ca from the DIV and MUL pipelines (both on input and output) needs a bit of tweaking. it's important because reading or writing registers unnecessarily creates dependencies, and those dependencies create chaining and prevent opportunities for parallelism.
hmm, to match the exact behaviour of IBM's POWER9 core it is necessary to modify the pseudocode of mulhwu and mulhw to return the 32 bits of the product mapped *twice*. this is exactly what microwatt does. the second modification needed is to create a variable named overflow in the pseudocode and return it. the microwatt test is quite neat: overflow is set when the hi bits are neither all zero nor all 1s. this can be easily expressed in the pseudocode.
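a rough python model of the "mapped twice" behaviour, assuming the instructions in question are the multiply-high-word pair (mulhw / mulhwu) and that "the 32 bits of the product" means the high word of the 64-bit product. the function names are illustrative only.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def exts(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):
        return value - (1 << bits)
    return value


def dup_word(word):
    """Place the same 32-bit word in both halves of a 64-bit result."""
    word &= MASK32
    return (word << 32) | word


def mulhw(ra, rb):
    """Signed 32x32 multiply, high word of the product duplicated."""
    prod = (exts(ra, 32) * exts(rb, 32)) & MASK64
    return dup_word(prod >> 32)


def mulhwu(ra, rb):
    """Unsigned 32x32 multiply, high word of the product duplicated."""
    prod = (ra & MASK32) * (rb & MASK32)
    return dup_word(prod >> 32)
```

the ISA leaves the upper half of RT undefined for these instructions, so duplicating the word only matters when comparing bit-for-bit against real hardware.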
ha, hilarious:

    overflow <- ((prod[0:32] != 0x0_0000_0000) &
                 (prod[0:32] != 0x1_ffff_ffff))

those constants are written in hexadecimal, so each one is 36 bits long (9 hex digits), not 33, and the pseudocode rightly complains. i changed them to [0]*33 and [1]*33 and that works.
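for reference, the same overflow test in python, plus the width arithmetic behind the complaint. this assumes prod is the 64-bit product of two 32-bit operands and that prod[0:32] uses MSB0 numbering, i.e. it is the top 33 bits. both function names are made up for this sketch.

```python
ALL_ONES_33 = (1 << 33) - 1


def mul32_overflow(prod64):
    """True when the top 33 bits are neither all zeros nor all ones."""
    top33 = (prod64 >> 31) & ALL_ONES_33   # prod[0:32] in MSB0 numbering
    return top33 != 0 and top33 != ALL_ONES_33


def hex_literal_width(literal):
    """Bit width implied by a hex literal: 4 bits per hex digit."""
    digits = literal.lower().replace("0x", "", 1).replace("_", "")
    return 4 * len(digits)
```

hex_literal_width("0x1_ffff_ffff") comes out at 36, which is why a width-checked comparison against a 33-bit slice fails even though the value itself fits in 33 bits.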
jacob EUR 500, lkcl 250 on this one is, i feel, reasonable. MAC is still TODO and the tests are ok. a proof is still needed, however that is separate.