Separate bug for the work I did on the core/pad resource JTAG test code. Was initially documented in bug #50 . The work involved: - Parse a dummy resource list (gpio, i2c, uart) - Instantiate peripherals from the list - Write basic unit tests for the peripherals - Connect the peripheral pins to the JTAG BS chain - Write unit tests to check the JTAG functionality (SAMPLE and EXTEST) Latest code: https://git.libre-soc.org/?p=pinmux.git;a=blob;f=src/spec/testing_stage1.py;h=e6f135388a28d887ddf9664ebc33817373282d35;hb=074aced79c300b2397d2f4504c41e2ddfe9ba7ed Currently this temporary wiki page is used for documenting the tests: https://libre-soc.org/docs/pinmux/temp_pinmux_info/ (I'll be adding the information on the pinmux block later this week) One main issue with a unit test is remaining: The TDO data after an EXTEST (test case 1 and 3) does not match up with the ios dictionary of signals (the signals corresponding to the BS chain shift register). As I'm using the ios dictionary signals for the asserts (and comparing against the expected test vector), the test is passing, but I don't know if the JTAG is not being driven correctly, or something is affecting the shift register (or clears it).
Luke, before closing the bug, I wanted to find the cause of the TDO issue. To make the issue easier to find, I focused on checking the uart rx/tx signals. Pad input rx cleared. UART rx/tx internal loopback disabled: 1. EX_TEST, BS data is 0x80000 (set uart rx core i) 2. EX_TEST, BS data is 0x40000 (set uart tx pad o) 3. EX_TEST, BS data is 0xC0000 (set uart rx core i and tx pad o) In all cases, the rx core input and tx pad output waveforms are correct. TDO however is all zeroes. When tx pad output is high (cases 2 and 3), one of the TDO bits MUST be set to coincide with tx pad output. The same set of test cases was repeated with the rx/tx loopback connection (on the core side): 4. EX_TEST, BS data is 0x80000 (set uart rx core i) 5. EX_TEST, BS data is 0x40000 (set uart tx pad o) 6. EX_TEST, BS data is 0xC0000 (set uart rx core i and tx pad o) The waveforms show the tx core output being set when rx core input is set which is correct. However TDO is again all zeroes. TDO does however, show the correct data when rx pad i is set, with the BS_SAMPLE mode: 7. BS_SAMPLE, BS data is 0x00000 This test case should show the rx pad/core i and tx core/pad o set (because loopback is working, and jtag is passthrough), and this is what is observed. TDO is correct, 1100 0000 0000 0000 0000. Upper two bits indicate rx pad in and tx core out. The ios dict only contains read-only signals, so perhaps they are not shifted out into TDO during an external test (EX_TEST)? Checking yosys diagram, I can see that io_bd Boundary Scan register controls signals coming out of JTAG (core inputs and pad outputs) and it is shifted out to TDO. Checking c4m-jtag/c4m/nmigen/jtag/tap.py, I haven't been able to find how tdo would connect to io_bd (but I did see connection to sr, the main shift register).
https://git.libre-soc.org/?p=c4m-jtag.git;a=blob;f=c4m/nmigen/jtag/tap.py;h=e0cf747150c03c45a7be16682b31422103ab75bc;hb=eb28aea69c7a785085364c2fb3c2cbdbe86f4890#l625 io_bd - and in fact io_sr itself directly - is set to directly be sent out via TDOs, one bit at a time (line 673) io_sr is only shifted "in" from tdi in "shift" mode (line 620) io_sr only takes data *from* pads and core in *capture* mode (line 592) so you cannot just set a few signals and "hope for the best", you have to actually make sure that the correct "Modes" (shift, update, capture) are set. the JTAG example in soc that i referred you to does this correctly. it took me something mad like 2 days to understand it.
Based on your reply, I looked at yesterday's waveforms and saw the Capture state happens before the Shifting state (when the BS data or JTAG configuration is shifted in), but as I was about to start running the code again, I got an assertion error from c4m-jtag tap.py. https://git.libre-soc.org/?p=c4m-jtag.git;a=blob;f=c4m/nmigen/jtag/tap.py;h=e0cf747150c03c45a7be16682b31422103ab75bc;hb=eb28aea69c7a785085364c2fb3c2cbdbe86f4890#l592 (assertion on #l616) The assertion fails because the length of iol does not match the sum of all subsignals in _ios: Traceback (most recent call last): File "testing_stage1.py", line 1005, in <module> test_jtag() File "testing_stage1.py", line 976, in test_jtag dut = setup_blinker(build_blinker=False) File "testing_stage1.py", line 930, in setup_blinker vl = rtlil.convert(top, ports=top.ports()) File "/home/rohdo/src/nmigen/nmigen/back/rtlil.py", line 1054, in convert fragment = ir.Fragment.get(elaboratable, platform).prepare(**kwargs) File "/home/rohdo/src/nmigen/nmigen/hdl/ir.py", line 37, in get obj = obj.elaborate(platform) File "/home/rohdo/src/nmigen/nmigen/hdl/dsl.py", line 538, in elaborate fragment.add_subfragment(Fragment.get(self._named_submodules[name], platform), name) File "/home/rohdo/src/nmigen/nmigen/hdl/ir.py", line 37, in get obj = obj.elaborate(platform) File "/home/rohdo/src/soc/pinmux/src/spec/jtag.py", line 185, in elaborate m = super().elaborate(platform) File "/home/rohdo/src/c4m-jtag/c4m/nmigen/jtag/tap.py", line 434, in elaborate bd2io=io_bd2io, bd2core=io_bd2core, File "/home/rohdo/src/c4m-jtag/c4m/nmigen/jtag/tap.py", line 621, in _elaborate_ios assert idx == length, "Internal error, length mismatch" AssertionError: Internal error, length mismatch Adding prints to tap.py show me that self._ios is a list of records, each record containing a core/pad pair: [(rec uart_0__rx (rec uart_0__rx__core i) (rec uart_0__rx__pad i)), ... more signals ... (rec i2c_0__scl__oe (rec i2c_0__scl__oe__core o) (rec i2c_0__scl__oe__pad o))] So, the total number of subsignals in _ios is 40. iol is a list that *only* contains peripheral core outputs and pad inputs (i.e. signals that JTAG cannot control, but can only sample): [(sig uart_0__rx__pad__i), (sig uart_0__tx__core__o), ... more signals ... (sig i2c_0__scl__i__pad__i), (sig i2c_0__scl__o__core__o), (sig i2c_0__scl__oe__core__o)] No pullups/downs or banksel are requested on any of the pins, so that part of the code in tap.py is not being used. Total number of subsignals in iol is 20. I was wondering whether the chain is meant to be 40 bits long (to account for the core/pad pair), but I don't understand it enough to make that conclusion. The way the code shows it (and the way you described it) indicates that theres a (not including intermediate states like idle, run, etc.): 1. shifting stage (which I've seen in the waveforms) 2. update stage where signals controlled by JTAG BS are set by the shift register 3. capture stage where JTAG inputs are copied into the main shift register So that implies there's no need for a 40-bit chain, as you can access the inputs and outputs by selecting different modes.
(In reply to Andrey Miroshnikov from comment #3) > I was wondering whether the chain is meant to be 40 bits long (to account > for the core/pad pair), but I don't understand it enough to make that > conclusion. look again at the loop. line 634. https://git.libre-soc.org/?p=c4m-jtag.git;a=blob;f=c4m/nmigen/jtag/tap.py;h=e0cf747150c03c45a7be16682b31422103ab75bc;hb=eb28aea69c7a785085364c2fb3c2cbdbe86f4890#l634 if you do not understand it, insert some print statements into that code. again, a reminder: stop thinking that you do not have the right to do anything you want to, to find out what is going on. you are still thinking of other people's code as "a black box that cannot be touched and it must work, as-is". likewise you can perfectly well put print statements into TAP.add_io() as well. if you are expecting only 20 signals and there are double that amount it is probably because you have called a loop around TAP.add_io() twice.
(In reply to Luke Kenneth Casson Leighton from comment #4) > if you do not understand it, insert some print statements into that code. Yep, that was indeed helpful. In addition I had to bite the bullet and study the TAP FSM. Now it makes a lot more sense. Who would've thought? XD > > again, a reminder: stop thinking that you do not have the right to do > anything you want to, to find out what is going on. Sure. Now I've even found a discrepancy that doesn't really makes sense in tap.py The TDI data for sr is shifted in *MSB first*. See: https://git.libre-soc.org/?p=c4m-jtag.git;a=blob;f=c4m/nmigen/jtag/tap.py;hb=dfa012c0a884274f2ec5866ce0ac49e271a0bde5#l624 Now, there's another interesting discrepancy. The Instruction Register (ir) is loaded from TDI *LSB first*. See: https://git.libre-soc.org/?p=c4m-jtag.git;a=blob;f=c4m/nmigen/jtag/tap.py;hb=dfa012c0a884274f2ec5866ce0ac49e271a0bde5#l150 This is important because in tms_data_getset(), which is used in jtag_read_write_reg(), the TDI data is sent *LSB first*!!! https://git.libre-soc.org/?p=soc.git;a=blob;f=src/soc/debug/test/test_jtag_tap.py;hb=67ee79592f304b48e424a92e09a9fbd3a7fff751#l29 This means tms_data_get_set() works for setting the instruction (SAMPLE, EXTEST, BYPASS, etc.), but is NOT correct for setting the data. Both ID and IR blocks shift in word LSB first, while IO shifts in word MSB first. Is this expected behaviour? If the issue is with tap.py, ID and IR could be changed to use MSB first shifting. Alternatively, what I propose is - add a function tms_instr_set() which will load data *LSB first* for use with the setting the TAP instruction. - modify tms_data_get_set() for-loop to start from the MSB and count down: for i in range(d_len-1, -1, -1) This change also ensures the TDO data is correct (it is shifted out MSB first as well) I made those changes locally, but if you give me access to soc repo, I could pull them in. As for the issues I had with the tests. The above issue was only affecting me now, once I got TDO to output the boundary scan registers. To read the resulting boundary scan chain values after an EXTEST, the TAP FSM has to be moved to the Capture state (as you've explained in comment #2, but I didn't understand significance of until I studied the FSM). I will cover this in more detail on the wiki page. Importantly for this bug, all 4 test cases are working and producing results that are expected. To remind of the cases: 1. Pad inputs and core outputs low, EXTEST, send 2^N-1 (0xFFFFF) test vector. 2. Pad inputs and core outputs low, SAMPLE, send 0xFFFFF test vector. 3. Pad inputs and core outputs high, EXTEST, send 0x00000 test vector. 4. Pad inputs and core outputs high, SAMPLE, send 0x00000 test vector. Results: 1. Pad outputs and oe are high, core inputs are high. 2. All signals low. 3. All pad inputs and core oe signals high. 4. Pad outputs and core inputs high (as well as pad inputs and core outputs). Cases 1 and 2 check control from JTAG only, cases 3 and 4 test a system where signals are controlled externally (by pads or peripherals). The core inputs values are determined by sampling the core outputs, since they are connected via an XOR gate internally. The reason core output oe's are high in case 3 is because the core oe's are only controlled by the oe test vector inputs, while all core outputs are an XOR of the core input and a test vector (core_i ^ test) Can this bug now be closed and RFP sent?
briefly, read it again 150 with m.Elif(self._shift): 151 m.d.posjtag += shift_ir.eq(Cat(shift_ir[1:], self._tdi)) on the next clock of posjtag: this is taking TDI into the top bit (MSB) and dropping the first bit (LSB). in software: shift_ir = shift_ir >> 1 # shift down 1 bit shift_ir |= (tdi << len(shift_ir-1) # put TDI into MSB
Andrey, each Shift Register created and used in one area for one task has absolutely nothing to do with any other shift register. the SR used for identity (IDBlock) has nothing to do with the IO shift register. the WB SR has nothing to do with theIO SR or the ID SR. what each piece of HDL does with each SR is entirely up to that piece of HDL. a module could decide to insert bits in an order related to phases of the moon and the spot trading price of matured smoked Edam if they wanted. therefore the bitreversal should i feel be done by the caller, i.e. call a explicit reverse() on the data in and out of the test function. if you really want to do reversal in the test function, (which is fine btw) then the only "safe" way to do it is to add extra named parameters at the end of the paramlist, defaulting to values which do NOT change the current behaviour. so e.g. reverse=False this is standard practice in well-maintained SQL databases and in any code providing optional args (java, c++, python) but python in particular has to watch out because it is interpreted. welcome to your first experience of maintaining backwards compatibility in software.
29 def tms_data_getset(dut, tms, d_len, d_in=0): , reverse=False 30 res = 0 31 yield dut.bus.tms.eq(tms) 32 for i in range(d_len): r = range(dlen) if reverse: r = reversed(r) or similar
(In reply to Luke Kenneth Casson Leighton from comment #6) > this is taking TDI into the top bit (MSB) and dropping the > first bit (LSB). in software: This is what I tried to describe (I probably didn't make too much sense did I?). I understood the behaviour, however my question is why are the IDBypassBlock and IRBlock shifting new data from the top of their internal shift registers, while the IOBlock shifts new data from the bottom? IRBlock: tdo <- ir[0] ir[1] ... ir[-1] <- tdi IDBypassBlock: tdo <- sr[0] sr[1] ... sr[-1] <- tdi IOBlock: tdi -> sr[0] sr[1] ... sr[-1] -> tdo The answer is probably, this is legacy/complicated/etc. so I won't press it further. Actually, just checked the old VHDL files TAP was based of, looks like they have done the same arrangement: https://git.libre-soc.org/?p=c4m-jtag.git;a=blob;f=c4m/vhdl/jtag/c4m_jtag_idblock.vhdl;h=a8c4ac995da3a787e3a75d10e94bf47018defa81;hb=fe7f74ad4ece0471a90bd4181f15ff678a844b07#l50 (In reply to Luke Kenneth Casson Leighton from comment #7) > Andrey, each Shift Register created and used in > one area for one task has absolutely nothing to > do with any other shift register. I know, I should've been more explicit in my wording. The names such as "io_sr" clearly implied that the signals were completely separate. > a module could decide to insert bits in an order related > to phases of the moon and the spot trading price of matured > smoked Edam if they wanted. Pretty much. > > therefore the bitreversal should i feel be done by the > caller, i.e. call a explicit reverse() on the data in and > out of the test function. > > if you really want to do reversal in the test function, > (which is fine btw) then > the only "safe" way to do it is to add extra named parameters > at the end of the paramlist, defaulting to values which do > NOT change the current behaviour. > > so e.g. reverse=False Sure, sounds like a plan. I can add a "reverse" optional arg to tms_data_getset() The proposed change to test_jtag_tap.py: 1. Add the function: def reverse_bit_order(word, wordlen): rev_word = 0 for bit in range(wordlen): rev_word += ((word >> bit) & 0x1) << (wordlen - 1 - i) return rev_word 2. Modify input arg and add if statement: def tms_data_getset(dut, tms, d_len, d_in=0, reverse=False): if reverse: # Reverse the data word to transmit MSB-first d_in = reverse_bit_order(d_in, d_len) 3. Add "reverse" argument to jtag_read_write_reg() > this is standard practice in well-maintained SQL databases and > in any code providing optional args (java, c++, python) but python > in particular has to watch out because it is interpreted. > > welcome to your first experience of maintaining backwards compatibility > in software. Hahahahaha :)
(In reply to Andrey Miroshnikov from comment #9) > (In reply to Luke Kenneth Casson Leighton from comment #6) > > this is taking TDI into the top bit (MSB) and dropping the > > first bit (LSB). in software: > > This is what I tried to describe (I probably didn't make too much sense did > I?). 3am for me not much makes sense > I understood the behaviour, however my question is why are the IDBypassBlock > and IRBlock shifting new data from the top of their internal shift > registers, while the IOBlock shifts new data from the bottom? Officially Not Our Problem (tm). one to file in the "that's interesting, i'll make a mental note in case it's relevant some point down the line" bucket. > (In reply to Luke Kenneth Casson Leighton from comment #7) > > The proposed change to test_jtag_tap.py: > 1. Add the function: > def reverse_bit_order(word, wordlen): > rev_word = 0 > for bit in range(wordlen): > rev_word += ((word >> bit) & 0x1) << (wordlen - 1 - i) > return rev_word > > 2. Modify input arg and add if statement: > def tms_data_getset(dut, tms, d_len, d_in=0, reverse=False): > if reverse: > # Reverse the data word to transmit MSB-first > d_in = reverse_bit_order(d_in, d_len) > > 3. Add "reverse" argument to jtag_read_write_reg() looks great. you have commit rights to soc (have all along), you could check with "ssh gitolite3@git.libre-soc.org"
(In reply to Luke Kenneth Casson Leighton from comment #10) > looks great. https://git.libre-soc.org/?p=soc.git;a=commitdiff;h=d8bf21d37504d89537c397d32c84e772db5d3d37 Had to use a slightly different appoach, as just changing the input data doesn't also adjust the output. Instead on the reverse flag, the for loop is iterated over backwards. This makes both the input and output data MSB-first. Tests are now working. https://git.libre-soc.org/?p=pinmux.git;a=blob;f=src/spec/testing_stage1.py;h=144fb8510dee99a7b177708a19a3416644564a94;hb=2285f3ad73ea14f79a52344a4c22c91feb754d55 > you have commit rights to soc (have all along), you could check > with "ssh gitolite3@git.libre-soc.org" Ah, didn't realise, thanks! Saved in my "helpful commands" textfile. Since there are no more blockers for this bug, can it be closed complete?
Task finished, sending RFP.