Bug 499 - Create experimental gdb protocol implementation in nmigen for debugging
Summary: Create experimental gdb protocol implementation in nmigen for debugging
Status: CONFIRMED
Alias: None
Product: Libre-SOC's first SoC
Classification: Unclassified
Component: Source Code
Version: unspecified
Hardware: All All
Importance: Low enhancement
Assignee: Jacob Lifshay
URL:
Depends on: 503
Blocks:
Reported: 2020-09-22 21:34 BST by Jacob Lifshay
Modified: 2020-09-25 03:58 BST (History)

See Also:
NLnet milestone: ---
total budget (EUR) for completion of task and all subtasks: 0
budget (EUR) for this task, excluding subtasks' budget: 0
parent task for budget allocation:
child tasks for budget allocation:
The table of payments (in EUR) for this task; TOML format:


Description Jacob Lifshay 2020-09-22 21:34:20 BST
(In reply to Luke Kenneth Casson Leighton from bug #490 comment #20)
> (In reply to Jacob Lifshay from bug #490 comment #19)
> > idk if we will have time, but it might be useful to implement gdb's remote
> > protocol over a serial port.
> 
> in RTL?  very unlikely.

GDB's protocol can be quite simple; IIRC you only need to implement 5-6 commands to have a fully working single-threaded implementation. I'd estimate 1k-3k gates for the FSM and a few hundred gates for the UART. If I'm not needed for anything else right now, I'll try to implement it today. What frequency should I use for the reference clock to divide to get the right baud rate? 50MHz? I'm planning on running it at 115,200 baud since that's the fastest commonly supported rate.
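As a rough check of the clock question above, here is a minimal sketch of the divider arithmetic, assuming a plain integer divider running from the proposed 50MHz reference clock. The function name `baud_divider` is made up for illustration and is not taken from nmigen or nmigen-soc.

```python
from typing import Tuple


def baud_divider(clk_hz: int, baud: int) -> Tuple[int, float, float]:
    """Return (divisor, actual_baud, error_percent) for an integer clock divider."""
    # Nearest integer number of reference-clock cycles per bit.
    divisor = round(clk_hz / baud)
    # Baud rate actually produced by that divisor.
    actual = clk_hz / divisor
    # Relative error against the requested rate, in percent.
    error_pct = (actual - baud) / baud * 100
    return divisor, actual, error_pct


# Assumed values from the description: 50MHz reference, 115,200 baud.
divisor, actual, error_pct = baud_divider(50_000_000, 115_200)
print(divisor, round(actual, 1), round(error_pct, 4))
# prints: 434 115207.4 0.0064
```

With these assumed numbers the rate error is well under one percent, so 50MHz would be a workable reference for 115,200 baud.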
Comment 1 Luke Kenneth Casson Leighton 2020-09-22 22:21:32 BST
see what you can do, jacob, you will find a serial tx rx implementation in nmigen-soc i believe, which will save some time.

the tricky bit is that it will need to connect to DMI (with suitable converter) and there is already one "thing" connecting to that (DMI2JTAG).  switching between the two could be muxed with an external pin.

also a memory (wishbone master) interface is needed, which Staf's code already provides, and it took me about 2-3 hours to work that out and hook it up with a unit test first.

normally i'd estimate overall it would be about... 3 to 5 days total here which starts to put it outside of the "code freeze" zone.

examine the git commits from the past 3 days. you will find i created a dummy (fake) DMI server nmigen test process which pretends to support DMI address 0 (read and write) and DMI.MSR (read only)

if you can implement it REAL fast and get a wishbone master and DMI gateway done in under 2 days then we can put it in.

one extra wishbone master is not a problem, no MUXing needed, litex just adds it to the list of things to hook up.

basically, go for it but be QUICK ok?
Comment 2 Luke Kenneth Casson Leighton 2020-09-22 22:37:54 BST
can i assume you're talking about this, section 2.3

https://www.embecosm.com/appnotes/ean4/embecosm-howto-rsp-server-ean4-issue-2.html

if so that's reeaaally not going to be quick.  section 4.7.6 which is
for example reading all registers, that alone will be quite a complex
(nested, twin) FSM, like this:

https://git.libre-soc.org/?p=soc.git;a=blob;f=src/soc/litex/florent/sim.py;hb=HEAD#l225

something like that would need converting to nmigen and that's *just*
for reading the registers.

the advantage of the JTAG approach is that, basically, aside from
the software (openocd) it's done.  i.e. the complexity (and remaining work)
is in the software side.
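For reference, the packet framing that any RSP implementation (RTL FSM or software) has to handle can be sketched in a few lines of Python. This is only an illustration of the `$<payload>#<checksum>` framing, where the checksum is the modulo-256 sum of the payload bytes written as two hex digits; it ignores escaping and run-length encoding, and the function names are hypothetical.

```python
def rsp_checksum(payload: bytes) -> int:
    """Modulo-256 sum of the payload bytes."""
    return sum(payload) % 256


def rsp_frame(payload: bytes) -> bytes:
    """Wrap a payload as $<payload>#<two hex digit checksum>."""
    return b"$" + payload + b"#" + b"%02x" % rsp_checksum(payload)


def rsp_unframe(packet: bytes) -> bytes:
    """Check the framing and checksum, then return the bare payload."""
    if not packet.startswith(b"$") or packet[-3:-2] != b"#":
        raise ValueError("malformed packet")
    payload = packet[1:-3]
    if int(packet[-2:], 16) != rsp_checksum(payload):
        raise ValueError("bad checksum")
    return payload


# "g" is the read-all-registers request discussed above.
print(rsp_frame(b"g"))
# prints: b'$g#67'
```

The framing itself is small; the complexity described in the comment above comes from what has to happen between receiving `g` and sending the reply.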
Comment 3 Luke Kenneth Casson Leighton 2020-09-24 11:49:03 BST
the other thing that occurred to me jacob is that Staf would like to
do scans of IO ports as part of the ASIC test.  this is in c4m jtag
https://gitlab.com/Chips4Makers/c4m-jtag/-/blob/master/c4m/nmigen/jtag/tap.py#L411

therefore we *have* to get the jtag interface operational, where
a direct-to-gdb interface is not a full replacement of all needed
functionality.

with staf's JTAG code having been silicon-proven, and it being necessary
to add, time spent on this right now is effectively time wasted when there
are much higher priority things to do.

i _like_ the idea - it's just a hell of a lot of redundant work.
Comment 4 Jacob Lifshay 2020-09-24 18:13:47 BST
(In reply to Luke Kenneth Casson Leighton from comment #3)
> the other thing that occurred to me jacob is that Staf would like to
> do scans of IO ports as part of the ASIC test.

This is possible with gdb's debugger protocol, you just send a custom interpreter command using gdb's `monitor` command (made up syntax):
(gdb) monitor set port 1AE 0

this is how you can access openocd's command prompt from gdb.

Alternatively, you could write to the GPIO ports' memory using a memory write command.
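To make the two options concrete, here is a hedged sketch of what would go over the wire. In the remote protocol, gdb forwards a `monitor` command as a `qRcmd,` packet with the command text hex-encoded, and a memory write is an `M addr,length:data` packet. The command string is the made-up syntax from above and the address `0x1000` is purely hypothetical; the helper names are illustrative only.

```python
def monitor_packet(command: str) -> str:
    """Payload gdb sends for `monitor <command>`: qRcmd,<hex of command>."""
    return "qRcmd," + command.encode("ascii").hex()


def memory_write_packet(addr: int, data: bytes) -> str:
    """Payload for a memory write: M<addr>,<length>:<hex data>."""
    return "M%x,%x:%s" % (addr, len(data), data.hex())


# Made-up monitor syntax from the comment above.
print(monitor_packet("set port 1AE 0"))

# Hypothetical GPIO register address, for illustration only.
print(memory_write_packet(0x1000, b"\x01"))
```

Either way the target side still has to parse the payload and act on it, which is extra logic on top of the basic register and memory commands.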

One of the nice features of GDB's protocol is it supports debugging multi-threaded programs (we would interpret gdb's threads as processor cores). I don't know if the jtag protocol supports that.
Comment 5 Luke Kenneth Casson Leighton 2020-09-24 18:41:45 BST
(In reply to Jacob Lifshay from comment #4)
> (In reply to Luke Kenneth Casson Leighton from comment #3)
> > the other thing that occurred to me jacob is that Staf would like to
> > do scans of IO ports as part of the ASIC test.
> 
> This is possible with gdb's debugger protocol, you just send a custom
> interpreter command using gdb's `monitor` command (made up syntax):
> (gdb) monitor set port 1AE 0
> 
> this is how you can access openocd's command prompt from gdb.

my point is not so much "is it possible" it was "Staf's already provided it, and his work has full unit tests, and it is FPGA proven and silicon proven as well i think"

to do a uart version of the exact same thing, why would we do that, but even if we did it is getting more and more complex.

throwing away Staf's JTAG io work, and the fact that he has SVF scan files already done, which can be run against the simulator, verilator, FPGA *and the ASIC*: this is a huge saving, all of which we throw away and have to do ourselves to replicate it in gdb serial?

honestly why would we do that? it makes no sense.

basically we need to focus and this proposal is increasing more and more in scope and cost as we go through its full implications.



> Alternatively, you could write to the GPIO ports' memory using a memory
> write command.

which is far more complex and defeats the object of testing the IO pads directly.  remember: Staf needs to be able to test the IO pads in as direct and simple a fashion as possible.

a memory bus is much more HDL that could go wrong.

as the memory bus is litex which uses migen which has zero checking on it, it is already quite risky.


> One of the nice features of GDB's protocol is it supports debugging
> multi-threaded programs (we would interpret gdb's threads as processor
> cores). I don't know if the jtag protocol supports that.

both serial and JTAG are single resource and immediate response.  any multi threading is done at the software level (as part of gdb)

neither the gdb serial nor jtag RTL will have any kind of "thread state", except if the actual core is hyperthreaded but even there the requests will contain the core number over the serial/JTAG link.
Comment 6 Jacob Lifshay 2020-09-24 19:24:57 BST
(In reply to Luke Kenneth Casson Leighton from comment #5)
> (In reply to Jacob Lifshay from comment #4)
> > 
> throwing away Staf's JTAG io work, and the fact that he has SVF scan files
> already done, which can be run against the simulator, verilator, FPGA *and
> the ASIC* this is a huge saving all of which we throw away and have to do
> ourselves to replicate it in gdb serial?

I'm not saying we should completely replace JTAG, I'm saying we could and also that GDB's protocol is nicer for debugging and requires fewer pins.


> > One of the nice features of GDB's protocol is it supports debugging
> > multi-threaded programs (we would interpret gdb's threads as processor
> > cores). I don't know if the jtag protocol supports that.
> 
> both serial and JTAG are single resource and immediate response.  any multi
> threading is done at the software level (as part of gdb)

My point was that gdb's protocol already has built-in handling for things like "send a signal (interrupt) to thread #7" or "read registers from thread #5" or "pause thread #1" or "resume thread #2". This will be useful for later chips that have multiple cores, since we'd just tell gdb that a chip with 4 cores has 4 gdb-visible threads.
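A minimal sketch of the core-to-thread mapping, assuming core N is presented to gdb as thread N+1 (thread-id 0 means "any thread" in the remote protocol, so it cannot name a specific core). `Hg<id>` selects the thread for following register reads and `vCont;c:<id>` asks for one thread to continue; the helper names are hypothetical.

```python
from typing import List


def core_to_thread_id(core: int) -> int:
    """Assumed mapping: core 0 -> thread 1, core 1 -> thread 2, and so on."""
    return core + 1


def read_registers_packets(core: int) -> List[str]:
    """Select the thread with Hg<id>, then read its registers with g."""
    return ["Hg%x" % core_to_thread_id(core), "g"]


def resume_core_packet(core: int) -> str:
    """Ask for one thread (core) to continue via vCont."""
    return "vCont;c:%x" % core_to_thread_id(core)


# A chip with 4 cores would be reported to gdb as threads 1 to 4.
print([core_to_thread_id(core) for core in range(4)])
# prints: [1, 2, 3, 4]
print(read_registers_packets(2), resume_core_packet(1))
# prints: ['Hg3', 'g'] vCont;c:2
```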
Comment 7 Staf Verhaegen 2020-09-24 19:48:24 BST
I think you guys are mixing up layers. JTAG is a communication interface not a protocol. A debugging protocol is something you can add on top. It is like adding TCP/IP on top of Ethernet: JTAG corresponds to Ethernet and the debug protocol to TCP/IP.

AFAIK ARM defines a standard set of JTAG instructions and if a CPU implements these one can use OpenOCD to setup a debug server for debugging the ARM CPU remotely using gdb.

So to me the question is if a similar set of debugging JTAG instructions exists for Power and if OpenOCD or another program has support for this.
Comment 8 Luke Kenneth Casson Leighton 2020-09-24 19:56:17 BST
(In reply to Jacob Lifshay from comment #6)

> I'm not saying we should completely replace JTAG, I'm saying we could and
> also that GDB's protocol is nicer for debugging and requires fewer pins.

openocd already integrates directly with gdb, and it's only 2 pins.  if
it was 20 or 200 i'd be concerned.

if you run the simulation (litex/florent/sim.py) then type "openocd -f openocd.cfg" it will fire up an openocd instance that acts as a "gateway"
between the jtagremote TCP port and gdb, and it is a trivial matter to put into
the openocd.cfg file "please fire up a remote gdb service".

you then fire up gdb with options that connect to that port.

this does *exactly the same job* as connecting gdb to a serial port.

> > > One of the nice features of GDB's protocol is it supports debugging
> > > multi-threaded programs (we would interpret gdb's threads as processor
> > > cores). I don't know if the jtag protocol supports that.
> > 
> > both serial and JTAG are single resource and immediate response.  any multi
> > threading is done at the software level (as part of gdb)
> 
> My point was that gdb's protocol already has built-in handling for things
> like "send a signal (interrupt) to thread #7" or "read registers from thread
> #5" or "pause thread #1" or "resume thread #2". 

whatever is available in the gdb protocol is *already* going to be proxied
over the jtagremote protocol as well.

> This will be useful for
> later chips that have multiple cores, since we'd just tell gdb that a chip
> with 4 cores has 4 gdb-visible threads.

then all we need do to support that is to add the "hyperthread" number to
the JTAG HDL, most likely one of the DMI registers.

then we add software support in openocd which being software if we screw it
up it's easy to fix.

if you really want to, feel free to go ahead with this: i'd recommend that
you outline and document a full project plan.

however please bear in mind that by deciding to spend your time on this
it will be to the detriment of the project because there are much more
important things that you could be doing, such as writing an openocd
jtagremote client/server that will allow us to use Staf's SVF files,
and many other tasks.
Comment 9 Jacob Lifshay 2020-09-24 20:00:33 BST
(In reply to Staf Verhaegen from comment #7)
> So to me the question is if a similar set of debugging JTAG instructions
> exists for Power and if OpenOCD or another program has support for this.

According to the docs, OpenOCD doesn't currently support 64-bit targets and its support for PowerPC targets is questionable, so we would need to modify OpenOCD in order to use it.
Comment 10 Jacob Lifshay 2020-09-24 20:02:48 BST
(In reply to Jacob Lifshay from comment #9)
> (In reply to Staf Verhaegen from comment #7)
> > So to me the question is if a similar set of debugging JTAG instructions
> > exists for Power and if OpenOCD or another program has support for this.
> 
> According to the docs, OpenOCD doesn't currently support 64-bit targets and
> its support for PowerPC targets is questionable, so we would need to modify
> OpenOCD in order to use it.

http://openocd.org/doc-release/doxygen/targetnotarm.html
Comment 11 Luke Kenneth Casson Leighton 2020-09-24 20:03:00 BST
(In reply to Staf Verhaegen from comment #7)
> I think you guys are mixing up layers. JTAG is a communication interface not
> a protocol. A debugging protocol is something you can add on top. 

in terms of those defined keywords: jacob is talking about implementing the
actual gdb serial debugging protocol in nmigen RTL.

given that the JTAG RTL works and has openocd to act as a gateway between
gdb and JTAG, which does exactly the same job, i am not sure why this is
continuing to be discussed, given that it is taking up precious time.

> AFAIK ARM defines a standard set of JTAG instructions and if a CPU
> implements these one can use OpenOCD to setup a debug server for debugging
> the ARM CPU remotely using gdb.

yes.  and the RISC-V Debug Working Group defined something similar (they
used DMI addresses then proxied DMI over JTAG), then made a patch to openocd
to support those.
 
> So to me the question is if a similar set of debugging JTAG instructions
> exists for Power and if OpenOCD or another program has support for this.

i checked: there was a very old reverse-engineered 32-bit powerpc processor
effort (Power 2.07?) that did not make it into upstream openocd.  of course
that is *their* defined JTAG registers / formats.

i found a really old patch for or1k on the openocd mailing list which looked
pretty straightforward.
Comment 12 Luke Kenneth Casson Leighton 2020-09-24 20:04:15 BST
(In reply to Jacob Lifshay from comment #9)
> (In reply to Staf Verhaegen from comment #7)
> > So to me the question is if a similar set of debugging JTAG instructions
> > exists for Power and if OpenOCD or another program has support for this.
> 
> According to the docs, OpenOCD doesn't currently support 64-bit targets and
> its support for PowerPC targets is questionable, so we would need to modify
> OpenOCD in order to use it.

which is quite a lot of work and why we need that 1 hour's worth of work
to do a client-server implementation of jtagremote in python before doing
that.
Comment 13 Jacob Lifshay 2020-09-24 23:26:18 BST
What about building a GDB protocol to DMI and WB implementation that works both as a software and hardware implementation? The software implementation could connect over jtag and/or nmigen's simulator interface.
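One way to picture the software half of that idea is a command handler written against an abstract backend, so the same handler could sit in front of JTAG, a simulator, or anything else that can read memory. This is only a sketch under that assumption: `DebugBackend`, `DictBackend`, and `handle_packet` are hypothetical names, only the `m addr,length` memory-read command is handled, and the in-memory backend stands in for a real DMI/Wishbone connection.

```python
from abc import ABC, abstractmethod
from typing import Dict


class DebugBackend(ABC):
    """Hypothetical interface that a JTAG or simulator backend would implement."""

    @abstractmethod
    def read_mem(self, addr: int, length: int) -> bytes:
        ...


class DictBackend(DebugBackend):
    """Stand-in backend holding memory in a dict, for illustration and testing."""

    def __init__(self, mem: Dict[int, int]) -> None:
        self.mem = mem

    def read_mem(self, addr: int, length: int) -> bytes:
        # Unpopulated addresses read back as zero in this toy model.
        return bytes(self.mem.get(addr + i, 0) for i in range(length))


def handle_packet(backend: DebugBackend, payload: str) -> str:
    """Handle one unframed RSP payload and return the reply payload."""
    if payload.startswith("m"):
        # m<addr>,<length> with both fields in hex; the reply is hex-encoded data.
        addr_s, len_s = payload[1:].split(",")
        return backend.read_mem(int(addr_s, 16), int(len_s, 16)).hex()
    # An empty reply tells gdb the command is not supported.
    return ""


backend = DictBackend({0x10: 0xAB, 0x11: 0xCD})
print(handle_packet(backend, "m10,2"))
# prints: abcd
```

A hardware version would need the same parsing done as an FSM, so only the backend interface, not the handler code, would be shared between the two.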