Bug 155 - a PLL is needed for the SoC
Summary: a PLL is needed for the SoC
Status: CONFIRMED
Alias: None
Product: Libre-SOC's first SoC
Classification: Unclassified
Component: Source Code
Version: unspecified
Hardware: Other Linux
Importance: --- enhancement
Assignee: Dimitri Galayko
URL:
Depends on:
Blocks: 55 383
Reported: 2020-01-12 21:23 GMT by Luke Kenneth Casson Leighton
Modified: 2020-11-10 15:48 GMT (History)

See Also:
NLnet milestone: ---
total budget (EUR) for completion of task and all subtasks: 0
budget (EUR) for this task, excluding subtasks' budget: 0
parent task for budget allocation:
child tasks for budget allocation:
The table of payments (in EUR) for this task; TOML format:


Description Luke Kenneth Casson Leighton 2020-01-12 21:23:41 GMT
for a SoC, a programmable PLL is needed that can set the clock rate across a
range of frequencies.  also, several clocks are needed for different
peripherals (such as SD/MMC needs 50mhz, UART needs variable rates
from 9600 to 115200 and above etc.)

therefore we need not just an analog PLL which can do different
frequencies, based on a (fixed?) stable input clock (crystal,
usually), we need:

* the actual PLL, to operate at a maximum frequency of the silicon
* a way to cut that in coarse granularity (half, quarter, 1/8th, 1/16th)
* some digital counters (and dividers) that will cut it further
* some "register" control - Wishbone B4 for example - for setting parameters

if a crystal is inconvenient (too analog) we can use a straight external
clock oscillator IC which generates 8mhz, 12.5mhz or other suitable stable
input frequency.

interesting find:

https://github.com/lakshmi-sathi/avsdpll_1v8
Comment 1 Jacob Lifshay 2020-01-12 21:45:53 GMT
(In reply to Luke Kenneth Casson Leighton from comment #0)
> https://github.com/ZipCPU/dpll

I'm guessing that's not the kind of PLL you wanted, it's intended for creating lower frequency sine waves that match a reference sine wave by running a counter off a much higher speed digital clock, adjusting the counter's trip count, and running the counter's output to a sin() lookup table. It's intended for something like audio processing, from what I can tell.

Assuming you want a PLL for generating high-frequency (>500MHz) digital clocks, you'd have to use a different method, such as using an analog VCO for the PLL.
Comment 2 Luke Kenneth Casson Leighton 2020-01-14 05:41:08 GMT
yes, someone contacted me privately offlist to explain. thomas. he also said that if we can get the foundry rules he can design one.
Comment 3 Luke Kenneth Casson Leighton 2020-02-28 20:22:33 GMT
http://bugs.libre-riscv.org/show_bug.cgi?id=178#c144
Comment 4 Jean-Paul.Chaput 2020-02-29 13:01:29 GMT
Hello Luke & Al,

I'm happy to inform you that Pr. Dimitri Galayko, who works in the same
team as me at Sorbonne Université and has good experience in making
PLL/VCO for digital designs, is interested in this endeavor.

I gave him the link to this thread.

His email is <Dimitri.Galayko@lip6.fr>.

Best regards,
Comment 5 Luke Kenneth Casson Leighton 2020-02-29 14:21:19 GMT
(In reply to Jean-Paul.Chaput from comment #4)

> I'm happy to inform you that Pr. Dimitri Galayko, which work in the same
> team as me in Sorbonne Universite and has a good experience in making
> PLL/VCO for digital designs is interested in this endeavor.

that's fantastic, thank you jean-paul, and professor galayko
(when you see this). i will email you off-list for now.
Comment 6 Jacob Lifshay 2020-03-01 03:09:43 GMT
(In reply to Jean-Paul.Chaput from comment #4)
> I'm happy to inform you that Pr. Dimitri Galayko, which work in the same
> team as me in Sorbonne Universite and has a good experience in making
> PLL/VCO for digital designs is interested in this endeavor.

Yay!

Maybe he would be interested in designing a DLL (delay-locked-loop) afterwards since that is likely to also be useful and can share much of the PLL design.
Comment 7 Dimitri Galayko 2020-03-01 11:22:04 GMT
Hello Libre-SOC team 

thank you Jean-Paul for introducing me to this exciting community. 
I would be happy to contribute with my experience in PLL design. 
First, I need to understand the requirements well.

>for a SoC, a programmable PLL is needed that set the clock rate at a
>range of frequencies.  also, several clocks are needed for different
>peripherals (such as SD/MMC needs 50mhz, UART needs variable rates
>from 9600 to 115200 and above etc.)

That is OK, this is quite a common requirement for SoCs. Do you need several clock signals at different frequencies running in parallel, or a single signal with a programmable frequency?

>therefore we need not just an analog PLL which can do different
>frequencies, based on a (fixed?) stable input clock (crystal,
>usually), we need:

>* the actual PLL, to operate at a maximum frequency of the silicon
A PLL is a mixed-signal block, especially if one needs a frequency near the upper limit of the technology. At the very least, the oscillator is designed according to a "custom" design flow; the other blocks may be digital or analog.

>* a way to cut that in coarse granularity (half, quarter, 1/8th, 1/16th)
>* some digital counters (and dividers) that will cut it further 
that is what is usually done

>* some "register" control - Wishbone B4 for example - for setting parameters 
I guess, this is required for programming the frequency? 

>if a crystal is inconvenient (too analog) we can use a straight external
>clock oscillator IC which generates 8mhz, 12.5mhz or other suitable stable
>input frequency.

What do you mean by « too analog »? If a quartz oscillator is used, the reference clock generator itself may be implemented on chip (and it will be analog), or it is indeed possible to use a dedicated external circuit which generates a digital signal.
Comment 8 Luke Kenneth Casson Leighton 2020-03-01 12:04:53 GMT
(In reply to Dimitri Galayko from comment #7)
> Hello Libre-SOC team 
> 
> thank you Jean-Paul for introducing me to this exciting community. 
> I would be happy to contribute with my experience in PLL design. 
> I need first to well understand the need. 

fantastic.  yes.
 
> >for a SoC, a programmable PLL is needed that set the clock rate at a
> >range of frequencies.  also, several clocks are needed for different
> >peripherals (such as SD/MMC needs 50mhz, UART needs variable rates
> >from 9600 to 115200 and above etc.)
> 
> That is OK, this is a quite regular need for SOCs. Do you need several
> clocks signals at different frequencies running in parallel or a single
> signal but with programmable frequency? 

i believe by "parallel" you mean "in phase"?  i.e. that a single PLL
would generate (simultaneously) N, 2N, 4N and so on, all in lock-step?
i cannot think of a scenario where we would need this...

...the only exception being for DDR (double-data-rate), and that
we can generate with a single flip-flop.

i think we can get away with a single frequency from a single PLL:
if we then need multiple (non-phase-locked) frequencies,
we simply lay out more than one block.

> A PLL is a block from a mixed electronics, especially if one needs frequency
> at the higher limit of the technology. At least the oscillator is designed
> according to a "custom" design flow, and other blocks may be digital or
> analog. 

appreciated.

> >* a way to cut that in coarse granularity (half, quarter, 1/8th, 1/16th)
> >* some digital counters (and dividers) that will cut it further 
> that is what is usually done
> 
> >* some "register" control - Wishbone B4 for example - for setting parameters 
> I guess, this is required for programming the frequency? 

yes.  hunting around: there would be something like this:
https://github.com/RoaLogic/vga_lcd/blob/master/rtl/verilog/vga_cur_cregs.v

which is then hooked in like this:
https://github.com/RoaLogic/vga_lcd/blob/master/rtl/verilog/vga_wb_slave.v#L389

and, therefore, when the processor first boots up, the BOOT ROM's first
task will be to poke to a memory address representing the "slave" address
of the PLL register, setting it so that the PLL outputs a frequency of
say 25 mhz for the SD/MMC I/O interface.

at that point the BOOT ROM can start to actually *use* the SD/MMC interface!

before then, it would either be useless (non-functional) or run at only 8mhz


> >if a crystal is inconvenient (too analog) we can use a straight external
> >clock oscillator IC which generates 8mhz, 12.5mhz or other suitable stable
> >input frequency.
> 
> What do you mean by « too analog » ? 

:)

i mean, i know the voltage swing of 12.5mhz and 24/25 mhz crystals:
they're tiny (well below 0.1v if i remember correctly?).  i wasn't
sure if the designs that you are doing would need something more
along the lines of 1.8v, 3.3v or so.

if you _can_ do something that uses this type of XTAL that would be
fantastic, as they are very cheap, and a standard part:

https://uk.farnell.com/iqd-frequency-products/lfxtal055299/crystal-25mhz-18pf-smd/dp/2449505

these are around only USD $0.15 in china, so we greatly prefer these.
12.5mhz, 24 or 25mhz would be the typical crystal frequency.

> If a quartz oscillator is used, the
> reference clock generator itself may be implemented on chip (and it will be
> analog) or it is indeed possible to use some external devoted circuit which
> will generate a digital signal.

from a cost perspective, those 3225 XTALs are what would normally be expected
to be used for mass-produced SoCs (which our project is), because they
are so low-cost.

i only mentioned the external IC because i have seen them used (once), and
it may be slightly easier:
https://uk.farnell.com/iqd-frequency-products/lfspxo025560/crystal-50mhz-15pf-smd/dp/1276652

however you can see from the increased cost, we would not be doing ourselves
a favour to require a $1 oscillator when the SoC is being designed to sell for around $4!

therefore, if you are able to do a circuit which uses the lower-cost quartz crystal
as an external component, and to let us know what capacitance and resistance are expected, that would be fantastic.

(as you know this already, Professor Galayko, i am writing the following for the benefit of other readers.)  usually, the specification for the capacitance is very tight: two capacitors between 3pF and 20pF are attached, chosen to match the requirements of the ASIC circuit, plus sometimes (not always) a 1Mohm (or so) resistor in parallel is required.

you can tell i have designed quite a few PCBs with SoCs, peripheral ICs and Embedded Controllers :)
Comment 9 Luke Kenneth Casson Leighton 2020-03-01 12:36:55 GMT
> That really depends on the specifications and on the needs… 
> Indeed an all-digital PLL may be quite compact if the output 
> frequency running range is not too large. Analog PLL needs RC
> filters and it may be quite area hungry. 

i do not know what may be best, here.  honestly, if we can get away
with multiplying up from 12 or 24 mhz, in reasonably easy multiples
up to 192mhz (let us assume that is a maximum for 350nm?), then some
digital counters can generate 1/3 of that, 1/4 of that, 1/5 of that,
1/90th, 1/29323th of that and so on.

in this way we can calculate, if we need 115200 baud for a UART,
then it can take the standard (non PLL) frequency 25mhz and just
divide/count.  this will be a straight digital circuit, clearly.
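
As a rough illustration of the divide/count approach, the sketch below computes the integer divisor and the resulting baud-rate error for a 25 MHz reference. The function name `baud_divider` is hypothetical, and the model assumes one counter period per bit (no 16x oversampling), which is an assumption and not something stated in this thread.

```python
def baud_divider(clk_hz, baud):
    """Return (divisor, actual_baud, error_fraction) for an integer counter-divider.

    Assumption: the counter produces exactly one tick per bit period.
    """
    divisor = max(1, round(clk_hz / baud))   # nearest integer divide ratio
    actual = clk_hz / divisor                # baud rate actually produced
    error = (actual - baud) / baud           # signed relative error
    return divisor, actual, error


for baud in (9600, 115200):
    div, actual, err = baud_divider(25_000_000, baud)
    print(f"{baud} baud: divide by {div}, actual {actual:.1f}, error {err * 100:+.3f}%")
```

With an integer divisor this large the rounding error stays far below the few-percent tolerance usually quoted for UARTs, which is why a plain counter is enough for the slow peripherals.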

later, when we are not in 350nm, we need to be able to do USB2: this
is very specifically 480mhz (multiple of 24mhz).

some peripherals, such as RGMII... no, i am just checking a datasheet
for the RTL8211, the *PHY* generates the 125mhz input clock, and the
recipient SoC can then use *that*.  so that one is okay.

SDMMC i believe you have to generate a 48mhz clock.
https://github.com/ARMmbed/mbed-os/blob/master/targets/TARGET_STM/TARGET_STM32F4/device/stm32f4xx_ll_sdmmc.c#L48

yes, so we need to be able to generate 48 mhz, and to then route that
out.

however for later SoCs, when we get to 40nm, 28nm, 22nm and so on, we will
need around 400mhz, 800mhz, 1200mhz possibly as high as 1.6ghz

the exact frequency is not hugely relevant, but given that this is intended
as a low-power SoC, being able to set good granularity (approximately
100mhz increments) is reasonably important.

but, going *very* specific - 192.9323 mhz, 28.239929292999 mhz, that is
not a priority at all.

* Quartz XTAL, either 12.5mhz or 24mhz, as the internal reference clock.
* 2x, 4x, 8x, 16x the internal 12.5/24mhz clock
* UART (4800 baud to 115200 baud or 2x or 4x that)
* SD/MMC up to 48mhz
* eMMC up to 100mhz 
* SPI 24mhz maximum
* HyperRAM 100mhz, 150mhz, 160mhz (if doing DDR, so internally 320mhz)
* I2C needs between 100khz and 400khz.

so the slower ones (UART, I2C) can be done as a simple digital
counter / divider circuit.

the faster ones (HyperRAM, eMMC, main CPU clock) could be done with a
digital PLL that multiplies by 2/4/8/16, then perhaps a very *very* simple
divider/counter ( 1/3, 1/5, 1/6, 1/7 ) to minimise gate delay?
Comment 10 Luke Kenneth Casson Leighton 2020-03-01 12:41:56 GMT
oh, also SDRAM.  100mhz and 133mhz.  133mhz is 400/3, which can be achieved
(approximately) by 24mhz times 16 in the PLL, then a simple counter-divider
divide by 3.

https://en.wikipedia.org/wiki/PC100
https://en.wikipedia.org/wiki/PC133

this is the kind of simplistic flexibility we need, all based around doubling,
quadrupling etc. a 12.5 or 24 mhz XTAL, rather than very specific targeted
analog frequencies.
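
To make the "approximately" concrete, here is a small worked calculation of the 24 MHz x16 /3 case against the 400/3 MHz target mentioned above. The helper name `derived_clock` is hypothetical; the numbers come straight from this comment.

```python
from fractions import Fraction


def derived_clock(xtal_mhz, pll_mult, divider):
    """Frequency in MHz after multiplying the crystal in the PLL and dividing by an integer."""
    return Fraction(xtal_mhz) * pll_mult / divider


target = Fraction(400, 3)             # the 133 MHz SDRAM figure, expressed as 400/3 MHz
got = derived_clock(24, 16, 3)        # 24 MHz * 16 in the PLL, then divide by 3
error_pct = float((got - target) / target * 100)
print(f"got {float(got):.2f} MHz, target {float(target):.2f} MHz, error {error_pct:+.1f}%")
```

The result lands a few percent below the nominal figure, so the memory would simply be clocked slightly slower than its rated speed.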
Comment 11 Jacob Lifshay 2020-03-01 19:09:39 GMT
We should try to have a divide-by-2 as the last step, since that easily generates a duty cycle of 50%. Getting 50% duty cycle is much more complicated in a divide by 3 (or some other odd number).
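
A minimal sketch of why this is so, assuming a plain counter-divider that only changes its output on rising edges of the input clock (the function name `divider_duty_cycle` is hypothetical). The output can then only be high for a whole number of input cycles, so an odd ratio cannot reach 50% without extra logic on the falling edge.

```python
def divider_duty_cycle(n):
    """Duty cycle of a divide-by-n counter clocked on rising edges only.

    The output is high while the counter value is below n // 2, so it can
    only change after a whole number of input cycles.
    """
    high_cycles = sum(1 for count in range(n) if count < n // 2)
    return high_cycles / n


for n in (2, 3, 4, 5, 6):
    print(f"divide by {n}: duty cycle {divider_duty_cycle(n):.3f}")
```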
Comment 12 Jacob Lifshay 2020-03-01 19:20:19 GMT
For the 28nm SoC, it would be nice (but isn't required) to be able to adjust the core clock in small increments (step size of 100MHz or less) and be able to run up above 2.5GHz since those are useful for overclocking.

Additionally, for less niche use cases, the small step size makes lower clock speeds for lower-power modes nicer to use, since there is a more varied choice.
Comment 13 Dimitri Galayko 2020-05-26 10:00:27 BST
Dear all, I'm following the discussion by mail with Luke. There is a need to create an "open-source" and generic PLL with the following specifications:

Technology: TSMC 180 nm
Input: 24 MHz XTAL
Output Frequency: ~300 MHz (~x12), with intermediate frequencies at x2, x3, x4, x5, x6. Twice the output frequency (~600 MHz) needs to be generated internally, so that dividing it by two gives a 300 MHz clock with well-balanced 1 and 0 phases.

Goal: inclusion in the October tape-out
Power: TBD -- the whole chip is expected to consume 3 watts
Jitter specifications: TBD

I believe that an all-digital PLL can be a good first approach, if the design needs to be “generic”. The most critical and difficult block is the digitally controlled oscillator (DCO), which largely determines the performance of the whole.

We have experience with several all-digital PLL designs, whose basic blocks (mainly the DCO and the phase detector) were inspired by the following work (done in 65 nm CMOS):
**Tierno, Jose A., Alexander V. Rylyakov, and Daniel J. Friedman. "A wide power supply range, wide tuning range, all static CMOS all digital PLL in 65 nm SOI." IEEE Journal of Solid-State Circuits 43.1 (2008): 42-51.**
A detailed report on what has been done in my team is available in this PhD thesis (in English, available on the web):
**Shan, Chuan. Générateur distribué d'horloge pour puces globalement et localement synchrones de grande taille. Diss. Paris 6, 2014.**

The advantage of this approach is its compatibility with the digital design flow: all blocks belong to digital electronics (even if some of them require custom sizing). I believe that this suits the goal of a “generic” design flow well. It is possible that the performance will not be satisfactory in 180 nm (mainly the power consumption); in that case I have some ideas about possible alternatives.

My first simulations in a 180 nm technology (not TSMC, but equivalent) show that an 8-bit DCO can ensure a 600 MHz output frequency in all corners, with a resolution able to provide reasonably good jitter performance and a power consumption of ~5 mW.

So, regarding the specifications, the most important points determining the overall architecture are:
-- Jitter
-- power consumption.

Is there any idea about what is expected for these two specifications?
Comment 14 Luke Kenneth Casson Leighton 2020-05-26 11:24:38 BST
(In reply to Dimitri Galayko from comment #13)
> Dear all, I'm following the discussion by mail with Luke. There is a need to
> create a "open-source" and generic PLL with the following specifications: 
> 
> Technology: TSMC 180 nm
> Input: 24 MHz XTAL
> Output Frequency: ~300 MHz (~x12), with fractional intermediate frequencies
> x2, x3, x4, x5, x6.

as taps, this would be very helpful, so as to be able to do further counter-dividers to create a wide frequency range for peripherals.

> A double of the frequency needs to be generated, in
> order to have a 300 MHz well-balanced 1 and 0 phases. 

interesting.  600mhz in 180nm.

> My first simulations in a 180 nm technology (not TSMC, but equivalent) shows
> that a 8 bits DCO may ensure a 600 MHz output frequency for all corners with
> a resolution able to provide a reasonably good jitter performance, with the
> power consumption of ~5mW.  

this sounds perfectly reasonable.

> So, regarding the specifications, the most important point determining the
> overall architecture is : 
> -- Jitter 
> -- power consumption. 
> 
> Is there any idea about what is expected for these two specifications ?

5mW is perfectly acceptable.  we do not have enough experience to say if jitter is an issue. for this first chip, being a noncritical test chip, if it is unstable due to jitter we simply run tests at incrementally slower speeds until it is stable.
Comment 15 Staf Verhaegen 2020-05-26 21:12:45 BST
> Input: 24 MHz XTAL

AFAIK there is no crystal oscillator IP available for this project so I assumed the input clock would be CMOS.

> So, regarding the specifications, the most important point determining the
> overall architecture is : 
> -- Jitter 

For a digital circuit, clock jitter eats into the timing budget: the critical path has to be fast enough to produce its result within the shortest clock cycle that jitter can cause.
Analog/mixed-signal blocks will have their requirement on jitter, like resolution/bandwidth for an ADC or error rate for SERDES.
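
A back-of-the-envelope sketch of the digital timing-budget point, with purely illustrative numbers: the 0.2 ns jitter figure and the function name `max_critical_path_ns` are assumptions, not values from this project. Jitter is modelled here as the worst-case amount by which a single clock period can be shortened.

```python
def max_critical_path_ns(f_clk_mhz, jitter_pk_ns, setup_ns=0.0):
    """Longest combinational delay that still meets timing on the shortest cycle.

    f_clk_mhz    : nominal clock frequency in MHz
    jitter_pk_ns : worst-case shortening of one period due to jitter (assumed)
    setup_ns     : flip-flop setup time, defaults to zero for simplicity
    """
    period_ns = 1000.0 / f_clk_mhz
    return period_ns - jitter_pk_ns - setup_ns


budget = max_critical_path_ns(300, 0.2)   # 300 MHz clock, 0.2 ns assumed jitter
print(f"critical path must settle within {budget:.3f} ns")
```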
Comment 16 Luke Kenneth Casson Leighton 2020-05-26 22:30:49 BST
(In reply to Staf Verhaegen from comment #15)
> > Input: 24 MHz XTAL
> 
> AFAIK there is no crystal oscillator IP available for this project so I
> assumed the input clock would be CMOS.

if we do not have all the pieces, yes we will use a CMOS external clock, this is the fallback.

Professor Galayko, i may not have made it clear: we are not signing any NDAs and so need *everything* from scratch that would normally be provided by the Foundry (under NDA).  thank you for raising this, Staf.

did the PLL that you developed also include a XTAL interface?
Comment 17 Luke Kenneth Casson Leighton 2020-09-26 14:44:49 BST
professor galayko,

if we are to include the 600 mhz PLL that you have developed in the 180nm SoC it makes me nervous to have two unproven blocks connected together.

i would like us to be able to include it in a way where it can be connected and disconnected very simply by an external pin.

also, what do you think of making it possible to independently confirm that the PLL is functional, by driving an outgoing pin from a simple counter-divider so that the signal does not exceed the drive speed of a QFP pad?

something like this:

digital_clk@24mhz -> PLL -> clk_600
clk_600 -> div2 -> clk_300
clk_300 -> div6 -> clk_48

SOC_CLK = MUX(ext_clksel,
              digital_clk@24mhz,
              clk_48)

* clk_48 will be sent directly to an external pad.
* SOC_CLK will drive the main SoC clock
* ext_clksel will be an external pin

in this way we can independently verify the PLL by checking the clk_48 signal, even if the SoC is nonfunctional

also we can independently verify the SoC even if the PLL is nonfunctional.

Staf does this approach run into any of the timing constraints issues you raised?

if this is ok we can write a simple digital divider in nmigen, implementing all of the above, where the only thing needed would be the 300mhz output from the PLL.

(you would not need to write the 6x digital divider, professor, we can do it)
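
The chain above can be sketched as simple frequency arithmetic. Two assumptions are made here and are not stated in the pseudocode: the mux follows the nmigen `Mux(sel, val1, val0)` convention (select high picks the first listed input), and the PLL multiplies the 24 MHz input by 24. Under that second assumption the signals named clk_600 and clk_300 are really 576 and 288 MHz, which is what makes clk_48 exactly 48 MHz. The function name `clock_chain` is hypothetical.

```python
def clock_chain(ref_mhz, pll_mult, ext_clksel):
    """Model the proposed chain; returns (clk_48 pad frequency, SOC_CLK frequency) in MHz."""
    clk_600 = ref_mhz * pll_mult      # raw PLL output (nominally 600 MHz)
    clk_300 = clk_600 / 2             # div2 gives a balanced duty cycle
    clk_48 = clk_300 / 6              # slow enough to drive an external pad
    # Assumed polarity: ext_clksel=1 selects the external digital clock.
    soc_clk = ref_mhz if ext_clksel else clk_48
    return clk_48, soc_clk


print(clock_chain(24, 24, ext_clksel=0))   # SoC runs from the divided PLL output
print(clock_chain(24, 24, ext_clksel=1))   # SoC runs from the external clock
```

Either way clk_48 is present on its pad, so the PLL can be checked with a scope regardless of which source is driving the SoC.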
Comment 18 Luke Kenneth Casson Leighton 2020-09-27 09:17:40 BST
this is what i propose:
https://git.libre-soc.org/?p=soc.git;a=blob;f=src/soc/clock/select.py;hb=HEAD

a set of 3 external pins allows selection from the following options:

  - 0b000 - CLK_24 (direct)
  - 0b001 - PLL / 6
  - 0b010 - PLL / 4
  - 0b011 - PLL / 3
  - 0b100 - PLL / 2
  - 0b101 - PLL
  - 0b110 - ZERO (direct driving in combination with ONE)
  - 0b111 - ONE

the default is the external 24 mhz digital drive, and it's wired
through combinatorially.

zero and one are available for convenience (note that bit zero of the
"selectors" effectively becomes the clock): these can be wired to a
bounce-free toggle or an Embedded Controller GPIO (e.g. STM32F)

manual clock selection through these 3 "select" wires would only be
done once the PLL is known to be stable at 300mhz.  this is to be verified
by checking the PLL/6 pin (pll_48_o)

thoughts?
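
A plain-Python behavioural sketch of the selection table, to show how toggling only bit zero of the selector between 0b110 and 0b111 turns the select pins into a manual clock. This is not the code in select.py: the constant names and the `core_clk` function are hypothetical stand-ins.

```python
# Selector encodings, following the table above (names are illustrative).
SYS_CLK, PLL6, PLL4, PLL3, PLL2, PLL, ZERO, ONE = range(8)


def core_clk(clk_sel, clk_24, pll_taps):
    """Combinational output mux: pick one of eight sources from a 3-bit selector.

    clk_24   : current level (0 or 1) of the external 24 MHz clock
    pll_taps : dict mapping PLL6..PLL to their current levels
    """
    if clk_sel == SYS_CLK:
        return clk_24
    if clk_sel == ZERO:
        return 0
    if clk_sel == ONE:
        return 1
    return pll_taps[clk_sel]


# Direct driving: only bit 0 of the selector changes, and the output follows it.
taps = {PLL6: 0, PLL4: 0, PLL3: 0, PLL2: 0, PLL: 0}
manual = [core_clk(sel, 0, taps) for sel in (0b110, 0b111, 0b110, 0b111)]
print(manual)
```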
Comment 19 Staf Verhaegen 2020-09-30 11:32:37 BST
(In reply to Luke Kenneth Casson Leighton from comment #18)
> this is what i propose:
> https://git.libre-soc.org/?p=soc.git;a=blob;f=src/soc/clock/select.py;hb=HEAD

I think you generate a circular dependency here. You should not have the configuration register be synchronous to the clock that is generated by this PLL.

> 
> an external suite of 3 pins allows to select from the following options:
> 
>   - 0b000 - CLK_24 (direct)
>   - 0b001 - PLL / 6
>   - 0b010 - PLL / 4
>   - 0b011 - PLL / 3
>   - 0b100 - PLL / 2
>   - 0b101 - PLL
>   - 0b110 - ZERO (direct driving in combination with ONE)
>   - 0b111 - ONE
> 
> the default is the external 24 mhz digital drive, and it's wired
> through combinatorially.
> 
> zero and one are available (note that bit zero of the "selecters"
> effectively becomes the clock) for convenience, these can be
> wired to a bounce-free toggle or an Embedded Controller GPIO
> (e.g. STM32F)
> 
> manual clock selection through these 3 "select" wires would only be
> done once the PLL is known to be stable at 300mhz.  this to be verified
> by checking the PLL/6 pin (pll_48_o)
> 
> thoughts?

It was always in the plan (in my head) to be able to bypass the PLL. I would not specify the other configuration options as hard specs, but let them come from the PLL design. Supporting only power-of-two clock division is most natural, as it can be done with just flip-flops; otherwise extra logic needs to be inserted in the PLL feedback loop.

The division of the clock could also be made programmable through JTAG so no extra pins are needed.

I don't think this has to be frozen at the moment for the current design delivery, as that does not include the PLL yet. We should have a meeting sometime in mid-October to finalize the package and pin allocation.
Comment 20 Luke Kenneth Casson Leighton 2020-09-30 12:32:38 BST
(In reply to Staf Verhaegen from comment #19)
> (In reply to Luke Kenneth Casson Leighton from comment #18)
> > this is what i propose:
> > https://git.libre-soc.org/?p=soc.git;a=blob;f=src/soc/clock/select.py;hb=HEAD
> 
> I think you generate a circular dependency here. 

i do not believe so (happy to be shown otherwise).  bear in mind:  "sync"
domain is intended to be the 300mhz PLL and would be set up externally
as:

     m.submodules.csel = DomainRenamer("pll300mhzdomain")(ClockSelect())

externally it becomes the responsibility of the parent not to screw up
by e.g. creating an external circular dependency.

* the clk_sel_i wires are intended to go *DIRECTLY* out on to ASIC pins.
  not even set by anything, not in any chip memory - nothing:  i mean
  DIRECTLY out.

* likewise pll_48_o is intended to be put directly out of the ASIC onto
  a pin.

* likewise clk_24_i is intended to be a direct input onto a pin, for connecting
  to a 24 mhz digital clock.


> You should not have the
> configuration register be synchronous to the clock that is generated by this
> PLL.


these are combinatorially set:

        comb += clkgen[SYS_CLK].eq(self.clk_24_i) # 1st is external 24mhz
        comb += clkgen[ZERO].eq(0) # LOW (use with ONE for direct driving)
        comb += clkgen[ONE].eq(1) # HI

in the 300mhz PLL sync domain, counter and clkgen are only ever set from
clkgen or counter themselves, for example:

        sync += clkgen[PLL2].eq(~clkgen[PLL2]) # half PLL rate

and finally the outputs are set combinatorially, from clkgen:

        comb += self.core_clk_o.eq(clkgen[self.clk_sel_i])

what _is_ a problem though is that if clk_sel_i is set in the middle of
any given clock-pulse you will get one weird spike (uneven rising time
vs falling time).

as a test chip i am not going to worry about that.
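
To illustrate the spike, here is a small time-stepped model of an unsynchronised mux switching between a slow and a fast square wave part-way through a cycle. The periods (40 and 10 time units), the switch time and all function names are arbitrary assumptions chosen only to make the effect visible; they are not taken from the real design.

```python
from itertools import groupby


def square(t, period):
    """Level of an ideal 50% duty square wave at integer time t."""
    return 1 if (t % period) < period // 2 else 0


def muxed(t, t_switch, slow_period=40, fast_period=10):
    """Output of an unsynchronised mux that changes source at t_switch."""
    return square(t, slow_period) if t < t_switch else square(t, fast_period)


def run_lengths(levels):
    """Collapse a sampled waveform into (level, duration) pairs."""
    return [(level, len(list(group))) for level, group in groupby(levels)]


pulses = run_lengths(muxed(t, t_switch=23) for t in range(40))
for level, width in pulses:
    print(f"level {level} for {width} time units")
```

Around the switch point the model produces pulses shorter than the half-period of either source. That is the uneven pulse described above, and it is the reason production designs usually use a glitch-free clock mux.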


> > thoughts?
> 
> It was always in the plan (in my head) to be able to bypass the PLL.

ah ok, likewise, i didn't vocalise it :)

> Other
> configuration options I would not specify as hard specs but let it come from
> the  PLL design. Only supporting power of two clock division is most natural
> as it allows to use just only flipflops for it, otherwise extra logic needs
> to be inserted in the PLL feedback loop.

this isn't intended for association - in any way - with the actual PLL.
it's specifically not intended for connection back to - or any feedback
of any kind - to the actual PLL.

i anticipate (expect) that the PLL would generate one frequency and
one frequency only: a stable 300 mhz output and no other frequency.
(give-or-take a bit: actually 12x 24mhz).

now, if the *PLL* has feedback to create that stable 300mhz clock, great.
and if there happen to be stable "taps" off of the internal digital divider
inside the PLL, that's really great, it will save having to do flip-flops.


> Also the division of the clock could also be made programmable through JTAG
> so no extra pins are needed.

mmmm... i like the principle: it does make me nervous connecting unknown
blocks together though.  simple as it is, if the JTAG HDL goes wrong,
we can't test the PLL.

*reading* from the PLL (having some sort of JTAG-readable counters that
trigger from different taps on the PLL) that's a different matter.

or even just "a counter": set the clock source externally (from the
3 clk_sel_i pins), read the counter over JTAG, and if it counts faster
when clk_sel_i is set to 300mhz rather than 48mhz, you know the PLL's good.


> I don't think this has to be frozen at the moment for the current design
> delivery as that does not include the PLL yet. We should have a meeting
> somewhere mid October to finalize the package and pin allocation.

ok.  answer under different bugreport.
https://bugs.libre-soc.org/show_bug.cgi?id=508
Comment 21 Staf Verhaegen 2020-11-10 08:37:03 GMT
A little over a week ago Dimitri, Marie-Minerve from LIP6 and I had a call to discuss the PLL design for the Libre-SoC. We propose to use only powers of two for the division in the PLL, so that just flip-flops can be used for it. See also https://en.wikipedia.org/wiki/Frequency_divider

This is what we propose for the pin-out of the PLL:

* The refclk signal is the external clock signal.
* One external signal vcodiv that selects between having the VCO oscillating at 16x or 8x the reference clock.
* Two signals clksel to select between four signals to use for the SoC clock: the reference clock (i.e. bypass the PLL), VCO/2, VCO/4 and VCO/8.
* One external digital output VCOd16 that is VCO/16
* One analog output, likely the voltage for VCO.
* The current design has no lock signal; Dimitri will see if he can provide one. If so, it will also be an external digital output.

Dimitri will see if he can tune the PLL so that there is a factor of two between the maximum and minimum oscillation frequency of the VCO. This way one could in theory reach any clock frequency for the SoC between fVCOmax/2 and fVCOmin/8 by adjusting the frequency of the reference clock and the clock division.
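
The reasoning behind the factor of two can be checked with a short calculation: with output dividers of 2, 4 and 8, each divider covers one octave, so the three intervals join up without gaps only if the VCO itself spans a full octave. The frequencies below (192 to 384 MHz) and the function names are illustrative assumptions, not measured VCO limits.

```python
def soc_clock_ranges(f_vco_min, f_vco_max, dividers=(2, 4, 8)):
    """Reachable SoC clock interval (low, high) for each output divider."""
    return {d: (f_vco_min / d, f_vco_max / d) for d in dividers}


def is_contiguous(ranges):
    """True if the intervals, sorted by lower bound, leave no gaps between them."""
    spans = sorted(ranges.values())
    return all(lo_next <= hi_prev
               for (_, hi_prev), (lo_next, _) in zip(spans, spans[1:]))


octave = soc_clock_ranges(192.0, 384.0)    # f_max = 2 * f_min
narrow = soc_clock_ranges(200.0, 300.0)    # f_max = 1.5 * f_min
print(octave, is_contiguous(octave))
print(narrow, is_contiguous(narrow))
```

With the octave-wide VCO every frequency from fVCOmin/8 up to fVCOmax/2 is reachable; with the narrower VCO there are holes between the divider ranges.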
Comment 22 Luke Kenneth Casson Leighton 2020-11-10 12:00:31 GMT
(In reply to Staf Verhaegen from comment #21)
> A little over a week ago me, Dimitri and Marie-Minerve from LIP6 had a call
> to discuss the PLL design for the Libre-SoC. We propose to only use power of
> two for the division in the PLL so just flip-flops can be used for it. See
> also https://en.wikipedia.org/wiki/Frequency_divider This is what we propose
> for the pin-out o the PLL:
> 
> * The refclk signal is the external clock signal.
> * One external signal vcodiv that selects between having the VCO oscillating
> at 16x or 8x the reference clock.
> * Two signals clksel to select between four signals to use for the SoC
> clock: the reference clock (e.g. bypass PLL), VCO/2, VCO/4 and VCO/8.
> * One external digital output VCOd16 that is VCO/16
> * One analog output, likely the voltage for VCO.
> * Current design has not lock signal, Dimitri will see if he can provide
> that. If so it will also be an external digital output.

if these can be the actual pinouts of the PLL itself, rather than part of
the nmigen-based digital core, great.

we will be trusting that the PLL bypass is functional.


> Dimitri will see if he can tune the PLL so there is a factor of two between
> the maximum and minimum oscillation frequency of the VCO. This way one could
> in theory reach any clock frequency for the SoC between fVCOmax/2 and
> fVCOmin/8 by playing with frequency of reference clock and the clock
> division.

this would be great.

i am going to create a "fake" (dummy) PLL which can be replaced by you (Staf)
or you (Jean-Paul) at the verilog / gate / layout level.

the "fake" PLL passes through the external reference clock directly to the
PLL output so as to provide a means to test... something.

i will be absolutely honest and say that, after the warnings about clock
trees that you gave, Staf, i would greatly prefer that the core not be linked
to the PLL at all (a stand-alone test unit).

however that would leave us without an opportunity to try testing the layout at
higher frequencies.


i'm therefore willing to take that risk if Jean-Paul is happy with the implications, which i assume would be as follows:

* that the external clock would *ONLY* go to the PLL (not as a Clock Tree)
* that the PLL *output* is done as a Clock (H) Tree.

if this is correct then the PLL should be positioned right next to where the external clock comes in.
Comment 23 Staf Verhaegen 2020-11-10 12:56:54 GMT
(In reply to Luke Kenneth Casson Leighton from comment #22) 
> 
> i'm therefore willing to take that risk if Jean-Paul is happy with the
> implications, which i assume would be as follows:
> 
> * that the external clock would *ONLY* go to the PLL (not as a Clock Tree)
> * that the PLL *output* is done as a Clock (H) Tree.
> 
> if this is correct then the PLL should be positioned right next to where the
> external clock comes in.

The bypass of the PLL will be done by only using the MUX and buffers of the standard cell library. If that doesn't work, chances of anything else working on the chip are negligible.
I agree with Jean-Paul and Dimitri having the final say on this subject.
Comment 24 Luke Kenneth Casson Leighton 2020-11-10 13:15:00 GMT
(In reply to Staf Verhaegen from comment #23)

> The bypass of the PLL will be done by only using the MUX and buffers of the
> standard cell library.

a single Mux is what i put into a module called ClockSelect, this morning.

> If that doesn't work, chances of anything else
> working on the chip are negligible.

good point.

> I agree with Jean-Paul and Dimitri having the final say on this subject.

the current niolib does this:

* clocks are declared as clocks in a configuration parameter (sys_clk, jtag_tck)
* for each clock, a clock tree **MUST** be made, covering the **ENTIRE** chip
* each clock will have its own ring created, for connection to the IO pads

what we may really need is:

* the clocks that are to be specified for creating H-Clock Trees are
  specified by a **SEPARATE** configuration parameter.

i will raise a separate bugreport about this.
Comment 25 Luke Kenneth Casson Leighton 2020-11-10 13:25:34 GMT
(In reply to Staf Verhaegen from comment #23)

> The bypass of the PLL will be done by only using the MUX and buffers of the
> standard cell library.

that should still be *inside* the PLL block, correct?  i.e. the *PLL block* should use the standard cell library and should use a MUX and buffers, yes?

i.e. this should be done by Dimitri.

the LibreSOC core (and litex peripherals) will *only* take the one clock, and the one clock shall be the PLL's output, correct?  LibreSOC's core (and peripherals) shall *NOT* try to do the MUX, correct?
Comment 26 Staf Verhaegen 2020-11-10 14:20:45 GMT
(In reply to Luke Kenneth Casson Leighton from comment #25)
> (In reply to Staf Verhaegen from comment #23)
> 
> > The bypass of the PLL will be done by only using the MUX and buffers of the
> > standard cell library.
> 
> that should still be *inside* the PLL block, correct?  i.e. the *PLL block*
> should use the standard cell library and should use a MUX and buffers, yes?
> 
> i.e. this should be done by Dmitri.
> 
> the LibreSOC core (and litex peripherals) will *only* take the one clock,
> and the one clock shall be the PLL's output, correct?  LibreSOC's core (and
> peripherals) shall *NOT* try to do the MUX, correct?

Correct, the clock signal for the libre-soc is an output of the PLL block. It's the task of Jean-Paul and Dimitri to align the layout of the PLL with the clock-tree synthesis in Coriolis.
Dimitri has the standard cell library at hand and will place the cells himself in the PLL block, also for the bypass. The reference clock is an input to the PLL block coming from an IO cell.
Comment 27 Luke Kenneth Casson Leighton 2020-11-10 15:48:23 GMT
(In reply to Staf Verhaegen from comment #26)

> Correct, the clock signal for the libre-soc is an output of the PLL block.
> It's the task of Jean-Paul and Dimitri to align the layout of the PLL with
> the clock-tree synthesis in Coriolis.
> Dimitri has the standard cell library at hand and will place the cells
> himself in the PLL block, also for the bypass. The reference clock is an
> input to the PLL block coming from an IO cell.

brilliant.  i will create a dummy module that has the same API (comment #21) which for test purposes wires Ref-in to PLL-out.
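
A behavioural sketch of what such a dummy might look like, written as a plain Python class and not as the eventual nmigen module. The port names refclk, vcodiv, clksel and VCOd16 follow comment #21; the class name, the method names, and the choice to tie the VCO/16 output low are assumptions made only for illustration.

```python
class DummyPLL:
    """Stand-in with the proposed port names but no real PLL behind them."""

    def __init__(self):
        self.vcodiv = 0   # ignored here: would select the VCO at 16x or 8x refclk
        self.clksel = 0   # ignored here: would select refclk, VCO/2, VCO/4 or VCO/8

    def clk_out(self, refclk):
        """SoC clock output: always the reference clock (permanent bypass)."""
        return refclk

    def vco_d16(self, refclk):
        """VCOd16 test output: tied low (an assumption), as there is no VCO to divide."""
        return 0


pll = DummyPLL()
pll.clksel = 2                                   # has no effect in the dummy
print([pll.clk_out(level) for level in (0, 1, 0, 1)])
```

Because the interface matches, the real block can later be dropped in at the verilog / gate / layout level without touching the core.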