Bug for the SVP64 primer document outlining the use cases and advantages of the SVP64 specification.
Andrey, if you need help with any SVG images, Veera can help: just draw them quickly by hand.
Andrey, in summary, the purpose of the sequence of documents is to communicate, explain and persuade. It starts with an overview from which someone new to the topic can quickly grasp the principles, features and benefits of SVP64. That overview is the prologue to the next section, which will contain greater detail and will form the basis of future programming/application/training guides. If you need help reviewing as you go along, just ask.
Input from Paul "Something usable within a couple of weeks would be good. I have an opportunity to set up a meeting with the people inside IBM who are looking at scalable vector architectures and this would be really useful for that"
(In reply to djac from comment #3)
> Input from Paul "Something usable within a couple of weeks would be good. I
> have an opportunity to set up a meeting with the people inside IBM who are
> looking at scalable vector architectures and this would be really useful for
> that"
ahh that's very interesting. so actually, if they're evaluating *all* available scalable vector architectures, then that's very different.
firstly it means they'll want to know what those are, secondly it's reasonable i feel to be able to assume they're extremely intelligent and know what they're looking at, and thirdly that they'll be doing a "this vs that" so anything we can help them with there, to know what the differences are, would i feel be beneficial.
thoughts?
The KISS maxim applies. Comprehensive but simple. We should have a 3-way Zoom on Monday.
(In reply to djac from comment #5)
> The KISS maxim applies.
>
> Comprehensive but simple. We should have a 3-way Zoom on Monday.
ack. nothing like this then: :)
https://ftp.libre-soc.org/20220617_110034.jpg
https://youtu.be/1SsMVP1CTFI
(In reply to Luke Kenneth Casson Leighton from comment #4)
> ahh that's very interesting. so actually, if they're evaluating *all*
> available scalable vector architectures, then that's very different.
Interesting, so we'll need a nice little comparison page of all those arch's. No pressure :D
> firstly it means they'll want to know what those are, secondly it's
> reasonable i feel to be able to assume they're extremely intelligent
That means we'll need brief, yet deep content for them to look over?
> what the differences are, would i feel be beneficial. thoughts?
A comparison sounds like an easy way to demonstrate the power of SV.
(In reply to djac from comment #5)
> The KISS maxim applies.
>
> Comprehensive but simple. We should have a 3-way Zoom on Monday.
Will be there.
(In reply to Luke Kenneth Casson Leighton from comment #6)
> ack. nothing like this then: :)
>
> https://ftp.libre-soc.org/20220617_110034.jpg
> https://youtu.be/1SsMVP1CTFI
Actually the diagram is a good idea. Just scale down to fit on a page.
The video is still processing, so haven't seen it yet.
I submitted my latest content here:
https://git.libre-soc.org/?p=libreriscv.git;a=blob;f=svp64-primer/summary.tex;h=b1b47f8754efc13699df898941cf55002fc823cb;hb=c622226176550788c3cf447db36f4fde07bff16f
The summary.tex file is a secondary file, you'll also find the main one, bibliography, and acronyms (figured a good idea as we have so many, even if we don't use in the primer).
The text is a bit of a mess, and I haven't added any examples yet (wasn't sure if you wanted to use the code from the sigarch article, or use of our own).
Currently I have a brief summary of SIMD, Vector Processing, and started to add SV. The text as is, is too big, but I'll leave it for now. We can cut away for the primer next week.
I will now be heading off on my weekend camping trip, but can check bugs/logs, David can reach me if needed.
another thing we need to establish: what is IBM looking for? as in: which "features"? are they looking for high performance, low power, compiler toolchain support, particular capabilities? we don't know yet. knowing these things would radically alter what we write.
(In reply to Andrey Miroshnikov from comment #7)
> Interesting, so we'll need a nice little comparison page of all those
> arch's. No pressure :D
number of instructions says it all, really.
> > firstly it means they'll want to know what those are, secondly it's
> > reasonable i feel to be able to assume they're extremely intelligent
> That means we'll need brief, yet deep content for them to look over?
David suggested a 2-3 page document with features only that would leave them wishing/wanting to ask more questions.
> Actually the diagram is a good idea. Just scale down to fit on a page.
> The video is still processing, so haven't seen it yet.
still uploading *quail*
> I submitted my latest content here:
> https://git.libre-soc.org/?p=libreriscv.git;a=blob;f=svp64-primer/summary.
> tex;h=b1b47f8754efc13699df898941cf55002fc823cb;
> hb=c622226176550788c3cf447db36f4fde07bff16f
great. made some clarifications.
> I will now be heading off on my weekend camping trip, but can check
> bugs/logs, David can reach me if needed.
brilliant. enjoy.
important features and benefits to mention:
* The v3.1 Specification is not altered in any way.
* Specifically designed to be easily implemented on top of an existing Micro-architecture (especially Superscalar Out-of-Order Multi-issue) without disruptive full architectural redesigns.
* Divided into Compliancy Levels to suit differing needs.
* Even at the highest Compliancy Level, only four instructions are required (SVE2 requires approx. 9,000, AVX-512 around 10,000, RVV around 300).
* Predication, an often-requested feature, is added cleanly to the Power ISA (without modifying the v3.1 Power ISA).
* In-registers arbitrary-sized Matrix Multiply is achieved in three instructions (without adding any v3.1 Power ISA instructions).
* Full DCT and FFT RADIX2 Triple-loops are achieved with a dramatically reduced instruction count, and power consumption is expected to reduce greatly. Such capability is normally found only in high-end VLIW DSPs (TI MSP, Qualcomm Hexagon).
* Fail-First Load/Store allows strncpy to be implemented in around 14 instructions (optimised VSX assembler is 240).
* Inner loop of MP3 is implemented in under 100 instructions (gcc produces 450 for the same function).
All areas investigated so far consistently showed reductions in executable size, which, as outlined in {SIMD_HARM}, indirectly reduces power consumption, due both to less I-Cache/TLB pressure and to Issue remaining idle.
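The predication and fail-first bullets can be pictured with a small Python model. This is only a sketch of the concept under stated assumptions: `sv_add`, `strncpy_ffirst` and the `maxvl` parameter are hypothetical names made up for illustration, registers are modelled as a plain list, and none of this is the SVP64 pseudocode or the 14-instruction assembler mentioned above.

```python
MASK64 = (1 << 64) - 1


def sv_add(regs, rd, ra, rb, vl, pred_mask=None):
    """Repeat one scalar add VL times over consecutive registers.

    Hypothetical model: bit i of pred_mask enables element i.
    """
    for i in range(vl):
        if pred_mask is not None and not (pred_mask >> i) & 1:
            continue  # masked-out element: destination is left untouched
        regs[rd + i] = (regs[ra + i] + regs[rb + i]) & MASK64
    return regs


def strncpy_ffirst(src, n, maxvl=8):
    """Copy up to n bytes, stopping at the first NUL (fail-first idea).

    Assumption: src is a bytes object. Running out of source bytes
    roughly stands in for a load that would fault.
    """
    dst = bytearray(n)  # strncpy semantics: the remainder stays zero-filled
    pos = 0
    while pos < n:
        # request up to maxvl elements, limited by what is left to copy/read
        vl = min(maxvl, n - pos, len(src) - pos)
        if vl <= 0:
            break
        chunk = src[pos:pos + vl]
        hit_nul = 0 in chunk
        if hit_nul:
            vl = chunk.index(0)  # truncate VL at the first zero byte
        dst[pos:pos + vl] = chunk[:vl]
        pos += vl
        if hit_nul:
            break
    return bytes(dst)
```

For example, `strncpy_ffirst(b"hello\0junk", 8, maxvl=4)` copies one full batch of four bytes, then a batch truncated to a single byte by the NUL, and leaves the rest of the destination zeroed.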
added, please review / critique https://git.libre-soc.org/?p=libreriscv.git;a=commitdiff;h=57ac938a4074d54c86267a74fda14ecbb1a7b086
hmmm.. this image, which is how RISC-V works, i don't think helps us. i totally get that it's based on an SRAM of a fixed size: it just isn't how SV works.
https://git.libre-soc.org/?p=libreriscv.git;a=blob;f=svp64-primer/img/vl_reg_n.jpg;hb=HEAD
SV is actually much more like how MMX works:
register r0:
bytes   0 1 2 3 4 5 6 7
64-bit |<-------------------->|
32-bit |<--------->|<-------->|
16-bit |<--->|<--->|<--->|<-->|
8-bit  |<->| etc |<>|
register r1:
bytes   0 1 2 3 4 5 6 7
64-bit |<-------------------->|
32-bit |<--------->|<-------->|
16-bit |<--->|<--->|<--->|<-->|
8-bit  |<->| etc |<>|
r2,3,4....... .....r126
register r127:
bytes   0 1 2 3 4 5 6 7
64-bit |<-------------------->|
32-bit |<--------->|<-------->|
16-bit |<--->|<--->|<--->|<-->|
8-bit  |<->| etc |<>|
this is how SVP64 registers work:
https://ftp.libre-soc.org/20220618_184935.jpg
you get a *rollover* effect in the ***SCALAR*** register file.
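A rough Python sketch of that rollover effect, for anyone who prefers code to diagrams. It treats the scalar register file as one flat run of bytes (128 registers of 8 bytes, as in the layout above) and overlays elements of a chosen width onto it. The `RegFile` class and its method names are hypothetical, and little-endian byte order is an assumption made purely to keep the example short.

```python
REG_BYTES = 8    # one 64-bit scalar register
NUM_REGS = 128   # r0 .. r127


class RegFile:
    """Hypothetical model: the scalar regfile as one flat byte array."""

    def __init__(self):
        self.mem = bytearray(NUM_REGS * REG_BYTES)

    def _span(self, reg, elwidth_bits, idx):
        nbytes = elwidth_bits // 8
        off = reg * REG_BYTES + idx * nbytes
        if off < 0 or off + nbytes > len(self.mem):
            raise IndexError("element lies outside the register file")
        return off, nbytes

    def set_el(self, reg, elwidth_bits, idx, value):
        off, nbytes = self._span(reg, elwidth_bits, idx)
        value &= (1 << elwidth_bits) - 1
        # assumption: little-endian element layout
        self.mem[off:off + nbytes] = value.to_bytes(nbytes, "little")

    def get_el(self, reg, elwidth_bits, idx):
        off, nbytes = self._span(reg, elwidth_bits, idx)
        return int.from_bytes(self.mem[off:off + nbytes], "little")


rf = RegFile()
for i in range(6):
    rf.set_el(0, 32, i, 0x1000 + i)  # six 32-bit elements starting at r0
# elements 2 and 3 have rolled over into r1, elements 4 and 5 into r2
```

Reading `rf.get_el(1, 64, 0)` afterwards returns the two 32-bit elements that rolled over into r1, seen as a single 64-bit scalar value.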
(In reply to Luke Kenneth Casson Leighton from comment #12)
> hmmm.. this image, which is how RISC-V works, i don't think
> helps us.
because it's too complex to explain. the Cray vector regfile is way simpler
    0 ..... 63  elements
v0
v1
v2
..
v7  registers
(In reply to Luke Kenneth Casson Leighton from comment #14)
> because it's too complex to explain. the Cray vector regfile is way simpler
https://git.libre-soc.org/?p=libreriscv.git;a=commitdiff;h=4c9273dcdfea9ec7ce4f955846280972431239f6
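For contrast, the Cray-style layout in that diagram can be sketched in a few lines of Python: eight dedicated vector registers of 64 elements each, with a vector length selecting how many elements an operation touches. The names `make_vregs` and `vadd` are hypothetical, and elements are plain Python integers rather than fixed-width values.

```python
NUM_VREGS = 8       # v0 .. v7
MAX_ELEMENTS = 64   # elements 0 .. 63 in each vector register


def make_vregs():
    """Hypothetical helper: a separate, dedicated vector register file."""
    return [[0] * MAX_ELEMENTS for _ in range(NUM_VREGS)]


def vadd(vregs, vd, va, vb, vl):
    """Add the first vl elements of va and vb into vd."""
    if not 0 <= vl <= MAX_ELEMENTS:
        raise ValueError("VL must be between 0 and 64")
    for i in range(vl):
        vregs[vd][i] = vregs[va][i] + vregs[vb][i]
    return vregs
```

Each vector register here is its own fixed-size row, so there is no spill from one register into the next.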
Zoom at 3pm
https://ftp.libre-soc.org/20220620_151109.jpg
(In reply to Luke Kenneth Casson Leighton from comment #17)
> https://ftp.libre-soc.org/20220620_151109.jpg
added, B&W
https://git.libre-soc.org/?p=libreriscv.git;a=commitdiff;h=e36b59c1e3f13b3732a19b517c999f441c66ad73
now includes URLs in the bibliography https://git.libre-soc.org/?p=libreriscv.git;a=commitdiff;h=147c7e52eabba2449fe1a9fccd5f5846aff70bc1
took these out, if they go back in they should be part of (merged into) the advantages subsubsection of SV. where they were, they were repetition so getting annoying
-\subsubsection{Prefix 64 - SVP64}
-
-SVP64, is a specification designed to solve the problems caused by
-SIMD implementations by:
-\begin{itemize}
- \item Simplifying the hardware design
- \item Reducing maintenance overhead
- \item Reducing code size and power consumption
- \item Easier for compilers, coders, documentation
- \item Time to support platform is a fraction of conventional SIMD
- (Less money on R\&D, faster to deliver)
-\end{itemize}
Sent the draft primer to Paul.
I apologise, must've selected the pulldown by mistake. Changing back to "documentation"
these images need converting to SVG
* https://git.libre-soc.org/?p=libreriscv.git;a=blob;f=svp64-primer/img/svp64_regs.jpg;hb=HEAD
* https://git.libre-soc.org/?p=libreriscv.git;a=blob;f=svp64-primer/img/cray_vector_regs.jpg;hb=HEAD
* https://git.libre-soc.org/?p=libreriscv.git;a=blob;f=svp64-primer/img/power_pipelines.jpg;hb=HEAD
(In reply to Luke Kenneth Casson Leighton from comment #23)
> these images need converting to SVG
As per today's earlier conversation with Luke:
https://libre-soc.org/irclog/latest.log.html#t2022-06-27T11:11:54
In addition I converted the SIMD diagram (I apologise if this was unnecessary).
Each one has to be exported to PNG (as LaTeX doesn't support SVG by default).
See the files here:
https://git.libre-soc.org/?p=libreriscv.git;a=tree;f=svp64-primer/img;h=fc0680f7b9ea21a5fce6c1bf502258400af0e9c7;hb=HEAD
Once Luke's had a look and is happy with them, I can delete the JPGs from the repo.
these look great, Andrey. let's close this after removing the jpgs. nice work
Old jpg's deleted, closing this bug. https://git.libre-soc.org/?p=libreriscv.git;a=commitdiff;h=c5fd0af78789bb34709227bc8b4e850a72349cae
andrey, i'm re-submitting this via the secret URL system as a single RFP combined with bug #858 and bug #875, altering the submission date accordingly
Is there a way we can compare SVP64 to other scalar vector systems? As SVP64 does not use (fixed or predicated) SIMD, but a pure scalar vector, is a comparison even possible? I am thinking of comparing it with AVX(2/512), SVE2, RVV. If possible, we should have a short section on that.
(In reply to Toshaan Bharvani from comment #28)
> Is there a way we can compare SVP64 to other scalar vector systems.
> As SVP64 does not use (fixed or predicated) SIMD, but a pure scalar vector,
> is a comparison even possible?
yes because you use fixed (or, better predicated) SIMD at the back-end. i just added a diagram (by Veera) which helps explain.
https://git.libre-soc.org/?p=libreriscv.git;a=blob;f=svp64-primer/img/sv_multi_issue.svg;hb=HEAD
> I am thinking of comparing it with AVX(2/512), SVE2, RVV.
> If possible, we should have a short section on that.
sure. i did a summary
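One way to picture "predicated SIMD at the back-end" is the Python sketch below: a vector operation of VL elements is cut into batches no wider than the hardware's SIMD lanes, each carrying its own slice of the predicate mask. This is purely illustrative. The function name `issue_batches` and the default width of 4 lanes are assumptions, and it does not describe any actual Libre-SOC back-end.

```python
def issue_batches(vl, simd_width=4, pred_mask=None):
    """Split VL elements into (start_element, lane_mask) SIMD batches.

    Hypothetical model: bit i of pred_mask enables element i. With no
    mask supplied, every element is enabled.
    """
    if pred_mask is None:
        pred_mask = (1 << vl) - 1
    batches = []
    for start in range(0, vl, simd_width):
        lanes = min(simd_width, vl - start)  # the last batch may be partial
        lane_mask = (pred_mask >> start) & ((1 << lanes) - 1)
        batches.append((start, lane_mask))
    return batches
```

With `vl=10` and four lanes this gives two full batches and one partial batch of two lanes. The front-end instruction is the same in every case, and only the batching changes with the back-end width.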
https://git.libre-soc.org/?p=openpower-isa.git;a=blob;f=media/audio/mp3/mp3_0_apply_window_float_basicsv.s;hb=HEAD
shows this is easy.
make sure to spell out what SVP64 is not:
* not RVV. not based on RVV. based on original Cray concept.
* not based on any known other ISA, is its own intuitive concept
Comparison table headings
* Number of instructions
* Scalable yes/no
* Predication masks yes/no
* Explicit vector registers
* 128-Bit
* biginteger capability
* Load/Store Fail-First
* Twin predication
* Data-dependent fail-first
* Predicate-result
need to think in 2 Dimensions. instructions, vertical, registers horizontal.
GPUs should have backend massive wide SIMD. frontend is still exact same SVP64 ISA. makes life much easier because uniform
executive summary (2 page) book zero.
use arefs how can be done (source code, unit tests)
primer is too technical (book 1)
merge into same document.
"please contact if questions"
add revision history with version numbers
put together quickly, will be updated,
do a first page: title, logo etc. who composed it. version number.
license? skip it
end: these are contact details.