3 new instruction unit tests are needed, for BCD:

* addg6s
* cdtbcd
* cbcdtd

these are on p109 book I section 3.3.9 v3.0C
https://ftp.libre-soc.org/PowerISA_public.v3.0C.pdf

TODO checklist *in this order* due to dependencies (edit as needed):

* add cdtbcd unit test - TODO
* add cbcdtd unit test - TODO
* add addg6s unit test - TODO

best to be added to decoder/isa/test_caller_bcd.py (new file) to save iteration time (test_caller.py takes some time to complete).
Table 129:BCD-to-DPD translation 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 00_ 000 001 002 003 004 005 006 007 008 009 50_ 280 281 282 283 284 285 286 287 288 289 01_ 010 011 012 013 014 015 016 017 018 019 51_ 290 291 292 293 294 295 296 297 298 299 02_ 020 021 022 023 024 025 026 027 028 029 52_ 2A0 2A1 2A2 2A3 2A4 2A5 2A6 2A7 2A8 2A9 03_ 030 031 032 033 034 035 036 037 038 039 53_ 2B0 2B1 2B2 2B3 2B4 2B5 2B6 2B7 2B8 2B9 04_ 040 041 042 043 044 045 046 047 048 049 54_ 2C0 2C1 2C2 2C3 2C4 2C5 2C6 2C7 2C8 2C9 05_ 050 051 052 053 054 055 056 057 058 059 55_ 2D0 2D1 2D2 2D3 2D4 2D5 2D6 2D7 2D8 2D9 06_ 060 061 062 063 064 065 066 067 068 069 56_ 2E0 2E1 2E2 2E3 2E4 2E5 2E6 2E7 2E8 2E9 07_ 070 071 072 073 074 075 076 077 078 079 57_ 2F0 2F1 2F2 2F3 2F4 2F5 2F6 2F7 2F8 2F9 08_ 00A 00B 02A 02B 04A 04B 06A 06B 04E 04F 58_ 28A 28B 2AA 2AB 2CA 2CB 2EA 2EB 2CE 2CF 09_ 01A 01B 03A 03B 05A 05B 07A 07B 05E 05F 59_ 29A 29B 2BA 2BB 2DA 2DB 2FA 2FB 2DE 2DF 10_ 080 081 082 083 084 085 086 087 088 089 60_ 300 301 302 303 304 305 306 307 308 309 11_ 090 091 092 093 094 095 096 097 098 099 61_ 310 311 312 313 314 315 316 317 318 319 12_ 0A0 0A1 0A2 0A3 0A4 0A5 0A6 0A7 0A8 0A9 62_ 320 321 322 323 324 325 326 327 328 329 13_ 0B0 0B1 0B2 0B3 0B4 0B5 0B6 0B7 0B8 0B9 63_ 330 331 332 333 334 335 336 337 338 339 14_ 0C0 0C1 0C2 0C3 0C4 0C5 0C6 0C7 0C8 0C9 64_ 340 341 342 343 344 345 346 347 348 349 15_ 0D0 0D1 0D2 0D3 0D4 0D5 0D6 0D7 0D8 0D9 65_ 350 351 352 353 354 355 356 357 358 359 16_ 0E0 0E1 0E2 0E3 0E4 0E5 0E6 0E7 0E8 0E9 66_ 360 361 362 363 364 365 366 367 368 369 17_ 0F0 0F1 0F2 0F3 0F4 0F5 0F6 0F7 0F8 0F9 67_ 370 371 372 373 374 375 376 377 378 379 18_ 08A 08B 0AA 0AB 0CA 0CB 0EA 0EB 0CE 0CF 68_ 30A 30B 32A 32B 34A 34B 36A 36B 34E 34F 19_ 09A 09B 0BA 0BB 0DA 0DB 0FA 0FB 0DE 0DF 69_ 31A 31B 33A 33B 35A 35B 37A 37B 35E 35F 20_ 100 101 102 103 104 105 106 107 108 109 70_ 380 381 382 383 384 385 386 387 388 389 21_ 110 111 112 113 114 115 116 117 118 119 71_ 390 391 392 393 394 395 396 
397 398 399 22_ 120 121 122 123 124 125 126 127 128 129 72_ 3A0 3A1 3A2 3A3 3A4 3A5 3A6 3A7 3A8 3A9 23_ 130 131 132 133 134 135 136 137 138 139 73_ 3B0 3B1 3B2 3B3 3B4 3B5 3B6 3B7 3B8 3B9 24_ 140 141 142 143 144 145 146 147 148 149 74_ 3C0 3C1 3C2 3C3 3C4 3C5 3C6 3C7 3C8 3C9 25_ 150 151 152 153 154 155 156 157 158 159 75_ 3D0 3D1 3D2 3D3 3D4 3D5 3D6 3D7 3D8 3D9 26_ 160 161 162 163 164 165 166 167 168 169 76_ 3E0 3E1 3E2 3E3 3E4 3E5 3E6 3E7 3E8 3E9 27_ 170 171 172 173 174 175 176 177 178 179 77_ 3F0 3F1 3F2 3F3 3F4 3F5 3F6 3F7 3F8 3F9 28_ 10A 10B 12A 12B 14A 14B 16A 16B 14E 14F 78_ 38A 38B 3AA 3AB 3CA 3CB 3EA 3EB 3CE 3CF 29_ 11A 11B 13A 13B 15A 15B 17A 17B 15E 15F 79_ 39A 39B 3BA 3BB 3DA 3DB 3FA 3FB 3DE 3DF 30_ 180 181 182 183 184 185 186 187 188 189 80_ 00C 00D 10C 10D 20C 20D 30C 30D 02E 02F 31_ 190 191 192 193 194 195 196 197 198 199 81_ 01C 01D 11C 11D 21C 21D 31C 31D 03E 03F 32_ 1A0 1A1 1A2 1A3 1A4 1A5 1A6 1A7 1A8 1A9 82_ 02C 02D 12C 12D 22C 22D 32C 32D 12E 12F 33_ 1B0 1B1 1B2 1B3 1B4 1B5 1B6 1B7 1B8 1B9 83_ 03C 03D 13C 13D 23C 23D 33C 33D 13E 13F 34_ 1C0 1C1 1C2 1C3 1C4 1C5 1C6 1C7 1C8 1C9 84_ 04C 04D 14C 14D 24C 24D 34C 34D 22E 22F 35_ 1D0 1D1 1D2 1D3 1D4 1D5 1D6 1D7 1D8 1D9 85_ 05C 05D 15C 15D 25C 25D 35C 35D 23E 23F 36_ 1E0 1E1 1E2 1E3 1E4 1E5 1E6 1E7 1E8 1E9 86_ 06C 06D 16C 16D 26C 26D 36C 36D 32E 32F 37_ 1F0 1F1 1F2 1F3 1F4 1F5 1F6 1F7 1F8 1F9 87_ 07C 07D 17C 17D 27C 27D 37C 37D 33E 33F 38_ 18A 18B 1AA 1AB 1CA 1CB 1EA 1EB 1CE 1CF 88_ 00E 00F 10E 10F 20E 20F 30E 30F 06E 06F 39_ 19A 19B 1BA 1BB 1DA 1DB 1FA 1FB 1DE 1DF 89_ 01E 01F 11E 11F 21E 21F 31E 31F 07E 07F 40_ 200 201 202 203 204 205 206 207 208 209 90_ 08C 08D 18C 18D 28C 28D 38C 38D 0AE 0AF 41_ 210 211 212 213 214 215 216 217 218 219 91_ 09C 09D 19C 19D 29C 29D 39C 39D 0BE 0BF 42_ 220 221 222 223 224 225 226 227 228 229 92_ 0AC 0AD 1AC 1AD 2AC 2AD 3AC 3AD 1AE 1AF 43_ 230 231 232 233 234 235 236 237 238 239 93_ 0BC 0BD 1BC 1BD 2BC 2BD 3BC 3BD 1BE 1BF 44_ 240 241 242 243 244 245 246 247 248 249 94_ 0CC 
0CD 1CC 1CD 2CC 2CD 3CC 3CD 2AE 2AF 45_ 250 251 252 253 254 255 256 257 258 259 95_ 0DC 0DD 1DC 1DD 2DC 2DD 3DC 3DD 2BE 2BF 46_ 260 261 262 263 264 265 266 267 268 269 96_ 0EC 0ED 1EC 1ED 2EC 2ED 3EC 3ED 3AE 3AF 47_ 270 271 272 273 274 275 276 277 278 279 97_ 0FC 0FD 1FC 1FD 2FC 2FD 3FC 3FD 3BE 3BF 48_ 20A 20B 22A 22B 24A 24B 26A 26B 24E 24F 98_ 08E 08F 18E 18F 28E 28F 38E 38F 0EE 0EF 49_ 21A 21B 23A 23B 25A 25B 27A 27B 25E 25F 99_ 09E 09F 19E 19F 29E 29F 39E 39F 0FE 0FF Table 130: DPD-to-BCD translation 0 1 2 3 4 5 6 7 8 9 A B C D E F 00_ 000 001 002 003 004 005 006 007 008 009 080 081 800 801 880 881 01_ 010 011 012 013 014 015 016 017 018 019 090 091 810 811 890 891 02_ 020 021 022 023 024 025 026 027 028 029 082 083 820 821 808 809 03_ 030 031 032 033 034 035 036 037 038 039 092 093 830 831 818 819 04_ 040 041 042 043 044 045 046 047 048 049 084 085 840 841 088 089 05_ 050 051 052 053 054 055 056 057 058 059 094 095 850 851 098 099 06_ 060 061 062 063 064 065 066 067 068 069 086 087 860 861 888 889 07_ 070 071 072 073 074 075 076 077 078 079 096 097 870 871 898 899 08_ 100 101 102 103 104 105 106 107 108 109 180 181 900 901 980 981 09_ 110 111 112 113 114 115 116 117 118 119 190 191 910 911 990 991 0A_ 120 121 122 123 124 125 126 127 128 129 182 183 920 921 908 909 0B_ 130 131 132 133 134 135 136 137 138 139 192 193 930 931 918 919 0C_ 140 141 142 143 144 145 146 147 148 149 184 185 940 941 188 189 0D_ 150 151 152 153 154 155 156 157 158 159 194 195 950 951 198 199 0E_ 160 161 162 163 164 165 166 167 168 169 186 187 960 961 988 989 0F_ 170 171 172 173 174 175 176 177 178 179 196 197 970 971 998 999 10_ 200 201 202 203 204 205 206 207 208 209 280 281 802 803 882 883 11_ 210 211 212 213 214 215 216 217 218 219 290 291 812 813 892 893 12_ 220 221 222 223 224 225 226 227 228 229 282 283 822 823 828 829 13_ 230 231 232 233 234 235 236 237 238 239 292 293 832 833 838 839 14_ 240 241 242 243 244 245 246 247 248 249 284 285 842 843 288 289 15_ 250 251 252 253 254 255 
256 257 258 259 294 295 852 853 298 299 16_ 260 261 262 263 264 265 266 267 268 269 286 287 862 863 (888) (889) 17_ 270 271 272 273 274 275 276 277 278 279 296 297 872 873 (898) (899) 18_ 300 301 302 303 304 305 306 307 308 309 380 381 902 903 982 983 19_ 310 311 312 313 314 315 316 317 318 319 390 391 912 913 992 993 1A_ 320 321 322 323 324 325 326 327 328 329 382 383 922 923 928 929 1B_ 330 331 332 333 334 335 336 337 338 339 392 393 932 933 938 939 1C_ 340 341 342 343 344 345 346 347 348 349 384 385 942 943 388 389 1D_ 350 351 352 353 354 355 356 357 358 359 394 395 952 953 398 399 1E_ 360 361 362 363 364 365 366 367 368 369 386 387 962 963 (988) (989) 1F_ 370 371 372 373 374 375 376 377 378 379 396 397 972 973 (998) (999) 20_ 400 401 402 403 404 405 406 407 408 409 480 481 804 805 884 885 21_ 410 411 412 413 414 415 416 417 418 419 490 491 814 815 894 895 22_ 420 421 422 423 424 425 426 427 428 429 482 483 824 825 848 849 23_ 430 431 432 433 434 435 436 437 438 439 492 493 834 835 858 859 24_ 440 441 442 443 444 445 446 447 448 449 484 485 844 845 488 489 25_ 450 451 452 453 454 455 456 457 458 459 494 495 854 855 498 499 26_ 460 461 462 463 464 465 466 467 468 469 486 487 864 865 (888) (889) 27_ 470 471 472 473 474 475 476 477 478 479 496 497 874 875 (898) (899) 28_ 500 501 502 503 504 505 506 507 508 509 580 581 904 905 984 985 29_ 510 511 512 513 514 515 516 517 518 519 590 591 914 915 994 995 2A_ 520 521 522 523 524 525 526 527 528 529 582 583 924 925 948 949 2B_ 530 531 532 533 534 535 536 537 538 539 592 593 934 935 958 959 2C_ 540 541 542 543 544 545 546 547 548 549 584 585 944 945 588 589 2D_ 550 551 552 553 554 555 556 557 558 559 594 595 954 955 598 599 2E_ 560 561 562 563 564 565 566 567 568 569 586 587 964 965 (988) (989) 2F_ 570 571 572 573 574 575 576 577 578 579 596 597 974 975 (998) (999) 30_ 600 601 602 603 604 605 606 607 608 609 680 681 806 807 886 887 31_ 610 611 612 613 614 615 616 617 618 619 690 691 816 817 896 897 32_ 620 621 622 623 624 
625 626 627 628 629 682 683 826 827 868 869 33_ 630 631 632 633 634 635 636 637 638 639 692 693 836 837 878 879 34_ 640 641 642 643 644 645 646 647 648 649 684 685 846 847 688 689 35_ 650 651 652 653 654 655 656 657 658 659 694 695 856 857 698 699 36_ 660 661 662 663 664 665 666 667 668 669 686 687 866 867 (888) (889) 37_ 670 671 672 673 674 675 676 677 678 679 696 697 876 877 (898) (899) 38_ 700 701 702 703 704 705 706 707 708 709 780 781 906 907 986 987 39_ 710 711 712 713 714 715 716 717 718 719 790 791 916 917 996 997 3A_ 720 721 722 723 724 725 726 727 728 729 782 783 926 927 968 969 3B_ 730 731 732 733 734 735 736 737 738 739 792 793 936 937 978 979 3C_ 740 741 742 743 744 745 746 747 748 749 784 785 946 947 788 789 3D_ 750 751 752 753 754 755 756 757 758 759 794 795 956 957 798 799 3E_ 760 761 762 763 764 765 766 767 768 769 786 787 966 967 (988) (989) 3F_ 770 771 772 773 774 775 776 777 778 779 796 797 976 977 (998) (999)
(In reply to Luke Kenneth Casson Leighton from comment #1)
> Table 129:BCD-to-DPD translation

sad case of wrapping ... maybe put on the wiki where you can use markdown tables?
That's OK, I already use it. I'm planning to put it in the test (almost) exactly as is (except that there won't be two halves, they'll simply follow one after another), and do parsing there.
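A minimal sketch of what "do parsing there" could look like, assuming the table text is embedded in the test as a string. The excerpt below copies three rows of Table 129 from comment #1; the name `parse_bcd_to_dpd` and the choice of keeping keys as three-digit strings are illustrative assumptions, not the code that was committed.

```python
# Three rows copied from Table 129 (BCD-to-DPD) as posted in comment #1.
TABLE_129_EXCERPT = """
00_ 000 001 002 003 004 005 006 007 008 009
01_ 010 011 012 013 014 015 016 017 018 019
08_ 00A 00B 02A 02B 04A 04B 06A 06B 04E 04F
"""


def parse_bcd_to_dpd(text):
    """Map three-digit decimal strings (e.g. "083") to DPD integers.

    Hypothetical helper: each row starts with a two-digit prefix such as
    "08_", followed by ten hexadecimal DPD cells for last digits 0..9.
    """
    mapping = {}
    for line in text.strip().splitlines():
        prefix, *cells = line.split()
        prefix = prefix.rstrip("_")            # "08_" -> "08"
        for column, cell in enumerate(cells):
            digits = f"{prefix}{column}"       # row prefix + column digit
            mapping[digits] = int(cell, 16)    # table cells are hex
    return mapping


bcd_to_dpd = parse_bcd_to_dpd(TABLE_129_EXCERPT)
print(hex(bcd_to_dpd["083"]))
```

Keeping the keys as digit strings postpones the question of how the three digits are packed into a register value, which is a separate step.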
Actually I already have something which doesn't work. Need to re-check both the pseudocode and parsing. As an example, attempting to convert BCD 010 to DPD yields 0x008 instead of 0x010. Will investigate it. I'm somewhat concerned that 1000 iterations will take a lot of time, if we create a program for each iteration. Do we have some way to speed things up?
(In reply to dmitry.selyutin from comment #4)
> Do we have some way to speed things up?

One idea that comes to mind is compiling a program which makes use of all 32 registers instead of checking the same register over and over again in a loop.
(In reply to dmitry.selyutin from comment #4)
> As an example, attempting to convert BCD 010 to DPD yields 0x008 instead of 0x010. Will investigate it.

Never mind, I'm stupid. These are not numbers in range 0..999, these are (what a surprise, huh?) binary coded decimals. So when they write 010... it actually means 0b000000010000 ((0x0 << 8) | (0x1 << 4) | (0x0 << 0)).
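The packing described above can be sketched as follows. The helper names `to_bcd` and `from_bcd` are made up for illustration; the only thing taken from the thread is that each decimal digit occupies one 4-bit nibble.

```python
def to_bcd(digits):
    """Pack a string of decimal digits into BCD, one digit per nibble."""
    value = 0
    for char in digits:
        value = (value << 4) | int(char, 10)
    return value


def from_bcd(value, width=3):
    """Unpack `width` BCD nibbles back into a string of decimal digits."""
    digits = []
    for shift in range(4 * (width - 1), -1, -4):
        nibble = (value >> shift) & 0xF
        if nibble > 9:
            raise ValueError(f"nibble {nibble:#x} is not a decimal digit")
        digits.append(str(nibble))
    return "".join(digits)


# "010" as BCD is (0x0 << 8) | (0x1 << 4) | (0x0 << 0), not the integer 10.
print(bin(to_bcd("010")), int("010"))
```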
OK, it looks like BCD => DPD works (still in progress, awfully slow). For now, I limited it to the first 20 entries, and even this takes 42 seconds on my VM. For sure, we must find a way to speed things up. I'm thinking of generating the assembly for all 32 registers; please let me know if you have better ideas (frankly I don't find my own idea attractive enough).
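A rough sketch of the "all 32 registers" idea, under stated assumptions: the function name `batch_programs`, the operand syntax of the generated lines, and the way a program plus initial register values would be handed to the simulator are all hypothetical. It only shows the chunking: one instruction per register, each reading and overwriting the same register.

```python
def batch_programs(values, mnemonic="cdtbcd", registers=32):
    """Yield (instructions, initial_regs) pairs, one pair per batch.

    Hypothetical helper: `values` are the inputs to test, and each batch
    uses register N both as source and destination of instruction N.
    """
    for start in range(0, len(values), registers):
        chunk = values[start:start + registers]
        instructions = [
            f"{mnemonic} {reg}, {reg}" for reg in range(len(chunk))
        ]
        initial_regs = dict(enumerate(chunk))  # register index -> input
        yield instructions, initial_regs


batches = list(batch_programs(list(range(70))))
print(len(batches), [len(instrs) for instrs, _ in batches])
```

With 1000 table entries this would mean 32 simulator runs instead of 1000, assuming the per-program setup cost dominates.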
(In reply to dmitry.selyutin from comment #7)
> OK, it looks like BCD => DPD works (still in progress, awfully slow). For
> now, I limited it to the first 20 entries, and even this takes 42 seconds on
> my VM. For sure, we must find a way to speed things up.

yes, this has already been done, search "cases" i.e. "def case_" and you will find the technique used. happy to increase this by EUR 100 if you can adapt test_caller_bcd.py to use the technique.

will post-edit this comment with the code

for HDL running derive from TestAccumulatorBase
https://git.libre-soc.org/?p=openpower-isa.git;a=blob;f=src/openpower/test/common.py;hb=HEAD

set up tests like this
https://git.libre-soc.org/?p=openpower-isa.git;a=blob;f=src/openpower/test/alu/alu_cases.py;hb=HEAD

create a TestRunner class like this, BIG NOTE, most of what is here is NOT NEEDED, actual adaptation, see the for-loop which runs each "case" data inside a "self.subTest"
https://git.libre-soc.org/?p=soc.git;a=blob;f=src/soc/simple/test/test_runner.py;h=c7bc9a11e040362dd759b6a0da31e91340b9d1d9;hb=53b4aa9d5c3a818480e3600a6830fac8ea233fdc#l195

basically we *REUSE* the PowerDecoder2 which is the hugely expensive bit, creating a new ISA() simulator instance each time, using the same PowerDecoder2

finally use like this
https://git.libre-soc.org/?p=soc.git;a=blob;f=src/soc/simple/test/test_issuer.py;h=ebc529fcada9f0a3b60d2fee463890f477955f21;hb=53b4aa9d5c3a818480e3600a6830fac8ea233fdc#l43
Updated the test with cdtbcd, written in the same fashion and spirit as the cbcdtd ones. This one, of course, is slow as well. Luke, thank you for the help on how to improve the performance, I'll update the test accordingly.
https://git.libre-soc.org/?p=openpower-isa.git;a=commitdiff;h=889b2d55bb86177007d78cc2a9d232d0ff56cea6 not tested, i threw that together based on soc/simple/test_runner TestRunner class, you should easily get the idea from that. i created it because there's quite a lot in the HDL version that you have to ignore.
GCC's DPD to binary (not BCD) conversion table:
https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=libdecnumber/bid/bid2dpd_dpd2bid.h;hb=99dee82307f1e163e150c9c810452979994047ce#l145

binary (not BCD) to DPD conversion table:
https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=libdecnumber/bid/bid2dpd_dpd2bid.h;hb=99dee82307f1e163e150c9c810452979994047ce#l4893
i tracked this down: https://github.com/antonblanchard/microwatt/blob/master/execute1.vhdl#L528 it's an implementation of addg6s in VHDL. the sum_with_carry variable is adding its two inputs, a and b, with an extra (65th) bit, just like we did in the pseudocode. what i would suggest is, simply creating a python function which does the equivalent of the pseudocode (and check it's the same as the microwatt version), and just make up some values to test.
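A minimal sketch of such a Python function, following the description above (add both inputs with an extra 65th bit, then test the carry at each nibble boundary). The name `addg6s_reference` is made up, and this is my reading of the approach, not a copy of the microwatt code or of the test that was eventually committed. The carry into bit `hi` is recovered as `a ^ b ^ sum` at that bit position, which is the carry out of the nibble below it.

```python
MASK64 = (1 << 64) - 1


def addg6s_reference(a, b):
    """Return 0b0110 in every nibble whose addition produced no carry."""
    a &= MASK64
    b &= MASK64
    sum_with_carry = a + b        # up to 65 bits; bit 64 is the carry out
    result = 0
    for i in range(16):
        lo = i * 4                # lowest bit of nibble i (bit 0 = LSB)
        hi = lo + 4               # first bit above nibble i
        # carry out of nibble i == carry into bit hi == a ^ b ^ sum there
        carry = ((a >> hi) ^ (b >> hi) ^ (sum_with_carry >> hi)) & 1
        if carry == 0:
            result |= 0b0110 << lo
    return result


print(hex(addg6s_reference(9, 7)))
```

For the top nibble `hi` is 64, where both masked inputs are zero, so the test reduces to checking the 65th bit of the sum.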
Hi Luke, I think I finally got it. Since the recent commits, we do the following:

1. Use VHDL algorithm as a reference implementation.
2. Instead of this fun with adders, we rely on big ints.
3. Instead of generating the product, we generate random BCDs.
4. And yes, we can batch it!
(In reply to dmitry.selyutin from comment #13)
> Hi Luke, I think I finally got it. Since the recent commits, we do the
> following:
> 1. Use VHDL algorithm as a reference implementation.

yes. full_adder64 (much as i like it) can be replaced with just (in python) "return a + b", and everywhere x[N] replaced with "(x>>N) & 0b1" or if in a test: "if x&(1<<N)"

the idea is that the code should be simple and obviously readable

> 2. Instead of this fun with adders, we rely on big ints.

yes. but try to make it "obvious" and/or add comments. instead of:

    addg6s[lo + 3] = 0
    addg6s[lo + 2] = 1
    addg6s[lo + 1] = 1
    addg6s[lo + 0] = 0

do

    addg6s |= 0b0110 << lo

> 3. Instead of generating the product, we generate random BCDs.

yyyeah, lots of them. _hopefully_ that will spam enough numbers at the pseudocode to give enough coverage, i.e. for the carry bit to be triggered with both 0 and 1 with an equal distribution. carry on two random numbers is a 50-50 probability, right? because it uses (ultimately) XOR?

> 4. And yes, we can batch it!

:)

btw do keep to under an 80 char limit. two reasons: we cannot assume that all developers have massive hi-res screens (the recent new developers from India will not), and second, i use hi-res screens to get *more terminals* on-screen ==> more information, more depth of investigation, less work, less effort.
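On the carry-distribution question, a quick enumeration can put numbers on it. This is only a sketch under the assumption that the two operands of one nibble are drawn independently and uniformly, and that "carry" means the binary carry out of a 4-bit nibble (sum of 16 or more); the function name is made up. It suggests that full random nibbles are close to 50-50, while nibbles restricted to BCD digits 0..9 carry much less often.

```python
from fractions import Fraction
from itertools import product


def nibble_carry_probability(digits, carry_in=0):
    """Exact probability that a + b + carry_in overflows a 4-bit nibble."""
    pairs = list(product(digits, repeat=2))
    carries = sum(1 for a, b in pairs if a + b + carry_in >= 16)
    return Fraction(carries, len(pairs))


for label, digits in (("0..15", range(16)), ("0..9", range(10))):
    for carry_in in (0, 1):
        p = nibble_carry_probability(digits, carry_in)
        print(f"{label} carry_in={carry_in}: {p} ~ {float(p):.3f}")
```

If the tests draw plain BCD digits, it may be worth also feeding in operands with nibbles above 9 (or hand-picked pairs such as 9 and 7) so that the carry path is exercised about as often as the no-carry path.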
(In reply to Luke Kenneth Casson Leighton from comment #14)

First of all, your reply made me realize that I haven't pushed the changes. However, your replies led to some more changes, more below. :-)

> yes. full_adder64 (much as i like it) can be replaced

It's already replaced in trunk.

> do
> addg6s |= 0b0110 << lo

This is not as close as it can be to `addg6s(lo + 3 downto lo) := "0110";`. The best option would be to assign parts of slice (I'm splitting the integer into bits). IIRC Python supports it, I will re-check.

> > 3. Instead of generating the product, we generate random BCDs.
> yyyeah, lots of them.

Not that many. However, when I checked the code to see the number (16 * 31 currently, 1 register is not used), I found that somehow I swapped the loop. I've fixed it now, and pushed along with forgotten commits.

> it uses (ultimately) XOR?

I have no idea what Python uses, but most likely yes. I simply took the most obvious option to generate BCD in range 0..9.

> btw do keep to under an 80 char limit.

I'll re-check it. That said, I don't like how the DPD_TO_BCD_TABLE formatting looks after the line-length change. It used to be much cleaner, and the only reason it's changed is the hard limitation. Any ideas on how to improve it are appreciated, I only thought about spaces.
(In reply to dmitry.selyutin from comment #15)
> The best option would be to assign parts of slice (I'm splitting the integer
> into bits). IIRC Python supports it, I will re-check.

Done
(In reply to dmitry.selyutin from comment #15)
> (In reply to Luke Kenneth Casson Leighton from comment #14)
>
> First of all, your reply made me realize that I haven't pushed the changes.

doh, been there

> However, your replies led to some more changes, more below. :-)
>
> > yes. full_adder64 (much as i like it) can be replaced
>
> It's already replaced in trunk.

fantastic.

> > do
> > addg6s |= 0b0110 << lo
>
> This is not as close as it can be to `addg6s(lo + 3 downto lo) := "0110";`.

don't worry about it.

> The best option would be to assign parts of slice (I'm splitting the integer
> into bits). IIRC Python supports it, I will re-check.

it doesn't, which is why we created SelectableInt.

> > > 3. Instead of generating the product, we generate random BCDs.
> > yyyeah, lots of them.
>
> Not that many. However, when I checked the code to see the number (16 * 31
> currently, 1 register is not used), I found that somehow I swapped the loop.
> I've fixed it now, and pushed along with forgotten commits.

excellent

> > it uses (ultimately) XOR?
>
> I have no idea what Python uses, but most likely yes.

i meant in the abstract (back at gate level)

> I simply took the most
> obvious option to generate BCD in range 0..9.
>
> > btw do keep to under an 80 char limit.
>
> I'll re-check it.

some GUI based editors have guide lines. another trick is "git diff" in an xterm 80x65 or so. if any line wraps, wark.

> That said, I don't like how the DPD_TO_BCD_TABLE
> formatting looks after the line-length change. It used to be much cleaner,
> and the only reason it's changed is the hard limitation. Any ideas on how to
> improve it are appreciated, I only thought about spaces.

yeah i did a global search replace vim ":%s/ / /g" and realised after it took away too much. reinserting one space inter-column would likely do it.
(In reply to dmitry.selyutin from comment #16)
> (In reply to dmitry.selyutin from comment #15)
> > The best option would be to assign parts of slice (I'm splitting the integer
> > into bits). IIRC Python supports it, I will re-check.
>
> Done

>>> x = 5
>>> x[2]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'int' object has no attribute '__getitem__'
    if (a_in[hi] ^ b_in[hi] ^ (sum_with_carry[hi] == 0)):
        addg6s[lo:lo + 3 + 1] = [0, 1, 1, 0]

niiice. ok so it looks pretty close to what was done in execute1.vhdl

i like it. it's not terribly efficient, but really clear. good job.
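For readers wondering how slice assignment fits with the TypeError shown in the previous comment: a plain int cannot be indexed, but a list of its bits can. The sketch below assumes an LSB-first list of 64 bits; `to_bits` and `from_bits` are hypothetical helper names, not the ones used in the test or in SelectableInt.

```python
def to_bits(value, width=64):
    """Split an integer into a list of bits, index 0 being the LSB."""
    return [(value >> n) & 1 for n in range(width)]


def from_bits(bits):
    """Rebuild an integer from an LSB-first list of bits."""
    return sum(bit << n for n, bit in enumerate(bits))


x = 5
try:
    x[2]                      # a plain int does not support indexing
except TypeError as error:
    print("TypeError:", error)

addg6s = to_bits(0)
lo = 8
# Slice assignment on the bit list; 0110 reads the same in both bit
# orders, so LSB-first versus MSB-first does not matter for this pattern.
addg6s[lo:lo + 3 + 1] = [0, 1, 1, 0]
print(hex(from_bits(addg6s)))
```

The result matches the shift-and-or form `0b0110 << lo` suggested earlier in the thread, at the cost of converting to and from a list.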
Let me know if we can mark this as completed so that I could update the status tracking.
yes

(In reply to dmitry.selyutin from comment #20)
> Let me know if we can mark this as completed so that I could update the
> status tracking.

yes perfect, this and the other one as well (the instructions themselves) bug #656. both can be closed as fixed. you've got them on the paaage....
https://libre-soc.org/3mdeb/ghostmansd/

yes, all good. RFP time - need to sort out with Maciej, about the percentage for you and percentage for 3mdeb.