Bug 1169 - Add ELF and mmap support to ISACaller -- no dynamic linking
Summary: Add ELF and mmap support to ISACaller -- no dynamic linking
Status: RESOLVED FIXED
Alias: None
Product: Libre-SOC's first SoC
Classification: Unclassified
Component: Source Code
Version: unspecified
Hardware: PC Linux
Importance: High enhancement
Deadline: 2023-12-07
Assignee: Jacob Lifshay
URL:
Duplicates: 1168
Depends on: 982 1173
Blocks: 1228 983
Reported: 2023-09-19 19:22 BST by Jacob Lifshay
Modified: 2023-12-03 04:55 GMT

See Also:
NLnet milestone: ---
total budget (EUR) for completion of task and all subtasks: 0
budget (EUR) for this task, excluding subtasks' budget: 0
parent task for budget allocation:
child tasks for budget allocation:
The table of payments (in EUR) for this task; TOML format:


Description Jacob Lifshay 2023-09-19 19:22:40 BST
Support for ELF files is needed in ISACaller to support userspace programs. This will eventually lead to Power SFFS compliance in Cavatools (this task doing the preliminary research).

TOTAL: 4800 EUR

* DONE: redo budget estimation to account for auxv and try to use up the EUR 6000 available, part of which will be for supporting dynamically linked binaries as a separate task (merged into this task since it's small enough).

* DONE: Add MemMMap type using mmap.mmap and add flag to use it for tests
  * estimate 250 lines of code
  * done in bug #1173
  * 700 EUR
* MOVED TO bug #1228: add support for argv, envp, and auxv
  * estimate 200 lines of code
  * 500 EUR
* DONE: add support for mmap syscall
  * using scheme from https://bugs.libre-soc.org/show_bug.cgi?id=982#c40
  * more complex since we need a simple memory allocator
    to decide where the mmap goes.
  * everything but making sc insn call MemMMap.mmap_syscall
  * see comment #35
  * estimate 150 lines of code
  * 400 EUR
* DONE: add unit tests for mmap syscall
  * tests calling MemMMap.mmap_syscall,
    tests using sc insn are for a future task.
  * estimate 200 lines of code
  * see comment #35
  * 500 EUR
* MOVED TO bug #1228: add support for brk syscall
  * estimate 40 lines of code and 50 lines of tests
  * only additional missing syscall needed for glibc afaict,
    see ELF unit tests section
  * 200 EUR
* DONE: add support for loading ELF binaries
  each loaded piece will be mmap-ed into place.
  * this is mostly just:
    * reading the file headers
    * mmap-ing a chunk for each LOAD program header
    * MOVED TO bug #1228 processing relocations 
      * just R_PPC64_IRELATIVE should be sufficient for now,
        since that's (surprisingly) the only kind
        that /usr/bin/qemu-x86_64-static uses.
        from ppc64el debian buster version 1:3.1+dfsg-8+deb10u8
    * doing the rest of the initial process setup
  * estimate 250 lines of code
  * 700 EUR
* DONE: add unit tests for loading ELF binaries.
  * hello world from comment #3
  * MOVED TO bug #1228 hello world using glibc
    * only needs exit_group, brk, openat, read, write
    * tested with:
      clang -static -xc++ - <<<'int main(){__builtin_printf("Hi!\n");}'
      strace -e fault='!exit_group,brk,openat,read,write' ./a.out
  * combined 100 lines of code
  * 300 EUR
* MOVED TO bug #1228: add minimal support for mprotect with unit tests
  * memory protection is already mostly implemented in MemMMap,
    so all that's needed is just changing the protection for
    the given memory block
  * 150 lines of code total
  * 400 EUR
* MOVED TO bug #1228: support dynamically linked ELF binaries
  * this is mostly just loading ld.so, which then does all
    the hard work of applying relocations and stuff for us.
  * 150 lines of code
  * 400 EUR
* MOVED TO bug #1228: support fstat and openat
  * needed for dynamic linker
  * 150 lines of code, most of which is tests and filling the stat struct
  * 400 EUR
* MOVED TO bug #1228: add unit tests for dynamically linked ELF binaries.
  * /usr/bin/echo
    * only needs exit_group,brk,openat,close,read,write,fstat,mmap,mprotect
    * tested with:
      strace -e fault='!exit_group,brk,openat,close,read,write,fstat,mmap,mprotect' /usr/bin/echo
  * /usr/bin/lua5.3
    * doesn't need any more than `echo` does, so should only need writing a test
    * impressive demo
    * tested with:
      strace -e fault='!exit_group,brk,openat,close,read,write,fstat,mmap,mprotect' /usr/bin/lua5.3 <<<'print("hello\n")'
  * 100 lines of code total
  * 300 EUR
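
The "reading the file headers" and "LOAD program header" steps in the list above can be sketched roughly as follows. This is only an illustration using the standard `struct` module rather than pyelftools; `Segment`, `load_segments` and `make_demo_image` are hypothetical names, the entry address in the demo image is a placeholder, and the demo image only contains headers (no NOTE header, no segment contents).

```python
import struct
from typing import NamedTuple

EHDR = struct.Struct("<16sHHIQQQIHHHHHH")  # ELF64 little-endian file header (64 bytes)
PHDR = struct.Struct("<IIQQQQQQ")          # ELF64 program header (56 bytes)
PT_LOAD = 1


class Segment(NamedTuple):
    offset: int
    vaddr: int
    filesz: int
    memsz: int
    flags: int
    align: int


def load_segments(image: bytes):
    """Return (entry, [Segment, ...]) covering every PT_LOAD program header."""
    (ident, _type, _machine, _version, entry, phoff, _shoff, _flags,
     _ehsize, phentsize, phnum, _shentsize, _shnum, _shstrndx) = \
        EHDR.unpack_from(image, 0)
    if ident[:4] != b"\x7fELF" or ident[4] != 2 or ident[5] != 1:
        raise ValueError("not a 64-bit little-endian ELF file")
    segments = []
    for i in range(phnum):
        (p_type, p_flags, p_offset, p_vaddr, _paddr,
         p_filesz, p_memsz, p_align) = PHDR.unpack_from(image, phoff + i * phentsize)
        if p_type == PT_LOAD:
            segments.append(
                Segment(p_offset, p_vaddr, p_filesz, p_memsz, p_flags, p_align))
    return entry, segments


def make_demo_image() -> bytes:
    """Header-only image with two LOAD headers and a GNU_STACK header."""
    ident = b"\x7fELF" + bytes([2, 1, 1]) + bytes(9)  # ELFCLASS64, little-endian
    phdrs = [
        (PT_LOAD, 5, 0x0, 0x10000000, 0x10000000, 0x214, 0x214, 0x10000),  # R E
        (PT_LOAD, 6, 0x218, 0x10010218, 0x10010218, 0x1, 0x1, 0x10000),    # RW
        (0x6474E551, 6, 0, 0, 0, 0, 0, 0x10),                              # GNU_STACK
    ]
    # The entry address 0x10000000 is a placeholder, not taken from a real binary.
    ehdr = EHDR.pack(ident, 2, 21, 1, 0x10000000, EHDR.size, 0, 0,
                     EHDR.size, PHDR.size, len(phdrs), 0, 0, 0)
    return ehdr + b"".join(PHDR.pack(*p) for p in phdrs)


if __name__ == "__main__":
    entry, segs = load_segments(make_demo_image())
    for seg in segs:
        print(hex(seg.vaddr), hex(seg.filesz), seg.flags)
```

Each returned `Segment` would then be handed to whatever does the actual mmap-ing; that part is deliberately left out here.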
Comment 1 Jacob Lifshay 2023-09-19 19:30:56 BST
*** Bug 1168 has been marked as a duplicate of this bug. ***
Comment 2 Jacob Lifshay 2023-09-20 00:08:14 BST
iirc during the meeting luke suggested using an elf decoding library, icr what was suggested, so I did some searching and I think pyelftools might be best, it's pure python.

https://github.com/eliben/pyelftools
Comment 3 Jacob Lifshay 2023-09-20 01:58:05 BST
(In reply to Jacob Lifshay from comment #2)
> iirc during the meeting luke suggested using an elf decoding library, icr
> what was suggested, so I did some searching and I think pyelftools might be
> best, it's pure python.
> 
> https://github.com/eliben/pyelftools

it seems to work well.

Because just using -static doesn't remove all relocations, I used this test program:
#include <sys/syscall.h>

long syscall(long number, ...);

// too bad powerpc64le doesn't support __attribute__((naked))
asm(
    ".globl syscall\n"
    ".p2align 4\n"
    ".type syscall,@function\n"
    "syscall:\n"
    "mr 0,3\n"
    "mr 3,4\n"
    "mr 4,5\n"
    "mr 5,6\n"
    "mr 6,7\n"
    "mr 7,8\n"
    "mr 8,9\n"
    "sc\n"
    "blr"
);

void _start() {
    static char v = 'H';
    char msg[] = " ello!";
    msg[sizeof(msg) - 1] = '\n';
    msg[0] = v;
    syscall(SYS_write, 1, (const void *)msg, sizeof(msg));
    syscall(SYS_exit_group, 0);
}

I compiled it with:
clang -O3 -ffreestanding -fno-pic -fno-pie -nostdlib -static prog.c

then ran:
readelf.py -e -r a.out

<snip>
Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  LOAD           0x0000000000000000 0x0000000010000000 0x0000000010000000
                 0x0000000000000214 0x0000000000000214  R E    10000
  LOAD           0x0000000000000218 0x0000000010010218 0x0000000010010218
                 0x0000000000000001 0x0000000000000001  RW     10000
  NOTE           0x0000000000000120 0x0000000010000120 0x0000000010000120
                 0x0000000000000024 0x0000000000000024  R      4
  GNU_STACK      0x0000000000000000 0x0000000000000000 0x0000000000000000
                 0x0000000000000000 0x0000000000000000  RW     10

 Section to Segment mapping:
  Segment Sections...
   00     .note.gnu.build-id .text .rodata 
   01     .data 
   02     .note.gnu.build-id 
   03     

There are no relocations in this file.
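
As a worked example of what mmap-ing those two LOAD headers involves, the sketch below rounds each segment out to 64 kB page boundaries (the Align value shown above). `mapping_for` is a hypothetical helper and this is only one plausible way to do the arithmetic, not necessarily what the loader ends up doing.

```python
PAGE = 0x10000  # 64 kB, matching the Align column of the LOAD headers


def mapping_for(offset, vaddr, filesz, memsz, page=PAGE):
    """Page-aligned (map_addr, file_offset, file_len, total_len) for one LOAD header."""
    if offset % page != vaddr % page:
        raise ValueError("offset and vaddr must be congruent modulo the page size")
    skew = vaddr % page              # distance from the page start to the segment
    map_addr = vaddr - skew          # page-aligned address to map at
    file_offset = offset - skew      # page-aligned offset into the file
    file_len = skew + filesz         # bytes that come from the file
    total_len = -(-(skew + memsz) // page) * page  # rounded up to whole pages
    return map_addr, file_offset, file_len, total_len


if __name__ == "__main__":
    # The two LOAD headers from the readelf output above.
    for args in [(0x0, 0x10000000, 0x214, 0x214),
                 (0x218, 0x10010218, 0x1, 0x1)]:
        print([hex(v) for v in mapping_for(*args)])
    # ['0x10000000', '0x0', '0x214', '0x10000']
    # ['0x10010000', '0x0', '0x219', '0x10000']
```

Both segments end up mapping the file from offset 0, just at different addresses and with different permissions.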
Comment 4 Jacob Lifshay 2023-09-20 08:25:18 BST
I created a budget estimate, it comes out to 1900 EUR. this gets us running a hello world binary using printf statically linked to glibc.

If we don't want to support glibc, it comes out to maybe EUR 100 less, so I think we should just go ahead and do it.
Comment 5 Luke Kenneth Casson Leighton 2023-09-20 08:35:05 BST
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

int main(int argc, char *argv[])
{
    write(1, "hello\n", 6);
}

  powerpc64le-linux-gnu-gcc-8 -static -fPIC test.c
  powerpc64le-linux-gnu-strip ./a.out
  powerpc64le-linux-gnu-objdump -x ./a.out > /tmp/x

intermediary step 1, the non-relocatable version
intermediary step 2, the above but modified to call a syscall-variant of write
                     (no libc6)
final step 3,        libc6 support.
Comment 6 Luke Kenneth Casson Leighton 2023-09-20 08:36:43 BST
(In reply to Jacob Lifshay from comment #4)
> I created a budget estimate, it comes out to 1900 EUR.

take all available budget, do not waste it. assign some to me.
we can evaluate additional improvements and i know you forgot
to add "option to pypowersim and ISACaller to enable equivalent
of running qemu-user"
Comment 7 Luke Kenneth Casson Leighton 2023-09-20 08:38:50 BST
(In reply to Jacob Lifshay from comment #3)

> > https://github.com/eliben/pyelftools
> 
> it seems to work well.

nice.

> Because just using -static doesn't remove all relocations,

goal is -fPIC -static.

>  I used this test
> program:
> #include <sys/syscall.h>
> 
> long syscall(long number, ...);
> 
> // too bad powerpc64le doesn't support __attribute__((naked))
> asm(
>     ".globl syscall\n"
>     ".p2align 4\n"
>     ".type syscall,@function\n"
>     "syscall:\n"
>     "mr 0,3\n"
>     "mr 3,4\n"
>     "mr 4,5\n"
>     "mr 5,6\n"
>     "mr 6,7\n"
>     "mr 7,8\n"
>     "mr 8,9\n"
>     "sc\n"
>     "blr"
> );

use this in intermediary step 2, "no libc6".
Comment 8 Jacob Lifshay 2023-09-21 00:46:16 BST
(In reply to Luke Kenneth Casson Leighton from comment #6)
> (In reply to Jacob Lifshay from comment #4)
> > I created a budget estimate, it comes out to 1900 EUR.
> 
> take all available budget, do not waste it. assign some to me.
> we can evaluate additional improvements and 

ok. I'll redo the budget adding more stuff. there's 6000 EUR in the parent task, that's enough that it should have multiple sub-tasks.

How about adding a "support loading dynamically linked binaries" task as a new bug, a sibling task of this task? the dynamic linker does all the heavy lifting, basically all we have to do is read the path to the interpreter (ld.so) and load that instead of the main program and pass the file handle in.

> i know you forgot
> to add "option to pypowersim and ISACaller to enable equivalent
> of running qemu-user"

that's part of bug #982, not this bug.
Comment 9 Luke Kenneth Casson Leighton 2023-09-21 03:59:39 BST
(In reply to Jacob Lifshay from comment #8)

> lifting, basically all we have to do is read the path to the interpreter
> (ld.so) and load that instead of the main program and pass the file handle
> in.

if it's really that easy then go for it.

> > i know you forgot
> > to add "option to pypowersim and ISACaller to enable equivalent
> > of running qemu-user"
> 
> that's part of bug #982, not this bug.

ok
Comment 10 Jacob Lifshay 2023-09-22 04:32:54 BST
(In reply to Luke Kenneth Casson Leighton from comment #9)
> (In reply to Jacob Lifshay from comment #8)
> 
> > lifting, basically all we have to do is read the path to the interpreter
> > (ld.so) and load that instead of the main program and pass the file handle
> > in.
> 
> if it's really that easy then go for it.

that was one of two options in the ELF spec, I can't find where it says how the kernel chooses between those two options...

the other option is read the path to the interpreter, load that and the main program, and pass the pointer to the main program in the aux vector.

I did some more testing and it seems like linux uses the second option on the talos server. no relocations seem to be performed by the kernel when loading /usr/bin/echo -- I tested by running gdb /usr/bin/echo and then running `starti` and dumping all the file mappings, comparing them to the file they were loaded from.
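
For the second option, the aux vector is where the main program's details get passed to ld.so. A rough sketch of packing a few entries is below; the AT_* numbers are the standard ones from elf.h, but which entries ld.so actually requires is an assumption here, the list is not exhaustive, and `build_auxv` plus all the addresses in the demo are made up for illustration.

```python
import struct

# Standard auxiliary vector types from elf.h.
AT_NULL, AT_PHDR, AT_PHENT, AT_PHNUM = 0, 3, 4, 5
AT_PAGESZ, AT_BASE, AT_ENTRY = 6, 7, 9


def build_auxv(phdr_addr, phent, phnum, interp_base, entry, pagesz=0x10000):
    """Pack (type, value) pairs as little-endian 64-bit words, AT_NULL-terminated."""
    entries = [
        (AT_PHDR, phdr_addr),    # where the main program's program headers are mapped
        (AT_PHENT, phent),       # size of one program header
        (AT_PHNUM, phnum),       # number of program headers
        (AT_PAGESZ, pagesz),     # page size
        (AT_BASE, interp_base),  # load address of the interpreter (ld.so)
        (AT_ENTRY, entry),       # entry point of the main program
        (AT_NULL, 0),            # terminator
    ]
    return b"".join(struct.pack("<QQ", key, value) for key, value in entries)


if __name__ == "__main__":
    # All addresses below are hypothetical.
    auxv = build_auxv(0x10000040, 56, 4, 0x7FFF00000000, 0x10000100)
    print(len(auxv))
```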
Comment 11 Jacob Lifshay 2023-09-22 05:18:00 BST
(In reply to Jacob Lifshay from comment #10)
> I did some more testing and it seems like linux uses the second option on
> the talos server. no relocations seem to be performed by the kernel when
> loading /usr/bin/echo -- I tested by running gdb /usr/bin/echo and then
> running `starti` and dumping all the file mappings, comparing them to the
> file they were loaded from.

I tried it with /usr/bin/python3.7 and, likewise, no relocations seem to be performed...maybe we can get away with not implementing relocations for at least dynamically-linked executables!
Comment 12 Jacob Lifshay 2023-09-22 09:17:53 BST
(In reply to Dmitry Selyutin from bug #981 comment #8)
> (In reply to Jacob Lifshay from comment #7)
> > (In reply to Dmitry Selyutin from comment #6)
> > > Deal?
> > 
> > ok with me
> > 
> > > I'd say it fits bug #982.
> > 
> > i was thinking #982 would only end up with a few syscalls implemented since
> > luke was advocating for just open/openat, close, exit, read, and write (or
> > something), so the additional budget here would allow implementing a lot
> > more.
> 
> mmap. This is a blocker. It needs to be done. Plus free pages list,
> basically a trivial allocator.

yeah, it's part of bug #1169. I have implemented simple memory allocators before, we don't need anything all that complex.
>  
> > > I don't expect much to be done there; also parts
> > > of it fit ELF support which I, frankly speaking, find to be underrated.
> > 
> > what do you mean by underrated? that the 1900 EUR estimate is too big? that
> > it's too small? something else?
> 
> Too small, rationale below.

yeah, I tried to make the budget smaller before i looked at how much is available. I will probably re-estimate to around eur 3000 since i forgot a few things like setting up auxv and stuff.
> 
> > keep in mind there is 6000 eur available in the ELF task's budget parent and
> > since the statically-linked ELF task is too small I want to add
> > dynamically-linked executable support as a separate task and implement more
> > syscalls ... if there's not enough work to take up the 2mo of funds then
> > I'll add more.
> 
> I expect several problems there. First of all, you'll have to implement mmap
> support, and this includes introducing something like a free list for
> bookkeeping the memory allocations, even if these are backed just by the big
> mapping.

yup, i was expecting that (though i might use something different than a free-list), hence why this bug is both ELF *and* mmap.
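
For reference, the kind of bookkeeping being discussed can be quite small. The sketch below is a plain first-fit allocator over sorted free ranges; it is only an illustration of the idea, not the scheme chosen for MemMMap, and `RangeAllocator` is a hypothetical name.

```python
class RangeAllocator:
    """First-fit allocator over sorted, non-overlapping free [start, end) ranges."""

    def __init__(self, start, end, page=0x10000):
        # start and end are assumed to be page-aligned.
        self.page = page
        self.free = [(start, end)]

    def _round_up(self, length):
        return -(-length // self.page) * self.page

    def alloc(self, length):
        """Return the address of a free block of at least `length` bytes."""
        length = self._round_up(length)
        for i, (start, end) in enumerate(self.free):
            if end - start >= length:
                if end - start == length:
                    del self.free[i]
                else:
                    self.free[i] = (start + length, end)
                return start
        raise MemoryError("no free range large enough")

    def release(self, addr, length):
        """Give a block back and merge it with adjacent free ranges."""
        length = self._round_up(length)
        self.free.append((addr, addr + length))
        self.free.sort()
        merged = []
        for start, end in self.free:
            if merged and merged[-1][1] == start:
                merged[-1] = (merged[-1][0], end)
            else:
                merged.append((start, end))
        self.free = merged
```

An mmap emulation would call `alloc` when the caller passes no address hint and `release` on munmap.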

> Second, you'll have to support different sections differently, not
> just "parse and map at fixed offset".

implementing LOAD program headers isn't too difficult. the kernel ignores sections entirely and just follows the program headers.

> Third, I expect some challenges in
> ISACaller

I'm expecting bug #982 to have handled the issue of enabling/disabling syscall emulation.

I already figured out replacing the simulated memory with the mmap-ped block and luke verified in bug #982 comment 65 that i don't have to worry about needing to pretend to be a dict too.

> and a learning curve with pyelf (if we take it; if not, this is
> even worse).

luckily there's readelf.py that already displays basically everything we need:
https://github.com/eliben/pyelftools/blob/cf814b7adaebb0d336e863d834964b3f4b4e48e1/scripts/readelf.py#L322

> I'd say with all the stuff I mentioned it's already no less
> than 4-5K. But, as I said, it's up to task owner to rate it budget-wise, I'm
> just expressing my considerations.
> 
> And, since you mentioned dynamic linkage. I expect it to be damn hard and
> time-consuming based on my experience with relocations. Most of the task will
> be reading ELF and architecture specs. The code size won't be that huge; but
> the amount of time will... Again, your estimations might be different.

well, based on my testing, the kernel doesn't actually need to process any relocations for dynamically-linked executables... see comment #11

the dynamic linker is what processes all the relocations for dynamically-linked executables from what i can tell, which makes it really easy since we just load ld.so and it does all the work for us.
Comment 13 Luke Kenneth Casson Leighton 2023-09-22 09:37:57 BST
(In reply to Jacob Lifshay from comment #10)

> I did some more testing and it seems like linux uses the second option on
> the talos server. no relocations seem to be performed by the kernel when
> loading /usr/bin/echo -- I tested by running gdb /usr/bin/echo and then
> running `starti` and dumping all the file mappings, comparing them to the
> file they were loaded from.

that is possibly because it detects that ld.so (see ldd /usr/bin/echo)
is already resident. i.e. you'll find it's in ld.so.cache

a better test would be to run the *ppc64* version of "gdb /usr/bin/echo"
under qemu-ppc64-user although oh wait even running gdb would load
libc6 into ld.so.cache.

i am guessing at this point, you might need a static-compiled
version of strace-ppc64 or static compiled version of gdb-ppc64 to verify
properly.
Comment 14 Jacob Lifshay 2023-09-22 09:50:04 BST
(In reply to Luke Kenneth Casson Leighton from comment #13)
> (In reply to Jacob Lifshay from comment #10)
> 
> > I did some more testing and it seems like linux uses the second option on
> > the talos server. no relocations seem to be performed by the kernel when
> > loading /usr/bin/echo -- I tested by running gdb /usr/bin/echo and then
> > running `starti` and dumping all the file mappings, comparing them to the
> > file they were loaded from.
> 
> that is possibly because it detects that ld.so (see ldd /usr/bin/echo)
> is already resident. i.e. you'll find it's in ld.so.cache

starti stops the program at the first insn ... it doesn't even run any ld.so code yet, so, unless there's special kernel-level caching, ld.so's cache is irrelevant.

I compared the bytes in memory to the bytes on disk (which afaict are not changed by any ld.so shenanigans), they are identical.
Comment 15 Jacob Lifshay 2023-11-14 02:22:46 GMT
I redid the budget estimate but included dynamically linked binaries (since that's relatively small at 1100 EUR when including fstat/openat), it comes out to 4800 EUR total.

This gets us all the way to running lua5.3, which I think is quite impressive!
Comment 16 Jacob Lifshay 2023-11-14 02:26:16 GMT
(In reply to Jacob Lifshay from comment #15)
> I redid the budget estimate but included dynamically linked binaries (since
> that's relatively small at 1100 EUR when including fstat/openat), it comes
> out to 4800 EUR total.

actually, mprotect is only there for dynamically linked binaries, so make that 1500 EUR out of 4800 EUR.
> 
> This gets us all the way to running lua5.3, which I think is quite
> impressive!

lua doesn't actually require any more syscalls than running echo.
Comment 17 Andrey Miroshnikov 2023-11-14 18:11:09 GMT
Checked the task list, happy for you to proceed Jacob.

Luke, need to check the budget breakdown under bug #983, is it correct for yourself and red to receive such a large share? I think we should up Jacob's budget to match the work listed.
Comment 18 Luke Kenneth Casson Leighton 2023-11-14 19:43:27 GMT
(In reply to Andrey Miroshnikov from comment #17)

> is it correct for
> yourself and red to receive such a large share? 

yes.
Comment 19 Jacob Lifshay 2023-11-16 07:11:37 GMT
started working on implementing the memory allocator needed for mmap/brk support, very much WIP:

https://git.libre-soc.org/?p=openpower-isa.git;a=shortlog;h=refs/heads/1169-elf-support

I started out building a memory allocator, but forgot it needs to support mmap-ping files too, not only anonymous memory, so I'm partway through changing it to support that.

I also track a designated memory block (mmap_emu_data_block) that brk adjusts the size of.

Oh, i just thought of this while writing this comment, since it tracks page permissions for safety reasons (so python doesn't segfault), i need to support the intersection, not union, of permission flags for mmaps within an emulated page (emulated pages are 64kB, the host system possibly has smaller pages) -- these are *not* the same pages as the powerpc MMU has, they're just for preventing segfaults and catching bugs.

it'd probably be easiest to require at most one mmap per emulated page.

(this is all very hard to describe since we have pages and blocks and mmaps all at different levels and each level is a completely separate thing...)
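
To make the intersection-vs-union point concrete, here is a small sketch. `Prot` and `effective_page_prot` are hypothetical names, not the actual MemMMap code.

```python
import enum
from functools import reduce


class Prot(enum.Flag):
    NONE = 0
    R = 1
    W = 2
    X = 4


def effective_page_prot(mmap_prots):
    """Flags allowed by *every* mmap that shares one emulated page."""
    if not mmap_prots:
        return Prot.NONE  # nothing mapped in this page, so no access at all
    return reduce(lambda a, b: a & b, mmap_prots)


if __name__ == "__main__":
    # A read+execute mapping and a read+write mapping sharing one 64 kB page:
    # only reads are safe for the whole page.
    print(effective_page_prot([Prot.R | Prot.X, Prot.R | Prot.W]))
```

With a union the page would have been treated as writable and executable even though neither mapping allows both, which is why the intersection is the safe choice.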
Comment 20 Jacob Lifshay 2023-11-17 05:59:32 GMT
continuing on implementing mmap/brk, I think the underlying allocation algorithms are mostly done at this point.

I implemented resizing a memory block (basically all of what brk does), though only for private anonymous memory.
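
A much simplified sketch of the brk side, assuming a single resizable block backed by a bytearray. `BrkBlock` is a hypothetical name and the real code works on the mmap.mmap-backed memory instead.

```python
class BrkBlock:
    """Resizable private anonymous block, grown and shrunk in whole pages."""

    def __init__(self, start, limit, page=0x10000):
        self.start = start    # lowest possible break
        self.end = start      # current break
        self.limit = limit    # highest break this block may grow to
        self.page = page
        self.data = bytearray()

    def brk(self, new_end):
        """Like the raw Linux syscall: return the resulting break, unchanged on failure."""
        if new_end < self.start or new_end > self.limit:
            return self.end
        mapped = -(-(new_end - self.start) // self.page) * self.page
        if mapped > len(self.data):
            self.data.extend(bytes(mapped - len(self.data)))  # new pages are zeroed
        else:
            del self.data[mapped:]
        self.end = new_end
        return self.end
```

Calling `brk(0)` returns the current break without changing anything, which is how programs usually query it.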
Comment 21 Luke Kenneth Casson Leighton 2023-11-17 08:57:51 GMT
(In reply to Jacob Lifshay from comment #19)

> Oh, i just thought of this while writing this comment, since it tracks page
> permissions for safety reasons (so python doesn't segfault),

this was NOT part of the scope of work.

do the bare minimum work please!

you did not seek authorization to change the scope or priority of this work!

there are far more important things to do.

please STOP, raise a bugreport, and get permission before ASSUMING (yet again)
that it is acceptable to massively increase the scope of work.

the ONLY THING needed for sign-off is read write and close.

i need you to get back onto the crypto work as quickly as possible.
Comment 22 Jacob Lifshay 2023-11-17 21:20:03 GMT
(In reply to Luke Kenneth Casson Leighton from comment #21)
> (In reply to Jacob Lifshay from comment #19)
> 
> > Oh, i just thought of this while writing this comment, since it tracks page
> > permissions for safety reasons (so python doesn't segfault),
> 
> this was NOT part of the scope of work.

page permissions for safety reasons were already added in bug #1173, because if we mmap a file as read-only (remember we're calling mmap on the host to replace part of the mmap.mmap block of memory), we need to not try to write to that memory, otherwise python *will* segfault instead of properly reporting a test error, majorly messing up all other test cases that that pytest worker is trying to run. therefore I think this is necessary.

> 
> you did not seek authorization to change the scope or priority of this work!

I did, I was told (comment #6) that I needed to increase the scope so I could use up more of the EUR 6000

> that it is acceptable to massively increase the scope of work.

I asked andrey to check the increased scope, he said it looks fine. I started working on mmap while waiting for you to check since that's something you previously agreed we needed. Also, mmap makes loading ELFs easier, and is what linux and qemu do.

> the ONLY THING needed for sign-off is read write and close.

you already agreed to more than that in comment #6 and comment #9, you're probably confusing this with bug #982, where we were originally trying to implement read/write/close.
> 
> i need you to get back onto the crypto work as quickly as possible.

sorry, i had forgotten that, being distracted by fixing all the test errors on master. The powmod work is basically done, all the rest of the stuff i recall I said should be part of mul remap.

I do think ELF needs to be higher priority since the grant *is expiring soon*, iirc david said we should try to finish everything off by dec and that doesn't leave much time.

I did ask and you agreed to me working on ELF next and putting off mul remap for later:
https://lists.libre-soc.org/pipermail/libre-soc-dev/2023-October/005771.html
Comment 23 Luke Kenneth Casson Leighton 2023-11-17 23:38:44 GMT
(In reply to Jacob Lifshay from comment #22)
> (In reply to Luke Kenneth Casson Leighton from comment #21)
> > (In reply to Jacob Lifshay from comment #19)
> > 
> > > Oh, i just thought of this while writing this comment, since it tracks page
> > > permissions for safety reasons (so python doesn't segfault),
> > 
> > this was NOT part of the scope of work.
> 
> page permissions for safety reasons were already added in bug #1173, because
> if we mmap a file as read-only (remember we're calling mmap on the host to
> replace part of the mmap.mmap block of memory), we need to not try to write
> to that memory, otherwise python *will* segfault

*great*! not our problem!

> instead of properly
> reporting a test error, majorly messing up all other test cases that that
> pytest worker is trying to run.

so what?

>  therefore I think this is necessary.

why am i only just learning that you have MASSIVELY increased the
scope and not even told me or consulted me?

> > 
> > you did not seek authorization to change the scope or priority of this work!
> 
> I did, I was told (comment #6) that I needed to increase the scope so I
> could use up more of the EUR 6000

no, we get *exactly the same money* even if the scope is increased!
come on for god's sake jacob GET A GRIP. we are UNDER TIME PRESSURE

> > that it is acceptable to massively increase the scope of work.
> 
> I asked andrey to check the increased scope, he said it looks fine. 

that's because he DOES NOT KNOW THE CONSEQUENCES because he has not
been properly trained and DID NOT CONSULT ME.

> I do think ELF needs to be higher priority since the grant *is expiring
> soon*, iirc david said we should try to finish everything off by dec and
> that doesn't leave much time.

and that means GETTING IT DONE FAST by NOT INCREASING THE DAMN SCOPE

andrey you MUST NOT authorize jacob to increase work scope
without a FULL REVIEW which you yourself have not been properly
trained in and are not authorized to make decisions on without
consulting me.

both of you should have followed the "time and budget review" process
and GOT ME TO CHECK IT.

i am getting seriously pissed off that you are threatening our promise
to NLnet that the work will be 100% completed.
Comment 24 Jacob Lifshay 2023-11-21 02:53:50 GMT
(edit: trim context)

(In reply to Luke Kenneth Casson Leighton from comment #23)
> 
> no, we get *exactly the same money* even if the scope is increased!
> come on for god's sake jacob GET A GRIP. we are UNDER TIME PRESSURE

no matter how much time pressure we're under, if we do too little work for the amount being paid the auditors will likely complain...so, we need to justify why EUR 6000 is appropriate for the scope we end up with.

One possible way to justify that is to increase the monthly rate we're asking, since some other NLNet-funded projects ask for a lot more ($57/hr here):
https://socialhub.activitypub.rocks/t/forgefed-nlnet-grant-application/2598/7
but, unless we apply that new rate for all future budget estimations, it will look suspicious...

If you want to reduce the scope, then we can drop the stuff specifically needed for dynamically linked binaries, but that only decreases the scope by like 30%, see comment #16.

Also, being under time pressure is a good reason to let me keep working on the parts we know we need (mmap and then ELF loading, they were in the original plan and in the revised version) while we figure out what else we need.

> why am i only just learning that you have MASSIVELY increased the
> scope and not even told me or consulted me?

because adding page permissions isn't MASSIVELY increasing the scope, the page permissions code (on master) is 1 flags enum and a dict containing the flags for each page and 4 lines of code checking the flags -- imo pretty minor.
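
Roughly the shape of what's being described, as a sketch only: `PageFlags` and `CheckedMemory` are hypothetical names and this is not the code on master.

```python
import enum


class PageFlags(enum.Flag):
    R = 1
    W = 2
    X = 4


class CheckedMemory:
    PAGE = 0x10000  # emulated page size

    def __init__(self):
        self.page_flags = {}  # page index -> PageFlags
        self.bytes = {}       # stand-in for the real mmap-backed storage

    def store_byte(self, addr, value):
        flags = self.page_flags.get(addr // self.PAGE)
        if flags is None or not (flags & PageFlags.W):
            # Raise a normal Python exception instead of letting a write to
            # read-only host memory crash the interpreter.
            raise PermissionError(f"write to non-writable page at {addr:#x}")
        self.bytes[addr] = value & 0xFF
```

A test that writes to a read-only page then fails with an ordinary exception, and the rest of that pytest worker's tests keep running.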

(I'm not counting the code to speed up iterating (for debug logging and test cases) through the memory by tracking what's changed which we would need anyway since it is several orders of magnitude faster.)

> i am getting seriously pissed off that you are threatening our promise
> to NLnet that the work will be 100% completed.

based off the budget estimate, if we included everything I most recently proposed, I should finish by around the end of Dec if I start working now. Probably sooner if you let me just work on it, since that time estimate is based on including lots of emailing back and forth.

If we remove dynamically linked binaries from the scope, then I estimate I'll be done by the middle of Dec if I start working now based off the EUR 3300 left in the budget estimate.

Since iirc this task is the last thing I have to do for this grant, that should leave enough leeway, since iirc David asked that we finish in Dec to leave time for NLNet to process stuff and if anything unexpected comes up.
Comment 25 Luke Kenneth Casson Leighton 2023-11-21 06:55:30 GMT
(In reply to Jacob Lifshay from comment #24)
> (edit: trim context)
> 
> (In reply to Luke Kenneth Casson Leighton from comment #23)
> > 
> > no, we get *exactly the same money* even if the scope is increased!
> > come on for god's sake jacob GET A GRIP. we are UNDER TIME PRESSURE
> 
> no matter how much time pressure we're under, if we do too little work for
> the amount being paid the auditors will likely complain...so, we need to
> justify why EUR 6000 is appropriate for the scope we end up with.

correct. the complexity is enough in this case.
please stop arguing and get on with it.

> 
> One possible way to justify that is to increase the monthly rate we're
> asking, since some other NLNet-funded projects ask for a lot more ($57/hr
> here):
> https://socialhub.activitypub.rocks/t/forgefed-nlnet-grant-application/2598/7
> but, unless we apply that new rate for all future budget estimations, it
> will look suspicious...

no it will not. please be more flexible, stop arguing, and get on with it.

> If you want to reduce the scope, then we can drop the stuff specifically
> needed for dynamically linked binaries, 

good enough.

> but that only decreases the scope by
> like 30%, 

GREAT. find more to cut.

> Also, being under time pressure is a good reason to let me keep working on
> the parts we know we need (mmap and then ELF loading, they were in the
> original plan and in the revised version) while we figure out what else we
> need.

ADAPT 

> because adding page permissions isn't MASSIVELY increasing the scope, the
> page permissions code (on master) is 1 flags enum and a dict containing the
> flags for each page and 4 lines of code checking the flags -- imo pretty
> minor.

it was NOT DISCUSSED AND AUTHORIZED.

you have been doing this consistently and repeatedly for years and i am sick of
it.

> 
> (I'm not counting the code to speed up iterating (for debug logging and test
> cases) through the memory by tracking what's changed which we would need
> anyway since it is several orders of magnitude faster.)

DO NOT do optimisations.

> based off the budget estimate, if we included everything I most recently
> proposed, I should finish by around the end of Dec if I start working now.

MUCH too late.

> If we remove dynamically linked binaries from the scope, then I estimate
> I'll be done by the middle of Dec if I start working now based off the EUR
> 3300 left in the budget estimate.

go go go get fricking on with it and DO NOT add anything more ok?
do not even ASK to add any more.

mid-december when we have the other crypto tasks to get done is seriously
pushing it.
Comment 26 Jacob Lifshay 2023-11-21 23:26:19 GMT
(In reply to Luke Kenneth Casson Leighton from comment #25)
> mid-december when we have the other crypto tasks to get done is seriously
> pushing it.

iirc the deadline for that grant is june or so, so we're trying to finish everything by march or so, which means we do have enough time.
Comment 27 Luke Kenneth Casson Leighton 2023-11-22 03:57:39 GMT
(In reply to Jacob Lifshay from comment #26)

> iirc the deadline for that grant is june or so, so we're trying to finish
> everything by march or so, which means we do have enough time.

no we do NOT.

june is the date where NLnet DO NOT GET PAID because the entire
NGI Assure EU Grant TERMINATES.

i *have* explained this several times.

NLnet would have to put in an RFP to the EU *IN ADVANCE* of that
date which if we then did not actually do the work they would be
in breach of their contract as a 3rd Party Financial Provider of
EU Grants and place every single NLnet project at risk, i.e.
they would have committed FRAUD by taking EU money for work not
actually done.

this is real serious jacob, you CANNOT make excuses or not listen
to what i am telling you, here.

our deadline is MARCH THE FIRST 2024.

please cut back the work to bare minimum, UNDER NO CIRCUMSTANCES STOP
OR STALL, and under no circumstances increase the scope of work again
without a proper discussion and review.

remember we can always put in a new grant request to incrementally
improve but not if we have screwed our funder, NLnet.
Comment 28 Jacob Lifshay 2023-11-22 04:01:59 GMT
(In reply to Luke Kenneth Casson Leighton from comment #27)
> (In reply to Jacob Lifshay from comment #26)
> 
> > iirc the deadline for that grant is june or so, so we're trying to finish
> > everything by march or so, which means we do have enough time.
> 
> no we do NOT.

you're telling me exactly what I already knew...my point is we have time since march is several months away and iirc we don't have so much work that we can't finish by then.
Comment 29 Andrey Miroshnikov 2023-11-22 09:46:09 GMT
(In reply to Jacob Lifshay from comment #28)
> you're telling me exactly what I already knew...my point is we have time
> since march is several months away and iirc we don't have so much work that
> we can't finish by then.
The problem being raised here, Jacob, is that there's more to do than just this task.

Before March you also need to finish the cryptorouter related work (powmod, and so on...), but given that you came up with a new SVP64 feature while doing powmod, we have to expect either simulator bugs, or some other shortcomings which could delay delivery.

This ELF support task is relatively insular, and you *should* cut down the scope of it, to make sure there's plenty of time allocated to SVP64 related (cryptorouter) work.

As long as some very basic (without memory protection, etc.) ELF binary can demonstrate hello world and opening/writing to a file, this task can be deemed complete (as so much work is required just to get to that point).
Comment 30 Luke Kenneth Casson Leighton 2023-11-26 18:26:12 GMT
(In reply to Jacob Lifshay from comment #28)

> you're telling me exactly what I already knew...my point is we have time
> since march is several months away 

March 1st is ONLY THREE MONTHS and four days away with preparation for
FOSDEM being in much of that.

> and iirc we don't have so much work that
> we can't finish by then.

no, we HAVE to get it done. DO NOT under ANY circumstances decide unilaterally
that "it cannot be done therefore there is no point therefore tasks can be
arbitrarily dropped". we MADE A COMMITMENT TO NLNET IN WRITING and if that
commitment is missed it threatens future funding.

please CUT BACK to the absolute bare minimum: you have until december
7th to declare this task done.

if you are in difficulties with that PLEASE ASK FOR GUIDANCE IMMEDIATELY ok?
do not wait. you have 12 days.
Comment 31 Luke Kenneth Casson Leighton 2023-11-26 18:36:35 GMT
i've edited comment #0 and removed the majority of tasks. the static "hello world"
is left: it is just two syscalls which is perfectly sufficient.
bonus points for opening a file, reading and closing it but only if there
is time.

this task can be REVISITED later in another grant to make a more complete system.
but emphatically NOT under this grant, you have too much else to do within
the remaining THREE months and four days.
Comment 32 Jacob Lifshay 2023-11-26 20:02:40 GMT
(In reply to Luke Kenneth Casson Leighton from comment #31)
> i've edited comment #0 and removed the majority of tasks. the static "hello
> world"
> is left: it is just two syscalls which is perfectly sufficient.

I think we should add mmap back in: i'm already like 90% done with a minimal version of it and it makes ELF loading easier.
Comment 33 Luke Kenneth Casson Leighton 2023-11-26 22:46:41 GMT
(In reply to Jacob Lifshay from comment #32)

> I think we should add mmap back in: i'm already like 90% done with a minimal
> version of it and it makes ELF loading easier.

ok great.  do not spend time on "security", do not spend time
on "safety". allocate memory. use it. done.

if necessary allocate the (one, large but still within 32bit system range)
memory block early with a python startup hook.
Comment 34 Jacob Lifshay 2023-11-27 03:24:54 GMT
(In reply to Luke Kenneth Casson Leighton from comment #33)
> (In reply to Jacob Lifshay from comment #32)
> 
> > I think we should add mmap back in: i'm already like 90% done with a minimal
> > version of it and it makes ELF loading easier.
> 
> ok great.  do not spend time on "security", do not spend time
> on "safety". allocate memory. use it. done.

nah, I'm just leaving all the stuff I already implemented in there. no point in wasting more time to delete something that's useful.
> 
> if necessary allocate the (one, large but still within 32bit system range)
> memory block early with a python startup hook.

it has several blocks of memory, since stuff like the stack is always mapped at the other end of the user address space.

e.g. the memory mappings at startup of a simple static-linked binary:
    Start Addr           End Addr       Size     Offset objfile
    0x10000000         0x100b0000    0xb0000        0x0 /home/jacob/a.out
    0x100b0000         0x100d0000    0x20000    0xa0000 /home/jacob/a.out
0x7ffff7fe0000     0x7ffff8000000    0x20000        0x0 [vdso]
0x7ffffffd0000     0x800000000000    0x30000        0x0 [stack]

so I put one block at 0x7fff... up to 0x800000000000 for the stack and stuff that likes getting mapped to high addresses and one block
starting at address 0x0 for everything else. I implemented all of that in bug #1173.
Comment 35 Jacob Lifshay 2023-11-27 03:33:36 GMT
Implemented mmap, except for final wiring of sc insn to call MemMMap.mmap_syscall. This is enough for loading ELF files, so I'm leaving it at that.

it supports mapping anonymous memory and files, but only with MAP_PRIVATE. MAP_SHARED isn't supported for now.
Mappings with and without MAP_FIXED are both supported, though I only test MAP_FIXED since that's all I need for ELF loading.
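
To illustrate what MAP_PRIVATE and anonymous mappings mean on the host side, here is a minimal sketch using Python's stdlib mmap module (Unix only). It is not the MemMMap.mmap_syscall code; the function names are made up for this example, and MAP_FIXED is left out because Python's mmap.mmap has no address argument.

```python
import mmap
import tempfile


def private_file_mapping_demo():
    """Write through a MAP_PRIVATE file mapping; the file stays untouched."""
    with tempfile.TemporaryFile() as f:
        f.write(b"hello" + bytes(mmap.PAGESIZE - 5))
        f.flush()
        m = mmap.mmap(f.fileno(), mmap.PAGESIZE,
                      flags=mmap.MAP_PRIVATE,
                      prot=mmap.PROT_READ | mmap.PROT_WRITE)
        try:
            m[0:5] = b"HELLO"  # copy-on-write: only this mapping changes
            in_mapping = bytes(m[0:5])
        finally:
            m.close()
        f.seek(0)
        in_file = f.read(5)
    return in_mapping, in_file


def anonymous_mapping_demo():
    """Anonymous private memory has no backing file and starts zeroed."""
    m = mmap.mmap(-1, mmap.PAGESIZE,
                  flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS,
                  prot=mmap.PROT_READ | mmap.PROT_WRITE)
    try:
        return bytes(m[0:4])
    finally:
        m.close()
```

With MAP_SHARED the write in the first function would reach the file, which is the case that is not supported for now.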

https://git.libre-soc.org/?p=openpower-isa.git;a=shortlog;h=8a5707536105830db4a60d3fcc2ac0184ff3d19d

commit 8a5707536105830db4a60d3fcc2ac0184ff3d19d
Author: Jacob Lifshay <programmerjake@gmail.com>
Date:   Sun Nov 26 19:14:21 2023 -0800

    add mmap_syscall tests

commit 00a153e29456dbfcaa339a9d1b4481873c3d40d4
Author: Jacob Lifshay <programmerjake@gmail.com>
Date:   Sun Nov 26 19:13:25 2023 -0800

    implement MemMMap.mmap_syscall

commit 49b7ab73c59ba285695f10058c1585420b226f4b
Author: Jacob Lifshay <programmerjake@gmail.com>
Date:   Sun Nov 26 19:11:35 2023 -0800

    add ppc_flags.py so we can get the ppc versions of all the flags we need
    
    tells gcc to dump all #defines, and parses that.
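
For readers unfamiliar with the approach in that last commit message, here is a rough sketch of parsing a compiler's macro dump. It is not the contents of ppc_flags.py: parse_defines and the SAMPLE text are hypothetical, and the sample values are only illustrative. The assumption is that the dump comes from something like running a powerpc64le cross gcc with `-dM -E` on a file that includes the relevant headers.

```python
import re

# matches object-like macros only: "#define NAME body"
_DEFINE_RE = re.compile(r"^#define\s+(\w+)\s+(.+?)\s*$")
_OCTAL_RE = re.compile(r"0[0-7]+")


def parse_defines(text):
    """Return {name: int} for macros whose body is a plain integer literal."""
    flags = {}
    for line in text.splitlines():
        m = _DEFINE_RE.match(line)
        if m is None:
            continue  # not a #define, or a function-like macro
        name, body = m.groups()
        try:
            if _OCTAL_RE.fullmatch(body):
                flags[name] = int(body, 8)  # C-style octal, e.g. 0100
            else:
                flags[name] = int(body, 0)  # decimal or 0x... hex
        except ValueError:
            pass  # body is an expression or another macro name; skip it
    return flags


# illustrative input only, not captured from a real compiler run
SAMPLE = """\
#define MAP_PRIVATE 0x02
#define MAP_FIXED 0x10
#define O_CREAT 0100
#define MAP_ANON MAP_ANONYMOUS
#define FOO(x) ((x) + 1)
"""
```

Macros defined in terms of other macros (MAP_ANON above) are skipped here; a fuller version would need to resolve those.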
Comment 36 Luke Kenneth Casson Leighton 2023-11-27 08:23:52 GMT
(In reply to Jacob Lifshay from comment #34)
> (In reply to Luke Kenneth Casson Leighton from comment #33)
> > (In reply to Jacob Lifshay from comment #32)
> > 
> > > I think we should add mmap back in: i'm already like 90% done with a minimal
> > > version of it and it makes ELF loading easier.
> > 
> > ok great.  do not spend time on "security", do not spend time
> > on "safety". allocate memory. use it. done.
> 
> nah, I'm just leaving all the stuff I already implemented in there. no point
> in wasting more time to delete something that's useful.

good plan.

> > 
> > if necessary allocate the (one, large but still within 32bit system range)
> > memory block early with a python startup hook.
> 
> it has several blocks of memory, since stuff like the stack is always mapped
> at the other end of the user address space.

make sure it will fit on a system with only 2 GB of RAM.

> e.g. the memory mappings at startup of a simple static-linked binary:
>     Start Addr           End Addr       Size     Offset objfile
>     0x10000000         0x100b0000    0xb0000        0x0 /home/jacob/a.out
>     0x100b0000         0x100d0000    0x20000    0xa0000 /home/jacob/a.out

good. fits in 2GB RAM.

> 0x7ffff7fe0000     0x7ffff8000000    0x20000        0x0 [vdso]
> 0x7ffffffd0000     0x800000000000    0x30000        0x0 [stack]

cut those back to 0x3fff_xxxx to fit into 2GB RAM systems.


(In reply to Jacob Lifshay from comment #35)
> Implemented mmap, except for final wiring of sc insn to call
> MemMMap.mmap_syscall. This is enough for loading ELF files, so I'm leaving
> it at that.

GREAT.
 
> it supports mapping anonymous memory and files, but only with MAP_PRIVATE.

not sure if this is in scope, assuming no unless it interferes with ELF loading
and running.

> MAP_SHARED isn't supported for now.

don't care in the least little bit. 100% out of scope of the bare minimum
requirement.

> MAP_FIXED and not are both supported,
> though I only test MAP_FIXED since
> that's all I need for ELF loading.

goood.

this is great jacob, gets the job done, and done fast.
Comment 37 Jacob Lifshay 2023-11-27 08:33:06 GMT
(In reply to Luke Kenneth Casson Leighton from comment #36)
> 
> make sure it will fit on a system with only 2 GB of RAM.

it fits in a system with <1GB.
> 
> > e.g. the memory mappings at startup of a simple static-linked binary:
> >     Start Addr           End Addr       Size     Offset objfile
> >     0x10000000         0x100b0000    0xb0000        0x0 /home/jacob/a.out
> >     0x100b0000         0x100d0000    0x20000    0xa0000 /home/jacob/a.out
> 
> good. fits in 2GB RAM.
> 
> > 0x7ffff7fe0000     0x7ffff8000000    0x20000        0x0 [vdso]
> > 0x7ffffffd0000     0x800000000000    0x30000        0x0 [stack]
> 
> cut those back to 0x3fff_xxxx to fit into 2GB RAM systems.

no need, there's a dict that translates, so if we wanted, we could put a memory block in kernel space too. this needs no particular memory addresses on the host. this also allows you to have multiple MemMMap instances simultaneously too (handy cuz we don't specifically destroy them, just relying on Python's GC)

if we were extra crazy and wanted to implement RV128, we could do that too (with minor adjustments since 2**64 and similar are hard-coded right now).
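
A toy version of the dict-translation idea, for anyone following along. This is only a sketch under assumptions: SparseMem and the 64 KiB PAGE size are invented for the example, and the real MemMMap is backed by mmap blocks rather than bytearrays. The point is just that emulated addresses are dict keys, so they put no constraint on host addresses or host RAM size.

```python
PAGE = 1 << 16  # emulated page size, an assumption for this sketch


class SparseMem:
    """Emulated memory where only touched pages cost host memory."""

    def __init__(self):
        self.pages = {}  # emulated page number -> bytearray(PAGE)

    def store(self, addr, data):
        for i, byte in enumerate(data):
            a = addr + i
            page = self.pages.get(a // PAGE)
            if page is None:
                # allocate lazily, on first write to this emulated page
                page = self.pages[a // PAGE] = bytearray(PAGE)
            page[a % PAGE] = byte

    def load(self, addr, n):
        out = bytearray()
        for a in range(addr, addr + n):
            page = self.pages.get(a // PAGE)
            # untouched pages read as zero and are not allocated
            out.append(0 if page is None else page[a % PAGE])
        return bytes(out)
```

Storing a few bytes at 0x10000000 and at 0x7ffffffd0000 allocates two 64 KiB pages on the host, no matter how far apart the emulated addresses are.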
Comment 38 Jacob Lifshay 2023-11-30 03:26:18 GMT
Luke, if you want us to have the version of pyelftools that we use be on git.libre-soc.org, please make a repo for it, otherwise I'll just link setup.py to the git tag v0.30 at https://github.com/eliben/pyelftools/ and call that good enough.
Comment 39 Luke Kenneth Casson Leighton 2023-11-30 03:56:11 GMT
(In reply to Jacob Lifshay from comment #38)
> Luke, if you want us to have the version of pyelftools that we use be on
> git.libre-soc.org, please make a repo for it, otherwise I'll just link
> setup.py to the git tag v0.30 at https://github.com/eliben/pyelftools/ and
> call that good enough.

all done.
https://git.libre-soc.org/?p=pyelftools.git;a=summary
likely sensible to use the tag anyway
Comment 40 Jacob Lifshay 2023-11-30 04:07:52 GMT
(In reply to Luke Kenneth Casson Leighton from comment #39)
> all done.

Thanks.
> https://git.libre-soc.org/?p=pyelftools.git;a=summary

mirrored (except I deleted all the refs/pull/* refs)

> likely sensible use tag anyway

I will
Comment 41 Jacob Lifshay 2023-11-30 10:47:19 GMT
working on ELF, I ran into a confusing issue when SRR0 isn't initialized: bug #1226
Comment 42 Jacob Lifshay 2023-12-01 01:16:36 GMT
I got exit_group and write syscalls to work from a loaded ELF file (I'll split out into separate commits later). Currently running into SIGBUS, will debug later (probably tomorrow).
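
For context, a hedged sketch of what handling these two syscalls can look like in an emulator. This is not the caller.py code: do_syscall, ExitGroup and the read_mem callback are hypothetical names. It assumes the powerpc Linux convention for sc (syscall number in r0, arguments from r3 up, result in r3, with CR0.SO signalling an error) and the powerpc syscall numbers 4 (write) and 234 (exit_group).

```python
import errno
import os

NR_WRITE = 4         # powerpc Linux syscall number for write
NR_EXIT_GROUP = 234  # powerpc Linux syscall number for exit_group


class ExitGroup(Exception):
    """Raised to stop the simulation; args[0] is the exit status."""


def do_syscall(gprs, read_mem):
    """Emulate one sc instruction.

    gprs maps register number -> value; read_mem(addr, n) returns n bytes
    of emulated memory. Returns True if CR0.SO should be set (error).
    """
    nr = gprs[0]
    if nr == NR_WRITE:
        fd, buf, count = gprs[3], gprs[4], gprs[5]
        try:
            gprs[3] = os.write(fd, read_mem(buf, count))
            return False
        except OSError as e:
            gprs[3] = e.errno  # positive errno, error flagged via CR0.SO
            return True
    if nr == NR_EXIT_GROUP:
        raise ExitGroup(gprs[3])
    gprs[3] = errno.ENOSYS  # anything else is unimplemented in this sketch
    return True
```

A static hello world needs nothing beyond these two paths, which matches the "just two syscalls" scope in comment #31.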
Comment 43 Jacob Lifshay 2023-12-01 09:51:26 GMT
I got a statically-linked hello world to work! I also had to do a bunch of debugging and fixing the syscall support, I cleaned up some of the kludges and fixed the tests to match.

I made ISACaller call load_elf (through MemMMap.initialize), since that way it's easier to use from both unit tests and from pypowersim. Basically, you just pass an ELFFile in instead of a Program.
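
As a rough illustration of what loading a static ELF involves, here is a dependency-free sketch that walks the PT_LOAD program headers of a little-endian ELF64 image. It is not the load_elf in mem.py (which works on a pyelftools ELFFile); load_segments is a hypothetical name and the struct layouts are the standard ELF64 ones.

```python
import struct

EHDR = struct.Struct("<16sHHIQQQIHHHHHH")  # ELF64 file header, 64 bytes
PHDR = struct.Struct("<IIQQQQQQ")          # ELF64 program header, 56 bytes
PT_LOAD = 1


def load_segments(image):
    """Return (entry, [(vaddr, file_bytes, memsz), ...]) for PT_LOAD segments."""
    (ident, _type, _machine, _version, entry, phoff, _shoff, _flags,
     _ehsize, phentsize, phnum, _shentsize, _shnum,
     _shstrndx) = EHDR.unpack_from(image, 0)
    # ident[4] == 2 means 64-bit, ident[5] == 1 means little-endian
    if ident[:4] != b"\x7fELF" or ident[4] != 2 or ident[5] != 1:
        raise ValueError("expected a little-endian ELF64 image")
    segments = []
    for i in range(phnum):
        (p_type, _p_flags, p_offset, p_vaddr, _p_paddr, p_filesz,
         p_memsz, _p_align) = PHDR.unpack_from(image, phoff + i * phentsize)
        if p_type == PT_LOAD:
            data = image[p_offset:p_offset + p_filesz]
            segments.append((p_vaddr, data, p_memsz))
    return entry, segments
```

Each segment's file bytes go at vaddr, and when memsz is larger than the file data the remainder is zero-filled (that is where .bss lives). A real loader would map the file at those addresses rather than copy bytes.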

ELF relocations aren't implemented yet...

https://git.libre-soc.org/?p=openpower-isa.git;a=shortlog;h=404f6e2ff5b321137448fac1d5642fc6d8e45ad2

commit 404f6e2ff5b321137448fac1d5642fc6d8e45ad2
Author: Jacob Lifshay <programmerjake@gmail.com>
Date:   Fri Dec 1 01:17:05 2023 -0800

    openpower/test/elf/simple_cases: add some simple ELF test cases

commit d8b3085bdd04006749e95339a4998d3e63132125
Author: Jacob Lifshay <programmerjake@gmail.com>
Date:   Thu Nov 30 23:55:41 2023 -0800

    add utilities for testing ELF files

commit 1dae051bb1ee50e1420472613aed3f8799e405f0
Author: Jacob Lifshay <programmerjake@gmail.com>
Date:   Thu Nov 30 23:54:48 2023 -0800

    SimRunner: support running an ELFFile

commit 1bbec2049bb76dbf2e40ccd2461daefa3c8f795f
Author: Jacob Lifshay <programmerjake@gmail.com>
Date:   Thu Nov 30 23:51:08 2023 -0800

    ISACaller: support loading an ELFFile

commit 0aac36feb016237c6523b6416a8419711cb919dc
Author: Jacob Lifshay <programmerjake@gmail.com>
Date:   Thu Nov 30 23:46:50 2023 -0800

    mem.py: add load_elf

commit 3bdcd0937ceb6788389679b7a465b7a1fab328a6
Author: Jacob Lifshay <programmerjake@gmail.com>
Date:   Thu Nov 30 23:40:51 2023 -0800

    mem.py: fix SIGBUS when accessing file mapped by mmap_syscall
    
    this fixes SIGBUS errors caused by accessing beyond the end of a
    file but still in the last page of the file, which is a valid thing to
    do, except that we have to account for host pages having a different
    size than emulated pages and map zeros to fill out the rest of the
    emulated page.

commit 3e1c1a5a256ecc6b93e04e6671a486dc3eb7f272
Author: Jacob Lifshay <programmerjake@gmail.com>
Date:   Thu Nov 30 23:29:18 2023 -0800

    caller.py: implement write syscall

commit 3d14be23fa58008aa8ed012295cd4d6fe3b45eb2
Author: Jacob Lifshay <programmerjake@gmail.com>
Date:   Thu Nov 30 23:27:48 2023 -0800

    caller.py: implement exit_group syscall

commit cdd445a72085ee3faa35826a2c0449907a0504f7
Author: Jacob Lifshay <programmerjake@gmail.com>
Date:   Thu Nov 30 23:15:45 2023 -0800

    ISACaller: fix syscall emulation
    
    there were two bugs fixed:
    1. sc emulation was missing a `return`, so it tried to run sc
       again after running sc and rfid, giving the wrong CIA and
       MSR values.
    2. the code to replace and restore the instruction with rfid
       had the wrong endian on the load, so it was corrupting the
       instruction for the next time it was used. I just deleted
       the save/replace/restore code since it isn't needed anymore.
    
    I then changed the syscall tests to ensure both the
    bugs above don't happen again.

commit dcd540c1055af5cabb2c18c67bbe5d6d1b70b744
Author: Jacob Lifshay <programmerjake@gmail.com>
Date:   Wed Nov 29 19:04:26 2023 -0800

    setup: add pyelftools v0.30 as dependency
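
The SIGBUS fix described in the mem.py commit above comes down to some page arithmetic. A sketch of that arithmetic follows, under assumptions: split_file_mapping is a hypothetical helper, not a function in mem.py, and the page sizes in the example are chosen only for illustration. On the host, touching a mapped page that lies wholly past the end of the file raises SIGBUS, so only the part up to the end of the host page containing EOF can be file-backed; the rest of the emulated page has to be mapped as zeros.

```python
def split_file_mapping(file_size, offset, length, host_page):
    """Return (file_backed, zero_fill) byte counts for a file mapping.

    offset and length describe the requested mapping (whole emulated
    pages); host_page is the host's page size.
    """
    remaining = max(file_size - offset, 0)
    # round the remaining file bytes up to a whole number of host pages
    backed_limit = -(-remaining // host_page) * host_page
    file_backed = min(length, backed_limit)
    return file_backed, length - file_backed
```

For example, a 5000-byte file mapped as one 64 KiB emulated page on a host with 4 KiB pages gives 8192 file-backed bytes and 57344 bytes that need explicit zero fill.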
Comment 44 Luke Kenneth Casson Leighton 2023-12-01 14:01:40 GMT
(In reply to Jacob Lifshay from comment #43)
> I got a statically-linked hello world to work! 

hooraaay! that's a big damn deal!

> I also had to do a bunch of
> debugging and fixing the syscall support, I cleaned up some of the kludges
> and fixed the tests to match.

yeah sounds about right, only by testing is the dive deep enough

> I made ISACaller call load_elf (through MemMMap.initialize), since that way
> it's easier to use from both unit tests and from pypowersim. basically, you
> just pass a ELFFile in instead of a Program.

yeah that makes perfect sense to me, welcome to the Liskov Substitution
Principle...

> ELF relocations aren't implemented yet...

next grant... or is it quick and easy? -fPIC support would be brilliant
to have, but only if it takes a really short amount of time

> commit dcd540c1055af5cabb2c18c67bbe5d6d1b70b744
> Author: Jacob Lifshay <programmerjake@gmail.com>
> Date:   Wed Nov 29 19:04:26 2023 -0800
> 
>     setup: add pyelftools v0.30 as dependency

mmmm i took all dependencies out, pip3 is so bad nowadays.
can you comment it out and add it to devscripts instead?
leave the actual dependency still in there, don't delete it.
Comment 45 Luke Kenneth Casson Leighton 2023-12-01 14:03:58 GMT
rebased against master for you

libresoc@localhost:~/src/openpower-isa/openpower/isa$ git checkout 1169-elf-support
Previous HEAD position was 7f64d55f Moved maddsubrs/maddrs/msubrs instructions to separate files As per Jacob's suggestion, simplified maddsubrs by removing masks and fixing overflow problems.
Branch '1169-elf-support' set up to track remote branch '1169-elf-support' from 'origin'.
Switched to a new branch '1169-elf-support'
libresoc@localhost:~/src/openpower-isa/openpower/isa$ git pull
Already up to date.
libresoc@localhost:~/src/openpower-isa/openpower/isa$ git rebase master
Successfully rebased and updated refs/heads/1169-elf-support.
libresoc@localhost:~/src/openpower-isa/openpower/isa$ git pull
Successfully rebased and updated refs/heads/1169-elf-support.
libresoc@localhost:~/src/openpower-isa/openpower/isa$ git push
Enumerating objects: 190, done.
Counting objects: 100% (190/190), done.
Delta compression using up to 4 threads
Compressing objects: 100% (136/136), done.
Writing objects: 100% (174/174), 19.75 KiB | 396.00 KiB/s, done.
Total 174 (delta 138), reused 39 (delta 38), pack-reused 0
remote: Resolving deltas: 100% (138/138), completed with 15 local objects.
remote: updating openpower-isa www directory
remote: From /var/lib/gitolite3/repositories/openpower-isa
remote:    404f6e2f..85e77c34  1169-elf-support -> origin/1169-elf-support
remote: Already up to date.
remote: Already up to date.
remote: updating ikiwiki openpower-isa underlay
To git.libre-soc.org:openpower-isa.git
   404f6e2f..85e77c34  1169-elf-support -> 1169-elf-support
libresoc@localhost:~/src/openpower-isa/openpower/isa$
Comment 46 Luke Kenneth Casson Leighton 2023-12-01 14:43:51 GMT
untested am in a cafe, sorry

commit 014fdc19b0207a35d564a680c52269841856cf9b (HEAD -> 1169-elf-support, origin/1169-elf-support)
Author: Luke Kenneth Casson Leighton <lkcl@lkcl.net>
Date:   Fri Dec 1 14:42:25 2023 +0000

    bug 1169: elf support, minor coding style adjustment, clearer
Comment 47 Luke Kenneth Casson Leighton 2023-12-01 18:13:13 GMT
ok i was fascinated enough by this one to try it out,
messed some things up due to being in a cafe, sorted
them out in another, installed gcc (missing on my
laptop), simplified how the unit tests look, ran them
and wha-hey! absolutely awesome, i got "Hello World"
which is amazing news.
Comment 48 Jacob Lifshay 2023-12-01 20:21:02 GMT
(In reply to Luke Kenneth Casson Leighton from comment #45)
> rebased against master for you

unfortunately you managed to rebase master onto 1169-elf-support, so all the commits on master got rewritten instead of what we want which is rewriting the commits on 1169-elf-support to be based on master. I'll fix this later.
Comment 49 Jacob Lifshay 2023-12-01 20:23:56 GMT
(In reply to Jacob Lifshay from comment #48)
> (In reply to Luke Kenneth Casson Leighton from comment #45)
> > rebased against master for you
> 
> unfortunately you managed to rebase master onto 1169-elf-support, so all the
> commits on master got rewritten instead of what we want which is rewriting
> the commits on 1169-elf-support to be based on master. I'll fix this later.

e.g. commit on master:
https://git.libre-soc.org/?p=openpower-isa.git;a=commit;h=22c2fd82ddb9f4383eaf5bc867779d6b6845fcd4

same commit on 1169-elf-support (note different hash):
https://git.libre-soc.org/?p=openpower-isa.git;a=commit;h=1dc3bedcbdb9e10e5da7dd8e5c038a44f978b705
Comment 50 Luke Kenneth Casson Leighton 2023-12-01 20:55:07 GMT
(In reply to Jacob Lifshay from comment #48)
> (In reply to Luke Kenneth Casson Leighton from comment #45)
> > rebased against master for you
> 
> unfortunately you managed to rebase master onto 1169-elf-support, 

huhn moo? i typed "git rebase master" in the 1169-elf-support branch. should
work, right?
Comment 51 Jacob Lifshay 2023-12-01 21:19:56 GMT
(In reply to Luke Kenneth Casson Leighton from comment #50)
> (In reply to Jacob Lifshay from comment #48)
> > (In reply to Luke Kenneth Casson Leighton from comment #45)
> > > rebased against master for you
> > 
> > unfortunately you managed to rebase master onto 1169-elf-support, 
> 
> huhn moo? i typed "git rebase master" in the 1169-elf-support branch. should
> work, right?

well, something messed up. I like to use rebase -i so I can see what it's doing.

I rebased again, but since master already had an `import os`, i edited the commit containing `from os import readlink` to just use `os.readlink` and dropped the commit that made that change again.

I also added a new commit after rebasing. I ran tests (but aborted when it got to 99%, i don't want to wait 1hr for powmod), they passed.

log with branches, so you can see i got the rebase the right way around this time:

https://git.libre-soc.org/?p=openpower-isa.git;a=shortlog;h=1d45949e369d92c0cbd41b2c247a5bae9c37e11b

commit 1d45949e369d92c0cbd41b2c247a5bae9c37e11b (HEAD -> 1169-elf-support, origin/1169-elf-support)
Author: Jacob Lifshay <programmerjake@gmail.com>
Date:   Fri Dec 1 12:57:34 2023 -0800

    elf/simple_cases: deepcopy is unnecessary, call dict.copy

commit 1f1714a2d38941f488c3e31b891568ca3794b519
Author: Luke Kenneth Casson Leighton <lkcl@lkcl.net>
Date:   Fri Dec 1 17:49:51 2023 +0000

    bug 1169: elf-support correct syntax errors

commit 25435266ec9608aa71dc65116430ef2a72ea5360
Author: Luke Kenneth Casson Leighton <lkcl@lkcl.net>
Date:   Fri Dec 1 14:42:25 2023 +0000

    bug 1169: elf support, minor coding style adjustment, clearer

commit 76f6a80d1032e434caf040a1ed93b60e5c010a99
Author: Jacob Lifshay <programmerjake@gmail.com>
Date:   Fri Dec 1 01:17:05 2023 -0800

    openpower/test/elf/simple_cases: add some simple ELF test cases

commit 5b91e9f2e0c40a524bb5dd3f9adbcf856ea72d1e
Author: Jacob Lifshay <programmerjake@gmail.com>
Date:   Thu Nov 30 23:55:41 2023 -0800

    add utilities for testing ELF files

commit 74537736287a035dde84a83d2c72245936babf10
Author: Jacob Lifshay <programmerjake@gmail.com>
Date:   Thu Nov 30 23:54:48 2023 -0800

    SimRunner: support running an ELFFile

commit bab7beab9183331c66ab9e733f71665a8a513498
Author: Jacob Lifshay <programmerjake@gmail.com>
Date:   Thu Nov 30 23:51:08 2023 -0800

    ISACaller: support loading an ELFFile

commit 5c1b59a76b4cf4be53b6c7e0f216bd9a18f8cf0d
Author: Jacob Lifshay <programmerjake@gmail.com>
Date:   Thu Nov 30 23:46:50 2023 -0800

    mem.py: add load_elf

commit 2737991fb0001c56ca182bdc97903075c2659e33
Author: Jacob Lifshay <programmerjake@gmail.com>
Date:   Thu Nov 30 23:40:51 2023 -0800

    mem.py: fix SIGBUS when accessing file mapped by mmap_syscall

    this fixes SIGBUS errors caused by accessing beyond the end of a
    file but still in the last page of the file, which is a valid thing to
    do, except that we have to account for host pages having a different
    size than emulated pages and map zeros to fill out the rest of the
    emulated page.

commit 46b65d91b6d43be1bd4531616438f533d220b09e
Author: Jacob Lifshay <programmerjake@gmail.com>
Date:   Thu Nov 30 23:29:18 2023 -0800

    caller.py: implement write syscall

commit 706bc79c356283f3bcaee817b3dab2f0e4a59ef2
Author: Jacob Lifshay <programmerjake@gmail.com>
Date:   Thu Nov 30 23:27:48 2023 -0800

    caller.py: implement exit_group syscall

commit 6012b9744813c6e56ed1406bdd8036224b41417e
Author: Jacob Lifshay <programmerjake@gmail.com>
Date:   Thu Nov 30 23:15:45 2023 -0800

    ISACaller: fix syscall emulation

    there were two bugs fixed:
    1. sc emulation was missing a `return`, so it tried to run sc
       again after running sc and rfid, giving the wrong CIA and
       MSR values.
    2. the code to replace and restore the instruction with rfid
       had the wrong endian on the load, so it was corrupting the
       instruction for the next time it was used. I just deleted
       the save/replace/restore code since it isn't needed anymore.

    I then changed the syscall tests to ensure both the
    bugs above don't happen again.

commit cfc88640e740fe4e0cb4c350c95f6cb2eb6b049b
Author: Jacob Lifshay <programmerjake@gmail.com>
Date:   Wed Nov 29 19:04:26 2023 -0800

    setup: add pyelftools v0.30 as dependency

commit c45ea4254c9e5cbcd357749fefc34a44044c8c65
Author: Jacob Lifshay <programmerjake@gmail.com>
Date:   Sun Nov 26 21:43:46 2023 -0800

    add g++-powerpc64le-linux-gnu to .gitlab-ci.yml

commit eeb3d2031396f9d4bc8f377344c7f5940eae8b9e
Author: Jacob Lifshay <programmerjake@gmail.com>
Date:   Sun Nov 26 19:14:21 2023 -0800

    add mmap_syscall tests

commit 5dc69f649cd8919dac6d6be01e6357cb086cadb0
Author: Jacob Lifshay <programmerjake@gmail.com>
Date:   Sun Nov 26 19:13:25 2023 -0800

    implement MemMMap.mmap_syscall

commit 5f21b2e6093390c1f4c48331781d395357c58386
Author: Jacob Lifshay <programmerjake@gmail.com>
Date:   Sun Nov 26 19:11:35 2023 -0800

    add ppc_flags.py so we can get the ppc versions of all the flags we need

    tells gcc to dump all #defines, and parses that.

commit 5072485f66e9f680892320886bd0b76a5438a0b7 (origin/master, origin/HEAD, master)
Author: Luke Kenneth Casson Leighton <lkcl@lkcl.net>
Date:   Fri Dec 1 17:57:12 2023 +0000

    replace print() with log()
Comment 52 Luke Kenneth Casson Leighton 2023-12-01 22:15:46 GMT
ok thank you jacob. i am happy this can be signed-off.

so we have a decision-point, i leave it to you:

* do we have time to complete other cryptoprimitives and cavatools
  if doing relocatable binaries?

vs

* if adding relocatable elf support, is the delay on RFPs ok?
  RED and myself need $ but a *few* days is fine (no later than the 7th)

doable?
Comment 53 Jacob Lifshay 2023-12-01 22:40:06 GMT
(In reply to Luke Kenneth Casson Leighton from comment #52)
> ok thank you jacob. i am happy this can be signed-off.
> 
> so we have a decision-point, i leave it to you:
> 
> * do we have time to complete other cryptoprimitives and cavatools
>   if doing relocatable binaries?

I think so...
> 
> vs
> 
> * if adding relocatable elf support, is the delay on RFPs ok?
>   RED and myself need $ but a *few* days is fine (no later than the 7th)

I could probably get dynamic binaries working by the 7th, turns out minimal ELF support is much simpler than originally thought, most of my time was spent implementing mmap and debugging the syscall bugs.
> 
> doable?

I think so.

I'd like to implement brk, since that's relatively simple on top of what's already done, and that's required by ld.so and malloc.
Comment 54 Jacob Lifshay 2023-12-01 22:41:55 GMT
merged to master. Luke, please don't mark tasks as resolved until they're merged.
Comment 55 Luke Kenneth Casson Leighton 2023-12-01 23:01:22 GMT
(In reply to Jacob Lifshay from comment #53)
 
> I'd like to implement brk, since that's relatively simple on top of what's
> already done, and that's required by ld.so and malloc.

ok go for it, under bug #1228. i'm declaring this one done as it is
the last subtask of bug #983 which is a large RFP so is going in
straight away (right now).
Comment 56 Jacob Lifshay 2023-12-01 23:03:36 GMT
(In reply to Luke Kenneth Casson Leighton from comment #55)
> ok go for it, under bug #1228. i'm declaring this one done as it is
> the last subtask of bug #983 which is a large RFP so is going in
> straight away (right now).

k, sounds good.

marking resolved since I merged in comment #54
Comment 57 Jacob Lifshay 2023-12-03 02:57:30 GMT
I adjusted comment #0 to mark stuff as MOVED TO bug #1228

CI passed:
git clone "https://build.libre-soc.programmerjake.tk/build-archive.git" build-archive
cd build-archive
git checkout 84f857dd3594fa5655422084c5e9f5b2f51dfbb2
less -R pipelines/608509/job-4990782-log.txt

Pipeline on GitLab: https://salsa.debian.org/Kazan-team/mirrors/openpower-isa/-/pipelines/608509
Comment 58 Luke Kenneth Casson Leighton 2023-12-03 04:55:55 GMT
(In reply to Jacob Lifshay from comment #57)
> I adjusted comment #0 to mark stuff as MOVED TO bug #1228

good idea.

> CI passed:

fantastic, can you please put in an RFP for bug #983, note it on syncup,
and remind andrey to answer my requests to raise a
public question about RFPs?
https://libre-soc.org/meetings/sync_up/sync_up_2023-12-05/

include this in the rfp:

> git clone "https://build.libre-soc.programmerjake.tk/build-archive.git"
> build-archive
> cd build-archive
> git checkout 84f857dd3594fa5655422084c5e9f5b2f51dfbb2
> less -R pipelines/608509/job-4990782-log.txt
> 
> Pipeline on GitLab:
> https://salsa.debian.org/Kazan-team/mirrors/openpower-isa/-/pipelines/608509