Bug 464 - document the budget used for the project by analysing bugzilla
Summary: document the budget used for the project by analysing bugzilla
Status: IN_PROGRESS
Alias: None
Product: Libre-SOC Website
Classification: Unclassified
Component: website
Version: unspecified
Hardware: Other Linux
Importance: High enhancement
Assignee: Jacob Lifshay
URL:
Depends on:
Blocks:
 
Reported: 2020-08-18 13:27 BST by Luke Kenneth Casson Leighton
Modified: 2020-09-18 07:40 BST
CC List: 2 users

See Also:
NLnet milestone: ---
total budget (EUR) for completion of task and all subtasks: 0
budget (EUR) for this task, excluding subtasks' budget: 0
parent task for budget allocation:
child tasks for budget allocation:
The table of payments (in EUR) for this task; TOML format:
# commented out test case: #"person1"={amount=123,paid=2020-06-08}


Attachments
Output of budget-sync (16.66 KB, text/plain)
2020-08-21 03:40 BST, Jacob Lifshay
Output of budget-sync (16.83 KB, text/plain)
2020-09-05 03:36 BST, Jacob Lifshay
Output of budget-sync (22.64 KB, text/plain)
2020-09-10 01:53 BST, Jacob Lifshay
Output markdown of budget-sync for programmerjake.mdwn (2.80 KB, text/markdown)
2020-09-18 07:40 BST, Jacob Lifshay

Description Luke Kenneth Casson Leighton 2020-08-18 13:27:07 BST
an analysis page, autogenerated, is needed which shows and checks the budgets by using the REST api on bugzilla and generating a markdown page to be added to the libresoc wiki
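A minimal sketch of the first half of that idea, assuming Bugzilla's standard REST resource `/rest/bug/<id>`, which returns a JSON object with a `bugs` list. The helper names `bug_url`, `fetch_bug` and `budget_fields` are hypothetical, and the sample response only imitates the shape of a real reply (the string-typed budget values are an assumption):

```python
import json
import urllib.request

BUGZILLA = "https://bugs.libre-soc.org"


def bug_url(bug_id: int) -> str:
    """URL of the REST resource for a single bug."""
    return f"{BUGZILLA}/rest/bug/{bug_id}"


def fetch_bug(bug_id: int) -> dict:
    """Fetch one bug over the network (not called in this sketch)."""
    with urllib.request.urlopen(bug_url(bug_id)) as response:
        return json.load(response)["bugs"][0]


def budget_fields(bug: dict) -> tuple:
    """Extract (bug id, total budget, budget excluding subtasks) in EUR."""
    return (bug["id"], int(bug["cf_total_budget"]), int(bug["cf_budget"]))


# Hand-written stand-in for a REST reply, using this bug's own values.
sample_response = {"bugs": [{"id": 464, "cf_total_budget": "0", "cf_budget": "0"}]}
fields = budget_fields(sample_response["bugs"][0])
```

Keeping the network call in its own function means the parsing and checking code can be exercised on canned data.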
Comment 1 Luke Kenneth Casson Leighton 2020-08-18 13:30:13 BST
jacob i can justify making sure you receive something for this one, however as i will almost certainly want to tweak and enhance it please can you write it in python3.  given that it is an admin script feel free to include type info if it speeds up the task.
Comment 2 Jacob Lifshay 2020-08-18 17:15:33 BST
additional tasks:
http://lists.libre-soc.org/pipermail/libre-soc-dev/2020-August/000163.html
Comment 3 Jacob Lifshay 2020-08-18 22:21:41 BST
Luke, please create a new empty git repo for this task and give me push permissions. maybe name it utils.git?
Comment 4 Luke Kenneth Casson Leighton 2020-08-19 04:29:36 BST
(In reply to Jacob Lifshay from comment #3)
> Luke, please create a new empty git repo for this task and give me push
> permissions. maybe name it utils.git?

done, quite late here 4am so am trusting the choice of repo name.
Comment 5 Jacob Lifshay 2020-08-19 04:36:48 BST
(In reply to Luke Kenneth Casson Leighton from comment #4)
> (In reply to Jacob Lifshay from comment #3)
> > Luke, please create a new empty git repo for this task and give me push
> > permissions. maybe name it utils.git?
> 
> done, quite late here 4am so am trusting the choice of repo name.

Thanks, pushed!

It will need to be made publicly visible on https://git.libre-soc.org
but that can be done tomorrow.
Comment 6 Jacob Lifshay 2020-08-21 03:40:27 BST
Created attachment 98
Output of budget-sync

I got the budget error detecting features up and working, so, assuming there aren't any bugs, the list of things that need to be fixed is attached.

That list is rather extensive. At the very least, we don't have any loops in our budget tracking graph :)

I still haven't added wiki markdown parsing/synchronizing, will work on that next.
Comment 7 Jacob Lifshay 2020-08-21 03:42:39 BST
I also still haven't added checking to see that only the pre-approved bugs are the ones where all the funding comes from. This will detect accidentally setting a bug as an ultimate source of funding for a tree of bugs.
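A rough sketch of both checks mentioned so far (no loops in the budget graph, and only pre-approved bugs acting as the ultimate source of funding). The function names and the bug numbers in the example are made up for illustration; `parent` maps each bug to its "parent task for budget allocation", or `None` if it has none:

```python
from typing import Dict, List, Optional, Set


def find_root(bug: int, parent: Dict[int, Optional[int]]) -> int:
    """Follow parent links up to the ultimate funding source."""
    seen: Set[int] = set()
    while True:
        nxt = parent.get(bug)
        if nxt is None:
            return bug
        if bug in seen:
            raise ValueError(f"loop in budget graph at bug #{bug}")
        seen.add(bug)
        bug = nxt


def unapproved_roots(parent: Dict[int, Optional[int]],
                     approved: Set[int]) -> List[int]:
    """Roots of funding trees that are not in the pre-approved set."""
    roots = {find_root(bug, parent) for bug in parent}
    return sorted(roots - approved)


# Hypothetical graph: 10 is an approved top-level bug, 20 is not.
parent = {10: None, 11: 10, 12: 11, 20: None, 21: 20}
problems = unapproved_roots(parent, approved={10})
```

Here `problems` lists bug 20, which was (accidentally) left as the root of its own tree.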
Comment 8 Luke Kenneth Casson Leighton 2020-08-21 12:48:59 BST
https://bugs.libre-soc.org/show_bug.cgi?id=250

ah.  right.  ok.  so if there are no child tasks, i've not been filling
in the "excluding subtasks" field because it has no meaning.

can you remove that from the test, i will see if it's possible to "disable"
the "excluding subtasks" field entirely just like the "child tasks for
budget allocation" one.
Comment 9 Luke Kenneth Casson Leighton 2020-08-21 12:59:31 BST
nope.  doesn't work.  bugzilla's custom fields optional show/not-show is
not sophisticated enough.

rather than fill in 100 fields that are not entirely relevant can we:

* exclude anything that's a zero budget for now (#276) particularly if
  it doesn't have a milestone as well, or report a different message
* exclude anything that has zero in cf_budget which also has no subtasks
  or at least report a different message
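One possible reading of those two rules, as a sketch only. The function name and parameters are hypothetical, and "---" is assumed to be the empty milestone value as shown in the bug header:

```python
def should_check(total_budget: int, budget: int,
                 milestone: str, child_count: int) -> bool:
    """Decide whether a bug's budget fields should be reported on at all."""
    # Rule 1: nothing budgeted and no milestone set, so skip for now.
    if total_budget == 0 and milestone == "---":
        return False
    # Rule 2: a task without subtasks whose "excluding subtasks"
    # field was simply never filled in.
    if budget == 0 and child_count == 0:
        return False
    return True


skipped_empty = should_check(0, 0, "---", 0)
skipped_leaf = should_check(1000, 0, "NLNet.2019.10.Wishbone", 0)
checked = should_check(1000, 400, "NLNet.2019.10.Wishbone", 2)
```

Returning a distinct message instead of `False` would cover the "or report a different message" variant.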
Comment 10 Jacob Lifshay 2020-08-22 01:09:18 BST
As part of automatically generating the markdown for budget tracking on our wiki, I added two new bug fields to bugzilla:
* cf_payee:
    The person who should be paid for completing this task.
    If multiple people should be paid, select "Multiple"

    A dropdown list with the names of all the people on our
    "About Us" page on the wiki, as well as two special entries:
    * "---" the empty entry
    * "Multiple" the entry indicating that cf_payees_list should
      be used instead of cf_payee.

* cf_payees_list:
    The list of people who should be paid for completing this task,
    along with the amount to be paid to each one denominated in EUR.
    The list is in TOML format.

    Only enabled when cf_payee is "Multiple", required to be
    non-empty when enabled.

    Example:
    "Jacob Lifshay" = 120
    "Cole Poirier" = 240
    "Luke Kenneth Casson Leighton" = 456.25
Comment 11 Jacob Lifshay 2020-08-22 01:20:20 BST
(In reply to Jacob Lifshay from comment #10)
> As part of automatically generating the markdown for budget tracking on our
> wiki, I added two new bug fields to bugzilla:

For an example, see:
https://bugs.libre-soc.org/show_bug.cgi?id=324
Comment 12 Luke Kenneth Casson Leighton 2020-08-22 01:29:20 BST
(In reply to Jacob Lifshay from comment #10)

>     * "Multiple" the entry indicating that cf_payees_list should
>       be used instead of cf_payee.

don't like it, at all.  names of individuals are hardcoded into the database.  only db admins can add new people.

> * cf_payees_list:
>     The list of people who should be paid for completing this task,
>     along with the amount to be paid to each one denominated in EUR.
>     The list is in TOML format.

much better and covers individual cases as well as not needing db admin privileges.
Comment 13 Jacob Lifshay 2020-08-22 01:37:15 BST
(In reply to Luke Kenneth Casson Leighton from comment #12)
> (In reply to Jacob Lifshay from comment #10)
> 
> >     * "Multiple" the entry indicating that cf_payees_list should
> >       be used instead of cf_payee.
> 
> don't like it, at all.  names of individuals are hardcoded into the
> database.  only db admins can add new people.

So, remove the cf_payee field and use only the cf_payees_list field?

> > * cf_payees_list:
> >     The list of people who should be paid for completing this task,
> >     along with the amount to be paid to each one denominated in EUR.
> >     The list is in TOML format.
> 
> much better and covers individual cases as well as not needing db admin
> privileges.

Additionally, TOML supports comments, and all the semantic-level structure of JSON, which might be handy sometime.
Comment 14 Luke Kenneth Casson Leighton 2020-08-22 01:43:41 BST
(In reply to Jacob Lifshay from comment #13)
.
> 
> So, remove the cf_payee field and use only the cf_payees_list field?

yehyeh.  it's redundant in effect.
 

> Additionally, TOML supports comments, and all the semantic-level structure
> of JSON, which might be handy sometime.

like it.
Comment 15 Jacob Lifshay 2020-08-22 01:58:29 BST
(In reply to Luke Kenneth Casson Leighton from comment #14)
> (In reply to Jacob Lifshay from comment #13)
> .
> > 
> > So, remove the cf_payee field and use only the cf_payees_list field?
> 
> yehyeh.  it's redundant in effect.

Marked cf_payee as obsolete (effectively hidden) and made cf_payees_list always show up.

Wasn't able to delete cf_payee since it was used previously.
Comment 16 Jacob Lifshay 2020-08-28 01:03:47 BST
started adding code to handle cf_payees_list, it parses and checks consistency, but does not yet generate markdown for the wiki.
Comment 17 Jacob Lifshay 2020-09-05 03:36:09 BST
Created attachment 102
Output of budget-sync

I rewrote the code for determining which fields are incorrect when the fields are mutually inconsistent.

After glancing through the latest results, they mostly look like missing field values except for bug 191 which appears to have too many tasks assigned to it since it gives -41500 for the amount assigned excluding subtasks, meaning that 
 ... doing some hand calculation ... the total amount between subtasks is 91500(!), assuming I calculated it correctly.
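The hand calculation rests on the relation "budget excluding subtasks = total budget - sum of the subtasks' totals". A small sketch of that check follows; the total of 50000 is inferred from the two figures above (91500 - 41500), not read from bug 191, and the single-entry subtask list is a placeholder because the individual subtask amounts are not given here:

```python
from typing import List


def budget_excluding_subtasks(total_budget: int,
                              subtask_totals: List[int]) -> int:
    """Amount left for the task itself once subtasks are accounted for."""
    return total_budget - sum(subtask_totals)


# Inferred from the comment: -41500 + 91500.
implied_total = -41500 + 91500
remaining = budget_excluding_subtasks(implied_total, [91500])
```

A negative result, as here, means the subtasks have been allocated more than the parent's total budget.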
Comment 18 Jacob Lifshay 2020-09-07 00:14:45 BST
Comments on the following ideas appreciated:

Note: For the following proposals, the nice-looking task lists currently on the wiki (like https://libre-soc.org/lkcl/) will be automatically generated by the Python script I'm currently writing. The following formats aren't intended for consumption by people; they are meant to be easy to maintain, so that all the info required to completely generate the task lists is in a machine-parsable form that people following the expected workflow can easily modify.


I think we should change the cf_payees_list field to include information on when a particular person has been paid, since the PAYMENTPENDING status is insufficient for describing the case when some people have been paid but others have not yet.

I propose changing it in a backward compatible way to include several sections, as in the following example TOML:

# each person can only be in one subtable.
# when their payment's status changes, they move their
# entry to the correct section. empty section tables can be omitted.

# not yet submitted in RFP to NLNet -- not under any subtable
"person1" = 123
"person2" = 456

[submitted]
# submitted in RFP to NLNet, not yet paid -- under the `submitted` table
"person3" = 567

[paid]
# paid by NLNet -- under the `paid` table
"person4" = 432

the result of translating the above to JSON is:
{
    "person1": 123,
    "person2": 456,
    "submitted": {
        "person3": 567
    },
    "paid": {
        "person4": 432
    }
}

An alternative proposal would be to maintain a machine-parsable table of bugs for each person, either in their own bug, on the wiki, or in some other location hosted by Libre-SOC.

So, for a single person, their table might look like (TOML again; made up example):
# unsubmitted, unpaid bugs don't need to be mentioned here,
# that info can be retrieved from bugzilla.
submitted = [
    123, # refers to bug #123
    456,
    1,
]
paid = [
    2,
    56,
    47,
]
Comment 19 Luke Kenneth Casson Leighton 2020-09-07 01:11:53 BST
give me a day or so to think about it.  database field design takes time,
to consider the information needed to be stored as well as the cleanest
way to store it.

the date of the payment and submission also needs to be stored, which
tends to suggest that alongside cf_payees_list two more TOML fields
might do the trick: one cf_payee_submitdate and the other cf_payee_paiddate.
these of the format "name" = "23jan2021"
Comment 20 Jacob Lifshay 2020-09-07 01:34:09 BST
(In reply to Luke Kenneth Casson Leighton from comment #19)
> give me a day or so to think about it.  database field design takes time,
> to consider the information needed to be stored as well as the cleanest
> way to store it.
> 
> the date of the payment and submission also needs to be stored, which
> tends to suggest that alongside cf_payee_list two more TOML fields
> might do the trick: one cf_payee_submitdate and the other cf_payee_paiddate.
> these of the format "name" = "23jan2021"

we won't need any more bugzilla fields alongside cf_payees_list since all that's needed is to add another level of nesting. Also, TOML has built-in date/time types, so we should use those instead of strings.

So, something like this:

[submitted."person3"]
amount = 567
date = 2020-08-01

[paid."person4"]
amount = 432
date = 2020-08-06

Alternatively, we could just have the presence of the `submitted` and `paid` fields indicate that that particular payment was submitted then paid:

["person1"]
amount = 123
# no submitted or paid fields means not yet submitted or paid

["person2"]
amount = 456
submitted = 2020-09-23
# submitted field without paid field indicates submitted but not yet paid

["person3"]
amount = 789
submitted = 2020-08-12T06:23:00-07:00 # different timezone
paid = 2020-09-13T13:46:23Z # person3 likes precision :)
# submitted field with paid field indicates both submitted and paid

JSON equivalent (JSON doesn't have separate date/time types, so those are converted to strings):
{
  "person1": {
    "amount": 123
  },
  "person2": {
    "amount": 456,
    "submitted": "2020-09-23"
  },
  "person3": {
    "amount": 789,
    "paid": "2020-09-13T13:46:23+00:00",
    "submitted": "2020-08-12T06:23:00-07:00"
  }
}
Comment 21 Luke Kenneth Casson Leighton 2020-09-07 01:42:25 BST
(In reply to Jacob Lifshay from comment #20)
> (In reply to Luke Kenneth Casson Leighton from comment #19)
> > give me a day or so to think about it.  database field design takes time,
> > to consider the information needed to be stored as well as the cleanest
> > way to store it.
> > 
> > the date of the payment and submission also needs to be stored, which
> > tends to suggest that alongside cf_payee_list two more TOML fields
> > might do the trick: one cf_payee_submitdate and the other cf_payee_paiddate.
> > these of the format "name" = "23jan2021"
> 
> we won't need any more bugzilla fields alongside cf_payee_list since all
> that's needed is to add another level of nesting.

which increases the number of lines, filling the page with even more chatter, and requires a number of already-filled-in fields to be changed.

if you think it through, you'll find that adding those two proposed fields results in a very compact use of screen space.

for two people, six lines are used up with (three) TOML fields. the
bugzilla field description at the left provides a visual field break,
as does the box around the data.

if it is one field with [brackets] format it's *20* lines.  10 per person,
3-4 per "section" (including blank line dividers to provide clarity)

if four people contribute that's *FORTY* lines.


> Also, TOML has built-in
> date/time types, so we should use those instead of strings.

ah good.
Comment 22 Jacob Lifshay 2020-09-07 02:04:19 BST
(In reply to Luke Kenneth Casson Leighton from comment #21)
> (In reply to Jacob Lifshay from comment #20)
> > (In reply to Luke Kenneth Casson Leighton from comment #19)
> > > give me a day or so to think about it.  database field design takes time,
> > > to consider the information needed to be stored as well as the cleanest
> > > way to store it.
> > > 
> > > the date of the payment and submission also needs to be stored, which
> > > tends to suggest that alongside cf_payee_list two more TOML fields
> > > might do the trick: one cf_payee_submitdate and the other cf_payee_paiddate.
> > > these of the format "name" = "23jan2021"
> > 
> > we won't need any more bugzilla fields alongside cf_payee_list since all
> > that's needed is to add another level of nesting.
> 
> which increases the number of lines, filling the page with even more
> chatter, and requires a number of already-filled-in fields to be changed.
> 
> if you think it through, you'll find that adding those two proposed fields
> results in a very compact use of screen space.
> 
> for two people, six lines are used up with (three) TOML fields. the
> bugzilla field description at the left provides a visual field break,
> as does the box around the data.
> 
> if it is one field with [brackets] format it's *20* lines.  10 per person,
> 3-4 per "section" (including blank line dividers to provide clarity)

wait, wait, you're waay overestimating the amount of space required per person. By my calculations, including blank lines, the least dense format mentioned so far takes 5 lines per person.

if you want one line per person:
"person1" = 123
"person2" = 456
...

on submission becomes (using inline tables):
"person1" = { amount = 123, submitted = 2020-04-05 }
"person2" = 456
...

on payment becomes:
"person1" = { amount = 123, submitted = 2020-04-05, paid = 2020-06-08 }
"person2" = 456
...

or, if we leave out the submitted date for paid amounts:
"person1" = { amount = 123, paid = 2020-06-08 }
"person2" = 456
...
Comment 23 Jacob Lifshay 2020-09-07 23:06:29 BST
(In reply to Jacob Lifshay from comment #22)
> if you want one line per person:
> "person1" = 123
> "person2" = 456
> ...
> 
> on submission becomes (using inline tables):
> "person1" = { amount = 123, submitted = 2020-04-05 }
> "person2" = 456
> ...
> 
> on payment becomes:
> "person1" = { amount = 123, submitted = 2020-04-05, paid = 2020-06-08 }
> "person2" = 456
> ...
> 
> or, if we leave out the submitted date for paid amounts:
> "person1" = { amount = 123, paid = 2020-06-08 }
> "person2" = 456
> ...

I'm going to implement the above version. If we decide we want a different scheme, it should be pretty easy to adapt the code: the only spot that would need to change is where the internal data structures are created from bugzilla; all the code that uses the internal data structures won't need to change.
Comment 24 Luke Kenneth Casson Leighton 2020-09-08 09:13:19 BST
(In reply to Jacob Lifshay from comment #23)

> > or, if we leave out the submitted date for paid amounts:
> > "person1" = { amount = 123, paid = 2020-06-08 }
> > "person2" = 456
> > ...
> 
> I'm going to implement the above version, if we decide we want a different
> scheme, it should be pretty easy to adapt the code, since the only spot that
> would need to be changed is where the internal data structures are created
> from bugzilla, all the code that uses the internal data structures won't
> need to change.

yeah it looks very reasonable. it may get a little long (line-wrap), if it can be done as a list "person = [123, 2020-06-08]" that's shorter, and we know (implicitly) that 1 date means "submitted", 2 dates means "paid".

that stands a chance of fitting on a single line without wrapping and thus making it readable in a web browser.
Comment 25 Jacob Lifshay 2020-09-09 01:00:19 BST
(In reply to Luke Kenneth Casson Leighton from comment #24)
> (In reply to Jacob Lifshay from comment #23)
> 
> > > or, if we leave out the submitted date for paid amounts:
> > > "person1" = { amount = 123, paid = 2020-06-08 }
> > > "person2" = 456
> > > ...
> <snip>
> 
> yeah it looks very reasonable. it may get a little long (line-wrap), 

We could also leave out unnecessary spaces:
"person1"={amount=123,paid=2020-06-08}

I'm adding a commented version of the above to this bug so we can try it out.

> if it
> can be done as a list "person = [123, 2020-06-08]" that's shorter, and we
> know (implicitly) that 1 date means "submitted", 2 dates means "paid".

I strongly think we should not use implicit field naming, though I'm fine with coming up with shorter names.

Do we really need 2 dates for every payment? Can't we just rely on bugzilla's history function, or just ask the relevant person, if we really need the submitted date for a payment that's already paid?

> that stands a chance of fitting on a single line without wrapping and thus
> making it readable in a web browser.

I shortened the bugzilla field short descriptions as part of mitigating that:
cf_nlnet_milestone:
    NLnet milestone
cf_total_budget:
    total budget (EUR) for completion of task and all subtasks
cf_budget:
    budget (EUR) for this task, excluding subtasks' budget
cf_budget_parent:
    parent task for budget allocation
cf_payees_list:
    The table of payments (in EUR) for this task; TOML format
Comment 26 Jacob Lifshay 2020-09-09 01:36:45 BST
(In reply to Jacob Lifshay from comment #25)
> (In reply to Luke Kenneth Casson Leighton from comment #24)
> > that stands a chance of fitting on a single line without wrapping and thus
> > making it readable in a web browser.

I tried it out on Chromium on Linux and on Chrome on Android, and I wasn't able to make it word-wrap; instead, the whole page gained horizontal scroll bars when I reduced the browser window's width.

It *does* word-wrap in bugzilla emails; however, so does just about every other field, so I wouldn't use that as a requirement for line length.
Comment 27 Jacob Lifshay 2020-09-10 01:53:25 BST
Created attachment 104
Output of budget-sync

I added all the new code to use the parsed config file to translate from aliases to the actual person, and to ensure that only one bug is set as the top-level bug for each NLNet Milestone.
I had to add a new bug #487 for the Future milestone since it didn't have a canonical top-level bug and since adding a bug is much easier than refactoring to add yet another special case.

I added the initial version of the budget-sync-config.toml file with names, emails, and aliases for all the people that currently are in the cf_payees_list field for some bug:
https://git.libre-soc.org/?p=utils.git;a=blob;f=budget-sync-config.toml;h=60ae440353f8e41234728bb3c7d2c6f9d2fdf0d4;hb=HEAD
Comment 28 Jacob Lifshay 2020-09-12 02:31:47 BST
added a WIP version of the markdown writing code.

Example markdown for my page:
<!-- autogenerated by budget-sync -->
# Jacob R. Lifshay

# Status Tracking
## NLNet.2019.10.Wishbone
* [Bug #324: create POWER DIV pipeline](https://bugs.libre-soc.org/show_bug.cgi?id=324)
* [Bug #348: POWER9 SPR pipeline needed](https://bugs.libre-soc.org/show_bug.cgi?id=348)
* [Bug #471: bug in modsd  DIV FSM](https://bugs.libre-soc.org/show_bug.cgi?id=471)
* [Bug #477: add add instructions to power-instruction-analyzer](https://bugs.libre-soc.org/show_bug.cgi?id=477)
## NLNet.2019.Vulkan
* [Bug #466: comprehensive evaluation and planning for 3D MESA driver](https://bugs.libre-soc.org/show_bug.cgi?id=466)
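A simplified sketch of how output of that shape could be produced, grouping a person's bugs by milestone. `write_markdown` is a hypothetical function, not the script's actual writer, and the sorting order is an assumption:

```python
from collections import defaultdict

BUG_URL = "https://bugs.libre-soc.org/show_bug.cgi?id="


def write_markdown(name: str, bugs: list) -> str:
    """bugs is a list of (bug id, summary, milestone) tuples."""
    by_milestone = defaultdict(list)
    for bug_id, summary, milestone in bugs:
        by_milestone[milestone].append((bug_id, summary))
    lines = [
        "<!-- autogenerated by budget-sync -->",
        f"# {name}",
        "",
        "# Status Tracking",
    ]
    for milestone in sorted(by_milestone):
        lines.append(f"## {milestone}")
        for bug_id, summary in sorted(by_milestone[milestone]):
            lines.append(f"* [Bug #{bug_id}: {summary}]({BUG_URL}{bug_id})")
    return "\n".join(lines) + "\n"


page = write_markdown("Jacob R. Lifshay", [
    (466, "comprehensive evaluation and planning for 3D MESA driver",
     "NLNet.2019.Vulkan"),
    (324, "create POWER DIV pipeline", "NLNet.2019.10.Wishbone"),
])
```

Building the page as a string first, and writing it to the wiki checkout afterwards, keeps the formatting logic easy to test.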
Comment 29 Luke Kenneth Casson Leighton 2020-09-12 09:13:39 BST
(In reply to Jacob Lifshay from comment #28)
> added a WIP version of the markdown writing code.

fantastic.
Comment 30 Jacob Lifshay 2020-09-15 03:16:19 BST
adding unit tests for write_budget_markdown using mocking so I don't have to constantly wait for the slow bugzilla server and manually inspect the output.
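A generic sketch of that approach using `unittest.mock` from the standard library. `bug_titles` and the client's `get_bug` method are hypothetical stand-ins for the real code under test, which is write_budget_markdown:

```python
import unittest
from unittest import mock


def bug_titles(client, bug_ids):
    """Stand-in for code that would normally query the bugzilla server."""
    return [client.get_bug(bug_id)["summary"] for bug_id in bug_ids]


class TestBugTitles(unittest.TestCase):
    def test_uses_mocked_client(self):
        # The mock replaces the slow server; no network access happens.
        client = mock.Mock()
        client.get_bug.return_value = {"summary": "create POWER DIV pipeline"}
        self.assertEqual(bug_titles(client, [324]),
                         ["create POWER DIV pipeline"])
        client.get_bug.assert_called_once_with(324)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestBugTitles)
result = unittest.TextTestRunner().run(suite)
```

Because the mock returns canned data, the assertions replace manual inspection of the generated output.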
Comment 31 Jacob Lifshay 2020-09-18 07:40:14 BST
Created attachment 105
Output markdown of budget-sync for programmerjake.mdwn

I added code to generate a near-complete version of the output markdown for all users on bugzilla who have a bug assigned to them and/or a payment assigned to them.

The only stuff left that I want to do is to add some more unit tests for the markdown-writing code (though I could easily be convinced that what is there now is good enough).

At this point, this bug is around 99% completed.