To be based on work by https://www.reddit.com/user/FUZxxl/, with credits / attribution: https://www.reddit.com/r/programming/comments/p0yn45/three_fundamental_flaws_of_simd/h9n30n9/?utm_source=reddit&utm_medium=web2x&context=3 and https://github.com/clausecker/pospop/blob/master/safe.go
fix milestone to match parent task
The algorithm counts set bits per bit *position*, so it makes sense to use bit-matrix flipping (i.e. transposition, e.g. via bpermd) followed by popcount.
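
A minimal portable sketch of the transpose-then-popcount idea, in Go. It assumes 8-bit inputs handled 8 bytes at a time, and uses the well-known shift-and-mask 8x8 bit-matrix transpose in place of bpermd. The names `transpose8x8`, `count8Transposed` and `count8Ref` are hypothetical and are not taken from pospop.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math/bits"
)

// transpose8x8 treats x as an 8x8 bit matrix (row r = byte r, column c = bit c
// of that byte) and swaps element (r, c) with (c, r) in three swap stages.
func transpose8x8(x uint64) uint64 {
	x = x&0xAA55AA55AA55AA55 | (x&0x00AA00AA00AA00AA)<<7 | (x>>7)&0x00AA00AA00AA00AA
	x = x&0xCCCC3333CCCC3333 | (x&0x0000CCCC0000CCCC)<<14 | (x>>14)&0x0000CCCC0000CCCC
	x = x&0xF0F0F0F00F0F0F0F | (x&0x00000000F0F0F0F0)<<28 | (x>>28)&0x00000000F0F0F0F0
	return x
}

// count8Transposed adds to counts[i] the number of bytes in buf with bit i set.
func count8Transposed(counts *[8]int, buf []byte) {
	for len(buf) >= 8 {
		// After the transpose, byte i holds bit i of each of the 8 input bytes,
		// so one popcount per byte yields the per-position count.
		t := transpose8x8(binary.LittleEndian.Uint64(buf))
		for i := range counts {
			counts[i] += bits.OnesCount8(uint8(t >> (8 * i)))
		}
		buf = buf[8:]
	}
	// Scalar tail for the remaining 0..7 bytes.
	count8Ref(counts, buf)
}

// count8Ref is the straightforward bit-by-bit reference, used for the tail
// and for checking the transposed version.
func count8Ref(counts *[8]int, buf []byte) {
	for _, b := range buf {
		for i := range counts {
			counts[i] += int(b>>i) & 1
		}
	}
}

func main() {
	buf := []byte{0x01, 0x03, 0x07, 0x0f, 0x1f, 0x3f, 0x7f, 0xff, 0x80, 0x55}
	var fast, ref [8]int
	count8Transposed(&fast, buf)
	count8Ref(&ref, buf)
	fmt.Println(fast, ref, fast == ref)
}
```

On hardware with a bit-permute or bit-gather instruction, the three shift-and-mask stages would be the part to replace. The popcount step stays the same.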