Optimizing JPEG code in libjpeg-turbo with new instructions.
I'd like to work on the Huffman coding, since that seems like the least straightforward part that has already been SIMD-ified (all of the other parts are simple image format conversions or DCT stuff). The existing SSE2 assembly:
https://github.com/libjpeg-turbo/libjpeg-turbo/blob/2cad2169aeed95569b0e25b0a2abef045a2a4eb9/simd/x86_64/jchuff-sse2.asm
https://github.com/libjpeg-turbo/libjpeg-turbo/blob/2cad2169aeed95569b0e25b0a2abef045a2a4eb9/simd/x86_64/jcphuff-sse2.asm
intriguing one: Huffman coding involves counting symbol frequencies and sorting by them.
https://www.programiz.com/dsa/huffman-coding
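To make the "frequency identification and sorting" point concrete, here is a minimal Python sketch of textbook Huffman code construction. It is only an illustration, not code from libjpeg-turbo: the function names `huffman_code_lengths` and `canonical_codes` are made up for this example, and it does not apply JPEG's 16-bit code length limit.

```python
import heapq
from collections import Counter


def huffman_code_lengths(data):
    """Return {symbol: code length in bits} for the symbols in data."""
    freq = Counter(data)  # frequency identification
    if not freq:
        return {}
    if len(freq) == 1:
        # A lone symbol still needs one bit to be representable.
        return {sym: 1 for sym in freq}
    # Heap entries are (weight, tie_breaker, {symbol: depth}); the heap
    # keeps the two lightest subtrees at the front (the "sorting" part).
    heap = [(w, i, {sym: 0}) for i, (sym, w) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, a = heapq.heappop(heap)
        w2, _, b = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {s: d + 1 for s, d in a.items()}
        merged.update({s: d + 1 for s, d in b.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


def canonical_codes(lengths):
    """Assign codes in (length, symbol) order, counting upward."""
    codes = {}
    code = 0
    prev_len = 0
    for sym, length in sorted(lengths.items(), key=lambda kv: (kv[1], kv[0])):
        code <<= length - prev_len  # widen the code when the length grows
        codes[sym] = format(code, "0{}b".format(length))
        code += 1
        prev_len = length
    return codes


lengths = huffman_code_lengths("aaaabbc")
print(lengths)
print(canonical_codes(lengths))
```

For the input "aaaabbc" the most frequent symbol gets the shortest code, so the 7 symbols take 10 bits in total.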
jpeg spec: https://www.w3.org/Graphics/JPEG/itu-t81.pdf

reading through it (annoyingly it has no section links, so it makes it harder to find stuff), looks kinda complex so far...may have to reduce scope in order to finish before nlnet's deadline
(In reply to Jacob Lifshay from comment #3)
> jpeg spec: https://www.w3.org/Graphics/JPEG/itu-t81.pdf
>
> reading through it (annoyingly it has no section links, so it makes it
> harder to find stuff), looks kinda complex so far...may have to reduce scope
> in order to finish before nlnet's deadline

that was always the plan: there is far too much to take on otherwise. absolute maximum 60-100 lines of assembler, one key function.
working on adding a minimal JPEG decoder for extracting test data that's 90% of what's needed:

commit f930c453550ca201e441f266623c3eb40fd6d9af (HEAD -> master, origin/master, origin/HEAD)
Author: Jacob Lifshay <programmerjake@gmail.com>
Date:   Mon Sep 26 21:05:38 2022 -0700

    add WIP jpeg decoder demo

    this includes a tiny test jpeg that's <2kB, so should be fine to be in git.
(In reply to Jacob Lifshay from comment #5)
> working on adding a minimal JPEG decoder for extracting test data that's 90%
> of what's needed:

I realized that I wasn't very clear: I meant that I have that test-data-extracting decoder 90% completed.
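For anyone wondering what a test-data-extracting decoder has to do first, here is a rough Python sketch of walking a JPEG file's marker segments up to the start of the entropy-coded scan. This is not the decoder from the commit above; `list_segments` is a hypothetical name, and the sketch only relies on the segment layout from the spec (0xFF, marker byte, then a 2-byte big-endian length that counts itself).

```python
import struct

# Markers that stand alone with no length field: TEM, RST0-RST7, SOI, EOI.
STANDALONE = {0x01, 0xD8, 0xD9} | set(range(0xD0, 0xD8))
SOS = 0xDA  # start of scan; entropy-coded data follows its header


def list_segments(jpeg_bytes):
    """Return [(marker, payload)] for every segment up to and including SOS."""
    if jpeg_bytes[:2] != b"\xff\xd8":
        raise ValueError("missing SOI marker")
    segments = []
    pos = 2
    while pos < len(jpeg_bytes):
        if jpeg_bytes[pos] != 0xFF:
            raise ValueError("expected marker at offset {}".format(pos))
        # A marker may be preceded by any number of 0xFF fill bytes.
        while pos < len(jpeg_bytes) and jpeg_bytes[pos] == 0xFF:
            pos += 1
        if pos >= len(jpeg_bytes):
            raise ValueError("truncated marker")
        marker = jpeg_bytes[pos]
        pos += 1
        if marker in STANDALONE:
            segments.append((marker, b""))
            continue
        # The length field includes its own two bytes.
        (length,) = struct.unpack_from(">H", jpeg_bytes, pos)
        segments.append((marker, jpeg_bytes[pos + 2:pos + length]))
        pos += length
        if marker == SOS:
            break  # what follows is Huffman-coded data, not segments
    return segments


# Synthetic example: SOI, a fake 2-byte DHT payload, a fake SOS header.
demo = b"\xff\xd8" + b"\xff\xc4\x00\x04\xaa\xbb" + b"\xff\xda\x00\x03\x01" + b"\x12\x34\xff\xd9"
for marker, payload in list_segments(demo):
    print(hex(marker), payload.hex())
```

The DHT segments (marker 0xC4) carry the Huffman tables, and the bytes after the SOS header are the Huffman-coded data, so those two pieces are the likely inputs for test cases.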
Marking this resolved -- pcdec. is working and has working unit tests -- see #933. #933 will stay open because pcdec. still has some design work that needs to be done -- mostly deciding exactly which set of 4 modes is needed.