1078 – svp64 counting sort demo

Bug 1078 - svp64 counting sort demo

Summary: svp64 counting sort demo

Status:	CONFIRMED

Alias:	None

Product:	Libre-SOC's first SoC
Classification:	Unclassified
Component:	Documentation (show other bugs)
Version:	unspecified
Hardware:	Other Linux

Importance:	--- enhancement
Assignee:	Luke Kenneth Casson Leighton

URL:

Depends on:
Blocks:	953
	Show dependency tree / graph

Reported:	2023-05-05 11:52 BST by Luke Kenneth Casson Leighton
Modified:	2023-05-05 15:54 BST (History)
CC List:	1 user (show)

See Also:
NLnet milestone:	---
total budget (EUR) for completion of task and all subtasks:	0
budget (EUR) for this task, excluding subtasks' budget:	0
parent task for budget allocation:
child tasks for budget allocation:
The table of payments (in EUR) for this task; TOML format:

Attachments
Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description Luke Kenneth Casson Leighton 2023-05-05 11:52:36 BST

https://en.m.wikipedia.org/wiki/Counting_sort#Pseudocode

function CountingSort(input, k)
    
    count ← array of k + 1 zeros
    output ← array of same length as input
    
    for i = 0 to length(input) - 1 do
        j = key(input[i])
        count[j] = count[j] + 1

    for i = 1 to k do
        count[i] = count[i] + count[i - 1]

    for i = length(input) - 1 down to 0 do
        j = key(input[i])
        count[j] = count[j] - 1
        output[count[j]] = input[i]

    return output

Comment 1 Luke Kenneth Casson Leighton 2023-05-05 12:07:12 BST

(In reply to Luke Kenneth Casson Leighton from comment #0)

>     for i = 0 to length(input) - 1 do
>         j = key(input[i])
>         count[j] = count[j] + 1

    setvl k
    svindex on keys
    sv.addi *count, *count, 1 # Indexed on count

>     for i = 1 to k do
>         count[i] = count[i] + count[i - 1]

    setvl k-1
    sv.add/mr *count+1, *count+1, *count
    (or use parallel prefix sum)

>     for i = length(input) - 1 down to 0 do
>         j = key(input[i])
>         count[j] = count[j] - 1
>         output[count[j]] = input[i]

VF mode, 2 stages:

    setvl len(input)
    svindex on keys
    sv.addi/mrr *count, *count, -1 # Indexed on count
    sv.addi/mrr *countcopy, *count # Indexed only on count

2nd stage (HF mode):

    svindex on countcopy:
    sv.addi *output, *input, 0 # Index countcopy on output

first step is doing a histogram. next is cumulative sum of histogram
counts.  then find unique areas. finally copy which has the fully
sorted order.

this algorithm critically relies on Dependency Hazards within the
hardware, to not mess up for example multiple REMAP-Indexed atomic
increments.

Comment 2 Luke Kenneth Casson Leighton 2023-05-05 15:54:03 BST

# python implementation
from typing import List

def get_max(arr: List[int]) -> int:
	mx = arr[0]
	for i in range(1, len(arr)):
		if arr[i] > mx:
			mx = arr[i]
	return mx

def count_sort(arr: List[int], exp: int):
	output = [0] * len(arr)
	count = [0] * 10

	# Store count of occurrences in count[]
	for i in range(len(arr)):
		count[(arr[i] // exp) % 10] += 1

	# Change count[i] so that count[i] now contains actual
	# position of this digit in output[]
	for i in range(1, 10):
		count[i] += count[i - 1]

	# Build the output array
	for i in range(len(arr) - 1, -1, -1):
		output[count[(arr[i] // exp) % 10] - 1] = arr[i]
		count[(arr[i] // exp) % 10] -= 1

	# Copy the output array to arr[], so that arr[] now
	# contains sorted numbers according to current digit
	for i in range(len(arr)):
		arr[i] = output[i]


def parallel_count_sort(arr: List[int]):
	m = get_max(arr)

	# Do counting sort for every digit. Note that instead
	# of passing digit number, exp is passed. exp is 10 ^ i
	# where i is current digit number
	exp = 1
	while m // exp > 0:
		for i in range(len(arr)):
			count_sort(arr, exp)
		exp *= 10

# Driver program to test above functions
def main():
	arr = [170, 45, 75, 90, 802, 24, 2, 66]
	print("Array before sorting:")
	print(arr)

	parallel_count_sort(arr)

	print("Array after sorting:")
	print(arr)

if __name__ == "__main__":
	main()

# This code is contributed by ksam24000

https://summerofhpc.prace-ri.eu/fastest-sorting-algorithm-for-distributed-systems-parallel-radix-sort-difficulty-medium/

https://dirtyhandscoding.github.io/posts/vectorizing-small-fixed-size-sort.html