Question: I have a problem that requires me to
find the 9 biggest numbers out of 1024 16 bit numbers and I want to get an idea of how
many cycles this will take.
Answer: The fastest way I've found to
sort on the C54x uses the MAX or MIN instructions. You might start with something like the
following which is similar to the "sub-block" position search I developed for
QCELP:
- Break the array into 16 blocks of 64 values:
a. Do outer loop 16 times (for each block)
b. Do inner loop 64 times (finds local max)
LD *AR+, A
MAX B ;can inline this inner loop to save branch cycles
c. Save local max
- You now have 16 local maximums. Do a position search to find overall max and its
position within the 16. Save the overall max result somewhere. The position will tell you
which sub-block in the array of 1024 contains the global max. Then go to that sub-block
and overwrite the max value with zero.
- Repeat this 8 more times to find the next 8 highest values.
This should take about 9 * (2048 cycles for inlined array search + 100 cycles??
Overhead for each pass) => 19,332 cycles
It could probably be sped up as much as 40% by eliminating the sub-blocks with the 7
lowest local maximums from further consideration after the first pass. That would require
copying the array into a new one of size 576. |