I have the following simple vector dot multiply as a C2H accelerator.
For some situations (but not all), a regular NIOS running at 50 MHz on a 2C8 is fast enough doing this in plain C code.
I want to implement this as something supported by Qsys and newer Quartus.
Options? Special instruction in NIOS? User component in Qsys? Something else?
I have constructed signal processing accelerators outside of NIOS using read and write masters, but would like to avoid that effort here.
Seems likely this function has been covered before.
Code:
// Multiply data vector with polyphase filter
int DotMpy(int* __restrict__ r, int* __restrict__ p, alt_u16 PolySel)
{
    #pragma altera_accelerate connect_variable DotMpy/r to sdram
    #pragma altera_accelerate connect_variable DotMpy/r to onchip_SRAM
    #pragma altera_accelerate connect_variable DotMpy/p to onchip_SRAM
    long long SumVal = 0;
    int k = 20;
    do {
        SumVal += (long long)*r++ * (long long)*p;
        if (PolySel&0x100) (p--); else (p++);
    } while (--k);
    return (int)(SumVal >> 16);
}
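For comparing any hardware replacement against the software result, here is a minimal host-side sketch of the same routine with the C2H pragmas removed. The name dotmpy_ref and the DOTMPY_TAPS constant are my own placeholders, and I am assuming int is 32 bits and alt_u16 is a 16-bit unsigned type, as on Nios II. It is only a reference model, not a suggested accelerator implementation.

```c
#include <stdint.h>

#define DOTMPY_TAPS 20  /* loop count taken from the original routine */

/* Software reference model of DotMpy (hypothetical name, no C2H pragmas).
 * r is always walked forward; p is walked backward when bit 8 of
 * poly_sel is set, forward otherwise, matching the original loop. */
int32_t dotmpy_ref(const int32_t *r, const int32_t *p, uint16_t poly_sel)
{
    int64_t sum = 0;
    int step = (poly_sel & 0x100) ? -1 : 1;

    for (int k = 0; k < DOTMPY_TAPS; k++) {
        /* Widen before multiplying so the 32x32 product keeps all 64 bits. */
        sum += (int64_t)r[k] * (int64_t)p[k * step];
    }

    /* Same scaling as the original. Right-shifting a negative value is
     * implementation-defined in C; common compilers shift arithmetically. */
    return (int32_t)(sum >> 16);
}
```

When the reverse direction is selected, p has to point at the last coefficient to be used, since the routine steps down from there.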