Hi everyone,
I'm trying to compile a kernel that has multiple loops, some of them nested, and the early resource estimate feature gives me estimates that are quite off. Can somebody explain why, or has anyone experienced the same?
Moreover, in some cases I saw that using #pragma unroll N reduced the estimated usage. Could it be that the estimates are off in this case too? Or is implementing the control logic of a loop more demanding than actually unrolling it? In this case the unrolling factor is never bigger than 80 and the loop body is at most 2 instructions.
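For illustration, a loop of the kind described above might look like the following sketch. It is plain C, not the actual kernel: the function names, the trip count of 80, and the unroll factor of 8 are all placeholders, and the pragma is guarded so that compilers which do not recognize it simply skip it.

```c
#include <stddef.h>

#define TRIP_COUNT 80  /* assumed: matches the largest unroll factor mentioned */

/* Rolled loop: one copy of the two-operation body plus loop-control logic
   (counter, compare, branch). */
int sum_scaled_rolled(const int *in, int scale)
{
    int acc = 0;
    for (size_t i = 0; i < TRIP_COUNT; ++i) {
        acc += in[i] * scale;  /* body: one multiply, one add */
    }
    return acc;
}

/* Same loop with an unroll hint. The factor 8 is an arbitrary example value.
   The result is identical; only the generated hardware or code differs. */
int sum_scaled_unrolled(const int *in, int scale)
{
    int acc = 0;
#if defined(__OPENCL_VERSION__) || defined(__clang__)
#pragma unroll 8
#endif
    for (size_t i = 0; i < TRIP_COUNT; ++i) {
        acc += in[i] * scale;
    }
    return acc;
}
```

Both functions compute the same value, so any difference in the reported resource usage between the two forms would come from how the loop itself is implemented, not from the arithmetic in the body.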
Code:
Early resource estimate: 80% logic, 36% ALUTs, 44% registers, 21% RAMs, 12% DSPs
Kernel 'process': throughput: 4.92e+05 / resources: 64% logic, 26% ALUTs, 38% registers, 8% RAMs, 12% DSPs)
----------------------------------
Using attributes:
Kernel 'process':
max_unroll_loops(80)
num_simd_work_items(1)
num_compute_units(1)
num_share_resources(1)
max_share_resources(8)
aoc: Compiling....
Early resource estimate: 195% logic, 121% ALUTs, 85% registers, 22% RAMs, 6% DSPs
Kernel 'process': throughput: 2.34e+05 / resources: 180% logic, 111% ALUTs, 78% registers, 9% RAMs, 6% DSPs)
----------------------------------
Using attributes:
Kernel 'process':
max_unroll_loops(1)
num_simd_work_items(2)
num_compute_units(1)
num_share_resources(1)
max_share_resources(8)
aoc: Compiling....
Early resource estimate: 80% logic, 36% ALUTs, 44% registers, 21% RAMs, 12% DSPs
Kernel 'process': throughput: 4.92e+05 / resources: 64% logic, 26% ALUTs, 38% registers, 8% RAMs, 12% DSPs)
----------------------------------
Using attributes:
Kernel 'process':
max_unroll_loops(1)
num_simd_work_items(1)
num_compute_units(2)
num_share_resources(1)
max_share_resources(8)
aoc: Compiling....
Early resource estimate: 146% logic, 64% ALUTs, 83% registers, 31% RAMs, 25% DSPs
Kernel 'process': throughput: 9.84e+05 / resources: 129% logic, 53% ALUTs, 76% registers, 16% RAMs, 25% DSPs)
----------------------------------
Using attributes:
Kernel 'process':
max_unroll_loops(1)
num_simd_work_items(1)
num_compute_units(1)
num_share_resources(1)
max_share_resources(8)
aoc: Compiling....
Early resource estimate: 80% logic, 36% ALUTs, 44% registers, 21% RAMs, 12% DSPs
Kernel 'process': throughput: 4.92e+05 / resources: 64% logic, 26% ALUTs, 38% registers, 8% RAMs, 12% DSPs)
aoc: Compiling....
aoc: Linking with IP library ...
+--------------------------------------------------------------------+
; Estimated Resource Usage Summary ;
+----------------------------------------+---------------------------+
; Resource + Usage ;
+----------------------------------------+---------------------------+
; Logic utilization ; 80% ;
; Dedicated logic registers ; 45% ;
; Memory blocks ; 21% ;
; DSP blocks ; 13% ;
+----------------------------------------+---------------------------;
aoc: First stage compilation completed successfully.
Error: Cannot fit kernel(s) on device
real 200m31.304s
user 254m36.344s
sys 5m58.536s