Channel: Altera Forums

OpenCL HostApp fails with acl_bind_buffer_to_device: Assertion `mem' failed

Hey,

First, some information about the environment:

Board: Nallatech 510T (2x Arria 10)
Ubuntu 16.04.
Quartus 17.1.0

Flashing the .aocx and compiling the HostApp were both successful.

When I start the HostApp, the device is found and the HostApp begins to start up.
But then the following error is shown:
Code:

acl_mem.c:398: acl_bind_buffer_to_device: Assertion `mem' failed.
See full result:
Code:

Listing OpenCL devices (OCLMiner).
  ℹ  13:00:38|ethminer  Found suitable OpenCL device [ p510t_min_ax115 : nalla_pcie (aclnalla_pcie0) ] with 4294967296  bytes of GPU memory
  ℹ  13:00:38|stratum  Connecting to stratumV2 server eth-eu1.nanopool.org:9999
  ℹ  13:00:38|stratum  Connected!
  ℹ  13:00:38|stratum  Starting farm
 ocl  13:00:38|ocl-0    No work. Pause for 3 s.
 ocl  13:00:38|ocl-1    No work. Pause for 3 s.
  ℹ  13:00:38|stratum  Received new job #0x7c1a12  seed: #8308d376eeb469b7ff84bd59c51988d9  target: #000000006df37f675ef6eadf
  ℹ  13:00:38|stratum  Received new job #0x7c1a12  seed: #8308d376eeb469b7ff84bd59c51988d9  target: #000000006df37f675ef6eadf
 ocl  13:00:41|ocl-0    New work: header #7c1a1211… target 000000006df37f675ef6eadf5ab9a2072d44268d97df837e6748956e5c6c2116
 ocl  13:00:41|ocl-0    New seed #8308d376…
 ocl  13:00:41|ocl-1    New work: header #7c1a1211… target 000000006df37f675ef6eadf5ab9a2072d44268d97df837e6748956e5c6c2116
 ocl  13:00:41|ocl-1    New seed #8308d376…
 ocl  13:00:43|ocl-0    Platform: Intel(R) FPGA SDK for OpenCL(TM)
 ocl  13:00:43|ocl-1    Platform: Intel(R) FPGA SDK for OpenCL(TM)
 ocl  13:00:43|ocl-0    Device:  p510t_min_ax115 : nalla_pcie (aclnalla_pcie0)    1CU 4096MB  / OpenCL 1.0 Intel(R) FPGA SDK for OpenCL(TM), Version 17.1
 ocl  13:00:43|ocl-0    OpenCL kernel: Custom 'p510t_min_ax115 : nalla_pcie (aclnalla_pcie0).aocx'
 ocl  13:00:43|ocl-1    Device:  p510t_min_ax115 : nalla_pcie (aclnalla_pcie1)    1CU 4096MB  / OpenCL 1.0 Intel(R) FPGA SDK for OpenCL(TM), Version 17.1
 ocl  13:00:43|ocl-1    OpenCL kernel: Custom 'p510t_min_ax115 : nalla_pcie (aclnalla_pcie1).aocx'
  m  13:00:43|ethminer  Speed  0.00Mh/s No-Fee (295s)
            [0]p510t_min_ax115 : nalla_pcie (aclnalla_pcie0)    1CU 4096MB -  0C  0% -  0.00Mh/s
            [1]p510t_min_ax115 : nalla_pcie (aclnalla_pcie1)    1CU 4096MB -  0C  0% -  0.00Mh/s
 [A0+0:R0+0:F0]
 ocl  13:00:43|ocl-1    OpenCL kernel: Custom 'p510t_min_ax115 : nalla_pcie (aclnalla_pcie1).aocx'
 ocl  13:00:43|ocl-1    OpenCL kernel: GROUP_SIZE 128
 ocl  13:00:43|ocl-0    OpenCL kernel: Custom 'p510t_min_ax115 : nalla_pcie (aclnalla_pcie0).aocx'
 ocl  13:00:43|ocl-0    OpenCL kernel: GROUP_SIZE 128
 ocl  13:00:43|ocl-1    OpenCL kernel: DAG_SIZE 21495797
 ocl  13:00:43|ocl-0    OpenCL kernel: DAG_SIZE 21495797
 ocl  13:00:43|ocl-1    OpenCL kernel: LIGHT_SIZE 671743
 ocl  13:00:43|ocl-0    OpenCL kernel: LIGHT_SIZE 671743
 ocl  13:00:43|ocl-1    OpenCL kernel: ACCESSES 64
 ocl  13:00:43|ocl-0    OpenCL kernel: ACCESSES 64
 ocl  13:00:43|ocl-1    OpenCL kernel: MAX_OUTPUTS 1
 ocl  13:00:43|ocl-0    OpenCL kernel: MAX_OUTPUTS 1
 ocl  13:00:43|ocl-1    OpenCL kernel: PLATFORM 0
 ocl  13:00:43|ocl-0    OpenCL kernel: PLATFORM 0
 ocl  13:00:43|ocl-1    OpenCL kernel: COMPUTE 0
 ocl  13:00:43|ocl-0    OpenCL kernel: COMPUTE 0
 ocl  13:00:43|ocl-1    OpenCL kernel: THREADS_PER_HASH 4
 ocl  13:00:43|ocl-0    OpenCL kernel: THREADS_PER_HASH 4
  ✘  13:00:44|ocl-0    Build info:
  ✘  13:00:44|ocl-1    Build info:
ethminer: acl_mem.c:398: acl_bind_buffer_to_device: Assertion `mem' failed.

I used the ethminer from https://github.com/Maetti79/ethminer

Hope you can help.

If you need more information, just ask.

Thank you!

qsys avalon stream arbiter

I need a module that has two Avalon-ST sinks (data in) and a single Avalon-ST source (data out) and simply redirects one of the sinks to the source. Is there an option to make Qsys infer the arbitration logic and connect the source0 streams from st_pipeline_stage_1/2 to the sink0 stream in st_pipeline_stage_0? When I try to connect the second wire, the first connection disappears.
Attached Images

CL_FLUSH is acting as Blocking call.

Hi All,


I'm using non-blocking data transfer in my program to transfer 4K-resolution (3840x2160) frame data. If clEnqueueWriteBuffer is blocking (CL_TRUE), it takes close to 3.5 milliseconds; if I make it non-blocking (CL_FALSE), it takes a negligible amount of time (~104 microseconds). But the time spent in clFlush then increases from 4 microseconds to 3.3 milliseconds.


Below is the snippet of my code:


err = clEnqueueWriteBuffer(commandQueue[0], srcBuffer, CL_FALSE, 0, 3840 * 2160, inputBuffer, 0, NULL, NULL);
if (CL_SUCCESS != err) {
    printf("Error in clEnqueueWriteBuffer srcBuffer %d\n", err);
    exit(-1);
}
/* setKernelArg */
...

err = clEnqueueTask(commandQueue[0], kernel, 0, NULL, &kernel_event[0]);
if (CL_SUCCESS != err) {
    printf("Error in clEnqueueTask kernel %d\n\n", err);
    exit(-1);
}
clFlush(commandQueue[0]);


I am not able to understand why clFlush is blocking for 3.5 milliseconds.
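A quick back-of-the-envelope check on the figures quoted in this post, offered as a sketch rather than a diagnosis. A frame of 3840 * 2160 bytes moved in about 3.5 ms corresponds to roughly 2.4 GB/s, and the non-blocking enqueue plus the flush (0.104 ms + 3.3 ms) adds up to nearly the same total. That would be consistent with the copy cost having moved from clEnqueueWriteBuffer into clFlush rather than disappearing, though only profiling on the actual board can confirm it. The helper names below are made up for illustration and are not part of any OpenCL API.

```c
#include <stddef.h>

/* Size of one frame in bytes. The post enqueues 3840 * 2160 bytes,
 * which implies one byte per pixel. */
size_t frame_bytes(size_t width, size_t height, size_t bytes_per_pixel) {
    return width * height * bytes_per_pixel;
}

/* Effective throughput in MB/s (1 MB = 1e6 bytes) for a transfer of
 * `bytes` that took `seconds`. */
double throughput_mb_per_s(size_t bytes, double seconds) {
    return (double)bytes / seconds / 1e6;
}

/* Total host-side time of the non-blocking path: enqueue plus flush. */
double total_ms(double enqueue_ms, double flush_ms) {
    return enqueue_ms + flush_ms;
}
```

Plugging in the numbers from the post, frame_bytes(3840, 2160, 1) is 8294400, throughput_mb_per_s(8294400, 3.5e-3) comes out near 2370 MB/s, and total_ms(0.104, 3.3) is about 3.4 ms, close to the 3.5 ms of the blocking call.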


Thanks in advance

Recovery timing violation

Hi all,

I have a design with a major recovery timing violation (among others).

The design wasn't written by me, I'm only editing it, so I'm not fully aware of everything inside it.

There's a picture in the attachment with screenshots from TimeQuest showing the worst failing recovery path.

I'm unable to grasp where the ~7000 ns latch edge time in the data required path is coming from.

Launch clock is 24 MHz, latch clock is 200 MHz.
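For readers puzzling over edge times like this: static timing analyzers generally pick, among all launch/latch edge pairs in the common repeat period of the two clocks, the pair with the smallest positive separation, so the reported latch edge need not fall in the first cycle. The sketch below illustrates that idea only. It is not a model of what TimeQuest does internally, the function names are invented, and it assumes both periods can be expressed as integers in a shared unit.

```c
/* Greatest common divisor, used to find the common repeat period. */
long gcd_long(long a, long b) {
    while (b != 0) {
        long t = a % b;
        a = b;
        b = t;
    }
    return a;
}

typedef struct {
    long launch; /* time of the launch edge */
    long latch;  /* time of the first latch edge strictly after it */
} edge_pair;

/* Scan one common period and return the launch/latch edge pair with the
 * smallest positive separation. Both periods use the same integer unit. */
edge_pair tightest_edge_pair(long launch_period, long latch_period) {
    long common = launch_period / gcd_long(launch_period, latch_period) * latch_period;
    edge_pair best = {0, latch_period};
    for (long launch = 0; launch < common; launch += launch_period) {
        /* First latch edge strictly later than this launch edge. */
        long latch = (launch / latch_period + 1) * latch_period;
        if (latch - launch < best.latch - best.launch) {
            best.launch = launch;
            best.latch = latch;
        }
    }
    return best;
}
```

With exact 24 MHz and 200 MHz clocks expressed in thirds of a nanosecond (periods of 125 and 15 units), tightest_edge_pair(125, 15) returns a launch edge at 250 units and a latch edge at 255 units, that is 83.33 ns and 85 ns, inside a 125 ns repeat period. If the constrained periods are rounded, or the clocks are not an exact ratio, the common period and the reported edge times can be much larger.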

I tried assigning the reset signal to global resources, but Quartus just ignores this assignment.
I'm working in Quartus 17.1. I tried using the following assignment:
Code:

set_instance_assignment -name GLOBAL_SIGNAL ON -to "h158_core:u0|h158_core_mem_if_ddr3_emif_0:mem_if_ddr3_emif_0|h158_core_mem_if_ddr3_emif_0_p0:p0|h158_core_mem_if_ddr3_emif_0_p0_memphy:umemphy|h158_core_mem_if_ddr3_emif_0_p0_reset:ureset|h158_core_mem_if_ddr3_emif_0_p0_reset_sync:ureset_afi_clk|reset_reg[15]";
Any help would be appreciated as I'm stuck and have no idea what to do next.

Thanks for your help!
Attached Images

QUARTUS Software

Hello,
I need to install the Quartus software on a production computer, to program boards. I am not currently registered on the website, and it seems registration on the website is not allowed right now. Is there an alternative way I can download the Quartus lite software? Thanks to anyone who can help! Cheers

https://sso.altera.com/idp/startSSO....Id=myaltera_en

John

A Beginner

Hi all,
I am a beginner with little knowledge of programming, and I would like some direction and guidance.

VHDL Code for a Mealy machine with two inputs and one output.

Hi guys, I'm new here. What would be the VHDL code for this Mealy machine? Thanks
A sequential circuit with two D flip-flops, A and B; two inputs, x and y; and one output, z, is specified by the following next-state and output equations.

A(t+1) = xy' + xB
B(t+1) = xA + xB'
z = A
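Before writing the VHDL, it can help to have a small reference model of the equations to compare a simulation against. The C sketch below is one way to do that. It assumes the second equation is meant to read B(t+1) = xA + xB', and the type and function names are invented for illustration.

```c
/* Flip-flop contents; each field is 0 or 1. */
typedef struct {
    int a;
    int b;
} ff_state;

/* One clock edge: next state from the current state and inputs x, y (0 or 1). */
ff_state next_state(ff_state s, int x, int y) {
    ff_state n;
    n.a = (x & !y) | (x & s.b);   /* A(t+1) = xy' + xB  */
    n.b = (x & s.a) | (x & !s.b); /* B(t+1) = xA + xB'  */
    return n;
}

/* Output equation z = A. */
int output_z(ff_state s) {
    return s.a;
}
```

Starting from A = 0, B = 0, applying x = 1, y = 0 gives A = 1, B = 1, and any clock edge with x = 0 returns the circuit to A = 0, B = 0. Since z depends only on A, the output changes only on clock edges.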

Altera Forum Migrates - existing email addresses

The "Altera Forum Migrates" email sent on July 30th states:

"Top Altera Forum participants (those with a Pupil, Scholar, Teacher or Guru reputation) who have used the Altera Forum in the last 12 months will have their existing email addresses migrated to be used as their log-in ID. "

Does "used" here mean having logged in, or having actually posted a thread?


Unattended Install for Altera 13.1

I am trying to install Quartus II, ModelSim, SoCEDS and DSP Builder unattended, with everything licensed through a local license file. How do I do it?
Thanks for your help.
Cal

MAX 10 SC and RAM blocks

In our design, some RAM blocks are being inferred from our VHDL (no Altera IP is being used).


When targeting the device 10M16SCU169A7G and trying to compile the following error occurs:


Error (16021): You specified a configuration mode that includes memory initialization, however memory initialization is not supported by the selected device. In the Device and Pin Options dialog box, choose a configuration mode without memory initialization.


After following the instructions and selecting "Single Image", we run into the next error in the assembler:

Error (14703): Invalid internal configuration mode for design with memory initialization


Some research points to known issues with MAX 10 SC devices and memory initialization:


https://www.alteraforum.com/forum/sh...ad.php?t=56869


I thought this only happened with ROM blocks. Is it possible that the inferred RAM blocks also trigger the same issue, or am I perhaps missing a setting?


If this is in fact the issue, then is there any known workaround?


Thanks!

Mux for INOUT ports

Hey guys, I'm trying to exchange two pairs of INOUT signals, but without much success so far.
I have two PS/2 controllers and I would like to route the PS2(1) signals to PS2(2) and, at the same time, the PS2(2) signals to PS2(1).
Perhaps it's simpler to explain with the actual (snipped) code.

Code:

-- external ports 
ps2_clk_io        : inout std_logic    := 'Z';
ps2_data_io      : inout std_logic    := 'Z';
ps2_mouse_clk_io  : inout std_logic    := 'Z';
ps2_mouse_data_io : inout std_logic    := 'Z'; 

 -- signals
signal ps2_mode_s  : std_logic := '0';
signal PS2K_DAT_IN  : std_logic;
signal PS2K_DAT_OUT : std_logic;
signal PS2K_CLK_IN  : std_logic;
signal PS2K_CLK_OUT : std_logic;
signal PS2M_DAT_IN  : std_logic;
signal PS2M_DAT_OUT : std_logic;
signal PS2M_CLK_IN  : std_logic;
signal PS2M_CLK_OUT : std_logic; 
signal ps2_data_out : std_logic;
signal ps2_clk_out      : std_logic;
signal ps2_mouse_data_out  : std_logic;
signal ps2_mouse_clk_out    : std_logic; 

 -- LOGIC BLOCK 
-- PS/2 keyboard 
PS2K_DAT_IN <= ps2_data_io when ps2_mode_s = '0' else ps2_mouse_data_io; 
PS2K_CLK_IN <= ps2_clk_io  when ps2_mode_s = '0' else ps2_mouse_clk_io; 
ps2_data_out <= PS2K_DAT_OUT when ps2_mode_s = '0' else PS2M_DAT_OUT;
ps2_clk_out  <= PS2K_CLK_OUT when ps2_mode_s = '0' else PS2M_CLK_OUT; 
ps2_data_io <= '0' when ps2_data_out = '0' else 'Z'; 
ps2_clk_io  <= '0' when ps2_clk_out  = '0' else 'Z'; 

-- PS/2 Mouse 
PS2M_DAT_IN <= ps2_mouse_data_io when ps2_mode_s = '0' else ps2_data_io; 
PS2M_CLK_IN <= ps2_mouse_clk_io  when ps2_mode_s = '0' else ps2_clk_io; 
ps2_mouse_data_out <= PS2M_DAT_OUT when ps2_mode_s = '0' else PS2K_DAT_OUT;
ps2_mouse_clk_out  <= PS2M_CLK_OUT when ps2_mode_s = '0' else PS2K_CLK_OUT; 
ps2_mouse_data_io <= '0' when ps2_mouse_data_out = '0' else 'Z'; 
ps2_mouse_clk_io  <= '0' when ps2_mouse_clk_out  = '0' else 'Z';


As you can see, I would like to exchange the signals between a mouse and a keyboard using the control signal "ps2_mode_s". If this signal is '0', I need the keyboard on the first port and the mouse on the second. If it's '1', I need the opposite: mouse on the first port and keyboard on the second.
I have already tried some variations, but I haven't found a proper solution.
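The routing described here can be written down as a small reference model, which is handy for checking a VHDL simulation against the intended behaviour. The C sketch below is only that: a behavioural model with invented names, assuming open-drain ports where a '0' output pulls the line low and anything else releases it. It says nothing about why a particular synthesis result fails. The same two functions apply to the clock pair and to the data pair.

```c
/* Line levels: 1 = released / high, 0 = pulled low. */
typedef struct {
    int kbd_in;   /* level seen by the keyboard controller */
    int mouse_in; /* level seen by the mouse controller */
} ps2_inputs;

typedef struct {
    int port1_pull_low; /* 1 if port 1 must be driven to '0', else released ('Z') */
    int port2_pull_low; /* same for port 2 */
} ps2_drives;

/* mode 0: keyboard listens on port 1, mouse on port 2. mode 1: swapped. */
ps2_inputs route_inputs(int mode, int port1_level, int port2_level) {
    ps2_inputs r;
    r.kbd_in = mode ? port2_level : port1_level;
    r.mouse_in = mode ? port1_level : port2_level;
    return r;
}

/* Open-drain drive: a port is pulled low only when the controller routed
 * to it outputs 0; otherwise the port is released. */
ps2_drives route_outputs(int mode, int kbd_out, int mouse_out) {
    ps2_drives d;
    int to_port1 = mode ? mouse_out : kbd_out;
    int to_port2 = mode ? kbd_out : mouse_out;
    d.port1_pull_low = (to_port1 == 0);
    d.port2_pull_low = (to_port2 == 0);
    return d;
}
```

For example, with the keyboard controller driving 0 and the mouse controller idle at 1, mode 0 pulls only port 1 low and mode 1 pulls only port 2 low.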


Can anyone help, please?

out_of_context mode equivalent in quartus tool

Hi Forum,
I am a beginner with the Quartus tool and have a question about I/O buffer insertion. Previously I used Vivado to synthesize my design, and when analyzing submodules I used its out-of-context mode, which makes sure that no IOBUFs are inferred for submodules. I would like to know the Quartus equivalent of out_of_context mode, or how to tell Quartus not to infer I/O buffers.

Thank you,
Vamsi

Arria 10 SoC Development Kit - The function of FAHBP16 on Page 36

I am using the Arria 10 SoC Development Kit (10AS066N3F40E2SG) and am trying to use the FMC A port to connect an extra test board.
In the schematic (a10_soc_devkit_03_31_2016), some signals run between the FMC A port (J29) and MAXV_FPGA_IO (U21), but I can't find where they go from there, so I don't understand how to use them.
Could someone help explain the logic of the following signals? Thanks
FAHBP18(Page17, J29) => FAHBP18(Page 36, U21) => ?
FAHBN18(Page17, J29) => FAHBN18(Page 36, U21) => ?
FAHBP19(Page17, J29) => FAHBP19(Page 36, U21) => ?
FAHBN19(Page17, J29) => FAHBN19(Page 36, U21) => ?


How to program EPCQ with .jic file on Altera C5EFP board?

I use the C5EFP board, https://www.altera.com/products/boar...clone-v-e.html

When I program the EPCQ with a .jic file, I get "Error (209025): Can't recognize silicon ID for device 1" once loading reaches 87%, but I can load the .sof file successfully.

I then read the Cyclone V E FPGA Development Board Reference Manual and the schematic. According to them, SW1 MSEL[4:0] needs to be set to 10010, resistors R16 and R22 need to be removed, and resistors R18 and R23 need to be connected.

Has anyone else met the same problem on the C5EFP? Is the procedure described in the manual correct?

Generated PCIe Gen3x8 example design for stratix 10 s1 board with "Enable DMA" ?

Hello all,

To try out the "Hard IP for PCIe for Stratix 10" IP component, I generated the example design from Platform Designer, configuring the PCI Express IP with the "Enable DMA" option and with "internal descriptor" enabled.

When I opened the example design separately, I found the PCIe IP and an 8 KB on-chip memory with dual-port access. I compiled it, programmed the FPGA, and ran the Linux driver and application code that were provided when the example design was generated. However, this Linux code only contains a simple link test of 100 reads and writes, which passed with no problem. What I want to test is PCIe DMA transfers using the descriptor table. I know the theory behind it.

I want driver and application code that covers the following:

1. A descriptor table and a status table in host memory.
2. Programming of the tables from point 1, with the CPU writing the internal registers of the descriptor controller on the FPGA side.
3. The read data mover picking up what was set up in point 2 and loading it into the internal FIFO via the rd_dts_slave port.
4. Based on the content of each descriptor, the read data mover moving data from the host PC to the FPGA (read DMA flow).
5. After point 4 completes, the descriptor controller sending an MSI interrupt to the host to update the status table.


Regards,
Anil

LVDS Interface between two FPGA boards: Xilinx and Altera

Hi,
I am setting up one-way communication between a Cyclone V FPGA board and a custom Xilinx Kintex board.

*Cyclone V FPGA board: I/O standard = LVDS
*Xilinx Kintex custom board: I/O standard = LVDS and LVDS_25

I am trying to feed LVDS signals generated by the Cyclone V FPGA board into the LVDS ports of the Xilinx Kintex custom board. But because of the I/O standard mismatch between the two boards, I am unable to monitor the LVDS signals in the Xilinx ILA.

Please suggest a solution.
Xilinx Kintex custom board :I/O Planning



Attached Images

Xilinx to Quartus "Library Conversion"

Hi all,

I have a library written by a former colleague for Xilinx ISE. It basically converts ML algorithms to HLS. I now want to convert this library for Altera FPGAs. I have used Altera FPGAs and Quartus in different projects, but I have no experience developing libraries for the Quartus environment. I need guidance on how to proceed and on which particular parts generally need to be changed for Altera, because the library does not work as-is when I run my project. Please find the attachment.

Thank you.

Attached Files

Unrolling and used RAMs

Hello,
I'm trying to understand the relationship between channels, unrolling, and the number of RAM blocks (M20K) used.

For this purpose, I've created this simple program composed of three kernels:
- the first injects data into the channel
- the second accumulates it
- the third receives the result of the accumulation and stores it in global memory


Code:

__kernel void generator_float_vector(int N){
    int outer_loop_limit=(int)(N/U);
    //we cannot have double write
    for(int i=0;i<outer_loop_limit;i++)
    {
    #pragma unroll
    for(int j=0;j<U;j++)
        write_channel_intel(channel_float_vector,(float)(1.0));
    }
}

__kernel void consumer(int N)
{

    int outer_loop_limit=(int)(N/U);
    float acc_o=0;
    float x[U];

    for(int i=0; i<outer_loop_limit; i++)
    {
        float acc=0;
        #pragma unroll
        for(int j=0;j<U; j++)
            x[j]=read_channel_intel(channel_float_vector);
        #pragma unroll
        for(int j=0;j<U; j++)
            acc+=x[j];
        acc_o+=acc;
    }
    write_channel_intel(channel_float_sink,acc_o);
}

__kernel void sink_single(__global float * restrict out)
{
    float r=read_channel_intel(channel_float_sink);
    *out=r;
}


The first and second kernels exploit unrolling (to speed up computation). The unrolling factor is set by the constant U.
In the second kernel, I made the read from the channel explicit just for readability.


Now, by varying U, I obtain (in the report) different numbers of used RAM blocks (M20K).
The code is compiled with v18.0 of Quartus for the Arria 10 board.


In particular:
U=4 RAM=16 (16 used by sink kernel)
U=8 RAM=17 (1 consumer kernel, 16 sink kernel)
U=16 RAM=21 (5 consumer, 16 sink)
U=32 RAM=38 (22 consumer, 16 sink)
U=64 RAM=70 (54 consumer,16 sink)


I believe that the 16 RAMs used by the sink kernel are due to the device RAM interface.
What I cannot understand is the number of RAMs used by the consumer kernel:

  • from the programming guide, the compiler should try to use private memory (registers) if the data used is less than 64 bytes.

This should correspond to the case with U=16 (a float being 4 bytes), but it doesn't seem so.

  • starting from U=16, the number of RAMs used increases with U, which should somehow be related to the unrolling
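The arithmetic behind the 64-byte remark can be spelled out with a few lines of C. This is only a sketch: the 64-byte figure is taken from the post as quoted, the helper names are invented, and the compiler's real heuristics may depend on more than array size.

```c
#include <stddef.h>

/* An OpenCL float is 4 bytes. */
#define OPENCL_FLOAT_BYTES 4

/* Bytes occupied by the private array `float x[U]` in the consumer kernel. */
size_t private_array_bytes(unsigned u) {
    return (size_t)u * OPENCL_FLOAT_BYTES;
}

/* 1 if the array is strictly smaller than the threshold, 0 otherwise. */
int fits_below(unsigned u, size_t threshold_bytes) {
    return private_array_bytes(u) < threshold_bytes;
}
```

For the values tried in the post, x[U] occupies 16, 32, 64, 128 and 256 bytes for U = 4, 8, 16, 32 and 64. Note that U = 16 gives exactly 64 bytes, which is not strictly "less than 64 bytes", so fits_below(16, 64) returns 0 while fits_below(8, 64) returns 1.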



Any suggestions on how to read these numbers?
Thanks

mem_fence() not working for channels

Hi all,

I am testing the feed-forward model (ping-pong buffer) mentioned in the programming guide, and I found that the mem_fence function is not working. Here's the code I used for testing:

channel int c_id __attribute__((depth(100)));

__attribute__((reqd_work_group_size(10,1,1)))
__kernel void producer (__global int *restrict x, __global volatile int *restrict producer_data){
    int global_x = get_global_id(0);
    producer_data[global_x] = global_x;
    mem_fence(CLK_CHANNEL_MEM_FENCE | CLK_GLOBAL_MEM_FENCE);
    write_channel_altera(c_id, global_x);
}

__attribute__((reqd_work_group_size(10,1,1)))
__kernel void consumer (__global int *restrict y, __global volatile int *restrict producer_data, __global int *restrict output) {
    int global_x = read_channel_altera(c_id);
    int sum2 = producer_data[global_x] + global_x;
    output[global_x] = sum2;
}


The producer_data[] array is initialized to zero. The producer kernel writes its global id to producer_data[global_x] and to the channel. After the consumer kernel reads from the channel, it writes producer_data[global_x] + global_x to output[global_x]. The value of output[global_x] should always be global_x * 2.

However, in my experiment the output array is not always global_x * 2; sometimes output[global_x] = global_x. The consumer kernel reads the data before the producer's write reaches global memory, so mem_fence() does not appear to be working here.
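On the host side, the expected result can be checked after reading the output buffer back. The C helper below is a minimal sketch of such a check, assuming the output array has already been copied into host memory. The function name is invented and it uses no OpenCL calls.

```c
#include <stddef.h>

/* Count the indices where output[i] != 2 * i. If first_bad is not NULL,
 * it receives the first mismatching index (left unchanged if none). */
size_t count_mismatches(const int *output, size_t n, size_t *first_bad) {
    size_t bad = 0;
    for (size_t i = 0; i < n; i++) {
        if (output[i] != 2 * (int)i) {
            if (bad == 0 && first_bad != NULL) {
                *first_bad = i;
            }
            bad++;
        }
    }
    return bad;
}
```

One caveat when reading the results: index 0 cannot reveal a stale read, because 0 + 0 equals 2 * 0 whether or not the producer's write was visible, so mismatches can only show up from index 1 onward.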

I guess the problem lies in the way the shared buffer is created. I put the producer and consumer kernels in different command queues so that they execute concurrently, and I use clCreateBuffer(...,CL_MEM_READ_WRITE,...) to create the shared buffer. The programming guide mentions that clCreateBuffer(...,CL_MEM_READ_WRITE,...) allocates memory in nonshared DDR memory banks and that shared memory should be allocated with clCreateBuffer(...,CL_MEM_ALLOC_HOST_PTR,...). However, when I use clCreateBuffer(...,CL_MEM_ALLOC_HOST_PTR,...), the two kernels cannot execute concurrently: the consumer kernel waits until the producer kernel finishes.

How should I allocate the shared buffer for two concurrent kernels? Any example host code for the feed-forward model (ping-pong buffer) would help greatly.

Thanks in advance.

Connect FPGA to Internet using WiFi adapter

I need help, please. How can I connect my DE1-SoC FPGA board to the internet?
I have an RT5370 wireless (WiFi) adapter.