Channel: Altera Forums

qsys-script hangs forever

Hi,

I'm trying to run '$ qsys-script --script=generated_from_qsys.tcl' and the process hangs forever.
Running on Ubuntu 14.04 LTS; I know it's not supported.
So I'm not expecting a solution, but I think there's a deadlock between threads?

After starting the program, waiting for some time, and pressing Ctrl+\, I get the following Java thread dump:

Thanks,
Jos
Attached Files

Avalon MM slave and interrupt

Hi folks,
I have written an Avalon slave peripheral to interface to an FTDI chip. In this peripheral I have included an IRQ signal that is driven from the FTDI chip. However, when I add this peripheral to Qsys and make all the necessary connections (including setting an IRQ number), I get some warnings and errors on generation, namely that the IRQ sender is not connected to a receiver. Yet the messages box at the bottom of Qsys says 0 errors, 0 warnings.


I have included the warnings and errors I get on generation above.

Can anybody give me an idea of where I am going wrong?
Many thanks
deBoogle
Attached Images

Altera Dual Port RAM bytes shifting error.

Hi Guys,

I've written a program in VHDL to read data from a CIS and transfer it over USB using a Cypress CY7C68013A USB transceiver.

CIS: 1728 pixels/line, output in three parts of 576 bytes.
CIS clock speed: 6.25 MHz, max 8 MHz.

FPGA: Cyclone III EP3C25Q240C8N.

ADC: AD9200. The first sample appears on the 5th clock, hence the first 4 clock cycles are skipped for every line read.

I am using Altera dual-port RAM to capture data from the three outputs simultaneously, storing it in three DPRAMs. Once this operation is done, I read the data from the DPRAMs sequentially (1, 2, 3) and write it to the Cypress USB transceiver.

Problem:
When the image is created in the GUI, the last four bytes (573, 574, 575, 576) of the output are shifted to the first four bytes (1, 2, 3, 4). This four-byte shift happens in all three outputs.

I couldn't figure out what the problem in my code is.

Please Help. Thanks in advance.
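
One mechanism that would produce exactly this rotation can be sketched in C. This is only a guess at the cause, not a diagnosis of the attached VHDL: it assumes the DPRAM write address counter starts running at clock 0 and wraps at 576, while the first valid AD9200 sample only arrives 4 clocks later. The function names and the 0-based sample index `k` are illustrative and do not come from the post.

```c
#include <assert.h>
#include <stddef.h>

#define LINE_BYTES  576u  /* bytes per CIS output segment (from the post) */
#define ADC_LATENCY 4u    /* clocks skipped before the first valid sample */

/* Hypothetical faulty scheme: the address counter starts at clock 0 and
 * wraps at LINE_BYTES, but sample k is only valid at clock k + ADC_LATENCY. */
size_t addr_counter_free_running(size_t k)
{
    assert(k < LINE_BYTES);
    return (k + ADC_LATENCY) % LINE_BYTES;
}

/* Intended scheme: the counter is held at 0 until the first valid sample. */
size_t addr_counter_held(size_t k)
{
    assert(k < LINE_BYTES);
    return k;
}
```

With the free-running counter, samples 572 to 575 (bytes 573 to 576 in the post's 1-based numbering) land at addresses 0 to 3, and sample 0 lands at address 4, which matches the symptom described. If the real design behaves differently, the same kind of model applied to the read-side counter may be worth checking.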
Attached Images
Attached Files

AC coupled LVDS clock input - Stratix V

Hello,

We are using one of the CLK inputs (not REFCLK) of our 5SGSMD5K2F40I2LN device as an LVDS receiver of a 54MHz clock.
Currently our clock is AC-coupled to the input (as illustrated below) and the internal 100 ohm termination is used.



Although this seems to be working, I'm a bit concerned, since I couldn't find any mention of internal biasing of LVDS inputs, and the only diagram in the LVDS termination section shows a DC-coupled connection.


So, my questions are:
Is the AC-coupled connection shown above legal?
If so, do I have to provide external biasing, or is there internal biasing that can be used?

Thank you in advance,
Alex
Attached Images

CYCLONE V OCT Problem

Hi,

I am using a Cyclone V device (namely the 5CGXBC3B7U15C8N) in my design, with pins M13 and N12 as an LVDS RX port.
According to the Altera device handbook, there are two variations of termination:
- Embedded inside the FPGA (on-chip termination).
- On-board termination.

Does declaring the pair as LVDS inside Quartus II automatically assign a 100 ohm OCT to it?

Thank you for your answers,
Refael.

System console closes channel

I have a system console running to control my system via a Jtag -> Avalon MM Master, and it runs fine. But occasionally after I have configured my parameters and I try and read back some status it says "Channel closed".

In full:
error: master_read_32: com.altera.systemconsole.core.services.ChannelClosedException: Channel closed while executing
"master_read_32 $jtag_master2 0x800 16"

Any clues? Is this normal? Is something else interfering? Would an OpenCore Plus license being managed using the same USB-Blaster->Jtag connection interfere?

Struggling to find a reason.

The core that was configured still runs, though, so it has not been reset or anything similar.

Where is the "Create VHDL instantiation template file for current files"?

Hi there,
I have a question about Quartus II.
I want to generate a VHDL instantiation template by right-clicking the .vhd file.
However, I only see "Verilog instantiation for current file" and "VHDL component declaration for the current file" in the menu.
That is confusing.

Can you tell me how to create a VHDL instantiation template file?
Thanks.

u-boot v2016.03 working for anyone?

Hi,

I have two boards - a de0_nano_soc and a sockit. Neither works with a u-boot v2016.03 build.

Is anybody out there using these eval boards able to successfully build & run u-boot?

On both cards, the u-boot-with-spl.sfp (includes SPL, uboot.img, dtb all in one) I generate either
doesn't boot at all, or boots sporadically, failing on memory calibration.

Here is one example:
U-Boot SPL 2016.03 (Mar 16 2016 - 08:27:20)
drivers/ddr/altera/sequencer.c: Preparing to start memory calibration
drivers/ddr/altera/sequencer.c: CALIBRATION FAILED
drivers/ddr/altera/sequencer.c: Calibration complete
SDRAM calibration failed.
### ERROR ### Please RESET the board ###

U-Boot SPL 2016.03 (Mar 16 2016 - 08:27:20)
drivers/ddr/altera/sequencer.c: Preparing to start memory calibration
drivers/ddr/altera/sequencer.c: CALIBRATION FAILED
drivers/ddr/altera/sequencer.c: Calibration complete
SDRAM calibration failed.
### ERROR ### Please RESET the board ###

U-Boot SPL 2016.03 (Mar 16 2016 - 08:27:20)
drivers/ddr/altera/sequencer.c: Preparing to start memory calibration
drivers/ddr/altera/sequencer.c: CALIBRATION PASSED
drivers/ddr/altera/sequencer.c: Calibration complete
Trying to boot from MMC

U-Boot 2016.03 (Mar 16 2016 - 08:27:20 -0700)

CPU: Altera SoCFPGA Platform
FPGA: Altera Cyclone V, SE/A4 or SX/C4, version 0x0
BOOT: SD/MMC Internal Transceiver (3.0V)
Watchdog enabled
I2C: ready
DRAM: 1 GiB
MMC: dwmmc0@ff704000: 0
In: serial
Out: serial
Err: serial
Model: Terasic DE0-Nano(Atlas)
Net:
Error: ethernet@ff702000 address not set.
No ethernet found.
Hit any key to stop autoboot: 0

Once started, USB never works - even though the feature has been (seemingly) properly configured

=> usb start
starting USB...
USB0: Core Release: 2.93a
dwc_otg_core_host_init: Timeout!
dwc_otg_core_host_init: Timeout!
dwc_otg_core_host_init: Timeout!
dwc_otg_core_host_init: Timeout!
dwc_otg_core_host_init: Timeout!
dwc_otg_core_host_init: Timeout!
dwc_otg_core_host_init: Timeout!
dwc_otg_core_host_init: Timeout!
dwc_otg_core_host_init: Timeout!
dwc_otg_core_host_init: Timeout!
dwc_otg_core_host_init: Timeout!
dwc_otg_core_host_init: Timeout!
dwc_otg_core_host_init: Timeout!
dwc_otg_core_host_init: Timeout!
dwc_otg_core_host_init: Timeout!
scanning bus 0 for devices... 1 USB Device(s) found
=>
=>
=>

If anyone has solved these problems, please share!


Thanks!
--George

Cyclone V 5CEFA4F23C8 has 24 DQ pins for LPDDR2 Hard Memory Interface

Hi.

I am using the 5CEFA4F23C8 and LPDDR2 in a design, and I want to use the Hard Memory Interface for LPDDR2.

I've noticed that the LPDDR2 hard interface has 24 dedicated DQ pins. That seems to me like such an odd number, because afaik LPDDR2 only comes in x16 and x32 configurations, and Qsys won't let me configure the UniPHY LPDDR2 as 24-bit either.

So why 24 DQ pins? Can I somehow use this interface for more than x16? Can I use it for x32?

If I only use the hard interface for x16, can I use the last 8 DQ pins for other purposes, or are they reserved for the hard memory interface?

cross ref for XILINX XC2S600E-6FG456I

Hello,
I am looking for an FFF (form, fit, function) replacement part for the Xilinx XC2S600E-6FG456I.
Any ideas, anyone?
michael

arria v GT FPGA development

Hello, everyone
I am using an Arria V GT FPGA development board. The board has two Arria V GT 5AGTFD7K3F40I3N FPGAs, and I need them to communicate with each other. I can see internal LVDS links between them in the schematic diagram, but I can't find the pin names in the development board reference manual, so I don't know how to set up communication between the two FPGAs. Any help is appreciated. Thank you, best wishes.

Max V CPLD Pull up for pins

Hi,

I am working with a MAX V 5M2210ZF324C5 device for my current project.
I need some of the pins to be pulled up, as this wasn't taken care of in the schematics.
Is this possible with this device? Please help.

Regards,

Naga Kiran

Simple tutorial for bare metal programming

Hello everyone, I recently got a DE0-Nano-SoC board with a Cyclone V. I have some (not much) experience with Nios II programming, which was fairly simple to work with. I don't quite understand how to flash my SoC or how to get header files with addresses, so I need an easy beginner tutorial covering Quartus, Qsys, DS-5, and flashing, with a bare-metal example. A simple blinky would be enough, I think.
What I have already found (apologies, I do not know the forum policy about links):
https://rocketboards.org/foswiki/vie...teraSoCDevices
But these labs use SD card images, and I don't understand why.
https://www.youtube.com/watch?v=8BehnPg8IvM
This video does not show programming in DS-5 at all.

Nios 2 custom instruction

Hello

I am learning the Nios II processor and Nios II custom instructions. I have added a custom component in Qsys, and for analysis and synthesis I have to add the HDL files. I created two ordinary Verilog files for the Qsys custom component, and when I ran analysis and synthesis I got this error:
"No modules found when analyzing null".
Can you tell me how to solve it?
Here is a link to the example I am following:
https://www.youtube.com/watch?v=ZfthoAFI7LY

Thanks

What is the TDP of a Cyclone V chip?

Hello everyone,

I am looking for some information on the Cyclone V SoC, specifically the manufacturer's TDP (Thermal Design Power) value. Does anyone know this value, or where to find it?

Thanks in advance! Best regards.

Error in Including Altera libraries file into uClinux

Hi everyone,

I recently managed to boot uClinux on a DE2-115. Now I want to move a project from the Nios II Software Build Tools for Eclipse to uClinux. Before doing this, I would like to try adding a simple project, previously built in Eclipse, to uClinux.

However, when I try to compile the code using nios2-linux-uclibc-gcc hello_world.c -o hello -elf2flt, it gives an error: no such file or directory.
I then located the header files I need and wrote their full paths in the program, as shown in the code below. However, even with the full paths, it still gives errors. The header files I included also include other header files, which cannot be found automatically either.

So how can I solve this problem? Do I need to change some settings in Quartus or uClinux-dist?
I hope someone can help. Thank you very much!


This is the hello_world.c code built in Eclipse:

Code:

#include "/home/mun/altera/13.0sp1/nios2eds/components/altera_hal/HAL/inc/sys/alt_stdio.h"
#include "/home/mun/altera/13.0sp1/nios2eds/bin/gnu/H-i686-pc-linux-gnu/nios2-elf/include/stdio.h"
#include "/home/mun/altera/13.0sp1/nios2eds/bin/gnu/H-i686-pc-linux-gnu/nios2-elf/include/unistd.h"
#include "/home/mun/altera/13.0sp1/ip/University_Program/Audio_Video/Video/altera_up_avalon_video_character_buffer_with_dma/HAL/inc/altera_up_avalon_video_character_buffer_with_dma.h"
#include "/home/mun/altera/13.0sp1/nios2eds/bin/gnu/H-i686-pc-linux-gnu/nios2-elf/include/string.h"
#include "/home/mun/Hello/system.h"
#include "/home/mun/altera/13.0sp1/ip/altera/sopc_builder_ip/altera_avalon_lcd_16207/inc/altera_avalon_lcd_16207_regs.h"
#include "/home/mun/altera/13.0sp1/ip/altera/sopc_builder_ip/altera_avalon_lcd_16207/HAL/inc/altera_avalon_lcd_16207.h"
#include "/home/mun/altera/13.0sp1/ip/altera/sopc_builder_ip/altera_avalon_pio/inc/altera_avalon_pio_regs.h"

//Seven Segment Decoder
int seg7_decoder(int a)
{
    if (a==0) return 0x40;
    else if (a==1) return 0x79;
    else if (a==2) return 0x24;
    else if (a==3) return 0x30;
    else if (a==4) return 0x19;
    else if (a==5) return 0x12;
    else if (a==6) return 0x02;
    else if (a==7) return 0x78;
    else if (a==8) return 0x00;
    else if (a==9) return 0x10;
    else if (a==10) return 0x08;
    else if (a==11) return 0x03;
    else if (a==12) return 0x46;
    else if (a==13) return 0x21;
    else if (a==14) return 0x06;
    else if (a==15) return 0x0e;
    else return 0xff;
}

int main()
{
        printf("hello_world\n");

        //SD-CARD
        char filename[100];
        char *txt = (char *)calloc(1000, sizeof(char));
        printf("Initializing SD_Card devices...\n");
        int volumes_mounted;
        sd_set_clock_to_max( 80000000 );
        usleep (1000);
        volumes_mounted = sd_fat_mount_all();////mount fat file system.

        if( volumes_mounted <= 0 )
        {
        printf( "SD Card Mount FAILED\n" );
        txt = "No SD-Card detected!";
        }
        else
        {
        printf("SD Card Mount OK\n");
        sprintf(filename, "/dev/sd_controller_0/%s", "test.txt");

        FILE *infile;

        infile = fopen(filename, "r");

        int i = 0;
        for(i=0; i<1000; i++)
        {
          fscanf (infile, "%c", &txt[i]);
        }
        fclose(infile);

        for (i = 0; i<30; i=i+1)
          {
            printf("%c",txt[i]);
          }
        }


          //LCD
          FILE* fp;
          fp= fopen("/dev/lcd", "w");
          if (fp == NULL) {
              fprintf(stderr, "open failed\n");
              return 0;
          }
          fprintf(fp, " Biochipsoc Lab \n      FBME      \n");
          fclose(fp);

        //VGA
        alt_up_char_buffer_dev *CHAR_BUFFER;
        CHAR_BUFFER= alt_up_char_buffer_open_dev("/dev/video_character_buffer_with_dma_0");

        alt_up_char_buffer_init(CHAR_BUFFER);
        alt_up_char_buffer_clear(CHAR_BUFFER);

        sprintf(filename, "SD Card read test.txt  :%s", txt);

        char text_top_row[40] = "Hello DE2-115";
        char values[40] = "Switches decimal value :";
        alt_up_char_buffer_string (CHAR_BUFFER, text_top_row, 15, 20);
        alt_up_char_buffer_string (CHAR_BUFFER, filename, 15, 22);
        alt_up_char_buffer_string (CHAR_BUFFER, values, 15, 24);

        int a=0,b=0x0000000f,j=0;
        int a0 = 0, a1 = 0, a2 = 0, a3 = 0, a4 = 0;

        int kk=0;
        int p=0;
        char c[20];
        char mask[6] = "      ";
        while(1)
                  {
                    //Switches
                    a = IORD(SWITCHES_BASE,0);

                    //Green Led and button
                      if (IORD(GPIO_IN_BASE,0) == 65535)
                      b=((b<<1)&0x000000fe)|((b>>7)&0x00000001);
                      else
                      b=((b>>1)&0x0000007f)|((b<<7)&0x00000080);

                    //Green Led and button
                      if (IORD(BUTTON_BASE,0) == 13) // 13 in binary is 1101, so the second button is pressed (active low)
                      IOWR(GPIO_OUT_BASE,0,1);
                      else
                      IOWR(GPIO_OUT_BASE,0,0);

                    //b = 0x000001ff;
                    IOWR(LED_GREEN_BASE,0,b);

                      //Red Led
                      IOWR(LED_RED_BASE,0,a);

                      //7SEG
                      a0 = a&0x0000000f;
                      IOWR(HEX0_BASE,0,seg7_decoder(a0));

                      a1 = a&0x000000f0;
                      a1 = a1>>4;
                      IOWR(HEX1_BASE,0,seg7_decoder(a1));

                      a2 = a&0x00000f00;
                      a2 = a2>>8;
                      IOWR(HEX2_BASE,0,seg7_decoder(a2));

                      a3 = a&0x0000f000;
                      a3 = a3>>12;
                      IOWR(HEX3_BASE,0,seg7_decoder(a3));

                      a4 = a&0x000f0000;
                      a4 = a4>>16;
                      IOWR(HEX4_BASE,0,seg7_decoder(a4));

                      IOWR(HEX5_BASE,0,seg7_decoder(0));

                      //button
                      if (IORD(BUTTON_BASE,0) == 11) // 11 in binary is 1011, so the third button is pressed (active low)
                      j= j-1;
                      else j = j+1;

                      if(j>15)j=0;
                      else if(j<0)j=15;
                      IOWR(HEX6_BASE,0,seg7_decoder(j));
                      IOWR(HEX7_BASE,0,seg7_decoder(j));

                      for (p=40; p<45; p=p+1){
                    alt_up_char_buffer_draw(CHAR_BUFFER , ' ', p, 24);
                      }

                    sprintf(c, "%d", a);
                    alt_up_char_buffer_string (CHAR_BUFFER, c, 39, 24);

                    usleep(200000);
                  }

  return 0;
}
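
As a side note on the listing above, the decoder chain, the repeated mask-and-shift digit extraction, and the LED rotation can be written more compactly. The following is a minimal sketch in portable C that does not touch any HAL header; the names `seg7_lookup`, `hex_digit`, `rotl8`, and `rotr8` are made up for illustration, and the table values are copied from `seg7_decoder()`.

```c
#include <assert.h>
#include <stdint.h>

/* Segment patterns for 0x0..0xF, copied from seg7_decoder() above */
static const uint8_t SEG7_TABLE[16] = {
    0x40, 0x79, 0x24, 0x30, 0x19, 0x12, 0x02, 0x78,
    0x00, 0x10, 0x08, 0x03, 0x46, 0x21, 0x06, 0x0e
};

/* Table-driven equivalent of seg7_decoder(): 0xff if out of range */
int seg7_lookup(int a)
{
    return (a >= 0 && a <= 15) ? SEG7_TABLE[a] : 0xff;
}

/* Extract hex digit number `digit` (0 = least significant) from `value` */
unsigned hex_digit(uint32_t value, unsigned digit)
{
    assert(digit < 8u);
    return (unsigned)((value >> (4u * digit)) & 0xFu);
}

/* Rotate an 8-bit pattern by one position, as done for the green LEDs */
uint8_t rotl8(uint8_t b) { return (uint8_t)((b << 1) | (b >> 7)); }
uint8_t rotr8(uint8_t b) { return (uint8_t)((b >> 1) | (b << 7)); }
```

With these helpers, the five `a0` to `a4` blocks would reduce to calls such as `seg7_lookup(hex_digit(a, 2))`. This does not address the missing-header errors; it only tidies the application logic.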



Here are some of the errors:

Code:

/home/mun/altera/13.0sp1/ip/altera/sopc_builder_ip/altera_avalon_lcd_16207/inc/altera_avalon_lcd_16207_regs.h:66:16: io.h: No such file or directory
In file included from hello_world.c:8:
/home/mun/altera/13.0sp1/ip/altera/sopc_builder_ip/altera_avalon_lcd_16207/HAL/inc/altera_avalon_lcd_16207.h:36:27: sys/alt_alarm.h: No such file or directory
/home/mun/altera/13.0sp1/ip/altera/sopc_builder_ip/altera_avalon_lcd_16207/HAL/inc/altera_avalon_lcd_16207.h:37:24: os/alt_sem.h: No such file or directory
/home/mun/altera/13.0sp1/nios2eds/bin/gnu/H-i686-pc-linux-gnu/nios2-elf/include/stdio.h:45:23: sys/reent.h: No such file or directory
In file included from hello_world.c:2:

In file included from /home/mun/altera/13.0sp1/nios2eds/bin/gnu/H-i686-pc-linux-gnu/nios2-elf/include/stdio.h:29,
                from hello_world.c:2:
/home/mun/altera/13.0sp1/nios2eds/bin/gnu/H-i686-pc-linux-gnu/nios2-elf/include/_ansi.h:15:20: newlib.h: No such file or directory
/home/mun/altera/13.0sp1/nios2eds/bin/gnu/H-i686-pc-linux-gnu/nios2-elf/include/_ansi.h:16:24: sys/config.h: No such file or directory
/home/mun/altera/13.0sp1/nios2eds/bin/gnu/H-i686-pc-linux-gnu/nios2-elf/include/stdio.h:50: error: syntax error before "FILE"
/home/mun/altera/13.0sp1/nios2eds/bin/gnu/H-i686-pc-linux-gnu/nios2-elf/include/stdio.h:50: warning: data definition has no type or storage class
/home/mun/altera/13.0sp1/nios2eds/bin/gnu/H-i686-pc-linux-gnu/nios2-elf/include/stdio.h:59: error: syntax error before "fpos_t"

/home/mun/altera/13.0sp1/ip/altera/sopc_builder_ip/altera_avalon_lcd_16207/inc/altera_avalon_lcd_16207_regs.h:66:16: io.h: No such file or directory
In file included from hello_world.c:8:
/home/mun/altera/13.0sp1/ip/altera/sopc_builder_ip/altera_avalon_lcd_16207/HAL/inc/altera_avalon_lcd_16207.h:36:27: sys/alt_alarm.h: No such file or directory
/home/mun/altera/13.0sp1/ip/altera/sopc_builder_ip/altera_avalon_lcd_16207/HAL/inc/altera_avalon_lcd_16207.h:37:24: os/alt_sem.h: No such file or directory
In file included from hello_world.c:8:
/home/mun/altera/13.0sp1/ip/altera/sopc_builder_ip/altera_avalon_lcd_16207/HAL/inc/altera_avalon_lcd_16207.h:61: error: syntax error before "alt_alarm"
/home/mun/altera/13.0sp1/ip/altera/sopc_builder_ip/altera_avalon_lcd_16207/HAL/inc/altera_avalon_lcd_16207.h:61: warning: no semicolon at end of struct or union
/home/mun/altera/13.0sp1/ip/altera/sopc_builder_ip/altera_avalon_lcd_16207/HAL/inc/altera_avalon_lcd_16207.h:90: error: syntax error before '}' token


Regards,
Jun Mun

DDR3 Mem Controller w/ Uniphy Question

For the DDR3 Memory Controller w/ Uniphy IP (Cyclone V / Arria V family), there are some configuration options in Uniphy IP's GUI in Qsys.


What are these features under Mode Register 2 and what option should I select for each feature?


1. Auto selfrefresh method: Manual or Automatic?
2. Selfrefresh temperature: Normal or Extended?


Please provide some guidelines. The Altera EMIF guide isn't very clear.
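
For orientation, the two options correspond to two bits of the DDR3 Mode Register 2 as laid out in the JEDEC DDR3 standard (JESD79-3): A6 is Auto Self-Refresh (ASR) and A7 is Self-Refresh Temperature range (SRT). The sketch below only shows how those two bits combine; the function and macro names are made up for illustration and are not part of the UniPHY IP. Which setting is right depends on the memory device's datasheet (ASR is an optional device feature) and on the operating temperature range.

```c
#include <assert.h>
#include <stdint.h>

#define DDR3_MR2_ASR (1u << 6)  /* A6: 0 = manual, 1 = auto self-refresh */
#define DDR3_MR2_SRT (1u << 7)  /* A7: 0 = normal, 1 = extended temperature */

/* Build the ASR/SRT portion of MR2. When ASR is enabled the SRT bit is
 * left at 0, because the ASR = 1, SRT = 1 combination is not allowed. */
uint16_t ddr3_mr2_refresh_bits(int auto_self_refresh, int extended_temp)
{
    uint16_t bits = 0;
    if (auto_self_refresh) {
        bits |= DDR3_MR2_ASR;
    } else if (extended_temp) {
        bits |= DDR3_MR2_SRT;
    }
    assert((bits & DDR3_MR2_ASR) == 0 || (bits & DDR3_MR2_SRT) == 0);
    return bits;
}
```

In GUI terms, "Manual" with "Normal" gives 0x00, "Manual" with "Extended" gives 0x80, and "Automatic" gives 0x40.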

HSMC Communication Board RS232 on Stratix III Dev Kit?

I have a 3SL150N SIII board with the intent of utilizing RS-232 to type to HyperTerminal. I purchased the HSMC COMM board P0078 from Terasic. I'm having issues with getting signal from SignalTap. Can someone with experience on this board assist me? I'm not sure if there is any other setup configuration.
COMM Card Pinout:
Pin 152 HSMC_0_RXD RS-232 Receive
Pin 145 HSMC_0_TXD RS-232 Transmit

Port B of SIII Board:
Pin 47 HSMB_TX_P0 (ADA_D11) I/O 2.5V PIN_P11 (tx)
Pin 8 HSMB_RX_P0 (ADB_D11) I/O 2.5V PIN_R4 (rx)

I mapped these to their opposites, and on SignalTap they stay constant high even when typing. Reversing the pin assignments shows no change. Am I assigning the right pins?
I have a UART module that takes in rx at 9600. Any guidance is appreciated.
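
For reference, a UART line idles at logic high, so a constant high in SignalTap is what an RX pin shows when no traffic reaches it. The receiver timing itself comes down to a clock divider; below is a minimal sketch of that arithmetic in C. The 50 MHz clock used in the note after the code is an assumption, since the post does not state the clock frequency, and the function names are illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* Clock cycles per UART bit, rounded to the nearest integer */
uint32_t uart_clocks_per_bit(uint32_t clk_hz, uint32_t baud)
{
    assert(baud != 0u);
    return (clk_hz + baud / 2u) / baud;
}

/* Cycles from the start-bit edge to the middle of the start bit,
 * where a receiver typically takes its first sample */
uint32_t uart_mid_bit_offset(uint32_t clocks_per_bit)
{
    return clocks_per_bit / 2u;
}
```

Assuming a 50 MHz clock, `uart_clocks_per_bit(50000000u, 9600u)` gives 5208 cycles per bit, with the first sample taken 2604 cycles after the falling edge of the start bit.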

-Steve
Attached Images
Attached Files

Cyclone V clock output pins

On the Cyclone V C3 device there are pins called FPLL_BL_CLKOUT0 and FPLL_BL_CLKOUT1 and similar pins on the other side FPLL_BR_CLKOUT0 and FPLL_BR_CLKOUT1. The implication is that they are specialist pins which can be driven as a clock output by an fPLL.

I can't find anything in the C V Device Handbook about which of the fPLLs are supposed to drive these pins, so I did a little test. I provide a 100 MHz clock input on one of the global clock input pins. I instantiated an ALTPLL which has four outputs: two at 100 MHz, two at 125 MHz. One of each frequency drives an output pin directly, and the other of each frequency clocks a toggle flip-flop whose Q output drives a pin.

In the pin planner, I assigned the 100 MHz clock input to a global clock input. I purposely did not make any assignments for the other pins. I ran the design through Quartus 15.1, and it was perfectly happy. I checked the resulting pin-out file to see what pins Quartus used for the clock outputs, and they turned out to be just general-purpose I/O pins, not the specialist pins noted above.

Next, I assigned the clock outputs to the specialist pins, ran the tools, and again they were happy.

One thing I noticed was that in both cases, Quartus consolidated the two pairs of PLL outputs into just two outputs, one for the 125 MHz clock and the other for the 100 MHz clock.

It appears that, at least in the C V devices, one can drive both an output pin and internal logic clock inputs off of a PLL output.

What, then, is the point of the specialist FPLL_BL_CLKOUT0 pins? I can't find anything in the handbook that specifically mentions why they'd be used.

Relaxing Data Dependencies on Memory Access

I'm trying to stream in a block of contiguous memory, but only process the data until an end marker is reached. Put simply: iterate through an array until a certain value is found, after which all further elements are to be ignored. A simplistic solution with an OpenCL single work-item kernel would be as follows:
Code:

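#pragma OPENCL EXTENSION cl_altera_channels : enable

// Assumed declarations (not shown in the original post); the END_MARKER
// value is taken from the second listing below.
#define END_MARKER 0x70000000
channel uint2 chan;

__kernel void in_streamer(__global const uint2* in, uint n) {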
    for(uint i = 0; i != n; ++i) {
        uint2 value = in[i];

        write_channel_altera(chan, value);
       
        if(value.s0 == END_MARKER) {
            break;
        }
    }
}
   

__kernel void consumer() {
    while(true) {
        uint2 value = read_channel_altera(chan);
       
        // do work here
       
        if(value.s0 == END_MARKER) {
            break;
        }
    }
}

The consumer kernel is entirely unproblematic: the data dependency on the previous iteration only contains an equality operation. The in_streamer kernel, while working as intended, performs terribly because there is a data dependency on a memory load operation. The AOCL compiler produces the following warning in the optimization report: "Successive iterations launched every 164 cycles due to: Data dependency on variable, Largest Critical Path Contributor: 98%: Load Operation". This in itself is nothing special. I've dealt with such data dependencies before by using a shift register to relax the dependency, as the Altera Best Practices Guide suggests.

The idea is to allow the compiler to pipeline an expensive operation. To make this possible, I can't use the data in the next iteration, but only after a large number of iterations. This usually worked for me with these kinds of problems, but it doesn't seem to work with memory accesses.
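
The intended behaviour of such a delayed exit can be pinned down with a host-side reference model in plain C. This is only a behavioural sketch, not OpenCL and not a fix for the scheduling issue: it forwards elements up to and including the marker, keeps reading and discarding until the flag has travelled through the shift register, and reports how many loads were issued. `DELAY`, `stream_until_marker`, and the sticky `done` flag are illustrative choices and do not come from the post.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DELAY 4   /* hypothetical depth for the model; the post uses 164 */

/* Copies in[] to out[] up to and including the marker. The loop exit only
 * looks at the oldest shift-register entry, so it lags the marker by
 * DELAY - 1 extra reads. Returns the number of forwarded elements and
 * stores the number of loads issued in *reads. */
size_t stream_until_marker(const uint32_t *in, size_t n, uint32_t marker,
                           uint32_t *out, size_t *reads)
{
    bool seen[DELAY] = { false };   /* seen[DELAY-1] is the oldest entry */
    bool done = false;              /* sticky: marker already forwarded */
    size_t written = 0, i = 0;

    assert(in != NULL && out != NULL && reads != NULL);
    for (i = 0; i < n; ++i) {
        if (seen[DELAY - 1]) break;
        uint32_t value = in[i];
        if (!done) out[written++] = value;   /* discard after the marker */
        done = done || (value == marker);
        for (int s = DELAY - 1; s > 0; --s) seen[s] = seen[s - 1];
        seen[0] = done;
    }
    *reads = i;
    return written;
}
```

For a 10-element input with the marker at index 2 and `DELAY` set to 4, the model forwards 3 elements and issues 6 loads, that is, 3 reads past the marker. Comparing a kernel's channel output against such a model is one way to check that a delayed-exit variant still forwards exactly the right elements.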

The following solution tries to implement the in_streamer to break the loop after the end marker was found, but not immediately, in order to relax the dependency. The elements that are read after the end marker was found are discarded and not written to the channel:

Code:

__kernel void in_streamer(__global const uint2* in, uint n) {
    const uint MEM_DELAY = 164;
    bool endmarker_reached[MEM_DELAY];

    #pragma unroll
    for(int s = 0; s < MEM_DELAY; ++s) {
        endmarker_reached[s] = false;
    }
   
    for(uint i = 0; i < n; ++i) {
        uint2 value = in[i];

        if(endmarker_reached[0]) {
            write_channel_altera(chan, value);
        }
       
        bool end_it = false;
        if(value.s0 == 0x70000000) {
            end_it = true;
        }

        #pragma unroll
        for(int s = (MEM_DELAY-1); s > 0; --s) {
            endmarker_reached[s] = endmarker_reached[s-1];
        }
        endmarker_reached[0] = end_it;
       
        if(endmarker_reached[MEM_DELAY-1]) {
            break;
        }
    }
}

Here I run into a problem. While the dependency is relaxed I still get reduced performance, just not as badly reduced as before. The optimization report now says "Successive iterations launched every 2 cycles...". It then gives the following details over a hundred times: "Data dependency on variable, Largest Critical Path Contributor: 45%: Load Operation".

While this is much better than before, it's still a massive waste of processing time. It doesn't matter how high I set the constant MEM_DELAY; the issue remains.

Another working solution would be the following:
Code:

__kernel void in_streamer(__global const uint2* in, uint n) {
    bool end = false;
   
    for(uint i = 0; i != n; ++i) {
        uint2 value = in[i];

        if(!end) {
            write_channel_altera(chan, value);
        }
       
       
        if(value.s0 == END_MARKER) {
            end = true;
        }
    }
}

This works, and both kernels are pipelined perfectly. The problem is that the input array is read to the very end; the values after the end marker are merely discarded.


Has anyone encountered a similar issue? I'd be very interested in where the delay comes from.