Implementing 66MHz-64bit PCI with an FPGA

Posted: 03 May 2000

Keywords: actel, iic taipei 2000, fpga, corepci, target+dma

/ARTICLES/2000MAY/2000MAY03_PL_NTEK_ID_TAC.PDF

International IC - Taipei - Conference Proceedings

Abstract

In today's rapidly growing industry, support for PCI interfaces is essential for seamless interfacing with other systems. Circuit designers often face the challenge of getting their products to market before the competition does. They also face the challenge of completing designs with low-cost, high-density, high-performance devices that can easily integrate more functionality. In the past, high-performance designs were achieved by implementing the design in an ASIC. However, this solution cannot meet the requirement of a shorter time-to-market at a low NRE. By utilizing CorePCI and a fast, flexible SX-A FPGA, circuit designers can now achieve the desired performance and meet the market window with a cost-effective solution. This paper illustrates how to use SX-A FPGAs to implement the CorePCI 66MHz-64bit Target+DMA design, a high-performance design that currently cannot be realized with any other FPGA architecture. Performance and utilization results are reported along with the design flow. Techniques discussed in this application may be applied to other similar applications.

Introduction to PCI

PCI is a complex specification, and customer needs vary widely in the market. Currently, the four most needed basic functions are:

1. Target-only - PCI slave function. This function provides the ability to be a memory or I/O slave on the PCI bus. A Target is capable of performing data transfers under the control of a PCI Master. The Target transfers data between its back-end memory and the PCI bus during a cycle initiated by a Master. If the PCI Master initiates a PCI read cycle to the Target's address space, the Target reads data out of its back-end memory and delivers it to the PCI bus. If the PCI Master initiates a PCI write cycle to the Target's PCI address space, the Target reads data from the PCI bus and writes the data to its back-end memory. The Target is capable of accepting three types of read and write cycles: Memory, IO, and Configuration cycles.

2. Master-only - PCI initiator function. This function provides the ability to perform memory and configuration cycles as a bus Master. The Master-only is capable of initiating Memory, IO, and Configuration cycles on the PCI bus. These cycles are triggered by a write from a back-end microprocessor into the Master's DMA control registers. The three Master control registers (a PCI address pointer register, a back-end address pointer register, and a control register) set the count of double words (Dwords) to be transferred and the direction of the transfer (from PCI to back-end or from back-end to PCI). The Master-only cannot be a Target, i.e. it cannot have its back-end memory read or written by another PCI bus Master. Thus, the Master's back-end memory is secure from corruption, but it cannot get data unless it initiates the PCI cycle.

3. Target+DMA - PCI slave function with the ability to perform memory transactions as a PCI bus Master. This device can function as a Target, or it can arbitrate for mastership of the PCI bus. Acting as a Master, it can cause a burst data transfer (DMA) from its back-end memory to another Target on the PCI bus. The burst DMA transfer is caused by writing to the DMA control registers (the source, destination, and count registers described under Master-only above). In the Target+DMA configuration, the DMA control registers can only be written from the PCI bus. The Target+DMA Master cannot initiate a PCI configuration cycle; it cannot be a full Master and configure other PCI Targets.

4. Target+Master - Combines both the Target-only and the Master-only functions. A Target/Master has all of the features of a Target and a Master.
A Target/Master can act as a Target on the PCI bus. It can also act as a Master, with its Master control registers being configured from its back-end by a back-end microprocessor. When acting as a Master, it can initiate configuration cycles, memory cycles, and IO cycles.

PCI Functions' Availability in the Market

Currently, there are various PCI implementations, such as:

1. ASSP - Application Specific Standard Products (off-the-shelf products from PLX, Tundra, etc.).
2. Embd - Embedded standard PCI function in a PLD or FPGA.
3. ASIC - Application Specific Integrated Circuits.
4. IP - Intellectual Property core implementation in an FPGA.

Jing Hua Ma, Staff Applications Engineer, Actel Corporation

There are three classes of IP cores:

* Hard Cores, which correspond to a fixed, silicon-wrapped netlist and the associated timing, area, and power dissipation attributes.
* Firm Cores, which stand for a logic netlist with some placement information, or partially synthesizable RTL code.
* Soft Cores, which are purely synthesizable RTL code: VHDL or Verilog HDL.

Circuit designers always want to get their products to market before their competition does. However, they face the challenge of completing designs with low-cost, high-density, high-performance devices that can easily integrate more functionality. High-performance PCI designs can be achieved by implementing the designs in an ASSP, Embd, or ASIC. However, these solutions cannot at the same time shorten time-to-market, offer the flexibility to make modifications, and integrate easily with other designs in a single, cost-effective device. By utilizing CorePCI and a fast, flexible SX-A FPGA, circuit designers can now meet all of these requirements.

PCI 66MHz-64bit Target + DMA Design Challenges

The challenges include, but are not restricted to, performance, bus width, and data transfer rate:

1. Performance of 66MHz, or 15ns register-to-register.
2. 6ns clock-to-out (clock pad to all output pads).
3. 3ns external setup for all of the PCI pins.
4. Data transfer rate: fully compliant zero-wait-state burst for PCI transfers.
5. 64-bit data pins.

For more detailed information, refer to the PCI Specification 2.1.

CorePCI Implementation

The CorePCI macro is designed for use with a wide variety of peripherals where high-performance data transactions are required. Figure 2 depicts a typical system application using the baseline macro. This macro provides a generic set of back-end signals that form a bridge to specific back-ends such as SDRAM, SRAM, and CPUs.

The core consists of up to four basic units: the Target controller, the Master controller, the back-end, and the wrapper. The CorePCI macro consists of six major functional blocks: the DMA state machine, the address phase state machine, the dataphase state machine, the datapath, parity, and the configuration block (Figure 3). All of the blocks shown are required for implementing the Target+DMA function. The DMA, address phase, and dataphase state machines control the macro's outputs and also the dataflow between the PCI bus and the back-end. The remaining modules define the datapath logic for the CorePCI macro.

DMA State Machine

The DMA state machine is responsible for obtaining Master ownership of the PCI bus and for launching a data transfer transaction by asserting FRAMEn. FRAMEn is driven by the current Master to indicate the beginning and duration of an access; its assertion marks the start of a bus transaction. Once a burst transaction has begun, the DMA state machine tracks the transfer count and terminates the burst by de-asserting the FRAMEn signal and releasing Master ownership of the PCI bus. In addition to basic Master control, the DMA module also implements the Master support registers: PCI Start Address, DMA Start Address, and DMA Control.
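The sequence described for the DMA state machine (request the bus, assert FRAMEn, count the transfers, de-assert FRAMEn, release the bus) can be sketched as a small behavioral model. This is only an illustration in Python, not the CorePCI VHDL: the state names, the `DmaModel` class, and the `start`/`step` interface are hypothetical, and the model assumes one Dword per clock with no wait states or Target terminations.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()      # no transfer pending
    REQUEST = auto()   # waiting for the arbiter to grant the bus
    ADDRESS = auto()   # address phase, FRAMEn asserted
    DATA = auto()      # data phases in progress


class DmaModel:
    """Hypothetical behavioral model of a DMA master (not the CorePCI RTL)."""

    def __init__(self):
        self.state = State.IDLE
        self.count = 0     # Dwords left to transfer
        self.frame_n = 1   # active-low: 1 means de-asserted
        self.req_n = 1     # active-low bus request

    def start(self, dword_count):
        """Model the control-register write that launches a transfer."""
        if dword_count < 1:
            raise ValueError("transfer count must be at least 1")
        self.count = dword_count
        self.state = State.REQUEST
        self.req_n = 0

    def step(self, gnt=True):
        """Advance one clock; gnt is True when the arbiter grants the bus."""
        if self.state is State.REQUEST:
            if gnt:
                self.req_n = 1
                self.frame_n = 0          # FRAMEn asserted: transaction begins
                self.state = State.ADDRESS
        elif self.state is State.ADDRESS:
            self.state = State.DATA
            # FRAMEn is released during the final data phase
            self.frame_n = 1 if self.count == 1 else 0
        elif self.state is State.DATA:
            self.count -= 1               # one Dword transferred this clock
            if self.count == 0:
                self.state = State.IDLE   # burst done, bus released
                self.frame_n = 1
            else:
                self.frame_n = 1 if self.count == 1 else 0


def run_burst(dword_count):
    """Return the FRAMEn value after each clock until the model is idle."""
    dma = DmaModel()
    dma.start(dword_count)
    trace = []
    while dma.state is not State.IDLE:
        dma.step(gnt=True)
        trace.append(dma.frame_n)
    return trace
```

For a three-Dword burst, `run_burst(3)` returns `[0, 0, 0, 1, 1]`: FRAMEn is low for the address phase and the first two data phases, and is already high during the final data phase.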
Address Phase State Machine

The address phase state machine is responsible for monitoring the PCI bus and determining whether a PCI transaction is targeting the CorePCI macro. When a hit is detected, the DP_START/DP_START64 signals are activated, setting off the dataphase machine and back-end logic. The address phase state machine also determines the cycle type and provides this information on the RD_CYC, WR_CYC, BAR0_MEM_CYC, BAR1_CYC, and CONFIG_CYC outputs.

Figure 1: Flexibility - ability to make modifications and integrate with other designs; Portability - ability to migrate to future processes or other vendor technology; Cost - overall or final cost of the design solution.

Figure 2: CorePCI Macro System Block Diagram

Dataphase State Machine

The dataphase state machine is responsible for controlling the PCI output signals and coordinating the data transfers with the back-end logic. When operating as a Target, the PCI outputs are TRDYn, DEVSELn, and STOPn. When operating as a Master, IRDYn is the primary PCI output. Data transfers to the back-end are coordinated using the signals RD_BE_RDY, RD_BE_NOW, WR_BE_RDY, and WR_BE_NOW. The two "BE_RDY" inputs indicate that the back-end is ready to transmit or receive data. The "BE_NOW" signals indicate that a data transfer will occur on the next rising edge of the clock. The dataphase state machine also drives the DP_DONE output active at the end of the PCI transfer.

Parity

The parity block generates and checks parity on the PCI bus.

Datapath

The datapath module provides the steering and registers for the data between the PCI bus and the back-end. Additionally, the datapath contains the address counters and increments their value after each data transaction.

Configuration

The configuration block contains the configuration space for the Target controller. These registers include the ID registers, status and control registers, and the base address registers.
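To make the interplay between the address phase state machine and the base address registers more concrete, the following Python sketch decodes a PCI command and checks for a hit. The command encodings are the standard PCI ones (`0110` memory read, `0111` memory write, `1010` configuration read, `1011` configuration write). Everything else is an assumption for illustration: the function name `decode_address_phase`, the `bar0_base`/`bar0_size` parameters, the simple range comparison (valid when the base is aligned to the size), and the returned dictionary, whose keys merely echo the output names listed above. IO cycles and BAR1 are left out for brevity.

```python
MEM_READ, MEM_WRITE = 0b0110, 0b0111   # standard PCI memory commands
CFG_READ, CFG_WRITE = 0b1010, 0b1011   # standard PCI configuration commands


def decode_address_phase(cbe, addr, bar0_base, bar0_size,
                         idsel=False, req64=False):
    """Hypothetical model of address-phase decoding for one memory BAR.

    cbe   : 4-bit command sampled on C/BE[3:0] during the address phase
    addr  : address sampled on the AD bus
    idsel : True when this device is selected for a configuration cycle
    req64 : True when the Master requests a 64-bit transfer (REQ64n low)
    """
    is_mem = cbe in (MEM_READ, MEM_WRITE)
    is_cfg = cbe in (CFG_READ, CFG_WRITE)

    # Memory hit: the address falls inside the window programmed in BAR0
    bar0_hit = is_mem and bar0_base <= addr < bar0_base + bar0_size
    # Configuration hit: configuration command with IDSEL active
    cfg_hit = is_cfg and idsel
    hit = bar0_hit or cfg_hit

    return {
        "DP_START": hit,
        "DP_START64": bar0_hit and req64,
        "RD_CYC": hit and cbe in (MEM_READ, CFG_READ),
        "WR_CYC": hit and cbe in (MEM_WRITE, CFG_WRITE),
        "BAR0_MEM_CYC": bar0_hit,
        "CONFIG_CYC": cfg_hit,
    }
```

A 64-bit memory write into the BAR0 window would be decoded with `decode_address_phase(0b0111, 0x10000040, 0x10000000, 0x1000, req64=True)`, which sets `DP_START`, `DP_START64`, `WR_CYC`, and `BAR0_MEM_CYC` in this model.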
The Implementation of PCI 66MHz-64bit Target + DMA in the SX-A

The 66MHz-64bit Target+DMA PCI function can be implemented very efficiently on the 54SX32A device. This is because the device offers a fine-grained, register-rich "Sea of Modules" (SOM) architecture with two innovative new local routing resources: FastConnect (FC, shown in Figure 4 with a red arrow) and DirectConnect (DC, shown in Figure 4 with a green dashed arrow from C to R). These routing resources enable extremely fast and predictable interconnection of modules. They also dramatically reduce the number of antifuses required to complete a circuit, ensuring the highest possible internal performance.

Notes:
1. DC is a direct connection with only 0.1ns routing delay.
2. FC uses only one antifuse, with about 0.3ns routing delay.
3. Regular routing segments are connected through 2 to 5 antifuses, typically only 2.

To meet the requirements of the PCI 66MHz-64bit Target + DMA design, a specific address/data I/O design must be combined with an HCLKBUF, a dedicated input wired directly to each sequential macro in the FPGA. Figure 5 shows the actual implementation used to achieve the external setup and clk-to-out requirements.

Figure 3: Block Diagram of the CorePCI Macro

Figure 4: Two innovative new local routing resources, DC and FC

Figure 5: Bi-directional address and data bus with fast clk-to-out design

Design Flow

Using the Synopsys VHDL synthesis tool, the PCI VHDL code can be synthesized with Actel's SX-A library to produce an EDIF netlist. Figure 6 illustrates the complete Actel/Synopsys synthesis design flow. The netlist is then imported into Actel's Designer development tool and laid out using the DirectTime Layout methodology, targeting the SX32A device. Designers can specify timing requirements in the DirectTime Layout tool to achieve their goal.
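The timing requirements a designer would work from follow directly from the figures quoted earlier in this paper. The short Python calculation below is only a back-of-the-envelope sketch using those numbers (66MHz, 64 bits, a 15ns cycle budget, 6ns clock-to-out, 3ns setup); the function names are hypothetical, and the bandwidth figure is a theoretical peak that assumes one 64-bit transfer on every clock.

```python
def clock_period_ns(clock_hz):
    """Clock period in nanoseconds for a given clock frequency in Hz."""
    return 1e9 / clock_hz


def peak_bandwidth_mbytes(clock_hz, bus_bits):
    """Theoretical peak transfer rate in MB/s (one transfer per clock)."""
    return clock_hz * bus_bits / 8 / 1e6


def bus_margin_ns(cycle_budget_ns, clk_to_out_ns, setup_ns):
    """Time left in the cycle for board propagation and clock skew
    after the driver's clock-to-out and the receiver's setup time."""
    return cycle_budget_ns - clk_to_out_ns - setup_ns


if __name__ == "__main__":
    print(f"period  : {clock_period_ns(66_000_000):.2f} ns")
    print(f"peak    : {peak_bandwidth_mbytes(66_000_000, 64):.0f} MB/s")
    print(f"margin  : {bus_margin_ns(15.0, 6.0, 3.0):.1f} ns")
```

With the values from this paper, the 6ns clock-to-out and 3ns setup requirements leave 6ns of the 15ns cycle for everything between the output pad of one device and the input pad of another, which is why both figures have to be met at the I/O rather than recovered elsewhere.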
As expected, the PCI 66MHz-64bit Target+DMA VHDL design achieves 66MHz operation while using only 49% of the logic of the 54SX32A, a 32k-logic-gate device. Designers can therefore add additional logic to integrate this PCI function with their system designs. Figure 7 shows the 64-bit zero-wait-state burst write, and Figure 8 shows the 64-bit zero-wait-state burst read.

Conclusion

PCI functions can be easily implemented in FPGAs by combining pre-developed, high-level-language-based functions (Intellectual Property cores) with a high-performance part, the A54SX32A device. Using Actel's IP core libraries is one of the fastest ways to complete an FPGA system design. From Actel's website, circuit designers can access more system-level IP cores for FPGAs in fields such as telecommunications (e.g. All Digital Phase Locked Loop, ISDN E1 Framer/Deframer, High Level Data Link Controller, and ATM Forum Utopia Level II interface), processors (8-bit microprocessor), interfaces (e.g. Controller Area Network Bus Interface, I2C Master and Slave Interface, Universal Asynchronous Receiver Transmitter Interface, VME Slave Interface, and Serial Communication Controller - UART), and common function libraries. Additionally, designers can customize, reuse, and synthesize these IP cores with various computer-aided synthesis tools, such as the Synopsys, Exemplar, and Synplicity tools. Designing with IP cores is therefore an efficient approach to FPGA design.

Figure 6: The Actel/Synopsys Design Flow

Figure 7: 64-bit zero-wait-state burst write

Notes:
1. When FRAMEn and REQ64n are asserted and the command bus is `0111', a 64-bit write to memory space is indicated.
2. The Target will compare the address to the programmed space set in the memory base address register.
3. If an address hit occurs, then the Target asserts DP_START and DP_START64 in cycle 3 and claims the PCI bus by asserting DEVSELn and ACK64n in cycle 4.
4. Data transfer to the back-end begins on the rising edge of cycle 7 and continues for each subsequent cycle until the PCI bus ends the data transfer.
5. For 64-bit transfers, the MEM_ADDRESS will increment by 2 each cycle.
6. The PCI transaction completes when TRDYn is de-asserted in cycle 9 and completes on the back-end in cycle 10.
7. For this case, the PIPE_FULL_CNT is set to "000".

Figure 8: 64-bit zero-wait-state burst read

Notes:
1. When FRAMEn and REQ64n are asserted and the command bus is `0110', a 64-bit read from memory space is indicated.
2. The Target will compare the address to the programmed space set in the memory base address register.
3. If an address hit occurs, then the Target asserts DP_START and DP_START64 in cycle 3 and claims the PCI bus by asserting DEVSELn and ACK64n in cycle 4.
4. Data transfer from the back-end begins on the rising edge of cycle 7 and continues for each subsequent cycle until the PCI bus ends the data transfer. The back-end prefetches three DWORDs during zero-wait-state bursts.
5. For 64-bit transfers, the MEM_ADDRESS will increment by 2 each cycle.
6. The PCI transaction completes when TRDYn is de-asserted in cycle 10.
7. For this case, the PIPE_FULL_CNT is set to "000".

Author's contact details:
Jing Hua Ma
Actel Corporation
955 E. Arques Avenue, Sunnyvale, CA 94086 USA
Phone: 1 408 739 1010
Fax: 1 408 739 1540
E-mail: ma@actel.com
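As a supplement to the notes for Figures 7 and 8, the address stepping during a burst can be illustrated with a few lines of Python. The sketch assumes that MEM_ADDRESS counts in Dwords, so that a 64-bit data phase advances it by 2 as the notes state; the step of 1 for a 32-bit data phase is an assumption derived from that reading, and the function names and parameters are hypothetical.

```python
def burst_addresses(start, data_phases, is_64bit=True):
    """MEM_ADDRESS value presented to the back-end for each data phase.

    Assumes a Dword-granular address: +2 per 64-bit data phase,
    +1 per 32-bit data phase (the latter is an assumption).
    """
    step = 2 if is_64bit else 1
    return [start + i * step for i in range(data_phases)]


def burst_bytes(data_phases, is_64bit=True):
    """Total bytes moved by a burst with the given number of data phases."""
    return data_phases * (8 if is_64bit else 4)


if __name__ == "__main__":
    # Four zero-wait-state 64-bit data phases starting at Dword address 0x10
    print([hex(a) for a in burst_addresses(0x10, 4)])
    print(burst_bytes(4), "bytes")
```

In this model, a four-phase 64-bit burst touches Dword addresses 0x10, 0x12, 0x14, and 0x16 and moves 32 bytes, twice what the same number of 32-bit data phases would carry.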



