Självständigt arbete på grundnivå
Independent degree project first cycle
Electrical Engineering
DFPM on FPGA – A speed optimized implementation of the Dynamic
Functional Particle method on Spartan 3E
Taiyelolu Adeboye
DFPM on FPGA
Taiyelolu Adeboye
2015-09-25
iii
MID SWEDEN UNIVERSITY Department of Electronics Design (EKS)
Examiner: Benny Thörnberg, [email protected]
Supervisor: Kent Bertilsson, [email protected]
Author: Taiyelolu O. Adeboye, [email protected]
Degree programme: International Bachelor’s Programme in Electronics, 180 credits
Main field of study: Electronics Engineering
Semester, year: Autumn, 2014
Abstract
This thesis focuses on the design of electronic circuitry that implements
the Dynamic Functional Particle Method (DFPM). The design was done
in VHDL and implemented on a Xilinx Spartan 3E FPGA. The work
included a digital 33-bit ALU, designed to solve differential equations
with the DFPM algorithm, as well as UART transceiver and controller
circuits for data exchange between the FPGA and the PC. This report
explains the design principles, process, tests and results of the work. It
compares the performance of the designed system with that of generic
computational devices and examines the possibilities and limitations of
operational concurrency in relation to the size of problem sets.
Keywords: MATLAB, VHDL, FPGA, DFPM, algorithm evaluation, CPU
clock cycles, particle method
Acknowledgements
I would like to express my appreciation to my supervisor, Associate
Professor Kent Bertilsson, for his guidance, mentorship and support in
the course of this project. His contribution was vital to the execution and
completion of this project work. I would also like to express my appreciation
to Associate Professor Sverker Edvardsson for being so approachable and for
his great willingness to explain.
My various tutors and examiners in the course of this Bachelor’s programme
have proven themselves to be exceptional and unforgettable. In no particular
order, Professor Bengt Oelmann, Dr. Börje Norlin, Professor Kent Bertilsson,
Professor Benny Thörnberg, Martin Kjellqvist, Mikael Hasselmalm, Dr. Najeem
Lawal, Mikael Bylund, Amir Yousaf, Professor Cornelia Schiebold, Dr. Peng
Cheng, Mazhar Hussein, Professor Engmont Porten, Stefan Haller, David Krapohl,
Solange Hamrin and Evelina Caffrey will remain entrenched in my memory.
Without mincing words, Anders Rådberg, Anders Molin, Sara Lodin,
Lars Malmbom, Tove Gullikson and the team at MIUN Innovation will
always remain dear to my heart. Thank you for your time, advice and
your effort!
Finally, I owe a huge debt of gratitude to the following: The divine, for
those moments when I was dry, Temitope Ruth, for being so understanding
and special, Ire Peter, our bundle of joy, for being so sweet,
Kehinde, my wonderful twin, my family (Samuel, Dorcas, Ardex,
Adeyemi and Ope) for being such a pillar of support, and my friends in
Sweden and in Nigeria. Words will not be enough to express how much
I appreciate you!
Thank you for being part of this journey, muchas gracias! Greater things
are still to come!
Table of Contents
Abstract
Acknowledgements
1 Introduction
1.1 Background and problem motivation
1.2 Overall aim
1.3 Scope
1.4 Tools to be used
1.5 Concrete and verifiable goals
1.6 Outline
1.7 Contributions
2 Theory
2.1 Definition of terms and abbreviations
2.1.1 Terms
2.1.2 Abbreviations
2.2 DFPM algorithm
3 Methodology
3.1 Concurrence vs. sequentiality
3.2 Numerical representation
3.3 Modularity
4 Design
4.1 The DFPM algorithm
4.2 Project Top Module
4.2.1 The two top sub-modules
4.2.2 Data type conversion
4.3 Project defined Packages
4.4 Communication Top Module
4.4.1 UART
4.5 Iteration Control Top Module
4.6 Implementation Constraint
4.7 Parameters
4.8 Data exchange format
4.9 Signed numerical representation
4.10 Integer and fractional representation
4.11 Spartan 3E-1200 FG320 FPGA
4.12 Nexys2 FPGA demonstration board
4.13 Xilinx ISE
4.14 ISim Simulation software
4.15 Design verification
4.16 The complete design
5 Results
5.1 Simulation results
5.1.1 Element-wise vector multiplication
5.1.2 Element-wise vector subtraction
5.1.3 Evaluating new vector V
5.1.4 Evaluating new vector X
5.1.5 Convergence check
5.1.6 DFPM top module
5.2 Comparison
6 Discussion
6.1 FPGA resource utilization
6.2 Reduction in computation time
6.3 Larger problem sets
6.4 UART bottleneck
6.5 Precision
6.6 Communication input/output limitations
6.7 Cross platform comparison
6.8 Output comparison
6.9 Communication possibilities
6.10 Applications
6.11 Implications
7 Conclusions
7.1 Benchmark
7.2 Further work
References
Appendix A: Documentation of own developed program code
Design codes
New V operations
New X operations
One Iteration
DFPM top module
UART Core
UART Interface
Project Top module
Test code written in C++
Appendix B: Explanation of some basic mathematical concepts
Two’s complement
Euclidean norm
Appendix C: Project report summary
Appendix D: MATLAB codes
Code for problem specification and comparison
Appendix E: Table of standard ASCII symbols and their numerical representation
1 Introduction
DFPM on FPGA is a project that implements the algorithm of the Dynamic
Functional Particle Method in silicon. The implementation was done on a
Xilinx Spartan 3E FPGA, and it was designed for speed (in terms of the
number of clock cycles required to execute the algorithm).
The Dynamic Functional Particle Method (DFPM) is a numerical particle
method that was developed at Mid Sweden University. While the method is
iterative, it consists of steps, some of which can be executed in parallel.
An FPGA was therefore expected to offer advantages due to its parallel
processing capabilities.
The FPGA implementation takes matrix elements as input parameters through
the UART and returns the corresponding solution vector as its output.
Figure 1.1: A simplified illustration of the project
1.1 Background and problem motivation
Systems of linear equations can be used to describe many observable natural
phenomena and find application in many areas, including physics, mechanics
and sensor fusion.
One of the approaches to solving systems of linear equations involves the
application of the knowledge of matrices. This approach treats the system as
matrices or vectors comprising elements that represent the parameters of
the system in question.
This approach often results in the classical A*X = B problem, where A, X and B
are matrices/vectors. A contains the coefficients of the system, X contains
the unknown quantities to be solved for, and B contains the known
right-hand-side values.
For instance, if a system is defined as shown below,
3x – 2y + 4z = 10
5x + 1y – 2z = -2
10x – 5y + 3z = 4
then it can be represented in A*X = B form with
A = [3 -2 4; 5 1 -2; 10 -5 3], X = [x; y; z] and B = [10; -2; 4],
where semicolons separate the rows.
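The matrix form of this example can be sketched in a few lines of Python. The snippet assumes that the second and third equations are meant to begin with 5x and 10x; the helper name `mat_vec` and the candidate solution are illustrative and are not taken from the thesis.

```python
from fractions import Fraction

# Coefficient matrix and right-hand side of the example system
# (assumption: the second and third equations begin with 5x and 10x).
A = [[3, -2, 4],
     [5, 1, -2],
     [10, -5, 3]]
B = [10, -2, 4]


def mat_vec(matrix, vector):
    """Return the matrix-vector product as a list."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


# Candidate solution written as exact fractions (illustrative values).
X = [Fraction(42, 91), Fraction(184, 91), Fraction(288, 91)]

# A candidate X solves the system when A*X reproduces B exactly.
print(mat_vec(A, X) == B)
```

Exact fractions are used here only so that the comparison with B is free of rounding error; the hardware described in this report works with fixed-width binary numbers instead.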
As the number of variables in these systems increases, the size of the matrices
increases proportionately, but the number of iterations required for solving the
problem using an iterative numerical method increases geometrically, thus
consuming significant CPU time.
This project aims to address this problem through the design of an Arithmetic
and Logic Unit (ALU) that implements the DFPM algorithm in a system that
combines sequential and parallel execution as a means of reducing the number
of clock cycles required per iteration and, consequently, the computation
time for the complete algorithm.
1.2 Overall aim
The overall aim of the project is the design of an ALU that implements the
Dynamic Functional Particle Method on an FPGA. The system will be capable of
receiving input in the form of parameters that represent the variables of the
system to be analysed and will give its output in the form of a matrix whose
elements represent the solution to the problem.
The designed system will be capable of communicating with a computer
through the USB port and the data is to be collected and displayed on the
computer screen using suitable software.
The output from the designed system should be correct and consistent in
comparison with values obtainable from a similar computation executed in
MATLAB or similar software on a PC.
Figure 1.2: An overview of the project concept
1.3 Scope
The designed system is expected to solve systems of linear equations
expressed in the form A*X = B, where A is a 5x5 square matrix while X and B
are 5x1 vectors. A and B will be given as input to the designed system,
which returns X as the solution vector of the system.
The input to the designed system should be in the form of positive 8-bit
integers, while the output is expected to consist of whole numbers as well
as fractions, which can be represented to a maximum precision of 8 binary
bits.
Although limits have been imposed on the kind of input parameter expected,
with the aim of easing the communication between the designed FPGA system
and the PC software, the ALU should be able to execute the DFPM algorithm on
input data beyond these constraints.
1.4 Tools to be used
The following tools are expected to be used to carry out this project:
1. Xilinx Spartan 3E FPGA on Nexys2 demonstration board.
2. Xilinx ISE design suite.
3. Desktop terminal application software running on a PC.
4. MATLAB software running on a PC.
1.5 Concrete and verifiable goals
The goals of the project are as follows:
1. Design of a processor/ALU in VHDL. The unit should implement the
DFPM algorithm.
2. Implementation of parallel processing in the design of the DFPM
computational module, to the extent that is optimal for the problem size.
3. Design of UART communication modules, in VHDL, for the transfer of
data from the PC/UART port to the DFPM computation module described in
the items above.
4. Verification of the output from the FPGA. It should be consistently
equivalent to the output of the same algorithm run on a PC.
5. Investigation and suggestion of possible solutions and approaches to
scaling up the design for significantly larger problem sets.
1.6 Outline
Chapter 2 of this report briefly explains the theory behind the design and
some related work pertinent to DFPM and the FPGA implementation, while
Chapter 3 examines the design methodology and the principles behind the
design choices and approaches. Chapter 4 describes the design itself.
Chapter 5 presents the tests carried out to verify the functionality of the
designed modules and compares the results with those obtainable from other
systems. In Chapter 6 the results are discussed and the possibilities and
limitations examined, and Chapter 7 concludes the report.
1.7 Contributions
This design was wholly done by the author of this report with support and
guidance from the supervisor (Associate Prof. Kent Bertilsson). The design was
based on the Dynamic Functional Particle Method algorithm, which was developed
by Prof. Sverker Edvardsson et al. [1].
Prof. Sverker Edvardsson supplied the author with information about DFPM
and a sample application of the algorithm implemented in MATLAB. A UART
core designed for the Nexys2 and made available by Digilent Inc. was
adapted in designing the data exchange modules that interface between the
FPGA and the PC.
2 Theory
Systems of linear and differential equations are a well-established concept
in mathematics and find application in solving theoretical numerical problems
as well as real-world challenges in fields such as mechanics, biology,
electronics and economics. Thus a lot of work has been done to develop
approaches to solving these problems.
The Dynamic Functional Particle Method (DFPM) is an approach, recently
developed by Sverker Edvardsson et al. [1] [2], which can be used to solve
systems of linear and differential equations. The algorithm is simple, widely
applicable and efficient, with significant advantages over some of the other
established approaches [2].
DFPM implements a novel second order dynamical particle method which,
though new, is related to some first order approaches in previous work done
by Sincovec and Madsen [3], Pata and Squassina [4], and F. Alvarez [5].
There are a number of computational libraries and algorithms implementing
various approaches to solving systems of linear and differential equations.
Some of these include ARPACK and LAPACK, the Colt library (Java) and
IML++ (C++), among others.
Since this report is not a mathematical treatise, the main focus is on design and
implementation of electronic hardware that is able to compute and present
solutions to problems presented as a system of differential equations received
as input.
The design and implementation done in this project, while novel, are related
to previous work by Bruce Land entitled “Hybrid Computing on an FPGA” [6],
in which a Digital Differential Analyzer (DDA) was designed and implemented
on an Altera Cyclone II 2C35 FPGA on an Altera DE2 demonstration board. That
design represented numbers in 18 bits, of which 16 bits were set apart for
the fractional part. Parallel computations were also used in order to reduce
computation time.
Apart from Bruce Land’s design above, there is little or no known information
about the implementation of numerical or particle methods in FPGA, and this
work could lead to novel concepts and applications.
2.1 Definition of terms and abbreviations
2.1.1 Terms
Below are basic definitions and/or explanation of some important concepts
used in this report.
1. Linear equations
A linear equation can be defined as an algebraic equation in which every term
is either a constant or the product of a constant and a single variable raised
to the first power.
2. Systems of linear equations
These are a set of simultaneous linear equations which are defined as a single
problem and meant to be treated as such. These are often encountered in real
life situations and observable physical phenomena.
3. Differential equations
These kinds of equations define relationships connecting certain functions or
physical properties with their differentials (i.e. derivatives) hence the name.
4. Systems of differential equations
These are sets of simultaneous differential equations that together define a
specific problem in terms of relationships between one or more unknown
functions (dependent variables) of the independent variables and their
derivatives.
5. Numerical methods
These are approaches to solving mathematical problems by means of numerical
approximation. Numerical methods can be direct or iterative.
Direct numerical methods are algorithms that arrive at a solution in a
predefined number of steps. An example is the Gaussian elimination method.
Iterative methods, however, require a number of iterations of their
computational steps that is not known in advance and can vary with each
problem definition. Examples of iterative numerical methods are the
Newton-Raphson method (also known as Newton’s method) and the Jacobi method.
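To illustrate what is meant by an iteration count that is not known in advance, the following minimal sketch applies the Newton-Raphson method to the equation x*x = value. The function name, the starting guess and the stopping rule are assumptions made for this illustration only.

```python
def newton_sqrt(value, tolerance=1e-12, max_iterations=100):
    """Approximate sqrt(value) with Newton-Raphson on f(x) = x*x - value.

    Returns the estimate together with the number of iterations used.
    """
    if value <= 0:
        raise ValueError("value must be positive")
    x = value if value > 1 else 1.0  # simple starting guess (assumption)
    for iteration in range(1, max_iterations + 1):
        next_x = 0.5 * (x + value / x)  # Newton-Raphson update for this f
        if abs(next_x - x) < tolerance:
            return next_x, iteration
        x = next_x
    raise RuntimeError("no convergence within max_iterations")


# The number of iterations depends on the problem instance.
for value in (2.0, 1.0e6):
    root, iterations = newton_sqrt(value)
    print(value, root, iterations)
```

The loop ends when two successive estimates differ by less than the tolerance, which is one common stopping rule for iterative methods.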
6. Particle methods
Particle methods are algorithms used, primarily, for the simulation of interact-
ing particles of physical systems and their motion in nature. These algorithms
are, sometimes, applied to numerical treatment of theoretical mathematical
models. The dynamic functional particle method falls under this category.
7. Convergence
Convergence is the property of an iterative method whereby its successive
iterates consistently approach, or “converge” to, some specific value. The
value to which the method converges is taken to be the solution of the
problem being solved with the iterative method.
8. The Dynamic Functional Particle method
This is an iterative particle method applied to general mathematical problems
by which mathematical problem models can be translated to particle models
and solved, as developed by Sverker Edvardsson et al [2].
The method is robust and widely applicable to problems of systems of linear
and differential equations, especially those defining nature and observable
physical phenomena.
9. Sequential processes
Sequential processes are processes consisting of operations which are carried
out one after the other. In these kinds of processes no two operations take
place simultaneously. All operations follow a definite sequence. Examples are
operations that take place in a single core CPU (Central Processing Unit).
10. Concurrent processes
Concurrent processes are processes consisting of more than one operation
being carried out in parallel. These kinds of processes can occur in multi-core
CPUs, FPGAs and other kinds of devices with parallel processing capabilities.
11. CPU time
This refers to the time spent by a processing unit while carrying out a certain
computational operation or set of operations. It is expressed in seconds.
12. Clock
This is a component in digital electronic systems by which the timing of
operations and processes is controlled. It oscillates between a high and a
low signal level.
13. Clock cycle
This is a single complete up and down oscillation of a clock.
14. Clock frequency
This refers to the number of cycles a clock completes in a second. It is ex-
pressed in Hertz.
15. Field Programmable Gate Array (FPGA)
These are integrated circuits that are factory manufactured to be configurable
by engineers and designers as the use case or application demands. They are
normally programmed in a hardware description language (HDL).
16. Universal Asynchronous Receiver Transmitter
This is standard hardware that facilitates serial data exchange between two
electronic devices. A UART port must be connected to another UART port in
order for them to exchange data.
Data exchange between UART hardware is 1 bit serial and takes place between
cross-connected receiver and transmitter pins while the data received is con-
verted to parallel 8 bit format and exchanged between the UART hardware
and the device controlling it.
Figure 2.1 Simplified illustration of the UART communication process
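The conversion between the serial bit stream and the 8-bit parallel format can be sketched as follows. The sketch assumes the common frame layout of one start bit, eight data bits sent least significant bit first, no parity bit and one stop bit; the frame settings actually used in this project are not stated in this section, and the function names are hypothetical.

```python
def frame_byte(byte):
    """Serialise one byte as start bit (0), 8 data bits LSB first, stop bit (1)."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("byte must fit in 8 bits")
    data_bits = [(byte >> position) & 1 for position in range(8)]
    return [0] + data_bits + [1]


def unframe_bits(bits):
    """Recover the parallel 8-bit value from a 10-bit frame."""
    if len(bits) != 10 or bits[0] != 0 or bits[9] != 1:
        raise ValueError("framing error")
    return sum(bit << position for position, bit in enumerate(bits[1:9]))


frame = frame_byte(ord("A"))  # ASCII "A" is 0x41
print(frame)
print(chr(unframe_bits(frame)))
```

Timing (the baud rate) is left out of this sketch; in hardware each bit is held on the line for one bit period.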
17. MATLAB
MATLAB is an interactive software platform and high-level programming
language which is often used in scientific and engineering computing due to its
simplicity, robustness and easy to use interactive environment and functions.
In this project, it was used for the initial execution of the DFPM algorithm and
comparison.
18. Terminal software application
This is a software application that enables its user to get access to one or more
input/output ports (e.g. USB) of a PC and which displays the data stream. In
this project, Br@y++ terminal was used to access a USB port and communicate
with the FPGA running the DFPM algorithm.
19. Two’s complement
Two’s complement is a method of representing signed numbers in which the
most significant bit indicates the sign. Non-negative numbers are written in
plain binary, while a negative number is obtained by inverting every bit of
the corresponding positive number and adding one.
When the most significant bit of a number represented in two’s complement is
“1”, the number is negative; when it is “0”, the number is zero or positive.
This is a standard way of representing numbers that is frequently applied in
computing and electronics.
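As a minimal sketch of the representation, the functions below encode and decode integers in two’s complement. The default width of 33 bits follows the ALU width mentioned in this report; the function names are illustrative.

```python
WIDTH = 33  # word width of the ALU described in this report


def to_twos_complement(value, width=WIDTH):
    """Return the unsigned bit pattern that represents a signed integer."""
    low, high = -(1 << (width - 1)), (1 << (width - 1)) - 1
    if not low <= value <= high:
        raise OverflowError("value does not fit in the given width")
    return value & ((1 << width) - 1)


def from_twos_complement(pattern, width=WIDTH):
    """Return the signed integer represented by an unsigned bit pattern."""
    sign_bit = 1 << (width - 1)
    # The most significant bit carries a negative weight.
    return (pattern & (sign_bit - 1)) - (pattern & sign_bit)


pattern = to_twos_complement(-5, width=8)
print(format(pattern, "08b"))  # 8-bit pattern of -5
print(from_twos_complement(pattern, width=8))
```

Decoding treats the most significant bit as a negative weight, which is equivalent to the invert-and-add-one rule.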
2.1.2 Abbreviations
The following abbreviations are used in this report:
ALU: Arithmetic and Logic Unit.
ASCII: American Standard Code for Information Interchange. This is the
standard used for the data exchanged between the PC and the FPGA.
ASIC: Application Specific Integrated Circuit. These are integrated circuits
that are designed or configured for a specific use case or application.
ARPACK: Arnoldi PACKage. This is a software library, coded in FORTRAN,
which can be used to solve eigenvalue problems.
BGA: Ball Grid Array.
CLB: Configurable Logic Blocks. These are logic elements on FPGAs used to
implement circuits.
CPLD: Complex Programmable Logic Device.
CPU: Central Processing Unit.
DE: Differential Equations.
DFPM: Dynamic Functional Particle Method.
FPGA: Field Programmable Gate Array.
FPU: Floating-Point Unit.
HDL: Hardware Description Language. These are languages in which hardware
is described textually and then synthesized or simulated in an ISE or IDE.
IDE: Integrated Design Environment.
IOB: Input Output Block. These are ports for input and output to and from the
FPGA.
ISE: Integrated Synthesis Environment. This is software for synthesizing
designs done in HDL. Xilinx ISE is an example.
LAPACK: Linear Algebra PACKage. This is a library written in FORTRAN
which can be used to solve problems in linear algebra.
LDE: Linear Differential Equations.
LSB: Least Significant Bit.
LUT: Look Up Table.
MATLAB: This is a software platform and high-level language used for pro-
gramming and simulations.
MCU: Microcontroller.
MSB: Most Significant Bit.
N/A: Not Applicable.
RAM: Random Access Memory.
RX: Receive. This is a pin through which data is to be received on a transceiver
port.
TX: Transmit. This is a pin through which data is to be transmitted on a trans-
ceiver port.
UART: Universal Asynchronous Receiver Transmitter.
USB: Universal Serial Bus.
VGA: Video Graphics Array. This is a standard for image display.
VHDL: VHSIC Hardware Description Language. In this project, VHDL was
used for digital hardware design.
VHSIC: Very High Speed Integrated Circuit.
2.2 DFPM algorithm
The dynamic functional particle method (DFPM) is widely applicable to solving
a number of different problems when defined as a system of linear or
differential equations. However, the focus of this project work is on the
application of DFPM to solve the classical A*X = B system of linear equations
as described in Chapter 1 of this report.
The algorithm is simply a two-step computation which is iterated until
convergence (or a specified level of convergence) is reached. Convergence is
checked by evaluating the Euclidean norm of the difference between vector B
and the product of matrix A and vector X, and comparing it with a
predetermined scalar value representing the acceptable tolerance of the
computation.
The algorithm requires several inputs. Three vectors of size n are needed:
vector B from the problem statement and the vectors X and V which are used
in the algorithm. An n x n matrix is also required, corresponding to the
A matrix in the problem statement. Finally, three scalar inputs, Dt, mu and
tolerance, are expected; they represent the discretization step, the
damping factor and the tolerance respectively.
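The iteration described above can be sketched in Python as follows. The inputs (A, B, X, V, Dt, mu and the tolerance) and the convergence check follow the description in this section, but the exact form of the two update steps is an assumption made for this sketch: a damped second-order scheme in which V is updated from the residual and X is then updated from the new V. The flowchart in Figure 2.2 remains the authoritative description, and the test problem and parameter values are illustrative.

```python
import math


def dfpm_solve(A, B, X, V, dt, mu, tolerance, max_iterations=10000):
    """Iterate an assumed damped update until ||B - A*X|| < tolerance.

    Returns the solution estimate and the number of iterations used.
    """
    n = len(B)
    X, V = list(X), list(V)
    for iteration in range(max_iterations):
        # Residual B - A*X, also used for the convergence check.
        residual = [B[i] - sum(A[i][j] * X[j] for j in range(n)) for i in range(n)]
        if math.sqrt(sum(r * r for r in residual)) < tolerance:
            return X, iteration
        # Step 1 (assumed form): new V from the residual and the damping term.
        V = [V[i] + dt * (residual[i] - mu * V[i]) for i in range(n)]
        # Step 2 (assumed form): new X from the new V.
        X = [X[i] + dt * V[i] for i in range(n)]
    raise RuntimeError("no convergence within max_iterations")


# Small symmetric positive definite test problem (illustrative values).
A = [[4.0, 1.0], [1.0, 3.0]]
B = [1.0, 2.0]
X, iterations = dfpm_solve(A, B, [0.0, 0.0], [0.0, 0.0],
                           dt=0.1, mu=3.0, tolerance=1e-9)
print(X, iterations)
```

Whether and how quickly such a scheme converges depends on the matrix and on the choice of Dt and mu; the values above were chosen only so that this small example converges.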
Figure 2.2 A flowchart of the DFPM algorithm
A MATLAB sample code implementing the algorithm in Figure 2.2 above is
included in this report (see Appendix D: MATLAB codes).
3 Methodology
As stated in the introductory part of this report, one of the purposes of
this project is the reduction of CPU time. Hence, significant attention was
paid to the computational processes implemented in this design, as well as
their impact on speed and on resource use on the FPGA. This chapter
describes the methodologies and considerations that influenced the design
and implementation described in the following chapter.
The preference of an FPGA over traditional CPUs and other types of pro-
cessing units is a consequence of the advantages offered by operational con-
currency that is characteristic of FPGAs and CPLDs.
After having chosen a design concept, the next biggest challenge was the
design itself. The design in this project work was done in VHDL (VHSIC
Hardware Description Language). While there are other languages and ap-
proaches to similar hardware design, VHDL was chosen because of the ease
with which it can be used to manage large projects, as well as the author’s
familiarity with it.
3.1 Concurrence vs. sequentiality
A limitation that was encountered early in the course of the design was the
limited number of dedicated multipliers available on the FPGA, which limits
the number of multiplications that can be executed concurrently.
An important focus of this work is speed optimization, for which concurrency
is key in this implementation. However, a balance needed to be struck between
concurrency and sequentiality. Hence some operations were run in parallel
while others were sequential. Addition and subtraction operations were most-
ly concurrent while some multiplicative operations were sequential and others
parallel.
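The trade-off can be illustrated with a rough cycle count. The sketch below assumes a simplified model in which every dedicated multiplier delivers one product per clock cycle and all other costs are ignored; the multiplier counts are hypothetical and do not describe the actual design.

```python
import math


def multiply_cycles(products, multipliers):
    """Clock cycles needed when each multiplier yields one product per cycle."""
    if multipliers < 1:
        raise ValueError("at least one multiplier is required")
    return math.ceil(products / multipliers)


n = 5                # problem size used in this project (5x5 matrix)
products = n * n     # scalar multiplications in one A*X product
for multipliers in (1, 5, 25):
    print(multipliers, multiply_cycles(products, multipliers))
```

In this model a fully sequential design needs one cycle per product, a fully parallel one needs a multiplier per product, and intermediate choices trade cycles against multipliers.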
3.2 Numerical representation
The dynamic functional particle method involves an iterative process with a
number of multiplications, subtractions and additions at each stage. The
algorithm was implemented in MATLAB and run, and the results of the
computations at each stage of the iteration were output to the console and
examined.
This examination showed that the computed values spanned both the positive
and the negative parts of the number line, so a scheme was needed for
distinguishing negative from positive values. The values also contained
fractional as well as integer parts, so a representation of fractions was
needed too.
3.3 Modularity
In order to simplify the design, the whole project was split into two major
top modules. One of them implemented the DFPM algorithm and the necessary
iterative computations, while the other implemented UART communication and
data exchange between the UART hardware on the FPGA board and the port on the
PC with which it communicates. This second module was also responsible for
converting the 8-bit parallel data to 33-bit numbers in the format expected by
the DFPM algorithm module.
Each of these top modules was subdivided into smaller modules which carried
out specific functions and communicated with other modules through signals
and inter-module data exchange.
The details of the design are discussed in Chapter 4.
4 Design
The digital hardware designed in VHDL consisted of combinatorial and
synchronous circuits, which were coded as IO ports, modules, processes and
signals. The combinatorial circuit elements responded to input changes
effectively instantaneously, while synchronous circuit activity took place at
the edge of the clock.
The complete design was made up of several modules exchanging information
with the aid of signal input and output via their ports. Since the design is
reasonably complex and large, an attempt was made to give each module a
name that signified or helped to identify the purpose and function of the
modules.
The core of the design consisted of the modules which executed the DFPM
algorithm. An overview of these core modules and their interaction is
presented in Figure 4.1.
4.1 The DFPM algorithm
The dynamic functional particle method is widely applicable to many problem
models as stated in Chapter 2 of this report. However, in order to design a
circuit that specifically solves the A*X = B problem, one needs to understand
the step by step procedure of applying DFPM to the problem. Various imple-
mentations of DFPM in MATLAB, C++ and VHDL as applied in this thesis are
included in the appendix.
The procedure requires access to the input vectors and matrix, whose elements
make up the coefficients of the system of equations. The next step is the
iterative computation, after which comes the output. Throughout the process,
the values of vector B, matrix A, Dt and the damping factor (mu) remain fixed,
while the values of vectors X and V may be modified at the end of each
iteration.
Each stage of the iterative computation comprises two steps: the
approximation calculation and the convergence check. The approximation
calculation takes the form of matrix multiplication, subtraction and addition
operations, while the convergence check requires a comparison of a
predetermined tolerance value with the Euclidean norm of the vector V.
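The two steps of an iteration stage can be sketched in Python as follows. This is a minimal sketch and not the thesis code: the update rule (V advanced by Dt times B - A*X - mu*V, then X advanced by Dt times the new V) is an assumption consistent with the module descriptions in Section 4.5, and the convergence check here uses the residual B - A*X as described for the tolerance check module. The function name `dfpm_solve`, the 2 by 2 example system and the values of `dt` and `mu` are hypothetical and were chosen so that this small example converges.

```python
def dfpm_solve(A, B, dt, mu, tol, max_iter=10_000):
    """Iterate X and V until the squared residual norm drops below tol**2."""
    n = len(B)
    X = [1.0] * n  # initial X and V of all ones, as in Table 4.2
    V = [1.0] * n
    for k in range(max_iter):
        # Residual R = B - A*X, one row-by-vector product per row of A.
        R = [B[i] - sum(A[i][j] * X[j] for j in range(n)) for i in range(n)]
        # Convergence check on the squared norm, which avoids a square root.
        if sum(r * r for r in R) < tol * tol:
            return X, k
        # Approximation step (assumed form): new V first, then new X from new V.
        V = [V[i] + dt * (R[i] - mu * V[i]) for i in range(n)]
        X = [X[i] + dt * V[i] for i in range(n)]
    raise RuntimeError("no convergence within max_iter iterations")


# Hypothetical example system with exact solution X = [0.8, 1.4].
A = [[2.0, 1.0], [1.0, 3.0]]
B = [3.0, 5.0]
X, iterations = dfpm_solve(A, B, dt=0.5, mu=1.0, tol=1e-6)
print([round(x, 4) for x in X], iterations)
```

Whether a given combination of Dt and mu converges depends on the matrix A, so the parameter values above should not be read as recommendations for other problems.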
Figure 4.1. An overview of the core modules of the DFPM algorithm
4.2 Project top module
The topmost level container for the project HDL code was named
DFPM_ON_FPGA_TOP_MODULE. This module functioned as the overall top
module, containing all VHDL code relevant to the project design. It consisted
of two top modules which served two distinctly important functions. The
modules were named “UART_INTERFACE” and
“Signed_DFPM_Iteration_Control_Top_Module”. The complete VHDL code
for all the modules will be included as an appendix to this report.
4.2.1 The two top sub-modules
The communication top module was designed to handle communication with
the PC through the UART port and the UART VHDL code that controlled it.
Data received from the PC, which would normally be in 8 bits, were converted
to 33 bits in the format stated in section 3.2.2 of this report. The data were
also accumulated in arrays internal to this module until all data relevant to
the specific problem model had been received. The data would then be sent as
output through the ports of this module.
The Signed DFPM Iteration control module receives a stream of 33-bit data in a
format specified in its design, which mathematically describes the problem
being solved. The data received would then be subjected to the DFPM algo-
rithm, after which a solution would be obtained and sent out as an output
through the ports of this module.
At the conclusion of the Signed DFPM Iteration Control module’s computa-
tion, the output signal would be returned to the Communication top module
which reconverts the solution by first translating the result into human reada-
ble decimal equivalent before serially shifting the values out in 8 bits through
the UART interface.
4.2.2 Data type conversion
The communication top module handles data as standard logic vectors and
standard logic signals while the Signed DFPM Iteration Control module han-
dles data as signed bit vectors for all vectors.
This necessitated the conversion of the data signal types from standard logic
vectors to signed bits and vice versa. This was done with the aid of
predefined conversion functions which are standard in VHDL. The conversion
takes place in the project top module.
4.3 Project defined packages
The input data for each problem consisted of scalar data, many vectors and
some multi-dimensional matrices. Hence a specific format was designed for
easy recognition and handling of these vectors and matrices. Because these
design-specific vector data types were often handled and shared between
multiple modules in the project, it was considered advantageous to create
special packages to define them.
The specific formats designed are described below:
1. DFPM_VECTOR_5X32_BIT: A data type defining an array of 5 standard
logic vectors. Representative of a 5 by 1 vector of standard logic type
data.
2. DFPM_VECTOR_25X32_BIT: A data type defining an array of 5
DFPM_VECTOR_5X32_BIT. I.e. a multidimensional array equivalent to
a 5 by 5 matrix of standard logic vector type data.
3. DFPM_ARRAY_5X32_BIT: A data type defining an array of 5 signed bit
vectors. It was used to represent 5 by 1 vectors containing signed data.
4. DFPM_ARRAY_25x32_BIT: A data type defining an array of 5
DFPM_ARRAY_5X32_BIT. This is equivalent to a 5 by 5 multidimen-
sional array of signed data.
These packages were used to ease the process of design and implementation
and also facilitated a unified standard between modules.
4.4 Communication top module
The communication top module comprised 8 sub-modules. The modules
and their functionalities are briefly described below.
4.4.1 UART
These are the modules controlling the UART circuitry.
1. RS232RefComp: This module was released by Digilent Inc. as a sample
code for an implementation of a UART core for the Nexys2 board. It is
the only purely non-original code used in this project.
It is a simple implementation of UART designed in VHDL and it is re-
sponsible for 1 bit serial data transmission and reception, as well as the
conversion of 1-bit serial to 8-bit parallel data and transmission to the
on-board electronic hardware.
2. UART_INTERFACE: This module was used to control the RS232RefComp
circuit. It determines when the UART core should transmit data, receive
data or neither.
This module is a simple four-state state machine. The states correspond
to:
a. Receive state: When the UART core is switched to receive data.
b. Waiting state: When both the UART interface and the UART core
do nothing but wait for data from the DFPM module.
c. Send state: When the UART module is switched to send 8 bits of data.
d. RepeatSend state: A transitional state which the module enters after
sending each 8-bit data item and before sending the next. This helps to
ensure that the data transmission between the UART INTERFACE
and the UART core is hitch-free.
The control of the UART core from the UART INTERFACE and feed-
back from the UART core was facilitated with the aid of four signals namely
wrSig, rdSig, TBESig and RDASig. These signals and their effect on the UART
core are outlined in Table 4.1 below.
Table 4.1 Table of control signals and their effect on the state of the
UART core
Signal   Value   UART module status
                 Transmit   Receive
wrSig    0       Off        N/A
wrSig    1       On         N/A
rdSig    0       N/A        On
rdSig    1       N/A        Off
Feedback from the UART core was received through the TBE and RDA signals,
which, when raised high, indicated that data had been transmitted or that new
data had been read, respectively.
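The four-state behaviour of the UART_INTERFACE module described in this section can be sketched as a small state machine. The state names follow the list in Section 4.4.1, but the transition conditions (`all_data_received`, `result_ready`, `bytes_left`) and the return to the Receive state after the last byte are hypothetical; the thesis text does not spell them out, so this is an illustration under those assumptions rather than a model of the VHDL code.

```python
from enum import Enum, auto


class State(Enum):
    RECEIVE = auto()
    WAITING = auto()
    SEND = auto()
    REPEAT_SEND = auto()


def next_state(state, all_data_received=False, result_ready=False, bytes_left=0):
    """Return the next state. All transition conditions are assumed."""
    if state is State.RECEIVE:
        return State.WAITING if all_data_received else State.RECEIVE
    if state is State.WAITING:
        return State.SEND if result_ready else State.WAITING
    if state is State.SEND:
        return State.REPEAT_SEND  # one byte has been handed to the UART core
    # REPEAT_SEND: go back for the next byte, or return to receiving.
    return State.SEND if bytes_left > 0 else State.RECEIVE


# Walk through one hypothetical session in which two bytes are sent.
state = State.RECEIVE
trace = [state.name]
for inputs in [
    dict(all_data_received=True),
    dict(result_ready=True),
    dict(),                # first byte sent
    dict(bytes_left=1),    # one more byte pending
    dict(),                # second byte sent
    dict(bytes_left=0),    # nothing left to send
]:
    state = next_state(state, **inputs)
    trace.append(state.name)
print(" -> ".join(trace))
```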
4.5 Iteration control top module
This module is made up of the circuitry that implements the DFPM algorithm.
The sub-modules were designed to carry out the various computations and
logical evaluation required in the DFPM method.
1. Signed_Vector_Vector_Mult_5By1: This module computes the element-wise
product of two 5 by 1 vectors of 33-bit data. Its operation is concurrent
and all computation results are immediately available at the output when
the input values change.
2. Signed_Vector_Vector_5By1_Subtr: This module computes the element-wise
difference between two vectors. It concurrently performs subtraction
operations on two vectors containing five elements of 33-bit data type and
immediately assigns the result to the output.
3. Signed_SubtrAndMult_Ops_Module: This module instantiates the
vector multiplication and the vector subtraction modules above and us-
es them in the computation “B – A*X – mu*V” for each iteration stage of
the DFPM algorithm.
In this module, computation of the product of matrix A and vector X
was a combination of concurrent and sequential operations. The product
of one row of matrix A and the vector X was computed concurrently, but
since matrix A comprised 5 rows, the row products were pipelined in order
of row sequence.
4. Signed_New_V_Ops: This module computed a new value for the vec-
tor V at each iteration stage of the DFPM algorithm. The value was
based on the result of the operations carried out in the subtraction and
multiplication operations module, described in number 3 above.
5. Signed_New_X_Ops: This module computed a new value for the vector
X in each iteration stage of the DFPM algorithm. The new value for vec-
tor X is always dependent on the new value of vector V above.
6. Signed_Tolerance_Check: This module receives the value of B-A*X as
input and should then compare the Euclidean norm of the vector re-
ceived with the pre-fixed tolerance value. However, computing square
roots in FPGA can be problematic and introduce significant errors.
Hence, the square of the tolerance value was compared with the square
of the Euclidean norm, which is equivalent to the sum of the squares of
the elements that make up the vector input.
After comparison, if the square of the norm was found to be less than
the square of the tolerance level, a signal line would be raised and
the algorithm would terminate. The squares were computed by
self-multiplication with the aid of the Vector_Vector_Mult
module described above.
When the condition checked by this module is found to be true, conver-
gence is said to have been reached.
7. Signed_DFPM_One_Iteration: This module instantiated the subtraction
and multiplication module, the new V operation module, the new X
operation module and the tolerance check module. It connected their
inputs and outputs appropriately and comprises all the operations that
make up one iteration stage of the DFPM algorithm.
8. Signed_DFPM_Iteration_Control: This module instantiated the
Signed_DFPM_One_Iteration module. It feeds the new V and X vectors
back into the computational module and stops the iterations when con-
vergence is attained.
4.6 Implementation constraint
In order to translate, map and route the design done in VHDL to device specif-
ic circuit, an implementation constraints file named UCF_DFPM_TOP was
used. The file links input and output pins specified in the project top module
with the intended pin on the FPGA chip and demonstration board.
4.7 Parameters
The design was intended to make room for some level of easy configurability.
Thus, the initial values of vectors v and x, and the scalar discretization coeffi-
cient (dt), the tolerance and the damping factor (mu) can be changed inside the
DFPM modules. The UART module parameters can also be easily modified.
The default values for these parameters are listed below:
Table 4.2 Table of parameters and corresponding values used
S/N   Parameter                               Value used
1.    Vector V                                [1 1 1 1 1]
2.    Vector X                                [1 1 1 1 1]
3.    Damping factor                          0.1
4.    Discretization coefficient              1.0
5.    Tolerance                               $2^{-7}$
6.    UART baud rate                          9600
7.    Number of data bits per transmission    8
8.    Parity                                  odd
9.    Number of stop bits                     1
10.   Handshaking                             None
4.8 Data exchange format
The exchange of data between the PC terminal and the FPGA system needed
to be standardized in order for the data to be stored in the correct structure
and also for it to be usable by the DFPM computation modules.
The MATLAB approach for specifying vectors and matrices was, hence,
adopted.
To specify a problem set in the format usable by the DFPM module, each
problem set is enclosed in braces; the elements of each row of the matrix are
separated by whitespace and the rows of a matrix are separated by semicolons.
The solution output from the FPGA is transmitted using the same standard,
except that the opening and closing braces are omitted.
An example of the utilization is shown in Figure 4.2.
Figure 4.2 Image showing the terminal being used for data exchange be-
tween the FPGA and the PC
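The exchange format described in Section 4.8 could be handled on the PC side roughly as shown below. This is a sketch under assumptions: it takes the enclosing characters to be MATLAB-style square brackets, and the function names `parse_matrix` and `format_solution` are hypothetical. The four fractional digits in the output follow the limit mentioned in Section 6.6.

```python
def parse_matrix(text):
    """Parse a MATLAB-style string such as '[1 2; 3 4]' into a list of rows."""
    body = text.strip()
    if not (body.startswith("[") and body.endswith("]")):
        raise ValueError("expected the problem set to be enclosed in brackets")
    # Rows are separated by semicolons, elements by whitespace.
    rows = [row.split() for row in body[1:-1].split(";")]
    matrix = [[float(x) for x in row] for row in rows if row]
    if len({len(row) for row in matrix}) > 1:
        raise ValueError("rows have different lengths")
    return matrix


def format_solution(values):
    """Format a solution vector without enclosing brackets."""
    return " ".join(f"{v:.4f}" for v in values)


print(parse_matrix("[1 2; 3 4]"))
print(format_solution([-0.2211, 0.0776]))
```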
4.9 Signed numerical representation
Since digital systems only deal with binary arithmetic for numerical computa-
tions and representation, the numbers handled in the DFPM algorithm were
represented by using signed bits. This decision helped to ensure that positive
and negative numbers were distinguished from one another.
The downside of this approach was that the bit being used for sign representa-
tion could not be used for numerical value representation. Therefore an extra
bit needed to be added to the number of bits representing each signed number
in order to make up for the shortfall.
4.10 Integer and fractional representation
Another important consideration in the design was the representation of
fractional values. It was decided that binary digits after the radix point will be
represented and treated like whole integers i.e. shifted to the left. At the end of
all computations, the result will also be shifted to the right by the appropriate
number of binary digits to make up for the left shift. This process is a simple
scheme that makes for the manipulation of fractions in a way that is similar to
whole numbers.
As a result, each number in the DFPM algorithm consisted of 33 bits. The MSB
indicated the sign of the number while the next 16 bits represented the integer
part of the value being handled. The fractional part of the number was then
represented by the least significant 16 bits.
Below is an image showing a sample numerical representation as used in the
design. It can be seen that the MSB is “0”, therefore it is a positive number.
The next 16 bits are equivalent to $9_{10}$ and the last 16 bits are equivalent
to 0.628906 (i.e. $2^{-1} + 2^{-3} + 2^{-8}$). Hence the number represented in
the image below is $+9.6289_{10}$.
Fig 4.3 Image showing the numerical representation scheme
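The 33-bit layout (1 sign bit, 16 integer bits, 16 fractional bits) can be checked with a short Python sketch. It assumes that negative values are stored in two's complement, as noted at the start of Chapter 5; the helper names `decode_fixed` and `encode_fixed` are hypothetical and do not appear in the VHDL design.

```python
FRAC_BITS = 16
WIDTH = 33


def decode_fixed(bits):
    """Interpret a 33-digit bit string as two's complement with 16 fractional bits."""
    if len(bits) != WIDTH or set(bits) - {"0", "1"}:
        raise ValueError("expected 33 binary digits")
    raw = int(bits, 2)
    if bits[0] == "1":  # sign bit set: subtract 2**33 to get the negative value
        raw -= 1 << WIDTH
    return raw / (1 << FRAC_BITS)


def encode_fixed(value):
    """Inverse of decode_fixed, rounding to the nearest representable value."""
    raw = round(value * (1 << FRAC_BITS))
    return format(raw & ((1 << WIDTH) - 1), f"0{WIDTH}b")


# The sample from the text: sign 0, integer part 9, fraction 2^-1 + 2^-3 + 2^-8.
sample = "0" + format(9, "016b") + "1010000100000000"
print(decode_fixed(sample))  # 9.62890625
```

For positive numbers such as the sample above, the result does not depend on whether the sign bit is read as a plain sign flag or as part of a two's-complement value.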
The multiplication of two numbers with n fractional binary digits
results in a product with 2n fractional binary digits. This scheme, therefore,
offers an advantage in multiplication operations since it ensures that
multiplicative operations maintain a precision of $2^{-8}$ (decimal) for each
operation.
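The doubling of fractional digits in a product, and the shift needed to return to the original scale, can be illustrated as follows. This sketch uses 16 fractional bits and a plain right shift (truncation); how the VHDL design truncates or rounds the product is not stated in the text, so that choice and the helper names are assumptions.

```python
FRAC_BITS = 16


def to_fixed(value):
    """Scale a real number to an integer with FRAC_BITS fractional bits."""
    return round(value * (1 << FRAC_BITS))


def fixed_mul(a, b):
    """Multiply two scaled integers.

    The raw product carries 2 * FRAC_BITS fractional bits, so it is shifted
    right by FRAC_BITS to return to the original scale.
    """
    return (a * b) >> FRAC_BITS


def to_float(raw):
    return raw / (1 << FRAC_BITS)


product = fixed_mul(to_fixed(1.5), to_fixed(-2.25))
print(to_float(product))  # -3.375
```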
4.11 Spartan 3E-1200 FG320 FPGA
The Spartan 3E-1200 FG320 FPGA is a standard performance 320-ball fine pitch
ball grid array FPGA chip with 1.2 million gates, 136 K RAM, 28 dedicated
multipliers and 250 user IO pins [7]. The chip is made up of five functional
elements which are the Digital Clock Managers (DCMs), the Input/Output
Blocks (IOBs), Configurable Logic Blocks (CLBs), dedicated multipliers and
block RAMs.
The dedicated multipliers are able to directly compute 18-bit by 18-bit
multiplication in two’s complement, while the IOBs can be used for data input
to and output from the FPGA. The 136 K RAM is equivalent to 139264 bits
(136 * 1024 bits) of memory available for storage on the chip. The logic of the
combinatorial and synchronous circuits resulting from the VHDL design is
mainly implemented in the CLBs on the chip.
4.12 Nexys2 FPGA demonstration board
The Nexys2 FPGA demonstration board is a hardware platform, designed and
manufactured to accommodate and support the Spartan 3E FPGA, enable a
demonstration of its capabilities and provide some standard hardware periph-
eral access to the chip.
It can be powered via USB, battery or wall socket and runs on a 50 MHz oscil-
lator while featuring 16 MB SDRAM and flash and an impressive array of
standard hardware interfaces like VGA, USB, RS232 ports as well as switches,
buttons and a quad digit seven segment display [8].
Figure 4.4 Image showing a Nexys2 FPGA demonstration board
4.13 Xilinx ISE
Hardware design was done with Xilinx ISE (Integrated Synthesis Environment)
and the generated design was then downloaded onto the FPGA. Xilinx ISE
is free software developed by Xilinx for hardware design and for programming
its FPGAs.
There are a number of other design/synthesis environment applications for
hardware design, e.g. Altera’s Quartus II design environment. However,
Xilinx ISE was the obvious choice because it was offered by the
vendor of the FPGA chip used, and because it provides out-of-the-box
support for the FPGA chip and the board used.
4.14 ISim simulation software
ISim simulator software is a software application for the simulation of HDL
code which is bundled with the Xilinx ISE software suite. It is easy to use and
provides support for mixed languages, multi-threaded compilation, and dis-
plays the circuit behavior with the aid of waveforms on the screen.
ModelSim is another simulation tool that could have been used, but due to its
usage restrictions and the author’s familiarity with ISim, ISim was chosen
over ModelSim.
4.15 Design verification
For each module designed in this project, a test-bench was written for testing,
simulation and verification of its functionality and behavior. Test-benches, in
this context, refer to VHDL code written for the purpose of simulating opera-
tional circumstances of the designed module in question. The modules being
tested are normally referred to as unit under test (UUT).
4.16 The complete design
The complete system integrated these different modules and connected them
while doing type conversion in the top module where appropriate. The incom-
ing data from the UART were converted to signed bit vectors and stored in
memory on the FPGA until all the data necessary for each problem set were
received.
After this, a signal that activates the DFPM computation module is raised so
that computation can start. The complete design made use of 26 multipliers, 12
IOB pins and 3243 LUTs. While the utilization of multipliers was 92%, the
utilization of logical and IO blocks was much lower. A copy of the project
report summary is included in the appendix of this report.
Figure 4.5 The Nexys2 board FPGA connected to a PC and running the
DFPM algorithm.
5 Results
Every module designed in Chapter 4 of this report was tested with a test-bench
written in VHDL. The test benches were written to simulate the expected
conditions and functional environment for each module. The simulations were
done in ISim and each module’s behavior was verified through visual
inspection and calculations. The test benches were not included in the
appendix of this report. The following are the results of the tests carried
out on the modules.
It is worth noting that since the values represented in this chapter are basically
binary, negative numbers were represented in two’s complement.
5.1 Simulation results
5.1.1 Element wise vector multiplication
The image below shows the result of the simulation of the vector multiplica-
tion module. Vectors 1 and 2 were input while vector_out was the output.
Fig 5.1 Test simulation for Signed_Vector_Vector_Mult module
Vector 1 = [5.0 3.0 2.0 4.0 7.0] and Vector 2 = [3.0 2.0 3.0 4.0 5.0]
The output was $1001110_2 = 78.0$.
By calculation: (5*3) + (3*2) + (2*3) + (4*4) + (7*5) = 78
This confirms that the module produced the expected result for this input.
5.1.2 Element-wise vector subtraction
Figure 5.2 Test simulation for Signed_Vector_Vector_5By1_Subtr module
Above is an image of the simulation waveform for the vector subtraction
module. The input vectors were named vectors 1 and 2 while the output was
named vector_out.
Vector 1 = [1.0 7.81e-3 11.72e-3 15.62e-3 19.53e-3]
Vector 2 = [15.0 3.91e-3 3.91e-3 3.91e-3 3.91e-3]
Vector out = [-14.0 3.91e-3 7.81e-3 11.72e-3 15.62e-3]
Simple calculation confirms that Vector 1 – Vector 2 = Vector out, to within
the rounding of the displayed values.
5.1.3 Evaluating new vector V
In the image below, the effect of operations pipelining can be seen as the
elements of vector_new_v assume new values one clock cycle after one anoth-
er. The iteration complete signal indicates the completion of the subtraction
and multiplication operations in each iteration stage.
Figure 5.3 Test simulation for Signed_New_V_Ops
5.1.4 Evaluating new vector X
Similar to the module in section 5.1.3 above, the effect of pipelining is seen in
the evaluation of vector_new_x. The signal new_v_ready signified that the
evaluation of the new value for vector V was complete and that the evaluation
process for vector x can start.
Figure 5.4 Test simulation for Signed_New_X_Ops
The signal new_X_ready is a signal line that indicated that the operation was
complete. The behavior was as expected.
5.1.5 Convergence check
The tolerance check module was simulated with two sets of values for vector
b_ax. The first set of values was set to be beyond the tolerance level while the
second set of values was set to be below the expected limit.
The signal “iteration complete” was raised at the end of each multiplication
and subtraction operation of the iteration stage. The convergence check module
completes its function in about seven clock cycles, after which the “iterate”
signal is driven high or low depending on the result of the convergence
check.
Figure 5.5 Test simulation for tolerance check module
It can be seen above that after the second set of values was received and
processed, the “iterate” signal was brought low. This is consistent with the
design concept.
5.1.6 DFPM top module
This simulation was done with the following input set:
Vector B
Matrix A
Vectors X and V
By visual inspection of the results from the simulation, the final value of vector
X on the output was calculated thus:
Vector X(0) is a negative number since the first bit is 1.
$111111111111111111100011101100101_2$ in two’s complement is equivalent to
$-000000000000000000011100010011010_2$ in unsigned binary. A simplified
approach to conversion of unsigned binary to and from two’s complement is
outlined in the appendix.
Figure 5.6 Test simulation for DFPM top module
Hence it is correct to state that:
Vector X(0) = $-(0.0 + 2^{-3} + 2^{-4} + 2^{-5} + 2^{-9} + 2^{-12} + 2^{-13} + 2^{-15})$.
Vector X(0) = -0.2211
In the same manner Vector X(1) is a negative number.
$111111111111111111110010001110011_2$ in two’s complement is equivalent to
$-000000000000000000001101110001100_2$ in unsigned binary. Hence,
Vector X(1) = $-(0.0 + 2^{-4} + 2^{-5} + 2^{-7} + 2^{-8} + 2^{-9} + 2^{-13} + 2^{-14})$
Vector X(1) = -0.1076
Vector X(2) , Vector X(3) and Vector X(4) are positive numbers since their MSB
are 0. Therefore conversion from two’s complement is not required for them.
Vector X(2) = 000000000000000000001001111100000
Vector X(2) = $+0.0 + 2^{-4} + 2^{-7} + 2^{-8} + 2^{-9} + 2^{-10} + 2^{-11}$
Vector X(2) = +0.0776
Vector X(3) = 000000000000000000011111001011011
Vector X(3) = $+0.0 + 2^{-3} + 2^{-4} + 2^{-5} + 2^{-6} + 2^{-7} + 2^{-10} + 2^{-12} + 2^{-13} + 2^{-15} + 2^{-16}$
Vector X(3) = +0.2436
Vector X(4) = 000000000000000000101101000100000
Vector X(4) = $+0.0 + 2^{-2} + 2^{-4} + 2^{-5} + 2^{-7} + 2^{-11}$
Vector X(4) = +0.3520
Therefore the final value of the solution vector in this simulation was
X = [-0.2211, -0.1076, 0.0776, 0.2436, 0.3520]
While the behavior seen above was consistent with design expectation, it was
considered that comparison with the output from a MATLAB implementation
would help to further verify the module’s behavior.
The values obtained from the MATLAB code and the VHDL simulations were
quite close as the MATLAB implementation produced vector X as shown
below:
X = [-0.2199, -0.1074, 0.0775, 0.2440, 0.3521]
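To put a number on how close the two results are, the element-wise difference can be computed from the values quoted in this section. The snippet is only a convenience for the reader; the variable names are arbitrary and the values are copied from the text above.

```python
# Solution vectors as quoted in Section 5.1.6.
fpga = [-0.2211, -0.1076, 0.0776, 0.2436, 0.3520]    # VHDL simulation
matlab = [-0.2199, -0.1074, 0.0775, 0.2440, 0.3521]  # MATLAB implementation

diffs = [abs(f - m) for f, m in zip(fpga, matlab)]
max_diff = max(diffs)
print(f"largest absolute difference: {max_diff:.4f}")
```

The largest deviation is in X(0); the remaining elements differ by 0.0004 or less.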
5.2 Comparison
The circuit implemented on FPGA was tested by connecting the FPGA to a PC
and sending in numbers that represented problem sets while the FPGA re-
turned the solution to the problems. Since the accuracy was crucial, the results
obtained during these tests were noted and compared with values obtainable
from the same algorithm implemented in MATLAB on a PC. The comparison
showed that the values obtained by both systems, for each problem set inves-
tigated, were approximately equal. A table comparing the results obtained
during two of these tests is shown below.
Table 5.1 Table of a comparison of the results obtained from two runs of
DFPM on different systems.
[Table contents not recoverable from the source. Columns: 1st test, 2nd test.
Rows: problem set (Vector A, Vector B); solution vector from MATLAB/PC
(binary: N/A, decimal); solution vector from FPGA (binary, decimal).]
6 Discussion
Based on the tests carried out on the VHDL design modules, the behavior of
the circuit was as expected. However, a number of implications need to be
discussed.
6.1 FPGA resource utilization
Because FPGAs have limited resources, there is a limit to the number of
multiplication operations that can be executed in parallel for problems of
the 5x5 matrix dimension implemented in this design. As matrix dimensions
grow, the number of concurrent operations possible is reduced
proportionately.
In this design, a problem defined by an n by n matrix and n-element
vectors requires n + 5 multipliers. This is because the matrix row-vector
multiplication in A*X was done concurrently for each row, while other
multiplication operations were done sequentially. Another limitation is the
data size expected by the dedicated multipliers.
The Spartan 3E multipliers are 18-bit multipliers by default and multiplication
operations involving data types bigger than 18 bits will consume even more
resources. As can be seen in the project report, the actual number of multipli-
ers used was 26 out of a total of 28.
6.2 Reduction in computation time
For every iteration stage of this design, the computation time for $(n-1)^2$
multiplication operations is saved. Thus, for a solution requiring m
iterations, the time required for $(n-1)^2 \cdot m$ multiplication operations
is saved per solution. For instance, a 5 by 5 design as implemented in this
project saves the computation time for 1600 multiplication operations for a
solution requiring a hundred iterations.
6.3 Larger problem sets
An approach to implementing this design for significantly larger problem sets
might be to section the complete data set into subsets containing small-sized
problem sets which the module is capable of handling. The solutions can then
be stored and reused as appropriate. At a point, this approach might encounter
limitations as well, due to the fact that the on-chip memory of FPGAs is also
limited. However, this was not the focus of this design.
6.4 UART bottleneck
Tests showed that each iteration stage of DFPM computation for a 5 by 5
dimensioned problem required 28 clock cycles. However, the data was being
received through a 9600 baud rate UART. The UART is, thus, slower than the
DFPM computations. In a case where large volumes of data may need to be
transmitted to the DFPM computation module, the UART may prove to be a
bottleneck. This problem might be mitigated with the use of a more parallel
communication mode and faster transmission rates.
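The size of the gap can be estimated from figures given elsewhere in this report. The sketch below assumes that the DFPM logic runs from the 50 MHz board oscillator (Section 4.12) and that each UART frame consists of 1 start bit, 8 data bits, 1 parity bit and 1 stop bit (based on Table 4.2); both are assumptions about the deployed configuration rather than measured values.

```python
CLOCK_HZ = 50_000_000           # Nexys2 oscillator frequency (assumed system clock)
CYCLES_PER_ITERATION = 28       # clock cycles per DFPM iteration stage
BAUD = 9600                     # UART baud rate
BITS_PER_FRAME = 1 + 8 + 1 + 1  # start + data + parity + stop (assumed framing)

iteration_time = CYCLES_PER_ITERATION / CLOCK_HZ  # seconds per iteration
byte_time = BITS_PER_FRAME / BAUD                 # seconds per transmitted byte

print(f"one iteration: {iteration_time * 1e9:.0f} ns")
print(f"one UART byte: {byte_time * 1e6:.0f} us")
print(f"iterations per byte time: {byte_time / iteration_time:.0f}")
```

Under these assumptions, the time needed to transfer a single byte corresponds to roughly two thousand iteration stages.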
6.5 Precision
Although a generous number of bits (16) was assigned to the fractional part,
there might be some challenges regarding the accuracy of the exact values
obtained from multiplication operations. This is because the result of the
multiplication of two 33-bit values is a 66-bit value. When this product is
stored back in a 32-bit data type container, some bits will be lost.
This problem will, most likely, not affect integer values in the DFPM computa-
tion but can result in some precision loss in the fractional representation.
6.6 Communication input/output limitations
Since the data received from the UART could not be used directly, modules
were written for the forward and reverse translation of the data transmitted to
and received from the DFPM computation module.
For instance, due to the translation done in the “UART_out_DFPM_in” mod-
ule, only single digit decimal numbers are expected as input data typifying the
problem set. Likewise, in order to reduce FPGA resource consumption, reverse
translation of the solution vector element sets was also limited to four fraction-
al digits.
6.7 Cross platform comparison
Since the goal of the project is to implement DFPM in an FPGA design that is
speed optimized, the CPU time consumed by the algorithm became an issue of
pertinent importance. However, since different computational devices have
varying architectures and processing speed, as well as operating systems, a
reasonable metric for the evaluation of the computation time that is independ-
ent of these parameters was needed in order to compare the performance of
the FPGA design with other implementations. The agreed metric was the
number of clock cycles used by the processing unit while executing the DFPM
algorithm.
Thus comparison was done between the DFPM computation done on the
FPGA and the same algorithm coded in C++ and run on a 2.4 GHz CPU PC.
The FPGA implementation completed the algorithm for solving the sample
problem used for testing the DFPM top module (according to simulation) in
57670 nanoseconds which is equivalent to 2883.5 clock cycles while the PC
used completed the same problem in 0.0156001 seconds.
The time used up by the PC included the time used for context switching and
kernel operations, in the operating system, as well as process user time. Provi-
sion was made in the C++ code used for implementing the algorithm and for
measuring the time taken.
In the C++ code, arrays with a dimension of 1000 were created for storing a
thousand copies of vectors A and B, and the DFPM algorithm was looped through
each copy of the same problem statement. Thus a thousand copies of the same
problem were treated with the same algorithm.
The large number of repetitions was needed because the amount of time spent by
the CPU in kernel mode was sometimes too low to be measured by the functions
used to measure the CPU process times when the algorithm was run only once.
Running the algorithm a thousand times therefore generated reasonably
measurable process times. The time spent by the CPU while not running the
actual algorithm was deducted from these, and the result was divided by 1000
to obtain the CPU time applicable to a single run of the DFPM algorithm.
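The measurement procedure described above can be sketched as follows. This is
only an illustration in Python of the repeat, deduct and divide idea, not the
C++ code used in the project; the function name `per_run_cpu_time` and its
parameters are hypothetical.

```python
import time


def per_run_cpu_time(workload, repeats=1000, overhead=None):
    """Estimate the CPU time of a single call to `workload`.

    The workload is run `repeats` times so that the total is large enough to
    be measured. If an `overhead` callable is given (for example an empty loop
    body), its total time is deducted before dividing by `repeats`.
    """
    def measure(fn):
        start = time.process_time()  # user + system CPU time of this process
        for _ in range(repeats):
            fn()
        return time.process_time() - start

    total = measure(workload)
    baseline = measure(overhead) if overhead is not None else 0.0
    return max(total - baseline, 0.0) / repeats
```

A call such as `per_run_cpu_time(solve_once, 1000, lambda: None)` would then
return the estimated time of one run, where `solve_once` stands for whatever
routine is being timed.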
Based on the test, and on the assumptions that the program was executed on
only one core of the CPU and that the CPU was not overclocked, the number of
clock cycles used by the PC per run = 2.4 × 10^9 × 0.0156001 / 1000 = 37440.24.
This is roughly 13 times the 2883.5 clock cycles used by the FPGA
implementation, which indicates that the FPGA implementation offers a great
advantage.
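The arithmetic behind this comparison can be reproduced with a few lines of
Python. The figures are the ones quoted in this section; the FPGA clock
frequency is not stated here and is only inferred from the quoted time and
cycle count.

```python
def cycles_from_time(seconds: float, clock_hz: float) -> float:
    """Number of clock cycles that elapse in `seconds` at `clock_hz`."""
    return seconds * clock_hz


# FPGA figures quoted above (from simulation).
FPGA_TIME_S = 57670e-9
FPGA_CYCLES = 2883.5
fpga_clock_hz = FPGA_CYCLES / FPGA_TIME_S  # inferred clock, about 50 MHz

# PC figures quoted above: total time for 1000 runs on a 2.4 GHz CPU.
PC_CLOCK_HZ = 2.4e9
PC_TIME_1000_RUNS_S = 0.0156001
pc_cycles = cycles_from_time(PC_TIME_1000_RUNS_S, PC_CLOCK_HZ) / 1000

ratio = pc_cycles / FPGA_CYCLES

print(f"inferred FPGA clock: {fpga_clock_hz / 1e6:.1f} MHz")
print(f"PC cycles per run:   {pc_cycles:.2f}")
print(f"PC / FPGA cycles:    {ratio:.2f}")
```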
It is worth noting that if the CPU had executed the program on multiple cores
or had been overclocked while running the program, the PC might have used more
cycles than stated above. Nonetheless, the calculations show that in both
cases the FPGA implementation of DFPM would still have been faster. A copy of
the C++ code is included in the appendices.
6.8 Output comparison
In order to ensure consistency of results and ease of operation, a MATLAB
script was written that communicates problem specifications to the FPGA and
receives its results. The MATLAB script also runs the algorithm on its own,
and the two outputs are printed to the screen and compared. The script is
described further in Appendix D, with the code included.
Using this script, three different problem sets were formulated and fed to the
DFPM on FPGA design. The results obtained are shown below, together with the
MATLAB plots of the values obtained during each test.
The plots have no units on the x and y axes since they were only used to
indicate the proximity between the results obtained; they show the location of
each result on the co-ordinate axes.
Figure 6.1 Plot of the values obtained during the first test
Table 6.1 Results obtained in tests with three different problem sets

Test      MATLAB implementation    FPGA implementation
Test 1    -2.4599e-01              -2.4715e-01
          -1.9253e-01              -1.9301e-01
          +5.8280e-03              +5.7221e-03
          +2.5866e-01              +2.5965e-01
          +5.0859e-01              +5.1057e-01
Test 2    -3.8910e-01              -3.9112e-01
          -1.5755e-01              -1.5810e-01
          +1.2061e-02              +1.1765e-02
          +2.6273e-01              +2.6343e-01
          +5.1339e-01              +5.1507e-01
Test 3    +6.5463e-01              +6.5653e-01
          +3.7920e-01              +3.7948e-01
          +3.1785e-01              +3.2008e-01
          +6.8058e-02              +6.8391e-02
          -1.8173e-01              -1.8323e-01
Figure 6.2 Plot of the values obtained during the second test
Figure 6.3 Plot of the values obtained during the third test
As can be seen in the figures and table above, in each of the three tests the
results of the MATLAB implementation and the FPGA implementation agreed so
closely that the point plots overlapped at each of the positions marked on the
plots. The differences in the values obtained are almost negligible: the
largest absolute difference in Table 6.1 is about 0.002.
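The closeness of the two sets of results can also be quantified directly from
Table 6.1. The short Python sketch below is only a check on the tabulated
values and is not part of the MATLAB script used in the project.

```python
# Values copied from Table 6.1: (MATLAB implementation, FPGA implementation).
RESULTS = {
    "Test 1": (
        [-2.4599e-01, -1.9253e-01, +5.8280e-03, +2.5866e-01, +5.0859e-01],
        [-2.4715e-01, -1.9301e-01, +5.7221e-03, +2.5965e-01, +5.1057e-01],
    ),
    "Test 2": (
        [-3.8910e-01, -1.5755e-01, +1.2061e-02, +2.6273e-01, +5.1339e-01],
        [-3.9112e-01, -1.5810e-01, +1.1765e-02, +2.6343e-01, +5.1507e-01],
    ),
    "Test 3": (
        [+6.5463e-01, +3.7920e-01, +3.1785e-01, +6.8058e-02, -1.8173e-01],
        [+6.5653e-01, +3.7948e-01, +3.2008e-01, +6.8391e-02, -1.8323e-01],
    ),
}


def max_abs_difference(reference, measured):
    """Largest element-wise absolute difference between two result vectors."""
    return max(abs(r - m) for r, m in zip(reference, measured))


for name, (matlab, fpga) in RESULTS.items():
    diff = max_abs_difference(matlab, fpga)
    print(f"{name}: max |MATLAB - FPGA| = {diff:.2e}")
```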
However, it is worth noting that these tests used single-digit data as
coefficients in the matrices and vectors that define the problem sets. It is
believed that the implementation can handle such matrix and vector data in
general, but the communication modules were limited, by design intent, to
single-digit input.
While the MATLAB implementation produced results that are very close, it may
be reasonable to expect some variation with other implementations and system
architectures due to differences in hardware and software design, as well as
system optimization, be it in hardware or software.
6.9 Communication possibilities
As indicated earlier in this discussion, the speed of the whole system was
limited by bottlenecks in the UART. However, considering that most
communication between electronic modules and components makes use of standard
protocols, of which UART is one, this design will still perform slightly
better and faster than most other designs that make use of sequential
processing.
Nonetheless, there are faster protocols that can be exploited to speed up the
rate of data exchange, and parallel communication can also be considered since
the FPGA has a substantial number of I/O (Input/Output) pins.
6.10 Applications
This design concept can find application in a large number of fields, ranging
from mathematical theory to real-world engineering design and systems. The
DFPM can be used to model systems in nature, for instance heat flow in a space
and fluid flow [10].
A great number of applications can also be found in electronics and
engineering in general. DFPM will prove very useful in solving least squares
and, possibly, weighted least squares problems in sensor fusion. This will be
useful in radar systems, telecommunications, multi-sensor networks, and the
mobile sensing and localization problems often encountered in systems
requiring self-localization, e.g. mobile robots and sound-source detecting
systems.
DFPM looks promising for the field of image and signal processing, especially
in problems requiring singular value decomposition (SVD). DFPM will also be
useful in mechanics, where complex linear and non-linear systems may need to
be modeled.
Solutions of large matrix problems often require significant computation and
computational resources, hence DFPM can be a very suitable and
resource-efficient approach to solving these problems. It will be even more
useful when the problem involves sparse matrices, a concept that is useful in
FEM-based simulations, which are used in all engineering fields [9].
A DFPM algorithm based on a smaller-dimensioned matrix that functions as a
sliding window through the matrix can serve as a very quick, efficient
approach that requires minimal computational resources.
6.11 Implications
While DFPM offers many advantages and developmental possibilities, there are
situations in which its efficiency could be exploited for negative purposes.
Certain aspects of data safety and integrity depend on hashing, and a
significant amount of computational resources and time is required to break
them. However, the advent of simpler algorithms and dedicated devices with
great computational power (e.g. FPGAs) facilitates access to supposedly
secured data by criminals.
7 Conclusions
It was found that the design approach met expectations and offered significant
advantages over traditional computational devices and methods. It was also
found that implementing the DFPM algorithm in an FPGA is an efficient approach
to reducing computation time and improving resource efficiency.
Since the DFPM algorithm is widely applicable to a number of other problems,
implementing the algorithm in a dedicated device that makes efficient use of
resources, while increasing the speed at which results are obtained, offers
many advantages.
7.1 Benchmark
In order to base the conclusions drawn in this project on criteria that are
independent of platform, the computation output and the number of clock cycles
were used.
Based on the result of a test carried out using the C++ snippet in Appendix A
on a mobile PC (Acer Aspire 5750) with dual CPU cores running at a 2.4 GHz
clock speed, it was observed that the algorithm required 75754 clock cycles on
the PC for a specific problem, while the same problem was completed in 3192
clock cycles using the FPGA implementation.
Despite the significant difference in computation time, computational
architecture and resources, the results obtained from both computations were
close enough to be regarded as equivalent.
Hence, the initial goals of the design were achieved and the expectation of
superior performance and resource efficiency was verified.
7.2 Further work
A lot can be improved in this design. Below is a list of possibilities:
1. Improving the forward translation modules so that they can handle
multi-digit decimal input in the problem set.
2. Modifying the module that reverse-translates the solution vector from
the DFPM top module so that it is able to handle the full range of
bits representing fractional values in the data type used in the design.
3. Designing the DFPM computational module to be able to handle larger
problem sets along with the possibility of handling multi-dimensional
problem sets.
4. Increasing the UART baud rate as well as making it configurable in use.
This will reduce the difficulty that can be encountered while setting up a
connection between the UART on the FPGA and the terminal application
software.
5. Enhancing the design so that it can handle multiple problem sets, i.e.
receive a problem set, solve it and return to wait for the next problem.
References
[1] S. Edvardsson, M. Gulliksson, J. Persson, et al., “The Dynamic Functional
Particle Method: An Approach for Boundary Value Problems”, J. Appl.
Mech. 79(2) 021012 (Feb 24, 2012).
[2] S. Edvardsson et al., Role of the dynamic functional particle method for
solving linear equations, Physical Review E. Statistical, Nonlinear, and
Soft Matter Physics.
[3] R. Sincovec, N. Madsen, Software for non-linear partial differential
equations, ACM Trans. Math. Softw. 1 (1975) 232-260.
[4] V. Pata, M. Squassina, On the strongly damped wave equation, Commun.
Math. Phys. 253 (2005) 511-533.
[5] F. Alvarez, On the minimization property of a second order dissipative
system in Hilbert spaces, SIAM J. Control Optim. 38 (2000) 1102-1119.
[6] B. Land, Hybrid Computing On an FPGA, Cornell University,
https://courses.cit.cornell.edu/ece576/DDA/FPGAhybridBRL.pdf, last
retrieved 2014-09-25.
[7] Xilinx Inc., 2013: Spartan 3-E FPGA family data sheet,
http://www.xilinx.com/support/documentation/data_sheets/ds312.pdf, last
retrieved 2014-09-25.
[8] Digilent Inc., 2011, Digilent Nexys2 Board Reference manual,
http://www.digilentinc.com/data/products/nexys2/nexys2_rm.pdf, last
retrieved 2014-09-25.
[9] Y. Saad, Iterative methods for sparse linear systems, 2nd ed., Society for
Industrial and Applied Mathematics, 2003.
[10] Ne-Zheng Sun, Applications of numerical methods to simulate the
movements of contaminants in groundwater, Environmental Health
Perspectives, Vol. 83 (Nov. 1989), pp. 97-115.
[11] ASCII Table, www.asciitable.com, last retrieved 2014-09-26.
Appendix A: Documentation of
developed program code
Design codes
Vector multiplication
--------------------------------------------------------------
-- Company: Mid Sweden University
-- Engineer: Taiyelolu Adeboye
--
-- Create Date: 10:42:33 01/07/2015
-- Design Name:
-- Module Name: Signed_Vector_Vector_Mult_5By1 - Behavioral
-- Project Name: DFPM on FPGA
-- Target Devices: Nexys2
-------------------------------------------------------------
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.std_logic_signed.all;
use work.DFPM_ARRAY_5X32_BIT.all;

-- Uncomment the following library declaration if using
-- arithmetic functions with Signed or Unsigned values
use IEEE.NUMERIC_STD.ALL;

-- Uncomment the following library declaration if instantiating
-- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;

entity Signed_Vector_Vector_Mult_5By1 is
    Port ( Vector_1   : in  DFPM_SIGNED_VECTOR_5X32_BIT;
           Vector_2   : in  DFPM_SIGNED_VECTOR_5X32_BIT;
           CLK        : in  STD_LOGIC;
           RST        : in  STD_LOGIC;
           Vector_Out : out Signed (32 downto 0));
end Signed_Vector_Vector_Mult_5By1;

architecture Behavioral of Signed_Vector_Vector_Mult_5By1 is

    -- Element-wise products: 33-bit by 33-bit operands give 66-bit results
    Signal Mult0, Mult1, Mult2,
           Mult3, Mult4 : Signed(65 downto 0) := (others => '0');

    -- Sum of the five products, with four extra bits of headroom
    Signal Sum : Signed(69 downto 0) := (others => '0');

begin

    Mult0 <= Vector_1(0) * Vector_2(0);
    Mult1 <= Vector_1(1) * Vector_2(1);
    Mult2 <= Vector_1(2) * Vector_2(2);
    Mult3 <= Vector_1(3) * Vector_2(3);
    Mult4 <= Vector_1(4) * Vector_2(4);

    Sum <= "0000" & Mult0 + Mult1 + Mult2 + Mult3 + Mult4;

    -- Keep bits 48 downto 16 of the sum as the 33-bit result
    Vector_Out <= Sum(48 downto 16);

end Behavioral;
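The arithmetic of the vector multiplication module can be modelled in software
as shown below. This is a minimal Python sketch rather than a description of
the synthesized circuit: it assumes that the 33-bit words carry 16 fractional
bits (suggested by the 48 downto 16 slice), and the helper names are
hypothetical.

```python
FRACTION_BITS = 16  # assumed number of fractional bits
WORD_BITS = 33      # width of Vector_Out in the listing above


def to_fixed(x: float) -> int:
    """Convert a real number to the assumed fixed-point integer format."""
    return int(round(x * (1 << FRACTION_BITS)))


def from_fixed(word: int) -> float:
    """Convert a fixed-point integer back to a real number."""
    return word / (1 << FRACTION_BITS)


def slice_48_downto_16(total: int) -> int:
    """Return bits 48..16 of `total` as a signed 33-bit integer."""
    raw = (total >> FRACTION_BITS) & ((1 << WORD_BITS) - 1)
    if raw >= 1 << (WORD_BITS - 1):
        raw -= 1 << WORD_BITS  # reinterpret as two's complement
    return raw


def dot_product_fixed(v1, v2):
    """Sum of element-wise products followed by the output slice."""
    total = sum(a * b for a, b in zip(v1, v2))
    return slice_48_downto_16(total)
```

For example, with `a = [to_fixed(x) for x in (1.5, -2.0, 0.25, 3.0, 0.0)]` and
`b = [to_fixed(x) for x in (2.0, 0.5, 4.0, 1.0, 7.0)]`, the call
`from_fixed(dot_product_fixed(a, b))` gives the dot product of the two real
vectors, since every value involved is exactly representable in this format.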
Vector subtraction
--------------------------------------------------------------
-- Company: Mid Sweden University
-- Engineer: Taiyelolu Adeboye
--
-- Create Date: 10:42:33 01/07/2015
-- Design Name:
-- Module Name: Signed_Vector_Vector_Mult_5By1 - Behavioral
-- Project Name: DFPM on FPGA
-- Target Devices: Nexys2
-------------------------------------------------------------

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.std_logic_signed.all;
use work.DFPM_ARRAY_5X32_BIT.all;

-- Uncomment the following library declaration if using
-- arithmetic functions with Signed or Unsigned values
use IEEE.NUMERIC_STD.ALL;

-- Uncomment the following library declaration if using
-- arithmetic functions with Signed or Unsigned values
--use IEEE.NUMERIC_STD.ALL;

entity Signed_Vector_Vector_5By1_Subtr is
    Port ( Vector_1   : in  DFPM_SIGNED_VECTOR_5X32_BIT;
           vector_2   : in  DFPM_SIGNED_VECTOR_5X32_BIT;
           CLK        : in  STD_LOGIC;
           RST        : in  STD_LOGIC;
           Vector_Out : out DFPM_SIGNED_VECTOR_5X32_BIT);
end Signed_Vector_Vector_5By1_Subtr;

architecture Behavioral of Signed_Vector_Vector_5By1_Subtr is

    -- One extra bit per element for the intermediate difference
    Signal Subtr0, Subtr1, Subtr2, Subtr3, Subtr4 : Signed(33 downto 0);

begin

    Subtr0 <= '0' & Vector_1(0) - vector_2(0);
    Subtr1 <= '0' & Vector_1(1) - vector_2(1);
    Subtr2 <= '0' & Vector_1(2) - vector_2(2);
    Subtr3 <= '0' & Vector_1(3) - vector_2(3);
    Subtr4 <= '0' & Vector_1(4) - vector_2(4);

    -- Only the low 33 bits of each difference are passed on
    Vector_Out(0) <= Subtr0(32 downto 0);
    Vector_Out(1) <= Subtr1(32 downto 0);
    Vector_Out(2) <= Subtr2(32 downto 0);
    Vector_Out(3) <= Subtr3(32 downto 0);
    Vector_Out(4) <= Subtr4(32 downto 0);

end Behavioral;
Subtraction and multiplication operations (Subtr_Ops_Module.vhd)
--------------------------------------------------------------
-- Company: Mid Sweden University
-- Engineer: Taiyelolu Adeboye
--
-- Create Date: 10:42:33 01/07/2015
-- Design Name:
-- Module Name: Signed_Vector_Vector_Mult_5By1 - Behavioral
-- Project Name: DFPM on FPGA
-- Target Devices: Nexys2
-------------------------------------------------------------

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.std_logic_signed.all;
use work.DFPM_ARRAY_5X32_BIT.all;
use work.DFPM_ARRAY_25X32_BIT.all;
use IEEE.NUMERIC_STD.ALL;


entity Signed_SubtrAndMult_Ops_Module is
    Port ( Vector_A  : in DFPM_SIGNED_VECTOR_25X32_BIT;
           Vector_B  : in DFPM_SIGNED_VECTOR_5X32_BIT;
           Vector_X  : in DFPM_SIGNED_VECTOR_5X32_BIT;
           Scalar_Mu : in SIGNED (32 downto 0);
           Vector_V  : in DFPM_SIGNED_VECTOR_5X32_BIT;

           CLK : in STD_LOGIC;
           RST : in STD_LOGIC;
           NEW_ITERATION      : in  STD_LOGIC := '0';
           ITERATION_COMPLETE : out STD_LOGIC := '0';

           B_Minus_AX           : out DFPM_SIGNED_VECTOR_5X32_BIT;
           B_Minus_Ax_Minus_muV : out DFPM_SIGNED_VECTOR_5X32_BIT);
end Signed_SubtrAndMult_Ops_Module;

architecture Behavioral of Signed_SubtrAndMult_Ops_Module is

    ------------------------------------------------


    -- This component will be used to evaluate
    -- the vector multiplication A*X.
    -- It takes two inputs of 5 by 1 vectors.
    COMPONENT Signed_Vector_Vector_Mult_5By1
    PORT(
        Vector_1   : IN  DFPM_SIGNED_VECTOR_5X32_BIT;
        Vector_2   : IN  DFPM_SIGNED_VECTOR_5X32_BIT;
        CLK        : IN  std_logic;
        RST        : IN  std_logic;
        Vector_Out : OUT Signed(32 downto 0)
    );
    END COMPONENT;

    -- This component will be used to evaluate the subtraction in B - Ax
    COMPONENT Signed_Vector_Vector_5By1_Subtr
    Port ( Vector_1   : in  DFPM_SIGNED_VECTOR_5X32_BIT;
           vector_2   : in  DFPM_SIGNED_VECTOR_5X32_BIT;
           CLK        : in  STD_LOGIC;
           RST        : in  STD_LOGIC;
           Vector_Out : out DFPM_SIGNED_VECTOR_5X32_BIT);
    END COMPONENT;

    ------------------------------------------------



    ------------------------------------------------
    -- Signals for storing the input values
    Signal Sig_Vector_A : DFPM_SIGNED_VECTOR_25X32_BIT := (
        ((Others => '0'), (Others => '0'), (Others => '0'), (Others => '0'), (Others => '0')),
        ((Others => '0'), (Others => '0'), (Others => '0'), (Others => '0'), (Others => '0')),
        ((Others => '0'), (Others => '0'), (Others => '0'), (Others => '0'), (Others => '0')),
        ((Others => '0'), (Others => '0'), (Others => '0'), (Others => '0'), (Others => '0')),
        ((Others => '0'), (Others => '0'), (Others => '0'), (Others => '0'), (Others => '0')));

    Signal Sig_Vector_B : DFPM_SIGNED_VECTOR_5X32_BIT :=
        ((Others => '0'), (Others => '0'), (Others => '0'), (Others => '0'), (Others => '0'));
    Signal Sig_Vector_X : DFPM_SIGNED_VECTOR_5X32_BIT :=
        ((Others => '0'), (Others => '0'), (Others => '0'), (Others => '0'), (Others => '0'));
    Signal Sig_Scalar_Mu: SIGNED (32 downto 0);
    Signal Sig_Vector_V : DFPM_SIGNED_VECTOR_5X32_BIT :=
        ((Others => '0'), (Others => '0'), (Others => '0'), (Others => '0'), (Others => '0'));


    -- The two signals below are used to connect the signals at the Vector_vector_Mult_Module
    -- to the corresponding vector indexes.
    -- These were used to avoid assigning dynamically changing signals directly to a static line
    Signal Sig_Vector_A_With_IndexPosition : DFPM_SIGNED_VECTOR_5X32_BIT :=
        ((Others => '0'), (Others => '0'), (Others => '0'), (Others => '0'), (Others => '0'));

    Signal Sig_Vector_A_Mult_X_With_IndexPosition : SIGNED (32 downto 0);

    -- The following two (2) signals will be used to store the products of the
    -- multiplication of Vectors A and X
    -- as well as Scalar mu and Vector V.
    Signal Sig_Vector_A_Mult_X : DFPM_SIGNED_VECTOR_5X32_BIT :=
        ((Others => '0'), (Others => '0'), (Others => '0'), (Others => '0'), (Others => '0'));
    Signal Sig_Vector_Mu_Mult_V : DFPM_SIGNED_VECTOR_5X32_BIT :=
        ((Others => '0'), (Others => '0'), (Others => '0'), (Others => '0'), (Others => '0'));

    -- The following two signals will be used to store the result
    -- of the subtraction operations
    Signal Sig_Vector_B_Minus_AX : DFPM_SIGNED_VECTOR_5X32_BIT :=
        ((Others => '0'), (Others => '0'), (Others => '0'), (Others => '0'), (Others => '0'));
    Signal Sig_Vector_B_Minus_AX_Minus_MuV : DFPM_SIGNED_VECTOR_5X32_BIT :=
        ((Others => '0'), (Others => '0'), (Others => '0'), (Others => '0'), (Others => '0'));

    -- This signal will only be raised for one clock cycle
    -- when there is a new set of data available for computation
    Signal DFPMCompute : STD_LOGIC := '0';

    -- This signal is used to communicate with other modules "downstream" of this module
    -- when the result of this module's computation is ready
    Signal Sig_ITERATION_COMPLETE : STD_LOGIC := '0';

    -- This signal will be used to represent the index position
    -- that will be progressively incremented as a means of pipelining
    -- data for multiplication in this module as well as input for the
    -- Vector_Vector_Multiplication module
    Signal MultplicationStageArrayPosition : integer := 0;

    -- This signal will be used to signal when the index position
    -- can be shifted and when data can be stored for output
    Signal Shift_Array_Position : STD_LOGIC := '0';

    -- This signal will be raised once when all the products of multiplication are ready.
    -- This is to enable the module to signal to other modules "downstream"
    -- that the result of the computation is ready
    Signal MultiplicationProductsReady : STD_LOGIC := '0';

    Signal ReadyFlag : STD_LOGIC := '0';

    -- This clock signal was created as a slowed down clock (half the pace of CLK)
    -- and will be used for clocking the shifting of the index position
    Signal Sig_Clk_For_Index_Shifting : STD_LOGIC := '0';


begin
    -- For Vector - Vector multiplication
    Vector_Vector_Mult: Signed_Vector_Vector_Mult_5By1 PORT MAP (
        Vector_1   => Sig_Vector_A_With_IndexPosition,
        Vector_2   => Sig_Vector_X,
        CLK        => CLK,
        RST        => RST,
        Vector_Out => Sig_Vector_A_Mult_X_With_IndexPosition);

    -- For Subtraction operations for B - AX
    Doing_B_Minus_AX : Signed_Vector_Vector_5By1_Subtr PORT MAP (
        Vector_1   => Sig_Vector_B,
        vector_2   => Sig_Vector_A_Mult_X,
        CLK        => CLK,
        RST        => RST,
        Vector_Out => Sig_Vector_B_Minus_AX);

    -- For Subtraction operations for B - AX - muV
    Doing_B_Minus_AX_Minus_MuV : Signed_Vector_Vector_5By1_Subtr PORT MAP (
        Vector_1   => Sig_Vector_B_Minus_AX,
        vector_2   => Sig_Vector_Mu_Mult_V,
        CLK        => CLK,
        RST        => RST,
        Vector_Out => Sig_Vector_B_Minus_AX_Minus_MuV);

    -- This signal will be used to signal that the output of this module is ready to be read.
    ITERATION_COMPLETE <= Sig_ITERATION_COMPLETE;


    -- This process determines when each iteration of the DFPM algorithm is to be started.
    -- Computation will only be done if it's a new iteration and it has not been completed before.
    -- Therefore this process sets DFPMCompute to '1' only on the rising edge of NEW_ITERATION
    -- and stores new values into the vectors only at the rising edge of NEW_ITERATION
    process(CLK, RST, Sig_ITERATION_COMPLETE, NEW_ITERATION)
        Variable NEW_ITERATION_Var : STD_LOGIC := '0';
    begin
        if rising_edge(CLK) then
            if (RST = '1') then
                DFPMCompute <= '0';
                NEW_ITERATION_Var := '0';
            elsif (Sig_ITERATION_COMPLETE = '1') then
                NEW_ITERATION_Var := '0';
                DFPMCompute <= '0';
            -- This more or less senses for the rising edge of NEW_ITERATION
            elsif (NEW_ITERATION = '1') and (NEW_ITERATION_Var = '0') then
                --if rising_edge(NEW_ITERATION) then
                NEW_ITERATION_Var := '1';

                Sig_Vector_A <= Vector_A;
                Sig_Vector_B <= Vector_B;
                Sig_Vector_X <= Vector_X;
                Sig_Vector_V <= Vector_V;
                Sig_Scalar_Mu <= Scalar_Mu;

                DFPMCompute <= '1';
            elsif (NEW_ITERATION = '1') and (NEW_ITERATION_Var = '1') then
                NEW_ITERATION_Var := '0';
                DFPMCompute <= '0';
            elsif (NEW_ITERATION = '0') then
                NEW_ITERATION_Var := '0';
                DFPMCompute <= '0';
            end if;
        end if;
    end process;


    -- This process determines the array positions to be multiplied together for A*X
    process(RST, Sig_ITERATION_COMPLETE, DFPMCompute, Shift_Array_Position,
            NEW_ITERATION, CLK, Sig_Clk_For_Index_Shifting, MultplicationStageArrayPosition,
            Sig_Vector_A, Sig_Vector_A_Mult_X_With_IndexPosition, Sig_Scalar_Mu,
            Sig_Vector_V)
        Variable MultplicationStageArrayPosition_Var : integer := 0;

    begin
        if (RST = '1') then
            MultplicationStageArrayPosition <= 0;
            Shift_Array_Position <= '0';
            MultiplicationProductsReady <= '0';

        elsif (Sig_ITERATION_COMPLETE = '1') then
            MultplicationStageArrayPosition <= 0;
            Shift_Array_Position <= '0';

        elsif (DFPMCompute = '1') then -- Checking for the rising edge of NEW iteration here
            MultplicationStageArrayPosition <= 0;
            Shift_Array_Position <= '1';
            MultiplicationProductsReady <= '0';

            -- Sig_Vector_A_With_IndexPosition <= Sig_Vector_A(0);
            -- Sig_Vector_A_Mult_X(0) <= Sig_Vector_A_Mult_X_With_IndexPosition;
            -- productTempStore := Sig_Scalar_Mu * Sig_Vector_V(0);
            -- Sig_Vector_Mu_Mult_V(MultplicationStageArrayPosition) <= productTempStore(48 downto 16);

        elsif (Shift_Array_Position = '1') then
            if rising_edge(Sig_Clk_For_Index_Shifting) then
                if (MultplicationStageArrayPosition = 5) then
                    MultplicationStageArrayPosition <= 0;
                    Shift_Array_Position <= '0';
                    MultiplicationProductsReady <= '1';
                else
                    MultplicationStageArrayPosition_Var := MultplicationStageArrayPosition;
                    MultplicationStageArrayPosition <= MultplicationStageArrayPosition_Var + 1;
                end if;
            end if;
        end if;
    end process;

    process(CLK, DFPMCompute, Shift_Array_Position, MultplicationStageArrayPosition)
        Variable productTempStore : Signed(65 downto 0);
    begin
        if rising_edge(CLK) then
            if (Shift_Array_Position = '1') and (MultplicationStageArrayPosition < 5) then
                case MultplicationStageArrayPosition is
                    when 0 =>
                        Sig_Vector_A_With_IndexPosition <= Sig_Vector_A(0);
                        Sig_Vector_A_Mult_X(0) <= Sig_Vector_A_Mult_X_With_IndexPosition;
                        productTempStore := Sig_Scalar_Mu * Sig_Vector_V(0);
                    when 1 =>
                        Sig_Vector_A_With_IndexPosition <= Sig_Vector_A(1);
                        Sig_Vector_A_Mult_X(1) <= Sig_Vector_A_Mult_X_With_IndexPosition;
                        productTempStore := Sig_Scalar_Mu * Sig_Vector_V(1);
                    when 2 =>
                        Sig_Vector_A_With_IndexPosition <= Sig_Vector_A(2);
                        Sig_Vector_A_Mult_X(2) <= Sig_Vector_A_Mult_X_With_IndexPosition;
                        productTempStore := Sig_Scalar_Mu * Sig_Vector_V(2);
                    when 3 =>
                        Sig_Vector_A_With_IndexPosition <= Sig_Vector_A(3);
                        Sig_Vector_A_Mult_X(3) <= Sig_Vector_A_Mult_X_With_IndexPosition;
                        productTempStore := Sig_Scalar_Mu * Sig_Vector_V(3);
                    when 4 =>
                        Sig_Vector_A_With_IndexPosition <= Sig_Vector_A(4);
                        Sig_Vector_A_Mult_X(4) <= Sig_Vector_A_Mult_X_With_IndexPosition;
                        productTempStore := Sig_Scalar_Mu * Sig_Vector_V(4);
                    when Others =>
                        NULL;
                end case;
                -- -- Setting the corresponding Vector_A element as the input to the Vector_Vector_Mult_Module
                -- Sig_Vector_A_With_IndexPosition <= Sig_Vector_A(MultplicationStageArrayPosition);
                -- -- Connecting the output of the Vector_Vector_Mult module to the corresponding A_Mult_X index
                -- Sig_Vector_A_Mult_X(MultplicationStageArrayPosition) <= Sig_Vector_A_Mult_X_With_IndexPosition;
                -- -- Doing mu*V
                -- productTempStore := Sig_Scalar_Mu * Sig_Vector_V(MultplicationStageArrayPosition);
                Sig_Vector_Mu_Mult_V(MultplicationStageArrayPosition) <= productTempStore(48 downto 16);
            end if;
        end if;
    end process;


    -- This process clears ITERATION_COMPLETE and
    -- only sets it to 1 when the MultiplicationProductsReady signal is high.
    -- At the rising_edge of MultiplicationProductsReady, the vectors
    -- B_Minus_AX and B_Minus_Ax_Minus_muV are assigned.
    process(CLK, RST, DFPMCompute, MultiplicationProductsReady, ReadyFlag)
    begin
        if rising_edge(clk) then
            if (RST = '1') then
                Sig_ITERATION_COMPLETE <= '0';
                ReadyFlag <= '0';

            elsif (DFPMCompute = '1') then
                Sig_ITERATION_COMPLETE <= '0';
                ReadyFlag <= '0';
            elsif (MultiplicationProductsReady = '1') and (ReadyFlag = '0') then
                ReadyFlag <= '1';

                Sig_ITERATION_COMPLETE <= '1';
                B_Minus_AX <= Sig_Vector_B_Minus_AX;
                B_Minus_Ax_Minus_muV <= Sig_Vector_B_Minus_AX_Minus_MuV;
            else
                Sig_ITERATION_COMPLETE <= '0';
            -- end if;
            end if;
        end if;
    end process;

    -- The clock signal created in this process is a real afterthought
    -- It would not have been created if this module had behaved itself ;-))
    -- It was observed that the circuit computed an output that was wrong
    -- For as long as the shifting of the index position was based on the normal clock "CLK"
    -- Hence this clock that cuts the speed to half.
    process(CLK)
    begin
        if rising_edge(CLK) then
            Sig_Clk_For_Index_Shifting <= not(Sig_Clk_For_Index_Shifting);
        end if;
    End process;

end Behavioral;

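The quantities produced by the subtraction and multiplication module, B - AX
and B - AX - muV, can be checked against a small software model. The Python
sketch below only models the intended arithmetic, not the cycle-by-cycle
behaviour of the processes above. It assumes 33-bit two's complement words
with 16 fractional bits, and all function names are hypothetical.

```python
FRACTION_BITS = 16                # assumed number of fractional bits
WORD_BITS = 33                    # signed word width used in the listings
MASK = (1 << WORD_BITS) - 1


def wrap(value: int) -> int:
    """Interpret the low 33 bits of `value` as a signed two's complement word."""
    value &= MASK
    return value - (1 << WORD_BITS) if value >> (WORD_BITS - 1) else value


def fx_mul_sum(xs, ys) -> int:
    """Sum of products followed by the (48 downto 16) slice."""
    return wrap(sum(a * b for a, b in zip(xs, ys)) >> FRACTION_BITS)


def residuals(A, B, X, mu, V):
    """Return (B - A*X, B - A*X - mu*V), processing one matrix row at a time."""
    b_minus_ax = []
    b_minus_ax_minus_muv = []
    for row, b, v in zip(A, B, V):
        ax = fx_mul_sum(row, X)        # role of the vector multiplier
        mu_v = fx_mul_sum([mu], [v])   # scalar product mu * V(i)
        r = wrap(b - ax)               # first subtractor
        b_minus_ax.append(r)
        b_minus_ax_minus_muv.append(wrap(r - mu_v))  # second subtractor
    return b_minus_ax, b_minus_ax_minus_muv
```

With A set to the identity matrix, the first output reduces to B - X element by
element, which makes the model easy to verify by hand before comparing it with
simulation results.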
Tolerance check
----------------------------------------------------------------------------------
-- Company: Mid Sweden University
-- Engineer: Taiyelolu Adeboye
--
-- Create Date: 10:42:33 01/07/2015
-- Design Name:
-- Module Name: Signed_Vector_Vector_Mult_5By1 - Behavioral
-- Project Name: DFPM on FPGA
-- Target Devices: Nexys2
----------------------------------------------------------------------------------

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.std_logic_signed.all;
use work.DFPM_ARRAY_5X32_BIT.all;


-- Uncomment the following library declaration if using
-- arithmetic functions with Signed or Unsigned values
use IEEE.NUMERIC_STD.ALL;

-- Uncomment the following library declaration if using
-- arithmetic functions with Signed or Unsigned values
--use IEEE.NUMERIC_STD.ALL;

-- Uncomment the following library declaration if instantiating
-- any Xilinx primitives in this code.
--library UNISIM;
--use UNISIM.VComponents.all;

entity Signed_Tolerance_Check is
    Port ( Vector_B_AX        : in DFPM_SIGNED_VECTOR_5X32_BIT;
           Tolerance_Limit    : in Signed (32 downto 0);
           Iteration_Complete : in STD_LOGIC := '0';

           CLK : in STD_LOGIC := '0';
           RST : in STD_LOGIC := '0';

           Tolerance_Limit_Squared, Vector_B_AX_Sum : out Signed (32 downto 0);

           Iterate : out STD_LOGIC := '1');
end Signed_Tolerance_Check;

architecture Behavioral of Signed_Tolerance_Check is

    Signal Sig_Vector_B_AX, Sig_Vector_B_AX_Squared : DFPM_SIGNED_VECTOR_5X32_BIT;
    Signal Sig_Tolerance_Limit, Sig_Tolerance_Limit_Squared : Signed (32 downto 0);

    Signal Sig_Vector_B_AX_Sum : Signed(32 downto 0);

    Signal Sig_Position : integer := 0;

    Signal Sig_ShiftPosition, Sig_Multiplication_Is_Complete,
           Sig_Check_Tolerance_Limit : STD_LOGIC := '0';


begin

    Tolerance_Limit_Squared <= Sig_Tolerance_Limit_Squared;
    Vector_B_AX_Sum <= Sig_Vector_B_AX_Sum;

    -- This process determines when data stored internally are to be serially multiplied
    -- They are serially multiplied to save on multipliers
    process(CLK, RST, Iteration_Complete, Sig_ShiftPosition, Sig_Position)
        Variable Var_Position: integer := 0;
    begin
        if rising_edge(CLK) then
            if (RST = '1') then
                Sig_Position <= 0;
                Sig_ShiftPosition <= '0';
                Sig_Multiplication_Is_Complete <= '0';
            elsif (Iteration_Complete = '1') then
                Sig_Check_Tolerance_Limit <= '0';
                Sig_Position <= 0;
                Sig_ShiftPosition <= '1';
                Sig_Multiplication_Is_Complete <= '0';
            elsif (Sig_Multiplication_Is_Complete = '1') then
                Sig_Check_Tolerance_Limit <= '1';
            else
                if (Sig_ShiftPosition = '1') then
                    if (Sig_Position = 5) then
                        Sig_Position <= 0;
                        Sig_Multiplication_Is_Complete <= '1';
                        Sig_ShiftPosition <= '0';
                    else
                        Var_Position := Sig_Position;
                        Sig_Position <= Var_Position + 1;
                    end if;
                end if;
            end if;
        end if;
    end process;

    -- Store the inputs internally when the signal from the SubtrAndMult module goes high
    process(Iteration_Complete)
    begin
        if rising_edge(Iteration_Complete) then
            Sig_Tolerance_Limit <= Tolerance_Limit;
            Sig_Vector_B_AX <= Vector_B_AX;
        end if;
    end process;
104
    -- Serial multiplication: one product per clock cycle while Sig_ShiftPosition is high
    process(CLK, Sig_ShiftPosition, Sig_Position)
        Variable productTempStore : Signed(65 downto 0);
    begin
        if rising_edge(CLK) then
            if (Sig_ShiftPosition = '1') then
                Case Sig_Position is
                    when 0 to 4 =>
                        -- Square element Sig_Position and keep bits 48 downto 16 of the 66-bit product
                        productTempStore := (Sig_Vector_B_AX(Sig_Position) * Sig_Vector_B_AX(Sig_Position));
                        Sig_Vector_B_AX_Squared(Sig_Position) <= productTempStore(48 downto 16);
                    when 5 =>
                        -- Square the tolerance limit with the same scaling
                        productTempStore := Sig_Tolerance_Limit * Sig_Tolerance_Limit;
                        Sig_Tolerance_Limit_Squared <= productTempStore(48 downto 16);
                    when others =>
                        NULL;
                End case;
            end if;
        end if;
    end process;
136
    -- Sum the five squared elements once the multiplications are complete
    process(Sig_Multiplication_Is_Complete)
        variable Var_Vector_B_AX_Sum : Signed (36 downto 0);
    begin
        if rising_edge(Sig_Multiplication_Is_Complete) then
            Var_Vector_B_AX_Sum := ("0000" & Sig_Vector_B_AX_Squared(0) + Sig_Vector_B_AX_Squared(1)
                                    + Sig_Vector_B_AX_Squared(2) + Sig_Vector_B_AX_Squared(3)
                                    + Sig_Vector_B_AX_Squared(4));

            Sig_Vector_B_AX_Sum <= Var_Vector_B_AX_Sum(32 downto 0);
        end if;
    end process;
148
    -- Compare the sum of squares with the squared tolerance limit
    process(CLK, Sig_Check_Tolerance_Limit, Sig_Vector_B_AX_Sum, Sig_Tolerance_Limit_Squared)
    begin
        if rising_edge(CLK) then
            if (Sig_Check_Tolerance_Limit = '1') then
                if (Sig_Vector_B_AX_Sum < Sig_Tolerance_Limit_Squared) then
                    Iterate <= '0';  -- below the tolerance: no further iteration requested
                else
                    Iterate <= '1';
                end if;
            end if;
        end if;
    end process;
161 end Behavioral;
162
163
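The tolerance check above multiplies two 33-bit signed values into a 66-bit product and keeps bits 48 downto 16. That slice is consistent with a fixed-point format of 16 fractional bits, although this is an inference from the code rather than something the listing states. The sketch below isolates that step in a helper function with a small self-checking testbench. The package `fixed_point_sketch`, the subtype `fx33_t`, the function `fx_mul` and the testbench `fx_mul_tb` are hypothetical names and are not part of the thesis code.

```vhdl
library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;

-- Hypothetical helper package, assuming 16 fractional bits in a 33-bit word
package fixed_point_sketch is
    subtype fx33_t is signed(32 downto 0);
    -- Multiply two fixed-point values and rescale the result to the same format
    function fx_mul(a, b : fx33_t) return fx33_t;
end package fixed_point_sketch;

package body fixed_point_sketch is
    function fx_mul(a, b : fx33_t) return fx33_t is
        variable product : signed(65 downto 0);
    begin
        product := a * b;               -- 66-bit product with 32 fractional bits
        return product(48 downto 16);   -- drop 16 fractional bits, keep 33 bits
    end function fx_mul;
end package body fixed_point_sketch;

library IEEE;
use IEEE.STD_LOGIC_1164.ALL;
use IEEE.NUMERIC_STD.ALL;
use work.fixed_point_sketch.all;

-- Minimal self-checking testbench (simulation only)
entity fx_mul_tb is
end entity fx_mul_tb;

architecture sim of fx_mul_tb is
begin
    process
        constant one_and_half : fx33_t := to_signed(98304, 33);   -- 1.5 * 2**16
        constant two          : fx33_t := to_signed(131072, 33);  -- 2.0 * 2**16
    begin
        -- 1.5 * 2.0 should give 3.0, which is 3 * 2**16 = 196608
        assert fx_mul(one_and_half, two) = to_signed(196608, 33)
            report "fx_mul: unexpected product" severity failure;
        wait;
    end process;
end architecture sim;
```

Like the listing, the slice truncates rather than rounds, and results that need more than 33 bits wrap around without any indication.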
New V operations
----------------------------------------------------------------------------------
-- Company: Mid Sweden University
-- Engineer: Taiyelolu Adeboye
--
-- Create Date: 10:42:33 01/07/2015
-- Design Name:
-- Module Name: Signed_New_V_Ops - Behavioral
-- Project Name: DFPM on FPGA
-- Target Devices: Nexys2
----------------------------------------------------------------------------------
11
12 library IEEE;
13 use IEEE.STD_LOGIC_1164.ALL;
14 use IEEE.std_logic_signed.all;
15 use work.DFPM_ARRAY_5X32_BIT.all;
16
17
18 -- Uncomment the following library declaration if using
19 -- arithmetic functions with Signed or Unsigned values
20 use IEEE.NUMERIC_STD.ALL;
21 -- Uncomment the following library declaration if instantiating
22 -- any Xilinx primitives in this code.
23 --library UNISIM;
24 --use UNISIM.VComponents.all;
25
26 entity Signed_New_V_Ops is
27 Port ( B_Ax_Muv : in DFPM_SIGNED_VECTOR_5X32_BIT;
28 Vector_V : in DFPM_SIGNED_VECTOR_5X32_BIT;
29
30 DT : in Signed (32 downto 0);
31 CLK : in STD_LOGIC;
32 RST : in STD_LOGIC;
33 ITERATION_COMPLETE : in STD_LOGIC;
34
35 VECTOR_NEW_V : out DFPM_SIGNED_VECTOR_5X32_BIT;
36 NEW_V_READY : out STD_LOGIC);
37 end Signed_New_V_Ops;
38
39 architecture Behavioral of Signed_New_V_Ops is
40
41 Signal Sig_Vector_V : DFPM_SIGNED_VECTOR_5X32_BIT;
42 Signal Sig_B_Ax_MuV : DFPM_SIGNED_VECTOR_5X32_BIT;
43 Signal Sig_B_Ax_MuV_Mult_Dt : DFPM_SIGNED_VECTOR_5X32_BIT;
44
45
46 Signal Sig_Position : integer := 0;
47
48 Signal Sig_ShiftPosition : STD_LOGIC := '0';
49 Signal Sig_NEW_V_READY : STD_LOGIC := '0';
50
51 begin
52
    process(CLK, RST, ITERATION_COMPLETE, Sig_ShiftPosition, Sig_Position, Sig_NEW_V_READY)
54 variable Var_Iteration_Complete : STD_LOGIC := '0';
55 Variable Var_Position : integer := 0;
56 begin
57 if rising_edge(CLK) then
58 if (RST = '1') then
59 Sig_ShiftPosition <= '0';
60 Var_Position := 0;
61 Sig_Position <= 0;
62 Var_Iteration_Complete := '0';
63 Sig_NEW_V_READY <= '0';
64 elsif (ITERATION_COMPLETE = '0') then
65 Var_Iteration_Complete := '0';
            -- elsif (ITERATION_COMPLETE = '1') and (Var_Iteration_Complete = '1') then
            --     Var_Iteration_Complete := '0';
            elsif (ITERATION_COMPLETE = '1') and (Var_Iteration_Complete = '0') then
69 Var_Iteration_Complete := '1';
70 Sig_ShiftPosition <= '1';
71 Sig_Position <= 0;
72 Sig_NEW_V_READY <= '0';
73 end if;
74
75 if (Sig_ShiftPosition = '1') then
76 if (Sig_Position = 4) then
77 Sig_ShiftPosition <= '0';
78 Sig_NEW_V_READY <= '1';
79 else
80 Var_Position := Sig_Position;
81 Sig_Position <= Var_Position + 1;
82 end if;
83 end if;
84
85 if (Sig_NEW_V_READY = '1') then
86 Sig_NEW_V_READY <= '0';
87 end if;
88 end if;
89 end process;
90
91 process(ITERATION_COMPLETE)
92 begin
93 if rising_edge(ITERATION_COMPLETE) then
94 Sig_B_Ax_MuV <= B_Ax_Muv;
95 Sig_Vector_V <= Vector_V;
96 end if;
97 end process;
98
99 process(CLK, Sig_ShiftPosition)
100 Variable productTempStore : Signed(65 downto 0) := (Others => '0');
101 begin
102 if rising_edge(CLK) then
103 if (Sig_ShiftPosition = '1') then
104 productTempStore := Sig_B_Ax_MuV(Sig_Position) * DT;
105
                Sig_B_Ax_MuV_Mult_Dt(Sig_Position) <= productTempStore(48 downto 16);
107 end if;
108 end if;
109 end process;
110
111 NEW_V_READY <= Sig_NEW_V_READY;
112
113 VECTOR_NEW_V(0) <= Sig_Vector_V(0) + Sig_B_Ax_MuV_Mult_Dt(0);
114 VECTOR_NEW_V(1) <= Sig_Vector_V(1) + Sig_B_Ax_MuV_Mult_Dt(1);
115 VECTOR_NEW_V(2) <= Sig_Vector_V(2) + Sig_B_Ax_MuV_Mult_Dt(2);
116 VECTOR_NEW_V(3) <= Sig_Vector_V(3) + Sig_B_Ax_MuV_Mult_Dt(3);
117 VECTOR_NEW_V(4) <= Sig_Vector_V(4) + Sig_B_Ax_MuV_Mult_Dt(4);
118
119 end Behavioral;
120
121
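Read as arithmetic, the listing above appears to perform one explicit velocity update per vector element. The sketch below assumes that the input `B_Ax_Muv` carries $b - Ax - \mu v$, as its name suggests. The symbols $r_i$, $v_i$ and $\Delta t$ are notation introduced here for illustration: $\Delta t$ stands for the `DT` port and $v_i$ for element $i$ of `Vector_V`.

```latex
\[
  v_i^{\mathrm{new}} = v_i + \Delta t \, r_i ,
  \qquad r_i = (b - A x - \mu v)_i ,
  \qquad i = 0, \dots, 4
\]
```

In the hardware the product $\Delta t \, r_i$ is formed serially, one element per clock cycle, and is truncated to bits 48 downto 16 of the 66-bit product before the concurrent additions produce `VECTOR_NEW_V`.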
New X operations
1 ---------------------------------------------------------------------
-------------
2 -- Company: Mid Sweden University
3 -- Engineer: Taiyelolu Adeboye
4 --
5 -- Create Date: 10:42:33 01/07/2015
6 -- Design Name:
7 -- Module Name: Signed_Vector_Vector_Mult_5By1 - Behavioral
8 -- Project Name: DFPM on FPGA
9 -- Target Devices: Nexys2
10 --------------------------------------------------------------------
--------------
11
12 library IEEE;
13 use IEEE.STD_LOGIC_1164.ALL;
14 use IEEE.std_logic_signed.all;
15 use work.DFPM_ARRAY_5X32_BIT.all;
16
17
18 -- Uncomment the following library declaration if using
19 -- arithmetic functions with Signed or Unsigned values
20 use IEEE.NUMERIC_STD.ALL;
21 -- Uncomment the following library declaration if instantiating
22 -- any Xilinx primitives in this code.
23 --library UNISIM;
24 --use UNISIM.VComponents.all;
25
26 entity Signed_New_V_Ops is
27 Port ( B_Ax_Muv : in DFPM_SIGNED_VECTOR_5X32_BIT;
28 Vector_V : in DFPM_SIGNED_VECTOR_5X32_BIT;
29
30 DT : in Signed (32 downto 0);
31 CLK : in STD_LOGIC;
32 RST : in STD_LOGIC;
33 ITERATION_COMPLETE : in STD_LOGIC;
34
35 VECTOR_NEW_V : out DFPM_SIGNED_VECTOR_5X32_BIT;
36 NEW_V_READY : out STD_LOGIC);
37 end Signed_New_V_Ops;
38
39 architecture Behavioral of Signed_New_V_Ops is
40
41 Signal Sig_Vector_V : DFPM_SIGNED_VECTOR_5X32_BIT;
42 Signal Sig_B_Ax_MuV : DFPM_SIGNED_VECTOR_5X32_BIT;
43 Signal Sig_B_Ax_MuV_Mult_Dt : DFPM_SIGNED_VECTOR_5X32_BIT;
44
45
46 Signal Sig_Position : integer := 0;
47
48 Signal Sig_ShiftPosition : STD_LOGIC := '0';
49 Signal Sig_NEW_V_READY : STD_LOGIC := '0';
50
51 begin
52
    process(CLK, RST, ITERATION_COMPLETE, Sig_ShiftPosition, Sig_Position, Sig_NEW_V_READY)
54 variable Var_Iteration_Complete : STD_LOGIC := '0';
55 Variable Var_Position : integer := 0;
56 begin
57 if rising_edge(CLK) then
58 if (RST = '1') then
59 Sig_ShiftPosition <= '0';
60 Var_Position := 0;
61 Sig_Position <= 0;
62 Var_Iteration_Complete := '0';
63 Sig_NEW_V_READY <= '0';
64 elsif (ITERATION_COMPLETE = '0') then
65 Var_Iteration_Complete := '0';
            -- elsif (ITERATION_COMPLETE = '1') and (Var_Iteration_Complete = '1') then
            --     Var_Iteration_Complete := '0';
            elsif (ITERATION_COMPLETE = '1') and (Var_Iteration_Complete = '0') then
69 Var_Iteration_Complete := '1';
70 Sig_ShiftPosition <= '1';
71 Sig_Position <= 0;
72 Sig_NEW_V_READY <= '0';
73 end if;
74
75 if (Sig_ShiftPosition = '1') then
76 if (Sig_Position = 4) then
77 Sig_ShiftPosition <= '0';
78 Sig_NEW_V_READY <= '1';
79 else
80 Var_Position := Sig_Position;
81 Sig_Position <= Var_Position + 1;
82 end if;
83 end if;
84
85 if (Sig_NEW_V_READY = '1') then
86 Sig_NEW_V_READY <= '0';
87 end if;
88 end if;
89 end process;
90
91 process(ITERATION_COMPLETE)
92 begin
93 if rising_edge(ITERATION_COMPLETE) then
94 Sig_B_Ax_MuV <= B_Ax_Muv;
95 Sig_Vector_V <= Vector_V;
96 end if;
97 end process;
98
99 process(CLK, Sig_ShiftPosition)
100 Variable productTempStore : Signed(65 downto 0) := (Others => '0');
101 begin
102 if rising_edge(CLK) then
103 if (Sig_ShiftPosition = '1') then
104 productTempStore := Sig_B_Ax_MuV(Sig_Position) * DT;
105
                Sig_B_Ax_MuV_Mult_Dt(Sig_Position) <= productTempStore(48 downto 16);
107 end if;
108 end if;
109 end process;
110
111 NEW_V_READY <= Sig_NEW_V_READY;
112
113 VECTOR_NEW_V(0) <= Sig_Vector_V(0) + Sig_B_Ax_MuV_Mult_Dt(0);
114 VECTOR_NEW_V(1) <= Sig_Vector_V(1) + Sig_B_Ax_MuV_Mult_Dt(1);
115 VECTOR_NEW_V(2) <= Sig_Vector_V(2) + Sig_B_Ax_MuV_Mult_Dt(2);
116 VECTOR_NEW_V(3) <= Sig_Vector_V(3) + Sig_B_Ax_MuV_Mult_Dt(3);
117 VECTOR_NEW_V(4) <= Sig_Vector_V(4) + Sig_B_Ax_MuV_Mult_Dt(4);
118
119 end Behavioral;
120
121
One Iteration
----------------------------------------------------------------------------------
-- Company: Mid Sweden University
-- Engineer: Taiyelolu Adeboye
--
-- Create Date: 10:42:33 01/07/2015
-- Design Name:
-- Module Name: Signed_DFPM_One_Iteration - Behavioral
-- Project Name: DFPM on FPGA
-- Target Devices: Nexys2
----------------------------------------------------------------------------------
11
12 library IEEE;
13 use IEEE.STD_LOGIC_1164.ALL;
14 use IEEE.std_logic_signed.all;
15 use work.DFPM_ARRAY_5X32_BIT.all;
16 use work.DFPM_ARRAY_25X32_BIT.all;
17
18 -- Uncomment the following library declaration if using
19 -- arithmetic functions with Signed or Unsigned values
20 use IEEE.NUMERIC_STD.ALL;
21
22 -- Uncomment the following library declaration if instantiating
23 -- any Xilinx primitives in this code.
24 --library UNISIM;
25 --use UNISIM.VComponents.all;
26
27 entity Signed_DFPM_One_Iteration is
28 Port ( VECTOR_A_IN : in DFPM_SIGNED_VECTOR_25X32_BIT;
29 VECTOR_B_IN : in DFPM_SIGNED_VECTOR_5X32_BIT;
30 VECTOR_X_IN : in DFPM_SIGNED_VECTOR_5X32_BIT;
31 VECTOR_V_IN : in DFPM_SIGNED_VECTOR_5X32_BIT;
32 Mu_IN : in Signed (32 downto 0);
33 DT_IN : in Signed (32 downto 0);
34 NEW_ITERATION_IN : in STD_LOGIC;
35
36 CLK : in STD_LOGIC;
37 RST : in STD_LOGIC;
38
39 B_AX_OUT : out DFPM_SIGNED_VECTOR_5X32_BIT;
40 NEW_V_OUT : out DFPM_SIGNED_VECTOR_5X32_BIT;
41 NEW_X_OUT : out DFPM_SIGNED_VECTOR_5X32_BIT;
42
43 ITERATION_STAGE_COMPLETE : out STD_LOGIC;
44 ITERATE_AGAIN : out STD_LOGIC);
45 end Signed_DFPM_One_Iteration;
46
47 architecture Behavioral of Signed_DFPM_One_Iteration is
48
49
50 COMPONENT Signed_SubtrAndMult_Ops_Module
51 Port ( Vector_A : in DFPM_SIGNED_VECTOR_25X32_BIT;
52 Vector_B : in DFPM_SIGNED_VECTOR_5X32_BIT;
53 Vector_X : in DFPM_SIGNED_VECTOR_5X32_BIT;
54 Scalar_Mu : in SIGNED (32 downto 0);
55 Vector_V : in DFPM_SIGNED_VECTOR_5X32_BIT;
56
57 CLK : in STD_LOGIC;
58 RST : in STD_LOGIC;
59 NEW_ITERATION : in STD_LOGIC := '0';
60 ITERATION_COMPLETE : out STD_LOGIC:= '0';
61
62 B_Minus_AX : out DFPM_SIGNED_VECTOR_5X32_BIT;
63 B_Minus_Ax_Minus_muV : out DFPM_SIGNED_VECTOR_5X32_BIT);
64 END COMPONENT;
65
66 COMPONENT Signed_New_V_Ops
67 Port ( B_Ax_Muv : in DFPM_SIGNED_VECTOR_5X32_BIT;
68 Vector_V : in DFPM_SIGNED_VECTOR_5X32_BIT;
69
70 DT : in Signed (32 downto 0);
71 CLK : in STD_LOGIC;
72 RST : in STD_LOGIC;
73 ITERATION_COMPLETE : in STD_LOGIC;
74
75 VECTOR_NEW_V : out DFPM_SIGNED_VECTOR_5X32_BIT;
76 NEW_V_READY : out STD_LOGIC);
77 END COMPONENT;
78
79 COMPONENT Signed_New_X_Ops
80 Port ( VECTOR_X : in DFPM_SIGNED_VECTOR_5X32_BIT;
81 VECTOR_NEW_V : in DFPM_SIGNED_VECTOR_5X32_BIT;
82 DT : in Signed(32 downto 0);
83
84 CLK : in STD_LOGIC;
85 RST : in STD_LOGIC;
86 NEW_V_READY : in STD_LOGIC;
87
88 VECTOR_NEW_X : out DFPM_SIGNED_VECTOR_5X32_BIT;
89 NEW_X_READY : out STD_LOGIC);
90 END COMPONENT;
91
92 COMPONENT Signed_Tolerance_Check
93 Port ( Vector_B_AX : in DFPM_SIGNED_VECTOR_5X32_BIT;
94 Tolerance_Limit : in Signed (32 downto 0);
95 Iteration_Complete : in STD_LOGIC:= '0';
96
97
98
99 CLK : in STD_LOGIC:= '0';
100 RST : in STD_LOGIC:= '0';
101
           Tolerance_Limit_Squared, Vector_B_AX_Sum : out Signed (32 downto 0);
103
104 Iterate : out STD_LOGIC := '1');
105 END COMPONENT;
106
----------------------------------------------------------------------------------
108
109
110 Signal Sig_VECTOR_A_IN : DFPM_SIGNED_VECTOR_5X32_BIT;
111 Signal Sig_VECTOR_B_IN : DFPM_SIGNED_VECTOR_5X32_BIT;
112 Signal Sig_VECTOR_V_IN : DFPM_SIGNED_VECTOR_5X32_BIT;
113 Signal Sig_VECTOR_X_IN : DFPM_SIGNED_VECTOR_5X32_BIT;
114 Signal Sig_Mu_IN : Signed(32 downto 0);
115 Signal Sig_DT_IN : Signed(32 downto 0);
116 Signal Sig_B_AX_OUT : DFPM_SIGNED_VECTOR_5X32_BIT;
117
118 Signal Sig_Start_SubtrMultOps_To_NewVOps : STD_LOGIC := '0';
119 Signal Sig_Start_NewVOps_To_NewXOps : STD_LOGIC := '0';
120 Signal Sig_New_X_Is_Ready : STD_LOGIC := '0';
121
122 Signal Sig_B_Ax_MuV : DFPM_SIGNED_VECTOR_5X32_BIT;
123 Signal Sig_New_V : DFPM_SIGNED_VECTOR_5X32_BIT;
124 Signal Sig_New_X : DFPM_SIGNED_VECTOR_5X32_BIT;
125
    Constant Const_Tolerance_Limit : Signed (32 downto 0) := "000000000000000000000001000000000"; -- only bit 9 is set: 1*2^(-7)
127
128
129
130 begin
131
132 Sig_DT_IN <= DT_IN;
133 Sig_Mu_IN <= Mu_IN;
134 Sig_VECTOR_V_IN <= VECTOR_V_IN;
135 Sig_VECTOR_X_IN <= VECTOR_X_IN;
136 ITERATION_STAGE_COMPLETE <= Sig_New_X_Is_Ready;
137 B_AX_OUT <= Sig_B_AX_OUT;
138 NEW_V_OUT <= Sig_New_V;
139 NEW_X_OUT <= Sig_New_X;
140
    Inst_Signed_SubtrAndMult_Ops_Module: Signed_SubtrAndMult_Ops_Module PORT MAP(
142 Vector_A => VECTOR_A_IN,
143 Vector_B => VECTOR_B_IN,
144 Vector_X => Sig_VECTOR_X_IN,
145 Scalar_Mu => Mu_IN,
146 Vector_V => Sig_VECTOR_V_IN,
147 CLK => CLK,
148 RST => RST,
149 NEW_ITERATION => NEW_ITERATION_IN,
150 ITERATION_COMPLETE => Sig_Start_SubtrMultOps_To_NewVOps,
151 B_Minus_AX => Sig_B_AX_OUT,
152 B_Minus_Ax_Minus_muV => Sig_B_Ax_MuV);
153
154
155 Inst_Signed_New_V_Ops: Signed_New_V_Ops PORT MAP(
156 B_Ax_Muv => Sig_B_Ax_MuV,
157 Vector_V => Sig_VECTOR_V_IN,
158 DT => Sig_DT_IN,
159 CLK => CLK,
160 RST => RST,
161 ITERATION_COMPLETE => Sig_Start_SubtrMultOps_To_NewVOps,
162 VECTOR_NEW_V => Sig_New_V,
163 NEW_V_READY => Sig_Start_NewVOps_To_NewXOps);
164
165
166 Inst_Signed_New_X_Ops: Signed_New_X_Ops PORT MAP(
167 VECTOR_X => Sig_VECTOR_X_IN,
168 VECTOR_NEW_V => Sig_New_V,
169 DT => Sig_DT_IN,
170 CLK => CLK,
171 RST => RST,
172 NEW_V_READY => Sig_Start_NewVOps_To_NewXOps,
173 VECTOR_NEW_X => Sig_New_X,
174 NEW_X_READY => Sig_New_X_Is_Ready);
175
176 Inst_Signed_Tolerance_Check: Signed_Tolerance_Check PORT MAP (
177 Vector_B_AX => Sig_B_AX_OUT,
178 Tolerance_Limit => Const_Tolerance_Limit,
179 Iteration_Complete => Sig_Start_SubtrMultOps_To_NewVOps,
180 CLK => CLK,
181 RST => RST,
182 Tolerance_Limit_Squared => open,
183 Vector_B_AX_Sum => open,
184 Iterate => ITERATE_AGAIN);
185
186
187 end Behavioral;
188
189
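The wiring above chains the position update after the velocity update and runs the tolerance check on the residual in parallel. The sketch below states the arithmetic this appears to implement. The position update is an assumption inferred only from the port names of `Signed_New_X_Ops` (`VECTOR_X`, `VECTOR_NEW_V`, `DT`). The numeric value of the tolerance assumes 16 fractional bits in the 33-bit word, which is inferred from the `(48 downto 16)` product slices used elsewhere in the design. The symbols $x_n$, $v_{n+1}$, $\Delta t$ and $\varepsilon$ are notation introduced here.

```latex
\begin{align*}
  x_{n+1} &= x_n + \Delta t \, v_{n+1}
    && \text{(assumed from the port names)} \\
  \text{iterate again} &\iff \sum_{i=0}^{4} (b - A x_n)_i^{2} \;\ge\; \varepsilon^{2}
    && \text{(\texttt{Iterate} = '1')} \\
  \varepsilon &= 2^{9} \cdot 2^{-16} = 2^{-7} = 0.0078125
    && \text{(\texttt{Const\_Tolerance\_Limit}, bit 9 set)}
\end{align*}
```

Comparing squared quantities avoids a square root in hardware, at the cost of one extra multiplication for $\varepsilon^{2}$.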
DFPM top module
1 ---------------------------------------------------------------------
-------------
2 -- Company: Mid Sweden University
3 -- Engineer: Taiyelolu Adeboye
4 --
5 -- Create Date: 10:42:33 01/07/2015
6 -- Design Name:
7 -- Module Name: Signed_Vector_Vector_Mult_5By1 - Behavioral
8 -- Project Name: DFPM on FPGA
9 -- Target Devices: Nexys2
10 --------------------------------------------------------------------
--------------
11
12 library IEEE;
13 use IEEE.STD_LOGIC_1164.ALL;
14 use IEEE.std_logic_signed.all;
15 use work.DFPM_ARRAY_5X32_BIT.all;
16 use work.DFPM_ARRAY_25X32_BIT.all;
17
18 -- Uncomment the following library declaration if using
19 -- arithmetic functions with Signed or Unsigned values
20 use IEEE.NUMERIC_STD.ALL;
21
22 -- Uncomment the following library declaration if instantiating
23 -- any Xilinx primitives in this code.
24 --library UNISIM;
25 --use UNISIM.VComponents.all;
26
27 entity Signed_DFPM_One_Iteration is
28 Port ( VECTOR_A_IN : in DFPM_SIGNED_VECTOR_25X32_BIT;
29 VECTOR_B_IN : in DFPM_SIGNED_VECTOR_5X32_BIT;
30 VECTOR_X_IN : in DFPM_SIGNED_VECTOR_5X32_BIT;
31 VECTOR_V_IN : in DFPM_SIGNED_VECTOR_5X32_BIT;
32 Mu_IN : in Signed (32 downto 0);
33 DT_IN : in Signed (32 downto 0);
34 NEW_ITERATION_IN : in STD_LOGIC;
35
36 CLK : in STD_LOGIC;
37 RST : in STD_LOGIC;
38
39 B_AX_OUT : out DFPM_SIGNED_VECTOR_5X32_BIT;
40 NEW_V_OUT : out DFPM_SIGNED_VECTOR_5X32_BIT;
41 NEW_X_OUT : out DFPM_SIGNED_VECTOR_5X32_BIT;
42
43 ITERATION_STAGE_COMPLETE : out STD_LOGIC;
44 ITERATE_AGAIN : out STD_LOGIC);
45 end Signed_DFPM_One_Iteration;
46
47 architecture Behavioral of Signed_DFPM_One_Iteration is
48
49
50 COMPONENT Signed_SubtrAndMult_Ops_Module
51 Port ( Vector_A : in DFPM_SIGNED_VECTOR_25X32_BIT;
52 Vector_B : in DFPM_SIGNED_VECTOR_5X32_BIT;
53 Vector_X : in DFPM_SIGNED_VECTOR_5X32_BIT;
54 Scalar_Mu : in SIGNED (32 downto 0);
55 Vector_V : in DFPM_SIGNED_VECTOR_5X32_BIT;
56
57 CLK : in STD_LOGIC;
58 RST : in STD_LOGIC;
59 NEW_ITERATION : in STD_LOGIC := '0';
60 ITERATION_COMPLETE : out STD_LOGIC:= '0';
61
62 B_Minus_AX : out DFPM_SIGNED_VECTOR_5X32_BIT;
63 B_Minus_Ax_Minus_muV : out DFPM_SIGNED_VECTOR_5X32_BIT);
64 END COMPONENT;
65
66 COMPONENT Signed_New_V_Ops
67 Port ( B_Ax_Muv : in DFPM_SIGNED_VECTOR_5X32_BIT;
68 Vector_V : in DFPM_SIGNED_VECTOR_5X32_BIT;
69
70 DT : in Signed (32 downto 0);
71 CLK : in STD_LOGIC;
72 RST : in STD_LOGIC;
73 ITERATION_COMPLETE : in STD_LOGIC;
74
75 VECTOR_NEW_V : out DFPM_SIGNED_VECTOR_5X32_BIT;
76 NEW_V_READY : out STD_LOGIC);
77 END COMPONENT;
78
79 COMPONENT Signed_New_X_Ops
80 Port ( VECTOR_X : in DFPM_SIGNED_VECTOR_5X32_BIT;
81 VECTOR_NEW_V : in DFPM_SIGNED_VECTOR_5X32_BIT;
82 DT : in Signed(32 downto 0);
83
84 CLK : in STD_LOGIC;
85 RST : in STD_LOGIC;
86 NEW_V_READY : in STD_LOGIC;
87
88 VECTOR_NEW_X : out DFPM_SIGNED_VECTOR_5X32_BIT;
89 NEW_X_READY : out STD_LOGIC);
90 END COMPONENT;
91
92 COMPONENT Signed_Tolerance_Check
93 Port ( Vector_B_AX : in DFPM_SIGNED_VECTOR_5X32_BIT;
94 Tolerance_Limit : in Signed (32 downto 0);
95 Iteration_Complete : in STD_LOGIC:= '0';
96
97
98
99 CLK : in STD_LOGIC:= '0';
100 RST : in STD_LOGIC:= '0';
101
           Tolerance_Limit_Squared, Vector_B_AX_Sum : out Signed (32 downto 0);
103
104 Iterate : out STD_LOGIC := '1');
105 END COMPONENT;
106
----------------------------------------------------------------------------------
108
109
110 Signal Sig_VECTOR_A_IN : DFPM_SIGNED_VECTOR_5X32_BIT;
111 Signal Sig_VECTOR_B_IN : DFPM_SIGNED_VECTOR_5X32_BIT;
112 Signal Sig_VECTOR_V_IN : DFPM_SIGNED_VECTOR_5X32_BIT;
113 Signal Sig_VECTOR_X_IN : DFPM_SIGNED_VECTOR_5X32_BIT;
114 Signal Sig_Mu_IN : Signed(32 downto 0);
115 Signal Sig_DT_IN : Signed(32 downto 0);
116 Signal Sig_B_AX_OUT : DFPM_SIGNED_VECTOR_5X32_BIT;
117
118 Signal Sig_Start_SubtrMultOps_To_NewVOps : STD_LOGIC := '0';
119 Signal Sig_Start_NewVOps_To_NewXOps : STD_LOGIC := '0';
120 Signal Sig_New_X_Is_Ready : STD_LOGIC := '0';
121
122 Signal Sig_B_Ax_MuV : DFPM_SIGNED_VECTOR_5X32_BIT;
123 Signal Sig_New_V : DFPM_SIGNED_VECTOR_5X32_BIT;
124 Signal Sig_New_X : DFPM_SIGNED_VECTOR_5X32_BIT;
125
    Constant Const_Tolerance_Limit : Signed (32 downto 0) := "000000000000000000000001000000000"; -- only bit 9 is set: 1*2^(-7)
127
128
129
130 begin
131
132 Sig_DT_IN <= DT_IN;
133 Sig_Mu_IN <= Mu_IN;
134 Sig_VECTOR_V_IN <= VECTOR_V_IN;
135 Sig_VECTOR_X_IN <= VECTOR_X_IN;
136 ITERATION_STAGE_COMPLETE <= Sig_New_X_Is_Ready;
137 B_AX_OUT <= Sig_B_AX_OUT;
138 NEW_V_OUT <= Sig_New_V;
139 NEW_X_OUT <= Sig_New_X;
140
    Inst_Signed_SubtrAndMult_Ops_Module: Signed_SubtrAndMult_Ops_Module PORT MAP(
142 Vector_A => VECTOR_A_IN,
143 Vector_B => VECTOR_B_IN,
144 Vector_X => Sig_VECTOR_X_IN,
145 Scalar_Mu => Mu_IN,
146 Vector_V => Sig_VECTOR_V_IN,
147 CLK => CLK,
148 RST => RST,
149 NEW_ITERATION => NEW_ITERATION_IN,
150 ITERATION_COMPLETE => Sig_Start_SubtrMultOps_To_NewVOps,
151 B_Minus_AX => Sig_B_AX_OUT,
152 B_Minus_Ax_Minus_muV => Sig_B_Ax_MuV);
153
154
155 Inst_Signed_New_V_Ops: Signed_New_V_Ops PORT MAP(
156 B_Ax_Muv => Sig_B_Ax_MuV,
157 Vector_V => Sig_VECTOR_V_IN,
158 DT => Sig_DT_IN,
159 CLK => CLK,
160 RST => RST,
161 ITERATION_COMPLETE => Sig_Start_SubtrMultOps_To_NewVOps,
162 VECTOR_NEW_V => Sig_New_V,
163 NEW_V_READY => Sig_Start_NewVOps_To_NewXOps);
164
165
166 Inst_Signed_New_X_Ops: Signed_New_X_Ops PORT MAP(
167 VECTOR_X => Sig_VECTOR_X_IN,
168 VECTOR_NEW_V => Sig_New_V,
169 DT => Sig_DT_IN,
170 CLK => CLK,
171 RST => RST,
172 NEW_V_READY => Sig_Start_NewVOps_To_NewXOps,
173 VECTOR_NEW_X => Sig_New_X,
174 NEW_X_READY => Sig_New_X_Is_Ready);
175
176 Inst_Signed_Tolerance_Check: Signed_Tolerance_Check PORT MAP (
177 Vector_B_AX => Sig_B_AX_OUT,
178 Tolerance_Limit => Const_Tolerance_Limit,
179 Iteration_Complete => Sig_Start_SubtrMultOps_To_NewVOps,
180 CLK => CLK,
181 RST => RST,
182 Tolerance_Limit_Squared => open,
183 Vector_B_AX_Sum => open,
184 Iterate => ITERATE_AGAIN);
185
186
187 end Behavioral;
188
189
UART Core
------------------------------------------------------------------------
-- RS232RefCom.vhd
------------------------------------------------------------------------
-- Author: Dan Pederson
-- Copyright 2004 Digilent, Inc.
------------------------------------------------------------------------
-- Description: This file defines a UART which transfers data from
--              serial form to parallel form and vice versa.
------------------------------------------------------------------------
-- Revision History:
-- 07/15/04 (Created) DanP
-- 02/25/08 (Created) ClaudiaG: made use of the baudDivide constant
--                    in the Clock Dividing Processes
------------------------------------------------------------------------
15
16 library IEEE;
17 use IEEE.STD_LOGIC_1164.ALL;
18 use IEEE.STD_LOGIC_ARITH.ALL;
19 use IEEE.STD_LOGIC_UNSIGNED.ALL;
20
21 -- Uncomment the following lines to use the declarations that are
22 -- provided for instantiating Xilinx primitive components.
23 --library UNISIM;
24 --use UNISIM.VComponents.all;
25
26 entity Rs232RefComp is
27 Port (
28 TXD : out std_logic := '1';
29 RXD : in std_logic;
30 CLK : in std_logic; --Master Clock = 50MHz
31 DBIN : in std_logic_vector (7 downto 0); --Data Bus in
32 DBOUT : out std_logic_vector (7 downto 0); --Data Bus out
33 RDA : inout std_logic; --Read Data Available
34 TBE : inout std_logic := '1'; --Transfer Bus Empty
35 RD : in std_logic; --Read Strobe
36 WR : in std_logic; --Write Strobe
37 PE : out std_logic; --Parity Error Flag
38 FE : out std_logic; --Frame Error Flag
39 OE : out std_logic; --Overwrite Error Flag
40 RST : in std_logic := '0'); --Master Reset
41 end Rs232RefComp;
42
43 architecture Behavioral of Rs232RefComp is
------------------------------------------------------------------------
-- Component Declarations
------------------------------------------------------------------------

------------------------------------------------------------------------
-- Local Type Declarations
------------------------------------------------------------------------
51 --Receive state machine
52 type rstate is (
53 strIdle, --Idle state
54 strEightDelay, --Delays for 8 clock cycles
55 strGetData, --Shifts in the 8 data bits, and checks parity
56 strCheckStop --Sets framing error flag if Stop bit is wrong
57 );
58
59 type tstate is (
60 sttIdle, --Idle state
61 sttTransfer, --Move data into shift register
62 sttShift --Shift out data
63 );
64
65 type TBEstate is (
66 stbeIdle,
67 stbeSetTBE,
68 stbeWaitLoad,
69 stbeWaitWrite
70 );
71
72
------------------------------------------------------------------------
-- Signal Declarations
------------------------------------------------------------------------
    constant baudDivide : std_logic_vector(7 downto 0) := "10100011";  --Baud rate divider (163), set now for a rate of 9600.
                                                                       --Found by dividing 50MHz by 9600 and 16, and then by 2
                                                                       --because rClk toggles each time the counter reaches it.
    signal rdReg   : std_logic_vector(7 downto 0) := "00000000";       --Receive holding register
    signal rdSReg  : std_logic_vector(9 downto 0) := "1111111111";     --Receive shift register
    signal tfReg   : std_logic_vector(7 downto 0);                     --Transfer holding register
    signal tfSReg  : std_logic_vector(10 downto 0) := "11111111111";   --Transfer shift register
    signal clkDiv  : std_logic_vector(8 downto 0) := "000000000";      --used for rClk
    signal rClkDiv : std_logic_vector(3 downto 0) := "0000";           --used for tClk
    signal ctr     : std_logic_vector(3 downto 0) := "0000";           --used for delay times
    signal tfCtr   : std_logic_vector(3 downto 0) := "0000";           --used to delay in transfer
    signal rClk    : std_logic := '0';                                 --Receiving Clock
    signal tClk    : std_logic;                                        --Transferring Clock
    signal dataCtr : std_logic_vector(3 downto 0) := "0000";           --Counts the number of read data bits
89 signal parError: std_logic; --Parity error bit
90 signal frameError: std_logic; --Frame error bit
91 signal CE : std_logic; --Clock enable for the latch
92 signal ctRst : std_logic := '0';
93 signal load : std_logic := '0';
94 signal shift : std_logic := '0';
95 signal par : std_logic;
96 signal tClkRST : std_logic := '0';
97 signal rShift : std_logic := '0';
98 signal dataRST : std_logic := '0';
99 signal dataIncr: std_logic := '0';
100
    signal strCur  : rstate := strIdle;   --Current state in the Receive state machine
    signal strNext : rstate;              --Next state in the Receive state machine
    signal sttCur  : tstate := sttIdle;   --Current state in the Transfer state machine
    signal sttNext : tstate;              --Next state in the Transfer state machine
105 signal stbeCur : TBEstate := stbeIdle;
106 signal stbeNext: TBEstate;
107
------------------------------------------------------------------------
-- Module Implementation
------------------------------------------------------------------------
111
112 begin
113 frameError <= not rdSReg(9);
    parError <= not ( rdSReg(8) xor (((rdSReg(0) xor rdSReg(1)) xor (rdSReg(2) xor rdSReg(3))) xor ((rdSReg(4) xor rdSReg(5)) xor (rdSReg(6) xor rdSReg(7)))) );
115 DBOUT <= rdReg;
116 tfReg <= DBIN;
    par <= not ( ((tfReg(0) xor tfReg(1)) xor (tfReg(2) xor tfReg(3))) xor ((tfReg(4) xor tfReg(5)) xor (tfReg(6) xor tfReg(7))) );
118
119 --Clock Dividing Functions--
120
121 process (CLK, clkDiv) --set up clock divide for rClk
122 begin
123 if (Clk = '1' and Clk'event) then
124 if (clkDiv = baudDivide) then
125 clkDiv <= "000000000";
126 else
127 clkDiv <= clkDiv +1;
128 end if;
129 end if;
130 end process;
131
132 process (clkDiv, rClk, CLK) --Define rClk
133 begin
134 if CLK = '1' and CLK'Event then
135 if clkDiv = baudDivide then
136 rClk <= not rClk;
137 else
138 rClk <= rClk;
139 end if;
140 end if;
141 end process;
142
143 process (rClk) --set up clock divide for tClk
144 begin
145 if (rClk = '1' and rClk'event) then
146 rClkDiv <= rClkDiv +1;
147 end if;
148 end process;
149
150 tClk <= rClkDiv(3); --define tClk
151
152 process (rClk, ctRst) --set up a counter based on rClk
153 begin
154 if rClk = '1' and rClk'Event then
155 if ctRst = '1' then
156 ctr <= "0000";
157 else
158 ctr <= ctr +1;
159 end if;
160 end if;
161 end process;
162
163 process (tClk, tClkRST) --set up a counter based on tClk
164 begin
165 if (tClk = '1' and tClk'event) then
166 if tClkRST = '1' then
167 tfCtr <= "0000";
168 else
169 tfCtr <= tfCtr +1;
170 end if;
171 end if;
172 end process;
173
174 --This process controls the error flags--
175 process (rClk, RST, RD, CE)
176 begin
177 if RD = '1' or RST = '1' then
178 FE <= '0';
179 OE <= '0';
180 RDA <= '0';
181 PE <= '0';
182 elsif rClk = '1' and rClk'event then
183 if CE = '1' then
184 FE <= frameError;
185 OE <= RDA;
186 RDA <= '1';
187 PE <= parError;
188 rdReg(7 downto 0) <= rdSReg (7 downto 0);
189 end if;
190 end if;
191 end process;
192
193 --This process controls the receiving shift register--
194 process (rClk, rShift)
195 begin
196 if rClk = '1' and rClk'Event then
197 if rShift = '1' then
198 rdSReg <= (RXD & rdSReg(9 downto 1));
199 end if;
200 end if;
201 end process;
202
    --This process controls the dataCtr to keep track of shifted values--
204 process (rClk, dataRST)
205 begin
206 if (rClk = '1' and rClk'event) then
207 if dataRST = '1' then
208 dataCtr <= "0000";
209 elsif dataIncr = '1' then
210 dataCtr <= dataCtr +1;
211 end if;
212 end if;
213 end process;
214
215 --Receiving State Machine--
216 process (rClk, RST)
217 begin
218 if rClk = '1' and rClk'Event then
219 if RST = '1' then
220 strCur <= strIdle;
221 else
222 strCur <= strNext;
223 end if;
224 end if;
225 end process;
226
    --This process generates the sequence of steps needed to receive the data
228
229 process (strCur, ctr, RXD, dataCtr, rdSReg, rdReg, RDA)
230 begin
231 case strCur is
232
233 when strIdle =>
234 dataIncr <= '0';
235 rShift <= '0';
236 dataRst <= '0';
237
238 CE <= '0';
239 if RXD = '0' then
240 ctRst <= '1';
241 strNext <= strEightDelay;
242 else
243 ctRst <= '0';
244 strNext <= strIdle;
245 end if;
246
247 when strEightDelay =>
248 dataIncr <= '0';
249 rShift <= '0';
250 CE <= '0';
251
252 if ctr(2 downto 0) = "111" then
253 ctRst <= '1';
254 dataRST <= '1';
255 strNext <= strGetData;
256 else
257 ctRst <= '0';
258 dataRST <= '0';
259 strNext <= strEightDelay;
260 end if;
261
262 when strGetData =>
263 CE <= '0';
264 dataRst <= '0';
265 if ctr(3 downto 0) = "1111" then
266 ctRst <= '1';
267 dataIncr <= '1';
268 rShift <= '1';
269 else
270 ctRst <= '0';
271 dataIncr <= '0';
272 rShift <= '0';
273 end if;
274
275 if dataCtr = "1010" then
276 strNext <= strCheckStop;
277 else
278 strNext <= strGetData;
279 end if;
280
281 when strCheckStop =>
282 dataIncr <= '0';
283 rShift <= '0';
284 dataRst <= '0';
285 ctRst <= '0';
286
287 CE <= '1';
288 strNext <= strIdle;
289
290 end case;
291
292 end process;
293
294 --TBE State Machine--
295 process (CLK, RST)
296 begin
297 if CLK = '1' and CLK'Event then
298 if RST = '1' then
299 stbeCur <= stbeIdle;
300 else
301 stbeCur <= stbeNext;
302 end if;
303 end if;
304 end process;
305
306 --This process generates the sequence of events needed to control the TBE flag--
307 process (stbeCur, CLK, WR, DBIN, load)
308 begin
309
310 case stbeCur is
311
312 when stbeIdle =>
313 TBE <= '1';
314 if WR = '1' then
315 stbeNext <= stbeSetTBE;
316 else
317 stbeNext <= stbeIdle;
318 end if;
319
320 when stbeSetTBE =>
321 TBE <= '0';
322 if load = '1' then
323 stbeNext <= stbeWaitLoad;
324 else
325 stbeNext <= stbeSetTBE;
326 end if;
327
328 when stbeWaitLoad =>
329 if load = '0' then
330 stbeNext <= stbeWaitWrite;
331 else
332 stbeNext <= stbeWaitLoad;
333 end if;
334
335 when stbeWaitWrite =>
336 if WR = '0' then
337 stbeNext <= stbeIdle;
338 else
339 stbeNext <= stbeWaitWrite;
340 end if;
341 end case;
342 end process;
343
344 --This process loads and shifts out the transfer shift register--
345 process (load, shift, tClk, tfSReg)
346 begin
347 TXD <= tfsReg(0);
348 if tClk = '1' and tClk'Event then
349 if load = '1' then
350 tfSReg (10 downto 0) <= ('1' & par & tfReg(7 downto 0) &'0');
351 end if;
352 if shift = '1' then
353
354 tfSReg (10 downto 0) <= ('1' & tfSReg(10 downto 1));
355 end if;
356 end if;
357 end process;
358
359 -- Transfer State Machine--
360 process (tClk, RST)
361 begin
362 if (tClk = '1' and tClk'Event) then
363 if RST = '1' then
364 sttCur <= sttIdle;
365 else
366 sttCur <= sttNext;
367 end if;
368 end if;
369 end process;
370
371 -- This process generates the sequence of steps needed to transfer the data--
372 process (sttCur, tfCtr, tfReg, TBE, tclk)
373 begin
374
375 case sttCur is
376
377 when sttIdle =>
378 tClkRST <= '0';
379 shift <= '0';
380 load <= '0';
381 if TBE = '1' then
382 sttNext <= sttIdle;
383 else
384 sttNext <= sttTransfer;
385 end if;
386
387 when sttTransfer =>
388 shift <= '0';
389 load <= '1';
390 tClkRST <= '1';
391 sttNext <= sttShift;
392
393
394 when sttShift =>
395 shift <= '1';
396 load <= '0';
397 tClkRST <= '0';
398 if tfCtr = "1100" then
399 sttNext <= sttIdle;
400 else
401 sttNext <= sttShift;
402 end if;
403 end case;
404 end process;
406 end Behavioral;
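The transmit path in the listing above loads an 11-bit shift register with a stop bit, the parity bit, the data byte and a start bit, and then shifts it out least significant bit first while refilling the register with ones. A minimal C++ model of that framing is sketched below. The function names buildFrame and shiftOut are illustrative and not part of the design, and the parity bit is passed in as a parameter because its computation is not shown in this part of the listing.

```cpp
#include <cstdint>
#include <vector>

// Pack one byte into the 11-bit transmit word used by the shift register:
// bit 10 = stop ('1'), bit 9 = parity, bits 8..1 = data, bit 0 = start ('0').
// The parity value is supplied by the caller (assumption: computed elsewhere).
std::uint16_t buildFrame(std::uint8_t data, bool parityBit) {
    std::uint16_t frame = 0;
    frame |= static_cast<std::uint16_t>(data) << 1;       // bit 0 stays 0 (start bit)
    frame |= static_cast<std::uint16_t>(parityBit) << 9;  // parity bit
    frame |= static_cast<std::uint16_t>(1u << 10);        // stop bit
    return frame;
}

// Model of the shift process: TXD follows bit 0, and each shift moves the
// register one place to the right while a '1' enters at bit 10.
std::vector<int> shiftOut(std::uint16_t frame) {
    std::vector<int> line;
    for (int i = 0; i < 11; ++i) {
        line.push_back(static_cast<int>(frame & 1u));
        frame = static_cast<std::uint16_t>((frame >> 1) | (1u << 10));
    }
    return line;
}
```

For the byte 0x3A (the ':' character used as the end-of-message marker) and a parity bit of 1, the model produces the line sequence 0 0 1 0 1 1 1 0 0 1 1, that is start bit, data bits from LSB to MSB, parity bit and stop bit.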
UART Interface
1 ---------------------------------------------------------------------
-------------
2 -- Company:
3 -- Engineer:
4 --
5 -- Create Date: 21:47:53 01/06/2015
6 -- Design Name:
7 -- Module Name: UART_INTERFACE - Behavioral
8 -- Project Name:
9 -- Target Devices:
10 -- Tool versions:
11 -- Description:
12 --
13 -- Dependencies:
14 --
15 -- Revision:
16 -- Revision 0.01 - File Created
17 -- Additional Comments:
18 --
19 --------------------------------------------------------------------
--------------
20 library IEEE;
21 use IEEE.STD_LOGIC_1164.ALL;
22 use IEEE.STD_LOGIC_ARITH.ALL;
23 use IEEE.STD_LOGIC_UNSIGNED.ALL;
24 -- Uncomment the following library declaration if using
25 -- arithmetic functions with Signed or Unsigned values
26 --use IEEE.NUMERIC_STD.ALL;
27
28 -- Uncomment the following library declaration if instantiating
29 -- any Xilinx primitives in this code.
30 --library UNISIM;
31 --use UNISIM.VComponents.all;
32
33 entity UART_INTERFACE is
34 Port ( RXD : in STD_LOGIC := '1';
35 DATA_UART_TO_DFPM : out STD_LOGIC_VECTOR (7 downto 0);
36 RDA_SIG : out STD_LOGIC;
37 DATA_READY_FROM_UART : out STD_LOGIC := '0';
38
39 WAITING_FOR_DFPM : out STD_LOGIC := '0';
40
41 CLK : in STD_LOGIC;
42 RST : in STD_LOGIC := '0';
43 LEDS : out STD_LOGIC_VECTOR (7 downto 0) := "00000000";
44
45 TXD : out STD_LOGIC := '1';
46 DATA_DFPM_TO_UART : in STD_LOGIC_VECTOR (7 downto 0);
47 TBE_SIG : out STD_LOGIC;
48 DATA_READY_FROM_DFPM : in STD_LOGIC := '0');
49 end UART_INTERFACE;
50
51 architecture Behavioral of UART_INTERFACE is
52
53 component RS232RefComp
54 Port (TXD : out std_logic := '1';
55 RXD : in std_logic;
56 CLK : in std_logic;
57 DBIN : in std_logic_vector (7 downto 0);
58 DBOUT : out std_logic_vector (7 downto 0);
59 RDA : inout std_logic;
60 TBE : inout std_logic := '1';
61 RD : in std_logic;
62 WR : in std_logic;
63 PE : out std_logic;
64 FE : out std_logic;
65 OE : out std_logic;
66 RST : in std_logic := '0');
67 end component;
68
69 --------------------------------------------------------------------
-----
70 type mainState is (
71 stReceive,
72 stWaitForDFPMOutput,
73 stSend,
74 stRepeatSend);
75 --------------------------------------------------------------------
-----
76
77 signal dbInSig : std_logic_vector(7 downto 0):= "00000000";
78 signal dbOutSig : std_logic_vector(7 downto 0):= "00000000";
79 signal rdaSig : std_logic;
80 signal tbeSig : std_logic;
81 signal rdSig : std_logic;
82 signal wrSig : std_logic;
83 signal peSig : std_logic;
84 signal feSig : std_logic;
85 signal oeSig : std_logic;
86
87 signal stCur : mainState := stReceive;
88 signal stNext : mainState;
89
90 Signal TxCount, RxCount : integer := 0;
91
92 Signal RxFlag, TxFlag, TbeFlag, RdaFlag, clearSendCount : std_logic
:= '0';
93
94 Signal TxDataReadStartPos : integer := 7;
95 Signal RxDataReadStartPos : integer := 0;
96
97 Signal Sig_Waiting_For_DFPM_Results : std_logic := '0';
98
99 Constant endOfRxMessage : std_logic_vector(7 downto 0) :=
"00111010";
100 Constant endOfTxMessage : std_logic_vector(7 downto 0) :=
"11111111";
101
102 Constant numberOfTxTransmissions : integer := 50;
103 Constant numberOfRxTransmissions : integer := 8;
104
105
106
107 begin
108
109 WAITING_FOR_DFPM <= Sig_Waiting_For_DFPM_Results;
110
111 TBE_SIG <= tbeSig;
112
113 RDA_SIG <= rdaSig;
114
115
116 Instantiating_the_UART: RS232RefComp port map ( TXD => TXD,
117 RXD => RXD,
118 CLK => CLK,
119 DBIN => dbInSig,
120 DBOUT => dbOutSig,
121 RDA => rdaSig,
122 TBE => tbeSig,
123 RD => rdSig,
124 WR => wrSig,
125 PE => peSig,
126 FE => feSig,
127 OE => oeSig,
128 RST => RST);
129
130 -------------------------------------------------------------------
------
131 process (CLK, RST)
132 begin
133 if (CLK = '1' and CLK'Event) then
134 if RST = '1' then
135 stCur <= stReceive;
136 else
137 stCur <= stNext;
138 end if;
139 end if;
140 end process;
141 -------------------------------------------------------------------
------
142
143 process (stCur, rdaSig, dboutsig, tbeSig,
TxCount,DATA_READY_FROM_DFPM)
144 Variable TXFlagVar : std_logic:= '0';
145 begin
146 case stCur is
147 when stReceive =>
148 rdSig <= '0';
149 wrSig <= '0';
150 Sig_Waiting_For_DFPM_Results <= '0';
151 DATA_READY_FROM_UART <= '1';
152 if (dbOutSig = endOfRxMessage) then
153 stNext <= stWaitForDFPMOutput;
154 DATA_READY_FROM_UART <= '0';
155 else
156 stNext <= stReceive;
157 end if;
158 if (rdaSig = '1') then
159 rdSig <= '1';
160 LEDS <= dbOutSig;
161 -- Send the newly received data to DFPM
162 DATA_UART_TO_DFPM <= dbOutSig;
163 end if;
164
165 when stWaitForDFPMOutput =>
166 Sig_Waiting_For_DFPM_Results <= '1';
167 -- Signal with the LEDS
168 LEDS <= (Others => '1');
169 -- Prevent the RX from receiving and the TX from transmitting
170 rdSig <= '1';
171 wrSig <= '0';
172 -- Do nothing else. Just wait until output is ready from the DFPM
173 if (DATA_READY_FROM_DFPM = '1') then
174 stNext <= stSend;
175 else
176 stNext <= stWaitForDFPMOutput;
177 end if;
178
179 when stSend =>
180 Sig_Waiting_For_DFPM_Results <= '0';
181 LEDS <= (Others => '0');
182 --LEDS(0) <= '1';
183 if (TxCount = numberOfTxTransmissions) then
184 stNext <= stReceive;
185 else
186 rdSig <= '1';
187 wrSig <= '1';
188 stNext <= stRepeatSend;
189 end if;
190
191 when stRepeatSend =>
192 --LEDS(1) <= '1';
193 wrSig <= '0';
194 if (tbeSig = '1') then
195 --Send the newly received data to UART TX
196 dbInSig <= DATA_DFPM_TO_UART;
197 stNext <= stSend;
198 else
199 stNext <= stRepeatSend;
200 end if;
201 end case;
202 end process;
203
204 ---- Determining the number of tx transmissions to be sent
205 ---- and which position in memory is to be printed out.
206 process(tbeSig)
207 Variable TxCountVar : integer := 0;
208 Variable TxDataReadStartPosVar : integer := 0;
209 begin
210 if falling_edge(tbeSig) then
211 TxCountVar := TxCount;
212 TxCount <= TxCountVar + 1;
213
214 TxDataReadStartPosVar := TxDataReadStartPos;
215 TxDataReadStartPos <= TxDataReadStartPosVar - 1;
216 end if;
217 end process;
218
219 end Behavioral;
220
221
Project Top module
DFPM_ON_FPGA_PROJECT_DEMO_TOP_MODULE.vhd Mon Feb 09 03:31:07 2015
1 ----------------------------------------------------------------------------------
2 -- Company:
3 -- Engineer:
4 --
5 -- Create Date: 14:56:46 02/06/2015
6 -- Design Name:
7 -- Module Name: DFPM_ON_FPGA_PROJECT_DEMO_TOP_MODULE - Behavioral
8 -- Project Name:
9 -- Target Devices:
10 -- Tool versions:
11 -- Description:
12 --
13 -- Dependencies:
14 --
15 -- Revision:
16 -- Revision 0.01 - File Created
17 -- Additional Comments:
18 --
19 --------------------------------------------------------------------
--------------
20 library IEEE;
21 use IEEE.STD_LOGIC_1164.ALL;
22 USE ieee.std_logic_unsigned.all;
23 use IEEE.NUMERIC_STD.ALL;
24
25 use work.DFPM_VECTOR_5X32_BIT.all; -- 5 by 1 matrix of std_logic_vectors package
26 use work.DFPM_VECTOR_25X32_BIT.all; -- 5 by 5 matrix of std_logic_vectors package
27 use work.DFPM_ARRAY_5X32_BIT.all; -- 5 by 1 matrix of signed vectors package
28 use work.DFPM_ARRAY_25X32_BIT.all; -- 5 by 5 matrix of signed vectors package
29
30 -- Uncomment the following library declaration if using
31 -- arithmetic functions with Signed or Unsigned values
32
33
34 -- Uncomment the following library declaration if instantiating
35 -- any Xilinx primitives in this code.
36 --library UNISIM;
37 --use UNISIM.VComponents.all;
38
39 entity DFPM_ON_FPGA_PROJECT_DEMO_TOP_MODULE is
40 Port ( RXD : in STD_LOGIC;
41 TXD: out STD_LOGIC;
42 CLK : in STD_LOGIC;
43 RST : in STD_LOGIC;
44 LEDS : out STD_LOGIC_VECTOR (7 downto 0));
45 end DFPM_ON_FPGA_PROJECT_DEMO_TOP_MODULE;
46
47 architecture Behavioral of DFPM_ON_FPGA_PROJECT_DEMO_TOP_MODULE is
48
49 COMPONENT UART_INTERFACE
50 PORT(
51 RXD : IN std_logic;
52 CLK : IN std_logic;
53 RST : IN std_logic;
54 DATA_DFPM_TO_UART : IN std_logic_vector(7 downto 0);
55 LEDS : out STD_LOGIC_VECTOR (7 downto 0);
56 DATA_READY_FROM_DFPM : IN std_logic;
57
58 WAITING_FOR_DFPM : out STD_LOGIC := '0';
59
60 DATA_UART_TO_DFPM : OUT std_logic_vector(7 downto 0);
61 RDA_SIG : OUT std_logic;
62 DATA_READY_FROM_UART : OUT std_logic;
63 TXD : OUT std_logic;
64 TBE_SIG : OUT std_logic);
65 END COMPONENT;
66
67 COMPONENT Signed_DFPM_Iteration_Control_Top_Module
68 Port ( VECTOR_A_IN : IN DFPM_SIGNED_VECTOR_25X32_BIT;
69 VECTOR_B_IN : IN DFPM_SIGNED_VECTOR_5X32_BIT;
70
71 DATA_READY_FROM_UART_RX : IN STD_LOGIC;
72
73 CLK : IN STD_LOGIC;
74 RST : IN STD_LOGIC;
75
76 VECTOR_B_AX : OUT DFPM_SIGNED_VECTOR_5X32_BIT;
77
78 DATA_READY_FROM_ONE_ITERATION : OUT STD_LOGIC := '0';
79 DATA_READY_FROM_DFPM_ITERATIONS : OUT STD_LOGIC;
80
81 VECTOR_X_OUT : OUT DFPM_SIGNED_VECTOR_5X32_BIT);
82 END COMPONENT;
83
84 type storage_type_dfpm is array (0 to 5) of std_logic_vector(40 downto 0); -- Larger so as to make room for the newline character
85 type type_DFPM_Format is array (0 to 30) of Signed(32 downto 0);
86 ------------------------------------------------------------------
87 Signal Sig_DFPM_Input_array : type_DFPM_Format := (others => (others => '0'));
118
119 Signal Sig_DFPM_storage_array : storage_type_dfpm := (others => (others => '0'));
125
126 Signal Sig_UART_STORE_Pos : integer := 0;
127
128 Signal Sig_UART_READ_Pos : integer := 0;
129
130 Signal Sig_UART_READ_Pos_8BitPart : integer := 4;
131
132 Signal Sig_UART_IN_STORAGE_FLAG : STD_LOGIC := '1';
133
134 Signal Sig_RDA, Sig_TBE : STD_LOGIC := '0';
135 Signal Sig_WAITING_FOR_DFPM, Sig_DataReady_From_UART: STD_LOGIC;
136
137 Signal Sig_UART_Data_Storage_Complete : STD_LOGIC := '0';
138
139 Signal Sig_start_DFPM_computation : STD_LOGIC := '0';
140
141 Signal Sig_DataReady_From_DFPM : STD_LOGIC := '0';
142
143 Signal Sig_DataReady_To_DFPM_Module, Sig_DATA_READY_FROM_UART :
std_logic := '0';
144
145 Signal Sig_DATA_OUT_UART_INTERFACE : std_logic_vector(7 downto 0);
146
147 Signal Sig_VECTOR_A_IN_TO_DFPM : DFPM_SIGNED_VECTOR_25X32_BIT;
148 Signal Sig_VECTOR_B_IN_TO_DFPM : DFPM_SIGNED_VECTOR_5X32_BIT;
149
150 Signal Sig_VECTOR_X_OUT_FROM_DFPM : DFPM_SIGNED_VECTOR_5X32_BIT;
151
152 Signal Sig_DATA_IN_UART_INTERFACE : std_logic_vector(7 downto 0);
153
154 Signal Sig_DFPMStartFlag, Sig_RDA_Signal : std_logic := '0';
155
156 Signal sig_FirstWrite : std_logic := '1';
157
158 ----------------------------------------------------------------
159 --ASCII representations
160 Constant Const_SemiColon : std_logic_vector(7 downto 0) :=
"00111011";
161 Constant Const_Colon : std_logic_vector(7 downto 0) := "00111010";
162 Constant Const_Space : std_logic_vector(7 downto 0) := "00100000";
163 Constant Const_Opening_Bracket : std_logic_vector(7 downto 0) :=
"01011011";
164 Constant Const_Closing_Bracket : std_logic_vector(7 downto 0) :=
"01011101";
165 Constant Const_Newline : std_logic_vector(7 downto 0) :=
"00001010";
166
167 Constant Const_Zero : std_logic_vector(7 downto 0) := "00110000";
168 Constant Const_One : std_logic_vector(7 downto 0) := "00110001";
169 Constant Const_Two : std_logic_vector(7 downto 0) := "00110010";
170 Constant Const_Three : std_logic_vector(7 downto 0) := "00110011";
171 Constant Const_Four : std_logic_vector(7 downto 0) := "00110100";
172 Constant Const_Five : std_logic_vector(7 downto 0) := "00110101";
173 Constant Const_Six : std_logic_vector(7 downto 0) := "00110110";
174 Constant Const_Seven : std_logic_vector(7 downto 0) := "00110111";
175 Constant Const_Eight : std_logic_vector(7 downto 0) := "00111000";
176 Constant Const_Nine : std_logic_vector(7 downto 0) := "00111001";
177
178 -- Other constants
179 Constant Const_9_Zeros : signed(8 downto 0) := "000000000";
180 Constant Const_16_Zeros : signed(15 downto 0) :=
"0000000000000000";
181
182
183 Constant Const_0 : Signed(7 downto 0) := "00000000";
184 Constant Const_1 : Signed(7 downto 0) := "00000001";
185 Constant Const_2 : Signed(7 downto 0) := "00000010";
186 Constant Const_3 : Signed(7 downto 0) := "00000011";
187 Constant Const_4 : Signed(7 downto 0) := "00000100";
188 Constant Const_5 : Signed(7 downto 0) := "00000101";
189 Constant Const_6 : Signed(7 downto 0) := "00000110";
190 Constant Const_7 : Signed(7 downto 0) := "00000111";
191 Constant Const_8 : Signed(7 downto 0) := "00001000";
192 Constant Const_9 : Signed(7 downto 0) := "00001001";
193
194 Constant C9 : signed(8 downto 0) := "000000000";
195 Constant C16 : signed(15 downto 0) := "0000000000000000";
196 ----------------------------------------------------------------
197
198
199
200 begin
201
202 -- Determining the position in which incoming data is to be stored
203 process(Sig_RDA, Sig_UART_STORE_Pos, Sig_DATA_OUT_UART_INTERFACE)
204 variable varPos : integer;
205 begin
206 if rising_edge(Sig_RDA) then
207 if (Sig_UART_STORE_Pos < 30) then
208 if ((Sig_DATA_OUT_UART_INTERFACE = Const_SemiColon)
209 --or (Sig_DATA_OUT_UART_INTERFACE = Const_Colon)
210 or (Sig_DATA_OUT_UART_INTERFACE = Const_Space)
211 or (Sig_DATA_OUT_UART_INTERFACE = Const_Closing_Bracket)) then
212
213 varPos := Sig_UART_STORE_Pos;
214 Sig_UART_STORE_Pos <= varPos + 1;
215 end if;
216 end if;
217 end if;
218 end process;
219
220 -- The actual data storage
221 process(Sig_RDA, Sig_DATA_OUT_UART_INTERFACE, Sig_UART_STORE_Pos)
222 begin
223 if rising_edge(Sig_RDA) then
224 if (Sig_UART_STORE_Pos < 30) then
225 if ((Sig_DATA_OUT_UART_INTERFACE = Const_One)
226 or (Sig_DATA_OUT_UART_INTERFACE = Const_Two) or (
Sig_DATA_OUT_UART_INTERFACE = Const_Three)
227 or (Sig_DATA_OUT_UART_INTERFACE = Const_Four) or (
Sig_DATA_OUT_UART_INTERFACE = Const_Five)
228 or (Sig_DATA_OUT_UART_INTERFACE = Const_Six) or (
Sig_DATA_OUT_UART_INTERFACE = Const_Seven)
229 or (Sig_DATA_OUT_UART_INTERFACE = Const_Eight) or (
Sig_DATA_OUT_UART_INTERFACE = Const_Nine)) then
230 -- Storing the 4 LSBs, the part of the ASCII numerical representation that contains the number being transmitted
231 -- Sig_DFPM_Input_array(Sig_UART_STORE_Pos)(19 downto 16) <= Signed(Sig_DATA_OUT_UART_INTERFACE(3 downto 0));
232
233 Sig_DFPM_Input_array(Sig_UART_STORE_Pos)(19 downto 16) <= Signed(Sig_DATA_OUT_UART_INTERFACE(3 downto 0));
234
235 end if;
236 end if;
237 end if;
238 end process;
239 ------------------------------------------------------------------
240 --Signalling that storage is complete
241 process(CLK, Sig_UART_STORE_Pos)
242 begin
243 if rising_edge(CLK) then
244 --if (Sig_UART_STORE_Pos = 29) then
245 if (Sig_DATA_OUT_UART_INTERFACE = Const_Colon) then
246 Sig_UART_Data_Storage_Complete <= '1';
247 end if;
248 end if;
249 end process;
250
251 -- Signalling that the DFPM computation should start
252 process(clk, Sig_UART_Data_Storage_Complete, Sig_DFPMStartFlag)
253 variable var_DFPMStartFlag : std_logic := '0';
254 begin
255 if rising_edge(clk) then
256 if (Sig_UART_Data_Storage_Complete = '1') then
257 if (Sig_DFPMStartFlag = '0') then
258 Sig_start_DFPM_computation <= '1';
259 Sig_DFPMStartFlag <= '1';
260 else
261 Sig_start_DFPM_computation <= '0';
262 end if;
263 else
264 Sig_start_DFPM_computation <= '0';
265 end if;
266 end if;
267 end process;
268
269 ------------------------------------------------------------------
270
271 process(Sig_TBE, Sig_UART_READ_Pos, Sig_UART_READ_Pos_8BitPart)
272 Variable var_readPos, var_ReadPos_8bitPart : integer := 0;
273 begin
274 if rising_edge(Sig_TBE) then
275 if (Sig_UART_READ_Pos < 5) then
276
277 --Bits 31 downto 24, then 23 downto 16, then 15 downto 8, then 7 downto 0
278 --in successive transmissions through the UART is equivalent to ->
279 --(8(x + 1) - 1) downto (8*x) where x is the Sig_UART_READ_Pos_8BitPart
280 -- This approach will transmit the data contained in each element of the solution vector
281 -- in series of 8 bits starting from the MSB to the LSB
282 --------------------------------------
283 if (Sig_UART_READ_Pos_8BitPart = 4) then
284 Sig_DATA_IN_UART_INTERFACE <= Sig_DFPM_storage_array(
Sig_UART_READ_Pos)(39 downto 32);
285 elsif (Sig_UART_READ_Pos_8BitPart = 3) then
286 Sig_DATA_IN_UART_INTERFACE <= Sig_DFPM_storage_array(
Sig_UART_READ_Pos)(31 downto 24);
287 elsif (Sig_UART_READ_Pos_8BitPart = 2) then
288 Sig_DATA_IN_UART_INTERFACE <= Sig_DFPM_storage_array(
Sig_UART_READ_Pos)(23 downto 16);
289 elsif (Sig_UART_READ_Pos_8BitPart = 1) then
290 Sig_DATA_IN_UART_INTERFACE <= Sig_DFPM_storage_array(
Sig_UART_READ_Pos)(15 downto 8);
291 elsif (Sig_UART_READ_Pos_8BitPart = 0) then
292 Sig_DATA_IN_UART_INTERFACE <= Sig_DFPM_storage_array(
Sig_UART_READ_Pos)(7 downto 0);
293 end if;
294
295
296 if (Sig_UART_READ_Pos_8BitPart = 0) then
297 Sig_UART_READ_Pos_8BitPart <= 4;
298
299 var_readPos := Sig_UART_READ_Pos;
300 Sig_UART_READ_Pos <= var_readPos + 1;
301 else
302 var_ReadPos_8bitPart := Sig_UART_READ_Pos_8BitPart;
303 Sig_UART_READ_Pos_8BitPart <= var_ReadPos_8bitPart - 1;
304 end if;
305 end if;
306 end if;
307 end process;
308 ------------------------------------------------------------------
309
310 Inst_UART_INTERFACE: UART_INTERFACE PORT MAP(
311 RXD => RXD,
312 DATA_UART_TO_DFPM => Sig_DATA_OUT_UART_INTERFACE,
313 RDA_SIG => Sig_RDA,
314 DATA_READY_FROM_UART => Sig_DATA_READY_FROM_UART,
315
316 CLK => CLK,
317 RST => RST,
318 LEDS => LEDS,
319
320 WAITING_FOR_DFPM => Sig_WAITING_FOR_DFPM,
321
322 TXD => TXD,
323 DATA_DFPM_TO_UART => Sig_DATA_IN_UART_INTERFACE,
324 TBE_SIG => Sig_TBE,
325 DATA_READY_FROM_DFPM => Sig_DataReady_From_DFPM);
326
327
328 Inst_Signed_DFPM_Iteration_Control_Top_Module:
Signed_DFPM_Iteration_Control_Top_Module PORT MAP(
329 VECTOR_A_IN => Sig_VECTOR_A_IN_TO_DFPM,
330 VECTOR_B_IN => Sig_VECTOR_B_IN_TO_DFPM,
331 DATA_READY_FROM_UART_RX => Sig_DATA_READY_FROM_UART,
332 CLK => CLK,
333 RST => RST,
334 VECTOR_B_AX => open,
335 DATA_READY_FROM_ONE_ITERATION => open,
336 DATA_READY_FROM_DFPM_ITERATIONS => Sig_DataReady_From_DFPM,
337 VECTOR_X_OUT => Sig_VECTOR_X_OUT_FROM_DFPM);
338 -------------------------------------------------------------------
--
339 --
340 --
341 -- Sig_VECTOR_A_IN_TO_DFPM <= ( (C9&Const_8&C16, C9&Const_2&C16, C9&Const_3&C16, C9&Const_4&C16, C9&Const_5&C16),
342 -- (C9&Const_1&C16, C9&Const_7&C16, C9&Const_3&C16, C9&Const_4&C16, C9&Const_5&C16),
343 -- (C9&Const_1&C16, C9&Const_2&C16, C9&Const_9&C16, C9&Const_4&C16, C9&Const_5&C16),
344 -- (C9&Const_1&C16, C9&Const_2&C16, C9&Const_3&C16, C9&"00001010"&C16, C9&Const_5&C16),
345 -- (C9&Const_1&C16, C9&Const_2&C16, C9&Const_3&C16, C9&Const_4&C16, C9&"00001100"&C16));
346 --
347 --
348 -- Sig_VECTOR_B_IN_TO_DFPM <= (C9&Const_1&C16, C9&Const_2&C16, C9&Const_3&C16, C9&Const_4&C16, C9&Const_5&C16);
349
350
351 -- Direct assignment of data received from the UART RX pin (stored after conversion)
352 Sig_VECTOR_A_IN_TO_DFPM <= ( (Sig_DFPM_Input_array(0),
Sig_DFPM_Input_array(1),
Sig_DFPM_Input_array(2), Sig_DFPM_Input_array(3),
Sig_DFPM_Input_array(4)),
353 (Sig_DFPM_Input_array(5), Sig_DFPM_Input_array(6),
Sig_DFPM_Input_array(7), Sig_DFPM_Input_array(8),
Sig_DFPM_Input_array(9)),
354 (Sig_DFPM_Input_array(10), Sig_DFPM_Input_array(11),
Sig_DFPM_Input_array(12), Sig_DFPM_Input_array(13),
Sig_DFPM_Input_array(14)),
355 (Sig_DFPM_Input_array(15), Sig_DFPM_Input_array(16),
Sig_DFPM_Input_array(17), Sig_DFPM_Input_array(18),
Sig_DFPM_Input_array(19)),
356 (Sig_DFPM_Input_array(20), Sig_DFPM_Input_array(21),
Sig_DFPM_Input_array(22), Sig_DFPM_Input_array(23),
Sig_DFPM_Input_array(24)) );
357
358
359 Sig_VECTOR_B_IN_TO_DFPM <= (Sig_DFPM_Input_array(25),
Sig_DFPM_Input_array(26),
Sig_DFPM_Input_array(27), Sig_DFPM_Input_array(28),
Sig_DFPM_Input_array(29));
360
361
-----------------------------------------------------------------------
--------------
362
363
364 -- Direct assignment of the output of the DFPM module to the output storage
365 Sig_DFPM_storage_array(0) <=
std_logic_vector(Sig_VECTOR_X_OUT_FROM_DFPM(0)) &
Const_Newline;
366 Sig_DFPM_storage_array(1) <=
std_logic_vector(Sig_VECTOR_X_OUT_FROM_DFPM(1)) &
Const_Newline;
367 Sig_DFPM_storage_array(2) <=
std_logic_vector(Sig_VECTOR_X_OUT_FROM_DFPM(2)) &
Const_Newline;
368 Sig_DFPM_storage_array(3) <=
std_logic_vector(Sig_VECTOR_X_OUT_FROM_DFPM(3)) &
Const_Newline;
369 Sig_DFPM_storage_array(4) <=
std_logic_vector(Sig_VECTOR_X_OUT_FROM_DFPM(4)) &
Const_Newline;
370
371 ------------------------------------------------------------------
372
373 end Behavioral;
374
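The receive logic in the top module stores the low four bits of each ASCII digit in bits 19 downto 16 of a 33-bit word. Together with the commented-out C9 & Const_x & C16 constants, this suggests a fixed-point layout with 16 fractional bits. The C++ sketch below models that conversion and the slot counter under this assumption. The names digitToFixed and parseMessage, the simplification of stopping at the colon, and the example message are hypothetical; the exact message format sent by the PC is not specified in this listing.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <string>

constexpr int kFractionBits = 16;  // assumed position of the binary point
constexpr int kSlots = 30;         // 25 entries of A followed by 5 entries of b

// Place a single decimal digit at bits 19..16, i.e. digit * 2^16.
std::int64_t digitToFixed(char c) {
    assert(c >= '0' && c <= '9');
    return static_cast<std::int64_t>(c - '0') << kFractionBits;
}

// Software model of the two storage processes: the digits '1'..'9' are written
// to the current slot, and ';', ' ' or ']' advance to the next slot.
// A ':' marks the end of the message (simplified here to stop parsing).
std::array<std::int64_t, kSlots> parseMessage(const std::string& message) {
    std::array<std::int64_t, kSlots> slots{};  // all slots start at zero
    int pos = 0;
    for (char c : message) {
        if (c == ':') break;
        if (pos >= kSlots) continue;
        if (c >= '1' && c <= '9') {
            slots[pos] = digitToFixed(c);
        } else if (c == ';' || c == ' ' || c == ']') {
            ++pos;
        }
    }
    return slots;
}
```

With the hypothetical message "8 2;3]:", the first three slots hold 8, 2 and 3 scaled by 2^16 and the remaining slots stay zero. As in the VHDL, each slot holds one decimal digit, and a '0' character writes nothing because the slots are initialised to zero.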
Test code written in C++
#include <iostream>
#include <stdio.h>
//#include <time.h>
#include <Windows.h>
#include <math.h>

/*
#define _WIN32_WINNT 0x0600
#define NTDDI_WIN7 (0x06010000)
#define _WIN32_WINNT_WIN7 (0x0601)
*/

/*
 * This application is used to estimate the number of clock cycles used during an execution of
 * the Dynamic Functional Particle Method algorithm. The method can be used to solve many problems
 * requiring numerical methods. In this application, DFPM was used to solve the classical A*X = B
 * problem, where A is a matrix of coefficients, X a vector of variables and B a vector of constants.
 * The values of the variables in vector X are computed by the method and printed to the console.
 *
 * In order to estimate the number of cycles used, the kernel and user times were first obtained and
 * then the total number of clock cycles used in both kernel and user modes was obtained. This was
 * done to make it easier to identify the number of clock cycles used in each mode separately.
 *
 * In this implementation, the actual DFPM algorithm is repeated a thousand (1000) times. A single
 * execution of the algorithm might be completed in such a short time that the user and kernel
 * times would be registered as zero, making it difficult to identify the number of clock cycles
 * spent in each mode.
 *
 * By repeating the algorithm a thousand times and then dividing the number of clock cycles obtained
 * by a thousand, a reasonable approximation of the average number of clock cycles used by the
 * process in user mode is obtained.
 */

int main(){
    using namespace std;

    //************ Initial initializations
    // Creating matrix A
    int a[5][5] = {
        { 8, 2, 3, 4, 5 },
        { 1, 7, 3, 4, 5 },
        { 1, 2, 9, 4, 5 },
        { 1, 2, 3, 8, 5 },
        { 1, 6, 3, 4, 9 } };

    // Creating vector B
    int b[5] = { 1, 2, 3, 4, 5 };

    int mu = 1;
    double dt = 0.1;

    volatile double Ax[5];
    volatile double B_Minus_Ax[5];
    volatile double B_Minus_Ax_Minus_MuV[5];
    volatile double muV[5];

    double tolerance = 7.8125e-3;
    int i, j, count;
    count = 0;
    int noOfCompleteAlgorithmRepetitions = 1000;

    //*********** Creating multiple arrays for the vector norm and vectors V and X
    volatile double arrayOfV[1000][5];
    volatile double arrayOfX[1000][5];
    volatile double arrayOfVectorNorm[1000];

    for (int outerIndex = 0; outerIndex < noOfCompleteAlgorithmRepetitions; outerIndex++){
        arrayOfVectorNorm[outerIndex] = 10000;
        for (int innerIndex = 0; innerIndex < 5; innerIndex++){
            arrayOfV[outerIndex][innerIndex] = 1;
            arrayOfX[outerIndex][innerIndex] = 1;
        }
    }

    //******** Noting the kernel and user times at the beginning of the algorithm execution
    FILETIME creationTime1, exitTime1, kernelTime1, userTime1;
    double startTime_Kernel = 0, startTime_User = 0;
    bool myBool1 = GetProcessTimes(GetCurrentProcess(), &creationTime1, &exitTime1,
                                   &kernelTime1, &userTime1) != 0;
    if (myBool1){
        // FILETIME counts 100 ns intervals; convert to seconds
        startTime_Kernel = (double)(kernelTime1.dwLowDateTime |
            ((unsigned long long)kernelTime1.dwHighDateTime << 32))*0.0000001;
        startTime_User = (double)(userTime1.dwLowDateTime |
            ((unsigned long long)userTime1.dwHighDateTime << 32))*0.0000001;
    }
    else {
        cout << "Function GetProcessTimes failed on the first call" << endl;
    }

    //*** Algorithm computations and repetitions
    while (count < noOfCompleteAlgorithmRepetitions){ // Enabling repetition here
        // The actual algorithm
        while (arrayOfVectorNorm[count] > tolerance){
            arrayOfVectorNorm[count] = 0;
            i = 0;
            while (i < 5){
                Ax[i] = 0;
                j = 0;
                while (j < 5){ // A*X is done here
                    Ax[i] += (a[i][j] * arrayOfX[count][j]);
                    j++;
                } // Ax[i] has its new value at the end of this loop

                muV[i] = mu*arrayOfV[count][i];
                B_Minus_Ax[i] = b[i] - Ax[i];
                B_Minus_Ax_Minus_MuV[i] = B_Minus_Ax[i] - muV[i];

                arrayOfVectorNorm[count] += ((B_Minus_Ax[i])*(B_Minus_Ax[i]));
                arrayOfV[count][i] += B_Minus_Ax_Minus_MuV[i] * dt;
                arrayOfX[count][i] += arrayOfV[count][i] * dt;
                i++;
            }
        }
        count++;
    }

    ULONG64 myCycleTime = 0;
    bool myBool3 = QueryProcessCycleTime(GetCurrentProcess(), &myCycleTime) != 0;
    if (myBool3){
        cout << "The total number of clock cycles used for "
             << noOfCompleteAlgorithmRepetitions << " repetitions is : " << myCycleTime << endl;
    }
    else {
        cout << "Failed accessing the Process Cycle time" << endl;
    }

    FILETIME creationTime2, exitTime2, kernelTime2, userTime2;
    double stopTime_Kernel = 0, stopTime_User = 0;
    bool myBool2 = GetProcessTimes(GetCurrentProcess(), &creationTime2, &exitTime2,
                                   &kernelTime2, &userTime2) != 0;
    if (myBool2){
        stopTime_Kernel = (double)(kernelTime2.dwLowDateTime |
            ((unsigned long long)kernelTime2.dwHighDateTime << 32))*0.0000001;
        stopTime_User = (double)(userTime2.dwLowDateTime |
            ((unsigned long long)userTime2.dwHighDateTime << 32))*0.0000001;
    }
    else {
        cout << "Function GetProcessTimes failed on the second call" << endl;
    }

    for (int pos = 0; pos < 5; pos++){
        cout << arrayOfX[count - 1][pos] << endl;
    }

    double totalUserTime = stopTime_User - startTime_User;
    double totalKernelTime = stopTime_Kernel - startTime_Kernel;

    if (myBool1 && myBool2 && myBool3){
        cout << "The number of repetitions is : " << count << endl;
        cout << "The total kernel time in seconds : " << totalKernelTime << endl;
        cout << "The total user time in seconds : " << totalUserTime << endl;
        cout << "The cpu time in seconds : " << (totalKernelTime + totalUserTime) << endl;
        cout << "The total number of clock cycles used during each run of the complete algorithm is : "
             << (myCycleTime / noOfCompleteAlgorithmRepetitions) << " cycles." << endl;
        cout << "In consideration of the possibility of Kernel time being too low to be measured,"
             << " the minimum number of processor clock cycles in User mode is : "
             << (((totalUserTime - (1 * 0.0000001)) / (totalUserTime + totalKernelTime))
                 *(myCycleTime / noOfCompleteAlgorithmRepetitions))
             << " cycles." << endl;
    }
    else {
        cout << "One of the processor information obtaining processes failed" << endl;
    }

    return 0;
}
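The inner loop of the test program can be read as one time step of the DFPM iteration for Ax = b: the residual of each row drives the velocity, and the velocity then moves the solution estimate. The sketch below isolates that step as a function so that it can be checked by hand. The name dfpmStep and the use of std::vector are illustrative choices, not part of the original program. Like the listing, it updates x row by row, so later rows see the already updated entries.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One DFPM time step for A x = b, mirroring the loop body of the test program.
// Returns the squared residual norm, accumulated row by row before each update.
double dfpmStep(const std::vector<std::vector<double>>& A,
                const std::vector<double>& b,
                std::vector<double>& x,
                std::vector<double>& v,
                double mu, double dt) {
    assert(A.size() == b.size() && x.size() == b.size() && v.size() == b.size());
    double normSquared = 0.0;
    for (std::size_t i = 0; i < b.size(); ++i) {
        double Ax = 0.0;
        for (std::size_t j = 0; j < x.size(); ++j) {
            Ax += A[i][j] * x[j];                 // row i of A times x
        }
        const double residual = b[i] - Ax;        // B_Minus_Ax in the listing
        normSquared += residual * residual;
        v[i] += (residual - mu * v[i]) * dt;      // velocity update with damping mu
        x[i] += v[i] * dt;                        // position update uses the new velocity
    }
    return normSquared;
}
```

With A = diag(2, 4), b = (4, 4), x = v = (1, 1), mu = 1 and dt = 0.5, one step gives v = (1.5, 0.5) and x = (1.75, 1.25), and the function returns 4. The test program repeats such steps until the squared residual norm falls below the tolerance.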
Appendix B: Explanation of some basic
mathematical concepts
Two’s complement
A non-negative number has the same bit pattern in unsigned binary and in two’s
complement representation. A negative number is converted between its binary
magnitude and its two’s complement form, in either direction, with a two-step
operation:
1. Invert all the bits.
2. Add 1 to the LSB.
Negative numbers in two’s complement representation always have a 1 at the
MSB.
Euclidean norm
The Euclidean norm of a vector is calculated by summing the squares of the
elements of the vector and taking the square root of the sum:
\[ \lVert x \rVert = \sqrt{\sum_{k=1}^{n} x_k^{2}} \tag{B.1} \]
where n is the number of elements in the vector and x_k is the element at position k.
Appendix C: Project report summary
Figure C.1 Project summary
Appendix D: MATLAB codes
Code for problem specification and comparison.
The MATLAB code below was used to send and receive problem parameters
between the PC running MATLAB and the FPGA. It required access to the PC’s
USB ports, and USB and RS232 cables for powering and communication
respectively.
A short guide is included in the comment at the beginning of the script. The
script takes a problem specification of the form Ax = b and runs a MATLAB
implementation of the algorithm, printing the solution to the screen.
Immediately afterwards, the same problem is fed into the FPGA, which has been
pre-programmed with the DFPM on FPGA design.
The FPGA receives the problem specification, runs the DFPM algorithm to
compute the solution and then sends the solution back to the MATLAB
application. The MATLAB application then plots the two sets of values
obtained. The results are discussed in Chapter 6.
%%%%%%%%%%%%%%
%This script runs an implementation of the DFPM in MATLAB and then passes
%the same problem to the FPGA implementation through a port
%object that is bound to the UART, which is connected to the FPGA.
%%*********************************************************%%%%
%The following parameters define the problem and they can be changed right
%from inside this script by modifying lines 56 and 59 for the MATLAB
%implementation and line 118 for the FPGA implementation.
%Problem model A*x = b
%A = [9 2 3 4 5; 1 7 3 4 5; 1 2 9 4 5; 1 2 3 8 5; 1 2 3 4 9]
%b = [1 2 3 4 5]
%x, which is the solution, will be printed to the MATLAB console on the PC
%in decimal format.
%%**********************************************************%%%
%Fixed parameters relevant to the test:
%The following parameters are fixed in the sense that these same parameters
%are available and modifiable in this script for the MATLAB implementation
%but for the FPGA implementation, one needs to modify the code in the FPGA
%and re-synthesize, re-map and re-generate the bit programming file and then
%program the FPGA before the algorithm run conditions can be at par between
%the two implementations.
%The parameters are:
%X = [1 1 1 1 1]
%V = [1 1 1 1 1]
%dt = 0.1 (discretization coefficient)
%mu = 1.0 (damping coefficient)
%%********************************************************%%%
%Communication parameters:
%Since the PC communicates with the FPGA through UART on the COM port,
%the parameters are listed below:
%Baud Rate: 9600
%Number of data bits: 8
%Parity: Odd
%Stop bits: 1
%Handshaking: None
%In order to have a hitch-free test, please follow these steps:
% 1. Program the FPGA with the bit programming file
% 2. Ensure that a USB to RS232 communication cable is connected between
%    the PC's USB port and the FPGA's RS232 port.
% 3. Check and verify the identity of the port to which the FPGA is
%    connected (in the Windows device manager or command line)
%    (NOTE: If this same code is to run in a Linux environment, then the
%    port identity should be checked in the command line as well and
%    necessary modifications made to this script)
% 4. Ensure that the COM port identity corresponds to the COM port identity
%    in the connection parameters indicated in the com object creation.
% 5. Run the script and check the solution on the MATLAB console. :-)
% 6. If there is any error, verify that the steps 1 to 5 were followed.
%%***First part - MATLAB implementation*********%%%%%%%%%%%%%%%%%%%%%%%%%
A = [6 2 3 4 5; 1 8 3 4 5; 1 2 7 4 5; 1 2 3 8 5; 1 2 3 4 9];
x = [1 1 1 1 1]';
b = [5 4 3 2 1]';
v = [1 1 1 1 1]';
dt = 0.1;
mu = 1;
for i = 1:10000,
    v = v + (b - A*x - mu*v) * dt;
    x = x + v*dt;
    if norm(b - A*x) < 7.8125e-3,
        break,
    end
end
i
%Printing out the values in the solution vector
fprintf('The values obtained from the MATLAB implementation:\n%d, %d, %d, %d, %d\n\n', ...
    x(1), x(2), x(3), x(4), x(5));
%****End of the first part**********%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%******Second part - the MATLAB to FPGA implementation/communication*%%
%Communicating through a port object with the following parameters:
%**Com port ID: COM12 - should be modified as appropriate
%**Baud Rate: 9600
%**No of data bits: 8
%**Parity: odd
%**Stop bit: 1
%Creation of the port object
thePort = serial('COM12');
%Opening the port object created
fopen(thePort);
%Setting the Baud rate
thePort.BaudRate = 9600;
%Setting the number of data bits
thePort.DataBits = 8;
%Setting the parity
thePort.Parity = 'odd';
%Setting the stop bit
thePort.StopBits = 1;
%Setting the Terminator to "Carriage Return"
thePort.Terminator = 'cr';
%get(thePort)
%fprintf(thePort, 'This is a Print');
%Sending the parameters of the problem statement to the FPGA
fwrite(thePort, '[6 2 3 4 5;1 8 3 4 5;1 2 7 4 5;1 2 3 8 5;1 2 3 4 9][5 4 3 2 1]::', ...
    'async');
%Acquiring the solution from the FPGA
theSolution = fread(thePort);
%theSolution = fgets(thePort)
%%Storing up each of the 4 digits of 8 bits making up the solution for each
%%of the solution elements
x1_arr = [theSolution(2), theSolution(3), theSolution(4), theSolution(5)];
x2_arr = [theSolution(7), theSolution(8), theSolution(9), theSolution(10)];
x3_arr = [theSolution(12), theSolution(13), theSolution(14), theSolution(15)];
x4_arr = [theSolution(17), theSolution(18), theSolution(19), theSolution(20)];
x5_arr = [theSolution(22), theSolution(23), theSolution(24), theSolution(25)];
solutionMatrix = [x1_arr; x2_arr; x3_arr; x4_arr; x5_arr];
elementMultipliers = [1, 1, 1, 1, 1];
solutionVector = [1, 1, 1, 1, 1];
for i = 1 : length(solutionMatrix)
    %Checking to see if the MSB of the solution is '1'
    %This will only be true if the solution element
    %being considered in this case is a negative number
    %%%Negative numbers%%%%%%%%%%%
    if solutionMatrix(i, 2) == 255
        %This number will eventually be multiplied with the final value of
        %the solution element
        elementMultipliers(i) = -1;
        %%%%Conversion of the 2's complement negative number starts here%%%%
        %%%%%%%%%%%%%%%
        %Inverting every bit of the four 8 bit digits representing the
        %solution element
        solutionMatrix(i, 1) = 255 - solutionMatrix(i, 1);
        solutionMatrix(i, 2) = 255 - solutionMatrix(i, 2);
        solutionMatrix(i, 3) = 255 - solutionMatrix(i, 3);
        solutionMatrix(i, 4) = 255 - solutionMatrix(i, 4);
        %Adding one to the solution element
        solutionMatrix(i, 4) = solutionMatrix(i, 4) + 1;
        %%%%%%%%%%%%%%%
        %%%%Conversion of the 2's complement negative number ends here%%%%
        %Summing up the digits in order to obtain the solution element
        %(each 8 bit digit is weighted by a power of 256)
        solutionVector(i) = solutionMatrix(i, 1) * 256 + solutionMatrix(i, 2) + ...
            (solutionMatrix(i, 3)/256) + (solutionMatrix(i, 4)/65536);
    %%%Positive numbers%%%%%%%%%%%%%%
    else
        %Summing up the digits in order to obtain the solution element
        %(each 8 bit digit is weighted by a power of 256)
        solutionVector(i) = solutionMatrix(i, 1) * 256 + solutionMatrix(i, 2) + ...
            (solutionMatrix(i, 3)/256) + (solutionMatrix(i, 4)/65536);
    end;
end;
solutionVector = elementMultipliers .* solutionVector;
fclose(thePort)
delete(thePort)
clear thePort
%Printing out the values in the solution vector as computed by the FPGA
fprintf('The values obtained from the FPGA implementation:\n%d, %d, %d, %d, %d\n\n', ...
    solutionVector(1), solutionVector(2), solutionVector(3), solutionVector(4), ...
    solutionVector(5));
%%**End of the FPGA implementation*****%%%%%%%%%%%%%
plot(x, 'b*');
hold on;
plot(solutionVector, 'rx');
title(['Plot of values obtained in MATLAB implementation vs. FPGA implementation for : ', ...
    'A = [6 2 3 4 5; 1 8 3 4 5; 1 2 7 4 5; 1 2 3 8 5; 1 2 3 4 9], and b = [5 4 3 2 1]']);
legend('MATLAB implementation', 'FPGA Implementation');
Appendix E: Table of standard ASCII
symbols and their numerical
representation
Below is a table of ASCII [11] standard symbols used in handling data exchange
between the PC and the FPGA.
Figure E.1 Table of standard ASCII symbols