    Scalable 128-bit AES-CM Crypto-Core Reconfigurable Implementation for Secure Communications

    Armando Astarloa (IEEE Member), Aitzol Zuloaga, Jesús Lázaro, Jaime Jiménez, Carlos Cuadrado

    Department of Electronics and Telecommunications

    University of the Basque Country, Spain

    Email: [email protected]

    Abstract: A novel cryptographic core (crypto-core) approach for secure communications is presented in this work. It is an AES-Counter Mode core for System-on-Programmable-Devices that takes advantage of the flexibility of reconfigurable devices. The proposed architecture is parameterizable, so it scales easily to meet different area-speed trade-offs. This parameterization affects both the number of AES cipher block processors running in parallel and the implementation type. The crypto-core supports three publicly available AES cipher block implementations. The proposed architecture is analyzed with experimental results that show how the crypto-core eases and optimizes the implementation of secure communications in different systems.

    I. INTRODUCTION

    The challenge in securing communication networks is to obtain flexible means able to deal with the intensive computation required by cryptographic algorithms. A representative example of these algorithms is Rijndael [1]. It is one of the most widely used cryptographic algorithms because it was selected by the National Institute of Standards and Technology (NIST) for the Advanced Encryption Standard (AES) [2].

    The 128-bit AES block cipher combines a 128-bit key and a 128-bit plaintext data block to produce a 128-bit block of ciphertext. AES is reversible: the same key, with the algorithm steps applied in reverse order, decrypts the ciphertext and recovers the plaintext. NIST adopted the Rijndael algorithm [2] with a 128-bit block size but retained the choice of three key lengths; IEEE 802.11i restricts the key length to 128 bits. To convert messages or packets into blocks and vice versa, it is necessary to define the block cipher's mode of operation. NIST lists 16 different approaches [3]. The Electronic CodeBook (ECB) mode [4] is the simplest encryption mode. In this mode, the message is split into blocks and each one is encrypted separately. Therefore, identical plaintext blocks are encrypted to identical ciphertext blocks. This drawback creates vulnerabilities such as modification of ciphered messages or replay attacks, which are described in [5].

    To solve this problem, more complex modes of operation combine the data of the previously ciphered blocks and use Initialization Vectors (IV) to make each ciphered message unique. The AES Cipher-Block Chaining (CBC) mode includes these features. Before a block is encrypted, it is XORed with the previous ciphertext block. Thus, each ciphertext block depends on all plaintext blocks up to that point, and an IV has to be used to make each message unique. However, this mode of operation has the following drawbacks: the ciphered message length differs from that of the original plaintext, so padding must be inserted; it needs an IV; and it does not permit parallel ciphering.

    The AES Counter Mode (CM) of operation overcomes those limitations by working differently. It does not use the AES cipher block directly to encrypt the data, as ECB and CBC do. Instead, it encrypts an arbitrary value called `the counter' and then XORs the result with the plain data to produce the ciphertext. The counter value is usually incremented by one for each successive block processed. Figure 1 shows this process. The message is divided into 128-bit vectors, and each vector is XORed with the result of encrypting the counter value corresponding to that block using an AES cipher block. In this example, the counter starts at 1 and increments by one up to 4, and the process encrypts 512 bits in parallel. The receiver, which decrypts using the same circuit, must know the starting value of the counter and how it advances.

    [Figure 1 diagram: four parallel AES cipher blocks share the key and encrypt counter values 1 to 4; the output of each block is XORed with Plaintext Vector 1 to 4 to produce Ciphertext Vector 1 to 4.]

    Figure 1: AES-CM mode of operation block diagram (4 cipher blocks).

    With this mode of operation, reversibility is ensured by the XOR function. Also, the ciphering can be performed completely in parallel because all the counter values are known at the start. Another interesting feature of AES-CM is that if the message does not break into an exact number of blocks, the last short block is XORed with the encrypted counter output using only the needed number of bits. A direct consequence is that the ciphertext can have the same length as the input message. The simplicity and maturity (more than twenty years) of this mode of operation make it an attractive option for the newest secure communication protocols. However, it only provides confidentiality and not message integrity, which should be provided by other means if needed. For example, in IEEE 802.11i RSN this mode is combined with CBC-MAC (CCM) to ensure confidentiality and integrity in WiFi communications [6].
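The counter mode data path described above can be sketched in a few lines of Python. This is only an illustration of the mode, not of the core: the function `block_encrypt` is a hypothetical stand-in built from a keyed hash, used so that the sketch needs no cryptographic library, and it is not AES. The names `ctr_xcrypt` and `BLOCK_BYTES` are likewise assumptions introduced for this example.

```python
import hashlib

BLOCK_BYTES = 16  # 128-bit vectors, as in the paper


def block_encrypt(key: bytes, block: bytes) -> bytes:
    """Hypothetical stand-in for the AES cipher block (NOT AES):
    a keyed hash truncated to 128 bits, used only to keep the sketch
    self-contained."""
    return hashlib.sha256(key + block).digest()[:BLOCK_BYTES]


def ctr_xcrypt(key: bytes, start_counter: int, message: bytes) -> bytes:
    """Encrypt or decrypt `message` in counter mode.

    The same function serves both directions because XORing with the
    same keystream twice restores the original data.
    """
    out = bytearray()
    for offset in range(0, len(message), BLOCK_BYTES):
        # The counter advances by one per 128-bit vector.
        counter = (start_counter + offset // BLOCK_BYTES) % (1 << 128)
        keystream = block_encrypt(key, counter.to_bytes(BLOCK_BYTES, "big"))
        chunk = message[offset:offset + BLOCK_BYTES]
        # zip() stops at the shorter input, so a short last block uses
        # only the keystream bytes it needs and no padding is added.
        out.extend(p ^ k for p, k in zip(chunk, keystream))
    return bytes(out)


if __name__ == "__main__":
    key = bytes(range(16))
    plaintext = b"counter mode keeps the message length unchanged"
    ciphertext = ctr_xcrypt(key, 1, plaintext)
    print(len(plaintext) == len(ciphertext))
    print(ctr_xcrypt(key, 1, ciphertext) == plaintext)
```

Each loop iteration depends only on the key and its own counter value, which is why the iterations could be handed to separate cipher blocks working in parallel.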

    For all the modes of operation, the most intensive processing task is the AES computation. During the AES algorithm selection stage, from which Rijndael [1] emerged as the most suitable candidate, some researchers analyzed the different candidates taking into account that most of the intensive processing may be done in hardware using dedicated chips. These works show performance evaluations both for ASIC implementations [7,8] and for FPGA implementations [2]. Using this off-chip approach (a dedicated chip only for the AES algorithm processing), I. Verbauwhede et al. present in [9] a high-speed and high-efficiency ASIC Rijndael processor. P. Chodowiec et al. [10] use an FPGA PCI board to accelerate Rijndael and Triple DES processing. Y. Fu et al. [11] present an extremely high performance AES Counter Mode reconfigurable processor that needs four XC2V1000 FPGAs. K. Vu and D. Zier [12] use a whole FPGA to perform the AES computation for the CCM mode of operation.

    All these implementations offer very high data throughput. Nowadays, however, the need for secure communications is growing rapidly in different sectors, especially in embedded systems. Embedded systems are fundamentally processor-based devices operating under resource-constrained conditions. These systems pose severe constraints in terms of computational capacity and memory [13]. The computation requirements of cryptographic algorithms are so high for a conventional embedded processor that most of its computation capacity would be consumed if that computation were performed in software. For many embedded systems, this situation is not acceptable.

    To address this drawback, processors commonly used in industrial applications, such as ColdFire, embed crypto-cores in the same device. With this approach, the encryption and decryption of communication frames is done in hardware, freeing the main processor core from this task. The main drawback of this approach is its limited flexibility. These embedded processors are built in ASIC technology. Thus, the crypto-core is fixed in terms of algorithm implementation and interfaces, both for the software interface and for the communication media controller peripheral or core.

    Apart from the ASIC processor-based embedded systems solution, the industry is massively adopting the core-based design methodology for system integration using FPGAs, which has led to the appearance of System-on-Programmable-Chip (SoPC) platforms. Because FPGAs do not incur non-recurring engineering charges, thanks to their reconfigurable nature, the number and diversity of the available Intellectual Property (IP) cores for composing digital systems have increased heavily [14]. SoPCs are very flexible in different aspects: number and type of IP cores and processors, bus architectures, hardware and software co-processing, etc. This flexibility allows a very short time-to-market and facilitates custom device design for every industry and application.

    SoPC technology approaches secure communication with maximum flexibility: depending on the application, different crypto-cores and communication media controller cores can be included in the FPGA device. For the secure communication section of the SoPC, the designer is in charge of finding the best trade-off between FPGA resource occupation and data throughput, as well as the optimum IP license cost. But in the high performance FPGA implementations previously reported in the literature, the secure communication core uses a significant part of the FPGA resources, which implies a high cost that is not acceptable in many low cost embedded devices. Moreover, those solutions do not offer easily reusable modules that embed both the algorithm and the mode of operation.

    The research work presented in this paper aims to find a flexible solution for FPGA secure communication cores. We have designed a soft AES-Counter Mode (AES-CM) crypto-core with a new architecture based on multiple processors running in parallel. (The term soft core is used to label hardware modules described with Hardware Description Languages such as VHDL or Verilog. These modules are usually synthesizable for different devices.) The number of AES processors is configurable, and so is their nature: full hardware processors or tiny CPUs. In the latter case, both the tiny CPUs and their software are embedded in the core. This flexibility makes it possible to explore different area-speed trade-offs to implement ciphered communication channels over different technologies, ranging from low bandwidth serial links to Gigabit Ethernet.

    The remainder of this paper is organized as follows. Section II presents the architecture of the multi-processor AES-CM core. Section III analyzes three AES-CM core series with three different AES cipher block implementations. Section IV closes the paper with the conclusions.

    II. MULTI-PROCESSOR AND MULTI-ARCHITECTURAL AES-CM CORE ARCHITECTURE

    One of the main applications of data cryptography is data communications. Therefore, the proposed AES-CM core must support both data encryption and decryption. Counter Mode is reversible, so the same circuit can be used for the transmitter and the receiver. Moreover, the proposed approach aims to be flexible enough to allow different AES cipher block implementations. This feature permits the generation of cores with different area-speed trade-offs, which facilitates further analysis.

    Taking these features into account, Figure 2 shows the block diagram of the proposed basic architecture for the AES-CM core. The external box encloses n 128-bit AES cipher blocks, connected to implement the CM mode of operation. The core is provided with a counter and with registers for the key and for the counter initialization value (nonce). The nonce should change for each message so that the ciphered packets differ even for the same information. The count value starts from the nonce value for each message, and each cipher block uses an incremented value derived from the counter. The core is completed with a Wishbone slave interface [15] to facilitate the reusability of the module. The Wishbone SoC interconnection architecture for portable Intellectual Property cores is a standard specification for data exchange between IP cores. It defines the interfaces, the allowed bus topologies and the signaling. It is royalty free and is used to share open projects. It provides high levels of robustness and flexibility.
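One way to picture how counter values could be handed out to the n cipher blocks is sketched below. The paper only states that the count starts from the nonce and that each cipher block uses an incremented value derived from the counter; the exact assignment used here (vector i of the message receives nonce + i, and the vectors are processed n at a time) is an assumption, and `counter_schedule` is a hypothetical helper, not part of the core.

```python
def counter_schedule(nonce: int, total_vectors: int, n_cipher_blocks: int):
    """Return the counter values used in each pass over the cipher blocks.

    Assumption: vector i of the message is paired with counter nonce + i,
    and the core processes n_cipher_blocks vectors per pass. Each pass is a
    list of (cipher_block_index, counter_value) pairs.
    """
    passes = []
    for first in range(0, total_vectors, n_cipher_blocks):
        last = min(first + n_cipher_blocks, total_vectors)
        passes.append(
            [(i - first, (nonce + i) % (1 << 128)) for i in range(first, last)]
        )
    return passes


if __name__ == "__main__":
    # Six 128-bit vectors on a core with four cipher blocks:
    # one full pass, then a second pass that uses only two blocks.
    for number, one_pass in enumerate(counter_schedule(100, 6, 4)):
        print(number, one_pass)
```

Because every counter value is known as soon as the nonce is loaded, all the cipher blocks in a pass can start at the same time.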

    This AES-CM core admits three different open-source AES cipher block implementations. Thus, the embedded processors can be instantiated using three different architectures. In this paper, each implementation is identified as follows: R. Usselmann [16] (`High Speed'), J. Castillo [17] (`Low Area') and H.V. Kampen [18] (`Mixed Core'). All of them are publicly available.

    R. Usselmann's implementation is described in Verilog and needs about 12 cycles to encrypt or decrypt a 128-bit block. It uses a 16-byte block size and a 16-byte key size. It includes the key expansion module and implements separate logic for encryption and decryption. The author explains at opencores.org that he has tried to trade off area and speed with this implementation in order to fit it in low cost FPGA devices. This implementation stores the sbox in dedicated internal FPGA RAM (BlockRAMs), saving general purpose FPGA logic.

    J. Castillo's implementation is an AES implementation described in SystemC RTL. Synthesizable Verilog code is also provided. This design is area-optimized and focused on applications where high data throughput is not the main goal. It needs 511 cycles to cipher a 128-bit block, but it is strong in other respects: it uses distributed memory to store the sbox, which facilitates its portability to other technologies, and it implements the encryptor and decryptor in the same block.

    The approach of H.V. Kampen is a sequential AES implementation. He provides the program code for the tiny 8-bit soft processor PicoBlaze. The software is stored in the internal dedicated memory of the FPGA and the processors are implemented using general purpose FPGA logic. The processor units can usually be customized for each core in different ways, for example, the bus width or the number of ports. Many of these ultra low area soft processors can be included in an FPGA device to obtain multiprocessor systems. Moreover, they can be part of application specific IP cores, called Mixed Cores [19]. A Mixed Core integrates a tiny soft processor in addition to custom hardware modules. This combination offers the designer a very flexible tool because different hardware/software solutions may be explored. These processors are most likely to be employed in the control sections of a core although, thanks to their usually efficient RISC architecture, they perform reasonably well in data processing applications like the one presented in this work [18]. The remarkable features of these processors are their size and flexibility.

    [Figure 2 diagram: inside the FPGA, the AES-CM core contains a Wishbone slave interface, nonce, key and counter registers, and n AES cipher blocks; block i takes the key and counter value i and turns Plaintext Vector i into Ciphertext Vector i. The core connects to other cores through the FPGA on-chip bus.]

    Figure 2: AES-CM core general architecture.

    H.V. Kampen distributes three slightly different programs for the three PicoBlaze versions that Xilinx has developed for different FPGA families. In this work, the simplest PicoBlaze processor has been selected. This version can be implemented in any Xilinx FPGA device from the Virtex family onwards. It uses one 4K Block RAM to store the software, allowing up to 256 instructions without banking. The H.V. Kampen AES implementation needs another 4K Block RAM to store the sbox data. This implementation is very slow compared with the R. Usselmann or J. Castillo ones (it needs about 10,000 cycles to process a 128-bit vector), but the area required for its implementation is minimal. This makes it possible to have tens of processors ciphering in parallel in the same FPGA. Figure 3 shows this configuration.

    III. COMPARATIVE RESULTS

    In this section, the implementation results of different versions of the AES-CM crypto-core are presented. The versions vary both in the open-source AES cipher block instantiated and in the number of processors. The AES-CM wrapper included in the core links each AES cipher block implementation through a basic interface. This interface has the basic signals needed to perform 128-bit vector ciphering:

    clk, reset: The global FPGA clock and reset signals.

    key_i: 128-bit key.

    load_key: Strobe signal used to validate the key_i vector.

    data_i: 128-bit input vector.

    load_i: Strobe signal used to validate the data_i vector and to start the computation.

    data_o: Output vector.

    ready_o: Signal set by the AES cipher block to indicate that the computation has finished and the output vector in data_o is valid.
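The handshake implied by these signals can be illustrated with a small cycle-level behavioral model. This is a sketch under stated assumptions, not the wrapper's actual logic: the class `CipherBlockModel`, its `clock` method and the `latency_cycles` parameter are hypothetical, and the model assumes that ready_o is a level that stays high until the next load_i strobe, which the paper does not specify.

```python
class CipherBlockModel:
    """Hypothetical cycle-level model of the basic cipher block interface."""

    def __init__(self, encrypt, latency_cycles: int):
        self.encrypt = encrypt          # function (key, data) -> output vector
        self.latency = latency_cycles   # e.g. about 12, 511 or about 10,000
        self.key = None
        self.data_o = None
        self.ready_o = False
        self._pending = None
        self._remaining = 0

    def clock(self, key_i=None, load_key=False, data_i=None, load_i=False):
        """Advance the model by one clock edge with the given input values."""
        if load_key:
            self.key = key_i            # load_key validates key_i
        if load_i:
            # load_i validates data_i and starts the computation.
            self._pending = data_i
            self._remaining = self.latency
            self.ready_o = False
        elif self._remaining > 0:
            self._remaining -= 1
            if self._remaining == 0:
                # Computation finished: data_o is valid and ready_o is set.
                self.data_o = self.encrypt(self.key, self._pending)
                self.ready_o = True


if __name__ == "__main__":
    # A trivial XOR function stands in for the cipher (assumption).
    toy = CipherBlockModel(
        lambda k, d: bytes(a ^ b for a, b in zip(k, d)), latency_cycles=3
    )
    toy.clock(key_i=b"\x01" * 16, load_key=True)
    toy.clock(data_i=b"\x02" * 16, load_i=True)
    cycles = 0
    while not toy.ready_o:
        toy.clock()
        cycles += 1
    print(cycles, toy.data_o.hex())
```

A common interface of this kind is what lets the wrapper treat the three implementations alike: only the number of cycles between load_i and ready_o changes.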

    We have adapted the different AES cipher block versions to attach them to this interface. The complexity of this adaptation has been different for each implementation:

    J. Castillo (`Low Area'): It does not need any code modification or addition. It plugs directly into the defined interface.

    R. Usselmann (`High Speed'): The main drawback of this implementation is that it needs a key expansion stage when a key is loaded into the decipher block. Thus, we have included a finite state machine to control the cipher block, the decipher block and the key expansion stage, and to offer the desired interface.

    H.V. Kampen (`Mixed Core'): Kampen only provides the software source code for the PicoBlaze processor unit. This processor does not have any external port or logic. Therefore, we have built a full Mixed Core AES cipher block with the proper signals to link with the defined interface.

    [Figure 3 diagram: the AES-CM core (Mixed Core) contains a Wishbone slave interface, nonce, key and counter registers, and n cipher units; each unit is a Tiny-uP with its software and memory map in embedded memory, and turns Plaintext Vector i into Ciphertext Vector i using the key and counter value i. The core connects to other cores through the FPGA on-chip bus.]

    Figure 3: AES-CM `Mixed Core' version.

    The AES-CM core takes advantage of the generics feature of the VHDL hardware description language. Using generics, the module is parameterized, and cores with different characteristics can be obtained at the synthesis stage. For example, the AES-CM core description includes a NUMBER_AES_MODULES generic that defines the number of AES cipher blocks that will be embedded in the core.

    To analyze the AES-CM core implementation results, and with the help of these generics and custom automation scripts, three AES-CM core series, one for each architecture, have been generated. The target FPGA is the XC2VP100-6FF Virtex-II Pro device. It delivers 100K logic cells, 8 Megabits of embedded synchronous block RAM, over 400 18x18 DSP multipliers, and over 1,000 SelectIO pins. It is one of the largest FPGAs available today. The AES-CM core can be implemented on smaller and cheaper Xilinx FPGAs, but the XC2VP100 has been selected in order to explore the largest combinations of parallel processors.

    The mapping and timing results evaluated here have been obtained from real implementation runs, so they are not synthesis estimates. The Place&Route tool has been run in `Performance Evaluation Mode' to automatically improve the performance of all internal clocks in the design, but no timing constraint has been set. Thus, for a given configuration, a higher operating frequency could be obtained by setting a higher effort level for the different implementation stages.

    Figure 4 shows the FPGA slices needed for each AES-CM core series. The Xilinx slice unit encloses two logic cells. The main resources of a logic cell are one programmable 4-input Look-Up-Table (LUT) and a Flip-Flop. For a given implementation, the slice occupation percentage could differ in order to obtain better speed results. In this analysis, neither the FPGA target nor the implementation parameters are changed, so the slice count is a real measure of the area occupation. It must be taken into account that the `Mixed Core' architecture uses dedicated FPGA RAM (Block RAMs), which saves general purpose resources because no distributed memory is needed for program and key storage. The AES-CM IP core with the `Mixed Core' architecture can include up to 70 AES cipher blocks (70 tiny processors) running in parallel and ciphering 8960 bits. With the `Low Area' architecture, the maximum number of AES cipher blocks that fits in the selected FPGA is 30; and with the `High Speed' AES cipher block, 14 128-bit vectors can be ciphered simultaneously.

    Figure 4: FPGA slices occupation vs parallel AES cipher blocks.

    The `Mixed Core' version offers the smallest AES 128-bit implementation. If only one 128-bit vector has to be processed at a time, the resulting AES-CM core needs only 748 slices (808 Flip-Flops, 826 LUTs and two Block RAMs). However, the data throughput of this AES-CM core version is very poor. For the simplest `Mixed Core' version of the AES-CM core, the data throughput is 1.57 Mbit/s with 1 AES block. Compared with the 13 Gbit/s of the `High Speed' version of the AES-CM core (with 14 AES blocks), the difference is enormous. But so is the difference in the resources of each implementation: 748 slices versus 41,867 slices (about 94% of the FPGA area). The flexibility of the proposed architecture gives this diversity. So, in order to compare these implementations, the following Data_Throughput/Slices ratio has been defined:

    \[
    \frac{\text{Data\_Throughput}}{\text{Slices}} = \frac{\text{Maximum\_speed (MHz)} \times \text{Bits\_number}}{\text{Clock\_cycles} \times \text{Slices}}
    \]
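As a rough numerical illustration of this ratio, the sketch below evaluates it for the two extreme configurations quoted in the text. The function names are hypothetical, the 100 MHz clock in the first call is an assumed value (the paper does not list clock frequencies here), and the result is expressed in bit/s per slice, which is also an assumption: the scale used for the plotted values may differ by a constant factor.

```python
def data_throughput_bps(max_speed_mhz: float, bits_number: int, clock_cycles: int) -> float:
    """Throughput in bit/s: clock frequency times bits per pass over cycles per pass."""
    return max_speed_mhz * 1e6 * bits_number / clock_cycles


def throughput_per_slice(throughput_bps: float, slices: int) -> float:
    """Data_Throughput/Slices ratio, here in bit/s per slice (assumed unit)."""
    return throughput_bps / slices


if __name__ == "__main__":
    # Forward use of the formula with an ASSUMED 100 MHz clock:
    # 14 blocks of 128 bits, about 12 cycles per pass.
    assumed = data_throughput_bps(100.0, 14 * 128, 12)
    print(f"{assumed / 1e9:.2f} Gbit/s at an assumed 100 MHz")

    # Ratio computed from the throughput and slice counts reported in the text.
    mixed_core = throughput_per_slice(1.57e6, 748)    # 1 block, 748 slices
    high_speed = throughput_per_slice(13e9, 41867)    # 14 blocks, 41,867 slices
    print(f"Mixed Core: {mixed_core:.0f} bit/s per slice")
    print(f"High Speed: {high_speed:.0f} bit/s per slice")
```

Dividing by the slice count is what makes a 748-slice core and a 41,867-slice core comparable on a single axis.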

    This ratio has been computed for each core implementation. This analysis takes into account only the general purpose FPGA resources (slices) and not the internal Block RAM memories. Figure 5a represents the ratio for all the series. The ratios for the `Low Area' series are one order of magnitude larger than the ratios for the `Mixed Core' series, and the ratios for the `High Speed' series are two orders of magnitude larger than those of the `Low Area' series because the `High Speed' version needs only 12 clock cycles for the computation. Figures 5b, 5c and 5d show the evolution of the ratio in detail for each series independently. For the `Low Area' and `Mixed Core' series, this ratio gets worse as the number of AES blocks increases. However, the ratio for the `High Speed' series reaches its maximum value with 4 AES cipher blocks. In this configuration it occupies about 27% of the FPGA area.

    Depending on the application, the designer can choose the architecture that fits best. For traffic encryption and decryption in many conventional communication links where high data throughput is not needed, the `Low Area' versions can be suitable. The `Mixed Core' versions, which integrate the H.V. Kampen software solution, are the slowest ones, but they offer the best area results. If the communication link is slow, for example if the SoPC includes a secure I2C channel, this version is a cheap and easily integrated solution. For high data throughput requirements, the AES-CM core with the `High Speed' architecture is the best fit. It must also be taken into account that the AES block ciphers used here are publicly available. Commercial, fully pipelined, ultra high speed AES cipher block implementations could also be good candidates to be wrapped in the proposed AES-CM core.

    IV. CONCLUSIONS

    The core presented in this work benefits from the multi-architectural description facilities that Hardware Description Languages offer and from the flexibility of reconfigurable devices. The same entity can be instantiated with a different architecture and number of processors when the design is synthesized, implementing different circuits with the same functionality and interface.

    We have used this feature to build a multi-architectural crypto-core ready to be plugged into SoPC designs. Specifically, the source code of three publicly available and open AES cipher block implementations has been successfully reused in this work.

    With the third architecture, `Mixed Core', which integrates a tiny processor in the core, we have proposed a mixed hardware-software partition for message encoding and decoding. The resulting AES-CM implementation is `light' enough to be incorporated in any low-cost SoPC with a low speed communication channel. However, the standard 8-bit architecture of the PicoBlaze tiny CPU gives very poor data throughput. This may be improved by upgrading this processor to a 16-bit architecture and adding specific cryptographic instructions [20]. This research work eases the integration of secure communications in a wide range of systems. Moreover, this crypto-core establishes a building block for our future research in the field of secure Ethernet peer-to-peer communications.

    ACKNOWLEDGMENT

    This work has been partially supported by the project EJIE07/04, a public research project funded by the agreement between the University of the Basque Country and EJIE S.A., and by the project S-PE01UN1, a public research project funded by the Basque Government.

    REFERENCES

    [1] J. Daemen and V. Rijmen, "Rijndael: Algorithm Specification," http://csrc.nist.gov/encryption/aes/rijndael/, 2001.

    [2] K. Gaj and P. Chodowiec, "Comparison of the Hardware Performance of the AES Candidates Using Reconfigurable Hardware," in Proceedings of The Third Advanced Encryption Standard Candidate Conference, Apr. 2000, pp. 40-54.

    [3] National Institute of Standards and Technology, "Cryptographic Toolkit. Modes of Operations," Computer Security Resource Center, http://csrc.nist.gov/CryptoToolkit/tkmodes.html, 2005.

    [4] National Institute of Standards and Technology, "DES Modes of Operation," Computer Security Resource Center, http://www.itl.nist.gov/fipspubs/fip81.htm, 1980.

    [5] B. Schneier, Applied cryptography: protocols, algorithms, and source code in C, 2nd ed. New York: Wiley, 1996.

    [6] J. Edney and W. A. Arbaugh, Real 802.11 Security: Wi-Fi Protected Access and 802.11i. Addison Wesley, 2003.

    [7] A. Elbirt, W. Yip, B. Chetwynd, and C. Paar, "An FPGA Implementation and Performance Evaluation of the AES Block Cipher Candidate Algorithm Finalists," in Proceedings of the Third Advanced Encryption Standard (AES) Candidate Conference, Apr. 2000, pp. 13-27.

    [8] T. Ichikawa, T. Kasuya, and M. Matsui, "Hardware Evaluation of the AES Finalists," in Proceedings of the Third Advanced Encryption Standard (AES) Candidate Conference, Apr. 2000, pp. 297-285.

    [9] I. Verbauwhede, P. Schaumont, and H. Kuo, "Design and Performance Testing of a 2.29-GB/s Rijndael Processor," IEEE Journal of Solid-State Circuits, vol. 38, no. 3, pp. 569-572, Mar. 2003.

    [10] P. Chodowiec, K. Gaj, P. Bellows, and B. Schott, "Experimental Testing of the Gigabit IPSec-Compliant Implementations of Rijndael and Triple DES Using SLAAC-1V FPGA Accelerator Board," in Proceedings of the Information Security Conference, Oct. 2001, pp. 220-234.

    [11] Y. Fu, L. Hao, and X. Zhang, "Design of An Extremely High Performance Counter Mode AES Reconfigurable Processor," in Proceedings of the Second International Conference on Embedded Software and Systems (ICESS05), Dec. 2005, pp. 262-268.

    [12] K. Vu and D. Zier, "FPGA Implementation AES for CCM Mode Encryption Using Xilinx Spartan-II," ECE 679, Advanced Cryptography, Oregon State University, 2003.

    [13] D. D. Hwang, P. Schaumont, K. Tiri, and I. Verbauwhede, "Securing Embedded Systems," IEEE Security and Privacy, vol. 4, no. 2, pp. 40-49, Mar. 2006.

    [14] R. A. Bergamaschi, S. Bhattacharya, R. Wagner, C. Fellenz, and M. Muhlada, "Automating the Design of SOCs Using Cores," IEEE Design & Test of Computers, vol. 18, no. 5, pp. 32-45, 2001.

    [15] Silicore Corporation, "Wishbone System-on-Chip (SoC) Interconnection Architecture for Portable IP Cores Revision: B.3," http://www.opencores.org, Sep. 2002.

    [16] R. Usselmann, "AES (Rijndael) IP Core," http://www.opencores.org, 2002.

    [17] J. Castillo, P. Huerta, and J. Martínez, "SystemC design flow for an AES/DES CryptoProcessor," WSEAS Transactions on Information Science and Applications, pp. 193-198, Jul. 2004.

    [18] H. V. Kampen, "PicoBlaze Rijndael (AES-128) block cipher," http://www.mediatronix.com, 2003.

    [19] A. Astarloa, U. Bidarte, J. Lazaro, A. Zuloaga, and J. Arias, "Multiprocessor SoPC-Core for FAT volume computation," Microprocessors and Microsystems, vol. 29, no. 10, pp. 421-434, Dec. 2005.

    [20] Xilinx Corp., "CryptoBlaze: 8-Bit Security Microcontroller," Xilinx Application Notes, http://www.xilinx.com, Sep. 2003.

    [Figure 5 plots: x-axis, number of 128-bit cipher blocks; y-axis, DataThroughput/Slices (log10 scale in panel a); series: MixedCore, LowArea, HighSpeed.]

    Figure 5: a) log10 of the Data_Throughput/Slices ratio for the three series; b) ratio for `Mixed Core' series; c) ratio for `Low Area' series; d) ratio for `High Speed' series.