Our Research Presentation com


  • 8/8/2019 Our Research Presentation com


    Advanced Computer Architecture & Processing Systems Research Lab - http://acaps.ulbsibiu.ro/research.php

    Ongoing Computer Engineering Research Projects at the Lucian Blaga University of Sibiu

    Prof. Lucian VINTAN, PhD - Director, Advanced Computer Architecture & Processing Systems Research Lab

    The Research Team

    Prof. Lucian VINTAN, PhD - Research Chair
    Assoc. Prof. Adrian FLOREA, PhD
    Senior Lecturer Daniel MORARIU, PhD
    Senior Lecturer Ion MIRONESCU, PhD
    Lecturer Arpad GELLERT, PhD
    Radu CRETULESCU, PhD student
    Horia CALBOREAN, PhD student
    Ciprian RADU, PhD student


    Computing hardware

    14 Intel compute nodes (2-processor HS21 blades with quad-core Intel Xeon)
    2 Cell compute nodes (2-processor QS22 blades with IBM PowerXCell 8i processors)


    Our current research topics

    Anticipatory Techniques in Advanced Processor Architectures
    An Automatic Design Space Exploration Framework for Multicore Architecture Optimizations
    Optimizing Application Mapping Algorithms for NoCs through a Unified Framework
    Optimal Computer Architecture for CFD Calculation
    Adaptive Meta-classifiers for Text Documents


    Anticipatory Techniques in Advanced Processor Architectures

    Prof. Lucian VINTAN, PhD
    Assoc. Prof. Adrian FLOREA, PhD
    Lecturer Arpad GELLERT, PhD


    Fetch Bottleneck

    The fetch rate is limited by the basic-block size (7-8 instructions in SPEC2000).

    Solutions

    Trace cache & multiple (M-1) branch predictors;
    Branch prediction increases ILP by predicting branch directions and targets and speculatively processing multiple basic blocks in parallel;
    As instruction issue width and pipeline depth increase, accurate branch prediction becomes more essential.

    Some Challenges

    Identifying and solving some difficult-to-predict branches (unbiased branches);
    Helping the computer architect to better understand branch predictability and whether the predictor should be improved with respect to difficult-to-predict branches.


    Difficult-to-predict unbiased branches

    A difficult-to-predict branch in a certain dynamic context is unbiased and highly shuffled.

    [Figure: percentage of unbiased context instances (15% to 50%) versus context length (p = 1, 4, 8, 12, 16, 20, 24) for three kinds of context: GH (p bits); GH (p bits) + PATH (p PCs); GH (p bits) + PBV.]
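The idea of a branch being unbiased in a dynamic context can be illustrated with a minimal sketch. It assumes that a context is the pair (branch PC, last p global outcomes) and that bias is measured by a polarization value, the share of the more frequent outcome. The function name, the trace format and the 0.95 threshold are illustrative assumptions, not taken from the slides.

```python
from collections import defaultdict

def unbiased_contexts(trace, p, threshold=0.95):
    """Return {(pc, history): polarization} for contexts whose outcomes are unbiased.

    trace: iterable of (pc, taken) pairs in program order, taken is 0 or 1.
    p: number of global-history bits that form the context.
    threshold: assumed polarization cutoff (hypothetical default).
    """
    counts = defaultdict(lambda: [0, 0])  # context -> [not-taken count, taken count]
    history = 0
    mask = (1 << p) - 1
    for pc, taken in trace:
        counts[(pc, history)][taken] += 1
        history = ((history << 1) | taken) & mask  # shift in the newest outcome
    result = {}
    for ctx, (n0, n1) in counts.items():
        polarization = max(n0, n1) / (n0 + n1)
        if polarization < threshold:
            result[ctx] = polarization
    return result

# A branch that alternates between not taken and taken
alternating = [(0x40, i % 2) for i in range(1000)]
```

With no history (p = 0) the alternating branch looks unbiased, while one bit of history is enough to polarize it. The branches of interest in the slides are those that stay unbiased even as the context grows.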


    Predicting Unbiased Branches

    State-of-the-art branch predictors are unable to accurately predict unbiased branches;
    The problem: finding new relevant information that could reduce their entropy, instead of developing new predictors;
    Challenge: adequately representing unbiased branches in the feature space!
    Accurately predicting unbiased branches is still an open problem!


    Random Degree Metrics

    Based on:

    Hidden Markov Model (HMM): a strong method to evaluate the predictability of the sequences generated by unbiased branches;
    Discrete entropy of the sequences generated by unbiased branches;
    Compression rate (Gzip, Huffman) of the sequences generated by unbiased branches.
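Two of these metrics are easy to try on a recorded outcome sequence. The sketch below is only an illustration: it computes the zeroth-order Shannon entropy and a compression ratio using Python's zlib (DEFLATE, the algorithm behind gzip). The HMM-based metric is not shown, and the function names and sample sequences are assumptions.

```python
import math
import random
import zlib
from collections import Counter

def discrete_entropy(outcomes):
    """Shannon entropy in bits per symbol of a sequence of branch outcomes."""
    n = len(outcomes)
    counts = Counter(outcomes)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def compression_ratio(outcomes):
    """Compressed size divided by original size, using zlib (DEFLATE, as in gzip)."""
    raw = bytes(outcomes)  # one byte per outcome, values 0 or 1
    return len(zlib.compress(raw, 9)) / len(raw)

rng = random.Random(0)
biased = [1] * 4096                                   # always taken
alternating = [0, 1] * 2048                           # regular pattern
shuffled = [rng.randint(0, 1) for _ in range(4096)]   # no visible pattern
```

The alternating sequence shows why the metrics complement each other: its zeroth-order entropy is 1 bit per outcome, as high as for a random sequence, yet it compresses far better than the shuffled one because the compressor sees the ordering.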


    Issue Bottleneck (Data-flow)

    Conventional processing models are limited in their processing speed by the dynamic program's critical path (Amdahl).

    Two solutions

    Dynamic Instruction Reuse (DIR) is a non-speculative technique.
    Value Prediction (VP) is a speculative technique.

    Common issue

    Value locality

    Challenges

    Selective Instruction Reuse (MUL & DIV)
    Selective Load Value Prediction (Critical Loads)
    Exploiting Selective Instruction Reuse and Value Prediction in a Superscalar / Simultaneous Multithreaded (SMT) Architecture to anticipate Long-Latency Instructions
    Results
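Value locality is what makes a load value prediction table useful. The following is a minimal sketch of a last-value predictor indexed by the load's PC, with a small saturating confidence counter. It is not the design evaluated in the slides; the table size, the confidence threshold and all names are assumptions.

```python
class LastValuePredictor:
    """Direct-mapped load value prediction table with 2-bit confidence counters."""

    def __init__(self, entries=1024, confidence_threshold=2):
        self.entries = entries
        self.threshold = confidence_threshold
        self.table = {}  # index -> [tag, last value, confidence 0..3]

    def predict(self, pc):
        """Return the predicted value, or None when there is no confident entry."""
        entry = self.table.get(pc % self.entries)
        if entry and entry[0] == pc and entry[2] >= self.threshold:
            return entry[1]
        return None

    def update(self, pc, actual):
        """Train the table with the value the load actually produced."""
        index = pc % self.entries
        entry = self.table.get(index)
        if entry is None or entry[0] != pc:
            self.table[index] = [pc, actual, 0]   # allocate or replace
        elif entry[1] == actual:
            entry[2] = min(3, entry[2] + 1)       # value repeated: gain confidence
        else:
            entry[1], entry[2] = actual, 0        # value changed: reset confidence

lvp = LastValuePredictor()
predictions = []
for value in [7, 7, 7, 7, 9, 9]:
    predictions.append(lvp.predict(0x1000))
    lvp.update(0x1000, value)
```

The predictor stays silent until the same value has been seen several times, predicts 7 twice (the second time wrongly, when the load returns 9), and then falls silent again. Because the technique is speculative, such a misprediction needs a recovery mechanism.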


    Exploiting Selective Instruction Reuse and Value Prediction in a Superscalar Architecture

    Selective Instruction Reuse (MUL & DIV)
    Selective Load Value Prediction (Critical Loads)

    [Figure: superscalar pipeline diagram beginning with the Fetch and Decode stages]


    Selective Instruction Reuse and Value Prediction in Simultaneous Multithreaded Architectures

    [Figure: SMT architecture (M-Sim) enhanced with per-thread RB and LVPT structures. Blocks shown: Fetch Unit, Branch Predictor, PC, I-Cache, Decode, Rename Table, Issue Queue, Physical Register File, ROB, Functional Units, LSQ, D-Cache, LVPT, RB.]


    Exploiting Selective Instruction Reuse and Value Prediction in a Superscalar Architecture

    The M-SIM Simulator

    [Figure: M-SIM simulation flow. Blocks shown: Hardware Configuration, SPEC Benchmark, Cycle-Level Performance Simulator, Hardware Access Counts, Power Models, Performance Estimation, Power Estimation.]

    $EDP = \frac{\text{Total Power}}{IPC^{2}}$

    $\text{IPC Speedup} = \frac{IPC_{improved} - IPC_{base}}{IPC_{base}} \cdot 100\%$

    $\text{EDP Gain} = \frac{EDP_{base} - EDP_{improved}}{EDP_{base}} \cdot 100\%$
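The three metrics on this slide can be written down directly. The sketch assumes EDP = Total Power / IPC^2, IPC Speedup = (IPC_improved - IPC_base) / IPC_base, and EDP Gain = (EDP_base - EDP_improved) / EDP_base, the last two expressed in percent. The function names and the sample numbers are made up for illustration.

```python
def edp(total_power, ipc):
    """Energy-delay product: total power divided by IPC squared."""
    return total_power / ipc ** 2

def ipc_speedup(ipc_base, ipc_improved):
    """Relative IPC speedup in percent."""
    return (ipc_improved - ipc_base) / ipc_base * 100.0

def edp_gain(edp_base, edp_improved):
    """Relative EDP gain in percent (positive when the improved design has lower EDP)."""
    return (edp_base - edp_improved) / edp_base * 100.0

# Made-up numbers for illustration only
base = edp(total_power=40.0, ipc=2.0)
improved = edp(total_power=45.0, ipc=2.5)
speedup = ipc_speedup(2.0, 2.5)
gain = edp_gain(base, improved)
```

In this made-up case the improved design draws more power but still wins on EDP, because the IPC term is squared.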


    Exploiting Selective Instruction Reuse and Value Prediction in a Superscalar Architecture

    [Figure: relative IPC speedup and relative energy-delay product gain (0% to 40%) versus the number of LVPT entries (16 to 2048), with a Reuse Buffer of 1024 entries, the Trivial Operation Detector, and the Load Value Predictor. Series: INT - IPC Speedup, INT - EDP Gain, FP - IPC Speedup, FP - EDP Gain.]


    Conclusions and Further Work

    Indexing the SLVP table with the memory address instead of the instruction address (PC);
    Exploiting N-value locality instead of 1-value locality;
    Generating the thermal maps for the optimal superscalar and SMT configurations (and, if necessary, developing a run-time thermal manager);
    Understanding and exploiting instruction reuse and value prediction benefits in a multicore architecture.


    Anticipatory multicore architectures

    Anticipatory multicores would significantly reduce the pressure on the interconnection network (performance/energy);
    Between value prediction, multithreading and the cache coherence/consistency mechanisms there are subtle, not well-understood relationships;
    Data consistency errors require consistency violation detection and recovery;
    The cause of inconsistency: VP might execute some dependent instructions out of order;
    Dynamic Instruction Reuse in a multicore system: Reuse Buffer coherence problems call for cache coherence mechanisms.

    Details at http://webspace.ulbsibiu.ro/lucian.vintan/html/#11


    An Automatic Design Space Exploration Framework for Multicore Architecture Optimizations

    Horia CALBOREAN, PhD student

    Prof. Lucian VINTAN, PhD


    Multiobjective optimization

    As the number of (heterogeneous) cores in the processor grows, systems become more and more complex
    More configurations have to be simulated (NP-hard problem)
    The time needed to simulate all configurations becomes prohibitive
    Performance evaluation has become a multiobjective evaluation


    Solutions

    Reducing simulation time:
        parallel & distributed simulation
        sampling simulation
    Reducing the number of simulations:
        intelligent multiobjective algorithms
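Multiobjective algorithms rank configurations by Pareto dominance rather than by a single score. The sketch below shows that core comparison for objectives that are all minimized. The objective pairs are hypothetical, and the quadratic filter is only meant for small illustrative sets.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and better in at least one (minimization)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(points):
    """Keep only the non-dominated objective vectors."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical (cycles per instruction, energy) results for five configurations
results = [(1.0, 9.0), (2.0, 4.0), (3.0, 5.0), (4.0, 1.0), (2.5, 4.0)]
front = pareto_front(results)
```

Three of the five configurations survive; each of them is the best available trade-off for some weighting of the two objectives.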


    Proposed framework

    We developed FADSE (Framework for Automatic Design Space Exploration)
    Compatible with most of the existing simulators
    Portable - implemented in Java
    Includes many well-known multiobjective algorithms
    Is able to run simulators and also well-known test problems


    Existing tools

    Bound to a certain simulator (Magellan)
    Lack portability - bound to a certain operating system (M3Explorer, Magellan)
    Perform design space exploration of only small parts of the system (only the cache - Archexplorer)


    FADSE application architecture


    Features

    Parallel simulation (client-server model)
    Ability to introduce constraints through an XML interface
    Easily configurable through XML files:
        change the DSE algorithm,
        specify input parameters and their possible values,
        specify desired output metrics, etc.
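How parameters, their possible values and constraints together define the design space can be sketched as follows. This does not reproduce FADSE's XML format: the parameters are given as a plain Python dictionary, and every parameter name, value and constraint here is a hypothetical example.

```python
from itertools import product

# Hypothetical parameters and values; a real setup would read them from configuration files
parameters = {
    "l1_size_kb": [16, 32, 64],
    "l2_size_kb": [64, 256, 1024],
    "issue_width": [2, 4],
}

def l2_larger_than_l1(config):
    """Example constraint: the L2 cache must be strictly larger than the L1 cache."""
    return config["l2_size_kb"] > config["l1_size_kb"]

def enumerate_design_space(parameters, constraints=()):
    """Yield every parameter combination that satisfies all constraints."""
    names = list(parameters)
    for values in product(*(parameters[n] for n in names)):
        config = dict(zip(names, values))
        if all(check(config) for check in constraints):
            yield config

all_configs = list(enumerate_design_space(parameters))
feasible = list(enumerate_design_space(parameters, [l2_larger_than_l1]))
```

Even this toy space has 18 points, of which the constraint removes 2. Since the number of points is the product of the value counts, each added parameter multiplies it, which is why exhaustive simulation does not scale.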


    Our target

    Perform an evaluation of the existing algorithms on different simulators
    Find out which one performs best
    Improve the algorithms - map them onto the specific problem of design space exploration


    Conclusions

    We have developed a framework which is able to perform automatic design space exploration
    Extensible, portable
    Many implemented multiobjective algorithms (through the use of jMetal)
    Reduces time through parallel & distributed execution of simulators


    Optimizing Application Mapping Algorithms for NoCs through a Unified Framework

    Ciprian RADU, PhD student

    Prof. Lucian VINTAN, PhD


    Outline

    Introduction
        The application mapping problem for NoCs
        The relation between application mapping and routing
    Evaluating application mapping algorithms for Networks-on-Chip
        The framework design
        The ns-3 NoC simulator
    Automatic Design Space Exploration for Networks-on-Chip
        The framework


    The application mapping problem for NoCs


    Application mapping & routing


    Evaluating application mapping algorithms for Networks-on-Chip

    Existing application mapping algorithms are currently evaluated on specific NoCs, e.g. NoCs with 2D mesh topology
    Existing comparisons between the algorithms are not made on the same NoC architecture
    We propose a unified framework for the evaluation and optimization of application mapping algorithms on different NoC designs


    The framework design

    3 major components:
        A module that contains the implementation of different application mapping algorithms;
        A network traffic generator;
        A Network-on-Chip simulator.
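What a mapping algorithm optimizes can be made concrete with a simple cost function. The sketch below scores a core-to-tile mapping on a 2D mesh by communication volume times hop distance, a common objective in the mapping literature. It is an assumption for illustration, not the framework's actual metric, and the core names and traffic volumes are hypothetical.

```python
def manhattan_hops(a, b):
    """Hop count between two tiles of a 2D mesh under minimal routing."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def communication_cost(mapping, traffic):
    """Sum of communication volume times hop distance over all communicating core pairs.

    mapping: core name -> (x, y) tile coordinates.
    traffic: (source core, destination core) -> communication volume.
    """
    return sum(volume * manhattan_hops(mapping[src], mapping[dst])
               for (src, dst), volume in traffic.items())

# Hypothetical four-core application on a 2x2 mesh
traffic = {("cpu", "mem"): 100, ("cpu", "dsp"): 10, ("dsp", "io"): 5}
far = {"cpu": (0, 0), "mem": (1, 1), "dsp": (0, 1), "io": (1, 0)}
near = {"cpu": (0, 0), "mem": (0, 1), "dsp": (1, 0), "io": (1, 1)}
```

Placing the heavily communicating cores on adjacent tiles roughly halves the cost in this example. A hop-count model ignores congestion, which is one reason to evaluate mappings with a simulator as well.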


    The framework design flow


    The ns-3 NoC simulator

    Based on ns-3 (http://www.nsnam.org/), an event-driven simulator for Internet systems
    Aims for a good accuracy-speed trade-off
    Flexible and scalable
    Current parameters:
        Packet size, packet injection rate, packet injection probability;
        Buffer size;
        Network size;
        Switching mechanism (SAF, VCT, Wormhole);
        Routing protocol (XY, YX, SLB, SO);
        Network topology (2D mesh, Irvine mesh);
        Traffic patterns (bit-complement, bit-reverse, matrix transpose, uniform random).
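Some of the listed parameters have compact textbook definitions. The sketch below shows XY routing on a 2D mesh and three of the synthetic traffic patterns in standalone Python. It does not use or reproduce the ns-3 NoC code; the function names and the coordinate and address conventions are assumptions.

```python
def xy_route(src, dst):
    """Dimension-ordered XY routing on a 2D mesh: travel along X first, then along Y."""
    x, y = src
    path = [(x, y)]
    while x != dst[0]:
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    while y != dst[1]:
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path

def bit_complement(node, bits):
    """Destination whose address is the bitwise complement of the source address."""
    return ~node & ((1 << bits) - 1)

def bit_reverse(node, bits):
    """Destination whose address is the source address with its bits in reverse order."""
    return int(format(node, f"0{bits}b")[::-1], 2)

def matrix_transpose(src):
    """Destination obtained by swapping the X and Y coordinates of the source."""
    return (src[1], src[0])
```

For example, `xy_route((0, 0), (2, 1))` visits `(1, 0)` and `(2, 0)` before turning towards `(2, 1)`. YX routing would be the same loops in the opposite order.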

    Automatic Design Space Exploration for Networks-on-Chip

    Motivation
        There is no NoC suitable for all kinds of workload
        There is an exponential number of possible NoC architectures
        Exhaustive DSE is no longer suitable
    Automatic DSE uses a heuristic-driven exploration of the design space
        Disadvantage: near-optimal solutions
        Advantage: speed


    The framework

    Components:
        DSE module
        NoC simulator
    The DSE module determines the parameters of the NoC architecture
        Uses algorithms from Artificial Intelligence
    The NoC simulator (ns-3 NoC) is automatically configured to simulate the network architecture determined by the DSE module
    The simulation results (network performance) help the DSE module generate a better NoC architecture

    [Figure: the Design Space Exploration module configures the Network-on-Chip simulator; the simulator returns the simulation results to the Design Space Exploration module.]
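The feedback loop between the DSE module and the simulator can be sketched with a simple hill-climbing search. This is only an illustration of the loop: the `simulate` function is a stand-in with a made-up cost, not the ns-3 NoC simulator, hill climbing is a swapped-in heuristic rather than the framework's algorithm, and the parameter names and values are hypothetical.

```python
import random

def simulate(config):
    """Stand-in for the NoC simulator: returns a made-up latency figure (lower is better)."""
    return abs(config["buffer_size"] - 8) + abs(config["packet_size"] - 4)

def neighbours(config, space):
    """Configurations that differ from config in exactly one parameter by one step."""
    for name, values in space.items():
        i = values.index(config[name])
        for j in (i - 1, i + 1):
            if 0 <= j < len(values):
                yield {**config, name: values[j]}

def explore(space, rng, max_steps=100):
    """Hill climbing: configure, simulate, and move to the best neighbour until none improves."""
    current = {name: rng.choice(values) for name, values in space.items()}
    current_cost = simulate(current)
    for _ in range(max_steps):
        best = min(neighbours(current, space), key=simulate)
        best_cost = simulate(best)
        if best_cost >= current_cost:
            break  # local optimum reached
        current, current_cost = best, best_cost
    return current, current_cost

space = {"buffer_size": [2, 4, 8, 16], "packet_size": [2, 4, 6, 8]}
best_config, best_cost = explore(space, random.Random(1))
```

On this toy cost surface the search reaches the global optimum from any starting point. On a real design space a heuristic only guarantees near-optimal solutions, which is the trade-off against exhaustive exploration.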


    Optimal computer architecture for CFD calculation

    Senior Lecturer Ion Dan MIRONESCU, PhD

    Prof. Lucian VINTAN, PhD


    Practical application

    Modelling and simulation of multiscale, multicomponent, multiphase flow in complex geometry (ongoing projects) for:
        optimisation of sugar crystallisation
        prediction of the flow properties of polymer-based disperse systems (starch and starch fractions, microbial polysaccharides)

    HPC/CFD


    Goals

    Speed-up of this application on the given architecture
    Finding the optimal manycore architecture for the CFD application (e.g. NoC)


    Method - Lattice Boltzmann

    (Chirila,2010)


    Method advantages

    easy discretization of complex geometry
    easy incorporation of multiple models
    easy parallelisation
    easy coupling to other scale models (Molecular Dynamics)


    Computational model

    [Figure: each block holds its local values plus ghost data copied from neighbouring blocks; all blocks run a COMPUTE phase, followed by an EXCHANGE phase.]
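The compute/exchange pattern with ghost data can be sketched in one dimension. The example below splits a small domain into two blocks, refreshes the ghost cells before every compute phase, and checks the result against an unsplit run. A simple neighbour-averaging stencil stands in for the Lattice Boltzmann update, and all names and values are assumptions for illustration.

```python
def step(values):
    """One compute phase: replace every interior cell by the mean of its two neighbours."""
    return [values[0]] + [(values[i - 1] + values[i + 1]) / 2
                          for i in range(1, len(values) - 1)] + [values[-1]]

def exchange(left, right):
    """Exchange phase: copy each block's boundary cell into the neighbour's ghost cell."""
    left[-1] = right[1]   # right ghost of the left block
    right[0] = left[-2]   # left ghost of the right block

# Global 1D domain with fixed boundary values at both ends
domain = [0.0, 1.0, 4.0, 9.0, 16.0, 25.0, 36.0, 49.0]

# Split the domain into two blocks; each gets one ghost cell at the cut
left = domain[:4] + [0.0]    # cells 0..3 plus a ghost for cell 4
right = [0.0] + domain[4:]   # a ghost for cell 3 plus cells 4..7

reference = domain
for _ in range(3):
    exchange(left, right)
    left, right = step(left), step(right)
    reference = step(reference)

combined = left[:-1] + right[1:]
```

After three iterations the two blocks, stripped of their ghost cells, match the unsplit reference exactly. This is the property that lets the blocks compute independently between exchanges.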


    General-purpose manycore platform

    What can be used and what must be accounted for:

    ILP (superscalar, out-of-order, branch prediction)
    Task and Thread LP (multicore/multiprocessor)
    Mixed programming model (shared memory on blade, message passing between blades)
    Cache system


    Special-purpose manycore platform

    What can be used and what must be accounted for:

    SIMD
    Task and Thread LP (hardware multithreading, multicore/multiprocessor)
    Message passing
    Local store model - full user control


    Charm++

    provides a high-level abstraction of a parallel program
    cooperating message-driven objects called chares
    support for load balancing, fault tolerance, automatic checkpointing
    support for all architectures through a specific low-level tier
    NAMD - MD implemented in Charm++


    Charm++ LB implementation



    DSE

    Search optimal values for:
        sites/block
        blocks (chares)/core, /thread, /blade
        communication patterns


    Adaptive Meta-classifiers for Text Documents

    Prof. Lucian VINTAN, PhD

    Daniel MORARIU, PhD

    Radu CRETULESCU, PhD student


    Introduction

    We investigated a way to create a new adaptive meta-classifier for classifying text documents in order to increase the classification accuracy.

    During the first processing phase (pre-classification) the meta-classifier uses a non-adaptive selector.

    In the second phase (classification) we use a feed-forward neural network based on the back-propagation learning method.


    The architecture of the adaptive meta-classifier M-BP


    Classification accuracy

    [Figure: influence of the number of neurons in the hidden layer. Classification accuracy (90 to 100) versus average error using the training set (350 down to 80) for hidden layers of 96, 128, 160, 176 and 192 neurons.]


    Time necessary for reaching the given total error

    [Figure: error threshold (0 to 400) versus time in seconds (0 to 30000) for hidden layers of 96, 128, 160, 176 and 192 neurons.]


    Conclusions

    This new adaptive meta-classifier uses 8 types of SVM classifiers and one Naive Bayes classifier to transpose the input data from a large-scale space into a much smaller space.
    The best results (99.74% classification accuracy) were obtained using a neural network with 192 neurons in the hidden layer.
    The meta-classifier managed to exceed the maximum "theoretical" limit of 98.63%, which could be reached by an ideal non-adaptive meta-classifier that always chose the correct prediction if at least one classifier provided it.
    For Reuters2000 text documents we obtained classification accuracy up to 99.74%.
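The "theoretical" limit mentioned here is the share of documents for which at least one base classifier is right. The sketch below computes that bound next to a plain majority vote, used as an example of a non-adaptive selector. The vote is an assumption for illustration, not necessarily the selector from the slides, and the predictions and labels are hypothetical.

```python
from collections import Counter

def majority_vote(predictions):
    """Non-adaptive selector: the class predicted by most base classifiers."""
    return Counter(predictions).most_common(1)[0][0]

def accuracy(chosen, labels):
    """Share of documents whose chosen class equals the true class."""
    return sum(c == t for c, t in zip(chosen, labels)) / len(labels)

def oracle_upper_bound(all_predictions, labels):
    """Share of documents for which at least one base classifier predicts the true class."""
    return sum(t in preds for preds, t in zip(all_predictions, labels)) / len(labels)

# Hypothetical predictions of three base classifiers for four documents
all_predictions = [("a", "a", "b"), ("b", "c", "c"), ("a", "c", "c"), ("c", "c", "c")]
labels = ["a", "b", "c", "a"]

voted = [majority_vote(p) for p in all_predictions]
vote_accuracy = accuracy(voted, labels)
bound = oracle_upper_bound(all_predictions, labels)
```

A selector that only picks among the base classifiers' answers cannot beat the bound. A meta-classifier that learns its own mapping from the base outputs to a class can, because it may output a class that no base classifier proposed.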


    Some References - Computer Architectures

    L. VINTAN, A. GELLERT, A. FLOREA, M. OANCEA, C. EGAN - Understanding Prediction Limits through Unbiased Branches, Eleventh Asia-Pacific Computer Systems Architecture Conference, Shanghai, September 6-8, 2006 - http://webspace.ulbsibiu.ro/lucian.vintan/html/LNCS.pdf

    A. GELLERT, A. FLOREA, M. VINTAN, C. EGAN, L. VINTAN - Unbiased Branches: An Open Problem, The Twelfth Asia-Pacific Computer Systems Architecture Conference (ACSAC 2007), Seoul, Korea, August 23-25, 2007 - http://webspace.ulbsibiu.ro/lucian.vintan/html/acsac2007.pdf

    L. N. VINTAN, A. FLOREA, A. GELLERT - Random Degrees of Unbiased Branches, Proceedings of The Romanian Academy, Series A: Mathematics, Physics, Technical Sciences, Information Science, Volume 9, Number 3, pp. 259-268, Bucharest, 2008 - http://www.academiaromana.ro/sectii2002/proceedings/doc2008-3/13-Vintan.pdf

    A. GELLERT, A. FLOREA, L. VINTAN - Exploiting Selective Instruction Reuse and Value Prediction in a Superscalar Architecture, Journal of Systems Architecture, vol. 55, issue 3, pp. 188-195, ISSN 1383-7621, Elsevier, 2009 - http://webspace.ulbsibiu.ro/lucian.vintan/html/jsa2009.pdf

    A. GELLERT, G. PALERMO, V. ZACCARIA, A. FLOREA, L. VINTAN, C. SILVANO - Energy-Performance Design Space Exploration in SMT Architectures Exploiting Selective Load Value Predictions, Design, Automation & Test in Europe International Conference (DATE 2010), March 8-12, 2010, Dresden, Germany - http://webspace.ulbsibiu.ro/lucian.vintan/html/Date_2010.pdf

    H. CALBOREAN, L. VINTAN - An Automatic Design Space Exploration Framework for Multicore Architecture Optimizations, Proceedings of The 9-th IEEE RoEduNet International Conference, ISBN , Sibiu, June 24-26, 2010 - http://roedu2010.ulbsibiu.ro/ (indexed in IEEE Xplore Digital Library)

    C. RADU, L. VINTAN - Optimizing Application Mapping Algorithms for NoCs through a Unified Framework, Proceedings of The 9-th IEEE RoEduNet International Conference, ISBN , Sibiu, June 24-26, 2010 - http://roedu2010.ulbsibiu.ro/ (indexed in IEEE Xplore Digital Library)

    L. N. VINTAN - Direcții de cercetare în domeniul sistemelor multicore / Main Challenges in Multicore Architecture Research, Revista Romana de Informatica si Automatica, ISSN: 1220-1758, ICI Bucuresti, vol. 19, no. 3, 2009, see http://www.ici.ro/RRIA/ria2009_3/index.html
  • 8/8/2019 Our Research Presentation com

    55/57

    Advanced Computer Architecture & Processing Systems Research Lab

    http://acaps.ulbsibiu.ro/research.php

    References (1/2) - CFDCalculation1. J. Hu and R. Marculescu, Energy-aware mapping for tile-based NoC architectures under performance

    constraints, in Proceedings of the 2003 Asia and South Pacific Design Automation Conference.Kitakyushu, Japan: ACM, 2003, pp. 233239.

    2. R. Marculescu and J. Hu, Energy- and performance-aware mapping for regular NoC architectures,IEEE Transactions on ComputerAided Design of Integrated Circuits and Systems, vol. 24, no. 4, pp.551562, 2005.

    3. S. Murali and G. D. Micheli, Bandwidth-Constrained mapping of cores onto NoC architectures, inProceedings of the conference on Design, Automation and Test in Europe - Volume 2. IEEEComputer Society, 2004, p. 20896.

    4. K. Srinivasan and K. S. Chatha, A technique for low energy mapping and routing in network-on-chiparchitectures, in Proceedings of the 2005 international symposium on Low power electronics anddesign. San Diego, CA, USA: ACM, 2005, pp. 387392.

    5. G. Ascia, V. Catania, and M. Palesi, Multi-objective mapping for mesh-based NoC architectures, inProceedings of the 2nd IEEE/ACM/IFIP international conference on Hardware/software codesignand system synthesis. Stockholm, Sweden: ACM, 2004, pp. 182187.

    6. J. P. Soininen and T. Salminen, Evaluating application mapping using network simulation, Proc of theInter Symp on SystemonChip, vol. 1100, no. Kaitovyl 1, p. 2730, 2003.

    7. (2010) The SystemC website. [Online]. Available: http://www.systemc.org

    8. S. Murali and G. D. Micheli, SUNMAP: a tool for automatic topology selection and generation forNoCs, in Proceedings of the 41st annual Design Automation Conference. San Diego, CA, USA:ACM, 2004, pp. 914919.

    9. C. Grecu, A. Ivanov, P. Pande, A. Jantsch, E. Salminen, U. Ogras, and R. Marculescu, Towards openNetwork-on-Chip benchmarks, in Proceedings of the First International Symposium on Networks-on-Chip.IEEE Computer Society, 2007, p. 205.

  • 8/8/2019 Our Research Presentation com

    56/57

    Advanced Computer Architecture & Processing Systems Research Lab

    http://acaps.ulbsibiu.ro/research.php

    References (2/2) - CFD Calculation

    10. S. Mahadevan, F. Angiolini, M. Storgaard, R. G. Olsen, J. Sparso, and J. Madsen, "A network traffic generator model for fast Network-on-Chip simulation," in Proceedings of the conference on Design, Automation and Test in Europe - Volume 2. IEEE Computer Society, 2005, pp. 780-785.

    11. R. P. Dick, D. L. Rhodes, and W. Wolf, "TGFF: task graphs for free," in Proceedings of the 6th international workshop on Hardware/software codesign. Seattle, Washington, United States: IEEE Computer Society, 1998, pp. 97-101.

    12. (2010) The Embedded System Synthesis Benchmarks Suite (E3S) website. [Online]. Available: http://ziyang.eecs.umich.edu/~dickrp/e3s/

    13. (2010) The Embedded Microprocessor Benchmark Consortium (EEMBC) website. [Online]. Available: http://www.eembc.org

    14. (2010) The ns-3 network simulator website. [Online]. Available: http://www.nsnam.org/

    15. H. vom Lehn, K. Wehrle, and E. Weingartner, "A performance comparison of recent network simulators," 2009 IEEE International Conference on Communications, pp. 1-5, 2009.

    16. S. Schlingmann, "Selbstoptimierendes routing in einem network-on-a-chip," Master's thesis, University of Augsburg, 2007.

    17. J. Duato, S. Yalamanchili, and L. M. Ni, Interconnection Networks: An Engineering Approach, 1st ed. Institute of Electrical & Electronics Enginee, 1997.

    18. S. E. Lee and N. Bagherzadeh, "Increasing the throughput of an adaptive router in network-on-chip (NoC)," in Proceedings of the 4th international conference on Hardware/software codesign and system synthesis. Seoul, Korea: ACM, 2006, pp. 82-87.

    19. E. Salminen, A. Kulmala, and T. D. Hamalainen, "Survey of network-on-chip proposals," White paper, OCP-IP, Tampere University of Technology, March 2008. [Online]. Available: http://ocpip.biz/uploads/documents/OCP-IP_Survey_of_NoC_Proposals_White_Paper_April_2008.pdf


    References - Meta-classifiers for Text Documents

    CRETULESCU R., MORARIU D., VINTAN L. Eurovision-like weighted Non-Adaptive Meta-classifier for Text Documents, Proceedings of the 8th RoEduNet IEEE International Conference Networking in Education and Research, pp. 145-150, ISBN 978-606-8085-15-9, Galati, December 2009 (indexed in ISI Web of Science - http://apps.isiknowledge.com/)

    MORARIU D., CRETULESCU R., VINTAN L. Improving a SVM Meta-classifier for Text Documents by using Naive Bayes, International Journal of Computers, Communications & Control (IJCCC), Agora University Editing House - CCC Publications, ISSN 1841-9836, E-ISSN 1841-9844, Vol. V, No. 3, pp. 351-361, 2010

    CRETULESCU R., MORARIU D., VINTAN L., COMAN I. D. An Adaptive Meta-classifier for Text Documents, The 16th International Conference on Information Systems Analysis and Synthesis: ISAS 2010, Orlando, Florida, USA, April 6th - 9th, 2010
