Special Issue on ICIT 2009 Conference - Bioinformatics and Image - Ubiquitous Computing and Communication Journal [ISSN 1992-8424]


    Co-Editor Dr. AL-Dahoud Ali

Ubiquitous Computing and Communication Journal

    Book: 2009 Volume 4

    Publishing Date: 07-30-2009

    Proceedings

    ISSN 1992-8424

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilm or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the copyright law of 1965, in its current version, and permission for use must always be obtained from UBICC Publishers. Violations are liable to prosecution under the copyright law.

    UBICC Journal is a part of UBICC Publishers

    www.ubicc.org

    UBICC Journal

    Printed in South Korea

Typesetting: Camera-ready by author, data conversion by UBICC Publishing Services, South Korea

    UBICC Publishers


Guest Editor's Biography

Dr. Al-Dahoud is an associate professor at Al-Zaytoonah University, Amman, Jordan. He received his PhD from La Sapienza/Italy and Kiev Polytechnic/Ukraine in 1996, and has worked at Al-Zaytoonah University since then. He has also worked as a visiting professor at many universities in Jordan and the Middle East, supervising master's and PhD degrees in computer science. He established ICIT in 2003 and has been its program chair ever since. He was Vice President of the IT committee in the Ministry of Youth/Jordan in 2005 and 2006, and was General Chair of ICITST-2008, June 23-28, 2008, Dublin, Ireland (www.icitst.org).

He has directed and led several projects sponsored by NUFFIC/Netherlands:

- "The Tailor-made Training" 2007 and "On-Line Learning & Learning in an Integrated Virtual Environment" 2008.

His hobby is conference organization, and he participates in the following conferences as general chair, program chair, session organizer or member of the publicity committee:

- ICIT, ICITST, ICITNS, DepCos, ICTA, ACIT, IMCL, WSEAS, and AICCSA

Journal activities: Al-Dahoud has served as Editor-in-Chief, guest editor or editorial board member of the following journals: Journal of Digital Information Management, IAJIT, Journal of Computer Science, Int. J. Internet Technology and Secured Transactions, and UBICC.

He has published many books and journal papers, and has been a speaker at many conferences worldwide.


TESTING OF PROGRAM CORRECTNESS
IN FORMAL THEORY

    Ivana Berkovic

    University of Novi Sad, Technical Faculty Mihajlo Pupin, Zrenjanin, Serbia

    [email protected]

Branko Markoski

    University of Novi Sad, Technical Faculty Mihajlo Pupin, Zrenjanin, Serbia

    [email protected]

    Jovan Setrajcic

    University of Novi Sad, Faculty of Sciences, Novi Sad, Serbia

    [email protected]

    Vladimir Brtka

    University of Novi Sad, Technical Faculty Mihajlo Pupin, Zrenjanin, Serbia

    [email protected]

    Dalibor Dobrilovic

    University of Novi Sad, Technical Faculty Mihajlo Pupin, Zrenjanin, Serbia

    [email protected]

    ABSTRACT

Within the software life cycle, program testing is very important, since fulfillment of specification requirements, design and application must be proven. All definitions of program testing share the same tendency: to answer the question, does the program behave in the requested way? One of the oldest and best-known methods used in constructive testing of smaller programs is symbolic program execution: one way to prove that a given program is written correctly is to execute it symbolically. A branched program may be translated into declarative form, i.e. into a clause sequence, and this translation may be automated. The method comprises a transformation part and a resolution part. This work describes a general framework for investigating the problem of program correctness using the method of resolution invalidation. It is shown how the rules of program logic can be used in the automatic resolution procedure. Examples of the realization in the LP language (Prolog-like) are given, without limitation to Horn clauses and without finite failure. The process of Pascal program execution in the LP system demonstrator is shown.

    Keywords: program correctness, resolution, test information, testing programs

    1. INTRODUCTION

Program testing is defined as the process of executing a program and comparing its observed behaviour to the behaviour requested. The primary goal of testing is to find software flaws [1]; the secondary goal is to increase the confidence of the testers (the persons performing the tests) when testing finds no errors. The conflict between these two goals becomes visible when a testing process finds no error: in the absence of other information, this may mean that the software is either of very high or of very poor quality.

Program testing is, in principle, a complicated


process that must be executed as systematically as possible in order to provide an adequate reliability and quality certificate.

Within the software lifespan, program testing is one of the most important activities, since fulfillment of specification requirements, design and application must be checked. According to Manthos [2], big software producers spend about 40% of their time on program testing. In order to test large and complicated programs, testing must be as systematic as possible. Therefore, of all testing methods, the only one that must not be applied is ad hoc testing, since it cannot verify quality and correctness with respect to the specification, construction or application. Testing first certifies whether the program performs the job it was intended to do, and then how it behaves in different exploitation conditions. The key element in program testing is therefore its specification, since, by definition, testing must be based on it. A testing strategy includes a set of activities organized in a well-planned sequence of steps which finally confirms (or refutes) that the required software quality has been achieved. Errors are made in all stages of software development and tend to propagate: the number of errors revealed may rise during design and then increase several times during coding. According to [3], the program-testing stage costs three to five times more than any other stage in the software life span.

In large systems, many errors are found at the beginning of the testing process, with a visible decline in the error rate as the errors in the software are fixed. There are several different approaches to program testing; one of our approaches is given in [4]. The testing result cannot be predicted in advance, but on the basis of testing results it may be estimated how many more errors are present in the software.

The usual approach to testing is based on requirements analysis: the specification is converted into test items. Apart from the fact that uncorrectable errors may occur in programs, specification requirements are written at a much higher level than testing standards. This means that, during testing, attention must be paid to many more details than are listed in the specification itself. Due to lack of time or money, often only parts of the software are tested, or only the parts listed in the specification.

The structural testing method belongs to another testing strategy, so-called "white box" testing (some authors call it transparent or glass box). The usual "white box" criterion is to execute every executable statement during testing and to write every result into a testing log. The basic strength of all these tests is that the complete code is taken into account during testing, which makes it easier to find errors even when software details are unclear or incomplete.

According to [5], testing may be descriptive or prescriptive. In descriptive testing, testing all test items is not necessary; instead, the testing log records whether the software is hard to test, whether it is stable, the number of bugs, and so on. Prescriptive testing establishes operative steps that help control the software, i.e. dividing complex modules into several simpler ones. There are several measures of software complexity. An important criterion in selecting a measure is equality (harmony) of applications; this is popular in commercial software applications because it guarantees the user a certain level of testing, or the possibility of so-called internal action [6]. There is a strong connection between complexity and testing, and the methodology of structural testing makes this connection explicit [6].

First, complexity is the basic source of software errors, in both an abstract and a concrete sense. In the abstract sense, complexity above a certain point exceeds the ability of the human mind to perform exact mathematical manipulation. Structured programming techniques may push these barriers, but cannot remove them completely. Other factors, listed in [7], indicate that the more complex a module is, the more probable it is that it contains an error; moreover, above a certain complexity threshold the probability of an error in the module rises progressively. On the basis of this information, many software purchasers limit the number of cycles (the cyclomatic complexity of a software module, McCabe [8]1) in order to increase total reliability. On the other hand, complexity may be used directly to distribute the testing effort over the input data, by connecting complexity and the number of errors, in order to aim testing at finding the most probable errors (the "lever" mechanism, [9]). In structural testing methodology, this distribution means precisely determining the number of test paths needed for every software module being tested, which is exactly the cyclomatic complexity.
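For reference (this formulation is standard and not quoted in the paper): McCabe's cyclomatic complexity of a module is computed from its control-flow graph as

    V(G) = E - N + 2P,

where E is the number of edges, N the number of nodes and P the number of connected components. For example, a module whose control-flow graph has 9 edges, 8 nodes and one connected component has V(G) = 9 - 8 + 2 = 3, i.e. three linearly independent paths, and hence at least three test paths in the structural testing methodology above.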

Other usual "white box" criteria have the important flaw that they may be satisfied with a small number of tests for arbitrary complexity (under any possible meaning of "complexity") [10].

The demonstration of program correctness and the programming of correct programs are two related theoretical problems which are very meaningful in practice [11]. The first is resolved within program analysis and the second within program synthesis, although, because of the connection between program analysis and program synthesis, a reciprocal interference of the two processes can be noticed. Nevertheless, when it comes to the automatic methods that are to prove correctness and to the

1 McCabe's measure, based on the number and structure of the cycles.


    methods of automatic program synthesis, the

    difference between them is evident.

Reference [12] describes the initial possibility of automatic synthesis of simple programs using the resolution procedure of automatic demonstration of theorems (ADT), more precisely the resolution procedure for deducing an answer to a request. The demonstration that a request of the form (∃x)W(x) is a logical consequence of the axioms that define the predicate W and the (elementary) program operators ensures that, in the answer, the variable x obtains a value representing the requested composition of (elementary) operators, i.e. the requested program. The works of Z. Manna examine in detail the problems of program analysis and synthesis using the resolution procedure of demonstration and answer deduction.

A different research direction is the axiomatic definition of the semantics of the programming language Pascal in the form of specific deduction rules of program logic, described in [14,15]. Although the concepts of the two approaches mentioned are different, they share the same characteristic: both are deductive systems on a predicate language. In fact, both amount to realizations in a special predicate calculus based on deduction in a formal theory. With this, the problem of program correctness is related to the automatic checking of (existing) proofs of mathematical theorems. The two approaches mentioned above and their modifications are based on that kind of concept.

2. DESCRIPTION OF THE METHOD FOR SINGLE-PASS SYMBOLIC PROGRAM TESTING

The method is based on the transformation of a given Pascal program into a sequence of Prolog clauses, which form the axiomatic base for the deductive resolution mechanism of the BASELOG system [10]. For a given Pascal program, a single pass through the resolution procedure of the BASELOG system yields all possible outputs of the Pascal program in symbolic form, together with the paths leading to each of them. Both parts, transformation and resolution, are completely automated and naturally attached to each other. When the resolution part has finished, the sequence of paths and symbolic outputs for the given input Pascal program is read out. The transformation of programming structures and programming operators into a sequence of clauses is realized by models that depend on the concrete programming language. The automation covers the branching structures IF-THEN and IF-THEN-ELSE, as well as the cyclic structures WHILE-DO and REPEAT-UNTIL, which may be mutually nested. This paper reviews the possibilities of working with one-dimensional arrays and subprograms within a Pascal program. The number of passes through cyclic structures must be fixed in advance using a counter; a sketch of the transformation step is given below.
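As an illustration only (the paper gives no code, and the exact clause shapes fed to BASELOG are not reproduced here), the following Python sketch flattens a small Pascal-like syntax tree into the functional notation s (sequence), d (assignment), ife/if (branching), wh and ru (cycles) that is introduced in Section 3:

    # Illustrative sketch only: translate a tiny Pascal-like AST into the
    # functional notation of Section 3. The concrete clause format expected
    # by the BASELOG system is an assumption here.
    def translate(node):
        kind = node[0]
        if kind == "assign":                      # ('assign', var, expr)
            return f"d({node[1]},{node[2]})"
        if kind == "seq":                         # ('seq', stmt1, stmt2)
            return f"s({translate(node[1])},{translate(node[2])})"
        if kind == "if":                          # ('if', cond, then, else_or_None)
            cond, then_b, else_b = node[1], node[2], node[3]
            if else_b is None:
                return f"if({cond},{translate(then_b)})"
            return f"ife({cond},{translate(then_b)},{translate(else_b)})"
        if kind == "while":                       # ('while', cond, body)
            return f"wh({node[1]},{translate(node[2])})"
        if kind == "repeat":                      # ('repeat', body, cond)
            return f"ru({translate(node[1])},{node[2]})"
        raise ValueError(f"unsupported construct: {kind}")

    # Example: i := 0; while b do begin i := i+1; p := p+x end
    prog = ("seq", ("assign", "i", "0"),
                   ("while", "b", ("seq", ("assign", "i", "i+1"),
                                          ("assign", "p", "p+x"))))
    print(translate(prog))
    # -> s(d(i,0),wh(b,s(d(i,i+1),d(p,p+x))))

Each such term would then appear inside the clauses that make up the axiomatic base for the resolution part.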

During the testing of a given (input) Pascal program both parts are involved, transformation and resolution, in the following way. The transformation part either ends by producing a sequence of clauses, or does not end and demands forced termination, depending on the input Pascal program.

The impossibility of generating a sequence of clauses in the transformation part indicates that the Pascal program does not have correct syntax, i.e. that there are mistakes in its syntax or logical structure (destructive testing). In this case, since the axiomatic base was not constructed, the resolution part is not activated and the user is prompted to mend the Pascal program syntax. If the transformation part finishes by generating a sequence of clauses, the resolution part is activated, with the following possible outcomes:

Ra) it ends by giving a list of symbolic outputs and the corresponding routes through the Pascal program, or

Rb) it ends with a message that it could not generate the list of outputs and routes, or

Rc) it does not end and demands forced termination.

Ra) By comparing the symbolic outputs and routes with the specification, the user may declare the given Pascal program correct, if the outputs are in accordance with the specification (constructive testing); if a discrepancy between some symbolic expression and the specification is found, there is a semantic error in the Pascal program on the corresponding route (destructive testing).

Rb) The impossibility of generating the list of symbolic expressions in the resolution part means that there is a logical-structural error in the Pascal program (destructive testing).

Rc) Excessively long operation or an (unending) cycle means that there is a logic and/or semantic error in the Pascal program (destructive testing).

In this way, by using this method, the user may become convinced of the correctness of a Pascal program or of the presence of syntax and/or logic-structure and semantic errors. As opposed to existing methods of symbolic program testing, an important feature of this method is that it is single-pass, which is provided by a specific property of OL resolution [11] with marked literals, on which the resolution module of the BASELOG system is founded.


    3. DEDUCTION IN FORMAL THEORY AND

    PROGRAM CORRECTNESS

Program verification may lean on techniques for automatic theorem proving. These techniques embody the principles of deductive reasoning, the same ones used by programmers during program design. Why not use the same principles in an automatic synthesis system, which may construct the program instead of merely proving its correctness? Designing a program demands more originality and creativity than proving its correctness, but both tasks demand the same way of thinking [13].

Structured programming itself helped the automatic synthesis of computer programs in the beginning, by establishing principles for program development on the basis of a specification. These principles were meant as guidelines for programmers. As a matter of fact, advocates of structured programming were very pessimistic about the possibility of ever automating their techniques; Dijkstra went so far as to say that we should not automate programming even if we could, since this would deprive the job of all delight.

Proving program correctness is a theoretical problem of great practical importance, and it is done within program analysis. The related theoretical problem is the design of correct programs, which is solved in another way, within program synthesis. It is evident that these processes are intertwined, since analysis and synthesis of programs are closely related. Nevertheless, the differences between them are distinct when it comes to automatic methods of proving program correctness and automatic methods of program synthesis.

If we observe one program, the questions of its termination and correctness arise; if we observe two programs, the question of their equivalence arises. An abstract, i.e. non-interpreted, program is defined by a directed graph. From such a program we may obtain a partially interpreted program by interpreting the function symbols, predicate symbols and constant symbols. If we also interpret the free variables of a partially interpreted program, a realized program is obtained. The behaviour of such a program is observed through the sequence it executes: a realized program, being deterministic, has one execution sequence, or none if it does not terminate. When the program is only partially interpreted, we see several possible execution sequences; for every interpreted predicate it is known when it is true and when it is not, which means that different execution paths are possible depending on the input variables. For an abstract program, we conclude that it has only one execution sequence, in which it is not known whether a predicate P or its negation holds.

The basic presumptions of programming logic are given in [14]. The basic relation {P}S{Q} is a specification for the program S, with the following meaning: if the predicate P at the input is fulfilled (true) before the execution of program S, then the predicate Q at the output is fulfilled (true) after the execution of S. In order to prove the correctness of program S, it is necessary to prove the relation {P}S{Q}, where the input values of the variables must fulfill the predicate P and the output values of the variables must fulfill the predicate Q. Since it is not proven that S terminates, termination being only presumed, this defines the partial correctness of the program. If it is also proven that S terminates and that the relation {P}S{Q} is fulfilled, we say that S is totally correct. For program design, this notion of correctness is used.

The basic idea is that program design should be done simultaneously with proving the correctness of the program for the given specification [15,16]. One starts from the specification {P}S{Q}, with the given precondition P and the given resulting postcondition Q, and then derives sub-specifications of the form {Pi}Si{Qi} for the components Si from which the program S is built, as in the example below. Special rules of inference provide the proof that fulfillment of the relation {P}S{Q} follows from fulfillment of the relations {Pi}Si{Qi} for the component programs Si.
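A minimal worked example (ours, not from the paper): to prove {x = a} y := x; y := y + 1 {y = a + 1}, choose the intermediate assertion R: y = a. The assignment axiom gives the two sub-specifications {x = a} y := x {y = a} and {y = a} y := y + 1 {y = a + 1}, and the composition rule P3 below then yields {x = a} begin y := x; y := y + 1 end {y = a + 1}.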

Notice that the rules given in [9] are intended for manual design and manual confirmation of program correctness, with no mention of the possibility of automatic (resolution) confirmation methods. If we wish to prove the correctness of program S, we must prove the relation {P}S{Q}, where the input values of the variables fulfill the formula P and the output values fulfill the formula Q. This defines only the partial correctness of S, since termination of S is assumed; if we also prove that S terminates, we say that S is totally correct. The principle of correctness determined in this way is used for program design, which starts from the specification {P}S{Q} with the given precondition P and the given resulting postcondition Q.

The formula {P}S{Q} is written as K(P, S, Q), where K is a predicate symbol and P, S, Q are variables of the first-order predicate calculus. The assignment axiom

{P[Y/Z]} Z := Y {P}

is written as K(t(P,Z,Y), d(Z,Y), P), where t and d are function symbols and P, Z, Y are variables;

Rules R(T):

P1 (consequence rule): from {P}S{R} and R ⇒ Q infer {P}S{Q};
written as K(P,S,R) ∧ Im(R,Q) ⇒ K(P,S,Q),


    where Im (implication) is a predicate symbol, and

    P, S, R, Q are variables;

P2 (consequence rule): from R ⇒ P and {P}S{Q} infer {R}S{Q};
written as Im(R,P) ∧ K(P,S,Q) ⇒ K(R,S,Q);

P3 (composition): from {P}S1{R} and {R}S2{Q} infer {P} begin S1; S2 end {Q};
written as K(P,S1,R) ∧ K(R,S2,Q) ⇒ K(P,s(S1,S2),Q),
where s is a function symbol and P, S1, S2, R, Q are variables;

P4 (if-then-else): from {P∧B}S1{Q} and {P∧¬B}S2{Q} infer {P} if B then S1 else S2 {Q};
written as K(k(P,B),S1,Q) ∧ K(k(P,n(B)),S2,Q) ⇒ K(P,ife(B,S1,S2),Q),
where k, n, ife are function symbols;

P5 (if-then): from {P∧B}S{Q} and P∧¬B ⇒ Q infer {P} if B then S {Q};
written as K(k(P,B),S,Q) ∧ Im(k(P,n(B)),Q) ⇒ K(P,if(B,S),Q),
where k, n, if are function symbols;

P6 (while-do): from {P∧B}S{P} infer {P} while B do S {P∧¬B};
written as K(k(P,B),S,P) ⇒ K(P,wh(B,S),k(P,n(B))),
where k, n, wh are function symbols;

P7 (repeat-until): from {P}S{Q} and Q∧¬B ⇒ P infer {P} repeat S until B {Q∧B};
written as K(P,S,Q) ∧ Im(k(Q,n(B)),P) ⇒ K(P,ru(S,B),k(Q,B)),
where k, n, ru are function symbols.

Transcription of other programming logic rules is also possible.

Axiom A(T):

A1 (assignment axiom): K(t(P,Z,Y), d(Z,Y), P).

A formal theory T is given by (S(T), F(T), A(T), R(T)), where S(T) is the set of symbols (alphabet) of the theory T, F(T) is the set of formulae (correct words in the alphabet), A(T) is the set of axioms of T (A ⊆ F), and R(T) is the set of derivation rules of T. B is a theorem of T if and only if B can be derived in the calculus k from the set R(T) ∪ A(T), where k is the first-order predicate calculus. Let S be the special predicate calculus (first-order theory) whose own axioms are A(S) = R(T) ∪ A(T). This means that the derivation of a theorem B within the theory T can be replaced by a derivation within the special predicate calculus S.

We assume that s is a syntactic unit whose (partial) correctness is being proven for a given input predicate U and output predicate V. Within the theory S one proves

(∃P)(∃Q) K(P,s,Q),

where s is a constant representing the given program, written in functional notation with the symbols: s (sequence), d (assignment), ife (if-then-else), if (if-then), wh (while), ru (repeat-until). The negation of this statement is added to the starting set of axioms A(S). The result of the refutation by the resolution procedure has the form

/Im(X,Y) Odgovor(P,Q)   (Odgovor = answer),

where X, Y, P, Q are the values for which the refutation succeeded, i.e. for which a proof is found. But this alone does not mean that the given program is partially correct. It is also necessary to establish that the input and output predicates U, V are in accordance with P, Q, and that Im(X,Y) really is fulfilled for the domain predicates and terms. Accordance means confirming that

(U ⇒ P) ∧ (Q ⇒ V) ∧ (X ⇒ Y)

is valid. There are two ways to establish accordance: manually, or by an automatic resolution procedure. Neither is possible within the theory S itself, but they are possible within a new theory defined by the predicates and terms occurring in the program s and in the input-output predicates U, V. Within this theory U, P, Q, V, X, Y are no longer variables but formulae over domain variables, domain terms and domain predicates. This method concerns derivation within a special predicate calculus based on deduction within the formal theory; thus the program-correctness problem is associated with the automatic checking of (existing) proofs of mathematical theorems.

More formally, the theory T is determined by (S(T), F(T), A(T), R(T)), where S(T) is the set of symbols (alphabet) of the theory T, F(T) is the set of formulas (regular words in the alphabet S(T)), A(T) is the set of axioms of T (A ⊆ F), and R(T) is the set of rules of derivation of T. A deduction (proof) of a formula B in the theory T is a finite sequence B1, B2, ..., Bn (Bn is B) of formulas of this theory such that every element Bi of the sequence is either an axiom or is deduced by applying some rule of deduction Ri ∈ R(T) to some preceding elements


of the sequence. It is then said that B is a theorem of the theory T, and we write ⊢T B [17].

Suppose S(T) is the set of symbols of the predicate calculus and F(T) the set of its formulas. In that case, the rules of deduction R(T) can be written in the form

B_i1, B_i2, ..., B_ik ⊢ B_i   (Ri),

where B_ik, B_i are formulas from F(T). If k is the first-order predicate calculus, then

R(T), A(T) ⊢k B   if and only if   ⊢T B.      (1)

That is, B is a theorem of the theory T if and only if B is deducible in the calculus k from the set R(T) ∪ A(T). Suppose S is a special predicate calculus (first-order theory) with its own axioms A(S) = R(T) ∪ A(T) (the rules of deduction in S are the rules of deduction of the calculus k); then A(S) ⊢k B if and only if ⊢S B, so that (1) can be written:

⊢S B   if and only if   ⊢T B.      (2)

This means that the deduction of a theorem B in the theory T can be replaced by a deduction in the special predicate calculus S, which has its own axioms A(S) = R(T) ∪ A(T).

Now we can formulate the following task: a sequence of formulas B1, B2, ..., Bn is given (Bn is B, Bi different from B for i < n)


~IM(Y1,V1) ~K(X1,U0,Y1) K(X1,U0,V1) &   / consequence rule
~O(X1,V1) &   / negation addition

The LP system generates the next refutation:
number of resolvents generated = 10
maximal obtained level = 11
DEMONSTRATION IS PRINTED
level where the empty clause is generated = 11

    LEVEL=1; central item

    :/O(X1,V1)~K(X1,s(h,g),Y1)~K(Y1,w(b,s(d(i,t1),d

    (p,t2))),V1)&

    4.lateral, 2.literal :

    ~K(k(X1,V2),U0,X1)K(X1,w(V2,U0),k(X1,ng(V2)

    ))&

    LEVEL= 2; resolvent:

    /O(X1,k(X0,ng(b)))~K(X1,s(h,g),X0)/~K(X0,w(b,s

    (d(i,t1),d(p,t2))),k(X0,ng(b)))~K(k(X0,b),s(d(i,t1),d

    (p,t2)),X0)&

    3.lateral, 3.literal :

    ~K(X1,Y1,U1)~K(U1,Y2,V1)K(X1,s(Y1,Y2),V1)

    &

    LEVEL= 3; resolvent:

    /O(X1,k(V1,ng(b)))~K(X1,s(h,g),V1)/~K(V1,w(b,s

    (d(i,t1),d(p,t2))),k(V1,ng(b)))/~K(k(V1,b),s(d(i,t1),

    d(p,t2)),V1)~K(k(V1,b),d(i,t1),U1)~K(U1,d(p,t2),V

    1)&

    5.lateral, 1.literal :

    K(t(X1,Z1,Y1),d(Z1,Y1),X1)&

    LEVEL= 4; resolvent:

    /O(X1,k(X0,ng(b)))~K(X1,s(h,g),X0)/~K(X0,w(b,s

    (d(i,t1),d(p,t2))),k(X0,ng(b)))/~K(k(X0,b),s(d(i,t1),

    d(p,t2)),X0)~K(k(X0,b),d(i,t1),t(X0,p,t2))&

    6.lateral, 3.literal :

    ~IM(X2,Y1)~K(Y1,U0,V1)K(X2,U0,V1)&

    LEVEL= 5; resolvent:

    /O(X1,k(X0,ng(b)))~K(X1,s(h,g),X0)/~K(X0,w(b,s

    (d(i,t1),d(p,t2))),k(X0,ng(b)))/~K(k(X0,b),s(d(i,t1),

    d(p,t2)),X0)/~K(k(X0,b),d(i,t1),t(X0,p,t2))~IM(k(X

    0,b),Y1)~K(Y1,d(i,t1),t(X0,p,t2))&

    5. lateral, 1.literal :

    K(t(X1,Z1,Y1),d(Z1,Y1),X1)&

    LEVEL= 6; resolvent:

    /~IM(k(X0,b),t(t(X0,p,t2),i,t1))/O(X1,k(X0,ng(b)))

    ~K(X1,s(h,g),X0)&

    3.lateral, 3.literal :

    ~K(X1,Y1,U1)~K(U1,Y2,V1)K(X1,s(Y1,Y2),V1)

    &

    LEVEL= 7; resolvent:

    /~IM(k(V1,b),t(t(V1,p,t2),i,t1))/O(X2,k(V1,ng(b)))/

    ~K(X2,s(h,g),V1)~K(X2,h,U1)~K(U1,g,V1)&

    2.lateral, 2.literal :

    ~K(Y1,d(i,0),V1)K(Y1,g,V1)&

    LEVEL= 8; resolvent:

    /~IM(k(V0,b),t(t(V0,p,t2),i,t1))/O(X2,k(V0,ng(b)))/~K(X2,s(h,g),V0)~K(X2,h,Y1)/~K(Y1,g,V0)~K(Y

    1,d(i,0),V0)&

    5.lateral, 1.literal :

    K(t(X1,Z1,Y1),d(Z1,Y1),X1)&

    LEVEL= 9; resolvent:

    /~IM(k(X1,b),t(t(X1,p,t2),i,t1))/O(X2,k(X1,ng(b)))/

    ~K(X2,s(h,g),X1)~K(X2,h,t(X1,i,0))&

1. lateral, 2. literal:

    ~K(Y1,d(p,x),V1)K(Y1,h,V1)&

    LEVEL= 10; resolvent:

    /~IM(k(X1,b),t(t(X1,p,t2),i,t1))/O(Y1,k(X1,ng(b)))/

    ~K(Y1,s(h,g),X1)/~K(Y1,h,t(X1,i,0))~K(Y1,d(p,x),

    t(X1,i,0))&

5. lateral, 1. literal:

    K(t(X1,Z1,Y1),d(Z1,Y1),X1)&

    LEVEL= 11; resolvent:

    /O(Y1,k(X1,ng(b)))/~K(Y1,s(h,g),X1)/~K(Y1,h,t(X

    1,i,0))~K(Y1,d(p,x),t(X1,i,0))&

    5. lateral, 1.literal :

    K(t(X1,Z1,Y1),d(Z1,Y1),X1)&

LEVEL= 11; resolvent: (the empty clause is generated)

DEMONSTRATION IS PRINTED

Now we need to prove compliance, i.e. that (U ⇒ P) ∧ (Q ⇒ V) ∧ (X ⇒ Y) holds; that is, at

LEVEL= 12; resolvent:
/IM(k(X1,b),t(t(X1,i,t1),p,t2)) O(t(t(X1,i,0),p,x),k(X1,ng(b))) &

By mapping the symbols to the domain level we obtain


an arithmetic implication over the program variables i, j, p and x, expressed through summations over j from 0 to i, which is then checked for the case i = 0.

By this the compliance is proven, which is enough to conclude that the given program is (partially) correct (up to termination).

    5. INTERPRETATION RELATED TO

    DEMONSTRATION OF PROGRAM

    CORRECTNESS

Interpret the sequence J: B1, ..., Bn as the program S, the elements of A(T) as the initial elements for the composition of the program S, and the elements of R(T) as the rules for composing program constructions. Vice versa, if we consider the program S as the sequence J, the initial elementary program operators as the elements of A(T) and the rules for composing program structures as the elements of R(T), then the problem of verifying the correctness of the given program is related to demonstrating the correctness of a deduction in the corresponding formal theory. It is necessary to represent the axioms, the rules and the program by predicate formulas.

With all that is mentioned above we have defined a general frame for the composition of concrete procedures for demonstrating program correctness by the deductive method. With different choices of axioms, rules and predicate notation, different concrete procedures are possible.

    6. CONCLUSION

Software testing is an important step in program development. Software producers would like to predict the number of errors in a software system before its application, so they could estimate the quality of the product bought and the difficulties in the maintenance process [18]. Testing often takes 40% of the time needed for the development of a software package, which is the best proof that it is a very complex process. The aim of testing is to establish whether the software behaves in the way envisaged by its specification. Therefore, the primary goal of software testing is to find errors. Nevertheless, not all errors are ever found, and there is a secondary goal of testing: to enable the person who performs the testing (the tester) to trust the software system [19]. For these reasons it is very important to choose a testing method that will, for the given functions of the software system, find those fatal errors that lead to the highest hazards. To make this possible, one of the tasks given to programmers is to develop software that is easy to test ("software is designed for people, not for machines") [20].

Program testing is often equated with looking for any errors [20]. There is no point in testing for errors that probably do not exist; it is much more efficient to think thoroughly about the kinds of errors that are most probable (or most harmful) and then to choose testing methods able to find such errors. The success of a set of test items is equal to the successful execution of a detailed test program. One of the big issues in program testing is error reproduction (testers find errors and programmers remove bugs) [21]: there must be coordination between testers and programmers, and error reproduction works best when a problematic test can be run again knowing exactly when and where the error occurred. Therefore, there is no ideal test, just as there is no ideal product [22]. Software producers would like to anticipate the number of errors in software systems before their application, in order to estimate the quality of the acquired program and the difficulties in maintenance. This work summarizes and describes the process of program testing, the problems that testers have to resolve and some solutions for the efficient elimination of errors [23]. The testing of big and complex programs is in general a complicated process that has to be carried out as systematically as possible, in order to provide adequate confidence and to confirm the quality of the given application [24].

Deductions in formal theories represent a general frame for the development of deductive methods for the verification of program correctness. This frame gives two basic methods (invalidation of the added predicate formula and usage of the rules of program logic) and their modifications.

Working with the formula that is added to the given program implies the presence of added axioms; without them, the invalidation cannot


be realized. The added axioms describe the characteristics of the domain predicates and operations and represent necessary knowledge that has to be communicated to the deductive system. The existing results described above imply that kind of knowledge, but obtaining it turns out to be a notable difficulty in practice.

ACKNOWLEDGEMENTS

The work presented in this paper was developed within the IT project "WEB portals for data analysis and consulting", No. 13013, supported by the government of the Republic of Serbia, 2008-2010.

    7. REFERENCES

[1] Marks, David M.: Testing Very Big Systems, McGraw-Hill, New York, 1992.

[2] Manthos A., Vasilis C., Kostis D.: Systematically Testing a Real-Time Operating System, IEEE Trans. Software Eng., 1995.

[3] Voas J., Miller W.: Software Testability: The New Verification, IEEE Software, 1995.

[4] Perry, William E.: Year 2000 Software Testing, John Wiley & Sons, New York, 1999.

[5] Whittaker J. A., Agrawal K.: A case study in software reliability measurement, Proceedings of Quality Week, paper no. 2A2, San Francisco, USA, 1995.

[6] Zeller A.: Yesterday, my program worked. Today, it does not. Why?, Passau, Germany, 2000.

[7] Markoski B., Hotomski P., Malbaski D., Bogicevic N.: Testing the integration and the system, International ZEMAK Symposium, Ohrid, FR Macedonia, 2004.

[8] McCabe, Thomas J., Butler, Charles W.: Design Complexity Measurement and Testing, Communications of the ACM, Vol. 32, 1992.

[9] Markoski B., Hotomski P., Malbaski D.: Testing the complex software, International ZEMAK Symposium, Ohrid, FR Macedonia, 2004.

[10] Chidamber S., Kemerer C.: Towards a Metrics Suite for Object Oriented Design, Proceedings of OOPSLA, July 2001.

[11] Whittaker J. A.: What is Software Testing? And Why Is It So Hard?, IEEE Software, Vol. 17, No. 1, 2000.

[12] Nilsson N.: Problem-Solving Methods in Artificial Intelligence, McGraw-Hill, 1980.

[13] Manna Z.: Mathematical Theory of Computation, McGraw-Hill, 1978.

[14] Floyd R. W.: Assigning meanings to programs, in: Proc. Symp. in Applied Math., Vol. 19, Mathematical Aspects of Computer Science, Amer. Math. Soc., pp. 19-32, 1980.

[15] Hoare C. A. R.: Proof of a program: Find, Communications of the ACM, Vol. 14, pp. 39-45, 1971.

[16] Hoare C. A. R., Wirth N.: An axiomatic definition of the programming language Pascal, Acta Informatica, Vol. 2, pp. 335-355, 1983.

[17] Markoski B., Hotomski P., Malbaski D., Obradovic D.: Resolution methods in proving the program correctness, YUGER, an international journal dealing with theoretical and computational aspects of operations research, systems science and management science, Beograd, Serbia, 2007.

[18] Myers G. J.: The Art of Software Testing, Wiley, New York, 1979.

[19] Chan F., Chen T., Mak I., Yu Y.: Proportional sampling strategy: Guidelines for software test practitioners, Information and Software Technology, Vol. 38, No. 12, pp. 775-782, 1996.

[20] Beck K.: Test Driven Development: By Example, Addison-Wesley, 2003.

[21] Runeson P., Andersson C., Höst M.: Test Processes in Software Product Evolution - A Qualitative Survey on the State of Practice, J. Software Maintenance and Evolution, Vol. 15, No. 1, pp. 41-59, 2003.

[22] Rothermel G. et al.: On Test Suite Composition and Cost-Effective Regression Testing, ACM Trans. Software Eng. and Methodology, Vol. 13, No. 3, pp. 277-331, 2004.

[23] Tillmann N., Schulte W.: Parameterized Unit Tests, Proc. 10th European Software Eng. Conf., ACM Press, pp. 253-262, 2005.

[24] Charlton, Nathaniel: Program verification with interacting analysis plugins, Formal Aspects of Computing, Vol. 19, Iss. 3, p. 375, London, Aug. 2007.


Detecting Metamorphic Viruses by Using Arbitrary Length of Control Flow Graphs and Nodes Alignment

Essam Al Daoud
Zarka Private University, Jordan
[email protected]

Ahid Al-Shbail
Al al-Bayt University, Jordan
[email protected]

Adnan M. Al-Smadi
Al al-Bayt University, Jordan
[email protected]

    ABSTRACT

Detection tools such as virus scanners have performed poorly, particularly when facing previously unknown viruses or novel variants of existing ones. This study proposes an efficient and novel method based on arbitrary length of control flow graphs (ALCFG) and similarity of the aligned ALCFG matrix. The metamorphic viruses are generated by two tools, namely the Next Generation Virus Creation Kit (NGVCK0.30) and the Virus Creation Lab for Windows 32 (VCL32). The results show that all the generated metamorphic viruses can be detected by using the suggested approach, while less than 62% are detected by well-known antivirus software.

Keywords: metamorphic virus, antivirus, control flow graph, similarity measurement.

    1. INTRODUCTION

Virus writers use ever better evasion techniques to transform their viruses and avoid detection; for example, polymorphism and metamorphism are specifically designed to bypass detection tools. There is strong evidence that commercial antivirus products are susceptible to common evasion techniques used by virus writers [1]. A metamorphic virus can reprogram itself: it uses code obfuscation techniques to challenge deeper static analysis, and it can also beat dynamic analyzers by altering its behavior. It does this by translating its own code into a temporary representation, editing that temporary representation of itself, and then writing itself back to normal code again. This procedure is applied to the virus as a whole, so the metamorphic engine itself also undergoes changes. Metamorphic viruses use several metamorphic transformations, including instruction reordering, data reordering, inlining and outlining, register renaming, code permutation, code expansion, code shrinking, subroutine interleaving, and garbage code insertion. The altered code is then recompiled to create a virus executable that looks fundamentally different from the original. For example, the source code of the metamorphic virus Win32/Simile is approximately 14,000 lines of assembly code, and the metamorphic engine itself takes up approximately 90% of the virus code, which makes it extremely powerful [2]. W32/Ghost contains many procedures and generates a huge number of metamorphic variants; it can generate at least 10! = 3,628,800 variations [3].

In this paper we develop a methodology for detecting metamorphic viruses in executables. We have initially focused our attention on viruses and simple entry-point infection; however, our method is general and can be applied to any malware and any obfuscated entry point.

    2. RELATED WORKS

Lakhotia, Kapoor, and Kumar believe that antivirus technologies could counter-attack using the same techniques that metamorphic virus writers use, by identifying similar weak spots in metamorphic viruses [4]. Geometric detection is based on modifications that a virus has made to the file structure; Peter Szor calls this method shape heuristics because it is far from exact and prone to false positives [5]. In 2005 Ando, Quynh, and Takefuji introduced a resolution-based technique for detecting metamorphic viruses, in which scattered and obfuscated code is resolved and simplified to several parts of malicious code. Their experiments showed that, compared with emulation, this technique is effective for metamorphic viruses which apply anti-heuristic techniques, such as register substitution or


if j = -1 then break

Algorithm 4: Construct the matrix ALCFG
Input: the matrix Labels of size c × 2 and the matrix JumpTo of size e × 3
Output: ALCFG, represented as an m × m matrix, and the node sequence NodeSeq containing m nodes

1- Fill the upper minor diagonal of the matrix ALCFG with 1
2- Fill the array NodeSeq with "K"   // labels
3- for each row i in the matrix JumpTo
       x = JumpTo[i][2]; NodeSeq[x] = JumpTo[i][3]
       for each row j in the matrix Labels
           if JumpTo[i][1] = Labels[j][1] then
               y = Labels[j][2]; ALCFG[x][y] = 1
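A compact sketch of Algorithm 4 in Python (ours, not the authors' code; the row layouts of Labels and JumpTo are assumptions, with Labels rows taken as (node index, label name) and JumpTo rows as (node index, target label, jump symbol), and node indices assumed 0-based):

    import numpy as np

    def build_alcfg(labels, jump_to, m):
        # labels : list of (node_index, label_name)            -- c x 2
        # jump_to: list of (node_index, target_label, symbol)  -- e x 3
        # m      : number of nodes kept in the skeleton signature
        alcfg = np.zeros((m, m), dtype=int)
        for i in range(m - 1):                 # step 1: upper minor diagonal = 1
            alcfg[i, i + 1] = 1                # fall-through edge to the next node
        node_seq = ["K"] * m                   # step 2: every node defaults to a label node
        label_node = {name: idx for idx, name in labels}
        for x, target, symbol in jump_to:      # step 3: one entry per jump instruction
            if x >= m:
                continue                       # keep only the first m nodes
            node_seq[x] = symbol               # jump-type symbol from Table 1
            y = label_node.get(target)
            if y is not None and y < m:
                alcfg[x, y] = 1                # edge from the jump to its target label
        return alcfg, node_seq

The pair (node_seq, alcfg) returned here corresponds to the skeleton signature of Definition 1 below.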

Table 1: the instructions and corresponding symbols

Instructions                      Symbol
JE, JZ                            A
JP, JPE                           R
JNE, JNZ                          N
JNP, JPO                          D
JA, JNBE, JG, JNLE                E
JAE, JNB, JNC, JGE, JNL           Q
JB, JNAE, JC, JL, JNGE            G
JBE, JNA, JLE, JNG                H
JO, JS                            I
JNO, JNS, JCXZ, JECXZ             L
LOOP                              P
LABEL                             K
GAP                               M
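For illustration only (the encoding algorithms that consume it are not reproduced in this excerpt), Table 1 can be kept as a simple lookup table when encoding the disassembled instruction stream:

    # Jump mnemonics -> node symbols, following Table 1.
    JUMP_SYMBOL = {
        "JE": "A", "JZ": "A",
        "JP": "R", "JPE": "R",
        "JNE": "N", "JNZ": "N",
        "JNP": "D", "JPO": "D",
        "JA": "E", "JNBE": "E", "JG": "E", "JNLE": "E",
        "JAE": "Q", "JNB": "Q", "JNC": "Q", "JGE": "Q", "JNL": "Q",
        "JB": "G", "JNAE": "G", "JC": "G", "JL": "G", "JNGE": "G",
        "JBE": "H", "JNA": "H", "JLE": "H", "JNG": "H",
        "JO": "I", "JS": "I",
        "JNO": "L", "JNS": "L", "JCXZ": "L", "JECXZ": "L",
        "LOOP": "P",
    }
    # "K" marks a label node and "M" a gap, as in the last two rows of Table 1.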

All the above algorithms can be implemented very efficiently and can be optimized. The worst case of algorithm 2 is 5n, where n is the number of lines in the disassembled file; the worst case of algorithm 3 is n; and the worst case of algorithm 4 is (m-2)^2, where m ≤ n. Therefore the total complexity of algorithm 1 is O(n) + O(m^2).

Definition 1: A skeleton signature of a binary file is the node sequence NodeSeq together with the matrix ALCFG.

To illustrate the previous procedures, consider as input the virus Z0mbie III: Figure 1 shows part of the source code of Z0mbie III, Figure 2 the op matrix, Figure 3 the Labels matrix and Figure 4 the JumpTo matrix of the first 20 nodes of the virus.

Figure 1: part from Z0mbie III

Figure 2: the op matrix

Figure 3: The Labels matrix

Figure 4: The JumpTo matrix


the mismatched nodes, and delete the last rows and columns from ALCFG_V, where the number of deleted rows and columns equals the number of gaps.
2- If the mismatch is with a symbol, then delete row i and column i from the matrices ALCFG_S and ALCFG_V for every i among the mismatched nodes.
3- Rename the matrices to ALCFG_S^d and ALCFG_V^d.
4- If ALCFG_S^d = ALCFG_V^d then
       Return 1
   Else
       Return 0

The most expensive step in the previous algorithms is the Needleman-Wunsch-Sellers algorithm, which can be implemented in m^2 operations, so the total complexity of all procedures is O(n) + O(m^2). Therefore the suggested method is much faster than previous methods; for example, the cost of finding an isomorphic subgraph, as in [9], is a well-known NP-complete problem. A sketch of the alignment step is given below.
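The following Python sketch (ours, not the authors' code) shows one way to realize the alignment and the similarity score c; the match/mismatch/gap scores are assumptions, and c follows the definition used in the worked example below, i.e. the number of matched nodes times 100 divided by the number of processed nodes m:

    def align(a, b, match=1, mismatch=-1, gap=-1):
        # Global (Needleman-Wunsch) alignment of two node-symbol sequences;
        # '-' marks a gap in the returned aligned sequences.
        n, m = len(a), len(b)
        F = [[0] * (m + 1) for _ in range(n + 1)]
        for i in range(1, n + 1):
            F[i][0] = i * gap
        for j in range(1, m + 1):
            F[0][j] = j * gap
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                s = match if a[i - 1] == b[j - 1] else mismatch
                F[i][j] = max(F[i - 1][j - 1] + s, F[i - 1][j] + gap, F[i][j - 1] + gap)
        aa, bb, i, j = [], [], n, m            # traceback
        while i > 0 or j > 0:
            if i > 0 and j > 0 and F[i][j] == F[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch):
                aa.append(a[i - 1]); bb.append(b[j - 1]); i -= 1; j -= 1
            elif i > 0 and F[i][j] == F[i - 1][j] + gap:
                aa.append(a[i - 1]); bb.append("-"); i -= 1
            else:
                aa.append("-"); bb.append(b[j - 1]); j -= 1
        return aa[::-1], bb[::-1]

    def similarity_c(aligned_p, aligned_v, m):
        # c = matched nodes * 100 / m, as in the worked example below.
        matched = sum(1 for x, y in zip(aligned_p, aligned_v) if x == y)
        return 100.0 * matched / m

    # The example from the text: 10 nodes of P against 10 nodes of Z0mbie III.
    p = list("NAHAEKKKKA")
    v = list("NAHEKKKKAA")
    ap, av = align(p, v)
    print(similarity_c(ap, av, 10))    # -> 90.0

With the threshold T = 70 used in the example, a score of 90 would classify P as infected by a variant of Z0mbie III, subject to the matrix comparison in steps 1-4 above.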

To illustrate the suggested similarity measure function, assume that we would like to check whether the program P is infected by the virus Z0mbie III or not. Assume that the threshold is T = 70 and m = 10 (note that, to reduce false positives, we must increase the threshold and the number of processed nodes). The first 10 nodes extracted from P are

N A H A E K K K K A

and ALCFG_S is the corresponding 10 × 10 matrix (ones on the upper minor diagonal plus the entries induced by the jumps). Using algorithm 6, the nodes of P are aligned with the nodes of Z0mbie III as follows:

N A H A E K K K K A -
N A H - E K K K K A A

c = number of matched nodes * 100 / total number of nodes = 9 * 100 / 10 = 90 > T.

The mismatches occur with gaps; therefore column 4 and row 4 must be deleted from ALCFG_S, and column 10 and row 10 must be deleted from ALCFG_V. Since the matrices after deletion are identical, we conclude that the program P is infected by a modified version of Z0mbie III, and the similarity of (ALCFG_S, ALCFG_V) is 90%.

    5. IMPLEMENTATION

The metamorphic viruses are taken from the VX Heavens search engine and generated by two tools, namely the Next Generation Virus Creation Kit (NGVCK0.30) and the Virus Creation Lab for Windows 32 (VCL32) [11]. Since the output of the kits was already in asm format, we used Turbo Assembler (TASM 5.0) for compiling and linking the files to generate executables, which were later disassembled using IDA Pro 4.9 Freeware Version. Algorithm 4 is implemented using MATLAB 7.0. NGVCK0.30 has an advanced assembly source-morphing engine, and all variants of the viruses generated by NGVCK have the same functionality but different signatures. In this study, 100 metamorphic viruses were generated using NGVCK: 40 viruses were used for analysis and 60 for testing; let us call the first group A1 and the second group T1. After applying the suggested procedures to A1, we note that all the viruses in A1 have just seven different skeleton signatures when T = 100 and m = 20, four different skeletons when T = 80 and m = 20, and three different skeletons when T = 70 and m = 20. The T1 group was tested using 7 antivirus products; the results were obtained using the on-line service [12]. 100% of the generated viruses are recognized by the proposed method and by McAfee, but none of the viruses are detected by the remaining products. Another 100 viruses were generated using VCL32, all of them obfuscated manually by inserting dead code, transposing the code, reassigning the registers and substituting instructions. The generated viruses were divided into two groups, A2 and T2: A2 contains 40 viruses for analysis and T2 contains 60 viruses for testing. Again, 100% of the generated viruses are detected by the proposed method, 84% are detected by Norman, 23% by McAfee and 0% by the remaining products. Figure 5 shows the average detection percentage for the metamorphic viruses in T1 and T2.

    6. CONCLUSION

Antivirus software tries to detect viruses using various static and dynamic methods; however, all the existing methods are inadequate. To develop new, reliable antivirus software, some problems must be fixed. This paper suggested new procedures to detect metamorphic viruses by using arbitrary length of control flow graphs and node alignment. The suspected files are disassembled, the opcodes are encoded, the control flow is analyzed, and the


similarity of the matrices is measured using a new similarity measurement. The implementation of the suggested approach shows that all the generated metamorphic viruses can be detected, while less than 62% are detected by other well-known antivirus software.

Figure 5: The average percentage of detected viruses from groups T1 and T2, for Microsoft, Kaspersky, Symantec, McAfee, ClamAV, Norman, AVG and the proposed method.

    REFERENCES

[1] M. Christodorescu, J. Kinder, S. Jha, S. Katzenbeisser, and H. Veith: Malware Normalization, Technical Report #1539, Department of Computer Sciences, University of Wisconsin, Madison, (2005).

[2] F. Perriot: Striking Similarities: Win32/Simile and Metamorphic Virus Code, Symantec Corporation, (2003).

[3] E. Konstantinou: Metamorphic Virus: Analysis and Detection, Technical Report RHUL-MA-2008-02, Department of Mathematics, Royal Holloway, University of London, (2008).

[4] A. Lakhotia, A. Kapoor, and E. U. Kumar: Are metamorphic computer viruses really invisible?, part 1, Virus Bulletin, pp. 5-7, (2004).

[5] P. Szor: The Art of Computer Virus Research and Defense, Addison Wesley Professional, 1st edition, pp. 10-33, (2005).

[6] R. Ando, N. A. Quynh, and Y. Takefuji: Resolution based metamorphic computer virus detection using redundancy control strategy, In WSEAS Conference, Tenerife, Canary Islands, Spain, Dec., pp. 16-18, (2005).

[7] R. G. Finones and R. T. Fernandez: Solving the metamorphic puzzle, Virus Bulletin, pp. 14-19, (2006).

[8] M. R. Chouchane and A. Lakhotia: Using engine signature to detect metamorphic malware, In WORM '06: Proceedings of the 4th ACM Workshop on Recurring Malcode, New York, NY, USA, pp. 73-78, (2006).

[9] D. Bruschi, L. Martignoni, and M. Monga: Detecting self-mutating malware using control flow graph matching, In DIMVA, pp. 129-143, (2006).

[10] W. Wong and M. Stamp: Hunting for metamorphic engines, Journal in Computer Virology, Vol. 2 (3), pp. 211-229, (2006).

[11] http://vx.netlux.org/ last accessed March (2009).

[12] http://www.virustotal.com/ last accessed March (2009).


the reliability of each component or adding redundant components [8]. Of course, the second method is more expensive than the first; our paper considers the first method. The aim of this paper is to obtain the optimal system reliability design under the following constraints:

1: A basic linear cost-reliability relation is used for each component [7].

2: The criticality of components [9]. The designer should take this into account before building a reliable system: according to component criticality, reliability increases are directed toward the most critical components. A component's criticality can be derived from the effect of its failure on system failure, in which the position of the component plays an important role; we call this the index of criticality.

    2 SYSTEM RELIABILITY PROBLEM

2.1 Literature review

Many methods have been reported to improve system reliability. Tillman, Hwang, and Kuo [10] provide a survey of optimal system reliability. They divided optimal system reliability models into series, parallel, series-parallel, parallel-series, standby, and complex classes, and they categorized the optimization methods into integer programming, dynamic programming, linear programming, geometric programming, generalized Lagrangian functions, and heuristic approaches. The authors concluded that many algorithms have been proposed but only a few have been demonstrated to be effective when applied to large-scale nonlinear programming problems, and none has proven to be generally superior.

Fyffe, Hines, and Lee [11] provide a dynamic programming algorithm for solving the system reliability allocation problem. As the number of constraints in a given reliability problem increases, the computation required to solve the problem increases exponentially. To overcome these computational difficulties, the authors introduce a Lagrange multiplier to reduce the dimensionality of the problem. To illustrate their computational procedure, they use a hypothetical system reliability allocation problem consisting of fourteen functional units connected in series. While their formulation provides a selection of components, the search space is restricted to solutions in which the same component type is used in parallel. Nakagawa and Miyazaki [12] proposed a more efficient algorithm in which surrogate constraints are obtained by combining multiple constraints into one. To demonstrate the efficiency of their algorithm, they also solved 33 variations of the Fyffe problem; their algorithm produced optimal solutions for 30 of them. Misra and Sharma [13] presented a simple and efficient technique for solving integer-programming problems such as the system reliability design problem; the algorithm is based on function evaluations and a search limited to the boundary of resources.

In the nonlinear programming approach, Hwang, Tillman, and Kuo [14] use the generalized Lagrangian function method and the generalized reduced gradient method to solve nonlinear optimization problems for the reliability of a complex system. They first maximize complex-system reliability with a tangent cost function and then minimize the cost subject to a minimum system reliability. The same authors also present a mixed integer programming approach to the reliability problem [15], maximizing system reliability as a function of the component reliability level and the number of components at each stage.

Using a genetic algorithm (GA) approach, Coit and Smith [16], [17], [18] provide a competitive and robust algorithm for the system reliability problem. The authors use a penalty-guided algorithm which searches over feasible and infeasible regions to identify a final feasible optimal, or near-optimal, solution. The penalty function is adaptive and responds to the search history. The GA performs very well on two types of problems: redundancy allocation as originally proposed by Fyffe et al., and randomly generated problems with more complex configurations. For a fixed design configuration and known incremental decreases in component failure rates and their associated costs, Painton and Campbell [19] also used a GA-based algorithm to find a maximum-reliability solution satisfying specific cost constraints; they formulate a flexible algorithm to optimize the 5th percentile of the mean time-between-failure distribution.

In this paper, ant colony optimization is modified and adapted: the measure of criticality guides the ants toward the nest, and the ranking of critical components is taken into consideration when choosing the most reliable components, which are then improved until the optimal reliability values of the system components are reached.

2.2 Ant colony optimization approach
The ant colony optimization (ACO) algorithm [20, 21], which imitates the foraging behavior of real ants, is a cooperative population-based search algorithm. While traveling, ants deposit an amount of pheromone (a chemical substance). When other ants find pheromone trails, they decide to follow the trail with more pheromone, and while following a specific trail their own pheromone reinforces it. The continuous deposit of pheromone on a trail therefore increases the probability that the next ants select that trail. Moreover, ants that use short paths to the food source return to the nest sooner and therefore quickly mark their paths twice, before other ants return. As more ants complete shorter paths, pheromone accumulates faster on shorter paths, and longer paths are less reinforced. Pheromone evaporation is the process of decreasing the intensities of pheromone trails over time. It is used to avoid local convergence (the strong influence of old pheromone is avoided to prevent premature solution stagnation), to explore more of the search space, and to decrease the probability of using longer paths. Because ACO has been proposed to solve many optimization problems [22], [23], our idea is to adapt this algorithm to optimize system reliability, and especially complex systems.
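As a concrete illustration of the mechanism just described (probabilistic trail choice followed by deposit and evaporation), a minimal sketch is given below; the parameter values and function names are illustrative only and are not those of the adapted algorithm developed in the following sections.

import random

def choose_path(tau, eta, alpha=1.0, beta=2.0):
    # Pick a path index with probability proportional to tau^alpha * eta^beta.
    weights = [(t ** alpha) * (h ** beta) for t, h in zip(tau, eta)]
    r, acc = random.uniform(0, sum(weights)), 0.0
    for i, w in enumerate(weights):
        acc += w
        if r <= acc:
            return i
    return len(weights) - 1

def update_pheromone(tau, chosen, deposit=1.0, rho=0.9):
    # Evaporate every trail, then reinforce the chosen one.
    return [rho * t + (deposit if i == chosen else 0.0) for i, t in enumerate(tau)]

tau = [1.0, 1.0, 1.0]   # initial trail intensities
eta = [0.5, 0.8, 0.3]   # problem-specific heuristic, e.g. inverse path length
for _ in range(100):
    tau = update_pheromone(tau, choose_path(tau, eta))
print(tau)              # the trail with the best heuristic accumulates the most pheromone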

    3 METHODOLOGY

3.1 Problem definition

3.1.1 Notation
In this section, we define all parameters used in our model.
Rs : reliability of the system
Pi : reliability of component i
qi : probability of failure of component i
Qn : probability of failure of the system
n : total number of components
ICRi : index of criticality measure of component i
ICRp : index of criticality of a path to the destination
ISTi : index of structure measure of component i
Ct : total cost of components
Ci : cost of component i
Cc : cost of improvement
P(i)min : minimum accepted reliability value
ACO : start node for an ant, from which the next node is chosen
tau_i : initial pheromone trail intensity
tau_i(old) : pheromone trail intensity of a combination before the update
tau_i(new) : pheromone trail intensity of a combination after the update
eta_ij : problem-specific heuristic of a combination
alpha : relative importance of the pheromone trail intensity
beta : relative importance of the problem-specific heuristic for the global solution
AC : set of component choices (indexed during selection)
rho : trail persistence for the local solution
sigma : number of best solutions chosen for the offline pheromone update

3.1.2 Assumptions
In this section, we present the assumptions under which our model is formulated.

1: There are many different methods for deriving the expression of the total reliability of a complex system for a given system topology; we state our system expressions according to the methods of papers [3-5].

2: We use a cost-reliability curve [7] to derive an equation expressing the cost of each component according to its reliability; the total system cost is then additive in terms of the costs of the constituent components. See Fig. 1.

    Figure 1: cost-reliability curve

As shown in Fig. 1, by equating the slopes of the two triangles we can derive equation (1):
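The printed form of Eq. (1) is not legible in this copy. Assuming the linear cost-reliability relation of [7], with Fig. 1 plotting component cost (between Ci and Ct) against component reliability (between Pi min and 1), equating the two triangle slopes suggests a reading of the form:

C_c \;=\; \sum_{i=1}^{n} (C_t - C_i)\,\frac{p_i - p_{i,\min}}{1 - p_{i,\min}}, \qquad p_{i,\min} \le p_i \le 1 \qquad (1)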

3: Following [9], the index of criticality ICRi and the structural measure ISTi of each component are calculated from Eqs. (2) and (3).
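Eqs. (2) and (3) are likewise not legible here. Assuming the usual structural (Birnbaum) and criticality importance measures, consistent with the notation of Section 3.1.1, they would read:

ICR_i \;=\; IST_i \cdot \frac{q_i}{Q_n} \qquad (2)

IST_i \;=\; \frac{\partial R_s}{\partial p_i} \qquad (3)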

4: Every ICRi must be lower than an initial value ai. This value is the minimum accepted level of the criticality measure for each component.

5: After the complex system is represented mathematically, a set of paths from the specified source to the destination becomes available. Each of those paths is ranked according to the criticalities of its components.

3.2 Formulation of the problem
The objective function, in general, has the form:
Maximize Rs = f(P1, P2, P3, ..., Pn),
subject to the following constraints:
1. The criticality constraint on ICRi, i = 1, 2, ..., n.
2. The total cost of the components must not exceed the proposed cost value; this is expressed by Eq. (4), with Pi(min) > 0.
Note that this set of constraints permits only positive component costs.
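The body of Eq. (4) is not legible either; assuming it simply bounds the additive component costs by the proposed total cost, a plausible reading is:

\sum_{i=1}^{n} C_i(p_i) \;\le\; C_t, \qquad p_i \ge p_{i,\min} > 0 \qquad (4)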



    4 MODEL CONSTRUCTION

The algorithm uses an ACO technique together with the criticality approach to ensure global convergence from any starting point. The algorithm is iterative; at each iteration, the set of ants is identified using some indicator matrices. The main steps of the proposed model, illustrated in Fig. 2, are as follows:

1. The ant colony parameters are initialized.

2. The criticality of each component is calculated according to the derived reliability equation, and the components are then ranked according to these values.

3. Using the ant equation, Eq. (5), the probability of choosing the next node is estimated after a random number is generated, and this is repeated until the destination node is reached. The nodes along the path are thus selected according to the criticality of the components on that path.
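Equation (5) is not reproduced legibly in this copy. The standard ACO transition rule, which the notation of Section 3.1.1 (tau, eta, alpha, beta) suggests, has the form:

P_{ij} \;=\; \frac{[\tau_{ij}]^{\alpha}\,[\eta_{ij}]^{\beta}}{\sum_{k \in \text{allowed}} [\tau_{ik}]^{\alpha}\,[\eta_{ik}]^{\beta}} \qquad (5)

where the sum runs over the nodes still allowed from node i; whether the adapted algorithm uses exactly this form is an assumption.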

Figure 2: Flow diagram of the adapted ant system

4. Eq. (6): the pheromone is updated according to the criticality measure, which can be calculated as the product of the component criticality values; the update equation then becomes Eq. (7).

5. New reliabilities are generated.

6. Steps 3-5 are repeated until the best solution is reached and all ants have moved, achieving the maximum reliability of the system at minimum cost.
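Equations (6) and (7) are also not legible here. Taking the trail-persistence notation of Section 3.1.1 and the statement above that the deposit is the product of the component criticality values on the selected path, a plausible reading is:

\tau_i(\text{new}) \;=\; \rho\,\tau_i(\text{old}) + \Delta\tau_i \qquad (6)

\Delta\tau_i \;=\; \prod_{j \in \text{selected path}} ICR_j, \qquad \text{so that} \qquad \tau_i(\text{new}) \;=\; \rho\,\tau_i(\text{old}) + \prod_{j \in \text{selected path}} ICR_j \qquad (7)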

5 EXPERIMENTAL RESULTS

In the following examples, we use benchmark system configurations such as the bridge and the delta.

    5.1 Bridge problem:

    Figure 3: Bridge system

To find the reliability polynomial of a complex system, we consider a signal that has to be transmitted at a certain time from the source (S) to the destination (D); see Fig. 3. The objective function to be maximized has the form:
Rs = 1 - (q1 + q4.q5.p1 + q3.q4.p1.p5 + q2.q4.p1.p5.p3)

Subject to:
1. The cost constraint of Eq. (4) over the five components: sum from i = 1 to 5 of Ci(pi) must not exceed 45.
2. The ICRi constraint, with ICRi calculated for i = 1, 2, ..., 5.

- We use the values in Fig. 3 as the initial values of the component reliabilities to be improved:
P(1)min = 0.9, P(2)min = 0.9, P(3)min = 0.8, P(4)min = 0.7, P(5)min = 0.8.

3. We choose the cost-reliability curve to distribute the cost according to the ranking of the components by their criticality. The model was built so as to reduce the failure of the most critical components; this is done by increasing the reliability of the most critical components, which tends to maximize the overall reliability, which is our goal. The results are summarized in Table 1 and Table 2; a numerical check of the starting point is sketched below.
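As a quick numerical check (Fig. 3 places components 2-3 and 1-4 on the two branches, with component 5 bridging them between source S and destination D), the sketch below simply evaluates the objective function given above at the starting reliabilities:

p = {1: 0.9, 2: 0.9, 3: 0.8, 4: 0.7, 5: 0.8}   # initial reliabilities from Fig. 3
q = {i: 1.0 - pi for i, pi in p.items()}        # component failure probabilities

# Rs = 1 - (q1 + q4.q5.p1 + q3.q4.p1.p5 + q2.q4.p1.p5.p3)
Rs = 1.0 - (q[1]
            + q[4] * q[5] * p[1]
            + q[3] * q[4] * p[1] * p[5]
            + q[2] * q[4] * p[1] * p[5] * p[3])
print(round(Rs, 5))   # 0.78552 at the starting point

Any improvement step that raises the reliabilities of the most critical components should push Rs above this starting value at the lowest added cost.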



5. The ant colony algorithm is improved by prior knowledge given by the index of criticality, which guides the ants' pheromone deposits so that the probability of the next ants selecting that trail is maximized; moreover, the ants use the more reliable paths. Our numerical experiments show that our approach is promising, especially for complex systems.

    7 REFERENCES

[1] A. Lisnianski, H. Ben-Haim, and D. Elmakis: Multi-state System Reliability Optimization: an Application, in the book by G. Levitin, USA, pp. 1-20, ISBN 9812383069 (2004).

[2] S. Krishnamurthy, A. P. Mathur: On the estimation of reliability of a software system using reliabilities of its components, in Proceedings of the Ninth International Symposium on Software Reliability Engineering (ISSRE 97), Albuquerque, p. 146 (1997).

[3] T. Coyle, R. G. Arno, P. S. Hale: Application of the minimal cut set reliability analysis methodology to the gold book standard network, in the Industrial and Commercial Power Systems Technical Conference, pp. 82-93 (2002).

[4] K. Fant, S. Brandt: Null convention logic, a complete and consistent logic for asynchronous digital circuit synthesis, in the International Conference on Application Specific Systems, Architectures, and Processors (ASAP 96), pp. 261-273 (1996).

[5] C. Gopal, H. Nader: A new approach to system reliability, IEEE Transactions on Reliability, vol. 50, no. 1, pp. 75-84 (2001).

[6] Y. Chen, Z. Hongshi: Bounds on the Reliability of Systems With Unreliable Nodes & Components, IEEE Transactions on Reliability, vol. 53, no. 2, June (2004).

[7] B. A. Ayyoub: An application of reliability engineering in computer networks communication, AAST and MT thesis, p. 17, Sep. (1999).

[8] S. Magdy, R. Schinzinger: On Measures of Computer Systems Reliability and Critical Components, IEEE Transactions on Reliability (1988).

[9] ElAlem: An Application of Reliability Engineering in Complex Computer System and Its Solution Using Trust Region Method, WSES, Software and Hardware Engineering for the 21st Century (book), p. 261 (1999).

[10] F. A. Tillman, C. L. Hwang, W. Kuo: Optimization Techniques for System Reliability with Redundancy, A Review, IEEE Transactions on Reliability, vol. R-26, no. 3, pp. 148-155, August (1977).

[11] D. E. Fyffe, W. W. Hines, N. K. Lee: System Reliability Allocation and a Computational Algorithm, IEEE Transactions on Reliability, vol. R-17, no. 2, pp. 64-69, June (1968).

[12] Y. Nakagawa, S. Miyazaki: Surrogate Constraints Algorithm for Reliability Optimization Problems with Two Constraints, IEEE Transactions on Reliability, vol. R-30, no. 2, pp. 175-180, June (1981).

[13] K. B. Misra, U. Sharma: An Efficient Algorithm to Solve Integer-Programming Problems Arising in System-Reliability Design, IEEE Transactions on Reliability, vol. 40, no. 1, pp. 81-91, April (1991).

[14] C. L. Hwang, F. A. Tillman, W. Kuo: Reliability Optimization by Generalized Lagrangian-Function and Reduced-Gradient Methods, IEEE Transactions on Reliability, vol. R-28, no. 4, pp. 316-319, October (1979).

[15] F. A. Tillman, C. L. Hwang, W. Kuo: Determining Component Reliability and Redundancy for Optimum System Reliability, IEEE Transactions on Reliability, vol. R-26, no. 3, pp. 162-165, August (1977).

[16] D. W. Coit, A. E. Smith: Reliability Optimization of Series-Parallel Systems Using a Genetic Algorithm, IEEE Transactions on Reliability, vol. 45, no. 2, pp. 254-260, June (1996).

[17] D. W. Coit, A. E. Smith: Penalty Guided Genetic Search for Reliability Design Optimization, Computers and Industrial Engineering, vol. 30, no. 4, pp. 895-904 (1996).

[18] D. W. Coit, A. E. Smith, D. M. Tate: Adaptive Penalty Methods for Genetic Optimization of Constrained Combinatorial Problems, INFORMS Journal on Computing, vol. 8, no. 2, Spring, pp. 173-182 (1996).

[19] L. Painton, J. Campbell: Genetic Algorithms in Optimization of System Reliability, IEEE Transactions on Reliability, vol. 44, no. 2, pp. 172-178, June (1995).

[20] N. Demirel, M. Toksarı: Optimization of the quadratic assignment problem using an ant colony algorithm, Applied Mathematics and Computation, vol. 183 (2007).

[21] Y. Feng, L. Yu, G. Zhang: Ant colony pattern search algorithms for unconstrained and bound constrained optimization, Applied Mathematics and Computation, vol. 191, pp. 42-56 (2007).

[22] M. Dorigo, L. M. Gambardella: Ant Colony System: A Cooperative Learning Approach to the Travelling Salesman Problem, IEEE Transactions on Evolutionary Computation, vol. 1, no. 1, pp. 53-66, April (1997).

[23] B. Bullnheimer, R. F. Hartl, C. Strauss: Applying the Ant System to the Vehicle Routing Problem, 2nd Metaheuristics International Conference (MIC-97), Sophia-Antipolis, France, pp. 21-24, July (1997).


    A COMPREHENSIVE QUALITY EVALUATION

    SYSTEM FOR PACS

Dinu Dragan, Dragan Ivetic
Department for Computing and Automatics, Republic of Serbia
[email protected], [email protected]

    ABSTRACT

The imposing number of lossy compression techniques used in medicine represents a challenge for the developers of a Picture Archiving and Communication System (PACS). How should an appropriate lossy medical image compression technique be chosen for PACS? The question is no longer whether to compress medical images in a lossless or lossy way, but rather which type of lossy compression to use. The number of quality evaluations and criteria used to evaluate a lossy compression technique is enormous. The mainstream quality evaluations and criteria can be broadly divided into two categories, objective and subjective; they evaluate the presentation (display) quality of a lossy compressed medical image. There are also a few quality evaluations which measure the technical characteristics of a lossy compression technique. In our opinion, technical evaluations represent an independent and invaluable category of quality evaluations. The conclusion is that the quality evaluations from each category measure only one quality aspect of a medical image compression technique. Therefore, it is necessary to apply a representative (or representatives) of each group to acquire a complete evaluation of a lossy medical image compression technique for a PACS. Furthermore, a correlation function between the quality evaluation categories would simplify the overall evaluation of compression techniques. This would enable the use of medical images of the highest quality while engaging optimal processing, storage, and presentation resources. The paper represents preliminary work, an introduction to future research aiming at developing a comprehensive quality evaluation system.

    Keywords: medical image quality metrics, medical image compression, PACS

1 INTRODUCTION
A Picture Archiving and Communication System (PACS) represents an integral part of a modern hospital. It enables communication, storage, processing, and presentation of digital medical images and the corresponding data [1]. Digital medical images tend to occupy an enormous amount of storage space [2, 3]; the complete annual volume of medical images in a modern hospital easily reaches hundreds of petabytes and is still on the rise [4]. The increased demand for digital medical images introduced still-image compression to medical imaging [5], which relaxes the storage and network requirements of a PACS and reduces the overall cost of the system [3].

In general, all compressed medical images can be placed in two groups: lossless and lossy. The first group is more appealing to physicians because decompression restores the image completely, without data loss. However, it achieves modest results, with a maximum compression ratio of 3:1 [6, 7, 8]. Several studies [9, 10] showed that this is not sufficient for PACS and that a compression ratio of at least 10:1 has to be achieved.

The second group of compression techniques achieves greater compression ratios, but with data distortion in the restored image [6, 7, 8]. Lossy compression provoked serious doubts and opposition from medical staff. The opposition arose from the fact that the loss of data can influence medical image interpretation and can lead to serious errors in the treatment of a patient. Therefore, the main research area for lossy compression of medical images is finding the greatest compression ratio that still maintains diagnostically important information. The degree of lossy compression of medical images which maintains no visual distortion under normal medical viewing conditions is called visually lossless compression [10]. Several studies [8, 11, 12] and standards [13] proved the clinical acceptability of lossy compression of medical images as long as the modality of the image, the nature of the imaged pathology, and the image anatomy are taken into account during lossy compression. The medical organization involved has to approve and adopt the lossy compression of medical images applied in its PACS. Therefore, it is necessary to provide a


quality evaluation of different compression techniques from the PACS point of view.

During our work on a PACS for a lung hospital, we tried to adopt an image compression for medical images which achieves the highest compression ratio with minimal distortion in the decompressed image. We also needed an image compression suitable for telemedicine purposes. We consulted the technical studies in search of quality evaluations of image compression techniques. The sheer number of studies is overwhelming [14, 15]. There is no unique quality evaluation which is suitable for the various compression techniques and the different applications of image compression [16, 17]. In most cases the studies focus only on the presentation (display) quality of the lossy compressed medical image; the technical features of a compression technique are usually ignored.

This paper represents preliminary research. Its purpose is to identify all the elements needed to evaluate the quality of a compression technique for PACS. We identified three categories of quality evaluations and criteria: presentation-objective, presentation-subjective, and technical-objective. An overview of the technical studies led us to the conclusion that the quality evaluations from each category measure only one quality aspect of an image compression technique. To perform a complete evaluation of a medical image compression technique for PACS, it is necessary to apply a representative of each category. A correlation function between the representatives of each category would simplify the overall evaluation of compression techniques. The 3D evaluation space introduced by this paper is a 3D space defined by this correlation function and the quality evaluations used. Our goal is to develop an evaluation tool based on the 3D evaluation space, which is expected in 2011. All the elements of the quality evaluation system are identified in the paper.

The organization of the paper is as follows: Section 2 gives a short overview of the lossy compression techniques used in the medical domain; Section 3 describes the quality evaluations used to measure the quality of compression techniques; the 3D evaluation space is discussed in Section 4; Section 5 concludes the paper.

2 LOSSY COMPRESSION OF MEDICAL IMAGES

Over the past decades an imposing number of lossy compression techniques have been tested and used in the medical domain. Industry-approved standards have been used as often as proprietary compressions. Based on the part of the image affected, they can be categorized in two groups:
1. the medical image regions of interest (ROI) are compressed losslessly while the rest of the image, the background, is compressed lossily,
2. the entire medical image is compressed lossily, targeting the visually lossless threshold.

The first group offers selective lossy compression of medical images. Parts of the image containing diagnostically crucial information (ROI) are compressed in a lossless way, whereas the rest of the image, containing unimportant data, is compressed lossily. This approach enables a considerably higher compression ratio than ordinary lossy compression [18, 19], because larger regions of the medical image contain unimportant data which can be compressed at higher rates [19]. The downfall of this approach is computational complexity (an element of technical-objective evaluation). Each ROI has to be marked before compression, and even for images of the same modality, ROIs are rarely in the same place. ROIs are identified either manually by a qualified medical specialist or automatically, based on a region-detection algorithm [20]. The goal is to find a perfect combination of an automated ROI detection algorithm and a selective compression technique.
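Purely as an illustration of the selective idea (lossless ROI, coarser treatment elsewhere), independent of any particular codec or ROI-detection algorithm, a minimal sketch might look as follows; the quantization step and the rectangular ROI are arbitrary assumptions.

import numpy as np

def selective_quantize(image, roi_mask, step=16):
    # Keep ROI pixels exact; re-quantize the background coarsely (lossy).
    background = (image // step) * step
    return np.where(roi_mask, image, background)

img = np.random.randint(0, 256, (64, 64), dtype=np.uint8)   # toy 8-bit image
mask = np.zeros_like(img, dtype=bool)
mask[16:48, 16:48] = True                                    # hypothetical ROI
out = selective_quantize(img, mask)
assert np.array_equal(out[mask], img[mask])                  # the ROI is preserved exactly

A real system would follow the coarse background with an entropy or transform coder; the point here is only that the diagnostically important region loses nothing.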

Over the years, various solutions for ROI compression of medical images have emerged which differ in the image modalities used, the ROI definitions, the coding schemes, and the compression goals [20]. Some of them are: a ROI-based compression technique with two multi-resolution coding schemes reported by Strom [19], a block-based JPEG ROI compression and an importance coding scheme based on wavelets reported by Bruckmann [18], a motion-compensated ROI coding for colon CT images reported by Bokturk [21], a region-based discrete wavelet transform reported by Penedo [22], and a JPEG2000 ROI coding reported by Anastassopoulos [23].

The second group of lossy compression techniques applies lossy compression over the entire medical image. Considerable efforts have been made in finding and applying the visually lossless threshold. Over the years, various solutions have emerged which differ in the goals imposed on the compression technique (for a particular medical modality or for a group of modalities) and in the compression techniques used (industry standards or proprietary compression techniques).

Some of the solutions presented over the years are: compression using predictive pruned tree-structured vector quantization reported by Cosman [17], a wavelet coder based on Set Partitioning in Hierarchical Trees (SPIHT) reported by Lu [24], a wavelet coder exploiting the Human Visual System reported by Kai [25], a JPEG coder and wavelet-based trellis-coded quantization (WTCQ) reported by Slone [10], and a JPEG2000 coder reported by Bilgin [26].

Although substantial effort has been made to develop selective lossy compression of medical images, the industry standards that apply lossy compression to the entire medical image are the ones commonly used in PACS.


diagnostically acceptable [32]. Other studies used qualified observers to interpret reconstructed medical images compressed at various levels; the compression levels at which the results were the same as for the original image were rated as acceptable [5]. Also, some studies used qualified technicians to define a just-noticeable difference used to select the point at which a compression level is no longer diagnostically usable. The observers were presented with a series of images, each compressed at a higher level, and simply had to define the point at which the changes became obvious. These studies were based on the presumption that one can perceive changes in the image long before the image is degraded enough to lose its diagnostic value [5].

When subjectively evaluating medical images, it is not sufficient to say that an image looks good. It should be