Annotation of Metagenomes: from Jigsaws to Taxa

  • 8/2/2019 Annotation of Metagenomes: from Jigsaws to Taxa

    1/52

    Introduction to Metagenomics | Taxonomic Annotation | MTR description | Results | Conclusion

    Annotation of Metagenomes: from Jigsaws to Taxa

    Fabio Gori (1), Gianluigi Folino (2), Mike S. M. Jetten (3), Elena Marchiori (1)

    (1) Radboud University Nijmegen, Institute for Computing and Information Sciences, The Netherlands
    (2) ICAR-CNR, Rende, Italy
    (3) Radboud University Nijmegen, Department of Microbiology, The Netherlands

    Nijmegen, 10 September 2010

    gori@science.ru.nl

    Table of Contents

    Introduction to Metagenomics
    Taxonomic Annotation
    MTR description
    Results
    Conclusion

    What is Metagenomics?

    Metagenomics: the study of microbial communities by analysing their genetic material

    Why?

    Most microbes cannot be cultured
    To understand the interactions between organisms

    How? DNA Sequencing Technology

    [Figure labels: Environmental Sample; DNAs; Small-Insert Library; Cloning]

    A metagenomic dataset is made of words over the 4-letter alphabet {A, C, G, T}

    What kind of data? A meta... jigsaw puzzle

    Fragments of DNAs
    Pieces are similar
    Original pictures are unknown
    Missing pieces

    Assembly of metagenomic fragments is often not possible

    [Figure panels: (a) Low-diversity communities; (b) High-diversity communities]

    So... What can Metagenomics do?

    Examples of metagenomics studies

    Study of the functional content of a metagenome
    [S. Demaneche et al., Journal of Microbiological Methods, 2009]

    Study of the taxonomic content of a metagenome
    [J. F. Biddle et al., PNAS, 2008]

    Taxonomy: a biological classification

    Linnean taxonomy:
    Formal system for classifying and naming living things
    Based on a simple hierarchical structure
    Similar elements are grouped together

    Rank: level in the hierarchy (left)
    Taxon: unit of the hierarchy (group of similar living things)

    A simplified branch of the taxonomy

    The structure of the taxonomy can be represented as a tree:
    Nodes are taxa
    Leaves are species
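The tree view of the taxonomy can be sketched with parent pointers. This is only an illustration: the `TAXONOMY` dictionary, its handful of taxa, and the helper names `lineage` and `leaves` are assumptions made for the example and are not taken from the slides.

```python
# Toy taxonomy: taxon name -> (parent taxon, rank). Illustrative entries only.
TAXONOMY = {
    "Bacteria": (None, "kingdom"),
    "Firmicutes": ("Bacteria", "phylum"),
    "Bacilli": ("Firmicutes", "class"),
    "Bacillales": ("Bacilli", "order"),
    "Bacillaceae": ("Bacillales", "family"),
    "Bacillus": ("Bacillaceae", "genus"),
    "Bacillus subtilis": ("Bacillus", "species"),
    "Bacillus cereus": ("Bacillus", "species"),
}


def lineage(taxon):
    """Return the path from the root down to `taxon` as (rank, name) pairs."""
    path = []
    while taxon is not None:
        parent, rank = TAXONOMY[taxon]
        path.append((rank, taxon))
        taxon = parent
    return list(reversed(path))


def leaves():
    """Return the taxa that are nobody's parent (here: the species)."""
    parents = {parent for parent, _ in TAXONOMY.values()}
    return sorted(name for name in TAXONOMY if name not in parents)


if __name__ == "__main__":
    for rank, name in lineage("Bacillus subtilis"):
        print(f"{rank:8s} {name}")
    print(leaves())
```

Every node is a taxon carrying a rank, and walking the parent pointers from a leaf recovers its classification at every rank.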

    Examples of taxonomic classifications

    The meta jigsaw puzzle

    Fragments of DNAs

    Annotation: discovering the original pictures of the puzzles

    Assign each fragment to an organism or taxon (of rank species, or genus, ...)

    Related works

    Composition-based methods: based on n-gram frequencies
    Similarity-based methods: based on comparison with reference sequences
      Organism-level classification: Best Hit (BH)
      Rank-level classification: Lowest Common Ancestor (LCA), used by MEGAN and Galaxy
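The difference between Best Hit and Lowest Common Ancestor can be seen on a toy example. This is a simplified sketch, not the MEGAN or Galaxy implementation: the hit list, the scores, and the function names are hypothetical, and real LCA tools also apply score thresholds before taking the common ancestor.

```python
def best_hit(hits):
    """Best Hit: return the lineage of the single highest-scoring match."""
    return max(hits, key=lambda hit: hit[0])[1]


def lowest_common_ancestor(hits):
    """LCA: return the longest lineage prefix shared by all matches."""
    lineages = [lineage for _, lineage in hits]
    common = []
    for level in zip(*lineages):
        if len(set(level)) != 1:
            break  # the matches disagree from this rank downwards
        common.append(level[0])
    return tuple(common)


# Hypothetical matches of one fragment: (bit score, lineage from kingdom to genus).
HITS = [
    (52.0, ("Bacteria", "Firmicutes", "Bacilli", "Bacillales", "Bacillaceae", "Bacillus")),
    (50.5, ("Bacteria", "Firmicutes", "Bacilli", "Lactobacillales", "Lactobacillaceae", "Lactobacillus")),
    (49.0, ("Bacteria", "Firmicutes", "Clostridia", "Clostridiales", "Clostridiaceae", "Clostridium")),
]

if __name__ == "__main__":
    print(best_hit(HITS))
    print(lowest_common_ancestor(HITS))
```

On these hits, Best Hit commits to the genus of the top match, while LCA stops at the phylum because the three matches already disagree at class level.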

    Compare / Align: BLAST software

    Local alignment (gaps allowed)
    Nucleotide and/or peptide sequences
    Alignment scores:
      Bit score ($S_B$) = alignment quality
      E-value
      P-value

    Similarity-based methods

    With very short fragments (100 bp)

    Method comparison       BH              LCA
    Classification level    Organism        Ranks (too high)
    Quantity of fragments   All             Filtered (low)
    Accuracy                Not very good   Quite good

    Goal: taxonomic annotation of short metagenomic fragments (rank-level) such that:
    Assign as many fragments as possible
    Assign with good accuracy
    Assign to their lowest feasible rank

    The method: compare and assign

    Multiple Taxonomic Rank based clustering (MTR)

    Goal: taxonomic annotation of short metagenomic fragments (rank-level)

    1. Align the fragments to reference sequences, selecting the good matches
    2. Assign each fragment to a taxon of one of its best-matching sequences

    NEW
    Assign from the highest rank to the lowest feasible rank
    Assignments of fragments depend on each other

    How can we align sequences? BLAST

    BlastX

    We select the good matches according to these indexes:
      $S_B(r, p)$ = quality of alignment
      E-value(r, p), P-value
      Position of p in the list of hits of r

    On a fixed rank, how do we assign the fragments?

    For each rank $j$:
      Cluster of fragments $C_i$ corresponds to taxon $t_i$ of rank $j$
      Cluster selection = assignment.

    Set Covering Problem

    $$\min_{I \subseteq \{1,\dots,n\}} |I| \quad \text{such that} \quad \bigcup_{i \in I} C_i = R. \tag{SCP}$$

    $R := \{r_1, \dots, r_m\}$, fragments that we want to assign
    $\mathcal{C}_j := \{C_1, \dots, C_n\}$, clusters of fragments ($C_i \subseteq R$)

    SCP is NP-hard; we used a fast greedy algorithm [Chvatal V, 1979]
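A minimal sketch of the classic greedy heuristic for set covering follows: repeatedly pick the cluster that covers the most still-uncovered fragments. It illustrates the general technique only; the function name, the tie-breaking rule, and the toy clusters are assumptions, and the authors' actual implementation may differ.

```python
def greedy_set_cover(fragments, clusters):
    """Select cluster names whose union covers every fragment.

    `clusters` maps a cluster (taxon) name to the set of fragments it contains.
    """
    uncovered = set(fragments)
    selected = []
    while uncovered:
        # Iterating over sorted names makes ties resolve to the first name.
        best = max(sorted(clusters), key=lambda name: len(clusters[name] & uncovered))
        gained = clusters[best] & uncovered
        if not gained:
            raise ValueError("the clusters do not cover every fragment")
        selected.append(best)
        uncovered -= gained
    return selected


# Hypothetical instance: six fragments and four candidate clusters.
R = {"r1", "r2", "r3", "r4", "r5", "r6"}
CLUSTERS = {
    "C1": {"r1", "r2", "r3"},
    "C2": {"r3", "r4"},
    "C3": {"r4", "r5", "r6"},
    "C4": {"r1", "r6"},
}

if __name__ == "__main__":
    print(greedy_set_cover(R, CLUSTERS))
```

The greedy choice gives no guarantee of a minimum cover in general, but it runs quickly and is a standard approximation for this NP-hard problem.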

    Set Covering Problem

    Cluster of fragments $C_i$ corresponds to taxon $t_i$ of rank $j$

    Select a collection of clusters (taxa) such that:
    No fragment is left outside
    The number of selected clusters is minimal

    Example: [Figure: clusters $C_1, \dots, C_6$ over fragments $r_1, \dots, r_{10}$]

    Clustering solution: [Figure: clusters selected among $C_1, \dots, C_6$, covering $r_1, \dots, r_{10}$]

    Algorithm scheme

    1. Compare fragments $R$ with reference proteins (NCBI-NR) using BlastX, and filter the BlastX output
    2. From the highest to the lowest rank ($j$ = kingdom, ..., species):
       1. Create the collection of potential clusters $\mathcal{C}_j$ for rank $j$ ($C_i \subseteq R$):
          $r \in C_i \iff r$ has a BlastX match with a protein of taxon $t_i$
       2. Select clusters from $\mathcal{C}_j$ via the Set Covering Problem
       3. Identify fragments with incoherent taxonomic classification (w.r.t. higher-rank classifications) and remove them, even from $R$
    3. From the lowest to the highest rank:
       1. For this rank, majority vote on the clusters' intersections
       2. Make higher-rank classifications coherent with the majority vote results
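Step 2.1 of the scheme, building the potential clusters for one rank, can be sketched as follows. The input layout (pairs of a fragment identifier and the lineage of a matched protein, stored as a rank-to-taxon dictionary) and the function name are assumptions made for illustration; the slides do not specify a data format.

```python
from collections import defaultdict


def potential_clusters(hits, rank):
    """Build the clusters for `rank`: a fragment belongs to the cluster of
    taxon t whenever one of its filtered matches is a protein of taxon t."""
    clusters = defaultdict(set)
    for fragment, lineage in hits:
        taxon = lineage.get(rank)
        if taxon is not None:  # skip matches with no taxon at this rank
            clusters[taxon].add(fragment)
    return dict(clusters)


# Hypothetical filtered BlastX matches: (fragment, lineage of the matched protein).
HITS = [
    ("r1", {"kingdom": "Bacteria", "phylum": "Firmicutes"}),
    ("r1", {"kingdom": "Bacteria", "phylum": "Proteobacteria"}),
    ("r2", {"kingdom": "Bacteria", "phylum": "Firmicutes"}),
    ("r3", {"kingdom": "Bacteria", "phylum": "Proteobacteria"}),
]

if __name__ == "__main__":
    for rank in ("kingdom", "phylum"):
        print(rank, potential_clusters(HITS, rank))
```

Here fragment `r1` lands in two phylum-level clusters, which is exactly the kind of overlap that the cluster selection and the later majority vote on intersections have to resolve.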

    Summary

    MTR
    Goal: taxonomic annotation of short metagenomic fragments
    Method:
      Comparison with reference sequences
      Coherent rank-by-rank clustering
      Majority vote on intersections

    Experiments - The Data

    9 simulated datasets, from 3 sets of organisms at 3 levels of coverage (0.1X, 1X and 4X)
    Length: 100 bp

    Dataset M1, artificially generated from 9 genome projects

    Organism                                     Genome size (bp)   Fragments sampled
    Clostridium phytofermentans ISDg             4533512            46379
    Prochlorococcus marinus NATL2A               1842899            18681
    Lactobacillus reuteri 100-23                 2174299            23710
    Caldicellulosiruptor saccharolyticus DSM 8903   2970275         29496
    Clostridium sp. OhILAs                       2997608            29348
    Herpetosiphon aurantiacus ATCC 23779         6605151            69387
    Bacillus weihenstephanensis KBAB4            5602503            45463
    Halothermothrix orenii H 168                 2578146            26980
    Clostridium cellulolyticum H10               3958683            39802

    [D. Dalevi et al., Bioinformatics, 2008]

    Protein database: NCBI-NR, non-redundant protein sequence database with entries from GenPept, Swissprot, PIR, PDF, PDB, and NCBI RefSeq

    Masking and Filtering

    To simulate the presence of UNKNOWN organisms:
    Remove all the alignments with species contained in the datasets

    Filtering BLAST parameters:
    First 50 hits
    $S_B > 30$
    E-value threshold: $10^{-6}$
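The three filtering parameters can be applied as in the sketch below. Several details are assumptions, since the slides only list the values: "first 50 hits" is read as the first 50 matches reported per fragment, the E-value threshold is treated as inclusive, and the tuple layout and function name are hypothetical.

```python
from collections import defaultdict

MAX_HITS = 50        # "First 50 hits" (assumed to mean per fragment, in report order)
MIN_BIT_SCORE = 30   # S_B > 30
MAX_EVALUE = 1e-6    # E-value threshold 10^-6 (assumed inclusive)


def filter_hits(hits):
    """Keep the matches that are among a fragment's first MAX_HITS hits
    and pass both the bit-score and the E-value thresholds.

    `hits` yields (fragment, protein, bit_score, evalue) in report order.
    """
    seen = defaultdict(int)
    kept = []
    for fragment, protein, bit_score, evalue in hits:
        seen[fragment] += 1
        if seen[fragment] > MAX_HITS:
            continue  # beyond the first 50 hits of this fragment
        if bit_score > MIN_BIT_SCORE and evalue <= MAX_EVALUE:
            kept.append((fragment, protein, bit_score, evalue))
    return kept


# Hypothetical matches for two fragments.
EXAMPLE = [
    ("r1", "p1", 45.0, 1e-8),   # kept
    ("r1", "p2", 28.0, 1e-9),   # bit score too low
    ("r1", "p3", 40.0, 1e-3),   # E-value too high
    ("r2", "p1", 31.0, 1e-6),   # kept (threshold treated as inclusive)
]

if __name__ == "__main__":
    for hit in filter_hits(EXAMPLE):
        print(hit)
```

Fragments whose matches are all rejected drop out entirely at this stage, which is why only a filtered subset of the dataset reaches the clustering steps.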

    Comparison with LCA

    Accuracy and number of fragments assigned (for each rank)

    Rank      MTR accuracy (# of fragments)   LCA accuracy (# of fragments)
    kingdom   100.00 (166948)                 99.99 (155263)
    phylum     99.86 (166948)                 99.93 (155258)
    class      99.73 (166936)                 99.81 (141829)
    order      97.67 (166148)                 98.14 (115732)
    family     97.62 (165231)                 98.04 (110488)
    genus      97.42 (140476)                 98.35 (110139)

    Table: Data name: M3, Coverage 4X, Ev = 0.000001, Tot fragments: 1,385,028

    Rank      MTR accuracy (# of fragments)   LCA accuracy (# of fragments)
    kingdom   95.07 (88537)                   94.66 (73176)
    phylum    93.21 (88537)                   92.57 (73169)
    class     89.25 (87635)                   88.98 (60294)
    order     89.24 (85657)                   88.44 (57373)
    family    77.35 (81366)                   81.84 (48760)
    genus     61.36 (77307)                   74.60 (40823)

    Table: Data name: M2, Coverage 1X, Ev = 0.000001, Tot fragments: 288,730
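One way to read the accuracy/quantity trade-off in these tables is to estimate how many fragments each method assigns correctly. The sketch below does this for the genus row of the M2 (1X) table. It assumes that the accuracy column is the percentage of assigned fragments that are correctly assigned, which the slides do not state explicitly; the function and variable names are illustrative.

```python
# Genus-level row of the M2 (1X) table: (accuracy in percent, fragments assigned).
M2_GENUS = {"MTR": (61.36, 77307), "LCA": (74.60, 40823)}


def correctly_assigned(accuracy_percent, assigned):
    """Estimate the number of correctly assigned fragments,
    assuming accuracy = correct / assigned * 100."""
    return accuracy_percent / 100.0 * assigned


if __name__ == "__main__":
    for method, (accuracy, assigned) in M2_GENUS.items():
        print(method, round(correctly_assigned(accuracy, assigned)))
```

Under that assumption MTR yields roughly 47,400 correct genus-level assignments against roughly 30,500 for LCA, even though its accuracy on this row is lower.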

    Population distributions: M2, coverage 0.1X, genus [figure]

    Conclusions

    Goal: taxonomic annotation of short metagenomic fragments
    Simulated metagenomes with 5-9 genomes
    MTR assigns more fragments than LCA (all ranks)
    Accuracy: less accurate than LCA, especially at low ranks
    Trade-off accuracy/quantity (convenient)
    Accuracy: MTR is very close to LCA, if the right taxa are in the database
    Population distribution of MTR tends to be better

    Future Developments

    Test other fragment lengths

    Goals
    1. Analyze ALL the fragments (not only the filtered ones)
    2. Increase accuracy

    Possible algorithm improvements
    Include composition-based measures
    For Goal 1:
      Change BLAST filtering
      A kind of reiteration of the method
    For Goals 1 & 2: change clustering method (Weighted SCP or Tree SCP instead of SCP)

    Thank YOU!

    Questions?

    gori@science.ru.nl

    Table: Real-life datasets: number of reads assigned up to a rank.

    Real Life   Saltern   Coral    Chicken
    MTR
    kingdom     1581      24522    111655
    phylum      1576      23027    111650
    class       1530      21920    109986
    order       1317      21019    108100
    family      1035      15583    100676
    genus        979      11422     94507
    species      937       9560     89818
    LCA
    kingdom     1217      21287     93416
    phylum      1208      16526     93399
    class       1051      12301     87917
    order        807       6841     87146
    family       691       5045     70376
    genus        635       4685     69636
    species      311       4340     29160


    Figure: Population distributions (rank genus) of coral dataset by MTR (left) and LCA (right)


    Table: Accuracy and number of assigned reads on M1 datasets

    M1         0.1x             1x               4x
    MTR
    kingdom    100.00 (5669)    99.93 (56348)    99.93 (173541)
    phylum      92.50 (5669)    92.59 (56325)    93.39 (173521)
    class       84.04 (5556)    85.44 (54341)    87.15 (167546)
    order       64.93 (5366)    66.23 (53395)    66.69 (163840)
    family      64.87 (4904)    63.67 (50587)    63.22 (154134)
    genus       63.66 (4628)    62.58 (48244)    60.50 (144475)
    LCA
    kingdom    100.00 (4145)    99.92 (42620)    99.91 (132130)
    phylum      95.08 (4145)    94.81 (42593)    95.02 (132099)
    class       94.46 (3739)    93.24 (38970)    93.60 (121980)
    order       75.29 (3497)    74.18 (36857)    72.43 (116632)
    family      71.94 (2961)    69.94 (31913)    69.07 (102239)
    genus       71.03 (2686)    68.39 (29360)    66.63 (94346)


    Table: Accuracy and number of assigned reads on M2 datasets

    M2         0.1x             1x               4x
    MTR
    kingdom     95.27 (9030)    95.07 (88537)    91.41 (174583)
    phylum      93.83 (9030)    93.21 (88537)    88.75 (174583)
    class       89.98 (9012)    89.25 (87635)    86.32 (168854)
    order       90.44 (8822)    89.24 (85657)    86.14 (167222)
    family      80.56 (7264)    77.35 (81366)    73.01 (159591)
    genus       64.41 (6480)    61.36 (77307)    55.91 (147139)
    LCA
    kingdom     94.82 (7205)    94.66 (73176)    90.76 (143226)
    phylum      93.21 (7205)    92.57 (73169)    87.80 (143206)
    class       89.82 (5941)    88.98 (60294)    83.59 (117881)
    order       89.90 (5615)    88.44 (57373)    83.01 (113168)
    family      83.77 (4757)    81.84 (48760)    77.61 (100925)
    genus       76.91 (3907)    74.60 (40823)    69.68 (82805)


    Table: Accuracy and number of assigned reads on M3 datasets

    M3         0.1x              1x                4x
    MTR
    kingdom    100.00 (11792)    99.97 (116869)    100.00 (166948)
    phylum      99.58 (11792)    99.47 (116869)     99.86 (166948)
    class       96.97 (11763)    97.07 (116134)     99.73 (166936)
    order       91.79 (11606)    91.70 (115034)     97.67 (166148)
    family      92.27 (11117)    91.25 (111560)     97.62 (165231)
    genus       94.06 (10419)    92.19 (101533)     97.42 (140476)
    LCA
    kingdom    100.00 (10333)    99.96 (102824)     99.99 (155263)
    phylum      99.72 (10333)    99.69 (102813)     99.93 (155258)
    class       98.86 (9162)     98.82 (91445)      99.81 (141829)
    order       96.74 (7788)     96.62 (77822)      98.14 (115732)
    family      96.87 (7545)     96.42 (75616)      98.04 (110488)
    genus       97.61 (6748)     96.01 (68573)      98.35 (110139)

    Best Hit (BH)

    For each (query) fragment:
      Compare with reference sequences using BLAST
      Assign the fragment to the hit with the highest S_B
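    The Best Hit procedure can be sketched as follows. This is a minimal illustration, not the implementation used in the talk: it assumes the BLAST hits have already been parsed into (query, subject, score) tuples, where the score field stands in for S_B. The tuple layout, the function name `best_hit`, and the sample identifiers are all hypothetical.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical parsed hit: (query_id, subject_id, score), score standing in for S_B.
Hit = Tuple[str, str, float]


def best_hit(hits: Iterable[Hit]) -> Dict[str, Tuple[str, float]]:
    """Map each query fragment to its highest-scoring subject."""
    best: Dict[str, Tuple[str, float]] = {}
    for query, subject, score in hits:
        # Strict comparison: on ties, the first hit seen is kept.
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return best


# Made-up example hits for two fragments.
hits = [
    ("frag1", "refA", 52.0),
    ("frag1", "refB", 61.5),
    ("frag2", "refC", 40.1),
]
assignment = best_hit(hits)
```

    Every fragment with at least one hit receives an assignment, which is why the method assigns so many fragments regardless of how weak the top hit is.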

    BH: Pros and Cons

    Pros:
      Assigns all the fragments
      Assigns to proteins/organisms

    Cons (short fragments):
      Low accuracy
      Not reliable for unknown organisms

    Lowest Common Ancestor (LCA)

    For each (query) fragment:
      Compare with reference sequences using BLAST
      Filter the hit list
      Assign the fragment to the lowest common taxon of all these hits

    [Figure: tree diagram with hits H1 to H12 and the LCA node]
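    The LCA steps above can be sketched as follows. This is an illustration under stated assumptions rather than the procedure used in the talk: each hit is assumed to come with a root-to-leaf lineage, and because the slide does not say how the hit list is filtered, the sketch keeps hits scoring within a fraction of the best score. The names `lowest_common_taxon`, `lca_assign`, and `top_fraction` are hypothetical.

```python
from typing import List, Sequence, Tuple


def lowest_common_taxon(lineages: Sequence[Sequence[str]]) -> List[str]:
    """Return the longest shared prefix of root-to-leaf lineages."""
    if not lineages:
        return []
    common: List[str] = []
    # zip stops at the shortest lineage, so the prefix never overruns one.
    for taxa in zip(*lineages):
        if all(t == taxa[0] for t in taxa):
            common.append(taxa[0])
        else:
            break
    return common


def lca_assign(
    hits: Sequence[Tuple[float, Sequence[str]]], top_fraction: float = 0.9
) -> List[str]:
    """Filter hits by score, then take the lowest common taxon of the rest.

    The filter (score >= top_fraction * best score) is an assumption.
    """
    if not hits:
        return []
    best = max(score for score, _ in hits)
    kept = [lineage for score, lineage in hits if score >= top_fraction * best]
    return lowest_common_taxon(kept)


# Made-up hits for one fragment: (score, lineage from kingdom downwards).
hits = [
    (60.0, ["Bacteria", "Proteobacteria", "Gammaproteobacteria", "Enterobacterales"]),
    (58.0, ["Bacteria", "Proteobacteria", "Alphaproteobacteria", "Rhizobiales"]),
    (20.0, ["Archaea", "Euryarchaeota"]),
]
assigned = lca_assign(hits)
unfiltered = lca_assign(hits, top_fraction=0.0)
```

    With the default filter the weak archaeal hit is dropped and the fragment is assigned at phylum level. Without filtering, the hits share no taxon at all and the fragment stays unassigned, which illustrates why LCA places few fragments at low ranks.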

    LCA: Pros and Cons

    Pros:
      Higher accuracy than BH
      Assigning to taxa is more realistic (with short fragments)

    Cons:
      Few fragments at low ranks
      Many unassigned fragments