S. Ioannou G. Moschovitis, K. Ntalianis, K. Karpouzis and S. Kollias, Effective Access to Large Audiovisual Assets Based on User Preferences


Effective Access to Large Audiovisual Assets Based on User Preferences

S. Ioannou, G. Moschovitis, K. Ntalianis, K. Karpouzis and S. Kollias
Image, Video and Multimedia Laboratory
Electrical and Computer Engineering Department
National Technical University of Athens
Heroon Polytechniou 9, 157 80 Athens, Greece
email: sivann@softlab.ece.ntua.gr

ABSTRACT
Current multimedia databases contain a wealth of information in the form of audiovisual, as well as text data. Even though efficient search algorithms have been developed for either medium, there still exists the need for abstract presentation and summarization of the results of database users' queries. Moreover, multimedia retrieval systems should be capable of providing the user with additional information related to the specific subject of the query, as well as suggesting other topics which users with a similar profile are interested in. In this paper, we present a number of solutions to these issues, giving as an example an integrated architecture we have developed, along with notions that support efficient and secure Internet access and easy addition of new material. Segmentation of the video in shots is followed by shot classification in a number of predetermined categories. Generation of users' profiles according to the same categories, enhanced by relevance feedback, permits an efficient presentation of the retrieved video shots or characteristic frames in terms of the user interest in them. Moreover, this clustering scheme assists the notion of lateral links that enable the user to continue retrieval with data of similar nature or content to those already returned. Furthermore, user groups are formed and modeled by registering actual preferences and practices; this enables the system to predict information that is possibly relevant to specific users and present it along with the returned results. The concepts utilized in this system can be smoothly integrated in MPEG-7 compatible multimedia database systems.

Keywords
Multimedia databases, web access, text-based search, user profiling, query expansion

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
ACM Multimedia Workshop, Marina Del Rey, CA, USA
Copyright ACM 2000 1-58113-311-1/00/11...$5.00

1. INTRODUCTION
Raw film footage has been the primary source of material for news broadcasts, documentaries and film making since the advent of the portable camera. However, for the greater part of the previous century, organized archives of such media used to be rare and occasional, thus obstructing the utilization of the material in everyday applications. In fact, producers willing to use such material in their own broadcasts were hampered by restrictions imposed by the media itself (older film strips require specific hardware for playback; such hardware is usually incompatible with computerized editing systems), as well as the lack of any indexing or summarization of the visual data that is contained in the strips.

The advent of flexible digitizing hardware, together with the augmented ability of modern computer systems to handle large audiovisual assets and with emerging multimedia database systems, introduces effective solutions to these problems. In addition to that, current and evolving standards, such as MPEG-4 and MPEG-7 [5], support notions that aid the efficient retrieval and exploitation of specific material, without the need to manually browse through all available data. This is very important in time-critical operations, such as televised news broadcasts or newspaper publishing, or applications that require advanced quality, such as entertainment. Users of this kind will benefit from the advanced summarization schemes offered by the above standards and will be able to retrieve specific and atomic material as a result of simple and descriptive queries. In this context, queries need not be restricted to textual values but can also incorporate by-example schemes, e.g. queries by sketch or queries for segments that contain the face of a specific person. Conversely, the results may be presented in a fashion that provides the user with an abstract understanding of the content through the use of automatic feature extraction techniques, such as shot detection and characteristic frame extraction.

Furthermore, integrated systems should be able to support diverse groups of users; for example, historians or print journalists are usually less interested in the visual aspect of a recorded documentary and prefer to concentrate on the historical and cultural background of the story. To provide users with such capabilities, the video data are generally commented on by experts, generating the metadata that is necessary to better comprehend the content. Textual metadata can also be used to generate supplementary information, related to that actually retrieved by the query.


In addition to the above, the introduction of the Internet as a multimedia content transfer channel has broadened the target audience of such material, while introducing a number of additional issues, such as establishing advanced security systems and protecting the existing intellectual property. Neither of these matters is necessarily associated with the content itself; however, recent work in digital video watermarking shows that in the near future one will be able to prove ownership of an image or a video clip without the need for specialized equipment.

Several techniques and systems have been proposed in the literature to cope with the problem of adjusting information retrieval to particular users' needs. These approaches can be divided into two main categories: (a) content-based recommendation and (b) collaborative recommendation. A content-based recommendation system, which has its roots in the information retrieval research community, makes its recommendations by constructing a profile for each user and using this profile to judge whether discovered information will be of interest to the user or not. Profiles are mostly built up by providing material to the user, such as web pages, questionnaires, stored material, etc., according to the application; the user rates the provided information and, thus, enables the system agent to create a new profile. In the case of collaborative recommendation, discovered information is filtered by considering users with habits similar to those of the user to be serviced. As a result, items preferred by users with similar profiles are predicted to interest the specific user and are presented as top suggestions to that user.

Several examples of personalizing information systems exist. Examples of content-based recommendation systems include the Syskill & Webert [8] software agent, which suggests links that a user would be interested in or constructs LYCOS-compatible queries; InfoFinder, which scores pages based on the extraction of phrases of significant importance; WebWatcher, an information routing system designed to suggest links to users for getting from a starting location to a goal one; the SIFT system [12], which adjusts the weights of a profile by incorporating a relevance feedback approach; and Amalthaea [7], an artificial ecosystem of evolving agents that cooperate and compete in a limited resources environment. In this ecosystem, agents useful to the user get positive credit, while the bad performers get negative credit.

Correspondingly, collaborative recommendation systems include GroupLens [9], which is designed to collaboratively filter netnews; the WebHound agent, which locates users with similar ratings for specific pages and suggests unread pages that are preferred by them; the Ringo system, which is devoted to filtering social information; and the Bellcore video recommender, which efficiently combines users' choices. In general, one disadvantage of the collaborative filtering approach is that when new information becomes available, other users must first read and rate this information before it may be recommended to others. In contrast, the user profile approach can help to determine whether a user is likely to be interested in specific new information without relying on the opinions of other users.

Other, hybrid, systems have also been proposed, which suggest pages that score highly against someone's profile or are rated highly by users with similar profiles. An effective and robust example of such a system is Fab [2], which is oriented towards information retrieval and relevance feedback, as well as automated filtering of incoming information.

2. WEB-BASED ACCESS
Instead of adopting a straightforward client-server approach, we have employed the increasingly popular three-tier architecture so as to integrate the services of each module. In fact, a two-tier system is not always feasible, especially when the database server and the web server are set up in two different computers, both behind a firewall, as a part of the system requirements specifications. As far as Internet access is concerned, this setup imposes a number of restrictions, which would require reconfiguring the existing firewall system in order to overcome them.

2.1 THREE-TIER ARCHITECTURE
The underlying principle and data flow in the three-tier system are described in Figure 1:

Figure 1. Three-tier architecture block diagram

In such a context, the client tier is responsible for the formation and transmission of users' input data, as well as for presentation (rendering) of the retrieved data. A typical web browser is used, since the underlying principle is restricted to calls to pure JavaScript code. On the other end of the data flow, the database module handles pure SQL requests and returns database objects in the form of data types that are determined during the design phase of the project. This means that the middle tier acts as a negotiator between the two ends of the data flow: it forms standard SQL queries from the textual or other user inputs and, in the reverse direction, creates the necessary code for HTML documents that present the retrieved data in the browser window. In addition to that, any system policy issues, e.g. restrictions or logging, that need to be enforced can be included in this module. This effectively separates the business logic from the data itself, thus making it easier to change one or the other without necessarily affecting the whole system.

We have implemented the middle tier using PHP, a server-side, cross-platform, HTML-embedded scripting language. This setup offers a number of advantages over two-tier or client-server systems, such as:

- Data security: the client is restrained from querying critical data, such as the database schema or security policy options; the middleware component decides the amount and type of data permitted for transmission.
- Advanced resource management: due to security restrictions, any data revision and management is fulfilled in the middle tier. Since all traffic is controlled from here, the system is given the opportunity to perform load balancing and/or favor users with higher bandwidth or privileges.
- Easy maintenance and redesign: since all business logic is separated from the data structures and the presentation layer, any solitary changes are not cascaded to other modules. Changes in presentation and policies are handled in discrete sections of the script, while changes in the database schema are handled in isolated functions that build the queries.
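The division of labour described above can be illustrated with a minimal sketch. It is written in Python rather than the PHP used in the actual system, and the `shots` table, its columns and all function names are hypothetical; it only shows one isolated function building the SQL query and another building the HTML, so that neither end of the data flow sees the other's format.

```python
import html
import sqlite3


def build_query(keyword, category=None):
    """Translate user input into a parameterized SQL query (hypothetical schema)."""
    sql = "SELECT id, title FROM shots WHERE description LIKE ?"
    params = ["%" + keyword + "%"]
    if category is not None:
        sql += " AND category = ?"
        params.append(category)
    return sql + " ORDER BY id", params


def render_html(rows):
    """Turn database rows into an HTML fragment for the browser."""
    items = "".join(
        "<li>{}: {}</li>".format(row_id, html.escape(title)) for row_id, title in rows
    )
    return "<ul>" + items + "</ul>"


def handle_request(conn, keyword, category=None):
    """Middle tier: the client never sees SQL, the database never sees HTML."""
    sql, params = build_query(keyword, category)
    return render_html(conn.execute(sql, params).fetchall())


def demo_connection():
    """Build a tiny in-memory database so the sketch can be run on its own."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE shots ("
        "id INTEGER PRIMARY KEY, title TEXT, description TEXT, category TEXT)"
    )
    conn.executemany(
        "INSERT INTO shots (title, description, category) VALUES (?, ?, ?)",
        [
            ("Parade", "military parade in Athens", "Military"),
            ("Regatta & finals", "sailing race", "Sports"),
        ],
    )
    return conn
```

In this arrangement a schema change would only touch `build_query`, and a presentation change only `render_html`, which is the maintenance property the list above claims for the middle tier.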


2.2 SECURE ACCESS
User authentication follows a three-way handshaking scheme, similar to the one used in CHAP (RFC 1994) [4], and is used only during the initial authentication phase. This procedure consists of the following steps:

- The initial login screen, generated by the PHP module of the web server, contains the login and password form fields, along with a random number stored in a hidden form field. This random number is called a challenge key and is generated every time the initial login screen is requested.
- The web browser calculates the MD5 ([4] - RFC 1321) digest of a string containing the user name, password and challenge key. The constructed message digest is sent back to the server, along with the supplied user name. According to RFC 1321 [4], it is conjectured to be computationally infeasible to produce two messages that map to the same digest or to produce any message with a given pre-specified target message digest. The MD5 algorithm is implemented in JavaScript and, thus, is executed on the client side.
- The PHP login script computes the same digest by using the plain text user name string received from the browser, and the random key and plain text password retrieved from the database. If the two strings match, the user is authenticated.

The whole process is depicted in Figure 2.
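The three steps can be sketched as follows. This is an illustration in Python, not the JavaScript and PHP code of the system itself; the `:` separator, the field order inside the hashed string and all function names are assumptions, since the paper does not specify how the string is assembled.

```python
import hashlib
import hmac
import secrets


def new_challenge():
    """Server side: fresh random challenge key for each login screen."""
    return secrets.token_hex(16)


def digest(username, password, challenge):
    """MD5 digest over user name, password and challenge key.

    The ':' separator and the field order are assumptions of this sketch.
    """
    message = ":".join([username, password, challenge]).encode("utf-8")
    return hashlib.md5(message).hexdigest()


def client_response(username, password, challenge):
    """Client side: only the user name and the digest leave the browser."""
    return username, digest(username, password, challenge)


def server_verify(password_db, challenge, username, received_digest):
    """Server side: recompute the digest from the stored password and compare."""
    stored = password_db.get(username)
    if stored is None:
        return False
    expected = digest(username, stored, challenge)
    return hmac.compare_digest(expected, received_digest)
```

Because the server issues a new challenge key for every login screen, a digest captured from one login attempt does not verify against the challenge of another.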

Figure 2. The authentication process (client web browser, Apache+PHP middle tier and database: the server issues a challenge key CK, the client returns USERNAME and MD5(CK, USERNAME, PASSWORD), and the middle tier validates the login data against the database before the user is authorized)

This way, even if malicious users 'sniff' the network and gain access to the transferred data, they do not gain access to the database system because the actual password is never transmitted in plain text. Replaying the transmitted digest is of no use either, since the server-generated challenge key is random and changes according to authentication attempts, time of day and the client's IP address.

After users log in, they are authenticated on each subsequent query or request. This authentication is based on the combination of the user name and the client's IP number, which are stored in the database, so as to prevent multiple concurrent logins. In addition to that, when a client request is made, an appropriate field containing the time of the last request is first checked and then updated in the database; thus, the system can impose an auto-logout procedure for long-inactive users.
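The per-request check described above might look like the following sketch. An in-memory dictionary stands in for the database table, the class and method names are hypothetical, and the fifteen-minute idle limit is an assumed value, as the paper does not state one.

```python
IDLE_LIMIT_SECONDS = 15 * 60  # hypothetical value; the paper does not give one


class SessionTable:
    """In-memory stand-in for the database table that tracks logged-in users."""

    def __init__(self, idle_limit=IDLE_LIMIT_SECONDS):
        self.idle_limit = idle_limit
        self.sessions = {}  # user name -> (client IP, time of last request)

    def login(self, username, ip, now):
        """Register a session; refuse a second concurrent login for the same user."""
        current = self.sessions.get(username)
        if current is not None and now - current[1] <= self.idle_limit:
            return False
        self.sessions[username] = (ip, now)
        return True

    def check_request(self, username, ip, now):
        """Validate a request: known user, same IP, not idle for too long."""
        current = self.sessions.get(username)
        if current is None or current[0] != ip:
            return False
        if now - current[1] > self.idle_limit:
            del self.sessions[username]  # auto-logout of a long-inactive user
            return False
        self.sessions[username] = (ip, now)  # update the time of the last request
        return True
```

Passing the current time as an argument keeps the sketch deterministic; a deployed version would presumably read the clock itself.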

2.3 DATABASE STRUCTURE
In order to exploit the classification of the material in different categories and ensure easy upgrading to a fully MPEG-7 compatible scheme, we employed the popular scene-shot-characteristic frame hierarchical scheme. At first, the videos were digitized from the original reels and recorded to Betacam SP tapes, followed by MPEG-2 encoding. This material was then segmented into more than sixty scenes, which in total comprise more than ten thousand shots. Each scene is described using technical features, such as the total number of frames or sound quality, and is annotated by an expert historian, thus providing clues on the historical and cultural environment of the subject, in addition to the textual description of the visual data. Besides that, the expert also comments on characteristic frames extracted from each shot [1]; this assists the summarized presentation of the shot, while giving the expert the opportunity to add extended commentary to the material.

A nice advantage of this description scheme is the straightforward introduction of concepts included in MPEG-7, such as Multimedia Description Schemes (MMDS) [5] and XML-compatible content management. The target of these concepts is to standardize a set of tools dealing with description and management issues, as well as navigation and retrieval in multimedia entities. Since the latest generation of web browsers offers inherent support of XML, efficient separation of content, business logic and presentation of results is possible, without having to rearrange the employed schemes.

3. ASSET RETRIEVAL

3.1 SUMMARIZATION OF THE TEXTUAL DESCRIPTIONS
The first step in analyzing the textual description and extracting keywords is to remove digits and punctuation, as we assume that words consist of letters only. The second filtering step takes into consideration noise words (or stop words) such as 'a', 'the', 'in' etc. and noise stems, for the specific topic of interest, which should not be included in the summarization process. In this procedure, input text words are compared against the exact noise words, and again, after stemming, against the noise stems; if a match occurs, the input word is ignored. Thus, common invariant words and common stems can be kept out of the index that characterizes the document. For the sake of simplicity, let us assume that the noise stems are suggested by the specialized expert on each topic.
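The two filtering steps can be sketched as below. The word lists are tiny illustrative placeholders, and `toy_stem` is a crude suffix stripper used only to keep the example self-contained; it is not the Porter algorithm and would be replaced by a real stemmer in practice.

```python
import re

STOP_WORDS = {"a", "the", "in", "of", "and"}  # tiny illustrative list
NOISE_STEMS = {"film"}                        # hypothetical expert-supplied stems
SUFFIXES = ("ization", "ations", "ation", "istic", "ize", "ing", "ed", "s")


def toy_stem(word):
    """Crude suffix stripper standing in for a real stemmer (not Porter's algorithm)."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def filter_words(text):
    """Return the stems that survive both filtering steps."""
    # Step 1: drop digits, then keep runs of letters only (punctuation is discarded).
    cleaned = re.sub(r"[0-9]", "", text.lower())
    words = re.findall(r"[a-z]+", cleaned)
    kept = []
    for word in words:
        if word in STOP_WORDS:   # exact match against the noise words
            continue
        stem = toy_stem(word)
        if stem in NOISE_STEMS:  # match against the noise stems after stemming
            continue
        kept.append(stem)
    return kept
```

For instance, `filter_words("The army landed in 1941; films of the landing.")` keeps only the stems of "army", "landed" and "landing", since "films" reduces to the noise stem "film".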

After considering all the previous cases, as a final step we reduce the redundancy of the remaining words, by detecting the specific stem of each word. For example, the words 'characters', 'characterize', 'characteristic' and 'characterization' all reduce to the root (or canonical stem) 'character'. A well-known algorithm [3], which is based on the Porter suffix-stripping algorithm (or 'Porter stemmer'), is used as a process for removing common morphological and inflectional endings from words in English.

After performing the aforementioned analysis, the keyword extraction phase is activated. In the case of plain textual documents, information-based approaches are adopted to determine which words can be used as features. As a general rule, every extracted word can have a weight corresponding to the frequency with which it occurs in the 'hotlist' pages, and the infrequency with which it occurs in the 'coldlist' pages [8]. This can be accomplished by finding the mutual information between the presence and absence of a word and the classification of a page. Another approach uses the vector space information retrieval paradigm, where documents are represented as vectors [11]. To determine word weights, a TF-IDF (Term-Frequency / Inverse Document Frequency) scheme is adopted to calculate how important a word is, based on how frequently it appears in the document and how rare it is in the collection. In this simple case the weight for a word w belonging to a document d is given by:

$$w_{ds} = f_{ds} \log \frac{N_D}{n_s} \qquad (1)$$

where $w_{ds}$ is the weight of the word, $f_{ds}$ is the frequency of the word w in the document, $N_D$ is the total number of documents in the collection and $n_s$ is the number of documents containing the word w.

One recent method [2] uses a more sophisticated TF-IDF scheme, which normalizes for document length, following the recommendations of [11]. According to Salton and Buckley, vector-length normalization typically does not work well for short documents. Then, the weight for a word w is estimated by the following formula, which has also been adopted in our scheme:

$$w_{ds} = \left(0.5 + 0.5\,\frac{f_{ds}}{f_{d\max}}\right) \log \frac{N_D}{n_s} \qquad (2)$$

where $f_{d\max}$ expresses the highest term frequency.

In our approach we include the twenty highest-weighted words of a document to construct a document's vector. This is done in an attempt to reduce memory usage, decrease communications load and avoid over-fitting. Experiments in [8] have demonstrated that the number of words is crucial for constructing a robust scheme. Too many words lead to a performance decrease during the classification process of web pages even when supervised learning methods have been incorporated. Furthermore, our experiments for a small vocabulary (less than ten words) have shown that recommendation results were poor compared to cases when thirty or fifty words composed the vector of a document. Table 1 shows some of the most informative words obtained from a collection of documents concerning historical events.

Table 1. Keywords used as features for documents describing historical events

war       island      army    leader    revolution
europe    running     june    people    cause
bridge    politician  gun     prepare   bleeding
cold      notice      iron    first     condition
victory   peace       plane   fighting  exhaustive
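As an illustration of how such keyword vectors could be computed, the sketch below weights each term by its frequency scaled by the highest term frequency in the document (mapped into the range 0.5 to 1) times the logarithm of the collection size over the number of documents containing the term, and then keeps the highest-weighted terms. The function name and the data layout (lists of already filtered and stemmed terms) are assumptions made for the example.

```python
import math
from collections import Counter


def document_vector(doc_terms, collection, top_k=20):
    """Weight the terms of one document and keep the top_k heaviest.

    doc_terms: list of (already filtered and stemmed) terms of the document.
    collection: list of term lists, one per document, including doc_terms.
    """
    counts = Counter(doc_terms)  # term frequency within this document
    if not counts:
        return {}
    n_docs = len(collection)
    doc_freq = Counter()
    for terms in collection:
        doc_freq.update(set(terms))  # number of documents containing each term

    f_max = max(counts.values())  # highest term frequency in the document
    weights = {
        term: (0.5 + 0.5 * f / f_max) * math.log(n_docs / doc_freq[term])
        for term, f in counts.items()
    }
    # Sort by decreasing weight; ties are broken alphabetically for determinism.
    ranked = sorted(weights.items(), key=lambda item: (-item[1], item[0]))
    return dict(ranked[:top_k])
```

A term that occurs in every document of the collection receives weight zero, which is how very common words drop out of the vector even if they pass the noise-word filter.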

As one can observe, all words consist of letters only, and they are in the lowercase form. Such a table is constructed for each document; the elements of a document's table are assigned weights with respect to the categories that the document belongs in. The weights correspond to the length of the document and the frequency of the specific words. After a certain number of keywords (those with the highest weights concerning a number of documents) have been picked out, the information is supplied to a learning subsystem. Then, each time a user accesses a new page, the weights of their profile are updated according to the analysis of the new page. A simple way to update profiles is by addition of new document information to the user profile, which is referred to in the information retrieval community as relevance feedback [10].

3.2 USER PROFILING

The search process in a multimedia database can produce overwhelming amounts of information, especially in the case of a user that does not look for something specific. In order to reduce transmission time and results complexity, it is desirable to rank the results according to the user's preferences and the actual relevance to the query statement. For that reason, we employ a user profiling mechanism to rank the returned material, optimize the precision score [6] and recommend relevant additional shots for further study.

For each video shot, the system produces a feature vector that consists of sixteen content category weights (see Table 2), followed by five user category weights, describing in essence a fuzzy relevance to a fixed set of categories. The user category weights correspond to five typical users of the system, namely Historian, Journalist, Cinephile, Director and Casual User. The resulting vectors are normalized for comparison purposes, thus building a 21-D unit hypercube. According to this scheme, a specific shot is predicted to interest a given user if the respective vectors are relatively close in this vector space.

Table 2. The categories that the material is classified in

Sports           Arrivals-Departures    Industry-Commerce  Communications-Transportation
Celebrations     Ecclesiastical Themes  Military Topics    Governmental-Municipal Themes
Public Services  Artistic               Politics           Education
Tourism          Celebrities            Head of State      Historical Events

To measure the proximity of feature vectors we employ the standard dot product metric:

$$r(c, u) = c \cdot u \qquad (3)$$

where u is the user profile vector, c is the shot vector and r is the resulting relevance function. The value of the relevance function r is used to sort the returned shots, so that the shots which are more relevant are displayed first, as it is probable that the user is more interested in them.
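The ranking step can be sketched as follows. The function names are hypothetical, and Euclidean (unit-length) normalization is an assumption, since the paper only states that the vectors are normalized; the vectors may have any common length, 21 in the scheme above.

```python
import math


def normalize(vector):
    """Scale a vector to unit Euclidean length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return list(vector) if norm == 0 else [x / norm for x in vector]


def relevance(shot, profile):
    """Dot product r(c, u) of a shot vector c and a user profile vector u."""
    if len(shot) != len(profile):
        raise ValueError("vectors must have the same number of category weights")
    return sum(c * u for c, u in zip(shot, profile))


def rank_shots(shots, profile):
    """Return shot ids sorted from most to least relevant for this profile."""
    u = normalize(profile)
    scored = [(relevance(normalize(vec), u), shot_id) for shot_id, vec in shots.items()]
    # Highest relevance first; ties are broken by shot id for determinism.
    return [shot_id for score, shot_id in sorted(scored, key=lambda p: (-p[0], p[1]))]
```

With unit-length vectors the dot product equals the cosine of the angle between shot and profile, so a shot whose category weights point in the same direction as the profile scores close to 1.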

During the registration stage, new users are allowed to review their initial, neutral profile and adjust it to better match their interests and preferences. In addition, the system tracks the transactions and choices of the user so as to further refine the profile and improve the model of his persona. In contrast to other proposed architectures, our system does not require the user to rate the material retrieved from the query.

Similar to the relevance function, dynamic profile updating also corresponds to a vector operation. In this case, a simple relevance feedback algorithm is used for computing the vector increment $\Delta u$:

$$\Delta u = s \cdot \lambda \cdot c \qquad (4)$$

where s = 1 if the user selects c and s = -1 if the user ignores c, and $\lambda$ is a positive parameter, typically lower than 0.1, ensuring smoothness of the updating procedure.
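The update rule above can be sketched in Python as follows. The function name `update_profile`, the default step of 0.05 and the sample vectors are hypothetical, and the shot vector is written `c` as in the relevance function. How the system bounds or renormalizes the updated weights is not described, so this sketch only adds the increment.

```python
def update_profile(u, c, selected, lam=0.05):
    """Return u + s * lam * c, with s = +1 if the shot was selected, -1 if ignored."""
    if lam <= 0:
        raise ValueError("lam must be positive")
    s = 1 if selected else -1
    return [ui + s * lam * ci for ui, ci in zip(u, c)]

profile = [0.5, 0.2, 0.8]   # hypothetical 3-D profile
shot = [1.0, 0.0, 0.4]      # hypothetical shot vector

after_select = update_profile(profile, shot, selected=True)   # moves towards the shot
after_ignore = update_profile(profile, shot, selected=False)  # moves away from it
```

A small step keeps a single selection from dominating the profile, which is the smoothness property the text refers to.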


3.3 VIDEO SHOT RECOMMENDATION
Our system supports two types of dynamic recommendation services: content-based, where video shots similar to the ones the user is viewing are suggested, and collaborative, where the system recommends shots viewed by users that share interests with the current user. Both types are addressed using a similar algorithm. More specifically, standard clustering algorithms are used to segment the content and user spaces in 'similar' groups.

A Kohonen Neural Network provides a topological map, where shots of similar content are assigned to neighboring nodes. The network is updated whenever new material is introduced in the database. At run time, when the user is viewing a particular shot, the system searches the shots contained in the same content cluster and suggests the closest members according to the aforementioned dot product metric. This clustering provides an aggressive culling mechanism for the content database, limiting the search for similar shots to a small subset of the database.

Likewise, the user profile space is segmented in clusters containing users with similar profile vectors. We assume those users share common interests, so it makes sense to recommend shots viewed by neighbors with respect to the user profile cluster. The recommendations come from a pool of most-viewed shots by members of the same cluster. We call these suggestions lateral because they might diverge from the users' path towards information retrieval while still being of interest to them.

In our implementation, we store hit frequencies per cluster and video shot. Due to the dynamic nature of user profiles, the profile space clustering is updated when new users are registered or existing profiles are refined; the current implementation schedules this once per week. Our content domain (movies) is quite suitable for this kind of adaptive recommendation due to its static nature. The frequency of new additions to the database is small, enabling lots of different users to view the same items. Furthermore, the categories are predefined, thus enabling the creation of coherent content clusters.

3.4 A HANDS-ON SCENARIO

We will demonstrate the ranking mechanism with an example: the user is interested in videos referring to King George of Greece and enters that phrase in the appropriate field of the client screen. The system queries the database and returns two video shots. In the following, the vector representations of the user profile and the matched shots are presented, along with the relevance function evaluation and the final sorting. These vectors consist of the sixteen material category weights and the five user category weights; also, a subset of the returned shots is shown and the vectors are presented in un-normalized form to show the actual weights allocated in the range [0..1].


Table 3. User profile vector (u)
0.1  0.4  0.3  0.6  0.8  0.9  0.3
0.1  0.1  0.2  1.0  0.4  0.9  0.8
0.9  0.3  0.8  0.0  0.6  0.1  0.5

Table 4. The 21-D vector for shot #1 (c1)
0.0  1.0  0.0  0.4  0.8  0.0  0.2
0.4  0.0  0.0  0.8  0.0  0.1  0.9
0.1  0.9  0.8  0.2  0.4  0.2  0.7

Figure 4. Retrieved video shot #2

Table 5. The 21-D vector for shot #2 (c2)
0.0  0.0  0.0  0.0  1.0  0.0  0.8
0.8  0.7  0.0  0.8  0.0  0.0  0.0
1.0  0.1  0.9  0.5  0.2  0.4  0.7

Video shot #1 shows the return of King Constantine of Greece, son of King George, after his trip to the States in the summer of 1967, while video shot #2 is taken from a parade in downtown Athens in 1938. Although King George is actually missing from video shot #2, his absence is strongly noted by the expert historian. Given the calculated relevance functions, we have r(c1) = norm(c1) · norm(u) = 0.732 and r(c2) = norm(c2) · norm(u) = 0.6319, where norm(v) denotes the normalized version of vector v. As a result, the system gives priority to c1 over c2. Moreover, the recommendation system suggests the following video shot based on its close proximity to the aforementioned items:
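The two relevance values can be checked directly from the vectors in Tables 3 to 5. The Python sketch below assumes that norm(v) means scaling v to unit Euclidean length, an assumption that reproduces the reported figures; the helper names `norm` and `r` are illustrative only.

```python
import math

def norm(v):
    """Scale v to unit Euclidean length (assumed meaning of norm(v))."""
    length = math.sqrt(sum(x * x for x in v))
    return [x / length for x in v]

def r(c, u):
    """Relevance: dot product of the normalized shot and profile vectors."""
    return sum(a * b for a, b in zip(norm(c), norm(u)))

# Table 3: user profile vector
u = [0.1, 0.4, 0.3, 0.6, 0.8, 0.9, 0.3,
     0.1, 0.1, 0.2, 1.0, 0.4, 0.9, 0.8,
     0.9, 0.3, 0.8, 0.0, 0.6, 0.1, 0.5]
# Table 4: shot #1
c1 = [0.0, 1.0, 0.0, 0.4, 0.8, 0.0, 0.2,
      0.4, 0.0, 0.0, 0.8, 0.0, 0.1, 0.9,
      0.1, 0.9, 0.8, 0.2, 0.4, 0.2, 0.7]
# Table 5: shot #2
c2 = [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.8,
      0.8, 0.7, 0.0, 0.8, 0.0, 0.0, 0.0,
      1.0, 0.1, 0.9, 0.5, 0.2, 0.4, 0.7]

print(round(r(c1, u), 4), round(r(c2, u), 4))  # 0.732 0.6319
```

The un-normalized dot products are 4.60 and 4.15; dividing by the vector lengths gives the two values above, so shot #1 is ranked first.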

Figure 3. Retrieved video shot #1

Figure 5. Suggested video shot #3

Table 6. The 21-D vector for suggested shot #3
0.0  0.9  0.0  0.3  0.7  1.0  0.3
0.1  0.0  0.0  0.2  0.0  0.1  0.9
0.5  0.9  0.9  0.4  0.2  0.9  0.1

This shot, from 1921, shows King Constantine, father of King George, during a highly celebrated visit to an Orthodox church in Asia Minor.


The user in question is classified to a profile cluster with the following mean vector:

0.1  0.2  0.2  0.3  0.9  0.6  0.1
0.1  0.1  0.2  0.9  0.2  0.9  0.9
0.9  0.1  0.8  0.0  0.3  0.1  0.2

and the collaborative subsystem also suggests this relevant shot:


Figure 6. Lateral video shot

This video shot is taken from a military celebration in 1938. The King does appear in this video, but the key figure is the dictator of Greece and head of the Greek Army at the time; this explains why this video shot was not retrieved from the initial query, but was suggested as highly relevant by the system. The complete screen with the two retrieved shots and the suggestions made by the system, along with summarized descriptions, is shown in Figure 7 below.


Figure 7. Retrieved and suggested shots with summarized text descriptions
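The collaborative step of Section 3.3 can be sketched in Python as follows: the user is assigned to the profile cluster whose mean vector is closest under the dot product metric, and the most-viewed shots of that cluster that the user has not yet seen are suggested. The cluster names, mean vectors, hit counts, shot ids and function names are all hypothetical, and the nearest-mean assignment merely stands in for whichever clustering algorithm the system actually uses.

```python
import math
from collections import Counter

def cosine(a, b):
    """Dot product of the unit-length versions of two non-zero vectors."""
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return sum(x * y for x, y in zip(a, b)) / (na * nb)

def nearest_cluster(u, cluster_means):
    """Return the id of the cluster whose mean vector is closest to profile u."""
    return max(cluster_means, key=lambda cid: cosine(u, cluster_means[cid]))

def lateral_suggestions(u, cluster_means, hits, seen, k=2):
    """Suggest up to k most-viewed shots of the user's cluster, skipping seen ones."""
    cid = nearest_cluster(u, cluster_means)
    ranked = [shot for shot, _ in hits[cid].most_common()]
    return [shot for shot in ranked if shot not in seen][:k]

# Hypothetical 3-D cluster means and per-cluster hit frequencies.
cluster_means = {"history": [0.9, 0.1, 0.2], "sports": [0.1, 0.9, 0.3]}
hits = {
    "history": Counter({"shot_7": 12, "shot_2": 9, "shot_5": 4}),
    "sports": Counter({"shot_9": 20, "shot_1": 3}),
}
user = [0.8, 0.2, 0.1]

print(lateral_suggestions(user, cluster_means, hits, seen={"shot_7"}))
# ['shot_2', 'shot_5']
```

Storing one hit counter per cluster mirrors the statement that hit frequencies are kept per cluster and video shot, and it keeps the lookup independent of the size of the whole database.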

4. ACKNOWLEDGEMENT
This work was partially funded by the Greek Ministry of Press and Mass Media (MPMM) and the program Digitization, archiving and access in MPMM audiovisual data (1999-2000). The Movie Archive of the Ministry holds the copyright for shots and stills presented in this paper.

5. REFERENCES
[1] Avrithis Y., Doulamis A., Doulamis N. and Kollias S., A Stochastic Framework for Optimal Key Frame Extraction from MPEG Video Databases, Computer Vision and Image Understanding, Vol. 75, No. 1/2, July/August 1999, pp. 3-24.
[2] Balabanovic M. and Shoham Y., Fab: content-based, collaborative recommendation, Communications of the ACM, 40(3), Mar. 1997, pp. 66-72.
[3] Frakes, W. B., Stemming algorithms, Information Retrieval: Data Structures and Algorithms, Prentice Hall, Saddle River, NJ, USA, 1992, pp. 131-160.
[4] Internet Engineering Task Force, Request for comments, http://www.ietf.org/rfc.html.
[5] ISO/IEC JTC1/SC29/WG11 N3250, MPEG-7 Principal Concept List, March 2000.
[6] Montebello M., Optimizing Recall/Precision scores in IR over the WWW, Proceedings of the 21st ACM SIGIR Conference on Research and development in information retrieval, ACM Press, New York, NY, USA, 1998, pp. 361-362.
[7] Moukas, A. and Maes P., Amalthaea: evolving multi-agent information filtering and discovery systems for the WWW, Autonomous agents and multi-agent systems, Vol. 1, 1998, pp. 59-88.
[8] Pazzani, M., Muramatsu J., Billsus, D., Syskill & Webert: Identifying interesting web sites, Proceedings of the National Conference on AI, AAAI Press, Menlo Park, CA, USA, 1996.
[9] Resnick, P., Iacovou, N., Suchak, M., Bergstrom, P., and Riedl, J., GroupLens: An open architecture for collaborative filtering of netnews, Proceedings of the ACM Conf. on Computer-Supported Cooperative Work, ACM Press, NY, 1994, pp. 175-186.
[10] Rocchio Jr. J., Relevance feedback in information retrieval, The SMART Retrieval System - Experiments in Automatic Document Processing, Prentice Hall, Upper Saddle River, NJ, USA, 1971, pp. 313-323.
[11] Salton, G. and Buckley, C., Term weighting approaches in automatic text retrieval, TR87-881, Cornell University, CS Department, Ithaca, NY, 1987.
[12] Yan, T. W. and Garcia-Molina, H., SIFT - A tool for wide-area information dissemination, Proc. of the USENIX Technical Conference, USENIX, Berkeley, CA, USA, 1995, pp. 177-186.
