

Denis D. Mauá and Murilo Naldi, Ed.

ENIAC 2018
Encontro Nacional de Inteligência Artificial e Computacional
National Meeting on Artificial and Computational Intelligence

Event co-located with BRACIS 2018
São Paulo, Brazil, October 22–25, 2018
Proceedings


© 2018 for the individual papers by the papers' authors. Copying permitted for private and academic purposes. Re-publication of material from this volume requires permission by the copyright owners.


Preface

The National Meeting on Artificial and Computational Intelligence (ENIAC) is one of the main national forums for researchers, practitioners, educators, and students to present and discuss innovations, trends, experiences, and developments in the field of Artificial and Computational Intelligence. In particular, ENIAC is the ideal forum for undergraduate and graduate students to get their first experience in a large scientific conference.

The 15th edition of the event was co-located with the 7th Brazilian Conference on Intelligent Systems (BRACIS), the 4th Symposium on Knowledge Discovery, Mining and Learning (KDMiLe) and the 9th Thesis and Dissertation on Artificial and Computational Intelligence Contest (CTDIAC).

Submissions were limited to 12 pages following the SBC article style, written either in English or Portuguese. Following up on previous years, the event had a main track and a special track for papers whose first author is an undergraduate student (at the time of submission). Papers in the special track underwent the same reviewing process as papers in the main track, but ran separately for the Best Undergraduate Paper Award. Papers in the main track ran for the Best Paper Award.

This year's edition had 131 submissions (99 in the main track and 32 in the special track), from which we accepted 81 (57 in the main track and 24 in the special track). ENIAC 2018 adopted a double-blind reviewing policy: every submission was reviewed by at least two separate and anonymous reviewers, who were not aware of the authors' identities.

The following papers were runners-up for the Best Paper Award:

• Automatic Generation of a Type-2 Fuzzy System for Time Series Forecast based on Genetic Programming, Marco Antônio da Cunha Ferreira, Ricardo Tanscheit, Marley Vellasco

• Eye Localization Using Convolutional Neural Networks and Image Gradients, Werton P. de Araujo, Thelmo P. de Araujo, Gustavo A. L. de Campos

• Detecting the Presence of Meteors in Images: New collection and results, Renato Moraes Silva, Ana Carolina Lorena, Tiago A. Almeida

• Plan Existence Verification as Symbolic Model Checking, Macilio da Silva Ferreira, Maria Viviane Menezes, Leliane Nunes de Barros

The following papers were runners-up for the Best Undergraduate Paper Award:


• A Multi-Gaussian Fuzzy Membership Function to the Algorithm Fuzzy Grow-Cut applied to Segment Lesions in Mammography Image, Filipe R. Cordeiro, Beatriz Albuquerque, Valmir Macario

• Sentiment Analysis of Twitter Posts about the 2017 Academy Awards, Igor T. Correa, Daniel D. Abdala, Rodrigo S. Miani, Elaine R. Faria

• Convolutional Neural Networks for Leaf Disease Classification, Raí G. Carvalho, Leticia T. M. Zoby

• Facial Landmarks Detection on Faulty Datasets with Regression Trees and Principal Component Analysis Parametrization, Abner S. Nascimento, Danilo A. Oliveira, Maria Raquel L. de Couto, Ialis C. de Paula Júnior

We are enormously indebted to the program committee for their good and hard work.

October 2018
Denis D. Mauá and Murilo Naldi

Program Chairs


Organizing Committee
Ana Paula Appel, IBM Research
Paulo Cavalin, IBM Research

Program Chairs
Denis D. Mauá, USP
Murilo Naldi, UFSCar

Program Committee
Adolfo Neto, Federal University of Technology of Paraná (UTFPR)
Adriano Werhli, Universidade Federal do Rio Grande (UFRG)
Adrião Duarte Dória Neto, Universidade Federal do Rio Grande do Norte
Alessandro Lameiras Koerich, ÉTS Montreal
Ana Bazzan, UFRGS
Ana Carolina Lorena, Universidade Federal de São Paulo
André Britto, Universidade Federal do Sergipe
André Ponce de Leon F de Carvalho, ICMC-USP/S.Carlos
André Renato Villela da Silva, Universidade Federal Fluminense
Andreia Malucelli, PUCPR
Anisio Lacerda, UFMG
Anna Costa, Universidade de São Paulo
Antonio Branco, Universidade de Lisboa
Aurora Pozo, UFPR
Benjamin Bedregal, UFRN
Bianca Zadrozny, IBM Research Brazil
Bruno Lopes, Universidade Federal Fluminense
Carlos Lopes, Universidade Federal de Uberlândia
Carlos Thomaz, FEI
Carmelo Bastos-Filho, University of Pernambuco
Carolina Paula de Almeida, State University in the Midwest of Paraná (UNICENTRO)
Cesar Tacla, Universidade Tecnológica Federal do Paraná
Cleber Zanchettin, Universidade Federal de Pernambuco
David Martins-Jr, Universidade Federal do ABC
Diana Adamatti, Universidade Federal do Rio Grande (FURG)
Dibio Leandro Borges, University of Brasilia
Diego Silva, Universidade Federal de São Carlos


Edward Hermann Haeusler, PUC-Rio
Emerson Paraiso, Pontifícia Universidade Católica do Paraná
Eugenio Oliveira, Faculdade Engenharia, Universidade do Porto
Fabrício Enembreck, Pontifical Catholic University of Paraná (PUCPR)
Fabio Cozman, USP - Politécnica
Fernando Osorio, Universidade de São Paulo
Flavia Barros, Universidade Federal de Pernambuco
Flavia Bernardini, Universidade Federal Fluminense
Flavio Tonidandel, Centro Universitário da FEI
Flavio Soares Corrêa da Silva, Universidade de São Paulo
Flavio Varejão, Universidade Federal do Espírito Santo
Francisco De Carvalho, Centro de Informática - CIn/UFPE
George Cavalcanti, Universidade Federal de Pernambuco
Gerson Zaverucha, Federal University of Rio de Janeiro (UFRJ)
Gina Oliveira, Universidade Federal de Uberlândia
Gisele Pappa, UFMG
Gracaliz Dimuro, Universidade Federal do Rio Grande
Gustavo Giménez-Lugo, UTFPR
Heitor Lopes, Universidade Tecnológica Federal do Paraná
Heloisa Camargo, UFSCar
Huei Lee, Universidade Estadual do Oeste do Paraná
Ines Dutra, Universidade do Porto
Ivandré Paraboni, USP Leste
João Alcantara, Universidade Federal do Ceará
João Balsa, Fac. Ciências da Universidade de Lisboa
Jomi Hübner, UFSC
Jorge Kanda, Universidade Federal do Amazonas
Julio Nievola, PUCPR
Karina Valdivia-Delgado, University of São Paulo
Leliane Nunes de Barros, University of São Paulo
Li Weigang, University of Brasilia
Luís Correia, Universidade de Lisboa
Luís Alberto Lucas, Federal University of Technology - Paraná
Luis Otavio Alvares, UFSC
Luiz Satoru Ochi, Instituto de Computação - Universidade Federal Fluminense
Marcelo Finger, USP/IME
Marcelo Ladeira, Universidade de Brasília
Marcos Quiles, Universidade Federal de São Paulo
Marcos Eduardo Valle, Universidade Estadual de Campinas
Maria das Graças Nunes, USP/ICMC
Marilton Aguiar, UFPEL
Marcia Cristina Moraes, Colorado State University (CSU)
Marcio Basgalupp, UNIFESP


Mario Benevides, Universidade Federal do Rio de Janeiro (UFRJ)
Nelson Ebecken, UFRJ
Patrícia Tedesco, Centro de Informática (UFPE)
Patricia Oliveira, University of São Paulo
Patrick Moratori, Universidade Federal Fluminense
Paulo Quaresma, Universidade de Évora
Paulo Trigo, Instituto Superior de Engenharia de Lisboa
Petrucio Viana, Universidade Federal Fluminense
Rafael Bordini, PUCRS
Regivan Hugo Nunes Santiago, Universidade Federal do Rio Grande do Norte
Reinaldo Bianchi, Centro Universitário da FEI
Rejane Frozza, UNISC
Renato Tinós, USP
Ricardo Cerri, Universidade Federal de São Carlos
Ricardo Silveira, Universidade Federal de Santa Catarina
Ricardo Tanscheit, PUC-Rio
Richard Gonçalves, UNICENTRO
Ronaldo Prati, Universidade Federal do ABC
Rui Camacho, University of Porto
Sarajane Peres, Universidade de São Paulo
Sílvio Cazella, UFCSPA
Sergio Antonio Andrade de Freitas, Universidade de Brasília
Solange Rezende, USP/ICMC
Thiago Pardo, USP/ICMC
Tiago Almeida, Federal University of São Carlos (UFSCar)
Tsang Ing Ren, Universidade Federal de Pernambuco
Valdinei Freire, EACH - Universidade de São Paulo
Wagner Meira Jr., UFMG
Wellington Pinheiro dos Santos, Universidade Federal de Pernambuco


Detecting the presence of meteors in images:new collection and results

Renato Moraes Silva1, Ana Carolina Lorena2, Tiago A. Almeida1

¹Department of Computer Science, Federal University of São Carlos (UFSCar), Sorocaba, São Paulo, Brazil

²Science and Technology Institute, Federal University of São Paulo (UNIFESP), São José dos Campos, São Paulo, Brazil

[email protected], [email protected], [email protected]

Abstract. In this paper, we present a new public and real dataset of labeled images of meteors and non-meteors that we recently used in a machine learning competition. We also present a comprehensive performance evaluation of several established machine learning methods and compare the results with a stacking approach – one of the winning solutions of the competition. We compared the performance obtained by the methods in the traditional repeated five-fold cross-validation with the ones obtained using the training and test partitions used in the competition. A careful analysis of the results indicates that, in general, the stacking-based approach obtained the best performance compared to the baselines. Moreover, we found evidence that the validation strategy used by the platform that hosted the competition can lead to results that do not hold in a cross-validation setup, which is the recommended practice in real-world scenarios.

1. Introduction

In 2017, the first edition of the KDD-BR (Brazilian competition on Knowledge Discovery in Databases) launched a challenge on identifying the presence of meteors in images. The competition was one of the joint activities of the 2017 editions of the Brazilian Conference on Intelligent Systems (BRACIS), the Brazilian Symposium on Databases (SBBD) and the Symposium on Knowledge Discovery, Mining and Learning (KDMiLe). The objective was to obtain an automatic system for signaling the presence of a meteor (popularly known as a shooting star) in an image. For this, we collected images captured by a monitoring station installed at the Observatory of Astronomy and Space Physics at the University of Vale do Paraíba (UNIVAP), São José dos Campos, Brazil. This is one of the monitoring stations of the EXOSS Citizen Science organization¹. EXOSS is a Brazilian non-profit organization which monitors meteors that cross the southern skies. It has about 50 active monitoring stations at various locations in the Brazilian territory.

The monitoring station consists of a low-cost video surveillance camera, usually used in security systems, with a dedicated motion capture software named UFO Capture². Each time a moving object is detected by the camera during the night, it starts recording. The captured images include meteors, birds, insects, planes, lightning, and other objects. There is a large variation regarding weather conditions, the presence of stars/moon in the sky, camera noise, among others. Therefore, the challenge is to distinguish between meteors and non-meteors under varying conditions. Two examples of captured images are presented in Figure 1.

¹ http://exoss.org
² http://sonotaco.com/e_index.html

(a) Example of meteor. (b) Example of non-meteor.

Figure 1. Examples of captured images.

The usual procedure adopted by the specialists is to inspect the images periodically in order to discard the non-meteors, whilst meteors are validated with another dedicated software. The validated images of meteors can be used for several purposes, for instance, to identify their orbit or to figure out the region where the meteor fell, making it easier to find and collect pieces that can be studied later [Gural 2012]. Therefore, an automatic system may support this filtering process by classifying the recorded images into two classes: meteor vs non-meteor. However, labeled instances are mandatory to train these supervised models and, to the best of our knowledge, there are no labeled datasets of meteors publicly available. We are also not aware of any previous attempt to automate this image recognition task, although there is related work on meteor detection in video records [Siladi et al. 2015, Vítek and Nasyrova 2018] and radio spectrograms [Roman and Buiu 2015].

To fill this lack of public data, this paper presents a labeled dataset of meteors and a set of experiments which can be used as a baseline in future work. Several standard classifiers are evaluated and, afterward, their performances are compared to one of the best solutions submitted in the competition, which is based on feature transformations and stacking. Interestingly, the results of a repeated five-fold cross-validation experiment vary from those obtained using the original training and testing partitions of the competition. This reveals problems in the performance evaluation of such competition platforms, which perform only one holdout round. For a small dataset such as that used in this competition, this characteristic becomes more problematic.

This paper is structured as follows. Section 2 presents the meteor detection dataset used in the competition. Section 3 shows the configuration of the experiments performed in this study. The results are presented in Section 4. Finally, we give our main conclusions in Section 5.

2. Meteor dataset

The meteor dataset is composed of features extracted from images captured by a monitoring station located at the Observatory of Astronomy and Space Physics of the University of Vale do Paraíba (UNIVAP), in São José dos Campos, SP, Brazil. They were recorded during the months of April and March, 2017.

The UFO Capture software, which is used in the monitoring stations to capture meteors, stores a set of different files per recording: a movie file in the AVI format; an XML file with profile information; a bitmap file that contains mask and average brightness information; and two JPEG images with a composition of the video frames. The first JPEG file contains a peak hold or snapshot still image of the captured event (Figure 2a). The second file contains a thumbnail image, in which an area of interest where the moving object was detected is also highlighted (Figure 2b). The areas of interest in the corresponding snapshot are then used to figure out whether such image captured a meteor or not.

(a) Snapshot of a meteor. (b) Thumbnail image.

Figure 2. Examples of two JPEG images stored by UFO Capture per recording.

A total of 122 images (41 from meteors and 81 from non-meteors) were captured and labeled by specialists. This dataset has an imbalance ratio of 1.97 (ratio of the number of instances in the majority class to the number of examples in the minority class), which can be considered moderate [Fernández et al. 2008]. A large set of features was extracted from regions of interest using 21 image preprocessing techniques. Table 1 presents these techniques and the number of features generated by each one.

This dataset has challenging characteristics which can be used to test different capabilities of classification methods in future research: (i) it is imbalanced, (ii) it contains noisy data, (iii) each sample is represented by 21 feature sets (totalling 3,451 features) that can be combined in several ways to test new classification approaches, and (iv) it contains few instances due to the short period of data collection and the difficulty of manually labeling the images.

A competition on the Kaggle InClass platform³ was configured using the proposed dataset. The competitors could opt to use all or some subsets of features in their solutions. The dataset was divided into two partitions (67-33%): training, with 26 meteors and 54 non-meteors; and testing, with 15 meteors and 27 non-meteors. The competitors had access only to the labels of the training data. The test partition was also randomly split into two other partitions of equal size. The participants received feedback about their performance on the first test partition (public leaderboard test set), while the performance on the second one (private leaderboard test set) remained secret until the end of the competition and was used for the final ranking of the submitted solutions, based on log-loss.

³ https://www.kaggle.com/c/can-i-make-a-wish-detecting-shooting-stars

Table 1. Feature sets extracted from the images of the meteor dataset.

Id    Feature set                                                                # features
FS1   Auto color correlogram [Huang et al. 1997]                                        768
FS2   Color and edge directivity descriptor [Chatzichristofis and Boutalis 2008a]       144
FS3   Color histogram [Novak and Shafer 1992]                                            64
FS4   Fuzzy color and texture histogram [Chatzichristofis and Boutalis 2008b]           192
FS5   Fuzzy color histogram [Han and Ma 2002]                                           125
FS6   Fuzzy opponent histogram [Van De Sande et al. 2010]                               576
FS7   Gabor [Fogel and Sagi 1989]                                                        60
FS8   Haralick [Haralick et al. 1973]                                                    14
FS9   Histogram [Scott 2010]                                                            256
FS10  Joint composite descriptor [Zagoris et al. 2010]                                  168
FS11  JPEG coefficient histogram [Sikora 2001]                                          192
FS12  Luminance layout descriptor [Sikora 2001]                                          64
FS13  MPEG7 color layout [Sikora 2001]                                                   33
FS14  MPEG7 edge histogram [Sikora 2001]                                                 80
FS15  Mean intensity local binary patterns [Ojala et al. 1994]                          256
FS16  Mean patch intensity histogram [Taylor and Drummond 2011]                         256
FS17  Moments [Abo-Zaid et al. 1988]                                                      4
FS18  Opponent histogram [van de Sande et al. 2004]                                      64
FS19  Pyramid histograms of oriented gradients [Bosch et al. 2007]                       40
FS20  Reference color similarity [Kriegel et al. 2011]                                   77
FS21  Tamura [Tamura et al. 1978]                                                        18
Total                                                                                 3,451
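The partitioning scheme described above (a stratified 67-33% train/test split, with the test set halved into public and private leaderboard parts) can be sketched as follows. The arrays are random stand-ins for the real feature matrix, and the per-class counts in each partition depend on the splitter's rounding:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Random stand-in for the real data: 122 samples (41 meteors, 81 non-meteors).
rng = np.random.default_rng(0)
X = rng.normal(size=(122, 3451))
y = np.array([1] * 41 + [0] * 81)

# Stratified 67-33% split: 80 training and 42 test examples.
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=42, stratify=y, random_state=0)

# The test partition is halved again into public and private leaderboard sets.
X_pub, X_priv, y_pub, y_priv = train_test_split(
    X_te, y_te, test_size=0.5, stratify=y_te, random_state=0)

print(len(X_tr), len(X_pub), len(X_priv))  # 80 21 21
```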

In the following, we present the results obtained in experiments performed with this dataset using the same data partitions employed in the competition and also employing a repeated five-fold cross-validation on the whole dataset. The solution adopted by one of the winning teams is also compared under the two previous setups.

3. Experimental settings

In all experiments, we applied Z-score normalization using information from the training examples. Moreover, we replaced missing values with the average of the corresponding feature in the training set.
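A minimal sketch of this preprocessing, on made-up arrays: missing values are filled with per-feature training means, and Z-scores use training statistics only, so no information leaks from the test set:

```python
import numpy as np

# Toy training and test matrices, each with a missing value (hypothetical).
X_train = np.array([[1.0, 2.0], [3.0, np.nan], [5.0, 6.0]])
X_test = np.array([[np.nan, 4.0]])

# Missing values are replaced by the training-set mean of each feature.
col_mean = np.nanmean(X_train, axis=0)
X_train = np.where(np.isnan(X_train), col_mean, X_train)
X_test = np.where(np.isnan(X_test), col_mean, X_test)

# Z-score normalization with statistics computed on the training set only.
mu, sigma = X_train.mean(axis=0), X_train.std(axis=0)
X_train_z = (X_train - mu) / sigma
X_test_z = (X_test - mu) / sigma
print(X_train_z.mean(axis=0))  # ~[0, 0]
```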

As the dataset is imbalanced, we first performed experiments with the original class distribution and later using the same number of instances in each class of the training set. To balance the classes, we used the synthetic minority over-sampling technique (SMOTE) [Chawla et al. 2002]. We chose a method of oversampling rather than undersampling because of the small number of instances in the proposed dataset. An undersampling technique could affect the learning process of the classifiers due to the lack of training instances.
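The core idea of SMOTE [Chawla et al. 2002] can be illustrated with a simplified sketch (the experiments would typically rely on a library implementation such as imbalanced-learn's SMOTE): each synthetic minority example is an interpolation between a minority point and one of its k nearest minority-class neighbors.

```python
import numpy as np

def smote_sketch(X_min, n_new, k=3, rng=None):
    """Simplified SMOTE: new samples lie on segments between a minority
    example and one of its k nearest minority-class neighbors."""
    rng = rng if rng is not None else np.random.default_rng(0)
    synthetic = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))
        dist = np.linalg.norm(X_min - X_min[i], axis=1)
        neighbors = np.argsort(dist)[1:k + 1]   # skip the point itself
        j = rng.choice(neighbors)
        gap = rng.random()                      # interpolation factor in [0, 1)
        synthetic.append(X_min[i] + gap * (X_min[j] - X_min[i]))
    return np.array(synthetic)

rng = np.random.default_rng(0)
X_minority = rng.normal(size=(26, 5))   # e.g. the 26 meteor training examples
X_new = smote_sketch(X_minority, n_new=54 - 26, rng=rng)
print(X_new.shape)  # (28, 5)
```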

3.1. Evaluations

In order to provide a baseline result, we first compared the performance obtained by the following established classification methods: Gaussian naïve Bayes (G.NB) [Metsis et al. 2006], logistic regression (LR) [Yu et al. 2011], support vector machines (SVM) [Boser et al. 1992, Cortes and Vapnik 1995], k-nearest neighbors (KNN) [Cover and Hart 1967], decision trees (DT) [Breiman et al. 1984], random forest (RF) [Breiman 2001], bootstrap aggregating (bagging) [Breiman 1996], and adaptive boosting (AdaBoost) [Freund and Schapire 1997]. These methods are widely used as baselines in several other studies.

We used the implementations of all methods from the scikit-learn library⁴ [Pedregosa et al. 2011]. As the performance of KNN, SVM, RF, bagging, and AdaBoost can be highly affected by the choice of parameters, we performed a grid search using five-fold cross-validation to find the best values for their main parameters. Table 2 presents the parameters and the range of values tested, where C is the cost for SVM, γ is the parameter of the RBF kernel, k corresponds to the number of neighbors for KNN, and |E| is the number of estimators used in RF, bagging, and AdaBoost. For the other methods, we used their default values.

Table 2. Parameters and range of values used in the grid-search.

Method    Parameter  Range

SVM       kernel     {linear, RBF}
          C          {2^-5, 2^-4, ..., 2^15}
          γ          {2^-15, 2^-14, ..., 2^3}
KNN       k          {5, 10, ..., 50}
RF        |E|        {10, 20, ..., 100}
Bagging   |E|        {10, 20, ..., 100}
AdaBoost  |E|        {10, 20, ..., 100}
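The grid search can be sketched with scikit-learn's GridSearchCV. The grid below covers only a few points from the SVM ranges of Table 2 so the example runs quickly, and the data is synthetic rather than the meteor dataset:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Synthetic stand-in data.
X, y = make_classification(n_samples=100, n_features=10, random_state=0)

# Subset of the SVM grid from Table 2 (gamma is ignored for the linear kernel).
param_grid = {
    "kernel": ["linear", "rbf"],
    "C": [2.0 ** e for e in (-5, 5, 15)],
    "gamma": [2.0 ** e for e in (-15, 0, 3)],
}
search = GridSearchCV(SVC(probability=True), param_grid,
                      cv=5, scoring="neg_log_loss")
search.fit(X, y)
print(search.best_params_)
```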

To compare the results, we employed the following well-known performance measures: log-loss and F-measure [Sokolova and Lapalme 2009, Ferri et al. 2009].
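Both measures are available in scikit-learn; a toy example with made-up labels and predicted probabilities:

```python
from sklearn.metrics import f1_score, log_loss

y_true = [1, 1, 0, 0]
y_prob = [0.9, 0.4, 0.2, 0.1]               # predicted P(meteor)
y_pred = [int(p >= 0.5) for p in y_prob]    # threshold at 0.5

print(round(log_loss(y_true, y_prob), 3))   # lower is better
print(round(f1_score(y_true, y_pred), 3))   # higher is better
```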

4. Results

Table 3 shows the average results obtained by each evaluated method in ten runs of stratified five-fold cross-validation. In these experiments, each training and test example was represented by a feature vector generated by combining the feature sets presented in Table 1. The results are sorted by the log-loss. The scores are presented as a grayscale heat map in which the better the score, the darker the cell color. The bold values indicate the best scores.
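The evaluation protocol (ten repetitions of stratified five-fold cross-validation, i.e. 50 fold scores averaged per method) can be sketched as follows, again on synthetic stand-in data:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score

X, y = make_classification(n_samples=122, n_features=20, random_state=0)

# Ten runs of stratified five-fold cross-validation: 50 folds in total.
cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=10, random_state=0)
scores = cross_val_score(LogisticRegression(max_iter=1000), X, y,
                         cv=cv, scoring="neg_log_loss")
print(len(scores), round(-scores.mean(), 3))  # average log-loss over 50 folds
```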

The best F-measure was 0.90, and the median over all experiments was 0.86. These values indicate that the sets of features used to represent the examples of the proposed dataset are sufficiently informative to distinguish most of the images containing a meteor from the ones that do not.

SVM obtained the best log-loss in both experiments. However, in the experiment with the original class distribution, the best F-measure was obtained by RF. In the experiment using the SMOTE technique to balance the class distribution, bagging obtained the best F-measure. The worst scores for all performance measures were obtained by G.NB in both experiments.

⁴ The scikit-learn library is available at: http://scikit-learn.org/stable/index.html

Table 3. Average results obtained by each method in ten runs of stratified five-fold cross-validation.

Method    Log-loss  F-measure

Original class distribution
SVM          0.272      0.890
AdaBoost     0.345      0.853
LR           0.468      0.846
RF           0.598      0.892
Bagging      0.740      0.885
KNN          0.893      0.872
DT           4.229      0.815
G.NB         8.815      0.614

Balanced class distribution (SMOTE)
SVM          0.256      0.884
AdaBoost     0.352      0.890
LR           0.475      0.839
RF           0.475      0.900
Bagging      0.596      0.874
KNN          1.237      0.797
DT           4.678      0.799
G.NB         8.837      0.612

The results also indicate that class balancing improved the performance of some methods but decreased the scores of others. For example, SVM, RF, and bagging obtained a better log-loss in the experiments using a balanced class distribution. However, AdaBoost, LR, KNN, DT, and G.NB obtained a better log-loss with the original class distribution. If we analyze the F-measure, we can see that the class balancing improved only the scores of AdaBoost and RF.

4.1. Competition

The proposed dataset was originally used in a competition hosted on the Kaggle platform, as described in Section 2. We evaluated the results of the classification methods using the same training and test partitions of the competition to analyze whether the results are in line with the ones obtained using stratified five-fold cross-validation.

In the competition, the solutions were evaluated using the log-loss. Table 4 presents the results obtained on the public leaderboard test set. The results obtained on the private leaderboard test set are shown in Table 5. The results are sorted by the log-loss and bold values indicate the best scores. Moreover, the scores are presented as a grayscale heat map in which the better the score, the darker the cell color.

In the experiment with the public leaderboard test set (Table 4) of the competition, SVM lost three positions in the ranking of log-loss scores in relation to the scores obtained in the experiments based on five-fold cross-validation. On the other hand, KNN gained six positions in the ranking by getting the best log-loss in the experiment with balanced class distribution.


Table 4. Results obtained by each method on the public leaderboard test set.

Method    Log-loss  F-measure

Original class distribution
LR           0.293      0.889
KNN          0.347      0.842
RF           0.420      0.889
SVM          0.449      0.889
AdaBoost     0.530      0.750
Bagging      1.849      0.889
DT           4.934      0.824
G.NB        11.513      0.462

Balanced class distribution (SMOTE)
KNN          0.262      0.800
SVM          0.275      0.889
LR           0.291      0.889
Bagging      0.372      0.889
RF           0.457      0.889
AdaBoost     0.476      0.667
DT           4.934      0.824
G.NB        11.513      0.462

Table 5. Results obtained by each method on the private leaderboard test set.

Method    Log-loss  F-measure

Original class distribution
RF           0.190      0.750
Bagging      0.198      0.750
AdaBoost     0.227      0.750
KNN          0.299      0.889
SVM          0.384      0.750
LR           1.449      0.667
DT           3.289      0.750
G.NB         4.934      0.667

Balanced class distribution (SMOTE)
RF           0.171      0.750
Bagging      0.179      0.750
KNN          0.220      0.800
AdaBoost     0.253      0.889
SVM          0.408      0.750
LR           1.491      0.667
DT           3.289      0.750
G.NB         4.934      0.667

In the experiments with the private leaderboard test set (Table 5), the ranking of the methods was also different from the ranking obtained in the experiment based on five-fold cross-validation. Moreover, in general, the F-measure of the methods on the public leaderboard test set was better than the results on the private leaderboard test set. Based on log-loss, the best method of the experiments with the public leaderboard test set (KNN) was the third best in the experiments with the private leaderboard test set.

These differences indicate that the best method of a competition hosted on Kaggle may not be the best method to solve the same problem in a real-world application, and that the results are largely biased towards the specific data partition left for testing. In the case of a dataset with few samples, such as the one used in this competition, this characteristic is accentuated, since a single prediction error causes a significant impact on the final score of a method.

4.2. Comparison of baseline results with competition results

In this section, we present a comparison of the results obtained by the baseline methods with one of the winners of the competition. Briefly, one of the top ranked solutions submitted is a stacking of LR. It uses a meta-classifier that is trained with the probabilities given by individual models. Each individual model is trained with the training data represented by one of the 21 feature sets presented in Table 1. Figure 3 presents an overview diagram of this approach⁵.

Figure 3. Overview diagram of the stacking approach that won the second place in the competition.

As shown in Figure 3, in the training stage, each training example is represented by 21 feature vectors (FS1, FS2, ..., FS21) based on the feature sets presented in Table 1. Afterwards, the SMOTE algorithm was applied to balance the class distribution. Then, one classification method (LR) was used to generate 21 predictive models (h1, h2, ..., h21), one for each type of feature.

All feature vectors of the balanced training set are also presented to the module of transformation of the training set. This module performs n rounds of training and classification, where n is the number of examples in the training set. In each round, it creates 21 predictive models using LR, one for each set of feature vectors. In the j-th round, the j-th training example is classified by the 21 models trained with the other examples. Then, a new feature vector is created with 21 dimensions, where the i-th element of the vector is the probability of the example being a meteor given by the i-th predictive model. The new feature vectors generated by the module of transformation of the training set are used to train a classification method (LR) that generates a meta-classifier h_prob.

⁵ The source code is publicly available at https://github.com/renatoms88/KDDBR, accessed on Aug. 11, 2018.

In the test stage, an unseen example is also represented by 21 feature vectors (FS1, FS2, ..., FS21) based on the feature sets presented in Table 1. The i-th feature vector is presented to the predictive model hi. Then, a new feature vector is created, where the i-th element is the probability of the example being a meteor given by the i-th model. This new feature vector is classified by the meta-classifier h_prob, which returns the values of p(0) (probability of the example not being a meteor) and p(1) (probability of the example being a meteor).
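A condensed sketch of this stacking scheme, with three synthetic feature sets standing in for the 21 of Table 1 and five-fold out-of-fold predictions standing in for the paper's n leave-one-out training rounds:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=80)
# Three synthetic feature sets (the paper uses the 21 sets of Table 1).
feature_sets = [rng.normal(size=(80, d)) + y[:, None] for d in (5, 8, 3)]

# Level 0: one LR per feature set; its out-of-fold P(meteor) for each
# training example becomes one dimension of the new feature vector.
meta_features = np.column_stack([
    cross_val_predict(LogisticRegression(max_iter=1000), X_fs, y,
                      cv=5, method="predict_proba")[:, 1]
    for X_fs in feature_sets
])

# Level 1: the meta-classifier h_prob is trained on the probability vectors.
h_prob = LogisticRegression(max_iter=1000).fit(meta_features, y)
print(h_prob.predict_proba(meta_features[:1]))  # [[p(0), p(1)]]
```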

Table 6 shows the results obtained by the approach presented in Figure 3. We present the average results obtained in ten runs of stratified five-fold cross-validation. We also present the performance obtained by the approach using the same training and test partitions of the competition.

Table 6. Results obtained by the stacking approach.

Validation                           Log-loss  F-measure

Ten runs of stratified five-fold CV     0.313      0.840
Public leaderboard test set             0.351      0.889
Private leaderboard test set            0.072      0.889

In the experiments based on five-fold cross-validation, the stacking approach obtained the second best log-loss. Moreover, its F-measure was smaller than the one obtained by SVM, AdaBoost, RF, bagging, LR, and KNN.

In the experiment with the private leaderboard test set, the log-loss of the stacking approach was significantly better than the log-loss obtained by all other methods. The best log-loss obtained by RF was 0.171 (Table 5), while the log-loss obtained by the stacking approach was 0.072 (Table 6). The F-measure of the stacking approach was also the best value in the experiments. However, in the experiment with the public leaderboard, the log-losses obtained by KNN, SVM, and LR were better than the one obtained by the stacking approach.

5. Conclusions

In this paper, we presented a new dataset of meteors that was first used in the 1st KDD-BR competition, along with a set of experiments. This dataset has great potential to become a baseline because it has challenging characteristics that can be used to test different capabilities of classification methods in future research, such as imbalanced data, presence of noise, high-dimensional data (21 feature sets, totalling 3,451 features), and few instances. We presented a comprehensive performance evaluation of different established machine learning algorithms on this dataset. The results obtained were compared with one of the winning approaches of the competition.



In general, the scores obtained by the evaluated methods were high, which indicates that the features used to represent the data are informative. We also observed that balancing the class distribution using the SMOTE technique did not, in general, improve the classification results. Moreover, the stacking approach, one of the best solutions in the competition, was the best evaluated method in this paper, since it obtained the best scores in the experiments with the training and testing partitions of the competition and the second best log-loss in the experiments with repeated five-fold cross-validation.
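SMOTE balances the classes by synthesizing minority examples on the line segments between a minority sample and one of its nearest minority-class neighbours. A minimal sketch of that core idea (the `smote_like` helper is hypothetical, not the implementation used in the experiments):

```python
import numpy as np

def smote_like(X_min, n_new, k=5, rng=None):
    """Synthesize n_new minority examples by interpolating between a
    random minority sample and one of its k nearest minority neighbours."""
    rng = np.random.default_rng(0) if rng is None else rng
    out = []
    for _ in range(n_new):
        i = int(rng.integers(len(X_min)))
        d = np.linalg.norm(X_min - X_min[i], axis=1)
        nbrs = np.argsort(d)[1:k + 1]   # nearest neighbours, point itself skipped
        j = int(rng.choice(nbrs))
        gap = rng.random()              # random position along the segment
        out.append(X_min[i] + gap * (X_min[j] - X_min[i]))
    return np.array(out)

rng = np.random.default_rng(0)
X_min = rng.normal(size=(20, 3))        # toy minority-class samples
synth = smote_like(X_min, n_new=30, rng=rng)
```

Because each synthetic point is a convex combination of two real minority samples, the new examples stay inside the minority class's region of the feature space.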

The results obtained in ten runs of five-fold cross-validation were different from the ones obtained using the original training and testing partitions of the competition. These differences reveal that the validation strategy used by the Kaggle platform may not properly represent real-world scenarios, especially when the number of samples is small.

Acknowledgments

We gratefully acknowledge the financial support from the Brazilian agencies FAPESP (grants #2012/22608-8, #2017/09387-6, #2018/02146-6) and CAPES (grant #1797700). We would also like to thank Daniel dos Santos Kaster, Elaine Ribeiro de Faria, Francisco Carlos Rocha Fernandes, Irapuan Rodrigues, Jennifer Nielsen, Ricardo Cerri, and Vinicius Veloso de Melo, who were also involved in the organization of the competition.
