
University of Groningen

Deep learning and hyperspectral imaging for unmanned aerial vehicles

Dijkstra, Klaas

DOI: 10.33612/diss.131754011

IMPORTANT NOTE: You are advised to consult the publisher's version (publisher's PDF) if you wish to cite from it. Please check the document version below.

Document Version: Publisher's PDF, also known as Version of record

Publication date: 2020

Link to publication in University of Groningen/UMCG research database

Citation for published version (APA): Dijkstra, K. (2020). Deep learning and hyperspectral imaging for unmanned aerial vehicles: Combining convolutional neural networks with traditional computer vision paradigms. University of Groningen. https://doi.org/10.33612/diss.131754011

Copyright: Other than for strictly personal use, it is not permitted to download or to forward/distribute the text or part of it without the consent of the author(s) and/or copyright holder(s), unless the work is under an open content license (like Creative Commons).

The publication may also be distributed here under the terms of Article 25fa of the Dutch Copyright Act, indicated by the "Taverne" license. More information can be found on the University of Groningen website: https://www.rug.nl/library/open-access/self-archiving-pure/taverne-amendment.

Take-down policy: If you believe that this document breaches copyright please contact us providing details, and we will remove access to the work immediately and investigate your claim.

Downloaded from the University of Groningen/UMCG research database (Pure): http://www.rug.nl/research/portal. For technical reasons the number of authors shown on this cover page is limited to 10 maximum.

Download date: 28-11-2021


Chapter 6

Discussion and conclusion

In the first part of this chapter the answers to the research questions are discussed. The second part discusses the usefulness of combining deep learning with traditional computer vision paradigms. The final part discusses the envisioned path for the future based on the conclusions from this dissertation.

The main research question addressed in this dissertation is: “How can deep learning be utilized to mitigate the limitations imposed by small aerial platforms employing hyperspectral imaging technology?” It should be clear that it is virtually impossible to overcome all limitations imposed by small aerial platforms when using hyperspectral imaging devices. Many of the limitations are a result of the underlying physics or are caused by theoretical limits. However, in this dissertation, solutions to mitigate several of the limitations have been demonstrated and discussed.

6.1 Research questions

A methodology for selecting the optimal spectral bands from a hyperspectral cube for distinguishing two potato diseases on leaves has been proposed in Chapter 2. A per-pixel classification of potato leaves was performed using several classifiers, and it was demonstrated, using feature selection and extraction, that a subset of three bands still provides useful results. This showed how the laboratory setup of a 28-band imaging system could be reduced to a three-band system while still retaining sufficient accuracy. The Liquid Crystal Tunable Filter (LCTF) used for collecting the hyperspectral images is difficult to use on an Unmanned Aerial Vehicle (UAV) because of its temporal instability (each hyperspectral plane is taken at a different time). However, when reduced to a three-band camera system, it is suitable for usage on a UAV, for example by using three separate cameras.

Prior to this research, distinguishing Alternaria and ozone damage was difficult because of visual similarities between the diseases (Turkensteen et al., 2010). This research contributes by providing a machine-learning-based methodology for distinguishing Alternaria and ozone damage on potato plant leaves in laboratory conditions using hyperspectral imaging. Other studies used commodity multi-spectral cameras or regular RGB cameras for crop-health monitoring on UAVs (Mohanty et al., 2016; Rebetez et al., 2016). These cameras are limited with respect to the spectral resolution, and the spectral bands that are captured are intended for measuring the chlorophyll content of vegetation (Berra et al., 2017). This research contributes by demonstrating various methods for selecting important subsets of spectral bands from a hyperspectral image cube. This gives information on which bands need to be captured for detecting specific diseases like Alternaria and ozone. This methodology provides an answer to the questions: “How can machine learning be used to automatically select important spectral bands?” and “What is the subsequent effect of using fewer bands on the performance of the posed problem?”.
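To make the band-selection idea concrete, the following minimal sketch illustrates it with a generic wrapper-style feature selector. The data, the logistic-regression classifier and all names are illustrative stand-ins, not the exact classifiers or selection procedure used in Chapter 2.

import numpy as np
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.random((1000, 28))                            # 1000 pixels, 28 spectral bands (stand-in data)
y = (X[:, 5] + 0.8 * X[:, 12] > 0.9).astype(int)      # toy labels driven by two bands

clf = LogisticRegression(max_iter=1000)
selector = SequentialFeatureSelector(clf, n_features_to_select=3, cv=5)
selector.fit(X, y)

bands = np.flatnonzero(selector.get_support())        # indices of the retained bands
score = cross_val_score(clf, X[:, bands], y, cv=5).mean()
print(f"selected bands: {bands}, accuracy with 3 bands: {score:.2f}")

The same wrapper pattern applies to any per-pixel classifier: the selector repeatedly retrains the model on candidate band subsets and keeps the subset that best preserves classification accuracy.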

The usage of a 16-band light-weight hyperspectral camera system with a Multispectral Color Filter Array (MCFA) (Geelen et al., 2015), mounted on a UAV, has been proposed in this dissertation. The hyperspectral images produced by this system suffer from crosstalk, and their spatial resolution is reduced sixteenfold to accommodate the increase in spectral resolution (Keren and Osadchy, 1999). Several methods for demosaicking images have been proposed in the literature: using edge information, neural networks, inpainting and linear models (Monno et al., 2012; Wang, 2014; Wang et al., 2017; Aggarwal and Majumdar, 2014). This research contributes by proposing a deep-learning-based method for demosaicking a 4 × 4 image mosaic and simultaneously reducing crosstalk between spectral bands. A custom Convolutional Neural Network (CNN) has specifically been designed to reduce crosstalk and to increase the spatial resolution of the images created by this type of hyperspectral camera system. Usually crosstalk (Hirakawa, 2008) is considered detrimental. However, this research contributes by observing that the spatial and spectral correlations in the hyperspectral information actually benefit the upscaling process. This way of using deep learning as a signal-processing method demonstrates an answer to the question: “How can the quality and resolution of hyperspectral images be improved using deep learning?”
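To make the sixteenfold trade-off concrete, the sketch below (assuming an idealized 4 × 4 filter layout without crosstalk) rearranges a raw mosaic frame into a 16-channel cube. Each band ends up at one quarter of the resolution in each spatial dimension, which is exactly the loss that the proposed CNN learns to undo.

import numpy as np

def mosaic_to_cube(raw: np.ndarray, pattern: int = 4) -> np.ndarray:
    """Rearrange a (H, W) mosaic frame into a (pattern**2, H//pattern, W//pattern) cube."""
    h, w = raw.shape
    assert h % pattern == 0 and w % pattern == 0
    # Split into mosaic cells, then group pixels by their position inside the cell.
    cube = raw.reshape(h // pattern, pattern, w // pattern, pattern)
    return cube.transpose(1, 3, 0, 2).reshape(pattern**2, h // pattern, w // pattern)

raw = np.arange(8 * 8, dtype=np.float32).reshape(8, 8)  # toy 8x8 sensor frame
cube = mosaic_to_cube(raw)
print(cube.shape)  # (16, 2, 2): 16 bands, each with 1/16 of the original pixel count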

A notoriously difficult challenge in computer vision is to detect objects which are connected in the image as separate objects. In this dissertation two versions of the deep learning architecture CentroidNet have been introduced specifically for this task. The architecture was tested on aerial images of potato crops that were collected using a low-cost commodity UAV. This research contributes by showing that our approach achieves a higher F1 score for counting potato crops compared to other one-stage object detection models in the literature (Redmon and Farhadi, 2017; Lin et al., 2017c). It was found that the CentroidNet architecture is particularly suitable for counting small and connected objects as individuals by estimating their centroids. This provides an answer to the question: “How can deep learning be used to solve a challenging image-processing task using images produced by low-cost commodity UAVs?”

The trinity of technologies has provided an interesting framework for focusing the research on mitigating several inherent limitations of UAVs and hyperspectral cameras by using deep learning. However, the results of this research are not only relevant within this framework of thought. Spectral-band selection can be applied in any hyperspectral application to add efficiency for small platforms. Similarly, it has been shown that CentroidNet is applicable in a broader range of applications like cell-nuclei counting and bacterial-colony counting. CentroidNet shows increased or on-par performance compared to state-of-the-art instance segmentation methods (Redmon and Farhadi, 2018; He et al., 2017). Furthermore, the research in this dissertation contributes to a more fundamental insight into the relation between computer vision and deep learning, which is discussed in the next section.

6.2 Computer vision and deep learning

The framework of this research is the trinity of the technologies mentioned in the main title of this dissertation and discussed in the previous section. The subtitle relates to the insight into the relation between computer vision and deep learning gained during this research. During experimentation within this application framework, the particular usefulness of combining these two technologies became clear. This section elaborates on this conclusion.

Nowadays many deep learning architectures exist and their number is still growing. Most of these have been designed to solve specific technical shortcomings of predecessor CNN models. For example, vanishing gradients are addressed by ResNet (Szegedy et al., 2017), spatial resolution problems are mitigated by U-Net (Ronneberger et al., 2015) and computational complexity is decreased by Xception (Chollet, 2017). These architectures are general, complex and applicable in many areas. This makes them very widespread and extremely successful regardless of the application. The question arises whether, for some applications, simpler, less computationally complex models could suffice. This could potentially reduce the run time, the amount of data needed and even the ecological footprint of deep learning for specific applications.

General architectures can learn to model highly complex functions, and generalization methods like weight decay are used to forget details during training, which in turn reduces the complexity of the model and makes it perform better on specific applications. If the model of a task is partly known and can be represented by a custom CNN architecture, then this CNN is more suited to the task it was designed for. This method of incorporating prior knowledge results in a simpler model which is less prone to overfitting. In this dissertation a CNN for crosstalk correction and upscaling was designed for images produced by a specific hyperspectral camera. Prior knowledge about the structure and size of the MCFA, or mosaic, of the hyperspectral camera is used to set the hyperparameters of the underlying CNN model, including the convolution filter sizes, amounts and strides. In traditional computer vision a solution would be engineered for a specific application and most parameters would be set manually. In the proposed approach a solution is designed in terms of specific convolutions and its parameters are trained end-to-end using deep learning. This successful combination of computer vision and deep learning to reduce model complexity for a certain task has been discussed in Chapter 3.
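As an illustration of how such prior knowledge can fix the hyperparameters, the following PyTorch sketch uses the 4 × 4 mosaic period to set the kernel size and stride of the first convolution and a matching sub-pixel upscaling factor. It follows the spirit of the design in Chapter 3 but is not the exact architecture from the dissertation.

import torch
import torch.nn as nn

class MosaicDemosaicker(nn.Module):
    def __init__(self, pattern: int = 4, bands: int = 16, hidden: int = 64):
        super().__init__()
        # Stride-4 convolution: exactly one feature vector per 4x4 mosaic cell.
        self.encode = nn.Conv2d(1, hidden, kernel_size=pattern, stride=pattern)
        # Mix neighboring cells; spectral/spatial crosstalk correction is learned here.
        self.mix = nn.Conv2d(hidden, hidden, kernel_size=3, padding=1)
        # Predict bands * pattern^2 channels, then pixel-shuffle back to full resolution.
        self.expand = nn.Conv2d(hidden, bands * pattern**2, kernel_size=1)
        self.up = nn.PixelShuffle(pattern)
        self.act = nn.ReLU()

    def forward(self, raw: torch.Tensor) -> torch.Tensor:
        x = self.act(self.encode(raw))
        x = self.act(self.mix(x))
        return self.up(self.expand(x))  # (N, bands, H, W) hyperspectral cube

net = MosaicDemosaicker()
cube = net(torch.rand(1, 1, 64, 64))  # raw mosaic in, full-resolution cube out
print(cube.shape)                     # torch.Size([1, 16, 64, 64])

The point of the sketch is that the mosaic geometry, not a hyperparameter search, dictates the kernel size, stride and upscaling factor; only the filter weights are left to be learned from data.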

Deep learning has endeavored to remove the need for manual feature design and aims to learn solutions to practical problems in an end-to-end fashion without the need for traditional computer vision (LeCun et al., 1998; Krizhevsky et al., 2012). In this dissertation the usefulness of combining computer vision and deep learning has been shown by means of CentroidNet. The input image is preprocessed using a CNN so it can be easily processed by traditional computer vision algorithms to segment object instances. Where in the traditional tandem computer vision is used to preprocess image information (Bougharriou et al., 2017; Suleiman and Sze, 2014), in the CentroidNet algorithm traditional computer vision is used to postprocess information. This could be generalized to situations where, in fully-engineered computer vision solutions, certain elements can be replaced with CNNs. With this method, difficult parts of the vision pipeline can be trained directly from data to improve the overall system performance. Difficult parts in this context are elements which are hard to parameterize and which can be represented by CNNs. With this hybrid approach, prior knowledge of the application is retained in the computer vision parts. This combination of computer vision and deep learning was successfully demonstrated in Chapter 4 and Chapter 5.

At the core of CentroidNet is the notion of trained 2D vectors. For each pixel in the image the CNN predicts two vectors: one vector points to the nearest centroid and the other points to the nearest border of the object with the nearest centroid. Each vector can be considered a vote, and by aggregating votes the location of the centroid and the delineation of the object are determined. This can be viewed as a variant of majority voting, similar to a Hough transform or an ensemble of detectors. This approach has been shown to be successful in other situations (Mukhopadhyay and Chaudhuri, 2015; Viola and Jones, 2001) and also contributes to the workings of CentroidNet. In the ablation studies of CentroidNetV2 it was found that the loss function that was designed to better reflect the nature of the outputs of CentroidNet consistently achieved better results in predicting object instances. This shows that, when designing hybrid CNN models like CentroidNet, in which the nature of the output of the algorithm changes, other parts of the training process should be redesigned accordingly. This was discussed in Chapter 5.
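The voting step can be sketched as follows. The shapes and the simple accumulator are illustrative and not CentroidNet's actual implementation, but they show how per-pixel vectors turn into Hough-style votes whose peaks mark centroids.

import numpy as np

def accumulate_votes(vectors: np.ndarray) -> np.ndarray:
    """vectors: (H, W, 2) per-pixel (dy, dx) offsets predicted by the CNN."""
    h, w, _ = vectors.shape
    votes = np.zeros((h, w), dtype=np.int64)
    for y in range(h):
        for x in range(w):
            ty = int(round(y + vectors[y, x, 0]))  # target row of this pixel's vote
            tx = int(round(x + vectors[y, x, 1]))  # target column of this pixel's vote
            if 0 <= ty < h and 0 <= tx < w:
                votes[ty, tx] += 1                 # one vote per pixel, Hough-style
    return votes

# Toy example: every vector points at (5, 5), so the vote image peaks there.
vecs = np.zeros((10, 10, 2))
for y in range(10):
    for x in range(10):
        vecs[y, x] = (5 - y, 5 - x)
votes = accumulate_votes(vecs)
print(np.unravel_index(votes.argmax(), votes.shape))  # (5, 5)

In the real algorithm the peaks of this vote image are extracted with deterministic computer vision (non-maximum suppression and thresholding), which is exactly the postprocessing role described above.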

In the work discussed in this dissertation, sometimes a relatively small number of images was used, or very deep neural network architectures with many parameters were used. It is important that the number of samples and the number of parameters in the deep learning network are balanced. Each architecture that has been used in this dissertation performs a pixel-to-pixel mapping of the input image to the output. In some cases a traditional sliding window is used (like in Chapter 2) and in other cases a fully convolutional neural network is used to provide this mapping (Chapters 3, 4 and 5). This means that a single image logically consists of a large number of samples from the perspective of the deep learning architecture. The number of samples from one image then depends on the size of the footprint of the model and the size of the input image. This is probably the reason why no overfitting was observed in the demosaicking experiments discussed in Chapter 3: in those experiments the footprint was small and the images were large.
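A back-of-the-envelope calculation (with our own illustrative numbers, not those of the dissertation's experiments) shows how quickly the sample count grows:

def samples_per_image(height: int, width: int, footprint: int) -> int:
    """Number of distinct footprint positions in one image (stride 1, no padding)."""
    return (height - footprint + 1) * (width - footprint + 1)

# A small footprint on a large image yields millions of effective samples
# from a single image, consistent with the absence of overfitting in Chapter 3.
print(samples_per_image(2048, 2048, 9))  # 4161600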

When taking into account the break-through research and the multitude of new applications of deep learning for many classic vision tasks, it may seem that the field of computer vision has mostly been superseded by deep learning. However, the experiments in this dissertation have shown that designing custom deep learning algorithms for specific computer vision tasks, and combining both technologies, clearly provides an interesting direction for future research into deep learning.

6.3 Future work

Research into other challenges posed by the trinity of deep learning, hyperspectral imaging and UAVs can be performed in the future. In one direction, research could primarily focus on elements where all three technologies need to be combined. Alternatively, the definition of these three areas could be further broadened. For example, instead of deep learning, a broader range of machine learning and artificial intelligence concepts could be included. Additionally, the notion of hyperspectral imaging can be expanded to encompass multiple cameras to provide an even richer view of the electromagnetic spectrum to the deep learning approaches. For example, hyperspectral color cameras, short-wave infrared cameras and thermal cameras could be combined. Along this line of thought the image sources can be expanded in multiple dimensions, including 3D imaging and video. In future research, UAVs could be synonymous with “small platforms”, both in weight and in computing power. This could include the small deployment platforms used in edge computing like the Jetson series, Coral or Movidius (nvidia.com, coral.ai, movidius.com). The main research question can be restated to encompass this broader context: “How can artificial intelligence be utilized to mitigate the limitations imposed by small platforms that collect and process images from sources with high spatial, spectral and temporal dimensionality?”

In the next part of this section, future research directions for the specific research topics discussed in this dissertation are proposed. In the concluding part, possible future research into the combination of computer vision and deep learning is discussed.

From the 28-dimensional hyperspectral cube, three channels were identified for distinguishing between Alternaria and ozone damage on potato-plant leaves in laboratory conditions. Future research could focus on testing this approach on potato plants in agricultural fields using UAVs. Additionally, using a similar methodology, research could focus on identifying other crop diseases like Phytophthora. Deep learning could then also be used to exploit key morphological features of the brownish lesions specific to certain crop diseases.

Filter mosaics are applied in many ways by sensor manufacturers. This includes regular Red Green Blue (RGB) Bayer filters and hyperspectral MCFA sensors like the one used in this research, but per-line filter approaches are also available. Recent camera sensors even use a mosaic of polarization filters combined with RGB filters. Future research could focus on crosstalk correction and upscaling within this family of mosaic filter patterns using the approaches discussed in Chapter 3. Additionally, research could focus on using deep learning for upscaling beyond the native resolution of the imaging sensor using Hyperspectral Single Image Super Resolution (HSISR).

CentroidNet is a hybrid CNN for instance segmentation that is particularly suitable for counting objects in images. Future research could focus on replacing other parts of the algorithm with a CNN. For example, the voting space might be postprocessed with an additional deep learning module to provide clearer centroids which are easier to locate with traditional computer vision methods. This would reduce the number of hyperparameters while still retaining the core design of centroid and border majority voting. Future research could also investigate how, in other traditional computer vision methods, parts can be replaced with deep learning to form new hybrid algorithms.

This dissertation started with the classic tandem of feature extraction feeding machine learning models. Subsequently, the idea of designing custom CNN architectures for specific applications was discussed. The final part of this work focused on preprocessing images with CNNs and then using deterministic computer vision algorithms to do the final processing. As mentioned earlier, this can be generalized to replacing parts of computer vision programs by CNNs, which consequently allows these parts to be trained from data. This line of thought can be extended into a future where (parts of) traditional software algorithms are replaced by their trainable counterparts. This does not necessarily have to be limited to redefining software parts in terms of convolutions: when a software program is differentiable and there is enough training data available, it can likely be trained. This would instigate a paradigm shift in how scientific software is developed. Future research could focus on how to develop software that is differentiable in a general sense so it can be trained with gradient descent methods. This is already an active field of research called Differentiable Programming (∂P). In a ∂P system, graphs are directly represented by the source code of the software program and can sometimes be compiled and optimized directly. An introduction to ∂P is given by Innes et al. (2019). In recent work, Riba et al. (2019) provide a differentiable computer vision library. These examples indicate that, in the future, the defining differences between the fields of computer vision, deep learning and, ultimately, software and algorithm development will probably fade as an increasing amount of software is developed to be trainable with gradient descent methods.
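As a minimal illustration of this idea (our own toy example, unrelated to the cited libraries), a hand-tuned constant in a “traditional” pipeline becomes a trainable parameter once the pipeline is written differentiably:

import torch

threshold = torch.tensor(0.2, requires_grad=True)  # formerly a hand-tuned magic constant
opt = torch.optim.SGD([threshold], lr=0.1)

pixels = torch.linspace(0, 1, 100)
target = (pixels > 0.7).float()  # desired segmentation of a toy 1-D "image"

for _ in range(500):
    opt.zero_grad()
    pred = torch.sigmoid(10 * (pixels - threshold))  # soft, differentiable step function
    loss = torch.mean((pred - target) ** 2)
    loss.backward()
    opt.step()

print(float(threshold))  # converges close to 0.7: learned, not hand-tuned

The hard threshold is replaced by a sigmoid so the gradient can flow through the decision; the same substitution pattern applies to any non-differentiable step in a classic vision pipeline.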


Bibliography

Aggarwal H.K. and Majumdar A., Single-sensor multi-spectral image demosaicing algorithm using learned interpolation weights. International Geoscience and Remote Sensing Symposium (IGARSS) (IEEE 2014), (pp. 2011–2014).

Al-Waisy A.S., Qahwaji R., Ipson S. and Al-Fahdawi S., A multimodal deep learning framework using local feature representations for face recognition. Machine Vision and Applications, (2017) pp. 1–20.

Badrinarayanan V., Kendall A. and Cipolla R., SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation. Transactions on Pattern Analysis and Machine Intelligence 39(12), (2017) pp. 2481–2495.

Bai M. and Urtasun R., Deep watershed transform for instance segmentation. Conference on Computer Vision and Pattern Recognition (2017), (pp. 2858–2866).

Ballard D.H., Generalizing the Hough transform to detect arbitrary shapes. Pattern Recognition 13(2), (1981) pp. 111–122.

Bayer B.E., Color Imaging Array. U.S. patent 3,971,065 (1976), (p. 10).

Baygin M., Karakose M., Sarimaden A. and Akin E., An Image Processing based Object Counting Approach for Machine Vision Application. Conference on Advances and Innovations in Engineering (2018), (pp. 966–970).

Behmann J., Mahlein A.K., Paulus S., Dupuis J., Kuhlmann H., Oerke E.C. and Plümer L., Generation and application of hyperspectral 3D plant models: methods and challenges. Machine Vision and Applications 27(5), (2016) pp. 611–624.


Berra E.F., Gaulton R. and Barr S., Commercial Off-the-Shelf Digital Cameras on Unmanned Aerial Vehicles for Multitemporal Monitoring of Vegetation Reflectance and NDVI. Transactions on Geoscience and Remote Sensing, volume 55 (2017), (pp. 4878–4886).

Bishop C.M., Pattern Recognition and Machine Learning (Information Science and Statistics) (Springer-Verlag, Berlin, Heidelberg 2006).

Bougharriou S., Hamdaoui F. and Mtibaa A., Linear SVM classifier based HOG car detection. 2017 18th International Conference on Sciences and Techniques of Automatic Control and Computer Engineering (STA) (IEEE 2017), (pp. 241–245).

Chang C.C. and Lin C.J., LIBSVM: A library for support vector machines. ACM Transactions on Intelligent Systems and Technology 2, (2011) pp. 1–27.

Cheema G.S. and Anand S., Automatic detection and recognition of individuals in patterned species. Joint European Conference on Machine Learning and Knowledge Discovery in Databases (Springer 2017), (pp. 27–38).

Chen B. and Miao X., Distribution line pole detection and counting based on YOLO using UAV inspection line video. Journal of Electrical Engineering & Technology, (2019) pp. 1–8.

Chen H., Qi X., Yu L. and Heng P.A., DCAN: deep contour-aware networks for accurate gland segmentation. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2016a), (pp. 2487–2496).

Chen L.C., Papandreou G., Kokkinos I., Murphy K. and Yuille A.L., DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs. International Conference on Learning Representations, (2016b) pp. 834–848.

Chen L.C., Zhu Y., Papandreou G., Schroff F. and Adam H., Encoder-decoder with atrous separable convolution for semantic image segmentation. European Conference on Computer Vision (2018), (pp. 801–818).


Chollet F., Xception: Deep learning with depthwise separable convolutions. Conference on Computer Vision and Pattern Recognition (2017), (pp. 1800–1807).

Cohen J.P., Boucher G., Glastonbury C.A., Lo H.Z. and Bengio Y., Count-ception: Counting by Fully Convolutional Redundant Counting. International Conference on Computer Vision (2017), (pp. 18–26).

Cortes C. and Vapnik V., Support-vector networks. Machine Learning 20(3), (1995) pp. 273–297.

Couprie C., Farabet C., Najman L. and Lecun Y., Convolutional nets and watershed cuts for real-time semantic labeling of RGBD videos. Journal of Machine Learning Research 15(1), (2014) pp. 3489–3511.

Dai F., Liu H., Ma Y., Cao J., Zhao Q. and Zhang Y., Dense scale network for crowd counting. arXiv preprint arXiv:1906.09707 (2019).

Dalal N. and Triggs B., Histograms of oriented gradients for human detection. Conference on Computer Vision and Pattern Recognition (2005), (pp. 886–893).

de Boer J., Barbany M.J., Dijkstra M.R., Dijkstra K. and van de Loosdrecht J., Twirre V2: Evolution of an architecture for automated mini-UAVs using interchangeable commodity components. International Micro Air Vehicle Conference and Competition (2015).

Degraux K., Cambareri V., Jacques L., Geelen B., Blanch C. and Lafruit G., Generalized inpainting method for hyperspectral image acquisition. Proceedings - International Conference on Image Processing, ICIP, volume 2015 (2015), (pp. 315–319).

Deng L., Li J., Huang J.T., Yao K., Yu D., Seide F., Seltzer M., Zweig G., He X., Williams J. et al., Recent advances in deep learning for speech research at Microsoft. 2013 IEEE International Conference on Acoustics, Speech and Signal Processing (IEEE 2013), (pp. 8604–8608).

Dijkstra K., van de Loosdrecht J., Schomaker L.R. and Wiering M.A., CentroidNet: A deep neural network for joint object localization and counting. European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (2018a), (pp. 585–601).


Dijkstra K., van de Loosdrecht J., Schomaker L.R. and Wiering M.A., Hyperspectral demosaicking and crosstalk correction using deep learning. Machine Vision and Applications 30(1), (2018b) pp. 1–21.

Dijkstra K., van de Loosdrecht J., Schomaker L.R.B. and Wiering M.A., Hyper-spectral frequency selection for the classification of vegetation diseases. European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (2017).

Dong C., Loy C.C., He K. and Tang X., Image Super-Resolution Using Deep Convolutional Networks. Transactions on Pattern Analysis and Machine Intelligence 38(2), (2016) pp. 295–307.

Eichhardt I., Chetverikov D. and Jankó Z., Image-guided ToF depth upsampling: a survey. Machine Vision and Applications 28(3-4), (2017) pp. 267–282.

Elsken T., Metzen J.H. and Hutter F., Neural Architecture Search: A Survey. Journal of Machine Learning Research 20(55), (2019) pp. 1–21.

Erhan D., Bengio Y., Courville A., Manzagol P.A., Vincent P. and Bengio S., Why Does Unsupervised Pre-training Help Deep Learning? Journal of Machine Learning Research 11, (2010) pp. 625–660.

Ferrari A., Lombardi S. and Signoroni A., Bacterial colony counting with Convolutional Neural Networks. Conference of the IEEE Engineering in Medicine and Biology Society (2015), (pp. 7458–7461).

Fukushima K., Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position. Biological Cybernetics 36(4), (1980) pp. 193–202.

Galteri L., Seidenari L., Bertini M. and Del Bimbo A., Deep Generative Adversarial Compression Artifact Removal. International Conference on Computer Vision, (2017) pp. 4826–4835.

Geelen B., Blanch C., Gonzalez P., Tack N. and Lambrechts A., A tiny VIS-NIR snapshot multispectral camera. Advanced Fabrication Technologies for Micro/Nano Optics and Photonics, volume 9374 (International Society for Optics and Photonics 2015).


Gharbi M., Chaurasia G., Paris S. and Durand F., Deep joint demosaicking and denoising. ACM Transactions on Graphics 35(6), (2016) pp. 1–12.

Goodfellow I., Bengio Y. and Courville A., Deep Learning (MIT Press 2016). URL: http://www.deeplearningbook.org

Goyal P., Dollár P., Girshick R., Noordhuis P., Wesolowski L., Kyrola A., Tulloch A., Jia Y. and He K., Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour. arXiv preprint arXiv:1706.02677 (2017).

Grigorescu S., Trasnea B., Cocias T. and Macesanu G., A survey of deep learning techniques for autonomous driving. Journal of Field Robotics 37(3), (2020) pp. 362–386.

Hallermann N. and Morgenthal G., Unmanned aerial vehicles (UAV) for the assessment of existing structures. IABSE Symposium Report, volume 101 (International Association for Bridge and Structural Engineering 2013), (pp. 1–8).

He K., Gkioxari G., Dollar P. and Girshick R., Mask R-CNN. Conference on Computer Vision and Pattern Recognition (2017), (pp. 2980–2988).

Hirakawa K., Cross-talk explained. Proceedings - International Conference on Image Processing, ICIP (IEEE 2008), (pp. 677–680).

Hopfield J.J., Neural networks and physical systems with emergent collective computational abilities. Proceedings of the National Academy of Sciences of the USA, volume 79 (1982), (pp. 2554–2558).

Hsieh M.R., Lin Y.L. and Hsu W.H., Drone-Based Object Counting by Spatially Regularized Regional Proposal Network. Conference on Computer Vision and Pattern Recognition (2017), (pp. 4165–4173).

Hubel D.H. and Wiesel T.N., Receptive fields of single neurones in the cat's striate cortex. The Journal of Physiology 148(3), (1959) pp. 574–591.

Innes M., Edelman A., Fischer K., Rackauckas C., Saba E., Shah V.B. and Tebbutt W., A differentiable programming system to bridge machine learning and scientific computing. arXiv preprint arXiv:1907.07587 (2019).


Jetley S., Sapienza M., Golodetz S. and Torr P.H., Straight to shapes: Real-time detection of encoded shapes. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2017), (pp. 6550–6559).

Jia Y., Shelhamer E., Donahue J., Karayev S., Long J., Girshick R., Guadarrama S. and Darrell T., Caffe: Convolutional architecture for fast feature embedding. International Conference on Multimedia (ACM 2014), (pp. 675–678).

Kainz P., Urschler M., Schulter S., Wohlhart P. and Lepetit V., You Should Use Regression to Detect Cells. International Conference on Medical Image Computing and Computer-Assisted Intervention (Springer, Cham 2015), (pp. 276–283).

Karras T., Laine S. and Aila T., A Style-Based Generator Architecture for Generative Adversarial Networks. Conference on Computer Vision and Pattern Recognition (2019), (pp. 4401–4410).

Kass M., Witkin A. and Terzopoulos D., Snakes: Active contour models. International Journal of Computer Vision 1(4), (1988) pp. 321–331.

Keren D. and Osadchy M., Restoring subsampled color images. Machine Vision and Applications 11(4), (1999) pp. 197–202.

Khan A.U.M., Torelli A., Wolf I. and Gretz N., AutoCellSeg: Robust automatic colony forming unit (CFU)/cell analysis using adaptive image segmentation and easy-to-use post-editing techniques. Nature Scientific Reports 8(1), (2018) p. 7302.

Krizhevsky A., Sutskever I. and Hinton G.E., ImageNet Classification with Deep Convolutional Neural Networks. Advances in Neural Information Processing Systems (2012), (pp. 1097–1105).

LeCun Y., Bengio Y. and Hinton G., Deep learning. Nature 521(7553), (2015) p. 436.

LeCun Y., Bottou L., Bengio Y. and Haffner P., Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11), (1998) pp. 2278–2324.


Lee H., Grosse R., Ranganath R. and Ng A.Y., Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations. International Conference on Machine Learning (ACM 2009), (pp. 609–616).

Li Y., Zhang X. and Chen D., CSRNet: Dilated Convolutional Neural Networks for Understanding the Highly Congested Scenes. Conference on Computer Vision and Pattern Recognition (2018), (pp. 1092–1100).

Liang X., Lin L., Wei Y., Shen X., Yang J. and Yan S., Proposal-free network for instance-level object segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence 40(12), (2017) pp. 2978–2991.

Lighthill J. (1973). Artificial intelligence: A general survey.

Lin H.W., Tegmark M. and Rolnick D., Why Does Deep and Cheap Learning Work So Well? Journal of Statistical Physics 168(6), (2017a) pp. 1223–1247.

Lin T.Y., Dollár P., Girshick R., He K., Hariharan B. and Belongie S., Feature pyramid networks for object detection. Conference on Computer Vision and Pattern Recognition (2017b), (pp. 2117–2125).

Lin T.Y., Goyal P., Girshick R., He K. and Dollár P., Focal loss for dense object detection. Proceedings of the IEEE International Conference on Computer Vision (2017c), (pp. 2980–2988).

Liu W., Anguelov D., Erhan D., Szegedy C., Reed S., Fu C.Y. and Berg A.C., SSD: Single shot multibox detector. European Conference on Computer Vision (Springer 2016), (pp. 21–37).

Long J., Shelhamer E. and Darrell T., Fully Convolutional Networks for Semantic Segmentation. Conference on Computer Vision and Pattern Recognition (2015), (pp. 3431–3440).

Lowe D.G., Three-dimensional object recognition from single two-dimensional images. Artificial Intelligence 31(3), (1987) pp. 355–395.

Lowe D.G., Object recognition from local scale-invariant features. International Conference on Computer Vision, volume 99 (1999), (pp. 1150–1157).


Luebke D., Harris M., Govindaraju N., Lefohn A., Houston M., Owens J., Segal M., Papakipos M. and Buck I., GPGPU: general-purpose computation on graphics hardware. Conference on Supercomputing (ACM 2006), (p. 208).

Marr D., Vision: A computational investigation into the human representation and processing of visual information (MIT Press 1982).

Mazur M., Six ways drones are revolutionizing agriculture. MIT Technology Review.

Mihoubi S., Losson O., Mathon B. and Macaire L., Multispectral demosaicing using intensity-based spectral correlation. International Conference on Image Processing, Theory, Tools and Applications (IEEE 2015), (pp. 461–466).

Milletari F., Ahmadi S.A., Kroll C., Plate A., Rozanski V., Maiostre J., Levin J., Dietrich O., Ertl-Wagner B., Bötzel K. and Navab N., Hough-CNN: Deep Learning for Segmentation of Deep Brain Regions in MRI and Ultrasound. Computer Vision and Image Understanding, (2017) pp. 92–102.

Minsky M. and Papert S., Perceptrons (MIT Press 1969).

Mohanty S.P., Hughes D.P. and Salathé M., Using deep learning for image-based plant disease detection. Frontiers in Plant Science 7, (2016) p. 1419.

Monno Y., Tanaka M. and Okutomi M., Multispectral demosaicking using guided filter. Digital Photography VIII, volume 8299 (SPIE 2012), (pp. 204–210).

Mukhopadhyay P. and Chaudhuri B.B., A survey of Hough Transform. Pattern Recognition 84(3), (2015) pp. 993–1010.

Mustaniemi J., Kannala J. and Heikkilä J., Parallax correction via disparity estimation in a multi-aperture camera. Machine Vision and Applications 27(8), (2016) pp. 1313–1323.

New York Times, New navy device learns by doing. New York Times, (1958) p. 23.


Nguyen H.T., Wistuba M. and Schmidt-Thieme L., Personalized tag recommendation for images using deep transfer learning. Joint European Conference on Machine Learning and Knowledge Discovery in Databases (Springer 2017), (pp. 705–720).

Özlü A. (2018). TensorFlow Object Counting API. URL: https://github.com/ahmetozlu/tensorflow_object_counting_api

O'Mahony N., Campbell S., Carvalho A., Harapanahalli S., Hernandez G.V., Krpalkova L., Riordan D. and Walsh J., Deep learning vs. traditional computer vision. Science and Information Conference (Springer 2019), (pp. 128–144).

Paredes J.A., Gonzalez J., Saito C. and Flores A., Multispectral imaging system with UAV integration capabilities for crop analysis. International Symposium of Geoscience and Remote Sensing (IEEE 2017), (pp. 1–4).

Pathak A.R., Pandey M. and Rautaray S., Application of Deep Learning for Object Detection. Procedia Computer Science 132, (2018) pp. 1706–1717.

Peng J., Hon B.Y.C. and Kong D., A structural low rank regularization method for single image super-resolution. Machine Vision and Applications 26(7-8), (2015) pp. 991–1005.

Pietikäinen M., Hadid A., Zhao G. and Ahonen T., Computer vision using local binary patterns, volume 40 (Springer Science & Business Media 2011).

Pullanagari R.R., Kereszturi G. and Yule I.J., Mapping of macro and micro nutrients of mixed pastures using airborne AisaFENIX hyperspectral imagery. Journal of Photogrammetry and Remote Sensing 117, (2016) pp. 1–10.

Radoglou-Grammatikis P., Sarigiannidis P., Lagkas T. and Moscholios I., A compilation of UAV applications for precision agriculture. Computer Networks 172, (2020) p. 107148.

Rahman M.A. and Wang Y., Optimizing intersection-over-union in deep neural networks for image segmentation. International Symposium on Visual Computing (2016), (pp. 234–244).


Rasmussen J., Ntakos G., Nielsen J., Svensgaard J., Poulsen R.N. and Christensen S., Are vegetation indices derived from consumer-grade cameras mounted on UAVs sufficiently reliable for assessing experimental plots? European Journal of Agronomy 74, (2016) pp. 75–92.

Rebetez J., Satizábal H.F., Mota M., Noll D., Büchi L., Wendling M., Cannelle B., Perez-Uribe A. and Burgos S., Augmenting a convolutional neural network with local histograms - a case study in crop classification from high-resolution UAV imagery. European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (2016).

Redmon J. and Farhadi A., YOLO9000: better, faster, stronger. Conference on Computer Vision and Pattern Recognition (2017), (pp. 7263–7271).

Redmon J. and Farhadi A., YOLOv3: An incremental improvement. arXiv preprint arXiv:1804.02767 (2018).

Ren M. and Zemel R.S., End-to-end instance segmentation with recurrent attention. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2017), (pp. 6656–6664).

Ren P., Fang W. and Djahel S., A novel YOLO-based real-time people counting approach. International Smart Cities Conference (ISC2) (IEEE 2017), (pp. 1–2).

Ren S., He K., Girshick R. and Sun J., Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. Advances in Neural Information Processing Systems (2015), (pp. 1137–1149).

Riba E., Mishkin D., Ponsa D., Rublee E. and Bradski G., Kornia: an open source differentiable computer vision library for PyTorch (2019).

Ronneberger O., Fischer P. and Brox T., U-Net: Convolutional networks for biomedical image segmentation. Conference on Medical Image Computing and Computer-Assisted Intervention (2015), (pp. 234–241).

Rosenblatt F., The Perceptron: A Probabilistic Model for Information Storage and Organization in The Brain. Psychological Review 65(6), (1958) pp. 386–408.


Sauget V. and Hubert M., Application note for CMS camera and CMS sensor users: Post-processing method for crosstalk reduction in multispectral data and images. Advanced Fabrication Technologies for Micro/Nano Optics and Photonics, (2016) pp. 1–8.

Schmidhuber J., Deep learning in neural networks: An overview. Neural Networks 61, (2015) pp. 85–117.

Schmidt U., Weigert M., Broaddus C. and Myers G., Cell detection with star-convex polygons. International Conference on Medical Image Computing and Computer-Assisted Intervention (Springer 2018), (pp. 265–273).

Shelhamer E., Long J. and Darrell T., Fully Convolutional Networks for Semantic Segmentation. Transactions on Pattern Analysis and Machine Intelligence 39(4), (2017) pp. 640–651.

Shi J. and Malik J., Normalized cuts and image segmentation. Pattern Analysis and Machine Intelligence, volume 22 (2000), (pp. 888–905).

Simic Milas A., Romanko M., Reil P., Abeysinghe T. and Marambe A., The importance of leaf area index in mapping chlorophyll content of corn under different agricultural treatments using UAV images. International Journal of Remote Sensing 39(15-16), (2018) pp. 5415–5431.

Srivastava N., Hinton G., Krizhevsky A., Sutskever I. and Salakhutdinov R., Dropout: A Simple Way to Prevent Neural Networks from Overfitting. Journal of Machine Learning Research 15, (2014) pp. 1929–1958.

Stahl T., Pintea S.L. and Van Gemert J.C., Divide and Count: Generic Object Counting by Image Divisions. IEEE Transactions on Image Processing 28, (2019) pp. 1035–1044.

Statt N., The AI boom is happening all over the world, and it's accelerating quickly. The Verge.

Suleiman A. and Sze V., Energy-efficient HOG-based object detection at 1080HD 60 fps with multi-scale support. 2014 IEEE Workshop on Signal Processing Systems (SiPS) (IEEE 2014), (pp. 1–6).


Szegedy C., Ioffe S., Vanhoucke V. and Alemi A.A., Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning. Conference on Artificial Intelligence (2017), (pp. 4278–4284).

Szegedy C., Liu W., Jia Y., Sermanet P., Reed S., Anguelov D., Erhan D., Vanhoucke V. and Rabinovich A., Going deeper with convolutions. Conference on Computer Vision and Pattern Recognition (2015), (pp. 1–9).

Thomas J. and Gausman H., Leaf reflectance vs. leaf chlorophyll and carotenoid concentrations for eight crops. Agronomy Journal 69(5), (1977) pp. 799–802.

Turkensteen L.J., Spoelder J. and Mulder A., Will the real Alternaria stand up please: Experiences with Alternaria-like diseases on potatoes during the 2009 growing season in The Netherlands. PPO-Special Report no. 14. Proc. 12th EuroBlight Workshop. France, (2010) pp. 165–170.

van Beers F., Lindstrom A., Okafor E. and Wiering M.A., Deep Neural Networks with Intersection over Union Loss for Binary Image Segmentation. Conference on Pattern Recognition Applications and Methods (2019), (pp. 438–445).

van de Loosdrecht J., Dijkstra K., Postma J.H., Keuning W. and Bruin D., Twirre: Architecture for autonomous mini-UAVs using interchangeable commodity components. IMAV 2014: International Micro Air Vehicle Conference and Competition (2014), (pp. 26–33).

Venables W.N. and Ripley B.D., Modern Applied Statistics with S, fourth edition (Springer 2002).

Viola P. and Jones M., Rapid object detection using a boosted cascade of simple features. Computer Vision and Pattern Recognition, volume 1 (IEEE 2001), (pp. 511–518).

Wan J., Luo W., Wu B., Chan A.B. and Liu W., Residual regression with semantic prior for crowd counting. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2019), (pp. 4036–4045).

Wang D., Yu G., Zhou X. and Wang C., Image demosaicking for Bayer-patterned CFA images using improved linear interpolation. 2017 Seventh International Conference on Information Science and Technology (ICIST) (2017), (pp. 464–469).

Wang T., Celik K. and Somani A.K., Characterization of mountain drainage patterns for GPS-denied UAS navigation augmentation. Machine Vision and Applications 27(1), (2016) pp. 87–101.

Wang Y.Q., A multilayer neural network for image demosaicking. International Conference on Image Processing (IEEE 2014), (pp. 1852–1856).

Wang Z., Bovik A.C., Sheikh H.R. and Simoncelli E.P., Image quality assessment: From error visibility to structural similarity. Transactions on Image Processing 13(4), (2004) pp. 600–612.

Wilkinson S., Mills G., Illidge R. and Davies W.J., How is ozone pollution reducing our food supply? Journal of Experimental Botany 63(2), (2012) pp. 527–536.

Wu Z., Shen C. and Hengel A.v.d., Bridging category-level and instance-level semantic image segmentation. arXiv preprint arXiv:1605.06885 (2016).

Xie W., Noble J.A. and Zisserman A., Microscopy cell counting and detection with fully convolutional regression networks. Computer Methods in Biomechanics and Biomedical Engineering: Imaging & Visualization 6(3), (2018) pp. 283–292.