ACTA UNIVERSITATIS UPSALIENSIS
UPPSALA 2019

Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology 1864

Image and Data Analysis for Biomedical Quantitative Microscopy

DAMIAN J. MATUSZEWSKI

ISSN 1651-6214
ISBN 978-91-513-0771-8
urn:nbn:se:uu:diva-394709



Dissertation presented at Uppsala University to be publicly examined in 2446, ITC, Polacksbacken, Lägerhyddsvägen 2, Uppsala, Thursday, 12 December 2019 at 10:15 for the degree of Doctor of Philosophy. The examination will be conducted in English. Faculty examiner: Professor Peter Horvath (Institute for Molecular Medicine Finland, University of Helsinki; Institute of Biochemistry, Biological Research Center, Hungarian Academy of Sciences).

Abstract
Matuszewski, D. J. 2019. Image and Data Analysis for Biomedical Quantitative Microscopy. Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology 1864. 63 pp. Uppsala: Acta Universitatis Upsaliensis. ISBN 978-91-513-0771-8.

This thesis presents automatic image and data analysis methods to facilitate and improve microscopy-based research and diagnosis. New technologies and computational tools are necessary for handling the ever-growing amounts of data produced in life science. The thesis presents methods developed in three projects with different biomedical applications.

In the first project, we analyzed a large high-content screen aimed at enabling personalized medicine for glioblastoma patients. We focused on capturing drug-induced cell-cycle disruption in fluorescence microscopy images of cancer cell cultures. Our main objectives were to identify drugs affecting the cell-cycle and to increase the understanding of different drugs' mechanisms of action. Here we present tools for automatic cell-cycle analysis and identification of drugs of interest and their effective doses.

In the second project, we developed a feature descriptor for image matching. Image matching is a central pre-processing step in many applications, for example, when two or more images must be matched and registered to create a larger field of view or to analyze differences and changes over time. Our descriptor is rotation-, scale-, and illumination-invariant, and its short feature vector makes it computationally attractive. The flexibility to combine it with any feature detector and its customization possibilities make it a very versatile tool.

In the third project, we addressed two general problems in bridging the gap between deep learning method development and use in practical scenarios. We developed a method for convolutional neural network training using minimally annotated images. In many biomedical applications, the objects of interest cannot be accurately delineated due to their fuzzy shape, ambiguous morphology, image quality, or the expert knowledge and time required. The minimal annotations, in this case, consist of center-points or centerlines of target objects of approximately known size. We demonstrated our training method in a challenging application: multi-class semantic segmentation of viruses in transmission electron microscopy images. We also systematically explored the influence of network architecture hyper-parameters on size and performance and showed that it is possible to substantially reduce the size of a network without compromising its performance.

All methods in this thesis were designed to work with little or no input from biomedical experts but, of course, require fine-tuning for new applications. The usefulness of the tools has been demonstrated by collaborators and other researchers and has inspired further development of related algorithms.

Keywords: high-content screening, drug selection, DNA content histogram, manual image annotation, deep learning, convolutional neural networks, hardware integration

Damian J. Matuszewski, Department of Information Technology, Division of Visual Information and Interaction, Box 337, Uppsala University, SE-751 05 Uppsala, Sweden.

© Damian J. Matuszewski 2019

ISSN 1651-6214
ISBN 978-91-513-0771-8
urn:nbn:se:uu:diva-394709 (http://urn.kb.se/resolve?urn=urn:nbn:se:uu:diva-394709)


To my family


List of Papers

This thesis is based on the following papers, which are referred to in the text by their Roman numerals.

I Matuszewski, D.J., Sintorn, I.-M., Puigvert, J.C., Wählby, C. (2016) Comparison of flow cytometry and image-based screening for cell cycle analysis. In Proc. of the 13th International Conference on Image Analysis and Recognition (ICIAR 2016), Póvoa de Varzim, Portugal: pp. 623–630. © 2016 Springer Intl. Publ. Switzerland. DOI: 10.1007/978-3-319-41501-7_70

II Matuszewski, D.J., Wählby, C., Puigvert, J.C., Sintorn, I.-M. (2016) PopulationProfiler: a tool for population analysis and visualization of image-based cell screening data. PLOS ONE, 11(3): e0151554. © 2016 Matuszewski et al. DOI: 10.1371/journal.pone.0151554

III Matuszewski, D.J., Wählby, C., Krona, C., Nelander, S., Sintorn, I.-M. (2018) Image-Based Detection of Patient-Specific Drug-Induced Cell-Cycle Effects in Glioblastoma. SLAS Discovery: Advancing Life Sciences R&D, 23(10): pp. 1030–1039. © 2018 Society for Laboratory Automation and Screening. DOI: 10.1177/2472555218791414

IV Matuszewski, D.J., Hast, A., Wählby, C., Sintorn, I.-M. (2017) A short feature vector for image matching: The Log-Polar Magnitude feature descriptor. PLOS ONE, 12(11): e0188496. © 2017 Matuszewski et al. DOI: 10.1371/journal.pone.0188496

V Matuszewski, D.J. and Sintorn, I.-M. (2018) Minimal annotation training for segmentation of microscopy images. In Proc. of the 15th IEEE International Symposium on Biomedical Imaging (ISBI 2018), Washington, USA: pp. 387–390. © 2018 IEEE. DOI: 10.1109/ISBI.2018.8363599

VI Matuszewski, D.J. and Sintorn, I.-M. (2019) Reducing the U-Net size for practical scenarios: Virus recognition in electron microscopy images. Computer Methods and Programs in Biomedicine, 178: pp. 21–39. © 2019 Elsevier B.V. DOI: 10.1016/j.cmpb.2019.05.026

Reprints were made with permission from the respective publishers.


For Paper I, the biological sample preparation, flow cytometry and image acquisition were performed by Puigvert, J.C. Matuszewski, D.J. designed and implemented the image-based screening data analysis. Puigvert, J.C. ran the image analysis, while Matuszewski, D.J. did the data post-processing and quantitative comparison of the two methods. Matuszewski, D.J. wrote the paper with input from the co-authors.

For Paper II, Matuszewski, D.J. designed and implemented PopulationProfiler. Matuszewski, D.J. and Puigvert, J.C. contributed equally to the software testing. Puigvert, J.C. provided the data for testing. Matuszewski, D.J. wrote the paper with input from the co-authors.

For Paper III, Krona, C. and Nelander, S. prepared the biological data and acquired the images. Wählby, C. and Sintorn, I.-M. designed and implemented the image analysis pipeline. Matuszewski, D.J. is the main contributor to the design and implementation of the image-based screening data analysis: automatic cell-cycle analysis, dose selection, and detection of the drugs of interest. Matuszewski, D.J. wrote the paper with input from the co-authors.

For Paper IV, Matuszewski, D.J. is the main contributor to the design and implementation of the algorithm modification that resulted in the presented method. The original algorithm was designed and implemented by Hast, A. Matuszewski, D.J. implemented and performed the quantitative evaluation of the method, and wrote the paper with input from the co-authors.

For Papers V and VI, Matuszewski, D.J. is the main contributor to the image analysis design and implementation, including the deep neural network design, implementation, training, and testing. Matuszewski, D.J. wrote the papers with input from Sintorn, I.-M.


Related Work

In addition to the papers included in this thesis, the author has also contributed to the following publications:

PEER REVIEWED

• Schmidt, L., Baskaran, S., Johansson, P., Padhan, N., Matuszewski, D.J., Green, L.C., Elfineh, L., Wee, S., Häggblad, M., Martens, U., Westermark, B., Forsberg-Nilsson, K., Uhrbom, L., Claesson-Welsh, L., Andäng, M., Sintorn, I.-M., Lundgren, B., Lönnstedt, I., Krona, C., Nelander, S. (2016). Case-specific potentiation of glioblastoma drugs by pterostilbene. Oncotarget, 7(45): pp. 73200–73215. DOI: 10.18632/oncotarget.12298

• Carreras-Puigvert, J., Zitnik, M., Jemth, A.-S., Carter, M., Unterlass, J.E., Hallström, B., Loseva, O., Karem, Z., Calderón-Montaño, J.M., Lindskog, C., Edqvist, P.-H., Matuszewski, D.J., Blal, H.A., Berntsson, R.P.A., Häggblad, M., Martens, U., Studham, M., Lundgren, B., Wählby, C., Sonnhammer, E.L.L., Lundberg, E., Stenmark, P., Zupan, B., Helleday, T. (2017). A comprehensive structural, biochemical and biological profiling of the human NUDIX hydrolase family. Nature Communications, 8(1): 1541. DOI: 10.1038/s41467-017-01642-w

• Gupta, A., Larsson, V., Matuszewski, D.J., Strömblad, S., Wählby, C. Weakly-supervised prediction of cell migration modes in confocal microscopy images using Bayesian deep learning. Submitted for conference publication. Preprint: bioRxiv. DOI: 10.1101/810473

OTHER

• Matuszewski, D.J., Sintorn, I.-M., Wählby, C., Schmidt, L., Baskaran, S., Elfineh, L., Nelander, S. Investigating phenotypic differences and drug response among glioblastoma stem cell cultures from patients. In Proc. of the Swedish Society for Automated Image Analysis (SSBA) Symposium on Image Analysis, Ystad, Sweden, 2015.

• Matuszewski, D.J., Sintorn, I.-M., Puigvert, J.C., Wählby, C. Comparing cell cycle analysis using flow cytometry and image-based screening. BioImage Informatics Conference, Gaithersburg, MD, USA, 2015.

• Matuszewski, D.J. and Sintorn, I.-M. Deep learning segmentation of biomedical images with minimal labeling. In Proc. of the Swedish Society for Automated Image Analysis (SSBA) Symposium on Image Analysis, Stockholm, Sweden, 2018.


Contents

1 Introduction
  1.1 Objectives and Motivation
  1.2 Thesis Outline
2 Background
  2.1 Microscopy Imaging
    2.1.1 Bright-field Microscopy
    2.1.2 Fluorescence Microscopy
    2.1.3 High-Content Screening
    2.1.4 Transmission Electron Microscopy
  2.2 Digital Image Analysis
    2.2.1 Traditional Image Analysis Workflow
    2.2.2 Deep Neural Networks
3 Projects and Contributions
  3.1 High-Content Screening Data Analysis (Papers I–III)
    3.1.1 Overview
    3.1.2 Data
    3.1.3 Methods
    3.1.4 Results and Main Conclusions
    3.1.5 Discussion
  3.2 Feature Descriptor for Image Matching (Paper IV)
    3.2.1 Overview
    3.2.2 Data
    3.2.3 Methods
    3.2.4 Results and Main Conclusions
    3.2.5 Discussion
  3.3 Object Recognition (Papers V–VI)
    3.3.1 Overview
    3.3.2 Data
    3.3.3 Methods
    3.3.4 Results and Main Conclusions
    3.3.5 Discussion
4 Concluding Remarks and Future Work
Summary in Swedish
Acknowledgments
References


Abbreviations

ANN – Artificial Neural Network
BLOF – Bagged Local Outlier Factor
CBA – Centre for Image Analysis at Uppsala University
CNN – Convolutional Neural Network
FC – Flow Cytometry
FFT – Fast Fourier Transform
HCS – High-Content Screening
IFC – Imaging Flow Cytometry
IQR – Interquartile Range
LPM – Log-Polar Magnitude feature descriptor
LPT – Log-Polar Transform
ReLU – Rectified Linear Unit
SELU – Scaled Exponential Linear Unit
SIFT – Scale-Invariant Feature Transform
SURF – Speeded-Up Robust Features
TEM – Transmission Electron Microscopy


1 Introduction

Every new beginning comes from some other beginning's end.

Seneca

We witness a rapid technological development that has resulted in many new imaging devices as well as their increasing availability, popularity, and automation. Nowadays every mobile phone is equipped with a digital camera and many laboratories have access to automated imaging systems in which robots help prepare, handle and take images of biomedical samples in large quantities. As a consequence, we have more digital images than ever and this number increases daily. While this abundance of image data allows for new discoveries and even further automation (e.g. automated image-based diagnosis and self-driving vehicles), the need for automatic, efficient and robust image analysis methods has increased proportionally to the number of acquired images. One of the largest limitations in the development of these methods is the lack of reliable annotations. To verify that a method works as expected, many images often have to be annotated by human operators. This is particularly problematic in biomedical applications, in which annotating images is often very costly because it typically takes long hours of tedious work and requires expert knowledge. To address this problem, the papers in this thesis describe automatic image analysis methods that require little or no image annotations or other input from human experts.

The work presented here was performed at the Centre for Image Analysis (CBA), Division of Visual Information and Interaction, Department of Information Technology, Uppsala University. CBA was established in 1988 and since then its members have been developing theoretical and application-based image analysis methods in various interdisciplinary projects involving a wide range of applications and modalities: from satellite and airborne images (for e.g. forestry and agriculture management), through historical document images (for e.g. text recognition and transcription), to various types of microscopy (for e.g. diagnosis and drug discovery). This thesis presents methods developed in three projects with different biomedical applications: 1) data analysis for image-based drug-candidate screening, 2) feature description for image matching, and 3) object recognition for classification of viruses.


1.1 Objectives and Motivation

The main objective of this thesis was to facilitate and improve microscopy-based biomedical research and diagnosis by developing automatic image analysis methods for a variety of quantitative microscopy applications.

Typically for science, the work presented in this thesis was inspired by and built on the work of other scientists. The first project in this thesis focuses on a large-scale quantitative cell-cycle analysis in fluorescence microscopy images of cancer cell cultures. The main goals of this project were 1) to increase the understanding of mechanisms of action of different drugs, 2) to advance the research on personalized medicine by studying drug responses in cell cultures established from glioblastoma patients, and 3) to develop tools for automatic cell-cycle analysis and selection of a drug of interest and its effective dose. Docent Ida-Maria Sintorn and Professor Carolina Wählby, supervisors of the author, also initiated their image processing careers with segmentation and analysis of cells in fluorescence images [1, 2] and set up the image analysis for this project before the author started his Ph.D. studies. The main contributions of the author in this project are methods for automatic dose and drug selection based on the measurements of segmented cancer cells, and PopulationProfiler, software for cell population analysis. The developed software has been used by the author's collaborators in other projects and resulted in, i.a., journal publications, e.g. [3, 4, 5].

The objective of the second project was to develop a light-weight scale- and rotation-invariant feature descriptor inspired by the methods developed by another CBA member, Professor Anders Hast [6, 7]. The main contributions of the author in this project are the modifications to the algorithm and their implementation, as well as a thorough evaluation of the new method. The proposed feature descriptor is particularly useful in matching, registration, and stitching of microscopy images, which are important image analysis steps in many biomedical applications, e.g. when the imaged object is too large to fit in a single field of view (stitching) or when the observation takes a longer time and the object has moved or changed between imaging sessions (registration). This work served as an inspiration for further development of Fourier-based feature descriptors applied in other CBA projects [8, 9, 10].

The third project used a transmission electron microscopy (TEM) image dataset created and described in an earlier CBA thesis by Ph.D. Gustaf Kylberg [11]. The main objective here was to extend the virus segmentation and classification proposed by Kylberg et al. [12, 13] with state-of-the-art convolutional neural networks (CNNs) allowing semantic segmentation (i.e., pixel-wise classification, or combined segmentation and classification without separating individual object instances). To achieve this, the author developed an innovative CNN training method that uses minimally annotated images. This can be particularly useful in applications in which full object annotations are not available or feasible to create. The last objective in this project was to explore the CNN architecture hyper-parameters to reduce the number of trainable weights and, as a consequence, make the network more suitable for use in consumer devices. This was inspired by the work of another CBA graduate, Ph.D. Sajith Sadanandan [14].

1.2 Thesis Outline

The subject of this thesis is computerized image processing, and therefore, its main focus is on the developed methods rather than on the different applications. Even though the methods were developed to solve specific problems, they can be adapted to and used in other applications. Hence, the concepts and methods are presented as generally as possible, with discussions of alternative options and the motivations behind specific decisions.

The images used in the development and evaluation of the methods were acquired by collaborators using two different microscopy techniques described briefly in Section 2.1. The image analysis principles and methods relevant to this work are presented in Section 2.2. Chapter 3 contains detailed descriptions of the author's contributions and recapitulates the included papers. Finally, Chapter 4 concludes the work presented here and poses questions for future investigations.


2 Background

The noblest pleasure is the joy of understanding.

Leonardo da Vinci

In this chapter, the image modalities and image analysis methods used in this work are briefly introduced. The majority of the image data in this thesis was acquired using fluorescence microscopy or transmission electron microscopy. The only exceptions are natural scene images (i.e., images taken with a common digital camera) used in Paper IV for testing the developed method on different datasets. The images were analyzed either with traditional image analysis methods (Papers I–IV) or with deep neural networks (Papers V and VI). In traditional image analysis, methods designed by human experts to solve specific (intermediary) problems are skillfully adapted and combined in an algorithm with the aim of achieving a specific application goal. A typical traditional image analysis workflow is described in Section 2.2.1. Deep neural networks can be designed to learn directly from an image dataset to solve the application problem with little or no image preprocessing. Deep neural networks are described in Section 2.2.2.

2.1 Microscopy Imaging

It is a capital mistake to theorize before one has data.

Sherlock Holmes

The methods presented in this thesis were developed for applications using specific types of microscopy images. Bright-field microscopy was not used in this work but is included here for comparison with the other techniques. Sample preparation, imaging and manual annotation of the images were performed by expert collaborators.


2.1.1 Bright-field Microscopy

Bright-field microscopy is the simplest and most commonly used optical microscopy technique. It uses white light that is transmitted through the sample. The sample projection is magnified with lenses, creating an image observed in the microscope eyepiece or recorded with a digital camera sensor. The contrast in the image is created by the attenuation of the transmitted light in dense sample areas, which results in a typical image of a dark sample on a bright background. The useful magnification of bright-field microscopy is limited by the diffraction of visible light, giving a theoretical resolution limit of about 0.2 µm. Bright-field microscopy allows imaging the natural colors of live opaque cells; however, many cell types are colorless or transparent and require staining for better contrast before imaging. Alternatively, differential interference contrast microscopy can be used to enhance the contrast in unstained transparent samples by recording variations in optical density; however, it requires a much more complex optical system than the conventional bright-field microscope.

2.1.2 Fluorescence Microscopy

Fluorescence is a phenomenon in which a sample containing fluorophores is illuminated with light of a specific wavelength, causing the fluorophores to emit light of longer wavelengths (i.e., of a different color than the absorbed light). Two spectral filters and a dichroic beam-splitter inside a fluorescence microscope separate the excitation light from the much weaker emitted light that forms the image observed in the eyepiece or recorded with the digital camera. One of the limitations of fluorescence microscopy is that only a single color (fluorophore) can be imaged at a time. Therefore, multi-signal, multi-color images of several types of fluorophores must be composed by combining several single-color images. It also requires that the samples have intrinsic fluorescence (autofluorescence) or that specific fluorescent stains are used to label target parts of specimens. On the other hand, fluorescence microscopy typically offers a high signal-to-noise ratio and allows specific and sensitive imaging of target protein expressions or molecule distributions in samples. As a consequence, fluorescence images usually have a plain background and are easier to analyze and interpret than bright-field images, and therefore, fluorescence microscopy is used in a wide range of biomedical applications.

Papers I–III describe methods for image analysis and data post-processing of fluorescence microscopy images produced in high-content drug screens (see Section 3.1). Paper IV presents an approach for image matching and registration with demonstrated applications in both fluorescence and transmission electron microscopy (see Section 3.2).


2.1.3 High-Content Screening

High-content screening (HCS) is a large-scale quantitative microscopy imaging and analysis technique used in biomedical research and drug discovery. It combines conventional bright-field and/or fluorescence microscopy with precise robotics. The aforementioned microscopy techniques provide high-content information at the cost of low acquisition speed. Advancement in precise robotics allowed the development of automated image-based HCS systems. These systems are capable of imaging multiple stains in large numbers of cultured cell populations in a short period of time [15, 16]. Given that the samples are typically placed in microwell plates containing up to 1536 wells per plate, this represents a perfect setting for high-throughput approaches. Large-scale screens involving tens of thousands of images, each containing hundreds of cells described by hundreds of measurements, result in overwhelming amounts of data. Therefore, such systems require robust and time-efficient image analysis methods for quantitative measurements of cell appearance, and for their post-processing and interpretation [16]. Papers I–III describe methods for cell-cycle analysis in a large-scale cancer drug screen (see Section 3.1).

2.1.4 Transmission Electron Microscopy

Transmission electron microscopy (TEM) images the number of electrons that pass through a sample. Dense materials cause electrons to scatter and result in dark patterns, somewhat similarly to bright-field microscopy. The imaging is performed in vacuum, as air molecules also scatter electrons. TEMs use, i.a., electromagnetic lenses, vacuum pumps, and particle accelerators; they are typically large machines that require substantial power and careful sample preparation and handling to operate. Biological samples can be imaged in an electron microscope with single-nanometer or even sub-nanometer resolution. As a consequence, TEM is one of the main analytical methods in the physical, chemical and biological sciences, with applications in e.g. virology, nanotechnology and materials science.

Papers IV–VI use TEM images to demonstrate the developed methods for image matching and registration (Paper IV, see Section 3.2) and object recognition (Papers V–VI, see Section 3.3).


2.2 Digital Image Analysis

Numbers have an important story to tell. They rely on you to give them a voice.

Stephen Few

Digital images are organized matrices of pixels. Each pixel has a value that corresponds to the read-out from the camera sensor. 2D color images can be interpreted as 3D matrices, with width, height, and color channel as the dimensions. Pixels have non-negative integer values bounded by the number of camera quantization levels. We can use various mathematical operations to process, highlight, measure, classify or interpret digital images or their parts. We can even teach a computer, by giving it examples, to interpret the images and give us the information we need.
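To make the matrix view concrete, the short sketch below loads an image and inspects it as an array; the file name is hypothetical and scikit-image is only one of many libraries that could be used here.

```python
from skimage import io

image = io.imread("example.png")   # hypothetical file name
print(image.shape)   # e.g. (512, 512, 3): height, width, color channels
print(image.dtype)   # e.g. uint8: non-negative integers bounded by the
                     # camera quantization levels (0-255 for 8 bits)
```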

In traditional analysis, the images are processed with a sequence of skillfully designed and combined methods to produce a final result, such as an accurate measurement or classification of an imaged object. Each method in such a processing pipeline has a different role and typically has to be adapted to work with the specific type of images and the task at hand. Once the algorithm (workflow) is designed and tested, it can automatically perform the same analysis on other images. Designing successful image analysis pipelines requires a good understanding of the problem to be solved and the corresponding image representations and their variations, expertise in image processing methods, some experience and a whit of creativity. It is a challenging and time-consuming task that may have to be iterated if the corresponding global and local contextual variations in images are too large to be handled with a single algorithm. On the other hand, using deep learning and artificial neural networks, we can make the computer learn how to solve the task directly from examples in the dataset [17]. Deep neural networks can offer performance on par with human experts and, at the same time, they are faster and more consistent in their decisions [17, 18, 19, 20]. However, their use in biomedical applications is often limited by the lack of the large amounts of annotated images required for their training. Annotating images is a costly task that requires expert knowledge and many hours of tedious work. Hence, in deep learning, a lot of the human expert effort is moved from image analysis algorithm development to designing an artificial neural network and preparing the annotated data required to train it.

This section introduces the main concepts in image analysis with traditional methods and deep learning that are relevant to this thesis.


Figure 1. A typical image analysis workflow.

2.2.1 Traditional Image Analysis Workflow

Figure 1 presents a typical workflow in traditional image analysis. Each processing step is typically handled with one method or a combination of several methods. The final performance depends on the effectiveness of each of the methods in solving its task. An error in an early processing step will be propagated to the subsequent methods and potentially lead to further errors and a wrong result.

Preprocessing

Microscopy images often require preprocessing to compensate for content variability or hardware limitations in data collection, and to facilitate and improve the analysis that follows. In the preprocessing step, both the input and output are images. There are many common preprocessing tasks, such as image normalization, distortion correction, background removal, registration, tiling, stitching, etc.

Paper IV proposes an image feature descriptor that represents small image patches with one-dimensional numerical arrays. These arrays can be compared to find similar image patches in a different image, which has direct applications in biomedical image matching, stitching, registration, and retrieval.

Segmentation

Segmentation is the process of separating the image content into foreground and background, regions of interest, objects, or any other logical image division for a given context. There are three main image segmentation types: binary, semantic and instance, illustrated in Figure 2. In binary segmentation, an image is divided into background and foreground regions without any further division of the foreground into different objects or classes. In other words, each pixel is classified as either background or foreground. In semantic segmentation, the foreground pixels are classified according to the object type they belong to; it is a pixel-wise classification without distinguishing individual object instances. In instance segmentation, two objects of the same type are separated and can be further processed individually.

Accurate and relevant segmentation is crucial for the subsequent analysis steps, i.e. for measuring and numerically describing a target object and performing its classification or other high-level interpretation of the image. As the nature and type of the regions of interest change with image type, context, and application, it is practically impossible to design a single segmentation method that handles all cases. As a consequence, many segmentation methods have been proposed and yet, all require some modifications to solve a specific problem.

In Papers I–III, we used Otsu thresholding [21] to segment nuclei in fluorescence microscopy images, followed by watershed [22, 23] to separate clustered cells. Both methods and their combination are standard approaches in image analysis, particularly useful in fluorescence microscopy with its high signal-to-noise ratio.
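As an illustration of this standard combination, the sketch below chains Otsu thresholding and seeded watershed using scikit-image. It is a generic reconstruction of the approach, not the exact pipeline from Papers I–III, and the min_distance seed spacing is an illustrative assumption.

```python
import numpy as np
from scipy import ndimage as ndi
from skimage.feature import peak_local_max
from skimage.filters import threshold_otsu
from skimage.segmentation import watershed

def segment_nuclei(dna_image, min_distance=10):
    """Otsu thresholding followed by watershed to split touching nuclei."""
    # Binary segmentation: foreground = pixels above the global Otsu threshold.
    binary = dna_image > threshold_otsu(dna_image)
    # Peaks of the distance transform provide one seed per (clustered) nucleus.
    distance = ndi.distance_transform_edt(binary)
    coords = peak_local_max(distance, min_distance=min_distance)
    markers = np.zeros(dna_image.shape, dtype=int)
    markers[tuple(coords.T)] = np.arange(1, len(coords) + 1)
    # Flooding the inverted distance map separates the clustered cells.
    return watershed(-distance, markers, mask=binary)
```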

Figure 2. Different types of image segmentation.


Object Description

Once the objects of interest are segmented, they can be measured, numerically described, or otherwise represented. This image analysis step extracts the characteristic features of the objects depicted in images, which can be used in their classification, identification, or other decision-making processes. There are many object descriptors, e.g., width, area, aspect ratio, color, texture, etc., and each application usually requires a different descriptor subset, combination or even an entirely new dedicated descriptor.

In Papers I–III, we used integrated pixel intensity as our target object descriptor, i.e. we summed up the fluorescent DNA signal intensities within the segmented nuclei areas. Paper IV proposes a feature descriptor that can be used to represent imaged objects or their parts.
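A minimal sketch of such an integrated-intensity descriptor, computed here with scikit-image's regionprops; the helper name is illustrative, not from the papers.

```python
from skimage.measure import regionprops

def integrated_dna_signal(label_image, dna_image):
    """Sum the fluorescent DNA stain intensities inside each labeled nucleus."""
    return {region.label: region.intensity_image[region.image].sum()
            for region in regionprops(label_image, intensity_image=dna_image)}
```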

Classification

In this step, an object's affiliation is decided based on its descriptors. There are three main types of classification problems, each having its dedicated methods: (1) supervised classification, where the possible object classes are known and the classifier learns their representations (and decision rules) in the descriptor space from examples in the training set; (2) unsupervised classification, where the object classes are not known and the classifier groups objects based on some specific similarity measurements; and (3) regression, where the objects are placed on a continuous scale rather than assigned a category or class.

In Papers I–III, we used a simple threshold-based supervised classification to estimate the cell-cycle stage of each cell based on its integrated DNA signal.
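The sketch below illustrates such threshold-based gating of the integrated DNA signal. The gate positions, expressed relative to the G1 (2N) histogram peak, are hypothetical placeholders rather than the thresholds used in Papers I–III.

```python
def cell_cycle_stage(dna_signal, g1_peak):
    """Gate a single cell into a cell-cycle stage by its integrated DNA signal."""
    if dna_signal < 0.75 * g1_peak:
        return "sub-G1"   # fragmented DNA, e.g. apoptotic cells
    if dna_signal < 1.25 * g1_peak:
        return "G0/G1"    # around 2N DNA content
    if dna_signal < 1.75 * g1_peak:
        return "S"        # replicating, between 2N and 4N
    if dna_signal < 2.5 * g1_peak:
        return "G2/M"     # around 4N DNA content
    return ">4N"          # polyploid cells or clumps
```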

Post-processing

In some cases, segmentation or classification results can be improved with post-processing methods. These include exclusion of debris and imaging artifacts, border refinement, combining results from alternative approaches, and many other techniques. In many applications, the post-processing step also refers to the further analysis of objects and samples aimed at answering the main question of the problem, for which object segmentation, description, and classification are merely intermediate results.

In Papers I–III, we analyzed the population distributions of treated cancer cells to estimate the drug effects on the cell cycle and to select the strongest drug candidates (see Section 3.1).

2.2.2 Deep Neural Networks

A feed-forward artificial neural network (ANN) consists of neurons, also known as perceptrons, organized in layers. The data to be classified (or otherwise processed) is fed as input to the first layer of the network. The output of the first layer is fed as input to the second layer, etc. The output of the final layer of neurons constitutes the output of the network. All internal layers of neurons are called hidden layers, as they are "hidden" from the user, who only provides the input to the input layer and receives the output in the output layer. Figure 3 shows a schematic of a single neuron and a traditional ANN in which all layers are fully-connected. Two layers are fully-connected if for any neuron in the later layer there is an input connection to every neuron in the former layer (the input data here is treated as a layer). Every connection in an ANN has its own trainable weight that is multiplied with the corresponding input. Moreover, every neuron has one additional trainable parameter, called bias, added to the weighted sum of that neuron. In each neuron, each input $x_i$ is multiplied by its weight $w_i$, these products are then summed together with the bias $b$, and the sum is passed to the activation function $f$ of that neuron, which produces the output $y$:

$y = f\left(b + \sum_i w_i x_i\right)$.  (2.1)
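In code, Eq. 2.1 reduces to a dot product; a minimal NumPy version:

```python
import numpy as np

def neuron_output(x, w, b, f=np.tanh):
    """Eq. 2.1: y = f(b + sum_i(w_i * x_i)) for a single neuron."""
    return f(b + np.dot(w, x))
```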

The activation functions add nonlinearity to the network and regularize the value of the weighted sum. In the past, $\tanh$ and the logistic sigmoid were commonly used as activation functions. They suffer from vanishing gradients, which in practice does not allow for training ANNs with many hidden layers [18, 24]. ANNs are trained using gradient-based optimization methods and backpropagation [18]. The network weights are iteratively updated proportionally to the partial derivative of a loss function with respect to the current weight. In the case of $\tanh$ and the sigmoid, the gradients are in the range (0, 1), which combined with the chain rule means that the updates get smaller with every layer in the backpropagation. This can slow down the training or even make it impossible. Nowadays, the rectified linear unit (ReLU) [25] is commonly used. It passes all positive values and sets all negative values to 0: $\mathrm{ReLU}(x) = \max(0, x)$. It has been demonstrated that supervised training of very deep neural networks is much faster if the hidden layers use ReLU [26]. ReLU solves the vanishing gradient problem; however, as the positive values are not bounded, they can increase quickly to very large numbers in consecutive layers. Using proper regularization of the layers (such as batch normalization) can, to some extent, compensate for this. Several ReLU variants have been proposed that also pass scaled-down negative input, such as leaky ReLU [27], parametric ReLU [28], and the scaled exponential linear unit (SELU) [29]. In addition, SELU can automatically normalize the activations for very deep networks [29]. It is given by:

$\mathrm{SELU}(x) = \lambda \begin{cases} x & \text{if } x > 0 \\ \alpha (e^x - 1) & \text{if } x \le 0 \end{cases}$,  (2.2)

where $\alpha$ and $\lambda$ are two constant parameters with default values optimized to $\alpha \approx 1.6733$ and $\lambda \approx 1.0507$, respectively.
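Both activation functions are one-liners in NumPy; a sketch using the default constants quoted above:

```python
import numpy as np

ALPHA, LAMBDA = 1.6733, 1.0507   # default SELU constants [29]

def relu(x):
    """Pass positive values unchanged; set all negative values to 0."""
    return np.maximum(0.0, x)

def selu(x):
    """Eq. 2.2: identity for x > 0, scaled exponential for x <= 0."""
    return LAMBDA * np.where(x > 0, x, ALPHA * (np.exp(x) - 1.0))
```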


Figure 3. A single neuron (left) and a fully-connected ANN (right).

A typical ANN is a supervised classifier with one output (i.e., one neuron in the last layer) for each of the possible classes. Therefore, to regularize the output and make it more intuitive to interpret, the softmax activation function is used in the last layer. It scales and normalizes the raw neuron outputs $z_i$ so that the sum of all final output values in that layer (and typically the network prediction results) is equal to 1:

$\mathrm{softmax}(z_i) = \dfrac{e^{z_i}}{\sum_j e^{z_j}}$.  (2.3)

For a given input sample and an ANN classifier, the softmax output vector values can be interpreted as the probabilities of that sample belonging to the corresponding classes. While softmax is very popular thanks to its easy implementation and interpretation, it is not free of problems. First, as softmax imposes that the prediction results sum up to 1, any sample that is an outlier belonging to none of the expected classes will be forced to be classified as one of the known classes, even if the raw output values from all neurons are close to zero or negative (which could be interpreted as the absence of the input signal combination that the neurons were trained to detect). Second, the exponential normalization means that softmax may often hide the confusion or uncertainty of the network. Consider the following example: suppose there are 4 classes and the four raw network outputs are {6; 5; 0; 1}; the corresponding softmax output is {0.726; 0.267; 0.002; 0.005}. Now, suppose the raw network outputs are {1,000,006; 1,000,005; 0; 1}; the corresponding softmax output is {0.731; 0.269; 0; 0}. In both cases softmax gives the highest score to the first class, even though the raw network output is similarly high for the second class and hence, its probability should also be high. Finally, if the number of classes is large (hundreds or thousands), the denominator can become large and softmax tends to give low scores distributed over many classes.
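The snippet below reproduces both numerical examples with a numerically stable softmax (subtracting the maximum before exponentiation, which leaves the result unchanged):

```python
import numpy as np

def softmax(z):
    """Eq. 2.3, with the standard max-subtraction stability trick."""
    e = np.exp(z - np.max(z))
    return e / e.sum()

print(softmax(np.array([6.0, 5.0, 0.0, 1.0])))
# -> [0.726 0.267 0.002 0.005]
print(softmax(np.array([1_000_006.0, 1_000_005.0, 0.0, 1.0])))
# -> [0.731 0.269 0.    0.   ]
```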

We used ReLU (Paper V) and SELU (Paper VI) in all neural layers except the last, where we used softmax in both papers to obtain the final classification.

The weights in a new, untrained network are randomly generated from a pre-defined distribution. In Paper V we used the Glorot normal initializer [30] and in Paper VI we used the He normal initializer [28]. Both initializers use normal distributions centered on 0 but with different variances:

$\sigma^2_{\mathrm{Glorot}} = \dfrac{2}{n_{in} + n_{out}}, \qquad \sigma^2_{\mathrm{He}} = \dfrac{2}{n_{in}}$,  (2.4)

where $n_{in}$ is the number of inputs to the initialized neuron and $n_{out}$ is the number of neurons to which its output is connected.
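A NumPy sketch of the two initialization schemes of Eq. 2.4 for a fully-connected layer with n_in inputs and n_out outputs:

```python
import numpy as np

def glorot_normal(n_in, n_out):
    """Weights ~ N(0, 2 / (n_in + n_out)), as used in Paper V."""
    return np.random.normal(0.0, np.sqrt(2.0 / (n_in + n_out)), (n_in, n_out))

def he_normal(n_in, n_out):
    """Weights ~ N(0, 2 / n_in), as used in Paper VI."""
    return np.random.normal(0.0, np.sqrt(2.0 / n_in), (n_in, n_out))
```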

An ANN is trained by optimizing its trainable weights to minimize a loss function [18]. The loss function measures the discrepancies between the network prediction and the expected result. Different loss functions can be used depending on the specific application. Categorical cross-entropy is the most commonly used loss function for classification, and mean square (or absolute) error for regression problems. The training is done in three steps: first, a new sample passes through the network and the corresponding prediction is computed; second, the prediction is compared to the ground truth and the difference is quantified with the loss function; third, the network weights are updated backwards using the calculated loss and a gradient descent optimizer. This last stage is called backpropagation [18, 31]. The optimizer estimates how much the loss changes with respect to the change in the trainable weight value and updates that value accordingly. The update magnitudes are regularized by a hyper-parameter, the learning rate; the higher the learning rate, the larger the weight updates, which can make the network converge faster in some cases but can also make the optimizer miss an optimum. Therefore, the learning rate is usually small and the training typically requires multiple iterations to converge. The training is also very dependent on the quality and quantity of the training dataset. An ANN is only able to learn how to handle data variations present in the training set. It is common practice to augment the training data so that an ANN can be trained with fewer annotated samples, and also to make the network robust to expected data variations that are under-represented in the raw training dataset.
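For intuition, the three steps reduce to a few lines for a single linear neuron with a mean-square-error loss; the gradients are written out analytically:

```python
import numpy as np

def sgd_step(w, b, x, y_true, lr=0.01):
    """One training iteration for a single linear neuron with MSE loss."""
    y_pred = b + np.dot(w, x)        # 1) forward pass: compute the prediction
    loss = (y_pred - y_true) ** 2    # 2) quantify the error with the loss
    grad = 2.0 * (y_pred - y_true)   # 3) backpropagation: dloss/dy_pred
    w = w - lr * grad * x            #    dloss/dw_i = grad * x_i
    b = b - lr * grad                #    dloss/db = grad
    return w, b, loss
```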

One of the most common problems in ANN training is overfitting. It happens when the model learns the answers for the training data "by heart" instead of capturing the underlying distributions and dependencies. This results in poor performance on a test set and other data unseen by the network during the training. The worst (albeit quite common) scenario for overfitting is training a large ANN with too many trainable parameters on too little data for too long. Unfortunately, there are no strict rules for the minimal network size, the minimal amount of training data or a recommended number of training weight updates; it all depends on the application and the underlying data complexity. There are several strategies to overcome overfitting. A validation dataset can be used to measure the network performance during the training and to stop it at the first signs of overfitting (i.e., when the performance on the validation set starts to decrease). Another popular solution is adding to the network model (typically just before or between the fully-connected layers) a dropout [32], an auxiliary layer that sets random neuron outputs to zero, forcing the network to learn to handle an incomplete data description. Dropout is used only during the training; it improves the generalization of the network and reduces the chance of overfitting, but may also cause redundancy within the network. Finally, the reported ANN performance should be measured on an independent set of data: the test set.
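As a hedged illustration (the thesis does not prescribe this exact model), the Keras snippet below combines both strategies: a dropout layer between fully-connected layers and validation-based early stopping. All layer sizes are arbitrary.

```python
from tensorflow import keras

model = keras.Sequential([
    keras.layers.Dense(128, activation="relu", input_shape=(64,)),
    keras.layers.Dropout(0.5),   # training only: zeroes half of the outputs
    keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam", loss="categorical_crossentropy")

early_stop = keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=5, restore_best_weights=True)
# model.fit(x_train, y_train, validation_data=(x_val, y_val),
#           epochs=100, callbacks=[early_stop])
```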

In a fully-connected ANN, the number of trainable weights increases rapidly with the number of nodes, and thus connections. Convolutional neural networks (CNNs) are a specific family of ANNs designed to reduce the number of trainable weights. This is achieved first by using the spatial context property of input images (which are organized in pixel matrices, where the neighborhood of any pixel is very important for understanding its meaning) and making the connections sparse and local, and second, by sharing the same weights among all neurons in a given layer, which also gives the network translation invariance. In recent years, CNNs have proven to be very useful in image analysis and have often been shown to outperform the traditional tools and methods [17, 19, 20]. A typical CNN uses mainly three types of layers: convolutional, pooling and fully-connected. They are organized in a specific order, creating a network architecture with a set of connections between the layers.

Figure 4. A 3×3 convolutional layer.


In the case of 2D image analysis, each neuron in a convolutional layer can be seen as placed "above" that layer's input matrix, taking as input the values directly under it and their immediate neighborhood (see Figure 4). The size of the neighborhood is pre-defined and often referred to as the kernel size of that layer. For example, a 3×3 convolutional layer will have a 3×3 spatial coverage of its input. An input to a layer often has multiple channels, and although it is customary to refer to 2D convolutional layers by their spatial 2D coverage, each neuron in such a layer actually takes as input a 3D submatrix corresponding to its position, the kernel size and all the input channels inside that neighborhood. E.g., in a 3×3 (2D) convolutional layer with an input of 32 channels, each node has 3×3×32 inputs plus a bias, and thus it has 289 trainable weights. All neurons in a convolutional layer share the same weights, which greatly reduces their number with respect to a fully-connected layer. The neurons (and thus their output) are organized in a matrix. The distance between two neighboring neurons with respect to the input matrix is called the stride and constitutes one of the hyper-parameters of the layer (other typical hyper-parameters are: kernel size, activation function, initialization method and type of padding). A larger stride makes the neighboring neurons overlap less in their spatial input coverage and reduces the number of neurons in that layer (and thus the output size). Naturally, to provide all the necessary input to a neuron, it cannot be placed directly "above" the input matrix border (unless its kernel size is 1×1). As a consequence, the output matrix shrinks by $k - 1$ (where $k$ is the kernel size) in all spatial dimensions with respect to the input in each convolutional layer. To maintain the same output size, the input can be padded with zeros, which is a common practice in signal processing (it assumes no signal outside the measured input) but can also introduce artificial sharp edges where the data is no longer present.

Finally, a convolutional layer is built of several sub-layers of neurons, each sub-layer with the same kernel size and looking in parallel for a different feature in the input. Therefore, the convolutional layer output has several channels called feature maps.
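The weight sharing makes the parameter count easy to verify; the helper below reproduces the 289-weight example from the text:

```python
def conv2d_params(kernel_h, kernel_w, in_channels, n_filters):
    """Trainable weights of a 2D convolutional layer: one shared kernel
    spanning all input channels, plus one bias, per filter (feature map)."""
    return n_filters * (kernel_h * kernel_w * in_channels + 1)

# The example from the text: a 3x3 kernel over a 32-channel input.
assert conv2d_params(3, 3, 32, n_filters=1) == 289
```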

The term "convolutional" may cause unnecessary confusion with the discrete convolution operation. It is true that there are similarities between them: both use kernels for computing local weighted-sum outputs, and the neuron organization in a convolutional layer indeed resembles the sliding window of the convolution. However, there are many differences: (1) kernel weights in CNNs are trained and optimized, while convolution typically uses constant human-engineered weights, (2) convolutional layers also have a trainable bias added to the weighted sum that is not present in convolutions, (3) CNNs originate from ANNs and the sliding window-like behavior is just an adaptation of the general ANN to the context of image processing, and (4) every convolutional layer, as any neural layer, is followed by an activation function (although this has often been implemented as an option, to the best of the author's knowledge, all neural layers, both convolutional and fully-connected, are always followed by an activation function). It is not the first, nor probably the last, time that a term causes confusion in the machine learning field. Another similar example is the "reward" in reinforcement learning, which more often than not is actually negative and de facto penalizes bad choices, even though the word by itself is associated with something strictly positive.

The second type of layer specific to CNNs is the pooling layer. A pooling layer uses a sliding window that typically does not overlap (i.e., its stride is the same as its window size); however, contrary to the convolutional layer, a pooling layer always operates channel by channel (i.e., each feature map is separately processed and reduced in size). Moreover, pooling layers do not contain any trainable weights. Instead, they perform a predefined function that reduces the size of the input. The max function is the most commonly used, which gives such a layer the name max-pooling (see Figure 5 for an example of a typical 2×2 max-pooling layer). Pooling layers are typically used after and between convolutional layers. They reduce the spatial dimensions of feature maps and increase the receptive field of the subsequent convolutional layers. The dimensionality reduction lowers the number of computations and the memory requirements. It also reduces the number of trainable weights in the network by making the input vector to the fully-connected layers (i.e., the layers with the largest number of trainable weights, typically used at the end of the network to perform the final classification) smaller. The receptive field is the spatial area in the raw network input that is used to calculate a single value in a given feature map. The receptive field gets larger with each convolutional layer (in a typical case, linearly) and pooling layer (in a typical case, exponentially), which allows the network to analyze image features that span over a larger area. As pooling layers reject a large share of the feature map values (e.g., in the case of the typical 2×2 max-pooling with stride 2, 75% of the feature map is rejected), they can also be seen as a regularization technique that combats the overfitting problem.

Figure 5. A 2×2 max-pooling layer.
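A 2×2 max-pooling layer with stride 2 takes only a few lines of NumPy; this sketch processes a single channel (feature map), since pooling operates channel by channel:

```python
import numpy as np

def max_pool_2x2(feature_map):
    """Keep the maximum of every non-overlapping 2x2 block, rejecting
    75% of the values in the feature map."""
    h = feature_map.shape[0] // 2 * 2   # drop an odd trailing row/column
    w = feature_map.shape[1] // 2 * 2
    blocks = feature_map[:h, :w].reshape(h // 2, 2, w // 2, 2)
    return blocks.max(axis=(1, 3))
```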


A typical CNN architecture for image classification consists of an alternating sequence of convolutional and pooling layers, followed by several fully-connected layers. The feature maps of the last convolutional or pooling layer are flattened to a vector before passing to the fully-connected layers. Intuitively, the convolutional and pooling layers compute gradually more and more sophisticated features, starting from e.g. simple edge and color detectors to complex, application-specific feature detectors [17]. The fully-connected layers at the network end constitute a traditional ANN taking the high-level features as input and producing the output of the entire network. However, CNNs can offer much more than just image classification. In 2015 Long et al. [33] proposed a fully-convolutional ANN (i.e., a CNN without any fully-connected layers) for semantic image segmentation. It used up-sampling and skip (residual) connections to balance the spatial dimensionality reduction in convolutional and pooling layers and to generate an output of the same size as the input. The residual connections copy and pass feature maps outside the regular feed-forward order, skipping one or more layers ahead. These feature maps not only pass conventionally to the next layer but are also combined (usually concatenated) with the output from the skipped layers. He et al. [34] showed that residual connections can be used to train very deep ANNs as they allow the gradients to flow freely through possibly hundreds of layers (they trained networks with up to 152 layers) without the vanishing gradient problem.

Ronneberger et al. [35] proposed another fully-convolutional CNN architecture with skip connections, the U-Net, which became seminal in microscopy image analysis. It has been adopted and successfully used in a wide range of biomedical applications, see e.g. [14, 19, 20, 35, 36]. The U-Net architecture can be divided into two parts: the contracting encoder and the expanding decoder. The first part (left half of the CNN model in Figure 6) encodes the input image in high-level features and detects the target objects, while the second part (right half of the CNN model) decodes the output of the previous layers and uses the residual connections to localize the objects in the original image scale. Each block in the CNN model is composed of two consecutive 3×3 convolutional layers. In the contracting encoder part, the blocks are followed by a max-pooling layer. In the expanding decoder part, the feature maps produced by the previous layer are up-sampled with a factor of two in each spatial dimension using nearest neighbor interpolation, passed through a 2×2 convolutional layer, and concatenated with the cropped output from the encoder part at the corresponding level. The 2×2 convolutional layers blend the up-sampled values and reduce the number of the up-sampled feature maps by half. We used U-Net-based CNNs in Papers V and VI (see Section 3.3).
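The following is a schematic PyTorch sketch of one U-Net block and one decoder step as described above; the class and function names are the author's illustrative choices, not code from the papers, and the original implementation may differ:

```python
import torch
import torch.nn as nn

def center_crop(t: torch.Tensor, h: int, w: int) -> torch.Tensor:
    """Center-crop a (N, C, H, W) tensor to spatial size (h, w)."""
    dh = (t.shape[2] - h) // 2
    dw = (t.shape[3] - w) // 2
    return t[:, :, dh:dh + h, dw:dw + w]

class DoubleConv(nn.Module):
    """One U-Net block: two consecutive unpadded 3x3 convolutions with ReLU."""
    def __init__(self, in_ch: int, out_ch: int):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3), nn.ReLU(inplace=True),
            nn.Conv2d(out_ch, out_ch, 3), nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.block(x)

class DecoderLevel(nn.Module):
    """One decoder step: nearest-neighbour up-sampling by 2, a 2x2 'blending'
    convolution that halves the channel count, concatenation with the cropped
    encoder feature maps, and another DoubleConv."""
    def __init__(self, in_ch: int):
        super().__init__()
        self.up = nn.Upsample(scale_factor=2, mode='nearest')
        self.blend = nn.Conv2d(in_ch, in_ch // 2, 2, padding='same')
        self.conv = DoubleConv(in_ch, in_ch // 2)

    def forward(self, x, skip):
        x = self.blend(self.up(x))
        skip = center_crop(skip, x.shape[2], x.shape[3])
        return self.conv(torch.cat([skip, x], dim=1))
```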

Figure 6. The original U-Net CNN model. Each block consists of two consecutive 3×3 convolutions with ReLU; the blocks are connected by max-pooling on the contracting path and by 2×2 up-sampling followed by a 2×2 convolution with ReLU and a crop-and-concatenate on the expanding path; the final layer is a 1×1 convolution with softmax.

3 Projects and Contributions

No problem can withstand the assault of sustained thinking. Voltaire

In this chapter, the developed methods and the corresponding results, thoroughly described in Papers I–VI, are briefly presented. The high-content screening project is presented and the methods for cell-cycle analysis and subsequent drug selection, corresponding to Papers I–III, are described in Section 3.1. Section 3.2 presents the novel feature descriptor for image matching described in Paper IV. Finally, object recognition using CNNs for the application of virus segmentation and classification, corresponding to Papers V–VI, is presented in Section 3.3.

3.1 High-Content Screening Data Analysis (Papers I–III)

Simplicity is the ultimate sophistication. Leonardo da Vinci

3.1.1 Overview

High-content screening (HCS) is, as explained in Section 2.1.4, a large-scale quantitative microscopy imaging and analysis technique used in biomedical research and drug discovery. It allows many chemical compounds at different doses to be tested simultaneously on cultivated cells to investigate the phenotypic changes they induce, which facilitates drug discovery [15]. Moreover, HCS can also be used in drug repurposing projects in which a large number of already approved drugs are screened against a particular disease target. Gupta et al. [37] have shown that many currently available drugs have secondary indications, and hence, could potentially be used to treat other diseases than originally designed for. This is particularly interesting for the pharmaceutical industry as it can significantly reduce the high cost of testing and approval of new drugs.

HCS typically produces large numbers of images that cannot be reliably analyzed by visual inspection. Although HCS images offer a plethora of information on the individual cell level, many studies focus on the analysis of a single or a small subset of the available measurements [38, 39]. Furthermore, cell population analysis is often further simplified to comparing only the mean values and simple statistics [38, 39]. Such data reduction leads to a loss of potentially valuable information on population variability [40]. In Papers I–III, we focused on one measurement, namely, the DNA content in each cell. However, rather than computing and comparing averages across the cell populations, we reduced the DNA content data from per-cell measurements to per-well frequency distributions. The DNA content in a cell can be used to estimate its cell-cycle state [41], and thus, we investigated DNA content population distribution dynamics to analyze different drug effects on the cell-cycle in patient-derived cancer cell cultures.

The end goal of this project was to develop HCS data analysis methods for drug selection based on the cell-cycle analysis. In Paper I, we investigated whether our HCS cell-cycle analysis can be applied as an alternative to the conventionally used flow cytometry (FC) in measuring the drug-induced cell-cycle disruption in cancer cells. Our results showed a high correlation between the two approaches, which encouraged us to further develop and generalize our method into a user-friendly tool. Paper II introduces PopulationProfiler: a software tool that reduces per-cell measurements to population statistics. Finally, Paper III presents our image-based methods for selecting drugs of interest based on their influence on the cell-cycle. We demonstrated the effectiveness of our approach in a large screen of 249 drugs at 11 different doses tested on 15 glioblastoma multiforme (brain tumor) patient-derived cell cultures.

3.1.2 Data

In Paper I, we used HCS and FC data prepared and acquired by Jordi Carreras Puigvert. We used two different cell lines: lung cancer A549 and colon epithelial non-transformed CCD841. A549 is known to be sensitive to cell-cycle disruptors, whereas CCD841 is a slow-replicating cell line insensitive to cell-cycle perturbations. The cultured cells were exposed to eight treatments: two different negative controls and three different drugs, each at two different doses. The technical details of the sample preparation, handling and imaging can be found in Papers I and II.

The image-based drug selection methods described in Paper III were validated and demonstrated on the data from the human glioblastoma cell culture (HGCC) resource [42]. We analyzed the response of 249 drugs at 11 doses in 15 cell cultures derived from patients with different glioblastoma subtypes. Paper III contains the technical details of the sample preparation, handling, and imaging.


Figure 7. The DNA content histogram of a typical negative control (untreated cell culture) in the dataset. The raw data is in blue and the Gaussian-smoothed histogram in red.

3.1.3 Methods

Image Analysis
We measured the DNA content as the log2 of the integrated intensity of DNA stain in individually segmented nuclei. Cell segmentation and intensity measurements in HCS images were done using CellProfiler, a widely used software for large-scale image-based screening [43]. The cells were stained with Hoechst, a DNA fluorescent stain, which facilitated image processing and allowed DNA content measurements. To segment the nuclei we designed an image analysis pipeline consisting of Gaussian smoothing followed by Otsu thresholding [21], intensity-based watershed [22, 23] segmentation for separating clumped nuclei, and size thresholding for rejecting potential debris (objects too small to be nuclei) and staining artifacts (large stain drops). This straightforward nuclei segmentation approach provided robust results mostly unaffected by illumination variations between images.
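As an illustration, the main steps of such a pipeline can be sketched with scikit-image as follows; the actual analysis was done in CellProfiler, and all parameter values below are placeholders:

```python
import numpy as np
from skimage.filters import gaussian, threshold_otsu
from skimage.feature import peak_local_max
from skimage.segmentation import watershed
from skimage.measure import regionprops

def segment_nuclei(image, sigma=2.0, min_area=50, max_area=5000):
    """Gaussian smoothing -> Otsu thresholding -> intensity-based watershed
    to separate clumped nuclei -> size thresholding to reject debris and
    large staining artifacts. All parameter values are placeholders."""
    smoothed = gaussian(image, sigma=sigma)
    mask = smoothed > threshold_otsu(smoothed)
    # Seed the watershed with local intensity maxima inside the mask.
    coords = peak_local_max(smoothed, labels=mask.astype(int), min_distance=5)
    seeds = np.zeros(mask.shape, dtype=int)
    seeds[tuple(coords.T)] = np.arange(1, len(coords) + 1)
    labels = watershed(-smoothed, seeds, mask=mask)
    # Reject objects that are too small (debris) or too large (stain drops).
    for region in regionprops(labels):
        if not (min_area <= region.area <= max_area):
            labels[labels == region.label] = 0
    return labels
```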

Cell-cycle Analysis
In an unperturbed cancer cell population, a DNA content histogram typically consists of two peaks, as presented in Figure 7. The higher peak to the left (2N) corresponds to the larger part of the cell population that has a single copy of the genome (i.e., 46 chromosomes, or 2 sets of 23, hence, 2N), whereas the smaller peak on the right (4N) corresponds to the subpopulation with a double amount of DNA (i.e., imaged just before mitosis). Therefore, for each cell culture, we used the largest and second-largest maxima of the aggregated negative controls to localize the corresponding centers of the 2N and 4N sub-populations, respectively. We smoothed the histograms with a Gaussian filter (σ = 1.5) beforehand to avoid multiple peaks at the 2N and 4N locations. After the maxima detection, we normalized all DNA content measurements such that the value under the 2N peak corresponds to 1 and the value under the 4N peak corresponds to 2. Individual cells were thereafter assigned to five classes named sub-G1, 2N, S, 4N, and >4N based on thresholds at 0.75, 1.25, 1.75 and 2.5, respectively, in accordance with [41]. During the analysis of treated cells, we automatically adjusted the thresholds to the shape of each well’s histogram within limits defined by the 2N and 4N peaks of the untreated (negative control) wells. This adaptive gating proved to be stable for small and moderate histogram shifts and allowed a comparison of cell-cycle effects decoupled from variations in, e.g., cell size, illumination, DNA stain distribution or uptake. The alternative, i.e. using the gates found for the negative controls for all the other samples, could potentially detect a DNA content histogram shift representing a relevant drug effect. However, as the DNA content differences in raw cell images are very difficult to notice by visual inspection, the histogram shift might easily be confused with illumination variation or nonhomogeneous staining and, as a consequence, be incorrectly interpreted. Finally, we calculated the five cell-cycle subpopulation contributions by normalizing their respective cell counts by the total number of cells in each histogram so that their sum was equal to 1. The DNA content histogram generation and classification of cells based on their cell-cycle stage were packaged into PopulationProfiler (Paper II) and used in Papers I and III.
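For illustration, the fixed-gate version of this classification (i.e., without the adaptive per-well threshold adjustment described above) can be sketched as follows:

```python
import numpy as np

# Gates on normalized DNA content (2N peak at 1, 4N peak at 2), per [41].
GATES = [0.75, 1.25, 1.75, 2.5]
CLASSES = np.array(['sub-G1', '2N', 'S', '4N', '>4N'])

def classify_cells(dna_content: np.ndarray) -> np.ndarray:
    """Assign each cell to a cell-cycle class from its normalized DNA content."""
    return CLASSES[np.digitize(dna_content, GATES)]

def subpopulation_contributions(dna_content: np.ndarray) -> dict:
    """Per-class cell-count fractions, normalized to sum to 1."""
    cls = classify_cells(dna_content)
    return {c: float(np.mean(cls == c)) for c in CLASSES}
```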

Dose Selection
In Paper III, we reduced the dataset from the original 11 doses to a single dose per drug to avoid redundancy and to reject the less interesting cases of doses that are too low (to cause a phenotypical change) or lethal. Our aim was to select the highest doses with a relatively large number of cells, assuring both strong phenotypical effects and statistical relevance for further analysis. For each drug, we summed up the cell counts at each dose across all cell cultures. We selected the doses of interest by computing the maximum value of the first forward derivative approximation:

−Δ_h[f](d) = f(d) − f(d + h), (3.1)

where d is a dose for a given drug and f(d) is the sum of cell counts for all cell cultures at dose d. We used h = 2 to reduce noise and select the doses just before the largest cell count drop typical for lethal doses (as illustrated in Figure 1 in Paper III). However, not all tested drugs showed that characteristic sudden cell count drop. Some had a negligible effect on the cell viability, while others were so toxic that they killed the majority of cells even at the lowest dose. We used the cell count mean and standard deviation across all cell cultures and all doses for a given drug to detect these cases. A relatively low standard deviation indicated little or no change in the cell count with respect to the doses, and the mean value indicated whether most cells were alive or dead. We selected the highest dose in the case of drugs with consistently high cell counts and the lowest dose for those that killed most cells regardless of the concentration.
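A minimal sketch of this selection rule, Eq. (3.1), assuming doses are ordered from lowest to highest (the special cases of flat cell-count curves are handled separately, as described above):

```python
import numpy as np

def select_dose_index(f: np.ndarray, h: int = 2) -> int:
    """Select the dose just before the largest cell-count drop, Eq. (3.1).

    f[i] is the total cell count over all cell cultures at dose i, with
    doses ordered from lowest to highest. Maximizes f(d) - f(d + h)."""
    diffs = f[:-h] - f[h:]  # f(d) - f(d + h) for every valid dose index d
    return int(np.argmax(diffs))
```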

One alternative here would be to aggregate measurements of 2 or 3 doses around the selected one. This would increase the sample cell count and could potentially strengthen the analysis, but it could also increase heterogeneity within the cell populations. In some cases, a drug effect can change with dose. If that is suspected, a whole set of doses should be analyzed individually instead of a single dose or a subset aggregation. The proposed drug selection methods could be adapted to handle such cases by first treating each drug-dose combination as an independent case and then grouping the same drugs according to their effects. In our experiments, we assumed that the drug effect is the same across the doses (although its strength may vary) and that a single selected dose suffices to detect it.

Drug Selection
In Paper III, we present, evaluate and discuss two approaches for drug selection that the author designed. In the first, we numerically compared cell-cycle subpopulation contributions of treated cells to the corresponding negative controls, searching for the largest differences. For each drug, we took the selected dose and approximated the cell-cycle dissimilarity as the sum of absolute differences between the corresponding cell-cycle subpopulation contributions divided by the number of available samples (n = |C|):

(1/n) Σ_{c∈C} Σ_s |t_s(c) − r_s(c)|,  s ∈ {sub-G1, 2N, S, 4N, >4N}, (3.2)

where c is a cell culture, t_s(c) are the cell-cycle subpopulation contributions of a given drug-dose combination and r_s(c) are the corresponding negative control values. Next, we sorted and ranked the drugs according to this cell-cycle dissimilarity measurement.
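A direct implementation of Eq. (3.2) could look as follows (the array layout is an assumption for illustration):

```python
import numpy as np

def cell_cycle_dissimilarity(treated: np.ndarray, control: np.ndarray) -> float:
    """Eq. (3.2): mean over cell cultures of the summed absolute differences
    between treated and negative-control subpopulation contributions.

    Both arrays have shape (n_cultures, 5), with columns ordered as
    [sub-G1, 2N, S, 4N, >4N] and each row summing to 1."""
    return float(np.abs(treated - control).sum(axis=1).mean())
```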

While visually inspecting the histograms of the selected doses, we observed that the majority of the drugs in the screen did not show a clear effect on the cell-cycle. In the second approach, we used this observation and applied outlier detection methods to select drugs that affect the cell-cycle. We treated all samples, including the negative controls, as data-points in the five-dimensional cell-cycle subpopulation contributions space. To reduce noise, we considered only the 3 main cell-cycle subpopulations: 2N, S, and 4N, thus reducing the space to 3D. We used two different outlier selection methods: bagged local outlier factor (BLOF) [44] and interquartile range (IQR). BLOF calculates an outlier score for each data-point based on its neighborhood statistics [44]. The “bagged” version performs these calculations multiple times, each time using a different data projection, and then combines the results for improved outlier detection. We classified the data-points as outliers based on the BLOF outlier score and a fixed threshold. In the IQR approach, we defined outliers as data-points satisfying at least one of the inequalities:

x < Q1 − k · IQR (3.3)

or

Q3 + k · IQR < x, (3.4)

where Q1 is the 25% quartile, Q3 is the 75% quartile, IQR is the interquartile range (IQR = Q3 − Q1), and k is the outlier factor. In both outlier detection methods, we used the ratio of outlying cell cultures for a given drug-dose combination as the criterion to sort and rank the drugs according to their cell-cycle disruption strength.
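A minimal sketch of the IQR criterion, Eqs. (3.3)–(3.4); the default outlier factor k = 1.5 below is a common convention, not necessarily the value used in Paper III:

```python
import numpy as np

def iqr_outliers(points: np.ndarray, k: float = 1.5) -> np.ndarray:
    """Flag data-points outside [Q1 - k*IQR, Q3 + k*IQR] in any dimension,
    per Eqs. (3.3)-(3.4). `points` has shape (n_samples, n_dims)."""
    q1 = np.percentile(points, 25, axis=0)
    q3 = np.percentile(points, 75, axis=0)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return np.any((points < low) | (points > high), axis=1)
```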

In the last step of the drug selection, we used doxorubicin (one of the drugs tested in the screen), which is known to have a strong effect on the cell-cycle [45, 46], as the positive control. For each method, we selected only those drugs that ranked higher than doxorubicin, i.e., drugs showing a stronger cell-cycle disruption effect than doxorubicin.

The cell count is a measure of the statistical relevance of the population effects. A very low cell count usually indicates cell death, and morphological measurements are then less likely to convey useful information. Therefore, during the evaluation of our methods, we reduced noise by considering only those samples that contained at least 100 cells after treatment.

Paper III contains more information on the two outlier detectors, their hyper-parameter selection, and an alternative way to normalize the data (i.e., after excluding the sub-G1 and >4N subpopulations) that we also used in our experiments.

3.1.4 Results and Main Conclusions

In Paper I, we used Pearson’s correlation coefficients between normalized cell-cycle subpopulation distribution vectors found in HCS and FC. The results show a high correlation between the two approaches, indicating that HCS can be used as an alternative to FC to investigate the cell dynamics at a single time point based on population statistics. However, it has been demonstrated that using correlation in method comparison may be misleading [47]. Therefore, we also visually compared the DNA content histograms and the mean contributions of the three main cell-cycle subpopulations (2N, S, and 4N) measured in % with the two methods, which confirmed our results. Paper I contains a thorough analysis and discussion of the numerical results. It also includes our discussion of the image informatics benefits over conventional single-signal FC, with the main points being a more efficient use of cultured cells and raw data preservation in the image-based approach.

We performed the validation of the drug selection methods proposed in Paper III as a blind experiment, i.e. using a code name for each drug, and thus avoiding potential bias in the analysis. Nevertheless, our methods selected many drugs that are well-known cell-cycle-specific agents currently used in chemotherapy or under investigation in clinical trials. Although the two approaches and their variants generated slightly different drug rankings, they converged to a very similar list of top-scoring drugs differing only in their ordering. Therefore, the proposed approaches are equally relevant alternatives and the final choice of the method depends on the subtle drug selection and sorting differences desired in a particular application. Our results provided new insights into (1) which drugs affect the cell-cycle in glioblastoma cell cultures, and (2) the extent to which these drug effects vary across different patient cases. We present and discuss in full the drug selection results in Paper III. It also contains a detailed discussion on the proposed methods, their possible adjustments, and some alternative approaches.

3.1.5 Discussion

Our main idea for the HCS data analysis was to reduce the per-cell measurements to per-well distributions, each represented by a histogram, and to further reduce the histograms to sub-type counts based on gating (setting bin ranges) of known control distributions and local adjustments to histogram shapes. Rather than reducing per-cell measurements to population averages, our proposed data reduction maintains information on population heterogeneity, allowing us to discriminate between drugs that perturb the cell-cycle with similar detail as obtained by FC, but at significantly lower cell counts. Moreover, image-based HCS data analysis allows efficient discrimination between true signals and artifacts, and enables comparison of measurements of morphological features, such as sub-cellular signal localization and cytoskeletal patterns, not possible to observe by conventional flow cytometry.

Recent so-called imaging flow cytometry (IFC) systems combine features of fluorescence microscopy and flow cytometry [48]. They capture images of individual cells, allowing multi-parametric fluorescent and morphological analysis of thousands of cells. IFC can be used to perform statistical analysis of subcellular protein distributions in heterogeneous cell populations. However, due to the sample preparation process (i.a. cells have to be separated from each other and detached from the surface they grew on), IFC, just like conventional FC, requires many more cells than HCS (as many cells are irreversibly damaged in the sample preparation). Moreover, contrary to HCS, FC and IFC cannot be used to perform time-lapse analysis to investigate changes to spatial distribution, migration patterns, interactions between cells, etc.

PopulationProfiler can visualize arbitrary population histograms and subpopulation distributions in a compact and comprehensive way. It allows creating gates for subpopulation classes based on control samples. Although it was originally designed for cell-cycle analysis in an HCS setting, it can be useful in many other applications, including IFC data analysis. In the Supplementary Material to Paper II, we demonstrate how PopulationProfiler can be used to analyze cell-cycle perturbation, protein translocation, and EdU incorporation. PopulationProfiler has been used by our collaborators in other research projects [3, 4, 5].

Image-based HCS offers a wealth of phenotypical information in the form of cellular morphology, fluorescent signals, textures, spatial cell distributions, etc. Of course, limiting the investigation of drug-induced effects to only one signal, and the choice of this signal, can be debatable and require justification. In recent years, multi-parametric image-based HCS assays have been proposed [49, 50], including some that directly process raw images with deep neural networks, as reviewed in [20, 51]. Nevertheless, we decided to focus our investigation on a single signal that robustly identifies chemical compounds that affect the cell-cycle in cancer cells. A large group of commonly used chemotherapy drugs for cancer treatment are cell-cycle specific (i.e., their main role is to inhibit cancer growth by disrupting the cell-cycle of cancer cells). Therefore, cell-cycle subpopulation analysis is of high interest. Although our assay could potentially benefit from incorporating additional signals, we intentionally decided to simplify the analysis. Such an approach is not very surprising considering the complexity of handling hundreds of measurements from hundreds of cells per treatment, in assays often spanning libraries of thousands of compound-dose combinations. Moreover, we observed that all measurements related to the shape and intensity of the cytoplasm were affected by cell density, introducing a bias that was difficult to normalize for. We also noticed that the area of the cell nuclei was affected by some drugs, which could be caused by DNA accumulation due to an arrest in the S or 4N cell-cycle phase or by yet another mechanism of action. The proposed methods for cell-cycle analysis and drug and dose selection were designed to be robust to cell density, high biological heterogeneity (both at the cellular and patient level) and varying data quality typical for large uncurated image datasets (i.e., corrupted with i.a. missing images, staining and imaging artifacts, and incomplete documentation).


3.2 Feature Descriptor for Image Matching (Paper IV)

Mathematics compares the most diverse phenomena and discovers the secret analogies that unite them. Jean-Baptiste Joseph Fourier

3.2.1 Overview

In many biomedical applications, two or more images must be matched and aligned to e.g. create a larger field of view (image stitching) or to highlight a sample change over time or other interesting differences (image registration). In Paper IV, we propose the Log-Polar Magnitude feature descriptor (LPM) for image matching. A typical local feature descriptor represents a relatively small image region (a distinctive feature) with a set of numbers – a feature vector. These numbers can be compared with feature vectors obtained from regions in a reference image while searching for a match. Feature descriptors are designed to be invariant to various common image transformations, for example: translation, rotation, scaling, viewpoint angle, and illumination change. They should be distinctive and robust to occlusion, and they do not require image segmentation, which has made them a popular and successfully used image-matching approach in a wide range of applications including video and image retrieval, object tracking, and recognition.

Local features are typically defined by point coordinates and possibly the corresponding neighborhood size and orientation. Feature detectors find feature point candidates in images, while feature descriptors represent them with feature vectors. The final performance depends both on the detector’s capability to detect enough corresponding points in both images (otherwise matching is impossible) and the discriminative power of the descriptor that allows correct feature vector matching. While some approaches like SIFT [52] and SURF [53] combine the two methods, the proposed LPM is only a feature descriptor. In fact, one of its main merits is the ability to work with any feature detector (as demonstrated in Paper IV). Different applications and image types favor different feature detector-descriptor combinations. LPM is a rotation-, scale-, and illumination-invariant descriptor designed for microscopy applications. Nevertheless, the possibilities to combine it with any feature detector and to customize its descriptive priorities make LPM a very versatile tool.

3.2.2 Data

We evaluated LPM performance on three datasets: fluorescence microscopy, TEM and the popular Oxford image collection [54]. The fluorescence microscopy dataset contains two image subsets: one with cultured cells and one with a thin tissue section from a breast cancer tumor obtained from the biobank at the Department of Pathology, Uppsala University Hospital. Each subset contains four images of the same sample captured at different times, which results in six possible image pairs for matching in each case. There are three main differences between the images of the same subset: (1) due to biological processes, the relative positions of the fluorescent signals changed a bit, (2) the fluorescent stain bleaching caused a change in the image brightness, and (3) the samples are slightly translated due to the acquisition procedure.

The transmission electron microscopy dataset contains eight image pairs of tissue sections. The first six (kidney cells infected with Mimivirus) are translated with respect to each other, whereas the last two (edge of cells with protruding cilia) differ in magnification (scale). The images captured with different magnifications are additionally characterized by different noise levels, which is typical for TEM acquisition. Both microscopy datasets contain images with large plain (background or transparent) regions of relatively little texture or visual information. As a consequence, they constitute a hard matching problem.

The publicly available Oxford image dataset has been designed and used as the standard benchmark dataset for both local feature detectors and descriptors [54, 55]. It contains natural scene image subsets with different image transformations of gradually increasing strength: blurring, illumination, JPEG compression, viewpoint, and scaling plus rotation.

3.2.3 Methods

Log-Polar Magnitude Feature Descriptor
The proposed LPM descriptor computes a feature vector representation of an image feature using Log-Polar Transform (LPT) sampling followed by Fast Fourier Transform (FFT) and selecting key frequencies in the magnitude spectrum. Figure 8 presents the LPM processing flowchart.

We sampled a disc-shaped neighborhood of each detected feature with a local LPT using the formula:

x_{i,j} = x_c + r_i cos(θ_j)
y_{i,j} = y_c + r_i sin(θ_j), (3.5)

where (x_c, y_c) is the feature center, r_i are sampling radii logarithmically distributed in the range [1, R], R is the feature radius, and θ_j are sampling angles linearly distributed in the range [0, 2π). We tested two sampling resolutions: one with 16 radii and 16 points per radius, and one with 32 radii and 32 points per radius. They resulted in 16×16 and 32×32 transformed image patches, respectively.


Figure 8. The workflow diagram of the proposed LPM feature descriptor.

In the next step, we used FFT to calculate the frequency domain representation of the LPT image. Finally, we selected a predefined set of frequency magnitudes (i.e., those under the frequency mask in Figure 8) to compose the feature vector and normalized it to a unit length. This frequency component selection can be adapted to customize LPM’s descriptive capability and to improve the matching performance for a specific image type or application. Paper IV contains an in-depth discussion of the LPM parameter tuning and the frequency component selection and how it can be adapted to other applications.
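Putting the steps together, the LPM pipeline can be sketched as follows; the sampling resolution and, in particular, the frequency mask below are illustrative placeholders, and Paper IV defines the actual mask:

```python
import numpy as np
from scipy.ndimage import map_coordinates

def lpm_descriptor(image, x_c, y_c, R, n_r=16, n_a=16, mask=None):
    """Sketch of the LPM pipeline: log-polar sampling of the disc-shaped
    neighborhood of a feature at (x_c, y_c) with radius R, followed by an
    FFT and selection of key frequency magnitudes."""
    radii = np.logspace(0, np.log10(R), n_r)                 # log-spaced in [1, R]
    angles = np.linspace(0, 2 * np.pi, n_a, endpoint=False)  # linear in [0, 2*pi)
    rr, aa = np.meshgrid(radii, angles, indexing='ij')
    xs = x_c + rr * np.cos(aa)                               # Eq. (3.5)
    ys = y_c + rr * np.sin(aa)
    patch = map_coordinates(image, [ys, xs], order=1)        # n_r x n_a LPT patch
    mag = np.abs(np.fft.fft2(patch))        # rotation ends up in the phase only
    if mask is None:
        mask = np.zeros(mag.shape, dtype=bool)
        mask[:4, :8] = True                 # placeholder set of key frequencies
    mask[0, 0] = False                      # drop the DC component: illumination invariance
    vec = mag[mask]
    return vec / np.linalg.norm(vec)        # normalize to unit length
```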

The logarithmic radii sampling makes LPM scale-invariant (for zoom factors up to about 2) and less sensitive to erroneous feature size estimation. A feature rotation is represented as a shift in the LPT image, which in turn is only represented as a phase difference in the FFT, i.e. it is not present in its magnitude. Therefore, the LPT-FFT combination makes the proposed descriptor rotation-invariant without the feature orientation information or any additional processing steps. Finally, we did not include the constant frequency component in the feature vector and thus made LPM illumination-invariant.


Figure 9. The proposed feature descriptor evaluation.

Evaluation
In the LPM evaluation on the Oxford dataset, we used the so-called VL benchmark test [56] that implements the descriptor matching score introduced in [57]. We also developed an alternative feature descriptor evaluation method and used it on all three image datasets. Figure 9 presents the flowchart of our threshold-RANSAC-based evaluation approach. We found the matches with the Matlab matchFeatures function with default parameters. It compares feature vectors normalized to unit lengths, and two features are considered a match if the distance between their vectors is smaller than a fixed threshold. Moreover, the matches also have to fulfill the Lowe condition introduced in [58]:

d_1 / d_2 < t_L, (3.6)

where d_1 and d_2 are the distances from a feature vector to its closest and second-closest candidate vectors, respectively, and t_L is the Lowe threshold, set to a default of 0.6 in Matlab (the original paper set the threshold to 0.8 [58]). This condition makes the matching much more reliable by requiring that the matching vectors are substantially closer to each other than to any other feature vector, and thus filters out ambiguous matches. The neighbor distance ratio of a given feature vector can be understood as a density estimate of false matches in the part of the feature space around it. Note that the Lowe condition formula in Paper IV is stated wrongly (the numerator and denominator were mistakenly reversed).
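A brute-force sketch of matching with the Lowe condition, Eq. (3.6); Matlab's matchFeatures applies this internally, and the Python version below is only for illustration:

```python
import numpy as np

def lowe_matches(desc_a: np.ndarray, desc_b: np.ndarray, t_lowe: float = 0.6):
    """Brute-force matching with the Lowe condition, Eq. (3.6): accept a
    match only if the closest candidate is substantially closer than the
    second-closest one (d1 < t_lowe * d2)."""
    matches = []
    for i, v in enumerate(desc_a):
        d = np.linalg.norm(desc_b - v, axis=1)  # distances to all candidates
        j1, j2 = np.argsort(d)[:2]              # closest and second-closest
        if d[j1] < t_lowe * d[j2]:
            matches.append((i, int(j1)))
    return matches
```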

Once the matching feature pairs were found, we used the Optimal RANSAC [59], a method for estimating an image transformation model from a set of observed data containing outliers. It is an iterative, repeatable re-estimation version of the random sample consensus algorithm, RANSAC [60]. Finally, a feature descriptor (or feature detector) performance can be measured in the number of validated matches (inliers) and in the inlier ratio:

inlier ratio = |inliers| / |matches|. (3.7)

A successful feature descriptor should yield a large number of correct matches and a high inlier ratio. The first is important for providing enough information to compute the correct transformation model (homography matrix) for a given pair of images, and the second demonstrates the descriptor’s ability to represent features in a distinctive way. The proposed evaluation approach, the original benchmark tests, and their limitations are discussed in more detail in Paper IV.

3.2.4 Results and Main Conclusions

Our experiments showed that LPM can successfully handle image matching with a wide range of transformations including rotation, translation, blurring, JPEG compression, illumination variation, and, to some extent, scaling and viewpoint changes. We compared LPM with SIFT and SURF on the three image datasets. SIFT and SURF are the most commonly used reference methods for feature descriptor comparisons. The proposed descriptor performed better than SURF and comparably to SIFT on the Oxford dataset. On both the fluorescence and TEM datasets, LPM performed better than the other two methods. These results were obtained with much shorter feature vectors: SIFT and SURF have by default 128-long feature vectors, whereas the proposed LPM uses only 56-long ones. We also used our evaluation approach to demonstrate that feature detector performance varied with different image subsets. This showed that different detectors are optimal for different applications. Paper IV contains the numerical results and their in-depth discussion.

3.2.5 Discussion

Depending on the image type and application, different frequency components may be crucial or beneficial for creating distinct image feature representations. LPM can be adapted to optimize its descriptive priorities for a given application and, in this process, the feature vector can be shortened to achieve a desired performance-computational workload compromise. Shorter feature vectors result in faster matching. Moreover, LPM could potentially be used with a dedicated cascade matching algorithm that compares first the frequency components important for a given application, quickly rejecting many obviously incorrect matching candidates. Feature matching could become much faster without the need for computing distances between entire feature vectors. However, such a matching algorithm has yet to be developed and tested, and the author leaves this for future studies.


The proposed evaluation method does not use a ground truth but instead compares the matching results to the image transformation model found with RANSAC. This can be seen as a major problem of this approach, as its performance may be influenced by the number of correct matches, the inlier ratio or RANSAC limitations. We used Optimal RANSAC, which is repeatable and capable of finding the optimal subset of feature matches even in a heavily contaminated result set (i.e., even with an inlier ratio below 0.05) [59]. Nevertheless, despite working very well in our experiments, the Optimal RANSAC-based validation may fail if the number of matches is very small or if the transformations are non-rigid, e.g., due to a perspective change, sample fracture or lens distortion. This should be considered during testing, and another type of RANSAC or validation method (e.g., based on a similarity measurement used for non-rigid registration) would be preferable in the aforementioned cases.

This work was an inspiration for further development of Fourier-based feature descriptors applied in other CBA projects [8, 9, 10].


3.3 Object Recognition (Papers V–VI)

‘Data! Data! Data!’ he cried impatiently. ‘I can't make bricks without clay.’ Sherlock Holmes

3.3.1 Overview

Convolutional neural networks (CNNs) can offer human expert-like performance and, at the same time, they can be faster and more consistent in their decisions. However, their use in biomedical applications is often limited by the lack of the large amounts of annotated images required for CNN training. While image augmentation techniques can help training CNNs with a relatively small number of images, the annotations still have to be of high quality and the training set has to be representative of all data variations that may occur in the final application. Annotating images is a costly task that requires expert knowledge and many hours of tedious work. Moreover, in many applications, the objects of interest cannot be accurately delineated due to their fuzzy shape, ambiguous morphology or image quality. In Paper V, we proposed a method for CNN training using minimally annotated images. The manual annotations, in this case, consist of center points or centerlines of target objects of approximately known size. Our method uses these annotations and prior knowledge about the approximate size of the targets to automatically generate ground truth labels with “certain” object and background regions that are used in the training, while leaving uncertain regions out of the training.

Another limiting factor for using CNNs in many applications is that they are typically very computationally heavy and often require expensive state-of-the-art hardware. In Paper VI, we investigated the possibility of reducing the size of a popular CNN, the U-Net, making it more memory- and computation-efficient and, hence, easier to implement in practical scenarios. We parametrized its architecture and demonstrated that comparable results can be achieved with substantially fewer trainable weights than the original U-Net. We also demonstrated the effectiveness of our minimal annotation training method in a challenging application of pixel-wise virus classification in transmission electron microscopy images with 15 virus classes. Our lightweight CNN achieved 82.2% accuracy with nearly four times fewer trainable weights. It was optimized for efficient trainable weight use, which makes it attractive for commercial hardware implementation.

Table 1. The TEM virus image dataset.

Virus type    # particles   # test particles   # train images   # test images   # all images
Adenovirus        290             92                54               13               67
Astrovirus        420            131                64               13               77
CCHF              471            159                49               28               77
Cowpox            353            108                42               15               57
Dengue            141             78                20               19               39
Ebola             320            107                74               41              115
Influenza         687            208                72               29              101
Lassa             419            126                67               41              108
Marburg           131             40                70               26               96
Norovirus         419            128                30               24               54
Orf               389            118                41               30               71
Papilloma        1235            414                24                9               33
RVFV             2029            609                94               42              136
Rotavirus         274             92                29                8               37
WNV               199             78                 6                3                9
Total            7777           2488               736              341             1077

3.3.2 Data

In this project, we used the transmission electron microscopy virus image dataset [11]. Table 1 lists the virus types and the corresponding numbers of images and virus instances in the dataset. The virus classes are not balanced and there are relatively few images per class. We used images with only one virus type (Rift Valley fever virus) to develop and test our approach for minimal annotation training, as described in Paper V. In Paper VI, we explored the effect of different architecture hyper-parameters on the performance and size of U-Net using five virus classes. We tested the best architectures on all 15 virus classes.

Various magnification levels (pixel sizes ranging from 0.26 to 5.57 nm), irregular background, presence of noise and debris, and varying particle appearance due to different decomposition stages make this dataset challenging, even for manual segmentation. The virus particle boundaries are often unclear, the particles are often clustered together and thus, in many cases, an accurate delineation is impossible. Therefore, each virus particle was annotated only with its approximate center-point (in the case of isolated spherical particles) or a centerline (in the case of clustered or elongated particles).

We rescaled all images using Lanczos-3 kernel interpolation so that the pixel size was equal to 1 nm. Then we randomly split them into training and test sets. We augmented, tiled and normalized the training set images as described in Papers V and VI. Image augmentation is a common approach in CNN training; more sample variety often has to be added artificially to allow successful training, especially if the training set is small. Tiling of the images was necessary to fit them into the network models. Finally, we used the standard procedure for intensity normalization: subtraction of the mean followed by division by the standard deviation.
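For completeness, a minimal sketch of this normalization step (the exact implementation in Papers V and VI may differ):

```python
import numpy as np

def normalize_intensity(tile: np.ndarray) -> np.ndarray:
    """Standard intensity normalization: subtract the mean and divide by
    the standard deviation, giving zero mean and unit variance."""
    return (tile - tile.mean()) / tile.std()
```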

3.3.3 Methods

Minimal annotation training
We used mathematical morphology to create ground truth object and background labels from the manually annotated virus particle centers. For each image in the dataset, we generated non-complementary binary label images: one for the background and one for each object class, as illustrated in Figure 10. All object class labels except the correct one were set to zero matrices (i.e., they are empty black images). We created the object label image by dilating the annotations with a disk-shaped structuring element with a diameter equal to 0.7 of the typical virus diameter for that class, d, provided as a priori information. This gave some tolerance to the particle size variation and imperfect annotation placement. We created the background label in a similar way by first dilating the annotations with a disk-shaped structuring element of diameter 1.5d and then taking the complement of the result. Obviously, if the manual annotations are incomplete or incorrect, the resulting label images will also be erroneous.
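A minimal sketch of the label generation for one class, assuming scikit-image morphology (the function and variable names are illustrative):

```python
import numpy as np
from skimage.morphology import binary_dilation, disk

def make_labels(annotation: np.ndarray, d: float):
    """Generate non-complementary object and background labels from
    center-point/centerline annotations for a class with typical virus
    diameter d (in pixels). Pixels that end up labeled as neither form
    the undefined band that is left out of the training."""
    obj = binary_dilation(annotation, disk(int(round(0.7 * d / 2))))
    bg = ~binary_dilation(annotation, disk(int(round(1.5 * d / 2))))
    return obj, bg
```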

The label images generated with our approach are non-complementary, i.e. there are wide boundary regions between the objects and background that are labeled as neither. We wanted the CNN model to learn from the obvious foreground/background examples and to leave the uncertain regions out of the training. To achieve this we had to modify the loss function so that it disregards the undefined regions. We used a modified Dice coefficient as the loss function:

L(G, P) = − 2 Σ(W ∘ G ∘ P) / (Σ(W ∘ G) + Σ(W ∘ P)), (3.8)

where G is the ground truth, P is the prediction, W is the per-pixel weight matrix with a weight for each pixel in G, ∘ denotes element-wise multiplication, and the sums run over all pixels. The logarithm added in Paper VI speeds up the network convergence by giving high loss values to very wrong predictions. We tried two configurations of the per-pixel weight matrix W:

1. a binary W that disregards the undefined regions: the weights are set to the union of the object and background labels (i.e., 1 in the labeled regions and 0 in the undefined regions) for the background and the correct class (a), and to zero for all other classes (b);

2. a floating-point W that additionally normalizes the weights for the unbalanced number of foreground/background pixels, scaling the object and background weights so that both regions contribute equally to the loss in case (a), and zero for all other classes (b).

In a similar way to the Dice loss, we defined the intersection over union loss function. In Paper V we present the results and discuss our experiments with both loss functions in different combinations with the weight matrices.
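A direct PyTorch sketch of Eq. (3.8), including the optional logarithm added in Paper VI (the epsilon term is a numerical-stability addition not present in the equation):

```python
import torch

def weighted_dice_loss(g: torch.Tensor, p: torch.Tensor, w: torch.Tensor,
                       log: bool = False) -> torch.Tensor:
    """Weighted Dice loss, Eq. (3.8), over one class channel.

    g: ground-truth label image, p: predicted probabilities, w: per-pixel
    weights (zero in the undefined regions, so they do not affect training).
    With log=True this becomes the -log(Dice) variant used in Paper VI.
    """
    eps = 1e-7  # numerical-stability term, not part of Eq. (3.8)
    dice = 2 * (w * g * p).sum() / ((w * g).sum() + (w * p).sum() + eps)
    return -torch.log(dice + eps) if log else -dice
```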

Figure 10. A sample image from the virus dataset and the corresponding generated label images.


U-Net size-reduction
When we tested our minimal annotation training approach in a single virus class segmentation problem (Paper V), we reduced the number of feature maps at each convolutional layer with respect to the original U-Net [35]. Our network thus had far fewer trainable weights (<0.5 M vs >31 M). This allowed faster training and helped prevent overfitting. Its good results also inspired us to further explore the possibility of parametrizing the original U-Net to reduce its size in a more complex multi-class segmentation application.

We described the U-Net architecture with four hyper-parameters: the base number of feature maps, the feature maps multiplier, the number of encoding-decoding levels, and the number of feature maps in the last two convolutional layers. Due to the long training time and hardware limitations, we limited our experiments to only five virus classes (see Table 1) and explored one hyper-parameter at a time. In each experiment, we selected several of the most promising settings that were used in the following experiments. Once we found the best models, we retrained them and tested them on the 15 virus class dataset. The details and discussion of our systematic U-Net architecture hyper-parameter exploration can be found in Paper VI.
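As an illustration, the following sketch computes the feature-map counts implied by such hyper-parameters; the function name and the exact way the multiplier and the last-two-layers parameter enter the architecture are illustrative assumptions, not the definitions used in Paper VI:

```python
def unet_channel_plan(base: int, multiplier: float, levels: int,
                      last_two: int) -> dict:
    """Feature-map counts for a parametrized U-Net.

    base       -- number of feature maps at the first level
    multiplier -- per-level growth factor of the feature maps
    levels     -- number of encoding-decoding levels
    last_two   -- feature maps in the last two convolutional layers
    """
    encoder = [round(base * multiplier ** i) for i in range(levels)]
    decoder = encoder[-2::-1]  # mirrored counts on the expanding path
    return {'encoder': encoder, 'decoder': decoder,
            'head': [last_two, last_two]}

# E.g., the original U-Net corresponds roughly to
# unet_channel_plan(base=64, multiplier=2, levels=5, last_two=64).
```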

3.3.4 Results and Main Conclusions

The networks trained with our minimal annotation method provide approximate segmentations of the objects, which are generally expected to be larger than the corresponding automatically generated object labels. Therefore, to evaluate their performance we used metrics that do not penalize decisions in the undefined regions: weighted Dice score and intersection over union (in Paper V), and normalized mean object accuracy (in Paper VI). The details of the evaluation approaches and discussion of the results can be found in Papers V and VI, respectively.

Figure 11. The architecture of our best performing U-Net-based model, with a 1024×1024 input. Each block consists of two consecutive 3×3 convolutions with SELU; the blocks are connected by max-pooling on the contracting path and by 2×2 up-sampling followed by a 2×2 convolution with SELU and a concatenation on the expanding path; the final layer is a 1×1 convolution with softmax.

Figure 11 presents the architecture of our best performing U-Net-based CNN model. In the 15 virus class semantic segmentation problem, it achieved similar accuracy to the original U-Net, 82.2%, but with over 23 million fewer trainable weights (7.8 M in comparison to 31 M). A large part of the errors resulted from the difficulty in image annotation and potentially incorrect manual annotations. However, our quick experiments with curating the dataset showed that the networks seemed robust to a small amount of missing or incorrect annotations in the training set. Other causes of errors were the relatively small number of images per class, substantial variation in virus particle appearance within the same class, and the random selection of images for the test set. The random selection is good practice. However, in combination with the small dataset and the large variation in object appearance, some images with rare virus appearances were selected for the test set. As a consequence, despite image augmentation, the training set was not representative enough in some cases and the virus types were sometimes confused.

Our U-Net size-reduction experiments led to two main conclusions: 1) the architecture hyper-parameters are pivotal if fewer trainable weights are to be used, and 2) if there is no restriction on the network size, using a deeper CNN generally gives better results. However, training larger networks comes at a price: they require more data, time, and computational resources, and are also more prone to overfitting.

3.3.5 Discussion

The proposed minimal annotation training method in its current form can only be applied to pixel-wise classification (semantic segmentation). To extend it to instance segmentation, traditional image analysis post-processing methods, such as watershed [22, 23], could be used on the original images and segmentation results to separate clustered particles. An alternative would be to use the results of our model to aid manual virus particle delineation, thus making it a simpler task. Such data could then be used to train a final CNN that solves the instance segmentation.

Reducing network memory and computation power requirements is very important for many clinical and biomedical applications in which the analysis has to be fast, accurate and at the same time implemented on affordable hardware that is typically only a part of a larger commercial system. Our architecture hyper-parameter exploration can be adapted to other CNNs and used to optimize them for a given application, making them less hardware-demanding. Moreover, our minimal annotation training can reduce the cost of annotating images. It is also particularly useful in applications in which full object annotations are not available or feasible. We believe that the two approaches presented here address important problems that are very relevant for bridging the gap between CNN development and their actual use in practical scenarios.

Our minimal annotation training inspired similar methods applied to fibrosarcoma cell detection in phase contrast microscopy images under conditions where very little or inaccurate training data is available [61].


4 Concluding Remarks and Future Work

Education never ends, Watson. It is a series of lessons, with the greatest for the last.

Sherlock Holmes

The automatic image and data analysis methods presented in this thesis were developed to facilitate and improve microscopy-based biomedical research and diagnosis in a variety of quantitative microscopy applications. In many such cases, annotating images is the major method development limitation as it is very costly due to required expert knowledge and long hours of tedious work. Although the proposed methods require parameter fine-tuning when adapted to new applications, they were designed to work with little or no annotations (or other input) from the biomedical experts. This makes them useful in many applications also beyond the ones described here.

The software developed as part of this thesis was appreciated and used by researchers in other projects and resulted in biomedical findings reported in journal publications. This work also inspired further development of similar algorithms in other CBA projects.

Digital image analysis is transitioning from using traditional algorithms designed end-to-end by a human expert to example-based machine learning. Therefore, the future work includes mainly further deep neural network development to increase our understanding of how CNNs work and how to use them more efficiently to improve the solutions to the problems described in this thesis. In the virus semantic segmentation, the performance could potentially be improved by replacing the simple convolutional layer blocks in the proposed U-Net-based CNNs with more advanced ones like the Google Inception module [62] or by adding residual connections within the blocks. Another interesting and, from the application point of view, particularly useful extension would be modifying the network to allow detection of new virus species not present in the training dataset. This would require i.a. softmax modification or replacement. Section 2.2.2 discusses softmax problems that should be addressed in a future study. In the case of high-content screening, Eulenberg et al. [63] trained a CNN to classify cell images captured with imaging flow cytometry into different cell-cycle stages. Moreover, they used the raw CNN output (i.e., before softmax) to visualize and reconstruct the cell-cycle of Jurkat cells and disease progression in diabetic retinopathy. It would be interesting to compare their method with the traditional image analysis used in this thesis and to investigate how the combination of the two approaches could be applied in future applications. On the other hand, there is a large need for methods and tools that incorporate other cellular measurements or take raw microscopy images as input, as they could give further insight into the phenotypic changes induced by the screened substances and help us better understand their mechanisms of action. One approach would be to use deep neural networks (or other methods) in unsupervised learning to find groups of drugs that affect cell morphology in a similar way. Unsupervised learning requires no annotations and is designed to cluster the data samples based on their similarity. It is particularly useful in applications in which the classification results of the available data are not fully known, which is the case in drug screening. Recent works showed interesting unsupervised deep learning results for other high-content screening problems [64, 65, 66]. Sommer et al. [65] trained an auto-encoder CNN on negative control samples and used it as a self-learned feature extractor for clustering the rest of the screening data. Pawlowski et al. [64] and Kensert et al. [66] have successfully used transfer learning of networks trained on ImageNet for chemical compound clustering.


Summary in Swedish

In recent years, rapid technological development has resulted in a multitude of new imaging devices with increasing availability, popularity, and degree of automation. Many laboratories have access to automatic imaging systems in which robots help prepare, handle, and image biomedical samples in large quantities. While this abundance of image data enables new discoveries and further automation (for example, automated image-based diagnosis), the need for automatic, efficient, and robust image processing methods has grown in proportion to the number of acquired images. One of the greatest limitations in developing such methods is the lack of reliable annotations. Training a method and verifying that it works requires many images annotated by a human expert. This is particularly problematic in biomedical applications, where annotated images are expensive to generate because of the expert knowledge and the time required. To help solve this problem, the papers in this thesis describe automatic image processing methods that require little or no annotation from experts. The aim of the work in this thesis was to facilitate and improve microscopy-based biomedical research and diagnosis through the development of automatic image analysis methods for three specific biomedical applications: 1) data analysis of image-based screening of drug candidates, 2) image registration and matching, and 3) virus detection and classification.

The first project focuses on large-scale quantitative cell-cycle analysis of fluorescence microscopy images of cancer cell cultures. The main goal was to develop tools for automated cell-cycle analysis and, with their help, to identify drug substances and their effective doses and to increase the understanding of different drugs' mechanisms of action. At the same time, the project contributed to an increased understanding of personalized treatments by studying drug responses in cell cultures from glioblastoma patients. In Paper I, we investigated whether our image-based cell-cycle analysis can serve as an alternative to the more conventional flow cytometry technique for measuring drug-induced effects on the cell cycle of cancer cells. Our results show a high correlation between the two methods, which encouraged us to continue developing and generalizing our method into a user-friendly tool. Paper II introduces and describes PopulationProfiler: a tool that reduces individual cell measurements to population statistics. Finally, Paper III presents an image-based method for identifying drugs that affect the cell cycle. We demonstrated the effectiveness of our method in a large analysis (a so-called screen) of 249 drugs at 11 different doses tested on cell cultures from 15 glioblastoma multiforme (brain tumor) patients.
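
To make the idea of DNA-content-based cell-cycle analysis concrete, the toy sketch below gates cells into cell-cycle phases from their integrated nuclear DNA-stain intensity. The function name, the normalization by the G1 peak, and all gate boundaries are illustrative assumptions; they do not reproduce the analysis pipeline of Papers I-III or PopulationProfiler.

```python
import numpy as np

def classify_cell_cycle(dna_intensity, g1_peak):
    """Toy DNA-content gating: one value per cell, phases as strings.

    dna_intensity: integrated nuclear DNA-stain intensity per cell.
    g1_peak: intensity at the G1 population peak (e.g., histogram mode).
    Real analyses fit the DNA-content histogram per cell line and plate.
    """
    dna = np.asarray(dna_intensity, dtype=float) / g1_peak  # G1 ~ 1, G2/M ~ 2
    phases = np.full(dna.shape, "S", dtype=object)          # default: in between
    phases[dna < 0.75] = "sub-G1"                           # fragmented/apoptotic DNA
    phases[(dna >= 0.75) & (dna < 1.25)] = "G1"
    phases[(dna >= 1.75) & (dna < 2.5)] = "G2/M"
    phases[dna >= 2.5] = ">4N"                              # polyploid or aggregates
    return phases
```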

In many biomedical applications, two or more images must be matched and registered, for example to create a larger overview image or to analyze differences between the images, such as changes over time. The goal of the second project was to develop a robust and computationally efficient scale- and rotation-invariant point descriptor for biomedical microscopy images. In Paper IV, we describe the so-called Log-Polar Magnitude descriptor for image matching. It has a short feature vector, i.e., it consists of few descriptive measures, and it can work with virtually any type of point detector. Different applications and image types require different detector-descriptor combinations. The descriptor presented in Paper IV is invariant to rotation and to changes in scale (magnification) and light intensity, which are common variations in biological microscopy images. These properties, together with the possibility to combine the descriptor with different detectors and to tailor its descriptive properties to different applications, make it a very flexible tool.
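
As a rough sketch of the Fourier-Mellin principle that a log-polar magnitude descriptor builds on (the published descriptor in Paper IV may differ in sampling, weighting, and normalization), the example below resamples a patch in log-polar coordinates and keeps a few low-frequency Fourier magnitudes. Rotation and scale around the keypoint become translations in log-polar space, and the Fourier magnitude is translation-invariant; normalizing the vector removes global intensity scaling.

```python
import numpy as np
from skimage.transform import warp_polar

def log_polar_magnitude(image, keypoint, radius=32, n_bins=(32, 32)):
    """Illustrative rotation-, scale-, and intensity-invariant vector.

    image: 2D grayscale array; keypoint: (row, col) detector output.
    """
    # Resample the neighborhood of the keypoint on a log-polar grid.
    patch = warp_polar(image, center=keypoint, radius=radius,
                       output_shape=n_bins, scaling="log")
    magnitude = np.abs(np.fft.fft2(patch))
    # Keep only low-frequency coefficients to obtain a short vector.
    vector = magnitude[:4, :4].ravel()
    return vector / (np.linalg.norm(vector) + 1e-12)
```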

Convolutional Neural Networks (CNNs) can deliver human-expert-like performance while being faster and more consistent in their decisions. However, the use of CNNs in biomedical applications has been limited by the lack of the large amounts of annotated data needed to train the networks. Moreover, in many applications, the objects of interest cannot be delineated exactly because of their fuzzy shape, ambiguous morphology, and/or the image quality. In the third project, we developed a method for training CNNs with minimally annotated images, presented in Paper V. The annotations in this case consist of points or lines at the centers of objects of approximately known size. In Paper VI, we showed how effective the method is in a challenging application consisting of pixel-wise classification of 15 virus classes in images acquired with a transmission electron microscope. We modified the original U-Net architecture to reduce the number of trainable weights, which makes it attractive for commercial hardware implementation. The proposed training with minimally annotated images can be useful in many applications where complete object annotations are not available or possible.
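
As one plausible illustration (not necessarily the exact scheme used in Paper V, which may, for example, also weight the loss near uncertain object borders), minimal center-point annotations can be expanded into training masks by drawing disks of the approximate object radius:

```python
import numpy as np
from skimage.draw import disk

def masks_from_points(centers, approx_radius, shape):
    """Turn center-point annotations into a binary training mask.

    centers: list of (row, col) annotated object centers.
    approx_radius: approximate object radius in pixels.
    shape: (height, width) of the image.
    """
    mask = np.zeros(shape, dtype=np.uint8)
    for center in centers:
        # Foreground disk of the approximate object size, clipped to the image.
        rr, cc = disk(center, approx_radius, shape=shape)
        mask[rr, cc] = 1
    return mask
```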

The software developed as part of this thesis has been used by collaborators and other researchers and has resulted in published biomedical discoveries. The work in this thesis has also inspired continued development of similar algorithms in other projects at the division. The parameters of the presented methods need fine-tuning when applied to new applications. Since the methods were designed to work with little or no annotation (or other input) from biomedical experts, they are useful in many applications, also beyond the ones described here.


Acknowledgments

Good friends are like stars, you don't always see them, but you know they're always there.

Christy Evans

This Ph.D. has been an exciting journey. As in any big enterprise, there were many moments of laughter, high energy, and motivation, but also moments of doubt and difficulty. I consider myself very lucky because, during the five years it took, I could always count on my family and on friends old and new. You made the good moments great and helped me get through the bad ones. Hereby, I want to thank you all. This thesis would not be what it is without you. In particular, I would like to thank:

• my supervisors: Ida-Maria Sintorn and Carolina Wählby, for your trust, support, help, and understanding when it mattered most,

• my collaborators: Jordi Carreras Puigvert, Cecilia Krona, and Anders Hast, for your help and fruitful work together,

• all seniors and Ph.D. students at CBA for many interesting discussions and creating a great work environment,

• Anna-Lena Forsberg and Lena Nordström for always having time and answers for me and for making all the administration stuff easy,

• Anindya Gupta for your kindness, friendship, dedication and many long fika discussions on life and science,

• Teo Asplund, for our lunch discussions on games, books, movies, culture, science and much more,

• Elisabeth “Eli” Linnér, Kristina “Kika” Lidayova, Christopher Avenel, and Amit “Brother-from-another-mother” Suveer for making long days and evenings at work fun,

• Erik Wernersson for introducing me to the world of high-quality sound; this thesis would not have been written without the headphones I got from you,


• my Uppsala Historiska Fäktskola friends: Magnus Lundborg, Thomas Nyzell, Andreas Isaksson, Martin Svedberg, Henrik Persson, Johan Dahlberg, Mattias Landelius, Robert Ringefors and the rest of UHFS for many hours of sword clashing and for helping me keep my stress levels low,

• my best friends: Dawid Domiński, Łukasz Stryjek, and Marcin Chmurski, for our fantastic adventures in the Old World and here, for your support and countless hours of games, discussions, and fun,

• my parents, for always supporting me; I could always count on you, even when you did not understand or did not agree with my choices and decisions,

• my partner in life, Sabine Radde, for being with me in good and bad, for your care and love, and for listening to my never-ending stories about swords and fencing,

• and my cute little Cuddlesaurus, Matti, for your radiating enthusiasm, energy and many smiles you gave me.

Uppsala, October 2019

Damian Matuszewski


References

[1] Wählby, C. Algorithms for applied digital image cytometry (Doctoral dissertation). Acta Universitatis Upsaliensis. 2003.

[2] Sintorn, I.M. Segmentation methods and shape descriptions in digital images (Doctoral dissertation). Acta Universitatis Agriculturae Sueciae. 2005.

[3] Schmidt, L., Baskaran, S., Johansson, P., Padhan, N., Matuszewski, D.J., Green, L.C., Elfineh, L., Wee, S., Häggblad, M., Martens, U., Westermark, B., Forsberg-Nilsson, K., Uhrbom, L., Claesson-Welsh, L., Andäng, M., Sintorn, I.M., Lundgren, B., Lönnstedt, I., Krona, C., Nelander, S. (2016) Case-specific potentiation of glioblastoma drugs by pterostilbene. Oncotarget, 7(45): pp. 73200–73215.

[4] Carreras-Puigvert, J., Zitnik, M., Jemth, A.-S., Carter, M., Unterlass, J.E., Hallström, B., Loseva, O., Karem, Z., Calderón-Montaño, J.M., Lindskog, C., Edqvist, P.-H., Matuszewski, D.J., Blal, H.A., Berntsson, R.P.A., Häggblad, M., Martens, U., Studham, M., Lundgren, B., Wählby, C., Sonnhammer, E.L.L., Lundberg, E., Stenmark, P., Zupan, B., Helleday, T. (2017) A comprehensive structural, biochemical and biological profiling of the human NUDIX hydrolase family. Nature Communications, 8(1): 1541.

[5] Bombrun, M., Gao, H., Ranefall, P., Mejhert, N., Arner, P., Wählby, C. (2017) Quantitative high-content/high-throughput microscopy analysis of lipid droplets in subject-specific adipogenesis models. Cytometry Part A, 91(11): pp. 1068–1077.

[6] Hast, A., Marchetti, A. (2013) Rotation invariant feature matching based on Gaussian filtered log polar transform and phase correlation. In Proc. of the 8th International Symposium on Image and Signal Processing and Analysis (ISPA), pp. 107–112.

[7] Hast, A. (2014) Robust and invariant phase based local feature matching. In Proc. of the 22nd International Conference on Pattern Recognition, pp. 809–814.

[8] Hast, A., Fornés, A. (2016) A Segmentation-free Handwritten Word Spotting Approach by Relaxed Feature Matching. In Proc. of the 12th International Workshop on Document Analysis Systems, pp. 150–155.

[9] Hast, A., Vats, E. (2018) Radial Line Fourier descriptor for historical handwritten text representation. Journal of WSCG, 26(1): pp. 31–40.

[10] Hast, A., Sablina, V.A., Sintorn, I.M., Kylberg, G. (2018) A Fast Fourier Based Feature Descriptor and a Cascade Nearest Neighbour Search with an Efficient Matching Pipeline for Mosaicing of Microscopy Images. Pattern Recognition and Image Analysis, 28(2): pp. 261–272.

[11] Kylberg, G. Automatic virus identification using TEM: image segmentation and texture analysis (Doctoral dissertation). Acta Universitatis Upsaliensis. 2014.

[12] Kylberg, G., Uppström, M., Hedlund, K.O., Borgefors, G., Sintorn, I.M. (2012) Segmentation of virus particle candidates in transmission electron microscopy images. Journal of Microscopy, 245(2): pp. 140–147.


[13] Sintorn, I.M., Kylberg, G. (2014) Virus recognition based on local texture. In Proc. of the 22nd International Conference on Pattern Recognition, pp. 3227–3232.

[14] Sadanandan, S.K., Ranefall, P., Wählby, C. (2016) Feature augmented deep neural networks for segmentation of cells. In Proc. of European Conference on Computer Vision, pp. 231–243.

[15] Bickle, M. (2010) The beautiful cell: high-content screening in drug discovery. Analytical and Bioanalytical Chemistry, 398(1): pp. 219–226.

[16] Usaj, M.M., Styles, E.B., Verster, A.J., Friesen, H., Boone, C., Andrews, B.J. (2016) High-content screening for quantitative cell biology. Trends in Cell Biology, 26(8): pp. 598–611.

[17] LeCun, Y., Bengio, Y., Hinton, G. (2015) Deep learning. Nature, 521(7553), pp. 436–444.

[18] Schmidhuber, J. (2015) Deep learning in neural networks: An overview. Neural Networks, 61: pp. 85–117.

[19] Xing, F., Xie, Y., Su, H., Liu, F., Yang, L. (2017) Deep learning in microscopy image analysis: A survey. IEEE Transactions on Neural Networks and Learning Systems, 29(10), pp. 4550–4568.

[20] Gupta, A., Harrison, P.J., Wieslander, H., Pielawski, N., Kartasalo, K., Partel, G., Solorzano, L., Suveer, A., Klemm, A.H., Spjuth, O., Sintorn, I.M., Wählby, C. (2019) Deep learning in image cytometry: a review. Cytometry Part A, 95(4): pp. 366–380.

[21] Otsu, N. (1979) A threshold selection method from gray-level histograms. IEEE Transactions on Systems, Man, and Cybernetics, 9(1): pp. 62–66.

[22] Beucher, S., Lantuéjoul, C. (1979) Use of watersheds in contour detection. In Proc. of the International Workshop on Image Processing: Real-Time Edge and Motion Detection/Estimation.

[23] Vincent, L., Soille, P. (1991) Watersheds in digital spaces: an efficient algorithm based on immersion simulations. IEEE Transactions on Pattern Analysis & Machine Intelligence, 13(6): pp. 583–598.

[24] Sadanandan, S.K. Deep Neural Networks and Image Analysis for Quantitative Microscopy (Doctoral dissertation). Acta Universitatis Upsaliensis. 2017.

[25] Nair, V., Hinton, G.E. (2010) Rectified linear units improve restricted Boltzmann machines. In Proc. of the 27th International Conference on Machine Learning (ICML), pp. 807–814.

[26] Glorot, X., Bordes, A., Bengio, Y. (2011) Deep sparse rectifier neural networks. In Proc. of the 14th International Conference on Artificial Intelligence and Statistics, pp. 315–323.

[27] Maas, A.L., Hannun, A.Y., Ng, A.Y. (2013) Rectifier nonlinearities improve neural network acoustic models. In Proc. of the 30th International Conference on Machine Learning (ICML), 30(1).

[28] He, K., Zhang, X., Ren, S., Sun, J. (2015) Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification. In Proc. of the IEEE International Conference on Computer Vision, pp. 1026–1034.

[29] Klambauer, G., Unterthiner, T., Mayr, A., Hochreiter, S. (2017) Self-normalizing neural networks. Advances in Neural Information Processing Systems, pp. 971–980.

[30] Glorot, X., Bengio, Y. (2010) Understanding the difficulty of training deep feedforward neural networks. In Proc. of the 13th International Conference on Artificial Intelligence and Statistics, pp. 249–256.


[31] Rumelhart, D.E., Hinton, G.E., Williams, R.J. (1988) Learning representations by back-propagating errors. Cognitive Modeling, 5(3): pp. 533–536.

[32] Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., Salakhutdinov, R. (2014) Dropout: a simple way to prevent neural networks from overfitting. The Journal of Machine Learning Research, 15(1): pp. 1929–1958.

[33] Long, J., Shelhamer, E., Darrell, T. (2015) Fully convolutional networks for semantic segmentation. In Proc. of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3431–3440.

[34] He, K., Zhang, X., Ren, S., Sun, J. (2016) Deep residual learning for image recognition. In Proc. of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778.

[35] Ronneberger, O., Fischer, P., Brox, T. (2015) U-net: Convolutional networks for biomedical image segmentation. In Proc. of the International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 234–241.

[36] Çiçek, Ö., Abdulkadir, A., Lienkamp, S.S., Brox, T., Ronneberger, O. (2016) 3D U-Net: learning dense volumetric segmentation from sparse annotation. In Proc. of the International Conference on Medical Image Computing and Computer-Assisted Intervention, pp. 424–432.

[37] Gupta, S.C., Sung, B., Prasad, S., Webb, L.J., Aggarwal, B.B. (2013) Cancer drug discovery by repurposing: teaching new tricks to old dogs. Trends in Pharmacological Sciences, 34(9): 508–517.

[38] Ljosa, V., Caie, P.D., Ter Horst, R., Sokolnicki, K.L., Jenkins, E.L., Daya, S., Roberts, M.E., Jones, T.R., Singh, S., Genovesio, A., Clemons, P.A., Carragher, N.O., Carpenter, A.E. (2013) Comparison of methods for image-based profiling of cellular morphological responses to small-molecule treatment. Journal of Biomolecular Screening, 18(10): pp. 1321–1329.

[39] Singh, S., Carpenter, A.E., Genovesio, A. (2014) Increasing the content of high-content screening: an overview. Journal of Biomolecular Screening, 19(5): pp. 640–650.

[40] Gough, A.H., Chen, N., Shun, T.Y., Lezon, T.R., Boltz, R.C., Reese, C.E., Wagner, J., Vernetti, L.A., Grandis, J.R., Lee, A.V., Stern, A.M., Schurdak, M.E., Taylor, D.L. (2014) Identifying and quantifying heterogeneity in high content analysis: application of heterogeneity indices to drug discovery. PLoS ONE, 9(7): e102678.

[41] Chan, G.K.Y., Kleinheinz, T.L., Peterson, D., Moffat, J.G. (2013) A simple high-content cell cycle assay reveals frequent discrepancies between cell number and ATP and MTS proliferation assays. PLoS ONE, 8(5): e63583.

[42] Xie, Y., Bergström, T., Jiang, Y., Johansson, P., Marinescu, V.D., Lindberg, N., Segerman, A., Wicher, G., Niklasson, M., Baskaran, S., Sreedharan, S., Everlien, I., Kastemar, M., Hermansson, A., Elfineh, L., Libard, S., Holland, E.C., Hesselager, G., Alafuzoff, I., Westermark, B., Nelander, S., Forsberg-Nilsson, K., Uhrbom, L. (2015) The human glioblastoma cell culture resource: validated cell models representing all molecular subtypes. EBioMedicine, 2(10): pp. 1351–1363.

[43] Carpenter, A.E., Jones, T.R., Lamprecht, M.R., Clarke, C., Kang, I.H., Friman, O., Guertin, D.A., Chang, J.H., Lindquist, R.A., Moffat, J., Golland, P., Sabatini, D.M. (2006) CellProfiler: image analysis software for identifying and quantifying cell phenotypes. Genome Biology, 7(10): R100.

[44] Lazarevic, A., Kumar, V. (2005) Feature bagging for outlier detection. In Proc. of the 11th ACM SIGKDD International Conference on Knowledge Discovery in Data Mining, pp. 157–166.


[45] Ling, Y.H., El-Naggar, A.K., Priebe, W., Perez-Soler, R. (1996) Cell cycle-dependent cytotoxicity, G2/M phase arrest, and disruption of p34cdc2/cyclin B1 activity induced by doxorubicin in synchronized P388 cells. Molecular Pharmacology, 49(5): pp. 832–841.

[46] Potter, A.J., Gollahon, K.A., Palanca, B.J., Harbert, M.J., Choi, Y.M., Moskovitz, A.H., Potter, J.D., Rabinovitch, P.S. (2002) Flow cytometric analysis of the cell cycle phase specificity of DNA damage induced by radiation, hydrogen peroxide and doxorubicin. Carcinogenesis, 23(3): pp. 389–401.

[47] Bland, J.M., Altman, D. (1986) Statistical methods for assessing agreement between two methods of clinical measurement. The Lancet, 327(8476): pp. 307–310.

[48] Barteneva, N.S., Fasler-Kan, E., Vorobjev, I.A. (2012) Imaging flow cytometry: coping with heterogeneity in biological systems. Journal of Histochemistry & Cytochemistry, 60(10): pp. 723–733.

[49] Caicedo, J.C., Cooper, S., Heigwer, F., Warchal, S., Qiu, P., Molnar, C., Vasilevich, A.S., Barry, J.D., Bansal, H.S., Kraus, O., Wawer, M., Paavolainen, L., Herrmann, M.D., Rohban, M., Hung, J., Hennig, H., Concannon, J., Smith, I., Clemons, P.A., Singh, S., Rees, P., Horvath, P., Linington, R.G., Carpenter, A.E. (2017) Data-analysis strategies for image-based cell profiling. Nature Methods, 14(9): pp. 849–863.

[50] Sjögren, A.K., Breitholtz, K., Ahlberg, E., Milton, L., Forsgard, M., Persson, M., Stahl, S.H., Wilmer, M.J., Hornberg, J.J. (2018) A novel multi-parametric high content screening assay in ciPTEC-OAT1 to predict drug-induced nephrotoxicity during drug discovery. Archives of Toxicology, 92(10): pp. 3175–3190.

[51] Smith, K., Piccinini, F., Balassa, T., Koos, K., Danka, T., Azizpour, H., Horvath, P. (2018) Phenotypic image analysis software tools for exploring and understanding big image data from cell-based assays. Cell Systems, 6(6): pp. 636–653.

[52] Lowe, D.G. (1999) Object recognition from local scale-invariant features. In Proc. of International Conference on Computer Vision, pp. 1150–1157.

[53] Bay, H., Ess, A., Tuytelaars, T., Van Gool, L. (2008) Speeded-up robust features (SURF). Computer Vision and Image Understanding, 110(3): pp. 346–359.

[54] Mikolajczyk, K., Schmid, C. (2005) A performance evaluation of local descriptors. IEEE Transactions on Pattern Analysis and Machine Intelligence, 27(10): pp. 1615–1630.

[55] Miksik, O., Mikolajczyk, K. (2012) Evaluation of local detectors and descriptors for fast feature matching. In Proc. of the 21st International Conference on Pattern Recognition (ICPR), pp. 2681–2684.

[56] Lenc, K., Gulshan, V., Vedaldi, A. (2012) VLBenchmarks. http://www.vlfeat.org/benchmarks/ (accessed 02.09.19).

[57] Mikolajczyk, K., Tuytelaars, T., Schmid, C., Zisserman, A., Matas, J., Schaffalitzky, F., Kadir, T., Van Gool, L. (2005) A comparison of affine region detectors. International Journal of Computer Vision, 65(1/2): pp. 43–72.

[58] Lowe, D.G. (2004) Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 60(2): pp. 91–110.

[59] Hast, A., Nysjö, J., Marchetti, A. (2013) Optimal RANSAC – towards a repeatable algorithm for finding the optimal set. Journal of WSCG, 21(1): pp. 21–30.

[60] Fischler, M.A., Bolles, R.C. (1981) Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography. Communications of the ACM, 24(6): pp. 381–395.


[61] Lomanov, K., del Rincón, J.M., Miller, P., Gribben, H. (2019) Cell Detection With Deep Convolutional Networks Trained With Minimal Annotations. In Proc. of the 16th IEEE International Symposium on Biomedical Imaging (ISBI), pp. 943–947.

[62] Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D., Erhan, D., Vanhoucke, V., Rabinovich, A. (2015) Going deeper with convolutions. In Proc. of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–9.

[63] Eulenberg, P., Köhler, N., Blasi, T., Filby, A., Carpenter, A.E., Rees, P., Theis, F.J., Wolf, F.A. (2017) Reconstructing cell cycle and disease progression using deep learning. Nature Communications, 8(1): 463.

[64] Pawlowski, N., Caicedo, J.C., Singh, S., Carpenter, A.E., Storkey, A. (2016) Automating morphological profiling with generic deep convolutional networks. BioRxiv, 085118.

[65] Sommer, C., Hoefler, R., Samwer, M., Gerlich, D.W. (2017) A deep learning and novelty detection framework for rapid phenotyping in high-content screening. Molecular Biology of the Cell, 28(23): pp. 3428–3436.

[66] Kensert, A., Harrison, P.J., Spjuth, O. (2019) Transfer learning with deep convolutional neural networks for classifying cellular morphological changes. SLAS Discovery, 24(4): pp. 466–475.


Acta Universitatis Upsaliensis
Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology 1864

Editor: The Dean of the Faculty of Science and Technology

A doctoral dissertation from the Faculty of Science and Technology, Uppsala University, is usually a summary of a number of papers. A few copies of the complete dissertation are kept at major Swedish research libraries, while the summary alone is distributed internationally through the series Digital Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology. (Prior to January, 2005, the series was published under the title "Comprehensive Summaries of Uppsala Dissertations from the Faculty of Science and Technology".)

Distribution: publications.uu.se
urn:nbn:se:uu:diva-394709

ACTA UNIVERSITATIS UPSALIENSIS
UPPSALA 2019