

The Importance of Thresholding in Imaging Analysis of Protein Aggregates
Lew Brown, Fluid Imaging Technologies, Inc.

Abstract

Dynamic imaging particle analysis (DIPA) shows much greater sensitivity to transparent particles, such as protein aggregates, than light obscuration does. While not yet fully accepted in industry compendia, DIPA is being used increasingly in the formulation process for characterization of sub-visible particulates in biologics. DIPA measures particle size and shape by first creating a binary image: each pixel in the image is compared against the background value using a defined threshold. The resulting binary image is used for all particle measurements, so the choice of thresholding technique is critical to the validity of the measurements.

This poster demonstrates how different thresholding techniques can produce dramatically different characterizations of protein aggregates. To do this, a base set of camera images is acquired for a therapeutic protein containing aggregates. These “raw images” can then be run through image processing software using different thresholding techniques to quantify and measure the aggregates they contain. By using the same set of images for each thresholding process, we ensure that any variation in results can only be caused by the thresholding process. The results show striking differences caused by the thresholding variation, thereby supporting the importance of proper thresholding when using DIPA.

Introduction

Dynamic Imaging Particle Analysis overcomes the limitations of “volume-based” particle analysis technologies such as light obscuration by combining the speed of these analyzers with the particle discriminatory capability found in microscopy. Using sophisticated pattern recognition techniques developed in the digital image processing domain, one can automatically segregate out “other allowable” particles such as silicone droplets from the particle counts, yielding much higher accuracy in the analysis of protein aggregates in parenterals.

In a dynamic imaging particle analysis system, particles are imaged in a two-dimensional space as they move past the camera’s field of view (FOV). Since the illumination is behind the flow cell as viewed by the camera, particles will typically be darker than the background because they are opaque. For this reason, most imaging particle analyzers threshold only on pixels that are darker than the background pixel value, as shown in Figure 1.

Figure 1 illustrates the fact that, even with a simple relatively opaque particle, a difference in threshold value can produce a different binary image, and hence different particle measurements. This issue becomes further compounded when a particle has transparency. With transparent particles, in some cases the particle image pixel may actually be lighter than the background pixel, so the simple binarization process discussed above will not recognize those pixels that are lighter as being part of a particle.
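The darker-only binarization described above can be reproduced with the pixel values given in Figure 1 (background of 200, thresholds of 120 and 80). The following is a minimal sketch, not the instrument's implementation. It assumes that the threshold is a number of gray levels darker than the background (the reading that matches the Figure 1 caption), that a pixel exactly at the threshold counts as a particle pixel, and that ESD is taken as the diameter of a circle of equal pixel area; the function names are illustrative.

```python
import math

BACKGROUND = 200  # background gray level stated in Figure 1

# 3 x 3 gray-scale particle image from Figure 1
IMAGE = [
    [100, 50, 100],
    [50, 150, 50],
    [100, 50, 100],
]


def dark_threshold(image, background, levels_darker):
    """Binary image: 1 where a pixel is at least `levels_darker` gray
    levels darker than the background, otherwise 0.
    (Treating "equal to the threshold" as a hit is an assumption.)"""
    return [
        [1 if background - px >= levels_darker else 0 for px in row]
        for row in image
    ]


def esd_pixels(binary):
    """Area-based equivalent spherical diameter, in pixel units:
    the diameter of a circle whose area equals the particle pixel count."""
    area = sum(sum(row) for row in binary)
    return 2.0 * math.sqrt(area / math.pi)


for threshold in (120, 80):
    binary = dark_threshold(IMAGE, BACKGROUND, threshold)
    area = sum(sum(row) for row in binary)
    print(f"threshold {threshold}: area = {area} px, "
          f"ESD = {esd_pixels(binary):.2f} px")
    for row in binary:
        print("   ", row)
```

Under these assumptions the threshold of 120 keeps only the four darkest pixels, while the threshold of 80 keeps eight, so the same particle is assigned a larger area-based ESD at the smaller threshold.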

When simple darker-only thresholding is used, as is the case with other imaging particle analyzers, transparent particles can often be cut up into smaller pieces. This results in incorrect size measurements and incorrect count or concentration calculations.

Methodology

To demonstrate how improper thresholding can affect both the physical measurements and the concentration measurements of protein aggregates, a simple experiment was constructed. A sample of bovine serum was put through a freeze/thaw cycle in order to stress it, and then run through the FlowCAM® DIPA system. The FlowCAM can collect and store “raw” images of the entire camera field of view during acquisition, allowing those raw images to be post-processed by the VisualSpreadsheet® software as though they were the “live” video stream from the camera. The identical set of original camera images can therefore be processed by the VisualSpreadsheet software using different threshold settings, so the thresholding is isolated as the only variable that differs between the data sets collected.

In the first phase of analysis, the same raw images representing a single sample were reprocessed multiple times using different threshold settings so that the results could be directly compared. The results for each run were then graphed as concentration in four different size bins: <10µm, >10µm, >25µm and >50µm. As the threshold settings become “more sensitive” to subtle gray-scale changes, more particles are captured and measured, yielding higher concentration numbers.
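The binning step can be sketched as follows. This is only an illustration under stated assumptions: the three ">" bins are treated as cumulative (a 30µm particle counts in both >10µm and >25µm), a particle of exactly 10µm is placed in the >10µm bin, and the particle diameters and analyzed volume in the example are made-up values, not data from this study. The function name `bin_concentrations` is hypothetical.

```python
def bin_concentrations(esds_um, volume_ml):
    """Convert a list of particle diameters (µm) into concentrations
    (particles/ml) for the four size bins used on this poster.

    Assumptions: the '>' bins are cumulative, and a particle of exactly
    10 µm goes into the '>10um' bin rather than '<10um'.
    """
    counts = {"<10um": 0, ">10um": 0, ">25um": 0, ">50um": 0}
    for d in esds_um:
        if d < 10:
            counts["<10um"] += 1
        else:
            counts[">10um"] += 1
            if d > 25:
                counts[">25um"] += 1
            if d > 50:
                counts[">50um"] += 1
    # Concentration = particles counted / fluid volume analyzed
    return {name: n / volume_ml for name, n in counts.items()}


# Hypothetical diameters and analyzed volume, for illustration only
example_esds = [2.0, 4.5, 8.0, 12.0, 30.0, 60.0]
print(bin_concentrations(example_esds, volume_ml=0.5))
```

Because the larger bins are cumulative in this sketch, a single large particle that is fractionated into small pieces lowers several of the larger bins at once and inflates the <10µm bin.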

In the second phase of analysis, the phenomenon of “fractionation”, whereby single large particles are cut into many small particles due to improper thresholding, was investigated. This was done by taking two specific large protein aggregates found in the raw files and studying how each aggregate is imaged at the different threshold settings.

I. The Effect of Varying Dark Threshold Level Only:

As previously stated, almost all DIPA systems use brightfield illumination, whereby the sample is illuminated from behind (“back-lit”), and thresholding is conducted by looking at where pixels are darker than the background due to the light being obstructed or refracted by the particle’s presence. However, because protein aggregates are largely transparent, the differences between background and particle pixels are much smaller, so the threshold level must be set to a smaller difference in order to properly capture the particle.

This first series of reprocessed raw files changed only the dark threshold value to see how it affected size and concentration. Because the same camera images are used for each run, the only difference between runs is the threshold value, so that variable has been completely isolated for this analysis. Table 1 shows the tabular results for this series, while Figure 2 shows the resultant graphs.

II. The Effect of Varying Dark Threshold Plus Light Threshold and Adding Neighborhood Analysis:

A unique capability of FlowCAM’s VisualSpreadsheet software is the ability to threshold on either darker pixels (relative to the background), lighter pixels or both simultaneously. Because protein aggregates can be so transparent due to an index of refraction so close to the carrier fluid, there are instances where the aggregate will actually be lighter than the background intensity. Figure 3 shows a good example of this on a large protein aggregate.
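The idea of thresholding on darker pixels, lighter pixels, or both can be sketched as below. This is not VisualSpreadsheet's code; the function name, its parameters, and the single row of pixel values are hypothetical, chosen so that part of the "particle" is lighter than a background of 200. As elsewhere, a pixel exactly at the threshold is assumed to count as a hit.

```python
def threshold_mask(image, background, dark=None, light=None):
    """Binary mask where a pixel is marked 1 if it is at least `dark`
    levels darker than the background OR at least `light` levels lighter.
    Passing None for either argument disables that test."""
    mask = []
    for row in image:
        out = []
        for px in row:
            is_dark = dark is not None and background - px >= dark
            is_light = light is not None and px - background >= light
            out.append(1 if (is_dark or is_light) else 0)
        mask.append(out)
    return mask


# Hypothetical one-row image of a transparent particle on background 200:
# the left part is darker, the middle is slightly lighter than background.
ROW = [[200, 185, 188, 201, 214, 212, 187, 200]]

dark_only = threshold_mask(ROW, 200, dark=11)
dark_and_light = threshold_mask(ROW, 200, dark=11, light=11)

print("dark 11 only     :", dark_only[0])
print("dark 11 + light 11:", dark_and_light[0])
```

In this made-up row, darker-only thresholding misses the two lighter pixels entirely. Adding the lighter test recovers them, but the pixel that is almost equal to the background still leaves a gap between the two detected regions.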

After the first series of reprocessing described above, it was observed (by viewing the images) that the run where the threshold was set eleven levels darker than the background (“D11”) produced the best results overall. Setting the threshold closer to the background than that begins to add in “artifacts” caused by noise in the camera electronics, which become quite severe at thresholds below ten (“D10”).

In the second series of reprocessed raw files, the dark threshold was kept constant at eleven (“D11”), and then the raw files were also thresholded simultaneously by finding pixels where the intensity was at least eleven levels lighter than the background (“L11”). Finally, VisualSpreadsheet’s unique capability to perform neighborhood analysis on the binary image, grouping particle pixels that are close together, was used to further improve the results by overcoming fractionation, whereby larger particles might get split up into smaller ones.
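The poster does not describe the algorithm behind the neighborhood analysis, so the following is only a stand-in that shows the general idea of grouping nearby particle pixels. It counts connected groups in a binary mask, where pixels are joined if they lie within a small gap of each other; `max_gap`, the function name, and the example mask are all hypothetical.

```python
from collections import deque


def count_particles(mask, max_gap=0):
    """Count groups of particle pixels (value 1) in a binary mask.

    Two particle pixels belong to the same group when their Chebyshev
    distance is at most max_gap + 1. With max_gap=0 this is ordinary
    8-connected labeling; larger values bridge small gaps.
    """
    rows, cols = len(mask), len(mask[0])
    reach = max_gap + 1
    seen = [[False] * cols for _ in range(rows)]
    groups = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] != 1 or seen[r][c]:
                continue
            # New group: flood-fill every particle pixel reachable from here
            groups += 1
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for ny in range(max(0, y - reach), min(rows, y + reach + 1)):
                    for nx in range(max(0, x - reach), min(cols, x + reach + 1)):
                        if mask[ny][nx] == 1 and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
    return groups


# Hypothetical binary image: fragments of one aggregate plus a distant speck
MASK = [
    [1, 1, 0, 1, 1, 0, 0, 0],
    [1, 0, 0, 0, 1, 0, 0, 1 - 1],
    [0, 0, 1, 0, 0, 0, 0, 1],
]

print("no grouping :", count_particles(MASK, max_gap=0))
print("gap of 1 px :", count_particles(MASK, max_gap=1))
```

With no gap tolerance, the nearby fragments are counted as separate small particles; allowing a one-pixel gap merges them into a single larger particle while the distant speck stays separate. This is the same direction of change reported in the poster: fewer small particles and more large ones.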

Table 2 shows the tabular results for this series, while Figure 4 shows the resultant graphs.

The graphs show exactly what the tabular data reports. The differences in the smaller size bands do not appear as dramatic as the numbers show (with the exception of the >50µm bin) due to the use of a logarithmic axis. As previously explained, the dramatic increase in particle concentrations for the three larger size bins is due to reduced fractionation of the larger particles, as shown in more detail below.

III. The Effect of Fractionation Caused by Threshold Settings:

As was seen in Section II above, the neighborhood analysis significantly increases the concentration in the larger particle size bins. This is because the neighborhood analysis helps to “group together” parts of larger particles that would normally be split apart as separate, smaller particles because they do not physically touch in the binary image.

To demonstrate this, two large protein aggregates were found on separate raw images. VisualSpreadsheet was then used to find all “particle images” which had that original raw file as their “source image”, with the resultant images displayed as shown in Figure 5.

Conclusions

This study shows how important threshold settings are for properly capturing, measuring and calculating the concentration of protein aggregates found in therapeutic proteins. Three basic conclusions are drawn:

1. The initial level for the number of levels darker than the background should first be optimized to capture the maximum number of particles without adding in noise.

2. The addition of thresholding on pixels lighter than the background significantly increases the number of aggregates captured.

3. Finally, adding neighborhood analysis to the binary images captured prevents fractionation, thereby correctly characterizing larger aggregates.

Since the index of refraction varies with the protein and carrier matrix used, it is recommended that optimization using VisualSpreadsheet tools be carried out for each different formulation prior to analysis.

Note the significant increase in the concentration of particles in all size bins caused by adding the lighter pixels into the thresholding! A dramatic change occurs in the third run when neighborhood analysis is added to the thresholding: the number of the smallest (<10µm) particles actually decreases, while the number of larger particles in the other three size bins increases dramatically. This is due to reduced fractionation of the larger particles, as discussed in the next section.
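The size of these changes can be worked out directly from the Table 2 concentrations. The short calculation below uses only the values reported in Table 2; the helper names are illustrative, and the percent change is left undefined (None) where the baseline concentration is zero.

```python
BINS = ("<10um", ">10um", ">25um", ">50um")

# Concentrations (particles/ml) as reported in Table 2
TABLE_2 = {
    "D11_Dnn0_CSH0": (79395, 5631, 301, 0),
    "D11_L11_Dnn0_CSH0": (116063, 6276, 344, 0),
    "D11_L11_Dnn0_CSH2": (112452, 15819, 2837, 387),
}


def percent_change(before, after):
    """Percent change from `before` to `after`; None if `before` is 0."""
    if before == 0:
        return None
    return 100.0 * (after - before) / before


def compare(run_a, run_b):
    """Per-bin percent change going from run_a to run_b."""
    return {
        name: percent_change(a, b)
        for name, a, b in zip(BINS, TABLE_2[run_a], TABLE_2[run_b])
    }


def show(title, changes):
    print(title)
    for name, value in changes.items():
        text = "n/a (baseline is 0)" if value is None else f"{value:+.1f}%"
        print(f"  {name}: {text}")


show("Adding light threshold (L11):",
     compare("D11_Dnn0_CSH0", "D11_L11_Dnn0_CSH0"))
show("Adding neighborhood analysis (CSH2):",
     compare("D11_L11_Dnn0_CSH0", "D11_L11_Dnn0_CSH2"))
```

From these figures, adding the light threshold raises the <10µm concentration by roughly 46%, while adding neighborhood analysis lowers it by about 3% and raises the >10µm and >25µm concentrations to roughly 2.5 and 8 times their previous values.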

Run Parameters     <10µm    >10µm    >25µm    >50µm
D40                 3095      258        0        0
D20                20977     1075       43        0
D15                43545     2235      129        0
D13                58418     4041      172        0
D12                67402     4900      172        0
D11                79395     5631      301        0
D10                92119     7909      387        0
D9                110088     9586      602        0
D8                135192    11649     1161       43

Run Parameters         <10µm    >10µm    >25µm    >50µm
D11_Dnn0_CSH0          79395     5631      301        0
D11_L11_Dnn0_CSH0     116063     6276      344        0
D11_L11_Dnn0_CSH2     112452    15819     2837      387

[Charts for Figure 2: concentration (Particles/ml, logarithmic axis) by size bin (<10um, >10um, >25um, >50um). First chart, axis 1 to 100000, series: D40_Dnn0_CSH0, D20_Dnn0_CSH0, D15_Dnn0_CSH0.lst, D13_Dnn0_CSH0.lst, D12_Dnn0_CSH0.lst. Second chart, axis 1 to 1000000, series: D12_Dnn0_CSH0.lst, D11_Dnn0_CSH0.lst, D10_Dnn0_CSH0.lst, D9_Dnn0_CSH0.lst, D8_Dnn0_CSH0.lst.]

[Chart for Figure 4: concentration (Particles/ml, logarithmic axis 1 to 1000000) by size bin (<10um, >10um, >25um, >50um). Series: D11_Dnn0_CSH0.lst, D11_L11_Dnn0_CSH0.lst, D11_L11_Dnn0_CSH2.lst.]

[Figure 1 panel: Basic Thresholding. Background = 200; particle pixel values shown in red. The same 3 x 3 original gray-scale image, given below, is shown twice, each with its resulting binary image: Threshold = 120 on the left and Threshold = 80 on the right.]

100  50 100
 50 150  50
100  50 100

Figure 1: The effect of thresholding a spherical gray-scale particle image into a binary image with two different threshold values. The smaller threshold value on the right produces a different binary of the same particle. Note that the right side binary would be assigned a larger Equivalent Spherical Diameter (ESD) based on area.

Table 1: Concentration (particles/ml) for different size bins for each run using different levels of dark thresholding (e.g. “D40” means a threshold of 40 levels darker than background).

Figure 2: Concentration (particles/ml) for different size bins for each run using different levels of dark thresholding (e.g. “D40” means a threshold of 40 levels darker than background).

Table 2: Concentration (particles/ml) for different size bins for each run, adding simultaneous light thresholding (e.g. “L11” means a threshold of 11 levels lighter than background) and neighborhood analysis (“CSH0” indicates no extra processing while “CSH2” means that neighborhood processing was done on the binary image). The first run was the “baseline” of dark thresholding only.

Figure 4: Concentration (particles/ml) for different size bins for each run using different combinations of thresholding parameters (see Table 2 for key).

Threshold Dark 11 Only (29 Particles Found)

Threshold Dark 11 Plus Light 11 (50 Particles Found)

Threshold Dark 11 Plus Light 11, Plus Neighborhood Analysis (4 Particles Found)

Threshold Dark 11 Only (38 Particles Found)

Threshold Dark 11 Plus Light 11 (62 Particles Found)

Threshold Dark 11 Plus Light 11, Plus Neighborhood Analysis (18 Particles Found)

Figure 5: Individual particle images resulting from different threshold settings for each of the two original raw images. Note how the addition of the neighborhood analysis is able to finally capture the entire large protein aggregate.

Figure 3: Large protein aggregate captured by FlowCAM, showing area of aggregate where the pixels are actually lighter than the background.

Rawfile #110 Rawfile #513