On the Determination Coefficient and Global
Adjustment Test of Goodness of Fit
HIV infection :: CD4+ T cells dynamics
Cecilia Azevedo; Rita Rb-Silva; Ana Horta; Margarida Correia-Neves
Centre of Mathematics, School of Sciences & Life and Health Sciences Research Institute, School
of Medicine.
University of Minho
Table of contents
1. Introduction
2. Motivation and data set
3. Statistical methodology
4. Results
5. Discussion
Introduction
CD4+ T Cells
T cells are regulators of the immune system. CD4+ T cells are also
known as helper T cells because they act on other cells of the immune
system to promote various aspects of the immune response. They have a
crucial role in the body's immune system.
Figure 1: Human CD4+ T Cell. Image courtesy of NIAID
HIV - Human immunodeficiency virus
When HIV enters a person's bloodstream, it mainly attacks
CD4+ T cells, through the CD4 protein (located on the cell surface,
Figure 2).
Figure 2: Membrane fusion
HIV is a retrovirus: its genome is single-stranded RNA, and its
replication cycle includes DNA intermediates (Figure 3).
Figure 3: Viral RNA in the host cell cytoplasm
The virus uses its own reverse transcriptase enzyme to produce DNA
from its RNA genome (Figure 4).
Figure 4: Viral DNA
That DNA is integrated into the genome of the host cell, which treats the
viral DNA as part of its own genome, transcribing and translating the viral
genes and producing the proteins required to assemble new copies of the
virus (Figure 5).
Figure 5: New copies of HIV
Number of CD4+ T cells
HIV replication results in a massive depletion of CD4+ T cells in the
blood. After this acute phase, the host generates several immune
responses and viremia declines to a lower, steady level.
However, these responses fail to eliminate the virus, leading in most
individuals to a chronic infection with an asymptomatic period that can
last several years. During the chronic phase of the infection, the blood
CD4+ T cell count declines slowly (see Figure 6).
Figure 6: Schematic diagram of the course of HIV-1 infection
Motivation and data set
::The problem::
AntiRetroviral Treatment (ART) dramatically suppresses viral replication
and reduces the plasma HIV-1 viral load to below the limits of detection
of the most sensitive clinical assays (<50 RNA copies/mL).
ART causes a progressive reconstitution of the immune system.
That must imply an increase in circulating CD4+ T-lymphocytes.
The mathematical problem includes:
growth significance :: test whether the growth in T-lymphocytes during ART is
statistically significant;
growth magnitude :: quantify globally the increase in circulating CD4+
T-lymphocytes during ART.
:: The data ::
We have CD4+ T cell counts for 50 patients (10 female) with HIV-1 who are
in the same ART program. The counts were taken at baseline (BL) and
at 1, 3, 6, 9, 12, 16, 20, 24, 28, 32, 36, 42, 48, 54, and 60 months during
ART. In this problem we consider the data for males and females
together because there is no statistical evidence that these two groups
differ with respect to CD4+ T cell counts at the different time points,
age, or body mass index (BMI at BL).
Table 1: Characteristics of the patients at the beginning of ART
          Min.    1st Qu.   Median   Mean     3rd Qu.   Max.      SD
CD4 BL    9.0     116.0     262.5    247.1    320.5     1033.0    186.44
Age       22.00   32.75     41.00    40.55    46.25     67.00     10.24
BMI       16.23   20.75     23.05    23.45    25.82     33.61     3.487
Statistical methodology
There is enormous variability in the CD4 count data, both
within subjects and between subjects (as is usual for this type of human
data). Figure 7 shows the data for each subject in a separate
panel.
[Figure 7 panels: one scatter plot per patient, titled by patient ID (for example AH039, AH015, AH073); x-axis: time points (0 to 60), y-axis: CD4 count (0 to 1500)]
Figure 7: Scatter diagram
A Linear Model
Globally, CD4+ counts grow over time (see Figure 8).
[Figure 8 plot: "Mean of CD4 T cells number with time"; x-axis: time (0 to 60), y-axis: cd4_count (350 to 650); sample sizes at the successive time points: n = 41, 43, 39, 43, 34, 42, 37, 43, 39, 44, 38, 41, 35, 43, 41, 39, 40]
Figure 8: Global mean growth of number of T cells with time
Results
Coefficient of Determination
Using all the data (all counts available for the 50 patients), we obtain the
least-squares regression line y = 455.887 + 1.628x. This
linear model has a very low coefficient of determination, R² = 0.012, so
more than 98% of the total variability in the data is not explained by the
model (see Figure 9). The model explains nothing!!!
[Figure 9 plot: all counts pooled, with the fitted line; x-axis: time (0 to 60), y-axis: CD4+ count (0 to 1500)]
Figure 9: Scatter Plot with Regression Line
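As a rough illustration of how a least-squares line and its coefficient of determination are computed, and of why large between-subject variability pushes R² down even when there is a real trend, here is a minimal Python sketch. The data are synthetic: the number of patients, the baseline spread, the slope, and the noise level are all invented for the example and are not the study data; `least_squares` is a hypothetical helper name.

```python
import random

def least_squares(x, y):
    """Fit y = intercept + slope * x by least squares; return (intercept, slope, R^2)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    syy = sum((yi - y_bar) ** 2 for yi in y)          # total variability
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    r2 = 1.0 - sse / syy                              # share of variability explained
    return intercept, slope, r2

# Synthetic data only (NOT the patients' counts): every subject grows at the
# same rate, but subjects start from very different levels.
random.seed(0)
months = [0, 1, 3, 6, 9, 12, 16, 20, 24, 28, 32, 36, 42, 48, 54, 60]
x, y = [], []
for _ in range(40):                       # 40 hypothetical subjects
    level = random.gauss(450, 180)        # assumed between-subject spread
    for m in months:
        x.append(m)
        y.append(level + 1.6 * m + random.gauss(0, 60))  # assumed slope and noise

intercept, slope, r2 = least_squares(x, y)
print(f"y = {intercept:.1f} + {slope:.3f} x, R^2 = {r2:.3f}")
```

In this sketch the fitted slope stays close to the value used to generate the data, while R² is small because most of the total variability comes from differences between subjects rather than from time.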
Utility of the model
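Is the model presented a useless model?
Global Adjustment Test based on β (the slope parameter)
H0 : β = 0 (useless model) vs H1 : β ≠ 0.
The test statistic is
$$T = \frac{\hat{\beta}}{\hat{\sigma}_{\hat{\beta}}} = \frac{\hat{\beta}}{\sqrt{MSE / s_x^2}} \, \sqrt{n-2} \sim t_{n-2}$$
(the second equality holds when MSE and $s_x^2$ are computed with the same divisor),
and the p-value is $2 \times \left(1 - \Phi(n-2, |T_{obs}|)\right)$, where $\Phi$ is the distribution function of $t_{n-2}$
and $|T_{obs}|$ is the absolute value of the observed test statistic T. In our
case the estimate of the parameter is $\hat{\beta} = 1.63$, with a low
associated standard error $\hat{\sigma}_{\hat{\beta}} = 0.572$, which gives the statistic value
$T_{obs} = 2.842$.
In this case we have 680 degrees of freedom (50×17 − 168 − 2, that is, 682 available counts minus 2), so the
p-value is 0.0046, and we reject the hypothesis that the model is useless.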
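The p-value quoted above can be checked numerically from the reported observed statistic and degrees of freedom. The sketch below uses only the Python standard library and integrates the Student t density with Simpson's rule; the function names and the number of integration steps are choices made for this illustration, not part of the original analysis.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def two_sided_p(t_obs, df, steps=2000):
    """Two-sided p-value 2 * (1 - F(|t_obs|)) for a t distribution.

    F is obtained by Simpson's rule on [0, |t_obs|]; steps must be even.
    """
    b = abs(t_obs)
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_half = total * h / 3          # integral of the density from 0 to |t_obs|
    return 1.0 - 2.0 * central_half

n_obs = 50 * 17 - 168                     # available counts, as in the text
df = n_obs - 2                            # degrees of freedom of the slope test
t_obs = 2.842                             # observed statistic reported in the text
p_value = two_sided_p(t_obs, df)
print(f"df = {df}, p-value = {p_value:.4f}")
```

With these inputs the computed p-value rounds to 0.0046, in agreement with the value reported above.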
Discussion
Conclusions
The low p-value of the Global Test of Adjustment
is a consequence of the large number of observations.
For medicine, this is a realistic model. There is a global (constant) mean
growth of about 20 T cells per year of ART (1.628 per month × 12 ≈ 19.5).
As far as we know, there is consensus that this kind of cell grows globally
during ART, but the growth had not been quantified.
Here we quantify the global growth, but this model should not be used to
make predictions.
This result allows us to answer part of the question:
if there is growth, it is slow. It should not be forgotten
that only the positive balance is being considered. Growth that
prevents decay also has to be taken into account.
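The first conclusion, that the low p-value reflects the large number of observations, can be illustrated with the identity T² = (n − 2) R² / (1 − R²), which holds for simple linear regression. The sketch below keeps R² fixed at the reported 0.012 and varies n. The sample sizes other than 682 are hypothetical, the function names are invented for the example, and the p-values use a normal approximation to the t distribution, which is only rough for small n.

```python
from statistics import NormalDist

def t_from_r2(r2, n):
    """Slope t statistic implied by R^2 in simple linear regression with n points."""
    return ((n - 2) * r2 / (1.0 - r2)) ** 0.5

def approx_two_sided_p(t_obs):
    """Normal (large-sample) approximation to the two-sided p-value."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(t_obs)))

r2 = 0.012                                # coefficient of determination from the text
for n in (50, 200, 682, 5000):            # 682 is the study size; the rest are hypothetical
    t = t_from_r2(r2, n)
    print(f"n = {n:5d}   T = {t:5.2f}   approx. p = {approx_two_sided_p(t):.4f}")
```

Under these assumptions T grows like √n for a fixed R²: with 50 points the slope is far from significant, while with 682 it clearly is. T at n = 682 comes out near 2.87 rather than 2.842 because R² is rounded to 0.012.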
to be continued . . .
THANK YOU VERY MUCH FOR YOUR ATTENTION !!!!