International Journal of Scientific and Education Research
Vol. 2, No. 04; 2018
http://ijsernet.org/
www.ijsernet.org Page 81
LOGISTIC REGRESSION ANALYSIS OF INFANT MORTALITY:
EVIDENCE FROM GHANA
Felix Atanga Adongo¹, Richmond Essieku², John Amo Jr. Lewis³, John Boamah⁴
¹University of Mines and Technology, Ghana; ²University of Cape Coast, Ghana; ³University of Liberia, Liberia; ⁴University of Cape Coast, Ghana
ABSTRACT
Aware of the appreciable level of infant mortality in the Bongo district and its toll on the health
and general well-being of the inhabitants, the study adopted an econometric tool, modelled the risk
factors of the phenomenon and made recommendations to ameliorate the problem. Binary Logistic
Regression was adopted to analyse data from questionnaires administered in five community
clinics of the district. The study in its uniqueness considered four level risk factors comprising the
mother, child, environmental and the medical attendants’ level. For the mother level risk factors,
the nutritional status and antenatal care were significant as far as infant deaths are concerned. For
the child level which included the sex and size of the child, results show that this model was not
very informative given its overly poor fit and severely biased estimates evidenced by the likelihood
ratio and Hosmer-Lemeshow tests. The level of sun in the region of the pregnant woman was a
significant contributor to infant mortality in the district. Care immediately after delivery from the
medical attendants’ level was also a significant contributor to infant deaths in the district. The
study also analysed data on infant deaths from the BDH to find out the possibility of infant survival
in the hospital. The results revealed quite a substantial likelihood of infant survival given a
maximum of 3% possibility of a baby expiring after birth.
Keywords: Binary Logistic Regression, Risk Factors, Model, Infant Mortality, Likelihood Ratio,
Hosmer-Lemeshow
Abbreviation: BDH: Bongo District Hospital
Introduction
The loss of a child from birth remains a sad reality as it exacts a toll on the health and well-being
of the immediate family and the society. Infant mortality rate is often used as one of the indicators
to measure the health and well-being of an economy as its occurrence taints the outlook of the
nation. Every government therefore prioritizes the fight against infant mortality.
According to Oestergaard et al. (2011), 7.7 million children below the age of one year died
worldwide, of whom 3.1 million died in the first month after birth. More than five million children under
one year of age die every year in Africa, about half of them within the first four weeks
of birth (Kwara, 2012). According to WHO (2016), 4.2 million deaths, representing 75% of all deaths
below five years of age, occurred in the first year of life in 2016. Infant death is highest in Africa as
it records about 52 deaths per 1,000 live births. This is roughly six times higher than that of the
European region which is about 8 deaths per every 1,000 live births (WHO, 2016).
The level of infant deaths has not declined substantially over the past few years despite
technological progress in the health sector and increased attention to parental care. While
most would expect the rate to be declining appreciably, it has remained fairly
steady since the early 2000s. There are several risk factors pertinent to infant deaths in our societies
today. Sudden Infant Death Syndrome (SIDS), the unexplained sudden death of a baby,
has claimed the lives of many children below one year of age. In 2005, 2,234 infants died
due to SIDS (WHO, 2006).
The environment that accommodates the child after birth contributes hugely to the health of the
child and its survival. Society underestimates the effect of the environment on the survival of the
child. For example, excessively hot weather can endanger the babies of pregnant women, as can
unsafe drinking water, households that forbid
a woman from delivering at the hospital rather than at home, and households too remote from
a hospital for delivery there to be feasible. These and many other environmental
issues have an appreciable impact on infant deaths and deserve attention in addressing the
problem.
This research focuses on the Bongo district in the Upper East Region of Ghana where infant
mortality is widespread. During the dry season, from the start of February to the close
of April, the district becomes excessively hot, with an average temperature of about 38 °C, which is not
congenial for pregnant women. Preterm births and stillbirths are predominant during this period in the
district, which could be attributed to the hostile weather conditions. There has been
little to no progress in research relevant to mitigating child deaths in the first year of age in the
district. Consequently, this paper identified the major risk factors of infant deaths in the district’s
context and modelled these factors at four different levels using the Logistic Regression model.
Kwara (2012) modelled the risk factors of neonatal mortality in Ghana but did not consider the
medical or traditional birth attendants’ level risk factors which are often disregarded but appear to
have a serious impact on deaths of infants. Infant deaths can be triggered by birth
complications such as breech presentation, in which case much of the blame falls on the doctors and midwives for
exhibiting incompetence. Data obtained from questionnaires administered to workers in five
community clinics with fairly good knowledge of health and child mortality were used for this
analysis. The study also made use of secondary data on infant deaths from the district hospital
from 2006 to 2014. These data were analysed to predict the chance of a baby surviving given that
the mother delivers at the BDH.
The rest of the paper is organized as follows: section II reviews the related literature,
section III presents the methodology and conceptual framework adopted in the study, section
IV is concerned with the analysis of results and findings, section V discusses the findings and
section VI concludes and provides recommendations and gaps of the research.
Literature Review
This section reviews literature on infant deaths and their major causes around the world.
According to Kwara (2012), a noticeable level of transformation is taking place in areas relevant
to maternal and child health in order to realize the international declaration and country
commitment objectives. The need for evaluation of and information on child mortality has thus
become progressively obvious. This section presents a review on infant mortality, its relationship
with stillbirth, perinatal mortality, neonatal mortality, post-neonatal mortality and their associated
risk factors. Conscious of the negative effects of infant mortality on the immediate families,
communities and the nation as a whole, a good number of scholars and researchers have examined
the phenomenon and its related risk factors; their findings are summarized below.
Effect of Preterm Delivery on Infant Mortality
Khashu et al. (2009) conducted a study to compare the mortality and morbidity of late preterm
infants to those born at term. Data were collected from the British Columbia Perinatal Registry
(BCPR) and were analysed including all singleton births between 33 and 40 weeks gestation from
April 1999 to March 2002 in the province of British Columbia, Canada. The birth cohort was
divided into late preterm (33-36 weeks, n = 6,381) and term (37-40 weeks, n = 88,867) groups.
The results show that stillbirth rate, perinatal, neonatal and infant mortality rates were significantly
higher in the late preterm group compared to the term group.
Effect of Mother's BMI on Infant Mortality
Chen et al. (2009) studied maternal obesity and the risk of infant death in
the United States. The aim of this research was to examine the effects of maternal obesity on
neonatal and postnatal death separately, and to examine causes of infant death associated with
maternal obesity. The study examined the association between maternal obesity and the risk of infant
death using data from the 1988 US National Maternal and Infant Health Survey (NMIHS). A case-control
analysis of 4,265 infant deaths and 7,293 controls was conducted. Self-reported
pregnancy BMI and weight gain were used in the primary analysis, whereas weight variables in
medical records were used in a subset of 4,308 women. They found that, compared with normal-weight
women who gained 0.66 to 0.97 lb/wk during pregnancy, obese women had a significantly
increased risk of neonatal death and overall infant death.
Effect of Maternal Diabetes on Infant Mortality
Dunne et al. (2009) carried out a study to evaluate pregnancy outcomes in pre-gestational
diabetes along the Atlantic seaboard from 2006 to 2007. The Atlantic Diabetes in Pregnancy group
representing five antenatal centres across a wide geographical area in Ireland was established in 2005.
All women with diabetes for more than 6 months before the index pregnancy were included. The
pregnancy outcome was compared with background rates. Prospective information was obtained
from 104 singleton pregnancies from 2006 to 2007 and compared with the background population.
The results show significant associations with stillbirth and perinatal mortality (PM), where rates were 5.0
and 3.5 times those of the background population respectively.
Effect of Alcohol Consumption and Smoking on Infant Mortality
Rasch (2003) studied the association between cigarette, alcohol, and caffeine
consumption and the occurrence of spontaneous abortion. The study population consisted of 330
women with spontaneous abortion and 1,168 pregnant women receiving antenatal care. A case-control
design was used; cases were defined as women with a spontaneous abortion in
gestational week 6-16 and controls as women with a live foetus in gestational week 6-16.
The variables studied included age, parity, occupational situation, cigarette,
alcohol, and caffeine consumption. The study found a significant association between
alcohol consumption (5 or more units of alcohol per week) during pregnancy and spontaneous
abortion (OR: 4.8; CI: 2.9-8.2).
Methodology
With advances in technology, many researchers tackle problems using
statistical software packages with only slight knowledge of the methods used. Statistical
software packages have certainly helped in the analysis of data, but as a result researchers seldom engage
with the conceptual framework of the methods used. This section therefore focuses on the methods adopted
in analysing the data in the subsequent section.
The Econometrics of Logistic Regression
In econometrics, logistic regression is a type of non-linear probabilistic regression whose link
function is the logit, the inverse of the logistic CDF. It is a type of regression model that is used to predict a categorical
response given one or more predictor variables. Examples in the binary case include whether or not a customer
takes up a solar panel offer, whether or not a student passes an accountancy
test, and whether a child survives or dies after birth. Since the dependent variable is not
continuous, we cannot predict a numerical value for it; instead we predict the probability that the
response occurs. Unlike the Linear Probability Model (LPM), the logistic regression model is very
useful because it can take any input from negative to positive infinity while its output
takes values between zero and unity and is therefore interpretable as a probability. The logistic regression
model can first be understood by looking at the logistic function defined below:
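$$f(z) = \frac{e^{z}}{1 + e^{z}} \qquad (3.1)$$
The Logit Function
Considering the logistic function in equation (3.1), let $z$ be a linear function of an explanatory
variable, say $x$, so that $z = \beta_0 + \beta_1 x$, where $\beta_0$ and $\beta_1$ are constants; then, dividing the
numerator and denominator by $e^{z}$, the logistic function is given by:
$$p(x) = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x)}} \qquad (3.2)$$
The inverse of the logistic function is:
$$g(p(x)) = \ln\left(\frac{p(x)}{1 - p(x)}\right) = \beta_0 + \beta_1 x \qquad (3.3)$$
$g(p(x))$ is the natural logarithm of the odds and equivalently:
$$\frac{p(x)}{1 - p(x)} = e^{\beta_0 + \beta_1 x}$$
where $p(x) / (1 - p(x))$ is the odds.
Generally, $g(\cdot)$ is the logit function of some linear combination of the predictors. The equation
for $g(p(x))$ in (3.3) illustrates that the logit (natural logarithm of the odds) is equivalent to the
linear regression expression. $p(x)$ is the probability that the dependent variable equals a case, which
is coded 1 rather than 0, given some linear combination of predictors.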
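The relationship between the logistic function and the logit can be checked numerically. The following is a minimal sketch, assuming a single explanatory variable; the coefficient values are hypothetical and are not estimates from this study.

```python
import math

def logistic(z):
    """Logistic function of equation (3.1): maps any real z to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def logit(p):
    """Inverse of the logistic function: natural log of the odds p / (1 - p)."""
    return math.log(p / (1.0 - p))

# Hypothetical coefficients, chosen only for illustration.
beta0, beta1 = -1.5, 0.8

for x in (0, 1, 2, 3):
    z = beta0 + beta1 * x          # linear predictor
    p = logistic(z)                # probability that the outcome is coded 1
    # The logit of p recovers the linear predictor, as in equation (3.3).
    print(x, round(p, 4), round(logit(p), 4))
```

Whatever the value of the linear predictor, the probability stays strictly between zero and one, and applying the logit to it returns the linear predictor.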
The Odds Ratio
The odds of the dependent variable equalling a case are equivalent to the exponential function of
the linear regression expression. So we can define odds of the dependent variable equalling a case
(given some linear combination of the predictors) as follows:
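$$\text{odds} = e^{\beta_0 + \beta_1 x} \qquad (3.4)$$
From equation (3.4), the odds ratio (OR) can be defined as the ratio of the odds after a one-unit increase in
the independent variable to the odds before the increase, which is given by:
$$OR = \frac{\text{odds}(x + 1)}{\text{odds}(x)} = \frac{e^{\beta_0 + \beta_1 (x + 1)}}{e^{\beta_0 + \beta_1 x}} = e^{\beta_1} \qquad (3.5)$$
Hence the odds ratio is given by $e^{\beta_1}$. This factor is the OR for the independent variable $x$
and it gives the relative amount by which the odds of the outcome increase (OR greater than 1)
or decrease (OR less than 1) when the value of the independent variable is increased by 1 unit.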
For example, suppose the variable INFANT MORTALITY is coded as 0 = (Infant Survival) and 1 = (Infant
Mortality), and the odds ratio for an independent variable, say antenatal care, is 3.2. This means
that, in the model, a one-unit increase in antenatal care multiplies the odds of the outcome (INFANT MORTALITY)
by 3.2; that is, the odds of infant death relative to infant survival become 3.2 times as large, with the
other predictors held constant.
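The interpretation of the odds ratio can be illustrated with a short sketch. The intercept and slope below are hypothetical; the slope is chosen so that the odds ratio is approximately 3.2, matching the example in the text.

```python
import math

def odds(beta0, beta1, x):
    """Odds of the outcome being coded 1 for a given x, as in equation (3.4)."""
    return math.exp(beta0 + beta1 * x)

# Hypothetical intercept and slope; only the slope affects the odds ratio.
beta0, beta1 = -2.0, 1.1632

# Ratio of the odds at x + 1 to the odds at x, here for x = 2.
odds_ratio = odds(beta0, beta1, 3) / odds(beta0, beta1, 2)

print(round(odds_ratio, 2))        # ratio computed from the two odds
print(round(math.exp(beta1), 2))   # the same quantity computed directly
```

The ratio does not depend on the starting value of the independent variable or on the intercept, which is why a single odds ratio can be reported for each predictor.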
Assumptions of the Logistic Regression Model
Logistic regression does not make many of the key assumptions of linear regression and general
linear models that are based on ordinary least squares algorithms, particularly regarding linearity,
normality, homoscedasticity, and measurement level (Anon., 2014).
The key assumptions of the logistic regression model are as follows:
• Binary logistic regression requires the dependent variable to be binary.
• Logistic regression assumes that P(Y = 1) is the probability of the event occurring, so the dependent variable must be coded accordingly. That is, the factor level 1 of the dependent variable should represent the desired outcome.
• The model should be fitted correctly. That is, only the meaningful variables should be included.
• The model should have little or no multicollinearity.
• Logistic regression assumes a linear relationship between the independent variables and the log odds.
• Logistic regression requires quite a large sample size. Reliability of estimates declines as fewer observations are used.
Tests and Goodness of Fit Measures Adopted in the Study
The Hosmer-Lemeshow Test
The Hosmer-Lemeshow test is a statistical test for goodness of fit used in logistic regression
modelling. The data are divided into approximately ten groups defined by increasing order of
estimated risk. The observed and expected numbers of cases in each group are calculated and a Chi-squared
statistic is calculated by the use of the formula:
$$H = \sum_{g=1}^{G} \frac{(O_g - E_g)^2}{E_g \left(1 - E_g / N_g\right)} \qquad (3.6)$$
with $O_g$, $E_g$ and $N_g$ being the observed events, expected events and number of observations for the
$g$-th risk decile group respectively, and $G$ the number of groups. The test statistic follows a Chi-squared
distribution with $G - 2$ degrees of freedom. A large value of Chi-squared (with a small p-value,
below the 0.05 level of significance) indicates poor fit, while a small Chi-squared value with a large p-value
(closer to 1) indicates a good logistic regression model fit.
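The grouping and summation behind the Hosmer-Lemeshow statistic can be sketched as follows. This is an illustrative implementation under stated assumptions, not the routine used by SPSS: groups are formed with (nearly) equal sizes after sorting by fitted probability, and the outcomes and fitted probabilities are hypothetical.

```python
def hosmer_lemeshow(y, p_hat, n_groups=10):
    """Return (statistic, degrees of freedom) for the Hosmer-Lemeshow test.

    y     : list of observed outcomes coded 0/1
    p_hat : list of fitted probabilities from a logistic regression
    """
    # Sort observations by increasing estimated risk.
    pairs = sorted(zip(p_hat, y))
    n = len(pairs)
    statistic = 0.0
    for g in range(n_groups):
        # Split the sorted observations into groups of (nearly) equal size.
        group = pairs[g * n // n_groups:(g + 1) * n // n_groups]
        n_g = len(group)
        observed = sum(outcome for _, outcome in group)   # observed events
        expected = sum(prob for prob, _ in group)         # expected events
        # Denominator E_g * (1 - E_g / N_g); assumes 0 < E_g < N_g in each group.
        statistic += (observed - expected) ** 2 / (expected * (1 - expected / n_g))
    return statistic, n_groups - 2

# Hypothetical outcomes and fitted probabilities, for illustration only.
y = [0, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 1, 1, 1, 1]
p_hat = [0.05, 0.10, 0.15, 0.20, 0.25, 0.35, 0.40, 0.50,
         0.55, 0.60, 0.70, 0.75, 0.80, 0.90, 0.95]

statistic, df = hosmer_lemeshow(y, p_hat, n_groups=5)
print(statistic, df)
```

The statistic is then compared with a Chi-squared distribution on the returned degrees of freedom; when observed and expected counts agree closely in every group, the statistic is near zero.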
The Likelihood Ratio Test
The Likelihood Ratio test is performed by estimating two models and comparing the fit of one
model to the fit of the other. Removing predictor variables from a model will almost always make
the model fit less well (that is a model will have a lower log likelihood), but it is necessary to test
whether the observed difference in model fit is statistically significant. The likelihood ratio test
does this by comparing the log likelihoods of the two models, if this difference is statistically
significant (p-value less than 0.05), then the less restrictive model (the one with more predictors)
is said to fit the data significantly better than the more restrictive model (the one with only the
constant or fewer predictors). If one has the log likelihoods from the models, the likelihood ratio
test is fairly easy to calculate. The formula for the likelihood ratio test is:
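$$LR = -2 \ln\left(\frac{L(m_1)}{L(m_2)}\right) = 2\left[\ell(m_2) - \ell(m_1)\right] \qquad (3.7)$$
where $L(m_1)$ and $L(m_2)$ are the likelihoods of the more restrictive and the less restrictive model respectively, and $\ell(\cdot)$ denotes the corresponding log likelihood.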
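The calculation can be sketched in a few lines. The log likelihoods below are hypothetical; the Chi-squared tail probability is computed with a standard recurrence for the incomplete gamma function so that the sketch needs only the Python standard library.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a Chi-squared variable with integer df >= 1."""
    z = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-z)                # Q(1, z)
    else:
        a, q = 0.5, math.erfc(math.sqrt(z))     # Q(1/2, z)
    # Recurrence Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1).
    while a < df / 2.0:
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q

def likelihood_ratio_test(ll_restricted, ll_full, df):
    """Return the likelihood ratio statistic and its p-value.

    df is the number of predictors dropped in the more restrictive model.
    """
    statistic = 2.0 * (ll_full - ll_restricted)
    return statistic, chi2_sf(statistic, df)

# Hypothetical log likelihoods of an intercept-only model and a model with
# three predictors; these are not values from the study.
statistic, p_value = likelihood_ratio_test(-60.0, -52.0, df=3)
print(round(statistic, 3), round(p_value, 4))
```

As a consistency check, `chi2_sf(15.839, 3)` is about 0.001, which agrees with the p-value reported for the mother level model in the results tables.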
Wald Statistic
The Wald statistic can also be used to assess the individual predictors in a particular model. Unlike
linear regression, where the t statistic is used to assess the significance of coefficients in the
model, in logistic regression the Wald statistic is used to assess the contribution of each predictor
to the model. The Wald statistic is the ratio of the square of the regression coefficient to the
square of the asymptotic standard error of the coefficient, and it is asymptotically distributed as
a chi-square distribution with one degree of freedom. The Wald statistic can be obtained
by:
$$W = \frac{\hat{\beta}^2}{\left[\text{A.S.E}(\hat{\beta})\right]^2} \qquad (3.8)$$
where A.S.E is the Asymptotic Standard Error of the regression coefficient $\hat{\beta}$. The significance
of a variable depends largely on the Wald statistic: the larger the statistic, the smaller the p-value
and the stronger the evidence that the predictor contributes to the model.
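A short sketch shows how the Wald statistic, its p-value, the odds ratio and a 95% confidence interval follow from a coefficient and its standard error. The inputs are the antenatal care estimates reported in the mother level results (coefficient -2.209, standard error 0.752); the critical value 1.96 and the function name are assumptions of this sketch, and small differences from the published figures are expected because the inputs are rounded.

```python
import math

def wald_summary(beta, se, z_crit=1.96):
    """Return the Wald statistic, p-value, odds ratio and 95% CI for the odds ratio."""
    wald = (beta / se) ** 2
    # Upper tail of a Chi-squared distribution with one degree of freedom.
    p_value = math.erfc(math.sqrt(wald / 2.0))
    odds_ratio = math.exp(beta)
    ci = (math.exp(beta - z_crit * se), math.exp(beta + z_crit * se))
    return wald, p_value, odds_ratio, ci

# Antenatal care coefficient and standard error as reported in the results.
wald, p_value, odds_ratio, ci = wald_summary(-2.209, 0.752)
print(round(wald, 3), round(p_value, 3), round(odds_ratio, 3))
print(tuple(round(bound, 3) for bound in ci))
```

A confidence interval for the odds ratio that excludes 1 corresponds to a predictor that is significant at the 5% level.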
The Pseudo R Squares
In linear regression, the squared multiple correlation, R squared, is used to assess goodness
of fit as it represents the proportion of variability in the outcome that is explained by the model. In
logistic regression analysis, there is no agreed upon analogous measure, but there are several
competing measures, each with limitations. The Cox and Snell R squared is an alternative index of
goodness of fit related to the R squared from linear regression. The Cox and Snell index is
problematic because its maximum value is below unity; for a binary outcome it is at most 0.75, so the
index can account for at most three quarters of the variability in the
outcome. The Nagelkerke R squared, however, provides a correction to the Cox and Snell R squared
so that the maximum value is equal to unity. Yet a higher R squared value does not guarantee
a good model fit, as it increases with the number of predictor variables. Fitting the model with a lot
of independent variables inflates the R squared yet is not very informative, as the reliability of the
estimates declines and inferences from such estimates may be unsound. So in fitting a model we
take great care over the trade-off between parsimony and goodness of fit.
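Both indices can be computed from the log likelihoods of the intercept-only model and the fitted model. The sketch below uses the standard definitions of the two measures; the sample size and log likelihoods are hypothetical and chosen to show the 0.75 ceiling of the Cox and Snell index for an evenly split binary outcome.

```python
import math

def pseudo_r_squared(ll_null, ll_model, n):
    """Return (Cox and Snell R squared, Nagelkerke R squared).

    ll_null  : log likelihood of the intercept-only model
    ll_model : log likelihood of the fitted model
    n        : number of observations
    """
    cox_snell = 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)
    # Largest value Cox and Snell can reach, attained when ll_model = 0.
    max_cox_snell = 1.0 - math.exp(2.0 * ll_null / n)
    nagelkerke = cox_snell / max_cox_snell
    return cox_snell, nagelkerke

# Hypothetical sample of 100 observations with an evenly split outcome,
# so the intercept-only log likelihood is 100 * ln(0.5).
n = 100
ll_null = n * math.log(0.5)

# A perfectly fitting model has log likelihood 0.
print(pseudo_r_squared(ll_null, 0.0, n))

# A hypothetical, more modest improvement in log likelihood.
print(pseudo_r_squared(ll_null, ll_null + 8.0, n))
```

Dividing by the attainable maximum is what allows the Nagelkerke index to reach unity.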
Data Collection and Analysis
This section looks at the analysis of data collected from questionnaires administered in five
community clinics in the district and data on infant deaths collected from the BDH from 2006
to 2014. A total of 118 out of 125 questionnaires administered were returned, a non-response
rate of roughly 5.6%. Thus, a total of 118 observations from the
questionnaires were used in the first part of the analysis (modelling the risk factors of infant
mortality at various levels). The questionnaire divided the risk factors into four levels: the
mother, child, environmental and medical attendants’ level factors. Factors
included in the analysis of the mother level are the nutritional status of the mother, antenatal care
and the educational level of the mother. From the child level category, the factors included were
the sex of the child and the size of the child. From the environmental level category, the only factor
included, given the context of the district, was the level of sun. From the medical attendants’
level category, the factors considered were the handling of complications and the care for the child
immediately after delivery (postnatal care). The data collected from the district hospital contained
the number of infant deaths and the number of births from 2006 to 2014. These secondary data are
used in the second part of the analysis (predicting the chances of infant survival in the district
hospital).
The administered questionnaire was divided into six broad sections. The first section is the
demographic and socio-economic characteristics of the respondent; which entailed the gender, age
bracket, educational level, occupation, marital status and community of residence. The second
section solicited for the respondent’s knowledge about infant mortality in the district. It asks
respondents whether or not infant deaths ever occur in the community clinic where they work and
about the current level of the infant mortality rate in their clinic, if any. The former question
was coded in the “Yes”/“No” response category and provides the dependent variable “Infant
Mortality”, which in our case is binary: “Yes” was coded as 1 for infant mortality and “No”
was coded as 0 for infant survival. The last four sections of the questionnaire were designed
in the same way and were captioned as: determinants of infant mortality from the
mother level, determinants of infant mortality from the child level, determinants of infant mortality
from the environment level and determinants of infant mortality from the medical attendants’ level.
The questionnaire places all questions in sections 4 – 6 on a six-point Likert scale comprising
strongly disagree, disagree, somewhat disagree, somewhat agree, agree and strongly agree. It
asks respondents whether, if infant deaths had occurred in their community clinic
of work, other clinics, hospitals or homes within the district, the listed risk factors
were the likely cause. Strongly disagree is coded as 1, and the codes follow the order above through to 6 for
strongly agree. The questionnaire additionally allows respondents to give their views on
risk factors not captured in the questionnaire. Most information on respondents is captured in
the introduction.
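The coding scheme described above can be expressed as a small lookup. This is a sketch only: the field names (`infant_mortality`, `level_of_sun`, `antenatal_care`) and the example response are hypothetical and are not taken from the questionnaire.

```python
# Codes for the six-point Likert scale described in the text.
LIKERT_CODES = {
    "strongly disagree": 1,
    "disagree": 2,
    "somewhat disagree": 3,
    "somewhat agree": 4,
    "agree": 5,
    "strongly agree": 6,
}

# Codes for the binary dependent variable "Infant Mortality".
OUTCOME_CODES = {"yes": 1, "no": 0}

def encode_response(response):
    """Convert one questionnaire response (a dict of answers) to numeric codes.

    The key "infant_mortality" is a hypothetical field name; every other key
    is treated as a Likert-scaled risk factor.
    """
    encoded = {}
    for question, answer in response.items():
        answer = answer.strip().lower()
        if question == "infant_mortality":
            encoded[question] = OUTCOME_CODES[answer]
        else:
            encoded[question] = LIKERT_CODES[answer]
    return encoded

# Hypothetical response, for illustration only.
example = {
    "infant_mortality": "Yes",
    "level_of_sun": "Strongly agree",
    "antenatal_care": "Disagree",
}
print(encode_response(example))
```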
The following table reports the multicollinearity check of the risk factors
considered in the modelling process. There are a total of 8 independent variables modelled at
four different levels. The software used in the analysis is the Statistical Package for Social
Sciences (SPSS).
Collinearity and Diagnostic Test (Collinearity Statistics)

| Independent Variable | Tolerance | VIF |
|---|---|---|
| Nutritional Status of Mother | 0.676 | 1.476 |
| Attendance of Antenatal Care | 0.990 | 1.010 |
| Level of Mother’s Education | 0.994 | 1.006 |
| Sex of Child | 0.972 | 1.029 |
| Size of Child | 0.936 | 1.068 |
| Level of Sun | 0.920 | 1.087 |
| Handling of Complications | 0.967 | 1.034 |
| Care Immediately after Delivery | 0.999 | 1.001 |

Prior to running logistic regression models, a multicollinearity test must be run to
ensure there are no issues of perfect collinearity or multicollinearity, since estimates of logistic
regression models are sensitive to multicollinearity. Note that the Variance Inflation Factor (VIF)
values are all less than 5 (see for example Rogerson, 2005), which implies that there is no strong
correlation among the variables and therefore these can all be included in running the logistic
regression models.
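The VIF is the reciprocal of the tolerance, so the screening rule can be reproduced from the reported tolerance values. The sketch below takes the tolerances from the collinearity table and applies the threshold of 5 mentioned in the text; small differences from the published VIF values arise because the tolerances are rounded to three decimals.

```python
# Tolerance values as reported in the collinearity table.
tolerance = {
    "Nutritional Status of Mother": 0.676,
    "Attendance of Antenatal Care": 0.990,
    "Level of Mother's Education": 0.994,
    "Sex of Child": 0.972,
    "Size of Child": 0.936,
    "Level of Sun": 0.920,
    "Handling of Complications": 0.967,
    "Care Immediately after Delivery": 0.999,
}

VIF_THRESHOLD = 5.0  # rule of thumb cited in the text

def variance_inflation_factor(tol):
    """VIF is the reciprocal of the tolerance."""
    return 1.0 / tol

vif = {name: variance_inflation_factor(tol) for name, tol in tolerance.items()}

for name, value in vif.items():
    flag = "ok" if value < VIF_THRESHOLD else "check"
    print(f"{name}: VIF = {value:.3f} ({flag})")
```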
Results of the Mother Level Risk Factors
Overall Model Evaluation
| Test | Chi-Squared | DF | p-value |
|---|---|---|---|
| Likelihood Ratio Test | 15.839 | 3 | 0.001 |
| Goodness of Fit: Hosmer-Lemeshow Test | 1.951 | 5 | 0.856 |

R Squared: Cox and Snell’s R Squared = 0.165; Nagelkerke R Squared = 0.269

| Predictor | 𝛽 (std. error) | Wald | DF | p-value | Odds Ratio | 95% CI for the Odds Ratio (Lower) | 95% CI for the Odds Ratio (Upper) |
|---|---|---|---|---|---|---|---|
| Nutritional Status of Mother | 2.065 (0.798) | 6.689 | 1 | 0.010 | 7.883 | 1.649 | 37.689 |
| Antenatal Care | -2.209 (0.752) | 8.637 | 1 | 0.003 | 0.110 | 0.025 | 0.479 |
| Level of Mother’s Education | 1.625 (0.868) | 3.502 | 1 | 0.061 | 5.077 | 0.926 | 27.839 |
| Constant | -2.318 (3.124) | 0.550 | 1 | 0.458 | 0.099 | | |

From the first table, the Nagelkerke R squared indicates that the logistic regression model accounts for
26.9% of the variability in the dependent variable. The p-value of the
likelihood ratio test is less than 0.05, implying that the model with predictors fits significantly better
than the model with only the constant (the more restrictive model). The Hosmer-Lemeshow test has a
p-value greater than 0.05, indicating that the model with three predictors is a good fitting model.
Looking at the Wald criterion and odds ratio table, we observe that all variables are significant at the 5% level
except the mother’s level of education, which is marginally significant at the 10% alpha level. This
shows that the model from the mother level is a statistically stable model. The coefficient on
antenatal care is -2.209, which means that infant mortality is less likely to occur when antenatal care
increases. The odds ratio for the nutritional status of the mother is 7.883. This shows that the odds
of infant mortality relative to infant survival are multiplied by 7.883
when the category of the mother’s nutritional status increases by one, with other variables held
constant. This is counterintuitive and contrary to the expectations of the study, since it implies that a
woman who is well nourished has a higher chance of her baby dying after birth.
Results of the Child Level Risk Factors
Overall Model Evaluation
| Test | Chi-Squared | DF | p-value |
|---|---|---|---|
| Likelihood Ratio Test | 0.475 | 2 | 0.796 |
| Goodness of Fit: Hosmer-Lemeshow Test | 8.371 | 4 | 0.079 |

R Squared: Cox and Snell’s R Squared = 0.042; Nagelkerke R Squared = 0.078

| Predictor | 𝛽 (std. error) | Wald | DF | p-value | Odds Ratio | 95% CI for the Odds Ratio (Lower) | 95% CI for the Odds Ratio (Upper) |
|---|---|---|---|---|---|---|---|
| Sex of Child | 0.464 (0.743) | 0.391 | 1 | 0.532 | 1.591 | 0.371 | 6.821 |
| Size of Child | 0.056 (0.189) | 0.087 | 1 | 0.768 | 1.057 | 0.730 | 1.531 |
| Constant | -1.497 (0.695) | 4.634 | 1 | 0.031 | 0.224 | | |

The Nagelkerke R squared, which corrects the Cox and Snell R squared, indicates that the logistic regression
model explains only 7.8% of the variability in the dependent variable. In the case of the child level factors, the first table
shows that the model with predictors did not fit significantly better than the model with the
intercept only, given the large p-value of the likelihood ratio test, 0.796. Results from the
Hosmer-Lemeshow test also suggest that the model with the two predictors did not fit the data well, as
the p-value of the test (0.079) is far from unity. The second table confirms the output of the first table because
neither of the two variables considered in the model is significant. Size of child has a
positive coefficient, indicating a higher likelihood of infant mortality when the size of the baby
increases. Interpreting the coefficient on sex of child directly is not very informative since the variable is categorical. The odds ratio
of the variable Sex of child is 1.591, meaning the odds of infant mortality
are 1.591 times as high for male infants as for female infants. The standard errors are substantially larger
than the coefficients, indicating a highly unstable model. Thus, we obtain a constant-only model from this
level, since a logistic regression should only be fitted with the right and significant variables.
Results of the Environmental Risk Factor
Overall Model Evaluation
| Test | Chi-Squared | DF | p-value |
|---|---|---|---|
| Likelihood Ratio Test | 16.516 | 1 | 0.000 |
| Goodness of Fit: Hosmer-Lemeshow Test | 0.730 | 3 | 0.866 |

R Squared: Cox and Snell’s R Squared = 0.169; Nagelkerke R Squared = 0.262

| Predictor | 𝛽 (std. error) | Wald | DF | p-value | Odds Ratio | 95% CI for the Odds Ratio (Lower) | 95% CI for the Odds Ratio (Upper) |
|---|---|---|---|---|---|---|---|
| Level of Sun | 1.309 (0.440) | 8.851 | 1 | 0.003 | 3.702 | 1.563 | 8.767 |
| Constant | -6.676 (1.957) | 11.642 | 1 | 0.001 | 0.001 | | |

Similar to the mother level risk factors, both the likelihood ratio and the Hosmer-Lemeshow tests
indicate that the model with the predictor Level of Sun fits the data well and also fits better than
the constant-only model, given the 0.000 p-value of the likelihood ratio test and the 0.866 p-value
of the Hosmer-Lemeshow test. This shows that the model obtained from the second
table of this level is a good fitting model with statistical stability, since the Wald statistic is fairly large
and the standard error is low. Level of sun is statistically significant and the positive coefficient
shows that moving to a higher category of the level of sun increases the likelihood of infant
deaths, which is expected and conforms to the situation in the district. The odds ratio is
greater than one, implying that each one-category increase in the level of sun in the district multiplies
the odds of infant death relative to infant survival by 3.702. The Nagelkerke R squared indicates that the
model explains 26.2% of the variability in the dependent variable.
Results of the Medical Attendants’ Level Risk Factors
Overall Model Evaluation
| Test | Chi-Squared | DF | p-value |
|---|---|---|---|
| Likelihood Ratio Test | 7.016 | 2 | 0.030 |
| Goodness of Fit: Hosmer-Lemeshow Test | 1.323 | 5 | 0.933 |

R Squared: Cox and Snell’s R Squared = 0.077; Nagelkerke R Squared = 0.121

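| Predictor | 𝛽 (std. error) | Wald | DF | p-value | Odds Ratio | 95% CI for the Odds Ratio (Lower) | 95% CI for the Odds Ratio (Upper) |
|---|---|---|---|---|---|---|---|
| Handling Complications | 0.540 (0.299) | 3.256 | 1 | 0.071 | 1.716 | 0.954 | 3.086 |
| Postnatal Care | -1.221 (0.576) | 4.494 | 1 | 0.034 | 0.295 | 0.095 | 0.912 |
| Constant | -2.697 (1.201) | 5.044 | 1 | 0.025 | 0.067 | | |
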
Following the same reasoning as for the previous levels, we have a good fitting model for the
last level risk factors given the 0.030 p-value of the likelihood ratio test and the 0.933 p-value of the
Hosmer-Lemeshow test. The Nagelkerke R squared indicates that the logistic regression model accounts for
12.1% of the variability in the dependent variable. The odds
of infant mortality relative to infant survival are multiplied by 1.716 for each one-category increase in handling
complications at birth. This result is quite ambiguous. The negative coefficient on postnatal care
indicates that the likelihood of infant mortality decreases as the category of postnatal care
increases, which is intuitive. Postnatal care is significant at the 5% level, with handling complications
marginally significant at the 10% level, and the model overall is statistically stable given the fairly
low standard errors.
Data from the BDH
Considering the number of births in the district hospital from 2006 to 2014, and in fulfilment of one of
the objectives of this study, namely predicting the chances of infant survival in the BDH, a binary
logistic regression was fitted with the dependent variable coded as: infant survival (desired) as 1 and infant
mortality (undesired) as 0. The results of infant survival are summarized in the subsequent three
tables. Considering the results below, the likelihood ratio test has a p-value of 0.023, indicating that
the model with the predictor year group fits significantly better than the restrictive model (intercept-only
model). The Hosmer-Lemeshow test has a p-value of 0.205, which is greater than 0.05 though
not close to one, indicating an adequate logistic regression model fit of the data. Again, the Nagelkerke R
squared indicates that the model explains only 7.6% of the variability in the dependent variable. This low R squared
is not very informative for prediction because a vast percentage of
variability is left unexplained by the model. However, inferring from the second table, the
odds ratio for the predictor year group is 1.730, indicating that when the year group increases by
one unit, the odds of the desired outcome, infant survival, are multiplied by 1.730.
This is consistent with the predicted probabilities of
an infant surviving in the last column of table three, which increase with the years. In other words, the
odds of an infant surviving in the district hospital are 1.730 times higher
International Journal of Scientific and Education Research
Vol. 2, No. 04; 2018
http://ijsernet.org/
www.ijsernet.org Page 94
in a year say 2008 than it is in the year 2007. The model we used for predicting the chances of an
infant surviving at the hospital is given below:
IS is the Infant Survival.
Summary of tests and goodness of fit of infant survival from the Bongo District Hospital

| Evaluation | Test | Chi-Squared | DF | p-value |
|---|---|---|---|---|
| Overall Model Evaluation | Likelihood Test | 5.204 | 1 | 0.023 |
| Goodness of Fit | Hosmer-Lemeshow Test | 1.607 | 1 | 0.205 |

R Squared
Cox and Snell’s R Squared = 0.030
Nagelkerke R Squared = 0.076
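
The p-values in the summary above follow from the chi-squared statistics and their single degree of freedom. A minimal sketch, assuming only the standard identity that for 1 degree of freedom the upper-tail probability equals erfc(sqrt(x / 2)); the function name `chi2_pvalue_df1` is illustrative and not part of the study:

```python
import math

def chi2_pvalue_df1(statistic):
    """Upper-tail p-value of a chi-squared statistic with 1 degree of freedom."""
    # A chi-squared variable with 1 df is a squared standard normal,
    # so P(X > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(statistic / 2.0))

lr_p = chi2_pvalue_df1(5.204)   # likelihood ratio statistic reported above
hl_p = chi2_pvalue_df1(1.607)   # Hosmer-Lemeshow statistic reported above

print(f"Likelihood ratio test p-value: {lr_p:.3f}")
print(f"Hosmer-Lemeshow test p-value:  {hl_p:.3f}")
```

A small likelihood ratio p-value favours the model with year group over the intercept-only model, while a Hosmer-Lemeshow p-value above 0.05 gives no evidence of poor fit.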
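
| Predictor | β (std. error) | Wald | DF | p-value | Odds Ratio | 95% CI for the Odds Ratio: Lower | 95% CI for the Odds Ratio: Upper |
|---|---|---|---|---|---|---|---|
| Year Group | 0.548 (0.244) | 5.056 | 1 | 0.025 | 1.730 | 1.073 | 2.788 |
| Constant | 3.089 (0.484) | 40.712 | 1 | 0.000 | 21.963 | | |
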
Summary Results of IS from the BDH (overall)

| Year | Year Group (Coded) | Infant Mortality | Number of Births | Probability of Infant Survival | Odds p/(1-p) | Logit (p) | Predicted Probabilities |
|---|---|---|---|---|---|---|---|
| 2006 | 1 | 13 | 450 | 0.9711 | 37.9777 | 3.6370 | 0.9743 |
| 2007 | 2 | 5 | 528 | 0.9905 | 65.6935 | 4.1850 | 0.9850 |
| 2008 | 3 | 8 | 750 | 0.9893 | 113.6360 | 4.7330 | 0.9913 |
| 2009 | 4 | 16 | 1067 | 0.9850 | 196.5663 | 5.2810 | 0.9949 |
| 2010 | 5 | 13 | 1081 | 0.9880 | 340.0185 | 5.8290 | 0.9971 |
| 2011 | 6 | 15 | 414 | 0.9638 | 588.1606 | 6.3770 | 0.9983 |
| 2012 | 7 | 8 | 525 | 0.9848 | 1017.3943 | 6.9250 | 0.9988 |
| 2013 | 8 | 13 | 541 | 0.9760 | 1759.8784 | 7.4730 | 0.9994 |
| 2014 | 9 | 6 | 706 | 0.9915 | 3044.2200 | 8.0231 | 0.9996 |
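
The logit, odds and predicted probability columns can be recomputed from the reported constant (3.089) and year group coefficient (0.548), and the observed survival probability from the raw counts. The sketch below is illustrative: the names `INTERCEPT`, `SLOPE`, `records` and `predicted_survival` are assumptions made for this example, and because the published coefficients are rounded, the last printed digit may differ slightly from the table.

```python
import math

INTERCEPT = 3.089   # reported constant of the fitted model
SLOPE = 0.548       # reported coefficient on the coded year group

# (year, coded year group, infant deaths, number of births) as reported in the table
records = [
    (2006, 1, 13, 450), (2007, 2, 5, 528), (2008, 3, 8, 750),
    (2009, 4, 16, 1067), (2010, 5, 13, 1081), (2011, 6, 15, 414),
    (2012, 7, 8, 525), (2013, 8, 13, 541), (2014, 9, 6, 706),
]

def predicted_survival(year_group):
    """Return (logit, odds, probability) of infant survival for a coded year group."""
    logit = INTERCEPT + SLOPE * year_group   # linear predictor on the log-odds scale
    odds = math.exp(logit)                   # odds of survival, p / (1 - p)
    prob = odds / (1.0 + odds)               # back-transform odds to a probability
    return logit, odds, prob

for year, group, deaths, births in records:
    observed = 1.0 - deaths / births         # observed proportion of infants surviving
    logit, odds, prob = predicted_survival(group)
    print(f"{year}: observed={observed:.4f}, logit={logit:.4f}, "
          f"odds={odds:.4f}, predicted={prob:.4f}")
```

Printing the observed and predicted values side by side shows how a model with year group as its only predictor produces steadily rising predictions, whereas the observed yearly proportions fluctuate.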
Discussion
At the mother level, two out of three factors were significant at the 5% level: the nutritional
status of the mother and attendance of antenatal care are contributing factors to infant deaths in
the district. Because of financial constraints, most women in the district cannot afford the
well-balanced meals that they ought to consume during the gestation period. They are therefore
obliged to consume whatever meal is at their disposal, at the expense of the health of their unborn
children (extracts from the questionnaires). It is not uncommon in the district to see a child born
with infections, most of which result from the poor nutritional status of the mother. Also, in the
Bongo district, where hospitals, clinics and CHPS compounds are inadequate, women living in
remote areas often find it inconvenient to travel several miles to seek antenatal care (extracts from
the questionnaires). The consequences of missing antenatal care can be grave for the woman and
the entire household.
The next set of factors was the child level risk factors, which comprised the sex and the size of
the child. These factors are mostly not under the direct control of the family once the woman
becomes pregnant. They are largely determined by nature, except for the size of the child, which
could be related to the nutritional status of the mother (Kwara, 2012). They are thus not factors
the family can readily influence, and the study found no meaningful association between them
and infant deaths: in this category, both factors were found insignificant as causes of infant death
in the district.
The next factor the study looked at was the environmental level risk factor, which captured only
the level of sun in the region of the pregnant woman. This factor was a significant contributor to
infant mortality in the district. The result is consistent with the weather pattern in the district,
where infant deaths rise during the hot weather conditions from the latter days of February until
the close of April.
The last set of factors the study looked at were the medical attendants’ level risk factors, which
comprised the handling of complications during birth and the care for the child immediately after
delivery (postnatal care). Here, care can come either from a medical attendant or from a traditional
birth attendant. The care for the child immediately after delivery, whether by medical attendants
at the hospitals or by traditional birth attendants at home, was found to be a significant
contributor to infant deaths in the district.
Conclusion
The four levels of risk factors were modelled; however, the model for the child level involved
only the intercept, since the two factors considered were not significant contributors to infant
deaths in the district. Evidence from the likelihood ratio and Hosmer-Lemeshow tests, and from the
individual significance of the variables under the Wald criterion, confirms that for the child level
risk factors the intercept-only model is better than a model including the so-called risk factors.
This confirms the work of
Kwara (2012), who modelled the neonatal mortality rate in Ghana at three different levels,
excluding the medical attendants’ level. Thus, the logistic regression model is most efficient when
only the significant and meaningful variables are fitted. At the mother level, the mother’s
educational level was not significant at the benchmark 5% alpha level, and at the medical
attendants’ level the handling of complications at birth by attendants was not significant at the
5% level. Hence we can explicitly write our models as:
(6.1)
(6.2)
(6.3)
From the analysis of data from the BDH, the chances of infants surviving in the district hospital
were very high. From 2006 to 2014, the predicted probability that infants survived in the hospital
was between 0.9743 and 0.9997. This corresponds to at most about 3% infant deaths in the
hospital over the period, that is, a maximum of 3 infant deaths per 100 live births, or 30 infant
deaths per 1,000 live births in the district hospital, ceteris paribus. This rate is lower than the
38.47 infant deaths per 1,000 live births recorded for Ghana in 2014. It suggests that the alarming
rate of infant mortality observed in the district is largely attributable to home delivery, where
mothers on the brink of delivery are attended to by traditional birth attendants who may lack the
formal and requisite training in delivering and handling newborns. These deaths could perhaps
also be due to delivery in other community clinics of the district which are not studied in this
paper.
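The conversion from a proportion of deaths to a rate per 1,000 births can also be applied directly to the raw hospital counts from the BDH table. The sketch below is illustrative rather than part of the study's method: it assumes the births recorded in that table can be treated as live births, and the names `counts` and `deaths_per_1000` are hypothetical. It works from observed counts, not from the model's predicted probabilities, so its figures need not coincide with the 3% bound discussed above.

```python
# (year, infant deaths, number of births) as reported in the BDH summary table
counts = [
    (2006, 13, 450), (2007, 5, 528), (2008, 8, 750),
    (2009, 16, 1067), (2010, 13, 1081), (2011, 15, 414),
    (2012, 8, 525), (2013, 13, 541), (2014, 6, 706),
]

NATIONAL_RATE_2014 = 38.47   # infant deaths per 1,000 live births quoted in the text

def deaths_per_1000(deaths, births):
    """Convert a count of deaths among births into a rate per 1,000 births."""
    return 1000.0 * deaths / births

# Yearly observed rates and the rate pooled over the whole period.
yearly = {year: deaths_per_1000(deaths, births) for year, deaths, births in counts}
pooled = deaths_per_1000(sum(d for _, d, _ in counts), sum(b for _, _, b in counts))
worst_year = max(yearly, key=yearly.get)

for year, rate in yearly.items():
    print(f"{year}: {rate:.2f} deaths per 1,000 births")
print(f"Pooled 2006-2014: {pooled:.2f} per 1,000 births")
print(f"Highest yearly rate: {worst_year} ({yearly[worst_year]:.2f}); "
      f"below national figure: {yearly[worst_year] < NATIONAL_RATE_2014}")
```

Comparing each yearly rate with the national figure makes the paper's comparison explicit on a common per-1,000 scale.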
Recommendations of the study: It is recommended that the nutritional status of every pregnant
woman be improved and that the nutrition unit of every health facility, where one exists, be
strengthened to advise pregnant women on the importance of eating well during pregnancy and
after delivery. Women should be encouraged to deliver at the hospitals, given the vast difference
between the number of deaths at home, including those unaccounted for, and the number of deaths
in the Bongo district hospital. It is also recommended that a nursing college or midwifery
institution be built in the district to properly train nurses, midwives and obstetricians to handle
and take good care of mothers and their new-borns until they are discharged from hospital.
Lapses of the study: Predicted probabilities of infant survival in the district hospital could be
spurious given the very low Nagelkerke’s R squared value, which may be attributed to the small
sample size. The approach of Long (1997) can be used to determine an appropriate sample size for
the population and the entity of study. It is thus recommended that further research on infant
mortality in the district be carried out using several other risk factors and a reasonably large
sample size. The BDH could also keep rigorous records of infant deaths and their associated
major risk factors at all levels to support detailed prospective studies of the occurrence. This is
because using questionnaires to measure these risk factors introduces measurement errors which
are innate and impossible to sidestep, and which consequently yield biased and inconsistent
estimates of the true parameters of interest. Several control factors could be included in the models
to address potential issues of endogeneity or omitted variable bias. Alternatively, to cater for any
possibility of endogeneity, an instrumental variable estimation of the parameters of non-linear
models (logit or probit models) could be studied in the framework of child mortality (see
Charbonneau, 2013); this could be a remedy for the potential endogeneity and inconsistency
problems.
References
Anon., (2014), “Assumptions of the Logistic Regression”,
www.statisticsolutions.com/assumptions_of_logistic_regression, Accessed: February
2015.
Ananth, C.V. and Basso, O. (2010), “Impact of pregnancy-induced hypertension on stillbirth and
neonatal mortality”, Epidemiology, January; 21(1):118-23.
Blair, P.S., Fleming, P.J., Bentley, D., Smith, I., Bacon, C., Taylor, E., Berry, J., Golding, J. and
Tripp, J. (1996), “Smoking and the sudden infant death syndrome:
results from 1993-5 case-control study for confidential inquiry into stillbirths and deaths
in infancy”, BMJ, July 27; 313 (7051):195-8.
Cedergren, M.I. (2004), “Maternal morbid obesity and risk of adverse pregnancy outcome”,
Obstetrics & Gynecology, February; 103(2):219-24.
Charbonneau, K. B. (2013), “Multiple fixed effects in theoretical and applied econometrics”, PhD
thesis, Princeton University.
Chen, A., Feresu, S.A., Fernandez, C. and Rogan, W.F. (2009), “Maternal obesity and the risk of
infant deaths in the United States”, Epidemiology, January; 20(1):78-81.
Clausen, T.D., Mathiesen, E., Ekbom, P., Hellmuth, E., Mandrup-Poulsen, T. and Damm, P. (2005),
“Poor pregnancy outcome in women with type 2 diabetes”, Diabetes Care, February;
28(2):323-8.
Dunne, F.P., Avalos, G., Durkan, M., Mitchell, Y., Gallacher, T., Keenan, M., Hogan, M.,
Carmody, L.A. and Gaffney, G. (2009), “Pregnancy outcome for women with
pregestational diabetes along the Irish Atlantic seaboard”, Diabetes Care, July;
32(7):1205-6.
Khashu, M., Narayanan, M., Bhargava, S. and Osiovich, H. (2009), “Perinatal outcomes associated
with preterm birth at 33 to 36 weeks’ gestation: a population-based cohort study”,
Pediatrics, January; 123(1):109-13.
Kwara, K. (2012), “Modeling the risk factors of neonatal mortality in Ghana using logistic
regression”, June 2012, pp. 1-8, 36-70.
Long, J.S. (1997), “Regression Models for Categorical and Limited Dependent Variables”,
Thousand Oaks, CA: Sage Publications.
Oestergaard MZ, Inoue M, Yoshida S, Maharani WR, Gore FM, et al. (2011) Neonatal Mortality
levels for 193 countries in 2009 with trends since 1990: A systematic analysis of progress,
projections and priorities. PLoS Med 8: e1001080.
Rasch, V. (2003), “Cigarette, alcohol, and caffeine consumption: risk factors for spontaneous
abortion”, Acta Obstetricia et Gynecologica Scandinavica, February; 82(2):182-8.
Rogerson, P.A. (2001), “Statistical Methods for Geography”, London: Sage.
World Health Organization (WHO), (2016), “Child mortality and causes of death”, Global Health
Observatory (GHO) data.
World Health Organization (WHO), (2006), “Infant Mortality”, Country, Regional and Global
Estimates.