
Amanda Sullivan & Yerlin Barquero

Cigarette Smoking at Raritan Valley Community College: Attitudes and Prevalence

Semester Project Part IV

Statistics Honors MATH117H-01

Dr. Lynne E. Kowski, Ph.D

December 7th, 2016


Table of Contents

Section One: Introduction
Section Two: Sampling Design
Section Three: Analysis
    Question One: How Old Are You? (In Years)
    Question Two: Do You Smoke Cigarettes? (Yes or No)
    Question Three: If so, How Many Cigarettes Per Day
    Question Four: Less People Are Smoking Now Than in the Past Decade
    Question Five: How Many People in Your Household Smoke
    Question Six: What Do You Believe to Be a Trigger for Someone To Start Smoking
    Question Seven: What is Your Gender
    Question Eight: Do You Believe It is A Possibility That Smoking Can Be Eventually Eliminated
    Analysis for ANOVA
    Analysis for Regression
Conclusion
Appendix
    Survey
Works Cited


Section One: Introduction

The population under consideration in this study is the student body of Raritan Valley Community College (RVCC). Our survey was motivated by the common belief that cigarette smoking is less prevalent among people today, particularly millennials. It was also motivated by the abundance of medical literature documenting the harmful effects of cigarette smoking. Additionally, more methods, such as drugs and support groups, are available today for cigarette smokers who want to quit. In turn, these factors could reduce the prevalence of cigarette smoking not just in the millennial generation but in the entire population. Because RVCC enrolls students of many different ages, we administered a survey to a sample of the RVCC student population in order to make inferences about current cigarette smoking trends within the college. Although RVCC consists mostly of younger students, there are some older adults as well, so we tested our data against figures for all adult smokers.


Section Two: Sampling Design

To obtain the data needed for our analysis, we administered our surveys using systematic random sampling. In this method, the population size is divided by the desired sample size; if the result is not a whole number, it is rounded down. The resulting number is called m. Next, a random number k between 1 and m is obtained, either through technology or from a random number table. Once m and k are known, the sample consists of the items numbered k, k+m, k+2m, and so on, continuing until the required sample size is reached.
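The procedure can be sketched in a few lines of Python. This is only an illustration, not the tool we used: the function name `systematic_sample` and its parameters are hypothetical, and the example call uses the values reported later in this section (963 classes, a sample of 25, and a random start of 13).

```python
import random

def systematic_sample(population_size, sample_size, k=None, rng=None):
    """Return the 1-based positions k, k+m, k+2m, ... of a systematic sample."""
    # m is the sampling interval: population size over sample size, rounded down.
    m = population_size // sample_size
    if m < 1:
        raise ValueError("sample_size cannot exceed population_size")
    # k is the random start; it must fall between 1 and m inclusive.
    if k is None:
        rng = rng or random.Random()
        k = rng.randint(1, m)
    if not 1 <= k <= m:
        raise ValueError("k must be between 1 and m")
    return [k + i * m for i in range(sample_size)]

# Values reported in this section: 963 classes, 25 wanted, random start 13.
picks = systematic_sample(963, 25, k=13)
print(picks[:4], picks[-1])  # [13, 51, 89, 127] 925
```

Because m is rounded down, the last position (k + 24m) can never exceed the population size, so the list always stays inside the numbered class list.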

In our case, the population of all Raritan Valley Community College students was grouped into classes. Professor Kowski compiled a numbered list of the active classes offered during the Fall 2016 semester, which allowed us to perform the systematic sampling procedure easily. We elected to sample 25 classes, which would ensure a sufficient number of responses even if uncontrollable circumstances occurred, such as absent students or professors declining our request. For the systematic random sampling procedure, we first divided the total number of active classes at RVCC (963) by our desired sample size (25), which gave 38.52. Because this is not a whole number, we rounded down to 38 and labeled this number m. With m equal to 38, we needed a number k between 1 and 38. Using a random number generator, we obtained k = 13 (Random.org). Then we applied the procedure k, k+m, k+2m as follows: 13, 13+38, 13+2(38), 13+3(38), and so on until we reached the sample size of 25. This method helped us prevent bias because k was randomly generated, which allowed us to move down the class list in a systematic way without any intervention on our part. Unlike cluster sampling, this method mitigates the risk that the selected samples have attributes in common, such as a group of classes all belonging to the Biology major. Our systematically chosen samples from the class list are as follows:


*Bold type denotes classes selected for surveying

Systematically Chosen Class Number (Using k, k+m, k+2m) | Class Information (CRN, Section, Course Title, Day and Time, Location, Instructor) | Instructor Response | Students Enrolled in Class (As of Beginning of Fall 2016 Semester)
13 | 14494, ACCT-221 51, TAX ACCOUNTING I, W 05:30 pm - 08:20 pm, HNTRDN H116, Nicole DeRosa | No Response | 19
51 | 14832, ARTS-131 01, WATERCOLOR II, W 01:00 pm - 04:50 pm, ARTS A07, Christopher Koep | | 3
89 | 11021, BIOL-101 03X, GENERAL BIOLOGY I, WF 10:00 am - 11:20 am, WEST W114, Lenahan, Melanie | | 20
127 | 10243, BIOL-124 51X, HUMAN ANATOMY & PHYSIOLOGY, MW 06:00 pm - 07:20 pm, SCIENC SC151, Anthony Rosania | No Response | 19
165 | 15262, BUSI-112 02, PRINCIPLES OF MANAGEMENT, W 01:00 pm - 03:50 pm, HNTRDN H328, Tonya Rimple | | 38
203 | 14489, CHEM-212 02X, ORGANIC CHEMISTRY II, TR 02:30 pm - 04:20 pm, SCIENC SC151, Paul Schueler | Accepted, Received Surveys | 11
241 | 13681, CISY-102 95X, COMPUTER LITERACY, W 05:30 pm - 09:40 pm, WEST W309, Robert Bertolatus | No Response | 22
279 | 10430, COMM-101 03, SPEECH, MW 01:00 pm - 02:20 pm, WEST W113, Thomas Valasek | | 24
317 | 16073, CRMJ-122 13, DOMESTIC PREPAREDNESS, F 06:00 pm - 08:50 pm, SMRSET S244, Joseph Howarth | No Response | 6
355 | 10001, ECON-102 01, MICROECONOMICS, WF 10:00 am - 11:20 am, HNTRDN H330, Daniel Aronson | | 37
393 | 11479, ENGL-070 L06, ENGLISH COMPOSITION I WORKSHOP, MW 11:30 am - 12:50 pm, SMRSET S350, Andrew Manno | Accepted, Surveys Received | 19
431 | 10583, ENGL-111 28, ENGLISH COMPOSITION I, MW 04:00 pm - 05:20 pm, HNTRDN H122A, Melody Pazian | | 23
469 | 13904, ENGL-111H 01, ENGLISH COMP I HON:COMP&CONTRO, TR 10:00 am - 11:20 am, SMRSET S243, Christine Pipitone | | 21
507 | 10032, ENGR-105 02, INTRODUCTION TO ENGINEERING, W 02:30 pm - 04:20 pm, SCIENC SC100, Antonella Pompo | | 26
545 | 12919, FILM-101 04, ART OF MOVIE:FILM APP & ANALYS, W 04:00 pm - 06:50 pm, HNTRDN H120, Thomas Lancaster | | 23
583 | 10142, HIST-101 01, WORLD CIVILIZATION I, MW 07:00 am - 08:20 am, HNTRDN H110, Christian Kemp | Accepted, Surveys Received | 25
621 | 12800, HMNS-240 51, INTRODUCTION TO SPECIAL ED, R 06:00 pm - 08:50 pm, SMRSET S018, Andrea Romano | | 15
659 | 15250, MATH-020 13Y, ELEMENTARY ALGEBRA, M 11:00 am - 12:50 pm, SMRSET S349, Tatyana Stepanova | | 21
697 | 14840, MATH-030R 51, INTER ALGEBRA W/ALGEBRA REVIEW, TWR 07:30 pm - 09:20 pm, CLLCTR C022, Gregory Bullock | | 15
735 | 12428, MATH-112 04X, PRECALCULUS I, MW 11:00 am - 12:50 pm, SMRSET S247, Patricia Hulsen | Accepted, Surveys Received | 40
773 | 12392, MUSC-103 01, ROCK N' ROLL HIST & CULTURE, TR 10:00 am - 11:20 am, ARTS A33, Lawrence Barbee | Declined Request | 39
811 | 10304, NURS-L01 05X, FOUND OF NURSG LAB, W 02:00 pm - 03:50 pm, HNTRDN H227, Maryann Balut | Accepted, Surveys Received | 11
849 | 13390, PHIL-114 03, ETHICS, MW 02:30 pm - 03:50 pm, HNTRDN H329, Sean Meseroll | | 17
887 | 10916, PSYC-103 15, INTRODUCTION TO PSYCHOLOGY, TR 04:00 pm - 05:20 pm, HNTRDN H333, Kathleen Manz | | 32
925 | 12865, SPAN-103 03X, INTRODUCTION TO SPANISH I, TR 03:00 pm - 04:50 pm, WEST W306, Sandra Reynolds | | 19

TOTAL CLASSES: 25

EXPECTED STUDENT RESPONSES: 543

After we selected the 25 classes, which totaled 543 enrolled students, Professor Kowski informed us that, between monitoring responses from the instructors and analyzing the data, a sample of 25 classes with 543 possible surveys would be overwhelming. We agreed and decided to reduce our sample size. Starting with the first class listed, we selected every fifth class and requested permission from those instructors to administer surveys. This ensured diversity among majors and gave us a total of five classes (class numbers 13, 203, 393, 583, and 773) with an expected 113 student surveys, well over the minimum of 50 surveys needed for analysis.

After e-mailing the five professors to request permission to administer our surveys, we received three acceptances. Professor Paul Schueler accepted promptly. We received the surveys from his class, but fewer than expected, since some students may have been absent, may have withdrawn from the class by that time, or may have refused to take the survey. Professor Christian Kemp and Professor Andrew Manno also approved our request, and in both cases we again received fewer surveys than expected, possibly for the same reasons. Professor Lawrence Barbee declined our request for undisclosed reasons. Professor Nicole DeRosa did not respond to our e-mail request. After one week, we sent her a friendly reminder. After another week, we decided to sample another class and sent an e-mail informing Professor DeRosa that we would no longer require surveys from her class.

Although we were following the correct procedures to ensure a sufficient number of responses, we realized that very few students across our returned surveys reported that they currently smoke cigarettes. Professor Kowski informed us that this would become an issue later on. For the normality assumption behind our analysis to be reasonable, we would need at least 30 students who reported that they currently smoke cigarettes. At that point, we chose to select more classes so that we could reach the minimum of 30 smokers by the posted deadlines. Professor Kowski advised us to select classes that took place during the evening, had a large number of students enrolled, or were in a discipline often believed to have a large number of smokers (e.g., nursing). This was a source of bias in our sampling because the selection was not truly random and required intervention on our part. We selected the classes numbered 127, 241, 317, 735, and 811. These classes are shown in bold font on the table.

After selecting these five classes, we e-mailed the professors, and two of the five accepted. Professor Patricia Hulsen accepted our request to administer the surveys to her class, but we did not receive the full number of expected responses, possibly for the reasons mentioned earlier. Professor Maryann Balut did not respond to our first e-mail, but she responded to the friendly reminder a week later and accepted our request; we received all expected surveys from her class. Professors Anthony Rosania, Robert Bertolatus, and Joseph Howarth did not respond to our initial e-mails or to the friendly reminders we sent one week later. After another week, we decided to stop pursuing their classes and sent each of them an e-mail stating that we would no longer require surveys from their class.

Even with the surveys from the extra classes, we did not reach the minimum of 30 surveys from students answering “yes” that they currently smoke cigarettes. To resolve this issue, Professor Kowski authorized us to give in-person surveys to students we saw smoking cigarettes around the RVCC campus. Even though this was a definite source of bias in our sampling design, it was necessary in order to continue our study. We waited by the smoking section of the main parking lot for students to come by. When they did, we asked them whether they were enrolled in at least one class at RVCC. We also confirmed that they had not already taken the survey, which was a possibility since we had already administered a considerable number of surveys to our selected classes. Most students were cooperative, and by the end of the day we had received a substantial number of surveys from smokers. To preserve the integrity of our sampling process, we discarded one survey because the person forcefully grabbed it from our pile and filled it out on behalf of someone else, despite our repeatedly telling him not to. Another person circled both “yes” and “no” for the question of whether they smoke cigarettes, so we discarded that survey as well, since there was no way to analyze it within the parameters of our study. Combined with the few smoker surveys from the classes we sampled, this gave us a total of 33 students who responded that they smoke cigarettes.

Despite our complicated sampling experience, we obtained a sample of 117 surveys from the RVCC student population, well above the minimum of 50 that we needed for our analysis.


Section Three: Analysis

The first question of our survey is below:

1) How Old Are You (In Years)?

________________

It is important to first state our null and alternative hypotheses for the average age of students at RVCC. Professor Kowski provided us with an enrollment report containing demographic data on the college’s student body. From this, we noted that as of the Spring 2015 semester, the average age of RVCC students was 25.5 years (“10th Day Enrollment Report”). We rounded up to 26 years for simplicity. We formed these hypotheses:

Ho: μ=26

Ha: μ≠ 26

We chose a two-sided alternative hypothesis (not equal to 26) for a variety of reasons. Many students enroll in the academic year directly after their high school graduation, which could place them well below the mean age of 26. Some older adult students enroll in order to change their career path or to take a class for personal enrichment, so their ages could be well above the mean of 26. Before performing further analyses, we had to determine whether or not there is a difference between the ages of the smokers and the non-smokers. To do so, we looked at summary statistics for smokers vs. non-smokers and performed a two-sample t-test at the 5% significance level, as run by Minitab:

                      N   Mean  StDev  SE Mean
Age of Smokers       33  20.52   2.46     0.43
Age of Non-Smokers   84  21.94   6.63     0.72

Difference = μ (Age of Smokers) - μ (Age of Non-Smokers)
Estimate for difference:  -1.425
95% CI for difference:  (-3.092, 0.241)
T-Test of difference = 0 (vs ≠): T-Value = -1.69  P-Value = 0.093  DF = 114
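The T-Value and DF in this output can be approximately recomputed from the summary statistics alone. The sketch below assumes Minitab used the unpooled (Welch) form of the two-sample t-test, which is consistent with the reported DF of 114 rather than 115; the function name `welch_t` is hypothetical, and the inputs are the rounded values printed above, so small rounding differences are expected.

```python
import math

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch two-sample t statistic and degrees of freedom from summary statistics."""
    v1 = sd1 ** 2 / n1          # squared standard error of the first mean
    v2 = sd2 ** 2 / n2          # squared standard error of the second mean
    se = math.sqrt(v1 + v2)     # standard error of the difference in means
    t = (mean1 - mean2) / se
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Smokers: mean 20.52, sd 2.46, n 33. Non-smokers: mean 21.94, sd 6.63, n 84.
t_stat, df = welch_t(20.52, 2.46, 33, 21.94, 6.63, 84)
print(round(t_stat, 2), math.floor(df))
```

The p-value itself requires the t distribution, which the Python standard library does not provide, so it is left to Minitab here.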


Based on this hypothesis test, the p-value was .093, which is greater than our significance level of 5%, or .05. Therefore, we retained the null hypothesis that there is no difference in mean age between smokers and non-smokers, as there is not sufficient evidence for the alternative hypothesis. Using Minitab, we then tallied each age reported in the sample along with its frequency and relative frequency. These are shown in the table below:

Tally for: Age (In Years)

Age  Count  Percent
 17      1     0.85
 18     22    18.80
 19     30    25.64
 20     17    14.53
 21     14    11.97
 22      9     7.69
 23      5     4.27
 24      6     5.13
 25      2     1.71
 27      2     1.71
 28      1     0.85
 29      1     0.85
 30      1     0.85
 32      1     0.85
 36      1     0.85
 37      1     0.85
 44      1     0.85
 50      1     0.85
 56      1     0.85
 N=    117
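As a cross-check, the tally above contains enough information to rebuild the full sample and recompute the descriptive statistics. The following is a minimal sketch using only Python's standard library; the variable names are illustrative, and the counts are copied from the tally.

```python
import statistics

# Age tally copied from the table above: {age: count}.
age_counts = {17: 1, 18: 22, 19: 30, 20: 17, 21: 14, 22: 9, 23: 5, 24: 6,
              25: 2, 27: 2, 28: 1, 29: 1, 30: 1, 32: 1, 36: 1, 37: 1,
              44: 1, 50: 1, 56: 1}

# Expand the tally into one value per respondent.
ages = [age for age, count in age_counts.items() for _ in range(count)]

n = len(ages)
# Relative frequency (percent) of each age.
percents = {age: round(100 * count / n, 2) for age, count in age_counts.items()}
mean_age = statistics.mean(ages)
sd_age = statistics.stdev(ages)  # sample standard deviation (divides by n - 1)
share_18_to_21 = sum(age_counts[a] for a in (18, 19, 20, 21)) / n

print(n, round(mean_age, 3), round(sd_age, 3), round(100 * share_18_to_21, 1))
```

The recomputed mean and standard deviation should agree with the Minitab summary values (21.538 and 5.795) reported later in this section.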

The tally shows the ages of the surveyed students. A majority of the students (about 71%) are between 18 and 21 years old. This is expected, as many students enroll directly from high school and take around two to three years to finish their associate degree. The most common ages are 18 and 19, which make up 18.80% and 25.64% of the sample, respectively. As shown in the table, there is still some diversity in age, with students in their late twenties, thirties, forties, and fifties. It is interesting to note that the youngest student was 17 years old, which may mean they are in an early college credit program or graduated from high school early. From this, we can conclude that the students at Raritan Valley Community College are mostly young adults. To further investigate the ages of the sampled students, we generated a histogram and boxplot with summary statistics. Minitab performed this task, as shown below:


Summary Report for Age

[Figure: histogram and boxplot of Age, with 95% confidence intervals for the mean and median]

Anderson-Darling Normality Test
A-Squared      14.90
P-Value       <0.005

Mean          21.538
StDev          5.795
Variance      33.578
Skewness       3.7576
Kurtosis      16.6596
N            117

Minimum       17.000
1st Quartile  19.000
Median        20.000
3rd Quartile  22.000
Maximum       56.000

95% Confidence Interval for Mean     20.477  22.600
95% Confidence Interval for Median   19.000  20.000
95% Confidence Interval for StDev     5.135   6.650
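The quartiles in this summary determine which ages a boxplot flags as outliers under the usual 1.5 × IQR rule. The sketch below rebuilds the sample from the age tally and computes the fences; it assumes Minitab's boxplot follows that rule, and it relies on `statistics.quantiles` with its default "exclusive" method, which for this data gives the same quartiles Minitab reports.

```python
import statistics

# Age tally from the frequency table: {age: count}.
age_counts = {17: 1, 18: 22, 19: 30, 20: 17, 21: 14, 22: 9, 23: 5, 24: 6,
              25: 2, 27: 2, 28: 1, 29: 1, 30: 1, 32: 1, 36: 1, 37: 1,
              44: 1, 50: 1, 56: 1}
ages = sorted(age for age, count in age_counts.items() for _ in range(count))

q1, median, q3 = statistics.quantiles(ages, n=4)
iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr   # values below this are flagged as outliers
upper_fence = q3 + 1.5 * iqr   # values above this are flagged as outliers

outliers = [a for a in ages if a < lower_fence or a > upper_fence]
print(q1, median, q3, iqr, lower_fence, upper_fence, len(outliers))
```

With an IQR of 3, the upper fence sits at 26.5, so every surveyed student aged 27 or older counts as an outlier under this rule.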

Both the histogram and the boxplot support our findings from the frequency and relative frequency table. The histogram is right-skewed, and most of the students’ ages fall at or below 22.5 years. The boxplot shows this trend as well: it is also right-skewed, indicating that most students are young adults. Unlike the histogram, the boxplot clearly displays the outliers, whose ages are much higher and therefore lie considerably far from the edge of the box. In this case, the IQR is equal to 3 (22 - 19). Outliers are the values that lie more than 1.5 times this range beyond the quartiles, which here means ages above 22 + 1.5(3) = 26.5. To find further information on the ages, we performed a t-interval and a one-mean t-test, as generated by Minitab:

One-Sample T: Age

Test of μ = 26 vs ≠ 26

Variable    N    Mean  StDev  SE Mean       95% CI            T      P
Age       117  21.538  5.795    0.536  (20.477, 22.600)  -8.33  0.000
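The SE Mean and T columns follow directly from the sample mean, standard deviation, and sample size. A minimal sketch of that arithmetic is below; the function name `one_sample_t` is hypothetical, and the inputs are the rounded values from the Minitab output.

```python
import math

def one_sample_t(sample_mean, sample_sd, n, mu0):
    """Return (standard error, t statistic) for a test of H0: mu = mu0."""
    se = sample_sd / math.sqrt(n)      # standard error of the mean
    t = (sample_mean - mu0) / se       # distance from mu0 in standard errors
    return se, t

# Age: mean 21.538, sd 5.795, n 117, hypothesized mean 26.
se, t_stat = one_sample_t(21.538, 5.795, 117, 26)
print(round(se, 3), round(t_stat, 2))
```

A t statistic more than eight standard errors below the hypothesized mean is why Minitab prints the p-value as 0.000.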


From this output, our p-value is reported as 0.000, meaning it is smaller than 0.0005. We chose to perform our hypothesis testing at the 5% significance level, or .05. Because the p-value is less than the significance level of .05, we reject the null hypothesis in favor of the alternative and conclude that the mean age is not equal to 26. From the confidence interval, we can be 95% confident that the mean age of RVCC students falls between 20.477 and 22.600 years.


Question 2: Do you smoke cigarettes? (Circle One)

a. Yes b. No

For this question, the students taking the survey were asked to circle “yes” if they currently smoke cigarettes or “no” if they do not. Before starting our analysis, we researched the proportion of smokers in the United States to obtain our null and alternative hypotheses. The Centers for Disease Control and Prevention stated: “In 2014, nearly 17 of every 100 U.S. adults aged 18 years and older (16.8%) currently smoked cigarettes” (“Adult Cigarette Smoking Hits All-Time Low”). We divided 16.8 by 100 to convert the percentage to a proportion (.168). Here are our null and alternative hypotheses for the proportion of students who smoke cigarettes:

Ho: p=.168
Ha: p<.168

We chose our alternative hypothesis to be less than .168 because of the common assumption that fewer people today are smoking cigarettes than in the past. This belief applies especially to young adults, who made up a majority of the students in Question 1. As mentioned in our introduction, there is an abundance of medical literature highlighting the negative effects of cigarette smoking, which could lead the proportion of smokers to be less than .168. Below is a frequency and relative frequency table of the responses students gave for this question:

Tally for: Do you smoke cigarettes? (Yes or No)

Do you smoke cigarettes?  Count  Percent
No                           84    71.79
Yes                          33    28.21
N=                          117

From this table, 84 of the 117 students surveyed (71.79%) responded that they do not smoke cigarettes, and 33 of the 117 (28.21%) responded that they do. From the percentages, it is clear that a majority of the surveyed RVCC students do not smoke cigarettes, which is consistent with the common belief that fewer people today are smoking cigarettes. To further analyze this, we created a pie chart. For data entry purposes, 0 represents non-smokers and 1 represents cigarette smokers:

Our pie chart further illustrates that most RVCC students in our sample are not cigarette smokers. Those who responded that they do not smoke cigarettes make up more than half of the pie chart. Although those who responded that they do smoke take up a noticeable portion of the chart, they are still the minority. We then performed a one-proportion hypothesis test at the 5% significance level:

One-Proportion Z-Test

Test of p = .168 vs < .168

Variable            X    N  Sample p     95% CI (Exact)      P-Value (Exact)
Smoke Cigarettes?  33  117  0.282051  (0.202759, 0.372708)            0.999
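The exact p-value for a lower-tailed one-proportion test is the binomial probability of seeing 33 or fewer smokers out of 117 if the true proportion were .168. The sketch below recomputes that probability with the standard library, alongside the z statistic for the large-sample version of the test; the helper `binom_cdf` is a hypothetical name, and the assumption that Minitab's "Exact" p-value is this binomial tail probability is ours.

```python
import math

def binom_cdf(x, n, p):
    """P(X <= x) for X ~ Binomial(n, p), summed term by term."""
    return sum(math.comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(x + 1))

n, smokers, p0 = 117, 33, 0.168
p_hat = smokers / n

# Lower-tailed exact test of H0: p = p0 against Ha: p < p0.
p_value_lower = binom_cdf(smokers, n, p0)

# z statistic for the normal-approximation version of the same test.
z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)

print(round(p_hat, 6), round(z, 2), round(p_value_lower, 3))
```

A positive z statistic means the sample proportion lies above the hypothesized value, which is why the lower-tailed p-value comes out close to 1.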


From the results, the exact p-value is 0.999, which is much higher than our significance level of 5%, or .05. So, we retained the null hypothesis that the proportion of cigarette smokers is equal to .168. The data do not provide sufficient evidence to adopt our alternative hypothesis that the proportion of students who smoke cigarettes is less than .168. The p-value is this large because the sample proportion (.282) is actually higher than .168. From our confidence interval, we can be 95% confident that the population proportion of students who smoke cigarettes lies between .202759 and .372708.


Question 3: If so, how many cigarettes per day? (If you don’t smoke, please answer

“ZERO”)

_______________

For this portion of the survey, we asked those who smoke how many cigarettes they smoke per day. To create an unbiased test, we used the national average reported on the Centers for Disease Control and Prevention website. In a 2014 press release, they reported a decline in the average number of cigarettes smoked per day from 16.7 to 14.2 (“Adult cigarette smoking rate overall hits all-time low”). We therefore used 14.2 as the mean in our null hypothesis.

Following the common assumption behind this survey that cigarette smoking has declined, the alternative hypothesis states that RVCC students who smoke average fewer than 14.2 cigarettes per day.

Ho: μ=14.2
Ha: μ<14.2

Below is a frequency and relative frequency table, generated by Minitab, of the number of cigarettes smoked per day by students who smoke:

Tally for: Cigarettes Smoked Per Day

Cigarettes/Day  Count  Percent
           1.0      2     6.06
           2.0      2     6.06
           3.5      4    12.12
           4.0      2     6.06
           4.5      2     6.06
           5.0      3     9.09
           6.5      1     3.03
           8.0      1     3.03
          10.0      5    15.15
          12.0      1     3.03
          12.5      1     3.03
          14.0      1     3.03
          15.0      2     6.06
          20.0      5    15.15
          25.0      1     3.03
N=                 33
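The percent column and the sample mean can both be recomputed from the counts. The sketch below is illustrative only (the variable names are ours); it also splits the smokers at the 14-cigarette mark, since the national figure we test against is close to that value.

```python
# Cigarettes-per-day tally from the table above: {cigarettes per day: count}.
cig_counts = {1.0: 2, 2.0: 2, 3.5: 4, 4.0: 2, 4.5: 2, 5.0: 3, 6.5: 1, 8.0: 1,
              10.0: 5, 12.0: 1, 12.5: 1, 14.0: 1, 15.0: 2, 20.0: 5, 25.0: 1}

n = sum(cig_counts.values())
# Relative frequency (percent) for each reported value.
percents = {x: round(100 * c / n, 2) for x, c in cig_counts.items()}
# Weighted mean of cigarettes per day across all smokers.
mean_per_day = sum(x * c for x, c in cig_counts.items()) / n
below_14 = sum(c for x, c in cig_counts.items() if x < 14)
at_least_14 = n - below_14

print(n, round(mean_per_day, 2), below_14, at_least_14)
```

Each single response out of 33 corresponds to about 3.03% of the smokers, and the weighted mean matches the 9.39 cigarettes per day that Minitab reports for this question.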


As the frequency table above shows, only 9 smokers smoke 14 or more cigarettes per day, compared with the remaining 24 who smoke fewer than 14 per day. The minimum is 1 cigarette per day and the maximum is 25. Below is a histogram of the data:

The following results will be used to decide whether to retain the average reported by the Centers for Disease Control and Prevention (CDC) or to reject it. We chose a confidence level of 95% and a significance level of 5%, or .05. Here are the results of the one-sample t-test, performed by Minitab:

One-Sample T Test: Cigarettes Smoked Per Day

Test of μ = 14.2 vs < 14.2

Variable                N  Mean  StDev  SE Mean      95% CI         T      P
Cigarettes Smoked/Day  33  9.39   6.78     1.18  (6.99, 11.80)  -4.07  0.000


The p-value is 0.000, which is much smaller than our significance level of 5%, or .05. So, we rejected the null hypothesis in favor of our alternative hypothesis. At the 5% significance level, the data provide sufficient evidence to conclude that students who smoke cigarettes average fewer than 14.2 cigarettes per day. From the confidence interval, we can also be 95% confident that the mean number of cigarettes smoked per day is between 6.99 and 11.80.


Question 4: Less people are smoking now than in the past decade. (Please Circle One)

Strongly Disagree | Disagree | Neutral | Agree | Strongly Agree

Students read this statement and responded using a Likert scale. They were given the choices of “Strongly Disagree,” “Disagree,” “Neutral,” “Agree,” and “Strongly Agree.” For data entry purposes, each response was assigned a number, so “Strongly Disagree” is 1, “Disagree” is 2, and so on up to “Strongly Agree” at 5. We formulated our null and alternative hypotheses:

Ho: μ= 3 (Neutral)
Ha: μ ≠3 (Neutral)

Even though data suggest that fewer people smoke cigarettes now than in past years, some people may have different views on current smoking trends. As a default, we set our null hypothesis to the response of “Neutral.” Since we had no concrete way of inferring how students would respond to this question, we set our alternative hypothesis to be not equal to “Neutral.” Here is a frequency and relative frequency table of the responses students gave:

Likert Response        Count  Percent
Strongly Disagree (1)      9     7.69
Disagree (2)              21    17.95
Neutral (3)               28    23.93
Agree (4)                 50    42.74
Strongly Agree (5)         9     7.69
N=                       117
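The numeric coding of the Likert responses makes it possible to compute a mean and standard deviation straight from the counts in this table. The sketch below uses the 1-to-5 coding described above; the dictionary names are hypothetical, and the standard deviation is the sample version (dividing by n - 1), which we assume is what Minitab reports.

```python
import math

# Numeric coding described in the text: Strongly Disagree = 1 ... Strongly Agree = 5.
LIKERT_CODES = {"Strongly Disagree": 1, "Disagree": 2, "Neutral": 3,
                "Agree": 4, "Strongly Agree": 5}
# Response counts from the table above.
response_counts = {"Strongly Disagree": 9, "Disagree": 21, "Neutral": 28,
                   "Agree": 50, "Strongly Agree": 9}

n = sum(response_counts.values())
# Weighted mean of the coded responses.
mean_score = sum(LIKERT_CODES[r] * c for r, c in response_counts.items()) / n
# Sum of squared deviations from the mean, weighted by count.
ss = sum(c * (LIKERT_CODES[r] - mean_score) ** 2 for r, c in response_counts.items())
sd_score = math.sqrt(ss / (n - 1))

print(n, round(mean_score, 3), round(sd_score, 3))
```

These values should line up with the mean of 3.248 and standard deviation of 1.082 in the Minitab one-sample t-test output for this question.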

From the table, the most common response to our statement that less people are smoking now than in past years was “Agree.” Fifty of the 117 respondents selected this answer, making up 42.74%. A considerable number of students responded with “Neutral”: 28 of 117, or 23.93%. The other categories received a considerable number of responses as well. Here is a histogram of the students’ responses to the statement:


The histogram is left-skewed, as seen by the longer tail on the left side, which is logical since the most common response was “Agree,” or 4. The histogram provides a visual of the data in the frequency and relative frequency table by showing the shape of the distribution. So, from both the histogram and the table, many students agree that fewer people are smoking today than in the past decade. Here are the results needed to test our hypotheses, along with our 95% confidence interval, both from Minitab:

One-Sample T-Test

Variable      N   Mean  StDev  SE Mean      95% CI         T      P
Question 4  117  3.248  1.082    0.100  (3.050, 3.446)  2.48  0.015

Our p-value from the one-mean t-test is equal to 0.015, which is less than our significance level of 5%, or .05. We rejected the null in favor of our alternate hypothesis. At the 5% significance level, the data provide sufficient evidence to conclude that students’ feelings on our statement differ from “Neutral.” From the confidence interval, we can be 95% confident that the mean response is between 3.050 and 3.446, which sits just above the “Neutral” value of 3 and leans toward “Agree.”
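The one-sample t-test figures for Question 4 can be cross-checked directly from the Likert tally. The following is a minimal sketch, not the Minitab procedure itself: the helper names t_pdf and two_sided_p are introduced here for illustration, and the p-value is approximated by integrating the t density with Simpson's rule rather than by a statistical library.

```python
import math

# Likert tally from the Question 4 table: response value -> count
tally = {1: 9, 2: 21, 3: 28, 4: 50, 5: 9}
mu0 = 3  # hypothesized mean ("Neutral")

n = sum(tally.values())
mean = sum(value * count for value, count in tally.items()) / n
# Sample variance uses n - 1 in the denominator
variance = sum(count * (value - mean) ** 2 for value, count in tally.items()) / (n - 1)
stdev = math.sqrt(variance)
se_mean = stdev / math.sqrt(n)
t_stat = (mean - mu0) / se_mean


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    return math.exp(log_c) / math.sqrt(df * math.pi) * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t, df, steps=2000):
    """Two-sided p-value: integrate the density over [0, |t|] with Simpson's rule."""
    t = abs(t)
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 2 * (0.5 - area)


p_value = two_sided_p(t_stat, n - 1)
print(f"N={n} mean={mean:.3f} stdev={stdev:.3f} SE={se_mean:.3f} T={t_stat:.2f} P={p_value:.3f}")
```

The printed values should agree with the Minitab row above to the number of decimals shown there.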


Question 5: How many people in your household smoke?

How many people in your household smoke, and approximately what are their age in the

following table: (If you live alone answer “ZERO” for each)

Relation:               Smoke: Yes / No     Age (Approximate)

Spouse

Sibling(s)

Children

Parent

Other_______________

According to the surveys we received, a number of students seemed to have had difficulty answering this question. We ultimately chose not to take the age column into consideration for our analysis because some of the responses received for this part were incomprehensible. To simplify the analysis, we tallied the smokers in each student’s household for every survey. The 2010 Census states that the average household size is 3.14, so for our testing purposes we rounded this number down to 3 (“Households and Families: 2010”). We could not find data stating the average number of smokers per household, so we tested against the average number of people in the household, regardless of whether or not they smoke cigarettes. Our null hypothesis is therefore that the mean number of cigarette smokers per household is 3. Following the declining smoking rates reported by the Centers for Disease Control and Prevention, we set the alternate hypothesis to less than 3:

Ho: μ = 3
Ha: μ < 3


Before doing further analysis, we must determine whether or not there is a significant

difference between smokers and non-smokers for this question. Here are the results from

Minitab:

Difference = μ (Smokers in House (Smokers)) - μ (Smokers in House (Non-Smokers))
Estimate for difference: 0.269
95% CI for difference: (-0.074, 0.613)
T-Test of difference = 0 (vs ≠): T-Value = 1.58 P-Value = 0.121 DF = 51

Based on the results of this hypothesis test, the p-value is equal to 0.121. This is greater than our set significance level of 5%, or .05, so we retained the null that the number of smokers in the household does not differ depending on whether or not the student smokes cigarettes. We proceeded with the analysis of cigarette smokers in the household with cigarette smokers and non-cigarette smokers together. Here is a frequency and relative frequency table of the number of cigarette smokers in the household:

Tally for: People who smoke in household

# of Cigarette Smokers in Household  Count  Percent
                                  0     69    58.97
                                  1     33    28.21
                                  2     12    10.26
                                  3      3     2.56
                                 N=    117

As seen in the frequency and relative frequency table, over half of the people surveyed (58.97%) had 0 people who smoked in their household. This is consistent with the idea that fewer people today are smoking cigarettes. Many of those surveyed had only one cigarette smoker in their household (28.21%), and very few had 2 or more (12.82% combined). Here is a histogram of the data:


As seen above, the histogram is skewed to the right, meaning that most of the students had 1 cigarette smoker or fewer in their household, with those who had 0 cigarette smokers accounting for more than half of those surveyed. This is well below the hypothesized 3 cigarette smokers per household. The following is a One-Sample T Test at the 5% significance level:

One-Sample T: People who smoke in household

Test of μ = 3 vs < 3

Variable            N    Mean   StDev  SE Mean  95% Upper Bound       T      P
Smokers in House  117  0.5641  0.7811   0.0722           0.6838  -33.73  0.000

Using the p-value approach, we reject the null hypothesis because the p-value of 0.000 is much lower than .05. We adopt our alternate hypothesis, which states that the mean number of household members who smoke cigarettes is less than 3.
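The left-tailed test statistic and the 95% upper bound can be reproduced from the summary statistics in the Minitab output above. This is only a sketch under one stated assumption: the critical value 1.658 is a one-sided 95% value for 116 degrees of freedom read from a t table, not a number taken from our report.

```python
import math

# Summary statistics as reported in the Minitab output above
n = 117
mean = 0.5641
stdev = 0.7811
mu0 = 3  # hypothesized mean number of smokers per household

# ASSUMPTION: one-sided 95% critical value of Student's t with n - 1 = 116
# degrees of freedom, read from a t table (not taken from the report).
t_crit = 1.658

se_mean = stdev / math.sqrt(n)
t_stat = (mean - mu0) / se_mean
upper_bound = mean + t_crit * se_mean

# Left-tailed test: reject Ho when the statistic falls below -t_crit
reject_null = t_stat < -t_crit
print(f"SE={se_mean:.4f} T={t_stat:.2f} upper bound={upper_bound:.4f} reject={reject_null}")
```

Because the upper bound sits far below 3, the confidence bound and the test lead to the same decision.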


Question 6: What do you believe to be a trigger for someone to start smoking?

What do you believe to be a trigger for someone to start smoking? (Fill In)

a. Peer Pressure b. Media c. Stress D. Other:___________

Students were asked to select what they believed to be the biggest trigger for someone to start smoking cigarettes. The proposed options were “Peer Pressure,” “Media,” and “Stress,” with an “Other” option in case students believed there was another trigger for beginning cigarette smoking. Due to the lack of statistical knowledge we had at the time of writing this survey, we neglected to add the instruction to circle only one answer. This error caused many students to circle more than one answer. Since we had to satisfy the assumptions for the one-way ANOVA testing, we decided to create two additional groups. Our new groups were “Peer Pressure,” “Media,” “Stress,” “Two Triggers,” and “Three Triggers.” To complete the data, we reviewed all the surveys and, for those who wrote an answer for “Other,” categorized the answer into one of the original groups of “Peer Pressure,” “Media,” and “Stress.” For example, in survey 34 the individual stated: “People I work with smoke,” so we grouped this in “Peer Pressure.” Below we examine what students at Raritan Valley Community College believed to be the biggest trigger for someone to start smoking. Three of the surveys did not provide an answer for this question and were not taken into consideration, so our sample was reduced to 114 for this question and for the ANOVA investigations we performed afterward:

Tally for: Biggest Trigger for Starting Smoking Cigarettes

Smoking Triggers  Count  Percent
Media                 3     2.63
Peer Pressure        38    33.33
Stress               44    38.60
Three Triggers       14    12.28
Two Triggers         15    13.16
N=                  114
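The Percent column of the tally above is simply each count divided by the 114 usable surveys. A short sketch that recomputes it from the counts:

```python
# Counts from the Question 6 tally above
counts = {
    "Media": 3,
    "Peer Pressure": 38,
    "Stress": 44,
    "Three Triggers": 14,
    "Two Triggers": 15,
}

n = sum(counts.values())
# Relative frequency of each group, as a percentage rounded to two decimals
percents = {trigger: round(100 * count / n, 2) for trigger, count in counts.items()}

# List the groups from most to least common
for trigger, pct in sorted(percents.items(), key=lambda item: item[1], reverse=True):
    print(f"{trigger:<15}{counts[trigger]:>4}{pct:>8.2f}")
```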


With the information from the frequency table, we can see that the two leading triggers, as believed by the students surveyed at Raritan Valley Community College, are “Stress,” making up 38.60%, followed closely by “Peer Pressure,” making up 33.33%. This did not come as a surprise, as it is generally believed that smoking cigarettes is a way to cope with stress. Below is a pie graph of the data, which provides a visual representation of how students answered this question:

What did come as a surprise in this study was that only 2.63% of students believed that “Media” was a sole trigger to begin cigarette smoking. This is shocking since media plays such an integral role in present society and includes television ads and films showing cigarette smoking. In fact, the American Cancer Society has stated the following regarding the media and its influence on smoking: “Tobacco use is also shown in video games, online, and on TV. And movies showing smokers are another big influence. Studies show that young people who see smoking in movies are more likely to start smoking” (“Why People Start Smoking and Why It’s Hard to Stop”). Given this, it is surprising to see that this category made up such a small percentage.


Question 7: What is your gender?

What is your gender? (Circle one)

a. Male
b. Female
c. Other

Students had to respond to this question with their gender. We chose to include “Other” as one of the options since somebody may not identify as either “Male” or “Female.” However, nobody responded with “Other,” so we eliminated it from our observations, as it would have no effect on the final data. First, we had to see whether the proportion of cigarette smokers differs by gender. Below are the results of the two-proportion Z test from Minitab:

Sample                       X   N  Sample p
male (Smoke) (1 = yes, 0    24  61  0.393443
female (Smoke) (1 = yes,     9  56  0.160714

Difference = p (male (Smoke) (1 = yes, 0= no)) - p (female (Smoke) (1 = yes, 0= no))
Estimate for difference:  0.232728
95% CI for difference:  (0.0769034, 0.388553)
Test for difference = 0 (vs ≠ 0):  Z = 2.93  P-Value = 0.003
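The difference, Z statistic, p-value, and interval in the output above can be recomputed from the two pairs of counts. The sketch below assumes the unpooled standard error of the difference, which is the form that reproduces the reported Z of 2.93; the report itself does not say which form Minitab used.

```python
import math
from statistics import NormalDist

# Counts from the Minitab output above: smokers (X) out of respondents (N)
x_male, n_male = 24, 61
x_female, n_female = 9, 56

p_male = x_male / n_male
p_female = x_female / n_female
diff = p_male - p_female

# ASSUMPTION: unpooled standard error of the difference in proportions
se = math.sqrt(p_male * (1 - p_male) / n_male + p_female * (1 - p_female) / n_female)

z = diff / se
std_normal = NormalDist()
p_value = 2 * (1 - std_normal.cdf(abs(z)))  # two-sided p-value

z_crit = std_normal.inv_cdf(0.975)  # about 1.96 for a 95% interval
ci = (diff - z_crit * se, diff + z_crit * se)
print(f"diff={diff:.6f} Z={z:.2f} P={p_value:.3f} CI=({ci[0]:.6f}, {ci[1]:.6f})")
```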

Because our p-value is 0.003 and our significance level is 5%, or .05, we rejected the null hypothesis that there is no difference between the genders of students in their cigarette smoking status. At the 5% significance level, the data provide sufficient evidence to conclude that a difference exists between males and females in terms of their cigarette smoking status. Since the hypothesis test results were significant, we will analyze the data for only the cigarette smokers. Below is a frequency and relative frequency table of the genders of cigarette smokers:

Gender of Smokers  Count  Percent
Female                 9    27.27
Male                  24    72.73
N=                    33

Males made up the greater share of the cigarette smoking students: 24 out of 33 were male, making up 72.73% of the smokers. Only 9 out of 33 were female, making up 27.27% of the smokers. Below is a pie chart of the genders of cigarette smoking students:


[Pie chart: Cigarette Smoking Students (By Gender)]

We researched information on cigarette smoking and gender from the World Health Organization (WHO), which reports the following: “Globally, about 40% of men smoke as compared with nearly 9% of women. However, the epidemic of tobacco use among women is increasing in some countries” (“10 Facts on Gender and Tobacco”). Since WHO is a worldwide recognized institution, we used its proportion of 0.09 for our hypothesis testing. This created our null hypothesis that 9% of women surveyed smoke. For our alternate hypothesis, we wanted to see if there is a difference from the WHO figure, which is why our alternate hypothesis is set to not equal to .09. When we ran the test in Minitab, “Female” was the specified attribute (or “Event”):

Ho: p = 0.09
Ha: p ≠ 0.09

Test and CI for One Proportion: Female Cigarette Smokers

Test of p = 0.09 vs p ≠ 0.09

Event = 1

Variable           X   N  Sample p         95% CI          P-Value (Exact)
Gender (X=Female)  9  33  0.272727  (0.132996, 0.455238)            0.002

The calculated p-value of 0.002 is significant at the 5% level, or .05. We had sufficient evidence to conclude that there is a difference between our proportion of female cigarette smokers and the statistic reported by WHO. From the confidence interval, we can be 95% confident that the proportion of female cigarette smokers is between .132996 and .455238. According to WHO, “…40% of men smoke compared with 9% of women” (“10 Facts on Gender and Tobacco”). Using this, we formulated null and alternate hypotheses for male cigarette smokers. We chose our alternate to be not equal to in order to find out whether the proportion of male smokers differs from the WHO figure:

Ho: p = 0.40
Ha: p ≠ 0.40

Here are the calculations for the one-proportion test, with “Male” set as the attribute, or “Event”:

Test and CI for One Proportion: Male Cigarette Smokers

Test of p = 0.4 vs p ≠ 0.4

Event = 1

Variable                 X   N  Sample p         95% CI          P-Value (Exact)
Male Cigarette Smokers  24  33  0.727273  (0.544762, 0.867004)            0.000

The calculated test results are above. The p-value is 0.000, which is lower than our set significance level of 5%, or .05. We reject the null and adopt the alternate hypothesis that there is a difference between the proportion of male cigarette smoking students and the figure for males reported by WHO. From the confidence interval, we can be 95% confident that the proportion of cigarette smoking males is between .544762 and .867004.
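The exact p-values reported for the female and male tests can be approximated with the binomial distribution directly. This sketch assumes one common definition of a two-sided exact p-value, namely the total probability of every outcome that is no more likely than the observed count; the report does not state which rule Minitab applies, and the function names are introduced here for illustration.

```python
import math


def binom_pmf(k, n, p):
    """Probability of exactly k events in n trials with event probability p."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def exact_two_sided_p(x, n, p0):
    """Sum the probabilities of all outcomes no more likely than the observed one."""
    pmf = [binom_pmf(k, n, p0) for k in range(n + 1)]
    # Small relative tolerance so ties are not lost to floating-point rounding
    cutoff = pmf[x] * (1 + 1e-7)
    return min(1.0, sum(prob for prob in pmf if prob <= cutoff))


# ASSUMPTION: the "no more likely than observed" rule described above
p_female = exact_two_sided_p(9, 33, 0.09)   # 9 females among 33 smokers, Ho: p = 0.09
p_male = exact_two_sided_p(24, 33, 0.40)    # 24 males among 33 smokers, Ho: p = 0.40
print(f"female: P={p_female:.3f}  male: P={p_male:.3f}")
```

Both values fall well below .05, in line with the decisions reached above.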


Question 8: Do you believe it is a possibility that smoking cigarettes can be eventually

eliminated? (Circle One)

a. Possible

b. Impossible

Students were asked to circle the option they believed to be true. Here are our null and alternative hypotheses for this proportion question:

Ho: p = .50
Ha: p ≠ .50

Like Question 4, we set our hypotheses as such because there is no concrete way to predict how students will feel about this question. So, we chose the default assumption that a proportion of .50 would feel that putting an end to cigarette smoking is “Possible” and set our alternate hypothesis to be not equal to, since students may feel one way or the other about this question. Here is a frequency and relative frequency table of the students’ responses:

Response    Count  Percent
Impossible     56    47.86
Possible       61    52.14
N=            117

Slightly more than half of the students responded that it is “Possible” that smoking cigarettes will eventually be eliminated, with 61 out of 117 responding this way. This made up 52.14%. However, a considerable number responded that the prospect of eliminating cigarette smoking was “Impossible.” They made up 56 out of 117, or 47.86%. From this table, the students’ responses did not lean dramatically toward either of the responses. Here is a pie chart of the data:


The pie chart reflects the same trend as the frequency and relative frequency table. Slightly more students responded that it was “Possible” to end cigarette smoking, while slightly fewer answered “Impossible” to our question. As seen by the slices of the pie graph, the students did not respond drastically toward either answer. Here are the data needed to perform a hypothesis test at the 5% significance level, along with the 95% Confidence Interval:

                      X    N  Sample p         95% CI          P-Value
Possible/Impossible  61  117  0.521368  (0.427057, 0.614570)    0.712

The p-value is .712, which is greater than our set significance level of 5%, or .05. We retain the null that the proportion of students answering “Possible” is equal to .50. At the 5% significance level, the data do not provide sufficient evidence to conclude that the proportion of students who believe cigarette smoking can be eliminated differs from .50. From the Confidence Interval, we can be 95% confident that the proportion of students answering that cigarette smoking can be eliminated is between .427057 and .614570.
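The confidence interval above is wider than the usual normal-approximation interval, which is what an exact (Clopper-Pearson) interval would look like. The sketch below assumes that is the method behind the Minitab interval; the helper names and the bisection solver are introduced here for illustration only.

```python
import math


def binom_cdf(x, n, p):
    """P(X <= x) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(x + 1))


def solve(func, target, lo=0.0, hi=1.0, iterations=60):
    """Bisection for func(p) = target, where func is decreasing in p on [lo, hi]."""
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if func(mid) > target:
            lo = mid  # p is still too small
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(x, n, confidence=0.95):
    """Exact two-sided interval for a binomial proportion (ASSUMED method)."""
    alpha = 1 - confidence
    # Lower limit: P(X >= x | p) = alpha/2, i.e. P(X <= x - 1 | p) = 1 - alpha/2
    lower = 0.0 if x == 0 else solve(lambda p: binom_cdf(x - 1, n, p), 1 - alpha / 2)
    # Upper limit: P(X <= x | p) = alpha/2
    upper = 1.0 if x == n else solve(lambda p: binom_cdf(x, n, p), alpha / 2)
    return lower, upper


lower, upper = clopper_pearson(61, 117)  # 61 of 117 answered "Possible"
print(f"95% CI: ({lower:.6f}, {upper:.6f})")
```

Since the interval contains .50, it agrees with the decision to retain the null.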


One-Way ANOVA Investigations: Numerical and Proportion

Before performing the ANOVA investigations, we had to eliminate three rows since the

students did not provide a response for question 6 about the triggers for starting cigarette

smoking. Because of this, we also had to eliminate the data from those surveys for what we were

testing against. For example, when we performed the investigation of Age versus the Trigger, we

had to omit the age data for those students who did not respond to question 6 in order to prevent

confusion. This reduced our sample size from 117 to 114 for these investigations. We ran

ANOVA tests for all of our numerical and proportional questions. Below are the ones that had a

significant p-value. If none of the tests yielded significance, we simply chose the investigation

that had the lowest p-value:

One-way ANOVA: Age versus Triggers for Starting Smoking

Method

Null hypothesis         All means are equal
Alternative hypothesis  At least one mean is different
Significance level      α = 0.05
Rows unused             3

Equal variances were assumed for the analysis.

Factor Information

Factor                Levels  Values
QUESTION 6 (Trigger)       5  Media, Peer Pressure, Stress, Three Triggers, Two Triggers

Analysis of Variance

Source                 DF  Adj SS  Adj MS  F-Value  P-Value
QUESTION 6 (Trigger)    4   189.6   47.41     1.62    0.175
Error                 109  3192.9   29.29
Total                 113  3382.5

Model Summary

      S   R-sq  R-sq(adj)  R-sq(pred)
5.41225  5.61%      2.14%       0.00%

Means

QUESTION 6
(Trigger)        N    Mean  StDev       95% CI
Media            3   21.00   0.00  ( 14.81,  27.19)
Peer Pressure   38   22.82   8.29  ( 21.08,  24.56)
Stress          44  20.477  2.246  (18.860, 22.094)
Three Triggers  14  19.286  1.978  (16.419, 22.153)
Two Triggers    15   22.33   5.21  ( 19.56,  25.10)

Pooled StDev = 5.41225

The p-value calculated from this investigation was 0.175. This is not significant since our chosen significance level was 5%, or .05. The data do not provide sufficient evidence to conclude that the mean age varies depending on what students believe to be a trigger for starting cigarette smoking.
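The F-value in the Age versus Triggers output can be rebuilt approximately from the group sizes, means, and standard deviations in its Means table. This is a sketch using the rounded values printed there, so the sums of squares differ slightly from Minitab's, which were computed from the raw ages.

```python
# (n, mean, standard deviation) per group, from the Means table above
groups = {
    "Media": (3, 21.00, 0.00),
    "Peer Pressure": (38, 22.82, 8.29),
    "Stress": (44, 20.477, 2.246),
    "Three Triggers": (14, 19.286, 1.978),
    "Two Triggers": (15, 22.33, 5.21),
}

n_total = sum(n for n, _, _ in groups.values())
grand_mean = sum(n * mean for n, mean, _ in groups.values()) / n_total

# Between-group variation: how far each group mean sits from the grand mean
ss_between = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in groups.values())
# Within-group variation: recovered from each group's standard deviation
ss_within = sum((n - 1) * sd**2 for n, _, sd in groups.values())

df_between = len(groups) - 1
df_within = n_total - len(groups)

ms_between = ss_between / df_between
ms_within = ss_within / df_within
f_value = ms_between / ms_within
pooled_sd = ms_within ** 0.5

print(f"DF=({df_between}, {df_within}) SS_between={ss_between:.1f} SS_within={ss_within:.1f}")
print(f"F={f_value:.2f} pooled StDev={pooled_sd:.3f}")
```

The pooled standard deviation is the square root of the error mean square, which is why it matches the S value in the Model Summary.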

One-way ANOVA: Do You Smoke Cigarettes versus Triggers for Starting Cigarette Smoking

Method

Null hypothesis         All means are equal
Alternative hypothesis  At least one mean is different
Significance level      α = 0.05
Rows unused             3

Equal variances were assumed for the analysis.

Factor Information

Factor                Levels  Values
QUESTION 6 (Trigger)       5  Media, Peer Pressure, Stress, Three Triggers, Two Triggers

Analysis of Variance

Source                 DF  Adj SS  Adj MS  F-Value  P-Value
QUESTION 6 (Trigger)    4   1.121  0.2803     1.37    0.250
Error                 109  22.326  0.2048
Total                 113  23.447

Model Summary

       S   R-sq  R-sq(adj)  R-sq(pred)
0.452579  4.78%      1.29%       0.00%

Means

QUESTION 6
(Trigger)        N      Mean     StDev          95% CI
Media            3  0.000000  0.000000  (-0.517881, 0.517881)
Peer Pressure   38    0.2105    0.4132  (   0.0650,   0.3560)
Stress          44    0.3636    0.4866  (   0.2284,   0.4989)
Three Triggers  14     0.429     0.514  (    0.189,    0.668)
Two Triggers    15     0.200     0.414  (   -0.032,    0.432)

Pooled StDev = 0.452579

The p-value for this test equated to 0.250. Given our significance level of 5%, or .05, we

had to retain the null. The data do not provide sufficient evidence to conclude that the mean

proportion of students who smoke cigarettes is influenced by what they believe to be a trigger to

start smoking.


Regression Investigation: Numerical Data

To perform the regression analysis, we ran each pair of our numerical questions to see if any of the variables were associated. It was appropriate to consider only the data exclusive to cigarette smokers for these investigations. For example, we only looked at the ages of the cigarette smokers and the number of people who smoke cigarettes in each cigarette smoker’s household. Upon calculating the p-values for each regression investigation, we found that the age of the smokers was associated with the number of smokers in their household. Below are the null and alternate hypotheses and the results of the test:

Ho: β = 0, meaning the age of the smokers (x) is not useful for predicting the number of cigarette smokers in the smokers’ households (y)
Ha: β ≠ 0, meaning the age of the smokers (x) is useful for predicting the number of cigarette smokers in the smokers’ households (y)

Regression Analysis: Smokers in House Smokers versus Ages of Smokers

Analysis of Variance

Source             DF  Adj SS  Adj MS  F-Value  P-Value
Regression          1   2.995  2.9954     4.41    0.044
  Ages of Smokers   1   2.995  2.9954     4.41    0.044
Error              31  21.065  0.6795
  Lack-of-Fit       6   2.932  0.4886     0.67    0.672
  Pure Error       25  18.133  0.7253
Total              32  24.061

Model Summary

       S    R-sq  R-sq(adj)  R-sq(pred)
0.824332  12.45%      9.63%       1.26%

Coefficients

Term               Coef  SE Coef  T-Value  P-Value   VIF
Constant          -1.79     1.22    -1.46    0.153
Ages of Smokers  0.1242   0.0591     2.10    0.044  1.00

Regression Equation

Smokers in House Smokers = -1.79 + 0.1242 Ages of Smokers


Fits and Diagnostics for Unusual Observations

      Smokers
     in House                  Std
Obs   Smokers    Fit  Resid  Resid
 12     3.000  0.569  2.431   3.01  R
 19     2.000  1.811  0.189   0.30     X
 20     3.000  1.315  1.685   2.20  R

R  Large residual
X  Unusual X
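Most of the summary figures in the regression output above follow from the sums of squares in its Analysis of Variance table and from the slope and its standard error. The sketch below recomputes them from the printed (rounded) values; the predict helper and the example age of 20 are hypothetical and only illustrate how the regression equation is used.

```python
import math

# Values copied from the Analysis of Variance and Coefficients tables above
ss_regression, df_regression = 2.995, 1
ss_error, df_error = 21.065, 31
ss_total, df_total = 24.061, 32
slope, se_slope = 0.1242, 0.0591
intercept = -1.79

ms_regression = ss_regression / df_regression
ms_error = ss_error / df_error

r_sq = ss_regression / ss_total                   # share of variation explained
r_sq_adj = 1 - ms_error / (ss_total / df_total)   # adjusted for degrees of freedom
f_value = ms_regression / ms_error
s = math.sqrt(ms_error)                           # standard error of the regression
t_slope = slope / se_slope                        # test statistic for Ho: β = 0


def predict(age):
    """Fitted number of household smokers for a smoker of the given age."""
    return intercept + slope * age


print(f"R-sq={r_sq:.2%} R-sq(adj)={r_sq_adj:.2%} F={f_value:.2f} S={s:.4f} T={t_slope:.2f}")
print(f"fit at age 20 (hypothetical): {predict(20):.3f}")
```

With a single predictor, the F-value is the square of the slope's T-value up to rounding, which is why both carry the same p-value of 0.044.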

The p-value is 0.044, which is less than our set significance level of 5%, or .05. So, we reject the null in favor of the alternative. At the 5% significance level, the data provide sufficient evidence to conclude that the ages of the cigarette smokers are useful for predicting the number of cigarette smokers in their household. Below is a scatter plot of the data, which includes R²:

[Fitted Line Plot: Smokers in House Smokers = - 1.790 + 0.1242 Ages of Smokers. Horizontal axis: Ages of Smokers. Vertical axis: Smokers in House Smokers. S 0.824332, R-Sq 12.4%, R-Sq(adj) 9.6%]

Based on this graph, the data points are only loosely clustered around the line of best fit. R² is equal to 12.4%, meaning that only a small part of the variation in the number of cigarette smokers in a cigarette smoker’s household is explained by the age of the cigarette smoker. To further examine the relationship between the two variables, we will examine the residual plot vs. predictor and the normal probability plot of the residuals:


[Residual Plots for Smokers in House Smokers. Panels: Normal Probability Plot (Percent vs. Standardized Residual); Versus Fits (Standardized Residual vs. Fitted Value); Histogram (Frequency vs. Standardized Residual); Versus Order (Standardized Residual vs. Observation Order)]

From the Normal Probability plot, the points do not follow a straight line, which suggests that the residuals do not have a normal distribution. Although this violates an assumption for regression testing, we needed to continue for the purposes of our test. In the residual plot, the points are not scattered evenly around the y = 0 line, and it is difficult to tell which assumptions from the textbook this violates.


Conclusion:

As we mentioned in the beginning, the primary focus of this study was to see if the prevalence of smoking cigarettes in the United States population is actually decreasing, and what the attitudes of students are in regard to the future of cigarette smoking. In concluding this study, we were able to gain some insight into the habits of the students surveyed at Raritan Valley Community College. When it came to the age of the students surveyed, there was no difference in age between smokers and nonsmokers. We had predicted a mean age of 26, but this proved to be untrue, as the mean age of those surveyed was under 22.5. When comparing the proportion of those who smoked with those who did not, the national proportion stated by the CDC is that 16.8% of the population smokes. Of everyone we surveyed, 28.21% smoked, and the hypothesis test led us to retain the null hypothesis. On the other hand, we were extremely happy to see that those who do smoke smoked fewer cigarettes per day than the national average of 14.2. RVCC students are smoking between 6.9891 and 11.79888 (7 and 12 rounded) cigarettes per day. Additionally, a great majority of the surveyed students who smoked were male. The WHO states that an overall 9% of women smoke, and in our study we found sufficient evidence that the proportion of female cigarette smoking students was not equal to .09 (9%). In our analysis regarding the number of people who smoked in the household, we were surprised to see that a majority of the students surveyed (58.97%) had no people (0) who smoked in their household, with the mean falling between 0.421 and 0.707 cigarette smokers per household. The triggers for smoking chosen by those surveyed proved to be a shock. Media had only 2.63% of the students’ responses, a number we found surprisingly low considering how involved the millennial generation is with technology and the access they have to social media. When we asked students whether fewer people are smoking now than in the past decade, our hypothesized response of “Neutral” was rejected after running the analysis, and we found the mean response to be not equal to it. Based on our tallies and histogram, many students answered “Agree.” One of the things we were pleased to see was the students’ responses about the future of cigarette smoking. We posed the question of whether or not cigarette smoking could eventually be eliminated, and were glad to see that 52.14% of the students surveyed answered that it would be possible, with the proportion falling between .427057 and .614570. Following our ANOVA analysis, we found that for the investigations having the lowest p-values among the proportion and numerical investigations, there was not sufficient evidence that any of the means differed. For the regression, our data violated the key principles needed to run a regression hypothesis test. However, we did so anyway and found that the age of the cigarette smoking students was mildly useful for predicting the number of cigarette smokers present in a cigarette smoker’s household.

Our sampling design did come with a lot of bias, simply because of the difficulty in obtaining a sufficient number of cigarette smokers from the classes surveyed. Because we had to approach students, this introduced bias into our study. While we were glad to see that most students in the classes we selected did not smoke, it created a problem in terms of running the analysis.

To conclude this survey, and ultimately answer the question on the prevalence of smoking at Raritan Valley Community College, we determined that there was not sufficient evidence to conclude that RVCC students are smoking less than the national average, but those who did smoke smoked fewer cigarettes per day than the national average reported by the CDC. In addition, our hypothesis testing and tallies show that RVCC students hold an optimistic view: many agree that fewer people are smoking cigarettes today, and many feel that smoking can eventually be eliminated.


Appendix

CIGARETTE SMOKING

Questions

1. How old are you (In years)?______________

2. Do you smoke cigarettes? (Circle one)

a. yes b. no

3. If so, how many cigarettes per day? (If you don’t smoke, please answer “ZERO”)

____________________

4. Less people are smoking now than in the past decade. (Please circle one)

Strongly Disagree      Disagree      Neutral      Agree      Strongly Agree

5. How many people in your household smoke, and approximately what are their age in the following table: (If you live alone answer “ZERO” for each)

Relation:               Smoke: Yes / No     Age (Approximate)

Spouse

Sibling(s)

Children

Parent

Other_______________

6. What do you believe to be a trigger for someone to start smoking?

a. Peer Pressure b. Media c. Stress D. Other:___________ (Fill In)

7. What is your gender? (Circle one)

a. Male
b. Female
c. Other

8. Do you believe it is a possibility that smoking cigarettes can be eventually eliminated? (Circle one)

a. Possible
b. Impossible


Works Cited

"10 Facts on Gender and Tobacco." World Health Organization. WHO, 2010. Web. 6 Dec. 2016.

"Adult Cigarette Smoking Rate Overall Hits All-time Low." Centers for Disease Control and Prevention. Centers for Disease Control and Prevention, 26 Nov. 2014. Web. 6 Dec. 2016.

"Facts about RVCC." N.d. Web. 6 Dec. 2016. http://www.raritanval.edu/about/foundation/facts.html

"Households and Families: 2010." Census. U.S. Census Bureau, Apr. 2012. Web. 6 Dec. 2016.

"Why People Start Smoking and Why It’s Hard to Stop." Why People Start Using Tobacco, and Why It's Hard to Stop. American Cancer Society, 13 Nov. 2015. Web. 6 Dec. 2016.
