# assignment ANA 500.docx

ANA 500

Use gretl and the GSS Dataset provided in the Exams folder to answer the questions below.

Upload a Word or PDF document that contains your answers.

Upload your gretl script file in addition to your Word or PDF document.

1. What are the elements/entities in this dataset? (3 points)

The elements are people (survey respondents).

2. How many variables are in this dataset? (3 points)

There are 16 variables in this dataset.

3. How many observations are in this dataset? (3 points)

There are

4. What type of variable is “wwwhr”? What type of variable is “childs”? What type of variable is “sex”? (6 points)

5. (10 points)

a) Calculate simple descriptive statistics for the variables: hrs1, wwwhr, age, and educ.

b) Provide estimated density plots for the variables wwwhr and age. Do the variables appear to be skewed? If so, in which direction?

c) Calculate the correlation coefficient between age and the number of hours per week spent on the internet. Provide a scatterplot for these two variables. Is the correlation positive or negative? Does the correlation appear strong or weak?

d) Calculate the correlation coefficient between age and years of education. Provide a scatterplot for these two variables. Is the correlation positive or negative? Does the correlation appear strong or weak?
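The correlation coefficient asked for in parts c) and d) can be illustrated with a minimal sketch. gretl is the required tool for the assignment; this Python snippet only shows the arithmetic behind Pearson's r. The `age` and `hours` lists are made-up values for illustration, not GSS data, and the helper name `pearson_r` is hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation computed directly from its definition."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squared deviations
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical values (NOT taken from the GSS dataset)
age = [20, 30, 40, 50, 60]
hours = [20, 15, 12, 8, 5]

r = pearson_r(age, hours)
print(round(r, 3))
```

The sign of r gives the direction of the relationship, and values near -1 or 1 indicate a strong linear association, while values near 0 indicate a weak one.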

6. The variable “childs” measures the number of children the person has. The variable “marital” describes the person’s marital status. (10 points)

a) Create a new variable named “anychildren” that equals 0 if the person has no children and equals 1 if they have any children at all.

b) Create a new variable named “married” that equals 0 if the person is not currently married and equals 1 if they are currently married (note: the category “separated” should be coded with a 0).

c) What proportion of respondents have at least one child?

d) What proportion of respondents are married?

e) Produce graphs of the frequency distributions for the two variables you created. Which variable is more evenly distributed?

f) Produce a cross-tabulation for the two variables you created. What percentage of respondents who are married have at least one child?

g) Produce separate graphs of the frequency distribution for the anychildren variable for those who are married and those who are not.
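The recoding in parts a) and b) and the proportions in parts c), d), and f) can be sketched as follows. This is an illustration of the logic only, not a gretl script. The rows are made-up values, and the string labels used for `marital` are an assumption; the provided GSS file may use different labels or numeric codes.

```python
from collections import Counter

# Hypothetical respondents (NOT GSS data): (childs, marital)
rows = [
    (0, "never married"),
    (2, "married"),
    (1, "separated"),
    (3, "married"),
    (0, "married"),
    (1, "divorced"),
]

# Indicator variables: 1 if the condition holds, 0 otherwise.
# "separated" is coded 0 because only "married" maps to 1.
anychildren = [1 if childs > 0 else 0 for childs, _ in rows]
married = [1 if marital == "married" else 0 for _, marital in rows]

# The mean of a 0/1 variable is the proportion of ones.
prop_children = sum(anychildren) / len(anychildren)
prop_married = sum(married) / len(married)

# Cross-tabulation: counts of each (married, anychildren) pair.
crosstab = Counter(zip(married, anychildren))

# Share of married respondents who have at least one child.
n_married = crosstab[(1, 0)] + crosstab[(1, 1)]
married_with_child = crosstab[(1, 1)] / n_married

print(prop_children, prop_married, married_with_child)
```

The percentage in part f) is a row percentage: it conditions on being married, so the denominator is the number of married respondents rather than the full sample.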

7. Confidence intervals (assume population standard deviations are unknown) (15 points)

a) Estimate 95% confidence intervals for the mean number of hours worked last week and the mean number of hours spent on the internet per week.

b) Now, estimate 90% confidence intervals for the mean number of hours worked last week and the mean number of hours spent on the internet per week. Which confidence intervals are wider, the 95% intervals or the 90% intervals? Why?

c) Estimate 95% confidence intervals for the proportion with any children and the proportion currently married.
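As a rough illustration of part c), the sketch below computes a large-sample (normal approximation) confidence interval for a proportion. The counts are hypothetical, not GSS results, and the function name `proportion_ci` is made up for this example. The intervals for means in parts a) and b) would use the t distribution instead, since the population standard deviation is unknown, but the structure (estimate plus or minus critical value times standard error) is the same.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(successes, n, level=0.95):
    """Normal-approximation confidence interval for a proportion."""
    p_hat = successes / n
    # Two-sided critical value, e.g. about 1.96 for a 95% level
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Hypothetical counts (NOT GSS results): 140 of 200 respondents
lo95, hi95 = proportion_ci(140, 200, level=0.95)
lo90, hi90 = proportion_ci(140, 200, level=0.90)

print(round(lo95, 4), round(hi95, 4))
print(round(lo90, 4), round(hi90, 4))
```

A higher confidence level requires a larger critical value, which widens the interval for the same data.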

8. Calculate descriptive statistics for the variables hrs1, wwwhr, age, and educ by whether or not the survey respondent has any children (i.e., calculate separate descriptive statistics for respondents with children and respondents without children). (4 points)

9. The variable “polviews” describes the political views of the respondent. (6 points)

a) What is the average age for each of the political view categories?

b) What is the average years of education for each of the political view categories?

10. Hypothesis tests (Use a 5% level of significance for all tests. For tests involving means, assume the population standard deviation is unknown) (40 points)

a) Test the hypothesis that the mean age is greater than 38.

b) Test the hypothesis that the mean age is less than 40.

c) Test the hypothesis that the proportion of people with a child is less than 0.70.

d) Test the hypothesis that the proportion of people who are married is greater than 0.50.

e) Test the hypothesis that the variables anychildren and married are related (i.e. not independent).

f) Test the hypothesis that the mean hours spent on the internet is different between people with children and respondents without children.

g) Test the hypothesis that the mean hours spent on the internet is different between people that are married and people that are not married.

h) Test the hypothesis that mean age differs across the political view categories.

i) Test the hypothesis that the mean years of education differs across the political view categories.

j) Test the hypothesis that the mean hours worked differs across political view categories.

k) Test the hypothesis that the mean hours spent on the internet differs across political view categories.
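The one-sided proportion tests in parts c) and d) can be sketched with a z test. This is a minimal illustration under the usual large-sample assumption, not the gretl procedure, and the counts are hypothetical rather than GSS results. The function name `one_sided_proportion_test` is made up for this example.

```python
from math import sqrt
from statistics import NormalDist

def one_sided_proportion_test(successes, n, p0, alternative):
    """One-sample z test for a proportion with a one-sided alternative."""
    p_hat = successes / n
    # Standard error uses the hypothesized proportion p0 (true under H0)
    se = sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / se
    if alternative == "less":
        p_value = NormalDist().cdf(z)
    elif alternative == "greater":
        p_value = 1 - NormalDist().cdf(z)
    else:
        raise ValueError("alternative must be 'less' or 'greater'")
    return z, p_value

# Hypothetical counts (NOT GSS results): 130 of 200 have a child.
# H0: p = 0.70 versus H1: p < 0.70, tested at the 5% level.
z, p_value = one_sided_proportion_test(130, 200, 0.70, "less")
reject = p_value < 0.05

print(round(z, 3), round(p_value, 4), reject)
```

The null hypothesis is rejected only when the p-value falls below the 5% significance level. The tests on means in this question would use t statistics, and the comparisons across political view categories would typically use an analysis of variance.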