Posted: June 13th, 2021

Homework 1

1. Assume that you are hired to investigate the causal effect of being raised in a high-poverty neighborhood in the US on outcomes in adulthood (such as health, well-being, social networks, and economic self-sufficiency). You can use the data sources file on Blackboard to answer the following questions:

- (a) Mention a suitable dataset that can help you answer the question above. Provide its name and the website where it can be downloaded.

- (b) What is the sample size in this dataset? Is this a reasonable number for your research?

- (c) Briefly describe the data you found in part (a). Using the codebook, discuss which variables are crucial for answering the research question posed above (no more than 10 lines).

2. Suppose you are a researcher interested in studying the relationship between household characteristics and the future educational outcomes of children. You have been advised that one dataset that meets your requirements is the Early Childhood Longitudinal Study, Birth Cohort (you will need to find it online). To answer the following questions, you will also need to locate the codebooks for this database. Note that you do not need the microdata to answer these questions.

- (a) Briefly describe the objectives of this study and the different rounds of the survey. Mention the methods employed for data collection. At what ages are the interviews conducted? (Your answer should not exceed 10 lines).

- (b) Describe the restrictions on the use of this database.

- (c) How many children are classified as low birth weight in the first round of the survey?

- (d) Describe the groups of variables available in the first round. Classify them into child characteristics, mother characteristics, and household characteristics.

- (e) Choose two variables you could employ as baseline characteristics of the household. Describe how these variables would be relevant for studying future outcomes of children.

- (f) Calculate the nonresponse rate in each of the two rounds that follow the first, relative to the initial number of individuals interviewed.

- (g) Suppose you are interested in studying how socio-emotional skills develop before the age of two. Describe which assessments included in this study could be used for this purpose. Does the study have similar assessments at older ages?

- (h) Describe which measurements can be used to analyze the cognitive skills of children in kindergarten.
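
For part (f), one common reading of the nonresponse rate is the share of first-round respondents who were not interviewed in a later round. The Stata lines below are a minimal sketch of that arithmetic under this assumption. The scalar names and the three counts are placeholders chosen for illustration; they are not taken from the study documentation, and the real counts must come from the codebooks.

```stata
* Placeholder counts, NOT taken from the study documentation.
scalar n_round1 = 1000
scalar n_round2 = 900
scalar n_round3 = 800

* Nonresponse relative to the initial sample: 1 - (completed / initial).
display "Round 2 nonresponse rate: " %5.3f (1 - n_round2 / n_round1)
display "Round 3 nonresponse rate: " %5.3f (1 - n_round3 / n_round1)
```

Other definitions exist (for example, nonresponse relative to the previous round rather than the first), so state which denominator you use.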

3. This problem asks you to work directly with Stata. Suppose you are a researcher interested in studying the labor market outcomes of recent college graduates. One suitable public-use dataset for this purpose is the National Survey of College Graduates (NSCG). To answer the following questions, you will need to use the attached documentation to identify the variables of interest.

- (a) Explore the survey using the codebook and survey description. Based on this, write down one scientific question (related to the topic mentioned above) which could be answered using the NSCG.

- (b) Use the codebook provided with the database to identify the variables related to hours worked per week, weeks worked per year, and annual earnings. Note that information about weeks worked can be derived using two variables.

- (c) After handling invalid values properly, create a table showing the mean and standard deviation of the three variables described in part (b) for men and women separately.

- (d) To see the distribution of hours worked per week, create a histogram of this variable for men and women. Plot the density on the y-axis and use a bin width of 10 on the x-axis.

- (e) Create a new variable lnhourwage defined as the (natural) logarithm of annual earnings divided by total hours worked during the year. Produce a table showing the mean, standard deviation, and 10th and 90th percentiles of this variable for men and women separately. Drop observations for which this variable is negative.

- (f) Use the interview questionnaire to identify the variable that indicates whether a respondent changed employer and/or job between 2013 and 2015, as well as the variables describing the reason for the change when the employer differs between these two years. What proportion of respondents stayed with the same employer and job during this period?

- (g) As a researcher, you are also interested in studying how the gender wage gap varies across major fields. Using the variable for the first bachelor's degree (nbamemg) and your variable lnhourwage, create a table showing the mean hourly wage for women and men across majors. Which major shows the largest wage gap?

- (h) Run a regression of hourly wages on education separately for men and women. How does the coefficient on education differ by gender?

- (i) Create a variable of potential experience ptlexper, defined as age - education - 6. Run a regression of hourly wages on education, potential experience, and potential experience squared separately for men and women. Interpret your results.
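
The variable constructions in parts (c), (d), (e) and (i) can be sketched in Stata as follows. This is a rough illustration on simulated data, not a solution for the NSCG. Every variable name except lnhourwage and ptlexper is hypothetical, the reserved code 9999998 is an assumption standing in for whatever invalid-value codes the codebook lists, and using the log hourly wage as the dependent variable is also an assumption.

```stata
* Simulated toy data. Variable names other than lnhourwage and ptlexper
* are hypothetical; real names and reserved codes come from the codebook.
clear
set seed 2021
set obs 500
generate byte female     = runiform() < 0.5
generate educ_years      = 16 + floor(6 * runiform())
generate age             = educ_years + 6 + floor(30 * runiform())
generate hrs_week        = 10 + floor(50 * runiform())
generate wks_year        = 20 + floor(33 * runiform())
generate double earnings = exp(9 + 0.08 * educ_years + rnormal(0, 0.5))

* Pretend 9999998 is a reserved "not applicable" code (an assumption),
* then recode it to missing before computing any statistics.
replace earnings = 9999998 in 1/10
mvdecode earnings hrs_week wks_year, mv(9999998)

* Part (c): mean and standard deviation by gender.
tabstat hrs_week wks_year earnings, by(female) statistics(mean sd)

* Part (d): density histogram of weekly hours with bin width 10, by gender.
histogram hrs_week, width(10) density by(female)

* Part (e): log of annual earnings over total annual hours.
generate lnhourwage = ln(earnings / (hrs_week * wks_year))
drop if lnhourwage < 0
tabstat lnhourwage, by(female) statistics(mean sd p10 p90)

* Part (i): potential experience and its square, then one regression
* per gender.
generate ptlexper  = age - educ_years - 6
generate ptlexper2 = ptlexper^2
bysort female: regress lnhourwage educ_years ptlexper ptlexper2
```

Note that `drop if lnhourwage < 0` leaves observations with a missing lnhourwage in place, because Stata treats missing values as larger than any number; tabstat and regress then skip them automatically.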
