OSC 2019 - Parramatta

Oceania Stata Conference Presentations

Demetris Christodoulou, The University of Sydney

Data Visualisation Using Stata from First Principles

I am developing extensive web-based resources that teach data visualisation using Stata from first principles. Some of these materials form part of a course that I teach at the University of Sydney on “Visual Data Analytics”; see https://sydney.edu.au/courses/units-of-study/2018/qbus/qbus6860.html. The free web resources offer two approaches to learning data visualisation. The homepage will offer a random palette of graphs for browsing content by type; this unstructured learning suits those who want to find out how to make one of the several advanced graphical forms, and the Stata code for exact reproduction is provided at the end of each page. The website’s main objective, however, is to teach a structural workflow approach to data visualisation, using the theory of graphs, visual perception and statistical tools specifically designed for visual analysis.

Chuck Huber, StataCorp

Causal Inference for Complex Observational Data

Observational data often have issues that present challenges for the data analyst. The treatment status or exposure of interest is often not assigned randomly. Data are sometimes missing not at random (MNAR), which can lead to sample selection bias. And many statistical models for these data must account for unobserved confounding. This talk will demonstrate how to use standard maximum likelihood estimation to fit extended regression models (ERMs) that deal with all of these common issues, alone or simultaneously.

Michael Keane and Tim Neal, University of New South Wales

Implementing the Keane and Runkle Approach for Fitting Dynamic Panel Data Models in Stata Using xtkr

In this presentation we discuss the xtkr command for Stata, which implements the Keane and Runkle (1992) approach for fitting dynamic panel data models with fixed effects and weakly exogenous regressors. Monte Carlo simulations show that, in certain situations, this approach offers an improvement over the popular difference generalized method of moments and system generalized method of moments estimators in terms of bias and root mean squared error. An empirical application to cigarette demand also demonstrates its usefulness for applied researchers.

Choonjoo Lee, Korea National Defense University & Jinwoo Lee, Auckland University of Technology

Technology Forecasting using Data Envelopment Analysis in Stata

This presentation introduces a user-written Stata program for Technology Forecasting using Data Envelopment Analysis (TFDEA). TFDEA was applied to predict the technological dynamics of smartphones. Data sets with more than 5,000 observations were collected from the Web using data mining techniques. We compare the results with previous studies and discuss data management for large data sets.

Ning Li, Australian Mathematical Sciences Institute (AMSI)

Longitudinal Study on Age and Life Satisfaction

One of the major studies in the world, utilizing data on millions of people from a wide range of countries, found that the level of happiness is U-shaped in age. This U-shaped relation, repeatedly confirmed by many other studies, has given rise to the prevailing notion of a midlife crisis in many countries, including Australia. According to this U-shape, happiness improves with age after midlife. This seems to contradict general expectations because health often declines with age. Arguments against the claimed U-shape became extremely controversial after a group of studies found convincing evidence that happiness decreases with age after midlife. These studies reached this conclusion by analysing longitudinal data and controlling for individual fixed effects. In the age-happiness literature, different research groups have produced qualitatively different results from the same data set, hence a mystery. In this talk, I will reproduce the results of each side of the debate and provide a new standpoint from which to view the seemingly conflicting results. All the analyses are implemented in Stata.

Federico Masera, University of New South Wales

discretize: Command to Convert a Continuous Instrument into a Dummy Variable for Instrumental Variable Estimation

The Instrumental Variable (IV) method is a standard econometric approach to address endogeneity issues (i.e. when an explanatory variable is correlated with the error term). It relies on finding an instrument that is excluded from the outcome equation (second stage) but is a determinant of the endogenous variable of interest (first stage). Many instruments rely on cross-sectional variation produced by a dummy variable, which is discretized from a continuous variable. There are several reasons for converting a continuous variable into a binary instrument. First, continuous instruments recoded as dummies have been shown to provide a parsimonious non-parametric model for the underlying first stage relation (Angrist & Pischke, 2009). Second, a binary instrument provides a simple tool to evaluate the IV strategy and the identification assumptions. Unfortunately, the construction of the binary instrument often appears to be arbitrary, which may raise concerns about the robustness of the second stage results. We propose a data-driven procedure to build this discrete instrument, implemented in a command called discretize. The boundaries of the discrete variable are chosen to maximize the F-statistic in the first stage. This procedure has two main advantages. First, it minimizes the weak instrument problem, which can arise when the functional form of the first stage is misspecified. Second, it offers a transparent, data-driven procedure to select an instrument that does not depend on arbitrary decisions made by the researcher. Several options are available with the command to check graphically the robustness of the first and second stage parameters. The presentation includes an explanation of how the discretize command works, as well as an illustration of its usefulness with an example that relates the rise of violent crime in city centres to the process of suburbanisation. The endogeneity is addressed using lead poisoning as the instrument.
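The idea of choosing the boundary that maximizes the first stage F-statistic can be illustrated with a minimal sketch. This is not the discretize command itself, and it is written in Python rather than Stata: it assumes a single cutoff, a first stage with no other covariates, and simulated data, and the names first_stage_f and best_cutoff are hypothetical.

```python
import random


def first_stage_f(x, d):
    """F-statistic from regressing x on a constant and a 0/1 dummy d."""
    n = len(x)
    n1 = sum(d)
    n0 = n - n1
    if n1 < 2 or n0 < 2:
        return 0.0  # cutoff leaves one group (almost) empty
    mean1 = sum(xi for xi, di in zip(x, d) if di == 1) / n1
    mean0 = sum(xi for xi, di in zip(x, d) if di == 0) / n0
    mean_all = sum(x) / n
    # Explained and residual sums of squares for the two-group regression
    ess = n1 * (mean1 - mean_all) ** 2 + n0 * (mean0 - mean_all) ** 2
    rss = sum((xi - (mean1 if di == 1 else mean0)) ** 2 for xi, di in zip(x, d))
    if rss == 0:
        return float("inf")
    return ess / (rss / (n - 2))


def best_cutoff(z, x, candidates):
    """Grid search: return the cutoff on z with the largest first stage F."""
    scored = []
    for c in candidates:
        d = [1 if zi > c else 0 for zi in z]
        scored.append((first_stage_f(x, d), c))
    f_stat, cutoff = max(scored)
    return cutoff, f_stat


# Simulated data (an assumption for illustration): the endogenous variable x
# jumps when the continuous instrument z exceeds 6.
rng = random.Random(2019)
z = [rng.uniform(0, 10) for _ in range(500)]
x = [2.0 * (zi > 6.0) + rng.gauss(0, 1) for zi in z]

candidates = [1 + 0.5 * i for i in range(17)]  # 1.0, 1.5, ..., 9.0
cutoff, f_stat = best_cutoff(z, x, candidates)
print("chosen cutoff:", cutoff, "first stage F:", round(f_stat, 1))
```

In this sketch the search is a plain grid over candidate cutoffs; how the actual command searches for boundaries, and how it handles covariates or more than two categories, is not described in the abstract.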

Irma Mooi-Reci, The University of Melbourne

The Cumulative Disadvantage of Unemployment: Longitudinal Evidence Across Gender and Age in Germany

Unemployment is an important predictor of one’s future employment success. Yet much about the endurance of unemployment effects on workers’ careers, and how they evolve and play out over time, remains poorly understood. Our study addresses this knowledge gap by examining the quality of career trajectories following an unemployment spell among a representative sample of previously unemployed workers with different socio-demographic characteristics in Germany. We apply a new dynamic measure for sequence quality that extends Stata’s sqset package to quantify the quality of binary sequences, distinguishing “good” (i.e., employment) from “bad” labor force status activities (i.e., unemployment and inactivity). The advantage of this newly developed measure is that it captures the volatility of labor force trajectories and their evolution since the occurrence of an initial unemployment spell. In addition, and more importantly, it quantifies the quality of labor force status trajectories in a dynamic way such that the measure decreases with unfavorable activities (such as unemployment and inactivity) and increases with favorable employment experiences. As such, this is the first sequence-based measure that quantifies the overall quality of labor force outcomes, and thus the career recovery from unemployment. We use longitudinal data from the German Socio-Economic Panel (GSOEP) before the Great Financial Recession over the period 1984-2005 and deploy a series of hybrid models that control for unobserved heterogeneity. Our results demonstrate a deteriorating trend in career quality since an initial spell of unemployment. This finding provides evidence for a cumulative disadvantage process following unemployment. We also find that recovery processes are contingent upon when respondents experience unemployment.

Steve Quinn, Swinburne University

Non-linear Regression Using Stata and Sigmaplot

The luminance-response function of the brief flash full field photopic ERG rises to a peak before falling to a sub-maximal plateau. Previous work has shown that the on and off responses inherent in this function suggest that it can be modelled by the sum of a Gaussian density function and a logistic distribution function. This talk discusses the non-linear modelling used to obtain these functions and the advantages of using both Stata v15 and Sigmaplot to obtain results.
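The shape described above, a peak followed by a lower plateau, can be reproduced by adding a Gaussian-shaped term to a logistic term. The sketch below, in Python rather than Stata, only evaluates such a curve; it does not fit anything. The parameterisation (a scaled Gaussian in log luminance plus a scaled logistic) and every parameter value are assumptions chosen for illustration, not the values or the exact form used in the talk.

```python
import math


def gaussian_term(log_lum, amplitude, centre, width):
    """Scaled Gaussian density shape: a bump centred at `centre`."""
    return amplitude * math.exp(-((log_lum - centre) ** 2) / (2 * width ** 2))


def logistic_term(log_lum, maximum, midpoint, scale):
    """Scaled logistic distribution function: rises from 0 to `maximum`."""
    return maximum / (1 + math.exp(-(log_lum - midpoint) / scale))


def response(log_lum):
    """Sum of the two terms, with hypothetical parameter values."""
    return (gaussian_term(log_lum, amplitude=60.0, centre=0.5, width=0.4)
            + logistic_term(log_lum, maximum=80.0, midpoint=0.0, scale=0.25))


# Evaluate on a grid of log luminance values from -2 to 3.
grid = [-2 + 0.05 * i for i in range(101)]
values = [response(v) for v in grid]

peak_value = max(values)
peak_at = grid[values.index(peak_value)]
plateau = values[-1]  # response at the brightest flash on the grid
print("peak at log luminance", round(peak_at, 2))
print("peak", round(peak_value, 1), "plateau", round(plateau, 1))
```

With these assumed values the Gaussian term produces the peak and dies away at high luminance, leaving the logistic term to set the plateau. In practice the parameters would be estimated from data by non-linear least squares.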

Peter Robinson, Australian Pharmacy Council

Streamlining Resulting and Analysis for High-stakes Examinations

Under the National Registration and Accreditation Scheme, the Australian Pharmacy Council Ltd (APC) is the designated independent accreditation agency for Australian pharmacy until June 2024. As part of this delegation from the Pharmacy Board of Australia, the APC is responsible for delivering high-stakes computer-delivered pharmacy examinations. These include examinations held overseas that are part of the assessment process for provisional registration of pharmacists in Australia. APC offers 12 examinations per year and develops several parallel forms per examination. Until 2019, examination questions were stored in an in-house database. Results were produced using a combination of the database facility and Excel programs. Individual item responses were analysed with a Classical Item Theory approach using an Excel add-on, and the results were imported back into the database to inform future use of the items. In early 2019, APC moved its item bank into ExamDeveloper software, which provides a range of management tools and the ability to store a wide range of variables to inform the development of examinations. During 2019, APC has also progressively moved to implement a Rasch modelling approach. This presentation will describe the progress and outcomes of a project that uses Stata to develop automated resulting; improve data management, integrity and accountability; improve the efficiency of calculating item statistics, graphics and tools for the analysis of individual questions and of examinations as a whole; and automate, as far as practicable, reporting to stakeholders.

William W Tyler, Charles Darwin University

Strains and Gains: Estimating a First Year University Student Engagement Effect with ERM

Evaluation studies with observational data have been vastly enriched by the possibilities offered by Stata’s Extended Regression Model (ERM) framework, which can account simultaneously for endogeneity at the covariate, treatment assignment and sample selection levels, with a robust clustering option. The paper examines the challenges of building and testing an extended regression model for estimating the causal effect of online engagement on first year academic outcomes. With reference to an exploratory counterfactual model, the paper makes a number of suggestions for the further development of the extended regression framework.

Oceania Stata Conference 2019 – Parramatta

20 August 2019