Theses Doctoral

# Essays on Statistical Decision Theory and Econometrics

This dissertation studies statistical decision making in various guises. I start by providing a general decision-theoretic model of statistical behavior, and then analyze two particular instances that fit within that framework.

Chapter 1 studies statistical decision theory (SDT), a class of models pioneered by Abraham Wald to analyze how agents use data when making decisions under uncertainty. Despite its prominence in information economics and econometrics, SDT has not been given formal choice-theoretic or behavioral foundations. This chapter axiomatizes preferences over decision rules and experiments for a broad class of SDT models. The axioms show that certain seemingly natural decision rules are incompatible with this broad class of SDT models. Using this representation result, I then develop a methodology to translate axioms from classical decision theory, à la Anscombe and Aumann (1963), to the SDT framework. The usefulness of this toolkit is then illustrated by translating various classical axioms, which serve to refine my baseline framework into more specific statistical decision-theoretic models, some of which are novel to SDT. I also discuss foundations for SDT under other kinds of choice data.

Chapter 2 studies statistical identifiability of finite mixture models. If a model is not identifiable, multiple combinations of its parameters can lead to the same observed distribution of the data, which greatly complicates, if not invalidates, causal inference based on the model. High-dimensional latent parameter models, which include finite mixtures, are widely used in economics, but are only guaranteed to be identifiable under specific conditions. Since these conditions are usually stated in terms of the hidden parameters of the model, they are seldom testable using noisy data. This chapter provides a condition which, when imposed on the directly observable mixture distribution, guarantees that a finite mixture model is non-parametrically identifiable. Since the condition relates to an observable quantity, it can be used to devise a statistical test of identification for the model. Thus I propose a Bayesian test of whether the model is close to being identified, which the econometrician may apply before estimating the parameters of the model. I also show that, when the model is identifiable, approximate non-negative matrix factorization provides a consistent, likelihood-free estimator of mixture weights.
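As a minimal illustration of what a failure of identifiability looks like, consider a finite mixture whose observed distribution is a weighted average of component distributions. The notation below ($P$, $\pi_k$, $F_k$, and the specific weights) is illustrative and is not drawn from the chapter.

$$
P \;=\; \sum_{k=1}^{K} \pi_k F_k, \qquad \pi_k \ge 0, \qquad \sum_{k=1}^{K} \pi_k = 1 .
$$

With $K = 2$ and no restrictions on the components, two different parameter combinations can generate the same $P$:

$$
\tfrac12 F_1 + \tfrac12 F_2 \;=\; \tfrac12 F_1' + \tfrac12 F_2',
\qquad
F_1' = \tfrac34 F_1 + \tfrac14 F_2, \quad F_2' = \tfrac14 F_1 + \tfrac34 F_2 .
$$

Whenever $F_1 \neq F_2$, the pair $(F_1', F_2')$ differs from $(F_1, F_2)$, yet the data cannot distinguish between them. Identification conditions rule out this kind of observational equivalence, up to a relabeling of the components.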

Chapter 3 studies the robustness of pricing strategies when a firm is uncertain about the distribution of consumers' willingness-to-pay. When the firm has access to data to estimate this distribution, a simple strategy is to implement the mechanism that is optimal for the estimated distribution. We find that such an empirically optimal mechanism boasts strong profit and regret guarantees. Moreover, we provide a toolkit to evaluate the robustness properties of different mechanisms, showing how to consistently estimate and conduct valid inference on the profit generated by any given mechanism, which makes it possible to evaluate and compare the probabilistic revenue guarantees of different mechanisms.
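The idea of an empirically optimal mechanism can be sketched in its simplest form: a firm selling one product at a single posted price, with a sample of observed willingness-to-pay values. This is a hedged illustration under those assumptions, not the chapter's own procedure; the function names `empirical_revenue` and `empirically_optimal_price` and the sample data are hypothetical.

```python
from typing import Sequence, Tuple


def empirical_revenue(price: float, valuations: Sequence[float]) -> float:
    """Average revenue per consumer from posting `price` to the sample.

    A consumer buys when their willingness-to-pay is at least the price.
    """
    buyers = sum(1 for v in valuations if v >= price)
    return price * buyers / len(valuations)


def empirically_optimal_price(valuations: Sequence[float]) -> Tuple[float, float]:
    """Return the (price, revenue) pair that maximizes empirical revenue.

    Only sampled values need to be checked: raising a price up to the next
    sampled value loses no buyers, so some maximizer lies on the sample.
    Assumes non-negative valuations (an assumption of this sketch).
    """
    if not valuations:
        raise ValueError("need at least one observed valuation")
    candidates = sorted(set(valuations))
    best = max(candidates, key=lambda p: empirical_revenue(p, valuations))
    return best, empirical_revenue(best, valuations)


# Hypothetical sample of willingness-to-pay observations.
sample = [1.0, 2.0, 2.0, 3.0, 5.0]
price, revenue = empirically_optimal_price(sample)
print(price, revenue)
```

In this toy sample, a price of 2 is bought by four of the five consumers, which beats the alternatives of 1, 3, and 5. The robustness question is how well the profit from such a plug-in price holds up under the true, unknown distribution.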