Academic Commons Search Results
https://academiccommons.columbia.edu/catalog?action=index&controller=catalog&f%5Bauthor_facet%5D%5B%5D=Gelman%2C+Andrew+E.&format=rss&fq%5B%5D=has_model_ssim%3A%22info%3Afedora%2Fldpd%3AContentAggregator%22&q=&rows=500&sort=record_creation_date+desc
A Practical Guide to Measuring Social Structure Using Indirectly Observed Network Data
https://academiccommons.columbia.edu/catalog/ac:185370
McCormick, Tyler H.; Moussa, Amal; DiPrete, Thomas A.; Ruf, Johannes; Gelman, Andrew E.; Teitler, Julien O.; Zheng, Tian
10.7916/D86H4G9D
Thu, 29 Jun 2017 03:41:05 +0000
Aggregated relational data (ARD) are an increasingly common tool for learning about social networks through standard surveys. Recent statistical advances present social scientists with new options for analyzing such data. In this article, we propose guidelines for learning about various network processes using ARD and a template to aid practitioners. We first propose that ARD can be used to measure “social distance” between a respondent and a subpopulation (individuals named Kevin, those in prison, or those serving in the military). We then present common methods for analyzing these data and associate each of these methods with a specific way of measuring social distance, thus associating statistical tools with their underlying social science phenomena. We examine the implications of using each of these social distance measures using an Internet survey about contemporary political issues.
Statistics, Social sciences--Research
thm2105, am2810, tad61, ag389, jot8, tz33
Sociology, Statistics, Social Work
Articles

Protecting Minorities in Large Binary Elections: A Test of Storable Votes Using Field Data
https://academiccommons.columbia.edu/catalog/ac:182487
Casella, Alessandra M.; Gelman, Andrew E.; Ehrenberg, Shuky; Shen, Jie
10.7916/D8KH0M4Q
Thu, 29 Jun 2017 03:39:52 +0000
The legitimacy of democratic systems requires the protection of minority preferences while ideally treating every voter equally. During the 2006 student elections at Columbia University, we asked voters to rank the importance of different contests and to choose where to cast a single extra "bonus vote," had one been available, a simple version of Storable Votes. We then constructed distributions of intensities and electoral outcomes and estimated the probable impact of the bonus vote through bootstrapping techniques. The bonus vote performs well: when minority preferences are particularly intense, the minority wins at least one contest with 15-30 percent probability; when the minority wins, aggregate welfare increases with 85-95 percent probability. The paper makes two contributions: it tests the performance of storable votes in a setting where preferences were not controlled, and it suggests the use of bootstrapping techniques when appropriate replications of the data cannot be obtained.
Political science
ac186, ag389
Economics, Statistics
Articles

How Many People Do You Know in Prison? Using Overdispersion in Count Data to Estimate Social Structure in Networks
https://academiccommons.columbia.edu/catalog/ac:185364
Zheng, Tian; Salganik, Matthew J.; Gelman, Andrew E.
10.7916/D800011W
Thu, 29 Jun 2017 03:38:21 +0000
Networks (sets of objects connected by relationships) are important in a number of fields. The study of networks has long been central to sociology, where researchers have attempted to understand the causes and consequences of the structure of relationships in large groups of people. Using insight from previous network research, Killworth et al. and McCarty et al. have developed and evaluated a method for estimating the sizes of hard-to-count populations using network data collected from a simple random sample of Americans. In this article we show how, using a multilevel overdispersed Poisson regression model, these data also can be used to estimate aspects of social structure in the population. Our work goes beyond most previous research on networks by using variation, as well as average responses, as a source of information. We apply our method to the data of McCarty et al. and find that Americans vary greatly in their number of acquaintances. Further, Americans show great variation in propensity to form ties to people in some groups (e.g., males in prison, the homeless, and American Indians), but little variation for other groups (e.g., twins, people named Michael or Nicole). We also explore other features of these data and consider ways in which survey data can be used to estimate network structure.
Statistics, Social sciences--Research
tz33, ag389
Statistics, Political Science
Articles

R2WinBUGS: A Package for Running WinBUGS from R
https://academiccommons.columbia.edu/catalog/ac:154734
Sturtz, Sibylle; Ligges, Uwe; Gelman, Andrew E.
10.7916/D80C55HH
Tue, 27 Jun 2017 15:43:29 +0000
The R2WinBUGS package provides convenient functions to call WinBUGS from R. It automatically writes the data and scripts in a format readable by WinBUGS for processing in batch mode, which is possible since version 1.4. After the WinBUGS process has finished, it is possible either to read the resulting data into R by the package itself (which gives a compact graphical summary of inference and convergence diagnostics) or to use the facilities of the coda package for further analyses of the output. Examples are given to demonstrate the usage of this package.
Statistics
ag389
Statistics
Articles

Multiple Imputation with Diagnostics (mi) in R: Opening Windows into the Black Box
https://academiccommons.columbia.edu/catalog/ac:154731
Su, Yu-Sung; Gelman, Andrew E.; Hill, Jennifer; Yajima, Masanao
10.7916/D8VQ3CD3
Tue, 27 Jun 2017 15:43:28 +0000
Our mi package in R has several features that allow the user to get inside the imputation process and evaluate the reasonableness of the resulting models and imputations. These features include: choice of predictors, models, and transformations for chained imputation models; standard and binned residual plots for checking the fit of the conditional distributions used for imputation; and plots for comparing the distributions of observed and imputed data. In addition, we use Bayesian models and weakly informative prior distributions to construct more stable estimates of imputation models. Our goal is to have a demonstration package that (a) avoids many of the practical problems that arise with existing multivariate imputation programs, and (b) demonstrates state-of-the-art diagnostics that can be applied more generally and can be incorporated into the software of others.
Statistics
ag389
Statistics
Articles

Bayesian Statistical Pragmatism
https://academiccommons.columbia.edu/catalog/ac:154737
Gelman, Andrew E.
10.7916/D8MC98QJ
Tue, 27 Jun 2017 15:39:20 +0000
I agree with Rob Kass’ point that we can and should make use of statistical methods developed under different philosophies, and I am happy to take the opportunity to elaborate on some of his arguments.
Statistics
ag389
Statistics
Articles

Segregation in Social Networks Based on Acquaintanceship and Trust
https://academiccommons.columbia.edu/catalog/ac:154740
DiPrete, Thomas A.; Gelman, Andrew E.; McCormick, Tyler; Teitler, Julien O.; Zheng, Tian
10.7916/D8F198DH
Tue, 27 Jun 2017 15:38:27 +0000
Using 2006 General Social Survey data, the authors compare levels of segregation by race and along other dimensions of potential social cleavage in the contemporary United States. Americans are not as isolated as the most extreme recent estimates suggest. However, hopes that “bridging” social capital is more common in broader acquaintanceship networks than in core networks are not supported. Instead, the entire acquaintanceship network is perceived by Americans to be about as segregated as the much smaller network of close ties. People do not always know the religiosity, political ideology, family behaviors, or socioeconomic status of their acquaintances, but perceived social divisions on these dimensions are high, sometimes rivaling racial segregation in acquaintanceship networks. The major challenge to social integration today comes from the tendency of many Americans to isolate themselves from others who differ on race, political ideology, level of religiosity, and other salient aspects of social identity.
Statistics
tad61, ag389, thm2105, jot8, tz33
Statistics
Articles

Measuring Scholarly Impact: The Influence of 'Altmetrics'
https://academiccommons.columbia.edu/catalog/ac:165365
Priem, Jason; Holmes, Kristi; Trasande, Caitlin Aptowicz; Gelman, Andrew E.
10.7916/D8CR62WS
Mon, 19 Jun 2017 20:53:20 +0000
"Altmetrics" refers to methods of measuring scholarly impact using Web-based social media. Why does it matter? In many academic fields, attaining scholarly prestige means publishing research articles in important scholarly journals. However, many in the academic community consider a journal's prestige, which is determined by a metric calculated using the number of citations to the journal, to be a poor proxy for the quality of the individual author's work. At the same time, hiring and promotion committees are looking for ways to determine the impact of alternate formats now commonly used by researchers such as blogs, data sets, videos, and social media. The panelists all work with innovative new tools for assessing scholarly impact. They are: Jason Priem, Co-Founder, ImpactStory; Kristi Holmes, Bioinformaticist, Bernard Becker Medical Library, Washington University in St. Louis School of Medicine; and Caitlin Aptowicz Trasande, Head of Science Metrics, Digital Science.
Information science, Information technology
ag389
Center for Digital Research and Scholarship, Scholarly Communication Program, Libraries and Information Services
Interviews

An experimental study of storable votes
https://academiccommons.columbia.edu/catalog/ac:116172
Casella, Alessandra M.; Gelman, Andrew E.; Palfrey, Thomas R.
10.7916/D89P3CVM
Mon, 12 Jun 2017 20:45:34 +0000
The storable votes mechanism is a method of voting for committees that meet periodically to consider a series of binary decisions. Each member is allocated a fixed budget of votes to be cast as desired over the multiple decisions. Voters are induced to spend more votes on those decisions that matter to them most, shifting the ex ante probability of winning away from decisions they value less and towards decisions they value more, typically generating welfare gains over standard majority voting with non-storable votes. The equilibrium strategies have a very intuitive feature (the number of votes cast must be monotonic in the voter's intensity of preferences) but are otherwise difficult to calculate, raising questions of practical implementation. In our experiments, realized efficiency levels were remarkably close to theoretical equilibrium predictions, while subjects adopted monotonic but off-equilibrium strategies. We are led to conclude that concerns about the complexity of the game may have limited practical relevance.
Political science
ac186, ag389
Economics
Reports

A simple scheme to improve the efficiency of referenda
https://academiccommons.columbia.edu/catalog/ac:115266
Casella, Alessandra M.; Gelman, Andrew E.
10.7916/D8611BHG
Mon, 12 Jun 2017 20:45:34 +0000
This paper proposes a simple scheme designed to elicit and reward intensity of preferences in referenda: voters faced with a number of binary proposals are given one regular vote for each proposal plus an additional number of bonus votes to cast as desired. Decisions are taken according to the majority of votes cast. In our base case, where there is no systematic difference between proposals' supporters and opponents, there is always a positive number of bonus votes such that ex ante utility is increased by the scheme, relative to simple majority voting. When the distributions of valuations of supporters and opponents differ, the improvement in efficiency is guaranteed if the distributions can be ranked according to first order stochastic dominance. If they are, however, the existence of welfare gains is independent of the exact number of bonus votes.
Political science
ac186, ag389
Economics
Reports

Protecting minorities in binary elections: A test of storable votes using field data
https://academiccommons.columbia.edu/catalog/ac:125276
Casella, Alessandra M.; Ehrenberg, Shuky; Gelman, Andrew E.; Shen, Jie
10.7916/D8BR9021
Fri, 02 Jun 2017 13:07:29 +0000
Democratic systems are built, with good reason, on majoritarian principles, but their legitimacy requires the protection of strongly held minority preferences. The challenge is to do so while treating every voter equally and preserving aggregate welfare. One possible solution is storable votes: granting each voter a budget of votes to cast as desired over multiple decisions. During the 2006 student elections at Columbia University, we tested a simple version of this idea: voters were asked to rank the importance of the different contests and to choose where to cast a single extra "bonus vote," had one been available. We used these responses to construct distributions of intensities and electoral outcomes, both without and with the bonus vote. Bootstrapping techniques provided estimates of the probable impact of the bonus vote. The bonus vote performs well: when minority preferences are particularly intense, the minority wins at least one of the contests with 15-30 percent probability; and, when the minority wins, aggregate welfare increases with 85-95 percent probability. When majority and minority preferences are equally intense, the effect of the bonus vote is smaller and more variable but on balance still positive.
Political science, Mathematical statistics, Statistics
ac186, ag389
Statistics
Reports

Why we (usually) don't have to worry about multiple comparison
https://academiccommons.columbia.edu/catalog/ac:129500
Gelman, Andrew E.; Hill, Jennifer; Yajima, Masanao
10.7916/D8RN3FPW
Fri, 02 Jun 2017 13:02:43 +0000
Applied researchers often find themselves making statistical inferences in settings that would seem to require multiple comparisons adjustments. We challenge the Type I error paradigm that underlies these corrections. Moreover we posit that the problem of multiple comparisons can disappear entirely when viewed from a hierarchical Bayesian perspective. We propose building multilevel models in the settings where multiple comparisons arise. Multilevel models perform partial pooling (shifting estimates toward each other), whereas classical procedures typically keep the centers of intervals stationary, adjusting for multiple comparisons by making the intervals wider (or, equivalently, adjusting the p-values corresponding to intervals of fixed width). Thus, multilevel models address the multiple comparisons problem and also yield more efficient estimates, especially in settings with low group-level variation, which is where multiple comparisons are a particular concern.
Statistics
ag389
Columbia Population Research Center
Reports

Segregation in social networks based on acquaintanceship and trust
https://academiccommons.columbia.edu/catalog/ac:129491
DiPrete, Thomas A.; McCormick, Tyler; Gelman, Andrew E.; Teitler, Julien O.; Zheng, Tian
10.7916/D88P66C3
Fri, 02 Jun 2017 13:02:21 +0000
Using recently collected data from the 2006 General Social Survey, we compare levels of segregation by race and along other dimensions of potential social cleavage in the contemporary United States. Americans are not as isolated as other recent evidence suggests. However, hopes that "bridging" social capital is more common in broader acquaintanceship networks than in core networks are not supported by the GSS data. Instead, the entire acquaintanceship network appears to be as segregated as the more restricted and much smaller network based on trust. Social divisions based on religiosity, political ideology, family behaviors and socioeconomic standing are high and in some cases rival racial segregation in their intensity. The major challenge to social integration today comes less from the risk of social isolation--complete isolation is rare--than from the tendency of many Americans to isolate themselves from others who differ on race, political ideology, level of religiosity, and other salient aspects of social identity.
Social structure, Sociology
tad61, thm2105, ag389, jot8, tz33
Columbia Population Research Center, Statistics, Social Work, Sociology, Political Science
Reports

Bayesian hierarchical classes analysis
https://academiccommons.columbia.edu/catalog/ac:125300
Leenen, Iwin; Mechelen, Iven van; Gelman, Andrew E.; Knop, Stijn de
10.7916/D82Z1C7C
Wed, 31 May 2017 19:34:28 +0000
Hierarchical classes models are models for N-way N-mode data that represent the association among the N modes and simultaneously yield, for each mode, a hierarchical classification of its elements. In this paper we present a stochastic extension of the hierarchical classes model for two-way two-mode binary data. In line with the original model, the new probabilistic extension still represents both the association among the two modes and the hierarchical classifications. A fully Bayesian method for fitting the new model is presented and evaluated in a simulation study. Furthermore, we propose tools for model selection and model checking based on Bayes factors and posterior predictive checks. We illustrate the advantages of the new approach with applications in the domain of the psychology of choice and psychiatric diagnosis.
Statistics
ag389
Statistics
Articles

Why we (usually) don't have to worry about multiple comparisons
https://academiccommons.columbia.edu/catalog/ac:125255
Gelman, Andrew E.; Hill, Jennifer; Yajima, Masanao
10.7916/D8QR53VG
Wed, 31 May 2017 19:34:21 +0000
Statistics
ag389
Statistics
Presentations (Communicative Events)

Sampling for Bayesian computation with large datasets
https://academiccommons.columbia.edu/catalog/ac:125252
Huang, Zaiying; Gelman, Andrew E.
10.7916/D8VH5VJC
Wed, 31 May 2017 19:34:18 +0000
Multilevel models are extremely useful in handling large hierarchical datasets. However, computation can be a challenge, both in storage and CPU time per iteration of Gibbs sampler or other Markov chain Monte Carlo algorithms. We propose a computational strategy based on sampling the data, computing separate posterior distributions based on each sample, and then combining these to get a consensus posterior inference. With hierarchical data structures, we perform cluster sampling into subsets with the same structures as the original data. This reduces the number of parameters as well as sample size for each separate model fit. We illustrate with examples from climate modeling and newspaper marketing.
Statistics, Bayesian statistical decision theory, Cluster analysis, Multilevel models (Statistics)
ag389
Statistics
Articles

Why we (usually) don't have to worry about multiple comparisons
https://academiccommons.columbia.edu/catalog/ac:125225
Gelman, Andrew E.; Hill, Jennifer; Yajima, Masanao
10.7916/D84X5FHK
Wed, 31 May 2017 19:34:18 +0000
Applied researchers often find themselves making statistical inferences in settings that would seem to require multiple comparisons adjustments. We challenge the Type I error paradigm that underlies these corrections. Moreover we posit that the problem of multiple comparisons can disappear entirely when viewed from a hierarchical Bayesian perspective. We propose building multilevel models in the settings where multiple comparisons arise. Multilevel models perform partial pooling (shifting estimates toward each other), whereas classical procedures typically keep the centers of intervals stationary, adjusting for multiple comparisons by making the intervals wider (or, equivalently, adjusting the p-values corresponding to intervals of fixed width). Thus, multilevel models address the multiple comparisons problem and also yield more efficient estimates, especially in settings with low group-level variation, which is where multiple comparisons are a particular concern.
Statistics, Bayesian statistical decision theory, Multilevel models (Statistics), Statistical hypothesis testing
ag389
Statistics
Articles

Going beyond the book: Toward critical reading in statistics teaching
https://academiccommons.columbia.edu/catalog/ac:125240
Gelman, Andrew E.
10.7916/D8CF9WTV
Wed, 31 May 2017 19:34:18 +0000
We can improve our teaching of statistical examples from books by collecting further data, reading cited articles, and performing further data analysis. This should not come as a surprise, but what might be new is the realization of how close to the surface these research opportunities are: even influential and celebrated books can have examples where more can be learned with a small amount of additional effort. We discuss three examples that have arisen in our own teaching: an introductory textbook that motivated us to think more carefully about categorical and continuous variables; a book for the lay reader that misreported a study of menstruation and accidents; and a monograph on the foundations of probability that overinterpreted statistically insignificant fluctuations in sex ratios.
Political science, Statistics, Left- and right-handedness, Menstruation, Traffic accidents
ag389
Statistics
Reports

Why we (usually) don't have to worry about multiple comparisons
https://academiccommons.columbia.edu/catalog/ac:125258
Gelman, Andrew E.; Hill, Jennifer; Yajima, Masanao
10.7916/D8G73MF6
Wed, 31 May 2017 19:34:17 +0000
Statistics
ag389
Statistics
Presentations (Communicative Events)

Fully Bayesian computing
https://academiccommons.columbia.edu/catalog/ac:125246
Kerman, Jouni; Gelman, Andrew E.
10.7916/D83X8DBR
Wed, 31 May 2017 19:34:17 +0000
A fully Bayesian computing environment calls for the possibility of defining vector and array objects that may contain both random and deterministic quantities, and syntax rules that allow treating these objects much like any variables or numeric arrays. Working within the statistical package R, we introduce a new object-oriented framework based on a new random variable data type that is implicitly represented by simulations. We seek to be able to manipulate random variables and posterior simulation objects conveniently and transparently and provide a basis for further development of methods and functions that can access these objects directly. We illustrate the use of this new programming environment with several examples of Bayesian computing, including posterior predictive checking and the manipulation of posterior simulations. This new environment is fully Bayesian in that the posterior simulations can be handled directly as random variables.
Computer science, Statistics, Bayesian statistical decision theory, Object-oriented programming (Computer science)
ag389
Statistics
Articles

Bayesian Combination of State Polls and Election Forecasts
https://academiccommons.columbia.edu/catalog/ac:125228
Lock, Kari; Gelman, Andrew E.
10.7916/D8WD4698
Wed, 31 May 2017 19:34:16 +0000
A wide range of potentially useful data are available for election forecasting: the results of previous elections, a multitude of pre-election polls, and predictors such as measures of national and statewide economic performance. How accurate are different forecasts? We estimate predictive uncertainty via analysis of data collected from past elections (actual outcomes, pre-election polls, and model estimates). With these estimated uncertainties, we use Bayesian inference to integrate the various sources of data to form posterior distributions for the state and national two-party Democratic vote shares for the 2008 election. Our key idea is to separately forecast the national popular vote shares and the relative positions of the states. More generally, such an approach could be applied to study changes in public opinion and other phenomena with wide national swings and fairly stable spatial distributions relative to the national average.
Political science, Statistics, Bayesian statistical decision theory
ag389
Statistics
Articles

What will we know on Tuesday at 7pm?
https://academiccommons.columbia.edu/catalog/ac:125231
Gelman, Andrew E.; Silver, Nate
10.7916/D8RR24XZ
Wed, 31 May 2017 19:34:16 +0000
Political science, Statistics
ag389
Statistics
Articles

What does "Do campaigns matter?" mean?
https://academiccommons.columbia.edu/catalog/ac:125249
Bafumi, Joseph; Gelman, Andrew E.; Park, David K.
10.7916/D8057NNS
Wed, 31 May 2017 19:34:16 +0000
Scholars disagree over the extent to which presidential campaigns activate predispositions in voters or create vote preferences that could not be predicted. When campaign-related information flows activate predispositions, election results are largely predetermined given balanced resources. They can be accurately forecast well before a campaign has run its course. Alternatively, campaigns may change vote outcomes beyond forcing predispositions to some equilibrium level. We find most evidence for the former: opinion poll data are consistent with presidential campaigns activating predispositions, with fundamental variables increasing in importance as a presidential election draws near.
Political science, Statistics, Presidents--Election
jb878, ag389
Statistics
Articles

One vote, many Mexicos: Income and vote choice in the 1994, 2000, and 2006 presidential elections
https://academiccommons.columbia.edu/catalog/ac:125237
Cortina, Jeronimo; Gelman, Andrew E.
10.7916/D8H70NJ4
Wed, 31 May 2017 19:34:16 +0000
Using multilevel modeling of state-level economic data and individual-level exit poll data from the 1994, 2000 and 2006 Mexican presidential elections, we find that income has a stronger effect in predicting the vote for the conservative party in poorer states than in richer states, a pattern that has also been found in recent U.S. elections. In addition (and unlike in the U.S.), richer states on average tend to support the conservative party at higher rates than poorer states. Our findings raise questions regarding the role that income polarization and region play in vote choice. The electoral results since 1994 reveal that collapsing multiple states into large regions entails significant loss of information that otherwise may uncover sharper and quite revealing differences in voting patterns between rich and poor states as well as rich and poor individuals within states.
Political science, Statistics
ag389
Statistics
Articles

Fitting Multilevel Models When Predictors and Group Effects Correlate
https://academiccommons.columbia.edu/catalog/ac:125243
Bafumi, Joseph; Gelman, Andrew E.
10.7916/D87P953X
Wed, 31 May 2017 19:34:15 +0000
Random effects models (that is, regressions with varying intercepts that are modeled with error) are avoided by some social scientists because of potential issues with bias and uncertainty estimates. Particularly, when one or more predictors correlate with the group or unit effects, a key Gauss-Markov assumption is violated and estimates are compromised. However, this problem can easily be solved by including the average of each individual-level predictor in the group-level regression. We explain the solution, demonstrate its effectiveness using simulations, show how it can be applied in some commonly-used statistical software, and discuss its potential for substantive modeling.
Statistics
jb878, ag389
Statistics
Articles

Thoughts on new statistical procedures for age-period-cohort analyses
https://academiccommons.columbia.edu/catalog/ac:125234
Gelman, Andrew E.
10.7916/D8N01D7D
Wed, 31 May 2017 19:34:15 +0000
Statistics
ag389
Statistics
Articles

Improving the Presentation of Quantitative Results in Political Science
https://academiccommons.columbia.edu/catalog/ac:125095
Kastellec, John; Gelman, Andrew E.
10.7916/D8NZ8FBG
Wed, 31 May 2017 19:34:10 +0000
Political science, Statistics
ag389
Statistics
Presentations (Communicative Events)

La philosophie et l'expérience de la statistique bayésienne
https://academiccommons.columbia.edu/catalog/ac:125180
Gelman, Andrew E.; Shalizi, Cosma
10.7916/D89P37CP
Wed, 31 May 2017 19:34:10 +0000
Statistics
ag389
Statistics
Presentations (Communicative Events)

Social and political polarization, and some other topics in network analysis
https://academiccommons.columbia.edu/catalog/ac:125159
Gelman, Andrew E.
10.7916/D8J67PNH
Wed, 31 May 2017 19:34:09 +0000
Statistics
ag389
Statistics
Presentations (Communicative Events)

Posterior predictive checking and generalized graphical models
https://academiccommons.columbia.edu/catalog/ac:125156
Gelman, Andrew E.
10.7916/D8251QW3
Wed, 31 May 2017 19:34:09 +0000
Statistics
ag389
Statistics
Presentations (Communicative Events)

La polarisation politique et comment étudier ça avec la statistique
https://academiccommons.columbia.edu/catalog/ac:125086
Gelman, Andrew E.
10.7916/D85X2GNZ
Wed, 31 May 2017 19:34:09 +0000
Statistics
ag389
Statistics
Presentations (Communicative Events)

Some computational and modeling issues for hierarchical models
https://academiccommons.columbia.edu/catalog/ac:125092
Gelman, Andrew E.
10.7916/D8XD17CR
Wed, 31 May 2017 19:34:09 +0000
Statistics
ag389
Statistics
Presentations (Communicative Events)

Culture wars, voting, and polarization: divisions and unities in modern American politics
https://academiccommons.columbia.edu/catalog/ac:125089
Gelman, Andrew E.
10.7916/D8SN0GPG
Wed, 31 May 2017 19:34:09 +0000
Political science, Statistics
ag389
Statistics
Presentations (Communicative Events)

Rich state, poor state, red state, blue state: What's the matter with Connecticut?
https://academiccommons.columbia.edu/catalog/ac:125297
Gelman, Andrew E.; Shor, Boris; Bafumi, Joseph; Park, David K.
10.7916/D8WD45S4
Thu, 13 Apr 2017 15:46:17 +0000
For decades, the Democrats have been viewed as the party of the poor, with the Republicans representing the rich. Recent presidential elections, however, have shown a reverse pattern, with Democrats performing well in the richer blue states in the northeast and coasts, and Republicans dominating in the red states in the middle of the country and the south. Through multilevel modeling of individual-level survey data and county- and state-level demographic and electoral data, we reconcile these patterns. Furthermore, we find that income matters more in red America than in blue America. In poor states, rich people are much more likely than poor people to vote for the Republican presidential candidate, but in rich states (such as Connecticut), income has a very low correlation with vote preference.
Mathematical statistics
ag389
Statistics
Articles

Partisans without constraint: Political polarization and trends in American public opinion
https://academiccommons.columbia.edu/catalog/ac:125291
Baldassarri, Delia; Gelman, Andrew E.
10.7916/D84T6QK4
Thu, 13 Apr 2017 15:46:16 +0000
Public opinion polarization is here conceived as a process of alignment along multiple lines of potential disagreement and measured as growing constraint in individuals' preferences. Using NES data from 1972 to 2004, the authors model trends in issue partisanship--the correlation of issue attitudes with party identification--and issue alignment--the correlation between pairs of issues--and find a substantive increase in issue partisanship, but little evidence of issue alignment. The findings suggest that opinion changes correspond more to a resorting of party labels among voters than to greater constraint on issue attitudes: since parties are more polarized, they are now better at sorting individuals along ideological lines. Levels of constraint vary across population subgroups: strong partisans and wealthier and politically sophisticated voters have grown more coherent in their beliefs. The authors discuss the consequences of partisan realignment and group sorting on the political process and potential deviations from the classic pluralistic account of American politics.
Mathematical statistics
ag389
Statistics
Articles

Struggles with survey weighting and regression modeling
https://academiccommons.columbia.edu/catalog/ac:125309
Gelman, Andrew E.
10.7916/D8H41XN4
Thu, 13 Apr 2017 15:46:16 +0000
The general principles of Bayesian data analysis imply that models for survey responses should be constructed conditional on all variables that affect the probability of inclusion and nonresponse, which are also the variables used in survey weighting and clustering. However, such models can quickly become very complicated, with potentially thousands of poststratification cells. It is then a challenge to develop general families of multilevel probability models that yield reasonable Bayesian inferences. We discuss these issues in the context of several ongoing public health and social surveys. This work is currently open-ended, and we conclude with thoughts on how research could proceed to solve these problems.
Mathematical statistics
ag389
Statistics
Articles

Rejoinder: Struggles with survey weighting and regression modeling
https://academiccommons.columbia.edu/catalog/ac:125312
Gelman, Andrew E.10.7916/D8CC15WBThu, 13 Apr 2017 15:46:16 +0000I was motivated to write this paper, with its controversial opening line, "Survey weighting is a mess," by various experiences as an applied statistician.Mathematical statisticsag389StatisticsArticlesBayes, Jeffreys, Prior Distributions and the Philosophy of Statistics
https://academiccommons.columbia.edu/catalog/ac:125279
Gelman, Andrew E.10.7916/D8J38ZTDThu, 13 Apr 2017 15:46:16 +0000I actually own a copy of Harold Jeffreys's Theory of Probability but have only read small bits of it, most recently over a decade ago to confirm that, indeed, Jeffreys was not too proud to use a classical chi-squared p-value when he wanted to check the misfit of a model to data (Gelman, Meng and Stern, 2006). I do, however, feel that it is important to understand where our probability models come from, and I welcome the opportunity to use the present article by Robert, Chopin and Rousseau as a platform for further discussion of foundational issues. In this brief discussion I will argue the following: (1) in thinking about prior distributions, we should go beyond Jeffreys's principles and move toward weakly informative priors; (2) it is natural for those of us who work in social and computational sciences to favor complex models, contra Jeffreys's preference for simplicity; and (3) a key generalization of Jeffreys's ideas is to explicitly include model checking in the process of data analysis.Mathematical statisticsag389StatisticsArticlesThe playing field shifts: Predicting the seats-votes curve in the 2008 U.S. House election
https://academiccommons.columbia.edu/catalog/ac:125285
Kastellec, Jonathan P.; Gelman, Andrew E.; Chandler, Jamie P.10.7916/D8DB873GThu, 13 Apr 2017 15:46:16 +0000The 2008 U.S. House elections mark the first time since 1994 that the Democrats will seek to retain a majority. With the political climate favoring Democrats this year, it seems almost certain that the party will retain control, and will likely increase its share of seats. In five national polls taken in June of this year, Democrats enjoyed on average a 13-point advantage in the generic congressional ballot; as Bafumi, Erikson, and Wlezien (2007) point out, these early polls, suitably adjusted, are good predictors of the November vote. As of late July, bettors at intrade.com put the probability of the Democrats retaining a majority at about 95% (Intrade.com 2008). Elsewhere in this symposium, Klarner (2008) predicts an 11-seat gain for the Democrats, while Lockerbie (2008) forecasts a 25-seat pickup. In this paper we document how the electoral playing field has shifted from a Republican advantage between 1996 and 2004 to a Democratic tilt today. In an earlier article (Kastellec, Gelman, and Chandler 2008), we predicted the seats-votes curve in the 2006 election, showing how the Democrats faced an uphill battle in their effort to take control of the House and, their victory notwithstanding, ended up winning a lower percentage of seats than their average district vote nationwide. We follow up on this analysis by using the same method to predict the seats-votes curve in 2008. Due to the shift in incumbency advantage from the Republicans to the Democrats, compounded by a greater number of retirements among Republican members, we show that the Democrats now enjoy a partisan bias, and can expect to win more seats than votes for the first time since 1992. 
While this bias is not as large as the advantage the Republicans held in 2006, it will likely help the Democrats increase their share of seats.Mathematical statisticsjpk2004, ag389StatisticsArticlesDiscussion of the Article "Website Morphing"
https://academiccommons.columbia.edu/catalog/ac:125288
Gelman, Andrew E.10.7916/D88K7G9VThu, 13 Apr 2017 15:46:16 +0000The article under discussion illustrates the trade-off between optimization and exploration that is fundamental to statistical experimental design. I suggest that the research could be made even more effective by checking the fit of the model, comparing observed data to replicated data sets simulated from the fitted model.Mathematical statisticsag389StatisticsArticlesPredicting and dissecting the seats-votes curve in the 2006 U.S. House election
https://academiccommons.columbia.edu/catalog/ac:125294
Kastellec, Jonathan P.; Gelman, Andrew E.; Chandler, Jamie P.10.7916/D8125ZW5Thu, 13 Apr 2017 15:46:14 +0000The 2008 U.S. House elections mark the first time since 1994 that the Democrats will seek to retain a majority. With the political climate favoring Democrats this year, it seems almost certain that the party will retain control, and will likely increase its share of seats. In five national polls taken in June of this year, Democrats enjoyed on average a 13-point advantage in the generic congressional ballot; as Bafumi, Erikson, and Wlezien (2007) point out, these early polls, suitably adjusted, are good predictors of the November vote. As of late July, bettors at intrade.com put the probability of the Democrats retaining a majority at about 95% (Intrade.com 2008). Elsewhere in this symposium, Klarner (2008) predicts an 11-seat gain for the Democrats, while Lockerbie (2008) forecasts a 25-seat pickup. In this paper we document how the electoral playing field has shifted from a Republican advantage between 1996 and 2004 to a Democratic tilt today. In an earlier article (Kastellec, Gelman, and Chandler 2008), we predicted the seats-votes curve in the 2006 election, showing how the Democrats faced an uphill battle in their effort to take control of the House and, their victory notwithstanding, ended up winning a lower percentage of seats than their average district vote nationwide. We follow up on this analysis by using the same method to predict the seats-votes curve in 2008. Due to the shift in incumbency advantage from the Republicans to the Democrats, compounded by a greater number of retirements among Republican members, we show that the Democrats now enjoy a partisan bias, and can expect to win more seats than votes for the first time since 1992. 
While this bias is not as large as the advantage the Republicans held in 2006, it will likely help the Democrats increase their share of seats.Mathematical statisticsjpk2004, ag389StatisticsArticlesComment: Bayesian Checking of the Second Levels of Hierarchical Models
https://academiccommons.columbia.edu/catalog/ac:125303
Gelman, Andrew E.10.7916/D8RN3F38Thu, 13 Apr 2017 15:46:13 +0000Bayarri and Castellanos (BC) have written an interesting paper discussing two forms of posterior model check, one based on cross-validation and one based on replication of new groups in a hierarchical model. We think both these checks are good ideas and can become even more effective when understood in the context of posterior predictive checking. For the purpose of discussion, however, it is most interesting to focus on the areas where we disagree with BC.Mathematical statisticsag389StatisticsArticlesBayes: Radical, liberal, or conservative?
https://academiccommons.columbia.edu/catalog/ac:125306
Gelman, Andrew E.10.7916/D8MW2PCJThu, 13 Apr 2017 15:46:12 +0000Mathematical statisticsag389StatisticsArticles