03_descriptive_statistics

Figure 1: Dotplot displaying frequency of tables and graphs, by type of data summary

Figure 2: Mosaic plot for displaying contingency data

Figure 3: Dotplot displaying summary of means and standard deviations

Figure 4: Combined Dotplot/Violin plots for displaying distributions of variables

Figure 5: Using an Advanced Dotplot to Present Proportions

Note: This figure is based on a dataset we created by coding each graph and table that appeared in five issues of leading political science journals. That dataset is available here. A codebook is available here.

**Using R**: Download the R code for this graph.

**Table 1 from Iversen and Soskice (2006)**

**Our Graph**

**Using R**: Download the R code for this graph.

**Using Stata**: Stata does not seem to produce mosaic plots yet.

**Table 1 (Panel A) from McClurg (2006): The Political Character of Social Networks. This table provides descriptive statistics for the political character of the social networks as perceived by respondents.**

Panel A: Descriptive Statistics

Variable | Mean | Standard Deviation | Min | Max | N
---|---|---|---|---|---
Size^{a} | 3.13 | 1.49 | 1 | 5 | 1260
Political Talk | 1.82 | 0.61 | 0 | 3 | 1253
Political Agreement | 0.43 | 0.41 | 0 | 1 | 1154
Political Knowledge | 1.22 | 0.42 | 0 | 2 | 1220

^{a}When respondents who report having no network are included the mean of this variable drops to 2.57 with a standard deviation 1.81 (n = 1537).

**Our Graph**

**Using R**: Download the R code for this graph.

**Using Stata**: Download a Stata dataset and the Stata code for this graph.
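
Footnote a of the McClurg table can be checked from the reported summary statistics alone. The sketch below is in Python rather than the R or Stata used for the graphs. It assumes that respondents with no network are coded as size 0 and that the reported standard deviations are sample standard deviations; neither assumption is confirmed by the source, and the function name `pool_with_zeros` is hypothetical.

```python
import math

def pool_with_zeros(n, mean, sd, n_zeros):
    """Mean and sample SD after adding n_zeros observations equal to 0.

    Assumes `sd` is a sample standard deviation (divisor n - 1).
    """
    total_n = n + n_zeros
    pooled_mean = n * mean / total_n
    # Recover the sum of squared values of the original observations.
    sum_sq = (n - 1) * sd ** 2 + n * mean ** 2
    # The added zeros contribute nothing to the sum of squares.
    pooled_var = (sum_sq - total_n * pooled_mean ** 2) / (total_n - 1)
    return pooled_mean, math.sqrt(pooled_var)

# From Panel A: Size has N = 1260, mean 3.13, SD 1.49; footnote a reports n = 1537.
mean_all, sd_all = pool_with_zeros(1260, 3.13, 1.49, 1537 - 1260)
print(round(mean_all, 2), round(sd_all, 2))
```

Under these assumptions the result rounds to the 2.57 and 1.81 given in the footnote.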

**Note** The published violin plots for “Percent Negative Ads”, “Issue Convergence” and “Issue Salience” mistakenly show oversmoothing. Scroll down to see an updated version of the plot that corrects for this.

**Table 2 from Kaplan et al. (2006): Descriptive Statistics of Campaign and Issue-Level Variables**

Variable | N | Mean | SD | Min | Max
---|---|---|---|---|---
Issue Convergence | 982 | 24.85 | 34.73 | 0.00 | 99.98
Competitiveness (CQ Ranking) | 65 | 1.54 | 1.20 | 0.00 | 3.00
Total Spending/Capita (millions) | 65 | 3.47 | 2.71 | 0.28 | 13.39
Difference Spending/Capita (millions) | 65 | 1.12 | 1.32 | 0.03 | 9.26
State Voting Age Pop. (millions-ln) | 65 | 1.20 | 0.85 | -0.65 | 3.13
Percent Negative Ads | 65 | 21.38 | 16.84 | 0.00 | 54.96
2000 Year (binary) | 65 | 0.38 | 0.49 | 0.00 | 1.00
2002 Year (binary) | 65 | 0.32 | 0.47 | 0.00 | 1.00
Consensual Issue (binary) | 43 | 0.28 | 0.45 | 0.00 | 1.00
Issue Owned (binary) | 43 | 0.49 | 0.51 | 0.00 | 1.00
Issue Salience | 43 | 2.86 | 6.38 | 0.00 | 35.63

**Our Graph**

Note: The bottom two panels in this figure are based on three datasets used in Kaplan et al. (2006), who have made their data available here. We have created separate files, available here, here and here, that will allow you to easily import each into *R* using our code.

**Using R**: Download the R code for this graph.

**Using Stata**: Download two Stata datasets here and here (note that these are different from the datasets listed above), and Stata code here. Many thanks to Felipe Botero for supplying this code.

**Smoothing correction**

The published violin plots for “Percent Negative Ads”, “Issue Convergence” and “Issue Salience” mistakenly show oversmoothing. We can alleviate the problem by choosing the degree of smoothing manually. Many thanks to Håvard Strand for pointing out the problem to us. R code for the updated graph is here.
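
The outline of a violin plot is a kernel density estimate, and the degree of smoothing is its bandwidth. The sketch below illustrates the effect of choosing the bandwidth by hand. It is written in Python, not the R used for the graph; the function `gaussian_kde` and the data are hypothetical and are not the Kaplan et al. values. In R, the analogous control is the `bw` (or `adjust`) argument of `density()`.

```python
import math

def gaussian_kde(data, bandwidth):
    """Return a function that evaluates a Gaussian kernel density estimate at x."""
    n = len(data)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in data
        )

    return density

# Hypothetical values (NOT the published data): one cluster near 0, one near 50.
sample = [0, 0, 1, 2, 3, 48, 50, 51, 52, 55]

narrow = gaussian_kde(sample, bandwidth=3.0)   # manually chosen, modest smoothing
wide = gaussian_kde(sample, bandwidth=30.0)    # heavy smoothing

# Compare the two estimates at each cluster and at the empty middle.
for x in (0, 25, 50):
    print(x, round(narrow(x), 4), round(wide(x), 4))
```

With the narrow bandwidth the estimate is close to zero at 25, where there are no observations. With the wide bandwidth the estimate at 25 is higher than at 0, so the two clusters blur into a single hump. This is one way an overly large bandwidth can hide the shape of the data.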

**Table 2 from Schwindt-Bayer (2006): Number of Bills Sponsored in Each Thematic Area**

 | Argentina 1995 | Argentina 1999 | Colombia-Chamber 1994-1998 | Colombia-Chamber 1998-2002 | Colombia-Senate 1994-1998 | Colombia-Senate 1998-2002 | Costa Rica 1994-1998 | Costa Rica 1998-2002 | Total
---|---|---|---|---|---|---|---|---|---
Number of legislators who sponsored at least one bill | 246 | 257 | 139 | 165 | 87 | 94 | 57 | 57 | 1102
Women's Issues | 33 | 30 | 23 | 9 | 16 | 18 | 23 | 35 | 187
Children/Family | 28 | 40 | 13 | 7 | 22 | 9 | 25 | 27 | 171
Education | 44 | 66 | 67 | 72 | 42 | 29 | 56 | 75 | 451
Health | 27 | 51 | 13 | 14 | 13 | 5 | 16 | 33 | 172
Economics | 208 | 305 | 74 | 80 | 113 | 65 | 120 | 160 | 1125
Agriculture | 28 | 49 | 23 | 18 | 22 | 19 | 34 | 38 | 231
Fiscal Affairs | 45 | 61 | 11 | 17 | 21 | 13 | 27 | 51 | 246
Other Bills^{*} | 567 | 901 | 405 | 406 | 371 | 356 | 628 | 764 | 4398
Total number of bills | 980 | 1503 | 629 | 623 | 620 | 514 | 929 | 1183 | 6981

^{*}“Other Bills” include all bills that do not fall into the seven thematic areas. This would include bills related to public administration, the environment, foreign affairs, culture, and public welfare, among others.

**Our Graph**

Note: For this graph, we first created a Stata dataset, which is available here.

**Using R**: Download the R code for this graph.
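
Presenting these counts as proportions requires a denominator. The sketch below, in Python rather than the R used for the graph, takes the counts from the Schwindt-Bayer table and divides each thematic area by the total number of bills in the same legislature and period. That choice of denominator is an assumption for illustration; the published graph may define the proportions differently.

```python
# Counts copied from Table 2 of Schwindt-Bayer (2006) as reproduced on this page.
columns = [
    "Argentina 1995", "Argentina 1999",
    "Colombia-Chamber 1994-1998", "Colombia-Chamber 1998-2002",
    "Colombia-Senate 1994-1998", "Colombia-Senate 1998-2002",
    "Costa Rica 1994-1998", "Costa Rica 1998-2002",
]
areas = {
    "Women's Issues": [33, 30, 23, 9, 16, 18, 23, 35],
    "Children/Family": [28, 40, 13, 7, 22, 9, 25, 27],
    "Education": [44, 66, 67, 72, 42, 29, 56, 75],
    "Health": [27, 51, 13, 14, 13, 5, 16, 33],
    "Economics": [208, 305, 74, 80, 113, 65, 120, 160],
    "Agriculture": [28, 49, 23, 18, 22, 19, 34, 38],
    "Fiscal Affairs": [45, 61, 11, 17, 21, 13, 27, 51],
}
total_bills = [980, 1503, 629, 623, 620, 514, 929, 1183]

# Assumed denominator: all bills introduced in that legislature and period.
shares = {
    area: [count / total for count, total in zip(counts, total_bills)]
    for area, counts in areas.items()
}

# Print the shares for the first column as percentages.
for area, values in shares.items():
    print(f"{columns[0]:<16} {area:<16} {100 * values[0]:5.1f}%")
```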

**Table A2 from Iversen and Soskice (2006)**

 | (1) | (2) | (3) | (4) | (5) | (6) | (7) | (8) | (9) | (10) | (11) | (12) |
---|---|---|---|---|---|---|---|---|---|---|---|---|
(1) Redistribution | 1 | | | | | | | | | | | |
(2) Inequality | -0.38 | 1 | | | | | | | | | | |
(3) Partisanship | -0.5 | 0.37 | 1 | | | | | | | | | |
(4) Turnout | 0.11 | -0.38 | -0.24 | 1 | | | | | | | | |
(5) Unionization | 0.75 | -0.22 | -0.49 | 0.51 | 1 | | | | | | | |
(6) Veto points | -0.44 | -0.01 | 0.33 | -0.43 | -0.56 | 1 | | | | | | |
(7) Electoral system | 0.34 | -0.54 | -0.66 | 0.71 | 0.49 | -0.27 | 1 | | | | | |
(8) Left fragmentation | -0.57 | -0.09 | 0.14 | -0.27 | -0.76 | 0.14 | -0.18 | 1 | | | | |
(9) Right overrepresentation | -0.13 | 0.66 | 0.46 | 0.1 | 0.14 | -0.16 | -0.24 | -0.48 | 1 | | | |
(10) Per capita income | 0.12 | -0.42 | -0.08 | -0.51 | -0.18 | 0.61 | -0.22 | 0.08 | -0.64 | 1 | | |
(11) Female LF participation | 0.8 | -0.45 | -0.28 | -0.19 | 0.48 | -0.06 | 0.17 | -0.37 | -0.168 | 0.38 | 1 | |
(12) Unemployment | -0.49 | 0.55 | 0.52 | 0.06 | -0.2 | 0.01 | -0.2 | 0.02 | 0.63 | -0.41 | -0.51 | 1 |

**Our Graph**

Download the R code for this graph.

To create this graph, you will also need a function that we wrote called plot.corr, available here.

**Notes**

The idea for this plot is taken from Figure 8 of Zheng et al. (2006). It uses a seriation algorithm in order to put similar variables (in terms of their correlation with all other variables) together. There is an implementation of this graph in the arm package. (See ?corrplot)
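
To make the idea of seriation concrete, the sketch below orders variables so that those with similar correlation profiles sit next to each other. It is a simple greedy nearest-neighbour ordering written in Python. It is a stand-in for illustration, not the algorithm of Zheng et al. (2006), the `plot.corr` function, or the `arm` implementation. It uses only the first five variables of Table A2, so the profiles are computed on that subset, and the function names are hypothetical.

```python
import math

names = ["Redistribution", "Inequality", "Partisanship", "Turnout", "Unionization"]

# Lower triangle of Table A2 (Iversen and Soskice 2006), first five variables only.
lower = [
    [1],
    [-0.38, 1],
    [-0.5, 0.37, 1],
    [0.11, -0.38, -0.24, 1],
    [0.75, -0.22, -0.49, 0.51, 1],
]
k = len(names)

# Expand the lower triangle into a full symmetric matrix.
corr = [[lower[max(i, j)][min(i, j)] for j in range(k)] for i in range(k)]

def profile_distance(i, j):
    """Euclidean distance between the correlation profiles (rows) of i and j."""
    return math.sqrt(sum((corr[i][m] - corr[j][m]) ** 2 for m in range(k)))

def greedy_order(start=0):
    """Chain the variables: always append the unplaced one closest to the last placed."""
    order = [start]
    remaining = set(range(k)) - {start}
    while remaining:
        last = order[-1]
        # Ties are broken by the lower index so the result is deterministic.
        nxt = min(remaining, key=lambda j: (profile_distance(last, j), j))
        order.append(nxt)
        remaining.remove(nxt)
    return order

order = greedy_order()
print([names[i] for i in order])
```

A greedy chain depends on the starting variable and is not guaranteed to find the best overall ordering, which is why dedicated seriation methods exist.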

**Table 4 from Iversen and Soskice (2006)**

**Our Graph**

Download the R code for this graph.

**Notes**

We turn each data column into a dotplot, creating a 2 x 2 “small multiples” plot. For both types of political system, we order the countries by their “Effective Number of Parties,” from low to high, which helps give a sense of the distribution across political systems. The vertical dotted line depicts the mean value within each plot, thereby allowing for comparison of each country's level to the average value across countries.
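
The two preparation steps described in these notes, ordering the rows by one variable and computing a within-panel mean for the reference line, can be sketched as follows. The example is in Python rather than the R used for the graph. The function `dotplot_rows`, the country labels and the values are hypothetical placeholders, not the Iversen and Soskice (2006) figures.

```python
from statistics import mean

def dotplot_rows(values, order_by):
    """Order units by `order_by` (low to high); return the rows and the mean of `values`."""
    ordered_units = sorted(values, key=lambda unit: order_by[unit])
    rows = [(unit, values[unit]) for unit in ordered_units]
    return rows, mean(values.values())

# Hypothetical placeholder data for one panel (NOT taken from Table 4).
effective_parties = {"Country A": 3.9, "Country B": 2.1, "Country C": 5.0}
panel_values = {"Country A": 0.30, "Country B": 0.10, "Country C": 0.50}

rows, reference_line = dotplot_rows(panel_values, order_by=effective_parties)

for unit, value in rows:
    print(f"{unit:<10} {value:.2f}")
print(f"mean (position of the dotted line): {reference_line:.2f}")
```

Because every panel is sorted with the same `order_by` variable, a given country appears in the same row position across the panels for its political system.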

The following graph comes from Kastellec et al. (forthcoming). To replicate the figure, you will need three Stata datasets: aggregate House results from 1946-2004; individual House results from 1946-2004; and individual House results from 2006.

Download the R code for this graph.

**Notes**

The graph compares the average district vote for Democrats in House elections versus the party's share of House seats, showing that after 1958 the latter exceeded the former until the Republicans took control of the House in 1994, when the pattern switched. Key features of the graph include: using shading to distinguish periods of Republican and Democratic control of the House; directly labeling each line in the graph rather than using a legend; and labeling the x-axis only at 10-year intervals to avoid clutter.

03_descriptive_statistics.txt · Last modified: 2013/12/15 15:20 (external edit)