====== Plotting Descriptive Statistics ======

===== On This Page =====

[[#Graphs that appear in paper]]

[[#Graphs that do not appear in paper]]

=====Graphs that appear in paper=====

[[#Figure 1]]: Dotplot displaying frequency of tables and graphs, by type of data summary

[[#Figure 2]]: Mosaic plot for displaying contingency data

[[#Figure 3]]: Dotplot displaying summary of means and standard deviations

[[#Figure 4]]: Combined Dotplot/Violin plots for displaying distributions of variables

[[#Figure 5]]: Using Advanced Dotplots to Present Proportions

====Figure 1====

Note: This figure is based on a dataset we created by coding each graph and table that appeared in five issues of leading political science journals. That dataset is available [[http://svn.tables2graphs.com/tables2graphs/tables2graphs.dta|here]]. A codebook is available [[http://svn.tables2graphs.com/tables2graphs/codebook.pdf|here]].

{{http://svn.tables2graphs.com/tables2graphs/table_graph_freq.png}}

**Using R**: Download the {{http://svn.tables2graphs.com/tables2graphs/Rcode/Final%20Code%20for%20Website/figure_1_table_coding_freq_graphs.R|R code}} for this graph.

====Figure 2====

**Table 1 from Iversen and Soskice (2006)**

{{http://svn.tables2graphs.com/tables2graphs/tables/Tables%20for%20Website/iverson_table1.gif}}

**Our Graph**

{{http://svn.tables2graphs.com/tables2graphs/iversen_fig1.png}}

**Using R**: Download the {{http://svn.tables2graphs.com/tables2graphs/Rcode/Final%20Code%20for%20Website/figure_2_iversen_table_1.R|R code}} for this graph.

**Using Stata**: Stata does not seem to produce mosaic plots yet.

==== Figure 3 ====

**Table 1 (Panel A) from McClurg (2006): The Political Character of Social Networks. This table provides descriptive statistics for the political character of the social networks as perceived by respondents.**

^ ^ Mean ^ Standard Deviation ^ Min ^ Max ^ N ^
|Panel A: Descriptive Statistics ||||||
|Size<sup>a</sup> | 3.13 | 1.49 | 1 | 5 | 1260 ||
|Political Talk | 1.82 | 0.61 | 0 | 3 | 1253 ||
|Political Agreement | 0.43 | 0.41 | 0 | 1 | 1154 ||
|Political Knowledge | 1.22 | 0.42 | 0 | 2 | 1220 ||
|<sup>a</sup>When respondents who report having //no network// are included the mean of this variable drops to 2.57 with a standard deviation of 1.81 (//n// = 1537). ||||||

**Our Graph**

{{http://svn.tables2graphs.com/tables2graphs/mcclurg_fig.png}}

**Using R**: Download the {{http://svn.tables2graphs.com/tables2graphs/Rcode/Final%20Code%20for%20Website/figure_3_mcclurg_table_1.R|R code}} for this graph.

**Using Stata**: Download a {{http://svn.tables2graphs.com/tables2graphs/stata/mcclurg.dta|Stata dataset}} and the {{http://svn.tables2graphs.com/tables2graphs/Rcode/Final%20Code%20for%20Website/stata_figure_3_mcclurg_table_1.do|Stata code}} for this graph.

==== Figure 4 ====

**Note** The published violin plots for "Percent Negative Ads", "Issue Convergence" and "Issue Salience" mistakenly show oversmoothing. Scroll down to see an updated version of the plot that corrects for this.

**Table 2 from Kaplan et al. (2006): Descriptive Statistics of Campaign and Issue-Level Variables**

^Variable ^ N ^ Mean ^ SD ^ Min ^ Max ^
|Issue Convergence | 982 | 24.85 | 34.73 | 0.00 | 99.98 ||
|Competitiveness (CQ Ranking) | 65 | 1.54 | 1.20 | 0.00 | 3.00 ||
|Total Spending/Capita (millions) | 65 | 3.47 | 2.71 | 0.28 | 13.39 ||
|Difference Spending/Capita (millions) | 65 | 1.12 | 1.32 | 0.03 | 9.26 ||
|State Voting Age Pop. (millions-ln) | 65 | 1.20 | 0.85 | -0.65 | 3.13 ||
|Percent Negative Ads | 65 | 21.38 | 16.84 | 0.00 | 54.96 ||
|2000 Year (binary) | 65 | 0.38 | 0.49 | 0.00 | 1.00 ||
|2002 Year (binary) | 65 | 0.32 | 0.47 | 0.00 | 1.00 ||
|Consensual Issue (binary) | 43 | 0.28 | 0.45 | 0.00 | 1.00 ||
|Issue Owned (binary) | 43 | 0.49 | 0.51 | 0.00 | 1.00 ||
|Issue Salience | 43 | 2.86 | 6.38 | 0.00 | 35.63 ||

**Our Graph**

Note: The bottom two panels in this figure are based on three datasets used in Kaplan et al. (2006), who have made their data available [[http://home.gwu.edu/~dkp/tpm.htm|here]]. We have created separate files available [[http://svn.tables2graphs.com/tables2graphs/kaplan_campaign.dta|here]], [[http://svn.tables2graphs.com/tables2graphs/kaplan_campaign_issue.dta|here]] and [[http://svn.tables2graphs.com/tables2graphs/kaplan_issue.dta|here]] that will allow you to easily import each into //R// using our code.

{{http://svn.tables2graphs.com/tables2graphs/kaplan_fig.png}}

**Using R**: Download the {{http://svn.tables2graphs.com/tables2graphs/Rcode/Final%20Code%20for%20Website/figure_4_kaplan_table_2.R|R code}} for this graph.

**Using Stata**: Download two Stata datasets [[http://svn.tables2graphs.com/tables2graphs/Rcode/Final%20Code%20for%20Website/kaplan_binary.dta|here]] and [[http://svn.tables2graphs.com/tables2graphs/Rcode/Final%20Code%20for%20Website/kaplan_violin.dta|here]] (note these are different from the datasets listed above), and Stata code [[http://svn.tables2graphs.com/tables2graphs/Rcode/Final%20Code%20for%20Website/stata_figure_4_kaplan_table_2.do|here]]. Many thanks to Felipe Botero for supplying this code.

**Smoothing correction** The published violin plots for "Percent Negative Ads", "Issue Convergence" and "Issue Salience" mistakenly show oversmoothing. We can alleviate the problem by choosing the degree of smoothing manually. Many thanks to Håvard Strand for pointing out the problem to us.
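
To illustrate what choosing the degree of smoothing manually means, here is a minimal sketch in Python (not the R code linked on this page). It assumes a Gaussian kernel density estimate with a hand-picked bandwidth. The sample values and the helper names ''kde_density'' and ''mass_below_zero'' are made up for illustration and are not taken from the Kaplan et al. data or from our code. One symptom of too much smoothing for a variable bounded at zero is that a large share of the estimated density spills below zero.

```python
import math

def kde_density(data, bandwidth):
    """Return a Gaussian kernel density estimate of `data` as a function of x."""
    scale = 1.0 / (len(data) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return scale * sum(math.exp(-0.5 * ((x - d) / bandwidth) ** 2) for d in data)

    return density

def mass_below_zero(data, bandwidth):
    """Share of the Gaussian KDE's total mass that lies below zero (closed form)."""
    def normal_cdf(z):
        return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

    return sum(normal_cdf(-d / bandwidth) for d in data) / len(data)

# Hypothetical right-skewed sample bounded at zero (illustrative values only).
sample = [0.0, 0.0, 0.0, 0.5, 1.0, 1.5, 2.0, 4.0, 9.0, 35.0]

for h in (6.0, 0.5):  # a wide and a narrow bandwidth, both chosen by hand
    f = kde_density(sample, h)
    print(f"bandwidth={h}: density at 0 = {f(0.0):.3f}, "
          f"mass below 0 = {mass_below_zero(sample, h):.3f}")
```

With the wide bandwidth, more of the estimated mass sits below zero and the peak at zero is flattened; the narrow bandwidth keeps the estimate closer to the observed values. Which bandwidth is appropriate for a given variable remains a judgment call.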
R code for the updated graph is [[http://svn.tables2graphs.com/tables2graphs/Rcode/Final%20Code%20for%20Website/figure_4_kaplan_table_2_b.R|here]].

{{http://svn.tables2graphs.com/tables2graphs/kaplan_fig_b.png}}

==== Figure 5 ====

**Table 2 from Schwindt-Bayer (2006): Number of Bills Sponsored in Each Thematic Area**

^ ^ Argentina ^^ Colombia-Chamber ^^ Colombia-Senate ^^ Costa Rica ^^ ^
| | **1995** | **1999** | **1994-1998** | **1998-2002** | **1994-1998** | **1998-2002** | **1994-1998** | **1998-2002** | **Total** ||
|Number of legislators who sponsored at least one bill | 246 | 257 | 139 | 165 | 87 | 94 | 57 | 57 | 1102 ||
|Women's Issues | 33 | 30 | 23 | 9 | 16 | 18 | 23 | 35 | 187 ||
|Children/Family | 28 | 40 | 13 | 7 | 22 | 9 | 25 | 27 | 171 ||
|Education | 44 | 66 | 67 | 72 | 42 | 29 | 56 | 75 | 451 ||
|Health | 27 | 51 | 13 | 14 | 13 | 5 | 16 | 33 | 172 ||
|Economics | 208 | 305 | 74 | 80 | 113 | 65 | 120 | 160 | 1125 ||
|Agriculture | 28 | 49 | 23 | 18 | 22 | 19 | 34 | 38 | 231 ||
|Fiscal Affairs | 45 | 61 | 11 | 17 | 21 | 13 | 27 | 51 | 246 ||
|Other Bills* | 567 | 901 | 405 | 406 | 371 | 356 | 628 | 764 | 4398 ||
|Total number of bills | 980 | 1503 | 629 | 623 | 620 | 514 | 929 | 1183 | 6981 ||
|*"Other Bills" include all bills that do not fall into the seven thematic areas. This would include bills related to public administration, the environment, foreign affairs, culture, and public welfare, among others.||||||||||||

**Our Graph**

Note: For this graph, we first created a Stata dataset, which is available [[http://svn.tables2graphs.com/tables2graphs/schwindt_table2.dta|here]].

{{http://svn.tables2graphs.com/tables2graphs/schwindt_fig.png}}

**Using R**: Download the {{http://svn.tables2graphs.com/tables2graphs/Rcode/Final%20Code%20for%20Website/figure_5_schwindt_table_2.R|R code}} for this graph.
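
As a rough illustration of turning the counts in the table above into proportions, the following Python sketch (not the linked R code) computes each thematic area's share of all bills for a single column, Argentina 1995. It assumes that the proportion of interest is each area's count divided by that column's "Total number of bills"; the figure itself may use a different denominator. The helper name ''area_shares'' is hypothetical.

```python
# Bill counts for one column of the table above (Argentina, 1995).
argentina_1995 = {
    "Women's Issues": 33,
    "Children/Family": 28,
    "Education": 44,
    "Health": 27,
    "Economics": 208,
    "Agriculture": 28,
    "Fiscal Affairs": 45,
    "Other Bills": 567,
}
total_bills = 980  # "Total number of bills" row for the same column

def area_shares(counts, total):
    """Return (area, share) pairs sorted from largest to smallest share.

    Assumption: share = bills in the area / total bills in the column.
    """
    shares = {area: n / total for area, n in counts.items()}
    return sorted(shares.items(), key=lambda item: item[1], reverse=True)

shares_1995 = area_shares(argentina_1995, total_bills)
for area, share in shares_1995:
    print(f"{area:<16} {share:6.1%}")
```

Sorting the areas by share before plotting mirrors the usual dotplot practice of ordering rows by value instead of alphabetically.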
=====Graphs that do not appear in paper=====

[[#A Correlation Matrix]]

[[#A Small Multiples Dotplot]]

[[#A Multi-line Time Series Plot]]

==== A Correlation Matrix ====

**Table A2 from Iversen and Soskice (2006)**

^ ^ (1) ^ (2) ^ (3) ^ (4) ^ (5) ^ (6) ^ (7) ^ (8) ^ (9) ^ (10) ^ (11) ^ (12) ^
|Redistribution | 1 | | | | | | | | | | | |
|Inequality | -0.38 | 1 | | | | | | | | | | |
|Partisanship | -0.5 | 0.37 | 1 | | | | | | | | | |
|Turnout | 0.11 | -0.38 | -0.24 | 1 | | | | | | | | |
|Unionization | 0.75 | -0.22 | -0.49 | 0.51 | 1 | | | | | | | |
|Veto points | -0.44 | -0.01 | 0.33 | -0.43 | -0.56 | 1 | | | | | | |
|Electoral system | 0.34 | -0.54 | -0.66 | 0.71 | 0.49 | -0.27 | 1 | | | | | |
|Left fragmentation | -0.57 | -0.09 | 0.14 | -0.27 | -0.76 | 0.14 | -0.18 | 1 | | | | |
|Right overrepresentation | -0.13 | 0.66 | 0.46 | 0.1 | 0.14 | -0.16 | -0.24 | -0.48 | 1 | | | |
|Per capita income | 0.12 | -0.42 | -0.08 | -0.51 | -0.18 | 0.61 | -0.22 | 0.08 | -0.64 | 1 | | |
|Female LF participation | 0.8 | -0.45 | -0.28 | -0.19 | 0.48 | -0.06 | 0.17 | -0.37 | -0.168 | 0.38 | 1 | |
|Unemployment| -0.49 | 0.55 | 0.52 | 0.06 | -0.2 | 0.01 | -0.2 | 0.02 | 0.63 | -0.41 | -0.51 | 1 |

**Our Graph**

{{http://svn.tables2graphs.com/tables2graphs/corr.png}}

Download the {{http://svn.tables2graphs.com/tables2graphs/Rcode/Final%20Code%20for%20Website/extra_correlation.R|R code}} for this graph. To create this graph, you will also need a function that we wrote called plot.corr, available {{http://svn.tables2graphs.com/tables2graphs/Rcode/Final%20Code%20for%20Website/corr.R|here}}.

**Notes** The idea for this plot is taken from Figure 8 of {{http://www.stat.columbia.edu/~gelman/research/published/overdisp_final.pdf|Zheng et al. (2006).}} It uses a seriation algorithm to place similar variables (in terms of their correlation with all other variables) next to each other. There is an implementation of this graph in the {{http://cran.r-project.org/web/packages/arm/index.html|arm}} package. (See ?corrplot)

==== A Small Multiples Dotplot ====

**Table 4 from Iversen and Soskice (2006)**

{{http://svn.tables2graphs.com/tables2graphs/tables/Tables%20for%20Website/iverson_table4.gif}}

**Our graph**

{{http://svn.tables2graphs.com/tables2graphs/iversen_fig4.png}}

Download the {{http://svn.tables2graphs.com/tables2graphs/Rcode/Final%20Code%20for%20Website/extra_2_iversen_table_4.R|R code}} for this graph.

**Notes** We turn each data column into a dotplot, creating a 2 x 2 "small multiples" plot. For both types of political system, we order the countries by their "Effective Number of Parties," from low to high, which helps give a sense of the distribution across political systems. The vertical dotted line depicts the mean value within each plot, thereby allowing for comparison of each country's level to the average value across countries.

==== A Multi-line Time Series Plot ====

The following graph comes from {{http://www.columbia.edu/~jpk2004/house_ps.pdf|Kastellec et al. (forthcoming)}}. To replicate the figure, you will need three Stata datasets: {{http://svn.tables2graphs.com/tables2graphs/House_1946-2006_aggregate.dta|aggregate House results from 1946-2004}}; {{http://svn.tables2graphs.com/tables2graphs/House_1946_2004_updated.dta|individual House results from 1946-2004}}; and {{http://svn.tables2graphs.com/tables2graphs/2006_house_data.dta|individual House results from 2006}}.

{{http://svn.tables2graphs.com/tables2graphs/time_series_house.png}}

Download the {{http://svn.tables2graphs.com/tables2graphs/Rcode/Final%20Code%20for%20Website/extra_4_time%20series%20example.R|R code}} for this graph.

**Notes** The graph compares the average district vote for Democrats in House elections versus the party's share of House seats, showing that after 1958 the latter exceeded the former until the Republicans took control of the House in 1994, when the pattern switched.
Key features of the graph include: using shading to distinguish periods of Republican and Democratic control of the House; directly labeling each line in the graph rather than using a legend; and labeling the x-axis only at 10-year intervals to avoid clutter.
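
The two quantities compared in the graph can be sketched as follows. This is an illustrative Python example (not the linked R code) with made-up district vote shares, not values from the House datasets above. The function names are hypothetical, and the rule for picking 10-year axis labels is only one possible choice.

```python
# Hypothetical Democratic vote shares by district for one election
# (made-up values for illustration, not from the linked datasets).
district_dem_vote = [0.62, 0.55, 0.53, 0.51, 0.48, 0.35]

def average_district_vote(votes):
    """Unweighted mean of the district-level vote shares."""
    return sum(votes) / len(votes)

def seat_share(votes):
    """Share of districts won, assuming a district is won with more than half the vote."""
    return sum(1 for v in votes if v > 0.5) / len(votes)

avg_vote = average_district_vote(district_dem_vote)
seats = seat_share(district_dem_vote)
print(f"average district vote: {avg_vote:.3f}, seat share: {seats:.3f}")

# One way to label the x-axis only at 10-year intervals:
# keep the election years that are divisible by 10.
election_years = list(range(1946, 2007, 2))
tick_years = [year for year in election_years if year % 10 == 0]
print(tick_years)
```

In this toy example the party wins four of six districts with an average vote just above one half, so its seat share exceeds its average district vote, which is the kind of gap the two lines in the graph make visible.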