D Making Good Tables

D.1 Importance of Tables in Modeling

library(knitr)  # Only two packages used in this Appendix
library(kableExtra)
knitr::opts_chunk$set(echo = TRUE)

The data scientist may interpret “A picture is worth a thousand words” as “A figure or table may be worth a thousand words.” The most elegant and sophisticated analysis is useless if the results are not documented or used. While figures are too broad a subject for an appendix, tables are so central to analysis that it is worth explaining how the tables in this book were created. This Appendix shows the techniques used to make the tables throughout the book so that readers can use them as well. Collecting this information in one place also reduces repetition in the other chapters.

In the main chapters, the command for generating a table is sometimes shown to demonstrate a particular technique, but the techniques are discussed in more detail in this Appendix. The RMarkdown source documents contain the code for every table in the book.

Because of the importance of tables, many packages have been developed for creating rich and sophisticated tables such as knitr’s kable, pander, grid.table, huxtable, flextable, gt, xtable, ztable, and more. Previous versions of this book extensively used pander and grid.table. After a lot of trial and error, I settled on knitr’s kable using the enhancements provided by kableExtra for the following reasons:

  • Ease of use
  • Widely used
  • Well documented, including use with bookdown
  • Straightforward table resizing
  • Ability to use LaTeX in row and column names
  • Relatively compact syntax

Just as ggplot implemented a widely used and powerful grammar of graphics, some recent packages have taken a similar approach and philosophy to tables, specifically huxtable, flextable, and the recent gt from the team at RStudio, whose name stands for "grammar of tables." These are powerful packages worth considering in the future, but for the time being, I will focus on kable as assisted by kableExtra.

Kable and kableExtra offer more options than are covered here, but this Appendix covers everything that was used in the book.

D.2 Kable vs. Kbl

In theory, everything done with kableExtra could be done with kable alone, since kableExtra serves as a friendly interface to kable. In practice, doing everything directly through kable would be much more difficult and prone to user error, so kableExtra is recommended. The first item to note is that kableExtra provides a shorthand function, kbl, similar to knitr’s native kable. In fact, it is essentially a wrapper that provides several useful benefits:

  • recognizes the output format automatically rather than requiring format="latex" to be set,
  • exposes important kable options as direct arguments, which makes the command more readable and enables autocomplete, and
  • shortens every call by two letters, helping commands fit within a single printed line.

We will use kbl throughout this Appendix and book. If you get an error that the kbl function cannot be found, you need to load the kableExtra package.
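The relationship between the two functions can be sketched as follows, assuming knitr and kableExtra are installed. The format is passed explicitly here only so that the sketch behaves the same outside a knitr document; inside one, kbl would normally detect it.

```r
library(knitr)
library(kableExtra)

m <- matrix(1:4, nrow = 2, ncol = 2)

# knitr's kable: LaTeX-specific options such as booktabs are
# passed through "..." rather than being named arguments.
k1 <- kable(m, format = "latex", booktabs = TRUE)

# kableExtra's kbl: booktabs is a named argument of the function,
# so it shows up in autocomplete.
k2 <- kbl(m, format = "latex", booktabs = TRUE)

# Both results are knitr_kable objects that kable_styling accepts.
class(k1)
class(k2)
```

Either object can be piped into kable_styling, so switching an existing kable call to kbl should not require other changes.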

D.3 Table Footnotes with Kable

Footnotes can be plain text or lists marked with numbers, letters, or symbols.

m <- matrix(c(1:4),nrow=2, ncol=2) 
rownames(m) <- c("Row name 1", "Row name 2")

kbl (m, booktabs=T,
     caption="Footnotes in Tables Using kbl",
     col.names=c("C1", "C2"), 
     row.names=F)|>
  kable_styling(latex_options = "hold_position")|>
  footnote(general = "General comments about the table. ",
       number = c("Footnote 1; ", "Footnote 2; "),
       alphabet = c("Footnote A; ", "Footnote B; "),
       symbol = c("Footnote Symbol 1; ", "Footnote Symbol 2"))
TABLE D.1: Footnotes in Tables Using kbl
C1 C2
1 3
2 4
Note:
General comments about the table.
1 Footnote 1;
2 Footnote 2;
a Footnote A;
b Footnote B;
* Footnote Symbol 1;
† Footnote Symbol 2

D.4 Setting Row and Column Names in Kable

Kable has a handy col.names option for changing column names directly in the function call. This is helpful because the data structure often follows a naming convention that was chosen for good reasons but is not very human readable. Rather than changing the column names before displaying the table or creating a copy of the table with different column names, this option lets you change them for display purposes only, without affecting the original table.

One tricky thing is that while column names can be changed in kable, row names cannot, even though the options col.names= and row.names= look equivalent.

The following code chunk looks like it should work, but row.names generates an error because it expects TRUE or FALSE (or the shorthand equivalents T and F) rather than a vector of row names. The code chunk was set to eval=FALSE to avoid the error. The lesson is that R does not always handle row names the same way as column names, or even consistently across different settings.

# eval=FALSE to avoid breaking error.
m <- matrix(1:4,nrow=2, ncol=2) 

kable (m, 
     col.names=c("C1", "C2"),
     row.names=c("R1", "R2")) 
# Works for col.names but not row.names
# The following demonstrates use of row.names
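
If you want to see the failure without stopping the document, one option is to wrap the call in tryCatch. This is only a sketch: the exact error message depends on the R and knitr versions, so none is shown here, and format="latex" is passed explicitly so the call behaves the same outside a knitr document.

```r
library(knitr)

m <- matrix(1:4, nrow = 2, ncol = 2)

# Capture the error condition instead of letting it halt execution.
res <- tryCatch(
  kable(m, format = "latex",
        col.names = c("C1", "C2"),
        row.names = c("R1", "R2")),   # a character vector, not TRUE/FALSE
  error = function(e) e
)

# Report the message if the call failed; otherwise print the table.
if (inherits(res, "error")) {
  message("kable failed: ", conditionMessage(res))
} else {
  print(res)
}
```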

Since col.names works well, there is no reason to spend more time on it. Let’s instead show how to replace the row names of a table. The following example gives a matrix bad row names that I want to replace with a better set. It then generates four tables: keeping the bad row names, dropping them, adding the good names as a leading column, and replacing the row names in the matrix before calling kbl. The last two options give good results.

m <- matrix(1:4, 2, 2)

BR <- c("Bad row name 1", "Bad row name 2")
GR <- c("Good row name 1", "Good row name 2")

row.names (m)<-BR  # Treat as default to be avoided

kbl(m, booktabs=T, caption="Retain Row Names by Using `row.names=T`",
    col.names=c("C1", "C2"), row.names=T) |>
  kable_styling(latex_options = "hold_position")
TABLE D.2: Retain Row Names by Using row.names=T
                C1  C2
Bad row name 1  1   3
Bad row name 2  2   4
kbl(m, booktabs=T, caption="Drop Row Names by Using `row.names=F`",
    col.names=c("C1", "C2"), row.names=F) |>
  kable_styling(latex_options = "hold_position")
TABLE D.2: Drop Row Names by Using row.names=F
C1  C2
1   3
2   4
kbl(cbind(as.matrix(GR), m), booktabs=T,
    caption="Adding Row Names as a Leading Column",
    col.names=c("", "C1", "C2"), row.names=F) |>
  kable_styling(latex_options = "hold_position")
TABLE D.2: Adding Row Names as a Leading Column
                 C1  C2
Good row name 1  1   3
Good row name 2  2   4
rownames(m) <- GR
kbl(m, booktabs=T,
    caption="Replacing row names in matrix before kable",
    col.names=c("C1", "C2"), row.names=T) |>
  kable_styling(latex_options = "hold_position")
TABLE D.2: Replacing row names in matrix before kable
                 C1  C2
Good row name 1  1   3
Good row name 2  2   4

The kable_styling function has a parameter, latex_options, that accepts many different options, all of which apply only to LaTeX output. I often use the "hold_position" option to encourage LaTeX to keep the table near the code chunk that produces it. If "hold_position" is not passed, LaTeX floats the table to what it considers a good place relative to the rest of the text, which might be well away from where you intended.
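
One way to see what the option does is to print the generated LaTeX. As far as I am aware, "hold_position" attaches a [!h] placement specifier to the table environment, but treat the exact specifier as an assumption and check the printed output for your kableExtra version.

```r
library(kableExtra)

m <- matrix(1:4, nrow = 2, ncol = 2)

# A caption is included so that kable emits a table environment.
tex <- kbl(m, format = "latex", booktabs = TRUE,
           caption = "Placement example") |>
  kable_styling(latex_options = "hold_position")

# Print the generated LaTeX and inspect the \begin{table} line.
cat(tex)

# Check whether a placement specifier follows \begin{table}.
grepl("\\begin{table}[", tex, fixed = TRUE)
```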

To summarize:

  • The col.names option of kable makes it very easy to change the column names temporarily, just for the purpose of the table.
  • Row names can be added by using cbind to prepend them as the first column. This means that an additional column name is needed; here I add "" to make it a blank column name.
  • Alternatively, the row names of the matrix can be replaced with rownames before calling kable.
  • Kable treats row.names as a simple flag: TRUE (or T) shows the original row names and FALSE (or F) hides them.
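
If you replace row names often, the approach of setting them before the call can be wrapped in a small helper. The function name kbl_with_rownames and its arguments are hypothetical and not part of knitr or kableExtra; this is a sketch rather than a recommended interface.

```r
library(kableExtra)

# Hypothetical helper (not part of knitr or kableExtra):
# set display row names on a copy of the matrix, then build the table.
kbl_with_rownames <- function(x, row_labels, ...) {
  stopifnot(length(row_labels) == nrow(x))
  rownames(x) <- row_labels      # changes only the local copy of x
  kbl(x, row.names = TRUE, ...)
}

m <- matrix(1:4, nrow = 2, ncol = 2)

tab <- kbl_with_rownames(m, c("Good row name 1", "Good row name 2"),
                         format = "latex", booktabs = TRUE,
                         col.names = c("C1", "C2"))
tab

# The original matrix still has no row names.
is.null(rownames(m))
```

Because R copies function arguments on modification, the original matrix keeps whatever row names it had, which matches the display-only spirit of col.names.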

D.5 Booktabs vs. Default

The booktabs option provides the formatting commonly used in books and journals. Notably, it removes the border lines around every element, which can leave an otherwise nice table looking like an Excel screen capture.

kbl (cbind(as.matrix(GR),m),
     caption="Default format using kable.",
     col.names=c("", "C1", "C2"), 
     row.names=F) |>
  kable_styling(latex_options = "hold_position")
TABLE D.3: Default format using kable.
                 C1  C2
Good row name 1  1   3
Good row name 2  2   4
kbl (cbind(as.matrix(GR),m), booktabs=T,
     caption="Kable using booktabs.",
     col.names=c("", "C1", "C2"), 
     row.names=F)|>
  kable_styling(latex_options = "hold_position")
TABLE D.3: Kable using booktabs.
                 C1  C2
Good row name 1  1   3
Good row name 2  2   4
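
The difference between the two formats is also visible in the generated LaTeX. The sketch below assumes kableExtra is installed and uses a hypothetical helper, count_hline, to count the \hline rules in each version.

```r
library(kableExtra)

m <- matrix(1:4, nrow = 2, ncol = 2)

default_tex  <- kbl(m, format = "latex", col.names = c("C1", "C2"))
booktabs_tex <- kbl(m, format = "latex", col.names = c("C1", "C2"),
                    booktabs = TRUE)

# Hypothetical helper: count the horizontal rules drawn with \hline.
count_hline <- function(tex) {
  hits <- gregexpr("\\hline", as.character(tex), fixed = TRUE)[[1]]
  sum(hits > 0)
}

count_hline(default_tex)    # rules in the default format
count_hline(booktabs_tex)   # rules in the booktabs format

# booktabs tables use \toprule, \midrule, and \bottomrule instead.
grepl("\\toprule", booktabs_tex, fixed = TRUE)
```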

D.6 Using LaTeX in Kable Column Names

Some applications require richer notation than simple plain text. With a little extra effort, you can get full LaTeX notation in tables. The primary use is likely to be in row and column names. As discussed earlier, column names can be set from within the kbl function call. Two important things to account for:

  • Set escape=F to allow using LaTeX.
  • Any LaTeX command that uses a backslash requires a double backslash inside an R string. For example, producing \(\theta^{CRS}\) requires writing $\\theta^{CRS}$ rather than $\theta^{CRS}$, as shown below.
m <- matrix(1:4, 2, 2)
GR <- c("$\\phi$", "$\\omega$")

kbl(cbind(as.matrix(GR),m), booktabs=T, escape=F, caption=
    "Using LaTeX in Row and Column Names",
    col.names=c("","$\\theta^{CRS}$", "$\\lambda_A$"), 
    row.names=F)|>
  kable_styling(latex_options = "hold_position")
TABLE D.4: Using LaTeX in Row and Column Names
            \(\theta^{CRS}\)  \(\lambda_A\)
\(\phi\)    1                 3
\(\omega\)  2                 4
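
The double backslash is a feature of R strings rather than of kable. A short base R sketch shows that the stored text contains only a single backslash, which is what LaTeX receives. The raw string syntax requires R 4.0.0 or later.

```r
# Inside an R string, "\\" is an escape sequence for ONE backslash.
label <- "$\\theta^{CRS}$"

# cat() shows the characters that are actually stored.
cat(label, "\n")
nchar(label)

# Raw strings take backslashes literally, so no doubling is needed.
raw_label <- r"($\theta^{CRS}$)"
identical(label, raw_label)
```

Raw strings can make long LaTeX labels easier to read, although the examples in this Appendix keep the doubled backslashes.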

D.7 Fitting Tables to Page Width

Large tables can be problematic. Kable can easily scale a table to fit the page width using the scale_down option, which kableExtra provides through the latex_options parameter of kable_styling.

NWeeks <- 24
mbig <- matrix(1:(2*NWeeks), ncol=NWeeks, byrow=TRUE)
rownames(mbig) <- c("Widgets", "Gadgets")

kbl(mbig,booktabs=T, 
    caption="Scaling Table to Fit Page")|>
  kable_styling(latex_options = 
                  c("hold_position", "scale_down"))
TABLE D.5: Scaling Table to Fit Page
Widgets 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
Gadgets 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48

A few things to note:

  • Multiple options can be passed at once to latex_options as shown above.
  • The scale_down option is set through latex_options, which highlights that it only works for PDFs generated through LaTeX.
  • When a table is shrunk, the font may become too small to read.
  • The scale_down option will also scale up a table that does not fill the full page width, which may make a small table look cartoonishly large.
  • Some oversized tables may benefit from using the kbl option of longtable=T.

On the other hand, sometimes this is a sign that the display of this information needs to be rethought:

  • transposing the table to be taller rather than wider,
  • not displaying certain columns or rows with less information value,
  • using a figure of some form rather than a table,
  • turning the table (or page) sideways,
  • sticking the table into an appendix or as a file for download, or
  • providing a summary table of means, standard deviations, or other information.
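
As a sketch of the first alternative, the wide table from this section can be transposed with base R’s t() before it is passed to kbl. The "Week" row labels are my assumption based on the variable name NWeeks, and format="latex" is passed explicitly so the call behaves the same outside a knitr document.

```r
library(kableExtra)

NWeeks <- 24
mbig <- matrix(1:(2 * NWeeks), ncol = NWeeks, byrow = TRUE)
rownames(mbig) <- c("Widgets", "Gadgets")

# t() swaps rows and columns: the 2 x 24 matrix becomes 24 x 2.
mtall <- t(mbig)
rownames(mtall) <- paste("Week", 1:NWeeks)   # assumed labels
dim(mtall)

# longtable lets LaTeX break the table across pages if needed.
kbl(mtall, format = "latex", booktabs = TRUE, longtable = TRUE,
    caption = "Transposed Table")
```

A tall, narrow table usually needs no scaling, so the font stays at its normal size.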
