A A Very Brief Introduction to R

A.1 Purpose

This Appendix provides a very brief introduction to the basics of R from the perspective of use for optimization modeling. This is not meant to be comprehensive R tutorial. There is an ever expanding collection of such materials available.

A.2 Getting Started with R

This Appendix helps a reader new to R quickly get started. I strongly recommend using RStudio. I also like books such as R in a Nutshell (Adler 2012) or R in Action (Kabacoff 2011).

I usually find the best way for me to learn a new tool is to roll up my sleeves and jump right in. This book can be approached in that way - just be ready for a little more experimentation. This book is available on Github and the R Markdown files are available for use. Code fragments are shown quite liberally for demonstrating how things work. While this may be a bit verbose it is meant to enable readers to jump in at various points of the book.

While code fragments can be copied from the R markdown files for this book, it is often best to physically retype many of the code fragments shown as that gives time to reflect on what each statement is doing.

Let’s define some conventions to be used throughout the book. First, let me be clear, the goal of this section is not to provide a comprehensive introduction to R. Entire books are written on that subject. My goal here is to simply make sure that everyone can get started productively in the material covered later.

Since this book is focused on using R for operations research, we will focus on the capabilities that are needed for this area and introduce additional features and functions as needed. If you are already comfortable with R, RStudio, and RMarkdown, you may skip the remainder of this section.

Begin by ensuring that you have access to or installed R and RStudio. Both are available for Windows, Mac, and Linux operating systems as well as being available as web services.

Now, let’s assume that you are running RStudio.

In this book, I will frequently show code fragments called chunks to show R code. In general, these code chunks can be run by simply retyping the command(s) in the Console.

a <- 7

This command assigns the value of 7 to the variable a. It is standard in R to use <- as an assignment operator rather than an equals sign. We can then use R to process this information.

6*a
## [1] 42

Yes, not surprisingly, \(6*7\) is \(42\). Notice that if we don’t assign that result to something, it gives an immediate result to the screen.

We will often be using or defining data that has more than one element. In fact, R is designed around more complex data structures. Let’s go ahead and define a matrix of data.

b<-matrix(c(1,2,3,4,5,6,7,8))

By default, it is assuming that the matrix has one column which means that every data value is in a separate row. The c function is a concatenate operator meaning that it combines all the following items together.

Let’s look at b now to see what it contains.

b
##      [,1]
## [1,]    1
## [2,]    2
## [3,]    3
## [4,]    4
## [5,]    5
## [6,]    6
## [7,]    7
## [8,]    8

Let’s instead define this matrix to have four columns and two rows.

b<-matrix(c(1,2,3,4,5,6,7,8), ncol=4)
b
##      [,1] [,2] [,3] [,4]
## [1,]    1    3    5    7
## [2,]    2    4    6    8

Notice that since there are eight elements, we only needed to tell R that the matrix has four columns and it then knew that there would be only two rows. Of course we could have set the number of rows to be two for the same result.

This is still a little ambiguous. Let’s give the rows and columns names. For now we will simply name them Row and Col.

b<-matrix(c(1,2,3,4,5,6,7,8), ncol=4, 
          dimnames=c(list(c("Row1", "Row2")),
                     list(c("Col1", "Col2","Col3","Col4"))))

Remark (RStudio Console). The RStudio console can save typing by pressing the up arrow key to view previous command(s) which can then be further edited.

Okay, this command has a lot more going on. The term dimnames is a parameter that contain names for rows and columns. One thing to note is that this line fills up more space than a single line so it rolls over to multiple line. The dimnames parameter will get get two concatenated (combined) lists. The first list is a combined list of two text strings “Row1” and “Row2.” The next line does the same for columns.

Let’s confirm that it works.

b
##      Col1 Col2 Col3 Col4
## Row1    1    3    5    7
## Row2    2    4    6    8

This table is still not that “nice” looking. Let’s use a package that does a nicer job of formatting tables. To do this, we will use an extra package. Up to this point, everything that we have done just simply uses standard built in functions of R. The package that we will use is kable but there are plenty of others available such as pander, kable, xtable, and huxtable. While kable is a function of knitr, it is significantly enhanced by the kableExtra package. For more information on the specifics of creating rich tables, see Appendix D.

Let’s start by loading the kableExtra package.

library (kableExtra)

If R indicates that you don’t have the kableExtra package installed on your computer, you can press the “Install” command under the Packages tab and then type in kableExtra or use the install.packages command from RStudio. The kbl is a shorthand way of entering the kable function that is provided by kableExtra. A lot of the power of R comes from the thousdands of packages developed by other people so you will often be installing and loading packages.

Notice the hash symbol is used to mark a comment in the above code chunk. It is also used to “comment out” a command that I don’t need to use at the current time. Using comments to explain what a command is doing can be helpful to anyone that needs to revisit your code in the future, including yourself!

knitr::kable (b, booktabs=T, 
     caption="Example using Kable with booktabs") |>
  kableExtra::kable_styling(latex_options = "hold_position")
TABLE A.1: Example using Kable with booktabs
Col1 Col2 Col3 Col4
Row1 1 3 5 7
Row2 2 4 6 8

The table looks very nice in the book. Let’s explain in detail how the table is generated. This code chunk has a lot going and is worth a careful look.

  • The first line knitr::kable means that we should run the kable function from within knitr. In general, as a long as the library is loaded and the same function is not defined by two different loaded libraries, you don’t need to specify where the function is being run from.
  • The booktabs=T sets a format for clean published table style by setting the option to T for TRUE.
  • The second line provides a caption.
  • The last part of the second line warrants a little attention. It is the new pipe operator and requires R version 4.1.0 or higher. The pipe operator basically says pass this commands output into the beginning of the next command. We’ll use it a lot for table generation. Prior to R version 4.1.0, people that used pipes in R often used the %>% operator from a separate package such as magrittr. For most users, including us, the built-in |> will suffice.
  • The last command, kable_styling, has an option for hold_position which holds the placement closer to where it is called, keeping LaTeX from second guessing where the ideal location would be for the table. Note that is also specifying its source package, in this case, kableExtra. There are a lot of other options in kable_styling that we will use in later chapters as needed.
  • For your own use, you could shorten all of this to just kable (b). On the other hand, kableExtra has one more small trick up its sleeve. It provides a shorter name for the kable function of just kbl. When space is at premium, this shortcut is handy. It also serves as a helpful check to ensure that kableExtra is loaded.

Let’s continue experimenting with operators.

c<-a*b
TABLE A.2: Scalar Multiplication of Matrix b to Make Matrix c
Col1 Col2 Col3 Col4
Row1 7 21 35 49
Row2 14 28 42 56

Now let’s do a transpose operation on the original matrix. What this means that it converts rows into columns and columns into rows.

d<-t(b)
TABLE A.3: Transposition of Matrix b
Row1 Row2
Col1 1 2
Col2 3 4
Col3 5 6
Col4 7 8
Note:
Row and column names were changed at the same time.

Notice that row and column names are also transposed. The result is that we now have rows labeled as columns and columns labeled as rows! This is only a problem given that we used the words rows and columns in the names. If these had been more descriptive such as weeks and product names, it would have been a good thing to change them at the same time.

Now we have done some basic operations within R.

We could try this but but we aren’t quite there yet.

c
##      Col1 Col2 Col3 Col4
## Row1    7   21   35   49
## Row2   14   28   42   56
d
##      Row1 Row2
## Col1    1    2
## Col2    3    4
## Col3    5    6
## Col4    7    8

Try renaming the rows and columns for d. As a hint, you could use dimnames or colnames and rownames.

Recall that we have created a variety of objects now: a, b, c, and d. R provides a lot of tools for slicing, dicing, and combining data.

TABLE A.4: Original Matrix b
Col1 Col2 Col3 Col4
Row1 1 3 5 7
Row2 2 4 6 8

Let’s look at the original matrix, b and what we can do with it. Let’s grab the second row, third column and last element.

TABLE A.5: Second Row of b
Col1 2
Col2 4
Col3 6
Col4 8

The as.matrix function is used to convert the object into a matrix so that kable can display it well.

temp2 <- as.matrix(b[,3])
TABLE A.6: Third Column of b
Row1 5
Row2 6

Grabbing the third column is not terribly interesting but operations such as this will often be quite useful.

temp3 <- as.matrix(b[2,4]) 
TABLE A.7: Last Element of b
8

Let’s take the first row of b and combine it with the second row of c to form a new matrix, e. Since these are rows that are going to be combined, we will use a command, rbind, to bind these rows together.

temp4 <- rbind(b[1,],c[2,])
TABLE A.8: Combined Matrix
Col1 Col2 Col3 Col4
1 3 5 7
14 28 42 56

Notice that temp4 has inherited the column names but lost the row names. Let’s set the row names for this matrix.

rownames(temp4)<-list("From b", "From c")
TABLE A.9: Combined Matrix with Explanation of Source
Col1 Col2 Col3 Col4
From b 1 3 5 7
From c 14 28 42 56

We could combine all of matrix b and matrix c together using row binding or column binding. Table Let’s view the results of binding using columns (cbind) and rows (rbind).

temp5<- cbind(b,c)
TABLE A.10: Column Binding of Matrices b and c
Col1 Col2 Col3 Col4 Col1 Col2 Col3 Col4
Row1 1 3 5 7 7 21 35 49
Row2 2 4 6 8 14 28 42 56
temp6 <- rbind(b,c)
TABLE A.11: Row Binding of Matrices b and c
Col1 Col2 Col3 Col4
Row1 1 3 5 7
Row2 2 4 6 8
Row1 7 21 35 49
Row2 14 28 42 56

Data organizing is a less glamorous part of the job for practicing analytics professionals but can consume a majority of the workday. There is a lot more that can be said about data wrestling but scripting the data cleansing in terms R commands will make the work more repeatable, consistent, and in the end save time.

A.3 Exercises

Exercise A.1 (Creating a Matrix of Daily Demand for Four Weeks) Assume that your company’s product has a demand of 10 widgets on Monday, increasing by five units for every day through Sunday. Build a matrix of four weeks of demand where each row is a separate week. Each Monday starts over with the same demand. The rows should be named Week1, Week2, etc. The columns should have names corresponding to the day of the week.

Exercise A.2 (Creating a Matrix of Demand for Two Products) Assume that your company’s product has a demand of 10 widgets on Monday, increasing by five units for every day through Sunday. Gadgets have a demand of 20 on Monday, increasing by 3 units a day through Sunday. Build a matrix showing each product as a separate row. Rows should have names for the products and columns for the days of the week.

Exercise A.3 (Creating a Matrix of Monthly Demand) Consider some top selling online products for the year 2021: toys, shoes, pens and pencils, decorative bottles, drills, cutters, and GPS navigation systems. Create your own assumed starting counts per item for throughout year by increasing each item sales by 50 units per month, starting from January till December.

Exercise A.4 (Displaying Selective Data from the Matrix) In the above matrix, grab the columns and rows to show the data only for pens and pencils during the months of June, August and October.

Exercise A.5 (Creating Separate Demands for Products) In the above matrix, demand for toys and shoes has been increased by 20 and 10 units respectively. Create a matrix for only these products for the first 6 months. Display the matrix twice–once without formatting and a second time using an R package that provides better table formatting (pander, kable via knitr, huxtable or similar tool. Be sure to provide a table caption.)

References

Adler, Joseph. 2012. R in a Nutshell. Sebastopol, CA: O’Reilly.
Kabacoff, Robert. 2011. R in Action: Data Analysis and Graphics with R. Shelter Island, NY; London: Manning ; Pearson Education [distributor].