Deploying Shiny Apps or Interactive Documents with Shiny Server and AWS

       Recently, I needed to deploy a flexdashboard that I had built. For shiny applications or interactive documents, there are a few options for deployment and hosting, including shinyapps.io, RStudio Connect, and Shiny Server. The resource available to me was our AWS EC2 instance, so I decided to host our flexdashboard through shiny-server and AWS EC2. In this post, I want to document the steps that I took and the resources that helped me better understand the deployment process, for future reference. This post will be updated as my understanding of AWS, shiny, and application deployment improves over time. Note that I am a Mac user, which means I’ll be using the Terminal app initially for the EC2 setup.

       Some resources that have helped me understand AWS and EC2 are the following:

Step 1: AWS EC2

       If you are working for an organization that uses AWS EC2, the chances are that your data team or IT department may already have an EC2 instance running. In that case, consult your cloud manager or supervisor or whoever manages your organization’s AWS account regarding the following:

  • The root user could create your IAM user account, which gives you certain access rights.

  • You would need to connect to EC2 via a Secure Shell (SSH) using a Command Line Interface (CLI), and so you need to obtain the AWS EC2 .pem private key file.

  • You may also want to obtain the SSH commands that allow you to SSH into your organization’s EC2 instance.

The rest of the setup steps may differ quite a bit depending on whether or not you are using your organization’s EC2 instance or running your own. For the purpose of this post, however, we will create our own personal AWS account and EC2 instance. I find that practicing deploying an application using my own AWS account and EC2 instance has helped me ultimately set up the production environment on my organization’s EC2 instance. The first step, though, is to register for an AWS account, which is free of charge.

Launch an EC2 Instance and Select an AMI

       Launch an EC2 instance by selecting an Amazon AMI.

Because many tutorials and resources online are based on Ubuntu, we will use the Ubuntu AMI. Not only is this option free tier eligible but, in my experience, it can also save us a lot of pain in dealing with system requirements later on. Ubuntu 20.04.1 LTS (at the time of writing this post) is a well-documented operating system with a large user base, so troubleshooting is relatively easier in my experience compared to an AMI such as the Amazon Linux AMI 1, which is based on Red Hat Enterprise Linux (RHEL). If you work for an organization, you may not be able to choose which AMI to use. The steps that follow should still work with other AMIs (for instance, we use the Amazon Linux AMI at our organization), but note that you may run into problems installing the system libraries and packages needed for deployment, and even R packages, as some of the commands will be different.

Choose an Instance Type

       We will choose t2.micro, which is free tier eligible. Depending on your needs for computing resources (for instance, installing R packages with compiled code), you may run out of memory with 1 GiB of memory and 1 vCPU, so you could also consider other instance types. I recommend reading the following article to better understand the differences between instance types.

Configure Instance Details

       We could leave this as default.

Add Storage

       The default EBS volume size is 8 GB but we get up to 30 GB of General Purpose SSD via the free tier. See the documentation on EBS volume options.

Add Tags

       Tags may be useful for organizing your AWS services. See the documentation for more on this.

Configure Security Group

       Security groups function as virtual firewalls for your EC2 instances to control inbound and outbound traffic. By default, AWS blocks traffic from all ports except for port 22, which is the port we use to SSH into our instance. I use the following configuration based on mgritts’s article.

Type       | Protocol | Port Range | Source                    | Description
SSH        | TCP      | 22         | Anywhere: 0.0.0.0/0, ::/0 | SSH
HTTP       | TCP      | 80         | Anywhere: 0.0.0.0/0, ::/0 | Use nginx to password protect and set up proxy
Custom TCP | TCP      | 3838       | Anywhere: 0.0.0.0/0, ::/0 | Default Shiny server
Custom TCP | TCP      | 8787       | Anywhere: 0.0.0.0/0, ::/0 | Default RStudio server

Since our instance is utilized as a web server, we use security rules to allow IP addresses to access our instance using HTTP or Custom TCP so that external users can browse the content on our web server.

  • The second rule allows inbound HTTP access from all IPv4 and IPv6 addresses.

  • The third and fourth rules allow inbound access to Shiny Server and RStudio Server on their default ports (3838 and 8787).
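
For readers who prefer the command line, the same inbound rules can be expressed with the AWS CLI. The sketch below only prints the commands instead of running them. The security group ID is a placeholder I made up, only the IPv4 rules are shown, and opening every port to 0.0.0.0/0 simply mirrors the table above rather than being a recommendation.

```shell
# Placeholder security group ID (an assumption; substitute your own).
SG_ID="sg-0123456789abcdef0"

# Print one `aws ec2 authorize-security-group-ingress` command per port.
# Nothing is sent to AWS; copy a printed line to actually apply it.
print_ingress_rules() {
    for port in 22 80 3838 8787; do
        echo "aws ec2 authorize-security-group-ingress --group-id $1 --protocol tcp --port $port --cidr 0.0.0.0/0"
    done
}

print_ingress_rules "$SG_ID"
```

Printing first makes it easy to review each rule, or to tighten the CIDR range, before anything touches the account.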

Key Pair

       The last step for setting up an EC2 instance is creating your .pem private key file, or selecting an existing key file provided by your organization.

Finally, launch your instance.

Elastic IP

       An elastic IP address is different than our EC2 instance’s Public IPv4 address; in short, an Elastic IP address is allocated to our AWS account, and is ours until we release it. Therefore, this IP address can be reused for our EC2 instances. The re-usability of our IP may be useful when we want to upgrade or downgrade our EC2 instance type. Without an elastic IP address, a new Public IPv4 address will be used each time we stop and re-launch our instance. This means that any service that depends on our public IP will need to be updated. The benefit of an elastic IP address is that we can simply associate it to the new server. In other words, the elastic IP address allows us to mask the failure of an instance or software by rapidly remapping the address to a new instance in our account. The setup is as follows:

Select the Action drop down menu in the top right corner and choose Associate Elastic IP address. From now on, every time we make changes to our EC2 instance, we can simply re-associate this IP address to our new instance.

Step 2: Connecting to AWS EC2

Connecting via SSH

       To connect to our EC2 instance via SSH, we will use the terminal (for Windows, the steps for PuTTY can be found here). When you select “Connect” in your AWS console, you should be taken to the following page:

  • Open the terminal, navigate to the location of our .pem key:
# Change working directory 
$ cd path_to_pem_file
  • Next, run the following command to ensure that our key is not publicly viewable:
$ chmod 400 file.pem
  • Connect to the instance:
$ ssh -i "file.pem" ubuntu@ec2-public-ip-address.compute-1.amazonaws.com
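
To avoid retyping the key path and host name, the three steps above can be folded into an entry in ~/.ssh/config. This is a minimal sketch: the alias shiny-ec2 and the key location are hypothetical, and the function only prints the entry so that you can review it before adding it to your own config file.

```shell
# Build an ~/.ssh/config entry from its parts.
# Arguments: alias, host name, user, path to the private key.
ssh_config_entry() {
    printf 'Host %s\n' "$1"
    printf '    HostName %s\n' "$2"
    printf '    User %s\n' "$3"
    printf '    IdentityFile %s\n' "$4"
}

# The alias "shiny-ec2" and the key location are hypothetical.
ssh_config_entry shiny-ec2 \
    ec2-public-ip-address.compute-1.amazonaws.com \
    ubuntu \
    "~/path_to_pem_file/file.pem"
```

With the printed text appended to ~/.ssh/config, running ssh shiny-ec2 would stand in for the full command.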

If this is your first time connecting to your EC2 instance, you may receive an Are you sure you want to continue connecting (yes/no/[fingerprint])? prompt. Entering yes should successfully connect you to your EC2 instance:

Welcome to Ubuntu 20.04.3 LTS (GNU/Linux 5.11.0-1022-aws x86_64)

 * Documentation:  https://help.ubuntu.com
 * Management:     https://landscape.canonical.com
 * Support:        https://ubuntu.com/advantage

  System information as of Sat Jan 29 01:27:58 UTC 2022

  System load:  0.0               Processes:             100
  Usage of /:   4.9% of 29.02GB   Users logged in:       0
  Memory usage: 21%               IPv4 address for eth0: 172.31.91.243
  Swap usage:   0%


1 update can be applied immediately.
To see these additional updates run: apt list --upgradable

Disconnecting

       To disconnect from our instance:

$ exit

Upgrading and Installing System Packages

       This particular step and many of the steps that follow are places where the operating system will begin to matter. The following commands are meant to work with the Ubuntu OS (a Debian-based Linux distribution). For instance, the Advanced Package Tool, or APT, which handles the installation and removal of software, is developed for Ubuntu/Debian system software packages. For Red Hat-based Linux systems, the Yellowdog Updater, Modified, or YUM, package-management utility is used. In addition, R packages can depend on software external to the R ecosystem. On Ubuntu, for instance, in order to install the curl R package, we must first install a libcurl development package (for example, apt-get install libcurl4-gnutls-dev). Resolving system dependency issues can be painful at times, and the pain points may vary based on the operating system (AMI). In my experience, one effective way to troubleshoot is simply to search Google for system dependencies on an ad-hoc basis (after an error is thrown, for instance, when you try to install an R package). If you are lucky, you won’t be the first to crash because of a missing system library.

# Update commands
$ sudo apt update
$ sudo apt-get update -y
$ sudo apt-get dist-upgrade -y
# Install some system libraries
$ sudo apt-get -y install \
    nginx \
    gdebi-core \
    apache2-utils \
    pandoc \
    pandoc-citeproc \
    libssl-dev \
    libcurl4-gnutls-dev \
    libcairo2-dev \
    libgsl0-dev \
    libgdal-dev \
    libgeos-dev \
    libproj-dev \
    libxml2-dev \
    libxt-dev \
    libv8-dev \
    libhdf5-dev \
    git
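
Since the commands differ between Debian-based and Red Hat-based systems, a quick way to find out which family an instance belongs to is to ask the shell which package manager is present. The function name below is just an illustrative choice, and the check only looks for apt-get and yum.

```shell
# Report which package manager is available on this machine.
# Prints "apt", "yum", or "unknown".
detect_pkg_manager() {
    if command -v apt-get >/dev/null 2>&1; then
        echo "apt"
    elif command -v yum >/dev/null 2>&1; then
        echo "yum"
    else
        echo "unknown"
    fi
}

echo "Package manager: $(detect_pkg_manager)"
```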

       The difference between apt-get and apt is that the former is an older command with more options while apt is a newer, more user-friendly command with fewer options. To understand these shell commands, I found explainshell.com (its GitHub repo can be found here) to be extremely useful. Other resources that are also helpful include:

To be able to compile R packages, we also need to install the build-essential package, which is necessary for compiling software:

$ sudo apt install build-essential

On Ubuntu, you may run the following command to check on disk space:

$ df -h

If nginx is installed successfully, you should see the following page by entering your Public IPv4 address (obtained from Instance summary in your AWS console) into your web browser:

Step 3: Installing R, RStudio Server, and Shiny Server

Installing R from CRAN

       Because R updates frequently, the latest stable version isn’t always available from Ubuntu’s default repositories, and so we’ll need to add the external repository maintained by CRAN. To install the latest version of R from CRAN, the commands are as follows:

# Update indices
$ sudo apt update -qq
# Install two helper packages 
$ sudo apt install --no-install-recommends software-properties-common dirmngr
# Add the signing key (by Michael Rutter) for these repositories
# To verify key, run gpg --show-keys /etc/apt/trusted.gpg.d/cran_ubuntu_key.asc 
# Fingerprint: E298A3A825C0D65DFD57CBB651716619E084DAB9
$ wget -qO- https://cloud.r-project.org/bin/linux/ubuntu/marutter_pubkey.asc | sudo tee -a /etc/apt/trusted.gpg.d/cran_ubuntu_key.asc
# Add the R 4.0 repo from CRAN -- adjust 'focal' to 'groovy' or 'bionic' as needed
$ sudo add-apt-repository "deb https://cloud.r-project.org/bin/linux/ubuntu $(lsb_release -cs)-cran40/"

The instructions for installing R on other operating systems can be found here under “Download and Install R”. Finally, run the following command to install R:

# Install recommended packages
$ sudo apt install r-base

Or, install without considering recommended packages:

# Install without recommended packages
$ sudo apt install --no-install-recommends r-base

To check the R version:

$ R --version

Other useful commands are:

# Run R from the terminal
$ R 
# Quit (typed at the R prompt, not the shell)
> q()

       If you are using an operating system that comes with another AMI, the stable version of R in its default repository may be different. For Amazon Linux AMI, the latest version of R is 3.x; depending on the R packages you need to install, this may or may not be troublesome.

Installing RStudio Server

       To download the latest version of RStudio Server, use the following link and select your Linux platform. The official instructions are easy to follow; use the following commands to install RStudio Server for Ubuntu 20 (at the time of writing this post):

$ sudo apt-get install gdebi-core
$ wget https://download2.rstudio.org/server/bionic/amd64/rstudio-server-2021.09.2-382-amd64.deb
$ sudo gdebi rstudio-server-2021.09.2-382-amd64.deb

Installing Shiny Server

       Similarly, install the Shiny R package and the latest version of Shiny Server by following the instructions on this page.

# Install shiny
# This may take a while to compile on t2.micro
$ sudo su - \
-c "R -e \"install.packages('shiny', repos='https://cran.rstudio.com/')\""
# Install shiny server
$ sudo apt-get install gdebi-core
$ wget https://download3.rstudio.org/ubuntu-14.04/x86_64/shiny-server-1.5.17.973-amd64.deb
$ sudo gdebi shiny-server-1.5.17.973-amd64.deb

Checking the Installation

On Ubuntu, run the following commands:

$ cd ~
$ ls

You should see the .deb installers for both RStudio Server and Shiny Server that were downloaded above. On Red Hat-based Linux distributions, you might use the following commands to check whether a package is properly installed:

# List installed packages
$ sudo yum list installed
# Use grep command to filter for specific package
$ sudo yum list installed | grep nginx
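
A check that does not depend on the distribution is to ask the shell whether each program can be found on the PATH. This is only a sketch: it assumes the installers expose commands named rstudio-server and shiny-server on your PATH, and check_cmd is an illustrative helper name.

```shell
# Print whether a command can be found on the current PATH.
check_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: missing"
    fi
}

for cmd in R nginx rstudio-server shiny-server; do
    check_cmd "$cmd"
done
```

A "missing" line does not necessarily mean the installation failed; the binary may simply live in a directory that is not on the PATH of the current user.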

Other resources that I have found useful are as follows:

If you enter http://<public-ipv4>:8787 and http://<public-ipv4>:3838 into your browser, you should see the RStudio Server sign-in page and the Shiny Server welcome page, respectively.

Install R packages (System Library)

       One way to install R packages is to install them in a system-level or global library; this library is available for all users and roles of your EC2 instance. The syntax for installing R packages from CRAN within the terminal is as follows:

$ sudo su - -c "R -e \"install.packages(c('tidyverse', 'data.table'), repos='http://cran.rstudio.com/')\""

To install development versions of R packages from GitHub:

$ sudo su - -c "R -e \"install.packages('devtools', repos='http://cran.rstudio.com/')\""
$ sudo su - -c "R -e \"devtools::install_github('tidyverse/ggplot2')\""

Note: As mentioned earlier, with the t2.micro instance type, you may simply run out of memory installing certain R packages with compiled code (for example, Rcpp and RcppArmadillo). If this happens, the installation will appear to hang indefinitely.
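
Before starting a long compilation, it can help to look at how much memory is actually free. The sketch below reads the MemAvailable field from /proc/meminfo, which exists on Linux only; the function name is hypothetical, and the optional argument exists purely so the function can be tried against a sample file.

```shell
# Print available memory in MiB, read from a meminfo-style file.
# Defaults to /proc/meminfo, which exists on Linux only.
mem_available_mb() {
    awk '/^MemAvailable:/ { print int($2 / 1024) }' "${1:-/proc/meminfo}"
}

if [ -r /proc/meminfo ]; then
    echo "Available memory: $(mem_available_mb) MiB"
fi
```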

Install R packages (User Library)

       Alternatively, there is also the option to install add-on R packages (those that do not come with base R) in a user-level library, which may be appealing for many reasons. We will return to this once we set up the user credentials for Rstudio server.

Step 4: RStudio Server and IDE

User Login

       The RStudio Server enables you to provide a browser based interface (the RStudio IDE) to a version of R running on a remote Linux server. The RStudio IDE can be accessed by entering http://<public-ipv4>:8787 into your browser. The log in credentials use the user information on your EC2 instance, which is stored in the /etc/passwd file. This file stores essential information about the users on the system. We can manage users on our EC2 instance using the following commands for linux:

  • In Ubuntu, there are two command-line tools that you can use to create a new user account: useradd and adduser. The former, useradd, is a low-level utility and adduser is a script written in Perl that acts as a friendly interactive frontend for useradd:
$ sudo adduser username

The command above will prompt you to enter the following information to set up the user:

Adding user `username' ...
Adding new group `username' (1001) ...
Adding new user `username' (1001) with group `username' ...
Creating home directory `/home/username' ...
Copying files from `/etc/skel' ...
New password: 
Retype new password: 
passwd: password updated successfully
Changing the user information for username
Enter the new value, or press ENTER for the default
	Full Name []: Your Name
	Room Number []: 
	Work Phone []: 
	Home Phone []: 
	Other []: 
Is the information correct? [Y/n] y

       This will create the new user’s home directory, and copy files from /etc/skel to this directory. Within the home directory, the user can write, edit, and delete files and directories. To allow this user to be able to perform administrative tasks, add this existing user to the sudo group using usermod:

$ sudo usermod -a -G sudo username

       Always use the -a (append) option when adding a user to a new group. If you omit the -a option, the user will be removed from any groups not listed after the -G option. On success, the usermod command does not display any output, but warns you if the user or group doesn’t exist.

  • In Ubuntu, you can use two commands to delete a user account: userdel and its interactive frontend deluser:
$ sudo deluser username

To delete the user and its home directory and mail spool, use the --remove-home flag:

$ sudo deluser --remove-home username

Note that sometimes you may need to kill an R session before removing the user:

# Kill an individual session
$ sudo rstudio-server kill-session <pid>
# Force kill all running sessions
$ sudo rstudio-server kill-all

The session process ID can be obtained with the following base R function:

Sys.getpid()
  • To change password for a user:
$ sudo passwd username
  • To delete a user’s password, leaving the account without one until a new password is set:
$ sudo passwd -d username
  • To see list of all users, simply use the following commands:
# List users
$ cat /etc/passwd
$ cut -d: -f1 /etc/passwd
# Search for username using the grep command
$ grep username /etc/passwd
# Or
$ grep -w '^username' /etc/passwd
  • To see details about the file:
$ stat /etc/passwd
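
The user-management commands above can be double-checked with two small helpers. This is a sketch under a few assumptions: the UID range 1000 to 65533 is the usual convention for regular accounts on Debian and Ubuntu rather than a guarantee, and both function names are made up for illustration.

```shell
# List account names with a UID in the range usually given to regular users.
# Reads a passwd-style file; defaults to /etc/passwd.
list_login_users() {
    awk -F: '$3 >= 1000 && $3 < 65534 { print $1 }' "${1:-/etc/passwd}"
}

# Succeed if the given user belongs to the given group.
in_group() {
    id -nG "$1" 2>/dev/null | tr ' ' '\n' | grep -qx "$2"
}

list_login_users
if in_group "$(id -un)" sudo; then
    echo "$(id -un) is in the sudo group"
fi
```

Running in_group username sudo after the usermod step is a quick way to confirm that the group change took effect.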

       More information can be found via the Rstudio Server administration guide. Finally, logging into your Rstudio IDE should take you to the following GUI:

Some useful commands for managing rstudio-server:

$ sudo rstudio-server stop
$ sudo rstudio-server start
$ sudo rstudio-server restart

User Library

       Once you have set up a user on your EC2 instance, which creates a home directory for the username, your user-level library will be set up as well, so there is nothing extra to do here. Log in to your RStudio IDE and run the following function:

.libPaths()

This would return the following paths on Ubuntu:

[1] "/home/username/R/x86_64-pc-linux-gnu-library/4.1"
[2] "/usr/local/lib/R/site-library"                  
[3] "/usr/lib/R/site-library"                        
[4] "/usr/lib/R/library"  

On RedHat (for instance, the Amazon Linux AMI 1), the output may be something like:

[1] "/home/username/R/x86_64-redhat-linux-gnu-library/3.4"
[2] "/usr/lib64/R/library"                                    
[3] "/usr/share/R/library"  

The first path is always your user library, which means that running install.packages() using the RStudio IDE will install the packages in that path. On Debian and Ubuntu, the R_LIBS_USER environment variable is set in /etc/R/Renviron.

R_LIBS_USER=${R_LIBS_USER-'~/R/$platform-library/R-version'}

where $platform is something like ‘x86_64-pc-linux-gnu’ and R-version is the major.minor version of R installed on your EC2 instance (for example, 4.1). The environment variable R_LIBS_SITE is set in /etc/R/Renviron to

R_LIBS_SITE=${R_LIBS_SITE-'/usr/local/lib/R/site-library:/usr/lib/R/site-library:/usr/lib/R/library'}

We can access the environment variables via:

$ sudo nano /etc/R/Renviron
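
The ${NAME-default} form used in those Renviron lines borrows its notation from the shell: the value of NAME is used if it is already set, and the default otherwise. The demonstration below uses a made-up variable, DEMO_LIBS, and made-up paths. R reads Renviron files with its own parser rather than a shell, so treat this as an analogy for the syntax and not a description of R's exact behavior.

```shell
# ${VAR-default} expands to $VAR when VAR is set, otherwise to the default.
unset DEMO_LIBS
echo "unset: ${DEMO_LIBS-/home/username/R/library}"

DEMO_LIBS="/opt/custom/R/library"
echo "set:   ${DEMO_LIBS-/home/username/R/library}"
```

This is why exporting R_LIBS_USER yourself takes precedence over the value written in /etc/R/Renviron.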

       The R packages that are part of r-base and r-recommended are installed into the directory /usr/lib/R/library. The other R packages available as precompiled Debian packages r-cran-* and r-bioc-* are installed into /usr/lib/R/site-library. More information for Debian packages of R software can be found in the following article. For other operating systems, the location of these start-up files may be different. But the configuration files can be edited directly in the IDE:

# Install usethis
install.packages("usethis")
# Open configuration files
usethis::edit_r_environ()

Step 5: Shiny Server

       The best resource available for Shiny server is the administrative guide, which covers the most important information from system requirements to server management to hosting models to security.

Configure Shiny Server

       Important: The first thing, though, is to stop the server:

# Ubuntu
$ sudo systemctl stop shiny-server
# Redhat
$ sudo stop shiny-server

Other useful commands include:

# Ubuntu
$ sudo systemctl start shiny-server
$ sudo systemctl status shiny-server
$ sudo systemctl restart shiny-server
# Redhat
$ sudo start shiny-server
$ sudo status shiny-server
$ sudo restart shiny-server

       To configure Shiny server, we need to modify the default configuration file located at /etc/shiny-server/shiny-server.conf using GNU nano.

$ sudo nano /etc/shiny-server/shiny-server.conf

This should open the default configuration file:

       This configuration expects that your Shiny applications are located in the following path /srv/shiny-server/. For other hosting models, please see the following section of the administrative guide. There is one sample application in the path /srv/shiny-server/sample-apps/hello/. By default, Shiny Server listens (receives information) on port 3838, so the example application will be available at http://<server-address>:3838/sample-apps/hello/. I added the following directives to the configuration file (the list of all the directives that are supported in Shiny Server config files can be found here in section “7.2 Configuration Settings”):

  • I added the run_as directive followed by my username. For one, the paths in which R will look for packages (.libPaths()) are often user-dependent. Since the packages required to run a Shiny application are installed in my user-level library, I must run the application as the correct user. For locations configured with site_dir, the run_as setting will be used to determine which user should spawn the R Shiny processes. This setting can be configured globally, or for a particular server or location.

  • I added sanitize_errors off to report errors on the client. This is optional, since you could always check the log files located in /var/log/shiny-server using the less command.

$ cd /var/log/shiny-server
$ sudo less [file_name].log
  • I changed directory_index to off to disable the directory index page when a user visits the base URL, http://<public-ipv4>:3838.
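
Putting the three changes together, the edited file might look roughly like the draft below. This is a sketch, not a copy of my actual file: it assumes the stock configuration (a single server block listening on 3838 with one site_dir location), "username" is a placeholder, and placing sanitize_errors at the top level is an assumption on my part. The draft is written to a scratch file so that it can be reviewed before anything under /etc is touched.

```shell
# Write a candidate configuration to a scratch file for review.
# "username" is a placeholder for the account that owns the R library.
CONF_DRAFT="${TMPDIR:-/tmp}/shiny-server.conf.draft"

cat > "$CONF_DRAFT" <<'EOF'
# Run applications as this user (placeholder name)
run_as username;

# Show full R error messages in the browser
sanitize_errors off;

server {
  listen 3838;

  location / {
    # Host every app found under this directory
    site_dir /srv/shiny-server;
    log_dir /var/log/shiny-server;

    # Do not list the available apps at the base URL
    directory_index off;
  }
}
EOF

cat "$CONF_DRAFT"
```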

Reverse Proxy

       A reverse proxy is an application that sits in front of back-end applications and forwards client requests to those applications. An analogy that helped me understand this better is to think of the server as a house that has many doors, which are called ports. Only the TCP ports that some process is listening on are open to us. If we wish to obtain some information in the house (that is, the server), we must pass through one of these doors to retrieve the information from the information provider (a specific process, application, or service) associated with that port.

       Shiny server is one such information provider, and it is located at port 3838. In order to reach Shiny server, we must specify the port number in the URL we enter into the browser, for example http://<server-address>:3838. If we do not specify the port number, or if we specify the wrong number, we will not be able to obtain the data stream from Shiny server. The reverse proxy functions like a doorman at the main entrance of the house who brings us to the right information provider without our having to specify which door to pass through. It directs client requests to the appropriate back-end server.

       In other words, we simply need to type http://<server-address> or http://<server-address>/* (* means any sub-path) into our browser to speak to the reverse proxy, which fetches the right information for us directly. In speaking to the reverse proxy, we do not have to specify the port number since we will have already configured the proxy to know exactly which port we want to reach. For this task, we will use nginx to set up the reverse proxy, which should already be installed on your EC2 instance.

       To configure nginx, we first need to stop the service.

$ sudo service nginx stop
# Other useful commands
$ sudo service nginx start
$ sudo service nginx status

Next, we need to navigate to the directory where nginx is installed:

$ cd /etc/nginx
$ ls

       The results of ls may differ, sometimes substantially, depending on the AMI (and the operating system) that we are using. For instance, on Ubuntu, the default installation of nginx might create a sites-available and a sites-enabled directory. On RedHat/CentOS/Fedora, the default installation of nginx does not include such directories. For those operating systems, the default place to store the configuration files is the /etc/nginx/conf.d/ directory. In addition, in the /etc/nginx/nginx.conf configuration file, we must ensure that the include /etc/nginx/conf.d/*.conf; directive is added in the http block to tell nginx to pull in any files in the /etc/nginx/conf.d directory that have the extension .conf. On Ubuntu, the following setup steps are needed:

  • Having navigated to the /etc/nginx directory, we should see at least the following sub-directories (if not, we can create them):
conf.d sites-enabled nginx.conf  sites-available
  • Navigate to the sites-available directory and create a new configuration file specifically for Shiny server:
$ cd sites-available
$ sudo nano shiny.conf
  • Write the following block of directives in the shiny.conf file:
server {
    # Listen on 80 port
    listen 80;
    # For IPv6 addresses
    listen [::]:80;
    # The reverse proxy
    location / {
        proxy_pass http://127.0.0.1:3838/;
        proxy_redirect http://127.0.0.1:3838/ $scheme://$host/;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection $connection_upgrade;
        proxy_read_timeout 20d;
        proxy_buffering off;
    }
}

       When nginx proxies a request, it 1) sends the request (i.e., a client trying to access our shiny application or interactive document hosted on our EC2 instance) to a specified proxied server, 2) fetches the response, and 3) sends it back to the client. To understand the configuration above:

  • The proxy_pass directive passes all requests processed in location / to the proxied server at the specified address http://127.0.0.1:3838/. See more details on this here. Note also that the “/” prefix is used for matching requests. The location block above provides the shortest prefix (length one), and only if all other location blocks fail to provide a match will this block be used. Since we do not have any other location blocks at the moment, this one will be used.

  • The proxy_redirect directive does something like URL rewrite, replacing http://127.0.0.1:3838/ with variables $scheme://$host/. You can read more details on proxy_redirect here and on the components of a URL here. The full list of nginx variables can be found here.

  • The proxy_http_version directive sets the HTTP protocol version for proxying. By default, version 1.0 is used. More details here.

  • The two proxy_set_header directives enable WebSocket proxying. These headers have to be passed explicitly so that the proxied server knows the client’s intention to switch protocols to WebSocket.

  • By default, the WebSocket connection will be closed if the proxied server does not transmit any data within 60 seconds. This timeout can be increased with the proxy_read_timeout directive; here it is set to 20 days. The configuration measure units can be found here.

  • The final directive proxy_buffering turns response buffering off. Disabling response buffering is necessary for applications that need immediate access to the data stream according to the following article on nginx performance.

       Next, we need to create a shortcut (symbolic link) inside the sites-enabled directory. The reason is that the /etc/nginx/nginx.conf configuration file includes only the sites-enabled directory, so nginx never reads sites-available directly. We create the .conf files inside sites-available and create a shortcut inside sites-enabled to access them. One benefit of this is that, to temporarily deactivate access to Shiny, you only have to delete the shortcut and not the actual configuration file in sites-available:

$ cd /etc/nginx/sites-enabled
# Use absolute path
$ sudo ln -s /etc/nginx/sites-available/shiny.conf /etc/nginx/sites-enabled/
# To remove a symbolic link
$ sudo rm your-site-config  
  • Next, we need to add the following block to the configuration file located in /etc/nginx/nginx.conf as specified here. Note that you must add the following within the http block in the nginx configuration file:
map $http_upgrade $connection_upgrade {
    default upgrade;
    ''      close;
} 

You can use the following command to open the configuration file:

$ sudo nano /etc/nginx/nginx.conf
  • To test if the configuration files are syntactically correct, run the following:
$ sudo nginx -t

This should output the results below if the configuration test has passed:

nginx: the configuration file /etc/nginx/nginx.conf syntax is ok
nginx: configuration file /etc/nginx/nginx.conf test is successful
  • Important: By default, there will be a default configuration file located in the sites-available and sites-enabled directories. We must also remove them:
$ cd /etc/nginx/sites-enabled
$ sudo rm default
$ cd /etc/nginx/sites-available
$ sudo rm default
  • Finally, restart nginx:
$ sudo service nginx start
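
The sites-available and sites-enabled arrangement used above can be rehearsed safely in a scratch directory before touching /etc/nginx. The sketch below uses a temporary directory and a placeholder config file; it shows that removing the link in sites-enabled leaves the real file in sites-available untouched.

```shell
# Rehearse the sites-available / sites-enabled layout in a scratch directory.
DEMO_DIR="$(mktemp -d)"
mkdir "$DEMO_DIR/sites-available" "$DEMO_DIR/sites-enabled"
echo "# placeholder config" > "$DEMO_DIR/sites-available/shiny.conf"

# Enable the site: link the real file into sites-enabled (absolute path).
ln -s "$DEMO_DIR/sites-available/shiny.conf" "$DEMO_DIR/sites-enabled/"
ls -l "$DEMO_DIR/sites-enabled"

# Disable the site: remove only the link; the original file is untouched.
rm "$DEMO_DIR/sites-enabled/shiny.conf"
ls "$DEMO_DIR/sites-available"
```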

Step 6: Deployment

Remove Example Shiny Files

       To deploy your shiny application or interactive documents, we first need to remove the default index.html and sample-apps from the R Shiny server:

# Grant read/write/execute on the directory to all users
# (the leading 7 also sets the setuid, setgid, and sticky bits)
$ sudo chmod 7777 /srv/shiny-server/
$ sudo rm /srv/shiny-server/index.html
$ sudo rm -rf /srv/shiny-server/sample-apps

For other chmod options, see Chapter 7 of Matt Dancho’s book.
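
To get a feel for how octal modes translate into permissions, the sketch below applies two common modes to a scratch directory and prints the resulting permission string. The helper name perm_string is made up, and the modes shown are examples of narrower alternatives rather than what I used above.

```shell
PERM_DIR="$(mktemp -d)"

# Print the first ten characters of `ls -ld`, e.g. drwxr-xr-x.
perm_string() {
    ls -ld "$1" | cut -c1-10
}

chmod 755 "$PERM_DIR"   # owner: rwx, group: r-x, others: r-x
perm_string "$PERM_DIR"

chmod 775 "$PERM_DIR"   # owner: rwx, group: rwx, others: r-x
perm_string "$PERM_DIR"
```

Each digit is the sum of read (4), write (2), and execute (1) for the owner, the group, and everyone else, in that order.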

Github

       You could upload the source files of your shiny application or interactive document directly to your EC2 instance using the upload button in file pane of the Rstudio IDE:

However, in this post, we will opt to host the source files in a remote repository on GitHub, and then clone said repository from within the RStudio IDE on our EC2 instance. One huge benefit of this approach is version control, which allows us to easily keep track of changes to our source files over time. To set up the remote repository on GitHub for the source files, you could read the following chapter of Matt Dancho’s book. Some resources that helped me learn more about GitHub and git version control in the past are:

       I will assume that you now have a GitHub repository containing all of your source files and their dependencies. The next step is to configure your git user.name and user.email in the RStudio terminal using the following commands:

git config --global user.name 'Your Name'
git config --global user.email 'your_email@example.com'
git config --global credential.helper 'cache --timeout=10000000'

The third command tells git to cache your password for the next four months (about ten million seconds). For more details, please see the following tutorial by Jenny Bryan. Once you have configured your name and e-mail address in git, the following steps will clone the remote repository containing your source files from within the Rstudio IDE on your EC2 instance:

  • In your repository, click on the Code (green) button and copy the HTTPS URL in the drop-down menu.

  • Create a new project within your Rstudio IDE on your EC2 instance. Select Version Control.

  • Select “Clone a project from a Git repository” and enter your URL and the name of your git repository.

  • Once you create the project, you should see that your source files are located in the file pane of the IDE:

Deploying With Shiny Server

       You are now ready to copy the source files on your EC2 to /srv/shiny-server/. Recall from the previous section that we are deploying through a hosting model called site_dir, which hosts the entire directory tree at /srv/shiny-server. Run the following functions to create a sub-directory within /srv/shiny-server/ and copy your source files from your EC2 instance into said sub-directory. The name of the directory can be anything as long as it is syntactically valid:

# To create a new sub-directory
dir.create(path = "/srv/shiny-server/portfolio_dashboard")
# Copy the source file into the directory created above
file.copy("dashboard.Rmd", "/srv/shiny-server/portfolio_dashboard")

If you get an error that the file you are seeking to copy does not exist, check to make sure that the first file path is specified correctly. If you created the project using the steps above, your working directory should be the project directory on your EC2 instance. Some other useful functions and commands are:

# To remove files
file.remove("/srv/shiny-server/portfolio_dashboard/dashboard.Rmd")
# To list files in a directory
list.files(path = "/srv/shiny-server/portfolio_dashboard/")

In the terminal of our EC2 instance (not the RStudio IDE):

# To remove directories within /srv/shiny-server/
$ sudo rm -rf /srv/shiny-server/sub_directory
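If you prefer to do the copy step from the EC2 terminal rather than from R, a rough equivalent is sketched below. deploy_files is a hypothetical helper name, and the sketch assumes your user can write to the target directory; if it cannot, the mkdir and cp calls would need sudo. It copies only the files you name, so the .git folder of your project is not placed in the served directory.

```shell
# deploy_files TARGET_DIR FILE...
# Create TARGET_DIR if needed, copy the named files into it,
# then list the directory so you can confirm what is being served.
# (Hypothetical helper; assumes write permission on TARGET_DIR.)
deploy_files() {
    target="$1"
    shift
    mkdir -p "$target" &&
        cp "$@" "$target"/ &&
        ls "$target"
}

# Example call, run from the project directory:
# deploy_files /srv/shiny-server/portfolio_dashboard dashboard.Rmd
```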

       Now, in your browser, enter http://<public-ipv4>/<sub-directory>/ (if you created a sub-directory within /srv/shiny-server/) or http://<public-ipv4> (if you simply copied the source files to /srv/shiny-server/). Your application or interactive document should now be deployed at your EC2 instance’s elastic IP address.
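The address can also be checked from any terminal before opening a browser. The sketch below uses two hypothetical helpers: app_url builds the address from the public IPv4 address and an optional sub-directory name, and check_app asks curl to print only the HTTP status code. The IP address in the example is a documentation placeholder, not a real host.

```shell
# app_url HOST [SUBDIR]
# Print the URL at which Shiny Server is expected to serve the app.
app_url() {
    host="$1"
    subdir="${2:-}"
    if [ -n "$subdir" ]; then
        printf 'http://%s/%s/\n' "$host" "$subdir"
    else
        printf 'http://%s/\n' "$host"
    fi
}

# check_app URL
# Print only the HTTP status code returned for URL.
check_app() {
    curl -s -o /dev/null -w '%{http_code}\n' "$1"
}

# Example call (placeholder IP address):
# check_app "$(app_url 203.0.113.10 portfolio_dashboard)"
```

A status of 200 suggests the page is being served; a 301 is expected instead if you have already set up the redirect from HTTP to HTTPS.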

Step 9 (Optional): Password Protect

       Sometimes we may need to password protect our application. For this task, we will use nginx again. In particular, the ngx_http_auth_basic_module module allows for limiting access to resources by validating the user name and password using the “HTTP Basic Authentication” protocol.

Configure Nginx

       The following steps configure the nginx configuration file located at /etc/nginx/sites-available/shiny.conf to use the ngx_http_auth_basic_module. If you have not set up a domain name or switched to HTTPS, your configuration file may look different, but the steps should be very similar regardless. You essentially need to add two directives, auth_basic and auth_basic_user_file, to the location block inside your server block.

  • Stop nginx and open up the configuration file:
$ sudo service nginx stop
$ sudo nano /etc/nginx/sites-available/shiny.conf
  • Add the following directives to the existing location block with the shortest prefix “/”:
location / {
    proxy_pass http://127.0.0.1:3838/;
    proxy_redirect http://127.0.0.1:3838/ $scheme://$host/;
    proxy_http_version 1.1;
    proxy_set_header Upgrade $http_upgrade;
    proxy_set_header Connection $connection_upgrade;
    proxy_read_timeout 20d;
    proxy_buffering off;
    # Add the following
    auth_basic "Username and Password are required";
    auth_basic_user_file /etc/nginx/.htpasswd;
}

More details on nginx location blocks can be found here under section “Serving Static Content”. For more information on nginx server and location block selection algorithms, see the following article.

  • Your configuration file should now look similar to this:
server {
    listen 80;
    listen [::]:80;

    location / {
        return 301 https://$host$request_uri;
    }
}

server {
    listen 443 ssl http2;
    listen [::]:443 ssl http2;
    server_name dashwu.com;

    # The reverse proxy
    location / {
        proxy_pass http://127.0.0.1:3838/;
        proxy_redirect http://127.0.0.1:3838/ $scheme://$host/;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection $connection_upgrade;
        proxy_read_timeout 20d;
        proxy_buffering off;
        auth_basic "Username and Password are required";
        auth_basic_user_file /etc/nginx/.htpasswd;
    }

    ssl_certificate /etc/letsencrypt/live/dashwu.com/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/dashwu.com/privkey.pem;
    ssl_session_timeout 1d;
    ssl_session_cache shared:MozSSL:10m;  # About 40000 sessions
    ssl_session_tickets off;

    # curl https://ssl-config.mozilla.org/ffdhe2048.txt > /path/to/dhparam
    ssl_dhparam /etc/ssl/certs/dhparam.pem;

    # intermediate configuration
    ssl_protocols TLSv1.2 TLSv1.3;
    ssl_ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:DHE-RSA-AES128-GCM-S>
    ssl_prefer_server_ciphers off;

    # HSTS (ngx_http_headers_module is required) (63072000 seconds)
    add_header Strict-Transport-Security "max-age=63072000" always;

    # OCSP stapling
    ssl_stapling on;
    ssl_stapling_verify on;

    # verify chain of trust of OCSP response using Root CA and Intermediate certs
    ssl_trusted_certificate /etc/letsencrypt/live/dashwu.com/chain.pem;
}
  • After saving the file and starting nginx again, navigate to your elastic IP address. You should see that your application is now password protected.
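Since nginx was stopped before the file was edited, it has to be started again for the new directives to take effect. A cautious way to do this, sketched below under the assumption that nginx is managed through the service command as in the earlier steps, is to let nginx check the configuration first and start it only if the check passes. restart_nginx is a hypothetical helper name.

```shell
# restart_nginx
# Validate the configuration with "nginx -t"; start nginx only if it parses.
# (Hypothetical helper; assumes the service command used earlier in this post.)
restart_nginx() {
    sudo nginx -t && sudo service nginx start
}

# Example call:
# restart_nginx
```

If the check reports a syntax error, nginx stays stopped and the message points at the offending line in the configuration file. Once nginx is running, a request without credentials should be answered with status 401 until a valid username and password are supplied.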

User Login

       The final step is to create username-password pairs to access your application hosted on your EC2 instance at your elastic IP address. We could do so using apache2-utils (Debian, Ubuntu) or httpd-tools (RHEL/CentOS/Oracle Linux). In section “Upgrading and Installing System Packages”, we have already installed apache2-utils. To check if it is indeed installed:

$ dpkg --list | grep apache2-utils
# If not installed
$ sudo apt-get -y install apache2-utils
  • To create a password file and a first user, use the htpasswd utility with the -c flag (which stands for “create a new file”). The file pathname is the first argument and the username is the second argument:
$ sudo htpasswd -c /etc/nginx/.htpasswd user1

This should prompt you to enter a password for user1.

  • Creating additional user-password pairs does not require the -c flag since the file already exists:
$ sudo htpasswd /etc/nginx/.htpasswd user2
$ cat /etc/nginx/.htpasswd
  • To remove a user, delete that user’s entry from the file using the -D flag:
$ sudo htpasswd -D /etc/nginx/.htpasswd user2
  • To change the password for a user:
# Change password for user2
$ sudo htpasswd /etc/nginx/.htpasswd user2

Refer to the manual page to see other command line arguments.
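Printing the file with cat shows the password hashes along with the usernames. Each line of the file has the form username:hash, so the usernames alone can be listed by keeping only the first colon-separated field. The helper name list_htpasswd_users below is hypothetical, and the default path is the one used in this post.

```shell
# list_htpasswd_users [FILE]
# Print only the usernames stored in an htpasswd file
# (the part of each line before the first colon).
list_htpasswd_users() {
    cut -d: -f1 "${1:-/etc/nginx/.htpasswd}"
}

# Example call:
# list_htpasswd_users /etc/nginx/.htpasswd
```

This is a quick way to confirm that a user was added or removed without exposing the hashes on screen.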
