Graphing probability distributions with Shiny

A quick way to plot and compare two density functions.


Introduction

I am a Statistics undergraduate, so I usually work with probability distributions (Gaussian, chi-squared, etc.). I find it quite useful to visualize the distribution when I’m working on a probability problem, but I couldn’t find an interactive app for that purpose.

These past few weeks, I began learning about Shiny, an R package that allows users to easily build interactive web apps. With just the basics, I was able to build that app myself.

NOTE: I’m not claiming this is the most efficient way to achieve the goal. I only want to show you what Shiny is capable of with just a few lines of code. You can find the entire app here: https://github.com/pablovidal00/Probability-graphicator.git

Making the app

Every Shiny app consists of two components: the ui and the server.

  • The ui takes care of how your app looks and how you can interact with it.
  • The server handles the internal calculations necessary to display what you’ve requested.

UI

First, we need to create a page and ask the user to choose a distribution using fluidPage and selectInput.
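A minimal sketch of such a page is shown here. The label and the distribution names are illustrative assumptions and may differ from the ones in the repository; only the input name “dist” comes from the text.

```r
library(shiny)

# Sketch of the first part of the ui. The label and the choices are
# assumptions made for illustration, not the repository's exact values.
ui <- fluidPage(
  fluidRow(
    # Shiny's grid is 12 units wide; this column takes 2 of them.
    column(
      2,
      selectInput(
        inputId = "dist",
        label = "Distribution",
        # Nested lists group the options under separate headings.
        choices = list(
          Continuous = list("Normal", "Exponential"),
          Discrete = list("Binomial", "Poisson")
        )
      )
    )
  )
)
```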

This code does the following: it creates the page, creates a row, controls how much width we take with column (in this case, 2 of the 12 units in Shiny’s grid), and finally asks the user to select an option. The selected option is stored in an input called “dist”. With the use of lists, continuous and discrete distributions are shown as separate groups.

Now, we have to ask for the values of parameters. We face a problem: they depend on the chosen distribution. Luckily, conditionalPanel allows us to control what we display based on a condition expressed in the JavaScript language.

Example for the Normal Distribution
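As a sketch, assuming the selectInput above offers an option whose value is exactly “Normal”, the panel could look like this. The labels, initial values, and the minimum of 0.01 are illustrative assumptions.

```r
library(shiny)

# Column shown only while the JavaScript condition is true.
# Assumes the distribution selector is named "dist" and has a "Normal" option.
normal_params <- column(
  2,
  conditionalPanel(
    condition = "input.dist == 'Normal'",
    numericInput("mean", "Mean", value = 0),
    # min is a hint for the browser control; 0.01 is an arbitrary positive floor.
    numericInput("sd", "Standard deviation", value = 1, min = 0.01, step = 0.1)
  )
)
```

This column would sit inside the same fluidRow as the distribution selector.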

Here, we create a new column that only displays the numericInput controls if the selected distribution is Normal. We store the mean and standard deviation in inputs called “mean” and “sd”, and we give each of them an initial value. In the case of the standard deviation, we restrict it to positive values by setting a minimum. We need one conditionalPanel for each distribution we offer.

With this, the ui already does half its job!

The last component of the ui is the plot. In my app, I want it to be placed below the inputs so I create a new fluidRow.
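A minimal sketch of that row, assuming the plot should span the full width of the page:

```r
library(shiny)

# A second row, placed after the row of inputs inside fluidPage().
# The full-width column (12 units) is an assumption for illustration.
plot_row <- fluidRow(
  column(12, plotOutput("plot"))
)
```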

This tells the app to display a plot named “plot”, which will be created on the server. Now, the ui is finished.

To plot two distributions, create new columns, copy the code that asks for the distribution and its parameters, and add a second plot. The input names must be different (“dist2”, “mean2”, etc.).

Server

The server is a function that takes three arguments: input, output, and session. The first one behaves like a list with all the inputs created in the ui, while the second is a list of what the server produces. For example, “input$mean” is the value chosen for the parameter “mean” and “output$plot” is the plot the ui displays.

One of the differences between standard R and Shiny is that code that reads the user’s inputs cannot run just anywhere in the server: it has to live inside a reactive context, such as a render function. A render function takes care of building the object the ui receives, such as a plot or text. So, how can we refer to the values supplied by the user? The answer is reactive. It tells Shiny to recompute the value, and everything that depends on it, each time the user changes the input.
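As a minimal sketch, assuming the ui defines numeric inputs named “mean” and “sd”, the reactive variables could be declared like this:

```r
library(shiny)

server <- function(input, output, session) {
  # Each reactive is recomputed whenever the matching input changes.
  # Inside this function the names shadow base::mean() and stats::sd().
  mean <- reactive({ input$mean })
  sd   <- reactive({ input$sd })
}
```

A reactive is read by calling it like a function, so the current value is mean(), not mean.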

For example, “mean” here will change each time the user changes the parameter “input$mean”. It is important to distinguish the two: “input$mean” is the value coming from the ui, while “mean” is a reactive defined in the server. For every possible parameter of a distribution, we should have a reactive variable. Now, we are ready to build the plot.
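A sketch of what such a plot could look like for the Normal distribution follows. It reuses the variable names discussed in the text, but several details are assumptions: “six standard deviations” is read as three on each side of the mean, and the color, titles, number of points, and the 10% headroom on the y-axis are illustrative choices rather than the repository’s exact code.

```r
library(shiny)

server <- function(input, output, session) {
  # Reactive wrappers around the ui inputs "mean" and "sd".
  mean <- reactive(input$mean)
  sd   <- reactive(input$sd)

  output$plot <- renderPlot({
    # Wait until both inputs hold usable values.
    req(is.finite(mean()), is.finite(sd()), sd() > 0)

    # Assumption: the x-range covers three standard deviations on each side.
    inf <- mean() - 3 * sd()
    max <- mean() + 3 * sd()
    points <- seq(inf, max, length.out = 500)

    # Density of the Normal distribution at every point of the sequence.
    Density <- dnorm(points, mean = mean(), sd = sd())

    # <<- stores the limit outside renderPlot so a second plot could reuse it.
    yl <<- 1.1 * base::max(Density)

    plot(points, Density, type = "l", col = "blue",
         xlab = "x", main = "Normal distribution", ylim = c(0, yl))
    # Dashed vertical line at the mean.
    abline(v = mean(), lty = 2)
  })
}
```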

Let’s take a look at this carefully:

  • “inf” and “max” are the limits of the sequence of points. Six standard deviations seem enough to visualize the distribution.
  • “points” is the sequence itself. It is passed as an argument to dnorm, which returns the density at each point of the vector.
  • “Density” is what is going to be plotted on the y-axis, while “points” is on the x-axis.
  • “yl” is the y-axis limit. It is assigned with the operator <<- so that it is available outside this plot and can be used as a parameter in a second plot, in order to compare two distributions on the same scale.
  • The arguments in plot just specify the type (line), the color, the title of the x-axis, the title of the plot and the limit of the y-axis.
  • abline adds a vertical line at the mean of the distribution.

As in the other parts of the app, this is only for the Normal distribution. To extend it, just add whatever distributions you like in the switch commands and specify the values they should take.
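One possible shape for such a switch is sketched below. The helper density_for, the params list, and the “Exponential” branch are hypothetical names introduced for illustration; they are not taken from the repository.

```r
# Hypothetical helper: returns the density of x under the chosen distribution.
# params is an assumed list holding the parameters that distribution needs.
density_for <- function(dist, x, params) {
  switch(
    dist,
    "Normal" = dnorm(x, mean = params$mean, sd = params$sd),
    "Exponential" = dexp(x, rate = params$rate),
    # An unnamed last argument is the default when nothing matches.
    stop("Unsupported distribution: ", dist)
  )
}

# Usage: density of a standard Normal at 0.
density_for("Normal", 0, list(mean = 0, sd = 1))
```

Adding a distribution then means adding one branch here and one conditionalPanel in the ui.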

Result

Adding the code necessary for the second distribution, here’s how it looks.

Conclusions

This app could be a lot more complex, but the point was to present a beginner-friendly project. You could try optimizing the calculations, changing the layout of the app or even using ggplot2 for the plot.
