How to read violin plots

Understanding a violin plot in more detail

In general, a violin plot is a method of plotting numeric data and can be considered a combination of a box plot and a kernel density plot.

A violin plot contains the same information as a box plot:

  • Median (a white dot on the violin plot)
  • Interquartile range (the black bar in the center of the violin)
  • The lower/upper adjacent values (the black lines stretched from the bar): the most extreme observations that still lie within first quartile − 1.5 IQR and third quartile + 1.5 IQR, respectively. These two limits are used in a simple outlier detection technique (Tukey’s fences): observations lying outside of these “fences” can be considered outliers.
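As a rough illustration of the statistics listed above, the following sketch computes the median, the interquartile range, Tukey's fences, and the adjacent values for a small made-up sample. The function name `tukey_summary`, the sample data, and the choice of the "inclusive" quartile method are assumptions made for this example; plotting tools may use a slightly different quartile convention.

```python
import statistics

def tukey_summary(values):
    """Compute the box-plot statistics drawn at the center of a violin plot."""
    data = sorted(values)
    # Quartile conventions differ between tools; "inclusive" is an assumption here.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    # Tukey's fences: 1.5 IQR below the first and above the third quartile.
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Adjacent values are the most extreme observations inside the fences.
    inside = [x for x in data if lower_fence <= x <= upper_fence]
    outliers = [x for x in data if x < lower_fence or x > upper_fence]
    return {
        "median": median,
        "q1": q1,
        "q3": q3,
        "iqr": iqr,
        "lower_fence": lower_fence,
        "upper_fence": upper_fence,
        "lower_adjacent": inside[0],
        "upper_adjacent": inside[-1],
        "outliers": outliers,
    }

sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30]  # made-up data for illustration
summary = tukey_summary(sample)
for name, value in summary.items():
    print(f"{name}: {value}")
```

In this sample the value 30 lies above the upper fence, so it would be flagged as an outlier, and the upper line of the violin's inner box would end at the largest remaining observation.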

The advantage of the violin plot over the box plot is that, aside from showing the above-mentioned statistics, it also shows the entire distribution of the data. This is especially useful when dealing with multimodal data, i.e., a distribution with more than one peak, because the summary statistics alone do not reveal how many peaks there are.
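To make the point about multimodal data concrete, here is a minimal sketch of the kernel density estimate that gives a violin its outline. It uses a Gaussian kernel with a fixed bandwidth of 0.5 on a made-up two-cluster sample; the helper `gaussian_kde`, the bandwidth, and the data are assumptions for illustration only, and real plotting libraries typically choose the bandwidth automatically.

```python
import math
import statistics

def gaussian_kde(data, bandwidth):
    """Return a function that evaluates a Gaussian kernel density estimate."""
    norm = len(data) * bandwidth * math.sqrt(2 * math.pi)

    def density(x):
        # Sum one Gaussian bump per observation, then normalize.
        return sum(math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in data) / norm

    return density

# Made-up bimodal sample: one cluster around 2, another around 8.
sample = [1.0, 1.5, 2.0, 2.5, 3.0, 7.0, 7.5, 8.0, 8.5, 9.0]
density = gaussian_kde(sample, bandwidth=0.5)  # bandwidth is an assumed value

# Evaluate the density on a grid and collect its local maxima (the peaks).
grid = [i / 10 for i in range(101)]
heights = [density(x) for x in grid]
peaks = [
    grid[i]
    for i in range(1, len(grid) - 1)
    if heights[i - 1] < heights[i] > heights[i + 1]
]

median = statistics.median(sample)
print("density peaks near:", peaks)
print("median:", median)
```

For this sample the density has one peak per cluster, while the median falls in the sparse region between them. A box plot would show only the median and quartiles, whereas the violin's outline would make both peaks visible.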

For example, here is a WindESCo-generated violin plot: