Labeled Scatter Plots and Bubble Charts in R

The rhtmlLabeledScatter R package, available on GitHub, attempts to solve three challenges with labeled scatter plots: readability with large numbers of labels, readability with bubbles, and the use of images.


Four solutions for overlapping labels

1. Automatically arranging labels so they do not overlap

If you look at the scatter plot below, you should immediately see the most obvious way that the package deals with overlapping labels: labels are automatically re-arranged so that they do not overlap. Lines connect labels to their points.
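As a rough illustration of how such a plot might be produced, the sketch below builds a small data set from R's built-in mtcars data and passes it to the package. The argument names X, Y and label, and the GitHub repository path in the comment, are assumptions based on the package documentation and should be checked against the version you install. The data frame name car_points is made up for this example.

```r
# Sketch only: argument names (X, Y, label) are assumed from the package
# documentation; verify them against your installed version.
# Assumed install command:
#   devtools::install_github("Displayr/rhtmlLabeledScatter")

# One labeled point per car model, taken from the built-in mtcars data
car_points <- data.frame(
  model  = rownames(mtcars),
  weight = mtcars$wt,
  mpg    = mtcars$mpg,
  stringsAsFactors = FALSE
)

# Only draw the widget if the package is available
if (requireNamespace("rhtmlLabeledScatter", quietly = TRUE)) {
  widget <- rhtmlLabeledScatter::LabeledScatter(
    X     = car_points$weight,
    Y     = car_points$mpg,
    label = car_points$model
  )
  print(widget)
}
```

With 32 labels crowded into a small area, this is the kind of input where automatic label placement matters most.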

2. Allowing viewers to move labels using drag-and-drop

The second option for dealing with overlapping labels is that they are draggable. If you are viewing this visualization on a device with a mouse, you can click and drag the labels to rearrange them and make them even more readable. If you do this on a software platform that can remember the state of an HTMLwidget, such as Displayr, the final position where you leave each label is remembered.

3. Labels can be dragged off the plot

The third option is that you can drag the labels off the plot, which causes them to be added to a legend. A notation on the relevant axis shows the direction of any removed labels (try this for yourself).

4. Tooltips on hover

The fourth option for addressing overlapping labels is the use of tooltips. Hover your mouse over any point and you can see its label.

Bubble charts

The four tools for addressing overlapping labels are also all available for bubble charts, as illustrated below.
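A bubble chart can be sketched in much the same way as a plain labeled scatter plot, with a third variable controlling bubble size. The example below again uses the built-in mtcars data. The argument names Z (bubble size) and group (color grouping) are assumptions based on the package documentation, and bubble_points is a name invented for this example.

```r
# Sketch only: Z (bubble size) and group (color grouping) are assumed
# argument names; check them against the package documentation.

bubble_points <- data.frame(
  model     = rownames(mtcars),
  weight    = mtcars$wt,
  mpg       = mtcars$mpg,
  power     = mtcars$hp,                      # drives bubble size
  cylinders = paste(mtcars$cyl, "cylinders"), # drives bubble color
  stringsAsFactors = FALSE
)

# Only draw the widget if the package is available
if (requireNamespace("rhtmlLabeledScatter", quietly = TRUE)) {
  widget <- rhtmlLabeledScatter::LabeledScatter(
    X     = bubble_points$weight,
    Y     = bubble_points$mpg,
    Z     = bubble_points$power,
    label = bubble_points$model,
    group = bubble_points$cylinders
  )
  print(widget)
}
```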



Images

Images can also be used on the scatter plots. The images are automatically rearranged to avoid overlaps, as shown in the example below.


Trend arrows

The last example, shown below, uses trends to show movement over time on the scatter plot.
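One way a trend plot like this might be set up is to supply one row per item per time period, ordered by period within each item, and to use the item as the group. The sketch below assumes that the package connects the points within each group in row order when an argument named trend.lines.show is set to TRUE. That argument name and behavior are taken from the package documentation as I understand it and should be verified. The brand names and values are made up purely for illustration.

```r
# Sketch only: trend.lines.show is an assumed argument name, and the
# values below are invented for illustration, not real data.

trend_points <- data.frame(
  brand    = rep(c("Brand A", "Brand B"), each = 3),
  period   = rep(c(2015, 2016, 2017), times = 2),
  quality  = c(3.1, 3.6, 4.2, 4.5, 4.1, 3.8),
  price    = c(2.0, 2.4, 2.9, 3.5, 3.4, 3.0),
  stringsAsFactors = FALSE
)

# Keep rows ordered by period within each brand so that any trend is
# drawn from the earliest point to the latest
trend_points <- trend_points[order(trend_points$brand, trend_points$period), ]

# Only draw the widget if the package is available
if (requireNamespace("rhtmlLabeledScatter", quietly = TRUE)) {
  widget <- rhtmlLabeledScatter::LabeledScatter(
    X     = trend_points$quality,
    Y     = trend_points$price,
    label = trend_points$brand,
    group = trend_points$brand,
    trend.lines.show = TRUE
  )
  print(widget)
}
```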


The source code

Log in to Displayr to access the R source code: click on a chart and, from the object inspector, select Properties > R CODE.

About Tim Bock

Tim Bock is the founder of Displayr. Tim is a data scientist, who has consulted, published academic papers, and won awards, for problems/techniques as diverse as neural networks, mixture models, data fusion, market segmentation, IPO pricing, small sample research, and data visualization. He has conducted data science projects for numerous companies, including Pfizer, Coca Cola, ACNielsen, KFC, Weight Watchers, Unilever, and Nestle. He is also the founder of Q, a data science product designed for survey research, which is used by all the world’s seven largest market research consultancies. He studied econometrics, maths, and marketing, and has a University Medal and PhD from the University of New South Wales (Australia’s leading research university), where he was an adjunct member of staff for 15 years.
