How to Make a Scatter Plot in ggplot2

Your First ggplot: Scatter Plot

Scatter plots are one of the most fundamental and powerful visualization types. They reveal relationships between two continuous variables and are your gateway to understanding ggplot2.

Perfect Starting Point: Scatter plots demonstrate all core ggplot2 concepts – data mapping, geometric layers, aesthetics, and customization – making them the ideal first visualization to master.

The Absolute Basics

1. Minimum Viable Scatter Plot

The simplest scatter plot requires just three components: data, aesthetic mapping, and points geometry.

# Load ggplot2
library(ggplot2)

# Basic scatter plot using mtcars dataset
ggplot(data = mtcars, aes(x = wt, y = mpg)) +
  geom_point()

Output: Basic scatter plot showing car weight vs MPG

This basic plot shows the relationship between car weight and fuel efficiency. Each point represents one vehicle.

Step-by-Step Scatter Plot Creation

Step 1: Prepare Your Data

Make sure your data is in a data frame. ggplot2 works best with tidy data, where each row is one observation and each column is one variable.
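Both points can be checked before plotting. The following is a minimal sketch using base R only; the matrix `m` and the data frame `df` are hypothetical names introduced for illustration, standing in for data that did not start out as a data frame.

```r
# Confirm the data is a data frame and the plotted columns are numeric
is.data.frame(mtcars)            # TRUE for the built-in mtcars data
str(mtcars[, c("wt", "mpg")])    # both columns are stored as numbers

# Hypothetical case: data that starts as a matrix is converted first
m <- as.matrix(mtcars[, c("wt", "mpg")])
df <- as.data.frame(m)
is.data.frame(df)
```

Once the data passes these checks, its columns can be referred to by name inside aes().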

# Check your data structure (first two rows of the output shown)
head(mtcars)
#>                mpg cyl disp  hp drat    wt  qsec vs am gear carb
#> Mazda RX4     21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
#> Mazda RX4 Wag 21.0   6  160 110 3.90 2.875 17.02  0  1    4    4

Step 2: Initialize the Plot

Start with ggplot() and specify your data and basic mappings.

# Initialize plot with data and aesthetics
p <- ggplot(data = mtcars,
       aes(x = wt, y = mpg))

Step 3: Add Points Layer

Use geom_point() to add scatter plot points.

# Add the points geometry
p <- p + geom_point()

Step 4: Display the Plot

Print the plot object to see your visualization.

# Display the plot
p

Output: Step-by-step result (final scatter plot)

The completed basic scatter plot shows the inverse relationship between weight and MPG.
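Typing the object name draws the plot at the console, but inside a loop or a function nothing is printed automatically. As a rough sketch of that case, the loop below calls print() explicitly; the loop variable `x_var` is an illustrative name, and the `.data` pronoun is used to look up a column from a string.

```r
library(ggplot2)

# Build one scatter plot per x variable. Inside a loop, auto-printing is off,
# so print() is needed to actually draw each plot.
for (x_var in c("wt", "hp")) {
  p <- ggplot(mtcars, aes(x = .data[[x_var]], y = mpg)) +
    geom_point() +
    labs(x = x_var)
  print(p)
}
```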

Essential geom_point() Parameters

Parameter | Description                                  | Example        | Use Case
size      | Point size (numeric)                         | size = 3       | Make points more visible
color     | Point color (outline color for shapes 21-25) | color = "blue" | Change point appearance
fill      | Point fill color (shapes 21-25 only)         | fill = "red"   | Filled point shapes
alpha     | Transparency (0 = invisible, 1 = opaque)     | alpha = 0.5    | Overlapping points
shape     | Point shape (0-25)                           | shape = 17     | Different point markers
stroke    | Border thickness                             | stroke = 1.5   | Emphasize point borders
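To see how several of these parameters interact, here is a small sketch that sets them together. It assumes shape 21, a circle with a separate outline and interior, so that both color and fill have a visible effect; the object name `p_filled` is illustrative.

```r
library(ggplot2)

# Shape 21 is a circle with a separate outline (color) and interior (fill)
p_filled <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point(
    shape = 21, color = "black", fill = "red",
    size = 3, stroke = 1.5, alpha = 0.5
  )
p_filled
```

With a solid shape such as 16, fill would be ignored and color alone would set the point color.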

Adding Variables Through Aesthetics

2. Color by Category

Map a categorical variable to color to reveal patterns across groups.

# Color points by number of cylinders
ggplot(mtcars, aes(x = wt, y = mpg, color = factor(cyl))) +
  geom_point(size = 3)

Output: Scatter plot colored by cylinder count

Colors reveal that 4-cylinder cars (red) are generally lighter and more fuel-efficient.

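Note: Use factor() to convert a numeric variable such as cyl to a categorical one. Without it, ggplot2 treats cyl as continuous and draws a color gradient instead of one distinct color per group.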
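The difference is easiest to see side by side. This sketch builds both versions; the object names `p_gradient` and `p_discrete` are illustrative.

```r
library(ggplot2)

# cyl is stored as a number, so mapping it directly gives a continuous gradient
p_gradient <- ggplot(mtcars, aes(x = wt, y = mpg, color = cyl)) +
  geom_point(size = 3)

# Wrapping it in factor() gives one discrete color per cylinder count
p_discrete <- ggplot(mtcars, aes(x = wt, y = mpg, color = factor(cyl))) +
  geom_point(size = 3)

class(mtcars$cyl)            # "numeric"
levels(factor(mtcars$cyl))   # "4" "6" "8"
```

Print each object in turn to compare the legends: a color bar for the first, three separate keys for the second.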

3. Size by Continuous Variable

Map a continuous variable to point size to show a third dimension.

# Size points by horsepower
ggplot(mtcars, aes(x = wt, y = mpg, size = hp)) +
  geom_point(alpha = 0.7)

Output: Scatter plot with point size representing horsepower

Larger points indicate higher horsepower. Notice how high-horsepower cars tend to be heavier and less efficient.

4. Combine Multiple Aesthetics

Use color and size together to visualize several variables in one plot.

# Multiple aesthetics: color by cyl, size by hp
ggplot(mtcars, aes(x = wt, y = mpg,
   color = factor(cyl), size = hp)) +
  geom_point(alpha = 0.7) +
  labs(color = "Cylinders", size = "Horsepower")

Output: Scatter plot with color and size aesthetics

This plot shows four variables simultaneously: weight and MPG (position), cylinders (color), and horsepower (size).
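Shape can be added in the same way. The sketch below is one possible extension rather than part of the original example: it maps the transmission column am (0 = automatic, 1 = manual in mtcars) to shape, after converting it to a labeled factor.

```r
library(ggplot2)

# Shape needs a discrete variable, so am (0 = automatic, 1 = manual) is
# converted to a labeled factor first
transmission <- factor(mtcars$am, labels = c("Automatic", "Manual"))

ggplot(mtcars, aes(x = wt, y = mpg,
                   color = factor(cyl), size = hp,
                   shape = factor(am, labels = c("Automatic", "Manual")))) +
  geom_point(alpha = 0.7) +
  labs(color = "Cylinders", size = "Horsepower", shape = "Transmission")
```

Three legends on a 32-point plot is already busy, so it is worth checking whether each added aesthetic still helps the reader.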

Customization Techniques

Point Styling

Control appearance with fixed parameters outside aes().

geom_point(
  color = "darkblue",
  size = 3,
  alpha = 0.6,
  shape = 16
)
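A frequent slip is to put a fixed value inside aes(). As a minimal sketch, the two plots below contrast the fixed form with the accidental mapping; the object names `p_fixed` and `p_mapped` are illustrative.

```r
library(ggplot2)

# Fixed styling: every point is dark blue, and no legend is drawn
p_fixed <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point(color = "darkblue", size = 3, alpha = 0.6, shape = 16)

# Common slip: inside aes(), "darkblue" is treated as a data value, so the
# points get a default palette color and a legend key labeled "darkblue"
p_mapped <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point(aes(color = "darkblue"), size = 3)
```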

Color Scales

Customize how colors map to categories.

scale_color_manual(
  values = c("4" = "red",
    "6" = "blue", "8" = "green")
)
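On its own the scale does nothing; it has to be added to a plot whose color aesthetic is mapped to a matching factor. Here is a sketch of the fragment in context, with the colors held in an illustrative vector named `cyl_colors`.

```r
library(ggplot2)

# Named values tie each color to a factor level, regardless of level order
cyl_colors <- c("4" = "red", "6" = "blue", "8" = "green")

ggplot(mtcars, aes(x = wt, y = mpg, color = factor(cyl))) +
  geom_point(size = 3) +
  scale_color_manual(values = cyl_colors, name = "Cylinders")
```

The names in the vector must match the factor levels exactly ("4", "6", "8"), otherwise the unmatched groups lose their color.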

Size Scales

Control how sizes map to values.

scale_size_continuous(
  range = c(1, 10)
)

Labels & Titles

Add informative labels and titles.

labs(
  title = "Car Weight vs MPG",
  x = "Weight (1000 lbs)",
  y = "Miles per Gallon"
)

Complete Scatter Plot Example

5. Production-Ready Scatter Plot

A fully customized scatter plot suitable for reports and presentations.

# Complete customized scatter plot
ggplot(mtcars, aes(x = wt, y = mpg,
   color = factor(cyl), size = hp)) +
  geom_point(alpha = 0.7, shape = 16) +
  scale_color_brewer(palette = "Set1",
    name = "Cylinders") +
  scale_size_continuous(
    name = "Horsepower",
    range = c(2, 8)
  ) +
  labs(
    title = "Vehicle Performance Analysis",
    subtitle = "Relationship between weight, fuel efficiency, and power",
    x = "Weight (1000 lbs)",
    y = "Miles per Gallon"
  ) +
  theme_minimal()

Output: Fully customized production-ready scatter plot

This polished plot includes proper labeling, a professional color scheme, clear legends, and a clean theme suitable for reports.
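For a report or presentation, the plot usually needs to be written to a file. One way to do that is ggsave(), sketched below on a simplified plot; the file name, size, and resolution are assumptions chosen for illustration, and the file is written to a temporary directory.

```r
library(ggplot2)

p_report <- ggplot(mtcars, aes(x = wt, y = mpg, color = factor(cyl))) +
  geom_point(size = 3) +
  theme_minimal()

# Hypothetical file name and dimensions; adjust for the target document
out_file <- file.path(tempdir(), "scatter_plot.png")
ggsave(out_file, plot = p_report, width = 7, height = 5, dpi = 300)
```

Width and height are in inches by default, and the output format is taken from the file extension.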

Common Patterns and Variations

With Regression Line

ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  geom_smooth(method = "lm")

Output: Scatter plot with regression line
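The line drawn by geom_smooth(method = "lm") is an ordinary least-squares fit of mpg on wt. As a sketch, the same model can be fitted directly with lm() to read off its coefficients; `fit` is an illustrative name.

```r
library(ggplot2)

# Fit the same linear model that geom_smooth(method = "lm") draws
fit <- lm(mpg ~ wt, data = mtcars)
coef(fit)   # intercept about 37.29, slope about -5.34 mpg per 1000 lbs

# se = FALSE hides the confidence band around the fitted line
ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE)
```

Passing formula = y ~ x explicitly also silences the message ggplot2 prints when it picks the formula itself.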

Faceted by Group

ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  facet_wrap(~cyl)

Output: Faceted scatter plot by cylinders

Jittered Points

# Jitter adds a little random noise so points sharing an x value do not overlap
ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
  geom_jitter(width = 0.2)

Output: Jittered scatter plot
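By default geom_jitter() also adds a small amount of vertical noise, and the result changes on every run. If that matters, one option (a sketch, not the only approach) is geom_point() with position_jitter(), which allows the vertical jitter to be switched off and a seed to be set; the seed value 42 is arbitrary.

```r
library(ggplot2)

# height = 0 keeps the true mpg values; seed makes the jitter reproducible
p_jitter <- ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
  geom_point(
    position = position_jitter(width = 0.2, height = 0, seed = 42),
    alpha = 0.7
  )
p_jitter
```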

Text Labels

# Add text instead of points
ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_text(aes(label = rownames(mtcars)))

Output: Scatter plot with text labels
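Row names are not a real column, so a common alternative is to copy them into one and map that column instead. The sketch below does this in a copy of the data; `cars` and `model` are illustrative names, and the label offset and size are arbitrary choices.

```r
library(ggplot2)

# Copy the row names into a regular column so the label is part of the data
cars <- mtcars
cars$model <- rownames(mtcars)

ggplot(cars, aes(x = wt, y = mpg)) +
  geom_point() +
  geom_text(aes(label = model), vjust = -0.8, size = 3, check_overlap = TRUE)
```

Here check_overlap = TRUE drops labels that would be drawn on top of earlier ones, which keeps crowded regions readable at the cost of leaving some points unlabeled.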

Troubleshooting Common Issues

Problem: “Error: Aesthetics must be either length 1 or the same as the data”
Solution: Ensure all variables in aes() are columns in your data frame, not separate vectors.
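As a sketch of that fix, the example below starts from a hypothetical external vector named `group` and stores it as a column before plotting; the 150 hp cutoff and the labels are made up for illustration.

```r
library(ggplot2)

# Hypothetical external vector: one group label per car, kept outside the data
group <- ifelse(mtcars$hp > 150, "High power", "Low power")

# Safer pattern: store the vector as a column, then map the column by name
cars <- mtcars
cars$power_group <- group

ggplot(cars, aes(x = wt, y = mpg, color = power_group)) +
  geom_point(size = 3)
```

Keeping the variable inside the data frame also means it stays aligned with the other columns if the rows are later filtered or reordered.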

Problem: Points not showing colors as expected
Solution: Remember the difference: color = "blue" (fixed) vs aes(color = variable) (mapped to data).

Problem: Overplotting with many points
Solution: Use alpha = 0.5 for transparency or geom_jitter() for slight position adjustments.

Pro Tip: Always start simple and build complexity gradually. Create a basic scatter plot first, then add colors, sizes, and other aesthetics one at a time to ensure each layer works as expected.

Congratulations! You’ve created your first ggplot2 scatter plot. In our next tutorial, we’ll explore how to customize colors, themes, and layouts to make your visualizations publication-ready.
