Your First ggplot: Scatter Plot
Scatter plots are one of the most fundamental and powerful visualization types. They reveal relationships between two continuous variables and are your gateway to understanding ggplot2.
Perfect Starting Point: Scatter plots demonstrate all core ggplot2 concepts – data mapping, geometric layers, aesthetics, and customization – making them the ideal first visualization to master.
The Absolute Basics
Minimum Viable Scatter Plot
The simplest scatter plot requires just three components: data, aesthetic mapping, and points geometry.
library(ggplot2)
# Basic scatter plot using mtcars dataset
ggplot(data = mtcars, aes(x = wt, y = mpg)) +
  geom_point()
Step-by-Step Scatter Plot Creation
Make sure your data is in a data frame. ggplot2 works best with tidy data: one row per observation and one column per variable.
head(mtcars)
#> mpg cyl disp hp drat wt qsec vs am gear carb
#> Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
#> Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
Start with ggplot() and specify your data and basic mappings.
p <- ggplot(data = mtcars,
            aes(x = wt, y = mpg))
Use geom_point() to add scatter plot points.
p <- p + geom_point()
Print the plot object to see your visualization.
p
Essential geom_point() Parameters
| Parameter | Description | Example | Use Case |
|---|---|---|---|
| size | Point size (numeric) | size = 3 | Make points more visible |
| color | Point color (outline color for shapes 21-25) | color = "blue" | Change point appearance |
| fill | Point fill color (shapes 21-25 only) | fill = "red" | Filled point shapes |
| alpha | Opacity, from 0 (transparent) to 1 (opaque) | alpha = 0.5 | Overlapping points |
| shape | Point shape (0-25) | shape = 17 | Different point markers |
| stroke | Border thickness | stroke = 1.5 | Emphasize point borders |
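The fill parameter only has a visible effect on shapes 21-25, which have a separate interior and outline. A minimal sketch, assuming shape 21 (a fillable circle) and arbitrary illustrative colors:

```r
library(ggplot2)

# Shape 21 is a circle with a separate outline (color) and interior (fill)
p_filled <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point(
    shape = 21,       # fillable circle
    fill = "red",     # interior color
    color = "black",  # outline color
    size = 3,
    stroke = 1.5      # outline thickness
  )

p_filled
```

With a solid shape such as 16, the same fill argument would be silently ignored and color alone would set the point color.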
Adding Variables Through Aesthetics
Color by Category
Map a categorical variable to color to reveal patterns across groups.
ggplot(mtcars, aes(x = wt, y = mpg, color = factor(cyl))) +
  geom_point(size = 3)
Note: Use factor() to convert a numeric variable to a categorical one. Without it, ggplot2 treats cyl as continuous and draws a color gradient instead of one distinct color per group.
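To see why factor() matters here, the two mappings can be compared side by side. This is a small sketch; the object names p_gradient and p_discrete are only illustrative.

```r
library(ggplot2)

# cyl is stored as a number, so ggplot2 treats it as continuous by default
is.numeric(mtcars$cyl)

# Continuous mapping: a color gradient with a colorbar legend
p_gradient <- ggplot(mtcars, aes(x = wt, y = mpg, color = cyl)) +
  geom_point(size = 3)

# Discrete mapping: one distinct color per cylinder count (4, 6, 8)
p_discrete <- ggplot(mtcars, aes(x = wt, y = mpg, color = factor(cyl))) +
  geom_point(size = 3)

p_gradient
p_discrete
```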
Size by Continuous Variable
Map a continuous variable to point size to show a third dimension.
ggplot(mtcars, aes(x = wt, y = mpg, size = hp)) +
  geom_point(alpha = 0.7)
Combine Multiple Aesthetics
Combine several aesthetics, such as color and size, to visualize more than two variables in one plot.
ggplot(mtcars, aes(x = wt, y = mpg,
                   color = factor(cyl), size = hp)) +
  geom_point(alpha = 0.7) +
  labs(color = "Cylinders", size = "Horsepower")
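Shape can be mapped in the same way as a further aesthetic. The sketch below assumes transmission type (the am column, 0 = automatic, 1 = manual) is the extra variable of interest; that choice is an illustration, not part of the example above.

```r
library(ggplot2)

# shape needs a discrete variable, so the 0/1 am column is wrapped in factor()
p_multi <- ggplot(mtcars, aes(x = wt, y = mpg,
                              color = factor(cyl),
                              size = hp,
                              shape = factor(am))) +
  geom_point(alpha = 0.7) +
  labs(color = "Cylinders",
       size = "Horsepower",
       shape = "Transmission (0 = auto, 1 = manual)")

p_multi
```

Four variables on one plot can get hard to read, so it is worth checking whether each added aesthetic still earns its place.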
Customization Techniques
Point Styling
Control appearance with fixed parameters outside aes().
ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point(
    color = "darkblue",
    size = 3,
    alpha = 0.6,
    shape = 16
  )
Color Scales
Customize how colors map to categories.
ggplot(mtcars, aes(x = wt, y = mpg, color = factor(cyl))) +
  geom_point() +
  scale_color_manual(
    values = c("4" = "red", "6" = "blue", "8" = "green")
  )
Size Scales
Control how sizes map to values.
ggplot(mtcars, aes(x = wt, y = mpg, size = hp)) +
  geom_point() +
  scale_size_continuous(range = c(1, 10))
Labels & Titles
Add informative labels and titles.
ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  labs(
    title = "Car Weight vs MPG",
    x = "Weight (1000 lbs)",
    y = "Miles per Gallon"
  )
Complete Scatter Plot Example
Production-Ready Scatter Plot
A fully customized scatter plot suitable for reports and presentations.
ggplot(mtcars, aes(x = wt, y = mpg,
                   color = factor(cyl), size = hp)) +
  geom_point(alpha = 0.7, shape = 16) +
  scale_color_brewer(palette = "Set1",
                     name = "Cylinders") +
  scale_size_continuous(
    name = "Horsepower",
    range = c(2, 8)
  ) +
  labs(
    title = "Vehicle Performance Analysis",
    subtitle = "Relationship between weight, fuel efficiency, and power",
    x = "Weight (1000 lbs)",
    y = "Miles per Gallon"
  ) +
  theme_minimal()
Common Patterns and Variations
With Regression Line
ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  geom_smooth(method = "lm")
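The line drawn by method = "lm" is an ordinary least-squares fit of y on x. As a rough check, the same model can be fitted directly with lm(). The names fit and p_lm are illustrative, and se = FALSE is an optional choice here that hides the confidence band.

```r
library(ggplot2)

# Fit the same straight-line model that geom_smooth(method = "lm") draws
fit <- lm(mpg ~ wt, data = mtcars)
coef(fit)  # intercept and slope; the slope is negative (heavier cars, lower mpg)

# Spelling out formula = y ~ x avoids the message about the default formula
p_lm <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE)

p_lm
```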
Faceted by Group
ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  facet_wrap(~cyl)
Jittered Points
ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
  geom_jitter(width = 0.2)
Text Labels
ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_text(aes(label = rownames(mtcars)))
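Labels drawn on their own replace the points and tend to overlap. One possible variation, sketched here with a hypothetical model column and a copy of the data named cars_labeled, keeps the points and drops labels that would collide:

```r
library(ggplot2)

# Store the car names in a real column so aes() can refer to it by name
cars_labeled <- mtcars
cars_labeled$model <- rownames(mtcars)

p_labels <- ggplot(cars_labeled, aes(x = wt, y = mpg)) +
  geom_point() +
  geom_text(
    aes(label = model),
    vjust = -0.8,          # nudge labels above the points
    size = 3,
    check_overlap = TRUE   # skip labels that would overlap earlier ones
  )

p_labels
```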
Troubleshooting Common Issues
Problem: "Error: Aesthetics must be either length 1 or the same as the data"
Solution: Make sure every variable inside aes() is a column of the data frame passed to ggplot(). A separate vector whose length differs from the number of rows triggers this error.
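A small reproduction may make this error easier to recognize. The vector bad_groups and the column group below are made up for illustration. The error only appears when the plot is built, so ggplot_build() is wrapped in tryCatch() to capture it.

```r
library(ggplot2)

# A stray vector of length 3, while mtcars has 32 rows
bad_groups <- c("a", "b", "c")

p_bad <- ggplot(mtcars, aes(x = wt, y = mpg, color = bad_groups)) +
  geom_point()

# Building the plot fails because the lengths do not match
build_result <- tryCatch(
  {
    ggplot_build(p_bad)
    "built without error"
  },
  error = function(e) "build failed"
)
build_result

# Fix: keep the variable as a column with one value per row
cars_grouped <- mtcars
cars_grouped$group <- rep(c("a", "b"), length.out = nrow(mtcars))

p_ok <- ggplot(cars_grouped, aes(x = wt, y = mpg, color = group)) +
  geom_point()
```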
Problem: Points not showing colors as expected
Solution: Remember the difference: color = "blue" (fixed) vs aes(color = variable) (mapped to data).
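The classic form of this mistake is putting a fixed color inside aes(). A short sketch of both versions, with illustrative object names:

```r
library(ggplot2)

# Fixed: every point is drawn in blue, and no legend is created
p_fixed <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point(color = "blue")

# Mapped by mistake: "blue" is treated as a data value with one category,
# so the points get the default palette color and a legend labeled "blue"
p_mapped <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point(aes(color = "blue"))

p_fixed
p_mapped
```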
Problem: Overplotting with many points
Solution: Use alpha = 0.5 for transparency or geom_jitter() for slight position adjustments.
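Overplotting is easier to see on a larger dataset than mtcars. The sketch below assumes the diamonds data that ships with ggplot2; the alpha value of 0.1 is just a starting point to tune by eye.

```r
library(ggplot2)

# diamonds has tens of thousands of rows, so opaque points pile up
nrow(diamonds)

# Low alpha lets dense regions show up darker than sparse ones
p_dense <- ggplot(diamonds, aes(x = carat, y = price)) +
  geom_point(alpha = 0.1)

p_dense
```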
Pro Tip: Always start simple and build complexity gradually. Create a basic scatter plot first, then add colors, sizes, and other aesthetics one at a time to ensure each layer works as expected.
Congratulations! You’ve created your first ggplot2 scatter plot. In our next tutorial, we’ll explore how to customize colors, themes, and layouts to make your visualizations publication-ready.
