🎯 Topic: How to Change Column Names in a Dataset in R (Base R only)
Overview:
Renaming columns is a fundamental data-cleaning task. Clear, consistent column names make your scripts easier to read, reduce the chance of mistakes, and help when sharing code. In base R you can rename columns with functions such as colnames(), names(), which(), setNames(), and string functions like gsub(). You can rename a single column, rename multiple columns programmatically, or apply pattern replacements across many names. Important safety steps include working on a copy of the original dataset, verifying the column exists before renaming, checking for duplicate names after renaming, and ensuring types are unaffected. This guide uses only base R and a reproducible custom dataset of 50 rows and 7 columns (ID, Name, Age, Gender, math_score, eng_score, attend_pct). Each example uses that dataset exclusively and includes a short explanation. After practicing these base-R methods you will be able to rename columns safely without loading packages, which is useful in environments where additional libraries are unavailable.
Dataset: Create students_survey50 (7 columns, 50 rows)
The dataset students_survey50 simulates 50 students and includes the columns:
ID, Name, Age, Gender, math_score, eng_score, attend_pct. Use the code below to create it in base R.
R code: Create the dataset (copy-paste into R)
# Create a reproducible 50-row dataset (base R only)
set.seed(42)
ID <- 1:50
Name <- paste0("Student_", sprintf("%02d", ID))
Age <- sample(10:16, 50, replace = TRUE)
Gender <- sample(c("M","F"), 50, replace = TRUE)
math_score <- pmin(pmax(round(rnorm(50, 70, 10)), 0), 100) # keep 0-100
eng_score <- pmin(pmax(round(rnorm(50, 72, 9)), 0), 100)
attend_pct <- round(runif(50, 75, 100), 1)
students_survey50 <- data.frame(
ID, Name, Age, Gender, math_score, eng_score, attend_pct,
stringsAsFactors = FALSE
)
# Quick verification
dim(students_survey50) # should be 50 x 7
head(students_survey50, 6)
Dataset explanation: This dataset is intentionally simple but realistic. Column types: ID integer, Name character, Age integer, Gender character, and numeric scores and attendance. All examples that follow use only this data frame.
Step: Always work on a copy before renaming
# Best practice: keep original copy
data_orig <- students_survey50
data_work <- data_orig
Explanation: Keeping data_orig preserves the source. If a rename breaks later code, you can revert by reassigning data_work <- data_orig.
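The sketch below illustrates why the copy is safe: R copies a data frame when it is modified, so renaming data_work leaves data_orig alone. To run on its own it uses a three-row stand-in with a few of the same column names (an assumption for illustration, not the 50-row dataset).

```r
# Three-row stand-in for students_survey50 (illustrative values only)
data_orig <- data.frame(
  ID = 1:3,
  math_score = c(71, 65, 88),
  attend_pct = c(92.5, 88.0, 97.1)
)
data_work <- data_orig

# Rename in the working copy only
names(data_work)[names(data_work) == "attend_pct"] <- "attendance_percent"

names(data_work)  # working copy carries the new name
names(data_orig)  # original still has "attend_pct"

# Revert: discard the changes by copying the original again
data_work <- data_orig
identical(data_work, data_orig)  # TRUE
```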
Method 1 — Rename a single column using colnames()
# Rename 'attend_pct' to 'attendance_percent'
colnames(data_work)[colnames(data_work) == "attend_pct"] <- "attendance_percent"
# Verify change
colnames(data_work)
Explanation: colnames() returns a character vector of column names. The logical comparison finds the exact element equal to the old name and assigns a new value to that slot. This is explicit and works well for single known names.
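To see what the logical comparison produces, here is a small sketch on a three-column stand-in (illustrative values, not the full dataset). It also shows that names() and colnames() return the same vector for a data frame, so either can be used for renaming.

```r
# Three-column stand-in (illustrative values only)
demo <- data.frame(ID = 1:2, eng_score = c(74, 80), attend_pct = c(92.5, 88.0))

# The comparison yields one TRUE/FALSE per column
is_target <- colnames(demo) == "attend_pct"
is_target  # FALSE FALSE TRUE

# For a data frame, names() and colnames() return the same vector
identical(names(demo), colnames(demo))  # TRUE

# Assigning through the logical index changes only the TRUE slot
colnames(demo)[is_target] <- "attendance_percent"
colnames(demo)
```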
Method 2 — Rename by position using names() and which()
# Find index of 'math_score' and rename to 'math_percentage'
idx <- which(names(data_work) == "math_score")
if(length(idx) == 1) names(data_work)[idx] <- "math_percentage"
colnames(data_work)
Explanation: which() returns the position(s) where the name matches exactly. Checking length(idx) == 1 skips the rename when the name is missing (length 0) or appears more than once (length greater than 1), so only an unambiguous match is changed.
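The three cases the length check distinguishes can be sketched on plain name vectors. These vectors stand in for names(data_work); the missing and duplicated cases are hypothetical and do not occur in the tutorial dataset.

```r
# Name vectors standing in for names(data_work)
nm_ok      <- c("ID", "math_score", "eng_score")
nm_missing <- c("ID", "eng_score")                 # hypothetical: column absent
nm_dup     <- c("ID", "math_score", "math_score")  # hypothetical: duplicate name

which(nm_ok == "math_score")       # 2
which(nm_missing == "math_score")  # integer(0)
which(nm_dup == "math_score")      # 2 3

# Only the first case passes the length(idx) == 1 guard
lens <- sapply(list(nm_ok, nm_missing, nm_dup),
               function(nm) length(which(nm == "math_score")))
lens  # 1 0 2
```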
Method 3 — Rename multiple columns at once using setNames()
# Map old names to new names, then apply them with setNames()
old_names <- c("eng_score","Name")
new_names <- c("english_score","student_name")
# Copy the current names and replace only the matches
nm <- names(data_work)
hit <- nm %in% old_names
nm[hit] <- new_names[match(nm[hit], old_names)]
data_work <- setNames(data_work, nm)
# Verify
colnames(data_work)
Explanation: old_names and new_names are parallel vectors. match() finds each matching column's position in old_names so that the corresponding entry of new_names is used, whatever order the columns appear in. setNames() then returns the data frame with the updated names. This method scales when you need to rename several known columns without packages.
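A variation on the same idea keeps the mapping in a single named vector, so each old name sits next to its replacement. This is a sketch on a three-row stand-in; rename_map is a name chosen here for illustration.

```r
# Three-row stand-in (illustrative values only)
demo <- data.frame(
  ID = 1:3,
  Name = c("Student_01", "Student_02", "Student_03"),
  eng_score = c(74, 80, 69),
  stringsAsFactors = FALSE
)

# Named vector: names are the old column names, values are the new ones
rename_map <- c(eng_score = "english_score", Name = "student_name")

nm  <- names(demo)
hit <- nm %in% names(rename_map)         # which columns have a mapping
nm[hit] <- unname(rename_map[nm[hit]])   # look up each new name by its old name
demo <- setNames(demo, nm)

names(demo)  # "ID" "student_name" "english_score"
```

Columns that are not in the mapping keep their names, and the order of entries in rename_map does not need to follow the column order.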
Method 4 — Pattern-based renaming using gsub()
# Example: change any column ending with '_score' to end with '_pct'
names(data_work) <- gsub("_score$", "_pct", names(data_work))
colnames(data_work)
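Explanation: gsub() applies a regular-expression replacement to every name at once, which keeps bulk renames consistent. The $ anchor limits the match to names that end in _score. At this point in the walkthrough the only such column is english_score, which becomes english_pct.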
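The effect of the $ anchor is easiest to see side by side. The sketch below uses a plain name vector; math_score_raw is a hypothetical name added only to show the difference.

```r
# Name vector for illustration; "math_score_raw" is hypothetical
nm <- c("math_score", "eng_score", "math_score_raw")

# Anchored: only names that END in "_score" change
anchored <- gsub("_score$", "_pct", nm)
anchored    # "math_pct" "eng_pct" "math_score_raw"

# Unanchored: "_score" is replaced wherever it occurs
unanchored <- gsub("_score", "_pct", nm)
unanchored  # "math_pct" "eng_pct" "math_pct_raw"
```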
Method 5 — Ensure uniqueness after renaming
# Check duplicates and make unique if needed
if(any(duplicated(names(data_work)))){
warning("Duplicate column names detected; making names unique")
names(data_work) <- make.names(names(data_work), unique = TRUE)
}
colnames(data_work)
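Explanation: With duplicate names, data_work$name and data_work[["name"]] return only the first matching column, so the others are silently ignored. make.names(..., unique = TRUE) appends suffixes such as .1 to repeated names. Note that it also converts syntactically invalid names (for example, names containing spaces) into valid ones; if you only want deduplication, make.unique() leaves the remaining names untouched.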
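Base R offers two deduplication helpers that behave slightly differently. The sketch below compares them on a hypothetical name vector with one duplicate and one name containing a space.

```r
# Hypothetical names: one duplicate and one name containing a space
nm <- c("ID", "score", "score", "attend pct")

has_dup <- anyDuplicated(nm) > 0   # TRUE: at least one repeated name

only_unique <- make.unique(nm)
only_unique   # "ID" "score" "score.1" "attend pct"

valid_unique <- make.names(nm, unique = TRUE)
valid_unique  # "ID" "score" "score.1" "attend.pct"
```

make.unique() changes only the repeated name, while make.names() also replaces the space with a dot.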
Method 6 — Safe rename with existence check
# Safe rename pattern: rename only if column exists
old <- "nonexistent_col"
new <- "something_new"
if(old %in% names(data_work)){
names(data_work)[names(data_work) == old] <- new
} else {
message(paste("Column", old, "not found. No rename performed."))
}
Explanation: Without the check, the assignment silently does nothing when the old name is absent, so a typo in the name goes unnoticed. The explicit check reports the miss instead.
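If the same check is needed in several places, it can be wrapped in a small function. The helper below, rename_col(), is hypothetical (it is not part of base R) and is only one possible design: it reports a missing column with a message and stops if the new name would collide with an existing column.

```r
# Hypothetical helper (not part of base R): rename one column with checks
rename_col <- function(df, old, new) {
  if (!old %in% names(df)) {
    message("Column '", old, "' not found. No rename performed.")
    return(df)
  }
  if (new != old && new %in% names(df)) {
    stop("Column '", new, "' already exists; rename would create a duplicate.")
  }
  names(df)[names(df) == old] <- new
  df
}

# Three-row stand-in (illustrative values only)
demo <- data.frame(ID = 1:3, attend_pct = c(92.5, 88.0, 97.1))

demo <- rename_col(demo, "attend_pct", "attendance_percent")  # renamed
demo <- rename_col(demo, "height_cm", "height_m")             # message only
names(demo)  # "ID" "attendance_percent"
```

Returning the data frame unchanged on a miss lets the call sit in the middle of a script without interrupting it; use stop() there instead if a missing column should halt the run.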
Complete flow example (copy-paste and run)
# Full example: create copy, do a few renames, verify structure
data_test <- students_survey50 # fresh copy
# 1. single rename
colnames(data_test)[colnames(data_test) == "attend_pct"] <- "attendance_percent"
# 2. rename by index
i <- which(names(data_test) == "math_score")
if(length(i) == 1) names(data_test)[i] <- "math_percentage"
# 3. pattern rename
names(data_test) <- gsub("_score$", "_pct", names(data_test))
# 4. verify
colnames(data_test)
str(data_test)
head(data_test, 4)
Explanation: This shows a typical workflow: copy, rename single columns, rename by index, apply a pattern-based rename, and then inspect with colnames(), str(), and head().
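As an extra check after a workflow like this, you can confirm that only the labels changed. The sketch below compares a three-row stand-in before and after renaming; the variable names before, after, and the same_* flags are chosen here for illustration.

```r
# Three-row stand-in (illustrative values only)
before <- data.frame(
  ID = 1:3,
  math_score = c(71, 65, 88),
  eng_score = c(74, 80, 69),
  attend_pct = c(92.5, 88.0, 97.1)
)
after <- before
names(after)[names(after) == "attend_pct"] <- "attendance_percent"
names(after) <- gsub("_score$", "_pct", names(after))

# Same shape, same column classes, same values: only the labels differ
same_dim     <- identical(dim(before), dim(after))
same_classes <- identical(unname(sapply(before, class)),
                          unname(sapply(after, class)))
same_values  <- identical(unname(as.list(before)), unname(as.list(after)))
c(same_dim, same_classes, same_values)  # TRUE TRUE TRUE
```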
Practice Exercises (Self-assessment)
- Using students_survey50, rename math_score to math_pct and eng_score to english_pct using base R. Show code and resulting column names.
- Programmatically rename any column that contains the substring pct to prefix it with col_ (e.g., col_math_pct) using gsub(). Show code and verification.
- Attempt to rename a non-existent column (e.g., height_cm) and show a safe check that prints a message instead of failing silently.
- After several renames, demonstrate how to revert to the original dataset (data_orig). Provide the code and explanation.
- Explain in 3–4 sentences why you should check for duplicates after renaming and give the base-R code to do it.
Answer Format (How to present answers)
## Exercise #n — Short title
# R code
...R code here...
# Output (printed):
...expected printed output (e.g., colnames(data_test) or head(...))...
# Short explanation (2-4 sentences)
Explanation...
Example Solutions (Concise, base R only)
# Ex 1
d1 <- students_survey50
names(d1)[names(d1) == "math_score"] <- "math_pct"
names(d1)[names(d1) == "eng_score"] <- "english_pct"
colnames(d1)
# Ex 2 (programmatic prefix)
d2 <- students_survey50
names(d2) <- gsub("^(.*pct.*)$", "col_\\1", names(d2)) # matches names containing 'pct' anywhere
colnames(d2)
# Ex 3 (safe check)
oldn <- "height_cm"
if(oldn %in% names(students_survey50)){
names(students_survey50)[names(students_survey50) == oldn] <- "height_m"
} else {
message("Column 'height_cm' not found; no rename performed.")
}
# Ex 4 (revert)
data_work <- data_orig # revert to original copy
# Ex 5 (duplicates check)
if(any(duplicated(names(data_work)))){
print("Duplicates found")
names(data_work) <- make.names(names(data_work), unique = TRUE)
} else {
print("No duplicate column names")
}
Final notes for students
Renaming columns in base R is powerful and lightweight — no extra packages required. Use explicit methods for individual renames and programmatic methods (with gsub() and vector matching) for bulk operations. Always create a copy before modifying data, verify column existence before renaming, and check for duplicate names afterwards. Practice the exercises using the provided 50-row dataset until these patterns become second nature.

