How to Generate Random Data in R Programming and Export Data to Excel (Step-by-Step Guide)

A Complete Guide to Creating and Exporting Random Datasets for Educational Purposes

Quick Start Example

Here’s a simple educational example to get you started with generating random student data:

Complete R Script
# Install the package (needed only once; skip this line if openxlsx is already installed)
install.packages("openxlsx")

# Load library
library(openxlsx)

# Generate random educational data
set.seed(123)  # for reproducibility

student_data <- data.frame(
  StudentID = 1:20,
  Name = paste("Student", 1:20),
  Age = sample(18:22, 20, replace = TRUE),
  Gender = sample(c("Male", "Female"), 20, replace = TRUE),
  Marks_Math = sample(50:100, 20, replace = TRUE),
  Marks_Science = sample(50:100, 20, replace = TRUE),
  Marks_English = sample(50:100, 20, replace = TRUE)
)

# View data in RStudio
print(student_data)

# Export to Excel file
write.xlsx(student_data, file = "Student_Data.xlsx")

# The file "Student_Data.xlsx" will be saved in your working directory

📘 Step-by-Step Explanation

1 Install and Load the Package
Package Setup
install.packages("openxlsx")   # Install package (only first time)
library(openxlsx)              # Load the package into R
  • install.packages("openxlsx"): Downloads and installs the openxlsx package from CRAN. This package reads and writes Excel (.xlsx) files.
  • library(openxlsx): Loads the package so its functions are available in the current R session.
Teaching point:

Packages in R are like “apps” that add extra features.


2 Set Random Seed
Reproducibility Setup
set.seed(123)
  • This makes sure that every time you run the code, you get the same random data.
  • Without it, random numbers would change each time you run the script.
Teaching point:

set.seed() ensures reproducibility (very important in data science).
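
To see what the seed does in practice, the short sketch below draws five numbers twice with the same seed, then a third time without resetting it. The variable names (first_draw, second_draw, third_draw) are illustrative only and are not part of the main script.

```r
# Draw five numbers after setting the seed
set.seed(123)
first_draw <- sample(1:100, 5)

# Reset the same seed and draw again: the result is identical
set.seed(123)
second_draw <- sample(1:100, 5)

# Draw a third time WITHOUT resetting the seed:
# the random number generator simply continues from where it stopped
third_draw <- sample(1:100, 5)

identical(first_draw, second_draw)  # TRUE
print(first_draw)
print(third_draw)
```

This is why the main script calls set.seed(123) once, before any call to sample().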


3 Create the Dataset
Data Generation
student_data <- data.frame(
  StudentID = 1:20,
  Name = paste("Student", 1:20),
  Age = sample(18:22, 20, replace = TRUE),
  Gender = sample(c("Male", "Female"), 20, replace = TRUE),
  Marks_Math = sample(50:100, 20, replace = TRUE),
  Marks_Science = sample(50:100, 20, replace = TRUE),
  Marks_English = sample(50:100, 20, replace = TRUE)
)
  • data.frame(): Creates a table-like structure in R (similar to an Excel sheet), with one column per argument.
  • StudentID = 1:20: Creates student IDs from 1 to 20.
  • Name = paste("Student", 1:20): Creates names like Student 1, Student 2, … Student 20.
  • Age = sample(18:22, 20, replace = TRUE): Randomly picks 20 ages from 18 to 22. replace = TRUE allows the same age to be drawn more than once, which is required here because there are 20 students but only 5 possible ages.
  • Gender = sample(c("Male", "Female"), 20, replace = TRUE): Randomly assigns Male or Female to each student.
  • Marks_Math, Marks_Science, Marks_English: Each column holds 20 random whole-number marks from 50 to 100.
Teaching point:

This is how you simulate real-life data for practice.
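
The role of replace = TRUE is easier to see in isolation. The following is a minimal sketch, using the same age range as the dataset; the names ages, result and shuffled are made up for this illustration.

```r
set.seed(123)

# With replace = TRUE, 20 values can be drawn from only 5 possible ages
ages <- sample(18:22, 20, replace = TRUE)
length(ages)            # 20
all(ages %in% 18:22)    # TRUE

# Without replacement, asking for 20 values from a pool of 5 is an error.
# tryCatch() turns that error into a message so the script keeps running.
result <- tryCatch(
  sample(18:22, 20),
  error = function(e) "error: cannot sample more values than the pool contains"
)
print(result)

# Without replacement and no size given, sample() returns a random shuffle
shuffled <- sample(18:22)
sort(shuffled)          # 18 19 20 21 22
```

So replace = TRUE is the right choice whenever values are allowed to repeat, as with ages, genders and marks.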


4 View the Data in RStudio
Display Data
print(student_data)
  • Displays the dataset in the console.
  • At the console you can also type just student_data to print it. Inside a script run with source(), use print() explicitly.
Teaching point:

Always check your data before exporting.
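
Besides print(), base R offers a few functions that make this check quicker. The sketch below is one possible way to do it, using a reduced, hypothetical data frame called check_data so that it runs on its own.

```r
set.seed(123)

# Reduced version of the student dataset, for illustration only
check_data <- data.frame(
  StudentID = 1:20,
  Age = sample(18:22, 20, replace = TRUE),
  Marks_Math = sample(50:100, 20, replace = TRUE)
)

head(check_data, 5)    # first five rows only
str(check_data)        # column names and types
summary(check_data)    # min, max, mean and quartiles per column

# Simple sanity checks: stopifnot() raises an error if any condition is FALSE
stopifnot(
  nrow(check_data) == 20,
  all(check_data$Age >= 18 & check_data$Age <= 22),
  all(check_data$Marks_Math >= 50 & check_data$Marks_Math <= 100)
)
```

For a 20-row table print() is enough, but head() and summary() scale better to larger datasets.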


5 Export the Data to Excel
Export to Excel
write.xlsx(student_data, file = "Student_Data.xlsx")
  • write.xlsx(): Writes the dataset to an Excel (.xlsx) file.
  • file = "Student_Data.xlsx": The name of the Excel file.
  • Because only a file name is given, the file is saved in the current working directory. Run getwd() to see which folder that is.
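
One way to confirm that the export worked is to read the file back into R. The sketch below assumes openxlsx is installed and uses a small, hypothetical data frame (demo_data) and a temporary folder, so it does not touch the Student_Data.xlsx file from the main script.

```r
library(openxlsx)

# Small illustrative data frame (not the full dataset)
demo_data <- data.frame(
  StudentID = 1:3,
  Marks_Math = c(72, 85, 91)
)

# Write to R's temporary folder so nothing in your project is overwritten
out_path <- file.path(tempdir(), "Student_Data_demo.xlsx")
write.xlsx(demo_data, file = out_path)

# Confirm the file exists on disk
file.exists(out_path)

# Read the first sheet back and inspect it
read_back <- read.xlsx(out_path, sheet = 1)
print(read_back)

# Folder that is used when only a bare file name is given
getwd()
```

Passing a full path built with file.path() is an alternative to relying on the working directory.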

💡 Key Takeaways

  • Use set.seed() for reproducible random data
  • The sample() function draws random values from a vector; add replace = TRUE when values may repeat
  • The openxlsx package makes Excel export simple and efficient
  • Always verify your data with print() before exporting
