Unit 1: Introduction to R Programming
What is R?
R is an open-source programming language and environment designed for statistical computing, data analysis, and visualization. It is widely used by statisticians, data scientists, and researchers, both for everyday analysis and for developing new statistical software. R provides a large collection of packages and tools for manipulating data, running statistical tests, and creating high-quality visualizations.
R is not just a programming language; it is an ecosystem that includes:
- A language for data manipulation and analysis.
- An environment for running code and scripts.
- A community-driven repository of packages (CRAN) for extended functionality.
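A minimal sketch of how these three pieces look in practice, assuming a standard R installation. It uses only base R and the built-in mtcars dataset; the variable names (manual_mpg, auto_mpg, result) are illustrative, and the install step is left commented out because it needs network access.

```r
# Language: subset the built-in mtcars data by transmission type
# (in mtcars, am == 1 means manual and am == 0 means automatic)
manual_mpg <- mtcars$mpg[mtcars$am == 1]
auto_mpg <- mtcars$mpg[mtcars$am == 0]

# Language: Welch two-sample t-test comparing fuel economy of the two groups
result <- t.test(manual_mpg, auto_mpg)

# Environment: objects created above stay in the session and can be inspected
print(result)
print(ls())

# CRAN: a package is installed once, then loaded in each session that needs it
# install.packages("ggplot2")  # uncomment to download from CRAN
# library(ggplot2)
```

The same lines can be typed at the interactive console or saved in a script file and run with source() or Rscript.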
History of R
R was created by Ross Ihaka and Robert Gentleman at the University of Auckland, New Zealand, in 1993. It was inspired by the S language (developed at Bell Labs) and designed to be more user-friendly and flexible. The name “R” comes from the first letter of the creators’ first names, Ross and Robert, and is also a play on the name of S.
R was initially used in academic and research settings but has since become a standard tool in industries like finance, healthcare, and marketing. The R Core Team maintains the language with the support of the R Foundation, and the Comprehensive R Archive Network (CRAN) hosts thousands of packages for specialized tasks.
Why Use R?
Open Source
R is free to use, modify, and distribute, making it accessible to everyone.
Extensive Libraries
CRAN hosts over 18,000 packages for statistics, machine learning, and visualization.
Data Visualization
R’s ggplot2 package is renowned for creating publication-quality graphics.
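As a hedged illustration of what a ggplot2 graphic involves, the sketch below builds a scatter plot from the built-in mtcars dataset. It assumes ggplot2 is already installed from CRAN; the choice of variables, the title, and the output file name are illustrative only.

```r
library(ggplot2)

# Map car weight to the x axis and fuel economy to the y axis,
# then add a point layer, axis labels, and a clean theme
p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  labs(
    title = "Fuel economy versus weight",
    x = "Weight (1000 lbs)",
    y = "Miles per gallon"
  ) +
  theme_minimal()

# Draw the plot on the active graphics device
print(p)

# Optionally save a high-resolution copy (hypothetical file name)
# ggsave("mpg_vs_weight.png", plot = p, width = 6, height = 4, dpi = 300)
```

The plot is built up in layers joined with +, so changing the geometry or theme means editing one line rather than redrawing the figure.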
Community Support
A large, active community contributes to documentation, forums, and tutorials.
Cross-Platform
R runs on Windows, macOS, and Linux, ensuring compatibility across systems.
Integration
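R integrates with other languages, for example Python (via the reticulate package), SQL databases (via the DBI package), and C++ (via the Rcpp package), and exchanges data with tools like Excel and Tableau.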

