Data Types in R – Character, Numeric, Integer, Logical, Complex

Data Types in R – MBA Business Analysis

Welcome to the first unit of our Business Analysis Techniques course. In this unit, we’ll explore the fundamental building blocks of R programming: data types. Understanding data types is crucial for effective data manipulation, analysis, and visualization in business contexts.

Key Concept: Data types define how R stores and manipulates different kinds of information. Choosing the appropriate data type is essential for efficient data processing and accurate analysis.

Fundamental Data Types in R

R has several basic data types that serve different purposes in data analysis. Let’s explore each one in detail with business-relevant examples.

1. Numeric Data Type

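The numeric data type represents real numbers, with or without a decimal part. It is the default type for numbers in R: a value such as 2500 typed without any further marking is stored as a double-precision number, not as an integer.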

Business Application:

Financial metrics, sales figures, percentages, measurements

    # Creating numeric variables
    revenue <- 150000.75
    profit_margin <- 0.15
    units_sold <- 2500

    # Checking data type
    class(revenue)
    typeof(revenue)

    # Mathematical operations
    gross_profit <- revenue * profit_margin
    average_revenue <- revenue / units_sold

    # Display results
    print(paste("Gross Profit: $", gross_profit))
    print(paste("Average Revenue per Unit: $", average_revenue))
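Because numeric values are stored as double-precision floating-point numbers, whole numbers typed without a suffix are still doubles, and exact equality tests on decimals can give surprising results. The following is a minimal sketch of both points; the variable names `subtotal`, `exact_match`, `close_match`, and `display_profit` are made up for illustration.

```r
# Numbers typed without the L suffix are stored as doubles, even whole ones
units_sold <- 2500
typeof(units_sold)        # "double"
is.integer(units_sold)    # FALSE

# Decimal arithmetic is approximate, so avoid == on computed decimals
subtotal <- 0.1 + 0.2
exact_match <- subtotal == 0.3                    # FALSE
close_match <- isTRUE(all.equal(subtotal, 0.3))   # TRUE

# Round and format only for display, not in intermediate calculations
gross_profit <- 150000.75 * 0.15
display_profit <- format(round(gross_profit, 2), nsmall = 2, big.mark = ",")
print(paste0("Gross Profit: $", display_profit))  # "Gross Profit: $22,500.11"
```

Using `paste0()` here avoids the space that `paste()` would insert between the dollar sign and the amount.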

2. Integer Data Type

Integers are whole numbers without decimal points. In R, we must request integer storage explicitly, either by adding the L suffix to a number (for example 250L) or by converting a value with as.integer().

Business Application:

Employee counts, product quantities, customer IDs

    # Creating integer variables
    employee_count <- 250L
    warehouse_capacity <- 10000L
    customer_id <- 15432L

    # Checking data type
    class(employee_count)
    typeof(employee_count)

    # Integer operations
    new_hires <- 15L
    total_employees <- employee_count + new_hires

    # Display results
    print(paste("Total Employees after hiring:", total_employees))
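An integer does not always stay an integer once it enters a calculation. The sketch below, which uses the hypothetical variables `teams`, `per_team`, `full_groups`, and `left_over`, shows which operations keep the integer type and which promote the result to a double.

```r
employee_count <- 250L
teams <- 4L

# Ordinary division always returns a double, even for integer inputs
per_team <- employee_count / teams
typeof(per_team)                  # "double"

# Integer division and remainder keep the integer type
full_groups <- employee_count %/% 60L   # 4 full groups of 60
left_over   <- employee_count %% 60L    # 10 employees remaining
typeof(full_groups)               # "integer"

# Mixing an integer with a plain number promotes the result to double
typeof(employee_count + 15)       # "double"
typeof(employee_count + 15L)      # "integer"

# Integers are 32-bit, so the largest storable value is limited
.Machine$integer.max              # 2147483647
```

For identifiers or counts that could exceed this limit, a numeric (double) or character value is the safer choice.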

3. Character Data Type

Character data type stores text values (strings). In R, we use quotes to define character values.

Business Application:

Customer names, product categories, department names, addresses

    # Creating character variables
    company_name <- "Tech Solutions Inc."
    product_category <- "Electronics"
    customer_tier <- "Premium"

    # Checking data type
    class(company_name)

    # String operations
    welcome_message <- paste("Welcome to", company_name)
    category_upper <- toupper(product_category)

    # Display results
    print(welcome_message)
    print(paste("Category in uppercase:", category_upper))
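A few more base R string functions come up constantly when preparing labels and report text. This is a short sketch only; the names `customer_label`, `name_length`, `short_name`, `category_lower`, and `revenue_line` are illustrative.

```r
company_name <- "Tech Solutions Inc."

# paste() inserts a space between its arguments; paste0() does not
welcome_message <- paste("Welcome to", company_name)
customer_label  <- paste0("ID-", 15432L)      # "ID-15432"

# Length, substring, and case conversion
name_length    <- nchar(company_name)         # 19
short_name     <- substr(company_name, 1, 4)  # "Tech"
category_lower <- tolower("Electronics")      # "electronics"

# sprintf() controls how numbers are formatted inside text
revenue_line <- sprintf("Revenue: $%.2f", 150000.75)
print(revenue_line)                           # "Revenue: $150000.75"
```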

4. Logical Data Type

Logical data type represents Boolean values: TRUE or FALSE. These are essential for conditional operations and filtering data.

Business Application:

Status indicators, eligibility checks, condition evaluations

    # Creating logical variables
    revenue <- 150000.75   # same value as in the numeric example
    is_profitable <- TRUE
    has_discount <- FALSE
    meets_target <- revenue > 100000

    # Checking data type
    class(is_profitable)

    # Logical operations
    eligible_for_bonus <- is_profitable & meets_target
    special_offer <- has_discount | (revenue > 200000)

    # Display results (note the straight quotes; curly quotes are a syntax error in R)
    print(paste("Eligible for bonus:", eligible_for_bonus))
    print(paste("Special offer available:", special_offer))
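Logical values become most useful when they are applied to a whole vector at once, which is how filtering works in R. The sketch below uses made-up regional figures (`region`, `region_revenue`) purely for illustration.

```r
# Hypothetical quarterly revenue for four regions
region         <- c("North", "South", "East", "West")
region_revenue <- c(120000, 85000, 210000, 99000)

# A comparison on a vector gives one TRUE/FALSE per element
meets_target <- region_revenue > 100000

# A logical vector selects the matching elements
regions_on_target <- region[meets_target]    # "North" "East"

# TRUE counts as 1 and FALSE as 0 in arithmetic
n_met     <- sum(meets_target)               # 2
share_met <- mean(meets_target)              # 0.5

# & and | work element by element; && and || expect single values, as in if()
needs_attention <- !meets_target & region_revenue < 90000
if (any(meets_target) && share_met >= 0.5) {
  print("At least half of the regions met the target")
}
```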

5. Factor Data Type

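Factors represent categorical data. Strictly speaking, a factor is not a separate basic type but a class built on top of integers: R stores each value as an integer code and keeps the set of possible levels (categories) as labels.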

Business Application:

Customer segments, product types, regions, satisfaction levels

    # Creating factor variables
    customer_segment <- factor(c("Premium", "Standard", "Premium", "Basic", "Standard"))
    product_type <- factor(c("Electronics", "Clothing", "Electronics", "Home", "Clothing"))

    # Checking data type and levels
    class(customer_segment)
    levels(customer_segment)

    # Working with factors
    summary(customer_segment)

    # Ordered factors (for ordinal data)
    satisfaction_level <- factor(
      c("Low", "Medium", "High", "Medium", "High"),
      levels = c("Low", "Medium", "High"),
      ordered = TRUE
    )

    # Display results
    print("Customer Segment Summary:")
    print(summary(customer_segment))
    print(paste("Is satisfaction ordered?", is.ordered(satisfaction_level)))
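It can help to look at what a factor stores internally and at what an ordered factor makes possible. The following sketch reuses the vectors from the example above; `segment_codes`, `segment_counts`, and `at_least_medium` are illustrative names.

```r
customer_segment <- factor(c("Premium", "Standard", "Premium", "Basic", "Standard"))

class(customer_segment)     # "factor"
typeof(customer_segment)    # "integer": the values are stored as codes
levels(customer_segment)    # "Basic" "Premium" "Standard" (sorted by default)

# Each code is the position of the value within levels()
segment_codes <- as.integer(customer_segment)   # 2 3 2 1 3

# table() counts how often each level occurs
segment_counts <- table(customer_segment)

# Ordered factors support comparisons that follow the level order
satisfaction_level <- factor(
  c("Low", "Medium", "High", "Medium", "High"),
  levels = c("Low", "Medium", "High"),
  ordered = TRUE
)
at_least_medium <- satisfaction_level >= "Medium"
sum(at_least_medium)        # 4
```

Setting `levels` explicitly matters for ordinal data, because the default alphabetical order would place "High" before "Low".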

Data Type Comparison and Appropriate Uses

Data Type | Appropriate Business Use Cases | Key Functions | Memory Considerations
Numeric | Financial calculations, metrics with decimal precision | class(), typeof(), is.numeric() | 8 bytes per value; uses more memory than integers
Integer | Countable items, IDs, quantities | as.integer(), is.integer(), L suffix | 4 bytes per value; more memory efficient for whole numbers
Character | Text data, names, descriptions, categories | paste(), nchar(), toupper(), tolower() | Memory usage depends on string length
Logical | Boolean conditions, flags, status indicators | is.logical(), "&" (AND), "|" (OR), "!" (NOT) | 4 bytes per value, the same as integer
Factor | Categorical data with limited unique values | factor(), levels(), summary() | Stored as integer codes plus one copy of each label; usually smaller than character for repeated categories
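The memory column can be checked directly with `object.size()` from the utils package, which ships with R and is attached by default. This is a rough sketch: the exact byte counts vary by R version and platform, so only the relative sizes are meaningful, and the comparisons assume a 64-bit build of R. The vector names are made up for illustration.

```r
n <- 100000

as_double  <- rep(2500, n)    # numeric (double)
as_integer <- rep(2500L, n)   # integer
as_logical <- rep(TRUE, n)    # logical

size_double  <- as.numeric(object.size(as_double))
size_integer <- as.numeric(object.size(as_integer))
size_logical <- as.numeric(object.size(as_logical))

# Doubles take about twice the space of integers;
# logicals take the same space as integers
print(c(double = size_double, integer = size_integer, logical = size_logical))

# Repeated categories: character vector versus factor
segments_chr <- rep(c("Premium", "Standard", "Basic"), length.out = n)
segments_fct <- factor(segments_chr)

size_chr <- as.numeric(object.size(segments_chr))
size_fct <- as.numeric(object.size(segments_fct))
print(c(character = size_chr, factor = size_fct))
```

For the small vectors used in this unit the differences are negligible; they start to matter with data sets of millions of rows.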

Data Type Conversion

In business analysis, we often need to convert between data types to perform appropriate operations.

    # Data type conversion examples

    # Numeric to character
    sales_figure <- 12500.50
    sales_text <- as.character(sales_figure)
    print(paste("Sales as text:", sales_text))

    # Character to numeric
    price_string <- "29.99"
    price_numeric <- as.numeric(price_string)
    print(paste("Price as numeric:", price_numeric))

    # Logical to numeric (TRUE=1, FALSE=0)
    is_active <- c(TRUE, FALSE, TRUE, TRUE)
    active_count <- sum(is_active)
    print(paste("Number of active items:", active_count))

    # Character to factor
    region <- c("North", "South", "East", "West", "North")
    region_factor <- as.factor(region)
    print("Region as factor:")
    print(region_factor)
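Conversions do not always succeed silently, and a few of them behave in ways that commonly cause errors with imported business data. The sketch below illustrates three such cases using invented values (`raw_prices`, `rating`).

```r
# Text that is not a valid number becomes NA, and R issues a warning
raw_prices <- c("29.99", "15", "N/A", "1,250.00")
prices <- suppressWarnings(as.numeric(raw_prices))
n_failed <- sum(is.na(prices))                         # 2 values could not be converted

# Remove thousands separators before converting
clean_price <- as.numeric(gsub(",", "", "1,250.00"))   # 1250

# as.integer() drops the decimal part; it does not round
truncated <- as.integer(29.99)                         # 29
rounded   <- as.integer(round(29.99))                  # 30

# A factor with number-like labels converts to its internal codes,
# so go through character first to recover the labels
rating <- factor(c("10", "20", "10"))
codes  <- as.numeric(rating)                           # 1 2 1
values <- as.numeric(as.character(rating))             # 10 20 10
```

Checking for `NA` values right after a conversion is a cheap way to catch badly formatted input early.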

Practical Exercises

Exercise 1: Employee Data Analysis

Create variables to store the following employee information with appropriate data types:

  • Employee ID (integer)
  • Employee Name (character)
  • Department (factor with levels: “HR”, “Finance”, “Marketing”, “IT”)
  • Salary (numeric)
  • Is Manager (logical)

Create data for 5 employees and calculate the average salary by department.
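One possible way to approach this exercise is sketched below. All names and salaries are invented, and `tapply()` is only one of several base R options for grouped averages; treat it as a starting point, not a model answer.

```r
# Made-up data for five employees
employee_id   <- c(101L, 102L, 103L, 104L, 105L)
employee_name <- c("Asha", "Ben", "Carla", "Deepak", "Elena")
department    <- factor(
  c("HR", "Finance", "IT", "Finance", "IT"),
  levels = c("HR", "Finance", "Marketing", "IT")
)
salary        <- c(52000, 68000, 75000, 72000, 81000)
is_manager    <- c(TRUE, FALSE, FALSE, TRUE, FALSE)

# tapply() applies mean() to the salaries within each department level
avg_salary_by_dept <- tapply(salary, department, mean)
print(avg_salary_by_dept)
```

Because "Marketing" is a declared level with no employees in this sample, its average is reported as `NA`.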

Exercise 2: Sales Performance Evaluation

Create a small dataset for sales representatives with the following information:

  • Salesperson Name
  • Region (categorical)
  • Quarterly Sales (numeric)
  • Target Met (logical – TRUE if sales > 100000)

Calculate the percentage of salespeople who met their target and identify the highest performer.
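As a hint for this exercise, the sketch below uses invented names and figures and relies on two facts from this unit: the mean of a logical vector is the share of TRUE values, and an index can be used to look up a matching name.

```r
# Made-up data for four sales representatives
salesperson     <- c("Priya", "Marcus", "Lena", "Omar")
region          <- factor(c("North", "South", "East", "West"))
quarterly_sales <- c(125000, 98000, 143500, 100000)

# Strictly greater than the target, so exactly 100000 does not qualify
target_met <- quarterly_sales > 100000

# Share of TRUE values, expressed as a percentage
pct_met <- mean(target_met) * 100                          # 50

# which.max() returns the position of the largest value
top_performer <- salesperson[which.max(quarterly_sales)]   # "Lena"

print(paste0(pct_met, "% of salespeople met their target"))
print(paste("Highest performer:", top_performer))
```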

Exercise 3: Product Inventory Management

Create variables to track product inventory:

  • Product ID (integer)
  • Product Name (character)
  • Category (factor)
  • Current Stock (integer)
  • Reorder Needed (logical – TRUE if stock < 10)

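Identify which products need reordering and calculate the total inventory value. The value calculation requires a price, so add a numeric unit price for each product to the variables listed above.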
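A possible starting point for this exercise is sketched below with invented products. Note that `unit_price` is an assumption added here, since an inventory value cannot be computed from stock counts alone.

```r
# Made-up inventory data
product_id    <- c(1L, 2L, 3L, 4L)
product_name  <- c("Laptop", "Desk Lamp", "T-Shirt", "Headphones")
category      <- factor(c("Electronics", "Home", "Clothing", "Electronics"))
current_stock <- c(25L, 8L, 40L, 3L)
unit_price    <- c(899.00, 24.50, 12.00, 59.90)   # assumed; not in the exercise list

# Flag products whose stock has fallen below 10 units
reorder_needed <- current_stock < 10L

# Use the logical vector to pick out the product names
to_reorder <- product_name[reorder_needed]        # "Desk Lamp" "Headphones"

# Multiply element by element, then add up
total_inventory_value <- sum(current_stock * unit_price)

print(to_reorder)
print(sprintf("Total inventory value: $%.2f", total_inventory_value))
```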

Key Takeaways

  • Choosing the right data type improves efficiency and accuracy in business analysis
  • Numeric types are ideal for calculations, integers for countable items
  • Factors optimize memory usage for categorical data with repeated values
  • Logical types are essential for conditional operations and filtering
  • Proper data type selection impacts the performance of your R code

In our next session, we’ll explore data structures in R (vectors, matrices, data frames) and how they build upon these fundamental data types for business analysis.
