Where can beginners find a Free Python tutorial for Data Science?

Beginners can start with a Free Python tutorial for Data Science for beginners that provides step-by-step lessons, practical examples, and downloadable study notes.

Is there a Free Python tutorial for Data Science PDF available?

Yes, many platforms provide a Free Python tutorial for Data Science PDF that includes concepts, exercises, and data analysis examples for self-paced learning.

Where can I download Data Science in Python PDF notes?

You can download Data Science in Python PDF notes from various educational portals offering free learning materials covering libraries like NumPy, Pandas, and Matplotlib.

Is there a Python for Data Science, AI & Development free course?

Yes, several platforms offer a Python for Data Science, AI & Development free course that teaches essential programming concepts, data handling, and AI fundamentals.

Where can I find Data Science using Python notes PDF?

Data Science using Python notes PDF can be accessed on educational websites providing detailed notes on Python libraries, data cleaning, and machine learning basics.

Is there a Data Science with Python Google free course?

Yes, Google offers a Data Science with Python free course that includes hands-on lessons, exercises, and beginner-friendly content for practical learning.

Can I get a Python for Data Science course with certificate?

Many websites provide a Python for Data Science course with certificate, helping learners build strong portfolios and enhance job-ready skills.

Where can I find an online Python course free with certificate?

You can join an online Python course free with certificate on multiple learning platforms that offer structured lessons and assessments.

Which is the best free Python course for complete beginners in Data Science?

The best course for beginners in Data Science is one that includes hands-on projects, a Free Python tutorial for Data Science PDF, and real datasets for practice.

Does any free Python tutorial include exercises and datasets?

Yes, many free tutorials include datasets, exercises, and Data Science using Python notes PDF to help learners practice real-world data problems effectively.

Free Python Tutorial for Data Science for Beginners

The Free Python Tutorial for Data Science for Beginners helps students build strong foundations in Python while exploring essential data science concepts. Moreover, it introduces core topics such as variables, loops, and data types.

Introduction to Python Concepts

This chapter covers basic programming topics and gradually explains how Python is used in data cleaning and visualization. Learners also work with NumPy, Pandas, and Matplotlib through simple examples.

Learning Resources and Notes

Students also receive helpful practice questions and notes to reinforce concepts. As a result, they gain confidence in handling real datasets. This section ensures step-by-step learning with simple explanations so beginners can follow easily.

Introduction to Python

Python is one of the most popular programming languages used in business analysis, artificial intelligence, automation, and data science. It is known for its simple syntax, making it easy for beginners—especially MBA students without a technical background—to learn quickly. Python allows you to automate reports, analyze datasets, work with Excel/CSV files, and perform advanced analytics with minimal code. Its readability and large community support make it ideal for business decision-making scenarios. Python programs can be written in any text editor and executed instantly, making it a flexible tool for business professionals.

Example Python Program

# Simple Python Program
print("Welcome to Python for Business Analysis!")  # prints text to the screen

Output:
Welcome to Python for Business Analysis!

Python Data Types: Numbers, Strings, Lists

1. Numbers

Numbers in Python include integers, floating-point numbers, and complex values. They are commonly used in business for calculations such as sales totals, profit margins, interest rates, and forecasting. Python allows direct mathematical operations without requiring complex formulas. You can store numbers in variables and perform addition, subtraction, multiplication, or even financial projections. Numbers can also be combined with strings using formatting, making them ideal for automated report generation.

# Numbers in Python
price = 1200         # integer
tax_rate = 0.18      # float

total = price + (price * tax_rate)   # calculating tax

print("Total Price:", total)

Output:
Total Price: 1416.0
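
The remark about combining numbers with strings can be illustrated with an f-string. This is a minimal sketch: the two-decimal currency style and the wording of the message are illustrative choices, not part of the original example.

```python
# Formatting numbers inside text with an f-string (illustrative values)
price = 1200
tax_rate = 0.18
total = price + price * tax_rate

# :,.2f adds a thousands separator and keeps two decimal places
# :.0% shows the fraction as a whole-number percentage
line = f"Total Price: {total:,.2f} (includes {tax_rate:.0%} tax)"
print(line)
```

Format specifiers like these keep the calculation and the presentation separate, which is handy when the same number appears in several report lines.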

2. Strings

Strings are sequences of characters used to store text such as customer names, email IDs, company titles, or product descriptions. Python provides many string methods for manipulating and analyzing business-related text data. You can clean data, convert text to uppercase, extract email domains, or generate formatted business reports. Strings are flexible, making them essential for preparing automated text-based output.

# Working with Strings
company = "Global Business Solutions"
print(company.upper())             # convert to uppercase
print("Word Count:", len(company.split()))  # count words

Output:
GLOBAL BUSINESS SOLUTIONS
Word Count: 3
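
The text above mentions cleaning data and extracting email domains. The sketch below shows one way to do both with built-in string methods; the email address is made up for illustration.

```python
# Cleaning a text value and extracting the email domain (sample data is made up)
raw_email = "  John.Smith@Example.com  "

email = raw_email.strip().lower()            # remove surrounding spaces, normalise case
username, _, domain = email.partition("@")   # split once at the first "@"

print("Email:", email)
print("Domain:", domain)
```

partition() is used here instead of split() because it always returns three parts, even when the separator is missing.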

3. Lists

Lists store multiple values in a single variable and allow easy addition, removal, or modification of items. They are widely used in business analytics for storing monthly sales numbers, customer groups, product categories, or survey responses. Lists are ordered and dynamically changeable, making them ideal for tasks where data needs to be updated frequently. Python’s list functions help analysts iterate through data and perform calculations efficiently.

# Using Lists in Python
sales = [1200, 1500, 1100, 1800]    # list of monthly sales

sales.append(2000)     # add new sale value
print("Updated Sales:", sales)

print("Highest Sale:", max(sales))

Output:
Updated Sales: [1200, 1500, 1100, 1800, 2000]
Highest Sale: 2000

Python Data Types: Tuples and Dictionaries

1. Tuples

Tuples are similar to lists but cannot be changed after creation. This immutability makes them useful for fixed datasets such as product codes, department names, or months of the year. Tuples are typically a little lighter than lists in memory, and because they cannot be modified in place they protect reference data from accidental changes. They are ideal for storing reference information used in business applications.

# Using Tuples
months = ("Jan", "Feb", "Mar", "Apr")

print("Total Months:", len(months))
print("First Month:", months[0])

Output:
Total Months: 4
First Month: Jan
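
To see immutability in practice, the sketch below tries to change an item in the months tuple and catches the resulting error. The variable names message and extended are introduced here only for illustration.

```python
months = ("Jan", "Feb", "Mar", "Apr")

try:
    months[0] = "January"          # tuples do not support item assignment
except TypeError as error:
    message = str(error)
    print("Cannot modify tuple:", message)

# To "change" a tuple, build a new one instead
extended = months + ("May",)
print(extended)
```

The original tuple is left untouched; the concatenation creates a second tuple with the extra month.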

2. Dictionaries

Dictionaries store data in key-value pairs, making them perfect for representing structured business information. They are commonly used to store customer data, employee records, product details, or business metrics. Dictionaries allow fast lookup and modification using keys, making them ideal for data analysis and report automation.

# Dictionary Example
employee = {
    "name": "John Smith",
    "department": "Finance",
    "salary": 75000
}

print("Employee Name:", employee["name"])
print("Department:", employee["department"])

Output:
Employee Name: John Smith
Department: Finance
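
Lookup and modification by key, mentioned above, could look like the following sketch. The location and bonus keys are made-up additions to the employee record.

```python
employee = {"name": "John Smith", "department": "Finance", "salary": 75000}

employee["salary"] = 80000           # update an existing key
employee["location"] = "London"      # add a new key (made-up value)
bonus = employee.get("bonus", 0)     # .get() returns a default instead of raising KeyError

for key, value in employee.items():  # loop over all key-value pairs
    print(f"{key}: {value}")
```

Using .get() with a default is a common way to read optional fields without wrapping every lookup in a try-except block.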

Files and Exceptions

Working with Files

Python makes it easy to read and write files, which is essential for business analysts who manage reports, financial data, logs, and CSV files. Using Python file handling, analysts can automate report creation, extract information, or save processed data. The with open() structure ensures files are safely opened and closed, reducing errors.

# Writing to a File
with open("report.txt", "w") as file:
    file.write("Sales Report Generated Successfully")

print("File Created!")

Output:
File Created!

Exceptions (Error Handling)

Exception handling lets your program respond to an error instead of stopping with a traceback. This is important in business applications where missing files, incorrect input, or calculation issues are common. A try-except block catches a specific error type, such as ValueError, and runs fallback code, which makes the program more reliable.

# Example of Exception Handling
try:
    num = int("ABC")       # this will cause an error
except ValueError:
    print("Error: Invalid number entered")

Output:
Error: Invalid number entered
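
The missing-file and bad-input cases mentioned above can be handled in one function with two except clauses. This is a sketch under stated assumptions: read_number is a hypothetical helper, and the temporary files exist only so the example runs on its own.

```python
import os
import tempfile

def read_number(path):
    """Read a single number from a text file; return None if that is not possible."""
    try:
        with open(path, "r", encoding="utf-8") as file:
            return float(file.read().strip())
    except FileNotFoundError:
        print(f"Error: {path} was not found")
    except ValueError:
        print(f"Error: {path} does not contain a valid number")
    return None

# Create one valid file and one invalid file in a temporary folder
with tempfile.TemporaryDirectory() as folder:
    good = os.path.join(folder, "good.txt")
    bad = os.path.join(folder, "bad.txt")
    with open(good, "w", encoding="utf-8") as f:
        f.write("1250.50")
    with open(bad, "w", encoding="utf-8") as f:
        f.write("N/A")
    results = [
        read_number(good),
        read_number(bad),
        read_number(os.path.join(folder, "missing.txt")),
    ]

print(results)
```

Catching specific exception types keeps unexpected errors visible, whereas a bare except would hide them.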

Try this example: Exceptions (Error Handling)

try:
    age = int(input("Enter your age: "))
    print("Your age:", age)
except ValueError:
    print("Please enter a valid number!")

Practice Exercises

Create a Python program that calculates profit = selling price − cost price.
Write a program that stores 5 customer names in a list and prints the first and last names in the list.
Create a dictionary for a product and print each value.
Write a script to read a text file and count the number of words.
Use exception handling to prevent errors while dividing numbers.

Types of Operators in Python

Operators in Python are special symbols that perform operations on variables and values. They allow Python to process mathematical expressions, compare results, assign values, and perform logical checks. Operators are essential for business calculations, data filtering, analytics, and automation tasks. There are several types of operators: Arithmetic (like +, -, *), Comparison (like ==, >, <), Logical (and, or, not), Assignment (like +=, -=), and Membership (in, not in). Understanding operators allows analysts to make decisions within code, build rules, and create data-driven workflows efficiently.

# Types of Operators in Python

a = 10
b = 5

print("Arithmetic:", a + b)     # Arithmetic Operator
print("Comparison:", a > b)     # Comparison Operator
print("Logical:", a > 0 and b > 0)   # Logical Operator

x = 20
x += 5     # Assignment Operator
print("Assignment:", x)

print("Membership:", 5 in [1, 2, 3, 4, 5])  # Membership Operator

Output:
Arithmetic: 15
Comparison: True
Logical: True
Assignment: 25
Membership: True
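
Operators are most useful when combined into a rule. The sketch below is a hypothetical discount rule; the threshold of 1000 and the customer types are invented for illustration.

```python
def qualifies_for_discount(order_total, customer_type):
    """Hypothetical rule: large orders from known customer types get a discount."""
    is_large_order = order_total >= 1000                  # comparison operator
    is_known_type = customer_type in ["gold", "silver"]   # membership operator
    return is_large_order and is_known_type               # logical operator

print(qualifies_for_discount(1500, "gold"))
print(qualifies_for_discount(1500, "guest"))
print(qualifies_for_discount(400, "silver"))
```

Naming each intermediate condition makes the rule easier to read and to change later.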

Classes and Objects

Classes and objects form the foundation of Object-Oriented Programming (OOP) in Python. A class is a blueprint or template for creating objects—similar to how a company policy defines employee roles. An object is an instance of a class, containing specific data and behaviors. This structure helps organize business logic, model real-world systems (such as customer records, invoices, or product catalogs), and create scalable applications. OOP makes programs modular, reusable, and easier to maintain. Classes contain attributes (data) and methods (functions), enabling analysts to structure complex workflows cleanly.

# Example of Classes and Objects

class Employee:
    def __init__(self, name, department):
        self.name = name
        self.department = department

    def show_details(self):
        print(f"Employee: {self.name}, Department: {self.department}")

# Creating an object
emp1 = Employee("John Smith", "Finance")
emp1.show_details()

Output:
Employee: John Smith, Department: Finance

Python Example: Classes and Objects (Detailed Explanation)

Full Code:

# Example of Classes and Objects

class Employee:
    def __init__(self, name, department):
        self.name = name
        self.department = department

    def show_details(self):
        print(f"Employee: {self.name}, Department: {self.department}")

# Creating an object
emp1 = Employee("John Smith", "Finance")
emp1.show_details()

What This Code Teaches

This example demonstrates how to create a class, define a constructor (__init__), create object attributes, write methods, and finally create and use an object.

1. What is a Class?

A class is a blueprint for creating objects. In this case, the class is named Employee. A class defines properties and behaviors that created objects will have.

class Employee:
    ...

2. The Constructor: __init__()

The constructor is a special method that runs automatically whenever a new object is created. It initializes the object's attributes.

def __init__(self, name, department):
    self.name = name
    self.department = department

Here:

self refers to the object itself
name and department are inputs provided when creating the object
self.name and self.department store those values inside the object

3. Instance Attributes

These lines create variables that belong to each object:

self.name = name
self.department = department

4. Method Inside the Class

A method is a function defined inside a class. Here, show_details() prints information stored in the object.

def show_details(self):
    print(f"Employee: {self.name}, Department: {self.department}")

5. Creating an Object

The line below creates a new Employee object and automatically calls the constructor.

emp1 = Employee("John Smith", "Finance")

6. Calling the Method

This line calls the show_details() method on the object:

emp1.show_details()

Output:

Employee: John Smith, Department: Finance

Summary

class Employee → defines a blueprint
__init__() → initializes object data
self → refers to the object itself
show_details() → prints stored information
emp1 = Employee(...) → creates an object
emp1.show_details() → calls a method
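
The same pattern extends to methods that compute a value rather than print one. The sketch below uses a made-up Invoice class with a default tax rate; none of these names come from the original example.

```python
class Invoice:
    """A made-up Invoice class that stores data and computes a total."""

    def __init__(self, customer, amount, tax_rate=0.18):
        self.customer = customer
        self.amount = amount
        self.tax_rate = tax_rate

    def total(self):
        # A method can compute a value from the object's own attributes
        return self.amount + self.amount * self.tax_rate

# Each object keeps its own data, so one class can describe many invoices
invoices = [Invoice("Asha", 1000), Invoice("Ravi", 2000, tax_rate=0.05)]
for invoice in invoices:
    print(invoice.customer, invoice.total())
```

Returning the value instead of printing it lets other code reuse the result, for example to sum all invoice totals.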

Reading Files Using `open()`

The open() function lets Python read data stored in text files. This is useful for business analysts who deal with reports, logs, survey results, or exported CRM data. Python can read files line-by-line, load complete content, or process large documents efficiently. Using modes like "r" (read), analysts can extract insights, clean raw text, or convert data into structured form. Reading files is a core part of automation workflows such as scheduled analytics reports, financial statement parsing, and batch processing.

# Reading a file using open()

with open("sample.txt", "r") as file:
    data = file.read()
    print(data)

Output (example):
This is sample file content.

Writing Files Using `open()`

Python can also write information into files using the same open() function but with the mode "w" (write) or "a" (append). This is extremely useful for automating business reports, storing processed data, creating logs, or exporting results after analysis. The write mode replaces existing content, while append mode adds new content without deleting old data. File-writing helps analysts build automated systems where Python saves updates, summaries, or financial calculations into readable documents.

# Writing data to a file

with open("report.txt", "w") as file:
    file.write("Business Report Generated Successfully")

print("File Saved")

Output:
File Saved
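
The difference between write mode and append mode is easiest to see side by side. In this sketch the file name log.txt and its contents are invented, and a temporary folder is used so nothing is left behind.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "log.txt")

    with open(path, "w", encoding="utf-8") as file:   # "w" creates or overwrites
        file.write("first run\n")
    with open(path, "a", encoding="utf-8") as file:   # "a" adds to the end
        file.write("second run\n")
    with open(path, "r", encoding="utf-8") as file:
        after_append = file.read()

    with open(path, "w", encoding="utf-8") as file:   # "w" again: old content is lost
        file.write("fresh start\n")
    with open(path, "r", encoding="utf-8") as file:
        after_overwrite = file.read()

print(repr(after_append))
print(repr(after_overwrite))
```

Append mode is usually the safer choice for logs, while write mode fits reports that are regenerated from scratch.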

Loading Data with Pandas

Tutorial: Create a small DataFrame, save to CSV (current working directory), and read it back using pandas

This short tutorial is written for students. It shows each step with a small example DataFrame, how to save it as students.csv in the current working directory (CWD), and how to import the same CSV again.

Full runnable Python code (copy & run in VS Code / terminal)

import os
import pandas as pd

# 1) Show current working directory (where the CSV will be saved)
print("Current working directory:", os.getcwd())

# 2) Create a small DataFrame
data = {
    "student_id": [101, 102, 103],
    "name": ["Asha", "Ravi", "Maya"],
    "grade": ["5th", "6th", "5th"],
    "score": [88, 92, 79]
}
df = pd.DataFrame(data)
print("\nDataFrame created:")
print(df)

# 3) Save DataFrame to CSV in the current working directory
csv_filename = "students.csv"
df.to_csv(csv_filename, index=False)   # index=False avoids writing the row numbers to file
print(f"\nSaved DataFrame to CSV: {csv_filename}")

# 4) Confirm file exists where expected
full_path = os.path.join(os.getcwd(), csv_filename)
print("Full CSV path:", full_path)
print("File exists?:", os.path.exists(full_path))

# 5) Read the CSV back into a new DataFrame
df_loaded = pd.read_csv(csv_filename)
print("\nCSV loaded back into DataFrame:")
print(df_loaded)

Line-by-line explanation (stepwise)

Imports

import os
import pandas as pd

os helps you find and join file paths and check the current working directory. pandas is the library used for DataFrame operations and CSV read/write.

1) Check the current working directory

print("Current working directory:", os.getcwd())

os.getcwd() returns the folder path where your Python process is running. When you call df.to_csv("students.csv"), the file will be saved inside this directory unless you provide an absolute path.

2) Create a small DataFrame

data = {
    "student_id": [101, 102, 103],
    "name": ["Asha", "Ravi", "Maya"],
    "grade": ["5th", "6th", "5th"],
    "score": [88, 92, 79]
}
df = pd.DataFrame(data)

We build a Python dictionary where keys are column names and values are lists. pd.DataFrame(...) converts it into a table-like structure.

3) Save DataFrame to CSV

csv_filename = "students.csv"
df.to_csv(csv_filename, index=False)

to_csv writes the DataFrame to a CSV file. index=False prevents pandas from writing the row index (0,1,2...) into a separate column — usually what you want for a clean CSV.

4) Confirm file path and existence

full_path = os.path.join(os.getcwd(), csv_filename)
print("File exists?:", os.path.exists(full_path))

This confirms the CSV was written where you expected. Use your OS file explorer or the VS Code Explorer panel to visually verify students.csv appears in the same folder.

5) Read the CSV back into Python

df_loaded = pd.read_csv(csv_filename)
print(df_loaded)

pd.read_csv() reads the CSV file and recreates a DataFrame. This verifies that the saved CSV contains the expected data.

Extra tips for VS Code students

If you run the script from the integrated terminal, check the terminal prompt to see the CWD, or run pwd (macOS/Linux and PowerShell) or cd (Windows Command Prompt).
If the CSV doesn't appear in Explorer: click the Explorer refresh button or reopen the folder.
To save in a specific folder: give an absolute or relative path, e.g. df.to_csv("data/students.csv", index=False). Make sure the folder (data/) exists.
To view only first few rows: use print(df.head()).
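
The tip about making sure the target folder exists can be handled in code. The sketch below assumes pandas is installed and uses a temporary folder as a stand-in for your project folder; os.makedirs with exist_ok=True creates the folder only when it is missing.

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame({"student_id": [101, 102], "score": [88, 92]})

# A temporary folder stands in for your project folder in this sketch
with tempfile.TemporaryDirectory() as project:
    data_dir = os.path.join(project, "data")
    os.makedirs(data_dir, exist_ok=True)   # create data/ if missing; no error if it exists

    csv_path = os.path.join(data_dir, "students.csv")
    df.to_csv(csv_path, index=False)
    saved = os.path.exists(csv_path)
    df_loaded = pd.read_csv(csv_path)

print("Saved:", saved)
print(df_loaded)
```

Without the makedirs call, to_csv raises an error when the data folder does not exist.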

What students should practice next

Create a DataFrame with more rows and different data types (dates, floats, categories).
Try reading other CSVs with pd.read_csv and inspect columns with df.columns.
Experiment with index=True or saving Excel files (df.to_excel(...)), which requires an Excel engine such as openpyxl to be installed.

Pandas is a powerful data analysis library used heavily in business analytics. It provides a DataFrame—a table-like structure ideal for working with financial reports, sales datasets, customer information, and large CSV/Excel files. With Pandas, loading data is straightforward using functions like read_csv() or read_excel(). Pandas automatically structures raw data, making it easy to clean, filter, summarize, visualize, or export. This allows analysts to convert raw business data into insights with just a few lines of code.

import pandas as pd

# Loading CSV data
df = pd.read_csv("sales.csv")

print(df.head())     # view first 5 rows

Output (example):
Displays first 5 rows of the CSV file.

Working With and Saving Data Using Pandas

After loading data, Pandas enables powerful operations such as filtering rows, adding new columns, calculating summaries, grouping data, and merging datasets. This makes it ideal for business intelligence tasks such as revenue analysis, customer segmentation, and financial modeling. Once analysis is complete, Pandas allows saving results back into CSV or Excel formats using to_csv() or to_excel(). This helps automate data pipelines and create shareable business reports.

import pandas as pd

# Working with data
df = pd.DataFrame({
    "Product": ["A", "B", "C"],
    "Sales": [1200, 1500, 1000]
})

df["Tax"] = df["Sales"] * 0.18    # adding new column

# Saving the updated data
df.to_csv("updated_sales.csv", index=False)

print(df)

Output:
  Product  Sales    Tax
0       A   1200  216.0
1       B   1500  270.0
2       C   1000  180.0
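
Filtering, merging, and grouping are mentioned above but not shown. The sketch below assumes pandas is installed; the categories table and its values are made up to give the merge something to join against.

```python
import pandas as pd

sales = pd.DataFrame({
    "Product": ["A", "B", "C"],
    "Sales": [1200, 1500, 1000],
})
# A second, made-up table holding one category per product
categories = pd.DataFrame({
    "Product": ["A", "B", "C"],
    "Category": ["Hardware", "Software", "Hardware"],
})

merged = sales.merge(categories, on="Product", how="left")   # add Category to each sales row
high_sales = merged[merged["Sales"] >= 1200]                 # keep rows that meet a condition
by_category = merged.groupby("Category")["Sales"].sum()      # total sales per category

print(merged)
print(high_sales)
print(by_category)
```

A left merge keeps every sales row even if a product has no matching category, which is usually the safer default for reporting.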

Array-Oriented Programming with NumPy

NumPy (Numerical Python) is the foundation of numerical and scientific computing in Python. It introduces the powerful ndarray object, which is significantly faster and more memory-efficient than Python lists. NumPy allows analysts to perform complex mathematical operations—such as matrix multiplication, statistical analysis, or forecasting—using just a few lines of code. Its vectorized operations eliminate slow Python loops, enabling large-scale business analytics and financial modeling. NumPy arrays support element-wise calculations, broadcasting, reshaping, filtering, and aggregation—making them essential for data preprocessing, machine learning, simulation models, demand forecasting, and quantitative analysis.

import numpy as np

# Creating a NumPy array
sales = np.array([1200, 1500, 1700, 1600])

# Vectorized operations (faster than Python lists)
tax = sales * 0.18       # calculate 18% tax for all values
total_sales = sales + tax

print("Sales:", sales)
print("Tax:", tax)
print("Total Sales:", total_sales)

Output:
Sales: [1200 1500 1700 1600]
Tax: [216. 270. 306. 288.]
Total Sales: [1416. 1770. 2006. 1888.]

More NumPy Features (Reshaping, Aggregation)

import numpy as np

data = np.array([[10, 20, 30],
                 [40, 50, 60]])

print("Sum of all values:", data.sum())
print("Row-wise sum:", data.sum(axis=1))
print("Column-wise max:", data.max(axis=0))

Output:
Sum of all values: 210
Row-wise sum: [ 60 150]
Column-wise max: [40 50 60]
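
Reshaping, broadcasting, and filtering are named in this section but not demonstrated. A minimal sketch, assuming NumPy is installed; the weights array is an invented example.

```python
import numpy as np

monthly = np.array([10, 20, 30, 40, 50, 60])

# Reshape the flat array into 2 rows x 3 columns (the element count must match)
table = monthly.reshape(2, 3)

# Broadcasting: the 3-element array is applied to every row of the 2x3 table
weights = np.array([1.0, 0.5, 2.0])
weighted = table * weights

print(table)
print(weighted)
print("Values above 25:", monthly[monthly > 25])   # boolean filtering
```

Broadcasting works here because the last dimension of table (3) matches the length of weights.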

Data Cleaning and Preparation

Data cleaning and preparation are critical steps in business analytics because raw data often contains missing values, duplicates, inconsistent formatting, or incorrect types. Before analysis or modeling, data must be standardized, cleaned, and transformed into a usable format. Pandas provides powerful tools for handling missing data, removing duplicates, converting data types, renaming columns, filtering records, and creating new calculated fields. Clean data leads to accurate insights and ensures decisions are based on reliable information. This step is often described as the backbone of data science, and it commonly takes up a large share of the analytics workflow.

import pandas as pd

# Sample raw data
data = {
    "Product": ["A", "B", "C", "C"],
    "Sales": [1200, None, 1500, 1500],
    "City": ["Delhi", "Mumbai", "Delhi", "Delhi"]
}

df = pd.DataFrame(data)

# Cleaning operations
df["Sales"] = df["Sales"].fillna(df["Sales"].mean())   # fill missing values
df = df.drop_duplicates()                              # remove duplicate rows
df["City"] = df["City"].str.upper()                    # standardize text

print(df)

Output:
  Product   Sales    City
0       A  1200.0   DELHI
1       B  1400.0  MUMBAI
2       C  1500.0   DELHI

Additional Cleaning Tasks

# More cleaning operations

df["Sales_Tax"] = df["Sales"] * 0.18   # new calculated column

filtered = df[df["Sales"] > 1300]      # filter by condition

print(filtered)

Output:
  Product   Sales    City  Sales_Tax
1       B  1400.0  MUMBAI      252.0
2       C  1500.0   DELHI      270.0
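
Renaming columns and converting data types are listed among the cleaning tasks but not shown above. The sketch below assumes pandas is installed; the raw table and its column names are made up.

```python
import pandas as pd

raw = pd.DataFrame({
    "prod name": ["A", "B", "C"],
    "units": ["10", "25", "7"],                              # numbers stored as text
    "order_date": ["2024-01-05", "2024-01-06", "2024-01-09"],
})

clean = raw.rename(columns={"prod name": "Product", "units": "Units"})
clean["Units"] = clean["Units"].astype(int)                  # text -> integer
clean["order_date"] = pd.to_datetime(clean["order_date"])    # text -> datetime

print(clean.dtypes)
print("Total units:", clean["Units"].sum())
```

Converting types early matters because text columns cannot be summed or compared as numbers or dates.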

Plotting and Visualization

Plotting and visualization play a crucial role in business analytics, helping convert raw data into meaningful visual patterns. Python’s Matplotlib library is widely used to create charts such as line graphs, bar charts, histograms, pie charts, and scatter plots. These visuals help students and managers quickly observe trends, compare categories, and detect business patterns like seasonality, fluctuations, or outliers. Visualization simplifies communication and supports decision-making by presenting complex information in an easy-to-understand manner. Matplotlib is flexible, customizable, and works well with Pandas, making it an essential tool for reporting, dashboards, and presentations.

Example: Monthly Sales Line Graph

import matplotlib.pyplot as plt

# Sample sales data
months = ["Jan", "Feb", "Mar", "Apr", "May"]
sales = [1200, 1500, 1700, 1600, 2000]

plt.plot(months, sales, marker='o')
plt.title("Monthly Sales Trend")
plt.xlabel("Months")
plt.ylabel("Sales in USD")
plt.grid(True)
plt.show()

Graph Output (Example):
A line graph showing sales increasing from Jan to May with markers on each point.

Example: Bar Chart of City-wise Revenue

cities = ["Delhi", "Mumbai", "Chennai", "Kolkata"]
revenue = [50000, 60000, 45000, 52000]

plt.bar(cities, revenue)
plt.title("Revenue by City")
plt.xlabel("City")
plt.ylabel("Revenue")
plt.show()

Graph Output (Example):
A bar chart with four bars representing revenue in each city.

Data Aggregation and Group Operations

Data aggregation is the process of summarizing information to identify meaningful patterns. In business analytics, this includes performing operations such as sum, mean, count, max, or min across different groups. Pandas provides the groupby() function, which allows analysts to group data based on categories like region, product type, or month, and perform descriptive statistics. Group operations are essential for tasks such as analyzing total sales by region, average revenue per product, customer segmentation, or comparing performance across time periods. Aggregation transforms raw transactional data into insights suitable for dashboards and strategic decisions.

Example: Aggregating Sales by City

import pandas as pd

# Sample business data
data = {
    "City": ["Delhi", "Mumbai", "Delhi", "Chennai", "Mumbai"],
    "Sales": [1200, 1500, 1800, 1300, 1700]
}

df = pd.DataFrame(data)

# Group by City and calculate total sales
city_sales = df.groupby("City")["Sales"].sum()

print(city_sales)

Output:
City
Chennai    1300
Delhi      3000
Mumbai     3200
Name: Sales, dtype: int64

Example: Multiple Aggregations (Sum, Mean)

product_data = {
    "Product": ["A", "A", "B", "B", "C"],
    "Revenue": [2000, 2500, 1800, 2200, 3000]
}

df2 = pd.DataFrame(product_data)

# Apply multiple aggregation functions
summary = df2.groupby("Product")["Revenue"].agg(["sum", "mean", "max"])

print(summary)

Output:
          sum    mean   max
Product
A        4500  2250.0  2500
B        4000  2000.0  2200
C        3000  3000.0  3000

Intro Python: Small Dataset + Step-by-step Examples

Intro to Python — tiny dataset + step-by-step practice

A compact, easy dataset and annotated Python snippets covering: basic data types, functions, strings & lists, tuples & dicts, files & exceptions, operators, classes/objects, reading/writing files, pandas & numpy, cleaning, plotting, and group aggregation. Copy code into a Python file or Jupyter notebook and run.

Tiny dataset (CSV) — "students.csv"

This is a very small dataset (6 rows). Students should copy this into a file named students.csv or run the code below to create it automatically.

Name,Age,Grade,Passed,Math,English,Date
Alice,20,Junior,True,85,78,2024-09-10
Bob,19,Sophomore,False,62,55,2024-09-12
Cara,21,Senior,True,91,88,2024-09-11
Dan,20,Junior,True,73,80,2024-09-10
Eva,22,Senior,False,58,65,2024-09-12
Finn,19,Sophomore,True,79,82,2024-09-11

Tip: students can either save the CSV manually or run the "Create CSV" Python snippet below to create it automatically in the working folder.

Create the CSV with Python (copy & run)

# create_students_csv.py
csv_text = """Name,Age,Grade,Passed,Math,English,Date
Alice,20,Junior,True,85,78,2024-09-10
Bob,19,Sophomore,False,62,55,2024-09-12
Cara,21,Senior,True,91,88,2024-09-11
Dan,20,Junior,True,73,80,2024-09-10
Eva,22,Senior,False,58,65,2024-09-12
Finn,19,Sophomore,True,79,82,2024-09-11
"""
with open("students.csv", "w", encoding="utf-8") as f:
    f.write(csv_text)
print("students.csv created (6 rows).")

Run this once to create students.csv in the same folder as your script.

1) Introduction to Python: Data types

Show basic Python types and simple operations.

# types_demo.py
# 1. Numbers
a = 10         # int
b = 3.5        # float

# 2. Strings
s = "Hello"

# 3. Boolean
flag = True

# 4. None (no value)
x = None

# Print types and simple arithmetic
print(type(a), type(b), type(s), type(flag), type(x))
print("sum:", a + b)          # addition
print("concat:", s + " world")

Explanation: type() tells you the variable type. Integers and floats support arithmetic; strings can be concatenated.

2) Functions

Define and use functions with parameters and return values.

# functions_demo.py
def add(x, y):
    """Return the sum of x and y."""
    return x + y

def greet(name="Student"):
    """Return a greeting for the given name."""
    return f"Hello, {name}!"

print(add(3, 4))          # prints 7
print(greet("Aisha"))     # prints "Hello, Aisha!"
print(greet())            # uses default argument

Explanation: functions keep code reusable. Default arguments are helpful for optional values.

3) Strings and Lists

Common string methods and list operations.

# strings_lists.py
# Strings
s = "  Data Science  "
print("Strip:", s.strip())          # remove surrounding spaces
print("Upper:", s.upper())          # UPPERCASE
print("Slice:", s[2:10])           # substring

# Lists
students = ["Alice", "Bob", "Cara"]
students.append("Dan")             # add element
students.insert(1, "Zoe")          # insert at index
print(students)
print("pop:", students.pop())      # remove last
print("index of Bob:", students.index("Bob"))

Explanation: strings are immutable; lists are mutable and support append/insert/pop.

4) Tuples and Dictionaries

Tuples are immutable; dictionaries map keys to values.

# tuples_dicts.py
# Tuple (immutable)
coords = (10, 20)
# coords[0] = 5  # would raise an error

# Dictionary
student = {"name":"Alice", "age":20, "grade":"Junior"}
print(student["name"])             # access by key
student["age"] = 21                # update
student["city"] = "Mumbai"         # add new key
print(student)

# iterate dict
for k, v in student.items():
    print(k, "->", v)

Explanation: use dicts when you want labeled fields (like columns).

5) Files and Exceptions (reading/writing basics)

How to safely read/write files and handle exceptions.

# files_exceptions.py
# Read whole file
try:
    with open("students.csv", "r", encoding="utf-8") as f:
        contents = f.read()
    print("File loaded, length:", len(contents))
except FileNotFoundError:
    print("students.csv not found. Run the create script first.")

# Write a small text file
try:
    with open("notes.txt", "w", encoding="utf-8") as f:
        f.write("This is a note from Python.")
    print("notes.txt written.")
except Exception as e:
    print("Error writing file:", e)

Explanation: with ensures the file is closed; exceptions help handle errors gracefully.

6) Types of Operators (arithmetic, comparison, logical)

# operators_demo.py
x = 7
y = 3

# arithmetic
print(x + y, x - y, x * y, x / y, x // y, x % y, x ** y)

# comparisons
print(x > y, x == y, x != y)

# logical
print((x > 5) and (y < 5))
print((x > 10) or (y < 5))
print(not (x == y))

Explanation: use the right operator for integer division (//), power (**), and logical tests.

7) Classes and Objects (simple example)

# classes_demo.py
class Student:
    def __init__(self, name, age, grade):
        # constructor: run when object is created
        self.name = name
        self.age = age
        self.grade = grade

    def is_adult(self):
        return self.age >= 18

    def summary(self):
        return f"{self.name} ({self.age}) - {self.grade}"

# create objects
s = Student("Alice", 20, "Junior")
print(s.summary())
print("Is adult?", s.is_adult())

Explanation: classes bundle data (attributes) and behavior (methods).

8) Reading files with open (CSV) & 9) Writing files with open

There are two ways to do this: parsing the CSV manually or using the csv module. For quick practice we'll use the csv module.

# read_write_csv.py
import csv

# Read students.csv
with open("students.csv", "r", encoding="utf-8") as f:
    reader = csv.DictReader(f)     # each row is a dict keyed by column name
    rows = list(reader)

print("Rows read:", len(rows))
print("First row:", rows[0])

# Modify in-memory and write a new CSV
rows[0]["City"] = "Pune"  # add a new column value to first row
fieldnames = list(rows[0].keys())
with open("students_modified.csv", "w", encoding="utf-8", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)
print("students_modified.csv written.")

Explanation: DictReader makes CSV rows accessible by column name; DictWriter writes structured output.
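
For comparison, here is a rough sketch of the manual parsing approach mentioned above, run next to the csv module on the same text. The sample data is invented for this example (the real students.csv may have different columns); it includes a quoted field containing a comma to show where naive splitting goes wrong.

```python
# manual_csv_demo.py
import csv
import io

# Hypothetical sample text, not the real students.csv
text = 'Name,Age,City\nAlice,20,Pune\nBob,19,"Mumbai, MH"\n'

# Manual parsing: split into lines, then split each line on commas
lines = text.strip().split("\n")
header = lines[0].split(",")
manual_rows = [dict(zip(header, line.split(","))) for line in lines[1:]]

# csv module: understands quoted fields that contain commas
module_rows = list(csv.DictReader(io.StringIO(text)))

print(manual_rows[1])   # City is cut off at the comma inside the quotes
print(module_rows[1])   # City is kept whole
```

Manual splitting is fine for very simple files, but the csv module is the safer default once fields can contain commas or quotes.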

10) Loading data with Pandas

Basic pandas usage: read a CSV, then inspect it with head(), dtypes, and describe().

# pandas_basic.py
import pandas as pd

df = pd.read_csv("students.csv", parse_dates=["Date"])
print(df.head())          # show first rows
print(df.dtypes)          # column types
print(df.describe())      # summary stats for numeric columns

Explanation: parse_dates turns the Date column into datetime objects.
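
As a small illustration of why datetime columns are useful, the sketch below pulls out date parts with the .dt accessor and filters by date. It reads from an in-memory string instead of students.csv, and the two sample rows are hypothetical.

```python
# pandas_dates_demo.py
import io
import pandas as pd

# Hypothetical sample data standing in for students.csv
csv_text = "Name,Date\nAlice,2024-01-15\nBob,2024-03-02\n"

df = pd.read_csv(io.StringIO(csv_text), parse_dates=["Date"])
print(df.dtypes)                      # Date is a datetime column, not text

# The .dt accessor exposes date parts once the column is datetime
df["Year"] = df["Date"].dt.year
df["Month"] = df["Date"].dt.month
print(df)

# Dates can be compared directly for filtering
after_feb = df[df["Date"] > pd.Timestamp("2024-02-01")]
print(after_feb["Name"].tolist())
```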

11) Working with data and saving with Pandas

# pandas_save.py
import pandas as pd
df = pd.read_csv("students.csv", parse_dates=["Date"])

# Add a computed column: total score
df["Total"] = df["Math"] + df["English"]

# Filter: passed students
passed = df[df["Passed"] == True]

# Save filtered dataframe
passed.to_csv("students_passed.csv", index=False)
print("Saved students_passed.csv with", len(passed), "rows.")

Explanation: DataFrames let you add columns and save to disk with to_csv.

12) Array-oriented programming with NumPy

# numpy_demo.py
import numpy as np

# create numpy array from list
scores = np.array([85, 62, 91, 73, 58, 79])

print("mean:", scores.mean())
print("std:", scores.std())
print("add 5 to all:", scores + 5)   # vectorized operation (fast)

# boolean mask
mask = scores >= 70
print("scores >= 70:", scores[mask])

Explanation: NumPy applies operations to entire arrays at once (vectorized), which is faster than Python loops for large data.
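
To make the comparison with loops concrete, the sketch below does the same job twice: once with a plain Python loop and once with vectorized NumPy calls. The "add 5 points but cap at 100" rule is invented for this example.

```python
# numpy_vectorized_demo.py
import numpy as np

scores = np.array([85, 62, 91, 73, 58, 79])

# Loop version: handle one element at a time
curved_loop = []
for s in scores:
    curved_loop.append(min(int(s) + 5, 100))

# Vectorized version: one call handles the whole array
curved_vec = np.minimum(scores + 5, 100)

print("loop:      ", curved_loop)
print("vectorized:", curved_vec.tolist())

# A boolean mask can be counted or turned into labels without a loop
passed_count = int((scores >= 70).sum())
labels = np.where(scores >= 70, "pass", "fail")
print("passed:", passed_count)
print("labels:", labels.tolist())
```

Both versions give the same numbers; the vectorized one is shorter, and the speed difference becomes noticeable as arrays grow.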

13) Data cleaning and preparation (Pandas)

# pandas_cleaning.py
import pandas as pd
df = pd.read_csv("students.csv", parse_dates=["Date"])

# 1. Check for missing values
print(df.isna().sum())

# 2. Example: fill missing numeric values with column mean (if any)
# df["Math"] = df["Math"].fillna(df["Math"].mean())

# 3. Convert types: ensure Age is integer
#    (astype(int) raises if Age has missing values, so fill them first)
df["Age"] = df["Age"].astype(int)

# 4. Remove duplicate rows (if present)
df = df.drop_duplicates()

# 5. Rename column
df = df.rename(columns={"English": "Eng"})

print("Cleaned dataframe:")
print(df.head())

Explanation: common cleaning steps are to find and fill missing values, correct data types, drop duplicates, and rename columns.
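
If your students.csv happens to have no gaps, the steps above have little visible effect. The sketch below runs the same steps on a small in-memory table that does have problems. The rows are invented for illustration, and filling Age with the median is just one reasonable choice, not the only one.

```python
# pandas_cleaning_messy.py
import io
import pandas as pd

# Hypothetical messy data: one missing Age, one missing Math, one duplicate row
csv_text = (
    "Name,Age,Math\n"
    "Alice,20,85\n"
    "Bob,,62\n"
    "Cara,19,\n"
    "Alice,20,85\n"
)
df = pd.read_csv(io.StringIO(csv_text))
print(df.isna().sum())          # Age: 1 missing, Math: 1 missing

# Fill missing Math scores with the column mean
df["Math"] = df["Math"].fillna(df["Math"].mean())

# Fill missing Age first, then convert: astype(int) fails on missing values
df["Age"] = df["Age"].fillna(df["Age"].median()).astype(int)

# Drop the repeated Alice row and renumber the index
df = df.drop_duplicates().reset_index(drop=True)

print(df)
```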

14) Plotting and Visualization (Matplotlib)

Basic line / bar / histogram plots (students can run in Jupyter or a Python script that opens a window).

# plotting_demo.py
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("students.csv")
df["Total"] = df["Math"] + df["English"]

# Bar plot of total score by student
plt.figure(figsize=(6,4))
plt.bar(df["Name"], df["Total"])
plt.title("Total score by student")
plt.xlabel("Student")
plt.ylabel("Total")
plt.tight_layout()
plt.show()

# Histogram of Math scores
plt.figure(figsize=(6,4))
plt.hist(df["Math"], bins=5)
plt.title("Distribution of Math scores")
plt.xlabel("Math score")
plt.ylabel("Count")
plt.tight_layout()
plt.show()

Explanation: plt.show() displays the figure (in notebooks it appears inline; in scripts it opens a window).

15) Data aggregation and group operations (Pandas)

# groupby_demo.py
import pandas as pd
df = pd.read_csv("students.csv")

# Example: average Math score by Grade (Junior/Sophomore/Senior)
grouped = df.groupby("Grade")["Math"].mean().reset_index().rename(columns={"Math":"Avg_Math"})
print(grouped)

# Aggregation with multiple functions
agg = df.groupby("Grade").agg({
    "Math": ["mean", "min", "max"],
    "English": ["mean"]
})
print(agg)

# Count passed vs not passed
pass_counts = df.groupby("Passed").size().reset_index(name="Count")
print(pass_counts)

Explanation: groupby + agg compute summary stats per group (very common in data analysis).
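
The dictionary form of agg shown above produces two-level column names. As an alternative sketch, pandas also supports named aggregation, which gives flat column names that you choose yourself. The data and the output names (avg_math and so on) are made up for this example.

```python
# groupby_named_agg.py
import pandas as pd

# Hypothetical data standing in for students.csv
df = pd.DataFrame({
    "Grade": ["Junior", "Junior", "Senior", "Senior"],
    "Math": [80, 90, 70, 60],
    "English": [75, 85, 65, 95],
})

# Named aggregation: new_column=(source_column, function)
summary = df.groupby("Grade").agg(
    avg_math=("Math", "mean"),
    max_math=("Math", "max"),
    avg_english=("English", "mean"),
).reset_index()

print(summary)
```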

Suggested order for students to practice

1. Run the CSV creation script to get students.csv.
2. Try basic types, strings and lists, tuples and dicts, and operators.
3. Practice functions and a simple class.
4. Work through file reading/writing and exceptions.
5. Install pandas, NumPy, and Matplotlib (pip install pandas numpy matplotlib) and try the Pandas examples.
6. Do cleaning, then plotting, then groupby aggregation.

Note: to run plotting code in a headless environment (like some servers) you may need to run in Jupyter or save plots to files using plt.savefig("plot.png").
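
A minimal sketch of the save-to-file approach follows. Selecting the non-interactive "Agg" backend is an addition here, not something the examples above require, but it lets Matplotlib draw without a display. The names and totals are hypothetical, and the file is written to the system temp folder.

```python
# plot_to_file.py
import os
import tempfile

import matplotlib
matplotlib.use("Agg")            # non-interactive backend: no window needed
import matplotlib.pyplot as plt

# Hypothetical data standing in for students.csv
names = ["Alice", "Bob", "Cara"]
totals = [170, 140, 155]

fig, ax = plt.subplots(figsize=(6, 4))
ax.bar(names, totals)
ax.set_title("Total score by student")
ax.set_xlabel("Student")
ax.set_ylabel("Total")
fig.tight_layout()

# Save instead of calling plt.show()
out_path = os.path.join(tempfile.gettempdir(), "plot.png")
fig.savefig(out_path, dpi=100)
plt.close(fig)                   # free the figure's memory
print("Saved", out_path)
```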

Small Exercises for Students (practice)

1. Calculate the average total score for all students (use Pandas or NumPy).
2. Find the student with the highest Math score and print their name.
3. Create a function that accepts a student's name and returns their English score (or a message if not found).
4. Add a column Result that contains "Pass" if Passed==True else "Fail", and save to a new CSV.
5. Group by Grade and plot average total score per grade.
