Python Data Essentials: Pandas, NumPy, and String Methods
Introduction to Pandas
Pandas is a powerful and flexible Python library used for data manipulation, analysis, and cleaning. It is suitable for handling different kinds of data, such as:
- Tabular data with heterogeneous columns (different types of data in a single dataset).
- Ordered and unordered time-series data (observations do not need to arrive at a fixed frequency).
- Arbitrary matrix data with row & column labels.
- Unlabeled data: observational or statistical data sets need no labels at all to be placed in a Pandas data structure.
Essential Pandas Operations
Pandas offers numerous functions for data wrangling. Here are key operations:
- Slicing DataFrames: Allows extracting specific rows or columns from a dataset. A DataFrame is a two-dimensional data structure in Pandas.
- Merging & Joining: Merging combines two DataFrames into one by matching rows on one or more common columns that you specify. Joining is similar but, by default, matches rows on the index instead of on columns.
- Concatenation: Stacks multiple DataFrames together vertically (rows) or horizontally (columns).
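The operations above can be sketched on two tiny tables. The DataFrames, column names, and values below are made up for illustration; only the Pandas calls themselves (.loc, .merge(), .join(), pd.concat()) come from the library.

```python
import pandas as pd

# Hypothetical sample data, invented for this sketch.
employees = pd.DataFrame({
    "emp_id": [1, 2, 3],
    "name": ["Asha", "Ben", "Chen"],
    "dept_id": [10, 20, 10],
})
departments = pd.DataFrame({
    "dept_id": [10, 20],
    "dept_name": ["Sales", "IT"],
})

# Slicing: rows labeled 0 through 1, and only two of the columns.
subset = employees.loc[0:1, ["name", "dept_id"]]

# Merging: match rows on the shared "dept_id" column.
merged = employees.merge(departments, on="dept_id", how="left")

# Joining: match rows on the index instead of a column.
joined = employees.set_index("dept_id").join(departments.set_index("dept_id"))

# Concatenation: stack rows (axis=0) or place columns side by side (axis=1).
stacked = pd.concat([employees, employees], axis=0, ignore_index=True)
side_by_side = pd.concat([employees, departments], axis=1)

print(merged)
```

Note that .loc slices by label and includes the end label, so 0:1 selects two rows here.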
Modifying DataFrames
- Changing Index: You can set or reset the index using the .set_index() or .reset_index() methods.
- Renaming Columns: Change column names using .rename().
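A minimal sketch of these three methods follows. The DataFrame and its column names ("id", "nm") are hypothetical and exist only to show the calls.

```python
import pandas as pd

# Hypothetical sample data, invented for this sketch.
df = pd.DataFrame({"id": [101, 102], "nm": ["Asha", "Ben"]})

# Rename a column; rename() returns a new DataFrame by default.
df = df.rename(columns={"nm": "name"})

# Use the "id" column as the index, then look up a row by its label.
indexed = df.set_index("id")
print(indexed.loc[102, "name"])

# Move the index back into an ordinary column with a fresh 0..n-1 index.
restored = indexed.reset_index()
print(list(restored.columns))
```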
Data Munging
Data munging refers to transforming data from one format to another.
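As one small illustration of munging, the sketch below turns text fields into clean, typed columns. The raw records are invented, and this is only one of many possible transformations.

```python
import pandas as pd

# Hypothetical raw records: names carry stray spaces, prices arrive as text.
raw = pd.DataFrame({"item": [" pen", "book "], "price": ["10", "250"]})

clean = raw.copy()
clean["item"] = clean["item"].str.strip()        # remove surrounding whitespace
clean["price"] = pd.to_numeric(clean["price"])   # convert text to numbers

print(clean)
print(clean["price"].sum())
```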
Python NumPy Fundamentals
NumPy (Numerical Python) is a core Python library used for scientific computing.
- It provides an N-dimensional array object, which helps with efficient data storage and manipulation.
- NumPy supports linear algebra, random number generation, and integration with other languages (C, C++).
NumPy Arrays
Single-Dimensional Array
Stores elements in a sequence. For example:
import numpy as np
a = np.array([1,2,3])
print(a)
Multi-Dimensional Array
Allows complex matrix operations. For example:
a = np.array([(1,2,3), (4,5,6)])
print(a)
Why Use NumPy Instead of Python Lists?
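- Less Memory Usage: NumPy arrays store values of a single type in one contiguous block, so they typically take up far less space than a list of separate Python objects.
- Faster Execution: Computations on whole NumPy arrays run in compiled code, so they are usually much quicker than equivalent Python loops over lists.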
- Convenience: Provides built-in functions for efficient data manipulation.
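The memory claim can be checked roughly as follows. This is a sketch, not a benchmark: the element count is arbitrary, and the list estimate simply adds up sys.getsizeof for the list and each integer in it, which slightly overstates the cost because CPython shares small integer objects.

```python
import sys
import numpy as np

n = 1000
py_list = list(range(n))
arr = np.arange(n, dtype=np.int64)

# A list stores pointers to separate int objects; count both (rough estimate).
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(x) for x in py_list)

# An array stores raw 8-byte values in one contiguous buffer.
array_bytes = arr.nbytes

print(list_bytes, array_bytes)

# One vectorized expression replaces an explicit Python loop.
doubled_list = [x * 2 for x in py_list]
doubled_arr = arr * 2
```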
Core NumPy Operations
- Find Dimension (ndim): Determines whether an array is single or multi-dimensional.
  a = np.array([(1,2,3), (4,5,6)])
  print(a.ndim) # Output: 2
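- Byte Size (itemsize): Displays the size in bytes of each element in memory.
  a = np.array([(1,2,3)])
  print(a.itemsize) # Output: 8 when the default integer is int64, 4 when it is int32
- Data Type (dtype): Identifies the data type of elements in an array.
  a = np.array([(1,2,3)])
  print(a.dtype) # Output: int64 on most 64-bit platforms (int32 on Windows before NumPy 2.0)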
- Array Size & Shape: size gives the total number of elements, and shape gives rows & columns.
  a = np.array([(1,2,3,4,5,6)])
  print(a.size) # Output: 6
  print(a.shape) # Output: (1, 6)
- Reshape (reshape): Rearranges an array into a different row-column structure.
  a = np.array([(8,9,10), (11,12,13)])
  a = a.reshape(3,2)
  print(a)
- Linspace (linspace): Generates evenly spaced values in a range.
  a = np.linspace(1,3,10) # 10 values from 1 to 3, both ends included
  print(a)
- Finding Min/Max/Sum: Get statistical values from an array.
  a = np.array([1,2,3])
  print(a.min()) # Output: 1
  print(a.max()) # Output: 3
  print(a.sum()) # Output: 6
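Because the default integer type depends on the platform and NumPy version, a sketch that needs predictable itemsize and dtype values can pass dtype explicitly. The array contents here are arbitrary.

```python
import numpy as np

# Fixing the dtype makes itemsize predictable across platforms.
a = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int32)
print(a.ndim, a.itemsize, a.dtype)   # 2 4 int32
print(a.size, a.shape)               # 6 (2, 3)

# reshape keeps the same six elements in row-major order.
b = a.reshape(3, 2)
print(b.tolist())                    # [[1, 2], [3, 4], [5, 6]]

# linspace includes both endpoints by default.
c = np.linspace(1, 3, 5)
print(c.tolist())                    # [1.0, 1.5, 2.0, 2.5, 3.0]
```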
Introduction to Strings in Python
- A string is a sequence of characters enclosed in single ('), double ("), or triple (''' or """) quotes.
- Strings are immutable, meaning they cannot be modified after creation.
- Python provides built-in string functions for easy manipulation.
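Immutability can be seen in a short sketch: assigning to a character position fails, while string methods hand back a new string and leave the original untouched. The variable names are arbitrary.

```python
s = "python"

# Item assignment is not allowed on a str.
try:
    s[0] = "P"
except TypeError as exc:
    print("TypeError:", exc)

# Methods build and return a new string instead.
t = s.capitalize()
print(s, t)   # python Python
```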
Changing Case in Strings
- str.upper(): Converts all characters to uppercase.
- str.lower(): Converts all characters to lowercase.
Example:
ss = "Softpro India"
print(ss.upper()) # Output: SOFTPRO INDIA
print(ss.lower()) # Output: softpro india
Joining, Splitting, and Replacing Strings
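- str.join(): Joins an iterable of strings, using the string it is called on as the separator.
- str.split(): Splits a string into a list of substrings at a separator (whitespace by default, which yields a list of words).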
- str.replace(): Replaces a substring with another value.
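The three methods above can be tried on a sample sentence. The sentence and the "-" separator are arbitrary choices for this sketch.

```python
sentence = "data science with python"

words = sentence.split()          # no argument: split on whitespace
print(words)                      # ['data', 'science', 'with', 'python']

slug = "-".join(words)            # the separator is the string join() is called on
print(slug)                       # data-science-with-python

print(sentence.replace("python", "pandas"))   # data science with pandas

fields = "a,b,,c".split(",")      # an explicit separator keeps empty fields
print(fields)                     # ['a', 'b', '', 'c']
```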
Boolean String Methods
- Used for validating input types (e.g., names should be alphabetic, postal codes should be numeric).
- Methods return True or False based on string properties (each returns False for an empty string):
  - str.isalnum(): Checks if all characters are letters or numbers.
  - str.isalpha(): Checks if all characters are alphabetic.
  - str.isnumeric(): Checks if all characters are numeric.
  - str.isspace(): Checks if the string contains only whitespace.
  - str.isupper(): Checks if all cased characters in the string are uppercase.
  - str.islower(): Checks if all cased characters in the string are lowercase.
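A minimal validation sketch built on these methods follows. The two helper functions and their rules (letters separated by single spaces, exactly six digits) are hypothetical and only illustrate the idea. The isascii() check is added because isnumeric() also accepts non-ASCII numeric characters such as "½".

```python
def is_valid_name(text):
    """Hypothetical rule: alphabetic parts separated by single spaces."""
    return all(part.isalpha() for part in text.split(" "))


def is_valid_postal_code(text):
    """Hypothetical rule: exactly six ASCII digits."""
    return len(text) == 6 and text.isascii() and text.isnumeric()


print(is_valid_name("Softpro India"))      # True
print(is_valid_name("R2D2"))               # False
print(is_valid_postal_code("226010"))      # True
print(is_valid_postal_code("22601A"))      # False
print("SOFTPRO 1".isupper())               # True: only cased characters count
```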
 
Checking Palindromes
- A palindrome reads the same forward and backward.
- Example:
  python_string = input("Enter a string: ")
  reverse_string = "".join(reversed(python_string))
  if python_string == reverse_string:
      print("String is palindrome")
  else:
      print("String is non-palindrome")
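The same check can be wrapped in a reusable function. This sketch makes one assumption that the example above does not: case, spaces, and punctuation are ignored, so phrases also qualify. The function name is hypothetical.

```python
def is_palindrome(text):
    """Return True if text reads the same in both directions.

    Assumption for this sketch: case, spaces, and punctuation are ignored.
    """
    cleaned = [ch.lower() for ch in text if ch.isalnum()]
    return cleaned == cleaned[::-1]


print(is_palindrome("Madam"))                            # True
print(is_palindrome("A man, a plan, a canal: Panama"))   # True
print(is_palindrome("python"))                           # False
```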
Generating Shortened Names
Converts a full name into an abbreviated format.
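name = input("Enter your full name: ")
# split() with no argument ignores repeated, leading, and trailing spaces
shortname = name.split()
print("Your short name:", end=" ")
# Print the initial of every part except the last
for n in range(len(shortname)-1):
    print(shortname[n][0] + ".", end="")
# Print the last part in full
print(shortname[-1])
Replacing Words in a Sentence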
Searches and replaces a word in a sentence.
sentence = input("Enter a sentence: ")
fw = input("Find what? ")
rw = input("Replace with: ")
print("Modified sentence: " + sentence.replace(fw, rw))
Number Format Conversions
Converts a decimal number into binary, octal, and hexadecimal.
n = int(input("Enter a number: "))
print("Binary format:", bin(n).replace("0b", ""))
print("Octal format:", oct(n).replace("0o", ""))
print("Hexadecimal format:", hex(n).replace("0x", ""))
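As an alternative sketch, the built-in format() produces the same digit strings without the prefix, so no replace() call is needed. The function name to_bases is hypothetical.

```python
def to_bases(n):
    """Return binary, octal, and hexadecimal digit strings for an integer."""
    return {
        "binary": format(n, "b"),
        "octal": format(n, "o"),
        "hexadecimal": format(n, "x"),
    }


result = to_bases(255)
print(result)   # {'binary': '11111111', 'octal': '377', 'hexadecimal': 'ff'}

# int() with an explicit base converts a digit string back to a number.
print(int(result["hexadecimal"], 16))   # 255
```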