
A Comprehensive Guide to Python Data Types for Data Science



I. Introduction to Python Data Types


Python is a dynamically typed language known for its simplicity and versatility. A key part of working in Python, and in programming generally, is understanding data types: they determine what values a variable can hold and which operations apply to them. In data science we frequently work with sequence types such as lists, tuples, and strings. This tutorial focuses on these types to help you strengthen your data science skills.


II. Container Sequences


1. Definition and functions of container sequences


Container sequences, as the name suggests, are types that hold multiple items in a defined order. Lists and tuples can hold items of different types in the same sequence, while a string holds only characters. These types support indexing, slicing, and iteration, which are indispensable in data science.


2. Mutability and immutability of container sequences


Every sequence type in Python is either mutable or immutable. Mutable sequences, such as lists, let you change their contents after creation, whereas immutable sequences, such as tuples and strings, do not.


3. The concept of iterating or looping over data


Iteration, or looping, is a cornerstone of programming and data analysis. It allows you to execute the same block of code for each item in a sequence or any other iterable object.
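
As a minimal sketch of the idea, the loop below visits each item in a list and then each character in a string. The temperature readings are made-up sample values, not data from this tutorial.

```python
# Hypothetical sample data: three temperature readings
temperatures = [21.5, 23.0, 19.8]

# Visit each reading in order and accumulate a running total
total = 0.0
for reading in temperatures:
    total += reading
average = total / len(temperatures)
print(round(average, 2))

# The same for-loop syntax works on a string (or a tuple)
letters = []
for character in "abc":
    letters.append(character)
print(letters)
```

The loop body runs once per item, so the same code handles a sequence of any length.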


III. Lists


1. What are lists and why are they used


A Python list is a built-in data type that can contain multiple items in an ordered sequence. Think of a list as a train, with each passenger (item) having a specific seat (index). Lists are mutable, meaning you can add or remove passengers from the train as needed.


Code Snippet:

# Creating a list
list_example = ['apple', 'banana', 'cherry']
print(list_example)

Output:

['apple', 'banana', 'cherry']


2. Adding and removing elements in lists


Adding and removing elements in a list is akin to inviting guests to a party or asking them to leave. You can add an element to the end of a list with the append() method and remove the first occurrence of a given value with the remove() method.


Code Snippet:


# Adding an element
list_example.append('date')
print(list_example)

# Removing an element
list_example.remove('apple')
print(list_example)

Output:

['apple', 'banana', 'cherry', 'date']
['banana', 'cherry', 'date']


3. Accessing individual elements in a list using indexes


Accessing elements in a list is like choosing a book from a shelf using its position. Python uses zero-based indexing, so the first item is at index 0.


Code Snippet:


# Accessing an element
print(list_example[1])

Output:

cherry


4. Examples and use cases of lists


In data science, lists are often used to store datasets, manipulate data, and for iterations, among other tasks. For example, you may use a list to store the ages of a group of people.
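
A short sketch of that use case follows. The ages are hypothetical, and the summary uses only Python's built-in functions.

```python
# Hypothetical ages of five people
ages = [34, 27, 45, 19, 52]

count = len(ages)              # number of observations
youngest = min(ages)           # smallest value
oldest = max(ages)             # largest value
mean_age = sum(ages) / count   # arithmetic mean

print(count, youngest, oldest, mean_age)
```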


IV. Advanced List Operations


1. Combining lists using operators and methods


Combining lists in Python is like attaching two separate train cars to form a longer train. You can use the + operator or the extend() method to achieve this.


Code Snippet:

# Combining lists
list1 = [1, 2, 3]
list2 = [4, 5, 6]
combined_list = list1 + list2
print(combined_list)

# Using extend method
list1.extend(list2)
print(list1)

Output:

[1, 2, 3, 4, 5, 6]
[1, 2, 3, 4, 5, 6]
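
The two approaches print the same result but behave differently: the + operator builds a new list and leaves both operands untouched, while extend() modifies the existing list in place. The sketch below uses fresh variable names to make the difference visible.

```python
first = [1, 2, 3]
second = [4, 5, 6]

# + returns a new list; first is unchanged
combined = first + second
print(first)      # [1, 2, 3]
print(combined)   # [1, 2, 3, 4, 5, 6]

# extend() changes the list in place, so every name
# that refers to the same list sees the change
alias = first
first.extend(second)
print(alias)      # [1, 2, 3, 4, 5, 6]
```

Prefer + when the original lists must stay as they are, and extend() when growing one list is the goal.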


2. Finding the index of an element in a list


Finding an item in a list is like looking for a specific book on a bookshelf. You know what the book looks like (the element), and you want to know where it is (its index). For this, we use the index() method.


Code Snippet:

# Finding index of an element
print(list1.index(4))

Output:

3
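
Note that index() raises a ValueError when the element is not in the list. The sketch below shows two common ways to guard against that; the variable names are illustrative only.

```python
numbers = [1, 2, 3, 4, 5, 6]
missing = 10

# Option 1: check membership first
position = numbers.index(missing) if missing in numbers else None
print(position)   # None

# Option 2: attempt the lookup and handle the error
try:
    numbers.index(missing)
    raised = False
except ValueError:
    raised = True
print(raised)     # True
```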


3. Removing elements from a list


Earlier, we saw how to remove an element by its value. To remove the element at a specific position instead, like pulling the fourth book off a shelf, use the pop() method with the index as its argument. pop() also returns the element it removes.


Code Snippet:

# Removing an element at specific position
list1.pop(3)
print(list1)

Output:

[1, 2, 3, 5, 6]
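
The pop() method also hands back the element it removes, and with no argument it removes the last element. A small sketch with an illustrative list:

```python
queue = ['a', 'b', 'c', 'd']

removed = queue.pop(1)   # removes and returns the item at index 1
last = queue.pop()       # no index: removes and returns the last item

print(removed)   # b
print(last)      # d
print(queue)     # ['a', 'c']
```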


4. Looping over lists using list comprehension


List comprehension is a compact way of creating a new list by performing an operation on each item in an existing list. It's like creating a new train filled with transformed passengers from an old train.


Code Snippet:

# Creating a new list with list comprehension
squared_list = [x**2 for x in list1]
print(squared_list)

Output:

[1, 4, 9, 25, 36]
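
A list comprehension can also filter while it transforms by adding an if clause at the end. The sketch below keeps only the even numbers before squaring them; the values list is defined here so the example runs on its own.

```python
values = [1, 2, 3, 5, 6]

# Square only the even numbers
even_squares = [x**2 for x in values if x % 2 == 0]
print(even_squares)   # [4, 36]

# The equivalent explicit loop, for comparison
result = []
for x in values:
    if x % 2 == 0:
        result.append(x**2)
print(result)         # [4, 36]
```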


5. Sorting lists


Sorting a list is like arranging books on a shelf in a particular order. You can use the sort() method to sort a list in place, in ascending order by default.


Code Snippet:

# Sorting a list
list1.sort()
print(list1)

Output:

[1, 2, 3, 5, 6]
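
Because the list above was already in ascending order, the effect of sorting is easier to see on unsorted data. The sketch below uses made-up scores and names, and also shows the built-in sorted() function, which returns a new list instead of changing the original.

```python
scores = [88, 42, 95, 67]

# sorted() returns a new list and leaves the original alone
ordered = sorted(scores)
print(ordered)   # [42, 67, 88, 95]
print(scores)    # [88, 42, 95, 67]

# sort() reorders the list in place; reverse=True gives descending order
scores.sort(reverse=True)
print(scores)    # [95, 88, 67, 42]

# key= controls what is compared, here a lowercase copy of each name
names = ['cherry', 'Apple', 'banana']
by_name = sorted(names, key=str.lower)
print(by_name)   # ['Apple', 'banana', 'cherry']
```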


V. Tuples


1. Introduction to tuples and their internal uses


A tuple is an ordered, immutable collection of objects. Tuples are similar to lists, but unlike lists, they cannot be changed once created. A tuple is like a time capsule: once it is sealed, the objects inside can be viewed but not replaced.


Code Snippet:

# Creating a tuple
tuple_example = ('apple', 'banana', 'cherry')
print(tuple_example)

Output:

('apple', 'banana', 'cherry')


2. Similarities and differences between tuples and lists


Tuples and lists are alike in several ways: both can store a collection of items, and these items can be accessed through indexing. However, the crucial difference is mutability. Imagine a list as a bookshelf where you can replace, add, or remove books (items). Conversely, a tuple is more like a printed book: once printed, its contents cannot be altered.


Code Snippet:

# Lists are mutable
fruits_list = ['apple', 'banana', 'cherry']
fruits_list[1] = 'blueberry'
print(fruits_list)

# Tuples are immutable
fruits_tuple = ('apple', 'banana', 'cherry')
fruits_tuple[1] = 'blueberry'  # This will raise an error

Output:

['apple', 'blueberry', 'cherry']
TypeError: 'tuple' object does not support item assignment


3. The concept of immutability in tuples


As mentioned, tuples are immutable. This characteristic is beneficial when you want a sequence of items that should remain constant throughout the program's lifespan, such as the days of the week or the colors of a rainbow.


Code Snippet:

# Attempting to modify a tuple
days_of_week = ('Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday')
days_of_week[0] = 'Newday'  # This will raise an error

Output:

TypeError: 'tuple' object does not support item assignment


4. Creating tuples and unpacking them into variables


Tuples can be a handy way to assign multiple variables at once, similar to how a magician might pull several objects out of a hat simultaneously.


Code Snippet:

# Creating a tuple and unpacking it
coordinate = (34.0522, -118.2437)  # tuple with latitude and longitude
latitude, longitude = coordinate  # unpacking tuple into variables
print("Latitude:", latitude)
print("Longitude:", longitude)

Output:

Latitude: 34.0522
Longitude: -118.2437


VI. Advanced Tuple Operations


1. Using the zip function to pair elements from different lists into tuples


Think of zip as a zipper that merges two separate parts together. The zip function in Python does just that: it takes two or more iterables and pairs them up into tuples, each containing one element from each iterable.


Code Snippet:

# Using the zip function
fruit_names = ['apple', 'banana', 'cherry']
fruit_colors = ['red', 'yellow', 'red']
zipped_fruits = zip(fruit_names, fruit_colors)
print(list(zipped_fruits))  # convert the zipped object into a list to print it

Output:

[('apple', 'red'), ('banana', 'yellow'), ('cherry', 'red')]
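
Two details of zip are worth knowing: it stops at the end of the shortest input, and the object it returns can be consumed only once. The sketch below illustrates both with a deliberately shortened colors list, and also builds a dictionary from the pairs.

```python
names = ['apple', 'banana', 'cherry']
colors = ['red', 'yellow']   # one item shorter on purpose

pairs = zip(names, colors)
first_pass = list(pairs)
second_pass = list(pairs)    # the zip object is already exhausted

print(first_pass)    # [('apple', 'red'), ('banana', 'yellow')]
print(second_pass)   # []

# dict() accepts an iterable of (key, value) pairs
color_by_name = dict(zip(names, colors))
print(color_by_name)   # {'apple': 'red', 'banana': 'yellow'}
```

If the pairs are needed more than once, store them in a list as first_pass does here.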


2. Unpacking tuples for use in loops


Similar to how we unpacked a tuple into variables earlier, we can use the same concept to unpack tuples in a loop.


Code Snippet:

# Looping over a list of tuples
fruits = [('apple', 'red'), ('banana', 'yellow'), ('cherry', 'red')]
for name, color in fruits:
    print(f"The color of {name} is {color}")

Output:

The color of apple is red
The color of banana is yellow
The color of cherry is red


3. Enumerating positions with the enumerate function


The enumerate function in Python is like a GPS for your data; it keeps track of where you are in your data sequence. On each pass through a loop, it yields the index and the value of the next element in an iterable.


Code Snippet:

# Using the enumerate function
fruits = ['apple', 'banana', 'cherry']
for i, fruit in enumerate(fruits):
    print(f"Fruit {i+1}: {fruit}")

Output:

Fruit 1: apple
Fruit 2: banana
Fruit 3: cherry
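
Instead of adding 1 to the index by hand, enumerate accepts a start argument that sets the first number. A sketch that produces the same lines as above:

```python
fruits = ['apple', 'banana', 'cherry']

labels = []
# start=1 makes the counter begin at 1 instead of 0
for number, fruit in enumerate(fruits, start=1):
    labels.append(f"Fruit {number}: {fruit}")

print(labels)   # ['Fruit 1: apple', 'Fruit 2: banana', 'Fruit 3: cherry']
```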


4. Potential issues when creating tuples and precautions to take


Creating tuples can sometimes be tricky, especially when the tuple has only one element. It may seem like Python is playing a prank on you! Parentheses on their own only group an expression, so Python treats a single parenthesized value as that value, not as a tuple. It is the trailing comma that makes a tuple.


Code Snippet:

# Incorrect way to create a single-element tuple
single_element_tuple = ("apple")  # this is not a tuple, it's a string
print(type(single_element_tuple))

# Correct way to create a single-element tuple
single_element_tuple = ("apple",)  # the trailing comma tells Python this is a tuple
print(type(single_element_tuple))

Output:

<class 'str'>
<class 'tuple'>
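
To underline that the comma, not the parentheses, creates the tuple, here is a short sketch with a few illustrative variables. The empty tuple is the one case where parentheses alone are enough.

```python
no_parens = 'apple',           # a tuple, even without parentheses
grouped = ('apple')            # just a string inside grouping parentheses
empty = ()                     # the empty tuple
from_list = tuple(['apple'])   # tuple() converts any iterable

print(type(no_parens).__name__)   # tuple
print(type(grouped).__name__)     # str
print(len(empty))                 # 0
print(from_list == no_parens)     # True
```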


VII. Strings


1. Introduction to Strings as a Sequence Type


In Python, a string is a sequence of characters enclosed in single quotes, double quotes, or triple quotes. Strings are the main way we represent and manipulate text information in Python. Like a DNA sequence, a string keeps the characters in a precise order.


Code Snippet:

# Creating a string
greeting = "Hello, world!"
print(greeting)

Output:

Hello, world!
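
Because a string is a sequence, the indexing and length operations shown for lists work on it too. The sketch below also shows that strings are immutable: changing a character means building a new string.

```python
greeting = "Hello, world!"

print(greeting[0])      # H
print(greeting[-1])     # !
print(greeting[7:12])   # world
print(len(greeting))    # 13

# Strings are immutable, so item assignment raises a TypeError
try:
    greeting[0] = "J"
    changed_in_place = True
except TypeError:
    changed_in_place = False
print(changed_in_place)   # False

# Build a new string instead
new_greeting = "J" + greeting[1:]
print(new_greeting)       # Jello, world!
```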


2. Unique Ways of Constructing and Working with Strings


Strings can be concatenated (joined together) using the + operator, or replicated (repeated) using the * operator. Think of this like adding or multiplying numbers, but for text!


Code Snippet:

# String concatenation and replication
first_name = "John"
last_name = "Doe"
full_name = first_name + " " + last_name  # concatenation
print(full_name)

greeting = "Hello! " * 3  # replication
print(greeting)

Output:

John Doe
Hello! Hello! Hello!


VIII. Advanced String Operations


1. Creating Formatted Strings Using f-string


f-strings, introduced in Python 3.6, allow for embedding expressions inside string literals, using curly braces {}. Think of f-strings as a powerful tool for text customization.


Code Snippet:

# Creating a formatted string using f-string
name = "John"
age = 30
greeting = f"Hello, my name is {name} and I am {age} years old."
print(greeting)

Output:

Hello, my name is John and I am 30 years old.
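
The curly braces can also carry a format specification after a colon, which is handy when reporting numbers. The values below are invented for illustration.

```python
mean_height = 172.4567   # hypothetical measurement
share = 0.256            # hypothetical proportion
population = 1234567     # hypothetical count

rounded = f"{mean_height:.2f}"    # two decimal places
percent = f"{share:.1%}"          # percentage with one decimal place
grouped = f"{population:,}"       # thousands separators

print(rounded)   # 172.46
print(percent)   # 25.6%
print(grouped)   # 1,234,567
```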


2. Joining Iterables with Strings Using the join() Method


The join() method is a string method that joins the elements of an iterable (like a list or a tuple) into one string, placing the string it is called on between the elements. Every element must itself be a string. It's like a string version of a glue stick, sticking sequences together.


Code Snippet:

# Joining a list of strings using the join() method
words = ["Hello", "world", "!"]
sentence = " ".join(words)  # joins the words with a space in between
print(sentence)

Output:

Hello world !
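
join() works only on strings, so numeric data has to be converted first. A sketch with an illustrative list of integers:

```python
values = [1, 2, 3]

# Joining integers directly raises a TypeError
try:
    ", ".join(values)
    joined_directly = True
except TypeError:
    joined_directly = False
print(joined_directly)   # False

# Convert each item to a string first
csv_row = ", ".join(str(value) for value in values)
print(csv_row)           # 1, 2, 3
```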


3. Searching for Specific Parts of a String Using the startswith() and endswith() Methods


The startswith() and endswith() methods return True if the string starts or ends with the given substring, respectively.


Code Snippet:

# Using startswith() and endswith()
sentence = "Hello, world!"
print(sentence.startswith("Hello"))  # checks if sentence starts with "Hello"
print(sentence.endswith("!"))  # checks if sentence ends with "!"

Output:

True
True
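
Both methods also accept a tuple of candidates and return True if any of them matches. The sketch below uses this to pick out tabular files from a hypothetical list of file names.

```python
# Hypothetical file names
filenames = ['data.csv', 'notes.txt', 'results.tsv', 'plot.png']

# endswith() accepts a tuple of suffixes to test against
tabular = [name for name in filenames if name.endswith(('.csv', '.tsv'))]
print(tabular)   # ['data.csv', 'results.tsv']
```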


4. Finding Values Inside a String Using the in Operator


The in operator checks if a certain value is present in a sequence. In the case of strings, it checks if a substring is part of a larger string. It's like searching for a specific word in a book.


Code Snippet:

# Using the in operator with strings
sentence = "The quick brown fox jumps over the lazy dog."
word = "fox"
print(word in sentence)  # checks if "fox" is in the sentence

Output:

True


5. Case Insensitive Search Using the lower() Method


To perform a case-insensitive search, convert both the string and the substring to lowercase with the lower() method before comparing them.


Code Snippet:

# Case insensitive search using the lower() method
sentence = "The quick brown fox jumps over the lazy dog."
word = "FOX"
print(word.lower() in sentence.lower())  # checks if "fox" is in the sentence, ignoring case

Output:

True
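
For text that may contain non-English characters, Python's str.casefold() method is a more aggressive alternative to lower(). The helper below is a hypothetical convenience function, not part of the standard library, sketched on the assumption that simple substring matching is all you need.

```python
def contains_ignore_case(text, fragment):
    """Hypothetical helper: True if fragment occurs in text, ignoring case."""
    return fragment.casefold() in text.casefold()

print(contains_ignore_case("The quick brown fox", "FOX"))   # True

# lower() leaves the German sharp s unchanged, so this match fails
print("STRASSE".lower() in "Die Straße".lower())            # False

# casefold() maps it to "ss", so the match succeeds
print(contains_ignore_case("Die Straße", "STRASSE"))        # True
```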
