
Python Lists and Data Science Topics



Welcome to this tutorial on Python lists and their use in data science. We cover Python's basic data types, how to build, subset, and manipulate lists, and how Python stores lists in memory, so that you can predict what happens when you copy or modify them. So, let's get started!


1. Introduction to Python Data Types


Overview of Python Data Types


Python is a versatile programming language that supports various data types, each designed to store and manipulate different types of information. The common data types include:

  • float: Represents real numbers with a fractional part, such as 1.72.

  • int: Represents integers (whole numbers).

  • str: Stands for string and is used to represent text.

  • bool: Represents boolean values, True or False.
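
To check which type a value has, you can use the built-in type() function. The sketch below is a minimal illustration; the variable names and values are hypothetical and chosen only to give one example per type.

```python
# Hypothetical example values, one for each data type listed above
height = 1.72        # float
age = 30             # int
name = "Ada"         # str
is_student = True    # bool

for value in (height, age, name, is_student):
    # type(value).__name__ is the short name of the value's type, e.g. "float"
    print(value, "->", type(value).__name__)
```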

Working with Variables


In Python, you store data by assigning it to a variable. A variable is a name that refers to a value, so you can reuse and manipulate that value by name.
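
As a small sketch of this idea, the snippet below stores two measurements in variables and reuses them in a calculation. The names height, weight, and bmi and their values are assumptions made up for illustration.

```python
# Hypothetical measurements stored in variables
height = 1.72   # metres
weight = 68.5   # kilograms

# Reuse the variables by name in a calculation (body mass index)
bmi = weight / height ** 2
print(bmi)
```

Changing height or weight and rerunning the calculation gives a new result without rewriting the formula.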


2. Introduction to Python Lists


Building Python Lists


A Python list is an ordered, mutable collection of items written inside square brackets []. Unlike a variable that holds a single number or string, a list can hold many values, and those values can be of different types, which makes lists a fundamental data structure for data scientists.


# Example of a Python list
heights = [1.72, 1.68, 1.80, 1.75]

Benefits of Lists


Lists let you group related data points under a single name, so you can store, pass around, and process a whole collection at once instead of managing one variable per value.


# List with various data types
mixed_list = [42, "hello", 3.14, True]
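
Because the values live under one name, built-in functions can work on the whole collection at once. A minimal sketch, reusing the heights list from earlier (the name average is an assumption introduced here):

```python
heights = [1.72, 1.68, 1.80, 1.75]

print(len(heights))  # Output: 4
print(min(heights))  # Output: 1.68
print(max(heights))  # Output: 1.8

# Average height: the total divided by the number of elements
average = sum(heights) / len(heights)
print(average)
```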


3. Subsetting Lists


Accessing List Elements


You can access individual elements in a list using indexing. Python uses zero-based indexing, which means the first element is at index 0.


# Accessing list elements using index
print(heights[0])  # Output: 1.72
print(heights[2])  # Output: 1.8 (Python does not print the trailing zero)

Negative Indexing


Python also supports negative indexing to access elements from the end of the list.


# Accessing elements using negative index
print(heights[-1])  # Output: 1.75 (last element)
print(heights[-3])  # Output: 1.68 (third element from the end)
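
One way to read a negative index is as an offset from the length of the list: heights[-k] refers to the same element as heights[len(heights) - k]. The sketch below checks this and also shows what happens when an index falls outside the list; the variable n is introduced here only for illustration.

```python
heights = [1.72, 1.68, 1.80, 1.75]
n = len(heights)

# A negative index counts back from the end of the list
print(heights[-1] == heights[n - 1])  # Output: True
print(heights[-3] == heights[n - 3])  # Output: True

# Valid indices run from -n to n - 1; anything else raises IndexError
try:
    heights[4]
except IndexError as error:
    print("IndexError:", error)  # Output: IndexError: list index out of range
```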

List Slicing


List slicing creates a new list from a range of elements in an existing list. In heights[start:end], the element at index start is included and the element at index end is excluded.


# List slicing example
subset_list = heights[1:3]  # indices 1 and 2; index 3 is excluded
print(subset_list)  # Output: [1.68, 1.8]
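
Either bound of a slice can be left out, and a slice is always a new list. The following sketch shows a few common variations; the name subset is an assumption used only here.

```python
heights = [1.72, 1.68, 1.80, 1.75]

print(heights[:2])   # Output: [1.72, 1.68]  (from the start up to index 2)
print(heights[2:])   # Output: [1.8, 1.75]   (from index 2 to the end)
print(heights[-2:])  # Output: [1.8, 1.75]   (the last two elements)

# A slice is a new list, so changing it leaves the original untouched
subset = heights[1:3]
subset[0] = 0.0
print(heights)       # Output: [1.72, 1.68, 1.8, 1.75]
```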

4. List Manipulation


Changing List Elements


You can modify individual elements or slices of a list using assignment.


# Changing list elements
heights[0] = 1.70
print(heights)  # Output: [1.7, 1.68, 1.8, 1.75]

# Changing list slice
heights[1:3] = [1.65, 1.78]
print(heights)  # Output: [1.7, 1.65, 1.78, 1.75]


Adding and Removing Elements


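You can add elements to a list with the plus (+) operator, which concatenates two lists into a new one, or with +=, which extends the existing list in place. To remove elements, use the del keyword with an index or a slice.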


# Adding elements to a list
heights += [1.83, 1.72]
print(heights)  # Output: [1.7, 1.65, 1.78, 1.75, 1.83, 1.72]

# Removing elements from a list
del heights[0]
print(heights)  # Output: [1.65, 1.78, 1.75, 1.83, 1.72]
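
The difference between + and += is easy to miss, so here is a short sketch that contrasts them and also removes a slice with del. The name longer is an assumption introduced for this example.

```python
heights = [1.70, 1.65, 1.78, 1.75]

# + builds a new list and leaves the original unchanged
longer = heights + [1.83]
print(heights)  # Output: [1.7, 1.65, 1.78, 1.75]
print(longer)   # Output: [1.7, 1.65, 1.78, 1.75, 1.83]

# += extends the existing list in place
heights += [1.83, 1.72]
print(heights)  # Output: [1.7, 1.65, 1.78, 1.75, 1.83, 1.72]

# del also accepts a slice, here removing the elements at indices 1 and 2
del heights[1:3]
print(heights)  # Output: [1.7, 1.75, 1.83, 1.72]
```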


5. Behind the Scenes - List References


Understanding List References


A variable does not contain the list itself. It holds a reference to the place in memory where the list object is stored, so assigning a list to another variable copies the reference, not the data.


# List references example
a = [1, 2, 3]
b = a  # b now refers to the same list object as a; no copy is made
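
You can test whether two names refer to the same list object with the is operator, while == only compares the contents. A minimal sketch (the third list, c, is added here for contrast and is not part of the original example):

```python
a = [1, 2, 3]
b = a            # second name for the same list object
c = [1, 2, 3]    # a separate list that happens to have equal contents

print(a is b)  # Output: True   (same object)
print(a == c)  # Output: True   (equal contents)
print(a is c)  # Output: False  (different objects)
```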

Copying Lists


Assigning a list with the equals sign does not copy it. Both variables refer to the same list in memory, so a change made through one name is visible through the other.


# Copying lists using equals sign
a = [1, 2, 3]
b = a
b[0] = 10
print(a)  # Output: [10, 2, 3]

Creating Independent Copies


To create an independent copy of a list, use the list() function or a full slice ([:]). Both produce a shallow copy: a new list whose elements are the same objects as in the original.


# Creating independent copies
a = [1, 2, 3]
b = list(a)
b[0] = 10
print(a)  # Output: [1, 2, 3]
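
The slicing approach works the same way as list(). Both copy only the outer list, which matters once a list contains other lists; in that case the standard library's copy.deepcopy() can be used instead. The sketch below illustrates this with hypothetical names (nested, shallow, deep).

```python
import copy

# Copying with a full slice
a = [1, 2, 3]
b = a[:]
b[0] = 10
print(a)  # Output: [1, 2, 3]
print(b)  # Output: [10, 2, 3]

# A shallow copy still shares any inner lists with the original
nested = [[1, 2], [3, 4]]
shallow = list(nested)
shallow[0][0] = 99
print(nested)  # Output: [[99, 2], [3, 4]]

# copy.deepcopy() copies the inner lists as well
deep = copy.deepcopy(nested)
deep[0][0] = 0
print(nested)  # Output: [[99, 2], [3, 4]]
```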

6. Conclusion


In this tutorial, we covered the fundamentals of Python lists and their role in data science. You learned how to build, subset, and manipulate lists, and how list references affect copying. Lists underpin many data science tasks, from data wrangling to analysis and visualization. As you continue with Python and data science, practice with more complex applications of lists to strengthen your skills.

Happy coding!
