1. Introduction to Docstrings
In any coding project, documentation is like the compass guiding you on your path. In Python, the built-in documentation comes in the form of docstrings. To appreciate their role, let's start with a scenario. Imagine you are in front of a complex piece of machinery with numerous buttons, levers, and switches, but without a manual. It would be frustrating, wouldn't it? That's what it feels like diving into a complex function without documentation.
2. Anatomy of a Docstring
A docstring is a string literal that appears as the first statement of a module, function, class, or method. Python stores it on that object as its __doc__ attribute. By convention it is written with triple double quotes ("""..."""), although triple single quotes ('''...''') also work. Here's a simple example:
def say_hello(name):
    """This function prints a hello message for the given name."""
    print(f"Hello, {name}!")
You can access the docstring using the __doc__ attribute:
print(say_hello.__doc__)
Output:
This function prints a hello message for the given name.
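Docstrings are not limited to functions. As a minimal sketch (the Greeter class and its greet method are hypothetical names used only for illustration), a class and each of its methods can carry their own docstring, and each is read the same way:

```python
class Greeter:
    """Greets people by name."""

    def greet(self, name):
        """Return a greeting for the given name."""
        return f"Hello, {name}!"


# Each object carries its own docstring in __doc__.
print(Greeter.__doc__)        # Greets people by name.
print(Greeter.greet.__doc__)  # Return a greeting for the given name.
```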
3. Docstring Formats
While Python itself does not impose any specific style for docstrings, maintaining a consistent style across your codebase is important for readability. Two popular styles are Google-style and Numpydoc. Both have their merits and choosing one depends on personal or team preference.
4. Google Style Docstring
Let's start with Google-style. Here's how you'd document a function:
def calculate_area(length, width):
    """
    Calculates the area of a rectangle.

    Args:
        length (float): The length of the rectangle.
        width (float): The width of the rectangle.

    Returns:
        float: The area of the rectangle.

    Raises:
        ValueError: If either length or width is negative.
    """
    if length < 0 or width < 0:
        raise ValueError("Length and width must be non-negative.")
    return length * width
In the Google style, the docstring starts with a brief description of the function. Then, it details the function's arguments (Args), return values (Returns), and any errors it might raise (Raises).
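The same layout carries over to classes. The sketch below assumes a hypothetical Rectangle class (not part of the original example) and uses the Attributes section that Google style reserves for class docstrings, alongside the Args, Returns, and Raises sections described above:

```python
class Rectangle:
    """A rectangle described by its side lengths.

    Attributes:
        length (float): The length of the rectangle.
        width (float): The width of the rectangle.
    """

    def __init__(self, length, width):
        """Initializes the rectangle.

        Args:
            length (float): The length of the rectangle.
            width (float): The width of the rectangle.

        Raises:
            ValueError: If either length or width is negative.
        """
        if length < 0 or width < 0:
            raise ValueError("Length and width must be non-negative.")
        self.length = length
        self.width = width

    def area(self):
        """Calculates the area of the rectangle.

        Returns:
            float: The area of the rectangle.
        """
        return self.length * self.width


print(Rectangle(2.0, 3.0).area())  # 6.0
```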
5. Numpydoc Format
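The Numpydoc style is another popular choice, especially in scientific computing. It covers the same information as Google style but lays it out differently: each section heading is underlined with dashes, and each parameter's type sits next to its name, with the description indented on the following line. Here's the same function in Numpydoc style: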
def calculate_area(length, width):
    """
    Calculates the area of a rectangle.

    Parameters
    ----------
    length : float
        The length of the rectangle.
    width : float
        The width of the rectangle.

    Returns
    -------
    float
        The area of the rectangle.

    Raises
    ------
    ValueError
        If either length or width is negative.
    """
    if length < 0 or width < 0:
        raise ValueError("Length and width must be non-negative.")
    return length * width
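Parameters with default values are usually marked as optional in Numpydoc. The following is only a sketch: calculate_volume and its height parameter are hypothetical and were invented here to show how the "optional" marker sits after the type.

```python
def calculate_volume(length, width, height=1.0):
    """
    Calculates the volume of a rectangular box.

    Parameters
    ----------
    length : float
        The length of the box.
    width : float
        The width of the box.
    height : float, optional
        The height of the box (the default is 1.0).

    Returns
    -------
    float
        The volume of the box.

    Raises
    ------
    ValueError
        If any dimension is negative.
    """
    if length < 0 or width < 0 or height < 0:
        raise ValueError("Dimensions must be non-negative.")
    return length * width * height


print(calculate_volume(2.0, 3.0))       # 6.0
print(calculate_volume(2.0, 3.0, 4.0))  # 24.0
```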
6. Retrieving Docstrings
There are several ways to retrieve a docstring in Python. The simplest is using the __doc__ attribute:
print(calculate_area.__doc__)
You can also use the getdoc() function from the inspect module:
import inspect
print(inspect.getdoc(calculate_area))
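Both return the function's docstring. The difference is that __doc__ gives you the string as stored, while inspect.getdoc() normalizes its indentation and removes leading and trailing blank lines, which makes multi-line docstrings easier to read.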
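The two approaches can be compared directly. This sketch uses a shortened version of calculate_area whose docstring starts on the line after the opening quotes, so the raw string begins with a newline that inspect.getdoc() removes:

```python
import inspect


def calculate_area(length, width):
    """
    Calculates the area of a rectangle.

    Returns:
        float: The area of the rectangle.
    """
    return length * width


raw = calculate_area.__doc__
cleaned = inspect.getdoc(calculate_area)

# The raw string keeps the newline that follows the opening quotes.
print(raw.startswith("\n"))     # True
# getdoc() starts directly with the summary line.
print(cleaned.splitlines()[0])  # Calculates the area of a rectangle.
```

inspect.getdoc() applies the same cleanup as inspect.cleandoc(), so tools that display docstrings typically work from the cleaned form.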
7. Principles of Code Design: DRY and "Do One Thing"
When we're writing code, it's not just about getting it to work. We should also consider how maintainable, scalable, and readable it is. Two key principles that can help us achieve these goals are DRY (Don't Repeat Yourself) and the "Do One Thing" principle.
7.1 The DRY Principle
The DRY principle is straightforward: each piece of logic or knowledge in your code should be defined in exactly one place. Duplicated code is a common source of bugs, because a fix or change applied to one copy is easily forgotten in the others.
Imagine you're making a sandwich, and you need to spread butter on a slice of bread. Instead of spreading butter on the bread every time you make a sandwich, wouldn't it be easier to create a "Butter Spreader" machine that does it for you? The "Butter Spreader" is your function that encapsulates the "butter-spreading" code, so you don't repeat it everywhere you need it.
Here's an example of code that violates the DRY principle:
x = 3  # example value, assumed for illustration
# Compute the square of x
x_squared = x * x
print(f"The square of x is {x_squared}")
# Later in the code...
y = 4  # example value, assumed for illustration
# Compute the square of y
y_squared = y * y
print(f"The square of y is {y_squared}")
A DRY version would look like this:
def compute_square(num):
    """Compute the square of a number."""
    squared = num * num
    return squared

x = 3  # example value, assumed for illustration
y = 4  # example value, assumed for illustration
x_squared = compute_square(x)
print(f"The square of x is {x_squared}")
y_squared = compute_square(y)
print(f"The square of y is {y_squared}")
Now, we have a reusable function compute_square(), and we've reduced code duplication!
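The print statements are still duplicated, so the same idea can be applied once more. One possible sketch, assuming a hypothetical helper named format_square_report and example values 3 and 4, moves the message format into a single place:

```python
def compute_square(num):
    """Compute the square of a number."""
    return num * num


def format_square_report(name, value):
    """Return the report line for a named value (hypothetical helper)."""
    return f"The square of {name} is {compute_square(value)}"


# Example values, assumed for illustration.
for name, value in {"x": 3, "y": 4}.items():
    print(format_square_report(name, value))
```

If the wording of the message ever changes, only format_square_report() needs to be edited.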
7.2 The "Do One Thing" Principle
The "Do One Thing" principle states that a function should do one thing, do it well, and do it only. When a function does more than one operation, it can become more difficult to test and debug.
To illustrate, imagine you have a Swiss Army knife. It's compact and versatile, but would you use it in a professional kitchen? Probably not. In a kitchen, you have specialized tools, each doing one thing perfectly: a chef's knife for chopping, a grater for grating, and so on. Your functions should be like these specialized kitchen tools.
An example of a function violating this principle:
def compute_square_and_cube(num):
    """Compute the square and cube of a number."""
    squared = num * num
    cubed = num * num * num
    return squared, cubed
A better approach:
def compute_square(num):
    """Compute the square of a number."""
    return num * num

def compute_cube(num):
    """Compute the cube of a number."""
    return num * num * num
Now, each function performs a single task, making them more manageable and maintainable.
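Splitting the function does not stop a caller from getting both results. As a brief sketch (the value 3 is an arbitrary example), the two functions can be combined at the call site, and each one can be checked in isolation:

```python
def compute_square(num):
    """Compute the square of a number."""
    return num * num


def compute_cube(num):
    """Compute the cube of a number."""
    return num * num * num


num = 3  # example value, assumed for illustration
squared, cubed = compute_square(num), compute_cube(num)
print(squared, cubed)  # 9 27

# Each function has exactly one behavior to verify.
assert compute_square(-2) == 4
assert compute_cube(-2) == -8
```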
8. Understanding Python's "Pass by Assignment"
A frequently misunderstood aspect of Python is how it passes arguments to functions, often referred to as "Pass by Assignment." This behavior plays a significant role when we deal with mutable and immutable objects, so let's take a closer look at it.
8.1 "Pass by Assignment" Explained
In Python, all variables are references to objects. When you pass a variable to a function, the parameter name inside the function is bound to the same object; nothing is copied. Reassigning the parameter inside the function never affects the caller's variable, but modifying the object in place does, and only mutable objects can be modified in place.
Consider passing a note in class. If the note is written on a piece of paper (immutable object), you can't change the message without everyone noticing. But, if the note is written on a small whiteboard (mutable object), you can easily erase and rewrite parts of the message, and the person receiving the message might not even realize it's been altered.
Here is some code to illustrate:
def update_num(num):
    """Add 1 to the given number."""
    print(f"Initial value in function: {num}")
    num += 1
    print(f"Final value in function: {num}")

x = 5
update_num(x)
print(f"Value of x after function: {x}")
Output:
Initial value in function: 5
Final value in function: 6
Value of x after function: 5
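Even though num was updated in the function, x remains unaltered. Integers are immutable, so num += 1 cannot change the object 5; it creates a new integer and rebinds the local name num to it, leaving x pointing at the original object.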
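One way to observe this is with the built-in id() function, which returns an object's identity. The sketch below is a variant of update_num, modified for illustration so that it returns its findings instead of printing them:

```python
def update_num(num):
    """Add 1 to num and report whether num now names a different object."""
    id_before = id(num)
    num += 1  # creates a new int and rebinds the local name num
    return num, id(num) != id_before


x = 5
result, rebound = update_num(x)
print(result)   # 6
print(rebound)  # True
print(x)        # 5
```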
8.2 Mutable vs. Immutable Objects
Python's built-in types can be divided into mutable (can be changed after creation) and immutable (cannot be changed after creation).
Common immutable types include integers, floats, strings, and tuples. On the other hand, lists, dictionaries, and sets are examples of mutable types.
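To make the distinction concrete, here is a small sketch with made-up values: assigning to an element of a tuple fails, and a string method returns a new string rather than changing the original.

```python
point = (1, 2)
try:
    point[0] = 10  # tuples do not support item assignment
    tuple_changed = True
except TypeError:
    tuple_changed = False

name = "ada"
shouted = name.upper()  # returns a new string; name is untouched

print(tuple_changed)  # False
print(point)          # (1, 2)
print(name, shouted)  # ada ADA
```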
Here's an example using a list (a mutable object):
def update_list(lst):
    """Append 1 to the given list."""
    print(f"Initial list in function: {lst}")
    lst.append(1)
    print(f"Final list in function: {lst}")

x = [5]
update_list(x)
print(f"Value of x after function: {x}")
Output:
Initial list in function: [5]
Final list in function: [5, 1]
Value of x after function: [5, 1]
In this case, the change made in the function is visible through the original variable x: lists are mutable, and lst.append(1) modifies the very list object that both lst and x refer to.
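Mutability alone does not decide the outcome; what matters is whether the function modifies the object or rebinds the name. The following sketch contrasts the two with hypothetical functions append_in_place and rebind_locally:

```python
def append_in_place(lst):
    """Mutate the list the caller passed in."""
    lst.append(1)


def rebind_locally(lst):
    """Build a new list and rebind the local name; the caller's list is untouched."""
    lst = lst + [1]
    return lst


x = [5]
append_in_place(x)
print(x)  # [5, 1]

y = [5]
z = rebind_locally(y)
print(y)  # [5]
print(z)  # [5, 1]
```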
9. Dangers of Mutable Default Arguments
Now that we understand the concept of "Pass by Assignment" and the difference between mutable and immutable objects, we are ready to discuss a common Python pitfall: mutable default arguments. This is a topic that often catches new (and sometimes experienced) Python programmers off guard.
9.1 The Problem with Mutable Default Arguments
In Python, you can assign default values to function parameters. This feature allows a function to be called with fewer arguments than it is defined to allow. However, when using mutable types as default arguments, we might run into unexpected behaviors.
Let's consider the classic example of a function that appends an item to a list:
def append_to(num, target=[]):
    """Append num to target list."""
    target.append(num)
    return target

print(append_to(1))
print(append_to(2))
print(append_to(3))
You might expect this code to print out three lists: [1], [2], and [3]. But let's see what actually happens:
[1]
[1, 2]
[1, 2, 3]
Why did this happen?
Default values are evaluated only once, when Python executes the def statement, not each time the function is called. A single list is therefore created and stored with the function as the default for target. Every call that omits target reuses this same list, so every modification to it is "remembered" across function calls.
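The shared list can be inspected directly, because a function keeps its positional default values in its __defaults__ attribute. A short sketch using the same append_to function:

```python
def append_to(num, target=[]):
    """Append num to target list."""
    target.append(num)
    return target


print(append_to.__defaults__)  # ([],)

first = append_to(1)
second = append_to(2)

# Both calls returned the one list stored on the function.
print(append_to.__defaults__)  # ([1, 2],)
print(first is second)         # True
```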
9.2 A Solution: None as a Sentinel Value
To avoid this issue, a common practice is to use None as a default value and define the mutable object within the function body:
def append_to(num, target=None):
    """Append num to target list."""
    if target is None:
        target = []
    target.append(num)
    return target

print(append_to(1))
print(append_to(2))
print(append_to(3))
Output:
[1]
[2]
[3]
This way, a new list is created every time the function is called without a target argument, providing the expected behavior.
This highlights the importance of being cautious when using mutable default arguments in Python. It's one of those subtle aspects of Python that can lead to hard-to-detect bugs if not handled properly.
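Note that the None sentinel only changes what happens when target is omitted. As a sketch (the list named shared is a made-up example), a list passed in explicitly is still modified in place, which follows from the "Pass by Assignment" behavior described earlier:

```python
def append_to(num, target=None):
    """Append num to target list."""
    if target is None:
        target = []
    target.append(num)
    return target


shared = [0]  # example list, assumed for illustration
result = append_to(1, shared)

print(result is shared)  # True
print(shared)            # [0, 1]
print(append_to(2))      # [2]
```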