How to be Pythonic and why you should care

Common strategies to improve the quality of your Python software

A snake is a snake and that’s that

I’ve officially been writing code for over a dozen years now with the last 5 as a full-time software engineer, and while I still have MUCH to learn (a lifetime of learning to be exact!), I have seen my fair share of software and have (dare I say) developed my skills immensely in the field during that time. I still remember some of the first programs I ever wrote, and cringe in bed at night as I relive the nightmares of my days as a beginner programmer. While I will never escape the crimes of my past (writing quintuple-nested loops is the biggest of those sins), perhaps I can partially redeem myself, even if only slightly, by helping other developers who are fresh into the field learn a few best practices to write faster, cleaner, and better code.


Pythonic — A bionic Python?

As with nearly every programming language, there are certain stylistic and conventional guidelines that are accepted by the Python community to promote unified, maintainable, and concise applications that are written the way the language intended them to be written. These guidelines range from proper variable, class, and module naming conventions, to looping structures, and even the proper way to wrap lines of code. The name “Pythonic” was coined to describe any program, function, or block of code that follows these guidelines and takes advantage of Python’s unique capabilities.

Why does all of this matter? This question is open to many interpretations, but a few key reasons why you should care come down to the clarity, efficiency, and credibility of the code. Let’s break these down further.

Clarity

The clarity of your code is paramount to your success if you want to be a developer. As you grow in the field, you will likely work with others at some point in time, which will require peers to read your code. If your code is written poorly, it can be a nightmare for others to decipher your intentions, even in short chunks. Take the following example from the r/badcode subreddit:

[Image: example from r/badcode, captioned “The definition of insanity”]

Does this code work? Yup. Does the function name describe the function’s purpose? Sure does. Is it easy to identify what this code is supposed to accomplish if you changed the function name? Probably not without spending an hour analyzing it.

As with every beginning developer I have known (myself included), there is a commonly held mentality of “it works, so don’t touch it” when it comes to code. The moment we write something that solves our problem, we are afraid to change the code for fear that we will break everything and be unable to fix it again.

I would encourage any developer to break this mentality as early as possible (this goes for all languages). Even if you created the poorly-written code yourself, it is often difficult to return to it a week, month, or even a year later and attempt to unravel its mystery. To make matters worse, if you can’t decipher the code yourself, how do you expect fellow teammates or collaborators to uncover the meaning?

By writing programs the way the language was intended, developers should naturally be writing code that looks similar to that of their peers. This makes it easy to understand, easy to share, and easy to update.

Efficiency

Back when I was interning in college, one of my fellow interns told me “don’t bother writing something that’s already been done in Python, because you won’t be able to write something better.” While I was originally frustrated by this depressing thought, I eventually realized there was some truth to his statement. Python has been around for nearly three decades at this point and has become one of the most popular languages among developers around the world. Python is also known for its abundance of libraries that can do almost anything you want or need. Many of these libraries and features have seen thousands of contributors making updates over several years, squeezing as much performance out of every line of code as possible. While you are certainly welcome to write your own optimal string comparison function, chances are what you come up with won’t be any faster than what already exists, and the time spent developing the new function could have been spent working on the actual problem you are attempting to solve.

In general, look first for a built-in function or data type that achieves what you are looking for. Chances are, this will be the fastest way to complete the task. If not, check whether there are any libraries or packages you can install that do what you need. If you still don’t have a solution, now’s the time to create your own!
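As a small illustration of reaching for what Python already ships, the sketch below counts word frequencies twice: once with a hand-written loop and once with collections.Counter from the standard library. The sample sentence and variable names are made up for this example.

```python
from collections import Counter

words = "the quick brown fox jumps over the lazy dog the end".split()

# Hand-rolled version: works, but re-implements something Python provides.
manual_counts = {}
for word in words:
    if word in manual_counts:
        manual_counts[word] += 1
    else:
        manual_counts[word] = 1

# Standard-library version: one call that does the same counting.
counts = Counter(words)

print(counts.most_common(1))  # [('the', 3)]
```

Both approaches produce the same counts; the second simply leaves less code to write, read, and maintain.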

Credibility

For anyone who first learned how to program in a language other than Python, it’s generally clear which language the developer came from. Take the following problem as an example:

  • Find the sum of all integers from 10 to 1,000, inclusive

A C (or C++) developer would probably write something along the following lines:

int a = 10;
int b = 1000;
int total_sum = 0;
while (b >= a) {
    total_sum += a;
    a++;
}

A direct Python re-write of this would look very similar:

a = 10
b = 1000
total_sum = 0
while b >= a:
    total_sum += a
    a += 1

While the above statement will yield the expected output, most Python developers would throw a fit over this code, complaining that it isn’t Pythonic and doesn’t leverage the language’s power. Starting fresh, here’s how you can solve the problem the Pythonic way:

total_sum = sum(range(10, 1001))

This single line of code produces exactly the same result as the loop above. (For the record, I did intend to write 1001: Python’s range function has an inclusive lower bound and an exclusive upper bound, so the lower number is part of the sequence while the higher number is not.) If you were to write Python code in the style of the first example, your credibility as a Python developer would suffer, as the Python community is very passionate about writing code that follows the guidelines. Here’s another example:

  • Determine if a particular string is in an array

For most non-Python developers, the first solution would probably look something like this:

#include <stdbool.h>
#include <string.h>

char * arr[] = {"apples", "oranges", "bananas", "grapes"};
char * s = "cherries";
bool found = false;
int len = sizeof(arr) / sizeof(arr[0]);
for (int i = 0; i < len; i++) {
    /* strcmp returns 0 when the two strings are equal */
    if (!strcmp(arr[i], s)) {
        found = true;
    }
}

As before, a direct Python translation would be:

arr = ["apples", "oranges", "bananas", "grapes"]
s = "cherries"
found = False
size = len(arr)
for i in range(0, size):
    if arr[i] == s:
        found = True

As I’m sure you guessed, there’s a much simpler way to write this in Python:

arr = ["apples", "oranges", "bananas", "grapes"]
found = "cherries" in arr

No matter which method you choose above, found will always evaluate to False (or false) in the end. The last choice, however, is the clear champion when it comes to Pythonic code. It is concise and easily understandable. Even those who have never read Python (or any code, for that matter) have a chance at comprehending the intention of this last block, unlike the previous two.
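The same membership test is not limited to lists. As a rough sketch (the fruit prices are invented for illustration), here is how the in operator behaves on a few other built-in types:

```python
fruits = ["apples", "oranges", "bananas", "grapes"]

# Lists are scanned element by element.
print("grapes" in fruits)        # True

# Sets use hashing, which helps when there are many lookups.
fruit_set = set(fruits)
print("cherries" in fruit_set)   # False

# On a dict, `in` checks the keys.
prices = {"apples": 1.25, "grapes": 3.10}  # made-up prices
print("apples" in prices)        # True

# On a string, `in` tests for a substring.
print("nan" in "bananas")        # True

# `not in` is the readable negation.
print("cherries" not in fruits)  # True
```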

The final example is one of my favorite tools in Python: the list comprehension. This technique allows you to embed a loop inside a list expression to create a new list. Consider the following:

  • Double the value of every even value in an array

First, here’s the C code:

int arr[] = { 1, 2, 3, 4, 5, 6 };
int length = sizeof(arr) / sizeof(arr[0]);
for (int i = 0; i < length; i++) {
    if (arr[i] % 2 == 0) {
        arr[i] *= 2;
    }
}

And the direct Python translation:

arr = [1, 2, 3, 4, 5, 6]
length = len(arr)
for i in range(0, length):
    if arr[i] % 2 == 0:
        arr[i] *= 2

Now the Pythonic way:

arr = [1, 2, 3, 4, 5, 6]
arr = [x * 2 if x % 2 == 0 else x for x in arr]

This might look funny at first if you have never seen a list comprehension in action. I’ve found it’s often easiest to read a list comprehension from right to left. First, it iterates through every element in the list (for x in arr), then it checks whether the element is even (if x % 2 == 0). If so, it doubles the number (x * 2); if not, the number stays the same (else x). Whatever the element ends up as, it gets appended to a new list. In our case, we are overwriting the original value of arr with the new list.
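To make that reading order concrete, here is a sketch that spells the comprehension out as an explicit loop and then contrasts it with a comprehension that uses a trailing if. The variable names doubled_loop, doubled, and only_doubled_evens are chosen just for this illustration.

```python
arr = [1, 2, 3, 4, 5, 6]

# Explicit loop that builds a new list, step by step.
doubled_loop = []
for x in arr:
    if x % 2 == 0:
        doubled_loop.append(x * 2)
    else:
        doubled_loop.append(x)

# The same logic as a comprehension with a conditional expression.
doubled = [x * 2 if x % 2 == 0 else x for x in arr]

# A trailing `if` filters instead: odd values are dropped, not kept.
only_doubled_evens = [x * 2 for x in arr if x % 2 == 0]

print(doubled)             # [1, 4, 3, 8, 5, 12]
print(only_doubled_evens)  # [4, 8, 12]
```

The position of the if matters: before the for it chooses between two values for every element, while after the for it decides which elements are included at all.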

These are just a few common ways to make code Pythonic. You likely noticed that all of these examples involved loops of some sort. While there are many ways to write Pythonic code, a great practice is to ask yourself if you truly need a loop or if it can be replaced with an idiomatic substitute.
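A few more loop substitutes come up often enough to be worth sketching. The team names and scores below are hypothetical sample data; enumerate, zip, any, all, and max are all Python built-ins.

```python
teams = ["Hawks", "Bulls", "Jazz"]   # hypothetical sample data
points = [101, 97, 110]

# Instead of `for i in range(len(teams))`, let enumerate supply the index.
for rank, team in enumerate(teams, start=1):
    print(rank, team)

# Instead of indexing two lists in lockstep, pair them with zip.
scores = dict(zip(teams, points))

# Instead of a flag variable and a loop, use any() / all().
any_over_105 = any(p > 105 for p in points)
all_over_90 = all(p > 90 for p in points)

# Instead of tracking a running maximum, use max() with a key.
top_team = max(scores, key=scores.get)

print(scores)        # {'Hawks': 101, 'Bulls': 97, 'Jazz': 110}
print(any_over_105)  # True
print(all_over_90)   # True
print(top_team)      # Jazz
```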

If you care about your credibility in the software world and want to proudly call yourself a Python developer, make sure you know and use some of these techniques when applicable in your code.


Ye Old Guidelines

Hopefully you now understand the importance of writing Pythonic code. At this point, you are likely wondering what some of the guidelines are and how you can follow them. Allow me to introduce you to Python Enhancement Proposal 8 (PEP 8). For those who are unfamiliar with PEPs, they are proposals written by the community to improve some aspect of Python, from performance to new features to documentation. PEP 8, specifically, provides recommendations on style guidelines and conventions. It is an often-cited resource on how to be Pythonic, and I highly recommend giving it a read if you haven’t already. Here are some of the topics I find most important:

Naming conventions

Naming conventions are important for any language to provide a common means of identifying types and objects. Here’s an abridged version of the naming conventions:

  • Packages/Modules: Should use short, all-lowercase names. Underscores can be used in module names if they improve readability, but are discouraged in package names. Ex: package or module.py.
  • Classes: Should use CapWords. Recommended to not use the word Class in the name. Ex: class BasketballTeam:.
  • Constants: Should use screaming snake case (all capitals with underscores). Ex: API_URL = '...'.
  • Functions/Variables: Should use standard snake case (lowercase with underscores). Ex: home_team_points = ... or def final_boxscore(...).
  • Function/Method Arguments: Should use standard snake case. Ex: home_team_name.
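Put together, the conventions above might look like the following sketch of a small module. The class, its methods, the placeholder URL, and the scores are all invented for illustration; only the naming patterns come from PEP 8.

```python
# Imagine this file is saved as boxscore.py (lowercase module name).

API_URL = "https://example.com/api"  # constant: screaming snake case (placeholder URL)


class BasketballTeam:  # class: CapWords
    def __init__(self, team_name):  # argument: snake case
        self.team_name = team_name
        self.points = 0

    def add_points(self, points_scored):  # method: snake case
        self.points += points_scored


def final_boxscore(home_team, away_team):  # function: snake case
    home_team_points = home_team.points  # variable: snake case
    away_team_points = away_team.points
    return (f"{home_team.team_name} {home_team_points} - "
            f"{away_team_points} {away_team.team_name}")


home = BasketballTeam("Hawks")
away = BasketballTeam("Bulls")
home.add_points(3)
away.add_points(2)
print(final_boxscore(home, away))  # Hawks 3 - 2 Bulls
```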

Proper usage of comments

Comments are an important aid to the clarity of code. I generally recommend adding a comment above any section of code whose purpose wouldn’t be immediately obvious to someone else. In general, comments should be written in complete sentences, placed above the code they describe, and written in English. Additionally, documentation strings (or “docstrings”) are recommended to record the purpose of functions as well as to describe the names, types, and meanings of inputs and outputs, if applicable. PEP 257 contains great information on how to use docstrings.
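Here is a minimal sketch of a commented, documented function. The function itself is hypothetical, and the Args/Returns/Raises layout is one common convention that I am assuming for the example; PEP 257 describes general docstring conventions rather than requiring this exact layout.

```python
def points_per_game(total_points, games_played):
    """Return the average number of points scored per game.

    Args:
        total_points (int): Points scored across all games.
        games_played (int): Number of games played; must be positive.

    Returns:
        float: The average points per game.

    Raises:
        ValueError: If games_played is not positive.
    """
    # Guard against division by zero before computing the average.
    if games_played <= 0:
        raise ValueError("games_played must be positive")
    return total_points / games_played


print(points_per_game(410, 4))  # 102.5
print(points_per_game.__doc__.splitlines()[0])
```

The docstring is stored on the function as __doc__, which is what help(points_per_game) displays.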

Wrap lines to 79 characters or less

One of the more controversial topics in Python development pertains to line wrapping. PEP 8 calls for every line of code to be less than or equal to 79 characters. Some embedded systems have limited screen sizes that can only display as many as 80 characters on a line, which would require ugly wrapping of code if it were any longer. Additionally, if a particular line of code is hundreds of characters long, it can get very difficult to read as many variables and function calls can be lost within the line.

While this likely won’t be an issue in most cases, sometimes a line of code requires lots of real estate, especially if it contains long variable names or complex list comprehensions. One way to combat this is to start a new line after each comma in a function call. For example, replace

some_function(first_var, second_var, third_var, ... twelfth_var)

with

some_function(first_var,
              second_var,
              third_var,
              ...
              twelfth_var)

The two blocks above execute in exactly the same way. Inside parentheses (or brackets or braces), Python treats a line break as a continuation of the same statement, so each argument on a new line after a comma is simply the next parameter.
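For a version that can actually be run, here is a sketch using a hypothetical describe_game helper with made-up arguments. Both the definition and the call are split across lines inside parentheses.

```python
def describe_game(home_team_name,
                  away_team_name,
                  home_team_points,
                  away_team_points):
    """Return a one-line summary of a game (hypothetical helper)."""
    return (f"{home_team_name} {home_team_points}, "
            f"{away_team_name} {away_team_points}")


# The call is split after each comma; Python joins the lines because
# they sit inside the parentheses.
summary = describe_game("Hawks",
                        "Bulls",
                        101,
                        97)

# The same implicit continuation works inside brackets and braces.
starters = [
    "guard",
    "forward",
    "center",
]

print(summary)  # Hawks 101, Bulls 97
```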

Python also allows the use of a backslash at the end of a line to continue code that isn’t enclosed in parentheses, brackets, or braces. For example, replace

if first_boolean_variable == accepted_value and not second_boolean_variable:  # This is all one line
    print('Accepted')

with

if first_boolean_variable == accepted_value and \
        not second_boolean_variable:  # This is a second line
    print('Accepted')

While these changes add lines to your program, the code becomes much easier to read, especially across a range of display types and sizes.
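It is worth knowing that PEP 8 itself prefers implied line continuation inside parentheses over backslashes where possible. The sketch below shows both forms side by side; the three variables are given placeholder values only so the example runs.

```python
first_boolean_variable = True    # placeholder values for illustration
accepted_value = True
second_boolean_variable = False

# Backslash continuation.
if first_boolean_variable == accepted_value and \
        not second_boolean_variable:
    result_backslash = 'Accepted'

# Parenthesized condition: no backslash needed, and stray whitespace
# after a line break cannot cause a syntax error.
if (first_boolean_variable == accepted_value
        and not second_boolean_variable):
    result_parens = 'Accepted'

print(result_backslash, result_parens)  # Accepted Accepted
```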


The Zen of Python

Another commonly referenced resource is the Zen of Python. The following short set of aphorisms, written by Tim Peters, makes up the core of PEP 20:

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren’t special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one — and preferably only one — obvious way to do it.
Although that way may not be obvious at first unless you’re Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it’s a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea — let’s do more of those!

While the lines above are fairly self-explanatory, the overarching theme can be summarized by the 7th note: “Readability counts.” To me, this means that code should be written in a way that any Python developer, regardless of his or her experience, should be able to read and understand the code. Python uses a simple syntax which is closer to natural English than nearly all other languages. As such, the code should be simple and beautiful. The best way to get a message across in English is to deliver it in a concise manner. The same goes for Python.

No matter what your code is working to achieve, always remember the Zen of Python. If your code doesn’t follow these principles, then it isn’t a truly Pythonic application.
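As one possible reading of “Flat is better than nested” and “Errors should never pass silently,” the sketch below writes the same hypothetical ticket-pricing rule twice. The function names and prices are invented for the example.

```python
def ticket_price_nested(age, is_member):
    # Nested version: each condition pushes the logic one level deeper.
    if age is not None:
        if age >= 0:
            if is_member:
                return 8
            else:
                return 10
        else:
            raise ValueError("age cannot be negative")
    else:
        raise ValueError("age is required")


def ticket_price_flat(age, is_member):
    # Flat version: handle the error cases first, then the main path.
    if age is None:
        raise ValueError("age is required")
    if age < 0:
        raise ValueError("age cannot be negative")
    return 8 if is_member else 10


print(ticket_price_flat(30, True))   # 8
print(ticket_price_flat(30, False))  # 10
```

Both functions behave the same way, but the flat one can be read from top to bottom without keeping track of which branch you are in.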


Tools for writing Pythonic code

We’ve now learned why Pythonic code is important, covered some key principles, and seen a few examples of Pythonic code. After all that, it’s time we learn how to actually apply these techniques. Luckily for us, there are several tools we can use to check whether our code adheres to Python’s guidelines. The first tool, pycodestyle (formerly pep8), checks any specified Python module for violations of the guidelines listed in PEP 8. More information can be found on the GitHub repository.
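To give a feel for the kind of rule such a checker enforces, here is a toy line-length check written with only the standard library. It is not how pycodestyle is implemented, and find_long_lines is a hypothetical name invented for this sketch; the real tool covers many more rules.

```python
def find_long_lines(source, max_length=79):
    """Return (line_number, length) pairs for lines longer than max_length."""
    return [
        (line_number, len(line))
        for line_number, line in enumerate(source.splitlines(), start=1)
        if len(line) > max_length
    ]


# Two sample lines: the first is short, the second is 105 characters long.
sample = "short_line = 1\n" + "x = " + "1 + " * 25 + "1\n"

print(find_long_lines(sample))  # [(2, 105)]
```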

Another commonly used tool is pylint. The basic premise of pylint is the same as that of pycodestyle, but it goes several steps further and is more aggressive in its reach and suggestions. I personally prefer pycodestyle, as I’ve encountered several false positives with pylint in the past. While there are ways to suppress these false positives, I found it wasn’t worth continually altering my source code to work around them and explain why certain lines didn’t pass but should.

I should also note that additional code validation and linting tools exist, but the two mentioned above are the most commonly used resources.

EDIT: Thanks to Tarun Chadha for mentioning that these tools can be integrated with popular IDEs, such as PyCharm and Visual Studio Code.


Here’s to an idiomatic future!

If you made it this far, you should now be armed with the knowledge to write faster, cleaner, and better Python applications. Not only will this aid your development skills, it will also command greater respect from the Python community. Practice this trade, and you just might be considered in the sacred realm of the great Python developers.

Software engineer passionate about sports and artificial intelligence and, apparently, a blogger by night.