Serious Python

This book review was quoted by the publisher.


The computer programming language Python continues to grow in popularity, and now consistently ranks among the top half-dozen languages in terms of job openings, usage for open-source projects, popularity as a learning language at universities, and other metrics. However, despite its almost three-decade history, there are still far fewer intermediate and advanced Python developers than there are beginners and active programmers completely unfamiliar with the language. Thus we should not be astonished that fewer computer programming books are aimed at the more experienced developers than at newcomers. Within that first category of books, the publisher No Starch Press introduces a new title, Serious Python: Black-Belt Advice on Deployment, Scalability, Testing, and More.

Its author, Julien Danjou, has been quite actively involved in the Python language and the open-source software community for almost two decades. The book states that the author is currently working as the project team leader for OpenStack, while the marketing material on the Amazon.com web page states that he is a principal software engineer at Red Hat. It's not clear which of the two is more up to date. In any event, he brings considerable technical experience to this book, which is the fourth edition of the series originally titled The Hacker's Guide to Python.

Serious Python has a publication date of 11 December 2018, under the ISBN 978-1593278786. (This review is based on the e-book version kindly provided by No Starch Press.) On the publisher's web page for the book, visitors will find a brief synopsis, the table of contents, and some biographical information about the author, as well as a link to download a sample chapter (on unit testing). The print version comprises 240 pages, most of which are organized into 13 chapters. In the introduction, the author briefly contrasts the kind of Python programming he did when learning the language and working on relatively small personal projects, versus working on major projects of much greater complexity and scope (more than 9 million lines of code), as well as a need for automated testing. He also describes the intended audience and purpose of the book, and the topics of each chapter.

The first one, entitled "Starting Your Project", advises readers to use Python version 3.7, partly because the last version of the 2.x branch, namely 2.7, will no longer be supported after the year 2020. Accordingly, the book has been written for version 3.x. Experienced programmers have learned from, well, experience that if the file and directory structure of a project is initially quite faulty and not soon corrected, then this can lead to unanticipated problems and the temptation to implement kludges that can make a bad situation worse. To forestall this possibility, the author presents some basic principles for how to best structure a new Python project. He also covers the PEP 440 versioning standard. He then presents the essential coding standards for Python code. All of the advice seems quite reasonable, with the exception of limiting each line to 79 characters — a prescription that is arguably unneeded if not wholly outdated given the current screen resolutions of modern laptops and computer monitors, as well as the ability to specify code line wrapping within any decent programmer's editor or integrated development environment (IDE). There are numerous tools for detecting Python style and coding errors, and some of the most commonly used are briefly described. The chapter concludes with a brief Q&A session starring Joshua Harlow, one of several highly experienced Python developers who are featured at the end of each chapter.

Modules, libraries, and frameworks are important features of Python, and they are the subject of the second chapter, specifically some of the details of the import system, including the sys module, import paths, and custom importers. The author wisely recommends that all Python newbies familiarize themselves with the substantial range and depth of functions available in the standard library. Most if not all experienced developers have had occasion to berate themselves for spending time and energy writing a function that they later discovered was already built into a language or its common libraries (and, ahem, I speak from experience). At perhaps the opposite end of the spectrum of potential mistakes, the developers of a project can scrupulously avoid reinventing the wheel, through extensive use of external libraries. But that approach has its own pitfalls, and thus it is helpful that the author briefly explains the criteria used by the OpenStack project to avoid relying upon any external library that could become unviable. After describing the use of pip for installing packages and external libraries, he briefly provides a few warnings about the use of frameworks. Lastly, Doug Hellmann, another Python veteran, provides his insight on libraries and other topics. (As noted earlier, each chapter concludes with a similar Q&A interview, which for the sake of brevity will not be discussed for any of the remaining instances.)

The third chapter addresses every programmer's favorite topic, documentation — specifically, some general principles of effective API documentation within the code, the value of using the reStructuredText (reST) format and the Sphinx documentation generator, automatically testing your code snippets with doctest, writing custom Sphinx extensions if needed, naming functions in an API, numbering the versions of a custom API, and documenting changes to it. The only flaw in the discussion is that the author seems to be using the term "APIs" to mean the functions within one (perhaps he meant API calls) and the term "old interface" to mean the deprecated versions of a function. This is one of nine chapters that contain brief summaries, all of which are useless and whose content either repeats what was presented just a few pages earlier or contains new material that should have been incorporated into those earlier pages. Such summaries in technical books usually seem to be only padding.
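To make the doctest idea concrete, here is a minimal sketch that is not taken from the book. The function slugify and its behavior are hypothetical and chosen only for illustration. The docstring uses reST field syntax, and the interactive example inside it doubles as a test that the standard-library doctest module can run.

```python
import doctest


def slugify(title):
    """Return a lowercase, hyphen-separated version of *title*.

    :param title: the text to convert (hypothetical example input)
    :returns: the converted text

    >>> slugify("Serious Python")
    'serious-python'
    """
    # Split on any whitespace, then join the lowercase words with hyphens.
    return "-".join(title.lower().split())


if __name__ == "__main__":
    # Run every example found in this module's docstrings.
    doctest.testmod()
```

If the documented output ever drifts from what the function really returns, the doctest run reports the mismatch, which is the main appeal of testing code snippets in documentation.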

Regardless of the language, any computer code that deals heavily with dates, times, and time zones can be fraught with difficulties. In the fourth chapter, the author begins by pointing out that timestamps need associated time zones just to be meaningful, and then he shows how to create a default datetime object that is time zone aware, how to serialize such objects to be able to transmit them to non-Python systems, and how to resolve ambiguous non-UTC timestamps resulting from daylight saving time (just one more reason, in my opinion, to abandon it altogether).
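As a rough illustration of the first two ideas, and not code from the book, the following sketch uses only the standard library. The helper name utcnow_aware and the sample timestamp are my own assumptions. It builds time zone aware datetime objects and serializes one to ISO 8601 text, a format that non-Python systems can generally parse. Daylight saving ambiguity is not shown here.

```python
from datetime import datetime, timedelta, timezone


def utcnow_aware():
    """Return the current time as a time zone aware datetime in UTC."""
    return datetime.now(timezone.utc)


# A fixed, hypothetical timestamp with an explicit UTC+02:00 offset.
stamp = datetime(2018, 12, 11, 9, 30, tzinfo=timezone(timedelta(hours=2)))

# Serialize to ISO 8601 text; the offset travels with the value.
serialized = stamp.isoformat()

# Parse the text back into an aware datetime (Python 3.7 or later).
restored = datetime.fromisoformat(serialized)
```

Because the offset is part of the serialized string, the receiving side does not have to guess which time zone the sender meant.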

In the fifth chapter, readers will learn how to distribute their Python code to the world, beginning with historical information about the most popular package installation tools, of which setuptools is the current recommendation. (Readers unfamiliar with the Python ecosystem may be dismayed at the apparent chaos of abandoned projects and non-standardization.) Next, an example pair of setup.py and setup.cfg files is employed to illustrate how one can use them to publish a custom package. My only complaint with this material is that the URL "http://julien.danjou.info/software/rebuildd/", which is currently invalid, should be fixed so it points to a correct example that would be instructive to the reader. Also covered in some detail are the Wheel distribution format, tarball creation, and setuptools entry points.

The extent to which an individual developer or a project team implements unit testing within their code is often an excellent predictor of the maintainability of the code and, in turn, the long-term technical success of the project. In chapter 6, the author demonstrates how Python makes it relatively easy to follow unit testing best practices and thereby reap the rewards of more reliable code, by utilizing the pytest package. Specifically, he shows how to: set up and run assertion tests, skip certain tests (conditionally and otherwise), run certain tests whose names match a pattern, run tests in parallel (on multi-processor computers), use regular and parameterized fixtures, and use mock objects for simulating various test conditions. The reader next learns how to use the Python coverage tool for detecting code that is not being tested because, for whatever reason, it is never executed during one's unit tests. Outside the realm of Python, virtual machines are increasingly being used for many purposes, and in a similar manner virtual environments allow testers to create a consistent environment for their purposes, using the virtualenv and tox tools, as explored by the author.
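For readers who have not seen assertion tests combined with mock objects, here is a small sketch under stated assumptions. The function fetch_status, the "/health" path, and the client object are all hypothetical and do not come from the book. The tests are plain functions with bare assert statements, the style that pytest collects by default, and the mock comes from the standard library's unittest.mock module.

```python
from unittest import mock


def fetch_status(client):
    """Return 'up' if the (hypothetical) client reports HTTP 200, else 'down'."""
    response = client.get("/health")
    return "up" if response.status_code == 200 else "down"


def test_fetch_status_up():
    # The mock stands in for a real HTTP client, so no network is needed.
    client = mock.Mock()
    client.get.return_value.status_code = 200
    assert fetch_status(client) == "up"
    client.get.assert_called_once_with("/health")


def test_fetch_status_down():
    # Simulate a failing service by changing the mocked status code.
    client = mock.Mock()
    client.get.return_value.status_code = 503
    assert fetch_status(client) == "down"
```

Saved in a file whose name starts with test_, these functions should be picked up by a plain pytest run without any further configuration.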

For most people, a decorator is an artistic soul who will quite possibly convince their spouse to double the cost of remodeling the house. But for programmers, decorators are special functions that can modify other functions. As the author notes, "The primary use case for decorators is in factoring common code that needs to be called before, after, or around multiple functions." After demonstrating how to create custom decorators, he focuses on the ones built into Python. Plenty of example code is provided, but for this particular topic, readers will likely find much of it cryptic and certainly not an argument for using decorators. The author then delves into the inner workings of Python methods in general, and in particular static, class, and abstract methods, as well as super(). For developers familiar with object orientation in other languages but not Python, the reference to "the state and methods of the class" may prove baffling at first, since classes typically do not have state (although of course their objects do).
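A short example may make the quoted use case less cryptic. This is my own minimal sketch, not the author's code, and the names log_calls and add are hypothetical. The decorator factors out "code that needs to be called before" a function, in this case recording the arguments of every call.

```python
import functools


def log_calls(func):
    """Decorator that records the arguments of each call to func."""

    @functools.wraps(func)  # keep the original name and docstring
    def wrapper(*args, **kwargs):
        # Common code that runs before the decorated function.
        wrapper.calls.append((args, kwargs))
        return func(*args, **kwargs)

    wrapper.calls = []
    return wrapper


@log_calls
def add(a, b):
    """Return the sum of a and b."""
    return a + b
```

After calling add(1, 2), the list add.calls holds one entry with the positional arguments (1, 2), while add itself still behaves and introspects like the undecorated function thanks to functools.wraps.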

When developers talk about functional programming, the first language that comes to mind is usually not Python, and yet the language does support a fair amount of such… functionality. In the eighth chapter, the author explicates such concepts as pure functions to encourage the writing of clean code, generators to easily create iterable objects, yield statements for flagging generators and returning values, isgeneratorfunction() for doing just as its name suggests, list comprehensions for defining lists inline with their declarations, and a small menagerie of Python functions that facilitate functional programming, including map(), filter(), and several others. The chapter concludes with information on the package "first", lambda expressions, and some prebuilt itertools functions. The topic of the next chapter, Python's abstract syntax tree (AST), is far more esoteric than functional programming, and would possibly be only of value to those interested in Hy (a Python-Lisp hybrid language built on AST) and willing to brave the lack of documentation.
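Several of the functional programming features named above fit into a few lines. The sketch below is mine rather than the book's, and the function name countdown and the sample values are assumptions. It shows a generator function, a list comprehension, the equivalent map() and filter() pipeline, and one itertools helper.

```python
import inspect
import itertools


def countdown(n):
    """Yield n, n-1, ..., 1; the yield statement makes this a generator function."""
    while n > 0:
        yield n
        n -= 1


# A list comprehension: squares of the even numbers below 6.
squares_of_even = [x * x for x in range(6) if x % 2 == 0]

# The same result built from map(), filter(), and lambda expressions.
same_with_map_filter = list(
    map(lambda x: x * x, filter(lambda x: x % 2 == 0, range(6)))
)

# itertools.islice takes the first three values without exhausting the generator.
first_three = list(itertools.islice(countdown(10), 3))

# inspect.isgeneratorfunction reports whether a function contains yield.
is_generator = inspect.isgeneratorfunction(countdown)
```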

Performance and optimization — or as the author strangely phrases them, "Performances and Optimizations" — are critical topics in any computer language. In Chapter 10, readers are encouraged to become familiar with and fully leverage methods of Python's data structures, as well as lesser-used data structures themselves, thereby avoiding unnecessary coding. The author then demonstrates how to use code profiling techniques, if necessary, to discover and quantify any bottlenecks in one's programs — even to the extent of disassembling the Python code into the corresponding bytecode. Readers learn why it is best to avoid defining functions within functions, how to speed up searching within lists (by ordering and bisecting them), how to store object attributes in list objects (instead of dictionary objects), how to make use of named tuples (and their advantages over objects with __slots__), and how to cache function results (memoization). The chapter concludes with advice on keeping one's project compatible with PyPy and using objects that implement the buffer protocol to avoid the unneeded copying of data in memory.
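Three of these techniques can be sketched with the standard library alone. The code is an illustration under my own assumptions, not an excerpt from the book, and the names contains_sorted, fib, Point, and SlottedPoint are hypothetical.

```python
import bisect
import functools
from collections import namedtuple


def contains_sorted(sorted_items, target):
    """Binary search for target in an already sorted list."""
    index = bisect.bisect_left(sorted_items, target)
    return index < len(sorted_items) and sorted_items[index] == target


@functools.lru_cache(maxsize=None)
def fib(n):
    """Fibonacci numbers; the cache memoizes each result after its first computation."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)


# A named tuple: immutable, with fields accessible by name or by position.
Point = namedtuple("Point", ["x", "y"])


class SlottedPoint:
    """A mutable class whose instances have no per-object __dict__."""

    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y
```

The bisect search runs in logarithmic rather than linear time, but only if the list is kept sorted, so it pays off mainly when lookups greatly outnumber insertions.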

Most software development teams would be delighted if their projects were to increase in popularity to an extent that they eventually encounter problems with scalability and architecture constraints — topics covered in the next chapter, and which the author acknowledges could and do fill books on their own. He discusses multithreading, some of the difficulties associated with the Python global interpreter lock (GIL), and techniques for working around these limitations, namely, asynchronous event handling and multiprocessing. In many scenarios, a Python program would be best written so as to leverage an event-driven architecture, in which the program listens for particular events (such as user input) and responds to them immediately. The chapter concludes with a brief discussion of service-oriented architecture (SOA), representational state transfer (REST), and the socket library ZeroMQ for communicating among processes and dispatching work.
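To give a flavor of asynchronous event handling, here is a minimal sketch that uses the standard library's asyncio module. That choice is my assumption for illustration and is not necessarily the approach the book takes. The names handle and run_demo, the event labels, and the delays are all hypothetical.

```python
import asyncio


async def handle(event, delay):
    """Pretend to wait on I/O for `delay` seconds, then report the event."""
    await asyncio.sleep(delay)  # yields control so other handlers can run
    return "handled " + event


async def main():
    # Both handlers wait concurrently on one thread; gather() returns
    # results in the order the awaitables were passed, not completion order.
    return await asyncio.gather(handle("click", 0.02), handle("key", 0.01))


def run_demo():
    """Start an event loop, run main() to completion, and return its results."""
    return asyncio.run(main())
```

Because the handlers spend their time waiting rather than computing, a single thread can service both, which sidesteps the GIL for I/O-bound work. CPU-bound work would still call for multiprocessing.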

Connecting one's applications with one or more relational databases, possibly in conjunction with an object relational mapping (ORM) library, oftentimes turns out to be more work than anticipated, and can pose additional security risks in the case of web-based applications. In the penultimate chapter, the author illustrates the advantages of knowing and using SQL instead of relying entirely upon an ORM tool, even the most popular one in the Python ecosystem, SQLAlchemy. He then shows how to utilize PostgreSQL and Flask to store messages in a table and access them via an HTTPS REST API. But he does not explain the rationale for spending this much time and space detailing what to the uninitiated reader seems like a rather narrow use case that most developers would never encounter. The final chapter presents a variety of advanced Python techniques, including: version 2 versus version 3 Python detection and difference handling; dispatch generic functions; context manager objects; and the attrs library for simplifying attribute declarations.
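Two of the final chapter's techniques, generic functions and context managers, have direct standard-library support. The following sketch is my own and not taken from the book. The names describe and recorded are hypothetical.

```python
import contextlib
import functools


@functools.singledispatch
def describe(value):
    """Fallback implementation, used when no more specific type is registered."""
    return "object"


@describe.register(int)
def _(value):
    return "integer"


@describe.register(list)
def _(value):
    return "list of %d items" % len(value)


@contextlib.contextmanager
def recorded(log, name):
    """Context manager that logs entry and exit, even if the body raises."""
    log.append("enter " + name)
    try:
        yield log
    finally:
        log.append("exit " + name)
```

Calling describe(3) selects the int implementation based on the argument's type, while a with recorded(log, "job") block guarantees that the exit entry is appended whether or not the body raises an exception.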

Overall, this book does a fine job of presenting the chosen material. There is plenty of example code, and the narrative is tightly tied to it. Some of the phrasing is a bit odd. In the Acknowledgments section, for example, the author cites one of the contributors "for messing up with testing" (page xv). Employees rarely get credit for that! The statement "If you're not already familiar with Lisp […] the Hy syntax will look familiar" (page 145) sounds like it should instead read "If you're already familiar…" But my favorite is the delightfully redundant "repeating the same things over and over again" (page 94). In several instances, long dashes are used where semicolons would be more appropriate.

There were relatively few errata evident: "would be really be great" (page 14) may have been a verbal misstep by the interviewee; "readthedocs .org" (page 35) contains an errant space that is not visible in the text and yet is revealed by doing a copy and paste; the URL "http://docutils.sourceforge.net/docs/ref/rst/restructuredtext.html" (page 36) has an invalid link value in the e-book; "to handle connection with" (page 73) would be better as "to handle connecting with"; "Th mock library" (page 86) is missing a vowel; "iter-tools" (page 133) should be made whole as "itertools"; "comparisons operators" (page 139) should instead read "comparison operators"; "allows to us get" (page 160) needs two of its words swapped; and the URL "http://www.aiai.ed.ac.uk/~jeff/clos-guide.html" (page 205) points to a forbidden page (as of this writing). Aside from these minor blemishes, the writing is clear and straightforward.

Yet the primary weakness of this book is that important topics that we should expect to find in one targeted at intermediate and advanced Python programmers — interfacing with file systems and operating systems, network programming, restricted execution, binary data, web frameworks and applications, etc. — are not covered, and yet considerable space is devoted to topics of dubious utility that most programmers will likely never need, such as AST and Hy. Less importantly, readers may find that several of the chapters, especially some of the earlier ones, seem to come up short, with less actionable information than would be expected. Nonetheless, Serious Python does contain a considerable amount of judicious battle-tested advice from an experienced developer — as well as some insightful gems from the guest contributors — making the overall effort a welcome addition to the limited number of books aimed at more advanced Python programmers.

Copyright © 2018 Michael J. Ross. All rights reserved.