Securing PHP Web Applications

This book review was published by Slashdot, .

Securing PHP Web Applications

The owners and the developers of typical websites face a quandary, one often unrecognized and unstated: They generally want their sites' contents and functionality to be accessible to everyone on the Internet, yet the more they open those sites, the more vulnerable they can become to attackers of all sorts. In their latest book, Securing PHP Web Applications, Tricia and William Ballad argue that PHP is an inherently insecure language, and they attempt to arm PHP programmers with the knowledge and techniques for making the sites they develop as secure as possible, short of disconnecting them from the Internet.

The book was published by Addison-Wesley Professional on 26 December 2008, under the ISBN 978-0321534347. The publisher maintains a Web page for the book, where visitors will find a detailed description, the table of contents, and a sample chapter ("Cross-Site Scripting", Chapter 10) only three pages in length — undoubtedly a record. That is essentially all one will find on that Web page. Most technical publishers offer far more information on the Web pages for each one of their books — such as the preface and index online, updates to the book's content (including reported errata, confirmed and otherwise), descriptions of the chapters, information about and pictures of the author(s), feedback from readers and the media, and, perhaps most valuable of all, the sample code used in the given book. (However, that is less of a factor with this particular book, since it does not contain much sample code.) Many such publisher pages even have links to book- or technology-specific forums, where readers can post questions to the authors, and read other people's questions and the replies. Addison-Wesley Professional, like all of the Pearson Education imprints, has through the years proven quite sparing with the supplementary online content, thereby no doubt reducing the number of prospective readers and other traffic to their sites.

Despite its fairly modest length (336 pages) in comparison to the average programming book being published these days, Securing PHP Web Applications tries to cover a sizable number of topics, in five parts, which encompass 17 chapters: general security issues; error handling; system calls; buffer overflows and sanitizing variables; input validation; file access; user authentication; encryption and passwords; sessions and attacks against them; cross-site scripting; securing Apache and MySQL; securing IIS and SQL Server; securing PHP; automated testing; exploit testing; designing a secure application; and hardening an existing application. The book concludes with an epilogue on professional habits to improve the security of one's applications, an appendix describing additional resources, a glossary, and an index. Throughout the book, the authors illustrate key ideas with the use of a sample application — in this case, a Web-based guest book.

The first chapter, which is the only one in the first part of the book, is rather brief, but does prime the reader for all the material that follows, because it explains the inherent security problems of Web applications, and explains the dangers of some of the inadequate measures that naive programmers can take, such as security through obscurity, and the common belief that hackers only go after major websites.

Chapter 2 focuses on error handling, but begins with an example of SQL injection, and how effective it can be against the first iteration of the guest book application code. The most potentially confusing part of the discussion is when the authors show an SQL injection attack that perverts an INSERT statement by injecting it with an SQL command to drop a table, and the two commands are separated by a semicolon. But then instead of discussing how multiple SQL statements can be separated by semicolons (well, depending upon one's server settings), they instead discuss separating PHP commands was semicolons, but not SQL commands. Nonetheless, readers will find some good advice on handling unexpected input and using a centralized error-handling mechanism, even if quite simple. Also, the question of whether or not to accept HTML in user input, is briefly addressed. However, the material would be more useful if the authors were to explain specifically when htmlspecialchars() should be used instead of htmlentities(). Also, the option of using standard bulletin board codes (such as [b]bold[/b]) should have been mentioned, if only briefly with references to outside resources. At the bottom of page 22, the bare regex following a "!~" is not valid PHP (or even Perl, which it much more resembles). Lastly, one should not follow the recommendation of providing absolutely no feedback to the user as to what characters were invalid in the text they entered. Hackers gain nothing from being told the obvious, that HTML tags are not allowed; but legitimate users will be incensed when told only that the system didn't understand their input, with no indication as to how to make it acceptable.

In the third chapter, the authors explain the obvious danger of using unsanitized user input within a call to the operating system, such as exec() or system(). The discussion here assumes that you are on a *nux server, not Windows. Two PHP commands are suggested for sanitizing user input, as well as the option and advantages of building a custom API that is limited to only the system calls that should ever be executed within your Web application. On page 33, their test code appears to assume that register_globals has been enabled (so the GET variables in the malicious URL are automatically instantiated and set to the values in the URL), which is disappointing for a book on PHP security, since the dangers inherent in register_globals are so severe that it is now disabled by default, is deprecated in PHP version 5.3.0, and will be completely removed in version 6.

In Chapter 4, readers get an overview of program and data storage on a computer, including buffers, stacks, and heaps, as groundwork for learning what buffer overflows are and how hackers can try to exploit them to execute database and operating system statements, including using your server as a staging point for remote exploits and denial-of-service attacks. The fifth chapter dovetails nicely with the previous one, because it discusses input validation, which is a key component of avoiding boundary condition attacks. The authors explain the importance of validating tainted data, using character length and regular expressions. One simple countermeasure to such attacks that the authors fail to mention, is simply setting a maximum input length ("maxlength") on HTML "input" tag fields. After all, most entry fields on forms are input tags — not textarea tags, for which the maxlength attribute only specifies wrapping. Using maxlength does not prevent manipulation of POST values, but does prevent the less knowledgeable attacker from overflowing input tag fields.

Chapter 6 explains the risks in working with local and remote files, and why it is critical to not allow mischievous users do such tricks as inserting a pathname in a filename, when your code is expecting only a simple filename. Unfortunately, some of the code and claims in this chapter are suspect: On page 70, the value of $path_to_uploaded_files is missing a needed trailing forward slash. The suggested method of processing malicious file paths could be made much more simple and secure with the use of basename(). The file_get_contents() attack shown on page 71 again seems to assume that register_globals is enabled; even if it were enabled, the exploit wouldn't work because $file is always set to a value in the script code. The authors seemingly believe that GET variables can override anything in a script. Nonetheless, their advice about handling user-uploaded files is spot on.

Part 4 of the book focuses on user security. The first of its chapters covers user authentication and authorization — combining the two for their sample application — and starting with usernames and passwords. Access denial due to invalid username or password is supposedly illustrated by Figure 7.2, but all that it illustrates is that a concept that needs no visual depiction is not made more clear by trying to represent it with a confusing image. The authors provide a thorough discussion of authentication purposes and methods, as well as password encryption and strength. Yet they provide no rationale for setting the default values for usernames, passwords, and email addresses to " " simply because the columns are non-nullable. After all, a record would only be added to the table if those values were known. Also, in their validateUsernamePassword() function, they've mistakenly commented out the first "return FALSE;" and they create unused variables $username and $password.

Chapter 8 provides an overview of various types of encryption, particularly for passwords, and some recommendations for PHP-supported algorithms. One blemish in this discussion is the claim that the longer the key for decryption, the longer it will take for your application to load the data (presumably the encrypted text) — which doesn't make sense. Also, their password() and login() functions reference class member names of an object not yet defined or explained. Code out of context like this can be confusing to the reader.

Sessions are a key component of maintaining and securing the identity of an authenticated user as she goes from one page to another in your PHP application. In Chapter 9, the authors describe the three major categories of session attacks: fixation, hijacking, and injection. The next chapter addresses cross-site scripting (XSS), but runs only three pages, and provides no examples of an XSS attack, which would have been helpful for the reader to understand how such an attack could try to compromise his PHP code, and what sort of malicious code to look for in his site. However, references to four open source XSS filtering projects are provided, in case the reader would like to learn more about them.

The fifth part of the book is devoted to securing whichever server environment on which you choose to host your application — Apache and MySQL, or IIS and Microsoft SQL Server, as well as PHP. In the chapter on PHP, the authors present the Zend Core release of PHP, which can save developers time in installing components of the LAMP stack, and also save them from reinventing the wheel, by using the Zend Framework. Other techniques for hardening PHP are discussed. Chapters 14 and 15 explain how to use automated testing and exploit testing, to increase your application's security, using powerful exploit testing tools — free and proprietary.

The sixth and final part of the book contains two chapters, which purportedly discuss the advantages of designing security into a new application right from the start, and how to improve security in an application that has already been built. In the former chapter, the authors stress the importance of balancing no design ("Skip reading Slashdot for one day...") and too much design (i.e., stalling). But the material mostly consists of the basics of designing a Web application, with no new information on security, and concludes with a brief reiteration of security principles detailed in earlier chapters. The latter chapter offers some good advice on having separate development and test environments, in addition to the production environment. The principles expounded in each of the two chapters, do not overlap at all, and yet together they apply equally to new applications under development just as much as they do to finished applications; splitting the principles up does not make sense.

Sadly, the book does not live up to its potential. In general, much of the sample code is sloppy, as exemplified by the instances noted above. The authors and the technical reviewers should have tested the attacks, and thereby found which ones don't work. Even the HTML should not be used by any new Web developer as an example of quality code that adheres to leading standards. In the HTML that they have their sample PHP code generate, the tag attribute values are in single quotes, and not double, which means all of that code would need to be changed to make it compliant with XHTML 1.0. Moreover, by choosing to use single quotes for both the attribute values and the PHP strings, the authors end up having to escape every single attribute value quote mark, which wastes space and looks ridiculous. They repeat this at the end of Chapter 6, but this time with all double quotes. Also, some of the technical decisions are rather odd, such as their setting those default values to spaces in the user table, noted earlier. A few terms are used strangely, as well, such as their statement that IIS's footprint is the number of entry points to it; actually, a Web server software's footprint generally refers to how much memory it consumes. Every chapter ends with a summary, titled "Wrapping It Up", none of which add any value to the book. There are at least three technical errata in the book that should have been caught: spaces in "u + rwx, go + rx" (page 76), and the invalid addresses "" (page 215) and "www.ballad-nonfiction/SecuringPHP/" (page 288; adding ".com" does not fix it).

On the other hand, the book's marketing copy claims that "Tricia and William Ballad demystify PHP security by presenting realistic scenarios and code examples, practical checklists, detailed visuals..." and that is certainly a fair claim. Most of the explanations are straightforward and informative. As a side note, kudos to Addison-Wesley Professional for printing this book on recycled paper; one can only hope that all publishers adopt that policy.

The primary value of Securing PHP Web Applications is that it touches upon security topics that are often glossed over or completely neglected in other PHP security books and articles. This is important, because online miscreants will be searching out every possible chink in your website's armor. You should do the same, before they strike — and this book shows how.

Copyright © 2009 Michael J. Ross. All rights reserved.
bad bots block