Mastering Oracle SQL
This book review was published by SDOUG, .
If and when you wish to access data stored in an Oracle database, you will need to utilize Oracle SQL statements. This will be true regardless of whether you are entering your queries in raw form at a command line of some sort (such as within the SQL*Plus utility), or whether you are querying and/or changing data via code inside of a PL/SQL or Java program. The only conceivable exception to this, is if the operations you wish to perform have already been captured inside of Oracle stored procedures — in which case you need only call those procedures with the appropriate parameters. But even in that unlikely case, you are well advised to examine those procedures to confirm that they are querying the Oracle tables correctly. Those queries will be written in Oracle SQL. Consequently, a solid understanding of SQL, and Oracle's extensions to the language, are essential for working effectively with any Oracle database.
But while the need to learn Oracle SQL is quite evident, the best approach to learning Oracle SQL is less clear-cut, to which most experienced Oracle programmers and DBAs will energetically attest. Most if not all SQL books on the market may describe the language comprehensively, and yet rarely explain the real world usage of the commands, much less how to use those commands to solve the most difficult and common data query problems that confront the typical SQL programmer. Furthermore, those SQL books typically cover just the "vanilla" flavor of SQL (usually the ANSI standard), and not the variations implemented by the major RDBMS vendors, including Oracle. The frustrated programmer may then turn to the vendor documentation, in hopes that all of the extensions — in addition to the language itself — will be clearly explained. But again, the syntax of all the SQL commands may be fully delineated, yet with little or no discussion of their practical applications, nor how to utilize their capabilities with maximum ease and performance efficiency.
If you have found yourself in such a quandary, you are not alone. Sanjay Mishra and Alan Beaulieu faced the same challenges, and decided to do something about it: They wrote a book. Specifically, they authored Mastering Oracle SQL, recently published by O'Reilly & Associates (published in April 2002 under the ISBN 978-0596001292). Their book is organized into 14 chapters, and compactly fitted into only 307 pages. Unlike some database books on the market, this title will certainly not break your bookshelf, or even your wallet, for it can be had for less than $35. If you have a need to work with Oracle SQL to any significant extent, then this volume will be worth every penny. It will help you craft better SQL queries, and save you time when searching for proven solutions to real-life database problems.
The first chapter is a clear introduction to SQL (Structured Query Language) independent of any particular vendor's implementation. The authors begin with a very brief history of the language, and then present the entity-relationship model of a simple database, comprising 14 tables. This model is used for all of the examples in the book, and fortunately is designed to be as realistic as possible. The authors' choice of using one single database model throughout the book will be appreciated, especially by those readers who have struggled with other programming books which force the reader to learn one new contrived example after another for every chapter and/or topic. The authors then present the most common SQL statements: SELECT, INSERT, DELETE, and UPDATE. They also briefly explain the separate clauses in a SELECT statement, including the ORDER BY clause. They also describe how to use the DISTINCT keyword to eliminate duplicate rows from a result set.
In Chapter 2, the authors explicate how one can filter data in SQL statements, in order to limit the results returned by query statements, or to limit the table rows affected by DML (data manipulation language) statements. To that end, they present the purpose and usage of the WHERE clause, including expressions, and the various conditions that one can use in a WHERE clause to create those expressions: equality (=) and inequality (!=), membership ([NOT] IN), range ([NOT] BETWEEN... AND...), and matching ([NOT] LIKE). In addition, they explain the meaning of NULL values, and how to handle them.
The next chapter discusses how to access data contained in multiple tables which are related to one another through their primary and foreign keys. The authors explain how to perform joins on those tables, including the critical concepts of inner joins, Cartesian products, equi-joins and non-equi-joins, outer joins, self joins (both simple and outer), and subqueries, which in some situations can be absolutely invaluable. Another important topic examined in this chapter is the join view, which is essentially a view based on a join. Examples are given to demonstrate to the reader how to use INSERT, DELETE, and UPDATE statements on join views. The chapter is wrapped up with a clear description of the new ANSI-standard join syntax introduced in Oracle9i, including the new ANSI outer join.
Throughout Chapter 4, techniques for creating summary information — such as totals and subtotals of column values — are presented, using various group operations. Utilizing these techniques, the reader will be able to apply various aggregate functions to grouped rows. For instance, one can calculate the average, minimum, and maximum values of a column for a particular subset of rows in a result set. The authors explain how to handle NULL values, as well as the usage of the DISTINCT keyword. In addition, the GROUP BY clause is presented, followed naturally by the HAVING clause. Illustrative examples are provided for all of these new concepts.
The fifth chapter delves into a topic touched upon just briefly in an earlier chapter: subqueries, which come in two flavors, correlated and noncorrelated. The former is a type of subquery in which you are referencing one or more columns found in the outer (i.e., containing) SQL statement. The latter type of subquery, of course, has no such referencing of any outside columns in the subquery. Such statements may involve querying on a single row, or a single column, or perhaps multiple rows, or multiple columns. All of these situations are explored, with examples. A valuable and rarely seen distinction made by the authors is that a FROM clause is more than just a list of tables and views; rather, it is a list of data sets. This broader perspective helps the reader to better understand the use of SELECT statements inside of FROM clauses — typically known as inline views. As part of this discussion, the authors explain query execution as well as data set fabrication. Before presenting a useful case study illustrating subqueries, the authors describe how certain features of Oracle SQL can pose limitations upon SQL statements. This includes the usage of hierarchical and aggregate queries, as well as inline views, and the WITH CHECK OPTION feature. Methods of overcoming these limitations are examined.
Chapter 6 focuses on temporal datatypes, in other words, date and time data. This is an important topic, regrettably not adequately discussed in most SQL programming books. Oracle's internal DATE storage format is explain, in addition to the two most commonly used functions for converting temporal values: TO_DATE and TO_CHAR. Various date formats are examined, including the default format, which is adequate for many situations. An extremely useful table (not the relational kind!) is included to show all of the Oracle date format codes, with examples for each one. This handy table spans four pages, and may alone be worth the price of the book when needed by the harried programmer under temporal pressure. A number of other topics are presented, including two- versus four-digit years, date literals, and the ISO standards. Next the authors explore techniques for manipulating (non-romantic) dates, such as calculating the number of months in between two dates, or getting the date of a specified weekday. To avoid accidentally waking up colleagues in other time zones, the wise database programmer will utilize Oracle's built-in NEW_TIME function, which can convert dates in times from one time zone to another. The authors also explore date ranges, date pivot tables, and summarizing temporal columns. Oracle9i introduced some new DATETIME features, including fractional seconds, date and time intervals, and the previously mentioned time zones. The chapter concludes with a thorough discussion of how to work with timestamps and intervals using many of Oracle's built-in functions.
Even though the underpinnings and origins of relational databases are found in mathematical set theory, this may not be fully realized by the reader until he or she begins working with set operations in SQL (covered in Chapter 7). They comprise UNION ALL (combining the results of two SELECT statements into one data set), UNION (combining the results but discarding any duplicate rows), MINUS (removing from one result set any rows found in the second result set), and INTERSECT (returning just those rows which are common to both result sets). It is shown how to use set operations to compare two tables with one another. Set operations are also useful for handling NULLs in various compound queries.
Unlike many relational database management systems on the market, Oracle has native capabilities for representing and manipulating hierarchical information. For instance, one may have a company's organizational chart represented in an employee table that references itself, and one would like to query the table in order to get the row representing the company's president. This row would represent the root node in the hierarchy. Such operations can be done using standard SQL, but can be more easily accomplished using Oracle's hierarchical extensions. By learning these functions, in Chapter 8, the reader will be able to perform operations such as listing rows in their hierarchical order, or deleting a subtree within a hierarchy. These capabilities may also find application when one is working with XML documents, which have hierarchical relationships, by definition.
In the ninth chapter, the authors explain how to use some commonly employed functions, specifically, DECODE, NVL, NVL2, and CASE. The brevity of this particular chapter in no way reflects the potential utility of these functions. They essentially allow you to simulate conditional logic in SQL, a language which otherwise can only be used in a declarative fashion. In Chapter 10, the authors explain how to use partitions and collections, and how to perform operations on specific partitions and subpartitions. Their discussion encompasses partitioning indexes, partitioning methods (range, hash, composite, and list), partition specification, and partition pruning. In addition, the reader is shown how to utilize the object-oriented features introduced in Oracle8, specifically, object types, attributes, tables, and parameters. One disadvantage with relational tables, at times, can be the lack of any ordering of the rows. This can be overcome by pulling the data into one or more instances of Oracle's collection types: variable arrays, and nested tables. The authors show how to create these collections, how to query on them, and how to take them apart in preparation for relational operations, using views.
Even though this book focuses on SQL within Oracle, such a volume would be incomplete without a discussion of PL/SQL, which is the procedural programming language that can incorporate SQL statements. In the eleventh chapter, the authors briefly describe PL/SQL procedures, functions, and packages, as well as how to call stored functions from within Oracle SQL statements. Of practical importance are the restrictions on calling PL/SQL functions from within SQL statements. The chapter describes purity levels, consistency issues, and other restrictions. The usage of PL/SQL functions within Oracle SQL is not limited to SELECT statements — in some cases they may be very useful in the other DML statements (INSERT, DELETE, and UPDATE). As before, the authors provide clear examples of such usage.
Chapter 12 presents a number of advanced group operations, which probably will never be used by the average Oracle SQL developer, but may be of interest nonetheless. They are ROLLUP (both the new way, and the old way, using UNION), CUBE, and GROUPING (including GROUPING SETS). Concatenated groupings and grouping on composite columns, as well as the GROUPING_ID and GROUP_ID functions, are covered, with examples. For the programmer creating or modifying a typical Oracle-based decision support application, these advanced group operations features may prove essential. In another area, if the developer is working with decision support queries, then chapter 13 will probably be of value, because it provides an overview of analytic SQL, including ranking, percentile, value range, distribution, percentage rank, windowing, reporting, and ratio functions. Even though none of these topics are explored in depth, the reader will at least be given an idea of what is possible when creating analytic and decision support queries, and how to get started.
The book concludes with a chapter explaining a handful of SQL best practices. In some respects, particularly for the more advanced Oracle SQL programmer, this could be the most valuable portion of the book, even though it is also the shortest. The authors explain why it is more efficient to use EXISTS instead of DISTINCT, the trade-offs between EXISTS and IN, between WHERE and HAVING, and finally between UNION and UNION ALL. They also warn against unnecessary parsing, including such considerations when using bind variables and table aliases. They conclude with a brief discussion of when one should consider using literal SQL for decision support systems.
The sample code and the errata for the book may be obtained from the O'Reilly Web page for this title. To the experienced database programmer, particularly one fluent in the extensions of other RDBMS flavors of SQL, the material in this book will appear vaguely familiar — even somewhat unnecessary, or at least not warranting an entire book. But to the new database developer, especially one who has never worked with any SQL extensions before, this book should prove valuable. It may also serve as a general reference for more simple SQL usage. As with most technical sources, it is all dependent upon one's background. It should be noted that while the authors do not present an overwhelming amount of material, their writing style and examples are straightforward, quite unlike many of the turgid and weighty tomes on the market, whose authors seemingly devote more energy to filling pages then enlightening the reader with clear explanations. In contrast, Mishra and Beaulieu's comprehendible and approachable style is typical of O'Reilly's technical books, including those in their Oracle series. This newcomer is a welcome addition to the family.