Foreword
There is a humorous, computing-related aphorism that goes like this:
"There are 10 types of people: those who understand
binary, and those who don't."
Besides being amusing to people who understand number representation,
this saying can be used to group people into four (or 100)
categories:
1. Those who will never quite get the meaning of the statement, even if
   it is explained to them
2. Those who need some explanation, but will eventually get the meaning
3. Those who have the background to grasp the meaning when they read it
4. Those who have the knowledge and understanding to not only see the
   statement as obvious, but be able to come up with it independently on
   their own
There are parallels for these four categories in many different areas
of endeavor. You can apply it to art, to cooking, to
architecture...or to writing software. I have been teaching aspects
of software engineering and security for over 20 years, and I have
seen it up close. When it comes to writing reliable software, there
are four kinds of programmers:
1. Those who are constantly writing buggy code, no matter what
2. Those who can write reasonable code, given coaching and examples
3. Those who write good code most of the time, but who
   don't fully realize their limitations
4. Those who really understand the language, the machine architecture,
   software engineering, and the application area, and who can write
   textbook code on a regular basis
The gap between the third category and the fourth may not seem like
much to some readers, but there are far fewer people in that last
category than you might think. It's also the case
that there are lots of people in the third category who would claim
they are in the fourth, but really aren't...similar
to the 70% of all licensed drivers who say they are in the top 50% of
safe drivers. Being an objective judge of one's own
abilities is not always possible.
What compounds the problem for us all is that programmers are
especially unlikely to realize (or are unwilling to admit) their
limits. There are levels and degrees of complexity when working with
computers and software that few people completely understand.
However, programmers generally hold a world view that they can write
correct code all the time, and only occasionally do mistakes occur,
when in reality mistakes are commonplace in nearly
everyone's code. As with the four categories, or the
drivers, or any other domain where skill and training are required,
the experts with real ability are fewer in number than those who
believe they are expert. The result is software
that may be subtly—or catastrophically—incorrect.
A program with serious flaws may compile properly, and work with
obvious inputs. This helps reinforce the view that the code is
correct. If something later exposes a flaw, many programmers will say
that a "bug" somehow
"got into the code." Or maybe
"it's a computer
problem." Neither is candid. Instead, whoever
designed and built the system made mistakes. As a profession, we are
unwilling to take responsibility when we code things incorrectly. Is
it any wonder that a recent NIST study estimated that industry in the
United States alone is spending $60 billion a year patching and
customizing badly written software? Is it a surprise that there are
thousands of security patches per year for common software platforms?
We've seen estimates that go as high as $1.5
trillion in damages per year worldwide for security problems alone,
and simple crashes and errors may be more than 10 times as much.
These are not rare flaws causing problems. There is a real crisis in
producing quality software.
The reality is that if we truly face up to the situation, we might
reassess some conventional beliefs. For instance, it is not true that
a system is more secure because we can patch the source code when a
flaw is discovered. A system is secure or it is not—there is no
"more secure." You
can't say a car is safer because you can replace the
fenders yourself after the brakes give out and it goes over a cliff,
either. A system is secure if there are no flaws
that lead to a violation of policy. Being able to install
the latest patch to the latest bad code doesn't make
a system safer. If anything, after we've done it a
few times, it should perhaps reduce our confidence in the quality of
the software.
An honest view of programming might also cause us to pay more
attention to design—to capturing requirements and developing
specifications. Too often we end up with code that is put together
without understanding the needs—and the pitfalls—of the
environment where it will be used. The result is software that
misbehaves when someone runs it in a different environment, or with
unexpected input. There's a saying that has been
attributed to Brian Kernighan, but which appears to have first been
written down by W. D. Young, W. E. Boebert, and R. Y. Kain in 1985:
"A program that has not been specified cannot be
incorrect; it can only be surprising." Most of the
security patches issued today are issued to eliminate surprises
because there are no specifications for the underlying code. As a
profession, we write too much surprising code.
I could go on, but I hope my points are clear: there are some real
problems in the way software is being produced, and those problems
have serious, and expensive, consequences. However,
problem-free software and absolute security are almost always beyond
our reach in any significant software project, so the next best thing
is to identify and reduce the risks. Proven approaches to reduce
these risks include using established methods of software
engineering, exercising care in design and development, reusing
proven software, and thinking about how to handle potential errors.
This is the process of assurance—of building trust in our
systems. Assurance needs to be built in rather than asserted after
the software is finished.
That's why this book is so valuable. It can help
people write correct, robust software the first time and avoid many
of the surprises. The material in this book can help you provide a
network connection with end-to-end security, as well as help you
eliminate the need to patch the code because you
didn't add enough entropy to key generation, or you
failed to change the UID/GID values in the correct order. Using this
code you can get the environment set correctly, the signals checked,
and the file descriptors the way you need them. And along the way,
you can read a clear, cogent description about what needs to be set
and why in each case. Add in some good design and careful testing,
and a lot of the surprises go away.
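
To make the UID/GID ordering point concrete, here is a minimal sketch in C of permanently dropping privileges on a POSIX system. It is an illustration under stated assumptions, not code taken from this book: the helper name drop_privileges is hypothetical, and the sketch assumes a Unix-like platform that provides setgroups(). The idea it shows is that group IDs are changed before the user ID, because once the user ID has been given up the process no longer has permission to change its groups.

```c
#define _DEFAULT_SOURCE          /* exposes setgroups() on glibc */
#include <sys/types.h>
#include <grp.h>
#include <unistd.h>

/*
 * Hypothetical helper (name and signature are assumptions for this sketch):
 * permanently switch the process to new_uid/new_gid.
 * Returns 0 on success, -1 on any failure.
 */
int drop_privileges(uid_t new_uid, gid_t new_gid)
{
    /* Only a privileged process may replace the supplementary group list. */
    if (geteuid() == 0) {
        if (setgroups(1, &new_gid) != 0)
            return -1;
    }

    /* Change the group ID first, while we still have the right to do so. */
    if (setgid(new_gid) != 0)
        return -1;

    /* Change the user ID last; after this we cannot change groups. */
    if (setuid(new_uid) != 0)
        return -1;

    /* Check the result rather than trusting the return codes alone. */
    if (getgid() != new_gid || getegid() != new_gid)
        return -1;
    if (getuid() != new_uid || geteuid() != new_uid)
        return -1;

    return 0;
}
```

Reversing the two calls (setuid() before setgid()) is the kind of ordering mistake described above: the user ID change succeeds, the group change then fails, and the process keeps running with a privileged group. A caller should treat any nonzero return as fatal and exit instead of continuing.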
Are all the snippets of code in this book correct? Well, correct for
what? There are many other things that go into writing reliable code,
and they depend on the context. The code in this book will only get
you partway to your goal of good code. As with any cookbook, you may
need to adjust the portions or add a little extra seasoning to match
your overall menu. But before you do that, be sure you understand the
implications! The authors of this book have tried to anticipate most
of the circumstances where you would use their code, and their
instructions can help you avoid the most obvious problems (and many
subtle ones). However, you also need to build the rest of the code
properly, and run it on a well-administered system. (For that, you
might want to check out some of the other O'Reilly
books, such as Secure Coding by Mark Graff and
Kenneth van Wyk, and Practical Unix and Internet
Security by Simson Garfinkel, Gene Spafford, and Alan
Schwartz.)
So, let's return to those four categories of
programmers. This book isn't likely to help the
group of people who are perpetually unclear on the concepts, but it
is unlikely to hurt them. It will do a lot to help the people who
need guidance and examples, because it contains the text as well as
the code. The people who write good software most of the time could
learn a lot by reading this book, and using the examples as starting
points. And the experts are the ones who will readily adopt this code
(with, perhaps, some small adaptations); expert coders know that reuse
of trusted components is a key method of avoiding mistakes. Whichever
category of programmer you think you are in, you will probably
benefit from reading this book and using the code.
Maybe if enough people catch on to what it means to write reliable
code, and they start using references such as this book, we can all
start saying "There are 10 kinds of computer
programmers: those who write code that breaks, and those who read
O'Reilly books."
—Gene Spafford, June 2003