Wednesday, September 12, 2018

Engineering Synthesis

What is the nature of software engineering? How is it different from other kinds of engineering? Why is it so hard?

These are questions I have struggled with for many years. In my work, I have seen more than a few different takes on software engineering. Even when things start out right they seem to end at a sad place, and this has bothered me. Is it really impossible to do software "right?" Or do we just have the wrong idea about how to do it? Software engineering is a relatively new discipline, so maybe we still have some things to learn.

I'm going to draw from several sources here, and try to synthesize some ideas about engineering, science, and art. I feel kind of silly writing all these words summarizing other sources when you could just go watch the videos and read the papers yourself. But for my purposes these sources are a framework for discussing and organizing my thoughts.

Real Software Engineering

"Real Software Engineering" by Glenn Vanderburg
http://www.infoq.com/presentations/Software-Engineering

Glenn Vanderburg is a software practitioner, and he is reacting to the claim that software engineering needs to grow up and become a "real" engineering discipline. But what is "real" engineering?

There are actually a couple of different versions of this talk available online, and in one Vanderburg takes some time to talk about "how did we get here?" He digs up some history on the NATO conference in 1968 whose goal was to define software engineering. He then talks about some commonly believed myths about engineering and about how different engineering disciplines use different methods, then brings it back around to software engineering and applies what we've learned.

There were three big ideas from Vanderburg's talk that stood out to me:

  1. The model of scientists discovering knowledge and engineers then applying that knowledge is wrong.
  2. Software engineering is unique because we spend a lot of time crafting design documents and models and a trivial amount of time actually producing the end product, which is the exact opposite of most other branches of engineering.
  3. Agile methods are the best methods we have and for all practical purposes they are software engineering.

When I first watched Vanderburg's talk years ago, the big idea was the second (about the uniqueness of software engineering), but coming back to it later I was surprised to find this first idea echoed in other sources. Vanderburg gives examples of advances in knowledge that came not from academics or scientists, but instead from practitioners and engineers. One example is Robert Maillart. He was an engineer who revolutionized the use of reinforced concrete in bridge building. He did this before there were mathematical models to explain the uses and limits of reinforced concrete. Scientific advances are just as likely to come from practitioners as from academics.

My second idea from Vanderburg is that among the kinds of engineering, software engineering has some unique characteristics. If one were to build a skyscraper, one would construct designs, models, and blueprints, then those would be handed over to a construction team who would construct the building. The blueprints are relatively cheap to produce. The actual construction is error-prone and requires a lot of materials and labor. Looking at this process, it would seem very important to focus as much effort on the architecting of blueprints as possible. Once you've laid the foundation, it is expensive to rethink the footprint of the building.

If I were to apply this process to software engineering I might do something like the following: Hire a system architect to create a design document, and then get a bunch of code monkeys to actually construct the system by writing code. In my interpretation, the requirements and design document are the model and blueprints, the system architect is the architect, and the code monkeys are the construction crew. Vanderburg picked up an insight from Jack Reeves in the 1990s: this interpretation is wrong.

Customers do not pay for code, they pay for an executable. They want a working system. That is the constructed product, and it is the compiler not the code monkeys that produces it. The code is the design document and mathematical model. The code monkeys are not the construction crew, they are the architects. Source code and its type systems are a mathematical model that can be formally verified. Using a compiler, I can produce a prototype from that model instantaneously and for free. The source code also contains documentation, and to the extent that it has automated tests (also written in the same language) it is self verifying. Modern high level languages and domain specific languages can even be mostly understood by domain experts.

Software engineering is a unique engineering discipline, because source code is a unique artifact. We should be careful not to take engineering methods from a discipline where constructing a prototype is time consuming and expensive and one is necessarily forced to spend more time on up front design to avoid that cost. This will lead nicely into my third big idea, that agile methods are for all practical purposes the best kind of software engineering we know.

When I say agile methods, I mean agile with a little 'a'. I'm thinking (vaguely) of an incremental tinkering approach, versus a straight line mechanical approach. I'm thinking of a technician approach, versus a technique approach. Or as the original Agile Manifesto said, "individuals and interactions over processes and tools." I think they got that right. What is interesting is they were not the only ones to get it right. The original NATO conference on software engineering (1968!) had it right before they had it wrong.

There were two NATO conferences that were a year apart. At the first session Alan Perlis summarized the discussion on system design:

  1. A software system can best be designed if the testing is interlaced with the designing instead of being used after the design.
  2. A simulation which matches the requirements contains the control which organizes the design of the system.
  3. Through successive repetitions of this process of interlaced testing and design the model ultimately becomes the software system itself. I think that it is the key of the approach that has been suggested, that there is no such question as testing things after the fact with simulation models, but that in effect the testing and the replacement of simulations with modules that are deeper and more detailed goes on with the simulation model controlling, as it were, the place and order in which these things are done.

What he is saying is:

  1. Test early, test often.
  2. Take a breadth first approach mocking out what you need so you can get a sense for the overall system.
  3. Iteratively refine the system and replace the mocks.
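
To make those three steps a little more concrete, here is a rough sketch in Python. It is my own illustration, not something from the conference proceedings, and every name in it (`PriceSource`, `StubPriceSource`, `TablePriceSource`, `order_total`, `check_system`) is hypothetical, invented only for this example. It assumes Python 3.9 or later.

```python
from typing import Protocol


class PriceSource(Protocol):
    """Hypothetical interface: the 'simulation' fixes this contract first."""

    def price_of(self, item: str) -> float: ...


class StubPriceSource:
    """Step 2: a shallow mock so the whole system can run end to end."""

    def price_of(self, item: str) -> float:
        return 1.0  # every item costs 1.0 until the real module exists


class TablePriceSource:
    """Step 3: a deeper, more detailed module that replaces the stub."""

    def __init__(self, prices: dict[str, float]) -> None:
        self._prices = prices

    def price_of(self, item: str) -> float:
        return self._prices[item]


def order_total(items: list[str], prices: PriceSource) -> float:
    """System-level logic, written against the interface only."""
    return sum(prices.price_of(item) for item in items)


def check_system(prices: PriceSource, expected: float) -> None:
    """Step 1: the same system-level test runs at every refinement."""
    assert order_total(["tea", "cup"], prices) == expected


# The test passes against the mock first, then against its replacement.
check_system(StubPriceSource(), 2.0)
check_system(TablePriceSource({"tea": 3.5, "cup": 4.0}), 7.5)
```

The point of the sketch is only that the test exists from the start, the stub lets the whole system run before any part is finished, and the stub can be swapped out without touching `order_total`.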

That is suspiciously similar to an incremental development method. Between the 1968 NATO conference and the 1969 NATO conference things changed, and there was a clear tension between those who thought programming was best done by an expert technician, and those who thought programming was best done mechanistically by someone taught a body of scientific techniques. At the end of the 1969 conference, Tom Simpson gave a talk called "Masterpiece Engineering" which is oozing with conflicts of technician vs. technique.

There was definitely a lot of political maneuvering at the NATO conferences. There are some other resources you can investigate if you'd like. The point is the seeds of agile were there, but for some reason we ended up with 33 years of waterfall.

Engineering(,) A Path to Science

"Engineering(,) A Path to Science" by Richard P. Gabriel
http://www.infoq.com/presentations/Mixin-based-Inheritance

"Structure of a Programming Language Revolution" by Richard P. Gabriel
http://dreamsongs.com/Files/Incommensurability.pdf

Richard Gabriel's talk comes from an interesting perspective. He was involved in the Lisp community and has an academic background (he earned a PhD), but is not an academic. After working as a practitioner, he went back to school to earn a Master of Fine Arts. Upon returning to the technical community, he felt a paradigm shift had happened while he was gone. The conferences he used to attend had been renamed and were now focused on academics instead of practitioners. His entire field, Lisp systems engineering, and its journals had been deleted.

Then he was given the first scientific paper on mix-in inheritance. Being familiar with previous work done on Lisp based inheritance systems, he felt that this paper was using the same terms to describe some of the mechanisms from the Common Lisp Object System, but the terms had different meanings. Gabriel felt he was experiencing incommensurability: a paradigm shift had happened from an engineering focus to a scientific focus, and now "scientific" papers were being written that described, as new, things that engineers had already known, using the same terms but with different meanings.

The talk is definitely worth watching. It is an interesting personal story intertwined with technical discussions of the previous work versus the paper he had been given. It is an exploration of whether incommensurability can actually happen and to what extent. He also challenges the myth that science always precedes engineering.

I'm honestly not sure whether Gabriel intended his talk and paper to have a single point. Maybe he is mostly interested in relating his personal experience, but this is what I took away:

  1. In general, science does not always precede engineering, and in particular the relationship between computer science and software engineering is even more complex, because the engineers literally create the reality that the scientists study.
  2. There are two approaches to software: the systems approach, and the language approach.
  3. Making engineering subservient to science means throwing away the progress that engineers can and do make.

This was actually the first talk that started the wheels turning for me on the relationship between science and engineering. I had been told in college that scientists expand the body of knowledge and engineers apply that body of knowledge. Gabriel uses as his example the steam engine. When the steam engine was invented the popular theory used to explain its operation was the Caloric Theory of heat, which stated that there was an invisible, weightless, odorless gas called "caloric" that permeated the Universe. The amount of caloric in the Universe is constant, and its interaction with air molecules can explain heat and radiation, and from it you can deduce most of the gas laws. The Caloric Theory was a useful theory with predictive power. When Laplace adjusted Newton's pulse equations to account for caloric, he was able to more accurately predict the speed of sound.

Eventually the Caloric Theory was replaced by Thermodynamics, and amazingly steam engines continued to work! The steam engine was developed by mechanics who observed the relationship between pressure, volume, and temperature. Whether its operation was explained by the Caloric Theory or Thermodynamics made no difference to them. Yet, an engineer's invention can and does spark the curiosity of a scientist to develop a theory to explain how it is that an invention works. This is even more true in the case of computer software.

The second moral I drew from Gabriel's talk is that there are (at least) two approaches to software: a systems approach and a language approach. Gabriel acknowledges that at first he thought the incommensurability that he saw was a difference between an engineering paradigm and a scientific paradigm, but eventually he saw it as a more technically focused conflict between a systems paradigm and a language paradigm. Perhaps what Gabriel means is that you can approach either systems or languages from an engineering or a scientific perspective. However, I tend to see systems versus languages as engineering versus science.

The systems paradigm views software as interacting components forming a whole; real stuff doing real things. The language paradigm views software as abstract signs and rules of grammar conveying meaning. Good design, from a systems perspective, comes from a skilled technician following good design principles (I would even call it aesthetics). Good design, from the language perspective, comes from a relatively less skilled technician working within a language that from the outset excludes bad design through grammatical rules and compilers. The system approach tends to view software as a living organism that is incrementally poked and prodded, changed and observed. The language approach tends to view software as a series of mathematical transformations, preserving meaning. If each of the paradigms were a theory of truth, the systems paradigm would be correspondence, and the language paradigm would be coherence.

I see system versus language as engineering versus science. I view engineering as a bottom up, incremental, tinkering approach, at least when it comes to software and the way I like to practice software engineering. I view science as a top down, formal, mathematical approach. I actually like both, and I think both have their place, but when engineering is made subservient to science, we're actually losing something very important. When engineers are shut out of conferences and journals, there are discoveries that will be left unpublished, and new scientific theories left untheorized. (This was what Gabriel saw happening.)

Computer Programming as an Art

"Computer Programming as an Art" by Donald Knuth
http://dl.acm.org/ft_gateway.cfm?id=1283929&type=pdf

For those with even a cursory exposure to Computer Science, Donald Knuth needs no introduction. Knuth is coming from an academic perspective, but even for an academic his perspective is unusual. He has created and maintains several large open source software projects. This is his ACM Turing Award lecture given in 1974. He starts by quoting the first issue of the Communications of the ACM (1959). It claims that for programming to become an important part of computer research and development (to be taken seriously) it needs to transition from being an art to a disciplined science.

The big idea I draw here is: Programming can be art (in the "fine art" sense), which means it is (at least sometimes) a creative endeavor.

Knuth first explores the definition of "art" and "science." He looks at their use over time. Their use was (and is) not consistent. At times "science" and "art" are used interchangeably. "Art" was used to describe something made of human intellect, not nature. Eventually "science" came to mean "knowledge" and "art" came to mean "application." Though even that usage is not universal. To Knuth an "art" is something that is not fully understood and requires some aesthetics and intuition. A "science" is something well understood. Something that can be mechanized and automated. It is something that can be taught to a computer. Can computer programming be taught to a computer?

Knuth does not think that programming can ever be fully automated. However, it is still useful to automate as much as possible, since it advances the artistry of programming. He believes, and cites others, that progress is made not by rejecting art in the name of science, nor science in the name of art, but by making use of both. He makes reference to C. P. Snow's "The Two Cultures" as an example of another voicing concern about separating art and science. At this point when he speaks of art he means something more along the lines of "fine art" than "engineering."

Knuth goes on to talk of creativity, beauty, art, and style. He hits on how sometimes resource constraints can force a programmer to come up with an elegant solution, and this has an artistic aspect to it. He also encourages people to, when it comes to programming, make art for art's sake. Programs can be just for fun.

Knuth's talk is focused on the act of programming, and when he deals with engineering versus science he means with respect to the act of programming. To what extent can the act of programming be made automatic? To what extent must it remain a human act of creativity? This is a little further afield of the previous sources, but Knuth's insistence on seeing programming as a creative act is the big idea I drew from his talk, and is really the point of his talk.

Given that programming can sometimes be a creative act, it raises a lot of questions in my mind. Is programming always a creative act? If programming is a creative act, how should a programming project be managed? Is the high failure rate of software projects related to this? Perhaps this ties back into Tom Simpson's "Masterpiece Engineering" satire. Imagine a project manager with a room full of artists creating Gantt charts and task dependency graphs to plan out the creation of a new masterpiece!

On the other hand, nothing appeals to the ego more than seeing oneself as a grand master of art. There should be a measure of moderation here. I think there is benefit to trying to understand programming as an artistic (or at least "creative") endeavor, whatever that means, but we should not go crazy with hubris.

Better Science Through Art

"Better Science Through Art" by Richard P. Gabriel and Kevin J. Sullivan
https://www.dreamsongs.com/Files/BetterScienceThroughArt.pdf

"Better Science Through Art" by Richard P. Gabriel
https://www.tele-task.de/archive/video/flash/12636/

I have already covered some of Gabriel's background, but I will say that having been involved and educated in both a technical field and an artistic field gives him a unique perspective on the relationship between science, engineering, and art.

I unfortunately don't know much about Sullivan's background, other than that he is a professor of computer science at the University of Virginia. His collaboration with Gabriel produced one of my favorite papers ever. I don't know that I can tease out what should be attributed to whom. I will be basing my comments on Gabriel's talk, but I don't intend to attribute everything to him, or to diminish Sullivan's contributions.

The big ideas I drew from this are:

  1. Science, engineering, and art all have at their core "disciplined noticing."
  2. Disciplined noticing is a skill that requires practice.
  3. The creation of knowledge—even in the case of science—requires an abductive leap powered by creative spark.

This is a really great talk, and covers a lot of ground. It is entertaining, insightful, and very worth watching. He attacks some common caricatures of science, engineering, and art, and digs into the actual process behind each. In the end, he finds that there are a lot of similarities to the methods in science, engineering, and art. It is a process of exploration, discovery, and verification. He calls it disciplined noticing.

I have found this to be true in my experience. Just as people hold a caricature of science as straight line progress, the monotonic aggregation of knowledge, there is a similar caricature of software development. My experience has been that writing software is a creative, exploratory process. Sometimes I go down an alley, but find that I need to back out and take a different turn. I may write a test, run it, change some code, change a test, run it, think for a while, delete a bunch of code and rewrite it all.

In my experience this process—writing, evaluating, and rewriting—has much more in common with writing a novel than constructing a building.

Conclusion

This long meandering post must come to an end. First of all, I would highly recommend looking at each of these cited sources. They will reward you. Perhaps you may even find that I have seen them through my own preconceived notions, and you may draw an altogether different conclusion from them. So be it.

This "conclusion" is not really a conclusion, but a way-point. I started on this journey to understand the nature of software engineering, how it is different from other kinds of engineering, and why it is so hard. I ended up at a place where, intuitively, I knew I would end up. I will not make an absolute statement. I will say that at least sometimes (and in my experience) software development is a creative process more akin to creative writing.

I have also seen that there is a tremendous amount of creativity in both engineering and science. I believe that at the core of engineering, science, and art is a drive to understand and influence the world, which requires observation, testing, and evaluation. I don't claim to know how to do software engineering "right," but I don't think we will ever do it right if we refuse to see that creativity (which is at times unpredictable) is a key part of the effort.

I have learned that both engineering and science are useful for discovering and validating knowledge. Scientists and engineers should collaborate. Neither should be seen as primary at the expense of the other. They can even be seen as external expressions of the same process, sometimes using similar tools and techniques.

I have learned that software is unique in engineering. A blueprint is a written artifact using specialized notation, but the building it describes must be brought into existence through a complex, error-prone process. Code is also written using specialized notation, but the gap from code to execution is much smaller. There are pitfalls and challenges, no doubt, but I would like to see how the nature of what we produce can change how we produce it. I'm still holding out hope that the nature of software can change the face of the human organizations that produce it.

Practically, what this all means is that a software engineering process should be iterative. It should embrace unpredictability and allow space for the creative process. In the same way that a painter never thinks his painting is complete, software should be developed in a way that continuously produces value, so the project could be closed down and the product shipped at any point, and the customer is still happy with the result.

So I end back at the beginning with Vanderburg. I don't think that agile is the last word, but I think it is the best we have so far.
