Monday, May 21, 2018


If you're like me, you spend 8+ hours a day in front of a screen. About five years ago, I decided that I needed better hobbies than learning new programming languages and writing code for personal projects. I wanted to find ways to learn new skills and connect with people. I've done that by playing board games at local meetups and building a robot, and I've done that with gardening.

Gardening has been incredibly frustrating and incredibly rewarding in a roller-coastery kind of way. I'd like to share my journey with you in an attempt to get you interested in gardening. I'll share some resources I've found interesting and useful.

Why gardening?

I chose gardening for many different reasons. I remember my parents having a garden when I was a kid, and I wanted to have a hobby that my kids could be involved in and excited about. I like to eat things like tomatoes that my wife does not often buy, because no one else (including her) likes them. I wanted to do something outdoors. I wanted to become a little more self-sufficient.

Those are some of my reasons, but maybe you have other reasons. Maybe you'd like to reduce your carbon footprint by producing your own food that doesn't get shipped halfway across the world. Maybe you like the idea that food from your garden is essentially tax-free income. Maybe you want to increase the diversity in your diet and/or help preserve and conserve heirloom food varieties that are endangered. Maybe you don't want to grow food at all, but flowers that provide you with a vibrant, delicate beauty.

How gardening?

There are many ways to garden from containers to raised beds. One of the things I enjoy about gardening is an entire world of new things to learn. It is a gateway hobby into things like cooking, canning, composting, carpentry, and other words that begin with 'c'.

I have focused mostly on fruits and veggies, since I want to be able to eat from my garden, but I've also grown (and grow more and more) flowers. I've grown some edible flowers and some inedible. It is incredibly satisfying to have some color around the house.

I started small with some containers on my deck. I used a couple of EarthBoxes, then built my own DIY EarthBoxes. I like the sub-irrigated planter (SIP) concept so much that I'm planning on putting in a raised bed SIP in my backyard, automatically fed by rain barrels. If you want to learn more about SIPs, check out

Gardening (like most hobbies) can be as expensive as you let it. You can buy all kinds of gardening gadgets and gizmos. One of my goals is to make gardening as economical as possible. To garden you need:

  1. Plants
  2. Sun
  3. Water
  4. Nutrients

The sun part is pretty easy, since my back yard is south facing. I just need to work around the shadows cast by trees and the deck.

You can buy seeds pretty cheaply, but you can also harvest seeds from your plants, so you don't have to continually buy seed packets. This will only work with open-pollinated (OP) plants. Check out this video to learn about OPs, hybrids, and heirlooms: Often, it is easier to buy seedlings at a nursery or farmer's market.

You can also plant perennials like strawberries and asparagus. These don't need to be replanted every year. You plant them once and you can harvest for years.

You can obviously water your plants with your tap, but rain barrels are a way to save money by taking advantage of an abundant resource over our heads. You can buy rain barrels, or you can make your own. My water company even gives a $30 rebate each for up to two rain barrels that I install.

Plants need nutrients, and nutrients can be provided by fertilizer. I still use fertilizer occasionally, but I've opted to make my own compost. Unfortunately I don't have many trees whose leaves I can compost. This is usually the easiest way to make compost. However, I am composting what leaves I have along with grass clippings and cardboard boxes from all my Amazon Prime orders. I compost trimmings from my garden, and kitchen waste. I'm even thinking about getting some composting worms! Here is a video about how ridiculously easy it is to compost:

Lessons Learned

I've been gardening about five years, and here are some lessons I've learned.

Time and timeliness.

As a software engineer, I work in a field where I'm constantly learning, and there's a new JavaScript framework every week. I enjoy being more aware of the weather and seasonal rhythms. Plants work on a different timescale. If something goes wrong with the crop this year, I may have to wait another whole year to try again. That can be frustrating, but it can also be an opportunity both to think over a longer timescale and to be very focused on what is happening right now because the stakes are high.

Everything wants to kill your plants.

In container gardening on my deck I've dealt mostly with insects, and there are billions of them. When I moved into raised bed gardening with my strawberry patch, I had to deal with deer eating all the leaves off my strawberries. For the past couple of years it has been impossible for me to grow zucchini or squash, because vine borers have eaten them from the inside out. I'm not necessarily a fan of squishing bugs, but there was nothing more satisfying than digging those buggers out and squishing their fat bodies. It was a kind of anger management program.

The lesson is you need to think about pest management from the beginning. Talk to your neighbors about what pests they've dealt with in their gardens. Or at least be prepared that the first year could be rough until you know what you're up against. When you do know what you're up against...research! If you live in the US look up your local cooperative extension website. Virginia's has all kinds of great publications for growing things in my region.

Your plants want to live

Even the sun can sometimes be brutal on your plants. I tried seed starting a couple of years ago. The last step is to "harden off" your plants by gently exposing them to the elements. I was a little less than gentle and nearly killed my plants.

After the hardening off incident I felt like a bad plant daddy, but the amazing thing was my plants came back. They want to live. They are partners in this gardening adventure.

It is satisfying to make things grow

It can sometimes be difficult to diagnose what is wrong with a plant: is it overwatered, underwatered, missing some nutrient, etc.? Plants are complicated yet fascinating living things. It is worth the effort to understand them and work with them. One of the most fascinating books I've read is Botany for Gardeners by Brian Capon. I really enjoyed Brian's writing style, and it is a very approachable introduction to cellular function, propagation, and the fascinating life of plants.

In the end there is a lot to learn, and it is hard work, but it is so satisfying to nurture a living thing.

It is satisfying to work hard

I have a personal rule for myself that as much as possible I will refuse to have someone else mow my lawn. It saves money. I listen to podcasts and audio books. I like to walk around my house and property (only 1/3 acre but still) and see how things are doing. It can be hard work since my yard is mostly a hill, but I like to get the exercise.

Gardening can be hard work, too. One Sunday afternoon, in addition to mowing and edging, I pulled out two bushes (which if you've ever done, then you know), and planted an apple tree and six red raspberry canes. I was sunburnt and sore, and paid for it the next day, but it was satisfying, and I'm looking forward to the fruit of my labor (literally!).

Play the odds

I recommend starting small, because like any hobby you can get excited and spend a lot of money before you realize it. However, you also have to know that gardening is about playing the odds, so don't start too small. When you start seeds, you put three in each hole, and when they sprout you thin them down to just the strongest of the seedlings. If you buy tomato seedlings from a nursery, don't just buy one, buy two or three. You have to expect that some plants won't survive.

It can also be helpful to plant more than one kind of thing. You may not get everything you want, but you should plant a diverse mix of plants and enjoy whatever you get. If you only plant cucumbers, then a horde of cucumber beetles can destroy everything, but if you also have tomatoes, then it's not a total wash.


Have I accomplished my goal of learning new skills and getting to know people? Absolutely! Of the five houses that border mine three are gardeners, and when I'm out early in the morning tending my garden my neighbors are often out, too. I've had chances to get to know them.

I've gotten outdoors. I've gotten plenty of exercise. My kids are involved and excited about gardening. They even eat things they normally wouldn't, because we've grown them ourselves.

If you want a hobby to get you away from the screen and doing something physical in the real world, then give gardening a go.

Thursday, September 7, 2017

The Ethics of Software Quality

Security professionals are in a hard place. If there is a security breach, they take the fall. However, if they do their job right, no one notices. Further, they may even meet resistance to doing their job right because they are being overly cautious, taking too much time, costing too much money, etc., etc.

I think a software professional who wants to create quality software faces the same challenge. You may deliver quality software, but then get accused of taking too long (according to some arbitrary idea someone has) or "gold plating." You get compared to co-workers who write code much faster, even though it may have more bugs. Focusing on speed as a primary metric for software development is a race to the bottom.

This is not to say that there aren't times when something needs to be timeboxed, or a programmer needs to resist "gold plating." It is possible to fall into a trap of tweaking and refactoring ad infinitum. However, I don't find that there is a bright line or objective standard for judging this. Maybe that is because I believe software development to be a creative, exploratory process, so I'm apt to think there's more than a little taste and discernment.

To produce quality software you must take an ethical approach. What do I mean by this? While it seems obvious that there are ethical issues in software development---for example poor quality software wastes time and money, causes frustration, and in the extreme case can cause damage to property and loss of life---that's not what I mean.

What I mean by "ethical approach" (and maybe there's a better term for it) is you must have an intrinsic motivation to create quality software. You have to do it because "it's the right thing." You will rarely get support from managers to produce quality software. You will shoulder the blame for quality issues in your code. If your code is beautiful and functional and bug-free, rarely will anyone even notice, let alone commend you.

How can you develop a "software quality conscience"? I don't have all the answers, but I have a couple of suggestions:
  1. Read good code and read about good code. If it is garbage in, then it will be garbage out.
  2. Surround yourself with other people who care about quality. Find a team of like-minded people, whether it is at work or not.
  3. Keep things in perspective. I find, as I'm further into my career, that I've had bosses bluster at me to get things done by a certain time ("do or die"), and found that it didn't really have a huge impact on the success or failure of my project or company. Don't be insubordinate or lazy, but don't buy into the hype. Be realistic.
You are responsible for fighting the good fight. So step up.

Monday, February 27, 2017

Continuous Planning

"In preparing for battle I have always found that plans are useless, but planning is indispensable." -- Dwight D. Eisenhower

There is a tension between engineering on the one hand, and on the other hand those who would like to know when the task will be done. A product must be marketed, documented, sold, and supported. "When it's done," is useless when you're trying to sell to a customer against a market full of competitors. However, the software we write gets more complex each day, and the process for bringing it to life is complex. Complexity means unknowns, and unknowns mean uncertainty. A software project is like a hurricane with a cone of uncertainty preceding it. This tension between the desire to know and the reality of uncertainty is a fundamental part of working a software project (and probably other kinds of projects).

Before going too much farther I will state my assumption: a completion date is an output not an input, and the most effective tool for managing a completion date is changing the amount of work you want to do (i.e. "scope").

You cannot take a date and work backwards. This is no different than taking a date and working forwards. Well actually, there is a big difference. In working forward, you can always push the completion date out. In working backwards you cannot start any earlier than now. The completion date inevitably follows from when you start, how quickly you can work, and how much you are trying to do.

You can spend money on tools, training, consultants, but these each have a time cost.

You can add more people, but in order to establish a context on a project a new person must learn a code base, tools, technologies, personalities of the team, and to do so he or she must take time from an otherwise productive member of the team.

You can have the current team work overtime, but too much of that will cause quality issues and burnout.

You can relax expectations about quality, but that is just trading your future time to get something done more quickly and temporarily.

The best thing you can do to manage a completion date is to cut the amount of "stuff" you are trying to do, or to rearrange the order in which you will do it, so you get the things you want earlier than you otherwise would have.

Given that a date is an output, as engineers and managers we try to navigate this tension between the desire to know and the reality of uncertainty with planning, but there's a problem with plans: they're useless. Imagine planning a single task. When will it be done? Well, if you ask one engineer she will give you an estimate based on her skill and experience. If the task is ever given to another engineer, then that estimate is invalidated. On top of that, an engineer (or human really) is notorious for estimating only the amount of work she must do. She doesn't think about QA testing, deployment, and data migrations, among other things. Nor does she think to factor in overhead like meetings, filling out time cards, learning new skills, bonding with coworkers, etc.

That is just at the most atomic level of estimation. Once you start to think about collaboration things get more complex. Does our engineer need to get a review from a coworker? That coworker is now being taken off of his task to do the review, which can lead to delays. What if our engineer needs assistance from someone more familiar with a particular technology or part of the code base? What if our engineer wants to brainstorm with another engineer? If our engineer gets delayed then any tasks that were dependent on her task also get delayed.

Now imagine making a plan for a product that spans several teams and tens or hundreds (or thousands??) of people. If you don't know everything that everyone is working on and how they are all related, then you can't plan anything out with certainty. And that only takes into account what can be known; there are still unknowns (like whether someone will get sick). This is the uselessness of a plan.

Well, a plan is not entirely useless. It is probably very accurate for the tasks that will be started in a few days, but entirely inaccurate for the tasks that will be started in three or six months.

So, there are two problems with a plan: 1) it must be updated to reflect new information, and 2) it fails to take into account the "cone of uncertainty."

Updating a plan seems easy enough; however, the larger the plan, the more work it will take to keep updated. One could certainly employ an army of project managers who verify that the task breakdown, estimates, and dependencies have not changed; that you've taken into account every meeting, vacation plan, all the testing, deployment, and overhead. Ideally the plan would be updated continuously (so it's more of a "dashboard" than a "plan"). More valuable than knowing that three months ago we thought a task should be complete on such and such a date would be knowing when we think it will be completed as of now with all the latest information we have access to, but that would create quite a drag on the entire team.

Even if you could keep the plan up-to-date, it gives the false impression that one can know precisely when a task will be complete. You may be able to predict the completion date of a task that starts tomorrow, but not for a task that will start in three months. Three months provides plenty of time for both knowns and unknowns to change when the task could even start, let alone when it would complete.

Usually, this uncertainty is handled by "padding" the date, but this is not enough. A single point-in-time completion date conveys certainty, and this is certainly wrong. The completion of a task should always be a range, one that is narrow for the near future and wide for the distant future.

Incidentally, I think even agile burndown charts get this wrong. In my opinion, there is (and should be) variability to a team's velocity. Simply taking some velocity value and running it out a few months to predict a single point in time when a task will be complete is at odds with reality.

What does Continuous Planning look like? Well I don't really know, because I just made it up! At a high level I would summarize it as: plan using real data, with task completion ranges, over as long a term as you want, in aggregate, on average, continuously. The task completion ranges are the key. You can plan over as long a term as you want, however, the ranges will get wider. If you can reduce the variability in your process---and prove it with your data---then you can narrow the ranges. Planning is done in aggregate and on average, because it is impossible to know and manage every possible factor, so we must abstract away much of the minutiae. Finally, to plan continuously implies some kind of tool to facilitate.

To the extent that I have thought about how this would work out practically, this is what I would do:

Each team would estimate their tasks by each member recording the number of hours he or she thinks it would take him or her to complete the task, given that there are no other distractions. This is a kind of pure estimation that engineers usually make. It would be helpful to discuss the task as a team, and try to elicit different opinions on the complexity of the task, so the estimates will be as complete as possible.

Why not use story points? I have been a fan of story points precisely because they abstract away hours. Hours can vary depending on a person's skill and experience. Hours can get lengthened by interruptions and discovering additional complexity. Hours give a false impression that they would map directly to calendar time, and that you can accurately predict when a task will complete.

However, the first thing a person asks is how long does X points take. Usually you have to pick as a standard comparison some "golden story" for a certain number of points. People will usually consciously or unconsciously come up with some rule of thumb like, "an eight point story should take about a sprint to complete." So in the end you are estimating in hours, but they're a convoluted form of hours.

Hours are a natural unit for estimations. The danger in using hours is actually trusting the estimate for a precise completion date. We've already rejected precise completion dates with Continuous Planning, and the rest of the process is designed around (automatically) finding an accurate scale with which to judge these estimates. I would actually advocate that the estimates and velocities be hidden variables, and (other than your own estimate) you only see the completion range for a task. This would hopefully reduce some confused expectations around what it means for an estimate to be denominated in hours.

The estimates from each team member would be combined together into a single estimate for the task. The method for combination could be taking an average. It could involve throwing out extreme values first, or doing some sophisticated statistical analysis.
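One way to combine the estimates, sketched under the assumption that each estimate is a plain number of hours: drop the lowest and highest values and average the rest. The name `trimmed-mean` and the choice to trim exactly one value from each end are illustrative only; a real tool might use something more sophisticated.

```clojure
(defn trimmed-mean
  "Averages the estimates (in hours) after dropping the single lowest and
  single highest value. Falls back to a plain average when there are fewer
  than three estimates, since trimming would leave nothing to average."
  [estimates]
  {:pre [(seq estimates)]}
  (let [sorted (sort estimates)
        kept   (if (< (count sorted) 3)
                 sorted
                 (butlast (rest sorted)))]
    (/ (reduce + kept) (double (count kept)))))

;; The low 4 and the outlier 40 are dropped; 6 and 8 are averaged.
(trimmed-mean [4 6 8 40]) ; => 7.0
```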

Having done these estimates, a task tracking system would keep track of when tasks started and when they completed, or how long they've been started even if not complete. This actual data can be used to calculate a velocity. The velocity could be calculated at several levels. You could calculate the velocity for an individual task, for a particular team member, for the team as a whole. You could even calculate the velocity for a feature epic cutting across several teams.

To calculate a date range for completion, you can take the average plus or minus a standard deviation for a single velocity calculation over time and get an optimistic and pessimistic velocity. Alternatively, you could get an optimistic and pessimistic velocity by taking the minimum and maximum of the most recent velocity calculations at two different levels (task and team, for example). I'm not sure which would work best; it warrants some research.
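The first option (average plus or minus one standard deviation) could be sketched as follows. This assumes a velocity is measured in estimated hours completed per calendar day, which is a unit chosen here for illustration; the function names and the small floor on the pessimistic velocity are likewise hypothetical.

```clojure
(defn mean [xs]
  (/ (reduce + xs) (double (count xs))))

(defn std-dev
  "Population standard deviation of xs."
  [xs]
  (let [m (mean xs)]
    (Math/sqrt (mean (map #(let [d (- % m)] (* d d)) xs)))))

(defn completion-range
  "Returns [optimistic-days pessimistic-days] for `remaining-hours` of
  estimated work, given past velocities in estimated hours per day."
  [remaining-hours velocities]
  (let [m    (mean velocities)
        sd   (std-dev velocities)
        fast (+ m sd)
        ;; Hypothetical floor so a very noisy history never divides by zero.
        slow (max (- m sd) 0.001)]
    [(/ remaining-hours fast) (/ remaining-hours slow)]))

;; 120 estimated hours remaining, five past velocity samples:
;; the range comes out at roughly 22 to 46 days.
(completion-range 120 [3 5 4 6 2])
```

The noisier the velocity history, the larger the standard deviation and the wider the range, which matches the idea that reducing variability in the process narrows the ranges.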

Tasks would be a hierarchical tree. An epic is really just a task with subtasks. The velocities and estimates can flow up the tree for the purposes of calculating estimated completion ranges for epics.
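As a small sketch of estimates flowing up the tree, assume a task is a map with an optional `:hours` estimate and an optional `:subtasks` vector. These keys and the example epic are made up for illustration.

```clojure
(defn total-estimate
  "Sums the estimated hours of a task and all of its subtasks."
  [task]
  (reduce + (:hours task 0) (map total-estimate (:subtasks task))))

(def epic
  {:name "checkout flow"
   :subtasks [{:name "cart API" :hours 16}
              {:name "payment form"
               :subtasks [{:name "validation" :hours 6}
                          {:name "styling" :hours 4}]}]})

(total-estimate epic) ; => 26
```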

If you wanted to get fancy, you could draw dependencies between tasks, and the system could then attempt some kind of topological sort of the tasks, and using a prioritized backlog and team assignments to each task, construct a plan for what could be done in parallel, and---based on velocities calculated from real data---calculate a completion range for each task, epic, and the project as a whole.
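A minimal sketch of the dependency ordering, assuming tasks are keywords and that `deps` maps every task to the set of tasks it depends on (every dependency must itself appear as a key). The function name and data shape are hypothetical. Each pass of the loop collects the tasks whose dependencies are all done, so each pass is also a batch that could run in parallel.

```clojure
(defn topo-sort
  "Orders tasks so that every task appears after the tasks it depends on.
  Returns nil if the dependencies contain a cycle."
  [deps]
  (loop [remaining deps
         ordered   []]
    (if (empty? remaining)
      ordered
      ;; Tasks with no outstanding dependencies are ready to start.
      (let [ready (sort (for [[task ds] remaining
                              :when (empty? ds)]
                          task))]
        (when (seq ready)
          (let [done (set ready)]
            (recur (into {}
                         (for [[task ds] remaining
                               :when (not (contains? done task))]
                           [task (reduce disj ds done)]))
                   (into ordered ready))))))))

(topo-sort {:deploy #{:build :test}
            :test   #{:build}
            :build  #{}
            :docs   #{}})
;; => [:build :docs :test :deploy]
```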

As you can see there are still many questions to be answered. I think this is an idea worth exploring. In my experience, the usual tools fall flat at resolving the tension between the desire to know and the reality of uncertainty.

To effectively attack this tension requires abstracting away much of the minutiae of detailed planning by embracing the variability of the process. The plan is useless. Planning is indispensable. Therefore, plan continuously.


Tuesday, August 30, 2016

"Clojure Polymorphism" Released!

From my new blog Real World Clojure. What am I doing with this new blog? I have no idea, but you can follow along.

~ ~ ~ ~

I have released a short e-book (30 pages) titled "Clojure Polymorphism." You can get 50% off by using this coupon link
What is this book about?
When it comes to Clojure there are many tutorials, websites, and books about how to get started (language syntax, set up a project, configure your IDE, etc.). There are also many tutorials, websites, and books about how language features work (protocols, transducers, core.async). There are precious few tutorials, websites, and books about when and how to use Clojure's features.

This is a comparative architecture class. I assume you are familiar with Clojure and even a bit proficient at it.  I will pick a theme and talk about the tools Clojure provides in that theme.  I will use some example problems, solve them with different tools, and then pick them apart for what is good and what is bad.  There will not be one right answer.  There will be principles that apply in certain contexts.
In this installment, I will pick up the theme of "Polymorphism," looking at the tools of polymorphism that Clojure provides. Then I take a couple of problems and solve them several ways. At the end of it all, we look back at the implementations and extract principles. The end goal is for you to develop an understanding of tradeoffs and a taste for good Clojure design.

I have some ideas for other e-books. Perhaps a concurrency tour of Clojure taking a look at futures, STM, reducers, core.async, etc. Or maybe talk about identity by looking at atom, agent, ref, volatile!, etc. Or maybe look at code quality tools. Or how to organize namespaces. Or adding a new data structure with deftype?

What would you like to see? Contact me. :)

Friday, August 19, 2016

Reducible Streams

Laziness is a great tool, but there are some gotchas. The classic:

(with-open [f (io/reader (io/file some-file))]
  (line-seq f))

line-seq will return a lazy seq of lines read from some-file, but if the lazy seq escapes the dynamic extent of with-open, then you will get an exception:

IOException Stream closed (

With laziness, the callee produces data, but the caller can control when data is produced. However, sometimes the data that is produced has associated resources that must be managed. Leaving the caller in control of when data is produced means the caller must know about and manage the related resources. Using a lazy sequence is like co-routines passing control back and forth between the caller and callee, but it only transfers control for each item; there is no way to run a cleanup routine after the caller has decided to stop consuming the sequence.
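For contrast, the blunt way to avoid the exception is to give up laziness and realize every line before the reader closes. This is only a sketch; `read-lines-eagerly` is a made-up name, and it trades memory for safety because the whole file ends up in a vector.

```clojure
(require '[clojure.java.io :as io])

(defn read-lines-eagerly
  "Reads every line of `file` into a vector before the reader is closed."
  [file]
  (with-open [rdr (io/reader file)]
    ;; vec forces the lazy seq while the reader is still open.
    (vec (line-seq rdr))))
```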

A Tempting Solution

One might immediately think about putting the resource control into the lazy seq:

(defn my-line-seq* [rdr [line & lines]]
  (if line
    (cons line (lazy-seq (my-line-seq* rdr lines)))
    (do (.close rdr)
        nil)))

(defn my-line-seq [some-file]
  (let [rdr (io/reader (io/file some-file))
        lines (line-seq rdr)]
    (my-line-seq* rdr lines)))

This way the caller can consume the sequence how it wants, but the callee remains in control of the resources. The problem with this approach is the caller is not guaranteed to fully consume the sequence, and unless the caller fully consumes the sequence the file reader will never get closed.

An Actual Solution

There is a way to fix this. You can require the caller to pass in a function to consume the generated data, then the callee can manage the resource and execute the function. It might look something like:

(defn process-the-file [some-file some-fn]
  (with-open [f (io/reader (io/file some-file))]
    (doall (some-fn (line-seq f)))))

(process-the-file my-file-name do-the-things)

Once upon a time clojure.java.jdbc used to have a with-query-results macro that would expose a lazy seq of query results, and you had these resource management issues. Then it was changed to use this second approach where you pass in functions.

There is a hitch to this approach. Now the callee has to know more about how the caller's logic works. For instance, in the above code you are assuming that some-fn returns a sequence that you can pass to doall, but what if some-fn reduces the sequence of lines down to a scalar value? Perhaps process-the-file could take two functions seq-fn and item-fn:

(defn process-the-file [some-file item-fn seq-fn]
  (with-open [f (io/reader (io/file some-file))]
    (seq-fn (map item-fn (line-seq f)))))

(process-the-file my-file-name do-a-thing identity)

That's better? I still see two problems:
  1. The caller is back to having to know/worry about resource management, because it could pass a seq-fn that does not fully realize the lazy seq before it escapes the with-open
  2. The logic hooks that process-the-file provides may never be quite right. What about a hook for when the file is open? How about when it is closed?
I could argue that this whole situation is worse, since the caller still has to worry about resource management, and now the callee has this additional burden of trying to predict all of the logic hooks the caller might want.

An additional design consequence is that you are inverting control from what it was in the lazy seq case. Whereas before the caller had control over when the data is consumed, now the callee does. You have to break your logic up into small chunks that can be passed into process-the-file, which can make the code a bit harder to follow, and you must put your sharded logic close to the callsite for process-the-file (i.e. you cannot take a lazy sequence from process-the-file and pass it to another part of your code for processing). There are advantages and disadvantages to this consequence, so it is not necessarily bad, it is just something you have to consider.

Another Solution

We can also solve this by using a different mechanism in Clojure: reduction. Normally you would think of the reduction process as taking a collection and producing a scalar value:

(defn process-the-file [some-file some-fn]
  (with-open [f (io/reader (io/file some-file))]
    (reduce (fn [a v] (conj a (some-fn v))) [] (line-seq f))))

(process-the-file my-file-name do-a-thing)

While this may look very similar to our first attempt, we have some options for improving it. Ideally we'd like to push the resource management into the reduction process and pull the logic out. We can do this by reifying a couple of Clojure interfaces, and by taking advantage of transducers.

If we can wrap a stream in an object that is reducible, then it can manage its own resources. The reduction process puts the collection in control of how it is reduced, so it can clean up resources even in the case of early termination. When we also make use of transducers, we can keep our logic together as a single transformation pipeline, but pass the logic into the reduction process.
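To make the idea concrete, here is a minimal sketch of a reducible wrapper built by reifying `clojure.lang.IReduceInit`. It is not the implementation of any particular library. `reducible-lines` is a hypothetical name, and unlike a wrapper around an already-open stream, this version opens the reader itself each time it is reduced.

```clojure
(require '[clojure.java.io :as io])

(defn reducible-lines
  "Returns a reducible (not a seq) over the lines of `file`. The reader is
  opened when the reduction starts and closed when it ends, whether the
  reduction finishes normally, stops early via `reduced`, or throws."
  [file]
  (reify clojure.lang.IReduceInit
    (reduce [_ f init]
      (with-open [rdr (io/reader file)]
        (loop [acc init]
          (if (reduced? acc)
            @acc ; early termination: unwrap and stop reading
            (if-let [line (.readLine ^java.io.BufferedReader rdr)]
              (recur (f acc line))
              acc)))))))

;; The transducer stops after two lines; with-open still closes the reader.
(into [] (take 2) (reducible-lines "/etc/hosts"))
```

Because the collection drives the loop, the cleanup in `with-open` runs no matter how the reducing function behaves, which is the property a lazy seq cannot offer.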

I have created a library called pjstadig/reducible-stream, which will create this wrapper object around a stream. There are several functions that will fuse an input stream, a decoding process, and resource management into a reducible object. Let's take a look at them:
  • decode-lines! will take an input stream and produce a reducible collection of the lines from that stream.
  • decode-edn! will take an input stream and produce a reducible collection of the objects read from that stream (using clojure.edn/read).
  • decode-clojure! will take an input stream and produce a reducible collection of the objects read from that stream (using clojure.core/read).
  • decode-transit! will take an input stream and produce a reducible collection of the objects read from that stream.
Finally, there is a decode! function that encapsulates the general abstraction, and can be used for some other kind of decoding process. Here is an example of the use of decode-lines!:

(into []
      (comp (filter (comp odd? count))
            (take-while (complement #(string/starts-with? % "1"))))
      (decode-lines! (io/input-stream (io/file "/etc/hosts"))))

This code will parse /etc/hosts into lines keeping only lines with an odd number of characters until it finds a line that starts with the number '1'. Whether the process consumes the entire file or not, the input stream will be closed.

  • This reducible object can be created and passed around to other bits of code until it is ready to be consumed.
  • When the object is consumed either partially or fully the related resources will be cleaned up.
  • Logic can be defined separately and in total (as a transducer), and can be applied to other sources like channels, collections, etc.
  • This object can only be consumed once. If you try to consume it again, you will get an exception because the stream is already closed.
  • If you treat this object like a sequence, it will fully consume the input stream and fully realize the decoded data in memory. In certain use cases this may be an acceptable tradeoff for having the resources automatically managed for you.
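As a small illustration of logic being defined separately from its source, the sketch below names the same transformation as a transducer and applies it to a plain vector instead of a stream. The names odd-length-until-1 and sample-lines are made up for this example, and it uses only clojure.core and clojure.string.

```clojure
(require '[clojure.string :as string])

;; The pipeline is an ordinary value: it knows nothing about files or streams.
(def odd-length-until-1
  (comp (filter (comp odd? count))
        (take-while (complement #(string/starts-with? % "1")))))

;; Hypothetical in-memory stand-in for the lines of a file.
(def sample-lines
  ["abc" "ab" "# x" "127.0.0.1 localhost" "xyz"])

;; Eager: pour the transformed items into a vector.
(into [] odd-length-until-1 sample-lines)
;; => ["abc" "# x"]

;; Same logic, different reduction: count the surviving lines.
(transduce odd-length-until-1 (completing (fn [n _] (inc n))) 0 sample-lines)
;; => 2
```

The same transducer value could be handed to a reducible stream wrapper or to a core.async channel without being rewritten.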


Clojure affords you several different tools for deciding how to construct your logic and manage resources when you are processing collections. Laziness is one tool and it has advantages and disadvantages. Its main disadvantage is around managing resources.

By making use of transducers and the reduction process in a smart way, we can produce an object that can manage its own resources while also allowing collection processing logic to be defined externally. The library pjstadig/reducible-stream provides a way to construct these reducible wrappers with decoding and resource management fused to a stream.


Special hat tip to hiredman. His treatise on reducers is well worth the read. Many moons ago it got me started thinking about these things, and I think with transducers on the scene, the idea of a collection managing its own resources during reduction is even more interesting.

Monday, May 9, 2016

The March of Technology

"Our inventions are wont to be pretty toys, which distract our attention from serious things. They are but improved means to an unimproved end, an end which it was already but too easy to arrive at; as railroads lead to Boston or New York. We are in great haste to construct a magnetic telegraph from Maine to Texas; but Maine and Texas, it may be, have nothing important to communicate. Either is in such a predicament as the man who was earnest to be introduced to a distinguished deaf woman, but when he was presented, and one end of her ear trumpet was put into his hand, had nothing to say. As if the main object were to talk fast and not to talk sensibly. We are eager to tunnel under the Atlantic and bring the Old World some weeks nearer to the New; but perchance the first news that will leak through into the broad, flapping American ear will be that the Princess Adelaide has the whooping cough. After all, the man whose horse trots a mile in a minute does not carry the most important messages; he is not an evangelist, nor does he come round eating locusts and wild honey. I doubt if Flying Childers ever carried a peck of corn to mill."
Thoreau, Henry David. Walden, and on the Duty of Civil Disobedience. Project Gutenberg. Web. 09 May 2016.

Or in the words of a more modern philosopher and poet:

Saturday, March 5, 2016

Making Fake Things

Software is fake. There are bits inside a computer represented by a magnetic or electrical charge or mechanical potential or some such thing. But software is not an electrical charge. Electrical charges can represent ones and zeroes and a series of ones and zeroes like "10111101" can represent the JVM opcode "anewarray" or the fraction one-half ("½") in the ISO-8859-1 character encoding or the number -67 in two's complement. Software is not electrical charges, it is a particular interpretation imposed on electrical charges. An interpretation does not weigh anything. It has no color, taste, temperature, volume, mass, or any physical features. It is fake, but fake things can be useful.
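For the curious, the three readings of "10111101" can be checked at a Clojure REPL. This is only a sketch: the names bits and n are made up here, and the opcode appears as a bare number because 0xbd is the value the JVM specification assigns to anewarray.

```clojure
(def bits "10111101")

;; Read as an unsigned binary number.
(def n (Integer/parseInt bits 2))
n
;; => 189

;; The same value in hexadecimal, which is how JVM opcodes are usually listed.
(format "0x%02x" n)
;; => "0xbd"

;; Read as a signed 8-bit two's complement number.
(unchecked-byte n)
;; => -67

;; Read as a single character in ISO-8859-1.
(String. ^bytes (byte-array [(unchecked-byte n)]) "ISO-8859-1")
;; => "½"
```

Nothing about the eight bits changes between these expressions; only the interpretation does.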

Fake things can represent real things (or other fake things). For example, you can represent a couch with a 3D model in a computer. You can represent cities and towns and roads with fake things. You can also represent fake things with other fake things. JVM opcodes, characters, and numbers are all fake things represented by "10111101", which is fake. Fake things are useful because they can represent real and fake things in a way that can be cheaply manipulated and transported instantly across the world. Fake things also have challenges.

Software is a little unique even among fake things because in making software we are often making something that has never existed before. When someone creates a stove there are hundreds of thousands of other stoves in existence to draw upon. There are wood stoves, electric stoves, and gas stoves. But when someone created a text editor, they created something that had never existed before. Here is how Richard Gabriel describes it:

"But, consider the first people to design and build a text editor. Before that, there was never a text editor. Changes to a manuscript were always made by retyping or retypesetting. How would people want to make textual changes? How would people want to navigate? Searching? - no one ever heard of that before. Systematic changes? Huh? By the way, there were no display terminals, so how do you even look at the manuscript?"

Web applications, virtual currencies, automated theorem provers, and many other software applications had never existed before or were so different in nature from their physical counterparts that they were a unique thing. Making fake things is hard enough, but making things that have never existed before is that much harder. That's not the end of it, though.

Fake things have no real world to help co-design them. Stoves have a real world to help co-design them. There are accessories that are used with stoves that help co-design them. Real things like pots and pans. Stoves have to fit through doorways, nestle between kitchen cabinets, and match the colors on the walls. Text editors have accessories like keyboards and mice that were invented to give real people made of meat a way of manipulating a conceptual world by proxy. Perhaps a mouse has to be compatible with a human hand, but a text editor has to be compatible with the mental model of a text editor that exists in a human mind, a model which no one had ever thought of before. Ultimately making software is a process of collaborating with other humans to dream up some mental model, and then making a fake thing out of software that other humans can use to manipulate that model (assuming they properly understand the mental model).

Which reminds me, collaboration is also a fake thing. Collaboration is about using real things, like vibrating air, to push around fake things, like words. It is about using real things, like markers and whiteboards, to manipulate fake things, like ideas. All of these real things can be replaced by fake things, like video conferencing software and text editors. And fake things like words and ideas can be replaced by other fake things, and all of these fake things can be instantly transported, copied, and manipulated by real people in real (and very distant) places. Collaboration is not a real thing, it is a fake thing produced through the interaction of real people thinking creatively.

And making software is a creative act. Writing software is writing instructions to make a computer do something. You must choose the instructions, determine their order, name things. You develop your own style. Writing software is writing words that have effect. Writing software is as close as you can get to God with words speaking reality into existence, the ultimate creative act. But writing software is not just for telling computers what to do. It is also collaboration with other humans. They must read, understand, modify, and extend what you write. They must understand your vision. You must collaborate with them through your source code.

So, here we are. We have discovered that software is a fake thing, that it is often an entirely new thing, that it is a pure product of the mind, that it is born of collaboration, and it is creative expression. Now what? We must systematically question the constraints we place on ourselves, because those constraints are often meant for real things and our things are fake. Here are a few examples:

A top-down management hierarchy is for making real things, not fake things. Top-down, command-and-control hierarchies are about control and efficiency. Control and efficiency are important for real things, because real things have locality, cost, and scarcity. Software has none of these things. Control and efficiency are important when you are manufacturing the same thing over and over. Software is often exploratory. Software is valuable not because we repetitively make lots of little copies of the same thing, but because we dream up some new way of doing things that has never been done before. Control and efficiency are important when you have a predictable process. A creative process is not predictable. You may think for hours about a problem, sleep on it, and then have the answer pop into your head the instant you wake up. We need to think differently, not just about what we make, but how we make it.

Offices are about locality. An office puts materials, means of production, and managers in the same physical location. Yet with software there is no material and the means of production are mental. There is no reason to be concerned about locality. Ostensibly having a bunch of people in the same office enables them to collaborate, but collaboration is a fake thing. Collaboration does not exist in San Francisco or Saint Louis. It does not weigh 1kg. It is not blue. Having an office for collaboration is a rationalization that projects the past onto the future. Is collaboration different using video conferencing and Google Docs than it is using tables and chairs in an office? Yes, because fake things are different than real things. I do not recommend mixing fake things like video conferencing with real things like offices. It may take getting used to, but embracing the fakeness of collaboration has advantages like hiring people where they want to live instead of trying to convince them to live where you live. It also means having permanent, searchable, modifiable artifacts that can be shared instantly across the world, instead of a whiteboard in a room.

Software can process data, but software is also data. This creates leverage. You can flip a bit, and that bit can flip ten others, and those ten another one hundred, etc. Compilers, build tools, continuous integration, and automated tests are all software doing things to software. "The cloud" has created a lot of leverage because it took something that was real (a machine) and made it fake (a "cloud instance"), and once it is fake it can be manipulated by software. The higher you can climb the mountain of abstraction the more powerful you will become. Before selling to Facebook, WhatsApp had ~450 million active users and ~55 employees. Yahoo has ~12,500 employees. I don't know how many active users they have, but let's just pretend it is ~450 million. Don't be Yahoo.

These are just examples, and you can agree or disagree. My point is, we as an industry can achieve market success and realize our visions much more powerfully, but we must understand the nature of the software we are creating (it is fake), and the newness of what we are doing every day, and its collaborative nature, and the tools that we can take advantage of, and we must have the courage to give up on arbitrary constraints that are optimized for making real things. We must pursue leverage, because leverage will enable us to do amazing things.