Sunday, August 19, 2018

Machine Learning and Software Development

I am extremely disappointed with the state of Software Development as practiced today.

For many years I was bedazzled by the wondrous new features of the latest Programming Languages. Amazing breakthroughs. Context-Free Grammars. Structured Programming. Object Oriented Programming. Event Driven Architectures. RESTful Interfaces. Look how wonderful it all is!

After years of chasing the latest and greatest, something ominous began to dawn on me. These "developments" and "improvements" are not happening fast enough. The software being created is NOT better or more reliable or easier to understand or easier to develop. The frenetic pace of "new frameworks" and "new tools" and "new paradigms" and "new buzzwords" obscures the fact that we are spinning our wheels throwing up "More Stuff" without recognizing that it is just a rehash of the same old problems. The training becomes more and more specialized and it becomes harder and harder to be sure you fully understand all the features you are expected to make use of.

The users of the Software are led to believe that much progress is being made because we can create glitzier User Interfaces, or because we have reached a (sort of) consensus on how programs should behave, or because they can access exponentially more data. But Users are generally not in a position to evaluate the internal quality of Software, or to understand the costs of managing and developing that Software.

In reality, programming today is fundamentally the same as it was when we used Hollerith Cards and submitted Batch Jobs. A program is a string of characters stored in a file. A language processor reads and interprets these characters according to a set of rules. Many such files (modules) are combined to create the program that will later be executed.

We have added many layers to "simplify" this process. Generating the sheer volume of Software required to make the modern world work has required some computer assistance. We created Editors and File Systems and Integrated Development Environments. We created Optimizing Compilers and Compiler Optimizers. We created Collaboration Tools. We created Interpretive Languages and Language Interpreters and Just-In-Time Compilers.

I have spent much of my career designing and developing tools to make Software development faster, less error-prone, less obscure and more effective. I have kept my head down and drunk the Kool-Aid.

However -

The universe awaits. We will soon need to create reliable programs to control the tools and equipment that we bring with us as we leave the Earth. Nothing about the current Software design and development methodologies is sustainable or applicable for use in space or on other planets.

It is 2018 and I will venture to say that no program has EVER been written in space. The tools are too clumsy. The level of specialized knowledge and training is too great. The risks are enormous. The only people who truly understand the systems are back on Earth.

Currently, any new Software or updates to existing Software used anywhere in the space program must be developed and tested on Earth and transmitted to the target system. This might be OK when the target is a few minutes away (at most). Danger flags begin to appear when the spacecraft are farther away. When you almost lose New Horizons on approach to Pluto because the people on Earth do not understand the operation of the 1970s-era File System used by the probe designers in 2005, you get some idea of the impending collapse.

As we move out into the solar system we will be at the mercy of systems and Software that become progressively more obsolete. Losing a probe to human misunderstanding is expensive and embarrassing, but tolerable. Losing a colony ship to something like this is completely unacceptable.

Ships in transit (to Mars, for example) must have software systems that can be adapted to any situations that may develop over the course of several months. It is not possible for the designers to anticipate all possible contingencies - and there are people right there on the scene. It is therefore incumbent on us to make sure that those people are able to safely change or update the Software to deal with the new situation.

After arriving at Mars, a bunch of critical equipment will be responsible for the lives of everyone in the colony. This equipment will become progressively obsolete and subject to failure. The only people capable of creating maintenance or upgrade patches (or fixing latent flaws) are back on Earth. There will be no incentive for those experts to remain current or to train a new generation of experts. The only equipment using this software is "out of sight and out of mind".

Software development must be adapted to no longer require humans to be experts. There are currently no efforts being made in this direction. It seems to be a case of all the Software Developers continuing to Drink the Kool-Aid.  The software development methodologies are so ingrained that no one seems to recognize the shortcomings.

---

What is needed is Machine Learning applied to Software Development.

When I work with a software development team I expect to be able to discuss program requirements verbally. I can tell a programmer that a "button should be blue when you hover over it", or that "displayed records should be alphabetized" or that "the banner should be smaller". I can then stand back and watch while he makes the changes.

At no time do I touch a keyboard. The subject-matter expert (the programmer) knows what I mean to have happen and does it. Maybe it takes changing five different files. Maybe it takes creating a bunch of new functions. Maybe it takes running a bunch of validation tests. Maybe something goes wrong. All those things get fixed.

The expert programmer knows all the details. He knows the syntax for the 15 cryptic frameworks. He understands the database architecture. He remembers the names of the API calls, and the ones that are deprecated due to bugs. All I had to do was casually mention what I wanted - the expert did the rest.

Unfortunately, most of what I do as a programmer is very similar to what I do when driving a car: just get from point A to point B without bumping into anything. There might be dozens of ways of accomplishing the task. As a Senior Developer, I might choose a "better" way than others. But I should not have to. My assistant should be fully capable.

We should be striving toward the day when the "Subject Matter Expert" is actually a machine intelligence. Using Machine Learning techniques we should expect that the knowledge and understanding that is currently a perishable commodity should be available forever.

All programming is a trial and error process. Neophyte programmers do lots of trials and learn from their many errors. Senior programmers make fewer trials and create much more obscure errors. This process of trial and error is exactly what would be expected to form the training cases for a Machine Intelligence.

In all of Software Development, the biggest mistake we are currently making is throwing away those valuable training cases. Knowing about the programs that do not work AND WHY THEY FAIL is ultimately more valuable than the final product: the one that usually works.
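To make the idea of keeping the failures concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the `Attempt` record, the `run_attempt` helper and the JSON-lines log format are hypothetical, not an existing tool. It tries three candidate versions of a tiny function and keeps every trial, including the reason each failing one failed, instead of keeping only the version that works.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Attempt:
    """One trial in the trial-and-error loop (hypothetical record layout)."""
    source: str   # the code that was tried
    passed: bool  # whether it satisfied the check
    reason: str   # why it failed; empty string on success


def run_attempt(source: str, func_name: str, args: tuple, expected) -> Attempt:
    """Run one candidate program and record the outcome, including failures."""
    namespace = {}
    try:
        exec(source, namespace)                 # define the candidate function
        result = namespace[func_name](*args)    # call it with the test inputs
    except Exception as exc:
        # The program crashed: keep the error type and message as the "why".
        return Attempt(source, False, f"{type(exc).__name__}: {exc}")
    if result != expected:
        # The program ran but gave the wrong answer.
        return Attempt(source, False, f"expected {expected!r}, got {result!r}")
    return Attempt(source, True, "")


# Three trials at the same task: two errors, then the one that works.
attempts = [
    run_attempt("def add(a, b): return a - b", "add", (2, 3), 5),
    run_attempt("def add(a, b): return a + c", "add", (2, 3), 5),
    run_attempt("def add(a, b): return a + b", "add", (2, 3), 5),
]

# One JSON object per line: the failures are kept right next to the success.
training_log = "\n".join(json.dumps(asdict(a)) for a in attempts)
print(training_log)
```

The working version is one line out of three. The other two lines are the part we normally throw away.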

The obsolescence that will ultimately plague any human construct need not be potentially fatal to those future generations. Ensuring the deployment of fully capable experts on each of these colony software systems will make for a safer universe for everyone.

---

In this post I couch my concerns in terms of a future manned space mission or space colony. These environments simply would not have enough personnel to allow for specialist programmers or software developers, plus their support staff, plus training and education programs.

Real-world uses for such technology are much closer to home.

The premise of this essay is that the current software development tools are inadequate for the task - and that they will reach an unsustainable point in the near future.

As a senior designer I am expected to be able to implement an expedient solution to problems I am assigned. This means that I must arrange the available resources to provide an acceptable result. Often this means that I have a staff with a certain skill set and my job becomes more difficult. I must decide whether to invest time and money in training or hiring a particular skill set, or using existing skills in a creative but sub-optimal manner.

If your entire staff is certified for Microsoft SQL Server then (amazingly) every problem that comes through the door (magically) seems to need a SQL database.

My life would be much simpler if I had access to skilled assistants that could perform the rote tasks using a particular set of tools. I could reasonably ask for multiple proposed solutions to a given problem and compare the results. I would have access to solutions that I might not have thought of. I would discover failure modes or options that I had not considered.

The benefits of Machine Learning in these common situations are immediate and will become more pronounced as it becomes ever more difficult for human beings to keep up with new requirements. The use of Programming Assistants to aid in Software Development efforts will be tremendously helpful.

Perhaps even more important will be the ability of a Programming Assistant to explain WHY a particular feature exists or HOW it works. Modern programs may have single lines of code that contain elements from a half-dozen completely different programming languages. The ability to ask a simple question such as "Why is that semicolon there?" (and get a quick and meaningful answer) would be wonderful.

The explanatory abilities of a Programming Assistant, including an understanding of the implementation history and goals of a piece of software, would be a valuable supplement to whatever documentation exists for the program. A Programming Assistant would be capable of retaining the skills over time, and learning to recognize requirements and deficiencies. Skills and understanding would no longer be perishable commodities. Last year's programs would no longer be dangerous to use because the knowledge of features and limitations would remain fresh.

I mentioned that it might become reasonable to entertain competing proposals for implementing complex tasks using different combinations of skills. A properly trained Programming Assistant should be able to perform many of these comparisons and tradeoffs automatically, and should be able to produce an objective report on the relative merits of different approaches.

The ability of a Programming Assistant to retain an understanding of past mistakes would allow it to anticipate failures and suggest resolutions. This is contrasted with the current "wait till it breaks then scramble to fix it" approach. For example, "everybody" knows that 10,000 tiny files in a Linux file system directory is a potential problem. Yet that revelation put New Horizons into safe mode days before reaching Pluto. Fortunately, the "scramble to fix it" had enough time to recover before losing the mission.
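As a small illustration of anticipating a known failure instead of waiting for it, the Python sketch below walks a directory tree and reports any directory holding more files than a threshold. The function name `crowded_directories` is hypothetical, and the default limit of 10,000 is simply the figure quoted above, not a hard limit of any particular file system. It says nothing about how the New Horizons team actually diagnosed or fixed their problem.

```python
import os
import tempfile


def crowded_directories(root: str, limit: int = 10_000) -> list:
    """Return (path, file_count) for every directory under root with more than limit files."""
    crowded = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if len(filenames) > limit:
            crowded.append((dirpath, len(filenames)))
    return crowded


# Small demonstration with a deliberately low limit so it runs instantly.
with tempfile.TemporaryDirectory() as root:
    for i in range(5):
        # Create five empty files in one directory.
        open(os.path.join(root, f"f{i}.dat"), "w").close()
    demo = crowded_directories(root, limit=3)
    demo_counts = [count for _path, count in demo]

print(demo_counts)
```

A check like this is trivial to write. The hard part is remembering that it needs to be run, which is exactly the kind of knowledge a Programming Assistant would retain.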