Languages As Tools

2016-07-19

More programming language-as-tool ruminations; this time swiped and improved from a Twitter rant I went on in June.

I'm experimenting with OCaml at home these days. So far, I like it. It would be grand if OCaml integrated well enough with SRE-style tools that I could move from Python/Ruby to OCaml (or similar). Rust is too low-level, Haskell too lazy and too pure. Seriously, everything I do is IO(). Manually carving up the effect space with monads is a pain in the butt. Why do I want to sit on the pain?

Broadly, the computer should do reasoning for you, not you for the computer. I don't want to be writing unit tests to test things that the compiler can figure out; I don't want to be rewriting my software into monadic forms. This seems so obvious that it should permeate how we develop software. I shouldn't be contorting the problem to fit into Haskell-space, Rust-space, or Python-space.

I shouldn't have to spend my time fighting stupid issues that can be solved trivially.

Here's a real world example.

I had a Python build program for an embedded system where production builds got cooked a special way. Encryption signatures with HSM-based production keys or somesuch; there was specific if release: code. The point is, release was deep magic. So, it didn't get tested. It got reviewed, heavily. As one would expect, on 1.0, the build broke. It had the following error.

"referenced variable not bound" 

Thanks, Python. Glad you're on my side. This is industrial-grade stupidity. This was solved years ago in C. Perl figured this one out: use strict; use warnings; in Perl 5.8... or earlier. The moral of this story: Consider better tools. They matter.
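To make the failure mode concrete, here is a minimal sketch of the same kind of bug. It is not the original build script: build, sign_with_hsm, and the firmware bytes are hypothetical names made up for illustration, and I'm assuming the breakage was a name that was only ever referenced on the release path.

```python
def build(image: bytes, release: bool) -> bytes:
    """Hypothetical build step; sign_with_hsm is intentionally never defined."""
    if release:
        # Only reached for production builds, so routine runs never execute it.
        return sign_with_hsm(image)
    return image


# The development path works, so nothing looks wrong day to day.
dev_output = build(b"firmware", release=False)

# The release path fails only when it finally runs.
release_error = None
try:
    build(b"firmware", release=True)
except NameError as err:
    release_error = str(err)

print("dev build ok:", dev_output == b"firmware")
print("release build error:", release_error)
```

Python accepts the function definition without complaint; the missing name is discovered only when the release branch actually executes, which in this story was on 1.0.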

Let's talk about best practices, because right now the usual response coming out is "use the best tool for the job".

Assuming "best practices" (really: the popular practices) are actually best is deadly dangerous. Consider the long list of abandoned ideas: Leeches. Flat earth. Fan death. Vaccines causing autism. Lead (Pb) is a good spice. Transmutation into gold. I'm not touching the toxic social beliefs: that will draw down some troll I don't need. Many of these ideas were or are considered best practice ("the right idea") and were/are incredibly popular. Consider this next time someone argues for popularity / best practice for a software tool. Ad populum is a popular argument. Argue for the correct solution. Don't argue for the popular one, or the flashy one, or the new one, etc. Argue from math.

Now, the best tool for the job. Cold reality is, we can pretty much shape any language system we need into the semantic we need it to be. In-language shipped libraries only matter a tiny bit, when you're starting the project. Some care and feeding by a competent set of programmers, an API published, and C's string handling is essentially fixed, so long as the reviewers ensure the local API is used. Same for OO. C can have OO. GDB has three OO systems, last I checked. Ugly as all get out though. Moving on. What defines a programming language beyond libraries? Semantics. Here's where things get sticky.

Does your language support compiling? Compile-time type checks? Are your objects checkable types? Does your language support functions? closures? continuations? Does it do pass-by-reference, pass-by-value, or some combination thereof? What about abstraction facilities, both syntactic and semantic? Etc. I'm going to abstract the above and draw a rectangle: you have designed systems on the one hand, and organic systems on the other hand. Languages like C99, C++, Perl, and Python are organic. They tend not to have trained programming language designers and the associated mathematical principles behind them, and they have grown in ad hoc ways, causing unforeseen intersections of features. C and Perl are particularly notable here. On the far other side of designed systems, we find the Scheme, ML, and programming language research crowd busily working away designing new systems. The preeminent modern language here is Rust. These systems are, ideally, predictable and easy to reason about; features don't intersect in strange ways; they are mathematically based, and words with suffixes of "-morphism" are used with meaning and power.

The rectangle has another, limited axis besides designed and organic; I call it assisted correctness. How much does the system itself assist in ensuring your code does what it should?

Python and Assembly are grouped near each other; further up the line is Perl, then Common Lisp (which is always an odd duck), then Go, then C, then C++, then Rust, then the ML children, and we come to Coq and Idris. You might rearrange it a bit, but I think you get the gist.

But I didn't actually answer the question: What's the right tool for the job? Well, what's the job? You have 3 ways to deliver work: Fast, Cheap, or Good. Typically in software development, you don't have the option of Fast and Good, or Cheap and Good. So we find out you get it Fast and Cheap, or you get it Good and Expensive.

If you're in the business of deploying cat videos, you can probably opt for cheap. Congratulations, you have a ton of technical debt and a flaky site, but hey, cat videos. Who cares.

If you're in the business of MRI scanning machines, you probably need to flip all the bits on good.

However, this only gets us to release 1.

Let's talk about iteration two now, or better yet, iteration four. (Side moral: software development is an iterated game, not a one-and-done). It's release 4 time, and all that stuff your predecessors (now promoted and working elsewhere for buckets of money) crapped into the codebase is sitting there, and it's stinking, bad. At this point, the options for new features or bug fixes are these:

  • Slow and crappy.

  • Slower and less crappy.

Because the cold truth is that your codebase has become so ad hoc, so bad, that maintaining it is a profoundly significant challenge. The business reality here might be that your company didn't have time to do the job right before, because your predecessors just needed to get to 1.0.

But now the system is stalling. And you have a big problem on your hands. Tools help you avoid this. Let's rewind to day 0, when all was sweetness and light...

Let's look at the tool selection. The Right Tool to start with will be the one which can automatically grub through your codebase and provide correct automated assistance in keeping your code working. Remember: down the road, this thing will be sprawling, you won't remember a thing, and you've got to fix it. That choice is going to scale the best, because it doesn't require tribal lore, effort on anyone's part, or extra hours in the budget (might wind up costing some "cool" points though, sorry). Then, you should be looking at the designed set of systems as a secondary attribute: systems which will not have unexpected feature interactions. Unexpected feature interactions are often manifested as defects.
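As a rough illustration of what "automatically grub through your codebase" can mean, here is a deliberately crude sketch built on Python's standard ast module. The function unbound_names and the SOURCE snippet are hypothetical, invented for this example. It ignores scoping and control flow, so it only catches names that are bound nowhere at all; real linters and type checkers do far more.

```python
import ast
import builtins


def unbound_names(source: str) -> set:
    """Return names that are read somewhere in `source` but bound nowhere.

    Crude on purpose: no scoping, no control flow, no star-imports.
    """
    tree = ast.parse(source)
    loaded = set()
    bound = set(dir(builtins))
    for node in ast.walk(tree):
        if isinstance(node, ast.Name):
            # Load context means the name is read; anything else binds it.
            if isinstance(node.ctx, ast.Load):
                loaded.add(node.id)
            else:
                bound.add(node.id)
        elif isinstance(node, ast.arg):
            bound.add(node.arg)
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            bound.add(node.name)
        elif isinstance(node, ast.alias):
            # "import a.b" binds "a"; "import a as c" binds "c".
            bound.add((node.asname or node.name).split(".")[0])
    return loaded - bound


# Hypothetical build step whose release branch references a missing function.
SOURCE = '''
def build(image, release):
    if release:
        return sign_with_hsm(image)
    return image
'''

suspects = unbound_names(SOURCE)
print(sorted(suspects))
```

The point of the sketch is that the check never runs the release branch. It reads the code, which is exactly the kind of assistance that keeps scaling when nobody remembers how the thing works.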

Other features which can help: abstractive capability (generics, macros, closures, continuations), well thought out dependency management, good coupling/cohesion approaches. But all of those features break down in practice to some degree, and cannot be relied on to protect your program from the thixotropic cruft.

You can draw your own conclusions as to where you should take your own project, but for me, the conclusion is clear: something like a static type system is optimal for doing serious software engineering.