2016
For my first ever essay/article here at CodeFox, I’ve decided to write about abstraction. If you have any substantive experience in programming, the technical examples will be elementary to you, but I find the theoretical discussion of abstraction to be highly interesting. Hopefully you will too!
In English the word abstraction means “the quality of dealing with ideas instead of actual events.” It is derived from the Latin verb abstrahere, which means “to draw away.” In computer science abstraction is the technique we use to separate behavior from implementation, and its practical result should always be a relative reduction in complexity.
Using this technique, the binary addition of 00000001 with 00000001, which can be done lickety-split by any computer, becomes 1 + 1. The behavior is the same – in this case, integer addition – and the result of both operations is equivalent to 2, but the abstracted version can be done lickety-split by most humans, not so much by computers. Thus, we have separated the behavior – again, integer addition – from its confusing implementation in the binary guts of a computer.
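As a small sketch of the same idea in JavaScript (the variable names are made up for illustration), binary and decimal literals describe the same integers, so the behavior is identical even though the notation differs:

```javascript
// Binary literals (0b prefix) and decimal literals denote the same integers.
const binarySum = 0b00000001 + 0b00000001;
const decimalSum = 1 + 1;

console.log(binarySum === decimalSum); // true

// toString(2) gives the binary digits; padStart pads them to eight places.
console.log(binarySum.toString(2).padStart(8, "0")); // 00000010
```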
And as the old saying goes, it’s turtles all the way down. Everything we do on our many beloved devices is a long cascade of abstractions leading from the click of a mouse or the touch of a screen down to an intricate stew of binary guts. Computer scientists have been abstracting computational behavior for decades. Take the following bit of code written in assembly language:
mov eax, 00000001b
add eax, 00000001b
This performs our integer addition by storing the binary equivalent of 1 in a register (a small storage slot inside the CPU) called eax, then adding the binary equivalent of 1 to it, leaving the result (the binary equivalent of 2) in eax. If that sounds overly complicated, it is, because it’s very close to the guts of the computer. It’s what we call “low-level stuff,” meaning it isn’t very abstract. But that’s okay, because our computer scientist friends didn’t stop at assembly language. This kind of instruction was abstracted into something easier to understand like the following C++ code:
int answer = 1 + 1;
std::cout << answer;
Ahh, isn’t that nice … now we’re able to create a named variable where we can store the result of our integer addition, which we can now simply write as 1 + 1, and we can then even display said result on a monitor for all to see. Pretty cool, right?
We can also modularize this operation for reuse by turning it into what we call a function. Here’s what that might look like in JavaScript:
function add(a, b) {
  return a + b;
}
alert(add(1, 1));
This performs our integer addition and displays a popup window in a web browser with the result, in this case 2. What’s more, we can reuse this function as many times as we like, and if we want to change the behavior we only have to change it once, where we defined the function.
The JavaScript example is more abstract than our earlier C++ code. C++ is a statically typed language, hence we have to tell the computer that our variable called answer is an integer (int), otherwise we’ll get a compile error before we even get out of the gate. JavaScript is dynamically typed (and weakly typed as well): it can adapt datatypes on the fly. Furthermore, it is traditionally an interpreted language, as opposed to C++, which is a compiled language. This means C++ programs get compiled into machine code before they can run. They essentially get “de-abstracted” in the compilation process, whereas JavaScript programs get “interpreted,” most commonly by a web browser.
We’ll talk about the web browser as an abstraction itself a bit later, but this difference between compiled vs. interpreted, and statically typed vs. dynamically typed languages deserves a bit more attention. Each approach has advantages and disadvantages directly related to its level of abstraction.
Compiled languages allow the programmer to have much more control over the program and its resources, but are also much less forgiving at run-time. This is probably the most fundamental trade-off to consider when deciding what language or framework to use for a given project. There are many other concerns, from security and scalability to performance and maintainability, but direct control over program behavior and resources vs. flexibility and speed of implementation is paramount.
This point is directly related to our discussion of abstraction. Generally speaking, the more abstracted a programming language becomes, the more flexible and forgiving it is at run-time, and the quicker it is to implement. On the flip side, these higher-level (more abstracted) solutions generally don’t perform as well from the CPU’s standpoint, and they are more prone to run-time errors. Because of the rigors of compilation, many edge cases that would break a program are ferreted out early in the development process when using lower-level languages.
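To make that trade-off concrete, here is a minimal sketch built on the add function from earlier. The stricter variant, addNumbers, is a hypothetical helper introduced only for illustration. It does by hand, at run-time, roughly the kind of check a C++ compiler would perform before the program ever ran:

```javascript
// The forgiving version from earlier: it accepts anything.
function add(a, b) {
  return a + b;
}

// Hypothetical stricter variant that rejects non-numbers at run-time.
function addNumbers(a, b) {
  if (typeof a !== "number" || typeof b !== "number") {
    throw new TypeError("addNumbers expects two numbers");
  }
  return a + b;
}

console.log(add(1, 1));   // 2
console.log(add("1", 1)); // 11 (a string: + concatenates when either side is a string)
```

The second call does not crash, which is the flexibility, but it quietly returns the string "11" instead of the number 2, which is exactly the sort of edge case a stricter language would flag up front.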
This all illustrates how abstraction differentiates one language from another, but individual computer languages also offer layers of abstraction within themselves. Libraries can be added to a program that extend the functionality of the language in which you’re working. These libraries offer classes and methods that make a programmer’s job easier – again, reducing complexity by separating behavior from implementation – but they come with the same kinds of trade-offs we just discussed.
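A small sketch of that idea, using only JavaScript’s built-in Array methods rather than any particular third-party library: both functions below have the same behavior (summing a list of numbers), but the second hands the looping over to the library. The function names are illustrative.

```javascript
// Implementation spelled out by hand: index, bounds, and accumulator are ours to manage.
function sumWithLoop(numbers) {
  let total = 0;
  for (let i = 0; i < numbers.length; i++) {
    total += numbers[i];
  }
  return total;
}

// Same behavior, but Array.prototype.reduce hides the iteration details.
function sumWithReduce(numbers) {
  return numbers.reduce((total, n) => total + n, 0);
}

console.log(sumWithLoop([1, 2, 3]));   // 6
console.log(sumWithReduce([1, 2, 3])); // 6
```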
Modern web design and development is full of such abstractions. As I mentioned earlier, the modern web browser is a powerful layer of abstraction that hides so much complexity from its users. For decades web browsers have allowed developers to separate content (HTML) from presentation (CSS) and behavior (mostly JavaScript). The web browser is now also a powerful debugging and testing tool for developers.
Within the general technologies of HTML, CSS, and JavaScript, the modern developer has a dizzying array of abstractions available. JavaScript has been abstracted many times over with the addition of intuitive DOM (Document Object Model) manipulation through jQuery and countless other libraries and APIs designed to do everything from animating DOM elements to visualizing complex datasets. More recently, tools called pre-processors have entered the scene to reduce the complexity of writing HTML and CSS. Take the following comparison of HTML to its equivalent Jade:
<!DOCTYPE html>
<html lang="en">
  <head>
    <title>Jade</title>
    <script type="text/javascript">
      //<![CDATA[
      if (foo) {
        bar()
      }
      //]]>
    </script>
  </head>
  <body>
    <h1>Jade - node template engine</h1>
    <div id="container">
      <p>You are amazing</p>
    </div>
  </body>
</html>
!!! 5
html(lang="en")
  head
    title= pageTitle
    :javascript
      | if (foo) {
      |    bar()
      | }
  body
    h1 Jade - node template engine
    #container
      p You are amazing
The Jade code is clearly easier to write if you know what you’re doing, but it ultimately gets compiled by the pre-processor into HTML before deployment to a live environment. These kinds of technologies are potentially useful abstractions, but their ultimate value is debatable. Their adoption comes with a learning curve, which is a cost some may not want to bear, despite a potential increase in efficiency. They also present a nontrivial challenge to maintainability. For example, if you develop an application using SASS as your CSS pre-processor, there’s no guarantee the next person to have your job will know what they’re looking at when they start digging through your source code. And who knows if the team actively maintaining SASS today will still be doing so five years hence.
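To give a rough feel for what a pre-processor like Jade does when it compiles down to HTML, here is a toy sketch in JavaScript. It is not how Jade is actually implemented; the render function and the shape of the node objects are assumptions made up for this example. It simply turns a terse description of a page fragment into the HTML a browser expects:

```javascript
// Toy renderer: turns a nested description into an HTML string.
// Hypothetical node shape: { tag, attrs, children }, where each child
// is either a string (text) or another node.
function render(node) {
  if (typeof node === "string") {
    return node; // plain text; a real tool would also escape it
  }
  const attrs = Object.entries(node.attrs || {})
    .map(([name, value]) => ` ${name}="${value}"`)
    .join("");
  const children = (node.children || []).map((child) => render(child)).join("");
  return `<${node.tag}${attrs}>${children}</${node.tag}>`;
}

const page = {
  tag: "div",
  attrs: { id: "container" },
  children: [{ tag: "p", children: ["You are amazing"] }],
};

console.log(render(page));
// <div id="container"><p>You are amazing</p></div>
```

Whoever inherits the project has to understand both the terse input and the tool that expands it, which is the maintainability cost described above.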
In the end, there’s a compelling argument to be made that the increase in developmental efficiency these abstractions offer is not worth the learning curves associated with them, let alone the roadblocks to maintainability that are sure to arise. Many will argue, for instance, that CSS3 works just fine as it is, and its conventions are only getting more robust, so what’s the need for LESS or SASS or Stylus or whatever?
I am generally a fan of abstraction, especially when it makes code more human-readable. I’d much rather look at a Jade file than an HTML file. I like reading and writing Python better than C++, but that’s just me. As we have established, there are pitfalls on both sides of the fence.
Which way does your pleasure tend? Are you more comfortable with stricter, lower-level programming languages, or do you like picking up the shiniest new abstractions you can find? Let us know what you think in the comments, and thanks for reading!