Sunday 31 December 2017

Mandala or The end of control

A good friend of mine asked me why I don’t blog anymore, so I took my New Year's flight as an opportunity to write some random thoughts down. Happy New Year!

We used to build Information Systems or Control Systems. Sometimes they merged clumsily - but they finally became something entirely new: Intelligent Intent Systems. I don’t like catchy slang, though, so let’s just say we finally have universal “Systems”.

A sufficiently lean online shop is essentially an easy interface for sending a signal into an extremely complex, often entirely automated, logistics chain that translates information into physical control commands - the Information System part manages the feedback loop to humans. The more feedback, an idea from Control Systems, was incorporated into Information Systems, the smarter, faster, and more intuitively we were able to interact. Like frames in a movie, we’re now at the point where it becomes seamless and continuous. The IoT and spatial computing, or really just technology becoming synonymous with information, close the loop back into our world (the "real" world). We used to interpret our systems as essentially closed and deterministic, in both imperative and functional programming styles. But the new types of systems have rapidly become Probabilistic Systems.


With the planet-scale cloud of distributed services, risk-driven models from complexity and uncertainty theory have taken center stage. With Machine Learning rapidly surpassing human experience in programming and research, we ourselves will have to model the real, human world in a descriptive, probabilistic way as part of those systems: observing and inferring, rather than imperatively defining the message flows between agents and their consistency properties, the data flows between processors, or structural limitations (think column- vs row-oriented data).
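
To make "observing and inferring" a bit more tangible, here is a minimal sketch of a descriptive, probabilistic model. It is only an illustration, not something taken from a real system: the `BetaBelief` class and the delivery observations are hypothetical, and the uniform Beta(1, 1) prior is an assumption. Instead of declaring that a message flow "is reliable", we keep a belief about its reliability and update it with every observation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BetaBelief:
    """Hypothetical belief about how often a message flow succeeds.

    Modelled as a Beta distribution; alpha and beta start at 1.0,
    i.e. a uniform prior (an assumption made for this sketch).
    """

    alpha: float = 1.0
    beta: float = 1.0

    def observe(self, delivered: bool) -> "BetaBelief":
        # Each observation shifts the belief; nothing is defined up front.
        if delivered:
            return BetaBelief(self.alpha + 1, self.beta)
        return BetaBelief(self.alpha, self.beta + 1)

    @property
    def mean(self) -> float:
        # Expected success probability under the current belief.
        return self.alpha / (self.alpha + self.beta)


# Made-up observations: three deliveries, one loss.
belief = BetaBelief()
for delivered in [True, True, False, True]:
    belief = belief.observe(delivered)

print(round(belief.mean, 2))  # 0.67
```

The point of the design is that the system boundary is a belief that keeps moving with what we observe, rather than a requirement that is either met or violated.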

We don’t observe from outside the system, but as a part of it, much as quantum physics, psychology and sociology (especially of power) told us. We humans are only agents receiving signals in this system. We inhabit a second-order Cybernetics Technosphere we call reality, built on platforms that define what they want and value economically in entirely new, sometimes alien ways, like real-time exchanges including spatial and social dimensions - with the gig and experience economy being only the tip of the iceberg. It’s not a matrix or a mastermind, though, it’s just a real, techno-human ecosystem with its own, uncontrollable evolutionary goals.

Although I don’t like the gig economy, I like the idea of evolutionary systems. Most writing focuses on the iterative process to avoid unpredictability, though, assuming some external, given linearity to incorporate feedback. That’s important for organizations, but what is not mentioned is where the feedback comes from: the goal, the adversary, the predator, or the local maximum.


I mainly work on the integration between the real world and software, the hard end of mobile / ubiquitous / spatial / pervasive computing: building independent, distributed service meshes in a DevOps and Design Thinking (or DesignOps, whatever the cool kids call it these days) way. In such systems, you’re always going after the local maximum; the goal is unclear and, more often than not, there are multiple conflicting ones. You’re naturally dealing with (domain) verticals rather than horizontals. Every day brings a new trade-off between the best possible experience and the technical realities. Those systems are not linear evolutions, they are mandalas of expanding and contracting system boundaries. They have to be observable, though, as with observation comes empathy, and with empathy learning.

In the future, we may refactor the parameters of our systems based on deeper insights about their non-deterministic behaviour. That's what I like about SRE-style work. It's the non-deterministic, the probabilistic part of software engineering. It focuses on observability and serviceability, and ML RCAs involve explainability - correctness becomes an optimization goal, not an axiom. When I first spoke publicly about Spanner in 2012, I compared it to the twisted experience of time in movies like Spaceballs and The Hitchhiker's Guide to the Galaxy. The powerful takeaway is: Nothing is fixed; if we can reason about it, we can change it.

To understand the magnitude of change to our profession, we have to understand the societal context. Of all the possible futures, a dystopian scenario is interesting here - I'll keep my version short after watching Charlie Stross’ talk from 34c3, which tells it better: In this scenario, the we is not a harmonic, transhuman unity. The new we is us and algorithms from us, for us. A dark (in the sense of dark matter) singularity not of eternal life but of thoughtless mutual uncertainty, where biased algorithms and biased, dumbed down or even corrupt people push each other further into the edges, not becoming market segments but mobs which reinforce themselves. The algorithms are as helpless as their users, because the society and economy around them require the entropy as fuel, making regulation impossible.

Quick, personalized adjustment, unlearning, or one-shot learning can maybe avoid this scenario - it seems AI is already forgetting more easily than us, controlling and optimizing itself faster than us, collaborating and sharing surprising insights more nicely than us. In a "thinking fast, thinking slow" model, maybe the human ought to become the "slow" part - as Nate Silver once put it "complementary roles that computer processing speed and human ingenuity can play in prediction". Instead of turning away, the human needs to be enabled to act.


Right now, we train machine learning systems by saying “yes” - soon we will reach the point where transfer is so good that we’ll start saying “no”. That “no” has to be slow enough, it has to be thoughtful, and it has to weigh more than assumed silent consent. We have to introduce reason, empathy and ethics, not only into individual machine learning models, but into the whole system that is driven by technology, into all of the information and the human organizational complex around it.

What excites me about working with systems from an observability, serviceability, and explainability perspective is that we can bring in all of that rich knowledge from physics, psychology, sociology and hermeneutics, but also from art, and start reasoning about the overall behaviour rather than about deterministic, imperative requirements. We only have to keep talking to each other - and try to understand.

Sunday 19 March 2017

On the other side of Certainty

"We have to create the preconditions under which
Exaptation can happen naturally...
which actually means introducing inefficiency into systems"

"The question is no longer how systems behave ...
but how to ask for the probability distribution
of the properties that change the system"


Moving from the project world into the product world last year, I wanted to experience the real long-term view, because only strategy can tackle complexity. And our software grows more complex, though arguably less complicated, every year. But I also wanted to understand uncertainty, the dark matter of complexity, better. After some time in it, I understand it's probably more important for an architect* to have strong operations knowledge than very strong algorithmic skills.

Resilience is sometimes mistaken for being adaptive, or agile. It just means expecting uncertainty and disruption, but also working properly under normal conditions. Just calling everything disruptive, and reacting in an agile way to every random demand, is not resilient. For instance, "The datacenter as a computer"** came to us, rather unexpectedly, from a relatively "boring" infrastructure level, rather than from new frameworks and startups, but also not from consortium standards and language ecosystems. Similarly, in true grassroots manner, polyglot programming and JavaScript on top of Unix principles catapulted us into the 21st century and will eventually enable domain logic to exist, as Adrian Cockcroft calls it, in functions, unaware of most technological constraints. In order to understand resilience, you need to care about your product and want to improve it, to have a real goal, a story you want to tell. Ironically, you have to be a little bit un-adaptive, inefficient and un-agile in order not to overfit, but to really improve.

We are still waiting for the 4th paradigm of programming. My guess is it's going to be more than just goal-oriented, it will be probabilistic, in the sense of an abstract goal corridor. While engineers will live inside the goal corridor, making sure its workings are predictable, specializing in certainty, architects live on the outside, in the long tail of the probability distribution, the Multiverses where the Dragon Kings live, outside predictive models, specializing in uncertainty. That’s what makes a system resilient. It implies engineers have a fairly static/discrete/fitted view of a system, the perfect snapshot, the position, whereas architects have a time-smeared continuum perspective, the story, the momentum. It's the whole story, not only the goal, that differentiates between emergence and evolution. But it's also important to understand that both of those roles are equally creative and forward-looking; neither is a "higher level", they are two sides of the same coin.

Helga Nowotny's distinction between risk and uncertainty fits nicely here - risk can be computed, uncertainty cannot. Risk is a relation between snapshots, uncertainty is the continuum in between. A weather forecast has risk, the climate is uncertainty. In that sense, a risk-driven architecture involves everyone, but uncertainty needs a different perspective. The future leaves to the architect (the systems-of-systems-carer, archeological gardener, forensic librarian, ontological cartographer or whatever you like to call her) the role of the curator, and maybe narrator, of the uncomputable. Paraphrasing what Akka Architects say: Architects don't design system interactions, they curate the context for a discourse about system interactions.

"It is not a question of establishing limits with walls,
but by other means"

Why is all of this important? Why should we get used to speaking in terms of probabilities and, most importantly, tackle certainty and uncertainty differently, yet with the same importance?

In my last job I learned the importance of maintainability and traceability. In every single one of my projects the first thing I introduced was proper monitoring, alerting and analytics - robustness was core, as was accountability, to be able to become lean and agile. With new infrastructure, whether in the cloud or not, this has become the default. The battle for certainty, traceability, and robustness is won.

While we were busy fighting this battle, uncertainty has come back as what Ms. Nowotny would call "systemic risk", risk deeply embedded in the complex relationships of our services: "Uncertainty switches gestalt". A cloud service going down is a risk that can be predicted - the dashboard not showing it, because it is hosted on the same cloud service, is systemic risk, the type of Dragon King uncertainty which we don't expect. Soon it will be mainstream that coders will pair with AI, and operations and product teams will regularly train models rather than manually define metrics. The relationships between components will become so complex that, more often than not, it will take a long time to even recognize errors, or feature usage, let alone find the root cause or customer need.
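
The dashboard example can be sketched in a few lines. This is a rough illustration under assumptions, not a real tool: the dependency graph and all service names in it are made up, and `shared_fate` is a hypothetical helper. It simply asks which dependencies a service and its monitor have in common, because those are the places where an outage also takes out our ability to see it.

```python
def transitive_deps(graph: dict[str, list[str]], node: str) -> set[str]:
    """Collect everything `node` depends on, directly or indirectly."""
    seen: set[str] = set()
    stack = [node]
    while stack:
        current = stack.pop()
        for dep in graph.get(current, ()):
            if dep not in seen:
                seen.add(dep)
                stack.append(dep)
    return seen


def shared_fate(graph: dict[str, list[str]], service: str, monitor: str) -> set[str]:
    """Dependencies whose failure would hit the service and its monitor together."""
    return transitive_deps(graph, service) & transitive_deps(graph, monitor)


# Hypothetical dependency graph: each key depends on the listed names.
graph = {
    "shop": ["cloud-a"],
    "status-dashboard": ["metrics-store"],
    "metrics-store": ["cloud-a"],
    "external-probe": ["cloud-b"],
}

print(sorted(shared_fate(graph, "shop", "status-dashboard")))  # ['cloud-a']
print(sorted(shared_fate(graph, "shop", "external-probe")))    # []
```

Of course, this only finds the relationships we already wrote down. The Dragon Kings live in the edges that are missing from the graph.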

Despite all the data we accumulate (sometimes as beautifully visualized as our old architecture diagrams), Observability still does not mean Explainability or Introspection, the ability to rationalize the inscrutability of AI.

Most of our architecture diagrams are nothing more than Thomassons, useless depictions of a fictional state which we only keep because it took us so long to create them. But similar to code, it's actually more productive to delete them. What we need is a visualization of the statistical complexity of the actual state. In my book, a good friend of mine wrote a story about how we made network traffic audible in order to hear inconsistencies - that's what we need to understand the architecture of a system. A real-time explanation of what's going on, mapped to the architecturally relevant components, such as interfaces, deployment units (like functions) and, last but not least, rules inside machine learning systems.

I am looking forward to new, real-time programming and data toolkits, think Eve, Jupyter and Glitch, especially because they enable a different kind of coder to build software. And with serverless and deep learning we will be able to scale those apps and the data required quicker than ever before. But it will require architects to understand them when something is wrong, when a use case needs to be developed or a feature is behaving unexpectedly - i.e. in operations and product development. These architects won't be able to look at diagrams anymore, they won't be certain about documentation (not that they ever were), and they won't even be able to observe the system in its entirety. Architects will indeed become archeological gardeners or forensic librarians. Most importantly, they will become like anthropologists or biologists in the early days, trying to understand evolution with the help of experimentation and collection. We'll finally see architects developing models instead of diagrams. And with that, we will see very different people in this role too. Which is very exciting.

*) separating this from a Tech Lead here, sometimes it can make sense to combine the two roles, though
**) e.g. serverless computing platforms such as Lambda and globally distributed databases such as Spanner, and with them a different approach to time, where evergreen becomes an axiom, but also a comeback of Spreadsheets through functional programming and k/v stores