May/June 2021


Escaping the Singularity:
Don't Get Stuck in the "Con" Game


  Pat Helland

Consistency, convergence, and confluence are not the same! Eventual consistency and eventual convergence aren't the same as confluence, either.

"Eventual consistency" is a popular phrase with a fuzzy definition. People are even inconsistent in their use of consistency. But two other terms, "convergence" and "confluence", have crisper definitions and are more easily understood.

Data, Databases, Escaping the Singularity




Declarative Machine Learning Systems

  Piero Molino, Christopher Ré

The future of machine learning will depend on it being in the hands of the rest of us.

The people training and using ML models now are typically experienced developers with years of study working within large organizations, but the next wave of ML systems should allow a substantially larger number of people, potentially without any coding skills, to perform the same tasks. These new ML systems will not require users to fully understand all the details of how models are trained and used for obtaining predictions, but will provide them a more abstract interface that is less demanding and more familiar. Declarative interfaces are well-suited for this goal, by hiding complexity and favoring separation of interest, and ultimately leading to increased productivity.

AI




Real-world String Comparison

  Torsten Ullrich

How to handle Unicode sequences correctly

In many languages a string comparison is a pitfall for beginners. With any Unicode string as input, a comparison often causes problems even for advanced users. The semantic equivalence of different characters in Unicode requires a normalization of the strings before comparing them. This article shows how to handle Unicode sequences correctly. The comparison of two strings for equality often raises questions concerning the difference between comparison by value, comparison of object references, strict equality, and loose equality. The most important aspect is semantic equivalence.
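
The normalization step described above can be sketched in a few lines. This is a minimal illustration, not the article's own code: the helper name `equivalent` and the sample strings are assumptions chosen for the example, and it relies only on `unicodedata.normalize` from the Python standard library.

```python
import unicodedata


def equivalent(a: str, b: str, form: str = "NFC") -> bool:
    """Return True when a and b are equal after Unicode normalization.

    `form` is one of "NFC", "NFD", "NFKC", "NFKD". The helper name and the
    default form are illustrative choices, not taken from the article.
    """
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


composed = "\u00e9"     # "é" as a single precomposed code point
decomposed = "e\u0301"  # "e" followed by a combining acute accent

# A plain comparison sees two different code point sequences.
plain_equal = composed == decomposed

# After canonical normalization the two spellings compare equal.
canonical_equal = equivalent(composed, decomposed)

# Compatibility characters such as the "fi" ligature (U+FB01) only match
# their plain spelling under a compatibility form such as NFKC.
ligature_nfc = equivalent("\ufb01", "fi")
ligature_nfkc = equivalent("\ufb01", "fi", form="NFKC")
```

Which form to use depends on the application: canonical forms (NFC, NFD) preserve distinctions such as ligatures, while compatibility forms (NFKC, NFKD) fold them away.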

Code, Data




Kode Vicious:
Divide and Conquer


The use and limits of bisection

Bisection is of no use if you have a heisenbug that fails only from time to time. These subtle bugs are the hardest to fix and the ones that cause us to think critically about what we are doing. Timing bugs, bugs in distributed systems, and all the difficult problems we face in building increasingly complex software systems can't yet be addressed by simple bisection. It's often the case that it would take longer to write a usable bisection test for a complex problem than it would to analyze the problem whilst at the tip of the tree.
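
To see why intermittent failures defeat bisection, it may help to look at what the search assumes. The sketch below is a generic binary search over an ordered list of revisions, not the column's own code; `first_bad`, `revisions`, and `is_bad` are hypothetical names. It assumes the first revision is good, the last is bad, and the test gives the same answer every time it runs.

```python
from typing import Callable, Sequence


def first_bad(revisions: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest revision for which is_bad() is True.

    Assumptions (all hypothetical for this sketch): revisions are in
    chronological order, revisions[0] is good, revisions[-1] is bad, and
    is_bad() is deterministic. A test that fails only from time to time
    breaks the last assumption, and the search then narrows onto the
    wrong revision.
    """
    lo, hi = 0, len(revisions) - 1  # lo is known good, hi is known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(revisions[mid]):
            hi = mid  # the breakage is at mid or earlier
        else:
            lo = mid  # the breakage is after mid
    return revisions[hi]


# Usage with a made-up history in which "r5" introduced the bug.
history = [f"r{i}" for i in range(8)]
culprit = first_bad(history, lambda rev: int(rev[1:]) >= 5)
```

Each step trusts a single pass/fail answer, so one false "good" from a flaky test discards the half of the history that actually contains the bug.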

Debugging, Kode Vicious




Opinion:
When Curation Becomes Creation


  Liu Leqi, Dylan Hadfield-Menell, and Zachary C. Lipton

Algorithms, microcontent, and the vanishing distinction between platforms and creators

Media platforms today benefit from: (1) discretion to organize content, (2) algorithms for curating user-posted content, and (3) absolution from liability. This favorable regulatory environment results from the current legal framework, which distinguishes between intermediaries and content providers. This distinction is ill-adapted to the modern social media landscape, where platforms deploy powerful data-driven algorithms to play an increasingly active role in shaping what people see, and where users supply disconnected bits of raw content as fodder. Today's platforms have license to monetize whatever content they like, moderate if and when it aligns with their corporate objectives, and curate their content however they wish.

HCI, Opinion, Privacy and Rights




Digging into Big Provenance
(with SPADE)


  Ashish Gehani, Raza Ahmad, Hassaan Irshad, Jianqiao Zhu, and Jignesh Patel

A user interface for querying provenance

Several interfaces exist for querying provenance. Many are not flexible in allowing users to select a database type of their choice. Some provide query functionality in a data model that is different from the graph-oriented one that is natural for provenance. Others have intuitive constructs for finding results but have limited support for efficiently chaining responses, as needed for faceted search. This article presents a user interface for querying provenance that addresses these concerns and is agnostic to the underlying database being used.

Data


The Bikeshed:
What Went Wrong?


  Poul-Henning Kamp

Why we need an IT accident investigation board

Governments should create IT accident investigation boards for the exact same reasons they have done so for ships, railroads, planes, and in many cases, automobiles. Denmark got its Railroad Accident Investigation Board because too many people were maimed and killed by steam trains. The UK's Air Accidents Investigation Branch was created for pretty much the same reasons, but, specifically, because when the airlines investigated themselves, nobody was any the wiser. Does that sound slightly familiar in any way?

Compliance, The Bikeshed


 



March/April 2021


Commit to Memory:
A New Era for Mechanical CAD


  Jessie Frazelle

Time to move forward from decades-old design

The hardware industry is desperate for a modern way to do mechanical design. A new CAD program created for the modern world would lower the barrier to building hardware, decrease the time of development, and usher in a new era of building. The tools used to build with today are supported on the shoulders of giants, but a lot could be done to make them even better. At some point, mechanical CAD lost some of its roots of innovation. Let's dive into a few of the problems with the CAD programs that exist today and see how to make them better.

Commit to Memory, Hardware


Escaping the Singularity:
ACID: My Personal "C" Change


  Pat Helland

How could I miss such a simple thing?

I had a chance recently to chat with my old friend, Andreas Reuter, the inventor of ACID. He and his Ph.D. advisor, Theo Härder, coined the term in their famous 1983 paper, Principles of Transaction-Oriented Database Recovery. I had blinders on after almost four decades of seeing C based on my assumptions. One big lesson for me is to work hard to ALWAYS question your assumptions. Try hard to surround yourself with curious and passionate people, both young and old, who will challenge you and try to dislodge your blinders. Foster a culture that makes them safe as they do so.

Databases, Escaping the Singularity




Kode Vicious:
In Praise of the Disassembler


There's much to be learned from the lower-level details of hardware.

When you're starting out you want to be able to hold the entire program in your head if at all possible. Once you're conversant with your first, simple assembly language and the machine architecture you're working with, it will be completely possible to look at a page or two of your assembly and know not only what it is supposed to do but also what the machine will do for you step by step. When you look at a high-level language, you should be able to understand what you mean it to do, but often you have no idea just how your intent will be translated into action. Assembly and machine code is where the action is.

Development, Kode Vicious


Drill Bits:
Schrödinger's Code: Undefined Behavior in Theory and Practice


  Terence Kelly with special guest borers Weiwei Gu and Vladimir Maksimovski

Undefined behavior ranks among the most baffling and perilous aspects of popular programming languages. This installment of Drill Bits clears up widespread misconceptions and presents practical techniques to banish undefined behavior from your own code and pinpoint meaningless operations in any software—techniques that reveal alarming faults in software supporting business-critical applications at Fortune 500 companies.

Code, Databases, Development, Drill Bits, Open Source, Software Design


Case Study: Quantum-safe Trust for Vehicles:
The Race is Already On


A discussion with Michael Gardiner, Alexander Truskovsky, George Neville-Neil, and Atefeh Mashatan

In the automotive industry, cars now coming off assembly lines are sometimes referred to as "rolling data centers" in acknowledgment of all the entertainment and communications capabilities they contain. The fact that autonomous driving systems are also well along in development does nothing to allay concerns about security. Indeed, it would seem the stakes of automobile cybersecurity are about to become immeasurably higher just as some of the underpinnings of contemporary cybersecurity are rendered moot.

Case studies, Privacy and Rights, Security


The Complex Path to Quantum Resistance

  Dr. Atefeh Mashatan and Douglas Heintzman

Is your organization prepared?

Competing quantum-resistant proposals are currently going through academic due diligence and scrutiny by industry leaders. Until the newly minted quantum-resistant standards are finalized, ICT leaders should do their best to plan for a smooth transition. This article provides a series of recommendations for these decision-makers, including what they need to know and do today. It will help them in devising an effective quantum transition plan with a holistic lens that considers the affected assets in people, process, and technology. To do so, the decision-makers first need to comprehend the nature of quantum computing in order to grasp the impact of the impending quantum threat and appreciate its magnitude.

Privacy and Rights, Security


Biases in AI Systems

  Ramya Srinivasan and Ajay Chander

A survey for practitioners

This article provides an organization of various kinds of biases that can occur in the AI pipeline starting from dataset creation and problem formulation to data analysis and evaluation. It highlights the challenges associated with the design of bias-mitigation strategies, and it outlines some best practices suggested by researchers. Finally, a set of guidelines is presented that could aid ML developers in identifying potential sources of bias, as well as avoiding the introduction of unwanted biases. The work is meant to serve as an educational resource for ML developers in handling and addressing issues related to bias in AI systems.

AI, Privacy and Rights


 



January/February 2021


Escaping the Singularity:
Fail-fast Is Failing... Fast!


  Pat Helland

Changes in compute environments are placing pressure on tried-and-true distributed-systems solutions.

For more than 40 years, fail-fast has been the dominant way of achieving fault tolerance. In this approach, some mechanism is responsible for ensuring that each component is up, functioning, and responding to work. As the industry moves to leverage cloud computing, this is getting more challenging. The way we create robust solutions is under pressure as the individual components don't fail fast but instead start running slow, which is far worse. The slow component may be healthy enough to say, "I'm still here!" but slow enough to clog up all the work. This makes fail-fast schemes vulnerable.

Distributed Computing, Distributed Development, Escaping the Singularity, Quality Assurance


Software Development in Disruptive Times

  João Varajão

Creating a software solution with fast decision capability, agile project management, and extreme low-code technology

In this project, the challenge was to "deploy software faster than the coronavirus spread." In a project with such peculiar characteristics, several factors can influence success, but some clearly stand out: top management support, agility, understanding and commitment of the project team, and the technology used. Conventional development approaches and technologies would simply not be able to meet the requirements promptly.

Development




Kode Vicious:
Aversion to Versions


Resolving code-dependency issues

One should never hardcode a version or a path inside the code itself. Code needs to be flexible so that it can be installed anywhere and run anywhere so long as the necessary dependencies can be resolved, either at build time for statically compiled code or at runtime for interpreted code or code with dynamically linked libraries. There are current, good ways to get this right, so it's a shame that so many people continue to get it wrong.
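
One way to avoid a hardcoded path is to resolve the dependency when the program runs. The sketch below is an illustration under stated assumptions, not the column's recommendation: `find_tool` and the environment variable `EXAMPLE_TOOL` are hypothetical names, and the lookup uses only `os.environ` and `shutil.which` from the Python standard library.

```python
import os
import shutil
from typing import Optional


def find_tool(name: str, env_var: str) -> Optional[str]:
    """Locate an executable at runtime instead of hardcoding its path.

    An explicit override in the environment variable `env_var` wins;
    otherwise the directories on PATH are searched. Returns None when the
    tool cannot be found, so the caller can report a clear error.
    Both the function name and the override convention are hypothetical.
    """
    override = os.environ.get(env_var)
    if override:
        return override
    return shutil.which(name)


# Usage: an installer or user can point the code at any location.
os.environ["EXAMPLE_TOOL"] = "/opt/example/bin/tool"
resolved = find_tool("tool", "EXAMPLE_TOOL")

# Without an override, an unknown tool resolves to None rather than to a
# guessed path.
del os.environ["EXAMPLE_TOOL"]
missing = find_tool("surely-not-an-installed-tool-xyz", "EXAMPLE_TOOL")
```

The same idea applies to versions: query the installed dependency (for example with `importlib.metadata.version`) rather than embedding a version string in the code.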

Development, Kode Vicious


WebRTC - Realtime Communication for the Open Web Platform

  Niklas Blum, Serge Lachapelle, and Harald Alvestrand, Google

What was once a way to bring audio and video to the web has expanded into more use cases than we could ever imagine.

In this time of pandemic, the world has turned to Internet-based RTC (realtime communication) as never before. The number of RTC products has, over the past decade, exploded in large part because of cheaper high-speed network access and more powerful devices, but also because of an open, royalty-free platform called WebRTC. WebRTC is growing from enabling useful experiences to being essential in allowing billions to continue their work and education, and keep vital human contact during a pandemic. The opportunities and impact that lie ahead for WebRTC are intriguing indeed.

Web Services


Toward Confidential Cloud Computing

  Mark Russinovich, Manuel Costa, Cédric Fournet, David Chisnall, Antoine Delignat-Lavaud, Sylvan Clebsch, Kapil Vaswani, Vikas Bhatia

Extending hardware-enforced cryptographic protection to data while in use

Although largely driven by economies of scale, the development of the modern cloud also enables increased security. Large data centers provide aggregate availability, reliability, and security assurances. The operational cost of ensuring that operating systems, databases, and other services have secure configurations can be amortized among all tenants, allowing the cloud provider to employ experts who are responsible for security; this is often unfeasible for smaller businesses, where the role of systems administrator is often conflated with many others.

Distributed Computing, Privacy, Security


The SPACE of Developer Productivity

  Nicole Forsgren, Margaret-Anne Storey, Chandra Maddila, Thomas Zimmermann, Brian Houck, Jenna Butler

There's more to it than you think.

Developer productivity is about more than an individual's activity levels or the efficiency of the engineering systems relied on to ship software, and it cannot be measured by a single metric or dimension. The SPACE framework captures different dimensions of productivity, and here we demonstrate how this framework can be used to understand productivity in practice and why using it will help teams better understand developer productivity and create better measures to inform their work and teams.

Management, Workflow


 



November/December 2020


Drill Bits:
Offline Algorithms in Low-Frequency Trading


  Terence Kelly

Clearing Combinatorial Auctions

Expectations run high for software that makes real-world decisions, particularly when money hangs in the balance. This third episode of the Drill Bits column shows how well-designed software can effectively create wealth by optimizing gains from trade in combinatorial auctions. We'll unveil a deep connection between auctions and a classic textbook problem, we'll see that clearing an auction resembles a high-stakes mutant Tetris, we'll learn to stop worrying and love an NP-hard problem that's far from intractable in practice, and we'll contrast the deliberative business of combinatorial auctions with the near-real-time hustle of high-frequency trading. The example software that accompanies this installment of Drill Bits implements two algorithms that clear combinatorial auctions.

Code, Development, Drill Bits, Software Design


Enclaves in the Clouds

  Jatinder Singh, Jennifer Cobbe, Do Le Quoc, and Zahra Tarkhani

Legal considerations and broader implications

With organizational data practices coming under increasing scrutiny, demand is growing for mechanisms that can assist organizations in meeting their data-management obligations. TEEs (trusted execution environments) provide hardware-based mechanisms with various security properties for assisting computation and data management. TEEs are concerned with the confidentiality and integrity of data, code, and the corresponding computation. Because the main security properties come from hardware, certain protections and guarantees can be offered even if the host privileged software stack is vulnerable.

Compliance


Commit to Memory:
Let's Play Global Thermonuclear Energy


  Jessie Frazelle

It's important to know where your power comes from.

For us to grow and progress as a civilization, we need more investment in providing electricity to the world through clean, safe, and efficient processes. Thermonuclear energy is a huge step forward. This article is mostly focused on the use cases around grid-scale reactors. It's hard to see a future without some sort of thermonuclear energy powering all sorts of things around us.

Commit to Memory, Hardware, Power


Best Practice: Application Frameworks

  Chris Nokleberg and Brad Hawkes

While powerful, frameworks are not for everyone.

After an overview of the central aspects of frameworks, we dive deeper into the benefits of frameworks, the tradeoffs they entail, and the most important features we recommend implementing. Then we show a practical application of frameworks at Google: how developing a microservices platform allowed Google to break up its monolithic code base, and how frameworks enabled that change.

Development


Kode Vicious:
The Non-psychopath's Guide to Managing an Open-source Project


Respect your staff, learn from others, and know when to let go.

Moving from one of the technical faithful to one of the hated PHBs (pointy-haired bosses), whether in the corporate or the open-source world, is truly a difficult transition. Unless you are a type who has always been meant for the C-suite, it's going to take a lot of work and a lot of patience, mostly with yourself, to make this transition. Doing something "for the good of (blank)" usually means you are sublimating your own needs to the needs of others, and if you don't acknowledge that, you are going to get smacked and surprised by your own reactions to people very, very quickly.

Kode Vicious, Management, Open Source


Escaping the Singularity:
Baleen Analytics


  Pat Helland

Large-scale filtering of data provides serendipitous surprises.

Data analytics hoovers up anything it can find, and we are finding patterns and insights that weren't available before, with implications both for data analytics and for messaging between services and microservices. It seems that a pretty good understanding among many different sources allows more flexibility and interconnectivity. Increasingly, flexibility dominates perfection.

Escaping the Singularity, Data


Case Study:
Always-on Time-series Database:
Keeping Up Where There's No Way to Catch Up


A discussion with Theo Schlossnagle, Justin Sheehy, and Chris McCubbin

What if you found you needed to provide for the capture of data from disconnected operations, such that updates might be made by different parties at the same time without conflicts? And what if your service called for you to receive massive volumes of data almost continuously throughout the day, such that you couldn't really afford to interrupt data ingest at any point for fear of finding yourself so far behind present state that there would be almost no way to catch up?

Case studies, Databases


 



 



