Volume 20, Issue 3
Research for Practice:
Convergence
Peter Alvaro, Martin Kleppmann
Research for Practice reboot
It is with great pride and no small amount of excitement that I announce the reboot of acmqueue's Research for Practice column. For three years, beginning at its inception in 2016, Research for Practice brought both seminal and cutting-edge research—via careful curation by experts in academia—within easy reach for practitioners who are too busy building things to manage the deluge of scholarly publications. We believe the series succeeded in its stated goal of sharing "the joy and utility of reading computer science research" between academics and their counterparts in industry. We know our readers have missed it, and we are delighted to rekindle the flame after a three-year hiatus.
For this first installment, we invited Dr. Martin Kleppmann, research fellow and affiliated lecturer at the University of Cambridge, to curate a selection of recent research papers in a perennially interesting domain: convergent or "eventually consistent" replicated systems. His expert analysis circles the topic, viewing it through the lens of recent work in four distinct research domains: systems, programming languages, human-computer interaction, and data management. Along the way, readers will be exposed to a variety of data structures, algorithms, proof techniques, and programming models (each described in terms of a distinct formalism), all of which attempt to make programming large-scale distributed systems easier. I hope you enjoy his column as much as I did.
Distributed Computing, Research for Practice
Privacy of Personal Information
Sutapa Mondal, Mangesh S. Gharote, and Sachin P. Lodha
Going incog in a goldfish bowl
Each online interaction with an external service creates data about the user that is digitally recorded and stored. These external services may be credit card transactions, medical consultations, census data collection, voter registration, etc. Although the data is ostensibly collected to provide citizens with better services, the privacy of the individual is inevitably put at risk. With the growing reach of the Internet and the volume of data being generated, data protection and, specifically, preserving the privacy of individuals, have become particularly important. In this article we discuss data privacy concepts using two fictitious characters, Swara and Betaal, and their interactions with a fictitious entity, namely Asha Hospital.
Privacy and Rights
Kode Vicious:
Securing the Company Jewels
GitHub and runbook security
Often the problem with a runbook isn't the runbook itself; it's the runner of the runbook that matters. A runbook, or a checklist, is supposed to be an aid to memory and not a replacement for careful and independent thought. But our industry being what it is, we now see people take these things to their illogical extremes, and I think this is the problem you are running into with your local runbook runner.
Kode Vicious, Security
Escaping the Singularity:
I'm Probably Less Deterministic Than I Used to Be
Pat Helland
Embracing randomness is necessary in cloud environments.
In my youth, I thought the universe was ruled by cause and effect like a big clock. In this light, computing made sense. Now I see that both life and computing can be a crapshoot, and that has given me a new peace.
Distributed Computing, Escaping the Singularity
The Challenges of IoT, TLS, and Random Number Generators in the Real World
James P. Hughes, Whitfield Diffie
Bad random numbers are still with us and are proliferating in modern systems.
Many in the cryptographic community scoff at the mistakes made in implementing RNGs. Many cryptographers and members of the IETF resist the call to make TLS more resilient to this class of failures. This article discusses the history, current state, and fragility of the TLS protocol, and it closes with an example of how to improve the protocol. The goal is not to suggest a solution but to start a dialog about making TLS more resilient by proving that TLS can be secure without the assumption of perfect random numbers.
Business, Privacy and Rights
Volume 20, Issue 2
The Bikeshed:
Linear Address Spaces
Poul-Henning Kamp
Unsafe at any speed
One disadvantage of being a systems programmer: You see up close how each successive generation in an architecture has been inflicted with yet another "extension," "accelerator," "cache," "look-aside buffer," or some other kind of "marchitecture," to the point where the once-nice and orthogonal architecture is almost obscured by the "improvements" that followed. It seems almost like a law of nature: Any successful computer architecture, under immense pressure to "improve" while "remaining 100 percent compatible," will become a complicated mess.
Computer Architecture, System Evolution, The Bikeshed
Kode Vicious:
When Should a Black Box Be Transparent?
When is a replacement not a replacement?
While we all know that the pandemic has caused incredible amounts of death and destruction to the planet, and the past two years have brought unprecedented attention on the formerly very boring area of supply chains, the sun comes up and the world still spins—which is to say that the world has not ended, yet.
Honestly, if it did, it would be a nice break for me.
Supply chain issues are both real and the world's latest excuse for everything.
If I had kids (and let's all be thankful that I do not) I would expect them to be telling their teachers, "The supply chain ate my homework."
Data, Kode Vicious, Networks
Walk a Mile in Their Shoes
Jenna Butler and Catherine Yeh
The Covid pandemic through the lens of four tech workers
This article shares the stories of four fictional people made up from the amalgamated diaries of 20 individuals who submitted more than 150 diary entries over the first year of the pandemic. The article follows these four composite characters as they navigate a year of Covid while shipping one of the largest software products in the world. While the past 20 months have been a challenge, evidence suggests the next year and beyond will continue to be filled with changes in how people work as they settle into a new normal. These four stories are meant to help you better understand experiences in this new world so you can operate in a more empathetic and productive way as we move into the uncertain future of hybrid work.
Business, Privacy and Rights
Case Study
FHIR: Reducing Friction in the Exchange of Healthcare Data
A discussion with James Agnew, Pat Helland, and Adam Cole
With the full clout of the Centers for Medicare and Medicaid Services currently being brought to bear on healthcare providers to meet high standards for patient data interoperability and accessibility, it would be easy to assume the only reason this goal wasn't accomplished long ago is simply a lack of will. Interoperable data? How hard can that be? Much harder than you think, it turns out.
To dig into why this is the case, we asked Pat Helland, a principal architect at Salesforce, to speak with James Agnew (CTO) and Adam Cole (senior solutions architect) of Smile CDR, a Toronto, Ontario-based provider of a leading platform used by healthcare organizations to achieve FHIR (Fast Healthcare Interoperability Resources) compliance. They discuss the efforts and misadventures witnessed along the way to a time when it no longer seems inconceivable for healthcare providers to exchange patient records.
Case studies, Data, Privacy and Rights
Long Live Software Easter Eggs!
Benoit Baudry, Tim Toady, Martin Monperrus
They are as old as software.
It's a period of unrest. Rebel developers, striking from continuous deployment servers, have won their first victory. During the battle, rebel spies managed to push an epic commit in the HTML code of https://pro.sony. Pursued by sinister agents, the rebels are hiding in commits, buttons, tooltips, API, HTTP headers, and configuration screens.
Code, Development, Game Development
Drill Bits
Persistent Memory Allocation
Terence Kelly with Special Guest Borers Zi Fan Tan, Jianan Li, and Haris Volos
Leverage to move a world of software
A lever multiplies the force of a light touch, and the right software interfaces provide formidable leverage in multiple layers of code: A familiar interface enables a new persistent memory allocator to breathe new life into an enormous installed base of software and hardware. Compatibility allows a persistent heap to slide easily beneath a widely used scripting-language interpreter, thereby endowing all scripts with effortless on-demand persistence.
Drill Bits, Code, Data, Development
Volume 20, Issue 1
Autonomous Computing
Pat Helland
We frequently compute across autonomous boundaries, but the implications of the patterns to ensure independence are rarely discussed.
Autonomous computing is a pattern for business work using collaborations to connect fiefdoms and their emissaries. This pattern, based on paper forms, has been used for centuries. Here, we explain fiefdoms, collaborations, and emissaries. We examine how emissaries work outside the autonomous boundary and are convenient while remaining outsiders. And we examine how work across different fiefdoms can be initiated, run for long periods of time, and eventually be completed.
Data, Software Design
Distributed Latency Profiling through Critical Path Tracing
Brian Eaton, Jeff Stewart, Jon Tedesco, and N. Cihan Tas
CPT can provide actionable and precise latency analysis.
Low latency is an important feature for many Google applications such as Search, and latency-analysis tools play a critical role in sustaining low latency at scale. For complex distributed systems that include services that constantly evolve in functionality and data, keeping overall latency to a minimum is a challenging task. In large, real-world distributed systems, existing tools such as RPC telemetry, CPU profiling, and distributed tracing are valuable to understand the subcomponents of the overall system, but are insufficient to perform end-to-end latency analyses in practice. Scalable and accurate fine-grain tracing has made Critical Path Tracing the standard approach for distributed latency analysis for many Google applications, including Google Search.
Data, Networks, Search
Middleware 101
Alexandros Gazis and Eleftheria Katsiri
What to know now and for the future
Middleware should not serve solely as an object-oriented solution to execute simple request-response commands. Middleware can incorporate pull-push events and streams via multiple gateways by combining microservices architectures to develop a holistic decentralized ecosystem.
Development, System Evolution
Persistence Programming
Archie L. Cobbs
Are we doing this right?
A few years ago, my team and I were frustrated by trying to meet the data-storage requirements of a project using the traditional model of Java over an SQL database.
So we created our own custom persistence layer from scratch.
This was a lot of work, but it gave us a chance to rethink persistence programming.
By reducing the "database" to its core functions and reimplementing everything else in code, we found that managing persistence became more natural and more powerful.
Data, Development
Volume 19, Issue 6
The Bikeshed:
Surveillance Too Cheap to Meter
Poul-Henning Kamp
Stopping Big Brother would require an expensive overhaul of the entire system.
IT nerds tend to find technological solutions for all sorts of problems: economic, political, sociological, and so on. Most of the time, these solutions don't make the problems that much worse, but when a problem is of a purely economic nature, only solutions that affect the economics of the situation can possibly work. Neither cryptography nor smart programming will be able to move the needle even a little bit when the fundamental problem is that surveillance is too cheap to meter.
Privacy and Rights, The Bikeshed
FPGAs in Client Compute Hardware
Michael Mattioli
Despite certain challenges, FPGAs provide security and performance benefits over ASICs.
FPGAs are remarkably versatile. They are used in a wide variety of applications and industries where use of ASICs is less economically feasible. Despite the area, cost, and power challenges designers face when integrating them into devices, FPGAs provide significant security and performance benefits.
Hardware, Processors
Kode Vicious:
Getting Off the Mad Path
Debuggers and assertions
KV continues to grind his teeth as he sees code loaded with debugging statements that would be totally unnecessary if the programmers who wrote the code could be both confident in and proficient with their debuggers. If one is lucky enough to have access to a good debugger, one should give extreme thanks to whatever they normally give thanks to and use the damn thing!
Debugging, Kode Vicious
The Keys to the Kingdom
Phil Vachon
A deleted private key, a looming deadline, and a last chance to patch a new static root of trust into the bootloader
An unlucky fat-fingering precipitated the current crisis: The client had accidentally deleted the private key needed to sign new firmware updates. They had some exciting new features to ship, along with the usual host of reliability improvements. Their customers were growing impatient, but my client had to stall when asked for a release date. How could they come up with a meaningful date? They had lost the ability to sign a new firmware release.
Failure and Recovery
Drill Bits
Steampunk Machine Learning
Victorian contrivances for modern data science
Terence Kelly
Fitting models to data is all the rage nowadays but has long been an essential skill of engineers. Veterans know that real-world systems foil textbook techniques by interleaving routine operating conditions with bouts of overload and failure; to be practical, a method must model the former without distortion by the latter. Surprisingly effective aid comes from an unlikely quarter: a simple and intuitive model-fitting approach that predates the Babbage Engine. The foundation of industrial-strength decision support and anomaly detection for production datacenters, this approach yields accurate yet intelligible models without hand-holding or fuss. It is easy to practice with modern analytics software and is widely applicable to computing systems and beyond.
Drill Bits, Code, Data, Development, Visualization, AI and Machine Learning, Performance
Interpretable Machine Learning
Valerie Chen, Jeffrey Li, Joon Sik Kim, Gregory Plumb, Ameet Talwalkar
Moving from mythos to diagnostics
The emergence of machine learning as a society-changing technology in the past decade has triggered concerns about people's inability to understand the reasoning of increasingly complex models. The field of IML (interpretable machine learning) grew out of these concerns, with the goal of empowering various stakeholders to tackle use cases, such as building trust in models, performing model debugging, and generally informing real human decision-making.
AI