Enabling Offline Inferences at Uber Scale
Introduction
At Uber, we use data from user support interactions to identify gaps in our products and create better, more delightful experiences for our users....
Uber’s Real-Time Document Check
Introduction
Justification for Identity Verification
Latin America is a rich cultural region, known for its world-renowned gastronomy, its abundant biodiversity, and its welcoming population. However, socio-economic...
Uber’s Emergency Button and The Technologies Behind It
Safety has long been a top priority at Uber, as Uber’s CEO Dara Khosrowshahi wrote in ‘Raising the Bar on Safety’ in September 2018....
Avoiding CPU Throttling in a Containerized Environment
At Uber, all stateful workloads run on a common containerized platform across a large fleet of hosts. Stateful workloads include MySQL®, Apache Cassandra®, Elasticsearch®,...
One Stone, Three Birds: Finer-Grained Encryption @ Apache Parquet™
Overview
Data access restrictions, retention, and encryption at rest are fundamental security controls. This blog explains how we have built and utilized open-sourced Apache Parquet™'s...
Introducing Ballast: An Adaptive Load Test Framework
As Uber's architecture has grown to encompass thousands of interdependent microservices, we need to test our mission-critical components at max load in order to...
Cost Efficiency @ Scale in Big Data File Format
Background
Our Apache Hadoop®-based data platform ingests hundreds of petabytes of analytical data with minimal latency and stores it in a data lake built...
CRISP: Critical Path Analysis for Microservice Architectures
Uber’s backend is an exemplar of microservice architecture. Each microservice is a small, individually deployable program that performs a specific piece of business logic (an operation). The microservice...
How Uber Migrated Financial Data from DynamoDB to Docstore
Introduction
Each day, Uber moves millions of people around the world and delivers tens of millions of food and grocery orders. This generates a large...
Introducing uGroup: Uber’s Consumer Management Framework
Background
Apache Kafka® is widely used across Uber’s multiple business lines. Take the example of an Uber ride: When a user opens up the Uber app,...
Improving HDFS I/O Utilization for Efficiency
Scaling our data infrastructure with lower hardware costs while maintaining high performance and service reliability has been no easy feat. To accommodate the exponential...
Building Uber’s Fulfillment Platform for Planet-Scale using Google Cloud Spanner
Introduction
The Fulfillment Platform is a foundational Uber domain that enables the rapid scaling of new verticals. The platform handles billions of database transactions each...
Real-Time Exactly-Once Ad Event Processing with Apache Flink, Kafka, and Pinot
Uber recently launched a new capability: Ads on Uber Eats. With this capability came new challenges that needed to be solved at Uber, such...
YAML Generator for Funnel YAML Files: Streamlining the Mobile Data Workflow Process
At Uber, real-time mobile analytics events—generated by button taps, page views, and more—form the backbone of the mobile data workflow process.
To process these events,...
Jellyfish: Cost-Effective Data Tiering for Uber’s Largest Storage System
Problem
Uber deploys a few storage technologies to store business data based on their application model. One such technology is called Schemaless, which enables the...
Streaming Real-Time Analytics with Redis, AWS Fargate, and Dash Framework
Introduction
Uber’s GSS (Global Scaled Solutions) team runs scaled programs for diverse products and businesses, including but not limited to Eats, Rides, and Freight. The...
Enabling Seamless Kafka Async Queuing with Consumer Proxy
Uber has one of the largest deployments of Apache Kafka in the world, processing trillions of messages and multiple petabytes of data per day....
Building Scalable Streaming Pipelines for Near Real-Time Features
Background
Uber is committed to providing reliable services to customers across our global markets. To achieve this, we heavily rely on machine learning (ML) to...
Eats Safety Team On-Call Overview
Introduction
Our engineers have the responsibility of ensuring a consistent and positive experience for our riders, drivers, eaters, and delivery/restaurant partners.
Ensuring such an experience requires...
Unifying Support Content to Enable More Empathetic and Personalized Customer Support Experiences
Introduction
Content quality is critical to the support experienced by Uber’s customers. Consider an Eater who reached out for help to cancel a very delayed...