Tech News: 2022-15

21:03, Monday, 11 April 2022 UTC

Other languages: Bahasa Indonesia, Deutsch, English, dagbanli, español, français, italiano, magyar, polski, português, português do Brasil, suomi, svenska, čeština, русский, українська, עברית, العربية, বাংলা, 中文, 日本語, ꯃꯤꯇꯩ ꯂꯣꯟ, 한국어

Latest tech news from the Wikimedia technical community. Please tell other users about these changes. Not all changes will affect you. Translations are available.

Recent changes

  • There is a new public status page at www.wikimediastatus.net. This site shows five automated high-level metrics where you can see the overall health and performance of our wikis’ technical environment. It also contains manually-written updates for widespread incidents, which are written as quickly as the engineers are able to do so while also fixing the actual problem. The site is separated from our production infrastructure and hosted by an external service, so that it can be accessed even if the wikis are briefly unavailable. You can read more about this project.
  • On Wiktionary wikis, the software to play videos and audio files on pages has now changed. The old player has been removed. Some audio players will become wider after this change. The new player has been a beta feature for over four years. [1][2]

Changes later this week

  • The new version of MediaWiki will be on test wikis and MediaWiki.org from 12 April. It will be on non-Wikipedia wikis and some Wikipedias from 13 April. It will be on all wikis from 14 April (calendar).

Tech news prepared by Tech News writers and posted by bot • Contribute • Translate • Get help • Give feedback • Subscribe or unsubscribe.

Why good information on the environment matters

16:13, Monday, 11 April 2022 UTC

Human-dominated landscapes tend to be homogenized in a way that’s often invisible to us. Tourists visiting anywhere in the tropics expect to see a lot of the same things — coconut trees, mangos, pineapples, bananas. Despite the fact that the tropics are some of the most biologically diverse regions of the planet, we see this artificial aggregation of a small number of common species. And alongside these intentional introductions are a whole lot of species that we have unintentionally spread around the world. Tramp species are species that have been spread around the world by human activity. Originally applied to ant species that had managed to find their way around the world like tramps or stowaways, the term has come to describe a group of species that are usually associated with human activity. While some tramp species become invasive species, most do not.

Most people are familiar with invasive species, but might have a hard time separating that concept from the related idea of introduced species. Familiar ideas like these got added to Wikipedia first (the invasive species article was created in 2002, while the introduced species article was created in 2003). The article on tramp species, on the other hand, wasn’t created until November 2021, when a student in Sarah Turner’s Advanced Seminar in Environmental Science class created it. It’s a concept that fills an important part of our understanding of this topic, but as long as it had no Wikipedia article, it was likely to be invisible to many people learning about the topic. Since undergraduates rely heavily on Wikipedia as a freely available alternative to textbooks, the topics that are missing from Wikipedia are more likely to slip through the cracks for students learning ecology.

Disease, as we have learned during the Covid-19 pandemic, is more than just the interaction between a pathogen and its host. There’s a whole world of environmental factors that mean there’s much more to disease transmission than simply infection rates. These sorts of things are part of the science of disease ecology, but more than a year into the pandemic, Wikipedia’s article on the topic was just a short overview. A student editor in the class was able to transform the article into something much more useful and informative to readers.

Climate change affects not only global temperatures, but also rainfall patterns and sea level rise. By expanding the ice sheet model and flood risk management articles, student editors were able to improve the information that’s out there for people trying to understand these important tools for forecasting changes in the world we live in. Other new articles created by students in the class include CLUE model, a spatially explicit land-use-change model, Cooper Reef, an artificial reef in Australia, Indigenous rainforest blockades in Borneo, the Impacts of tourism in Kodagu district in Karnataka, India, and Soapstone mining in Tabaka, Kenya. Other existing articles that they made major improvements to include Alopecia in animals, Blond capuchin, and Stream power.

Wikipedia’s coverage of environmental science is uneven. Many topics are covered well, but there are large gaps. Other articles suffer because they’re incomplete, badly organized, or out of date. This leaves a lot of room for student editors to make important contributions.

Image credit: Forest & Kim Starr, CC BY 3.0 US, via Wikimedia Commons

Tech News issue #15, 2022 (April 11, 2022)

00:00, Monday, 11 April 2022 UTC
This document has a planned publication deadline (link leads to timeanddate.com).

weeklyOSM 611

09:42, Sunday, 10 April 2022 UTC

29/03/2022-04/04/2022

lead picture

Patterns in placenames [1] © see | map data © OpenStreetMap contributors

Mapping

  • Anne-Karoline Distel reported on a survey of Callan, Ireland, where address attributes (house numbers and street names) seem kind of curious.
  • Dino Michelini wrote (it) > en a well-researched piece in his blog on the ancient Etruscan-Roman road Via Clodia. He also outlined what still needs to be done to improve the mapping of this road in OSM.
  • LySioS, an OSM France contributor, proposed (fr) > en that mappers in the field use an OSM business card to facilitate contacts with local residents.
  • LySioS also published (fr) > en a diary post for beginners about the ten commandments for OSM mapping (we reported earlier).
  • The OpenStreetMap tool set Neis-one.org now recognises MapComplete as a distinct data editor rather than just one of the ‘unknown’, as reported by MapComplete’s main developer.
  • The following proposals are waiting for your comments:

Community

  • The UN Mappers community is now also choosing a Mapper of the Month. The UN Mapper of the Month for April 2022 is SSEKITOLEKO.
  • Amanda McCann’s activity report for March 2022 is online.
  • Christoph Hormann shared his analysis of OSM-related group communication channels and platforms.
  • Minh Nguyen tackled the lack of a negative feedback option on the wiki and provided a JavaScript snippet to add to a user script page, so that one could chide any chosen contribution (an April Fool’s Day joke).
  • raspbeguy shared (fr) a small script, similar to git-blame, that indicates the last person who modified or deleted tags on an OSM element (a rough sketch of the idea follows this list).
  • Seth Deegan has proposed adding the Translate extension to the OSM Wiki, something that would improve the process of translating articles on the Wiki. The proposal is open for comments.
  • The Ukrainian OSM community has published an appeal to the OSM community urging everyone to refrain from any mapping of the territory of Ukraine while the Russian–Ukrainian war is unfolding.
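
For readers curious about the general idea behind such a tag “blame”, here is a minimal Python sketch (not raspbeguy’s script, which is linked above): it walks an element’s version history via the public OSM API and reports who last changed a given tag. The example way ID is a placeholder.

```python
import xml.etree.ElementTree as ET
import requests

def last_editor_of_tag(element_type: str, element_id: int, key: str):
    """Return the user who last added, changed or removed the tag `key`
    on an OSM element, by walking its version history (OSM API 0.6)."""
    url = f"https://api.openstreetmap.org/api/0.6/{element_type}/{element_id}/history"
    root = ET.fromstring(requests.get(url, timeout=30).content)
    previous_value = None
    last_user = None
    for version in root.findall(element_type):  # versions are listed oldest first
        tags = {t.get("k"): t.get("v") for t in version.findall("tag")}
        if tags.get(key) != previous_value:
            last_user = version.get("user")
        previous_value = tags.get(key)
    return last_user

# Example call; the way ID is a placeholder, not a real object of interest:
# print(last_editor_of_tag("way", 123456789, "name"))
```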

Events

  • OSMUS has honoured Ian Dees with the Hall of Fame Award.
  • Bryan Housel presented the 2.0 alpha of the new RapiD at SotMUS. The test instance showed high performance during testing.

Education

  • The group ‘Geospatial Analysis Community of Practice’ at the University of Queensland, Australia, has published an extensive tutorial on ‘spatial networks’ with R.

OSM research

  • Marco Minghini and his colleagues published a paper reviewing the initiatives from the Italian OpenStreetMap community during the early COVID-19 pandemic, discussing it from a data ecosystem perspective at both national and European scales.

Maps

  • [1] SeeSchloss created a map tool that uses OpenStreetMap data to visualise patterns in place names in various northern hemisphere territories.
  • MapTiler presented a short tutorial on ‘Customised maps made easy’.
  • Christopher Beddow wrote an article examining the bundle of geospatial components that make up Google Maps, and listed alternatives to each. He further suggests that bundling the alternatives is a strategy to compete with Google Maps as a widely used mobile app.

Did you know …

  • … the possibilities 1, 2, 3 of printing beautiful map-based gifts?
  • flat_steps? The tag for steps where individual steps are separated by about 1 metre or more. Such steps may be accessible to some people who would otherwise avoid highway=steps.

Other “geo” things

  • CAMALIOT, an Android App, is a project run by a consortium led by ETH Zurich (ETHZ) in collaboration with the International Institute for Applied Systems Analysis (IIASA) and the European Space Agency. The app is gathering data for machine learning analysis of meteorology and space weather patterns.
  • Cartographers from Le Monde described (fr) > en the steps taken in making their maps, using the Ukraine situation as an example.
  • @MatsushitaSakura left a photo (zhcn) on an internet detective hobby club and asked for help to find out where it was. Another user (@猫爪子) found the possible location six months later with the help of overpass turbo and some detailed Danish mapping and showed the Overpass QL code (01:45) (zhcn) he used.

Upcoming Events

Where | What | When
| Skillshare Session: OSM Community Forum | 2022-04-08
Berlin | 166. Berlin-Brandenburg OpenStreetMap Stammtisch | 2022-04-08
| OSM Africa April Mapathon: Map Kenya | 2022-04-09
| Open Mapping Hub Asia Pacific OSM Help Desk | 2022-04-11
臺北市 | OpenStreetMap x Wikidata Taipei #39 | 2022-04-11
Roma Capitale | Incontro dei mappatori romani e laziali | 2022-04-11
Washington | MappingDC Mappy Hour | 2022-04-13
San Jose | South Bay Map Night | 2022-04-13
20095 | Hamburger Mappertreffen | 2022-04-12
| Open Mapping Hub Asia Pacific OSM Help Desk | 2022-04-13
Michigan | Michigan Meetup | 2022-04-14
| OSM Utah Monthly Meetup | 2022-04-14
| Open Mapping Hub Asia Pacific OSM Help Desk | 2022-04-18
| OSMF Engineering Working Group meeting | 2022-04-18
| 150. Treffen des OSM-Stammtisches Bonn | 2022-04-19
City of Nottingham | OSM East Midlands/Nottingham meetup (online) | 2022-04-19
Lüneburg | Lüneburger Mappertreffen (online) | 2022-04-19
| Open Mapping Hub Asia Pacific OSM Help Desk | 2022-04-20
Dublin | Irish Virtual Map and Chat | 2022-04-21
New York | New York City Meetup | 2022-04-23
京都市 | 京都!街歩き!マッピングパーティ:第29回 Re:鹿王院 | 2022-04-24
| Open Mapping Hub Asia Pacific OSM Help Desk | 2022-04-25
Bremen | Bremer Mappertreffen (Online) | 2022-04-25
San Jose | South Bay Map Night | 2022-04-27
| Open Mapping Hub Asia Pacific OSM Help Desk | 2022-04-27
[Online] | OpenStreetMap Foundation board of Directors – public videomeeting | 2022-04-28
Gent | Open Belgium 2022 | 2022-04-29
Rapperswil-Jona | Mapathon/Hackathon at the OST Campus Rapperswil and virtually | 2022-04-29

Note:
If you would like to see your event here, please put it into the OSM calendar. Only data which is there will appear in weeklyOSM.

This weeklyOSM was produced by Lejun, Nordpfeil, PierZen, SK53, Strubbl, TheSwavu, derFred.

A Trainsperiments Week Reflection

13:59, Friday, 08 April 2022 UTC

Over here in the Release-Engineering-Team, Train Deployment is usually a rotating duty. We've written about it before, so I won't go into the exact process, but I want to tell you something new about it.

It's awful, incredibly stressful, and a bit lonely.

And last week we ran an experiment where we endeavored to perform the full train cycle four times in a single week... What is wrong with us? (Okay. I need to own this. It was technically my idea.) So what is wrong with me? Why did I wish this on my team? Why did everyone agree to it?

First I think it's important to portray (and perhaps with a little more color) how terrible running the train can be.

How it usually feels to run a Train Deployment and why

Here's a little chugga-choo with a captain and a crew. Would the llama like a ride? Llama Llama tries to hide.

―Llama Llama, Llama Llama Misses Mama

At the outset of many a week I have wondered why, when the kids are safely in childcare and I'm finally in a quiet house well fed and preparing a nice hot shower to not frantically use but actually enjoy, my shoulder is cramping and there's a strange buzzing ballooning in my abdomen.

Am I getting sick? Did I forget something? This should be nice. Why can't I have nice things? Why... Oh. Yes. Right. I'm on train this week.

Train begins in the body before it terrorizes the mind, and I'm not the only one who feels that way.

A week of periodic drudgery which at any moment threatens to tip into the realm of waking nightmare.

―Stoic yet Hapless Conductor

Aptly put. The nightmare is anything from a tiny visual regression to taking some of the largest sites on the Internet down completely.

Giving a presentation but you have no idea what the slides are.

―Bravely Befuddled Conductor

Yes. There's no visibility into what we are deploying. It's a week's worth of changes, other teams' changes, changes from teams with different workflows and development cycles, all touching hundreds of different codebases. The changes have gone through review, they've been hammered by automated tests, and yet we are still too far removed from them to understand what might happen when they're exposed to real world conditions.

It's like throwing a penny into a well, a well of snakes, bureaucratic snakes that hate pennies, and they start shouting at you to fill out oddly specific sounding forms of which you have none.

―Lost Soul been 'round these parts

Kafkaesque.

When under the stress and threat of the aforementioned nightmare, it's difficult to think straight. But we have to. We have to parse and investigate intricate stack traces, run git blames on the deployment server, navigate our bug reporting forms, and try to recall which teams are responsible for which parts of the aggregate MediaWiki codebase we've put together, a codebase that is highly specific to WMF's production installation and really only comes together long after changes merge to the main branches of the constituent codebases.

We have to exercise clear judgement and make decisive calls of whether to rollback partially (previous group) or completely (all groups to previous version). We may have to halt everything and start hollering in IRC, Slack channels, mailing lists, to get the signal to the right folks (wonderful and gracious folks) that no more code changes will be deployed until what we're seeing is dealt with. We have to play the bad guys and gals to get the train back on track.

Trainsperiments Week and what was different about it

Study after study shows that having a good support network constitutes the single most powerful protection against becoming traumatized. Safety and terror are incompatible. When we are terrified, nothing calms us down like a reassuring voice or the firm embrace of someone we trust.

―Bessel Van Der Kolk, M.D., The Body Keeps the Score

Four trains in a single week and everyone in Release Engineering is onboard. What could possibly be better about that?

Well, there is safety in numbers, as they say, and not in some Darwinistic way where most of us will be picked off by the train demons and the others will somehow take solace in their incidental fitness, but in a way where we are mutually trusting, supportive, and feeling collectively resourced enough to do the needful with aplomb.

So we set up video meetings for all scheduled deployment windows, had synchronous hand offs between our European colleagues and our North American ones. We welcomed folks from other teams into our deployments to show them the good, the bad, and the ugly of how their code gets its final send off 'round the bend and into the setting hot fusion reaction that is production. We found and fixed longstanding and mysterious bugs in our tooling. We deployed four full trains in a single week.

And it felt markedly different.

One of those barn raising projects you read about where everybody pushes the walls up en masse.

―Our Stoic Now Softened but Still Sardonic Conductor

Yes! Lonely and unwitnessed work is de facto drudgery. Toiling safely together we have a greater chance at staving off the stress and really feeling the accomplishment.

Giving a presentation with your friends and everyone contributes one slide.

―Our No Longer Befuddled but Simply Brave Conductor

Many hands make light work!

It was like throwing a handful of pennies into a well, a well of snakes, still bureaucratic and shouty, oh hey but my friends are here and they remind me these are just stack traces, words on a screen, and my friends happen to be great at filling out forms.

―Our Once Lost Now Found Conductor

When no one person is overwhelmed or unsafe, we all think and act more clearly.

The hidden takeaways of Trainsperiment Week

So how should what we've learned during our Trainsperiment Week inform our future deployment strategies and processes? How should train deployments change?

The known hypothesis we wanted to test by performing this experiment was in essence:

  1. More frequent deployments will result in fewer changes being deployed each time.
  2. Fewer changes on average means the deployment is less likely to fail. The deployment is safer.
  3. A safer deployment can be performed more frequently. (Positive feedback loop to #1.)
  4. Overall we will: move faster; break less.

I don't know if we've proved that yet, but we got an inkling that, yes, the smaller subsequent deployments of the week did seem to go more smoothly. One week, however, even a week of four deployment cycles, is not a large enough sample to say definitively whether running the train more frequently will result in safer, more frequent deployments with fewer failures.

What was not apparent until we did our retrospective, however, is that it simply felt easier to do deployments together. It was still a kind of drudgery, but it was not abjectly terrible.

My personal takeaway is that a conductor who feels resourced and safe is the basis for all other improvements to the deployment process, and I want conductors to not only have tooling that works reliably with actionable logging at their disposal, but to feel a sense of community there with them when they're pushing the buttons. I want them to feel that the hard calls of whether or not to halt everything and rollback are not just their calls but shared in the moment among numerous people with intimate knowledge of the overall MediaWiki software ecosystem.

Tooling, particularly around error reporting and escalation, is certainly still a barrier to entry. Once we've made sufficient improvements there, we need to get that tooling into other people's hands and show them that this process does not have to be so terrifying. And I think we're on the right track here with increased frequency and smaller sets of changes, but we can't lose sight of the human/social element and the foundational basis of safety.

More than anything else, I want wider participation in the train deployment process by engineers in the entire organization along with volunteers.


Thanks to @thcipriani for reading my drafts and unblocking me from myself a number of times. Thanks to @jeena and @brennen for the inspirational analogies.

Signpost Technology Report — March 2022

16:00, Thursday, 07 April 2022 UTC

This Technology Report covers the period from March 1, 2022 to March 27, 2022 UTC. It is republished from the Signpost on-wiki, and by extension is dual-licensed under CC BY SA 4.0 and GFDL 1.3.

2022 Wikimedia Hackathon

The logo of the Hackathon

The Wikimedia Hackathon 2022 is taking place as a hybrid event on May 20–22, 2022. The Hackathon will be held online and there will be grants available to support local in-person meetups around the world. The Hackathon is for anyone who contributes (or wants to contribute) to Wikimedia’s technical areas – as code creators, maintainers, translators, designers, technical writers and other technical roles. You can come with a project in mind, join an existing project, or create something new with others. The choice is yours! Newcomers are welcome. If you have any accessibility or translation requests, please contact [email protected].

A Wikimedia Hackathon is a space for the technical community to come together to work on technical projects, learn from each other, and make new friends. The Hackathon will primarily be held online. Local affiliates can also apply for grants to host in-person local meetups. Meetups can be anything from social gatherings with food, to a party for watching the opening or closing ceremony, to renting a venue where people can participate together in the online event.

The Code of Conduct for Wikimedia’s Technical Spaces will be in effect throughout the event, on all platforms, discussion channels, and at local meetups. Please have a look at it and ensure you are willing and able to follow it.

Desktop Improvements from the Web team

A series of new features and rearrangements to the Vector skin.

It has been almost 12 years since the current default desktop skin (Vector) was deployed. Since then, web design, as well as the expectations of readers and editors, have evolved. At the same time, the interface has been enriched with extensions, gadgets and user scripts. Most of these were not coordinated visually or cross-wiki.

In 2019, the Wikimedia Foundation Web team took a close look at Vector. It was time to take some of these ideas and bring them to the default experience of all users, on all wikis, in an organized, consistent way. Inspired by the existing tools, the Web team decided to build out improvements to the desktop experience based on research and communities’ feedback. So the Desktop Improvements project began.

Its goals are to make Wikimedia wikis more welcoming, increase the utility for viewing, and maintain the utility for editing. The Web team measures the increase of trust and positive sentiment towards our sites, and the utility of our sites (the usage of common actions such as search and language switching).

Improvements that the team has worked on include: logo reconfiguration, a collapsible sidebar, limiting content width, moving the search widget (and other search improvements), adding a more intuitive language switcher, implementing a user menu, programming a sticky site and article header, improving the table of contents, and rearranging page tools. Next, they will make general aesthetic improvements.

Currently, on most wikis, only logged-in users are able to opt-in individually by selecting Vector (2022) in preferences. On almost 30 early adopter wikis, the changes are deployed for all by default, and logged-in users are (and will be) able to opt-out. The team increases the set of early adopter wikis gradually.

Before June 2022, they will begin conversations with all the communities of the largest wikis, including the English and German-language Wikipedias, to make the improvements default on those wikis. They are inviting everyone to an open meeting with them which will take place on Tuesday March 29 at 18:00 on Zoom.

Sunflower, a new Commons uploading tool

A screenshot of the Sunflower interface

A simple and fresh take on uploading files

Fastily, on the project documentation page

Sunflower is an upload tool for macOS, created by Fastily, which makes it easy to batch-upload files to Wikimedia Commons. The tool has a clean, intuitive yet feature-packed interface. The project’s maintainer describes it as a simple and fresh take on uploading files to Commons. This means it won’t do everything under the sun, nor should you expect it to. Sunflower is currently available for macOS Monterey (12.2 or newer). More details are on Commons.

In brief

New user scripts to customise your Wikipedia experience

For further news and updates associated with user scripts, see the Scripts++ Newsletter

Bot tasks

Recently approved tasks

Bots that have been approved for operations after a successful BRFA can be found here for informational purposes. No other approval action is required for these bots. Old requests can be found in the archives.

Current requests for approval

Current bot requests for approval can be found here. The Bot Approvals Group encourages all community members to participate in the requests process, even if they have little to no knowledge of programming.

Latest tech news

Latest tech news from the Wikimedia technical community: 2022 #12, #11, & #10. Please tell other users about these changes. Not all changes will affect you. Translations are available on Meta.

Meetings

  • You can join the technical advice meeting on IRC. During the meeting, volunteer developers can ask for advice. The meeting takes place every Wednesday from 4:00–5:00 p.m. UTC. See how to join here.

More Wikidata metrics on the Dashboard

14:17, Thursday, 07 April 2022 UTC

We’re excited to announce some new updates to Dashboard statistics regarding Wikidata. As of April 2022, the Programs and Events Dashboard shares Wikidata details about merges, aliases, labels, claims, and more!

In early March, we rolled out the final batch of improvements from Outreachy intern Ivana Novaković-Leković. Ivana’s internship focused on improving the Dashboard’s support for Wikidata. After an overhaul of the system’s interface messages to add Wikidata-specific terminology — “Item” instead of “Article” and so on, for events that focus on Wikidata — Ivana worked on integrating Wikidata edit analysis into the Dashboard’s data update system. We deployed under-the-hood changes in February to begin collecting the data we would need — edit summaries from all tracked Wikidata edits. The final step was to add a visualization of that data, which you can see in action here.

The new Wikidata stats are based on analyzing the edit summary of each edit. The edit summaries for Items on Wikidata are more structured than the free-form summaries from Wikipedia and other wikis, making it possible to reliably classify most common types of contributions. For example, adding a label for a Wikidata item will result in an edit summary that includes the code `wbsetlabel-add`.
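
As a rough illustration of this approach (not the Dashboard’s actual implementation), classifying a summary comes down to extracting the action code and mapping it to a category. Only `wbsetlabel-add` comes from the description above; the other codes in the mapping are plausible examples, not a verified or complete list.

```python
import re

# Hypothetical mapping from edit-summary action codes to Dashboard categories.
# Only wbsetlabel-add is taken from the post above; the rest are illustrative.
ACTION_CATEGORIES = {
    "wbsetlabel-add": "Labels added",
    "wbsetdescription-add": "Descriptions added",
    "wbsetaliases-add": "Aliases added",
    "wbcreateclaim-create": "Claims created",
    "wbmergeitems-from": "Merges",
}

# Wikidata edit summaries start with a machine-readable part such as
# "/* wbsetlabel-add:1|en */ Douglas Adams".
ACTION_RE = re.compile(r"/\*\s*([\w-]+)")

def classify(summary: str) -> str:
    """Return a coarse contribution category for one edit summary."""
    match = ACTION_RE.search(summary)
    if match and match.group(1) in ACTION_CATEGORIES:
        return ACTION_CATEGORIES[match.group(1)]
    return "Other updates"

print(classify("/* wbsetlabel-add:1|en */ Douglas Adams"))       # Labels added
print(classify("/* wbcreateclaim-create:1| */ P31: Q5"))          # Claims created
```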

There are some limitations to this strategy, however. Multi-part revisions — for example, adding a new property that also includes qualifiers and references — will only be partially represented in the stats. That example gets counted towards ‘Claims created’, but not towards ‘References added’ or ‘Qualifiers added’. The Wikidata API provides no direct method to count these details, but it’s possible to calculate them by comparing the ‘before’ and ‘after’ state of an Item via its complete JSON entity data. We may explore that in the future, but it would require some significant changes in the Dashboard’s storage architecture before that would be possible.
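
For the curious, here is a minimal sketch of that before/after comparison idea, assuming the documented revision parameter of Special:EntityData; this is not something the Dashboard does today, and the revision IDs in the comment are placeholders.

```python
import requests

def reference_count(qid: str, revision: int) -> int:
    """Count reference blocks across all claims of an Item at one revision,
    using the JSON output of Special:EntityData."""
    url = f"https://www.wikidata.org/wiki/Special:EntityData/{qid}.json"
    data = requests.get(url, params={"revision": revision}, timeout=30).json()
    claims = data["entities"][qid].get("claims", {})
    return sum(
        len(statement.get("references", []))
        for statements in claims.values()
        for statement in statements
    )

# References added by a single edit = count(after) - count(before).
# added = reference_count("Q42", 222222222) - reference_count("Q42", 111111111)
```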

Over the last several weeks, we’ve been backfilling the Wikidata stats for almost all the Programs & Events Dashboard events that edited Wikidata, and Campaign pages also show aggregate Wikidata stats.

Thanks, Ivana, for your great work!

So what does this mean for Dashboard users?

Anne Chen, a Wikidata Institute alumna, has been using Wikidata more in the archaeology course she teaches at Yale University. As you can see from this screenshot of a recent edit-a-thon, there are many more granular Wikidata statistics you can follow. Prior to this update, the Dashboard provided users with statistics limited to the number of participants, items created, items edited, total edits, references added, and page views. Although these are useful statistics to have access to, the nature of Wikidata editing can demand other sets of metrics.

Screenshot from Dashboard
Wikidata Dashboard detailed statistics example

Merging, for instance, is an important feature of Wikidata editing. Data is coming to Wikidata from different corners of the world all at once, so duplication is a natural occurrence on Wikidata, but it still needs to be addressed. Now this specific metric is easy to track on the Dashboard. Similarly, label, alias, and description work is essential for translation, disambiguation, and providing context to users about items. These statistics used to be more difficult to discover; now they show up in the statistics box on the Dashboard.

Screenshot of the Download Stats button
Download stats button on the Dashboard

Experienced Dashboard users may be used to obtaining these statistics from the “Download stats” button on the home tab of the Dashboard. This button still exists, so if it’s more convenient to have these stats as a CSV file, you can still get them that way! For those curious users wondering what “other updates” means, those are edits made outside of the item space on Wikidata. This would include user pages, talk pages, and WikiProject pages.
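
If you do work with the CSV export, a couple of lines of Python are enough to poke at it; the filename and column names below are placeholders, so check the header row of your own download.

```python
import pandas as pd

# Placeholder filename; use the CSV you downloaded from your Dashboard event.
stats = pd.read_csv("dashboard_wikidata_stats.csv")
print(stats.head())

# Sum a column only if it exists in your export (column names may differ).
if "references added" in stats.columns:
    print("Total references added:", stats["references added"].sum())
```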

We’re excited to make these new statistics more accessible, since different users will have different outcomes for their projects. The more statistics we can track, the better we can tell the stories of our impact and work on Wikidata. We hope you enjoy these new features.

If you’re interested in learning more about Wikidata, editing Wikidata, and Wikidata statistics, keep an eye on our calendar for future courses.

The Wikimedia Foundation is excited to announce a new podcast collaboration with Sowt to bring Wikipedia and free knowledge to Arabic-speaking audiences across the Middle East and North Africa (MENA). The collaboration is part of our commitment to knowledge equity and to sharing free and accessible knowledge with everyone, everywhere.

This partnership focuses on a mutual vision, to share knowledge with the world through high-quality, narrative-driven audio content that explains the history, context, and relevance of specific topics, while building greater awareness, understanding, and affinity for Wikipedia across the region.

More about Sowt and Manbet

Sowt is a fast-growing regional independent podcast network based in Amman, Jordan. Sowt also means ‘voice’ or ‘sound’ in Arabic. Sowt will produce a new podcast season of Manbet. Hosted and written by veteran content creator Bisher Najjar, the educational podcast explores various topics from the fields of humanities and society.

The first episode of Manbet, published in October 2020, was followed by three seasons with topics ranging from exploring the history of passports and the birth of Arab feminism to the fashion revolution, among many others. Sowt has many other shows, like Eib and Domtak, which have been featured in international media and have topped the iTunes charts for the MENA region.

With direct collaboration and support from the Foundation’s Communications Department, Sowt will produce 10 episodes as part of the 4th podcast season of Manbet. 

About the new season and importance to free knowledge

The podcast scene across the MENA region has been growing for years, with more and more listeners tuning in. With podcast penetration exceeding 18% in the region and over 400 million Arabic speakers globally, the language is well positioned for audio content. Podcast audiences find audio content more trustworthy and more intimate than other forms of media, and they value the free-to-access nature of some platforms.

This partnership will utilize the strengths of Sowt’s storytelling to reach listeners and narrate informative stories in an entertaining way with accurate information and knowledge, while at the same time increasing awareness of, and adherence to, Wikimedia’s core belief in promoting free and accessible knowledge for all. Each episode dives deep into a single topic, explaining its history, context and importance.

This partnership furthers Wikipedia’s status as a living, trustworthy, useful, international, and free knowledge platform. It also underlines our commitment to inclusivity and our support of independent organisations that use technology to increase reach and impact in more places globally. This is the first partnership of its kind for the Wikimedia Foundation focusing on Arabic podcasts in the MENA region. 

What’s next! 

The power of digital audio storytelling and the importance of open knowledge are perfectly aligned with this project, and we aim to reach more Arabic-speaking audiences all over the world. Starting 19 April, tune in to the new season of Manbet to uncover topics such as Sufism, the life of Native Americans, the history of Yemen, the story of Nollywood, and humans’ ancient dream of flying.  

Sowt and the Wikimedia Foundation have partnered to produce a new season of Sowt’s podcast Manbet. Hosted and written by veteran content creator Bisher Najjar, the educational podcast explores various topics from the fields of humanities and society.

This partnership is a result of Sowt and the Wikimedia Foundation’s mutual vision to share knowledge with the world. The Wikimedia Foundation is the global nonprofit that operates Wikipedia and the other Wikimedia free knowledge projects and aims to ensure that every single human can freely share in the sum of all knowledge. Sowt produces and distributes high-quality audio shows in Arabic to create a dialogue around the most important topics to Arab listeners across the world.

“This partnership brings together the power of audio storytelling and the importance of open knowledge and access to information. As a leader in Arabic podcast production, Sowt and the Wikimedia Foundation are aligned on expanding access to high quality audio content for Arabic speaking audiences,” said Jack Rabah, the Wikimedia Foundation Lead Regional Partnerships Manager (Middle East and Africa). “Together, we can build greater awareness and understanding of Wikipedia through a series of informative narrated podcasts for all listeners across the MENA region.” 

The first episode of Manbet, published in October 2020, was followed by three seasons with topics ranging from exploring the history of passports and the birth of Arab feminism to the fashion revolution, among many others. Tune in to the new season of Manbet to uncover topics such as Sufism, the life of Native Americans, the history of Yemen, the story of Nollywood, and humans’ ancient dream of flying.

“I believe that the partnership between Sowt and the Wikimedia Foundation enriches the production of content in the Arab Region and Manbet is the best program to reflect this kind of collaboration. This season of Manbet attempts to take our audience through a journey of history, cinema, music and thriller. This span of information and stories will give us an insight of how knowledge has been and will always be power,” said Ahmed Eman Zakaria, a Manbet producer working with Sowt.

The new season of Manbet comes out in April 2022, and you can listen to new episodes wherever you get your podcasts.

Links:

This Recent research column originally appeared in the March 2022 issue of the Signpost. It is republished from on-wiki, and by extension is dual-licensed under CC BY SA 4.0 and GFDL 1.3. The authors of this post are Bri, Gerald Waldo Luis and Tilman Bayer.

A monthly overview of recent academic research about Wikipedia and other Wikimedia projects, also published as the Wikimedia Research Newsletter.

The first scholarly references on Wikipedia articles, and the editors who placed them there

Reviewed by Bri

The authors of the study “Dataset of first appearances of the scholarly bibliographic references on Wikipedia articles”[1] developed “a methodology to detect [when] the oldest scholarly reference [was] added” to 180,795 unique pages on English Wikipedia. The authors concluded the dataset can help investigate “how the scholarly references on Wikipedia grew and which editors added them”. The paper includes a list of the top English Wikipedia editors in a number of scientific research fields.

English Wikipedia lacking in open access references

Reviewed by Gerald Waldo Luis

A four-author study was published in the journal Insights on February 2, 2022, titled “Exploring open access coverage of Wikipedia-cited research across the White Rose Universities”.[2] As the title implies, it analyzes English Wikipedia references published by the universities of the White Rose University Consortium (Leeds, Sheffield, and York) and examines why open access (OA) is an important feature for Wikipedians to use. It summarizes that the English Wikipedia is still lacking in OA references (that is, those from the consortium).

The study opens by stating that despite the open nature of Wikipedia editing, there is no requirement to link to OA sites where possible. It then criticizes this lack of scrutiny, reasoning that it is contrary to Wikipedia’s goal of being an accessible portal to knowledge. Several following sections encapsulate the importance of Wikipedia among the research community, which makes OA crucial; this was recognized by the World Health Organization when it announced it would make its COVID-19 content free to use for Wikipedia. Wikipedia has also proven to be a factor in increasing paper readership.

Overall, 300 references were sampled for this study. The authors also added: “Of the 293 sample citations where an affiliation could be validated, 291 (99.3%) had been correctly attributed.” “In total,” the study summarizes, “there were 6,454 citations of the [consortium’s] research on the English Wikipedia in the period 1922 to April 2019.” It then presented tables breaking down these references into specific categories: Sheffield was cited the most (2,523), while York was cited the least (1,525). Biology-related articles cited the consortium the most (1,707), while art and writing articles cited it the least (7). As expected by the authors, journal articles—specifically from Sheffield—were cited the most (1,565). There is also a table breaking the references down by different OA licenses. York had the most OA sources cited on the English Wikipedia (56%). There are fewer sources that have non-commercial and non-derivative Creative Commons licenses. The study, however, notes that it is not a review of all English Wikipedia references.

In a penultimate “discussion” section, the study says that while there are many OA references, there is still “some way to go before all Wikipedia citations are fully available [in OA]”, with nearly half of the sampled references paywalled, thus stressing the need for more OA scholarly works. However, with Plan S, a recent OA-endorsing initiative, the study expresses optimism about this goal. It also proposes the solution of more edit-a-thons, which usually involve librarians and researchers who can help with this OA effort. The study notes that Leeds once held an edit-a-thon too. Its “conclusion” section states that “This [effort] can be achieved through greater awareness regarding Wikipedia’s function as an influential and popular platform for communicating science, [a] greater understanding […] as to the importance of citing OA works over [paywalled works].”

Briefly

Other recent publications

Other recent publications that could not be covered in time for this issue include the items listed below. Contributions, whether reviewing or summarizing newly published research, are always welcome. Compiled by Tilman Bayer

“Citation Needed: A Taxonomy and Algorithmic Assessment of Wikipedia’s Verifiability”

From the abstract:[3]

“In this paper, we aim to provide an empirical characterization of the reasons why and how Wikipedia cites external sources to comply with its own verifiability guidelines. First, we construct a taxonomy of reasons why inline citations are required, by collecting labeled data from editors of multiple Wikipedia language editions. We then crowdsource a large-scale dataset of Wikipedia sentences annotated with categories derived from this taxonomy. Finally, we design algorithmic models to determine if a statement requires a citation, and to predict the citation reason.”

“Psychology and Wikipedia: Measuring Psychology Journals’ Impact by Wikipedia Citations”

From the abstract:[4]

“We are presenting a rank of academic journals classified as pertaining to psychology, most cited on Wikipedia, as well as a rank of general-themed academic journals that were most frequently referenced in Wikipedia entries related to psychology. We then compare the list to journals that are considered most prestigious according to the SciMago journal rank score. Additionally, we describe the time trajectories of the knowledge transfer from the moment of the publication of an article to its citation in Wikipedia. We propose that the citation rate on Wikipedia, next to the traditional citation index, may be a good indicator of the work’s impact in the field of psychology.”

“Measuring University Impact: Wikipedia Approach”

From the abstract:[5]

“we discuss the new methodological technique that evaluates the impact of university based on popularity (number of page-views) of their alumni’s pages on Wikipedia. […] Preliminary analysis shows that the number of page-views is higher for the contemporary persons that prove the perspectives of this approach [sic]. Then, universities were ranked based on the methodology and compared to the famous international university rankings ARWU and QS based only on alumni scales: for the top 10 universities, there is an intersection of two universities (Columbia University, Stanford University).”

“Creating Biographical Networks from Chinese and English Wikipedia”

From the abstract and paper:[6]

“The ENP-China project employs Natural Language Processing methods to tap into sources of unprecedented scale with the goal to study the transformation of elites in Modern China (1830-1949). One of the subprojects is extracting various kinds of data from biographies and, for that, we created a large corpus of biographies automatically collected from the Chinese and English Wikipedia. The dataset contains 228,144 biographical articles from the offline Chinese Wikipedia copy and is supplemented with 110,713 English biographies that are linked to a Chinese page. We also enriched this bilingual corpus with metadata that records every mentioned person, organization, geopolitical entity and location per Wikipedia biography and links the names to their counterpart in the other language.” “By inspecting the [Chinese Wikipedia dump] XML files, we concluded that there was no metadata that identifies the biographies and, therefore, we had to rely on the unstructured textual data of the pages. […] we decided to rely on deep learning for text classification. […] The task is to assign a document to one or more predefined categories, in our case, “biography” or “non-biography.” […] For our extraction, we used one of the most widely used contextualized word representations to date, BERT, combined with the neural network’s architecture, BiLSTM. BiLSTM is state of the art for many NLP tasks, including text classification. In our case, we trained a model with examples of Chinese biographies and non-biographies so that it relies on specific semantic features of each type of entry in order to predict its category.”

See also an accompanying blog post.

Apparently the authors were unaware of Wikipedia categories such as zh:Category:人物 (or its English Wikipedia equivalent Category:People), which might have provided a useful additional feature for the machine learning task of distinguishing biographies from non-biographies. On the other hand, they made use of Wikidata to generate a training dataset of biographies and non-biographies.

“Learning to Predict the Departure Dynamics of Wikidata Editors”

From the abstract:[7]

“…we investigate the synergistic effect of two different types of features: statistical and pattern-based ones with DeepFM as our classification model which has not been explored in a similar context and problem for predicting whether a Wikidata editor will stay or leave the platform. Our experimental results show that using the two sets of features with DeepFM provides the best performance regarding AUROC (0.9561) and F1 score (0.8843), and achieves substantial improvement compared to using either of the sets of features and over a wide range of baselines”

“When Expertise Gone Missing: Uncovering the Loss of Prolific Contributors in Wikipedia”

From the abstract and paper (preprint version):[8]

“we have studied the ongoing crisis in which experienced and prolific editors withdraw. We performed extensive analysis of the editor activities and their language usage to identify features that can forecast prolific Wikipedians, who are at risk of ceasing voluntary services. To the best of our knowledge, this is the first work which proposes a scalable prediction pipeline, towards detecting the prolific Wikipedians, who might be at a risk of retiring from the platform and, thereby, can potentially enable moderators to launch appropriate incentive mechanisms to retain such `would-be missing’ valued Wikipedians.”

“We make the following novel contributions in this paper. – We curate a first ever dataset of missing editors, a comparable dataset of active editors along with all the associated metadata that can appropriately characterise the editors from each dataset.[…]

– First we put forward a number of features describing the editors (activity and behaviour) which portray significant differences between the active and the missing editors.[…]

– Next we use SOTA machine learning approaches to predict the currently prolific editors who are at the risk of leaving the platform in near future. Our best models achieve an overall accuracy of 82% in the prediction task. […]

An intriguing finding is that some very simple factors like how often an editor’s edits are reverted or how often an editor is assigned administrative tasks could be monitored by the moderators to determine whether an editor is about to leave the platform”

References

Due to software limitations, references are displayed here in image format. For the original list of references with clickable links, see the Signpost column.

UNLOCK free knowledge projects | Call for applications

00:00, Wednesday, 06 April 2022 UTC
An illustration of the key phrase “We accelerate your ideas. Together we build the future of Free Knowledge” of the Wikimedia Accelerator UNLOCK by Wikimedia Deutschland https://commons.wikimedia.org/wiki/Category:UNLOCK_Accelerator#/media/File:Wikimedia_Accelerator_UNLOCK.png

Access to knowledge is a matter of equity

The Wikimedia Accelerator program UNLOCK calls for new ideas and projects that break down social and technical barriers preventing people from both accessing and contributing to free knowledge. Applications are open from April 5th until May 29th.

The UNLOCK program supports innovative projects and the teams behind them with tailored coaching, access to an international network of experts, peer-to-peer exchange and a scholarship. We accelerate your ideas!

What kind of ideas get supported?

In alignment with recommendation 9 “Innovate in free knowledge” of the Movement Strategy, we look for projects (ideally at an early stage from idea sketch-level to a ready / almost ready prototype) that address knowledge equity in the following areas:

  • New technologies, tools, models of governance or similar that support more diverse modes and formats of knowledge (e.g. audio, visual, video, geospatial, etc.). With your ideas we can create an environment where new and/or underrepresented communities can devise their own technologies, systems, social structures, policies and governance.
  • Alternative practices and models that create more opportunities for everyone to participate in free knowledge projects; new concepts for overall fairer conditions, greater incentives and fairer compensation for the contribution to free knowledge; your idea for a more inclusive representation of the diverse knowledge of our world; or for building creative commons standards as real alternatives to existing closed-source business models.

Who is the program for?

We are looking for active Wikimedians as well as free knowledge enthusiasts, developers, designers and activists from outside the Wikimedia movement. For this year’s edition, we will pilot a regionally focused program, as Wikimedia Deutschland and Wikimedia Serbia have joined forces to promote and strengthen cross-regional and cross-affiliate collaboration. We have also teamed up with Impact Hub Belgrade, an organization from the innovation ecosystem. This means that applications are open to project teams (of a minimum of two and no more than five members) from the following countries: Albania, Austria, Bosnia-Herzegovina, Germany, Kosovo, Montenegro, North Macedonia as well as Serbia and Switzerland. The program language is English in order to enable exchange and create synergies among the cross-regional project teams. UNLOCK is designed to work completely virtually.

Applications are open from April 5th until May 29th. Join the UNLOCK Accelerator to turn your idea into a working prototype. Head over to the UNLOCK program page for details (incl. eligibility criteria, program timeline, application form and more). 

You want to apply?

Great – we look forward to your ideas! If you require further assistance with your application or if you have an idea but are not sure whether or not to apply, feel free to reach out to the organizing team behind the program via Email: [email protected] 

You know people who would be interested in this program?

Sharing is caring – help us spread the word about the call in your network and community. You can find us on LinkedIn and/or Twitter and simply share our posts. Ideally with a few personal words – a brief “Apply now” is enough! We have also created a one-pager to share, which contains all the important information.

And finally, take a look at the video below that captures former participants, coaches and the organizing team talking about their experiences, motivations and the big picture of this innovation-driving program UNLOCK.

Brand guidelines navigation

Over the past few months, the Brand Studio team within the Wikimedia Foundation communications department has been working on much-needed updates to the Movement Brand Guidelines portal on Meta-Wiki. The update included expanding the portal with more information, a fresh look, as well as a pilot to provide new do-it-yourself tools using the cloud-based design platform Figma.

It all started last October when the Wikimedia Foundation Board of Trustees approved a resolution to advance key areas of global branding while extending the pause of renaming work. One of the new projects the Board of Trustees directed the Brand Studio team to work on was supporting the movement with updated brand guidelines.

The brand portal on Meta-Wiki is used by affiliates, Foundation staff, the press and individuals. Unfortunately, over time the needs of brand users began to outpace the portal’s resources. In February we shared a proposal with the community to replace it with an updated version and to continue applying updates to the portal at least biannually based on community feedback.

Updated content and navigation

The new portal provides the user with seven main sections to navigate: Overview, Logo, Typography, Colours, Imagery, Campaigns and Events, and Create. The logo section, for example, provides information about all the Wikimedia marks, not just the Wikimedia Foundation’s logo. 

Our colours have been expanded to include all the Wikimedia colour palettes: the Core palette (black, white and greyscale), the Legacy palette (our legendary tricolour) and the Creative palette (inspired by the Wikipedia 20 birthday identity). We have also defined our colours with specific values so you can easily implement them in your designs, and shared colour versions that have been optimised for accessibility.

Updated templates and new do-it-yourself tools

The Create section of the portal provides the user with ready-made templates on Figma that can be used to create a new logo for an affiliate or a community activity in a very short time, without any need for design experience. In addition, the section provides a new presentation template to replace the one we have all been using in events and meetings since 2016.

Last January, we shared the proposed portal, design templates and the new presentation template for feedback. We would like to thank everyone who took the time to review the materials and share their feedback with us through the survey, office hours, email or Meta-Wiki. We used much of the feedback provided to apply rapid changes to the tools before publishing them, and we are planning to use more of this feedback during the next round of updates.

Training

Please join us on Tuesday 19 April 2022 at 15:00 UTC (using this link) or Sunday 24 April 2022 at 08:00 UTC (using this link) for a 60-minute live workshop where we will walk you through the new tools and how to use them, create designs together, and answer any questions you may have about the updates. Recorded summaries of the training will be available to watch later for those who cannot attend the live workshop.

We look forward to hearing more feedback on the updates from across the movement and to seeing the updates in use by affiliates and individuals.

This blog post is part of a three part blog series exploring how organizing helps the Wikimedia movement grow participation and respond to movement strategy. Parts two and three will be published during the next several weeks, and links will be updated here.

When I first joined the Wikipedia community, I became known among many in the United States community as a young, enthusiastic advocate — as someone who could share excitedly the promise of GLAM-Wiki, the Education Program and editing Wikipedia to change the world. I ran nearly a dozen events powered by enthusiasm — advocating Wikipedia to anyone I could, and making sure that they edited at least once. 

How many of those early recruits are still in the movement? Nearly none. 

12+ years of organizing later, I look back and see that only a handful of the people I contacted in those first few years are still contributing to the movement. My enthusiasm for the mission, the opportunities that I saw on Wikipedia, and the energy I invested in recruiting in those first few years felt a bit of a waste. Fun, but poorly spent energy. Through sheer force of will I made those activities work, so much so that volunteer Alex convinced some early Foundation staff that there was a working model of “just training enthusiastic organizers like Alex”.

Nearly every day, I encounter this same will and enthusiasm while organizing campaigns like #WikiForHumanRights, joining the outings of my local affiliates, and communicating with, coaching and mentoring organizers throughout the movement — as part of the Campaigns Programs team at the foundation. And just as frequently, I find that this sometimes misdirected energy can leave organizers either dispirited or exhausted. 

A flier for our upcoming #WikiForHumanRights campaign (learn more on diff): a campaign at the intersection of environmental issues and human rights. Learning how to run campaigns has really changed how I think about who we can recruit to our community.

As Wikimedians, our love for the process of documenting the world also creates an infectious enthusiasm and care for showing others how to do the same. But the fact is: most people don’t share our enthusiasm for public knowledge, or documenting the world, or learning how we do it (on a Wiki). It takes more than a beautiful vision to inspire; it requires alignment to our purpose and actions, and invitation and care in bringing newcomers along with us.

“Anyone can edit” is a software setting, not how you sustain a community

At the heart of my early enthusiasm was a complete belief in “Anyone can edit.” I looked at my University’s library, saw how much knowledge was not on Wikipedia, and thought “if only they knew what power filling these knowledge gaps could have, everyone would want to edit”.  Obviously that wasn’t the case.

It’s important to remember how we got to “anyone can edit”. The tagline comes from our roots in the early internet and the open source software community. “Anyone can edit” is a theory (i.e. there could be editors, because the software lets them edit), but in practice this is rarely the case. Editing requires a number of interactions with our platforms to go right, as well as the motivation, knowledge and time to do so.

The instructions prepared by my colleague Felix Nartey for the #1lib1ref campaign. Even though editing a page is turned on, there are a lot of steps between clicking edit and successful contribution.

There’s an old saying in the movement that “Wikipedians are born, not made”. These born Wikimedians are a rather rare bunch — they are often self-selecting because they have the skills to use our software, to research an encyclopedia, and to belong in the social environment that arises from this process. This, in many ways, echoes parts of the open source communities that created meritocracy cultures by focusing on the communities they already had; those cultures often became tragically misaligned cultures of exclusion (for example).

The New Editors Experiences research completed by the Wikimedia Foundation in 2017−18 provided a fairly robust examination of how we made it hard for motivated participants. The subsequent work of the Growth team at the Foundation has taken some impressive, incremental steps toward making it easy for “anyone who clicks edit” or “anyone who creates an account” to be successful. Now, instead of the vacant stare of a blank wikitext page welcoming you after you create an account, you are actually welcomed, given a choose-your-own-adventure path to participating, and soon you will be told how your work matters.

But that doesn’t mean that everyone will see the opportunity to edit and create an account. Only a tiny portion of the global population sees the edit button as an invitation and is willing to invest time and energy in the centuries-long project we have invested in: the sum of all human knowledge. The growth features are useful to most new editors if they click edit, but were designed to most deeply facilitate participation for two of the six fictional design research personas developed during the New Editor Experiences research: the Reactive Corrector and the Knowledge Sharer. For them, the edit button is a sufficient invitation for their personal missions, and their purpose can be facilitated by a wonderful set of tools. But the other types of new editors don’t always imagine themselves just clicking edit to begin with. 

The six fictional personas identified by the New Editors Research in 2017. Two of the personas, the Reactive Corrector and the Knowledge Sharer, are commonly identified when discussing “born not made” Wikipedians. But how do we put invitations out to other potential contributors?

If you found your way into the movement organically through editing, there is a good chance that part of your personality and motivation is similar to these two fictional personas prioritized by the Growth team.  For example, teach someone who is an instinctive Knowledge Sharer how to edit Wikidata, and they will be editing for years to come; or invite a Reactive Corrector to WPWP, and you might get 1000 new images on Wikipedia articles. And, though we continue to grow in many parts of the world through natural growth from these kinds of newcomers, the editing communities are starting to slow their replacement rate. 

Most people don’t share the same compulsion to edit as our existing communities — they could be inspired by the work of the Reactive Corrector and Knowledge Sharer, but the sheer persistence to document the world is not their thing. Each of the four other personas from the research needs something else beyond better tools to motivate their participation in our projects: the Audience Builder, a financial or self-promotion goal; the Box Checker, an outside requirement such as a school assignment; the Social Changer, a vision for how knowledge changes the world; and the Joiner Inner, a community to join.

How do we shift focus to “anyone who shares our vision will be able to join us”?

The Wikimedia movement’s vision imagines “a world in which every single human being can freely share in the sum of all knowledge.” This can feel as inspiring as it is broad. It mentions nothing of wikis, or of particular Wikimedian knowledge production processes, or even the internet. But in practice, our work as a movement happens on our platforms and in our communities, and is driven by very concrete socio-technical tools and norms. This is where the 2030 Movement Strategy’s Strategic Direction gives us a clearer and more actionable mandate: “anyone who shares our vision will be able to join us”. Through this lens, we can more clearly see the limitations of “anyone can edit” as a call to action. In order for people who share our vision — of universal access to knowledge — to join us, they first must be able to see their own public knowledge missions within the work of Wikimedia.

I would argue that, from a platform contribution perspective, this means we need to get better at inviting the two of the four remaining personas to whom we can provide a more deliberate invitation: the Social Changers and the Joiner Inners. I exclude the Box Checkers and the Audience Builders, because Wiki Education and other education programs have figured out great ways to engage the Box Checkers, and the general sentiment of the movement about the Audience Builders is mixed: if they make benign edits, they can become fine parts of the community; but as a group they have a tendency to start by pushing their point of view in such a way that it may be more trouble than it is worth (and in a world of disinformation, this can have toxic results).

One of the problems for the Social Changer and Joiner Inner personas, though, is that they need a connection to our movement. These personas need people and/or spaces that help them achieve their personal mission. Fortunately we have a growing network of capable affiliates and organizers in the movement who can provide that space. These affiliates and organizers create partnerships, outreach activities, events, and learning opportunities for the parts of the public that may be familiar with Wikimedia but haven’t yet thought to participate.

Organizers align some of our hardest to reach newcomers to our purpose and actions. By focusing on these newcomers with invitation and care, the newcomers feel supported and join us in our mission.

In 2019, we published the Movement Organizers Research to better understand organizers: Who are the facilitators that introduce new audiences to contributing? Where do they come from? How do we make sure that our movement makes it as easy for Organizers to contribute as it is to edit?

In Part II, I will focus on what we learned from the Movement Organizers research, and how organizers in the last few years have helped us reimagine what a welcoming invitation is for targeted participants, like Social Changers. 

Tech/News/2022/14

15:48, Tuesday, 05 April 2022 UTC

Other languages: Bahasa Indonesia, Deutsch, English, dagbanli, français, italiano, magyar, polski, português do Brasil, suomi, svenska, čeština, русский, українська, עברית, العربية, فارسی, ગુજરાતી, 中文, 日本語, ꯃꯤꯇꯩ ꯂꯣꯟ, 한국어

Latest tech news from the Wikimedia technical community. Please tell other users about these changes. Not all changes will affect you. Translations are available.

Problems

  • For a few days last week, edits that were suggested to newcomers were not tagged in the Special:RecentChanges feed. This bug has been fixed. [1]

Changes later this week

  • The new version of MediaWiki will be on test wikis and MediaWiki.org from 5 April. It will be on non-Wikipedia wikis and some Wikipedias from 6 April. It will be on all wikis from 7 April (calendar).
  • Some wikis will be in read-only for a few minutes because of a switch of their main database. It will be performed on 7 April at 7:00 UTC (targeted wikis).

Future changes

Tech news prepared by Tech News writers and posted by bot • Contribute • Translate • Get help • Give feedback • Subscribe or unsubscribe.

Don’t Blink: Public Policy Snapshot for March 2022

14:38, Tuesday, 05 April 2022 UTC

In case you blinked, we’re happy to catch you up with legislative and regulatory developments around the world that shape people’s ability to participate in the free knowledge movement. 

Here are the most important developments that have preoccupied the Wikimedia Foundation’s Global Advocacy team.


US Legislative Developments

  • Sec. 230: As part of our work advocating for protections for online intermediaries like Wikipedia, our team participated in a symposium hosted by William and Mary Law School. The panel discussion about ‘Business Law’s Response to Emerging Cultural Issues’ covered how crucial Section 230 of the Communications Decency Act has been to the development of free expression online, as well as how the various proposals to amend it may negatively impact the Internet. Our policy specialist Kate Ruane highlighted how these proposed changes would impact nonprofit projects like Wikipedia, and other online services, that are distinct from the large social media companies often at the center of the debate surrounding proposed changes to Section 230. 
  • Journalism Competition & Preservation Act (JCPA): Wikimedia Small Projects hosted our team for an episode of the SuenaWiki podcast to discuss the challenges posed by the JCPA currently under consideration in the US Congress.

Latin America and the Caribbean

  • Argentina: The Foundation’s Global Advocacy and Legal teams collaborated to file an amicus brief in the Supreme Court of Argentina in a right to be forgotten case (Denegri v. Google Inc). The applicant, a public figure, is asking Google to delist their name from content related to their media past, which they wish to be forgotten. In our brief, we argue that the right to be forgotten should not apply in this case as doing so would be an obstacle to freedom of expression and the right to information.
  • Chile: Both international and Chilean groups, including Wikimedia Chile, have expressed concern over legislation under consideration in the Chilean Congress to regulate digital platforms. Our blog post highlights the shortcomings of the Bill, including ambiguous language, impractical content moderation requirements, and a lack of consideration of community-led platforms. The Bill has the potential to become a misguided influence on similar regulations throughout the region, if approved.

Asia

  • Bangladesh: Authorities are currently reviewing the recommendations for proposed regulations, which could potentially impact Wikimedia’s volunteer-driven model and impose a short timeline for content removal and excessive penalties. A coalition letter signed by the Wikimedia Foundation and sent to the Bangladesh Telecommunication Regulatory Commission on 7 March outlines the concerns of major international human rights and internet freedom groups with the proposed “Regulation for Digital, Social Media, and OTT Platforms.” The letter received significant attention from Bangladesh’s major print and online media.

European Union

  • Digital Markets Act (DMA): On March 24, the three main EU bodies concluded negotiations over a common version of the DMA, an EU regulation intended to ensure a higher degree of competition among internet services by preventing large companies from abusing their power. The Free Knowledge Advocacy Group EU has been monitoring these developments and advocating for provisions that will enable free knowledge projects to thrive. An analysis of the practical consequences of the negotiation outcomes regarding interoperability of services can be found in their blog post.

Additional Developments

  • United Kingdom Online Safety Bill: The United Kingdom (UK) Government formally introduced its long-awaited Online Safety Bill on Thursday, March 17. Our Global Advocacy team published an initial assessment of what this Bill means for community-governed platforms like Wikipedia. The Bill attempts to hold internet platforms accountable for harmful content that is spread via their services, but the approach promoted in the UK Bill is misguided both in terms of the users it claims to protect and the platforms it supposedly holds accountable. Stay tuned for a deep-dive analysis of the Bill. 
  • The European Court of Human Rights has dismissed the Wikimedia Foundation’s 2019 petition to lift the block of Wikipedia in Turkey. The Court explained its decision on the grounds that the Turkish government had already restored access to Wikipedia in January 2020, and because the block was already determined to be a human rights violation in the Turkish Constitutional Court’s December 2019 ruling. The European Court of Human Rights’ decision comes at a time when access to knowledge continues to be under threat around the world. The Wikimedia Foundation will continue to defend the right of everyone to freely access and participate in knowledge. Learn more about the case, and the current status of Turkish Wikipedia in the Foundation’s official statement
  • World Intellectual Property Organization (WIPO): The Global Advocacy team has supported a group of Wikimedia chapters in applying for ad hoc observer status at the WIPO Standing Committee on Copyright and Related Rights. Observer status in this body will allow the Wikimedia Movement to have a voice in future discussions shaping copyright issues globally. The team has also been supporting interested affiliates to apply for permanent observer status at WIPO, which will allow them to participate in discussions on other intellectual property issues (e.g., traditional knowledge, climate change) that impact access to knowledge. Chapters we have helped apply for permanent and ad hoc observer status include those of Argentina, France, Germany, Italy, Mexico, South Africa, Sweden, Switzerland.

To learn more about our team and the work we do, follow us on Twitter (@WikimediaPolicy) or sign up for the Wikimedia public policy mailing list. The team’s Meta page is under construction.

Outreachy report #30: March 2022

00:00, Tuesday, 05 April 2022 UTC

March was a tough month (my partner and I had dengue fever while we reviewed and processed initial applications), but we made it through. ✨ Team highlights: Sage developed new code to help us review and process school time commitments. Sage and I have been trying to develop strategies to review academic calendars quickly for years. We’ve gone from external notes, to trying to gather data on specific schools, to requesting that initial application reviewers assign students to us.

WikiCrowd at 50k answers

19:13, Monday, 04 April 2022 UTC

In January 2022 I published a new Wikimedia tool called WikiCrowd.

This tool allows people to answer simple questions to contribute edits to Wikimedia projects such as Wikimedia Commons and Wikidata.

It’s designed to be able to deal with a wide variety of questions, but due to time constraints the current questions only cover aliases for Wikidata and depicts statements for Wikimedia Commons.

The tool has just surpassed 55k questions, 50k answers, 32k edits and 75 users.

Thanks to @pmgpmgpmgpmg (Twitter, Github) and @waldyrious (Twitter, Github) for their sustained contributions to the project, filing issues as well as contributing code and question definitions.

User Leaderboard

Though I haven’t implemented a leaderboard as part of the tool, the number of questions answered and the resulting edits are tracked in the backend.

Thus, of the 50k answers, we can take a look at who contributed to the crowd!

  1. PMG: 35,581 answers resulting in 21,084 edits at a 59% edit rate
  2. I dream of horses: 4543 answers resulting in 3184 edits at a 70% edit rate
  3. Tiefenschaerfe: 3749 answers resulting in 3207 edits at an 85% edit rate
  4. Addshore: 3049 answers resulting in 2133 edits at a 69% edit rate
  5. OutdoorAcorn: 708 answers resulting in 526 edits at a 74% edit rate
  6. Waldyrious: 443 answers resulting in 310 edits at a 69% edit rate
  7. Fences and windows: 409 answers resulting in 242 edits at a 59% edit rate
  8. Amazomagisto: 328 answers resulting in 211 edits at a 64% edit rate

Thanks to all of the 75 users that have given the tool a go in the past months.

Answer overview

  • Yes is the favourite answer with 32,192 occurrences
  • No comes second with 13,473 occurrences
  • And a total of 3,818 questions were skipped altogether

In the future skipped questions will likely be presented to a user a second time.

Question overview

Depicts questions have by far been the most popular, and also the easiest to generate more interesting groups of questions for.

  • 48,236 Depicts questions
  • 776 Alias questions
  • 471 Depicts refinement questions

The question mega groups were split into subgroups.

  • Depicts has had 45 different things that could be depicted
  • Aliases can be added from 3 different language Wikipedias
  • Depicts refinement has been used on 19 of the 45 depicted things

Question success rate

Some questions are harder than others, and some questions have better filtering in terms of candidate answers than others.

For this reason, I suspect that some questions will have a much higher success rate than others, and some will have more skips.

At a high level, the groups of questions have quite different yes rates.

  • Depicts: 65% yes, 27% no, 8% skip
  • Alias: 54% yes, 23% no, 21% skip
  • Depicts refinement: 95% yes, 2% no, 2% skip

If we take a deeper dive into the depict questions, we can probably see some depictions that are hard to spot or commons categories that possibly include a wider variety of media around a core subject.

An example of this would be categories for US presidents that also include whole categories for election campaigns, or demonstrations, neither of which would normally feature the president.

Depicts yes no skip
firework 95.99% 0% 4.01%
jet aircraft 95.19% 3.48% 1.33%
helicopter 89.50% 1.41% 9.09%
dog 87.70% 8.55% 3.76%
steam locomotive 85.24% 7.48% 7.28%
duck 83.35% 10.14% 6.51%
train 82.75% 10.66% 6.59%
hamburger 82.58% 5.63% 11.80%
candle 77.07% 16.67% 6.27%
house cat 74.26% 16.31% 9.43%
laptop 63.32% 27.36% 9.32%
bridge 61.36% 23.93% 14.71%
parachute 61.04% 20.22% 18.74%
camera 57.85% 39.86% 2.29%
electric toothbrush 48.79% 34.76% 16.45%
Barack Obama 28.29% 70.23% 1.49%
pie chart 21.13% 61.76% 17.11%
covered bridge 3.51% 79.61% 16.88%
Summary of depict questions (where over ~1000 questions exist) ordered by yes %

The % rate of yes answers could be used to gauge the ease of questions, allowing some users to pick harder categories, or forcing new users to try easy questions first.

As question generation is tweaked, particularly for depicts questions where categories can be excluded, we should also see the yes % change over time. Slowly tuning question generation to get to an 80% yes range could be fun!

Of course, none of this is implemented yet ;)…

Queries behind this data

Just in case this needs to be generated again, here are the queries used.

For the user leader boards…


DB::table('answers')
    ->select('username', DB::raw('count(*) as answers'))
    ->groupBy('username')
    ->orderBy('answers', 'desc')
    ->join('users', 'answers.user_id', '=', 'users.id')
    ->limit(10)
    ->get();

DB::table('edits')
    ->select('username', DB::raw('count(*) as edits'))
    ->groupBy('username')
    ->orderBy('edits', 'desc')
    ->join('users', 'edits.user_id', '=', 'users.id')
    ->limit(10)
    ->get();

And the question yes rate data came from the following query and a pivot table…


DB::table('questions')
    ->select('question_groups.name', 'answer', DB::raw('count(*) as counted'))
    ->join('answers', 'answers.question_id', '=', 'questions.id', 'left outer')
    ->join('edits', 'edits.question_id', '=', 'questions.id', 'left outer')
    ->join('question_groups', 'questions.question_group_id', '=', 'question_groups.id')
    ->groupBy('question_groups.name', 'answer')
    ->orderBy('question_groups.name', 'desc')
    ->get();
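
The pivot step itself is straightforward. As an illustration only (this helper is not part of the WikiCrowd codebase; the assumed row shape of name/answer/counted mirrors the columns selected above, and a missing answer is treated as a skip), a few lines of PHP can turn the rows returned by that query into per-group yes/no/skip percentages:

<?php
// Illustrative helper (not part of WikiCrowd): pivot rows of
// [name, answer, counted] into per-group percentage breakdowns.
function pivotAnswerRates( array $rows ): array {
    $totals = [];
    $counts = [];
    foreach ( $rows as $row ) {
        $group = $row['name'];
        // Assume a missing/NULL answer means the question was skipped.
        $answer = $row['answer'] ?? 'skip';
        $counts[$group][$answer] = ( $counts[$group][$answer] ?? 0 ) + $row['counted'];
        $totals[$group] = ( $totals[$group] ?? 0 ) + $row['counted'];
    }
    $rates = [];
    foreach ( $counts as $group => $answers ) {
        foreach ( $answers as $answer => $count ) {
            $rates[$group][$answer] = round( 100 * $count / $totals[$group], 2 );
        }
    }
    return $rates;
}

// Example with made-up numbers:
print_r( pivotAnswerRates( [
    [ 'name' => 'depicts', 'answer' => 'yes', 'counted' => 650 ],
    [ 'name' => 'depicts', 'answer' => 'no', 'counted' => 270 ],
    [ 'name' => 'depicts', 'answer' => null, 'counted' => 80 ],
] ) );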

Looking forward

Come and contribute, code, issues or ideas on the Github repo.

Next blog post at 100k? Or maybe, now that there are cron jobs for question generation (people don’t have to wait for me), 250k is a more sensible next step.

The post WikiCrowd at 50k answers appeared first on addshore.

Tech News issue #14, 2022 (April 4, 2022)

00:00, Monday, 04 April 2022 UTC

weeklyOSM 610

10:02, Sunday, 03 April 2022 UTC

22/03/2022-28/03/2022

lead picture

JOSM on a Steam Deck [1] © by Riiga licensed under CC BY-SA 4.0 | map data © OpenStreetMap contributors (ODbL) | JOSM: GPLv2 or later

Mapping campaigns

  • OSM Ireland’s building mapping campaign reached a significant milestone as reported by Amanda McCann.

Mapping

  • FasterTracker ponders (pt) > de the lack of a clear and immediate definition of how the network key should be used in the context of public transport, taking the example of the AML (pt) > en, the Lisbon metropolitan area.
  • Minh Nguyen blogged about oddities of township boundaries in Ohio (and as both Minh and commenters point out, it is not just Ohio).
  • muchichka pointed out (uk) > en that providing information about the movement and deployment of military forces and relevant international aid is forbidden according to a recent amendment of the Ukrainian Criminal Code. The diary post title indicates that in muchichka’s interpretation this extends to any mapping of military facilities.
  • The following proposals are waiting for your comments:
    • Standardising the tagging of manufacturer:*=* and model:*=* of artificial elements.
    • Introducing quiet_hours=* to facilitate people looking for autism-friendly opening hours.
    • Clarifying the difference between surveillance:type=guard and office=security.
    • Adding loading dock details like dock:height=*, door:height=* or door:width=*.

Community

  • [1] @riiga#7118, on the OSM World Discord, showed JOSM running on their Steam Deck Game Console: ‘No matter the device, no matter the place: mapping first, and with JOSM of course’. Original post (Discord login required.)
  • Based on the OSM Community Index, Who Maps Where (WMW) allows one to search for a mapper with local knowledge anywhere in the world. If you’re okay with your area of local knowledge being shown on the map, the project’s README on GitHub describes how that works.
  • qeef shared his view that communication within the HOT Tasking Manager is wrong because it duplicates OSM functionality.

OpenStreetMap Foundation

  • Guillaume Rischard noted, on Twitter, a blog post from Fastly about how the OpenStreetMap Operations team is using the Fastly (CDN) to ‘provide updates in near real-time’.

Education

  • unen’s latest diary entry continued his reflections on the discussions at his weekly help desk sessions for the HOT Open Mapping Hub Asia-Pacific. He invited people to provide contact details to be informed of future discussion agendas. Issues from recent weeks included accessing older versions of OSM data, and participants’ problems with remote control issues of JOSM.

Maps

  • Marcos Dione was dissatisfied with the appearance of hill shading in areas of Northern Europe. In his blog he explained how this is a result of the way hill shading is calculated using OSGeo tools.


switch2OSM

  • PlayzinhoAgro wrote, in his blog, about adding public service points to address gender-based violence in Brazil. Volunteer lawyers and psychologists providing assistance are shown (pt) > en on a map.

Open Data

  • ITDP has published a recording of the webinar ‘Why Open Data Matters for Cycling’, available on the Trufi Association website.

Software

  • A new version of Organic Maps has been released for iOS and Android. Map data was updated and Wikipedia articles were added. As usual, the release also includes small bugfixes for routing, styles, and translations.
  • Anthon Khorev has released ‘osm-note-viewer’, an alternative to https://www.openstreetmap.org/user/username/notes, where one can have an overview of notes related to a user both as a list and on a map.

Releases

  • GNU/Linux.ch reported (de) > en on the new version of StreetComplete. The intuitive usability, even for OSM newbies, is highlighted.

Did you know …

Other “geo” things

  • @Pixel_Dailies (a Twitter account) challenges pixel artists with a new theme every day. On Monday the theme was bird’s eye view and most of the participants’ entries, which can be found through #BirdsEyeView, feature some kind of aerial map.
  • Valentin Socha tweeted screen captures from 1993 France weather reports, where weekly forecasts were shown on a cut-out map with a letter for each day.

Upcoming Events

Where | What | When
Tucson | State of the Map US | 2022-04-01 – 2022-04-03
Burgos | Evento OpenStreetMap Burgos (Spain) 2022 | 2022-04-01 – 2022-04-03
Região Geográfica Imediata de Teófilo Otoni | Mapathona na Cidade Nanuque – MG – Brasil – Edifícios, Estradas, Pontos de Interesses e Área Verde | 2022-04-02 – 2022-04-03
Bogotá Distrito Capital – Municipio | Notathon en OpenStreetMap – resolvamos notas de Latinoamérica | 2022-04-02
Ciudad de Guatemala | Segundo mapatón YouthMappers en Guatemala (remoto) | 2022-04-02 – 2022-04-03
– | Open Mapping Hub Asia Pacific OSM Help Desk | 2022-04-04
– | OSMF Engineering Working Group meeting | 2022-04-04
Bologna | Open Data Pax | 2022-04-04
Stuttgart | Stuttgarter Stammtisch | 2022-04-05
Greater London | Missing Maps London Mapathon | 2022-04-05
Berlin | OSM-Verkehrswende #34 (Online) | 2022-04-05
– | Open Mapping Hub Asia Pacific OSM Help Desk | 2022-04-06
– | Tasking Manager Collective Meet Up – Option 1 | 2022-04-06
– | Tasking Manager Collective Meet Up – Option 2 | 2022-04-06
Heidelberg | Heidelberg Int’l. Weeks Against Racism: Humanitarian Cartography and OpenStreetMap | 2022-04-06
Berlin | 166. Berlin-Brandenburg OpenStreetMap Stammtisch | 2022-04-08
– | OSM Africa April Mapathon: Map Kenya | 2022-04-09
– | Open Mapping Hub Asia Pacific OSM Help Desk | 2022-04-11
臺北市 | OpenStreetMap x Wikidata Taipei #39 | 2022-04-11
Washington | MappingDC Mappy Hour | 2022-04-13
San Jose | South Bay Map Night | 2022-04-13
20095 | Hamburger Mappertreffen | 2022-04-12
– | Open Mapping Hub Asia Pacific OSM Help Desk | 2022-04-13
Michigan | Michigan Meetup | 2022-04-14
– | OSM Utah Monthly Meetup | 2022-04-14
– | Open Mapping Hub Asia Pacific OSM Help Desk | 2022-04-18
– | 150. Treffen des OSM-Stammtisches Bonn | 2022-04-19
City of Nottingham | OSM East Midlands/Nottingham meetup (online) | 2022-04-19
Lüneburg | Lüneburger Mappertreffen (online) | 2022-04-19
– | Open Mapping Hub Asia Pacific OSM Help Desk | 2022-04-20
Dublin | Irish Virtual Map and Chat | 2022-04-21

Note:
If you would like to see your event here, please put it into the OSM calendar. Only data which is there will appear in weeklyOSM.

This weeklyOSM was produced by Lejun, Nordpfeil, PierZen, SK53, Sammyhawkrad, Strubbl, TheSwavu, UNGSC_Alessia13, alesarrett, derFred.

Profiling a Wikibase item creation on test.wikidata.org

21:54, Saturday, 02 April 2022 UTC

Today I was in a Wikibase Stakeholder group call, and one of the discussions was around Wikibase importing speed, data loading, and the APIs. My previous blog post covering what happens when you make a new Wikibase item was raised, and we also got onto the topic of profiling.

So here comes another post looking at some of the internals of Wikibase, through the lens of profiling on test.wikidata.org.

The tools used to write this blog post for Wikimedia infrastructure are both open source and publicly accessible. You can do similar profiling on your own Wikibase, or for requests that you suspect are slow on Wikimedia sites such as Wikidata.

Wikimedia Profiling

Profiling of Wikimedia sites is managed and maintained by the Wikimedia performance team. They have a blog, and one of the most recent posts was actually covering profiling PHP at scale in production, so if you want to know the details of how this is achieved give it a read.

Throughout this post I will be looking at data collected from a production Wikimedia request, by setting the X-Wikimedia-Debug header in my request. This header has a few options, and you can find the docs on wikitech.wikimedia.org. There are also browser extensions available to easily set this header on your requests.

I will be using the Wikimedia hosted XHGui to visualize the profile data. Wikimedia specific documentation for this interface also exists on wikitech.wikimedia.org. This interface contains a random set of profiled requests, as well as any requests that were specifically requested to be profiled.

Profiling PHP & MediaWiki

If you want to profile your own MediaWiki or Wikibase install, or PHP in general, then you should take a look at the mediawiki.org documentation page for this. You’ll likely want to use either Tideways or XDebug, but probably want to avoid having to set up any extra UI to visualize the data.

This profiling only covered the main PHP application (MediaWiki & Wikibase extension). Other services such as the query service would require separate profiling.

Making a profiled request

On test.wikidata I chose a not so random item (Q64) which happens to be a small version of the item for Berlin on Wikidata. It has a bunch of labels and a couple of statements.

I made a few modifications, including removing the ID and changing all labels to avoid conflicts with the item that I had just copied, and came up with some JSON ready to feed back into the API.

I navigated to the API sandbox for test.wikidata.org, and set up a request using wbeditentity which would allow me to create a fresh item. The options look something like this:

  • new = item
  • token = <Auto-fill the token using the UI button>
  • data = <json data that I am using to create an item>
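
The data parameter is just a JSON blob describing the entity. As a minimal, illustrative sketch (the labels/descriptions structure follows the Wikibase API documentation, but the values here are made up and statements are left out), such a payload could be built with a few lines of PHP before pasting it into the sandbox:

<?php
// Illustrative sketch only: build a minimal `data` payload for a
// wbeditentity request that creates a new item. Only labels and
// descriptions are shown; the values are made up.
$data = [
    'labels' => [
        'en' => [ 'language' => 'en', 'value' => 'Berlin (profiling test copy)' ],
        'de' => [ 'language' => 'de', 'value' => 'Berlin (Profiling-Testkopie)' ],
    ],
    'descriptions' => [
        'en' => [ 'language' => 'en', 'value' => 'item created while testing profiling' ],
    ],
];

// The encoded string is what goes into the `data` field of the request.
echo json_encode( $data, JSON_PRETTY_PRINT ) . PHP_EOL;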

With the XHGui option selected in the WikimediaDebug browser extension, I can hit the “Make request” button and should see my item created. The next page will also output the full runtime of the request from the client perspective, in this case roughly 3.6 seconds.

Finding the request in XHGui

Opening up XHGui I should find the POST request that I just made to test.wikidata somewhere near the top of the list of profiled requests.

Clicking on the Time column, the details page of the profiled request will load. You can find my request, id 61fc06c1fe879940dbdf4a38 (archive URL just in case).

Profiling overview

There are lots of gotchas when it comes to reading a profile such as this:

  • The fact that profiling is happening will generally make everything run slower
  • Profiling tends to overestimate the cost of calling functions, so small functions called many times will appear to look worse than they actually are
  • When IO is involved, such as caching (if the cache is cold), database writes, relying on the internet, or external services, any number of things can cause individual functions to become inflated
  • It’s hard to know what any of it means, without knowing what the classes and methods are doing

Next let’s look at some terms that it makes sense to understand:

  • Wall time: also called real-world time, is the actual time that a thing has taken to run. This includes things such as waiting for IO, or your CPU switching to low power mode.
  • CPU time: also called process time, is the amount of time the CPU actually spent processing instructions, excluding things such as time spent waiting for IO.
  • Self: also called exclusive, covers the resources spent in the function itself, excluding time spent in children.
  • Inclusive: covers the resources inclusive of all children

You can read some more about different types of time and inclusivity in profiling on the Time docs for blackfire.io.

Reading the profile

The full wall time of the request is 5,266,796 µs, or 5.2 seconds. This is significantly more than we saw from the perspective of the client making the API request. This is primarily because of the extra processing that MediaWiki and Wikibase does after sending a response back to the user.

The full CPU time of the request is 3,543,361 µs, or 3.5 seconds. We can infer from this that the request included roughly 1.7 seconds of time not doing computations. This could be waiting for databases, or other IO.

We can find likely candidates for this 1.7 seconds of time spent not computing by looking at the top of the function breakdown for wall time, and comparing CPU time.

Method Calls Self Wall Time Self CPU time Difference
Wikimedia\Rdbms\DatabaseMysqli::doQuery 809 1,003,729 µs 107,371 µs ~ 0.9 s
GuzzleHttp\Handler\CurlHandler::__invoke 1 371,120 µs 2,140 µs ~ 0.3 s
MultiHttpClient::runMultiCurl 15 280,697 µs 16,066 µs ~ 0.25 s
Wikimedia\Rdbms\DatabaseMysqli::mysqlConnect 45 68,183 µs 15,229 µs ~ 0.05 s

The 4 methods above have a combined difference between wall and CPU time of ~1.5s, which accounts for most of the 1.7s we were looking for. The most expensive method call here is actually the single call to GuzzleHttp\Handler\CurlHandler::__invoke, which spends 0.3s waiting, as all of the other methods are called many more times. On average, Wikimedia\Rdbms\DatabaseMysqli::doQuery only spends 0.001s per method call in this request.

GuzzleHttp\Handler\CurlHandler::__invoke

Let’s have a closer look at this GuzzleHttp\Handler\CurlHandler::__invoke call. We have a few options to see what is actually happening in this method call.

  1. Click on the method to see the details of the call, navigate up through the parents to find something that starts to make some sense
  2. Use the callgraph view (only shows methods that represent more than 1% of execution time)

I’ll choose number 2, and have included a screenshot of the very tall call graph for this method to the right.

At the top of this call we see MediaWiki\SyntaxHighlight\Pygmentize::highlight, which I was not expecting in such an API call.

Another level up we see WANObjectCache::fetchOrRegenerate which means that this was involved in a cache miss, and this data was regenerated.

Even further up the same tree I see SyntaxHighlight::onApiFormatHighlight.

This method is part of the SyntaxHighlight extension, and spends some time making the output of the API pretty for users in a web browser.

So what have I learnt here? Don’t profile with jsonfm. However, the API sandbox doesn’t give you another option, and thus bug T300909 was born.

Callgraph overview

Having the callgraph open we can see some of the most “expensive” methods in terms of inclusive wall time. You can also find these in the table view by sorting using the headings.

main() represents the bulk of the MediaWiki request (5.2s). This is split into ApiMain::execute taking ~3.4 seconds, and MediaWiki::doPostOutputShutdown taking ~1.7 seconds.

ApiMain::execute

This is where the “magic happens” so to speak. ~3.4 seconds of execution time.

The first bit of Wikibase code you will see in this call graph path is Wikibase\Repo\Api\ModifyEntity::execute. This is the main execute method in the base class that is used by the API module we are calling. Moving to this Wikibase code we also lose another ~0.4 seconds due to the syntax highlighting issue mentioned above, which we can ignore.

Taking a look at the next level of methods in the order they run (roughly) we see most of the execution time.

Method Inclusive Wall time Description
Wikibase\Repo\Api\ModifyEntity::loadEntityFromSavingHelper ~0.2 seconds Load the entity (if exists) that is being edited
Wikibase\Repo\Api\EditEntity::getChangeOp ~0.6 seconds Takes your API input and turns it into ChangeOp objects (previous post)
Wikibase\Repo\Api\ModifyEntity::checkPermissions ~0.3 seconds Checks the user permissions to perform the action
Wikibase\Repo\Api\EditEntity::modifyEntity ~1.8 seconds Take the ChangeOp objects and apply them to an Entity (previous post)
Wikibase\Repo\Api\EntitySavingHelper::attemptSaveEntity ~0.4 seconds Take the Entity and persist it in the SQL database

In the context of the Wikibase stakeholder group call I was in today, which discussed initial import speeds and general editing speeds, what could I say about this?

  • Why spend 0.3 seconds of an API call checking permissions? Perhaps you are doing your initial import in a rather “safe” environment. Perhaps you don’t care about all of the permissions that are checked?
  • Permissions are currently checked in 3 places for this call: 1) up front, 2) if we need to create a new item, and 3) just before saving. In total this makes up ~0.6 seconds according to the profiling.
  • Putting the formed PHP Item object into the database actually only takes ~0.15 seconds.
  • Checking the uniqueness of labels and descriptions takes up ~1.2 seconds of ChangeOp validation. Perhaps you don’t want that?

MediaWiki::doPostOutputShutdown

This is some of the last code to run as part of a request.

The name implies it, but to be clear, this PostOutputShutdown method runs after the user has been served with a response. Taking a look back at the user-perceived time of 3.6 seconds, we can see that the wall time of the whole request (5.2s) minus this post output shutdown (1.7s) is roughly 3.5 seconds.

In relation to my previous post from the point of view of Wikibase, this is when most secondary data updates will happen. Some POST SEND derived data updates also happen in this step.

Closing

As I stated in the call, Wikibase was created primarily with the use case of Wikidata in mind. There was never a “mass data load” stage for Wikidata requiring extremely high edit rates in order to import thousands or millions of items. Thus the interfaces and internals do not cater to this use case, and optimizations or configurations that could be made have not been made.

I hope that this post will trigger some questions around expensive parts of the editing flow (in terms of time) and also springboard more folks into looking at profiling of either Wikidata and test.wikidata, or their own Wikibase installs.

For your specific use case you may see some easy wins with what is outlined above. But remember that this post and specific profiling is only the tip of the iceberg, and there are many other areas to look at.

The post Profiling a Wikibase item creation on test.wikidata.org appeared first on addshore.

Altering a Gerrit change (git workflow)

21:54, Saturday, 02 April 2022 UTC

I don’t use git-review for Gerrit interactions. This is primarily because back in 2012/2013 I couldn’t get git-review installed, and someone presented me with an alternative that worked. Years later I realized that this was actually the documented way of pushing changes to Gerrit.

As a little introduction to what this workflow looks like, and a comparison with git-review, I have created 2 overview posts about altering a Gerrit change on the Wikimedia Gerrit install. I’m not trying to convince you that either way is better, merely to show the similarities and differences, and what is happening behind the scenes.

Be sure to take a look at the other post, “Altering a Gerrit change (git-review workflow)”.

I’ll be taking a change from the middle of last year, rebasing it, making a change, and pushing it back for review. Fundamentally the 2 approaches do the same thing, just one (git-review) requires an external tool.

1) Rebase

Firstly I’ll rebase the change by clicking the “Rebase” button in the top right of the UI. (But this step is entirely optional)

This will create a second patchset on the change, automatically rebased on the master branch if possible. (Otherwise it would tell you to rebase locally.)

2) Checkout

In order to checkout the change I’ll use the “Download” button on the right of the change near the changed files.

A dialogue will appear with a bunch of commands that I can copy depending on what I want to do.

As I want to alter the change in place, I’ll use the “Checkout” link.

This will fetch the ref/commit, and then check it out.

3) Change

I can now go ahead and make my change to the commit in my IDE.

The change is quite small and can be seen in the diff below.

Now I need to amend the commit that we fetched from gerrit.

If I want to change the commit message in some way I can do git commit --all --amend

If there is no need to change the commit message you can also pass the --no-edit option.

You’ll notice that we are still in a detached state, but that doesn’t matter too much, as the next step is pushing to gerrit, and once that has happened we don’t need to worry about this commit locally.

4) Push

In order to submit the altered commit back to gerrit, you can just run the following command


git push origin HEAD:refs/for/master

The response of the push will let you know what has happened, and you can find the URL back to the change here.

A third patchset now exists on the change on Gerrit.

Overview

The whole process looks something like this.

Visualization created with https://git-school.github.io/
  1. A commit already exists on Gerrit that is currently up for review
  2. Clicking the rebase button will rebase this commit on top of the HEAD of the branch
  3. Fetching the commit will bring that commit on to your local machine, where you can now check it out
  4. Making a change and amending the commit will create a new commit locally
  5. You can then push this altered commit back to gerrit for review

If you want to know more about what Gerrit is doing, you can read the docs on the “gritty details”

Git aliases

You can use a couple of git aliases to avoid some of these slightly long commands


alias.amm=commit -a --amend
alias.amn=commit -a --amend --no-edit
alias.p=!f() { git push origin HEAD:refs/for/master; }; f

And you can level these up to provide you with a little more flexibility


alias.amm=commit -a --amend
alias.amn=commit -a --amend --no-edit
alias.main=!git symbolic-ref refs/remotes/origin/HEAD | sed 's@^refs/remotes/origin/@@'
alias.p=!f() { git push origin HEAD:refs/for/$(git main)%ready; }; f
alias.pd=!f() { git push origin HEAD:refs/for/$(git main)%wip; }; f

You can read more about my git aliases in a previous post.

The post Altering a Gerrit change (git workflow) appeared first on addshore.

Tech/News/2022/13

09:22, Friday, 01 April 2022 UTC

Other languages: Bahasa Indonesia, Deutsch, English, français, italiano, polski, suomi, čeština, русский, українська, עברית, العربية, فارسی, ไทย, 中文, 日本語, ꯃꯤꯇꯩ ꯂꯣꯟ

Latest tech news from the Wikimedia technical community. Please tell other users about these changes. Not all changes will affect you. Translations are available.

Recent changes

  • There is a simple new Wikimedia Commons upload tool available for macOS users, Sunflower.

Changes later this week

  • The new version of MediaWiki will be on test wikis and MediaWiki.org from 29 March. It will be on non-Wikipedia wikis and some Wikipedias from 30 March. It will be on all wikis from 31 March (calendar).
  • Some wikis will be in read-only for a few minutes because of regular database maintenance. It will be performed on 29 March at 7:00 UTC (targeted wikis) and on 31 March at 7:00 UTC (targeted wikis). [1][2]

Tech news prepared by Tech News writers and posted by bot • Contribute • Translate • Get help • Give feedback • Subscribe or unsubscribe.

Let’s talk about relationships — nothing gossip-y — but, rather, how one thing relates to something else. On Wikidata we talk about relationships using something called properties. Part of the semantic triple (subject, predicate, object — or in Wikidata parlance, item, property, value), properties define how one thing relates to another on Wikidata. Is it a date? A name? A location? An image? An identifier? Here’s an example: for those in the northern hemisphere, we may be thankful that this post is being published as spring (Q1312) follows (P155) winter (Q1311). In that sentence ‘follows’ is the property that explains a relationship between ‘winter’ and ‘spring.’ The Wikidata community uses properties to define any kind of relationship between things. How many properties are there? I’m glad you asked.

As of March 2022, there are around 10,000 properties on Wikidata. Roughly 7,000 of these are external identifier properties (external identifier properties correspond to external collections — museums and libraries — whose collection includes a person, place or concept that also exists in Wikidata). That leaves around 3,000 properties the community uses to describe everything. You can read the discussion page of any property to orient yourself to that property, but there are other ways to understand how properties work too. Knowing where to start with those can be a little overwhelming. This post will profile properties about properties. If that sounds confusing, I get it! I’ll provide plenty of examples to contextualize everything and help you better understand how properties work.

Let’s learn through examples. As you discover properties, wouldn’t it be wonderful if there were a way to see the property in action to know if you were using it correctly? I have good news for you: there IS a property that does this. It’s called Wikidata Property Example (P1855 for super-fans). Click that link, and read all about property examples, including links to queries where you can see thousands of properties — with examples — in the wild on Wikidata. To review: there is a property on Wikidata that exists to give you examples of properties and how they work. Can you focus the query on a specific property? Yes. Can you get multiple examples for one query? Yes. Does the example I shared list all properties with examples? Yes! Is this one of the best ways you can use properties like a pro? Absolutely.
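
If you want to poke at property examples programmatically, the same information is exposed through the Wikidata API. Below is a small, illustrative PHP sketch (not official tooling, and the choice of follows (P155) as the target is just for demonstration) that uses the wbgetclaims module to list the property-example (P1855) statements attached to a property and print the item IDs used as example subjects.

<?php
// Illustrative sketch: fetch the "Wikidata property example" (P1855)
// statements attached to the "follows" (P155) property via the API.
$url = 'https://www.wikidata.org/w/api.php?' . http_build_query( [
    'action' => 'wbgetclaims',
    'entity' => 'P155',    // the property whose examples we want to inspect
    'property' => 'P1855', // Wikidata property example
    'format' => 'json',
] );

$response = json_decode( file_get_contents( $url ), true );

foreach ( $response['claims']['P1855'] ?? [] as $claim ) {
    // The main value of each example statement is an item that
    // demonstrates how the property is meant to be used.
    $exampleItem = $claim['mainsnak']['datavalue']['value']['id'] ?? '(no value)';
    echo "Example subject item: $exampleItem\n";
}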

Now that you’re familiar with one way to learn how a property works, consider this: maybe the dataset you are working with requires you to describe an inverse relationship — something that is the opposite of something else. If only there were a property that could express an inverse relationship! Well, today is your lucky day, because there is a property called inverse property (P1696) that does exactly that. Please note, and this is very important, that this property describes other properties on Wikidata whose relationship is inverse to each other. For example, the follows property and the followed by property are linked to each other using inverse property. Another example would be family relationships, such as the parent properties (mother/father) and the child property.

If you’re not talking about relationships (properties), but rather items — concepts, people, places — there is a completely different property called opposite of (P461) that the community uses to describe conceptual opposites. What’s a conceptual opposite? You can think of it this way: the opposite of the color white is the color black; the opposite of summer is winter. It’s okay if it’s a little confusing; examples will help distinguish these two. To review: the inverse property is used exclusively with relationships — child/parent, capital/capital of, officeholder/position held, owner of/owned by. The other property, opposite of, is used exclusively to describe opposing concepts. Both of these properties are great for distinguishing related things on Wikidata. Let’s move on to another distinguished property.

You are nearly a property pro. You’re feeling confident, you understand how these descriptive connections relate to each other. The world is your oyster and you want to describe more things with more properties, more accuracy, and more precision. I love the enthusiasm. There’s a property that can help you do this: it suggests related properties on Wikidata. It’s called — you guessed it — related property (P1659). You can use this property to see other properties related to the one you are wondering about. You can think of it as a “see also” recommendation for properties. There are MANY location-type properties on Wikidata. Suppose you want to know all of the properties related to P131, which describes where things are geographically located? You could use “related property” in a query to get a list: just like this! You can think of this property as a way to reveal how properties are related to similar properties. Using this property will help make you a super-describer on Wikidata. There’s nothing you can’t describe now!

These three properties (well, four) should reveal more about how to describe anything on Wikidata. Learning how to use properties on Wikidata is essential for maintaining data quality and the usefulness of the data. It is also one of the most effective ways to learn how to query and to write better queries. The more familiar you are with properties, the more you will get out of Wikidata (and likely any other dataset you’re working with, whether it’s part of Wikidata or not). Now that you know more about properties on Wikidata, consider these two things:

  1. Wikidata will always require new properties. If one is missing, you can propose it here. Properties also change over time. If an existing property isn’t working for you (or has never worked for you), you can propose changes on the property’s discussion page. The only way Wikidata will ever be an equitable resource is if property usage and definitions work for all kinds of data and relationships in the world.
  2. The properties I’ve shared with you in this post are themselves incomplete. The community could always use more examples, better definitions, and other ways of describing things. Adding statements to items and properties is a very important way you can help improve these resources.

Stay tuned for more Wikidata property exploration posts here. And if you want to learn more, take the Wikidata Institute course I teach!

Benchmarking MediaWiki with PHPBench

12:14, Wednesday, 30 March 2022 UTC

This post gives a quick introduction to a benchmarking tool, phpbench, ready for you to experiment with in core and skins/extensions.[1]

What is phpbench?

From their documentation:

PHPBench is a benchmark runner for PHP analogous to PHPUnit but for performance rather than correctness.

In other words, while a PHPUnit test will tell you if your code behaves a certain way given a certain set of inputs, a PHPBench benchmark only cares how long that same piece of code takes to execute.

The tooling and boilerplate will be familiar to you if you've used PHPUnit. There's a command-line runner at vendor/bin/phpbench, benchmarks are discoverable by default in tests/Benchmark, a configuration file (benchmark.json) allows for setting defaults across all benchmarks, and the benchmark classes and tests look pretty similar to PHPUnit tests.

Here's an example test for the Html::openElement() function:

namespace MediaWiki\Tests\Benchmark;

class HtmlBench {

        /**
        * @Assert("mode(variant.time.avg) < 85 microseconds +/- 10%")
        */
        public function benchHtmlOpenElement() {
                \Html::openElement( 'a', [ 'class' => 'foo' ] );
        }
}

So, taking it line by line:

  • class HtmlBench (placed in tests/Benchmark/includes/HtmlBench.php) – the class where you can define the benchmarks for methods in a class. It would make sense to create a single benchmark class for a single class under test, just like with PHPUnit.
  • public function benchHtmlOpenElement() {} – method names that begin with bench will be executed by phpbench; other methods can be used for set-up / teardown work. The contents of the method are benchmarked, so any set-up / teardown work should be done elsewhere.
  • @Assert("mode(variant.time.avg) < 85 microseconds +/- 10%") – we define a phpbench assertion that the average execution time will be less than 85 microseconds, with a tolerance of +/- 10%.

If we run the test with composer phpbench, we will see that the test passes. One thing to be careful with, though, is adding assertions that are too strict – you would not want a patch to fail CI because the assertion on execution time was not flexible enough (more on this later).

Measuring performance while developing

One neat feature in PHPBench is the ability to tag current results and compare with another run. Looking at the HTMLBench benchmark test from above, for example, we can compare the work done in rMW5deb6a2a4546: Html::openElement() micro-optimisations to get before and after comparisons of the performance changes.

Here's a benchmark of e82c5e52d50a9afd67045f984dc3fb84e2daef44, the commit before the performance improvements added to Html::openElement() in rMW5deb6a2a4546: Html::openElement() micro-optimisations

❯ git checkout -b html-before-optimizations e82c5e52d50a9afd67045f984dc3fb84e2daef44 # get the old HTML::openElement code before optimizations
❯ git review -x 727429 # get the core patch which introduces phpbench support
❯ composer phpbench -- tests/Benchmark/includes/HtmlBench.php --tag=original

And the output [2]:

Note that we've used --tag=original to store the results. Now we can check out the newer code, and use --ref=original to compare with the baseline:

❯ git checkout -b html-after-optimizations 5deb6a2a4546318d1fa94ad8c3fa54e9eb8fc67c # get the new HTML::openElement code with optimizations
❯ git review -x 727429 # get the core patch which introduces phpbench support
❯ composer phpbench -- tests/Benchmark/includes/HtmlBench.php --ref=original --report=aggregate

And the output [3]:

We can see that the execution time roughly halved, from 18 microseconds to 8 microseconds. (For understanding the other columns in the report, it's best to read through the Quick Start guide for phpbench.) PHPBench can also provide an error exit code if the performance decreased. One way that PHPBench might fit into our testing stack would be to have a job similar to Fresnel, where a non-voting comment on a patch alerts developers whether the PHPBench performance decreased in the patch.

Testing with extensions

A slightly more complex example is available in GrowthExperiments (patch). That patch makes use of setUp/tearDown methods to prepopulate the database entries needed for the code being benchmarked:

/**
 * @BeforeMethods ("setUpLinkRecommendation")
 * @AfterMethods ("tearDownLinkRecommendation")
 * @Assert("mode(variant.time.avg) < 20000 microseconds +/- 10%")
 */
public function benchFilter() {
        $this->linkRecommendationFilter->filter( $this->tasks );
}

The setUpLinkRecommendation and tearDownLinkRecommendation methods have access to MediaWikiServices, and generally you can do the same things you'd do in an integration test to set up and tear down the environment. This test is towards the opposite end of the spectrum from the core test discussed above, which looks at Html::openElement(); here, the goal is to look at a higher-level function that involves database queries and interacting with MediaWiki services.
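
As a rough, hypothetical sketch of that shape (this is not the actual GrowthExperiments benchmark; the class name is made up and the fixture logic is only hinted at in comments), such a benchmark class might look like this:

namespace GrowthExperiments\Tests\Benchmark;

class LinkRecommendationFilterBench {

        /** @var array Placeholder list of tasks consumed by the benchmarked code. */
        private $tasks = [];

        public function setUpLinkRecommendation() {
                // Anything an integration test can do is possible here, since
                // MediaWikiServices::getInstance() gives access to services for
                // inserting fixture rows, building input objects, and so on.
                // The real benchmark creates link recommendation fixtures in the
                // database; this sketch only builds a placeholder task list.
                $this->tasks = [ 'placeholder-task' ];
        }

        public function tearDownLinkRecommendation() {
                // Remove whatever setUpLinkRecommendation created.
                $this->tasks = [];
        }

        /**
        * @BeforeMethods ("setUpLinkRecommendation")
        * @AfterMethods ("tearDownLinkRecommendation")
        */
        public function benchFilter() {
                // Only the body of a bench* method is timed; the real benchmark
                // runs the filter over $this->tasks here.
                count( $this->tasks );
        }
}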

What's next

You can experiment with the tooling and see if it is useful to you. Some open questions:

  • do we want to use phpbench? or are the scripts in maintenance/benchmarks already sufficient for our benchmarking needs?
  • we already have benchmarking tools in maintenance/benchmarks that extend a Benchmarker class; would it make sense to convert these to use phpbench?
  • what are sensible defaults for "revs" and "iterations" as well as retry thresholds?
  • do we want to run phpbench assertions in CI?
    • if yes, do we want assertions using absolute times (e.g. "this function should take less than 20 ms") or relative assertions ("patch code is within +/- 10% of old code")?
    • if yes, do we want to aggregate reports over time, so we can see trends for the code we benchmark?
    • should we disable phpbench as part of the standard set of tests run by Quibble, and only have it run as a non-voting job like Fresnel?

Looking forward to your feedback! [4]


[1] thank you, @hashar, for working with me to include this in Quibble and roll out to CI to help with evaluation!

[2]

> phpbench run --config=tests/Benchmark/phpbench.json --report=aggregate 'tests/Benchmark/includes/HtmlBench.php' '--tag=original'
PHPBench (1.1.2) running benchmarks...
with configuration file: /Users/kostajh/src/mediawiki/w/tests/Benchmark/phpbench.json
with PHP version 7.4.24, xdebug ✔, opcache ❌

\MediaWiki\Tests\Benchmark\HtmlBench

    benchHtmlOpenElement....................R1 I1 ✔ Mo18.514μs (±1.94%)

Subjects: 1, Assertions: 1, Failures: 0, Errors: 0
Storing results ... OK
Run: 1346543289c75373e513cc3b11fbf5215d8fb6d0
+-----------+----------------------+-----+------+-----+----------+----------+--------+
| benchmark | subject              | set | revs | its | mem_peak | mode     | rstdev |
+-----------+----------------------+-----+------+-----+----------+----------+--------+
| HtmlBench | benchHtmlOpenElement |     | 50   | 5   | 2.782mb  | 18.514μs | ±1.94% |
+-----------+----------------------+-----+------+-----+----------+----------+--------+

[3]

> phpbench run --config=tests/Benchmark/phpbench.json --report=aggregate 'tests/Benchmark/includes/HtmlBench.php' '--ref=original' '--report=aggregate'
PHPBench (1.1.2) running benchmarks...
with configuration file: /Users/kostajh/src/mediawiki/w/tests/Benchmark/phpbench.json
with PHP version 7.4.24, xdebug ✔, opcache ❌
comparing [actual vs. original]

\MediaWiki\Tests\Benchmark\HtmlBench

    benchHtmlOpenElement....................R5 I4 ✔ [Mo8.194μs vs. Mo18.514μs] -55.74% (±0.50%)

Subjects: 1, Assertions: 1, Failures: 0, Errors: 0
+-----------+----------------------+-----+------+-----+---------------+-----------------+----------------+
| benchmark | subject              | set | revs | its | mem_peak      | mode            | rstdev         |
+-----------+----------------------+-----+------+-----+---------------+-----------------+----------------+
| HtmlBench | benchHtmlOpenElement |     | 50   | 5   | 2.782mb 0.00% | 8.194μs -55.74% | ±0.50% -74.03% |
+-----------+----------------------+-----+------+-----+---------------+-----------------+----------------+

[4] Thanks to @zeljkofilipin for reviewing a draft of this post.

On Thursday 24 March the trilogue negotiators concluded their discussions, dramatic at times, over the Digital Markets Act. The compromise includes some gains on interoperability, a potential game changer in online intermediation. What to expect? Where not to hold your breath? We parse out the practical consequences of the trilogue outcome for interoperability.

Winding road to the final compromise

Interoperability has been a point of contention since the European Commission published its first draft in December 2020. The EC drafted the provision narrowly, obligating gatekeepers to offer interoperability to so-called ancillary services, such as payment or identification services, that wish to operate within their closed ecosystems. IMCO Rapporteur MEP Andreas Schwab followed this approach in his draft report.

That didn’t go down well with many MEPs, who were disappointed that an opportunity to open up the walled gardens of online intermediation had not been seized. Many amendments and heated debates later, the final EP report provided that interconnection should also be possible between messaging apps and services (the so-called number-independent interpersonal communication services) as well as social networks.

Since the Council’s approach focused on refining the business-to-business side of interoperability, the trilogues didn’t show much promise for extending the scope of the EC’s proposal. Somehow, under time pressure, the delegation of MEPs managed to negotiate gains that keep the spirit, if not the letter, of the EP mandate.

Basic rules of becoming interoperable under DMA

As originally devised, the final DMA compromise envisions that only services designated as gatekeepers will be obliged to create the conditions for interoperability with other services. This possibility will, however, be available only on request, meaning that there won’t be any obligation to make a given service permanently and publicly accessible. A gatekeeper will have 3 months to “render requested basic functionalities operational”.

In contrast to the European Commission’s original proposal, the compromise includes a definition of the functionality that enables opening up digital ecosystems which have so far been closed:

‘Interoperability’ means the ability to exchange information and mutually use the information which has been exchanged through interfaces or other solutions, so that all elements of hardware or software work with other hardware and software and with users in all the ways in which they are intended to function.

The definition is pretty straightforward and broadly covers potential applications of frictionless communication exchange, relating to both hardware and software. It refers both to the provisions already outlined by the European Commission in the original draft and to those worked out during the trilogues. The latter, as explained below in more detail, are an improvement, as they cover services that become accessible to individual users and groups of individuals (the so-called end users).

End users will be able to decide freely whether they want to make use of the interconnected services or stay with the provider they originally chose. A service that wants to connect with a gatekeeper will need to offer the same level of security. This means that if a gatekeeper offers end-to-end encryption, a connecting service will also need to provide it.

Messaging

End-to-end text messaging between two end users will be one of the basic functionalities that become interoperable on request. Within two years of designation, the gatekeeper will also need to make text messaging within groups of individual users available.

Similarly, sharing images, voice messages and video attachments between two individuals will be a key function available from the start; two years after a service becomes a gatekeeper, it will need to be extended to groups.

Calling

Voice and video calls will not be available immediately after gatekeepers are designated. Gatekeepers will have 4 years to create the technical and operational conditions to make end-to-end voice and video calls available between two individuals as well as within groups.

Social networking? Maybe…

Social networking was also supposed to be one of the functionalities that gatekeepers must make interoperable, but the negotiators were not keen on agreeing to the proposals made by the European Parliament team. The obligation for gatekeepers that offer social networking services did not make it into the final text.

Fortunately, the DMA has a revision clause that binds the European Commission to evaluate the regulation and report to the European Parliament and the Council of the EU. The negotiators agreed that the revision clause should include an assessment of whether social networking services should be brought within the scope of the interoperability provisions. So there is no promise, but at least the EC has to look into the issue again and produce some evidence for – or against – extending the scope.

The art of war compromise

The negotiations over interoperability were indeed dramatic. Apparently the French Presidency was unsure of its mandate from the Council to negotiate extended interoperability provisions and hesitated to go beyond what the Council had included in its draft. Even worse, the European Commission authored a non-paper full of simplified claims arguing that interoperability is not a feasible solution for either messaging or social networking.

Fortunately for the DMA, the negotiations over the DSA were dragging on. It became apparent that, despite bold promises to deliver an outcome on both regulations, the French Presidency would not be able to claim two successes. With the French Presidential elections looming, the incentive to wrap up what could be wrapped up grew. This was the chance for the Parliamentary negotiators to defend the mandate bestowed on them by the EP.

Hence a result that runs along the demarcation line between what the EP wanted and what the Council agreed to give. Yes, end users will enjoy more interconnectivity, but only if service providers request it from the gatekeepers. Yes, private one-on-one messaging will be available first, via text and the sharing of images, audio and video attachments, but groups will need to wait two years to benefit from it. Yes, calling and video-calling others will be possible, but only within 4 years. Yes, social networking could become interoperable, but only if the European Commission sees it as necessary to ensure contestability – and that at the soonest 3 years after the regulation enters into force.

No doubt the EP delegation fought hard and used the available opportunities to secure what they could on interoperability. Ideally the outcome would be stronger and extended to social networking, but considering the pressure from the Council and Big Tech’s lobbying on the issue, we couldn’t realistically have counted on more.

Stay tuned for the analysis of other provisions of the Digital Markets Act as adopted by the trilogues negotiators!

My Home Assistant Music Cube

00:02, Monday, 28 2022 March UTC

Last year, I spent $17 on an Aqara cube, and it’s been one of my best purchases for enjoyment per dollar spent.

I control my multi-room audio using a gyroscopic gesture-recognition cube -- yes, this basically makes me Iron Man.

The Aqara cube is a three-inch square plastic cube that sends gestures over Zigbee to a cheap off-the-shelf dongle.

By pairing this cube with Home Assistant, I have a three-dimensional button with 45 unique interactions to control whatever I want.

And over the last six months, I’ve used it to control a small fleet of antiquated streaming devices to help me discover new music.

🎭 The Tragedy of the Logitech Squeezebox

The Logitech Squeezebox is a bygone streaming device that was too beautiful for this world. Logitech snuffed the Squeezebox in 2012.

But because others share my enthusiasm for Squeezeboxes, there’s still hope. The second-hand market persists. And there are wonderful nerds cobbling together Squeezeboxes from Raspberry Pis.

Logitech Squeezebox fans

I built a DIY Squeezebox from a Pi Zero Pimoroni PirateRadio kit and Squeezelite software.

I blanket my humble abode in music by combining a DIY PirateRadio, a Squeezebox Boom, and a Squeezebox Touch.

My Dockerized Logitech Media Server perfectly synchronizes these three devices. Music from Spotify or WQXR is seamless when you walk from bedroom to kitchen to dining room.

🏴‍☠️ Pimoroni PirateRadio

Home Assistant is ✨magic✨

Home Assistant is open-source home automation software, and it’s the only IoT software I don’t find myself screaming at regularly.

And, of course, there’s a Logitech Squeezebox integration for Home Assistant. The integration lets you use Logitech Media Server’s (somewhat esoteric) API to control your devices from Home Assistant.

Home Assistant Squeezebox Lovelace Card

I also use a community-made Home Assistant Blueprint that automates each of the cube’s 45 unique gestures.

Mi Magic Cube in Home Assistant

Currently, since my mental stack is tiny, I only use four gestures:

  1. Shake: Turn on all players, and start playing a random album from Spotify (that’s right, album – I’m old enough to yearn for the halcyon days of Rdio).
  2. Double-tap: Turn off all players.
  3. Flip: Next track.
  4. Twist: Twist right for volume up; twist left for volume down – like a volume knob.

🧐 Why would anyone do this?

In a 2011 article, “A Brief Rant on the Future of Interaction Design,” Bret Victor describes touchscreens as “pictures under glass.” I loathe pictures under glass.

It’s impossible to use a device with a touchscreen without looking at it. And touchscreen interaction is slow – traversing a menu system is all point-and-click; there are no shortcuts.

Another alternative is control via smart speakers – devices literally straight out of a dystopian novel.

While the smart speaker is the closest thing to a ubiquitous command-line interface in everyday use, I’m too weirded-out to have a smart speaker in my house.

I’ve opted for a better way: shake a cube and music appears.

The cube is a pleasant tactile experience – shake it, tap it, spin it – it’s a weighty and fun fidget toy. Its design affords instant access to all its features – there is no menu system to dig through.

The cube is frictionless calm technology and it’s behaved beautifully in the background of my day-to-day for months.

Tech News issue #13, 2022 (March 28, 2022)

00:00, Monday, 28 2022 March UTC
2022, week 13 (Monday 28 March 2022)

weeklyOSM 609

10:06, Sunday, 27 2022 March UTC

15/03/2022-21/03/2022

Mapping campaigns

  • UN Mappers are going to map building footprints, supporting UNSMIL to ensure peace in Libya, on Wednesday 30 March from 06:00 UTC until 16:00 UTC. The tasks will be distributed via the tasking manager.
  • Andrés Gómez Casanova announced that the note-a-thon (a group activity solving local notes in Latin American countries) will be held on the first Saturday of each month from now on. The note-a-thon is registered as an organised activity. See the details (es) > en of this activity on the wiki. Events are coordinated (es) > en on Meetup.

Mapping

  • bgo_eiu reported on his unexpected insights gained while trying to improve the mapping of Baltimore’s public transit in OSM.
  • Counter-mapping the City was a two-day virtual conference about using mapping as a means of promoting the struggles of marginalised people in the Philippines.
  • dcapillae has started (es) > en an initiative with authorised mappers to improve the positions of recently imported recycling containers in Malaga (Spain). To support this he has created a special style usable in JOSM showing the different types of goods suitable for recycling.
  • User kempelen pointed out (hu) > en, once again, the importance of separating landuse=* and area=* from highway=*.
  • Voting is open until Tuesday 5 April for artwork_subject=sheela-na-gig, for mapping Sheela-na-gigs, stone carvings depicting nude women exposing their genitals, found on churches, castles, other walls and in museums.
  • Tjuro has finished their micro-mapping of the rebuilt station square north in Zwolle.

Community

  • Edoardo Neerhut is looking for work for Aleks (@Approksimator), who has lost his income due to the war in Ukraine.
  • OSMF Japan and Microsoft Corporation will hold (ja) a workshop (ja) > en on Soundscape and OSM on Wednesday 30 March. Soundscape is a 3D voice guidance application based on OSM data.
  • Amanda McCann’s February diary is online.
  • The Communication Working Group of OSMF has officially announced the new discussion forum for OSM (we reported earlier) in a blog post.
  • For International Women’s Day, GeoladiesPH had a three-hour mapathon focusing on women, called #BreakTheBiasedMap.
  • Chinese mapper 喵耳引力波 wrote (zhcn) > en a diary entry, in which they list all of the Chinese mappers who gathered to map and refine the mountainous terrain and roads after the MU5735 air crash, and guesses this may have been due to modern media allowing mappers to follow breaking news.

Events

  • The 8th State of the Map France (SotM-Fr) will take place at the University of Nantes from 10 to 12 June. The call for papers is open (fr) until Thursday 31 March.
  • The State of the Map Working Group revealed the logo for SotM 2022 in Firenze (Italy). They also published a number of key dates as follows:
    • Monday 25 April 2022, 23:59:59 UTC: Deadline for talk and workshop submissions
    • June 2022: Talk video production (test video and final video)
    • August 2022: Lightning talk video production
    • 19 to 21 August 2022: State of the Map.

Education

  • Corinna John described (de) > en in her blog how to create a ‘photo map’ using QGIS.
  • Youthmappers published their project results about how to connect volunteered geographic information and crowdsourced spatial data with government cartographers and geographers to better serve the public across the Americas.

Maps

  • [1] Sven Geggus has improved his OpenCampingMap. Now sanitary_dump_station, water_point and drinking_water are also displayed at zoom level 19.
  • Reporters from franceinfo used (fr) > en OpenStreetMap’s data for an article about the cost of commuting after the recent rise in oil prices.
  • Alex Wellerstein presented his OSM-based ‘nukemap’, which allows you to visualise the impact of a simulated nuclear detonation. Start with a tactical bomb (10 kt) and try the advanced options!

Software

  • lwn.net reported that there is an OpenStreetMap viewer for Emacs.
  • Organic Maps is participating in the Google Summer of Code 2022. Six ideas for projects are already available.
  • Kevin is the new maintainer of the Awesome OpenStreetMap list. He invites you to help make this list more awesome.

Releases

  • Last week we reported on release 17.0.4 of Vespucci. That release had a problem with the default templates, which can be solved by manually updating the templates.

Upcoming Events

Where | What | When
Perth | Social Mapping Online | 2022-03-27
| Open Mapping Hub Asia Pacific OSM Help Desk | 2022-03-28
Bremen | Bremer Mappertreffen (Online) | 2022-03-28
San Jose | South Bay Map Night | 2022-03-30
Ville de Bruxelles – Stad Brussel | Virtual OpenStreetMap Belgium meeting | 2022-03-29
| Open Mapping Hub Asia Pacific OSM Help Desk | 2022-03-30
Tucson | State of the Map US | 2022-04-01 – 2022-04-03
Hlavní město Praha | Missing Maps GeoNight MSF CZ online mapathon 2022 #1 | 2022-04-01
Burgos | Evento OpenStreetMap Burgos (Spain) 2022 | 2022-04-01 – 2022-04-03
Região Geográfica Imediata de Teófilo Otoni | Mapathona na Cidade Nanuque – MG -Brasil – Edifícios, Estradas, Pontos de Interesses e Área Verde | 2022-04-02 – 2022-04-03
Bogotá Distrito Capital – Municipio | Notathon en OpenStreetMap – resolvamos notas de Latinoamérica | 2022-04-02
| Open Mapping Hub Asia Pacific OSM Help Desk | 2022-04-04
| OSMF Engineering Working Group meeting | 2022-04-04
Stuttgart | Stuttgarter Stammtisch | 2022-04-05
Greater London | Missing Maps London Mapathon | 2022-04-05
Berlin | OSM-Verkehrswende #34 (Online) | 2022-04-05
| Open Mapping Hub Asia Pacific OSM Help Desk | 2022-04-06
| Tasking Manager Collective Meet Up – Option 1 | 2022-04-06
| Tasking Manager Collective Meet Up – Option 2 | 2022-04-06
Berlin | 166. Berlin-Brandenburg OpenStreetMap Stammtisch | 2022-04-08
| Open Mapping Hub Asia Pacific OSM Help Desk | 2022-04-11
臺北市 | OpenStreetMap x Wikidata Taipei #39 | 2022-04-11
Washington | MappingDC Mappy Hour | 2022-04-13
Hamburg | Hamburger Mappertreffen | 2022-04-12
San Jose | South Bay Map Night | 2022-04-13
| Open Mapping Hub Asia Pacific OSM Help Desk | 2022-04-13
| OSM Utah Monthly Meetup | 2022-04-14
Michigan | Michigan Meetup | 2022-04-14

Note:
If you would like to see your event here, please put it into the OSM calendar. Only data which is there will appear in weeklyOSM.

This weeklyOSM was produced by Lejun, Nordpfeil, PierZen, SK53, Strubbl, TheSwavu, derFred, Can.

Semantic MediaWiki 4.0.1 released

20:23, Thursday, 24 2022 March UTC

March 24, 2022

Semantic MediaWiki 4.0.1 (SMW 4.0.1) has been released today as a new version of Semantic MediaWiki.

It is a maintenance release providing bug fixes and translation updates. Please refer to the help pages on installing or upgrading Semantic MediaWiki to get detailed instructions on how to do this.

Wikipedia article, or essay?

16:14, Thursday, 24 2022 March UTC

In Wiki Education’s Wikipedia Student Program, college and university instructors assign their students to edit Wikipedia as part of the coursework. For most courses, this replaces a research paper or the literature review section of a longer analytical paper related to the course topic. But for some courses, Wikipedia also becomes an object of study for students.

That’s the case for New York University Clinical Associate Professor David Cregar, who teaches in the Expository Writing Program. His course, “We are not in a post-fact world”: Wikipedia & the Construction of Knowledge, focuses on both having students contribute to Wikipedia and contextualizing Wikipedia in the broader knowledge landscape. Student Audrey Yang fulfilled the second part of the assignment in a creative way — by creating what looks like a Wikipedia article, called “Wikipedia & the Structure of Knowledge” (but, it notes, “From Fake Wikipedia, not a real encyclopedia (but still free)”). Audrey’s reflection on how Wikipedia compares to the Library of Babel contains all the hallmarks of a Wikipedia article: a table of contents, edit buttons on section headers, citations, wikilinks, and even a talk page, complete with notes from her thought process as she wrote the essay!

This example of an end-of-term reflective essay is particularly fun and creative. Download this PDF to see Audrey’s work. And many thanks to Audrey and Professor Cregar for sharing it with us!