A New Research Roadmap For Addressing Knowledge Gaps

17:00, Thursday, 21 April 2022 UTC

The Wikimedia Foundation Research team shares an update to its 2019 Knowledge Gaps White Paper. With a continued commitment to knowledge equity, the update reflects on the past, present, and future of our research around Knowledge Gaps, provides a summary of our findings and contributions, and revises the roadmap for the next 3-5 years.

The update describes three main developments:

  • Guiding Principles, a set of principles guiding our research in knowledge gaps: knowledge equity, multi-project focus, community-driven research, machine-in-the-loop, inclusivity, privacy, and openness.
  • Consolidated Research Roadmap, consisting of three main directions: identify, measure, and bridge knowledge gaps.
  • Ideas for Future Research: big research questions, spanning a 5 to 10-year horizon, which we would like to share with the research community.

This post summarizes the key points of each development. For more in-depth information about our revised roadmap, please refer to the full update (or its pdf version on Commons), and stay tuned for follow-up posts! Please share suggestions and feedback on the talk page. We are looking forward to hearing your thoughts on our knowledge gaps research!

Principles Guiding Knowledge Gaps Research

  • Knowledge Equity. The Knowledge Equity principle is at the heart of the knowledge gaps research. Our aim is to provide tools and research to identify, measure, and address those knowledge gaps that prevent us from reaching knowledge equity.
  • Beyond one project. We conduct research for Wikimedia projects including but not limited to Wikipedia, such as Wikidata, Wikimedia Commons, Wikisource, and Wiktionary. 
  • Community-driven research. We are inspired and influenced by Wikimedia communities. Our aim is to develop research and technologies that work in harmony with community principles and mechanisms.
  • Inclusive research methods. Knowledge equity starts from the technologies we develop. To support a diverse community, our outputs are inclusive by design, and scalable across platforms, content types, languages, and cultures.
  • Machine-in-the-loop. Our automated systems are designed to play a supporting role for editors, empowering them and improving their capacity. Editors and other users should keep full control of whether to adopt or reject algorithmic suggestions. 
  • Privacy. Our research is guided by the respect for privacy that is core to all Wikimedia projects. 
  • Openness. Freedom and Open Source is a guiding principle of our research. We follow the Wikimedia Foundation’s Open Access policy.

Our 3-5 Year Research Roadmap: Identify, Measure, and Bridge Knowledge Gaps

Identify Knowledge Gaps

This direction focuses on developing systematic definitions of knowledge gaps and their context as the first step towards operationalizing knowledge equity. Our research is focusing on four main goals:

The Knowledge Gaps Taxonomy V2, by Miriam (WMF) – Own work, CC BY-SA 4.0
  • Define Knowledge Gaps. We developed and are continuously improving the first taxonomy of Wikimedia knowledge gaps, a structured, systematic list of inequalities in Wikimedia projects across readers, contributors, and content. 
  • Define Barriers to Knowledge. We plan to create a taxonomy of barriers preventing people from accessing free knowledge. 
  • Understanding Readers and Contributors. We are working on an in-depth understanding of reader motivation, navigation patterns, and curiosity, as well as contributor workflows, to help uncover and prioritize new and existing knowledge gaps.
  • The Role of Images and Multimedia. We are interested in studying the role of multimedia content in Wikimedia platforms, the extent of its presence, and its impact on navigation. 

Measure Knowledge Gaps

Once knowledge gaps are systematically defined, the next challenge is to develop ways to measure them. Our goals for this direction are as follows:

  • Quantify knowledge gaps. We defined high-level metrics, namely the tools, data, and logic needed to measure the knowledge gaps in the taxonomy, and tested some reader and contributor metrics. We are currently working on metrics for the multimedia gap and the readability gap.
  • Create snapshots of the state of gaps over time. We are currently deploying 5 demographic content gap metrics and are fostering the creation of new tools and research to systematically generate data about disparities in content, contributors, and readers.
  • Build a tool to explore knowledge gaps data. We are building the knowledge gap index, an accessible and user-friendly interface through which we hope to allow Wikimedians and researchers to monitor data about knowledge gaps, set targets, and track progress towards those goals.

Bridge Knowledge Gaps

This direction discusses ways to address Wikimedia knowledge gaps, informed by the systematic definitions and measurements described so far. We are working on two main research fronts:

  • Discover, prioritize, and tag Wikimedia Content. Operating in harmony with community practices, we are developing and encouraging research around tools for content recommendation (for example list building, link recommendation, image recommendation, and section recommendation), prioritization (based on article importance), and tagging (see topic filters) that can help communities address gaps.
  • Tools and systems to address knowledge gaps. Besides collaborating with WMF’s product teams to support the development of structured tasks, we are studying ways to promote equity and reduce biases through our recommender systems. We also developed a set of tools that facilitate the visualization and exploration of patterns in Wikimedia projects, and we continue our collaborations with teams inside WMF, the Wikimedia affiliates, and the developer community to build products that address knowledge gaps.

Beyond the 5-Year Horizon: Ideas for the Future

During our conversations, several ideas for research projects with a longer-term horizon emerged. We share them here in the hope of generating interest and awareness around research questions that are crucial for the future of Wikimedia projects.

  • Learning. One of the main motivations bringing people to Wikimedia sites is intrinsic learning. But are our content, tools, and recommender systems designed to foster and maximize learning?
  • A Model of Wikipedia’s Complexity. Researchers have long studied ways to model complex systems such as climate change or social networks, and these models have been shown to be extremely useful in understanding the underlying mechanisms and providing quantitative forecasts. Can we have a model that reflects Wikipedia’s processes and can help answer questions about knowledge gaps and imagine extreme scenarios?
  • Named Entity Recognition in images. Named Entity Recognition is a well-established NLP task that, given a piece of natural language text, such as a Wikipedia article, extracts semantic entities found in a knowledge graph such as Wikidata. The recognition of named entities in text has been widely explored in the natural language processing field. But how can we classify highly granular semantics in images?
  • New and External forms of knowledge. While most of our work focuses on understanding gaps in Wikimedia spaces as of today, Wikipedia and its sister projects are in continuous evolution. Wikimedia projects do not live in isolation: they are an essential part of the larger web ecosystem. How does Wikipedia address the knowledge gaps of the broader web, and what forms of knowledge does Wikimedia need to acquire in order to fulfill its role in the web?

Acknowledgements

The Research team would like to thank everyone who has supported us throughout the years with the Knowledge Gaps research and who has given love and feedback to the Knowledge Gap projects: thank you all.

Funds for this research are provided by donors who give to the Wikimedia Foundation, by grants from the Siegel Family Endowment, and by a grant from the Argosy Foundation. Thank you!

What We Learned from Trainsperiment Week

13:55, Thursday, 21 April 2022 UTC

Developers should own the process of putting their code into production. They should decide when to deploy, monitor their deployment, and make decisions about rollback.

But that’s not how we work at Wikimedia today, and we on Release Engineering aren’t sure how to get there, so we’ve decided to experiment.

Typically, a deployment takes us a full week to complete. During the week of March 21st, 2022, we deployed MediaWiki four times.

We called that week 🚂🧪Trainsperiment Week.

📻 Deployment frequency

MediaWiki's mainline branch is changing constantly, but we deploy MediaWiki weekly (kind of). We keep stats that measure how far our main branch is from production.

The trainsperiment changed our deployment frequency, which affected all the other metrics, too. Faster deployment means smaller batch size, and shorter change lead time.

📦 Change lead time

The number that we knew would change during trainsperiment week was change lead time—the time from merge to deploy. If I merge a change, then a minute later I deploy it, that change’s lead time is one minute.

This chart shows the average lead time of all patches in a given train:

The chart below compares a typical week (1.38.0-wmf.1) to trainsperiment week (1.38.0-wmf.2, wmf.3, and wmf.4). Each dot is a change in a particular version—fewer dots mean fewer changes.

During trainsperiment week, we deployed faster. Each deployment was smaller, and the lead time of each patch in a release was shorter.

Here’s the same data on a logarithmic scale. During trainsperiment week there were only a few hours between trains, so the lead time could be measured in hours, not days!
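To make the metric concrete, here is a minimal sketch in Python of the lead-time calculation described above. It assumes each patch has a merge timestamp and each train has a single deploy timestamp. The function names and the sample timestamps are hypothetical and are not taken from Release Engineering's tooling.

```python
from datetime import datetime, timezone
from statistics import mean


def lead_time_hours(merged_at, deployed_at):
    """Change lead time: hours elapsed between merge and deploy."""
    return (deployed_at - merged_at).total_seconds() / 3600


def average_lead_time_hours(merge_times, deployed_at):
    """Average lead time of all patches that shipped in one train."""
    return mean(lead_time_hours(m, deployed_at) for m in merge_times)


# Hypothetical train: one deploy, three patches merged at different times.
deploy = datetime(2022, 3, 22, 12, 0, tzinfo=timezone.utc)
merges = [
    datetime(2022, 3, 22, 11, 59, tzinfo=timezone.utc),  # merged 1 minute before deploy
    datetime(2022, 3, 22, 6, 0, tzinfo=timezone.utc),    # merged 6 hours before deploy
    datetime(2022, 3, 21, 12, 0, tzinfo=timezone.utc),   # merged 24 hours before deploy
]

print(f"average lead time: {average_lead_time_hours(merges, deploy):.2f} hours")
```

With a weekly train, a patch merged right after the branch cut can wait days before it is deployed. With only a few hours between trains, no patch in this sketch could wait longer than that gap.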

📝 Survey Feedback

At the end of the week, we asked for feedback via the Wikitech-l mailing list. We collected comments from the mediawiki.org talk page and the summaries of candid conversations.

👍 Satisfaction

A small number of people took the time to respond to the survey—20 people answered our questions.

Almost everyone who took the survey seemed satisfied with communication. Most were satisfied with the experiment overall.

There were concerns on the talk page and in the survey responses about testing. Testing felt time-crunched, and everyone was worried about the time pressure on our Quality and Test Engineering Team (QTE).

🌚 Impact

Fewer than half of our respondents felt that the Trainsperiment positively impacted their work, with one respondent strongly disagreeing that there was a positive impact.

Most people were neutral about the impact of this experiment on their work.

The person who felt that there was a negative impact was concerned about the lack of time allotted for testing—they urged us to rethink testing if we wanted to try this again.

💌 Comments

The survey contained free-form prompts for feedback. Below is a smattering of representative responses. Most of the comments below are amalgamations and simplifications, but the reactions in quotes are verbatim.

What should RelEng have done differently?

  • Automated alerts: emails whenever there’s a deploy or the train is blocked

What would you need to change if we did this every week?

  • No time to find and fix regressions means the QA process would need to change somehow
  • More transparency around when the train rolls out and a clearer blocking process
  • Translations
  • “my mental model.”

Other Feedback

  • “With less time between groups, breakage will reach all wikis very quickly”
  • “Often Tuesdays are currently used to deploy bug fixes that are hard to test locally […] we would need to revisit many of our workflows”
  • “This, at least on paper, will help devs”
  • “This was a pure win, IMO.”

🗣️ Conversations

We talked individually to people who had concerns about the experiment on Slack and IRC, in meetings, in the survey feedback, and on the talk page.

People were concerned about shortening the time for review. This is understandable given that we shortened a 168-hour process to a 12-hour process. 

Our QA process takes time. Our overburdened principal engineers take time to review code going live on a weekly basis. Due to some esoteric details, even our CI system gives us more confidence given more time—it was possible that MediaWiki could have broken compatibility with an extension without alerting anyone.

We have come to rely on the weekly cadence to make a careful release, and a faster process would mean rethinking our process pipeline to production.

🎀 Release Engineering's Feedback

The weekly train hides a lot of technical debt—it’s a giant feature flag and the missing testing environment rolled into one. It goes out every week (mostly), and Release Engineering spends about 20% of its time monitoring the release.

During trainsperiment week, we spent 100% of our time deploying—that’s not sustainable for our team.

We surfaced process pain points with this experiment, which was a success. We added to the already overlarge burdens of our principal engineers and quality engineers, which was a failure.

But this isn’t the end of the experiments. We endeavor to bring developers and production closer together—preferably with us standing back a healthy distance. If you’d like to help us get there—get in touch.


Thanks to @kchapman, @brennen, and @Krinkle for reading earlier drafts of this post and offering their feedback.

Activism symbol for Wikipedia. Image from Jasmina El Bouamraoui and Karabo Poppy Moletsane – Wikimedia Foundation, CC0 1.0, via Wikimedia Commons

On 8th April 2022, the United States government, in response to advocacy efforts by the Wikimedia Foundation and others, took a big step to protect the open internet for the people of Russia: it authorized US internet companies to continue providing essential internet services in Russia amid growing sanctions against the Russian government. Considering that the US is the country with the highest number of Internet Service Providers (ISPs), this has a critical impact on preserving access to the internet within Russia. This decision helps ensure that US companies do not cut off Russian Wikipedia editors and readers, independent media, and people across Russia (including those speaking out for human rights and against the war) from the internet and the free exchange of knowledge online.

In response to a letter sent by the Wikimedia Foundation, our partners at Access Now, and over 50 civil society organizations, the United States Treasury Department’s Office of Foreign Assets Control, which is the office in charge of enforcing US economic and trade sanctions, issued General License 25. These types of licenses allow US companies to continue providing essential services to people living in countries under sanctions to protect their human rights, including the right to free expression.

The license clarifies that providing telecommunications and internet services to people within Russia is not subject to the growing list of sanctions imposed since the Russian government illegally invaded Ukraine. The license is critical for providing certainty to US-based private companies that they can continue to supply internet access within Russia. Uncertainty had recently led several companies to cut off their services.

The Global Advocacy and Public Policy team took on this issue as part of broader efforts led by the Wikimedia Foundation to support the movement amidst the ongoing crisis in Ukraine. This has included direct support to affected individuals and affiliates, as well as working to ensure the free exchange of knowledge continues to be protected for people across the region.   

Any interruption to the internet’s backbone services would make it extremely difficult for Russian Wikipedians and Russian people to access accurate information overall, particularly about the invasion of Ukraine, and would threaten fundamental freedoms, such as the right to free expression. Moreover, the rest of the world would be cut off from hearing the perspectives of the people of Russia, including independent media in the country, on our projects. When one country cannot participate in the global conversation on Wikipedia and Wikimedia projects, the rest of the world suffers. 

When the United States, European Union, United Kingdom and other countries began imposing sanctions in the early days of the Russian government’s invasion of Ukraine, these focused on Russian oligarchs, companies, and important aspects of the Russian economy. The Wikimedia Foundation and our allies in civil society around the world were concerned that the governments imposing these sanctions were not doing enough to preserve access to telecommunications and to the internet within Russia. In particular, it was unclear whether transactions related to internet access were covered or not by United States sanctions against Russia. 

Why did the Global Advocacy and Public Policy team decide to write a letter to the US government about internet access in Russia?

The Global Advocacy and Public Policy team at the Wikimedia Foundation was worried that this lack of clarity could lead private companies in the US to stop providing essential communications services, like internet backbone services, to companies within Russia. Confirming the fears of the Wikimedia Foundation and our allies, companies like Cogent and Lumen, major US-based internet service providers, began to cease providing service to Russia in early March 2022.

What US government action did the Global Advocacy and Public Policy team advocate?

In the past, the US Treasury Department issued general licenses to clarify how government sanctions applied to businesses operating in the country. A general license is a type of formal authorization for companies, directing them on how they can do business in a particular country when sanctions are issued. For example, when the United States took similar actions against the Iranian government in 2013, it issued a license permitting companies to continue transactions related to internet access and telecommunications for citizens within Iran. 

Immediately following initial sanctions, the US Treasury Department issued a number of licenses pertaining to Russia and Ukraine, but not one specifically allowing for the continued provision of internet access. In the Wikimedia Foundation’s view, the lack of a general license for internet access was creating uncertainty surrounding whether US companies and payment processors could still provide those services to Russia without being in conflict with sanctions.

How did the letter with Access Now and others come together?

To urge the US government to action, the Wikimedia Foundation joined a group of allies, including Access Now and Human Rights First, and helped to draft a letter requesting that a general license for internet access to Russia be issued. The letter was signed by over 50 civil society groups, and was covered by news media organizations like Politico and The Washington Post.

What happened after the letter was sent?

While the US government responded saying they shared the goals of the letter and gave public statements clarifying that providing internet access was not a sanctioned activity, we persisted in our request that the US government issue a general license. It was important that the government not only say that US companies could continue providing services, but also why they should continue providing service within Russia. Last week, when the US Treasury Department met the letter’s demands and issued the general license, it was a tremendous victory for ensuring the internet continues to be protected for Russian people and accessible to them. It is more essential than ever that the free flow of information into and out of Russia continues uninterrupted.

What comes next?

We applaud the efforts that the US government made toward preserving access to a free and open internet. We urge the United States and other governments that have imposed sanctions in response to the illegal invasion of Ukraine to continue demonstrating their commitment to free expression around the world, and supporting the ability of everyone across all borders to participate in the free exchange of knowledge online.

We celebrate this important achievement while still recognizing there is much to be done to support the movement through the ongoing crisis. We will continue to help shape important policy decisions that ensure the Wikimedia Foundation and movement can continue to provide free knowledge to the world. If you’d like to learn more about how you can support affected communities, check out our post about solidarity with Ukraine on Diff.   

Volunteers have contributed to the Kyrgyz language Wikipedia by adding and editing information about 100 women of Kyrgyzstan.

The overall theme of WikiGap is closing the gender gap and other gaps relevant for diversity on Wikipedia. The aim of this project was to bridge the gap in information about the achievements of Kyrgyz women in the Kyrgyz language Wikipedia as well as promote gender equality by raising awareness about women who have contributed significantly to various spheres such as politics, culture, sports, journalism, and diplomacy among others.

This project was implemented by the Media Policy Institute, an independent non-profit organization, with the financial support of the Embassy of the Netherlands in Kyrgyzstan. The WikiGap Women of Kyrgyzstan 2022 project was coordinated by Ainura Yeshenalieva of the Media Policy Institute. The work of the volunteers was supervised by Aigul Omurkanova, Ph.D., Associate Professor of the Department of International Journalism at Kyrgyz-Russian Slavic University, and Aliya Alisheva, Ph.D., Acting Associate Professor of Civil and Family Law at the Kyrgyz State Law University.

Banner of the project by Tatyana Zelenskaya under CC BY-SA 4.0

From November 2021 until March 2022, 30 volunteers out of 100 that were selected through a competition added new names to the Kyrgyz-language Wikipedia and updated the already available information about women of Kyrgyzstan. In March 2022, the Media Policy Institute held the final presentation of the WikiGap Women of Kyrgyzstan project and presented certificates to the volunteers.

Volunteer receiving certificate by Media Policy Institute under CC BY-SA 4.0.
Here you can find more photos from the final presentation of the project.

“Most of the volunteers are girls; it is important to highlight that in the process of writing about other women, they also got inspired. They not only want to continue editing Wikipedia, but also want to be like these prominent women whom we wrote about” – said coordinator of the project Ainura Yeshenalieva in her conversation on Вечер Трудного дня.

Asked why 30 volunteers were selected to participate in this project, Ainura Yeshenalieva said: “When assessing the applications, we looked at their writing and analytical skills. Initially we invited about 40 participants to our trainings, some dropped out. Thirty volunteers continued working with supervisors and consultants. Our consultants organized trainings both online and offline. Some of the main topics were on how to create an article on Wikipedia, how to find reliable sources of information, and how to work with photos”.

One of the supervisors, Aigul Omurkanova, said that “there are not many photos on Commons from Kyrgyzstan and many articles about women were left without photos. The author of the photo needs to upload it themselves. I want to remind people, if you have photos from your trips in Kyrgyzstan, please do share them on Commons”.

Earlier, in 2019, a similar project was implemented in neighboring Kazakhstan.

“I’m not a magician, I’m just learning. When it comes to the quality of the Kyrgyz-language Wikipedia, it’s not just about adding articles and applauding each other and celebrating how good we are. Professional editing and continuous feedback on how things should have been done better to improve the quality is also vital. The Kyrgyz-language Wikipedia is still lagging behind in this regard” – highlighted Aigul Omurkanova in her speech.

The list of information about women added as part of the WikiGap project can be found on the Kyrgyz Wikipedia.

Here you can find out more about WikiGap and check this link to join the WikiGap2022 Challenge to help strengthen Wikipedia’s coverage of women and related topics in as many languages as possible.

Tech/News/2022/16

21:00, Wednesday, 20 April 2022 UTC

Latest tech news from the Wikimedia technical community. Please tell other users about these changes. Not all changes will affect you. Translations are available.

Changes later this week

  • The new version of MediaWiki will be on test wikis and MediaWiki.org from 19 April. It will be on non-Wikipedia wikis and some Wikipedias from 20 April. It will be on all wikis from 21 April (calendar).
  • Some wikis will be read-only for a few minutes because of a switch of their main database. The switch will be performed on 19 April at 07:00 UTC (targeted wikis) and on 21 April at 07:00 UTC (targeted wikis).
  • Administrators will now have the option to delete/undelete the associated “Talk” page when they are deleting a given page. An API endpoint with this option is also available. This concludes the 11th wish of the 2021 Community Wishlist Survey.
  • On selected wikis, 50% of logged-in users will see the new table of contents. When scrolling up and down the page, the table of contents will stay in the same place on the screen. This is part of the Desktop Improvements project. [1]
  • Message boxes produced by MediaWiki code will no longer have these CSS classes: successbox, errorbox, warningbox. The styles for those classes and messagebox will be removed from MediaWiki core. This only affects wikis that use these classes in wikitext, or change their appearance within site-wide CSS. Please review any local usage and definitions for these classes you may have. This was previously announced in the 28 February issue of Tech News.
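For wikis that want to review local usage of the classes named in the last item above, a small script can help find them. The following is a minimal sketch in Python that scans a block of wikitext or CSS for those class names. The function name, the sample text, and the matching rule (whole class names only) are assumptions made for illustration; this is not an official MediaWiki tool.

```python
import re

# Class names taken from the announcement above; messagebox is included
# because its styles are also being removed from MediaWiki core.
REMOVED_CLASSES = ("successbox", "errorbox", "warningbox", "messagebox")

# Match a class name only when it is not part of a longer identifier
# such as "my-errorbox-wrapper".
_PATTERN = re.compile(
    r"(?<![\w-])(" + "|".join(REMOVED_CLASSES) + r")(?![\w-])"
)


def find_removed_classes(text):
    """Return (line_number, class_name) pairs for each match in text."""
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for match in _PATTERN.finditer(line):
            hits.append((line_number, match.group(1)))
    return hits


# Hypothetical sample mixing wikitext and site-wide CSS.
sample = "\n".join([
    '<div class="errorbox">Something went wrong</div>',
    ".warningbox { color: red; }",
    '<div class="my-errorbox-wrapper">unrelated class</div>',
])

for line_number, class_name in find_removed_classes(sample):
    print(f"line {line_number}: {class_name}")
```

A scan like this only flags candidates. Each hit still needs a human to decide whether the local style or template should be updated.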

Future changes

Tech news prepared by Tech News writers and posted by bot.

From the lush green hills of the historical site of San Andrés, El Salvador, to the overwhelming waterfalls of Murchison Falls National Park, Uganda, to the captivating temple of David in Jerusalem, Israel: wherever local restrictions allowed, people have gone out again to capture their wonderful surroundings and share them with the world through the largest photo competition in the world, Wiki Loves Monuments.

As part of the competition, photographers donate their images to Wikimedia Commons, the free repository that holds most of the images used on Wikipedia and the other Wikimedia projects, helping to document the world’s cultural wonders for generations to come.

With the events going on in the world, it is important to stay aware that there is a deadline to capture the world around you. We consider it our responsibility to make people aware of its beauty, and that it deserves to be shared through the eyes of the people that experience it every day. We make the commitment to create the visuals that can illustrate the stories of these places and bring them to life, and invite you to help expand and maintain the treasure trove of the cultural heritage we all share in the upcoming 2022 edition.

In this twelfth edition of Wiki Loves Monuments, 37 countries joined together in the contest. We received over 172,000 entries from 4,914 uploaders, and welcomed 3,198 first-time contributors. National winners were selected from all entries and submitted to an international jury of experts. This jury assessed, considered, and ranked the 339 national winning photos based on our usual criteria: usefulness for Wikipedia, technical quality, and originality.

Please enjoy the following profiles of the 2021 Wiki Loves Monuments winners, who were announced today. This year’s international winners come from 11 different countries, including multiple winners from Poland, Ukraine and India.


First place: Although Donatas Dabravolskas lives only 15 minutes away from the Royal Portuguese Cabinet of Reading in Rio de Janeiro, Brazil, he only came to know about the place from a social media post, which motivated him to visit and take a bunch of photographs. As said by jury members, this winning picture of WLM 2021 has “amazing symmetry” and “conveys greatness and true scale of this immense library.” Photo by Donatas Dabravolskas, CC BY-SA 4.0.

Second place: Damian Pankowiec, a professional photographer who has been in the field for over a decade, believes that it is important to “capture the subject at the right time of day and the weather”. One jury member praised this stunning picture of Saint Roch chapel in Krasnobród, Poland, saying that the “best way to help a modest, otherwise unspectacular monument in gaining value in a picture is looking for the proper season; and for this picture autumn was chosen very well,” and as another member says, “it creates the sensation to be there.” Photo by Damian Pankowiec, CC BY-SA 4.0.

Third place: Along with his friends, Zysko Serhi travels around Ukraine to showcase its beauty to the world. Far from his home, Serhi captured a “colourful interior shot” of Samchyky palace in Starokostiantyniv, Ukraine. Photo by Zysko Serhi, CC BY-SA 4.0.

Fourth place: Małgorzata Pawelczyk captured the Stefan Czarniecki Monument in Tykocin, Poland, as the “sun rays were shining through the clouds and the fog.” Her motivation to submit this photograph to WLM was to “create awareness of the beauty of architecture even in small towns.” A jury member praised it as a “great use of the weather and the time of day to isolate the subject within its surroundings.” Photo by Figoosia, CC BY-SA 4.0.

Fifth place: Basavaraj M, who has been into photography for about four decades with a special interest in architecture, presents us with a picture of the Gali Mantapa in his hometown, Chitradurga in Karnataka, India. The photograph is also the winner of WLM 2021 in India. A jury member described it as a “good combination of stone architecture with natural rocks.” Photo by Basavarajmin21, CC BY-SA 4.0.

Sixth place: During the pandemic in 2020, Daniel Horowitz, along with his wife “rented a secluded cabin near Lake Champlain in Vermont, hundreds of miles from home.” Since his childhood, he has always been drawn to old buildings, ruins and abandoned places. Towards the end of their trip, as he was taking pictures, Horowitz wandered off (as he usually does), to capture Fort Crown Point in New York. It was a British fortress along the western bank of Lake Champlain, built in 1759 to defend against French forces and partially destroyed by fire in 1773. Photo by Daniel M. Horowitz, CC BY-SA 4.0.

Seventh place: Ekaterina Polischuk photographed the Dormition (Uspensky) Cathedral in Kharkiv, Ukraine, on a “rare sunny winter day”, in “extreme cold and the snow on the roofs added some neatness to the whole scene”. Along with being a landscape photographer, she is also a drone pilot. A jury member remarked of the photo: “the tower in the foreground creates a dominant look and great perspective towards the depicted monument; lines are converging towards it and draws the eye into the shot. Excellent day for such an image; snow on top of the golden tower is like icing on the cake.” Photo by Ekaterina Polischuk, CC BY-SA 4.0.

Eighth place: On a weekend walk to Ross Castle with his family, Mark McGuire happened to perfectly capture “a series of fortunate events” in one frame, rather than just the monument. In McGuire’s words, “For the ducks, boat and sunset to all align at the same time was pure chance and one of these rare moments that is unlikely to be replicated.” It was also a jury member’s favorite “by its composition” which blends, “architecture, nature and human elements.” Photo by Markiemcg1, CC BY-SA 4.0.


Ninth place: On a cloudy day with strong winds, Hadi Dehghanpour, along with three “photographer friends”, traveled almost 400 km to “one of the unique and special attractions of Khorasan Razavi”, the Windmills of Nashtifan. A jury member remarked that “the middle of a frame adds to the magnitude of the structure.” Photo by Hadidehghanpour, CC BY-SA 4.0.

Tenth place: Yashin, who has been into photography for about seven years now, presents us with a beautiful image of the Intercession Church in Posevkino, Voronezh Oblast, Russia. Photo by Yashin.v, CC BY-SA 4.0.


Congratulations to all the winners, and our thanks go out to everyone who participated! If you’d like to see more photographs like these, see the 2021 winners from Wiki Loves Earth, a similar photo contest that aims to document the world’s natural heritage. Be sure to check out the national winners from each of the participating countries in Wiki Loves Monuments 2021, and last year’s international winners, too!

For more information, including how to join this year’s contest, go to wikilovesmonuments.org. Share your favorite winning images on social media using #wikilovesmonuments.

The Unreasonable Fight for Municipal Broadband

17:31, Wednesday, 20 2022 April UTC

I love Longmont’s municipal broadband, but we had to fight Comcast every step of the way to get it.

NextLight—Longmont’s Broadband

As a gracious person, I’ll dutifully pretend that the problem might be on my side of the Zoom call. But the problem is never me—my internet is just too good.

Since 2016, I’ve been using Longmont’s municipal broadband—NextLight—and it’s been objectively awesome.

  • It’s fast—1Gbps symmetric
  • It’s cheap(ish)—$49.95 per month
  • It’s rock-solid—I’ve never had an outage
  • I’ve never hit a data cap
  • PC Magazine ranks NextLight among the top five ISPs in the United States every year—often besting Google Fiber

And, even better, the city insists I’ll pay $49.95 forever.

But Longmont spent more than a decade fighting Comcast to provide this excellent internet. And any city working on its municipal broadband offering should prepare to do the same.

🎉 Canceling Comcast

In 2016, I was paying Comcast $150 a month for their top-tier (at the time) 100Mbps speeds. I could usually eke out a little more than 30Mbps on a good day.

I leapt at the chance for gigabit internet. I signed up for NextLight as a charter member, locking in $49.95/mo for life (the current price is only $69.95, last I checked). And I took the opportunity to upgrade my $20 router from 2008 at the same time.

My new fancy router—Ubiquiti EdgeRouter Lite + EdgeSwitch 16-POE

And when I returned my cable box to Comcast to cancel my service, the representative felt compelled to counsel me: “NextLight, huh? You know,” he said, leaning in, “if you miss even a single payment, they’ll raise your price?”

“I’ll take my chances.”

I’ve been automatically billed $49.95 every month since, and this is what my speed looks like this morning:

930Mbps is not quite 1Gbps, but I’ll take it

🥺 Why can’t we have nice things?

Big cable companies suck. Big cable companies burned hundreds of thousands of dollars to stop Longmont’s municipal broadband.

In 2005, Comcast and CenturyLink rammed through the egregious Colorado SB-05-152, prohibiting municipalities in the state of Colorado from offering telecommunication services.

Longmont had to hold two referendums on the measure—one in 2009, which failed, and another in 2011, which passed:

In 2009, “No Blank Check Longmont” (Comcast/CenturyLink) spent $250,000 to dash our dreams of municipal broadband. They framed it as a choice between fast internet vs. police and firefighters.

In 2011, “Look Before You Leap Longmont” (Comcast/CenturyLink) spent $300,000 urging us to rethink our municipal broadband plans. They stood in lone opposition to our unanimous city council and our local paper.

Comcast spent more than $500,000 in a tiny city of fewer than 100,000 people. You can be sure Comcast will do all this again in a heartbeat.

📓 Lessons

Rather than use their vast resources to improve their service, Comcast will spend big to ensure they never have to compete.

Let Longmont be a lesson. In 2011, Longmont won because it formed an honest citizens’ advisory group: Longmont’s Future. Longmont’s Future got the word out about the vote on Facebook, its website, and the local press.

And ever since, Longmonsters (that’s right—Longmont’s demonym is “Longmonster”) have chosen NextLight over competing services.

Real competition won. Fuck Comcast. Long live municipal broadband.

In my last two posts, I explained why social changers (Part I) and movement organizers (Part II) are key to recruiting the potential participants in our movement and fulfilling Wikimedia movement strategy. In this post, I want to explain a bit about why we have been focusing the WikiForHumanRights campaign (join today) on the environmental themes associated with the Right to a Healthy Environment.

Community building to include new knowledge allies and build a prosperous movement requires finding intersectionality between outside movements and our own. Right now there is a precious moment to meet one movement, the climate movement, in a way that both highlights some of our most glaring knowledge gaps and recruits from a diverse ecosystem that can help us grow our communities.

The climate movement has erupted in the public scene with the leadership of young and indigenous people, global south activists and women. They are a digitally enabled movement seeking to create a more just, livable world through knowledge and community. Members of the Wikimedia movement have been paying attention to this discourse and reflecting on their own role in communicating the climate crisis and the need for sustainable development using Wikimedia tactics. From Wikimedians for Sustainable Development to #WikiForHumanRights, our early attempts to engage this public suggest the time is right to connect them with the opportunities afforded by Wikimedia platforms.

As I outlined in the previous post, we need to create active paths for outside movements to join the Wikimedia community. Wikimedistas de Uruguay and Wikimedia Argentina are piloting a course on editing Wikipedia for climate communicators. Climate activists the world over are looking for capacity development opportunities like this.

I think, given some time and space and support, the broader sustainability topic area will be as significant to the movement’s growth as existing thematic networks focused on gender, education, culture and heritage (built, living and natural, esp. the Wiki Loves campaigns), and library outreach. Like each of these areas, the climate crisis affects a global public who needs us to create persistent, reliable, multilingual documentation. How, you might ask? In this essay I will describe what is happening in the climate movement and climate communications, where I think Wikimedia platforms have a clear opportunity to invite a new public to our community, and where I think we can start to make that change.

Why Climate and Sustainability now?

In the past few years, the climate movement and public awareness of climate change have swelled. This is happening the world over; youth and other activists are inspiring whole generations of adults shocked by the impacts of climate change to join what used to be niche activism. The climate movement has been growing rapidly through major activations like global climate marches and the youth movement, catalyzed by leaders like Greta Thunberg, Vanessa Nakate, Disha Ravi and other Fridays for Future organizers.

Why are they able to change the conversation? 

  • First, the science has become increasingly clear: society as a whole is running full steam at a crisis that it doesn’t have the political will to solve or adapt to. This lack of will is largely the result of three decades of corporate misinformation stalling political action. The Intergovernmental Panel on Climate Change (IPCC) review of the impacts of climate change is very grim, and clear: humanity is not preparing quickly enough for what is coming.
  • Second, the effects of climate change are becoming increasingly real for everyone. One of the main challenges in mobilizing large-scale action was that, when the science of climate change became clear ~30 years ago, it seemed like a far-off problem. Not anymore. If you are looking, new disasters are happening almost every day, and at least once a week a major one rises to center stage of most European and North American news cycles (more on this perspective bias below).
  • Third, leadership from the youth and wider environmental movements has fully embraced an intersectional coalition that elevates marginalized communities, and tries to solve problems from a truly global, multilingual, multi-contextual perspective focused on equitable futures. The environmental movement is now focused on not just conservation or preservation of the natural world, but also sustainable development and environmental justice.

The increasingly visible impacts of the climate crises on communities around the world have made the climate movement grow in intersectional and multidimensional ways. We are entering a world of increasing complexity, and many people are looking for impactful actions they can take; contributing to Wikimedia should be one of them.

I can’t emphasize enough how confusing and complicated the world is going to get in the next few decades as the crisis grows more visible. What we are about to experience (and you probably are already experiencing) is unpredictable but very foreseeable. The Working Group II report from the IPCC recently documented a future that, even if we reduce carbon emissions, includes (among many others):

  • Harm and death from heat waves, droughts, flooding and disease.
  • Havoc on supply chains causing widespread economic suffering and food shortages.
  • Severe damage to communities in the global south and other marginalized spaces without the resources to adapt.
  • Destruction of cultural heritage, communities, buildings, and livelihoods.
  • Extreme stress on both land and water ecosystems.

At the same time, most of the technologies and policy frameworks we need to both reduce carbon emissions and adapt to these situations are readily available. This crisis is preventable with the knowledge that experts have been gathering for the last few decades. A successful reduction of human suffering requires billions of small, informed decisions at every level of society from the international diplomats down to the “smallest possible policy maker,” as described by Anna Grijalva from UNDP Ecuador, at the level of local jurisdictions: grocery store managers, construction engineers, local politicians, and farmers. 

Where could we focus? How do we play a role?

Cool; so the world is going to have some massive problems. Why is Wikimedia particularly well suited to be involved in this solution? 

At the most basic level, Wikipedia matters because the general public is beginning to see how our work affects the public narrative about climate change more generally, and when we get it wrong, it has the potential to be harmful to vulnerable communities.

We are probably one of the biggest sources of information on climate change in the world, with more than 324 million annual pageviews across over 25,000 explicitly-about-climate articles in nearly every language Wikipedia, along with billions of pageviews on other climate-connected pages. Probably the only other website that has comparable public impact is NASA’s, which is published in English only and has a United States bias. (Let me know if you can identify another large one.) As the IPCC has communicated, climate communication is one of the key tools for addressing the current crises.

But I also think our role in communicating the Climate Crisis matters because we have an opportunity to align our future with the environmental movement’s emerging priorities for public knowledge: it’s good for us, our mission and the future of public knowledge.

Synergy and capacity for fulfilling our mission

The intersectional nature of the climate crises actually highlights some of the most glaring gaps in the Wikimedia movement: both knowledge gaps and community gaps. This intersectionality is both in terms of the topical dimensions (since climate change impacts practically every sphere of life) and in terms of relevance to marginalized communities, for whom climate justice and action is increasingly a focus.

There are dozens of areas of knowledge within climate and sustainability topics, and these overlap heavily with huge gaps on Wikimedia projects. For example, approximately 1 billion people make their livelihoods from agriculture, and the food industry employs another billion or so. The triple planetary crises (the climate crises, along with pollution and biodiversity loss) directly affect our food systems.

As far as I can tell, Wikipedia’s coverage of agriculture is terrible (please prove me wrong), and our coverage is only slightly better when it comes to food culture (thank you WikiCheese, The Levant Food Photo Contest, Wiki Loves Food in India and other similar initiatives!). Working on climate topics through agriculture, for example, would make our platform more valuable to a major portion of the global population not using us and allow us to do what we do best (document cultural practices under threat).

A picture of women harvesting seaweed in Tanzania, part of a series of photos that won Best Photo Essay in Wiki Loves Africa 2017. Seaweed farming is an important climate adaptation for many parts of the world, but is only documented in 9 languages.

Similar Wikipedia content gaps with climate change overlaps include:

  • Cities and other populated places, especially in the Global South
  • Bodies of water and natural reserves under threat
  • Public infrastructure topics related to the Sustainable Development Goals, such as power stations, water infrastructure, etc.
  • Action and stories by indigenous communities, underrepresented people, and women seeking climate justice and human rights defense. Hundreds of activists die every year fighting for visibility of these issues in contexts where we have communities.
  • Industries and economic practices implicated in the radical economic changes required for a sustainable future.

And those are just the most glaring gaps that are well researched by the communities of experts and journalists focused on the climate crises — there are many more. In my volunteer time, I have documented some of the other gaps I have found over on English Wikipedia’s WikiProject Climate Change.

The Sustainable Development Goals and their many climate intersections are also the opportunity of the moment: there is a groundswell of organizations focused on evidence-based communication, and international and local funders looking for high-impact projects. As part of my work at the Wikimedia Foundation and participation in Open Climate, I have observed a number of funding opportunities, job openings and potential partnerships well within the Wikimedia movement’s scope go unsupported. Or I have seen projects implemented in such a way that they will not have lasting value for the public. Organizations are even getting funding to work on Wikimedia without many movement connections.

Institutions around the world are looking for ways to communicate and educate the global population about climate change and sustainability, and there are not enough organizations ready to provide high-impact, truly global projects. Projects like WikiLovesSDGs and Wiki4Climate, alongside larger calls to action like WikiForHumanRights, barely scratch the surface of the collaborations we could grow (for a full list of documented activities, see Meta).

We have a huge multilingual advantage

If you are primarily an English speaker and have been paying attention to the news at all in the last few years, you might find what I am saying rather boring: the causes and impacts of climate change, and our ability to address them, are rather normal conversation. In the English-speaking world, climate communication is at least readily accessible and making progress against disinformation.

This simply is not true in many other languages; reputation in academic communities is frequently connected to Global North institutes and English, and the career incentive for academics is to disseminate their scholarship in English, or at best with limited translation. Even larger languages are full of communication gaps around the climate crises, like Spanish (in media and public sectors), Portuguese (Brazil and Portugal) and Arabic. This is not a crisis only solved by highly educated multilingual experts, but the current science communication environment assumes that. 

Several parts of the youth movement feel like they have to solve the language gap problem. The youth movement has become very focused on developing isolated blogs, insider communications and social media assets for TikTok and Instagram. They are frequently missing the key long-term, persistent, multilingual and broad-public content that we offer. Moreover, these platforms are susceptible to spreading harm, including spreading misinformation and increasing the mental health burden of climate on youth.

We know how to make lasting, multilingual, culturally appropriate, high impact content — this is our expertise. If we could harness just a fraction of the energy that thousands of climate communicators are going to put into translating content for local audiences, our smaller language communities could explode with participants. 

Anticipating confusion, creating information for decisions 

A recent spike in pageviews on the “Climate change in South Africa” article on English Wikipedia, at around the same time that a report was published by the government there.

The dire impacts of the climate crises that I describe above are going to create a lot of confusion and questions among our readers. At the very least we can learn from our previous experiences with other dire breaking news, and how readers rely on our coverage. I think we can get a small sample of what this might look like in the recent heartbreaking experiences of our Russian-language and Ukrainian-language communities (see coverage in the Signpost). Our readers who are interested in the current war, in both local and other languages, are benefiting from two decades of work, as a movement, creating content about the Central and Eastern European (CEE) region.

As global public attention pivoted towards the war, every Wikipedia was flooded with reader attention on topics related to CEE, with readers clicking through to all kinds of context (on English Wikipedia for example). We have seen this kind of experience of breaking news bringing attention before: the farmers’ protests in India, the beginning of the COVID pandemic, the Arab Spring, the invasions of Iraq and Afghanistan by the United States, etc. Each time humans experience a sudden burst of unexpected crises, a certain segment of the population will be exploring Wikimedia content.

The climate crisis is going to be shocking.

From a more proactive, and less reactive perspective, we need to be helpful for the billions of decisions needed for humanity to endure the climate crises. Here are just a few decisions we could influence: 

  • Members of the public and legislators interested in new laws to protect ecosystems against destruction (for example, this recently trending topic in Argentina)
  • A farmer’s choice about whether to grow corn instead of other more climate tolerant crops.
  • A local council-person who received a briefing on how sea level rise will destroy their local economy, trying to figure out the concept of managed retreat that was mentioned by one of the academics at the briefing. A study of policy makers in Argentina, for example, found that most of them didn’t understand the local impacts of climate change.
  • A tourist reading about their next vacation destination who realizes that international tourism rarely benefits the local community and chooses instead to find something more sustainable in their own community.
  • A solar power investor learning about the human rights issues related to cobalt and lithium mining, and needing to purchase a slightly more expensive form of energy storage that uses fewer conflict minerals, like pumped water or an iron-oxide battery system. Every year, hundreds of human rights defenders are killed over projects like these, and public knowledge is still scarce.

The questions that we are going to need to answer about the environmental crises abound; they are “glocal” (both global and local) and multilingual in nature, and Wikipedia isn’t providing those answers.

Building a future for sustainability and the environment in our movement? 

An event recently organized with the community from South Sudan as part of the #WikiForHumanRights campaign. Community groups around the world have found enthusiastic partners.

Every time I introduce Wikimedia organizers or other open movement community members to the expansive sustainability-focused opportunities, both in topics and audiences, they immediately find actionable contributions. Also, when I talk to members of the environmental movement, they too have light bulb moments: “oh really, I didn’t realize Wikipedia’s potential for our mission”. The public needs our movement, our public knowledge mission, and our multilingual and multidisciplinary content to address the triple planetary crises.

To create this space, to invite the kinds of organizers and editors that I described in the last two posts, where should we start? We might find some inspiration from other Wikimedia outreach communities. We can look at various parts of the movement that have become well organized around Gender, Education, Medicine or GLAM, and apply some of what they’ve learned about building thematic communities.

I think there are some really obvious steps: 

  • We need to convene the parts of our movement interested in these topics, and find ways to work together in focused collaboration — the WikiWomenCamps were critical in 2012 and 2017 for forming the Gender Organizing network and practice. 
  • We need to grow the infrastructure on-wiki (like WikiProjects) for guiding new contributors to the most impactful content.
  • We need to build better community spaces for digitally convening and coordinating the network of volunteers  (like Wikimedians for Sustainable Development). 
  • We need to create spaces for community members to learn the language and the issues of the climate and sustainability movements. The GLAM-Wiki community, for example, has had several key moments of community growth when the movement and professional communities have actively talked with each other.
  • We need to study how our content is a source of sustainability misinformation, and where we have systemic knowledge gaps that we can address.
  • We need to pilot and document more collaborations with the climate movement to advance their communications goals — there is a passionate, well educated, and communications-savvy environmental movement growing in almost every part of the world that includes career activists, youth leaders, scientists and journalists. We can recruit this public.

Do you want to get started? What can you do? 

Despite how overwhelming the environmental crises are, I believe Wikipedia and the Wikimedia movement are powerfully placed to create space for optimism, and to help humanity make decisions for a healthy environment for future generations. In doing so, we can attract diverse new publics to our public knowledge movement: it’s a win-win opportunity.

How Smart is the SMART Copyright Act?

15:36, Wednesday, 20 2022 April UTC
Copyright spelled out on a keyboard. Image by Dennis Skley, CC BY-ND 2.0, via Flickr

During March 2022, United States Senators Patrick Leahy and Thom Tillis introduced the Strengthening Measures to Advance Rights Technologies Copyright Act of 2022 (SMART Copyright Act). The bill is deceptively simple. It would require the Library of Congress to mandate that online platforms use certain “technical measures” (i.e., automated systems) to identify infringing content. Its simplicity masks its dangers, however. For that reason, though the Wikimedia Foundation agrees that technical measures to identify potentially infringing works can be useful in some circumstances, we sent a letter (reproduced below) on 19th April 2022 to the bill’s sponsors letting them know that we oppose it. 

Under the SMART Copyright Act, the Foundation and Wikimedia communities could be forced to accommodate and implement technical tools to identify and manage copyrighted content that may not be right for Wikimedia projects. This requirement could force the Foundation to change its existing copyright review process, even though the current process is working very well. 

Currently, content contributed to Wikimedia projects must be available through a free knowledge license, in the public domain, or subject to some other limitation on copyright protection. The Foundation and our communities mostly rely on Wikimedia editors to figure out whether particular content complies with the rules. These editors do use certain automated technical measures to help them, but the decisions about which measures are appropriate and what content requires action are theirs. In addition, the Foundation accepts requests to remove content under the Digital Millennium Copyright Act (DMCA). Because the user policies and review systems are extremely effective, the number of DMCA takedown notices the Foundation receives is very small, and many are not granted. For example, our last transparency report shows we received only 21 total DMCA notices between July and December 2020 (as compared to the nearly 150,000 received by Facebook). We granted only 2 of them, which indicates that the other 19 were inappropriate or defective in some manner.  

If the SMART Copyright Act forces Wikimedia projects to use inappropriate tools or to substitute inappropriate tools for our existing copyright enforcement process, we are concerned it will make our copyright enforcement worse. The SMART Copyright Act, like other proposals before it, puts too much faith in artificial intelligence and automated tools as the only solution to infringement. While we fully agree that tools can be a helpful aid in identifying infringement, they should not be considered as a fix for all enforcement problems. There are two main reasons for this:

  1. Technical tools are not good at determining when a work was “fairly used” or when a work has entered the public domain. This flaw leads to inappropriate censorship. Even YouTube’s Content ID identifies numerous false positives for infringement, and fails to catch a significant amount of problematic content. We worry that such tools would do far worse than the Wikipedia non-free content policy enforced by users.
  2. Technical tools are often developed and owned by one company, and are not open source or freely available. If specific tools are mandated by the Copyright Office, this would make it difficult for smaller companies and nonprofits to use them without becoming overly reliant on those companies.

The SMART Copyright Act tries to address these concerns by requiring the Librarian of Congress to implement a process to take input from a broad range of stakeholders. The problem with this approach is that large rights holders and large platforms are very likely to dominate the process, since these organizations can devote more time and staff to the proceedings. Lost in or absent from the debate will be small platforms, nonprofit platforms, and—most concerningly—the public and the creative community that relies on free knowledge protections to flourish. This will likely produce designated technical measures that fail to take into account the diversity of information, formats, forums, and platforms as well as the impacts that deploying these measures could have on various kinds of information, formats and platforms.

Online platforms should be free to use the processes and technical measures that are most appropriate for their individual formats and communities. Appointing a government agency to dictate which technical measures platforms must use will likely lead to censorship of legal content. It could also make Wikimedia projects’ copyright enforcement less efficient. For those reasons, we hope that senators will reconsider the SMART Copyright Act. 


*     *     *

BE HEARD on the SMART Copyright Act! In addition to the letter the Foundation sent, you can let Congress know that you oppose mandatory censorship filters. Fight for the Future is leading a petition opposing the harmful impacts of the legislation that will be delivered to Congress on 25th April, 2022. You can sign the petition at www.nocensorshipfilter.com and make sure Congress knows just how many people are concerned about the impacts this bill will have on free speech. 


*     *     *

Senator Patrick Leahy   

Chair 

Senate Judiciary Committee Subcommittee on Intellectual Property

437 Russell Senate Office Building

Washington, DC 20510

Senator Thom Tillis

Ranking Member

Senate Judiciary Committee Subcommittee on Intellectual Property

113 Dirksen Senate Office Building

Washington, DC 20510

Dear Chair Leahy and Ranking Member Tillis:

The Wikimedia Foundation opposes the Strengthening Measures to Advance Rights Technologies Copyright Act of 2022 (SMART Copyright Act) due to our strong concerns about the negative impacts it could have on free knowledge projects, including Wikipedia. The bill, as currently drafted, would require the Librarian of Congress to institute a process that would mandate that nearly all online platforms use certain technical measures to identify and remove potentially infringing content from their services. We are concerned that requiring one-size-fits-all measures could upset the delicate balance between encouraging free expression and allowing for vigorous enforcement of intellectual property rights that has emerged since the passage of the Digital Millennium Copyright Act. Particularly, we are concerned that the imposition of these measures could force the Wikimedia Foundation and other hosts of community-driven platforms to make changes to our public interest projects that could harm Wikipedia’s volunteer contributors’ commitment to the exchange of free knowledge and disrupt our already well-functioning copyright enforcement system.

The Wikimedia Foundation hosts several projects of free knowledge, the most famous of which is Wikipedia. Within these projects, hundreds of thousands of users around the world create free, collective knowledge, and the projects use a number of long-established community-led systems to ensure copyright compliance. One of the requirements for knowledge to be freely available is that it is hosted under a free culture copyright license (our projects primarily use Creative Commons licenses) or in the public domain. The Wikimedia projects also make exceptions to this free culture requirement on a case-by-case basis. For example, English language Wikipedia allows fair use images to illustrate articles where no free image is available, such as for older musical groups and movies. This is reflected in a policy written and voted on by the users themselves.

The Wikimedia projects use a multi-layered system of human review and tools that assist volunteer reviewers to ensure the accuracy of copyrighted material and licensing information on the projects. As an initial step, many Wikimedia projects have an upload wizard (for example this one is the most common for photographs) that prompts the user to provide licensing information or, if it is their own work, to license it under a Creative Commons license. The Wikimedia Foundation’s Terms of Use also have a more formal content licensing agreement within them.

Once a work is uploaded, it is typically monitored by other users with the assistance of a variety of tools. The Foundation hosts some tools, which are developed by an open source developer community and used for detecting possible copyrighted materials on the Wikimedia projects. Other tools are hosted on community-created pages that help users address copyright issues. These tools are employed by the volunteer editors to help them review changes to the Wikimedia projects and identify changes that may infringe copyright or violate the free knowledge licensing requirements of the projects. 

The Foundation also accepts DMCA requests sent to it directly. Because the user policies and review systems are extremely effective, the number of DMCA notices the Foundation receives is vastly smaller than the millions received by most hosting providers, and many of them are sent in bad faith. For example, our last transparency report shows that we received only 21 DMCA notices in total and granted only 2 of them, indicating that the other 19 were inappropriate or defective in some manner. 

Because our overall architecture focuses on hosting content that is freely available under copyright law, these measures broadly assist in ensuring that Wikimedia-hosted content is legally available. At the same time, limitations and exceptions to the copyright system are also an important part of protecting free expression. Some of the inaccurate DMCA notices we have received in the past resulted from technical tools, other than those hosted by the Foundation or commonly used by our community, that found works on our sites and then either falsely asserted ownership or, even more concerningly, failed to adequately assess fair use even in clear cases of non-commercial educational use. Takedown demands generated by such tools can be disruptive and confusing for our user communities and require resources for a legal response that may take away from other work to advance the Foundation’s non-profit mission.

Under the SMART Copyright Act, the Foundation and our user communities could be forced to accommodate and implement technical tools to identify copyrighted content, regardless of whether those tools are appropriate for our projects. Forcing our projects to use inappropriate tools or to substitute these tools for our existing copyright enforcement process runs the risk of eliminating the nuanced review the Foundation and community are able to engage in when reviewing allegations of copyright infringement, including analysis of whether the content represents fair use of a copyrighted work. That will inevitably lead to over-enforcement and over-removal of legal content and harm to free expression and the free knowledge movement. 

The SMART Copyright Act, like other proposals before it, simply puts too much faith and emphasis on artificial intelligence and automated tools to enforce copyright laws. While we strongly agree that tools can be a helpful aid in identifying infringement, they should not be considered a fix for all enforcement problems, nor should they supersede the work of a volunteer community. The concerns that mandating reliance on them creates for free expression and the free exchange of legal content far outweigh any benefit to rightsholders. Even YouTube’s Content ID tool, which is highly accurate and efficient for its purposes, nonetheless produces numerous false positives for infringement and also fails to catch a significant amount of problematic content.

We appreciate that the bill does require that the Librarian of Congress consider many of these concerns when determining whether to make a technical measure a designated technical measure and on which platforms the particular measures would have to be deployed. However, the process is very likely to be dominated by rightsholders and by large platforms that have the time and capacity to devote to the proceedings. Lost in or absent from the debate will be small platforms, non-profit platforms, and, most concerningly, the public and the creative community that relies on free-knowledge protections to flourish. In addition, any designated technical measure will present implementation problems. As noted above, such measures often do a poor job of analyzing whether content qualifies for one of the exceptions or limitations on copyright, including whether the content is a fair use. Finally, any approved technical measure will almost certainly be proprietary rather than free and open source, further limiting the ability of small platforms to integrate it and raising the likelihood that those that benefit will be the standards creators: the industry power players as well as the big tech companies. This will likely produce designated technical measures that are over-broad and implementable only by the largest platforms and/or fail to take into account the diversity of information, formats, forums, and platforms and the impacts deploying these measures could have on them.

Thank you for taking the time to consider our views and our concerns. If you have any additional questions, please do not hesitate to reach out to Kate Ruane, Lead Public Policy Specialist for the United States, [email protected].

Open Education and OER in the Curriculum

15:27, Wednesday, 20 2022 April UTC

Principles of Open Education and OER 

This blog post was originally posted on the University of Edinburgh’s Curriculum Transformation Hub.

The principles of open education were initially outlined in the 2008 Cape Town Declaration [1], which advocates that everyone should have the freedom to use, customize, and redistribute educational resources without constraint, to nourish the kind of participatory culture of learning, sharing and cooperation that rapidly changing knowledge societies need. 

Broadly speaking, open education encompasses teaching techniques and academic practices that draw on open technologies, pedagogical approaches and open educational resources (OER) to facilitate collaborative and flexible learning. This may involve both teachers and learners engaging in the co-creation of learning experiences, participating in online peer communities, using, creating and sharing open educational resources (OER) and open knowledge, sharing experiences and professional practice, and engaging with interdisciplinarity and open scholarship. 

Although open education can encompass many different approaches, open educational resources, or OER, are central to this domain. The UNESCO Recommendation on OER [2] defines open educational resources as 

 “teaching, learning and research materials in any medium, digital or otherwise, that reside in the public domain or have been released under an open license that permits no-cost access, use, adaptation and redistribution by others with no or limited restrictions.” 

Open Education and OER at the University of Edinburgh 

At the University of Edinburgh, we believe that open education and OER are fully in keeping with our institutional vision, purpose and values: to discover knowledge and make the world a better place, while ensuring that our teaching and research is diverse, inclusive, accessible to all and relevant to society. In line with the UNESCO Recommendation on OER, we also believe that OER and open knowledge are critical to achieving the aims of the United Nations Sustainable Development Goals [3].

To support open education and the creation and use of OER, the University has an Open Educational Resources Policy [4], approved by our Learning and Teaching Committee, which encourages staff and students to use, create and publish OERs to enhance the quality of the student experience, expand provision of learning opportunities, and enrich our shared knowledge commons.  We also have a central OER Service [5], based in Information Services Group, that provides staff and students with advice and guidance on creating and using OER, engaging with open education and developing digital and copyright literacy skills.  Understanding authorship, copyright, and licensing is increasingly important at a time when both staff and students are actively engaged in co-creating digital resources and open knowledge.    

Benefits and Risks of Openness  

Open education approaches, such as collaborative flexible learning and co-creation of learning experiences, can be beneficial in many different contexts, but they are particularly well suited to hybrid teaching and learning, where no separation is made between digital and on campus student cohorts, and students are brought together by the way teaching is designed, enabling them to move between digital and classroom-based learning activities. 

Engaging with open education, OER and open knowledge through curriculum assignments can help to develop a wide range of core disciplinary competencies and transferable attributes including: 

  • Digital, data and copyright literacy skills, 
  • Understanding how knowledge and information are created, shared and contested online, 
  • Collaborative working and collective knowledge creation, 
  • Information synthesis, 
  • Critical thinking and source evaluation, 
  • Writing as public outreach.  

However, it’s also important to consider the risks of openness, as any understanding of openness is highly personal, contextualised and continually negotiated. We all experience openness from different perspectives, depending on different intersecting factors of power, privilege, inclusion and exclusion.  

In his 5Rs for Open Pedagogy [6] Rajiv Jhangiani identifies Risk as being one of his values for Open Pedagogy. 

“Open pedagogy involves vulnerabilities and risks that are not distributed evenly and that should not be ignored or glossed over. These risks are substantially higher for women, students and scholars of colour, precarious faculty, and many other groups and voices that are marginalized by the academy.” 

Many systemic barriers and structural inequalities exist in open spaces and communities; open does not necessarily mean accessible to all.  When engaging with open education, we need to be aware of our own privilege and be sensitive to those who may experience openness differently, and we need to address the systemic barriers and structural inequalities that may prevent others from engaging with open education and to enable everyone to participate equitably, and on their own terms. 

The University has an invaluable Digital Safety and Citizenship Web Hub [7], that offers comprehensive information and resources on a range of digital safety and citizenship-related issues, including training and events, and advice on being an informed digital citizen.   

If we’re sensitive to these risks and inequities and work to mitigate them, integrating open education and OER into the curriculum can bring significant benefits, including building networks, relationships and communities, fostering agency and empowerment, developing strong societal values and an appreciation of equity, intersectionality and social justice. 

Open Education in the Curriculum 

Wikimedia in the Curriculum 

One way to engage with open education and the creation of open knowledge is by contributing to Wikipedia, the world’s biggest open educational resource and the gateway through which millions of people seek access to knowledge.  Working together with the University’s Wikimedian in Residence, Ewan McAndrew, colleagues from a number of schools and colleges have integrated Wikipedia and Wikidata editing assignments into their courses.  Editing Wikipedia provides valuable opportunities for students to develop their digital research and communication skills, and enables them to contribute to the creation and dissemination of open knowledge. Writing articles that will be publicly accessible and live on after the end of their assignment has proved to be highly motivating for students, and provides an incentive for them to think more deeply about their research. It encourages them to ensure they are synthesising all the reliable information available, and to think about how they can communicate their scholarship to a general audience. Students can see that their contribution will benefit the huge audience that consults Wikipedia, plugging gaps in coverage, and bringing to light hidden histories, significant figures, and important concepts and ideas. This makes for a valuable and inspiring teaching and learning experience, that enhances the digital literacy, research and communication skills of both staff and students. 

Talking about a Wikipedia assignment that focused on improving articles on Islamic art, science and the occult, Dr Glaire Andersen, from Edinburgh College of Art commented 

“In a year that brought pervasive systemic injustices into stark relief, our experiment in applying our knowledge outside the classroom gave us a sense that we were creating something positive, something that mattered. As one student commented, ‘Really love the Wikipedia project. It feels like my knowledge is actually making a difference in the wider world, if in a small way.’”

Other examples include Global Health Challenges postgraduates collaborating to improve Wikipedia articles on natural or manmade disasters; History students re-examining the legacy of Scotland’s involvement in the transatlantic slave trade and presenting a more positive view of black British history; Digital Education Masters students collaborating to publish a new entry on Information Literacies; and Reproductive Biology Honours students working in groups to publish new articles on reproductive biomedical terms. 

Wikimedia in the Classroom assignment, Aine Kavanagh, Reproductive Biology, by Ewan McAndrew, Wikimedian in Residence, University of Edinburgh, CC BY SA.

Our Wikimedian in Residence provides a free central service to all staff and students across the University. Further information, including testimonies from staff and students who have taken part in Wikimedia in the Curriculum assignments, is available here: Wikimedian in Residence. 

Open Education and Co-creation – GeoScience Outreach 

Another important benefit of open education is that it helps to facilitate the co-creation of knowledge and understanding.  Co-creation can be described as student-led collaborative initiatives, often developed in partnership with teachers or other bodies outwith the institution, that lead to the development of shared outputs.  A key feature of co-creation is that it must be based on equal partnerships between teachers and students and “relationships that foster respect, reciprocity, and shared responsibility” [8]. 

One successful example of open education and co-creation in the curriculum is the Geosciences Outreach Course, which provides students with an opportunity to work with a wide range of clients including schools, museums, outdoor centres, and community groups, to design and deliver resources for STEM engagement. Students may work on project ideas suggested by the client, but they are also encouraged to develop their own ideas.  This provides students with the opportunity to work in new and challenging environments, acquiring a range of transferable skills that enhance their employability. They gain experience of science outreach, public engagement, teaching and learning, and knowledge transfer while at the same time developing communication, project and time management skills.  

A key element of the course is to develop resources with a legacy that can be reused by other communities and organisations. Open Content Curation student Interns employed by the University’s OER Service repurpose these materials to create open educational resources aligned to the Scottish Curriculum for Excellence, which are shared online through Open.Ed and TES Resources [9] where they can be found and reused by school teachers and learners.  These OERs, co-created by our students, have been downloaded over 58,000 times and the collection was recently awarded Open Education Global’s Open Curation Award [10].  

Open Education Awards for Excellence: Open Curation / Repository – University of Edinburgh by Stephanie (Charlie) Farley, CC BY SA. 

OER Assignments – Digital Futures for Learning 

OER creation assignments are also incorporated into the Digital Futures for Learning module, part of the MSc in Digital Education, where students create open resources that critically evaluate the implications of educational trends, such as the future of writing, complexity in education, and radical digital literacy.  Creating genuinely open resources that are usable and reusable requires careful attention to issues such as accessibility, structure, audience, and licensing. The students need to critically consider and apply their learning, and in doing so are able to create practical reusable resources, while demonstrating a range of transferable skills and competencies.  

Commenting on this OER creation assignment, course leader Dr Jen Ross said 

“Experiencing first-hand what it means to engage in open educational practice gives students an appetite to learn and think more.  The creation of OERs provides a platform for students to share their learning. In this way, these assignments can have ongoing, tangible value for students and for the people who encounter their work.” [11] 

Reusing and Repurposing OER 

Reusing and customising existing open educational resources can help to diversify and expand the pool of teaching and learning resources available to staff and students. 

LGBT+ Resources for Medical Education 

In 2016 undergraduate medical students developed a suite of resources covering lesbian, gay, bisexual and transsexual health. Although knowledge of LGBT+ health and of the sensitivities needed to treat LGBT+ patients is valuable for qualifying doctors, these issues are not well covered in medical curricula. This project remixed and repurposed resources originally created by Case Western Reserve University, and then contributed them back to the commons as OER. New open resources, including digital stories recorded from patient interviews and resources for Secondary School children, were also created and released as OER. In a recent blog post on Teaching Matters [12], Dr. Jeni Harden, Senior Lecturer in Social Science and Health, reflected on how these resources have contributed to the medicine curriculum over the past five years. 

Fundamentals of Music Theory 

Fundamentals of Music Theory [13] is an open textbook co-created by staff and students from the Reid School of Music with support from the University’s OER Service.  This Student Experience Grant funded collaborative project [14] repurposed existing open licensed MOOC content and blended-learning course materials to co-create a proof-of-concept open textbook. The project enabled our student partners to develop digital and copyright literacy skills, an understanding of OER and open textbooks, familiarity with ebook applications, and experience of working with educational media and content. Their input enhanced the original teaching materials and brought about further teaching and learning enhancement. Open textbooks have the potential to benefit universities in the post-pandemic world by reducing textbook costs, benefit staff by providing access to easily customisable open textbooks, and benefit students by providing free, high quality digital learning materials. Furthermore, open textbooks and OER have the potential to facilitate the democratic reshaping of teaching materials through student engagement and co-creation. 

Further Information  

These are just some examples of ways that open education and OER have already been integrated into the curriculum here at the University of Edinburgh.  They demonstrate how valuable co-creating open knowledge and open educational resources through curriculum assignments can be to help students develop essential digital skills, core competencies and transferable attributes, and enable our learners to become fully engaged digital citizens. 

For further information about open education and OER please visit the University’s OER Service at Open.Ed or e-mail us at [email protected].  

References 

  1. Cape Town Open Education Declaration, https://www.capetowndeclaration.org/read/
  2. UNESCO, (2019), Recommendation on Open Educational Resources, http://portal.unesco.org/en/ev.php-URL_ID=49556&URL_DO=DO_TOPIC&URL_SECTION=201.html
  3. United Nations Sustainable Development Goals, https://sdgs.un.org/goals
  4. University of Edinburgh Open Educational Resources Policy, https://www.ed.ac.uk/files/atoms/files/openeducationalresourcespolicy.pdf
  5. OER Service, https://open.ed.ac.uk/
  6. Jhangiani, R, (2019), 5Rs for Open Pedagogy, Rajiv Jhangiani, Ph.D. Blog, https://thatpsychprof.com/5rs-for-open-pedagogy/
  7. Digital Safety and Citizenship Web Hub, https://www.ed.ac.uk/information-services/help-consultancy/is-skills/digital-safety-and-citizenship
  8. Lubicz-Nawrocka, T., (2019), An introduction to student and staff co-creation of the curriculum, Teaching Matters Blog, https://www.teaching-matters-blog.ed.ac.uk/an-introduction-to-student-and-staff-co-creation-of-the-curriculum/
  9. University of Edinburgh Open.Ed Hub, TES Resources, https://www.tes.com/teaching-resources/shop/OpenEd
  10. OE Awards for Excellence, https://awards.oeglobal.org/awards/2021/open-curation/open-ed-collection-of-geoscience-outreach-oers-and-more-on-tes/
  11. Ross, J., (2019), Digital Futures for Learning: An OER assignment, Open.Ed Blog, https://open.ed.ac.uk/digital-futures-for-learning-an-oer-assignment/
  12. Farley, S. and Harden, J., (2021), Five years on: The LGBT+ Healthcare 101 OER, Teaching Matters Blog, https://www.teaching-matters-blog.ed.ac.uk/five-years-on-the-lgbt-healthcare-101-oer/
  13. Edwards, M., Kitchen, J., Moran, N., Moir, Z., and Worth, R., (2021), Fundamentals of Music Theory, Edinburgh Diamond, DOI: https://doi.org/10.2218/ED.9781912669226
  14. Open eTextbooks for Access to Music Education Project, https://blogs.ed.ac.uk/opentextbooks/

WeDigBio: A Wikidata empowered workflow

14:48, Wednesday, 20 2022 April UTC

Twice a year there is a worldwide citizen science effort organised by WeDigBio.org to digitise natural history specimen data. I love contributing to WeDigBio. So twice a year I put down my other hobbies and concentrate on the vitally important task of transcribing natural history specimen labels. However, because I love researching people and linking them to their work, my workflow extends beyond transcription. I attempt to link data on the collectors of the specimens I’m transcribing to their previous collections. I do this with Wikidata and use the Wikidata Q identifier in Bionomia.net, a website that links collectors’ specimens to the collector’s identifier.

I start by transcribing a specimen label. I particularly like contributing to the Australian Museum transcription website DigiVol but there are multiple other platforms that people can contribute to including Notes from Nature, DoeDat and so many more.

I log into the DigiVol platform and then pick a project to work on. During the April 2022 WeDigBio campaign it was the Royal Botanic Garden Edinburgh Papaveraceae (North America) project.

Screenshot of DigiVol

Then I start transcribing. When I transcribe I pay particular attention to women collectors. Many of these women have made under-appreciated contributions to our knowledge on natural history. These women are often invisible, as many use their married names. This has resulted in their work being attributed to their husbands.

So when I come across a woman like this I make sure both she and her title are included in the transcription of the label. See for example this specimen.

Screenshot of DigiVol

When I transcribe the collectors, I make sure the woman is entered into the transcription platform as completely as I can from the information given on the label.

Screenshot of DigiVol

I then attempt to research her. I act like a family genealogist, except I am attempting to track down a stranger rather than a family member. I use various platforms such as familysearch.com, findagrave.com, and full text searches of the Biodiversity Heritage Library and the Internet Archive to attempt to find more information on her. In a case like Mrs. Steele’s I often have to first track down her husband and then attempt to find her full name via his name in databases. I was VERY lucky in Mrs. Steele’s case as her husband had a Wikipedia article about him which gave her full name and birth dates. Be warned, this is extremely unusual and more effort is normally required to track these women down.

Screenshot of the English Wikipedia article for the husband of Grace Steele.

If I am successful in tracking her down I then log into Wikidata. First, I check she isn’t already in Wikidata. If there is no Wikidata item for her I create one. I make sure I put in as much information as I can about her and also link her item to her husband’s item (if it exists) via a statement about her being his spouse. Assuming her husband has an item, I make the same statement on his item so that researchers have an easier time finding her.

Screenshot of Wikidata item for Grace Steele.

I explain how to add collectors to Wikidata in this video on YouTube. There is also this document that gives instructions to help guide you when adding specimen collector data to Wikidata.
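The Wikidata step above can be sketched in a few lines of Python. This is only an illustration, assuming the public Wikidata search API (the `wbsearchentities` action) and the spouse property P26; the function names and the placeholder item IDs are my own inventions for the example, and the sketch only builds the search URL and lists the statements to add, it does not edit Wikidata.

```python
from urllib.parse import urlencode

WIKIDATA_API = "https://www.wikidata.org/w/api.php"
SPOUSE = "P26"  # Wikidata property for "spouse"


def build_search_url(name, language="en"):
    """Build a search URL to check whether an item for this person may already exist."""
    params = {
        "action": "wbsearchentities",
        "search": name,
        "language": language,
        "type": "item",
        "format": "json",
    }
    return f"{WIKIDATA_API}?{urlencode(params)}"


def reciprocal_spouse_statements(collector_qid, husband_qid):
    """Return the spouse statement for each item, so the link works in both directions."""
    return [
        (collector_qid, SPOUSE, husband_qid),
        (husband_qid, SPOUSE, collector_qid),
    ]


# Hypothetical placeholders, not real item IDs.
url = build_search_url("Grace Steele")
statements = reciprocal_spouse_statements("Q_COLLECTOR", "Q_HUSBAND")
```

Opening the URL in a browser returns the candidate matches as JSON. A person with a common name may return many candidates, so the results still need to be checked by hand before deciding to create a new item.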

Once the Wikidata item is completed for the woman collector I then go to the Bionomia website. This website helps link information on the collectors of specimens to the actual specimens themselves.

Screenshot of Bionomia.net

To log into Bionomia Tracker you need an ORCID iD. This is mine: https://orcid.org/0000-0002-5398-7721

Screenshot of Orcid.org

I’ve made sure to add details about myself and make it public so that in the future, my work is able to be found and attributed to me, just as we are doing for these collectors. If you are a volunteer citizen scientist you too can do the same.

Once I have logged into Bionomia I can add Mrs. Steele into that platform by adding her Wikidata item to it. I can only do this if she is deceased and has a death date added to her Wikidata item. If she is alive, we will need her ORCID iD in order to be able to add her to Bionomia. If she is alive and doesn’t have an ORCID iD we won’t be able to add her to Bionomia and will have to be satisfied with her just being in Wikidata.
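The rule described above can be written as a small decision function. This is a sketch of the rule as I have described it, not Bionomia’s actual code; the function name, the return values and the placeholder date are assumptions made for the example.

```python
from typing import Optional


def bionomia_identifier(death_date: Optional[str], orcid_id: Optional[str]) -> Optional[str]:
    """Return which identifier, if any, can be used to add a collector to Bionomia."""
    if death_date:
        # Deceased, with a death date recorded on her Wikidata item.
        return "wikidata"
    if orcid_id:
        # Living collector who has an ORCID iD.
        return "orcid"
    # Living and no ORCID iD: she stays in Wikidata only.
    return None


# "1900-01-01" is a placeholder date, not a real collector's death date.
deceased = bionomia_identifier("1900-01-01", None)
living_with_orcid = bionomia_identifier(None, "0000-0002-5398-7721")
living_without_orcid = bionomia_identifier(None, None)
```

Checking the death date first reflects the order of the workflow: a deceased collector is added through her Wikidata item, and the ORCID iD only matters for living collectors.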

Once she is added to Bionomia I then start carefully attributing specimens to her. Care must be taken because, as she is known as “Mrs. Steele”, Bionomia will suggest specimens collected only by her husband as well as those they collected together.

Screenshot of Bionomia.net


More information and instructions on how to attribute specimens to collectors in Bionomia can be found here

Once I’ve attributed specimens to her, I make the profile public so that anyone can find her profile. This can only be done for collectors who are deceased. Living collectors have control over whether their information on Bionomia can be made public.

By doing this work I’ve attributed specimens to the collector, helped improve her profile, and enabled Natural History institutions to improve their own databases based on the data created and linked in both Wikidata and Bionomia. As an added benefit Bionomia links the specimens the collector collected to the scientific research that has used those specimens. This ensures that the impact of Mrs. Steele’s collecting is tracked and can be appreciated.

At the time of writing, Mrs. Steele’s specimens have been used to inform science in 25 recently published scientific papers, proving that her contributions to science were well worth the effort of linking her to her work.

Screenshot of Bionomia.net

© Siobhan Leachman, 2022. Unless indicated otherwise, this document is licensed for re-use under the Creative Commons Zero 1.0 International licence. Please note that this licence does not apply to any images. Those screenshots are reused in this document under section 43 of the Copyright Act 1994 of New Zealand. Other countries may have different copyright exceptions allowing reuse of those screenshots.  In essence, you are free to copy, distribute and adapt this document, as long as you abide by the other licence terms.

From the lush green hills of the historical site San Andrés, El Salvador, to the overwhelming waterfalls in Murchison Falls National Park, Uganda, to the captivating temple of David in Jerusalem, Israel: where local restrictions would allow, people have gone out again to capture their wonderful surroundings and share them with the world through the largest photo competition in the world, Wiki Loves Monuments.

As part of the competition, photographers donate their images to Wikimedia Commons, the free repository that holds most of the images used on Wikipedia and the other Wikimedia projects, helping to document the world’s cultural wonders for generations to come.

With the events going on in the world, it is important to stay aware that there is a deadline to capture the world around you. We consider it our responsibility to make people aware of its beauty, and that it deserves to be shared through the eyes of the people that experience it every day. We make the commitment to create the visuals that can illustrate the stories of these places and bring them to life, and invite you to help expand and maintain the treasure trove of the cultural heritage we all share in the upcoming 2022 edition.

In this twelfth edition of Wiki Loves Monuments, 37 countries joined together in the contest. We received over 172,000 entries from 4,914 uploaders, and welcomed 3,198 first-time contributors. From all entries, national winners were selected and brought to an international jury of experts. This jury assessed, considered and ranked the 339 national winning photos based on our usual criteria: usefulness for Wikipedia, technical quality and originality.

Please enjoy the following profiles of the 2021 Wiki Loves Monuments winners, who were announced today. This year’s international winners come from 11 different countries, including multiple winners from Poland, Ukraine and India.


First place: Although Donatas Dabravolskas lives only 15 minutes away from the Royal Portuguese Cabinet of Reading in Rio de Janeiro, Brazil, he only came to know about the place from a social media post, which motivated him to visit and take a bunch of photographs. As said by jury members, this winning picture of WLM 2021 has “amazing symmetry” and “conveys greatness and true scale of this immense library.” Photo by Donatas Dabravolskas, CC BY-SA 4.0.

Second place: A professional photographer who has been in the field for over a decade, Damian Pankowiec believes that it is important to “capture the subject at the right time of day and the weather”. This stunning picture of Saint Roch chapel in Krasnobród, Poland, is praised by a jury member: the “best way to help a modest, otherwise unspectacular monument in gaining value in a picture is looking for the proper season; and for this picture autumn was chosen very well,” and as another member says, “it creates the sensation to be there.” Photo by Damian Pankowiec, CC BY-SA 4.0.

Third place: Along with his friends, Zysko Serhi travels around Ukraine to showcase its beauty to the world. Far from his home, Serhi captured a “colourful interior shot” of Samchyky palace in Starokostiantyniv, Ukraine. Photo by Zysko Serhi, CC BY-SA 4.0.

Fourth place: Małgorzata Pawelczyk captured the Stefan Czarniecki Monument in Tykocin, Poland, as the “sun rays were shining through the clouds and the fog.” Her motivation to submit this photograph to WLM was to “create awareness of the beauty of architecture even in small towns.” A jury member described it as a “great use of the weather and the time of day to isolate the subject within its surroundings.” Photo by Figoosia, CC BY-SA 4.0.

Fifth place: Basavaraj M, who has been into photography for about four decades with a special interest in architecture, presents us with a picture of the Gali Mantapa from his hometown, Chitradurga in Karnataka, India. The photograph is also the winner of WLM 2021 in India. A jury member called it a “good combination of stone architecture with natural rocks.” Photo by Basavarajmin21, CC BY-SA 4.0.

Sixth place: During the pandemic in 2020, Daniel Horowitz, along with his wife “rented a secluded cabin near Lake Champlain in Vermont, hundreds of miles from home.” Since his childhood, he has always been drawn to old buildings, ruins and abandoned places. Towards the end of their trip, as he was taking pictures, Horowitz wandered off (as he usually does), to capture Fort Crown Point in New York. It was a British fortress along the western bank of Lake Champlain, built in 1759 to defend against French forces and partially destroyed by fire in 1773. Photo by Daniel M. Horowitz, CC BY-SA 4.0.

Seventh place: Dormition (Uspensky) Cathedral was photographed on a “rare sunny winter day” in Kharkiv, Ukraine, in “extreme cold and the snow on the roofs added some neatness to the whole scene”, by Ekaterina Polischuk. She, along with being a landscape photographer, is also a drone pilot. A jury member remarked the photo as, “the tower in the foreground creates a dominant look and great perspective towards the depicted monument; lines are converging towards it and draws the eye into the shot. Excellent day for such an image; snow on top of the golden tower is like icing on the cake.” Photo by Ekaterina Polischuk, CC BY-SA 4.0.

Eighth place: On a weekend walk to Ross Castle with his family, Mark McGuire happened to perfectly capture “a series of fortunate events” in one frame, rather than just the monument. In McGuire’s words, “For the ducks, boat and sunset to all align at the same time was pure chance and one of these rare moments that is unlikely to be replicated.” It was also a jury member’s favorite “by its composition” which blends, “architecture, nature and human elements.” Photo by Markiemcg1, CC BY-SA 4.0.

Ninth place: On a cloudy day with strong winds, Hadi Dehghanpour, along with three “photographer friends” traveled almost 400 km to “one of the unique and special attractions of Khorasan Razavi”, the Windmills of Nashtifan. A jury member remarked that “the middle of a frame adds to the magnitude of the structure.” Photo by Hadidehghanpour, CC BY-SA 4.0.

Tenth place: Yashin, who has been into photography for about seven years now, presents a beautiful image of the Intercession Church in Posevkino, Voronezh Oblast, Russia. Photo by Yashin.v, CC BY-SA 4.0.


Congratulations to all the winners, and our thanks go out to everyone who participated! If you’d like to see more photographs like these, see the 2021 winners from Wiki Loves Earth, a similar photo contest that aims to document the world’s natural heritage. Be sure to check out the national winners from each of the participating countries in Wiki Loves Monuments 2021, and last year’s international winners, too!

For more information, including how to join next year’s contest, go to wikilovesmonuments.org. Share your favorite winning images on social media using #wikilovesmonuments.

Wikidata as a tool for biodiversity informatics

15:54, Tuesday, 19 2022 April UTC
Rick Levy. Image courtesy Rick Levy, all rights reserved.

Rick Levy’s job as a scientific data manager at the Denver Botanic Gardens has him immersed in the field of biodiversity informatics. And a topic that keeps coming up in his field? Wikidata.

“So many conversations at conferences and webinars mention Wikidata. I needed an introduction on how to use it, as it seemed to be so useful and ubiquitous in these circles,” Rick says.

So Rick signed up to take Wiki Education’s Wikidata Institute, a three-week deep dive into the open structured data project that’s becoming more commonplace in biodiversity informatics — and other fields. The course provided that introduction he needed.

“It gave me skills to make big changes efficiently,” Rick says of the Wikidata Institute course. “I thought my cohort and instructor were super interesting and friendly and working on important projects.”

The Denver Botanic Gardens generates a lot of data, he says, which they want to make as freely available as possible. The Gardens has collections related to natural history and conducts its own ecological research. Wikidata is a natural place to share this data.

“Linking data is so incredibly useful. Additionally, it is great to have the data available and presented in a way that the general public can access,” Rick says. “In the future, I hope to publish the majority of our data to Wikidata, including tens of thousands of museum specimens and ecological datasets.”

Rick continues to be inspired by Wikidata and uses the skills he gained in the course.

“Editing and creating items snowballs so fast, it is hard to know when to stop,” he says. “It makes me feel like I am adding to and improving a community of information.”

Interested in taking the same course Rick took? Visit wikiedu.org/wikidata.

Image credit: Carol M. Highsmith, Public domain, via Wikimedia Commons

A little-known naturalist from Chikkaballapur

09:11, Tuesday, 19 2022 April UTC
Bangalore has historically, being an administrative centre with a mild climate, had a fair share of colonial natural history collectors and naturalists. We know a fair bit about the botanists who walked this region and a bit about hunters of larger game but rather little about those who studied insects. A few years ago I became aware of the Campbell brothers from Ireland (but of Scottish origin). It took some time to put together the Wikipedia entries on them which is where more straightforward biographical details may be found.

After a trip to the Nandi Hills [to examine a large number of heritage Eucalyptus trees (nearly 200 years old) that the Horticulture Department had decided to cut down to the stump, supposedly because falling branches were seen by the Archaeological Survey of India as a threat to heritage buildings nearby], some of us decided to visit Chikballapur to examine the place of work of  Dr Thomas Vincent Campbell (1863-16 December 1930) - "T.V." as he was known to his friends was a missionary doctor with the London Missionary Society and had worked briefly at Jammalamadugu where his older brother William Howard Campbell (20 September 1859 - 18 February 1910) had worked as a missionary. Another brother back in Derry, David Callender Campbell (1860-1926) was also a keen observer of moths and a botanist. In their younger days in Derry, they and their siblings had put together a "family" museum of natural history that was said to be among the best in the region! William was the oldest of nine siblings and appears to have been the sturdiest considering that he was a champion rugby player at Edinburgh University. He moved to Cuddapah in 1884 and he may well have been the first person to see Jerdon's courser in life - Jerdon, Hume, and others appear to have dealt only with specimens obtained from local hunters. William collected moths and many of them appear to have gone to Lord Rothschild and nearly 60 taxa were described on their basis by Hampson. In 1909, he was to become director United Theological College Bangalore but ill health (sprue) forced him to return to Europe and he died in 1910 in Italy. His Cuddapah-born son Sir David Callender Campbell (1891 – 1963) became a prominent Northern Ireland politician. William's life is covered in some detail by Alan Knox while examining the only known egg of Jerdon's courser. A biography (a bit hagiographic though) of William in Telugu also exists.

T.V.'s life, on the other hand, was hard to find information on, although we knew of his insect specimens. He was in contact with E.A. Butler who specialized in the life histories of insects and T.V. seems to have taken off after him and not only collected bugs (i.e. Hemiptera) but made notes on them which were used by Distant in the Fauna of British India. Several insects that T.V. collected have never been seen again. T.V. moved to Chikaballapur and worked at the Ralph Wardlaw Thompson Memorial Hospital which is now just known as the CSI Hospital and largely in disrepair. The hospital in its heyday was among the few in the region and treated a large number of patients. After suffering from tuberculosis, he also established a TB sanatorium at Madanapalli. Campbell treated nearly a thousand cases of cataract and was awarded a Kaisar-i-Hind medal for his work in 1908. Campbell appears to have made a very large collection of insects from Cuddapah, Chikballapur, and from the Ooty area (where he would have spent summers). Many of these are now in the Natural History Museum in London and a good number are type specimens (i.e. the specimens on the basis of which new species were described). Professor C.A. Viraktamath, entomologist and specialist on the leafhoppers, has for many years searched for a supposedly wingless Gunhilda noctua which was collected from the Nilgiris. Based on T.V.'s connections, I believe the place to look for them would be somewhere in the vicinity of the church in Ketti. Considering the massive alteration in habitats, there is a slight chance that the species has gone extinct but it is doubtful that it was so narrow in its distribution.
 
W.H. Campbell

 
Dr T.V. Campbell
T.V.'s former home in Chikaballapur

Dr TV attending to patients in Chikaballapur, c. 1912

A lane inside the hospital premises named after T.V.

Foundation stone of the hospital

The Wardlaw Thompson Hospital c. 1914

Gunhilda noctua - a monotypic genus never seen
since T.V. found them for W.L. Distant to describe in 1918
from The Fauna of British India. Rhynchota Vol.II

The Wikipedia entries can be found at T.V. Campbell and W.H. Campbell. Many people helped in the development of these articles. Roy Vickery kindly obtained a hard to find obituary of T.V., Alan Knox sent me some additional sources on W.H.C. and Susan Daniel, librarian at the United Theological College was extremely helpful. Arun Nandvar drove and S. Subramanya joined our little adventure in Chikaballapur. Dr Eric Lott made enquiries with the SOAS and LMS archives but found little. My entomologist friends and mentors, Prashanth Mohanraj and Yeshwanth H.M. shared their enthusiasm in discovering more about T.V. 

POSTSCRIPT - April 2022: S. Subramanya and I visited Jammalamadugu (and nearby places including Buchupalli where WHC had found a large pelicanry). It seems that the hospital that TVC began continues to prosper. It seems to have gained the favour of the political class thanks to the association of the former Chief Minister Dr Y S R Reddy who was not only born in the Campbell Hospital but worked there too. Apparently very little is known of the work of W.H. Campbell who seems to have largely been active as a missionary. The village of Buchupalli where he had described a large pelicanry seems to have no signs of any large water birds and absolutely no memory among its current day residents (who might be three or four generations down from those that lived in the 1890s).


The CSI Campbell Hospital

The entrance

Inscription below the statue with a gratuitous knighthood!

Dr TV holds a disarticulated stethoscope!




International Roma Day Edit-a-thon 2022

19:57, Monday, 18 2022 April UTC

On the occasion of International Roma Day, Wikimedia Serbia organized the third global edit-a-thon where Wikipedia volunteers around the world wrote and improved articles on Roma people and their history and culture. This year’s edit-a-thon was supported by Shared Knowledge, GLAM Macedonia, Wikimedians of Albanian Language User Group, Wikimedia España, Wikimedia Community User Group Greece, Wikimedians of Slovakia and Slovak Romani Wikipedia Community, WikiDonne, Wikimedians of Republic of Srpska and Wikimedians of Erzya language User Group. The goal of this campaign is to fight prejudice and discrimination against Roma people by spreading knowledge on Wikipedia and other Wiki projects.

Preparations

Following the example of previous years, a page was created on Meta with a table containing a list of articles showing whether they exist in the mentioned language edition of Wikipedia or not. This system also allows participants to work on translating articles if sources and literature on a given item do not exist in their language. A lot of time has been invested in inviting Wikimedia affiliates and communities to join and support this event. We were primarily focused on countries and regions that have a significant Roma minority. Fortunately the response was good and we tried to motivate our partners to find their local Roma organizations and organize a Wikipedia editing workshop for them. 

Results

The edit-a-thon lasted from April 1 to April 8 which is the International Roma Day. This year, we broke the record again and 67 participants wrote a total of 279 new articles, while 4 articles were improved. Balkan communities have once again shown that in this part of the world, the visibility of knowledge related to minority and marginalized groups is of great importance. Most of the articles were written on Macedonian Wikipedia, followed by Serbian, Albanian and Greek Wikipedia. 

At the event organized by the GLAM Macedonia, 27 participants wrote 128 articles in just one day. On the occasion of this extraordinary result, we asked Nataša Nedanoska from GLAM Macedonia what this event represents for them:

As you know, we are always pleased to support this kind of activity and we are glad to have such a high level of cooperation.

This edit-a-thon has raised great interest, to our satisfaction, which means that we are committed to the inclusion of minorities, promoting multiethnic values, getting to know and teaching the importance of notable people from smaller ethnic communities. This leads to improved perception and visibility of the contributions of the Roma community to the social and cultural life of North Macedonia.

Speaking of great results, we owe a special thanks to the user Gikü, who improved 119 items on Wikidata during the edit-a-thon. The final results can be viewed on this link.

Starting from last year, articles on Roma people and their culture became an integral part of the Wikimedia CEE Spring event, so interested editors from Central and Eastern Europe can contribute to this topic and participate in their local competitions until May 31st. As always, we are inviting you all to write articles on Wikipedia on the topic of any minority or marginalized group of people in order to reduce the existing knowledge gap.

Tech News issue #16, 2022 (April 18, 2022)

00:00, Monday, 18 2022 April UTC

Tech News: 2022-16

weeklyOSM 612

09:51, Sunday, 17 2022 April UTC

05/04/2022-11/04/2022

lead picture

progress-visualizer shows progress of Import/Catalogue/Road import (Norway)/Update [1] © by Mathias Haugsbø | map data © OpenStreetMap contributors (ODbL) |

Breaking news

  • OSM Ukraine is urging everyone to refrain (uk) > en from any mapping in Ukraine while the conflict is still occurring, as it fuels the information war.

Mapping

  • jmapb has published part two of their ‘Using NYC Dept of Buildings Building Information Search’ series.
  • Anne-Karoline Distel’s video this week covers streetlight mapping.
  • The proposal on content=track_ballast is waiting for your comments. The proposed tag describes the content of a container feature such as man_made=storage_tank as track ballast, a term which describes solid material used to fill a track bed on railways.
  • Voting is currently open for:
    • A modified version of the artwork_subject=sheela-na-gig proposal until Wednesday 20 April. This version drops the suggestion of also adding a subject:wikidata tag duplicating the same information.
    • isced:2011:programme=*, to update isced:level tagging to the 2011 version of the International Standard Classification of Education, until Sunday 24 April.

Community

Local chapter news

Events

  • René Chalon (fr) > en (user renecha) presented (fr) the OpenStreetMap project during Journée du Libre Éducatif 2022. Hosted in Lyon on 1 April, the event showcased (fr) > en 12 different open-source projects with educational potential to more than 400 visitors.

Maps

  • As well as a tool for mapping place name elements (we reported last week), SeeSchloss also offers a map to search street names in France. Unfortunately it’s currently falling foul of OSM France’s ‘attribution is not optional’ campaign, as can be seen by looking at the map.

switch2OSM

  • The Ukrainian war crime evidence collection platform uses (uk) > en OSM.

Licences

  • AngocA wrote (es) > en a long blog post about ‘Clarifying permission to use CC BY in OSM’.

Software

  • [1] Mathias Haugsbø, from Norway, has created progress-visualizer, a tool that takes OpenStreetMap wiki tables and visualises each project’s mapping progress on a map, making it very simple to create status maps for mapping projects. To date, the tool only covers Norway.
  • The Mapillary team is conducting a two minute user survey (via Facebook), with an option to answer extended questions. The goal is to gather community opinions about satisfaction with the Mapillary apps and platform and to make plans for Mapillary to better fit user needs. Anyone who has used Mapillary is encouraged to share feedback.
  • Pieter Vander Vennet has released a new version of MapComplete, which has support to help in quickly translating the application. His diary post explained how this can be done and invited everyone to contribute translations.
  • AngocA compared (es) > en the features offered by different services/applications/websites that use OSM map notes.

Programming

  • Tomasz Taraś (tomczk) presented a few options for importing OSM data into a PostgreSQL database, including a worked example using imposm3.

Did you know …

  • … the liveuamap that links news from different countries of the world (we reported earlier) to an OSM map? There is also a thematic map on epidemics.
  • … the advantages of micromapping traffic signs? The OSM community in Helsinki did just that and discovered all sorts of faults with official signage.
  • … the fastest way to contact the Data Working Group?
  • … web services that offer map tiles are usually free up to a certain limit, after which they charge by usage? At Protomaps, Brandon Liu gave some thoughts on this subject.

OSM in the media

  • On April 5, there was a curious celebration: ‘Read a Road Map Day‘. On the occasion of this day, Saarländischer Rundfunk filmed a short feature at the elementary school in Lebach. Thanks to the OSM veteran Wolfgang Barth and the OSM-connected editor Herbert Mangold, OpenStreetMap is of course also addressed (de).

Other “geo” things

  • Christopher Beddow wrote about his vision for the next generation of maps.
  • Open311 is a ‘collaborative model and open standard for civic issue tracking’. It’s used by the City and County of San Francisco, which has a public services provision problem. Because the data is open, someone has been able to create a map of some of the issues that this causes.

Upcoming Events

Where and what | When
OSM World Discord Note Mapathon | 2022-04-10 – 2022-04-17
Open Mapping Hub Asia Pacific OSM Help Desk | 2022-04-18
Lyon Rencontre mensuelle Lyon | 2022-04-19
150. Treffen des OSM-Stammtisches Bonn | 2022-04-19
City of Nottingham OSM East Midlands/Nottingham meetup (online) | 2022-04-19
Lüneburg Lüneburger Mappertreffen (online) | 2022-04-19
Open Mapping Hub Asia Pacific OSM Help Desk | 2022-04-20
Focused Workshop – Editing OSM Data with Street-Level Imagery | 2022-04-21
Dublin Irish Virtual Map and Chat | 2022-04-21
New York New York City Meetup | 2022-04-23
Bogotá Distrito Capital – Municipio Introducción a la edición del Wiki de OpenStreetMap | 2022-04-23
京都市 京都!街歩き!マッピングパーティ:第29回 Re:鹿王院 | 2022-04-24
Open Mapping Hub Asia Pacific OSM Help Desk | 2022-04-25
Bremen Bremer Mappertreffen (Online) | 2022-04-25
OSMF Engineering Working Group meeting | 2022-04-25
San Jose South Bay Map Night | 2022-04-27
Open Mapping Hub Asia Pacific OSM Help Desk | 2022-04-27
Roma Capitale Incontro dei mappatori romani e laziali | 2022-04-27
[Online] OpenStreetMap Foundation board of Directors – public videomeeting | 2022-04-28
Gent Open Belgium 2022 | 2022-04-29
Rapperswil-Jona Mapathon/Hackathon at the OST Campus Rapperswil and virtually | 2022-04-29
IJmuiden OSM Nederland bijeenkomst (online) | 2022-04-30
London Missing Maps London Mapathon | 2022-05-03
Berlin OSM-Verkehrswende #35 (Online) | 2022-05-03
Boa Viagem BOA VIAGEM(CE) BRASIL – EDIFÍCIOS, ESTRADAS, PONTOS DE INTERESSE E ÁREA VERDE. | 2022-05-07 – 2022-05-08

Note:
If you would like to see your event here, please add it to the OSM calendar. Only events listed there will appear in weeklyOSM.

This weeklyOSM was produced by Lejun, SK53, SomeoneElse, Strubbl, TheSwavu, derFred.

Are you planning to attend the Wikimedia Hackathon from May 20-22 next month? We hope so! The main event will be held online.

Host a session

The call for sessions is now open on the schedule page. If you’d like to host a session, pick an open slot in the category that best fits your topic, and add yourself to the schedule. Want to run a session but not sure of a topic? Check out the interests that people have listed on the Participants list!

The Developer Advocacy team has put together some suggestions for how to create a fun session.

Propose a project for hacking

Hackathons are all about hacking! To propose a project or seek help on an existing project, add a task to the Phabricator board. As the board grows, check back to find projects you might want to hack on during the event.

Other ways to participate

In addition to attending (virtually or at a local meetup), you can also participate by volunteering to welcome newcomers, contribute translations, or help with other types of tasks. Check the wiki page for ideas.

More info

You can find more information on the Hackathon MediaWiki.org page, which will continue to grow over the next few weeks.

About this post

Featured image credit: File:A_cat_reads_the_Gerrit_commit_message_guidelines_on_MediaWiki.org.jpg, TBurmeister(WMF), Creative Commons Attribution-Share Alike 4.0 license

This Month in GLAM: March 2022

08:48, Wednesday, 13 2022 April UTC

DSA: Trilogues Update

08:03, Wednesday, 13 2022 April UTC

European Union (EU) lawmakers are under a lot of self-imposed pressure to reach an agreement on content moderation rules that will apply to all platforms. Several cornerstones have been placed either at the highest political levels (e.g., banning targeted ads directed at minors) or agreed upon on a technical level (e.g., notice-and-action procedures). But there is still no breakthrough on a few other articles, like the newly floated “crisis response mechanism.”  

The European Commission published its legislative proposal back in December 2020. The European Council adopted its position in December 2021, while the European Parliament agreed on its version in January 2022. We have previously compared the three stances from a free knowledge point of view. Since January, the three institutions have been in semi-formal negotiation procedures called “trilogues”, where they are trying to reach a final compromise. It is time for us to give you an update on the negotiations.

Whose Content Moderation And Rules Are We Talking About?

Online platforms allowing users to post content often have functions that allow these users to set up their own rules and actively moderate certain spaces. This is true for the now classical, but still very popular, online discussion forums, including Reddit groups, fan pages or club bulletin boards. It is especially true for Wikimedia projects, including Wikipedia, where volunteer editors make up the rules and moderate the space. 

With the Digital Services Act (DSA) imposing obligations for content moderation, it would be undesirable to put volunteer citizens who care about a space under the same legal pressures as professionals working full time for a corporation. Hence, we need to make sure that the definitions of “content moderation” and “terms and conditions” reflect that. Currently they both do. 

As of this week, both the Parliament and the Council agree to back the Commission proposal and define “content moderation” within this regulation as “activities, automated or not, undertaken by providers of intermediary services.”

When it comes to “terms and conditions” the two bodies have a slight disagreement. The Parliament position is to add a “by the service provider” clarification to the definition. The Council, however, believes that is already a given in the text, which reads:

(q) ‘terms and conditions’ means all terms and conditions or clauses, irrespective of their name or form, which govern the contractual relationship between the provider of intermediary services and the recipients of the services.

We welcome the fact that legislators and officials are having a conversation about this, with projects such as ours and online forums in mind. 

“Actual knowledge” of Illegal Content

A cornerstone of the DSA is to set up clear and straightforward rules for the interactions between users and providers with regards to content moderation. A notice-and-action mechanism is the first step. Then there are ways for users to contest the decisions—or indecision—of the service providers: internal complaints, out-of-court dispute settlements and, of course, court challenges. 

It was of the utmost importance for Wikimedia to highlight that not every notice the Wikimedia Foundation receives is about illegal content. This is crucial, as “actual knowledge” of illegal content forces action, usually a deletion. The agreed upon new text now includes language explaining that notices imply actual knowledge of illegal content only if a “diligent provider of hosting services can establish the illegality of the relevant activity or information without a detailed legal examination.”

A new addition in the negotiations is that the internal complaint handling mechanism would allow users to complain when platforms decide not to act on breaches of their terms and conditions.

Who Will Regulate and Oversee Wikipedia and its Sister Projects? 

According to the original Commission proposal, each Member State would designate a regulator responsible for enforcing the new rules. A platform would be regulated either where it is established or, if the service provider is not located within the EU, where it chooses to have a legal representative. During the trilogues, the Council suggested and the Parliament accepted that rules specific to Very Large Online Platforms (VLOP) should be enforced by the Commission. We generally welcome this move, even if it does create some inconsistency: of the Wikimedia projects, only Wikipedia is likely to be a VLOP, which means that it alone will be overseen by the Commission, while our other projects will be overseen by national authorities.

What Will This Cost Wikimedia?

As the idea to have the Commission play the role of a regulator received traction, another line of thought was also suddenly accepted: establishing a fee for VLOPs to pay for the additional Commission staff needed. The idea is for the DSA to give powers to the Commission to impose fees based on a delegated act.  

It took some back and forth, but the final proposal by the Commission is to waive the fee for not-for-profit service providers. 

Crisis Response Mechanism

Sparked by the invasion of Ukraine, the last weeks saw political pressure build up to include a “crisis response mechanism” into the regulation. It would empower the Commission to require that providers of VLOPs apply “specific effective and proportionate measures” during a crisis. A crisis is defined to take place “where extraordinary circumstances lead to a serious threat to public security or public health in the Union.” 

While we understand the need for such a mechanism in principle, we are uncomfortable with its wording. Several key points must be addressed:               

  • Decisions that affect freedom of expression and access to information, in particular in times of crisis, cannot be legitimately taken through executive power alone.
  • The definition of crisis is unclear and broad, giving enormous leeway to the European Commission. 
  • A crisis response must be temporary by nature. The text must include a solid time limit.

Targeted Advertising

It looks like the Parliament and the Council will agree to ban targeted advertising to minors as well as using sensitive data (e.g., political and religious beliefs) for targeting. Wikimedia generally supports everyone’s right to edit and share information without being tracked and recorded. 

Waiver for Nonprofits, Maybe?

It is still an open question whether the Council will accept the Parliament’s proposal to include a waiver allowing not-for-profits to be excluded from certain obligations, such as out-of-court-dispute settlement mechanisms. We welcome this, as it would avoid us setting up new mechanisms that could disrupt a largely efficient community content moderation system. But if the definitions of the DSA make it clear that this applies only to service provider decisions, we will not worry too much about it. 

General Monitoring Prohibition

Negotiators are still discussing a compromise that there should be no general obligation to monitor, “through automated or non-automated means,” information transmitted or stored by intermediary services. The Parliament wants to go further and clarify that this also includes “de facto” general monitoring obligations (i.e., rules that amount to general monitoring in practice). The thinking behind this is that sometimes several smaller obligations can lead the providers to a situation where they need to monitor all content. The Council is still pushing back on this.

We do believe that a ban on general monitoring is crucial to ensure intellectual freedom and support the Parliament’s position on this.

Next Steps

The next technical meeting of advisers and experts is on 19 April 2022. The next political round of negotiations is scheduled for 22 April 2022. Europe needs a set of general content moderation rules, and the DSA is on track to deliver exactly this. We hope that all parts of the regulation will be properly deliberated and proper safeguards will be enshrined. Wikimedia will continue to provide constructive input to lawmakers as well as participate in the public debate. 

How we deploy code

01:13, Wednesday, 13 2022 April UTC

Last week I spoke to a few of my Wikimedia Foundation (WMF) colleagues about how we deploy code—I completely botched it. I got too complex too fast. It only hit me later—to explain deployments, I need to start with a lie.

M. Jagadesh Kumar explains:

Every day, I am faced with the dilemma of explaining some complex phenomena [...] To realize my goal, I tell "lies to students."

This idea comes from Terry Pratchett's "lies-to-children" — a false statement that leads to a more accurate explanation. Asymptotically approaching truth via approximation.

Every section of this post is a subtle lie, but approximately correct.

Release Train

The first lie I need to tell is that we deploy code once a week.

Every Thursday, Release-Engineering-Team deploys a MediaWiki release to all 978 wikis. The "release branch" is 198 different branches—one branch each for mediawiki/core, mediawiki/vendor, 188 MediaWiki extensions, and eight skins—that get bundled up via git submodule.
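
As a quick check of the arithmetic above, the sketch below sums the per-component branch counts quoted in the post. The dictionary and function names are illustrative labels only, not part of any Wikimedia tooling.

```python
# Branch counts as quoted in the post; the keys are illustrative labels.
RELEASE_BRANCH_PARTS = {
    "mediawiki/core": 1,
    "mediawiki/vendor": 1,
    "extensions": 188,
    "skins": 8,
}


def total_branches(parts):
    """Sum the number of branches that make up one weekly release."""
    return sum(parts.values())


if __name__ == "__main__":
    print(total_branches(RELEASE_BRANCH_PARTS))
```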

Progressive rollout

The next lie gets a bit closer to the truth: we don't deploy on Thursday; we deploy Tuesday through Thursday.

The cleverly named TrainBranchBot creates a weekly train branch at 2 am UTC every Tuesday.

Progressive rollouts give users time to spot bugs. We have an experienced user-base—as Risker attested on the Wikitech-l mailing list:

It's not always possible for even the best developer and the best testing systems to catch an issue that will be spotted by a hands-on user, several of whom are much more familiar with the purpose, expected outcomes and change impact on extensions than the people who have written them or QA'd them.

Bugs

Now I'm nearing the complete truth: we deploy every day except for Fridays.

Brace yourself: we don't write perfect software. When we find serious bugs, they block the release train — we will not progress from Group1 to Group2 (for example) until we fix the blocking issue. We fix the blocking issue by backporting a patch to the release branch. If there's a bug in this release, we patch that bug in our mainline branch, then git cherry-pick that patch onto our release branch and deploy that code.
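The blocking rule can be pictured as a tiny state machine. This is a minimal sketch, not the Release Engineering tooling: the stage names in `GROUPS` (including a `group0` stage), the `next_group` helper, and the placeholder task ID are assumptions made for illustration.

```python
# Hypothetical rollout stages; only Group1 and Group2 are named in this post.
GROUPS = ["group0", "group1", "group2"]


def next_group(current, open_blockers):
    """Return the group the train may move to next.

    The train stays where it is while any blocking bug is open, and
    also stays put once it has reached the final group.
    """
    if open_blockers:
        return current  # blocked: fix and backport before rolling forward
    position = GROUPS.index(current)
    if position == len(GROUPS) - 1:
        return current  # already deployed everywhere
    return GROUPS[position + 1]


# "T000000" is a placeholder, not a real task.
print(next_group("group1", open_blockers=["T000000"]))  # stays on group1
print(next_group("group1", open_blockers=[]))           # moves to group2
```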

We deploy backports three times a day during backport deployment windows.  In addition to backports, developers may opt to deploy new configuration or enable/disable features in the backport deployment windows.

Release engineers train others to deploy backports twice a week.

Emergencies

We deploy on Fridays when there are major issues. Examples of major issues are:

  • Security issues
  • Data loss or corruption
  • Availability of service
  • Preventing abuse
  • Major loss of functionality/visible breakage

We avoid deploying on Fridays because we have a small team of people to respond to incidents. We want those people to be away from computers on the weekends (if they want to be), not responding to emergencies.

Non-MediaWiki code

There are 42 microservices on Kubernetes deployed via helm. And there are 64 microservices running on bare metal. The service owners deploy those microservices outside of the train process.

We coordinate deployments on our deployment calendar wiki page.

The whole truth

We progressively deploy a large bundle of MediaWiki patches (between 150 and 950) every week. There are 12 backport windows a week where developers can add new features, fix bugs, or deploy new configurations. There are microservices deployed by developers at their own pace.


Thanks to @brennen, @greg, @KSiebert, @Risker, and @VPuffetMichel for reading early drafts of this post. The feedback was very helpful. Stay tuned for "How we deploy code: Part II."

Statement by the jury of WLM 2021

18:52, Tuesday, 12 2022 April UTC

We, the Jurors of Wiki Loves Monuments 2021, have recently completed the jury process. The international team is in the process of finalizing the communications, and we as the 2021 jury feel it pertinent to point out a few elements, for clarity and transparency.

Peace monument in Gulu, Northern Uganda. WLM 2019, by Malaika Overcomer – CC BY-SA 4.0

Wiki Loves Monuments is an international photographic competition, born and developed inside the Wikimedia Community, to promote historic and cultural sites around the world. It has seen participation from more than 50 countries over the years. The competition has worked for over a decade towards creating a unique visual record of places that hold cultural heritage value, and we hope that these records can be used to help keep their stories alive, or repair any damage caused, in the years ahead.

Our final selections include pictures of monuments located in countries currently at war with other countries. In all the involved countries the Wikimedia Movement is present with Wikimedians and cultural activities in the aim of making human knowledge available to every person in the world. Cultural places, and as an extension monuments, belong to each and every single human being on this planet. Every country, regardless of where the monuments are located, is responsible for preserving them; they are not just “the countries’ monuments”, they are milestones of the whole human journey. We are all merely the current custodians of these cultural sites and must work to ensure that they are passed down to the next generations.

The awarded 2021 pictures were selected before the events of 24th February. We do confirm our choices now, with a war raging and destroying both people and heritage. We have Wikimedians in each of the involved countries, and we love them all. We are worried for their safety and wish them all the peace that an honest mind can imagine. We are waiting for them to start working together again as soon as possible, in peace. We love them all. We can only ask that peace and prosperity return to people's lives.

This being said, we find ourselves unable to present our results as we usually do. There is no joy, this year, in celebrating the results of the annual WLM competition. Our hearts are bleeding and our emotions are tinged with sadness. But our commitment to record and share cultures remains stronger and brighter than all that.

The Jurors of the 2021 edition of Wiki Loves Monuments

Wikidata query service updater evolution

14:41, Tuesday, 12 2022 April UTC

The Wikidata Query Service (WDQS) sits in front of Wikidata and provides access to query its data via a SPARQL API. The query service itself is built on top of Blazegraph, but in many regards is very similar to any other triple store that provides a SPARQL API.

In the early days of the query service (circa 2015), the service was only run by Wikidata, hence the name. However, as interest in and usage of Wikibase continued to grow, more people started running a query service of their own, for data in their own Wikibase. But you’ll notice most people still refer to it as WDQS today.

Whereas most core Wikibase functionality is developed by Wikimedia Deutschland, the query service is developed by the search platform team at the Wikimedia Foundation, with a focus on wikidata.org, but also a goal of keeping it useable outside of Wikimedia infrastructure.

The query service itself currently works as a whole application rather than just a database. Under the surface, this can roughly be split into 2 key parts:

  • Backend Blazegraph database that stores and indexes data
  • Updater process that takes data from a Wikibase and puts it in the database

This actually means that you can run your own query service, without running a Wikibase at all. For example, you can load the whole of Wikidata into a query service that you operate, and have it stay up to date with current events. In practice, though, this takes quite some work and expense on storage and indexing, and I expect not many folks do this.

Over time the updater element of the query service has iterated through some changes. The updater packaged with Wikibase, as used by most folks outside of the Wikimedia infrastructure, is now 2 steps behind the updater used for Wikidata itself.

The updater generations look something like this:

  • HTTP API Recent Changes polling updater (used by most Wikibases)
  • Kafka based Recent Changes polling updater
  • Streaming updater (used on Wikidata)

Let’s take a look at a high-level overview of these updaters, what has changed and why. I’ll also be applying some pretty arbitrary / gut feeling scores to 4 categories for each updater.

Fundamentally they all work in the same way: somehow find out that entities in a Wikibase have changed, get the new data from the Wikibase, and update the data in one or more Blazegraph backends, using a couple of differing methods.

Diagram from high-level query service architecture overview showing the general elements of a query service backend

HTTP API polling updater

Simplicity Score: 9/10
Legacy Score: 6/10
Scalability Score: 3/10
Reliability Score: 4/10

The HTTP API polling updater was the original updater, likely dating back to 2014/2015. It makes use of the MediaWiki recent changes data and API, normally polling every 10 seconds to look for new changes in the namespaces where Wikibase entities are expected to be found. If changes are detected, it retrieves the new data, removes the old data from the database, and stores the new data.

As the updater makes use of a MediaWiki core feature, no additional extensions, services or functionality need to be deployed to a Wikibase. It’s nice and easy to set up, requires minimal resources and is quite easy to reason about. So, high simplicity score.
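One polling cycle can be sketched roughly as follows. This is an illustration under stated assumptions rather than the updater's actual code: the `poll_once` function, its callable parameters, and the dictionary standing in for Blazegraph are all hypothetical, and no real HTTP requests are made.

```python
def poll_once(fetch_changed_ids, fetch_entity_triples, store, since):
    """Run one polling cycle and return the timestamp to poll from next.

    fetch_changed_ids(since) -> list of (entity_id, timestamp) pairs
    fetch_entity_triples(entity_id) -> set of triples for that entity
    store -> dict mapping entity_id to its triples (stand-in for Blazegraph)
    """
    latest = since
    for entity_id, timestamp in fetch_changed_ids(since):
        store.pop(entity_id, None)                          # remove the old data
        store[entity_id] = fetch_entity_triples(entity_id)  # store the new data
        latest = max(latest, timestamp)
    return latest


# Tiny in-memory stand-ins for a Wikibase and its recent changes feed.
changes = [("Q1", 5), ("Q2", 8)]
entities = {"Q1": {("Q1", "label", "one")}, "Q2": {("Q2", "label", "two")}}
store = {"Q1": {("Q1", "label", "old")}}

since = poll_once(
    lambda ts: [c for c in changes if c[1] > ts],
    lambda entity_id: entities[entity_id],
    store,
    since=0,
)
print(since, sorted(store))
```

A real updater would repeat this cycle on a timer (every 10 seconds by default, per the description above), carrying `since` forward between calls.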

Diagram from high-level query service architecture overview showing what happens at runtime of the RCPoller updater

A middle ground score for legacy is given, as this is the oldest updater but is still widely used by Wikibases around the world.

When judging scalability, we have to look at Wikidata. This updater would no longer work very well for the number of changes on Wikidata (let’s say 600 edits per minute) and the number of backend query service databases that those changes need to make their way to (12+ currently). This updater was designed to point at a single blazegraph backend.

By default Recent Changes only stores 30 days’ worth of data, so if your updater is broken for more than 30 days and you don’t notice, you’ll need to reload the data from scratch. For small Wikibases, this is one of the most noticeable and annoying things to happen.
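The retention limit boils down to a simple date comparison. The sketch below assumes the 30-day default mentioned above; the `needs_full_reload` helper and the example dates are made up for illustration.

```python
from datetime import datetime, timedelta

RETENTION = timedelta(days=30)  # default recent changes window described above


def needs_full_reload(last_successful_update, now, retention=RETENTION):
    """True when the updater has been down longer than recent changes are kept."""
    return now - last_successful_update > retention


now = datetime(2022, 4, 12)
print(needs_full_reload(datetime(2022, 4, 1), now))  # False: 11 days behind
print(needs_full_reload(datetime(2022, 3, 1), now))  # True: 42 days behind
```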

Kafka based polling updater

Simplicity Score: 5/10
Legacy Score: 9/10
Scalability Score: 5/10
Reliability Score: 7/10

I’ll gloss over the Kafka polling updater, as this was never generally used in the Wikibase space and only ever in Wikimedia production for Wikidata. It was used roughly between 2018 (created in T185951) and 2021.

At a high level, this updater simply replaced the MediaWiki recent changes HTTP API with a stream of recent changes that were written to Kafka by various event-related extensions for MediaWiki. These events contained information similar to what the recent changes API would provide, such as page title and namespace, and from this the updater can determine which entities have changed.

This loses points for simplicity, as it requires both running Kafka and extra extensions in MediaWiki to emit events. Top scores for legacy, as no one uses this solution anymore. Some scalability issues were solved, such as the elimination of repeated hits to the MediaWiki API, but as with the HTTP updater the total process is still duplicated as the number of backends scales up, and writes to backends are not very efficient. But as this didn’t rely on the public HTTP API, and instead used an internal Kafka service, it can have some extra points for reliability.

Streaming updater

Simplicity Score: 3.5/10
Legacy Score: 1/10
Scalability Score: 9/10
Reliability Score: 9/10

The streaming updater was fully rolled out to Wikidata at the end of 2021 (see T244590) and came with some more significant changes to the update process.

Simplicity decreases due to more components making up the process, as well as more complicated RDF diffing on updates. It gets a low legacy score as it’s currently in use and actively maintained by the Wikimedia search platform team. It solves a variety of scaling issues for Wikidata with an insane increase in updater performance and, due to this factor and more, is more reliable on the whole.

Similar to the Kafka based polling updater, the information about when entities changed comes from Kafka. A single “Producer” listens to this stream of entity changes and produces another stream of the RDF changes that need to be applied. This stream of RDF changes is then listened to by a “Consumer” on each backend, which runs write queries against the backend to update the data stored. Note the “Single Host” box in the diagram below.

Diagram from high-level query service architecture overview showing how the streaming updater is split between general services and per backend host services

Some major wins when it comes to this new implementation are:

  • Streaming rather than polling, so no waiting in between polls
  • Entity changes and the RDF from Wikibase are only retrieved once by the Producer in situations where multiple backends run
  • Only RDF changes are written into the database rather than removing all triples associated with an entity and rewriting new triples. This reduces blazegraph write load, as well as increases update speed.
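The second and third wins can be illustrated with plain Python sets. This is a rough sketch, not the streaming updater's implementation: triples are modelled as tuples, each backend as a set, and the `diff_triples` and `apply_patch` helpers and the sample data are assumptions made for the example.

```python
def diff_triples(old, new):
    """Return (to_delete, to_insert) for moving an entity from old to new."""
    return old - new, new - old


def apply_patch(backend, to_delete, to_insert):
    """Apply one RDF patch to a backend, modelled here as a set of triples."""
    backend.difference_update(to_delete)
    backend.update(to_insert)


# Made-up entity data: only the label changes between revisions.
old = {("Q1", "label", "Cat"), ("Q1", "P31", "Q2"), ("Q1", "P279", "Q3")}
new = {("Q1", "label", "House cat"), ("Q1", "P31", "Q2"), ("Q1", "P279", "Q3")}

# "Producer": compute the patch once.
to_delete, to_insert = diff_triples(old, new)

# "Consumers": each backend applies the same patch.
backends = [set(old), set(old)]
for backend in backends:
    apply_patch(backend, to_delete, to_insert)

print(len(to_delete) + len(to_insert), "triples written per backend")
print(len(old) + len(new), "triples touched by a full delete-and-rewrite")
```

The diff is computed once no matter how many backends consume it, and the patch only touches the triples that actually changed.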

These effects can be seen clearly on Wikidata. The number of requests to the API to retrieve RDF data for Wikibase entities has dropped (less load on MediaWiki). And in cases where a backend falls behind due to some issue and is then fixed, the backend will very quickly catch back up with the current state of Wikidata rather than taking hours.

The post Wikidata query service updater evolution appeared first on addshore.

Why good information on the environment matters

16:13, Monday, 11 2022 April UTC

Human-dominated landscapes tend to be homogenized in a way that’s often invisible to us. Tourists visiting anywhere in the tropics expect to see a lot of the same things: coconut trees, mangos, pineapples, bananas. Despite the fact that the tropics are some of the most biologically diverse regions of the planet, we see this artificial aggregation of a small number of common species. And alongside these intentional introductions are a whole lot of species that we have unintentionally spread around the world. Tramp species are species that have been spread around the world by human activity. Originally applied to ant species that had managed to find their way around the world like tramps or stowaways, the term has come to describe a group of species that are usually associated with human activity. While some tramp species become invasive species, most do not.

Most people are familiar with invasive species, but might have a hard time separating that concept from the related idea of introduced species. Familiar ideas like these got added to Wikipedia first (the invasive species article was created in 2002, while the introduced species article was created in 2003). The article on tramp species, on the other hand, wasn’t created until November 2021, when a student in Sarah Turner’s Advanced Seminar in Environmental Science class created it. It’s a concept that fills an important place in our understanding of this topic, but as long as it had no Wikipedia article, it was likely to be invisible to many people learning about the topic. Since undergraduates rely heavily on Wikipedia as a freely available alternative to textbooks, the topics that are missing from Wikipedia are more likely to slip through the cracks for students learning ecology.

Disease, as we have learned during the Covid-19 pandemic, is more than just the interaction between a pathogen and its host. There’s a whole world of environmental factors that mean there’s much more to disease transmission than simply infection rates. These sorts of things are part of the science of disease ecology, but more than a year into the pandemic, Wikipedia’s article on the topic was just a short overview. A student editor in the class was able to transform the article into something much more useful and informative to readers.

Climate change affects not only global temperatures, but also rainfall patterns and sea level rise. By expanding the ice sheet model and flood risk management articles, student editors were able to improve the information that’s out there for people trying to understand these important tools for forecasting changes in the world we live in. Other new articles created by students in the class include CLUE model, a spatially explicit land-use change model; Cooper Reef, an artificial reef in Australia; Indigenous rainforest blockades in Borneo; the Impacts of tourism in Kodagu district in Karnataka, India; and Soapstone mining in Tabaka, Kenya. Existing articles that they made major improvements to include Alopecia in animals, Blond capuchin, and Stream power.

Wikipedia’s coverage of environmental science is uneven. Many topics are covered well, but there are large gaps. Other articles suffer because they’re incomplete, badly organized, or out of date. This leaves a lot of room for student editors to make important contributions.

Image credit: Forest & Kim Starr, CC BY 3.0 US, via Wikimedia Commons

Tech News issue #15, 2022 (April 11, 2022)

00:00, Monday, 11 2022 April UTC

Tech News: 2022-15

weeklyOSM 611

09:42, Sunday, 10 2022 April UTC

29/03/2022-04/04/2022

Patterns in placenames [1] © see | map data © OpenStreetMap contributors

Mapping

  • Anne-Karoline Distel reported on a survey of Callan, Ireland, where address attributes (house numbers and street names) seem kind of curious.
  • Dino Michelini wrote (it) > en, in his blog, a well researched piece on the ancient Etruscan-Roman road Via Clodia. He also outlined what still needs to be done to improve the mapping of this road in OSM.
  • LySioS, an OSM France contributor, proposed (fr) > en that mappers in the field use an OSM business card to facilitate contacts with local residents.
  • LySioS also published (fr) > en a diary post for beginners about the ten commandments for OSM mapping (we reported earlier).
  • The OpenStreetMap tool set Neis-one.org now recognises MapComplete as a distinct data editor rather than just one of the ‘unknown’, as reported by MapComplete’s main developer.
  • The following proposals are waiting for your comments:

Community

  • The UN Mappers is now also choosing a Mapper of the Month. UN Mapper of the Month for April 2022 is SSEKITOLEKO.
  • Amanda McCann’s activity report for March 2022 is online.
  • Christoph Hormann shared his analysis of OSM-related group communication channels and platforms.
  • Minh Nguyen tackled the lack of a negative feedback option on the wiki and provided a JavaScript snippet to add to a user script page, so that one could chide any chosen contribution (an April Fool’s Day joke).
  • raspbeguy shared (fr) a small script, similar to git-blame, that indicates the last person who modified or deleted tags on an OSM element.
  • Seth Deegan has proposed adding the Translate extension to the OSM Wiki, something that would improve the process of translating articles on the Wiki. The proposal is open for comments.
  • The Ukrainian OSM community has published an appeal to the OSM community urging everyone to refrain from any mapping of the territory of Ukraine while the Russian–Ukrainian war is unfolding.

Events

  • OSMUS has honoured Ian Dees with the Hall of Fame Award.
  • Bryan Housel presented the 2.0 alpha of the new RapiD at SotMUS. The test instance shows high performance during tests.

Education

  • The group ‘Geospatial Analysis Community of Practice’ at the University of Queensland, Australia, has published an extensive tutorial on ‘spatial networks’ with R.

OSM research

  • Marco Minghini and his colleagues published a paper reviewing the initiatives from the Italian OpenStreetMap community during the early COVID-19 pandemic, discussing it from a data ecosystem perspective at both national and European scales.

Maps

  • [1] SeeSchloss created a map tool that uses OpenStreetMap data to visualise patterns in place names in various northern hemisphere territories.
  • MapTiler presented a short tutorial on ‘Customised maps made easy’.
  • Christopher Beddow wrote an article examining the bundle of geospatial components that make up Google Maps, and listed alternatives to each. He further suggests that bundling the alternatives is a strategy to compete with Google Maps as a widely used mobile app.

Did you know …

  • … the possibilities 1, 2, 3 of printing beautiful map based gifts?
  • flat_steps? The tag for steps where individual steps are separated by about 1 metre or more. Such steps may be accessible to some people who would otherwise avoid highway=steps.

Other “geo” things

  • CAMALIOT, an Android App, is a project run by a consortium led by ETH Zurich (ETHZ) in collaboration with the International Institute for Applied Systems Analysis (IIASA) and the European Space Agency. The app is gathering data for machine learning analysis of meteorology and space weather patterns.
  • Cartographers from Le Monde described (fr) > en the steps taken in making their maps, using the Ukraine situation as an example.
  • @MatsushitaSakura left a photo (zhcn) on an internet detective hobby club and asked for help to find out where it was. Another user (@猫爪子) found the possible location six months later with the help of overpass turbo and some detailed Danish mapping and showed the Overpass QL code (01:45) (zhcn) he used.

Upcoming Events

| Where | What | When |
|  | Skillshare Session: OSM Community Forum | 2022-04-08 |
| Berlin | 166. Berlin-Brandenburg OpenStreetMap Stammtisch | 2022-04-08 |
|  | OSM Africa April Mapathon: Map Kenya | 2022-04-09 |
|  | Open Mapping Hub Asia Pacific OSM Help Desk | 2022-04-11 |
| 臺北市 | OpenStreetMap x Wikidata Taipei #39 | 2022-04-11 |
| Roma Capitale | Incontro dei mappatori romani e laziali | 2022-04-11 |
| Washington | MappingDC Mappy Hour | 2022-04-13 |
| San Jose | South Bay Map Night | 2022-04-13 |
| 20095 | Hamburger Mappertreffen | 2022-04-12 |
|  | Open Mapping Hub Asia Pacific OSM Help Desk | 2022-04-13 |
| Michigan | Michigan Meetup | 2022-04-14 |
|  | OSM Utah Monthly Meetup | 2022-04-14 |
|  | Open Mapping Hub Asia Pacific OSM Help Desk | 2022-04-18 |
|  | OSMF Engineering Working Group meeting | 2022-04-18 |
|  | 150. Treffen des OSM-Stammtisches Bonn | 2022-04-19 |
| City of Nottingham | OSM East Midlands/Nottingham meetup (online) | 2022-04-19 |
| Lüneburg | Lüneburger Mappertreffen (online) | 2022-04-19 |
|  | Open Mapping Hub Asia Pacific OSM Help Desk | 2022-04-20 |
| Dublin | Irish Virtual Map and Chat | 2022-04-21 |
| New York | New York City Meetup | 2022-04-23 |
| 京都市 | 京都!街歩き!マッピングパーティ:第29回 Re:鹿王院 | 2022-04-24 |
|  | Open Mapping Hub Asia Pacific OSM Help Desk | 2022-04-25 |
| Bremen | Bremer Mappertreffen (Online) | 2022-04-25 |
| San Jose | South Bay Map Night | 2022-04-27 |
|  | Open Mapping Hub Asia Pacific OSM Help Desk | 2022-04-27 |
|  | [Online] OpenStreetMap Foundation board of Directors – public videomeeting | 2022-04-28 |
| Gent | Open Belgium 2022 | 2022-04-29 |
| Rapperswil-Jona | Mapathon/Hackathon at the OST Campus Rapperswil and virtually | 2022-04-29 |

Note:
If you would like to see your event here, please put it into the OSM calendar. Only data which is there will appear in weeklyOSM.

This weeklyOSM was produced by Lejun, Nordpfeil, PierZen, SK53, Strubbl, TheSwavu, derFred.

A Trainsperiments Week Reflection

13:59, Friday, 08 2022 April UTC

Over here in the Release-Engineering-Team, Train Deployment is usually a rotating duty. We've written about it before, so I won't go into the exact process, but I want to tell you something new about it.

It's awful, incredibly stressful, and a bit lonely.

And last week we ran an experiment where we endeavored to perform the full train cycle four times in a single week... What is wrong with us? (Okay. I need to own this. It was technically my idea.) So what is wrong with me? Why did I wish this on my team? Why did everyone agree to it?

First I think it's important to portray (and perhaps with a little more color) how terrible running the train can be.

How it usually feels to run a Train Deployment and why

Here's a little chugga-choo with a captain and a crew. Would the llama like a ride? Llama Llama tries to hide.

―Llama Llama, Llama Llama Misses Mama

At the outset of many a week I have wondered why, when the kids are safely in childcare and I'm finally in a quiet house well fed and preparing a nice hot shower to not frantically use but actually enjoy, my shoulder is cramping and there's a strange buzzing ballooning in my abdomen.

Am I getting sick? Did I forget something? This should be nice. Why can't I have nice things? Why... Oh. Yes. Right. I'm on train this week.

Train begins in the body before it terrorizes the mind, and I'm not the only one who feels that way.

A week of periodic drudgery which at any moment threatens to tip into the realm of waking nightmare.

―Stoic yet Hapless Conductor

Aptly put. The nightmare is anything from a tiny visual regression to taking some of the largest sites on the Internet down completely.

Giving a presentation but you have no idea what the slides are.

―Bravely Befuddled Conductor

Yes. There's no visibility into what we are deploying. It's a week's worth of changes, other teams' changes, changes from teams with different workflows and development cycles, all touching hundreds of different codebases. The changes have gone through review, they've been hammered by automated tests, and yet we are still too far removed from them to understand what might happen when they're exposed to real world conditions.

It's like throwing a penny into a well, a well of snakes, bureaucratic snakes that hate pennies, and they start shouting at you to fill out oddly specific sounding forms of which you have none.

―Lost Soul been 'round these parts

Kafkaesque.

When under the stress and threat of the aforementioned nightmare, it's difficult to think straight. But we have to. We have to parse and investigate intricate stack traces, run git blames on the deployment server, navigate our bug reporting forms and try to recall which teams are responsible for which parts of the aggregate MediaWiki codebase we've put together which itself is highly specific to WMF's production installation and really only becomes that long after changes merge to main branches of the constituent codebases.

We have to exercise clear judgement and make decisive calls of whether to rollback partially (previous group) or completely (all groups to previous version). We may have to halt everything and start hollering in IRC, Slack channels, mailing lists, to get the signal to the right folks (wonderful and gracious folks) that no more code changes will be deployed until what we're seeing is dealt with. We have to play the bad guys and gals to get the train back on track.

Trainsperiments Week and what was different about it

Study after study shows that having a good support network constitutes the single most powerful protection against becoming traumatized. Safety and terror are incompatible. When we are terrified, nothing calms us down like a reassuring voice or the firm embrace of someone we trust.

―Bessel Van Der Kolk, M.D., The Body Keeps the Score

Four trains in a single week and everyone in Release Engineering is onboard. What could possibly be better about that?

Well there is a safety in numbers as they say, and not in some Darwinistic way where most of us will be picked off by the train demons and the others will somehow take solace in their incidental fitness, but in a way where we are mutually trusting, supportive, and feeling collectively resourced enough to do the needful with aplomb.

So we set up video meetings for all scheduled deployment windows, had synchronous hand offs between our European colleagues and our North American ones. We welcomed folks from other teams into our deployments to show them the good, the bad, and the ugly of how their code gets its final send off 'round the bend and into the setting hot fusion reaction that is production. We found and fixed longstanding and mysterious bugs in our tooling. We deployed four full trains in a single week.

And it felt markedly different.

One of those barn raising projects you read about where everybody pushes the walls up en masse.

―Our Stoic Now Softened but Still Sardonic Conductor

Yes! Lonely and unwitnessed work is de facto drudgery. Toiling safely together we have a greater chance at staving off the stress and really feeling the accomplishment.

Giving a presentation with your friends and everyone contributes one slide.

―Our No Longer Befuddled but Simply Brave Conductor

Many hands make light work!

It was like throwing a handful of pennies into a well, a well of snakes, still bureaucratic and shouty, oh hey but my friends are here and they remind me these are just stack traces, words on a screen, and my friends happen to be great at filling out forms.

―Our Once Lost Now Found Conductor

When no one person is overwhelmed or unsafe, we all think and act more clearly.

The hidden takeaways of Trainsperiment Week

So how should what we've learned during our Trainsperiment Week inform our future deployment strategies and process? How should train deployments change?

The known hypothesis we wanted to test by performing this experiment was in essence:

  1. More frequent deployments will result in fewer changes being deployed each time.
  2. Fewer changes on average means the deployment is less likely to fail. The deployment is safer.
  3. A safer deployment can be performed more frequently. (Positive feedback loop to #1.)
  4. Overall we will: move faster; break less.
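Steps 1 and 2 can be illustrated with a toy probability model. It assumes every change carries the same small, independent risk of breaking a deployment, which is a simplification and not something we measured; the `RISK` value and the change counts are hypothetical.

```python
def failure_probability(changes, per_change_risk):
    """Chance that at least one of `changes` independent changes breaks."""
    return 1 - (1 - per_change_risk) ** changes


RISK = 0.001  # hypothetical per-change risk, chosen only for illustration

weekly = failure_probability(800, RISK)
four_per_week = failure_probability(200, RISK)
print(f"one train of 800 changes: {weekly:.2f}")
print(f"each of four 200-change trains: {four_per_week:.2f}")
```

Under this model each smaller deployment is individually less likely to fail, which is all that step 2 claims. The model says nothing about whether fewer bugs ship in total over the week.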

I don't know if we've proved that yet, but we got an inkling that yes, the smaller subsequent deployments of the week did seem to go more smoothly. One week, however, even a week of four deployment cycles, is not a large enough sample to say definitively whether doing train more frequently will result in safer deployments with fewer failures.

What was not apparent until we did our retrospective, however, is that it simply felt easier to do deployments together. It was still a kind of drudgery, but it was not abjectly terrible.

My personal takeaway is that a conductor who feels resourced and safe is the basis for all other improvements to the deployment process, and I want conductors to not only have tooling that works reliably with actionable logging at their disposal, but to feel a sense of community there with them when they're pushing the buttons. I want them to feel that the hard calls of whether or not to halt everything and rollback are not just their calls but shared in the moment among numerous people with intimate knowledge of the overall MediaWiki software ecosystem.

Better tooling—particularly around error reporting and escalation—is a barrier to entry for sure. Once we've made sufficient improvements there we need to get that tooling into other people's hands and show them that this process does not have to be so terrifying. And I think we're on the right track here with increased frequency and smaller sets of changes, but we can't lose sight of the human/social element and foundational basis of safety.

More than anything else, I want wider participation in the train deployment process by engineers in the entire organization along with volunteers.


Thanks to @thcipriani for reading my drafts and unblocking me from myself a number of times. Thanks to @jeena and @brennen for the inspirational analogies.

More Wikidata metrics on the Dashboard

14:17, Thursday, 07 2022 April UTC

We’re excited to announce some new updates to Dashboard statistics regarding Wikidata. As of April 2022, the Programs and Events Dashboard shares Wikidata details about merges, aliases, labels, claims, and more!

In early March, we rolled out the final batch of improvements from Outreachy intern Ivana Novaković-Leković. Ivana’s internship focused on improving the Dashboard’s support for Wikidata. After an overhaul of the system’s interface messages to add Wikidata-specific terminology — “Item” instead of “Article” and so on, for events that focus on Wikidata — Ivana worked on integrating Wikidata edit analysis into the Dashboard’s data update system. We deployed under-the-hood changes in February to begin collecting the data we would need — edit summaries from all tracked Wikidata edits. The final step was to add a visualization of that data, which you can see in action here.

The new Wikidata stats are based on analyzing the edit summary of each edit. The edit summaries for Items on Wikidata are more structured than the free-form summaries from Wikipedia and other wikis, making it possible to reliably classify most common types of contributions. For example, adding a label for a Wikidata item will result in an edit summary that includes the code `wbsetlabel-add`.
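To give a feel for how this kind of classification can work, here is a minimal sketch in Python. It is not the Dashboard's actual code: the function names, the category labels, and every action code in the mapping other than `wbsetlabel-add` are assumptions chosen for illustration, as is the `/* code:… */` summary layout used in the examples.

```python
import re
from collections import Counter

# Hypothetical mapping from edit-summary action codes to stat categories.
# Only "wbsetlabel-add" is taken from the post; the other codes and all of
# the category names are illustrative assumptions.
ACTION_CATEGORIES = {
    "wbsetlabel-add": "Labels added",
    "wbsetdescription-add": "Descriptions added",
    "wbsetaliases-add": "Aliases added",
    "wbcreateclaim-create": "Claims created",
    "wbmergeitems-to": "Items merged",
}

# Assumes the structured part of a summary looks like "/* code:details */".
ACTION_CODE = re.compile(r"/\*\s*([a-z]+(?:-[a-z0-9]+)*)")


def classify_summary(summary):
    """Return the stat category for one edit summary, or 'Unclassified'."""
    match = ACTION_CODE.search(summary)
    if match is None:
        return "Unclassified"
    return ACTION_CATEGORIES.get(match.group(1), "Unclassified")


def count_contributions(summaries):
    """Tally how many edits fall into each category."""
    return Counter(classify_summary(s) for s in summaries)


if __name__ == "__main__":
    example = [
        "/* wbsetlabel-add:1|en */ Douglas Adams",
        "/* wbcreateclaim-create:1| */ [[Property:P31]]: [[Q5]]",
        "fixed a typo",
    ]
    print(count_contributions(example))
```

Summaries with no recognizable code, or with a code missing from the mapping, land in an "Unclassified" bucket instead of being guessed at.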

There are some limitations to this strategy, however. Multi-part revisions — for example, adding a new property that also includes qualifiers and references — will only be partially represented in the stats. That example gets counted towards ‘Claims created’, but not towards ‘References added’ or ‘Qualifiers added’. The Wikidata API provides no direct method to count these details, but it’s possible to calculate them by comparing the ‘before’ and ‘after’ state of an Item via its complete JSON entity data. We may explore that in the future, but it would require some significant changes in the Dashboard’s storage architecture before that would be possible.
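As a rough illustration of that before/after comparison (not something the Dashboard does today), the sketch below counts references and qualifiers in two snapshots of an Item's JSON entity data and reports the difference. The function names are hypothetical, and the code assumes the usual entity layout in which `claims` maps property IDs to lists of statements, each with optional `references` and `qualifiers` fields.

```python
def count_details(entity):
    """Count reference blocks and qualifier snaks across all statements."""
    references = 0
    qualifiers = 0
    for statements in entity.get("claims", {}).values():
        for statement in statements:
            # "references" is assumed to be a list of reference blocks.
            references += len(statement.get("references", []))
            # "qualifiers" is assumed to map property IDs to lists of snaks.
            qualifiers += sum(
                len(snaks) for snaks in statement.get("qualifiers", {}).values()
            )
    return {"references": references, "qualifiers": qualifiers}


def added_details(before, after):
    """Net number of references and qualifiers gained between two snapshots."""
    old = count_details(before)
    new = count_details(after)
    # Clamp at zero so a revision that only removes details reports no additions.
    return {key: max(new[key] - old[key], 0) for key in new}


if __name__ == "__main__":
    before = {"claims": {}}
    after = {
        "claims": {
            "P31": [
                {
                    "qualifiers": {"P580": [{"snaktype": "value"}]},
                    "references": [{"snaks": {}}, {"snaks": {}}],
                }
            ]
        }
    }
    print(added_details(before, after))
```

Because this compares net totals, a revision that removes one reference and adds another would count as zero references added; matching statements by their IDs would be more precise, at the cost of more bookkeeping.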

Over the last several weeks, we’ve been backfilling the Wikidata stats for almost all the Programs & Events Dashboard events that edited Wikidata, and Campaign pages also show aggregate Wikidata stats.

Thanks, Ivana, for your great work!

So what does this mean for Dashboard users?

Anne Chen, a Wikidata Institute alumna, has been using Wikidata more in the archaeology course she teaches at Yale University. As you can see from this screenshot of a recent edit-a-thon, there are many more granular Wikidata statistics you can follow. Prior to this update, the Dashboard provided users with statistics limited to number of participants, items created, items edited, total edits, references added, and page views. Although these are useful statistics to have access to, the nature of Wikidata editing can demand other sets of metrics.

Screenshot from Dashboard
Wikidata Dashboard detailed statistics example

Merging, for instance, is an important feature of Wikidata editing. Data is coming to Wikidata from different corners of the world all at once, so duplication is a natural occurrence on Wikidata, but it still needs to be addressed. Now this specific metric is easy to track on the Dashboard. Similarly, label, alias, and description work is essential for transition, disambiguation, and providing context to users about items. These statistics used to be more difficult to discover; now they show up in the statistics box on the Dashboard.

Screenshot of the Download Stats button
Download stats button on the Dashboard

If you are an experienced Dashboard user, you may be used to obtaining these statistics from the “Download stats” button on the home tab of the Dashboard. This button still exists, so if it’s more convenient to have these stats as a CSV file, you can still get them that way! For those curious about what “other updates” means, those are edits made outside of the item space on Wikidata. This would include user pages, talk pages, and WikiProject pages.

We’re excited that these new statistics are more accessible, since users will have different outcomes for their projects. The more statistics we can track, the better we can tell the stories of our impact and work on Wikidata. We hope you enjoy these new features.

If you’re interested in learning more about Wikidata, editing Wikidata, and Wikidata statistics, keep an eye on our calendar for future courses.

Sowt and the Wikimedia Foundation have partnered to produce a new season of Sowt’s podcast Manbet. Hosted and written by veteran content creator Bisher Najjar, the educational podcast explores various topics from the fields of humanities and society.

This partnership is a result of Sowt and the Wikimedia Foundation’s mutual vision to share knowledge with the world. The Wikimedia Foundation is the global nonprofit that operates Wikipedia and the other Wikimedia free knowledge projects and aims at ensuring every single human can freely share in the sum of all knowledge. Sowt produces and distributes high-quality audio shows in Arabic to create a dialogue around the most important topics to Arab listeners across the world.

“This partnership brings together the power of audio storytelling and the importance of open knowledge and access to information. As a leader in Arabic podcast production, Sowt and the Wikimedia Foundation are aligned on expanding access to high quality audio content for Arabic speaking audiences,” said Jack Rabah, the Wikimedia Foundation Lead Regional Partnerships Manager (Middle East and Africa). “Together, we can build greater awareness and understanding of Wikipedia through a series of informative narrated podcasts for all listeners across the MENA region.” 

The first episode of Manbet, published in October 2020, was followed by three seasons with topics ranging from exploring the history of passports and the birth of Arab feminism to the fashion revolution, among many others. Tune in to the new season of Manbet to uncover topics such as Sufism, the life of Native Americans, the history of Yemen, the story of Nollywood, and humans’ ancient dream of flying.

“I believe that the partnership between Sowt and the Wikimedia Foundation enriches the production of content in the Arab Region and Manbet is the best program to reflect this kind of collaboration. This season of Manbet attempts to take our audience through a journey of history, cinema, music and thriller. This span of information and stories will give us an insight of how knowledge has been and will always be power,” said Ahmed Eman Zakaria, Manbet producer working with Sowt.

The new season of Manbet comes out in April 2022, and you can listen to new episodes wherever you get your podcasts.

Outreachy report #30: March 2022

00:00, Tuesday, 05 2022 April UTC

March was a tough month (my partner and I had dengue fever as we reviewed and processed initial applications), but we made it through.

✨ Team highlights

Sage developed new code to help us review and process school time commitments: Sage and I have been trying to develop strategies to review academic calendars quickly for years. We’ve gone from external notes to trying to gather data on specific schools and requesting initial application reviewers to assign students to us.