Production Excellence #39: December 2021

04:59, Wednesday, 19 January 2022 UTC

How’d we do in our pursuit of operational excellence last month? Read on to find out!

Incidents

One documented incident last month:

2021-12-03 mx
Impact: A portion of outgoing email from wikimedia.org was delivered with a delay of up to 24 hours. This affected staff Gmail and Znuny/Phabricator notifications. No mail was lost; it was all eventually delivered.

Image from Incident graphs.


Incident follow-up

Remember to review and schedule Incident Follow-up work in Phabricator. These are preventive measures and tech debt mitigations written down after an incident. Read about past incidents at Incident status on Wikitech.

Recently resolved incident follow-up:

Create paging alert for high MX queues.
Filed in December after the mail delivery incident, resolved later that month by Keith (Herron).

Limit db execution time of expensive MW special pages.
Filed in December after various incidents due to high DB/appserver load, carried out by Amir (Ladsgroup).


Trends

In December we reported 22 new errors, of which 5 have since been resolved; the remaining 17 have carried over to January. Of the 298 issues previously carried over, we also resolved 17, so the workboard still adds up to 298 in total.
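The bookkeeping above can be double-checked with a few lines of arithmetic (the figures are from this report; the variable names are my own):

```python
# Balance check for the December error workboard (figures from this report).
new_reported = 22                          # new errors reported in December
new_resolved = 5                           # of those, already resolved
new_carried = new_reported - new_resolved  # 17 carried over to January

carried_before = 298                       # backlog carried over from earlier months
old_resolved = 17                          # backlog items resolved in December

# Month-end total: previous backlog, minus resolutions, plus new carry-over.
total_open = carried_before - old_resolved + new_carried
print(total_open)  # → 298: the workboard total is unchanged
```

The 17 new unresolved errors exactly offset the 17 backlog items resolved, which is why the total holds steady.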

In previous editions, we sometimes looked at the breakdown of tasks that remained unresolved. This time, I'd like to draw attention to the throughput and distribution of tasks that did get resolved.

Production errors resolved in the month of December, by team and component (query):

  • Community-Tech (2): GlobalPreferences (1), CodeMirror (1).
  • DBA: DjVuHandler (1).
  • Editing-team: DiscussionTools (1).
  • Fundraising Tech: CentralNotice (1).
  • Growth-Team (8): GrowthExperiments (6), Image-Suggestions (1), StructuredDiscussions (1).
  • Language-Team: UniversalLanguageSelector (1).
  • Parsoid (1).
  • Product-Infrastructure: TemplateStyles (1).
  • Readers-Web (2).
  • Structured-Data (2).
  • Wikidata team: Wikidata-Page-Banner (1).
  • Missing steward (1): MediaWiki-Logevents (T289806: Thanks @Umherirrender!).

For the month-over-month numbers, refer to the spreadsheet data.


Outstanding errors

Take a look at the workboard and look for tasks that could use your help.
View Workboard

Oldest unresolved errors:

  • (June 2020) WikibaseClient: RuntimeException in wblistentityusage API. T254334
  • (June 2020) WikibaseClient: Deadlock in EntityUsageTable::addUsages method. T255706

Thanks!

Thank you to everyone who helped by reporting, investigating, or resolving problems in Wikimedia production.

Until next time,

– Timo Tijhof

💡 Did you know:

To find your team's error reports, use the appropriate "Filter" link in the sidebar of the workboard.

Job Interviews By Zombies

00:09, Wednesday, 19 January 2022 UTC

The golden rule of interviewing is: make yourself easy to hire.

Like many organizations at the moment, we’re hiring. As I’m interviewing people, I’m struck by how hard some folks make it for me to give them a “yes.”

Candidates tend to murmur things like, “we decided to move to Kubernetes” or “the team built a release pipeline.” Bromides like these make me wonder – what exactly did you (the candidate) do?

[Often in an interview, a candidate’s] answers are about “we”, “us”, and “the team.” The interviewer walks away having little idea what the candidate’s actual impact was and might conclude that the candidate did little.

– Gayle Laakmann McDowell, Cracking the Coding Interview

When you’re humble, you’re hard to hire

Being humble is an admirable trait. When candidates succumb to words like “team” and “we” and “us,” it’s possible they’re just being humble. But it’s also reasonable to conclude that the word “I” is absent from their pat answers because they didn’t actually do anything.

I’m compelled to press candidates, “what was your role in that project?” I have many questions I’d love to ask, and I don’t relish using our short time together asking questions to clarify vagaries. You’re easier to hire when I know what you did.

Take the blame? Take the credit

If you’re not the kind of person who shifts blame when things go wrong, then why shift the blame when things have gone right? Writers and speakers use the passive voice to hide who’s responsible for an action. “Mistakes were made” rather than “I made a mistake.”

An easy trick to detect the passive voice in writing is to inject the phrase “by zombies” after the verb:

Like: “Mistakes were made by zombies.”

Or: “The ball was thrown by the boy by zombies.”
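The zombie trick can even be roughly automated. Here’s a toy Python sketch (my own illustration, not from any real tool) that flags likely passive constructions with a regex and injects “by zombies” after the verb phrase. Real passive-voice detection needs a proper parser; this version knows only a handful of irregular participles:

```python
import re

# Toy passive-voice detector: a "to be" form followed by something that
# looks like a past participle. Deliberately naive -- it misses most
# irregular participles and will flag some adjectives by mistake.
BE_PLUS_WORD = re.compile(r"\b(was|were|is|are|been|being)\s+(\w+)", re.IGNORECASE)
IRREGULAR = {"made", "thrown", "done", "given", "taken", "built", "written"}

def looks_like_participle(word):
    word = word.lower()
    return word.endswith("ed") or word in IRREGULAR

def zombify(sentence):
    """Inject 'by zombies' after each detected passive verb phrase.
    Returns None when no passive construction is found."""
    found = False

    def repl(match):
        nonlocal found
        if looks_like_participle(match.group(2)):
            found = True
            return match.group(0) + " by zombies"
        return match.group(0)

    result = BE_PLUS_WORD.sub(repl, sentence)
    return result if found else None

print(zombify("Mistakes were made."))  # → Mistakes were made by zombies.
print(zombify("I made a mistake."))    # → None
```

Active-voice sentences come back as None: no zombies, no ambiguity about who did the work.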

There must be a corollary rule for job interviews.

Just say I: the unaccountable zombie rule of interviewing

You’re making it hard to hire you if you’re talking about the unaccountable zombies you worked with: “We decided it was more sensible to use boring technologies, so we went with MySQL” becomes “The unaccountable zombies I worked with decided it was more sensible to use boring technologies, so the unaccountable zombies I worked with went with MySQL.”

Just as I’d have to stop the interview to inquire, “I’m sorry, did you say the team of unaccountable zombies you worked with?” I’d have to stop the interview to clarify, “who exactly is we?”

The not-so-secret secret of hiring is: I want to hire you. I’m interviewing you after all—you must be one singular human being!

If you agree, just say “I.”

Improving Wikipedia’s coverage of racial justice

14:36, Monday, 17 January 2022 UTC

Wikipedia remains the product of the world in which it is created. A recent survey of US-based contributors to the English Wikipedia found that only 0.5% of editors identified as Black or African American. Making the contributor base more closely resemble the world at large is an important step toward a more equitable Wikipedia. And Wiki Education’s ability to attract a body of student editors who are substantially more representative of the population at large plays an important part in our ability to begin to address the issue of racial justice in the way Wikipedia articles are written.

Over the last few years, the tragedy of Kalief Browder, an African American boy who was imprisoned without trial on Rikers Island for three years (and later died by suicide), has received more of the attention it deserves. But in Spring 2017 it was still possible for a student in a Wiki Education-supported class to make important changes to the way he was presented in Google searches by adding 26 words to the lead section of the article.

Because of Wikipedia’s ubiquity, changes like this can be transformative. And student editors who improve Wikipedia articles as part of a Wiki Education-supported class are in an excellent position to address racial justice issues while making articles better for all readers.

In addition to creating and expanding articles about Black anthropologists like Donna Auston and John L. Jackson Jr., students in Hanna Garth’s BlackLivesMatter class created an article about the Museum of Black Joy and expanded the Black Lives Matter art article. Another group of students expanded the school discipline article to add to and expand coverage of disparities in the ways that school discipline impacts Black students.

When faced with a broad topic like school discipline, it’s easy to focus on the median. The task of trying to cover an entire area of knowledge like this can be overwhelming, and it’s entirely natural to start with what’s broadly applicable to all groups. But when you create a framework about the median, the next editor who comes along and tries to expand the page is likely to be guided by what’s already there. Over time, as a page is expanded and revised, you can end up with a page that seems complete, but actually elides important information. By adding topics like these to an article, student editors can fill important gaps. And when the experience of Black and other minoritized communities are added to Wikipedia, they become that much more visible to the world.

Students in Laura Gutierrez’s Hispanic USA class focused on a number of articles related to racial justice, in particular the Sterilization of Latinas articles, adding a lot of information about policies in California and the eugenicist organization that spearheaded them, and about the treatment of women in Puerto Rico. Striking in their additions was this quote:

the Immigration Act of 1924 further developed the idea that labor-migrants were needed, but women and children were not as there was a fear of Latino and Immigrant invasion

Another group of students in the class expanded the Bracero program article. By adding information about the wives and families of the men employed in this program, the students added humanizing dimensions to an article that previously covered the program in mostly economic and labor history terms. The class also created a new article about the 1970 takeover of the Lincoln Hospital in South Bronx by the Young Lords. The goal of the takeover was to raise awareness of the disparities in health care and health outcomes experienced in this primarily poor, primarily non-white community.

Topics related to the health of Black and minoritized communities were also addressed by a student in Carwil Bjork-James’ Biology and Culture of Race class who created an article on race and maternal health in the United States and by students in Diana Strassmann’s Poverty, Justice, and Human Capabilities class who created articles about medical racism in the United States and environmental racism in the United States. Other students in this class also expanded the racial disparities in the COVID-19 pandemic in the United States article and the racial capitalism article, among many others.

The article on discrimination based on skin color attracted edits both by a student in this class and by a student in Coleman Nye’s Feminist Approaches to Research class.

While Wikipedia policy does a good job of addressing deadnaming, the issue of slave names is only indirectly handled through the guidelines related to changed names. This is mirrored in a comparison between the encyclopaedia’s 1,000-word deadnaming article and the slave name article, which covers the common meaning, the practice in Ancient Rome, and Sinéad O’Connor’s conversion to Islam, all in a little over 300 words. This, in large part, reflects the composition of Wikipedia’s editing community. Student editors are part of the process of making that editing community more representative.

Interested in adding a Wikipedia assignment to your class? Visit teach.wikiedu.org for more information.

Image credit: August Schwerdfeger from Minneapolis, United States, CC BY 2.0, via Wikimedia Commons

weeklyOSM 599

11:13, Sunday, 16 January 2022 UTC

04/01/2022-10/01/2022

lead picture

Relative use of the various OSM editors 2009-2021 [1] | © plot by User:Nop

About us

  • The more than ten-year success story of weeklyOSM, as well as our editorial team, continues to develop but is also subject to change.
    • With new support, weeklyOSM will therefore be available from issue 595 not only in ZH-TW (Chinese Traditional) but also in ZH-CN (Chinese Simplified).
    • However, issues will no longer be published in PT-BR, as we can no longer ensure quality due to a lack of native language support. We suggest that readers use the PT-PT edition until we find someone who wants to fill this gap in the editorial team.

Mapping campaigns

  • qeef rightly stated in his blog post that ‘data quality matters’. Taking a cue from Johnwhelan’s cleanup suggestions, which we reported on in issue #598, qeef describes three quality-assurance steps for cleaning up the geometry of an area after a distributed mapping action when using the ‘Divide and map. Now.’ tasking manager.
  • Christoffs presented (pl) > de us with a new website showing all the defibrillator locations (menu and description only on (pl)) in Poland mapped in OSM.
  • Enock Seth Nyamador announced the start of an organised mapping activity in the Eastern Region of Ghana on the Talk-gh mailing list. The focus of this campaign is to add the smoothness tag to highways that have been surveyed as part of a research project in the area.
  • Following the large oil spill in Peru, the Peruvian OSM community has established two mapping tasks with its Tasking Manager to facilitate medical care in the affected region:
    • #4 Mapping of buildings (es) > de in Wachapea, Pakun, Umukai and Nazareth communities affected by oil spills.
    • #7 Mapping of access roads (es) > de to Wachapea, Pakun, Umukai and Nazareth communities.
  • UNGSC-DTLM-Ale_Zena works and maps for the UN Mappers Project. In a blog post, Alessandro expressed frustration about the poor quality of buildings added by beginners from a HOT task manager project for an extremely dense area, causing ‘another chaos in Mogadishu’.

Mapping

  • Ivan Ruggiero shared, on the Talk-africa mailing list, a problem with tagging he found in Cameroon. From an overpass query, he found more than 2000 nodes with 138 different invalid values for the place=* tag.
  • Emmanuel Jolaiya started a conversation on the Talk-africa mailing list about how to tag ‘football viewing centres’. He describes them as public venues for watching football that fall under neither cinemas nor pubs/bars with television. Responses to the thread show that they are present in other African countries too.
  • Mateusz Konieczny made a short summary of some interesting parts of the 2021 editor usage statistics.
  • Minh Nguyen shared his insights on how to detail school classifications, which has long been a tricky subject in OSM.
  • OSM now has over three million notes in the database. Unfortunately, note #3000000 (es) > en was closed as invalid.
  • Requests have been made for comments on the following proposals:
    • To deprecate embassy=embassy and promote embassy=yes instead.
    • New tag tourism=cabin for a small, single-unit, rentable building (or group of buildings) with few amenities designed to accommodate lodging.
  • Voting is underway for the following proposals:
  • The proposal for the new tag amenity=parcel_locker was adopted with 47 votes in favor, 12 against and 1 abstention.

Community

  • Gigi Meikle described in her very detailed article the enormous importance of mapping small isolated communities and seasonally difficult-to-pass routes to them with OSM to help Operation Fistula (improving maternal health related to labour-related fistula formation and its treatment) succeed in Madagascar.
  • OpenCage continued its interviews with members of the OSM community in 2022. In the 9 January issue, Ronald Lomora (aka Romeo from Juba South Sudan) answered OpenCage’s questions.
  • Amanda’s OSM Activity Report for December 2021.
  • MapSãoFrancisco has begun (pt) > de its projects in the Velho Chico basin with collaborative mapping of lakes and marginal floodplains in the lower São Francisco in OSM. Project areas available for mapping can be selected either through MapSãoFrancisco or through the HOT Tasking Manager.
  • Simon Poole has updated the contributor statistics reports on the wiki.
  • Thomas Straupis shared his personal and very critical opinion of OSM and its governance for discussion against the background of his participation in the ‘International Cartographic Conference 2021’.

OpenStreetMap Foundation

  • Microsoft donated $10,000 to OpenStreetMap in their latest FOSS Fund award round.

Events

Education

  • b-unicycling has started a new video series on YouTube. As she reports in her blog post, she will present the Derrynaflan Heritage Trail. In the first episode she maps the first buildings on the trail and shows how she created a route relation for it.

Software

  • GraphHopper Maps for Android is a Capacitor wrapper (Capacitor is an open source native runtime for building web apps) for the navi branch of GraphHopper Maps, which provides directions and (experimental) turn-by-turn navigation for car, bicycle and foot. The turn-by-turn navigation feature has many known limitations (which also exist in the browser version).
  • With his diary entry lubosb [sk] > en addresses people who come to Slovakia for tourist reasons. He recommends freemap.sk as a planning tool and OsmAnd for on-road use.
  • Manuel Reimer announced (de) > en a Firefox add-on called OSM Everywhere, which replaces maps by Google, Bing or Here with OpenStreetMap tiles on every website you visit.
  • Victor Shcherb, the founder of OsmAnd, summed up the features added in 2021 and ventures a view of what’s planned for 2022.
  • SomeoneElse presented his way of making a map-related prediction about flooded footpaths in relation to water levels measured at close upstream and downstream gauging stations.
  • wowirleben.de (where we live) (de) > en is a web project that takes OSM data, and groups the data by ‘is it relevant to a specific road?’. The results are presented in list form with a map view and it puts a focus on web links. It may become an interesting tool for checking the accuracy of links around places you know. For now, it’s only available for Germany. The website uses Google Analytics and AdSense. It is also possible to propose an entry, which can get promoted with a payment.

Did you know …

  • … that the Wikipage ‘3D Demo Areas‘ contains many links displaying 3D buildings on a large scale?
  • … the Open Etymology Map, an interactive map that shows the origin of names of streets and points of interest based on OpenStreetMap and Wikidata?

Other “geo” things

  • Matthias Maurer took Russian Orthodox Christmas as an opportunity to show two wintry Russian river landscapes from a high altitude.
  • Allan Mustard pointed out a highly interesting map of medieval Europe showing the situation in 1444 at the rise of the Ottomans. We also found a video showing the territorial development of Europe including the respective population figures from 400 BC to the year 2017.
  • French confectioner Mr. Laurent has created (fr) a topographical chocolate of Pic du Midi d’Ossau using public data from the French National Institute of Geographic and Forest Information (IGN).

Upcoming Events

  • Lyon: EPN des Rancy : Technique de cartographie et d’édition (2022-01-15)
  • 北京市: 公共交通绘制问题集中交流会 / Public Transport Mapping in China – Meetup 1 (2022-01-16 to 2022-01-17)
  • Hilversum: OSGeo.nl/OSM-NL/QGIS-NL Nieuwjaars Borrel (2022-01-16)
  • Lyon: Rencontre mensuelle Lyon (2022-01-18)
  • Bonn: 147. Treffen des OSM-Stammtisches Bonn (2022-01-18)
  • San Jose: South Bay Map Night (2022-01-19)
  • Lüneburg: Lüneburger Mappertreffen (online) (2022-01-18)
  • Dublin: OpenStreetMap Ireland – Virtual Map and Chat (2022-01-20)
  • Grenoble: Rencontre mensuelle du groupe OSM Grenoble (2022-01-24)
  • Bremen: Bremer Mappertreffen (online) (2022-01-24)
  • OSMF Engineering Working Group meeting (2022-01-24)
  • Hoeselt: Rode Kruis Mapathon – Hoeselt (2022-01-25)
  • OSM Utah Highway Classification Documentation Sprint (2022-01-27)
  • Düsseldorf: Düsseldorfer OpenStreetMap-Treffen (online) (2022-01-26)
  • Missing Maps: online Mapathon des DRK (2022-01-26)
  • Online: OpenStreetMap Foundation board of Directors – public video meeting (2022-01-27)
  • Amsterdam: OSM Nederland nieuwjaarsbijeenkomst (online) (2022-01-29)
  • Bogotá, Distrito Capital: Reunión Trimestral OSM-LatAm (2022-01-29)
  • 京都市: 京都!街歩き!マッピングパーティ:第29回 鹿王院 (2022-01-30)
  • Brussels (Ville de Bruxelles – Stad Brussel): Virtual OpenStreetMap Belgium meeting (2022-01-31)
  • London: Missing Maps London Mapathon (2022-02-01)
  • San Jose: South Bay Map Night (2022-02-02)
  • Landau an der Isar: Virtuelles Niederbayern-Treffen (2022-02-01)
  • Berlin: OSM-Verkehrswende #33 (online) (2022-02-01)
  • Lyon: EPN des Rancy : Technique de cartographie et d’édition (2022-02-05)

Note:
If you would like to see your event here, please add it to the OSM calendar. Only events entered there will appear in weeklyOSM.

This weeklyOSM was produced by JuliaBesaB, Nordpfeil, SK53, Sammyhawkrad, Strubbl, TheSwavu, YoViajo, conradoos, derFred.

21 ways we’ve made Wikipedia better

00:01, Saturday, 15 January 2022 UTC

January 15 is Wikipedia’s 21st birthday. Happy birthday, Wikipedia! In honor of the occasion, we at Wiki Education are reflecting on 21 ways our organization’s work has made Wikipedia better (in no particular order).

1. We’ve added a LOT of content to Wikipedia. In 2018, Wiki Education reached a milestone: Student editors in our Wikipedia Student Program had added as much content to English Wikipedia as was in the last print edition of Encyclopædia Britannica. Since then, we’ve steadily worked to add even more content. We’ve now added almost two full Britannicas to Wikipedia.

2. The depth and breadth of content we add covers all disciplines. Because we work with nearly every academic discipline taught in higher education, we improve a wide variety of topics. Want to know more about geology? Contested monuments? Islamic art and architecture? African archaeology? Indigenous Canadians? Occupational epidemiology? Foreign literature? Latina artists? All of these are topics on Wikipedia improved by student editors through our program.

3. We helped fill content gaps in articles related to 9/11. In a collaboration with ReThink Media in the months prior to the 20th anniversary of 9/11, we brought peace and security studies experts to Wikipedia to improve articles related to September 11, the War on Terror, and related topics. While Wikipedia’s extremely active WikiProject Military History had led to extensive coverage of the specifics of war in these articles, our experts were able to identify and fill content gaps related to the context of humanitarian implications of war. Articles our scholars improved received more than 7 million page views.

4. We’ve improved knowledge equity content. Some of the examples in the prior point illustrate this, but to drill down: Wiki Education has spent nearly a decade inviting instructors who teach in courses related to race, gender, and sexuality and other knowledge equity content areas to teach with Wikipedia. The result? Articles like the one on Harlem Renaissance writer Rudolph Fisher, which as it was expanded caught the eye of a journalist who then wrote about Fisher, bringing him to even more prominence. Or the significant work Wiki Education does to counter the biography gender gap on Wikipedia.

5. We overhauled the 19th Amendment article. In collaboration with the National Archives and Records Administration, Wiki Education hosted a series of courses bringing historians and women’s studies experts to Wikipedia prior to the 100th anniversary of passage of the 19th Amendment. Over the course of several months, scholars improved articles related to suffrage, suffragists, and — in an advanced course — the article on the 19th Amendment itself. Prior to our scholars’ work, the article centered the narrative not just on white people, but also on white men — so our program participants helped shift Wikipedia’s narrative to center women as well as adding a new section about the continued disenfranchisement of women of color.

6. We’ve brought women in science to Wikipedia. Through our partnership with 500 Women Scientists, we’ve enabled 75 members of the group to add and expand biographies of women in STEMM to Wikipedia. This work is helping change the face of science on Wikipedia.

7. Our Year of Science sparked a burst of science editing. We declared 2016 to be the Year of Science, a focused campaign to bring more science editors to Wikipedia. The results exceeded our expectations — and launched our ongoing Communicating Science initiative. In 2021, five years later, we continued to add more content to more science articles than we did during the official Year of Science. In an age where providing neutral, fact-based science information is critically important, we’re both improving Wikipedia’s coverage of science — and teaching early career scientists the important skill of being able to teach science to a general audience.

8. We’re helping the world learn about the climate crisis. As part of our Communicating Science initiative, we’ve attracted several courses that specifically work to improve Wikipedia’s coverage of the climate crisis. In this post, we explain how students from eight different universities helped add better scientific information related to climate change on Wikipedia.

9. We shaped Wikipedia’s coverage of the COVID-19 pandemic. Many student editors in our Wikipedia Student Program have improved articles related to the pandemic, from articles on vaccines and diseases to its wider effects and impacts. We also ran a series of courses in our Scholars & Scientists Program where we brought subject matter experts to Wikipedia to improve articles related to local, state, and regional responses to the pandemic from a public policy perspective. All told, we’ve helped millions of people learn more about the pandemic.

10. We’ve added biographies of Nobel laureates — before they were honored. Every Nobel Prize announcement season, readers flock to Wikipedia to read more about the scientists receiving the honor. Sometimes, notably in the case of Donna Strickland, the biography is missing — but Wiki Education’s helped avoid that in other cases. In 2018, one of our Wiki Scholars participants transformed Jennifer Doudna‘s Wikipedia article — two years before she won the Nobel Prize for Chemistry for her work on CRISPR. In 2017, the Nobel Prize in Physiology or Medicine was awarded to three laureates whose biographies were all created or expanded by student editors in our Wikipedia Student Program.

11. We taught a Nobel Laureate to edit. Our work with Nobel Laureates isn’t just limited to writing their biographies — we also taught one to edit Wikipedia! Dr. Bill Phillips, a 1997 laureate in physics, participated in our Wiki Scientists course in partnership with the American Physical Society. “Everyone who finds Wikipedia to be a good resource ought to contribute in one way or another, to the ongoing value of Wikipedia. One way of doing that, of course, is to act as an editor,” Phillips told us after the class.

12. We’ve inspired students to become editors. Dr. Phillips wasn’t the only one inspired by learning how to edit Wikipedia through our programs. “I call my senators, I vote, I donate to the ACLU, and now, I edit Wikipedia,” wrote a Rice University student. We’ve similarly inspired several other students to become Wikimedians.

13. We’ve inspired staff to edit. After six years of working for us, our Wikipedia Student Program Manager, Helaine Blumenthal, finally got the itch to edit herself. (Most of our staff also edit as volunteers.) Helaine reflected on her work creating the article on Impact of the COVID-19 pandemic on people with disabilities. “I was both dismayed but unsurprised to find a paucity of information on the topic, but I’m hopeful that my article sparks others to think about how COVID has affected populations already at high risk for a host of physical, emotional, and socioeconomic disadvantages,” she says.

14. Our staff has also reflected on knowledge equity. Wiki Education’s staff are part of the broader Wikimedia community, and we as a movement are thinking about knowledge equity as a key pillar of our current strategy. Our Senior Wikipedia Expert, Ian Ramjohn, reflected on how to represent Indigenous knowledge in our projects. Wikidata Program Manager Will Kent wrote about diversity and how we as a community generate lists of equity topics to improve. These kinds of reflections are important not just for us but for our community at large.

15. We provide the software for global program leaders to manage their work. Wikipedia is enhanced not only by our work, but by the work of program leaders all over the world. And thousands of these program leaders use Programs & Events Dashboard, available on WMF Labs, to manage, track, and report on their programs and events. The Dashboard is software originally built for Wiki Education’s own course management; today, we’ve added popular features like authorship highlighting that have made the Dashboard a key tool in the Wikimedia movement.

16. We’ve documented our work so others can learn. Our commitment to supporting other program leaders isn’t just technology. We also run a variety of programs and initiatives, and as part of our ongoing commitment to documenting what we do and what we’ve learned, we publish evaluation reports on Meta, the central organizing wiki for the Wikimedia movement. To date, we’ve published seven of these detailed reports, offering information on how others in the movement could replicate our successes and avoid our mistakes. We believe these reports are a critical part of our commitment to documentation and knowledge sharing.

17. Our research on student learning outcomes helps further other education programs. In 2016, we also commissioned Dr. Zach McDowell to do a student learning outcomes research project. His results are useful for any education program leader looking to demonstrate learning outcomes from writing Wikipedia articles. Ensuring we’re providing a positive pedagogical experience is what draws many instructors to teach with Wikipedia.

18. Our work has inspired other researchers. It’s not just the research we commission; our programs have also inspired other researchers to publish about Wikipedia. Some of this research is about teaching with Wikipedia; other studies examine our programs’ impact on articles, or our impact on scholarly references. Because our Dashboard allows researchers to download CSVs of participant Wikipedia user names (not real names), we’re often not even aware researchers are studying the impact of our programs until the research is published!

19. Our partnerships have transformed Wikipedia’s relationship with academia. A decade ago, Wikipedia was the bane of teachers’ existence. Today, thanks in part to our work, Wikipedia is embraced in the academy. This can partially be attributed to our work in fostering partnerships with academic associations who then encourage their members — professionals in that discipline — to participate in Wiki Education’s programs. Our partnerships with large associations like the American Sociological Association, American Chemical Society, American Physical Society, and American Anthropological Association — to name a few — have not only furthered our programmatic work in those topic areas on Wikipedia, but have also helped shape the perception of Wikipedia in academia.

20. The scale of our work has a huge impact on Wikipedia. Many other Wikimedia groups do important work similar to ours. What sets Wiki Education apart from our peers is the sheer size of our programs. In 2021, we taught 10,758 people who had never registered an account before how to edit Wikipedia. We bring in so many new active editors, in fact, that 19% of English Wikipedia’s new contributors come through our programs.

21. We’re changing the face of Wikipedia. And it’s not just the scale of contributors: It’s also the diversity. While only 22% of existing contributors to Wikipedia in Northern America identify as women, Wiki Education’s programs are working to change that: 67% of our participants identify as women, and an additional 3% identify as non-binary or another gender identity. Similarly, while 89% of existing U.S. editors identify as white, only 55% of Wiki Education’s program participants do. Our programs are bringing dramatically more diverse participants to Wikipedia than our current core community, which helps us to further our collective mission to collect the sum of all human knowledge by bringing in a more diverse set of people and expertise.

If you’re inspired by our work, join us! Spread the word about teaching with Wikipedia to higher education instructors you know in the U.S. and Canada. Encourage organizations you’re a member of to partner with us to offer a Wikipedia editing course. Take one of our courses yourself. And donate to Wiki Education to help us carry these 21 ways we’ve improved Wikipedia into its 22nd year.

Image credit: Elya, CC BY-SA 3.0, via Wikimedia Commons

January Scots writing drive

15:35, Friday, 14 January 2022 UTC

By Dr Sara Thomas, Scotland Programme Coordinator at Wikimedia UK

Over the last 18 months we’ve been working with the Scots Wiki community and Marco Cafolla, running editathons to improve the quality of the Scots wiki. The community has done a huge amount of work, as documented here, including a lot of article deletion in order to address the issues that came to light in August 2019. However, what the Scots wiki needs in the long term is more editors. As we know from our work with Celtic Knot, working on a minority language Wikipedia can feel like a huge task, but many hands make light work!  

At the end of last year we surveyed previous editathon participants. In response to what folks said, we’re extending the two-day editathon to a week-long writing drive. We’ll also be including two evening sessions for collaborative editing – one on Tuesday 18th, 7-9pm, and one on Thursday 20th, 7-9pm. They’ll be held on Zoom, so please sign up through those Eventbrite links to get the joining info.

For January, we’ll be focussing on articles about languages, and on stub articles. There are also other tasks for those who are less confident with their written Scots to help out with, such as adding images or references. And of course, we highly recommend the Scots Language Centre’s collection of resources for anyone wanting to improve their written Scots!

Sign up on the dashboard and then drop by Scots wiki any time over the next week (17th-23rd January)!  You can see the on-wiki event page here.

The post January Scots writing drive appeared first on WMUK.

Most liked Wikidata tweets

07:51, Friday, 14 January 2022 UTC

Wikidata is 9, and Twitter has been around for Wikidata’s entire lifespan. So let’s take a look back through time at some of the most liked Wikidata tweets (according to Twitter’s free search) since its creation.

Personally, I think it’s rather cool that half of the tweets are in languages other than English!

Want this list but for Wikibase (the software that runs Wikidata)? Check out my Wikibase focused post!

2021, @wikidata 412 💕s

Announcement of the new Wikidata Query Builder by @wikidata!

2020, @ftrain 1.7K 💕s

Context is needed for this one, as the Wikidata reference is quite far down the thread.

The first tweet says:

Wikipedia just alerted me that an edit made from my IP address was reverted by a bot, which was surprising, so I looked into it, and now I need to have a whole new kind of discussion with my children.

@ftrain

The thread then goes on to detail that this includes Wikidata!

And by Wikipedia I mean THE WHOLE FOUNDATION including Commons and Wikidata.

@ftrain

2019, @simongerman600 857 💕s

Map image & associated Reddit thread. Very nice, reminds me of the Wikidata Map!

This map shows every lighthouse in the world found in OpenStreetMap and WikiData. Source: https://buff.ly/2pIDzGn

@simongerman600

2018, @christelmolinie 219 💕s

1st mass transfer of data from the collections of @MSR_Tlse on #Wikidata thanks to @RhesusNegatif trainee of @Ecoledeschartes The complete and structured data of nearly 200 Greek ceramics are accessible and reusable http://tinyurl.com/y8zxkqnk #OpenGLAM #OpenData

@christelmolinie, Twitter Translate

2017, @Modesto 191 💕s

There is quite some discussion under this tweet, but I haven’t read through it all.

Hello. I just removed “Villa de Puerto del Son” from Google Maps.
The solution: delete the coordinates of the object in Wikidata.

@Modesto, Twitter Translation

2016, @Amazing_Maps 405 💕s

I love maps, and Twitter does too!

I found a post covering this same map on Reddit, and conveniently this includes the link to the original Wikidata Query Service query. Unfortunately “Query timeout limit reached” :(

German cities with “bach” (eng. creek) in their name

Source: Wikidata

@Amazing_Maps

2015, @jboner 70 💕s

As Wikidata continued to grow, people got more and more interested in processing it faster! This post looks at quickly processing the JSON dumps from 2015 using Akka streams and Scala. The code is still on GitHub, but I imagine it takes a little more time to process using this method now!

Process the whole Wikidata in 7 minutes with your laptop (and Akka streams) http://engineering.intenthq.com/2015/06/wikidata-akka-streams/

@jboner

2014, @hillbig 58 💕s

The paper referenced in this tweet appears to have moved, but there is a copy on archive.org.

Wikidata: A Free Collaborative Knowledge Base, by Denny Vrandečić & Markus Krötzsch

Wikidata, the data version of wikipedia, is expanding rapidly. Research, software, services, etc. using this are expected to expand in the future. http://korrekt.org/papers/Wikidata-CACM-2014.pdf

@hillbig, Twitter Translation

2013, @yandex 20 💕s

That’s right: back in 2013, in Wikidata’s early years, Yandex gave the project a grant of 150,000 EUR.

Yandex supported the @Wikidata project from @Wikimedia. Read about what this project is and why we find it useful: https://habr.com/en/company/yandex/blog/182360/

@yandex, Twitter translation

The post Most liked Wikidata tweets appeared first on addshore.

WBStack in 2021 and the future

07:50, Friday, 14 January 2022 UTC

2021 is nearly over, WBStack is over 2 years old (initially announced back in 2019), and has continued to grow. The future is bright with wikibase.cloud looking to be launched by Wikimedia Deutschland in the new year (announced at WikidataCon 2021), and as a result, the code under the surface has had the most eyes on it since its inception.

Let’s take a closer look at some of the developments this year, and the progress that WBStack has made.

Current Usage

WBStack now has 148 individual user accounts registered on the platform, each able to create wikis. These accounts have created 510 wikis with Wikibase installed since the platform was initially put online, and 335 of those wikis are still currently published (the other 175 have been deleted).

                    Nov 2019   April 2020   May 2020   Nov 2021      Dec 2021
Platform Users      38         70           76         139           148
Non-deleted Wikis   –          –            145        306           335
All Wikis           65         178          226        476           510
Pages               –          –            –          1.4 million   1.9 million
Edits               –          200,000      295,000    4.1 million   4.6 million
WBStack edits over time

Data in this table primarily comes from previous blog posts and has been consistently gathered by me over the lifetime of the project.

Across all of the 510 wikis that have been created, over 4.6 million edits have been made to date, on 1.9 million pages. Comparing this to Wikidata, that’s 10% of where Wikidata was after 2 years and 2% of where Wikidata is today.

Generally speaking, WBStack growth has been accelerating, and I’m pretty sure that is down to a more certain future for projects that start their life on the platform, as well as a few projects in particular really ramping up their growth.

Of course, I’m speaking mainly from a technical standpoint of a scaling platform, rather than saying anything about the content, quality or coverage. I know for a fact that a large proportion of WBStack wikis are test sites, sandboxing sites and trial projects.

WMDE developments

Wikibase as a Service was represented on the 2021 Wikimedia Deutschland roadmap for Wikibase (as it was in 2020). In 2021 this was initially called “Wikibase evaluation service (Wikibase as a Service MVP): Development”:

We will build the Wikibase evaluation service, a way to instantly create a temporary Wikibase sandbox that will enable users to more quickly and easily evaluate if the software is a fit for their project. This is the MVP version of “Wikibase as a Service”. The goal of this MVP is to learn about the technical requirements and challenges of hosted Wikibases so that we can make decisions around a long-term strategy.

Development plan 2021

This commitment was one of the things that led to the open-sourcing of the WBStack components at the end of 2020. The plan also led to a continued investigation of the Wikibase as a Service space throughout 2021, and various bits of work being done on WBStack codebases by engineers.

This has included:

  • Enabling WBStack to power a Wikibase sandbox type feature which would allow users to create temporary Wikibases for quick prototyping, learning and experimentation
  • Adding the WikibaseManifest and Federated Properties features to the platform, along with a new configuration UI (which you can read about in a previous blog post)
  • Deployment work for the future wikibase.cloud service, which led to improvements such as a dedicated helm charts repository, and many other code changes to WBStack components

wikibase.cloud

As announced at WikidataCon 2021 (program entry, slides, notes)…

Wikibase.Cloud is a new platform based on WBStack code, but managed and maintained by Wikimedia Germany.

It is a “Wikibase as a Service” platform that will offer open knowledge projects a new way to create Wikibases quickly and easily.

Wikibase.cloud is designed for projects that are interested in quick start-up and simplicity rather than those that will need a lot of customisation.

We are currently getting the platform set up and ready for users. We anticipate launching in early 2022.

Pre Launch Slides

Be sure to watch the wikibaseug mailing list or Wikibase / WBStack telegram channels for updates on this.

Transition

Once wikibase.cloud is launched I plan on helping existing WBStack users to migrate to the new WMDE owned and operated platform. We will wait until that moment to decide how best to cross that bridge though. It will be possible to keep all data, and domains/addresses if desired.

Alongside existing wikis being migrated, the codebases associated with WBStack will also start to be owned and stewarded by WMDE. This is essentially the wbstack github organization currently. Though I imagine the wbstack branding for the platform code may remain. All code should remain open source in line with the Wikimedia values.

Once all of the above is complete, I plan on finally turning off wbstack.com, at least in terms of its ability to host wikibases for folks, at some point in 2022.

Reflection

It’s been quite a journey, but the journey is not yet over. I look forward to finally reflecting on all of this when I don’t have to worry about this service as an individual anymore.

I’m very excited for what wikibase.cloud means for the future of Wikibase adoption, the Wikibase ecosystem and the future of linked open data on the web.

The post WBStack in 2021 and the future appeared first on addshore.

Tech Lead Digest – Q3/4 2021

07:49, Friday, 14 January 2022 UTC

It’s time for the 5th instalment of my tech lead digest posts. I switched to monthly for 2 months, but decided to scale back to roughly quarterly. You can find the other digests by checking out the series.

🧑‍🤝‍🧑Wikidata & Wikibase

The biggest event of note in the past months was WikidataCon 2021, which took place toward the end of October 2021. Spread over 3 days, the event celebrated Wikidata’s 9th birthday. We are still awaiting the report from the event to know how many folks participated, and recordings of talks will likely not be available until early 2022, at which point I’ll try to write another blog post.

Just before WikidataCon the updated strategy for Linked Open Data was published by Wikimedia Deutschland which includes sub-strategies for Wikidata and the Wikibase Ecosystem. This strategy is much easier to digest than the strategy papers published in 2019 and I highly recommend the read. Part of the Wikidata strategy talks about “sharing workload” which reminds me of some thoughts I recently had comparing Wikipedia and Wikidata editing. Wikibase has a focus on Ecosystem enablement, which I am looking forward to working on.

The Wikibase stakeholder group continues to grow and organize. A Twitter account (@wbstakeholders) now exists, tweeting relevant updates. The group now has over 14 organizational members and 15 individual members, its budget is public, and it is working on getting some desired features implemented. If you are an organization or individual working in the Wikibase space, be sure to check them out! The group recently published a prioritized list of institutional requirements, and I’m happy to say that some parts of the “Automatic maintenance processes and updating cascades should work out of the box” area that scored 4 have already been tackled by the Wikidata / Wikibase teams.

I also have to give a special shout out to the Learn Wikidata interactive course. I haven’t completed the course myself, but it aspires to teach librarians, library staff members, and other information professionals how to edit Wikidata. The project uses software called Twine which allows creating sites and learning materials with branching narratives. You can find the code on Github. Thanks to the WikiCite grants for making this possible.

At the technical level:

  • Work on the Mismatch Finder continues
  • Federated Properties received another batch of development work and is likely to be announced for testing in the coming days, weeks or months
  • The Search Platform team at the foundation deployed the new Streaming Updater for the Wikidata query service, promising more efficient updating
  • Wikibase.Cloud was announced and has been worked on, which will be the Wikimedia Deutschland owned and operated version of wbstack.com using the same codebase. You can read more about what WBStack is in my introductory post.
  • The team worked on removing the need to set up cron scripts for various Wikibase components, such as change dispatching and also some database cleanup. This will be fully handled by the job queue in a future Wikibase release.

🌎Wider Wikimedia

New Wikimedia Foundation board members were elected, and have since been formally appointed, in a vote that saw higher turnout than the previous election. Maryana Iskander was appointed as the new CEO of the Wikimedia Foundation with an official start date of January 2022.

A screenshot of suggested edits from growth features

Growth features continue to be rolled out across more Wikipedia projects (it’s almost everywhere now). This feature provides users with a personalized homepage, suggested tasks, and a help panel. It’s been worked on by the WMF Growth team, and I really like the direction this is taking.

MediaWiki had its 1.37 branch cut, and the release is likely to go out in the coming days or weeks.

In the future, unregistered editors will be given an identity that is not their IP address. You can read the suggestions for how that identity could work and discuss on the talk page.

The technical leadership community of practice received an updated name.

A Toolhub catalogue has been developed to make it easier to find software tools relating to Wikimedia projects. (announcement)

🔗Links & Reading

Reading

Podcasts

Projects

  • Backstage.io, An open platform for building developer portals (Mentioned in the Twitter / Spotify podcast above)
  • Twine, an open-source tool for telling interactive, nonlinear stories.
  • notebooksharing.space, the fastest way to share your notebooks (blogpost)
  • skaffold, Local Kubernetes Development. Awesome, and in use for wikibase.cloud.

🧩Did you know?

The post Tech Lead Digest – Q3/4 2021 appeared first on addshore.

This Month in GLAM: December 2021

10:49, Wednesday, 12 January 2022 UTC

Adding women chemists to Wikipedia

16:38, Tuesday, 11 January 2022 UTC
head shot of Maggie Tam
Maggie Tam.
Image courtesy Maggie Tam, all rights reserved.

Chemist Maggie Tam had never edited Wikipedia before taking one of our recent 500 Wiki Women Scientists courses — in fact, she didn’t even know you could.

“I used to think that each Wikipedia article was written by a single author,” Maggie admits. “I didn’t realize that anyone can edit and make changes to articles. Before the class, I never clicked on the History page, or the Talk page of articles. It is very heartwarming to find out the extent of community collaborative involvement in the articles.”

Maggie is now part of that community. As a volunteer, she’s the Communications Committee Co-Chair for Females in Mass Spectrometry, a nonprofit community that supports women in the field of mass spectrometry. In an effort to help improve Wikipedia’s coverage of the topic, she connected with 500 Women Scientists, an organization that partnered with Wiki Education to offer this course, led by Wiki Education’s Will Kent and Ian Ramjohn. Maggie signed up.

“I imagined the class to be similar to learning to drive, that there would be studying about rules, a road experience in a car with an instructor and dual brakes, and finally a road test,” Maggie explains. “I was skeptical to hear that we could begin to edit in the real Wikipedia (not just the sandbox) after the first week of training, which would be equivalent to driving on the road! As a matter of fact, it is astonishingly easy to start editing and creating an article, especially with visual editing. The course covered a good number of Wikipedia policies and resources to give us confidence. In the driving analogy, Will Kent and Ian Ramjohn are the dual brakes, who helped troubleshoot issues. There is a continuous road test in the form of reviews and edits from the Wikipedia community.”

She started by creating an article on chemist Hilary R. Bollan. Next, using Professor Hannes Röst’s list of mass spectrometrists, she created the articles for two “red links”, or missing articles, Catherine E. Costello and Jennifer Van Eyk. Then she made edits to the existing articles on Ying Ge and Vicki Wysocki.

The outcomes were great for representation of women chemists on Wikipedia — and for Maggie, who says she liked the class setting.

“I enjoyed the comradeship,” she says. “Once a week, I get to spend my lunch hour with other women scientists from different parts of the world, all working towards creating biographies to improve representation on Wikipedia.”

Now that the class is over, Maggie intends to keep working on adding more women scientists to Wikipedia, and engaging others in the Females in Mass Spectrometry group with an edit-a-thon using the 500 Women Scientists and Wiki Education resources. She wants Wikipedia’s coverage of women scientists to reflect the reality of the women already in the field — and inspire the next generation of scientists.

“There is a song in Girl Guides called ‘Yes She Can’,” Maggie says. “When I ask girls in my Girl Guides Brownies unit to research on women role models, they often start with online resources — Wikipedia being one of them. These little girls will learn more about the amazing pioneering work of women scientists when more articles exist in Wikipedia.”

Interested in hosting a course like the 500 Women Scientists course Maggie took? Visit partner.wikiedu.org.

weeklyOSM 598

10:37, Sunday, 09 January 2022 UTC

28/12/2021-03/01/2022

lead picture

Showcase of detailed mapping in the traffic area in Berlin-Neukölln [1] | © Straßenraumkarte | map data © OpenStreetMap contributors

Mapping campaigns

  • Marjan, from TomTom, informed us of the project they have planned in the Dominican Republic (similar to those they have carried out in many other countries). They compared their data with the data in OSM and opened a MapRoulette Challenge to check the missing roads.

Mapping

  • A Taiwanese mapper, 琉璃百合, mentioned (zhtw) > en a traffic light tagging issue on the Talk-cn mailing list. She believes that some Chinese mappers incorrectly tag traffic lights at road intersections, and gave some correct examples.
  • DeBigC had set targets for 2021, which he reviewed at the end of the year, and set his OSM targets for 2022. His summary is provided with interesting links.
  • User Bxl-forever made the first changeset of 2022 (UTC) according to OSMCha.
  • Martin Wieland, from the unsurv project, has set himself the task of compiling a database of already installed video cameras that is as up to date as possible. Tools for the largely automatic mapping of such electronic eyes are to help with this project.
  • Jeremy G describes (fr) in his OSM Diary the process he used in adding the ‘industrial trail walk’ at l’Argentière-La Bessée in France as a relation.
  • Johnwhelan detected a lot of mistakes in some completed HOT Tasking Manager projects that did not have appropriate validation, and made some recommendations on how to clean up after completing a project in the Tasking Manager.
  • The proposal amenity=small_vehicle_parking, to tag parking locations of any kind of small road vehicle apart from bicycles, is waiting for your comments.
  • Voting is open for defensive_works=*, to tag military constructions or buildings designed for the defence of territories in warfare, until Tuesday 18 January.

Community

  • 48 contributors mapped every single day in 2021. Interested in more statistics? Just have a look at OSMStats by Pascal Neis.
  • User AndreyGeograf wrote (ru) > en about his experience with micro and nano mapping.
  • Janet Chapman reported on her experience with six years of ‘Crowd2Map‘.
  • OpenStreetMap Belgium’s Mapper of the Month for January is Koos Krijnders from The Netherlands.
  • As a response to an article titled ‘Should you fix errors and contribute to Google Maps for free?’ Cj Malone wrote an article providing insights about contributing open data to OpenStreetMap for free.

Local chapter news

  • Cartographers from the Digne area in France wrote (fr) > en about naming and tagging practices for pedestrian trails and how different renderers show them.

Maps

  • [1] Alex (user Supaplex030) continued to explore the boundaries of the level of detail that can be rendered based on OSM data with his public space map for Neukölln, Berlin. A blog post by Tobias Jordans shows many interesting details such as cycleways, turn lanes and junctions and explains how the data is processed to allow this rendering. One conclusion is that a collaboration on pre-processed OSM data for streets would be of huge benefit.
  • Reddit user burgerking_foot shared a map showing places named Å from Katapult. Funnily enough, all of them are distributed in northern Europe.

Software

  • The software Flatmap has been renamed to Planetiler. Working in a similar way to Tilemaker it allows production of OSM-based map files in Mapbox Vector Tile format (MVT).
  • OpenMobilityIndicators has been extended (fr) > en to the whole of France. The toolbox uses OpenStreetMap data to measure pedestrian accessibility in a given area.
  • qeef wrote about the second year of ‘Divide and map. Now.‘, which, as an example, enabled a simultaneous mapathon of 200 mappers by dividing up an area.

Releases

  • Dietmar Seifert has provided (de) an analysis of how many addresses in Germany are already mapped (de) in OpenStreetMap.

Did you know …

OSM in the media

  • Effective 1 January 2022, a total of 10 towns in Poland have been granted (pl) > en city status. As happens every year, the required changes have been implemented promptly by Polish mappers.

Upcoming Events

Where What Online When Country
OSM Africa Monthly Mapathon: Map Mauritania osmcalpic 2022-01-08
臺北市 OpenStreetMap x Wikidata Taipei #36 osmcalpic 2022-01-10 flag
OSMF Engineering Working Group meeting osmcalpic 2022-01-10
München Münchner OSM-Treffen osmcalpic 2022-01-11 flag
Helechosa de los Montes Reunión mensual de la comunidad española osmcalpic 2022-01-11 flag
Michigan Michigan Meetup osmcalpic 2022-01-13 flag
Berlin 163. Berlin-Brandenburg OpenStreetMap Stammtisch osmcalpic 2022-01-13 flag
Bochum OSM-Treffen Bochum (Januar) ONLINE osmcalpic 2022-01-13 flag
Lyon EPN des Rancy : Technique de cartographie et d’édition osmcalpic 2022-01-15 flag
Hilversum OSGeo.nl/OSM-NL/QGIS-NL Nieuwjaars Borrel osmcalpic 2022-01-16 flag
San Jose South Bay Map Night osmcalpic 2022-01-19 flag
147.Treffen des OSM-Stammtisches Bonn osmcalpic 2022-01-18
Bremen Bremer Mappertreffen (Online) osmcalpic 2022-01-24 flag
Hoeselt Rode Kruis Mapathon – Hoeselt osmcalpic 2022-01-25 flag
Düsseldorf Düsseldorfer OpenStreetMap-Treffen (Online) osmcalpic 2022-01-26 flag
Missing Maps online Mapathon des DRK osmcalpic 2022-01-26

Note:
If you would like to see your event here, please put it into the OSM calendar. Only data which is there will appear in weeklyOSM.

This weeklyOSM was produced by Lejun, Nordpfeil, PierZen, SK53, Sammyhawkrad, Strubbl, TheSwavu, derFred, miurahr, 快乐的老鼠宝宝.

Wikidata ontological tree of Trains

22:27, Friday, 07 January 2022 UTC

While working on my recent WikiCrowd project, I ended up looking at the ontological tree of both Wikidata entities and Wikimedia Commons categories.

In this post, I’ll look at some of the ontology mappings that happen between projects, some of the SPARQL that can help you use this ontology in tools, and also some tools to help you explore this complex tree.

I’m using trains as I think they are fairly easy for most folks to relate to, and also don’t have a massively complex tree.

Commons & Wikidata mapping

Depicts questions in WikiCrowd are entirely generated from these Wikimedia Commons categories, such as Category:Trains & Category:Steam locomotives. These are then mapped to items on Wikidata such as Q870 (train) & Q171043 (steam locomotive).

Wikimedia Commons categories quite often contain infoboxes on the right-hand side that link to a variety of resources for the thing the category is covering. And quite often there is a Wikidata item ID present, this is the case for the categories above.

Likewise on Wikidata statements for P373 (Commons category) will often exist for entities that are depicted on Commons.

In theory, this means that I could run a simple SPARQL query to find all Wikidata items that have matching categories on Commons, throw them into the WikiCrowd app and let people answer depicts questions.
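As a rough sketch (the query shape and the surrounding Python are my illustration, not WikiCrowd’s actual code), that naive mapping via P373 could look something like this:

```python
# Illustrative sketch only: the naive Commons/Wikidata mapping described
# above, expressed as a SPARQL query using P373 (Commons category).
# WikiCrowd's real question generation is more involved than this.
NAIVE_MAPPING_QUERY = """
SELECT ?item ?commonsCategory WHERE {
  ?item wdt:P373 ?commonsCategory .
}
LIMIT 100
"""

def uses_commons_mapping(query: str) -> bool:
    """Tiny sanity check that the query uses the P373 mapping property."""
    return "wdt:P373" in query
```

Run against the Wikidata Query Service, a query like this would return item/category pairs (for example Q870 alongside its trains category) that could then seed depicts questions.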

Unfortunately, the world is a little more complex than that. On both sides of the equation, the ontological tree has some hidden surprises. For example Category:Trains includes Category:Views from trains as a subcategory, and this will mostly not include images of trains.

Wikidata ontology

Generally speaking, you want depicts statements to use the most specific description of the thing being depicted. You can read more about this on the depicts project page on Commons. So if an image depicts a steam locomotive, perhaps it shouldn’t also say that it depicts a more generic locomotive.

But here we hit our first problem, as I had to say locomotive above, and not train!

As you can see on the right, locomotives make up a part of a train, rather than being an instance of or subclass of train. The description of train is currently “form of rail transport consisting of a series of connected vehicles”, implying multiple carriages.

So, a steam train is probably powered by a steam locomotive, but in itself it doesn’t constitute a train unless it’s made up of multiple carriages. Or does it? This is a question for another time…

Another part of the Wikimedia Commons advice on creating depicts statements is also relevant here. I drew wheels in this diagram! Does that mean wheels should be depicted? Imagine how many more parts of a train or locomotive there are in one of these photos?

File:Petagas Sabah Sabah-Heritage-Steam-Train-01.jpg – Photo by CEphoto, Uwe Aranas

Visually exploring the wider tree

You need to understand the area of data that you are working with, otherwise, you’ll end up making incorrect changes or incorrect assumptions.

Many tools exist to help you explore this space, but the one that I’m going to suggest today is a Wikidata visualization tool made by metaphacts.

This tool enables easy creation of diagrams and exploration of relations, such as the expanded view of trains & locomotives, all the way up to vehicles shown below. (Permalink to this diagram on the tool)

You can start on the tool homepage, and search for a single Wikidata item, such as Q870 (train).

This will load the item as the first node in your diagram, and prompt you to explore connections.

If you search for P279 (subclass of) you’ll find relations in both directions. On one side, you’ll find the more generic class of Q1301433 (land vehicle). The other side doesn’t contain locomotive for the reasons stated above, for this relation you’ll have to take a look at P527 (has part).

In SPARQL

The above methods are all well and good for people, but what if you want to write a tool making use of this tree in some way?

This is exactly what WikiCrowd currently does for depict statement questions in order to check if a less or more specific statement already exists before making an edit.

Looking up the tree at the superclasses, or less specific items you can do something like this:


SELECT DISTINCT ?i ?iLabel WHERE {
  wd:Q870 wdt:P279+ ?i .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}

Or looking down the tree, for all instances of, or instances of subclasses of, or subclasses of, or subclasses of subclasses of you can do something like this:


SELECT DISTINCT ?i ?iLabel WHERE {
  ?i wdt:P31/wdt:P279*|wdt:P279/wdt:P279* wd:Q870 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
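As a hedged sketch of the tool side (my own illustration, not WikiCrowd’s code), here is how a program might build a request for the superclass query above against the public query.wikidata.org endpoint, using only the standard library. A real client should also send a descriptive User-Agent and handle timeouts:

```python
from urllib.parse import urlencode

# Public Wikidata Query Service endpoint.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"

# The "superclasses of train (Q870)" query from above, on one line.
SUPERCLASS_QUERY = (
    "SELECT DISTINCT ?i ?iLabel WHERE { "
    "wd:Q870 wdt:P279+ ?i . "
    'SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". } '
    "}"
)

def build_wdqs_url(query: str) -> str:
    """Build a GET URL asking WDQS to return JSON results for the query."""
    return WDQS_ENDPOINT + "?" + urlencode({"query": query, "format": "json"})

# Fetching this URL (e.g. with urllib.request.urlopen) returns JSON whose
# results.bindings list contains one entry per superclass item.
url = build_wdqs_url(SUPERCLASS_QUERY)
```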

Finally

I want to write more about some of the ontological challenges on Wikidata, particularly how different types of people perceive the world. The case that came up in 2021 was how to use depicts statements on Commons to aid searching. But it turns out searching for “Cat” which could be either Q146 (house cat) or Q20980826 (cat – species of mammal) makes things hard.

I also might queue up some interesting ontological diagrams from Wikidata on Twitter, so be sure to follow! (Yes Tom, if you are reading this, I wrote this for you. Time to like and subscribe.)

Thanks to Lucas (@LucasWerkmeistr) for always being happy to talk to me about SPARQL :)

The post Wikidata ontological tree of Trains appeared first on addshore.

A first look at WikiCrowd

22:26, Friday, 07 January 2022 UTC

I have quite enjoyed the odd contribution to an app by Google called Crowdsource. You can find it either on the web or as an app.

Crowdsource allows people to throw data at Google in controlled ways to add to the massive pile of data that Google uses to improve its services and at the end of the day beat its competition.

It does this by providing a collection of micro contribution tasks in a marginally gamified way, similar to how Google Maps contributions get you Local Guide points etc. In Crowdsource you get a contribution count, a level, and a metric for agreements.

While I enjoy making the odd contribution when bored out of my mind and enjoy looking at the new challenges (currently at 2625 contributions), I always think that data like this should just be going out into the world under a free licence to benefit everyone.

So finally, introducing WikiCrowd, an interface, and soon to be app, that I developed over the new year period.

WikiCrowd Overview

WikiCrowd is hosted on toolforge and can be found at https://wikicrowd.toolforge.org/ (Source code on Github)

In order to contribute, you need some knowledge of the world, a Wikimedia account and that’s it!

Screenshot showing the wikicrowd application, listing various groups of questions users can contribute to

I’m very aware there are an infinite number of things similar to this in the world, even within the Wikimedia space, but this is my take, with a few small existing differences, and some that are in the works.

The tasks that are presented are intended to be achievable by anyone, with very little context needed about what Wikipedia / Wikimedia is, how structured data works there, and what is actually happening behind the scenes.

There is currently a very heavy bias towards image depictions, as it turns out that Wikimedia Commons categories are a fairly good starting point for easy data sets, and people enjoy looking at images.

When selecting a category, you get presented with repeated questions, with a simple yes, no, maybe option.

Screenshot of the wikicrowd application showing a single question asking if the image displayed clearly depicts a cat (which it does)

Currently, if you hit “Yes”, an edit under your account will happen shortly after submission. “No” and “Maybe” are recorded, but for now nothing else will happen.

The idea here is that people like simple tasks, focusing on a single thing to identify in a picture.

Other views exist, such as proposing Wikidata statements from lead text of Wikipedia articles, which has been taken from the Wikidata Game by Magnus.

At the time of writing this post, 25 people have tried the tool out in the 5 days since it was put online. There are around 10k questions currently in the system (being expanded daily), and 5.3k of them are already answered leading to 3.7k edits on Wikimedia Commons and Wikidata.

The future

The easiest thing to see in the future is the continued expansion of the categories used for image depicts questions. If you have any ideas leave a comment on this post or reach out to me (perhaps on Twitter).

Really I want this to be a native phone app, which will bring offline access, faster access, and have it a little closer to folks’ fingertips. I needed a backend API before getting there, and accidentally created a preliminary UI in the process. I’m currently working on a phone app using flutter.dev.

Currently, all “Yes” responses immediately result in edits, but I want to introduce a concept of agreement. In most cases, this would be 2 or 3 yes responses = 1 edit, which should increase accuracy and avoid mistakes by fast fingers.
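A minimal sketch of that agreement idea, in Python for illustration (the real app is Laravel/PHP, and these names are hypothetical): buffer “Yes” answers per question and only fire the edit once the threshold is met.

```python
# Sketch of the proposed agreement logic: an edit only fires once a
# question has collected `threshold` independent "yes" answers.
# Names are hypothetical; the real WikiCrowd app is written in PHP.
from collections import defaultdict

class AgreementTracker:
    def __init__(self, threshold=2):
        self.threshold = threshold
        self.yes_counts = defaultdict(int)

    def record_yes(self, question_id):
        """Record a "yes"; return True exactly when the edit should be made."""
        self.yes_counts[question_id] += 1
        return self.yes_counts[question_id] == self.threshold

tracker = AgreementTracker(threshold=2)
assert tracker.record_yes("q1") is False  # first yes: wait for agreement
assert tracker.record_yes("q1") is True   # second yes: make the edit
assert tracker.record_yes("q1") is False  # already edited, don't repeat
```

Returning True only on the exact threshold crossing also guards against the edit being made twice.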

Likewise, all “No” or “Maybe” responses currently don’t get shown to anyone a second time. I’d like to change that for the same reason.

I feel that there is some value in the negative responses being public somehow. I’m sure when training machine learning models people would also like the negatives, but I’m not sure how I may expose those yet.

The post A first look at WikiCrowd appeared first on addshore.

Pre-launch Announcement of Wikibase.Cloud [WikidataCon Writeup]

WikidataCon 2021 was in October 2021, and one of the sessions that I spoke in was a “Pre-launch Announcement and Preview of Wikibase.Cloud”.

The recording is now up on YouTube, and below you’ll find a write-up summary of what was said.

You can also find:

So what is wikibase.cloud?

It’s a new platform that has yet to be launched, that is based on WBStack code, but that will be managed and maintained by Wikimedia Deutschland (or Wikimedia Germany).

This is a Wikibase as a service platform, that exists to offer open knowledge projects a new way to create their own Wikibase very quickly and very easily.

It’s more designed for projects that are interested in a quick startup and simplicity rather than those that are going to need lots of customizations such as custom integrations, a variety of very specialized extensions, tools and so on.

As for the status of it currently, we are at the moment getting the platform set up and ready for users.

The current anticipated launch date is early 2022, so around January, but if at all possible our goal is to get that launched for our ecosystem sooner than that, so we will keep you updated on the progress.

Why are we (Wikimedia Germany) working on this

Over the years it’s become very apparent that a lot of skills are needed to run Wikibase, and there are a lot of things involved in the Wikibase / MediaWiki ecosystem as well: initially setting things up, updating things, and staying on top of all of these updates as things continue changing. For Wikibase there is also the cost, especially when compared to something like MediaWiki.

MediaWiki can run on a shared hosting system that costs a couple of dollars a month. With Wikibase, when you also want to have an elastic search and you also want to have a query service, the costs can go up quite quickly, which is especially difficult for people who just want to try it out in the first place or have smaller projects.

Wikibase as a service is a solution to this or at least some parts of this. It was developed a few years back and is all open source under the WBStack organization. It combines MediaWiki, Wikibase, Blazegraph, SPARQL UI, some of the tools like quickstatements, cradle, widar, all within one platform, and it all runs on a shared infrastructure, so there are lots of costs savings.

I wouldn’t be surprised if some individual wikibases cost far more than the few hundred that exist on WBStack at the moment.

This also brings down the effort required to maintain all of these things as there is one set of things to maintain.

Hopefully, this will really differentiate from just running the Wikibase software yourself as it will be super easy to run out of the box, you’ll just have to create an account, click a button, and you can have your Wikibase with all of these things set up straight away.

And as Sam said earlier, there is definitely still space for requirements that will be outside of the normal use cases to need to run Wikibase yourself as well.

Where are we?

So the goal of today’s very quick lightning talk is to let everyone know that this is something that we are looking at and working on, as opposed to a thing that is out there right now, ready for you to sign up for and ready for you to use.

In the interest of full transparency, we are letting you know what our status is and where we are.

We are going to be releasing more information about the service, including the launch of a waiting list (to answer an earlier question about how you can already register interest in joining), before December first.

We will get together some kind of nice and GDPR approved way for you to sign up for a waiting list to find out more information and sign up for the service at the point that it launches.

Just for the sake of clarity, any future announcements about wikibase.cloud are going to be made to the previously existing Wikibase user group mailing list, and we will also make them to the Wikibase telegram channel as well.

Questions & Answers

And that is it in terms of content, so we now have 5 minutes for any questions, and we will do our best to answer all of them in the 5 minutes allotted.

If there are any questions / burning questions / really important questions that we can’t get to today then please email me, my email is here, and we will try to get you an answer in a very timely fashion.

Will we need an invitation to register and create a Wikibase in the cloud?

Yes, for now.

Does / will it support federated properties?

Yes. WBStack.com already supports the first generation of federated properties right now.

How much will it cost to use?

Right now the goal is that, for the average user, this will generally not cost anything at all. This is not something that Wikimedia Germany intends to use as a revenue generator.

This is something that we are offering to the community to ensure that open knowledge projects can freely and easily create Wikibases.

We will continue to work with the community over the next year or so to figure out what makes sense: is there some kind of reasonable avenue for really high-volume, high-resource-consuming projects to contribute directly to the funding of the platform? But the goal of this is to be free.

Will it allow media?

In its first version, it will not allow media (That’s not technically in the WBStack code just yet). But in the future, Adam would like for it to allow media.

What will happen to the instances that are on WBStack? And will they be transferred to wikibase.cloud?

Yes. The longer-term idea here is that wbstack.com will stop existing. There will be a nice transition for the people that want to be transitioned.

Definition of a free knowledge project?

When we are thinking of free and open when it comes to free knowledge we are thinking of Wikibase instances that can be freely explored and queried or wikibase instances that consist of data that has a creative commons or compatible licence. So we are not necessarily thinking this is the right solution for closed system use cases (things that are specifically beneficial to one organization), but for wikibases that will help to create more knowledge for the world at large.

How will namespace prefixes be figured out?

Right now it will be as it is on WBStack, where you only have the Wikidata prefixes such as wd:, but one of the things that we would want to do (as everyone else that uses Wikibase wants) is add some other prefixes there for the local wikibases.

Where can I find the telegram group?

You can find the group linked from the Wikibase Community User Group page on meta.

What will wikibase.cloud offer more / different from what you offer now?

WBStack right now is a platform that is run by Adam in his capacity as a volunteer, which is a lot of work and responsibility for a single person.

Wikibase.cloud will be based on the WBStack open-source platform, but will be maintained by a full team of engineers at Wikimedia Germany, included in our annual roadmaps, and eventually folded into our broader support capabilities, including our partnerships team and our community communications team.

So in the coming year, what we launch will not necessarily be a full end-to-end software-as-a-service platform with all of the guarantees and assurances that an institution might be looking for, although that is what we strive to get to in the next year. But in the short term, you will know that you have a team of people who are working on this actively; it’s not something that is heavily burdening a single individual.

Technically and feature-wise it will have everything that WBStack.com currently has. In fact, it will have more, as WBStack.com will remain on MediaWiki 1.35, and Wikibase.cloud will start a bit further along than that.

How does the scale of wikibase.cloud compare to the scale of WBStack? Are there plans to support bigger wikis than currently?

I think the plan would be to support whatever fits within the scope of open knowledge projects in the ecosystem. The largest current Wikibase on WBStack is somewhere between 0.5 and 1 million entities. Minimally we would be supporting that, but the goal is that this is a feasible option for organizations that maybe don’t have strong customization needs, but that have significant numbers of items and entities, where being able to put those somewhere reliable and stable would be beneficial for them and their organization.

What domain/subdomain instances will there be?

Custom domains are already possible on wbstack.com, so they should also be possible on wikibase.cloud.

If I want to run an instance on wikibase.cloud with 2.7 million items about persons, will that scale?

I don’t see why not

The post Pre-launch Announcement of Wikibase.Cloud [WikidataCon Writeup] appeared first on addshore.

Laravel 8 with Wikimedia OAuth login

19:15, Thursday, 06 2022 January UTC

I recently wrote a little app called wikicrowd (blog post to follow) using Laravel and MediaWiki / Wikimedia authentication. It certainly wasn’t entirely out of the box, and the existing docs still need some tweaking.

This post reflects the steps I went through to set this app up, and it should only take a few minutes.

You can find a tag of the code at the end of this walkthrough on Github for PHP 8. (There is also a tag for PHP 7.4)

Shout out to the developers that worked on the Wikidata Mismatch Finder which is also a Laravel app with MediaWiki OAuth and was used as inspiration when writing this post, along with the documentation for the package used by Taavi.

Setup Laravel

First off I need a Laravel installation. Currently 8.x is the latest release series, and the installation docs say to run the below command.


curl -s https://laravel.build/demo-laravel-mediawiki-auth | bash

I’m not a fan of running random code from the internet on my machine, but this is what the docs say. It creates a directory at the location you specify at the end of the URL (in my case demo-laravel-mediawiki-auth), creates a laravel/laravel project, and does a composer install.

Modify Laravel

We also want to delete a couple of the migrations that we will not be using.


rm ./database/migrations/2014_10_12_000000_create_users_table.php
rm ./database/migrations/2014_10_12_100000_create_password_resets_table.php

Replace one of them with a new users table migration, in a file named something like 2021_12_29_000000_create_users_table.php


<?php

use Illuminate\Database\Migrations\Migration;
use Illuminate\Database\Schema\Blueprint;
use Illuminate\Support\Facades\Schema;

class CreateUsersTable extends Migration
{
    public function up()
    {
        Schema::create('users', function (Blueprint $table) {
            $table->id();
            $table->string('username')->unique();
            $table->timestamps();
        });
    }

    public function down()
    {
        Schema::dropIfExists('users');
    }
}

And tweak the model to match


<?php

namespace App\Models;

use Illuminate\Database\Eloquent\Factories\HasFactory;
use Illuminate\Foundation\Auth\User as Authenticatable;
use Illuminate\Notifications\Notifiable;
use Laravel\Sanctum\HasApiTokens;

class User extends Authenticatable
{
    use HasApiTokens, HasFactory, Notifiable;

    protected $fillable = [
        'username',
    ];
}

Run Laravel

A previous command suggested starting things locally using sail, which I’m going to use!


./vendor/bin/sail up

This will use sail and a docker-compose file that’s held within the project to set up the needed services, and probably too many of them. For the initial development of your app, you could likely comment out the meilisearch and selenium services.

You’ll always want to create the DB tables, as we will be using some of them for users. (You might want to remove some of these default tables later)


./vendor/bin/sail artisan migrate

You should now find your app accessible on localhost. For me this was http://localhost:80

Socialite

We will be using the socialite plugin for Laravel for authentication, alongside taavi/laravel-socialite-mediawiki which enables socialite to work with MediaWiki. So we need to install these.


composer require laravel/socialite
composer require taavi/laravel-socialite-mediawiki

The documentation of these packages contains everything you need.

First we need to configure the laravel-socialite-mediawiki plugin, with a new config file created at config/services.php, or this snippet added to the existing config.


return [
    'mediawiki' => [
        'identifier' => env('MEDIAWIKI_OAUTH_CLIENT_ID'), // oauth client id
        'secret' => env('MEDIAWIKI_OAUTH_CLIENT_SECRET'), // oauth client secret
        'callback_uri' => env('MEDIAWIKI_OAUTH_CALLBACK_URL'), // redirect url
        'base_url' => env('MEDIAWIKI_OAUTH_BASE_URL'), // base url of wiki, for example https://meta.wikimedia.org
    ],
];

And we need a basic OAuthLoginController in the app/Http/Controllers directory. This will handle the redirect to the OAuth target and the callback response, creating a new user within Laravel using the default model.


<?php

namespace App\Http\Controllers;

use Illuminate\Http\Request;
use Laravel\Socialite\Facades\Socialite;
use Illuminate\Support\Facades\Auth;
use App\Models\User;

class OAuthLoginController extends Controller
{
    public function __construct()
    {
        $this->middleware( 'guest' )->only( [ 'login', 'callback' ] );
        $this->middleware( 'auth' )->only( 'logout' );
    }

    public function login()
    {
        return Socialite::driver( 'mediawiki' )->redirect();
    }

    public function callback()
    {
        // The driver name must be 'mediawiki', matching config/services.php.
        $socialiteUser = Socialite::driver( 'mediawiki' )->user();

        $user = User::firstOrCreate( [
            'username' => $socialiteUser->name,
        ] );

        Auth::login( $user, false );

        return redirect()->intended( '/' );
    }

    public function logout( Request $request )
    {
        $this->guard()->logout();
        $request->session()->invalidate();

        return redirect( '/' );
    }

    private function guard()
    {
        return Auth::guard();
    }
}

We can make a new file for the auth routes at routes/auth.php


<?php

use Illuminate\Support\Facades\Route;
use App\Http\Controllers\OAuthLoginController;

Route::get('/login', [OAuthLoginController::class, 'login'])
    ->name('login');
Route::get('/callback', [OAuthLoginController::class, 'callback'])
    ->name('oauth.callback');
Route::get('/logout', [OAuthLoginController::class, 'logout'])
    ->name('logout');

And load these routes in app/Providers/RouteServiceProvider.php by adding the following to the boot method.


Route::prefix('auth')
    ->middleware('web')
    ->namespace($this->namespace)
    ->group(base_path('routes/auth.php'));

From the socialite side of things, everything should now be ready to go!

MediaWiki OAuth

I’ll be making an app that can log in using Wikimedia OAuth, but the same steps should work for any MediaWiki site that has the OAuth extension installed. There is quite a good guide for developers which will let you know what is happening behind the scenes here.

I can create a new OAuth consumer using the Special:OAuthConsumerRegistration page and clicking the Propose new consumer link. It doesn’t matter which Wikimedia site you perform this step on as OAuth is central to all sites.

I need to fill out the form, giving my app a name and making sure OAuth 1.0a is selected, as version 1.5.0 of the socialite mediawiki plugin only supports OAuth 1.0a currently. A bunch of other fields are also required.

You need to configure an OAuth callback (you won’t be able to change this) for where your app will be hosted. And if you want to perform local testing you’ll need a second consumer with a different callback URL. For this demo app I’m using http://localhost/auth/callback

Basic rights are enough for login, but if you want to perform actions in the future you may want to select more.

Accept, and hit Propose Consumer!

You’ll be presented with a consumer token and secret token. This would be the time to “write down” the secret so you don’t forget it. (Though you can always reset it)

By default, you, the proposer of a consumer, can start using it straight away, though it would need approval for other accounts to be able to use it.

We can now do the final configuration in the .env file of laravel.


MEDIAWIKI_OAUTH_CLIENT_ID=<consumer_token eg: hr83j081wjgt19ujgn0m1ghgj901>
MEDIAWIKI_OAUTH_CLIENT_SECRET=<consumer_secret eg: j901tjwf1j9gjsagj89ajgajkb921>
MEDIAWIKI_OAUTH_CALLBACK_URL=oob
MEDIAWIKI_OAUTH_BASE_URL=<wiki_url eg: https://meta.wikimedia.org>

Testing it!

You should now be able to use the starting Laravel UI and the Login link in the top right to complete your login flow.

If you were successful you should now only see a Home link in the top right, rather than login.

You can check your database using tinker


./vendor/bin/sail artisan tinker

And selecting all the users


>>> DB::select("select * from users");
=> [
     {#3541
       +"id": 1,
       +"username": "Addshore",
       +"created_at": "2022-01-06 08:59:56",
       +"updated_at": "2022-01-06 08:59:56",
     },
   ]

FAQ

Q: I want to be able to use the authenticated user tokens in the Laravel code to make API requests.

This is quite easily possible and I’ll write a followup post about this. You can find some currently working code in my wikicrowd app on Github.

Q: I get an error about a socialite argument being passed by reference


Taavi\LaravelSocialiteMediawiki\Socialite\One\WikiHmacSha1Signature::Taavi\LaravelSocialiteMediawiki\Socialite\One\{closure}(): Argument #2 ($value) must be passed by reference, value given

You are likely running PHP 8 and a version of the taavi/laravel-socialite-mediawiki package that is too old. If you want to use PHP 8, be sure to have taavi/laravel-socialite-mediawiki 1.6.0+

Q: I get an error about driver not being supported


Driver [wiki] not supported.

You have copied a bad example from somewhere. ‘mediawiki’ needs to be used in both your services and Controller, not ‘wiki’.

Q: What is ‘oob’ in the MediaWiki OAuth callback

oauth_callback must be set, and must be set to ‘oob’ when not using the prefix option. ‘oob’ stands for “out of band”.

The post Laravel 8 with Wikimedia OAuth login appeared first on addshore.

Understanding diversity through Wikipedia and Wikidata

17:28, Wednesday, 05 2022 January UTC

Wikipedia is a tremendous resource. It is also a biased resource, lacking in diversity in any number of ways. Wikipedia isn’t the only source that suffers from systemic bias – most collections do, whether it’s from an art museum, library, archive or elsewhere. Initiatives to improve representation within collections are becoming more commonplace, which is an important first step in improving diversity in any given collection. The most well-known Wikipedia project working in this area is called Women in Red, which strives to improve the representation of women on Wikipedia. This project achieves these outcomes by generating detailed lists. But where do these lists come from? And how can projects analyze other demographic variables to improve representation?

The answer to both of these questions is another Wikimedia project called Wikidata. Wikidata provides linked data representations of nearly 100 million people, things, and events (as of December 2021)…including every single Wikipedia article. The Wikipedia article for the astronaut Sunita Williams also has a corresponding Wikidata item. Wikidata is both human and machine readable, which means we can run queries of all of Wikidata to answer any number of questions that pop into our heads based on statements in Wikidata. You can think of a statement as a simple sentence that describes things. An example of a statement is Sunita Williams’ occupation is astronaut (among many other accomplished occupations). Taking a closer look at Sunita Williams’ Wikidata item, we can see that she identifies as female, went to Needham High School, and is of Indian and Slovene descent. These are all other statements on Wikidata. Women in Red generates their lists through Wikidata queries about gender statements on Wikidata. As we can see from that linked list, we can get more specific with our queries and ask Wikidata about ethnicity as well. So we can pick any of these other statements and run queries about them (like what are some occupations of Slovene Americans? How many people from Needham High became astronauts…spoiler: just one for now).

We can really start to learn about representation on Wikipedia when we start asking about every single person on Wikipedia (thanks to Wikidata’s machine readability, this takes seconds to do). For example, we could ask which languages astronauts most commonly speak. We could see which ethnicities are best represented (or underrepresented) in certain industries. Or we could switch out ethnicity for sexual orientation and see a similar list. We could pick any variable and run a query about it. Querying also allows us to do a more detailed analysis of all the Wikipedias that exist. We could see which language versions represent certain topics more than others. We can look at article size across different languages and whether or not articles even exist in a particular language.
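For example, the astronaut question above boils down to a short SPARQL query against the Wikidata Query Service. Here is a hedged sketch in Python that only assembles the query string (using the IDs as I understand them: P106 = occupation, Q11631 = astronaut, P1412 = languages spoken, written or signed; actually sending the query to the endpoint is left out):

```python
# Assemble (but don't send) a SPARQL query counting the languages that
# people of a given occupation speak. Property/item IDs as commonly used
# on Wikidata: P106 = occupation, P1412 = languages spoken.
def languages_of_occupation_query(occupation_qid):
    return f"""
SELECT ?language ?languageLabel (COUNT(?person) AS ?count) WHERE {{
  ?person wdt:P106 wd:{occupation_qid} ;
          wdt:P1412 ?language .
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en". }}
}}
GROUP BY ?language ?languageLabel
ORDER BY DESC(?count)
"""

# Q11631 is the item for "astronaut".
query = languages_of_occupation_query("Q11631")
assert "wdt:P106 wd:Q11631" in query
```

Swapping P1412 for another property (ethnicity, sexual orientation, and so on) gives the other breakdowns the paragraph describes.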

Generating lists like this is so helpful in identifying areas for improvement on Wikipedia. Before we continue we need to address two concerns. One is that for this approach, the lists do not represent reality; they only represent what is or isn’t on Wikipedia/Wikidata. There is a lot that will continue to be missing. The more we add to Wikidata, the better this will get. The other concern we need to acknowledge is that even though Wikidata represents data about sexual orientation, gender, and ethnicity, that doesn’t mean it does so perfectly — or even well. For example, the “sex or gender” property (P21) has evolved over time, but still has a long way to go before it can accurately represent the gender spectrum for all people. Demographic data is sensitive and there are still many questions the community is looking to answer in a respectful, accurate way. It’s also true that individuals may not want this kind of data shared publicly, which may skew the numbers. This isn’t a Wikidata-specific problem. Many systems and ontologies contend with this and, in fact, Wikidata’s openness has allowed properties like “sex or gender” to evolve and adapt very quickly.

In spite of these concerns, what we do have is a set of tools that can help us start to understand how diverse a given corner of Wikipedia is and how we can start to improve it. The insights that Wikipedia and Wikidata can provide us are far beyond what we could do previously and we must take advantage of it if we are ever to improve representation on these platforms.

Projects like Women in Red demonstrate that we can organize the community to begin to tackle systemic bias. If you have ideas of other ways to engage with systemic bias on Wikipedia and Wikidata, feel free to reach out to us to help make Wikipedia and Wikidata more accurate and representative of this world we live in.

Want to learn more? Enroll in our Wikidata Institute: wikiedu.org/wikidata.

Image credit: NASA, Public domain, via Wikimedia Commons

A Tale Of Code Review Review

New Wikimedia Code of Conduct

Remedial Skills In Open-To-The-Public Working Groups

Design, and Friction Preventing Design Improvement, in Open Tech

Inclusive-Or: Hospitality in Bug Tracking

Grief


Leadership Crisis at the Wikimedia Foundation

What Should We Stop Doing? (FLOSS Community Metrics Meeting keynote)

Comparing Codes of Conduct to Copyleft Licenses (My FOSDEM Speech)

How To Improve Bus Factor In Your Open Source Project

The Triumph Of Outreachy

Join Me In Donating to Stumptown Syndicate and Open Source Bridge

How I made a tidepool: Implementing the Friendly Space Policy for Wikimedia Foundation technical events