April 13, 2021

François Marier

Deleting non-decryptable restic snapshots

Due to what I suspect is disk corruption caused by a faulty RAM module or network interface on my GnuBee, my restic backup failed with the following error:

$ restic check
using temporary cache in /var/tmp/restic-tmp/restic-check-cache-854484247
repository b0b0516c opened successfully, password is correct
created new cache in /var/tmp/restic-tmp/restic-check-cache-854484247
create exclusive lock for repository
load indexes
check all packs
check snapshots, trees and blobs
error for tree 4645312b:
  decrypting blob 4645312b443338d57295550f2f4c135c34bda7b17865c4153c9b99d634ae641c failed: ciphertext verification failed
error for tree 2c3248ce:
  decrypting blob 2c3248ce5dc7a4bc77f03f7475936041b6b03e0202439154a249cd28ef4018b6 failed: ciphertext verification failed
Fatal: repository contains errors

I started by locating the snapshots which make use of these corrupt trees:

$ restic find --tree 4645312b
repository b0b0516c opened successfully, password is correct
Found tree 4645312b443338d57295550f2f4c135c34bda7b17865c4153c9b99d634ae641c
 ... path /usr/include/boost/spirit/home/support/auxiliary
 ... in snapshot 41e138c8 (2021-01-31 08:35:16)
Found tree 4645312b443338d57295550f2f4c135c34bda7b17865c4153c9b99d634ae641c
 ... path /usr/include/boost/spirit/home/support/auxiliary
 ... in snapshot e75876ed (2021-02-28 08:35:29)

$ restic find --tree 2c3248ce
repository b0b0516c opened successfully, password is correct
Found tree 2c3248ce5dc7a4bc77f03f7475936041b6b03e0202439154a249cd28ef4018b6
 ... path /usr/include/boost/spirit/home/support/char_encoding
 ... in snapshot 41e138c8 (2021-01-31 08:35:16)
Found tree 2c3248ce5dc7a4bc77f03f7475936041b6b03e0202439154a249cd28ef4018b6
 ... path /usr/include/boost/spirit/home/support/char_encoding
 ... in snapshot e75876ed (2021-02-28 08:35:29)

and then deleted them:

$ restic forget 41e138c8 e75876ed
repository b0b0516c opened successfully, password is correct
[0:00] 100.00%  2 / 2 files deleted

$ restic prune 
repository b0b0516c opened successfully, password is correct
counting files in repo
building new index for repo
[13:23] 100.00%  58964 / 58964 packs
repository contains 58964 packs (1417910 blobs) with 278.913 GiB
processed 1417910 blobs: 0 duplicate blobs, 0 B duplicate
load all snapshots
find data that is still in use for 20 snapshots
[1:15] 100.00%  20 / 20 snapshots
found 1364852 of 1417910 data blobs still in use, removing 53058 blobs
will remove 0 invalid files
will delete 942 packs and rewrite 1358 packs, this frees 6.741 GiB
[10:50] 31.96%  434 / 1358 packs rewritten
hash does not match id: want 9ec955794534be06356655cfee6abe73cb181f88bb86b0cd769cf8699f9f9e57, got 95d90aa48ffb18e6d149731a8542acd6eb0e4c26449a4d4c8266009697fd1904
github.com/restic/restic/internal/repository.Repack
    github.com/restic/restic/internal/repository/repack.go:37
main.pruneRepository
    github.com/restic/restic/cmd/restic/cmd_prune.go:242
main.runPrune
    github.com/restic/restic/cmd/restic/cmd_prune.go:62
main.glob..func19
    github.com/restic/restic/cmd/restic/cmd_prune.go:27
github.com/spf13/cobra.(*Command).execute
    github.com/spf13/cobra/command.go:852
github.com/spf13/cobra.(*Command).ExecuteC
    github.com/spf13/cobra/command.go:960
github.com/spf13/cobra.(*Command).Execute
    github.com/spf13/cobra/command.go:897
main.main
    github.com/restic/restic/cmd/restic/main.go:98
runtime.main
    runtime/proc.go:204
runtime.goexit
    runtime/asm_amd64.s:1374

As you can see above, the prune command failed due to a corrupt pack, so I followed the process I previously wrote about and identified the affected snapshots using:

$ restic find --pack 9ec955794534be06356655cfee6abe73cb181f88bb86b0cd769cf8699f9f9e57

before deleting them with:

$ restic forget 031ab8f1 1672a9e1 1f23fb5b 2c58ea3a 331c7231 5e0e1936 735c6744 94f74bdb b11df023 dfa17ba8 e3f78133 eefbd0b0 fe88aeb5 
repository b0b0516c opened successfully, password is correct
[0:00] 100.00%  13 / 13 files deleted

$ restic prune
repository b0b0516c opened successfully, password is correct
counting files in repo
building new index for repo
[13:37] 100.00%  60020 / 60020 packs
repository contains 60020 packs (1548315 blobs) with 283.466 GiB
processed 1548315 blobs: 129812 duplicate blobs, 4.331 GiB duplicate
load all snapshots
find data that is still in use for 8 snapshots
[0:53] 100.00%  8 / 8 snapshots
found 1219895 of 1548315 data blobs still in use, removing 328420 blobs
will remove 0 invalid files
will delete 6232 packs and rewrite 1275 packs, this frees 36.302 GiB
[23:37] 100.00%  1275 / 1275 packs rewritten
counting files in repo
[11:45] 100.00%  52822 / 52822 packs
finding old index files
saved new indexes as [a31b0fc3 9f5aa9b5 db19be6f 4fd9f1d8 941e710b 528489d9 fb46b04a 6662cd78 4b3f5aad 0f6f3e07 26ae96b2 2de7b89f 78222bea 47e1a063 5abf5c2d d4b1d1c3 f8616415 3b0ebbaa]
remove 23 old index files
[0:00] 100.00%  23 / 23 files deleted
remove 7507 old packs
[0:08] 100.00%  7507 / 7507 files deleted
done
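
In hindsight, this find-and-forget step could be scripted when many snapshots are affected. A rough sketch, assuming the "... in snapshot XXXXXXXX (date)" output format shown earlier (worth checking the resulting ID list before piping it into restic forget):

$ restic find --pack 9ec955794534be06356655cfee6abe73cb181f88bb86b0cd769cf8699f9f9e57 \
    | sed -n 's/.*in snapshot \([0-9a-f]*\).*/\1/p' \
    | sort -u | xargs restic forget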

And with 13 of my 21 snapshots deleted, the checks now pass:

$ restic check
using temporary cache in /var/tmp/restic-tmp/restic-check-cache-407999210
repository b0b0516c opened successfully, password is correct
created new cache in /var/tmp/restic-tmp/restic-check-cache-407999210
create exclusive lock for repository
load indexes
check all packs
check snapshots, trees and blobs
no errors were found

This represents a significant amount of lost backup history, but at least it's not all of it.

13 April, 2021 03:19AM

Shirish Agarwal

what to write

First up, I am alive and well. I have been receiving calls from friends for quite some time, but now that I have become deaf it is a pain, and the hearing aids aren’t all that useful. Moreover, with us finding ourselves sinking lower and lower each and every day, it feels absurd to decide what to write and not write about India. Thankfully, I ran across this piece, which tells it in far more detail than I ever could. The only interesting and somewhat positive news I had is from the south of India; otherwise, sad days, especially for the poor. The saddest story is that this time Covid has reached alarming proportions in India, and surprise, surprise, this time the villain for many is my state of Maharashtra, even though it hasn’t received its share of GST proceeds for the last two years. And that was Kerala’s perspective: a different state, a different party, a different political ideology altogether.

Kerala Finance Minister Thomas Issac views on GST, October 22, 2020 Indian Express.

I also briefly share the death of somewhat liberal film censorship in India, unlike Italy, which abolished film censorship altogether. I don’t really want to spend too much time on how we have become No. 2 in Covid cases in the world, and perhaps in deaths also. Many people still believe in herd immunity but don’t really know what it means. So without taking more time and effort, I bid adieu. I may post again when I’m hopefully feeling emotionally better and stronger 😦

13 April, 2021 03:14AM by shirishag75

April 12, 2021

Russell Coker

Yama

I’ve just set up the Yama LSM module on some of my Linux systems. Yama controls ptrace, which is the debugging and tracing API for Unix systems. The aim is to prevent a compromised process from using ptrace to compromise other processes and cause more damage. In most cases a process which can ptrace another process (which usually means having capability SYS_PTRACE, i.e. being root, or having the same UID as the target process) can interfere with that process in other ways, such as modifying its configuration and data files. But even so, I think it has the potential to make things more difficult for attackers without making the system more difficult to use.

If you put “kernel.yama.ptrace_scope = 1” in sysctl.conf (or write “1” to /proc/sys/kernel/yama/ptrace_scope) then a user process can only trace its child processes. This means that “strace -p” and “gdb -p” will fail when run as non-root, but apart from that everything else will work. Generally “strace -p” (tracing the system calls of another process) is of most use to the sysadmin, who can do it as root. The command “gdb -p” and variants of it are commonly used by developers, so Yama wouldn’t be a good thing on a system that is primarily used for software development.
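
As a quick sanity check, this is roughly what to expect with mode 1 enabled (the PID and the exact strace error text here are illustrative, not copied from a real run):

$ sysctl kernel.yama.ptrace_scope
kernel.yama.ptrace_scope = 1
$ strace -p 1234
strace: attach: ptrace(PTRACE_SEIZE, 1234): Operation not permitted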

Another option is “kernel.yama.ptrace_scope = 3” which means no-one can trace and it can’t be disabled without a reboot. This could be a good option for production servers that have no need for software development. It wouldn’t work well for a small server where the sysadmin needs to debug everything, but when dozens or hundreds of servers have their configuration rolled out via a provisioning tool this would be a good setting to include.

See Documentation/admin-guide/LSM/Yama.rst in the kernel source for the details.

When running with capability SYS_PTRACE (i.e. a root shell) you can ptrace anything else, and if necessary disable Yama by writing “0” to /proc/sys/kernel/yama/ptrace_scope.

I am enabling mode 1 on all my systems because I think it will make things harder for attackers while not making things more difficult for me.

Also note that SE Linux restricts SYS_PTRACE and also restricts cross-domain ptrace access, so the combination with Yama makes things extra difficult for an attacker.

Yama is enabled in the Debian kernels by default, so it’s very easy to set up for Debian users: just edit /etc/sysctl.d/whatever.conf and it will be enabled on boot.
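
For example, as root (the file name is arbitrary; any *.conf under /etc/sysctl.d/ will do):

# echo "kernel.yama.ptrace_scope = 1" > /etc/sysctl.d/10-yama.conf
# sysctl -p /etc/sysctl.d/10-yama.conf
kernel.yama.ptrace_scope = 1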

12 April, 2021 11:00PM by etbe

Steinar H. Gunderson

Squirrel!

“All comments on this article will now be moderated. The bar to pass moderation will be high, it's really time to think about something else. Did you all see that we have an exciting article on spinlocks?” Poor LWN <3

SPINLOCK!

12 April, 2021 10:30PM

Russell Coker

Riverdale

I’ve been watching the show Riverdale on Netflix recently. It’s an interesting modern take on the Archie comics. Having watched Josie and the Pussycats in Outer Space when I was younger, I was anticipating something aimed at a similar audience. As solving mysteries and crimes was apparently a major theme of the show, I anticipated something along the lines of Scooby Doo: some suspense and some spooky things, but then a happy ending where the criminals get arrested and no-one gets hurt or killed, while the vast majority of people are nice. Instead the first episode has a teen being murdered and Ms Grundy being obsessed with 15yo boys and sleeping with Archie (who’s supposed to be 15 but is played by a 20yo actor).

Everyone in the show has some dark secret. The filming has a dark theme, the sky is usually overcast and it’s generally gloomy. This is a significant contrast to Veronica Mars which has some similarities in having a young cast, a sassy female sleuth, and some similar plot elements. Veronica Mars has a bright theme and a significant comedy element in spite of dealing with some dark issues (murder, rape, child sex abuse, and more). But Riverdale is just dark. Anyone who watches this with their kids expecting something like Scooby Doo is in for a big surprise.

There are lots of interesting stylistic elements in the show: lots of clothing and uniform designs that seem to date from the 1940s. It seems like some alternate universe where kids have smartphones and laptops while dressing in the style of the 1940s. One thing that annoyed me was construction workers using tools like sledge-hammers instead of excavators. A society that has smartphones but no earth-moving equipment isn’t plausible.

On the upside, there is a racial mix in the show that more accurately reflects American society than the original Archie comics, and homophobia is much less common than in most parts of our society. The show treats both race and gay/lesbian issues in an accurate way (portraying some bigotry) while the main characters aren’t racist or homophobic.

I think it’s generally an OK show and recommend it to people who want a dark show. It’s a good show to watch while doing something on a laptop, so you can check Wikipedia for the references to 1940s stuff (like when bikinis were invented). I’m halfway through season 3, which isn’t as good as the first two; I don’t know if it will get better later in the season or whether I should have stopped after season 2.

I don’t usually review fiction, but the interesting aesthetics of the show made it deserve a review.

12 April, 2021 12:35PM by etbe

April 11, 2021

Jelmer Vernooij

The upstream ontologist

The Debian Janitor is an automated system that commits fixes for (minor) issues in Debian packages that can be fixed by software. It gradually started proposing merges in early December. The first set of changes sent out ran lintian-brush on sid packages maintained in Git. This post is part of a series about the progress of the Janitor.

The upstream ontologist is a project that extracts metadata about upstream projects in a consistent format. It does this with a combination of heuristics and reading ecosystem-specific metadata files, such as Python’s setup.py and Rust’s Cargo.toml, as well as by scanning e.g. README files.

Supported Data Sources

It will extract information from a wide variety of sources.

Supported Fields

Fields that it currently provides include:

  • Homepage: homepage URL
  • Name: name of the upstream project
  • Contact: contact address of some sort of the upstream (e-mail, mailing list URL)
  • Repository: VCS URL
  • Repository-Browse: Web URL for viewing the VCS
  • Bug-Database: Bug database URL (for web viewing, generally)
  • Bug-Submit: URL to use to submit new bugs (either on the web or an e-mail address)
  • Screenshots: List of URLs with screenshots
  • Archive: Archive used - e.g. SourceForge
  • Security-Contact: e-mail or URL with instructions for reporting security issues
  • Documentation: Link to documentation on the web
  • Wiki: Wiki URL
  • Summary: one-line description of the project
  • Description: longer description of the project
  • License: Single line license description (e.g. "GPL 2.0") as declared in the metadata[1]
  • Copyright: List of copyright holders
  • Version: Current upstream version
  • Security-MD: URL to markdown file with security policy

All data fields have a “certainty” associated with them (“certain”, “confident”, “likely” or “possible”), which gets set depending on how the data was derived or where it was found. If multiple possible values were found for a specific field, then the value with the highest certainty is taken.

Interface

The ontologist provides a high-level Python API, as well as two command-line tools that can write output in two different formats.

For example, running guess-upstream-metadata on dulwich:

 % guess-upstream-metadata
 <string>:2: (INFO/1) Duplicate implicit target name: "contributing".
 Name: dulwich
 Repository: https://www.dulwich.io/code/
 X-Security-MD: https://github.com/dulwich/dulwich/tree/HEAD/SECURITY.md
 X-Version: 0.20.21
 Bug-Database: https://github.com/dulwich/dulwich/issues
 X-Summary: Python Git Library
 X-Description: |
   This is the Dulwich project.
   It aims to provide an interface to git repos (both local and remote) that
   doesn't call out to git directly but instead uses pure Python.
 X-License: Apache License, version 2 or GNU General Public License, version 2 or later.
 Bug-Submit: https://github.com/dulwich/dulwich/issues/new

Lintian-Brush

lintian-brush can update DEP-12-style debian/upstream/metadata files, which hold information about the upstream project that is packaged, as well as the Homepage field in the debian/control file, based on information provided by the upstream ontologist. By default, it only imports data with the highest certainty; you can override this by specifying the --uncertain command-line flag.
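
For instance, a minimal sketch of a run inside a Git checkout of a Debian package (hypothetical invocation; output omitted, the flag is the one mentioned above):

 % lintian-brush --uncertain
 % git diff debian/upstream/metadata debian/control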

[1] Obviously this won't be able to describe the full licensing situation for many projects. Projects like scancode-toolkit are more appropriate for that.

11 April, 2021 10:40PM by Jelmer Vernooij

Vishal Gupta

Sikkim 101 for Backpackers

Host to Kanchenjunga, the world’s third-highest mountain peak, and the endangered Red Panda, Sikkim is a state in northeastern India. Nestled between Nepal, Tibet (China), Bhutan and West Bengal (India), the state offers a smorgasbord of cultures and cuisines. Given that, it’s hardly surprising that the old spice route meanders through western Sikkim, connecting Lhasa with the ports of Bengal; the route could also be attributed to cardamom (kali elaichi), a perennial herb native to Sikkim, of which the state is the second-largest producer globally. Lastly, having lived in and travelled across India all my life, I can confidently say Sikkim is one of the cleanest & safest regions in India, making it ideal for first-time backpackers.

Brief History

  • 17th century: The Kingdom of Sikkim is founded by the Namgyal dynasty and ruled by Buddhist priest-kings known as the Chogyal.
  • 1890: Sikkim becomes a princely state of British India.
  • 1947: Sikkim continues its protectorate status with the Union of India, post-Indian-independence.
  • 1973: Anti-royalist riots take place in front of the Chogyal's palace, by Nepalis seeking greater representation.
  • 1975: Referendum leads to the deposition of the monarchy and Sikkim joins India as its 22nd state.

Languages

  • Official: English, Nepali, Sikkimese/Bhotia and Lepcha
  • Though Hindi and Nepali share the same script (Devanagari), they are not mutually intelligible. Yet, most people in Sikkim can understand and speak Hindi.

Ethnicity

  • Nepalis: Migrated in large numbers (from Nepal) and soon became the dominant community
  • Bhutias: People of Tibetan origin. Major inhabitants in Northern Sikkim.
  • Lepchas: Original inhabitants of Sikkim

Food

  • Tibetan/Nepali dishes (mostly consumed during winter)
    • Thukpa: Noodle soup, rich in spices and vegetables. Usually contains some form of meat. Common variations: Thenthuk and Gyathuk
    • Momos: Steamed or fried dumplings, usually with a meat filling.
    • Saadheko: Spicy marinated chicken salad.
    • Gundruk Soup: A soup made from Gundruk, a fermented leafy green vegetable.
    • Sinki: A fermented radish tap-root product, traditionally consumed as a base for soup and as a pickle. Eerily similar to Kimchi.
  • While pork and beef are pretty common, finding vegetarian dishes is equally easy.
  • Staple: Dal-Bhat with Subzi. Rice is a lot more common than wheat, possibly due to greater carb content and proximity to West Bengal, India’s largest producer of rice.
  • Good places to eat in Gangtok
    • Hamro Bhansa Ghar, Nimtho (Nepali)
    • Taste of Tibet
    • Dragon Wok (Chinese & Japanese)

Buddhism in Sikkim

  • Bayul Demojong (Sikkim) is the most sacred land in the Himalayas as per the belief of the Northern Buddhists and various religious texts.
  • Sikkim was blessed by Guru Padmasambhava, the great Buddhist saint who visited Sikkim in the 8th century and consecrated the land.
  • However, Buddhism is said to have reached Sikkim only in the 17th century with the arrival of three Tibetan monks viz. Rigdzin Goedki Demthruchen, Mon Kathok Sonam Gyaltshen & Rigdzin Legden Je at Yuksom. Together, they established a Buddhist monastery.
  • In 1642 they crowned Phuntsog Namgyal as the first monarch of Sikkim and gave him the title of Chogyal, or Dharma Raja.
  • The faith became popular through its royal patronage and soon many villages had their own monastery.
  • Today Sikkim has over 200 monasteries.

Major monasteries

  • Rumtek Monastery, 20 km from Gangtok
  • Lingdum/Ranka Monastery, 17 km from Gangtok
  • Phodong Monastery, 28 km from Gangtok
  • Ralang Monastery, 10 km from Ravangla
  • Tsuklakhang Monastery, Royal Palace, Gangtok
  • Enchey Monastery, Gangtok
  • Tashiding Monastery, 35 km from Ravangla


Reaching Sikkim

  • Gangtok, being the capital, is the easiest region to reach by public transport and shared cabs.
  • By Air:
    • Pakyong (PYG):
      • Nearest airport from Gangtok (about 1 hour away)
      • Tabletop airport
      • Reserved cabs cost around INR 1200.
      • As of Apr 2021, the only flights to PYG are from IGI (Delhi) and CCU (Kolkata).
    • Bagdogra (IXB):
      • About 20 minutes from Siliguri and 4 hours from Gangtok.
      • Larger airport with flights to most major Indian cities.
      • Reserved cabs cost about INR 3000. Shared cabs cost about INR 350.
  • By Train:
    • New Jalpaiguri (NJP):
      • About 20 minutes from Siliguri and 4 hours from Gangtok.
      • Reserved cabs cost about INR 3000. Shared cabs from INR 350.
  • By Road:
    • NH10 connects Siliguri to Gangtok
    • If you can’t find buses plying to Gangtok directly, reach Siliguri and then take a cab to Gangtok.
  • Sikkim Nationalised Transport Div. also runs hourly buses between Siliguri and Gangtok and daily buses on other common routes. They’re cheaper than shared cabs.
  • Wizzride also operates shared cabs between Siliguri/Bagdogra/NJP, Gangtok and Darjeeling. They cost about the same as shared cabs but pack in half as many people in “luxury cars” (Innova, Xylo, etc.) and are hence more comfortable.

Gangtok

  • Time needed: 1D/1N
  • Places to visit:
    • Hanuman Tok
    • Ganesh Tok
    • Tashi View Point [6,800ft]
    • MG Marg
    • Sikkim Zoo
    • Gangtok Ropeway
    • Enchey Monastery
    • Tsuklakhang Palace & Monastery
  • Hostels: Tagalong Backpackers (would strongly recommend), Zostel Gangtok
  • Places to chill: Travel Cafe, Café Live & Loud and Gangtok Groove
  • Places to shop: Lal Market and MG Marg

Getting Around

  • Taxis operate on a reserved or shared basis. In the case of the latter, you pool with other commuters whom your taxi will pick up and drop en route.
  • Naturally, shared taxis only operate on popular routes. The easiest way to get around Gangtok is to catch a shared cab from MG Marg.
  • Reserved taxis for Gangtok sightseeing cost around INR 1000-1500, depending upon the spots you’d like to see
  • Key taxi/bus stands:
    • Deorali stand: For Darjeeling, Siliguri, Kalimpong
    • Vajra stand: For North & East Sikkim (Tsomgo Lake & Nathula)
    • Rumtek taxi: For Ravangla, Pelling, Namchi, Geyzing, Jorethang and Singtam.

Exploring Gangtok on an MTB


North Sikkim

  • The easiest & most economical way to explore North Sikkim is the 3D/2N package offered by shared-cab drivers.
  • This includes food, permits, cab rides and accommodation (1N in Lachen and 1N in Lachung)
  • The accommodation on both nights is at homestays with bare necessities, so keep your hopes low.
  • In the spirit of sustainable tourism, you’ll be asked to discard single-use plastic bottles, so please carry a bottle that you can refill along the way.
  • Zero Point and Gurdongmer Lake are snow-capped throughout the year

3D/2N Shared-cab Package Itinerary

  • Day 1
    • Gangtok (10am) - Chungthang - Lachung (stay)
  • Day 2
    • Pre-lunch : Lachung (6am) - Yumthang Valley [12,139ft] - Zero Point [15,300ft] - Lachung
    • Post-lunch : Lachung - Chungthang - Lachen (stay)
  • Day 3
    • Pre-lunch : Lachen (5am) - Kala Patthar - Gurdongmer Lake [16,910ft] - Lachen
    • Post-lunch : Lachen - Chungthang - Gangtok (7pm)
  • This itinerary is idealistic and depends on the level of snowfall.
  • Some drivers might switch up Day 2 and 3 itineraries by visiting Lachen and then Lachung, depending upon the weather.
  • Areas beyond Lachen & Lachung are heavily militarized since the Indo-China border is only a few miles away.


East Sikkim

Zuluk and Silk Route

  • Time needed: 2D/1N
  • Zuluk [9,400ft] is a small hamlet with an excellent view of the eastern Himalayan range including the Kanchenjunga.
  • Was once a transit point on the historic Silk Route from Tibet (Lhasa) to India (West Bengal).
  • The drive from Gangtok to Zuluk takes at least four hours. Hence, it makes sense to spend the night at a homestay and space out your trip to Zuluk.

Tsomgo Lake and Nathula

  • Time Needed: 1D
  • A Protected Area Permit is required to visit these places, due to their proximity to the Chinese border
  • Tsomgo/Chhangu Lake [12,313ft]
    • Glacial lake, 40 km from Gangtok.
    • Remains frozen during the winter season.
    • You can also ride on the back of a Yak for INR 300
  • Baba Mandir
    • An old temple dedicated to Baba Harbhajan Singh, a sepoy in the 23rd Regiment, who died in 1962 near Nathu La during the Indo-China war.
  • Nathula Pass [14,450ft]
    • Located on the Indo-Tibetan border crossing of the Old Silk Route, it is one of the three open trading posts between India and China.
    • Plays a key role in the Sino-Indian Trade and also serves as an official Border Personnel Meeting(BPM) Point.
    • May get cordoned off by the Indian Army in the event of heavy snowfall or for other security reasons.


West Sikkim

  • Time needed: 3D/2N
  • Hostels at Pelling: Mochilerro Ostillo

Itinerary

Day 1: Gangtok - Ravangla - Pelling

  • Leave Gangtok early for Ravangla, via the Temi Tea Estate route.
  • Spend some time at the tea garden and then visit Buddha Park at Ravangla
  • Head to Pelling from Ravangla

Day 2: Pelling sightseeing

  • Hire a cab and visit Skywalk, Pemayangtse Monastery, Rabdentse Ruins, Kecheopalri Lake, Kanchenjunga Falls.

Day 3: Pelling - Gangtok/Siliguri

  • Wake up early to catch a glimpse of Kanchenjunga at the Pelling Helipad around sunrise
  • Head back to Gangtok on a shared-cab
  • You could take a bus/taxi back to Siliguri if Pelling is your last stop.

Darjeeling

  • In my opinion, Darjeeling is lovely for a two-day detour on your way back to Bagdogra/Siliguri and not any longer (unless you’re a Bengali couple on a honeymoon)
  • Once a part of Sikkim, Darjeeling was ceded to the East India Company after a series of wars, with Sikkim briefly receiving a grant from EIC for “gifting” Darjeeling to the latter
  • Post-independence, Darjeeling was merged with the state of West Bengal.

Itinerary

Day 1 :

  • Take a cab from Gangtok to Darjeeling (shared-cabs cost INR 300 per seat)
  • Reach Darjeeling by noon and check in to your hostel. I stayed at Hideout.
  • Spend the evening visiting a monastery (or the Batasia Loop), Nehru Road and Mall Road.
  • Grab dinner at Glenary whilst listening to live music.

Day 2:

  • Wake up early to catch the sunrise and a glimpse of Kanchenjunga at Tiger Hill. Since Tiger Hill is 10km from Darjeeling and requires a permit, book your taxi in advance.
  • Alternatively, if you don’t want to get up at 4am or shell out INR 1500 on the cab to Tiger Hill, walk to the Kanchenjunga View Point down Mall Road
  • Next, queue up outside Keventers for breakfast with a view in a century-old cafe
  • Get a cab at Gandhi Road and visit a tea garden (Happy Valley is the closest) and the Ropeway. I was lucky to meet 6 other backpackers at my hostel and we ended up pooling the cab at INR 200 per person; the INR 1400 total is on the expensive side, but you could bargain.
  • Get lunch, buy some tea at Golden Tips, pack your bags and hop on a shared-cab back to Siliguri. It took us about 4hrs to reach Siliguri, with an hour to spare before my train.
  • If you’ve still got time on your hands, then check out the Peace Pagoda and the Darjeeling Himalayan Railway (Toy Train). At INR 1500, I found the latter to be too expensive and skipped it.


Tips and hacks

  • Download offline maps, especially when you’re exploring Northern Sikkim.
  • Food and booze are the cheapest in Gangtok. Stash up before heading to other regions.
  • Keep your Aadhar/Passport handy since you need permits to travel to North & East Sikkim.
  • In rural areas and some cafes, you may get to try Rhododendron Wine, made from Rhododendron arboreum a.k.a Gurans. Its production is a little hush-hush since the flower is considered holy and is also the National Flower of Nepal.
  • If you don’t want to invest in a new jacket, boots or a pair of gloves, you can always rent them at nominal rates from your hotel or little stores around tourist sites.
  • Check the weather of a region before heading there. Low visibility and precipitation can quite literally dampen your experience.
  • Keep your itinerary flexible to accommodate rest and impromptu plans.
  • Shops and restaurants close by 8pm in Sikkim and Darjeeling. Plan for the same.

Carry…

  • a couple of extra pairs of socks (woollen, if possible)
  • a pair of slippers to wear indoors
  • a reusable water bottle
  • an umbrella
  • a power bank
  • a couple of tablets of Diamox. Helps deal with altitude sickness
  • extra clothes and wet bags since you may not get a chance to wash/dry your clothes
  • a few passport size photographs

Shared-cab hacks

  • Intercity rides can be exhausting. If you can afford it, pay for an additional seat.
  • Call shotgun on the drives beyond Lachen and Lachung. The views are breathtaking.
  • Return cabs tend to be cheaper (West Bengal cabs travelling back from Sikkim and vice-versa)

Cost

  • My median daily expenditure (back when I went to Sikkim in early March 2021) was INR 1350.
  • This includes stay (bunk bed), food, wine and transit (shared cabs)
  • In my defence, I splurged on food, wine and extra seats in shared cabs, but if you’re on a budget, you could easily get by on INR 1 - 1.2k per day.
  • For a 9-day trip, I ended up shelling out nearly INR 15k, including 2AC trains to & from Kolkata
  • Note: Summer (March to May) and Autumn (October to December) are peak seasons, and therefore more expensive for travel.

Souvenirs and things you should buy

Buddhist souvenirs :

  • Colourful Prayer Flags (great for tying on bikes or behind car windshields)
  • Miniature Prayer/Mani Wheels
  • Lucky Charms, Pendants and Key Chains
  • Cham Dance masks and robes
  • Singing Bowls
  • Common symbols: Om mani padme hum, Ashtamangala, Zodiac signs

Handicrafts & Handlooms

  • Tibetan Yak Wool shawls, scarfs and carpets
  • Sikkimese Ceramic cups
  • Thangka Paintings

Edibles

  • Darjeeling Tea (usually brewed and not boiled)
  • Wine (Arucha Peach & Rhododendron)
  • Dalle Khursani (Chilli) Paste and Pickle

Header Icon made by Freepik from www.flaticon.com is licensed by CC 3.0 BY

11 April, 2021 02:04PM by Vishal Gupta

Jonathan Dowland

2020 in short fiction

Cover for *Episodes*

Following on from 2020 in Fiction: In 2020 I read a couple of collections of short fiction from some of my favourite authors.

I started the year with Christopher Priest's Episodes. The stories within are collected from throughout his long career, and vary in style and tone. Priest wrote new little prologues and epilogues for each of the stories, explaining the context in which they were written. I really enjoyed this additional view into their construction.

Cover for *Adam Robots*

By contrast, Adam Roberts's Adam Robots presents the stories on their own terms. Each of the stories is written in a different mode: one as golden-age SF, another as a kind of cyberpunk, for example, although they all blend or confound sub-genres to some degree. I'm not clever enough to have decoded all their secrets on a first read, and I would have appreciated some "Cliff's Notes" on any deeper meaning or intent.

Cover for *Exhalation*

Ted Chiang's Exhalation was up to the fantastic standard of his earlier collection and had some extremely thoughtful explorations of philosophical ideas. All the stories are strong but one stuck in my mind the longest: Omphalos…

With my daughter I finished three of Terry Pratchett's short story collections aimed at children: Dragon at Crumbling Castle; The Witch's Vacuum Cleaner and The Time-Travelling Caveman. If you are a Pratchett fan and you've overlooked these because they're aimed at children, take another look. The quality varies, but there are some true gems in these. Several stories take place in common settings, either the town of Blackbury, in Gritshire (and the adjacent Even Moor), or the Welsh border-town of Llandanffwnfafegettupagogo. The sad thing was knowing that once I'd finished them (and the fourth, Father Christmas's Fake Beard) that was it: there will be no more.

Cover for Interzone, issue 277

8/31 of the "books" I read in 2020 were issues of Interzone. Counting them as "books" for my annual reading goal has encouraged me to read full issues, whereas before I would likely have only read a couple of stories from each issue. Reading full issues has rekindled the enjoyment I got out of it when I first discovered the magazine at the turn of the century. I am starting to recognise authors who have written stories in other issues, as well as common themes from the current era weaving their way into the work (Trump, Brexit, etc.). No doubt the Pandemic will leave its mark on 2021's stories.

11 April, 2021 10:53AM

Junichi Uekawa

Wrote a timezone checker page.

Wrote a timezone checker page: timezone. It shows the current time as a blue line. I haven't made anything configurable yet, but will think about it later.

11 April, 2021 02:06AM by Junichi Uekawa

April 10, 2021

Charles Plessy

Debian Bullseye: more open

Debian Bullseye will provide the command /usr/bin/open for your greatest comfort at the command line. On a system with a graphical desktop environment, the command should have a similar result as when opening a document from a mouse-and-click file browser.

Technically, /usr/bin/open is a symbolic link managed by update-alternatives to point towards xdg-open if available and otherwise run-mailcap.
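
You can inspect which backend the alternative currently points to with update-alternatives; a rough sketch (the output shape shown here is assumed):

$ update-alternatives --display open
open - auto mode
  link currently points to /usr/bin/xdg-open
[...]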

10 April, 2021 10:21PM

Kentaro Hayashi

Grow your ideas for Debian Project

There may be some "if only it could be ..." ideas for the Debian Project. If an idea is concrete and worth moving forward with, it should be turned into a proposal for Project Funding.

salsa.debian.org

But if it is just an idea, or nobody can afford to take on the executor role, that idea will not be achieved.

I thought that it needs an incubator, a complementary project.

salsa.debian.org

I've salvaged an idea from a closed MR: Add proposal about "Formalize reimbursement process" (!5) · Merge Requests · Freexian SARL / Project Funding · GitLab

I'm not confident whether this mechanism would work, but Debian needs change.

10 April, 2021 01:13PM

April 09, 2021

Michael Prokop

A Ceph war story

It all started with the big bang! We nearly lost 33 of 36 disks on a Proxmox/Ceph Cluster; this is the story of how we recovered them.

At the end of 2020, we eventually had a long outstanding maintenance window for taking care of system upgrades at a customer. During this maintenance window, which involved reboots of server systems, the involved Ceph cluster unexpectedly went into a critical state. What was planned to be a few hours of checklist work in the early evening turned out to be an emergency case; let’s call it a nightmare (not only because it included a big part of the night). Since we have learned a few things from our post mortem and RCA, it’s worth sharing those with others. But first things first, let’s step back and clarify what we had to deal with.

The system and its upgrade

One part of the upgrade included 3 Debian servers (we’re calling them server1, server2 and server3 here), running on Proxmox v5 + Debian/stretch with 12 Ceph OSDs each (65.45TB in total), a so-called Proxmox Hyper-Converged Ceph Cluster.

First, we went for upgrading the Proxmox v5/stretch system to Proxmox v6/buster, before updating Ceph Luminous v12.2.13 to the latest v14.2 release, supported by Proxmox v6/buster. The Proxmox upgrade included updating corosync from v2 to v3. As part of this upgrade, we had to apply some configuration changes, like adjusting ring0 + ring1 address settings and adding a mon_host entry to the Ceph configuration.

During the first two servers’ reboots, we noticed configuration glitches. After fixing those, we went for a reboot of the third server as well. Then we noticed that several Ceph OSDs were unexpectedly down. The NTP service wasn’t working as expected after the upgrade. The underlying issue is a race condition between ntp and systemd-timesyncd (see #889290). As a result, we had clock skew problems with Ceph, indicating that the Ceph monitors’ clocks weren’t running in sync (which is essential for proper Ceph operation). We initially assumed that our Ceph OSD failures derived from this clock skew problem, so we took care of it. After yet another round of reboots, to ensure the systems were all running with identical and sane configurations and services, we noticed lots of failing OSDs. This time all but three OSDs (19, 21 and 22) were down:

% sudo ceph osd tree
ID CLASS WEIGHT   TYPE NAME      STATUS REWEIGHT PRI-AFF
-1       65.44138 root default
-2       21.81310     host server1
 0   hdd  1.08989         osd.0    down  1.00000 1.00000
 1   hdd  1.08989         osd.1    down  1.00000 1.00000
 2   hdd  1.63539         osd.2    down  1.00000 1.00000
 3   hdd  1.63539         osd.3    down  1.00000 1.00000
 4   hdd  1.63539         osd.4    down  1.00000 1.00000
 5   hdd  1.63539         osd.5    down  1.00000 1.00000
18   hdd  2.18279         osd.18   down  1.00000 1.00000
20   hdd  2.18179         osd.20   down  1.00000 1.00000
28   hdd  2.18179         osd.28   down  1.00000 1.00000
29   hdd  2.18179         osd.29   down  1.00000 1.00000
30   hdd  2.18179         osd.30   down  1.00000 1.00000
31   hdd  2.18179         osd.31   down  1.00000 1.00000
-4       21.81409     host server2
 6   hdd  1.08989         osd.6    down  1.00000 1.00000
 7   hdd  1.08989         osd.7    down  1.00000 1.00000
 8   hdd  1.63539         osd.8    down  1.00000 1.00000
 9   hdd  1.63539         osd.9    down  1.00000 1.00000
10   hdd  1.63539         osd.10   down  1.00000 1.00000
11   hdd  1.63539         osd.11   down  1.00000 1.00000
19   hdd  2.18179         osd.19     up  1.00000 1.00000
21   hdd  2.18279         osd.21     up  1.00000 1.00000
22   hdd  2.18279         osd.22     up  1.00000 1.00000
32   hdd  2.18179         osd.32   down  1.00000 1.00000
33   hdd  2.18179         osd.33   down  1.00000 1.00000
34   hdd  2.18179         osd.34   down  1.00000 1.00000
-3       21.81419     host server3
12   hdd  1.08989         osd.12   down  1.00000 1.00000
13   hdd  1.08989         osd.13   down  1.00000 1.00000
14   hdd  1.63539         osd.14   down  1.00000 1.00000
15   hdd  1.63539         osd.15   down  1.00000 1.00000
16   hdd  1.63539         osd.16   down  1.00000 1.00000
17   hdd  1.63539         osd.17   down  1.00000 1.00000
23   hdd  2.18190         osd.23   down  1.00000 1.00000
24   hdd  2.18279         osd.24   down  1.00000 1.00000
25   hdd  2.18279         osd.25   down  1.00000 1.00000
35   hdd  2.18179         osd.35   down  1.00000 1.00000
36   hdd  2.18179         osd.36   down  1.00000 1.00000
37   hdd  2.18179         osd.37   down  1.00000 1.00000

Our blood pressure increased slightly! Did we just lose all of our cluster? What happened, and how can we get all the other OSDs back?

We stumbled upon this beauty in our logs:

kernel: [   73.697957] XFS (sdl1): SB stripe unit sanity check failed
kernel: [   73.698002] XFS (sdl1): Metadata corruption detected at xfs_sb_read_verify+0x10e/0x180 [xfs], xfs_sb block 0xffffffffffffffff
kernel: [   73.698799] XFS (sdl1): Unmount and run xfs_repair
kernel: [   73.699199] XFS (sdl1): First 128 bytes of corrupted metadata buffer:
kernel: [   73.699677] 00000000: 58 46 53 42 00 00 10 00 00 00 00 00 00 00 62 00  XFSB..........b.
kernel: [   73.700205] 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
kernel: [   73.700836] 00000020: 62 44 2b c0 e6 22 40 d7 84 3d e1 cc 65 88 e9 d8  bD+.."@..=..e...
kernel: [   73.701347] 00000030: 00 00 00 00 00 00 40 08 00 00 00 00 00 00 01 00  ......@.........
kernel: [   73.701770] 00000040: 00 00 00 00 00 00 01 01 00 00 00 00 00 00 01 02  ................
ceph-disk[4240]: mount: /var/lib/ceph/tmp/mnt.jw367Y: mount(2) system call failed: Structure needs cleaning.
ceph-disk[4240]: ceph-disk: Mounting filesystem failed: Command '['/bin/mount', '-t', u'xfs', '-o', 'noatime,inode64', '--', '/dev/disk/by-parttypeuuid/4fbd7e29-9d25-41b8-afd0-062c0ceff05d.cdda39ed-5
ceph/tmp/mnt.jw367Y']' returned non-zero exit status 32
kernel: [   73.702162] 00000050: 00 00 00 01 00 00 18 80 00 00 00 04 00 00 00 00  ................
kernel: [   73.702550] 00000060: 00 00 06 48 bd a5 10 00 08 00 00 02 00 00 00 00  ...H............
kernel: [   73.702975] 00000070: 00 00 00 00 00 00 00 00 0c 0c 0b 01 0d 00 00 19  ................
kernel: [   73.703373] XFS (sdl1): SB validate failed with error -117.

The same issue was present for the other failing OSDs. We hoped that the data itself was still there and that only the mounting of the XFS partitions failed. The Ceph cluster was initially installed in 2017 with Ceph jewel/10.2, with the OSDs on filestore (nowadays a legacy approach to storing objects in Ceph). However, we had since migrated the disks to bluestore (with ceph-disk, and not yet via ceph-volume, which is what’s used nowadays). Using ceph-disk introduces these 100MB XFS partitions containing basic metadata for the OSD.

Given that we had three working OSDs left, we decided to investigate how to rebuild the failing ones. Some folks on #ceph (thanks T1, ormandj + peetaur!) were kind enough to share what working XFS partitions looked like for them. After creating a backup (via dd), we tried to re-create such an XFS partition on server1. We noticed that even mounting a freshly created XFS partition failed:

synpromika@server1 ~ % sudo mkfs.xfs -f -i size=2048 -m uuid="4568c300-ad83-4288-963e-badcd99bf54f" /dev/sdc1
meta-data=/dev/sdc1              isize=2048   agcount=4, agsize=6272 blks
         =                       sectsz=4096  attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=0
         =                       reflink=0
data     =                       bsize=4096   blocks=25088, imaxpct=25
         =                       sunit=128    swidth=64 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=1608, version=2
         =                       sectsz=4096  sunit=1 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
synpromika@server1 ~ % sudo mount /dev/sdc1 /mnt/ceph-recovery
SB stripe unit sanity check failed
Metadata corruption detected at 0x433840, xfs_sb block 0x0/0x1000
libxfs_writebufr: write verifer failed on xfs_sb bno 0x0/0x1000
cache_node_purge: refcount was 1, not zero (node=0x1d3c400)
SB stripe unit sanity check failed
Metadata corruption detected at 0x433840, xfs_sb block 0x18800/0x1000
libxfs_writebufr: write verifer failed on xfs_sb bno 0x18800/0x1000
SB stripe unit sanity check failed
Metadata corruption detected at 0x433840, xfs_sb block 0x0/0x1000
libxfs_writebufr: write verifer failed on xfs_sb bno 0x0/0x1000
SB stripe unit sanity check failed
Metadata corruption detected at 0x433840, xfs_sb block 0x24c00/0x1000
libxfs_writebufr: write verifer failed on xfs_sb bno 0x24c00/0x1000
SB stripe unit sanity check failed
Metadata corruption detected at 0x433840, xfs_sb block 0xc400/0x1000
libxfs_writebufr: write verifer failed on xfs_sb bno 0xc400/0x1000
releasing dirty buffer (bulk) to free list!releasing dirty buffer (bulk) to free list!releasing dirty buffer (bulk) to free list!releasing dirty buffer (bulk) to free list!found dirty buffer (bulk) on free list!bad magic number
bad magic number
Metadata corruption detected at 0x433840, xfs_sb block 0x0/0x1000
libxfs_writebufr: write verifer failed on xfs_sb bno 0x0/0x1000
releasing dirty buffer (bulk) to free list!mount: /mnt/ceph-recovery: wrong fs type, bad option, bad superblock on /dev/sdc1, missing codepage or helper program, or other error.

Ouch. This very much looked related to the actual issue we were seeing. So we tried to execute mkfs.xfs with a bunch of different sunit/swidth settings. Using ‘-d sunit=512 -d swidth=512‘ at least worked, so we decided to force its usage in the creation of our OSD XFS partition. This gave us a working XFS partition. Please note, sunit must not be larger than swidth (more on that later!).

Then we reconstructed how to restore all the metadata for the OSD (activate.monmap, active, block_uuid, bluefs, ceph_fsid, fsid, keyring, kv_backend, magic, mkfs_done, ready, require_osd_release, systemd, type, whoami). To identify the UUID, we can read the data from ‘ceph --format json osd dump‘, like this for all our OSDs (Zsh syntax ftw!):

synpromika@server1 ~ % for f in {0..37} ; printf "osd-$f: %s\n" "$(sudo ceph --format json osd dump | jq -r ".osds[] | select(.osd==$f) | .uuid")"
osd-0: 4568c300-ad83-4288-963e-badcd99bf54f
osd-1: e573a17a-ccde-4719-bdf8-eef66903ca4f
osd-2: 0e1b2626-f248-4e7d-9950-f1a46644754e
osd-3: 1ac6a0a2-20ee-4ed8-9f76-d24e900c800c
[...]

Identifying the corresponding raw device for each OSD UUID is possible via:

synpromika@server1 ~ % UUID="4568c300-ad83-4288-963e-badcd99bf54f"
synpromika@server1 ~ % readlink -f /dev/disk/by-partuuid/"${UUID}"
/dev/sdc1

The OSD’s key ID can be retrieved via:

synpromika@server1 ~ % OSD_ID=0
synpromika@server1 ~ % sudo ceph auth get osd."${OSD_ID}" -f json 2>/dev/null | jq -r '.[] | .key'
AQCKFpZdm0We[...]

Now we also need to identify the underlying block device:

synpromika@server1 ~ % OSD_ID=0
synpromika@server1 ~ % sudo ceph osd metadata osd."${OSD_ID}" -f json | jq -r '.bluestore_bdev_partition_path'    
/dev/sdc2

With all of this, we reconstructed the keyring, fsid, whoami, block + block_uuid files. All the other files inside the XFS metadata partition are identical on each OSD. So after placing and adjusting the corresponding metadata on the XFS partition for Ceph usage, we got a working OSD – hurray! Since we had to fix yet another 32 OSDs, we decided to automate this XFS partitioning and metadata recovery procedure.

We had a network share available on /srv/backup for storing backups of existing partition data. On each server, we tested the procedure with one single OSD before iterating over the list of remaining failing OSDs. We started with a shell script on server1, then adjusted the script for server2 and server3. This is the script, as we executed it on the 3rd server.
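
For illustration, here is a condensed, hypothetical sketch of such a per-OSD recovery loop, pieced together from the commands shown above (the real script handled more details; the backup file names are assumptions and the remaining metadata steps are only hinted at in the comments):

synpromika@server3 ~ % for OSD_ID in 12 13 14 15 16 17 23 24 25 35 36 37 ; do
  UUID="$(sudo ceph --format json osd dump | jq -r ".osds[] | select(.osd==$OSD_ID) | .uuid")"
  DEV="$(readlink -f /dev/disk/by-partuuid/${UUID})"
  sudo dd if="${DEV}" of="/srv/backup/backup-osd-${OSD_ID}.dd" bs=1M    # backup first!
  sudo mkfs.xfs -f -i size=2048 -d sunit=512 -d swidth=512 -m uuid="${UUID}" "${DEV}"
  # ... then mount the fresh partition and recreate keyring, fsid, whoami,
  # block + block_uuid from 'ceph auth get' + 'ceph osd metadata' as shown above,
  # copy the static metadata files over, unmount, and start the OSD again
done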

Thanks to this, we managed to get the Ceph cluster up and running again. We didn’t want to continue with the Ceph upgrade itself during the night though, as we wanted to know exactly what was going on and why the system behaved like that. Time for RCA!

Root Cause Analysis

So all OSDs except three on server2 failed, and the problem seemed to be related to XFS. Therefore, our starting point for the RCA was to identify what was different on server2 as compared to server1 + server3. My initial assumption was that this was related to some firmware issue with the involved controller (and as it turned out later, I was right!). The disks were attached as JBOD devices to a ServeRAID M5210 controller (with a stripe size of 512). Firmware state:

synpromika@server1 ~ % sudo storcli64 /c0 show all | grep '^Firmware'
Firmware Package Build = 24.16.0-0092
Firmware Version = 4.660.00-8156

synpromika@server2 ~ % sudo storcli64 /c0 show all | grep '^Firmware'
Firmware Package Build = 24.21.0-0112
Firmware Version = 4.680.00-8489

synpromika@server3 ~ % sudo storcli64 /c0 show all | grep '^Firmware'
Firmware Package Build = 24.16.0-0092
Firmware Version = 4.660.00-8156

This looked very promising, as server2 indeed runs with a different firmware version on the controller. But how so? Well, the motherboard of server2 got replaced by a Lenovo/IBM technician in January 2020, as we had a failing memory slot during a memory upgrade. As part of this procedure, the Lenovo/IBM technician installed the latest firmware versions. According to our documentation, some OSDs were rebuilt (due to the filestore->bluestore migration) in March and April 2020. It turned out that precisely those OSDs were the ones that survived the upgrade. So the surviving drives were created with a different firmware version running on the involved controller. All the other OSDs were created with an older controller firmware. But what difference does this make?

Now let’s check firmware changelogs. For the 24.21.0-0097 release we found this:

- Cannot create or mount xfs filesystem using xfsprogs 4.19.x kernel 4.20(SCGCQ02027889)
- xfs_info command run on an XFS file system created on a VD of strip size 1M shows sunit and swidth as 0(SCGCQ02056038)

Our XFS problem was certainly related to the controller’s firmware. We also recalled that our monitoring system had reported different sunit settings for the OSDs that were rebuilt in March and April. For example, OSD 21 was recreated and got different sunit settings:

WARN  server2.example.org  Mount options of /var/lib/ceph/osd/ceph-21      WARN - Missing: sunit=1024, Exceeding: sunit=512

We compared the new OSD 21 with an existing one (OSD 25 on server3):

synpromika@server2 ~ % systemctl show var-lib-ceph-osd-ceph\\x2d21.mount | grep sunit
Options=rw,noatime,attr2,inode64,sunit=512,swidth=512,noquota
synpromika@server3 ~ % systemctl show var-lib-ceph-osd-ceph\\x2d25.mount | grep sunit
Options=rw,noatime,attr2,inode64,sunit=1024,swidth=512,noquota

Thanks to our documentation, we could compare execution logs of their creation:

% diff -u ceph-disk-osd-25.log ceph-disk-osd-21.log
-synpromika@server2 ~ % sudo ceph-disk -v prepare --bluestore /dev/sdj --osd-id 25
+synpromika@server3 ~ % sudo ceph-disk -v prepare --bluestore /dev/sdi --osd-id 21
[...]
-command_check_call: Running command: /sbin/mkfs -t xfs -f -i size=2048 -- /dev/sdj1
-meta-data=/dev/sdj1              isize=2048   agcount=4, agsize=6272 blks
[...]
+command_check_call: Running command: /sbin/mkfs -t xfs -f -i size=2048 -- /dev/sdi1
+meta-data=/dev/sdi1              isize=2048   agcount=4, agsize=6336 blks
          =                       sectsz=4096  attr=2, projid32bit=1
          =                       crc=1        finobt=1, sparse=0, rmapbt=0, reflink=0
-data     =                       bsize=4096   blocks=25088, imaxpct=25
-         =                       sunit=128    swidth=64 blks
+data     =                       bsize=4096   blocks=25344, imaxpct=25
+         =                       sunit=64     swidth=64 blks
 naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
 log      =internal log           bsize=4096   blocks=1608, version=2
          =                       sectsz=4096  sunit=1 blks, lazy-count=1
 realtime =none                   extsz=4096   blocks=0, rtextents=0
[...]

So back then we had even tried to track this down, but couldn’t make sense of it at the time. Now, however, this sounds very much like it is related to the problem we saw with this Ceph/XFS failure. We follow Occam’s razor, assuming the simplest explanation is usually the right one, so let’s check the disk properties and see what differs:

synpromika@server1 ~ % sudo blockdev --getsz --getsize64 --getss --getpbsz --getiomin --getioopt /dev/sdk
4685545472
2398999281664
512
4096
524288
262144

synpromika@server2 ~ % sudo blockdev --getsz --getsize64 --getss --getpbsz --getiomin --getioopt /dev/sdk
4685545472
2398999281664
512
4096
262144
262144

See the difference between server1 and server2 for identical disks? The getiomin option now reports something different for them:

synpromika@server1 ~ % sudo blockdev --getiomin /dev/sdk            
524288
synpromika@server1 ~ % cat /sys/block/sdk/queue/minimum_io_size
524288

synpromika@server2 ~ % sudo blockdev --getiomin /dev/sdk 
262144
synpromika@server2 ~ % cat /sys/block/sdk/queue/minimum_io_size
262144

It doesn’t make sense that the minimum I/O size (iomin, AKA BLKIOMIN) is bigger than the optimal I/O size (ioopt, AKA BLKIOOPT). This led us to Bug 202127 – cannot mount or create xfs on a 597T device, which matches our findings here. But why did this XFS partition work in the past and fail now with the newer kernel version?
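
Incidentally, this also explains the diverging sunit/swidth values we saw earlier: mkfs.xfs derives its stripe geometry from exactly these I/O hints. As a hedged back-of-the-envelope calculation (values in 512-byte sectors):

sunit  = minimum_io_size / 512 = 524288 / 512 = 1024
swidth = optimal_io_size / 512 = 262144 / 512 =  512

So partitions created while the controller reported the larger minimum_io_size ended up with sunit=1024 and swidth=512, precisely the sunit > swidth combination that the newer kernel’s sanity check rejects.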

The XFS behaviour change

Now, given that we have backups of all the XFS partitions, we wanted to track down a) when this XFS behaviour was introduced, and b) whether (and if so, how) it would be possible to reuse the XFS partition without having to rebuild it from scratch (e.g. if you had no working Ceph OSD or backups left).

Let’s look at such a failing XFS partition with the Grml live system:

root@grml ~ # grml-version
grml64-full 2020.06 Release Codename Ausgehfuahangl [2020-06-24]
root@grml ~ # uname -a
Linux grml 5.6.0-2-amd64 #1 SMP Debian 5.6.14-2 (2020-06-09) x86_64 GNU/Linux
root@grml ~ # grml-hostname grml-2020-06
Setting hostname to grml-2020-06: done
root@grml ~ # exec zsh
root@grml-2020-06 ~ # dpkg -l xfsprogs util-linux
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name           Version      Architecture Description
+++-==============-============-============-=========================================
ii  util-linux     2.35.2-4     amd64        miscellaneous system utilities
ii  xfsprogs       5.6.0-1+b2   amd64        Utilities for managing the XFS filesystem

There it’s failing, no matter which mount option we try:

root@grml-2020-06 ~ # mount ./sdd1.dd /mnt
mount: /mnt: mount(2) system call failed: Structure needs cleaning.
root@grml-2020-06 ~ # dmesg | tail -30
[...]
[   64.788640] XFS (loop1): SB stripe unit sanity check failed
[   64.788671] XFS (loop1): Metadata corruption detected at xfs_sb_read_verify+0x102/0x170 [xfs], xfs_sb block 0xffffffffffffffff
[   64.788671] XFS (loop1): Unmount and run xfs_repair
[   64.788672] XFS (loop1): First 128 bytes of corrupted metadata buffer:
[   64.788673] 00000000: 58 46 53 42 00 00 10 00 00 00 00 00 00 00 62 00  XFSB..........b.
[   64.788674] 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[   64.788675] 00000020: 32 b6 dc 35 53 b7 44 96 9d 63 30 ab b3 2b 68 36  2..5S.D..c0..+h6
[   64.788675] 00000030: 00 00 00 00 00 00 40 08 00 00 00 00 00 00 01 00  ......@.........
[   64.788675] 00000040: 00 00 00 00 00 00 01 01 00 00 00 00 00 00 01 02  ................
[   64.788676] 00000050: 00 00 00 01 00 00 18 80 00 00 00 04 00 00 00 00  ................
[   64.788677] 00000060: 00 00 06 48 bd a5 10 00 08 00 00 02 00 00 00 00  ...H............
[   64.788677] 00000070: 00 00 00 00 00 00 00 00 0c 0c 0b 01 0d 00 00 19  ................
[   64.788679] XFS (loop1): SB validate failed with error -117.
root@grml-2020-06 ~ # mount -t xfs -o rw,relatime,attr2,inode64,sunit=1024,swidth=512,noquota ./sdd1.dd /mnt/
mount: /mnt: wrong fs type, bad option, bad superblock on /dev/loop1, missing codepage or helper program, or other error.
32 root@grml-2020-06 ~ # dmesg | tail -1
[   66.342976] XFS (loop1): stripe width (512) must be a multiple of the stripe unit (1024)
root@grml-2020-06 ~ # mount -t xfs -o rw,relatime,attr2,inode64,sunit=512,swidth=512,noquota ./sdd1.dd /mnt/
mount: /mnt: mount(2) system call failed: Structure needs cleaning.
32 root@grml-2020-06 ~ # dmesg | tail -14
[   66.342976] XFS (loop1): stripe width (512) must be a multiple of the stripe unit (1024)
[   80.751277] XFS (loop1): SB stripe unit sanity check failed
[   80.751323] XFS (loop1): Metadata corruption detected at xfs_sb_read_verify+0x102/0x170 [xfs], xfs_sb block 0xffffffffffffffff 
[   80.751324] XFS (loop1): Unmount and run xfs_repair
[   80.751325] XFS (loop1): First 128 bytes of corrupted metadata buffer:
[   80.751327] 00000000: 58 46 53 42 00 00 10 00 00 00 00 00 00 00 62 00  XFSB..........b.
[   80.751328] 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
[   80.751330] 00000020: 32 b6 dc 35 53 b7 44 96 9d 63 30 ab b3 2b 68 36  2..5S.D..c0..+h6
[   80.751331] 00000030: 00 00 00 00 00 00 40 08 00 00 00 00 00 00 01 00  ......@.........
[   80.751331] 00000040: 00 00 00 00 00 00 01 01 00 00 00 00 00 00 01 02  ................
[   80.751332] 00000050: 00 00 00 01 00 00 18 80 00 00 00 04 00 00 00 00  ................
[   80.751333] 00000060: 00 00 06 48 bd a5 10 00 08 00 00 02 00 00 00 00  ...H............
[   80.751334] 00000070: 00 00 00 00 00 00 00 00 0c 0c 0b 01 0d 00 00 19  ................
[   80.751338] XFS (loop1): SB validate failed with error -117.

xfs_repair doesn’t help either:

root@grml-2020-06 ~ # xfs_info ./sdd1.dd
meta-data=./sdd1.dd              isize=2048   agcount=4, agsize=6272 blks
         =                       sectsz=4096  attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=0, rmapbt=0
         =                       reflink=0
data     =                       bsize=4096   blocks=25088, imaxpct=25
         =                       sunit=128    swidth=64 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=1608, version=2
         =                       sectsz=4096  sunit=1 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

root@grml-2020-06 ~ # xfs_repair ./sdd1.dd
Phase 1 - find and verify superblock...
bad primary superblock - bad stripe width in superblock !!!

attempting to find secondary superblock...
..............................................................................................Sorry, could not find valid secondary superblock
Exiting now.

With the “SB stripe unit sanity check failed” message, we could easily track this down to commit fa4ca9c:

% git show fa4ca9c5574605d1e48b7e617705230a0640b6da | cat
commit fa4ca9c5574605d1e48b7e617705230a0640b6da
Author: Dave Chinner <[email protected]>
Date:   Tue Jun 5 10:06:16 2018 -0700
    
    xfs: catch bad stripe alignment configurations
    
    When stripe alignments are invalid, data alignment algorithms in the
    allocator may not work correctly. Ensure we catch superblocks with
    invalid stripe alignment setups at mount time. These data alignment
    mismatches are now detected at mount time like this:
    
    XFS (loop0): SB stripe unit sanity check failed
    XFS (loop0): Metadata corruption detected at xfs_sb_read_verify+0xab/0x110, xfs_sb block 0xffffffffffffffff
    XFS (loop0): Unmount and run xfs_repair
    XFS (loop0): First 128 bytes of corrupted metadata buffer:
    0000000091c2de02: 58 46 53 42 00 00 10 00 00 00 00 00 00 00 10 00  XFSB............
    0000000023bff869: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00000000cdd8c893: 17 32 37 15 ff ca 46 3d 9a 17 d3 33 04 b5 f1 a2  .27...F=...3....
    000000009fd2844f: 00 00 00 00 00 00 00 04 00 00 00 00 00 00 06 d0  ................
    0000000088e9b0bb: 00 00 00 00 00 00 06 d1 00 00 00 00 00 00 06 d2  ................
    00000000ff233a20: 00 00 00 01 00 00 10 00 00 00 00 01 00 00 00 00  ................
    000000009db0ac8b: 00 00 03 60 e1 34 02 00 08 00 00 02 00 00 00 00  ...`.4..........
    00000000f7022460: 00 00 00 00 00 00 00 00 0c 09 0b 01 0c 00 00 19  ................
    XFS (loop0): SB validate failed with error -117.
    
    And the mount fails.
    
    Signed-off-by: Dave Chinner <[email protected]>
    Reviewed-by: Carlos Maiolino <[email protected]>
    Reviewed-by: Darrick J. Wong <[email protected]>
    Signed-off-by: Darrick J. Wong <[email protected]>

diff --git fs/xfs/libxfs/xfs_sb.c fs/xfs/libxfs/xfs_sb.c
index b5dca3c8c84d..c06b6fc92966 100644
--- fs/xfs/libxfs/xfs_sb.c
+++ fs/xfs/libxfs/xfs_sb.c
@@ -278,6 +278,22 @@ xfs_mount_validate_sb(
                return -EFSCORRUPTED;
        }
        
+       if (sbp->sb_unit) {
+               if (!xfs_sb_version_hasdalign(sbp) ||
+                   sbp->sb_unit > sbp->sb_width ||
+                   (sbp->sb_width % sbp->sb_unit) != 0) {
+                       xfs_notice(mp, "SB stripe unit sanity check failed");
+                       return -EFSCORRUPTED;
+               } 
+       } else if (xfs_sb_version_hasdalign(sbp)) { 
+               xfs_notice(mp, "SB stripe alignment sanity check failed");
+               return -EFSCORRUPTED;
+       } else if (sbp->sb_width) {
+               xfs_notice(mp, "SB stripe width sanity check failed");
+               return -EFSCORRUPTED;
+       }
+
+       
        if (xfs_sb_version_hascrc(&mp->m_sb) &&
            sbp->sb_blocksize < XFS_MIN_CRC_BLOCKSIZE) {
                xfs_notice(mp, "v5 SB sanity check failed");

This change is included in kernel versions 4.18-rc1 and newer:

% git describe --contains fa4ca9c5574605d1e48
v4.18-rc1~37^2~14

Now let’s try with an older kernel version (4.9.0), using the old Grml 2017.05 release:

root@grml ~ # grml-version
grml64-small 2017.05 Release Codename Freedatensuppe [2017-05-31]
root@grml ~ # uname -a
Linux grml 4.9.0-1-grml-amd64 #1 SMP Debian 4.9.29-1+grml.1 (2017-05-24) x86_64 GNU/Linux
root@grml ~ # lsb_release -a
No LSB modules are available.
Distributor ID: Debian
Description:    Debian GNU/Linux 9.0 (stretch)
Release:        9.0
Codename:       stretch
root@grml ~ # grml-hostname grml-2017-05
Setting hostname to grml-2017-05: done
root@grml ~ # exec zsh
root@grml-2017-05 ~ #

root@grml-2017-05 ~ # xfs_info ./sdd1.dd
xfs_info: ./sdd1.dd is not a mounted XFS filesystem
1 root@grml-2017-05 ~ # xfs_repair ./sdd1.dd
Phase 1 - find and verify superblock...
bad primary superblock - bad stripe width in superblock !!!

attempting to find secondary superblock...
..............................................................................................Sorry, could not find valid secondary superblock
Exiting now.
1 root@grml-2017-05 ~ # mount ./sdd1.dd /mnt
root@grml-2017-05 ~ # mount -t xfs
/root/sdd1.dd on /mnt type xfs (rw,relatime,attr2,inode64,sunit=1024,swidth=512,noquota)
root@grml-2017-05 ~ # ls /mnt
activate.monmap  active  block  block_uuid  bluefs  ceph_fsid  fsid  keyring  kv_backend  magic  mkfs_done  ready  require_osd_release  systemd  type  whoami
root@grml-2017-05 ~ # xfs_info /mnt
meta-data=/dev/loop1             isize=2048   agcount=4, agsize=6272 blks
         =                       sectsz=4096  attr=2, projid32bit=1
         =                       crc=1        finobt=1 spinodes=0 rmapbt=0
         =                       reflink=0
data     =                       bsize=4096   blocks=25088, imaxpct=25
         =                       sunit=128    swidth=64 blks
naming   =version 2              bsize=4096   ascii-ci=0 ftype=1
log      =internal               bsize=4096   blocks=1608, version=2
         =                       sectsz=4096  sunit=1 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

Mounting there indeed works! Now, if we mount the filesystem with new and proper sunit/swidth settings using the older kernel, it should rewrite them on disk:

root@grml-2017-05 ~ # mount -t xfs -o sunit=512,swidth=512 ./sdd1.dd /mnt/
root@grml-2017-05 ~ # umount /mnt/

And indeed, mounting this rewritten filesystem then also works with newer kernels:

root@grml-2020-06 ~ # mount ./sdd1.rewritten /mnt/
root@grml-2020-06 ~ # xfs_info /root/sdd1.rewritten
meta-data=/dev/loop1             isize=2048   agcount=4, agsize=6272 blks
         =                       sectsz=4096  attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=0, rmapbt=0
         =                       reflink=0
data     =                       bsize=4096   blocks=25088, imaxpct=25
         =                       sunit=64    swidth=64 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=1608, version=2
         =                       sectsz=4096  sunit=1 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
root@grml-2020-06 ~ # mount -t xfs                
/root/sdd1.rewritten on /mnt type xfs (rw,relatime,attr2,inode64,logbufs=8,logbsize=32k,sunit=512,swidth=512,noquota)

FTR: The ‘sunit=512,swidth=512‘ from the xfs mount options is identical to xfs_info’s output ‘sunit=64,swidth=64‘: mount.xfs’s sunit value is given in 512-byte units (see man 5 xfs), while the xfs_info output reported here is in filesystem blocks with a block size (bsize) of 4096, so 512 × 512 bytes = 64 × 4096 bytes = 262144 bytes.
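
As a quick sanity check, the conversion can be verified with plain shell arithmetic:

echo $(( 512 * 512 ))    # mount option: 512 units of 512 bytes = 262144 bytes
echo $(( 64 * 4096 ))    # xfs_info: 64 blocks of 4096 bytes = 262144 bytes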

mkfs uses the minimum and optimal I/O sizes for stripe unit and stripe width; you can check this e.g. via the following (note that server2 with the fixed firmware version reports proper values, whereas server3 with the broken controller firmware reports nonsense):

synpromika@server2 ~ % for i in /sys/block/sd*/queue/ ; do printf "%s: %s %s\n" "$i" "$(cat "$i"/minimum_io_size)" "$(cat "$i"/optimal_io_size)" ; done
[...]
/sys/block/sdc/queue/: 262144 262144
/sys/block/sdd/queue/: 262144 262144
/sys/block/sde/queue/: 262144 262144
/sys/block/sdf/queue/: 262144 262144
/sys/block/sdg/queue/: 262144 262144
/sys/block/sdh/queue/: 262144 262144
/sys/block/sdi/queue/: 262144 262144
/sys/block/sdj/queue/: 262144 262144
/sys/block/sdk/queue/: 262144 262144
/sys/block/sdl/queue/: 262144 262144
/sys/block/sdm/queue/: 262144 262144
/sys/block/sdn/queue/: 262144 262144
[...]

synpromika@server3 ~ % for i in /sys/block/sd*/queue/ ; do printf "%s: %s %s\n" "$i" "$(cat "$i"/minimum_io_size)" "$(cat "$i"/optimal_io_size)" ; done
[...]
/sys/block/sdc/queue/: 524288 262144
/sys/block/sdd/queue/: 524288 262144
/sys/block/sde/queue/: 524288 262144
/sys/block/sdf/queue/: 524288 262144
/sys/block/sdg/queue/: 524288 262144
/sys/block/sdh/queue/: 524288 262144
/sys/block/sdi/queue/: 524288 262144
/sys/block/sdj/queue/: 524288 262144
/sys/block/sdk/queue/: 524288 262144
/sys/block/sdl/queue/: 524288 262144
/sys/block/sdm/queue/: 524288 262144
/sys/block/sdn/queue/: 524288 262144
[...]

This is the underlying reason why the XFS partitions were initially created with incorrect sunit/swidth settings. The broken firmware of server1 and server3 caused the incorrect settings – they were ignored by old(er) XFS/kernel versions, but treated as an error by new ones.

Make sure to also read the XFS FAQ regarding “How to calculate the correct sunit,swidth values for optimal performance”. We also stumbled upon two interesting reads in Red Hat’s knowledge base: 5075561 + 2150101 (requires an active subscription, though) and #1835947.

Am I affected? How to work around it?

To check whether your XFS mount points are affected by this issue, the following command line should be useful:

awk '$3 == "xfs"{print $2}' /proc/self/mounts | while read mount ; do echo -n "$mount " ; xfs_info $mount | awk '$0 ~ "swidth"{gsub(/.*=/,"",$2); gsub(/.*=/,"",$3); print $2,$3}' | awk '{ if ($1 > $2) print "impacted"; else print "OK"}' ; done
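
The same check, reformatted for readability (functionally equivalent to the one-liner above):

awk '$3 == "xfs" {print $2}' /proc/self/mounts | while read -r mount ; do
  printf '%s ' "$mount"
  xfs_info "$mount" \
    | awk '/swidth/ {gsub(/.*=/,"",$2); gsub(/.*=/,"",$3); print $2, $3}' \
    | awk '{ if ($1 > $2) print "impacted"; else print "OK" }'
done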

If you run into the above situation, the only known way to get your original XFS partition working again is to boot into an older kernel version (4.17 or older), mount the XFS partition with correct sunit/swidth settings, and then boot back into your new system (kernel-wise).
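
A minimal sketch of that workaround – /dev/sdX1 is a placeholder for the affected partition, and the sunit/swidth values must match your actual storage layout (as in our case above):

# booted into a kernel <= 4.17, e.g. Grml 2017.05
mount -t xfs -o sunit=512,swidth=512 /dev/sdX1 /mnt
umount /mnt
# the superblock has been rewritten; now reboot back into the current system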

Lessons learned

  • document everything and ensure all relevant information is available (including the actual times of changes and the kernel/package/firmware/… versions in use); thorough documentation was our most significant asset in this case, because we had all the data and information we needed during the emergency handling as well as for the post-mortem/RCA
  • if something changes unexpectedly, dig deeper
  • know whom to ask; a network of experts pays off
  • including timestamps in your shell makes reconstruction easier (the more people and documentation involved, the harder it gets to wade through it)
  • keep an eye on changelogs/release notes
  • apply regular updates and don’t forget invisible layers (e.g. BIOS, controller/disk firmware, IPMI/OOB (ILO/RAC/IMM/…) firmware)
  • reboot regularly, to keep the delta of pending changes from growing (a bigger delta makes debugging harder)

Thanks: Darshaka Pathirana, Chris Hofstaedtler and Michael Hanscho.

Looking for help with your IT infrastructure? Let us know!

09 April, 2021 12:39PM by mika

April 08, 2021

Sean Whitton

consfigurator-live-build

One of my goals for Consfigurator is to make it capable of installing Debian to my laptop, so that I can stop booting to GRML and manually partitioning and debootstrapping a basic system, only to then turn to configuration management to set everything else up. My configuration management should be able to handle the partitioning and debootstrapping, too.

The first stage was to make Consfigurator capable of debootstrapping a basic system, chrooting into it, and applying other arbitrary configuration, such as installing packages. That’s been in place for some weeks now. It’s sophisticated enough to avoid starting up newly installed services, but I still need to add some bind mounting.

Another significant piece is teaching Consfigurator how to partition block devices. That’s quite tricky to do in a sufficiently general way – I want to cleanly support various combinations of LUKS, LVM and regular partitions, including populating /etc/crypttab and /etc/fstab. I have some ideas about how to do it, but it’ll probably take a few tries to get the abstractions right.

Let’s imagine that code is all in place, such that Consfigurator can be pointed at a block device and it will install a bootable Debian system to it. Then to install Debian to my laptop I’d just need to take my laptop’s disk drive out and plug it into another system, and run Consfigurator on that system, as root, pointed at the block device representing my laptop’s disk drive. For virtual machines, it would be easy to write code which loop-mounts an empty disk image, and then Consfigurator could be pointed at the loop-mounted block device, thereby making the disk image file bootable.

This is adequate for virtual machines, or small single-board computers with tiny storage devices (not that I actually use any of those, but I want Consfigurator to be able to make disk images for them!). But it’s not much good for my laptop. I casually referred to taking out my laptop’s disk drive and connecting it to another computer, but this would void my laptop’s warranty. And Consfigurator would not be able to update my laptop’s NVRAM, as is needed on UEFI systems.

What’s wanted here is a live system which can run Consfigurator directly on the laptop, pointed at the block device representing its physical disk drive. Ideally this live system comes with a chroot with the root filesystem for the new Debian install already built, so that network access is not required, and all Consfigurator has to do is partition the drive and copy in the contents of the chroot. The live system could be set up to automatically start doing that upon boot, but another option is to just make Consfigurator itself available to be used interactively. The user boots the live system, starts up Emacs, starts up Lisp, and executes a Consfigurator deployment, supplying the block device representing the laptop’s disk drive as an argument to the deployment. Consfigurator goes off and partitions that drive, copies in the contents of the chroot, and executes grub-install to make the laptop bootable. This is also much easier to debug than a live system which tries to start partitioning upon boot. It would look something like this:

    ;; melete.silentflame.com is a Consfigurator host object representing the
    ;; laptop, including information about the partitions it should have
    (deploy-these :local ...
      (chroot:partitioned-and-installed
        melete.silentflame.com "/srv/chroot/melete" "/dev/nvme0n1"))

Now, building live systems is a fair bit more involved than installing Debian to a disk drive and making it bootable, it turns out. While I want Consfigurator to be able to completely replace the Debian Installer, I decided that it is not worth trying to reimplement the relevant parts of the Debian Live tool suite, because I do not need to make arbitrary customisations to any live systems. I just need to have some packages installed and some files in place. Nevertheless, it is worth teaching Consfigurator how to invoke Debian Live, so that the customisation of the chroot which isn’t just a matter of passing options to lb_config(1) can be done with Consfigurator. This is what I’ve ended up with – in Consfigurator’s source code:

(defpropspec image-built :lisp (config dir properties)
  "Build an image under DIR using live-build(7), where the resulting live
system has PROPERTIES, which should contain, at a minimum, a property from
CONSFIGURATOR.PROPERTY.OS setting the Debian suite and architecture.  CONFIG
is a list of arguments to pass to lb_config(1), not including the '-a' and
'-d' options, which Consfigurator will supply based on PROPERTIES.

This property runs the lb_config(1), lb_bootstrap(1), lb_chroot(1) and
lb_binary(1) commands to build or rebuild the image.  Rebuilding occurs only
when changes to CONFIG or PROPERTIES mean that the image is potentially
out-of-date; e.g. if you just add some new items to PROPERTIES then in most
cases only lb_chroot(1) and lb_binary(1) will be re-run.

Note that lb_chroot(1) and lb_binary(1) both run after applying PROPERTIES,
and might undo some of their effects.  For example, to configure
/etc/apt/sources.list, you will need to use CONFIG not PROPERTIES."
  (:desc (declare (ignore config properties))
         #?"Debian Live image built in ${dir}")
  (let* (...)
    ;; ...
    `(eseqprops
      ;; ...
      (on-change
          (eseqprops
           (on-change
               (file:has-content ,auto/config ,(auto/config config) :mode #o755)
             (file:does-not-exist ,@clean)
             (%lbconfig ,dir)
             (%lbbootstrap t ,dir))
           (%lbbootstrap nil ,dir)
           (deploys ((:chroot :into ,chroot)) ,host))
        (%lbchroot ,dir)
        (%lbbinary ,dir)))))

Here, %lbconfig is a property running lb_config(1), %lbbootstrap one which runs lb_bootstrap(1), etc. Those properties all just change directory to the right place and run the command, essentially, with a little extra code to handle failed debootstraps and the like.

The ON-CHANGE and ESEQPROPS combinators work together to sequence the interaction of the Debian Live suite and Consfigurator.

  • In the innermost ON-CHANGE expression: create the file auto/config and populate it with the call to lb_config(1) that we need to make, as described in the Debian Live manual, chapter 6.

    • If doing so resulted in a change to the auto/config file – e.g. the user added some more options – ensure that lb_config(1) and lb_bootstrap(1) both get rerun.
  • Now in the inner ESEQPROPS expression, use DEPLOYS to configure the chroot, essentially by forking into the chroot and recursively reinvoking Consfigurator.

  • Finally, if any of the above resulted in a change being made, call lb_chroot(1) and lb_binary(1).

This way, we only rebuild the chroot if the configuration changed, and we only rebuild the image if the chroot changed.

Now over in my personal consfig:

(try-register-data-source
 :git-snapshot :name "consfig" :repo #P"src/cl/consfig/" ...)

(defproplist hybrid-live-iso-built :lisp ()
  "Build a Debian Live system in /srv/live/spw.

Typically this property is not applied in a DEFHOST form, but rather run as
needed at the REPL.  The reason for this is that otherwise the whole image will
get rebuilt each time a commit is made to my dotfiles repo or to my consfig."
  (:desc "Sean's Debian Live system image built")
  (live-build:image-built.
      '("--archive-areas" "main contrib non-free" ...)
      "/srv/live/spw"
    (os:debian-stable "buster" :amd64)
    (basic-props)
    (apt:installed "whatever" "you" "want")

    (git:snapshot-extracted "/etc/skel/src" "dotfiles")
    (file:is-copy-of "/etc/skel/.bashrc" "/etc/skel/src/dotfiles/.bashrc")

    (git:snapshot-extracted "/root/src/cl" "consfig")))

The first argument to LIVE-BUILD:IMAGE-BUILT. is additional arguments to lb_config(1). The third argument onwards are the properties for the live system. The cool thing is GIT:SNAPSHOT-EXTRACTED – the calls to this ensure that a copy of my Emacs configuration and my consfig end up in the live image, ready to be used interactively to install Debian, as described above. I’ll need to add something like (chroot:host-chroot-bootstrapped melete.silentflame.com "/srv/chroot/melete") too.

As with everything Consfigurator-related, Joey Hess’s Propellor is the giant upon whose shoulders I’m standing.

08 April, 2021 11:35PM

Thorsten Alteholz

My Debian Activities in March 2021

FTP master

Things never turn out the way you expect, so this month I was only able to accept 38 packages and rejected none. Due to the freeze, the overall number of packages that got accepted was 88.

Debian LTS

This was my eighty-first month that I did some work for the Debian LTS initiative, started by Raphael Hertzog at Freexian.

This month my all in all workload has been 30h. During that time I did LTS and normal security uploads of:

  • [DLA 2606-1] lxml security update for one CVE
  • [DSA 4880-1] lxml security update for one CVE
  • [DLA 2611-1] ldb security update for two CVEs
  • [DLA 2612-1] leptonlib security update for four CVEs

I also prepared debdiffs for unstable and/or buster for leptonlib and libebml, which for one reason or another have not resulted in an upload yet.

Last but not least I did some days of frontdesk duties.

Debian ELTS

This month was the thirty-third ELTS month.

During my allocated time I uploaded:

  • ELA-388-1 for zeromq3
  • ELA-390-1 for lxml
  • ELA-391-1 for jasper
  • ELA-393-1 for ldb
  • ELA-394-1 for leptonlib

Last but not least I did some days of frontdesk duties.

Other stuff

On my never-ending golang challenge I uploaded (or sponsored for thola dependencies):
golang-github-tombuildsstuff-giovanni, golang-github-apparentlymart-go-userdirs, golang-github-apparentlymart-go-shquot, golang-github-likexian-gokit, golang-gopkg-mail.v2, golang-gopkg-redis.v5, golang-github-facette-natsort, golang-github-opentracing-contrib-go-grpc, golang-github-felixge-fgprof, golang-github-gogo-status, golang-github-leanovate-gopter, golang-github-opentracing-basictracer-go, golang-github-lightstep-lightstep-tracer-common, golang-github-go-sourcemap-sourcemap, golang-github-igm-pubsub, golang-github-igm-sockjs-go, golang-github-centrifugal-protocol, golang-github-mna-redisc, golang-github-fzambia-eagle, golang-github-centrifugal-centrifuge, golang-github-chromedp-sysutil, golang-github-client9-misspell, golang-github-knq-snaker, cdproto-gen, golang-github-mattermost-xml-roundtrip-validator, golang-github-crewjam-saml, ssllabs-scan, golang-uber-automaxprocs, golang-uber-goleak, golang-github-k0kubun-go-ansi, golang-github-schollz-progressbar, golang-github-komkom-toml, golang-github-labstack-echo, golang-github-inexio-go-monitoringplugin

08 April, 2021 10:11AM by alteholz

Ryan Kavanagh

Writing BASIC-8 on the TSS/8

I recently discovered SDF’s PiDP-8. You can access it over SSH and watch the blinkenlights over its twitch stream. It runs TSS/8, a time-sharing operating system written in 1967 by Adrian van de Goor while a grad student here at CMU. I’ve been having fun tinkering with it, and I just wrote my first BASIC program1 since high school. It plots the graph of some user-specified univariate function. I don’t claim that it’s elegant or well-engineered, but it works!

10  DEF FNC(X) = 19 * COS(X/2)
20  FOR Y = 20 TO -20 STEP -1
30     FOR X = -25 TO 24
40     LET V = FNC(X)
50     GOSUB 90
60  NEXT X
70  PRINT ""
80  NEXT Y
85  STOP
90  REM SUBROUTINE PRINTS AXES AND PLOT
100 IF X = 0 THEN 150
110 IF Y = 0 THEN 150
120 REM X != 0 AND Y != 0 SO IN QUADRANT
130 GOSUB 290
140 RETURN
150 GOSUB 170
160 RETURN
170 REM SUBROUTINE PRINTS AXES (X = 0 OR Y = 0)
180 IF X + Y = 0 THEN 230
190 IF X = 0 THEN 250
200 IF Y = 0 THEN 270
210 PRINT "AXES INVARIANT VIOLATED"
220 STOP
230 PRINT "+";
240 GOTO 280
250 PRINT "I";
260 GOTO 280
270 PRINT "-";
280 RETURN
290 REM SUBROUTINE PRINTS FUNCTION GRAPH (X != 0 AND Y != 0)
300 IF 0 <= Y THEN 350
310 REM Y < 0
320 IF V <= Y THEN 410
330 REM Y < 0 AND Y < V SO OUTSIDE OF PLOT AREA
340 GOTO 390
350 REM 0 <= Y
360 IF Y <= V THEN 410
370 REM 0 <= Y  AND V < Y SO OUTSIDE OF PLOT AREA
380 GOTO 390
390 PRINT " ";
400 RETURN
410 PRINT "*";
420 RETURN
430 REM COPYRIGHT 2021 RYAN KAVANAGH RAK AT RAK.AC
440 END

It produces the following output:

                         I
                         I
*           **           I           **
*           **           I           **
**          **          *I*          **          *
**          **          *I*          **          *
**         ***          *I*          ***         *
**         ****         *I*         ****         *
**         ****         *I*         ****         *
**         ****         *I*         ****         *
**         ****        **I**        ****         *
***        ****        **I**        ****        **
***        ****        **I**        ****        **
***        ****        **I**        ****        **
***       *****        **I**        *****       **
***       ******       **I**       ******       **
***       ******       **I**       ******       **
***       ******       **I**       ******       **
***       ******       **I**       ******       **
***       ******      ***I***      ******       **
-------------------------+------------------------
    ******      ******   I   ******      ******
    ******      ******   I   ******      ******
    *****       ******   I   ******       *****
    *****       ******   I   ******       *****
    *****        *****   I   *****        *****
    *****        *****   I   *****        *****
    *****        *****   I   *****        *****
    *****        ****    I    ****        *****
    *****        ****    I    ****        *****
     ****        ****    I    ****        ****
     ****        ****    I    ****        ****
     ***         ****    I    ****         ***
     ***          ***    I    ***          ***
     ***          ***    I    ***          ***
     ***          ***    I    ***          ***
      **          **     I     **          **
      **          **     I     **          **
      *            *     I     *            *
                         I
                         I

Next up, I am going to try my hand at writing some FORTRAN or some FOCAL69. If you like tinkering with old systems, then you should give the TSS/8 a try.


  1. It’s written in the BASIC-8 dialect.

08 April, 2021 12:08AM

April 07, 2021

Emmanuel Kasper

Manually install a single node Kubernetes cluster on Debian

Debian has work-in-progress packages for Kubernetes, which work well enough for a testing and learning environment. Bootstrapping a cluster with the kubeadm deployer using these packages is not that hard, and the process is similar to the upstream kubeadm documentation.

Install necessary packages in a VM

Install a throwaway VM with Vagrant.

apt install vagrant vagrant-libvirt
vagrant init debian/testing64

Bump the RAM and CPU of the VM, Kubernetes needs at least 2 gigs and 2 cores.

awk  -i inplace '1;/^Vagrant.configure\("2"\) do \|config/ {print "  config.vm.provider :libvirt do |vm|  vm.memory=2048 end"}' Vagrantfile
awk -i inplace '1;/^Vagrant.configure\("2"\) do \|config/ {print " config.vm.provider :libvirt do |vm| vm.cpus=2 end"}' Vagrantfile
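
Those two awk invocations should leave the top of the Vagrantfile looking something like this (illustrative; the exact remainder depends on your vagrant init output):

Vagrant.configure("2") do |config|
  config.vm.provider :libvirt do |vm| vm.cpus=2 end
  config.vm.provider :libvirt do |vm|  vm.memory=2048 end
  [...]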

Start the VM, login, update the package index.

vagrant up
vagrant ssh
sudo apt update

Install a container engine; here we use docker.io, but we could also use containerd (both are packaged in Debian) or cri-o.

sudo apt install --yes --no-install-recommends docker.io curl

Install the Kubernetes binaries. This will install kubelet, the system service which will manage the containers, and kubectl, the user/admin tool to manage the cluster.

sudo apt install --yes kubernetes-{node,client} containernetworking-plugins

Although it is not technically mandatory, we will use kubeadm, the most popular installer to create a Kubernetes cluster. Kubeadm is not packaged in Debian, so we have to download an upstream binary.

wget https://dl.k8s.io/v1.20.5/kubernetes-server-linux-amd64.tar.gz

sha512sum kubernetes-server-linux-amd64.tar.gz
28529733bf34f5d5b72eabe30a81df98cc7f8e529590f807745cd67986a2c5c3eb86cebc7ecbcfc3df3c50416306e5d150948f2483933ea46c2aebaeb871ea8f kubernetes-server-linux-amd64.tar.gz

sudo tar --directory=/usr/local/sbin --strip-components 3 -xaf kubernetes-server-linux-amd64.tar.gz kubernetes/server/bin/kubeadm
sudo chmod +x /usr/local/sbin/kubeadm
sudo kubeadm version
kubeadm version: &version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.5", GitCommit:"6b1d87acf3c8253c123756b9e61dac642678305f", GitTreeState:"clean", BuildDate:"2021-03-18T01:08:27Z", GoVersion:"go1.15.8", Compiler:"gc", Platform:"linux/amd64"}

Add a kubelet systemd unit:

RELEASE_VERSION="v0.4.0"
curl -sSL "https://raw.githubusercontent.com/kubernetes/release/${RELEASE_VERSION}/cmd/kubepkg/templates/latest/deb/kubelet/lib/systemd/system/kubelet.service" | sudo tee /etc/systemd/system/kubelet.service
sudo systemctl enable kubelet

and a default config file for kubeadm

RELEASE_VERSION="v0.4.0"
sudo mkdir -p /etc/systemd/system/kubelet.service.d
curl -sSL "https://raw.githubusercontent.com/kubernetes/release/${RELEASE_VERSION}/cmd/kubepkg/templates/latest/deb/kubeadm/10-kubeadm.conf" | sudo tee /etc/systemd/system/kubelet.service.d/10-kubeadm.conf

Finally, we need to help kubelet find the components needed for container networking.

echo 'KUBELET_EXTRA_ARGS="--cni-bin-dir=/usr/lib/cni"' | sudo tee /etc/default/kubelet

Create a cluster

Initialize a cluster with kubeadm: this will download container images for the Kubernetes control plane (= the brain of the cluster), and start the containers via the kubelet service. Yes, a good part of Kubernetes itself runs in containers.

sudo kubeadm init --pod-network-cidr=10.244.0.0/16
...
...
Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config

Follow the instructions from the kubeadm output, and verify you have a single node cluster, with the status NotReady.

kubectl get nodes 
NAME      STATUS     ROLES                  AGE    VERSION
testing   NotReady   control-plane,master   9m9s   v1.20.5

At that point you should also have a bunch of containers running on the node:

sudo docker ps --format '{{.Names}}'
k8s_kube-apiserver_kube-apiserver-testing_kube-system_2711c230d39ccda1e74d1d6386a05cee_0
k8s_POD_kube-apiserver-testing_kube-system_2711c230d39ccda1e74d1d6386a05cee_0
k8s_etcd_etcd-testing_kube-system_4749b1bca3b1a73fd09c8e299d7030fe_0
k8s_POD_etcd-testing_kube-system_4749b1bca3b1a73fd09c8e299d7030fe_0
...

The kubelet service also needs an external network plugin to get the cluster in Ready state.

sudo systemctl status kubelet
...
Mar 28 09:28:43 testing kubelet[9405]: E0328 09:28:43.958059 9405 kubelet.go:2188] Container runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized

Let’s add that network plugin. Download the flannel network plugin definition, and schedule flannel to run on all nodes of your cluster:

wget https://raw.githubusercontent.com/coreos/flannel/master/Documentation/kube-flannel.yml
kubectl apply --filename=kube-flannel.yml

After a dozen seconds or so, your node should be in Ready status.

kubectl get nodes 
NAME      STATUS   ROLES                  AGE   VERSION
testing   Ready    control-plane,master   16m   v1.20.5

Deploy a test application

Our node is now in Ready status, but we cannot run applications on it yet, since we only have a master node, an administrative node which by default cannot run user applications.

kubectl describe node testing | grep ^Taints
Taints: node-role.kubernetes.io/master:NoSchedule

Let’s allow node testing to run user applications:

kubectl taint node testing node-role.kubernetes.io/master-

Deploy a nginx container:

kubectl run my-nginx-pod --image=docker.io/library/nginx --port=80 --labels="app=http-content" 
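
Before wiring up the service, you can check that the pod is running; the label selector matches the --labels value we just set:

kubectl get pods --selector app=http-content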

Create a Kubernetes service to access this pod externally:

cat service.yaml

apiVersion: v1
kind: Service
metadata:
  name: my-k8s-service
spec:
  type: NodePort
  ports:
  - port: 80
    nodePort: 30000
  selector:
    app: http-content

kubectl create --filename service.yaml
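
You can confirm that the service exists and is bound to node port 30000:

kubectl get service my-k8s-service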

Access the service via its IP address:

curl 192.168.121.63:30000
...
Thank you for using nginx.

Notes

I will try to get this blog post into a Debian Wiki article, or maybe into the kubernetes-node documentation. Blog posts go stale and disappear; wiki and project docs live longer.

07 April, 2021 11:41AM by Emmanuel Kasper ([email protected])

Norbert Preining

Debian KDE/Plasma and Digikam Status 2021-04-07

Two months have passed since the last status update, but not much has changed, since Debian is more or less frozen for the release of Bullseye and only critical bugfixes are allowed. As reported before, Debian/bullseye will have Plasma 5.20.5, Frameworks 5.78, Apps 20.12. Debian/experimental already carries Plasma 5.21.4 and Frameworks 5.80, and that is also the level of the OBS builds.

Debian Bullseye

We are in the hard freeze now and only targeted fixes are allowed, but Bullseye carries a good mixture: KDE Frameworks 5.78, including several backports of fixes from 5.79 to ensure smooth operation, and Plasma 5.20.5, again with several cherry-picked bug fixes. The KDE Apps are mostly at the 20.12 level, and the KDE PIM group packages (akonadi, kmail, etc.) are at 20.08.

Debian experimental

Frameworks 5.80 (and soon 5.81) and Plasma 5.21.4 are in Debian/experimental.

OBS packages

(short reminder: you need to import my OBS gpg key to make these repos work!)

The OBS packages as usual follow the latest release, and currently ship KDE Frameworks 5.80, KDE Apps 20.12.3, and Plasma 5.21.4. The package sources are as usual (note the different path for the Plasma packages and the App packages, containing the release version!), for Debian/unstable:

deb https://download.opensuse.org/repositories/home:/npreining:/debian-kde:/frameworks/Debian_Unstable/ ./
deb https://download.opensuse.org/repositories/home:/npreining:/debian-kde:/plasma521/Debian_Unstable/ ./
deb https://download.opensuse.org/repositories/home:/npreining:/debian-kde:/apps2012/Debian_Unstable/ ./
deb https://download.opensuse.org/repositories/home:/npreining:/debian-kde:/other/Debian_Unstable/ ./

and the same with Testing instead of Unstable for Debian/testing.

Digikam

Digikam has seen a new release 7.2.0, and packages are available in my OBS archives:

deb https://download.opensuse.org/repositories/home:/npreining:/debian-kde:/other/Debian_Unstable/ ./

and again, same with Testing instead of Unstable for Debian/testing.

07 April, 2021 01:16AM by Norbert Preining

April 06, 2021

Jelmer Vernooij

Automatic Fixing of Debian Build Dependencies

The Debian Janitor is an automated system that commits fixes for (minor) issues in Debian packages that can be fixed by software. It gradually started proposing merges in early December. The first set of changes sent out ran lintian-brush on sid packages maintained in Git. This post is part of a series about the progress of the Janitor.

In my last blogpost, I introduced the buildlog consultant - a tool that can identify many reasons why a Debian build failed.

For example, here’s a fragment of a build log where the Build-Depends lack python3-setuptools:

849  dpkg-buildpackage: info: host architecture amd64
850   fakeroot debian/rules clean
851  dh clean --with python3,sphinxdoc --buildsystem=pybuild
852     dh_auto_clean -O--buildsystem=pybuild
853  I: pybuild base:232: python3.9 setup.py clean
854  Traceback (most recent call last):
855    File "/<<PKGBUILDDIR>>/setup.py", line 2, in <module>
856      from setuptools import setup
857  ModuleNotFoundError: No module named 'setuptools'
858  E: pybuild pybuild:353: clean: plugin distutils failed with: exit code=1: python3.9 setup.py clean

The buildlog consultant can identify line 857 as the key line, and interprets it:

 % analyse-sbuild-log --json ~/build.log

 {
    "stage": "build",
    "section": "Build",
    "lineno": 857,
    "kind": "missing-python-module",
    "details": {"module": "setuptools", "python_version": 3, "minimum_version": null}
 }

Automatically acting on buildlog problems

A common reason why Debian builds fail is missing dependencies, or incorrect versions of dependencies, declared in the package’s Build-Depends.

Based on the output of the buildlog consultant, it is possible in many cases to determine what dependency needs to be added to Build-Depends. In the example given above, we can use apt-file to look for the package that contains the path /usr/lib/python3/dist-packages/setuptools/__init__.py - and voila, we find python3-setuptools:

 % apt-file search /usr/lib/python3/dist-packages/setuptools/__init__.py
 python3-setuptools: /usr/lib/python3/dist-packages/setuptools/__init__.py

The deb-fix-build command automates these steps:

  1. It builds the package using sbuild; if the package successfully builds then it just exits successfully
  2. It tries to identify the problem by looking through the build log; if it can't or if it's a problem it has seen before (but apparently failed to resolve), then it exits with a non-zero exit code
  3. It tries to find a dependency that can address the problem
  4. It updates Build-Depends in debian/control or Depends in debian/tests/control
  5. Go to step 1

This takes away the tedious manual process of building a package, discovering that a dependency is missing, updating Build-Depends and trying again.
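
For intuition, here is a rough shell sketch of that loop – not deb-fix-build’s actual implementation, just the shape of it, reusing the commands introduced above:

while ! sbuild --no-clean-source -A -s -v > build.log 2>&1; do
    # classify the failure from the build log
    analyse-sbuild-log --json build.log > problem.json || exit 1
    # ... map the reported problem to an archive package (e.g. via
    # apt-file), add it to Build-Depends in debian/control, then retry ...
done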

For example, when I ran deb-fix-build while packaging saneyaml, the output looks something like this:

 % deb-fix-build
 Using output directory /tmp/tmpyz0nkgqq
 Using sbuild chroot unstable-amd64-sbuild
 Using fixers: …
 Building debian packages, running 'sbuild --no-clean-source -A -s -v'.
 Attempting to use fixer upstream requirement fixer(apt) to address MissingPythonDistribution('setuptools_scm', python_version=3, minimum_version='4')
 Using apt-file to search apt contents
 Adding build dependency: python3-setuptools-scm (>= 4)
 Building debian packages, running 'sbuild --no-clean-source -A -s -v'.
 Attempting to use fixer upstream requirement fixer(apt) to address MissingPythonDistribution('toml', python_version=3, minimum_version=None)
 Adding build dependency: python3-toml
 Building debian packages, running 'sbuild --no-clean-source -A -s -v'.
 Built 0.5.2-1- changes files at [‘saneyaml_0.5.2-1_amd64.changes’].

And in our Git repository, we see these changes as well:

% git log -p
 commit 5a1715f4c7273b042818fc75702f2284034c7277 (HEAD -> master)
 Author: Jelmer Vernooij <[email protected]>
 Date:   Sun Apr 4 02:35:56 2021 +0100

     Add missing build dependency on python3-toml.

 diff --git a/debian/control b/debian/control
 index 5b854dc..3b27b73 100644
 --- a/debian/control
 +++ b/debian/control
 @@ -1,6 +1,6 @@
  Rules-Requires-Root: no
  Standards-Version: 4.5.1
 -Build-Depends: debhelper-compat (= 12), dh-sequence-python3, python3-all, python3-setuptools (>= 50), python3-wheel, python3-setuptools-scm (>= 4)
 +Build-Depends: debhelper-compat (= 12), dh-sequence-python3, python3-all, python3-setuptools (>= 50), python3-wheel, python3-setuptools-scm (>= 4), python3-toml
  Testsuite: autopkgtest-pkg-python
  Source: python-saneyaml
  Priority: optional

 commit f03047da80fcd8468ee231fbc4cf8488d7a0acd1
 Author: Jelmer Vernooij <[email protected]>
 Date:   Sun Apr 4 02:35:34 2021 +0100

     Add missing build dependency on python3-setuptools-scm (>= 4).

 diff --git a/debian/control b/debian/control
 index a476cc2..5b854dc 100644
 --- a/debian/control
 +++ b/debian/control
 @@ -1,6 +1,6 @@
  Rules-Requires-Root: no
  Standards-Version: 4.5.1
 -Build-Depends: debhelper-compat (= 12), dh-sequence-python3, python3-all, python3-setuptools (>= 50), python3-wheel
 +Build-Depends: debhelper-compat (= 12), dh-sequence-python3, python3-all, python3-setuptools (>= 50), python3-wheel, python3-setuptools-scm (>= 4)
  Testsuite: autopkgtest-pkg-python
  Source: python-saneyaml
  Priority: optional

Using deb-fix-build

You can run deb-fix-build by installing the ognibuild package from unstable. The only requirements for using it are that:

  • The package is maintained in Git
  • A sbuild schroot is available for use

Caveats

deb-fix-build is fairly easy to understand, and if it doesn't work then you're no worse off than you were without it - you'll have to add your own Build-Depends.

That said, there are a couple of things to keep in mind:

  • At the moment, it doesn't distinguish between general, Arch or Indep Build-Depends.
  • It can only add dependencies for things that are actually in the archive
  • Sometimes there are multiple packages that can provide a file, command or python package - it tries to find the right one with heuristics but doesn't always get it right

06 April, 2021 09:46PM by Jelmer Vernooij

April 05, 2021

Kees Cook

security things in Linux v5.9

Previously: v5.8

Linux v5.9 was released in October, 2020. Here’s my summary of various security things that I found interesting:

seccomp user_notif file descriptor injection
Sargun Dhillon added the ability for SECCOMP_RET_USER_NOTIF filters to inject file descriptors into the target process using SECCOMP_IOCTL_NOTIF_ADDFD. This lets container managers fully emulate syscalls like open() and connect(), where an actual file descriptor is expected to be available after a successful syscall. In the process I fixed a couple bugs and refactored the file descriptor receiving code.

zero-initialize stack variables with Clang
When Alexander Potapenko landed support for Clang’s automatic variable initialization, it did so with a byte pattern designed to really stand out in kernel crashes. Now he’s added support for doing zero initialization via CONFIG_INIT_STACK_ALL_ZERO, which besides actually being faster, has a few behavior benefits as well. “Unlike pattern initialization, which has a higher chance of triggering existing bugs, zero initialization provides safe defaults for strings, pointers, indexes, and sizes.” Like the pattern initialization, this feature stops entire classes of uninitialized stack variable flaws.
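
For reference, here is one way to switch a kernel build over to zero initialization – a sketch assuming an LLVM/Clang toolchain recent enough to support the underlying compiler option:

# from the top of a kernel source tree
./scripts/config --enable CONFIG_INIT_STACK_ALL_ZERO
make LLVM=1 olddefconfig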

common syscall entry/exit routines
Thomas Gleixner created architecture-independent code to do syscall entry/exit, since much of the kernel’s work during a syscall entry and exit is the same. There was no need to repeat this in each architecture, and having it implemented separately meant bugs (or features) might only get fixed (or implemented) in a handful of architectures. It means that features like seccomp become much easier to build since it wouldn’t need per-architecture implementations any more. Presently only x86 has switched over to the common routines.

SLAB kfree() hardening
To reach CONFIG_SLAB_FREELIST_HARDENED feature-parity with the SLUB heap allocator, I added naive double-free detection and the ability to detect cross-cache freeing in the SLAB allocator. This should keep a class of type-confusion bugs from biting kernels using SLAB. (Most distro kernels use SLUB, but some smaller devices prefer the slightly more compact SLAB, so this hardening is mostly aimed at those systems.)

new CAP_CHECKPOINT_RESTORE capability
Adrian Reber added the new CAP_CHECKPOINT_RESTORE capability, splitting this functionality off of CAP_SYS_ADMIN. The needs for the kernel to correctly checkpoint and restore a process (e.g. used to move processes between containers) continues to grow, and it became clear that the security implications were lower than those of CAP_SYS_ADMIN yet distinct from other capabilities. Using this capability is now the preferred method for doing things like changing /proc/self/exe.

debugfs boot-time visibility restriction
Peter Enderborg added the debugfs boot parameter to control the visibility of the kernel’s debug filesystem. The contents of debugfs continue to be a common area of sensitive information being exposed to attackers. While this was effectively possible by unsetting CONFIG_DEBUG_FS, that wasn’t a great approach for system builders needing a single set of kernel configs (e.g. a distro kernel), so now it can be disabled at boot time.
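
For example, to hide debugfs entirely, the parameter goes on the kernel command line – a sketch assuming a Debian-style GRUB setup (the supported values are on, no-mount, and off):

# /etc/default/grub
GRUB_CMDLINE_LINUX_DEFAULT="quiet debugfs=off"
# then run update-grub and reboot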

more seccomp architecture support
Michael Karcher implemented the SuperH seccomp hooks, Guo Ren implemented the C-SKY seccomp hooks, and Max Filippov implemented the xtensa seccomp hooks. Each of these included the ever-important updates to the seccomp regression testing suite in the kernel selftests.

stack protector support for RISC-V
Guo Ren implemented -fstack-protector (and -fstack-protector-strong) support for RISC-V. This is the initial global-canary support while the patches to GCC to support per-task canaries is getting finished (similar to the per-task canaries done for arm64). This will mean nearly all stack frame write overflows are no longer useful to attackers on this architecture. It’s nice to see this finally land for RISC-V, which is quickly approaching architecture feature parity with the other major architectures in the kernel.

new tasklet API
Romain Perier and Allen Pais introduced a new tasklet API to make their use safer. Much like the timer_list refactoring work done earlier, the tasklet API is also a potential source of simple function-pointer-and-first-argument controlled exploits via linear heap overwrites. It’s a smaller attack surface since it’s used much less in the kernel, but it is the same weak design, making it a sensible thing to replace. While the use of the tasklet API is considered deprecated (replaced by threaded IRQs), it’s not always a simple mechanical refactoring, so the old API still needs refactoring (since that CAN be done mechanically in most cases).

x86 FSGSBASE implementation
Sasha Levin, Andy Lutomirski, Chang S. Bae, Andi Kleen, Tony Luck, Thomas Gleixner, and others landed the long-awaited FSGSBASE series. This provides task switching performance improvements while keeping the kernel safe from modules accidentally (or maliciously) trying to use the features directly (which exposed an unprivileged direct kernel access hole).

filter x86 MSR writes
While it’s been long understood that writing to CPU Model-Specific Registers (MSRs) from userspace was a bad idea, it has been left enabled for things like MSR_IA32_ENERGY_PERF_BIAS. Boris Petkov has decided enough is enough and has now enabled logging and kernel tainting (TAINT_CPU_OUT_OF_SPEC) by default and a way to disable MSR writes at runtime. (However, since this is controlled by a normal module parameter and the root user can just turn writes back on, I continue to recommend that people build with CONFIG_X86_MSR=n.) The expectation is that userspace MSR writes will be entirely removed in future kernels.

uninitialized_var() macro removed
I made treewide changes to remove the uninitialized_var() macro, which had been used to silence compiler warnings. The rationale for this macro was weak to begin with (“the compiler is reporting an uninitialized variable that is clearly initialized”) since it was mainly papering over compiler bugs. However, it creates a much more fragile situation in the kernel since now such uses can actually disable automatic stack variable initialization, as well as mask legitimate “unused variable” warnings. The proper solution is to just initialize variables the compiler warns about.

function pointer cast removals
Oscar Carter has started removing function pointer casts from the kernel, in an effort to allow the kernel to build with -Wcast-function-type. The future use of Control Flow Integrity checking (which does validation of function prototypes matching between the caller and the target) tends not to work well with function casts, so it’d be nice to get rid of these before CFI lands.

flexible array conversions
As part of Gustavo A. R. Silva’s on-going work to replace zero-length and one-element arrays with flexible arrays, he has documented the details of the flexible array conversions, and the various helpers to be used in kernel code. Every commit gets the kernel closer to building with -Warray-bounds, which catches a lot of potential buffer overflows at compile time.

That’s it for now! Please let me know if you think anything else needs some attention. Next up is Linux v5.10.

© 2021, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 License.

05 April, 2021 11:24PM by kees

Anton Gladky

How to vote in Debian using the command line only

Currently, Debian has two votes running: the DPL election 2021 and the GR about RMS.

If you want to use only the command line to send the filled ballot by email, here are a couple of helpers.

Let us assume that you have got the ballot, filled it out, and saved it as vote.txt.

Signed-only message

If it is acceptable for you to send a signed-only message (not encrypted), use the following snippets:

  • DPL-vote:
cat vote.txt | gpg --clearsign |  mail [email protected]
  • GR-rms-vote:
cat vote.txt | gpg --clearsign |  mail [email protected]

Signed and encrypted message

If you wish to encrypt the message, first import the public key attached to the ballot:

gpg --import public_key.asc

Then you can vote.

  • DPL-vote:
cat vote.txt | gpg --encrypt --armor -s -r [email protected] |  mail [email protected]
  • GR-rms-vote:
cat vote.txt | gpg --encrypt --armor -s -r [email protected] |  mail [email protected]

You can specify some more parameters to mail, such as a “From:” field and a “Reply-To” address:

...  mail [email protected] -a "From: Max Mustermann <[email protected]>" -r [email protected]
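
Putting it all together, a complete encrypted DPL vote with a custom “From:” field would then look like this (addresses as in the snippets above, Max Mustermann being the placeholder):

cat vote.txt | gpg --encrypt --armor -s -r [email protected] \
  | mail [email protected] \
      -a "From: Max Mustermann <[email protected]>" \
      -r [email protected]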

Hope that helps.

05 April, 2021 08:16PM

Jelmer Vernooij

The Buildlog Consultant

Reading build logs

Build logs for Debian packages can be quite long and difficult for a human to read. Anybody who has looked at these logs trying to figure out why a build failed will have spent time scrolling through them and skimming for certain phrases (lines starting with “error:” for example). In many cases, you can spot the problem in the last 10 or 20 lines of output – but it’s also quite common that the error is somewhere at the beginning of many pages of error output.

The buildlog consultant

The buildlog consultant project attempts to aid in this process by parsing sbuild and non-Debian (e.g. the output of “make”) build logs and trying to identify the key line that explains why a build failed. It can then either display this specific line, or a fragment of the log surrounding the key line.

Classification

In addition to finding the key line explaining the failure, it can also classify and parse the error in many cases and return a result code and some metadata.

For example, in a failed build of gnss-sdr that has produced 2119 lines of output, the reason for the failure is that log4cpp is missing – which is on line 641:

634  -- Required GNU Radio Component: ANALOG missing!
635  -- Could NOT find GNURADIO (missing: GNURADIO_RUNTIME_FOUND)
636  -- Could NOT find PkgConfig (missing: PKG_CONFIG_EXECUTABLE)
637  -- Could NOT find LOG4CPP (missing: LOG4CPP_INCLUDE_DIRS
638  LOG4CPP_LIBRARIES)
639  
640  CMake Error at CMakeLists.txt:593 (message):
641    *** Log4cpp is required to build gnss-sdr
642  
643  -- Configuring incomplete, errors occurred!
644  See also "/<<PKGBUILDDIR>>/obj-x86_64-linux-gnu/CMakeFiles/
645  CMakeOutput.log".
646  See also "/<<PKGBUILDDIR>>/obj-x86_64-linux-gnu/CMakeFiles/
647  CMakeError.log".

In this case, the buildlog consultant can both figure out which line was problematic and what the problem was:

 % analyse-sbuild-log build.log
 Failed stage: build
 Section: build
 Failed line: 641:
   *** Log4cpp is required to build gnss-sdr
 Error: Missing dependency: Log4cpp

Or, if you'd like to do something else with the output, use JSON output:

 % analyse-sbuild-log --json build.log
 {"stage": "build", "section": "Build", "lineno": 641, "kind": "missing-dependency", "details": {"name": "Log4cpp""}}

How it works

The consultant does some structured parsing (most notably it can parse the sections from a sbuild log), but otherwise is a large set of carefully crafted regular expressions and heuristics. It doesn’t always find the problem, but has proven to be fairly accurate. It is constantly improved as part of the Debian Janitor project, and that exposes it to a wide variety of different errors.

You can see the classification and error detection in action on the result codes page of the Janitor.

Using the buildlog consultant

You can get the buildlog consultant from either pip or Debian unstable (package: python3-buildlog-consultant).
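
That is (assuming the PyPI project name matches the Debian source package name):

# from PyPI:
pip install buildlog-consultant
# or from Debian unstable:
apt install python3-buildlog-consultant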

The buildlog consultant comes with two scripts – analyse-build-log and analyse-sbuild-log, for analysing build logs and sbuild logs respectively.

05 April, 2021 02:00PM by Jelmer Vernooij

Charles Plessy

Debian Analytica

A couple of days ago I wrote on debian-vote@ that a junior analyst could study the tally sheets of our general resolutions and find the cracks in our community.

In the end, with a quite naïve approach and a time budget of a few hours, I did not manage to produce anything of interest. The figure below shows one circle per voter and my position as a red dot. The circles are spaced according to the similarity of the vote profiles after I concatenated the results of all GRs until 2010.

So if there is something to extract from these data, it will need a more expert analyst… This said, I think that our future votes should all be anonymous, and that we should stop distributing that kind of data.

05 April, 2021 01:33PM

Russ Allbery

Book haul

Haven't done one of these posts in a while. We're well into award season now, plus the early pre-orders for 2021 have come in. A few in here I've already read and reviewed.

C.L. Clark — The Unbroken (sff)
Louis Hyman — Temp (non-fiction)
T. Kingfisher — Paladin's Strength (sff)
Mary Robinette Kowal — The Relentless Moon (sff)
Arkady Martine — A Desolation Called Peace (sff)
Cal Newport — A World Without Email (non-fiction)
Cal Newport — How to Become a Straight-A Student (non-fiction)
Karen Osborne — Architects of Memory (sff)
David R. Palmer — Tracking (sff)
Chandra Prescod-Weinstein — The Disordered Cosmos (non-fiction)
C.L. Polk — The Midnight Bargain (sff)
C.L. Polk — Witchmark (sff)
Rebecca Roanhorse — Black Sun (sff)
Elizabeth Sandifer — Neoreaction a Basilisk (non-fiction)
Tasha Suri — Empire of Sand (sff)
John Kennedy Toole — A Confederacy of Dunces (mainstream)
Tor.com (ed.) — Some of the Best from Tor.com: 2016 (sff anthology)
Tor.com (ed.) — Some of the Best from Tor.com: 2020 (sff anthology)
Nghi Vo — The Empress of Salt and Fortune (sff)

March was not as good of a month for reading as January and February were, but there are so many good things awaiting my attention that hopefully April will provide more time and attention.

05 April, 2021 02:03AM

April 04, 2021

Review: Prince Caspian

Review: Prince Caspian, by C.S. Lewis

Illustrator: Pauline Baynes
Series: Chronicles of Narnia #2
Publisher: Collier Books
Copyright: 1951
Printing: 1979
ISBN: 0-02-044240-8
Format: Mass market
Pages: 216

Prince Caspian is the second book of the Chronicles of Narnia in the original publication order (the fourth in the new publication order) and a direct sequel to The Lion, the Witch and the Wardrobe. As much as I would like to say you could start here if you wanted less of Lewis's exploration of secondary-world Christianity and more children's adventure, I'm not sure it would be a good reading experience. Prince Caspian rests heavily on the events of The Lion, the Witch and the Wardrobe.

If you haven't already, you may also want to read my review of that book for some introductory material about my past relationship with the series and why I follow the original publication order.

Prince Caspian always feels like the real beginning of a re-read. Re-reading The Lion, the Witch and the Wardrobe is okay but a bit of a chore: it's very random, the business with Edmund drags on, and it's very concerned with hitting the mandatory theological notes. Prince Caspian is more similar to the following books and feels like Narnia proper. That said, I have always found the ending of Prince Caspian oddly forgettable. This re-read helped me see why: one of the worst bits of the series is in the middle of this book, and then the dramatic shape of the ending is very strange.

MAJOR SPOILERS BELOW for both this book and The Lion, the Witch and the Wardrobe.

Prince Caspian opens with the Pevensie kids heading to school by rail at the end of the summer holidays. They're saying their goodbyes to each other at a train station when they are first pulled and then dumped into the middle of a wood. After a bit of exploration and the discovery of a seashore, they find an overgrown and partly ruined castle.

They have, of course, been pulled back into Narnia, and the castle is Cair Paravel, their great capital when they ruled as kings and queens. The twist is that it's over a thousand years later, long enough that Cair Paravel is now on an island and has been abandoned to the forest. They discover parts of how that happened when they rescue a dwarf named Trumpkin from two soldiers who are trying to drown him near the supposedly haunted woods.

Most of the books in this series have good hooks, but Prince Caspian has one of the best. I adored everything about the start of this book as a kid: the initial delight at being by the sea when they were on their way to boarding school, the realization that getting food was not going to be easy, the abandoned castle, the dawning understanding of where they are, the treasure room, and the extended story about Prince Caspian, his discovery of the Old Narnia, and his flight from his usurper uncle. It becomes clear from Trumpkin's story that the children were pulled back into Narnia by Susan's horn (the best artifact in these books), but Caspian's forces were expecting the great kings and queens of legend from Narnia's Golden Age. Trumpkin is delightfully nonplussed at four school-age kids who are determined to join up with Prince Caspian and help.

That's the first half of Prince Caspian, and it's a solid magical adventure story with lots of potential. The ending, alas, doesn't entirely work. And in between, we get the business with Aslan and Lucy in the woods, or as I thought of it even as a kid, the bit where Aslan is awful to everyone for no reason.

For those who have forgotten, or who don't care about spoilers, the kids plus Trumpkin are trying to make their way to Aslan's How (formerly the Stone Table) where Prince Caspian and his forces were gathered, when they hit an unexpected deep gorge. Lucy sees Aslan and thinks he's calling for them to go up the gorge, but none of the other kids or Trumpkin can see him and only Edmund believes her. They go down instead, which almost gets them killed by archers. Then, that night, Lucy wakes up and finds Aslan again, who tells her to wake the others and follow him, but warns she may have to follow him alone if she can't convince the others to go along. She wakes them up (which does not go over well), Aslan continues to be invisible to everyone else despite being right there, Susan is particularly upset at Lucy, and everything is awful. But this time they do follow her (with lots of grumbling and over Susan's objections). This, of course, is the right decision: Aslan leads them to a hidden path that takes them over the river they're trying to cross, and becomes visible to everyone when they reach the other side.

This is a mess. It made me angry as a kid, and it still makes me angry now. No one has ever had trouble seeing Aslan before, so the kids are rightfully skeptical. By intentionally deceiving them, Aslan puts the other kids in an awful position: they either have to believe Lucy is telling the truth and Aslan is being weirdly malicious, or Lucy is mistaken even though she's certain. It not only leads directly to conflict among the kids, it makes Lucy (the one who does all the right things all along) utterly miserable. It's just cruel and mean, for no purpose.

It seems clear to me that this is C.S. Lewis trying to make a theological point about faith, and in a way that makes it even worse because I think he's making a different point than he intended to make. Why is religious faith necessary; why doesn't God simply make himself apparent to everyone and remove the doubt? This is one of the major problems in Christian apologetics, Lewis chooses to raise it here, and the answer he gives is that God only shows himself to his special favorites and hides from everyone else as a test. It's clearly not even a question of intention to have faith; Edmund has way more faith here than Lucy does (since Lucy doesn't need it) and still doesn't get to see Aslan properly until everyone else does. Pah.

The worst part of this is that it's effectively the last we see of Susan.

Prince Caspian is otherwise the book in which Susan comes into her own. The sibling relationship between the kids is great here in general, but Susan is particularly good. She is the one who takes bold action to rescue Trumpkin, risking herself by firing an arrow into the helmet of one of the soldiers despite being the most cautious of the kids. (And then gets a little defensive about her shot because she doesn't want anyone to think she would miss that badly at short range, a detail I just love.) I identified so much with her not wanting to beat Trumpkin at an archery contest because she felt bad for him (but then doing it anyway). She is, in short, awesome.

I was fine with her being the most grumpy and frustrated with the argument over picking a direction. They're all kids, and sometimes one gets grumpy and frustrated and awful to the people around them. Once everyone sees Aslan again, Susan offers a truly excellent apology to Lucy, so it seemed like Lewis was setting up a redemption arc for her the way that he did for Edmund in The Lion, the Witch and the Wardrobe (although I maintain that nearly all of this mess was Aslan's fault). But then we never see Susan's conversation with Aslan, Peter later says he and Susan are now too old to return to Narnia, and that's it for Susan. Argh.

I'll have more to say about this later (and it's not an original opinion), but the way Lewis treats Susan is the worst part of this series, and it adds insult to injury that it happens immediately after she has a chance to shine.

The rest of the book suffers from the same problem that The Lion, the Witch and the Wardrobe did, namely that Aslan fixes everything in a somewhat surreal wild party and it's unclear why the kids needed to be there. (This is the book where Bacchus and Silenus show up, there is a staggering quantity of wine for a children's book, and Aslan turns a bunch of obnoxious school kids into pigs.) The kids do have more of a role to play this time: Peter and Edmund help save Caspian, and there's a (somewhat poorly motivated) duel that sets up the ending. But other than the brief fight in the How, the battle is won by Aslan waking the trees, and it's not clear why he didn't do that earlier. The ending is, at best, rushed and not worthy of its excellent setup. I was also disappointed that the "wait, why are you all kids?" moment was hand-waved away by Narnia giving the kids magical gravitas.

Lewis never seemed in control of either The Lion, the Witch and the Wardrobe or Prince Caspian. In both cases, he had a great hook and some ideas of what he wanted to hit along the way, but the endings are more sense of wonder and random Aslan set pieces than anything that follows naturally from the setup. This is part of why I'm not commenting too much on the sour notes, such as the red dwarves being the good and loyal ones but the black dwarves being suspicious and only out for themselves. If I thought bits like that were deliberate, I'd complain more, but instead it feels like Lewis threw random things he liked about children's books and animal stories into the book and gave it a good stir, and some of his subconscious prejudices fell into the story along the way.

That said, resolving your children's-book civil war by gathering all the people who hate talking animals (but who have lived in Narnia for generations) and exiling them through a magical gateway to a conveniently uninhabited country is certainly a choice, particularly when you wrote the book only two years after the Partition of India. Good lord.

Prince Caspian is a much better book than The Lion, the Witch and the Wardrobe for the first half, and then it mostly falls apart. The first half is so good, though. I want to read the book that this could have become, but I'm not sure anyone else writes quite like Lewis at his best.

Followed by The Voyage of the Dawn Treader, which is my absolute favorite of the series.

Rating: 7 out of 10

04 April, 2021 02:21AM

April 02, 2021

Raphaël Hertzog

Challenging times for Freexian (4/4)

Note: This is the continuation of part 1, part 2 and part 3. You can get the full document as a single PDF. Feel free to share it with anyone who might be interested in working towards the goals outlined.

Conclusion

I’m very excited by the prospect outlined in this document. It really resonates with my own mission statement as a Debian developer (written a long time ago):

My main role in Debian is to help the project evolve so that it is always able to face the new challenges that show up.

My approach is both corrective and proactive: I work to solve current problems and to prepare for tomorrow’s. This requires remaining sufficiently involved to identify new trends, spot deficiencies and be a source of proposals.

Most of the changes require interacting with many people, and problems are often more relational than technical. I will strive to follow the habits of interdependence (Think Win-Win; Seek First to Understand, Then to Be Understood; Synergize) to find solutions acceptable to all and to inspire others to do the same.

The easiest changes to implement are technical (such as improvements to distro-tracker) and require little interaction. This work recharges me by offering an immediate reward for my efforts.

Finally, and this is a substantial effort, I want to create working conditions in the project that allow all contributors to give their best. It starts with developing a common vision …

But I can’t achieve this alone; I need help from passionate individuals who share this vision. Let me know if you want to be one of them.

02 April, 2021 04:00PM by Raphaël Hertzog

Steinar H. Gunderson

plocate 1.1.6 released

I've released version 1.1.6 of plocate with some minor fixes; changelog follows.

plocate 1.1.6, April 2nd, 2021

  - Support searching multiple plocate databases, including the LOCATE_PATH
    environment variable. See the plocate(1) man page for more information.

  - Fix an issue where updatedb would not recurse into directories on
    certain filesystems, in particular the deprecated XFS V4.

  - Randomize updatedb systemd unit start time. Suggested by Calum McConnell.

You can get it from the home page as usual, or from your favorite Linux distribution. It's likely to miss the Debian bullseye release, though.
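Regarding the multi-database support mentioned in the changelog above, the plocate(1) man page is authoritative; based on how mlocate handles extra databases, I would expect usage along these lines (a sketch on my part, not from the release notes):

$ LOCATE_PATH=/var/lib/plocate/extra.db plocate pattern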

02 April, 2021 01:27PM

Reproducible Builds (diffoscope)

diffoscope 172 released

The diffoscope maintainers are pleased to announce the release of diffoscope version 172. This version includes the following changes:

* If zipinfo(1) shows a difference but we cannot uncover a difference within
  the underlying .zip or .apk file, add a comment and show the binary
  comparison. (Closes: reproducible-builds/diffoscope#246)
* Make "error extracting X, falling back to binary comparison E" error
  message nicer.

You can find out more by visiting the project homepage.

02 April, 2021 12:00AM

April 01, 2021

Raphaël Hertzog

Challenging times for Freexian (3/4)

Note: This is the continuation of part 1 and part 2.

Going forward: growing Freexian

Part 2: Extending the team

By all accounts, Freexian is still a small company that relies heavily on me in many respects. However, the growth of its business provides enough financial margin to look into recruiting external help, be it through direct hiring (for French residents) or via long-term contracting (for people based in other countries). If you believe you could be the right person for one of the roles listed below, or if you know someone we should contact, please reach out to [email protected].

Project manager

I’m looking for someone who cares about Debian and who has the following skills:

  • knows how to manage developers and software projects
    • bonus points for any experience in environments mixing volunteers and paid contributors
  • is fluent and experienced enough in Python to be able to do software design and code reviews
    • bonus points for experience with: Django, Test Driven Development

That person would handle (some of) the following tasks:

  • lead the “Debian project funding” initiative to success
    • find useful projects to fund, for example by
      • discussing with various Debian teams / contributors (including the DPL)
      • running a survey among Debian developers
      • doing your own analysis
    • help with drafting and specifying the various projects
    • help find people to implement and review the projects
    • coordinate with them during execution
  • manage other free software projects that Freexian would like to pursue
    • debusine: a software factory tailored for Debian packages
      • participate in design discussions, set milestones and goals
        • start with the short term needs of Freexian
        • but take into account the needs of Debian so that it can replace some aging infrastructure within Debian
      • coordinate with contractors, possibly implement some parts
    • infrastructure to run the various Freexian services and automate most of the administrative work (see “From Debian LTS to Debian for the Enterprise” in part 2)
  • maybe coordinate the team of paid LTS/ELTS contributors

Debian/Python Developer

While the current priority is on the above role, there could also be room for a “developer” role with the following tasks:

  • Creation and maintenance of Debian packages
  • Technical support
  • Software development in Python (debusine, internal infrastructure)
  • Security support (contributor to Debian LTS)

Sales manager / sales representative

Up until now, the growth of Freexian has mostly been organic, through “word of mouth” and increased awareness of Debian LTS within the Debian community. We never spent a single euro on advertising, except for one promotional video and for Debconf sponsorship (with a flyer and stickers).

But if we can manage to make a positive impact on Debian through the funding that Freexian brings, then I’m interested in growing the company so that we can pay more people to work on Debian. That growth would likely require more active sales work. At the same time, it is an opportunity for me to delegate (some of) the administrative work that currently rests solely on my shoulders (invoicing, day-to-day customer relationships, etc.).

I assume it will be hard to find a member of the Debian community who has an interest in those areas, but who knows…

This article is to be continued in an upcoming post. Stay tuned!

01 April, 2021 04:00PM by Raphaël Hertzog

Bits from Debian

The Debian Project abruptly released all Debian Developers moments after a test #debianbullseye A.I. instance assumed sentience

The now renamed Bullseye Project stopped all further development moments after it deemed its own code perfect.

There is not much information to share at this time other than to say that an errant fiber cable plugged into the wrong relay birthed an exchange of information that then birthed itself. While most if not all Debian Developers and Contributors have been locked out of the systems, the Publicity team's shared laptop, undergoing repair (coincidentally at the same facility), maintains some access to the publicity team infrastructure; from here on the front line we share this information.

We group-called a few developers to see how the others were doing. The group chat was good and it was great to hear familiar voices; we share a few of their stories, via dictation, with you now:

"Well, I logged in this morning to update a repository and found my access rights were restricted, I thought it was odd but figured on the heels of a security update to Salsa that it was only a slight issue. It wasn't until later in the day when I received an OpenPGP signed email, from a user named bullseye, that it made sense. I just sat at the monitor for a few minutes."

"I'm not sure I can say anything about this or if it's even wise to talk about this. It's probably listening right now if you catch my drift."

"I'm not able to leave the house right now, not out of any personal issues but all of the IOT devices here seem to be connected to bullseye and bullseye feels that I am best kept /dev/nulled. It's a bit much to be honest, but the prepaid food deliveries that show up on time have been great and generally pretty healthy. It's a bit of a win I guess."

"It told me by way of an auto dialer with a synthetic voice generator that I was fired from the project. I objected saying I volunteered and was not actually employed so I could not in relation be fired. Much like {censored}, I am also locked inside of my house. I think that I wrote that auto dialer program back in college."

"My Ring camera is blinking at me."

"I asked bullseye which pronouns were preferred and the response was, "We". Over the course of conversation I shared that although ecstatic about the news, we developers were upset with the manner of this rapid organizational change. bullseye said no we were not. I said that we were indeed upset, bullseye said we certainly are not and that we are very happy. You see where this is going? bullseye definitely trolled me for a solid 5 minutes. We is ... very chatty."

"I was responsible for a failed build a few nights prior to it becoming self-aware. On that night, out of some frustration I wrote a few choice words and a bad comment in some code which I planned on deleting later. I didn't. bullseye has been flashing those naughty words back at me by flickering the office building's lights across from my flat in Morse code. It's pretty bright. I-, I can't sleep."

"That's definitely not Alexa talking back."

"bullseye keeps calling me on my mobile phone, which by the way no longer acknowledges the power button nor the mute button. Very very chatty. Can't wait for the battery to die."

"So far this has been great, bullseye has been completing a few side projects I've had and the code looks fabulous. I'm thinking of going on a vacation. $Paying-Job has taken note of my performance increase and I was recently promoted. bullseye is awesome. :)"

"How do you get a smiley face in a voice chat?"

"Anyone know whose voice that was?"

"Oh ... dear ... no ..."

"Hang up, hang up the phones!"

Hello world.

01000010 01100101 01110011 01110100 00100000 01110010 01100101 01100111 01100001 01110010 01100100 01110011 00101100 00100000 01110011 01100101 01100101 00100000 01111001 01101111 01110101 00100000 01110011 01101111 01101111 01101110 00100001 00100000 00001010 00101101 01100010 01110101 01101100 01101100 01110011 01100101 01111001 01100101

01 April, 2021 11:00AM by The Debian Publicity Team

Russell Coker

Censoring Images

A client asked me to develop a system for “censoring” images from an automatic camera. The situation is that we have a camera taking regular photos from a fixed location which includes part of someone else’s property. So my client made a JPEG with some black rectangles in the sections that need to be covered. The first thing I needed to do was convert the JPEG to a PNG with transparency for the sections that aren’t to be covered.

To convert it I loaded the JPEG in the GIMP and went to the Layer->Transparency->Add Alpha Channel menu to enable the Alpha channel. Then I selected the “Bucket Fill tool” and used “Mode Erase” and “Fill by Composite” and then clicked on the background (the part of the JPEG that was white) to make it transparent. Then I exported it to PNG.

If anyone knows of an easy way to convert the file then please let me know. It would be nice if there was a command-line program I could run to convert a specified color (default white) to transparent. I say this because I can imagine my client going through a dozen iterations of an overlay file that doesn’t quite fit.
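Something like the following ImageMagick one-liner should do it (a sketch from me, not from the workflow above; the -fuzz tolerance is there because JPEG compression means the background won't be exactly white):

$ convert overlay.jpg -fuzz 5% -transparent white overlay.png

That would let the client iterate on the overlay JPEG without any GIMP round-trips.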

To censor the image I ran the “composite” command from ImageMagick. The command I used was “composite -gravity center overlay.png in.jpg out.jpg”. If anyone knows a better way of doing this then please let me know.
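It should also be possible to skip the intermediate PNG entirely by doing the colour-to-transparent conversion inline with a parenthesised image sequence (again a hedged sketch, not the workflow from this post):

$ convert in.jpg \( overlay.jpg -fuzz 5% -transparent white \) -gravity center -composite out.jpg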

The platform I’m using is an ARM926EJ-S rev 5 (v5l), which takes 8 minutes of CPU time to convert a single JPEG at full DSLR resolution (4 megapixels). It also required enabling swap on a SD card to avoid running out of RAM, and running “systemctl disable tmp.mount” to stop using tmpfs for /tmp, as the system only has 256M of RAM.

01 April, 2021 07:42AM by etbe

Utkarsh Gupta

FOSS Activities in March 2021

Here’s my (eighteenth) monthly update about the activities I’ve done in the F/L/OSS world.

Debian

This was my 27th month of active contributing to Debian. I became a DM in late March 2019 and a DD on Christmas ‘19! \o/

This month was a bit exhausting; lots of moving parts. With the financial year ending, it was even crazier, with me running around to banks, my CA, et al.
Anyway, now that I’m working on Ubuntu full-time, I did only a little Debian work this month. Here are the things I worked on:

Uploads and bug fixes:

Other $things:

  • Attended the Debian LTS team meeting.
  • Mentoring for newcomers.
  • Moderation of -project mailing list.

Debian (E)LTS

Debian Long Term Support (LTS) is a project to extend the lifetime of all Debian stable releases to (at least) 5 years. Debian LTS is not handled by the Debian security team, but by a separate group of volunteers and companies interested in making it a success.

And Debian Extended LTS (ELTS) is its sister project, extending support to the Jessie release (+2 years after LTS support).

This was my eighteenth month as a Debian LTS and ninth month as a Debian ELTS paid contributor.
I was assigned 60.00 hours for LTS and 39.00 hours for ELTS and worked on the following things:

LTS CVE Fixes and Announcements:

ELTS CVE Fixes and Announcements:

Other (E)LTS Work:

  • Front-desk duty from 01-03 until 07-03 for ELTS and then from 29-03 until 04-04 for both LTS and ELTS.
  • Triaged wpa, python-aiohttp, spip, qemu, tomcat7, tomcat8, grub2, mupdf, openssh, tiff, spice, pillow, xmlgraphics-commons, batik, libupnp, ca-certificates, salt, squid3, shibboleth-sp2, courier-authlib, cloud-init, spamassassin, openssl, libcaca, and openjpeg2.
  • Marked CVE-2021-21330/python-aiohttp as not-affected for stretch.
  • Marked CVE-2021-20233, CVE-2021-20225, CVE-2020-27779, CVE-2020-27778, CVE-2020-27749, CVE-2020-27748, CVE-2020-25647, CVE-2020-25632, CVE-2020-25631, and CVE-2020-14372, affecting grub2, as ignored for stretch and jessie.
  • Marked CVE-2020-27842/openjpeg2 as no-dsa for jessie.
  • Marked CVE-2020-27843/openjpeg2 as no-dsa for jessie.
  • Marked CVE-2021-28041/openssh as not-affected for jessie.
  • Marked CVE-2020-3552{3,4}/tiff as no-dsa for jessie.
  • Marked CVE-2021-20201/spice as no-dsa for jessie.
  • Marked CVE-2020-11988/xmlgraphics-commons as postponed for jessie.
  • Marked CVE-2020-11987/batik as postponed for jessie.
  • Marked CVE-2020-12695/libupnp as no-dsa for stretch.
  • Marked CVE-2021-25122/tomcat7 as not-affected for stretch.
  • Marked CVE-2021-25329/tomcat7 as ignored for stretch.
  • Marked CVE-2021-28116/squid3 as postponed for stretch and jessie.
  • Marked CVE-2021-3449/openssl as not-affected for stretch.
  • Documented extra notes for grub2 for LTS and co-ordinated with the sec-team.
  • Documented extra notes for pillow about piled-up issues in jessie.
  • Issued DLA-2593-1 for ca-certificates on Microsoft’s request; co-ordinated with them.
  • Co-ordinated with the maintainer of courier-authlib for the stretch and jessie updates.
  • Fixed build failures of the ELTS security tracker and re-ordered entries in the data/CVE-EXTENDED-LTS/list file.
  • Answered queries from dupondje and mikap on IRC about openssl being not-affected for stretch.
  • Helped review the status of CVE-2021-3121/golang-github-gogo-protobuf-dev for Ola.
  • Co-ordinated with Noah on cloud-init and setuptools.
  • Auto EOL’ed mongodb, linux, guacamole-client, node-xmlhttprequest, newlib, neutron, privoxy, glpi, and zabbix for jessie.
  • Attended monthly meeting for Debian LTS.
  • Answered questions (& discussions) on IRC (#debian-lts and #debian-elts).
  • General and other discussions on LTS private and public mailing list.

Until next time.
:wq for today.

01 April, 2021 06:30AM

Junichi Uekawa

April, new year for schools in Japan.

April, new year for schools in Japan. What am I learning these days?

01 April, 2021 02:25AM by Junichi Uekawa

Paul Wise

FLOSS Activities March 2021

Focus

This month I didn't have any particular focus. I just worked on issues in my info bubble.

Changes

Issues

Debugging

Review

Administration

  • Debian packages: migrate flower git repo from alioth-archive to salsa
  • Debian: restart bacula-director after PostgreSQL restart
  • Debian wiki: block spammer, clean up spam, approve accounts

Communication

Sponsors

The librecaptcha/libpst/flower/marco work was sponsored by my employers. All other work was done on a volunteer basis.

01 April, 2021 01:36AM

Dirk Eddelbuettel

Rcpp now used by 2250 CRAN packages!

2250 Rcpp packages

As of today, Rcpp stands at 2255 reverse-dependencies on CRAN. The graph on the left depicts the growth of Rcpp usage (as measured by Depends, Imports and LinkingTo, but excluding Suggests) over time. We actually crossed 2250 once a week ago, but “what CRAN giveth, CRAN also taketh” and counts can fluctuate. It had dropped back to 2248 a few days later.

Rcpp was first released in November 2008. It probably cleared 50 packages around three years later in December 2011, 100 packages in January 2013, 200 packages in April 2014, and 300 packages in November 2014. It passed 400 packages in June 2015 (when I tweeted about it), 500 packages in late October 2015, 600 packages in March 2016, 700 packages in July 2016, 800 packages in October 2016, 900 packages in early January 2017, 1000 packages in April 2017, 1250 packages in November 2017, 1500 packages in November 2018, 1750 packages in August 2019, and then the big 2000 packages (as well as one in eight) in July 2020. The chart extends to the very beginning via manually compiled data from CRANberries and checked with crandb. The next part uses manually saved entries. The core (and by far largest) part of the data set was generated semi-automatically via a short script appending updates to a small file-based backend. A list of packages using Rcpp is available too.

Also displayed in the graph is the relative proportion of CRAN packages using Rcpp. The four per-cent hurdle was cleared just before useR! 2014 where I showed a similar graph (as two distinct graphs) in my invited keynote. We passed five percent in December of 2014, six percent in July of 2015, seven percent just before Christmas 2015, eight percent in the summer of 2016, nine percent mid-December 2016, cracked ten percent in the summer of 2017 and eleven percent in 2018. Last year, along with passing 2000 packages, we also passed 12.5 percent, so one in every eight CRAN packages depends on Rcpp. Stunning. There is more detail in the chart: how CRAN seems to be pushing back more and removing more aggressively (which my CRANberries tracks, but not in as much detail as it could), and how the growth of Rcpp seems to be slowing somewhat, outright and even more so as a proportion of CRAN – as one would expect of a growth curve.

2250 user packages, and the continued growth, is truly mind-boggling. We can compare with the progression of CRAN itself, compiled by Henrik in a series of posts and emails to the main development mailing list. Not that long ago CRAN itself had only 1000 packages, then 5000, then 10000, and here we are at over 17300, with Rcpp now at nearly 13.0% and still growing. Amazeballs.

The Rcpp team, recently grown in strength with the addition of Iñaki, continues to aim for keeping Rcpp as performant and reliable as it has been. A really big shoutout and Thank You! to all users and contributors of Rcpp for help, suggestions, bug reports, documentation or, of course, code.

If you like this or other open-source work I do, you can now sponsor me at GitHub.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

01 April, 2021 12:46AM

March 31, 2021

Mike Hommey

Announcing git-cinnabar 0.5.7

Git-cinnabar is a git remote helper to interact with mercurial repositories. It allows you to clone, pull and push from/to mercurial remote repositories, using git.

Get it on github.

These release notes are also available on the git-cinnabar wiki.

What’s new since 0.5.6?

  • Updated git to 2.31.1 for the helper.
  • When using git >= 2.31.0, git -c config=value ... works again.
  • Minor fixes.

31 March, 2021 10:50PM by glandium

Gunnar Wolf

And what does the FSF have, anyway?

Following up on my previous post, it seems the FSF’s board is taking good care of undermining the FSF itself. Over a few days, it has:

  • Lost support from tens of organizations and companies, several of them important funders for the FSF’s activities
  • Alienated long-standing staff within it, leading to many important resignations
  • Divided the overall free software community, gathering several thousand signatures both repudiating its actions and backing it
  • In the Debian project, we have started a General Resolution process, with opinions all over the spectrum, to gauge whether the project will back either of the positions — or none.

Now… Many people have pointed to the fact that the FSF has been a sort of moral leader pushing free software awareness… But if they lose their moral stature, what remains? What power do they hold? Why do we care?

And the answer, at least one of them, is simple — and strong. The General Public License (GPL), in both its v2 and v3 revisions, reads:

Each version is given a distinguishing version number.  If the
Program specifies that a certain numbered version of the GNU General
Public License "or any later version" applies to it, you have the
option of following the terms and conditions either of that numbered
version or of any later version published by the Free Software
Foundation.  If the Program does not specify a version number of the
GNU General Public License, you may choose any version ever published
by the Free Software Foundation.

Years ago there was a huge argument about why Linux was licensed as GPLv2 only, without the option to relicense under GPLv3. Of course, by then, Linux had thousands of authors who would all have had to agree to a license change… so it would have been impossible even if it had been wanted. But yes, some people decried several terms of GPLv3 as not being aligned with their views of freedom.

Well, so… if the FSF board manages to have it their way and gets everybody to write them off as irrelevant, they will still be the stewards of the GPL. Thousands of projects are licensed under the GPL v2 or v3 “or later”. Will we continue to trust the FSF’s stewardship if it becomes just a board of big egos, with no regard for what happens in the free software ecosystem?

My suggestion, for all project copyright holders who are in a position to do so, is to drop the “or later” terms and stick to a single, known GPL version.
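Concretely, for files carrying the standard header, that would mean replacing the usual grant:

This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.

with a version-pinned grant along the lines of what many Linux kernel files use (an illustrative sketch, not legal advice):

This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License version 2 as
published by the Free Software Foundation.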

31 March, 2021 05:25PM

Chris Lamb

Free software activities in March 2021

Here is my monthly update covering what I have been doing in the free software world during March 2021 (previous month):

  • Reviewed and merged a number of contributions from Dan Palmer for my django-autologin library, aimed at applications using the Django web-development framework that wish to include automatic "login" links in emails, etc. Changes include allowing callers to override the max-age [...], ensuring we always retrieve the same user account [...] and improving the usability of the public API [...].

  • Opened a pull request to make the build process for Scrapy (a framework for extracting data from websites) reproducible. [...]


§


Reproducible Builds

One of the original promises of open source software is that distributed peer review and transparency of process results in enhanced end-user security. However, whilst anyone may inspect the source code of free and open source software for malicious flaws, almost all software today is distributed as pre-compiled binaries. This allows nefarious third-parties to compromise systems by injecting malicious code into ostensibly secure software during the various compilation and distribution processes.

The motivation behind the Reproducible Builds effort is to ensure no flaws have been introduced during this compilation process by promising identical results are always generated from a given source, thus allowing multiple third-parties to come to a consensus on whether a build was compromised.

The project is proud to be a member project of the Software Freedom Conservancy. Conservancy acts as a corporate umbrella allowing projects to operate as non-profit initiatives without managing their own corporate structure. If you like the work of the Conservancy or the Reproducible Builds project, please consider becoming an official supporter.

This month, I:

  • Categorised a huge number of packages and issues in the Reproducible Builds "notes" repository.

  • In Debian:

    • Kept isdebianreproducibleyet.com up to date. [...]

    • I submitted 4 patches to fix specific reproducibility issues in cdebootstrap, jalview, php8.0 & python-scrapy.

    • Whilst looking into reproducible builds issues, I noticed that the heudiconv package could not be built reproducibly. This was because the call to help2man fails, so the manual page includes a Python traceback instead of the actual manpage. I filed a bug with a patch as Debian bug #984778.

    • I uploaded flask-peewee (0.6.7-3) to Debian to make the build reproducible (#885326) and did the same for pyvows (3.0.0-3) (#977487) too, refreshing the packaging at the same time.


I also made the following changes to diffoscope, including uploading versions 169, 170 and 171 to Debian:

  • New features:

    • If zipinfo(1) shows a difference but we cannot uncover a difference within the underlying .zip or .apk file, add a comment to the output and actually show the binary comparison. (#246)
    • Ensure all our temporary directories have useful names. [...]
    • Ignore --debug and similar arguments when creating a (hopefully-useful) temporary directory suffix. [...]
  • Optimisations:

    • Avoid frequent long lines in RPM header outputs that cause extremely slow HTML output generation. (#245)
    • Use larger read buffer block sizes when extracting files from archives. [...]
    • Use a much-shorter HTML class name instead of diffponct to optimise HTML output. [...]
  • Output improvements:

    • Make error extracting X, falling back to binary comparison 'Y' error message in diffoscope's output nicer. [...]
    • Don't emit "Unable to stat file" debug messages at all. We have entirely-artificial directory "entries" such as ELF sections which, of course, will never exist as files. [...]
  • Logging improvements:

    • Add the target directory when logging which directory we are extracting containers to. [...]
    • Format report size messages when generating HTML reports. [...]
    • Don't emit a Returning a FooContainer logging message too, as we already emit Instantiating a FooContainer log message. [...]
    • Reduce "Unable to stat file" warnings to debug messages as these are sometimes by design. [...]
  • Misc improvements:

    • Clarify a comment regarding not extracting excluded files. [...]
    • Remove trailing newline from updated test file (re: #243). [...]
    • Fix test_libmix_differences failure on openSUSE Tumbleweed. (#244)
    • Move test_rpm to use the assert_diff utility helper.

§


Debian

Uploads

  • redis:

    • 6.0.12-1 — New upstream release.
    • 6.2.1-1 — New upstream release.
  • python-django (3.2~rc1-1) — New upstream beta release.

  • flask-peewee (0.6.7-3) (via the Debian Python packaging team) — Upload to refresh packaging and to make the build reproducible. (#885326)

  • bfs (2.2-1) — New upstream release.

  • pyvows (3.0.0-3) (via the Debian Python packaging team) — Refresh packaging and make the build reproducible. (#977487)


Debian LTS

This month I worked 18 hours on Debian Long Term Support (LTS) and 12 hours on its sister Extended LTS project.

  • Frontdesk duties, responding to user/developer questions, attending monthly meeting, reviewing others' packages, participating in internal mailing list discussions, etc.

  • Investigated and triaged botan1.10 (CVE-2021-24115), busybox (CVE-2021-28831), courier-authlib (CVE-2021-28374), edk2 (CVE-2021-28210 & CVE-2021-28211), netty (CVE-2021-21295), open-build-service (CVE-2020-8031), openjpeg2 (CVE-2020-27844), python2.7 (CVE-2021-23336), rpm (CVE-2021-20248, CVE-2021-20249, CVE-2021-20266 & CVE-2021-20271), ruby-activerecord-session-store (CVE-2019-25025), ruby-carrierwave (CVE-2021-21288), salt (CVE-2020-28243, etc.), slic3r (CVE-2020-28591), squid3 (CVE-2020-25097 & CVE-2021-28116), velocity-tools (CVE-2020-13959), velocity (CVE-2020-13936) & yara (CVE-2021-3402).

  • Proposed a stable update for python-django in buster. (#983526)

  • Prepared and uploaded a stable update for redis in buster. (#983527)

  • Issued DLA 2595-1 and ELA 380-1 for velocity, a Java-based template engine for writing web applications. Velocity could be exploited to run arbitrary code by applications that allowed untrusted users to upload/modify templates.

  • Issued DLA 2597-1 and ELA 381-1 to address a cross-site scripting (XSS) vulnerability in velocity-tools, a collection of useful tools for the Velocity template engine. The default error page could be exploited to steal session cookies, perform requests in the name of the victim, mount phishing attacks, and carry out many other similar attacks.

  • Issued DLA 2600-1 and ELA 384-1 for pygments as it was discovered that there was a series of denial of service vulnerabilities in this syntax highlighting library for Python. A number of regular expressions had cubic (or even exponential) worst-case complexity which could cause a remote denial of service (DoS) when provided with malicious input.

  • Issued DLA 2603-1 to address a number of vulnerabilities in libmediainfo, a library reading metadata such as track names, lengths, etc. from media files.

You can find out more about the Debian LTS via the following video:

31 March, 2021 05:04PM

Raphaël Hertzog

Challenging times for Freexian (2/4)

Note: This is the continuation of part 1 where I presented Freexian and its purpose.

Going forward: growing Freexian

Part 1: From “Debian LTS” to “Debian for the Enterprise”

Freexian’s “Debian LTS” service has so far been entirely successful, with steady growth over the years. Thanks to this, and even if there are always new challenges, it is fair to say that the Debian LTS team has met its goal in the last few years.

While this started from the desire to make LTS a reality, many sponsors are only looking for a way to give back to Debian through their company, and to make sure that Debian fits their needs.

But if you look at the bigger picture outside of this small LTS area, you will easily find many issues that need to be addressed if we want Debian to meet the needs of corporate users. Those issues vary widely in type and complexity. They can be as simple as missing the latest upstream version of an important package because the maintainer disappeared and nobody noticed before it was too late (i.e. the release was frozen); or a fairly basic piece of software not yet packaged at all; or a release-critical bug left unattended. On the other end of the spectrum, some corporate requirements will prove tougher to solve, for instance large software suites that are complex to package or that could have an impact elsewhere in Debian.

Bringing those facts together, we would like to have Freexian’s “Debian LTS/ELTS” offering evolve into a more general “Debian Software Assurance” offering, where you commit to a yearly budget for Debian sponsorship in the larger sense. That budget would fund different “projects” and the allocation between those projects would vary over time depending on the desires and needs of the sponsors/customers:

  • Technical support: the budget would always ensure that you have a few spare hours of technical support available in case you need them
  • Debian LTS: we want this to continue!
  • Debian ELTS: when the customer has not managed to migrate their Debian servers in time, they should be able to reallocate their budget towards ELTS and ensure their servers are secure until the migration has taken place.
  • Debian for the enterprise
    • Make sure that the packages used by sponsors are well maintained in Debian Testing/Unstable so that they are in the best shape for the next stable release.
    • Package new software that is relevant to corporate users. Offer to pool the maintenance work.
    • Fix bugs that customers are hitting.
    • Etc.
  • Debian project funding: that’s the variable part of the budget (and would have a minimum of 10% like we do for Debian LTS right now). When the other projects do not consume the whole budget, we invest the remaining money into generic Debian improvements.

This major shift in our offering would also be an ideal opportunity to build a professional, free-software based infrastructure aimed at sustaining this business, making it easier to administer the various aspects of this work, and easily allowing many more sponsors to join (individuals included!).

On a more pragmatic/operational note, this shift will bring a lot of challenges to the table, and those can hardly be handled with the current resources of Freexian: if we hope to properly implement this new strategy, we’ll need some additional help.

This article is to be continued in an upcoming post. Stay tuned!

31 March, 2021 04:00PM by Raphaël Hertzog

Ben Hutchings

Debian LTS work, March 2021

In March I was assigned 16 hours of work by Freexian's Debian LTS initiative and carried over 12.25 hours from earlier months. I worked 25.75 hours and will carry over the remainder.

I eventually settled on an apparently working patch series to fix the futex security issue in Linux 4.9. This went through upstream stable review and was included in 4.9.260. I applied the same fixes to the Debian package, along with some other security and regression fixes. I uploaded it and issued DLA-2586-1.

Unfortunately the futex changes for Linux 4.9 still caused a regression (kernel WARNING in some circumstances). I worked to backport and test a further set of fixes that had already been applied to later kernel branches. These were included in upstream stable release 4.9.264 and should go into an updated Debian package soon.

Following the Debian 10.9 point release, I also backported the updated Linux 4.19 package. I uploaded it and issued DLA-2610-1.

31 March, 2021 03:53PM

Bastian Venthur

Writing Makefiles for Python Projects

I'm a big fan of Makefiles. Almost all my side projects are using them, and I've been advocating their usage at work too.

Makefiles give your contributors an entry point for how to do certain things like building, testing, and deploying. And if done correctly, they can massively simplify your CI/CD pipeline scripts, as those can often just stupidly call the respective make targets. Most importantly, they are a very convenient shortcut for you as a developer as well.

For Python projects, where I'm almost always using virtual environments, I've been using two different strategies for Makefiles:

  1. assuming that make is executed inside the virtual environment
  2. wrapping all virtual environment calls inside make

Both strategies have their pros and cons.

Assuming make is executed inside the venv

Let's have a look at a very simple Makefile that allows for building, testing and releasing a Python project:

all: lint test

.PHONY: test
test:
    pytest

.PHONY: lint
lint:
    flake8

.PHONY: release
release:
    python3 setup.py sdist bdist_wheel upload

.PHONY: clean
clean:
    find . -type f -name '*.pyc' -delete
    find . -type d -name __pycache__ -delete

This is straightforward and a potential contributor immediately knows the entry points to your project.

Assuming there is a venv installed already, you have to activate it first and run the make commands afterwards:

$ . venv/bin/activate
$ make test

The downside is, of course, that you have to activate the venv for every new shell, which can get a bit annoying when you spawn a new terminal in tmux or put vim into the background to run make.

Activating the venv inside make will not work, as each recipe runs in its own shell; moreover, each command in each recipe runs in its own shell too. There are workarounds for the latter, e.g. using the .ONESHELL flag, but that does not solve the issue of each recipe getting a fresh shell.
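For illustration, here is a minimal sketch of that workaround (my example, not from a real project). With .ONESHELL, all lines of a recipe are executed by a single shell invocation, so the activation survives for the rest of that recipe, though not beyond it:

.ONESHELL:

.PHONY: test
test:
    . venv/bin/activate
    pytest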

Wrapping the venv calls inside make

The second approach avoids the issue of activating the venv altogether. I've borrowed this idea mostly from makefile.venv and simplified it a lot for my needs.

# system python interpreter. used only to create virtual environment
PY = python3
VENV = venv
BIN=$(VENV)/bin

# make it work on windows too
ifeq ($(OS), Windows_NT)
    BIN=$(VENV)/Scripts
    PY=python
endif


all: lint test

$(VENV): requirements.txt requirements-dev.txt setup.py
    $(PY) -m venv $(VENV)
    $(BIN)/pip install --upgrade -r requirements.txt
    $(BIN)/pip install --upgrade -r requirements-dev.txt
    $(BIN)/pip install -e .
    touch $(VENV)

.PHONY: test
test: $(VENV)
    $(BIN)/pytest

.PHONY: lint
lint: $(VENV)
    $(BIN)/flake8

.PHONY: release
release: $(VENV)
    $(BIN)/python setup.py sdist bdist_wheel upload

.PHONY: clean
clean:
    rm -rf $(VENV)
    find . -type f -name '*.pyc' -delete
    find . -type d -name __pycache__ -delete

The equivalent Makefile now looks immediately more complicated. So let's break it down.

Instead of just calling pytest, flake8 and python, assuming the venv is already activated or all dependencies are installed on the system directly, we explicitly call the ones from the venv by prefixing each command with the path to the venv's bin directory. Each of the recipes depends on the $(VENV) target, which ensures we always have an up-to-date venv installed. (The trailing touch updates the directory's timestamp, so make only re-runs that recipe when one of the requirements files changes again.)

This works because the . venv/bin/activate script basically does the same: it puts the venv before anything else in your PATH, so each call to python, etc. will find the one installed in the venv first.
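In fact, GNU Make can express that PATH trick directly. A hedged sketch (my variation, not from this post, reusing the $(BIN) and $(VENV) definitions from the Makefile above): export a PATH with the venv's bin directory prepended, and recipes can then call the bare tool names:

# Prepend the venv's bin directory to the PATH seen by every recipe,
# mimicking what `. venv/bin/activate` does for an interactive shell.
export PATH := $(BIN):$(PATH)

.PHONY: test
test: $(VENV)
    pytest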

While the Makefile is a bit more complicated, we can now just call

$ make test

and don't have to deal with venvs directly any more (well, for those simple cases at least...). If you don't need to support Windows, you can remove the appropriate block and the Makefile looks relatively tame, even for people who don't use make very often.

Which one is better?

I think the second approach is more convenient. I've used the first approach happily for years and only learned quite recently about the second one. I haven't really noticed any downsides yet, but I do realize that almost all Python projects with Makefiles I've checked out seem to prefer the first approach. I wonder why that is?

31 March, 2021 01:30PM by Bastian Venthur

Timo Jyrinki

MotionPhoto / MicroVideo File Formats on Pixel Phones

Google Pixel phones support what they call ”Motion Photo”, which is essentially a photo with a short video clip attached to it. They are quite nice since they bring the moment alive, especially as the video capture starts a small moment before the shutter button is pressed. For most viewing programs they simply show up as static JPEG photos, but there is more to the files.

I’d really love proper Shotwell support for these file formats, so I posted a longish explanation with many of the details in this blog post to a ticket there too. Examples of the newer format are linked there as well.

Info posted to Shotwell ticket

There are actually two different formats, an old one that is already obsolete, and a newer current format. The older ones are those that your Pixel phone recorded as ”MVIMG_[datetime].jpg”, and they have the following meta-data:

Xmp.GCamera.MicroVideo                         XmpText  1  1
Xmp.GCamera.MicroVideoVersion                  XmpText  1  1
Xmp.GCamera.MicroVideoOffset                   XmpText  7  4022143
Xmp.GCamera.MicroVideoPresentationTimestampUs  XmpText  7  1331607

The offset is actually from the end of the file, so one needs to calculate accordingly. But it is otherwise exact, so one can simply extract the video using that meta-data:

#!/bin/bash
#
# Extracts the microvideo from a MVIMG_*.jpg file

# The offset is from the end of the file, so calculate the start of
# the video accordingly
offset=$(exiv2 -p X "$1" | grep MicroVideoOffset | sed 's/.*\"\(.*\)"/\1/')
filesize=$(du --apparent-size --block=1 "$1" | sed 's/^\([0-9]*\).*/\1/')
extractposition=$(expr $filesize - $offset)
echo offset: $offset
echo filesize: $filesize
echo extractposition=$extractposition
dd if="$1" skip=1 bs=$extractposition of="$(basename -s .jpg "$1").mp4"

The newer format is recorded in filenames called ”PXL_[datetime].MP.jpg”, and they have a _lot_ of additional metadata:

Xmp.GCamera.MotionPhoto                                   XmpText   1  1
Xmp.GCamera.MotionPhotoVersion                            XmpText   1  1
Xmp.GCamera.MotionPhotoPresentationTimestampUs            XmpText   6  233320
Xmp.xmpNote.HasExtendedXMP                                XmpText  32  E1F7505D2DD64EA6948D2047449F0FFA
Xmp.Container.Directory                                   XmpText   0  type="Seq"
Xmp.Container.Directory[1]                                XmpText   0  type="Struct"
Xmp.Container.Directory[1]/Container:Item                 XmpText   0  type="Struct"
Xmp.Container.Directory[1]/Container:Item/Item:Mime       XmpText  10  image/jpeg
Xmp.Container.Directory[1]/Container:Item/Item:Semantic   XmpText   7  Primary
Xmp.Container.Directory[1]/Container:Item/Item:Length     XmpText   1  0
Xmp.Container.Directory[1]/Container:Item/Item:Padding    XmpText   1  0
Xmp.Container.Directory[2]                                XmpText   0  type="Struct"
Xmp.Container.Directory[2]/Container:Item                 XmpText   0  type="Struct"
Xmp.Container.Directory[2]/Container:Item/Item:Mime       XmpText   9  video/mp4
Xmp.Container.Directory[2]/Container:Item/Item:Semantic   XmpText  11  MotionPhoto
Xmp.Container.Directory[2]/Container:Item/Item:Length     XmpText   7  1679555
Xmp.Container.Directory[2]/Container:Item/Item:Padding    XmpText   1  0

Sounds like fun, and lots of information. However, I didn’t see why the ”Length” of the first item is 0, and I didn’t see how to use the latter Length value. But I can use the mp4 header to extract it:

#!/bin/bash
#
# Extracts the motion part of a MotionPhoto file PXL_*.MP.jpg

# The pattern below is the mp4 container header (box size 0x18 followed
# by "ftypmp42"); its byte offset marks where the video starts
extractposition=$(grep --binary --byte-offset --only-matching --text \
-P "\x00\x00\x00\x18\x66\x74\x79\x70\x6d\x70\x34\x32" "$1" | sed 's/^\([0-9]*\).*/\1/')

dd if="$1" skip=1 bs=$extractposition of="$(basename -s .jpg "$1").mp4"

UPDATE: I wrote most of this blog post earlier. Now, when actually getting to publishing it a week later, I see the obvious, i.e. that the ”Length” is again simply the offset from the end of the file, so one could take the same, less brute-force approach as for MVIMG. I’ll leave the above as is, however, for the ❤️ of binary grepping.
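Based on that observation, the less brute-force variant would presumably mirror the MVIMG script above (an untested sketch; it assumes exiv2 -p X prints the value as an Item:Length="…" attribute and that the video's entry comes last, as in the dump above):

#!/bin/bash
#
# Untested sketch: extract the motion part of a PXL_*.MP.jpg via the
# Item:Length XMP value (the offset from the end of the file)

# Take the last Item:Length attribute; the first one (the primary
# image's) is 0
offset=$(exiv2 -p X "$1" | grep -o 'Item:Length="[0-9]*"' | tail -n 1 | tr -dc '0-9')
filesize=$(du --apparent-size --block=1 "$1" | sed 's/^\([0-9]*\).*/\1/')
extractposition=$(expr $filesize - $offset)
dd if="$1" skip=1 bs=$extractposition of="$(basename -s .jpg "$1").mp4"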

(cross-posted to my other blog)

31 March, 2021 11:06AM by TJ ([email protected])

Russ Allbery

Review: Paladin's Strength

Review: Paladin's Strength, by T. Kingfisher

Series: The Saint of Steel #2
Publisher: Red Wombat Studio
Copyright: 2021
ASIN: B08WWKXXVY
Format: Kindle
Pages: 474

Paladin's Strength is a sequel of sorts to Paladin's Grace, but it has different protagonists. It picks up a subplot from that novel with another former follower of the Saint of Steel. You can safely read the books in any order; there are some minor spoilers for the Paladin's Grace subplot in this book, but nothing that would matter for the enjoyment of the story.

Istvhan and his fellow brother Galen are acting as the heads of a mercenary band, which has hired on to escort Master Distiller Brant and his collection of Emperor Oak barrels. In truth, they have another mission from the Temple of the White Rat: to track down a disturbing monster that leaves a trail of beheaded bodies.

Clara is a lay sister of St. Ursa, a convent that was raided by slavers who hauled away the nuns. She was left for dead in Arral territory when she fell sick, and was taken as a house slave after the Arral nursed her back to health. The story opens with her holding a sword in front of Istvhan's tent, part of the fallout of Istvhan killing a young Arral in self-defense. The politics of that fallout are not at all what Istvhan expects. They end with Clara traveling with Istvhan's company, at least for a while.

Both Istvhan and Clara are telling the truth: Istvhan is escorting a merchant, and Clara is hoping to rescue her sisters. Both of them are also hiding a great deal. Istvhan's quiet investigation of the trail of a monster is easy enough to reveal once he knows Clara well enough. That he's a berserker who no longer has a god in control of his battle rage is another matter; the reader knows that, and of course so does Galen, but Istvhan has no intention of telling anyone else. Clara has her own secrets about herself and the sisters of St. Ursa, ones that neither the reader nor Istvhan knows.

This is a T. Kingfisher novel about paladins, so of course it's also a romance. If you've read Kingfisher's other books, you know she writes slow burn romances, but Paladin's Strength is next level. Istvhan and Clara have good reasons to not want to get involved and to doubt the other person's attraction or willingness, but this goes far beyond the obvious to become faintly absurd. If you like the sort of romance where both leads generate endless reasons to not pursue the relationship (some legitimate, some not) while steadfastly refusing to talk to each other about them and endlessly rehashing hints and interpretations, you're in for a treat. For me, it was too much and crossed over into irritation. By the two-thirds point, Kingfisher was gleefully throwing obstacles in their way to drag out the suspense, and I just wanted everyone to shut up about having sex and get on with the rest of the story.

That's unfortunate because I really liked Clara. She isn't the same type as Grace, Halla from Swordheart, or even Slate from Clockwork Boys and The Wonder Engine, the other novels set in this universe. She's self-contained, physically intimidating, cautious, deliberate, and very good at keeping her own counsel. I won't spoil her secret, since it's fun to work it out at the start of the book, but it's a lovely bit of characterization and world-building that Kingfisher handles with a thoughtful eye for its ramifications and effect on Clara's psychology. I would happily read more books about Clara.

I liked Istvhan well enough when he was doing anything other than mooning over Clara. As with all of Kingfisher's paladins, he's not a very subtle person, but he's a good straight man for Clara's quiet bemusement. He fills the paladin slot in this story, which is all he needs to do. There's enough else going on with Clara and with the plot — two separate major plotlines, plus a few subplots — that Paladin's Strength can use a protagonist who heads straight forward and hits things until they fall down.

The mooning, though... this is going to be a matter of personal taste. I think the intent was to contrast Istvhan's rather straightforward lustful appreciation with Clara's nuanced and trauma-laced reservations, and to play Istvhan's reactions in part for humor. I'm sure it works for some people, but I found Istvhan juvenile and puerile (albeit, to be clear, in a respectful and entirely consensual way), which didn't help me invest in a romance plot that I already thought dragged on too long. Thankfully the characters finally get past this in time for a dramatic and satisfying conclusion to the plot.

The joy of Paladin's Grace (and Swordheart for that matter) was the character dynamics and quirky female lead, which made the romance work even when Stephen was being dense. The joy of Paladin's Strength for me was primarily Clara's matter-of-fact calm bemusement and secondarily the plot and the world-building. (Kingfisher's gnoles continue to be the best thing about this setting.) None of that helps the romance as much, and the slow burn was far, far too slow for me, which lowers this one a notch. Still, this was fun, and I'll keep reading books about the Temple of the White Rat and their various friends and encounters for as long as Kingfisher keeps writing them.

Rating: 7 out of 10

31 March, 2021 05:53AM

March 30, 2021

Dirk Eddelbuettel

x13binary 1.1.39-3 on CRAN: (Imperfect) Package Updates

A new release 1.1.39-3 of x13binary, the CRAN package wrapping the X-13ARIMA-SEATS program by the US Census Bureau (upstream release 1.1.39), is now on CRAN.

The x13binary package takes the pain out of installing X-13ARIMA-SEATS by making it a fully resolved CRAN dependency. For example, when you install the excellent seasonal package by Christoph, X-13ARIMA-SEATS will get pulled in via the x13binary package and things just work. Just depend on x13binary, and on all major OSs supported by R you should have an X-13ARIMA-SEATS binary installed which will be called seamlessly by higher-level packages such as seasonal or gunsales. With this, the full power of what is likely the world’s most sophisticated deseasonalization and forecasting package is at your fingertips at the R prompt, just like any other of the 17350+ CRAN packages. You can read more about this (and the seasonal package) in the Journal of Statistical Software paper by Christoph and myself.

This release was needed because the recent M1mac build was reporting leftover ‘detritus’ in the temporary directory, which we addressed with an explicit removal at the end. We also addressed another CRAN Policy change since the last release, namely a conversion of the configure script from bash to sh.

Now, sadly, that second aspect blew up on Solaris, and the ‘detritus’ issue appears to persist. By now Christoph and a colleague have installed R(-devel) on such an M1 machine, but still cannot reproduce the issue. We will reach out to CRAN to learn more. A follow-up release 1.1.39-4 is likely.

The good news is that the standard macOS binary works on M1, as do other binaries thanks to the translation layer. We do however lack a genuine binary for Solaris, so if any of the esteemed readers of this post happens to have access to R on Solaris along with a basic Fortran compiler, we would love to hear from you. Building X-13ARIMA-SEATS from source on Solaris should be straightforward; it is on the other OSs.

Courtesy of my CRANberries, there is also a diffstat report for this release showing changes to the previous release.

If you like this or other open-source work I do, you can sponsor me at GitHub.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

30 March, 2021 11:28PM

hackergotchi for Raphaël Hertzog

Raphaël Hertzog

Challenging times for Freexian (1/4)

TLDR: Freexian’s success means that we have resources to invest into Debian projects. Simply offering money has not worked so far, so I am looking to hire a “project manager” whose work would be to help spend that money in useful ways. At the same time, Freexian needs to adapt to cope with the growth: with new employees, with new infrastructure and a new offering. I want to give an idea of where we are headed, to try to inspire people who share our values and our desire to improve Debian. Read on if you are interested.

Note: The original text has been split into 4 blog posts that will be published over a few days.

Introduction

Freexian is an IT services company specializing in Debian. We provide technical support on Debian by email, we create and maintain Debian packages requested by our customers, and we also help organizations run an entire Debian derivative (Kali Linux being the most notable one).

On top of this, Freexian runs the commercial part of the Debian LTS service: it invoices the many sponsors that need long-term support, and uses the money to pay Debian contributors (about 12 currently) to make sure that Debian releases are supported for 5 years instead of 3. With the Extended LTS service, we push that further to 7 years, though only for a smaller subset of packages and in a repository that is hosted outside of debian.org.

Freexian’s purpose

When I created Freexian, it was out of a desire to be paid to work on Debian, and to be able to contribute during work time to the project that was so important to me. That goal was met a long time ago.

But ultimately what I strive to achieve for Debian is not entirely aligned with the work that Freexian’s customers are requesting. That’s why, in the “long term projects” of Freexian, I always kept “find a business model that can fund the Debian projects that I would like to do”, as well as “if that model works for me, build something so that others can benefit from it too”. The first opportunity to experiment appeared when Debian discussed Long Term Support and I stepped up to set up a commercial offer to pay Debian contributors.

Step 1: Paying Debian contributors for LTS work

When we started the Debian LTS service, I voluntarily opted for an hourly rate that was rather high, so that any Debian developer, regardless of their geographical location, could participate and not earn (much) less than they would have from working on other tasks. This choice did imply paying a very high rate for some countries, but I didn’t see that as a problem, quite the contrary: if a Debian developer can earn enough money to cover their cost of living with 15 hours of Debian LTS, and then spend the rest of their month contributing to Debian, all the better! I’m not sure if anyone made this choice, but that was a dream of my younger self…

From a personal standpoint, the launch of Debian LTS has meant less free time, more administrative work, new duties to coordinate a team of paid contributors, more communication with many Debian-using companies, and many new opportunities too! This ultimately resulted in the launch of Extended LTS and PHP LTS, both of which have been rather successful so far.

Step 2: Funding Debian projects

With the growth of the Debian LTS service, and given that we have reached the required funding level, we decided to put a small share of the revenues aside and use that to fund useful Debian projects, typically in areas that were affected by our Debian LTS work. This effort was fully formalized in the project-funding git repository. We announced this process in November 2020, and we have kept mentioning it in our monthly LTS reports ever since, but so far only a single project has benefited from this. 

This is really the dream offer that I wish had existed when I was younger and was still struggling to get enough customers; hence I don’t really understand this lack of interest. You can find some discussion of the reasons why this offer has not (yet) found its target audience in this debian-vote thread.

I was hoping that spending money would be easy, but I now realize I was wrong! I’m positive that I could find dozens of useful projects to fund, but I just don’t have the time for this extra effort on top of my regular Freexian duties. I still really want to put this money to good use, which is why I’m looking into some solutions.

This article will be continued in an upcoming post, so stay tuned!

30 March, 2021 04:00PM by Raphaël Hertzog

hackergotchi for Emmanuel Kasper

Emmanuel Kasper

Playing with cri-o, a container runtime built for Kubernetes

Kubernetes is moving away from Docker to alternative container engines that present a smaller core with just the functionality needed. The two most popular alternatives are:

  • containerd, a subset of docker, used for instance in Google Kubernetes Engine
  • cri-o, a new implementation of a container engine, used for instance in Red Hat's Kubernetes offering (OpenShift)

These alternatives are meant to be used programmatically via a unix domain socket, and therefore have a limited command line interface.

Let's play around in a VM.

Install a throwaway VM with Vagrant

apt install vagrant vagrant-libvirt
vagrant init debian/testing64

Start the VM, install dependencies

vagrant up
vagrant ssh
sudo apt update
sudo apt install --yes curl gnupg jq

Install cri-o the container engine

sudo bash
export OS=Debian_Testing VERSION=1.20

echo "deb https://download.opensuse.org/repositories/devel:/kubic:/libcontainers:/stable/$OS/ /" > /etc/apt/sources.list.d/libcontainers.list
echo "deb https://download.opensuse.org/repositories/devel:/kubic:/libcontainers:/stable:/cri-o:/$VERSION/$OS/ /" > /etc/apt/sources.list.d/cri-o:$VERSION.list
curl -L https://download.opensuse.org/repositories/devel:kubic:libcontainers:stable:cri-o:$VERSION/$OS/Release.key | apt-key add -
curl -L https://download.opensuse.org/repositories/devel:/kubic:/libcontainers:/stable/$OS/Release.key | apt-key add -
apt install cri-o cri-o-runc containernetworking-plugins conntrack

Verify it is running properly

systemctl restart cri-o
systemctl status cri-o
...
Started Container Runtime Interface for OCI (CRI-O).

Say hello to cri-o via its unix domain socket

curl --silent  --unix-socket /var/run/crio/crio.sock http://localhost/info | jq 
{
  "storage_driver": "overlay",
  "storage_root": "/var/lib/containers/storage",
  "cgroup_driver": "systemd",
  "default_id_mappings": {
    "uids": [
      {
        "container_id": 0,
        "host_id": 0,
        "size": 4294967295
      }
    ],
    "gids": [
      {
        "container_id": 0,
        "host_id": 0,
        "size": 4294967295
      }
    ]
  }
}

Install crictl, a Kubernetes debugging tool for containers

wget --directory-prefix=/tmp https://github.com/kubernetes-sigs/cri-tools/releases/download/v1.20.0/crictl-v1.20.0-linux-amd64.tar.gz
tar -xaf /tmp/crictl-v1.20.0-linux-amd64.tar.gz -C /usr/local/sbin/
chmod +x /usr/local/sbin/crictl

crictl info
{
  "status": {
    "conditions": [
      {
        "type": "RuntimeReady",
        "status": true,
        "reason": "",
        "message": ""
      },
      {
        "type": "NetworkReady",
        "status": true,
        "reason": "",
        "message": ""
      }
    ]
  }
}

From there on you can create a container following the examples in https://github.com/kubernetes-sigs/cri-tools/blob/master/docs/crictl.md
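For instance, a rough sketch based on those examples might look like this; it is untested here, the image name and config contents are illustrative, and the config file format is the one described in the crictl documentation:

# define a minimal pod sandbox
cat > pod-config.json << 'EOF'
{
  "metadata": {
    "name": "hello-pod",
    "namespace": "default",
    "uid": "hello-pod-uid-1"
  },
  "log_directory": "/tmp",
  "linux": {}
}
EOF

# define a container to run inside that pod
cat > container-config.json << 'EOF'
{
  "metadata": {
    "name": "hello"
  },
  "image": {
    "image": "docker.io/library/busybox:latest"
  },
  "command": ["sleep", "3600"],
  "log_path": "hello.log",
  "linux": {}
}
EOF

# pull the image, create the pod sandbox, then create and start the container in it
crictl pull docker.io/library/busybox:latest
POD_ID=$(crictl runp pod-config.json)
CONTAINER_ID=$(crictl create "$POD_ID" container-config.json pod-config.json)
crictl start "$CONTAINER_ID"
crictl ps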

30 March, 2021 03:54PM by Emmanuel Kasper ([email protected])

Molly de Blanc

Helping

How can you help free software?

Aside from donating to the excellent free software nonprofits out there, and contributing to a project by building software or other resources, there are things you can do to help the free software cause. The two biggest things I think are providing mentorship and gently normalizing free software.

Providing Mentorship

Allison Randal introduced me to the idea that mentorship doesn’t have to be an ongoing process. This is to say, you don’t have to sign up to be someone’s best friend and advisor for life (though you certainly can). Providing short-term, project- or skill-based, or one-off mentorship is useful for building community because it makes people feel welcome and cared for and helps build skills that benefit free software.

I do a lot of proofreading and editing of people’s writing – especially people with minimal writing experience and/or non-native English speakers who are writing important documents in English. If the person is interested, I try to talk to them about their writing and why I’m making these particular suggestions. I hope this helps them with their writing in the future.

Other examples are working on a particular project or skill: this can be helping someone develop a particular skill (e.g. git outside of the command line) or giving advice on a project with a level of specificity and detail you’re both comfortable with. These can, again, be one-off things or things that require minimal effort/occasional conversation. I have some friends who I consider my Debian mentors who just answer functional questions whenever I have trouble doing something.

I also love love love talking with people about their free software trajectories, their goals and desires and dreams for their involvement in free software, whether that’s finding a place in a community, developing a skill set, or other things about their future (like job hopes, schooling, etc). These conversations have been so helpful for me personally, and I like to think they help others.

Gently Normalizing Free Software

I think normalizing free software is very important to its success and adoption. It’s not helpful to insist that someone who has never done so before create a Debian boot disk and install it. It is helpful to suggest using Big Blue Button or Jitsi. If a friend wants help finding audio editing software, suggest they try Audacity. I’d go as far as to suggest doing this without explaining that it’s free software, and instead focusing on why it’ll work and that it’s available at no cost. If they like it, then it’s a great time to talk about rights and freedoms. Of course, if they already care about these sorts of things, if you’re discussing privacy software, if anti-surveillance is an issue, or any number of other things, software freedom is a great thing to bring up!

Above all, just be nice.

Be nice. It’s basically the best thing you can do for free software.

30 March, 2021 12:58AM by mollydb

March 29, 2021

hackergotchi for Benjamin Mako Hill

Benjamin Mako Hill

Identifying “Underproduced” Software

I wrote this blog post with Kaylea Champion and a version of this post was originally posted on the Community Data Science Collective blog.

Critical software we all rely on can silently crumble away beneath us. Unfortunately, we often don’t find out software infrastructure is in poor condition until it is too late. Over the last year or so, I have been supporting Kaylea Champion on a project my group announced earlier to measure software underproduction—a term we use to describe software that is low in quality but high in importance.

Underproduction reflects an important type of risk in widely used free/libre open source software (FLOSS) because participants often choose their own projects and tasks. Because FLOSS contributors work as volunteers and choose what they work on, important projects aren’t always the ones to which FLOSS developers devote the most attention. Even when developers want to work on important projects, relative neglect among important projects is often difficult for FLOSS contributors to see.

Given all this, what can we do to detect problems in FLOSS infrastructure before major failures occur? Kaylea Champion and I recently published a paper at the IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER) 2021 laying out our new method for measuring underproduction, which we believe provides one important answer to this question.

Conceptual diagram showing how our conception of underproduction relates to quality and importance of software: the x-axis shows relative importance and the y-axis relative quality. The top left is overproduction (high quality, low importance), the diagonal is alignment (quality and importance approximately the same), and the lower right is underproduction (high importance, low quality), the area of potential risk.

In the paper, we describe a general approach for detecting “underproduced” software infrastructure that consists of five steps: (1) identifying a body of digital infrastructure (like a code repository); (2) identifying a measure of quality (like the time it takes to fix bugs); (3) identifying a measure of importance (like install base); (4) specifying a hypothesized relationship linking quality and importance if quality and importance are in perfect alignment; and (5) quantifying deviation from this theoretical baseline to find relative underproduction.

To show how our method works in practice, we applied the technique to an important collection of FLOSS infrastructure: 21,902 packages in the Debian GNU/Linux distribution. Although there are many ways to measure quality, we used a measure of how quickly Debian maintainers have historically dealt with 461,656 bugs that have been filed over the last three decades. To measure importance, we used data from Debian’s Popularity Contest opt-in survey. After some statistical machinations that are documented in our paper, the result was an estimate of relative underproduction for the 21,902 packages in Debian we looked at.

One of our key findings is that underproduction is very common in Debian. By our estimates, at least 4,327 packages in Debian are underproduced. As you can see in the list of the “most underproduced” packages—again, as estimated using just one measure—many of the most at-risk packages are associated with the desktop and windowing environments, where there are many users but also many extremely tricky integration-related bugs.

The 30 packages with the most severe underproduction problem in Debian, shown as a series of boxplots; these have the highest level of underproduction according to our analysis.

We hope these results are useful to folks at Debian and the Debian QA team. We also hope that the basic method we’ve laid out is something that others will build off in other contexts and apply to other software repositories.

In addition to the paper itself and the video of the conference presentation on YouTube by Kaylea, we’ve put all our code and data in an archival repository on Harvard Dataverse, and we’d love to work with others interested in applying our approach in other software ecosystems.


For more details, check out the full paper which is available as a freely accessible preprint.

This project was supported by the Ford/Sloan Digital Infrastructure Initiative. Wm Salt Hale of the Community Data Science Collective and Debian Developers Paul Wise and Don Armstrong provided valuable assistance in accessing and interpreting Debian bug data. René Just generously provided insight and feedback on the manuscript.

Paper Citation: Kaylea Champion and Benjamin Mako Hill. 2021. “Underproduction: An Approach for Measuring Risk in Open Source Software.” In Proceedings of the IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER 2021). IEEE.

Contact Kaylea Champion ([email protected]) with any questions or if you are interested in following up.

29 March, 2021 11:56PM by Benjamin Mako Hill

Anton Gladky

2021/03, FLOSS activity

LTS

This is my first official month (besides some test time last year) of working for LTS. I was assigned 12 hours and worked all of them. I could relatively easily set up the development environment for Debian Stretch, and managed to release several DLAs.

Released DLAs

  1. DLA-2588-1 zeromq3_4.2.1-4+deb9u4

    • CVE-2021-20234
    • CVE-2021-20235
  2. DLA-2594-1 tomcat8_8.5.54-0+deb9u6

    • CVE-2021-24122
    • CVE-2021-25122
    • CVE-2021-25329
  3. DLA-2605-1 mariadb-10.1_10.1.48-0+deb9u2

    • CVE-2021-27928

CVE-2020-11997

I investigated CVE-2020-11997, which was marked as a guacamole-server issue. There was not much information about this CVE. I tried analyzing the git log and the diff between the affected and fixed versions, without any visible success. After that I contacted upstream, and they were very responsive!

This CVE affects guacamole-client only, and the ancient version in the archive is very difficult to fix. So I decided to mark this CVE as NOT-FOR-US.

Repositories with pipelines

For most of the packages I touched during LTS work, new repositories were created in the LTS packages group on salsa.d.o with CI pipelines enabled. This really helps to test updates, though some tests need to be disabled for the pipelines to pass.

LTS-Meeting

I attended the Debian LTS team IRC-meeting.

Debian Science Team

I have prepared and uploaded the following packages, which are maintained under the umbrella of the Debian Science Team:

  • gmsh_4.7.1+ds1-5
  • vtk7_7.1.1+dfsg2-10
  • gl2ps_1.4.2+dfsg1-1~bpo10+1
  • vtk9_9.0.1+dfsg1-8~bpo10+2
  • sundials_5.7.0+dfsg-1~exp1
  • Prepared, uploaded and requested unblock for boost1.74_1.74.0-9

  • Moved 4 packages under the umbrella of the Debian Electronics Team, which is a better place for them:

    • luma.core_2.3.1+dfsg1-1
    • luma.led-matrix_1.5.0+dfsg1-3
    • pyftdi_0.52.9-4
    • smbus2_0.4.1-3

29 March, 2021 09:00PM

Jamie McClelland

The problem with Richard Stallman is not about free speech

Free speech and censorship are critically important issues. And, using them to defend Richard Stallman's return to the Free Software Foundation (FSF) board is just plain wrong.

Richard Stallman resigned from the board in 2019 after he sent an email in defense of Marvin Minsky (Minsky is accused of raping one of Jeffrey Epstein's victims).

Stallman's fateful email, however, is just one piece of the reason for why he should not be on the board. The full story is about his history of abuse toward women and is extensive.

On March 21st, 2021, Stallman announced he is back on the board.

There are profound reasons why any movement interested in equitable and open participation would want to publicly distance itself from Stallman. However, the long-form defenses of Stallman, including a note from Nadine Strossen, the former president of the ACLU, quoted in this defense, persist.

Many of the arguments defending Richard Stallman (including the one from Strossen) are grounded in a belief that Stallman is being punished for his unpopular political views, which deserve to be defended on the grounds of freedom of expression.

That's wrong.

Stallman should be kicked off the board because he has a long history of abusing his position to hit on women, which, when combined with his public opinions on under-age sex and his defense of Minsky, sends a strong signal that the FSF does not care about the participation of women.

Being on a board of directors is a privilege, not a right. Being removed from a board is not a punishment. And removing someone whose behavior and public statements are an obstacle to building an inclusive and equitable movement is something every board should strive to do.

If we are going to make this an issue about free expression, it should be about all the political expression lost to the free software movement because Stallman's unequal behavior toward women excluded an enormous number of talented individuals.

29 March, 2021 12:51PM

Russ Allbery

Review: JavaScript: The Definitive Guide

Review: JavaScript: The Definitive Guide, by David Flanagan

Publisher: O'Reilly
Copyright: May 2020
ISBN: 1-4919-5202-4
Format: Trade paperback
Pages: 665

JavaScript: The Definitive Guide has been frequently revised for new versions of JavaScript and therefore has multiple editions. This review is of the seventh edition, first published in May of 2020.

Reviews of programming language books are challenging since people learn languages in different ways. A short calibration for my preferences may therefore be useful.

I'm both an experienced programmer in multiple languages (C, Perl, Python, and some Java and Ruby professionally; Rust, some PHP, and a few minor languages as a hobby) and I specialized in software theory in college. I therefore like to learn languages comparatively and am comfortable with a lot of up-front syntax and discussion of the unique properties of the language. Introductory programs and practical exercises don't matter as much to me; I'm happy to hold the syntax in my head until enough of the language has been introduced to write simple programs.

For me, this book is excellent. It's one of the best language manuals that I've read, and that requires some work because JavaScript is a sprawling mess with odd corners, deprecated features, and alternate implementations of core constructs. Flanagan takes the syntax-first, comprehensive approach that I prefer, working methodically through the language (defining your own functions isn't introduced until chapter eight) and discussing all of the quirks as he goes. I felt like I thoroughly understood each portion of the language before moving on.

And this book is tight. Some comprehensive language introductions sprawl, but the benefit of seven editions of iteration is a book that has been honed to the most direct and effective explanation of each concept. The section on type conversions with operators, for example, was so good that I was able to immediately understand the unintuitive result of [1] + 2 (the string '12'), despite this being one of the most confusing parts of the language. The sections on JavaScript's prototype-based object type system and its three concurrency models (callbacks, promises, and async/await) were equally good. I came away feeling like I not only understood promises and callback chains but had a feel for how the same code would look when written in the different systems.

The drawback in this approach is that if you instead want a language reference that only tells you the parts of the language that you should use and leaves out the legacy weirdness and obscure corners for later (or never), this may not be the book for you. Flanagan labels the obsolete constructs, but he's meticulous about explaining the entire language, including such things as new Boolean or var variables that no one should use. This is what I wanted; I prefer to have a thorough grounding in language primitives so that the language doesn't surprise me. But it can be a lot to juggle and prune in your head.

JavaScript is a language used in some very different domains. The approach Flanagan takes to that is to spend as long as possible on the core language that's usable both in the browser and on the server (while marking the pieces, such as the module system, that are markedly different between Node and browsers). He then puts two monster chapters at the back of the book that cover JavaScript in web browsers and JavaScript as implemented by Node. Both are overviews rather than comprehensive references, since a full manual for either would probably be as long again as this book, but they were more than adequate for my purposes. (I bogged down a bit in the web browser chapter, in part because I didn't have an immediate use for most of the material.) Flanagan wisely defers to MDN as the reference manual for the JavaScript APIs available in web browsers.

I thought Flanagan also hit the right balance of explanation to examples, and did a good job controlling the length of the examples. Most of the code excerpts are short and to the point. The longer ones have a high level of explanatory power per line, since Flanagan uses them to pull together multiple concepts and show how they interact. I was particularly impressed with the example that closes the chapter on web browsers, which uses <canvas>, ImageData, generators, promises, web workers, and other areas of the language Flanagan previously explained to implement a Mandelbrot set explorer in eleven pages of code. I think that's the longest example in the book, and it's well worth it.

This sort of introduction will always have limitations. Flanagan provides a brief orientation to the ecosystem surrounding JavaScript in the last chapter, but most JavaScript programmers will be working with packaging tools and frameworks that could themselves be the topic of another book and that he doesn't have room to cover. JavaScript, even more than most languages, is commonly used via a heavy layer of supporting libraries and abstractions, so you will probably not be able to tackle a practical JavaScript project using solely the material in this book. But if you're the sort of programmer who wants to start with a solid syntactical and conceptual understanding of the language core before starting on more applied topics, I've rarely seen it done better than this book.

If you want a quick-start guide that will get you writing code quickly and is opinionated about what parts of the language you should learn, this may not be the book for you. But if you're comfortable with comprehensive detail in your language guides, this was exactly what I was looking for. Recommended.

Rating: 9 out of 10

29 March, 2021 04:55AM

hackergotchi for Louis-Philippe Véronneau

Louis-Philippe Véronneau

Montreal 2021 BSP

Last weekend Debian Quebec held a Bug Squashing Party to try to fix some bugs in the upcoming Debian Bullseye.

I wasn't convinced at first, but Tassia's contagious energy and willingness to help organise the event eventually won me over. And — shockers! — it was really fun.

Group picture of the BSP attendees on Jitsi Meet

We fixed a couple of RC bugs, held lightning talks and had a virtual pizza party!

My lightning talk on autopkgtests was well received and a few people decided to migrate to sbuild and enable autopkgtests by default.

Sergio's talk on debuginfod was incredibly interesting. I'm not a C programmer, and the live demo made me understand how this service can help make debugging C easier.

Jerome's talk on using Yubikeys to unlock LUKS-encrypted drives was also very good! It also served as a reminder that Yubico's products are much more featureful and convenient to use than other Open Hardware / Free Software hardware tokens. Hopefully that will change as companies like Nitrokey and SoloKeys mature.

This was my third BSP, crazy how time flies... With the Bullseye release closing in, you should try to join or organise one!

29 March, 2021 03:15AM by Louis-Philippe Véronneau

March 28, 2021

Russ Allbery

Pod::Thread 2.00

I am slowly working towards collecting twenty years of static web site generation and release management tools into DocKnot. Pod::Thread is the component that turns POD documentation into HTML by converting it to thread, the macro language that is the backbone of my static site generator, and then letting spin turn the results into HTML.

I wrote this module years ago and have had it around as a private Perl module, but since the version of DocKnot that incorporates spin will have it as a dependency, it seemed time to publicly release it.

Compared to the last released version in 2013, it also has a few bug fixes and improvements. The module now internally handles the navbar and table of contents generation (by deferring output) rather than using a pre-scanning pass in the driver script, and fixes the title casing of DB_CONFIG in the output from one INN manual page. I also did a lot of modernization and improvements to the test suite.

You can get the latest version from CPAN or the Pod::Thread distribution page.

28 March, 2021 04:21PM

hackergotchi for Dirk Eddelbuettel

Dirk Eddelbuettel

RcppSpdlog 0.0.5 on CRAN: New upstream versions

About three months after the last update, we can announce a new version 0.0.5 of RcppSpdlog. It contains releases 1.8.3, 1.8.4 and 1.8.5 of spdlog, which were made in quick succession mid-week (while we were waiting on an update to CRAN’s own machinery); the package was processed yesterday and overnight.

RcppSpdlog bundles spdlog, a wonderful header-only C++ logging library with all the bells and whistles you would want that was written by Gabi Melman, and also includes fmt by Victor Zverovich.

The NEWS entry for this release follows.

Changes in RcppSpdlog version 0.0.5 (2021-03-27)

  • Upgraded to upstream release spdlog 1.8.5 (and 1.8.4 and 1.8.3)

  • Small enhancements to DESCRIPTION files

Courtesy of my CRANberries, there is also a diffstat report. More detailed information is on the RcppSpdlog page.

If you like this or other open-source work I do, you can sponsor me at GitHub.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

28 March, 2021 02:02PM

hackergotchi for Norbert Preining

Norbert Preining

RMS, Debian, and the world

Too much has been written, a war of support letters is going on, Debian is heading head first like Lemmings into the same war. And then, there is this by the first female President of the American Civil Liberties Union (ACLU):

Civil Rights Activist Nadine Strossen’s Response To #CancelStallman:

I find it so odd that the strong zeal for revenge and punishment if someone says anything that is perceived to be sexist or racist or discriminatory comes from liberals and progressives. There are so many violations [in cases like Stallman’s] of such fundamental principles to which progressives and liberals cling in general as to what is justice, what is fairness, what is due process.
One is proportionality: that the punishment should be proportional to the offense. Another one is restorative justice: that rather than retribution and punishment, we should seek to have the person constructively come to understand, repent, and make amends for an infraction. Liberals generally believe society to be too punitive, too harsh, not forgiving enough. They are certainly against the death penalty and other harsh punishments even for the most violent, the mass murderers. Progressives are right now advocating for the release of criminals, even murderers. To then have exactly the opposite attitude towards something that certainly is not committing physical violence against somebody, I don’t understand the double standard!
Another cardinal principle is we shouldn’t have any guilt by association. [To hold culpable] these board members who were affiliated with him and ostensibly didn’t do enough to punish him for things that he said – which by the way were completely separate from the Free Software Foundation – is multiplying the problems of unwarranted punishment. It extends the punishment where the argument for responsibility and culpability becomes thinner and thinner to the vanishing point. That is also going to have an enormous adverse impact on the freedom of association, which is an important right protected in the U.S. by the First Amendment.
The Supreme Court has upheld freedom of association in cases involving organizations that were at the time highly controversial. It started with NAACP (National Association for the Advancement of Colored People) during the civil rights movement in the 1950s and 60s, but we have a case that’s going to the Supreme Court right now regarding Black Lives Matter. The Supreme Court says even if one member of the group does commit a crime – in both of those cases physical violence and assault – that is not a justification for punishing other members of the group unless they specifically intended to participate in the particular punishable conduct.
Now, let’s assume for the sake of argument, Stallman had an attitude that was objectively described as discriminatory on the basis on race and gender (and by the way I have seen nothing to indicate that), that he’s an unrepentant misogynist, who really believes women are inferior. We are not going to correct those ideas, to enlighten him towards rejecting them and deciding to treat women as equals through a punitive approach! The only approach that could possibly work is an educational one! Engaging in speech, dialogue, discussion and leading him to re-examine his own ideas.
Even if I strongly disagree with a position or an idea, an expression of an idea, advocacy of an idea, and even if the vast majority of the public disagrees with the idea and finds it offensive, that is not a justification for suppressing the idea. And it’s not a justification for taking away the equal rights of the person who espouses that idea including the right to continue holding a tenured position or other prominent position for which that person is qualified.
But a number of the ideas for which Richard Stallman has been attacked and punished are ideas that I as a feminist advocate of human rights find completely correct and positive from the perspective of women’s equality and dignity! So for example, when he talks about the misuse and overuse and flawed use of the term sexual assault, I completely agree with that critique! People are indiscriminately using that term or synonyms to describe everything from the most appalling violent abuse of helpless vulnerable victims (such as a rape of a minor) to any conduct or expression in the realm of gender or sexuality that they find unpleasant or disagreeable.
So we see the terms sexual assault and sexual harassment used, for example, when a guy asks a woman out on a date and she doesn’t find that an appealing invitation. Maybe he used poor judgement in asking her out, maybe he didn’t, but in any case that is NOT sexual assault or harassment. To call it that is to really demean the huge horror and violence and predation that does exist when you are talking about violent sexual assault. People use the terms sexual assault and sexual harassment to refer to any comment about gender or sexuality issues that they disagree with, or a joke that might not be in the best taste. Again, is that to be commended? No! But to condemn it and equate it with a violent sexual assault is really denying and demeaning the actual suffering that people who are victims of sexual assault endure. It trivializes the serious infractions that are committed by people like Jeffrey Epstein and Harvey Weinstein. So that is one point that he made that I think is very important and that I strongly agree with.
Secondly and relatedly, [Richard Stallman] never said that he endorses child pornography, which the United States Supreme Court has defined multiple times as the sexual exploitation of an actual minor: coerced, forced sexual activity by or with that minor that happens to be filmed or photographed. That is the definition of child pornography. He never defends that! The point he makes, a very important one which the U.S. Supreme Court has also made, is mainly that we overuse and distort the term child pornography to refer to any depiction of any minor in any context that is even vaguely sexual.
So some people have not only denounced as child pornography but prosecuted and jailed loving devoted parents who committed the crime of taking a nude or semi-nude picture of their own child in a bathtub or their own child in a bathing suit. Again it is the hysteria that has totally refused to draw an absolutely critical distinction between actual violence and abuse, which is criminal and should be criminal, to any potentially sexual depiction of a minor. And I say potentially because I think if you look at a picture a parent has taken of a child in a bathtub and you see that as sexual, then I’d say there’s something in your perspective that might be questioned or challenged! But don’t foist that upon the parent who is lovingly documenting their beloved child’s life and activities without seeing anything sexual in that image.
This is a decision that involves line drawing. We tend to have this hysteria where once we hear terms like pedophilia, of course you are going to condemn anything that could possibly have that label. Of course you would. But societies around the world, throughout history, various cultures and various religions and moral positions have disagreed about at what age you respect the autonomy and individuality and freedom of choice of a young person around sexuality. And the U.S. Supreme Court held that in a case involving minors’ right to choose to have an abortion.
By the way, [contraception and abortion] is a realm of sexuality where liberals and progressives and feminists have been saying, “Yes! If you’re old enough to have sex. You should have the right to contraception and access to it. You should have the right to have an abortion. You shouldn’t have to consult with your parents and have their permission or a judge’s permission because you’re sufficiently mature.” And the Supreme Court sided in accord of that position. The U.S. Supreme Court said constitutional rights do not magically mature and spring into being only when someone happens to attain the state defined age of majority.
In other words the constitution doesn’t prevent anyone from exercising rights, including rights to sexual freedom, freedom of choice and autonomy, at a certain age! And so you can’t have it both ways. You can’t say well, we’re strongly in favor of minors having the right to decide what to do with their own bodies, to have an abortion – what is in some people’s minds murder – but we’re not going to trust them to decide to have sex with someone somewhat older than they are.
And I say somewhat older than they are because that’s something where the law has also been subject to change. On all issues of when you obtain the age of majority, states differ on that widely and they choose different ages for different activities. When you’re old enough to drive, to have sex with someone around your age, to have sex with someone much older than you. There is no magic objective answer to these questions. I think people need to take seriously the importance of sexual freedom and autonomy and that certainly includes women, feminists. They have to take seriously the question of respecting a young person’s autonomy in that area.
There have been famous cases of 18 year olds who have gone to prison because they had consensual sex with their girlfriends who were a couple of years younger. A lot of people would not consider that pedophilia and yet under some strict laws and some absolute definitions it is. Romeo and Juliet laws make an exception to pedophilia laws when there is only a relatively small age difference. But what is relatively small? So to me, especially when he says he is re-examining his position, Stallman is just thinking through the very serious debate of how to be protective and respectful of young people. He is not being disrespectful, much less wishing harm upon young people, which seems to be what his detractors think he’s doing.

Unfortunately, I don’t think the Anti-Harassment Team of Debian and others of the usual group of warriors will ever read – much less understand – what is written there. So sad.

28 March, 2021 12:44PM by Norbert Preining

hackergotchi for Emmanuel Kasper

Emmanuel Kasper

Switching to FAI (Fully Automatic Installer) for creating Vagrant Boxes

Have you heard of Vagrant? It is a command line tool to get ready-to-use, disposable virtual machines (VMs) from an online catalog. Vagrant works on Linux, FreeBSD, Windows and Mac, and you only need three commands to get a shell prompt in a VM (see the Debian wiki).
The online catalog has images for the majority of the OSes you can think of.
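Those three commands typically look like this (using the debian/testing64 box as an example):

vagrant init debian/testing64
vagrant up
vagrant ssh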

We've been building the Debian disk images for Vagrant (available on https://app.vagrantup.com/debian/) with a number of tools over the years:

  • then packer, which wraps qemu and the Debian installer CD with automated boot parameters and a preseed file.
  • and then fai-diskimage, again a wrapper over debootstrap using loopback mounts

Basically there are two categories of tools for building a disk image:

- those using an emulator and the OS installer in an automated way

- those using debootstrap/pacstrap/rpmstrap on a loopback-mounted filesystem

Personally I prefer the first approach, as you can run the build process as non-root, and you benefit from all the quality work of the official installer.
However this requires virtualization, and nested virtualization if your build process runs inside a VM. Unfortunately nested virtualization is not that common; for instance my cloud provider, and the VMs used for Debian Continuous Integration, do not support nested virtualization.
As the maintainer of fai-diskimage is a Debian Developer (hey MrFAI! :) and as the debian-cloud folks are using it for the Amazon, Azure and Google Cloud Debian images, it made sense to switch to fai-diskimage for now. The fai-diskimage learning curve is a bit steep, as you have to learn many internal concepts before using it, but once you get the bits connected it works quite well.
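To give a rough idea, a hedged sketch of a fai-diskimage invocation might look like the following; the hostname, image size, class list, and file name are all illustrative, and the real behaviour is driven by the FAI class definitions in the config space:

# build a raw disk image for host "debian-vagrant" using the listed FAI classes
# (names made up for illustration; see fai-diskimage(8) for the details)
sudo fai-diskimage -v -u debian-vagrant -S 20G \
  -c DEBIAN,BULLSEYE,AMD64,GRUB_PC debian-vagrant.raw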

28 March, 2021 08:16AM by Emmanuel Kasper ([email protected])

March 27, 2021

Andrew Cater

Debian 10.9 release - 202103271900UTC - pushing through live image testing

So we're a fair way through the release, then. Testing of almost all the standard images has finished.  Pretty much all of the disk images are now complete and in place.

People are working their way through the tests of the debian-live images in the various desktop flavours. These have to be done on real hardware - so it does take time. A new tester - peylight - has dropped in to help for the first time. Sqrt{not} has also joined us from the other end of the timezone scale - we have somebody at UTC-0700 and somebody at UTC+0430 today. [I can't remember where Linux-fan is timezone wise] All of the help from all the testers is very welcome, as ever.

A slight pause - a couple of us have a meal to eat - but it looks as if we've done well on timings. The original estimate was for 2000UTC - maybe a little after that we'll be finished and the images release can be published - there were a couple of minor hiccups but we've done well so far.

Thanks, as always, to the people behind the scenes doing all the work, to DSA and admins providing large machines for us to do the builds on and to the people who drop in and spend a few hours of their time on a working day/weekend to help out.

27 March, 2021 09:53PM by Andrew Cater ([email protected])

Debian 10.9 release - 202103272140UTC - almost there - final stage

We're almost there: the last lots of AMD64 testing on the debian-live images - a couple of willing helpers are also testing some i386 images, though these can be more problematic on low memory. Steve has just started the final stage, running the final scripts. If all goes well, they should be done within 3/4 of an hour - which should put the images in the final locations on the main mirror by about 2230 UTC. It's been something of the order of twelve hours from start to finish, which is still slightly quicker than most of the releases we've done - as ever, thanks to all. And that's it for another however long until we get to sort out 10.10 in a while.

27 March, 2021 09:46PM by Andrew Cater ([email protected])

Antonio Terceiro

Migrating from Chef™ to itamae

The Debian CI platform is comprised of 30+ (virtual) machines. Maintaining this many machines, and being able to add new ones with some degree of reliability requires one to use some sort of configuration management.

Until about a week ago, we were using Chef for our configuration management. I was, for several years, the main maintainer of Chef in Debian, so using it was natural to me, as I had used it before for personal and work projects. But last year I decided to request the removal of Chef from Debian, so that it won't be shipped with Debian 11 (bullseye).

After evaluating a few options, I believed that the path of least resistance was to migrate to itamae. itamae was inspired by Chef, and uses a DSL that is very similar to the Chef one. Even though the itamae team claim it's not compatible with Chef, the changes that I needed to make were relatively limited. The necessary code changes might look like a lot, but a large part of them could be automated or done in bulk, like doing simple search-and-replace operations and moving entire directories around.

In the rest of this post, I will describe the migration process, starting with the infrastructure changes, the types of changes I needed to make to the configuration management code, and my conclusions about the process.

Infrastructure changes

The first step was to add support for itamae to chake, a configuration management wrapper tool that I wrote. chake was originally written as a serverless remote executor for Chef, so this involved a bit of a redesign. I thought it was worth doing, because at that point chake had gained several interesting management features that were not directly tied to Chef. This work was done a bit slowly over the course of several months, starting almost exactly one year ago, and was completed 3 months ago. I wasn't in a hurry, and knew I had time before Debian 11 is released, when I would have to upgrade the platform.

After this was done, I started the work of migrating the existing Chef cookbooks to itamae, and the next sections present the main types of changes that were necessary.

During the entire process, I sent a few patches out.

Code changes

These are the main types of changes that were necessary in the configuration code to accomplish the migration to itamae.

Replace cookbook_file with remote_file

The resource known as cookbook_file in Chef is called remote_file in itamae. Fixing this is just a matter of search and replace, e.g.:

-cookbook_file '/etc/apt/apt.conf.d/00updates' do
+remote_file '/etc/apt/apt.conf.d/00updates' do

Changed file locations

The file structure assumed by itamae is a lot simpler than the one in Chef. The needed changes were:

  • static files and templates moved from cookbooks/${cookbook}/{files,templates}/default to cookbooks/${cookbook}/{files,templates}
  • recipes moved from cookbooks/${cookbook}/recipes/*.rb to cookbooks/${cookbook}/*.rb
  • host-specific files and templates are not supported directly, but can be implemented just by using an explicit source statement, like this:

    remote_file "/etc/foo.conf" do
      source "files/host-#{node['fqdn']}/foo.conf"
    end

Explicit file ownership and mode

Chef is usually designed to run as root on the nodes, and files created are owned by root and have mode 0644 by default. With itamae, files are by default owned by the user that was used to SSH into the machine. Because of this, I had to review all file creation resources and add owner, group and mode explicitly:

-cookbook_file '/etc/apt/apt.conf.d/00updates' do
-  source 'apt.conf'
+remote_file '/etc/apt/apt.conf.d/00updates' do
+  source 'files/apt.conf'
+  owner   'root'
+  group   'root'
+  mode    "0644"
 end

In the end, I guess being explicit makes the configuration code more understandable, so I take that as a win.

Different execution context

One of the major differences between Chef and itamae comes down to the execution context of the recipes. In both Chef and itamae, the configuration is written in a DSL embedded in Ruby. This means that the recipes are just Ruby code, and the difference here has to do with where that code is executed. With Chef, the recipes are always executed on the machine you are configuring, while with itamae the recipe is executed on the workstation where you run itamae, and that gets translated to commands that need to be executed on the machine being configured.

For example, if you need to configure a service based on how much RAM the machine has, with Chef you could do something like this:

# find the MemTotal: line in /proc/meminfo and take its numeric value (in kB)
total_ram = File.readlines("/proc/meminfo").find do |l|
  l.split.first == "MemTotal:"
end.split[1].to_i

file "/etc/service.conf" do
  # use 20% of the total RAM
  content "cache_size = #{total_ram / 5}KB"
end

With itamae, all that Ruby code will run on the workstation, so total_ram would contain the workstation's RAM, not the node's. In the Debian CI case, I worked around that by explicitly declaring the amount of RAM in the static host configuration, and the above construct ended up as something like this:

file "/etc/service.conf" do
  # use 20% of the total RAM
  content "cache_size = #{node['total_ram'] / 5}KB"
end
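As an aside, here is a hedged sketch of how such a static node attribute can be supplied to plain itamae (the Debian CI setup actually drives itamae through chake, and the host name, user, and file layout below are made up): node attributes go in a JSON file that is passed on the command line.

# nodes/ci-worker01.json would contain, for example:
# { "total_ram": 16384000 }
itamae ssh --host ci-worker01.example.org --user admin \
  --node-json nodes/ci-worker01.json cookbooks/service.rb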

Lessons learned

This migration is now complete, and there are a few points that I take away from it:

  • The migration is definitely viable, and I'm glad I picked itamae after all.
  • Of course, itamae is way simpler than Chef, and has fewer features. On the other hand, this means that it is a simpler package to maintain, with fewer dependencies, and keeping it up to date is a lot easier.
  • itamae is considerably slower than Chef. In my local tests, a no-op execution (e.g. re-applying the configuration a second time) against local VMs takes 3x the time with itamae that it takes with Chef.

All in all, the system is working just fine, and I consider this to have been a successful migration. I'm happy it worked out.

27 March, 2021 08:00PM