leah blogs

16aug2019 · 32, 040, 0x20, 0b100000

i am new to the world
and i walk away from myself
and i walk toward myself
i am six months
and i walk away from myself
and i walk toward myself
i am one year old
and i walk away from myself
and i walk toward myself
when i am two years
and i walk away from myself
and i walk toward myself
this is my fourth birthday
and i walk away from myself
and i walk toward myself
as a schoolchild of eight years
and i walk away from myself
and i walk toward myself
and at sixteen i hardly recognize myself
and i walk away from myself
and i walk toward myself
the thirty-second is a lovely birthday
and i walk away from myself
and i walk toward myself
i at sixty-four
no longer walk twice as far
    — Ernst Jandl, doppelt so weit (1976)

NP: Deutsche Laichen—Emanzenlesbenschlampe

24dec2018 · Merry Christmas!

Jingle all the way
@SpaceCatPics

Wishing you a merry Christmas, a lovely holiday, and a good start to the new year,
Leah Neukirchen

Merry Christmas and a Happy New Year!

NP: Laura Jane Grace & the Devouring Mothers—Apocalypse Now (& Later)

05jun2018 · GitHub: quo vadis?

As GitHub user #139 I feel compelled to say something about GitHub getting bought by Microsoft.

I still remember, back when GitHub was founded, I was both thrilled and frightened. Thrilled, because the three founders managed to bootstrap a startup the right way: profitable from day one and focused on a single, successful product that did what people wanted. I told myself that if I’d ever do a startup, I’d do it like them. (Yet they took venture capital in 2012 and 2015 and as a result grew from 50 to over 800 employees by now, operating at a loss at least in 2016 until they changed their business plans.)

Why was I frightened? I saw GitHub was immediately getting very successful, and many people, especially in the Ruby community, moved their projects to it, creating both a monopoly and a single point of failure.

One of my most successful projects, Rack, was converted from Darcs to Git in May 2008. I put it on GitHub (which was only a few months old by then) about that time, but I also provided my own Git mirror on my own infrastructure. However, development quickly shifted to GitHub only, mostly because pull requests and issues were very convenient. Over time, my skepticism vanished, using GitHub was a no-brainer, and while occasional outages reminded us of the central position GitHub is in, we didn’t do anything.

Let me emphasize a few ways GitHub vastly improved my own open source work: finally, it was easy to report issues for many projects, without having to register yet another Bugzilla account, and issues could easily link to issues in other projects. GitHub made it simple to quickly look into the actual source of many projects in a straightforward way, without having to figure out CVS checkouts or fetch tarballs. It was easy to see which people contributed to which projects, and I discovered some cool projects this way.

So, now they are getting bought by Microsoft. I’m sad that they are getting bought at all, because I think it’s very important that such a central piece of the open source community stays independent of major software vendors. As for getting bought by Microsoft, I cannot share the enthusiasm many have: it is still a huge company that makes its profits primarily from proprietary, closed-source software and vendor lock-in, and while their management certainly changed a lot in the last decade, who knows how long this will last. Worse buyers are easily imaginable, however.

It is therefore sensible to think of alternatives to GitHub. Contrary to many, I don’t think switching to alternative offerings such as GitLab.com, BitBucket or SourceForge significantly improves the situation: while GitHub’s monopoly could get whittled down, we are still dependent on another for-profit company that is likely to be acquired by a major corporation sooner or later, too.

As for my own projects, I plan to move most of my recent projects (the so-called leahutils) from GitHub to a self-hosted solution. The details are not fixed yet, but I have enough experience with GitLab that I’m sure I’ll use something else. For these projects, I also have different needs compared to what GitHub offered: I’m often the sole committer, and I prefer receiving patches by mail and refining them myself rather than telling people how to improve their pull requests. So likely, I’ll set up a mix of cgit and public-inbox, and adopt a quite different workflow.

Other projects I’m involved in, most importantly VoidLinux, are far more dependent on outside contributions and having access to CI infrastructure, which already makes it hard to move away from GitHub and Travis. For now we’ve decided to stick to GitHub, as there are more pressing issues currently, and we don’t expect GitHub to go haywire anytime soon. Still, our autonomy as an open source project is something we need to think about more often and take care of.

NP: Julia Weldon—Comatose Hope

22jan2018 · Anatomy of a Ceph meltdown

Last week, the server farm of our LMU Student Council had a major downtime over almost five days. As part of the administrator team there, I’d like to publish this post mortem to share our experiences and lessons learned to avoid situations like this in the future.

First and foremost, a downtime spanning multiple days is completely unacceptable for a central service like this (and I really wish there had been a way to fix this quicker), but the nature of the issue made it really hard to find another solution or workaround. In theory it would have been possible to set up an emergency system restored from backups, but this would have blocked hardware that we need to ensure regular operation later. Also, setting up things from scratch is likely to introduce new issues, and our resources were tied up in recovery. (Please remember that we are all unpaid volunteers who have our own studies and/or day jobs, and no one has had more experience with Ceph than what you get from reading the manual.)

A quick word on our setup: We have three file servers with 12TB storage each; each provides three Ceph OSDs, a monitor, and an MDS (to provide CephFS to a shell server and the office machines). Connected to these are two virtualization hosts that run 24 virtual machines in total in QEMU/KVM. The file servers and virtualization hosts run Gentoo, most VMs run Debian, a few run Windows. The setup is very redundant: Ceph guarantees that any single file server can drop out without problems, and if one virtualization host goes down, we can start all machines on the other host (even if main memory gets a bit tight then).

Unfortunately, Ceph itself is a single point of failure: when Ceph goes down, no virtual machine works.

A log of the events follows:

2018-01-15: At night, trying to debug an issue related to CephFS, an administrator had to restart an MDS, which failed. Then they tried to restart an OSD, which failed too. This caused the Ceph cluster to start rebalancing. I was not involved yet; as far as I know no further action was taken.

2018-01-16: Trying to restart the OSD again, we noticed that ceph-osd crashed immediately. It turned out that all three systems had been updated a few times without restarting the OSD. No OSD could start anymore. We kept the last two OSD running (this turned out to be a mistake). The file servers, running Gentoo, also had a profile update done by another administrator. We came to the conclusion that we needed to rebuild world to get into a consistent state.

Ceph and glibc were built without debugging symbols, so all information we had came from the ceph-osd output of backtrace(3), which pointed to the functions parse_network and find_ip_in_subnet_list. These functions are run very early by Ceph during configuration file parsing. I looked into the code, and it was quite simple, and only used std::string and std::list, two interfaces that changed in the recent libstdc++ ABI change.

My working theory was now that the libstdc++ ABI change between GCC 4.2 and GCC 6.2 triggered the bug.

After emerge world, which took several hours, all software was built on the new libstdc++ ABI.

ceph-osd still crashed.

Another theory was that tcmalloc was at fault, but a Ceph without tcmalloc failed as well.

We decided to build a debugging version of Ceph to inspect the issue deeper. Compiling Ceph on Gentoo failed twice: (1) Building Ceph failed due to Ceph trying to run git, which triggered a sandbox exception since we have a /.git directory in the root folder. This could be worked around by setting GIT_CEILING_DIRECTORIES. (2) Building Ceph with debugging symbols took more than 32 GB of disk space, so we had to free up space for that first.

2018-01-17: Debugging of Ceph intensified. It turned out the call to parse_network triggered a data corruption in a std::list<std::string>, which caused the destructor of this data structure to segfault. Tracking down the exact place where this corruption happened turned out to be hard: gdb can print STL data structures, but to create watchpoints on certain addresses you need to reverse-engineer the actual memory layouts. (For a short time, we assumed the switch to short string optimization was at fault, but spelling out the IPv6 address didn’t help.) Finally I managed to set a watchpoint, and it turned out inet_pton(3) triggered an overflow, which resulted in corruption of the next variable on stack, the list mentioned above. Googling some more turned up Ceph Bug #19371, which tells us that Ceph tried to parse an IPv6 address into a struct sockaddr, which only has space for an IPv4 address! This explained the data corruption. A fix was published in Ceph 10.2.8. We still ran Ceph 10.2.3, the version marked stable in Gentoo. (Up to this point, we thought the quite old version of Ceph was not at fault, since it ran well before!)

We decided to update to Ceph 10.2.10.

The OSD crashed again, but for different reasons: first, the Gentoo init.d scripts were broken; second, Ceph now expects to run as a user ceph (it ran as root before). We started ceph-osd as root again.

The OSDs started fine, so all OSDs were restarted now. The MDS reported degradation and the storage itself was degraded a lot (this means the redundancy requirement was not met) and unbalanced.

Ceph started recovery, but for as yet unknown reasons the OSDs started to crash often and consume vast amounts of RAM (3-5x as much as usual), which first drove the system into swapping; then Ceph started to disconnect the OSDs because they were too slow to respond, which slowed down recovery even further.

We assume this is Ceph Bug #21761.

We reduced osd_map_cache trying to lower RAM usage, but we are not sure this had any effect.

We started adding more swap, this time on the SSDs that normally serve as Ceph cache. This made the situation a bit better: the OSDs started to crash later and were more responsive.

2018-01-18: Ceph recovery was still slow, so we looked for more information. The MDS was still degraded, and we did not know how to fix this. Reading the mailing list we learned to set noout (we knew that) and nodown, to forcibly keep OSDs from dropping out of the cluster. We also learned to set noup to let the OSDs deal with the backlog, since the osdmap epochs were seriously out of sync (up to 10000). After setting noup and letting the OSDs churn (this took several hours at high CPU load), the MDS was not degraded anymore! The system continued to balance and started backfilling.
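For reference, these flags are set and cleared with ceph osd set and ceph osd unset. The following is a minimal sketch of that step, not a record of what we typed: the run helper and its APPLY switch are hypothetical additions so that the script only prints the commands unless asked to execute them.

```shell
#!/bin/sh
# Hypothetical dry-run helper: print the command unless APPLY=1 is set.
run() {
  if [ "${APPLY:-0}" = 1 ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

# Freeze cluster membership while the OSDs work through their osdmap backlog.
freeze_osd_state() {
  run ceph osd set noout   # stopped OSDs are not marked "out", so no rebalancing starts
  run ceph osd set nodown  # slow OSDs are not marked "down"
  run ceph osd set noup    # booting OSDs are not marked "up" yet
}

# Let the OSDs rejoin once they have caught up.
release_osd_state() {
  run ceph osd unset noup
}

freeze_osd_state
```

Running it with APPLY=1 would execute the commands against the cluster; the remaining flags are cleared with ceph osd unset in the same way once recovery is done.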

At some point we took individual OSDs (backed by XFS) down to chown their storage to ceph:ceph, which took several hours each.

OSD RAM usage normalized.

2018-01-19: Backfilling progressed slowly, so we increased osd_max_backfills and osd_recovery_threads. We set noscrub and nodeep-scrub to reduce non-recovery I/O. At some point later at night, the system went from HEALTH_ERR to HEALTH_WARN again!
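A sketch of what this tuning can look like on the command line, assuming the standard ceph CLI. The backfill value is purely illustrative (it is not the value we used), and the run helper with its APPLY switch is a hypothetical dry-run wrapper.

```shell
#!/bin/sh
# Hypothetical dry-run helper: print the command unless APPLY=1 is set.
run() {
  if [ "${APPLY:-0}" = 1 ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

# speed_up_recovery [BACKFILLS]: pause scrubbing and allow more parallel backfills.
speed_up_recovery() {
  backfills=${1:-2}  # illustrative default, pick a value that suits the hardware
  run ceph osd set noscrub       # no new scrubs are scheduled
  run ceph osd set nodeep-scrub  # no new deep scrubs are scheduled
  # Change the setting on all running OSDs without restarting them.
  run ceph tell 'osd.*' injectargs "--osd-max-backfills $backfills"
}

speed_up_recovery 2
```

Settings changed with injectargs only last until the OSD restarts, so they do not need to be reverted in ceph.conf afterwards; the scrub flags have to be cleared again with ceph osd unset.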

2018-01-20: The OSDs all went back to active+clean. Two things were stopping us from HEALTH_OK: we needed to set require_jewel_osds and sortbitwise. Setting both was unproblematic and worked fine.

We started to bring up the first virtual machines again. This caused some minor fallout:

  • The LDAP server started fine, but did not bring up its IPv6 route (a Debian issue we hit before), so the mail server could not identify accounts. This was fixed quickly.
  • The mailing list server received a few mails to bigger mailing lists, and started to send them out all at once, which caused us to exceed quota at our upstream SMTP server (and the quota was too low, as it turned out later). This meant we had a backlog of over 5000 messages for several hours.

At the end of the day, all systems were operational again.

There is no evidence that data was lost during the downtime. It is possible that inbound mail was bounced at the gateway, and thus not delivered, but in this case the sender was notified of this fact. All other mail that was sent inbound was delivered when the mail server came back up.

Lessons learned:

  • If we notice something is going wrong with Ceph, we will not hesitate to shut down the cluster prematurely. It’s better to have 30 min downtime once, than a mess of this scale.
  • We should not update Ceph on all machines at once. After updating Ceph (or other critical parts of the system), we will check all services restart fine.
  • We will build glibc with debugging symbols. (I think this would have pointed me to inet_pton quicker and saved a few hours of debugging.)
  • We will track Ceph releases more closely, and generally trust upstream releases (I don’t know why Gentoo does not stabilize newer releases of Ceph, they fix significant bugs).

    (At some point I had proposed to run the OSD in a Debian chroot, but stretch contains Ceph 10.2.5 which was affected by the same bugs.)

  • We need to find a solution to fix the Debian IPv6 issue, which bit us a bit too often.

NP: Light Bearer—Aggressor & Usurper

24dec2017 · Merry Christmas!

Jingle all the way
Comic © Wondermark

Merry Christmas, a lovely holiday, and a good start to the new year!

Merry Christmas and a Happy New Year!

NP: EA80—Nr. 1

27feb2017 · A time-proven zsh prompt

I’ve been using the shell prompt below since 2013 and only slightly tweaked it over time. The most significant change was probably displaying the Git branch.

The basic idea of my prompt is to not show redundant or obvious information. This allows the prompt to be short, yet useful.

By default, the prompt displays the hostname, shortened directory, and a % to signify a zsh. The hostname is bold to make it stand out when you are scrolling, and the sigil is colored to mark the beginning of the command. It looks like this:

juno ~% ./mycommand -x

Long directory names are truncated in the middle:

juno /tmp/dirwithare…gname%
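To illustrate the idea of truncating in the middle, here is a small standalone sketch. It is not the code from my prompt: truncate_middle is a hypothetical helper, and it splits the kept characters evenly between head and tail, which differs from the split in the example above.

```shell
#!/bin/sh
# truncate_middle STRING MAX: shorten STRING to at most MAX characters by
# replacing its middle with an ellipsis. (Hypothetical helper for illustration.)
truncate_middle() {
  # The arguments are read from ARGV inside BEGIN, so awk never opens them as files.
  awk 'BEGIN {
    s = ARGV[1]; max = ARGV[2] + 0
    n = length(s)
    if (n > max) {
      keep = max - 1               # characters left after the ellipsis
      head = int((keep + 1) / 2)   # the head gets the extra one if keep is odd
      tail = keep - head
      s = substr(s, 1, head) "…" substr(s, n - tail + 1)
    }
    print s
    exit
  }' "$1" "$2"
}

truncate_middle abcdefghijklmnopqrstuvwxyz 11   # prints abcde…vwxyz
```

Keeping both ends visible is the point: the start of a directory name and its end are usually what tells two long names apart.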

In rare cases, only showing two levels of hierarchy may be confusing, so you can set $NDIRS to something higher, e.g. 4:

juno deeply/nested/dir/structure%

When the previous command failed, the prompt also displays the exit status of the previous command:

juno 42? ~%

When there are background jobs running, the prompt shows how many there are:

juno 1& ~%

Note how the status and job display use the associated ASCII symbols.
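The hostname, status, and job parts can be expressed with stock zsh prompt escapes. The fragment below is a simplified sketch, not the actual definition from the .zshrc: it leaves out colors, middle truncation, and the Git and SSH handling, and it hardcodes two directory levels.

```shell
# zsh prompt fragment (simplified sketch):
#   %B%m%b        hostname in bold
#   %(?..%?? )    nothing after a successful command, else the exit status plus "?"
#   %(1j.%j& .)   the number of jobs plus "&", if there is at least one job
#   %2~           the last two components of the current directory
#   %#            "%" for a normal user, "#" for root
PROMPT='%B%m%b %(?..%?? )%(1j.%j& .)%2~%# '
```

The conditional escapes are what keep the prompt short: both the status and the job count disappear entirely in the common case.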

When we are in a Git repository, the current branch is displayed inline as part of the base directory (when possible), or as a prefix, together with the repo name. By design, in the most common cases this keeps the prompt very short:

juno prj/rack@master%
juno rack@master/doc%
juno rack@master doc/Rack%

When the prompt detects a SSH session, the prompt sigil is doubled, so we are a bit more careful there:

hecate prj/lr%%

When the shell runs as root, the sigil is red (I don’t usually run zsh as root):

juno /etc#

That’s it, essentially. Apart from the Git integration, it’s really straightforward. Not visible above is trick 4 to simplify pasting of old lines, and how it updates the title of terminal emulators to hostname: dir respectively hostname: current-command (which needs quite complicated quoting).

The whole thing is defined in the PROMPT section of my .zshrc.

NP: Light Bearer—Aggressor & Usurper

02jan2017 · zz: a smart and efficient directory changer

A nice feature I’ve become used to in the last year is a so-called “smart directory changer” that keeps track of the directories you change into, and then lets you jump to popular ones quickly, using fragments of the path to find the right location.

There is quite some prior art in this, such as autojump, fasd or z, but I could not resist building my own implementation of it, optimized for zsh.

As far as I can see, my zz directory changer is the only one with a “pay-as-you-go” performance impact, i.e., not every directory change is slowed down, but only every use of the smart matching functionality.

The idea is pretty easy: we add a chpwd hook to zsh to keep track of directory changes, and log for each change a line looking like “0 $epochtime 1 $path” into a file ~/.zz. This is an operation with effectively constant cost on a Unix system.

chpwd_zz() {
  print -P '0\t%D{%s}\t1\t%~' >>~/.zz
}
chpwd_functions=( ${(kM)functions:#chpwd?*} )

The actual jumping function is called zz:

zz() {

How does the matching work? It’s an adaptation of the z algorithm: The lines of ~/.zz are tallied by directory and last-used time stamp, so for example the lines

0 1483225200 1 ~/src
0 1483225201 1 ~/tmp
0 1483225202 1 ~/src
0 1483225203 1 ~/tmp
0 1483225204 1 ~/src

would turn into

6 1483225204 3 ~/src
4 1483225203 2 ~/tmp

Also, the initial number, the effective score of the directory, is computed: We take the relative age of the directory (that is, seconds since we last went there), and boost or dampen the results: the frequency is multiplied by 4 for directories not older than 1 hour, doubled for directories we went into today, halved for directories we went into this week, and divided by 4 otherwise.

  awk -v ${(%):-now=%D{%s}} <~/.zz '
    function r(t,f) {
      age = now - t
      return (age<3600) ? f*4 : (age<86400) ? f*2 : (age<604800) ? f/2 : f/4
    }
    { f[$4]+=$3; if ($2>l[$4]) l[$4]=$2 }
    END { for(i in f) printf("%d\t%d\t%d\t%s\n",r(l[i],f[i]),l[i],f[i],i) }' |

By design, this tallied file can be appended again with new lines originating from chpwd, and recomputed whenever needed.
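To try the tally outside of zsh, the same computation can be wrapped in a small standalone function. This is a sketch under a few assumptions: tally is a hypothetical name, the current time is passed as an argument so the result is reproducible, and fields are split on tabs only (unlike the snippet above, which uses awk’s default whitespace splitting).

```shell
#!/bin/sh
# tally NOW: read lines of the form "score<TAB>time<TAB>count<TAB>path" on stdin,
# merge them per path, and print them with a fresh score, best score first.
tally() {
  awk -F'\t' -v now="$1" '
    # t: last-used time stamp, f: accumulated frequency
    function score(t, f,    age) {
      age = now - t
      return (age < 3600) ? f*4 : (age < 86400) ? f*2 : (age < 604800) ? f/2 : f/4
    }
    { freq[$4] += $3; if ($2 > last[$4]) last[$4] = $2 }
    END {
      for (p in freq)
        printf("%d\t%d\t%d\t%s\n", score(last[p], freq[p]), last[p], freq[p], p)
    }' | sort -n -r
}

# The five sample lines from above, tallied a bit more than an hour later:
printf '0\t%s\t1\t%s\n' \
  1483225200 '~/src' 1483225201 '~/tmp' 1483225202 '~/src' \
  1483225203 '~/tmp' 1483225204 '~/src' | tally 1483230000
```

With this value of now, both directories fall into the “today” bracket, so the frequencies 3 and 2 are doubled and the output is the two tallied lines shown earlier (6 for ~/src, 4 for ~/tmp).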

The output of this tally is then sorted by age, truncated to 9000 lines, then sorted by score. (My ~/.zz is only 350 lines, however.)

      sort -k2 -n -r | sed 9000q | sort -n -r -o ~/.zz

With this precomputed tally (which is generated in linear time), finding the best match is easy. It is the first string that matches all arguments:

  if (( $# )); then
    local p=$(awk 'NR != FNR { exit }  # exit after first file argument
                   { for (i = 3; i < ARGC; i++) if ($4 !~ ARGV[i]) next
                     print $4; exit }' ~/.zz ~/.zz "$@")
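The awk invocation above uses a small trick: the file is passed twice, so that the search patterns start at ARGV[3], and reaching the second copy (NR != FNR) means nothing matched, so awk exits before it would try to open a pattern as a file. The same logic as a standalone sketch follows; best_match is a hypothetical name, the sample paths are made up, and the function assumes the file is not empty.

```shell
#!/bin/sh
# best_match FILE PATTERN...: print the path of the first line in FILE whose
# path matches every PATTERN. FILE is expected to be sorted by score already.
best_match() {
  file=$1; shift
  awk -F'\t' '
    NR != FNR { exit }   # reached the second copy of FILE: nothing matched
    { for (i = 3; i < ARGC; i++) if ($4 !~ ARGV[i]) next
      print $4; exit }' "$file" "$file" "$@"
}

# Made-up sample data in the tallied format.
zzfile=$(mktemp)
printf '%s\t%s\t%s\t%s\n' \
  6 1483225204 3 '~/src/rack' \
  4 1483225203 2 '~/tmp' \
  2 1483225100 1 '~/src/rack/doc' >"$zzfile"

best_match "$zzfile" rack       # prints ~/src/rack
best_match "$zzfile" rack doc   # prints ~/src/rack/doc
rm -f "$zzfile"
```

Since the patterns are used as regular expressions, fragments like “rack” match anywhere in the path, and the score order of the file decides which candidate wins.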

If nothing was found, we bail with exit code 1. If zz is used interactively, it changes into the best match, else the best match is just printed. This allows using things like cp foo.mkv $(zz mov).

    [[ $p ]] || return 1
    local op=print
    [[ -t 1 ]] && op=cd
    if [[ -d ${~p} ]]; then
      $op ${~p}
    else

If we found a directory that doesn’t exist anymore, we clean up the ~/.zz file, and try it all over.

      # clean nonexisting paths and retry
      while read -r line; do
        [[ -d ${~${line#*$'\t'*$'\t'*$'\t'}} ]] && print -r $line
      done <~/.zz | sort -n -r -o ~/.zz
      zz "$@"
    fi

With no arguments, zz simply prints the top ten directories.

  else
    sed 10q ~/.zz
  fi
}

I actually shortcut zz to z and add a leading space to not store z calls into history:

alias z=' zz'

The full code (possibly updated) can be found as usual in my .zshrc.

I use lots of shell hacks, but zz definitely is among my most successful ones.

NP: Leonard Cohen—Leaving The Table

24dec2016 · Merry Christmas!

Comic © Liz Climo

Wishing you a merry Christmas, a lovely holiday, and a good start to the new year,
Christian Neukirchen

Merry Christmas and a Happy New Year!

NP: Against Me!—Haunting, Haunted, Haunts

15jan2016 · Dear Github

These kinds of posts seem popular these days.

My top six of features GitHub is missing:

  1. Searching for text in commit messages. Fixed 2017-01-04. About 2/3 of the repos I clone, I solely clone to run git log --grep.
  2. Searching in the wiki. Fixed 2016-08-08.
  3. Archive tarballs with submodule checkouts included; else submodule usage is totally pointless.
  4. Marking issues private to committers. Useful both for embargoed security issues and to keep out an angry mob.
  5. Being able to disable pull requests. For projects that use Github mainly as a mirror.
  6. IPv6 support. It’s 2016, damnit.

Sincerely,
chris2

NP: Revolte Springen—Hinter den Barrikaden

24dec2015 · Merry Christmas!

Consumers' crèche

Wishing you a merry Christmas, a lovely holiday, and a good start to the new year, Christian Neukirchen

Merry Christmas and a Happy New Year!

NP: Elende Bande—Uns das Leben

Copyright © 2004–2019