npm packages on our gitlab instance npm registry, and even publish them into the npmjs registry. To publish the packages we have added rules to the gitlab-ci configuration of the relevant repositories and we publish them when a tag is created. As we are lazy by definition, I configured the system to use the tag as the package version; I tested if the contents of the package.json were in sync with the expected version and, if they were not, I updated the file and did a force push of the tag with the updated file, using the following code on the script that publishes the package:
# Update package version & add it to the .build-args
INITIAL_PACKAGE_VERSION="$(npm pkg get version | tr -d '"')"
npm version --allow-same-version --no-commit-hooks --no-git-tag-version \
  "$CI_COMMIT_TAG"
UPDATED_PACKAGE_VERSION="$(npm pkg get version | tr -d '"')"
echo "UPDATED_PACKAGE_VERSION=$UPDATED_PACKAGE_VERSION" >> .build-args

# Update tag if the version was updated or abort
if [ "$INITIAL_PACKAGE_VERSION" != "$UPDATED_PACKAGE_VERSION" ]; then
  if [ -n "$CI_GIT_USER" ] && [ -n "$CI_GIT_TOKEN" ]; then
    git commit -m "Updated version from tag $CI_COMMIT_TAG" package.json
    git tag -f "$CI_COMMIT_TAG" -m "Updated version from tag"
    git push -f -o ci.skip origin "$CI_COMMIT_TAG"
  else
    echo "!!! ERROR !!!"
    echo "The updated tag could not be uploaded."
    echo "Set CI_GIT_USER and CI_GIT_TOKEN or fix the 'package.json' file"
    echo "!!! ERROR !!!"
    exit 1
  fi
fi
tag and update it, but I dropped the idea pretty soon as there were multiple issues to consider (i.e. we can have tags pointing to commits present in multiple branches, and even if a tag only points to one branch it does not have to be the HEAD of that branch, making the inclusion difficult). In any case this system was working, so we left it until we started to publish to the NPM Registry; as we are using a token to push the packages that we don't want all developers to have access to (right now it would not matter, but when the team grows it will), I started to use gitlab protected branches on the projects that need it, adjusting the .npmrc file using protected variables. The problem then was that we can no longer do a standard force push for a branch (that is the main point of the protected branches feature) unless we use the gitlab api, so the tags with the wrong version started to fail. As the way things were being done seemed dirty anyway, I thought that the best way of fixing things was to forbid users to push a tag that includes a version that does not match the package.json version. After thinking about it we decided to use git hooks on the gitlab server for the repositories that need it; as we are only interested in tags we are going to use the update hook, which is executed once for each ref to be updated and takes three parameters: the name of the ref being updated, the old object name stored in the ref and the new object name to be stored in it.
To install the hook we need the gitaly relative path of each repo to locate it on the server filesystem (as I said we are using docker and the gitlab data directory is on /srv/gitlab/data, so the path to the repo has the form /srv/gitlab/data/git-data/repositories/@hashed/xx/yy/hash.git). Once we have the directory we need to:

- create a custom_hooks sub directory inside it,
- add the update script to it (as we only need one script we used that instead of creating an update.d directory; the good thing is that this will also work with a standard git server renaming the base directory to hooks),
- make the script executable and give it the right ownership:

$ cd /srv/gitlab/data/git-data/repositories/@hashed/xx/yy/hash.git
$ mkdir custom_hooks
$ edit_or_copy custom_hooks/update
$ chmod 0755 custom_hooks/update
$ chown --reference=. -R custom_hooks

The update script we are using is as follows:
#!/bin/sh
set -e
# kyso update hook
#
# Right now it checks version.txt or package.json versions against the tag
# name (it supports a 'v' prefix on the tag)

# Arguments
ref_name="$1"
old_rev="$2"
new_rev="$3"

# Initial test
if [ -z "$ref_name" ] || [ -z "$old_rev" ] || [ -z "$new_rev" ]; then
  echo "usage: $0 <ref> <oldrev> <newrev>" >&2
  exit 1
fi

# Get the tag short name
tag_name="${ref_name##refs/tags/}"

# Exit if the update is not for a tag
if [ "$tag_name" = "$ref_name" ]; then
  exit 0
fi

# Get the null rev value (string of zeros)
zero="$(git hash-object --stdin </dev/null | tr '0-9a-f' '0')"

# Get if the tag is new or not
if [ "$old_rev" = "$zero" ]; then
  new_tag="true"
else
  new_tag="false"
fi

# Get the type of revision:
# - delete: if the new_rev is zero
# - commit: un-annotated tag (the ref points directly to a commit)
# - tag: annotated tag (the ref points to a tag object)
if [ "$new_rev" = "$zero" ]; then
  new_rev_type="delete"
else
  new_rev_type="$(git cat-file -t "$new_rev")"
fi

# Exit if we are deleting a tag (nothing to check here)
if [ "$new_rev_type" = "delete" ]; then
  exit 0
fi

# Check the version against the tag (supports version.txt & package.json)
if git cat-file -e "$new_rev:version.txt" >/dev/null 2>&1; then
  version="$(git cat-file -p "$new_rev:version.txt")"
  if [ "$version" = "$tag_name" ] || [ "$version" = "${tag_name#v}" ]; then
    exit 0
  else
    EMSG="tag '$tag_name' and 'version.txt' contents '$version' don't match"
    echo "GL-HOOK-ERR: $EMSG"
    exit 1
  fi
elif git cat-file -e "$new_rev:package.json" >/dev/null 2>&1; then
  version="$(
    git cat-file -p "$new_rev:package.json" | jsonpath version | tr -d '[]"'
  )"
  if [ "$version" = "$tag_name" ] || [ "$version" = "${tag_name#v}" ]; then
    exit 0
  else
    EMSG="tag '$tag_name' and 'package.json' version '$version' don't match"
    echo "GL-HOOK-ERR: $EMSG"
    exit 1
  fi
else
  # No version.txt or package.json file found
  exit 0
fi
Some notes about the script:

- It only does something for tags: if the ref_name does not have the prefix refs/tags/ the script does an exit 0.
- Although we compute if the tag is new or not, we are not using the value (in gitlab that is handled by the protected tag feature).
- When deleting a tag the script does an exit 0; we don't need to check anything in that case.
- We compute if the tag is annotated or not (we set the new_rev_type to tag or commit, but we don't use the value).
- We first check the version.txt file and if it does not exist we check the package.json file; if it does not exist either we do an exit 0, as there is no version to check against and we allow that on a tag.
- We add the GL-HOOK-ERR: prefix to the messages to show them on the gitlab web interface (can be tested creating a tag from it).
- To get the version from the package.json file we use the jsonpath binary (it is installed by the jsonpath ruby gem) because it is available on the gitlab container (initially I used sed to get the value, but a real JSON parser is always a better option); see the example after this list.
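As an illustration of that pipeline: judging from the tr -d '[]"' filter used in the hook, the jsonpath binary prints its result as a JSON array, so a check against a sample file would look roughly like this (the sample file and the exact output format are my assumption, not a verified transcript):

$ echo '{"name": "foo", "version": "1.2.3"}' > package.json
$ jsonpath version package.json
["1.2.3"]
$ jsonpath version package.json | tr -d '[]"'
1.2.3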
Now, if we push a tag to a repository that has a package.json file and the tag does not match the version (if version.txt is present it takes precedence) the push fails. If the tag matches or the files are not present the tag is added if the user has permission to add it in gitlab (our hook is only executed if the user is allowed to create or update the tag).
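The hook can also be exercised by hand before a real push, calling it from inside the bare repository with the same arguments git would pass (the tag name here is made up):

$ cd /srv/gitlab/data/git-data/repositories/@hashed/xx/yy/hash.git
$ zero="$(git hash-object --stdin </dev/null | tr '0-9a-f' '0')"
$ ./custom_hooks/update "refs/tags/1.2.3" "$zero" "$(git rev-parse HEAD)"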
Displayed information about the physical volume in order to shrink it, then removed the old swap volume, resized the physical volume and recreated a smaller swap:

sudo lvremove /dev/larjona-pc-vg/swap_1
sudo pvresize --setphysicalvolumesize 380G /dev/mapper/cryptdisk
sudo pvchange -x y /dev/mapper/cryptdisk
sudo lvcreate -L 4G -n swap_1 larjona-pc-vg
sudo mkswap -L swap_1 /dev/larjona-pc-vg/swap_1
Then reduced the sda3 partition with the KDE partition manager (it took a while), and copied it to the new disk. Turned off the computer and unplugged the old disk. Started the computer with the Debian 11 Live USB again, UEFI boot. Now, to make my system boot:
sudo pvs -v --segments --units s /dev/mapper/cryptdisk
sudo cryptsetup -b 838860800 resize cryptdisk
sudo cryptsetup status cryptdisk
sudo vgchange -a n vgroup
sudo vgchange -an
sudo cryptsetup luksClose cryptdisk
sudo cryptsetup luksOpen /dev/sda3 cryptdisk
sudo vgscan --mknodes
sudo vgchange -ay
sudo mount /dev/mapper/larjona--pc--vg-root /mnt
sudo mount /dev/sda2 /mnt/boot
sudo mount /dev/sda1 /mnt/boot/efi
mount --rbind /sys /media/linux/sys
mount -t efivarfs none /sys/firmware/efi/efivars
for i in /dev /dev/pts /proc /run; do sudo mount -B $i /mnt$i; done
sudo chroot /mnt
Edited /mnt/etc/crypttab to reflect the name of the new encrypted partition and edited /mnt/etc/fstab to paste the UUIDs of the new partitions.
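For illustration, the edited entries end up looking something like this (the UUIDs here are made up; the real values come from blkid):

# /mnt/etc/crypttab (made-up UUID)
cryptdisk UUID=2f9a8b1c-aaaa-bbbb-cccc-1234567890ab none luks
# /mnt/etc/fstab (made-up UUID for the EFI partition; /boot is similar)
/dev/mapper/larjona--pc--vg-root /          ext4  errors=remount-ro  0  1
UUID=A1B2-C3D4                   /boot/efi  vfat  umask=0077         0  1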
Then ran grub-install and reinstalled the kernels as noted in the reference, rebooted, and logged in my Plasma desktop. (Well, the actual process was not so smooth, but after several tries and errors and searching for help I managed to get the needed commands to make my system boot from the new disk.)
This post describes how this blog is built with hugo and the papermod theme, how I publish it using nginx, how I've integrated the remark42 comment system and how I've automated its publication using json2file-go. It is a long post, but I hope that at least parts of it can be interesting for some; feel free to ignore it if that is not your case.
The site is configured through the config.yml file (probably some of the settings are not required nor being used right now, but I keep the file current so it always has the latest version).
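The full file is long and isn't reproduced here; as a rough sketch, the options discussed in the notes below map to something like this (key names follow the Hugo and PaperMod documentation, and the values are only what the notes describe):

params:
  assets:
    disableHLJS: true      # avoid hljs styles colliding with rouge
  ShowToc: true
  TocOpen: false           # ToC starts collapsed
  profileMode:
    enabled: false
  homeInfoParams:
    Title: "..."           # actual content omitted
markup:
  asciidocExt:
    backend: html5s
    extensions: [asciidoctor-diagram]   # assumed extension list
    workingFolderCurrent: true          # needed by asciidoctor-diagram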
Some notes about the settings: assets.disableHLJS is set to true because we plan to use adoc and the inclusion of the hljs assets adds styles that collide with the ones used by rouge.
ShowToc is set to true and the TocOpen setting is set to false to make the ToC appear collapsed initially. My plan was to use the asciidoctor ToC, but after trying I believe that the theme one looks nice and I don't need to adjust styles, although it has some issues with the html5s processor (the admonition titles use <h6> and they are shown on the ToC, which is weird); to fix it I've copied the layouts/partials/toc.html to my site repository and replaced the range of headings to end at 5 (that still seems a lot, but as I don't think I'll use that heading level on the posts it doesn't really matter).
The params.profileMode values are adjusted, but for now I've left it disabled setting its enabled value to false, and I've set the homeInfoParams to show more or less the same content with the latest posts under it (I've added some styles to my custom.css style sheet to center the text and image of the first post to match the look and feel of the profile).
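The centering itself is trivial; something along these lines in custom.css does it (the .first-entry selector is the PaperMod class for the first post, if I read the theme right):

.first-entry {
  text-align: center;
}
.first-entry img {
  display: block;
  margin: 0 auto;
}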
On the asciidocExt section I've adjusted the backend to use html5s, I've added the asciidoctor extensions I need and adjusted the options required to make asciidoctor-diagram work right (haven't tested it yet).
To adjust the appearance of the html5s processor output I've added some files to the assets/css/extended directory:

- assets/css/extended/custom.css makes the homeInfoParams look like the profile page, and I've also changed a little bit some theme styles to make things look better with the html5s output,
- assets/css/extended/adoc.css contains some styles taken from the asciidoctor-default.css (see this blog post about the original file); mine is the same after formatting it with css-beautify and editing it to use variables for the colors to support light and dark themes,
- a theme-vars.css file changes the highlighted code background color and adds the color definitions used by the admonitions.
The adoc styles use font-awesome, so I've downloaded its resources for version 4.7.0 (the one used by asciidoctor), storing the font-awesome.css on the assets/css/extended dir (that way it is merged with the rest of the .css files) and copying the fonts to the static/assets/fonts/ dir (they will be served directly):
FA_BASE_URL="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/4.7.0"
curl "$FA_BASE_URL/css/font-awesome.css" \
  > assets/css/extended/font-awesome.css
for f in FontAwesome.otf fontawesome-webfont.eot \
         fontawesome-webfont.svg fontawesome-webfont.ttf \
         fontawesome-webfont.woff fontawesome-webfont.woff2; do
  curl "$FA_BASE_URL/fonts/$f" > "static/assets/fonts/$f"
done
As hljs is disabled, the syntax highlighting is done at build time by asciidoctor (it uses rouge), so we need a css file to do the highlight styling; as rouge provides a way to export the styles, I've created the assets/css/extended/rouge.css file with the following command:

rougify style thankful_eyes > assets/css/extended/rouge.css
To support the html5s backend admonitions I've added a variation of the example found on this blog post to assets/js/adoc-admonitions.js and enabled its minified use on the layouts/partials/extend_footer.html file, adding the following lines to it:

{{- $admonitions := slice (resources.Get "js/adoc-admonitions.js")
  | resources.Concat "assets/js/adoc-admonitions.js" | minify | fingerprint }}
<script defer crossorigin="anonymous" src="{{ $admonitions.RelPermalink }}"
  integrity="{{ $admonitions.Data.Integrity }}"></script>
For the comments I've created a layouts/partials/comments.html with content based on the remark42 documentation, including extra code to sync the dark/light setting with the one set on the site. In development I use it with anonymous comments enabled, but to avoid SPAM the production site uses social logins (for now I've only enabled Github & Google; if someone requests additional services I'll check them, but those were the easy ones for me initially). To support theme switching with remark42 I've also added the following inside the layouts/partials/extend_footer.html file:

{{- if (not site.Params.disableThemeToggle) }}
<script>
  /* Function to change theme when the toggle button is pressed */
  document.getElementById("theme-toggle").addEventListener("click", () => {
    if (typeof window.REMARK42 != "undefined") {
      if (document.body.className.includes('dark')) {
        window.REMARK42.changeTheme('light');
      } else {
        window.REMARK42.changeTheme('dark');
      }
    }
  });
</script>
{{- end }}

That way when the theme-toggle button is pressed we change the remark42 theme before the PaperMod one (that's needed here only; on page loads the remark42 theme is synced with the main one using the code from the comments.html partial).
To work on the site locally I use docker-compose. To run it properly we have to create the .env file with the current user ID and GID on the variables APP_UID and APP_GID (if we don't do it the files can end up being owned by a user that is not the same as the one running the services):
$ echo "APP_UID=$(id -u)\nAPP_GID=$(id -g)" > .env
The Dockerfile used to generate the sto/hugo-adoc image uses the docker-asciidoctor image as the base; the idea is that this image has all I need to work with asciidoctor, and to use hugo I only need to download the binary from their latest release at github (as we are using an image based on alpine we also need to install the libc6-compat package, but once that is done things are working fine for me so far). The image does not launch the server by default because I don't want it to; in fact I use the same image to publish the site in production, simply calling the container without the arguments passed on the docker-compose.yml file (see later).
When running the containers with docker compose up (the new syntax, available if you have the docker-compose-plugin package installed) we also launch a nginx container and the remark42 service, so we can test everything together. The remark42 image is the original one with an updated version of the init.sh script; the updated init.sh is similar to the original, but allows us to use an APP_GID variable and updates the /etc/group file of the container so the files get the right user and group (with the original script the group is always 1001). The environment file used with remark42 for development is quite minimal, and the nginx/default.conf file used to publish the service locally is simple too.
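The compose file itself isn't reproduced above; a rough sketch of what a development setup matching that description could look like (everything except the sto/hugo-adoc image name is an assumption: ports, paths and the remark42 env file name):

services:
  hugo:
    build: .
    image: sto/hugo-adoc
    user: "${APP_UID}:${APP_GID}"
    volumes:
      - .:/documents
    command: server --bind 0.0.0.0 -D -F   # only in development
  nginx:
    image: nginx:alpine
    volumes:
      - ./nginx/default.conf:/etc/nginx/conf.d/default.conf:ro
      - ./public:/usr/share/nginx/html:ro
    ports:
      - "1313:80"
  remark42:
    build: ./remark42        # original image plus the updated init.sh
    env_file:
      - ./remark42/env.dev   # assumed name
    volumes:
      - ./remark42/var:/srv/var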
On the server side I use:

- git to clone & pull the repository,
- jq to parse json files from shell scripts,
- json2file-go to save the webhook messages to files,
- inotify-tools to detect when new files are stored by json2file-go and launch scripts to process them,
- nginx to publish the site using HTTPS and work as proxy for remark42 (I run it using a container),
- task-spooler to queue the scripts that update the deployment.
To run the containers I've installed docker and docker compose from the debian packages on the docker repository:

- docker-ce to run the containers,
- docker-compose-plugin to run docker compose (it is a plugin, so no - in the name).

To be able to read the git repository I've created a deploy key, added it to gitea and cloned the project on the /srv/blogops PATH (that route is owned by a regular user that has permissions to run docker, as I said before).
To compile the site with hugo we are using the docker-compose.yml file seen before; to be able to run it, first we build the container images and, once we have them, we launch docker compose run:

$ cd /srv/blogops
$ git pull
$ docker compose build
$ if [ -d "./public" ]; then rm -rf ./public; fi
$ docker compose run hugo --
That generates the site on /srv/blogops/public (we remove the directory first because hugo does not clean the destination folder as jekyll does). The deploy script re-generates the site as described and moves the public directory to its final place for publishing.
To run remark42 with docker I have another docker-compose.yml on the /srv/blogops/remark42 folder. There the ../.env file is loaded to get the APP_UID and APP_GID variables that are used by my version of the init.sh script to adjust file permissions, and the env.prod file contains the rest of the settings for remark42, including the social network tokens (see the remark42 documentation for the available parameters; I don't include my configuration here because some of them are secrets).
The nginx configuration for the blogops.mixinet.net site is as simple as:

server {
  listen 443 ssl http2;
  server_name blogops.mixinet.net;
  ssl_certificate /etc/letsencrypt/live/blogops.mixinet.net/fullchain.pem;
  ssl_certificate_key /etc/letsencrypt/live/blogops.mixinet.net/privkey.pem;
  include /etc/letsencrypt/options-ssl-nginx.conf;
  ssl_dhparam /etc/letsencrypt/ssl-dhparams.pem;
  access_log /var/log/nginx/blogops.mixinet.net-443.access.log;
  error_log /var/log/nginx/blogops.mixinet.net-443.error.log;
  root /srv/blogops/nginx/public_html;
  location / {
    try_files $uri $uri/ =404;
  }
  include /srv/blogops/nginx/remark42.conf;
}

server {
  listen 80;
  listen [::]:80;
  server_name blogops.mixinet.net;
  access_log /var/log/nginx/blogops.mixinet.net-80.access.log;
  error_log /var/log/nginx/blogops.mixinet.net-80.error.log;
  if ($host = blogops.mixinet.net) {
    return 301 https://$host$request_uri;
  }
  return 404;
}
The root of the site is on /srv/blogops/nginx/public_html and not on /srv/blogops/public; the reason for that is that I want to be able to compile without affecting the running site: the deployment script generates the site on /srv/blogops/public and, if all works well, we rename folders to do the switch, making the change feel almost atomic.
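The folder switch at the end of the deploy can be as simple as a couple of renames (a sketch of the idea, not the actual deploy script):

cd /srv/blogops/nginx
mv public_html public_html.old \
  && mv ../public public_html \
  && rm -rf public_html.old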
The webhook calls come from the gitea at my home to the VM where the blog is served, so I'm going to configure json2file-go to listen for connections on a high port, using a self signed certificate and listening on IP addresses only reachable through the VPN. To do it we create a systemd socket to run json2file-go and adjust its configuration to listen on a private IP (we use the FreeBind option on its definition to be able to launch the service even when the IP is not available, that is, when the VPN is down). A small script sets all of this up; it uses mkcert to create the temporary certificates (to install that package the backports repository must be available).
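The key part is the socket unit; a minimal sketch, using the IP and port configured for the webhook below and assuming the service unit shipped with the package:

# /etc/systemd/system/json2file-go.socket (sketch)
[Unit]
Description=json2file-go listening socket

[Socket]
ListenStream=172.31.31.1:4443
FreeBind=true

[Install]
WantedBy=sockets.target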
To make gitea call the json2file-go server we go to the project and enter into the hooks/gitea/new page; once there we create a new webhook of type gitea, set the target URL to https://172.31.31.1:4443/blogops, and on the secret field we put the token generated with uuid by the setup script, which can be recovered with:

sed -n -e 's/blogops://p' /etc/json2file-go/dirlist
For the hook to work we also have to adjust the webhook section of the gitea server configuration so it allows us to call the IP and skips the TLS verification (you can see the available options on the gitea documentation). The [webhook] section of my server looks like this:

[webhook]
ALLOWED_HOST_LIST=private
SKIP_TLS_VERIFY=true
Once we have the webhook configured we can try it, and if it works our json2file server will store the message on its output directory. So now we can send messages from gitea and store them on files, but we have to do something to process those files once they are saved in our machine. An option could be to use a cron job to look for new files, but we can do better on Linux using inotify: we will use the inotify-tools package to watch the json2file output directory and execute a script each time a new file is moved inside it or closed after writing (IN_CLOSE_WRITE and IN_MOVED_TO events). To avoid concurrency problems we are going to use task-spooler to launch the scripts that process the webhooks using a queue of length 1, so they are executed one by one in a FIFO queue; the watcher runs as a daemon, installed as a systemd service (a sketch of both follows).
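A minimal sketch of the watcher, with the output directory and the processing script names as placeholders (note that on Debian the task-spooler binary is called tsp):

#!/bin/sh
# Queue one processing job per file written or moved into the spool dir
WATCH_DIR="/var/spool/json2file-go/blogops"   # placeholder path
inotifywait -m -q -e close_write -e moved_to --format '%w%f' "$WATCH_DIR" |
  while read -r filename; do
    tsp /srv/blogops/bin/process-webhook.sh "$filename"  # placeholder script
  done

And the unit to run it as a daemon could be as small as this (unit and script names assumed):

# webhook-watcher.service (sketch)
[Unit]
Description=Watch the json2file-go output dir and queue webhook processing

[Service]
ExecStart=/srv/blogops/bin/webhook-watcher.sh
Restart=always

[Install]
WantedBy=multi-user.target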
We used last_job_sync to synchronize job dependencies of the previous submission. Although the DRM scheduler guarantees the order of starting to execute a job in the same queue in the kernel space, the order of completion isn't predictable. On the other hand, we still needed to use syncobjs to follow job completion since we have event threads on the CPU side. Therefore, a more accurate implementation requires last_job syncobjs to track when each engine (CL, TFU, and CSD) is idle. We also needed to keep the driver working on previous versions of the v3d kernel driver with single semaphores, so we kept tracking ANY last_job_sync to preserve the previous implementation.
This was waiting for multisync support in the v3d kernel, which is already available. Exposing this feature however enabled a few more CTS tests that exposed pre-existing bugs in the user-space driver so we fix those here before exposing the feature.
This should give you emulated timeline semaphores for free and kernel-assisted sharable timeline semaphores for cheap once you have the kernel interface wired in.
There is no easy apt way to install those packages, and downloading them one by one and then installing them sounds like too much work. Step Zero: Prerequisites If you are an Ubuntu (or Debian!) developer, you might already have ubuntu-dev-tools installed. If not, it has some really useful tools!
$ sudo apt install ubuntu-dev-tools
$ mkdir mesa-downgrade; cd mesa-downgrade
The tool I need is pull-lp-debs. The first argument is the source package name. In this case, I next need to specify what version I want; otherwise it will give me the latest version, which isn't helpful. I could specify a series codename like jammy or impish, but that won't give me what I want this time.
$ pull-lp-debs mesa 21.3.5-1ubuntu2
$ sudo apt install --only-upgrade --mark-auto ./*.deb
The --only-upgrade flag tells apt to only install packages that are already installed. I don't actually need all 24 packages installed; I just want to change the versions for the stuff I already have. --mark-auto tells apt to keep these packages marked in dpkg as automatically installed. This allows any of these packages to be suggested for removal once there isn't anything else depending on them. That's useful if you don't want to have old libraries installed on your system in case you do manual installation like this frequently. Finally, the apt install syntax has a quirk: it needs a path to a file because it wants an easy way to distinguish a file from a package name, so adding ./ before filenames works. I guess this is a bug; apt should be taught that libegl-mesa0_21.3.5-1ubuntu2_amd64.deb is a file name, not a package name. Step Four: Cleanup Let's assume that you installed old versions. To get back to the current package versions, you can just upgrade like normal.
$ sudo apt dist-upgrade
Or, if you want to stay on the downgraded versions for a while, hold the packages:

$ sudo apt-mark hold

Use apt-mark showhold to see what packages you have held and apt-mark unhold to release the holds. Remember you won't get security updates or other bug fixes for held packages! And when you're done with the debs we downloaded, you can remove all the files:
$ cd .. ; rm -ri mesa-downgrade
Another way to pick a specific version is a /jammy suffix on the package name:

$ sudo apt install libegl-mesa0/jammy

To find packages whose installed version is no longer available in the archives, apt list can help. Here's one suggested way to find them:

$ apt list --installed --all-versions | grep 'local]' --after-context 1

One final caution: apt is designed to upgrade packages, not downgrade them. You can break things by downgrading. For instance, a database could upgrade its format to a new version, but I wouldn't expect it to be able to reverse that just because you attempt to install an older version.
1. Rebuilt caching code: the old cache was always system-wide and coarse-grained, covering everything in /usr/share/metainfo, which was very inefficient. To shorten a long story, the old caching code was rewritten with the new concepts of caches not necessarily being system-wide and caches existing for more fine-grained groups of files in mind. The new caching code uses Richard Hughes' excellent libxmlb internally for memory-mapped data storage. Unlike LMDB, libxmlb knows about the XML document model, so queries can be much more powerful and we do not need to build indices manually. The library is also already used by GNOME Software and fwupd for parsing of (refined) AppStream metadata, so it works quite well for that use case. As a result, search queries via libappstream are now a bit slower (very much depends on the query, roughly 20% on average), but can be much more powerful. The caching code is a lot more robust, which should speed up startup time of applications. And in addition to all of that, the
AsPool class has gained a flag to allow it to monitor AppStream source data for changes and refresh the cache fully automatically and transparently in the background. All software written against the previous version of the libappstream library should continue to work with the new caching code, but to make use of some of the new features, software using it may need adjustments. A lot of methods have been deprecated too now. 2. Experimental compose support Compiling MetaInfo and other metadata into AppStream collection metadata, extracting icons, language information, refining data and caching media is an involved process. The appstream-generator tool does this very well for data from Linux distribution sources, but the tool is also pretty heavyweight, with lots of knobs to adjust, an underlying database and a complex algorithm for icon extraction. Embedding it into other tools via anything else but its command-line API is also not easy (due to D's GC initialization, and because it was never written with that feature in mind). Sometimes a simpler tool is all you need, so the libappstream-compose library as well as appstreamcli compose are being developed at the moment. The library contains building blocks for developing a tool like appstream-generator, while the cli tool allows one to simply extract metadata from any directory tree, which can be used by e.g. Flatpak. For this to work well, a lot of appstream-generator's D code is translated into plain C, so the implementation stays identical but the language changes. Ultimately, the generator tool will use libappstream-compose for any general data refinement, and only implement things necessary to extract data from the archive of distributions. New applications (e.g. for new bundling systems and other purposes) can then use the same building blocks to implement new data generators similar to appstream-generator with ease, sharing much of the code that would be identical between implementations anyway. 3. Supporting user input controls Want to advertise that your application supports touch input? Keyboard input? Has support for graphics tablets? Gamepads? Sure, nothing is easier than that with the new control relation item and supports relation kind (since 0.12.11 / 0.15.0, details):
<supports>
  <control>pointing</control>
  <control>keyboard</control>
  <control>touch</control>
  <control>tablet</control>
</supports>
4. Specifying display size requirements: you can use the display_length relation item to require or recommend a minimum (or maximum) display size that the described GUI application can work with. For example:
<requires>
  <display_length compare="ge">360</display_length>
</requires>
The display_length value will be checked against the longest edge of a display by default (by explicitly specifying the shorter edge, this can be changed). This feature is available since 0.13.0, details. See also Tobias Bernard's blog entry on this topic. 5. Tags This is a feature that was originally requested for the LVFS/fwupd, but one of the great things about AppStream is that we can take very project-specific ideas and generalize them so something comes out of them that is useful for many. The new tags tag allows people to tag components with an arbitrary namespaced string. This can be useful for project-internal organization of applications, as well as to convey certain additional properties to a software center, e.g. an application could mark itself as featured in a specific software center only. Metadata generators may also add their own tags to components to improve organization. AppStream gives no recommendations as to how these tags are to be interpreted except for them being a strictly optional feature, so any meaning is something clients and metadata authors need to negotiate. It therefore is a more specialized use case of the already existing custom tag, and I expect it to be primarily useful within larger organizations that produce a lot of software components that need sorting. For example:
<tags>
  <tag namespace="lvfs">vendor-2021q1</tag>
  <tag namespace="plasma">featured</tag>
</tags>
6. MetaInfo Creator changes: the MetaInfo Creator tool has gained support for the new control and display_length tags, resolved a few minor issues and also added a button to instantly copy the generated output to clipboard so people can paste it into their project. If you want to create a new MetaInfo file, this tool is the best way to do it! The creator tool will also not transfer any data out of your web browser; it is strictly a client-side application. And that is about it for the most notable changes in AppStream land! Of course there is a lot more: additional tags for the LVFS and content rating have been added, lots of bugs have been squashed, the documentation has been refined a lot and the library has gained a lot of new API to make building software centers easier. Still, there is a lot to do and quite a few open feature requests too. Onwards to 1.0!
I have switched to a new tool to synchronize my mail, isync (aka mbsync). But I also evaluated OfflineIMAP, which was resurrected from the Python 2 apocalypse, and because I had used it before, for a long time. Read on for the details.
The baseline we are comparing against is SMD (syncmaildir) which performs the sync in about 7-8 seconds locally (3.5 seconds for each push/pull command) and about 10-12 seconds remotely. Anything close to that or better is good enough. I do not have recent numbers for a SMD full sync baseline, but the setup documentation mentions 20 minutes for a full sync. That was a few years ago, and the spool has obviously grown since then, so that is not a reliable baseline. A baseline for a full sync might be also set with rsync, which copies files at nearly 40MB/s, or 317Mb/s!
$ notmuch count --exclude=false
372758
$ du -sh --exclude xapian Maildir
13G     Maildir
anarcat@angela:tmp(main)$ time rsync -a --info=progress2 --exclude xapian shell.anarc.at:Maildir/ Maildir/
 12,647,814,731 100%   37.85MB/s    0:05:18 (xfr#394981, to-chk=0/395815)
72.38user 106.10system 5:19.59elapsed 55%CPU (0avgtext+0avgdata 15988maxresident)k
8816inputs+26305112outputs (0major+50953minor)pagefaults 0swaps

That is 5 minutes to transfer the entire spool. Incremental syncs are obviously pretty fast too:

anarcat@angela:tmp(main)$ time rsync -a --info=progress2 --exclude xapian shell.anarc.at:Maildir/ Maildir/
              0   0%    0.00kB/s    0:00:00 (xfr#0, to-chk=0/395815)
1.42user 0.81system 0:03.31elapsed 67%CPU (0avgtext+0avgdata 14100maxresident)k
120inputs+0outputs (3major+12709minor)pagefaults 0swaps

As an extra curiosity, here's the performance with tar, pretty similar to rsync, minus incremental, which I cannot be bothered to figure out right now:

anarcat@angela:tmp(main)$ time ssh shell.anarc.at tar --exclude xapian -cf - Maildir/ | pv -s 13G | tar xf -
56.68user 58.86system 5:17.08elapsed 36%CPU (0avgtext+0avgdata 8764maxresident)k
0inputs+0outputs (0major+7266minor)pagefaults 0swaps
12,1GiO 0:05:17 [39,0MiB/s] [===================================================================> ] 92%
Interesting that rsync manages to almost beat a plain tar on file transfer; I'm actually surprised by how well it performs here, considering there are many little files to transfer. (But then again, this may be exactly where rsync shines: while tar needs to glue all those little files together, rsync can just directly talk to the other side and tell it to do live changes. Something to look at in another article maybe?) Since both ends are NVMe drives, those should easily saturate a gigabit link. And in fact, a backup of the server mail spool achieves much faster transfer rate on disks:

anarcat@marcos:~$ tar fc - Maildir | pv -s 13G > Maildir.tar
15,0GiO 0:01:57 [ 131MiB/s] [===================================] 115%

That's 131MiByte per second, vastly faster than the gigabit link. The client has similar performance:

anarcat@angela:~(main)$ tar fc - Maildir | pv -s 17G > Maildir.tar
16,2GiO 0:02:22 [ 116MiB/s] [==================================] 95%

So those disks should be able to saturate a gigabit link, and they are not the bottleneck on fast links. Which begs the question of what is blocking performance of a similar transfer over the gigabit link, but that's another question altogether, because no sync program ever reaches the above performance anyways. Finally, note that when I migrated to SMD, I wrote a small performance comparison that could be interesting here. It shows SMD to be faster than OfflineIMAP, but not as much as we see here. In fact, it looks like OfflineIMAP slowed down significantly since then (May 2018), but this could be due to my larger mail spool as well.
The isync (aka mbsync) project is written in C and supports syncing Maildir and IMAP folders, with possibly multiple replicas. I haven't tested this but I suspect it might be possible to sync between two IMAP servers as well. It supports partial mirrors, message flags, full folder support, and "trash" functionality.
Long gone are the days where I would spend a long time reading a manual page to figure out the meaning of every option. If that's your thing, you might like this one. But I'm more of a "EXAMPLES section" kind of person now, and I somehow couldn't find a sample file on the website. I started from the Arch wiki one but it's actually not great because it's made for Gmail (which is not a usual Dovecot server). So a sample config file in the manpage would be a great addition. Thankfully, the Debian package ships one in /usr/share/doc/isync/examples/mbsyncrc.sample, but I only found that after I wrote my configuration. It was still useful and I recommend people take a look if they want to understand the syntax. This is what I came up with:

SyncState *
Sync New ReNew Flags

IMAPAccount anarcat
Host imap.anarc.at
User anarcat
PassCmd "pass imap.anarc.at"
SSLType IMAPS
CertificateFile /etc/ssl/certs/ca-certificates.crt

IMAPStore anarcat-remote
Account anarcat

MaildirStore anarcat-local
# Maildir/top/sub/sub
#SubFolders Verbatim
# Maildir/.top.sub.sub
SubFolders Maildir++
# Maildir/top/.sub/.sub
# SubFolders legacy
# The trailing "/" is important
#Path ~/Maildir-mbsync/
Inbox ~/Maildir-mbsync/

Channel anarcat
# AKA Far, convert when all clients are 1.4+
Master :anarcat-remote:
# AKA Near
Slave :anarcat-local:
# Exclude everything under the internal [Gmail] folder, except the interesting folders
#Patterns * ![Gmail]* "[Gmail]/Sent Mail" "[Gmail]/Starred" "[Gmail]/All Mail"
# Or include everything
Patterns *
# Automatically create missing mailboxes, both locally and on the server
#Create Both
Create slave
# Sync the movement of messages between folders and deletions, add after making sure the sync works
#Expunge Both

Also, that syntax is a little overly complicated. For example, Far needs colons, like:

Far :anarcat-remote:

Why? That seems just too complicated. I also found that sections are not clearly identified: keywords like Channel mark section beginnings, for example, which is not at all obvious until you learn about mbsync's internals. There are also weird ordering issues: the SyncState option needs to be before IMAPAccount, presumably because it's global. Using a more standard format like .INI or TOML could improve that situation.
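For example, a hypothetical TOML rendering of the account section (this is not something mbsync actually supports) would make the section boundaries explicit at a glance:

# hypothetical syntax, for illustration only
[IMAPAccount.anarcat]
Host = "imap.anarc.at"
User = "anarcat"
PassCmd = "pass imap.anarc.at"
SSLType = "IMAPS"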
The incremental runs are roughly 2 seconds, which is even more impressive, as that's actually faster than rsync:

===> multitime results
1: mbsync -a
            Mean        Std.Dev.    Min         Median      Max
real        2.015       0.052       1.930       2.029       2.105
user        0.660       0.040       0.592       0.661       0.722
sys         0.338       0.033       0.268       0.341       0.387

Those tests were performed with isync 1.3.0-2.2 on Debian bullseye. Tests with a newer isync release originally failed because of a corrupted message that triggered bug 999804 (see below). Running 1.4.3 under valgrind works around the bug, but adds a 50% performance cost, the full sync running in 1h35m. Once the upstream patch is applied, performance with 1.4.3 is fairly similar, considering that the new sync included the register folder with 4000 messages:

120.74user 213.19system 59:47.69elapsed 9%CPU (0avgtext+0avgdata 105420maxresident)k
29128inputs+28284376outputs (0major+45711minor)pagefaults 0swaps

That is ~13GB in ~60 minutes, which gives us 28.3Mbps. Incrementals are also pretty similar to 1.3.x, again considering the double-connect cost:

===> multitime results
1: mbsync -a
            Mean        Std.Dev.    Min         Median      Max
real        2.500       0.087       2.340       2.491       2.629
user        0.718       0.037       0.679       0.711       0.793
sys         0.322       0.024       0.284       0.320       0.365

Those tests were all done on a Gigabit link, but what happens on a slower link? My server uplink is slow: 25 Mbps down, 6 Mbps up. There mbsync is worse than the SMD baseline:

===> multitime results
1: mbsync -a
            Mean        Std.Dev.    Min         Median      Max
real        31.531      0.724       30.764      31.271      33.100
user        1.858       0.125       1.721       1.818       2.131
sys         0.610       0.063       0.506       0.600       0.695

That's 30 seconds for a sync, which is an order of magnitude slower than SMD.
The mbsync UI is kind of neat:

anarcat@angela:~(main)$ mbsync -a
Notice: Master/Slave are deprecated; use Far/Near instead.
C: 1/2  B: 204/205  F: +0/0 *0/0 #0/0  N: +1/200 *0/0 #0/0

(Note that nice switch away from slavery-related terms too.) The display is minimal, and yet informative. It's not obvious what it means at first glance, but the manpage is useful at least for clarifying that:

This represents the cumulative progress over channels, boxes, and messages affected on the far and near side, respectively. The message counts represent added messages, messages with updated flags, and trashed messages, respectively. No attempt is made to calculate the totals in advance, so they grow over time as more information is gathered. (Emphasis mine.)

In other words:

- C 2/2: channels done/total (2 done out of 2)
- B 204/205: mailboxes done/total (204 out of 205)
- F: changes on the far side
- N: +10/200 *0/0 #0/0: changes on the "near" side:
  - +10/200: 10 out of 200 messages downloaded
  - *0/0: no flag changed
  - #0/0: no message deleted
Sto "mark spam", which basically assigns the tag
spamto the message and removes a bunch of others. Then I have a notmuch-purge script which moves that message to the spam folder, for training purposes. It basically does this:
This method, which worked fine in SMD (and also OfflineIMAP) created this error on sync:
notmuch search --output=files --format=text0 "$search_spam" \ xargs -r -0 mv -t "$HOME/Maildir/$ PREFIX junk/cur/"
And indeed, there are now two messages with that UID in the mailbox:
Maildir error: duplicate UID 37578.
This is actually a known limitation or, as mbsync(1) calls it, a "RECOMMENDATION":
anarcat@angela:~(main)$ find Maildir/.junk/ -name '*U=37578*' Maildir/.junk/cur/1637427889.134334_2.angela,U=37578:2,S Maildir/.junk/cur/1637348602.2492889_221804.angela,U=37578:2,S
When using the more efficient default UID mapping scheme, it is important that the MUA renames files when moving them between Maildir fold ers. Mutt always does that, while mu4e needs to be configured to do it:So it seems I would need to fix my script. It's unclear how the paths should be renamed, which is unfortunate, because I would need to change my script to adapt to
(setq mu4e-change-filenames-when-moving t)
mbsync, but I can't tell how just from reading the above. (A manual fix is actually to rename the file to remove the
mbsyncwill generate a new one and then sync correctly.) Fortunately, someone else already fixed that issue: afew, a notmuch tagging script (much puns, such hurt), has a move mode that can rename files correctly, specifically designed to deal with
mbsync. I had already been told about afew, but it's one more reason to standardize my notmuch hooks on that project, it looks like. Update: I have tried to use afew and found it has significant performance issues. It also has a completely different paradigm to what I am used to: it assumes all incoming mail has a new tag and lays its own tags on top of that (sent, etc). It can only move files from one folder at a time (see this bug), which breaks my spam training workflow. In general, I sync my tags into folders (e.g. sent) and message flags (e.g. the seen flag "S", etc), and afew is not well suited for this (although there are hacks that try to fix this). I have worked hard to make my tagging scripts idempotent, and it's something afew doesn't currently have. Still, it would be better to have that code in Python than bash, so maybe I should consider my options here.
mbsync is really fast, but the downside of that is that it's written in C, and with that comes a whole set of security issues. The Debian security tracker has only three CVEs on isync, but the above issues show there could be many more. Reading the source code certainly did not make me very comfortable with trusting it with untrusted data. I considered sandboxing it with systemd (below) but having it run as a --user process makes that difficult. I also considered using an apparmor profile, but that is not trivial because we need to allow SSH and only some parts of it... Thankfully, upstream has been diligent at addressing the issues I have found. They provided a patch within a few days which did fix the sync issues. Update: upstream actually took the issue very seriously. They not only got CVE-2021-44143 assigned for my bug report, they also audited the code and found several more issues collectively identified as CVE-2021-3657, which actually also affect 1.3 (i.e. Debian 11/bullseye/stable). Somehow my corpus doesn't trigger that issue, but it was still considered serious enough to warrant a CVE. So on the one hand: excellent response from upstream; but on the other hand: how many more of those could there be in there?
I run mbsync as a systemd service. Some guides suggest using the --verbose (-V) flag, which is a little intense here, as it outputs 1444 lines of messages. I have used the following .service file:

[Unit]
Description=Mailbox synchronization service
ConditionHost=!marcos
Wants=network-online.target
After=network-online.target
Before=notmuch-new.service

[Service]
Type=oneshot
ExecStart=/usr/bin/mbsync -a
Nice=10
IOSchedulingClass=idle
NoNewPrivileges=true

[Install]
WantedBy=default.target

And the following .timer:

[Unit]
Description=Mailbox synchronization timer
ConditionHost=!marcos

[Timer]
OnBootSec=2m
OnUnitActiveSec=5m
Unit=mbsync.service

[Install]
WantedBy=timers.target

Note that we trigger notmuch through systemd, with the Before directive above and also by adding a notmuch-new.service unit:

[Unit]
Description=notmuch new
After=mbsync.service

[Service]
Type=oneshot
Nice=10
ExecStart=/usr/bin/notmuch new

[Install]
WantedBy=mbsync.service

An improvement over polling repeatedly with a .timer would be to wake up only on IMAP notify, but neither imapnotify nor goimapnotify seem to be packaged in Debian. It would also not cover for the "sent folder" use case, where we need to wake up on local changes.
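In any case, with the unit files above in place, everything gets wired up with the usual user-unit commands:

systemctl --user daemon-reload
systemctl --user enable --now mbsync.timer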
The mbsync manpage also shows how to connect through a tunnel instead of talking IMAPS directly, with an example like this:

IMAPStore remote
Tunnel "ssh -q host.remote.com /usr/sbin/imapd"

Add BatchMode, restrict to IdentitiesOnly, provide a password-less key just for this, add compression (-C), find the Dovecot imap binary, and you get this:

IMAPAccount anarcat-tunnel
Tunnel "ssh -o BatchMode=yes -o IdentitiesOnly=yes -i ~/.ssh/id_ed25519_mbsync -o HostKeyAlias=shell.anarc.at -C email@example.com /usr/lib/dovecot/imap"

And it actually seems to work. It's a bit noisy, however:

$ mbsync -a
Notice: Master/Slave are deprecated; use Far/Near instead.
C: 0/2  B: 0/1  F: +0/0 *0/0 #0/0  N: +0/0 *0/0 #0/0imap(anarcat): Error: net_connect_unix(/run/dovecot/stats-writer) failed: Permission denied
C: 2/2  B: 205/205  F: +0/0 *0/0 #0/0  N: +1/1 *3/3 #0/0imap(anarcat)<1611280><90uUOuyElmEQlhgAFjQyWQ>: Info: Logged out in=10808 out=15396642 deleted=0 expunged=0 trashed=0 hdr_count=0 hdr_bytes=0 body_count=1 body_bytes=8087
dovecot/imap doesn't have a "usage" to speak of, but even the source code doesn't hint at a way to disable that Error message, so that's unfortunate. That socket is owned by root:dovecot, so presumably Dovecot runs the imap process as $user:dovecot, which we can't do here. Oh well? Interestingly, the SSH setup is not faster than IMAP. With IMAP:

===> multitime results
1: mbsync -a
            Mean        Std.Dev.    Min         Median      Max
real        2.367       0.065       2.220       2.376       2.458
user        0.793       0.047       0.731       0.776       0.871
sys         0.426       0.040       0.364       0.434       0.476

With the SSH tunnel:

===> multitime results
1: mbsync -a
            Mean        Std.Dev.    Min         Median      Max
real        2.515       0.088       2.274       2.532       2.594
user        0.753       0.043       0.645       0.766       0.804
sys         0.328       0.045       0.212       0.340       0.393

Basically: 200ms slower. Tolerable.
Here is how I switched from SMD to mbsync on my first workstation (the work on the second one was more streamlined, especially since the corruption on mailboxes was fixed):

- install a patched isync:

dpkg -i isync_1.4.3-1.1~_amd64.deb

- copy the existing maildir to bootstrap the sync:

rsync -a --info=progress2 angela:Maildir/ Maildir-mbsync/

- rename the files to match the new hostname:

find Maildir-mbsync/ -type f -name '*.angela,*' -print0 | rename -0 's/\.angela,/\.curie,/'

- drop the notmuch index, it will be rebuilt:

rm -rf Maildir-mbsync/.notmuch/xapian/

- disable the SMD and notmuch services and timers:

systemctl --user --now disable smd-pull.service smd-pull.timer smd-push.service smd-push.timer notmuch-new.service notmuch-new.timer

- run one last sync with SMD:

smd-pull --show-tags ; smd-push --show-tags ; notmuch new ; notmuch-sync-flagged -v

- dump the notmuch database:

notmuch dump | pv > notmuch.dump

- make a backup of the maildir (with hardlinks, to save space):

cp -al Maildir Maildir-bak

- create the SSH key for the tunnel:

ssh-keygen -t ed25519 -f .ssh/id_ed25519_mbsync
cat .ssh/id_ed25519_mbsync.pub

- add the public key to .ssh/authorized_keys on the server, like this:

command="/usr/lib/dovecot/imap",restrict ssh-ed25519 AAAAC...

- swap the maildirs:

mv Maildir Maildir-smd
mv Maildir-mbsync Maildir

- reindex and restore the notmuch database:

notmuch new
pv notmuch.dump | notmuch restore

- enable the new services and remove the smd-* ones:

systemctl --user enable mbsync.timer notmuch-new.service
systemctl --user start mbsync.timer
rm ~/.config/systemd/user/smd*
systemctl daemon-reload
The restore step spewed lots of warnings like:

[...]
Warning: cannot apply tags to missing message: CAN6gO7_QgCaiDFvpG3AXHi6fW12qaN286+2a7ERQ2CQtzjSEPw@mail.gmail.com
Warning: cannot apply tags to missing message: CAPTU9Wmp0yAmaxO+qo8CegzRQZhCP853TWQ_Ne-YF94MDUZ+Dw@mail.gmail.com
Warning: cannot apply tags to missing message: F5086003-2917-4659-B7D2-66C62FCD4128@gmail.com
[...]
Warning: cannot apply tags to missing message: firstname.lastname@example.org
Warning: cannot apply tags to missing message: email@example.com
Warning: cannot apply tags to missing message: notmuch-sha1-000458df6e48d4857187a000d643ac971deeef47
Warning: cannot apply tags to missing message: notmuch-sha1-0079d8e0c3340e6f88c66f4c49fca758ea71d06d
Warning: cannot apply tags to missing message: notmuch-sha1-0194baa4cfb6d39bc9e4d8c049adaccaa777467d
Warning: cannot apply tags to missing message: notmuch-sha1-02aede494fc3f9e9f060cfd7c044d6d724ad287c
Warning: cannot apply tags to missing message: notmuch-sha1-06606c625d3b3445420e737afd9a245ae66e5562
Warning: cannot apply tags to missing message: notmuch-sha1-0747b020f7551415b9bf5059c58e0a637ba53b13
[...]

As detailed in the crash report, all of those were actually innocuous and could be ignored. Also note that we completely trash the notmuch database because it's actually faster to reindex from scratch than let notmuch slowly figure out that all mails are new and all the old mails are gone. The fresh indexing took:

nov 19 15:08:54 angela notmuch: Processed 384679 total files in 23m 41s (270 files/sec.).
nov 19 15:08:54 angela notmuch: Added 372610 new messages to the database.

While a reindexing on top of an existing database was going twice as slow, at about 120 files/sec.
For reference, here is the full configuration file I ended up with. Note that it may be out of sync with my live (and private) configuration file, as I do not publish my "dotfiles" repository publicly for security reasons:

SyncState *
Sync All

# IMAP side, AKA "Far"
IMAPAccount anarcat-imap
Host imap.anarc.at
User anarcat
PassCmd "pass imap.anarc.at"
SSLType IMAPS
CertificateFile /etc/ssl/certs/ca-certificates.crt

IMAPAccount anarcat-tunnel
Tunnel "ssh -o BatchMode=yes -o IdentitiesOnly=yes -i ~/.ssh/id_ed25519_mbsync -o HostKeyAlias=shell.anarc.at -C firstname.lastname@example.org /usr/lib/dovecot/imap"

IMAPStore anarcat-remote
Account anarcat-tunnel

# Maildir side, AKA "Near"
MaildirStore anarcat-local
# Maildir/top/sub/sub
#SubFolders Verbatim
# Maildir/.top.sub.sub
SubFolders Maildir++
# Maildir/top/.sub/.sub
# SubFolders legacy
# The trailing "/" is important
#Path ~/Maildir-mbsync/
Inbox ~/Maildir/

# what binds Maildir and IMAP
Channel anarcat
Far :anarcat-remote:
Near :anarcat-local:
# Exclude everything under the internal [Gmail] folder, except the interesting folders
#Patterns * ![Gmail]* "[Gmail]/Sent Mail" "[Gmail]/Starred" "[Gmail]/All Mail"
# Or include everything
#Patterns *
Patterns * !register !.register
# Automatically create missing mailboxes, both locally and on the server
Create Both
#Create Near
# Sync the movement of messages between folders and deletions, add after making sure the sync works
Expunge Both
# Propagate mailbox deletion
Remove both

IMAPAccount anarcat-register-imap
Host imap.anarc.at
User register
PassCmd "pass imap.anarc.at-register"
SSLType IMAPS
CertificateFile /etc/ssl/certs/ca-certificates.crt

IMAPAccount anarcat-register-tunnel
Tunnel "ssh -o BatchMode=yes -o IdentitiesOnly=yes -i ~/.ssh/id_ed25519_mbsync -o HostKeyAlias=shell.anarc.at -C email@example.com /usr/lib/dovecot/imap"

IMAPStore anarcat-register-remote
Account anarcat-register-tunnel

MaildirStore anarcat-register-local
SubFolders Maildir++
Inbox ~/Maildir/.register/

Channel anarcat-register
Far :anarcat-register-remote:
Near :anarcat-register-local:
Create Both
Expunge Both
Remove both
OfflineIMAP didn't play well with notmuch, and would sometimes crash mysteriously. It's been a while, so my memory is hazy on that. It also kind of died in a fire when Python 2 stopped being maintained. The main author moved on to a different project, imapfw, which could serve as a framework to build IMAP clients, but never seemed to implement all of the OfflineIMAP features and certainly not configuration file compatibility. Thankfully, a new team of volunteers ported OfflineIMAP to Python 3 and we can now test that new version to see if it is an improvement.
The first full sync crashed partway through:

Copy message from RemoteAnarcat:junk:
 ERROR: Copying message 30624 [acc: Anarcat]
  decoding with 'X-EUC-TW' codec failed (AttributeError: 'memoryview' object has no attribute 'decode')
Thread 'Copy message from RemoteAnarcat:junk' terminated with exception:
Traceback (most recent call last):
  File "/usr/share/offlineimap3/offlineimap/imaputil.py", line 406, in utf7m_decode
    for c in binary.decode():
AttributeError: 'memoryview' object has no attribute 'decode'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/share/offlineimap3/offlineimap/threadutil.py", line 146, in run
    Thread.run(self)
  File "/usr/lib/python3.9/threading.py", line 892, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/share/offlineimap3/offlineimap/folder/Base.py", line 802, in copymessageto
    message = self.getmessage(uid)
  File "/usr/share/offlineimap3/offlineimap/folder/IMAP.py", line 342, in getmessage
    data = self._fetch_from_imap(str(uid), self.retrycount)
  File "/usr/share/offlineimap3/offlineimap/folder/IMAP.py", line 908, in _fetch_from_imap
    ndata1 = self.parser['8bit-RFC'].parsebytes(data)
  File "/usr/lib/python3.9/email/parser.py", line 123, in parsebytes
    return self.parser.parsestr(text, headersonly)
  File "/usr/lib/python3.9/email/parser.py", line 67, in parsestr
    return self.parse(StringIO(text), headersonly=headersonly)
  File "/usr/lib/python3.9/email/parser.py", line 56, in parse
    feedparser.feed(data)
  File "/usr/lib/python3.9/email/feedparser.py", line 176, in feed
    self._call_parse()
  File "/usr/lib/python3.9/email/feedparser.py", line 180, in _call_parse
    self._parse()
  File "/usr/lib/python3.9/email/feedparser.py", line 385, in _parsegen
    for retval in self._parsegen():
  File "/usr/lib/python3.9/email/feedparser.py", line 298, in _parsegen
    for retval in self._parsegen():
  File "/usr/lib/python3.9/email/feedparser.py", line 385, in _parsegen
    for retval in self._parsegen():
  File "/usr/lib/python3.9/email/feedparser.py", line 256, in _parsegen
    if self._cur.get_content_type() == 'message/delivery-status':
  File "/usr/lib/python3.9/email/message.py", line 578, in get_content_type
    value = self.get('content-type', missing)
  File "/usr/lib/python3.9/email/message.py", line 471, in get
    return self.policy.header_fetch_parse(k, v)
  File "/usr/lib/python3.9/email/policy.py", line 163, in header_fetch_parse
    return self.header_factory(name, value)
  File "/usr/lib/python3.9/email/headerregistry.py", line 601, in __call__
    return self[name](name, value)
  File "/usr/lib/python3.9/email/headerregistry.py", line 196, in __new__
    cls.parse(value, kwds)
  File "/usr/lib/python3.9/email/headerregistry.py", line 445, in parse
    kwds['parse_tree'] = parse_tree = cls.value_parser(value)
  File "/usr/lib/python3.9/email/_header_value_parser.py", line 2675, in parse_content_type_header
    ctype.append(parse_mime_parameters(value[1:]))
  File "/usr/lib/python3.9/email/_header_value_parser.py", line 2569, in parse_mime_parameters
    token, value = get_parameter(value)
  File "/usr/lib/python3.9/email/_header_value_parser.py", line 2492, in get_parameter
    token, value = get_value(value)
  File "/usr/lib/python3.9/email/_header_value_parser.py", line 2403, in get_value
    token, value = get_quoted_string(value)
  File "/usr/lib/python3.9/email/_header_value_parser.py", line 1294, in get_quoted_string
    token, value = get_bare_quoted_string(value)
  File "/usr/lib/python3.9/email/_header_value_parser.py", line 1223, in get_bare_quoted_string
    token, value = get_encoded_word(value)
  File "/usr/lib/python3.9/email/_header_value_parser.py", line 1064, in get_encoded_word
    text, charset, lang, defects = _ew.decode('=?' + tok + '?=')
  File "/usr/lib/python3.9/email/_encoded_words.py", line 181, in decode
    string = bstring.decode(charset)
AttributeError: decoding with 'X-EUC-TW' codec failed (AttributeError: 'memoryview' object has no attribute 'decode')

Last 1 debug messages logged for Copy message from RemoteAnarcat:junk prior to exception:
thread: Register new thread 'Copy message from RemoteAnarcat:junk' (account 'Anarcat')
ERROR: Exceptions occurred during the run!
ERROR: Copying message 30624 [acc: Anarcat]
  decoding with 'X-EUC-TW' codec failed (AttributeError: 'memoryview' object has no attribute 'decode')
Traceback:
  File "/usr/share/offlineimap3/offlineimap/folder/Base.py", line 802, in copymessageto
    message = self.getmessage(uid)
  File "/usr/share/offlineimap3/offlineimap/folder/IMAP.py", line 342, in getmessage
    data = self._fetch_from_imap(str(uid), self.retrycount)
  File "/usr/share/offlineimap3/offlineimap/folder/IMAP.py", line 908, in _fetch_from_imap
    ndata1 = self.parser['8bit-RFC'].parsebytes(data)
  [...same email parser chain as above...]
Folder junk [acc: Anarcat]: Copy message UID 30626 (29008/49310) RemoteAnarcat:junk -> LocalAnarcat:junk
Command exited with non-zero status 100
5252.91user 535.86system 3:21:00elapsed 47%CPU (0avgtext+0avgdata 846304maxresident)k
96344inputs+26563792outputs (1189major+2155815minor)pagefaults 0swaps

That only transferred about 8GB of mail, which gives us a transfer rate of 5.3Mbit/s, more than 5 times slower than mbsync. This bug is possibly limited to the bullseye version of offlineimap3 (the gorgeous 0.0~git20210225.1e7ef9e+dfsg-4), while the current sid version (the equally gorgeous 0.0~git20211018.e64c254+dfsg-1) seems unaffected.
A retried full sync did eventually complete, although still with errors:

*** Finished account 'Anarcat' in 511:12
ERROR: Exceptions occurred during the run!
ERROR: Exception parsing message with ID (<20190619152034.BFB8810E07A@marcos.anarc.at>) from imaplib (response type: bytes).
 AttributeError: decoding with 'X-EUC-TW' codec failed (AttributeError: 'memoryview' object has no attribute 'decode')
Traceback:
  File "/usr/share/offlineimap3/offlineimap/folder/Base.py", line 810, in copymessageto
    message = self.getmessage(uid)
  File "/usr/share/offlineimap3/offlineimap/folder/IMAP.py", line 343, in getmessage
    data = self._fetch_from_imap(str(uid), self.retrycount)
  File "/usr/share/offlineimap3/offlineimap/folder/IMAP.py", line 910, in _fetch_from_imap
    raise OfflineImapError(
ERROR: Exception parsing message with ID (<40A270DB.firstname.lastname@example.org>) from imaplib (response type: bytes).
 AttributeError: decoding with 'x-mac-roman' codec failed (AttributeError: 'memoryview' object has no attribute 'decode')
Traceback:
  File "/usr/share/offlineimap3/offlineimap/folder/Base.py", line 810, in copymessageto
    message = self.getmessage(uid)
  File "/usr/share/offlineimap3/offlineimap/folder/IMAP.py", line 343, in getmessage
    data = self._fetch_from_imap(str(uid), self.retrycount)
  File "/usr/share/offlineimap3/offlineimap/folder/IMAP.py", line 910, in _fetch_from_imap
    raise OfflineImapError(
ERROR: IMAP server 'RemoteAnarcat' does not have a message with UID '32686'
Traceback:
  File "/usr/share/offlineimap3/offlineimap/folder/Base.py", line 810, in copymessageto
    message = self.getmessage(uid)
  File "/usr/share/offlineimap3/offlineimap/folder/IMAP.py", line 343, in getmessage
    data = self._fetch_from_imap(str(uid), self.retrycount)
  File "/usr/share/offlineimap3/offlineimap/folder/IMAP.py", line 889, in _fetch_from_imap
    raise OfflineImapError(reason, severity)
Command exited with non-zero status 1
8273.52user 983.80system 8:31:12elapsed 30%CPU (0avgtext+0avgdata 841936maxresident)k
56376inputs+43247608outputs (811major+4972914minor)pagefaults 0swaps
"offlineimap -o " took 8 hours 31 mins 15 secs

This is 8h31m for transferring 12G, which is around 3.1Mbit/s. That is nine times slower than mbsync, almost an order of magnitude! Now that we have a full sync, we can test incremental synchronization. That is also much slower:

===> multitime results
1: sh -c "offlineimap -o || true"
            Mean        Std.Dev.    Min         Median      Max
real        24.639      0.513       23.946      24.526      25.708
user        23.912      0.473       23.404      23.795      24.947
sys         1.743       0.105       1.607       1.729       2.002

That is also an order of magnitude slower than mbsync, and significantly slower than what you'd expect from a sync process. ~30 seconds is long enough to make me impatient and distracted; 3 seconds, less so: I can wait and see the results almost immediately.
It is possibly also slower than mbsync over a slow link, but I haven't tested that theory. The OfflineIMAP mail spool is missing quite a few messages as well:

anarcat@angela:~(main)$ find Maildir-offlineimap -type f -type f -a \! -name '.*' | wc -l
381463
anarcat@angela:~(main)$ find Maildir -type f -type f -a \! -name '.*' | wc -l
385247

... although that's probably all either new messages or the register folder, so OfflineIMAP might actually be in a better position there. But digging in more, it seems like the actual per-folder diff is fairly similar to mbsync: a few messages missing here and there. Considering OfflineIMAP's instability and poor performance, I have not looked any deeper in those discrepancies.
- similar to offlineimap, but requires running an IMAP server locally (Perl)
- includes rsmtp, which is a nice name for rsendmail; not evaluated because it seems awfully complex to setup (Haskell)
So I'm now using mbsync to sync my mail. I'm a little disappointed by the synchronisation times over the slow link, but I guess that's par for the course if we use IMAP: we are bound by the network speed much more than with custom protocols. I'm also worried about the C implementation and the crashes I have witnessed, but I am encouraged by the fast upstream response. Time will tell if I will stick with that setup. I'm certainly curious about the promises of interimap and mail-sync, but I have run out of time on this project.
gusnan@debian-i7:~ > dpkg --compare-versions 1.0b lt 1.0 && echo true
gusnan@debian-i7:~ > dpkg --compare-versions 1.0b gt 1.0 && echo true
true

But there's a solution: name the beta versions something like 1.0~beta. And you don't need to force upstream to make any changes either. You can use uscan and the watch file to make it interpret an upstream 1.0b version as 1.0~beta in Debian. This is done by using a line like

uversionmangle=s/(\d)[\_\.\-\+]?((RC|rc|pre|dev|beta|alpha|b|a)\d*)$/$1~$2/;s/\~b/\~beta/,\

in uversionmangle in your debian/watch file. In this case I have added on the end something to make the ending ~b into ~beta instead. Full version of the watch file available here.
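And with the mangled version the comparison comes out right, since dpkg sorts ~ before anything, including the empty string:

$ dpkg --compare-versions 1.0~beta lt 1.0 && echo true
true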