Search Results: "cjf"

30 September 2023

Russell Coker: Links September 2023

Interesting article in Wired about adversarial attacks on ML systems that get them to do things they are explicitly programmed not to do, such as describing how to make illegal drugs [1]. The most interesting part is that the attacks work on most GPT systems, which is probably due to the similar data used to train them.

Vice has an interesting article about the Danish Synthetic Party, a political party led by an AI [2]. Citizens can vote for candidates who will try to get laws passed that match the AI generated goals; there is no option of voting for an AI character. The policies they are advocating for are designed to appeal to the 20% of Danes who don't vote. They are also trying to inspire similar parties in other countries. I think this has the potential to improve democracy.

Vice reports that in 2021 a man tried to assassinate the Queen of England with inspiration from Star Wars and an AI chat bot [3]. While someone who wants to be a real-life Sith is probably going to end up doing something bad, we still don't want chat bots encouraging it.

Bruce Schneier wrote an interesting article about milestones for AI involvement in the political process [4].

Sam Varghese wrote an interesting article about the allegations that India is following the example of Saudi Arabia and assassinating people in other countries who disagree with their government [5]. We need to stop this.

Ian Jackson wrote an interesting blog post advocating that DKIM PRIVATE keys be rotated and PUBLISHED [6]. The idea is that if a hostile party gets access to the mailbox of someone who received private email from you then, in the normal DKIM setup where keys never change, they can prove that the email is authentic when they leak it. Whereas if your mail server publishes the old keys as Ian advocates, the hostile party can't prove that you sent the email in question, as anyone could have forged the signature. Anything that involves publishing a private key gets an immediate negative reaction, but I can't fault the logic here.
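As a rough sketch of what the rotation half of that scheme could look like in practice (this is not Ian's exact procedure; the selector name and file paths are invented for the example), a new key pair can be generated with openssl, the public half published under a new DNS selector, and the retired private key set aside for publication later:

# generate a new 2048-bit RSA key for the next selector (name and path are placeholders)
openssl genrsa -out /etc/dkim/keys/202310.private 2048
# print the public key, to be published in the 202310._domainkey DNS TXT record
openssl rsa -in /etc/dkim/keys/202310.private -pubout
# once the old selector is retired and mail signed with it has been delivered,
# the old .private file is what gets published, so a leaked mail with its
# signature no longer proves who wrote it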

27 June 2021

François Marier: Removing unsafe-inline from Ikiwiki's style-src directive

After moving my Ikiwiki blog to my own server and enabling a basic CSP policy, I decided to see if I could tighten up the policy some more and stop relying on style-src 'unsafe-inline'. This does require that OpenID logins be disabled, but as a bonus, it also removes the need for jQuery to be present on the server.

Revised CSP policy First of all, I visited all of my pages in a Chromium browser and took note of the missing hashes listed in the developer tools console (Firefox doesn't show the missing hashes):
  • 'sha256-4Su6mBWzEIFnH4pAGMOuaeBrstwJN4Z3pq/s1Kn4/KQ='
  • 'sha256-j0bVhc2Wj58RJgvcJPevapx5zlVLw6ns6eYzK/hcA04='
  • 'sha256-j6Tt8qv7z2kSc7fUs0YHbrxawwsQcS05fVaX1r2qrbk='
  • 'sha256-p4cncjf0hAIeTSS5tXecf7qTUanDC27KdlKhT9eOsZU='
  • 'sha256-Y6v8OCtFfMmI5mbpwqCreLofmGZQfXYK7jJHCoHvn7A='
  • 'sha256-47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU='
which took care of all of the inline styles. Note that I kept unsafe-inline in the directive since it will be automatically ignored by browsers that understand hashes, but will be honored and keep the site working on older browsers. Next I added the new unsafe-hashes source expression along with the hash of the CSS fragment (clear: both) that is present on all pages related to comments in Ikiwiki:
$ echo -n "clear: both"   openssl dgst -sha256 -binary   openssl base64 -A
matwEc6givhWX0+jiSfM1+E5UMk8/UGLdl902bjFBmY=
My final style-src directive is therefore the following:
style-src 'self' 'unsafe-inline' 'unsafe-hashes' 'sha256-4Su6mBWzEIFnH4pAGMOuaeBrstwJN4Z3pq/s1Kn4/KQ=' 'sha256-j0bVhc2Wj58RJgvcJPevapx5zlVLw6ns6eYzK/hcA04=' 'sha256-j6Tt8qv7z2kSc7fUs0YHbrxawwsQcS05fVaX1r2qrbk=' 'sha256-p4cncjf0hAIeTSS5tXecf7qTUanDC27KdlKhT9eOsZU=' 'sha256-Y6v8OCtFfMmI5mbpwqCreLofmGZQfXYK7jJHCoHvn7A=' 'sha256-47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU=' 'sha256-matwEc6givhWX0+jiSfM1+E5UMk8/UGLdl902bjFBmY='
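A quick way to check that the directive being served matches what's in the config is to look at the response headers (example.com stands in for the real hostname):

$ curl -sI https://example.com/ | grep -i content-security-policy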

Browser compatibility While unsafe-hashes is not yet implemented in Firefox, it happens to work just fine due to a bug (i.e. unsafe-hashes is always enabled whether or not the policy contains it). It's possible that my new CSP policy won't work in Safari, but these CSS clears don't appear to be needed anyway, so it will just mean some extra CSP reporting noise.

Removing jQuery Since jQuery appears to only be used to provide the authentication system selector UI, I decided to get rid of it. I couldn't find a way to get Ikiwiki to stop pulling it in and so I put the following hack in my Apache config file:
# Disable jQuery.
Redirect 204 /ikiwiki/jquery.fileupload.js
Redirect 204 /ikiwiki/jquery.fileupload-ui.js
Redirect 204 /ikiwiki/jquery.iframe-transport.js
Redirect 204 /ikiwiki/jquery.min.js
Redirect 204 /ikiwiki/jquery.tmpl.min.js
Redirect 204 /ikiwiki/jquery-ui.min.css
Redirect 204 /ikiwiki/jquery-ui.min.js
Redirect 204 /ikiwiki/login-selector/login-selector.js
Replacing these files with an empty response seems to work very well and removes a whole lot of code that would otherwise be allowed by the script-src directive of my CSP policy. While there is a slight cosmetic change to the login page, I think the reduction in attack surface is well worth it.
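To confirm the redirects are doing their job, requesting one of the jQuery paths should come back as an empty 204 (again, example.com is a placeholder):

$ curl -s -o /dev/null -w '%{http_code}\n' https://example.com/ikiwiki/jquery.min.js
204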

6 June 2020

Russell Coker: Comparing Compression

I just did a quick test of different compression options in Debian. The source file is a 1.1G MySQL dump file. The time is user CPU time on an i7-930 running under KVM; the compression programs may have different levels of optimisation for other CPU families.

Facebook people designed the zstd compression system (here's a page giving an overview of it [1]). It has some interesting new features that can provide real differences at scale (like unusually large windows and pre-defined dictionaries), but I just tested the default mode and the -9 option for more compression. For the SQL file zstd -9 provides significantly better compression than gzip -9 while taking slightly less CPU time, and zstd with the default option (equivalent to zstd -3) gives much faster compression than gzip -9 while also producing slightly smaller output.

For this use case bzip2 is too slow for inline compression of a MySQL dump, as the dump process locks tables and can hang clients. The lzma and xz compression algorithms provide significant benefits in size but the time taken is grossly disproportionate.

In a quick check of my collection of files compressed with gzip I was only able to find one file that got less compression with zstd with default options, and that file got better compression with zstd -9. So zstd seems to beat gzip everywhere by every measure.

The bzip2 compression seems to be obsolete: zstd -9 is much faster and has slightly smaller output. Both xz and lzma seem to offer a combination of compression and time taken that zstd can't beat (for this file type at least). The ultra compression mode 22 gives 2% smaller output files, but almost 28 minutes of CPU time for compression is a bit ridiculous. There is a threaded mode for zstd that could potentially allow a shorter wall clock time for zstd --ultra -22 than lzma/xz while also giving better compression.
Compression        Time (user CPU)   Size (MB)
zstd               5.2s              130
zstd -9            28.4s             114
gzip -9            33.4s             141
bzip2 -9           3m51              119
lzma               6m20              97
xz                 6m36              97
zstd -19           9m57              99
zstd --ultra -22   27m46             95
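The exact commands aren't given in the post, but a run along these lines would produce comparable numbers (dump.sql is a stand-in for the 1.1G MySQL dump; -k keeps the input file so every tool starts from the same data, and -o gives each zstd run its own output name):

# the user time reported by "time" is the figure compared in the table above
time gzip -9 -k dump.sql
time bzip2 -9 -k dump.sql
time xz -k dump.sql
time zstd -3 dump.sql -o dump-3.sql.zst
time zstd -9 dump.sql -o dump-9.sql.zst
time zstd --ultra -22 dump.sql -o dump-22.sql.zst
ls -l dump.sql.gz dump.sql.bz2 dump.sql.xz dump-*.sql.zst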
Conclusion For distributions like Debian which have large archives of files that are compressed once and transferred a lot, the zstd --ultra -22 compression might be useful with multi-threaded compression. But given that Debian already has xz in use it might not be worth changing until faster CPUs with lots of cores become more commonly available. One could argue that for Debian it doesn't make sense to change from xz as hard drives seem to be gaining capacity (and shrinking in physical size) faster than the Debian archive is growing. One possible reason for adopting zstd in a distribution like Debian is that there are more tuning options for things like memory use. It would be possible to have packages for an architecture like ARM, which tends to have less RAM, compressed in a way that decreases memory use on decompression.

For general compression such as compressing log files and making backups it seems that zstd is the clear winner. Even bzip2 is far too slow, and in my tests zstd clearly beats gzip for every combination of compression and time taken. There may be some corner cases where gzip can compete on compression time due to CPU features, optimisation for particular CPUs, etc, but I expect that in almost all cases zstd will win for compression size and time. As an aside, I once noticed the 32bit build of gzip compressing faster than the 64bit build on an Opteron system; the 32bit version had assembly optimisation and the 64bit version didn't at that time.

To create a tar archive you can run tar czf or tar cJf to create an archive with gzip or xz compression. To create an archive with zstd compression you have to use tar --zstd -cf, that's 7 extra characters to type. It's likely that for most casual archive creation (EG for copying files around on a LAN or USB stick) saving 7 characters of typing is more of a benefit than saving a small amount of CPU time and storage space. It would be really good if tar got a single character option for zstd compression.

The external dictionary support in zstd would work really well with rsync for backups. Currently rsync only supports zlib; adding zstd support would be a good project for someone (unfortunately I don't have enough spare time). Now I will change my database backup scripts to use zstd.

Update: The command tar acvf a.zst filenames will create a zstd compressed tar archive, the a option to GNU tar makes it autodetect the compression type from the file name. Thanks Enrico!
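For reference, here are the tar invocations discussed above side by side; the directory and file names are placeholders, and the last line shows one way the threaded mode mentioned earlier could be used (-T0 tells zstd to use all available CPU cores):

tar czf backup.tar.gz somedir           # gzip
tar cJf backup.tar.xz somedir           # xz
tar --zstd -cf backup.tar.zst somedir   # zstd, spelled out
tar acvf backup.tar.zst somedir         # GNU tar -a picks zstd from the file name
tar -cf - somedir | zstd -T0 --ultra -22 -o archive.tar.zst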

19 October 2016

Reproducible builds folks: Reproducible Builds: week 77 in Stretch cycle

What happened in the Reproducible Builds effort between Sunday October 9 and Saturday October 15 2016:

Media coverage

Documentation update

After discussions with HW42, Steven Chamberlain, Vagrant Cascadian, Daniel Shahaf, Christopher Berg, Daniel Kahn Gillmor and others, Ximin Luo has started writing up more concrete and detailed design plans for setting SOURCE_ROOT_DIR for reproducible debugging symbols, buildinfo security semantics and buildinfo security infrastructure.

Toolchain development and fixes

Dmitry Shachnev noted that our patch for #831779 has been temporarily rejected by docutils upstream; we are trying to persuade them again. Tony Mancill uploaded javatools/0.59 to unstable containing an original patch by Chris Lamb. This fixed an issue where documentation Recommends: substvars would not be reproducible. Ximin Luo filed bug 77985 to GCC as a pre-requisite for future patches to make debugging symbols reproducible.

Packages reviewed and fixed, and bugs filed

The following updated packages have become reproducible - in our current test setup - after being fixed: The following updated packages appear to be reproducible now, for reasons we were not able to figure out. (Relevant changelogs did not mention reproducible builds.) Some uploads have addressed some reproducibility issues, but not all of them: Some uploads have addressed nearly all reproducibility issues, except for build path issues: Patches submitted that have not made their way to the archive yet:

Reviews of unreproducible packages

101 package reviews have been added, 49 have been updated and 4 have been removed in this week, adding to our knowledge about identified issues. 3 issue types have been updated:

Weekly QA work

During reproducibility testing, some FTBFS bugs have been detected and reported by:

tests.reproducible-builds.org

Debian: Openwrt/LEDE/NetBSD/coreboot/Fedora/archlinux:

Misc.

We are running a poll to find a good time for an IRC meeting. This week's edition was written by Ximin Luo, Holger Levsen & Chris Lamb and reviewed by a bunch of Reproducible Builds folks on IRC.

4 December 2015

Lunar: Why is Jack so angry?

Last summer, Innuendo Studios made a series of 6 short videos trying to understand why anyone would get involved in a coordinated harassment campaign. The recent articles making straw man arguments and gross mischaracterizations of the actions of people trying to grow the pool of free software contributors reminded me of these videos, especially the fifth episode, which describes why some people get so angry when others point out that maybe the general homogeneity of backgrounds is also related to how we treat people. I guess we can transpose one of the explanations in the video to free software communities:
Bad People do bad things; a sexist is a wife-beater or sexual assailant; I am neither; therefore I am a Good Person and the things I do are good; I work on free software where people report sexist biases; therefore they say I am a wife-beater or sexual assailant; this is a false and ridiculous claim; therefore they are bad.
This helps a little in understanding why some people would so strongly attack efforts that don't concern them. Sadly, it doesn't help much with what we could do about it. Perhaps we can help them understand that they might be just as biased as everyone else living in an institutionally sexist and racist society. This does not make them bad people. It's just something we all need to keep in mind to improve the situation. (Also, be sure to read the follow-up post if you watch the series until the last episode.) Thanks to Nicolas Dandrimont for his comments and suggestions.

13 October 2009

David Pashley: Tarballs explained

This entry was originally posted in slightly different form to Server Fault. If you're coming from a Windows world, you're used to using tools like zip or rar, which compress collections of files. In the typical Unix tradition of doing one thing and doing one thing well, you tend to have two different utilities: a compression tool and an archive tool. People then use these two tools together to give the same functionality that zip or rar provide.

There are numerous different compression formats; the common ones used on Linux these days are gzip (sometimes known as zlib) and the newer, higher performing bzip2. Unfortunately bzip2 uses more CPU and memory to provide the higher rates of compression. You can use these tools to compress any file, and by convention files compressed by these tools have the extensions .gz and .bz2 respectively. You can use gzip and bzip2 to compress, and gunzip and bunzip2 to decompress, these formats.

There are also several different types of archive formats available, including cpio, ar and tar, but people tend to only use tar. These allow you to take a number of files and pack them into a single file. They can also include path and permission information. You can create and unpack a tar file using the tar command. You might hear these operations referred to as "tarring" and "untarring". (The name of the command comes from a shortening of Tape ARchive. Tar was an improvement on the ar format in that you could use it to span multiple physical tapes for backups).
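For example, compressing and decompressing a single file with each tool looks like this (report.txt is just a placeholder name; both tools replace the original file with the compressed one by default, so the first command leaves you with report.txt.gz and the third with report.txt.bz2):

# gzip report.txt
# gunzip report.txt.gz
# bzip2 report.txt
# bunzip2 report.txt.bz2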
# tar -cf archive.tar list of files to include
This will create (-c) an archive into a file (-f) called archive.tar. (.tar is the conventional extension for tar archives.) You should now have a single file that contains five files ("list", "of", "files", "to" and "include"). If you give tar a directory, it will recurse into that directory and store everything inside it.
# tar -xf archive.tar
# tar -xf archive.tar list of files
This will extract (-x) the previously created archive.tar. You can extract just the files you want from the archive by listing them on the end of the command line. In our example, the second line would extract "list", "of" and "files", but not "to" and "include". You can also use
# tar -tf archive.tar
to get a list of the contents before you extract them. So now you can combine these two tools to replicate the functionality of zip:
# tar -cf archive.tar directory
# gzip archive.tar
You'll now have an archive.tar.gz file. You can extract it using:
# gunzip archive.tar.gz
# tar -xf archive.tar
We can use pipes to save us having an intermediate archive.tar:
# tar -cf - directory | gzip > archive.tar.gz
# gunzip < archive.tar.gz | tar -xf -
You can use - with the -f option to specify stdin or stdout (tar knows which one to use based on context). We can do slightly better because, in a slight apparent breaking of the "one job well" idea, tar can compress its output and decompress its input itself using the -z argument (I say apparent, because it still runs the gzip and gunzip command line tools behind the scenes):
# tar -czf archive.tar.gz directory
# tar -xzf archive.tar.gz
To use bzip2 instead of gzip, use bzip2, bunzip2 and -j instead of gzip, gunzip and -z respectively (tar -cjf archive.tar.bz2). Some versions of tar can detect a bzip2 archive when you use -z and do the right thing, but it is probably worth getting into the habit of being explicit.
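So the bzip2 equivalents of the earlier gzip examples look like this:

# tar -cjf archive.tar.bz2 directory
# tar -xjf archive.tar.bz2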