Search Results: "arnau"

17 January 2021

Wouter Verhelst: Software available through Extrepo

Just over 7 months ago, I blogged about extrepo, my answer to the "how do you safely install software on Debian without downloading random scripts off the Internet and running them as root" question. I also held a talk during the recent "MiniDebConf Online" that was held, well, online. The most important part of extrepo is "what can you install through it". If the number of available repositories is too low, there's really no reason to use it. So, I thought, let's look what we have after 7 months... To cut to the chase, there's a bunch of interesting content there, although not all of it has a "main" policy. Each of these can be enabled by installing extrepo, and then running extrepo enable <reponame>, where <reponame> is the name of the repository. Note that the list is not exhaustive, but I intend to show that even though we're nowhere near complete, extrepo is already quite useful in its current state:

Free software
  • The debian_official, debian_backports, and debian_experimental repositories contain Debian's official, backports, and experimental repositories, respectively. These shouldn't have to be managed through extrepo, but then again it might be useful for someone, so I decided to just add them anyway. The config here uses the deb.debian.org alias for CDN-backed package mirrors.
  • The belgium_eid repository contains the Belgian eID software. Obviously this is added, since I'm upstream for eID, and as such it was a large motivating factor for me to actually write extrepo in the first place.
  • elastic: the elasticsearch software.
  • Some repositories, such as dovecot, winehq and bareos contain upstream versions of their respective software. These two repositories contain software that is available in Debian, too; but their upstreams package their most recent release independently, and some people might prefer to run those instead.
  • The sury, fai, and postgresql repositories, as well as a number of repositories such as openstack_rocky, openstack_train, haproxy-1.5 and haproxy-2.0 (there are more) contain more recent versions of software packaged in Debian already by the same maintainer of that package repository. For the sury repository, that is PHP; for the others, the name should give it away. The difference between these repositories and the ones above is that it is the official Debian maintainer for the same software who maintains the repository, which is not the case for the others.
  • The vscodium repository contains the unencumbered version of Microsoft's Visual Studio Code; i.e., the codium version of Visual Studio Code is to code as the chromium browser is to chrome: it is a build of the same softare, but without the non-free bits that make code not entirely Free Software.
  • While Debian ships with at least two browsers (Firefox and Chromium), additional browsers are available through extrepo, too. The iridiumbrowser repository contains a Chromium-based browser that focuses on privacy.
  • Speaking of privacy, perhaps you might want to try out the torproject repository.
  • For those who want to do Cloud Computing on Debian in ways that isn't covered by Openstack, there is a kubernetes repository that contains the Kubernetes stack, the as well as the google_cloud one containing the Google Cloud SDK.

Non-free software While these are available to be installed through extrepo, please note that non-free and contrib repositories are disabled by default. In order to enable these repositories, you must first enable them; this can be accomplished through /etc/extrepo/config.yaml.
  • In case you don't care about freedom and want the official build of Visual Studio Code, the vscode repository contains it.
  • While we're on the subject of Microsoft, there's also Microsoft Teams available in the msteams repository. And, hey, skype.
  • For those who are not satisfied with the free browsers in Debian or any of the free repositories, there's opera and google_chrome.
  • The docker-ce repository contains the official build of Docker CE. While this is the free "community edition" that should have free licenses, I could not find a licensing statement anywhere, and therefore I'm not 100% sure whether this repository is actually free software. For that reason, it is currently marked as a non-free one. Merge Requests for rectifying that from someone with more information on the actual licensing situation of Docker CE would be welcome...
  • For gamers, there's Valve's steam repository.
Again, the above lists are not meant to be exhaustive. Special thanks go out to Russ Allbery, Kim Alvefur, Vincent Bernat, Nick Black, Arnaud Ferraris, Thorsten Glaser, Thomas Goirand, Juri Grabowski, Paolo Greppi, and Josh Triplett, for helping me build the current list of repositories. Is your favourite repository not listed? Create a configuration based on template.yaml, and file a merge request!

19 September 2020

Bits from Debian: New Debian Maintainers (July and August 2020)

The following contributors were added as Debian Maintainers in the last two months: Congratulations!

24 August 2020

Arnaud Rebillout: Send emails from your terminal with msmtp

In this tutorial, we'll configure everything needed to send emails from the terminal. We'll use msmtp, a lightweight SMTP client. For the sake of the example, we'll use a GMail account, but any other email provider can do. Your OS is expected to be Debian, as usual on this blog, although it doesn't really matter. We will also see how to store the credentials for the email account in the system keyring. And finally, we'll go the extra mile, and see how to configure various command-line utilities so that they automatically use msmtp to send emails. Even better, we'll make msmtp the default email sender, to actually avoid configuring these utilities one by one. Prerequisites Strong prerequisites (if you don't recognize yourself here, you probably landed on the wrong page): Weak prerequisites (if your setup doesn't match those points exactly, that's fine, you can still read on): GMail account setup For a GMail account, there's a bit of configuration to do. For other email providers, I have no idea, maybe you can just skip this part, or maybe you will have to go through a similar procedure. If you want an external program (msmtp in this case) to talk to the GMail servers on your behalf, and send emails, you can't just use your usual GMail password. Instead, GMail requires you to generate so-called app passwords, one for each application that needs to access your GMail account. This approach has several advantages: So app passwords are a good idea, it just requires a bit of work to set it up. Let's see what it takes. First, 2-Step Verification must be enabled on your GMail account. Visit https://myaccount.google.com/security, and if that's not the case, enable it. You'll need to authorize all of your devices (computer(s), phone(s) and so on), and it can be a bit tedious, granted. But you only have to do it once in a lifetime, and after it's done, you're left with a more secure account, so it's not that bad, right? Enabling the 2-Step Verification will unlock the feature we need: App passwords. Visit https://myaccount.google.com/apppasswords, and under "Signing in to Google", click "App passwords", and generate one. An app password is a 16 characters string, something like qwertyuiopqwerty. It's supposed to be used from only one place, ie. from ONE application that is installed on ONE device. That's why it's common to give it a name of the form application@device, so in our case it could be msmtp@laptop, but really it's free form, choose whatever name suits you, as long as it makes sense to you. So let's give a name to this app password, write it down for now, and we're done with the GMail config. Send your first email Time to get started with msmtp. First thing first, installation, trivial:
sudo apt install msmtp
Let's try to send an email. At this point, we did not create any configuration file for msmtp yet, so we have to provide every details on the command line.
# Write a dummy email
cat << EOF > message.txt
From: YOUR_LOGIN@gmail.com
To: SOMEONE_ELSE@SOMEWHERE_ELSE.com
Subject: Cafe Sua Da
Iced-coffee with condensed milk
EOF
# Send it
cat message.txt   msmtp \
    --auth=on --tls=on \
    --host smtp.gmail.com \
    --port 587 \
    --user YOUR_LOGIN \
    --read-envelope-from \
    --read-recipients
# msmtp prompts you for your password:
# this is where goes the app password!
Obviously, in this example you should replace the uppercase words with the real thing, that is, your email login, and real email addresses. Also, let me insist, you must enter the app password that was generated previously, not your real GMail password. And it should work already, this email should have been sent and received by now. So let me explain quickly what happened here. In the file message.txt, we provided From: (the email address of the person sending the email) and To: (the destination email address). Then we asked msmtp to re-use those values to set the envelope of the email with --read-envelope-from and --read-recipients. What about the other parameters? For more details, you should refer to the msmtp documentation. Write a configuration file So we could send an email, that's cool already. However the command to do that was a bit long, and we don't want to juggle with all these arguments every time we send an email. So let's write down all of that into a configuration file. msmtp supports two locations: ~/.msmtprc and ~/.config/msmtp/config, at your preference. In this tutorial we'll use ~/.msmtprc for brevity:
cat << 'EOF' > ~/.msmtprc
defaults
tls on
account gmail
auth on
host smtp.gmail.com
port 587
user YOUR_LOGIN
from YOUR_LOGIN@gmail.com
account default : gmail
EOF
And for a quick explanation: All in all it's pretty simple, and it's becoming easier to send an email:
# Write a dummy email. Note that the
# header 'From:' is no longer needed,
# it's already in '~/.msmtprc'.
cat << 'EOF' > message.txt
To: SOMEONE_ELSE@SOMEWHERE_ELSE.com
Subject: Flat White
The milky way for coffee
EOF
# Send it
cat message.txt   msmtp \
    --account default \
    --read-recipients
Actually, --account default is not needed, as it's the default anyway if you don't provide a --account argument. Furthermore --read-recipients can be shortened as -t. So we can make it real short now:
msmtp -t < message.txt
At this point, life is good! Except for one thing maybe: we still have to type the password every time we send an email. Surely it must be possible to avoid that annoyance... Store your password in the system keyring For this part, we'll make use of the libsecret tool to store the password in the system keyring via the Secret Service API. It means that your desktop environment should implement the Secret Service specification, which is the case for both GNOME and KDE. Note that GNOME provides Seahorse to have a look at your secrets, KDE has the KDE Wallet. There's also KeePassXC, which I have only heard of but never used. I guess it can be your password manager of choice if you use neither GNOME nor KDE. For those running an up-to-date Debian unstable, you should have msmtp >= 1.8.11-2, and you're all good to go. For those having an older version than that however, you will have to install the package msmtp-gnome in order to have msmtp built with libsecret support. Note that this package depends on seahorse, hence it pulls in a good part of the GNOME stack when you install it. For those not running GNOME, that's unfortunate. All of this was discussed and fixed in #962689. Alright! So let's just make sure that the libsecret tools are installed:
sudo apt install libsecret-tools
And now we can store our password in the system keyring with this command:
secret-tool store --label msmtp \
    host smtp.gmail.com \
    service smtp \
    user YOUR_LOGIN
If this looks a bit too magic, and you want something more visual, you can actually fire a GUI like seahorse (for GNOME users), or kwalletmanager5 (for KDE users), and then you will see what passwords are stored in there. Here's a screenshot of Seahorse, with a msmtp password stored: seahorse with msmtp password Let's try to send an email again:
msmtp -t < message.txt
No need for a password anymore, msmtp got it from the system keyring! For more details on how msmtp handle the passwords, and to see what other methods are supported, refer to the extensive documentation. Use-cases and integration Let's go over a few use-cases, situations where you might end up sending emails from the command-line, and what configuration is required to make it work with msmtp. Git Send-Email Sending emails with git is a common workflow for some projects, like the Linux kernel. How does git send-email actually send emails? From the git-send-email manual page:
the built-in default is to search for sendmail in /usr/sbin, /usr/lib and $PATH if such program is available
It is possible to override this default though:
--smtp-server=
[...] Alternatively it can specify a full pathname of a sendmail-like program instead; the program must support the -i option.
So in order to use msmtp here, you'd add a snippet like that to your ~/.gitconfig file:
[sendemail]
    smtpserver = /usr/bin/msmtp
For a full guide, you can also refer to https://git-send-email.io. Debian developer tools Tools like bts or reportbug are also good examples of command-line tools that need to send emails. From the bts manual page:
--sendmail=SENDMAILCMD
Specify the sendmail command [...] Default is /usr/sbin/sendmail.
So if you want bts to send emails with msmtp instead of sendmail, you must use bts --sendmail='/usr/bin/msmtp -t'. Note that bts also loads settings from the file /etc/devscripts.conf and ~/.devscripts, so you could also set BTS_SENDMAIL_COMMAND='/usr/bin/msmtp -t' in one of those files. From the reportbug manual page:
--mta=MTA
Specify an alternate MTA, instead of /usr/sbin/sendmail (the default).
In order to use msmtp here, you'd write reportbug --mta=/usr/bin/msmtp. Note that reportbug reads it settings from /etc/reportbug.conf and ~/.reportbugrc, so you could as well set mta /usr/bin/msmtp in one of those files. So who is this sendmail again? By now, you probably noticed that sendmail seems to be considered the default tool for the job, the "traditional" command that has been around for ages. Rather than configuring every tool to use something else than sendmail, wouldn't it be simpler to actually replace sendmail by msmtp? Like, create a symlink that points to msmtp, something like ln -sr /usr/bin/msmtp /usr/sbin/sendmail? So that msmtp acts as a drop-in replacement for sendmail, and there's nothing else to configure? Answer is yes, kind of. Actually, the first msmtp feature that is listed on the homepage is "Sendmail compatible interface (command line options and exit codes)". Meaning that msmtp is a drop-in replacement for sendmail, that seems to be the intent. However, you should refrain from creating or modifying anything in /usr, as it's the territory of the package manager, apt. Any change in /usr might be overwritten by apt the next time you run an upgrade or install new packages. In the case of msmtp, there is actually a package named msmtp-mta that will create this symlink for you. So if you really want a definitive replacement for sendmail, there you go:
sudo apt install msmtp-mta
From this point, sendmail is now a symlink /usr/sbin/sendmail /usr/bin/msmtp, and there's no need to configure git, bts, reportbug or any other tool that would rely on sendmail. Everything should work "out of the box". Conclusion I hope that you enjoyed reading this article! If you have any comment, feel free to send me a short email, preferably from your terminal!

17 August 2020

Arnaud Rebillout: Modify Vim syntax files for your taste

In this short how-to, we'll see how to make small modifications to a Vim syntax file, in order to change how a particular file format is highlighted. We'll go for a simple use-case: modify the Markdown syntax file, so that H1 and H2 headings (titles and subtitles, if you prefer) are displayed in bold. Of course, this won't be exactly as easy as expected, but no worries, we'll succeed in the end. The calling Let's start with a screenshot: how Vim displays Markdown files for me, someone who use the GNOME terminal with the Solarized light theme. Vim - Markdown file with original highlighting I'm mostly happy with that, except for one or two little details. I'd like to have the titles displayed in bold, for example, so that they're easier to spot when I skim through a Markdown file. It seems like a simple thing to ask, so I hope there can be a simple solution. The first steps Let's learn the basics. In Vim world, the rules to highlight files formats are defined in the directory /usr/share/vim/vim82/syntax (I bet you'll have to adjust this path depending on the version of Vim that is installed on your system). And so, for the Markdown file format, the rules are defined in the file /usr/share/vim/vim82/syntax/markdown.vim. The first thing we could do is to have a look at this file, try to make sense of it, and maybe start to make some modifications. But wait a moment. You should know that modifying a system file is not a great idea. First because your changes will be lost as soon as an update kicks in and the package manager replaces this file by a new version. Second, because you will quickly forget what files you modified, and what were your modifications, and if you do that too much, you might experience what is called "maintenance headache" in the long run. So instead, maybe you DO NOT modify this file, and instead you copy it in your personal Vim folder, more precisely in ~/.vim/syntax. Create this directory if it does not exist:
mkdir -p ~/.vim/syntax
cp /usr/share/vim/vim82/syntax/markdown.vim ~/.vim/syntax
The file in your personal folder takes precedence over the system file of the same name in /usr/share/vim/vim82/syntax/, it is a replacement for the existing syntax files. And so from now on, Vim uses the file ~/.vim/syntax/markdown.vim, and this is where we can make our modifications. (And by the way, this is explained in the Vim faq-24.12) And so, it's already nice to know all of that, but wait, there's even better. There's is another location of interest, and it is ~/.vim/after/syntax. You can drop syntax files in this directory, and these files are treated as additions to the existing syntax. So if you only want to make slight modifications, that's the way to go. (And by the way, this is explained in the Vim faq-24.11) So let's forget about a syntax replacement in ~/.vim/syntax/markdown.vim, and instead let's go for some syntax additions in ~/.vim/after/syntax/markdown.vim.
mkdir -p ~/.vim/after/syntax
touch ~/.vim/after/syntax/markdown.vim
Now, let's answer the initial question: how do we modify the highlighting rules for Markdown files, so that the titles are displayed in bold? First, we have to understand where are the rules that define the highlighting for titles. Here there are, from the file /usr/share/vim/vim82/syntax/markdown.vim:
hi def link markdownH1 htmlH1
hi def link markdownH2 htmlH2
hi def link markdownH3 htmlH3
...
You should know that H1 means Heading 1, and so on, and so we want to make H1 and H2 bold. What we can see here is that the headings in the Markdown files are highlighted like the headings in HTML files, and this is obviously defined in the file /usr/share/vim/vim82/syntax/html.vim. So let's have a look into this file:
hi def link htmlH1 Title
hi def link htmlH2 htmlH1
hi def link htmlH3 htmlH2
...
Let's keep digging a bit. Where is Title defined? For those using the default color scheme like me, this is defined straight in the Vim source code, in the file src/highlight.c.
CENT("Title term=bold ctermfg=DarkMagenta",
     "Title term=bold ctermfg=DarkMagenta gui=bold guifg=Magenta"),
And for those using custom color schemes, it might be defined in a file under /usr/share/vim/vim82/colors/. Alright, so how do we override that? We can just define this kind of rules in our syntax additions file at ~/.vim/after/syntax/markdown.vim:
hi link markdownH1 markdownHxBold
hi link markdownH2 markdownHxBold
hi markdownHxBold  term=bold ctermfg=DarkMagenta gui=bold guifg=Magenta cterm=bold
As you can see, the only addition we made, compared to what's defined in src/highlight.c, is cterm=bold. And that's already enough to achieve the initial goal, make the titles (ie. H1 and H2) bold. The result can be seen in the following screenshot: Vim - Markdown file with modified highlighting The rabbit hole So we could stop right here, and life would be easy and good. However, with this solution there's still something that is not perfect. We use the color DarkMagenta as defined in the default color scheme. What I didn't mention however, is that this is applicable for a light background. If you have a dark background though, dark magenta won't be easy to read. Actually, if you look a bit more into src/highlight.c, you will see that the default color scheme comes in two variants, one for a light background, and one for a dark background. And so the definition for Title for a dark background is as follow:
CENT("Title term=bold ctermfg=LightMagenta",
     "Title term=bold ctermfg=LightMagenta gui=bold guifg=Magenta"),
Hmmm, so how do we do that in our syntax file? How can we support both light and dark background, so that the color is right in both cases? After a bit of research, and after looking at other syntax files, it seems that the solution is to check for the value of the background option, and so our syntax file becomes:
hi link markdownH1 markdownHxBold
hi link markdownH2 markdownHxBold
if &background == "light"
  hi markdownHxBold term=bold ctermfg=DarkMagenta gui=bold guifg=Magenta cterm=bold
else
  hi markdownHxBold term=bold ctermfg=LightMagenta gui=bold guifg=Magenta cterm=bold
endif
In case you wonder, in Vim script you prefix Vim options with &, and so you get the value of the background option by writing &background. You can learn this kind of things in the Vim scripting cheatsheet. And so, it's easy enough, except for one thing: it doesn't work. The headings always show up in DarkMagenta, even for a dark background. This is why I called this paragraph "the rabbit hole", by the way. So... Well after trying a few things, I noticed that in order to make it work, I would have to reload the syntax files with :syntax on. At this point, the most likely explanation is that the background option is not set yet when the syntax files are loaded at startup, hence it needs to be reloaded manually afterward. And after muuuuuuch research, I found out that it's actually possible to set a hook for when an option is modified. Meaning, it's possible to execute a function when the background option is modified. Quite cool actually. And so, there it goes in my ~/.vimrc:
" Reload syntax when the background changes 
autocmd OptionSet background if exists("g:syntax_on")   syntax on   endif
For humans, this line reads as:
  1. when the background option is modified -- autocmd OptionSet background
  2. check if the syntax is on -- if exists("g:syntax_on")
  3. if that's the case, reload it -- syntax on
With that in place, my Markdown syntax overrides work for both dark and light background. Champagne! The happy end To finish, let me share my actual additions to the markdown.vim syntax. It makes H1 and H2 bold, along with their delimiters, and it also colors the inline code and the code blocks.
" H1 and H2 headings -> bold
hi link markdownH1 markdownHxBold
hi link markdownH2 markdownHxBold
" Heading delimiters (eg '#') and rules (eg '----', '====') -> bold
hi link markdownHeadingDelimiter markdownHxBold
hi link markdownRule markdownHxBold
" Code blocks and inline code -> highlighted
hi link markdownCode htmlH1
" The following test requires this addition to your vimrc:
" autocmd OptionSet background if exists("g:syntax_on")   syntax on   endif
if &background == "light"
  hi markdownHxBold term=bold ctermfg=DarkMagenta gui=bold guifg=Magenta cterm=bold
else
  hi markdownHxBold term=bold ctermfg=LightMagenta gui=bold guifg=Magenta cterm=bold
endif
And here's how it looks like with a light background: Vim - Markdown file with final highlighting (light) And a dark background: Vim - Markdown file with final highlighting (dark) That's all, that's very little changes compared to the highlighting from the original syntax file, and now that we understand how it's supposed to be done, it's not much effort to achieve it. It's just that finding the workaround to make it work for both light and dark background took forever, and leaves the usual, unanswered question: bug or feature?

10 August 2020

Arnaud Rebillout: GoAccess 1.4, a detailed tutorial

GoAccess v1.4 was just released a few weeks ago! Let's take this chance to write a loooong tutorial. We'll go over every steps to install and operate GoAccess. This is a tutorial aimed at those who don't play sysadmin every day, and that's why it's so long, I did my best to provide thorough explanations all along, so that it's more than just a "copy-and-paste" kind of tutorial. And for those who do play sysadmin everyday: please try not to fall asleep while reading, and don't hesitate to drop me an e-mail if you spot anything inaccurate in here. Thanks! Introduction So what's GoAccess already? GoAccess is a web log analyzer, and it allows you to visualize the traffic for your website, and get to know a bit more about your visitors: how many visitors and hits, for which pages, coming from where (geolocation, operating system, web browser...), etc... It does so by parsing the access logs from your web server, be it Apache, NGINX or whatever. GoAccess gives you different options to display the statistics, and in this tutorial we'll focus on producing a HTML report. Meaning that you can see the statistics for your website straight in your web browser, under the form of a single HTML page. For an example, you can have a look at the stats of my blog here: http://goaccess.arnaudr.io. GoAccess is written in C, it has very few dependencies, it had been around for about 10 years, and it's distributed under the MIT license. Assumptions This tutorial is about installing and configuring, so I'll assume that all the commands are run as root. I won't prefix each of them with sudo. I use the Apache web server, running on a Debian system. I don't think it matters so much for this tutorial though. If you're using NGINX it's fine, you can keep reading. Also, I will just use the name SITE for the name of the website that we want to analyze with GoAccess. Just replace that with the real name of your site. I also assume the following locations for your stuff: If you have your stuff in /srv/SITE/ log,www instead, no worries, just adjust the paths accordingly, I bet you can do it. Installation The latest version of GoAccess is v1.4, and it's not yet available in the Debian repositories. So for this part, you can follow the instructions from the official GoAccess download page. Install steps are explained in details, so there's nothing left for me to say :) When this is done, let's get started with the basics. We're talking about the latest version v1.4 here, let's make sure:
$ goaccess --version
GoAccess - 1.4.
...
Now let's try to create a HTML report. I assume that you already have a website up and running. GoAccess needs to parse the access logs. These logs are optional, they might or might not be created by your web server, depending on how it's configured. Usually, these log files are named access.log, unsurprisingly. You can check if those logs exist on your system by running this command:
find /var/log -name access.log
Another important thing to know is that these logs can be in different formats. In this tutorial we'll assume that we work with the combined log format, because it seems to be the most common default. To check what kind of access logs your web server produces, you must look at the configuration for your site. For an Apache web server, you should have such a line in the file /etc/apache2/sites-enabled/SITE.conf:
CustomLog $ APACHE_LOG_DIR /SITE/access.log combined
For NGINX, it's quite similar. The configuration file would be something like /etc/nginx/sites-enabled/SITE, and the line to enable access logs would be something like:
access_log /var/log/nginx/SITE/access.log
Note that NGINX writes the access logs in the combined format by default, that's why you don't see the word combined anywhere in the line above: it's implicit. Alright, so from now on we assume that yes, you have access log files available, and yes, they are in the combined log format. If that's the case, then you can already run GoAccess and generate a report, for example for the log file /var/log/apache2/access.log
goaccess \
    --log-format COMBINED \
    --output /tmp/report.html \
    /var/log/apache2/access.log
It's possible to give GoAccess more than one log files to process, so if you have for example the file access.log.1 around, you can use it as well:
goaccess \
    --log-format COMBINED \
    --output /tmp/report.html \
    /var/log/apache2/access.log \
    /var/log/apache2/access.log.1
If GoAccess succeeds (and it should), you're on the right track! All is left to do to complete this test is to have a look at the HTML report created. It's a single HTML page, so you can easily scp it to your machine, or just move it to the document root of your site, and then open it in your web browser. Looks good? So let's move on to more interesting things. Web server configuration This part is very short, because in terms of configuration of the web server, there's very little to do. As I said above, the only thing you want from the web server is to create access log files. Then you want to be sure that GoAccess and your web server agree on the format for these files. In the part above we used the combined log format, but GoAccess supports many other common log formats out of the box, and even allows you to parse custom log formats. For more details, refer to the option --log-format in the GoAccess manual page. Another common log format is named, well, common. It even has its own Wikipedia page. But compared to combined, the common log format contains less information, it doesn't include the referrer and user-agent values, meaning that you won't have it in the GoAccess report. So at this point you should understand that, unsurprisingly, GoAccess can only tell you about what's in the access logs, no more no less. And that's all in term of web server configuration. Configuration to run GoAccess unprivileged Now we're going to create a user and group for GoAccess, so that we don't have to run it as root. The reason is that, well, for everything running unattended on your server, the less code runs as root, the better. It's good practice and common sense. In this case, GoAccess is simply a log analyzer. So it just needs to read the logs files from your web server, and there is no need to be root for that, an unprivileged user can do the job just as well, assuming it has read permissions on /var/log/apache2 or /var/log/nginx. The log files of the web server are usually part of the adm group (though it might depend on your distro, I'm not sure). This is something you can check easily with the following command:
ls -l /var/log   grep -e apache2 -e nginx
As a result you should get something like that:
drwxr-x--- 2 root adm 20480 Jul 22 00:00 /var/log/apache2/
And as you can see, the directory apache2 belongs to the group adm. It means that you don't need to be root to read the logs, instead any unprivileged user that belongs to the group adm can do it. So, let's create the goaccess user, and add it to the adm group:
adduser --system --group --no-create-home goaccess
addgroup goaccess adm
And now, let's run GoAccess unprivileged, and verify that it can still read the log files:
setpriv \
    --reuid=goaccess --regid=goaccess \
    --init-groups --inh-caps=-all \
    -- \
    goaccess \
    --log-format COMBINED \
    --output /tmp/report2.html \
    /var/log/apache2/access.log
setpriv is the command used to drop privileges. The syntax is quite verbose, it's not super friendly for tutorials, but don't be scared and read the manual page to learn what it does. In any case, this command should work, and at this point, it means that you have a goaccess user ready, and we'll use it to run GoAccess unprivileged. Integration, option A - Run GoAccess once a day, from a logrotate hook In this part we wire things together, so that GoAccess processes the log files once a day, adds the new logs to its internal database, and generates a report from all that aggregated data. The result will be a single HTML page. Introducing logrotate In order to do that, we'll use a logrotate hook. logrotate is a little tool that should already be installed on your server, and that runs once a day, and that is in charge of rotating the log files. "Rotating the logs" means moving access.log to access.log.1 and so on. With logrotate, a new log file is created every day, and log files that are too old are deleted. That's what prevents your logs from filling up your disk basically :) You can check that logrotate is indeed installed and enabled with this command (assuming that your init system is systemd):
systemctl status logrotate.timer
What's interesting for us is that logrotate allows you to run scripts before and after the rotation is performed, so it's an ideal place from where to run GoAccess. In short, we want to run GoAccess just before the logs are rotated away, in the prerotate hook. But let's do things in order. At first, we need to write a little wrapper script that will be in charge of running GoAccess with the right arguments, and that will process all of your sites. The wrapper script This wrapper is made to process more than one site, but if you have only one site it works just as well, of course. So let me just drop it on you like that, and I'll explain afterward. Here's my wrapper script:
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
#!/bin/bash
# Process log files /var/www/apache2/SITE/access.log,
# only if /var/lib/goaccess-db/SITE exists.
# Create HTML reports in $1, a directory that must exist.
set -eu
OUTDIR=
LOGDIR=/var/log/apache2
DBDIR=/var/lib/goaccess-db
fail()   echo >&2 "$@"; exit 1;  
[ $# -eq 1 ]   fail "Usage: $(basename $0) OUTPUT_DIRECTORY"
OUTDIR=$1
[ -d "$OUTDIR" ]   fail "'$OUTDIR' is not a directory"
[ -d "$LOGDIR" ]   fail "'$LOGDIR' is not a directory"
[ -d "$DBDIR"  ]   fail "'$DBDIR' is not a directory"
for d in $(find "$LOGDIR" -mindepth 1 -maxdepth 1 -type d); do
    site=$(basename "$sitedir")
    dbdir=$DBDIR/$site
    logfile=$d/access.log
    outfile=$OUTDIR/$site.html
    if [ ! -d "$dbdir" ]   [ ! -e "$logfile" ]; then
        echo "  Skipping site '$site'"
        continue
    else
        echo "  Processing site '$site'"
    fi
    setpriv \
        --reuid=goaccess --regid=goaccess \
        --init-groups --inh-caps=-all \
        -- \
    goaccess \
        --agent-list \
        --anonymize-ip \
        --persist \
        --restore \
        --config-file /etc/goaccess/goaccess.conf \
        --db-path "$dbdir" \
        --log-format "COMBINED" \
        --output "$outfile" \
        "$logfile"
done
So you'd install this script at /usr/local/bin/goaccess-wrapper for example, and make it executable:
chmod +x /usr/local/bin/goaccess-wrapper
A few things to note: As is, the script makes the assumption that the logs for your site are logged in a sub-directory /var/log/apache2/SITE/. If it's not the case, adjust that in the wrapper accordingly. The name of this sub-directory is then used to find the GoAccess database directory /var/lib/goaccess-db/SITE/. This directory is expected to exist, meaning that if you don't create it yourself, the wrapper won't process this particular site. It's a simple way to control which sites are processed by this GoAccess wrapper, and which sites are not. So if you want goaccess-wrapper to process the site SITE, just create a directory with the name of this site under /var/lib/goaccess-db:
mkdir -p /var/lib/goaccess-db/SITE
chown goaccess:goaccess /var/lib/goaccess-db/SITE
Now let's create an output directory:
mkdir /tmp/goaccess-reports
chown goaccess:goaccess /tmp/goaccess-reports
And let's give a try to the wrapper script:
goaccess-wrapper /tmp/goaccess-reports
ls /tmp/goaccess-reports
Which should give you:
SITE.html
At the same time, you can check that GoAccess populated the database with a bunch of files:
ls /var/lib/goaccess-db/SITE
Setting up the logrotate prerotate hook At this point, we have the wrapper in place. Let's now add a pre-rotate hook so that goaccess-wrapper runs once a day, just before the logs are rotated away. The logrotate config file for Apache2 is located at /etc/logrotate.d/apache2, and for NGINX it's at /etc/logrotate.d/nginx. Among the many things you'll see in this file, here's what is of interest for us: In the config file, there is also this snippet:
prerotate
    if [ -d /etc/logrotate.d/httpd-prerotate ]; then \
        run-parts /etc/logrotate.d/httpd-prerotate; \
    fi; \
endscript
It indicates that scripts in the directory /etc/logrotate.d/httpd-prerotate/ will be executed before the rotation takes place. Refer to the man page run-parts(8) for more details... Putting all of that together, it means that logs from the web server are rotated once a day, and if we want to run scripts just before the rotation, we can just drop them in the httpd-prerotate directory. Simple, right? Let's first create this directory if it doesn't exist:
mkdir -p /etc/logrotate.d/httpd-prerotate/
And let's create a tiny script at /etc/logrotate.d/httpd-prerotate/goaccess:
1
2
#!/bin/sh
exec goaccess-wrapper /tmp/goaccess-reports
Don't forget to make it executable:
chmod +x /etc/logrotate.d/httpd-prerotate/goaccess
As you can see, the only thing that this script does is to invoke the wrapper with the right argument, ie. the output directory for the HTML reports that are generated. And that's all. Now you can just come back tomorrow, check the logs, and make sure that the hook was executed and succeeded. For example, this kind of command will tell you quickly if it worked:
journalctl   grep logrotate
Integration, option B - Run GoAccess once a day, from a systemd service OK so we've just seen how to use a logrotate hook. One downside with that is that we have to drop privileges in the wrapper script, because logrotate runs as root, and we don't want to run GoAccess as root. Hence the rather convoluted syntax with setpriv. Rather than embedding this kind of thing in a wrapper script, we can instead run the wrapper script from a [systemd][] service, and define which user runs the wrapper straight in the systemd service file. Introducing systemd niceties So we can create a systemd service, along with a systemd timer that fires daily. We can then set the user and group that execute the script straight in the systemd service, and there's no need for setpriv anymore. It's a bit more streamlined. We can even go a bit further, and use systemd parameterized units (also called templates), so that we have one service per site (instead of one service that process all of our sites). That will simplify the wrapper script a lot, and it also looks nicer in the logs. With this approach however, it seems that we can't really run exactly before the logs are rotated away, like we did in the section above. But that's OK. What we'll do is that we'll run once a day, no matter the time, and we'll just make sure to process both log files access.log and access.log.1 (ie. the current logs and the logs from yesterday). This way, we're sure not to miss any line from the logs. Note that GoAccess is smart enough to only consider newer entries from the log files, and discard entries that are already in the database. In other words, it's safe to parse the same log file more than once, GoAccess will do the right thing. For more details see "INCREMENTAL LOG PROCESSING" from man goaccess. systemd]: https://freedesktop.org/wiki/Software/systemd/ Implementation And here's how it all looks like. First, a little wrapper script for GoAccess:
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
#!/bin/bash
# Usage: $0 SITE DBDIR LOGDIR OUTDIR
set -eu
SITE=$1
DBDIR=$2
LOGDIR=$3
OUTDIR=$4
LOGFILES=()
for ext in log log.1; do
    logfile="$LOGDIR/access.$ext"
    [ -e "$logfile" ] && LOGFILES+=("$logfile")
done
if [ $ #LOGFILES[@]  -eq 0 ]; then
    echo "No log files in '$LOGDIR'"
    exit 0
fi
goaccess \
    --agent-list \
    --anonymize-ip \
    --persist \
    --restore \
    --config-file /etc/goaccess/goaccess.conf \
    --db-path "$DBDIR" \
    --log-format "COMBINED" \
    --output "$OUTDIR/$SITE.html" \
    "$ LOGFILES[@] "
This wrapper does very little. Actually, the only thing it does is to check for the existence of the two log files access.log and access.log.1, to be sure that we don't ask GoAccess to process a file that does not exist (GoAccess would not be happy about that). Save this file under /usr/local/bin/goaccess-wrapper, don't forget to make it executable:
chmod +x /usr/local/bin/goaccess-wrapper
Then, create a systemd parameterized unit file, so that we can run this wrapper as a systemd service. Save it under /etc/systemd/system/goaccess@.service:
[Unit]
Description=Update GoAccess report - %i
ConditionPathIsDirectory=/var/lib/goaccess-db/%i
ConditionPathIsDirectory=/var/log/apache2/%i
ConditionPathIsDirectory=/tmp/goaccess-reports
PartOf=goaccess.service
[Service]
Type=oneshot
User=goaccess
Group=goaccess
Nice=19
ExecStart=/usr/local/bin/goaccess-wrapper \
 %i \
 /var/lib/goaccess-db/%i \
 /var/log/apache2/%i \
 /tmp/goaccess-reports
So, what is a systemd parameterized unit? It's a service to which you can pass an argument when you enable it. The %i in the unit definition will be replaced by this argument. In our case, the argument will be the name of the site that we want to process. As you can see, we use the directive ConditionPathIsDirectory= extensively, so that if ever one of the required directories does not exist, the unit will just be skipped (and marked as such in the logs). It's a graceful way to fail. We run the wrapper as the user and group goaccess, thanks to User= and Group=. We also use Nice= to give a low priority to the process. At this point, it's already possible to test. Just make sure that you created a directory for the GoAccess database:
mkdir -p /var/lib/goaccess-db/SITE
chown goaccess:goaccess /var/lib/goaccess-db/SITE
Also make sure that the output directory exists:
mkdir /tmp/goaccess-reports
chown goaccess:goaccess /tmp/goaccess-reports
Then reload systemd and fire the unit to see if it works:
systemctl daemon-reload
systemctl start goaccess@SITE.service
journalctl   tail
And that should work already. As you can see, the argument, SITE, is passed in the systemctl start command. We just append it after the @, in the name of the unit. Now, let's create another GoAccess service file, which sole purpose is to group all the parameterized units together, so that we can start them all in one go. Note that we don't use a systemd target for that, because ultimately we want to run it once a day, and that would not be possible with a target. So instead we use a dummy oneshot service. So here it is, saved under /etc/systemd/system/goaccess.service:
[Unit]
Description=Update GoAccess reports
Requires= \
 goaccess@SITE1.service \
 goaccess@SITE2.service
[Service]
Type=oneshot
ExecStart=true
As you can see, we simply list the sites that we want to process in the Requires= directive. In this example we have two sites named SITE1 and SITE2. Let's ensure that everything is still good:
systemctl daemon-reload
systemctl start goaccess.service
journalctl   tail
Check the logs, both sites SITE1 and SITE2 should have been processed. And finally, let's create a timer, so that systemd runs goaccess.service once a day. Save it under /etc/systemd/system/goaccess.timer.
[Unit]
Description=Daily update of GoAccess reports
[Timer]
OnCalendar=daily
RandomizedDelaySec=1h
Persistent=true
[Install]
WantedBy=timers.target
Finally, enable the timer:
systemctl daemon-reload
systemctl enable --now goaccess.timer
At this point, everything should be OK. Just come back tomorrow and check the logs with something like:
journalctl   grep goaccess
Last word: if you have only one site to process, of course you can simplify, for example you can hardcode all the paths in the file goaccess.service instead of using a parameterized unit. Up to you. Daily operations So in this part, we assume that you have GoAccess all setup and running, once a day or so. Let's just go over a few things worth noting. Serve your report Up to now in this tutorial, we created the reports in /tmp/goaccess-reports, but that was just for the sake of the example. You will probably want to save your reports in a directory that is served by your web server, so that, well, you can actually look at it in your web browser, that was the point, right? So how to do that is a bit out of scope here, and I guess that if you want to monitor your website, you already have a website, so you will have no trouble serving the GoAccess HTML report. However there's an important detail to be aware of: GoAccess shows all the IP addresses of your visitors in the report. As long as the report is private it's OK, but if ever you make your GoAccess report public, then you should definitely invoke GoAccess with the option --anonymize-ip. Keep an eye on the logs In this tutorial, the reports we create, along with the GoAccess databases, will grow bigger every day, forever. It also means that the GoAccess processing time will grow a bit each day. So maybe the first thing to do is to keep an eye on the logs, to see how long it takes to GoAccess to do its job every day. Also, maybe you'd like to keep an eye on the size of the GoAccess database with:
du -sh /var/lib/goaccess-db/SITE
If your site has few visitors, I suspect it won't be a problem though. You could also be a bit pro-active in preventing this problem in the future, and for example you could break the reports into, say, monthly reports. Meaning that every month, you would create a new database in a new directory, and also start a new HTML report. This way you'd have monthly reports, and you make sure to limit the GoAccess processing time, by limiting the database size to a month. This can be achieved very easily, by including something like YEAR-MONTH in the database directory, and in the HTML report. You can handle that automatically in the wrapper script, for example:
sfx=$(date +'%Y-%m')
mkdir -p $DBDIR/$sfx
goaccess \
    --db-path $DBDIR/$sfx \
    --output "$OUTDIR/$SITE-$sfx.html" \
    ...
You get the idea. Further notes Migration from older versions With the --persist option, GoAccess keeps all the information from the logs in a database, so that it can re-use it later. In prior versions, GoAccess used the Tokyo Cabinet key-value store for that. However starting from v1.4, GoAccess dropped this dependency and now uses its own database format. As a result, the previous database can't be used anymore, you will have to remove it and restart from zero. At the moment there is no way to convert the data from the old database to the new one. If you're interested, this is discussed upstream at [#1783][bug-1783]. Another thing that changed with this new version is the name for some of the command-line options. For example, --load-from-disk was dropped in favor of --restore, and --keep-db-files became --persist. So you'll have to look at the documentation a bit, and update your script(s) accordingly. Other ways to use GoAccess It's also possible to do it completely differently. You could keep GoAccess running, pretty much like a daemon, with the --real-time-html option, and have it process the logs continuously, rather than calling it on a regular basis. It's also possible to see the GoAccess report straight in the terminal, thanks to libncurses, rather than creating a HTML report. And much more, GoAccess is packed with features. Conclusion I hope that this tutorial helped some of you folks. Feel free to drop an e-mail for comments.

Arnaud Rebillout: GoAccess 1.4, a detailed tutorial

GoAccess v1.4 was just released a few weeks ago! Let's take this chance to write a loooong tutorial. We'll go over every steps to install and operate GoAccess. This is a tutorial aimed at those who don't play sysadmin every day, and that's why it's so long, I did my best to provide thorough explanations all along, so that it's more than just a "copy-and-paste" kind of tutorial. And for those who do play sysadmin everyday: please try not to fall asleep while reading, and don't hesitate to drop me an e-mail if you spot anything inaccurate in here. Thanks! Introduction So what's GoAccess already? GoAccess is a web log analyzer, and it allows you to visualize the traffic for your website, and get to know a bit more about your visitors: how many visitors and hits, for which pages, coming from where (geolocation, operating system, web browser...), etc... It does so by parsing the access logs from your web server, be it Apache, NGINX or whatever. GoAccess gives you different options to display the statistics, and in this tutorial we'll focus on producing a HTML report. Meaning that you can see the statistics for your website straight in your web browser, under the form of a single HTML page. For an example, you can have a look at the stats of my blog here: https://goaccess.arnaudr.io. GoAccess is written in C, it has very few dependencies, it had been around for about 10 years, and it's distributed under the MIT license. Assumptions This tutorial is about installing and configuring, so I'll assume that all the commands are run as root. I won't prefix each of them with sudo. I use the Apache web server, running on a Debian system. I don't think it matters so much for this tutorial though. If you're using NGINX it's fine, you can keep reading. Also, I will just use the name SITE for the name of the website that we want to analyze with GoAccess. Just replace that with the real name of your site. I also assume the following locations for your stuff: If you have your stuff in /srv/SITE/ log,www instead, no worries, just adjust the paths accordingly, I bet you can do it. Installation The latest version of GoAccess is v1.4, and it's not yet available in the Debian repositories. So for this part, you can follow the instructions from the official GoAccess download page. Install steps are explained in details, so there's nothing left for me to say :) When this is done, let's get started with the basics. We're talking about the latest version v1.4 here, let's make sure:
$ goaccess --version
GoAccess - 1.4.
...
Now let's try to create a HTML report. I assume that you already have a website up and running. GoAccess needs to parse the access logs. These logs are optional, they might or might not be created by your web server, depending on how it's configured. Usually, these log files are named access.log, unsurprisingly. You can check if those logs exist on your system by running this command:
find /var/log -name access.log
Another important thing to know is that these logs can be in different formats. In this tutorial we'll assume that we work with the combined log format, because it seems to be the most common default. To check what kind of access logs your web server produces, you must look at the configuration for your site. For an Apache web server, you should have such a line in the file /etc/apache2/sites-enabled/SITE.conf:
CustomLog $ APACHE_LOG_DIR /SITE/access.log combined
For NGINX, it's quite similar. The configuration file would be something like /etc/nginx/sites-enabled/SITE, and the line to enable access logs would be something like:
access_log /var/log/nginx/SITE/access.log
Note that NGINX writes the access logs in the combined format by default, that's why you don't see the word combined anywhere in the line above: it's implicit. Alright, so from now on we assume that yes, you have access log files available, and yes, they are in the combined log format. If that's the case, then you can already run GoAccess and generate a report, for example for the log file /var/log/apache2/access.log
goaccess \
    --log-format COMBINED \
    --output /tmp/report.html \
    /var/log/apache2/access.log
It's possible to give GoAccess more than one log files to process, so if you have for example the file access.log.1 around, you can use it as well:
goaccess \
    --log-format COMBINED \
    --output /tmp/report.html \
    /var/log/apache2/access.log \
    /var/log/apache2/access.log.1
If GoAccess succeeds (and it should), you're on the right track! All is left to do to complete this test is to have a look at the HTML report created. It's a single HTML page, so you can easily scp it to your machine, or just move it to the document root of your site, and then open it in your web browser. Looks good? So let's move on to more interesting things. Web server configuration This part is very short, because in terms of configuration of the web server, there's very little to do. As I said above, the only thing you want from the web server is to create access log files. Then you want to be sure that GoAccess and your web server agree on the format for these files. In the part above we used the combined log format, but GoAccess supports many other common log formats out of the box, and even allows you to parse custom log formats. For more details, refer to the option --log-format in the GoAccess manual page. Another common log format is named, well, common. It even has its own Wikipedia page. But compared to combined, the common log format contains less information, it doesn't include the referrer and user-agent values, meaning that you won't have it in the GoAccess report. So at this point you should understand that, unsurprisingly, GoAccess can only tell you about what's in the access logs, no more no less. And that's all in term of web server configuration. Configuration to run GoAccess unprivileged Now we're going to create a user and group for GoAccess, so that we don't have to run it as root. The reason is that, well, for everything running unattended on your server, the less code runs as root, the better. It's good practice and common sense. In this case, GoAccess is simply a log analyzer. So it just needs to read the logs files from your web server, and there is no need to be root for that, an unprivileged user can do the job just as well, assuming it has read permissions on /var/log/apache2 or /var/log/nginx. The log files of the web server are usually part of the adm group (though it might depend on your distro, I'm not sure). This is something you can check easily with the following command:
ls -l /var/log   grep -e apache2 -e nginx
As a result you should get something like that:
drwxr-x--- 2 root adm 20480 Jul 22 00:00 /var/log/apache2/
And as you can see, the directory apache2 belongs to the group adm. It means that you don't need to be root to read the logs, instead any unprivileged user that belongs to the group adm can do it. So, let's create the goaccess user, and add it to the adm group:
adduser --system --group --no-create-home goaccess
addgroup goaccess adm
And now, let's run GoAccess unprivileged, and verify that it can still read the log files:
setpriv \
    --reuid=goaccess --regid=goaccess \
    --init-groups --inh-caps=-all \
    -- \
    goaccess \
    --log-format COMBINED \
    --output /tmp/report2.html \
    /var/log/apache2/access.log
setpriv is the command used to drop privileges. The syntax is quite verbose, it's not super friendly for tutorials, but don't be scared and read the manual page to learn what it does. In any case, this command should work, and at this point, it means that you have a goaccess user ready, and we'll use it to run GoAccess unprivileged. Integration, option A - Run GoAccess once a day, from a logrotate hook In this part we wire things together, so that GoAccess processes the log files once a day, adds the new logs to its internal database, and generates a report from all that aggregated data. The result will be a single HTML page. Introducing logrotate In order to do that, we'll use a logrotate hook. logrotate is a little tool that should already be installed on your server, and that runs once a day, and that is in charge of rotating the log files. "Rotating the logs" means moving access.log to access.log.1 and so on. With logrotate, a new log file is created every day, and log files that are too old are deleted. That's what prevents your logs from filling up your disk basically :) You can check that logrotate is indeed installed and enabled with this command (assuming that your init system is systemd):
systemctl status logrotate.timer
What's interesting for us is that logrotate allows you to run scripts before and after the rotation is performed, so it's an ideal place from where to run GoAccess. In short, we want to run GoAccess just before the logs are rotated away, in the prerotate hook. But let's do things in order. At first, we need to write a little wrapper script that will be in charge of running GoAccess with the right arguments, and that will process all of your sites. The wrapper script This wrapper is made to process more than one site, but if you have only one site it works just as well, of course. So let me just drop it on you like that, and I'll explain afterward. Here's my wrapper script:
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
#!/bin/bash
# Process log files /var/www/apache2/SITE/access.log,
# only if /var/lib/goaccess-db/SITE exists.
# Create HTML reports in $1, a directory that must exist.
set -eu
OUTDIR=
LOGDIR=/var/log/apache2
DBDIR=/var/lib/goaccess-db
fail()   echo >&2 "$@"; exit 1;  
[ $# -eq 1 ]   fail "Usage: $(basename $0) OUTPUT_DIRECTORY"
OUTDIR=$1
[ -d "$OUTDIR" ]   fail "'$OUTDIR' is not a directory"
[ -d "$LOGDIR" ]   fail "'$LOGDIR' is not a directory"
[ -d "$DBDIR"  ]   fail "'$DBDIR' is not a directory"
for d in $(find "$LOGDIR" -mindepth 1 -maxdepth 1 -type d); do
    site=$(basename "$sitedir")
    dbdir=$DBDIR/$site
    logfile=$d/access.log
    outfile=$OUTDIR/$site.html
    if [ ! -d "$dbdir" ]   [ ! -e "$logfile" ]; then
        echo "  Skipping site '$site'"
        continue
    else
        echo "  Processing site '$site'"
    fi
    setpriv \
        --reuid=goaccess --regid=goaccess \
        --init-groups --inh-caps=-all \
        -- \
    goaccess \
        --agent-list \
        --anonymize-ip \
        --persist \
        --restore \
        --config-file /etc/goaccess/goaccess.conf \
        --db-path "$dbdir" \
        --log-format "COMBINED" \
        --output "$outfile" \
        "$logfile"
done
So you'd install this script at /usr/local/bin/goaccess-wrapper for example, and make it executable:
chmod +x /usr/local/bin/goaccess-wrapper
A few things to note: As is, the script makes the assumption that the logs for your site are logged in a sub-directory /var/log/apache2/SITE/. If it's not the case, adjust that in the wrapper accordingly. The name of this sub-directory is then used to find the GoAccess database directory /var/lib/goaccess-db/SITE/. This directory is expected to exist, meaning that if you don't create it yourself, the wrapper won't process this particular site. It's a simple way to control which sites are processed by this GoAccess wrapper, and which sites are not. So if you want goaccess-wrapper to process the site SITE, just create a directory with the name of this site under /var/lib/goaccess-db:
mkdir -p /var/lib/goaccess-db/SITE
chown goaccess:goaccess /var/lib/goaccess-db/SITE
Now let's create an output directory:
mkdir /tmp/goaccess-reports
chown goaccess:goaccess /tmp/goaccess-reports
And let's give a try to the wrapper script:
goaccess-wrapper /tmp/goaccess-reports
ls /tmp/goaccess-reports
Which should give you:
SITE.html
At the same time, you can check that GoAccess populated the database with a bunch of files:
ls /var/lib/goaccess-db/SITE
Setting up the logrotate prerotate hook At this point, we have the wrapper in place. Let's now add a pre-rotate hook so that goaccess-wrapper runs once a day, just before the logs are rotated away. The logrotate config file for Apache2 is located at /etc/logrotate.d/apache2, and for NGINX it's at /etc/logrotate.d/nginx. Among the many things you'll see in this file, here's what is of interest for us: In the config file, there is also this snippet:
prerotate
    if [ -d /etc/logrotate.d/httpd-prerotate ]; then \
        run-parts /etc/logrotate.d/httpd-prerotate; \
    fi; \
endscript
It indicates that scripts in the directory /etc/logrotate.d/httpd-prerotate/ will be executed before the rotation takes place. Refer to the man page run-parts(8) for more details... Putting all of that together, it means that logs from the web server are rotated once a day, and if we want to run scripts just before the rotation, we can just drop them in the httpd-prerotate directory. Simple, right? Let's first create this directory if it doesn't exist:
mkdir -p /etc/logrotate.d/httpd-prerotate/
And let's create a tiny script at /etc/logrotate.d/httpd-prerotate/goaccess:
1
2
#!/bin/sh
exec goaccess-wrapper /tmp/goaccess-reports
Don't forget to make it executable:
chmod +x /etc/logrotate.d/httpd-prerotate/goaccess
As you can see, the only thing that this script does is to invoke the wrapper with the right argument, ie. the output directory for the HTML reports that are generated. And that's all. Now you can just come back tomorrow, check the logs, and make sure that the hook was executed and succeeded. For example, this kind of command will tell you quickly if it worked:
journalctl   grep logrotate
Integration, option B - Run GoAccess once a day, from a systemd service OK so we've just seen how to use a logrotate hook. One downside with that is that we have to drop privileges in the wrapper script, because logrotate runs as root, and we don't want to run GoAccess as root. Hence the rather convoluted syntax with setpriv. Rather than embedding this kind of thing in a wrapper script, we can instead run the wrapper script from a [systemd][] service, and define which user runs the wrapper straight in the systemd service file. Introducing systemd niceties So we can create a systemd service, along with a systemd timer that fires daily. We can then set the user and group that execute the script straight in the systemd service, and there's no need for setpriv anymore. It's a bit more streamlined. We can even go a bit further, and use systemd parameterized units (also called templates), so that we have one service per site (instead of one service that process all of our sites). That will simplify the wrapper script a lot, and it also looks nicer in the logs. With this approach however, it seems that we can't really run exactly before the logs are rotated away, like we did in the section above. But that's OK. What we'll do is that we'll run once a day, no matter the time, and we'll just make sure to process both log files access.log and access.log.1 (ie. the current logs and the logs from yesterday). This way, we're sure not to miss any line from the logs. Note that GoAccess is smart enough to only consider newer entries from the log files, and discard entries that are already in the database. In other words, it's safe to parse the same log file more than once, GoAccess will do the right thing. For more details see "INCREMENTAL LOG PROCESSING" from man goaccess. systemd]: https://freedesktop.org/wiki/Software/systemd/ Implementation And here's how it all looks like. First, a little wrapper script for GoAccess:
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
#!/bin/bash
# Usage: $0 SITE DBDIR LOGDIR OUTDIR
set -eu
SITE=$1
DBDIR=$2
LOGDIR=$3
OUTDIR=$4
LOGFILES=()
for ext in log log.1; do
    logfile="$LOGDIR/access.$ext"
    [ -e "$logfile" ] && LOGFILES+=("$logfile")
done
if [ $ #LOGFILES[@]  -eq 0 ]; then
    echo "No log files in '$LOGDIR'"
    exit 0
fi
goaccess \
    --agent-list \
    --anonymize-ip \
    --persist \
    --restore \
    --config-file /etc/goaccess/goaccess.conf \
    --db-path "$DBDIR" \
    --log-format "COMBINED" \
    --output "$OUTDIR/$SITE.html" \
    "$ LOGFILES[@] "
This wrapper does very little. Actually, the only thing it does is to check for the existence of the two log files access.log and access.log.1, to be sure that we don't ask GoAccess to process a file that does not exist (GoAccess would not be happy about that). Save this file under /usr/local/bin/goaccess-wrapper, don't forget to make it executable:
chmod +x /usr/local/bin/goaccess-wrapper
Then, create a systemd parameterized unit file, so that we can run this wrapper as a systemd service. Save it under /etc/systemd/system/goaccess@.service:
[Unit]
Description=Update GoAccess report - %i
ConditionPathIsDirectory=/var/lib/goaccess-db/%i
ConditionPathIsDirectory=/var/log/apache2/%i
ConditionPathIsDirectory=/tmp/goaccess-reports
PartOf=goaccess.service
[Service]
Type=oneshot
User=goaccess
Group=goaccess
Nice=19
ExecStart=/usr/local/bin/goaccess-wrapper \
 %i \
 /var/lib/goaccess-db/%i \
 /var/log/apache2/%i \
 /tmp/goaccess-reports
So, what is a systemd parameterized unit? It's a service to which you can pass an argument when you enable it. The %i in the unit definition will be replaced by this argument. In our case, the argument will be the name of the site that we want to process. As you can see, we use the directive ConditionPathIsDirectory= extensively, so that if ever one of the required directories does not exist, the unit will just be skipped (and marked as such in the logs). It's a graceful way to fail. We run the wrapper as the user and group goaccess, thanks to User= and Group=. We also use Nice= to give a low priority to the process. At this point, it's already possible to test. Just make sure that you created a directory for the GoAccess database:
mkdir -p /var/lib/goaccess-db/SITE
chown goaccess:goaccess /var/lib/goaccess-db/SITE
Also make sure that the output directory exists:
mkdir /tmp/goaccess-reports
chown goaccess:goaccess /tmp/goaccess-reports
Then reload systemd and fire the unit to see if it works:
systemctl daemon-reload
systemctl start goaccess@SITE.service
journalctl   tail
And that should work already. As you can see, the argument, SITE, is passed in the systemctl start command. We just append it after the @, in the name of the unit. Now, let's create another GoAccess service file, which sole purpose is to group all the parameterized units together, so that we can start them all in one go. Note that we don't use a systemd target for that, because ultimately we want to run it once a day, and that would not be possible with a target. So instead we use a dummy oneshot service. So here it is, saved under /etc/systemd/system/goaccess.service:
[Unit]
Description=Update GoAccess reports
Requires= \
 goaccess@SITE1.service \
 goaccess@SITE2.service
[Service]
Type=oneshot
ExecStart=true
As you can see, we simply list the sites that we want to process in the Requires= directive. In this example we have two sites named SITE1 and SITE2. Let's ensure that everything is still good:
systemctl daemon-reload
systemctl start goaccess.service
journalctl   tail
Check the logs, both sites SITE1 and SITE2 should have been processed. And finally, let's create a timer, so that systemd runs goaccess.service once a day. Save it under /etc/systemd/system/goaccess.timer.
[Unit]
Description=Daily update of GoAccess reports
[Timer]
OnCalendar=daily
RandomizedDelaySec=1h
Persistent=true
[Install]
WantedBy=timers.target
Finally, enable the timer:
systemctl daemon-reload
systemctl enable --now goaccess.timer
At this point, everything should be OK. Just come back tomorrow and check the logs with something like:
journalctl   grep goaccess
Last word: if you have only one site to process, of course you can simplify, for example you can hardcode all the paths in the file goaccess.service instead of using a parameterized unit. Up to you. Daily operations So in this part, we assume that you have GoAccess all setup and running, once a day or so. Let's just go over a few things worth noting. Serve your report Up to now in this tutorial, we created the reports in /tmp/goaccess-reports, but that was just for the sake of the example. You will probably want to save your reports in a directory that is served by your web server, so that, well, you can actually look at it in your web browser, that was the point, right? So how to do that is a bit out of scope here, and I guess that if you want to monitor your website, you already have a website, so you will have no trouble serving the GoAccess HTML report. However there's an important detail to be aware of: GoAccess shows all the IP addresses of your visitors in the report. As long as the report is private it's OK, but if ever you make your GoAccess report public, then you should definitely invoke GoAccess with the option --anonymize-ip. Keep an eye on the logs In this tutorial, the reports we create, along with the GoAccess databases, will grow bigger every day, forever. It also means that the GoAccess processing time will grow a bit each day. So maybe the first thing to do is to keep an eye on the logs, to see how long it takes to GoAccess to do its job every day. Also, maybe you'd like to keep an eye on the size of the GoAccess database with:
du -sh /var/lib/goaccess-db/SITE
If your site has few visitors, I suspect it won't be a problem though. You could also be a bit pro-active in preventing this problem in the future, and for example you could break the reports into, say, monthly reports. Meaning that every month, you would create a new database in a new directory, and also start a new HTML report. This way you'd have monthly reports, and you make sure to limit the GoAccess processing time, by limiting the database size to a month. This can be achieved very easily, by including something like YEAR-MONTH in the database directory, and in the HTML report. You can handle that automatically in the wrapper script, for example:
sfx=$(date +'%Y-%m')
mkdir -p $DBDIR/$sfx
goaccess \
    --db-path $DBDIR/$sfx \
    --output "$OUTDIR/$SITE-$sfx.html" \
    ...
You get the idea. Further notes Migration from older versions With the --persist option, GoAccess keeps all the information from the logs in a database, so that it can re-use it later. In prior versions, GoAccess used the Tokyo Cabinet key-value store for that. However starting from v1.4, GoAccess dropped this dependency and now uses its own database format. As a result, the previous database can't be used anymore, you will have to remove it and restart from zero. At the moment there is no way to convert the data from the old database to the new one. If you're interested, this is discussed upstream at [#1783][bug-1783]. Another thing that changed with this new version is the name for some of the command-line options. For example, --load-from-disk was dropped in favor of --restore, and --keep-db-files became --persist. So you'll have to look at the documentation a bit, and update your script(s) accordingly. Other ways to use GoAccess It's also possible to do it completely differently. You could keep GoAccess running, pretty much like a daemon, with the --real-time-html option, and have it process the logs continuously, rather than calling it on a regular basis. It's also possible to see the GoAccess report straight in the terminal, thanks to libncurses, rather than creating a HTML report. And much more, GoAccess is packed with features. Conclusion I hope that this tutorial helped some of you folks. Feel free to drop an e-mail for comments.

3 August 2020

Arnaud Rebillout: GoAccess 1.4, a detailed tutorial

GoAccess v1.4 was just released a few weeks ago! Let's take this chance to write a loooong tutorial. We'll go over every steps to install and operate GoAccess. This is a tutorial aimed at those who don't play sysadmin every day, and that's why it's so long, I did my best to provide thorough explanations all along, so that it's more than just a "copy-and-paste" kind of tutorial. And for those who do play sysadmin everyday: please try not to fall asleep while reading, and don't hesitate to drop me an e-mail if you spot anything inaccurate in here. Thanks! Introduction So what's GoAccess already? GoAccess is a web log analyzer, and it allows you to visualize the traffic for your website, and get to know a bit more about your visitors: how many visitors and hits, for which pages, coming from where (geolocation, operating system, web browser...), etc... It does so by parsing the access logs from your web server, be it Apache, NGINX or whatever. GoAccess gives you different options to display the statistics, and in this tutorial we'll focus on producing a HTML report. Meaning that you can see the statistics for your website straight in your web browser, under the form of a single HTML page. For an example, you can have a look at the stats of my blog here: http://goaccess.arnaudr.io. GoAccess is written in C, it has very few dependencies, it had been around for about 10 years, and it's distributed under the MIT license. Assumptions This tutorial is about installing and configuring, so I'll assume that all the commands are run as root. I won't prefix each of them with sudo. I use the Apache web server, running on a Debian system. I don't think it matters so much for this tutorial though. If you're using NGINX it's fine, you can keep reading. Also, I will just use the name SITE for the name of the website that we want to analyze with GoAccess. Just replace that with the real name of your site. I also assume the following locations for your stuff: If you have your stuff in /srv/SITE/ log,www instead, no worries, just adjust the paths accordingly, I bet you can do it. Installation The latest version of GoAccess is v1.4, and it's not yet available in the Debian repositories. So for this part, you can follow the instructions from the official GoAccess download page. Install steps are explained in details, so there's nothing left for me to say :) When this is done, let's get started with the basics. We're talking about the latest version v1.4 here, let's make sure:
$ goaccess --version
GoAccess - 1.4.
...
Now let's try to create a HTML report. I assume that you already have a website up and running. GoAccess needs to parse the access logs. These logs are optional, they might or might not be created by your web server, depending on how it's configured. Usually, these log files are named access.log, unsurprisingly. You can check if those logs exist on your system by running this command:
find /var/log -name access.log
Another important thing to know is that these logs can be in different formats. In this tutorial we'll assume that we work with the combined log format, because it seems to be the most common default. To check what kind of access logs your web server produces, you must look at the configuration for your site. For an Apache web server, you should have such a line in the file /etc/apache2/sites-enabled/SITE.conf:
CustomLog $ APACHE_LOG_DIR /SITE/access.log combined
For NGINX, it's quite similar. The configuration file would be something like /etc/nginx/sites-enabled/SITE, and the line to enable access logs would be something like:
access_log /var/log/nginx/SITE/access.log
Note that NGINX writes the access logs in the combined format by default, that's why you don't see the word combined anywhere in the line above: it's implicit. Alright, so from now on we assume that yes, you have access log files available, and yes, they are in the combined log format. If that's the case, then you can already run GoAccess and generate a report, for example for the log file /var/log/apache2/access.log
goaccess \
    --log-format COMBINED \
    --output /tmp/report.html \
    /var/log/apache2/access.log
It's possible to give GoAccess more than one log files to process, so if you have for example the file access.log.1 around, you can use it as well:
goaccess \
    --log-format COMBINED \
    --output /tmp/report.html \
    /var/log/apache2/access.log \
    /var/log/apache2/access.log.1
If GoAccess succeeds (and it should), you're on the right track! All is left to do to complete this test is to have a look at the HTML report created. It's a single HTML page, so you can easily scp it to your machine, or just move it to the document root of your site, and then open it in your web browser. Looks good? So let's move on to more interesting things. Web server configuration This part is very short, because in terms of configuration of the web server, there's very little to do. As I said above, the only thing you want from the web server is to create access log files. Then you want to be sure that GoAccess and your web server agree on the format for these files. In the part above we used the combined log format, but GoAccess supports many other common log formats out of the box, and even allows you to parse custom log formats. For more details, refer to the option --log-format in the GoAccess manual page. Another common log format is named, well, common. It even has its own Wikipedia page. But compared to combined, the common log format contains less information, it doesn't include the referrer and user-agent values, meaning that you won't have it in the GoAccess report. So at this point you should understand that, unsurprisingly, GoAccess can only tell you about what's in the access logs, no more no less. And that's all in term of web server configuration. Configuration to run GoAccess unprivileged Now we're going to create a user and group for GoAccess, so that we don't have to run it as root. The reason is that, well, for everything running unattended on your server, the less code runs as root, the better. It's good practice and common sense. In this case, GoAccess is simply a log analyzer. So it just needs to read the logs files from your web server, and there is no need to be root for that, an unprivileged user can do the job just as well, assuming it has read permissions on /var/log/apache2 or /var/log/nginx. The log files of the web server are usually part of the adm group (though it might depend on your distro, I'm not sure). This is something you can check easily with the following command:
ls -l /var/log   grep -e apache2 -e nginx
As a result you should get something like that:
drwxr-x--- 2 root adm 20480 Jul 22 00:00 /var/log/apache2/
And as you can see, the directory apache2 belongs to the group adm. It means that you don't need to be root to read the logs, instead any unprivileged user that belongs to the group adm can do it. So, let's create the goaccess user, and add it to the adm group:
adduser --system --group --no-create-home goaccess
addgroup goaccess adm
And now, let's run GoAccess unprivileged, and verify that it can still read the log files:
setpriv \
    --reuid=goaccess --regid=goaccess \
    --init-groups --inh-caps=-all \
    -- \
    goaccess \
    --log-format COMBINED \
    --output /tmp/report2.html \
    /var/log/apache2/access.log
setpriv is the command used to drop privileges. The syntax is quite verbose, it's not super friendly for tutorials, but don't be scared and read the manual page to learn what it does. In any case, this command should work, and at this point, it means that you have a goaccess user ready, and we'll use it to run GoAccess unprivileged. Integration, option A - Run GoAccess once a day, from a logrotate hook In this part we wire things together, so that GoAccess processes the log files once a day, adds the new logs to its internal database, and generates a report from all that aggregated data. The result will be a single HTML page. Introducing logrotate In order to do that, we'll use a logrotate hook. logrotate is a little tool that should already be installed on your server, and that runs once a day, and that is in charge of rotating the log files. "Rotating the logs" means moving access.log to access.log.1 and so on. With logrotate, a new log file is created every day, and log files that are too old are deleted. That's what prevents your logs from filling up your disk basically :) You can check that logrotate is indeed installed and enabled with this command (assuming that your init system is systemd):
systemctl status logrotate.timer
What's interesting for us is that logrotate allows you to run scripts before and after the rotation is performed, so it's an ideal place from where to run GoAccess. In short, we want to run GoAccess just before the logs are rotated away, in the prerotate hook. But let's do things in order. At first, we need to write a little wrapper script that will be in charge of running GoAccess with the right arguments, and that will process all of your sites. The wrapper script This wrapper is made to process more than one site, but if you have only one site it works just as well, of course. So let me just drop it on you like that, and I'll explain afterward. Here's my wrapper script:
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
#!/bin/bash
# Process log files /var/www/apache2/SITE/access.log,
# only if /var/lib/goaccess-db/SITE exists.
# Create HTML reports in $1, a directory that must exist.
set -eu
OUTDIR=
LOGDIR=/var/log/apache2
DBDIR=/var/lib/goaccess-db
fail()   echo >&2 "$@"; exit 1;  
[ $# -eq 1 ]   fail "Usage: $(basename $0) OUTPUT_DIRECTORY"
OUTDIR=$1
[ -d "$OUTDIR" ]   fail "'$OUTDIR' is not a directory"
[ -d "$LOGDIR" ]   fail "'$LOGDIR' is not a directory"
[ -d "$DBDIR"  ]   fail "'$DBDIR' is not a directory"
for d in $(find "$LOGDIR" -mindepth 1 -maxdepth 1 -type d); do
    site=$(basename "$sitedir")
    dbdir=$DBDIR/$site
    logfile=$d/access.log
    outfile=$OUTDIR/$site.html
    if [ ! -d "$dbdir" ]   [ ! -e "$logfile" ]; then
        echo "  Skipping site '$site'"
        continue
    else
        echo "  Processing site '$site'"
    fi
    setpriv \
        --reuid=goaccess --regid=goaccess \
        --init-groups --inh-caps=-all \
        -- \
    goaccess \
        --agent-list \
        --anonymize-ip \
        --persist \
        --restore \
        --config-file /etc/goaccess/goaccess.conf \
        --db-path "$dbdir" \
        --log-format "COMBINED" \
        --output "$outfile" \
        "$logfile"
done
So you'd install this script at /usr/local/bin/goaccess-wrapper for example, and make it executable:
chmod +x /usr/local/bin/goaccess-wrapper
A few things to note: As is, the script makes the assumption that the logs for your site are logged in a sub-directory /var/log/apache2/SITE/. If it's not the case, adjust that in the wrapper accordingly. The name of this sub-directory is then used to find the GoAccess database directory /var/lib/goaccess-db/SITE/. This directory is expected to exist, meaning that if you don't create it yourself, the wrapper won't process this particular site. It's a simple way to control which sites are processed by this GoAccess wrapper, and which sites are not. So if you want goaccess-wrapper to process the site SITE, just create a directory with the name of this site under /var/lib/goaccess-db:
mkdir -p /var/lib/goaccess-db/SITE
chown goaccess:goaccess /var/lib/goaccess-db/SITE
Now let's create an output directory:
mkdir /tmp/goaccess-reports
chown goaccess:goaccess /tmp/goaccess-reports
And let's give a try to the wrapper script:
goaccess-wrapper /tmp/goaccess-reports
ls /tmp/goaccess-reports
Which should give you:
SITE.html
At the same time, you can check that GoAccess populated the database with a bunch of files:
ls /var/lib/goaccess-db/SITE
Setting up the logrotate prerotate hook At this point, we have the wrapper in place. Let's now add a pre-rotate hook so that goaccess-wrapper runs once a day, just before the logs are rotated away. The logrotate config file for Apache2 is located at /etc/logrotate.d/apache2, and for NGINX it's at /etc/logrotate.d/nginx. Among the many things you'll see in this file, here's what is of interest for us: In the config file, there is also this snippet:
prerotate
    if [ -d /etc/logrotate.d/httpd-prerotate ]; then \
        run-parts /etc/logrotate.d/httpd-prerotate; \
    fi; \
endscript
It indicates that scripts in the directory /etc/logrotate.d/httpd-prerotate/ will be executed before the rotation takes place. Refer to the man page run-parts(8) for more details... Putting all of that together, it means that logs from the web server are rotated once a day, and if we want to run scripts just before the rotation, we can just drop them in the httpd-prerotate directory. Simple, right? Let's first create this directory if it doesn't exist:
mkdir -p /etc/logrotate.d/httpd-prerotate/
And let's create a tiny script at /etc/logrotate.d/httpd-prerotate/goaccess:
1
2
#!/bin/sh
exec goaccess-wrapper /tmp/goaccess-reports
Don't forget to make it executable:
chmod +x /etc/logrotate.d/httpd-prerotate/goaccess
As you can see, the only thing that this script does is to invoke the wrapper with the right argument, ie. the output directory for the HTML reports that are generated. And that's all. Now you can just come back tomorrow, check the logs, and make sure that the hook was executed and succeeded. For example, this kind of command will tell you quickly if it worked:
journalctl   grep logrotate
Integration, option B - Run GoAccess once a day, from a systemd service OK so we've just seen how to use a logrotate hook. One downside with that is that we have to drop privileges in the wrapper script, because logrotate runs as root, and we don't want to run GoAccess as root. Hence the rather convoluted syntax with setpriv. Rather than embedding this kind of thing in a wrapper script, we can instead run the wrapper script from a [systemd][] service, and define which user runs the wrapper straight in the systemd service file. Introducing systemd niceties So we can create a systemd service, along with a systemd timer that fires daily. We can then set the user and group that execute the script straight in the systemd service, and there's no need for setpriv anymore. It's a bit more streamlined. We can even go a bit further, and use systemd parameterized units (also called templates), so that we have one service per site (instead of one service that process all of our sites). That will simplify the wrapper script a lot, and it also looks nicer in the logs. With this approach however, it seems that we can't really run exactly before the logs are rotated away, like we did in the section above. But that's OK. What we'll do is that we'll run once a day, no matter the time, and we'll just make sure to process both log files access.log and access.log.1 (ie. the current logs and the logs from yesterday). This way, we're sure not to miss any line from the logs. Note that GoAccess is smart enough to only consider newer entries from the log files, and discard entries that are already in the database. In other words, it's safe to parse the same log file more than once, GoAccess will do the right thing. For more details see "INCREMENTAL LOG PROCESSING" from man goaccess. systemd]: https://freedesktop.org/wiki/Software/systemd/ Implementation And here's how it all looks like. First, a little wrapper script for GoAccess:
 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
#!/bin/bash
# Usage: $0 SITE DBDIR LOGDIR OUTDIR
set -eu
SITE=$1
DBDIR=$2
LOGDIR=$3
OUTDIR=$4
LOGFILES=()
for ext in log log.1; do
    logfile="$LOGDIR/access.$ext"
    [ -e "$logfile" ] && LOGFILES+=("$logfile")
done
if [ $ #LOGFILES[@]  -eq 0 ]; then
    echo "No log files in '$LOGDIR'"
    exit 0
fi
goaccess \
    --agent-list \
    --anonymize-ip \
    --persist \
    --restore \
    --config-file /etc/goaccess/goaccess.conf \
    --db-path "$DBDIR" \
    --log-format "COMBINED" \
    --output "$OUTDIR/$SITE.html" \
    "$ LOGFILES[@] "
This wrapper does very little. Actually, the only thing it does is to check for the existence of the two log files access.log and access.log.1, to be sure that we don't ask GoAccess to process a file that does not exist (GoAccess would not be happy about that). Save this file under /usr/local/bin/goaccess-wrapper, don't forget to make it executable:
chmod +x /usr/local/bin/goaccess-wrapper
Then, create a systemd parameterized unit file, so that we can run this wrapper as a systemd service. Save it under /etc/systemd/system/goaccess@.service:
[Unit]
Description=Update GoAccess report - %i
ConditionPathIsDirectory=/var/lib/goaccess-db/%i
ConditionPathIsDirectory=/var/log/apache2/%i
ConditionPathIsDirectory=/tmp/goaccess-reports
PartOf=goaccess.service
[Service]
Type=oneshot
User=goaccess
Group=goaccess
Nice=19
ExecStart=/usr/local/bin/goaccess-wrapper \
 %i \
 /var/lib/goaccess-db/%i \
 /var/log/apache2/%i \
 /tmp/goaccess-reports
So, what is a systemd parameterized unit? It's a service to which you can pass an argument when you enable it. The %i in the unit definition will be replaced by this argument. In our case, the argument will be the name of the site that we want to process. As you can see, we use the directive ConditionPathIsDirectory= extensively, so that if ever one of the required directories does not exist, the unit will just be skipped (and marked as such in the logs). It's a graceful way to fail. We run the wrapper as the user and group goaccess, thanks to User= and Group=. We also use Nice= to give a low priority to the process. At this point, it's already possible to test. Just make sure that you created a directory for the GoAccess database:
mkdir -p /var/lib/goaccess-db/SITE
chown goaccess:goaccess /var/lib/goaccess-db/SITE
Also make sure that the output directory exists:
mkdir /tmp/goaccess-reports
chown goaccess:goaccess /tmp/goaccess-reports
Then reload systemd and fire the unit to see if it works:
systemctl daemon-reload
systemctl start goaccess@SITE.service
journalctl   tail
And that should work already. As you can see, the argument, SITE, is passed in the systemctl start command. We just append it after the @, in the name of the unit. Now, let's create another GoAccess service file, which sole purpose is to group all the parameterized units together, so that we can start them all in one go. Note that we don't use a systemd target for that, because ultimately we want to run it once a day, and that would not be possible with a target. So instead we use a dummy oneshot service. So here it is, saved under /etc/systemd/system/goaccess.service:
[Unit]
Description=Update GoAccess reports
Requires= \
 goaccess@SITE1.service \
 goaccess@SITE2.service
[Service]
Type=oneshot
ExecStart=true
As you can see, we simply list the sites that we want to process in the Requires= directive. In this example we have two sites named SITE1 and SITE2. Let's ensure that everything is still good:
systemctl daemon-reload
systemctl start goaccess.service
journalctl   tail
Check the logs, both sites SITE1 and SITE2 should have been processed. And finally, let's create a timer, so that systemd runs goaccess.service once a day. Save it under /etc/systemd/system/goaccess.timer.
[Unit]
Description=Daily update of GoAccess reports
[Timer]
OnCalendar=daily
RandomizedDelaySec=1h
Persistent=true
[Install]
WantedBy=timers.target
Finally, enable the timer:
systemctl daemon-reload
systemctl enable --now goaccess.timer
At this point, everything should be OK. Just come back tomorrow and check the logs with something like:
journalctl   grep goaccess
Last word: if you have only one site to process, of course you can simplify, for example you can hardcode all the paths in the file goaccess.service instead of using a parameterized unit. Up to you. Daily operations So in this part, we assume that you have GoAccess all setup and running, once a day or so. Let's just go over a few things worth noting. Serve your report Up to now in this tutorial, we created the reports in /tmp/goaccess-reports, but that was just for the sake of the example. You will probably want to save your reports in a directory that is served by your web server, so that, well, you can actually look at it in your web browser, that was the point, right? So how to do that is a bit out of scope here, and I guess that if you want to monitor your website, you already have a website, so you will have no trouble serving the GoAccess HTML report. However there's an important detail to be aware of: GoAccess shows all the IP addresses of your visitors in the report. As long as the report is private it's OK, but if ever you make your GoAccess report public, then you should definitely invoke GoAccess with the option --anonymize-ip. Keep an eye on the logs In this tutorial, the reports we create, along with the GoAccess databases, will grow bigger every day, forever. It also means that the GoAccess processing time will grow a bit each day. So maybe the first thing to do is to keep an eye on the logs, to see how long it takes to GoAccess to do its job every day. Also, maybe you'd like to keep an eye on the size of the GoAccess database with:
du -sh /var/lib/goaccess-db/SITE
If your site has few visitors, I suspect it won't be a problem though. You could also be a bit pro-active in preventing this problem in the future, and for example you could break the reports into, say, monthly reports. Meaning that every month, you would create a new database in a new directory, and also start a new HTML report. This way you'd have monthly reports, and you make sure to limit the GoAccess processing time, by limiting the database size to a month. This can be achieved very easily, by including something like YEAR-MONTH in the database directory, and in the HTML report. You can handle that automatically in the wrapper script, for example:
sfx=$(date +'%Y-%m')
mkdir -p $DBDIR/$sfx
goaccess \
    --db-path $DBDIR/$sfx \
    --output "$OUTDIR/$SITE-$sfx.html" \
    ...
You get the idea. Further notes Migration from older versions With the --persist option, GoAccess keeps all the information from the logs in a database, so that it can re-use it later. In prior versions, GoAccess used the Tokyo Cabinet key-value store for that. However starting from v1.4, GoAccess dropped this dependency and now uses its own database format. As a result, the previous database can't be used anymore, you will have to remove it and restart from zero. At the moment there is no way to convert the data from the old database to the new one. If you're interested, this is discussed upstream at [#1783][bug-1783]. Another thing that changed with this new version is the name for some of the command-line options. For example, --load-from-disk was dropped in favor of --restore, and --keep-db-files became --persist. So you'll have to look at the documentation a bit, and update your script(s) accordingly. Other ways to use GoAccess It's also possible to do it completely differently. You could keep GoAccess running, pretty much like a daemon, with the --real-time-html option, and have it process the logs continuously, rather than calling it on a regular basis. It's also possible to see the GoAccess report straight in the terminal, thanks to libncurses, rather than creating a HTML report. And much more, GoAccess is packed with features. Conclusion I hope that this tutorial helped some of you folks. Feel free to drop an e-mail for comments.

1 August 2020

Utkarsh Gupta: FOSS Activites in July 2020

Here s my (tenth) monthly update about the activities I ve done in the F/L/OSS world.

Debian
This was my 17th month of contributing to Debian. I became a DM in late March last year and a DD last Christmas! \o/ Well, this month I didn t do a lot of Debian stuff, like I usually do, however, I did a lot of things related to Debian (indirectly via GSoC)! Anyway, here are the following things I did this month:

Uploads and bug fixes:

Other $things:
  • Mentoring for newcomers.
  • FTP Trainee reviewing.
  • Moderation of -project mailing list.
  • Sponsored php-twig for William, ruby-growl, ruby-xmpp4r, and uby-uniform-notifier for Cocoa, sup-mail for Iain, and node-markdown-it for Sakshi.

GSoC Phase 2, Part 2! In May, I got selected as a Google Summer of Code student for Debian again! \o/
I am working on the Upstream-Downstream Cooperation in Ruby project. The first three blogs can be found here: Also, I log daily updates at gsocwithutkarsh2102.tk. Whilst the daily updates are available at the above site^, I ll breakdown the important parts of the later half of the second month here:
  • Marc Andre, very kindly, helped in fixing the specs that were failing earlier this month. Well, the problem was with the specs, but I am still confused how so. Anyway..
  • Finished documentation of the second cop and marked the PR as ready to be reviewed.
  • David reviewed and suggested some really good changes and I fixed/tweaked that PR as per his suggestion to finally finish the last bits of the second cop, RelativeRequireToLib.
  • Merged the PR upon two approvals and released it as v0.2.0!
  • We had our next weekly meeting where we discussed the next steps and the things that are supposed to be done for the next set of cops.
  • Introduced rubocop-packaging to the outer world and requested other upstream projects to use it! It is being used by 13 other projects already!
  • Started to work on packaging-style-guide but I didn t push anything to the public repository yet.
  • Worked on refactoring the cops_documentation Rake task which was broken by the new auto-corrector API. Opened PR #7 for it. It ll be merged after the next RuboCop release as it uses CopsDocumentationGenerator class from the master branch.
  • Whilst working on autoprefixer-rails, I found something unusual. The second cop shouldn t really report offenses if the require_relative calls are from lib to lib itself. This is a false-positive. Opened issue #8 for the same.

Debian (E)LTS
Debian Long Term Support (LTS) is a project to extend the lifetime of all Debian stable releases to (at least) 5 years. Debian LTS is not handled by the Debian security team, but by a separate group of volunteers and companies interested in making it a success. And Debian Extended LTS (ELTS) is its sister project, extending support to the Jessie release (+2 years after LTS support). This was my tenth month as a Debian LTS and my first as a Debian ELTS paid contributor.
I was assigned 25.25 hours for LTS and 13.25 hours for ELTS and worked on the following things:

LTS CVE Fixes and Announcements:

ELTS CVE Fixes and Announcements:

Other (E)LTS Work:
  • Did my LTS frontdesk duty from 29th June to 5th July.
  • Triaged qemu, firefox-esr, wordpress, libmediainfo, squirrelmail, xen, openjpeg2, samba, and ldb.
  • Mark CVE-2020-15395/libmediainfo as no-dsa for Jessie.
  • Mark CVE-2020-13754/qemu as no-dsa/intrusive for Stretch and Jessie.
  • Mark CVE-2020-12829/qemu as no-dsa for Jessie.
  • Mark CVE-2020-10756/qemu as not-affected for Jessie.
  • Mark CVE-2020-13253/qemu as postponed for Jessie.
  • Drop squirrelmail and xen for Stretch LTS.
  • Add notes for tomcat8, shiro, and cacti to take care of the Stretch issues.
  • Emailed team@security.d.o and debian-lts@l.d.o regarding possible clashes.
  • Maintenance of LTS Survey on the self-hosted LimeSurvey instance. Received 1765 (just wow!) responses.
  • Attended the fourth LTS meeting. MOM here.
  • General discussion on LTS private and public mailing list.

Other(s)
Sometimes it gets hard to categorize work/things into a particular category.
That s why I am writing all of those things inside this category.
This includes two sub-categories and they are as follows.

Personal: This month I did the following things:
  • Released v0.2.0 of rubocop-packaging on RubyGems!
    It s open-sourced and the repository is here.
    Bug reports and pull requests are welcomed!
  • Released v0.1.0 of get_root on RubyGems!
    It s open-sourced and the repository is here.
  • Wrote max-word-frequency, my Rails C1M2 programming assignment.
    And made it pretty neater & cleaner!
  • Refactored my lts-dla and elts-ela scripts entirely and wrote them in Ruby so that there are no issues and no false-positives!
    Check lts-dla here and elts-ela here.
  • And finally, built my first Rails (mini) web-application!
    The repository is here. This was also a programming assignment (C1M3).
    And furthermore, hosted it at Heroku.

Open Source: Again, this contains all the things that I couldn t categorize earlier.
Opened several issues and PRs:
  • Issue #8273 against rubocop, reporting a false-positive auto-correct for Style/WhileUntilModifier.
  • Issue #615 against http reporting a weird behavior of a flaky test.
  • PR #3791 for rubygems/bundler to remove redundant bundler/setup require call from spec_helper generated by bundle gem.
  • Issue #3831 against rubygems, reporting a traceback of undefined method, rubyforge_project=.
  • Issue #238 against nheko asking for enhancement in showing the font name in the very font itself.
  • PR #2307 for puma to constrain rake-compiler to v0.9.4.
  • And finally, I joined the Cucumber organization! \o/

Thank you for sticking along for so long :) Until next time.
:wq for today.

29 May 2020

Gunnar Wolf: Heads up Online MiniDebConf is Online

I know most Debian people know about this already But in case you don t follow the usual Debian communications channels, this might interest you! Given most of the world is still under COVID-19 restrictions, and that we want to work on Debian, given there is no certainty as to what the future holds in store for us Our DPL fearless as they always are had the bold initiative to make this weekend into the first-ever miniDebConf Online (MDCO)! miniDebConf Online So, we are already halfway through DebCamp (which means, you can come and hang out with us in the debian.social DebCamp Jitsi lounge, where some impromptu presentations might happen (or not). Starting tomorrow morning (11AM UTC), we will have a quite interesting set of talks. I am reproducing the schedule here:

Saturday 2020.05.30
Time (UTC) Speaker Talk
11:00 - 11:10 MDCO team members Hello + Welcome
11:30 - 11:50 Wouter Verhelst Extrepo
12:00 - 12:45 JP Mengual Debian France, trust european organization
13:00 - 13:20 Arnaud Ferraris Bringing Debian to mobile phones, one package at a time
13:30 - 15:00 Lunch Break A chance for the teams to catch some air
15:00 - 15:45 JP Mengual The community team, United Nations Organizations of Debian?
16:00 - 16:45 Christoph Biedl Clevis and tang - overcoming the disk unlocking problem
17:00 - 17:45 Antonio Terceiro I m a programmer, how can I help Debian

Sunday 2020.05.31
Time (UTC) Speaker Talk
11:00 - 11:45 Andreas Tille The effect of Covid-19 on the Debian Med project
12:00 - 12:45 Paul Gevers BoF: running autopkgtest for your package
13:00 - 13:20 Ben Hutchings debplate: Build many binary packages with templates
13:30 - 15:00 Lunch break A chance for the teams to catch some air
15:00 - 15:45 Holger Levsen Reproducing bullseye in practice
16:00 - 16:45 Jonathan Carter Striving towards excellence
17:00 - 17:45 Delib* Organizing Peer-to-Peer Debian Facilitation Training
18:00 - 18:15 MDCO team members Closing
  • subject to confirmation

Timezone Remember this is an online event, meant for all of the world! Yes, the chosen times seem quite Europe-centric (but they are mostly a function of the times the talk submitters requested). Talks are 11:00 18:00UTC, which means, 06:00 13:00 Mexico (GMT-5), 20:00 03:00 Japan (GMT+9), 04:00 11:00 Western Canada/USA/Mexico (GMT-7) and the rest of the world, somewhere in between. (No, this was clearly not optimized for our dear usual beer team. Sorry! I guess we need you to be fully awake at beertime!)

[update] Connecting! Of course, I didn t make it clear at first how to connect to the Online miniDebConf, silly me!
  • The video streams are available at: https://video.debconf.org/
  • Suggested: tune in to the #minidebconf-online IRC channel in OFTC.
That should be it. Hope to see you there! (Stay home, stay safe )

Gunnar Wolf: Heads up Online MiniDebConf is Online

I know most Debian people know about this already But in case you don t follow the usual Debian communications channels, this might interest you! Given most of the world is still under COVID-19 restrictions, and that we want to work on Debian, given there is no certainty as to what the future holds in store for us Our DPL fearless as they always are had the bold initiative to make this weekend into the first-ever miniDebConf Online (MDCO)! miniDebConf Online So, we are already halfway through DebCamp (which means, you can come and hang out with us in the debian.social DebCamp Jitsi lounge, where some impromptu presentations might happen (or not). Starting tomorrow morning (11AM UTC), we will have a quite interesting set of talks. I am reproducing the schedule here:

Saturday 2020.05.30
Time (UTC) Speaker Talk
11:00 - 11:10 MDCO team members Hello + Welcome
11:30 - 11:50 Wouter Verhelst Extrepo
12:00 - 12:45 JP Mengual Debian France, trust european organization
13:00 - 13:20 Arnaud Ferraris Bringing Debian to mobile phones, one package at a time
13:30 - 15:00 Lunch Break A chance for the teams to catch some air
15:00 - 15:45 JP Mengual The community team, United Nations Organizations of Debian?
16:00 - 16:45 Christoph Biedl Clevis and tang - overcoming the disk unlocking problem
17:00 - 17:45 Antonio Terceiro I m a programmer, how can I help Debian

Sunday 2020.05.31
Time (UTC) Speaker Talk
11:00 - 11:45 Andreas Tille The effect of Covid-19 on the Debian Med project
12:00 - 12:45 Paul Gevers BoF: running autopkgtest for your package
13:00 - 13:20 Ben Hutchings debplate: Build many binary packages with templates
13:30 - 15:00 Lunch break A chance for the teams to catch some air
15:00 - 15:45 Holger Levsen Reproducing bullseye in practice
16:00 - 16:45 Jonathan Carter Striving towards excellence
17:00 - 17:45 Delib* Organizing Peer-to-Peer Debian Facilitation Training
18:00 - 18:15 MDCO team members Closing
  • subject to confirmation

Timezone Remember this is an online event, meant for all of the world! Yes, the chosen times seem quite Europe-centric (but they are mostly a function of the times the talk submitters requested). Talks are 11:00 18:00UTC, which means, 06:00 13:00 Mexico (GMT-5), 20:00 03:00 Japan (GMT+9), 04:00 11:00 Western Canada/USA/Mexico (GMT-7) and the rest of the world, somewhere in between. (No, this was clearly not optimized for our dear usual beer team. Sorry! I guess we need you to be fully awake at beertime!)

[update] Connecting! Of course, I didn t make it clear at first how to connect to the Online miniDebConf, silly me!
  • The video streams are available at: https://video.debconf.org/
  • Suggested: tune in to the #minidebconf-online IRC channel in OFTC.
That should be it. Hope to see you there! (Stay home, stay safe )

20 June 2017

Norbert Preining: TeX Live 2017 hits Debian/unstable

Yesterday I uploaded the first packages of TeX Live 2017 to Debian/unstable, meaning that the new release cycle has started. Debian/stretch was released over the weekend, and this opened up unstable for new developments. The upload comprised the following packages: asymptote, cm-super, context, context-modules, texlive-base, texlive-bin, texlive-extra, texlive-extra, texlive-lang, texworks, xindy.
I mentioned already in a previous post the following changes: The last two changes are described together with other news (easy TEXMF tree management) in the TeX Live release post. These changes more or less sum up the new infra structure developments in TeX Live 2017. Since the last release to unstable (which happened in 2017-01-23) about half a year of package updates have accumulated, below is an approximate list of updates (not split into new/updated, though). Enjoy the brave new world of TeX Live 2017, and please report bugs to the BTS! Updated/new packages:
academicons, achemso, acmart, acro, actuarialangle, actuarialsymbol, adobemapping, alkalami, amiri, animate, aomart, apa6, apxproof, arabluatex, archaeologie, arsclassica, autoaligne, autobreak, autosp, axodraw2, babel, babel-azerbaijani, babel-english, babel-french, babel-indonesian, babel-japanese, babel-malay, babel-ukrainian, bangorexam, baskervaldx, baskervillef, bchart, beamer, beamerswitch, bgteubner, biblatex-abnt, biblatex-anonymous, biblatex-archaeology, biblatex-arthistory-bonn, biblatex-bookinother, biblatex-caspervector, biblatex-cheatsheet, biblatex-chem, biblatex-chicago, biblatex-claves, biblatex-enc, biblatex-fiwi, biblatex-gb7714-2015, biblatex-gost, biblatex-ieee, biblatex-iso690, biblatex-manuscripts-philology, biblatex-morenames, biblatex-nature, biblatex-opcit-booktitle, biblatex-oxref, biblatex-philosophy, biblatex-publist, biblatex-shortfields, biblatex-subseries, bibtexperllibs, bidi, biochemistry-colors, bookcover, boondox, bredzenie, breqn, bxbase, bxcalc, bxdvidriver, bxjalipsum, bxjaprnind, bxjscls, bxnewfont, bxorigcapt, bxpapersize, bxpdfver, cabin, callouts, chemfig, chemformula, chemmacros, chemschemex, childdoc, circuitikz, cje, cjhebrew, cjk-gs-integrate, cmpj, cochineal, combofont, context, conv-xkv, correctmathalign, covington, cquthesis, crimson, crossrefware, csbulletin, csplain, csquotes, css-colors, cstldoc, ctex, currency, cweb, datetime2-french, datetime2-german, datetime2-romanian, datetime2-ukrainian, dehyph-exptl, disser, docsurvey, dox, draftfigure, drawmatrix, dtk, dviinfox, easyformat, ebproof, elements, endheads, enotez, eqnalign, erewhon, eulerpx, expex, exsheets, factura, facture, fancyhdr, fbb, fei, fetamont, fibeamer, fithesis, fixme, fmtcount, fnspe, fontmfizz, fontools, fonts-churchslavonic, fontspec, footnotehyper, forest, gandhi, genealogytree, glossaries, glossaries-extra, gofonts, gotoh, graphics, graphics-def, graphics-pln, grayhints, gregoriotex, gtrlib-largetrees, gzt, halloweenmath, handout, hang, heuristica, hlist, hobby, hvfloat, hyperref, hyperxmp, ifptex, ijsra, japanese-otf-uptex, jlreq, jmlr, jsclasses, jslectureplanner, karnaugh-map, keyfloat, knowledge, komacv, koma-script, kotex-oblivoir, l3, l3build, ladder, langsci, latex, latex2e, latex2man, latex3, latexbug, latexindent, latexmk, latex-mr, leaflet, leipzig, libertine, libertinegc, libertinus, libertinust1math, lion-msc, lni, longdivision, lshort-chinese, ltb2bib, lualatex-math, lualibs, luamesh, luamplib, luaotfload, luapackageloader, luatexja, luatexko, lwarp, make4ht, marginnote, markdown, mathalfa, mathpunctspace, mathtools, mcexam, mcf2graph, media9, minidocument, modular, montserrat, morewrites, mpostinl, mptrees, mucproc, musixtex, mwcls, mweights, nameauth, newpx, newtx, newtxtt, nfssext-cfr, nlctdoc, novel, numspell, nwejm, oberdiek, ocgx2, oplotsymbl, optidef, oscola, overlays, pagecolor, pdflatexpicscale, pdfpages, pdfx, perfectcut, pgfplots, phonenumbers, phonrule, pkuthss, platex, platex-tools, polski, preview, program, proofread, prooftrees, pst-3dplot, pst-barcode, pst-eucl, pst-func, pst-ode, pst-pdf, pst-plot, pstricks, pstricks-add, pst-solides3d, pst-spinner, pst-tools, pst-tree, pst-vehicle, ptex2pdf, ptex-base, ptex-fontmaps, pxbase, pxchfon, pxrubrica, pythonhighlight, quran, ran_toks, reledmac, repere, resphilosophica, revquantum, rputover, rubik, rutitlepage, sansmathfonts, scratch, seealso, sesstime, siunitx, skdoc, songs, spectralsequences, stackengine, stage, sttools, studenthandouts, svg, tcolorbox, tex4ebook, tex4ht, texosquery, texproposal, thaienum, thalie, thesis-ekf, thuthesis, tikz-kalender, tikzmark, tikz-optics, tikz-palattice, tikzpeople, tikzsymbols, titlepic, tl17, tqft, tracklang, tudscr, tugboat-plain, turabian-formatting, txuprcal, typoaid, udesoftec, uhhassignment, ukrainian, ulthese, unamthesis, unfonts-core, unfonts-extra, unicode-math, uplatex, upmethodology, uptex-base, urcls, variablelm, varsfromjobname, visualtikz, xassoccnt, xcharter, xcntperchap, xecjk, xepersian, xetexko, xevlna, xgreek, xsavebox, xsim, ycbook.

5 December 2016

Norbert Preining: Debian/TeX Live 2016.20161130-1

As we are moving closer to the Debian release freeze, I am shipping out a new set of packages. Nothing spectacular here, just the regular updates and a security fix that was only reported internally. Add sugar and a few minor bug fixes.
texlive2016-debian I have been silent for quite some time, busy at my new job, busy with my little monster, writing papers, caring for visitors, living. I have quite a lot of things I want to write, but not enough time, so very short only this one. Enjoy. New packages awesomebox, baskervillef, forest-quickstart, gofonts, iscram, karnaugh-map, tikz-optics, tikzpeople, unicode-bidi. Updated packages acmart, algorithms, aomart, apa, apa6, appendix, apxproof, arabluatex, asymptote, background, bangorexam, beamer, beebe, biblatex-gb7714-2015, biblatex-mla, biblatex-morenames, bibtexperllibs, bidi, bookcover, bxjalipsum, bxjscls, c90, cals, cell, cm, cmap, cmextra, context, cooking-units, ctex, cyrillic, dirtree, ekaia, enotez, errata, euler, exercises, fira, fonts-churchslavonic, formation-latex-ul, german, glossaries, graphics, handout, hustthesis, hyphen-base, ipaex, japanese, jfontmaps, kpathsea, l3build, l3experimental, l3kernel, l3packages, latex2e-help-texinfo-fr, layouts, listofitems, lshort-german, manfnt, mathastext, mcf2graph, media9, mflogo, ms, multirow, newpx, newtx, nlctdoc, notes, patch, pdfscreen, phonenumbers, platex, ptex, quran, readarray, reledmac, shapes, showexpl, siunitx, talk, tcolorbox, tetex, tex4ht, texlive-en, texlive-scripts, texworks, tikz-dependency, toptesi, tpslifonts, tracklang, tugboat, tugboat-plain, units, updmap-map, uplatex, uspace, wadalab, xecjk, xellipsis, xepersian, xint.

27 January 2016

Uwe Kleine-K nig: Installing Debian Jessie on a Netgear ReadyNAS 104

The Netgear ReadyNAS 104 comes shipped with U-Boot. To access its "shell" remove the small quadratic sticker at the backside to reveal the UART pins (3V3, pinout available at Arnaud's NAS page[1] and connect a matching adapter. Also connect a network cable to the lower jack. Then on a different machine in the same network setup a tftp server (e.g. apt-get install tftpd-hpa). As of today the latest beta netboot installer (Beta 2) doesn't work any more because the kernel in jessie was updated since the installer was released. So pick up the armhf netboot installer from the daily snapshots. You need initrd.gz and vmlinuz. Furthermore armada-370-netgear-rn104.dtb. Update: As jessie is released now, download the following images: To make the installer ready to boot do:
# apt-get install u-boot-tools
$ cat vmlinuz armada-370-netgear-rn104.dtb > vmlinuz-rn104
$ mkimage -A arm -O linux -T multi -C none -a 0x04000000 -e 0x04000000 -n "Debian Jessie armhf installer" -d vmlinuz-rn104:initrd.gz uImage-installer-rn104
# cp uImage-installer-rn104 /srv/tftp
Then on the U-Boot shell setup networking and start the installer by issuing the following commands:
dhcp
setenv serverip 192.168.1.17
tftp uImage-installer-rn104
bootm $load_addr
With 192.168.1.17 being the IPv4 of the machine you set up the tftp server above, adapt accordingly to your setup. While in U-Boot the default ethernet device is the lower jack, the installer is only able to use the upper, so replug the ethernet cable to the upper receptacle. Go through the installation, and before rebooting do the following: Select "Change debconf priority" and set it to "low". Then "Download installer components" and check "mtd-modules-3.16.0-4-armmp-di". After that "Execute a shell" and do:
# depmod -a 
# modprobe pxa3xx_nand
# apt-install flash-kernel
# mount --bind /proc /target/proc
# chroot /target
# cat >> /etc/flash-kernel/db << EOF
Machine: NETGEAR ReadyNAS 104
DTB-Id: armada-370-netgear-rn104.dtb
DTB-Append: yes
Mtd-Kernel: uImage
Mtd-Initrd: minirootfs
U-Boot-Kernel-Address: 0x04000000
U-Boot-Initrd-Address: 0x05000000
Required-Packages: u-boot-tools
EOF
# flash-kernel
You then need to adapt the u-boot environment to pass the right root parameter to Linux. Alternatively add Bootloader-Sets-Incorrect-Root: yes to /etc/flash-kernel/db. [1] Note this page is about a ReadyNAS 102, the pinout is identical though.

4 June 2013

Rapha&#235;l Hertzog: My Free Software Activities in May 2013

This is my monthly summary of my free software related activities. If you re among the people who made a donation to support my work (70 , thanks everybody!), then you can learn how I spent your money. Otherwise it s just an interesting status update on my various projects. The Debian Administrator s Handbook Spanish translation completed. The Spanish team finished the translation of the book. The PDF build process was not yet ready to build translations so I had to fix this. At the same time, I also improved the mobipocket build script to make use of Amazon s kindlegen when available (since Amazon now requires the use of this tool to generate Mobipocket files that can be distributed on their platform). Once those issues were sorted I made some promotion of this first completed translation because they really deserve some big kudos ! Plans for the French translation. You know that the Debian Administrator s Handbook came to life as a translation of the French book Cahier de l Admin Debian (published by Eyrolles). This means that we currently have a free translation of a proprietary book. It s a bid of an odd situation that I always wanted to fix. I discussed with Eyrolles to find out how we could publish the original book under the same licenses that we picked for the English book and the result is that we setup a new crowdfunding campaign to liberate the French book and then make it an official French translation of the Debian Administrator s Handbook. Read the rest and support us on the ulule project page (a kickstarter like for people who are not based in the US). Liberate the Debian Handbook Debian France I updated our membership management application (galette) to version 0.7.4.1 with numerous bug fixes but the true highlight this month was Solutions Libres et Opensource , a tradeshow in Paris where Tanguy Ortolo, me, and other volunteers (C dric Boutillier, Arnaud G., and some that I have forgotten, thanks to them!), held a Debian booth for two consecutive days (May 28-29). For once we had lots of goodies to sell (buffs, mouse pad, polos, stickers, etc.) and the booth was very well attended. The Debian Booth (Tanguy on the left, Rapha l on the right) Google s Summer of Code Last month I was rater overwhelmed with queries from students who were interested in applying for the Package Tracking System Rewrite project that I offered to mentor as part of Google s Summer of Code. In the end, I got 6 good student applications that Stefano and me evaluated. We selected Marko Lalic. The Community Bonding Period is just starting and we re fleshing out details on how we will organize the work. We ll try to use the IRC channel #debian-qa on OFTC for questions and answers and weekly meetings. Misc Debian packaging I packaged zim 0.60 and with the release of Wheezy, I uploaded to unstable all the packages that I staged in experimental (cpputest, publican). I sponsored the upload of libmicrohttpd 0.9.27-1. I filed a couple of bug reports that I experienced with the upcoming dpkg 1.17.0 (#709172, #709009). In both cases, the package was using a wrongly hardcoded path to dpkg-divert (the binary moved from /usr/sbin/ to /usr/bin/ a while ago and the compatibility symlink is dropped now). I also dealt with #709064 where the user reported upgrade issues related to multiarch. I also filed an upstream bug report on publican to request some way to avoid so much duplication of files (actually I filed it as a response to the Debian bug #708705 that I received). Kali work I had to update OpenVAS for Kali but some parts failed to build in a Debian 7 environment. I diagnosed the problem and submitted a patch upstream. I also got in touch with the Debian OpenVAS maintainer as I wanted to contribute the package back to Debian, but timing issues have pushed this back for a little longer. Thanks See you next month for a new summary of my activities.

One comment Liked this article? Click here. My blog is Flattr-enabled.

10 February 2013

Stefano Zacchiroli: bits from the DPL for January 2013

(insert here: I've been to FOSDEM, I got a nasty flu, and other $lame_excuses for the delay in sending out this report) Dear Project Members, here's the monthly DPL activity report, this time for January 2013. About the next DPL This is the last DPL report before the start of the election process for the next term: around early March, about 20 days from now, the Secretary will send out the call for nominations. I'd like to respond (also) here to inquiries I'm receiving these days: I will not run again as DPL. So you have about 20 days to mob^Wconvince other DDs to run, or decide to run yourself. Do not to wait for the vary last minute, as that makes for lousy campaigns. I'm available to give feedback about my DPL experience to prospective candidates, ... and also to join mobbing^Wconvincing actions toward potential candidates. Just contact me. Call for helps Assets Cloud Images Work has gone on also on the front of supporting Debian installation in public "clouds". Thanks to Arnaud Patard, Jose Miguel Parrella Romero, Pierre Couzy, and Gianugo Rabellino, we now have Debian testing images for Microsoft Azure. Together with Amazon EC2, this is the second large provider supporting Debian via images maintained by Debian Developers. More providers are welcome, exactly as more hardware/CD vendors shipping Debian are always welcome. If you want to contribute support for other providers just show up on the -cloud mailing list and say so. Some documentation effort in view of Wheezy are in need of help too, in order to let our users know about "cloud" options, see #695681. DPL helpers The DPL helpers experiment goes on. We have had 2 more IRC meetings in January (see the minutes). Documentation of the "team" communication channels (mailing list, IRC, Git, etc.) is now available from the DPL wiki page. Talks I've given an invited Debian talk at Polytech'Grenoble, as part of a free software event organized for students of local universities. Slides of the talk are available. I'd like to thank Vincent Danjean for the event organization. Let's release Wheezy now!
Cheers.
PS the day-to-day activity log for January 2013 is available at the usual place master:/srv/leader/news/bits-from-the-DPL.txt.201301

26 January 2013

Ben Hutchings: What's in the Linux kernel for Debian 7.0 'wheezy', part 4

The Debian package of the Linux kernel is based on Linux 3.2, but has some additional features. This continues from parts 1, 2 and 3, and covers new and improved hardware support. DRM drivers from Linux 3.4 (proposed) Some recent Intel and AMD graphics chips were not supported well or at all by the DRM (Direct Rendering Manager) drivers in Linux 3.2. Although many bug fixes have been included in 3.2.y stable updates, we are considering updating them to the versions found in Linux 3.4. (We did something similar for Debian 6.0 'squeeze'.) Julien Cristau has been working on this and has prepared binary packages. To install, run as root:
gpg --no-default-keyring --keyring /usr/share/keyrings/debian-keyring.gpg --export 310180050905E40C   apt-key add -
echo deb http://people.debian.org/~jcristau/wheezy-drm34/ ./ > /etc/apt/sources.list.d/jcristau-wheezy-drm34.list
apt-get update
apt-get install linux-image-3.2.0-4.drm-amd64  # or -486, or -686-pae
Please test these and report your results to bug #687442. I would suggest testing suspend/resume (if applicable), use of internal and external displays on laptops, and 3D accelerated graphics (games and compositing window managers such as GNOME Shell). Remember that AMD/ATI graphics chips require the firmware from the firmware-linux-nonfree package for 3D acceleration and many other features. amilo-rfkill driver I wrote a standard rfkill driver for the Fujitsu-Siemens Amilo A1655 and M7440, based on the out-of-tree fsaa1655g and fsam7440 drivers. Unlike those, it should work with the rfkill command, Network Manager, etc. ALPS touchpads Newer touchpads made by ALPS use different protocols for reporting scroll and pinch gestures. Jonathan Nieder backported the changes to support these. Wacom tablets Jonathan Nieder updated the wacom driver to the version in Linux 3.5, adding support for the Intuos5, Bamboo Connect, Bamboo 16FG and various other models. Emulex OneConnect 'Skyhawk' Sarveshwar Bandi at Emulex contributed a backport of the be2net driver from Linux 3.5, adding support for their 'Skyhawk' chip. Marvell Kirkwood Marvell's Kirkwood SoCs have been supported upstream for some time and in Debian since release 6.0 'squeeze'. However new models and boards generally require specific support. Arnaud Patard backported support for the 6282 rev A1 chip found in QNAP TS-x19P II models, and for the Marvell Dreamplug and Iomega Iconnect. Miscellaneous More to come Missing hardware support is an important bug that can be fixed by kernel updates during a freeze and throughout the lifetime of a stable release. If you know that new hardware isn't supported by the Debian kernel, please open a bug report. I can't promise that it will be fixed, but we need to know what's missing. Hardware vendors that maintain their own drivers upstream (not out-of-tree) are especially welcome to contribute tested backports that add support for the latest hardware. Send mail to debian-kernel@lists.debian.org if you have any questions about this.

14 October 2012

Arnaud Quette: Cryptography: SSL support using Mozilla NSS landed in NUT

this week, a new feature was merged into the NUT development tree: beside of the historical OpenSSL support (for more than 10 years), NUT now also provides SSL features using Mozilla NSS. I've already posted a lengthy mail on the NUT developers list. But there are still a few things to be told: For the braves who want to test, here is a small procedure, adapted to Debian. We will use an existing installation, and overwrite upsd, upsmon, libupsclient and few more. Beware that it will obviously breaks the MD5 sums of your nut packages!
# apt-get install nut
$ tar xzvf nut-trunk-r3751.tar.gz
# apt-get install libnss3-dev 
./configure --without-all --with-nss \
        --prefix=/ --sysconfdir=/etc/nut  \
        --with-statepath=/var/run/nut \
        --with-altpidpath=/var/run/nut \
        --with-drvpath=/lib/nut \
        --with-pidpath=/var/run/nut \
        --datadir=/usr/share/nut \
        --with-pkgconfig-dir=/usr/lib/pkgconfig \
        --with-user=nut --with-group=nut
$ make && make install
Before concluding, here is the traditional thank you guys: As a conclusion, cryptography integration and usability is still not on par with other proprietary OS. I would love to see the crypto situation improving in Debian (and friends obviously). So this was just my 2 cents (nuts ;-)) to the cause... cheers,
-- Arno

25 September 2012

Arnaud Quette: Power management and NUT #1: an introduction

in this series of articles, I will be talking in depth about power management through the NUT project, its packaging on Debian, how to use it in general and how I see it being part of the GreenIT thing. So, let's start with an introduction: NUT is a Free Software (GPL v2+ and 3 to be precise), originally created for power protection using UPS, from home to data-centers:
To shortly describe the main features, I would say that NUT: NUT used to stand for Network UPS Tools. That is, a software for talking to your UPS and shutting down your systems when needed. This definition is a bit limited nowadays, since NUT supports 4 types of power device: Considering this, is the name Network UPS Tools still suitable? Not really! But the acronym NUT is well known! So, for the time being, I just stick using it, and focus on other more important things, until a better opportunity (ideas and comments are welcome!). That said, what can you do exactly with NUT? Currently, you can: All the above is available in a standardized way: Well, this is already a long and dense post, so I will stop there for today. In the next post, we will have a deeper dive into using NUT, for various use cases: submit yours if you can ;-) cheers,
-- Arno

10 July 2012

Arnaud Quette: Definitive solution to IPMI over LAN with Dell iDrac Express

I have this bunch of Dell R610, with iDrac6 Express management cards. I used these, among other things, for developing IPMI support in NUT and working on Infrastructure & Cloud power management. But that's the topic of another post (still, if you're interested in, check this and that). The thing is that this "IPMI" monitoring development has been limited to local support (Ie, power supplies can't be monitored remotely by the nut-ipmipsu driver), due to an issue : any attempt to enable IPMI access over the network was miserably failing! Well, these attempts were limited to a couple of 15 minutes runs, without plain motivation, almost a year ago. The various firmwares were up to date (iDrac 1.70, ...) , everything was running and configured fine, locally. But still... no IPMI available through the network! Looking on the Net, I've learned that many Dell customers with iDrac Express cards, were having the same issue. Dell support seems to have replaced tons of motherboards! There, I switched to other things, and time has passed.... A good year later (last week), I decided that it was time to get back on this. And I've found the solution there Incredible: this was due to a 'bug' in the Broadcom NetXtreme II LoM (LAN on Motherboad) firmware! I've not had time to dig this issue in depth, but here is a base explanation, for what it's worth: Some LoM initial self tests are failing. Thus, the LoM are not switched to the managed mode, and can't actually be available for BMC management (thus no IPMI over the network). In my case, the tests were wrongly failing at 'A07', a test which tries to establish a Gigabit connection! Strangely, all these servers are connected on a Gb switch! Not a fully satisfactory answer, but that said, there is a solution, and I've not much time to pour into this investigation (comments may always change my mind though!). So here is a comprehensive procedure to fix this, from your Linux system, and using FreeDOS:
$ mkfs.msdos /dev/sdX1
Note: 'X' is to be replaced by the exact name of your USB key. An hint: call 'tail -f /var/log/syslog" and unplug / replug your USB key. You will see some entries like "...sdb Attached SCSI removable disk". So, there, it's "sdb".
$ qemu -boot a -fda balder10.img -hda /dev/sdX A:\> sys c: A:\> xcopy /E /N a: c:
Note that you will need "root" privileges.
$ qemu -hda /dev/sdX
$ unzip Bcom_LAN_14.2.x_DOSUtilities_A03.exe
c:\ uxdiag -t abcd mfw 1
For what it's worth (again), I just hope that it will be useful to others... I will now prepare another post using using FreeIPMI to manage your servers, the GNU way... cheers,
-- Arno Thanks to Jordi Clariana, his enlightening post, Daniel for this one, Aur lien was motivating me again in solving this iDrac Express issue and Al Chu (FreeIPMI project leader) for all his invaluable help on IPMI.

9 July 2012

Arnaud Quette: Hello Planet Debian

Hey fellows Debian lovers, and old friends reading this, after more than a decade serving Debian, our greatest and beloved OS, I've finally decided to start blogging. Please, bear with me, I'm still very new at blogging. Thus, constructive comments and mails are welcome ;-) So here we actually go with the usual 'quick introduction post': I'm Arnaud Quette, 36 years old, and Debian developer since 2001. I'm leading the upstream project called 'NUT' (Network UPS Tools project) since 2005, and I'm a Free Software hacker since 1997 (incomplete list of contribs and linkedIn profile). I'm working at Eaton, a Debian partner participating in the power protection of our Debian infrastructure (FTP masters and Alioth). If you're a Debian admin, and feel Eaton can help, drop me a mail. My main Opensource contribution, for some years, is to lead 'NUT', and the related power management developments and distribution. As such, I also ensure that all this work lands finely in Debian, and is useful and reliable to users. In the past, I've also been working on some Media Center packages (Coherence UPnP, moovida, LIRC, ...), Synaptics touchpads app and some more. Here is the current list of maintained and co-maintained packages, though not up to date. It's not always that easy to balance between upstream lead and the numerous developments on one hand, and Debian packaging and support on the other hand... Not counting real life and the many others responsibilities I have. Just forgive me for the time it has taken (and is still taking, sometimes) to update my various packages. Just remember that I'm always doing my best :-) So dear Planet Debian readers, I'll be bothering you with some news on power management and GreenIT in Debian. There will be both upstream and Debian developments and releases info, so that you all can react. There may probably be some diversion, on other Debian related topics, one way or another... All this will depend on your feedback, and the balance with my available time and pleasure to share with you these few tech bits. cheers,
Arnaud

Next.