Search Results: "tina"

27 May 2017

Russ Allbery: On time management

Last December, the Guardian published a long essay by Oliver Burkeman entitled "Why time management is ruining our lives". Those who follow my book reviews know I read a lot of time management books, so of course I couldn't resist this. And, possibly surprisingly, not to disagree with it. It's an excellent essay, and well worth your time. Burkeman starts by talking about Inbox Zero:
If all this fervour seems extreme - Inbox Zero was just a set of technical instructions for handling email, after all - this was because email had become far more than a technical problem. It functioned as a kind of infinite to-do list, to which anyone on the planet could add anything at will.
This is, as Burkeman develops in the essay, an important critique of time management techniques in general, not just Inbox Zero: perhaps you can become moderately more efficient, but what are you becoming more efficient at doing, and why does it matter? If there were a finite amount of things that you had to accomplish, with leisure the reward at the end of the fixed task list, doing those things more efficiently makes perfect sense. But this is not the case in most modern life. Instead, we live in a world governed by Parkinson's Law: "Work expands to fill the time available for its completion." Worse, we live in a world where the typical employer takes Parkinson's Law, not as a statement on the nature of ever-expanding to-do lists, but a challenge to compress the time made available for a task to try to force the work to happen faster.
Burkeman goes farther into the politics, pointing out that a cui bono analysis of time management suggests that we're all being played by capitalist employers. I wholeheartedly agree, but that's worth a separate discussion; for those who want to explore that angle, David Graeber's Debt and John Kenneth Galbraith's The Affluent Society are worth your time.
What I want to write about here is why I still read (and recommend) time management literature, and how my thinking on it has changed. I started in the same place that most people probably do: I had a bunch of work to juggle, I felt I was making insufficient forward progress on it, and I felt my day contained a lot of slack that could be put to better use. The alluring promise of time management is that these problems can be resolved with more organization and some focus techniques. And there is a huge surge of energy that comes with adopting a new system and watching it work, since the good ones build psychological payoff into the tracking mechanism. Starting a new time management system is fun! Finishing things is fun!
I then ran into the same problem that I think most people do: after that initial surge of enthusiasm, I had lists, systems, techniques, data on where my time was going, and a far more organized intake process. But I didn't feel more comfortable with how I was spending my time, I didn't have more leisure time, and I didn't feel happier. Often the opposite: time management systems will often force you to notice all the things you want to do and how slow your progress is towards accomplishing any of them.
This is my fundamental disagreement with Getting Things Done (GTD): David Allen firmly believes that the act of recording everything that is nagging at you to be done relieves the brain of draining background processing loops and frees you to be more productive. He argues for this quite persuasively; as you can see from my review, I liked his book a great deal, and used his system for some time. But, at least for me, this does not work. Instead, having a complete list of goals towards which I am making slow or no progress is profoundly discouraging and depressing. The process of maintaining and dwelling on that list while watching it constantly grow was awful, quite a bit worse psychologically than having no time management system at all.
Mark Forster is the time management author who speaks the best to me, and one of the points he makes is that time management is the wrong framing. You're not going to somehow generate more time, and you're usually not managing minutes and seconds. A better framing is task management, or commitment management: the goal of the system is to manage what you mentally commit to accomplishing, usually by restricting that list to something far shorter than you would come up with otherwise. How, in other words, to limit your focus to a small enough set of goals that you can make meaningful progress instead of thrashing.
That, for me, is now the merit and appeal of time (or task) management systems: how do I sort through all the incoming noise, distractions, requests, desires, and compelling ideas that life throws at me and figure out which of them are worth investing time in? I also benefit from structuring that process for my peculiar psychology, in which backlogs I have to look at regularly are actively dangerous for my mental well-being. Left unchecked, I can turn even the most enjoyable hobby into an obligation and then into a source of guilt for not meeting the (entirely artificial) terms of the obligation I created, without even intending to.
And here I think it has a purpose, but it's not the purpose that the time management industry is selling. If you think of time management as a way to get more things done and get more out of each moment, you're going to be disappointed (and you're probably also being taken advantage of by the people who benefit from unsustainable effort without real, unstructured leisure time).
I practice Inbox Zero, but the point wasn't to be more efficient at processing my email. The point was to avoid the (for me) psychologically damaging backlog of messages while acting on the knowledge that 99% of email should go immediately into the trash with no further action. Email is an endless incoming stream of potential obligations or requests for my time (even just to read a longer message) that I should normally reject. I also take the time to notice patterns of email that I never care about and then shut off the source or write filters to delete that email for me. I can then reserve my email time for moments of human connection, directly relevant information, or very interesting projects, and spend the time on those messages without guilt (or at least much less guilt) about ignoring everything else.
Prioritization is extremely difficult, particularly once you realize that true prioritization is not about first and later, but about soon or never. The point of prioritization is not to choose what to do first, it's to choose the 5% of things that you are going to do at all, convince yourself to be mentally okay with never doing the other 95% (and not lying to yourself about how there will be some future point when you'll magically have more time), and vigorously defend your focus and effort for that 5%. And, hopefully, wholeheartedly enjoy working on those things, without guilt or nagging that there's something else you should be doing instead.
I still fail at this all the time. But I'm better than I used to be.
For me, that mental shift was by far the hardest part. But once you've made that shift, I do think the time management world has a lot of tools and techniques to help you make more informed choices about the 5%, and to help you overcome procrastination and loss of focus on your real goals. Those real goals should include true unstructured leisure and "because I want to" projects. And hopefully, if you're in a financial position to do it, include working less on what other people want you to do and more on the things that delight you. Or at least making a well-informed strategic choice (for the sake of money or some other concrete and constantly re-evaluated reason) to sacrifice your personal goals for some temporary external ones.

24 May 2017

Steve Kemp: Getting ready for Stretch

I run about 17 servers. Of those about six are very personal and the rest are a small cluster which are used for a single website. (Partly because the code is old and in some ways a bit badly designed, partly because "clustering!", "high availability!", "learning!", "fun!" - seriously I had a lot of fun putting together a fault-tolerant deployment with haproxy, ucarp, etc, etc. If I were paying for it the site would be both retired and static!)
I've started the process of upgrading to stretch by picking a bunch of hosts that do things I could live without for a few days - in case there were big problems, or I needed to restore from backups. So far I've upgraded:
All upgrades were painless, with only one real surprise - the attic-backup software was removed from Debian. Although I do intend to retry using Lars' excellent obnam in the near future, pragmatically I wanted to stick with what I'm familiar with. Borg backup is a fork of attic I've been aware of for a long time, but I never quite had a reason to try it out. Setting it up pretty much just meant editing my backup-script:
s/attic/borg/g
Once I did that, and created some new destinations, all was good:
borg@rsync.io ~ $ borg init /backups/git.steve.org.uk.borg/
borg@rsync.io ~ $ borg init /backups/master.steve.org.uk.borg/
borg@rsync.io ~ $ ..
Upgrading other hosts, for example my website(s), and my email-box, will be more complex and fiddly. On that basis they will definitely wait for the formal stretch release. But having a couple of hosts running the frozen distribution is good for testing, and to let me see what is new.

22 May 2017

Gunnar Wolf: Open Source Symposium 2017

I travelled (for three days only!) to Argentina, to be a part of the Open Source Symposium 2017, a co-located event of the International Conference on Software Engineering.

This is, all in all, an interesting although small conference. We are around 30 people in the room. This is a quite unusual conference for me, as this is among the first "formal" academic conferences I am part of. Sessions have so far been quite interesting.
What am I linking to from this image? Of course, the proceedings! They managed to publish the proceedings via the "formal" academic channels (a nice hard-cover Springer volume) under an Open Access license (which is sadly not usual, and is unbelievably expensive). So, you can download the full proceedings, or article by article, in EPUB or in PDF...
...Which is very very nice :)
Previous editions of this symposium also have their respective proceedings available, but AFAICT they have not been downloadable.
So, get the book; it provides very interesting and original insights into our community seen from several quite novel angles!

3 May 2017

Vincent Bernat: VXLAN: BGP EVPN with Cumulus Quagga

VXLAN is an overlay network to encapsulate Ethernet traffic over an existing (highly available and scalable, possibly the Internet) IP network while accommodating a very large number of tenants. It is defined in RFC 7348. For an uncut introduction on its use with Linux, have a look at my VXLAN & Linux post.
[Figure: VXLAN deployment]
In the above example, we have hypervisors hosting virtual machines from different tenants. Each virtual machine is given access to a tenant-specific virtual Ethernet segment. Users are expecting classic Ethernet segments: no MAC restrictions[1], total control over the IP addressing scheme they use and availability of multicast.
In a large VXLAN deployment, two aspects need attention:
  1. discovery of other endpoints (VTEPs) sharing the same VXLAN segments, and
  2. avoidance of BUM frames (broadcast, unknown unicast and multicast) as they have to be forwarded to all VTEPs.
A typical solution for the first point is using multicast. For the second point, this is source-address learning.

Introduction to BGP EVPN

BGP EVPN (RFC 7432 and draft-ietf-bess-evpn-overlay for its application to VXLAN) is a standard control protocol to efficiently solve those two aspects without relying on multicast or source-address learning. BGP EVPN relies on BGP (RFC 4271) and its MP-BGP extensions (RFC 4760). BGP is the routing protocol powering the Internet. It is highly scalable and interoperable. It is also extensible and one of its extensions is MP-BGP. This extension can carry reachability information (NLRI) for multiple protocols (IPv4, IPv6, L3VPN and in our case EVPN). EVPN is a special family to advertise MAC addresses and the remote equipment they are attached to.
There are basically two kinds of reachability information a VTEP sends through BGP EVPN:
  1. the VNIs they have interest in (type 3 routes), and
  2. for each VNI, the local MAC addresses (type 2 routes).
The protocol also covers other aspects of virtual Ethernet segments (L3 reachability information from ARP/ND caches, MAC mobility and multi-homing[2]) but we won't describe them here.
To deploy BGP EVPN, a typical solution is to use several route reflectors (both for redundancy and scalability), like in the picture below. Each VTEP opens a BGP session to at least two route reflectors, sends its information (MACs and VNIs) and receives others'. This reduces the number of BGP sessions to configure.
[Figure: VXLAN deployment with route reflectors]
Compared to other solutions to deploy VXLAN, BGP EVPN has three main advantages:
  • interoperability with other vendors (notably Juniper and Cisco),
  • proven scalability (a typical BGP router handles several millions of routes), and
  • possibility to enforce fine-grained policies.
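To make the earlier point about route reflectors reducing the number of BGP sessions more concrete, here is a small arithmetic sketch. It assumes the usual iBGP alternative of a full mesh between VTEPs; the function names and the VTEP counts are illustrative and not part of the original setup. Sessions between the route reflectors themselves are ignored.

```python
def full_mesh_sessions(vteps: int) -> int:
    """iBGP sessions needed if every VTEP peers with every other VTEP."""
    return vteps * (vteps - 1) // 2


def route_reflector_sessions(vteps: int, reflectors: int = 2) -> int:
    """iBGP sessions needed if each VTEP only peers with each route reflector."""
    return vteps * reflectors


if __name__ == "__main__":
    # Hypothetical deployment sizes, chosen only for illustration.
    for count in (4, 50, 500):
        print(count, full_mesh_sessions(count), route_reflector_sessions(count))
```

For a handful of VTEPs the difference is negligible, but the full mesh grows quadratically while the route-reflector design grows linearly, and each VTEP only ever has to configure its sessions to the reflectors.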
On Linux, Cumulus Quagga is a fairly complete implementation of BGP EVPN (type 3 routes for VTEP discovery, type 2 routes with MAC or IP addresses, MAC mobility when a host changes from one VTEP to another one) which requires very little configuration. This is a fork of Quagga and currently used in Cumulus Linux, a network operating system based on Debian powering switches from various brands. At some point, BGP EVPN support will be contributed back to FRR, a community-maintained fork of Quagga[3].
It should be noted the BGP EVPN implementation of Cumulus Quagga currently only supports IPv4.

Route reflector setup

Before configuring each VTEP, we need to configure two or more route reflectors. There are many solutions. I will present three of them:
  • using Cumulus Quagga,
  • using GoBGP, an implementation of BGP in Go,
  • using Juniper JunOS.
For reliability purposes, it's possible (and easy) to use one implementation for some route reflectors and another implementation for the other ones.
The proposed configurations are quite minimal. However, it is possible to centralize policies on the route reflectors (e.g. routes tagged with some community can only be readvertised to some group of VTEPs).

Using Quagga

The configuration is pretty simple. We suppose the configured route reflector has 203.0.113.254 configured as a loopback IP.
router bgp 65000
  bgp router-id 203.0.113.254
  bgp cluster-id 203.0.113.254
  bgp log-neighbor-changes
  no bgp default ipv4-unicast
  neighbor fabric peer-group
  neighbor fabric remote-as 65000
  neighbor fabric capability extended-nexthop
  neighbor fabric update-source 203.0.113.254
  bgp listen range 203.0.113.0/24 peer-group fabric
  !
  address-family evpn
   neighbor fabric activate
   neighbor fabric route-reflector-client
  exit-address-family
  !
  exit
!
A peer group fabric is defined and we leverage the dynamic neighbor feature of Cumulus Quagga: we don't have to explicitly define each neighbor. Any client from 203.0.113.0/24 and presenting itself as part of AS 65000 can connect. All sent EVPN routes will be accepted and reflected to the other clients.
You don't need to run Zebra, the route engine talking with the kernel. Instead, start bgpd with the --no_kernel flag.
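The acceptance rule for dynamic neighbors can be modelled in a few lines. This is only a sketch of the rule as described above (source address inside the listen range, matching AS number), not of how Cumulus Quagga implements it; the function name and constants are hypothetical.

```python
import ipaddress

# Values taken from the route reflector configuration above.
LISTEN_RANGE = ipaddress.ip_network("203.0.113.0/24")  # "bgp listen range"
EXPECTED_AS = 65000                                     # "remote-as 65000"


def accepts_peer(address: str, remote_as: int) -> bool:
    """Return True if a connecting peer matches the listen range and the AS."""
    in_range = ipaddress.ip_address(address) in LISTEN_RANGE
    return in_range and remote_as == EXPECTED_AS


if __name__ == "__main__":
    print(accepts_peer("203.0.113.2", 65000))   # VTEP inside the range
    print(accepts_peer("198.51.100.7", 65000))  # outside the range
    print(accepts_peer("203.0.113.2", 65001))   # wrong AS
```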

Using GoBGP

GoBGP is a clean implementation of BGP in Go[4]. It exposes an RPC API for configuration (but accepts a configuration file and comes with a command-line client).
It doesn't support dynamic neighbors, so you'll have to use the API, the command-line client or some templating language to automate their declaration. A configuration with only one neighbor is like this:
global:
  config:
    as: 65000
    router-id: 203.0.113.254
    local-address-list:
      - 203.0.113.254
neighbors:
  - config:
      neighbor-address: 203.0.113.1
      peer-as: 65000
    afi-safis:
      - config:
          afi-safi-name: l2vpn-evpn
    route-reflector:
      config:
        route-reflector-client: true
        route-reflector-cluster-id: 203.0.113.254
More neighbors can be added from the command line:
$ gobgp neighbor add 203.0.113.2 as 65000 \
>         route-reflector-client 203.0.113.254 \
>         --address-family evpn
GoBGP won't try to interact with the kernel, which is fine as a route reflector.
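Since GoBGP has no dynamic neighbors, the declarations have to be generated somehow. As one possible approach, the sketch below builds the command line shown above for each VTEP in a list. The command syntax is copied from the example above, while the VTEP list and the helper name are assumptions made for illustration; the script only prints the commands and does not run them.

```python
import shlex

CLUSTER_ID = "203.0.113.254"  # route reflector cluster ID from the configuration
PEER_AS = 65000

# Hypothetical inventory of VTEP addresses to declare as clients.
VTEPS = ["203.0.113.1", "203.0.113.2", "203.0.113.3"]


def neighbor_add_command(address: str) -> str:
    """Build the 'gobgp neighbor add' command line for one VTEP."""
    args = [
        "gobgp", "neighbor", "add", address, "as", str(PEER_AS),
        "route-reflector-client", CLUSTER_ID,
        "--address-family", "evpn",
    ]
    return shlex.join(args)


if __name__ == "__main__":
    for vtep in VTEPS:
        print(neighbor_add_command(vtep))
```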

Using Juniper JunOS

A variety of Juniper products can be a BGP route reflector, notably:
The main factor is the CPU and the memory. The QFX5100 is low on memory and won't support large deployments without some additional policing.
Here is a configuration similar to the Quagga one:
interfaces {
    lo0 {
        unit 0 {
            family inet {
                address 203.0.113.254/32;
            }
        }
    }
}
protocols {
    bgp {
        group fabric {
            family evpn {
                signaling {
                    /* Do not try to install EVPN routes */
                    no-install;
                }
            }
            type internal;
            cluster 203.0.113.254;
            local-address 203.0.113.254;
            allow 203.0.113.0/24;
        }
    }
}
routing-options {
    router-id 203.0.113.254;
    autonomous-system 65000;
}

VTEP setup

The next step is to configure each VTEP/hypervisor. Each VXLAN is locally configured using a bridge for local virtual interfaces, as illustrated in the schema below. The bridge is taking care of the local MAC addresses (notably, using source-address learning) and the VXLAN interface takes care of the remote MAC addresses (received with BGP EVPN).
[Figure: Bridged VXLAN device]
VXLANs can be provisioned with the following script. Source-address learning is disabled as we will rely solely on BGP EVPN to synchronize FDBs between the hypervisors.
for vni in 100 200; do
    # Create VXLAN interface
    ip link add vxlan${vni} type vxlan \
        id ${vni} \
        dstport 4789 \
        local 203.0.113.2 \
        nolearning
    # Create companion bridge
    brctl addbr br${vni}
    brctl addif br${vni} vxlan${vni}
    brctl stp br${vni} off
    ip link set up dev br${vni}
    ip link set up dev vxlan${vni}
done
# Attach each VM to the appropriate segment
brctl addif br100 vnet10
brctl addif br100 vnet11
brctl addif br200 vnet12
The configuration of Cumulus Quagga is similar to the one used for a route reflector, except we use the advertise-all-vni directive to publish all local VNIs.
router bgp 65000
  bgp router-id 203.0.113.2
  no bgp default ipv4-unicast
  neighbor fabric peer-group
  neighbor fabric remote-as 65000
  neighbor fabric capability extended-nexthop
  neighbor fabric update-source dummy0
  ! BGP sessions with route reflectors
  neighbor 203.0.113.253 peer-group fabric
  neighbor 203.0.113.254 peer-group fabric
  !
  address-family evpn
   neighbor fabric activate
   advertise-all-vni
  exit-address-family
  !
  exit
!
If everything works as expected, the instances sharing the same VNI should be able to ping each other. If IPv6 is enabled on the VMs, the ping command shows if everything is in order:
$ ping -c10 -w1 -t1 ff02::1%eth0
PING ff02::1%eth0(ff02::1%eth0) 56 data bytes
64 bytes from fe80::5254:33ff:fe00:8%eth0: icmp_seq=1 ttl=64 time=0.016 ms
64 bytes from fe80::5254:33ff:fe00:b%eth0: icmp_seq=1 ttl=64 time=4.98 ms (DUP!)
64 bytes from fe80::5254:33ff:fe00:9%eth0: icmp_seq=1 ttl=64 time=4.99 ms (DUP!)
64 bytes from fe80::5254:33ff:fe00:a%eth0: icmp_seq=1 ttl=64 time=4.99 ms (DUP!)
--- ff02::1%eth0 ping statistics ---
1 packets transmitted, 1 received, +3 duplicates, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.016/3.745/4.991/2.152 ms

Verification

Step by step, let's check how everything comes together.

Getting VXLAN information from the kernel

On each VTEP, Quagga should be able to retrieve the information about configured VXLANs. This can be checked with vtysh:
# show interface vxlan100
Interface vxlan100 is up, line protocol is up
  Link ups:       1    last: 2017/04/29 20:01:33.43
  Link downs:     0    last: (never)
  PTM status: disabled
  vrf: Default-IP-Routing-Table
  index 11 metric 0 mtu 1500
  flags: <UP,BROADCAST,RUNNING,MULTICAST>
  Type: Ethernet
  HWaddr: 62:42:7a:86:44:01
  inet6 fe80::6042:7aff:fe86:4401/64
  Interface Type Vxlan
  VxLAN Id 100
  Access VLAN Id 1
  Master (bridge) ifindex 9 ifp 0x56536e3f3470
The important points are:
  • the VNI is 100, and
  • the bridge device was correctly detected.
Quagga should also be able to retrieve information about the local MAC addresses:
# show evpn mac vni 100
Number of MACs (local and remote) known for this VNI: 2
MAC               Type   Intf/Remote VTEP      VLAN
50:54:33:00:00:0a local  eth1.100
50:54:33:00:00:0b local  eth2.100

BGP sessions

Each VTEP has to establish a BGP session to the route reflectors. On the VTEP, this can be checked by running vtysh:
# show bgp neighbors 203.0.113.254
BGP neighbor is 203.0.113.254, remote AS 65000, local AS 65000, internal link
 Member of peer-group fabric for session parameters
  BGP version 4, remote router ID 203.0.113.254
  BGP state = Established, up for 00:00:45
  Neighbor capabilities:
    4 Byte AS: advertised and received
    AddPath:
      L2VPN EVPN: RX advertised L2VPN EVPN
    Route refresh: advertised and received(new)
    Address family L2VPN EVPN: advertised and received
    Hostname Capability: advertised
    Graceful Restart Capabilty: advertised
[...]
 For address family: L2VPN EVPN
  fabric peer-group member
  Update group 1, subgroup 1
  Packet Queue length 0
  Community attribute sent to this neighbor(both)
  8 accepted prefixes

  Connections established 1; dropped 0
  Last reset never
Local host: 203.0.113.2, Local port: 37603
Foreign host: 203.0.113.254, Foreign port: 179
The output includes the following information:
  • the BGP state is Established,
  • the address family L2VPN EVPN is correctly advertised, and
  • 8 routes are received from this route reflector.
The state of the BGP sessions can also be checked from the route reflectors. With GoBGP, use the following command:
# gobgp neighbor 203.0.113.2
BGP neighbor is 203.0.113.2, remote AS 65000, route-reflector-client
  BGP version 4, remote router ID 203.0.113.2
  BGP state = established, up for 00:04:30
  BGP OutQ = 0, Flops = 0
  Hold time is 9, keepalive interval is 3 seconds
  Configured hold time is 90, keepalive interval is 30 seconds
  Neighbor capabilities:
    multiprotocol:
        l2vpn-evpn:     advertised and received
    route-refresh:      advertised and received
    graceful-restart:   received
    4-octet-as: advertised and received
    add-path:   received
    UnknownCapability(73):      received
    cisco-route-refresh:        received
[...]
  Route statistics:
    Advertised:             8
    Received:               5
    Accepted:               5
With JunOS, use the below command:
> show bgp neighbor 203.0.113.2
Peer: 203.0.113.2+38089 AS 65000 Local: 203.0.113.254+179 AS 65000
  Group: fabric                Routing-Instance: master
  Forwarding routing-instance: master
  Type: Internal    State: Established
  Last State: OpenConfirm   Last Event: RecvKeepAlive
  Last Error: None
  Options: <Preference LocalAddress Cluster AddressFamily Rib-group Refresh>
  Address families configured: evpn
  Local Address: 203.0.113.254 Holdtime: 90 Preference: 170
  NLRI evpn: NoInstallForwarding
  Number of flaps: 0
  Peer ID: 203.0.113.2     Local ID: 203.0.113.254     Active Holdtime: 9
  Keepalive Interval: 3          Group index: 0    Peer index: 2
  I/O Session Thread: bgpio-0 State: Enabled
  BFD: disabled, down
  NLRI for restart configured on peer: evpn
  NLRI advertised by peer: evpn
  NLRI for this session: evpn
  Peer supports Refresh capability (2)
  Stale routes from peer are kept for: 300
  Peer does not support Restarter functionality
  NLRI that restart is negotiated for: evpn
  NLRI of received end-of-rib markers: evpn
  NLRI of all end-of-rib markers sent: evpn
  Peer does not support LLGR Restarter or Receiver functionality
  Peer supports 4 byte AS extension (peer-as 65000)
  NLRI's for which peer can receive multiple paths: evpn
  Table bgp.evpn.0 Bit: 20000
    RIB State: BGP restart is complete
    RIB State: VPN restart is complete
    Send state: in sync
    Active prefixes:              5
    Received prefixes:            5
    Accepted prefixes:            5
    Suppressed due to damping:    0
    Advertised prefixes:          8
  Last traffic (seconds): Received 276  Sent 170  Checked 276
  Input messages:  Total 61     Updates 3       Refreshes 0     Octets 1470
  Output messages: Total 62     Updates 4       Refreshes 0     Octets 1775
  Output Queue[1]: 0            (bgp.evpn.0, evpn)
If a BGP session cannot be established, the logs of each BGP daemon should mention the cause.

Sent routes

From each VTEP, Quagga needs to send:
  • one type 3 route for each local VNI, and
  • one type 2 route for each local MAC address.
The best place to check the received routes is on one of the route reflectors. If you are using JunOS, the following command will display the received routes from the provided VTEP:
> show route table bgp.evpn.0 receive-protocol bgp 203.0.113.2
bgp.evpn.0: 10 destinations, 10 routes (10 active, 0 holddown, 0 hidden)
  Prefix                  Nexthop              MED     Lclpref    AS path
  2:203.0.113.2:100::0::50:54:33:00:00:0a/304 MAC/IP
*                         203.0.113.2                  100        I
  2:203.0.113.2:100::0::50:54:33:00:00:0b/304 MAC/IP
*                         203.0.113.2                  100        I
  3:203.0.113.2:100::0::203.0.113.2/304 IM
*                         203.0.113.2                  100        I
  3:203.0.113.2:200::0::203.0.113.2/304 IM
*                         203.0.113.2                  100        I
There is one type 3 route for VNI 100 and another one for VNI 200. There are also two type 2 routes for two MAC addresses on VNI 100. To get more information, you can add the keyword extensive. Here is a type 3 route advertising 203.0.113.2 as a VTEP for VNI 100[8]:
> show route table bgp.evpn.0 receive-protocol bgp 203.0.113.2 extensive
bgp.evpn.0: 11 destinations, 11 routes (11 active, 0 holddown, 0 hidden)
* 3:203.0.113.2:100::0::203.0.113.2/304 IM (1 entry, 1 announced)
     Accepted
     Route Distinguisher: 203.0.113.2:100
     Nexthop: 203.0.113.2
     Localpref: 100
     AS path: I
     Communities: target:65000:268435556 encapsulation:vxlan(0x8)
[...]
Here is a type 2 route announcing the location of the 50:54:33:00:00:0a MAC address for VNI 100:
> show route table bgp.evpn.0 receive-protocol bgp 203.0.113.2 extensive
bgp.evpn.0: 11 destinations, 11 routes (11 active, 0 holddown, 0 hidden)
* 2:203.0.113.2:100::0::50:54:33:00:00:0a/304 MAC/IP (1 entry, 1 announced)
     Accepted
     Route Distinguisher: 203.0.113.2:100
     Route Label: 100
     ESI: 00:00:00:00:00:00:00:00:00:00
     Nexthop: 203.0.113.2
     Localpref: 100
     AS path: I
     Communities: target:65000:268435556 encapsulation:vxlan(0x8)
[...]
With Quagga, you can get a similar output with vtysh:
# show bgp evpn route
BGP table version is 0, local router ID is 203.0.113.1
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal
Origin codes: i - IGP, e - EGP, ? - incomplete
EVPN type-2 prefix: [2]:[ESI]:[EthTag]:[MAClen]:[MAC]
EVPN type-3 prefix: [3]:[EthTag]:[IPlen]:[OrigIP]
   Network          Next Hop            Metric LocPrf Weight Path
Route Distinguisher: 203.0.113.2:100
*>i[2]:[0]:[0]:[48]:[50:54:33:00:00:0a]
                    203.0.113.2                   100      0 i
*>i[2]:[0]:[0]:[48]:[50:54:33:00:00:0b]
                    203.0.113.2                   100      0 i
*>i[3]:[0]:[32]:[203.0.113.2]
                    203.0.113.2                   100      0 i
Route Distinguisher: 203.0.113.2:200
*>i[3]:[0]:[32]:[203.0.113.2]
                    203.0.113.2                   100      0 i
[...]
With GoBGP, use the following command:
# gobgp global rib -a evpn | grep rd:203.0.113.2:200
    Network  Next Hop             AS_PATH              Age        Attrs
*>  [type:macadv][rd:203.0.113.2:100][esi:single-homed][etag:0][mac:50:54:33:00:00:0a][ip:<nil>][labels:[100]]203.0.113.2                               00:00:17   [ Origin: i   LocalPref: 100   Extcomms: [VXLAN], [65000:268435556] ]
*>  [type:macadv][rd:203.0.113.2:100][esi:single-homed][etag:0][mac:50:54:33:00:00:0b][ip:<nil>][labels:[100]]203.0.113.2                               00:00:17   [ Origin: i   LocalPref: 100   Extcomms: [VXLAN], [65000:268435556] ]
*>  [type:macadv][rd:203.0.113.2:200][esi:single-homed][etag:0][mac:50:54:33:00:00:0a][ip:<nil>][labels:[200]]203.0.113.2                               00:00:17   [ Origin: i   LocalPref: 100   Extcomms: [VXLAN], [65000:268435656] ]
*>  [type:multicast][rd:203.0.113.2:100][etag:0][ip:203.0.113.2]203.0.113.2                               00:00:17   [ Origin: i   LocalPref: 100   Extcomms: [VXLAN], [65000:268435556] ]
*>  [type:multicast][rd:203.0.113.2:200][etag:0][ip:203.0.113.2]203.0.113.2                               00:00:17   [ Origin: i   LocalPref: 100   Extcomms: [VXLAN], [65000:268435656] ]
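The route target values in these outputs (65000:268435556 for VNI 100 and 65000:268435656 for VNI 200) look large, but they are simply 2^28 plus the VNI. The sketch below decodes them, assuming the auto-derived route target layout described in the EVPN overlay draft (1 bit for "auto-derived", 3 bits for the type with 1 meaning VXLAN, 4 bits for a domain ID and 24 bits for the service ID). That layout is an assumption about how these values were built, and the function name is hypothetical.

```python
def decode_route_target(local_admin: int) -> dict:
    """Split the 32-bit local administrator field of an auto-derived route target."""
    return {
        # Top bit: 0 means the value was auto-derived.
        "auto_derived": (local_admin >> 31) & 0x1 == 0,
        # Next 3 bits: type of the service ID (1 is VXLAN in the assumed layout).
        "type": (local_admin >> 28) & 0x7,
        # Next 4 bits: domain ID.
        "domain_id": (local_admin >> 24) & 0xF,
        # Low 24 bits: service ID, here the VNI.
        "service_id": local_admin & 0xFFFFFF,
    }


if __name__ == "__main__":
    for value in (268435556, 268435656):
        print(value, hex(value), decode_route_target(value))
```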

Received routes

Each VTEP should have received the type 2 and type 3 routes from its fellow VTEPs, through the route reflectors. You can check with the show bgp evpn route command of vtysh.
Does Quagga correctly understand the received routes? The type 3 routes are translated to an association between the remote VTEPs and the VNIs:
# show evpn vni
Number of VNIs: 2
VNI        VxLAN IF              VTEP IP         # MACs   # ARPs   Remote VTEPs
100        vxlan100              203.0.113.2     4        0        203.0.113.3
                                                                   203.0.113.1
200        vxlan200              203.0.113.2     3        0        203.0.113.3
                                                                   203.0.113.1
The type 2 routes are translated to an association between the remote MACs and the remote VTEPs:
# show evpn mac vni 100
Number of MACs (local and remote) known for this VNI: 4
MAC               Type   Intf/Remote VTEP      VLAN
50:54:33:00:00:09 remote 203.0.113.1
50:54:33:00:00:0a local  eth1.100
50:54:33:00:00:0b local  eth2.100
50:54:33:00:00:0c remote 203.0.113.3

FDB configuration

The last step is to ensure Quagga has correctly provided the received information to the kernel. This can be checked with the bridge command:
# bridge fdb show dev vxlan100 | grep dst
00:00:00:00:00:00 dst 203.0.113.1 self permanent
00:00:00:00:00:00 dst 203.0.113.3 self permanent
50:54:33:00:00:0c dst 203.0.113.3 self
50:54:33:00:00:09 dst 203.0.113.1 self
All good! The first two lines are the translation of the type 3 routes (any BUM frame will be sent to both 203.0.113.1 and 203.0.113.3) and the last two are the translation of the type 2 routes.
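As a way to picture this translation, the following sketch turns received routes into lines shaped like the bridge output above. It is a simplified model, not Quagga's actual logic: the class and function names are hypothetical, and the inputs are just the remote VTEPs (from type 3 routes) and the remote MAC/VTEP pairs (from type 2 routes) for one VNI.

```python
from dataclasses import dataclass

ALL_ZERO_MAC = "00:00:00:00:00:00"  # default entry used for BUM frames


@dataclass(frozen=True)
class FdbEntry:
    mac: str
    dst: str
    permanent: bool

    def __str__(self) -> str:
        suffix = " permanent" if self.permanent else ""
        return f"{self.mac} dst {self.dst} self{suffix}"


def fdb_for_vni(remote_vteps, remote_macs):
    """Build FDB entries for one VNI.

    remote_vteps: VTEP addresses learned from type 3 routes.
    remote_macs: (mac, vtep) pairs learned from type 2 routes.
    """
    entries = [FdbEntry(ALL_ZERO_MAC, vtep, True) for vtep in remote_vteps]
    entries += [FdbEntry(mac, vtep, False) for mac, vtep in remote_macs]
    return entries


if __name__ == "__main__":
    for entry in fdb_for_vni(
        ["203.0.113.1", "203.0.113.3"],
        [("50:54:33:00:00:0c", "203.0.113.3"), ("50:54:33:00:00:09", "203.0.113.1")],
    ):
        print(entry)
```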

Interoperability

One of the strengths of BGP EVPN is the interoperability with other network vendors. To demonstrate it works as expected, we will configure a Juniper vMX to act as a VTEP.
First, we need to configure the physical bridge[9]. This is similar to the use of ip link and brctl with Linux. We only configure one physical interface with two old-school VLANs paired with matching VNIs.
interfaces {
    ge-0/0/1 {
        unit 0 {
            family bridge {
                interface-mode trunk;
                vlan-id-list [ 100 200 ];
            }
        }
    }
}
routing-instances {
    switch {
        instance-type virtual-switch;
        interface ge-0/0/1.0;
        bridge-domains {
            vlan100 {
                domain-type bridge;
                vlan-id 100;
                vxlan {
                    vni 100;
                    ingress-node-replication;
                }
            }
            vlan200 {
                domain-type bridge;
                vlan-id 200;
                vxlan {
                    vni 200;
                    ingress-node-replication;
                }
            }
        }
    }
}
Then, we configure BGP EVPN to advertise all known VNIs. The configuration is quite similar to the one we did with Quagga:
protocols {
    bgp {
        group fabric {
            type internal;
            multihop;
            family evpn signaling;
            local-address 203.0.113.3;
            neighbor 203.0.113.253;
            neighbor 203.0.113.254;
        }
    }
}
routing-instances {
    switch {
        vtep-source-interface lo0.0;
        route-distinguisher 203.0.113.3:1;
        vrf-import EVPN-VRF-VXLAN;
        vrf-target {
            target:65000:1;
            auto;
        }
        protocols {
            evpn {
                encapsulation vxlan;
                extended-vni-list all;
                multicast-mode ingress-replication;
            }
        }
    }
}
routing-options {
    router-id 203.0.113.3;
    autonomous-system 65000;
}
policy-options {
    policy-statement EVPN-VRF-VXLAN {
        then accept;
    }
}
We also need a small compatibility patch for Cumulus Quagga[10]. The routes sent by this configuration are very similar to the routes sent by Quagga. The main differences are:
  • on JunOS, the route distinguisher is configured statically (with the route-distinguisher statement), and
  • on JunOS, the VNI is also encoded as an Ethernet tag ID.
Here is a type 3 route, as sent by JunOS:
> show route table bgp.evpn.0 receive-protocol bgp 203.0.113.3 extensive
bgp.evpn.0: 13 destinations, 13 routes (13 active, 0 holddown, 0 hidden)
* 3:203.0.113.3:1::100::203.0.113.3/304 IM (1 entry, 1 announced)
     Accepted
     Route Distinguisher: 203.0.113.3:1
     Nexthop: 203.0.113.3
     Localpref: 100
     AS path: I
     Communities: target:65000:268435556 encapsulation:vxlan(0x8)
     PMSI: Flags 0x0: Label 6: Type INGRESS-REPLICATION 203.0.113.3
[...]
Here is a type 2 route:
> show route table bgp.evpn.0 receive-protocol bgp 203.0.113.3 extensive
bgp.evpn.0: 13 destinations, 13 routes (13 active, 0 holddown, 0 hidden)
* 2:203.0.113.3:1::200::50:54:33:00:00:0f/304 MAC/IP (1 entry, 1 announced)
     Accepted
     Route Distinguisher: 203.0.113.3:1
     Route Label: 200
     ESI: 00:00:00:00:00:00:00:00:00:00
     Nexthop: 203.0.113.3
     Localpref: 100
     AS path: I
     Communities: target:65000:268435656 encapsulation:vxlan(0x8)
[...]
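The two routes above can be cross-checked with a few lines of Python. This is only a sketch based on the sample output: it assumes the VNI sits in the 24 least significant bits of the last route target field and that the Ethernet tag ID is the second "::"-separated field of the prefix as printed by JunOS. The function names are made up for this illustration.

```python
def vni_from_route_target(community: str) -> int:
    """Return the low 24 bits of the last field of 'target:ASN:VALUE'."""
    value = int(community.rsplit(":", 1)[1])
    return value & 0xFFFFFF


def ethernet_tag_from_prefix(prefix: str) -> int:
    """Extract the Ethernet tag ID from a prefix as printed above.

    Assumes the 'TYPE:RD::TAG::REST' layout of the two sample routes.
    This simple split is not a general parser: it would be confused by
    a '::' inside an IPv6 address, for instance.
    """
    return int(prefix.split("::")[1])


# Prefixes and route targets copied from the two sample routes.
samples = [
    ("3:203.0.113.3:1::100::203.0.113.3/304", "target:65000:268435556"),
    ("2:203.0.113.3:1::200::50:54:33:00:00:0f/304", "target:65000:268435656"),
]

for prefix, community in samples:
    tag = ethernet_tag_from_prefix(prefix)
    vni = vni_from_route_target(community)
    print(f"{prefix}: Ethernet tag {tag}, VNI from route target {vni}")
```

For both samples, the Ethernet tag ID and the VNI decoded from the route target agree (100 and 200).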
We can check that the vMX is able to make sense of the routes it receives from its peers running Quagga:
> show evpn database l2-domain-id 100
Instance: switch
VLAN  DomainId  MAC address        Active source                  Timestamp        IP address
     100        50:54:33:00:00:0c  203.0.113.1                    Apr 30 12:46:20
     100        50:54:33:00:00:0d  203.0.113.2                    Apr 30 12:32:42
     100        50:54:33:00:00:0e  203.0.113.2                    Apr 30 12:46:20
     100        50:54:33:00:00:0f  ge-0/0/1.0                     Apr 30 12:45:55
On the other end, if we look at one of the Quagga-based VTEPs, we can check the received routes are correctly understood:
# show evpn vni 100
VNI: 100
 VxLAN interface: vxlan100 ifIndex: 9 VTEP IP: 203.0.113.1
 Remote VTEPs for this VNI:
  203.0.113.3
  203.0.113.2
 Number of MACs (local and remote) known for this VNI: 4
 Number of ARPs (IPv4 and IPv6, local and remote) known for this VNI: 0
# show evpn mac vni 100
Number of MACs (local and remote) known for this VNI: 4
MAC               Type   Intf/Remote VTEP      VLAN
50:54:33:00:00:0c local  eth1.100
50:54:33:00:00:0d remote 203.0.113.2
50:54:33:00:00:0e remote 203.0.113.2
50:54:33:00:00:0f remote 203.0.113.3
Get in touch if you have some success with other vendors!

  1. For example, they may use bridges to connect containers together.
  2. Such a feature can replace proprietary implementations of MC-LAG allowing several VTEPs to act as an endpoint for a single link aggregation group. This is not needed in our scenario where hypervisors act as VTEPs.
  3. The development of Quagga is slow and closed. New features are often stalled. FRR is placed under the umbrella of the Linux Foundation, has a GitHub-centered development model and an election process. It already has several interesting enhancements (notably, BGP add-path, BGP unnumbered, MPLS and LDP).
  4. I am unenthusiastic about projects whose sole purpose is to rewrite something in Go. However, while being quite young, GoBGP is quite valuable on its own (good architecture, good performance).
  5. The 48-port version is around $10,000 with the BGP license.
  6. An empty chassis with a dual routing engine (RE-S-1800X4-16G) is around $30,000.
  7. I don't know how pricey the vRR is. For evaluation purposes, it can be downloaded for free if you are a customer.
  8. The value 100 used in the route distinguisher (203.0.113.2:100) is not the one used to encode the VNI. The VNI is encoded in the route target (65000:268435556), in the 24 least significant bits (268435556 & 0xffffff equals 100). As long as VNIs are unique, we don't have to understand those details.
  9. For some reason, the use of a virtual switch is mandatory. This is specific to this platform: a QFX doesn't require this.
  10. The encoding of the VNI into the route target is being standardized in draft-ietf-bess-evpn-overlay. Juniper already implements this draft.

Vincent Bernat: VXLAN & Linux

VXLAN is an overlay network to carry Ethernet traffic over an existing (highly available and scalable) IP network while accommodating a very large number of tenants. It is defined in RFC 7348. Starting from Linux 3.12, the VXLAN implementation is quite complete as both multicast and unicast are supported as well as IPv6 and IPv4. Let's explore the various methods to configure it.

VXLAN setup

To illustrate our examples, we use the following setup: a VXLAN tunnel extends the individual Ethernet segments across the three bridges, providing a unique (virtual) Ethernet segment. From one host (e.g. H1), we can directly reach all the other hosts in the virtual segment:
$ ping -c10 -w1 -t1 ff02::1%eth0
PING ff02::1%eth0(ff02::1%eth0) 56 data bytes
64 bytes from fe80::5254:33ff:fe00:8%eth0: icmp_seq=1 ttl=64 time=0.016 ms
64 bytes from fe80::5254:33ff:fe00:b%eth0: icmp_seq=1 ttl=64 time=4.98 ms (DUP!)
64 bytes from fe80::5254:33ff:fe00:9%eth0: icmp_seq=1 ttl=64 time=4.99 ms (DUP!)
64 bytes from fe80::5254:33ff:fe00:a%eth0: icmp_seq=1 ttl=64 time=4.99 ms (DUP!)
--- ff02::1%eth0 ping statistics ---
1 packets transmitted, 1 received, +3 duplicates, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.016/3.745/4.991/2.152 ms

Basic usage

The reference deployment for VXLAN is to use an IP multicast group to join the other VTEPs:
# ip -6 link add vxlan100 type vxlan \
>   id 100 \
>   dstport 4789 \
>   local 2001:db8:1::1 \
>   group ff05::100 \
>   dev eth0 \
>   ttl 5
# brctl addbr br100
# brctl addif br100 vxlan100
# brctl addif br100 vnet22
# brctl addif br100 vnet25
# brctl stp br100 off
# ip link set up dev br100
# ip link set up dev vxlan100
The above commands create a new interface acting as a VXLAN tunnel endpoint, named vxlan100, and put it in a bridge with some regular interfaces[1]. Each VXLAN segment is associated with a 24-bit segment ID, the VXLAN Network Identifier (VNI). In our example, the default VNI is specified with id 100.

When VXLAN was first implemented in Linux 3.7, the UDP port to use was not defined. Several vendors were using 8472 and Linux took the same value. To avoid breaking existing deployments, this is still the default value. Therefore, if you want to use the IANA-assigned port, you need to explicitly set it with dstport 4789.

As we want to use multicast, we have to specify a multicast group to join (group ff05::100), as well as a physical device (dev eth0). With multicast, the default TTL is 1. If your multicast network leverages some routing, you'll have to increase the value a bit, like here with ttl 5.

The vxlan100 device acts as a bridge device with remote VTEPs as virtual ports:
  • it sends broadcast, unknown unicast and multicast (BUM) frames to all VTEPs using the multicast group, and
  • it discovers the association from Ethernet MAC addresses to VTEP IP addresses using source-address learning.
The following figure summarizes the configuration, with the FDB of the Linux bridge (learning local MAC addresses) and the FDB of the VXLAN device (learning distant MAC addresses):

[Figure: Bridged VXLAN device]

The FDB of the VXLAN device can be observed with the bridge command. If the destination MAC is present, the frame is sent to the associated VTEP (unicast). The all-zero address is only used when a lookup for the destination MAC fails.
# bridge fdb show dev vxlan100 | grep dst
00:00:00:00:00:00 dst ff05::100 via eth0 self permanent
50:54:33:00:00:0b dst 2001:db8:3::1 self
50:54:33:00:00:08 dst 2001:db8:1::1 self
If you are interested in more details on how to set up a multicast network and build VXLAN segments on top of it, see my Network virtualization with VXLAN article.
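The lookup rule described above (use the entry for the destination MAC if there is one, otherwise fall back to the all-zero entries) can be modelled in a few lines of Python. This is a simplified sketch of the behaviour, not the kernel implementation; the dictionary layout and the function name are assumptions made for the illustration, and the entries are those of the sample output.

```python
ALL_ZERO = "00:00:00:00:00:00"

# FDB of the VXLAN device: MAC address -> list of destinations.
fdb = {
    ALL_ZERO: ["ff05::100"],
    "50:54:33:00:00:0b": ["2001:db8:3::1"],
    "50:54:33:00:00:08": ["2001:db8:1::1"],
}


def vxlan_destinations(fdb, dst_mac):
    """Return where a frame for dst_mac is sent, according to the FDB."""
    if dst_mac in fdb:
        # Known MAC address: unicast to the associated VTEP.
        return fdb[dst_mac]
    # Lookup failed: use the all-zero entries (here, the multicast group).
    return fdb.get(ALL_ZERO, [])


print(vxlan_destinations(fdb, "50:54:33:00:00:0b"))  # known MAC
print(vxlan_destinations(fdb, "ff:ff:ff:ff:ff:ff"))  # broadcast
```

A broadcast frame has no entry of its own, so it ends up on the all-zero entry like any unknown unicast frame.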

Without multicast

Using VXLAN over a multicast IP network has several benefits:
  • automatic discovery of other VTEPs sharing the same multicast group,
  • good bandwidth usage (packets are replicated as late as possible),
  • decentralized and controller-less design[2].
However, multicast is not available everywhere and managing it at scale can be difficult. In Linux 3.8, the DOVE extensions were added to the VXLAN implementation, removing the dependency on multicast.

Unicast with static flooding

We can replace multicast by head-end replication of BUM frames to a statically configured list of remote VTEPs[3]:
# ip -6 link add vxlan100 type vxlan \
>   id 100 \
>   dstport 4789 \
>   local 2001:db8:1::1
# bridge fdb append 00:00:00:00:00:00 dev vxlan100 dst 2001:db8:2::1
# bridge fdb append 00:00:00:00:00:00 dev vxlan100 dst 2001:db8:3::1
The VXLAN is defined without a remote multicast group. Instead, all the remote VTEPs are associated with the all-zero address: a BUM frame will be duplicated to all those destinations. The VXLAN device will still learn remote addresses automatically using source-address learning.

It is a very simple solution. With a bit of automation, you can keep the default FDB entries up-to-date easily. However, the host will have to duplicate each BUM frame (head-end replication) as many times as there are remote VTEPs. This is quite reasonable if you have a dozen of them. This may get out of hand if you have thousands of them.

The Cumulus vxfld daemon is an example of this strategy (in the head-end replication mode).
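As an illustration of the "bit of automation" mentioned above, here is a minimal Python sketch that compares the current and desired lists of remote VTEPs and prints the bridge commands needed to reconcile the all-zero entries. The function name and the way the two lists are obtained are assumptions; the sketch only builds command strings and does not execute anything.

```python
ALL_ZERO = "00:00:00:00:00:00"


def sync_flood_list(device, current, desired):
    """Return the bridge commands turning `current` into `desired`.

    Both arguments are iterables of remote VTEP addresses attached to
    the all-zero FDB entry of `device`.
    """
    commands = []
    for vtep in sorted(set(desired) - set(current)):
        commands.append(f"bridge fdb append {ALL_ZERO} dev {device} dst {vtep}")
    for vtep in sorted(set(current) - set(desired)):
        commands.append(f"bridge fdb del {ALL_ZERO} dev {device} dst {vtep}")
    return commands


# Starting from an empty flood list, we get the two commands used above.
for command in sync_flood_list("vxlan100", [], ["2001:db8:2::1", "2001:db8:3::1"]):
    print(command)

# Each BUM frame is copied once per remote VTEP in the flood list.
print("copies per BUM frame:", len({"2001:db8:2::1", "2001:db8:3::1"}))
```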

Unicast with static L2 entries

When the associations of MAC addresses and VTEPs are known, it is possible to pre-populate the FDB and disable learning:
# ip -6 link add vxlan100 type vxlan \
>   id 100 \
>   dstport 4789 \
>   local 2001:db8:1::1 \
>   nolearning
# bridge fdb append 00:00:00:00:00:00 dev vxlan100 dst 2001:db8:2::1
# bridge fdb append 00:00:00:00:00:00 dev vxlan100 dst 2001:db8:3::1
# bridge fdb append 50:54:33:00:00:09 dev vxlan100 dst 2001:db8:2::1
# bridge fdb append 50:54:33:00:00:0a dev vxlan100 dst 2001:db8:2::1
# bridge fdb append 50:54:33:00:00:0b dev vxlan100 dst 2001:db8:3::1
Thanks to the nolearning flag, source-address learning is disabled. Therefore, if a MAC is missing, the frame will always be sent using the all-zero entries. The all-zero entries are still needed for broadcast and multicast traffic (e.g. ARP and IPv6 neighbor discovery).

This kind of setup works well to provide virtual L2 networks to virtual machines (no L3 information available). You need some glue to update the FDB entries. BGP EVPN with Cumulus Quagga is an example of this strategy (see VXLAN: BGP EVPN with Cumulus Quagga for additional information).

Unicast with static L3 entries

In the previous example, we had to keep the all-zero entries for ARP and IPv6 neighbor discovery to work correctly. However, Linux can answer neighbor requests on behalf of the remote nodes[4]. When this feature is enabled, the default entries are not needed anymore (but you could keep them):
# ip -6 link add vxlan100 type vxlan \
>   id 100 \
>   dstport 4789 \
>   local 2001:db8:1::1 \
>   nolearning \
>   proxy
# ip -6 neigh add 2001:db8:ff::11 lladdr 50:54:33:00:00:09 dev vxlan100
# ip -6 neigh add 2001:db8:ff::12 lladdr 50:54:33:00:00:0a dev vxlan100
# ip -6 neigh add 2001:db8:ff::13 lladdr 50:54:33:00:00:0b dev vxlan100
# bridge fdb append 50:54:33:00:00:09 dev vxlan100 dst 2001:db8:2::1
# bridge fdb append 50:54:33:00:00:0a dev vxlan100 dst 2001:db8:2::1
# bridge fdb append 50:54:33:00:00:0b dev vxlan100 dst 2001:db8:3::1
This setup totally eliminates head-end replication. However, protocols relying on multicast won't work either. With some automation, this is a setup that should work well with containers: if there is a registry keeping a list of all IP and MAC addresses in use, a program could listen to it and adjust the FDB and the neighbor tables.

The VXLAN backend of Docker's libnetwork is an example of this strategy (but it also uses the next method).

Unicast with dynamic L3 entries

Linux can also notify a program that an (L2 or L3) entry is missing. The program queries some central registry and dynamically adds the requested entry. However, for L2 entries, notifications are issued only if:
  • the destination MAC address is not known,
  • there is no all-zero entry in the FDB, and
  • the destination MAC address is not a multicast or broadcast one.
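The three conditions above can be written down as a small predicate. This is a sketch of the rule as stated in the list, not a transcription of the kernel code; the FDB is modelled as a plain set of MAC addresses and the function names are hypothetical.

```python
ALL_ZERO = "00:00:00:00:00:00"


def is_group_address(mac):
    """True for multicast and broadcast MAC addresses.

    Those have the least significant bit of the first octet set.
    """
    return int(mac.split(":")[0], 16) & 1 == 1


def l2miss_notified(fdb_macs, dst_mac):
    """Tell whether an L2 miss notification is expected for dst_mac."""
    return (
        dst_mac not in fdb_macs            # destination is not known
        and ALL_ZERO not in fdb_macs       # no all-zero entry to fall back on
        and not is_group_address(dst_mac)  # not multicast or broadcast
    )


print(l2miss_notified(set(), "50:54:33:00:00:0a"))       # True
print(l2miss_notified({ALL_ZERO}, "50:54:33:00:00:0a"))  # False
print(l2miss_notified(set(), "33:33:00:00:00:01"))       # False
```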
Those limitations prevent us from implementing a "unicast with dynamic L2 entries" scenario. First, let's create the VXLAN device with the l2miss and l3miss options[5]:
ip -6 link add vxlan100 type vxlan \
   id 100 \
   dstport 4789 \
   local 2001:db8:1::1 \
   nolearning \
   l2miss \
   l3miss \
   proxy
Notifications are sent to programs listening to an AF_NETLINK socket using the NETLINK_ROUTE protocol. This socket needs to be bound to the RTNLGRP_NEIGH group. The following command does exactly that and decodes the received notifications:
# ip monitor neigh dev vxlan100
miss 2001:db8:ff::12 STALE
miss lladdr 50:54:33:00:00:0a STALE
The first notification is about a missing neighbor entry for the requested IP address. We can add it with the following command:
ip -6 neigh replace 2001:db8:ff::12 \
    lladdr 50:54:33:00:00:0a \
    dev vxlan100 \
    nud reachable
The entry is not permanent so that we don't need to delete it when it expires. If the address becomes stale, we will get another notification to refresh it.

Once the host receives our proxy answer for the neighbor discovery request, it can send a frame with the MAC we gave as destination. The second notification is about the missing FDB entry for this MAC address. We add the appropriate entry with the following command[6]:
bridge fdb replace 50:54:33:00:00:0a \
    dst 2001:db8:2::1 \
    dev vxlan100 dynamic
The entry is not permanent either as it would prevent the MAC from migrating to the local VTEP (a dynamic entry cannot override a permanent entry).

This setup works well with containers and a global registry. However, there is a small latency penalty for the first connections. Moreover, multicast and broadcast won't be available in the overlay network. The VXLAN backend for flannel, a network fabric for Kubernetes, is an example of this strategy.
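To make the glue program more concrete, here is a minimal Python sketch that turns the two kinds of notification lines shown by ip monitor into the corresponding commands. It assumes the line format of the sample output above, and the registry is a hypothetical in-memory dictionary standing in for a real central registry. The commands are returned as argument lists and are not executed.

```python
# Hypothetical registry: overlay IP address -> (MAC address, remote VTEP).
REGISTRY = {
    "2001:db8:ff::12": ("50:54:33:00:00:0a", "2001:db8:2::1"),
}


def handle_miss(line, registry, device="vxlan100"):
    """Return the command answering a 'miss' notification, or None."""
    fields = line.split()
    if len(fields) < 2 or fields[0] != "miss":
        return None
    if fields[1] == "lladdr":
        # L2 miss: the FDB has no entry for this MAC address.
        if len(fields) < 3:
            return None
        mac = fields[2]
        for known_mac, vtep in registry.values():
            if known_mac == mac:
                return ["bridge", "fdb", "replace", mac,
                        "dst", vtep, "dev", device, "dynamic"]
        return None
    # L3 miss: the neighbor table has no entry for this IP address.
    ip = fields[1]
    if ip not in registry:
        return None
    mac, _ = registry[ip]
    return ["ip", "-6", "neigh", "replace", ip,
            "lladdr", mac, "dev", device, "nud", "reachable"]


for notification in ("miss 2001:db8:ff::12 STALE",
                     "miss lladdr 50:54:33:00:00:0a STALE"):
    print(" ".join(handle_miss(notification, REGISTRY)))
```

A real program would read the notifications from the netlink socket (or from the output of ip monitor) and run the resulting commands, for example with subprocess.run.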

Decision

There is no one-size-fits-all solution. You should consider the multicast solution if:
  • you are in an environment where multicast is available,
  • you are ready to operate (and scale) a multicast network,
  • you need multicast and broadcast inside the virtual segments,
  • you don't have L2/L3 addresses available beforehand.
The scalability of such a solution is pretty good as long as you avoid putting all VXLAN interfaces into the same multicast group (e.g. use the last byte of the VNI as the last byte of the multicast group).

When multicast is not available, another generic solution is BGP EVPN: BGP is used as a controller to ensure distribution of the list of VTEPs and their respective FDBs. As mentioned earlier, an implementation of this solution is Cumulus Quagga. I explore this option in a separate post: VXLAN: BGP EVPN with Cumulus Quagga.

If you operate in a container-like environment where L2/L3 addresses are known beforehand, a solution using static and/or dynamic L2 and L3 entries based on a central registry and no source-address learning would also fit the bill. This provides a more security-tight solution (bounded resources, MiTM attacks dampened down, inability to amplify bandwidth usage through excessive broadcast). Various environment-specific solutions are available[7] or you can build your own.
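For the multicast option, the suggestion of reusing the last byte of the VNI as the last byte of the multicast group could look like the following Python sketch. The ff05:: base prefix and the function name are assumptions made for the example; any group range routed in your network would do.

```python
import ipaddress


def group_for_vni(vni, base="ff05::"):
    """Derive a multicast group by copying the last byte of the VNI."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("a VNI is a 24-bit value")
    base_int = int(ipaddress.IPv6Address(base))
    # Clear the last byte of the base address, then insert the VNI's.
    return ipaddress.IPv6Address((base_int & ~0xFF) | (vni & 0xFF))


for vni in (100, 200, 356):
    print(vni, group_for_vni(vni))
```

VNIs sharing the same last byte (100 and 356 here) end up in the same group, so this spreads the VXLAN interfaces over at most 256 groups.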

Other considerations

Independently of the chosen strategy, here are a few important points to keep in mind when implementing a VXLAN overlay.

Isolation

While you may expect VXLAN interfaces to only carry L2 traffic, Linux doesn't disable IP processing. If the destination MAC is a local one, Linux will route or deliver the encapsulated IP packet. Check my post about the proper isolation of a Linux bridge.

Encryption

VXLAN enforces isolation between tenants, but the traffic is totally unencrypted. The most direct solution to provide encryption is to use IPsec. Some container-based solutions may come with IPsec support out of the box (notably Docker's libnetwork, but flannel has plans for it too). This is quite important for a deployment over a public cloud.

Overhead

The format of a VXLAN-encapsulated frame is the following:

[Figure: VXLAN encapsulation]

VXLAN adds a fixed overhead of 50 bytes. If you also use IPsec, the overhead depends on many factors. In transport mode, with AES and SHA256, the overhead is 56 bytes. With NAT traversal, this is 64 bytes (additional UDP header). In tunnel mode, this is 72 bytes. See Cisco IPsec Overhead Calculator Tool.

Some users will expect to be able to use an Ethernet MTU of 1500 for the overlay network. Therefore, the underlay MTU should be increased. If it is not possible, ensure the inner MTU (inside the containers or the virtual machines) is correctly decreased[8].
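The arithmetic behind these recommendations is simple enough to sketch in Python. The overhead values are the ones quoted above and are simply added together; since the real IPsec overhead depends on many factors, treat the results as estimates. The names used here are made up for the example.

```python
VXLAN_OVERHEAD = 50  # bytes, as stated above

# IPsec overhead with AES and SHA256, in bytes, as stated above.
IPSEC_OVERHEAD = {
    "none": 0,
    "transport": 56,
    "transport-nat-t": 64,
    "tunnel": 72,
}


def inner_mtu(underlay_mtu, ipsec="none"):
    """Largest MTU usable inside the overlay for a given underlay MTU."""
    return underlay_mtu - VXLAN_OVERHEAD - IPSEC_OVERHEAD[ipsec]


def required_underlay_mtu(overlay_mtu, ipsec="none"):
    """Underlay MTU needed to carry a given overlay MTU."""
    return overlay_mtu + VXLAN_OVERHEAD + IPSEC_OVERHEAD[ipsec]


print(inner_mtu(1500))                        # 1450
print(inner_mtu(1500, "transport"))           # 1394
print(required_underlay_mtu(1500))            # 1550
print(required_underlay_mtu(1500, "tunnel"))  # 1622
```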

IPv6

While all the examples above are using IPv6, the ecosystem is not quite ready yet. The multicast L2-only strategy works fine with IPv6 but every other scenario currently needs some patches. On top of that, IPv6 may not have been implemented in VXLAN-related tools.

Multicast

The Linux VXLAN implementation doesn't support IGMP snooping. Multicast traffic will be broadcast to all VTEPs unless multicast MAC addresses are inserted into the FDB.

  1. This is one possible implementation. The bridge is only needed if you require some form of source-address learning for local interfaces. Another strategy is to use MACVLAN interfaces.
  2. The underlay multicast network may still need some central components, like rendezvous points for the PIM-SM protocol. Fortunately, it's possible to make them highly available and scalable (e.g. with Anycast-RP, RFC 4610).
  3. For this example and the following ones, a patch is needed for the ip command (to be included in 4.11) to use IPv6 for transport. In the meantime, here is a quick workaround:
    # ip -6 link add vxlan100 type vxlan \
    >   id 100 \
    >   dstport 4789 \
    >   local 2001:db8:1::1 \
    >   remote 2001:db8:2::1
    # bridge fdb append 00:00:00:00:00:00 \
    >   dev vxlan100 dst 2001:db8:3::1
    
  4. You may have to apply an IPv6-related patch to the kernel (to be included in 4.12).
  5. You have to apply an IPv6-related patch to the kernel (to be included in 4.12) to get appropriate notifications for missing IPv6 addresses.
  6. Directly adding the entry after the first notification would have been smarter to avoid unnecessary retransmissions.
  7. flannel and Docker's libnetwork were already mentioned as they both feature a VXLAN backend. There are also some interesting experiments like BaGPipe BGP for Kubernetes which leverages BGP EVPN and is therefore interoperable with other vendors.
  8. There is no such thing as MTU discovery on an Ethernet segment.

23 April 2017

Shirish Agarwal: Debconf 17 in Montreal

[Image: Debconf Montreal 17 logo]

Before I start, Debconf registration opened about a week back, depending upon which time zone you are in. So, if you have used and contributed to Debian, or are curious to know and try out Debian, this would be the right time to register for the conference. Those who are either financially weak (like yours truly) or who are a minority in any way can avail of the sponsorships; more information about the sponsorship/travel bursary can be found here. Also make sure that you read, understand and accept the Debconf Code of Conduct before trying to register.

Even if Debconf does give the bursary, getting a visa for Canada might be tricky, but then what would life be without any challenges? In the hierarchy of wants and needs, I would say attending Debconf comes in the want category and not the need category, as needs are simply food, clothes and a roof over one's head, thankfully all of which have been there for as long as I remember or know. To dream is necessary for life, and being part of Debconf would be part of that dream. While I have put up a couple of talks, I have asked if somebody would be willing to tackle/share some basic things in Debconf 17 as well. Let's see if I'm able to get any response to that.

There was a recent Debian BSP in Montreal. While I don't know how many bugs were fixed, UDD shows that only 32 bugs remain to be fixed before Stretch can be released.

[Image: Indian cricket team]

To put things into perspective, I had gone to a friend's place at Market Yard and came to know of a local cricket league which is being played in the city along the lines of the IPL. The prize is INR 88,000 (about 1,361 USD, around the cost of a return ticket from Pune to Montreal as of date), and 10 teams of 11-15 people are vying for it. The winning team will get that amount along with a trophy and signed certificates. Of course, at the highest level, i.e. the IPL, the players would probably earn a million or more each night for each play. One IPL season's pay is what 90% of the people don't/can't earn in their whole lifetime.

[Image: Montreal Metro, from Wikimedia Commons]

Apart from the conference, the northern lights and the snacks-on-stand thing that I have shared about, there is one more thing to explore, and that is the Montreal Metro.

While I was re-reading the wiki entry today, I was very much reminded of Pune Metro and the Metro Samvaad that I attended a couple of weeks ago. I could attend only because it was held not too far from my house, around 500 metres away. The only hitch I see is the question mark over the fate of 6,133 trees. I have shared one probable solution, but then it has its own challenges. In the end, the Metro is the need of the hour, as vehicles in Pune are reaching the 5 million mark while residents are around 3 million plus something, depending on whether you choose to believe the 2011 census or the more recent world population review. In either case/scenario, I do not see any improvement unless consistent public transport is available. While buses don't run on time, it is hoped that the metro would run on time.

There are some similarities between the two systems, apart from the fact that, from what I read, Montreal Metro is mostly subway while Pune Metro would be mostly elevated. There is the idea/thought of having POI (Points of Interest) and free shows etc. in order to attract young and old to travel to destinations to and fro. There are a lot of theatre groups, stand-up comedians, rock and roll as well as classical music concerts for which people like me would be willing to travel on weekends. Of course, this is all in the head; we will know the reality starting this May, with commercial journeys at the end of 2018 if everything goes as planned.

13 April 2017

Antoine Beaupré: New approaches to network fast paths

With the speed of network hardware now reaching 100 Gbps and distributed denial-of-service (DDoS) attacks going in the Tbps range, Linux kernel developers are scrambling to optimize key network paths in the kernel to keep up. Many efforts are actually geared toward getting traffic out of the costly Linux TCP stack. We have already covered the XDP (eXpress Data Path) patch set, but two new ideas surfaced during the Netconf and Netdev conferences held in Toronto and Montreal in early April 2017. One is a patch set called af_packet, which aims at extracting raw packets from the kernel as fast as possible; the other is the idea of implementing in-kernel layer-7 proxying. There are also user-space network stacks like Netmap, DPDK, or Snabb (which we previously covered). This article aims at clarifying what all those components do and to provide a short status update for the tools we have already covered. We will focus on in-kernel solutions for now. Indeed, user-space tools have a fundamental limitation: if they need to re-inject packets onto the network, they must again pay the expensive cost of crossing the kernel barrier. User-space performance is effectively bounded by that fundamental design. So we'll focus on kernel solutions here. We will start from the lowest part of the stack, the af_packet patch set, and work our way up the stack all the way up to layer-7 and in-kernel proxying.

af_packet v4

John Fastabend presented a new version of a patch set that was first published in January regarding the af_packet protocol family, which is currently used by tcpdump to extract packets from network interfaces. The goal of this change is to allow zero-copy transfers between user-space applications and the NIC (network interface card) transmit and receive ring buffers. Such optimizations are useful for telecommunications companies, which may use it for deep packet inspection or running exotic protocols in user space. Another use case is running a high-performance intrusion detection system that needs to watch large traffic streams in real time to catch certain types of attacks.

Fastabend presented his work during the Netdev network-performance workshop, but also brought the patch set up for discussion during Netconf. There, he said he could achieve line-rate extraction (and injection) of packets, with packet rates as high as 30Mpps. This performance gain is possible because user-space pages are directly DMA-mapped to the NIC, which is also a security concern. The other downside of this approach is that a complete pair of ring buffers needs to be dedicated for this purpose; whereas before packets were copied to user space, now they are memory-mapped, so the user-space side needs to process those packets quickly otherwise they are simply dropped. Furthermore, it's an "all or nothing" approach; while NIC-level classifiers could be used to steer part of the traffic to a specific queue, once traffic hits that queue, it is only accessible through the af_packet interface and not the rest of the regular stack. If done correctly, however, this could actually improve the way user-space stacks access those packets, providing projects like DPDK a safer way to share pages with the NIC, because it is well defined and kernel-controlled. According to Jesper Dangaard Brouer (during review of this article):
This proposal will be a safer way to share raw packet data between user space and kernel space than what DPDK is doing, [by providing] a cleaner separation as we keep driver code in the kernel where it belongs.
During the Netdev network-performance workshop, Fastabend asked if there was a better data structure to use for such a purpose. The goal here is to provide a consistent interface to user space regardless of the driver or hardware used to extract packets from the wire. af_packet currently defines its own packet format that abstracts away the NIC-specific details, but there are other possible formats. For example, someone in the audience proposed the virtio packet format. Alexei Starovoitov rejected this idea because af_packet is a kernel-specific facility while virtio has its own separate specification with its own requirements. The next step for af_packet is the posting of the new "v4" patch set, although Miller warned that this wouldn't get merged until proper XDP support lands in the Intel drivers. The concern, of course, is that the kernel would have multiple incomplete bypass solutions available at once. Hopefully, Fastabend will present the (by then) merged patch set at the next Netdev conference in November.

XDP updates

Higher up in the networking stack sits XDP. The af_packet feature differs from XDP in that it does not perform any sort of analysis or mangling of packets; its objective is purely to get the data into and out of the kernel as fast as possible, completely bypassing the regular kernel networking stack. XDP also sits before the networking stack except that, according to Brouer, it is "focused on cooperating with the existing network stack infrastructure, and on use-cases where the packet doesn't necessarily need to leave kernel space (like routing and bridging, or skipping complex code-paths)."

XDP has evolved quite a bit since we last covered it in LWN. It seems that most of the controversy surrounding the introduction of XDP in the Linux kernel has died down in public discussions, under the leadership of David Miller, who heralded XDP as the right solution for a long-term architecture in the kernel. He presented XDP as a fast, flexible, and safe solution.

Indeed, one of the controversies surrounding XDP was the question of the inherent security challenges with introducing user-provided programs directly into the Linux kernel to mangle packets at such a low level. Miller argued that whatever protections are expected for user-space programs also apply to XDP programs, comparing the virtual memory protections to the eBPF (extended BPF) verifier applied to XDP programs. Those programs are actually eBPF programs that have an interesting set of restrictions:
  • they have a limited size
  • they cannot jump backward (and thus cannot loop), so they execute in predictable time
  • they do only static allocation, so they are also limited in memory
XDP is not a one-size-fits-all solution: netfilter, the TC traffic shaper, and other normal Linux utilities still have their place. There is, however, a clear use case for a solution like XDP in the kernel. For example, Facebook and Cloudflare have both started testing XDP and, in Facebook's case, deploying XDP in production. Martin Kafai Lau, from Facebook, presented the tool set the company is using to construct a DDoS-resilience solution and a level-4 load balancer (L4LB), which got a ten-times performance improvement over the previous IPVS-based solution. Facebook rolled out its own user-space solution called "Droplet" to detect hostile traffic and deploy blocking rules in the form of eBPF programs loaded in XDP. Lau demonstrated the way Facebook deploys a three-part chained eBPF program: the first part allows debugging and dumping of packets, the second is Droplet itself, which drops undesirable traffic, and the last segment is the load balancer, which mangles the packets to tweak their destination according to internal rules. Droplet can drop DDoS attacks at line rate while keeping the architecture flexible, which were two key design requirements. Gilberto Bertin, from Cloudflare, presented a similar approach: Cloudflare has a tool that processes sFlow data generated from iptables in order to generate cBPF (classic BPF) mitigation rules that are then deployed on edge routers. Those rules are created with a tool called bpfgen, part of Cloudflare's BSD-licensed bpftools suite. For example, it could create a cBPF bytecode blob that would match DNS queries to any example.com domain with something like:
    bpfgen dns *.example.com
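To give an idea of what such a generated filter has to match: in a DNS packet, a name is encoded as a sequence of length-prefixed labels ending with a zero byte, so a `*.example.com` rule roughly amounts to a suffix match on that encoding. The Python sketch below only illustrates that idea. It is not how bpfgen is implemented, the function names are made up for this example, and reading the wildcard as "any strict subdomain" is an assumption.

```python
def encode_qname(name):
    """Encode a domain name in DNS wire format (length-prefixed labels)."""
    out = bytearray()
    for label in name.strip(".").split("."):
        raw = label.encode("ascii")
        if not 0 < len(raw) <= 63:
            raise ValueError(f"invalid label: {label!r}")
        out.append(len(raw))
        out += raw
    out.append(0)  # the root label terminates the name
    return bytes(out)


def matches_wildcard(qname_wire, domain):
    """True if the wire-format query name is a strict subdomain of `domain`."""
    suffix = encode_qname(domain).lower()
    # DNS names compare case-insensitively; length bytes (<= 63) are unaffected.
    lowered = qname_wire.lower()
    if not lowered.endswith(suffix) or len(lowered) == len(suffix):
        return False
    # Walk the labels to make sure the suffix starts on a label boundary.
    pos, boundary = 0, len(lowered) - len(suffix)
    while pos < boundary:
        pos += lowered[pos] + 1
    return pos == boundary


if __name__ == "__main__":
    print(encode_qname("example.com"))
    print(matches_wildcard(encode_qname("www.example.com"), "example.com"))
```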
Originally, Cloudflare would deploy those rules to plain iptables firewalls with the xt_bpf module, but this led to performance issues. It then deployed a proprietary user-space solution based on Solarflare hardware, but this has the performance limitations of user-space applications: getting packets back onto the wire involves the cost of re-injecting packets back into the kernel. This is why Cloudflare is experimenting with XDP, which was partly developed in response to the company's problems, to deploy those BPF programs.

A concern that Bertin identified was the lack of visibility into dropped packets. Cloudflare currently samples some of the dropped traffic to analyze attacks; this is not currently possible with XDP unless you pass the packets down the stack, which is expensive. Miller agreed that the lack of monitoring for XDP programs is a large issue that needs to be resolved, and suggested creating a way to mark packets for extraction to allow analysis. Cloudflare is currently in a testing phase with XDP and it is unclear if its whole XDP tool chain will be publicly available.

While those two companies are starting to use XDP as-is, there is more work needed to complete the XDP project. As mentioned above and in our previous coverage, massive statistics extraction is still limited in the Linux kernel and introspection is difficult. Furthermore, while the existing actions (XDP_DROP and XDP_TX, see the documentation for more information) are well implemented and used, another action may be introduced, called XDP_REDIRECT, which would allow redirecting packets to different network interfaces. Such an action could also be used to accelerate bridges as packets could be "switched" based on the MAC address table. XDP also requires network driver support, which is currently limited. For example, the Intel drivers still do not support XDP, although that should come pretty soon.
Miller, in his Netdev keynote, focused on XDP and presented it as the standard solution that is safe, fast, and usable. He identified the next steps of XDP development to be the addition of debugging mechanisms, better sampling tools for statistics and analysis, and user-space consistency. Miller foresees a future for XDP similar to the popularization of the Arduino chips: a simple set of tools that anyone, not just developers, can use. He gave the example of an Arduino tutorial that he followed where he could just look up a part number and get easy-to-use instructions on how to program it. Similar components should be available for XDP. For this purpose, the conference saw the creation of a new mailing list called xdp-newbies where people can learn how to create XDP build environments and how to write XDP programs.

In-kernel layer-7 proxying

The third approach that struck me as innovative is the idea of doing layer-7 (application) proxying directly in the kernel. This comes from the idea that, traditionally, we build firewalls to segregate traffic and apply controls, but as most services move to HTTP, those policies become ineffective. Thomas Graf presented this idea during Netconf using a Star Wars allegory: what if the Death Star were a server with an API? You would have endpoints like /dock or /comms that would allow you to dock a ship or communicate with the Death Star. Those API endpoints should obviously be public, but then there is this /exhaust-port endpoint that should never be publicly available. In order for a firewall to protect such a system, it must be able to inspect traffic at a higher level than the traditional address-port pairs. Graf presented a design where the kernel would create an in-kernel socket that would negotiate TCP connections on behalf of user space and then be able to apply arbitrary eBPF rules in the kernel.

[Figure: Graf's design of in-kernel proxying]

In this scenario, instead of doing the traditional transfer from Netfilter's TPROXY to user space, the kernel directly decapsulates the HTTP traffic and passes it to BPF rules that can make decisions without doing expensive context switches or memory copies in the case of simply wanting to refuse traffic (e.g. issue an HTTP 403 error). This, of course, requires the inclusion of kTLS to process HTTPS connections. HTTP/2 support may also prove problematic, as it multiplexes connections and is harder to decapsulate. This design was described as a "pure pre-accept() hook". Starovoitov also compared the design to the kernel connection multiplexer (KCM). Tom Herbert, KCM's author, agreed that it could be extended to support this, but would require some extensions in user space to provide an interface between regular socket-based applications and the KCM layer.
In any case, if the application does TLS (and lots of them do), kTLS gets tricky because it breaks the end-to-end nature of TLS, in effect becoming a man in the middle between the client and the application. Eric Dumazet argued that HA-Proxy already does things like this: it uses splice() to avoid copying too much data around, but it still does a context switch to hand over processing to user space, something that could be fixed in the general case.

Another similar project that was presented at Netdev is the Tempesta firewall and reverse-proxy. The speaker, Alex Krizhanovsky, explained that the Tempesta developers took one person-month to port the mbed TLS stack to the Linux kernel to allow an in-kernel TLS handshake. Tempesta also implements rate limiting, cookies, and JavaScript challenges to mitigate DDoS attacks. The argument behind the project is that "it's easier to move TLS to the kernel than it is to move the TCP/IP stack to user space". Graf explained that he is familiar with Krizhanovsky's work and he is hoping to collaborate. In effect, the design Graf is working on would serve as a foundation for Krizhanovsky's in-kernel HTTP server (kHTTP). In a private email, Graf explained that:
The main differences in the implementation are currently that we foresee to use BPF for protocol parsing to avoid having to implement every single application protocol natively in the kernel. Tempesta likely sees this less of an issue as they are probably only targeting HTTP/1.1 and HTTP/2 and to some [extent] JavaScript.
Neither project is really ready for production yet. There didn't seem to be any significant pushback from key network developers against the idea, which surprised some people, so it is likely we will see more and more layer-7 intelligence move into the kernel sooner rather than later.
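As an illustration of the kind of layer-7 decision discussed in this section, here is a minimal sketch of a path-based policy for the Death Star API from Graf's allegory. In the designs presented, such logic would run as eBPF inside the kernel; the Python below only models the decision itself, and the policy table, function names and verdict strings are hypothetical.

```python
# Hypothetical allow-list for the Death Star API; anything else is refused.
PUBLIC_PATHS = {"/dock", "/comms"}


def request_path(raw_request):
    """Extract the path from the request line of an HTTP/1.x request."""
    request_line = raw_request.split(b"\r\n", 1)[0].decode("ascii", "replace")
    parts = request_line.split(" ")
    if len(parts) != 3:
        raise ValueError(f"malformed request line: {request_line!r}")
    return parts[1].split("?", 1)[0]  # ignore the query string


def decide(raw_request):
    """Return 'pass', 'reject-403' or 'reject-400' for a raw HTTP request."""
    try:
        path = request_path(raw_request)
    except ValueError:
        return "reject-400"
    # Default deny: /exhaust-port is refused simply because it is not listed.
    return "pass" if path in PUBLIC_PATHS else "reject-403"


if __name__ == "__main__":
    print(decide(b"GET /dock HTTP/1.1\r\nHost: deathstar\r\n\r\n"))
    print(decide(b"PUT /exhaust-port HTTP/1.1\r\nHost: deathstar\r\n\r\n"))
```

A firewall working on address-port pairs cannot tell these two requests apart, since both go to the same server and port.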

Conclusion

All of this work aims at replacing a rag-tag bunch of proprietary solutions that recently came up to bypass the Linux kernel TCP/IP stack and improve performance for firewalls, proxies, and other key edge network elements. The idea is that, unless the kernel improves its performance, or at least provides a way to bypass its more complex code paths, people will work around it. With this set of solutions in place, engineers will now be able to use standard APIs to hook high-performance systems into the Linux kernel.
The author would like to thank the Netdev and Netconf organizers for travel assistance, Thomas Graf for a review of the in-kernel proxying section of this article, and Jesper Dangaard Brouer for review of the af_packet and XDP sections. Note: this article first appeared in the Linux Weekly News.

12 April 2017

Vincent Bernat: Proper isolation of a Linux bridge

TL;DR: when configuring a Linux bridge, use the following commands to enforce isolation:
# bridge vlan del dev br0 vid 1 self
# echo 1 > /sys/class/net/br0/bridge/vlan_filtering

A network bridge (also commonly called a switch) brings several Ethernet segments together. It is a common element in most infrastructures. Linux provides its own implementation. A typical use of a Linux bridge is shown below. The hypervisor is running three virtual hosts. Each virtual host is attached to the br0 bridge (represented by the horizontal segment). The hypervisor has two physical network interfaces.

[Figure: Typical use of Linux bridging with virtual machines]

The main expectation of such a setup is that while the virtual hosts should be able to use resources from the public network, they should not be able to access resources from the infrastructure network (including resources hosted on the hypervisor itself, like an SSH server). In other words, we expect a total isolation between the green domain and the purple one. That's not the case. From any virtual host:
# ip route add 192.168.14.3/32 dev eth0
# ping -c 3 192.168.14.3
PING 192.168.14.3 (192.168.14.3) 56(84) bytes of data.
64 bytes from 192.168.14.3: icmp_seq=1 ttl=59 time=0.644 ms
64 bytes from 192.168.14.3: icmp_seq=2 ttl=59 time=0.829 ms
64 bytes from 192.168.14.3: icmp_seq=3 ttl=59 time=0.894 ms
--- 192.168.14.3 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2033ms
rtt min/avg/max/mdev = 0.644/0.789/0.894/0.105 ms

Why? There are two main factors behind this behavior:
  1. A bridge can accept IP traffic. This is a useful feature if you want Linux to act as a bridge and provide some IP services to bridge users (a DHCP relay or a default gateway). This is usually done by configuring the IP address on the bridge device: ip addr add 192.0.2.2/25 dev br0.
  2. An interface doesn't need an IP address to process incoming IP traffic. Additionally, by default, Linux answers ARP requests regardless of the incoming interface.

Bridge processing

After turning an incoming Ethernet frame into a socket buffer, the network driver transfers the buffer to the netif_receive_skb() function. The following actions are executed:
  1. copy the frame to any registered global or per-device taps (e.g. tcpdump),
  2. evaluate the ingress policy (configured with tc),
  3. hand over the frame to the device-specific receive handler, if any,
  4. hand over the frame to a global or device-specific protocol handler (e.g. IPv4, ARP, IPv6).
For a bridged interface, the kernel has configured a device-specific receive handler, br_handle_frame(). This function won't allow any additional processing in the context of the incoming interface, except for STP and LLDP frames or if brouting is enabled [1]. Therefore, the protocol handlers are never executed in this case. After a few additional checks, Linux will decide if the frame has to be locally delivered:
  • the entry for the target MAC in the FDB is marked for local delivery, or
  • the target MAC is a broadcast or a multicast address.
In this case, the frame is passed to the br_pass_frame_up() function. A VLAN-related check is optionally performed. The socket buffer is attached to the bridge interface (br0) instead of the physical interface (eth0), is evaluated by Netfilter and sent back to netif_receive_skb(). It will go through the four steps a second time.
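The local delivery decision described above can be sketched as follows. This is a simplified Python model, not the kernel code: `local_fdb` is a hypothetical stand-in for the FDB entries marked for local delivery, and the function names are invented for the example.

```python
def parse_mac(text):
    """Parse 'aa:bb:cc:dd:ee:ff' into 6 raw bytes."""
    raw = bytes(int(part, 16) for part in text.split(":"))
    if len(raw) != 6:
        raise ValueError(f"not a MAC address: {text!r}")
    return raw


def is_group_address(mac):
    """Broadcast and multicast MACs have the lowest bit of the first octet set."""
    return bool(mac[0] & 0x01)


def delivered_locally(dst_mac, local_fdb):
    """Simplified model of the two conditions listed above.

    `local_fdb` is a set of raw MAC addresses standing in for the FDB
    entries marked for local delivery; the real FDB holds more state.
    """
    return dst_mac in local_fdb or is_group_address(dst_mac)


if __name__ == "__main__":
    fdb = {parse_mac("50:54:33:00:00:04")}  # MAC of br0 in the examples
    for mac in ("50:54:33:00:00:04", "ff:ff:ff:ff:ff:ff", "50:54:33:00:00:99"):
        print(mac, delivered_locally(parse_mac(mac), fdb))
```

Note that a frame sent to the broadcast address is always delivered locally in this model, whatever the FDB contains.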

IPv4 processing

When a device doesn't have a protocol-independent receive handler, a protocol-specific handler will be used:
# cat /proc/net/ptype
Type Device      Function
0800          ip_rcv
0011          llc_rcv [llc]
0004          llc_rcv [llc]
0806          arp_rcv
86dd          ipv6_rcv
Therefore, if the Ethernet type of the incoming frame is 0x800, the socket buffer is handled by ip_rcv(). Among other things, the three following steps will happen:
  • If the frame destination address is not the MAC address of the incoming interface, not a multicast one and not a broadcast one, the frame is dropped ("not for us").
  • Netfilter gets a chance to evaluate the packet (in a PREROUTING chain).
  • The routing subsystem will decide the destination of the packet in ip_route_input_slow(): is it a local packet, should it be forwarded, should it be dropped, should it be encapsulated? Notably, the reverse-path filtering is done during this evaluation in fib_validate_source().
Reverse-path filtering (also known as uRPF, or unicast reverse-path forwarding, RFC 3704) enables Linux to reject traffic on interfaces which it should never have originated: the source address is looked up in the routing tables and if the outgoing interface is different from the current incoming one, the packet is rejected.
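The strict reverse-path check can be sketched in a few lines. This is a simplified Python model of the idea, not what fib_validate_source() actually does; the routing table, the interface name eth1 and the function names are assumptions made for the example.

```python
import ipaddress


def lookup_interface(routes, address):
    """Longest-prefix match; returns the outgoing interface name or None."""
    addr = ipaddress.ip_address(address)
    best = None
    for prefix, iface in routes:
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, iface)
    return best[1] if best else None


def passes_strict_rpf(routes, src_address, incoming_iface):
    """Strict reverse-path check: the route back to the source must use
    the interface the packet arrived on."""
    return lookup_interface(routes, src_address) == incoming_iface


if __name__ == "__main__":
    # Hypothetical hypervisor routing table: nothing is routed through br0.
    routes = [("0.0.0.0/0", "eth1"), ("192.168.14.0/24", "eth1")]
    print(passes_strict_rpf(routes, "203.0.113.10", "br0"))  # rejected
    print(passes_strict_rpf(routes, "203.0.113.10", "eth1"))  # accepted
```

With no route pointing at the bridge interface, no source address can pass the check on br0.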

ARP processing

When the Ethernet type of the incoming frame is 0x806, the socket buffer is handled by arp_rcv().
  • Like for IPv4, if the frame is not for us, it is dropped.
  • If the incoming device has the NOARP flag, the frame is dropped.
  • Netfilter gets a chance to evaluate the packet (configuration is done with arptables).
  • For an ARP request, the values of arp_ignore and arp_filter may trigger a drop of the packet.

IPv6 processing

When the Ethernet type of the incoming frame is 0x86dd, the socket buffer is handled by ipv6_rcv().
  • Like for IPv4, if the frame is not for us, it is dropped.
  • If IPv6 is disabled on the interface, the packet is dropped.
  • Netfilter gets a chance to evaluate the packet (in a PREROUTING chain).
  • The routing subsystem will decide the destination of the packet. However, unlike IPv4, there is no reverse-path filtering [2].

Workarounds

There are various methods to fix the situation. We can completely ignore the bridged interfaces: as long as they are attached to the bridge, they cannot process any upper layer protocol (IPv4, IPv6, ARP). Therefore, we can focus on filtering incoming traffic from br0. It should be noted that for IPv4, IPv6 and ARP protocols, the MAC address check can be circumvented by using the broadcast MAC address.

Protocol-independent workarounds

The four following fixes will indistinctly drop IPv4, ARP and IPv6 packets.

Using VLAN-aware bridge

Linux 3.9 introduced the ability to use VLAN filtering on bridge ports. This can be used to prevent any local traffic:
# echo 1 > /sys/class/net/br0/bridge/vlan_filtering
# bridge vlan del dev br0 vid 1 self
# bridge vlan show
port    vlan ids
eth0     1 PVID Egress Untagged
eth2     1 PVID Egress Untagged
eth3     1 PVID Egress Untagged
eth4     1 PVID Egress Untagged
br0     None
This is the most efficient method since the frame is dropped directly in br_pass_frame_up().

Using ingress policy

It's also possible to drop the bridged frame early after it has been re-delivered to netif_receive_skb() by br_pass_frame_up(). The ingress policy of an interface is evaluated before any handler. Therefore, the following commands will ensure no local delivery (the source interface of the packet is the bridge interface) happens:
# tc qdisc add dev br0 handle ffff: ingress
# tc filter add dev br0 parent ffff: u32 match u8 0 0 action drop
In my opinion, this is the second most efficient method.

Using ebtables

Just before re-delivering the frame to netif_receive_skb(), Netfilter gets a chance to issue a decision. It's easy to configure it to drop the frame:
# ebtables -A INPUT --logical-in br0 -j DROP
However, to the best of my knowledge, this part of Netfilter is known to be inefficient.

Using namespaces

Isolation can also be obtained by moving all the bridged interfaces into a dedicated network namespace and configuring the bridge inside this namespace:
# ip netns add bridge0
# ip link set netns bridge0 eth0
# ip link set netns bridge0 eth2
# ip link set netns bridge0 eth3
# ip link set netns bridge0 eth4
# ip link del dev br0
# ip netns exec bridge0 brctl addbr br0
# for i in 0 2 3 4; do
>    ip netns exec bridge0 brctl addif br0 eth$i
>    ip netns exec bridge0 ip link set up dev eth$i
> done
# ip netns exec bridge0 ip link set up dev br0
The frame will still wander a bit inside the IP stack, wasting some CPU cycles and increasing the possible attack surface. But ultimately, it will be dropped.

Protocol-dependent workarounds

Unless you require multiple layers of security, if one of the previous workarounds is already applied, there is no need to apply one of the protocol-dependent fixes below. It's still interesting to know them because it is not uncommon to already have them in place.

ARP

The easiest way to disable ARP processing on a bridge is to set the NOARP flag on the device. The ARP packet will be dropped as the very first step of the ARP handler.
# ip link set arp off dev br0
# ip l l dev br0
8: br0: <BROADCAST,MULTICAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 50:54:33:00:00:04 brd ff:ff:ff:ff:ff:ff
arptables can also drop the packet quite early:
# arptables -A INPUT -i br0 -j DROP
Another way is to set arp_ignore to 2 for the given interface. The kernel will only answer ARP requests whose target IP address is configured on the incoming interface. Since the bridge interface doesn't have any IP address, no ARP requests will be answered.
# sysctl -qw net.ipv4.conf.br0.arp_ignore=2
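The effect of this setting can be modeled roughly as follows. The sketch covers only arp_ignore modes 0, 1 and 2 as described in the kernel's ip-sysctl documentation (mode 2 additionally requires the sender to be in the same subnet). It is an illustration rather than the kernel logic, and the address table and eth1 interface are hypothetical.

```python
import ipaddress


def answers_arp(arp_ignore, target_ip, sender_ip, incoming_iface, addresses):
    """Simplified model of arp_ignore modes 0, 1 and 2.

    `addresses` maps an interface name to a list of 'ip/prefixlen' strings.
    """
    if arp_ignore not in (0, 1, 2):
        raise ValueError("only modes 0, 1 and 2 are modeled here")
    target = ipaddress.ip_address(target_ip)
    sender = ipaddress.ip_address(sender_ip)
    if arp_ignore == 0:
        # Reply for any local address, whatever interface it is configured on.
        candidates = [a for addrs in addresses.values() for a in addrs]
    else:
        # Modes 1 and 2 only consider addresses on the incoming interface.
        candidates = addresses.get(incoming_iface, [])
    for text in candidates:
        iface_addr = ipaddress.ip_interface(text)
        if iface_addr.ip != target:
            continue
        if arp_ignore == 2 and sender not in iface_addr.network:
            continue  # mode 2 also wants the sender in the same subnet
        return True
    return False


if __name__ == "__main__":
    # Hypothetical hypervisor: the address lives on eth1, br0 has none.
    addresses = {"eth1": ["192.168.14.3/24"], "br0": []}
    for mode in (0, 1, 2):
        print(mode, answers_arp(mode, "192.168.14.3", "192.168.14.50", "br0", addresses))
```

In this model, the default mode 0 answers a request received on br0 for an address configured elsewhere, while modes 1 and 2 stay silent because br0 has no address.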
Disabling ARP processing is not a sufficient workaround for IPv4. A user can still insert the appropriate entry in their neighbor cache:
# ip neigh replace 192.168.14.3 lladdr 50:54:33:00:00:04 dev eth0
# ping -c 1 192.168.14.3
PING 192.168.14.3 (192.168.14.3) 56(84) bytes of data.
64 bytes from 192.168.14.3: icmp_seq=1 ttl=49 time=1.30 ms
--- 192.168.14.3 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 1.309/1.309/1.309/0.000 ms
As the check on the target MAC address is quite loose, they don't even need to guess the MAC address:
# ip neigh replace 192.168.14.3 lladdr ff:ff:ff:ff:ff:ff dev eth0
# ping -c 1 192.168.14.3
PING 192.168.14.3 (192.168.14.3) 56(84) bytes of data.
64 bytes from 192.168.14.3: icmp_seq=1 ttl=49 time=1.12 ms
--- 192.168.14.3 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 1.129/1.129/1.129/0.000 ms

IPv4

The earliest place to drop an IPv4 packet is with Netfilter [3]:
# iptables -t raw -I PREROUTING -i br0 -j DROP
If Netfilter is disabled, another possibility is to enable strict reverse-path filtering for the interface. In this case, since there is no IP address configured on the interface, the packet will be dropped during the route lookup:
# sysctl -qw net.ipv4.conf.br0.rp_filter=1
Another option is the use of a dedicated routing rule. Compared to the reverse-path filtering option, the packet will be dropped a bit earlier, still during the route lookup.
# ip rule add iif br0 blackhole

IPv6

Linux provides a way to completely disable IPv6 on a given interface. The packet will be dropped as the very first step of the IPv6 handler:
# sysctl -qw net.ipv6.conf.br0.disable_ipv6=1
As with IPv4, it's possible to use Netfilter or a dedicated routing rule.

About the example

In the above example, the virtual hosts get ICMP replies because they are routed through the infrastructure network to the Internet (e.g. the hypervisor has a default gateway which also acts as a NAT router to the Internet). This may not be the case. If you want to check if you are vulnerable despite not getting an ICMP reply, look at the guest neighbor table to check if you got an ARP reply from the host:
# ip route add 192.168.14.3/32 dev eth0
# ip neigh show dev eth0
192.168.14.3 lladdr 50:54:33:00:00:04 REACHABLE
If you didn't get a reply, you could still have issues with IP processing. Add a static neighbor entry before checking the next step:
# ip neigh replace 192.168.14.3 lladdr ff:ff:ff:ff:ff:ff dev eth0
To check if IP processing is enabled, check the bridge host's network statistics:
# netstat -s | grep "ICMP messages"
    15 ICMP messages received
    15 ICMP messages sent
    0 ICMP messages failed
If the counters are increasing, it is processing incoming IP packets.

One-way communication still allows a lot of bad things, like DoS attacks. Additionally, if the hypervisor happens to also act as a router, the reach is extended to the whole infrastructure network, potentially reaching weak devices (e.g. a PDU) exposing an SNMP agent. If one-way communication is all that's needed, the attacker can also spoof their source IP address, bypassing IP-based authentication.
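The counter comparison described above ("if the counters are increasing") can be scripted by taking two snapshots of the `netstat -s` output on the bridge host, one before and one after sending the test pings. A minimal Python sketch, assuming the output format shown above; the function names are made up for this example.

```python
import re


def icmp_received(netstat_output):
    """Extract the 'ICMP messages received' counter from `netstat -s` output."""
    match = re.search(r"^\s*(\d+) ICMP messages received", netstat_output, re.M)
    if match is None:
        raise ValueError("counter not found in output")
    return int(match.group(1))


def is_processing_icmp(before, after):
    """True if the counter grew between two snapshots of `netstat -s`."""
    return icmp_received(after) > icmp_received(before)


if __name__ == "__main__":
    before = "    15 ICMP messages received\n    15 ICMP messages sent\n"
    after = "    18 ICMP messages received\n    18 ICMP messages sent\n"
    print(is_processing_icmp(before, after))
```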

  1. A frame can be forcibly routed (L3) instead of bridged (L2) by brouting the packet. This action can be triggered using ebtables.
  2. For IPv6, reverse-path filtering needs to be implemented with Netfilter, using the rpfilter match.
  3. If the br_netfilter module is loaded, net.bridge.bridge-nf-call-iptables sysctl has to be set to 0. Otherwise, you also need to use the physdev match to not drop IPv4 packets going through the bridge.

4 April 2017

Ritesh Raj Sarraf: Fixing Hardware Bugs

Bugs can be annoying, especially the ones that crash or hang and have no known root cause. A good example is kernel bugs where faulty hardware or a faulty device driver hinders the kernel's suspend/resume process. As a user, in the middle of your work, you suspend your machine hoping to resume your work once you reach your destination. But during suspend or resume, the bug randomly triggers, leaving you with no choice but a hardware reset, and you ultimately lose the entire work state you were in. Such is the situation I encountered with my 2-year-old Lenovo Yoga 2 13. For 2 years, I had been living with this bug, with all the side effects mentioned.
Mar 01 18:43:28 learner kernel: usb 2-4: new high-speed USB device number 38 using xhci_hcd
Mar 01 18:43:54 learner kernel: usb 2-4: new high-speed USB device number 123 using xhci_hcd
Mar 01 18:44:00 learner kernel: usb 2-4: new high-speed USB device number 125 using xhci_hcd
Mar 01 18:44:11 learner kernel: usb 2-4: new high-speed USB device number 25 using xhci_hcd
Mar 01 18:44:16 learner kernel: usb 2-4: new high-speed USB device number 26 using xhci_hcd
Mar 01 18:44:22 learner kernel: usb 2-4: new high-speed USB device number 27 using xhci_hcd
Mar 01 18:44:22 learner kernel: usb 2-4: device descriptor read/64, error -71
Mar 01 18:44:22 learner kernel: usb 2-4: device descriptor read/64, error -71
Mar 01 18:44:22 learner kernel: usb 2-4: new high-speed USB device number 28 using xhci_hcd
Mar 01 18:44:23 learner kernel: usb 2-4: device descriptor read/64, error -71
Mar 01 18:44:23 learner kernel: usb 2-4: device descriptor read/64, error -71
Mar 01 18:44:23 learner kernel: usb 2-4: new high-speed USB device number 29 using xhci_hcd
Mar 01 18:44:23 learner kernel: usb 2-4: Device not responding to setup address.
Mar 01 18:44:23 learner kernel: usb 2-4: Device not responding to setup address.
Mar 01 18:44:23 learner kernel: usb 2-4: device not accepting address 29, error -71
Mar 01 18:44:24 learner kernel: usb 2-4: new high-speed USB device number 30 using xhci_hcd
Mar 01 18:44:24 learner kernel: usb 2-4: Device not responding to setup address.
Mar 01 18:44:24 learner kernel: usb 2-4: Device not responding to setup address.
Mar 01 18:44:24 learner kernel: usb 2-4: device not accepting address 30, error -71
Mar 01 18:44:24 learner kernel: usb usb2-port4: unable to enumerate USB device
Mar 01 18:44:24 learner kernel: usb 2-4: new high-speed USB device number 31 using xhci_hcd
Mar 01 18:44:24 learner kernel: usb 2-4: device descriptor read/64, error -71
Mar 01 18:44:25 learner kernel: usb 2-4: new high-speed USB device number 32 using xhci_hcd
Mar 01 18:44:30 learner kernel: usb 2-4: new high-speed USB device number 33 using xhci_hcd
Mar 01 18:44:30 learner kernel: usb 2-4: device descriptor read/64, error -71
Mar 01 18:44:31 learner kernel: usb 2-4: device descriptor read/64, error -71
Mar 01 18:44:31 learner kernel: usb 2-4: new high-speed USB device number 34 using xhci_hcd
Mar 01 18:44:36 learner kernel: usb 2-4: new high-speed USB device number 35 using xhci_hcd
Mar 01 18:44:36 learner kernel: usb 2-4: device descriptor read/64, error -71
Mar 01 18:44:36 learner kernel: usb 2-4: device descriptor read/64, error -71
Mar 01 18:44:37 learner kernel: usb 2-4: new high-speed USB device number 36 using xhci_hcd
Mar 01 18:44:37 learner kernel: usb 2-4: device descriptor read/64, error -71
Mar 01 18:44:37 learner kernel: usb 2-4: device descriptor read/64, error -71
Mar 01 18:44:37 learner kernel: usb 2-4: new high-speed USB device number 37 using xhci_hcd
Mar 01 18:44:37 learner kernel: usb 2-4: Device not responding to setup address.
Mar 01 18:44:37 learner kernel: usb 2-4: Device not responding to setup address.
Mar 01 18:44:38 learner kernel: usb 2-4: device not accepting address 37, error -71
Mar 01 18:44:38 learner kernel: usb 2-4: new high-speed USB device number 38 using xhci_hcd
Mar 01 18:44:38 learner kernel: usb 2-4: Device not responding to setup address.
Mar 02 13:34:05 learner kernel: usb 2-4: new high-speed USB device number 45 using xhci_hcd
Mar 02 13:34:05 learner kernel: usb 2-4: new high-speed USB device number 46 using xhci_hcd
Mar 02 13:34:05 learner kernel: usb 2-4: New USB device found, idVendor=0bda, idProduct=0129
Mar 02 13:34:05 learner kernel: usb 2-4: New USB device strings: Mfr=1, Product=2, SerialNumber=3
Mar 02 13:34:05 learner kernel: usb 2-4: Product: USB2.0-CRW
Mar 02 13:34:05 learner kernel: usb 2-4: Manufacturer: Generic
Mar 02 13:34:05 learner kernel: usb 2-4: SerialNumber: 20100201396000000
Mar 02 13:34:06 learner kernel: usb 2-4: USB disconnect, device number 46
Mar 02 13:34:16 learner kernel: usb 2-4: new high-speed USB device number 47 using xhci_hcd
Mar 02 13:34:21 learner kernel: usb 2-4: new high-speed USB device number 48 using xhci_hcd
Mar 02 13:34:26 learner kernel: usb 2-4: new high-speed USB device number 49 using xhci_hcd
Mar 02 13:34:32 learner kernel: usb 2-4: new high-speed USB device number 51 using xhci_hcd
Mar 02 13:34:37 learner kernel: usb 2-4: new high-speed USB device number 52 using xhci_hcd
Mar 02 13:34:43 learner kernel: usb 2-4: new high-speed USB device number 54 using xhci_hcd
Mar 02 13:34:43 learner kernel: usb 2-4: new high-speed USB device number 55 using xhci_hcd
Mar 02 13:34:49 learner kernel: usb 2-4: new high-speed USB device number 57 using xhci_hcd
Mar 02 13:34:55 learner kernel: usb 2-4: new high-speed USB device number 58 using xhci_hcd
Mar 02 13:35:00 learner kernel: usb 2-4: new high-speed USB device number 60 using xhci_hcd
Mar 02 13:35:06 learner kernel: usb 2-4: new high-speed USB device number 61 using xhci_hcd
Mar 02 13:35:11 learner kernel: usb 2-4: new high-speed USB device number 63 using xhci_hcd
Mar 02 13:35:17 learner kernel: usb 2-4: new high-speed USB device number 64 using xhci_hcd
Mar 02 13:35:22 learner kernel: usb 2-4: new high-speed USB device number 65 using xhci_hcd
Mar 02 13:35:28 learner kernel: usb 2-4: new high-speed USB device number 66 using xhci_hcd
Mar 02 13:35:33 learner kernel: usb 2-4: new high-speed USB device number 68 using xhci_hcd
Mar 02 13:35:39 learner kernel: usb 2-4: new high-speed USB device number 69 using xhci_hcd
Mar 02 13:35:44 learner kernel: usb 2-4: new high-speed USB device number 70 using xhci_hcd
Mar 02 13:35:50 learner kernel: usb 2-4: new high-speed USB device number 71 using xhci_hcd
Mar 02 13:35:50 learner kernel: usb 2-4: Device not responding to setup address.
Mar 02 13:35:50 learner kernel: usb 2-4: Device not responding to setup address.
Mar 02 13:35:50 learner kernel: usb 2-4: device not accepting address 71, error -71
Mar 02 13:35:50 learner kernel: usb 2-4: new high-speed USB device number 73 using xhci_hcd
Mar 02 13:35:51 learner kernel: usb 2-4: new high-speed USB device number 74 using xhci_hcd
Mar 02 13:35:56 learner kernel: usb 2-4: new high-speed USB device number 75 using xhci_hcd
Mar 02 13:35:57 learner kernel: usb 2-4: new high-speed USB device number 77 using xhci_hcd
Mar 02 13:36:03 learner kernel: usb 2-4: new high-speed USB device number 78 using xhci_hcd
Mar 02 13:36:08 learner kernel: usb 2-4: new high-speed USB device number 79 using xhci_hcd
Mar 02 13:36:14 learner kernel: usb 2-4: new high-speed USB device number 80 using xhci_hcd
Mar 02 13:36:20 learner kernel: usb 2-4: new high-speed USB device number 83 using xhci_hcd
Mar 02 13:36:26 learner kernel: usb 2-4: new high-speed USB device number 86 using xhci_hcd
Thanks to the Linux USB maintainers, we tried investigating the issue, which resulted in uncovering other bugs. Unfortunately, the conclusion was that this was possibly a hardware bug. The only odd bit is that this machine still has a Windows 8.1 copy lying on the spare partition, where the issue was not seen at all. It could very well be that it was not a hardware bug at all, or a hardware bug which had a workaround in the Windows driver. But the results of the exercise weren't of much use to me because I use the machine under the Linux kernel most of the time.

So, this March 2017, 2 years after purchasing the device, I was annoyed enough by the bugs. That led me to try other ways of taming this issue. Lenovo has some variations of this device. I know that it comes with multiple options for the storage and the wifi component. I'm not sure if there are more differences. The majority of the devices are connected over the xHCI bus on this machine. If a single device is faulty, or has faulty hardware, it could screw up the entire user experience for that machine. Such is my case.

Hardware manufacturers could do a better job if they provided a means to disable hardware, for example in the BIOS. HP-shipped machines have such an option in the BIOS, where you can disable devices that do not have an important use case for the user. Good examples of such devices are Fingerprint Readers, SD Card Readers, LOMs and maybe Bluetooth too. At least the latter should apply to Linux users, as the majority of us have an unpleasant time getting Bluetooth to work out of the box. But my Lenovo Yoga came with a ridiculous BIOS/UEFI, with very, very limited options for change. Thankfully, it did have an option to set the booting mode for the device, giving the choices of Legacy Boot and UEFI.
Back to the topic: after 2 years of living with the bug, and no clarity on whether it was a hardware bug or a driver bug, I was left with no choice but to open up the machine. Next to the mSATA HDD sits the additional board, which houses the Power, USB, Audio In, and the SD Card reader. Opening that up, I got the small board. I barely use the SD Card reader, and given the annoyances I had to suffer because of it, there was no more mercy in killing that device. So, next was to unsolder the SD Card reader completely. Once done, and fitted back into the machine, everything has been working awesomely great in the last 2 weeks. This entire fix cost me 0. So sometimes, fixing a bug is all that matters. In the Hindi language, a nice phrase for such a scenario reminds me of the great Chanakya, " , , ".

1 April 2017

Antoine Beaupré: My free software activities, February and March 2017

Looking into self-financing

Before I begin, I should mention that I started tracking my time working on free software more systematically. I spend a lot of time on the computer, as regular readers of this blog might remember, so I wanted to know exactly how much time was paid vs free work. I was already using org-mode's time clock system to keep track of my work hours, so I just extended this to my regular free software contributions, which also helps in writing those reports. It turns out that over 60% of my computer time is spent working on free software. That's huge! I was expecting something more along the range of 20 to 40% of my time.

So I started thinking about ways of financing this work. I created a Patreon page but I'm hesitant about launching such a campaign: the only thing worse than "no patreon page" is "a patreon page with failed goals and no one financing it". So before starting such an effort, I'd like to get a feeling of what other people's experience with it is. I know that joeyh is close to achieving his goals, but I can't compare with the guy that invented git-annex or debhelper, so I'm concerned I wouldn't be able to raise the same level of funding. So if you have any advice, feel free to contact me in private or in the comments. If you would be ready to fund my work, I'd love to know about it, obviously, but I guess I wouldn't get real numbers until I actually open up such a page...

Now, onto the regular report.

Wallabako I spent a good chunk of time completing most of the things I had in mind for Wallabako, which I mentioned quickly in the previous report. Wallabako is now much easier to install, with clearer instructions, an easier to use configuration file, more reliable synchronization and read status propagation. As usual the Wallabako README file has all the details. I've also looked at better integration with Koreader, the free software e-reader that forms the basis of the okreader free software distribution which has been able to port Debian to the Kobo e-readers, a project I am really excited about. This project has the potential of supporting Kobo readers beyond the lifetime that upstream grants them and removes a lot of proprietary software and spyware that ships with the Kobo readers. So I have made a few contributions to okreader and also to koreader, the ebook reader okreader is based on.

Stressant I rewrote stressant, my simple burn-in and stress-testing tool. After struggling in turn with Debirf, live-build, vmdebootstrap and even FAI, I just figured maybe it wasn't the best idea to try and reinvent that particular wheel: instead of reinventing how to build yet another Debian system build tool, maybe I should just reuse what's already there. It turns out there's a well known, successful and fairly complete recovery system called Grml. It is a Debian derivative, so all I needed to do was to stop procrastinating and actually write the actual stressant tool instead of just creating a distribution with a bunch of random tools shipped in. This allowed me to focus on which tools were the best to stress test different components. This selection ended up being: fio can also be used to overwrite disk drives with the proper options (--overwrite and --size=100%), although grml also ships with nwipe for wiping old spinning disks and hdparm to do a secure erase of SSD disks (whatever that's worth). Stressant still needs to be shipped with grml for this transition to be complete. In the meantime, I was able to configure the excellent public Gitlab CI service to provide ISO images with Stressant built-in as a stopgap measure. I also need to figure out a way to automate starting stressant from a boot menu to automate deployments on a larger scale, although because I have little need for the feature at this moment in time, this will likely wait for a sponsor to show up. Still, stressant has useful features like the capability of sending logs by email using a fresh new implementation of the Python SMTPHandler (BufferedSMTPHandler) which waits for logging to complete before sending a single email. Another interesting piece of code in there is the NegateAction argparse handler that enables the use of "toggle flags" (e.g. --flag / --no-flag). I'm so happy with the code that I figure I could just share it here directly:
import argparse


class NegateAction(argparse.Action):
    '''add a toggle flag to argparse

    this is similar to 'store_true' or 'store_false', but allows
    arguments prefixed with --no to disable the default. the default
    is set depending on the first argument - if it starts with the
    negative form (defined by default as '--no'), the default is False,
    otherwise True.
    '''
    # prefix that marks the negative form of a flag
    negative = '--no'

    def __init__(self, option_strings, *args, **kwargs):
        '''set default depending on the first argument'''
        default = not option_strings[0].startswith(self.negative)
        # nargs=0: the flag carries the value itself, no argument is consumed
        super(NegateAction, self).__init__(option_strings, *args,
                                           default=default, nargs=0, **kwargs)

    def __call__(self, parser, ns, values, option):
        '''set the truth value depending on whether
        it starts with the negative form'''
        setattr(ns, self.dest, not option.startswith(self.negative))
Short and sweet. I wonder why stuff like this is not in the standard library yet - maybe just because no one bothered yet? It'd be great to get feedback of more experienced Pythonistas on this one. I hope that my work on Stressant is complete. I get zero funding for this work, and have little use for it myself: I manage only a few machines and such a tool really shines when you regularly put new hardware online, which is (fortunately?) not my case anymore. I'd be happy, of course, to accompany organisations and people that wish to further develop and use such a tool. A short demo of stressant as well as detailed description of how it works is of course available in its README file.
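To make the behaviour of the NegateAction handler concrete, here is a minimal usage sketch. The class is repeated in condensed form only so that the snippet runs on its own; the flag names `--email` and `--dry-run` and the `build_parser` helper are hypothetical, chosen for illustration, and are not taken from stressant.

```python
import argparse


class NegateAction(argparse.Action):
    """Condensed copy of the toggle-flag action, repeated so this runs alone."""
    negative = '--no'

    def __init__(self, option_strings, *args, **kwargs):
        # default is True unless the first option string is the negative form
        default = not option_strings[0].startswith(self.negative)
        super().__init__(option_strings, *args,
                         default=default, nargs=0, **kwargs)

    def __call__(self, parser, ns, values, option):
        setattr(ns, self.dest, not option.startswith(self.negative))


def build_parser():
    """Build a parser with two hypothetical toggle flags."""
    parser = argparse.ArgumentParser()
    # positive form listed first: the default is True
    parser.add_argument('--email', '--no-email', action=NegateAction)
    # negative form listed first: the default is False; dest is set
    # explicitly because argparse would otherwise derive 'no_dry_run'
    parser.add_argument('--no-dry-run', '--dry-run', dest='dry_run',
                        action=NegateAction)
    return parser


if __name__ == '__main__':
    print(build_parser().parse_args(['--no-email', '--dry-run']))
```

One thing to keep in mind with this design: the default depends purely on the order of the option strings, and any flag whose name happens to start with `--no` (say, a hypothetical `--notify`) would be treated as a negative form.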

Standard third party repositories After looking at improvements for the grml repository instructions, I realized there was no real "best practices" document on how to configure an Apt repository. Sure, there are tools like reprepro and others, but those hardly qualify as policy: they are very flexible and there are lots of ways to create insecure repositories or curl | sh style instructions, which we of course generally want to avoid. While the larger problem of Untrusted Debian packages remains generally unsolved (e.g. when you install any .deb file, it can get root on your system), it seemed to me one critical part of this problem was how to add a random third-party repository to your machine while limiting, as much as possible, what possible attackers could do with such a repository. In other words, to solve the more general problem of insecure .deb files, we also need to solve the distribution problem, otherwise fixing the .deb files themselves will be useless. This led to the creation of standardized repository instructions that define:
  1. how to distribute the repository's public signing key (ie. over HTTPS)
  2. how to name suites and components (e.g. use stable and main unless you have a good reason, and explain yourself)
  3. recommend a healthy dose of apt preferences pinning
  4. how to distribute keys (e.g. with a deriv-archive-keyring package)
I've seen so many third party repositories get this wrong. For example, a lot of repositories recommend this type of command to initialize the OpenPGP trust path:
curl http://example.com/key.asc | apt-key add -
This has the following problems:
  • the key is transferred in plaintext and can easily be manipulated by an active attacker (e.g. a router on your path to the server or a neighbor in a Wifi cafe)
  • the key is added to the main trust root, which allows the key to authenticate as the real Debian archive, therefore giving it all rights over all packages
  • since it's part of the global archive, it's difficult for a package to remove/add the key when a key rollover is necessary (and repositories generally don't provide a deriv-archive-keyring to do that process anyways)
An example of this are the Docker install instructions that, at least, manage to do this over HTTPS. Some other repositories don't even bother teaching people about the proper way of adding those keys. We settled for:
wget -O /usr/share/keyrings/deriv-archive-keyring.gpg https://deriv.example.net/debian/deriv-archive-keyring.gpg
That location was explicitly chosen to be out of the main trust directory, so that it needs to be explicitly added to the sources.list as well:
deb [signed-by=/usr/share/keyrings/deriv-archive-keyring.gpg] https://deriv.example.net/debian/ stable main
Similarly, we highly recommend users set up "apt pinning" to restrict what a given repository can do. Since pinning is so confusing, most people don't bother configuring it at all, and I have yet to see a single repo advise its users to configure those preferences, which are essential to limit what a repository can do. To keep configuration simple, we recommend this:
Package: *
Pin: origin deriv.example.net
Pin-Priority: 100
Obviously, for a single-package repository, the actual package name should be listed, e.g.:
Package: foo
Pin: origin deriv.example.net
Pin-Priority: 100
And the priority should probably be set to 1 unless you want to allow automatic upgrades. It is my hope that this design will get more traction in the years to come and become a de-facto standard that will be a key part in safely adding third party repositories. There is obviously much more work to be done to improve security when installing untrusted .deb files, and I encourage Debian developers to consider contributing to the UntrustedDebs discussions and particularly to the Teams/Dpkg/Spec/DeclarativePackaging work.
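As a rough illustration of how the pieces above fit together, the following sketch renders the sources.list line and the pinning stanza from a repository URL. It is only a sketch under stated assumptions: the `render_repo_config` helper and its parameters are hypothetical, not part of any Apt tooling, and it produces text without touching the system configuration.

```python
from urllib.parse import urlparse


def render_repo_config(base_url, keyring_path, suite="stable",
                       component="main", package="*", priority=100):
    """Return (sources.list line, apt preferences stanza) for one repository.

    Hypothetical helper for illustration: it only builds strings.
    """
    url = urlparse(base_url)
    # refuse plaintext transports, in line with distributing keys over HTTPS
    if url.scheme != "https":
        raise ValueError("repository URL should use HTTPS: %s" % base_url)
    # the keyring lives outside the main trust directory, so it has to be
    # referenced explicitly with signed-by
    sources_line = "deb [signed-by=%s] %s %s %s" % (
        keyring_path, base_url, suite, component)
    # pin on the repository's host name to limit what it can do
    pin_stanza = "\n".join([
        "Package: %s" % package,
        "Pin: origin %s" % url.hostname,
        "Pin-Priority: %d" % priority,
    ])
    return sources_line, pin_stanza


if __name__ == "__main__":
    line, pin = render_repo_config(
        "https://deriv.example.net/debian/",
        "/usr/share/keyrings/deriv-archive-keyring.gpg")
    print(line)
    print(pin)
```

For a single-package repository, passing `package="foo"` gives the second stanza shown above, and `priority=1` matches the suggestion for repositories that should not upgrade automatically.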

Signal R&D I spent a significant amount of time this month struggling with the Signal project on my phone. I'm still ambivalent on Signal: it's a centralized design, too dependent on phone numbers, but I must admit they get a lot of things right and it's the only free-software platform that allows for easy-to-use, multi-platform videoconferencing that my family can use. I've been following Signal for a while: up until now, I had been using the LibreSignal rebuild of the official client, as it is distributed on an F-Droid repository. Because I try to avoid Google (proprietary) software on my phone, it's basically the only way I could even install Signal. Unfortunately, the repository is out of date and introduces another point of trust in the distribution model: now you not only need to trust the Signal authors to do the right thing, you also need to trust that F-Droid repo not to inject nasty code on your phone. I've therefore started a discussion about how Signal could be distributed outside of the Google Play Store. I'd like to think it's one of the things that led the Signal people to distribute an official copy of Signal outside of the playstore. After much struggling, I was able to upgrade to this official client and will be able to upgrade easily by just downloading the APK. (Do note that I ended up reinstalling and re-registering Signal, which unfortunately changed my secret keys.) I do hope Signal enters F-Droid one day, but it could take a while because it still doesn't work without Google services and barely works with MicroG, the free software alternative to the Google services clients. Moxie also set a list of requirements like crash reporting and statistics that need to be implemented on F-Droid's side before he agrees to the deployment, so this could take a while. I've also participated in the, ahem, discussion on the JWZ blog regarding a supposed vulnerability in Signal where it would leak previously unknown phone numbers to third parties.
I reviewed the way the phone number is uploaded and, while it's possible to create a rainbow table of phone numbers (which are hashed with a truncated SHA-1 checksum), I couldn't verify the claims of other participants in the thread. For me, Signal still does the right thing with contacts, although I do question the way "read status" notifications get transmitted, but that belongs in another bug report / blog post.
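To show why a truncated hash of a phone number offers little secrecy on its own, here is a small sketch that precomputes a plain lookup table (a simpler relative of a rainbow table) over a block of numbers. The 10-byte truncation length, the text encoding of the number and the number range are assumptions made for illustration; this is not a description of Signal's actual wire format.

```python
import hashlib


def truncated_hash(number, length=10):
    """SHA-1 of the number's text form, cut to `length` bytes.

    The truncation length is an assumption for this sketch.
    """
    return hashlib.sha1(number.encode("ascii")).digest()[:length]


def build_table(prefix, digits):
    """Map truncated hash -> number for every number sharing `prefix`."""
    table = {}
    for n in range(10 ** digits):
        # zero-pad the varying part to a fixed number of digits
        number = "%s%0*d" % (prefix, digits, n)
        table[truncated_hash(number)] = number
    return table


if __name__ == "__main__":
    # a fictional block of 10,000 numbers
    table = build_table("+1555010", 4)
    print(table[truncated_hash("+15550101234")])
```

The point is simply that the space of valid phone numbers is small enough to enumerate, so anyone who knows the hashing scheme can map a hash back to a number; that is separate from the claims made in the thread, which I couldn't verify.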

Debian Long Term Support (LTS) It's been more than a year working on Debian LTS, started by Raphael Hertzog at Freexian. I didn't work much in February so I had a lot of hours to catch up with, and was unfortunately unable to do so, partly because I was busy with other projects, and partly because my colleagues are doing a great job at resolving the most important issues. So one of my concerns this month was finding work. It seemed that all the hard packages were either taken (e.g. my usual favorites, tiff and imagemagick, were done by others) or just too challenging (e.g. I don't feel quite comfortable tackling the LTS branch of the Linux kernel yet). I spent quite a bit of time trying to figure out what was wrong with pcre3, only to realise the "32" in the report was not about the architecture, but about the character width. Because of this, I marked 4 CVEs (CVE-2017-7186, CVE-2017-7244, CVE-2017-7245, CVE-2017-7246) as "not-affected", since the 32-bit character support wasn't enabled in wheezy (or jessie, for that matter). I still spent some time trying to reproduce the issues, which require a compiler with an AddressSanitizer, something that was introduced in both Clang and GCC after Wheezy was released, which makes reproducing this fairly complicated... This allowed me to experiment more with Vagrant, however, and I have provided the Debian cloud team with a 32-bit Vagrant box that was merged in shortly after, although it doesn't show up yet in the official list of Debian images. Then I looked at the apparmor situation (CVE-2017-6507, Debian bug #858768). That was one tricky bug as well, since it's not a security issue in apparmor per se, but more an issue with things that assume a certain behavior from apparmor. I have concluded that Wheezy was not affected because there are no assumptions of proper isolation there - which are provided only starting from LXC 1.0 - and Docker is not in Wheezy.
I also couldn't reproduce the issue on Jessie, but, as it turns out, the issue was sysvinit-specific, which is why I couldn't reproduce it under the default systemd configuration shipped with Jessie. I also looked at the various binutils security issues: as I reported on the mailing list, I didn't see anything serious enough in there to warrant a security release and followed the lead of both the stable and Red Hat security teams by marking this "no-dsa". I similarly reviewed the mp3splt security issues (specifically CVE-2017-5666) and was fairly puzzled by that issue, which seems to be triggered only by the same address sanitization extensions as PCRE, although there was some pretty wild interplay with debugging flags in there. All in all, it seems we can't reproduce that issue in wheezy, but I do not feel confident enough in the results to push that issue aside for now. I finally uploaded the pending graphicsmagick issue (DLA-547-2), a regression update to fix a crash that was introduced in the previous release (DLA-547-1, mistakenly named DLA-574-1). Hopefully that release should clear up some of the confusion and fix the regression. I also released DLA-879-1 for the CVE-2017-6369 in firebird2.5 which was an interesting experiment: I couldn't reproduce the issue in a local VM. After following the Ubuntu setup tutorial, as I wasn't too familiar with the Firebird database until now (hint: the default username and password is sysdba/masterkey), I ended up assuming we were vulnerable and just backporting the patch after seeing the jessie folks push out a release just in case. I also looked at updating the ca-certificates package to deal with the pending WoSign/Startcom removal: I made an explicit list of the CAs that need to be removed after reviewing the Mozilla list. I also sent a patch for an unrelated issue where ca-certificates is writing to /usr/local (!!) in Debian bug #843722.
I have also done some "meta" work in starting a discussion about fixing the missing DLA links in the tracker, as you will notice all of the above links lead to nowhere. Thanks to pabs, there are now some links but unfortunately there are about 500 DLAs missing from the website. We also discussed ways to Debian bug #859123, something which is currently a manual process. This is now in the hands of the excellent webmaster team. I have also filed a few missing security bugs (Debian bug #859135, Debian bug #859136), partly because I wanted to help the security team. But it turned out that I felt the script needed some improvements, so I submitted a patch to improve the script so it is easier to run.

Other projects As usual, there's the usual mixed bags of chaos: More stuff on Github...

30 March 2017

Shirish Agarwal: The tale of the dancing girl #nsfw

Demonstration of a Lapdance - Wikipedia

The post will be adult/mature in nature, so those below 18, please excuse. The post is about an anecdote from almost 20 years ago. The reason it is being posted is that I had dinner with a friend with whom I shared this, and he thought it would be nice if I shared it more widely, hence this post. The conversation was about being young and foolish, during which I shared the anecdote. The blog post was supposed to be about Aadhar, which shocked me both in the way no political discourse happened and the way the public as well as public policy was gamed, but that will have to wait for another day. History I left college in 1995. The anecdote/incident probably happened a couple of years earlier, so probably 1992-1993. At that time, I was in my teens and, as a typical teenager, I made a few friends. One of those friends will remain nameless: we have since drifted apart, and as I have not taken permission from him, using his name would not be a good idea. Anyway, let's call this gentleman Mr. X. A couple of months before, he had bought an open jeep, similar to but still quite different from the jeep shown below. The open jeep had become a fashion statement a few months earlier (in those days), as a movie starring Salman Khan featured one, and anybody who had money wanted one just like that.
Illustration of an open jeep, sadly its a military one - wikipedia

Those days we didn't have cell phones, and I had given my land-line phone number to very few friends, as land-lines were finicky instruments in those days. One fine morning, I got a call from my friend telling me he was going to come near my place, that I should meet him at some xyz place, and that we would go for a picnic for the whole day and might possibly return the next day. As it was the holidays, and only a fool would throw away a chance to have a ride in an open-air jeep, I immediately agreed. I shared that my friends had organized a picnic and, giving another friend's number (who didn't know anything), got permission and went to meet Mr. X. This was very early morning, around 0600 hrs. After meeting him, he told me that we would be going to Mumbai, pick up some more friends from there and then move on. In those days, a railway ticket from Shivaji Nagar to V.T. (now C.S.T.) cost INR 30/-. I had been to Mumbai a few times before for various technical conferences and knew a few cheap places to eat, so I knew that going via train, we could go and come back spending at the most INR 150/- and still have some change left over (today's meal at a roadside/street vendor easily passes that mark). The Journey I shared with him that it would be costly and that I didn't have any money to cover the fuel expenses, and he said he would shoulder the expenses; he just wanted my company for the road. Those days, it was the scenic Old Mumbai-Pune highway, and we took plenty of stops to admire the ghats (hills and valleys together). That journey must have taken around 7-8 hours, while today, by the new Expressway, you could do the same thing in 2.5-3 hours. Anyhow, we reached some swanky hotel in South Mumbai. South Mumbai was not the financial powerhouse that it is today; there was a mix of very old buildings and new buildings like the swanky hotel that we had checked in to.
I have no memory nor any idea whether it was 1 class, 3 class or 5 class, and couldn't have cared less as I was tired from the journey. We checked in, I had a long warm water bath and then slept in the king-size bed with the curtains drawn. Evening came and we took the jeep and picked up 2-3 of his friends, who were my age or a year or two older, and we went to Nariman Point. Seeing the Queen's Necklace from Nariman Point at night is a sight in itself. In my innocence, I was under the impression that we had arrived at our destination; at this our host, Mr. X, and his Bombaiya friends had a quiet laugh, saying the night was still young. We must have whiled away a couple of hours, having chai and throwing rocks in the sea. The Meeting After a while, Mr. X took us to another swanky place. My eyes were out of their sockets, as this seemed to be as elitist a place as could be. I saw many White European women in various stages of undress pole-dancing and lap-dancing. I had recently (in those days) come to know the term but was under the impression that it was something that happened in Europe and the States only. I had no idea that lap-dancing was older than me, according to Wikipedia. So looking back now, I am not surprised that in two decades the concept crossed the oceans. Again, Mr. X, being the host, agreed to bear all the costs, and all of us had food, drink and a lap-dance from any of the dancers on the floor. As I was young and probably shy (still am), I asked for Mr. X's help to pick a girl/woman for me. The woman whom he picked was auburn-haired and was either my age or a year or two older/younger than me. What proceeded next was about 20-30 minutes of a totally sexualized erotic experience. While he and all his friends picked girls to go all the way, I was hesitant to let loose. Maybe it was due to my lack of courage or inexperience, maybe it was because it was not my city so I couldn't predict the outcome, maybe I was just afraid that reality might mar fantasy; I dunno till date.
Although we kissed and necked a lot, I guess that should count for something. The conversation After all my friends had gone to the various rooms, I excused myself sometime later, went to the loo, peed a bit, splashed cold water on myself, came out, had a couple of glasses of water and came back to my seat. The lady came back and I shared that I was not interested in going further and that, while she was beautiful, I just didn't have the guts. I did ask her if she would keep me company for some time, though, as I didn't know anyone else at that place. Our conversation was more about her than me, as I had had a more or less average life up to that moment. There were only three unorthodox things that I had done before meeting her. I had drunk wines of different types, smoked weed and had a Magic Mushrooms experience the year before with another group of friends I had made there. Goa in those days was simply magical, but that probably would need its own story/blog post. When I enquired about her, she shared that she was from Russia, and she rattled off more than half a dozen places around the world where she had been; this was her second or third stint in Mumbai, and she wasn't at all unhappy about the lifestyle she was leading and the choices she had made. I had no answer for her as a young penniless college-going student. Her self-confidence and the way she carried herself were impressive, with or without clothes. During the course of the conversation she shared a couple of contacts from whom I could get better weed at a slightly higher price if I were in Goa. A few months later, those contacts turned out to be true.
After some time, we took all the women and ourselves, around 8-9 people, in his jeep (how he negotiated that is beyond me), went to a hygienic Pani puri and Bhel (puffed rice mixed with a variety of spices, typically with tomato, potato, coriander chutney as well as tamarind chutney, among other things) place and moved them to tears (the spices in the bhel and Pani puri did it for them), and this was when we had explicitly asked the bhel-wala guy to make it extremely mild with just a hint of spice in it. Anyway, sometime later, we dropped them at the same place, dropped his friends and came back to the hotel we had booked and got drunk again. After-effects A few years later, it came out in the newspapers/media that while India had broken out of financial isolation just a few years back (1991) and was profiting from it, many countries of the former USSR were going the other way around, and hence huge human trafficking and immigration had taken place. This was in line with what the lady/woman/Miss X had shared with me. The latest trigger The latest trigger happened a couple of months back, when I learnt of a hero flight attendant saving a girl from human trafficking. Till date, I am unsure whether the lady I met was doing it willingly or putting on a brave smile in front of me, because even if she had confided in me in any way, I probably would have been too powerless to help her. I just don't know. Foolishness thy name While my friend took advantage of my innocence and introduced me to a world which I would otherwise probably not have known exists, it could easily have gone some other way as well. While I'm still unsure of the choices I made, I was and am happy that I was able to strike up a conversation with her and attempt to reach the person therein. Was it the truth or an elaborate fabricated lie to protect myself and herself? This I will never know.
Oppression I understand the fact that, as a customer or somebody who is taking part in either of those performances or experiences, it isn't easy in any way to know or say whether the performer is doing it wilfully or not, as the experiences are in tightly controlled settings.
Filed under: Miscellenous Tagged: #anecdote, #confusion, #elitist, #growing up, #lap dance, #NSFW, #Open Jeep, Mumbai

23 March 2017

Simon McVittie: GTK hackfest 2017: D-Bus communication with containers

At the GTK hackfest in London (which accidentally became mostly a Flatpak hackfest) I've mainly been looking into how to make D-Bus work better for app container technologies like Flatpak and Snap. The initial motivating use cases are: At the moment, Flatpak runs a D-Bus proxy for each app instance that has access to D-Bus, connects to the appropriate bus on the app's behalf, and passes messages through. That proxy is in a container similar to the actual app instance, but not actually the same container; it is trusted to not pass messages through that it shouldn't pass through. The app-identification mechanism works in practice, but is Flatpak-specific, and has a known race condition due to process ID reuse and limitations in the metadata that the Linux kernel maintains for AF_UNIX sockets. In practice the use of X11 rather than Wayland in current systems is a much larger loophole in the container than this race condition, but we want to do better in future. Meanwhile, Snap does its sandboxing with AppArmor, on kernels where it is enabled both at compile-time (Ubuntu, openSUSE, Debian, Debian derivatives like Tails) and at runtime (Ubuntu, openSUSE and Tails, but not Debian by default). Ubuntu's kernel has extra AppArmor features that haven't yet gone upstream, some of which provide reliable app identification via LSM labels, which dbus-daemon can learn by querying its AF_UNIX socket. However, other kernels like the ones in openSUSE and Debian don't have those. The access-control (AppArmor mediation) is implemented in upstream dbus-daemon, but again doesn't work portably, and is not sufficiently fine-grained or flexible to do some of the things we'll likely want to do, particularly in dconf. After a lot of discussion with dconf maintainer Allison Lortie and Flatpak maintainer Alexander Larsson, I think I have a plan for fixing this. This is all subject to change: see fd.o #100344 for the latest ideas. 
Identity model Each user (uid) has some uncontained processes, plus 0 or more containers. The uncontained processes include dbus-daemon itself, desktop environment components such as gnome-session and gnome-shell, the container managers like Flatpak and Snap, and so on. They have the user's full privileges, and in particular they are allowed to do privileged things on the user's session bus (like running dbus-monitor), and act with the user's full privileges on the system bus. In generic information security jargon, they are the trusted computing base; in AppArmor jargon, they are unconfined. The containers are Flatpak apps, or Snap apps, or other app-container technologies like Firejail and AppImage (if they adopt this mechanism, which I hope they will), or even a mixture (different app-container technologies can coexist on a single system). They are containers (or container instances) and not "apps", because in principle, you could install com.example.MyApp 1.0, run it, and while it's still running, upgrade to com.example.MyApp 2.0 and run that; you'd have two containers for the same app, perhaps with different permissions. Each container has a container type, which is a reversed DNS name like org.flatpak or io.snapcraft representing the container technology, and an app identifier, an arbitrary non-empty string whose meaning is defined by the container technology. For Flatpak, that string would be another reversed DNS name like com.example.MyGreatApp; for Snap, as far as I can tell it would look like example-my-great-app. The container technology can also put arbitrary metadata on the D-Bus representation of a container, again defined and namespaced by the container technology. For instance, Flatpak would use some serialization of the same fields that go in the Flatpak metadata file at the moment. Finally, the container has an opaque container identifier identifying a particular container instance.
For example, launching com.example.MyApp twice (maybe different versions or with different command-line options to flatpak run) might result in two containers with different privileges, so they need to have different container identifiers. Contained server sockets App-container managers like Flatpak and Snap would create an AF_UNIX socket inside the container, bind() it to an address that will be made available to the contained processes, and listen(), but not accept() any new connections. Instead, they would fd-pass the new socket to the dbus-daemon by calling a new method, and the dbus-daemon would proceed to accept() connections after the app-container manager has signalled that it has called both bind() and listen(). (See fd.o #100344 for full details.) Processes inside the container must not be allowed to contact the AF_UNIX socket used by the wider, uncontained system - if they could, the dbus-daemon wouldn't be able to distinguish between them and uncontained processes and we'd be back where we started. Instead, they should have the new socket bind-mounted into their container's XDG_RUNTIME_DIR and connect to that, or have the new socket set as their DBUS_SESSION_BUS_ADDRESS and be prevented from connecting to the uncontained socket in some other way. Those familiar with the kdbus proposals a while ago might recognise this as being quite similar to kdbus' concept of endpoints, and I'm considering reusing that name. Along with the socket, the container manager would pass in the container's identity and metadata, and the method would return a unique, opaque identifier for this particular container instance. The basic fields (container technology, technology-specific app ID, container ID) should probably be added to the result of GetConnectionCredentials(), and there should be a new API call to get all of those plus the arbitrary technology-specific metadata. 
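The container manager's side of this handshake can be sketched roughly as follows. This is an illustration only: the `ContainerIdentity` class and the `prepare_contained_socket` helper are hypothetical names, and the new dbus-daemon method that would receive the socket is deliberately left out, since its details are still being worked out in fd.o #100344.

```python
import os
import socket
from dataclasses import dataclass, field


@dataclass
class ContainerIdentity:
    """Identity a container manager would hand over (hypothetical shape)."""
    container_type: str                  # e.g. "org.flatpak"
    app_id: str                          # meaning defined by the technology
    metadata: dict = field(default_factory=dict)  # technology-specific


def prepare_contained_socket(directory, name="bus"):
    """bind() and listen() on a fresh AF_UNIX socket, but never accept().

    Accepting connections is left to whoever receives the socket
    (the dbus-daemon, in the plan described above).
    """
    path = os.path.join(directory, name)
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.bind(path)
    sock.listen(socket.SOMAXCONN)
    return sock, path


if __name__ == "__main__":
    import tempfile
    with tempfile.TemporaryDirectory() as runtime_dir:
        identity = ContainerIdentity("org.flatpak", "com.example.MyApp")
        sock, path = prepare_contained_socket(runtime_dir)
        # At this point the manager would fd-pass sock.fileno(), together
        # with `identity`, to the dbus-daemon (not shown here).
        print(identity, path)
        sock.close()
```

Contained processes would then only ever see `path` (for instance bind-mounted into their XDG_RUNTIME_DIR), never the socket used by uncontained processes.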
When a process from a container connects to the contained server socket, every message that it sends should also have the container instance ID in a new header field. This is OK even though dbus-daemon does not (in general) forbid sender-specified future header fields, because any dbus-daemon that supported this new feature would guarantee to set that header field correctly, the existing Flatpak D-Bus proxy already filters out unknown header fields, and adding this header field is only ever a reduction in privilege. The reasoning for using the sender's container instance ID (as opposed to the sender's unique name) is for services like dconf to be able to treat multiple unique bus names as belonging to the same equivalence class of contained processes: instead of having to look up the container metadata once per unique name, dconf can look it up once per container instance the first time it sees a new identifier in a header field. For the second and subsequent unique names in the container, dconf can know that the container metadata and permissions are identical to the one it already saw. Access control In principle, we could have the new identification feature without adding any new access control, by keeping Flatpak's proxies. However, in the short term that would mean we'd be adding new API to set up a socket for a container without any access control, and having to keep the proxies anyway, which doesn't seem great; in the longer term, I think we'd find ourselves adding a second new API to set up a socket for a container with new access control. So we might as well bite the bullet and go for the version with access control immediately. In principle, we could also avoid the need for new access control by ensuring that each service that will serve contained clients does its own. 
However, that makes it really hard to send broadcasts and not have them unintentionally leak information to contained clients - we would need to do something more like kdbus' approach to multicast, where services know who has subscribed to their multicast signals, and that is just not how dbus-daemon works at the moment. If we're going to have access control for broadcasts, it might as well also cover unicast. The plan is that messages from containers to the outside world will be mediated by a new access control mechanism, in parallel with dbus-daemon's current support for firewall-style rules in the XML bus configuration, AppArmor mediation, and SELinux mediation. A message would only be allowed through if the XML configuration, the new container access control mechanism, and the LSM (if any) all agree it should be allowed. By default, processes in a container can send broadcast signals, and send method calls and unicast signals to other processes in the same container. They can also receive method calls from outside the container (so that interfaces like org.freedesktop.Application can work), and send exactly one reply to each of those method calls. They cannot own bus names, communicate with other containers, or send file descriptors (which reduces the scope for denial of service). Obviously, that's not going to be enough for a lot of contained apps, so we need a way to add more access. I'm intending this to be purely additive (start by denying everything except what is always allowed, then add new rules), not a mixture of adding and removing access like the current XML policy language. There are two ways we've identified for rules to be added: Initially, many contained apps would work in the first way (and in particular sockets=session-bus would add a rule that allows almost everything), while over time we'll probably want to head towards recommending more use of the second. 
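The two ideas in the access control plan above (a message passes only if every mediation layer agrees, and container rules are purely additive on top of a fixed default) can be illustrated with a small Python model. Everything here is hypothetical: the `Message` fields, the class and function names, and the stand-in checks for the XML policy and the LSM are assumptions for the sake of the sketch, not dbus-daemon APIs.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Message:
    """Just enough of a D-Bus message for the sketch (fields are assumed)."""
    kind: str                        # "method_call", "signal" or "broadcast"
    sender_container: Optional[str]  # container instance ID, None if uncontained
    dest_container: Optional[str]    # None for broadcasts or uncontained peers
    destination: str = ""            # destination bus name, if any


Rule = Callable[[Message], bool]


def always_allowed(msg: Message) -> bool:
    """Defaults from the text: broadcasts, and traffic within one container."""
    if msg.kind == "broadcast":
        return True
    return (msg.sender_container is not None
            and msg.sender_container == msg.dest_container)


@dataclass
class ContainerPolicy:
    """Purely additive: start from the defaults, then only ever add rules."""
    extra_rules: List[Rule] = field(default_factory=list)

    def allow(self, rule: Rule) -> None:
        self.extra_rules.append(rule)

    def permits(self, msg: Message) -> bool:
        return always_allowed(msg) or any(rule(msg) for rule in self.extra_rules)


def mediate(msg: Message, checks: List[Rule]) -> bool:
    """A message passes only if every mediation layer agrees."""
    return all(check(msg) for check in checks)


if __name__ == "__main__":
    policy = ContainerPolicy()
    xml_policy = lambda msg: True   # stand-in for the XML bus configuration
    lsm = lambda msg: True          # stand-in for AppArmor/SELinux mediation
    call = Message("method_call", "c1", None, "org.example.Service")
    print(mediate(call, [xml_policy, policy.permits, lsm]))  # denied by default
    policy.allow(lambda m: m.destination == "org.example.Service")
    print(mediate(call, [xml_policy, policy.permits, lsm]))  # allowed once a rule is added
```

Because rules can only be added, the order in which they are installed does not matter, which is one way in which such a model is simpler to reason about than a policy language that mixes allow and deny rules.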
Related topics
Access control on the system bus
We talked about the possibility of using a very similar ruleset to control access to the system bus, as an alternative to the XML rules found in /etc/dbus-1/system.d and /usr/share/dbus-1/system.d. We didn't really come to a conclusion here.
Allison had the useful insight that the XML rules are acting like a firewall: they're something that is placed in front of potentially-broken services, and not part of the services themselves (which, as with firewalls like ufw, makes it seem rather odd when the services themselves install rules). D-Bus system services already have total control over what requests they will accept from D-Bus peers, and if they rely on the XML rules to mediate that access, they're essentially rejecting that responsibility and hoping the dbus-daemon will protect them. The D-Bus maintainers would much prefer it if system services took responsibility for their own access control (with or without using polkit), because fundamentally the system service is always going to understand its domain and its intended security model better than the dbus-daemon can.
Analogously, when a network service listens on all addresses and accepts requests from elsewhere on the LAN, we sometimes work around that by protecting it with a firewall, but the optimal resolution is to get that network service fixed to do proper authentication and access control instead.
For system services, we continue to recommend essentially this "firewall" configuration, filling in the ${} variables as appropriate:
<busconfig>
    <policy user="${the daemon uid under which the service runs}">
        <allow own="${the service's bus name}"/>
    </policy>
    <policy context="default">
        <allow send_destination="${the service's bus name}"/>
    </policy>
</busconfig>
We discussed the possibility of moving towards a model where the daemon uid to be allowed is written in the .service file, together with an opt-in to "modern D-Bus access control" that makes the "firewall" unnecessary; after some flag day when all significant system services follow that pattern, dbus-daemon would even have the option of no longer applying the "firewall" (moving to an allow-by-default model) and just refusing to activate system services that have not opted in to being safe to use without it. However, the "firewall" also protects system bus clients, and services like Avahi that are not bus-activatable, against unintended access, which is harder to solve via that approach; so this is going to take more thought.
For system services' clients that follow the "agent" pattern (BlueZ, polkit, NetworkManager, Geoclue), the correct "firewall" configuration is more complicated. At some point I'll try to write up a best-practice for these.
New header fields for the system bus
At the moment, it's harder than it needs to be to provide non-trivial access control on the system bus, because on receiving a method call, a service has to remember what was in the method call, then call GetConnectionCredentials() to find out who sent it, then only process the actual request when it has the information necessary to do access control.
Allison and I had hoped to resolve this by adding new D-Bus message header fields with the user ID, the LSM label, and other interesting facts for access control. These could be "opt-in" to avoid increasing message sizes for no reason: in particular, it is not typically useful for session services to receive the user ID, because only one user ID is allowed to connect to the session bus anyway.
Unfortunately, the dbus-daemon currently lets unknown fields through without modification.
With hindsight this seems an unwise design choice, because header fields are a finite resource (there are 255 possible header fields) and are defined by the D-Bus Specification. The only field that can currently be trusted is the sender's unique name, because the dbus-daemon sets that field, overwriting the value in the original message (if any).
To make it safe to rely on the new fields, we would have to make the dbus-daemon filter out all unknown header fields, and introduce a mechanism for the service to check (during connection to the bus) whether the dbus-daemon is sufficiently new that it does so. If connected to an older dbus-daemon, the service would not be able to rely on the new fields being true, so it would have to ignore the new fields and treat them as unset. The specification is sufficiently vague that making new dbus-daemons filter out unknown header fields is a valid change (it just says that "Header fields with an unknown or unexpected field code must be ignored", without specifying who must ignore them, so having the dbus-daemon delete those fields seems spec-compliant).
This all seemed fine when we discussed it in person; but GDBus already has accessors for arbitrary header fields by numeric ID, and I'm concerned that this might mean it's too easy for a system service to be accidentally insecure: It would be natural (but wrong!) for an implementor to assume that if g_dbus_message_get_header (message, G_DBUS_MESSAGE_HEADER_FIELD_SENDER_UID) returned non-NULL, then that was guaranteed to be the correct, valid sender uid. As a result, fd.o #100317 might have to be abandoned. I think more thought is needed on that one.
Unrelated topics
As happens at any good meeting, we took the opportunity of high-bandwidth discussion to cover many useful things and several useless ones. Other discussions that I got into during the hackfest included, in no particular order:
More notes are available from the GNOME wiki.
Acknowledgements
The GTK hackfest was organised by GNOME and hosted by Red Hat and Endless. My attendance was sponsored by Collabora. Thanks to all the sponsors and organisers, and the developers and organisations who attended.

20 March 2017

Bits from Debian: DebConf17 welcomes its first eighteen sponsors!

DebConf17 will take place in Montreal, Canada in August 2017. We are working hard to provide fuel for hearts and minds, to make this conference once again a fertile soil for the Debian Project to flourish. Please join us and support this landmark in the Free Software calendar.
Eighteen companies have already committed to sponsor DebConf17! With a warm welcome, we'd like to introduce them to you.
Our first Platinum sponsor is Savoir-faire Linux, a Montreal-based Free/Open-Source Software company which offers Linux and Free Software integration solutions and actively contributes to many free software projects. "We believe that it's an essential piece [Debian], in a social and political way, to the freedom of users using modern technological systems", said Cyrille Béraud, president of Savoir-faire Linux.
Our first Gold sponsor is Valve, a company developing games, a social entertainment platform, and game engine technologies. And our second Gold sponsor is Collabora, which offers a comprehensive range of services to help its clients to navigate the ever-evolving world of Open Source.
As Silver sponsors we have credativ (a service-oriented company focusing on open-source software and also a Debian development partner), Mojatatu Networks (a Canadian company developing Software Defined Networking (SDN) solutions), the Bern University of Applied Sciences (with over 6,600 students enrolled, located in the Swiss capital), Microsoft (an American multinational technology company), Evolix (an IT managed services and support company located in Montreal), Ubuntu (the OS supported by Canonical) and Roche (a major international pharmaceutical provider and research company dedicated to personalized healthcare).
ISG.EE, IBM, Bluemosh, Univention and Skroutz are our Bronze sponsors so far. And finally, the Linux Foundation, Réseau Koumbit and adte.ca are our supporter sponsors.
Become a sponsor too!
Would you like to become a sponsor?
Do you know of or work in a company or organization that may consider sponsorship? Please have a look at our sponsorship brochure (or a summarized flyer), in which we outline all the details and describe the sponsor benefits. For further details, feel free to contact us through sponsors@debconf.org, and visit the DebConf17 website at https://debconf17.debconf.org.

12 March 2017

Iustin Pop: A recipe for success

It is said that with age comes wisdom. I would be happy for that to be true, because today I must have been very very young then. For example, if you want to make a long bike ride in order to hit some milestone, like your first metric century, it is not advisable to follow ANY of the following points:
For bonus points, if you somehow manage to reach the third peak in the above ride, and have mostly only flat/down to the destination, do the following: be so glad you're done with climbing that you don't pay attention to the map and start a wrong descent, on a busy narrow road, so that you can't stop immediately as you realise you've lost the track; it will cost you only an extra ~80 meters of height towards the end of the ride. Which are pretty cheap, since all the food is gone and the water almost as well, so the backpack is light. Right.
However, if you do follow all the above, you're rewarded with a most wonderful thing for the second half of the ride: you will receive a +5 boost on your concentration skill. You will be able to focus on, and think about, a single thing for hours at a time, examining it (well, its contents) in minute detail. Plus, when you get home and open that thing (I mean, of course, the FRIDGE with all the wonderful FOOD it contains), everything will taste MAGICAL! You can now recoup the roughly 1500-calorie deficit on the ride, and finally no longer feel SO HUNGRY.
That's all. Strava said "EXTREME" suffer score, albeit less than 20% points in the red, which means I was just slugging through the ride (total time confirms it), like a very very very old man. But definitely not a wise one.

28 February 2017

Chris Lamb: Free software activities in February 2017

Here is my monthly update covering what I have been doing in the free software world (previous month):
Reproducible builds

Whilst anyone can inspect the source code of free software for malicious flaws, most software is distributed pre-compiled to end users. The motivation behind the Reproducible Builds effort is to permit verification that no flaws have been introduced either maliciously or accidentally during this compilation process by promising identical results are always generated from a given source, thus allowing multiple third-parties to come to a consensus on whether a build was compromised. (I have been awarded a grant from the Core Infrastructure Initiative to fund my work in this area.) This month I:
I also made the following changes to our tooling:
diffoscope

diffoscope is our in-depth and content-aware diff utility that can locate and diagnose reproducibility issues.

  • New features:
    • Add a machine-readable JSON output format. (Closes: #850791).
    • Add an --exclude option. (Closes: #854783).
    • Show results from debugging packages last. (Closes: #820427).
    • Extract archive members using an auto-incrementing integer avoiding the need to sanitise filenames. (Closes: #854723).
    • Apply --max-report-size to --text output. (Closes: #851147).
    • Specify <html lang="en"> in the HTML output. (re. #849411).
  • Bug fixes:
    • Fix errors when comparing directories with non-directories. (Closes: #835641).
    • Device and RPM fallback comparisons require xxd. (Closes: #854593).
    • Fix tests that call xxd on Debian Jessie due to change of output format. (Closes: #855239).
    • Add missing Recommends for comparators. (Closes: #854655).
    • Importing submodules (ie. parent.child) will attempt to import parent. (Closes: #854670).
    • Correct logic of module_exists ensuring we correctly skip the debian.deb822 tests when python3-debian is not installed. (Closes: #854745).
    • Clean all temporary files in the signal handler thread instead of attempting to pass the exception back to the main thread. (Closes: #852013).
    • Fix behaviour of setting report maximums to zero (ie. no limit).
  • Optimisations:
    • Don't uselessly run xxd(1) on non-directories.
    • No need to track libarchive directory locations.
    • Optimise create_limited_print_func.
  • Tests:
    • When comparing two empty directories, ensure that the mtime of the directory is consistent to avoid non-deterministic failures.
    • Ensure we can at least import the "deb_fallback" and "rpm_fallback" modules.
    • Add test for symlink differing in destination.
    • Add tests for --progress, --status-fd and profiling output options as well as the Deb Changes,Buildinfo,Dsc and RPM fallback comparisons.
    • Add get_data and @skip_unless_module_exists test helpers.
    • Mark impossible-to-reach code to improve test coverage.

buildinfo.debian.net

buildinfo.debian.net is my experiment into how to process, store and distribute .buildinfo files after the Debian archive software has processed them.

  • Drop raw_text fields now as we've moved these to Amazon S3.
  • Drop storage of Installed-Build-Depends and subsequently-orphaned Binary package instances to recover diskspace.

strip-nondeterminism

strip-nondeterminism is our tool to remove specific non-deterministic results from a completed build.

  • Print log entry when fixing a file. (Closes: #777239).
  • Run our entire testsuite in autopkgtests, not just the first test. (Closes: #852517).
  • Don't test for stat(2)'s blksize and block attributes. (Closes: #854937).
  • Use error() from Dh_Lib.pm over "manual" die().


Debian
Debian LTS

This month I have been paid to work 13 hours on Debian Long Term Support (LTS). In that time I did the following:
  • "Frontdesk" duties, triaging CVEs, etc.
  • Issued DLA 817-1 for libphp-phpmailer, correcting a local file disclosure vulnerability where insufficient parsing of HTML messages could potentially be used by an attacker to read a local file.
  • Issued DLA 826-1 for wireshark, which fixes a denial of service vulnerability where a malformed NATO Ground Moving Target Indicator Format ("STANAG 4607") capture file could cause memory exhaustion or an infinite loop.

Uploads
  • python-django (1:1.11~beta1-1) New upstream beta release.
  • redis (3:3.2.8-1) New upstream release.
  • gunicorn (19.6.0-11) Use ${misc:Pre-Depends} to populate Pre-Depends for dpkg-maintscript-helper.
  • dh-virtualenv (1.0-1~bpo8+1) Upload to jessie-backports.

I sponsored the following uploads: I also performed the following QA uploads:
  • dh-kpatches (0.99.36+nmu4) Make kernel builds reproducible.
Finally, I made the following non-maintainer uploads:
  • cpio (2.12+dfsg-3) Remove rmt.8.gz to prevent a piuparts error.
  • dot-forward (1:0.71-2.2) Correct a FTBFS; we don't install anything to /usr/sbin, so use GNU Make's $(wildcard ..) over the shell's own * expansion.


FTP Team

As a Debian FTP assistant I ACCEPTed 116 packages: autobahn-cpp, automat, bglibs, bitlbee, bmusb, bullet, case, certspotter, checkit-tiff, dash-el, dash-functional-el, debian-reference, el-x, elisp-bug-hunter, emacs-git-messenger, emacs-which-key, examl, genwqe-user, giac, golang-github-cloudflare-cfssl, golang-github-docker-goamz, golang-github-docker-libnetwork, golang-github-go-openapi-spec, golang-github-google-certificate-transparency, golang-github-karlseguin-ccache, golang-github-karlseguin-expect, golang-github-nebulouslabs-bolt, gpiozero, gsequencer, jel, libconfig-mvp-slicer-perl, libcrush, libdist-zilla-config-slicer-perl, libdist-zilla-role-pluginbundle-pluginremover-perl, libevent, libfunction-parameters-perl, libopenshot, libpod-weaver-section-generatesection-perl, libpodofo, libprelude, libprotocol-http2-perl, libscout, libsmali-1-java, libtest-abortable-perl, linux, linux-grsec, linux-signed, lockdown, lrslib, lua-curses, lua-torch-cutorch, mariadb-10.1, mini-buildd, mkchromecast, mocker-el, node-arr-exclude, node-brorand, node-buffer-xor, node-caller, node-duplexer3, node-ieee754, node-is-finite, node-lowercase-keys, node-minimalistic-assert, node-os-browserify, node-p-finally, node-parse-ms, node-plur, node-prepend-http, node-safe-buffer, node-text-table, node-time-zone, node-tty-browserify, node-widest-line, npd6, openoverlayrouter, pandoc-citeproc-preamble, pydenticon, pyicloud, pyroute2, pytest-qt, pytest-xvfb, python-biomaj3, python-canonicaljson, python-cgcloud, python-gffutils, python-h5netcdf, python-imageio, python-kaptan, python-libtmux, python-pybedtools, python-pyflow, python-scrapy, python-scrapy-djangoitem, python-signedjson, python-unpaddedbase64, python-xarray, qcumber, r-cran-urltools, radiant, repo, rmlint, ruby-googleauth, ruby-os, shutilwhich, sia, six, slimit, sphinx-celery, subuser, swarmkit, tmuxp, tpm2-tools, vine, wala & x265. 
I additionally filed 8 RC bugs against packages that had incomplete debian/copyright files: checkit-tiff, dash-el, dash-functional-el, libcrush, libopenshot, mkchromecast, pytest-qt & x265.

Gunnar Wolf: Much belated book presentation, this Saturday

Once again, I'm making an announcement mainly for my local circle of friends and (gasp!) followers. For those of you over 100 km away from Mexico City, please disregard this message.
Back in July 2015, and after two years of hard work, my university finished the publishing step of my second book. This is a textbook for the subject I teach at Computer Engineering: Operating Systems Fundamentals. The book is, from its inception, fully available online under a permissive (CC-BY) license. One of the book's intended contributions is to present a text natively written in Spanish. Besides, our goal (I coordinated a team of authors, working with two colleagues from Rosario, Argentina, and one from Cauca, Colombia) was to provide a book students can easily and legally share.
I have got many good reviews so far, and after teaching based on it for four years (while working on it and after its publication), I can attest the material is light enough to fit in a Bachelor's-level degree, while it's deep enough to make our students sweat healthily ;-)
Anyway: I have been scheduled to present the book at my university's main book show, 38 Feria Internacional del Libro del Palacio de Minería, this Saturday, 2017.03.04 16:00; Salón Manuel Tolsá. What's even better: This time, I won't be preparing a speech! The book will be presented by my two very good friends, José María Serralde and Rolando Cedillo. Both of them are clever, witty, fun, and a real honor to work with. Of course, having them present our book is more than a double honor.
So, everybody who can make it: FIL Minería is always great and fun. Come share the love! Come have a book! Or, at least, have a good time and a nice chat with us!

29 January 2017

Sam Hartman: Network Audio Visualization: Network Modeling

Previously, I wrote about my project to create an audio depiction of network traffic. In this second post, I explore how I model aspects of the network that will be captured in the audio representation. Before getting started, I'll pass along a link. This is not the first time someone has tried to put sound to packets flying through the ether: I was pointed at Peep. I haven't looked at Peep, but will do so after I finish my own write-up. Not being an academic, I feel no obligation to compare and contrast my work to others :-)
I started with an idea of what I'd like to hear. One of my motivations was to explore some automated updates we run at work. So, I was hoping to capture the initial DNS and ARP traffic as the update discovered the systems it would contact. Then I was hoping to capture the ssh and other traffic of the actual update.
To Packet or Stream
One of the simplest things to do would simply be to model network packets. For DNS I chose that approach. I was dubious that a packet-based model would capture the aspects of TCP streams I typically care about. I care about the source and destination (both address and port) of course. However I also care about how much traffic is being carried over the stream and the condition of the stream. Are there retransmits? Are there a bunch of unanswered SYNs? But I don't care about the actual distribution of packets. Also, a busy TCP stream can generate thousands of packets a second. I doubted my ability to distinguish thousands of sounds a second at all, especially while trying to convey enough information to carry stream characteristics like overall traffic volume.
So, for TCP, I decided to model some characteristics of streams rather than individual packets.
For DNS, I decided to represent individual requests/replies.
I came up with something clever for ARP. There, I model the request/reply as an outstanding request. A lot of unanswered ARPs can be a sign of a scan or a significant problem. The mournful sound of a TCP stream trailing off into an unanswered ARP as the cache times out on a broken network is certainly something I'd like to capture. So, I track when an ARP request is sent and when/if it is answered.
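The ARP bookkeeping described above (remember when a request went out, then report either a reply or a timeout) might look roughly like the following. This is a minimal sketch, not the actual script: the class name, the method names and the one-second default timeout are all assumptions.

```python
class ArpTracker:
    """Track outstanding ARP requests by the IP address being asked about."""

    def __init__(self, timeout=1.0):
        self.timeout = timeout   # seconds before a request counts as unanswered
        self.pending = {}        # target IP -> time the request was first seen

    def request(self, target_ip, now):
        # Repeated requests for the same target keep the original start time.
        self.pending.setdefault(target_ip, now)

    def reply(self, target_ip, now):
        """Return how long the request was outstanding, or None if unsolicited."""
        started = self.pending.pop(target_ip, None)
        return None if started is None else now - started

    def expire(self, now):
        """Return (and forget) targets whose requests have gone unanswered."""
        timed_out = [ip for ip, started in self.pending.items()
                     if now - started >= self.timeout]
        for ip in timed_out:
            del self.pending[ip]
        return timed_out
```

Calling `expire()` periodically yields the timeout events; the number of entries left in `pending` is a direct measure of how many unanswered ARPs are in flight at any moment.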
Sound or Music
I saw two approaches. First, I could use some sound to represent streams. As an example, a running diesel engine could make a great representation of a stream. The engine speed could represent overall traffic flow. There are many opportunities for detuning the engine to represent various problems that can happen with a stream. Perhaps using stereo separation and slightly different fundamental frequencies I could even represent a couple of streams and still be able to track them.
However, at least with me as a listener, that's not going to scale to a busy network. The other option I saw was to try and create melodic music with various musical phrases modified as conditions within the stream or network changed. That seemed a lot harder to do, but humans are good at listening to complicated music.
I ended up deciding that at least for the TCP streams, I was going to try and produce something more musical than sound. I was nervous: I kept having visions of a performance of "Peter and the Wolf" with different instruments representing all the characters that somehow went dreadfully wrong.
As an aside, the decision to approach music rather than sound depended heavily on what I was trying to capture. If I'm modeling more holistic properties of a system--for example, total network traffic without splitting into streams--I think parameterized sounds would be a better approach.
The decision to approach things musically affected the rest of the modeling. Somehow I was going to need to figure out notes to play. I'd already rejected the idea of modeling packets, so I wouldn't simply be able to play notes when a packet arrived.
Energy Decay
As I played with various options, I realized that the critical challenge would be figuring out how to focus the listener's attention on the important aspects of what was going on. Clutter was the great enemy. My job would be figuring out how to spend sound wisely. When something interesting happened, that part of the model should get more focus--more of the listener's energy.
Soon I found myself thinking a lot about managing the energy of network streams. I imagined streams getting energy when something happened, and spending that energy to convey that interesting event to the listener. Energy needed to accumulate fast enough that even low-traffic streams could be noticed. Energy needed to be spent fast enough that old events were not taking listener focus from new, interesting things going on. However, if the energy were spent slowly enough, then network events could be smoothed out to give a better picture of the stream rather than individual packets.
This concept of managing some decaying quantity and managing the rate of decay proved useful at multiple levels of the model.
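One way to picture such a decaying quantity is a per-stream accumulator that gains energy on each event and loses it exponentially over time. The sketch below is an assumption about how this could be modelled, not the author's implementation: the class name, the exponential decay law and the half-life parameter are all hypothetical.

```python
class StreamEnergy:
    """Energy that grows with events and decays exponentially between them.

    Calls are expected in non-decreasing time order.
    """

    def __init__(self, half_life=2.0):
        self.half_life = half_life   # seconds for the energy to halve (assumed)
        self.energy = 0.0
        self.last_update = 0.0

    def _decay_to(self, now):
        elapsed = now - self.last_update
        self.energy *= 0.5 ** (elapsed / self.half_life)
        self.last_update = now

    def add_event(self, now, amount=1.0):
        """Something interesting happened on the stream at time `now`."""
        self._decay_to(now)
        self.energy += amount

    def level(self, now):
        """Energy still available to spend on the listener's attention."""
        self._decay_to(now)
        return self.energy
```

A short half-life makes the model react to individual packets, while a long one smooths bursts into a steadier picture of the stream, which is the trade-off between spending energy quickly and slowly described above.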
Two Layer Model
I started with a python script that parses tcpdump output. It associates a packet with a stream and batches packets together to avoid overloading other parts of the system.
The output of this script is a series of stream events. Events include a source and destination address, a stream ID, traffic in each direction, and any special events on the stream.
For DNS, the script just outputs packet events. For ARP, the script outputs request start, reply, and timeout events. There's some initial support for UDP, but so far that doesn't make sound.
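The first layer described above can be sketched as a small parser that reads tcpdump text, assigns each TCP packet to a stream, and batches packets into time windows. This is an illustration only: the expected line format (roughly what `tcpdump -n -tt` prints for IPv4 TCP), the event field names and the 0.1 second window are assumptions, and the real script may work quite differently.

```python
import re

# Assumed input, one packet per line, for example:
# 1485700000.123456 IP 192.0.2.1.51000 > 192.0.2.2.22: Flags [P.], seq 1:37, ack 1, win 229, length 36
LINE_RE = re.compile(
    r"^(?P<ts>\d+\.\d+) IP "
    r"(?P<src>\d+\.\d+\.\d+\.\d+)\.(?P<sport>\d+) > "
    r"(?P<dst>\d+\.\d+\.\d+\.\d+)\.(?P<dport>\d+): "
    r"Flags \[(?P<flags>[^\]]*)\].*\blength (?P<length>\d+)"
)


def batch_events(lines, window=0.1):
    """Yield one event dict per stream per time window."""
    batch = {}
    batch_start = None
    for line in lines:
        m = LINE_RE.match(line)
        if m is None:
            continue                      # not a TCP line this sketch understands
        ts = float(m["ts"])
        if batch_start is None:
            batch_start = ts
        if ts - batch_start >= window:    # window is over: flush what we have
            yield from batch.values()
            batch, batch_start = {}, ts
        a = (m["src"], int(m["sport"]))
        b = (m["dst"], int(m["dport"]))
        low, high = sorted([a, b])        # same stream ID for both directions
        sid = "%s:%d-%s:%d" % (low + high)
        event = batch.setdefault(sid, {
            "stream": sid, "start": batch_start,
            "bytes_up": 0, "bytes_down": 0, "fin": False,
        })
        # "up" means sent by the lower-sorted endpoint; an arbitrary convention.
        key = "bytes_up" if a == low else "bytes_down"
        event[key] += int(m["length"])
        if "F" in m["flags"]:
            event["fin"] = True
    yield from batch.values()
```

Batching by window rather than emitting one event per packet keeps the downstream layers from being flooded by a busy stream, at the cost of losing the exact distribution of packets inside each window.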
Right now, FINs are modeled, but SYNs and the interesting TCP conditions aren't directly modeled. If you get retransmissions you'll notice because packet flow will decrease. However, I'd love to explicitly sound retransmissions. I also think a window filling as an application fails to read is important. I imagine either narrowing a band-pass filter to clamp the audio bandwidth available to a stream with a full window, or perhaps taking it in the other direction and adding an echo.
The next layer down tracks the energy of each stream. But that, and how I map energy into music, is the topic of the next post.

26 January 2017

John Goerzen: What is happening to America?

I still remember vividly my first visit to Europe, back in 2010. I had just barely gotten off a plane in Hamburg and on to a bus to Lubeck, and struck up a conversation with a friendly, well-educated German classical musician next to me. We soon started to discuss politics and religion. Over the course of the conversation, in response to his questions, I explained I had twice voted against George W. Bush, that I opposed the war in Iraq for many reasons, that I thought there was an ethical imperative to work to defeat climate change, that I viewed health care as an important ethical and religious issue, that I thought evolution was well-established, and that I am a Christian. Finally, without any hint of insult intended, and rather a lot of surprise written all over his face, he said: "Wow. You're an American, and a Christian, and you're so... normal!" This, it seems to me, has a lot to do with Trump.
Ouch
It felt like a punch to the gut. The day after the election, having known that a man that appeared to stand for everything that honorable people are against won the election, like people all around the world, I was trying to make sense of how this could happen. As I've watched since, as he stacks government with wealthy cronies with records nearly as colorful as his own, it is easy to feel even more depressed. Based on how Trump spoke and acted, it would be easy to conclude that the deplorables won the day; that he was elected by a contingent of sexists or racists ascendent in power. But that would be too simple an explanation. This is, after all, the same country that elected Barack Obama twice. There are many people that voted twice for a black man, and then for Trump. Why? Racism, while doubtless a factor, can't explain it all.
How Trump could happen
Russ Allbery made some excellent points recently:
[Many Americans are] hurt, and they're scared, and they feel like a lot of the United States just slammed the door in their faces. The status quo is not working for people. Technocratic government by political elites is not working for people. Business as usual is not working for people. Minor tweaks to increasingly arcane systems is not working for people. People are feeling lost in bureaucracy, disaffected by elections that do not present a clear alternate vision, and depressed by a slow slide into increasingly dismal circumstances. Government is not doing what we want it to do for us. And people are getting left behind. The left in the United States (of which I'm part) has for many years been very concerned about the way blacks and other racial minorities are systematically pushed to the margins of our economy, and how women are pushed out of leadership roles. Those problems are real. But the loss of jobs in the industrial heartland, the inability of a white, rural, working-class man to support his family the way his father supported him, the collapse of once-vibrant communities into poverty and despair: those problems are real too. The status quo is not working for anyone except for a few lucky, highly-educated people on the coasts. People, honestly, like me, and like many of the other (primarily white and male) people who work in tech. We are one of the few beneficiaries of a system that is failing the vast majority of people in this country.
Russ is, of course, right. The Democrats have been either complicit in policies damaging to many, or ineffective in preventing them. They have often appeared unconcerned with the plight of people outside cities (even if that wasn't really the case). And it goes deeper.
When's the last time you visited Kansas?
I live in Kansas. The nearest paved road is about a 3-mile drive from my home. The nearest town, population 600, is a 6-mile drive. My governor, whom I did not vote for, cut taxes on the wealthy so much that our excellent local schools have been struggling for years. But my community is amazing, full of loving and caring people, the sort of people who you know you'll be living with for 40 years, and so you make sure you get along well with.
I have visited tourist sites in Berlin, enjoyed an opera and a Broadway show in New York, taken a train across the country to Portland, explored San Francisco. I've enjoyed all of them. Many rural people do get out and experience the world.
I have been in so many conversations where I try to explain where I live to people that simply cannot fathom it. I have explained how the 18 acres I own is a very small amount where I am. How, yes, I do actually have electricity and Internet. How a bad traffic day is one where I have to wait for three cars to go past before turning onto the paved road. How I occasionally find a bull in my front yard, how I can walk a quarter mile and be at the creek on the edge of my property, how I can get to an airport faster than most New Yorkers and my kids can walk out the front door and play in a spot more peaceful than Central Park, and how all this is way cheaper than a studio apartment in a bad part of San Francisco.
It is rare indeed to see visitors actually traveling to Kansas as a destination.
People have no concept of the fact that my mechanic would drop everything and help me get my broken-down car to the shop for no charge, that any number of neighbors or uncles would bring a tractor and come plow the snow off my 1/4-mile driveway out of sheer kindness, that people around here really care for each other in a way you don't see in a city. There are people that I know see politics way differently than me, but I know them to be good people. They would also do anything for a person in need, no matter who they are. I may find the people that they vote for to be repugnant, but I cannot say "I've looked this person in the eyes and they are nothing but deplorable." And so, people in rural areas feel misunderstood. And they are right.

Some perspectives on Trump

As I've said, I do find Trump to be deplorable, but not everyone that voted for him is. How, then, do people wind up voting for him? The New Yorker had an excellent story about a man named Mark Frisbie, owner of a welding and fab shop. The recession had been hard on his business. His wife's day-care center also closed. Health care was hard to find, and the long, slow decline had spanned politicians of every stripe. Mark and his wife supposedly did everything they were supposed to: they worked hard, were honest, were entrepreneurial, and yet he had lost his business, his family house, his health coverage, everything. He doesn't want a handout. He wants to be able to earn a living. Asked who he'd vote for, he said, "Is none of the above an option?" The Washington Post had another insightful article, about a professor from Madison, WI interviewing people in rural areas. She said people would often say: "All the decisions are made in Madison and Milwaukee and nobody's listening to us. Nobody's paying attention, nobody's coming out here and asking us what we think. Decisions are made in the cities, and we have to abide by them."
She pushed back, hard, on the idea that Trump supporters are ignorant, and added that liberals that push that line of thinking are only making the problem worse. I would agree; seeing all the talk about universities dis-inviting speakers that don't hew to certain political views doesn't help either. A related article talks about the lack of empathy for Trump voters. And then we have a more recent CNN article, "Where Trump support and Obamacare use soar together", explaining in great detail how it can be logical for someone to be on Obamacare but not like it. We can all argue that the Republicans may have as much to do with that as anything, but the problem exists. And finally, a US News article makes this point:
His supporters realize he's a joke. They do not care. They know he's authoritarian, nationalist, almost un-American, and they love him anyway, because he disrupts a broken political process and beats establishment candidates who've long ignored their interests. When you're earning $32,000 a year and haven't had a decent vacation in over a decade, it doesn't matter who Trump appoints to the U.N., or if he poisons America's standing in the world, you just want to win again, whoever the victim, whatever the price. According to the Republican Party, the biggest threat to rural America was Islamic terrorism. According to the Democratic Party it was gun violence. In reality it was prescription drug abuse and neither party noticed until it was too late.
Are we leaving people out?

All this reminded me of reading about Donald Knuth, the famous computer scientist and something of the father of modern computing, writing about his feelings of trepidation about sharing with his university colleagues that he was working on a project related to the Bible. I am concerned about the complaints about the "PC culture", because I think it is good that people aren't making racist or anti-semitic jokes in public anymore. But, as some of these articles point out, in many circles, making fun of Christians and conservatives is still one of the accepted targets. Does that really help anything? (And as a Christian that is liberal, have all of you that aren't Christians so quickly forgotten how churches like the Episcopals blazed the way for marriage equality many years ago already?)

But they don't get a free pass

I have found a few things, however, absolutely scary. One was an article from December showing that Trump voters actually changed their views on Russia after Trump became the nominee. Another one from just today was a study on how people reacted when shown inauguration crowd photos. NPR ran a story today as well, on how Trump is treating journalists like China does. Chilling stuff indeed.

Conclusion

So where does this leave us? Heading into uncertain times, for sure, but perhaps, just maybe, with a greater understanding of our neighbors. Perhaps we will all be able to see past the rhetoric and polarization, and understand that there is something, well, normal about each other. Doing that is going to be the only way we can really take our country back.

2 January 2017

Shirish Agarwal: India Tourism, E-Visa and Hong Kong

A Safe and Happy New Year to all. While Debconf India is still a pipe-dream as of now, I did see that India has been gradually making it easier for tourists and casual business visitors to come visit India. This I take as a very positive development for India itself. The first condition is itself good for anybody visiting India:
Eligibility: International Travellers whose sole objective of visiting India is recreation, sight-seeing, casual visit to meet friends or relatives, short duration medical treatment or casual business visit.
https://indianvisaonline.gov.in/visa/tvoa.html

That this facility is being given to 130-odd countries is better still:
Albania, Andorra, Anguilla, Antigua & Barbuda, Argentina, Armenia, Aruba, Australia, Austria, Bahamas, Barbados, Belgium, Belize, Bolivia, Bosnia & Herzegovina, Botswana, Brazil, Brunei, Bulgaria, Cambodia, Canada, Cape Verde, Cayman Island, Chile, China, China- SAR Hong-Kong, China- SAR Macau, Colombia, Comoros, Cook Islands, Costa Rica, Cote d'Ivoire, Croatia, Cuba, Czech Republic, Denmark, Djibouti, Dominica, Dominican Republic, East Timor, Ecuador, El Salvador, Eritrea, Estonia, Fiji, Finland, France, Gabon, Gambia, Georgia, Germany, Ghana, Greece, Grenada, Guatemala, Guinea, Guyana, Haiti, Honduras, Hungary, Iceland, Indonesia, Ireland, Israel, Jamaica, Japan, Jordan, Kenya, Kiribati, Laos, Latvia, Lesotho, Liberia, Liechtenstein, Lithuania, Luxembourg, Madagascar, Malawi, Malaysia, Malta, Marshall Islands, Mauritius, Mexico, Micronesia, Moldova, Monaco, Mongolia, Montenegro, Montserrat, Mozambique, Myanmar, Namibia, Nauru, Netherlands, New Zealand, Nicaragua, Niue Island, Norway, Oman, Palau, Palestine, Panama, Papua New Guinea, Paraguay, Peru, Philippines, Poland, Portugal, Republic of Korea, Republic of Macedonia, Romania, Russia, Saint Christopher and Nevis, Saint Lucia, Saint Vincent & the Grenadines, Samoa, San Marino, Senegal, Serbia, Seychelles, Singapore, Slovakia, Slovenia, Solomon Islands, South Africa, Spain, Sri Lanka, Suriname, Swaziland, Sweden, Switzerland, Taiwan, Tajikistan, Tanzania, Thailand, Tonga, Trinidad & Tobago, Turks & Caicos Island, Tuvalu, UAE, Ukraine, United Kingdom, Uruguay, USA, Vanuatu, Vatican City-Holy See, Venezuela, Vietnam, Zambia and Zimbabwe.
This should make it somewhat easier for any Indian organizer as well as for participants from any of the countries listed. There is a possibility that this list will get even longer, provided we are able to scale our airports and all the other infrastructure needed for international visitors to have a good experience. What has been particularly interesting is to learn which ports of call are being used by international visitors, as well as the share of arrivals by source country:
The Percentage share of Foreign Tourist Arrivals (FTAs) in India during November, 2016 among the top 15 source countries was highest from USA (15.53%) followed by UK (11.21%), Bangladesh (10.72%), Canada (4.66%), Russian Fed (4.53%), Australia (4.04%), Malaysia (3.65%), Germany (3.53%), China (3.14%), France (2.88%), Sri Lanka (2.49%), Japan (2.49%), Singapore (2.16%), Nepal (1.46%) and Thailand (1.37%).
And by port of call:
The Percentage share of Foreign Tourist Arrivals (FTAs) in India during November 2016 among the top 15 ports was highest at Delhi Airport (32.71%) followed by Mumbai Airport (18.51%), Chennai Airport (6.83%), Bengaluru Airport (5.89%), Haridaspur Land check post (5.87%), Goa Airport (5.63%), Kolkata Airport (3.90%), Cochin Airport (3.29%), Hyderabad Airport (3.14%), Ahmadabad Airport (2.76%), Trivandrum Airport (1.54%), Trichy Airport (1.53%), Gede Rail (1.16%), Amritsar Airport (1.15%), and Ghojadanga land check post (0.82%).
The Ghojadanga land check post seems to be on the border between West Bengal, India and Bangladesh. Gede Railway Station is also in West Bengal. So overlanders could take any of those routes. Even the Haridaspur land check post is on the Bengal-Bangladesh border. Among the airports, Delhi Airport seems to be attracting a lot more business than Mumbai Airport. Part of the reason, I *think*, is the direct link from Delhi Airport to NDLS via the Delhi Airport Express Line. When the same happens in Mumbai, it should be a game-changer for the city too. Now if you are wondering why I have suddenly been talking about visas and airports in India, it is because Hong Kong is going to withdraw the visa-free entry facility for Indians. As rightly pointed out in the article, this doesn't make sense from an economic POV and seems to be somewhat politically motivated. Not that I or anybody else can do anything about that. Seeing that, I thought it was a good opportunity to see how good or bad our Government is doing, and it seems to be on the right path, although the hawks (Intelligence and Counter-Terrorist Agencies) will probably become a bit more paranoid as their work becomes tougher.
Filed under: Miscellenous Tagged: #Airport Metro Line 3, #CSIA, #Incredible India, #India, #International Tourism

24 December 2016

Shirish Agarwal: Trains, Planes and the future

Swacch Bharat - Indian Railways Copyright: Indian Express

Some of the content may be NSFW. Viewer discretion advised.

I have had a life-long fascination with trains. One of my first memories is that of a 5-7 year old clutching my mother's or grandmother's hand, seeing the steam engine lumbering down, whistling and smoking at the same time. I was both afraid and strangely drawn to the iron beast, and that was the first time I knew, and then slowly understood, that if we come with luggage and the steam engine comes, it means we are going to travel. I have travelled some, but there is a lot to explore still and I do hope that I cover some more of it during my lifetime. The reason I am writing about trains is an article which caught my eye a couple of days ago. Besides seeing the changing geography, the variety of food one can get on trains and in stations is one of the primary reasons that Indians love to travel by train. It is one place where you can have incredible conversations over a cup of tea or favourite food, and unlike air travel and the famed IFE (in-flight entertainment), people are actually pretty social even with all the gadgets. For those who are wondering, the author was travelling from Jamshedpur, Gujarat to Kolkata, a train ride which has now gone on my bucket list for the delectable items the author has described. To add to the above, it is still cheaper than air travel, although that is changing a bit as Indian Railways seeks to modernize and introduce world-class bullet trains. Indian Railways has a long, rich culture, and the interesting nuggets you learn over time add to the fascination of the Railways. For instance, I'm sharing this letter which I first read in a book and then saw in the New Delhi Railway Museum. The letter I am sharing below was written by a certain Shri Okhil Chandra Sen to the Sahibganj Railway Office in the year 1909, almost 38 years before India became independent.

I am arrive by passenger train Ahmedpur station and my belly is too much swelling with jackfruit.
I am therefore went to privy. Just I doing the nuisance that guard making whistle blow for train to go off and I am running with lotah in one hand and dhoti in the next when I am fall over and expose all my shocking to man and female women on plateform. I am got leaved at Ahmedpur station. This too much bad, if passenger go to make dung that dam guard not wait train five minutes for him. I am therefore pray your honour to make big fine on that guard for public sake. Otherwise I am making big report! to papers.

If it were not for Mr. Okhil Chandra Sen, we would still be running with a water bottle (an improvement) and jeans/shorts/whatever (again an improvement), while the possibility of falling over in a hurry would always be present. Now we do have toilets, and some of the better trains even have bio-toilets, which should make things better as well. (/NSFW)

For the plane bit, most of my flights have been domestic. Some of my most memorable flights were those from Mumbai under a clear sky, overlooking the Queen's Necklace, loving it, and landing in Bangalore during mist or rain or both. Delhi is also good as airports go, but there is nothing much adventurous about it. It was only with my first international flight that I felt the same feeling again: nervousness and a sense of adventure as you meet new people. Nowadays, every week I try to broaden my horizons by seeking out and learning a bit about international travel.
Copyright: National Geographic Magazine

In the course of this, I came across an article on the National Geographic site which also evoked similar feelings. While I can't go back to the past, and even if I did (in the distant past before I was born) I wouldn't want to improve my financial situation at all (as otherwise I would hit the Grandfather Paradox and/or the Butterfly effect, essentially saying there's no free lunch), it still makes you wonder about a time when people had a lot more adventure and a lot more moving parts. I do wish they had a much bigger snapshot of that plane so I could really see how people sat in the old aircraft. The low-resolution picture doesn't do justice to the poster and the idea of that time. See https://en.wikipedia.org/wiki/A_Sound_of_Thunder for an illustration of the Butterfly effect. The Grandfather Paradox has been seen plenty of times in fantasy movies like Back to the Future, Planet of the Apes and many others, so I will not go there. The average joe today has to navigate security, check bags, get processed through passport control, get a boarding pass, get to the gate on time, get to the aircraft via bridge or bus, get to the seat, somehow make it through the ascent, use the IFE and get snacks and meals till it's time to touch down, and then re-do the whole drill as many times as there are connections. I really admire Gunnar Wolf for the tenacity he showed for the x number of connections he made both ways.
The world's 10 best airports Copyright: Changi International Airport

Photo Courtesy Changi International Airport, Singapore

While leafing through the interweb today, I came across an article. While you can slice and dice the report any way you want, if I ever get a chance at international travel again, I would try to get a layover at these three airports, in order of preference (this is on the basis that none of these airports need a transit visa for the activities shared):

a. Changi International Airport: It is supposed to have shower amenities, a movie theatre (+1) and a free tour of the city (+1), and since many Indians go to Singapore as a destination in itself, it would have multiple vegetarian options (+2), so it would be nice if I need a layover.

b. Zurich Airport (ZRH): "For passengers with an extended layover, Zurich Airport offers bicycle and inline-skate rentals and excursions to the Swiss Museum of Transport Lucerne." (From business-insider.com.) While I'm not much of a bicycle and inline-skating freak, if the Swiss Museum of Transport Lucerne is anything on the scale of the Isiko Museum, which I shared in a blog post sometime before, it would be worth it by itself. I haven't tried to find the site, but I can imagine that if it has, for example, a full-scale model of a submarine or train engine, either steam engines or ones like SNCF or any of the other bullet trains, and early aircraft, it would just blow my mind. When you are talking about transport, there is so much science, business, logistics etc. that I'm sure I'll overload on information, photos and any trinkets they have for sale.

c. Central Japan International Airport (NGO): "It has a 1,000-foot-long sky deck where passengers can watch ships sail into Nagoya Port. There's also a traditional Japanese bathhouse where you can have a relaxing soak while watching the sunset over the bay." (BusinessInsider.com) Not a bad place to be if you need a layover. Just sink yourself in the bathhouse and see the bay and ships coming in. Luxury indeed.

Honourable mention:

d. Munich Airport (MUC): "A nearby visitors park features mini golf and a display of historic aircraft." (Business-Insider.com) Now this would have made my list, but I guess one would need a Schengen visa to access the visitors park, and if you have that, then why just stay in the airport? You could travel through Europe itself and have a longish stop-over.

So all in all, it's indeed a fascinating time to be alive, dreaming and just being. Till later.

Update: I had forgotten to share one more reason why I was writing this article. Although somewhat of a cynic, I am hopeful that the Pune metro happens. Also, if I had just waited a day, I would have been able to add a couple of wonderful articles that would make people wanderlust more.
Filed under: Miscellenous Tagged: #Best Airports, #Central Japan International Airport, #Changi International Airport, #Food, #Loo, #Nostalgia, #NSFW, #Planes, #Steam Engine, #Trains, #Zurich Airport, Indian Railways, memories
