Search Results: "cesar"

23 November 2021

Enrico Zini: Really lossy compression of JPEG

Suppose you have a tool that archives images, or scientific data, and it has a test suite. It would be good to collect sample files for the test suite, but they are often so big one can't really bloat the repository with them. But does the test suite need everything that is in those files? Not necesarily. For example, if one's testing code that reads EXIF metadata, one doesn't care about what is in the image. That technique works extemely well. I can take GRIB files that are several megabytes in size, zero out their data payload, and get nice 1Kb samples for the test suite. I've started to collect and organise the little hacks I use for this into a tool I called mktestsample:
$ mktestsample -v samples1/*
2021-11-23 20:16:32 INFO common samples1/cosmo_2d+0.grib: size went from 335168b to 120b
2021-11-23 20:16:32 INFO common samples1/grib2_ifs.arkimet: size went from 4993448b to 39393b
2021-11-23 20:16:32 INFO common samples1/polenta.jpg: size went from 3191475b to 94517b
2021-11-23 20:16:32 INFO common samples1/test-ifs.grib: size went from 1986469b to 4860b
Those are massive savings, but I'm not satisfied about those almost 94Kb of JPEG:
$ ls -la samples1/polenta.jpg
-rw-r--r-- 1 enrico enrico 94517 Nov 23 20:16 samples1/polenta.jpg
$ gzip samples1/polenta.jpg
$ ls -la samples1/polenta.jpg.gz
-rw-r--r-- 1 enrico enrico 745 Nov 23 20:16 samples1/polenta.jpg.gz
I believe I did all I could: completely blank out image data, set quality to zero, maximize subsampling, and tweak quantization to throw everything away. Still, the result is a 94Kb file that can be gzipped down to 745 bytes. Is there something I'm missing? I suppose JPEG is better at storing an image than at storing the lack of an image. I cannot really complain :) I can still commit compressed samples of large images to a git repository, taking very little data indeed. That's really nice!

21 September 2020

Kees Cook: security things in Linux v5.7

Previously: v5.6 Linux v5.7 was released at the end of May. Here s my summary of various security things that caught my attention: arm64 kernel pointer authentication
While the ARMv8.3 CPU Pointer Authentication (PAC) feature landed for userspace already, Kristina Martsenko has now landed PAC support in kernel mode. The current implementation uses PACIASP which protects the saved stack pointer, similar to the existing CONFIG_STACKPROTECTOR feature, only faster. This also paves the way to sign and check pointers stored in the heap, as a way to defeat function pointer overwrites in those memory regions too. Since the behavior is different from the traditional stack protector, Amit Daniel Kachhap added an LKDTM test for PAC as well. BPF LSM
The kernel s Linux Security Module (LSM) API provide a way to write security modules that have traditionally implemented various Mandatory Access Control (MAC) systems like SELinux, AppArmor, etc. The LSM hooks are numerous and no one LSM uses them all, as some hooks are much more specialized (like those used by IMA, Yama, LoadPin, etc). There was not, however, any way to externally attach to these hooks (not even through a regular loadable kernel module) nor build fully dynamic security policy, until KP Singh landed the API for building LSM policy using BPF. With this, it is possible (for a privileged process) to write kernel LSM hooks in BPF, allowing for totally custom security policy (and reporting). execve() deadlock refactoring
There have been a number of long-standing races in the kernel s process launching code where ptrace could deadlock. Fixing these has been attempted several times over the last many years, but Eric W. Biederman and Ernd Edlinger decided to dive in, and successfully landed the a series of refactorings, splitting up the problematic locking and refactoring their uses to remove the deadlocks. While he was at it, Eric also extended the exec_id counter to 64 bits to avoid the possibility of the counter wrapping and allowing an attacker to send arbitrary signals to processes they normally shouldn t be able to. slub freelist obfuscation improvements
After Silvio Cesare observed some weaknesses in the implementation of CONFIG_SLAB_FREELIST_HARDENED s freelist pointer content obfuscation, I improved their bit diffusion, which makes attacks require significantly more memory content exposures to defeat the obfuscation. As part of the conversation, Vitaly Nikolenko pointed out that the freelist pointer s location made it relatively easy to target too (for either disclosures or overwrites), so I moved it away from the edge of the slab, making it harder to reach through small-sized overflows (which usually target the freelist pointer). As it turns out, there were a few assumptions in the kernel about the location of the freelist pointer, which had to also get cleaned up. RISCV page table dumping
Following v5.6 s generic page table dumping work, Zong Li landed the RISCV page dumping code. This means it s much easier to examine the kernel s page table layout when running a debug kernel (built with PTDUMP_DEBUGFS), visible in /sys/kernel/debug/kernel_page_tables. array index bounds checking
This is a pretty large area of work that touches a lot of overlapping elements (and history) in the Linux kernel. The short version is: C is bad at noticing when it uses an array index beyond the bounds of the declared array, and we need to fix that. For example, don t do this:
int foo[5];
...
foo[8] = bar;
The long version gets complicated by the evolution of flexible array structure members, so we ll pause for a moment and skim the surface of this topic. While things like CONFIG_FORTIFY_SOURCE try to catch these kinds of cases in the memcpy() and strcpy() family of functions, it doesn t catch it in open-coded array indexing, as seen in the code above. GCC has a warning (-Warray-bounds) for these cases, but it was disabled by Linus because of all the false positives seen due to fake flexible array members. Before flexible arrays were standardized, GNU C supported zero sized array members. And before that, C code would use a 1-element array. These were all designed so that some structure could be the header in front of some data blob that could be addressable through the last structure member:
/* 1-element array */
struct foo  
    ...
    char contents[1];
 ;
/* GNU C extension: 0-element array */
struct foo  
    ...
    char contents[0];
 ;
/* C standard: flexible array */
struct foo  
    ...
    char contents[];
 ;
instance = kmalloc(sizeof(struct foo) + content_size);
Converting all the zero- and one-element array members to flexible arrays is one of Gustavo A. R. Silva s goals, and hundreds of these changes started landing. Once fixed, -Warray-bounds can be re-enabled. Much more detail can be found in the kernel s deprecation docs. However, that will only catch the visible at compile time cases. For runtime checking, the Undefined Behavior Sanitizer has an option for adding runtime array bounds checking for catching things like this where the compiler cannot perform a static analysis of the index values:
int foo[5];
...
for (i = 0; i < some_argument; i++)  
    ...
    foo[i] = bar;
    ...
 
It was, however, not separate (via kernel Kconfig) until Elena Petrova and I split it out into CONFIG_UBSAN_BOUNDS, which is fast enough for production kernel use. With this enabled, it's now possible to instrument the kernel to catch these conditions, which seem to come up with some regularity in Wi-Fi and Bluetooth drivers for some reason. Since UBSAN (and the other Sanitizers) only WARN() by default, system owners need to set panic_on_warn=1 too if they want to defend against attacks targeting these kinds of flaws. Because of this, and to avoid bloating the kernel image with all the warning messages, I introduced CONFIG_UBSAN_TRAP which effectively turns these conditions into a BUG() without needing additional sysctl settings. Fixing "additive" snprintf() usage
A common idiom in C for building up strings is to use sprintf()'s return value to increment a pointer into a string, and build a string with more sprintf() calls:
/* safe if strlen(foo) + 1 < sizeof(string) */
wrote  = sprintf(string, "Foo: %s\n", foo);
/* overflows if strlen(foo) + strlen(bar) > sizeof(string) */
wrote += sprintf(string + wrote, "Bar: %s\n", bar);
/* writing way beyond the end of "string" now ... */
wrote += sprintf(string + wrote, "Baz: %s\n", baz);
The risk is that if these calls eventually walk off the end of the string buffer, it will start writing into other memory and create some bad situations. Switching these to snprintf() does not, however, make anything safer, since snprintf() returns how much it would have written:
/* safe, assuming available <= sizeof(string), and for this example
 * assume strlen(foo) < sizeof(string) */
wrote  = snprintf(string, available, "Foo: %s\n", foo);
/* if (strlen(bar) > available - wrote), this is still safe since the
 * write into "string" will be truncated, but now "wrote" has been
 * incremented by how much snprintf() *would* have written, so "wrote"
 * is now larger than "available". */
wrote += snprintf(string + wrote, available - wrote, "Bar: %s\n", bar);
/* string + wrote is beyond the end of string, and availabe - wrote wraps
 * around to a giant positive value, making the write effectively 
 * unbounded. */
wrote += snprintf(string + wrote, available - wrote, "Baz: %s\n", baz);
So while the first overflowing call would be safe, the next one would be targeting beyond the end of the array and the size calculation will have wrapped around to a giant limit. Replacing this idiom with scnprintf() solves the issue because it only reports what was actually written. To this end, Takashi Iwai has been landing a bunch scnprintf() fixes. That's it for now! Let me know if there is anything else you think I should mention here. Next up: Linux v5.8.

2020, Kees Cook. This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 License.
CC BY-SA 4.0

8 April 2015

Gunnar Wolf: Guests in the classroom: C sar Y ez (@caesarcomptus) talks about memory assignation algorithms

Once again, on March 11 I had a great guest to save me some work and give a talk at my class! This time it was C sar Y ez, and he talked about memory management algorithms, emphasizing on ARC. The video is available, both at this server and in archive.org. Thanks a lot!

6 December 2013

Gunnar Wolf: For people in Mexico: Workshop next Wednesday! Video editing from the command line (by Chema Serralde, @joseserralde)

(Yes, yes... Maybe I should post in Spanish.. But hey, gotta keep consistecy in my blog!) General, public, open invitation Are you in Mexico City, or do you plan to be next Wednesday (December 11)? Are you interested in video edition? In Free Software? I will have the pleasure to host at home the great Chema Serralde, a good friend, and a multifacetic guru both in the technical and musical areas. He will present a workshop: Video editing from the command line. I asked Chema for an outline of his talk, but given he is a busy guy, I will basically translate the introduction he prepared for this same material in FSL Vallarta, held two weeks ago.
With the help of the commandline, you can become a multimedia guru. We will edit a video using just a terminal. This skill will surprise your friends and your couple. But the most important is that this knowledge is just an excuse to understand step by step what does a video CODEC mean, what is a FORMAT, and how video and audio editors work; by using this knowledge, you will be able to set the basis for multimedia editing, without the promises and secrets of propietary editors. How much does my file weigh and why? How to improve a video file's quality? Why cannot I read my camera's information from GNU/Linux? By the end of this workshop, we well see how some libraries help you develop your first audio and video application, what are their main APIs and uses.
Logistics Everybody is welcome to come for free, no questions asked, no fees collected. I can offer coffee for all, but if you want anything else to eat/drink, you are welcome to bring it. We do require you to reserve and confirm your place (mail me to my usual mail address). We have limited space, and I must set an absolute quota of 10 participants. Some people hide their address... Mine is quite publicly known: Av. Copilco 233, just by Parque Hugo Margain, on the Northern edge of UNAM (Metro Copilco). The course starts at 16:00, and lasts... As long as we make it last ;-) So, that said... See you there! :-D [update]: Chema sent me the list of topics he plans to cover. Copy/pasting from his mail, in Spanish:
TALLER REL MPAGO DE EDICI N AUDIOVISUAL EN L NEA DE COMANDO
Jos Mar a Serralde Ruiz, facilitador
  1. Editando como cavern cola.
    1. Manipulaci n b sica de archivos multimedia en entornos POSIX.
    2. S un Bash VJ (videojockey)
    3. Vaciando y entubando
  2. Editando como cient fico.
    1. Encabezados y fourcc
    2. 3 familias de CODECS de v deo y sus patentes
    3. 3 famlias de CODECS de audio y sus patentes
    4. Muxers, demuxers y muxes.
  3. Editando como artista.
    1. Cajas de herramientas en software libre para procesamiento de v deo.
    2. Procesamiento en tiempo real de v deo (el que se crea artista pierde)
    3. Derritiendo v deo, audio con calcetines MELT + SOX
Software necesario (sistemas operativos POSIX, windouseros acercarse con el af n de repensar sus vidas): mplayer, avconv/ffmpeg (libavcodec), melt, sox, imagemagick

14 October 2013

Gunnar Wolf: C sar Y ez (@caesarcomptus) in the classroom: Process scheduling

I try to have some guests every now and then to my Operating Systems class. The class is not as practical/interactive as I'd like, and having some people show the students how the subjects I teach are reflected in the real world is, I feel, very useful for them to understand the topics' importance. The past semester (the first one for me) I had three guests: Chema Serralde, talking about process scheduling and in particular on the importance of real time, from his perspective as a musician, Rolando Cedillo, talking about the early stages of the boot process, and C sar Y ez, giving a review of file systems. This semester, there have been two guests so far: Felipe Esquivel, who spoke about parallelism, and used renders with Blender to illustrate the speed gains and limitations (i.e. Amdahl's law), and this last Thursday, I invited again C sar Y ez. C sar spoke about process scheduling, first giving a quite thorough review of what had taken me at least three sessions to go through, and second, giving some in-depth review based on his experience with Haiku OS. What else was different this time? I told our coordinator in the faculty, and she invited the other teachers of the subject (and attended herself). So, instead of the usual ~25 students, we have ~40 people in the classroom. And one of them, Adolfo, recorded most of C sar's explanation. Yay! Of course, I asked Adolfo for a copy of his recording, and recoded it in a format more suitable for Web viewing. Here it is (almost 300MB, Ogg Video, ~95 minutes). I still have the original video file given to me, in an Apple-generated badly-compressed .mov, but at over 1.5GB, it's too much for a Web download. I will try to record future sessions, as they will surely be useful!

26 September 2011

Gunnar Wolf: e-voting: Something is brewing in Jalisco...

There's something brewing, moving in Jalisco (a state in Mexico's West, where our second largest city, Guadalajara, is located). And it seems we have an opportunity to participate, hopefully to be taken into account for the future. Ten days ago, I was contacted by phone by the staff of UDG Noticias, for an interview on the Universidad de Guadalajara radio station. The topic? Electronic voting. If you are interested in what I said there, you can get the interview from my webpage. I held some e-mail contact with the interviewer, and during the past few days, he sent me some links to notes in the La Jornada de Jalisco newspaper, and asked for my opinion on them: On September 23, a fellow UNAM researcher, C sar Astudillo, claims the experience in three municipalities in Jalisco prove that e-voting is viable in the state, and today (September 26), third generation of an electronic booth is appearingly invulnerable. Of course, I don't agree with the arguments presented (and I'll reproduce the mails I sent to UDG Noticias about it before my second interview just below They are in Spanish, though). However, what I liked here is that it does feel like a dialogue. Their successive texts seem to answer to my questioning. So, even though I cannot yet claim this is a real dialogue (it would be much better to be able to sit down face to face and have a fluid conversation), it feels very nice to actually be listened to from the other side! My answer to the first note:
El tema de las urnas electr nicas sigue dando de qu hablar por ac en Jalisco... nosotros en Medios UDG hemos presentado distintas voces como la del Dr. Gabriel Corona Armenta, que est a favor del voto electr nico, del Dr. Luis Antonio Sobrado, magistrado presidente del tribunal supremo de elecciones de Costa Rica, quien nos habl sobre los 20 MDD que les cuesta implementar el sistema por lo que no lo han logrado hasta el momento, pudimos hablar hasta argentina con Federico Heinz y su rotunda oposici n al voto electr nico y por supuesto la entrevista que le realizamos a usted. Sin embargo este d a La Jornada Jalisco publica la siguiente nota http://www.lajornadajalisco.com.mx/2011/09/23/index.php?section=politica... nos gustar a saber cu l es su punto de vista al respecto, quedo a la espera de su respuesta
Hola, Pues... Bueno, s que el IFE hizo un desarrollo muy interesante y bien hecho hace un par de a os, dise ando desde cero las urnas que propon an emplear, pero no se instrumentaron fuera de pilotos (por cuesti n de costos, hasta donde entiendo). Se me hace triste y peligroso que el IEPC de Jalisco est proponiendo, teniendo ese antecedente, la compra de tecnolog a prefabricada, y confiando en lo que les ofrece un proveedor. Se me hace bastante iluso, directamente, lo que propone el t tulo: comicios en tres municipios prueban la viabilidad del voto electr nico en todo el estado . Pong moslo en estos t rminos: El que no se caiga una choza de l mina con estructura de madera demuestra que podemos construir rascacielos de l mina con estructura de madera? Ahora, un par de p rrafos que me llaman la atenci n de lo que publica esta nota de La Jornada:
la propuesta de realizar la elecci n en todo el estado con urnas electr nicas que desea llevar a cabo el Instituto Electoral y de Participaci n Ciudadana (IEPC) es viable, pues los comicios realizados en tres municipios son pruebas suficientes para demostrar que la urna es fiable
y algunos p rrafos m s adelante,
Cu ntas experiencias m s se necesitan para saber si es confiable, 20, 30, no lo s (...) Pero cuando se tiene un diagn stico real, efectivo y serio de cu ndo t cnicamente procede, se puede tomar la decisi n
Como lo menciono en mi art culo... No podemos confundir a la ausencia de evidencia con la evidencia de ausencia. Esto es, que en un despliegue menor no haya habido irregulares no significa que no pueda haberlas. Que haya pa ses que operan 100% con urnas electr nicas no significa que sea el camino a seguir. Hay algunas -y no pocas- experiencias de fallas en diversos sentidos de urnas electr nicas, y eso demuestra que no puede haber confianza en las implementaciones. Aunque el equipo nos saliera gratis (que no es el caso), hay que invertir recursos en su resguardo y mantenimiento. Aunque se generara un rastro impreso verificado por el votante (que s lo ha sido el caso en una peque a fracci n de las estacione de votaci n), nada asegura que los resultados reportados por el equipo sean siempre consistentes con la realidad. El potencial para mal uso que ofrecen es demasiado. Saludos,
And to September 26th:
Disculpe que lo molestemos otra vez, pero este d a fue publicada otra nota m s sobre el tema de las Urnas electr nicas en Jalisco donde se asegura que la urna es invulnerable. http://www.lajornadajalisco.com.mx/2011/09/26/index.php?section=politica... nos podr a conceder unos minutos para hablar con usted, como la vez pasada, v a telef nica sobre el caso espec fico de Jalisco, en referencia a estas notas publicadas recientemente? si es posible podr a llamarle este d a a las 2 pm? Quedo a la espera de su respuesta agradeci ndole su ayuda, apreciamos mucho esta colaboraci n que est haciendo con nosotros
Hola, ( ) Respecto a esta nota: Nuevamente, ausencia de evidencia no es evidencia de ausencia. Se le permite a un peque o segmento de personas jugar con una m quina. Significa eso que fue una prueba completa, exhaustiva? No, s lo que ante un jugueteo casual no pudieron encontrar fallos obvios y graves. Un verdadero proceso que brindara confianza consistir a en (como lo hicieron en Brasil - Y resultaron vulnerables) convocar a la comunidad de expertos en seguridad en c mputo a hacer las pruebas que juzguen necesarias teniendo un nivel razonable de acceso al equipo. Adem s, la seguridad va m s all de modificar los resultados guardados. Un par de ejemplos que se me ocurren sin darle muchas vueltas:
  • Qu pasa si meto un chicle a la ranura lectora de tarjeta magn tica?
  • Qu pasa si golpeo alguna de las teclas lo suficiente para hacerla un poquito menos sensible sin destruirla por completo? (o, ya entrados en gastos, si la destruyo)
La negaci n de servicio es otro tipo de ataque con el cual tenemos que estar familiarizados. No s lo es posible modificar el sentido de la votaci n, sino que es muy f cil impedir que la poblaci n ejerza su derecho. Qu har an en este caso? Bueno, podr an caer de vuelta a votaci n sobre papel - Sobre hojas de un block, probablemente firmadas por cada uno de los funcionarios, por ejemplo. Pero si un atacante bloque la lectura de la tarjeta magn tica, que es necesaria para que el presidente de casilla la marque como cerrada, despoj de su voto a los usuarios. S , se tienen los votos impresos (que, francamente, me da mucho gusto ver que esta urna los maneja de esta manera). El conteo es posible, aunque un poco m s inc modo que en una votaci n tradicional (porque hay que revisar cu les son los que est n marcados como invalidados - no me queda muy claro c mo es el escenario del elector que vot por una opci n, se imprimi otra, y el resultado fue corregido y marcado como tal)... Pero es posible. Sin embargo, y para cerrar con esta respuesta: Si hacemos una corrida de prueba, en circunstancias controladas, obviamente no se notar n los much simos fallos que una urna electr nica puede introducir cuando los "chicos malos" son sus programadores. Podemos estar seguro que este marcador Atlas-Chivas-Cruz Azul tenga el mismo ndice de fiabilidad como una elecci n de candidatos reales, uno de los cuales puede haberle pagado a la empresa desarrolladora para manipular la elecci n? Y a n si el proceso fuera perfecto, indican aqu que est n _intentando_ licitar estas urnas (y nuevamente, si lo que menciona esta nota es cierto, son de las mejores urnas disponibles, y han atendido a muchos de los se alamientos - Qu bueno!)... Para qu ? Qu nos van a dar estas urnas, qu va a ganar la sociedad? Mayor rapidez? Despreciable - Media hora de ganancia. A cambio de cu nto dinero? Mayor confiabilidad? Me queda claro que no, siendo que no s lo somos cuatro trasnochados los que ponemos su sistema en duda, sino que sus mismos proponentes apuntan a la duda generalizada. La frase con la que cierra la nota se me hace digna para colgar un ep logo: "en ese futuro quiz no tan distante la corrupci n tambi n ocurre y sta se debe siempre al factor humano". Y el factor humano sigue ah . Las urnas electr nicas son programadas por personas, por personas falibles. Sin importar del lado que est n, recordar n la pol mica cuando se hizo p blico que la agregaci n de votos en el 2006 fue supervisada por la empresa Hildebrando, propiedad del cu ado del entonces candidato a la presidencia Felipe Calder n. Qu evita que caigamos en un escenario similar, pero ampliamente distribu do? Y aqu hay que referirnos a la sentencia de la Suprema Corte de Alemania: En dicho pa s, las votaciones electr nicas fueron declaradas anticonstitucionales porque s lo un grupo de especialistas podr an auditarlas. Una caja llena de papeles con la evidencia clara del sentido del voto de cada participante puede ser comprendida por cualquier ciudadano. El c digo que controla a las urnas electr nicas, s lo por un peque o porcentaje de la poblaci n.

5 August 2011

Rudy Godoy: Compute Clusters Integration for Debian Development and Building Report 4

Hello, this is the fourth report for the project. First, I d like to
apologize for being late, I took a short trip over the weekend and
missed the deadline. This time I ve good news for my project. As
expected I ve made the necesary changes to have Eucalyptus support ARM
images. The approach was extending and exposing an arch field that
will be used by libvirt s XML domain file. I m working currently on
having the involved functions working with the new extension and I
expect to have this done over the middle of the next week. What was done so far

- Extended the ncInstance_t struct on util/data.h adding the archId
parameter. This struct keeps the information regarding a node (nc),
say RAM, Kernel, etc. Currently it has no way to set an arch in order
to be used later. typdef struct ncInstance_t
...
char reservationId[CHAR_BUFFER_SIZE];
char userId[CHAR_BUFFER_SIZE];
char archId[CHAR_BUFFER_SIZE]; // arch field added
int retries;
...

The Eucalyptus team was OK with this approach that I ve proposed few
weeks ago. For all my changes the approach is maintain compatibility
with existing features.
We still have to define if we are setting a default value, say amd64,
or keep it unset until the user sets one. For now we will be supporting
amd64 and arm as valid fields. - Modified the allocate_intance() function to support the archId
field. This involved adding a new parameter and detecting whether its
present and setting de value accordingly. This function is later used
by the doRunInstance() function in node/handlers_kvm.c - Modified the involved functions that are called by the KVM handlers
and result in passing parameters to the XML file generator for being
used by libvirt s and KVM and have the image running under
Eucalyptus. The new parameters and field are not required and only
being used if they are set with a value. The archId parameter has been
added as the last one, so existing function calls can work. This could
be rearranged later, but will involve a more extensive revision with
Eucalyptus developers. - I ve set a git local repo in order to keep track of the changes to
Eucalyptus. Since this part is a sort of a Eucalyptus branch I took
this approach. Steffen and I are discussing whether to release this as
a pkg-eucalyptus branch (as dpatch, using existing packaging). Either
way I m putting the diff files form my git repo over the wiki and will
put them on my site too. I plan to use git-svn in order to send them
later to the pkg-eucalyptus repo once we are settled with this. - Updated the project s wiki[1] page with information regarding all that s
involved on my work, so it can be re-played or resumed later. 1- http://wiki.debian.org/SummerOfCode2011/BuildWithEucalyptus/ProjectLog What I plan to to over the next weeks
-
- Finish the modifications to the relevant functions. Expected dealine:
next week.
- Do integration tests:
Write a test program, there s an example under node/ that could be
helpful.
Generate the XML domain definition from Eucalyptus functions. The
full cycle: handler_kvm -> allocate_instance -> get_instance_xml ->
have the XML file properly generated -> qemu-kvm runs the instance
using the proper emulator (arm in our case).
- Test runnning the image I ve worked on the project s begining.
- Fixing bits and cooperate with the pkg-eucalyptus team on the
packaging bits.
- Talk to Eucalyptus so they adopt our patches.
- Rock and roll :) What would be nice to have after the project ends
-
- The Debian packaging *needs* work. I ve already discussed this on the
pkg-eucalyptus mailing list. I ve also committed myself to help on
that side even after the SoC ends.
- Talk with upstream and Ubuntu to coordinate the packaging efforts and
modifications to the software.
- Get adopters to use our work :) By now, few weeks about to finish the SoC, I can say that it s been a
good experience and as I stated I m planning to keep maintaing/working
on the software after that. I ve already joined the pkg-eucalyptus team
and plan to contribute actively. I m sad I didn t managed to get Debconf
this year, time was a constraint again, hopefully we can meet in 2012! I
guess that all by now! Best regards,
Rud

5 July 2011

Rudy Godoy: Compute Clusters Integration for Debian Development and Testing Report 3

Hello this is the third report regarding my project. As usual, quick
summary first. From my last report I have progressed much little than I
would like but, as stated previously also, I m finishing semester
and between exams and projects it takes almost all my time. However, I m
free of academic duties next week and I ll double the speed on the project. What was done so far

- Started working on the modifications to the Eucalyptus code, as
expected.
I ve almost complete the code for the Libvirt s XML Domain
definition. I ve abstracted the wrapper in order to support a broader
number of arches. Previously it was tailored for supporting x86/amd64
based machines. The idea is to have a list of supported arches and
generate the XML file according that. For ARM some parameters and
elements are different than for the others. This would allow for
people to support another architectures, given they are well supported
by KVM-Qemu.
Started working on changes to util/data.c to make sure the correct
parameters are being passed to the XML generator and to libvirt kvm
calls.
- Define and start to setup the testing / demo environment. Basically
involves having one machine to run KVM as hypervisor so the Eucalyptus
handler uses it to run ARM kvm-qemu images.
- Over this weekend I ll be publising the code changes on my
website. Steffen suggested to create some docs regarding the progress
too. I ll do that once I feel I m on track again for my current
pending work, that s coding. What I ll be working over the next two weeks

- My primary goal is to catch up on the work pace and finish the neccesary
code for Eucalyptus to support ARM images. This means to finish
modifications to:
node/handlers_kvm.c handling nc creation. Little changes are need
here from what I ve indentified already.
util/data. c,h handling instance creation. This is the part where
most of the work shall be done. I ve started to work here already.
- Test the code I ve complete already and send the patches to the teams
involved. To date I m still working on a local environment, because no
usable patches have been generated already, just little pieces here
and there.
- Since this is a sort of gluing project, that is code here and there
and make it work together. I m in the process of finding a machine to
be able to test parts of it once they are complete. I m still
contacting people to get such machine, in anycase I gues I can manage
to get one myself. I can t use my current machine since it involves
reinstalling and working at the OS level.
- For the project s demo I ll use both machines, one as the front-end
and the kvm one for NCs (nodes). That s the goal. What is required after that

- I d like to release a package that includes my changes, so it can be
installed either from official packages or my own packages, this
involves:
Updating the packaging bits to the pkg-Eucalyptus repo.
Talking to upstream about it.
- Write the reference Steffen suggested. I believe that I can finish the first part in one week, because of the
work I ve done in the past weeks to understand the Eucalyptus code. I m
also will be talking more to the Eucalyptus team and also to the
maintainers about such changes. I guess that s all for now.

22 January 2011

Enrico Zini: Match package names across distributions

Match package names across distributions What would happen if we had a quick and reliable way to match package names across distributions? These ideas came up at the appinstaller2011 meeting: We thought they were good ideas, so we started hacking. To try it, you need to get the code and build the index first:
git clone git://git.debian.org/users/enrico/distromatch.git
cd distromatch
# Careful: 90Mb
wget http://people.debian.org/~enrico/dist-info.tar.gz
tar zxf dist-info.tar.gz
# Takes a long time to do the indexing
./distromatch --reindex --verbose
Then you can query it this way:
./distromatch $DISTRO $PKGNAME [$PKGNAME1 ...]
This would give you, for the package $PKGNAME in $DISTRO, the corresponding package names in all other distros for which we have data. If you do not provide package names, it automatically shows output for all packages in $DISTRO. For example:
$ time ./distromatch debian libdigest-sha1-perl
debian:libdigest-sha1-perl fedora:perl-Digest-SHA1
debian:libdigest-sha1-perl mandriva:perl-Digest-SHA1
debian:libdigest-sha1-perl suse:perl-Digest-SHA1
real    0m0.073s
user    0m0.056s
sys 0m0.016s
Yes it's quick. It builds a Xapian index with the information it needs, and then it reuses it. As soon as I find a moment, I intend to deploy an instance of it on DDE. It is using a range of different heuristics: This list may get obsolete soon as more heuristics get implemented. Euristics will never cover all corner cases we surely have, but the idea is that if we can match a sizable amout of packages, the rest can be somehow fixed by hand as needed. The data it requires for a distribution should be rather straightforward to generate:
  1. a file which maps binary package names to source package names
  2. a file with the list of files in all the packages
For example:
$ ls -l dist-debian/
total 39688
-rw-r--r--  1 enrico enrico  1688249 Jan 20 17:37 binsrc
drwxr-xr-x  2 enrico enrico     4096 Jan 21 19:12 db
-rw-r--r--  1 enrico enrico 29960406 Jan 21 10:02 files.gz
-rw-r--r--  1 enrico enrico  8914771 Jan 21 18:39 interesting-files
$ head dist-debian/binsrc 
openoffice.org-dev openoffice.org
ext4-modules-2.6.32-5-4kc-malta-di linux-kernel-di-mipsel-2.6
linux-headers-2.6.30-2-common linux-2.6
libnspr4 nspr
ipfm ipfm
libforks-perl libforks-perl
med-physics debian-med
libntfs-3g-dev ntfs-3g
libguppi16 guppi
selinux selinux
$ zcat dist-debian/files.gz   head
memstat etc/memstat.conf
memstat usr/bin/memstat
memstat usr/share/doc/memstat/changelog.gz
memstat usr/share/doc/memstat/copyright
memstat usr/share/doc/memstat/memstat-tutorial.txt.gz
memstat usr/share/man/man1/memstat.1.gz
libdirectfb-dev usr/bin/directfb-config
libdirectfb-dev usr/bin/directfb-csource
libdirectfb-dev usr/include/directfb-internal/core/clipboard.h
libdirectfb-dev usr/include/directfb-internal/core/colorhash.h
interesting-files and db are generated when indexing. To prove the usefulness of the idea (but does it need proving?), you can find in the same git repo a little example app (it took me 10 minutes to write it), that uses the distromatch engine to export Debtags tags to other distributions:
$ ./exportdebtags fedora   head
memstat: admin::benchmarking, interface::commandline, role::program, use::monitor
libdirectfb-dev: devel::lang:c, devel::library, implemented-in::c, interface::framebuffer, role::devel-lib
libkonqsidebarplugin4a: implemented-in::c++, role::shared-lib, suite::kde, uitoolkit::qt
libemail-simple-perl: devel::lang:perl, devel::library, implemented-in::perl, role::devel-lib, role::shared-lib, works-with::mail
libpoe-component-pluggable-perl: devel::lang:perl, devel::library, implemented-in::perl, role::shared-lib
manpages-ja: culture::japanese, made-of::man, role::documentation
libhippocanvas-dev: devel::library, qa::low-popcon, role::devel-lib
libexpat-ocaml-dev: devel::lang:ocaml, devel::library, implemented-in::c, implemented-in::ocaml, role::devel-lib, works-with-format::xml
libgnutls-dev: devel::library, role::devel-lib, suite::gnu
Just in case this made you itch to play with Debtags in a non-Debian distribution, I've generated the full datasets for Fedora, Mandriva and OpenSUSE. Others have been working on the same matching problem. After we started writing code we started to become aware of existing work: I'd like to make use of those efforts, maybe to cross-validate results, maybe even better as yet another heuristics.

15 February 2010

Bdale Garbee: TeleMetrum v0.2 First Test Flight

On Saturday, I joined The Albuquerque Rocket Society monthly launch in Rio Rancho, NM. A friend, Mike, who lives in the area joined me for the launch. While the morning started off clear and calm, if a bit cold... the wind came up hard and we had to call it quits before lunch. But before the wind "blew us away", I managed to get one flight in. And it was an absolutely perfect test of one of my brand-new TeleMetrum v0.2 boards! My cut-down Hawk Mountain "Raptor" kit, renamed "G-Spot" last October during my quest to exceed 50 g acceleration, was loaded with TeleMetrum serial number 51... and launched on a Cesaroni 229H255WT-14A motor. The ascent was beautiful! I've put a few photos of the rocket leaving the launch rail up on flickr. However, despite a clear sky, we quickly lost sight of it! I managed to spot a bit of the smoke trail from the delay grain as the rocket approached apogee, but that was it! None of us at the launch saw anything after apogee! After losing sight of the rocket, I turned my attention to my computer, where we were receiving a solid telemetry stream. It quickly became apparent that the rocket was descending normally under chute. As it got closer to the ground, I started calling out elevation, azimuth, and distance numbers, but still nobody could spot the rocket. As expected, we lost the RF link once the rocket reached the ground. As various folks on the flight line wished me luck finding my rocket, I put the last reported GPS position into my hand-held receiver. Staring at the map display, Mike and I realized the rocket was far down range, near one of the roads into the site. We jumped into my vehicle and drove down the road to the point closest to the rocket's reported position. We then walked to where the GPS receiver said the rocket should be... And found the rocket within about 20 feet! That was well within the window of position uncertainty my hand-held GPS was reporting at the time. Things just don't get much better than that! We picked up the rocket, and returned to the flight line only a few minutes after leaving it. After dumping the data from the board's on-board memory, I quickly generated the usual plots, along with a kml file that can be viewed in Google Earth. The rocket reached 1881 meters apogee, or around 6173 feet, and the maximum acceleration was 19.5 g. It touched down nearly 1.3 miles down range from the launch rail, in sage-brush desert. I honestly don't think I would have found the rocket without at least the radio beacon. It was hugely gratifying that the GPS worked and let me walk right up to the rocket! I could not have asked for a better test of the new electronics! Later in the day, Keith flew a successful test of serial number 52 at a launch in Wilsonville, Oregon. We're very happy with these results! Weather permitting, I hope to get more test flights in next weekend at Hudson Ranch. Stay tuned!

28 October 2009

Eduardo Marcel Macan: Latinoware: A minha menina (fontes)

A palestra sobre produ o musical com software livre no latinoware 2009 foi um sucesso. Muita gente pediu pra ter a musiquinha que produzimos para ilustrar a palestra, e que <noindex>Cesar Brod</noindex> cantou com tanta desenvoltura :) Ent o aqui segue o arquivo de projeto da m sica A minha menina que foi escolhida em homenagem a <noindex>minha namorada</noindex> e porque Brod queria algo MPB. Eu precisava de algo que pudesse vagamente se tornar parecido com Technopop/Electro. A m sica foi editada com <noindex>LMMS</noindex> (Linux MultiMedia Studio) 0.3.2 e poder ser carregada em qualquer vers o igual ou superior, pois foi feita exclusivamente com plugins e samples padr o da distribui o do LMMS. Apesar do nome, o LMMS est dispon vel para windows tamb m. O que voc est esperando? <noindex>Baixe daqui o LMMS</noindex> e o arquivo original da m sica daqui. P.S: A prop sito, a m sica n o foi masterizada e os volumes est o ajustados segundo os par metros do meu laptop, fone de ouvido e da sala l do latinoware. Brinque com os volumes de cada trilha para ajust -la a seu equipamento.

22 October 2009

Gunnar Wolf: How can we advance without a tax increase?

There has been a lot of buzz recently in Mexico after a tax increase that has been announced for next year. The two main points I have seen criticized are:
Value Added Tax (IVA) increase from 15% to 16%
There was a great improvement regarding the original proposal by our de-facto ruler (why de-facto? Because it is still unclear whether he won the popular vote. He has about the same legitimacy as George Bush during his first term: Legal but illegitimate): In Mexico, there is a category of items regarded as fundamental, which are exempt of IVA (tasa cero). This category includes food and medicines Of course, this category makes up the bulk of the poorer people's consumptions, so they pay much less IVA than people with higher living standards. For a very long time, there has been a push to remove this exemption. This has been fortunately clearly understood and fought against. So, the presidential initiative, as I was saying, contemplated a global 2% tax which would not be IVA, and which would be applied universally. This tax would be earmarked to be applied to social programs, and was euphemistically called Impuesto de Combate a la Pobreza (poverty combat tax). Many people applied the concept of duck typing (if it looks like a duck, walks like a duck and quacks like a duck... It is a Value Added Tax). As many analysts, I believe this tax was meant to be the foot in the doorstop, leading to point out in a couple of years that anyway nothing is IVA-exempt anymore and the world has not come to an end, and we should apply universal IVA... So, the reduction in the increase (2% 1%) is not the most notable thing here The notable (and good!) thing is that they didn't succeed into killing the tasa cero.
3% tax on telecommunications
Many friends have started rallying (with the IMHO least effective way of protest you can find on Earth, just by stating their adherence in their Twitter and Facebook profiles. Wow, great deal!) that the original proposal included a 4% tax on telecommunications, and it appears that 3% will be applied. They say, quite fairly, that telephone and Internet access are no longer considerable a luxury, but a need to power the society into becoming better prepared, more competitive. My friends state as a contrast the Finnish ruling broadband access as a citizen right. What they seem not to realize is that the proportion of taxes in Mexico (collected from the responsible taxpayers, which is not by far the way for the bulk of the money in this country) is close to 30%, taking into account the big taxes (IVA, ISR/IETU) and the host of smaller ones. In Finland, the percentage of taxes payed by every person and remember that tax evasion is way lower than here!) is over 50%.
So What is my opinion on this? What would my ideal tax scheme be?
  • Nobody likes taxes. But the country needs far more infrastructure, far wider inverstments. We need higher taxes But we need those taxes to be collected from people with higher income. And yes, that would mean I would most probably pay more (as I do sit relatively high on the income scale Qualified work, even if you do not seek money for the sake of it, pays much better than non-qualified work; remember the minimal wage in Mexico is around MX$50 a day - Less than US$4 or 3).
  • The increases should be applied to the income tax (ISR). It is supposed to be around 30% for income levels over MX$5000 a month, with a very slight increase after that point. Income tax is highly deductible now, and most people with high income manage to ellude most of it. Many cases have been documented of companies as small as Walmart paying less than MX$1000 a year due to several (intentional? you bet!) holes in the legislation. That is where the bulk of the extra government income should come from!
  • For a couple of years, since I registered as a taxpayer (people receiving money exclusively as salaries under a given limit don't have to declare taxes) I have decided not to hire an accountant to make the numbers look prettier, and just do the numbers myself over the platform provided by SAT/Hacienda (the tax collecting authority). Yes, that means I am paying more than what I could But it also means I am paying what I should! And it is an expensive point of view, but I strongly invite others to do the same. If we criticize Walmart for making numbers look prettier, shouldn't each of us do the same? Shouldn't we all care to pay what we are supposed to, so that the government has enough funds to carry out its tasks?
Yes, I am painfully aware that an important portion of what gets into the government disappears due to corruption and ineptitude. Still, the only position from where I can criticize is from being clearly legal. The same point as I do with software: I cannot ask people to comply with my Free Software licensing if I use ilegally propietary software, can I? So no, I don't use any. Even legal propietary software, free-as-in-beer (i.e. Flash player). So, please think this over before you join the Lemmings into complaining about the tax increase. Yes, this is a bad moment to increase taxes. Yes, Mexico is the worst faring country in all of America in its response to the crisis; the GDP will probably fall between 8% and 10% this year and 2010 will not be much better. Yes, it would be better to increase competitivity. But, yes, we pay ridiculously low amounts of taxes And those of us who can afford a little reduction in our expenditure should do it. And those who make gross money should just stop it. Oh, and last point, regarding the #internetnecesario Twitter hashtag: Don't be Lemmings. Internet should be recognized a basic need for a free society. But right now in our country, it completely is a luxury, even if you cannot live without it. If you are Internet-addicted as myself, you most probably will not notice the 3% increase. FFS, We will pay MX$360 instead of MX$350 a month for my Infinitum connection. Will we really notice? In Mexico, middle and upper class are Internet-enabled. Lower classes are not. Things should change, no doubt. But it is not at all comparable to an universal IVA. Things should change and universal connectivity should be a given. But right now, calling Internet a basic good... is just out of touch with reality.

20 April 2009

Bdale Garbee: TeleMetrum First Flight

Today was the season opener for Tripoli Colorado at their launch site on the buffalo ranch near Hartsel, Colorado. After huge snowfalls along the front range of the mountains in the last few days, we were a little tentative about going, but it turned out to be nearly perfect flying conditions! There was apparently much less snow this week on the high plains west of the front range, and by this morning the snow had almost entirely disappeared, the skies were clear and blue, and the winds were calm except for a few gusty bursts in the afternoon. This launch site is really something special, at 8800 feet above mean sea level, in the middle of a huge area of wide-open short grass prairie. We love flying there, and today the drive to and from the launch site through the snow-covered Colorado Rockies was just beautiful! Son Robert and I flew three rockets today, including his LOC/Precision Lil Nuke with added payload section on an Aerotech G54W-M reload, and our Polecat Aerospace 5.5 inch Goblin kit on one of the relatively new Aerotech I245G-M "Mojave Green" green-flame reloads. But by far the highlight of the day was flying my Giant Leap Vertical Assault on a Cesaroni J335 red-flame reload... with serial number 1 of TeleMetrum on board collecting our first-ever flight data! From the ground, it looked like a textbook perfect dual-deploy flight, with a small drogue chute out at apogee around 4000 feet above ground, and the main chute deploying at 700 feet above ground for a beautiful, soft landing only a couple minutes walk from the launch rail. The ejection events were controlled by a PerfectFlite MiniAlt/WD. The data recovered from it shows a big negative spike in the altitude right at apogee, coincident with firing of the apogee deployment charge. I have to assume this means the aft bulkhead on the avionics bay in the reconstructed coupler section isn't sealing well, and some pressure from the apogee deployment charge leaked in to the avionics bay "fooling" the altimeter into thinking the altitude was lower for a sample or two. Clearly, that needs to get fixed before that airframe flies again. Further evidence that we had a mighty kick from the apogee ejection charge was discovered when we went to clean the motor casing. The Cesaroni reloads are packaged in a plastic liner tube that slides into the reusable aluminum case. When we pulled the spent reload out of the case, it was significantly shorter than when it was loaded, suggesting that the ejection charge forced the forward closure on the reload backwards compressing the heat-softened plastic. This could be evidence that the reconstructed coupler was slow to separate from the booster airframe due to excessive friction? The flight data recovered from the TeleMetrum board looks great until apogee, when the data collection stopped. Since I flew firmware that was compiled and flashed on the flight line from Keith's latest git commit as of this morning, it's entirely possible that there was a software bug that caused data collection to terminate at apogee. We'll investigate that. But I personally think what actually happened is that we experienced a temporary short in the power supply at the time the apogee ejection charge fired. On extraction of the electronics sled from the avionics bay this evening, I noticed that one of the mounting screws has gone missing. If it wasn't snug enough, and vibrated loose during flight, it could have been torn loose at the time of the ejection event and shorted something as it rattled around in the avionics bay. The screw is now just missing, but may have fallen out when we were extracting data on the flight line just after the flight without being noticed at the time. So I'm not inclined to worry much about this, at least until we can get some more flight data! Keith post-processed the raw flight data and presented me with a plot showing two traces, acceleration and barometric altitude. The data from the accelerometer closely matches the published data for the motor we flew, which is a really cool result, and my 10-year-old son enjoyed figuring out why the rocket showed negative acceleration after the motor burn out but was still climbing. (See, there really is some science education hidden in the fun!) All in all, we had a great time, and it's totally cool to have data from a first flight of TeleMetrum! Can't wait to fly it again!

15 April 2009

Bdale Garbee: TeleMetrum Mounted

Last night, I stayed up late reworking the avionics bay in my Giant Leap Vertical Assault kit. This is the rocket I set my personal best altitude to date of 14,141 feet above ground level with. The Missile Works RRC2 mini will move to a lesser airframe to keep it below 10k feet AGL since it's an early unit with the supposedly-buggy early firmware. In its place, I've mounted a PerfectFlite MiniAlt/WD to handle primary deployment duties... and serial number 1 of TeleMetrum! Meanwhile, Keith has completely rewritten the guts of the firmware in the last few days to eliminate FreeRTOS since it turns out to just be too heavy for our needs on this project. My board is now running a snapshot of his latest work, and in bench testing I've confirmed that it can do launch detect, log sensor data until the non-volatile memory chip is full, and dump the raw data over USB for analysis after a flight. Weather has been an issue in Colorado recently, but as soon as we get a decent launch day (maybe as early as this weekend?), I'm looking forward to flying this rocket on one or all of the Cesaroni I205, I540, and J285 currently in my box to get some initial flight data recorded. Keep your fingers crossed for me that the storm front headed towards us clears out by the weekend!

27 March 2009

Gunnar Wolf: La metaprogramaci n y los lenguajes din micos

TitleLa metaprogramaci n y los lenguajes din micos
Publication TypeWeb Article
Year of Publication2008
AuthorsGunnar, Wolf.
Access Date2009/03/27
Key WordsMetaprogramaci n, Ruby, programaci n
AbstractLa metaprogramaci n es una t cnica muy poderosa, y muy socorrida en el terreno de los lenguajes din micos. Puede llevarnos a reducir muy fuertemente el total de c digo que escribimos - Y lo que es mucho m s importante, a minimizar la cantidad de c digo repetido innecesariamente. Nos lleva claramente a aumentar la calidad y mantenibilidad de nuestro c digo.
URLhttp://www.sg.com.mx/content/view/718/1/
Citation Key1925
Export

3 December 2008

David Bremner: Using GLPK from C++

Recently I suggested to some students that they could use the Gnu Linear Programming Toolkit from C++. Shortly afterwards I thought I had better verify that I had not just sent people on a hopeless mission. To test things out, I decided to try using GLPK as part of an ongoing project with Lars Schewe The basic idea of this example is to use glpk to solve an integer program with row generation. The main hurdle (assuming you want to actually write object oriented c++) is how to make the glpk callback work in an object oriented way. Luckily glpk provides a pointer "info" that can be passed to the solver, and which is passed back to the callback routine. This can be used to keep track of what object is involved. Here is the class header
#ifndef GLPSOL_HH
#define GLPSOL_HH
#include "LP.hh"
#include "Vektor.hh"
#include "glpk.h"
#include "combinat.hh"
namespace mpc  
  class  GLPSol : public LP  
  private:
    glp_iocp parm;
    static Vektor<double> get_primal_sol(glp_prob *prob);
    static void callback(glp_tree *tree, void *info);
    static int output_handler(void *info, const char *s);
  protected:
    glp_prob *root;
  public:
    GLPSol(int columns);
    ~GLPSol()  ;
    virtual void rowgen(const Vektor<double> &candidate)  ;
    bool solve();
    bool add(const LinearConstraint &cnst);
   ;
 
#endif
The class LP is just an abstract base class (like an interface for java-heads) defining the add method. The method rowgen is virtual because it is intended to be overridden by a subclass if row generation is actually required. By default it does nothing. Notice that the callback method here is static; that means it is essentially a C function with a funny name. This will be the function that glpk calls when it wants from help.
#include <assert.h>
#include "GLPSol.hh"
#include "debug.hh"
namespace mpc 
  GLPSol::GLPSol(int columns)  
    // redirect logging to my handler
    glp_term_hook(output_handler,NULL);
    // make an LP problem
    root=glp_create_prob();
    glp_add_cols(root,columns);
    // all of my variables are binary, my objective function is always the same
    //  your milage may vary
    for (int j=1; j<=columns; j++) 
      glp_set_obj_coef(root,j,1.0);
      glp_set_col_kind(root,j,GLP_BV);
     
    glp_init_iocp(&parm);
    // here is the interesting bit; we pass the address of the current object
    // into glpk along with the callback function
    parm.cb_func=GLPSol::callback;
    parm.cb_info=this;
   
  int GLPSol::output_handler(void *info, const char *s) 
    DEBUG(1) << s;
    return 1;
   
  Vektor<double> GLPSol::get_primal_sol(glp_prob *prob) 
    Vektor<double> sol;
    assert(prob);
    for (int i=1; i<=glp_get_num_cols(prob); i++) 
      sol[i]=glp_get_col_prim(prob,i);
     
    return sol;
   
  // the callback function just figures out what object called glpk and forwards
  // the call. I happen to decode the solution into a more convenient form, but 
  // you can do what you like
  void GLPSol::callback(glp_tree *tree, void *info) 
    GLPSol *obj=(GLPSol *)info;
    assert(obj);
    switch(glp_ios_reason(tree)) 
    case GLP_IROWGEN:
      obj->rowgen(get_primal_sol(glp_ios_get_prob(tree)));
      break;
    default:
      break;
     
   
  bool GLPSol::solve(void)    
    int ret=glp_simplex(root,NULL);
    if (ret==0) 
      ret=glp_intopt(root,&parm);
    if (ret==0)
      return (glp_mip_status(root)==GLP_OPT);
    else
      return false;
   
  bool GLPSol::add(const LinearConstraint&cnst) 
    int next_row=glp_add_rows(root,1);
    // for mysterious reasons, glpk wants to index from 1
    int indices[cnst.size()+1];
    double coeff[cnst.size()+1];
    DEBUG(3) << "adding " << cnst << std::endl;
    int j=1;
    for (LinearConstraint::const_iterator p=cnst.begin();
         p!=cnst.end(); p++) 
      indices[j]=p->first;
      coeff[j]=(double)p->second;
      j++;
     
    int gtype=0;
    switch(cnst.type()) 
    case LIN_LEQ:
      gtype=GLP_UP;
      break;
    case LIN_GEQ:
      gtype=GLP_LO;
      break;
    default:
      gtype=GLP_FX;
     
    glp_set_row_bnds(root,next_row,gtype,       
                       (double)cnst.rhs(),(double)cnst.rhs());
    glp_set_mat_row(root,
                    next_row,
                    cnst.size(),
                    indices,
                    coeff);
    return true;
   
 
All this is a big waste of effort unless we actually do some row generation. I'm not especially proud of the crude rounding I do here, but it shows how to do it, and it does, eventually solve problems.
#include "OMGLPSol.hh"
#include "DualGraph.hh"
#include "CutIterator.hh"
#include "IntSet.hh"
namespace mpc 
  void OMGLPSol::rowgen(const Vektor<double>&candidate) 
    if (diameter<=0) 
      DEBUG(1) << "no path constraints to generate" << std::endl;
      return;
     
    DEBUG(3) << "Generating paths for " << candidate << std::endl;
  // this looks like a crude hack, which it is, but motivated by the
  // following: the boundary complex is determined only by the signs
  // of the bases, which we here represent as 0 for - and 1 for +
    Chirotope chi(*this);
    for (Vektor<double>::const_iterator p=candidate.begin();
         p!=candidate.end(); p++) 
      if (p->second > 0.5)  
        chi[p->first]=SIGN_POS;
        else  
        chi[p->first]=SIGN_NEG;
       
     
    BoundaryComplex bc(chi);
    DEBUG(3) << chi;
    DualGraph dg(bc);
    CutIterator pathins(*this,candidate);
    int paths_found=
      dg.all_paths(pathins,
                   IntSet::lex_set(elements(),rank()-1,source_facet),
                   IntSet::lex_set(elements(),rank()-1,sink_facet),
                   diameter-1);
    DEBUG(1) << "row generation found " << paths_found << " realized paths\n";
    DEBUG(1) << "effective cuts: " << pathins.effective() << std::endl;
   
  void OMGLPSol::get_solution(Chirotope &chi)  
    int nv=glp_get_num_cols(root);
    for(int i=1;i<=nv;++i)  
      int val=glp_mip_col_val(root,i);
      chi[i]=(val==0 ? SIGN_NEG : SIGN_POS);
     
   
 
So ignore the problem specific way I generate constraints, the key remaining piece of code is CutIterator which filters the generated constraints to make sure they actually cut off the candidate solution. This is crucial, because row generation must not add constraints in the case that it cannot improve the solution, because glpk assumes that if the user is generating cuts, the solver doesn't have to.
#ifndef PATH_CONSTRAINT_ITERATOR_HH
#define PATH_CONSTRAINT_ITERATOR_HH
#include "PathConstraint.hh"
#include "CNF.hh"
namespace mpc  
  class CutIterator : public std::iterator<std::output_iterator_tag,
                                                      void,
                                                      void,
                                                      void,
                                                      void> 
  private:
    LP& _list;
    Vektor<double> _sol;
    std::size_t _pcount;
    std::size_t _ccount;
  public:
    CutIterator (LP& list, const Vektor<double>& sol) : _list(list),_sol(sol), _pcount(0), _ccount(0)  
    CutIterator& operator=(const Path& p)  
      PathConstraint pc(p);
      _ccount+=pc.appendTo(_list,&_sol);
      _pcount++;
      if (_pcount %10000==0)  
        DEBUG(1) << _pcount << " paths generated" << std::endl;
       
      return *this;
     
    CutIterator& operator*()  return *this; 
    CutIterator& operator++()  return *this; 
    CutIterator& operator++(int)  return *this; 
    int effective()   return _ccount;  ;
   ;
 
#endif
Oh heck, another level of detail; the actual filtering actually happens in the appendTo method the PathConstraint class. This is just computing the dot product of two vectors. I would leave it as an exercise to the readier, but remember some fuzz is neccesary to to these kinds of comparisons with floating point numbers. Eventually, the decision is made by the following feasible method of the LinearConstraint class.
 bool feasible(const
Vektor<double> & x)  double sum=0; for (const_iterator
p=begin();p!=end(); p++)  sum+= p->second*x.at(p->first);  
      switch (type()) 
      case LIN_LEQ:
        return (sum <= _rhs+epsilon);
      case LIN_GEQ:
        return (sum >= _rhs-epsilon);
    default:
      return (sum <= _rhs+epsilon) &&
        (sum >= _rhs-epsilon);
       
     

15 August 2008

DebConf 8 video: Debian Derivers Roundtable

Participiants: Martin F Krafft <madduck@madduck.net> (vcs-pkg.org), Florian Maier <contact@marsmenschen.com> (LiMux), Cesar Gomez <cesar.gomez@gmail.com> (Linex), Holger Levsen <holger@layer-acht.org> (Debian Edu), Andreas Tille <tille@debian.org>, Mark Shuttleworth <marks@debian.org> (Ubuntu), Bdale Garbee (Debian)
Full event details

14 August 2008

Eddy Petri&#537;or: crazy idea: user supported snapshot.debian.net

I just had this crazy idea about solving snapshot.debian.net's space issue which just might work if done right.

We all know that snapshot.debian.net has ran out of space in early May this year and we all lost a valuable service. Since there are probably many users, developers and maintainers which find this service useful, it might make sense to think that the users themselves could solve the problem.

How? Think about implementing a really distributed storage/file system (not necesarly a real file system) that can rely on cluster nodes that can enter or exit at any time and the nodes are contributed by users, very much alike the GoogleFS file system (or maybe clients in a torrent swarm).


First problems that I can already see:

Other ideas:
What do other people think?

5 April 2008

Holger Levsen: Extremadura details

I thought the spanish region of Extremadura had 4 million inhabitants and 200000 computers running GNU/Linex, a custom Debian distribution.

I was wrong. It's 1 million inhabitants and ca. 100000 computers running Linex - which is a much better ratio :-)

The exact number is not really known, because about 2000 computers (or so, details...!) don't provide data back for some unknown reasons. Probably a configuration detail.

Details are sooo important, and I love that. But at the same time, details are irrelevant (100000 or 102000...), I love that too.

Personally, I think the Debian Edu developer gathering the last 3.5 days here in Extremadura was very productive and fun. It was so productive, there were so many fun, interesting and insightful discusssions, that currently I can only think of one detail, I want to mention here, which really isn't a detail, but something bigger. I'm very very happy about the work which was put into bug 311188 here, mostly by Winnie. #311188 is just a meta bug, the juice is in the details, all the bugs which are blocking this bug.

And as you are probably interesting in this detail, why we care so much about this bug, let me explain: the blocking bugs of that bug will prevent upgrades from Lenny to Lenny+1 from working without problems for Debian Edu installations. That's why we care - it's only a detail, but an important one.

And now I'll run to get a last drink from the hotel bar, instead of correcting details in this post. Thank you, Cesar y Jose, for arranging this great meeting and everybody else, who makes Debian a fun thing to work on details and the great picture!

12 February 2008

Evan Prodromou: 23 Pluvi se CCXVI

Since we launched Vinismo a few months ago, I've been watching the world of wine blogging with great interest. One thing I really like is Wine Blogging Wednesday, wherein wine bloggers around the world pick a topic or theme and all blog about wines that match that theme, once a month. On Wednesday. Thus the name. I'm surprised I have to explain this. Anyhoo, this months theme is just seven words: the reviews of wines have to be exactly seven words long. I figured, heck: if I'm ever gonna do a WBW review, this has gotta be the easiest to do. Oh, and the wine theme was reds from Italy, which may be the easiest drinking wines in the entire world. Niko and Pio So: Niko picked up a bottle of 2005 Pio Cesare Barbera d'Alba before coming over to our house last night, and we had a glug while doing some floor puzzles with Amita June. My impression? Earthy, melon, medium-bodied; needs sharp cheese. Woo-hoo! Now I'm a real wine blogger. tags:

Next.