Search Results: "heiko"

17 July 2015

Simon Kainz: DUCK challenge: week 2

Just a little update on the DUCK challenge: In the last week, the following packages were fixed and uploaded into unstable: Last week we had 10 packages uploaded & fixed; the current week resulted in 15 fixed packages. So there are currently 25 packages fixed by 20 different uploaders. I really hope I can meet you all at DebConf15!! The list of the fixed and updated packages is available here. I will try to update this ~daily. If I missed one of your uploads, please drop me a line. A big "Thank You" to you. There is still lots of time till the end of DebConf15 and the end of the DUCK Challenge, so please get involved. And remember: debcheckout fails? FIX MORE URLS
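The kind of URL problem DUCK hunts for can be illustrated with a small offline sanity check. This is a hypothetical sketch, not DUCK's actual code, and the `looks_valid` helper and the sample Vcs entries are invented for illustration:

```python
# Hypothetical sketch of a cheap offline check on Vcs-* URLs (the kind of
# field debcheckout relies on): parse each URL and flag entries that lack
# a scheme or host before attempting any network access.
from urllib.parse import urlparse

def looks_valid(url):
    """Offline sanity check: does the URL at least parse into scheme + host?"""
    parts = urlparse(url)
    return parts.scheme in ("http", "https", "git", "svn") and bool(parts.netloc)

# Invented example entries, as they might appear in debian/control:
urls = [
    "https://anonscm.debian.org/git/collab-maint/foo.git",  # plausible
    "git://example.org/bar.git",                            # plausible
    "not-a-url",                                            # broken
]
broken = [u for u in urls if not looks_valid(u)]
print(broken)  # the entries a maintainer would need to fix
```

A real checker would of course also have to verify that the host answers and that the repository actually exists, which is what makes debcheckout fail in practice.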

14 March 2014

Richard Hartmann: Git prize: Outstanding Contribution to Open Source/Linux/Free Software

In February, Linux Magazine contacted me, asking if I would be willing to accept the Linux New Media Award 2014 in the main category "Outstanding Contribution to Open Source/Linux/Free Software" on behalf of the Git community due to my involvement with evangelizing and vcsh. Needless to say, I was thrilled. I managed to poke Junio via someone at Google and he agreed. We also reached out within the German Git community and two maintainers of git submodule, Jens Lehmann and Heiko Voigt, joined in as well. While we didn't manage to hammer out interoperability issues of vcsh and git submodule due to time constraints and too much beer, we are planning to follow up on that. Git beat OpenStack, Python, and Ubuntu by a huge margin; sadly I don't have exact numbers (yet). More details and a rather crummy photo can be found in Linux Magazine's article. A video of the whole thing will be uploaded to this page soonish. If it appears that we kept our "speech" very short, that was deliberate after the somewhat prolonged speeches beforehand ;) The aftershow event was nice even though the DJ refused to turn the music down to tolerable levels; his reaction to people moving farther away and asking him to turn down the volume a bit was to turn it up... Anyway, given the mix of people present during the award ceremony, very interesting discussions ensued. While I failed to convert Klaus Knopper to zsh and git, at least there's a chance that Cornelius Schuhmacher will start using vcsh and maybe even push for a separation of configuration and state in KDE. The most interesting tidbits of the evening were shared by Abhisek Devkota of cyanogenmod fame. Without spilling any secrets, it's safe to say that the future of cyanogenmod is looking extremely bright and that there are surprises in the works which will have quite the impact. Last but not least, here's the physical prize: Glass trophy held by yours truly

29 October 2013

Soeren Sonnenburg: Shogun Toolbox Version 3.0 released!

Dear all, we are proud to announce the 3.0 release of the Shogun Machine-Learning Toolbox. This release features the incredible projects of our 8 hard-working Google Summer of Code students. In addition, you get other cool new features as well as lots of internal improvements, bugfixes, and documentation improvements. To speak in numbers, we got more than 2000 commits changing almost 400000 lines in more than 7000 files and increased the number of unit tests from 50 to 600. This is the largest release that Shogun ever had! Please visit to obtain Shogun. News Here is a brief description of what is new, starting with the GSoC projects, which deserve most fame: Screenshots Everyone likes screenshots. Well, we have got something better! All of the above projects (and more) are now documented in the form of IPython notebooks, combining machine learning fundamentals, code, and plots. They are a great-looking way that we chose to document our framework from now on. Have a look at them and feel free to submit your use case as a notebook! FGM.html GMM.html HashedDocDotFeatures.html LMNN.html SupportVectorMachines.html Tapkee.html bss_audio.html bss_image.html ecg_sep.html gaussian_processes.html logdet.html mmd_two_sample_testing.html The web-demo framework has been integrated into our website, go check them out. Other changes We finally moved the Shogun build process to CMake. Through GSoC, we added general clone and equals methods to all Shogun objects, and added automagic unit-testing for serialisation and clone/equals for all classes. Other new features include multiclass LDA and probability outputs for multiclass SVMs. For the full list, see the NEWS. Workshop Videos and slides In case you missed the first Shogun workshop that we organised in Berlin last July, all of the talks have been put online.
Shogun in the Cloud As setting up the right environment for shogun and installing it has always been one of the biggest problems for users (hence the switch to CMake), we have created a sandbox where you can try out shogun on your own without installing it on your system! Basically it's a web service which gives you access to your own ipython notebook server with all the shogun notebooks. Of course you are more than welcome to create and share your own notebooks using this service! *NOTE*: This is a courtesy service created by Shogun Toolbox developers, so if you like it please consider some form of donation to the project so that we can keep this service running for you. Try shogun in the cloud. Thanks The release has been made possible by the hard work of all of our GSoC students, see the list above. Thanks also to Thoralf Klein and Björn Esser for the load of great contributions. Last but not least, thanks to all the people who use Shogun and provide feedback. Sören Sonnenburg on behalf of the Shogun team (+ Viktor Gal, Sergey Lisitsyn, Heiko Strathmann and Fernando Iglesias)
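The "automagic unit-testing for clone/equals" mentioned above can be sketched generically. This is not Shogun's actual code; the `SGObjectSketch` class and its method names are invented to illustrate the idea of a parameter-registering base class that makes a single generic test applicable to every subclass:

```python
# Generic sketch of the clone/equals auto-test idea: any object that
# registers its parameters in one place can be cloned and compared
# field by field, so one test covers all classes.
import copy

class SGObjectSketch:
    """Hypothetical stand-in for a parameter-registering base class."""
    def __init__(self, **params):
        self._params = dict(params)   # all state lives in the registry

    def clone(self):
        """Deep-copy the object via its registered parameters."""
        return copy.deepcopy(self)

    def equals(self, other):
        """Structural equality over the registered parameters."""
        return type(self) is type(other) and self._params == other._params

def check_clone_equals(obj):
    """The auto-test applied uniformly: a clone must equal its original."""
    return obj.equals(obj.clone())

svm = SGObjectSketch(C=1.0, kernel="gaussian", width=2.0)
print(check_clone_equals(svm))  # True
```

The payoff of this design is that serialisation round-trips can be tested the same way: save, load, and assert `equals` against the original, without writing a test per class.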

17 March 2013

Soeren Sonnenburg: Shogun Toolbox Version 2.1.0 Released!

We have just released shogun 2.1.0. This release contains over 800 commits since 2.0.0 with a load of bugfixes, new features and improvements (see the changelog for details) that make Shogun more efficient, robust and versatile. In particular, Christian Montanari developed a first alpha version of a Perl modular interface, Heiko Strathmann added Linear Time MMD on Streaming Data, Viktor Gal wrote a new structured output solver and Sergey Lisitsyn added support for tapkee - a dimension reduction framework. Read more at
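The linear-time MMD mentioned above is a two-sample test statistic that averages a kernel term over disjoint pairs of samples, so each sample is touched once. The following is a plain NumPy illustration of that statistic, not Shogun's implementation; the Gaussian kernel width and the toy data are arbitrary choices:

```python
# Sketch of the unbiased linear-time MMD^2 estimate between samples X and Y.
# Each sample is used in exactly one pair, which is what makes the estimator
# linear-time and suitable for streaming data.
import numpy as np

def gaussian_kernel(a, b, sigma=1.0):
    return np.exp(-np.sum((a - b) ** 2) / (2 * sigma ** 2))

def linear_time_mmd(X, Y, sigma=1.0):
    """Average of h((x1,y1),(x2,y2)) over disjoint sample pairs."""
    n = min(len(X), len(Y)) // 2 * 2   # use an even number of samples
    h = []
    for i in range(0, n, 2):
        x1, x2, y1, y2 = X[i], X[i + 1], Y[i], Y[i + 1]
        h.append(gaussian_kernel(x1, x2, sigma) + gaussian_kernel(y1, y2, sigma)
                 - gaussian_kernel(x1, y2, sigma) - gaussian_kernel(x2, y1, sigma))
    return float(np.mean(h))

rng = np.random.default_rng(0)
same = linear_time_mmd(rng.normal(0, 1, (200, 2)), rng.normal(0, 1, (200, 2)))
diff = linear_time_mmd(rng.normal(0, 1, (200, 2)), rng.normal(3, 1, (200, 2)))
print(same, diff)  # near zero for equal distributions, clearly larger otherwise
```

In a real test one would compare the statistic against a threshold derived from its (asymptotically normal) null distribution; the sketch only shows the statistic itself.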

27 October 2012

Soeren Sonnenburg: Shogun at Google Summer of Code 2012

The summer finally came to an end (yes, in Berlin we still had 20°C at the end of October) and, unfortunately, so did GSoC with it. This has been the second time for SHOGUN to be in GSoC. For those unfamiliar with SHOGUN - it is a very versatile machine learning toolbox that enables unified large-scale learning for a broad range of feature types and learning settings, like classification, regression, or explorative data analysis. I again played the role of an org admin and co-mentor this year and would like to take the opportunity to summarize enhancements to the toolbox and my GSoC experience: In contrast to last year, we required code contributions in the application phase of GSoC already, i.e., a (small) patch was mandatory for your application to be considered. This reduced the number of applications we received: 48 proposals from 38 students instead of 70 proposals from about 60 students last year, but it also increased the overall quality of the applications. In the end we were very happy to get 8 very talented students and have the opportunity of boosting the project thanks to their hard and awesome work. Thanks to Google for sponsoring three more students compared to last GSoC. Still, we gave one slot back to the pool, for good, to the octave project (they used it very wisely and octave will have a just-in-time compiler now, which will benefit us all!). SHOGUN 2.0.0 is the new release of the toolbox, including of course all the new features that the students have implemented in their projects. On the one hand, modules that were already in SHOGUN have been extended or improved. For example, Jacob Walker has implemented Gaussian Processes (GPs), improving the usability of SHOGUN for regression problems. Chiyuan Zhang contributed a framework for multiclass learning, including state-of-the-art methods in this area such as Error-Correcting Output Coding (ECOC) and ShareBoost, among others. In addition, Evgeniy Andreev has made very important improvements w.r.t.
the accessibility of SHOGUN. Thanks to his work with SWIG director classes, it is now possible to use python for prototyping and make use of that code with the same flexibility as if it had been written in the C++ core of the project. On the other hand, completely new frameworks and other functionalities have been added to the project as well. This is the case of the multitask learning and domain adaptation algorithms written by Sergey Lisitsyn and the kernel two-sample or dependence test by Heiko Strathmann. Viktor Gal has introduced latent SVMs to SHOGUN and, finally, two students have worked on the new structured output learning framework. Fernando Iglesias made the design of this framework, introducing the structured output machines into SHOGUN, while Michal Uricar has implemented several bundle methods to solve the optimization problem of the structured output SVM. It has been very fun and interesting to see how the work done in different projects was put together very early, even during the GSoC period. To give just one example, combining the generic structured output framework with the improvements in accessibility: it is possible to use the SWIG directors to implement the application-specific mechanisms of a structured learning problem instance in python and then use the rest of the framework (written in C++) to solve this new problem. Students! You all did a great job and I am more than amazed at what you all have achieved. Thank you very much and I hope some of you will stick around. Besides all these improvements, it has been particularly challenging for me as org admin to scale the project. While I could still be deeply involved in each and every part of the project last GSoC, this was no longer possible this year. Learning to trust that your mentors are doing the job is something that didn't come easy to me. Having roughly monthly all-hands meetings did help, and so did monitoring the happiness of the students.
I am glad that it all worked out nicely this year too. Again, I would like to mention that SHOGUN improved a lot code-base/code-quality wise. Students gave very constructive feedback about our lack of proper Vector/Matrix/String/Sparse Matrix types. We now have all of these implemented, doing automagic memory garbage collection behind the scenes. We have started to transition to Eigen3 as our matrix library of choice, which made quite a number of algorithms much easier to implement. We generalized the label framework (CLabels) to be tractable for not just classification and regression but also multitask and structured output learning. Finally, we have had quite a number of infrastructure improvements. Thanks to GSoC money we have a dedicated server for running the buildbot/buildslaves and website. The ML Group at TU Berlin sponsors virtual machines for building SHOGUN on Debian and Cygwin. Viktor Gal stepped up to provide buildslaves for Ubuntu and FreeBSD. Gunnar Raetsch's group is supporting redhat-based build tests. We have Travis CI running, testing pull requests for breakage even before merges. Code quality is now monitored utilizing LLVM's scan-build. Bernard Hernandez appeared and wrote a fancy new website for SHOGUN. A more detailed description of the achievements of each of the students follows:

26 April 2012

Soeren Sonnenburg: GSoC2012 Accepted Students

Shogun has received an outstanding 9 slots in this year's Google Summer of Code. Thanks to Google and the hard work of our mentors ranking and discussing with the applicants, we were able to accept 8 talented students (we had to return one slot back to the pool due to not having enough mentors for all tasks; we indicated that octave should get the slot, so let's hope they got it and will spend it well :). Each of the students will work on rather challenging machine learning topics. The accepted topics and the students attacking them with the help of their mentors are We are excited to have them all in the team! Happy hacking!

7 September 2011

Soeren Sonnenburg: Shogun at Google Summer of Code 2011

Google Summer of Code 2011 gave a big boost to the development of the shogun machine learning toolbox. In case you have never heard of shogun or machine learning: machine learning involves algorithms that do "intelligent" and even automatic data processing and is nowadays used everywhere, e.g. to do face detection in your camera, compress the speech in your mobile phone, power the recommendations in your favourite online shop, and predict the solubility of molecules in water or the location of genes in humans, to name just a few examples. Interested? Then you should give it a try. Some very simple examples stemming from a sub-branch of machine learning called supervised learning illustrate how objects represented by two-dimensional vectors can be classified as good or bad by learning a so-called support vector machine. I would suggest installing the python_modular interface of shogun and running the example also included in the source tarball. Two images illustrating the training of a support vector machine follow (click to enlarge): svm, svm interactive. Now back to Google Summer of Code: Google sponsored 5 talented students who were working hard on various subjects. As a result we now have a new core developer and various new features implemented in shogun: interfaces to new languages like java, c#, ruby and lua written by Baozeng; a model selection framework written by Heiko Strathmann; many dimension reduction techniques written by Sergey Lisitsyn; Gaussian Mixture Model estimation written by Alesis Novik; and a full-fledged online learning framework developed by Shashwat Lal Das. All of this work has already been integrated in the newly released shogun 1.0.0. In case you want to know more about the students' projects, continue reading below, but before going into more detail I would like to summarize my experience with GSoC 2011. My Experience with Google Summer of Code We were a first-time organization, i.e. taking part for the first time in GSoC.
Having received many, many student applications, we were very happy to hear that we got at least 5 very talented students accepted, but still had to reject about 60 students (only 7% acceptance rate!). Doing this was an extremely tough decision for us. Each of us ended up scoring students, and even then we had many ties. So in the end we raised the bar by requiring contributions even before the actual GSoC started. This way we already got many improvements, like more complete i/o functions, nicely polished ROC and other evaluation routines, new machine learning algorithms like gaussian naive bayes and averaged perceptron, and many bugfixes. The quality of the contributions and the independence of the students aided us in coming up with the selection of the final five. I personally played the role of the administrator and (co-)mentor and scheduled regular (usually monthly) irc meetings with mentors and students. For other org admins or mentors wanting to get into GSoC, here are my lessons learned: Now please read on to learn about the newly implemented features: Dimension Reduction Techniques Sergey Lisitsyn (Mentor: Christian Widmer) Dimensionality reduction is the process of finding a low-dimensional representation of a high-dimensional one while maintaining the core essence of the data. As one of the most important practical issues of applied machine learning, it is widely used for preprocessing real data. With a strong focus on memory requirements and speed, Sergey implemented the following dimension reduction techniques: See below for some nice illustrations of dimension reduction/embedding techniques (click to enlarge): isomap swissroll, klle, local tangent space alignment. Cross-Validation Framework Heiko Strathmann (Mentor: Soeren Sonnenburg) Nearly every learning machine has parameters which have to be determined manually. Before Heiko started his project, one had to manually implement cross-validation using (nested) for-loops.
In his highly involved project, Heiko extended shogun's core to register parameters and ultimately made cross-validation possible. He implemented different model selection schemes (train/validation/test split, n-fold cross-validation, stratified cross-validation, etc.) and created some examples for illustration. Note that various performance measures are available to measure how "good" a model is. The figure below shows the area under the receiver operator characteristic curve as an example. Interfaces to the Java, C#, Lua and Ruby Programming Languages Baozeng (Mentor: Mikio Braun and Soeren Sonnenburg) Baozeng implemented swig typemaps that enable transfer of objects native to the language one wants to interface to. In his project, he added support for Java, Ruby, C# and Lua. His knowledge about swig helped us to drastically simplify shogun's typemaps for existing languages like octave and python, resolving other corner-case type issues. The addition of these typemaps brings a high-performance and versatile machine learning toolbox to these languages. It should be noted that shogun objects trained in e.g. python can be serialized to disk and then loaded from any other language like, say, lua or java. We hope this helps users working in multiple-language environments. Note that the syntax is very similar across all languages used; compare for yourself - various examples for all languages (python, octave, java, lua, ruby, and csharp) are available. Large-scale Learning Framework and Integration of Vowpal Wabbit Shashwat Lal Das (Mentor: John Langford and Soeren Sonnenburg) Shashwat introduced support for 'streaming' features into shogun. That is, instead of shogun's traditional way of requiring all data to be in memory, features can now be streamed from e.g. disk, enabling the use of massive data sets. He implemented support for dense and sparse vector based input streams as well as strings and converted existing online learning methods to use this framework.
He was particularly careful and even made it possible to emulate streaming from in-memory features. He finally integrated (parts of) vowpal wabbit, which is a very fast large-scale online learning algorithm based on SGD. Expectation Maximization Algorithms for Gaussian Mixture Models Alesis Novik (Mentor: Vojtech Franc) The Expectation-Maximization algorithm is well known in the machine learning community. The goal of this project was the robust implementation of the Expectation-Maximization algorithm for Gaussian Mixture Models. Several computational tricks have been applied to address numerical and stability issues, like An illustrative example of estimating a one- and two-dimensional Gaussian follows below: 1D GMM, 2D GMM. Final Remarks All in all, this year's GSoC has given the SHOGUN project a great push forward and we hope that this will translate into an increased user base and numerous external contributions. Also, we hope that by providing bindings for many languages, we can provide a neutral ground for Machine Learning implementations and that way bring together communities centered around different programming languages. All that's left to say is that given the great experiences from this year, we'd be more than happy to participate in GSoC2012.
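The n-fold cross-validation scheme that Heiko's framework automates can be sketched in plain Python. This is illustrative only; Shogun's actual API differs, and the toy threshold model and data are invented: the data are split into n folds, each fold serves once as the validation set while the rest train the model, and the n scores are averaged:

```python
# Plain-Python sketch of n-fold cross-validation: split, hold out each
# fold once, train on the rest, and average the validation scores.
def n_fold_cross_validation(data, labels, train_fn, score_fn, n=5):
    folds = [list(range(i, len(data), n)) for i in range(n)]
    scores = []
    for held_out in folds:
        train_idx = [i for i in range(len(data)) if i not in held_out]
        model = train_fn([data[i] for i in train_idx],
                         [labels[i] for i in train_idx])
        scores.append(score_fn(model,
                               [data[i] for i in held_out],
                               [labels[i] for i in held_out]))
    return sum(scores) / n

# Toy model: learn a threshold as the mean of the training data and
# predict class 1 for points above it.
def train_mean_threshold(xs, ys):
    return sum(xs) / len(xs)

def accuracy(threshold, xs, ys):
    preds = [1 if x > threshold else 0 for x in xs]
    return sum(p == y for p, y in zip(preds, ys)) / len(ys)

data = [0.1, 0.2, 0.3, 0.4, 2.1, 2.2, 2.3, 2.4, 0.15, 2.15]
labels = [0, 0, 0, 0, 1, 1, 1, 1, 0, 1]
print(n_fold_cross_validation(data, labels, train_mean_threshold, accuracy))
```

The point of the framework is that the for-loops above disappear: once parameters are registered, any learning machine can be dropped into the same generic splitting and scoring machinery.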

3 December 2010

Debian News: New Debian Developers (November 2010)

The following developers got their Debian accounts in the last month: Congratulations!

18 September 2008

Vincent Fourmond: pdfcrop vs pdfnup --trim

I make heavy use of pdfcrop (the version by Heiko Oberdiek included in texlive) for printing, as it allows one to remove all the white space around the text before nupping and printing. However, with recent ghostscript, the bounding box detection is dreadfully slow, anything from one to ten seconds for a single page! When you have a 60-page article to print, it can be quite tedious... Well, I've just discovered that the --trim option of pdfnup can do very similar things too. It removes a given amount of margin on each side, with the syntax 'left bottom right top'. Most of the time, a well-thought-out pdfnup --trim ... will work as well as pdfcrop while being much faster...
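The two approaches can be contrasted side by side. The file name and margin values below are hypothetical; the sketch only builds the command lines so you can compare them, it does not run the tools:

```shell
# Illustrative comparison: automatic cropping with pdfcrop versus fixed
# trimming with pdfnup --trim. File name and margins are made up.
IN=article.pdf

# Automatic: pdfcrop detects each page's bounding box (slow with
# recent ghostscript, one to ten seconds per page).
CROP_CMD="pdfcrop $IN ${IN%.pdf}-crop.pdf"

# Manual: pdfnup trims fixed margins in the order 'left bottom right top'
# (TeX dimensions such as 2cm). No bounding-box detection, so it is fast.
TRIM_CMD="pdfnup --nup 2x1 --trim '2cm 1.5cm 2cm 1.5cm' $IN"

echo "$CROP_CMD"
echo "$TRIM_CMD"
```

The trade-off is exactly as described above: --trim needs you to eyeball the margins once per document, while pdfcrop figures them out per page at a real cost in time.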

7 March 2006

Zak B. Elep: include

Thank you Ian Dexter Marquez for the USB flash key you sent me! Thank you Steve Kemp and Heiko Herold for your wonderful ports of bash and wget to Windows! And thank you JoshNet for the cafe connection! :D I’m now using those tools to grab ubuntu-desktop and its dependencies via apt-zip; speaking of which, I think that wonderful piece of free software needs a little love… more on that later…