This year, our annual company event will be different. Every summer we gather all conject employees in one place for a couple of days of fun and learning. In recent years we always played some form of more or less elaborate game. This year, we are going to give something back to the community: we are going to Duisburg to build some utility sheds and refurbish some garden furniture for a youth center. We have set the whole thing up like a real construction project, with a financing team, a design team, a team of project managers, a marketing team and even a catering team. The architects on our team designed some cool custom-built sheds that we are going to assemble on site. Our marketeers came up with a very effective web site to market the project and collect donations for the construction costs. Check it out at www.konsequentconject.com. >> more…

There’s a saying “Where there’s one bug, there’s two.” Don’t stop looking because you have found a problem. Chances are excellent that there is another one nearby.

This weekend marks a historic moment for our product conjectPM – we’ve moved from complex object permissions (about 10 different permissions) to simple permissions (read, write, full access). This turned out to be a project in its own right: we have been working on it for 4 iterations now (1 iteration = 2 weeks).
We were careful not to underestimate this endeavour, as it resembles open brain surgery on a patient. We have had over one week of testing (initial internal tests found over 30 bugs), including tests of the migrated data.
Monday will be the moment of truth, as that’s the go-live for thousands of users…
Our support team, who are very critical testers, have been confident. No matter how many cases you test and how much test automation you can put in place, you can rely on the gut feeling of these guys. >> more…

While improving our translation process we ran into some inconsistencies in Arabic properties files. Below is a small example (1). Suddenly the braces were no longer balanced in the Unicode-escaped files (2), while in the editor everything seemed to be OK (3).

  1. message.title = Message {0}:
  2. message.title = :{\u0627\u0644\u0631\u0633\u0627\u0644\u0629 {0
  3. message.title = :{الرسالة {0

How could this happen? There’s no voodoo going on, just the bidi (bidirectional text) algorithm operating in the dark and someone editing the file without knowing about this algorithm. >> more…
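To make the effect a bit more tangible, here is a minimal Python sketch (our own illustration, not part of the original tooling). It takes the value from example (2) exactly as it is stored, lists the bidi class and the "mirrored" flag of every character using the standard `unicodedata` module, and counts the braces in storage order. The helper names `describe` and `brace_balance` are made up for this example.

```python
import unicodedata

# Value of message.title as stored in the escaped file (2), in logical (storage) order.
stored = ":{\u0627\u0644\u0631\u0633\u0627\u0644\u0629 {0"


def describe(text):
    """Return (code point, bidi class, mirrored flag) per character, in logical order."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.bidirectional(ch), bool(unicodedata.mirrored(ch)))
        for ch in text
    ]


def brace_balance(text):
    """Opening braces minus closing braces; 0 means balanced in storage order."""
    return text.count("{") - text.count("}")


for row in describe(stored):
    print(row)
print("brace balance:", brace_balance(stored))
```

The Arabic letters carry a strong right-to-left class, while the braces are neutral characters that are also flagged as mirrored. That means an editor may draw them in a different place and shape than they have in the file, so checking the balance on the stored string is more reliable than trusting what the screen shows.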

The Hadoop File System (HDFS) was designed to store files with sizes in the range of gigabytes or even terabytes. Our requirements are quite different, because we have lots of small files (kilobytes or at most megabytes). However, there is a way to store such data efficiently.

A Hadoop cluster typically consists of one NameNode, which keeps track of what is stored where, and a couple of DataNodes, which are responsible for storing the data. Each file is stored as one block on a DataNode or, if it’s larger than the block size, distributed among several blocks. The default block size is 64 MB. Having mainly small files, we would have a lot of mostly empty blocks lying around. However, these don’t take more disk space than the original files (see “Hadoop: The Definitive Guide”).
It’s purely the number of blocks that causes the problem: the NameNode keeps a map in memory that records on which DataNodes each block is stored. With lots of files, this map becomes huge. On top of that, during startup each DataNode scans its file system and tells the NameNode which files it is storing. The more files there are, the longer this takes. >> more…
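A rough back-of-the-envelope sketch in Python shows how quickly the block count grows. It only applies the rule described above (one block per file, or several if the file exceeds the block size). The two workloads are made-up numbers for illustration, not measurements from our cluster.

```python
BLOCK_SIZE = 64 * 1024 * 1024  # default block size mentioned above (64 MB)


def blocks_for_file(size_bytes, block_size=BLOCK_SIZE):
    """Number of blocks a single file occupies (ceiling division)."""
    return -(-size_bytes // block_size)


def total_blocks(file_sizes, block_size=BLOCK_SIZE):
    """Number of blocks the NameNode has to keep track of for all files."""
    return sum(blocks_for_file(size, block_size) for size in file_sizes)


# Hypothetical workloads holding the same amount of data (1000 MB each).
small_files = [100 * 1024] * 10_240        # 10,240 files of 100 KB
large_file = [1000 * 1024 * 1024]          # one file of 1000 MB

print("small files:", total_blocks(small_files), "blocks")
print("large file: ", total_blocks(large_file), "blocks")
```

For the same amount of data, the small files need 10,240 map entries on the NameNode, while the single large file needs 16.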

We are currently changing our infrastructure to use the distributed Hadoop filesystem (HDFS, an open source filesystem similar to Google’s) instead of dedicated file servers. We therefore needed to change the Ant task that deletes old files on the developers’ computers so that it deletes those files in HDFS. After some extensive research – “ant hadoop” are really bad search terms – we found that the Hadoop distribution already comes with some predefined tasks. This is how they can be used:

  <!-- Jars required by the Hadoop Ant task; libs.dir must point to the folder that holds them -->
  <path id="ant.classpath">
      <fileset dir="${libs.dir}">
          <include name="hadoop-0.18.3-ant.jar" />
          <include name="hadoop-0.18.3-core.jar" />
          <include name="commons-cli-2.0-SNAPSHOT.jar" />
      </fileset>
  </path>

  <!-- Register the DfsTask that ships with Hadoop under the name "hdfs" -->
  <taskdef name="hdfs" classname="org.apache.hadoop.ant.DfsTask" classpathref="ant.classpath" />

  <target name="createHDFSdirectory">
      <hdfs cmd="mkdir" args="hdfs://localhost:54310/testDir" />
  </target>

  <!-- rmr removes the directory recursively -->
  <target name="deleteHDFSdirectory">
      <hdfs cmd="rmr" args="hdfs://localhost:54310/testDir" />
  </target>

>> more…

Wondering why some svn commands I read in this book won’t work on my Mac, I found out that I still have version 1.4.4 installed.
Wanting to get a newer version, I read an instruction saying “you’ll need MacPorts and therefore first install XCode and then…” – but I was pretty sure that wasn’t the best way. Having to install two programs just to get the newest Subversion? However, as I’m still rather new to this “programming on a Mac” thing, I needed a little help from a friend to figure out an easier way:
1. download the binaries at http://www.open.collab.net/downloads/community/
2. double-click, install
3. add /opt/subversion/bin to the path (do something like echo 'export PATH=/opt/subversion/bin:$PATH' >> ~/.bash_profile in the Terminal) >> more…