We are currently moving our infrastructure from dedicated file servers to the Hadoop Distributed File System (HDFS, an open source file system similar to Google’s). As a result, we had to change the Ant task that deletes old files on the developers’ computers so that it deletes those files in HDFS instead. After some extensive research (“ant hadoop” are really bad search terms) we found that the Hadoop distribution already ships with some predefined Ant tasks. This is how they can be used:

  <!-- Jars that provide the Hadoop Ant task and its dependencies -->
  <path id="ant.classpath">
      <fileset dir="${libs.dir}">
          <include name="hadoop-0.18.3-ant.jar" />
          <include name="hadoop-0.18.3-core.jar" />
          <include name="commons-cli-2.0-SNAPSHOT.jar" />
      </fileset>
  </path>

  <!-- Register the task shipped with Hadoop under the name "hdfs" -->
  <taskdef name="hdfs" classname="org.apache.hadoop.ant.DfsTask" classpathref="ant.classpath" />

  <target name="createHDFSdirectory">
      <hdfs cmd="mkdir" args="hdfs://localhost:54310/testDir" />
  </target>

  <!-- rmr deletes the directory recursively -->
  <target name="deleteHDFSdirectory">
      <hdfs cmd="rmr" args="hdfs://localhost:54310/testDir" />
  </target>
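
As a rough sketch of how the cleanup could be made a bit more forgiving, the target below reuses the `hdfs` task defined above. It assumes that `DfsTask` accepts the `failonerror` and `out` attributes (that is how we remember the task's setters, but please check against the Hadoop version you use). The property names `hdfs.base` and `hdfs.listing` and the target name `cleanHDFSdirectory` are made up for this example.

```xml
<!-- Hypothetical property so the namenode URI is written only once -->
<property name="hdfs.base" value="hdfs://localhost:54310" />

<target name="cleanHDFSdirectory">
    <!-- Assumption: failonerror="false" keeps the build going
         when testDir has already been removed -->
    <hdfs cmd="rmr" args="${hdfs.base}/testDir" failonerror="false" />

    <!-- Assumption: out stores the command's output in the named Ant property -->
    <hdfs cmd="ls" args="${hdfs.base}/" out="hdfs.listing" />
    <echo message="${hdfs.listing}" />
</target>
```

Listing the parent directory after the delete is just a cheap way to see in the build log that the old files are really gone.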


Tobi gives us some hints on how to improve our Ant script (there’s a better Ant task we can use). Let’s see... surely it’s better, let’s introduce it... done! Looks much nicer indeed. OK, seems like the work is done... but wait, something tells us that we can do even more in this task. It is still possible to insert incorrect data via a deprecated constructor. We quickly get rid of it. There’s also some legacy code which I want to delete as well. Like that? No way, says Tobi, this stuff is still used from hidden corners of NG, don’t delete it. Let’s have a look... ooh la la, indeed there is still a lot of strange functionality which I didn’t even know about. It seems too difficult to refactor it this time.