January Newsletter

January 25th, 2012 by
(http://blog.dutchworks.nl/2012/01/25/january-newsletter/)

Once again, the festive season is behind us and, as always, we all start the year with a bunch of New Year's resolutions. At Dutchworks we have made a few of our own too, and we're totally committed to achieving them. Our goals are to:

  1. Recruit the top talent in the industry and engage them with colleagues, customers & projects they can be proud of
  2. Achieve maximum project delivery reliability by striving for top quality & a rock solid delivery process
  3. Explore our key business domains (even more) and gain maximum exposure and experience in these markets.

Watch this space, as we'll keep you posted on how we are getting on as the year progresses.

Read the rest of this entry »

Berlin Buzzwords 2012

January 11th, 2012 by
(http://blog.dutchworks.nl/2012/01/11/berlin-buzzwords-2012/)

Yes, Berlin Buzzwords is back on the 4th & 5th June 2012! This really is the only conference for developers and users of open source software projects that focuses on the issues of scalable search, data analysis in the cloud and NoSQL databases. All the talks and presentations are specific to three tags: "search", "store" and "scale".

Looking back to last year, this event had a great turnout. There were well over 440 attendees, of whom 130 were international (from Israel, the US, the UK, the Netherlands, Italy, Spain, Austria and more), and an impressive line-up of 48 speakers. It was a two-day event covering 3 tracks of high-quality talks, surrounded by 5 days of workshops, 10 evening events for attendees to mingle with locals, and specialized training opportunities, to name just a few of the activities on offer!

What was the outcome? Well, let the feedback from some of the delegates tell the story:

"Buzzwords was awesome. A lot of great technical speakers, plenty of interesting attendees and friends, lots of food and fun beer gardens in the evening. I can't wait until next year!"

"I can't recommend this conference enough. Top industry speakers, top developers and fantastic organization. Mark this event on your sponsoring calendar!"

"Berlin Buzzwords is by far one of the best conferences around if you care about search, distributed systems, and nosql..."

"Thanks for organizing. My goal was to learn and I learned a lot!"

So, to get the ball rolling for this year, the call for papers has now officially opened via the website.

You can submit talks on the following topics:

  •  IR / Search - Lucene, Solr, Katta, ElasticSearch or comparable solutions
  •  NoSQL - like CouchDB, MongoDB, Jackrabbit, HBase and others
  •  Hadoop - Hadoop itself, MapReduce, Cascading or Pig and relatives

Related topics not explicitly listed above are also more than welcome, I've been told. The organizers are looking for presentations on the implementation of the systems themselves, technical talks, real-world applications and case studies.

What's more, this year there is once again an impressive Program Committee consisting of:

  • Isabel Drost (Nokia, Apache Mahout)
  • Jan Lehnardt (CouchBase, Apache CouchDB)
  • Simon Willnauer (SearchWorkings, Apache Lucene)
  • Grant Ingersoll (Lucid Imagination, Apache Lucene)
  • Owen O’Malley (Hortonworks Inc., Apache Hadoop)
  • Jim Webber (Neo Technology, Neo4j)
  • Sean Treadway (Soundcloud)

For more information, submission details and deadlines, visit the conference website.

I am truly looking forward to this event and hope to see you there too!

GOTO Amsterdam 2012, May 24-25

January 2nd, 2012 by
(http://blog.dutchworks.nl/2012/01/02/goto-amsterdam-2012-may-24-25/)

Yes, GOTO Amsterdam is back! Our first software development conference, co-organized with Trifork last year, proved a great success. We had over 225 attendees, 50% of whom were developers and 77% of whom were based in the Netherlands. As discussed in my earlier GOTO retrospective, the event was very well received: the KPI measured in number of beers consumed in 48 hours rocked well over the 650 mark!

Because of its success, we have decided to go ahead and schedule the next GOTO Amsterdam. Save the dates for this year: May 24th & 25th 2012, with more details about the program to follow in the upcoming weeks.

GOTO Night #1 sign up now

The popular free GOTO nights will be back too, with the first one scheduled for Thursday 26th January, hosted by Marktplaats in Amsterdam. This event will include an open space brainstorm session to allow for your input, and will also feature a special presentation from Jørn Larsen: the case study "Riak on Drugs (and the other way around)".

Registration is open, so sign up today!

Embedding RSS in Hippo using the pipelines feature

December 13th, 2011 by
(http://blog.dutchworks.nl/2011/12/13/embedding-rss-in-hippo-using-the-pipelines-feature/)

For one of the biggest Hippo projects I have been working on, we created a custom RSS solution. When we started the project, Hippo did not have an RSS solution, and we had some requirements for caching and reusability that we could not implement with standard Hippo. A few years have passed and Hippo is not what it used to be. Nowadays it has a lot more features and a lot fewer NullPointers (sorry guys, could not resist). About a week ago Jeroen Reijn told me about the Pipeline feature in Hippo. This feels like the right time to start thinking about a new solution for RSS.

In this blog post I am going to show you a better way to create RSS feeds with Hippo, using the features Hippo provides. I know there are plugins available for RSS; still, I think mine is better :-). The solution is based on the Rome project and, as mentioned, the Hippo pipelines.

Read the rest of this entry »

Use immutable objects in your Spring MVC controller by implementing your own WebArgumentResolver

December 8th, 2011 by
(http://blog.dutchworks.nl/2011/12/08/use-immutable-objects-in-your-spring-mvc-controller-by-implementing-your-own-webargumentresolver/)

How flexible is Spring MVC in combination with immutable objects? Why don't we want Spring MVC to decide for us how to build the objects we use for binding? Curious how we tackled this problem? Read on!

In our current project we are using Spring MVC 3 to build our front-end.
The binding mechanism of Spring MVC is very powerful and flexible.
For example, Spring MVC will automatically bind fields from the request to the object you are using in your controller. But binding fields from the request to an object only works when the class contains getters and setters.

An example
Our straightforward mutable address class:

public class MutableAddress {

    private String street;
    private String houseNumber;
    private String postalCode;
    private String city;

    public String getStreet() {
        return street;
    }

    public void setStreet(String street) {
        this.street = street;
    }

   // Getters and setters for all other fields
}

The handler method on the controller takes a MutableAddress object (for example populated with data from the form) and saves it.

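import static org.springframework.web.bind.annotation.RequestMethod.POST;

@Controller
public class AddressController {

    // Assumption: an AddressService bean exists and is injected by Spring (not shown in this post)
    @Autowired
    private AddressService addressService;

    @RequestMapping(value = "/add", method = POST)
    public String storeAddress(MutableAddress mutableAddress) {
        // Spring MVC has already populated mutableAddress through its setters
        addressService.storeAddress(mutableAddress);
        return "redirect:/overview";
    }
}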

This all works fine, but in our case we use immutable objects by design, so we don't have any setters on our address class. The idea behind an immutable object is that once it is created it contains the correct values and cannot be changed anymore. In other words, Spring MVC's default binding influences the design of the objects we bind to, but there is a way to avoid that.

This is what the immutable version of our address class looks like:

import org.springframework.util.Assert;

public class ImmutableAddress {

    private final String street;
    private final String houseNumber;
    private final String postalCode;
    private final String city;

    public ImmutableAddress(String street, String houseNumber, String postalCode, String city) {
        Assert.hasText(street, "'street' must contain text");
        Assert.hasText(houseNumber, "'houseNumber' must contain text");
        Assert.hasText(postalCode, "'postalCode' must contain text");
        Assert.hasText(city, "'city' must contain text");

        this.street = street;
        this.houseNumber = houseNumber;
        this.postalCode = postalCode;
        this.city = city;
    }

    public String getStreet() {
        return street;
    }

   // All getters for the other fields. We don’t have setters!
}

The ImmutableAddress doesn’t contain a default constructor and can only be instantiated by passing in all parameters. Note that all fields are final and again we don’t have any setters! So out of the box using the ImmutableAddress in the method of our controller will not work.
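To make the "will not work" claim concrete, the sketch below checks the two things a setter-based binder typically relies on: a no-argument constructor and public setters. It is a simplified, plain-Java illustration and not Spring's actual binding code; `FinalAddress` is a trimmed stand-in for ImmutableAddress, and `BinderProbe` with its methods is a hypothetical name introduced here.

```java
import java.lang.reflect.Method;

// Trimmed, hypothetical stand-in for the ImmutableAddress class above.
final class FinalAddress {
    private final String street;
    private final String city;

    FinalAddress(String street, String city) {
        this.street = street;
        this.city = city;
    }

    public String getStreet() { return street; }
    public String getCity() { return city; }
}

// Hypothetical probe for the two prerequisites of setter-based binding.
final class BinderProbe {

    // True when the class declares a no-argument constructor.
    static boolean hasDefaultConstructor(Class<?> type) {
        try {
            type.getDeclaredConstructor();
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    // True when the class exposes at least one public single-argument setXxx method.
    static boolean hasSetters(Class<?> type) {
        for (Method method : type.getMethods()) {
            if (method.getName().startsWith("set") && method.getParameterTypes().length == 1) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println("default constructor: " + hasDefaultConstructor(FinalAddress.class));
        System.out.println("setters: " + hasSetters(FinalAddress.class));
    }
}
```

Both checks come back negative for the immutable class, which is why the values have to be handed to the constructor by our own code.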

Because we want to use ImmutableAddress directly in our controller method, we have to do the binding part ourselves. The best way to do this is to write a custom WebArgumentResolver.

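import javax.servlet.http.HttpServletRequest;
import org.springframework.core.MethodParameter;
import org.springframework.web.bind.support.WebArgumentResolver;
import org.springframework.web.context.request.NativeWebRequest;
import org.springframework.web.context.request.ServletWebRequest;

public class ImmutableAddressWebArgumentResolver implements WebArgumentResolver {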

    @Override
    public Object resolveArgument(MethodParameter methodParameter, NativeWebRequest webRequest) throws Exception {
        if (ImmutableAddress.class.equals(methodParameter.getParameterType())) {
            ServletWebRequest servletWebRequest = (ServletWebRequest) webRequest;
            HttpServletRequest request = servletWebRequest.getRequest();

            String street = request.getParameter("street");
            String houseNumber = request.getParameter("houseNumber");
            String postalCode = request.getParameter("postalCode");
            String city = request.getParameter("city");
            return new ImmutableAddress(street, houseNumber, postalCode, city);
        }

        return WebArgumentResolver.UNRESOLVED;
    }
}

As you can see, it's quite easy. The only thing to do is implement a single method called "resolveArgument". The implementation pulls the parameters out of the request and constructs a new ImmutableAddress, which is then returned. This only happens when an ImmutableAddress is used as a parameter in one of our controller methods. In all other cases our custom WebArgumentResolver can't resolve the argument, so WebArgumentResolver.UNRESOLVED is returned.
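The UNRESOLVED convention can be pictured as a chain in which each resolver either produces a value or passes. The following plain-Java sketch models that idea only; it is not how Spring implements it internally. `ArgumentResolver`, `StreetAddress`, `StreetAddressResolver` and `ResolverChain` are hypothetical names, and request parameters are represented by a simple map.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified counterpart of WebArgumentResolver (not a Spring type).
interface ArgumentResolver {
    Object UNRESOLVED = new Object(); // sentinel: "this resolver does not handle the type"

    Object resolve(Class<?> parameterType, Map<String, String> requestParams);
}

// Trimmed stand-in for ImmutableAddress.
final class StreetAddress {
    final String street;
    final String city;

    StreetAddress(String street, String city) {
        this.street = street;
        this.city = city;
    }
}

final class StreetAddressResolver implements ArgumentResolver {
    @Override
    public Object resolve(Class<?> parameterType, Map<String, String> requestParams) {
        if (!StreetAddress.class.equals(parameterType)) {
            return UNRESOLVED; // leave every other type to the next resolver
        }
        return new StreetAddress(requestParams.get("street"), requestParams.get("city"));
    }
}

final class ResolverChain {
    private final List<ArgumentResolver> resolvers;

    ResolverChain(ArgumentResolver... resolvers) {
        this.resolvers = Arrays.asList(resolvers);
    }

    // Returns the first value that is not UNRESOLVED, or UNRESOLVED if no resolver claims the type.
    Object resolve(Class<?> parameterType, Map<String, String> requestParams) {
        for (ArgumentResolver resolver : resolvers) {
            Object value = resolver.resolve(parameterType, requestParams);
            if (value != ArgumentResolver.UNRESOLVED) {
                return value;
            }
        }
        return ArgumentResolver.UNRESOLVED;
    }

    public static void main(String[] args) {
        Map<String, String> params = new HashMap<String, String>();
        params.put("street", "Damrak");
        params.put("city", "Amsterdam");
        ResolverChain chain = new ResolverChain(new StreetAddressResolver());
        System.out.println(chain.resolve(StreetAddress.class, params) instanceof StreetAddress);
    }
}
```

Comparing against the sentinel by identity lets a resolver legitimately return null as a resolved value while still being able to say "not mine".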

We have only one step left and that is to make sure our custom resolver kicks in before the default one of Spring MVC does.

Define our ImmutableAddressWebArgumentResolver as a Spring bean and register it on the handler adapter:

<!-- Our custom argumentResolver -->
<bean id="immutableAddressWebArgumentResolver"
      class="nl.dutchworks.blog.web.binding.ImmutableAddressWebArgumentResolver"/>

<!-- The bean id above must match the reference below; the "p" namespace has to be declared -->
<bean class="org.springframework.web.servlet.mvc.annotation.AnnotationMethodHandlerAdapter"
      p:customArgumentResolver-ref="immutableAddressWebArgumentResolver"/>

Now we can use the ImmutableAddress directly in our controller:

@RequestMapping(value = "/add-immutable", method = POST)
public String storeAddress(ImmutableAddress immutableAddress) {
    // immutableAddress was built by our ImmutableAddressWebArgumentResolver
    addressService.storeAddress(immutableAddress);
    return "redirect:/overview";
}

Make sure you don’t use:

<mvc:annotation-driven />

in your Spring context, because we configured the AnnotationMethodHandlerAdapter ourselves in order to set the customArgumentResolver.

At the time of writing, Spring 3.1 RC1 is available, which lets you use the mvc namespace to register your custom argument resolver:

<mvc:annotation-driven>
	<mvc:argument-resolvers>
		<bean class="nl.dutchworks.blog.web.binding.ImmutableAddressWebArgumentResolver"/>
	</mvc:argument-resolvers>
</mvc:annotation-driven>

My conclusion: there is no need to let Spring MVC decide how you should build your objects for binding. Spring MVC is flexible enough; just use WebArgumentResolvers!

To dive a little deeper into the code, download the complete example here.

Using JRebel with web fragments

November 26th, 2011 by
(http://blog.dutchworks.nl/2011/11/26/using-jrebel-with-web-fragments/)

Our team recently added JRebel to our toolbox, and we love it. We use Tomcat in our day-to-day development and getting rid of the annoying reboot to see the changes in code and resources was a big relief.

We have recently started to work on a new project which uses a Servlet 3.0 feature: web fragments. Web fragments allow you to modularize your deployment descriptor which means that a part of your web application can reside in a separate module along with its own descriptor.

With such a project structure we didn't get our JRebel configuration right from the beginning, but once you know how to do it, it seems extremely trivial. Nevertheless, I would like to share a working JRebel configuration for a project that includes web fragments.

Let's assume we have a Maven project structure including the main web application and a web fragment. We want JRebel to monitor changes to both classes and resources in our web fragment as well as in the main web application:

-- web-fragment
    -- src
        -- main
            -- java
            -- resources
                -- META-INF
                    -- web-fragment.xml
                    -- freemarker
                    -- resources
                        -- static
                            -- js
                            -- css
-- web
    -- src
        -- main
            -- java
            -- resources
            -- webapp
                -- static
                -- WEB-INF
                    -- web.xml
                    -- index.jsp

Thanks to Servlet 3.0 any resources under META-INF/resources are accessible from the root context of the web application. This needs some special attention in our JRebel configuration.

<?xml version="1.0" encoding="UTF-8"?>
<application xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://www.zeroturnaround.com"
             xsi:schemaLocation="http://www.zeroturnaround.com http://www.zeroturnaround.com/alderaan/rebel-2_0.xsd">

  <classpath>
      <dir name="/path/to/web/src/main/resources"/>
      <dir name="/path/to/web/target/classes"/>
      <dir name="/path/to/web-fragment/src/main/resources"/>
      <dir name="/path/to/web-fragment/target/classes"/>
  </classpath>

  <web>
      <link target="/">
          <dir name="/path/to/web/src/main/webapp">
              <exclude name="WEB-INF/lib/**"/>
          </dir>
      </link>
      <link target="static/">
          <dir name="/path/to/web-fragment/src/main/resources/META-INF/resources/static"/>
      </link>
  </web>
</application>

The web fragment's resources have to be defined using a "link" element with a target matching the directory name. This way you only need to hit refresh in the browser to see the changes in your JavaScript, CSS or other static resources.

The above configuration uses absolute paths to define the location of directories. If you would like to commit your configuration to your VCS, you can use a placeholder like ${project.root}, which every team member then has to define as a custom property in the JRebel Agent settings.

Information overload or information access?

November 16th, 2011 by
(http://blog.dutchworks.nl/2011/11/16/information-overload-or-information-access/)

Lots of information about World War 2 has been digitized into archives in the past few years by various organizations, such as the Dutch Royal Library and the Dutch Institute for War Documentation (NIOD). Now, how do you make this information useful and accessible for research? The Dutch newspaper NRC posed this challenge in an article on September 10 (http://www.nrc.nl/nieuws/2011/09/10/nederlandse-digitale-archieven-blijken-nauwelijks-bruikbaar/) and this is the challenge that faced us here at Dutchworks in the development of the website www.oorlogsbronnen.nl for NIOD. The solution we built includes use of standard metadata formats, standards for data harvesting, and an amazingly fast search engine with a powerful query language.

A standard metadata format

The data that is searchable on the website in question, www.oorlogsbronnen.nl, is called metadata. It's the data that describes the source objects: a title, a description, temporal and spatial indications (When was this picture taken? Where did this story take place?), labels, etc.

Combining metadata from different archives into a single metadata repository such as the Netwerk Oorlogsbronnen requires a standard metadata representation; otherwise searching through this data is like comparing apples and oranges: how can I search for a document with "resistance" in the title unless the title is the same field in each archive my documents come from?

The worldwide standard for storing metadata is Dublin Core, which defines standard fields such as title, subject, and description and dictates what they should contain. Luckily, all of the digital information sources used for Netwerk Oorlogsbronnen have digitized their metadata as Dublin Core, so this was the easy part for us.

A document's metadata stored as Dublin Core might look like this when represented as XML (example from Beeldbank WO2):

<record
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:identifier>
    http://www.bbwo2.nl/detail_no.jsp?action=detail&amp;imid=85893
  </dc:identifier>
  <dc:source>NIOD</dc:source>
  <dc:description>
    Al vanaf 1941 werden wapens ook gebruikt voor de liquidatie van
    voor het verzet gevaarlijke collaborateurs. Sommigen van hen
    waren erin geslaagd in illegale organisaties te infiltreren met
    het doel zoveel mogelijk verzetsdeelnemers in handen van de
    bezetter te spelen. In de eerste helft van 1943 leidde de
    liquidatie van enkele collaborateurs door gewapende verzetsmensen
    tot onenigheid binnen het verzet. Aanslag door het verzet op de
    commandant der Ornungspolizei Jac. Chr. Tetenburg op 31 maart
    1945 om 11.15 uur voor het politiebureau aan de Hoflaan te
    Rotterdam.
  </dc:description>
  <dc:date>31-03-1945 (Opname)</dc:date>
  <dc:subject>
    Aanslagen - Zie ook: Moordaanslagen, Represailles
  </dc:subject>
  <dc:subject>Georganiseerd verzet</dc:subject>
  <dc:subject>Lijken</dc:subject>
  <dc:subject>
    Verzet - Zie ook: Widerstand, Georganiseerd verzet, Illegaliteit
  </dc:subject>
  <dc:subject>Tetenburg, Maj.</dc:subject>
  <dc:coverage>Nederland</dc:coverage>
  <dc:coverage>Rotterdam</dc:coverage>
  <dc:type>IMAGE</dc:type>
  <dc:relation>http://.../85893-thumb.jpg?frskey=85893</dc:relation>
  <dcterm:provenance xmlns:dcterm="http://purl.org/dc/terms/">
    BBWO2
  </dcterm:provenance>
</record>

For non-Dutch speakers: excuse the Dutch example text above, you didn't miss any content!
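Because every archive uses the same Dublin Core namespace, a field such as the subject can be read the same way regardless of where the record came from. The sketch below shows this with the JDK's built-in DOM parser on a shortened copy of the record above. `DublinCoreReader` and its `values` method are hypothetical names for illustration, not part of the Netwerk Oorlogsbronnen code base.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

// Hypothetical helper that reads Dublin Core fields from a record.
final class DublinCoreReader {

    // Namespace of the Dublin Core element set, as used in the record above.
    static final String DC_NS = "http://purl.org/dc/elements/1.1/";

    // Returns the trimmed text of every dc:<field> element in the XML, in document order.
    static List<String> values(String xml, String field) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // needed to match elements by namespace
            DocumentBuilder builder = factory.newDocumentBuilder();
            Document document = builder.parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList nodes = document.getElementsByTagNameNS(DC_NS, field);
            List<String> result = new ArrayList<String>();
            for (int i = 0; i < nodes.getLength(); i++) {
                result.add(nodes.item(i).getTextContent().trim());
            }
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse record", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<record xmlns:dc=\"http://purl.org/dc/elements/1.1/\">"
                + "<dc:source>NIOD</dc:source>"
                + "<dc:subject>Georganiseerd verzet</dc:subject>"
                + "<dc:subject>Lijken</dc:subject>"
                + "<dc:coverage>Rotterdam</dc:coverage>"
                + "</record>";
        System.out.println(values(xml, "subject"));
    }
}
```

Matching on the namespace URI rather than on the "dc" prefix keeps the lookup working even when an archive chooses a different prefix for the same namespace.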

Let's continue with the difficult part: how do we get the data out of the various archives into our Netwerk Oorlogsbronnen metadata repository?

Standards for data harvesting

Retrieving metadata from a metadata repository is called "harvesting" in metadata terminology. The Open Archives Initiative (OAI) has defined a standard protocol for metadata harvesting over HTTP, very originally named PMH ("Protocol for Metadata Harvesting"). Dutchworks has built an OAI-PMH harvester that can retrieve documents from a metadata repository and copy them into our own metadata repository.
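As a rough illustration of what an OAI-PMH harvest looks like on the wire, the sketch below builds the two kinds of ListRecords requests the protocol defines: the initial request with a metadataPrefix, and follow-up requests that carry only the resumptionToken returned with the previous page. This is not the Dutchworks harvester; `OaiPmhRequests` and the base URL used in `main` are assumptions made for the example.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper that builds OAI-PMH ListRecords request URLs.
final class OaiPmhRequests {

    // First request of a harvest: ask for all records in the given metadata format.
    static String listRecords(String baseUrl, String metadataPrefix) {
        return baseUrl + "?verb=ListRecords&metadataPrefix=" + encode(metadataPrefix);
    }

    // Follow-up request: resumptionToken is an exclusive argument, so metadataPrefix is omitted.
    static String resume(String baseUrl, String resumptionToken) {
        return baseUrl + "?verb=ListRecords&resumptionToken=" + encode(resumptionToken);
    }

    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        String baseUrl = "http://archive.example.org/oai"; // placeholder endpoint
        System.out.println(listRecords(baseUrl, "oai_dc"));
        System.out.println(resume(baseUrl, "abc/123"));
    }
}
```

A harvester would keep calling the second form until a response arrives without a resumption token, which marks the end of the list.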

However, some digital archive systems are slow to adopt OAI-PMH, so we had to offer other options as well. In fact, at the moment the majority of our harvesting is done through the search interface of the various digital archives. That is, we run a search that matches every single document in the archive and get the data that way. Luckily, the standard REST API for metadata searching (SRU, "Search/Retrieval via URL", developed by the United States' Library of Congress) is much more widely adopted among the Dutch archives we're using than OAI-PMH is, and with these two options we can harvest the majority of archives.
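Harvesting through a search interface comes down to paging through one large result set. The sketch below builds the SRU searchRetrieve URLs for such a walk, using the standard operation, version, query, startRecord and maximumRecords parameters. `SruHarvestUrls`, the base URL and the CQL query in `main` are hypothetical; a real harvester would take the total from the numberOfRecords element of the first response instead of passing it in.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper that builds SRU searchRetrieve URLs for paging through a result set.
final class SruHarvestUrls {

    // One page of results; SRU record positions are 1-based.
    static String searchRetrieve(String baseUrl, String cqlQuery, int startRecord, int maximumRecords) {
        return baseUrl + "?operation=searchRetrieve&version=1.2"
                + "&query=" + encode(cqlQuery)
                + "&startRecord=" + startRecord
                + "&maximumRecords=" + maximumRecords;
    }

    // All page URLs needed to cover totalRecords results, pageSize records at a time.
    static List<String> pages(String baseUrl, String cqlQuery, int totalRecords, int pageSize) {
        List<String> urls = new ArrayList<String>();
        for (int start = 1; start <= totalRecords; start += pageSize) {
            urls.add(searchRetrieve(baseUrl, cqlQuery, start, pageSize));
        }
        return urls;
    }

    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        // Placeholder endpoint and query; 120 records in pages of 50 gives three requests.
        for (String url : pages("http://archive.example.org/sru", "title=verzet", 120, 50)) {
            System.out.println(url);
        }
    }
}
```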

For the remaining archives that support neither OAI-PMH nor SRU, we build custom file import solutions. Hopefully at some point these archives will adopt the standards and we can replace these custom solutions with our OAI-PMH or SRU harvester.

Search engine

Now that we've got all these documents in one place, we want to be able to search through them to find the information we are looking for. Apache Solr, an open source search engine, provides this functionality, and at an amazing speed.

Here's how we do it: we store each document in a Solr index. On top of this we have a REST-based search interface to search by keywords and operators like OR, AND, and NOT, as well as term proximity. Besides keyword search, we leverage Solr's faceting feature to break down results by data provider, collection, type (image, video, or text), and date. The user can then easily narrow down their search without ever running into a "No results found" dead end.
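To give an idea of what such a search looks like at the Solr level, the sketch below assembles a query string from Solr's standard q, facet, facet.field and fq parameters, plus a helper for the proximity syntax. It is an illustration only: `SolrSearchUrl` and the field names "provider" and "type" are assumptions, and the actual oorlogsbronnen.nl REST interface and index schema are not shown in this post.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical helper that builds the query string of a faceted Solr search.
final class SolrSearchUrl {

    // Lucene/Solr proximity syntax: both terms within maxDistance positions of each other.
    static String proximity(String first, String second, int maxDistance) {
        return "\"" + first + " " + second + "\"~" + maxDistance;
    }

    // q carries the keyword query, facet.field requests counts per value, fq narrows the result set.
    static String queryString(String query, List<String> facetFields, Map<String, String> filters) {
        StringBuilder sb = new StringBuilder("q=").append(encode(query)).append("&facet=true");
        for (String field : facetFields) {
            sb.append("&facet.field=").append(encode(field));
        }
        for (Map.Entry<String, String> filter : filters.entrySet()) {
            sb.append("&fq=").append(encode(filter.getKey() + ":" + filter.getValue()));
        }
        return sb.toString();
    }

    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        // Field names are assumed for the example, not taken from the real schema.
        System.out.println(queryString("verzet AND rotterdam",
                Arrays.asList("provider", "type"),
                Collections.singletonMap("type", "IMAGE")));
        System.out.println(proximity("verzet", "rotterdam", 5));
    }
}
```

Because the facet counts are computed for the current result set, a front-end can offer only filter values that still have matching documents, which is what keeps the user away from empty result pages.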

Zeezeilen & Overboord, our partners in the design and development of the PHP front-end of oorlogsbronnen.nl, use this REST API to search and retrieve documents from the repository and display them in a user-friendly layout. This way we have a clear separation of search logic and view logic which makes for a very maintainable system.

High level architecture

Where do we go from here?

The current Netwerk Oorlogsbronnen website is a solid foundation on which to build new features, and NIOD already has many requests on its wish list: adding more archives, making their digital documents easier to search for, adding search result highlighting, enabling searching within a time period, and much, much more. Dutchworks is excited to get involved in the next phase of this project and to continue making our Dutch heritage digitally accessible to the public.

For now, check it out for yourself and find out more about World War 2: www.oorlogsbronnen.nl

Retrospect: GOTO Amsterdam

October 19th, 2011 by
(http://blog.dutchworks.nl/2011/10/19/retrospect-goto-amsterdam/)

Last week Trifork and Dutchworks (f.k.a. JTeam) organized the GOTO Amsterdam conference for the first time, at the Krasnapolsky Hotel on Dam Square. We believe it was a great software development conference. It offered a great line-up of speakers and sessions. The level of the sessions was much higher than at any other conference in the Netherlands, and the crowd reflected that.

Read the rest of this entry »

Hot off the press, as of today “JTeam” will be known as “Dutchworks”!

October 10th, 2011 by
(http://blog.dutchworks.nl/2011/10/10/hot-off-the-press-as-of-today-%e2%80%9cjteam%e2%80%9d-will-be-known-as-%e2%80%9cdutchworks%e2%80%9d/)

Many of you probably know that JTeam has gone through quite the evolutionary process over the last two years; the combination of our two acquisitions (Func & Net Effect), and our strong autonomous growth has resulted in over 250% increase in revenue and number of Dutchworkers. Further, it has allowed us to develop a services portfolio that now covers the full project life cycle, which is of course great news for our customers. All in all, more than enough reason for a complete upgrade of our company name and logo!

Just to give you a little inside info: the name “Dutchworks” stems from our love for typical Dutch values such as thoroughness, innovation and entrepreneurship, all of which have played an important part in the success of JTeam ever since we began in 2002. As of today, you will find our new brand name, including a completely new website (www.dutchworks.nl), featured in all internal and external communication.

To avoid misunderstandings, we would like to emphasize that other than our name change, everything else stays the same: same folks, same everything…
Anyways, we’re excited about our new name and looks, and hope you are too!

Warm regards,
Steven Schuurman / CEO Dutchworks

SearchWorkings: Apache Solr - Grouping update

September 29th, 2011 by
(http://blog.dutchworks.nl/2011/09/29/searchworkings-apache-solr-grouping-update/)

Apache Solr's result grouping feature is now widely used. Its major drawback was that grouping was not supported for distributed searches, also known as sharding in Solr. The good news is that distributed grouping has recently been added to Solr! It has been added to the trunk and the stable branch (branch3x). This means that distributed grouping will be included in the upcoming Solr 3.5 and Solr 4.0 releases.

Read more on SearchWorkings.org