Category ‘Enterprise Search’

Fun combining Java, JavaScript and elastic.js within the elasticshell

April 11th, 2013 by
(http://blog.trifork.com/2013/04/11/fun-combining-java-javascript-and-elastic-js-within-the-elasticshell/)

I recently wrote a couple of articles about the elasticshell, the command line shell for Elasticsearch that I created. If you haven't heard about it, it's a json friendly command line tool that lets you quickly interact with Elasticsearch: you can easily index documents, execute queries and make use of all the APIs that Elasticsearch provides. It allows for more advanced use cases as well, since it exposes the power and flexibility of both JavaScript and Java. That's scary, isn't it? Let's see what this means...
Read the rest of this entry »

Introducing a Query tool as an Elasticsearch plugin (part 2)

March 20th, 2013 by
(http://blog.trifork.com/2013/03/20/introducing-a-query-tool-as-an-elasticsearch-plugin-part-2/)

In the first part of this series, Introducing a Query tool as an Elasticsearch plugin, I described the functionality of the query tool as well as the structure of the project. In this part I want to take a deeper dive into interacting with Elasticsearch.

The post consists of four parts:

  • Getting information about the Elasticsearch instance,
  • Constructing the query using the JavaScript QueryDSL,
  • Executing the query,
  • Using the results.

If you need more background on AngularJS, which is used on the client side, you can have a look at the following two posts:

  1. AngularJS lessons learned
  2. Basic Axon Framework sample using vert.x and angular.js

Read the rest of this entry »

AngularJS: Lessons learned

March 14th, 2013 by
(http://blog.trifork.com/2013/03/14/angularjs-lessons-learned/)

At Devoxx 2012 I attended the AngularJS presentation by Igor Minar and Misko Hevery. I was very enthusiastic about the capabilities of this front-end framework, so I started experimenting with it. I created a sample for the Axon Framework; read more about it here. After my experiments I felt confident enough to start using it in real projects. One of them was adding management reporting using the HighCharts library.

The next step was a bigger project: writing an Elasticsearch plugin to query your Elasticsearch instance. This project had to integrate with a JavaScript library to interact with Elasticsearch. The layout and other front-end components were implemented using Twitter Bootstrap, so I also used another AngularJS plugin to integrate with Bootstrap.

In this blog post I'll give you some lessons learned with respect to AngularJS.

Read the rest of this entry »

Introducing a Query tool as an Elasticsearch plugin (part 1)

March 12th, 2013 by
(http://blog.trifork.com/2013/03/12/introducing-a-query-tool-as-an-elasticsearch-plugin-part-1/)

In the past few weeks I have been working with Elasticsearch. I was missing a plugin to look at the data, create queries and evaluate different facets. That was when I decided to start working on a plugin that enables you to do just this.

I have been working with AngularJS together with Twitter Bootstrap, so the choice of technology was not a difficult one. I also used some additional libraries, but I'll tell you more about those later on.

Why did I put the part 1 in the title? I am going to split this information into two parts. This part deals with the setup of the plugin, the libraries I used and the functionality I implemented. Then the next part deals with providing more details about interacting with Elasticsearch; how I use the facets, create the queries, etc. I will also write down some lessons learned with respect to AngularJS in a later blog post.

Now let's move on and have a look at the features:

Read the rest of this entry »

Searching with the elasticshell

March 7th, 2013 by
(http://blog.trifork.com/2013/03/07/searching-with-the-elasticshell/)

So as promised here is a sequel to my previous post Introducing the elasticshell. Let's start exactly where we left off...

What about search?
We of course need to search against the created index. We can provide queries as either json documents or Java QueryBuilders provided with the elasticsearch Java API, which are exposed to the shell as they are.
Read the rest of this entry »

Introducing the elasticshell

March 6th, 2013 by
(http://blog.trifork.com/2013/03/06/introducing-the-elasticshell/)

A few days ago I released the first beta version of the elasticshell, a shell for elasticsearch. The idea I had was to create a command line tool that allows you to easily interact with elasticsearch.

Isn't elasticsearch easy enough already?
I really do think elasticsearch is already great and really easy to use. On the other hand, there is quite a lot of API available and quite a lot of json involved too. Also, interacting with REST APIs requires a tool other than the browser to use the proper http methods and so on. There are different solutions available: some of them are generic, like curl or browser plugins, while others are elasticsearch plugins like head or sense, which you can use to send json requests and see the result, still in json format. What was missing was a command line tool, something that plays the role of the mongo shell in the elasticsearch world. That's ambitious, isn't it?

In the meantime the es2unix tool has been released by Drew, a member of the elasticsearch team. The interesting approach taken there is to hide all the json and show only text in a nice tabular format, providing an executable command that makes it possible to pipe its output to other unix commands like grep, sort and awk. That's a great idea, and an even greater result I must say.

A json friendly environment
I decided to take another approach: provide an environment that makes it easier to play around with all that json. That's why I started writing a javascript shell, where json is native and it's relatively easy to provide auto-suggestions directly within json objects. I also wanted to use the elasticsearch Java API, which is complete, performant and powerful, even allowing you to fire up a new node if needed.
Read the rest of this entry »

Migrating Apache Solr to Elasticsearch

January 29th, 2013 by
(http://blog.trifork.com/2013/01/29/migrating-apache-solr-to-elasticsearch/)

Elasticsearch is the innovative and advanced open source distributed search engine, based on Apache Lucene. Over the past several years, at Trifork we have been doing a lot of search implementations. Driven by the fact that every other customer wanted the 'Google-experience' (just a text box, type some text and get relevant results) as part of their application, we started by building our own solutions on top of Apache Lucene. That worked quite well, as Lucene is the de facto standard when it comes to information retrieval. But soon enough, due to Amazon, CNet and Funda in The Netherlands, people wanted to offer their users more ways to drill down into the search results by using facets. We briefly started our own (currently discontinued) open source project, FacetSearch, but Solr quickly started getting some traction and we decided to jump on that bandwagon.

Starting with Solr

That is when we started using Solr for our projects and became vocal about our capabilities, which led to even more (international) Solr consultancy and training work. And as Trifork is not in the game just to use open source but also to contribute back to the community, this has led to several contributions (spatial, grouping, etc.) and eventually to having several committers on the Lucene (now including Solr) project.

We go back a long way...

At the same time as we were well into Solr, Shay Banon, whom we knew from our SpringSource days, started creating his own scalable search solution, Elasticsearch. Although it was, from a technical perspective, a better choice for building scalable search solutions, we didn't adopt it from the beginning. The main reason for this was that it was basically a one-man show (a very good one at that, I might add!). We didn't feel comfortable recommending Elasticsearch to our customers, because if Shay got hit by a bus it would mean the end of the project. Luckily, all this changed when Shay and some of the old crew from JTeam (the rest of JTeam is now Trifork Amsterdam) decided to join forces and launch Elasticsearch.com, the commercial company behind Elasticsearch. Now it's all systems go: what was then our main hurdle has been removed, and we can use Elasticsearch and moreover guarantee continuity for the project.

Switching from Solr to Elasticsearch

Obviously we are not alone in the world and not that unique in our opinions, so we were not the only ones to change our strategy around search solutions. Many others started considering Elasticsearch, doing comparisons and eventually switching from Solr to Elasticsearch. We still regularly get requests to help companies make the comparison. And although there are still reasons why you may want to go for Solr, in the majority of cases (especially when scalability and realtime search are important) the balance goes in favor of Elasticsearch.

This is why Luca Cavanna from Trifork has written a plugin (river) for Elasticsearch that will help you migrate from your existing Solr to Elasticsearch. Basically, it pulls the content from an existing Solr cluster and indexes it in Elasticsearch. Using this plugin allows you to easily set up an Elasticsearch cluster next to your existing Solr. This will help you get up to speed quickly and therefore enables a smooth transition. The tool is meant mostly for that purpose: to help you get started. When you decide to switch to Elasticsearch permanently, you would switch your indexing to index content directly from your sources into Elasticsearch. Keeping Solr in the middle is not a recommended setup.
The following description of how to use it is taken from the README.md file of the Solr to Elasticsearch river / plugin.

Getting started

The first thing you need to do is download the plugin.

Then create a directory called solr-river in the plugins folder of Elasticsearch (create the plugins folder in the elasticsearch home folder if it does not exist yet). Next, unzip the ZIP file and put its contents (all the JAR files) in the created folder.

Configure the river

The Solr River lets you query a running Solr instance and index the returned documents in elasticsearch. It uses the Solrj library to communicate with Solr.

It's recommended that the solrj version used is the same as the solr version installed on the server that the river is querying. The Solrj version in use and distributed with the plugin is 3.6.1. It is still possible to query other Solr versions: the default format used is javabin, but you can solve compatibility issues by switching to the xml format using the wt parameter.
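
As a minimal sketch of that format switch, the river definition below sets wt to xml. The river name solr_river_xml is hypothetical, and the hosts and ports are assumed local defaults; adjust them to your own setup.

```shell
# Sketch only: river definition asking Solr for xml responses instead of javabin.
# The Solr URL is an assumption (a local Solr on its usual port).
river_config='{
    "type" : "solr",
    "solr" : {
        "url" : "http://localhost:8983/solr/",
        "q" : "*:*",
        "wt" : "xml"
    }
}'

# To register it against a running node with the plugin installed (not executed here);
# "solr_river_xml" is a hypothetical river name:
# curl -XPUT localhost:9200/_river/solr_river_xml/_meta -d "$river_config"
```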

All the common query parameters are supported.

The solr river is not meant to keep solr and elasticsearch in sync. That's why it automatically deletes itself on completion, so that the river doesn't start up again at every node restart. This is the default behaviour, which can be disabled through the close_on_completion parameter.
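
A sketch of a river that is kept after the import finishes might look like the following. It assumes the parameter accepts "false" in the same quoted form the README uses for its default value; the river name solr_river_keep and the Solr URL are hypothetical.

```shell
# Sketch only: keep the river registered after the import completes.
# Assumption: "false" is accepted in the same quoted string form as the default "true".
river_config='{
    "type" : "solr",
    "close_on_completion" : "false",
    "solr" : {
        "url" : "http://localhost:8983/solr/",
        "q" : "*:*"
    }
}'

# To register it against a running node (not executed here);
# "solr_river_keep" is a hypothetical river name:
# curl -XPUT localhost:9200/_river/solr_river_keep/_meta -d "$river_config"
```

Note that with this setting the river would start up again at every node restart, which is exactly what the default avoids.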

Installation

Here is how you can easily create the river and index data from Solr, by providing just the solr url and the query to execute:

curl -XPUT localhost:9200/_river/solr_river/_meta -d '
{
    "type" : "solr",
    "solr" : {
        "url" : "http://localhost:8080/solr/",
        "q" : "*:*"
    }
}'

All supported parameters are optional. The following example request contains all the parameters that are supported together with the corresponding default values applied when not present.

{
    "type" : "solr",
    "close_on_completion" : "true",
    "solr" : {
        "url" : "http://localhost:8983/solr/",
        "q" : "*:*",
        "fq" : "",
        "fl" : "",
        "wt" : "javabin",
        "qt" : "",
        "uniqueKey" : "id",
        "rows" : 10
    },
    "index" : {
        "index" : "solr",
        "type" : "import",
        "bulk_size" : 100,
        "max_concurrent_bulk" : 10,
        "mapping" : "",
        "settings": ""
    }
}

The fq and fl parameters can be provided as either an array or a single value.
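
For illustration, the array form could look like this. The field names and filter values (category, inStock, id, title, author) are hypothetical and would have to match your own Solr schema, as would the Solr URL.

```shell
# Sketch only: fq and fl given as arrays.
# All field names and filter values below are hypothetical.
river_config='{
    "type" : "solr",
    "solr" : {
        "url" : "http://localhost:8983/solr/",
        "q" : "*:*",
        "fq" : ["category:books", "inStock:true"],
        "fl" : ["id", "title", "author"]
    }
}'

# To register it against a running node (not executed here);
# "solr_river_books" is a hypothetical river name:
# curl -XPUT localhost:9200/_river/solr_river_books/_meta -d "$river_config"
```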

You can provide your own mapping while creating the river, as well as the index settings, which will be used when creating the new index if needed.

The index is created if it does not already exist; otherwise the documents are added to the existing index with the configured name.

The documents are indexed using the bulk api. You can control the size of each bulk (default 100) and the maximum number of concurrent bulk operations (default is 10). Once the limit is reached the indexing will slow down, waiting for one of the bulk operations to finish its work; no documents will be lost.

Limitations

  • only stored fields can be retrieved from Solr, and therefore indexed in elasticsearch
  • the river is not meant to keep elasticsearch in sync with Solr, but only to import data once. It's possible to register the river multiple times in order to import different sets of documents though, even from different solr instances.
  • it's recommended to create the mapping given the existing solr schema in order to apply the correct text analysis while importing the documents. In the future there might be an option to auto-generate it from the Solr schema.
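
Registering the river several times, as mentioned in the limitations above, can be sketched with a small helper that prints one definition per source. The helper river_config, the river names, the second host and the type field used in the queries are all hypothetical.

```shell
# Prints a river definition for one Solr source.
# $1 = Solr base URL, $2 = Solr query selecting the documents to import.
# Values are inserted verbatim, so they must not contain double quotes.
river_config() {
    cat <<EOF
{
    "type" : "solr",
    "solr" : {
        "url" : "$1",
        "q" : "$2"
    }
}
EOF
}

# Two hypothetical sources, each importing a different set of documents.
articles=$(river_config "http://localhost:8983/solr/" "type:article")
products=$(river_config "http://otherhost:8983/solr/" "type:product")

# To register them against a running node (not executed here), one river name each:
# curl -XPUT localhost:9200/_river/solr_river_articles/_meta -d "$articles"
# curl -XPUT localhost:9200/_river/solr_river_products/_meta -d "$products"
```

Since neither definition has an index section, both imports would presumably go to the default index and type listed in the README; add an index section to send them elsewhere.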

We hope the tool helps. Do share your feedback with us: we're always interested to hear how it worked out for you. And shout if we can help further with training or consultancy.

Enterprise search with Solr and Elasticsearch @ Hippo Meetup

January 24th, 2013 by
(http://blog.trifork.com/2013/01/24/enterprise-search-with-solr-and-elasticsearch-hippo-meetup/)

At a recent Hippo meetup I gave a presentation about enterprise search. Being able to index and search your content, both in the Hippo CMS and in other sources, is of interest to many Hippo users. The presentation does not go into any Hippo specifics. It gives a brief introduction to search, Apache Lucene and concepts like an inverted index, and then quickly moves on to the two main enterprise (open source) search servers: Apache Solr and Elasticsearch.

Check out my slides for this talk on SlideShare:
http://www.slideshare.net/lucacavanna/hippo-searchmeetup

How to write an elasticsearch river plugin

January 10th, 2013 by
(http://blog.trifork.com/2013/01/10/how-to-write-an-elasticsearch-river-plugin/)

Up until now I have told you why I think elasticsearch is so cool and how you can use it combined with Spring. It’s now time to get to something a little more technical. For example, once you have a search engine running you need to index data; when it comes to indexing data you usually need to choose between the push and the pull approach. This blog entry will detail these approaches and go into writing a river plugin for elasticsearch.

Read the rest of this entry »

November newsletter

November 14th, 2012 by
(http://blog.trifork.com/2012/11/14/november-newsletter/)

Greetings from Trifork Amsterdam 
It’s all about finding the perfect match in life. Trifork A/S (a leading Danish software company) has found just that in us, here in the Netherlands. As of November 1st we’ve launched as Trifork Amsterdam. Rest assured, we’ll still focus on the technology as we know it best, the entire team will remain intact and we’ll continue to do more of what we do best but even better ;-) . If you want to know more, visit our website.

JFall Winners
For those of you who visited us at JFall this year, you'll probably have done our "Get to know us" questionnaire. We're pleased to say that pretty much everyone got top marks (even if you did get a little help from us every now & then!).

However the lucky winners of the great prizes are as follows:

  • 1st prize: ticket for GOTO Amsterdam, June 2013, goes to Erwin de Gier
  • 2nd prize: 2 x tickets for NoSQL roadshow Amsterdam, November 2012, go to Peter Glas & Michel Schudel
  • 3rd prize: techy books for 15 runners-up (will be contacted individually!)

Congrats to the winners and thanks again to all the participants.

Don’t worry though: for both NLJUG members and our newsletter readers we also offer great group & sponsorship deals for both events. For more details, contact Daphne Keislair on daphne.keislair@trifork.nl or call +31 6 272 94 119 to discuss the possibilities.

Want to know more on NoSQL?

Just so you know, the amazing early bird rate of 200 EUR for the NoSQL roadshow Amsterdam is still open until 16th November. It’s a unique opportunity to gain insights from some renowned speakers around MongoDB, Riak, Hadoop, Cassandra, Neo4J and much more. At the end of the session you’ll have no questions left around the NoSQL space, that’s for sure, so sign up here or contact us for more information. If you cannot make it to Amsterdam, a week later the roadshow will be in London too.

Meet us at Devoxx

If you can’t make it, either sign up for our FREE MongoDB Brown Bag session or, if you’re going to Devoxx, visit us at stand number 11 (we're there with 10Gen).

It’s not just all about coding: tech meeting December

Of course we are technology geeks but we also look beyond the code too. This month our two sessions will cover:

  1. Web Application Security, an introduction into how to ensure your web application doesn't make it to the news (for the wrong reasons!) in 2013, providing you with some basic insights and tips & tools on how to secure your website
  2. Agility Beyond Campfire Romance, an interesting and somewhat controversial matter when it comes to how best to manage development projects.

Sign up now and join us on 6th December. If you cannot make it to the tech meetings, we will have the presentations available for download on our website after the sessions.

Commitment to the community
In one of our recent blog posts we already mentioned that both Trifork & GOTO are committed to working closely with the community. This month we will host the Elasticsearch meetup covering the cool use of the product in the new website for the Rijksmuseum. It's going to be a great session, so sign up now. Also, if you need a location for your meetup (even in one of our other Trifork offices), we're happy to help if we can; just drop us a note with your request.

Blog blog blogging

We've published a few blog posts recently, including:

Spring Insight plugin for the Axon CQRS framework

Agile Campfire Romance

NoSQL roadshow Amsterdam