ElasticSearch vs. Solr - New Promise or Good Old

Elasticsearch vs. Solr explained: let's set the record straight. Find the ideal fit for your needs by understanding where the two differ.

The discussion goes on and on, yet the question remains: which should you choose to build a fast, feature-rich, real-time search application on top of Apache Lucene, Solr or Elasticsearch? Both are open source projects, so they are free to download and all that good stuff. Both have a nice set of features that make development and maintenance easier, such as:

  • Full-text search
  • Faceting
  • Highlighting
  • Geo-spatial search
  • Replication

This article covers the differences between Solr and Elasticsearch based on a few of the criteria that matter most today, and describes the best scenario for using each one.

Performance

First of all, let’s get some things straight: Solr is fast. Solr became a standard among search engines for a reason. It’s stable and reliable, and for basic searches it outperforms nearly every search solution except Elasticsearch. Yet all it takes to break this powerful search engine is to search while concurrently updating the index with new content. Throw a few million documents into the index and Solr will seriously struggle, while Elasticsearch still performs without a hitch. This becomes a serious problem if you need to update your search index regularly. Solr just was not meant for real-time big-data search applications. Today's web applications demand that new user-generated content be indexed in real time. The distributed nature of Elasticsearch allows it to keep up with concurrent search and index requests without skipping a beat.

ElasticSearch over Solr

1. Distributed Search/Cloud-ready

The major area where Elasticsearch takes the stage is distributed search. Elasticsearch, unlike Solr, was built with distribution in mind, to be EC2-friendly. What this actually means is that Elasticsearch runs a search index on multiple servers in a fail-safe and efficient way. And that’s quite a challenge. Distributed systems are, in general, hard to program, but when done correctly such a system is resilient to failures and attacks and degrades gracefully.
Elasticsearch allows you to break indices into shards, each with one or more replicas. The shards are hosted on data nodes within the cluster, which delegates operations to the correct shards, with rebalancing and routing done automatically. This ensures that even in the case of a catastrophic hardware or software failure, the chances of your search server going completely offline are close to none. Elasticsearch provides cloud support for Amazon S3, as well as GigaSpaces, Coherence and Terracotta.
Even though some steps to make Solr cloud-ready have been taken, its initial architecture and design do not include it, so it will take more time to get Solr where Elasticsearch is out-of-the-box.
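
To make the shard and replica layout described above more concrete, here is a minimal Python sketch. It builds the settings body you would send when creating an index and shows the general idea of routing a document to a primary shard by hashing a routing key. The function names are hypothetical, and the CRC32 hash is only a stand-in chosen for this sketch; Elasticsearch uses its own hash function internally.

```python
import json
import zlib


def index_settings(primary_shards: int, replicas: int) -> dict:
    """Build a create-index body with the shard and replica counts."""
    return {
        "settings": {
            "number_of_shards": primary_shards,
            "number_of_replicas": replicas,
        }
    }


def total_shard_copies(primary_shards: int, replicas: int) -> int:
    """Each primary shard has `replicas` extra copies spread over the cluster."""
    return primary_shards * (1 + replicas)


def pick_shard(routing_key: str, primary_shards: int) -> int:
    """Map a routing key (for example, a document id) to a primary shard.

    zlib.crc32 is an assumption made for this sketch, not the hash
    Elasticsearch actually uses.
    """
    return zlib.crc32(routing_key.encode("utf-8")) % primary_shards


if __name__ == "__main__":
    print(json.dumps(index_settings(primary_shards=3, replicas=1), indent=2))
    for doc_id in ("doc-1", "doc-2", "doc-3"):
        print(doc_id, "-> shard", pick_shard(doc_id, 3))
```

With three primary shards and one replica each, the cluster holds six shard copies in total, so losing a single node does not have to take any part of the index offline.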

2. Real-time search

Elasticsearch is real-time and distributed: you can specify the refresh delay via the API. It also supports percolation, an innovative search model similar to webhooks. The idea behind it is that you register your filters up front and Elasticsearch tells you when a new document matches them, so your application does not have to keep polling the search engine for updates. Elasticsearch has a default refresh interval of one second, so a document becomes searchable within about a second of being indexed.
This is the perfect architecture for real-time search.
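
The percolation idea can be illustrated with a toy, in-memory Python sketch: queries are stored first, and each new document is checked against them. The `ToyPercolator` class and its term-matching rule are hypothetical simplifications for illustration and are not the Elasticsearch API. The `REFRESH_SETTINGS` dictionary shows how the one-second refresh interval is typically expressed as an index setting.

```python
from typing import Dict, List, Set

# Index setting that controls how soon new documents become searchable.
REFRESH_SETTINGS = {"index": {"refresh_interval": "1s"}}


class ToyPercolator:
    """In-memory stand-in for percolation: store queries, then match documents."""

    def __init__(self) -> None:
        self._queries: Dict[str, Set[str]] = {}

    def register(self, query_id: str, required_terms: List[str]) -> None:
        """Store a query that matches documents containing all given terms."""
        self._queries[query_id] = {term.lower() for term in required_terms}

    def percolate(self, document_text: str) -> List[str]:
        """Return the ids of stored queries that the new document matches."""
        words = set(document_text.lower().split())
        return sorted(
            query_id
            for query_id, terms in self._queries.items()
            if terms <= words  # every required term appears in the document
        )


if __name__ == "__main__":
    percolator = ToyPercolator()
    percolator.register("solr-news", ["solr"])
    percolator.register("es-release", ["elasticsearch", "release"])
    print(percolator.percolate("New Elasticsearch release announced"))
```

The design point is the inversion: instead of running a query against stored documents, a document is run against stored queries, which is what removes the need for polling.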

3. JSON-based API

The Elasticsearch API is clean and easy to use. You can build a modern application on it, and its JSON query language provides a more powerful and useful abstraction for querying documents. Elasticsearch is more accessible and pleasant to interact with than Solr. Less configuration and sensible defaults make it much more user-friendly. No schema is required, which means you can start indexing content right away. You can still use a mapping to define your index structure, which Elasticsearch applies when new indices are created.
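
As a rough illustration of the JSON-based API, the Python sketch below builds a full-text `match` query and an optional mapping as plain dictionaries and serializes them to JSON. The field names `title` and `published` and the helper names are hypothetical, and the mapping layout assumes a recent Elasticsearch version; older versions nest the properties under a type name.

```python
import json


def match_query(field: str, text: str, size: int = 10) -> dict:
    """Build a full-text search body using the `match` query."""
    return {"query": {"match": {field: text}}, "size": size}


def example_mapping() -> dict:
    """Optional explicit mapping; without it, field types are inferred."""
    return {
        "mappings": {
            "properties": {
                "title": {"type": "text"},
                "published": {"type": "date"},
            }
        }
    }


if __name__ == "__main__":
    # These bodies would be sent over HTTP to a search or create-index endpoint.
    print(json.dumps(match_query("title", "real-time search"), indent=2))
    print(json.dumps(example_mapping(), indent=2))
```

Because requests and responses are ordinary JSON, any language with an HTTP client and a JSON library can talk to the search server without a dedicated driver.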

Solr over ElasticSearch

1. Community

Solr has a mature community, and this should be a major criterion when deciding which product, Elasticsearch or Solr, to use as the base for your application. Solr has a number of very active contributors, which indicates that it’s a stable and trustworthy search engine. But that’s not to say that Elasticsearch is far behind. Although quite young, its community is expanding rapidly.

2. Extensive Documentation

Solr is well documented, with the necessary context and examples of how different APIs and components are used. Documentation for Elasticsearch is slightly better organized, but it lacks good working examples and configuration instructions.

Final Verdict

Both are Lucene-based applications and both are open source. Solr is your search server for standard search applications where no massive indexing and no real-time updates are required. Elasticsearch's architecture is on a whole new level, aimed at building modern real-time search applications. If you want distributed indexing, you need to choose Elasticsearch. Elasticsearch is the only true option for cloud and distributed environments. Elasticsearch is scalable, lightning fast and a breeze to integrate with. Its API is more intuitive and accessible than Solr’s. Less configuration and sensible defaults let you get the project into production very quickly.
