
Building a powerful search engine powered by Elasticsearch and Rails (part 1)

For years, Elasticsearch has powered the search feature of SupportBee. Well, it just got better, a lot better! The whole process was a very instructive experience, so now I’m going to relive it. This tutorial is meant to help you build a powerful search engine for your project. It digs into setting up a new ES instance, mapping your ActiveRecord model and importing tons of data (we have around 25GB). I’ll do my best to keep it concise.

In this part I’m going to lead you through the process of configuring a new Elasticsearch server instance. Have fun!

Setting up ES in a new Linux server

Let’s start by installing Shorewall, a firewall configuration tool:

apt-get update
apt-get install shorewall
systemctl enable shorewall.service
service shorewall start

You can read more about setting up Shorewall in this article.

Great! That’s an important step toward hardening our server. Now install the latest version of Java 8:

add-apt-repository ppa:webupd8team/java
apt-get update
apt-get install oracle-java8-installer

Check which Java version is the default in your system and update it if needed.

java -version
update-alternatives --config java
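If you’d rather script this check, here is a minimal sketch. It assumes `java -version` prints a banner line like `java version "1.8.0_101"` (it goes to stderr); the function name `java_major_version` is made up for this example.

```shell
# Print the major Java version found in a `java -version` banner read from stdin.
# Old-style versions ("1.8.0_101") report 8; new-style ones ("11.0.2") report 11.
java_major_version() {
  sed -n 's/.*version "\([0-9][0-9._]*\).*/\1/p' | head -n 1 |
    awk -F'[._]' '{ if ($1 == 1) print $2; else print $1 }'
}

# Example (java -version writes to stderr, hence 2>&1):
#   java -version 2>&1 | java_major_version
```

If the number printed is not 8, `update-alternatives --config java` is the place to switch the default.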

Awesome! Now we can install Elasticsearch at last:

wget https://download.elastic.co/elasticsearch/release/org/elasticsearch/distribution/deb/elasticsearch/2.4.0/elasticsearch-2.4.0.deb
dpkg -i elasticsearch-2.4.0.deb
systemctl enable elasticsearch.service
service elasticsearch start

The file /etc/elasticsearch/elasticsearch.yml contains configuration settings you might want to update, such as cluster.name and node.name.
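The packaged config file is typically full of commented-out examples, so it can be hard to see which settings are actually in effect. A small sketch that prints only the active lines (the helper name `active_settings` is my own, not something Elasticsearch ships):

```shell
# Print only the active (non-comment, non-blank) lines of a config file.
active_settings() {
  grep -Ev '^[[:space:]]*(#|$)' "$1"
}

# Example:
#   active_settings /etc/elasticsearch/elasticsearch.yml
```

Run it after editing the file to confirm that your cluster.name and node.name lines are no longer commented out.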

You can test if Elasticsearch is up and running with cURL:

curl http://localhost:9200
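Elasticsearch may need a few seconds after the service starts before it accepts connections, so the first cURL call can fail with "connection refused". A minimal sketch of a retry helper for that case (the name `retry` and its argument order are assumptions of this example):

```shell
# Run a command until it succeeds: up to $1 attempts, sleeping $2 seconds between tries.
# Returns 0 as soon as the command succeeds, 1 if every attempt fails.
retry() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while ! "$@"; do
    [ "$i" -ge "$attempts" ] && return 1
    i=$((i + 1))
    sleep "$delay"
  done
  return 0
}

# Example: wait up to roughly 30 seconds for Elasticsearch to answer.
#   retry 30 1 curl -sf -o /dev/null http://localhost:9200
```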

So far so good! There’s one more step we want to take: wrap Elasticsearch in an Nginx proxy. This blog post provides insight into the configuration and the benefits of doing this. I recommend you read it.

So, install Nginx.

apt-get install nginx
systemctl enable nginx.service
service nginx start

Let’s make Nginx keep persistent HTTP connections to Elasticsearch. Open /etc/nginx/nginx.conf and edit it so it looks like this:

...
events {
  worker_connections 1024;
}

http {
  upstream elasticsearch {
    server 127.0.0.1:9200;
    keepalive 15;
  }

  server {
    listen 8080;

    location / {
      proxy_pass http://elasticsearch;
      proxy_http_version 1.1;
      proxy_set_header Connection "Keep-Alive";
      proxy_set_header Proxy-Connection "Keep-Alive";
      proxy_redirect off;
    }
  }
}

Let’s test our configuration with cURL:

service nginx reload
curl http://localhost:8080
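A typo in nginx.conf makes the reload fail, so it can help to validate the file first. `nginx -t` checks the configuration without applying it. A small sketch that chains the two (the function name `reload_nginx_checked` is made up here):

```shell
# Reload Nginx only if the configuration parses cleanly.
reload_nginx_checked() {
  nginx -t && service nginx reload
}

# Example:
#   reload_nginx_checked && curl http://localhost:8080
```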

Sweet! We also want Nginx to take care of authentication to our Elasticsearch server:

...
events {
  worker_connections 1024;
}

http {
  upstream elasticsearch {
    server 127.0.0.1:9200;
    keepalive 15;
  }

  server {
    listen 8080;

    auth_basic "Protected Elasticsearch";
    auth_basic_user_file passwords;

    location / {
      proxy_pass http://elasticsearch;
      proxy_http_version 1.1;
      proxy_set_header Connection "Keep-Alive";
      proxy_set_header Proxy-Connection "Keep-Alive";
      proxy_redirect off;
    }
  }
}

Let’s test it again:

service nginx reload
curl -I http://localhost:8080

It responds with status 401 Unauthorized. Let’s generate credentials to access our server and store them in the passwords file:

printf "demo:$(openssl passwd -crypt XXX-XXX)\n" >> /etc/nginx/passwords
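One caveat: the `-crypt` scheme only looks at the first 8 characters of the password, and as far as I know newer OpenSSL releases have dropped the flag. Nginx also accepts the apr1 (Apache MD5) format, so here is a hedged alternative using `openssl passwd -apr1`. The helper name `htpasswd_line` is my own invention.

```shell
# Print an htpasswd-style line "user:hash" using the apr1 (Apache MD5) scheme.
# $1 = user name, $2 = password (note: it is visible in the process list while this runs).
htpasswd_line() {
  printf '%s:%s\n' "$1" "$(openssl passwd -apr1 "$2")"
}

# Example:
#   htpasswd_line demo XXX-XXX >> /etc/nginx/passwords
```

Using `printf '%s:%s\n'` rather than interpolating into the format string also keeps a password containing `%` or `\` from being mangled.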

And try again:

curl http://demo:XXX-XXX@localhost:8080
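To compare the two cases from a script, the status code can be pulled out of the `curl -I` output. A minimal sketch (the helper name `http_status` is hypothetical):

```shell
# Read HTTP response headers on stdin and print the status code from the first line.
http_status() {
  awk 'NR == 1 { print $2 }'
}

# Examples:
#   curl -sI http://localhost:8080 | http_status
#   curl -sI http://demo:XXX-XXX@localhost:8080 | http_status
```

With the setup above, the first call should print 401, and the second should print a success code once the credentials are in the passwords file.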

Good. Now let’s configure SSL:

...
events {
  worker_connections 1024;
}

http {
  upstream elasticsearch {
    server 127.0.0.1:9200;
    keepalive 15;
  }

  server {
    listen 443 ssl http2 default_server;
    listen [::]:443 ssl http2 default_server;

    ssl_certificate /etc/nginx/ssl/nginx.crt;
    ssl_certificate_key /etc/nginx/ssl/nginx.key;

    auth_basic "Protected Elasticsearch";
    auth_basic_user_file passwords;

    location / {
      proxy_pass http://elasticsearch;
      proxy_http_version 1.1;
      proxy_set_header Connection "Keep-Alive";
      proxy_set_header Proxy-Connection "Keep-Alive";
      proxy_redirect off;
    }
  }
}

This blog post has detailed instructions on how to generate the SSL certificate. For production you might want to buy one; you can get one for as little as $9/yr.
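For a quick test setup, a self-signed certificate is enough. Here is a rough sketch with `openssl req`, not necessarily the exact steps from the linked post. The function name, the 365-day validity, the 2048-bit RSA key and the host name `search.example.com` are all assumptions of this example; the file names match the paths used in the Nginx configuration above.

```shell
# Create a self-signed certificate and key (valid 365 days) in directory $1 for host name $2.
make_self_signed_cert() {
  mkdir -p "$1" &&
    openssl req -x509 -nodes -newkey rsa:2048 -days 365 \
      -subj "/CN=$2" \
      -keyout "$1/nginx.key" -out "$1/nginx.crt"
}

# Example:
#   make_self_signed_cert /etc/nginx/ssl search.example.com
```

Clients will not trust a self-signed certificate by default, which is why the cURL call below uses `-k`.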

Let’s restart Nginx and try again:

service nginx restart
curl -k https://demo:XXX-XXX@localhost

Voilà! Our Elasticsearch server is up and running.

Remember Shorewall? Make sure port 9200 is blocked and port 443 is open only to the hosts that need access.

What comes next?

In the next part, we’re going to create a Rails project and an ActiveRecord model, make our model play nicely with Elasticsearch and provide an endpoint to search that model. Stay tuned.

