# Getting started

Just use Elasticsearch Docker, ya dingus.

# The Index

There can be only one document type per index. The index has a schema (the mapping) for the properties of the documents stored in it. So, let's say the document type is a blog post; the schema then defines properties like author name, text and category.

The document can be any structured data with any schema you can come up with.

# Mapping

# Field types

String (as text or keyword), byte, short, integer, long, float, double, boolean, date

  • Keyword type

    Does not use an analyzer, so searches need to be an exact match, including case.

  • Text type

    Is run through an analyzer, which can be configured.
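To make the keyword/text split concrete, here is a rough sketch of an index body for the blog post example, written as a Python dict. The field names (author, category, text, published) and the choice of the english analyzer are assumptions for illustration, not something a real index has to use.

```python
import json

# Hypothetical index body for the blog post example; field names are assumptions.
blog_index_body = {
    "mappings": {
        "properties": {
            "author": {"type": "keyword"},    # exact match only, case sensitive
            "category": {"type": "keyword"},  # exact match only, case sensitive
            "text": {"type": "text", "analyzer": "english"},  # analyzed full text
            "published": {"type": "date"},
        }
    }
}


def fields_of_type(index_body, field_type):
    """Return the sorted names of all fields mapped with the given type."""
    properties = index_body["mappings"]["properties"]
    return sorted(name for name, spec in properties.items() if spec["type"] == field_type)


# This JSON would be the body of the request that creates the index.
print(json.dumps(blog_index_body, indent=2))
print(fields_of_type(blog_index_body, "keyword"))
```

Author and category are keyword here because you usually filter on them exactly; the post body is text so it can be searched word by word.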

# Analyzers

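  • simple, standard, language-specific, whitespace, etc. For example, the english analyzer applies stemming and stop-word removal; synonyms need an extra synonym token filter.
  • Character filters can strip HTML markup and decode HTML entities before the text is tokenized (html_strip).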
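A sketch of how these pieces might be wired together: index settings that define a custom analyzer, plus a request body for the _analyze API to see what tokens an analyzer produces. The analyzer name html_text and the sample sentence are made up for this example.

```python
import json

# Index settings with a custom analyzer. "html_text" is a hypothetical name.
analysis_settings = {
    "settings": {
        "analysis": {
            "analyzer": {
                "html_text": {
                    "type": "custom",
                    "char_filter": ["html_strip"],    # strip HTML before tokenizing
                    "tokenizer": "standard",
                    "filter": ["lowercase", "stop"],  # lowercase, then drop stop words
                }
            }
        }
    }
}

# Body for a POST to /_analyze, asking how the built-in english analyzer
# would break up a sample sentence.
analyze_request = {"analyzer": "english", "text": "The foxes were running"}

print(json.dumps(analysis_settings, indent=2))
print(json.dumps(analyze_request))
```

Sending analyze_request to a running cluster is the quickest way to check what stemming and stop-word removal actually do to your text.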

# Create Index

# Documents

Documents are JSON objects that are stored within an Elasticsearch index. They have properties (or a schema) that make sense for that document type. See the Document API for how to create them.
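As a minimal sketch, storing one blog post under an explicit id is a PUT to /<index>/_doc/<id> with the document as the JSON body. The index name blog, the id, and the field values below are assumptions; index_request is a hypothetical helper that only builds the request and does not send it.

```python
import json
from urllib.parse import quote


def index_request(index, doc_id, document):
    """Return (method, path, body) for storing one document under an explicit id."""
    path = f"/{quote(index, safe='')}/_doc/{quote(str(doc_id), safe='')}"
    return "PUT", path, json.dumps(document)


# Hypothetical blog post document.
post = {"author": "Jane Doe", "category": "search", "text": "Notes on Elasticsearch."}

method, path, body = index_request("blog", 1, post)
print(method, path)
print(body)
```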

# Bulk insert documents example

PUT request to /_bulk?pretty

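Example payload. It’s weird, not your traditional JSON structure: it is newline-delimited JSON (NDJSON). Each action line is followed by its document on the next line, and the body must end with a newline.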

{ "create" : { "_index" : "movies", "_id" : "135569" } }
{ "id": "135569", "title" : "Constantine", "year":2016 , "genre":["Action", "Adventure", "Sci-Fi"] }
{ "create" : { "_index" : "movies", "_id" : "122886" } }
{ "id": "122886", "title" : "Aliens", "year":2015 , "genre":["Horror", "Sci-Fi", "IMAX"] }
{ "create" : { "_index" : "movies", "_id" : "109487" } }
{ "id": "109487", "title" : "Interstellar", "year":2014 , "genre":["Sci-Fi", "IMAX"] }
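Writing that payload by hand gets old fast. Here is a small sketch that builds the same kind of body from a list of dicts; bulk_create_body and its id_field parameter are hypothetical helpers, not part of any client library. The result would be sent with the Content-Type header set to application/x-ndjson.

```python
import json


def bulk_create_body(index, docs, id_field="id"):
    """Build an NDJSON bulk body: one "create" action line, then the document line."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"create": {"_index": index, "_id": doc[id_field]}}))
        lines.append(json.dumps(doc))
    # The bulk body has to end with a newline.
    return "\n".join(lines) + "\n"


movies = [
    {"id": "135569", "title": "Constantine", "year": 2016, "genre": ["Action", "Adventure", "Sci-Fi"]},
    {"id": "122886", "title": "Aliens", "year": 2015, "genre": ["Horror", "Sci-Fi", "IMAX"]},
]

bulk_body = bulk_create_body("movies", movies)
print(bulk_body, end="")
```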

# Scaling

An index is split into shards, each of which is an instance of Lucene. Shards can be on different nodes in a cluster.
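Roughly how a document ends up on one particular shard: a routing value (the document id by default) is hashed and reduced to a shard number. The sketch below is a simplification under loud assumptions: it uses CRC32 as a stand-in hash and a plain modulo, whereas Elasticsearch itself uses a Murmur3 hash and a slightly more involved formula. pick_shard is a hypothetical name.

```python
import zlib


def pick_shard(routing_value, number_of_primary_shards):
    """Map a routing value to a primary shard number (simplified illustration)."""
    # CRC32 is only a stand-in here; the real hash function is different.
    return zlib.crc32(routing_value.encode("utf-8")) % number_of_primary_shards


for doc_id in ["135569", "122886", "109487"]:
    print(doc_id, "-> shard", pick_shard(doc_id, 2))
```

The takeaway is that the mapping depends on the number of primary shards, which is why that number cannot simply be changed on an existing index.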

# Redundancy

  • An index has, for example, two primary and two replica shards, which are automatically distributed across nodes in the cluster.
  • Write requests are routed to a primary, then replicated. Read requests are routed to any shard, primary or replica.
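One way to get that layout, as a sketch: create the index with two primary shards and one replica per primary. This assumes "two replica shards" means one copy of each primary; shard_counts is a hypothetical helper that just does the arithmetic.

```python
import json

# Index settings: 2 primaries, each with 1 replica (an assumed reading of the notes).
index_settings = {"settings": {"number_of_shards": 2, "number_of_replicas": 1}}


def shard_counts(body):
    """Return (primary shards, replica shards, total shards) for the given settings."""
    primaries = body["settings"]["number_of_shards"]
    replicas_per_primary = body["settings"]["number_of_replicas"]
    replica_shards = primaries * replicas_per_primary
    return primaries, replica_shards, primaries + replica_shards


print(json.dumps(index_settings))
print(shard_counts(index_settings))
```

Note that number_of_replicas counts copies per primary, so raising it to 2 here would give four replica shards, not two.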