Elasticsearch
Table of Contents
:ID: D8184800-B864-433E-A1D8-9488CD025A59
# Getting started
Just use Elasticsearch Docker, ya dingus.
# The Index
There can be one document type per index. The index has a schema for the properties of the document that is stored in the index. So, lets say the document type is a blog post and the schema defines properties like author name, text and category.
The document can be any structured data with any schema you can come up with.
# Mapping
# Field types
String, byte, short, integer, long, float, double, boolean, date
- Keyword type
Does not use an analyzer and searches need to be exact match including case.
- Text type
Can be configured with an analyzer
# Analyzers
- simple, standard, language specific, whitespace, etc…
For example
english
analyzer will consider stemming, stop words, synonyms, etc. - Remove HTML encoding
# Create Index
See Index API docs.
# Documents
Documents are JSON objects that are stored within an Elasticsearch index. They have properties (or schema) that makes sense for that document type. See Document API for how to create them
# Bulk insert documents example
PUT
request to /_bulk?pretty
Example JSON payload. It’s weird, not your traditional JSON structure.
{ "create" : { "_index" : "movies", "_id" : "135569" } } { "id": "135569", "title" : "Constantine", "year":2016 , "genre":["Action", "Adventure", "Sci-Fi"] } { "create" : { "_index" : "movies", "_id" : "122886" } } { "id": "122886", "title" : "Aliens", "year":2015 , "genre":["Horror", "Sci-Fi", "IMAX"] } { "create" : { "_index" : "movies", "_id" : "109487" } } { "id": "109487", "title" : "Interstellar", "year":2014 , "genre":["Sci-Fi", "IMAX"] }
# Scaling
An index is split into shards, each of which is an instance of Lucene. Shards can be on different nodes in a cluster.
# Redundancy
- An index has two primary and two replica shards, which are automatically distributed across nodes in the cluster.
- Write requests are routed to a primary then replicated. Read requests a
routed to any shard.