Thursday, April 23, 2015

Performance tuning for Elasticsearch

Some important OS and Elasticsearch settings for performance tuning.
  1. Linux
    1. max_file_descriptors: 65536
    2. vm.max_map_count: 262144 (per process)
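The two Linux limits above can be checked with a short script. This is only a sketch: it inspects the limits of the Python process that runs it, which match the Elasticsearch process only if both are started by the same user and shell. The helper names are made up for illustration.

```python
import resource
from pathlib import Path

# Target values taken from the list above.
WANTED_NOFILE = 65536
WANTED_MAX_MAP_COUNT = 262144


def read_max_map_count(path="/proc/sys/vm/max_map_count"):
    """Return vm.max_map_count as an int, or None if it cannot be read."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


def check_limits():
    """Compare the limits seen by this process against the wanted values."""
    soft_nofile, _hard_nofile = resource.getrlimit(resource.RLIMIT_NOFILE)
    max_map_count = read_max_map_count()
    nofile_ok = (
        soft_nofile == resource.RLIM_INFINITY or soft_nofile >= WANTED_NOFILE
    )
    map_count_ok = (
        max_map_count is not None and max_map_count >= WANTED_MAX_MAP_COUNT
    )
    return {
        "nofile": soft_nofile,
        "nofile_ok": nofile_ok,
        "max_map_count": max_map_count,
        "max_map_count_ok": map_count_ok,
    }


if __name__ == "__main__":
    print(check_limits())
```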
  2. elasticsearch.yml
    1. bootstrap.mlockall: true
    2. discovery.zen.minimum_master_nodes
    3. gateway.recover_after_nodes
    4. gateway.expected_nodes
    5. gateway.recover_after_time
    6. index.number_of_shards
    7. index.number_of_replicas
    8. index.refresh_interval
    9. indices.fielddata.cache.size: 50%
    10. indices.breaker.fielddata.limit: 50% (should be higher than indices.fielddata.cache.size, otherwise old fielddata is never evicted)
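The list above gives no value for discovery.zen.minimum_master_nodes. The usual rule is a quorum of master-eligible nodes, (N / 2) + 1 with integer division. A minimal sketch follows, assuming a hypothetical 3-node cluster; the gateway.* values are illustrative placeholders, not recommendations from this post.

```python
def minimum_master_nodes(master_eligible_nodes):
    """Quorum of master-eligible nodes: (N / 2) + 1, using integer division."""
    if master_eligible_nodes < 1:
        raise ValueError("need at least one master-eligible node")
    return master_eligible_nodes // 2 + 1


def render_yaml(settings):
    """Render flat key/value settings as elasticsearch.yml lines."""
    return "\n".join(f"{key}: {value}" for key, value in settings.items())


# Hypothetical cluster size, chosen only for illustration.
NODES = 3

example_settings = {
    "bootstrap.mlockall": "true",
    "discovery.zen.minimum_master_nodes": minimum_master_nodes(NODES),
    "gateway.expected_nodes": NODES,       # placeholder value
    "gateway.recover_after_nodes": 2,      # placeholder value
    "gateway.recover_after_time": "5m",    # placeholder value
}

if __name__ == "__main__":
    print(render_yaml(example_settings))
```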
  3. Create index
    1. index.number_of_shards
    2. index.number_of_replicas
    3. index.refresh_interval: 30s
    4. #index.merge.policy.type: tiered
    5. #index.translog.flush_threshold_size
    6. #index.search.slowlog.threshold.query
    7. #index.search.slowlog.threshold.fetch
    8. #index.routing.allocation.include.box_type
    9. #indices.memory.index_buffer_size
    10. #indices.memory.min_index_buffer_size
    11. #indices.memory.min_shard_index_buffer_size
    12. #indices.ttl.interval
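The three uncommented settings above can be sent as the body of a create-index request. The sketch below only builds and prints that JSON body. The shard and replica counts are placeholders, and the function name is made up for illustration.

```python
import json


def build_create_index_body(shards, replicas, refresh_interval="30s"):
    """Build the request body for creating an index with explicit settings."""
    return {
        "settings": {
            "index": {
                "number_of_shards": shards,
                "number_of_replicas": replicas,
                # A longer refresh interval trades search freshness
                # for indexing throughput.
                "refresh_interval": refresh_interval,
            }
        }
    }


if __name__ == "__main__":
    # Placeholder values; pick them for your own data volume and cluster.
    body = build_create_index_body(shards=5, replicas=1)
    print(json.dumps(body, indent=2))
```

The printed JSON would be sent with a PUT request to the index name.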
  4. Search time tips
    1. _optimize?max_num_segments=1 (fewer segments, more efficient searches)
    2. Index per Time Frame
    3. Faking Index per User with Aliases
    4. shard_size & size, by default, shard_size = size * shards
    5. Index Warmer (at the cost of refresh time)
    6. collect_mode: breadth_first
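A few of the search-time tips above can be sketched as request bodies: a daily index name, a filtered and routed alias standing in for a per-user index, and a terms aggregation that sets size, shard_size and collect_mode. The index, field and alias names are hypothetical, and the numeric values are placeholders.

```python
from datetime import date


def daily_index_name(prefix, day):
    """Index per time frame: one index per day, such as logs-2015.04.23."""
    return f"{prefix}-{day.strftime('%Y.%m.%d')}"


def user_alias_action(index, user_id, field="user_id"):
    """Fake an index per user: a filtered, routed alias on a shared index."""
    return {
        "add": {
            "index": index,
            "alias": f"user_{user_id}",
            "filter": {"term": {field: user_id}},
            "routing": str(user_id),
        }
    }


def top_terms_aggregation(field, sub_field, size=10, shard_size=50):
    """Terms aggregation with a nested terms aggregation.

    breadth_first defers computing the sub-aggregation until the top
    buckets of the parent aggregation have been selected.
    """
    return {
        "aggs": {
            "top_terms": {
                "terms": {
                    "field": field,
                    "size": size,
                    "shard_size": shard_size,
                    "collect_mode": "breadth_first",
                },
                "aggs": {
                    "sub_terms": {"terms": {"field": sub_field, "size": 5}}
                },
            }
        }
    }


if __name__ == "__main__":
    print(daily_index_name("logs", date(2015, 4, 23)))
    print({"actions": [user_alias_action("users", 42)]})
    print(top_terms_aggregation("actor", "movie"))
```

The alias action belongs inside the "actions" array of an _aliases request, as the second print shows.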
  5. Fielddata
    1. Enable doc_values (uses mmapfs by default)
    2. Fielddata Filtering
    3. Eagerly Loading Fielddata (at the cost of merge time)
    4. Global ordinals
    5. Eager Global ordinals (at the cost of refresh time)
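The fielddata options above are set per field in the mapping. The sketch below assumes the Elasticsearch 1.x mapping syntax that was current when this was written (string fields, a "fielddata" block with "loading"); later versions changed these options. The type and field names are hypothetical.

```python
import json


def doc_values_field():
    """A not_analyzed string field stored as doc values instead of heap fielddata."""
    return {"type": "string", "index": "not_analyzed", "doc_values": True}


def eager_global_ordinals_field():
    """A string field whose global ordinals are built at refresh time."""
    return {
        "type": "string",
        "index": "not_analyzed",
        "fielddata": {"loading": "eager_global_ordinals"},
    }


def build_mapping(type_name, fields):
    """Wrap field definitions in a mapping for one document type."""
    return {"mappings": {type_name: {"properties": fields}}}


if __name__ == "__main__":
    mapping = build_mapping(
        "event",  # hypothetical type name
        {
            "status": doc_values_field(),
            "tag": eager_global_ordinals_field(),
        },
    )
    print(json.dumps(mapping, indent=2))
```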
You can search for the terms above for more information.