• Sensei


    Sensei is a distributed database that is designed to handle the following type of query:
    
    SELECT f1,f2...fn FROM members 
    WHERE c1 AND c2 AND c3..
    MATCH (fulltext query, e.g. "java engineer")
    GROUP BY fx,fy,fz...
    ORDER BY fa,fb...
    LIMIT offset,count
    							

    Architecture:

    Some Features

    • Full-text search
    • Faceted Search
    • Dynamic Sorting
    • Streaming index
    • Fast realtime update
    • Distributed - sharded
    • Dynamic cluster support
    • ...

    Design considerations

    • data:
      • Fault tolerance - when one replication is down, data is still accessible
      • Durability - N copies of data is stored
      • Through-put - Parallelizable request-handling on different nodes/data replicas, designed to handle internet traffic
      • Consistency - Eventally consistent
      • Data recovery - each shared/replica is noted with a watermark for data recovery
      • Large dataset - designed to handle 100s millions - billions of rows
    • horizontally scalable:
      • Data is partitioned - so work-load is also distributed
      • Elasticity - Nodes can be added to accomodate data growth
      • Online expansion - Cluster can grow while handling online requests
      • Online cluster management - Cluster topology can change while handling online requests
      • Low operational/maintenance costs - Push it, leave it and forget it.
    • performance:
      • low indexing latency - real-time update
      • low search latency - millisecond query response time
      • low volatility - low variance in both indexing and search latency
    • customizability:
      • plug-in framework - custom query handling logic
      • routing factory - custom routing logic, default: round-robin
      • index sharding strategy - different sharding strategy for different applications, e.g. time, mod etc.

      Comparing to a traditional RDBMS

      RDBMS:

      • scales vertically
      • strong ACID guarantee
      • relational support
      • performance cost with full-text integration
      • high query latency with large dataset, esp. Group By
      • indexes needs to be built for all sort possibilities offline

      Sensei:

      • scales horizontally
      • relaxed Consistency with high durability guarantees
      • data is streamed in, so Atomicity and Isolation is to be handled by the data producer
      • full-text support
      • low query latency with arbitrarily large dataset
      • dynamic sorting, index is already built for all sortable fields and their combinations

            Learn about Sensei

    Technology stack:

    Sensei is built on top of an abundance of awesome-ness from the open-source community:

    • Bobo: a faceted search library
    • Zoie: a realtime stream search library
    • Lucene: full-text search engine
    • ZooKeeper: distribute resource synchronization system
    • Netty: NIO client server framework

    Getting Started

    Things to help you get started:

    Gateways

    Gateways specify where Sensei should be fetching data from.

  • 相关阅读:
    Codeforces Round #390 (Div. 2) D
    Codeforces Round #390 (Div. 2) B
    Codeforces Round #390 (Div. 2) A
    ICM Technex 2017 and Codeforces Round #400 (Div. 1 + Div. 2, combined) D
    ICM Technex 2017 and Codeforces Round #400 (Div. 1 + Div. 2, combined) C
    ICM Technex 2017 and Codeforces Round #400 (Div. 1 + Div. 2, combined) B
    ICM Technex 2017 and Codeforces Round #400 (Div. 1 + Div. 2, combined) A
    高通平台MSM8916LCM模块移植(一)-bootloader部分
    新城控股:千亿目标下的炼数成金之道
    UE4实现描边效果
  • 原文地址:https://www.cnblogs.com/lexus/p/2288379.html
Copyright © 2020-2023  润新知