• 数据库 学习笔记


    Entity Relationship Model

    课程官网:https://www.pdbmbook.com/

    Elements of a Database System

    • database model: consists of various data models
    • database state: current set of instance
    • data model: provides a clear and unambiguous description of the data items, their relationships and various data constraints from a particular perspective
      • external data model: contains various subsets of the data items in the logical model, also called views, tailored towards the needs of specific applications or groups of users
      • conceptual data model: provides a high-level description of the data items with their characteristics and relationships; logical data model: a translation or mapping of the conceptual data model towards a specific implementation environment
      • internal data model: represents the data’s physical storage details
    • catalog: contains the data definitions, or metadata (data that describes other data), of your database application; guarantees consistency
    • database users
      • architect: design conceptual data model
      • designer: translate conceptual data model into a logical internal data model
      • administrator: responsible for the implementation and monitoring of the database
      • application developer: develops database applications in a programming language such as Java or Python
      • business user: run applications
    • database languages
      • DDL: data defining language
      • DML: data manipulation language
      • SQL: structured query language

    Architecture of DBMS

    • connection manager: provides facilities to setup a database connection
    • security manager verifies whether a user has the right privileges
    • DDL compiler: ideally 3 compilers, one for each layer; upon successful compilation, DDL compilers registers the data definitions in the catalog
    • query processor
      • DML compiler
      • query parser and query rewriter
      • query optimizer
      • query executor
      • storage manager
        • transaction manager
        • buffer manager
        • lock manager
        • recovery manager

    Conceptual Data Modeling using Entity Relationship Model

    Phases of design:

    • requirement collection and analysis -> database requirements
    • -> conceptual design -> conceptual data model
    • -> logical design -> logical data model
    • -> physical design -> internal data model

    ER (entity relationship) model:

    • entity types
    • attribute types
      • domain
      • key attribute types
      • composite attribute types
    • relationship types
    • weak entity types
    • ternary relationship types

    Relational Database

    Superkey: a subset of attribute types of a relation R with the property that no two tuples in any relation state should have the same combination of values for these attribute types; may have redundant attributes; may not be unique
    Candidate key (key): minimal superkey without redundancy; may not be unique
    Primary key: selected candidate key used to identify tuples in the relation; unique
    Foreign key: primary key from another tuple

    Functional Dependency:
    A functional dependency X -> Y between two sets of attribute types X and Y, implies that a value of X uniquely determines a value of Y

    Normalization Forms

    目的:防止不一致现象;防止有效信息被删除;减少冗余

    1NF:
    Every attribute type of a relation must be atomic and single valued

    2NF:
    Every non-prime attribute must depend on each whole candidate key, not just part of it. A non-prime attribute is an attribute that is not part of any candidate key of the relation.

    3NF:
    消除一个关系中的传递依赖(transitive dependency)

    如下表格不符合3NF:

    原因:Title->Genre ID->Genre name; Title->Author->AuthorNationality

    Boyce-Codd Normal Form (BCNF):
    A relational schema R is in Boyce-Codd normal form if and only if for every one of its dependencies X->Y, at least one of the following conditions hold:

    • X->Y is a trivial functional dependency
    • X is a superkey for schema R

    Trivial dependency: X->Y when y is a subset of X. 用于消除不同的candidate key中的属性互相依赖的情况。

    如下表格不符合BCNF:

    其中4个属性都属于某个candidate key;court 1的rate type为SAVER和STANDARD,court 2的rate type为PREMIUM-A和PREMIUM-B。
    易知Rate type -> Court。

    4NF:
    Multivalued dependencies: If the column headings in a relational database table are divided into three disjoint groupings X, Y, and Z, then, in the context of a particular row, we can refer to the data beneath each group of headings as x, y, and z respectively. A multivalued dependency X->->Y signifies that if we choose any x actually occurring in the table (call this choice xc), and compile a list of all the xc,y,z combinations that occur in the table, we will find that xc is associated with the same y entries regardless of z
    A table is in 4NF if and only if, for every one of its non-trivial multivalued dependencies X->->Y, X is a superkey.

    用于消除对于有着X, Y, Z三组属性的关系R中,对于某个X,对任意Y都有相同Z的取值的情况。

    如下表格不符合4NF:

    原因:对于同一家餐厅而言,它的每个菜品配送范围都是一样的

    Mapping a conceptual ER Model to a Relational Model

    Mapping entity types

    Mapping relationship types

    • binary 1:1 relationship type
    • binary 1:n relationship type
    • binary n:m relationship type
    • unary and n-ary relationship type

    基本SQL语句

    基本语法:

    SELECT ... FROM ... WHERE ...

    别名:SELECT column_name AS column_alias FROM table_name table_alias;

    SELECT id AS FROM demo d;
    

    Aggregation Query: COUNT, MAX, MIN

    COUNT后可加DISTINCT

    GROUP BY:按某一列分组;每一组只返回一个结果。常用于聚合查询;当用于普通查询时,作用类似于DISTINCT。

    HAVING: added to SQL because the WHERE keyword cannot be used with aggregate functions. 常与GROUP BY连用。

    IN/ALL/ANY:用于嵌套查询

    两种嵌套方式:

    • WHERE后
    • FROM后+别名

    增加约束示例:

    ALTER TABLE property 
    ADD CONSTRAINT property_pro_c_id_fkey FOREIGN KEY (pro_c_id) REFERENCES client(c_id) ON DELETE CASCADE;
    

    删除约束示例:

    ALTER TABLE property DROP CONSTRAINT property_pro_c_id_fkey;
    

    插入一行:

    INSERT INTO table_name VALUES (...)
    

    更新表中数据:

    UPDATE table_name SET column_name=expression WHERE ...
    

    删除表中数据:

    DELETE FROM table_name WHERE ...
    

    Extended Relational Databases

    Triggers:

    CREATE TRIGGER SALARYTOTAL AFTER INSERT ON EMPLOYEE 
    FOR EACH ROW 
    WHEN (NEW.DNR IS NOT NULL) 
    UPDATE DEPARTMENT 
    SET TOTAL-SALARY = TOTAL-SALARY + NEW.SALARY 
    WHERE DNR = NEW.DNR 
    
    CREATE TRIGGER WAGEDEFAULT BEFORE INSERT ON EMPLOYEE 
    REFERENCING NEW AS NEWROW 
    FOR EACH ROW 
    SET (SALARY, BONUS) = (SELECT BASE_SALARY, BASE_BONUS FROM WAGE WHERE JOBCODE = NEWROW.JOBCODE) 
    

    Stored procedures

    CREATE PROCEDURE REMOVE-EMPLOYEES (DNR-VAR IN CHAR(4), JOBCODE-VAR IN CHAR(6)) AS 
    BEGIN 
    DELETE FROM EMPLOYEE 
    WHERE DNR = DNR-VAR AND JOBCODE = JOBCODE-VAR; 
    END
    

    递归查询

    Physical File Organization and Indexing

    rba: random block access
    sba: sequential block access

    Record organization

    • relative location
    • embedded identification
    • pointers and lists: ideal for dealing with variable length records

    Blocking factor: determines how many records are retrieved with a single read operation

    • fixed length records: (BF=lfloor BS/RS floor),where BS stands for block space, RS stands for record space
    • variable length records: average number of records in a block

    Primary file organization versus secondary file organization

    Primary file organization

    • heap
    • sequential
    • random
    • indexed sequential file organization
      • primary index file organization: sparse, one for each block
      • clustered index file organization: 在相同策略下,由于数据的不同,既有可能是密集索引(每个值都有大量重复),也有可能是稀疏索引(每个值都只出现一次)
      • multilevel indexes
    • list

    Secondary indexes

    • sequential index: unique
    • non-unique
      • dense index
      • inverted file
    • hash index
    • bitmap index: types with limited set of values
    • join index: combines types from 2 or more tables

    Multilevel indexes -> B tree & B+ tree

    Physical Database Organization

    Database access methods

    • functioning of query optimizer
      • query cardinality = table cardinality * product(filter factors)
    • index search with atomic search key
    • multiple index and multicolumn index search
      • to cater for all possible search keys, (inom n{lceil n/2 ceil}) indexes are required.
    • index only access
    • full table scan

    Create indexes:

    CREATE INDEX index_name
    ON table_name (column1_name, column2_name);
    

    Transaction Management

    ACID properties:

    • atomicity: multiple database operations that alter the database state can be treated as one indivisible unit of work; success or fail as a whole
    • consistency: a transaction, if executed in isolation, renders the database from one consistent state into another consistent state; do not violate constraints
    • isolation: in situations where multiple transactions are executed concurrently, the outcome should be the same as if every transaction were executed in isolation
    • durability: effects of a committed transaction should always be persisted into the database

    Serializable schedules: precedence graph

    • 创建有向边Ti -> Tj ,如果Tj在由Ti写入后读取值
    • 创建有向边Ti -> Tj ,如果Tj在被Ti读取后写入值
    • 创建有向边Ti -> Tj ,如果Tj在由Ti写入后写入值

    若此有向图中不存在环,则可以序列化

    Lock

    • shared
    • exclusive

    Two-phase Locking Protocol

    • growth phrase
    • shrink phrase

    Cascading rollback: best way to avoid this, is for all transactions to hold their locks until they have reached the ‘committed’ state.

    Deadlock: detected by wait-for graph

    Isolation levels:

    • read uncommitted: typically only allow for read-only transactions
    • read committed: uses short-term read locks
    • repeatable read: uses both long-term read locks and write locks
    • serializable: avoid phantom reads

    这里需要强调可重复读与可序列化的区别。可重复读不允许并发修改已有内容,但允许加入新内容,因此对原有内容修改时可能会发现新的未被修改的内容出现;可序列化则通过更严格的限制避免这种情况。

    Lock granularity

    错题

    Physical/logical data independence指当前层的改变不影响上层

  • 相关阅读:
    CVPR2021论文总结笔记
    目标跟踪综述:Correlation Filter for UAV-Based Aerial Tracking: A Review and Experimental Evaluation
    推荐 | Transformer最新成果!Learn to Dance with AIST++: Music Conditioned 3D Dance Generation!
    博士笔记 | 周志华《机器学习》手推笔记第三章-线性模型
    博士笔记 | 周志华《机器学习》手推笔记第二章-模型评估与选择
    WebSocket实现前后端通讯
    Mac使用终端玩小游戏
    算法-栈队列堆
    架构
    算法-数组与矩阵
  • 原文地址:https://www.cnblogs.com/ZeonfaiHo/p/14847263.html
Copyright © 2020-2023  润新知