前言:
这篇文章通过 NorthWind Graph 的例子熟悉并掌握 neo4j 导入 CSV 文件。
一、环境配置
在导入之前确保 neo4j.conf 中允许 CSV 导入。
打开 neo4j.conf: $ vim neo4j.conf
修改 dbms.security.allow_csv_import_from_file_urls=true
二、数据准备
- products.csv (http://data.neo4j.com/northwind/products.csv)
- categories.csv (http://data.neo4j.com/northwind/categories.csv)
- suppliers.csv (http://data.neo4j.com/northwind/suppliers.csv)
- customers.csv(http://data.neo4j.com/northwind/customers.csv)
- orders.csv (http://data.neo4j.com/northwind/orders.csv)
- order-details.csv (http://data.neo4j.com/northwind/order-details.csv)
注意:从本地导入 CSV 文件时,需要将文件放在 import 文件夹下!
三、数据导入(本地)
1. 导入 products.csv
//load products.csv
LOAD CSV WITH HEADERS FROM "file:///northwind/products.csv" AS row
CREATE (n:Product)
SET n = row,
n.unitPrice = toFloat(row.unitPrice),
n.unitsInStock = toInteger(row.unitsInStock), n.unitsOnOrder = toInteger(row.unitsOnOrder),
n.reorderLevel = toInteger(row.reorderLevel), n.discontinued = (row.discontinued <> "0")
2.导入 categories.csv
//load categories.csv LOAD CSV WITH HEADERS FROM "file:///northwind/categories.csv" AS row CREATE (n:Category) SET n = row
3. 导入 suppliers.csv
//load suppliers.csv LOAD CSV WITH HEADERS FROM "file:///northwind/suppliers.csv" AS row CREATE (n:Supplier) SET n = row
4. 为产品等实体创建索引
给三类实体分别创建索引,方便后期提高查询效率。
//create indexes CREATE INDEX ON :Product(productID) CREATE INDEX ON :Category(categoryID) CREATE INDEX ON :Supplier(supplierID)
5. 为产品等实体建立联系
为这三类实体建立联系,通过相同的 id 作为连接。
MATCH (p:Product),(c:Category) WHERE p.categoryID = c.categoryID CREATE (p)-[:PART_OF]->(c)
商品和种类通过 categoryID 建立了联系
MATCH (p:Product),(s:Supplier)
WHERE p.supplierID = s.supplierID
CREATE (s)-[:SUPPLIES]->(p)
商品和供应商通过 supplierID 建立了联系
6. 列出每个供应商提供的产品类别
MATCH (s:Supplier)-->(:Product)-->(c:Category)
RETURN s.companyName as Company, collect(distinct c.categoryName) as Categories
每个食品供应商提供的产品种类表
7. 查找提供海产品("Seafood")的供应商
MATCH (c:Category {categoryName:"Seafood"})<--(:Product)<--(s:Supplier)
RETURN DISTINCT s.companyName as ProduceSuppliers
8. 加载 customer.csv
LOAD CSV WITH HEADERS FROM "http://data.neo4j.com/northwind/orders.csv" AS row
CREATE (n:Order)
SET n = row
9. 加载 order.csv
//load orders.csv LOAD CSV WITH HEADERS FROM "file:///northwind/orders.csv" AS row CREATE (n:Order) SET n = row
10. 为订单等实体创建索引
//create indexes for customer and order CREATE INDEX ON :Customer(customerID) CREATE INDEX ON :Order(orderID)
11. 为订单等实体创建关系
//create relationships MATCH (c:Customer),(o:Order) WHERE c.customerID = o.customerID CREATE (c)-[:PURCHASED]->(o)
12. 加载 order-details.csv
//load order-details.csv LOAD CSV WITH HEADERS FROM "file:///northwind/order-details.csv" AS row MATCH (p:Product), (o:Order) WHERE p.productID = row.productID AND o.orderID = row.orderID CREATE (o)-[details:ORDERS]->(p) SET details = row, details.quantity = toInteger(row.quantity)
四、数据查询
1. 列出订购了海产品(Seafood)的客户以及采购量
MATCH (cust:Customer)-[:PURCHASED]->(:Order)-[o:ORDERS]->(p:Product),
(p)-[:PART_OF]->(c:Category {categoryName:"Seafood"})
RETURN DISTINCT cust.contactName as CustomerName, SUM(o.quantity) as TotalProductsPurchased
2. 列出 "Thomas Hardy" 订购的产品及种类
match (cust:Customer{contactName:"Thomas Hardy"})-[:PURCHASED]->(:Order)-[ORDERS]->(p:Product),
(p)-[:PART_OF]->(c:Category)
return p.productName as Product , c.categoryName as Category