Postgresql

原文：https://www.npgsql.org/efcore/mapping/json.html?tabs=data-annotations%2Cpoco

JSON Mapping

PostgreSQL has rich, built-in support for storing JSON columns and efficiently performing complex queries operations on them. Newcomers can read more about the PostgreSQL support on the JSON types page, and on the functions and operators page. Note that the below mapping mechanisms support both the jsonb and json types, although the former is almost always preferred for efficiency reasons.

The Npgsql EF Core provider allows you to map PostgreSQL JSON columns in three different ways:

As simple strings
As strongly-typed user-defined types (POCOs)
As System.Text.Json DOM types (JsonDocument or JsonElement)

NOTE

Mapping to POCO or to System.Text.Json types was introduced in version 3.0.0

String mapping

The simplest form of mapping to JSON is via a regular string property, just like an ordinary text column:

public class SomeEntity
{
    public int Id { get; set; }
    [Column(TypeName = "jsonb")]
    public string Customer { get; set; }
}

With string mapping, the EF Core provider will save and load properties to database JSON columns, but will not do any further serialization or parsing - it's the developer's responsibility to handle the JSON contents, possibly using System.Text.Json to parse them. This mapping approach is more limited compared to the others.

POCO mapping

If your column's JSON documents have a stable schema, you can map them to your own .NET types (or POCOs). The provider will use the new System.Text.Json APIs under the hood to serialize instances to JSON documents before sending them to the database, and to deserialize documents coming from the database back. Just like EF Core can map a .NET type to rows in the table, this capability allows you to map a .NET type to a single JSON column.

Mapping POCOs is extremely easy: simply add a property with your custom POCO type and instruct the provider to map it to JSON:

public class SomeEntity
{
    public int Id { get; set; }
    [Column(TypeName = "jsonb")]
    public Customer Customer { get; set; }
}

public class Customer    // Mapped to a JSON column in the table
{
    public string Name { get; set; }
    public int Age { get; set; }
    public Order[] Orders { get; set; }
}

public class Order       // Part of the JSON column
{
    public decimal Price { get; set; }
    public string ShippingAddress { get; set; }
}

You can now assign a regular Customer instance to the property, and once you call SaveChanges() it will be serialized to database, producing a document such as the following:

{
    "Age": 25,
    "Name": "Joe",
    "Orders": [
        { "OrderPrice": 9, "ShippingAddress": "Some address 1" },
        { "OrderPrice": 23, "ShippingAddress": "Some address 2" }
    ]
}

Reading is just as simple:

var someEntity = context.Entities.First();
Console.WriteLine(someEntity.Customer.Orders[0].Price)

This provides a seamless mapping approach, and supports embedding nested types and arrays, resulting in complex JSON document schemas as shown above. This approach also allows you to traverse loaded JSON documents in a type-safe way, using regular C# syntax, and to use LINQ to query inside database JSON documents (see Querying JSON columns below).

JsonDocument DOM mapping

If your column JSON schema isn't stable, a strongly-typed POCO mapping may not be appropriate. The Npgsql provider also allows you to map the DOM document type provided by System.Text.Json APIs.

public class SomeEntity
{
    public int Id { get; set; }
    public JsonDocument Customer { get; set; }
}

Note that neither a data annotation nor the fluent API are required, as JsonDocument is automatically recognized and mapped to jsonb.

Once a document is loaded from the database, you can traverse it:

var someEntity = context.Entities.First();
Console.WriteLine(someEntity.Customer.RootElement.GetProperty("Orders")[0].GetProperty("Price").GetInt32());

Querying JSON columns

Saving and loading documents these documents wouldn't be much use without the ability to query them. You can express your queries via the same LINQ constructs you are already using in EF Core:

var joes = context.CustomerEntries
    .Where(e => e.Customer.Name == "Joe")
    .ToList();

The provider will recognize the traversal of a JSON document, and translate it to the correspond PostgreSQL JSON traversal operator, producing the following PostgreSQL-specific SQL:

SELECT c.""Id"", c.""Customer""
FROM ""CustomerEntries"" AS c
WHERE c.""Customer""->>'Name' = 'Joe'

If indexes are set up properly, this can result in very efficient, server evaluation of searches with database JSON documents.

The following expression types and functions are translated:

.NET	SQL
customer.Name	customer->>'Name'
customer.Orders[1].Price	customer#>>'{Orders,0,Price}'[1]
customer.Orders.Length	jsonb_array_length(customer->'Orders')
EF.Functions.JsonContains(customer, @"{""Name"": ""Joe"", ""Age"": 25}")	customer @> '{"Name": "Joe", "Age": 25}'
EF.Functions.JsonContained(@"{""Name"": ""Joe"", ""Age"": 25}", e.Customer)	'{"Name": "Joe", "Age": 25}' <@ customer
EF.Functions.JsonExists(e.Customer, "Age")	customer ? 'Age'
EF.Functions.JsonExistsAny(e.Customer, "Age", "Address")	customer ?\| ARRAY['Age','Address']
EF.Functions.JsonExistsAll(e.Customer, "Age", "Address")	customer ?& ARRAY['Age','Address']
EF.Functions.JsonTypeof(e.Customer.Age)	jsonb_typeof(customer->'Age')

Indexing JSON columns

NOTE

A section on indices will be added. In the meantime consult the PostgreSQL documentation and other guides on the Internet.

These are early days for EF Core JSON support, and you'll likely run into some limitations. Please let us know how the current features are working for you and what you'd like to see.

https://blog.csdn.net/niu15032144758/article/details/81475458?utm_source=blogxgwz6

java mysql 订单表设计

最近由于系统日益复杂的需求系统中各种类型订单越来越多，原来的这些订单表已经不能满足当下的需求，以可扩展为目标打算对这些订单表进行重构，本文只涉及基础版的设计，对于高并发、分布式等暂不考虑。

之前的系统订单按不同业务有不同的订单表，比如洗车订单表、喷漆订单表、内饰清洗订单表等。在网上找了一些订单的设计方案，也跟朋友讨论了一下几个设计方案的情况得出如下几个结果。

方案一：

保持原来的结构不变，每增加一种新的业务需求，就创建对应的订单表，然后将所有订单表合并为一个视图，查询的操作都在视图进行。

利：数据库不需要大的变动，不会影响现有的程序和数据，稳定性比较好

弊：订单数据的增删改比较麻烦，设计系统的时候会比较复杂，也容易出错。每增加一个业务需求，就需要添加新的订单表，然后加入到视图中去，还要开发对应的程序，非常不推荐。

方案二：

既然都是要改，那就不考虑大改还是小改了。分析这几种订单，然后将通用的字段拿出来作为订单基础表（比如订单号、订单状态、用户id、订单类型、价格、时间等订单表共有的，订单的增删改都在这个表上进行），然后将每个订单独有的一些字段再创建订单明细表（明细表一般不涉及删除和修改），明细表分别对应各自的业务需求。

利：订单数据比较清晰，增删改只需操作订单基础表，系统设计会比较精简，扩展性较高

弊：新增一个业务需求，可能需要添加对应的订单明细表，程序中也要添加一部分操作。另外可能就是维护这些订单明细表稍微麻烦点。

方案三：

说到方案三不得不说一下mysql5.7的一个新特性json列，就是mysql可以将一列作为json类型存储数据。那这样的话我们可以将方案二中订单明细表都作为一个json列存入订单基础表中，这样所有的订单信息都存在订单基础表，一个表就可以完成不同业务的订单。

利：新增业务需求，订单表和程序基本不需要什么太大的变动，不需要维护这些订单明细表。

弊：首要使用json列必须是mysql5.7以上的版本，然后json列的数据操作起来感觉有点麻烦，不友好，所以json列的数据最好只查询用。

另外json列可以用字符串替换，直接用字符串存json列，但是这样数据操作起来还不如json，只是理论上可行。暂时想到这么几种方案，具体使用哪种需要根据自己系统的需求，不要硬套，因为我这些是根据我的系统需求做的，只是提供几种思路，仅供参考。

https://culttt.com/2015/02/02/storing-user-settings-relational-database/

Storing User Settings in a Relational Database

Last week we looked at Setting up a Notification system in PHP for Cribbb. In my opinion, one of the most important aspects of a good notification system is the ability for the user to decide which notifications she wants to receive.

Notifications can be really annoying, especially if the user gets bombarded with emails they do not want. Every notification system should give the user granular control over what the emails they receive and what should be ignored.

However, this presents the problem of how do we store the user’s settings in a relational database?

In today’s tutorial we’re going to be looking at the various options we have for storing user settings as well as the pros and cons of each.

The problems associated with storing User Settings

Before we look at the possible solutions, first I think it’s important to understand the actual problem we face. On the surface, it shouldn’t be too difficult to store user settings, should it?

In my experience, there are basically four problems that we need to factor into our decision making process.

1. You don’t know what the schema will look like as new settings are added / old ones removed
Initially you will probably have a handful of user settings that you want to record in the database. However as the application evolves, this list of user settings is highly likely to dramatically increase, especially if new features have associated user settings.

This makes it impossible to predict what the database schema should look like ahead of time.

2. It’s “expensive” to add a column to each record of a table with a lot of records.
If you had a settings table with a row for each user, adding an additional column for a new user setting is going to start to get pretty difficult once you have a few million rows.

You can get away with this in the early days of an application or for periodic schema changes, but it’s not something that you want to be doing on a regular basis.

3. When a new setting is introduced, existing users won’t have that setting
If your application has been running for a year and you introduce a new user setting, all of the first year’s users won’t have that user setting record in the database.

This means you either have to establish a default setting for a user with no record, or do a mass update to insert the default setting for all the users without that setting.

4. You usually have to query for settings
Normally you will have to query the database to check whether a user has a particular setting, rather than load all of the user settings to find out one particular property.

This means you need to store the data in a way that still allows you to query for individual settings.

The different potential solutions for storing User Settings

In my experience, there are basically three ways of solving this problem using a relational database. Each of these solutions have positives and negatives, and none of them are perfect by any stretch of the imagination.

Single Row

The first solution is to have a settings table where each user has a One-to-One relationship with a single row. Each setting has a column on the table and so pulling the row from the database contains the user’s settings.

I think this is a great solution if you know you only have 5 user settings and you will never, or very rarely, need to add or remove settings. It’s also important that all users have the same setting options, otherwise you get yourself into a mess of NULL columns and figuring out which users should have which settings.

The big benefit of this approach is that you will only ever have as many settings rows as you do users rows and it is very easy to clean up the settings table when a user is deleted.

This approach also makes it really easy to query as to whether a certain user has a certain setting.

However, I think if the user settings of your application are likely to be volatile, or you allow third parties to add their own settings, this approach will definitely not work.

Serialised Blob

The second option is to use a serialised blob to hold the settings. This means you take an array or object of settings, serialise it to JSON (or a similar format) and then store the output in a text column in the database.

The big benefit of this approach is that you don’t need to worry about the schema ahead of time and adding or removing settings from an individual user won’t disrupt the database.

However there are also a number of drawbacks to this approach.

Firstly, you can’t query against a blob of serialised text. This means you would have to pull the user and then unserialise the blob of settings each time. This means it will also be really difficult to do any kind of mass update of setting preferences to all users.

Secondly, it would be difficult to handle application wide settings as each user’s blob of settings would be slightly different.

And thirdly, it would be very difficult to add or remove settings from users without going through an expensive update process.

Property bag

And finally we have the Property Bag method. This is where you have a database table settings with columns for id, key, value and user_id.

Each user setting is stored as a key / value pair and so each user would have a One-to-Many relationship with this table.

The Property Bag approach makes it very easy to add or remove settings as we’re not constrained by a schema ahead of time. It’s also very easy to query against or to load all of the user’s settings in one go.

However, there are also negatives to the Property Bag approach.

Firstly, you need to ensure that the name of the setting keys are consistent. Any discrepancies could mean some user settings were not enforced correctly.

Secondly, the settings table could result in millions of records as each user has many different setting options. This could mean that the table eventually becomes difficult to work with, although that is probably a good problem to have.

Why I’m choosing the Property Bag method

I’ve decided to choose the Property Bag method for storing user settings for a number of reasons.

Firstly, the Property Bag method is really easy to set up. We only need a table with 4 columns and we’re good to go. There is no point in worrying about some complex situation in the future on day one of the journey.

Secondly, I’m kinda winging it for this project and so I have no idea what columns I’m going to need. I don’t want to limit myself to a schema that I pick now because in a couple of weeks it will probably change.

Thirdly, I’d like to open up Cribbb to a third-party ecosystem and so a Property Bag approach would make storing user settings easier for third-party developers.

As I mentioned above, I definitely do not think it is the perfect elegant solution, but in this case, I think it is the best approach.

Conclusion

Saving user settings can be quite difficult using a relational database as there really isn’t a perfect solution.

However, you can get a very long way with an imperfect solution. I don’t think there is much value in worrying about using the “most perfect” solution for every problem you face.

The majority of all small web applications could probably get away with using any of the three approaches I’ve looked at in this tutorial.

And of course there are more specialised tools available for large applications. For example, you might want to experiment with a NoSQL data store.

This is a series of posts on building an entire Open Source application called Cribbb. All of the tutorials will be free to web, and all of the code is available on GitHub.

To view a full listing of the tutorials in this series, click here.

https://heap.io/blog/engineering/when-to-avoid-jsonb-in-a-postgresql-schema

When To Avoid JSONB In A PostgreSQL Schema

PostgreSQL introduced the JSONB type in 9.4 with considerable celebration. (Well, about as much as you can expect for a new data type in an RDBMS.) It’s a wonderful feature: a format that lets you store blobs in the lingua franca of modern web services, without requiring re-parsing whenever you want to access a field, and in a way that enables indexing for complicated predicates like containment of other JSON blobs. It meaningfully extends PostgreSQL and makes it a viable choice for a lot of document store workflows. And it fits nicely in a startup engineering context: just add a properties column to the end of your table for all the other attributes you might want to store down the road, and your schema is now officially Future Proof TM.

We lean on JSONB heavily at Heap, and it’s a natural fit, as we have APIs that allow customers to attach arbitrary properties to events we collect. Recently, I’ve gotten a few questions about the benefits and drawbacks of using JSONB to store the entirety of a table – why have anything but an id and a data blob?

The idea of not having to explicitly manage a schema appeals to a lot of people, so it shouldn’t be surprising to see JSONB used this way. But there are considerable performance costs to doing so, some of which aren’t immediately obvious. There is great material for deciding which of JSON, JSONB, or hstore is right for your project, but the correct choice is often “none of the above.” [1] Here are a few reasons why.

Hidden Cost #1: Slow Queries Due To Lack Of Statistics

For traditional data types, PostgreSQL stores statistics about the distribution of values in each column of each table, such as:

the number of distinct values seen
the most common values
the fraction of entries that are NULL
for ordered types, a histogram sketch of the distribution of values in the column

For a given query, the query planner uses these statistics to estimate which execution plan will be the fastest. For example, let’s make a table with 1 million “measurements” of three values, each chosen at uniform random from {0, 1}. Each measurement was taken by one of 10,000 scientists, and each scientist comes from one of three labs:

Let’s say we want to get the tick marks in which all three values were 0 — which should be about 1/8th of them — and see how many times each lab was represented amongst the corresponding scientists. Our query will look something like this:

And our query plan will look something like this: https://explain.depesz.com/s/H4oY

This is what we’d hope to see: the planner knows from our table statistics that about 1/8th of the rows in measurements will have value_1, value_2, and value_3 equal to 0, so about 125,000 of them will need to be joined with a scientist’s lab, and the database does so via a hash join. That is, load the contents of scientist_labs into a hash table keyed on scientist_id, scan through the matching rows from measurements, and look each one up in the hash table by its scientist_id value. The execution is fast — about 300 ms on my machine.

Let’s say we instead store our measurements as JSONB blobs, like this:

The analogous read query would look like this:

The performance is dramatically worse — a whopping 584 seconds on my laptop, about 2000x slower: https://explain.depesz.com/s/zJiT

The underlying reason is that PostgreSQL doesn’t know how to keep statistics on the values of fields within JSONB columns. It has no way of knowing, for example, that record ->> 'value_2' = 0 will be true about 50% of the time, so it relies on a hardcoded estimate of 0.1%. So, it estimates that 0.1% of 0.1% of 0.1% of the measurements table will be relevant (which it rounds up to ~1 row). As a result, it chooses a nested loop join: for each row in measurements that passes our filter, look up the corresponding lab_name in scientist_labs via the primary key of the latter table. But since there are ~125,000 such measurements, instead of ~1, this turns out to take an eternity. [2]

As always, accurate statistics are a critical ingredient to good database performance. In their absence, the planner can’t determine which join algorithms, join orders, or scan types will make your query fast. The result is that innocent queries will blow up on you. This is one of the hidden costs of JSONB: your data doesn’t have statistics, so the query planner is flying blind.

This is not an academic consideration. This caused production issues for us, and the only way to get around them was to disable nested loops entirely as a join option, with a global setting of enable_nestloop = off. Ordinarily, you should never do something like that.

This probably won’t bite you in a key-value / document-store workload, but it’s easy to run into this if you’re using JSONB along with analytical queries.

Hidden Cost #2: Larger Table Footprint

Under the hood, PostgreSQL’s JSON datatype stores your blobs as strings that it happens to know are valid JSON. The JSONB encoding has a bit more overhead, with the upside that you don’t need to parse the JSON to retrieve a particular field. In both cases, at the very least, the database stores each key and value in every row. PostgreSQL doesn’t do anything clever to deduplicate commonly occurring keys.

Using the above measurements table again, the initial non-JSONB version of our table takes up 79 mb of disk space, whereas the JSONB variant takes 164 mb — more than twice as much. That is, the majority of our table contents are the the strings value_1, value_2, value_3, and scientist_id, repeated over and over again. So, in this case, you would need to pay for twice as much disk, not to mention follow-on effects that make all sorts of operations slower or more expensive. The original schema will cache much better, or might fit entirely in memory. The smaller size means it will also require half as much i/o for large reads or maintenance operations.

For a less contrived anecdote, we found a disk space savings of about 30% by pulling 45 commonly used fields out of JSONB and into first-class columns. On a petabyte-scale dataset, that turns out to be a pretty big win.

As a rule of thumb, each column costs about 1 bit of overhead for each row in your table, regardless of whether the column’s value is null.[3] So, for example, if an optional field is going to have a ten-character key in your JSONB blobs, and thus cost at least 80 bits to store the key in each row in which it’s present, it will save space to give it a first-class column if it’s present in at least 1/80th of your rows.

For datasets with many optional values, it is often impractical or impossible to include each one as a table column. In cases like these, JSONB can be a great fit, both for simplicity and performance. But, for values that occur in most of your rows, it’s still a good idea to keep them separate.In practice, there is often additional context to inform how you organize your data, such as the engineering effort required to manage explicit schemas or the type safety and SQL readability benefits from doing so. But there is often an important performance penalty as well for unnecessarily JSONB-ing your data.

Know another innocuous change with big performance implications? Ping me @danlovesproofs.

We’re constantly evaluating alternative schemas and indexing strategies for serving ad hoc queries across hundreds of billions of events. Interested in working with us? Shoot us a note at jobs@heapanalytics.com.

[1] I recommend this post, for starters: https://www.citusdata.com/blog/2016/07/14/choosing-nosql-hstore-json-jsonb/
[2] As an aside, explain.depesz is a wonderful tool for finding problems like these in your queries. You can see in this example that the planner underestimated how many rows would be returned by this subquery by a factor of 124,616.
[3] This isn’t quite correct. PostgreSQL allocates one byte per row for the first 8 columns, and then 8 bytes / 64 columns at a time after that. So, for example, your first 8 columns are free, and the 9th costs 8 bytes per row in your table, and then the 10th through 72nd are free, and so forth. (H/t Michael Malis for the investigation into this.)

Additional resources:

相关阅读:
D3js-实现图形拖拽及双击恢复
D3js-三种图表tooltip提示框总结介绍
D3js-API介绍【英】
D3js-API介绍【中】
springboot 整合 activemq 同时使用点对点模式和发布订阅模式
docker 安装 activemq
nginx 配置websocket 400 问题
springboot +vue 整合websocket 403
m3u8下载器
linux scp 命令

原文地址：https://www.cnblogs.com/panpanwelcome/p/13931557.html