PostgreSQL 10 Administration Cookbook
上QQ阅读APP看书,第一时间看更新

Getting ready

Let's start by looking at our example table, cust. It has a duplicate value in customerid:

postgres=# SELECT * FROM cust;
customerid | firstname | lastname | age
------------+-----------+----------+-----
1 | Philip | Marlowe | 38
2 | Richard | Hannay | 42
3 | Holly | Martins | 25
4 | Harry | Palmer | 36
4 | Mark | Hall | 47
(5 rows)

Before you delete duplicate data, remember that sometimes it isn't the data that is wrong; it is your understanding of it. In those cases, it may be that you haven't properly normalized your database model, and that you need to include additional tables to account for the shape of the data. You might also find that duplicate rows are caused because of you deciding to exclude a column somewhere earlier in a data load process. Check twice, and delete once.