PostgreSQL PostGIS - getting started example

PostgreSQL PostGIS - getting started example

introduce

Spatial data is an important kind of data. The software used in our daily life, such as map navigation, taxi software, restaurant recommendation and takeout express, are closely related to spatial data. Spatial data usually has complex structure and large amount of data. For the analysis and query of spatial data, its mode is also very different from ordinary data, and ordinary DBMS is difficult to meet the requirements.

PostgreSQL has built in many spatial features: geometric data types, geometric type functions and operators, and spatial data indexes, but it is still unable to meet the complex needs of the real world: roads with many auxiliary roads and branches need to be represented by multiple broken lines; Enclaves in administrative areas need to be represented by a collection of multiple polygons; Efficiently judge whether the point is in the polygon, whether the two polylines have intersection, and whether the two regions are separated, adjacent, overlapping or included. Fortunately, the powerful extension mechanism of PostgreSQL leaves a window to solve the problem. PostGIS was born for this, and it has become the de facto standard of GIS industry.

Coordinate system

Geographic coordinate system

Geographic coordinate system, or spherical coordinate system;

Projection coordinate system

Projected coordinate systems or planar coordinate systems

Geographic information is usually expressed in longitude and latitude, such as East diameter and north latitude; However, in different coordinate systems and projection standards, the units representing the longitude and latitude of geographic information are different.

In the geographical coordinate system, the unit of longitude and latitude is degrees, and the coordinates are expressed as: (117.435974 33.609617); However, in the projection coordinate system, different projection standards have different units, such as the commonly used projection standards: 3857 in meters and 4326 in degrees.

Geographic data type

In the database:

  • geometry: planar coordinate system (projected coordinate system)

  • geography: geographic coordinate system (spherical coordinate system)

If you want to calculate the actual distance between two lat and lon points, you need to convert the geometry type to the geography type

install

The installation tutorial has been explained in detail in my previous article. It will not be introduced here. Please check my previous article if necessary.

Create GIS database

PostGIS is an extension of PostgreSQL. Connect and execute the following commands to create a geo database and load the PostGIS extension:

su postgres
psql
CREATE DATABASE geo;

CREATE EXTENSION postgis;

Connect PostgreSQL and execute the following query to confirm that the PostGIS extension has been installed correctly and can be recognized by the database:

SELECT name,default_version FROM pg_available_extensions WHERE name ~ 'gis';

After execution, execute postgis_full_version view the current PostGIS version

SELECT postgis_full_version();

Geometric object

PostGIS supports many geometric types: point, line, polygon, composite geometry, etc., and provides a large number of practical related functions.

Note: Although the geometry types in PostGIS are very similar to the geometry types built in PostgreSQL, they are not the same thing. All object commands in PostGIS usually start with ST, which is the abbreviation of Spatial Type.

For PostGIS, all geometric objects have a common parent class Geometry. This object-oriented organization allows some flexible operations in the database: for example, different geometric objects are stored in the same column in the data table, and each collection object is actually the packaging of the objects in the underlying Geometry library geos of PostGIS.

Input of geometric objects

PostGIS supports a variety of spatial object creation methods, which can be divided into several categories:

  • Well known text format (WKT, well-known text)
  • Well known binary format (WKB)
  • GeoJSON and other codes
  • Function that returns geometric type

You can create geometric points (1, 2) in the following four ways, and the results are the same:

SELECT 
'Point(1 2)'::GEOMETRY               AS wkt,
'0101000000000000000000F03F0000000000000040'::GEOMETRY          AS wkb,
ST_GeomFromGeoJSON('{"type":"Point","coordinates":[1,2]}')      AS  geo_json,
ST_Point(1,2)                          AS func;

Storage of geometric objects

The geometry types of PostGIS and PostgreSQL use different storage methods.

Take point as an example, use the built-in point and ST_Point creates a point and returns the following results:

SELECT Point(1,2),ST_Point(1,2);

The point in PostgreSQL is just a structure containing two Double (16 bytes). But the point type of PostGIS is ST_Point adopts different storage methods (21 bytes). In addition to two coordinate components, it also includes some additional metadata: for example, the actual type of geometric object, the ID of reference system, etc.

When querying the PostGIS spatial data type, PostgreSQL will return the binary data representation of the object in hexadecimal form. This facilitates all kinds of ETL tools to deal with spatial data types in a unified way. Tables containing spatial data can also use PG_ Tools such as dump are handled in the same way.

If you need a human readable format, you can use ST_AsText outputs WKT instead of WKB, as shown below:

SELECT ST_AsText(ST_Point(1,2));

In practical use, the spatial data type of PostGIS usually uses the unified Geometry type. Whether it is a point, a discount, or a polygon, it can be placed in the Geometry type field. For example:

CREATE TABLE geo(
geom GEOMETRY
);

INSERT INTO geo VALUES(ST_Point(1.0,2.0)),('LineString(0 0,1 1,2 1,2 3)'),('Polygon((0 0,1 0,1 1,0 1,0 0))'),('MultiPoint(1 2,3 4)');

Output of geometric objects

Similar to the input of geometric objects, geometric objects can also be output in a variety of ways. Here, take the most common WKT and GeoJSON as examples:

SELECT ST_AsText(geom) AS wkt,ST_ASGeoJSON(geom) AS json FROM geo;

Some geometry types support special output methods, such as outputting geometric points in longitude and latitude format:

SELECT ST_AsLatLonText(ST_Point(116.321367,39.966956));

Operation of geometric objects

PostGIS improves a variety of relationship judgment and geometric operation functions, which is very powerful. The functions that originally required thousands of lines of business code can be realized, but now only one line of SQL is required.

Calculate the distance between two points

There are various relationships between geometric objects. The simplest one is the distance between two points, as shown below:

SELECT ST_Point(1,1) <-> ST_Point(2,2);

For example, if you calculate the distance between point (1, 1) and point (2, 2), the result should be root two.

The coordinate calculation of two geometric points is relatively easy, but the distance between two geographical coordinates is quite complex. It is necessary to calculate the spherical distance on an irregular sphere. For example, the longitude and latitude of location A are: (116.321367, 39.966956), and the coordinates of location B are: (116.315346, 39.997398). If the geometric distance is directly used for calculation, the results are of little significance except that they can be used for rough comparison of relative distances, as shown below:

SELECT ST_Point(116.321367,39.966956) <-> ST_Point(116.315346,39.997398);

However, by introducing the geographical coordinate system (4326 coordinate system, referring to WGS84 international standard GPS coordinate system), the geographical distance (3.4km) between the two locations can be calculated as follows:

SELECT ST_AsText(ST_GeomFromText('POINT(116.321367 39.966956)',4326))::geography <-> ST_AsText(ST_GeomFromText('POINT(116.315346 39.997398)',4326))::geography;

Calculate route length

To calculate the length of a road L, st can be used_ Length counts the total length of broken lines in WGS84 coordinate system (unit: m)

SELECT st_length(st_transform(st_geomfromgeojson('{
        "type": "LineString",
        "coordinates": [
          [
            118.71809005737303,
            31.023186296488667
          ],
          [
            118.7243127822876,
            31.023811498034163
          ],
          [
            118.73383998870848,
            31.02583418080608
          ],
          [
            118.74126434326172,
            31.028040694848137
          ],
          [
            118.73971939086913,
            31.03186519810428
          ],
          [
            118.73405456542969,
            31.029952965672855
          ],
          [
            118.72246742248535,
            31.029217481437648
          ],
          [
            118.71671676635741,
            31.028812962687404
          ],
          [
            118.7104082107544,
            31.02818779396694
          ],
          [
            118.70620250701903,
            31.02708454503873
          ],
          [
            118.70337009429932,
            31.02344373291606
          ],
          [
            118.70242595672607,
            31.01825809362147
          ],
          [
            118.70221138000487,
            31.010203243220573
          ],
          [
            118.70843410491943,
            31.009798643737952
          ],
          [
            118.72508525848389,
            31.009908989221714
          ],
          [
            118.73409748077393,
            31.012520461717404
          ],
          [
            118.73353958129883,
            31.01134346888612
          ],
          [
            118.73334646224976,
            31.011177953103566
          ],
          [
            118.7334430217743,
            31.010773357758367
          ],
          [
            118.73338937759398,
            31.01019404779685
          ],
          [
            118.7333357334137,
            31.00958714787035
          ],
          [
            118.73304605484009,
            31.008658399592516
          ],
          [
            118.73082518577577,
            31.008557248243537
          ],
          [
            118.73029947280882,
            31.00909978604104
          ],
          [
            118.72976303100585,
            31.009908989221714
          ],
          [
            118.72851848602294,
            31.010635427134496
          ]
        ]
      }'),4326)::geography) as length;

Through calculation, the road length is 12 kilometers

Calculated area

Via ST_Area vs. ST_Polygon calculated as follows:

SELECT st_area(st_transform(st_geomfromgeojson('{
        "type": "Polygon",
        "coordinates": [
          [
            [
              118.72319698333739,
              31.014249143586866
            ],
            [
              118.72525691986083,
              31.010092898077463
            ],
            [
              118.7300205230713,
              31.01097565564564
            ],
            [
              118.7280035018921,
              31.015058303050022
            ],
            [
              118.72319698333739,
              31.014249143586866
            ]
          ]
        ]
      }'),4326)::geography) as length;

3423 square meters = 0.003423 square kilometers = 0.3423 hectares

Application scenario: human circle and geographic fence

Circle people is a common demand in LBS service: give a central point and find out all qualified objects within a certain distance around the point. For example, find all bus stops within 1km around the user center and sort by distance.

The traditional relational database may be quite complex to implement. Suppose the user is in subway station A: (116.321367,39.966956). The common practice is to take the user as the center, add and subtract the offset of one kilometer for longitude and latitude respectively (one degree is about 111 kilometers), and then use the index on longitude and latitude for the first filtering (at this time, the filtering uses the rectangular range), as shown below:

CREATE TABLE stations(
    name   TEXT,
    longitude  DOUBLE PRICISION,
    latgitude  DOUBLE PRICISION,
);

SELECT name FROM stations WHERE longitude BETWEEN 116.312358 AND 116.330376 AND latitude BETWEEN 39.957947 AND 39.975965;

Then, in the application code, calculate the geometric distance for each qualified point, judge whether it meets the distance condition, and finally sort the output. Of course, GeoHash method can also be used to convert two-dimensional coordinates into one-dimensional string coding, and pre level matching can be carried out during query.

CREATE TABLE stations(
    name   TEXT,
    position  geography,
);


SELECT name,ST_Point(116.321367,39.966956)::geography <-> position::geography as distance from stations where st_point(116.321367,39.966956)::geography <-> position::geography < 500 order by ST_Point(116.321367,39.966956)::geography <-> position::geography;

Spatial index

On the 1 million line watch, the implementation of violent scanning is also barely usable, but for the large watch with tens of millions and billions of production environment, we can't do so. For example, on the POI table with 100 million records, query the 1000 POI points closest to metro station A, as follows:

SELECT name FROM poi ORDER BY position <-> ST_Point(116.458855,39.909863) LIMIT 1000;

This query took 3 minutes to execute. Now use the GIST index provided by PostgreSQL, as follows:

CREATE INDEX CONCURRENTLY idx_poi_position_gist ON poi USING gist(position);

After using the index, the same query can be performed in less than 1ms, which is hundreds of thousands of times faster

Geographical fence

Drawing circles with points and distances is a common scene. Another scene is to judge which geographical fences a point falls in

For example, there are vehicle and user location coordinates. Now we want to get the user's city (or region, business district, etc.) from the coordinates. For example, the no parking area detection of shared bicycles and the no flying area identification of UAVs are all such scenes.

CREATE TABLE aoi(
	name TEXT,
    bound GEOMETRY
)
# Detect the business district of the central point of metro station A
SELECT name FROM aoi WHERE ST_Contains(bound,ST_Point(116.458855,39.909863));

Keywords: Database PostgreSQL gis postgis

Added by floR on Thu, 03 Mar 2022 00:09:17 +0200