PostgreSQL

Image of Author
March 21, 2022 (last updated September 4, 2024)

postgres is the CLI command to start the server. There are various other commands that are useful as well, like initdb and createdb. You can see more where ever postgres is installed. For example, $ ls /usr/local/opt/postgresql@16/bin, or /usr/local/opt/libpq/bin, which is the "Postgres C API library".

Useful quotes from documentation

See my note on PostgreSQL Documentation

psql

See psql for psql related documentation.

JSON operators

https://www.postgresql.org/docs/current/functions-json.html#FUNCTIONS-JSON

JOIN means INNER JOIN

From https://www.postgresql.org/docs/current/queries-table-expressions.html#QUERIES-JOIN

The words INNER and OUTER are optional in all forms. INNER is the default; LEFTRIGHT, and FULL imply an outer join.

Reading the pg_hba.conf file

I believe hba stands for "Host Based Authentication". It specifies which clients can connect to which databases with what authentication methods and as which user.

The ADDRESS section I'm less confident in my understanding of, but I suspect it is which IP address the connecting client is allowed to connect from. E.g., if you have the right credentials but the wrong IP address you are ignored.

Here's an example output of a pg_hba.conf file from a Fly.io db machine.

local all postgres  trust
local all flypgadmin  trust
local repmgr repmgr  trust
local replication repmgr  trust
host replication repmgr fdaa::/16 trust
host replication,repmgr repmgr fdaa::/16 trust
host all all 0.0.0.0/0 md5
host all all ::0/0 md5

The columns are CONNECTION_TYPE DATABASE USER ADDRESS AUTH_METHOD. In the first few the ADDRESS is missing, indicated presumably by two spaces instead of one. I'm not sure what this mean. All I know is that this usually lists the client address. AUTH_METHOD can be things like password, md5 (meaning password encrypted with md5), trust (meaning no password).

You can leverage these to learn about connection details. For example, I learned by running ps aux | grep postgres that the postgres server was running on port 5433. Knowing that there was a local, trusted, postgres user mean I could connect to the server via psql -U postgres -p 5433

Installation

Brew has versioned postgres available for installation. E.g., brew install postgresql@16. brew does not symlink these installations into the path, meaning you can't auto-discover them, e.g., which postgres.

Connection

PostgreSQL Documentation#33.1.1.2 Connection URIs covers some of the valid connection strings. The easiest when hosting locally is postgresql://locahost/db-name

Architecture

Every command mentioned can be explored further via the --help flag.

A server controls access to a database cluster. The server program is postgres. A convenience function exists to help users start the server, stop the server, etc., called pg_ctl. For example, pg_ctl start will start the server.

A single server instance manages a collection of databases referred to as a database cluster. To create a cluster, call the initdb command. The only required flag for initdb is -D, aka, --pgdata, the value of which is the directory within which to store all the data in the cluster.

To access the database cluster, you need to connect to it through the server. The server is configured to listen in particular places, either through TCP/IP, or via unix sockets. The most fundamental way to configure the server is via a configuration file in the data directory, postgresql.conf. (Calling initdb will create a default config.) There are other ways of configuring the server, such as passing commands on server start via the shell. Once you connect to the server, you can interact with any database in the cluster, as well as create new databases.

Connecting via unix socket

In the postgresql.conf file is a setting for unix_socket_directories, which by default is /tmp. It controls the location of the unix socket.

(A socket allows for inter-process communication on a single unix machine. By analogy, data moving over a socket is like cars moving over a bridge. You can name a bridge, you can name a socket, i.e., its filename. The reason it might seem confusing is because it's rarely the destination, just like a bridge, but it is still a place, just like a bridge.)

Once you specify the location of the socket, when you start postgres, a socket file will be created in the target directory with a name .s.PGSQL.5432 (or whatever port you specify). You can then connect to that socket via psql to test the connection with psql -h /absolute/path/to/dir/containing/socket -l to list the databases.

Resources