3
Admin How To P ostgreSQL has been called the ‘most advanced open source database’. I have been using it for the last four years as an RDBMS for Foodlets.in, and as a spatial data store at CSTEP (Center for Study of Science, Technology and Policy). PostgreSQL is one piece of software that doesn’t fail to impress me every now and then. Installing the modules Note: I am running Ubuntu 10.04 and PostgreSQL 8.4. Install the postgresql-contrib package and restart the database server, then check the contrib directory for the list of available modules: sudo apt-get install postgresql-contrib sudo /etc/init.d/postgresql-8.4 restart cd /usr/share/postgresql/8.4/contrib/ ls Create a test database called module_test: su postgres createdb module_test Apply the chkpass, fuzzystrmatch, isn and hstore modules to the module_test database by running the following commands: psql -d module_test -f chkpass.sql psql -d module_test -f fuzzystrmatch.sql psql -d module_test -f isn.sql psql -d module_test -f hstore.sql Let us now look at an example of how each of the modules is used. Using chkpass The chkpass module will introduce a new data type, ‘chkpass’, in the database. This type is used to store an encrypted field, e.g., a password. Let’s see how chkpass works for a user account table that we create and insert two rows into: In this article, we will learn how to install and use the PostgreSQL modules chkpass, fuzzystrmatch, isn and hstore. Modules add different capabilities to a database, like admin and monitoring tools, new data types, operators, functions and algorithms. Let’s look at modules that add new data types and algorithms, which will help us to push some of the application logic to the database. 88 | MARCH 2012 | LINUX FOR YOU | www.LinuxForU.com Installing and Using PostgreSQL Modules

PostgreSQL Modules Tutorial - chkpass, hstore, fuzzystrmach, isn

Embed Size (px)

Citation preview

Page 1: PostgreSQL Modules Tutorial - chkpass, hstore, fuzzystrmach, isn

Admin How To

PostgreSQL has been called the ‘most advanced open source database’. I have been using it for the last four years as an RDBMS for Foodlets.in, and as a spatial

data store at CSTEP (Center for Study of Science, Technology and Policy). PostgreSQL is one piece of software that doesn’t fail to impress me every now and then.

Installing the modules

Note: I am running Ubuntu 10.04 and PostgreSQL 8.4.

Install the postgresql-contrib package and restart the database server, then check the contrib directory for the list of available modules:

sudo apt-get install postgresql-contrib

sudo /etc/init.d/postgresql-8.4 restart

cd /usr/share/postgresql/8.4/contrib/

ls

Create a test database called module_test:

su postgres

createdb module_test

Apply the chkpass, fuzzystrmatch, isn and hstore modules to the module_test database by running the following commands:

psql -d module_test -f chkpass.sql

psql -d module_test -f fuzzystrmatch.sql

psql -d module_test -f isn.sql

psql -d module_test -f hstore.sql

Let us now look at an example of how each of the modules is used.

Using chkpassThe chkpass module will introduce a new data type, ‘chkpass’, in the database. This type is used to store an encrypted field, e.g., a password. Let’s see how chkpass works for a user account table that we create and insert two rows into:

In this article, we will learn how to install and use the PostgreSQL modules chkpass, fuzzystrmatch, isn and hstore. Modules add different capabilities to a database, like admin and monitoring tools, new data types, operators, functions and algorithms. Let’s look at modules that add new data types and algorithms, which will help us to push some of the application logic to the database.

88 | March 2012 | LINUX For YoU | www.LinuxForU.com

Installing and UsingPostgreSQL

Modules

Page 2: PostgreSQL Modules Tutorial - chkpass, hstore, fuzzystrmach, isn

How To Admin

CREATE TABLE accounts (username varchar(100), password

chkpass);

INSERT INTO accounts(username, “password”) VALUES (‘user1’,

‘pass1’);

INSERT INTO accounts(username, “password”) VALUES (‘user2’,

‘pass2’);

We can authenticate users with a query like the one that follows:

SELECT count(*) from accounts where username=’user1’ and

password = ‘pass1’

The ‘=’ operator uses the eq(column_name, text) in the module to test for equality. Chkpass uses the Unix crypt() function, and hence it is weak; only the first eight characters of the text are used in the algorithm. Chkpass has limited practical use; the pgcrypto module is an effective alternative.

Using fuzzystrmatchThis module installs the soundx(), difference(), levenshtein() and metaphone() functions. Soundx() and metaphone() are phonetic algorithms—they convert a text string to a code string based on its pronunciation. Difference() and levenshtein() return a numeric value based on the similarity of the two input strings. Let’s now look into the levenshtein() and metaphone() functions. The Levenshtein distance between two strings is the minimum number of insertions, deletions or substitutions required to convert one string to another.

SELECT levenshtein(‘foodlets’, ‘booklets’);

This query returns 2, as is obvious.The metaphone() function takes a text string and

the maximum length of the output code as its two input parameters. These examples return FTLTS:

SELECT metaphone(‘foodlets’, 6);

SELECT metaphone(‘fudlets’, 6);

If we try to get the Levenshtein distance between the returned strings, this returns 0:

SELECT levenshtein(‘FTLTS’,’FTLTS’);

This means that the two words sound similar.Fuzzystrmatch is very helpful in implementing the

search feature for a website. Now the search can work with alternate spellings and misspelled keywords. Reminds you of the ‘Did you mean...’ feature on Google Search, right?

Using isnThis module will introduce data types to store international standard numbers like International Standard Book Numbers (ISBN), International Standard Music Numbers (ISMN), International Standard Serial Numbers (ISSN), Universal Product Codes (UPC), etc. It will also add functions to validate data, type-cast numbers from older formats to the newer 13-digit formats, and vice-versa. Let’s test this module for storing book information:

CREATE TABLE books(number isbn13, title varchar(100))

INSERT INTO books(“number”, title) VALUES (‘978-03’,

‘Rework’);

The INSERT statement throws an error: Invalid input syntax for ISBN number: “978-03”. However, this works just fine:

INSERT INTO books(“number”, title) VALUES (‘978-0307463746’,

‘Rework’)

To convert a 10-digit ISBN to 13 digits, use the isbn13() function:

INSERT INTO books(“number”, title) VALUES

(isbn13(‘0307463745’), ‘Rework’)

(Actually, the name of the book mentioned here, 'Rework' by Jason Fried, happens to be my favourite book on product/project management! I have prescribed it to all my team-mates.)

Using hstoreYou must have heard enough about NoSQL and key-value databases. It’s not always NoSQL vs relational databases—with the hstore module, PostgreSQL allows you to store data in the form of key-value pairs, within a column of a table. Imagine you are processing spreadsheets and you have no idea about the column headers and the data type of the data in the sheets. That’s when hstore comes to your rescue! Incidentally, hstore takes keys and values as text; the value can be NULL, but not the key. Let’s create a table with a column of type hstore and insert some rows:

CREATE TABLE kv_data( id integer, data hstore)

INSERT into kv_data values

(1, hstore(‘name’, ‘amit’) || hstore(‘city’, ‘bangalore’)),

(2, hstore(‘name’, ‘raghu’) || hstore(‘age’, ‘26’)),

(3, hstore(‘name’, ‘ram’) || hstore(‘age’, ‘28’));

You can create your own keys like ‘height’, ‘favourite_book,’ etc. The ‘||’ operator is used for

www.LinuxForU.com | LINUX For YoU | March 2012 | 89

Page 3: PostgreSQL Modules Tutorial - chkpass, hstore, fuzzystrmach, isn

Admin How To

By: Sagar Arlekar

The author is a research engineer at CSTEP, Bengaluru. He works in the domains of GIS and agent-based simulations. He co-founded Foodlets.in, a visual food guide built entirely on open source technologies.

concatenation. Now that we have a table and a few rows of data, let’s look at some SELECT, UPDATE and DELETE queries. To select rows with the value for ‘city’ as ‘bangalore’, use the following query:

SELECT * from kv_data where data->’city’ = ‘bangalore’

To get the average age across the table (returns 27.0), use the query given below:

SELECT avg((data->’age’)::integer) age from kv_data;

Here, ::integer is used to type-cast the text value to an integer, so that math operations can be performed on it.

To select and sort rows by ‘name’ values, use:

SELECT * from kv_data order by data->’name’ desc

Update the ‘city’ value to ‘delhi’ for all rows, as follows:

UPDATE kv_data SET data = data || (‘city’ => ‘delhi’);

Then, delete the ‘age’ key (and values) from all rows, as shown below:

UPDATE kv_data set data = delete(data, ‘age’)

Next, delete rows with the ‘name’ as ‘amit’:

DELETE from kv_data where data->’name’ = ‘amit’

Although not a full-fledged key-value storage, hstore does provide us with the flexibility of a key-value database and the power of SQL queries.

Other useful modulesHere are some other modules you may find useful:

• Pgcrypto provides functions for hashing and encryption. It supports SHA, MD5, Blowfish, AES and other algorithms.

• Citext adds a case-insensitive text data type, which stores text in lower-case form.

• Uuid-ossp provides functions to generate universally unique identifiers.

• Pg_trgm adds functions to find text similarity based on trigram matching.

90 | March 2012 | LINUX For YoU | www.LinuxForU.com