Posts

Loading The Titanic Passenger Data Into PostgreSQL With DBeaver Part 2

In the last edition of this blog, the passenger list data from the RMS Titanic was loaded into a preliminary database. Now it is time to refine. I am using DBeaver Enterprise 25.2.0, PostgreSQL 18, and GitHub Copilot with GPT-4. The following prompt was entered into the DBeaver AI Assistant: "Any recommendations on improving this table for storage efficiency?" The assistant replied:

To improve the storage efficiency of the "titanic" table, we can consider the following recommendations:

1. Optimize Data Types
- "PassengerId": If the number of passengers is not expected to exceed 2,147,483,647, the int4 type is fine. However, if the range is smaller (e.g., less than 32,767), you could use int2 (smallint) to save space.
- "Survived": Since this is likely a binary value (0 or 1), you can use boolean instead of int4.
- "Pclass": If the class is limited to 1, 2, or 3, you can use int2 (smallint) instead of int4.
- "Sex"...
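The assistant's suggestions above boil down to a few ALTER TABLE statements. This is only a sketch under the assumption that the table and column names match the AI response; the USING clause converts the existing integer values in place.

```sql
-- Hedged sketch of the suggested type changes; table and column
-- names are assumed from the AI response above.
ALTER TABLE titanic
    ALTER COLUMN "PassengerId" TYPE smallint;

-- 0/1 integers convert cleanly with an explicit cast in USING
ALTER TABLE titanic
    ALTER COLUMN "Survived" TYPE boolean
    USING "Survived"::boolean;

ALTER TABLE titanic
    ALTER COLUMN "Pclass" TYPE smallint;
```

Note that each ALTER TABLE ... TYPE rewrites the table, so on a larger dataset you would batch these into a single ALTER TABLE with multiple ALTER COLUMN clauses to rewrite only once.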

Loading The Titanic Passenger Data Into PostgreSQL With DBeaver Part 1

The sinking of the RMS Titanic in 1912 has shocked and entertained for over a century. I will admit to being shocked to find a GitHub repo with a CSV-formatted file of the passenger list. This file is an interesting glimpse into the lives of the folks who sailed on the ill-fated ship so long ago, and interesting datasets are always good for examples. I had been looking for a project to show off what I call 'Lifting and Shifting' but others call ETL. At the PG NYC 2025 conference, Ryan Booz had a session titled Transforming Data with the Power of PostgreSQL and SQL. For many years, ETL was considered the way to move data from one source to another. Ryan argued, rather convincingly, that it should be ELT. ETL? ETL is a three-step data integration process (Extract, Transform, Load) used to collect raw data from various sources, process it into a consistent, usable format, and then load it into a central data warehouse or database for analytics and busin...

PostgreSQL 18 Old & New

Learning Structured Query Language can be frustrating when double-checking that what you wanted done is actually what was done. PostgreSQL 18 has 'OLD and NEW support for RETURNING clauses in INSERT, UPDATE, DELETE, and MERGE commands'. Now you can get instant feedback. The addition of the RETURNING clause for MERGE in the previous version made MERGE much easier to use; now the other commands get the same help. To demonstrate, let's create a table with one column designated as a unique, primary key integer and insert a value.

create table foo ( a int unique primary key ) ;
insert into foo ( a ) values ( 1 ) ;

Now is the point where many of us would run a SELECT a FROM foo, just to double-check that there is indeed a 1 in column a. PG 18 adds the option to use RETURNING, and it provides the previous 'old' value of a column along with the 'new' value.

insert into foo ( a ) values ( 2 ) returning old.a, new.a ;

a|a|
-+-+
 |2|

The ...
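For an INSERT the old value is naturally empty, as shown above. The clause gets more interesting with UPDATE, where both versions of the row exist. A small sketch, continuing with the same foo table:

```sql
-- Update the previously inserted row and see both versions at once;
-- old.a is the pre-update value, new.a is the post-update value.
update foo set a = 3 where a = 2
returning old.a as before, new.a as after;
-- before: 2, after: 3
```

Aliasing the columns (before/after) avoids the ambiguous a|a headers seen in the INSERT example.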

PostgreSQL 18 Release Notes

The PostgreSQL 18 Release Notes are like a great noir novel, full of surprises and intrigue. But we know who did it - the amazing PostgreSQL community. If you have not perused the Notes, I advise you to do so now. They contain many of what I call 'Wow!' level items, such as the redesigned I/O subsystem, skip scan lookups, virtual generated columns, and more. The new uuidv7() function and temporal constraints are a significant leap forward. The OLD and NEW support for RETURNING will help lessen the learning curve for many who are new to SQL. But go down to the seemingly more mundane items. Removing redundant self-joins, merge joins that can now use incremental sorts, and internally reordering the keys of SELECT DISTINCT to avoid sorting are the types of improvements that are not as attention-getting as the items in the preceding paragraph, but they will make life much nicer for us. Each release of PostgreSQL is faster than the previous one, and early reports indicate that this ...
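As a taste of one of those 'Wow!' items, the new function can be used directly as a column default. A minimal sketch (table and column names are my own invention):

```sql
-- uuidv7() (new in PostgreSQL 18) produces time-ordered UUIDs,
-- which cluster far better in a B-tree index than random UUIDs.
create table events (
    id   uuid primary key default uuidv7(),
    note text
);

insert into events (note) values ('first event') returning id;
```

Because uuidv7 values embed a timestamp prefix, newly inserted rows land near each other in the index instead of scattering across it.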

Book Review - Mastering PostgreSQL 17

Authors find it challenging to write a comprehensive book about PostgreSQL for several reasons. The first is that a new version of PostgreSQL is released every year with changes, new features, and tweaks. I am lucky to have been able to review several new database titles each year, and I would like to let you know that Hans-Jürgen Schönig's Mastering PostgreSQL 17 is a well-written reference. The first chapter of this book covers changes in 17. The explanations and examples are clear, concise, and easy to comprehend. For instance, the COPY command is convenient, and in 17, it can handle errors. The book's example of this new ability quickly shows how it works - short and sweet! From there, it covers transactions and locking, indexing, advanced SQL, log files, system statistics, optimizing queries (very well done here), how to write stored procedures, security, backups, replication (also very well done), extensions, and troubleshooting. I will ...
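The COPY error handling added in 17 looks roughly like this - a sketch, not the book's exact example, with a hypothetical table and file path:

```sql
-- PostgreSQL 17: skip malformed rows instead of aborting the whole load.
-- Before 17, a single bad row rolled back the entire COPY.
COPY passengers FROM '/tmp/passengers.csv'
WITH (FORMAT csv, HEADER true, ON_ERROR ignore);
```

With ON_ERROR ignore, rows that fail data-type conversion are skipped and the rest of the file loads; the default remains ON_ERROR stop, which preserves the old all-or-nothing behavior.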

Data Security and AI - Sharing Your PostgreSQL Database With Ollama AI

You probably saw the story about your public ChatGPT queries being indexed by Google and other search engines. Yikes! You may want AI with your data but do not wish your prompts to end up in a search engine, especially if you are working with sensitive data. Previously, we covered using some of DBeaver's Artificial Intelligence Assistant features with Ollama. Ollama is popular because it runs on your system, keeping your data from travelling over networks to an AI and back. What happens only on your system stays on your system, to paraphrase the Las Vegas motto. Paranoia in defense of your data is a good thing.

Secure By Default

We will start with an example of preference settings for DBeaver Enterprise Edition for Ollama. DBeaver has other security options I want to bring to your attention when working with an AI. Ollama differs from most other AIs because you do not have to obtain a key to work with it.

AI Configuration - DBeaver Enterprise Edition

The abili...

PostgreSQL, Ollama, and the DBeaver AI Assistant

Ollama is an open-source project that simplifies running large language models (LLMs). It runs locally on your machine, and you can choose from multiple LLMs. Keeping all the data on your machine should provide a security bonus. DBeaver is a universal database tool with an AI Assistant. This assistant provides an extra layer of security by allowing you to lock down what is shared, with the default being metadata only, no data.

Webinar

I recently presented a webinar on using the DBeaver AI Assistant. Several questions were raised about using Ollama, one of the many AI engines available. But I had not tried it. The first step is downloading Ollama for your platform (Linux, Windows, or Mac) and installing it. Once Ollama is installed, you will need to pick an LLM. I chose Gemma3 and installed it using the command line client.

Adding the Gemma3 LLM to Ollama

DBeaver Configuration

Ollama listens on port 11434. I was able to load the Gemma3 model us...