Postgres with Gen AI

Top challenges faced by developers developing AI applications

I’ve been working in the Gen AI space, especially with enterprise customers running Postgres databases, and I really love the optimism. Although the earlier hype was a little overwhelming, I now see green shoots of AI being productionized across industries. I have met engineers from several companies and joined a number of conference calls around APAC. In this blog, I will cover what I’ve been doing in the AI/LLM space and which areas still need further clarity. Everything written here has already been announced at Google’s public developer summits and events, so please don’t expect any leaks. :smile:

I took a closer look at how prevalent Postgres is in terms of deployments across industry verticals, and it is still growing. The Stack Overflow Developer Surveys of both 2023 and 2024 report that roughly 49% of developers use Postgres, with its popularity only growing in recent years. Professional developers use Postgres more than any other database, and it ranks as both the most admired and the most desired database among respondents. In the overall database market, Postgres holds roughly a 20% share if we include on-premise deployments and local databases such as SQLite, alongside NoSQL databases.

Thanks to this popularity and a robust feature set, particularly the pgvector extension, PostgreSQL is increasingly used in AI and Large Language Model (LLM) applications. pgvector adds a vector datatype to Postgres and provides indexing (HNSW, IVFFlat) and querying methods on top of dense or sparse vectors. Google has actively contributed to pgvector.
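Conceptually, a pgvector similarity query (`ORDER BY embedding <=> query LIMIT k`, where `<=>` is pgvector’s cosine-distance operator) behaves like the small pure-Python sketch below: an exact, sequential scan over every stored vector. The document names and toy embeddings are made up for illustration.

```python
import math

def cosine_distance(a, b):
    # pgvector's <=> operator: 1 - cosine similarity.
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

def nearest(query, rows, k=1):
    # Exact k-NN by scanning every row -- what Postgres does without an index.
    return sorted(rows, key=lambda r: cosine_distance(query, r[1]))[:k]

# Hypothetical documents with tiny 3-dimensional embeddings.
docs = [
    ("apple",  [1.0, 0.1, 0.0]),
    ("banana", [0.9, 0.2, 0.1]),
    ("car",    [0.0, 0.1, 1.0]),
]
print(nearest([1.0, 0.0, 0.0], docs, k=1))
```

This exact scan is correct but costs O(rows) distance computations per query, which is exactly why the index types discussed next matter at scale.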

Some customers with a larger corpus of data experience problems with index build times (hours for billions of records) and high memory usage (1M vectors take ~6GB); others need fast, real-time index updates or better vector query performance. Google has a proprietary index, ScaNN, which it positions as the leading option for fast approximate nearest neighbour (ANN) vector queries on Postgres databases. This gives Google a significant edge over other Postgres providers for AI/LLM use cases. Postgres itself, however, still lags behind vector-first databases such as Pinecone, Milvus, ChromaDB, Qdrant, and Weaviate, to name a few.
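To see why ANN indexes such as IVFFlat (and, in spirit, ScaNN’s partitioning) cut query cost, here is a toy Python sketch: vectors are pre-assigned to cluster “inverted lists”, and a query probes only the closest list(s) instead of scanning everything. All function names and data here are illustrative, not any library’s API.

```python
import math

def l2(a, b):
    # Euclidean distance between two vectors.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def build_ivf(vectors, centroids):
    # Index build: assign each vector id to its nearest centroid's list.
    lists = {i: [] for i in range(len(centroids))}
    for vid, v in enumerate(vectors):
        i = min(range(len(centroids)), key=lambda c: l2(v, centroids[c]))
        lists[i].append(vid)
    return lists

def ann_search(query, vectors, centroids, lists, nprobe=1):
    # Query: rank centroids, then scan only the nprobe closest lists.
    order = sorted(range(len(centroids)), key=lambda c: l2(query, centroids[c]))
    candidates = [vid for c in order[:nprobe] for vid in lists[c]]
    return min(candidates, key=lambda vid: l2(query, vectors[vid]))

vectors = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]]
centroids = [[0.0, 0.0], [5.0, 5.0]]   # normally learned by clustering
lists = build_ivf(vectors, centroids)
print(ann_search([0.02, 0.0], vectors, centroids, lists, nprobe=1))
```

The trade-off is visible even in the toy: with `nprobe=1` the query scans half the data, and the answer is only approximate when the true neighbour sits in an unprobed list, which is why recall tuning (`nprobe`, list count) matters in production.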

Currently, enterprise customers host their data in Postgres, but for experimentation they dump parts of it (a few records or tables as samples) into some vector-first database, assuming Postgres won’t scale to their requirements, whether in terms of long indexing times or slow query performance. Another problem is that LLM prototypes never get productionized, because data migrations are hard and AI applications are only as good as the volume of data they can see!

As part of its “AI-first databases” strategy, Google has released AI and LLM integrations for AlloyDB and Cloud SQL for PostgreSQL, such as Model Endpoint Management and the LangChain and LlamaIndex integrations. These initiatives signal to Postgres developers that Google is in the “AI/LLM for databases” game, and in it for the long term.

Let’s look at the road ahead. There are no clear winners in the Postgres world for AI/LLM yet, only initiatives to unlock use cases and efforts to win market share. I have already written about the current issues with Gen AI, and they remain unsolved. But for enterprises that want to experiment and find their market fit, the AI/LLM ground is an open game. For them, AI is costly but not impossible, and potentially a cost-saving investment in automation over the long term.

Conditions under which AI and LLM deployments make sense:

  • Non-business-critical systems that soak up a lot of human time. Think of an LLM application that looks into past issues and summarizes the fixes that were applied. Examples: internal dev tools, customer support, executive support, and so on. I saw one being built for a US automaker OEM’s service centres too, letting car technicians chat with it before starting work on an issue.

  • The cost of AI can be passed directly on to customers. The recurring infrastructure (server) cost of running Gen AI applications often exceeds users’ subscription fees. Think of an admin query that traditionally requires a data analyst to pull sales reports and crunch data for a management report; such premium tasks can justify a paid AI.

  • Using a locally hosted AI. I don’t know about others, but I do suffer from model fatigue (the challenge of constantly evaluating and choosing from a vast, rapidly evolving landscape of open-source models) and often find myself unable to pick the best open-source model for my use case. For low-volume use cases (for example, my own local usage), I can deploy a specific model and keep it until it critically suffers from hallucinations or accuracy issues.

  • LLMs for efficient querying. Usually powered by vector or embedding search over data stored in Postgres, developers can build much more powerful applications. Vector search is more compute-intensive than traditional word-based search, so a query can define the degree of AI to apply to database querying. For example, I worked on a LangChain integration with AlloyDB that used Postgres vector stores for embeddings; the performance gains were significant, but optimizing for latency required careful batching and indexing strategies.

  • Agentic workflows are catching up. Anthropic’s Model Context Protocol (MCP) has emerged as a critical way to connect agents to tools. With MCP, LLMs are at least quite accurate at calling one layer of tools, though not yet very good when tools need to call other tools. Google has released the Gen AI Toolbox, which aims to resolve issues with scaling and updating tools. This space is exciting and very promising; a lot can be done here, and I am keen to see how it unfolds.
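As a rough illustration of what “one layer of tools” means, here is a toy Python dispatcher: the model is shown tool descriptions, emits a JSON tool call, and the runtime executes it. This is not the MCP wire protocol, and the tool name, schema, and canned result are all made up for the sketch.

```python
import json

# Hypothetical tool registry: each tool carries a description (what the
# model sees) and a Python callable (what the runtime executes).
TOOLS = {
    "get_order_status": {
        "description": "Look up an order's status by order_id.",
        "parameters": {"order_id": "string"},
        # Canned result standing in for a real backend lookup.
        "fn": lambda order_id: {"order_id": order_id, "status": "shipped"},
    },
}

def dispatch(tool_call_json):
    # Execute a model-emitted call of the form {"name": ..., "arguments": {...}}.
    call = json.loads(tool_call_json)
    tool = TOOLS[call["name"]]
    return tool["fn"](**call["arguments"])

# Given the registry above, a model might emit this single-layer call:
result = dispatch('{"name": "get_order_status", "arguments": {"order_id": "A42"}}')
print(result)
```

The hard part the blog alludes to starts when `fn` itself needs to invoke further tools: errors, schemas, and state then have to propagate across layers, which is where current setups are still weak.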

In essence, while the Gen AI landscape is still evolving rapidly and has its share of unsolved challenges, its integration with a steadfast and popular database like PostgreSQL is undeniably a game-changer. Google’s strategic initiatives, enhancing vector capabilities with innovations like ScaNN and streamlining development through the Gen AI Toolbox, underscore a strong commitment to making AI more accessible and powerful within the database itself. For enterprises and developers ready to navigate the costs and complexities, the potential to unlock significant efficiencies and build truly intelligent applications on top of their existing Postgres data is immense, paving the way for a future where AI and data are more deeply and seamlessly intertwined.
