Handling Vectors in AI Context via PostgreSQL pgVector

  • Posted in IT
  • on 03-04-2024 11:10
  • by Horatiu Dan

Relational databases are optimized for storing and querying structured data, yet most of today's data is unstructured. Artificial Intelligence and Machine Learning can now “structure” pieces of unstructured data without altering their semantics. First, they transform the data into arrays of numbers, structures called vectors. Then, the vectors are stored in dedicated databases and processed as needed, so that the initial data becomes useful and meaningful as part of a high-dimensional space.

In an AI context, these numerical arrays are called vector embeddings and can be seen as sets of “characteristics” of the represented entities (objects). Their role is to let AI models make inferences on them and, consequently, on the initial input data.
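
To make the idea of embeddings as “characteristics” more concrete, here is a minimal Python sketch. The three 4-dimensional vectors are invented for illustration and do not come from any real model (real embeddings usually have hundreds or thousands of dimensions), and cosine similarity is used here only as one common way of comparing two vectors.

```python
import math

# Hypothetical, hand-made embeddings (assumption: not produced by a real model).
# Each position can be read as one "characteristic" of the entity.
embeddings = {
    "cat": [0.9, 0.8, 0.1, 0.0],
    "dog": [0.8, 0.9, 0.2, 0.1],
    "car": [0.1, 0.0, 0.9, 0.8],
}


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Entities with similar characteristics end up with a higher similarity score.
cat_dog = cosine_similarity(embeddings["cat"], embeddings["dog"])
cat_car = cosine_similarity(embeddings["cat"], embeddings["car"])
print(f"cat vs dog: {cat_dog:.3f}")
print(f"cat vs car: {cat_car:.3f}")
```

With these made-up values, "cat" scores closer to "dog" than to "car", which is the kind of relationship a model can exploit once the data has been turned into vectors.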

This article is an introduction on how to turn PostgreSQL into a vector database using the pgVector extension. It also briefly presents a few general vector sim...