The good news is that you can configure all the data to be copied over to Azure Search on a regular basis and then use search capabilities provided by Azure Search. This may sound like a complicated thing to do but the functionality provided by Azure makes it ridiculously easy to configure something like this.
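To make the "copied over on a regular basis" part concrete, here is a minimal sketch of the kind of indexer definition you would send to the service. The `name`, `dataSourceName`, `targetIndexName` and `schedule.interval` fields follow the Azure Search indexer REST payload as I understand it; the helper function and the example names are hypothetical, and nothing here actually calls the service.

```python
import json


def iso8601_interval(minutes: int) -> str:
    """Format a minute count as an ISO 8601 duration such as PT1H30M."""
    # Assumption: the service accepts intervals between 5 minutes and 1 day.
    if not 5 <= minutes <= 24 * 60:
        raise ValueError("interval must be between 5 minutes and 1 day")
    days, rest = divmod(minutes, 24 * 60)
    hours, mins = divmod(rest, 60)
    out = "P"
    if days:
        out += f"{days}D"
    if hours or mins:
        out += "T"
        if hours:
            out += f"{hours}H"
        if mins:
            out += f"{mins}M"
    return out


def build_indexer_definition(name: str, data_source: str, index: str,
                             every_minutes: int) -> dict:
    """Build the JSON body for an indexer that runs on a schedule."""
    return {
        "name": name,
        "dataSourceName": data_source,   # hypothetical data source name
        "targetIndexName": index,        # hypothetical index name
        "schedule": {"interval": iso8601_interval(every_minutes)},
    }


if __name__ == "__main__":
    body = build_indexer_definition("hotels-indexer", "hotels-db", "hotels", 60)
    print(json.dumps(body, indent=2))
```

The portal wizard produces something equivalent for you, so you only need to write this by hand if you are scripting the setup.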
If you're looking for a search backend without having to worry about managing infrastructure, you should check out Azure Search. I'm a dev on the product. If you're interested in trying it out let me know. You can ping me at my username @microsoft.com if you have questions or need pointers.
I'm working on a travel agency website that integrates with global booking systems, something like Expedia but less complex. Along the way I've learned a lot, especially about a wide range of Azure services. I just learned Azure Search and wow, it works great!
[Full disclosure, I work on the Azure Search team]
Cosmos DB has tight integration with Azure Search, to the point that you can choose to extend your Cosmos DB to Azure Search with a few clicks right from the portal to allow for full text search over this content.
AFAIK the Azure APIs provide suitable data usage requirements. One of the most fascinating aspects of the AI world is that we've made extraordinarily expensive brute force search a valuable tool.
Having Azure Table Storage indexed into Azure Search, with an Azure Function acting as the indexer, gives you a pretty powerful alternative that isn't exactly batteries included, but close.
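As a rough sketch of what the function body might do: take table entities, derive a search-safe document key, and wrap everything in the batch payload that the index documents endpoint expects. The `value` array and `@search.action` field are the real batch format as far as I know; the `id` key field, the helper names and the sample entity are my own assumptions, and the HTTP call is left out.

```python
import base64


def make_document_key(partition_key: str, row_key: str) -> str:
    """Combine PartitionKey and RowKey into one key safe for a search index.

    Search document keys only allow letters, digits, '_', '-' and '=',
    so the pair is URL-safe base64 encoded.
    """
    raw = f"{partition_key}|{row_key}".encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii")


def build_index_batch(entities, action: str = "mergeOrUpload") -> dict:
    """Turn table entities (plain dicts) into an index batch payload."""
    docs = []
    for entity in entities:
        # Copy every property except the two table keys.
        doc = {k: v for k, v in entity.items()
               if k not in ("PartitionKey", "RowKey")}
        # "id" is a hypothetical key field name in the target index.
        doc["id"] = make_document_key(entity["PartitionKey"], entity["RowKey"])
        doc["@search.action"] = action
        docs.append(doc)
    return {"value": docs}


if __name__ == "__main__":
    sample = [{"PartitionKey": "hotels", "RowKey": "42", "name": "Sea View"}]
    print(build_index_batch(sample))
```

Using `mergeOrUpload` means the function can be re-run on the same entities without creating duplicates, which is handy when the trigger fires more than once.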
The setup that MS supplies as a one-click solution on Azure [1] splits documents by page and stores a vector for each page in what was recently renamed Azure AI Search.
From there they use a special API that comes with GPT models deployed on Azure OpenAI, which looks things up using either the cognitive search service or vector search.
That API is a black box, so either they pass the user message through as the query or they have the LLM write a search query.
I would assume 365 uses basically the same architecture.
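For what it's worth, the "split by page and store a vector" step is simple to picture. This is only a toy sketch under my own assumptions: pages are separated by form feed characters, the record field names are made up, and `toy_embedding` is a stand-in that hashes the text instead of calling a real embedding model.

```python
import hashlib


def toy_embedding(text: str, dims: int = 8) -> list:
    """Placeholder for a real embedding model: deterministic but meaningless."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return [b / 255.0 for b in digest[:dims]]


def pages_to_records(doc_id: str, text: str, embed=toy_embedding) -> list:
    """Split a document into pages and build one search record per page."""
    # Assumption: the extracted text uses form feed ("\f") as a page break.
    pages = [p.strip() for p in text.split("\f")]
    return [
        {
            "id": f"{doc_id}-page-{n}",   # hypothetical key format
            "page": n,
            "content": page,
            "vector": embed(page),
        }
        for n, page in enumerate(pages, start=1)
        if page  # skip empty pages
    ]


if __name__ == "__main__":
    for record in pages_to_records("handbook", "Intro text\fPolicy text"):
        print(record["id"], len(record["vector"]))
```

In a real setup you would swap `toy_embedding` for a call to an actual embedding model and upload the records to the search index.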
I work for a megacorp (250k+ employees). We have a lot of subscriptions, and a lot of resources :)
Just about everywhere I'd want to filter stuff in the Azure portal, there's a search box and/or filters. Honestly, I find it really snappy.
Those aren't OLAP replacements though, right? You would still need to manually build an OLAP db from the lower level to report on them. Maybe I'm wrong since I haven't used Azure...
I think if you've outgrown pgvector's performance, you won't be listening to a random guy talking about pgvector; you'll already have a good understanding of the space.
If you're new (like I was a few months ago) save yourself the time I wasted on the noobtraps I mentioned. It scales way better than the OpenAI API for my use cases.
Lucene (or rather Elasticsearch/OpenSearch) is way overkill for my needs.
Hi, kabouseng. Right now, I'm the only engineer working on this.
I mentioned this in a comment on the blog post, but the main reason I was able to put this together by myself and do it relatively quickly was that I was able to leverage pre-existing Azure Search infrastructure.
We have great tools for monitoring clusters, a proven queue-based job process and a good deployment workflow. Not only does it make it easy to set up systems like this, but it also allows us to smoothly and efficiently manage an ever-increasing number of customer services. It's been a good investment for us.