This is interesting, but less interesting than I assumed. I thought they'd provide ChatGPT with a schema and ask it to formulate its responses as `INSERT` statements, etc.
In the same vein, I had a play with asking ChatGPT to `format responses as a JSON object with schema {"desc": "str"}` and it seemed to work pretty well. It gave me refusals in plaintext, and correct answers in well-formed JSON objects.
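Roughly what my handling looked like, as a toy sketch (the `parse_reply` helper is mine, not anything official): try to parse the reply as the requested `{"desc": "str"}` object, and treat anything that doesn't parse as a plaintext refusal.

```python
import json

def parse_reply(text: str):
    """Try to parse a model reply as the {"desc": "str"} object;
    fall back to treating it as a plaintext refusal."""
    try:
        obj = json.loads(text)
        if isinstance(obj, dict) and "desc" in obj:
            return ("answer", obj["desc"])
    except json.JSONDecodeError:
        pass
    return ("refusal", text)

print(parse_reply('{"desc": "Paris is the capital of France."}'))
print(parse_reply("I'm sorry, I can't help with that."))
```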
ChatGPT seems decent, and appears to understand context to an extent. I'm sure more traditional methods could be used to give it "memory", such as asking it to store specific interactions from the past.
It seems like a big step forward in the foundation of how we interact with machines. What will come in its wake?
The issue seems to be users' lack of interest in (or knowledge of) data modeling. ChatGPT could be trained on the schema and so be able to map users' plain English to API calls. So minimally the front end could be GPT -> api-encoder -> api-call.
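Something like this, as a rough sketch (the schema, endpoint, and `fake_gpt` stub are all made up for illustration): the model, having seen the schema in its prompt, emits a JSON "intent", and a thin encoder turns that into the actual call.

```python
import json

# Hypothetical API schema the model would be shown in its prompt.
API_SCHEMA = {
    "get_clients": {"params": ["min_total", "year"]},
}

def fake_gpt(user_text: str) -> str:
    # Stand-in for the model; assume it maps plain English to an intent.
    return json.dumps({"endpoint": "get_clients",
                       "args": {"min_total": 500, "year": 2023}})

def encode_api_call(model_output: str) -> str:
    intent = json.loads(model_output)
    assert intent["endpoint"] in API_SCHEMA  # reject hallucinated endpoints
    query = "&".join(f"{k}={v}" for k, v in intent["args"].items())
    return f"/api/{intent['endpoint']}?{query}"

print(encode_api_call(fake_gpt("clients who bought more than $500 this year")))
```

The encoder layer matters: it's the place where a made-up endpoint or argument from the model gets caught before it hits the real API.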
Funny, I just talked about something similar to this, mentioning ChatGPT as a joke too. I think the output is a bit too verbose though; might as well be reading commit messages at this point. What happens if you tell it to sum it up in one or two sentences?
One can come up with all sorts of ideas like this, but building one will be a matter of slow iteration on prompt engineering in a mixture of natural language and data structures, and will be at the whim of changing APIs, including the backing ChatGPT model. Sounds messy, hard to manage, hard to test... or am I missing what the actual process will be for creating one of these?
So you have to feed the items into ChatGPT manually (or via some script), it looks like? In the future I guess ChatGPT with plugins could query the database on its own?
Does it only work for text data, or can it work for other types of data as well?
The use case is creating embeddings for paragraphs from documents. Then, when the user issues a query, you create an embedding for the query (you can also do more complicated stuff), find the most similar embeddings from the documents, and insert those paragraphs as context into ChatGPT so that it can rephrase the documents to answer the user's question.
I guess the usefulness would be from client software like plsql or any other, implement a natural language processing powered by chatgpt, since the client already has access to the structure then it can include it in the prompt, the user would only write something vague like i want all the clients who bought with more than 500$ this year.
On the contrary, ChatGPT appears conversational because it basically feeds the entire chat history into the model to generate each subsequent response.
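In pseudo-API terms, the "conversation" is just the full message list being resent on every turn (the `generate` stub here stands in for the actual model call):

```python
# The model's "memory" is just the accumulated message list.
history = [{"role": "system", "content": "You are a helpful assistant."}]

def chat_turn(user_text: str, generate) -> str:
    history.append({"role": "user", "content": user_text})
    reply = generate(history)          # the WHOLE history goes to the model
    history.append({"role": "assistant", "content": reply})
    return reply

# Stub generator that just reports how much context it received.
echo = lambda msgs: f"I can see {len(msgs)} messages of context."
print(chat_turn("Hi!", echo))           # sees 2 messages
print(chat_turn("Remember me?", echo))  # sees 4 messages
```

This is also why long conversations eventually degrade: the history has to fit in the model's context window.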
I bet information about his software is floating around elsewhere, and now ChatGPT will make up even more. I don't know how this gets fixed. Structured, queryable data, I guess.
Cool approach! The only thing I would change or allow customization for is the prompt you're injecting. Most users will probably want to change this:
"Output a json object or array generated with random content fitting this schema, based on the PORMPT section below." (sic)
My findings from one week of forcing ChatGPT to produce structured output:
- it prefers a certain kind of format (the one it uses by default when you ask without specifying JSON/Markdown), and if you deviate too far, it won't listen. E.g., I tried adding underscores to certain words, which it only did half the time.
- regex parsing the output is quite robust: even if it adds sections you don't want, like an intro or conclusion, the parser will just discard them
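The regex approach looks roughly like this (the `RESULT:` marker and the sample reply are invented for illustration; the point is that chatty intros and outros fall outside the match and get discarded):

```python
import re
import json

# A typically verbose model reply with filler around the payload.
raw = """Sure, here's the result you asked for:

RESULT:
{"title": "Example", "tags": ["a", "b"]}

Hope that helps! Let me know if you need anything else."""

# Grab only the block after the RESULT: marker; everything else is ignored.
match = re.search(r"RESULT:\s*(\{.*\})", raw, re.DOTALL)
data = json.loads(match.group(1))
print(data["title"])
```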