Google has used LMs in search for years (just not trendy LLMs), and search is famously optimized to the millisecond. Visa uses LMs to perform fraud detection every time someone makes a transaction, which is also quite latency sensitive. I'm guessing "informed folks" aren't so informed about the broader market.
OpenAI and Anthropic's APIs are obviously not latency-driven. Same with comparable LLM API resellers like Azure. Most people are likely not expecting tight latency SLOs there. That said, chat experiences (esp. voice ones) would probably be even more valuable if they could react in "human time" instead of with a few seconds' delay.
Integrating specialized hardware that can shave inference to fractions of a second seems like something that could be useful in a variety of latency-sensitive opportunities, especially if it allows larger language models to be used where they were traditionally too slow.
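For what it's worth, "human time" here is mostly a question of time-to-first-token. A minimal sketch for measuring it, assuming the official openai Python client (v1+); the model name and prompt are just placeholders:

```python
# Minimal sketch: measure time-to-first-token on a streaming chat call.
# Assumes the official openai Python client (v1+) and that OPENAI_API_KEY
# is set in the environment; model and prompt are placeholders.
import time
from openai import OpenAI

client = OpenAI()

start = time.perf_counter()
stream = client.chat.completions.create(
    model="gpt-4",  # placeholder model
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
    stream=True,
)

first_token = None
for chunk in stream:
    if first_token is None and chunk.choices and chunk.choices[0].delta.content:
        first_token = time.perf_counter()

print(f"time to first token: {first_token - start:.2f}s")
print(f"total generation:    {time.perf_counter() - start:.2f}s")
```

For a voice experience, it's the first number that has to land in "human time"; the rest can stream in behind it.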
The latency is still too slow to build LLM products other than chatbots, where people expect a delay. The rate limits are also a non-starter. And most app ideas involving LLMs differ only in how well the UI is done; that's the differentiator in AI apps right now.
An LLM is obviously useful for something like Siri, Alexa, or Google Assistant, or so you would think.
There doesn't seem to be a rush because it makes the implementation a lot more expensive, and those things are, I suspect, not profitable products (revenue sources) for their respective companies. They are a kind of enhancement to a layer of products and services; people take them for granted now, so you can't take them away.
A smarter Google Assistant would do nothing for Google's bottom line, and in fact it would cost more money to operate.
If it's not done right, it could ruin the experience. For instance, it cannot have worse latency on common queries than the old assistant.
There are a ton of places LLMs are already providing value today. Some of the biggest are turning unstructured data and user intent into structured data, helping with writing (not replacing it), and certain tasks in software development, where it is often much faster to use ChatGPT as a reference or guide than to search Google and sift through results of ever-decreasing quality.
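To make the first of those concrete, here's a minimal sketch of the unstructured-text-to-structured-data pattern, assuming the official openai Python client; the model name, field names, and prompt are illustrative placeholders, not a recommendation:

```python
# Minimal sketch: pull structured fields out of free-form text.
# Assumes the official openai Python client; model, keys, and prompt
# are placeholders.
import json
from openai import OpenAI

client = OpenAI()

def extract_fields(text: str) -> dict:
    """Ask the model to turn a free-form message into structured JSON."""
    resp = client.chat.completions.create(
        model="gpt-4",  # placeholder model
        temperature=0,  # keep extraction as deterministic as possible
        messages=[
            {"role": "system",
             "content": "Extract the user's name, email, and intent. "
                        "Reply with JSON only, keys: name, email, intent."},
            {"role": "user", "content": text},
        ],
    )
    return json.loads(resp.choices[0].message.content)

print(extract_fields("Hi, I'm Dana (dana@example.com). I'd like to cancel my plan."))
```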
I'm paying now and want to pay more, if only they would give me API access to the most advanced models. GPT-4 is much better and Google will have a comparable model soon (tm?)
A big part of the business case lies in generalizability. LLMs are a foundational technology that, once trained, has applications in practically all sectors and businesses, because text (and data in other modalities) is everywhere. Massive scalability potential.
Just to name a very short selection of applications that are very likely to be transformed by LLMs, or already are in the process:
* chatbots and everything conversational
* QA systems, customer support
* writing and grammar assistants
* code generation (Copilot etc.)
* translation
Each of these is a billion-dollar market, and you might be able to solve them all with the very same model. That is the bet.
At least from what I've seen and how I've seen others use LLMs, the general consensus seems to be that they're useful for the basics today but are more of a promising tech than something that's already landed.
If OpenAI's features were to freeze at what we have today, I would be surprised if the company stayed around without a major pivot.
Again, I'm in no way saying this is actually the case; it's only a hypothetical, since the tech is still very new and we don't know what we don't know.
The impossibility of cost + latency analysis for LLMs
The LLM application world is moving so fast that any cost + latency analysis is bound to go outdated quickly. Matt Ross, a senior manager of applied research at Scribd, told me that the estimated API cost for his use cases has gone down two orders of magnitude over the last 6 months. Latency has significantly decreased as well. Similarly, many teams have told me they feel like they have to do the feasibility estimation and buy (using paid APIs) vs. build (using open source models) decision every week.
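The arithmetic itself is trivial; it's the inputs that go stale. A back-of-the-envelope sketch (all prices are hypothetical placeholders, not a current rate card):

```python
# Back-of-the-envelope API cost model. All prices are hypothetical
# placeholders, since the real rate card changes faster than any analysis.
PROMPT_PRICE_PER_1K = 0.03      # $/1K prompt tokens (placeholder)
COMPLETION_PRICE_PER_1K = 0.06  # $/1K completion tokens (placeholder)

def cost_per_call(prompt_tokens: int, completion_tokens: int) -> float:
    return (prompt_tokens * PROMPT_PRICE_PER_1K
            + completion_tokens * COMPLETION_PRICE_PER_1K) / 1000

# e.g. a 1,500-token prompt producing a 500-token answer:
per_call = cost_per_call(1500, 500)
print(f"${per_call:.4f} per call")
print(f"${per_call * 1_000_000:,.0f} per million calls")
```

A two-orders-of-magnitude price drop turns that last line from "no" into "rounding error", which is exactly why the buy-vs-build answer keeps flipping.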
I'm (the author) actually in agreement with you. LLMs are going to be a big part of search in the future; I alluded to that in the post. I'm less convinced about search as a chat interface. But LLMs for query understanding, ranking, etc.? Of course.
The main issue is that it's very slow and expensive to browse the internet like this. The LLM will only perform well if you have it do chain-of-thought reasoning, and that comes with a latency hit because the generation is longer.
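To put a rough number on that hit: decode time grows roughly linearly with output tokens, so a chain-of-thought answer that is 8x longer takes roughly 8x longer to generate. A toy model, with hypothetical throughput and overhead numbers (measure your own stack):

```python
# Rough latency model: decode time grows ~linearly with output tokens,
# so chain-of-thought (longer generations) costs real wall-clock seconds.
# Both constants are hypothetical placeholders.
TOKENS_PER_SECOND = 30.0   # placeholder decode throughput
TIME_TO_FIRST_TOKEN = 0.5  # placeholder fixed overhead, seconds

def gen_latency(output_tokens: int) -> float:
    return TIME_TO_FIRST_TOKEN + output_tokens / TOKENS_PER_SECOND

print(f"terse answer (50 tokens):      {gen_latency(50):.1f}s")
print(f"chain-of-thought (400 tokens): {gen_latency(400):.1f}s")
```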
Well, for LLM services that do what they currently do, Google may have an advantage, but all this stuff is still only experimentation, with the goal hopefully being much more advanced things, like almost-AGI agents. If that happens, no one will care about the way we currently use LLMs anymore.
I don't really agree with the comparison to Web3. LLMs have real-world use cases as an interface. They might not replace most current software/systems/processes, but they can act as an interface to them, especially via voice. This alone is an excellent value proposition and could improve productivity.
Use cases I envision:
- Customer service automation (much better than the shit we have today).
- Tutoring services (won't replace tutors but as an aid).
- Conversational assistant.
- Marketing/SEO.
- Search enhancement.
- Office productivity assistance (debugging, idea generation, search, etc).
All of these are use cases that can generate money, unlike Web3.
It sometimes feels like I've taken crazy pills, watching what was effectively a tech demo go viral and become the use case now dictating billions of dollars of development and optimization.
It's a crappy use case, and much better ones are typically being overlooked outside of a few smart enterprise integrations.
To put it mildly - if someone wants to use LLMs to build a factual chatbot, they should probably just start mining crypto instead, as they'll waste less money on jumping on a trend. But if they think a bit about how LLMs can be used in nearly any other situation, they'll be miles ahead of the majority chasing this gold rush.
Yeah it's hard to predict where the market will go.
It's possible that those forces are enough, but LLM adoption at major institutions is slow. Everyone is interested in using ChatGPT, but there isn't a clear best use case yet, or an established paradigm for how it should be used.