I pay for ChatGPT Plus, but most people don't, and from what I've read OpenAI is losing money like crazy. We may wind up seeing this as the technical high point before the various competitors cut back on expensive hardware/energy usage for the free tier and the quality of the responses takes a dive.
This might sound a bit conspiratorial, so apologies for that, but back when the competition between Google Assistant/Siri/Cortana/Alexa was super hot, the responses from the various voice assistants were almost eerie in how well they could infer what you wanted/needed. Then as things cooled off, they got gradually dumber/worse every year since. They're legitimately bad today (Siri in particular, but even Google Assistant is much worse than it was back then). I suspect it is because hardware costs were too high, so they found cheaper models that could run on a potato/locally.
From what I've read, to get ChatGPT responses in the time it currently takes, they may need to be running something like two 3080 GPUs, 64 GB of RAM, and a high-end CPU, with the associated power draw. So to make the economics work (even with ad revenue, for example), either a technological breakthrough has to occur that makes running today's models much cheaper, OR they will cut back and make the responses objectively worse (they arguably already did this once with a 3.5 model change).
So far OpenAI's modus operandi has been "does it get better if we make it bigger?". But now we've reached "good enough" (arguably already with GPT-3.5, as evidenced by free ChatGPT running on GPT-3.5-turbo).
There has been promising work on making models smaller without losing performance (e.g. by training them longer), and on quantizing the weights (llama.cpp etc.) to make models cheaper to run. I believe we will see a lot more focus on this over the next couple of years, and given how comparatively little work has been done in this direction, I wouldn't be surprised by 10x or 100x efficiency gains.
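For what it's worth, the core idea is simple enough to sketch. This is only a toy illustration of symmetric int8 quantization with a single per-tensor scale; the real schemes in llama.cpp use per-block scales and 4-bit packing:

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    # One scale for the whole tensor; map floats onto the int8 range [-127, 127].
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

w = np.random.randn(512, 512).astype(np.float32)   # stand-in for one weight matrix
q, scale = quantize_int8(w)                         # ~4x smaller than float32 in memory
print("max abs reconstruction error:", np.abs(w - dequantize(q, scale)).max())
```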
Let's not forget good old-fashioned Moore's law, which I believe is even faster for GPUs. And I'm not sure whether some of these exotic hardware architectures will pan out either.
Efficiency, efficiency, efficiency will be the call.
How do we reduce model size without losing ability?
Can hardware be designed to run a particular set of weights efficiently?
I don’t want to make unsubstantiated guesses, but we should remember that when they came out with GPT-3.5-turbo it not only sped things up but also reduced the cost (to us) to something like a tenth of what it used to be, if I remember right. So they are (or at the very least “were”) working on, and succeeding at, reducing its cost to run.
On the plus side, I think at-home LLMs will continue to get a lot better. ChatGPT 3.5/4 both seem within reach for Llama 2, and given all the amazing fine-tunes that were available for Llama 1, I suspect we have a very functional future of at-home LLMs.
I say functional because I don't think we've seen the larger reasoning skills scale down well to smaller deployments, so that may be out of reach. But a future where we run something far better than Siri, entirely at home, hooked up to our home APIs through a functional spoken interface, seems within reach and awesome (to me, hah).
I look forward to hooking all my at-home cameras up to a voice recognition, image recognition, and LLM detection suite and just speak aloud and have the AI-suite "do stuff". It sounds really, really fun to me.
If it's cheap it'll be slow, but IMO that's okay. For home use I think these systems can be the glue that ties our DBs/ingestion/etc. together, rather than the big brain that knows everything.
"I look forward to hooking all my at-home cameras up to a voice recognition, image recognition, and LLM detection suite and just speak aloud and have the AI-suite "do stuff". It sounds really, really fun to me."
Alternatively, this sounds like the starting plot to a crossover sci-fi/horror movie...
The door refused to open. It said, “Five cents, please.”
He searched his pockets. No more coins; nothing. “I’ll pay you tomorrow,” he told the door. Again he tried the knob. Again it remained locked tight. “What I pay you,” he informed it, “is in the nature of a gratuity; I don’t have to pay you.”
“I think otherwise,” the door said. “Look in the purchase contract you signed when you bought this conapt.”
In his desk drawer he found the contract; since signing it he had found it necessary to refer to the document many times. Sure enough; payment to his door for opening and shutting constituted a mandatory fee. Not a tip.
“You discover I’m right,” the door said. It sounded smug.
From the drawer beside the sink Joe Chip got a stainless steel knife; with it he began systematically to unscrew the bolt assembly of his apt’s money-gulping door.
“I’ll sue you,” the door said as the first screw fell out.
Joe Chip said, “I’ve never been sued by a door. But I guess I can live through it.”
Except for the part where Llama 2 weights are still gated by Meta. There are certainly ways around this, but officially this hasn't changed, even for Llama 1, and I'm not certain Meta has any real incentive to change it. This arguably puts a bit of a damper on home solutions.
Falcon looks very promising, but we seem to be at the early stages of its release, and I haven't seen much validating the claims made about it.
I'd love to host an in-home alternative to Alexa, but I don't think I have the time to train a replacement on my own. Here's hoping this changes.
The only moat openai has is their financial resources to pay for training and their vast troves of data scraped before reddit/twitter/etc started trying to lock things down. I think at this point there are enough people knowledgeable and interested in open LM technology that, if openai begins to degrade in quality, we won’t lose this technology forever.
I’ve personally been obsessively reading papers about new models and transformer modifications, and having succeeded in implementing some of them, I think the experience has taught me that this tech is probably where the web was in the 90s: easy in concept, expensive and hard to scale due to a few missing easy pieces that'll come later.
> The only moat openai has is their financial resources to pay for training and their vast troves of data scraped before reddit/twitter/etc started trying to lock things down.
Do we know that there's zero benefit from the user generated data they're getting now? They know when someone clicks the button to regenerate, and in theory they can have GPT4 review all the responses and classify if they're good or not. I don't know how beneficial that data is, so I'm curious if it's been proven to be completely worthless or not. If it's valuable at all, and openai is using it to continually improve the performance of ChatGPT, then maybe it will be difficult for a competitor to ever get traffic and data to their alternative.
I don’t know about “zero benefit” (certainly their access to customer queries gives them a god's-eye view into how people are using their LM implementation, and to some degree it could probably be used to fine-tune their model), but I doubt it gives them a leg up that couldn’t be matched through the sheer scale of academics and volunteers who contribute to open solutions and research. Maybe we invent a method for directly fine-tuning based on user interactions, or we discover that there’s some shelf past which further fine-tuning only confuses the model, or we find some magical architectural modification or dataset that enables high-quality interactions. We’re still at the stage where basic modifications to the model break through seemingly insurmountable problems. ALiBi, for example, replaces positional embeddings with a simple per-head bias added to the attention scores, literally just a matrix addition, and it turns out to enable inference far past the original context limit.
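For anyone curious what "just a matrix addition" means in practice, here is a rough numpy sketch of the ALiBi bias (a toy illustration, not the paper's code; the slope formula follows the paper for power-of-two head counts, and the usual causal mask is applied separately):

```python
import numpy as np

def alibi_bias(n_heads: int, seq_len: int) -> np.ndarray:
    # Head-specific slopes: a geometric sequence, as in the ALiBi paper.
    slopes = np.array([2.0 ** (-8.0 * (h + 1) / n_heads) for h in range(n_heads)])
    pos = np.arange(seq_len)
    distance = pos[:, None] - pos[None, :]      # (query, key): how far back each key position is
    distance = np.maximum(distance, 0)          # only penalize past positions (future ones get masked anyway)
    return -slopes[:, None, None] * distance    # shape (n_heads, seq_len, seq_len)

# Raw q.k attention logits for 8 heads over a 16-token sequence (random stand-ins here).
scores = np.random.randn(8, 16, 16)
scores_with_alibi = scores + alibi_bias(8, 16)  # the "matrix addition"; softmax follows as usual
```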
I’m ultimately optimistic, their data is valuable but I don’t think it’s insurmountable.
If it was just compute that caused the various assistants to get bad, then why haven’t they gotten good again?
Google Assistant came out in 2016! That is 7 years ago! Since then, GPU compute in FLOPS has roughly 5xed… for single precision. If you include half-floats (not really a thing back then), it is 75x the single-precision FLOPS.
I think the reason that assistants have “gotten bad” is a combination of edge-case controls (avoiding politically problematic results), increased capabilities meaning more opportunities for failure, and the fact that their failures are amplified while their successes are ignored (in other words, they aren’t necessarily worse, we just expect more out of them).
If it was just a matter of flops then AI would already be driving cars and making door to door food deliveries. The problem is neither the energy expenditure nor the compute capacity. No one knows what the proper architecture should be for general purpose intelligence assistants so everyone has to fine-tune the existing models on their specialized use cases and even then it's a hit or miss depending on how much training data you have available.
No one currently has any idea how to deploy auto-adaptive ML software that continuously adapts to new data, so the industry has settled on deploying snapshots and then hoping the data set that was used for the snapshot is going to be good enough for most use cases. It seems to work well enough for now, and since Facebook has decided to open source most of their work, someone in the open source community might figure out how to continuously update these models without deterioration in output quality.
The difficulty of improving AI capability towards AGI isn't really related to a purported worsening of an already-implemented system for financial reasons.
The OP said Siri et al. got worse because their creators optimized them to reduce compute costs. I’m saying that if that were true, then they should be at least as good as they were because you can get tons more compute for cheaper these days. So it isn’t compute that makes the various assistants “worse” than when they were introduced.
I made no statements about their quality in general, just their relative quality compared to when they were released.
That assumes that holding perceived performance steady while letting compute consumption get cheaper saves enough money, quick enough.
If voice assistants were losing $10 billion a year (made up number) in 2013 and bringing in zero revenue, then it is easy to see why a decade later they would have less compute allocation now than they did back then, even in absolute terms.
If the revenue is near zero, then the spend on compute will be near zero too, after everyone figured out that it wasn't going to win some AI battle for all the consumers. If they are not willing to set money on fire for no measurable benefits anymore, then compute will tend to zero over time.
One thing about LLMs is that, from what I know, batching is very effective with them, because to generate a token you basically need to stream the entire network through memory for just that one sample. If you batch, you still need to stream the entire network, but that doesn't get any more expensive; you use each piece of data more often on hardware that used to just sit idle.
So I assume that at an OpenAI-scale, they're more than able to batch up requests to their models, which gives a tiny latency increase (assuming they're getting many requests per second), but massively improves compute utilization.
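A toy numpy illustration of that point, with made-up shapes (one weight matrix standing in for a whole model): the expensive part is streaming W from memory, and a batched matmul pays that cost roughly once for the whole batch instead of once per request.

```python
import numpy as np

d = 1024
W = np.random.randn(d, d).astype(np.float32)   # weights that must be streamed from memory each step

def one_request(x: np.ndarray) -> np.ndarray:
    return x @ W                               # a single token's worth of work still reads all of W

def batched(X: np.ndarray) -> np.ndarray:
    return X @ W                               # W is read roughly once and reused for every row

xs = np.random.randn(8, d).astype(np.float32)          # 8 concurrent requests
out_separate = np.stack([one_request(x) for x in xs])  # 8 passes over W
out_batched = batched(xs)                              # ~1 pass over W, same math
assert np.allclose(out_separate, out_batched, atol=1e-3)
```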
We need a strong localized version at least for software development that is free as soon as possible. It is such a competitive advantage there's no way it will stick around forever for as cheap as it is.
C++, not Rust. And most ML libraries being called from Python or whatever are themselves C++, but of course there's always room for improvement, especially if you reimplement the entire codebase in C++ and specialize the inference program for your architecture. Specialization often opens up the possibility of more efficient code: you are not running a generic neural net, you know the architecture beforehand, so you can design its memory layout etc. for efficiency, yes.
> the responses to the various voice assistance was almost eerie in that it could infer what you wanted/needed. Then as things cooled off, they got gradually dumber/worse every year since then. They're legitimately bad today (Siri in particular
It's a compelling argument but I find that difficult to believe given the timespan. Siri launched in 2011.
The FLOP/s per dollar rate for GPUs used in ML doubled every 2 years across this period [0]. Meaning that over the past 12 years, you're talking about 64x the GPU performance per dollar spent. Combine that with 36% general inflation, some increased spending on mobile computing, and a decade of time spent making the algorithms behind a big consumer product more efficient, and you get wildly more favourable economics, as in, two orders of magnitude more favourable than when Siri launched in 2011.
I'm not surprised this could well be happening for ChatGPT (would be cool to see the first public audit standards of AI products emerge, to test this hypothesis). But I don't think it holds true for the 12y period for Siri and co.
The hardware being cheaper only matters if they truly want to improve the product, which it's not clear that they do.
Lowering the number of FLOPs needed will always reduce the cost, as the hardware gets cheaper the cost is just reduced faster.
The question is, would improving Siri sell more IPhones? If not, then why bother?
I don’t think Siri et al got worse, we just had time for the hype of the excellent voice to text and back again to wear off, and realized that they were too limited at figuring out actions to be very useful.
I do remember meeting the Siri team before Apple bought them (I was writing for a tech blog at the time) and having them show me a bunch of really cool queries. I was pretty impressed. Shortly after, Apple bought them, and significantly after that, actually launched it as a product. I was not at all impressed by what launched.
Keeping in mind that the pre-sale demo I got was in pretty controlled conditions (i.e. I could have been duped):
After Apple's launch of Siri, someone else who had been working with Siri told me that he was really unhappy with what Apple had done; according to him, they gutted a bunch of capabilities, mainly because A) they couldn't be launched to all customers or B) just didn't fit with Apple's product vision. To give a specific example, the pre-Apple Siri could make reservations at a hotel in SF / NYC / Austin for you but not in Dubuque or Tokyo, so that had to go.
Of course, that original Siri was very different from ChatGPT: it was based on structured queries and the "cool stuff" it could do was basically a ton of API linkages to outside services. At that point, there'd already been a wave of "natural language" companies like Powerset and Semantic Web that had promised the world and then flamed out, so everyone was aware that unstructured queries were basically impossible. Funny how fast things change.
I'm actually worried about the inevitable enshittification of OpenAI.
Right now, GPT-4 is great and I use it every day. I rely less and less on Google and save a ton of time. Google has gone downhill hard for a few years, thanks to all these silly blog engines like Medium, Substack and the like.
I'm really worried and would like to 'freeze' GPT-4 as is, at least with the data up to Sep 2021, and for them to not change or tweak any of it further. I wish we could have an offline solution just as good, or a guarantee of GPT-4 LTS. I don't mind paying as long as it's useful.
I'm convinced that inevitably, every successful engineering-driven business devolves into a marketing business; and once market research and sales drive every business decision, the product is boned.
This is a good point and I'm surprised it hasn't had more attention. We've seen history repeat itself enough times that it's inevitable. Probably the best you can do is download a few of the full precision 70B or bigger models (LLaMA etc) to at least have some state of the art weights around you could use yourself if you wanted if/when openAI goes to shit.
The nice thing about language models is that at least they're relatively compact, compared to Google search, which you'd never be able to host locally.
Though thinking about it, in a way LLMs are local copies of the "good" internet. The big ones mostly seem to know what information there is to know and have none of the crap and link/content farms that is now most of the internet.
>... compared to google search which you'd never be able to host locally
That's something I haven't thought about for a long time. How big is the Google index, would you say? Petabytes or exabytes, or am I vastly overestimating?
Probably petabytes for the text index and exabytes for the whole index (including images and video). The entire textual internet is hundreds of terabytes uncompressed, and indexes tend to balloon storage size to make serving tailored queries faster and cheaper. That calculus shifts a bit with images/video, where vector/ML techniques dominate; you get a reduction in quality, but quality is good enough (mostly, probably, we think), and the index ends up within a small constant multiple, one way or the other, of the underlying dataset size (as opposed to text, where 100x-1000x one way or another isn't uncommon).
If the index is less than petabytes then it's impossible to serve many long tail queries even just based on text content. IME those queries _have_ gotten progressively worse results the last few years, so maybe the techniques have changed, but when the money printer is "search" and a petabyte of data is a fraction of a full-time engineer in cost to store, I doubt they would have cut costs that aggressively.
Not in my experience. As an ML hobbyist, I don't have particularly strong hardware: around 40 GB of system RAM and a very small GPU with 8 GB of VRAM. I can run all sizes of models, up to and including 70B (albeit slowly). 13B is definitely the most usable on this particular hardware, given the trade-off between model capabilities and generation speed.
I'm more interested in open source/local LLMs' ability to consume private data as an internal knowledge base. For example, feed in your entire source repo(s) and then ask it questions about where something is or how it works.
I am constantly searching for this. Being able to point a locally hosted LLM at a locally hosted code repo, and then start improving it from there, would be awesome.
Depending on how serious you are about this (and how well you can slap together a few different python packages), it is very doable today.
Get one of the better llama versions fine-tuned on code (e.g. WizardCoder), take your entire code base and create embeddings from it, put those into a vector database.
Now, every time you ask your LLM a question about your code base, you first turn that prompt into an embedding and perform a search on your vector database. The results of that search are appended as context to the actual prompt before passing it to the LLM itself.
There's tons of packages that help with all of that, Langchain and Faiss are probably the most popular right now.
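A rough sketch of that flow using sentence-transformers and Faiss directly (the embedding model, chunk size, repo path, and question below are placeholders, and a real setup would chunk code more carefully):

```python
from pathlib import Path
import faiss
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")   # any local embedding model works here

# 1. Chunk the code base and embed each chunk.
chunks = []
for path in Path("my_repo").rglob("*.py"):
    text = path.read_text(errors="ignore")
    chunks += [f"{path}\n{text[i:i + 1000]}" for i in range(0, len(text), 1000)]
vectors = embedder.encode(chunks, convert_to_numpy=True)

# 2. Put the embeddings into a vector index.
index = faiss.IndexFlatL2(vectors.shape[1])
index.add(vectors)

# 3. At question time: embed the question, retrieve the nearest chunks,
#    and prepend them as context before calling the local LLM.
question = "Where is the retry logic for failed uploads?"
q_vec = embedder.encode([question], convert_to_numpy=True)
_, ids = index.search(q_vec, 5)
context = "\n\n".join(chunks[i] for i in ids[0])
prompt = f"Answer using this code:\n\n{context}\n\nQuestion: {question}"
# `prompt` then goes to the code-tuned model (WizardCoder etc.).
```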
Interesting, I'd love to do this too. But it sounds like there aren't any full-featured, open-source packages/projects that do this all together? I'd love to hack the parts together, but I don't have the time/energy these days to do it.
Thanks for the helpful keywords though, it helps point me in the right direction.
Retrieval Augmented Generation (RAG) is a good stopgap measure. It avoids the need for a lot of work on customising the LLM by supplying the relevant data via vector DB queries and letting LLM remain as a "calculator for words" that will just rephrase/extract the answer from the DB query result.
He is determined to see no value. Today I couldn't find the mouse acceleration toggle in macOS, and Google turned up a bunch of mediocre results I would have to sift through. ChatGPT just gave me the terminal command I was looking for immediately.
Or, I literally asked a factual question of Google (several times, although similar questions) and didn't get a Bard-generated response. Strange to read "Google searches aren't doing this for me" as "they are doing it for me and I don't like it".
I'm almost hoping for some enshittification. It's so depressing seeing how many skills and jobs have been instantly devalued. Take a look at the artist communities and it's total depression that people's life passion and source of income has just been deleted over the course of a few months.
We aren't talking about burger flipping and checkout scanners. This is a creative pursuit which takes years of grinding to become proficient, just wiped out.
Right now the current models and tech aren't quite enough to replace real humans, but the rate of progression feels like they will soon. If it all just halted here for a while, that would be quite comforting for a lot of people.
If things slow down a bit then speed up again in a few years once economics improve or whatever, does that really make any difference in the long run? What's the long term solution here?
We're not even talking about decades here, probably.
Part of me hopes we have hit a kind of wall where we get what we have now, maybe with some more polished UI, but it doesn't continue the exponential improvement path it's on. If it all slowed down, people would have plenty of time to readjust, rather than having their careers instantly deleted.
Human artists can quite easily beat the current tech, but there is no real indication that the current stuff is the best we can juice out of the current methods. Which is the depressing part. Why even bother to learn anything when AI will obsolete your effort before you finish.
Not even deleted! Actively stolen. No one has yet grappled with the reality that especially in the realm of generative art, these models do not work without a basis of absolutely stressful amounts of stolen artwork to train them. They simply do. Not. Exist. Without that baseline unethical contribution that tons of artists made without their knowledge and certainly without their consent. None of the major figures behind any of this tech will acknowledge this. They make some hand-wavy non-committal gestures towards fair use and "how artists learn" and act like that's an answer, completely discarding tons of points not worth revisiting here because it's frankly been beaten to death already and if you're still fence sitting or pro generative art at this point, it's because you want/need to be to prevent cognitive dissonance in one way or another.
And to be clear: this stuff is SO COOL. The idea of entering prompts and getting at least vaguely unique and interesting output is incredibly cool. The notion of machines that learn, even in the restrictive way the current models do, is so fascinating. And of course, like any cool and interesting new tech, it was immediately adopted/hijacked by the worst people imaginable, determined to fill Amazon self-publishing with generated hack kids' books filled with generated text and generated art, at the lowest cost point possible, sold on the cheap, with the idea that they could make them convincing enough to trick some poor overworked parents into buying them and pocketing the cash. Just grift. Every new technology we get now, few as the true new innovations are, is always, always, immediately co-opted by the worst actors in the space and turned to utter garbage.
Oh please. First of all, how do you steal an idea? We’re talking about pictures. Supposing that you buy into the theory that you can, copyright was created to further the arts and sciences; it’s in the US constitution. The point isn’t to control your work — it’s to live in a richer society. And it’s not even clear that training a model counts as infringement. Being able to recite a quote from a book is different than reproducing the entire book. Artists won’t acknowledge that the same applies to their art.
If you believe that training models on art is stealing, then I’m a master ninja, since I’m the creator of books3. And even Stephen King today came out and said that he’s fine with it:
> Would I forbid the teaching (if that is the word) of my stories to computers? Not even if I could. I might as well be King Canute, forbidding the tide to come in. Or a Luddite trying to stop industrial progress by hammering a steam loom to pieces.
I take a dim view of people trying to frame researchers as criminals. We’re not. We want to further science. That’s all.
You call me a grifter, but I’ve made roughly a hundred bucks from books3, and that’s because someone found my patron buried under a pile of links and subscribed to it many months ago. Most of my researcher colleagues seem to have similar distaste for wanting to make money. The work is the goal.
It's possible with Google Docs too. The wackiest thing the other day was that I was using the screenshot tool in macOS and it made a screenshot of copyrighted content. I was flabbergasted. The tool just did it, with 4 parameters.
So let’s say in 10 years they have the processing power to generate full movies on demand.
OpenAI trains its model on all movies from IMDB.
Then I ask ChatGPT to produce a movie "like" Avengers but with the faces of all my friends.
And I just pay the $20 fee (or $200 by then) to OpenAI and watch my movie.
Then I can ask to tweak some parts of the story and update the movie in real time. And this for any movie.
Given the $220 million budget of the one movie I mentioned, OpenAI will never give any money back to them. Or to any movie producer.
Does that seem OK to you? Today it’s books and pictures, but later it’s going to be anything.
I mean, producing anything requires time, effort, and money. They steal the end result, produce any variation possible, and make money out of it.
Your point would only be valid if OpenAI were 100% free. It’s not.
And then obviously no-one will spend $220 million on a movie anymore because, hey, we can just use generative AI. So I guess subsequent AIs will be based on the outputs of a previous AI? Or will all movies from a certain point onwards be wholly based on a corpus of existing movies used for training? Maybe AI companies will start shooting small film segments in meatspace just for the purpose of training or providing some base input to their models?
Isn't that what they are already doing but with human writers?
Screenwriting is mostly just a formula.
Aside from being cheaper, I don't see any difference in terms of quality.
> Does that seem OK to you? Today it’s books and pictures, but later it’s going to be anything.
When they're that good, I might finally finish the novel I started writing in… ugh, 2016.
A quote comes to mind, though:
"I say 'your' civilization, because as soon as we started thinking for you it really became our civilization, which is of course what this is all about."
> I take a dim view of people trying to frame researchers as criminals. We’re not. We want to further science. That’s all
Criminality and pursuing science not only aren't mutually exclusive, they aren't even related. Would you try that same argument if you were stealing physical goods for your experiments?
We have tons of examples of unethical and illegal things being done in the name of science. IRBs didn't come into existence because being a scientist who wants to further science automatically makes you moral or justified. I trust I don't need to list the various experiments.
Meanwhile, to return to what King said: the fact that something doesn't worry someone worth more than half a billion dollars, with a five-decade career at the top of their game and iconic name recognition, is not an indication that the thing is irrelevant, especially to other people in the same line of work.
You can say you don't like copyright, but that's not what you are focusing on.
> how do you steal an idea? We’re talking about pictures
Ideas and artwork are qualitatively different. Artwork, it’s right there in the name. It takes work to create pictures/art. It’s more serious than stealing just ideas, which I agree are economically worthless until executed.
Fun fact: the etymology goes to "ars", as does "artisan" and "artificial".
Similar in German: Künstler, Kunst, Künstliche Intelligenz are all rooted in a single word that means skill/ability/knowledge/recognition.
I suspect that those in 1855 who said "photography can never assume a higher [artistic] ranking than engraving" were basically right: me snapping a photo of a sunset, especially now on a device that adjusts exposure etc. automatically, doesn't feel like it should deserve the same protection as a carefully composed portrait with artfully chosen wardrobe and makeup.
The etymology, while interesting, is trivial hair-splitting in the context of the real issue. I’d be interested to see any court case that has been argued successfully on such grounds.
As for the opinion people from 1855 might have had about AI and artwork, I take them about as seriously as I do their opinions on germ theory, space exploration, racial politics, warfare, psychology, biology, nuclear physics, and many other subjects we have learned more about in the intervening 168 years.
OK, your point is well received about the etymological argument, but my real point, right after "artwork", is that yes, even photographs do require work to create. Would you say that Ansel Adams' photographs were effortless to produce? I wouldn't.
One tangentially related trial would be Pharrell Williams v. Bridgeport Music [0], where Marvin Gaye's family sued Pharrell Williams with the following claim:
> Gaye's family argued that the songs were not merely stylistically similar; instead, they claim that "many of the main vocal and instrumental themes of "Blurred Lines" are rooted in "Got to Give It Up"; namely, the signature phrase, vocal hook, backup vocal hook, their variations, and the keyboard and bass lines" and "the substantial similarities are the result of many of the same deliberate creative choices made by their respective composers."
And they won. I don't agree with the outcome, but I do think it's an interesting benchmark. Obviously, this trial would never have been a thing if "Blurred Lines" hadn't been a big hit. Something similar could apply to a major brand using text-to-image generated material that was strikingly similar to a prominent photographer's or artist's work, if they sued.
I wonder what's going to be the first big case having to deal with this.
> Oh please. First of all, how do you steal an idea?
You toss this out as though "intellectual property" is a concept you don't understand as (probably) a software developer of some sort, and beyond that, an entire division of law to which most companies devote entire floors, if not entire buildings, of lawyers.
> We’re talking about pictures.
I'm talking about all art in all media. There is generative music, you know. And yes the big ones for now are the text, which has been trained on millions of blog posts, reddit posts, written works creative and otherwise, all, and I will keep saying it, without the permission of the authors by and large and similarly things like MidJourney which in turn were trained on massive imagesets with different specializations, but whose sources included: photographers, illustrators, painters, furries, and likely millions of people drawing in the anime style, *also without their permission.*
> If you believe that training models on art is stealing, then I’m a master ninja, since I’m the creator of books3. And even Stephen King today came out and said that he’s fine with it
I mean, you tell me. You created a thing you stand to profit from (even if indirectly via name recognition) via the use of IP you didn't have the rights to and got permission from ONE affected individual, after the fact. If you want to be ethically in the clear, why not get permission from everyone else? Then you're done. I suspect because a) it will be a substantial amount of work and b) that you know very well most people are not going to be comfortable with their creative output being used to train a machine who aren't already rich, which means a much, much smaller dataset to use.
And if it makes you feel better you can frame us all as zealous luddites who want to smash your machine, but again, for the record, I do find this interesting. The only part I take issue with is rent-seekers trying to monetize access to these models for monied entities, and what said monied entities are going to do with them: which is largely generate spam to sell at whatever price they can manage.
> If he’s not worried, why are you?
I'm not a creative, so I'm not worried for me. I just don't like what's about to happen to creatives. I don't like technology being wielded by people who don't have skin in the game, who don't respect the creative process. I frankly find it incredibly off-putting how ready and frankly gleeful everyone in my field is to put millions of people out to pasture with no plan for how they're to make a living, especially given how hard it is to do so as an artist already.
I just cannot conceive of someone who's like "we should automate creative processes so humans have more time for spreadsheets and time cards" and just... UGH. What in god's name sort of world are we even trying to build anymore!?
> I take a dim view of people trying to frame researchers as criminals. We’re not. We want to further science. That’s all.
I'm sure Oppenheimer did too.
> You call me a grifter,
To be clear, I called users of your model grifters, not you personally.
> but I’ve made roughly a hundred bucks from books3, and that’s because someone found my patron buried under a pile of links and subscribed to it many months ago. Most of my researcher colleagues seem to have similar distaste for wanting to make money. The work is the goal.
That doesn't change the fact that what you guys are building will be used to inflict catastrophic societal harm by people who are not looking to "do the work" as you'd put it. They're looking to maximize profits because it's their contractual obligation. If they can make one designer do the work of a hundred by feeding them a drip of AI generated garbage to tweak to something usable, they will in a heartbeat *AND YOU KNOW THAT.*
It is immoral to create a device, from the labors of someone, which is capable of replacing their labor, without compensating them.
Unless they agreed to provide their work for that purpose for free.
Postulate #1: Image generation models would not exist without large amounts of training data from current artists.
Postulate #2: Every major AI company either trained directly on public web scraped datasets or is murky about what they train on.
Theft at scale does not somehow make it not theft. Stealing 1/100th of a penny 10 billion times is still stealing.
And when you repackage the results of that theft in a profit generating machine, and then label it not theft because "it's a whole new thing," you start to sound like a CDO apologist.
And look, I get it -- it's about money.
It's always about money.
You may not be making any off your work, but that's immaterial because lots of huge companies are making obscene amounts of money from doing this (or expect to be in the future).
At the same time, it is an excellent tool. Art without human time! It will eliminate a lot of artist jobs, but everyone as a whole will be better off (because we're swapping human labor for electricity).
However, the currently vogue "artists don't deserve anything" smacks more of "we don't want to share profits during the transition period" than a cohesive moral argument.
We can have an AI future, but we should be honest about what enabled that. And we should compensate those people during the transition.
Hell, AI tax. Paid to everyone who created a work in a scraped dataset. Sunset in 30 years. Done.
I disagree with you, simply for the fact that artists have been learning from one another for thousands of years.
We can see a clear timeline of art and its progression throughout human history, and it’s often clear how a later work took inspiration from an earlier period.
Art school teaches techniques and methods pioneered by earlier artists, for the express purpose of their students to know how to incorporate them into their own original work.
Yet, no one is arguing that Van Gogh’s descendants should be paid a small royalty any time a variation of one of his paintings is produced, or even just when a painting in the style of one of his is produced.
Were all visual artwork to disappear from the world and collective human memory today, then the first new pieces produced by artists would look dramatically different - and likely much worse - than they do today.
What AI is doing is no different. Perhaps faster and on a larger scale than how humans learn from one another, but principally it’s the same.
> Perhaps faster and on a larger scale than how humans learn from one another, but principally it’s the same.
I like how you just tucked this in at the end there without any introspection on what kind of a paradigm shift that is. If you wanted a "Van Gogh style painting," you'd contract with a painter who specialized in that, and no, his descendants don't get royalties from that (which is an interesting discussion to have; I'm not sure they should, but I haven't thought much about it, anyway), but you are paying a human creative to exercise a vision you have, or, from another perspective, a person goes into creating these styles of paintings to sell as a business. Again, the idea of royalties isn't unreasonable here, but I digress.
Now, with these generative art algorithms, you don't need a person to spend time turning your/their idea into art: you say "I want a picture of a cat in Van Gogh's style" and the machine will make you dozens, HUNDREDS if you want, basically as many as you can stomach before you tell it to stop, and it will do it (mostly) perfectly, at least close enough you can probably find what you're looking for pretty quickly.
Like, if you can't tell why that's a PROBLEM for working artists, I'm sorry but that's clearly motivated reasoning on your part.
I can tell why it’s a problem for working artists. I never suggested otherwise. What I disagreed with was the premise that it’s immoral or inherently wrong. A problem posing a difficulty to a certain group of difficulty doesn’t have any bearing on its morality.
I'm guessing you mean to say "A problem posing difficulty to a certain group of people doesn't have any bearing on it's morality." and that's just... so very gross in terms of ethical statements.
Like just, hard disagree. Undercutting the value by entire factors of a whole profession's labor is incredibly immoral, especially when you couldn't have done it without the help of their previous works. Like... a very non-exhaustive list of problems I would say meet that definition are:
- Generational/racial wealth inequality
- Police brutality
- The victims of the war on drugs
- Exploitation of overseas labor
I don't think we really have anything else to discuss.
Alike in method is not like in output, and it's output that matters.
A human takes ~4-20 years to become a good artist. They can then produce works at a single human rate.
A model takes ~30 days to become a good artist. It can then produce works at an effectively infinite rate, only bounded by how many GPUs and much electricity can be acquired.
These are very different economic constraints and therefore require different solutions.
> These are very different economic constraints and therefore require different solutions.
This is often listed as the reason why it’s OK for a human to learn from prior art, but not for an LLM. The question is: why? If the act of learning is stealing, then it is still stealing, no matter how small the scale, and every single human on earth has committed it.
The LLM vendor may benefit more than a mere mortal pupil because of the scale and reach. At the same time the LLM may make the prior art more visible and popular and may benefit the original creator more, even if only indirectly.
Also if content creators are entitled to some financial reward by LLM vendors, it is only appropriate that the creators should pay back to those that they learn from, and so on. I fail to see how such a scheme can be set up.
Law serves people, either directly (outlawing murder) or indirectly (providing for roads and bridges). And it does so well (libraries) or poorly (modern copyright law).
But fundamentally, law benefits people.
Most modern economic perversions are a consequence of taking laws which benefit people (e.g. free speech) and overzealously applying them to non-people entities (e.g. corporations).
So "why [is it] ok for [a] human to learn from a prior art, but not for a LLM"?
Because a human has fundamental output limitations (parallel capacity, time, lifespan) and a machine does not.
Existing laws aren't the way they are because they encode universal truths -- they're instead the consensus reached between multiple competing interests and intrinsically rooted in the possible bounds of current reality.
"This is a fair copyright system" isn't constant with respect to varying supply and demand. It's linked directly to bounds on those quantities.
E.g. music distribution rights, when home network bandwidth suddenly increased enough to transfer large quantities of music files.
Or, to put it another shorter way, the current system and source-blind model output fucks over artists.
> Because a human has fundamental output limitations (parallel capacity, time, lifespan) and a machine does not.
Industrialization as we know it would never have happened if we had artificially limited progress just so that people could keep their jobs. I guess you could make the same kind of argument for the copyists when printing became widespread, for horses before the automobile, or for telephone operators before switchboards were automated. Guess what they have become now. Art made by humans can still exist, although its output will be marginal compared to AI-generated art.
LLMs are not humans but are used by humans. In the end the beneficiary is still a human.
I'm making an argument that we need new laws, different than the current ones, which are predicated on current supply limitations and scarcity.
And that those new laws should redirect some profits from models to those whose work they were trained on during the temporary dislocation period.
And separately... that lobotomizing our human artistic talent pool is going to have the same effect that replacing our human journalism talent pool did. But that's a different topic.
For the AI/robot tax, the pessimistic view is that the legal state of the world is such that such a tax can and will be evaded. Now not only do the LLMs put humans out of a job because an LLM or an SD model mimics their work, but the financial gains have been hidden away in tax havens through tax evasion schemes designed by AIs. And even if through some counter-AIs we manage to funnel the financial gains back to the people, what is then the incentive for capital owners to invest and keep investing in cutting-edge AI, if the profits are now too meagre to justify the investment?
>> I disagree with you, simply for the fact that artists have been learning from one another for thousands of years.
They learn from each other and then give back to each other, and to everyone else, by creating new works of art and inventing new styles, new techniques, new art forms.
What new styles, techniques or art-forms has Stable Diffusion created? How does generative AI contribute to the development and evolution of art? Can you explain?
> Most of my researcher colleagues seem to have similar distaste for wanting to make money. The work is the goal.
You sure give the profit motive a lot of due for someone who claims to be above it.
Hard work and perseverance are human instincts that are undermined by antisocial institutions like copyright, which has no precedent absent the barbaric mode of relations we call capitalism.
> First of all, how do you steal an idea? We’re talking about pictures.
We're talking about the fruit of someone else's skilled labor and the long labor that went into developing the skills.
If you're so certain that's not valuable, well, then it shouldn't be any hardship to forgo that entirely and simply figure out some other way to get trained models.
If it is valuable, then maybe it's worth treating it as if it is, both in terms of compensation and determination. Not only for moral accounting but also because of economic feedback: if it's not treated like it's valuable, then it will become less frequent that people can invest time and other resources into doing it.
"I'll keep saying it every time this comes up. I LOVE being told by techbros that a human painstaking studying one thing at a time, and not memorizing verbatin but rather taking away the core concept, is exactly the same type of "learning" that a model does when it takes in millions of things at once and can spit out copyrighted code verbatim."
Reminds me a bit of how everybody was cool with pirating software and media back in the day. To help the bad conscience there was a handy narrative of how the record companies were making a killing off of everybody so it was cool not to pay.
But in reality it was just technically feasible. That's what enabled the behavior, not a few million Robin hoods coming together to do the right thing.
Similarly I feel creators need to make it impossible to have their work stolen for AI if possible. It will be tough though. They have half the world against them it seems.
I wouldn't equate it with a person pirating a DVD to watch in his own home. It's more like someone pirating a DVD, burning a bunch of copies of it and selling those. Big difference.
I believe if you look further and deeper you'll reach the conclusion the real issue is how awful copyright laws are and more importantly, how absurd is the current economic system. Glorified markov chains are not the culprit here.
So the current movement where artists "shame" random joes for using a cool technology has only one possible outcome which is to push said average joes into being politically active in defense of AI. No one seems to want to properly organize and file a class-action, just twitter bickering.
Exactly. Preventing the use of some information for training data is like saying we can't go to the gallery or library to see what prior art looks like. Creativity is just having a higher temperature setting and selecting items that trigger a positive response. We are mimetic.
The art world itself has already asked many of these questions 60 years ago, when Warhol made Campbell's soup can art, and then people used that template to turn other brands into art.
A human doesn't exist in servers, work for free (well, minus utilities) 24 hours per day and work as long as you want on a project you've given it with zero input. If you can't appreciate the difference between an AI model and a person I think that says more about you than AI.
None of those differences really says anything about morality of having an AI do things, unless you want to argue that the AI suffer from their (mis)treatment.
People sometimes state that The Simpsons is now drawn in sweatshops somewhere in Asia. Assuming that's true, or at least that it's happened at least once for at least some cartoon, does it make any difference at all to the question of where the people directing them get their ideas from or how those workers learned to do their thing?
> None of those differences really says anything about morality of having an AI do things, unless you want to argue that the AI suffer from their (mis)treatment.
I mean, if it was actually AI and not machine learning, that would certainly be a question wouldn't it? That being said, I'm not saying the machine is suffering. I'm saying the people it's replacing cannot possibly hold their own. There is simply no way in hell a person can compete with generative AI, assuming of course you don't need perfection just "good enough." And given how massive components of the modern economy get by on far less than "good enough" I'd say that's a solid reason to be concerned.
> People sometimes state that The Simpsons is now drawn in sweatshops somewhere in Asia. Assuming that's true, or at least that it's happened at least once for at least some cartoon, does it make any difference at all to the question of where the people directing them get their ideas from or how those workers learned to do their thing?
Collaborative creative work among a team of individuals regardless of their location is not the same thing as generative art trained on a dataset it's designed to mimic, come on now. You're grasping at some pretty flimsy straws here.
> I'm saying the people it's replacing cannot possibly hold their own.
Sure. And? This is the exact same problem with offshoring to sweatshops, and also with all automation from the pottery wheel onwards; and in the industrial revolution this automation happened in a way to cause the invention of Communism.
I'm certainly curious what the modern equivalent to Karl Marx publishes, and how much it will differ from The Communist Manifesto.
> Collaborative creative work among a team of individuals regardless of their location is not the same thing as generative art trained on a dataset it's designed to mimic, come on now.
The argument you're making doesn't appear to care about the differences — it should bite on both, or neither.
> Sure. And? This is the exact same problem with offshoring to sweatshops, and also with all automation from the pottery wheel onwards; and in the industrial revolution this automation happened in a way to cause the invention of Communism.
Yes, but crucially: the industrial revolution, and automation in general, has historically been targeted at things of need. One can argue that the demand for goods, be they food, cellphones, televisions, cars, what have you, makes automation extremely helpful: yes, there was a period where laborers were replaced and priced out of the market, but many eventually returned. The ability to produce 65" televisions at scale and sell them cheaply enough for people to buy them at scale has led to the prices of televisions cratering, and in the context of products consumed by large portions of the market, this is a desirable and good thing (with caveats).
This breaks down with creative output though. No matter how cheap they are, one person can only consume so many, for example, movies and television shows. Even if generative movies were a thing (I don't think they are yet?) and you could produce them just block by block at scale... there's a ceiling there. People can only consume so many movies. You will saturate that market incredibly quickly, and even that is assuming the market of moviegoers will be interested in an AI movie.
Hell, Disney has already learned this without even needing AI to do it. A big part of their ongoing failure that is their streaming service is they completely and utterly saturated the market for Marvel content. They poured billions into all manner of series and films, the quality has steadily declined, and despite being cheaper and easier to access than most, people have still managed to get sick of it. And however you want to slice it one thing AI cannot overcome is that it cannot create new, novel concepts: it can only remix and recombine existing things into new combinations. This is probably good for hobby tier stuff, but for industry? This is going to crater VERY quickly I believe.
> I'm certainly curious what the modern equivalent to Karl Marx publishes, and how much it will differ from The Communist Manifesto.
I enjoy David Graeber personally.
> The argument you're making doesn't appear to care about the differences — it should bite on both, or neither.
I mean there are definitely things to be said about outsourcing in that conversation, the incentives at play that make these animators willing/interested in learning a style of animation that is not part of their culture, the benefits involved in their education, why they're paid a fraction of what westerners would be for the same work while the product is then sold for the same if not a higher price. I just don't think it's relevant, they're still people and still work within the limitations of people. You're not talking about a widget press that forges 10 widgets with the work of a single operator vs. a blacksmith making them by hand with regard to generative art, you're talking a widget press that uses less resources to produce effectively infinite widgets at incredible paces that are simply unfathomable to the blacksmith, and also somehow the machine stole the souls of millions of blacksmiths to enable it to work.
The metaphors just break down when talking about this, because of the sheer differentials involved.
"I'll keep saying it every time this comes up.
I LOVE being told by techbros that a human painstakingly studying one thing at a time, and not memorizing verbatim but rather taking away the core concept, is exactly the same type of "learning" that a model does when it takes in millions of things at once and can spit out copyrighted code verbatim."
> No one has yet grappled with the reality that especially in the realm of generative art, these models do not work without a basis of absolutely stressful amounts of stolen artwork to train them. They simply do. Not. Exist. Without that baseline unethical contribution that tons of artists made without their knowledge and certainly without their consent. None of the major figures behind any of this tech will acknowledge this.
Well that's not correct.
Don't get me wrong, I've seen the statements you're probably referring to; but there's also Adobe with one made from licensed and public domain images only.
(You may want to argue, like my literally Communist ex, that under capitalism there can be no such thing as a fair contract between a corporation and a worker; I'm unsure of the power dynamics so won't counter-argue that).
It hasn’t been deleted. They’re just annoyed they have to learn to use AI now. Or at least they should be; people will lose their job to someone using AI, not to AI itself.
There are a thousand ways the creative game has merely changed, not been supplanted. I wouldn’t have hired an artist before to draw some stories I’ve wanted to make. But once I use Midjourney to create some mockups, I might. The last 10% of polish is still the hardest part, and that’s worth a lot more than raw image generation.
That's my profession you're talking about. Many of my colleagues are freaked out about losing their jobs, they've spent decades honing their craft. I heard one art director describe the future of professional artists as "visual janitors" who will spend their time generating images and using their skills cleaning up the weird mistakes the AI makes.
What's with the Luddite takes on HN? If you want to stop tech progress then why are you hanging out the forum of a tech accelerator? It would be great if we could keep this attitude to reddit, which has already gone down the drain.
What's with all the deluded techno-hopium? Do you just prefer the safety and comfort of your echo chamber to assure you that everyone in the world shares your opinion, while at the same time you get to pat yourself on the back for being smarter than everyone?
There’s a lot of “art” that is really just graphic design at its core. Creatives at-large are certainly negatively impacted with respect to the commercial viability of their skills.
However, artists who express something in their own unique way (not necessarily in a commercial context) are not necessarily threatened by generative AI. I do think artists can even make use of generative AI if it helps their own artistic process. There has been more thoughtful writing to this effect elsewhere.
> but the rate of progression feels like they will soon
The rate of progression seems to be logarithmic: we got "something looks plausible", but to get that last 10% it's probably going to cost more in hardware than just using humans, unless there are some breakthroughs. Just like self-driving cars.
My impression, at least looking at the developments from a somewhat technical perspective, is that they are hitting all kinds of scaling problems: in terms of data available, runtime complexity, hardware available, etc.
Nvidia raking it in is a perfect example of how inefficient the whole thing is. The models are doing fairly simple math (nowhere near needing the complexity of a general-purpose GPU core), and things are limited by memory bandwidth and available memory. I'm sure hardware designed specifically for transformer inference could make it cheaper and faster. But it seems like anything like that is years out for the general market, and Nvidia is raking it in selling repurposed GPU architectures, and nobody else can even compete with that, based on the software stack alone.
People selling us on AI replacing software developers are getting fleeced for billions because they can't port their software stack to similar hardware from a different vendor...
The most depressing thing is talking to children about potential careers at the moment... My girlfriend has a couple of younger siblings; one of them loves art and draws endlessly – he really wants to design characters for cartoons and video games. The other loves programming and wants to be a developer. I've always encouraged these passions of theirs, telling them if they work hard they can do the things they love as a job and get paid well for it. I don't know what to tell them anymore...
I worry that over the last few decades we've been going through a golden age of career opportunities without really appreciating it. With a few exceptions no career considered desirable has been fully automated, and at the same time the variety of jobs that we can do has grown almost exponentially. I mean even something like farming where there's been a lot of technological innovation we still absolutely need farmers – will the same be said about graphic designers in say 5-10 years? Perhaps we'll need a few people prompting the tools, but will it actually make sense for anyone to physically draw anything?
For the most part in the past people had to do whatever job they could. I worry we may be returning to that world. It seems likely there will still be service jobs where human interaction is valued, but eventually it seems all jobs that don't require that human presence will eventually be automated. Although, we're probably still quite a long way off all manual labour jobs being automated fully.
We aren't even close to development work being automated away.
Maybe art because it's often less critical. If your logo looks a bit goofy because it's AI-generated it doesn't matter. If your production server crashes because of a problem in your AI-generated code, it does.
AI is a tool and is not going to replace artists any time soon.
Proper 3D art generators don't even exist yet. The ones that do are useless toys. We aren't even close to generators that can output anything even close to what a beginner 3D artist can produce.
And yes - it will make sense to physically draw stuff, because it's a much better way to tell the AI what to do. You can massage prompts all day and you will still not get what you want. Providing a pose, depth map, normal map or a reference image gives the AI a lot more information than just a prompt. I don't see that changing any time soon. You simply can't describe every single detail in a prompt. And even if you could - it's still quicker to make a rough sketch of what you want instead of writing a goddamn novel in the prompt textbox.
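This is roughly what conditioning approaches like ControlNet already let you do; a minimal sketch with the diffusers library (the model names and the idea of feeding a scribble are assumptions about one common setup, not the only way to do it):

    # Sketch: condition Stable Diffusion on a rough scribble instead of relying
    # on the prompt alone (assumes the diffusers package and a CUDA GPU).
    import torch
    from diffusers import StableDiffusionControlNetPipeline, ControlNetModel
    from diffusers.utils import load_image

    controlnet = ControlNetModel.from_pretrained(
        "lllyasviel/sd-controlnet-scribble", torch_dtype=torch.float16)
    pipe = StableDiffusionControlNetPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", controlnet=controlnet,
        torch_dtype=torch.float16).to("cuda")

    # A rough sketch carries pose/composition information that would take
    # paragraphs to describe in a prompt.
    sketch = load_image("rough_sketch.png")
    image = pipe("a knight on horseback, dramatic lighting",
                 image=sketch, num_inference_steps=30).images[0]
    image.save("out.png")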
I'm fed up with the line: AI is a tool and is not going to replace artists any time soon. Take a look at the VFX industry; it's starting to crumble.
I'll wager that within five to ten years, ML-generated art will be churning out content that an experienced artist is capable of. In seconds, compared to the months or years it takes an artist. And once we master the formula of emotion, it's game over.
Stream AI/ML a real-life drone feed of a weather catastrophe and watch it produce a realistic journalistic newscast.
A manually operated lathe was once assisted with human input; that then evolved into an automated lathe, which was then aided by CAD, and now with the aid of 3D printers you no longer need the manual labor that once created an object from a block of wood or metal.
AI/ML is the new lathe of the digital world. It's a powerful tool that right now can aid artists. But that will evolve, concept work won't be required, and skill will be lost.
It's destined to be. It will replace artists, writers, journalists, news casters and other creative outlets.
> I wager $50 five-ten years, that we'll be churning out content that an experienced artist is capable of.
I wager $50 that 'artists' will still exist and that this becomes a tool for those artists if they just accept it. We all have phones that can make great photos, yet I still paid for someone to photograph my wedding, because they are much, much better than anyone with a phone. Just understanding all the knobs of generative AI takes a lot of time.
I feel like the only thing actually impacted by this soon is the stock photography content farm game. Because if stock imagery is good enough, AI generated images might be good enough.
> Just understanding all the knobs of generative AI takes a lot of time.
We have more technically minded people in the field who can understand the knobs faster than we've ever had before. ML now has the ability to figure out its own knobs.
Who knew about Linux in 1994, compared to now?
Then: not a lot, now? Half the internet.
Steam engines were very mechanical. Back then, to drive a train you needed to be trained and work with the conductor and driver. Now? A train driver operates a lever. Now automation is replacing those jobs with driverless trains. Our local subway is exploring the idea.
ML, AI? Just search GitHub and you'll find dime-a-dozen projects created by young adults. It's only going to take a generation to burst this thing wide open.
I'm a 34-year-old sysadmin; it's past my time to jump on the bandwagon. Not much I can do apart from feeding the servers that churn the cog.
> Yet I still paid for someone to photograph my wedding, because they are much, much better than anyone with a phone.
If you didn't have to pay and had some AI, ML do it for you producing the same results, would you?
The role of the artist won't vanish outright; artists and AI will co-exist. But the skill of actually producing the image will turn into menial work; the talent they spent years learning will be worth far less than it once was.
Anyone can type "A fairy-elf of a dystopian world with an abandoned castle fighting off a horde of bionic flies". All you would need is an artist to brush up the artwork to make it feel more "human".
Maybe for the initial design, but once that's done, why would you pay the artist when you've already got the concept image needed to train on?
> But the skill of actually producing the image will turn in to menial work;
I don't see why it is a problem. Most people don't even give a shit about how much effort you put into the work. They only care about the end result. Spending a lot of time on something doesn't make it more valuable.
> Most people don't even give a shit about how much effort you put into the work. They only care about the end result. Spending a lot of time on something doesn't make it more valuable.
If you gave me a hammer and chisel, you wouldn't get an end result, you'd get something that looks like junk.
From someone who puts time and effort into mastering the skill, you will get an end result.
"Replace" is difficult to define. Will there not be a single artist? Obviously wrong.
But possibly we have seen "peak" numbers in some professions? There are some exceptions, but in many modern professions, the numbers of professionals have only been increasing with GDP growth (e.g. quantity of lawyers/accountants/doctors).
I would guess there are more professional human translators today than there were in 1950. However, I think this is one profession that may "peak" at some point: For low-value texts, machine translation will be good enough, leading to a smaller pie of paid translation work, which can only sustain a smaller number of translators. The field will certainly stop growing and may decrease at some point. Anecdotal, but in the European Union institutions (https://www.politico.eu/article/translators-translation-euro...), it appears that we have already hit this "peak" and the future is gradual decline in the number of translators employed.
> AI is a tool and is not going to replace artists any time soon.
It has for my personal projects. I used to hire freelance designers often. I see no need anymore.
> Proper 3D art generators don't even exist yet. The ones that do are useless toys. We aren't even close to generators that can output anything even close to what a beginner 3D artist can produce.
Disagree, unless you believe we weren't that close to good 2D art generators a few years ago. Maybe 3D is different, but things can move surprisingly fast.
> And yes - it will make sense to physically draw stuff, because it's a much better way to tell the AI what to do.
Is that drawing? I write music as a hobby. If you told me I could strum some chords and give a model a rough beat and it would produce music 10x better and 10,000x faster than I can, I'd just stop doing it. What's the point? It's the complexity that makes it interesting, not the output. I'm guessing the fun part of being an artist isn't creating the rough initial sketches, but the complexity of getting the exact shadings and fine details perfect. Perhaps some people view it differently though.
You need to generate models, materials and textures separately. Need to split the model in separate parts that make sense. Need to know where to use spline meshes, where to use instancing, how to group meshes and so on.
Also which format to pick? Gltf is a good choice, I guess. But it's pretty limited compared to formats like .blend or .max.
One of my favourite examples is a clockwork mechanism. Current 2D AIs always generate complete nonsense, because they have no idea how a clockwork mechanism works.
A much more advanced model is required for something like that and it currently doesn't exist.
> AI is a tool and is not going to replace artists any time soon
"Not any time soon" in computer years is "not in the next nine months or so".
> Proper 3D art generators don't even exist yet. The ones that do are useless toys. We aren't even close to generators that can output anything even close to what a beginner 3D artist can produce.
The original DALL•E was terrible. DALL•E 2 was 15 months later, and was good enough to be interesting. 15 months after DALL•E 2, out comes SDXL.
That it is still better to use a sketch than an ex-nihilo prompt: I agree, but it doesn't seem to support your claim of "not going to replace artists" — if a talentless hack can wipe stains of nacho cheese from their fingers into a vague humanoid shape with the same attention to anatomical correctness as a child with crayons, take a photo of the now un-recyclable waste, and get a useful result when they ask an AI to turn it into "a wizard riding a motorcycle up the side of the Empire State Building"… then talented artists already have an economic problem.
Work does not guarantee the right to a good life. Rather, it is political power. Nobody is saying 'AI will automate landlords out of a job', because everybody knows that's nonsensical. Being a landlord is primarily a right to extract value guaranteed by political power. It is not primarily a job. If you have no political power, it doesn't matter how fundamental to the economy you are, you can expect to have no rights whatsoever. Slaves were the entire economy of the US South, for example. If you have political power, you can be as useless as most landlords are, and you can still expect to live well.
Trying to game the system by being useful to wherever you guess industry is going to be in ten years is stupid. Just do what you can, and devote your energies to organizing.
> It's so depressing seeing how many skills and jobs have been instantly devalued. Take a look at the artist communities and it's total depression that peoples life passion and source of income has just been deleted over the course of a few months.
>We aren't talking about burger flipping and checkout scanners. This is a creative pursuit which takes years of grinding to become proficient, just wiped out.
Solve cancer first, then we can weep for jobs that were automated. There is too much suffering in this world for us to be making up jobs just to waste time.
On the flip side, AI tools allow someone to get a prototype going without first grinding for years or coming up with the funding to pay for someone else's time spent on the grind.
I too hope OpenAI will always keep a version with data frozen up to Sept 2021, at the very least for historical purposes. I know the data will start to go stale and not have the newest, most relevant information. But data after that date will start mixing with content created from the original data, making it more unreliable over time.
> This might sound a bit conspiratorial so apologies for that but back when the competition for Google Assistant/Siri/Cortana/Alexa was super hot, the responses to the various voice assistance was almost eerie in that it could infer what you wanted/needed.
Siri at least was always a simple intent-based system that was never very intelligent - very much a brute-force method where someone went in and manually configured patterns for it to recognize.
You can tell by how the API for Siri works for third party integrations.
On a much smaller scale, I’ve done something similar with Amazon Lex (the AWS version of Alexa).
I am not saying the underlying architecture of Siri or Alexa is simple. But how you build on top of it to get it to do things is just giving it patterns for it to recognize and various permutations.
The hardest part is the speech-to-text; once you get everything in text form, the complexity is not that great.
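To illustrate the "manually configured patterns" point, a toy sketch of the kind of intent matching these assistants expose to third parties (the intents and phrasings are invented for illustration):

    # Toy intent matcher: once speech-to-text is done, classic assistants mostly
    # map utterances onto hand-written patterns with named slots.
    import re

    INTENTS = [
        ("set_timer",  re.compile(r"set (?:a )?timer for (?P<minutes>\d+) minutes?")),
        ("play_music", re.compile(r"play (?:some )?(?P<artist>.+)")),
    ]

    def match_intent(utterance: str):
        for name, pattern in INTENTS:
            m = pattern.search(utterance.lower())
            if m:
                return name, m.groupdict()
        return "fallback", {}

    print(match_intent("Hey, set a timer for 10 minutes"))
    # -> ('set_timer', {'minutes': '10'})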
There are several breakthroughs actually, every few weeks, inching closer and closer to the next generation of hardware for ML models with optoelectronic compute.
Pretty much every single major tech university has their research/VC arm with a hand in photonics for AI.
The initial rollout is around interconnect, but compute is moving along even faster than I'd expected, with significant steps forward occurring frequently.
I'm skeptical we'll be seeing actual quantum computing any time soon for general purpose stuff due to error correction, but for the specific needs of training and running a neural network, it's pretty much the perfect marriage of hardware limitations with task specifications.
So yes, this software trend is going to end up being much cheaper and efficient. By orders of magnitude leaps, not just at the creeping pace of Moore's law.
That's why model hashes/transparency is necessary. There's absolutely no reason anyone should ever consider building a deployed app that uses an API whose underlying model could change any second to something far dumber. The only thing that could avoid this is competition and commodification of LLMs.
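For self-hosted or commodified models the pinning part is straightforward; a minimal sketch, assuming the weights live as local safetensors files (the paths and layout here are hypothetical):

    # Sketch: pin a deployment to a specific model snapshot by hashing its weight
    # files and refusing to start if the weights changed underneath you.
    import hashlib, pathlib, sys

    EXPECTED_SHA256 = "replace-with-the-hash-you-validated-against"

    def model_digest(model_dir: str) -> str:
        h = hashlib.sha256()
        for path in sorted(pathlib.Path(model_dir).glob("*.safetensors")):
            h.update(path.read_bytes())
        return h.hexdigest()

    if model_digest("./models/my-llm") != EXPECTED_SHA256:
        sys.exit("Model weights changed -- re-validate before deploying.")

Hosted APIs obviously can't be audited this way, which is exactly the problem with building on an endpoint that can silently swap its underlying model.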
> from what I've read they're losing money like crazy.
Source? Random tweets don't count.
> They're legitimately bad today (Siri in particular, but even Google Assistant is much worse than itself from back then)
Your anecdote. My anecdote is Siri doesn't change, but our expectation becomes much, much higher. Siri blew everyone's mind by... making a phone call to nearby Starbucks. When it came out that phone call was a super big deal.
IMO, your post should be one of the canned responses to AI doomers. No, LLMs won't replace fiction writers and diffusion models won't replace artists because the people who are too cheap to pay writers and artists are also too stingy to pay AI hosting companies.
This is what we used to call "AGI" back in 2013-2014: artificial general intelligence. An AI which is general enough to be used for diverse tasks in multiple fields, unlike the specialized AIs of that time (e.g. for image recognition or self-driving cars). Nothing more than that.
Now the goalposts have moved way forward, and "AGI" is something that is conscious, has superhuman intelligence, and probably evil intent. But back in the day, AGI was just what we have today -- a universal chat bot.
This doesn't match what I've seen of usage of the term AGI since ~2010. I think we've "solved" chatbots more than was imagined at the time, and that involves being able to produce readable and arguably useful answers about a range of topics, but we haven't got game playing sorted yet, we don't have proactive models, and there's nothing in these models for movement or robotics (which until recently has been considered an important necessary step).
We have taken a noticeable step forwards towards AGI in the last few years, but AGI is still a long way off.
Games have been solved since AlphaZero. Just set up an adversarial model, and voila. Any game, any conditions.
Proactive models are simple — just generate the “internal monologue” periodically from the system prompt, mix it with the compressed state, then feed it to another model (the outputs are used as state updates). I think this is how human agency works as well.
Movement is a different story (LLMs do not work here, except as planners), but it is pretty much solved too. Robot hardware is too expensive, though, and it is not clear how to power it autonomously.
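A minimal sketch of the periodic internal-monologue idea described above (the llm() call, the state format and the 60-second cadence are placeholders, not any real API):

    # Sketch: a 'proactive' agent loop -- periodically generate an internal
    # monologue from a system prompt plus compressed state, then let a second
    # pass decide whether to act. llm() stands in for any chat-model call.
    import time

    def llm(prompt: str) -> str:
        raise NotImplementedError("placeholder for an actual model call")

    state = "nothing notable has happened yet"

    while True:
        monologue = llm("You are an always-on assistant.\n"
                        f"Current state: {state}\n"
                        "Think out loud: is there anything worth doing right now?")
        decision = llm(f"Internal monologue: {monologue}\n"
                       "Reply with an action to take, or NOOP.")
        if decision.strip() != "NOOP":
            print("acting:", decision)
        state = llm(f"Summarize the new state given:\n{state}\n{monologue}")
        time.sleep(60)  # 'periodically'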
You really shouldn't say that so confidently. I know for a fact that at least in the US military (well, DoD civilian analysts), they're using models like AlphaZero for theatre-win condition modelling. It's probably being used because it allows analysts to find better solutions and they probably started using it after winning wargames with it - both guesses on my part, all I know for sure is that they are using the tech to model westpac.
But by that point neural networks had already started to dominate AI research, to the point that "AI" was becoming synonymous with them... what about 1950(?)-2010?
> But back in a day, AGI was just what we have today -- an universal chat bot.
I don't know anybody that used AGI to mean that. I've always thought of AGI as an AI that can learn how to perform tasks with very simple instructions, rather than spending weeks training a neural network on a data center full of GPGPUs.
Basically, my idea of AGI has always been what you're calling the "new" definition of AGI.
Remember that people have decades of generalized training before becoming able to perform tasks with very simple instructions. I would consider years of pre-training acceptable if the model is usable and trainable quickly in practice as well.
Mostly “training” to do abstract things, not really for survival or fun.
I could catch fish since a fairly early age, mostly because I enjoyed it.
At about age 12 I used to drive stolen cars around without any training at all except observation.
It took me decades to become a corporate robot though :)
I think training is a weird word, life is an experience, not just training. I think your comment demonstrated the limitations of language to describe things.
To have the social and physical capabilities to create and pass on the concepts of "fishing rod" and "car" as well as how to use them. We did start from scratch, after all. But I think the learning of simple, complex, abstract, and nuanced concepts throughout a person's development as their brain grows is more than enough to compare to an artificial general intelligence without taking into account development of a species, considering we're the ones building the chips the AI is running on.
People think newborn babies know nothing. But even from birth, there are certain instincts and skills they already know, such as the ability to recognize faces.
Heck, our ability to think, analyze, etc. has come from hundreds of thousands of years of evolution. Even dogs, which we consider to be intelligent, often fail at simple tasks like trying to bring a stick through a doorway when the stick is wider than the doorway. Sure, sometimes they figure it out, but not all can.
I would call a chat bot a specialized application. And I dispute that a chat bot was ever considered the end goal of AGI. The idea is that if we had AGI, we could build a chat bot with it to test/demonstrate its general reasoning capabilities. Simply being effective at chatting is not sufficient to be called an AGI. And today's chatbots are quite stupid.
Sure we're starting to give chat bots the ability to query information and even interpret user-provided images, but AFAIK the image interpretation is done using a different model rather than being an extension of the LLM.
I've been building a daily news digest with intent of learning something new and creating something cool.
Although AI is the heart of the product, I have noticed that I have been using it as an "anything tool". Though the article talks about broader use cases like hardware and robotics, I've found it's an anything tool for most of the boring use cases.
Using python for the first time in a while, ask GPT. Need a tweak on the app store copy, ask GPT. What are the top news sources, ask GPT. Data formatting & manipulation, save 5 minutes ask GPT.
Tried using Bing ChatGPT to create a logo for me because I saw a cool demo on Instagram. It couldn't write text correctly and I tried numerous times. Not only that, the logos were terrible and didn't "mimic" the image I uploaded as a guide (which was in the demo). Today I needed to transcribe an MP3. Nope, can't do that either.
However on Monday I had it write some PHP and that was neat!
Gave Bing Chat an MP3 URL and it just refused. I know I can do this with software on my PC, or maybe there's an online service where I can upload, but I was hoping this would be something Bing Chat/ChatGPT would excel at. I tried first giving it the URL of the article where the MP3 was embedded, and then tried a direct link.
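For what it's worth, dedicated speech-to-text models handle this fine; the chat frontends just don't route audio to them. A minimal sketch with OpenAI's Whisper endpoint (openai-python 0.x style; the file name and key are placeholders):

    # Sketch: transcribe a local MP3 with the Whisper API instead of a chat UI.
    import openai

    openai.api_key = "sk-..."  # assumed to be set in your environment

    with open("episode.mp3", "rb") as audio_file:
        transcript = openai.Audio.transcribe("whisper-1", audio_file)

    print(transcript["text"])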
Bing Chat is marketed as an AI assistant with a long list of capabilities beyond simply generating copy. It's not far fetched to think it could also transcribe audio. How is a user supposed to know that their AI assistant can and can't do without asking?
Are you assuming people should know what their AI assistant can do without asking? Because I'm saying that a tool not doing some arbitrary task it never implied it could isn't surprising, I don't think anyone has proposed what you just said.
Can you provide any resources that show transcribing arbitrary audio? In my experience using VAMP plugins and librosa, I could get about 80% there, but I'm very interested in any projects that get this right.
EDIT: I see now that you mean speech to text, and not music to sheet music, so maybe disregard this. Unless you do happen to know about what I'm talking about.
Bing chat uses DALL-E for image generation. The results were complete gibberish. Characters that looked like an alien alphabet and the graphic was also so abstract to not represent anything.
Generative image systems are notoriously poor at producing legible text without significant rerolls. You'd be far better off generating a textless logo and then layering text on afterwards using PS, Photopea, Krita, etc.
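A minimal sketch of that workflow using Pillow for the text layer (the font file, wordmark and placement are assumptions; the logo itself comes from whatever generator you use):

    # Sketch: take a generated, text-free logo and add the wordmark afterwards,
    # instead of asking the image model to render letters.
    from PIL import Image, ImageDraw, ImageFont

    logo = Image.open("generated_logo.png").convert("RGBA")
    draw = ImageDraw.Draw(logo)

    font = ImageFont.truetype("DejaVuSans-Bold.ttf", 72)  # assumed font file
    text = "ACME LABS"  # hypothetical wordmark

    # Center the text near the bottom of the image.
    bbox = draw.textbbox((0, 0), text, font=font)
    w, h = bbox[2] - bbox[0], bbox[3] - bbox[1]
    draw.text(((logo.width - w) / 2, logo.height - h - 40),
              text, font=font, fill=(255, 255, 255, 255))

    logo.save("logo_with_text.png")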
These models work on a "latent" representation of the image, instead of working on the pixels directly. The model essentially learns a compression algorithm which allows it to process images most efficiently according to its loss function. The latent space might not represent certain types of shapes very well, as is the case with text characters.
Models like DeepFloyd IF work in pixel space, without learning a latent space. This means they can capture much more detail, but require more computational power. This is why these types of models usually rely on creating a smaller image and then using a separate model to upscale the image.
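To make the latent representation concrete, a small sketch with the Stable Diffusion VAE from diffusers (the model name and the 0.18215 scaling factor follow the common SD 1.x setup; treat them as assumptions):

    # Sketch: a 512x512 RGB image becomes a 4x64x64 latent (~48x fewer values),
    # and the denoising model only ever sees this compressed representation.
    import torch
    from diffusers import AutoencoderKL
    from diffusers.utils import load_image
    from torchvision.transforms.functional import to_tensor

    vae = AutoencoderKL.from_pretrained("stabilityai/sd-vae-ft-mse")

    img = to_tensor(load_image("input.png").resize((512, 512)))  # (3, 512, 512) in [0, 1]
    img = (img * 2 - 1).unsqueeze(0)                             # (1, 3, 512, 512) in [-1, 1]

    with torch.no_grad():
        latents = vae.encode(img).latent_dist.sample() * 0.18215  # (1, 4, 64, 64)
        recon = vae.decode(latents / 0.18215).sample              # back to pixel space

Fine structure like letterforms is exactly the kind of detail that can get lost in that compression step.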
I'd like to have LawGPT, where it reads all laws of a certain country, and I can ask it things like, "what are my options when the neighbors tree crosses over into my property?"
I tried ChatGPT 4 for a specific TX county's laws. This was after I had spent hours doing traditional search. ChatGPT got everything correct, and even gave me some more direction.
This only shows that ChatGPT is as good as or better than your traditional search, not that it's gotten the law right. Unless you go to a lawyer and confirm, this doesn't say a whole lot, especially considering how prone LLMs are to hallucinating incorrect answers that look correct (even if you found it matched up with Google search, the answers might still be wrong in whatever blog or article you read).
That's a fair point. This happened to all be in an area of the law, estate law, which was heavily documented for the public in that jurisdiction.
The estate was tiny, and the actual legal advice was locked behind a paywall ($7k minimum in legal fees) which would have otherwise taken the majority of the estate.
Yeah. The results are impressive and I can't deny the value when ChatGPT is able to get things like this right. But especially given how much disinformation is starting to be spread on the internet these days and how it's all being scraped for training LLMs (and especially in the case of law where different jurisdictions can have very different laws but ChatGPT could easily get its wires crossed), I'm worried that it might lead to more harm than good when people make decisions based on incorrect outputs of LLMs.
My anecdote with technical information is exactly the opposite, unfortunately.
I routinely use it to help guide my research - generally about topics I know something about but want to know more.
It often will provide some response that is entirely wrong but looks very good. When called out, it apologizes, then proceeds to produce more information that's also not entirely correct. There's an artform to teasing out the right answer... but of course you have to be knowledgeable enough to know what is right in the first place.
The subtleties of correct or wrong can be very difficult to determine for a layperson. Without deep knowledge, one might be inclined to believe the first response it provides.
That's pretty terrifying when you think of the possible implications revolving around research, law, etc.
I have had similar experiences in other areas, that's why I didn't try an LLM before spending hours researching. But I must say that I was shocked at the accuracy in that one particular area... Harris County estate law.
What is the difference here? No hallucinations, everything correct… is it just random chance?
I'm having the same issues with it getting things confidently incorrect.
The flow I've been going through:
* Do this task with this constraint
> GPT4 Does task
* Analyse all of the items and whether they adhere to the constraint
> GPT4 "Apologies, I got some things wrong, here's what I did incorrect"
* Fix the things you did wrong, remember the constraint is of utmost importance
> GPT4 "I have completed the task and fixed the constraint"
* Analyse all of the items and whether they adhere to the constraint
> GPT4 "Apologies, I got some things wrong, here's what I did incorrect"
...
At least it can analyse the results, but even then if you don't ask in the right way it will blatantly lie to you and tell you it's all correct when it's not.
But for programming languages and other logical question areas it does not matter much, as you can verify with logic whether something is correct, and then it is incredibly useful.
Likewise, same result for me too but I get it. It's a language model. It makes a best guess based on the information it was provided so I trust it on the things I'm familiar with but verify on those that I'm not.
I'm really stunned by the fact that, it seems, so few people get this. It's not a robot in the sense of Commander Data from Star Trek. It's a recursive software method that just has unfathomable amounts of data to base its guess at some correct sentences on.
There is no logic, no reasoning, no nothing. Yet people complain it's 'giving me wrong answers'. Well, it doesn't know what it's giving you either way. It only knows the statistics, the odds that the sentences it creates are similar to what others have said before.
And "Free Will" is the observation of one's inability to predict one's own actions.
As an aside, the claims that people are willing to make about language models are quite astounding considering that they never seem to realize that most of those claims apply to humans also...
Yeah, it's true. Even today, I notice that I can often find things using Google while other people can't. I guess it's just a skill and some people lack it.
> I tried ChatGPT 4 for a specific TX county's laws. This was after I had spent hours doing traditional search. ChatGPT got everything correct, and even gave me some more direction.
Some months back, I tried asking it detailed questions about Australian drug laws. Initially it responded accurately, but then told me some absolute crazy nonsense (that LSD was a completely legal drug in Australia). I just tried it again and it isn't doing that any more. I still wouldn't trust anything it says about legal topics.
IBM mainframe assembly language remains a topic on which ChatGPT (the GPT-3.5 version at least) rather consistently hallucinates. For example, I just asked it to explain the difference between SVC (Supervisor Call) and PC (Program Call) instructions. It wrongly claimed PC is used to make calls within the current program. On the contrary, the PC instruction is basically an LPC/IPC mechanism, it is used to make a call to another process running in a different address space.
We are working on LawGPT. Today's foundation models like GPT-4 have some laws, but not all. Sadly, some US states still require the purchase of printed copies, and others are blocked by TOS.
I’m curious what methods the other commenters replying to you are using.
I recently vectorized a PDF of material that contains a lot of complex scheduling language. It covers legal regulations as well as contractual constraints. Using langchain’s QA function I queried the vector DB. I then tested it against a sample of questions from a Facebook group on the subject. It did shockingly well.
It seems so much has to do with feeding it the right chunks of source material. When I experimented with just copy and paste small chunks on my own it rarely provided useful feedback.
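A rough sketch of that flow (module paths match 2023-era langchain and may have shifted since; the PDF name and question are made up):

    # Sketch: vectorize a PDF, then answer questions against the most relevant chunks.
    from langchain.document_loaders import PyPDFLoader
    from langchain.text_splitter import RecursiveCharacterTextSplitter
    from langchain.embeddings import OpenAIEmbeddings
    from langchain.vectorstores import FAISS
    from langchain.chat_models import ChatOpenAI
    from langchain.chains import RetrievalQA

    docs = PyPDFLoader("scheduling_rules.pdf").load()
    chunks = RecursiveCharacterTextSplitter(
        chunk_size=1000, chunk_overlap=100).split_documents(docs)

    db = FAISS.from_documents(chunks, OpenAIEmbeddings())
    qa = RetrievalQA.from_chain_type(
        llm=ChatOpenAI(model_name="gpt-4"),
        retriever=db.as_retriever(search_kwargs={"k": 4}))

    print(qa.run("What notice period applies to a schedule change?"))

As noted, the chunking is where most of the quality comes from: if the right passages never make it into the retrieved set, the model has nothing good to work with.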
I love the law idea. I just bought a house and would like to do an addition. Feeding it local code and regulations for QA could be a godsend in navigating this stuff for a non-expert.
Lots and lots of organizations would want a strong ChatGPT-like tool trained on a specific (likely not open) dataset of their choosing, in addition to all the large general data that grants "common sense" and language understanding. Your example of all laws of a certain country is a good one; the legal department of some (for example) chemical megacorp would want that, plus all the court cases that are even slightly relevant, all their contracts since 1900, and all their internal policy documents - and that has quite some potential. But it really does need to be a large language model; for such scenarios, the differences between e.g. ChatGPT 3.5, ChatGPT 4, Llama and Llama 2 are meaningful.
How do you build that in a way that you can rely on the results?
The safest approach currently is to use GPT-4 and tell it something like:
"I will give you a block of text and a question. Please answer the question using only the information in the provided text. Don't make any guesses and admit if you are not sure about your answer. If you don't know the answer, say you don't know the answer.
Question: X
Text: Y"
Trouble is that you cannot put the entire books of law into Y because the context is limited. So currently you need to do something else to sieve through the law material and find only the passages most relevant to the question through embeddings or full text search or some other method. It's not very reliable.
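A bare-bones sketch of that sieving step (the embed() helper is a stand-in for whatever embeddings API you use; the choice of k and the chunking are arbitrary):

    # Sketch: pick the k passages most similar to the question, then build the
    # "answer only from this text" prompt around them.
    import numpy as np

    def embed(text: str) -> np.ndarray:
        raise NotImplementedError("stand-in for an embeddings API call")

    def top_k_passages(question: str, passages: list[str], k: int = 5) -> list[str]:
        q = embed(question)
        scored = []
        for p in passages:
            v = embed(p)
            score = float(np.dot(q, v) / (np.linalg.norm(q) * np.linalg.norm(v)))
            scored.append((score, p))
        return [p for _, p in sorted(scored, reverse=True)[:k]]

    def build_prompt(question: str, passages: list[str]) -> str:
        context = "\n\n".join(top_k_passages(question, passages))
        return ("Answer the question using only the information in the provided text. "
                "If the text does not contain the answer, say you don't know.\n\n"
                f"Question: {question}\n\nText:\n{context}")

The failure mode is exactly what you'd expect: if the relevant statute never scores into the top k, the model either says it doesn't know or confidently answers from the wrong passages.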
I don't recall which one it was (it might have been another one), but one had a rental contract as the example document. The interesting part (to me) was the ability to ask it questions like "can you bring a dog to a party" and have it answer and show the passages that restricted pets and parties.
I suspect having a database of legislation and case law, and a good vector database, then feeding relevant context into a language model with a large context window will work best.
I bet we're going through the cycle netflix went through.
Cable was fractured and expensive; then Netflix made it cheap and monolithic; then everyone realized there was money to be made in this new paradigm; then there was a legal fight over usage rights; then it got split up and partitioned into a bunch of different services, the combined cost of which is pretty expensive, and individually each service kind of sucks.
So I'm betting we're a few years away from stackoverflow, wikipedia, imdb, goodreads?, etc all having their own ai interfaces.
Fine by me. chatgpt is a way better user experience than stackoverflow, and I'd probably use the wikipedia one, don't really care about anything else. It being an "anything tool" is kind of a novelty, I liked that I could make it "write an advertisement for a housecat in the style of a ford truck commercial, and use plenty of made up trademarks and patents for normal cat behaviors/features", but then lost interest when it refused to do the same thing for boobs.
These are all sites relying on volunteers to contribute the data, how would that work if their input becomes "invisible" ? Even volunteers want credit / to see the final result...
(Mostly unrelated: note that the last two are now owned by Amazon.)
> but then lost interest when it refused to do the same thing for boobs.
There’s a cottage industry of unaligned local LLM and image models from company leaks or released retrained public models dedicated specifically around letting you do this for “boobs” (or anything else, for that matter).
It’s not hard to find these communities if you go looking (searching for E.G “local models general” will get you in the right direction), but be advised while they’re a great resource, they’re unabashedly NSFW and unabashedly boundary-pushing, as the subset of the populace dedicated to generating their own porn and distributing porn-related models tend to be uniquely dedicated individuals.
> but then lost interest when it refused to do the same thing for boobs.
GPT-4 from the API seems to have no problem with it:
Narrator: (Deep, gruff southern voice)
[Soundtrack: American Roots Rock tune in the background]
(SFX: Eagle's cry)
“Presenting the all-new Boobiemax® Wonder Duo! Crafted with the finesse only Mother Nature could inspire!
These ain't your average pair of breasts. They're like no other, perfect blend of allure and comfort, built for the hardworking women who seize the day. No fabrication - only 100% organic design with ComfortPlush™ technology, defining the balance between aesthetics and luxury…” and so on.
I would posit that ChatGPT (or perhaps GPT in general) seems to be an "Anything tool" because the task it's trained on, human language, is a sort of "Anything tool" itself.
One thing that is surprising is just how vast the chasm is between Google/Bing search quality for developer resources (and probably many engineering fields with vast user documentation) and how good ChatGPT-4 has become at it.
As a developer, ChatGPT and the OpenAI API tools are essential and useful. Though if the gap is that big, it probably means that other players can chip away at it.
It's hard to give up code/art which has gathered a lot of value. Copyright isn't dumb, because once your code/art amasses a value of more than $1m, or whatever the threshold is for you, you would absolutely want to protect it.