The article points out that the energy to train BERT is comparable to a flight across America. There are hundreds of thousands of flights every day - why should we be concerned about the equivalent of one extra? Especially given that BERT improves ~10% of English-language Google search results, it seems like we're getting a lot in return for relatively small energy use. On top of that, Google buys 100% of its energy from green sources.
I think it's great to talk about other methods of training or architectures that don't require so many parameters. The point about how BERT consumes vastly more text than humans do when they learn to read is interesting. But trying to phrase this like an environmental issue just seems disingenuous and misleading.
Not sure why folks are defensive? The article is mainly informational.
TFA:
"What does this mean for the future of AI research? Things may not be as bleak as they look. The cost of training might come down as more efficient training methods are invented. Similarly, while data center energy use was predicted to explode in recent years, this has not happened due to improvements in data center efficiency, more efficient hardware and cooling."
Also: one flight in and of itself isn't an environmental problem; thousands of flights are.
Training this one instance of a model isn't an environmental problem; I would be curious to see some educated guesses about the number of models being trained over the next ten years. Not an expert but I know that use of ML is exploding - lots of new use-cases and thus lots of new models.
The article suggests that the energy usage of language models is a problem. I don't think energy usage is a problem. I'm not sure how you interpret this as being defensive.
There are hundreds of thousands of flights per day. Adding an additional flight, or even an additional thousand flights, to substantially improve Google doesn't seem like a big cost. Consider: Would your life be worse if one random trans-American flight was cancelled today, or if Google searches became 10% worse?
Another way of thinking about this same point is: why is the author writing about language models if they are so concerned about the environment? Surely the airline industry is a better subject as, again, they fly hundreds of thousands of flights each day. It's hard to take someone seriously when they are focusing on an infinitesimal part of a huge problem.
It's also misleading because there is a difference between energy used by an airline flight, where the energy comes from burning jet fuel, and energy used in a data center. In Google's case (the people training BERT), the energy used was 100% renewable; Google reached that goal in 2017. Perhaps OpenAI didn't use renewable energy to train GPT-3, but I wager they didn't power their machines by burning jet fuel either.
Maybe the electricity used to train language models will become a meaningful issue at some point in the future. I don't think that future is close at hand though.
tl;dr I think the tech industry doesn't like to be criticized.
That's a good point about renewable energy sources.
I'd still argue it's reasonable to assume that ML will consume more power as use-cases for it grow (i.e. more and more models are trained), and, therefore, I don't think it's unreasonable to consider the energy usage.
Honestly - I think what you may have heard is "ML is bad/evil." And yeah some folks probably feel/vibe that.
That's different from "Like any other technology the full context of ML deserves to be considered, and since it's so hyped by billion-dollar companies right now, maybe a bit of pushback on all the hype isn't a terrible idea."
And yeah we should definitely look at the airline industry; pretty sure Mother Jones does, and there's no reason we can't look at the energy impact of multiple industries.
If you were against lung cancer, and writing about the issues caused by scented candles emitting smoke and that smoke corrupting lungs - it would be hard to take you seriously or accept your criticism as valid. It's probably true that scented candles are infinitesimal contributors to lung cancer, but it's just silly to think of them that way.
Likewise, yes, language models use energy, but objectively it's not that much energy, and the sources of that energy are sometimes green and always efficient (i.e. coming from the grid, not from burning jet fuel).
You can't really solve the problem of lung cancer by addressing scented candles. Reducing energy used by language models likewise won't have an effect.
As always, there’s nuance. You can schedule workloads, machine learning/training included, where the power is cleanest (low carbon). Google relies on electricitymap.org/Tomorrow to do this.
The more folks who elect to pick where to compute based on low electrical carbon intensity, the faster the grid turns over to clean generation. You must vote with your fiat. I encourage technologists to include this consideration in their workload scheduling requirements. Renewables are almost always cheaper than fossil generation as well.
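As a toy sketch of the idea (the region names and carbon-intensity numbers below are made-up placeholders, not real Electricity Maps data):

```python
# Toy carbon-aware scheduling: launch the job in whichever region currently
# has the lowest grid carbon intensity. Region names and gCO2eq/kWh values
# below are made-up placeholders, not real Electricity Maps output.
def pick_greenest_region(intensity_by_region):
    return min(intensity_by_region, key=intensity_by_region.get)

if __name__ == "__main__":
    sample = {"us-central1": 450, "europe-north1": 120, "asia-east1": 550}
    print("Schedule the training job in:", pick_greenest_region(sample))
```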
The categorical imperative doesn't require everyone to be practically able to do the thing. Is it not moral to feed a starving man because people in China aren't able to feed him?
There's more than enough solar and wind, it's just a matter of how much we as a society want it. The green energy revolution has barely begun, and we're still climbing the learning curve. More demand means a faster climb means exponentially more green energy sooner.
We’re already at "soon". Another order of magnitude increase (which at our pace is all but guaranteed in the near term) will make large-scale training a significant R&D line item even for the FAANGs. It will also push efforts to be commercially viable, as you can no longer do proof-of-concept nets at $500m.
> I think the point is, if nothing changes, the costs will be very significant very soon.
The energy efficiency of machine learning hardware is progressing at a rapid rate. It's not accurate to assume that the energy costs will stay the same. Just look at how much it would have taken to train something like GPT-3 five years ago.
I think it is reasonable to assume that any advances in efficiency will be subsumed by increased use, much in the same way increased energy efficiency doesn't decrease absolute overall use.
ML hardware is no doubt progressing. Whether we can expect a rapid rate is a different matter. The various Moore's Laws have been breaking down (chip speed stopped doubling a while back). GPUs are the sort of chip that could most benefit from transistor counts still doubling (the last Moore's Law standing), but even that one is breaking down.
ML model size has had its own Moore's Law, with standard models growing exponentially in size [1]. This implies models are going to butt up against those limits even more than they already have. Whether the "bitter lesson" [2] of ML is inherently true is an open question; that current researchers have accepted it seems a given.
My brain does NLP better than any system out there. I’m also able to ride bicycles and do motor control better than Boston Dynamics' robots. I can also construct and prove mathematics, do physics, and code all of this in Matlab and C. My brain handles this wide range of tasks almost seamlessly on just 15 watts of power; Silicon Valley’s supercomputers can barely do 1% of it.
To be fair, it takes years of energy to train our brains to the point where it can do all those things, and our brains aren't extensible in the same way hardware/software is. I guess there's also a lot more variation in yields. ;)
Well, that’s not entirely true. AlphaGo self-played 29 million games of Go, which is infeasible for humans. Assuming a game of Go takes 5 minutes, it would take a human roughly 276 years to achieve the same, assuming this person doesn’t sleep, eat, or do anything else other than play Go. CPU time is an order of magnitude faster than real time, especially on GPU clusters.
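Rough arithmetic behind that figure:

$$\frac{29{,}000{,}000 \times 5\ \text{min}}{525{,}600\ \text{min/yr}} \approx 276\ \text{years}$$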
For it to be a fair comparison, wouldn’t you have to include all the energy used to train “your” brain through generations of evolutionary training? Your latest model is like taking an already-trained BERT model and adding a few tweaks.
I see BERT as nonsensical. You need to be scientific and have a mathematical theory of how humans learn language, which is a multidisciplinary task requiring physicists, mathematicians, neuroscientists, cognitive psychologists, and linguists. Benchmarks are useless; theories, models, experiments, and testable predictions are how science progresses. You’re making a comment on cognitive science, and trying to imply that language learning in humans isn’t learned, but pre baked. The psychological, linguistic, evolutionary-biology, and neuroscience evidence doesn’t seem to corroborate that. The evidence points more strongly to humans having general learning and problem-solving abilities. For instance, there was no evolutionary pressure for humans to be good at math or programming. I was not born knowing English or calculus or probability theory; these were learned abilities. Evolution favoured brain mechanisms that lead to behaviour for success in a rapidly changing world. Had I been born in ancient Rome as a farmer, I would have learned to speak Latin and how to be a successful farmer, instead of the physics, math, probability, computer, driving, and reading skills that I learned in my lifetime.
>You’re making a comment on cognitive science, and trying to imply that language learning in humans isn’t learned, but pre baked.
You make good points, but this one isn’t quite what I meant. I didn’t mean we are trained by evolution for a particular language, but that evolution selected for a language skill. In other words, we are “pre-baked” with the ability to learn a language. It would be analogous to a DL model being trained for generalized regression but not a particular problem. To that extent, I think we’re saying the same thing. Some of the theories related to our generalized learning abilities postulate they stem from this base ability (our aptitude for music, for example, being a consequence of our language learning ability).
>I see BERT as non sensical. You need to be scientific and have a mathematical theory
This is a matter of contention. A lot of science progresses by starting with an empirical result that drives a change in theory rather than the other way around. I can’t remember who to attribute it to, but there’s a quote to the effect that “‘That’s odd’ is the most productive phrase in science, rather than ‘Eureka!’”
Yes it did, and we do know a lot about the general principles behind it, from different disciplines. This is still an early science. NLP research still hasn’t considered many important aspects of language learning that we have discovered in such a short period of time.
> starting with an empirical result
What makes you think these benchmarks are empirical? They were hand-constructed to fit some objective, assuming that being good at said objective is required for NLP tasks. Where are the empirical experiments to validate the notion that said objectives lead to language? Science hasn’t worked this way: datasets aren’t constructed; you do an experiment and MEASURE it. Then you make models, try to explain the phenomenon, and test new ideas and validate your models. Can your model extrapolate new information and suggest new experiments to validate? I used the word extrapolate over predict intentionally.
It’s a little hard for me to follow your last paragraph but it sounds like you are confusing AI and AGI. I don’t think anybody using BERT is claiming the latter.
I assume the empirical results are the validation sets run, i.e. the tests that show it provides better results than the base rate. Again, it’s important not to conflate verifiable results with understanding the underpinnings of why it works. If you’re walking and I race you with my car over and over again, I can conclude my car is a faster mode of transportation without understanding anything other than “push the right pedal to go faster”. My ignorance doesn’t invalidate the results.
Yeah, but your clone() function is very wasteful, and we can't stick 1 million of you in a dark room to just look at people's Google searches and figure out what they're really going for.
And it's not the whole picture to say the brain only uses 15 watts, when there are all sorts of necessary support systems it couldn't run without. So it's closer to 100 watts (2,000 kcal/day).
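For reference, 2,000 kcal/day works out to roughly 100 W of average power:

$$\frac{2000 \times 4184\ \text{J}}{86{,}400\ \text{s}} \approx 97\ \text{W}$$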
Not Google search, but many companies are exploiting cheap labor overseas for different industrial use cases because our algos suck. Not just in manufacturing; there are tech companies outsourcing data labeling for tasks like object detection (Hive.ai as an example). Whether it makes you depressed or not, humans are cheaper than algorithms, and I don’t see that changing unless we abandon the current paradigm for AI/ML.
Comparing your pre-trained brain (that has a lot of structure (=training) through evolution) with the training costs of a new algorithm isn't really fair.
All in, you require about 100 W (2,000 kcal/day), maybe three times that when doing a lot of physical activity. Boston Dynamics' Spot uses about 400 W. I can probably outperform it in some disciplines while it would beat me in some others. That would be a fair comparison.
You’re making sweeping assertions that require domain experts in several different disciplines. The scientific evidence points to the contrary. It is demonstrated that humans have a general ability to learn a wide range of things without being genetically programmed to. None of us evolved to drive cars. Yet nearly everyone in my grandparents’ generation was able to learn this totally new skill despite it being a new invention, where there couldn’t possibly have been time for evolution to act. They weren’t genetically evolved to drive; it was learned within their lifetime and generation. There are tribal humans in different parts of the world that haven’t developed written language. Yet you can teach them written language. Where’s your “pre-trained brain” theory there?
Your ancestors might not have driven cars, but they used the motor skills required for controlling one, and they had a vision system optimized for detecting and avoiding moving objects.
Human brains are also the culmination of billions of years of iterative development. Computers were only invented in the last century. I would not be surprised if computers could catch up given a few tens or hundred more years.
> I'd give 30(realistically 15) years tops for AGI.
I think 30 years is probably reasonable, if for “30 years” one reads “twice as long from now as commercially viable fusion generation actually was when first widely hailed as 15 years away”.
I think this is false. My general opinion is that computer scientists and engineers (especially in Silicon Valley) lack the scientific training that one gets in other disciplines like physics or the other hard sciences. Psychology, linguistics, cognitive science, and neuroscience have generated rich, diverse experimental data in the past 50 years, and now is the prime time for a theory to emerge to connect everything. To make what I’m saying clearer: you needed Newtonian mechanics, Maxwell, Planck, and Coulomb to develop and discover the science of electricity and atoms in order for us to engineer the silicon transistor. Without the original scientific knowledge, building such devices would have been impossible. This machine learning, AI, deep learning, and this obsession with benchmarks are, in my opinion, a hindrance to AGI.
I think the error in your thinking is the assumption that scientific discovery is predicated on previously defined theory.
Theory is required for understanding, not for discovery. They are two different things. What gets a lot of scientists and statisticians worked up is how ML can outperform traditional models without always being interpretable. It’s like claiming the Wright brothers couldn’t build a flying machine without having a thorough understanding of the mechanics of flight. I can improve performance of my car by remapping the fuel without necessarily understanding the nuances of optimizing enthalpy. There’s levels of understanding, and they may not correlate completely to effectiveness. To that extent, it’s more engineering than science
> There are hundreds of thousands of flights every day - why should we be concerned about the equivalent of one extra
See also: Sorites fallacy.
There is a limited desire for flights, which benefit people as far as it allows them to get things where they need to be, and no more than that; there is presumably an unlimited desire for machine learning.
> One recent model called Bidirectional Encoder Representations from Transformers (BERT) used 3.3 billion words from English books and Wikipedia articles. Moreover, during training BERT read this data set not once, but 40 times. To compare, an average child learning to talk might hear 45 million words by age five, 3,000 times fewer than BERT.
The human brain is the output of an incredible number of generations of training, representing a vast consumption of energy. Most of the learning that informs the brain happened before this hypothetical five-year-old was even born.
That is a matter of great debate. Although most people's brains wind up settling on basically identical structure over time, case studies with injured or disabled people often display incredible adaptability (e.g. the occipital lobe in a blind person). If indeed the brain learns most of what it knows from experience, then its (comparatively) low energy consumption would come from greater efficiency. That definitely seems possible; as a basic unit of machine learning, the neuron is much more specialized than the transistor, and far slower.
My interpretation of what the parent meant is that the cognitive processes which support language acquisition (which may be mostly domain-general) are already optimized for this task, for which there is a sensitive period during certain years where you need to be exposed to certain inputs, or else you never fully acquire language. See: https://en.wikipedia.org/wiki/Genie_(feral_child)
550–600 million years of hyperparameter tuning and neural architecture search using an evolutionary algorithm is nothing to laugh at.
But the actual learning process of a single brain uses very few training examples and is very energy efficient: a brain runs on only about 20 watts of power.
This is beyond ridiculous, even compared to the energy needed to train a human mind to the age of, let's say, 35. Raise a child to the age of 35 and the energy spent will surpass the mentioned energy cost by at least two orders of magnitude.
And no, I don't ridicule AI ethics researchers: the energy cost of AI is negligible compared to the serious issues. The real harm stems from other areas.
The 6000kWh number is assuming 20W over 35 years. That might be right if you only count the brain, but then you would have to revise the neural network numbers to only count CPU and memory, which would easily halve those numbers (fans take a lot of power).
If we want to account for power supply, temperature control etc., we could use the entire calorie count of a human over 35 years, leaving us with around 30000kWh (assuming a healthy, not-overweight human). You could now argue that that's too high (the human is doing a lot of other things). But as you pointed out, it's a bit of a pointless comparison anyway, as humans don't start from scratch.
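As a rough sanity check of both figures (power times hours, with about 8,766 hours per year):

$$20\ \text{W} \times 35 \times 8766\ \text{h} \approx 6{,}100\ \text{kWh}, \qquad 100\ \text{W} \times 35 \times 8766\ \text{h} \approx 30{,}700\ \text{kWh}$$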
> we could use the entire calorie count of a human over 35 years, leaving us with around 30000kWh (assuming a health, not-overweight human)
That's still too low, considering you can't just have a human eating food and educating themselves in a vacuum.
We need to account for energy going to and from school, shelter, heating, clothing: basically all the necessities that enable someone to learn (and the same for their professors).
There are very few "typical human tasks" that machine learning actually does better than a person; playing games is all that occurs to me. But for driving, image description, language translation, and so forth, at the scale a human does them, the human is generally still considered to do tremendously better. Of course, some ML programs are convenient in being able to do things at a large scale, but a majority of what ML programs do is like machine translation: not great, and something we sometimes live with because it's free.
Basically, one should really not discount human ability, especially in tasks considered mundane. Such skills are often considered "mediocre" not because they're actually easy but because all humans can do them, see:
ML is usually better at speed. For example I use a fine tuned GPT-2 for code autocomplete in vim, and even though humans could do better, they can't do it as fast.
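As a rough illustration (and not the commenter's actual setup), here is roughly what prompting a GPT-2 checkpoint for a code completion looks like with the Hugging Face transformers pipeline; the stock "gpt2" checkpoint stands in for a hypothetical fine-tuned code model:

```python
from transformers import pipeline

# "gpt2" is the stock checkpoint, standing in for a hypothetically
# fine-tuned code model; the prompt is illustrative.
generator = pipeline("text-generation", model="gpt2")
prompt = "def fibonacci(n):\n    "
result = generator(prompt, max_new_tokens=40, do_sample=False)
print(result[0]["generated_text"])
```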
Right now, a lot of the low-level tasks that contract lawyers do at scale for maybe $50/hour could be done by clerks for $15/hour or clerks in India for $3/hour (or something). But the lawyer is paid for their time because their expertise is needed to flag the "corner cases".
Fine-tuning BERT (what most people do in practice) is so much more efficient. You can do it in 8 hours on a 2080 Ti.
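For what it's worth, a minimal sketch of what such a fine-tuning run might look like with the Hugging Face transformers library; the dataset, checkpoint, and hyperparameters are illustrative assumptions rather than anyone's actual setup:

```python
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)

# Tokenize a small sentiment dataset; IMDB is just an illustrative choice.
dataset = load_dataset("imdb")
def tokenize(batch):
    return tokenizer(batch["text"], truncation=True,
                     padding="max_length", max_length=128)
dataset = dataset.map(tokenize, batched=True)

args = TrainingArguments(output_dir="bert-finetuned",
                         per_device_train_batch_size=16,
                         num_train_epochs=3)
Trainer(model=model, args=args, train_dataset=dataset["train"]).train()
```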
They need to mention that the total number of people training large transformers from scratch is very, very small. I'd wager that the total number of different, uniquely trained (not using previous weights, which reduces compute requirements by massive amounts) language models in existence is in the low hundreds.
I'd claim that these mass language models, serving as the underlying encoding backbone behind more specific systems, actually save energy and compute compared to the previous methods (which needed far more data, and thus more energy spent on getting it, combined with less efficient representations like tf-idf that made many classifiers very slow and thus burn lots of energy).
Also, much of the recent research in this field is about model pruning, quantization, and any technique you can imagine to reduce training and inference time or memory requirements.
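For instance, a minimal sketch of one such technique, post-training dynamic quantization in PyTorch; the checkpoint name is just an illustrative choice:

```python
import torch
from transformers import AutoModelForSequenceClassification

# Load a (fine-tuned) BERT classifier; the checkpoint name is illustrative.
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased")
model.eval()

# Post-training dynamic quantization: swap Linear layers for int8 versions,
# shrinking the model and speeding up CPU inference.
quantized_model = torch.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8)
```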
All in all, big language models are a net positive for the environment. The efficiency gains in any number of fields from increasingly sophisticated NLP systems far, far outweigh the costs of training them. Foundational research in environmental conservation will be accelerated by effective NLP semantic search and question-answering systems. That's a single, tiny example of the potential benefits from large language models.
That's my #1 reason to be bearish on B$ - it's just a complete environmental cluster-fuck. And in order to not see this, you'll have to ignore a lot of facts - which in turn tells me a lot about those inside the crypto-bubble, namely that they do not care that much about facts.
At Starship's $100/kg, placing your $1,000 Eth-mining GPU in space would add only 10-20% to its cost when amortized over a 100-ton rig; it is comparable with building or buying your own small power plant, which the large crypto operations have done. AI and crypto are exploding and only going to explode more. Granted, the planet is becoming too small a confine for it. It seems that AI and crypto will be the killer apps of space in the near future.
One can also observe that, among the animals of planet Earth, humans have the largest share of body energy consumed by the brain, and future humans will probably have an even higher share. In technology we observe the same: the pinnacle of technology, the CPU, has practically zero thermodynamic efficiency, and the share of energy consumed by computers grows, and I think it will only keep growing. Intelligence eats the world.
Heatsinks optimized for radiative cooling, which are kept in the shade by solar panels. Works much better when half or more of your surroundings aren't a planet at human-scale temperatures, as is the case on earth's surface.
Only because of the very constrained weight budget (which is constrained by the high price of delivery into space), so you can't just throw in an AC with a radiator. Radiative power scales with the 4th power of T, so it is pretty effective in space if you run the hot end of the AC hot.
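For reference, radiated power follows the Stefan-Boltzmann law,

$$P = \varepsilon \sigma A T^4,$$

so a radiator run at 600 K sheds $(600/300)^4 = 16\times$ as much heat per unit area as one at 300 K.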
I suppose there were a lot of arguments about usability and various constraints/impossibilities 30+ years ago, when cell phones had only just appeared at $1+/minute and were available in only a very few places. Yet here we are. Cheap access to space will do the same for various tech-in-space (how about unlimited access to space with a fine-printed "cap of 1000kg/month" :).
Could you clarify why you think placing crypto rigs in space is a good idea? To run a crypto rig, you need to provide a lot of power and have a mechanism to keep the processors cool. Being in space seems to make both of these substantially more difficult (especially cooling the electronics). It seems highly implausible to me that crypto will be the “killer app” of space.
My #1 reason is that once someone figures out how to build large, reliable quantum computers and program them in a practical way and quickly break sha256, bitcoin might be how we find out. That still seems to be a ways away though, and in the meantime, having a mostly-unbreakable digital ledger seems to be valuable.
The counterpoint to this I'll make is that all financial systems engage in costly signaling.
If you visit a foreign country, how do you know whether to trust a specific bank? You probably look at cues: does it occupy an expensive skyscraper in the center of town? Do you see its ads around town? Does it sponsor the local soccer team? All credible signals that are hard to fake for a fly-by-night scammer.
All Bitcoin did was formalize this process. At any given time there are many chains that all purport to be the canonical history. How do you decide which one is authentic? By looking at hard-to-fake signals. In this case the accumulated hashing power behind the chain. Looking for whoever spent the most hash work is fundamentally no different than checking to see which bankers are wearing the most expensive suits.
Any system with trusted intermediaries will waste resources on costly signaling. The only question is whether crypto mining is more of a fundamental waste than traditional signals, like high-paid bankers and prestige real estate.
I don't know. I see the nice HSBC building in my city and all I can think of is that half of their business is crooked: money laundering, trafficking, drug cartels. No shiny building can change what you actually do.
> The counterpoint to this I'll make is that all financial systems engage in costly signaling.
Bitcoin uses 600+ kWh per transaction currently. With current use, to process VISA's 1700 transactions per second would consume >36 PWh annually. That's significantly higher than global electricity consumption and capacity. At current US electricity prices it would be significantly over 4.5 trillion USD annually. Global annual banking revenue is under 6 trillion: https://www.mckinsey.com/industries/financial-services/our-i...
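The arithmetic behind that estimate, taking 600 kWh per transaction as a lower bound:

$$600\ \text{kWh} \times 1700\ \text{tx/s} \times 3.15 \times 10^{7}\ \text{s/yr} \approx 3.2 \times 10^{13}\ \text{kWh} \approx 32\ \text{PWh/yr},$$

and higher per-transaction figures push it past 36 PWh.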
> The only question is whether crypto mining is more of a fundamental waste than traditional signals, like high-paid bankers and prestige real estate.
You are assuming that power used by the network is directly proportional to the rate of transactions processed. I doubt that is a sound assumption.
For example miners use energy to compete for the mining reward which is awarded every ten minutes. I can think of many ways of modifying the network to increase the rate at which transactions can be processed that do not increase this mining reward. (I do not understand why none of those ways have been adopted.)
One of the primary reasons that reward is not lower is probably because the amount was set when Bitcoin was created, and people worry that changing it would set a precedent making other changes easier.
The amount of work is what keeps transactions secure. The difficulty of computation is directly proportional, but the power usage depends on the demand for mining.
Lightning attempts to decrease this load by effectively batching transactions, but it's not a huge difference compared to the orders-of-magnitude differences that exist already.
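To make the "amount of work" point above concrete, here is a toy sketch of proof-of-work hashing; real Bitcoin mining uses double SHA-256 against a 256-bit target, so this only illustrates the idea of spent work:

```python
import hashlib

# Toy proof-of-work: find a nonce whose SHA-256 hash (with the header) starts
# with `difficulty` zero hex digits. The header string is illustrative.
def mine(header: str, difficulty: int) -> int:
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{header}{nonce}".encode()).hexdigest()
        if digest.startswith("0" * difficulty):
            return nonce
        nonce += 1

print(mine("example-block-header", difficulty=5))
```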
The vast majority of bitcoin/crypto mining uses renewable or otherwise-wasted energy, and this is the only economical way to do it! You have been intentionally misinformed about the amount of energy used rather than the source of energy used. It is completely a red herring to just read a headline about how mining uses the same amount of electricity as some tiny country that uses very little.
what? "wasted energy"? yeah a lot of energy is lost as it cannot be transported to commercial and residential areas economically. so miners set up processing at the source of the energy and use it. in fact, a lot of it actually reduces pollution creating the polar opposite of what you believe, in those circumstances it is a sustainability solution.
Now, aside from fighting me about your worldview, this is also an area to remain vigilant about! Nation states can absolutely mine at a loss when they turn to competing for control of the cryptocurrency networks for geopolitical reasons. Right now it is renewable.
But what about the hardware, the single-purpose hardware and e-waste? There is a large trade in "obsolete" hardware, as it is economical to use for people with the cheapest power. This is another specific area to be vigilant about, by making sure the infrastructure is in place to keep that hardware active instead of in landfills.
I can see that mining is a good incentive to look for excess energy in all kinds of places and make use of it. That's a capitalistic motive: finding solutions to derived issues when something else is at the core of the problem. But anyway.
B$ still looks ridiculous, let me explain.
* world electrical energy consumption: ~25000 TWh [1]
* bitcoin energy consumption index: ~77 TWh [2]
Bitcoin today consumes roughly 0.3% of the world's electricity.
That's fine, if half the world would use it and would do meaningful things with it.
What is it actually used for most visibly?
As a betting ground with galactic momentum, a technology promising to be the solution to everything, and outrageous claims that only the bovine are left to be excited about.
It's not that the algorithms and data structures are not cool, they certainly are, but we can do so much more today with technology than this.
And that's a different issue, you are conflating your energy use criticism with an arbitrary meaningful use criticism. Others may be willing to have that argument with you, but I know how those go: Someone says something about another industry's use, you say something about slow databases, they say something about another use case, you say something about why bitcoin/blockchains are not the best solution for that use case, rinse repeat.
The amount of energy, without detailing the source of that energy, is not a valid argument for any claim you want to make about how it is being used. Move off of it, or criticize the sources in a more nuanced, educated argument. It should be 100% renewable or part of a sustainable solution; not only will you get further in your argument by focusing on the market participants that actually are wasteful, you will also be helping!
The fact that it is primarily renewable is actually the reason why BTC's energy costs are exorbitant. That energy could be directed toward replacing fossil fuels, thus reducing carbon emissions.
You see, we have these things called power lines, and, along with the substations, transformers, and other transmission equipment, those make up power grids. We have 3 of them in the US alone.
Now, when electricity is generated, you can transmit it over those power grids, even to places that aren't right next to the dam. You can even store a limited amount for later use.
Every kWh that gets redirected away from frivolities like BTC can go toward something that would otherwise be powered by fossil fuels.
I thought you were "educated" here? Please do keep up.
Au contraire, I appreciate your "educated" opinion here. I just want to make sure I'm sufficiently "educated," so I can believe the exact same thing you do, with a straight face. If you didn't want it that way, you shouldn't have led with it.
Regarding A), there's no particular reason why that electricity can't be sent across power lines.
B) So what? Send it anyway, and replace 85-92% of the energy use.
So you think that I'm not open to conflicting realities, got it. I am not speaking in absolutes; I am simply responding to you. Now you know. I led with an "educational moment" because it remains accurate that the amount of energy is not a strong argument without referencing the source of the energy. It's not controversial for you to disagree with that. The reality, both before mining and now with mining included, is that energy producers opted not to transport some energy across long distances, and miners are overwhelmingly using that energy because it is economical for both parties. You read something else that I didn't say or imply. You keep referring to that, which I was willing to ignore. It's clear snark is important to you; now you've got your reaction, so I hope you feel satisfactorily accomplished in that regard.
Now, on to the points where you differ from the original person I replied to: places have had decades to send more power over power lines and they didn't. You tell me why that is. I assume there were financial reasons and related impracticalities. But admittedly I have never asked, and I only react to the reality that energy producers have been receptive to the additional use of their energy for the aforementioned reasons I listed. Their output hasn't changed, and to them it is a more efficient use of it. Are they lying? Are they ignorant of alternatives? Just lazy, even though their laziness would therefore predate crypto mining?
Either way mining on premise is an immediately applicable economic incentive that simply got you to notice that you didn't like it.
Best case scenario then is that it gets you into action to implement a solution nobody else noticed was applicable. Society might be getting somewhere because of you, that's so exciting.
Yes, very "exciting." I thought we were done with the snark?
Nonetheless, financial difficulties are not relevant. What is relevant is reducing the world's carbon footprint, and that needs to happen ASAP. I, personally, think it's more "exciting" that human civilization might survive another century than to keep track of the solution to some useless, financialized math problem. But, that's just me, I suppose.
Crypto mining at fracking sites does this. The flare gas systems threw hydrocarbons into the atmosphere before the miners arrived; the miners, doing a financialized math problem, give the operators the missing incentive to use that energy for the mining computers on site.
Corroborate that with a news source that you happen to like.
Like I mentioned: renewable or otherwise-wasted energy. This is one of the otherwise-wasted-energy examples that is a sustainability solution right now, in comparison to wishful thinking.
Shouldn't that be a reason to be bullish? Because if the governments step in and enact regulation against Bitcoin mining, that will increase its scarcity and therefore value.
Likewise, if you think fossil fuels will be scarce in the future, you can be bullish on the price of gas, but not bullish on things based on gas.
The number of bitcoins available (disregarding lost/forgotten ones) is fixed at 21M, and the rate at which they are created is more or less constant (by design), so anti-mining regulations will not increase scarcity. If anything, restricting the network hash rate will make the network less secure and thus less valuable
Not that I'm aware of. The Lightning network seems like a fairly decentralized affair to me. Still, I suggest you make up your own mind by perusing https://lightning.network/
So the arbitration still occurs on the blockchain? Seems like maintaining all these different ledgers adds a lot of complexity when a better solution may be a non-Bitcoin, or even non-blockchain DB.
It’s still crazy to think about how much more efficient computational power is nowadays compared to even five years ago. I remember I changed my power supply in anticipation of Nvidia's then-newest GTX series, only to learn that a 1050 Ti had a max draw of just 75 W, so the fancy new 1 kW PSU was completely unnecessary. Also, to put it in perspective, my Mac mini has a 30 W max draw. 30 W!!! It’s ridiculous how little energy it draws.
The brain is not just more efficient, it also builds itself. The hardware necessary to run a neural net is given. Maybe the rigors of self replication are what's missing from AI.
Problem is Gebru is an activist, she's biased towards her own cause. Accepting her demands directly, without including all the other groups and voices, would not be right.
> This month, Google forced out a prominent AI ethics researcher after she voiced frustration with the company for making her withdraw a research paper. The paper pointed out the risks of language-processing artificial intelligence, the type used in Google Search and other text analysis products.
And they aren't even learning in a sense that would make them categorically comparable to human-like learning. You could just say "it takes a lot of energy for a computer to crunch a lot of numbers".
Why does the computer need to learn when we can just program what the computer is supposed to do, based on what we know the desired output/result should be? Yeah, we can try to recreate calculus from first principles, or just use the equations that we already know, unless I am missing something obvious. There are PHP or C scripts, for example, that can do image recognition and break captchas (there is a program called XRumer that solves captchas for spamming purposes, which is why captchas have become so complicated)... this was in 2010, long before machine learning became a 'thing'.
As complicated as it is, deep learning is simple compared to programming by hand all the required knowledge. This approach historically failed (expert systems).
I don't know that a post this arrogant warrants a serious response. If you think you can write a PHP script that does what BERT or ImageNet does, by all means go right ahead.
I think most of the criticism here has to come from people not understanding that GPT-3 was not, and will not be, trained more than a few times.
If even then you still want to argue: aluminum production consumes 17,000 kWh per ton, which is 11 BERTs. I don't see weird criticism about mining consuming energy. Doing stuff takes energy, and while the sentiment against ML seems to be "it's snake oil", it is still very real research with actual real-world impacts.
Finally, we are in a transition phase where ML is done on GPUs which consume more energy than dedicated ASICs. We already have Google TPUs and Amazon Inferentia which can be used today. Power consumption will go down as dedicated hardware gets better and better.