Hacker Read

pessimizer · 2023-07-09 17:00:55

"Data model" isn't a confusing term, and "data modeller" is a profession. They work on data models.

woeirua | karma 4033 | avg karma 6.18 · | 2022-08-17 10:48:37

This article just reinforces that management thinks that data scientists spend most of their time building models. In reality, data scientists spend the majority of their time munging and wrangling disparate data types (that are typically a total mess), understanding the data well enough to fix problems, translating business requirements to code, and converting model outputs to human-understandable presentations. Building models is a trivial amount of the effort for most projects, typically 10% or less of the time. Most organizations will eventually move over to AutoML-type solutions, but then will be shocked when they fail to achieve any significant gains in productivity because they're optimizing one of the shortest steps in the whole process.
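
To put rough numbers on that last point, here is a minimal sketch using the standard Amdahl's law formula. The 10% share comes from the estimate above; the 10x AutoML speedup and the function name `overall_speedup` are assumptions made up purely for illustration.

```python
def overall_speedup(step_fraction: float, step_speedup: float) -> float:
    """Amdahl's law: overall speedup when only one step of a process gets faster.

    step_fraction: share of total time spent in the step being optimized (0..1).
    step_speedup:  how many times faster that step becomes (may be infinite).
    """
    if not 0.0 <= step_fraction <= 1.0:
        raise ValueError("step_fraction must be between 0 and 1")
    if step_speedup <= 0:
        raise ValueError("step_speedup must be positive")
    # Time left after the optimization, as a fraction of the original time.
    remaining = (1.0 - step_fraction) + step_fraction / step_speedup
    return 1.0 / remaining


# Modeling is 10% of the project (estimate from the comment above).
# A 10x faster modeling step is a hypothetical AutoML gain.
print(round(overall_speedup(0.10, 10.0), 3))          # 1.099
# Even if modeling took no time at all, the ceiling is about 1.11x.
print(round(overall_speedup(0.10, float("inf")), 3))  # 1.111
```

Under these assumptions, the whole project gets at most about 11% faster, no matter how good the AutoML tool is.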

wickerman | karma 384 | avg karma 4.57 · | 2019-07-29 15:34:07+00:00

Data engineering is not data science. Data engineers deliver the data for data scientists, data scientists use the data in models.

IndianAstronaut | karma 1451 | avg karma 1.47 · | 2016-12-22 09:24:30

Someone who can build models and someone who can scale them are 2 different people and professions. Data scientist vs data engineer.

amrrs | karma 9083 | avg karma 5.62 · | 2017-12-08 14:27:33+00:00

Yes. Ideally, a data scientist works on creating the model or logic, while a data engineer works on deploying it.

moffkalast | karma 7759 | avg karma 1.88 · | 2022-11-02 12:34:18

Data science is a complicated profession, wouldn't you agree?

eitally | karma 4713 | avg karma 2.51 · | 2015-10-20 13:54:58

My guess (and this is a reasonably educated guess): a data scientist creates models and usually has an advanced math or stats degree (or similar). A data analyst uses models created by data scientists, and often has a business/econ or other undergraduate degree.

Of course this is a generalization, but in my perusal of job openings over the past year, this seems to be roughly what companies mean.

apahwa | karma 116 | avg karma 1.81 · | 2016-08-08 01:38:28+00:00

That seems like a pretty inaccurate job title then. Data Engineers are people working with data pipelines, storage, and schemas. They can lean towards more software engineering or towards analytics with dashboarding/machine learning but their primary responsibilities are the former.

CmonDev | karma 1532 | avg karma 0.87 · | 2014-04-30 11:39:17

"Just look around in the finance and insurance sector for data modeling."

That is a very narrow niche.

deviationblue | karma 93 | avg karma 1.6 · | 2018-05-18 20:54:29+00:00

Ill-defined though it may be, there's still an understandable difference between data science and data engineering.

jacobsenscott | karma 3158 | avg karma 2.99 · | 2023-03-01 11:36:01

Sure, anyone can say they are a data engineer.

dmh2000 | karma 233 | avg karma 1.41 · | 2021-03-29 05:28:45

Isn't it typical for data engineers to be paid less (way less?) than the modelers?

disgruntledphd2 | karma 6481 | avg karma 1.73 · | 2020-11-15 12:57:18+00:00

This is a deeply misleading (though somewhat accurate) comment.

It's misleading because the 70% above (who may be called data scientists) are not actually data scientists; at best they are data analysts.

In general, the core difference between data scientists and data analysts is that the former can code in at least one language (SQL doesn't count, unfortunately).

However, because the term data science became so popular, everyone re-branded their analyst roles as data scientist roles, leading to this concern.

Additionally, the post I'm replying to is pretty biased, as the OP talks about productionising models. While this is a major facet of DS work, it's not the whole thing. TBH, I can find people to productionise models a lot quicker than I can find people who can figure out what to model, and how to measure it.

Some of those people are most comfortable with Excel, and while I'd prefer they used a different tool, I can't argue with their output.

Also, the OP here is focused on deployment of Python ML models, which again is a subset of a very, very broad field.

That being said, I agree with most of the categorisations, except that the two critical attributes of good data scientists are a strong background in statistics and data common sense.

Data common sense is a weird attribute: you look at the numbers and can tell whether they are reasonable. For example, if you are running a mobile gaming company and see an ARPU of $5, something has either gone horribly wrong, or you're going to be a billionaire (assuming you have equity).

This attribute is actually not that common amongst DS people, so it tends to be the limiting factor, rather than ability with containers and deployment (which I do agree is very important).
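
The ARPU check above can be sketched as a simple plausibility test. This is only an illustration: the revenue and user figures are made up, and the "plausible" band is an assumed placeholder, not an industry benchmark. The names `arpu`, `looks_plausible`, and `PLAUSIBLE_ARPU` are hypothetical.

```python
def arpu(total_revenue: float, active_users: int) -> float:
    """Average revenue per user: total revenue divided by active users."""
    if active_users <= 0:
        raise ValueError("active_users must be positive")
    return total_revenue / active_users


def looks_plausible(value: float, low: float, high: float) -> bool:
    """Return True if value falls inside the expected band [low, high]."""
    return low <= value <= high


# ASSUMPTION: an expected ARPU band in dollars, chosen only for illustration.
PLAUSIBLE_ARPU = (0.01, 1.00)

# Hypothetical figures that produce the $5 ARPU from the example above.
value = arpu(total_revenue=500_000.0, active_users=100_000)

if not looks_plausible(value, *PLAUSIBLE_ARPU):
    print(f"ARPU of ${value:.2f} is outside the expected range; "
          "check the data pipeline before celebrating")
```

The code is the easy part. Knowing what band is reasonable for your business is the actual skill.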

sampo | karma 11354 | avg karma 4.19 · | 2018-02-20 16:55:24

> I'd call myself more of a "data plumber"

I think the actual term is Data Engineer.

digging | karma 3850 | avg karma 2.04 · | 2023-08-14 09:59:52

Presumably one of those Data Scientist instances is meant to be Data Engineer?

pavlovskyi | karma 41 | avg karma 0.68 · | 2021-08-31 05:44:42

But that is just an assumption. I have worked as a data scientist for 5+ years, and from a practical point of view it is not just data wrangling. It is worth mentioning that, following that logic, we assume that a programmer fully understands how to develop a model for production and how to handle its edge cases, which is not true.

mjirv | karma 1319 | avg karma 5.38 · | 2021-07-09 00:32:00+00:00

Analytics Engineer is a clear one for this, as teej said.

The title is strongly associated with the dbt community, so it could imply you’re using dbt for your data modeling (not necessarily a bad thing, as it sounds like it would be a good tool for your use case).

bytematic | karma 547 | avg karma 1.26 · | 2018-06-03 19:14:19+00:00

I agree with you. Is this the type of work a data engineer does? I actually kinda like that.

mousetree | karma 698 | avg karma 2.74 · | 2022-08-23 11:25:29

I think that is OP's intent. Data Scientists, Data Engineers, Data Analysts, Data X

minimaxir | karma 67739 | avg karma 7.48 · | 2019-03-26 12:59:18

Modeling is just a small part of data science (the percentage of time I've spent modeling as a data scientist is in the single digits).

Automating modeling is a bit easier than automating the other parts, though.
