I think it makes sense for them to emphasize a strong understanding of the fundamentals. It will help those students who later want to go into data science as well.
> Just because ChatGPT can do it, doesn’t mean that it isn’t valuable for a human to learn.
No, but it does sort of suggest that, doesn't it?
> This is especially true for foundational courses.
Sure, but calculus is about memorizing ways to answer problems. We're not talking about real analysis, the course in which students develop the calculus and prove it works.
> Sure, but calculus is about memorizing ways to answer problems.
It really isn't. That might be how some people managed to get a passing grade, but clearly they learned nothing and squandered a once in a lifetime opportunity to get to know it.
It’s valuable if you wrap it up in an applied problem-solving course like physics. That’s the point of the new curriculum reform anyway. Other countries that exceed us in test scores do this.
So it really begs the question as to what is the point? The only thing I can think of is college admissions. A specific selection of rigorous memorization for elite admission.
While I’ve led a data science team, I’ve never taken a data science course — so I’m not sure what it teaches. But I do feel pretty confident in saying that I think math does lose its usefulness around after trig. Not to say there aren’t useful aspects, but the curriculum is so inefficient. And maybe it’s because everyone needs some part of it, but that part is different for each person.
Math is interesting in that the early foundation is so useful, but the use drops off quickly. While I feel like other areas often become more useful as I learn more. Possibly because I haven’t spent 15 years on that topic like I had math.
I think most people would agree if pressed. Math is so ridiculously useful prior to trig. Almost any white collar job relies on these maths. Trig is an interesting inflection point in that so much math gets built on top of it, although it's not that useful in and of itself. And then after trig, things become much more fragmented and you really need to go into specific subfields to determine which branch of math is of value.
For example, if you go into medicine and medical research having a good understanding of statistics is useful, but very little in calculus or analysis is useful (and even if you do need Calculus, most of the useful stuff for those fields is taught in the 1st semester of Calculus).
I think a lot of things use calculus concepts, even if calculus isn't explicitly invoked.
A whole lot of finance and pharmacology are about exponential functions and their derivatives and integrals, for instance. A whole lot of fields use optimization, even if "just asking the computer to do it", etc.
I admit I am weaker now in calculus and linear algebra because I lean on CAS and simulation a lot... but at least I know how it works so that I have an idea of what I'm doing.
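To make the exponential point concrete, here is a minimal sketch assuming a simple first-order elimination model, C(t) = C0 * exp(-k * t). The names and the values of `C0` and `K` are hypothetical, picked only for illustration. The half-life comes from solving the exponential, and the total exposure is its integral, C0 / k, which the trapezoid sum only approximates.

```python
import math


def concentration(c0: float, k: float, t: float) -> float:
    """First-order elimination: C(t) = c0 * exp(-k * t)."""
    return c0 * math.exp(-k * t)


def half_life(k: float) -> float:
    """Time for the concentration to halve: ln(2) / k."""
    return math.log(2) / k


def total_exposure(c0: float, k: float) -> float:
    """Exact integral of C(t) from 0 to infinity, which is c0 / k."""
    return c0 / k


def trapezoid_exposure(c0: float, k: float, t_end: float, steps: int) -> float:
    """Numerical check of the same integral, truncated at t_end."""
    h = t_end / steps
    total = 0.5 * (concentration(c0, k, 0.0) + concentration(c0, k, t_end))
    for i in range(1, steps):
        total += concentration(c0, k, i * h)
    return total * h


# Hypothetical starting concentration and elimination rate.
C0, K = 10.0, 0.2

if __name__ == "__main__":
    print("half-life:", half_life(K))
    print("exact exposure:", total_exposure(C0, K))
    print("numeric exposure:", trapezoid_exposure(C0, K, 200.0, 200_000))
```

Knowing that the integral is C0 / k is what tells you whether the number the computer hands back is plausible.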
To be clear, I'm not referring to the concepts as they exist in the universe. But rather the actual material taught in the courses. For example, there's a lot of topology that we use in the real world, but the material in the class is only of use to a small percentage of people in the world.
I spent a chunk of my career optimizing FDMs and FEMs, but above and beyond that I haven't had a great need for Calculus until I started doing some deep learning. Again, very particular subfields.
And I suspect the work that you're talking about is exactly what I was thinking about when I wrote that even if Calculus is needed, it's the stuff taught in the first semester.
> but above and beyond that I haven't had a great need for Calculus until I started doing some deep learning.
I think for a whole lot of what we talk about in compsci... calculus is table stakes. Sure, it's not differential equations, but how do we talk about behavior at the limit or nonlinear scaling without it?
Even just making up functions that are smooth in their derivative and cross through a few points is something I've had to do a lot for decent heuristics.
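One common instance of this kind of made-up function, offered only as a sketch and not necessarily what was used in practice, is the cubic "smoothstep" 3t^2 - 2t^3. It passes through (0, 0) and (1, 1) with zero slope at both ends. The `ramp` helper and its parameter names are hypothetical.

```python
def clamp01(t: float) -> float:
    """Restrict t to the interval [0, 1]."""
    return max(0.0, min(1.0, t))


def smoothstep(t: float) -> float:
    """Cubic 3t^2 - 2t^3 on [0, 1]; flat outside that interval."""
    t = clamp01(t)
    return t * t * (3.0 - 2.0 * t)


def smoothstep_slope(t: float) -> float:
    """Derivative 6t(1 - t) on [0, 1]; it is zero at both endpoints."""
    t = clamp01(t)
    return 6.0 * t * (1.0 - t)


def ramp(x: float, x0: float, y0: float, x1: float, y1: float) -> float:
    """Move smoothly from (x0, y0) to (x1, y1), constant outside [x0, x1]."""
    t = (x - x0) / (x1 - x0)
    return y0 + (y1 - y0) * smoothstep(t)
```

Picking the cubic so that the slope vanishes at the ends is a small calculus exercise: two point constraints and two derivative constraints pin down the four coefficients.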
> And I suspect the work that you're talking about is exactly what I was thinking about when I wrote that even if Calculus is needed, it's the stuff taught in the first semester.
What's taught in the first semester varies a lot. I'm familiar with AP Calc BC, and sure-- a little bit of the stuff in the last half of the course (differential equations, vector-valued functions) is a little more esoteric for many careers. But a lot of stuff isn't so much (polar coordinates, the "practical integration" stuff that uses basic mechanics, calculator skills, etc)
Wouldn't say math loses usefulness after trig. But rather, due to how it is taught (usually, my experience being in Chile), it rapidly becomes too abstract, decouples from its "real world" use cases, and then it is easy to lose sight of the forest for the trees.
The Mathematics for Machine Learning book[1] exposes this as a top-down vs bottom-up problem. While both approaches have pros and cons, a sweet spot may lie somewhere in the middle, and that needs you to embrace some inevitable backtracking (i.e. college curricula should not forget to include some courses on modelling the world with the math, thoroughly explaining why the underlying theory is actually useful in describing and/or predicting reality).
PS: I also think there is still a lot of focus in resolving problems manually.
>While I’ve led a data science team, I’ve never taken a data science course — so I’m not sure what it teaches. But I do feel pretty confident in saying that I think math does lose its usefulness around after trig.
Statements like this are a big part of the reason statisticians never trust anyone who works in "data science". The whole field is basically applied statistics/calculus and you're saying none of that is useful.
Sorry, the statement I made wasn't intended to be connected that way. Data science uses a bunch of math beyond trig. I meant that in general math beyond trig becomes much less useful. I was talking about the general usefulness of different levels/types of taught math for white collar jobs/living. Not what is of use for data science.
> They continue to recommend statistics. They removed data science from a sentence that also included calculus and statistics.
I knew a professor from a math department from a top European university who taught data science courses who swore that data science and data mining were just marketing terms invented to sell statistics.
Geometry isn't really any more "fundamental" than statistics (for concreteness, let's say the geometry covered by the SAT and statistics as covered by the AP Statistics exam). Maybe they are using it as a proxy for formal proofs? A proof-based course in probability would actually be a lot more fundamental than either geometry or statistics, I think.
I don’t like the whole layer cake of math that we do. Still, in a traditional geometry course you get a lot of pieces that help with trig and calculus, and an exposure to at least informal proofs.
A whole lot of stuff in AP Stats is a relatively dead end for many people, but geometry and geometric reasoning is necessary for all kinds of engineering-ish math.
I think geometry and trig are pretty related, and have a lot of relevance in calculus especially multivariate. To your point geometry is also the first place formal proofs take shape. That said I think geometry and trig could each be a quarter of a year long and be taught to the extent needed for almost any pursuit, with supplemental at point of need. In my education geometry and trig were two entire years, and it was so dull I lost all interest in math until I took calculus. I agree a probability course would be useful, but I don’t think probability before calculus is an awesome idea. Statistics and probability can be taught at a rudimentary level without calculus, but insight really requires calculus and linear algebra. I found taking a non calc stats and probability course made the calc version harder.
While trig is used for starting on calculus now, in practice, I never had to use trig for ML. Most of the work was in numerical differentiation and integration. I would think trig's usefulness is more in some hard sciences while ML has a much more horizontal applicability.
It is possible to teach calculus without trig (just for polynomials) and I think it is very useful just at that level.
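As a rough sketch of what numerical differentiation and integration look like on a plain polynomial (no trig involved), assuming a central difference and the trapezoid rule; the function names and step sizes here are illustrative choices:

```python
def central_difference(f, x: float, h: float = 1e-5) -> float:
    """Approximate f'(x) from two nearby function values."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def trapezoid(f, a: float, b: float, steps: int = 10_000) -> float:
    """Approximate the integral of f over [a, b] with the trapezoid rule."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for i in range(1, steps):
        total += f(a + i * h)
    return total * h


def cube(x: float) -> float:
    """Example polynomial x^3, with derivative 3x^2 and antiderivative x^4 / 4."""
    return x ** 3


if __name__ == "__main__":
    print("slope at x=2:", central_difference(cube, 2.0))   # exact answer is 12
    print("area on [0, 1]:", trapezoid(cube, 0.0, 1.0))     # exact answer is 0.25
```

The exact answers from the power rule are what let you judge how far off the numerical ones are.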
It seems hard to conceive of a world where e^ix isn’t important in ML, unless that ML is sans probability, neural networks, or really most anything useful. Perhaps for regressions, so long as they have no periodic component. I think you probably can mechanically, without understanding, skate by in a job without any understanding of trig, but I don’t think you can understand much ML without it, and certainly can’t reason about limitations of an ML technique. While you might not directly use trig, I feel you must use things that were taught using trig to justify the technique and bound its applicability.
But really trig isn’t very complex a topic. I don’t think you should attempt to avoid teaching it. I just think it’s like a 1 month topic that is filled in as you learn calculus, linear algebra, and physics. The real intuition of trig comes from the use of it in other areas, and as a standalone subject it’s just boring.
Familiarity with pandas and familiarity with taking an integral aren’t even close to the same thing. I don’t think it ever made sense to group them together.
Do Stanford or Harvard have an equivalent of UC BIDS (UC Berkeley Institute of Data Science)?
> This is the [open] textbook for the Foundations of Data Science class at UC Berkeley: "Computational and Inferential Thinking: The Foundations of Data Science" http://inferentialthinking.com/ (JupyterBook w/ notebooks and MyST Markdown)
> Data literacy is distinguished from statistical literacy since it involves understanding what data means, including the ability to read graphs and charts as well as draw conclusions from data.[6] Statistical literacy, on the other hand, refers to the "ability to read and interpret summary statistics in everyday media" such as graphs, tables, statements, surveys, and studies. [6]
Data Literacy and Statistical Literacy are essential for good leadership. For citizens to be capable of Evidence-Based Policy, we need Data Driven Journalism (DDJ) and curricular data science in the public high schools.
It's amazing how many bad ideas, if you scroll down far enough, are justified by an appeal to "equity." Which usually translates into dumbing things down.
A high school "data science" course, if designed properly, will be far more useful to students and beneficial to society than calculus.
Every high school student should learn how to grapple with uncertainty, how to evaluate statistical claims and experiments, how to interpret graphs and charts, understand how machine learning models work (at a high level), and internalize concepts like "significance", "error bars", and "expected value."
This training will help all students every single day of their lives, because it teaches them how to think. Society benefits from having more people with the tools to evaluate data and deal with uncertainty, especially as we face a looming epistemological crisis.
Calculus, on the other hand, will be used by very few students, and even for those few, it will not likely be used every day. Yes, it is a prerequisite for some STEM courses as part of a degree program, and so calculus can be taught to undergraduates pursuing a STEM field in their first year (or those who take it as an elective in high school.)
It's a shame that Stanford and Harvard, which set the tone for high schools and high schoolers, are going the wrong direction here.
> A high school "data science" course, if designed properly, will be far more useful to students and beneficial to society than calculus.
How do you expect students to understand what they are doing with "data science" without learning probability and statistics, and how do you expect students to get probability and statistics without learning calculus?
I mean, Bayes' theorem. How do you get people to get it if they don't know calculus?
High schools often teach physics without calculus as a prerequisite. It definitely makes it more challenging, but you can still communicate the concepts at a different level of detail.
You can definitely explain Bayes' theorem without calculus. I just asked ChatGPT to do it and it came up with a great example using a deck of cards and some fraction math.
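For what it's worth, here is one way a card example can go. This is not the example ChatGPT produced, just an illustrative sketch that uses counting and fractions only. The question and helper names are my own: given that a drawn card is a face card, what is the probability that it is a king?

```python
from fractions import Fraction
from itertools import product

RANKS = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
SUITS = ["clubs", "diamonds", "hearts", "spades"]
DECK = list(product(RANKS, SUITS))  # 52 (rank, suit) pairs


def prob(event) -> Fraction:
    """Probability of an event as (matching cards) / (all cards)."""
    return Fraction(sum(1 for card in DECK if event(card)), len(DECK))


def is_king(card) -> bool:
    return card[0] == "K"


def is_face(card) -> bool:
    return card[0] in {"J", "Q", "K"}


p_king = prob(is_king)                                    # 4/52
p_face = prob(is_face)                                    # 12/52
p_face_given_king = prob(lambda c: is_face(c) and is_king(c)) / p_king  # every king is a face card

# Bayes' theorem: P(king | face) = P(face | king) * P(king) / P(face)
p_king_given_face = p_face_given_king * p_king / p_face

# Direct count for comparison: kings among the face cards.
face_cards = [card for card in DECK if is_face(card)]
direct = Fraction(sum(1 for card in face_cards if is_king(card)), len(face_cards))
```

Both routes give 1/3, and nothing beyond fraction arithmetic is needed.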
You can sidestep calculus by just using the discrete setting rather than a continuous one.
If you want to introduce continuous distributions like the Gaussian one, you can just say "area under the curve" if you need to connect the density to a numerical probability. They don't have to know how to do the integral; in the case of a Gaussian, it's just tabulated anyway.
I'd argue that you could teach a perfectly reasonable high school stats class using this kind of approach.
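A small sketch of the "area under the curve" idea for a standard Gaussian. `normal_cdf` plays the role of the printed table (it relies on `math.erf` from the standard library), and `area_by_rectangles` adds up thin rectangles under the density. The function names and step count are assumptions made for illustration.

```python
import math


def normal_pdf(x: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    """Height of the Gaussian density curve at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def normal_cdf(x: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    """Area under the curve to the left of x (what a z-table lists)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def area_by_rectangles(a: float, b: float, steps: int = 100_000) -> float:
    """Add up thin rectangles under the standard density between a and b."""
    width = (b - a) / steps
    heights = (normal_pdf(a + (i + 0.5) * width) for i in range(steps))
    return sum(heights) * width


# Probability of landing within one standard deviation of the mean.
from_table = normal_cdf(1.0) - normal_cdf(-1.0)
from_rectangles = area_by_rectangles(-1.0, 1.0)
```

Both numbers come out at roughly 0.68, the familiar "68% within one sigma" figure, without the student ever evaluating an integral by hand.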
A "calculus-free" method is mostly what is done for high school physics, with occasional nods in that direction to set the students up later. And like physics, the obvious connection of continuous probability to calculus will be a nice motivation later on.
One analogy is how we teach probability to sophisticated engineering undergraduates. I'm not aware of undergrad engineering curricula that use measure theory. This results in awkwardness around delta "functions" and probabilities of certain sets of measure zero (sets that cannot be integrated without the Lebesgue integral).
And sure, some of those undergrads don't ever take that measure theory class, so they escape to the wild without knowing the answers to awkward questions.
> If you want to introduce continuous distributions like the Gaussian one, you can just say "area under the curve" if you need to connect the density to a numerical probability.
What name do you give to this "area under the curve", or the "rate of change" of this area? They are pretty fundamental concepts with important and basic properties, which affect things like local optima and minimization, and expected value and covariance, etc. I mean, you can't cover linear models and least squares without this stuff, and if you don't then I wouldn't really call it learning.
You call “area under the curve”… area under the curve. Expected values, least squares, linear model, etc can all be explained in the discrete case without calculus.
High school math isn’t and doesn’t need to be rigorously proof-based. If you lack some of the tooling necessary to demonstrate a proof, you can tell a student, “the proof requires calculus” and boom, you’ve given them a reason to take an interest in the subject.
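A minimal sketch of the discrete versions, using only sums and fractions. The data points are made up, and the slope formula (covariance over variance) is stated rather than derived here; it can be justified by completing the square instead of differentiating.

```python
from fractions import Fraction


def expected_value(outcomes) -> Fraction:
    """Sum of value * probability over a finite list of (value, probability)."""
    return sum(p * v for v, p in outcomes)


def mean(values) -> Fraction:
    return sum(values) / len(values)


def covariance(xs, ys) -> Fraction:
    """Average product of deviations from the two means."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def least_squares_line(xs, ys):
    """Slope = cov(x, y) / var(x); the line passes through the point of means."""
    slope = covariance(xs, ys) / covariance(xs, xs)
    intercept = mean(ys) - slope * mean(xs)
    return slope, intercept


# A fair six-sided die.
die = [(Fraction(face), Fraction(1, 6)) for face in range(1, 7)]

# Made-up points that lie exactly on y = 2x + 1.
xs = [Fraction(v) for v in (1, 2, 3, 4)]
ys = [Fraction(v) for v in (3, 5, 7, 9)]
```

Here `expected_value(die)` is 7/2 and `least_squares_line(xs, ys)` recovers slope 2 and intercept 1.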
You don’t need integration to define expected value or covariance in the discrete case. TBH I’m not sure if you can get around integration in the general continuous case or not.
If not, you could use some limiting argument to handle the moments of a continuous uniform RV, at least, in terms of the discrete analog.
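A sketch of that limiting argument, assuming the discrete analog is a variable equally likely to be 1/n, 2/n, ..., n/n. As n grows, its mean and variance approach 1/2 and 1/12, the moments of the continuous uniform on [0, 1]. The function name is hypothetical.

```python
from fractions import Fraction


def discrete_uniform_moments(n: int):
    """Mean and variance of a variable equally likely to be 1/n, 2/n, ..., n/n."""
    values = [Fraction(k, n) for k in range(1, n + 1)]
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    return mean, variance


if __name__ == "__main__":
    for n in (10, 100, 1000):
        m, v = discrete_uniform_moments(n)
        print(n, float(m), float(v))
```

The closed forms are (n + 1) / (2n) for the mean and (n^2 - 1) / (12 n^2) for the variance, so the gaps shrink like 1/n and 1/n^2 respectively.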
You don’t need calculus to derive least squares estimators. You can follow the logic in this Quora answer [1] to show that (e.g.) the mean is the minimum MSE estimator among constant functions, and that the conditional mean is the minimum MSE estimator among “general” (measurable L2) functions.
This derivation is familiar to many who have studied these concepts. It’s clever, it does not need differentiation, just expectation and logic.
It could be that your studies in probability were done using a certain pedagogical path, and that’s blinding you to the fact that other paths are possible.
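For the constant case, one standard version of the argument (not necessarily the exact one in the linked answer) adds and subtracts the mean and uses only linearity of expectation. Here \mu denotes E[X] and c is any constant guess:

```latex
\begin{aligned}
\mathbb{E}\big[(X - c)^2\big]
  &= \mathbb{E}\big[\big((X - \mu) + (\mu - c)\big)^2\big] \\
  &= \mathbb{E}\big[(X - \mu)^2\big]
     + 2(\mu - c)\,\mathbb{E}[X - \mu]
     + (\mu - c)^2 \\
  &= \operatorname{Var}(X) + (\mu - c)^2
  \;\ge\; \operatorname{Var}(X).
\end{aligned}
```

The cross term vanishes because E[X - \mu] = 0, and the bound is attained exactly when c = \mu, so the mean minimizes the MSE with no derivative taken.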
I don't recall Bayes' theorem involving calculus. Are you sure you aren't thinking of some other theorem?
Bayes' theorem follows straightforwardly from P(A & B) = P(A|B) P(B) and P(A & B) = P(B & A). The latter tells us that we can swap A and B in the former without changing the value, giving us P(A|B) P(B) = P(B|A) P(A).
Rearranging gives P(A|B) = P(B|A) P(A) / P(B), which is Bayes' theorem.
> Every high school student should learn how to grapple with uncertainty, how to evaluate statistical claims and experiments, how to interpret graphs and charts, understand how machine learning models work (at a high level), and internalize concepts like "significance", "error bars", and "expected value."
Pet peeve: can we just go back to calling these things statistics?
While I agree with you that statistics should be more heavily emphasized at the high school level, the issue goes much deeper within American math education than the one class.
I would assume that a data science class is mostly "good old statistics." But if "data science" is the phrase that gets education boards to put more student butts in seats in stats class, I'm all for it.
Wouldn’t a data science curriculum be more multi-disciplinary than a ‘statistics’ course?
Visualization, scripting, data collection, models, simulation. edX had a great course by Guttag and Grimson. Add to this Scott E Page’s Model Thinking. Add edX Data Analytics and Learning from UT Arlington. And some Tufte.
I say these because I work in the accounting field and brought scripting to my firm from my own self-study. It’s been a superpower for me, and solved several problems which my colleagues had tackled using Excel alone.
I’ve also studied statistics, but found it less generally useful.
>Wouldn’t a data science curriculum be more multi-disciplinary than a ‘statistics’ course?
I would say yes, however, the items listed in the comment I quoted fall squarely within the realm of statistics. I don’t have a problem with calling a curriculum of statistics + data manipulation tools “data science” but that’s not what’s realistically being covered in these high school programs.
> concepts like "significance", "error bars", and "expected value."
Yes. I see what you're responding to--these are squarely in the statistics domain.
> not what’s realistically being covered in these high school programs.
Yes. Where the rubber meets the road. Who exiting from higher education now will have the skills to teach this imagined hybrid course? Realistically, they have to be vetted and hired by the mathematics department and satisfy some state and/or federal standards of education, which are currently staffed by educators who themselves are following standards of their office.
I was responding to the OP's premise:
> "data science" course, if designed properly, will be far more useful to students and beneficial to society than calculus.
Whether or not that objective is "realistic" given the current boundaries prescribed for high school education is another matter.
There is hope; there are modern thinkers in education out there. I referenced the UT Arlington course students and instructors referred to as DALMOOC (google it). I took this course thinking it was another data science course, and found a course taught by teachers for teachers. I hung in because their ideas were so fresh and interesting.
DALMOOC's ambition was to train teachers to encourage students to use social media to communicate their learning results, and in turn produce the data that the teachers were being trained in the course to analyze using social media analysis techniques. DALMOOC professors encouraged participants to generate social media responses to DALMOOC coursework. Very modern. Not sure how long before professors like George Siemens, whose brainchild DALMOOC was, get into state and federal positions of authority and influence to see their modern ideas at the high school level.
If the goal is to teach them basic statistics to be useful and not to do science with it, then just make them watch a few YouTube videos on the topic as part of their 9th grade math class?
The article is written poorly with a click bait title.
This is the Stanford guidance. Mathematics: four years of rigorous mathematics incorporating a solid grounding in fundamental skills (algebra, geometry, trigonometry). We also welcome additional mathematical preparation, including calculus and statistics.
This is the Harvard guidance. Update to math curricular guidance:
There is no single academic path we expect all students to follow, but the strongest applicants take the most rigorous secondary school curricula available to them. We receive many questions specifically about what type of math courses students should take. Applicants to Harvard should excel in a challenging high school math sequence corresponding to their educational interests and aspirations. Rigorous and relevant data science, computer science, statistics, mathematical modeling, calculus, and other advanced math classes are given equal consideration in the application process.
Seems like a mischaracterization because I don't see a problem.
UC CS undergrads had to take statistics for engineers and scientists.
UC CS undergrad majors in particular could end up within 2 courses of a math undergrad degree. Is it not the case that squishier applied courses are possible?
EE/CS undergrads had to take the entire upper-division physics track for scientists and engineers, including modern physics.
So has something changed since then and is something changing back?