A Pamphlet against R: Computational Intelligence in Guile Scheme (2016)

thomasfedb | karma 1203 | avg karma 3.0 · 2018-07-30 07:25:35+00:00

I admit I was a bit surprised that 'pamphlet' extends to 75 page documents.

sgt101 | karma 7195 | avg karma 1.98 · 2018-07-30 07:26:30+00:00

This (metaprogramming, macros, language as a datastructure in the language)is why I learned Julia, but R + R-studio still draw me back by being just so useful.

Tarq0n | karma 1218 | avg karma 2.79 · 2018-07-30 13:11:39+00:00

R actually has pretty nice metaprogramming capabilities though. The tidyverse heavily relies on them to create its DSL.

sgt101 | karma 7195 | avg karma 1.98 · 2018-07-30 13:51:24+00:00

Agree and tidyverse is another reason to use R. I think Julia is a programers language though.

groceryheist | karma 319 | avg karma 4.25 · 2018-07-30 02:30:01

Mother of all clickbait titles.

I read the first few pages and didn't see any content about R. The first chapter is an introduction to scheme. I scanned it and didn't see any R code all. It's an introduction to scheme that builds up to K means.

R does suck in so many ways, especially the object oriented systems and the horrible default libraries with so many inconsistent idioms, but we put up with it because mainly because it provides the most advanced open source ecosystem for modeling and visualizing data and it is also pretty fast, especially with data.table. If being fast is important we have Julia as a competitor. If having a better designed language with a reasonably complete ecosystem and sometimes being slower is OK we have python. Like pretty much every other domain area, there just isn't a competitive niche in data science for a lisp.

Also, R is already pretty functional and Lisp-like, sure it doesn't have prefix syntax, but it does have first class functions. Using various types of map (lapply, apply &c) is practically mandatory after a certain point.

I love the Wizard Book too, but this is embarrassing.

reply

chairmanwow | karma 821 | avg karma 4.32 · 2018-07-30 02:46:31

Thanks for saving me a click.

discardable_dan | karma 810 | avg karma 4.2 · 2018-07-30 03:50:18

My main beef with R is that this is valid code:

    > f <- function (n) {
    +     f <- 1
    +     if (n == 0) return(f) else n * f(n-1)}
    > f(5)
    [1] 120

The language has some warts, that's for sure.

hatmatrix | karma 1118 | avg karma 2.64 · 2018-07-30 09:29:18+00:00

Isn't that just lisp-2-like behavior where functions and data maintain separate identities?

kgwgk | karma 14700 | avg karma 2.94 · 2018-07-30 10:41:45+00:00

No, if you set f <- 42 between the moment where you define f as a function and the moment where you call it then it doesn't work anymore:

  > f(5)
  Error in f(5) : could not find function "f"

Tarq0n | karma 1218 | avg karma 2.79 · 2018-07-30 13:14:17+00:00

Functions having their own environment is actually pretty useful once you get your head around them.

naiveai | karma 114 | avg karma 1.78 · 2018-07-30 10:26:28

First class functions does not a functional language make.

ww520 | karma 9957 | avg karma 2.49 · 2018-07-30 07:31:36+00:00

This is a mis-titled (disingenuous?) introductory book to Scheme, basically using R to draw eyeballs.

Also it's barking up the wrong tree. R is known for its statistics prowess and working with large datasets. ML/AI is just a small subset of it. What's the equivalence in Guile that deals with statistics and data processing?

reply

jerry40 | karma 64 | avg karma 0.62 · 2018-07-30 02:40:18

I've written a jupiter notebook kernel for Guile Scheme https://github.com/jerry40/guile-kernel , I wonder if it is possible to set up a dedicated server using Jupyterhub and made it public in order to make articles about Guile more 'live' and perhaps put SICP examples there but I have no experience in this matter. If anyone give a piece of advice to me, I will be grateful.

deugtniet | karma 708 | avg karma 7.45 · 2018-07-30 07:48:32+00:00

Slightly off topic, but after reading a couple of pages from the pamphlet I don't think I'm going to change to a lisp like language anytime soon. My choice of language is between Python and R.

I'm in a position where I can freely choose my own language for my analysis and modeling work, and use Python and R in my day to day.

The thing that really put me off R is the opaqueness of the language. In python I like to think I have a rudimentary understanding of how the language is structured , and I can actually read error messages. But in R, if something fails all bets are off, and you just have iteratively comment out code to find the error. Rstudio is usually given as a really great IDE for analysis, but it just crashes too often, and is slow to a point where you just have to wait minutes on end if you click the wrong variable. Pycharm (professional) does the same as Rstudio, and more: I can plot, I can see variables, I have a working debugger and code completion is to my opinion as good as it gets.

Two things in python are not up to snuff: Plotting (ggplot2 is really good) and some statistical models. That's the reason why I still sometimes have to fire up Rstudio. I hope I can phase out R in my work as much as possible though.

reply

peatmoss | karma 6470 | avg karma 3.69 · 2018-07-30 14:36:35+00:00

Yeah, I do think that lisps have something to offer the stats community, but the “hey just build everything from scratch because look how easy these trivial examples are” sales pitch rings hollow to me. And I’m an avowed fan of lispy things.

dswalter | karma 724 | avg karma 4.83 · 2018-07-30 14:53:19+00:00

I don't think there is anyone in the python ecosystem who is as good as Hadley Wickham at understanding data analysts and designing API's that a budding analyst can reason through and understand.

ggplot2 is of course the prime example. It simply and elegantly allows characteristics of your visualization to either vary based on data, or remain fixed to your specifications. Since so many data visualizations in science and industry are really variations on that theme, you can do the vast majority of what you want.

reply

usgroup | karma 2545 | avg karma 2.14 · 2018-07-30 07:50:12+00:00

2nd order Clickbait ... Avoid.

safgasCVS | karma 279 | avg karma 2.23 · 2018-07-30 08:05:47+00:00

I don't understand how this relates to R? It starts of saying R is bad (agreed) but then what it just goes off on a tangent about something unrelated without really explaining what's better.

I've said this before and will say it repeatedly - R is a bad programming language by design to allow it to be a brilliant tool for stats/data science. Language geeks who approach data science from a comp sci background seem to miss this point - R works precisely because it's not a good language.

I'm perfectly happy to discuss a better alternative but without first understanding why R is used so widely in the first place misses the point of the discussion

reply

Gatsky | karma 6216 | avg karma 3.46 · 2018-07-30 03:20:30

This should be an auto-reply to every comment whinging about how R is horrible.

hatmatrix | karma 1118 | avg karma 2.64 · 2018-07-30 04:32:10

What aspect of not being a good language makes it work well for mathematicians?

kgwgk | karma 14700 | avg karma 2.94 · 2018-07-30 11:21:16+00:00

Mathematicians?

wodenokoto | karma 15554 | avg karma 2.59 · 2018-07-30 21:15:24+00:00

It has many nice shortcuts for exploring and manipulating statistical data. These shortcuts come at the cost of consistency.

mangecoeur | karma 2005 | avg karma 7.04 · 2018-07-30 03:49:19

So apart from not really being about R, this author also seems to completely miss the point of scientific computing: it's not about having the cleverest language, it's about having a solid ecosystem and community building useful tools (e.g. RStudio). People work around language warts pretty easily, most people doing scientific computing just need to get shit done and don't give a rats arse about the programing language arcana.

th0ma5 | karma 5148 | avg karma 2.79 · 2018-07-30 08:58:01+00:00

I'd expect a scientific language to be deterministic. A list of numbers as a single element inside another list sometimes in some functions returns the inner list, for example.

hatmatrix | karma 1118 | avg karma 2.64 · 2018-07-30 09:36:05+00:00

R already has a failed lisp predecessor: XLISP-STAT. I don't know that it was bad at all - it was developed by one of the core developers of R and by all accounts was numerically superior. I believe the conclusion regarding its demise was that lisp syntax just does not appear to captivate a large enough audience.

peatmoss | karma 6470 | avg karma 3.69 · 2018-07-30 09:25:06

I’ve wondered out loud a few times whether Racket’s language-oriented could facilitate even better / more natural DSLs than R.

R has lots of little DSLs that reduces the cognitive burden of doing stats for people more concerned with applied research than programming. XLisp-Stat’s syntax barrier might have been overcome by better DSLs.

In the case of Racket, creating new syntaxes isn’t a barrier, though if someone needed to break out of the functionality of your DSL, they’d be right back to Racket (or some other equivalently expressive language like typed-racket or rackjure).

Projects like Julia are neat because they’re researching the possibility of building a high level language that obviates resorting to C, C++, and Fortran when speed is needed.

Equally interesting, but less sexy, is figuring out the human interaction aspect of statistical computing. R is a somewhat organic language that has evolved (and in the case of Tidyverse researched and engineered) DSLs. I’d love to see what stats savvy computer scientists could do with Racket to rethink or improve stats DSLs. If I were a tenured prof with time and research money, I’d surely explore this question.

Put another way, I think Julia is tackling numerical and statistical computing the XBox / Playstation way: pushing the state of the art in performance. I think there’s room for a Nintendo approach that speaks to the humans using the system rather than optimizing the technical bit.

reply

lispm | karma 12407 | avg karma 2.21 · 2018-07-30 09:40:28+00:00

See also:

Back to the Future: Lisp as a Base for a Statistical Computing System

https://www.stat.auckland.ac.nz/~ihaka/downloads/Compstat-20...

reply

thom | karma 8431 | avg karma 3.16 · 2018-07-30 04:54:59

R is probably the best, most productive DSL that exists for data analysis work, but it is a horrible programming language, made worse by the fact that I don't think most of its users think of what they're doing as programming. I've never seen an R project that looked like it had learned anything from software engineering best practice of literally any period in history. Your data is flat, your script starts at the top and runs to the bottom. There are rarely any abstractions.

One thing that gives me hope here is that an increasing number of universities have dedicated Research Software Engineering groups who can explicitly apply their expertise (mostly in performance but often just in writing better software) to the problems being approached by academics. But I've yet to see most computer science adjacent courses (including the plethora of new Data Science courses in the last few years) explicitly tackle software engineering as a valuable discipline.

All that said, the R ecosystem is fantastic and you're wasting time turning your nose up at it. I had hoped Incanter would take off in the Clojure world but it did not. Lately I have been lucky enough to be able to confine my needs to Bayadera, a framework for Bayesian data analysis in Clojure. But I think the grand hopes of Lisp becoming a first class language for data analysis will always fall by the wayside - Lisp people (outside of myself) are smart enough to fulfil their own needs, and you don't have the conveyor belt of new algorithms that R gets from academia (and indeed Python gets from the same place, and increasingly from industry giants).

reply

peatmoss | karma 6470 | avg karma 3.69 · 2018-07-30 14:31:06+00:00

The DSLs are key. As I wrote below, I feel like there is still room for data analysis DSLs that optimize heavily for usability by end users. Racket has best-in-class support for building new languages, and is a nice language itself.

Clojure and Incanter’s failure to capture market share also bums me out. In a lot of ways, I feel that Clojure’s small amount of syntactic sugar and collection abstractions make it a very, very user-friendly language itself even if its meta programming isn’t quite what Racket’s is.

reply

zwaps | karma 5084 | avg karma 5.04 · 2018-07-30 09:56:55+00:00

I wish we could down vote this clickbait garbage.

R isnt good in many respects, but it can run most statistical models.

Get back to me when your language of the day has a package to easily run blundell bond type dynamic panels with different iv and gmm estimators, cause that is what R can do.

Until then, stop wasting my time comparing a useful toolkit for a specific thing with some arbitrary language the day that us used by two people and has never solved a real life problem ever.

And give me my five minutes back.

reply