RoR vs. What else?
7 points by AshwinRamasamy | karma 10 | avg karma 1.11 | 2012-07-19 08:16:27+00:00 | 18 comments

We are building an app that will eventually use machine learning techniques to give recommendations to users. We did our MVP in PHP and then ditched it to build a new version on RoR. Now people say that RoR does not scale well (MySQL slows down the experience) and is certainly not a great tech for machine learning. I am not quite technical. Should I just go with RoR (investments already made) or change tech? What's the pragmatic call?



Can't you use RoR with a different storage backend like MongoDB? It's hard to say without more detail on the algorithms and persistent data structures.

Your web stack and your machine learning stack don't have to be the same. http://scikit-learn.org/ is Python and very popular for machine learning. Here is an in-depth training video: http://pyvideo.org/video/972/tutorial-scikit-learn-machine-l...

http://orange.biolab.si/ is also good. It has a nice GUI.

Do you know of any serious framework for image processing written in Python?

Use a different stack for machine learning, and switch over to Postgres - I'm fairly sure it'll scale better than MySQL.

MySQL is as capable as Postgres and can be tuned to be just as fast. The differences between the two are in features, not performance.

Hm, didn't know that, thanks!

Don't believe the people who say RoR doesn't scale. Normally the framework isn't the bottleneck. The most critical part (imho) is your database design (e.g. using shards for user data, using Solr rather than the db for searches), ...
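
To make "shards for user data" concrete, here is a minimal sketch of hash-based shard routing. The shard names and the choice of SHA-256 are assumptions made for illustration, not something from this thread or from any particular Rails setup.

```python
import hashlib

# Hypothetical shard names; a real deployment would map these to connections.
SHARDS = ["users_db_0", "users_db_1", "users_db_2", "users_db_3"]


def shard_for_user(user_id: int, shard_count: int) -> int:
    """Map a user id to a stable shard index in [0, shard_count)."""
    if shard_count <= 0:
        raise ValueError("shard_count must be positive")
    # Hash the id so that sequential ids spread evenly across shards.
    digest = hashlib.sha256(str(user_id).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def database_for_user(user_id: int) -> str:
    """Return the name of the shard that holds this user's rows."""
    return SHARDS[shard_for_user(user_id, len(SHARDS))]
```

One caveat with plain modulo routing: changing the number of shards remaps most users, so a lookup table or consistent hashing is usually preferred once resharding becomes likely.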

The pragmatic call is to wait until you need to scale, if you ever do.

This. It's less about scaling a framework than it is about scaling your particular application.

If you find that MySQL is a bottleneck in particular, Ruby (and by extension Rails) has a wide range of database adapters and ORMs available. Active Record itself has numerous adapters, which means changing databases within the supported set shouldn't be too difficult.

Alternatively, you could use a different ORM such as DataMapper or one specific to the database you want. Rails now lets you choose which ORM to use, although if you're just getting started it may add to the learning curve, as most of the documentation uses Active Record.


I don't see a problem in using PHP. Consider how you break down your infrastructure. Don't build monolithic code-bases which you cannot maintain. Make small compact services (SOA) which you know you can build out quickly. RoR is for people who want to spend a lot of time learning the framework. At this point, it's so large you will probably never even feel like you can encompass its full size. Kinda like Django. Don't ditch PHP because people say it doesn't scale. YouPorn.com thought it scaled quite well. They get a crap load of traffic to boot.

[edit - one additional point which is probably more important than the rest. if you've already done one rewrite and you're thinking of another, your problem is you probably don't know how to move the ball down the field in a business sense so you're picking up the only hammer you have - technology choice, and repeatedly swinging it. Take a step back, ask yourself if the problem is fear of the unknown, and attack the true problem - how do you get users to USE your product. If truly the answer is "my users don't want me to use Rails", I'd be shocked.]

First, I'd say I don't think you have anything to worry about.

Second, don't go replacing MySQL because you're worried about scale. I have yet to find a database that scales better without real effort. You can trade a whole lot of development time to use something like Cassandra or Riak (which are both great for particular applications). Then you spend a huge, huge amount of time building queries that were easy in Rails, and you haven't actually solved any end-user facing problems. You've just made it so that only one of your developers can create queries and nobody can maintain the code.

Third, don't go replacing Rails because you're worried about scale. It's not the fastest thing but it's just not worth losing sleep over.

Just add the appropriate technology where you need it - I probably wouldn't build a recommendation engine in Ruby; perhaps that is the only piece that you build using something else.

Here's what we often do @ Inaka - web and admin in Rails, business "logic" and stuff that needs to scale (either because we have tons of socket connections or because we have a lot of data to crunch) in Erlang. (Insert Java or Python or whatever you need for the backend piece in place of Erlang.)

Then build an internal HTTP API that Rails can talk to. Abstract that in a model class. Have Rails call into it as needed. RecommendationEngine.recommend(...) returns JSON with the magical results for your Rails app to render appropriately.
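
A rough sketch of that wrapper, written in Python to keep it short (the same shape carries over to a Rails model class). The `/recommendations` path, the `X-Api-Key` header, the `items` field and the `fetch` parameter are all assumptions made for illustration, not the actual API described here.

```python
import json
import urllib.parse
import urllib.request
from typing import Callable


def http_get_json(url: str, headers: dict) -> dict:
    """Default transport: GET the URL and decode the JSON response body."""
    request = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(request, timeout=2.0) as response:
        return json.loads(response.read().decode("utf-8"))


class RecommendationEngine:
    """Thin client that hides the internal HTTP API behind one method."""

    def __init__(self, base_url: str, api_key: str,
                 fetch: Callable[[str, dict], dict] = http_get_json):
        self.base_url = base_url.rstrip("/")
        self.api_key = api_key
        self.fetch = fetch  # injectable so the web app can be tested offline

    def recommend(self, user_id: int, limit: int = 10) -> list:
        """Ask the backend for recommendations and return the item list."""
        query = urllib.parse.urlencode({"user_id": user_id, "limit": limit})
        url = f"{self.base_url}/recommendations?{query}"
        payload = self.fetch(url, {"X-Api-Key": self.api_key})
        return payload.get("items", [])
```

Because the transport is passed in, the web app's tests can swap in a stub and never touch the network, and the backend can be rewritten in another language without the callers noticing.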

Typically we give our Rails users an "api key" (that they don't know about) that is used to authenticate calls to the backend service, so we don't even have to share user authentication schemes between the systems. Then you can use devise or whatever you want for Rails but don't have to re-implement the same password hashing algorithms - authenticating those users is just a quick lookup in the user table.
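
On the backend side, the "quick lookup" could look roughly like this. It is a sketch using an in-memory SQLite table as a stand-in for the real user table; the table layout and function names are hypothetical.

```python
import secrets
import sqlite3
from typing import Optional


def open_user_store() -> sqlite3.Connection:
    """Create an in-memory user table (stand-in for the shared user table)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users ("
        "id INTEGER PRIMARY KEY, "
        "email TEXT NOT NULL, "
        "api_key TEXT NOT NULL UNIQUE)"  # UNIQUE also gives the lookup an index
    )
    return conn


def create_user(conn: sqlite3.Connection, email: str) -> str:
    """Insert a user with a random API key the user never needs to see."""
    api_key = secrets.token_hex(32)
    conn.execute(
        "INSERT INTO users (email, api_key) VALUES (?, ?)", (email, api_key)
    )
    return api_key


def authenticate(conn: sqlite3.Connection, api_key: str) -> Optional[int]:
    """Return the user id for a valid key, or None if the key is unknown."""
    row = conn.execute(
        "SELECT id FROM users WHERE api_key = ?", (api_key,)
    ).fetchone()
    return row[0] if row else None
```

The web app keeps its own login and password hashing; the backend only ever sees the key, which is why the two systems don't have to share an authentication scheme.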

Sometimes that service API may even be part of a publicly available HTTP API. For instance, imagine the backend piece exposes part of its API to mobile devices. Then, some of the methods may be authenticated and some are open to the world.


Seconding that.

PHP doesn't scale. Java doesn't scale. Python doesn't scale.

If you wait for something that "scales", you'll never ship anything.

Scaling isn't done automatically, it's something that's applied to a problem. If MySQL doesn't scale, and it has proven itself to be very capable in a wide variety of circumstances, then what does?

The more you read about scaling, the more you realize there's no magic bullet, no magically scalable language or platform or framework. It's all about careful investigation of the nature of the problems, the bottlenecks, and developing solutions to address those.


Replace MySQL with Percona XtraDB, and leverage aggressive "permacaching" for a start. Then it's more about splitting up your infrastructure and having mechanisms for dealing with increased load on different parts of it (i.e. high db load = spin up more DB machines).

You have no idea what the problem is and you're already advocating a change of platform? This is not how you scale. This is how you hit up a client for thousands of hours of "consulting" fees.

This performance problem could be because of missing indexes, grossly inefficient queries, or a whole host of other elementary problems that can be fixed with a keystroke.
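
To illustrate the missing-index case, here is a small sketch that checks the query plan before and after adding an index. It uses SQLite from the Python standard library purely as a stand-in (MySQL has its own `EXPLAIN`), and the `orders` table is made up for the example.

```python
import sqlite3


def query_plan(conn: sqlite3.Connection, sql: str, params: tuple = ()) -> str:
    """Return SQLite's human-readable plan for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each row holds the plan description.
    return " | ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (user_id, total) VALUES (?, ?)",
    [(i % 100, i * 1.5) for i in range(1000)],
)

sql = "SELECT total FROM orders WHERE user_id = ?"

# Without an index, the filter on user_id forces a scan of the whole table.
plan_before = query_plan(conn, sql, (42,))

# A single statement adds the index; the same query can now use it.
conn.execute("CREATE INDEX idx_orders_user_id ON orders (user_id)")
plan_after = query_plan(conn, sql, (42,))

print(plan_before)
print(plan_after)
```

The first plan reports a table scan and the second a search using `idx_orders_user_id`; reading the plan for slow queries is usually the first step before considering any platform change.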


Not advocating a change of platform at all - Percona is a drop-in replacement for MySQL, and XtraDB is backwards compatible with InnoDB which gets much more performance out of multicore systems. Unless your server has one core, it's a no-brainer.

Much appreciate the insights. We are actually less worried about scale now (we just got users and some revenue, and we are a long way from scale problems). We did not want to go so far down this path that we accumulate enough tech debt to force a redo. Your answers tell us that it's okay to keep going the way we are, and that there are still ways to keep the recommendations (the machine learning part) separate.

I shall get back here to post a link to our product in about a week when we launch. Would very much appreciate comments then!

