A URL shortener service in 45 lines of Scala (grasswire-engineering.tumblr.com)
27 points by lauriswtf | karma 1324 | avg karma 6.1 | 2014-08-21 12:48:15+00:00 | 23 comments




Now that Twitter auto-shortens URLs (and expands them in the UI), is there really a use left for URL-shorteners? Are there any other major platforms where text length matters?

The biggest advantage of URL shorteners, apart from length, is the ability to quickly generate trackable URLs instead of constantly adding campaign parameters like grasswire.com/1234?refid=twitter, which is how a lot of marketers track where their clicks come from.

This is only beneficial to one side of the userbase. For users clicking the links it's only an inconvenience.

Length matters when it comes to print format.

But george-harrison-has-an-opinion is a lot easier to type without mistakes than An6X9z16Q.

And really, both suck, which is why many print magazines collect an edition's links on a special page like magazinename.com/2014/5.


Yeah, they shouldn't be doing it like that. They should have control of their own shortener.

In print: I like how net magazine uses netm.ag/description-123 for any links, so it's easy to type and doesn't take up too much space. This also gives analytics for print articles, where you normally can't see whether anyone followed a link.

You don't need to rely on routing through Twitter. It works when Twitter is blocked or down. Branding.

Also tracking.

Also, you can change where the link points after distributing it, in case you notice a mistake.


Where does this handle getting the same random shortened url twice?

Good catch - it doesn't. We'd need to add a line to check for a collision and retry. Better yet, use hashing instead of a random string, as some folks mentioned in the comments on the blog.
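The collision check and retry mentioned above could look roughly like the following. This is only a sketch, not the blog post's Scala code: `new_key`, `ALPHABET`, and the plain dict standing in for the datastore are all hypothetical names chosen for illustration, and the 36-symbol alphabet is an assumption.

```python
import secrets
import string

# Assumed alphabet: lowercase letters and digits (36 symbols).
ALPHABET = string.ascii_lowercase + string.digits


def new_key(store, url, length=7, max_attempts=10):
    """Store url under a fresh random key, retrying if the key is taken."""
    for _ in range(max_attempts):
        key = "".join(secrets.choice(ALPHABET) for _ in range(length))
        if key not in store:  # collision check
            store[key] = url
            return key
    # Give up instead of looping forever when the key space is nearly full.
    raise RuntimeError("could not find a free key")
```

With a real shared datastore, the check and the write would have to happen atomically (for example, an insert that fails if the key already exists); otherwise two concurrent requests could still claim the same key.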

As far as I am aware, the standard implementation of a URL shortening algorithm is to convert the record's numeric ID to a higher base, not to generate a random string. This has the advantages of faster lookups and smaller storage size.

Here is one I wrote long enough ago for me to be divorced from its implementation flaws: https://github.com/TShadwell/go-shorten/blob/master/shorten/...
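For readers who don't want to follow the link, a minimal sketch of the base-conversion idea is below. It is not taken from the linked Go code; the function names `encode` and `decode` and the ordering of the 62-symbol alphabet are assumptions made for illustration.

```python
import string

# Assumed symbol order: 0-9, then a-z, then A-Z (62 symbols in total).
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase
BASE = len(ALPHABET)


def encode(n):
    """Convert a non-negative integer ID to its base-62 string."""
    if n < 0:
        raise ValueError("ID must be non-negative")
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n:
        n, rem = divmod(n, BASE)
        digits.append(ALPHABET[rem])
    # Digits were produced least significant first, so reverse them.
    return "".join(reversed(digits))


def decode(s):
    """Convert a base-62 string back to the integer ID."""
    n = 0
    for ch in s:
        n = n * BASE + ALPHABET.index(ch)
    return n
```

Because each ID maps to exactly one string and back, the short key never has to be stored separately: the service can decode the path and look the URL up by its integer ID.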


That makes sense. Thanks for the code.

Instead of working on a URL shortener, I suggest you work on not sending usernames and passwords in clear text via HTTP POSTs.

The site is blocked at my work so I can't see the implementation, but here is one I wrote back in March that is 79 lines: https://gist.github.com/eliotfowler/9360895

And another one I found, in 81 lines of Haskell (may be interesting for comparison): https://github.com/ryantrinkle/memoise/blob/master/src/Main....

There's also a presentation to go with it: http://vimeo.com/59109358


I know Scala is very powerful, but I think this code proves to me that Scala code is difficult to read. I'm a pro functional programmer, but Scala mixes too many concepts.

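'Pseudo' Python URL shortener in 14 lines of code:

  import base64
  import redis
  from webapp2 import RequestHandler

  store = redis.Redis()  # assumes a Redis server on localhost

  def shorten_url(url):
      return base64.urlsafe_b64encode(url.encode()).decode()[:7]  # first 7 chars as key

  class Handler(RequestHandler):
      def get(self, path):
          self.redirect(store.get(path).decode())  # look up the key, redirect to the stored URL
      def post(self, path):
          store.set(shorten_url(path), path)  # store the URL under its shortened key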

I believe you'll have a lot of collisions that way, as any two URLs whose base64 encodings share the same first 7 characters will have the same key. Those 7 characters cover only about the first 5 bytes of the input, so every URL starting with http:// maps to the same key. The Scala version uses 7 random characters, which I think will lead to a collision after about sqrt(36^7) entries due to the birthday paradox (sqrt(62^7) if alphanumeric includes both uppercase and lowercase). That's better, but I'd recommend just using something autoincrementing instead, which in theory makes collisions impossible.
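To put rough numbers on the birthday estimate, here is a small sketch. The helper names `birthday_threshold` and `collision_probability` are made up for this example, and the probability uses the usual approximation 1 - exp(-n(n-1)/(2N)) for n random keys drawn from N possibilities.

```python
import math


def birthday_threshold(alphabet_size, length):
    """Square root of the key space: the order of magnitude where collisions become likely."""
    return math.sqrt(alphabet_size ** length)


def collision_probability(n, alphabet_size, length):
    """Approximate chance that n uniformly random keys contain at least one collision."""
    space = alphabet_size ** length
    return 1.0 - math.exp(-n * (n - 1) / (2.0 * space))


for size in (36, 62):
    n = round(birthday_threshold(size, 7))
    print(size, n, round(collision_probability(n, size, 7), 3))
```

For 7 characters this gives 279,936 entries for a 36-symbol alphabet and roughly 1.88 million for 62 symbols. Under this approximation the chance of at least one collision at those sizes is already around 39%, so "a collision after about sqrt(N) entries" is an order-of-magnitude statement rather than an exact cutoff.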

I agree. It was a short pseudo implementation.

Here's another one in 27 lines of Clojure: http://beauhinks.com/simple-url-shortener-with-clojure/


What would be the minimum path length in the shortened URL that gives enough storage capacity without letting users easily guess other URLs?
