Consistently Infrequent

December 13, 2011

Unshorten (almost) any URL with R

Filed under: R — Tony Breyal @ 6:57 pm


I was asked by a friend how to find the full final address of a URL which had been shortened via a shortening service (e.g. Twitter's, Google's, and Facebook's own shorteners, TinyURL, etc.). I replied that I had no idea, that maybe he should have a look around online or, possibly, on the R-help list, and that if that didn't turn up anything he could try an online unshortening service.

Two minutes later he came back with this solution from Stack Overflow which, surprisingly to me, contained an answer I had provided about 1.5 years earlier!

This has always been my problem with programming: I learn something useful and then completely forget it. I'm hoping that keeping this blog will help me remember these sorts of things.

The Objective

I want to decode a shortened URL to reveal its full, final web address.

The Solution

The basic idea is to use the getURL function from the RCurl package, telling it to retrieve only the header of the webpage it's connecting to, and then to extract the final URL from the Location field of that header.

decode_short_url <- function(url, ...) {
  # requires the RCurl package

  decode <- function(u) {
    Sys.sleep(0.25)  # brief pause between requests so we don't hammer the service
    x <- try( getURL(u, header = TRUE, nobody = TRUE, followlocation = FALSE, cainfo = system.file("CurlSSL", "cacert.pem", package = "RCurl")) )
    if(inherits(x, 'try-error') || length(grep(".*Location: (\\S+).*", x)) < 1) {
      return(u)  # no Location field found; return the input unchanged
    } else {
      return(gsub('.*Location: (\\S+).*', '\\1', x))
    }
  }

  # MAIN #
  # return decoded URLs as a named list (memory preallocated up front)
  urls <- c(url, ...)
  l <- vector(mode = "list", length = length(urls))
  l <- lapply(urls, decode)
  names(l) <- urls
  return(l)
}
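The header-scraping step can be checked without touching the network by factoring it into a small helper. This is a sketch: extract_location is my own hypothetical name, the header text in the example is made up, and the regular expression is the same one decode_short_url uses.

```r
# Pull the redirect target out of a raw HTTP response header.
# Returns NA when the header advertises no redirect.
extract_location <- function(header_text) {
  if (length(grep(".*Location: (\\S+).*", header_text)) < 1) {
    return(NA_character_)  # no Location field present
  }
  gsub(".*Location: (\\S+).*", "\\1", header_text)
}
```

On a 301/302-style header this returns the Location value; on anything else it returns NA, which is how the main function decides whether a redirect happened.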

And here's how we use it: pass one or more shortened URLs to decode_short_url and it returns a named list, with each shortened URL as the name and its resolved address as the value.

You can always find the latest version of this function here:


A comment on the R-bloggers Facebook page for this blog post made me realise that this doesn't work with every shortened URL, such as when you need to be logged in to a service.



I still don’t know why this might be a useful thing to do but hopefully it’s useful to someone out there 🙂



  1. Since no one else wrote it – I wanted to say that this is a good post – thanks for putting it together 🙂

    Comment by Tal Galili — December 14, 2011 @ 1:34 pm

  2. Thank you for this, this is something that I was looking for. Unfortunately, it doesn't resolve the final URL when it's double-shortened.

    Comment by Aleksei Beloshytski (@LadderRunner) — February 4, 2012 @ 7:53 pm

    • Running it recursively would probably work, with the stopping condition being two successive recursions which resolve to the same final URL.

      Comment by Tony Breyal — October 14, 2012 @ 3:37 pm
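The recursive approach described above can be sketched with the resolver injected as an argument, so the stopping logic can be checked offline. This is a sketch under assumptions: unshorten_recursively and max_hops are my own hypothetical names, and in practice resolve would wrap decode_short_url.

```r
# Keep resolving until two successive hops agree (a fixed point),
# or until max_hops redirects have been followed.
unshorten_recursively <- function(u, resolve, max_hops = 10L) {
  for (i in seq_len(max_hops)) {
    nxt <- resolve(u)
    if (identical(nxt, u)) {
      return(u)  # fixed point reached: no further redirect
    }
    u <- nxt
  }
  u  # give up after max_hops redirects
}
```

The max_hops guard stops a redirect loop (a points to b, b points back to a) from recursing forever.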

  3. I mean that ideally it would also (optionally) check whether the URL has been shortened several times. However, it may be run recursively 🙂

    Comment by Aleksei Beloshytski (@LadderRunner) — February 4, 2012 @ 7:57 pm


  5. Great post, but I'm having problems running the function on URL lists with more than 1K links. R either halts or crashes while attempting to access unmapped memory. Perhaps we could add a sleep between requests to the function?

    Comment by Marco T. Bastos — November 16, 2012 @ 3:52 pm

    • Does the update to the code improve the situation? I've added a pause of 0.25 seconds between requests, plus I've preallocated memory for the list object, so hopefully if there's a memory issue this will identify it early on. Other than that, I would simply use a for-loop, something along the following lines (untested):

      out <- vector(mode = "list", length = length(urls))
      names(out) <- urls
      for(u in urls) {
        out[[u]] <- decode_short_url(u)
      }

      Comment by Tony Breyal — November 17, 2012 @ 12:31 pm

      • Not really. R still crashes when resolving more than 1500 URLs, despite the 0.5 seconds between requests and the preallocated memory. Check it out:

        > urls.resolved
        > urls.resolved <- decode_short_url(urls.shortened[1:2000,])

        *** caught segfault ***
        address (nil), cause 'memory not mapped'

        1: .Call("R_curlMultiPerform", curl, as.logical(multiple), PACKAGE = "RCurl")
        2: curlMultiPerform(multiHandle)
        3: getURIAsynchronous(url, …, .opts = .opts, write = write, curl = curl)
        4: getURL(u, header = TRUE, nobody = TRUE, followlocation = FALSE, cainfo = system.file("CurlSSL", "cacert.pem", package = "RCurl"))
        5: doTryCatch(return(expr), name, parentenv, handler)
        6: tryCatchOne(expr, names, parentenv, handlers[[1L]])
        7: tryCatchList(expr, classes, parentenv, handlers)
        8: tryCatch(expr, error = function(e) { call <- conditionCall(e) if (!is.null(call)) { if (identical(call[[1L]], quote(doTryCatch))) call <- sys.call(-4L) dcall <- deparse(call)[1L] prefix <- paste("Error in", dcall, ": ") LONG <- 75L msg <- conditionMessage(e) sm <- strsplit(msg, "\n")[[1L]] w <- 14L + nchar(dcall, type = "w") + nchar(sm[1L], type = "w") if (w > LONG) prefix <- paste(prefix, "\n ", sep = "") } else prefix <- "Error : " msg <- paste(prefix, conditionMessage(e), "\n", sep = "") .Internal(seterrmessage(msg[1L])) if (!silent && identical(getOption("show.error.messages"), TRUE)) { cat(msg, file = stderr()) .Internal(printDeferredWarnings()) } invisible(structure(msg, class = "try-error", condition = e))})
        9: try(getURL(u, header = TRUE, nobody = TRUE, followlocation = FALSE, cainfo = system.file("CurlSSL", "cacert.pem", package = "RCurl")))
        10: FUN(X[[1L]], …)
        11: lapply(urls, decode)
        12: decode_short_url(br.shortened[1:2000, ])

        Possible actions:
        1: abort (with core dump, if enabled)
        2: normal R exit
        3: exit R without saving workspace
        4: exit R saving workspace

        Comment by Marco T. Bastos — November 17, 2012 @ 1:42 pm

        • By the way, I've tested the code running R 2.15 both on Linux (Debian) and on Windows 7 64-bit.

          Comment by Marco T. Bastos — November 17, 2012 @ 1:44 pm

          • I think you're going to have to ask about this elsewhere, because I don't understand why that is happening, to be completely honest with you mate. Sorry I couldn't be of more help.

            Comment by Tony Breyal — November 17, 2012 @ 2:24 pm

            • No probs, Tony. I’ll play around with the function and try to parse the list in smaller blocks. I’ll get back to you if I find a workaround. Thanks for all the help.

              Comment by Marco T. Bastos — November 17, 2012 @ 2:49 pm
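Marco's smaller-blocks workaround can be sketched like this; chunk is a hypothetical helper name and 100 is an arbitrary batch size.

```r
# Split a vector of URLs into batches so that a crash only loses
# the current batch rather than the whole run.
chunk <- function(x, size = 100L) {
  split(x, ceiling(seq_along(x) / size))
}

# Hypothetical usage: resolve each batch in turn with the post's function.
# results <- lapply(chunk(my_urls), decode_short_url)
```

Because decode_short_url accepts a vector of URLs, each batch can be passed straight in; the intermediate results can then be combined once every batch has completed.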

  6. Just checking: the function does not work correctly for Twitter-shortened URLs.

    Comment by — June 10, 2014 @ 9:23 pm

    • It's because Twitter's shortened-URL redirection uses a lower-case location header instead of Location.
      You can fix it by replacing the regexp on lines 9 and 12 with one that matches either case:

      .*[Ll]ocation: (\\S+).*

      Comment by grdscarabe — February 12, 2015 @ 4:27 pm

  7. Reblogged this on IT Today and commented:
    the decode_short_url function does the trick: it gives you the full URL behind the shortened URL

    Comment by leonwangechi — August 7, 2014 @ 6:44 am


