Consistently Infrequent

December 13, 2011

Unshorten (almost) any URL with R

Filed under: R — Tony Breyal @ 6:57 pm

Introduction

I was asked by a friend how to find the full final address of a URL which had been shortened via a shortening service (e.g., Twitter’s t.co, Google’s goo.gl, Facebook’s fb.me, dft.ba, bit.ly, TinyURL, tr.im, Ow.ly, etc.). I replied that I had no idea, and suggested he have a look on StackOverflow.com or possibly the R-help list; if that didn’t turn up anything, he could try an online unshortening service like http://unshort.me.

Two minutes later he came back with this solution from Stack Overflow which, surprisingly to me, contained an answer I had provided about 1.5 years ago!

This has always been my problem with programming, that I learn something useful and then completely forget it. I’m kind of hoping that by having this blog it will aid me in remembering these sorts of things.

The Objective

I want to decode a shortened URL to reveal its full final web address.

The Solution

The basic idea is to use the getURL function from the RCurl package, telling it to retrieve only the header of the webpage it’s connecting to, and then extract the final URL from the Location field of that header.
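To see what we will be working with, you can ask cURL for just the header of a shortened link (a quick illustrative sketch; output abbreviated):

library(RCurl)

# fetch only the response header, without following the redirect
x <- getURL("http://tinyurl.com/adcd", header = TRUE, nobody = TRUE)
cat(x)
# HTTP/1.1 301 Moved Permanently
# Location: http://www.r-project.org/
# ...

The function below automates this and pulls the target address out of the Location field: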

decode_short_url <- function(url, ...) {
  # PACKAGES #
  require(RCurl)

  # LOCAL FUNCTIONS #
  decode <- function(u) {
    # be polite to the service: brief pause between requests
    Sys.sleep(0.5)
    # fetch only the response header (nobody = TRUE) without following the redirect
    x <- try( getURL(u, header = TRUE, nobody = TRUE, followlocation = FALSE, cainfo = system.file("CurlSSL", "cacert.pem", package = "RCurl")) )
    # if the request failed, or there is no Location field, return the input unchanged
    if(inherits(x, 'try-error') || length(grep(".*Location: (\\S+).*", x)) < 1) {
      return(u)
    } else {
      return(gsub('.*Location: (\\S+).*', '\\1', x))
    }
  }

  # MAIN #
  gc()
  # decode each URL, returning a named list
  urls <- c(url, ...)
  l <- lapply(urls, decode)
  names(l) <- urls
  return(l)
}

And here’s how we use it:

# EXAMPLE #
decode_short_url("http://tinyurl.com/adcd",
                 "http://www.google.com")
# $`http://tinyurl.com/adcd`
# [1] "http://www.r-project.org/"
#
# $`http://www.google.com`
# [1] "http://www.google.co.uk/"

You can always find the latest version of this function here: https://github.com/tonybreyal/Blog-Reference-Functions/blob/master/R/decode_shortened_url/decode_shortened_url.R

Limitations

A comment on the R-bloggers facebook page for this blog post made me realise that this doesn’t work with every shortened URL, for example when the service requires you to be logged in, e.g.,

http://1.cloudst.at/myeg

decode_short_url("http://tinyurl.com/adcd",
                 "http://www.google.com",
                 "http://1.cloudst.at/myeg")

# $`http://tinyurl.com/adcd`
# [1] "http://www.r-project.org/"
#
# $`http://www.google.com`
# [1] "http://www.google.co.uk/"
#
# $`http://1.cloudst.at/myeg`
# [1] "http://1.cloudst.at/myeg"
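If you want to see why a particular link refuses to decode, one option (a quick sketch, separate from the function above) is to inspect the raw response header yourself:

library(RCurl)

# look at the raw header; if there is no "Location:" field then
# decode_short_url() simply returns the input URL unchanged
h <- getURL("http://1.cloudst.at/myeg", header = TRUE, nobody = TRUE)
cat(h)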

I still don’t know why this might be a useful thing to do, but hopefully it helps someone out there :)

November 24, 2011

source_https(): Sourcing an R Script from github over HTTPS

Filed under: R — Tony Breyal @ 12:21 pm

The Objective

I wanted to source R scripts hosted on my github repository for use in my blog (i.e. a github version of ?source). This would make it easier for anyone wishing to test out my code snippets on their own computers, without having to manually go to my github repo and retrieve a series of R scripts themselves to get the code running.

The Problem

The base R function source() fails with HTTPS links on Windows 7. There may be a way around this by starting R with the --internet2 flag from the command line (search for CMD in Windows), but that would just be another inconvenience, like having to download an R script through your browser in the first place.
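To illustrate (the exact error text varies by R version, so treat this as indicative only):

source("https://raw.github.com/tonybreyal/Blog-Reference-Functions/master/R/bingSearchXScraper/bingSearchXScraper.R")
# Error in file(filename, "r", encoding = encoding) :
#   unsupported URL scheme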

An easier approach would be to use RCurl::getURL(), setting either ssl.verifypeer = FALSE or pointing cainfo at an SSL certificates file. That’s easy enough to achieve, but I wanted to wrap the code in a function for convenience as follows:


source_github <- function(u) {
  # load package
  require(RCurl)

  # read script lines from website
  script <- getURL(u, ssl.verifypeer = FALSE)

  # parse the script lines and evaluate them
  eval(parse(text = script))
}

source_github("https://raw.github.com/tonybreyal/Blog-Reference-Functions/master/R/bingSearchXScraper/bingSearchXScraper.R")

The problem with the code above was that the functions sourced from the desired R script file only existed locally in source_github() and not globally to the rest of the R session. Sadface.
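A toy example (independent of RCurl) shows why: by default, eval(parse(...)) evaluates in the environment of the function that calls it, so anything it creates vanishes when that function returns:

f <- function() {
  # define g inside f via eval(parse(...))
  eval(parse(text = "g <- function() 42"))
  g()  # g is visible here, inside f
}
f()
# [1] 42
exists("g")
# [1] FALSE - g never made it into the global environment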

The Solution

Asking on Stack Overflow produced an answer from the mighty Spacedman who added envir=.GlobalEnv as a parameter to eval. This means that the evaluation is done in the global environment and thus all the contents of the R script are available for the entire R session.

Furthermore, it occurred to me that I could make the function generic to work with any R script that is hosted over an HTTPS connection. To this end, I added a couple of lines of code to download a security certificates text file from the curl website.

source_https <- function(u, unlink.tmp.certs = FALSE) {
  # load package
  require(RCurl)

  # read script lines from website using a security certificate
  if(!file.exists("cacert.pem")) download.file(url="http://curl.haxx.se/ca/cacert.pem", destfile = "cacert.pem")
  script <- getURL(u, followlocation = TRUE, cainfo = "cacert.pem")
  if(unlink.tmp.certs) unlink("cacert.pem")

  # parse the script lines and evaluate them in the global environment
  eval(parse(text = script), envir = .GlobalEnv)
}

source_https("https://raw.github.com/tonybreyal/Blog-Reference-Functions/master/R/bingSearchXScraper/bingSearchXScraper.R")
source_https("https://raw.github.com/tonybreyal/Blog-Reference-Functions/master/R/htmlToText/htmlToText.R", unlink.tmp.certs = TRUE)

The optional unlink.tmp.certs = TRUE argument deletes the security certificates text file that source_https downloads. It’s probably best used only on the final call of source_https, to avoid downloading the same certificates file multiple times.

UPDATE

Based on Kay’s comments, here’s a vectorised version with cross-platform SSL certificates:

source_https <- function(url, ...) {
  # load package
  require(RCurl)

  # parse and evaluate each .R script
  sapply(c(url, ...), function(u) {
    eval(parse(text = getURL(u, followlocation = TRUE, cainfo = system.file("CurlSSL", "cacert.pem", package = "RCurl"))), envir = .GlobalEnv)
  })
}

# Example
source_https("https://raw.github.com/tonybreyal/Blog-Reference-Functions/master/R/bingSearchXScraper/bingSearchXScraper.R",
             "https://raw.github.com/tonybreyal/Blog-Reference-Functions/master/R/htmlToText/htmlToText.R")


November 18, 2011

htmlToText(): Extracting Text from HTML via XPath

Filed under: R — Tony Breyal @ 4:31 pm

Converting HTML to plain text usually involves stripping out the HTML tags whilst preserving the most basic of formatting. I wrote a function to do this which works as follows (code can be found on github):

# load packages
library(RCurl)
library(XML)

# assign input (could be an html file, a URL, html text, or some combination of all three in the form of a vector)
input <- "http://www.google.co.uk/search?gcx=c&sourceid=chrome&ie=UTF-8&q=r+project#pq=%22hello+%3C+world%22&hl=en&cp=5&gs_id=3r&xhr=t&q=phd+comics&pf=p&sclient=psy-ab&source=hp&pbx=1&oq=phd+c&aq=0&aqi=g4&aql=&gs_sm=&gs_upl=&bav=on.2,or.r_gc.r_pw.r_cp.,cf.osb&fp=27ff09b2758eb4df&biw=1599&bih=904"

# evaluate input and convert to text
txt <- htmlToText(input)
txt

#r project - Google Search Web Images Videos Maps News Shopping Gmail More Translate Books Finance Scholar Blogs YouTube Calendar Photos Documents Sites Groups Reader Even more » Account Options Sign in Search settings Web History Advanced Search Results 1 - 10 of about 336,000,000 for r project . Everything More Search Options Show options... Web The R Project for Statistical Computing R , also called GNU S, is a strongly functional language and environment to statistically explore data sets, make many graphical displays of data from custom ... www. r - project .org/ - Cached - Similar [Trunc...]

The above uses an XPath approach to achieve its goal. Another approach would be to use a regular expression. These two approaches are briefly discussed below:

Regular Expressions

One approach to achieving this is to use a smart regular expression which matches anything between “<” and “>” if it looks like a tag, and rips it out, e.g.,

# html code
txt <- "
<html>
  <body>
  This is some random text.
    <p>This is some text in a paragraph.</p>
    <p>This is a statement which says that 2 < 3 = TRUE, 4 < 5 = TRUE and 10 > 9 = TRUE.</p>
  </body>
</html>"

# parse html
pattern <- "</?\\w+((\\s+\\w+(\\s*=\\s*(?:\".*?\"|'.*?'|[^'\">\\s]+))?)+\\s*|\\s*)/?>"
plain.text <- gsub(pattern, "\\1", txt)
cat(plain.text)

#   This is some random text.
#     This is some text in a paragraph.
#     This is a statement which says that 2 < 3 = TRUE, 4 < 5 = TRUE and 10 > 9 = TRUE

I got the regular expression in “pattern” in the code above from a quick google search which gave this webpage from 2004. It’s a pretty smart regex because it recognises the difference between “<” and “>” used for an HTML tag, and “<” and “>” used as a natural part of the plain text we want.

I’m still learning regex and I must confess to finding this one slightly intimidating. There seem to be a lot of potential pitfalls with this approach. For example, what should be done about <script></script> tags, which hold programming code for the browser between them? That code is plain text because it sits outside the angle brackets, and would thus be extracted by the regex. However, it is meant for the browser, telling it how to do something; it is not meant to be displayed to the end user, and so is not something we want to include in our html-to-text conversion.
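To make that concrete, here’s a quick sketch which reuses the “pattern” regex from above on some html containing a script tag:

# the script code survives the substitution because it sits outside the tags
txt2 <- "<html><body><p>Visible text.</p><script>var x = 42; alert(x);</script></body></html>"
cat(gsub(pattern, "\\1", txt2))
# Visible text.var x = 42; alert(x);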

This approach would require building ever more sophisticated regular expressions, or filtering through a series of different regular expressions, to get the desired result once these diversions are taken into account. The code above would not give the desired result on the real-world example I give below.

XPath

Another approach is to use XPath. The typical technique, it seems to me, is to extract only the text between paragraph tags “<p>” and “</p>” as follows:

# load packages
library(RCurl)
library(XML)

# download html
html <- getURL("http://tonybreyal.wordpress.com/2011/11/17/cool-hand-luke-aldwych-theatre-london-2011-production/", followlocation = TRUE)

# parse html
doc = htmlParse(html, asText=TRUE)
plain.text <- xpathSApply(doc, "//p", xmlValue)
cat(paste(plain.text, collapse = "\n"))

# I just got back from watching a production of Cool Hand Luke at the Aldwych Theatre in Central London. Given that [TRUNC...]

That’s a great approach for most webpages, such as blogs, because of the way they are designed. However, there are cases where it would not work so well, such as if you wanted all the text from a google search page (though it applies to other pages too, of course):

# load packages
library(RCurl)
library(XML)

# download html
html <- getURL("http://www.google.co.uk/search?gcx=c&sourceid=chrome&ie=UTF-8&q=r+project#pq=%22hello+%3C+world%22&hl=en&cp=5&gs_id=3r&xhr=t&q=phd+comics&pf=p&sclient=psy-ab&source=hp&pbx=1&oq=phd+c&aq=0&aqi=g4&aql=&gs_sm=&gs_upl=&bav=on.2,or.r_gc.r_pw.r_cp.,cf.osb&fp=27ff09b2758eb4df&biw=1599&bih=904", followlocation = TRUE)

# parse html
doc = htmlParse(html, asText=TRUE)
plain.text <- xpathSApply(doc, "//p", xmlValue)
cat(paste(plain.text, collapse = "\n"))

# r project linux
# stats package r
# Search Help

It returned only three lines. So we need to be more liberal and use “//text()”, which returns all text outside of HTML tags, roughly what the regex approach above would give. However, we also need to account for text we don’t want, such as style and script code, which we can exclude as follows:

# load packages
library(RCurl)
library(XML)

# download html
html <- getURL("http://www.google.co.uk/search?gcx=c&sourceid=chrome&ie=UTF-8&q=r+project#pq=%22hello+%3C+world%22&hl=en&cp=5&gs_id=3r&xhr=t&q=phd+comics&pf=p&sclient=psy-ab&source=hp&pbx=1&oq=phd+c&aq=0&aqi=g4&aql=&gs_sm=&gs_upl=&bav=on.2,or.r_gc.r_pw.r_cp.,cf.osb&fp=27ff09b2758eb4df&biw=1599&bih=904", followlocation = TRUE)

# parse html
doc = htmlParse(html, asText=TRUE)
plain.text <- xpathSApply(doc, "//text()[not(ancestor::script)][not(ancestor::style)][not(ancestor::noscript)][not(ancestor::form)]", xmlValue)
cat(paste(plain.text, collapse = " "))

#r project - Google Search Web Images Videos Maps News Shopping Gmail More Translate Books Finance Scholar Blogs YouTube Calendar Photos Documents Sites Groups Reader Even more » Account Options Sign in Search settings Web History Advanced Search Results 1 - 10 of about 336,000,000 for r project . Everything More Search Options Show options... Web The R Project for Statistical Computing R , also called GNU S, is a strongly functional language and environment to statistically explore data sets, make many graphical displays of data from custom ... www. r - project .org/ - Cached - Similar [Trunc...]

This second version of the XPath approach seems to work rather well. It feels more robust than a regular expression approach and returns more information than the typical “//p” XPath approach, making it useful for a greater variety of webpages.

Full code for my htmlToText() R function can be found here: https://github.com/tonybreyal/Blog-Reference-Functions/blob/master/R/htmlToText/htmlToText.R

P.S. Part of the reason I wrote this function is so that I can plug it into my *XScraper functions to provide an extra field of more detailed information, perhaps via a webCrawl = TRUE option. I may have to write a more sophisticated web crawler, though, to handle errors for websites it can’t download correctly through RCurl. I’m not an expert in cURL, so it will probably just be a bunch of try() statements, something simple like the sketch below; I might try that for my next post...
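Something along these lines, perhaps (safe_get is a hypothetical name, just for illustration):

# minimal sketch: wrap the download in try() and return NA on failure
library(RCurl)

safe_get <- function(u) {
  html <- try(getURL(u, followlocation = TRUE), silent = TRUE)
  if (inherits(html, "try-error")) NA_character_ else html
}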

November 11, 2011

Web Scraping Yahoo Search Page via XPath

Filed under: R — Tony Breyal @ 12:25 am

Seeing as I’m on a bit of an XPath kick as of late, I figured I’d continue scraping search results, but this time from Yahoo.com.

Rolling my own version of xpathSApply to handle NULL elements seems to have done the trick, and so far the scraping has been relatively easy. I’ve created an R function which will scrape information from a Yahoo Search page (with the user supplying the Yahoo Search URL) and extract as much information as it can whilst maintaining the data frame structure (full source code at the end of this post). For example:

# load packages
library(RCurl)
library(XML)

# user provides url and the function extracts relevant information into a data frame as follows
u <- "http://uk.search.yahoo.com/search;_ylt=A7x9QV6rWrxOYTsAHNFLBQx.?fr2=time&rd=r1&fr=yfp-t-702&p=Wil%20Wheaton&btf=w"
df <- get_yahoo_search_df(u)
t(df[1, ])

#             1
# title       "Wil Wheaton - Google+"
# url         "https://plus.google.com/108176814619778619437"
# description "Wil Wheaton - Google+6 days ago"
# cached      "http://87.248.112.8/search/srpcache?ei=UTF-8&p=Wil+Wheaton&rd=r1&fr=yfp-t-702&u=http://cc.bingj.com/cache.aspx?q=Wil+Wheaton&d=4592664708059042&mkt=en-GB&setlang=en-GB&w=48d4b732,65b6306b&icp=1&.intl=uk&sig=6lwcOA8_4oGClQam_5I0cA--"
# recorded    "6 days ago"

I’ve only tested these on web results. The idea of these posts is to get basic functionality working and then, if I feel it might be fun, to expand that functionality in the future.

It’s nice having an online blog where I can keep these functions I’ve come up with during coding exercises. Maybe if I make enough of these web search engine scrapers I can go ahead and make my first R package. The downside of web scraping, though, is that if the structure/entities of the HTML code change then the scrapers may stop working, which could make the package difficult to maintain. I can’t really think of how the package itself might be useful to anyone apart from teaching me personally how to build a package.

Maybe that’ll be worth it in and of itself. Ha, version 2.0 could be just a collection of the self contained functions, version 3.0 could have the functions converted to S3 (which I really want to learn), version 4.0 could have them converted to S4 (again, something I’d like to learn) and version 5.0 could have reference classes (I still don’t know what those things are). Just thinking out loud, could be a good way to learn more R. Doubt I’ll do it though but we’ll see. I have to find time to start learning Python so might have to put R on the back burner soon!

Full source code here (function is self-contained, just copy and paste):

# load packages
library(RCurl)
library(XML)

get_yahoo_search_df <- function(u) {
  # I hacked my own version of xpathSApply to deal with cases that return NULL which were causing me problems
  xpathSNullApply <- function(doc, path.base, path, FUN, FUN2 = NULL) {
    # count how many result nodes are on the page
    nodes.len <- length(xpathSApply(doc, path.base))
    # build one indexed XPath expression per result node, e.g. .../ol/li[1], .../ol/li[2], etc.
    paths <- sapply(1:nodes.len, function(i) gsub( path.base, paste(path.base, "[", i, "]", sep = ""), path, fixed = TRUE))
    # apply FUN to each node individually so that missing elements can be detected
    xx <- lapply(paths, function(xpath) xpathSApply(doc, xpath, FUN))
    if(!is.null(FUN2)) xx <- FUN2(xx)
    # replace empty results with NA to keep every column the same length
    xx[sapply(xx, length) < 1] <- NA
    xx <- as.vector(unlist(xx))
    return(xx)
  }

  # download html and parse into tree structure
  html <- getURL(u, followlocation = TRUE)
  doc <- htmlParse(html)

  # path to nodes of interest
  path.base <- "/html/body/div[@id='doc']/div[@id='bd-wrap']/div[@id='bd']/div[@id='results']/div[@id='cols']/div[@id='left']/div[@id='main']/div[@id='web']/ol/li"

  # construct data frame
  df <- data.frame(
    title = xpathSNullApply(doc, path.base, "/html/body/div[@id='doc']/div[@id='bd-wrap']/div[@id='bd']/div[@id='results']/div[@id='cols']/div[@id='left']/div[@id='main']/div[@id='web']/ol/li/div/div/h3/a", xmlValue),
    url = xpathSNullApply(doc, path.base, "/html/body/div[@id='doc']/div[@id='bd-wrap']/div[@id='bd']/div[@id='results']/div[@id='cols']/div[@id='left']/div[@id='main']/div[@id='web']/ol/li/div/div/h3/a[@href]", xmlAttrs, FUN2 = function(xx) sapply(xx, function(x) x[2])),
    description = xpathSNullApply(doc, path.base, "/html/body/div[@id='doc']/div[@id='bd-wrap']/div[@id='bd']/div[@id='results']/div[@id='cols']/div[@id='left']/div[@id='main']/div[@id='web']/ol/li/div/div", xmlValue),
    cached = xpathSNullApply(doc, path.base, "/html/body/div[@id='doc']/div[@id='bd-wrap']/div[@id='bd']/div[@id='results']/div[@id='cols']/div[@id='left']/div[@id='main']/div[@id='web']/ol/li/div/a[@href][text()='Cached']", xmlAttrs, FUN2 = function(xx) sapply(xx, function(x) x[1])),
    recorded = xpathSNullApply(doc, path.base, "/html/body/div[@id='doc']/div[@id='bd-wrap']/div[@id='bd']/div[@id='results']/div[@id='cols']/div[@id='left']/div[@id='main']/div[@id='web']/ol/li/div/div/span[@id='resultTime']", xmlValue),
    stringsAsFactors = FALSE)

  # free doc from memory
  free(doc)

  # return data frame
  return(df)
}

u <- "http://uk.search.yahoo.com/search;_ylt=A7x9QV6rWrxOYTsAHNFLBQx.?fr2=time&rd=r1&fr=yfp-t-702&p=Wil%20Wheaton&btf=w"
df <- get_yahoo_search_df(u)
t(df[1:5, ])

#             1
# title       "Wil Wheaton - Google+"
# url         "https://plus.google.com/108176814619778619437"
# description "Wil Wheaton - Google+6 days ago"
# cached      "http://87.248.112.8/search/srpcache?ei=UTF-8&p=Wil+Wheaton&rd=r1&fr=yfp-t-702&u=http://cc.bingj.com/cache.aspx?q=Wil+Wheaton&d=4592664708059042&mkt=en-GB&setlang=en-GB&w=48d4b732,65b6306b&icp=1&.intl=uk&sig=6lwcOA8_4oGClQam_5I0cA--"
# recorded    "6 days ago"
#             2
# title       "WIL WHEATON DOT NET"
# url         "http://www.wilwheaton.net/coollinks.php"
# description "Wil Wheaton - Don't be a dick! - Writer and Actor - Your Mom - I'm Wil Wheaton. I'm an author (that's why I'm wilwheatonbooks), an actor, and a lifelong geek."
# cached      "http://87.248.112.8/search/srpcache?ei=UTF-8&p=Wil+Wheaton&rd=r1&fr=yfp-t-702&u=http://cc.bingj.com/cache.aspx?q=Wil+Wheaton&d=4592836504520824&mkt=en-GB&setlang=en-GB&w=eaeb9364,4a4e7c54&icp=1&.intl=uk&sig=VC7eV8GUMXVuu9apHagYNg--"
# recorded    "2 days ago"
#             3
# title       "this is one hell of a geeky weekend - WWdN: In Exile"
# url         "http://wilwheaton.typepad.com/wwdnbackup/2008/05/this-is-one-hel.html"
# description "WIL WHEATON DOT NET2 days ago"
# cached      "http://87.248.112.8/search/srpcache?ei=UTF-8&p=Wil+Wheaton&rd=r1&fr=yfp-t-702&u=http://cc.bingj.com/cache.aspx?q=Wil+Wheaton&d=4559391600545150&mkt=en-GB&setlang=en-GB&w=90d3ee39,34d4424b&icp=1&.intl=uk&sig=ZN.UpexVV4pm3yn7XiEURw--"
# recorded    "2 days ago"
#             4
# title       "Wil Wheaton - Google+ - I realized today that when someone ..."
# url         "https://plus.google.com/108176814619778619437/posts/ENTkBMZKeGY"
# description ">Cool Sites. Okay, I'm talking to the guys here: do you ever get \"the sigh\"? You know what I'm talking about...you're really into some cool website, and your ..."
# cached      "http://87.248.112.8/search/srpcache?ei=UTF-8&p=Wil+Wheaton&rd=r1&fr=yfp-t-702&u=http://cc.bingj.com/cache.aspx?q=Wil+Wheaton&d=4718764947541872&mkt=en-GB&setlang=en-GB&w=9bca6e9a,dba19826&icp=1&.intl=uk&sig=jGaKkuIFOINEBBfBwarrgg--"
# recorded    "6 days ago"
#             5
# title       "The Hot List: Dwight Slade, Back Fence PDX, Wil Wheaton vs ..."
# url         "http://www.oregonlive.com/movies/index.ssf/2011/11/the_hot_list_dwight_slade_back.html"
# description "this is one hell of a geeky weekend - WWdN: In Exile2 days ago"
# cached      "http://87.248.112.8/search/srpcache?ei=UTF-8&p=Wil+Wheaton&rd=r1&fr=yfp-t-702&u=http://cc.bingj.com/cache.aspx?q=Wil+Wheaton&d=414191857143&mkt=en-GB&setlang=en-GB&w=3081364,e585aa21&icp=1&.intl=uk&sig=KufdBZ_Thr1Mm8.SnjpMUQ--"
# recorded    "4 hours ago"

UPDATE: I’ve created a github account and the above code can be found at: https://github.com/tonybreyal/Blog-Reference-Functions/blob/master/R/get_yahoo_search_df.R

November 7, 2011

Web Scraping Google URLs

Filed under: R — Tony Breyal @ 2:18 pm

UPDATE: This function has now been improved, see googleSearchXScraper()

Google slightly changed the html code it uses for hyperlinks on search pages last Thursday, thus causing one of my scripts to stop working. Thankfully, this is easily solved in R thanks to the XML package and the power and simplicity of XPath expressions:

# load packages
library(RCurl)
library(XML)

get_google_page_urls <- function(u) {
  # read in page contents
  html <- getURL(u)

  # parse HTML into tree structure
  doc <- htmlParse(html)

  # extract url nodes using XPath. Originally I had used "//a[@href][@class='l']" until the google code change.
  links <- xpathApply(doc, "//h3//a[@href]", function(x) xmlAttrs(x)[[1]])

  # free doc from memory
  free(doc)

  # ensure urls start with "http" to avoid google references to the search page
  links <- grep("http://", links, fixed = TRUE, value=TRUE)
  return(links)
}

u <- "http://www.google.co.uk/search?aq=f&gcx=w&sourceid=chrome&ie=UTF-8&q=r+project"
get_google_page_urls(u)

# [1] "http://www.r-project.org/"
# [2] "http://en.wikipedia.org/wiki/R_(programming_language)"
# [3] "http://www.rseek.org/"
# [4] "http://www.gutenberg.org/browse/authors/r"
# [5] "http://sciviews.org/_rgui/"
# [6] "http://www.itc.nl/~rossiter/teach/R/RIntro_ITC.pdf"
# [7] "http://stat.ethz.ch/CRAN/"
# [8] "http://hughesbennett.co.uk/RProject"
# [9] "http://www.warwick.ac.uk/statsdept/user-2011/"

Lovely jubbly! :)

P.S. I know that there is an API of some sort for google search but I don’t think anyone has made an R package for it. Yet. (I feel my skill set is insufficient to do it myself!)
