19 Dec 2011

Blog Statistics with StatCounter & R

If you're interested in analysing your blog's statistics, this is easily done with a web service like StatCounter (free, registration only, quite extensive) together with R.
After adding the StatCounter script to the HTML code of a webpage or blog, you can download the log files, read them into R with a few short lines of code, and inspect visitor activity.
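
A minimal sketch of such an inspection, using a tiny made-up log instead of a real download. The column names (Date, IP.Address) and the IP-addresses are assumptions for illustration only; a real StatCounter export will have more columns and possibly different names.

```r
# Made-up stand-in for a downloaded log file (hypothetical columns and values)
log_txt <- "Date,IP.Address
2011-12-17,192.0.2.1
2011-12-17,192.0.2.7
2011-12-18,192.0.2.1
2011-12-19,198.51.100.4"

# With a real export this would be read.csv("<path to your log file>")
log <- read.csv(text = log_txt, stringsAsFactors = FALSE)

# Page loads per day and per IP-address
visits_per_day <- table(log$Date)
hits_per_ip <- sort(table(log$IP.Address), decreasing = TRUE)

print(visits_per_day)
print(hits_per_ip)
```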

17 Dec 2011

Function to Collect Geographic Coordinates for IP-Addresses

I added the function IPtoXY, which collects geographic coordinates for IP-addresses, to theBioBucket-Archives.
It uses the web service at http://www.datasciencetoolkit.org/ and needs only base R packages.

# System time to collect coordinates of 100 IP-addresses:
> system.time(sapply(log$IP.Address[1:100], FUN = IPtoXY))
       user      system     elapsed
       0.05        0.02       33.10

15 Dec 2011

Conversion of Several Variables to Factors

..often needed when preparing data for analysis (and usually forgotten by the time I need it again).
With the code below I convert a set of variables to factors. There may well be slicker ways to do it (if you know one, let me know!)

> dat <- data.frame(matrix(sample(1:40), 4, 10, dimnames = list(1:4, LETTERS[1:10])))
> str(dat)
'data.frame':   4 obs. of  10 variables:
 $ A: int  5 34 3 15
 $ B: int  28 25 17 24
 $ C: int  2 12 10 32
 $ D: int  16 27 29 14
 $ E: int  40 7 4 31
 $ F: int  22 30 6 18
 $ G: int  33 36 35 38
 $ H: int  19 21 37 8
 $ I: int  20 11 9 26
 $ J: int  39 13 1 23
> 
> id <- which(names(dat) %in% c("A", "F", "I"))
> dat[, id] <- lapply(dat[, id], as.factor)
> str(dat[, id])
'data.frame':   4 obs. of  3 variables:
 $ A: Factor w/ 4 levels "3","5","15","34": 2 4 1 3
 $ F: Factor w/ 4 levels "6","18","22",..: 3 4 1 2
 $ I: Factor w/ 4 levels "9","11","20",..: 3 2 1 4
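
One possibly slicker variant, as a sketch: a data frame can be indexed by column names directly, so the which() / %in% step is not strictly needed. The name vars is just an illustrative choice.

```r
dat <- data.frame(matrix(sample(1:40), 4, 10,
                         dimnames = list(1:4, LETTERS[1:10])))

# Columns to convert, addressed by name
vars <- c("A", "F", "I")

# dat[vars] is a list of columns, so lapply() converts each one
# and the result is assigned back into the same columns
dat[vars] <- lapply(dat[vars], factor)

str(dat[vars])
```

Indexing with dat[vars] instead of dat[, vars] also keeps working when only a single column is selected, because it never drops to a plain vector.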


12 Dec 2011

Default Convenience Functions in R (Rprofile.site)

I keep my blog-reference-functions, snippets, etc., on GitHub and want to source them from there. This can be done with a function (source_https, customized for my purpose HERE). The original function was provided by the R-Blogger Tony Breyal - thanks Tony! As I will use this function quite frequently, I added its code to my Rprofile.site and can now source from GitHub whenever I run code from the R console. This is very handy and I thought it might be worth sharing..
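
To illustrate the idea, here is a rough base-R sketch of such a helper. It is not Tony Breyal's original nor my customized source_https; the name source_from_url and its arguments are made up for this example, and it assumes an R version whose url() connections can read https addresses.

```r
# Hypothetical helper: fetch an R script from a URL and evaluate it.
# A definition like this could be placed in Rprofile.site so that it is
# available in every session.
source_from_url <- function(u, envir = globalenv()) {
  con <- url(u)
  on.exit(close(con))                      # release the connection when done
  code <- readLines(con, warn = FALSE)     # fetch the script text
  eval(parse(text = code), envir = envir)  # run all expressions in 'envir'
  invisible(NULL)
}

# Usage (the URL is a placeholder, not a real script):
# source_from_url("https://raw.githubusercontent.com/<user>/<repo>/master/<script>.R")
```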

Function for Adding Transparency to JPEG (Output = PNG)

..see the function-code HERE.

Animation Newbie Exercise with R-Package {animation}

Try this very simple & illustrative example for creating an animation with the animation package:


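library(animation)  # provides ani.options(), ani.pause(), ani.start(), ani.stop()

# Draw one randomly placed cross per frame
myfun <- function() {
  n <- ani.options("nmax")  # maximum number of frames
  x <- sample(1:n)
  y <- sample(1:n)

  for (i in 1:n) {
    plot(x[i], y[i], cex = 3, col = 3, pch = 3, lwd = 2,
         ylim = c(0, 50),
         xlim = c(0, 50))
    ani.pause()  # wait before drawing the next frame
  }
}

# ani.start() / ani.stop() belong to older versions of {animation};
# newer versions may require saveHTML() instead
ani.start()
par(mar = c(3, 3, 1, 0.5), mgp = c(1.5, 0.5, 0), tcl = -0.3)
myfun()
ani.stop()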

7 Dec 2011

A Word Cloud with Spatial Meaning

..Some time ago I did a word cloud to represent a Google Scholar search result. Tal Galili pointed me to a post by Drew Conway that expanded on the topic of word clouds lacking spatial meaning. In fact, the spatial ordering of words in a word cloud is arbitrary and meaningless..

As I am an ecologist, I soon came to the idea that text could be treated as a multivariate data set, with words treated as species and sentences as samples. If it makes sense to put sentences and words in a cross-table, just as I would with a species / samples matrix, then it may also be sensible to analyze such a matrix with the ordination methods for multivariate data that ecologists commonly use. I chose NMDS ordination, as it is robust and quite easy to compute with the R-package {vegan}.
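
A small sketch of this idea, with made-up toy sentences rather than the Google Scholar data. It builds the sentences x words cross-table in base R and, if {vegan} is installed, runs an NMDS on it; the Bray-Curtis distance and k = 2 are assumptions chosen for illustration.

```r
# Toy sentences (made up for illustration)
sentences <- c("word clouds lack spatial meaning",
               "ordination gives words a spatial meaning",
               "ecologists use ordination for species data",
               "species and samples form a matrix",
               "words and sentences form a matrix")

# Split each sentence into lower-case words
words <- strsplit(tolower(sentences), "[^a-z]+")

# Long format: one row per (sentence, word) occurrence
long <- data.frame(sentence = rep(seq_along(words), lengths(words)),
                   word = unlist(words))

# Cross-table: sentences as "samples" (rows), words as "species" (columns)
mat <- unclass(table(long$sentence, long$word))

if (requireNamespace("vegan", quietly = TRUE)) {
  # NMDS on the cross-table; with so few sentences vegan may warn
  # that the stress is nearly zero
  ord <- vegan::metaMDS(mat, distance = "bray", k = 2, trace = FALSE)
  plot(ord, type = "n")
  text(ord, display = "species", cex = 0.8)  # words placed by ordination
}
```

In the resulting plot, words that occur in similar sentences end up close together, which gives the positions the spatial meaning a plain word cloud lacks.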

1 Dec 2011

Producing Google Map Embeds with R Package googleVis

(1) To produce the HTML code for a Google Map with the R-package googleVis, do something like:
 

library(googleVis)
df <- data.frame(Address = c("Innsbruck", "Wattens"),
                 Tip = c("My Location 1", "My Location 2"))
mymap <- gvisMap(df, "Address", "Tip", options = list(showTip = TRUE, mapType = "normal",
                 enableScrollWheel = TRUE))
plot(mymap) # preview

(2) Then copy and paste the HTML into your blog or website, after customizing it for your purpose..
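
For step (2), a sketch of how the embeddable part of the HTML might be printed to the console for copying. It assumes googleVis is installed and that the tag = "chart" option of its print method returns only the chart markup, without the surrounding page; check the package documentation for your version.

```r
df <- data.frame(Address = c("Innsbruck", "Wattens"),
                 Tip = c("My Location 1", "My Location 2"))

if (requireNamespace("googleVis", quietly = TRUE)) {
  mymap <- googleVis::gvisMap(df, "Address", "Tip",
                              options = list(showTip = TRUE,
                                             mapType = "normal",
                                             enableScrollWheel = TRUE))
  # Print only the chart markup, ready for copy-pasting into a post
  print(mymap, tag = "chart")
}
```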

Line Slope Calculation in ArcGis 9.3 (using XTools)

Objective:
You have a polyline, say a path, river, etc., and want to know the average slope of each individual line segment.

Approach:
Use the z-values of the polyline's nodes and calculate the percent slope of each segment by
(z-Difference / line-segment shape_length) * 100
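
The same arithmetic as a small R sketch, taking percent slope in the usual sense of rise over run times 100 and assuming that shape_length is the horizontal (2D) length of the segment. The function and argument names are made up for illustration; they are not ArcGIS or XTools field names.

```r
# Percent slope of a line segment from the z-values of its two end nodes
# and its horizontal length (all in the same units)
percent_slope <- function(z_start, z_end, shape_length) {
  abs(z_end - z_start) / shape_length * 100
}

# A segment 50 m long that climbs from 100 m to 105 m
percent_slope(z_start = 100, z_end = 105, shape_length = 50)  # 10 (percent)

# Vectorized over several segments at once
percent_slope(c(100, 105), c(105, 102), c(50, 60))
```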