Spatial (GIS) data in R: easy maps

Most, if not all, paper topics will benefit from finding books and articles discussing (and giving code for) relevant techniques. Common examples in the past have been text mining and web scraping. Another example is analyzing spatial data. There are R functions for a full range of geographic modeling, including both analysis and display. Several groups…


Predicting polluted swimming

Many cities should be doing something like this, including my hometown of San Diego. Measuring water quality is vital, but the results can take days. Doing this for another city could make a good class project, since historical data is often available. These developers have also turned it into a "beat the forecasters" game…

Staying familiar with R analytics

I subscribe to an email list called R-bloggers. Every day it summarizes about five blog posts on statistical analysis using R. Roughly one third are about some function or feature of R. The second, more interesting, third are short cases on a wide variety of topics that someone has analyzed: sports results, NAFTA, analyzing financial…

Chart Relationship diagram from Financial Times

This diagram of about 80 kinds of charts, with clear explanations of their purposes, is impressive. It is the most comprehensive such list I have seen, and it's quite easy to understand. I have not looked for an R/ggplot version of this, but if one does not exist yet, I suspect someone will soon create it. Here…

Random Forests + LASSO Lecture May 11

Here are the lecture notes on Random Forests from Thursday, May 11: BDA17 Random Forests May 11 Bohn. Remember, Random Forests are a technique everyone should try. LASSO, also discussed on Wednesday, is great when you have lots of variables. With fewer than 20 variables, it's not as necessary. BUT LASSO, also discussed on Wednesday,…

TA Session on 3rd May

The workshop will cover various commands used for text mining. We will also go over some basic details to set you up for next week's homework, which will be a Kaggle competition. Please read the text-mining handout given today if you want some practice before the session.