Re: [Hampshire] Analytics packages

Author: Robin Wilson
Date:  
To: Hampshire LUG Discussion List
Subject: Re: [Hampshire] Analytics packages
Hi Rob,

I'd recommend going for R. Yes, it's fairly complicated, but it is incredibly powerful and very good at dealing with large datasets. Depending on your needs, RStudio (http://rstudio.org/) may be useful as an integrated development environment (IDE) for R: it's very lightweight, but it helps a lot. There are also more fully-featured GUIs available, such as R Commander (http://socserv.mcmaster.ca/jfox/Misc/Rcmdr/).
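As a rough sketch of how the "split on a field, then run a regression per group" job might look in R: the column names (group, x, y) and the simulated data below are made up for illustration, since I don't know your file layout. In practice you would load your own file (for example with read.table) instead of generating data. Note that split() does not need the data to be sorted on the grouping field.

```r
# Simulated stand-in for the real dataset (hypothetical columns: group, x, y).
set.seed(1)
n <- 1000
dat <- data.frame(
  group = sample(c("a", "b", "c"), n, replace = TRUE),  # unsorted grouping field
  x     = rnorm(n)
)
dat$y <- 2 + 3 * dat$x + rnorm(n)

# Split the rows by the grouping field and fit one linear regression per group.
fits <- lapply(split(dat, dat$group), function(d) coef(lm(y ~ x, data = d)))

# Collect the coefficients into one matrix: one row per group.
results <- do.call(rbind, fits)
print(results)
```

With a thousand groups this gives a thousand rows of coefficients to compare afterwards, without having to split the text file by hand first.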

Hope that might help,

Robin

On 30 Oct 2011, at 12:29, Rob Malpass wrote:

> Hi all
>
> Does anyone know a good, user-friendly statistics / analytics package for Linux? The trick is that it needs to be able to handle an absolutely massive dataset: 13m rows.
>
> For Uni, I have a dataset with no fewer than 13m records and I need to run a regression on it. In fact I probably need to run about a thousand regressions and compare the results later.
>
> In theory, something like Libre could handle the individual regressions once I've split the txt file up, but I don't want to get into faffing around with awk, sed, cat, head etc. (it takes ages, creates massive files, and besides, the file needs splitting according to a rule based on a field that, at present, I can't guarantee it's sorted on). I can't afford the frankly ludicrous prices charged for SAS and SPSS. I just wondered if any of you knew of something really good that people are using and that I've missed.
> I've tried:
> "R" [1] - powerful but very clunky and a dreadful GUI
> "PSPP" [2] - still a work in progress and truly awfully formatted output. It'll get there one day but it's a mile off at the moment.
> "DAP" - won't compile for me and I don't have time to investigate.
> "gretl" [3] - seemingly for economists who seldom have to handle such big datasets.
> Various database packages which are fine for handling the data - but don't run to linear regression.
>
> Cheers
> Rob
> [1] www.r-project.org
> [2] http://www.gnu.org/software/pspp/
> [3] http://gretl.sourceforge.net/
>
> --
> Please post to: Hampshire@???
> Web Interface: https://mailman.lug.org.uk/mailman/listinfo/hampshire
> LUG URL: http://www.hantslug.org.uk
> --------------------------------------------------------------


