Hi Rob,
I'd recommend going for R. Yes, it's fairly complicated, but it is incredibly powerful and very good at dealing with large datasets. Depending on your needs, RStudio (http://rstudio.org/) may be useful as an Integrated Development Environment for R: it's very lightweight, but it helps a lot. There are also more fully featured GUIs available, like R Commander (http://socserv.mcmaster.ca/jfox/Misc/Rcmdr/).
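On the worry about splitting the file: as far as I can see, you shouldn't need to sort or split it at all. Here is a minimal sketch of the idea, written in Python rather than R purely for illustration. It assumes a CSV with one predictor and one response, and the column names `group`, `x` and `y` are made up for the example, so substitute your own. It reads the rows once, in whatever order they arrive, keeps a handful of running sums per group, and works out a simple least-squares line for each group at the end.

```python
import csv
import io
from collections import defaultdict


def grouped_regressions(rows, key="group", x="x", y="y"):
    """Fit y = slope * x + intercept for each group in one pass.

    `rows` is any iterable of dict-like records and need not be sorted.
    The column names are hypothetical defaults for this sketch.
    """
    # Per-group running sums: n, sum(x), sum(y), sum(x*x), sum(x*y)
    sums = defaultdict(lambda: [0, 0.0, 0.0, 0.0, 0.0])
    for row in rows:
        xv, yv = float(row[x]), float(row[y])
        s = sums[row[key]]
        s[0] += 1
        s[1] += xv
        s[2] += yv
        s[3] += xv * xv
        s[4] += xv * yv

    fits = {}
    for g, (n, sx, sy, sxx, sxy) in sums.items():
        denom = n * sxx - sx * sx
        if denom == 0:
            continue  # all x values identical (or one row): no unique line
        slope = (n * sxy - sx * sy) / denom
        intercept = (sy - slope * sx) / n
        fits[g] = (slope, intercept)
    return fits


# Tiny unsorted sample standing in for the real 13m-row file.
sample = io.StringIO("group,x,y\nb,1,5\na,0,1\nb,2,7\na,1,3\na,2,5\nb,3,9\n")
fits = grouped_regressions(csv.DictReader(sample))
```

For the real file you would pass `csv.DictReader(open(path))` instead of the sample, so memory use depends on the number of groups rather than the number of rows. Running sums like these can lose precision when the values are large, and the sketch only covers a single predictor, so treat it as a way of seeing the shape of the problem rather than a replacement for a proper regression routine.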
Hope that might help,
Robin
On 30 Oct 2011, at 12:29, Rob Malpass wrote:
> Hi all
>
> Does anyone know a good, user-friendly statistics / analytics package for Linux? The catch is that it needs to be able to handle an absolutely massive dataset: 13m rows.
>
> For Uni, I have a dataset with no fewer than 13m records and I need to run a regression on it. In fact, I probably need to run about a thousand regressions and compare the results later.
>
> In theory, something like Libre could handle the individual regressions once I've split the txt file up, but I don't want to get into faffing around with awk, sed, cat, head etc. It takes ages and creates massive files, and besides, the file needs splitting according to a rule that uses a field within it, and at present I can't guarantee the file is sorted on that field. I can't afford the frankly ludicrous prices charged for SAS and SPSS. I just wondered if any of you knew of something really good that people are using and that I've missed.
> I've tried:
> "R" [1] - powerful but very clunky and a dreadful GUI
> "PSPP" [2] - still a work in progress and truly awfully formatted output. It'll get there one day but it's a mile off at the moment.
> "DAP" - won't compile for me and I don't have time to investigate.
> "gretl" [3] - seemingly for economists who seldom have to handle such big datasets.
> Various database packages which are fine for handling the data - but don't run to linear regression.
>
> Cheers
> Rob
> [1] www.r-project.org
> [2] http://www.gnu.org/software/pspp/
> [3] http://gretl.sourceforge.net/
>
> --
> Please post to: Hampshire@???
> Web Interface: https://mailman.lug.org.uk/mailman/listinfo/hampshire
> LUG URL: http://www.hantslug.org.uk
> --------------------------------------------------------------