Difference between revisions of "SMHS LinearModeling StatsSoftware"
(Created page with "==SMHS Linear Modeling - Statistical Software == ===Statistical Software=== This section briefly describes the...") |
(→SMHS Linear Modeling - Statistical Software) |
||
(10 intermediate revisions by 3 users not shown) | |||
Line 1: | Line 1: | ||
==[[SMHS_LinearModeling|SMHS Linear Modeling]] - Statistical Software == | ==[[SMHS_LinearModeling|SMHS Linear Modeling]] - Statistical Software == | ||
− | |||
− | |||
This section briefly describes the pros and cons of different statistical software platforms. | This section briefly describes the pros and cons of different statistical software platforms. | ||
− | ===[ | + | <center> |
− | + | {| class="wikitable" style="text-align:left; width:99%" border="1" | |
− | + | |- | |
− | + | !Statistical Software||Advantages||Disadvantages | |
− | + | |- | |
− | + | | [http://r-project.org R]|| | |
− | + | * R is actively maintained (100,000 developers, 15K packages) | |
− | + | * Excellent connectivity to various types of data and other systems | |
+ | * Versatile for solving problems in many domains | ||
+ | * It’s free, open-source code | ||
+ | * Anybody can access/review/extend the source code | ||
+ | * R is very stable and reliable | ||
+ | * If you change or redistribute the R source code, you have to make those changes available for anybody else to use | ||
+ | * R runs anywhere (platform agnostic) | ||
+ | * Extensibility: R supports extensions, e.g., for data manipulation, statistical modeling, and graphics | ||
+ | * Active and engaged community supports R | ||
+ | * Unparalleled question-and-answer (Q&A) websites | ||
+ | * R connects with other languages (Java, C, JavaScript, Python, Fortran), database systems, and other programs such as SAS and SPSS | ||
+ | * Other packages have add-ons to connect with R. SPSS has incorporated a link to R, and SAS has protocols to move data and graphics between the two packages | ||
+ | || | ||
+ | * Mostly scripting language | ||
+ | * Steeper learning curve | ||
+ | |- | ||
+ | | [http://www.sas.com SAS] || | ||
+ | * Large datasets | ||
+ | * Commonly used in business and government | ||
+ | || | ||
+ | * Expensive and proprietary | ||
+ | * Somewhat dated programming language | ||
+ | |- | ||
+ | | [http://www.stata.com Stata] || | ||
+ | * Easy statistical analyses | ||
+ | || | ||
+ | * Mostly classical stats | ||
+ | |- | ||
+ | | [http://www.ibm.com/analytics/us/en/technology/spss SPSS] || | ||
+ | * Appropriate for beginners | ||
+ | * Simple interfaces | ||
+ | || | ||
+ | * Weak in cutting-edge statistical procedures; lacking in robust methods and survey methods | ||
+ | |- | ||
+ | | colspan=3| More comparisons are available online: [http://www.ats.ucla.edu/stat/mult_pkg/compare_packages.htm UCLA/ATS] and [https://en.wikipedia.org/wiki/Comparison_of_statistical_packages Wikipedia]. | ||
+ | |} | ||
+ | </center> | ||
− | == | + | <center> |
+ | Google Scholar Research Article Publications | ||
+ | {| class="wikitable" style="text-align:center; " border="1" | ||
+ | |- | ||
+ | !Year||R||SAS||SPSS | ||
+ | |- | ||
+ | |1995||8||8620||6450 | ||
+ | |- | ||
+ | |1996||2||8670||7600 | ||
+ | |- | ||
+ | |1997||6||10100||9930 | ||
+ | |- | ||
+ | |1998||13||10900||14300 | ||
+ | |- | ||
+ | |1999||26||12500||24300 | ||
+ | |- | ||
+ | |2000||51||16800||42300 | ||
+ | |- | ||
+ | |2001||133||22700||68400 | ||
+ | |- | ||
+ | |2002||286||28100||88400 | ||
+ | |- | ||
+ | |2003||627||40300||78600 | ||
+ | |- | ||
+ | |2004||1180||51400||137000 | ||
+ | |- | ||
+ | |2005||2180||58500||147000 | ||
+ | |- | ||
+ | |2006||3430||64400||142000 | ||
+ | |- | ||
+ | |2007||5060||62700||131000 | ||
+ | |- | ||
+ | |2008||6960||59800||116000 | ||
+ | |- | ||
+ | |2009||9220||52800||61400 | ||
+ | |- | ||
+ | |2010||11300||43000||44500 | ||
+ | |- | ||
+ | |2011||14600||32100||32000 | ||
+ | |} | ||
+ | </center> | ||
+ | library(ggplot2) | ||
+ | library(reshape2)  # melt() with the variable.name argument comes from reshape2 | ||
+ | # read the yearly publication counts (expected columns: Year, R, SAS, SPSS) | ||
+ | Data_R_SAS_SPSS_Pubs <- read.csv('https://umich.instructure.com/files/522067/download?download_frd=1', header=TRUE) | ||
+ | df <- data.frame(Data_R_SAS_SPSS_Pubs) | ||
+ | # convert to long format: one row per (Year, Software) pair | ||
+ | df <- melt(df, id.vars = 'Year', variable.name = 'Software') | ||
+ | # one thick line per software package | ||
+ | ggplot(data=df, aes(x=Year, y=value, colour=Software, group=Software)) + geom_line(size=4) + labs(x='Year', y='Citations') | ||
+ | <center> | ||
+ | [[Image:SMHS_LinearModeling_Fig002.png|500px]] | ||
+ | </center> | ||
+ | ===Next see=== | ||
+ | The [[SMHS_LinearModeling_QC|Quality Control section]] discusses data Quality Control (QC) and Quality Assurance (QA), which are important components of data-driven modeling, analytics, and visualization. | ||
Latest revision as of 12:19, 21 May 2016
SMHS Linear Modeling - Statistical Software
This section briefly describes the pros and cons of different statistical software platforms.
Statistical Software | Advantages | Disadvantages |
---|---|---|
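R | Actively maintained (100,000 developers, 15K packages); excellent connectivity to various types of data and other systems; versatile for solving problems in many domains; free, open-source code that anybody can access, review, or extend; very stable and reliable; changes to or redistributions of the R source code must be made available for anybody else to use; runs anywhere (platform agnostic); supports extensions, e.g., for data manipulation, statistical modeling, and graphics; active and engaged community; unparalleled question-and-answer (Q&A) websites; connects with other languages (Java, C, JavaScript, Python, Fortran), database systems, and other programs such as SAS and SPSS; other packages have add-ons to connect with R (SPSS has incorporated a link to R, and SAS has protocols to move data and graphics between the two packages) | Mostly scripting language; steeper learning curve |
SAS | Large datasets; commonly used in business and government | Expensive and proprietary; somewhat dated programming language |
Stata | Easy statistical analyses | Mostly classical stats |
SPSS | Appropriate for beginners; simple interfaces | Weak in cutting-edge statistical procedures; lacking in robust methods and survey methods |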
More comparisons are available online: UCLA/ATS and Wikipedia. |
Google Scholar Research Article Publications
Year | R | SAS | SPSS |
---|---|---|---|
1995 | 8 | 8620 | 6450 |
1996 | 2 | 8670 | 7600 |
1997 | 6 | 10100 | 9930 |
1998 | 13 | 10900 | 14300 |
1999 | 26 | 12500 | 24300 |
2000 | 51 | 16800 | 42300 |
2001 | 133 | 22700 | 68400 |
2002 | 286 | 28100 | 88400 |
2003 | 627 | 40300 | 78600 |
2004 | 1180 | 51400 | 137000 |
2005 | 2180 | 58500 | 147000 |
2006 | 3430 | 64400 | 142000 |
2007 | 5060 | 62700 | 131000 |
2008 | 6960 | 59800 | 116000 |
2009 | 9220 | 52800 | 61400 |
2010 | 11300 | 43000 | 44500 |
2011 | 14600 | 32100 | 32000 |
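The counts in the table can also be explored directly in R without downloading the CSV file. The following is a minimal sketch that types the table values into a data frame and reports the year in which each package's count is largest. The names `pubs` and `peak_year` are illustrative and not part of the original page.

```r
# Publication counts transcribed from the table above
pubs <- data.frame(
  Year = 1995:2011,
  R    = c(8, 2, 6, 13, 26, 51, 133, 286, 627, 1180, 2180, 3430,
           5060, 6960, 9220, 11300, 14600),
  SAS  = c(8620, 8670, 10100, 10900, 12500, 16800, 22700, 28100, 40300,
           51400, 58500, 64400, 62700, 59800, 52800, 43000, 32100),
  SPSS = c(6450, 7600, 9930, 14300, 24300, 42300, 68400, 88400, 78600,
           137000, 147000, 142000, 131000, 116000, 61400, 44500, 32000)
)

# Hypothetical helper: the year at which a column of counts is largest
peak_year <- function(counts, years) years[which.max(counts)]

# Apply the helper to each software column; the result is a named vector
peaks <- sapply(pubs[, c("R", "SAS", "SPSS")], peak_year, years = pubs$Year)
print(peaks)
```

In these data the SAS count peaks in 2006 and the SPSS count in 2005, while the R count is still rising in 2011, the last year listed.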
library(ggplot2)
library(reshape2)  # melt() with the variable.name argument comes from reshape2
# read the yearly publication counts (expected columns: Year, R, SAS, SPSS)
Data_R_SAS_SPSS_Pubs <- read.csv('https://umich.instructure.com/files/522067/download?download_frd=1', header=TRUE)
df <- data.frame(Data_R_SAS_SPSS_Pubs)
# convert to long format: one row per (Year, Software) pair
df <- melt(df, id.vars = 'Year', variable.name = 'Software')
# one thick line per software package
ggplot(data=df, aes(x=Year, y=value, colour=Software, group=Software)) + geom_line(size=4) + labs(x='Year', y='Citations')
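The wide-to-long conversion can also be sketched in base R, without an add-on reshaping package, using `stack()`. The example below uses three rows copied from the publication table instead of the downloaded CSV file. The names `wide`, `stacked`, and `long` are illustrative assumptions.

```r
# Three rows copied from the publication table (wide format: one column per package)
wide <- data.frame(
  Year = c(1995, 2003, 2011),
  R    = c(8, 627, 14600),
  SAS  = c(8620, 40300, 32100),
  SPSS = c(6450, 78600, 32000)
)

# stack() concatenates the chosen columns into 'values'
# and records the source column name in 'ind'
stacked <- stack(wide[, c("R", "SAS", "SPSS")])

# Long format: one row per (Year, Software) pair;
# Year is repeated once for each of the three packages
long <- data.frame(
  Year     = rep(wide$Year, times = 3),
  Software = stacked$ind,
  value    = stacked$values
)
print(long)
```

The resulting `long` data frame has one year column, one grouping column, and one `value` column, which is the shape a `ggplot()` line plot with a `colour` and `group` mapping expects. The grouping column name must match the one used in `aes()`.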
Next see
The Quality Control section discusses data Quality Control (QC) and Quality Assurance (QA), which are important components of data-driven modeling, analytics, and visualization.
- SOCR Home page: http://www.socr.umich.edu