[R-bloggers] another surmortaliy graph (and 11 more aRticles)
- another surmortaliy graph
- W is for Write and Read Data – Fast
- R is everywhere
- Essential list of useful R packages for data scientists
- #26: Upgrading to R 4.0.0
- Get all your packages back on R 4.0.0
- Updating to 4.0.0 on MacOS
- An adventure in downloading books
- ChemoSpecUtils Update
- Proofs without Words using gganimate
- A package to download free Springer books during Covid-19 quarantine
Posted: 27 Apr 2020 11:20 AM PDT [This article was first published on R – Xi'an's Og, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here) Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
To leave a comment for the author, please follow the link and comment on their blog: R – Xi'an's Og. R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
W is for Write and Read Data – Fast
Posted: 27 Apr 2020 07:00 AM PDT [This article was first published on Deeply Trivial, and kindly contributed to R-bloggers].

Once again, I'm dipping outside of the tidyverse, but this package and its functions have been really useful for getting data into (and out of) R quickly. For work, I have to pull in data from a few different sources, then manipulate and combine them into the final dataset that I use for much of my analysis. So that I don't have to repeat all of that joining, recoding, and calculating each time, I saved the final merged dataset as a CSV file that I can load whenever I need to continue my analysis. The problem is that the most recent version of that file, which contains more than 13 million records, was so large that writing it (and subsequently reading it back in) took forever and sometimes timed out.

That's when I discovered the data.table library and its fread and fwrite functions. Tidyverse is great for working with CSV files, but a lot of the memory and loading time is used for formatting. fread and fwrite are leaner and get the job done a bit faster. For regular-sized CSV files (like my reads2019 set), the time difference is pretty minimal. But for a 5GB datafile, it makes a huge difference.

    library(tidyverse)
    library(data.table)  # provides fread() and fwrite()
    system.time(reads2019 <- read_csv("~/Downloads/Blogging A to Z/SaraReads2019_allchanges.csv"))
    ## user system elapsed
    rm(reads2019)
    system.time(reads2019 <- fread("~/Downloads/Blogging A to Z/SaraReads2019_allchanges.csv"))
    ## user system elapsed

But let's show how long it took to read my work datafile. Here's the elapsed time from the system.time output.

read_csv:
fread:

    library(wakefield)
    ## Warning: package 'wakefield' was built under R version 3.6.3
    set.seed(42)
    ## user system elapsed
    system.time(fwrite(reallybigshew, "~/Downloads/Blogging A to Z/bigdata2.csv"))
    ## user system elapsed
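The comparison described in this post can be reproduced on synthetic data. The following is a minimal sketch, not the author's benchmark: the `toy` data frame, its column names and its size are assumptions made up for illustration, and base R's `write.csv`/`read.csv` stand in as the baseline instead of readr. The data.table part only runs if that package is installed.

```r
# Sketch: time a CSV write/read round trip on made-up data.
# Assumption: `toy` is synthetic; it is not the author's work file.
set.seed(42)
n <- 1e5
toy <- data.frame(
  id    = seq_len(n),
  group = sample(letters[1:5], n, replace = TRUE),
  value = round(rnorm(n), 4),
  stringsAsFactors = FALSE
)

csv_base <- tempfile(fileext = ".csv")
csv_dt   <- tempfile(fileext = ".csv")

# Baseline: base R writer and reader
t_write_base <- system.time(write.csv(toy, csv_base, row.names = FALSE))
t_read_base  <- system.time(back_base <- read.csv(csv_base, stringsAsFactors = FALSE))

elapsed <- c(write.csv = t_write_base[["elapsed"]],
             read.csv  = t_read_base[["elapsed"]])

# data.table: only attempted when the package is available
if (requireNamespace("data.table", quietly = TRUE)) {
  t_write_dt <- system.time(data.table::fwrite(toy, csv_dt))
  t_read_dt  <- system.time(back_dt <- data.table::fread(csv_dt, data.table = FALSE))
  elapsed <- c(elapsed,
               fwrite = t_write_dt[["elapsed"]],
               fread  = t_read_dt[["elapsed"]])
}

# Elapsed seconds per step; the values depend on machine and disk
print(elapsed)
```

On a file this small the gap is likely to be modest; increasing `n` by one or two orders of magnitude should make the difference between the readers more visible.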
R is everywhere
Posted: 27 Apr 2020 01:07 AM PDT [This article was first published on Quantargo Blog, and kindly contributed to R-bloggers].

Introduction to R

R is a programming language and environment for working with data. Statisticians and data scientists love it for its expressive syntax and its plentiful external libraries and tools, and it runs on all major operating systems. It is the Swiss army knife for data analysis and statistical computing (and you can make some pretty charts, too!). The R language is easily extensible with packages written by a large and growing community of developers around the world. You can find it pretty much anywhere: it is used by academic institutions, start-ups, international corporations and many more.

This is also reflected in its adoption: both the number of downloads and the number of available packages have increased substantially over the years. In 2020 R celebrates its 20th birthday with the release of version 4.0. And yes, it's free and open source.

Why Use R?

R is a popular language for solving data analysis problems and is also used by people who traditionally do not consider themselves programmers. When creating charts and visualizations with R, you will find that you have much greater creative possibilities than with graphical applications such as Excel. Here are some of the features R is most famous for:

- Visualization: Creating beautiful graphs and visualizations is one of its biggest strengths. The core language already provides a rich set of tools for plotting charts and for all kinds of graphics. The sky's the limit.
- Reproducibility: Unlike spreadsheet software, R code is not coupled to specific datasets and can easily be reused across different projects, even when the data exceed 1 million rows. Easily build reusable reports and automatically generate new versions as the data changes.
- Advanced modelling: R provides one of the biggest and most powerful code bases for data analysis in the world. The richness and depth of the available statistical models keep growing by the day, thanks to the huge community of open source package developers and contributors.
- Automation: R code can also be used to automate reports or to perform data transformations and model computations. It can also be integrated into automated production workflows, cloud computing environments and modern database systems.

You R in Good Company

R is the de facto standard for statistical computing at academic institutions and companies around the world. Its great support for literate programming (code that can be combined with human-readable text) enables researchers and data scientists to create publication-ready reports which are easy for reviewers to reproduce. The language has seen wide adoption in various industries; some examples:

Information Technology
Pharma: Merck, Genentech (Roche), Novartis, Pfizer
Newspapers: The Economist, The New York Times, Financial Times
Finance

See also the R Consortium page for further information about industrial partners and initiatives.

Building Blocks

The R language consists of three fundamental building blocks, which we will have a look at in the following chapters:

- The most important object type in R is the vector. Vectors form the basis for (almost) all R data structures. Being very vector-oriented makes R a very expressive and powerful language.
- Functions and operators make it easy to work with vectors and compute results.
- One of the greatest strengths of R is the flexibility to easily integrate new algorithms and build interfaces around them. R's package ecosystem allows you to choose from thousands of open source models and libraries. The main package repository, called CRAN, hosts these packages and allows you to easily install and use them in your code.

Exercise: Submit your first code

This course has code exercises to help you learn and quickly explore new concepts. After entering code in the editor, hit the "Submit" button to execute it. The editor will give you feedback on your submission and display any output below the editor. If you need some additional help, use the "Get Hint" button. To finish your first exercise, press the "Submit" button.
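The three building blocks named above (vectors, functions and operators, packages) can be illustrated with a few lines of base R. This is a small sketch with made-up numbers; the variable names are arbitrary and not taken from the course.

```r
# Vectors: c() combines values into a single vector
heights_cm <- c(172, 181, 165, 190)

# Operators work element-wise on the whole vector
heights_m <- heights_cm / 100

# Functions take vectors as input and compute results
avg_height <- mean(heights_cm)

# A logical comparison yields a logical vector, which can be used for subsetting
above_avg <- heights_cm[heights_cm > avg_height]

# Packages: every function lives in a package. stats ships with R, so the
# pkg::function form works here without installing anything.
med_height <- stats::median(heights_cm)

print(heights_m)
print(avg_height)
print(above_avg)
print(med_height)
```

Packages that do not ship with R are fetched from CRAN with `install.packages()` and attached with `library()`.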
Essential list of useful R packages for data scientists
Posted: 26 Apr 2020 11:40 PM PDT [This article was first published on R – TomazTsql, and kindly contributed to R-bloggers].

I have written a couple of blog posts on R packages (here | here), and this post is a sort of preset of the most needed packages for data science, statistics and everyday work with R.

There are thousands of R packages available on CRAN (with all its mirror sites), on GitHub and in developers' own repositories. Many useful functions are available in several different packages, and the same functionality often appears in more than one, so the choice of a particular package boils down to user preference and the work at hand. From the perspective of a statistician and data scientist, I will cover the essential and major packages in sections. This is by no means a definitive list, only a personal preference.

1. Loading and importing data

Loading and reading data into the R environment is most likely one of the first steps, if not the most important. Data is the fuel. This section is broken down further into reading data from binary files, from ODBC drivers and from SQL databases.

1.1. Importing from binary files

    # Reading from SAS and SPSS
    install.packages("Hmisc", dependencies = TRUE)
    # Reading from Stata, Systat and Weka
    install.packages("foreign", dependencies = TRUE)
    # Reading from KNIME
    install.packages(c("protr", "foreign"), dependencies = TRUE)
    # Reading from EXCEL
    install.packages(c("readxl", "xlsx"), dependencies = TRUE)
    # Reading from TXT, CSV
    install.packages(c("csv", "readr", "tidyverse"), dependencies = TRUE)
    # Reading from JSON
    install.packages(c("jsonlite", "rjson", "RJSONIO", "jsonvalidate"), dependencies = TRUE)
    # Reading from AVRO
    install.packages("sparkavro", dependencies = TRUE)
    # Reading from Parquet file
    install.packages("arrow", dependencies = TRUE)
    devtools::install_github("apache/arrow/r")
    # Reading from XML
    install.packages("XML", dependencies = TRUE)

1.2. Importing from ODBC

This will cover most of the work with ODBC drivers:

    install.packages(c("odbc", "RODBC"), dependencies = TRUE)

1.3. Importing from SQL Databases

Accessing a SQL database with a dedicated package can also have great benefits when pulling data from the database into an R data frame. In addition, I have added some useful R packages that make it much easier to query data in R (RSQL) or even let you write SQL statements directly (sqldf), along with other great features.

    # Microsoft MSSQL Server
    install.packages(c("mssqlR", "RODBC"), dependencies = TRUE)
    # MySQL
    install.packages(c("RMySQL", "dbConnect"), dependencies = TRUE)
    # PostgreSQL
    install.packages(c("postGIStools", "RPostgreSQL"), dependencies = TRUE)
    # Oracle
    install.packages("odbc", dependencies = TRUE)
    # Amazon
    install.packages("RRedshiftSQL", dependencies = TRUE)
    # SQLite
    install.packages(c("RSQLite", "sqliter", "dbflobr"), dependencies = TRUE)
    # General SQL packages
    install.packages(c("RSQL", "sqldf", "poplite", "queryparser"), dependencies = TRUE)
2. Manipulating Data

Data engineering, data copying, data wrangling and data manipulation are the very next tasks in the journey.

2.1. Cleaning data

Data cleaning is essential for dealing with outliers, NULL and N/A values and wrong values: imputing or replacing them, checking frequencies and descriptive statistics, and applying different uni-, bi- and multivariate statistical analyses to tackle the issue. The list is by no means complete, but it can be a good starting point:

    install.packages(c("janitor", "outliers", "missForest", "frequency", "Amelia",
                       "diffobj", "mice", "VIM", "Bioconductor", "mi",
                       "wrangle"), dependencies = TRUE)

2.2. Dealing with R data types and formats

Working with the correct data types and knowing your way around formatting your dataset is easily overlooked, yet important. A list of the must-have packages:

    install.packages(c("stringr", "lubridate", "glue",
                       "scales", "hablar", "readr"), dependencies = TRUE)

2.3. Wrangling, subsetting and aggregating data

There are many packages available for wrangling, engineering and aggregating. The {base} R package in particular should not be overlooked, since it offers a lot of great and powerful features. The following are the packages most widely used in the R community that make data easy to maneuver:

    install.packages(c("dplyr", "tidyverse", "purrr", "magrittr",
                       "data.table", "plyr", "tidyr", "tibble",
                       "reshape2"), dependencies = TRUE)
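As a reminder of what {base} R already covers before any of these packages are installed, here is a small sketch of subsetting, deriving a column and aggregating, using the built-in `mtcars` data. The choice of dataset, the column names `kpl` and the thresholds are illustrative assumptions, not part of the original post.

```r
# Work on a copy so the built-in dataset stays untouched
cars <- mtcars

# Subsetting: rows with more than 100 horsepower, three columns only
strong <- subset(cars, hp > 100, select = c(mpg, cyl, hp))

# Deriving a column: miles per gallon converted to kilometres per litre
cars <- transform(cars, kpl = mpg * 0.425144)

# Aggregating: mean fuel economy per number of cylinders
mpg_by_cyl <- aggregate(mpg ~ cyl, data = cars, FUN = mean)

# Counting: number of cars per cylinder/gear combination
counts <- table(cyl = cars$cyl, gear = cars$gear)

print(head(strong))
print(mpg_by_cyl)
print(counts)
```

The dplyr or data.table equivalents are usually shorter on larger pipelines, which is where the packages listed above earn their place.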
3. Statistical tests and Sampling Data

3.1. Statistical tests

Many statistical tests (Shapiro, T-test, Wilcox, equality, …) are available in the base and stats packages that ship with the R engine. That is great, because R is primarily a statistical language and many of the tests are already included. Here are some additional packages that I have used:

    install.packages(c("stats", "ggpubr", "lme4", "MASS", "car"), dependencies = TRUE)

3.2. Data Sampling

Data sampling, working with samples and populations, inference, weights, and the various types of statistical sampling can be found in these brilliant packages, including some that are great for survey data.

    install.packages(c("sampling", "icarus", "sampler", "SamplingStrata",
                       "survey", "laeken", "stratification", "simPop"), dependencies = TRUE)

4. Statistical Analysis

Regardless of the type of variable, the type of analysis, and the results a statistician wants to get, there is a list of packages that should be part of the daily R environment when it comes to statistical analysis.

4.1. Regression Analysis

Frankly, one of the most important analyses.

    install.packages(c("stats", "lars", "caret", "survival", "gam", "glmnet",
                       "quantreg", "sgd", "BLR", "MASS", "car", "mlogit", "earth",
                       "faraway", "nortest", "lmtest", "nlme", "splines",
                       "sem", "WLS", "OLS", "pls", "2SLS", "3SLS", "tree", "rpart"),
                     dependencies = TRUE)

4.2. Analysis of variance

Distribution and data dispersion are core to understanding the data. Many of the tests for variance are already built into the R engine (package stats), but here are some more that might be useful for analyzing variance.

    install.packages(c("caret", "rio", "car", "MASS", "FuzzyNumbers",
                       "stats", "ez"), dependencies = TRUE)

4.3. Multivariate analysis

Analysis that uses more than two variables is considered multivariate. Regression analysis and analysis of variance (between 2+ variables) are excluded here, since they were introduced in the previous sections. This section covers statistical analysis that works on many variables, such as factor analysis, principal axis component, canonical analysis, discrete analysis, and others:

    install.packages(c("psych", "CCA", "CCP", "MASS", "icapca", "gvlma", "smacof",
                       "MVN", "rpca", "gpca", "EFA.MRFA", "MFAg", "MVar", "fabMix",
                       "fad", "spBFA", "cate", "mnlfa", "CSFA", "GFA", "lmds", "SPCALDA",
                       "semds", "superMDS", "vcd", "vcdExtra"),
                     dependencies = TRUE)

4.4. Classification and Clustering

There are many packages covering the different types of clustering and classification. Some of the essential packages for clustering:

    install.packages(c("fpc", "cluster", "treeClust", "e1071", "NbClust", "skmeans",
                       "kml", "compHclust", "protoclust", "pvclust", "genie", "tclust",
                       "ClusterR", "dbscan", "CEC", "GMCM", "EMCluster", "randomLCA",
                       "MOCCA", "factoextra", "poLCA"),
                     dependencies = TRUE)

and for classification:

    install.packages(c("tree", "e1071"))

4.5. Analysis of Time-series

Analysing time series and time-series-type data is easier with the following packages:

    install.packages(c("ts", "zoo", "xts", "timeSeries", "tsModel", "TSMining",
                       "TSA", "fma", "fpp2", "fpp3", "tsfa", "TSdist", "TSclust", "feasts",
                       "MTS", "dse", "sazedR", "kza", "fable", "forecast", "tseries",
                       "nnfor", "quantmod"), dependencies = TRUE)

4.6. Network analysis

Analyzing networks is also part of statistical analysis. Some of the relevant packages:

    install.packages(c("fastnet", "tsna", "sna", "networkR", "InteractiveIGraph",
                       "SemNeT", "igraph", "NetworkToolbox", "dyads",
                       "staTools", "CINNA"), dependencies = TRUE)

4.7. Analysis of text

Besides analyzing open text, one can analyse any kind of text, including the word corpus, the semantics and much more. A couple of starting packages:

    install.packages(c("tm", "tau", "koRpus", "lexicon", "sylly", "textir",
                       "textmineR", "MediaNews", "lsa", "SemNeT", "ngram", "ngramrr",
                       "corpustools", "udpipe", "textstem", "tidytext", "text2vec"),
                     dependencies = TRUE)

5. Machine Learning

R has a variety of good machine learning packages that are powerful and cover the full machine learning cycle. The sections below follow that cycle in its natural order.

5.1. Building and validating the models

Once you have built one or more models and compared their results, it is also important to validate the models against the test set or other datasets. Here are some powerful packages for model validation.

    install.packages(c("tree", "e1071", "crossval", "caret", "rpart", "bcv",
                       "klaR", "EnsembleCV", "gencve", "cvAUC", "CVThresh",
                       "cvTools", "dcv", "cvms", "blockCV"), dependencies = TRUE)

5.2. Random forest packages

    install.packages(c("randomForest", "grf", "ipred", "party", "randomForestSRC",
                       "BART", "Boruta", "LTRCtrees", "REEMtree", "refr",
                       "binomialRF", "superml"), dependencies = TRUE)

5.3. Regression type (regression, boosting, gradient descent) algorithm packages

There are many regression-type machine learning algorithms, some with additional boosting or gradient descent. Some very usable packages:

    install.packages(c("earth", "gbm", "GAMBoost", "GMMBoost", "bst", "superml",
                       "sboost"), dependencies = TRUE)

5.4. Classification algorithms

Classification problems are covered by many packages, and many of them are also great for machine learning cases. A handful:

    install.packages(c("rpart", "tree", "C50", "RWeka", "klaR", "e1071",
                       "kernlab", "svmpath", "superml", "sboost"),
                     dependencies = TRUE)

5.5. Neural networks

There are many types of neural networks, and the many different packages cover all of them. Here are only a couple of very useful R packages for tackling neural networks.

    install.packages(c("nnet", "gnn", "rnn", "spnn", "brnn", "RSNNS", "AMORE",
                       "simpleNeural", "ANN2", "yap", "yager", "deep", "neuralnet",
                       "nnfor", "TeachNet"), dependencies = TRUE)

5.6. Deep Learning

R has embraced deep learning, and many of the powerful SDKs and packages have been ported to R, making them very usable for R developers and the R machine learning community.

    install.packages(c("deepnet", "RcppDL", "tensorflow", "h2o", "kerasR", "deepNN",
                       "Buddle", "automl"), dependencies = TRUE)

5.7. Reinforcement Learning

Reinforcement learning is gaining popularity, and more and more packages are being developed in R as well. Some very useful packages:

    devtools::install_github("nproellochs/ReinforcementLearning")
    install.packages(c("RLT", "ReinforcementLearning", "MDPtoolbox"), dependencies = TRUE)

5.8. Model interpretability and explainability

The results of machine learning models can be a black box. Many packages aim to turn the black box into more of a "glass box", making the models more understandable, interpretable and explainable. These are very powerful packages that do just that for many different machine learning algorithms.

    install.packages(c("lime", "localModel", "iml", "EIX", "flashlight",
                       "interpret", "outliertree", "breakDown"), dependencies = TRUE)
6. Visualisation

Visualisation of the data is not only the final step in understanding the data; it can also bring clarity to interpretation and help build the mental model around the data. A couple of packages that will help boost your visualizations:

    install.packages(c("ggvis", "htmlwidgets", "maps", "sunburstR", "lattice",
                       "predict3d", "rgl", "rglwidget", "plot3Drgl", "ggmap", "ggplot2", "plotly",
                       "RColorBrewer", "dygraphs", "canvasXpress", "qgraph", "moveVis", "ggcharts",
                       "igraph", "visNetwork", "visreg", "VIM", "sjPlot", "plotKML", "squash",
                       "statVisual", "mlr3viz", "klaR", "DiagrammeR", "pavo", "rasterVis",
                       "timelineR", "DataViz", "d3r", "d3heatmap", "dashboard", "highcharter",
                       "rbokeh"), dependencies = TRUE)

7. Web Scraping

Many R packages are specifically designed to scrape (harvest) data from a particular website, API or archive. Here are only a couple of very generic ones:

    install.packages(c("rvest", "Rcrawler", "ralger", "scrapeR"), dependencies = TRUE)

8. Documents and books organisation

For organizing your documents (files, code, packages, diagrams, pictures) into a readable document and presenting it as a dashboard or book view, there are a couple of packages:

    install.packages(c("devtools", "usethis", "roxygen2", "knitr",
                       "rmarkdown", "flexdashboard", "shiny",
                       "xtable", "httr", "profvis"), dependencies = TRUE)

Wrap up

The R script for loading and installing the packages is available at Github. Make sure to check the Github repository for the latest list updates. And as always, feel free to fork the code or commit updates, add essential packages to the list, comment, improve and agree or disagree.
You can also run the following command to install all of the packages in a single run:

    install.packages(c("Hmisc", "foreign", "protr", "readxl", "xlsx",
                       "csv", "readr", "tidyverse", "jsonlite", "rjson",
                       "RJSONIO", "jsonvalidate", "sparkavro", "arrow", "feather",
                       "XML", "odbc", "RODBC", "mssqlR", "RMySQL",
                       "dbConnect", "postGIStools", "RPostgreSQL",
                       "RSQLite", "sqliter", "dbflobr", "RSQL", "sqldf",
                       "poplite", "queryparser", "influxdbr", "janitor", "outliers",
                       "missForest", "frequency", "Amelia", "diffobj", "mice",
                       "VIM", "Bioconductor", "mi", "wrangle", "mitools",
                       "stringr", "lubridate", "glue", "scales", "hablar",
                       "dplyr", "purrr", "magrittr", "data.table", "plyr",
                       "tidyr", "tibble", "reshape2", "stats", "lars",
                       "caret", "survival", "gam", "glmnet", "quantreg",
                       "sgd", "BLR", "MASS", "car", "mlogit", "RRedshiftSQL",
                       "earth", "faraway", "nortest", "lmtest", "nlme",
                       "splines", "sem", "WLS", "OLS", "pls",
                       "2SLS", "3SLS", "tree", "rpart", "rio",
                       "FuzzyNumbers", "ez", "psych", "CCA", "CCP",
                       "icapca", "gvlma", "smacof", "MVN", "rpca",
                       "gpca", "EFA.MRFA", "MFAg", "MVar", "fabMix",
                       "fad", "spBFA", "cate", "mnlfa", "CSFA",
                       "GFA", "lmds", "SPCALDA", "semds", "superMDS",
                       "vcd", "vcdExtra", "ks", "rrcov", "eRm",
                       "MNP", "bayesm", "ltm", "fpc", "cluster",
                       "treeClust", "e1071", "NbClust", "skmeans", "kml",
                       "compHclust", "protoclust", "pvclust", "genie", "tclust",
                       "ClusterR", "dbscan", "CEC", "GMCM", "EMCluster",
                       "randomLCA", "MOCCA", "factoextra", "poLCA", "ts",
                       "zoo", "xts", "timeSeries", "tsModel", "TSMining",
                       "TSA", "fma", "fpp2", "fpp3", "tsfa",
                       "TSdist", "TSclust", "feasts", "MTS", "dse",
                       "sazedR", "kza", "fable", "forecast", "tseries",
                       "nnfor", "quantmod", "fastnet", "tsna", "sna",
                       "networkR", "InteractiveIGraph", "SemNeT", "igraph",
                       "dyads", "staTools", "CINNA", "tm", "tau", "NetworkToolbox",
                       "koRpus", "lexicon", "sylly", "textir", "textmineR",
                       "MediaNews", "lsa", "ngram", "ngramrr", "corpustools",
                       "udpipe", "textstem", "tidytext", "text2vec", "crossval",
                       "bcv", "klaR", "EnsembleCV", "gencve", "cvAUC",
                       "CVThresh", "cvTools", "dcv", "cvms", "blockCV",
                       "randomForest", "grf", "ipred", "party", "randomForestSRC",
                       "BART", "Boruta", "LTRCtrees", "REEMtree", "refr",
                       "binomialRF", "superml", "gbm", "GAMBoost", "GMMBoost",
                       "bst", "sboost", "C50", "RWeka",
                       "kernlab", "svmpath", "nnet", "gnn", "rnn",
                       "spnn", "brnn", "RSNNS", "AMORE", "simpleNeural",
                       "ANN2", "yap", "yager", "deep", "neuralnet",
                       "TeachNet", "deepnet", "RcppDL", "tensorflow", "h2o",
                       "kerasR", "deepNN", "Buddle", "automl", "RLT",
                       "ReinforcementLearning", "MDPtoolbox", "lime", "localModel",
                       "iml", "EIX", "flashlight", "interpret", "outliertree",
                       "dockerfiler", "azuremlsdk", "sparklyr", "cloudml", "ggvis",
                       "htmlwidgets", "maps", "sunburstR", "lattice", "predict3d",
                       "rgl", "rglwidget", "plot3Drgl", "ggmap", "ggplot2",
                       "plotly", "RColorBrewer", "dygraphs", "canvasXpress", "qgraph",
                       "moveVis", "ggcharts", "visNetwork", "visreg", "sjPlot",
                       "plotKML", "squash", "statVisual", "mlr3viz", "DiagrammeR",
                       "pavo", "rasterVis", "timelineR", "DataViz", "d3r", "breakDown",
                       "d3heatmap", "dashboard", "highcharter", "rbokeh", "rvest",
                       "Rcrawler", "ralger", "scrapeR", "devtools", "usethis",
                       "roxygen2", "knitr", "rmarkdown", "flexdashboard", "shiny",
                       "xtable", "httr", "profvis"), dependencies = TRUE)
Happy R-ing.
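A single `install.packages()` call over a list this long reinstalls packages that are already present and only warns about names the repositories do not offer. One possible alternative, sketched below, is to install just what is missing and to report the names that cannot be found. The helper names `missing_packages` and `install_missing` are hypothetical and not part of the post or of any package; the `wanted` vector is a toy example.

```r
# Hypothetical helper: which of the wanted packages are not installed yet?
missing_packages <- function(wanted,
                             installed = rownames(installed.packages())) {
  setdiff(unique(wanted), installed)
}

# Hypothetical helper: install only the missing packages that the configured
# repositories actually offer. Needs a network connection when called.
install_missing <- function(wanted, ...) {
  todo <- missing_packages(wanted)
  if (length(todo) == 0L) {
    message("Nothing to install.")
    return(invisible(character(0)))
  }
  offered <- rownames(available.packages())
  skipped <- setdiff(todo, offered)
  if (length(skipped) > 0L) {
    message("Not found in the configured repositories: ",
            paste(skipped, collapse = ", "))
  }
  ok <- intersect(todo, offered)
  if (length(ok) > 0L) {
    install.packages(ok, ...)
  }
  invisible(ok)
}

# Offline demonstration with an explicit 'installed' set
wanted <- c("stats", "dplyr", "dplyr", "shiny")
print(missing_packages(wanted, installed = c("stats", "shiny")))

# install_missing(wanted, dependencies = TRUE)  # uncomment to run; needs network
```

Because R package names are case-sensitive, a check like this also surfaces misspelled or wrongly capitalized names before a long installation run starts.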
#26: Upgrading to R 4.0.0
Posted: 26 Apr 2020 05:18 PM PDT [This article was first published on Thinking inside the box, and kindly contributed to R-bloggers].

Welcome to the 26th post in the rationally regularized R revelations series, or R4 for short.

R 4.0.0 was released two days ago, and a casual glance at some social media conversations appears to suggest quite some confusion, almost certainly some misunderstandings, and possibly also a fair amount of fear, uncertainty, and doubt about the process. So I thought I could show how I upgrade my own main workstation, live and in colour without a safety net. (Almost: I did upgrade my laptop yesterday which went swimmingly, if more slowly.)

So here is a fresh video about upgrading to R 4.0.0, with some support slides as usual. The slides used in the video are at this link.

A few quick follow-ups to the 'live' nature of this.

And as mentioned, if you are interested and have questions concerning use of R on a .deb based system like Debian or Ubuntu (or Mint or …), the r-sig-debian list is a very good and friendly place to ask them.

If you like this or other open-source work I do, you can now sponsor me at GitHub. For the first year, GitHub will match your contributions.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.
Get all your packages back on R 4.0.0 Posted: 26 Apr 2020 05:00 PM PDT [This article was first published on Johannes B. Gruber on Johannes B. Gruber, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here) Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
This can be rather cumbersome if you have collected a large number of packages on your machine while using After you made the update, first get your old packages:
Then you can find the packages previously installed but currently missing:
Once this is done, you can install your packages back: This can run for a while… Once the installations are done, you can check the missing packages again: If you've got all your packages back, Check if this grabbed the correct ones, then you can install them using For me, that was it.
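The steps in this post can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not necessarily the author's exact code: the old library path `old_lib` is a hypothetical placeholder to replace with your own, and `find_missing()` is a helper name introduced here purely for illustration.

```r
# Hypothetical path to the library used by the previous R version;
# replace it with the folder that actually exists on your machine.
old_lib <- "~/R/x86_64-pc-linux-gnu-library/3.6"

# Illustrative helper: packages that were installed before but are absent now.
find_missing <- function(old_pkgs, new_pkgs) {
  setdiff(old_pkgs, new_pkgs)
}

if (dir.exists(old_lib)) {
  # Row names of installed.packages() are the package names.
  old_pkgs <- rownames(installed.packages(lib.loc = old_lib))
  new_pkgs <- rownames(installed.packages())

  missing_pkgs <- find_missing(old_pkgs, new_pkgs)
  print(missing_pkgs)

  # Uncomment to reinstall from your default repository; this can take a while.
  # install.packages(missing_pkgs)
}
```

Running the comparison a second time after the installation shows which packages could not be restored from the default repository, for example those that originally came from GitHub.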
Updating to 4.0.0 on MacOS Posted: 26 Apr 2020 05:00 PM PDT [This article was first published on Posts on R Lover ! a programmer, and kindly contributed to R-bloggers].
Mixed emotions
Wow! Has it been a year? Another major update from The R The details are here in the old post I'm aware that there are full-fledged package So I set out to do the following:
Before you upgrade!
Let's load Note – I'm writing this after already upgrading so there will be a few inconsistencies in the output
A function to do the hard work
As I mentioned above the Stack Overflow post was a good start but I wanted more
What's in your libraries?
Now that we have the And just to be on the safe side we'll also write a copy out as a csv file so we
Go ahead and install R 4.0.0
At this point we have what we need, so go ahead and download and install R We'll start by getting the entire Now we have R 4.0.0 and some additional packages. Let's see what we can do.
Just do it!
Now that you have a nice automated list of everything that is a CRAN package you Depending on the speed of your network connection and the number of packages you That takes care of our CRAN packages. What about GitHub? Here's another chance Same with the one package I get from R-Forge… At the end of this process you should have a nice clean R install that has all
Done
Hope you enjoyed the post. Comments always welcomed. Especially please let
Chuck
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License
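The inventory step described in this post (record what is installed and where it came from, then save a CSV copy before upgrading) might look roughly like the sketch below. It is not the author's function: `classify_source()` and the file name are names made up here, and the rule for guessing the source from the `Repository` and `RemoteType` DESCRIPTION fields is an assumption that may not cover every installation method.

```r
# Illustrative helper: guess where a package came from using two
# DESCRIPTION fields. "Repository" is set on CRAN builds; "RemoteType"
# is typically written by remotes/devtools installs (assumption).
classify_source <- function(repository, remote_type) {
  ifelse(!is.na(repository) & repository == "CRAN", "CRAN",
         ifelse(!is.na(remote_type) & remote_type == "github", "GitHub",
                "Other"))
}

# Inventory of the current libraries, requesting the two extra fields.
ip <- installed.packages(fields = c("Repository", "RemoteType"))
rownames(ip) <- NULL  # avoid duplicate row names across several libraries
pkgs <- as.data.frame(ip, stringsAsFactors = FALSE)
pkgs$source <- classify_source(pkgs$Repository, pkgs$RemoteType)

# tempdir() keeps this sketch free of side effects; in practice choose a
# permanent location that survives the upgrade.
inventory_file <- file.path(tempdir(), "installed_packages.csv")
write.csv(pkgs[, c("Package", "Version", "source")], inventory_file,
          row.names = FALSE)

# After installing the new R version, read the list back and reinstall:
# old <- read.csv(inventory_file, stringsAsFactors = FALSE)
# install.packages(old$Package[old$source == "CRAN"])
```

Packages classified as "GitHub" or "Other" would still need to be reinstalled by hand from their original sources.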
An adventure in downloading books Posted: 26 Apr 2020 05:00 PM PDT [This article was first published on Anindya Mozumdar, and kindly contributed to R-bloggers]. Earlier today, I noticed a tweet from well-known R community member Jozef Hajnala. The tweet was about Springer releasing around 65 books related to data science and machine learning for free to download as PDFs. Following the link in his tweet, I learned that Springer has released 408 books in total, out of which 65 are related to the field of data science. The author of the blog post did a nice job of providing the links to the Springer website for each of these books. While browsing through a couple of the links, it appeared to me that the links are all well structured and it would be worth a try to write an R script to download all of the books. My first impulse was to use the rvest package. However, I was finding it hard to scrape the page on the "Towards Data Science" website as it is probably generated using JavaScript rather than plain HTML. After a few minutes of research, I discovered the Rcrawler package, which appeared to have some functions that would suit my needs. While I have heard of headless browsers before, this was my first experience using one. Rcrawler itself installs PhantomJS, with which one can mimic 'visiting' a web page using code. The LinkExtractor function from Rcrawler is a nice function which gives you the internal and external links present in a page. It also provides you with some general information on the page, which was useful to extract the name of each book. Given the well structured pages on the Springer website, it took some simple string manipulation to find a way to generate the link to the actual PDF of the book. After that, it was a simple call to the R function download.file.
As a result of this exercise, I also learned two new things
Overall, an hour of effort based on a tweet, and I learned a few things. I will most likely not have the time to read most or any of these books but at least it helped me learn some new stuff in R. Time well spent.
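The "simple string manipulation" followed by download.file can be sketched as below. The URL pattern is an assumption made for this example (the post does not spell it out), the DOI and title are placeholders, and both helper names are invented here.

```r
# Assumed URL pattern (not confirmed by the post): a book page such as
#   https://link.springer.com/book/<DOI>
# is taken to have its PDF at
#   https://link.springer.com/content/pdf/<DOI>.pdf
book_to_pdf_url <- function(book_url) {
  paste0(sub("/book/", "/content/pdf/", book_url, fixed = TRUE), ".pdf")
}

# Turn a book title into a file name without spaces or punctuation.
safe_file_name <- function(title) {
  paste0(gsub("[^A-Za-z0-9]+", "_", trimws(title)), ".pdf")
}

# Placeholder DOI and title, for illustration only.
book_url <- "https://link.springer.com/book/10.1007/978-0-000-00000-0"
pdf_url  <- book_to_pdf_url(book_url)
dest     <- safe_file_name("An Example Book: Second Edition")

print(pdf_url)
print(dest)

# Uncomment to download; mode = "wb" writes the file in binary mode,
# which matters for PDFs on Windows.
# download.file(pdf_url, destfile = dest, mode = "wb")
```

In a full script, the book URLs and titles would come from the link extraction step rather than being typed in by hand.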
ChemoSpecUtils Update Posted: 26 Apr 2020 05:00 PM PDT [This article was first published on R on Chemometrics & Spectroscopy using R, and kindly contributed to R-bloggers].
This brings to mind a Karl Broman quote I think about frequently:
Proofs without Words using gganimate Posted: 25 Apr 2020 05:00 PM PDT [This article was first published on R on Notes of a Dabbler, and kindly contributed to R-bloggers]. I recently watched the 2 part workshop (part 1, part 2) on ggplot2 and extensions given by Thomas Lin Pedersen. First off, it was really nice of Thomas to give the close to 4 hour workshop for the benefit of the community. I personally learnt a lot from it. I wanted to try out the gganimate extension that was covered during the workshop. There are several resources on the web that show animations/illustrations of proofs of mathematical identities and theorems without words (or close to it). I wanted to take a few of those examples and use gganimate to recreate the illustration. This was a fun way for me to try out gganimate.
Example 1:
This example is taken from AoPS Online and the result is that the sum of the first \(n\) odd numbers equals \(n^2\).
\[ 1 + 3 + 5 + \ldots + (2n - 1) = n^2 \]
The gganimate version of the proof (using the method in AoPS Online) is shown below (R code, html file).
Example 2:
This example is also taken from AoPS Online and the result is:
\[ 1^3 + 2^3 + \ldots + (n-1)^3 + n^3 = (1 + 2 + \ldots + n)^2 \]
The gganimate version of the proof (using the method in AoPS Online) is shown below (R code, html file).
Example 3:
This example from AoPS Online illustrates the result
\[ \frac{1}{2^2} + \frac{1}{2^4} + \frac{1}{2^6} + \frac{1}{2^8} + \ldots = \frac{1}{3} \]
The gganimate version of the proof (using the method in AoPS Online) is shown below (R code, html file).
Example 4:
According to the Pythagorean theorem,
\[ a^2 + b^2 = c^2 \]
where \(a\), \(b\), \(c\) are the sides of a right-angled triangle (with \(c\) being the side opposite the \(90^\circ\) angle). There was an illustration of the proof of the Pythagorean theorem in a video from echalk. The gganimate version of the proof is shown below (R code, html file).
In summary, it was great to use gganimate for these animations since it does all the magic with making transitions work nicely.
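As a quick numerical sanity check of the four identities (separate from the animated proofs, and not part of the original post), the following base R snippet evaluates each one for a small case. The choice of n = 10, 30 series terms, and the 3-4-5 triangle are arbitrary.

```r
n <- 10

# Example 1: the first n odd numbers sum to n^2.
odds <- seq(1, 2 * n - 1, by = 2)
ex1 <- sum(odds) == n^2

# Example 2: the sum of the first n cubes equals the square of 1 + ... + n.
ex2 <- sum((1:n)^3) == sum(1:n)^2

# Example 3: partial sums of 1/2^2 + 1/2^4 + ... approach 1/3.
k <- 1:30
ex3 <- isTRUE(all.equal(sum(1 / 2^(2 * k)), 1 / 3))

# Example 4: a 3-4-5 right triangle satisfies a^2 + b^2 = c^2.
ex4 <- 3^2 + 4^2 == 5^2

print(c(ex1 = ex1, ex2 = ex2, ex3 = ex3, ex4 = ex4))
```

Each entry should come out TRUE. The third check relies on a tolerance because the infinite series is only approximated by a finite partial sum.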
A package to download free Springer books during Covid-19 quarantine Posted: 25 Apr 2020 05:00 PM PDT [This article was first published on R on Stats and R, and kindly contributed to R-bloggers].
Introduction
You have probably already seen that Springer released about 500 books for free following the COVID-19 pandemic. According to Springer, these textbooks will be available free of charge until at least the end of July. Following this announcement, I already downloaded a couple of statistics and R programming textbooks from their website and I will probably download a few more in the coming weeks. In this article, I present a package that saved me a lot of time and which may be of interest to many of us: the This package allows you to easily download all (or a selection of) Springer books made available free of charge during the COVID-19 quarantine. With this large collection of high quality resources and my collection of top R resources about the Coronavirus, we do not have any excuse to not read and learn during this quarantine. Without further ado, here is how the package works in practice.
Installation
After having installed the
Download all books at once
First, set the path where you would like to save all books with the You will find all downloaded books (in PDF format) in a folder named "springer_quarantine_books", organized by category.
Create a table of Springer books
You can load into an R session a table containing all the titles made available by Springer, with the This table can then be improved with the
Download only specific books
By title
Now, say that you are interested in downloading only one specific book and you know its title. For instance, suppose you want to download the book entitled "All of Statistics": If you are interested in downloading all books with the word "Statistics" in the title, you can run:
By subject
You can also download all books covering a specific subject:
Acknowledgments
I would like to thank:
Thanks for reading. I hope this article will help you to download and read more high quality materials made available by Springer during this Covid-19 quarantine. As always, if you have a question or a suggestion related to the topic covered in this article, please add it as a comment so other readers can benefit from the discussion.
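The selection by title and by subject described above boils down to filtering a table of titles before handing the result to the download step. The sketch below shows only that filtering logic in base R on a toy table. The column names `book_title` and `subject`, and every row except "All of Statistics", are made up for illustration and are not the package's actual output.

```r
# Toy stand-in for the table of Springer titles (hypothetical columns/rows).
books <- data.frame(
  book_title = c("All of Statistics",
                 "Introduction to Statistics and Data Analysis",
                 "Linear Algebra"),
  subject    = c("Statistics", "Statistics", "Mathematics"),
  stringsAsFactors = FALSE
)

# Exact match on one known title.
one_book <- books$book_title[books$book_title == "All of Statistics"]

# Every title containing the word "Statistics".
by_title <- books$book_title[grepl("Statistics", books$book_title, fixed = TRUE)]

# Every title filed under a given subject.
by_subject <- books$book_title[books$subject == "Mathematics"]

print(one_book)
print(by_title)
print(by_subject)
```

The resulting character vector of titles is what would then be passed on to the package's download function.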