[R-bloggers] How GPL makes me leave R for Python :-( (and 4 more aRticles) |
- How GPL makes me leave R for Python :-(
- Book review: Beyond Spreadsheets with R
- missing digit in a 114 digit number [a Riddler’s riddle]
- R Markdown Template for Business Reports
- December 2108: “Top 40” New CRAN Packages
How GPL makes me leave R for Python :-( Posted: 30 Jan 2019 10:08 PM PST (This article was first published on R-posts.com, and kindly contributed to R-bloggers)
But this turned out to be a slippery slope into the open-source code licensing field, which I wasn't really aware of before. Bottom line: legal advice was not to use R! Was it a single lawyer? No. The company was willing to "play along" with me, and we had a consultation with 4 different software lawyers, one after the other. GPL is not a permissive license. It is categorized as "strongly protective". In layman terms, if you build your work on a GPL program it may force you to license your product with a GPL license, too. In other words – it restrains you from keeping your code proprietary. Now you say – "This must be wrong", and "You just don't understand the license and its meaning", right? You may also mention that Microsoft and other big companies are using R, and provide R services. Well, maybe. I do believe there are ways to make your code proprietary, legally. But, when your software lawyers advise to "make an effort to avoid using this program" you do not brush them off
Now, for some details.As a private company, our code needs to be proprietary. Our core is not services, but the software itself. We need to avoid handing our source code to a customer. The program itself will be installed on a customer's server. Most of our customers have sensitive data and a SAAS model (or a connection to the internet) is out of the question. Can we use R? The R Core Team addressed the question "Can I use R for commercial purposes?". But, as lawyers told us, the way it is addressed does not solve much. Any GPL program can be used for commercial purposes. You can offer your services installing the software, or sell a visualization you've prepared with ggplot2. But, it does not answer the question – can I write a program in R, and have it licensed with a non-GPL license (or simply – a commercial license)? The key question we were asked was is our work a "derivative work" of R. Now, R is an interpreted programming language. You can write your code in notepad and it will run perfectly. Logic says that if you do not modify the original software (R) and you do not copy any of its source code, you did not make a derivative work. As a matter of fact, when you read the FAQ of the GPL license it almost seems that indeed there is no problem. Here is a paragraph from the Free Software Foundation https://www.gnu.org/licenses/gpl-faq.html#IfInterpreterIsGPL: If a programming language interpreter is released under the GPL, does that mean programs written to be interpreted by it must be under GPL-compatible licenses?(#IfInterpreterIsGPL) When the interpreter just interprets a language, the answer is no. The interpreted program, to the interpreter, is just data; a free software license like the GPL, based on copyright law, cannot limit what data you use the interpreter on. You can run it on any data (interpreted program), any way you like, and there are no requirements about licensing that data to anyone. Problem solved? Not quite. The next paragraph shuffles the cards: However, when the interpreter is extended to provide "bindings" to other facilities (often, but not necessarily, libraries), the interpreted program is effectively linked to the facilities it uses through these bindings. So if these facilities are released under the GPL, the interpreted program that uses them must be released in a GPL-compatible way. The JNI or Java Native Interface is an example of such a binding mechanism; libraries that are accessed in this way are linked dynamically with the Java programs that call them. These libraries are also linked with the interpreter. If the interpreter is linked statically with these libraries, or if it is designed to link dynamically with these specific libraries, then it too needs to be released in a GPL-compatible way. Another similar and very common case is to provide libraries with the interpreter which are themselves interpreted. For instance, Perl comes with many Perl modules, and a Java implementation comes with many Java classes. These libraries and the programs that call them are always dynamically linked together. A consequence is that if you choose to use GPLed Perl modules or Java classes in your program, you must release the program in a GPL-compatible way, regardless of the license used in the Perl or Java interpreter that the combined Perl or Java program will run on This is commonly interpreted as "You can use R, as long as you don't call any library".
Now, can you think of using R without, say, the Tidyverse package? Tidyverse is a GPL library. And if you want to create a shiny web app – you still use the Shiny library (also GPL). Assume you will purchase a shiny server pro commercial license, this still does not resolve the shiny library itself being licensed as GPL. Furthermore, we often use quite a lot of R libraries – and almost all are GPL. Same goes for a shiny app, in which you are likely to use many GPL packages to make your product look and behave as you want it to. Is it legal to use R after all?I think it is. The term "library" may be the cause of the confusion.
As Perl is mentioned specifically in the GPL FAQ quoted above, Perl addressed the issue of GPL licensed interpreter on proprietary scripts head on (https://dev.perl.org/licenses/ ): "my interpretation of the GNU General Public License is that no Perl script falls under the terms of the GPL unless you explicitly put said script under the terms of the GPL yourself. Furthermore, any object code linked with perl does not automatically fall under the terms of the GPL, provided such object code only adds definitions of subroutines and variables, and does not otherwise impair the resulting interpreter from executing any standard Perl script"
There may also be a hidden explanation by which most libraries are fine to use. As said above, it is possible the confusion is caused by the use of the term "library" in different ways. Linking/binding is a technical term for what occurs when compiling software together. This is not what happens with most R packages, as may be understood when reading the following question and answer: Does an Rcpp-dependent package require a GPL license? The question explains why (due to GPL) one should NOT use the Rcpp R library. Can we infer from it that it IS ok to use most other libraries? "This is not a legal advice"As we've seen, what is and is not legal to do with R, being GPL, is far from being clear. Everything that is written on the topic is also marked as "not a legal advice". While this may not be surprising, one has a hard time convincing a lawyer to be permissive, when the software owners are not clear about it. For example, the FAQ "Can I use R for commercial purposes?" mentioned above begins with "R is released under the GNU General Public License (GPL), version 2. If you have any questions regarding the legality of using R in any particular situation you should bring it up with your legal counsel". And ends with "None of the discussion in this section constitutes legal advice. The R Core Team does not provide legal advice under any circumstances."
In between the information is not very decisive, either. So at the end of the day, it is unclear what is the actual legal situation.
Another thing one of the software lawyers told us is that Investors do not like GPL. In other words, even if it turns out that it is legal to use R with its libraries – a venture capital investor may be reluctant. If true, this may cause delays and may also require additional work convincing the potential investor that what you are doing is indeed flawless. Hence, lawyers told us, it is best if you can find an alternative that is not GPL at all.
What makes Python better?Most of the "R vs. Python" articles are pure junk, IMHO. They express nonsense commonly written in the spirit of "Python is a general-purpose language with a readable syntax. R, however, is built by statisticians and encompasses their specific language." Far away from the reality as I see it.
But Python has a permissive license. You can distribute it, you can modify it, and you do not have to worry your code will become open-source, too. This truly is a great advantage. Is there anything in between a permissive license and a GPL?Yes there is. For example, there is the Lesser GPL (LGPL). As described in Wikipedia: "The license allows developers and companies to use and integrate a software component released under the LGPL into their own (even proprietary) software without being required by the terms of a strong copyleft license to release the source code of their own components. However, any developer who modifies an LGPL-covered component is required to make their modified version available under the same LGPL license." Isn't this exactly what the R choice of a license was aiming at? Others use an exception. Javascript, for example, is also GPL. But they added the following exception: "As a special exception to GPL, any HTML file which merely makes function calls to this code, and for that purpose includes it by reference shall be deemed a separate work for copyright law purposes. In addition, the copyright holders of this code give you permission to combine this code with free software libraries that are released under the GNU LGPL. You may copy and distribute such a system following the terms of the GNU GPL for this code and the LGPL for the libraries. If you modify this code, you may extend this exception to your version of the code, but you are not obligated to do so. If you do not wish to do so, delete this exception statement from your version."
R is not LGPL. R has no written exceptions.
The fact that R and most of its libraries use a GPL license is a problem. At the very least it is not clear if it is really legal to use R to write proprietary code. Even if it is legal, Python still has an advantage being a permissive license, which means "no questions asked" by potential customers and investors.
It would be good if the R core team, as well as people releasing packages, were clearer about the proper use of the software, as they see it. They could take Perl as an example.
It would be even better if the license would change. At least by adding an exception, reducing it to an LGPL or (best) permissive license.
Click HERE to leave a comment.
To leave a comment for the author, please follow the link and comment on their blog: R-posts.com. R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more... This posting includes an audio/video/photo media file: Download Now |
Book review: Beyond Spreadsheets with R Posted: 30 Jan 2019 04:00 PM PST (This article was first published on Shirin's playgRound, and kindly contributed to R-bloggers) Disclaimer: Manning publications gave me the ebook version of Beyond Spreadsheets with R – A beginner's guide to R and RStudio by Dr. Jonathan Carroll free of charge.
Aimed atThis book covers all the very basics of getting started with R, so it is useful to beginners but probably won't offer much for users who are already familiar with R. However, it does hint at some advanced functions, like defining specific function calls for custom classes. What stood out for me in the book (both positive and negative)
This is what is shown: This snippet about Non-Standard-Evaluation in the book is very short. Users who worked with
This is how it would work:
But he doesn't mention the tidy way to do this:
But doesn't show the tidy way. I guess this is because at this point in book the tidy way would be a bit too complicated, due to the "wrong" data format (wide vs. long). He does introduce the tidy principles of data analysis in the next chapters.
VerdictEven though I am not really the target audience for a book like that since I am not a beginner any more (I basically skimmed through a lot of the very basic stuff), it still offered a lot of valuable information! I particularly liked that the book introduces very important parts of learning R for practical and modern applications, such as using the
To leave a comment for the author, please follow the link and comment on their blog: Shirin's playgRound. R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more... This posting includes an audio/video/photo media file: Download Now |
missing digit in a 114 digit number [a Riddler’s riddle] Posted: 30 Jan 2019 03:19 PM PST (This article was first published on R – Xi'an's Og, and kindly contributed to R-bloggers) A puzzling riddle from The Riddler (as Le Monde had a painful geometry riddle this week): this number with 114 digits 530,131,801,762,787,739,802,889,792,754,109,70?,139,358,547,710,066,257,652,050,346,294,484,433,323,974,747,960,297,803,292,989,236,183,040,000,000,000 is missing one digit and is a product of some of the integers between 2 and 99. By comparison, 76! and 77! have 112 and 114 digits, respectively. While 99! has 156 digits. Using WolframAlpha on-line prime factor decomposition code, I found that only 6 is a possible solution, as any other integer between 0 and 9 included a large prime number in its prime decomposition: However, I thought anew about it when swimming the next early morning [my current substitute to morning runs] and reasoned that it was not necessary to call a formal calculator as it is reasonably easy to check that this humongous number has to be divisible by 9=3×3 (for else there are not enough terms left to reach 114 digits, checked by lfactorial()… More precisely, 3³³x33! has 53 digits and 99!/3³³x33! 104 digits, less than 114), which means the sum of all digits is divisible by 9, which leads to 6 as the unique solution.
To leave a comment for the author, please follow the link and comment on their blog: R – Xi'an's Og. R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more... |
R Markdown Template for Business Reports Posted: 30 Jan 2019 01:31 AM PST (This article was first published on INWT-Blog-RBloggers, and kindly contributed to R-bloggers) In this post I'd like to introduce the R Markdown template for business reports by INWTlab. It's been my aim to have a nice and clean template that is easy to customize in colors, cover and logo. I know there are quite a few templates available, but I was missing one to be used in a corporate environment. That is, I want to have a logo included and a cover page that also can be styled in a corporate design. In many companies nowadays MS Word is still THE reporting tool, so the overall look and feel is loosly oriented at MS Word defaults. In addition, this template can be extended by hacking the tex file defs.tex. In my eyes a tex file is way easier to hack – especially for not so experienced latex users – than style files (.sty). InstallationYou can install the package from the INWTlab Github Repository ireports. If you haven't installed devtools yet, please do that first. Once the package is installed, you can already use the template in its default version by selecting R Markdown… > From Template > INWTlab Business Report and then clicking OK. As result a new Rmd script pops up and you can knit it right away. This is what the default report looks like: UsageText & GraphicsYou can write text and add graphics just like you would in any other R Markdown document. TablesIn order to generate tables in MS Word style you can use the example code snippet below. It renders a table from the iris data set. This code snippet consists of three parts:
So, when you create a new table, all you have to do is to create the xtable object tab in the first row, making sure that the align has the right number of columns specified. CustomizationLogo and CoverWhen you create a new R Markdown file from the template, RStudio creates a new folder in your working directory (check with getwd()). You can overwrite logo.png and cover.png in this folder with your own files. Keep in mind to either name your files exactly like mine or to correct the file names in the yaml header. The recommended dimension for the logo is 350 x 130 pixels. The cover should have 1654 x 2339 pixels. In the case you want to use other dimensions, you need to hack the tex file defs.tex in order to make it look nice. ColorsThere are currently two colors in use, namely iblue and igray. If you'd like to adopt your corporate colors, change the hexadecimal color values for iblue and igray in the yaml header. CollaborationYou're welcome to help improve this template, just open an issue on GitHub or send a pull request.
To leave a comment for the author, please follow the link and comment on their blog: INWT-Blog-RBloggers. R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more... This posting includes an audio/video/photo media file: Download Now |
December 2108: “Top 40” New CRAN Packages Posted: 29 Jan 2019 04:00 PM PST (This article was first published on R Views, and kindly contributed to R-bloggers) By my count, 157 new packages stuck to CRAN in December. Below are my "Top 40" picks in ten categories: Computational Methods, Data, Finance, Machine Learning, Medicine, Science, Statistics, Time Series, Utilities and Visualization. This is the first time I have used the Medicine category. I am pleased that a few packages that appear to have clinical use made the cut. Also noteworthy in this month's selection are the inclusion of four packages from the Microsoft Azure team (stuffing 41 packages into the "Top 40"), and some eclectic, but fascinating packages in the Science section. Computational Methodsar.matrix v0.1.0: Provides functions that use precision matrices and Choleski factorization to simulates auto-regressive data. The README offers examples. mvp v1.0-2: Provides functions for the fast symbolic manipulation polynomials. See the vignette and this R Journal paper for details on how to create this image of the Rosenbrock function. pomdp v0.9.1: Provides an interface to Datadbparser v1.0.0: Provides a tool for parsing the DrugBank XML database. The vignette shows how to get started. rdhs v0.6.1: Implements a client querying the DHS API to download and manipulate survey datasets and metadata. There are introductions to using rdhs and the rdhs client, an extended example about Anemia prevalence, and vignettes on Country Codes, Interacting with the geojson API results, and Testing. Financeoptionstrat v1.0.0: Implements the Black-Scholes-Merton option pricing model to calculate key option analytics and graphical analysis of various option strategies. See the vignette. riskParityPortfolio v0.1.1: Provides functions to design risk parity portfolios for financial investment. In addition to the vanilla formulation, where the risk contributions are perfectly equalized, many other formulations are considered that allow for box constraints and short selling. The package is based on the papers: Feng and Palomar (2015), Spinu (2013), and Griveau-Billion et al.(2013). See the vignette for an example. Machine LearningBTM v0.2: Provides functions to find ParBayesianOptimization v0.0.1: Provides a framework for optimizing Bayesian hyperparameters according to the methods described in Snoek et al. (2012). There are vignettes on standard and advanced features. MedicineLUCIDus v0.9.0: Implements the metaRMST v1.0.0: Provides functions that use individual patient-level data to produce a multivariate meta-analysis of randomized controlled trials with the difference in restricted mean survival times ( RMSTD ). webddx v0.1.0: Implements a differential-diagnosis generating tool. Given a list of symptoms, the function SciencebioRad v0.4.0: Provides functions to extract, visualize, and summarize aerial movements of birds and insects from weather radar data. There is an Introduction and a vignette on Exercises. pmd v0.1.1: Implements the paired mass distance analysis proposed in Yu, Olkowicz and Pawliszyn (2018) for gas/liquid chromatography–mass spectrometry. See the vignette for an introduction. tabula v1.0.0: Provides functions to examine archaeological count data and includes several measures of diversity. There are vignettes on Diversity Measures, Matrix Classes, and Matrix Seriation. This last vignette includes an example reproducing the results of Peeples and Schachner (2012). traitdataform v0.5.2: Provides functions to assist with handling ecological trait data and applying the Ecological Trait-Data Standard terminology described in Schneider et al. (2018). waterquality v0.2.2: Implements over 45 algorithms to develop water quality indices from satellite reflectance imagery. The vignette introduces the package. Statisticsareal v0.1.2: Implements areal weighted interpolation with support for multiple variables in a workflow that is compatible with the FLAME v1.0.0: Implements the Fast Large-scale Almost Matching Exactly algorithm of Roy et al. (2017) for causal inference. Look at the README to get started. mistr v0.0.1: Offers a computational framework for mixture distributions with a focus on composite models. There is an Introduction and a vignette on Extensions. mlergm v0.1: Provides functions to estimate exponential-family random graph models for multilevel network data, assuming the multilevel structure is observed. There is a Tutorial. MTLR v0.1.0: Implements the Multi-Task Logistic Regression (MTLR) proposed by Yu et al. (2011). See the vignette. mulitRDPG v1.0.1: Provides functions to fit the Multiple Random Dot Product Graph Model and performs a test for whether two networks come from the same distribution. See Nielsen and Witten (2018) for details. ocp v0.1.0: Implements the Bayesian online changepoint detection method of Adams and MacKay (2007) for univariate or multivariate data. Gaussian and Poisson probability models are implemented. The vignette provides an introduction. probably v0.0.1: Provides tools for post-processing class probability estimates. See the vignettes Where does probability fit in? and Equivocal Zones. smurf v1.0.0: Implements the SMuRF algorithm of Devriendt et al. (2018) to fit generalized linear models (GLMs) with multiple types of predictors via regularized maximum likelihood. See the package Introduction. subtee v0.3-4: Provides functions for naive and adjusted treatment effect estimation for subgroups. Proposes model averaging Bornkamp et al. (2016) and bagging Rosenkranz (2016) to address the problem of selection bias in treatment effect estimation for subgroups. There is a Introduction and vignettes for the plot and subbuild functions. xspliner v0.0.2: Provides functions to assist model building using surrogate black-box models to train interpretable spline based, additive models. There are vignettes on Basic Theory and Usage, Automation, Classification, Use Cases, Graphics, Extra Information, and the xspliner Environment. Time Seriesmfbvar v0.4.0: Provides functions for estimating mixed-frequency Bayesian vector autoregressive (VAR) models with Minnesota or steady-state priors as those used by Schorfheide and Song (2015), or by Ankargren, Unosson and Yang (2018). Look at the GitHub page for an example. NTS v1.0.0: Provides functions to simulate, estimate, predict, and identify models for nonlinear time series. UtilitiesAzureContainers v1.0.0: Implements an interface to container functionality in Microsoft's AzureRMR v1.0.0: Implements lightweight interface to the Azure Resource Manager REST API. The package exposes classes and methods for AzureStor v1.0.0: Provides tools to manage storage in Microsoft's AzureVM v1.0.0: Implements tools for working with virtual machines and clusters of virtual machines in Microsoft's cliapp v0.1.0: Provides functions that facilitate creating rich command line applications with colors, headings, lists, alerts, progress bars, and custom CSS-based themes. See the README for examples. projects v0.1.0: Provides a project infrastructure with a focus on manuscript creation. See the README for the conceptual framework and an introduction to the package. remedy v0.1.0: Implements an RStudio Addin offering shortcuts for writing in solartime v0.0.1: Provides functions for computing sun position and times of sunrise and sunset. The vignette offers an overview. Visualizationeasyalluvial v0.1.8: Provides functions to simplify Alluvial plots for visualizing categorical data over multiple dimensions as flows. See Rosvall and Bergstrom (2010). See the README for details. spatialwidget v0.2: Provides functions for converting R objects, such as simple features, into structures suitable for use in transformr v0.1.1: Provides an extensive framework for manipulating the shapes of polygons and paths and can be seen as the spatial brother to the tweenr package. See the README for details.
To leave a comment for the author, please follow the link and comment on their blog: R Views. R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more... This posting includes an audio/video/photo media file: Download Now |
You are subscribed to email updates from R-bloggers. To stop receiving these emails, you may unsubscribe now. | Email delivery powered by Google |
Google, 1600 Amphitheatre Parkway, Mountain View, CA 94043, United States |
Comments
Post a Comment