[R-bloggers] The First Programming Design Pattern in pxWorks (and 4 more aRticles)
- The First Programming Design Pattern in pxWorks
- Video: How to Scale Shiny Dashboards
- Hack: The ‘[‘ in R lists
- BASIC XAI with DALEX — Part 1: Introduction
- Hack: The “count(case when … else … end)” in dplyr
The First Programming Design Pattern in pxWorks Posted: 18 Oct 2020 04:44 AM PDT
[This article was first published on gtdir, and kindly contributed to R-bloggers].

First of all, we need to explain a few things in more detail.

(Re)Introduction

pxWorks is an open source programming platform that, among other things, lets you build programs as graphs of code blocks written in R, Python, Julia, and other languages, in any mixture.
Programming Logic (Control Flow) in pxWorks

Any computer program can be represented by a graph. In pxWorks, graph nodes represent operations and graph edges represent the direction of control flow. To enable loops without complicating the logic, the platform uses just two types of connections: unconditional and conditional.

The simplest program is one that uses only unconditional connections, represented on the canvas by grey lines. In such a graph, each node that has inputs waits until all the code blocks feeding those inputs have been processed. Conditional connections, represented by magenta lines, make it possible to introduce loops into the control flow. A node with both unconditional and conditional input connections executes its code when either all the unconditionally connected nodes have been (re)calculated, or at least one conditionally connected node has been (re)calculated and has generated an input file. So with unconditional links, the dependent block is triggered regardless of whether an input file is generated for it (hence the execution is unconditional); with conditional links, the block is triggered only on condition that an input file has been generated by the earlier block. Even more details on this subject can be found here.

The First Design Pattern: Heartbeat

Before proceeding any further, you might want to get the example file here. (To run the example, unzip it and open it in pxWorks.)

The first and simplest use case is periodic retrieval (and processing) of some data using R, Python, Julia, etc., or any mixture of these. We will use R. To implement this design pattern we need a block that initiates the control flow, let's call it 'init', and a heartbeat block, which is simply a script that generates an output file and passes the control flow back to its own input socket. There is no need to generate the file every time, but for simplicity we will regenerate it on every run, so the heartbeat block will keep running perpetually and will trigger the scripts in dependent blocks. To stop the heartbeat block, the generated file must be deleted and the script must stop generating it.

In further posts, we will demonstrate other design patterns we use in our data analysis workflow. This first example already shows how simple it is to introduce programming logic using just two types of connections to model the control flow, rather than the multiple types of blocks used in some other platforms. Things become much simpler: instead of thinking about program architecture, one is free to think about the data, as programming complexity vanishes.
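The post ships the heartbeat script inside the example file rather than inline, so here is a minimal sketch of what such a block could look like in R. The file name heartbeat.csv and the timestamp payload are assumptions for illustration, not conventions taken from the post.

# Heartbeat block: (re)generate an output file on every run.
# In pxWorks, this file feeding a conditional (magenta) connection back
# into the block's own input socket is what keeps the loop alive.
heartbeat_file <- "heartbeat.csv"

# Writing a timestamp makes each beat observable to dependent blocks.
write.csv(
  data.frame(beat_at = format(Sys.time(), "%Y-%m-%d %H:%M:%S")),
  heartbeat_file,
  row.names = FALSE
)

# To stop the heartbeat: delete heartbeat.csv and disable the write above.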
Video: How to Scale Shiny Dashboards Posted: 18 Oct 2020 02:15 AM PDT
[This article was first published on r – Appsilon Data Science | End to End Data Science Solutions, and kindly contributed to R-bloggers]. This presentation was part of a joint virtual webinar by Appsilon and RStudio entitled "Enabling Remote Data Science Teams". Find a direct link to the presentation here.

How to Scale a Shiny App to Hundreds of Users

In this video, Appsilon's VP of the Board & Co-Founder Damian Rodziewicz explains best practices for scaling Shiny applications in production. Damian covers the three areas Appsilon focuses on when scaling Shiny applications: Frontend Leveraging, Extracting Computations, and creating a stable and scalable Architecture. R Shiny applications are fast by default but can become extremely slow if they are not properly built, especially when tens or hundreds of people use them. Having best practices in mind from the beginning of a project can save you a lot of trouble down the line.
Vertical and Horizontal Scaling

If you intend to scale your Shiny app, there are two concepts to explore: vertical scaling and horizontal scaling. It's best to start with proper vertical scaling: first make sure the application is fast and robust while running on a single machine, and then you can add as many machines as you want in an efficient way (horizontal scaling). With this in mind, let's return to the three previously mentioned areas: Leveraging Frontend, Extracting Computations, and Setting the Architecture. Below is a quick rundown of each area, but please refer to the video presentation for the full explanation. Above all, it's important to Make the Shiny Layer Thin. This means that Shiny should only be doing the work that it's best at: creating an interface between R and your browser. The rest of the work (such as interactivity or long computations) should be offloaded to the browser or handled by the database, etc.

Leverage Frontend
Extract Computations
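The specifics of each area are covered in the video. Purely as an illustration of the "extract computations" idea (this code is not from the talk), here is a minimal sketch that moves a slow computation into a background R process with the future and promises packages, so the Shiny process stays responsive; slow_summary() is a hypothetical stand-in for real work.

library(shiny)
library(promises)
library(future)
plan(multisession)  # run futures in background R sessions

# Hypothetical stand-in for an expensive computation (e.g. a model fit).
slow_summary <- function(n) {
  Sys.sleep(5)  # simulate heavy work
  summary(rnorm(n))
}

ui <- fluidPage(
  numericInput("n", "Sample size", value = 1e5),
  verbatimTextOutput("res")
)

server <- function(input, output, session) {
  output$res <- renderPrint({
    n <- input$n  # read reactive inputs before entering the future
    future_promise(slow_summary(n)) %...>% print()
  })
}

shinyApp(ui, server)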
Architecture
Learn more
Appsilon is an RStudio Full Service Certified Partner. We are global leaders in Shiny and we specialize in advanced enterprise Shiny apps for Fortune 500 companies. Reach out to us at hello@appsilon.com.
Hack: The ‘[‘ in R lists Posted: 17 Oct 2020 08:43 PM PDT
[This article was first published on R – Predictive Hacks, and kindly contributed to R-bloggers]. Assume that you have a list and you want to get the n-th element of each component, or more generally to subset the list. You can use sapply together with the "[" function. Let's see this in practice.

mylist <- list(id <- 1:10,
               gender <- c("m","m","m","f","f","f","m","f","f","f"),
               amt <- c(5,20,30,10,20,50,5,20,10,30))
mylist

Output:

> mylist
[[1]]
 [1]  1  2  3  4  5  6  7  8  9 10

[[2]]
 [1] "m" "m" "m" "f" "f" "f" "m" "f" "f" "f"

[[3]]
 [1]  5 20 30 10 20 50  5 20 10 30

(Note that <- inside list() produces an unnamed list, which is why the components print as [[1]], [[2]], [[3]]; use = instead if you want named elements.)

Let's say that we want to get the 3rd and 6th element of each component of the list:

sapply(mylist, "[", c(3,6))

Output:

     [,1] [,2] [,3]
[1,] "3"  "m"  "30"
[2,] "6"  "f"  "50"
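A caveat worth adding (not from the original post): sapply() simplifies the result, so the numeric and character components above are coerced into a single character matrix. If you want each component to keep its original type, use lapply() instead:

lapply(mylist, "[", c(3,6))

Output:

[[1]]
[1] 3 6

[[2]]
[1] "m" "f"

[[3]]
[1] 30 50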
BASIC XAI with DALEX — Part 1: Introduction Posted: 17 Oct 2020 08:24 PM PDT
[This article was first published on R in ResponsibleML on Medium, and kindly contributed to R-bloggers].

BASIC XAI

Introduction to model exploration with code examples for R and Python

Hello! Welcome to the "BASIC XAI with DALEX" series. In this post, we will take a closer look at some algorithms used in explainable artificial intelligence. You will find here an introduction to methods of global and local model evaluation. Each description will include a technical introduction, an example analysis, and code in R and Python. So, shall we start?

First — why should I use XAI?

Nowadays, the quick-and-dirty approach to developing a predictive model is to try a large number of different ML algorithms and choose the single result that maximizes some validation criterion. This often results in complex models called black boxes. Why? Sometimes these elastic algorithms find models with greater predictive power, sometimes they detect tricky relationships between variables, and sometimes all models perform similarly but the more complex ones are simply selected more often. But there is a price to pay in this quick-and-dirty scheme: when we choose complex yet elastic models, we often lose their interpretability. To understand the decisions made by a trained model, algorithms and tools are being developed to help human experts understand how models work. There are plenty of methods developed under the explainable artificial intelligence (XAI) umbrella that can be used to explain or explore complex models.

Second — which to choose: global vs local?

A growing number of explanation tools are emerging because different stakeholders have different needs. Global explanations are those that describe model behavior on the whole data set; they allow us to deduce how the model behaves generally/usually/on average. Local explanations, on the other hand, refer to a single prediction, to the specific client/property/patient on which the model operates. Usually, local explanations show which variables contribute to the model prediction and how. These differences are shown in the XAI pyramid below: the left part of the pyramid corresponds to the assessment of a single observation, and the right part to the whole model. We can ask various questions about the model. On the left are questions related to a specific prediction; on the right are questions about the model in general. From the top, we start with more general questions that can be answered with one or a few numbers, like the predictive performance of the model (which can be summarised with a single number such as AUC or RMSE), or the prediction value for a single observation (a single number). The following levels refer to more and more specific methods, which we will discuss in this BASIC XAI series.

Third — let's get a model in R and Python

In this example, we will use the apartments dataset (collected in Warsaw, available in the DALEX package in R and Python). The data set describes 1000 apartments with six variables: surface, floor, no.rooms, construction.year, m2.price, and district. We will create a model that predicts the price of an apartment, so let's start with a black-box regression model: a random forest. The package that we will use in these examples is DALEX.
Below we have the code in Python and R that transforms the data, builds the model, and creates the explainer (an R version is sketched below). The explainer is an object/adapter that wraps the model and creates a uniform structure and interface for operations on it. If you want, you can use ready-made explainer objects prepared by us, which you can find here. In the next part, we will learn about a method for global variable importance: Permutational Variable Importance. Many thanks to Przemyslaw Biecek and Jakub Wiśniewski for their support on this blog. If you are interested in other posts about explainable, fair, and responsible ML, follow #ResponsibleML on Medium.
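As a minimal sketch of that setup in R, following the standard DALEX workflow: the apartments and apartments_test data frames ship with the DALEX package, while the choice of ranger as the random forest implementation and the explainer label are assumptions, not necessarily the authors' exact code.

library(DALEX)   # provides apartments, apartments_test, and explain()
library(ranger)  # a fast random forest implementation (assumed choice)

# Black-box regression model: predict the price per square meter.
apartments_rf <- ranger(m2.price ~ ., data = apartments)

# The explainer wraps the model in a uniform interface for XAI methods.
explainer_rf <- explain(
  model = apartments_rf,
  data  = apartments_test[, -1],     # features only; m2.price is column 1
  y     = apartments_test$m2.price,  # true targets, used to assess performance
  label = "Random Forest"
)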
Hack: The “count(case when … else … end)” in dplyr Posted: 17 Oct 2020 08:13 PM PDT
[This article was first published on R – Predictive Hacks, and kindly contributed to R-bloggers]. When I run queries in SQL (or even HiveQL, Spark SQL, and so on), it is quite common to use the syntax count(case when … then … else … end) and sum(case when … then … else … end). Let's see how to reproduce it in dplyr. Let's start:

library(sqldf)
library(dplyr)

df <- data.frame(id = 1:10,
                 gender = c("m","m","m","f","f","f","m","f","f","f"),
                 amt = c(5,20,30,10,20,50,5,20,10,30))
df

Let's get the counts and amounts by gender with sqldf:

sqldf("select count(case when gender='m' then id else null end) as male_cnt,
              count(case when gender='f' then id else null end) as female_cnt,
              sum(case when gender='m' then amt else 0 end) as male_amt,
              sum(case when gender='f' then amt else 0 end) as female_amt
       from df")

Output:

  male_cnt female_cnt male_amt female_amt
1        4          6       60        140

Let's get the same output in dplyr. We will need to subset each column based on the gender condition.

df %>% summarise(male_cnt   = length(id[gender=="m"]),
                 female_cnt = length(id[gender=="f"]),
                 male_amt   = sum(amt[gender=="m"]),
                 female_amt = sum(amt[gender=="f"]))

Output:

  male_cnt female_cnt male_amt female_amt
1        4          6       60        140
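A small addition (not from the original post): the counts can also be written without subsetting, because summing a logical vector in R counts its TRUE values, which mirrors count(case when …) even more directly:

df %>% summarise(male_cnt   = sum(gender == "m"),
                 female_cnt = sum(gender == "f"),
                 male_amt   = sum(amt[gender == "m"]),
                 female_amt = sum(amt[gender == "f"]))

This returns the same output as above.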