[R-bloggers] Looking back in 2017 and plans for 2018 (and 1 more aRticles)

Looking back in 2017 and plans for 2018

Posted: 30 Dec 2017 12:00 AM PST

(This article was first published on Marcelo S. Perlin, and kindly contributed to R-bloggers)

As we come close to the end of 2017, it's time to look back. This has
been a great year for me in many ways. This blog started as a way to
write short pieces about using R for finance and promote my
book in an organic way.
Today, I'm very happy with my decision. Discovering and trying new
writing styles keeps my interest very alive. Academic research is very
strict on what you can write and publish. It is satisfying to see that I
can promote my work and have an impact in different ways, not only
through the publication of academic papers.

My blog is built using a Jekyll template, meaning the whole
site, including individual posts, is built and controlled with editable
text files and Github. All files related to posts follow the same
structure, meaning I can easily gather the textual data and organize it
in a nice tibble. Let's first have a look at all post files:

post.folder <- '~/GitRepo/msperlin.github.io/_posts/'

my.f.posts <- list.files(post.folder, full.names = TRUE)
my.f.posts

##  [1] "/home/msperlin/GitRepo/msperlin.github.io/_posts//2017-01-15-First-post.md"
##  [2] "/home/msperlin/GitRepo/msperlin.github.io/_posts//2017-01-16-BatchGetSymbols.md"
##  [3] "/home/msperlin/GitRepo/msperlin.github.io/_posts//2017-01-17-predatory.md"
##  [4] "/home/msperlin/GitRepo/msperlin.github.io/_posts//2017-01-18-GetHFData.md"
##  [5] "/home/msperlin/GitRepo/msperlin.github.io/_posts//2017-01-19-CalculatingBetas.md"
##  [6] "/home/msperlin/GitRepo/msperlin.github.io/_posts//2017-01-30-Exams-with-dynamic-content.md"
##  [7] "/home/msperlin/GitRepo/msperlin.github.io/_posts//2017-02-05-R-and-Tennis.md"
##  [8] "/home/msperlin/GitRepo/msperlin.github.io/_posts//2017-02-06-My-Book-is-out.md"
##  [9] "/home/msperlin/GitRepo/msperlin.github.io/_posts//2017-02-10-Shiny_Exams.md"
## [10] "/home/msperlin/GitRepo/msperlin.github.io/_posts//2017-02-13-R-and-Tennis-Players.md"
## [11] "/home/msperlin/GitRepo/msperlin.github.io/_posts//2017-02-16-Writing-a-book.md"
## [12] "/home/msperlin/GitRepo/msperlin.github.io/_posts//2017-03-05-Prophet-and_stock-market.md"
## [13] "/home/msperlin/GitRepo/msperlin.github.io/_posts//2017-03-26-pmdR-exercises.md"
## [14] "/home/msperlin/GitRepo/msperlin.github.io/_posts//2017-05-04-pafdR-is-out.md"
## [15] "/home/msperlin/GitRepo/msperlin.github.io/_posts//2017-05-09-Studying-Pkg-Names.md"
## [16] "/home/msperlin/GitRepo/msperlin.github.io/_posts//2017-05-15-R-Finance.md"
## [17] "/home/msperlin/GitRepo/msperlin.github.io/_posts//2017-05-29-Update-GetHFData-1-3.md"
## [18] "/home/msperlin/GitRepo/msperlin.github.io/_posts//2017-06-01-Instaling-R-in-Linux.md"
## [19] "/home/msperlin/GitRepo/msperlin.github.io/_posts//2017-08-24-Reinstalling_R_Packages.md"
## [20] "/home/msperlin/GitRepo/msperlin.github.io/_posts//2017-08-24-Switching_to_Linux.md"
## [21] "/home/msperlin/GitRepo/msperlin.github.io/_posts//2017-09-04-Package-GetLattesData.md"
## [22] "/home/msperlin/GitRepo/msperlin.github.io/_posts//2017-09-10-Update-GetHFData-1-4.md"
## [23] "/home/msperlin/GitRepo/msperlin.github.io/_posts//2017-09-14-Brazilian-Yield-Curve.md"
## [24] "/home/msperlin/GitRepo/msperlin.github.io/_posts//2017-09-29-_Package-GetITRData.md"
## [25] "/home/msperlin/GitRepo/msperlin.github.io/_posts//2017-12-06-_Package-GetDFPData.md"
## [26] "/home/msperlin/GitRepo/msperlin.github.io/_posts//2017-12-13-_Serving-shiny-apps-internet.md"

I published 26 posts during 2017. Notice how all dates are at the beginning
of the file name. I can easily convert that to a Date object using
as.Date. Let's organize it all in a nice tibble.
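The date prefix can also be pulled out explicitly before parsing. The lines below are a minimal sketch in base R, assuming every file name starts with a ten-character YYYY-MM-DD date; the vector post.files is a hand-typed sample from the listing above, not the full folder.

```r
# Sample file names copied from the listing above (assumption: names
# always begin with a date written as YYYY-MM-DD).
post.files <- c("2017-01-15-First-post.md",
                "2017-02-06-My-Book-is-out.md",
                "2017-12-13-_Serving-shiny-apps-internet.md")

# The first 10 characters of each base name hold the date.
date.prefix <- substr(basename(post.files), 1, 10)

# An explicit format makes a malformed name come back as NA
# instead of being guessed at.
ref.date <- as.Date(date.prefix, format = "%Y-%m-%d")
ref.month <- format(ref.date, "%m")

print(ref.date)
print(ref.month)
```

With an explicit format, a file that does not follow the naming scheme shows up as NA in ref.date, which is easy to spot before building the tibble.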

library(tidyverse)

## ── Attaching packages ─────────────────────────────────────────────────────────────────────────── tidyverse 1.2.1 ──
## ✔ ggplot2 2.2.1     ✔ purrr   0.2.4
## ✔ tibble  1.4.1     ✔ dplyr   0.7.4
## ✔ tidyr   0.7.2     ✔ stringr 1.2.0
## ✔ readr   1.1.1     ✔ forcats 0.2.0
## ── Conflicts ────────────────────────────────────────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()

df.posts <- tibble(ref.date = as.Date(basename(my.f.posts)),
                   ref.month = format(ref.date, '%m'),
                   content = sapply(my.f.posts, function(x) paste0(readLines(x), collapse = '\n') ),
                   char.length = nchar(content)) %>%  # includes output code in length calculation..
  filter(ref.date > as.Date('2017-01-01') & ref.date < as.Date('2018-01-01') ) # not really necessary but keep it for future

glimpse(df.posts)

## Observations: 26
## Variables: 4
## $ ref.date     2017-01-15, 2017-01-16, 2017-01-17, 2017-01-18, 2...
## $ ref.month    "01", "01", "01", "01", "01", "01", "02", "02", "0...
## $ content      "---\nlayout: post\ntitle: \"My first post!\"\nsub...
## $ char.length  1734, 5833, 6632, 17265, 23414, 12974, 18899, 1779...

First, let's look at the frequency of posts by month:

print( ggplot(df.posts, aes(x = ref.month)) + geom_histogram(stat='count'))

## Warning: Ignoring unknown parameters: binwidth, bins, pad

It is not accidental that January was the month with the highest number
of posts. This is when I had material reserved for the book. June and
July (0!) were the worst months as I traveled a lot. In June I attended
R and Finance in Chicago and SER in Rio de Janeiro, and in July I was
visiting Goethe University in Germany for the whole month. On average, I
created about 2.17 posts per month (26 posts over 12 months), which feels
quite alright. I hope I can keep that pace for the upcoming years.
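A count over ref.month only shows months that have at least one post, so empty months are easy to miss. The lines below are a small sketch in base R of the monthly counts and the average, with the months typed in by hand from the file listing above rather than read from df.posts.

```r
# Month of each 2017 post, typed in from the file listing above.
ref.month <- c(rep("01", 6), rep("02", 5), rep("03", 2), rep("05", 4),
               "06", rep("08", 2), rep("09", 4), rep("12", 2))

# A factor with all twelve levels keeps months without posts in the table.
all.months <- sprintf("%02d", 1:12)
posts.by.month <- table(factor(ref.month, levels = all.months))
print(posts.by.month)

# Average number of posts over the twelve calendar months.
avg.posts <- length(ref.month) / 12
print(round(avg.posts, 2))
```

Converting ref.month to a factor with explicit levels before plotting would likewise keep the empty months on the x axis, provided the scale is told not to drop unused levels.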

As for the length of posts, below we can see a nice pattern for its
distribution conditional on the months of the year.

print(ggplot(df.posts, aes(x=ref.month, y = char.length)) + geom_boxplot())  

I was not very productive from May to August, writing few and short
posts compared to other months. This was probably due to my
travels.

Plans for 2018

Besides the usual effort in research and teaching, my plans for 2018
are:

  • Work on the second edition of the Portuguese book. It significantly
    lags the English version in content and this needs to be fixed. I
    already have some ideas laid out for new chapters and new packages
    to cover. I'll write more about this update as soon as I have it
    figured out.

  • Start a portal for financial data in Brazil. I want to make it
    easy for people to visualize and download organized financial data,
    especially those without programming experience. It will include the
    usual datasets such as prices in equity/bond/derivative markets for
    various frequencies, historical yield curves, financial statements
    of companies, and so on. The idea is to offer the datasets in
    various file formats, facilitating their use in research.

That's it. If you got this far, happy new year! Enjoy your family and the
holidays!

To leave a comment for the author, please follow the link and comment on their blog: Marcelo S. Perlin.

R-bloggers.com offers daily e-mail updates about R news and tutorials on topics such as: Data science, Big Data, R jobs, visualization (ggplot2, Boxplots, maps, animation), programming (RStudio, Sweave, LaTeX, SQL, Eclipse, git, hadoop, Web Scraping) statistics (regression, PCA, time series, trading) and more...


Time To Shine

Posted: 29 Dec 2017 04:00 PM PST

(This article was first published on HighlandR, and kindly contributed to R-bloggers)

Blogging and social media for introverts

How to spot an introvert

You may have seen David Robinson's recent post encouraging R users to start blogging. Some folk will willingly act on this advice, and others won't.

For those that won't, I know who you are.
You.
Yes, you, trying to hide at the back.
I can spot an introvert.

It's easy.

Simply say "we're all going to take part in a team bonding exercise" and watch to see whose eyes point to the floor while everyone else leaps to their feet.
How about "icebreakers"?

"Turn to the person next to you and introduce yourself, and tell them what you hope to get out of the day".

There then follows 2 excruciating minutes while your enthusiastic neighbour gushes about what a great day they're planning to have, while your internal monologue is trying to keep up with the correct ratio of eye contact, smiling, nodding, and generally appearing to be interested, while your sole aim for the day is to avoid taking part in any further activities like this.

2017-12-30-eyebrows.jpg

An easier way to spot an introvert?
For me, I just look in the mirror.

I've had the Myers-Briggs personality classification test a few times, and each time I come out as an introvert.
Last time I checked, I was down as an ISFJ.

Someone who doesn't like attention.
Someone who won't "blow their own trumpet".

But I blog.
I'm on social media.
I have 450 followers. Not a lot for some, but huge for me. I don't know that many folk in real life!

I've found my trumpet.
And I've blown it.
And survived.

"But John" you say, "this is all ridiculous".

2017-12-30-monica.gif

How can this be?

An introvert might start to think about blogging, and then put themselves off:

"I can't afford it"

Doesn't have to cost you anything. Jekyll & Github pages, or Hugo & Github pages, whatever takes your fancy. It's free. You only need to pay if you want a custom domain name, and to be honest, the cost involved for that is minimal over the course of a year – a pound / dollar (same thing nowadays?) a month.
Next!

"I don't have time"

I typically work flexibly between 9 am and 6 pm, then go through the rigmarole of my twins' bedtime & getting them to sleep. Sometimes it's 10 pm before I get a chance to think about doing anything related to blogging.
Usually this means I'm working until after midnight, and then off to sleep on a mattress on the floor of their room so that I'm there if they wake.
I've just spent 14 months doing this.
I don't recommend this at all – (If you have a young family, you may recognise this pattern) – but it's the only way I can get anything done.

You may be able to get up 1 hour earlier in the morning instead, and use that time for some writing.

You may just have to accept that you actually do have plenty of time.

But know that each blog post will take longer than you think, and that you will probably not be able to sit down and write one in a single sitting.
Expect to spread each post out over a few sessions, and that way you won't get frustrated if your current session gets cut short.

"I don't have anything of value to say".

I thought that.
But Mara Averick, the High Priestess of R herself, has highlighted a couple of my posts recently.
And other folk seemed to like them also, so maybe I was wrong.
And maybe you are too.
Someone, in a faraway nation, is going to read your stuff, and they're going to like it (shout out to my readers in Peru!)

Get your stuff out there, and find out who they are, and where they are.

Speaking of which – how will folk find out about you?
No one read my first few blog posts, because no one saw them.
I needed to change my mindset.

"I don't like self-promotion".

Of course you don't. You're an introvert.
But it's the way of the world.
Ask yourself this.
How much time, in your career to date, have you spent in the background, doing your job well and hoping that someone somewhere would notice?
How's that working out for you?
Take some positive action – you'll soon get used to it, and it will begin to feel normal.
Kind of.

2017-12-30-normal-person.gif

Which leads me to:

"I don't do social media"

Neither did I.
I avoided Twitter for years. Didn't see the point in it.
Now I love it.
Why?
Because all the Rstats folk are on there.

If you think that reading R-Bloggers every day is enough to keep you abreast of what's going on in the R world, you need to get on Twitter and prepare to have your eyes well and truly opened.

There's a discipline to Twitter (at least there was, until the whole 280 character thing). It's a challenge to communicate succinctly. You won't always get it right. But give it a go.

Things that I've done this year because of Twitter:

  1. Presented my work to other NHS analysts based in Edinburgh / Glasgow (because I got chatting to another analyst (Hi Joe) who shares my obsession with run charts – which he knew because he'd seen my tweets and discussions with the creator of the qicharts2 package).

  2. Presented a session for the QlikView Healthcare Dev group, hosted by none other than inspirational QlikView developer, Dalton Ruer. 20 minutes of talking very slowly (by my standards).
    But it seemed to go OK and I got some nice comments about my code. Result.

  3. Learned how to create animated plots through a chance conversation with Neil Pettinger – again – purely through some of my R related tweets. Creating animated plots was one of the items I had on my list of things to mess about with, but I would probably never have got round to it.
    Now I know how to use gganimate and the animation packages, and have some experience of magick, and learned some more purrr.

There are other ways of getting out there.

  • Submit your blog to R-Bloggers – getting accepted on there earlier on in the year was a major confidence boost for me.

  • Or submit to R Weekly – again, seeing your stuff on there is a good confidence boost.

  • Write an article on LinkedIn (if you can face it).
  • Or just tweet out images/ gifs of your work and see who reacts!

"I don't want people seeing my code – it's a mess"

Because of the nature of when I blog, sometimes, I just want the thing done, and published.
A lot of my early stuff would definitely fail to meet the tidyverse R style guide, but it's out there.
Am I proud of that?
No.
But it works.
As time goes on, I hope to get better, and hopefully, my more recent code examples do look a bit better (now that I've finally internalised the RStudio shortcuts for assignments and piping).

"People might criticise what I write".

They might, but they probably won't.
I find the rstats community to be very supportive.
Sure, you might get some suggestions for code improvements.
Or you might get pointed to a package you didn't know about.

But criticism is not a bad thing, as long as you react the right way.

2017-12-30-monica-i-suck.gif

I wrote a piece about my QlikView work on LinkedIn that drew a critical response from an expert.
Then a few other folk tried to piggyback on there to appear knowledgeable.

I think I responded the right way.
That post got over 11,000 views, more than my R blog has had in its entire duration.
It also got me more exposure among some important QlikView folk, so it did me a big favour.

"What if I'm wrong?"

Someone might tell you.

Take a deep breath:

2017-12-30-relaxed.jpg

You learn from it and move on to the next thing.

2017-12-30-rollins-tenacity.jpg

On a practical note, I do try and run my code from scratch in a clean R session before I upload it to GitHub or blog post. I like to make sure things work so that other folk can try it out.
This is a good habit to get into if you don't already do it.
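One way to do that check from R itself is sketched below, assuming the code for a post lives in a single script. The temporary file here is only a hypothetical stand-in for that script.

```r
# Hypothetical stand-in for the script behind a blog post.
script <- tempfile(fileext = ".R")
writeLines(c("x <- 1:10", "print(sum(x))"), script)

# Run it in a fresh R process. --vanilla skips the saved workspace,
# .Rprofile and .Renviron, so nothing from the current session leaks in.
rscript <- file.path(R.home("bin"), "Rscript")
status <- system2(rscript, c("--vanilla", shQuote(script)))

# system2() returns the exit status; 0 means the script ran to the end.
if (status != 0) stop("script failed in a clean session")
```

A non-zero status usually means the script leans on an object or a loaded package that only existed in the interactive session.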

I'm going to wrap this up now.

If I haven't convinced you, and you're still thinking that this is not for you:

Do it. Get to the keyboard and start blogging:

2017-12-30-rollins-do-it.jpg

See you in 2018

To leave a comment for the author, please follow the link and comment on their blog: HighlandR.
