Jan 12, 9:26 AM: I’m heading off to Kissimmee, Florida for the RStudio Conference. From Hadley Wickham’s opening keynote today on Data Science in the Tidyverse through sessions on HTML widgets, general programming and more, I hope to have plenty of news, tips & tricks to share. I’ll be updating this blog throughout the conference, and I hope you’ll come back to see the latest!
Jan 13, 5:43 PM: The bsplus package is designed so you can get “more stuff in your Shiny app,” says creator Ian Lyttle. It wraps Bootstrap components, including accordion sidebar, carousel, tooltips, popover, help links and more. It was inspired by the shinyBS package, he said. Nothing in bsplus depends on the server part of Shiny; it’s all on the UI side, which means it will work in RMarkdown documents as well.
Jan 13, 5:34 PM: Karl Broman, professor at University of Wisconsin-Madison, has started a GitHub repo to collect links to conference presentation slides.
Jan 13, 5:21 PM: The ggedit package gives an interactive GUI for editing a ggplot2 graphic or theme — and then lets you see the code behind the change.
Jan 13, 5:12 PM: Friday afternoon lightning talk: rOpenSci packages to consider for your arsenal
- magick – R access to the ImageMagick image editing capability
- hunspell – spell check in R
- tesseract – gives R access to an optical character recognition engine
- travis and tic – tools to make it easy to work with Travis
Jan 13, 5:27 PM: The corrr package makes it easy to explore correlations in R by returning correlation results as a data frame for further analysis. The results can be piped, “pretty printed,” visualized and more.
Jan 13, 5:05 PM: The easymake package creates make files in R, so you don’t keep running code on data that hasn’t updated. It includes an RStudio add-in.
Jan 13, 4:07 PM: Julia Silge is presenting on tidy text mining. If you’re interested in text analysis in R and aren’t in the session, take a look at Tidy Text Mining in R.
Jan 13, 4:03 PM: Vectors don’t have to be atomic, notes Jenny Bryan in her presentation on list-cols. Vectors can be lists, too. So you can add a list to a data frame as a data frame column. Four skills to cultivate if you are adding such complex columns:
- inspect
- index
- compute
- simplify
You’ll be happier if it’s a tibble, she noted, but a data frame with a list-column is a valid data frame. Aside: The listviewer package has a nice HTML widget for viewing complex data. In general, though, you’re going to want to learn the purrr package if you want to work with list-columns, she said. She’s got a tutorial posted at https://jennybc.github.io/purrr-tutorial/.
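The four skills can be tried on a small example. This is a minimal base-R sketch, not code from the presentation; the data frame df and its columns are made up for illustration, and purrr offers friendlier equivalents of lapply() and vapply().

```r
# A data frame with a list-column (hypothetical data, for illustration only).
df <- data.frame(id = 1:3)
df$vals <- list(1:2, c(10, 20, 30), 5)

# Inspect: look at the structure of the list-column.
str(df$vals)

# Index: [[ ]] pulls one element out of the list-column.
second <- df$vals[[2]]

# Compute: apply a function to every element; the result is still a list.
sums <- lapply(df$vals, sum)

# Simplify: collapse to an atomic vector when every result has length one.
df$total <- vapply(df$vals, sum, numeric(1))
df$n <- lengths(df$vals)
```

After the last step, total and n are ordinary atomic columns sitting next to the list-column they were derived from.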
Jan 13, 3:26 PM: If you do nothing else, when you’re coding, think data first with your function arguments, advises IT security pro and R package author Bob Rudis. That makes your code pipe-friendly (as in %>%).
And a pipe group should be designed to do one thing.
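Here is a rough sketch of what data-first arguments buy you. The two helper functions are hypothetical, written only for this illustration, and the example assumes R 4.1 or later so it can use the native |> pipe without loading a package; magrittr's %>% reads the same way here.

```r
# Hypothetical helpers: each takes the data as its first argument
# and returns a data frame, so calls can be chained.
keep_rows_over <- function(data, column, threshold) {
  data[data[[column]] > threshold, , drop = FALSE]
}

add_ratio <- function(data, numerator, denominator, name = "ratio") {
  data[[name]] <- data[[numerator]] / data[[denominator]]
  data
}

# One pipe group, one job: horsepower per unit of weight for high-mpg cars.
result <- mtcars |>
  keep_rows_over("mpg", 30) |>
  add_ratio("hp", "wt", name = "hp_per_wt")
```

Because the data always comes first, no placeholder argument is needed anywhere in the chain.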
New to me: The httr package has a stop_for_status() function that converts HTTP errors to R errors (its companion, warn_for_status(), issues warnings instead). It’s a useful concept for other coding, Rudis said.
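The same idea can be sketched outside httr in base R. The function fail_on_status() below is a hypothetical stand-in written for this post, not httr's implementation; it only shows the pattern of turning a failing status code into a classed R error that callers can catch.

```r
# Hypothetical helper (not part of httr): raise an R error for failing codes.
fail_on_status <- function(code) {
  if (code >= 400) {
    # Build a condition with its own class so handlers can target it.
    cond <- structure(
      list(message = paste("Request failed with status", code), call = NULL),
      class = c("http_failure", "error", "condition")
    )
    stop(cond)
  }
  invisible(code)
}

# Callers handle the failure with tryCatch() instead of checking codes by hand.
outcome <- tryCatch(
  {
    fail_on_status(404)
    "ok"
  },
  http_failure = function(e) conditionMessage(e)
)
```

The custom class means a caller can handle this one kind of failure without swallowing unrelated errors.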
Jan 13, 2:39 PM: Do you want to pull data from APIs into R? RStudio’s Amanda Gadrow posted several useful (and commented) example scripts at https://github.com/ajmcoqui/webAPIsR.
Jan 13, 2:03 PM: New to me: Commoncrawl.org, a project to crawl the Web and build a repository of data “that can be accessed and analyzed by anyone.” After loading that project’s files into Spark, you can use the sparkwarc package to read them into R. A conference demo showed things like finding the most-used keywords and JavaScript libraries in a file with more than 100 million records. It’s an interesting way to analyze Web content. Presentation slides are at bit.ly/2ilaQmi.
Jan 13, 2:00 PM: sparklyr version 0.5 is now on CRAN, useful for those who work with R and Apache Spark data. There are several new functions and improved compatibility, according to a presentation Friday afternoon.
Jan 13, 1:53 PM: Do you work with databases in R? Some news from the RStudio conference this afternoon: The company plans for RStudio version 1.1 to include a tab with information about database connections, as well as a dialog box to easily re-establish previously used connections. You’ll also be able to view database drivers, tables and schemas currently available on your system.
Jan 13, 1:45 PM: An R database package called odbc is in the works, for connecting with databases using a DBI-compliant interface with ODBC drivers. It’s not yet on CRAN, but you can install it with devtools::install_github("rstats-db/odbc"). There already is an RODBC package for R, but odbc aims to be faster and provide things like native support for dates. A conference demo showed features like parameterized queries and adding SQL queries to RMarkdown documents and interactive Shiny apps. If you pull data from databases with R, it’s something you’ll likely want to investigate.
Jan 13, 12:20 PM: RStudio has two different packages for creating dashboards. flexdashboard is for people who already know (or are willing to learn) RMarkdown. shinydashboard is for people who know or are willing to learn the Shiny Web framework for R, which has a somewhat steeper learning curve.
Jan 13, 11:18 AM: What if you want to do something in Shiny that’s slightly outside of what reactivity does, such as a function that also returns a previous value? Shiny creator Joe Cheng said he’s working on a package currently called rxtools that “tries to wrap up some of those idioms” for those of us who don’t have a deep, under-the-hood knowledge of Shiny. This package is still under development, he warned, so don’t use it for any production work; and it will likely be renamed so as not to be confused with Microsoft reactivity. But meanwhile you should be able to find it on GitHub.
Jan 13, 11:02 AM: If you find yourself copying and pasting code in Shiny, stop and ask yourself if you should be using a reactive expression, Joe Cheng advises. If you’re not familiar with reactive expressions in Shiny, search for talks on this from the Shiny developer conference. Warning: Don’t just search for shiny videos. Those won’t get you what you want (and in fact may give you pages of porn results, he said).
Jan 13, 10:18 AM: Tutorial files for the Building Shiny Dashboards session can be installed with devtools::install_github("jcheng5/dashtutorial"). Then run dashtutorial::summon() to get exercise files.
Jan 13, 9:57 AM: tidyverse creator Hadley Wickham: “Importing data is either boring or horrifying. Exporting data is boring.” (On why he writes packages for data import but not export.)
Jan 13, 9:56 AM: Hadley was asked about concerns in the R community of potentially causing a rift between tidyverse lovers and tidyverse skeptics. “That is honestly not something I spend much time worrying about,” he said. “I worry about it a little bit,” he admitted, but he said he’s motivated by helping people get as far as they can in data analysis. He wants to create what he calls a “pit of success” – something people can easily fall into.
The tidyverse is a great place for people to start, he said, but he knows that “in order to do real work you need to go out of the tidyverse.”
Jan 13, 9:52 AM: It’s currently rather cumbersome to look at R lists and JSON data. Hadley said this is a problem RStudio wants to solve.
Jan 13, 9:40 AM: Hadley Wickham: A function should either compute something or do something. It should never do both.
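One way to read that rule, sketched in base R. The function names and the temporary file are made up for this illustration, and returning the input invisibly from the side-effect function is a common convention assumed here, not something stated in the talk.

```r
# Computes something: returns a value and has no side effects.
summarise_values <- function(x) {
  c(mean = mean(x), sd = sd(x))
}

# Does something: called for its side effect (writing a file).
# It hands its input back invisibly so it can still sit in a pipeline.
write_summary <- function(stats, path) {
  writeLines(paste(names(stats), format(stats), sep = ": "), path)
  invisible(stats)
}

stats <- summarise_values(c(2, 4, 6))
out_file <- tempfile(fileext = ".txt")
write_summary(stats, out_file)
```

Keeping the two apart means the calculation can be tested without touching the file system, and the file writing can be swapped out without touching the calculation.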
Jan 13, 9:26 AM: Do you like using %>% pipes in R? Wickham says R functions fit best into a pipe when:
- The first argument is the “data”
- The data is the same type across a family of functions
Hence, tidy data – again, the goal is solving complex problems by combining simple, uniform pieces.
Jan 13, 9:30 AM: Hadley Wickham: Tibbles are data frames that are lazy & surly. They make it easy to have a column that’s a list (a list-column), which gives you a way to keep related things together.
Jan 12, 1:37 PM: If you can’t make the session on linking HTML widgets with Crosstalk, either because you’re not at the conference or because you want to go to the text-mining session instead, presenter Joe Cheng told me that it will be similar to his Crosstalk session at useR. There’s a recording of that one on Microsoft’s Channel 9.
Jan 12, 9:03 AM: I didn’t go to pre-conference training sessions, but a few attendees who were there shared some tips on Twitter (the conference hashtag is #rstudioconf). One of the favorites:
“shiny developers try ?showReactLog to see reactivity graphs,” Phil Chapman tweeted from Joe Cheng’s Shiny session (Shiny is a Web framework for R), adding that he could already go home happy from the conference because of that useful advice.
Alex Whan added:
options(shiny.reactlog=TRUE)
WHERE HAVE YOU BEEN HIDING?
Jan 11, 5:03 PM: Pre-conference news: RStudio Connect, an enterprise publishing platform meant to make it easy to share R-generated analyses throughout an organization, has moved out of beta into production. For more info, see my story on RStudio Connect.