«University of California Los Angeles Bridging the Gap Between Tools for Learning and for Doing Statistics A dissertation submitted in partial ...»
The history not only makes the interactive system reproducible (see Section 4.8 for more about this requirement), it also allows users to move forward and backward through their history. In Figure 5.8, the cursor is indicating a move ‘back in time’ to the way the interface was when the bin width was set to 0.6. The interface also allows the user to rewrite history and see what the situation in the past would have looked like with a slight parameter tweak. There are obviously open issues surrounding how to deal with those alternate histories, but other researchers in the Communications Design Group are experimenting with novel methods to address those problems (Warth et al., 2011).
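The idea of a navigable, rewritable history can be sketched in a few lines. The following is a minimal illustration in Python, not LivelyR's actual implementation; the class name `InteractionHistory` and its methods are hypothetical. It assumes one simple policy for alternate histories: a change made while 'back in time' discards the states that came after it.

```python
from dataclasses import dataclass, field


@dataclass
class InteractionHistory:
    """Hypothetical linear history of interface states (parameter dicts)."""
    states: list = field(default_factory=list)
    cursor: int = -1  # index of the state currently shown

    def record(self, **params):
        """Store a new state; if the cursor is in the past, drop the old future."""
        previous = self.states[self.cursor] if self.cursor >= 0 else {}
        del self.states[self.cursor + 1:]
        self.states.append({**previous, **params})
        self.cursor = len(self.states) - 1

    def back(self):
        """Move the cursor one step toward the first recorded state."""
        self.cursor = max(self.cursor - 1, 0)
        return self.current()

    def forward(self):
        """Move the cursor one step toward the most recent state."""
        self.cursor = min(self.cursor + 1, len(self.states) - 1)
        return self.current()

    def current(self):
        return dict(self.states[self.cursor])


history = InteractionHistory()
for width in (1.0, 0.8, 0.6, 0.4):
    history.record(bin_width=width)

history.back()                 # cursor now points at bin_width 0.6
history.record(bin_width=0.5)  # rewrite history from that point onward
```

Keeping the discarded branch instead of deleting it would turn the list into a tree, which is one way to frame the open issue of alternate histories.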
5.3 Additional Shiny experiments
The goal of ‘interaction all the way down’ (Section 4.6) in a statistical programming tool does not always immediately present itself as a useful feature. However, given my experience teaching high school students and undergraduates, and leading professional development for high school teachers, I am able to see many possible use cases. Because of the effort it took to create LivelyR (again, mostly on the part of Lunzer), it was clear we needed a simpler way to mock up interface possibilities.
As a result, I have been creating a few experiments in Shiny to show some of the features that could be available in a full system complying with the requirements listed in Chapter 4.
5.3.1 Conditional visualizations
A very simple Shiny app I developed for a data visualization course I taught is shown in Figure 5.9. In the class, we were looking at a visualization about the saving habits of men and women, made by Wells Fargo (Harris Poll, 2014). In the original visualization, a donut chart is broken down into overall percentages of the surveyed population who saved a particular percent of their income. For example, 18% of those surveyed saved more than 10% of their income. However, in addition, conditional percentages were given by gender. For that same slice of the donut, it said “men - 26%” and “women - 9%.” Obviously, those numbers are for those groups individually, but upon first glance it seemed like the graphic was suggesting that 26% of the 18% were men and 9% women.
In order to help my students understand the difference between the two breakdowns, I created a series of graphics: one with the data broken first into saving categories, and then gender (5.9b), and the other breaking on gender first (5.9a). While this app will not win any awards for visualization style, it conveyed the point easily, and allowed my students to flip back and forth between the versions.
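The two breakdowns can also be compared numerically. The sketch below is in Python rather than R, and the counts are hypothetical: they were chosen only so that they reproduce the three published percentages (18% overall, 26% of men, 9% of women) and are not the survey data.

```python
# Hypothetical counts, chosen only so that they reproduce the published
# percentages (18% overall, 26% of men, 9% of women); not the survey data.
counts = {
    ("men", "saves >10%"): 234,
    ("men", "saves <=10%"): 666,
    ("women", "saves >10%"): 72,
    ("women", "saves <=10%"): 728,
}


def percent(part, whole):
    return round(100 * part / whole, 1)


def total(gender=None, category=None):
    """Sum counts, optionally restricted to one gender and/or one category."""
    return sum(n for (g, c), n in counts.items()
               if gender in (None, g) and category in (None, c))


high = "saves >10%"
overall = percent(total(category=high), total())

# Conditioning on gender: share of each gender that saves more than 10%.
by_gender = {g: percent(total(g, high), total(gender=g))
             for g in ("men", "women")}

# Conditioning on saving category: gender split within the high savers.
within_high = {g: percent(total(g, high), total(category=high))
               for g in ("men", "women")}
```

Under these assumed counts, `by_gender` recovers the 26% and 9% printed on the graphic, while `within_high` gives the quite different split that a reader might mistakenly take those numbers to mean.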
Interactive version available at https://ameliamn.shinyapps.io/ConditionalPercens
(b) Split on saving method first
Figure 5.9: Conditional percentages
5.3.2 Interaction plot with manipulable data cut point
Another example stuck out from a modeling course for which I was the teaching assistant. In the course, students were asked to prepare a linear model to predict API scores from a number of other factors about schools. The assignment asked students to discuss interaction effects in their final paper.
Because the data had many numeric variables and the students preferred the clarity of studying interaction plots when both variables were categorical, many groups chose to split a numeric variable into a categorical variable with two classes. Of course, the choice of cut point was somewhat arbitrary. Some groups chose the mean of their variable, others the median, while others chose a cutoff that they believed to be significant given some contextual knowledge.
However, because they were programming in R and the choice of cut point was so early in their analysis, many groups did not realize how sensitive their analysis was to their earlier parameter choice.
Difficulty understanding the impact of cut points is not unique to my students. Other researchers have observed similar behavior in students and teachers (Hammerman and Rubin, 2004; Rubin and Hammerman, 2006).
In order to display how a completely interactive system could help even intermediate and advanced analysts, I created a Shiny app which allows a user to pick a variable in the dataset to convert to a categorical variable, and then allows them to manipulate the cut point. Because Shiny uses a reactive framework, every time the cut point is manipulated the model output, interaction plot, and coefficient interpretation change. A view of this Shiny app is shown in Figure 5.10.
The app makes it very simple to see where the choice of cutpoint makes the interaction effect flip, and even provides a visual cue as to why that might be happening: the vastly different sizes of data in the two groups. However, this app had to be hand-coded by me, using the Shiny server/UI framework, so it is not something an introductory student could develop on their own.
In this context, API means Academic Performance Index, not to be confused with Application Programming Interfaces.
Interactive version available at https://ameliamn.shinyapps.io/InteractionPlot/
Figure 5.10: Shiny app demonstrating fragility of interaction based on cutpoint
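The sensitivity to the cut point can be reproduced with a toy calculation. The sketch below is in Python rather than R and uses invented data, not the course dataset. It relies on one known fact: when both predictors are binary and the model includes their interaction, the interaction coefficient equals the difference of differences in the four cell means. The function names are hypothetical.

```python
# Hypothetical toy data: (numeric predictor x, school type, outcome y).
# Invented for illustration; not the API data used in the course.
rows = (
    [(x, "A", y) for x, y in zip(range(1, 7), [10, 12, 14, 16, 18, 20])]
    + [(x, "B", y) for x, y in zip(range(1, 7), [10, 22, 22, 22, 22, 22])]
)


def cell_mean(cut, group, high):
    """Mean outcome for one group on one side of the cut point."""
    ys = [y for x, g, y in rows if g == group and (x > cut) == high]
    return sum(ys) / len(ys)


def interaction(cut):
    """Difference of differences in cell means after dichotomizing x at `cut`."""
    effect_a = cell_mean(cut, "A", True) - cell_mean(cut, "A", False)
    effect_b = cell_mean(cut, "B", True) - cell_mean(cut, "B", False)
    return effect_b - effect_a


def low_high_sizes(cut):
    """Number of observations below and above the cut point."""
    n_high = sum(1 for x, _, _ in rows if x > cut)
    return len(rows) - n_high, n_high


for cut in (1.5, 3.5, 4.5):
    print(cut, interaction(cut), low_high_sizes(cut))
```

With these invented numbers the interaction is positive at a cut of 1.5 and negative at 3.5 and 4.5, and at 1.5 the low group holds only 2 of the 12 observations, which mirrors the visual cue about unequal group sizes described above.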
5.4 Discussion of first forays
The work discussed in this chapter has helped to ease some of the transition between learning and doing, particularly for high school students associated with the Mobilize Project. However, the tools developed only hint at the richer interfaces that are possible. The experience of creating LivelyR prompted Lunzer and me to reconsider our choice of target technology, because the combination of R and Lively Web caused us to get locked in to particular interface choices too early. Instead, we should be attempting to create as many distinct possibilities as possible (as in a charrette) and using user studies to learn which are most successful. These experiences motivate my future work.
Conclusions, recommendations, and future work
Nearly 20 years after Rolf Biehler’s 1997 paper, “Software for learning and for doing statistics,” much of Biehler’s vision has been realized through the development of TinkerPlots and Fathom. These landscape-type tools for learning statistics and data analysis allow novices to jump into ‘doing’ statistics and experience the playful nature of the cycle of exploratory analysis, moving from questioning to analysis and back to questioning. However, new developments in computation and data analysis are beginning to trickle down into introductory material, and need better support. In particular, the drive for reproducible research has trickled down from science (Buckheit and Donoho, 1995) to introductory statistics (Baumer et al., 2014), and needs to be supported by tools for learning.
However, the movement of best practices is not simply from professional tools to tools for learning. In fact, tools like Fathom and TinkerPlots have strengths professional tools are sorely lacking, like support for interactive graphics, parameter manipulation, building new plot types, and integrated randomization. We can imagine a tool combining the strengths of both paradigms, eliminating the gap between tools for learning and tools for doing statistics.
In Chapter 3 we considered the strengths and weaknesses of tools currently on the market. While R has been gaining followers because of its strengths, like its status as a free and open source programming language, the R programming paradigm is very stilted and does not support exploratory analysis as well as it could. Projects like Shiny, RMarkdown and the IPython notebook are making it possible to combine textual programming languages with interactive and publishable results, but they typically provide either dynamic or interactive graphics, never graphics that are both dynamic and interactive.
Conversely, TinkerPlots and Fathom make it simple for everyone, including novices, to interact with their data. However, this interactivity comes with tradeoffs, particularly in terms of sharing results (proprietary file formats make it hard to share interactive results with others) and reproducibility (as there is no linear documentation of the analysis process).
In Chapter 4, we discussed the attributes necessary for a modern statistical programming tool bridging the gap between being a tool for learning and a tool for doing. These attributes include easy entry for novice users, data as a first-order persistent object, support for a cycle of exploratory and confirmatory analysis, flexible plot creation, full support for randomization, interactivity at every level, inherent visual documentation, simple support for narrative, publishing, and reproducibility, and the flexibility to build extensions. While there are efforts to move toward this ideal tool, no existing products satisfy all the requirements.
Chapter 5 describes a set of experiments I have undertaken in the space of closing the gap between tools for learning and tools for doing statistics. One component of this is curricular: high school level material developed through the NSF grant Mobilize, including data science units to be added to courses in computer science, Algebra I and Biology, and a freestanding course called Introduction to Data Science. As part of the Mobilize project I have also considered appropriate computational tools (R, Deducer, and R within RStudio), and developed additional functionality in the MobilizeSimple package.
In joint work with Aran Lunzer, I considered ways in which interactive tools would provide capabilities not possible using pen-and-paper, including an interaction history to make using an interface more reproducible. Finally, I created a few illustrative Shiny apps, to demonstrate the microworld capabilities of the system.
6.1 Current best practices
Because there are currently no tools containing all the attributes discussed in Chapter 4, it makes sense to consider the best practices using existing technology. As we have seen repeatedly throughout this work, there is a contrast between tools for learning and tools for doing. On the tools for learning side, Fathom appears to have the most features complying with the requirements. It has a low threshold, provides plenty of interactivity for analysis authors, and has a low price point. However, if Fathom is to be used in an introductory class, efforts must be made to scaffold its use toward the next tool. For example, the end of the semester could include a basic exploratory data analysis project in R.
For those more interested in starting students on a professional tool, but providing better ‘on-ramping’ to the tool, the use of R within RStudio is recommended. In addition, scoping decisions should be made to only introduce students to a small set of R commands and one unified syntax. In the Mobilize project, we have followed the lead of Project MOSAIC and have used the formula syntax, using mosaic, lattice graphics, and the additional tools available from the MobilizeSimple package. This package includes the integrated lab exercises described in Section 220.127.116.11, which allow students to move through structured activities without leaving the RStudio interface. This approach is similar to that taken by tools like swirl and DataCamp, but the Mobilize labs offer more space for creativity and inquiry by not locking students into a particular trajectory. R is a landscape-type tool, which does not specify any particular trajectory. The Mobilize labs provide more of a route through the material, while allowing for exploration around the edges.
As shown in Section 5.3, Shiny has potential as a tool for creating microworlds or minitools, allowing novices to explore within an environment built in a target language. However, while Shiny apps allow users to interact with data, they still suffer from a hard gap between using the interactive tool and using the target language (i.e., R).
6.2 Future work
None of the work presented in this dissertation is ‘the’ system, so my goal for the future is to begin building larger experimental prototypes in order to explore the possibilities. At present, I imagine a blocks-programming environment along the lines of Scratch, allowing novices to begin doing statistics and data analysis. However, underlying this environment would be a textual programming environment more like the target language (e.g., R).
The challenge is that there should be a tight coupling between the visual representation in the block and the language underneath it. It should also be possible to build up additional visual blocks to add to the system, and to share them with others. In other words, we want a bijection between visual blocks and textual programming, rather than an injection. If the blocks programming system is fixed and the only way to move forward is to write in the textual language, then the mapping from blocks to text is merely injective: every block has a textual equivalent, but some text has no corresponding block.
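The distinction can be made concrete with a small sketch. The following Python code is an illustration under stated assumptions, not a design for the proposed system: the `Block` class, the `REGISTRY` of block types, the one-line textual syntax, and every function name are hypothetical. It shows a round trip between a block and its text, and how text with no registered block fails to map back until a user extends the registry.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    """Hypothetical visual block: an operation name plus its argument values."""
    name: str
    args: tuple


# Registry of known block types, mapping a block name to its argument names.
REGISTRY = {"histogram": ("variable", "bin_width")}


def register_block(name, *arg_names):
    """Let a user add a new block type instead of falling back to text only."""
    REGISTRY[name] = arg_names


def to_text(block):
    """Render a block as one line of the (made-up) textual language."""
    return f"{block.name}({', '.join(block.args)})"


def from_text(line):
    """Parse a line back into a block; fails if no block type matches."""
    match = re.fullmatch(r"(\w+)\((.*)\)", line.strip())
    if match is None or match.group(1) not in REGISTRY:
        raise ValueError(f"no block for: {line!r}")
    raw = match.group(2)
    args = tuple(a.strip() for a in raw.split(",")) if raw else ()
    if len(args) != len(REGISTRY[match.group(1)]):
        raise ValueError(f"wrong number of arguments in: {line!r}")
    return Block(match.group(1), args)


block = Block("histogram", ("height", "0.6"))
round_trip = from_text(to_text(block))    # text maps back to the same block

register_block("shuffle", "variable")     # a user-defined extension
shuffled = from_text("shuffle(height)")   # now has a block counterpart
```

With a fixed registry, `to_text` is still one-to-one, but `from_text` rejects any line whose operation has no block, which is the injective case described above. Allowing the registry to grow is what moves the mapping toward a bijection.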
There are several components to be developed in order for this sort of system to work properly. The first is the domain-specific language to underlie the visual component. The challenge with developing these primitives is to make them descriptive enough to capture all the basic tasks necessary, while still providing the possibility to create new functionality. The language should be expressive enough to be used in many circumstances, but limited enough that it can be captured by the (small) working memory of humans, estimated at either 7 ± 2 or 4 items depending on whom you consult (Miller, 1955; Cowan, 2000; Shah and Hoeffner, 2002).
The second is the visual system itself. The interfaces of tools like TinkerPlots, Fathom, Data Desk, and JMP provide some inspiration, but as they do not capture a reproducible workflow or encourage integrated narrative, there are changes to be made. My future work will focus on paper prototyping in order to “get the design right and the right design” as Bill Buxton says (Buxton, 2007).
Once the design has been solidified, and the underlying language is clear, there is a challenge of implementation. As reactive programming in R gets more support through Shiny, I am hopeful implementation in R will be possible. However, it is likely additional computational components will need to be included to support all the functionality. For example, research and anecdotal experience suggest packages used through the browser are best for novices, because they remove many technical difficulties. However, in-browser support for data is very limited, and running on a centralized server can lead to delays.
6.3 Final words