University of California, Los Angeles
Bridging the Gap Between Tools for Learning and for Doing Statistics
A dissertation submitted in partial ...
Fathom (Figure 3.5) was developed by William Finzer as a tool for learning statistics (Finzer, 2002a). It was based on principles from Biehler (1997), and intended to allow students to play with statistical concepts in a more creative way.
Like Seymour Papert and his LOGO language, discussed in Section 2.3, the authors of Fathom and TinkerPlots wanted to design tools relevant to the way students think. Because Fathom is intended for slightly older users, it includes more features than TinkerPlots does.
Finzer was working from the perspective of a designer of educational software rather than an education researcher; he was solving a design problem rather than a research problem (Finzer, 2002b). He notes that the development process could have benefited from more input from education researchers, but that the resulting software has been reasonably useful regardless (Finzer, 2002b). He also mentions one of the largest challenges in developing educational software (or software in general): how to know whether it ‘works’ (Finzer, 2002b).
The design specs upon which Fathom is based include a focus on resampling, a belief that there should be no modal dialog boxes, the placement of controls outside the document proper, and animations to illustrate what is happening (Finzer, 2002b). Many of these ideas are brought forward again in Chapter 4.
Figure 3.5: Fathom version 2.13
The TinkerPlots graphical user interface (Figure 3.6) was designed by a team led by Clifford Konold, a psychologist focused on statistics education (Konold and Miller, 2005). TinkerPlots was built on Fathom’s infrastructure, but designed for younger students. TinkerPlots was developed the same year as the GAISE guidelines (Franklin et al., 2005), and the connection between the cognitive tasks TinkerPlots makes possible and the A and B levels of the guidelines is clear. TinkerPlots includes probability modeling, but no standard statistical models (e.g. linear regression). It supports students through the development of more abstract representations of data (discussed further in Section 4.4).
Users can develop their own simulations and link components together to see how changing elements in one area will impact the outcome somewhere else.
As mentioned in Section 2.2, TinkerPlots and Fathom have a large market share when it comes to teaching introductory statistics, both at the K-12 and university levels (Lehrer, 2007; Garfield and Ben-Zvi, 2008; Konold and Kazak, 2008; Watson and Fitzallen, 2010; Biehler et al., 2013; Finzer, 2013; Fitzallen, 2013; Mathews et al., 2013; Ben-Zvi, 2000; Garfield et al., 2002; Everson et al., 2008). Beyond their design principles, both tools owe part of their popularity to a reasonable pricing strategy, which made it possible for schools to afford licenses. However, while they are commonly used by forward-thinking educators, their market saturation cannot begin to compare with that of the TI calculators, which are familiar to nearly every high school student, particularly those taking the AP Statistics exam.
For educators who want to teach concepts like randomization and data-driven inference, the primary competitors at this level are applets. TinkerPlots and Fathom have a number of advantages over applets. Most importantly, they allow students to use whatever data they want, rather than demonstrating a concept on one locked-in data set. Both systems come with preloaded data sets, but it is easy to open other data and use it in the same way.
However, even with the popularity of TinkerPlots and Fathom in the statistics education community, their future was not always certain. McGraw Hill Education, the publishing company that owned and distributed TinkerPlots and Fathom, decided in 2013 to discontinue carrying the software. After withdrawing its corporate support, McGraw Hill returned the licenses to the tools’ original creators, Konold and Finzer, who provided temporary versions free of charge and, as of 2015, have found new publishers.
TinkerPlots will be distributed by Learn Troop, and Fathom by the Concord Consortium (the research group with which Finzer is associated).
While both TinkerPlots and Fathom provide good support for novices, there are drawbacks to using them, even in an educational context. All the arguments discussed in Section 2.3 about the gap between tools for learning and tools for doing apply. In particular, even though using these tools requires the use of a computer, they do not necessarily require students to be producers of statistics.
Instead, users can become focused on learning the interface to the software and the language surrounding the various buttons. Users may learn statistical concepts, but they are not developing any “computational thinking” or programming proficiency.
Fathom was developed in 2002, and TinkerPlots in 2005. In the ten years since their respective releases, statistical programming has moved forward in ways these tools have not. For example, while few would expect novices to be working with ‘big data’ in the truest sense of the term, TinkerPlots can only deal with data up to a certain size. A trial using a dataset with 12,000 observations and 20 variables caused considerable slowing, while larger datasets caused the program to hang indefinitely. Fathom dealt with the same data much more easily, but still showed a noticeable delay in loading and manipulating the data.
While both software packages allow for the inclusion of text in the workspace, there is no way to develop a data analysis narrative. The free-form workspace can feel creative, but it makes it nearly impossible to reproduce an analysis, even from an existing file. There is no easy way to publish results from these programs. The proprietary file types (.tp for TinkerPlots and .ftm for Fathom) require the associated software in order to be run interactively, and the only way to produce something viewable without the application is to ‘print’ the screen.
Again, because the software is closed-source, neither TinkerPlots nor Fathom is extensible in any way. What you see is what you get. This becomes particularly problematic when it comes to modern modeling techniques. For example, in the Introduction to Data Science class developed for high school students through the Mobilize grant (discussed in Section 5.1), students use classification and regression trees and perform k-means classification. Those methods are not available in either software package, and they cannot be added. In fact, TinkerPlots has no standard statistical models at all, which means it cannot be used for real data analysis tasks; it is truly only a tool for learning. Fathom, which is designed for slightly older students, does provide limited modeling functionality in the form of simple linear regression and multiple regression.
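To make the extensibility gap concrete, a method like k-means is only a short program in a language that supports it. The sketch below is a bare-bones one-dimensional version in Python; the function name and toy data are illustrative, not part of the Mobilize curriculum or of either software package.

```python
import random

def kmeans_1d(points, k, iters=20, seed=0):
    """Bare-bones k-means for 1-D data: assign each point to its nearest
    center, then move each center to the mean of its cluster."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)          # initialize centers from the data
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: (p - centers[i]) ** 2)
            clusters[nearest].append(p)
        # Recompute each center; keep the old center if a cluster is empty.
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return sorted(centers)

# Two well-separated groups; the centers should land near 0 and near 10.
data = [0.1, -0.2, 0.3, 0.0, 9.8, 10.2, 10.0, 9.9]
centers = kmeans_1d(data, 2)
```

A real analysis would of course use a library implementation, but the point stands: in an extensible environment this is a few lines, while in a closed tool it is simply unavailable.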
In the context of Clifford Konold’s argument that tools for learning should be completely separate from tools for doing (Konold, 2007), it makes sense that these tools have limits: they were consciously designed to be separate. However, given the capabilities of modern computing, it should be possible to provide this ground-up entry point while still supporting more extensibility. Through the attributes listed in Chapter 4, we will explore how this balance could be struck.
Applets pose the most direct competition to TinkerPlots and Fathom in introductory statistics courses. A statistics applet illustrates one concept through a specialized interactive web tool. Applets are highly accessible because they are hosted online, and they are free to use.
Some of the best applets were designed by statistics educators Alan Rossman and Beth Chance (Chance and Rossman, 2006). One of their applets (Figure 3.7) allows students to discover randomization by working through a scenario about randomly assigning babies at the hospital. The applet asks: if we randomly assign four babies to four homes, how often do they end up in the home to which they belong? The user can watch the randomization happen once, as a stork flies across the screen to deliver the babies to their color-coded homes, and then accelerate the illustration to see what the distribution would look like if the same experiment were tried many times. Students can use checkboxes to turn the animation on or off, or to see the theoretical probabilities. They can also try again with a different number of babies or a different number of trials.
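The question the applet animates is the classical matching problem, and the simulation behind it fits in a few lines. The sketch below, in Python with illustrative names, mirrors the applet’s repeated trials rather than its actual implementation.

```python
import random

def own_home_count(n_babies, rng):
    """Randomly deliver n babies to n homes; count how many land in their own home."""
    delivery = list(range(n_babies))
    rng.shuffle(delivery)
    return sum(baby == home for baby, home in enumerate(delivery))

rng = random.Random(2015)
trials = [own_home_count(4, rng) for _ in range(10000)]
average = sum(trials) / len(trials)
# A classical result: the expected number of matches is 1,
# no matter how many babies are being delivered.
```

Running many trials and looking at the distribution of match counts is exactly the accelerated mode of the applet; the single animated delivery is one draw from this simulation.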
Figure 3.7: A Rossman Chance applet demonstrating randomization

Another popular applet set was developed by the Lock5 group.
This set of applets is called StatKey and accompanies the textbook the Locks have authored (Lock et al., 2012; Morgan et al., 2014).
Figure 3.8 shows a StatKey applet for a bootstrap confidence interval for a mean. This example does not include an animation like the stork featured in the Rossman Chance example, but users can still specify how many samples they want to take, stepping through one sample at a time or accelerating the process by clicking the “generate 1000 samples” button.
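The resampling loop such an applet accelerates is itself brief. Below is a hedged sketch of a percentile bootstrap interval for a mean in Python; the data and function name are illustrative, not StatKey’s implementation.

```python
import random
import statistics

def bootstrap_ci_mean(data, n_samples=1000, alpha=0.05, seed=42):
    """Percentile bootstrap confidence interval for the mean:
    resample with replacement, record each resample's mean, and
    take the middle (1 - alpha) of the resulting distribution."""
    rng = random.Random(seed)
    means = []
    for _ in range(n_samples):
        resample = [rng.choice(data) for _ in data]  # sample with replacement
        means.append(statistics.mean(resample))
    means.sort()
    lo = means[int((alpha / 2) * n_samples)]
    hi = means[int((1 - alpha / 2) * n_samples) - 1]
    return lo, hi

data = [4.2, 5.1, 3.8, 6.0, 5.5, 4.9, 5.3, 4.4]  # toy sample
lo, hi = bootstrap_ci_mean(data)
```

Each pass through the loop corresponds to one click of “generate a sample” in the applet; `n_samples=1000` corresponds to the “generate 1000 samples” button.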
Figure 3.8: StatKey applet demonstrating a bootstrap confidence interval

Applets can be useful for students to learn distinct concepts like randomization, but they can also be frustrating when students want to do things just outside the scope of the applet. The Rossman and Chance applets include many concepts, but very few of them let users import their own data. The StatKey applets do allow users to edit the example data sets or upload entirely new data, but they are necessarily limited to what they were programmed to do.
Unfortunately, while StatCrunch does collect many of the best features of the tools it amalgamates, it also accumulates many of their negatives. For example, the lack of data sanctity mentioned in the spreadsheets section is certainly present here, as is the messy canvas associated with both spreadsheets and software like TinkerPlots and Fathom.

Figure 3.9: StatCrunch instructions for creating a bar plot
As mentioned in the introduction, Data Desk is the one tool that was considered both a statistical programming tool by De Leeuw (De Leeuw, 2009) and software for learning statistics by Biehler (Biehler, 1997). Data Desk (Figure 3.10) was introduced in 1985 and is still available today (Velleman, 1989). The program was developed to facilitate John Tukey’s exploratory data analysis, and represents one of the first uses of linked visualization (Wills, 2008).
The interface was clearly the inspiration for TinkerPlots and Fathom, and features a palette of tools as well as menu bars. However, Data Desk provides much richer functionality than either TinkerPlots or Fathom, including linear and nonlinear models, cluster analysis, and principal component analysis.

Figure 3.10: DataDesk version 7
The drawbacks of Data Desk are slight, and similar to those of TinkerPlots and Fathom: the interface looks outdated, only static versions of an analysis can be shared, and it does not promote the inclusion of text supporting the data ‘story.’ All these drawbacks notwithstanding, it remains an inspiring design.
R is a programming language for statistical computing (R Core Team, 2014). It is the tool of choice of academic statisticians, and has a growing market outside academia (Vance, 2009). Analysts at companies like Google routinely use R to perform exploratory data analysis and make models.
R has a number of benefits over the other tools we have discussed. First, it is free. When members of the open source community use the word “free,” they often distinguish between “free as in speech” and “free as in beer.” These phrases capture the difference between software that costs no money (e.g., most Google products) and software that is unrestricted and available for anyone to modify. R is free in both senses.
R is an interpreted language, as are the other languages commonly used for statistical computing. The interpreted paradigm means a user can type a piece of code, say mean(height), and see the result immediately. A program in an interpreted language is executed in the same way as a single line of code: the computer looks directly at the code, not at a compiled version of it.
So, while the education research on the acquisition of Java can be useful as inspiration, it cannot be used directly in a statistical computing context, due to the differences between compiled and interpreted languages.
R can be a very slow language to work with, as it reads all data into the computer’s working memory. Because of its lack of speed and a few other language features (a result of R having been developed by statistical practitioners rather than computer scientists), it is often criticized by programming-language experts. Even one of R’s original authors, Ross Ihaka, admits that R has drawbacks and suggests that a better language could be created by starting over (Ihaka, 2010).
But R provides such a flexible framework for statistical computing that it has virtually taken over academic statistics and is making inroads in the corporate world (Vance, 2009). Even programming-language researchers admit it has interesting and unique language features (Morandat et al., 2012); in particular, its lazy evaluation and lexical scoping are often singled out.
Lazy evaluation means that code is not evaluated until its result is requested, either by another function or by the user. Since R is a functional language, chained functional statements are often equivalent to a set of nested functions, f(g(h(x,y))), and the functions g() and h() will not be evaluated until the result of f() is asked for.
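R’s promise mechanism can be imitated with explicit thunks; the Python sketch below is an illustration only (the names are invented), showing that in a chain like f(g(h(x, y))) nothing is computed until the outermost result is forced.

```python
# A minimal sketch of lazy (call-by-need) evaluation using Python thunks.
# R's promises behave similarly: an expression is wrapped unevaluated,
# computed only when forced, and computed at most once.
evaluated = []

def make_promise(name, fn):
    """Wrap a computation so it runs only when forced, and only once."""
    cache = {}
    def force():
        if "value" not in cache:
            evaluated.append(name)   # record when evaluation actually happens
            cache["value"] = fn()
        return cache["value"]
    return force

# f(g(h(x, y))): build the whole chain without evaluating anything yet.
h = make_promise("h", lambda: 2 + 3)
g = make_promise("g", lambda: h() * 10)
f = make_promise("f", lambda: g() - 1)

assert evaluated == []   # nothing has run yet
result = f()             # forcing f() forces g(), which forces h()
assert result == 49
assert evaluated == ["f", "g", "h"]
```

The `force` closures also illustrate lexical scoping: each promise finds `cache` and `fn` in the environment where it was created, which is the other R feature programming-language researchers often point to.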