(2015). Discovering the Effects of Metacognitive Prompts on the Sequential Structure of SRL‐Processes Using Process Mining Techniques. Journal of Learning Analytics, 2(1), 72–100. Discovering the Effects of Metacognitive Prompts on the Sequential Structure of SRL-Processes Using Process Mining Techniques Christoph Sonnenberg and Maria Bannert University of Wuerzburg, Instructional Media, Germany christoph.sonnenberg@uni‐wuerzburg.de ABSTRACT: According to research examining self‐regulated learning (SRL), we regard individual regulation as a specific sequence of regulatory activities. Ideally, students perform various learning activities, such as analyzing, monitoring, and evaluating cognitive and motivational aspects during learning. Metacognitive prompts can foster SRL by inducing regulatory activities, which, in turn, improve the learning outcome. However, the specific effects of metacognitive support on the dynamic characteristics of SRL are not understood. Therefore, the aim of our study was to analyze the effects of metacognitive prompts on learning processes and outcomes during a computer‐based learning task. Participants of the experimental group (EG, n=35) were supported by metacognitive prompts, whereas participants of the control group (CG, n=35) received no support. Data regarding learning processes were obtained by concurrent think‐aloud protocols. The EG exhibited significantly more metacognitive learning events than did the CG. Furthermore, these regulatory activities correspond positively with learning outcomes. Process mining techniques were used to analyze sequential patterns. Our findings indicate differences in the process models of the EG and CG and demonstrate the added value of taking the order of learning activities into account by discovering regulatory patterns. 
KEYWORDS: self‐regulated learning, metacognitive prompting, process analysis, process mining, think‐aloud data, HeuristicsMiner algorithm 1 INTRODUCTION Recent research in the field of self‐regulated learning (SRL) has moved to a process‐orientated or event‐based view to investigate how learning processes unfold over time and how scaffolds influence the dynamic nature of regulatory activities. Two recent special issues indicate the importance of investigating sequential and temporal patterns in learning processes and present new methodological contributions for the analysis of time and order in learning activities (Martin & Sherin, 2013; Molenaar & Järvelä, 2014). Technical advances allow the recording of learning‐related behaviour on a very detailed level and largely unobtrusively for learners (e.g., Azevedo et al., 2013; Winne & Nesbit, 2009). As such, researchers have focused more on behavioural process data and less on measures of aptitude (Azevedo, 2009; Bannert, 2009; Veenman, van Hout‐Wolters, & Afflerbach, 2006). When focusing on process data, differences among learners are explained on the event level with respect to regularities and patterns (Winne & Perry, 2000), allowing researchers to gain new insights into the process of learning. ISSN 1929‐7750 (online). The Journal of Learning Analytics works under a Creative Commons License, Attribution ‐ NonCommercial‐NoDerivs 3.0 Unported (CC BY‐NC‐ND 3.0). Process analysis methods beyond the variable‐centred coding and counting approach (Kapur, 2011) can provide valuable information on the specific effects of scaffolds (e.g., metacognitive prompts) and are able to inform researchers about how to optimize an applied supporting strategy further (e.g., Jeong et al., 2008; Johnson, Azevedo, & D’Mello, 2011; Molenaar & Chiu, 2014). 
Moreover, findings on the sequential and temporal structure of SRL processes can provide knowledge for the development of SRL theories on the micro‐level (Molenaar & Järvelä, 2014). Our approach applies the techniques of process mining (Trčka, Pechenizkiy, & van der Aalst, 2010) on process data obtained by concurrent think‐aloud protocols (Ericsson & Simon, 1993). For example, we have compared process patterns of students with high versus low learning performance in a recent study (Bannert, Reimann, & Sonnenberg, 2014) and demonstrated that process mining techniques can reveal differences in the sequential patterns of regulatory processes. Now, we are investigating the effects of metacognitive prompts (Bannert, 2009) by means of an in‐depth analysis using process mining techniques. An analysis of differences in the process models between students supported by metacognitive prompts and students without prompts can provide information on how to promote beneficial regulatory patterns and thereby improve learning. The paper is structured as follows: First, we introduce research focusing on the support of SRL through metacognitive prompts. Second, we describe SRL models that emphasize the importance of different learning events and event patterns. Third, some of the foundations of analyzing learning processes with process mining are introduced. Fourth, we analyze process data from coded think‐aloud protocols of an experimental study. In addition to the traditional frequency‐based approach, the relative arrangement of learning activities is taken into account using process mining techniques. Finally, the results of these analyses are compared, and the effects of metacognitive support on the sequential structure of SRL processes are discussed. 
2 THEORETICAL BACKGROUND 2.1 Metacognitive Support through Prompts Current research in metacognition and SRL shows that learners often do not spontaneously use metacognitive skills during learning, which in turn leads to poorer learning outcomes (e.g., Azevedo, 2009; Bannert & Mengelkamp, 2013; Greene, Dellinger, Tüysüzoglu, & Costa, 2013; Winne & Hadwin, 2008; Zimmerman, 2008). The students’ awareness and control of their own manner of learning is important, especially in technology‐enhanced and open‐ended learning settings (Azevedo, 2005; Lin, 2001; Lin, Hmelo, Kinzer, & Secules, 1999). In most open‐ended learning environments, it is constantly necessary to make decisions on what to do and where to go next and to evaluate the retrieved information with respect to current learning goals (Schnotz, 1998). Therefore, the general purpose of our research is to provide metacognitive support for hypermedia learning through metacognitive prompts. Instructional prompts are scaffolds that induce and stimulate students’ cognitive, metacognitive, and motivational activities during learning (Bannert, 2009). The underlying assumption is that students have already acquired these processes, but they do not recall or execute them spontaneously in a specific learning situation (production deficit; Veenman et al., 2006; Veenman, 2007). Metacognitive prompts aim at inducing regulatory activities such as orientation, goal specification, planning, monitoring and control, and evaluation strategies (Bannert, 2007; Veenman, 1993) by asking students to reflect upon, monitor, and control their own learning process. 
Previous research has demonstrated beneficial effects from metacognitive prompting (e.g., Azevedo, Cromley, Moos, Greene, & Winters, 2011; Ge, 2013; Johnson et al., 2011; Lin & Lehman, 1999; Veenman, 1993; Winne & Hadwin, 2013). For example, Lin and Lehman (1999) prompted students to give reasons for their actions to increase the awareness of their own strategies by utilizing a pop‐up window at certain times in a computer‐based simulation environment (e.g., “What is your plan for solving the problem?”). Their findings showed significantly higher performance on contextually dissimilar problems (i.e., far transfer performance) for the students supported by prompts. Based on an analysis of think‐aloud data, Johnson et al. (2011) showed that prompts given by a human tutor during learning in a hypermedia learning environment influenced the deployment of regulatory processes and temporal dependencies. Compared to a control group, the externally assisted condition also achieved a better learning outcome. In previous experiments, we investigated the effects of different types of metacognitive prompts during hypermedia learning (Bannert & Mengelkamp, 2013; Bannert & Reimann, 2012). The prompts stimulated or even suggested appropriate metacognitive learning activities for university students during a hypermedia learning session lasting approximately 40 minutes. For example, in one of our experiments, students were prompted after each navigational step in a learning environment to verbalize the reasons why they had chosen the next step (so‐called reflection prompts; Bannert, 2006). Overall, the findings confirm the positive effects of all investigated types of metacognitive prompts on transfer performance and the use of learning strategies during learning. 
Our most recent work (Bannert, Sonnenberg, Mengelkamp, & Pieger, 2015) investigates the effects of a new type of metacognitive prompt (so‐called self‐directed metacognitive prompts) on navigation behaviour and learning outcomes. In summary, the findings show that such prompts enhance strategic navigation behaviour (i.e., students visited relevant webpages significantly more often and spent more time on them) and transfer performance (i.e., students performed better at applying knowledge of basic concepts to solve prototypical problems compared with a control group). In addition, learner characteristics (e.g., prior domain knowledge or verbal abilities) were obtained by questionnaires, but they had no effects as covariates in our analyses. The present study extends this contribution by focusing on the sequential analysis of coded think‐aloud data obtained during learning. Despite the findings about the general effectiveness of metacognitive prompts, the specific effects of prompts on learning processes remain unexplained. More precisely, a closer look at the effects of prompts on the sequential and temporal structure of SRL processes is necessary (e.g., Jeong et al., 2008; Johnson et al., 2011). Understanding this process at the micro‐level would allow researchers to better design metacognitive support. For example, regulatory patterns that are associated with successful learning but are not fostered by metacognitive prompts could be identified. Subsequently, the metacognitive support could be adapted by taking information about these patterns into consideration. 
Therefore, we focus on analyzing the sequential order of learning activities obtained by concurrent think‐aloud protocols during learning. 2.2 Regulatory Patterns in SRL Boekaerts (1997) describes SRL as a complex interaction of cognitive, metacognitive, and motivational regulatory components. With respect to assumptions in SRL models (e.g., Winne & Hadwin, 2008; Zimmerman, 2008), successful studying corresponds with an active performance of different regulatory activities during learning. These regulatory activities include employing orientation to obtain an overview of the learning task and resources, planning the course of learning, monitoring and controlling all learning steps, and evaluating the learning product. Research in SRL has confirmed that successful learning is associated with the active deployment of these regulatory activities (e.g., Azevedo, Guthrie, & Seibert, 2004; Bannert, 2009; Johnson et al., 2011; Moos & Azevedo, 2009). Most SRL models share the common assumption of a time‐ordered sequence of regulatory activities, although they do not imply a strict order (Azevedo, 2009). Usually, three cyclic phases of forethought, performance, and reflection (Zimmerman, 2000) are distinguished. The forethought phase comprises task analysis, goal setting, and strategic planning. During the performance phase, self‐observations for adaptations (monitoring) and control strategies (self‐instruction or time management) are deployed. Finally, the reflection phase includes self‐judgments and self‐reactions, which, in turn, can inform the next forethought phase. The COPES model (Winne & Hadwin, 2008) represents a more elaborate description of regulatory processes in terms of an information‐processing model. Here, learning occurs in three phases, namely, task definition, goal setting and planning, and studying tactics, and a fourth optional phase, adaptations to metacognition. In addition, monitoring and control are crucial elements in the COPES model. 
Monitoring is used to detect differences between current conditions (e.g., learning progress) and standards (e.g., predefined learning goals), which, in turn, activates control processes to reduce discrepancies (e.g., engaging more intensively in a certain topic). 2.3 Microanalysis Using Process Mining Techniques In a recent study (Bannert et al., 2014), we suggested process mining (PM) as a promising method in SRL research. PM allows researchers to describe and test process models of learning that incorporate an event‐based view and that are at the high end of process granularity. These process models are able to represent the workflow of activities (van der Aalst, Weijters, & Maruster, 2004). Therefore, we argue that PM is adequate for investigating regulatory patterns based on process assumptions conceptualized in SRL research, as described in the previous section. For example, PM or data‐mining techniques can extract patterns by analyzing process data (e.g., think‐aloud protocols or log files), and the resulting patterns can be compared to assumptions of SRL models (e.g., the assumption of a time‐ordered sequence of regulatory activities in successful learning or the concept of dynamic and cyclic patterns). Therefore, the observed behaviour in process data could be aligned with SRL models. PM is an approach that can be used in the context of Educational Data Mining (Romero, Ventura, Pechenizkiy, & Baker, 2010). In this context, PM represents student activities as a process model derived from their log traces while using a computer‐based learning environment. 
In general, PM methods allow researchers to discover process models inductively from activity sequences stored in an event log, test process models through conformance checking with additional event data, or extend existing models (Trčka et al., 2010). Especially in the context of computer‐supported learning research, PM techniques are increasingly used to study learning from a process‐oriented perspective (Reimann & Yacef, 2013; Schoor & Bannert, 2012). For example, PM techniques can be applied to modelling sequences of learning activities that have been recorded in log files or coded think‐aloud data. By using PM techniques to discover process patterns in SRL activities, we assume that the present process data (comprising temporally ordered event sequences) is directed by one or more mental processes, with each set of processes corresponding to a process model. Hence, a process model represents a system of states and transitions that produced the sequence of learning events. Usually, the performance of this system is driven by a plan for action. In the context of SRL, this plan can be a learning strategy or an external resource provided to the learner (e.g., prompts). A process model is able to express a holistic view of a process by modelling a system comprising states and transitions rather than a process‐as‐sequence perspective (Reimann, 2009). With respect to related approaches, hidden Markov models (e.g., Jeong et al., 2008) also allow for expressing the holistic nature of a process by taking into account the entire sample of behaviour. However, this approach uses time‐consuming iterative procedures; generally, the researcher has to pre‐define the appropriate number of states, and the interpretation of the output model is often difficult (van der Aalst, 2011). There are, however, approaches for automatically selecting the appropriate number of states using the Bayesian Information Criterion (e.g., Li & Biswas, 2002). 
Additionally, hidden Markov models, as well as simple transition graphs and other low‐level models, represent a lower abstraction level than the PM notation language (e.g., inability to represent concurrency, which typically results in more complex models). Finally, PM techniques have the advantage of explicitly dealing with noise (i.e., exceptional or infrequent behaviour), which is necessary when analyzing real‐life event traces. For these reasons, we recommend PM techniques for analyzing sequences of learning activities (see Bannert et al., 2014 for more information regarding the comparison to other process analysis methods). 2.4 Research Questions and Hypotheses Metacognitive prompts ask students explicitly to reflect, monitor, and control their own learning process. They focus students’ attention on their own thoughts and on understanding the activities in which they are engaged during learning (e.g., Bannert, 2006; Hoffman & Spatariu, 2011). Hence, it is assumed that prompting students to monitor and evaluate their own manner of learning will allow them to activate their repertoire of metacognitive knowledge and learning strategies, which will consequently enhance their learning process and learning outcome. However, according to previous work and research on metacognitive prompting, the use of metacognitive prompts has to be explained and practiced in advance to guarantee an adequate application during learning (e.g., Bannert, 2007; Veenman, 2007). 
Based on the findings of studies investigating the effects of metacognitive prompts (e.g., Azevedo et al., 2004; Bannert, 2009), we expect that students supported by metacognitive prompts will engage in more regulatory activities, as obtained by coded think‐aloud protocols. Moreover, scaffolded SRL processes should result in better learning performance; that is, a positive effect on learning outcomes mediated by improved regulatory behaviour. Whereas these two hypotheses are based on a variable‐centred view of learning processes, we assume that an event‐centred analysis that takes into account the relative arrangements of multiple learning activities can provide additional information about the sequential structure of the regulatory behaviour induced by the prompts (e.g., a sequence of orientation activities, searching for relevant information, cognitive processing, and evaluation of progress are typically executed). Therefore, the effectiveness of metacognitive prompts can be analyzed on a micro‐level, and the results can be used to derive implications for the improvement of metacognitive support. In detail, the following research questions are addressed in the present study: 1. Does metacognitive prompting during learning influence SRL processes by engaging students in more metacognitive learning events? 2. Does the number of metacognitive learning events mediate the effect of metacognitive prompting on learning outcomes? 3. Which sequential patterns of SRL activities are induced by metacognitive prompting compared to a control group without support? 2.5 Process Mining Using the HeuristicsMiner Algorithm To analyze the relative arrangement of learning activities, we employed the PM approach (Trčka et al., 2010). The basic idea of PM is to use an event log to generate a process model describing this log inductively (process discovery). 
Furthermore, theoretical models or empirically mined models can be compared to event logs (conformance checking), and existing models can be extended (model extension). Fluxicon Disco Version 1.7.2 (2014) software was used for data preparation. Next, the event log was imported into the ProM framework Version 5.2 (2008), and PM was conducted. The ProM framework comprises a variety of PM algorithms that can be assigned to the functions of discovery, model checking, or model extension. For our analysis, we used the HeuristicsMiner algorithm (Weijters, van der Aalst, & de Medeiros, 2006) for process discovery. We selected the HeuristicsMiner algorithm based on a comparison of seven state‐of‐the‐art process discovery algorithms on the dimensions of accuracy and comprehensibility, provided by de Weerdt, de Backer, Vanthienen, and Baesens (2012). Accuracy is defined as the capability to capture the behaviour in an event log soundly while avoiding over‐ and underfitting (i.e., a process model should balance between generality and precision). Comprehensibility comprises simplicity and structuredness of the resulting process models, and thereby determines the complexity and ease of interpretation of the output. For the first time, real‐life event logs containing log data from different information systems were used for benchmarking PM algorithms. 
Among the seven algorithms, the HeuristicsMiner was the best technique for the real‐life logs used and the authors conclude that “HeuristicsMiner seems the most appropriate and robust technique in a real‐life context in terms of accuracy, comprehensibility, and scalability” (De Weerdt et al., 2012, p. 671). In the following, we explain the general principle and functionality of this algorithm in more detail. 2.6 General Principle of the HeuristicsMiner The general principle of the HeuristicsMiner algorithm is to take into account the sequential order of events for mining a process model that represents the control flow of an event log (Weijters et al., 2006). The event log containing case IDs, time stamps, and activities represents the data input. Based on this input, the algorithm searches for causal dependencies between activities by computing a dependency graph that indicates the certainty of a relation between two activities (e.g., event a is followed by event b with a certainty of 0.90). Finally, a so‐called heuristic net is generated as an output model that constitutes a visual representation of the dependencies among all activity classes in the event log. The resulting process model can be adjusted by setting thresholds for the inclusion of relations in the heuristic net (for more details on parameter settings, see below). In addition, the HeuristicsMiner is based on two main assumptions. First, each non‐initial activity has at least one other activity that triggers its performance, and each non‐final activity is followed by at least one dependent activity. This assumption is used in the so‐called all activities connected heuristic (Weijters et al., 2006). Second, the event log contains a representative sample of the observed behaviour, which usually contains a certain amount of noise, especially if traces of human behaviour are stored in the event log. 
For example, in our study, a perfect trace of verbal utterances for all performed learning steps is unlikely. Therefore, the event log contains noise caused, for example, by a missing learning step that was not uttered or by disagreement among the raters during the coding procedure. It must be noted that there is also noise in other types of data (e.g., log file data). Consequently, an analysis method is needed that can abstract from noise and that can concentrate on the main relations among learning activities. It is a specific feature of the HeuristicsMiner to be robust to noise in the data. This is the main reason for the appropriateness of applying this PM algorithm to our event log. An additional advantage of the HeuristicsMiner algorithm is that the mined model (heuristic net) can be converted into a formal Petri net. A Petri net can be described as a bipartite directed graph with a finite set of places, a finite set of transitions, and two sets of directed arcs, from places to transitions and from transitions to places (Reisig, 1985). Thus, the resulting process model can be used as input for other PM algorithms, and it can be utilized in subsequent analyses (e.g., conformance checking between the model and a new event log). In contrast, the output model of another promising process discovery algorithm within the ProM framework that we used in previous process analyses (Bannert et al., 2014; Schoor & Bannert, 2012), called the Fuzzy Miner (Günther & van der Aalst, 2007), cannot be converted into a Petri net (De Weerdt et al., 2012). Therefore, the HeuristicsMiner was the first choice for our present analysis. 
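The Petri net structure described above (places, transitions, and two sets of directed arcs) can be illustrated with a minimal Python sketch. It is not the representation used by ProM. The token marking and the firing rule are standard Petri net semantics added here for illustration, and the place names and activity labels (`ORIENTATION`, `MONITORING`) are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class PetriNet:
    """Bipartite directed graph of places and transitions, plus a token marking."""
    places: set
    transitions: set
    place_to_transition: set   # arcs (place, transition)
    transition_to_place: set   # arcs (transition, place)
    marking: dict = field(default_factory=dict)  # place -> number of tokens

    def inputs(self, transition):
        """Places with an arc leading into the given transition."""
        return [p for (p, t) in self.place_to_transition if t == transition]

    def outputs(self, transition):
        """Places reached by an arc leaving the given transition."""
        return [p for (t, p) in self.transition_to_place if t == transition]

    def enabled(self, transition):
        # A transition is enabled when every input place holds a token.
        return all(self.marking.get(p, 0) > 0 for p in self.inputs(transition))

    def fire(self, transition):
        # Firing consumes one token per input place and produces one per output place.
        if not self.enabled(transition):
            raise ValueError(f"transition {transition!r} is not enabled")
        for p in self.inputs(transition):
            self.marking[p] -= 1
        for p in self.outputs(transition):
            self.marking[p] = self.marking.get(p, 0) + 1


# Hypothetical two-step process: orientation must happen before monitoring.
net = PetriNet(
    places={"start", "oriented", "done"},
    transitions={"ORIENTATION", "MONITORING"},
    place_to_transition={("start", "ORIENTATION"), ("oriented", "MONITORING")},
    transition_to_place={("ORIENTATION", "oriented"), ("MONITORING", "done")},
    marking={"start": 1},
)
net.fire("ORIENTATION")
```

Replaying an observed trace by firing its activities in order, and noting where a transition is not enabled, is the intuition behind conformance checking between a model and a new event log.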
2.7 Functionality and Application of the HeuristicsMiner Considering its functionality, the HeuristicsMiner algorithm uses several parameters that guide the creation of a process model and that can be adjusted to set the level of abstraction from noise and low‐frequency behaviour. First, a frequency‐based metric is used to determine the degree of certainty of a relation between two events, A and B, based on an event log. The dependency values, ranging between –1 and 1, between all possible combinations of events are computed using the following formula (Weijters et al., 2006, p. 7):

$$A \Rightarrow_W B = \left( \frac{|a >_W b| - |b >_W a|}{|a >_W b| + |b >_W a| + 1} \right)$$

Based on an event log W, the certainty of a dependency relation between two events, $A \Rightarrow_W B$, is computed by subtracting the number of times event b is followed by event a, $|b >_W a|$, from the number of times event a is followed by event b, $|a >_W b|$, and dividing the result by the number of occurrences of these two relations, plus 1. The number of correct (a is followed by b) and incorrect (b is followed by a) event sequences influences the dependency value through the +1 in the denominator. For example, an event log containing only correct sequences (a is always followed by b, but never vice versa), but with a low frequency of five observations, results in a certainty of 5/6 = 0.83, whereas in the case of a high frequency of 50 observations, the certainty of a dependency relation between a and b would be 50/51 = 0.98. Moreover, the computed dependency values are used to construct a heuristic net (i.e., the output model). However, not all dependency relations are kept in the process model. 
Instead, the HeuristicsMiner algorithm concentrates on the main causal dependencies and abstracts from noise and low‐frequency behaviour. First, the all activities connected heuristic is applied: only the best candidates (those with the highest $A \Rightarrow_W B$ values) are kept in the output model. Second, three threshold parameters are used for the selection of further dependency relations. The dependency threshold determines the cut‐off value for the inclusion of dependency relations in the output model. Furthermore, the positive observation threshold defines the minimum number of necessary observed sequences. Finally, the relative to best threshold determines that additional dependency relations are included in the output model only if their difference to the best candidate is below this threshold. We refer to Weijters et al. (2006) for more information about these threshold parameters. In our analysis, the threshold parameters were kept at their default values of dependency threshold = 0.9, positive observation threshold = 10, and relative to best threshold = 0.05. As explained above, these threshold parameters can be used to adjust the level of abstraction of the output model. For example, reducing the cut‐off values would result in additional dependency relations in the model and thus increase the complexity. However, there were no reasons for changing the default values in our case. 
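The dependency measure and the three threshold parameters can be made concrete with a short Python sketch. This is a simplified reading for illustration, not the ProM implementation: it omits the all activities connected heuristic, loops, and AND/OR relations, and it interprets "best candidate" as the highest dependency value among the relations leaving the same activity, which is an assumption. The function names and the activity labels in the toy log are hypothetical.

```python
from collections import Counter


def directly_follows(traces):
    """Count how often activity a is directly followed by activity b."""
    counts = Counter()
    for trace in traces:
        for a, b in zip(trace, trace[1:]):
            counts[(a, b)] += 1
    return counts


def dependency(counts, a, b):
    """Dependency value (|a>b| - |b>a|) / (|a>b| + |b>a| + 1) for distinct a, b."""
    ab = counts[(a, b)]
    ba = counts[(b, a)]
    return (ab - ba) / (ab + ba + 1)


def select_relations(counts, dep_threshold=0.9, pos_obs=10, rel_to_best=0.05):
    """Keep relations that pass all three thresholds (simplified reading)."""
    activities = sorted({x for pair in counts for x in pair})
    kept = {}
    for a in activities:
        candidates = {b: dependency(counts, a, b) for b in activities if b != a}
        if not candidates:
            continue
        # Assumption: "best" is the strongest relation leaving activity a.
        best = max(candidates.values())
        for b, value in candidates.items():
            if (value >= dep_threshold
                    and counts[(a, b)] >= pos_obs
                    and best - value <= rel_to_best):
                kept[(a, b)] = value
    return kept


# Toy log: 50 regular traces plus 2 noisy traces with the reverse order.
log = [["ORIENT", "READ", "MONITOR"]] * 50 + [["READ", "ORIENT"]] * 2
counts = directly_follows(log)
relations = select_relations(counts)
for (a, b), value in sorted(relations.items()):
    print(f"{a} -> {b}: {value:.2f}")
```

In this toy log, the two reversed traces lower the ORIENT to READ dependency to 48/53 ≈ 0.91, which still passes the default dependency threshold of 0.9, while the reverse relation READ to ORIENT receives a negative value and is dropped as noise.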
Furthermore, the HeuristicsMiner algorithm can also address short loops of lengths one (e.g., ACCB) and two (e.g., ACDCDB) as well as long distance dependencies; that is, a dependency based on choices made in other parts of the process model. Moreover, the algorithm considers AND‐relations (two events are executed concurrently) and OR‐Relations (e.g., either event b or event c can be executed after event a) to construct the heuristic net. In general, searching for an optimal process model based on a present event log can be challenging, especially if there is a certain amount of noise and less‐frequent behaviour in the data. Therefore, it is possible to compare the resulting process model with the event log using a fitness value (Rozinat & van der Aalst, 2008). The fitness indicates the gap between the observed behaviour, that is, the set of event sequences in the log, and the mined process model. By applying the HeuristicsMiner algorithm to our event log, we assume that the present set of sequences of learning events is caused by one or multiple underlying processes. However, it might be possible that there is a high variety in SRL activities within the sample. In this case, using very robust algorithms such as the HeuristicsMiner can result in over‐generalization (underfitting); that is, the mined model allows for much more behaviour than what is actually observed (De Medeiros et al., 2008). Therefore, the event log could be modelled more precisely by generating different process models for subsets of participants instead of a single model for all cases. This approach is called trace clustering, which can improve the discovery of process models (De Weerdt, vanden Broucke, Vanthienen, & Baesens, 2013; Greco, Guzzo, Pontieri, & Saccà, 2006). A plug‐in has been implemented in the ProM framework that combines the HeuristicsMiner algorithm with a trace clustering procedure, namely, DWS mining (Disjunctive Workflow Schema; De Medeiros et al., 2008). 
The basic idea of DWS mining is to split the log into clusters iteratively until the mined process model for each cluster reaches high precision. A process model has high precision if it allows only for behaviour that was observed in the event log. Consequently, a cluster is partitioned further if its mined model allows for more behaviour than is expressed by the cases within the cluster. For more information on the DWS mining plug‐in, refer to De Medeiros et al. (2008). In our analysis, we kept the default parameter settings for clustering the log traces.
3 METHOD
The present study extends a previous contribution (Bannert et al., 2015) that investigated the effects of metacognitive prompting on navigation behaviour and learning outcome. Both studies refer to the same participants, but they address different research questions and draw on mostly different data.
3.1 Sample and Research Design
A total of n=70 undergraduate students from a German university participated in the study (mean age = 20.07, SD = 1.88, 82.9% female). All participants were majoring either in media communications or in human–computer systems. Participants were recruited via an online recruitment system administered by our institute, and each student received 40 Euros (approximately $47 USD) for participating. The experimental study used a between‐subjects design and comprised two sessions.
In the first session, learner characteristics (e.g., prior domain knowledge) were assessed as potential covariates, in case randomization produced an unbalanced distribution of these characteristics between the groups (which is possible given the relatively small sample size). Approximately one week later, the participants were randomly assigned to either the experimental group (n=35) or the control group (n=35) and individually participated in hypermedia learning. The experimental group learned with metacognitive prompts, whereas the control group learned without prompts. Figure 1 presents an overview of the research design.
[Figure 1. Research design. Session 1 (1st week): learner characteristics (verbal intelligence, prior domain knowledge, metacognitive knowledge, epistemological beliefs, reading competence). Session 2 (2nd week), EG: brief training (introduction: use of prompts), learning with metacognitive prompts (think‐aloud data), learning performance (recall, comprehension, and transfer test). Session 2 (2nd week), CG: brief training (introduction: workplace design), learning without support (think‐aloud data), learning performance (recall, comprehension, and transfer test).]
3.2 Learning Material and Performance Measurement
3.2.1 Learning Environment and Metacognitive Prompts
The learning material comprised a chapter on the topic of learning theories (classical conditioning, operant conditioning, and observational learning) presented in a hypermedia learning environment. For example, one node described the Skinner‐box with reference to the concept of operant conditioning and was illustrated with a picture. In total, the material comprised 50 nodes with 13,000 words, 20 pictures and tables, and 300 hyperlinks. Within this chapter, the material relevant for the learning task comprised 10 nodes with 2,300 words, 5 pictures and tables, and 60 hyperlinks. The remaining pages were not relevant for the learning task.
These pages included overviews, summaries, and pages with information on concepts not relevant for the learning goals. The Flesch‐Kincaid grade‐level score of the complete learning material was 19.01.
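For orientation, the Flesch‐Kincaid grade level is conventionally computed from the average sentence length and the average number of syllables per word. The sketch below applies the standard published formula. The function name and the counts in the usage line are illustrative assumptions, not values from the learning material, and the source does not state which tool produced the reported score.

```python
def flesch_kincaid_grade(words, sentences, syllables):
    """Standard Flesch-Kincaid grade-level formula from raw text counts."""
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


# Made-up counts: 100 words in 5 sentences with 150 syllables.
example_grade = flesch_kincaid_grade(words=100, sentences=5, syllables=150)
```

Because the score rises with both sentence length and syllables per word, a value around 19 indicates long sentences and many polysyllabic words.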

Instead, the HeuristicsMiner algorithm concentrates on the main causal dependencies and abstracts from noise and low‐frequency behaviour. First, the all‐activities‐connected heuristic is applied, so that only the best candidates (those with the highest $A \Rightarrow_W B$ values) are kept in the output model. Second, three threshold parameters are used for the selection of further dependency relations. The dependency threshold determines the cut‐off value for the inclusion of dependency relations in the output model. Furthermore, the positive observation threshold defines the minimum number of observed sequences that is necessary for a relation to be included. Finally, the relative‐to‐best threshold specifies that an additional dependency relation is included in the output model only if the difference between its dependency value and that of the best candidate is lower than this threshold. We refer to Weijters et al. (2006) for more information about these threshold parameters.

In our analysis, the threshold parameters were kept at their default values of dependency threshold = 0.9, positive observation threshold = 10, and relative‐to‐best threshold = 0.05. As explained above, these threshold parameters can be used to adjust the level of abstraction of the output model. For example, reducing the cut‐off values would result in additional dependency relations in the model and thus increase its complexity. However, there were no reasons for changing the default values in our case.
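The dependency measure and the three thresholds described above can be illustrated with a minimal Python sketch. It is not the ProM implementation: the function names are hypothetical, the all‐activities‐connected heuristic, loops, and AND/OR relations are omitted, the "best candidate" is assumed to be the highest dependency value among the relations leaving the same event, and the handling of values exactly at a threshold is likewise an assumption.

```python
from collections import Counter


def direct_succession_counts(traces):
    """Count how often event a is directly followed by event b, i.e. |a >_W b|."""
    counts = Counter()
    for trace in traces:
        for a, b in zip(trace, trace[1:]):
            counts[(a, b)] += 1
    return counts


def dependency(counts, a, b):
    """Dependency value A =>_W B for two different events a and b."""
    ab = counts[(a, b)]
    ba = counts[(b, a)]
    return (ab - ba) / (ab + ba + 1)


def accepted_relations(traces, dep_threshold=0.9, pos_obs=10, rel_to_best=0.05):
    """Return {(a, b): dependency value} for relations passing all three thresholds.

    Assumption: 'best' is the highest dependency value among the relations
    that start at the same event a. Self-loops (a == b) are skipped because
    they are scored with a different formula.
    """
    counts = direct_succession_counts(traces)
    events = sorted({event for trace in traces for event in trace})
    accepted = {}
    for a in events:
        deps = {b: dependency(counts, a, b) for b in events if b != a}
        if not deps:
            continue
        best = max(deps.values())
        for b, value in deps.items():
            if (value >= dep_threshold
                    and counts[(a, b)] >= pos_obs
                    and best - value < rel_to_best):
                accepted[(a, b)] = value
    return accepted


if __name__ == "__main__":
    few = direct_succession_counts([["a", "b"]] * 5)
    many = direct_succession_counts([["a", "b"]] * 50)
    print(round(dependency(few, "a", "b"), 2))   # 0.83, as in the text
    print(round(dependency(many, "a", "b"), 2))  # 0.98, as in the text
```

With the default values, the relation from a to b in the 50‐observation log would be accepted, whereas in the five‐observation log it would fail both the dependency threshold and the positive observation threshold.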
Furthermore, the HeuristicsMiner algorithm can also address short loops of length one (e.g., ACCB) and length two (e.g., ACDCDB) as well as long‐distance dependencies, that is, dependencies based on choices made in other parts of the process model. Moreover, the algorithm considers AND‐relations (two events are executed concurrently) and OR‐relations (e.g., either event b or event c can be executed after event a) to construct the heuristic net.

In general, searching for an optimal process model for a given event log can be challenging, especially if there is a certain amount of noise and less‐frequent behaviour in the data. Therefore, it is possible to compare the resulting process model with the event log using a fitness value (Rozinat & van der Aalst, 2008). The fitness indicates the gap between the observed behaviour, that is, the set of event sequences in the log, and the mined process model.

By applying the HeuristicsMiner algorithm to our event log, we assume that the present set of sequences of learning events is caused by one or multiple underlying processes. However, it might be possible that there is a high variety in SRL activities within the sample. In this case, using very robust algorithms such as the HeuristicsMiner can result in over‐generalization (underfitting); that is, the mined model allows for much more behaviour than what is actually observed (De Medeiros et al., 2008). Therefore, the event log could be modelled more precisely by generating different process models for subsets of participants instead of a single model for all cases. This approach is called trace clustering, which can improve the discovery of process models (De Weerdt, vanden Broucke, Vanthienen, & Baesens, 2013; Greco, Guzzo, Pontieri, & Saccà, 2006). A plug‐in has been implemented in the ProM framework that combines the HeuristicsMiner algorithm with a trace clustering procedure, namely, DWS mining (Disjunctive Workflow Schema; De Medeiros et al., 2008).
The basic idea of DWS mining is to split the log into clusters iteratively until the mined process model for each cluster reaches high precision. A process model has a high precision if it only allows for behaviour that was observed in the event log. Consequently, a cluster is further partitioned if the mined model allows for more behaviour than is expressed by the cases within this cluster. For more information on the DWS mining plug‐in, refer to De Medeiros et al. (2008). In our analysis, we kept the default parameter settings for clustering the log traces.

3 METHOD

The present study extends a previous contribution (Bannert et al., 2015) that investigated the effects of metacognitive prompting on navigation behaviour and learning outcome. It refers to the same participants but addresses different research questions and uses mostly different data.

3.1 Sample and Research Design

A total of n=70 undergraduate students from a German university participated in the study (mean age = 20.07, SD = 1.88, 82.9% female). All participants were majoring in either media communications or human–computer systems. Participants were recruited via an online recruitment system administered by our institute, and each student received 40 Euros (approximately $47 USD) for participating. Altogether, the experimental study was based on a between‐subject design and comprised two sessions.
In the first session, learner characteristics (e.g., prior domain knowledge) were obtained as potential covariates, to be used especially if randomization produced an unbalanced distribution of characteristics among the groups (which is possible given the relatively small sample size). Approximately one week later, the participants were randomly assigned to either the experimental group (n=35) or the control group (n=35) and individually participated in hypermedia learning. The experimental group learned with metacognitive prompts, whereas the control group learned without prompts. Figure 1 presents an overview of the research design.

[Figure 1. Research design. Session 1 (1st week), both groups: learner characteristics (verbal intelligence, prior domain knowledge, metacognitive knowledge, epistemological beliefs, reading competence). Session 2 (2nd week), EG: brief training (introduction: use of prompts); learning with metacognitive prompts (think-aloud data); learning performance (recall, comprehension, and transfer test). Session 2 (2nd week), CG: brief training (introduction: workplace design); learning without support (think-aloud data); learning performance (recall, comprehension, and transfer test).]

3.2 Learning Material and Performance Measurement

3.2.1 Learning Environment and Metacognitive Prompts

The learning material comprised a chapter on the topic of learning theories (classical conditioning, operant conditioning, and observational learning) presented in a hypermedia learning environment. For example, the content of one node included a description of the Skinner box with reference to the concept of operant conditioning, illustrated with a picture. In total, the material comprised 50 nodes with 13,000 words, 20 pictures and tables, and 300 hyperlinks. Within this chapter, the material relevant for the learning task comprised 10 nodes with 2,300 words, 5 pictures and tables, and 60 hyperlinks. The remaining pages were not relevant for the learning task.
These pages included overviews, summaries, and pages with information on concepts not relevant for the learning goals. The Flesch‐Kincaid grade‐level score of the complete learning material was 19.01.
