Day 2 LAK12 Afternoon Sessions

Using computational techniques to discover student science conceptions in interview data
The approach takes assessment interview transcripts and analyses them in stages: it removes the interviewer content, chunks the text to 782 words, maps the segments to vectors, normalizes the vectors, deviationalizes them, then clusters the vectors using hierarchical agglomerative clustering. Computing the centroids of the clusters produces a list of candidate clusters. These represent the conceptions we are trying to identify in the transcripts to apportion credit. Next we look for the word with the highest centroid value in a specific cluster (clusters with 7 integers are best of course), and this gives us values based on the occurrence of phrases in the transcripts. Effectively a set of conceptions. Now we must find the dot product between the segment vector and each of the centroids. Now simply run it on each candidate transcript and generate a visualization.
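To make the steps concrete for myself, here is a minimal sketch in plain Python. It is not the presenter's code. The toy sentences are invented, the word-count vectors are an assumption, "deviationalize" is assumed to mean subtracting the average vector and re-normalizing, and the merge rule (join the two clusters with the most similar centroids) is just one simple form of agglomerative clustering.

```python
import math
from collections import Counter

def vectorize(segments):
    """Map each text segment to a word-count vector over a shared vocabulary."""
    vocab = sorted({w for s in segments for w in s.lower().split()})
    vectors = []
    for s in segments:
        counts = Counter(s.lower().split())
        vectors.append([float(counts[w]) for w in vocab])
    return vocab, vectors

def normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm else list(v)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def mean(vectors):
    """Component-wise average of a list of vectors (a centroid)."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def deviationalize(vectors):
    """Assumed meaning: subtract the average vector, then re-normalize."""
    avg = mean(vectors)
    return [normalize([x - m for x, m in zip(v, avg)]) for v in vectors]

def agglomerate(vectors, k):
    """Start with one cluster per segment; merge the most similar pair until k remain."""
    clusters = [[i] for i in range(len(vectors))]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                ci = mean([vectors[m] for m in clusters[i]])
                cj = mean([vectors[m] for m in clusters[j]])
                sim = dot(normalize(ci), normalize(cj))
                if best is None or sim > best[0]:
                    best = (sim, i, j)
        _, i, j = best
        clusters[i] += clusters.pop(j)  # j > i, so index i is still valid
    return clusters

# Invented student-style segments, interviewer content already removed.
segments = [
    "the earth is closer to the sun in summer",
    "summer happens when earth is closer to the sun",
    "the earth is tilted on its axis",
    "the tilted axis points the hemisphere toward the sun",
]
vocab, raw = vectorize(segments)
vectors = deviationalize([normalize(v) for v in raw])
clusters = agglomerate(vectors, k=2)
centroids = [mean([vectors[m] for m in c]) for c in clusters]
# Word with the highest centroid value in each cluster.
top_words = [vocab[max(range(len(vocab)), key=c.__getitem__)] for c in centroids]
# Dot product between each segment vector and each centroid.
scores = [[dot(v, c) for c in centroids] for v in vectors]
```

The `scores` table is what a visualization would be drawn from: one row per segment, one column per candidate conception.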

Seems we’ve achieved automated analysis and grading of free text responses generated from videos of students responding to assessment items.

Provides an independent test of validity (so suggests it's not reliable enough to be a summative assessment tool).

That’s a bit clever. Action: make JISC e-Assessment aware of the analytics opportunities for short speech and free text assessment.

Learning Analytics for Collaborative Writing
The Uatu System interfaces with Google Docs via the APIs (GData and Visualisation). The PPT has a schema. It’s a very small study based on just 8 participants. A bit special interest for my purposes, though I do like the concept of automated assessment of collaborative writing. Also nice that this demonstrates that the analytics can miss the wider environmental issues, such as students not doing asynchronous collaborative writing. How does the system take account of individual roles and contributions if the students are co-located and self-organizing roles to meet the task?
Suggests this is best used for distributed collaborative teams.

I came across this from the Twitter stream: Enhancing Teaching and Learning Through Educational Data Mining and Analytics

Learning Registry. Building a foundation for Learning Resource Analytics
Hurrah – first mention of Paradata.
Learning Registry is a distributed system for sharing metadata about learning resources, encouraging the emergence of paradata to add nuance to learning resources.
Resources live in a rich ecosystem. Introduces the concept of stovepipe or data exhaust – the patterns left behind as resources are used.
Resources are connected to the Learning Registry through publish and consume APIs. It has 400K resources in already.
Built on CouchDB and metadata agnostic

Potential services:
Relationship mining looking at people / institutional behavior
Provides user experience ratings
User profiling – which actors use which resources
Recommendation services
Feedback to developers
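
A rough sketch of how a couple of these services could fall out of usage data. Everything here is hypothetical: the event tuples, the actor and resource names, and the functions are invented for illustration and are not the Learning Registry schema or API.

```python
from collections import Counter, defaultdict

# Hypothetical usage events as (actor, verb, resource); not a real paradata format.
events = [
    ("teacher_a", "viewed", "fractions-lesson"),
    ("teacher_a", "rated", "fractions-lesson"),
    ("teacher_b", "viewed", "fractions-lesson"),
    ("teacher_b", "viewed", "photosynthesis-video"),
    ("teacher_c", "viewed", "photosynthesis-video"),
]

def profile_actors(events):
    """User profiling: which actors use which resources."""
    profile = defaultdict(set)
    for actor, _verb, resource in events:
        profile[actor].add(resource)
    return dict(profile)

def recommend(actor, profiles):
    """Suggest resources used by actors who share at least one resource with `actor`."""
    mine = profiles[actor]
    votes = Counter()
    for other, theirs in profiles.items():
        if other != actor and mine & theirs:
            votes.update(theirs - mine)
    return [resource for resource, _ in votes.most_common()]

profiles = profile_actors(events)
suggestions = recommend("teacher_a", profiles)
```

Here `teacher_a` overlaps with `teacher_b` on one resource, so the only suggestion is the other resource `teacher_b` used.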

This work is known to JISC CETIS and we’re linked in (pardon the pun). Dai thinks it’s exciting. Scott Wilson is involved. Dan Rehak is involved. Check that JISC eLearning are aware. Suspect they are.

Relationship between educational performance and online access routines; analytics of student access of online team based activities

Brunel University study based on Blackboard and access to a discussion forum to support online group work activities. I already blogged one of these earlier today albeit on collaborative writing and Google. This one profiles learner types based on access to the online resource.

Learning Analytics (LA) and Educational Data Mining (EDM)

Which horse to back, are they really that different?

A short paper by George Siemens outlining the overlaps between the two communities, noting replication and suggesting collaboration.

The two approaches have a joint desire to use big data to enhance education

EDM has a focus on prediction, clustering, relationship mining, distillation of data for human judgment and discovery with models (Baker and Yacef 2009)

LA focuses on network analysis, content analysis, personalization and adaptation, prediction and intervention

LA focus is on supporting human decisions, EDM focus is on automating discovery (humans provide inputs)

Paths to the future
Winner & Loser, Merger or competition

EDM Overview
Carnegie Mellon
Predictions using mathematical models to answer these sorts of questions:
Does a student know a skill, which students are off task, which students will fail the class?
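
The talk did not say which model sits behind the "does a student know a skill" question. One model commonly used for it in EDM is Bayesian Knowledge Tracing, so here is a minimal sketch of its update step, offered as an assumption rather than as what was presented. The parameter values are made up.

```python
def bkt_update(p_know, correct, p_learn=0.2, p_slip=0.1, p_guess=0.25):
    """One Bayesian Knowledge Tracing step: update P(skill known) after an answer.

    p_learn, p_slip and p_guess are illustrative values, not fitted parameters.
    """
    if correct:
        # Correct answers come from knowing and not slipping, or from guessing.
        num = p_know * (1 - p_slip)
        den = num + (1 - p_know) * p_guess
    else:
        # Wrong answers come from slipping, or from not knowing and not guessing.
        num = p_know * p_slip
        den = num + (1 - p_know) * (1 - p_guess)
    posterior = num / den
    # The student may also learn the skill during this practice opportunity.
    return posterior + (1 - posterior) * p_learn

p = 0.3  # assumed prior probability that the skill is already known
history = []
for answer in [True, True, False, True]:
    p = bkt_update(p, answer)
    history.append(p)
```

Each entry in `history` is the estimated probability that the student knows the skill after that answer; it rises after correct answers and drops after the wrong one.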

Provides distillation of data for human judgment interventions

PSLC has a DataShop – a repository of large scale educational datasets
They opened the data to allow 3rd parties to analyse for trends and make predictions – the KDD Cup
