UTArlingtonX: LINK5.10x Data, Analytics, and Learning or #DALMOOC (Week 1)

So far, the #DALMOOC is one of the most complex online courses I have enrolled in. Content-wise, it covers “an introduction to the logic and methods of analysis of data to improve teaching and learning”. What is especially challenging is the course structure and the social tools involved.

In this post, I will first describe the course structure and state key messages from week 1’s content (Assignment: “share your reflections on week one in terms of a) content presented, and b) course design”). After this, I will present my bullet points for the four readings as a completion of the key messages (Assignment: “review the additional readings available for week 1 of the course and share your reflections about them”). Finally, I will attach my edited Learning Analytics Tool Matrix (Assignment: Learning Analytics: Tool Matrix), for which I conducted some research on learning analytics tools.

Course structure and key messages

What makes the course so complex is the large number of social tools and pathways to choose from. The basic idea is that there is a Guided Learner Path (“blue pill”) and a Social Learning Path (“red pill”) available. One can choose either of those or get involved in both. To keep it simple, the blue pill is the structure learners are most familiar with: course content is provided as in a typical classroom environment where the teacher provides the knowledge. The red pill, however, is a social approach where learners interact via social media (e.g. ProSolo, Twitter) and share their artifacts.

Based on this structure, a range of tools is in use to track the learning progress. Generally speaking, edX solely provides the platform for the course content. Interaction is recorded via ProSolo (a platform connected to edX for setting learning goals and competencies, sharing thoughts, forming groups, and fulfilling assignments). For example, this blog post will be recorded (or tracked) in ProSolo and thus can be made available for peer assessment. In addition, there are features which enable the user to track #dalmooc hashtags on Twitter or RSS feeds.

When talking about Learning Analytics, there usually are tools involved that apply the theoretical knowledge. In this course, we will deal with Tableau, Gephi, RapidMiner, and LightSide. An additional problem bank is provided for advanced assignments to work with these tools.

The social learning aspect is supported by a tool called Bazaar (Bazaar assignment: Discuss Week 1). Bazaar is a platform (basically a chat system) that connects learners on demand to discuss course-related topics and contents. In my case, I was connected on Saturday evening to a very helpful person from India. A programmed digital instructor guided us through the discussion. After an introduction, we were to discuss why we take this course, how we define learning analytics, how useful we found the cluster used for learning analytics tools, and how it could have been improved. We had a very constructive discussion that benefited from the fact that we had different backgrounds and levels of expertise.

My key messages for this week are:

  • Users always expect usability, especially in online courses. But talking about Learning Analytics means talking about a broad range of data that demands skills to grasp these data and make sense of them. For me, the course itself offers an opportunity to find one’s individual way through the vast amount of learning opportunities to engage with the topic of data analytics.
  • The field of Learning Analytics offers methods to analyse learners’ behavior in a learning environment, thereby providing groundwork for improving learning environments and giving individual learners feedback.
  • Analytics tools are programmed by others; to understand the way they work, it is important to be familiar with the methods in use and how they are applied within such tools.

Key messages enriched by reading contents

Usability and complexity of data

[Halevy, A., Norvig, P., & Pereira, F. (2009). The unreasonable effectiveness of data. Intelligent Systems, IEEE, 24(2), 8-12.]

  • Don’t wait for (impossible) data collections but combine the already existing data more effectively
  • A small set of general rules per se is not better than a large set of applicable data (e.g. for learning a language, it can be easier to have a number of examples memorized than knowing the general rule)
  • The use of n-gram models and the “false dichotomy” of natural language processing: deep (hand-coded) approach & statistical approach (“learning n-gram statistics from large corpora”)
  • Semantic Web (machines understand semantic documents not human speech/writing) vs. Semantic Interpretation
  • What remains is not indexing but interpreting data/information/language -> use the vast amount of information on the internet to address the interpretation problem -> don’t try to make language “easier” by forming general rules, but make use of the language data that is already available
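
To make the n-gram idea from Halevy et al. concrete, here is a minimal sketch of how bigram statistics learned from text can predict a likely next word. The tiny corpus is invented purely for illustration; real n-gram models are trained on web-scale text.

```python
from collections import Counter, defaultdict

# Tiny illustrative corpus (invented for this sketch).
corpus = (
    "data analytics improves learning . "
    "data analytics improves teaching . "
    "data mining improves learning ."
).split()

# Count bigram statistics: for each word, how often each successor follows it.
successors = defaultdict(Counter)
for current, following in zip(corpus, corpus[1:]):
    successors[current][following] += 1

def predict_next(word):
    """Return the most frequent successor of `word` seen in the corpus."""
    counts = successors[word]
    return counts.most_common(1)[0][0] if counts else None

print(predict_next("data"))      # "analytics" (follows "data" 2 of 3 times)
print(predict_next("improves"))  # "learning" (follows "improves" 2 of 3 times)
```

No hand-coded grammar rules are involved: the “knowledge” is nothing more than counts over the language in use, which is exactly the statistical approach the paper contrasts with deep hand-coded ones.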

[Tansley, S., & Tolle, K. M. (Eds.). (2009). The fourth paradigm: data-intensive scientific discovery.]

  • Opportunities and challenges of the fourth paradigm of science based on data-intensive computing
  • Accessibility and “the cloud” as a basis for data-intensive science; three basic activities: capture, curation, and analysis
  • Permanent archiving of data as the main goal to improve scientific research
  • eScience as where IT meets science, a new paradigm of science; need for improved tools for data capturing, analysis, and visualization; science happens online (Jim Gray on eScience: a transformed scientific method)
  • Four areas of application: (1) Earth and Environment, (2) Health and Wellbeing, (3) Scientific Infrastructure, and (4) Scholarly Communication, followed by (5) Final Thoughts

The field of Learning Analytics

[Baker, R. S., & Yacef, K. (2009). The state of educational data mining in 2009: A review and future visions. JEDM-Journal of Educational Data Mining, 1(1), 3-17.]

  • Different terms, e.g. Data Analytics for Learning, Learning Analytics, Educational Data Mining (EDM) & Knowledge Discovery in Databases (KDD)
  • Making sense of data in learning environments by discovering effective methods to interpret them, thus providing immediate feedback to improve students’ performance and course quality
  • Methods and Key Applications
    • Improvement of student models (how students act within a learning environment and how this environment can respond)
    • Discovering/improving a domain’s knowledge structure (data can be used by automated approaches to discover accurate domain structure models)
    • Studying pedagogical support and determine relative effectiveness
    • Supporting research on educational theories / phenomena by delivering empirical evidence
  • Important trends
    • application to online courses, sensitive and effective e-learning; new areas of study: gaming the system, tools for data mining, student modeling, from relationship mining to prediction, discovery with models
  • Data become more public (e.g. through online course environments), making broader application possible and verification easier
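
As a toy illustration of the prediction theme running through Baker and Yacef, here is a minimal sketch of a student model: a 1-nearest-neighbour classifier that predicts pass/fail from activity counts. All records and features are purely hypothetical and chosen only to make the mechanism visible.

```python
import math

# Hypothetical student records (invented for illustration):
# (forum posts, assignments submitted) -> passed the course?
training = [
    ((12, 9), True),
    ((10, 8), True),
    ((2, 1), False),
    ((1, 3), False),
]

def predict_passed(features):
    """1-nearest-neighbour: copy the label of the most similar past student."""
    closest = min(training, key=lambda rec: math.dist(rec[0], features))
    return closest[1]

print(predict_passed((11, 7)))  # True  (closest to the active students)
print(predict_passed((3, 2)))   # False (closest to the inactive students)
```

Real student models use far richer features and validated methods, but the shape is the same: infer a predicted variable (pass/fail) from predictor variables (observed behavior), so that feedback can be given before the outcome occurs.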

[Baker, R., & Siemens, G. (2014). Educational data mining and learning analytics. Cambridge Handbook of the Learning Sciences.]

  • Learning Analytics (LA) & Educational Data Mining (EDM) to conduct research that benefits the learner and the research community, guided by theories from learning science and education, with data mining and analytics & psychometrics and educational measurement as main sources
  • EDM: (1) automated methods (prediction), (2) specific constructs and their relationships, theoretical approaches, (3) application in automated adaptation
  • LA: (1) human-led methods (understanding), (2) understanding the system of constructs, theories that treat systems as a whole / take situationalist approaches, (3) inform and empower learners & instructors
  • Growing field because of
    • increasing data quantity (public archives and open online courses), improved data formats (standardized formats for data logging), advances in computing, increased sophistication of available tools (MapReduce, Apache Hadoop)
  • Methods
    • prediction methods: as in Baker/Yacef still most prominent, to infer a predicted variable from predictor variables, three types
      • classifiers (predicted variable binary or categorical)
      • regressors (predicted variable continuous)
      • latent knowledge estimation (a special type of classifier)
    • structure discovery: the opposite of prediction, because there is no a priori idea of a predicted variable, 4 common approaches
      • clustering: find data points that naturally group together, most useful when the categories are not known in advance
      • factor analysis: closely related to clustering, find clusters and split variable set in latent factor set (not directly observable)
      • social network analysis: reveal structure of interaction by analysing the relationship between individual actors
      • domain structure discovery: finding knowledge structure in educational environment
    • relationship mining: discover unexpected but meaningful relationships between items of a large data set
      • association rule mining (find if-then rules for a data set)
      • correlation mining (find positive/negative correlations between variables)
      • sequential pattern mining (find temporal associations between events)
      • causal data mining (find cause for event or observed construct)
    • distillation of data for human judgment: analyse data for immediate feedback to researchers/practitioners (e.g. through heat maps, learning curves, and learnograms)
    • discovery with models: use the results of one data analysis within another data analysis (but also cluster analysis or knowledge engineering as input approaches)
  • Tools
    • General-purpose (e.g. RapidMiner, R, Weka, KEEL, SNAPP) vs. special-purpose tools (e.g. DataShop)
    • Open-source (e.g. R, Weka) vs. commercial tools (e.g. IBM Cognos, SAS, analytics offerings by Blackboard, Ellucian)
  • Impact on Learning Sciences
    • Research on disengagement
    • Student learning in various collaborative settings
  • Impact on Practice
    • impact of social dimensions of learning and the impact of learning environment design on subsequent learning success
    • networked learning systems vs. more centralized platforms (e.g. LMS)
  • Outlook
    • Growing data sources
    • Expanding range of application: computer games, argumentation, computer-supported collaborative learning, learning in virtual worlds & teacher learning
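
As a small taste of relationship mining, here is a sketch of association rule mining reduced to its core: computing the support and confidence of an if-then rule over session data. The sessions and resource names are invented for illustration.

```python
# Hypothetical sessions (invented): which course resources each student used.
sessions = [
    {"video", "quiz", "forum"},
    {"video", "quiz"},
    {"video", "forum"},
    {"quiz", "forum"},
    {"video", "quiz", "forum"},
]

def rule_stats(antecedent, consequent):
    """Support and confidence of the rule: antecedent -> consequent."""
    n = len(sessions)
    both = sum(1 for s in sessions if antecedent <= s and consequent <= s)
    ante = sum(1 for s in sessions if antecedent <= s)
    support = both / n                        # how often the rule fires overall
    confidence = both / ante if ante else 0.0 # how reliable the rule is
    return support, confidence

support, confidence = rule_stats({"video"}, {"quiz"})
print(f"video -> quiz: support={support:.2f}, confidence={confidence:.2f}")
# support=0.60 (3 of 5 sessions), confidence=0.75 (3 of 4 video sessions)
```

Full algorithms like Apriori add efficient candidate generation over all possible item sets, but the interestingness measures they rank rules by are exactly these two ratios.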

Learning Analytics Tool Matrix

By clicking on the above headline, my adapted tool matrix can be accessed. It is my point of departure, as I am still working on it. I want to specify the tools I added (printed in italics), visualize the different phases the tools belong to, and work on a better layout. Furthermore, I want to add content from the coming course weeks and some experiences from using the tools.